milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-30 07:25:37 +08:00

Author	SHA1	Message	Date
MrPresent-Han	9e2e7157e9	feat: support search_group_by for milvus(#25324 ) (#28983 ) related: #25324 Search GroupBy function, used to aggregate result entities based on a specific scalar column. several points to mention: 1. Temporarily, the whole groupby is implemented separately from the iterative expr framework for the first period 2. In the long term, the groupBy operation will be incorporated into the iterative expr framework: https://github.com/milvus-io/milvus/pull/28166 3. This pr includes some unrelated mocked interfaces regarding alterIndex for reasons not worth mentioning. All this unrelated content will be removed before the final pr is merged. This version of the pr is only for review 4. All other related details are commented in the file comparison Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2024-01-05 15:50:47 +08:00
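The group-by behaviour described in the commit above can be pictured with a small pure-Python sketch. It assumes that "aggregate result entities based on a specific scalar column" means keeping only the closest hit for each distinct value of that column; the function name `search_group_by`, the hit dictionaries, and the `color` field are hypothetical and are not taken from the Milvus code base.

```python
def search_group_by(hits, group_field, limit):
    """Keep the best hit per distinct value of `group_field` (assumed semantics)."""
    best = {}
    # Assumption: a smaller distance means a closer match, so visit hits best-first.
    for hit in sorted(hits, key=lambda h: h["distance"]):
        if len(best) >= limit:
            break
        key = hit[group_field]
        if key not in best:
            best[key] = hit
    # Dicts preserve insertion order, so groups come back closest-first.
    return list(best.values())


# Hypothetical search hits carrying a scalar "color" column.
hits = [
    {"id": 1, "distance": 0.10, "color": "red"},
    {"id": 2, "distance": 0.15, "color": "red"},
    {"id": 3, "distance": 0.20, "color": "blue"},
    {"id": 4, "distance": 0.30, "color": "green"},
]
grouped = search_group_by(hits, "color", limit=2)
```

With these inputs the second `red` hit is dropped in favour of the first `blue` one, which is the point of grouping: each distinct column value contributes at most one result.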
cqy123456	22bb84fa9d	feat:add new gpu index:GPU_BRUTE_FORCE and limit gpu index metric type (#29590 ) issue: https://github.com/milvus-io/milvus/issues/29230 this pr does these things: 1. add gpu brute force; 2. limit gpu indexes to support only l2 / ip; Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-01-05 15:24:48 +08:00
PowderLi	c8db36a63a	enhance: get a blob to check object storage config (#29703 ) issue: #29672 the storage account needs at least the privileges for the actions `Microsoft.Storage/storageAccounts/blobServices/containers/blobs/*` Signed-off-by: PowderLi <min.li@zilliz.com>	2024-01-05 14:50:46 +08:00
yah01	0ae90443ba	enhance: fill missed info for segcore error (#29610 ) - fill missed error info - format the error message directly Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-04 17:54:46 +08:00
yah01	99e0f1e65a	enhance: unable to compile C++ tests (#29616 ) The tests need to call a private method, Milvus uses `#define` to replace private with public, the hack trick works but would be broken if the including order changed. This uses friend to make all things work well Signed-off-by: yah01 <yang.cen@zilliz.com> Signed-off-by: yah01 <yah2er0ne@outlook.com>	2024-01-04 13:20:46 +08:00
PowderLi	5f00bad4b8	fix: link with install path's libblob-chunk-manager (#29496 ) issue: #29494 1. link with install path's libblob-chunk-manager 2. performance of `ShouldBindWith` is better than `ShouldBindBodyWith` 3. the middleware shouldn't read the unrefreshed parameter repeatedly Signed-off-by: PowderLi <min.li@zilliz.com>	2023-12-31 20:02:48 +08:00
Jiquan Long	3f46c6d459	feat: support inverted index (#28783 ) issue: https://github.com/milvus-io/milvus/issues/27704
Add inverted index for some data types in Milvus. This index type can save a lot of memory compared to loading all data into RAM and speed up the term query and range query.
Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL` and `VARCHAR`.
Not supported: `ARRAY` and `JSON`.
Note:
- The inverted index for `VARCHAR` is not designed to serve full-text search now. We will treat every row as a whole keyword instead of tokenizing it into multiple terms.
- The inverted index doesn't support retrieval well, so if you create an inverted index for a field, operations that depend on the raw data will fall back to chunk storage, which brings some performance loss. For example, comparisons between two columns and retrieval of output fields.
The inverted index is very easy to use. Taking the collection below as an example:
```python
from pymilvus import Collection, CollectionSchema, DataType, FieldSchema

dim = 128  # example dimension; the original message leaves `dim` undefined

fields = [
    FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),
    FieldSchema(name="int8", dtype=DataType.INT8),
    FieldSchema(name="int16", dtype=DataType.INT16),
    FieldSchema(name="int32", dtype=DataType.INT32),
    FieldSchema(name="int64", dtype=DataType.INT64),
    FieldSchema(name="float", dtype=DataType.FLOAT),
    FieldSchema(name="double", dtype=DataType.DOUBLE),
    FieldSchema(name="bool", dtype=DataType.BOOL),
    FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name="random", dtype=DataType.DOUBLE),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields)
collection = Collection("demo", schema)
```
Then we can simply create an inverted index for each field via:
```python
index_type = "INVERTED"
collection.create_index("int8", {"index_type": index_type})
collection.create_index("int16", {"index_type": index_type})
collection.create_index("int32", {"index_type": index_type})
collection.create_index("int64", {"index_type": index_type})
collection.create_index("float", {"index_type": index_type})
collection.create_index("double", {"index_type": index_type})
collection.create_index("bool", {"index_type": index_type})
collection.create_index("varchar", {"index_type": index_type})
```
Then, term queries and range queries on the field are sped up automatically by the inverted index:
```python
result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"])
result = collection.query(expr='int64 < 5', output_fields=["pk"])
result = collection.query(expr='int64 > 2997', output_fields=["pk"])
result = collection.query(expr='1 < int64 < 5', output_fields=["pk"])
```
--------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-12-31 19:50:47 +08:00
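To make the idea behind the commit above concrete, here is a minimal pure-Python sketch of a scalar inverted index: it maps each distinct value to the row offsets that hold it, so term (`in [...]`) and range filters consult the index instead of scanning every row. The class `ScalarInvertedIndex` and its methods are hypothetical illustrations and say nothing about how Milvus implements the index internally.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class ScalarInvertedIndex:
    """Toy inverted index: value -> list of row offsets (hypothetical, not Milvus code)."""

    def __init__(self, values):
        postings = defaultdict(list)
        for offset, value in enumerate(values):
            postings[value].append(offset)
        self._postings = dict(postings)
        # Sorted distinct values turn a range lookup into two binary searches.
        self._sorted_values = sorted(self._postings)

    def term_query(self, terms):
        """Rows whose value is one of `terms`, like `int64 in [1, 2, 3]`."""
        rows = []
        for term in set(terms):
            rows.extend(self._postings.get(term, []))
        return sorted(rows)

    def range_query(self, low, high):
        """Rows whose value v satisfies low < v < high, like `1 < int64 < 5`."""
        start = bisect_right(self._sorted_values, low)
        end = bisect_left(self._sorted_values, high)
        rows = []
        for value in self._sorted_values[start:end]:
            rows.extend(self._postings[value])
        return sorted(rows)


# Column values by row offset: 0->5, 1->1, 2->3, 3->1, 4->7, 5->2
index = ScalarInvertedIndex([5, 1, 3, 1, 7, 2])
term_rows = index.term_query([1, 2, 3])
range_rows = index.range_query(1, 5)
```

The sketch also hints at the retrieval caveat in the commit message: the index answers "which rows match" cheaply, but reading a row's value back out means going to the raw column data.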
zhagnlu	79c417b14e	fix: pass active count to query context instead of timestamp (#29541 ) #29319 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-12-31 16:08:48 +08:00
sre-ci-robot	c2345daf3a	[automated] Update Knowhere Commit (#29578 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-12-29 18:56:46 +08:00
Jiquan Long	6f4791da0b	fix: panic in concurrent insert/query scenario (#29408 ) issue: https://github.com/milvus-io/milvus/issues/29405 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-12-26 15:10:48 +08:00
yah01	b8318fcd7d	enhance: improve the handling for segcore error (#29471 ) - fix lost exception details in segcore - improve the logs of handling errors from segcore Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-12-26 14:06:46 +08:00
cqy123456	4c979538a4	enhance: update cagra index params in config and add params check (#29045 ) issue: https://github.com/milvus-io/milvus/issues/29230 this pr does two things for the cagra index: a. the milvus yaml config supports gpu memory settings b. add a cagra-params check Signed-off-by: cqy123456 <qianya.cheng@zilliz.com> Co-authored-by: yusheng.ma <yusheng.ma@zilliz.com>	2023-12-26 11:04:47 +08:00
sre-ci-robot	fce1a8dafb	[automated] Update Knowhere Commit (#29412 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-12-25 17:58:46 +08:00
yah01	aef483806d	enhance: improve the segcore logs (#29372 ) - remove the streaming logging - refine existing logs fix #29366 --------- Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-12-23 21:52:43 +08:00
yah01	1b7f1d7067	enhance: mmap data corrupted after seal the column (#29422 ) this bug was introduced in recent changes Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-12-23 15:20:43 +08:00
zhagnlu	1cbe3cd5fc	fix: fix memory leak when cancel segcore task (#29431 ) #29430 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-12-22 20:28:43 +08:00
zhagnlu	a6eb7e5f9a	enhance: skip segment when using pk in (..) expr (#29394 ) #29293 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-12-21 20:06:42 +08:00
yah01	7a2374e698	enhance: reduce the memory usage of variable length data (#29387 ) add all loading data into a buffer and then copy it into a fit-in-size memory block --------- Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-12-21 18:02:42 +08:00
chyezh	be87c18b44	fix: fixup data race at generate binlog index (#29370 ) issue: #29339 Signed-off-by: chyezh <ye.zhen@zilliz.com>	2023-12-21 14:58:49 +08:00
yah01	04b2518ae7	enhance: fix the incorrect init parameter (#29357 ) as the `driver_` field is not used so this doesn't matter for now Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-12-20 20:50:43 +08:00
Gao	9b52cb6417	enhance: improve reducing results when many segments are filtered (#29073 ) Do not fill the invalid ids for the empty results, it will incur useless memory overhead and reduce overhead when nq and topk is large. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2023-12-20 12:56:42 +08:00
yah01	8f89e9cf75	enhance: remove all unnecessary string formatting (#29323 ) done by two regex expressions: - `PanicInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)` - `AssertInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)` related: #28811 --------- Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-12-20 10:04:43 +08:00
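As a rough illustration of the rewrite described in the commit above, the sketch below applies the first quoted pattern with Python's `re` module. Only the pattern comes from the commit message; the replacement template `PanicInfo(\1, \2)`, the helper name `strip_fmt_format`, and the sample C++ lines are assumptions made for the example.

```python
import re

# Pattern quoted from the commit message; the replacement template is an assumption.
PANIC_PATTERN = re.compile(r"PanicInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)")


def strip_fmt_format(source: str) -> str:
    """Rewrite PanicInfo(code, fmt::format(args)) as PanicInfo(code, args)."""
    return PANIC_PATTERN.sub(r"PanicInfo(\1, \2)", source)


# Hypothetical C++ call sites, one on a single line and one wrapped.
single_line = 'PanicInfo(Unsupported, fmt::format("unsupported type {}", type));'
wrapped = 'PanicInfo(DataTypeInvalid,\n          fmt::format("bad {}", v));'

single_line_after = strip_fmt_format(single_line)
wrapped_after = strip_fmt_format(wrapped)
```

The `[. \n]+` class is what lets the pattern match calls where `fmt::format` was wrapped onto the next line, and the lazy `[.\s\S]+?` group stops at the first `))`, so format arguments that themselves end in a call would need manual review.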
Bingyi Sun	89b208d27a	enhance: Fix format message (#29159 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-12-20 09:30:44 +08:00
MrPresent-Han	bfca0a7926	fix: refine skipIndex to resolve cyclic dependency(#29132 ) (#29189 ) related: #29132 Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2023-12-19 10:26:40 +08:00
zhagnlu	a602171d06	enhance: Refactor runtime and expr framework (#28166 ) #28165 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-12-18 12:04:42 +08:00
Cai Yudong	26409d801e	enhance: Remove omp from segcore (#29207 ) Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>	2023-12-15 14:00:39 +08:00
sre-ci-robot	3e66e78508	[automated] Update Knowhere Commit (#29178 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-12-14 17:16:39 +08:00
cai.zhang	49b8657f95	enhance: Support implicit type conversion for parquet (#29046 ) issue: #29019 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2023-12-12 16:14:44 +08:00
Enwei Jiao	0e65e90338	enhance: Support otlp with insecure (#29115 ) issue: https://github.com/milvus-io/milvus/issues/28914 Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-12-12 11:14:37 +08:00
Xiaofan	9d54d6f590	fix: change Abseil to shared library to solve macos compilation issue (#28986 ) fix the compilation error on macos 14.0 with x86 arch processor related to #28985 Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2023-12-11 21:54:37 +08:00
MrPresent-Han	464bc9e8f4	fix: fix reduce precision for search(#27325 ) (#29031 ) related: #27325 Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2023-12-08 10:04:37 +08:00
congqixia	dcb662d9ed	enhance: Refine C.NewSegment response and handle exception (#28952 ) See also #28795 The original `C.NewSegment` may panic if some condition is not met; this pr changes the response struct to `CNewSegmentResult`, which contains a `C.CStatus` and may return the caught exception --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-07 13:34:35 +08:00
cai.zhang	fb089cda8b	enhance: Load raw data while scalar index doesn't have raw data (#28888 ) issue: #28886 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2023-12-06 20:36:36 +08:00
Bingyi Sun	36f69ea031	feat: integrate storagev2 in building index of segcore (#28768 ) issue: https://github.com/milvus-io/milvus/issues/28655 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-12-05 16:48:54 +08:00
sre-ci-robot	f01e507b15	[automated] Update Knowhere Commit (#28965 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-12-05 15:56:35 +08:00
sre-ci-robot	9b6cbe956a	[automated] Update Knowhere Commit (#28917 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-12-04 15:42:34 +08:00
congqixia	c8b1a4618a	enhance: Resolve libunwind requirement conflict using 1.7.2 (#28929 ) Try to resolve libunwind dependency requirement conflict between glog & folly --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-04 14:06:40 +08:00
PowderLi	20fc90c591	enhance: find collection schema from cache (#28782 ) issue: #28781 #28329 1. There is no need to call `DescribeCollection` if the collection's schema is found in the globalMetaCache 2. call `GetProperties` to check access to Azure Blob Service while constructing the ChunkManager Signed-off-by: PowderLi <min.li@zilliz.com>	2023-12-03 19:22:33 +08:00
yah01	342635ed61	enhance: enable assert method to format arguments (#28812 ) for now the assert method in segcore could accept a string information, too many codes don't print the value they assert. make it happy related #28811 --------- Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-12-01 18:04:33 +08:00
yihao.dai	f5856812a2	fix: Fix get binary vector from chunk cache (#28866 ) The way of getting binary vector size is wrong. This PR will fix it. issue: https://github.com/milvus-io/milvus/issues/28865 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-12-01 14:40:32 +08:00
Gao	7206795e91	fix: update folly to resolve simd issue (#28878 ) related #27552 , after this, milvus could run successfully on sse4.2 only machine Signed-off-by: chasingegg <chao.gao@zilliz.com>	2023-12-01 13:50:32 +08:00
Bingyi Sun	8036ee13fa	feat: avoid dereferencing nullptr (#28862 ) issue: #28793 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-12-01 10:20:32 +08:00
sre-ci-robot	ecc3ca374c	[automated] Update Knowhere Commit (#28882 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-12-01 02:28:31 +08:00
PowderLi	cac802ef7f	enhance: use already installed vcpkg (#28703 ) issue #28686 1. Update Builder gpu image changes, see changes #28505 2. update azure-identity-cpp from beta to release Signed-off-by: PowderLi <min.li@zilliz.com>	2023-11-30 15:58:32 +08:00
yah01	d69440524b	fix: bypass growing index if no index meta (#28791 ) we shouldn't panic if no index meta, just skip building it fix #28022 Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-11-30 14:10:27 +08:00
congqixia	1dc086496f	fix: schema->size() check logic with system field (#28802 ) Now segcore load system field info as well, the growing segment assertion shall not pass with "+ 2" value This will cause all growing segments load failure Fix #28801 Related to #28478 See also #28524 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-29 22:40:28 +08:00
cqy123456	3b1b14dd78	fix: update binlog index memory usage before loading segments (#28528 ) issue: #27678 when interimIndex = true, the memory prediction should be updated with the memory usage of the binlog index build process. Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2023-11-29 16:42:27 +08:00
sre-ci-robot	86ccb8e146	[automated] Update Knowhere Commit (#28704 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-11-24 16:56:24 +08:00
cai.zhang	6f7a9264d5	enhance: Handle knowhere error for create diskann index (#28690 ) Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2023-11-24 11:58:23 +08:00
zhagnlu	0d9d098186	enhance: Add precheck when chunk manager init (#28330 ) #28329 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-11-23 19:56:32 +08:00

1311 Commits