milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-04 18:02:08 +08:00

Author	SHA1	Message	Date
marcelo-cjl	3b599441fd	feat: Add nullable vector support for proxy and querynode (#46305 ) related: #45993 This commit extends nullable vector support to the proxy layer, querynode, and adds comprehensive validation, search reduce, and field data handling for nullable vectors with sparse storage. Proxy layer changes: - Update validate_util.go checkAligned() with getExpectedVectorRows() helper to validate nullable vector field alignment using valid data count - Update checkFloatVectorFieldData/checkSparseFloatVectorFieldData for nullable vector validation with proper row count expectations - Add FieldDataIdxComputer in typeutil/schema.go for logical-to-physical index translation during search reduce operations - Update search_reduce_util.go reduceSearchResultData to use idxComputers for correct field data indexing with nullable vectors - Update task.go, task_query.go, task_upsert.go for nullable vector handling - Update msg_pack.go with nullable vector field data processing QueryNode layer changes: - Update segments/result.go for nullable vector result handling - Update segments/search_reduce.go with nullable vector offset translation Storage and index changes: - Update data_codec.go and utils.go for nullable vector serialization - Update indexcgowrapper/dataset.go and index.go for nullable vector indexing Utility changes: - Add FieldDataIdxComputer struct with Compute() method for efficient logical-to-physical index mapping across multiple field data - Update EstimateEntitySize() and AppendFieldData() with fieldIdxs parameter - Update funcutil.go with nullable vector support functions ## Summary by CodeRabbit * New Features * Full support for nullable vector fields (float, binary, float16, bfloat16, int8, sparse) across ingest, storage, indexing, search and retrieval; logical↔physical offset mapping preserves row semantics. * Client: compaction control and compaction-state APIs. * Bug Fixes * Improved validation for adding vector fields (nullable + dimension checks) and corrected search/query behavior for nullable vectors. * Chores * Persisted validity maps with indexes and on-disk formats. * Tests * Extensive new and updated end-to-end nullable-vector tests. --------- Signed-off-by: marcelo-cjl <marcelo.chen@zilliz.com>	2025-12-24 10:13:19 +08:00
Spade A	f6f716bcfd	feat: impl StructArray -- support embedding searches embeddings in embedding list with element level filter expression (#45830 ) issue: https://github.com/milvus-io/milvus/issues/42148 For a vector field inside a STRUCT, since a STRUCT can only appear as the element type of an ARRAY field, the vector field in STRUCT is effectively an array of vectors, i.e. an embedding list. Milvus already supports searching embedding lists with metrics whose names start with the prefix MAX_SIM_. This PR allows Milvus to search embeddings inside an embedding list using the same metrics as normal embedding fields. Each embedding in the list is treated as an independent vector and participates in ANN search. Further, since STRUCT may contain scalar fields that are highly related to the embedding field, this PR introduces an element-level filter expression to refine search results. The grammar of the element-level filter is: element_filter(structFieldName, $[subFieldName] == 3) where $[subFieldName] refers to the value of subFieldName in each element of the STRUCT array structFieldName. It can be combined with existing filter expressions, for example: "varcharField == 'aaa' && element_filter(struct_field, $[struct_int] == 3)" A full example:
```
struct_schema = milvus_client.create_struct_field_schema()
struct_schema.add_field("struct_str", DataType.VARCHAR, max_length=65535)
struct_schema.add_field("struct_int", DataType.INT32)
struct_schema.add_field("struct_float_vec", DataType.FLOAT_VECTOR, dim=EMBEDDING_DIM)
schema.add_field(
    "struct_field",
    datatype=DataType.ARRAY,
    element_type=DataType.STRUCT,
    struct_schema=struct_schema,
    max_capacity=1000,
)
...
filter = "varcharField == 'aaa' && element_filter(struct_field, $[struct_int] == 3 && $[struct_str] == 'abc')"
res = milvus_client.search(
    COLLECTION_NAME,
    data=query_embeddings,
    limit=10,
    anns_field="struct_field[struct_float_vec]",
    filter=filter,
    output_fields=["struct_field[struct_int]", "varcharField"],
)
```
TODO: 1. When an `element_filter` expression is used, a regular filter expression must also be present. Remove this restriction. 2. Implement `element_filter` expressions in the `query`. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-12-15 12:01:15 +08:00
Buqian Zheng	95a535cb4d	fix: struct reduce incorrect (#46150 ) issue: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-12-08 10:23:11 +08:00
zhagnlu	3dd5deb70a	fix: disable shredding for json_path that contains digits (#44724 ) #44132 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-10-13 17:25:59 +08:00
congqixia	e3b3502287	fix: Use correct regex for cppcheck (#44077 ) Related to #44076 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-27 20:57:50 +08:00
marcelo-cjl	e13e19cd2c	enhance: add sparse_u32_f32 data type for sparse vector (#43974 ) issue: #43973 Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>	2025-08-27 16:47:50 +08:00
Gao	e97a618630	enhance: support readAt interface for remote input stream (#43997 ) #42032 Also, fix the cacheoptfield method to work in storagev2. Also, change the sparse related interface for knowhere version bump #43974 . Also, includes https://github.com/milvus-io/milvus/pull/44046 for metric lost. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-26 11:19:58 +08:00
cqy123456	317bbfbf81	enhance: milvus support minhash vector and mhjaccard metric (#42036 ) issue: https://github.com/issues/assigned?issue=milvus-io%7Cmilvus%7C41746 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-06-10 14:38:34 +08:00
zhagnlu	39e7ad33d7	enhance: add optimize for like expr (#41066 ) #41065 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-05-08 14:28:52 +08:00
zhagnlu	0a378dc308	fix:fix format error for json (#41026 ) #40963 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-07 10:22:22 +08:00
cqy123456	6dc0f42830	fix:growing mmap data type crashed by nullable input (#40994 ) issue: https://github.com/milvus-io/milvus/issues/40981 2.5 pr: https://github.com/milvus-io/milvus/pull/40980 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-03-31 20:32:19 +08:00
Cai Yudong	341d6c1eb7	feat: Update segcore for VECTOR_INT8 (#39415 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-21 11:03:03 +08:00
zhagnlu	01de0afc4e	enhance: refactor delete mvcc function (#38066 ) #37413 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-15 18:02:43 +08:00
aoiasd	db34572c56	feat: support load and query with bm25 metric (#36071 ) relate: https://github.com/milvus-io/milvus/issues/35853 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-10-11 10:23:20 +08:00
zhagnlu	489087d18b	enhance: refactor executor framework V2 (#35251 ) #32636 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-09-13 20:57:09 +08:00
Buqian Zheng	f4a91e135b	enhance: Allow empty sparse row (#34700 ) issue: #29419 * If a sparse vector with 0 non-zero values is inserted, no ANN search on this sparse vector field will return it as a result. The user may still retrieve this row via a scalar query or an ANN search on another vector field. * If the user uses an empty sparse vector as the query vector for an ANN search, no neighbors will be returned. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-08-16 14:14:54 +08:00
Buqian Zheng	7c60d725cc	fix: validate sparse vector in search request (#32856 ) issue: #32368 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-05-15 15:39:33 +08:00
Buqian Zheng	96cfae55a5	feat: [Sparse Float Vector] segcore to support sparse vector search and get raw vector by id (#30629 ) This PR adds the ability to search/get sparse float vectors in segcore, and added unit tests by modifying lots of existing tests into parameterized ones. https://github.com/milvus-io/milvus/issues/29419 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-12 09:16:30 -07:00
Buqian Zheng	070dfc77bf	feat: [Sparse Float Vector] segcore basics and index building (#30357 ) This commit adds sparse float vector support to segcore with the following: 1. data type enum declarations 2. Adds corresponding data structures for handling sparse float vectors in various scenarios, including: * FieldData as a bridge between the binlog and the in-memory data structures * mmap::Column as the in-memory representation of a sparse float vector column of a sealed segment; * ConcurrentVector as the in-memory representation of a sparse float vector of a growing segment which supports inserts. 3. Adds logic in payload reader/writer to serialize/deserialize from/to binlog 4. Adds the ability to allow the index node to build sparse float vector index 5. Adds the ability to allow the query node to build growing index for growing segment and temp index for sealed segment without index built This commit also includes some code cleanup, comment improvements, and some unit tests for sparse vector. https://github.com/milvus-io/milvus/issues/29419 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-11 14:45:02 +08:00
zhagnlu	8c58d9af67	enhance: optimize marisa trie range search for performance (#30079 ) #30078 #29986 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-01-25 10:07:00 +08:00
zhagnlu	a602171d06	enhance: Refactor runtime and expr framework (#28166 ) #28165 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-12-18 12:04:42 +08:00
cai.zhang	8011054a2a	Check length before comparing strings (#28110 ) Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2023-11-04 10:04:29 +08:00
foxspy	370b6fde58	milvus support multi index engine (#27178 ) Co-authored-by: longjiquan <jiquan.long@zilliz.com>	2023-09-22 09:59:26 +08:00
Enwei Jiao	c3f15c6b95	Refactor duplicate error class into one place (#26985 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-09-11 20:43:17 +08:00
yah01	cb721781f3	Improve error message thrown from knowhere (#25473 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-07-11 17:26:29 +08:00
xige-16	04082b3de2	Migrate the ability to upload and download binlog to cpp (#22984 ) Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-06-25 14:38:44 +08:00
Cai Yudong	b1afd3ea2f	Update knowhere commit to fix BIN_IVF_FLAT upgrade (#24187 ) Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>	2023-05-18 17:57:23 +08:00
yah01	60fdd7e4f4	Introduce simdjson (#23644 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-04-26 10:30:34 +08:00
yihao.dai	092d743917	Add support for getting vectors by ids (#23450 ) Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-04-23 09:00:32 +08:00
yah01	546080dcdd	Support to retrieve json (#23563 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-04-21 11:46:32 +08:00
yah01	aa2985490c	Retrieve page size by getpagesize() (#23561 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-04-20 12:36:30 +08:00
Enwei Jiao	967a97b9bd	Support json & array types (#23408 ) Signed-off-by: yah01 <yang.cen@zilliz.com> Co-authored-by: yah01 <yang.cen@zilliz.com>	2023-04-20 11:32:31 +08:00
Cai Yudong	2725d38b9e	Add COSINE metric type (#23350 ) Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>	2023-04-20 10:20:31 +08:00
congqixia	4a53018e5f	Remove legacy annoy_inner_error (#23220 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-04-04 18:08:24 +08:00
yah01	a4031da634	Refine string parameters, avoid copying or deref (#22708 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-03-13 17:53:53 +08:00
yah01	7bc3309918	Replace NULL with nullptr (#22701 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-03-13 10:35:52 +08:00
Jiquan Long	a36fefb009	Fix cpplint (#22657 ) Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-03-10 09:47:54 +08:00
yah01	bdd6bc7695	Re-format cpp code (#22513 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-03-02 15:55:49 +08:00
yah01	7478e44911	Support using mmap to load data (#22052 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-03-01 18:07:49 +08:00
smellthemoon	7a4dfcc72b	Add status error output (#22325 ) Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2023-02-22 15:51:44 +08:00
smellthemoon	9e0ec15436	Support range search (#21652 ) Signed-off-by: smellthemoon <xinguo.li@zilliz.com> Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: jaime <yun.zhang@zilliz.com>	2023-02-21 09:48:32 +08:00
yah01	187788059b	Fix double copy varchar field while loading (#22114 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-02-16 17:16:35 +08:00
presburger	9950cacd10	support knowhere 2.0 (#21857 ) Signed-off-by: Yusheng.Ma <Yusheng.Ma@zilliz.com>	2023-02-10 14:24:32 +08:00
Cai Yudong	87d78a4a85	Ignore cases when comparing metric type in segcore (#19437 ) Signed-off-by: yudong.cai <yudong.cai@zilliz.com> Signed-off-by: yudong.cai <yudong.cai@zilliz.com>	2022-09-26 17:58:52 +08:00
xige-16	428840178c	Support diskann index for vector field (#19093 ) Signed-off-by: xige-16 <xi.ge@zilliz.com> Signed-off-by: xige-16 <xi.ge@zilliz.com>	2022-09-21 20:16:51 +08:00
Cai Yudong	21a1311f66	Merge utils/Utils.h into common/Utils.h (#16762 ) Signed-off-by: yudong.cai <yudong.cai@zilliz.com>	2022-05-03 12:05:50 +08:00
xige-16	515d0369de	Support string type in segcore (#16546 ) Signed-off-by: xige-16 <xi.ge@zilliz.com> Co-authored-by: dragondriver <jiquan.long@zilliz.com> Co-authored-by: dragondriver <jiquan.long@zilliz.com>	2022-04-29 13:35:49 +08:00

47 Commits
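
The element-level filter grammar described in commit f6f716bcfd is a plain string expression, so it can be assembled programmatically. The following is a minimal sketch, not part of Milvus or pymilvus: `element_filter` and `and_all` are hypothetical helper names introduced here only to build the expression text, following the `element_filter(structFieldName, ...)` shape quoted in that commit message.

```python
def element_filter(struct_field: str, *conditions: str) -> str:
    """Build an element_filter(...) expression string.

    Hypothetical helper (not a pymilvus API). Each condition is expected to
    reference sub-fields as $[subFieldName], as in the commit message.
    """
    if not conditions:
        raise ValueError("element_filter needs at least one condition")
    body = " && ".join(conditions)
    return "element_filter(" + struct_field + ", " + body + ")"


def and_all(*exprs: str) -> str:
    """Join filter expressions with the && operator (hypothetical helper)."""
    return " && ".join(exprs)


# Rebuild the filter string used in the commit's full example.
expr = and_all(
    "varcharField == 'aaa'",
    element_filter("struct_field", "$[struct_int] == 3", "$[struct_str] == 'abc'"),
)
print(expr)
```

Per the TODO in that commit, an `element_filter` expression currently has to be accompanied by a regular filter expression, which is why the sketch joins it with `varcharField == 'aaa'` rather than using it alone.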