milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-08 01:58:34 +08:00

Author	SHA1	Message	Date
zhagnlu	ae19c93c14	enhance: remove timestamp filter for search_ids to optimize performance (#44634 ) #44352 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-10-17 16:10:01 +08:00
Spade A	c4f3f0ce4c	feat: impl StructArray -- support more types of vector in STRUCT (#44736 ) ref: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-10-15 10:25:59 +08:00
cai.zhang	9d1bb8497c	fix: Get R-Tree index correct for growing segment (#44612 ) issue： #43427 R-Tree index is the entire segment, not the chunk. Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-09-29 21:34:54 +08:00
cai.zhang	19346fa389	feat: Geospatial Data Type and GIS Function support for milvus (#44547 ) issue: #43427 This pr's main goal is merge #37417 to milvus 2.5 without conflicts. # Main Goals 1. Create and describe collections with geospatial type 2. Insert geospatial data into the insert binlog 3. Load segments containing geospatial data into memory 4. Enable query and search can display geospatial data 5. Support using GIS funtions like ST_EQUALS in query 6. Support R-Tree index for geometry type # Solution 1. Add Type: Modify the Milvus core by adding a Geospatial type in both the C++ and Go code layers, defining the Geospatial data structure and the corresponding interfaces. 2. Dependency Libraries: Introduce necessary geospatial data processing libraries. In the C++ source code, use Conan package management to include the GDAL library. In the Go source code, add the go-geom library to the go.mod file. 3. Protocol Interface: Revise the Milvus protocol to provide mechanisms for Geospatial message serialization and deserialization. 4. Data Pipeline: Facilitate interaction between the client and proxy using the WKT format for geospatial data. The proxy will convert all data into WKB format for downstream processing, providing column data interfaces, segment encapsulation, segment loading, payload writing, and cache block management. 5. Query Operators: Implement simple display and support for filter queries. Initially, focus on filtering based on spatial relationships for a single column of geospatial literal values, providing parsing and execution for query expressions.Now only support brutal search 7. Client Modification: Enable the client to handle user input for geospatial data and facilitate end-to-end testing.Check the modification in pymilvus. --------- Signed-off-by: Yinwei Li <yinwei.li@zilliz.com> Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>	2025-09-28 19:43:05 +08:00
zhagnlu	eac16a577c	enhance:support cachelayer for json stats (#44446 ) #42533 Signed-off-by: zhagnlu <lu.zhang@zilliz.com>	2025-09-24 15:30:04 +08:00
sparknack	ab64afba2f	enhance: add storage resource usage for scalar search (#44414 ) issue: #44212 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-09-22 14:28:06 +08:00
Gao	d3784c6515	enhance: add storage resource usage for vector search (#44308 ) issue: #44212 Implement search/query storage usage statistics in go side(result reduce), only record storage usage in vector search C++ path. Need to be implemented in query c++ path in next prs. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>	2025-09-19 20:20:02 +08:00
zhagnlu	baa84e0b2b	fix: avoid mvcc when doing pk compare expr (#44353 ) #44352 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-15 10:17:59 +08:00
Bingyi Sun	0c0630cc38	feat: support dropping index without releasing collection (#42941 ) issue: #42942 This pr includes the following changes: 1. Added checks for index checker in querycoord to generate drop index tasks 2. Added drop index interface to querynode 3. To avoid search failure after dropping the index, the querynode allows the use of lazy mode (warmup=disable) to load raw data even when indexes contain raw data. 4. In segcore, loading the index no longer deletes raw data; instead, it evicts it. 5. In expr, the index is pinned to prevent concurrent errors. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-02 16:17:52 +08:00
zhagnlu	fc876639cf	enhance: support json stats with shredding design (#42534 ) #42533 Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-01 10:49:52 +08:00
Spade A	8456f824be	feat: impl StructArray -- miscellaneous staffs for struct array (#43960 ) Ref https://github.com/milvus-io/milvus/issues/42148 1. enable storage v2 2. implement some missing staffs 3. fix some bugs and add tests --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-08-26 21:35:53 +08:00
zhagnlu	d904c4e677	enhance: optimize compare expr performance for pk field (#43154 ) #43153 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-21 10:59:46 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
congqixia	b6199acb05	enhance: Utilize `search_batch_pks` for `search_ids` of PkTerm (#43751 ) Related to #43660 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-07 14:19:40 +08:00
Chun Han	d826d6ac91	fix: try to get span raw data for variable length data type(#43544 ) (#43705 ) related: #43544 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-08-04 11:15:38 +08:00
congqixia	4aff581007	enhance: Pass callback in search batch pks to void large result (#43695 ) Related to #43660 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-02 17:57:37 +08:00
congqixia	5f2f4eb3d6	enhance: Ignore entry with same ts when DeleteRecord search pks (#43669 ) Related to #43660 This patch reduces the unwanted offset&ts entries having same timestamp of delete record. Under large amount of upsert, this false hit could increase large amount of memory usage while applying delete. The next step could be passing a callback to `search_pk_func_` to handle hit entry streamingly. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-01 10:15:36 +08:00
congqixia	6a74a7de66	enhance: Make DeleteRecord search pks by batch and PinAll (#43640 ) Related to #43592 When delete records are large, search pk one by one will result into many `Pincells` call which creates lots of futures. This patch make search pk execute in batch to reduce this cost. Also add `GetAllChunks` API to utilize `PinAllCells` to reduce pins. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-30 19:15:36 +08:00
congqixia	c9bc70f272	fix: [AddField] Use shared_ptr of schema in plan fixing dangling ref (#42693 ) Related to #42640 The search/query plan holded a reference to schema, which could be destructed after schema change. This PR make plan hold a shared ptr to it fixing dangling reference problem under concurrent read & schema change. This PR also remove field binlog check for loading index for old segment with old schema may have binlog lack. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-12 20:46:36 +08:00
Spade A	911a8df17c	feat: impl StructArray -- data storage support in segcore (#42406 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR mainly enables segcore to support array of vector (read and write, but not indexing). Now only float vector as the element type is supported. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-06-12 14:38:35 +08:00
Bingyi Sun	6c16d3dbee	enhance: Add bulk api for json data (#42407 ) issue: https://github.com/milvus-io/milvus/issues/42409 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-06-12 10:40:39 +08:00
Bingyi Sun	6404e02d99	fix: Check cast type is array for json contains expr (#42184 ) issue: https://github.com/milvus-io/milvus/issues/42181 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-06-09 17:04:33 +08:00
cqy123456	727f4ec24b	enhance:mmapchunkmanager allocates MmapChunkDescriptor itself (#42150 ) issue: https://github.com/milvus-io/milvus/issues/42157 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-06-03 14:42:31 +08:00
cqy123456	5fe7015f63	enhance: InterimIndex support more index type and data type (#41021 ) issue: https://github.com/milvus-io/milvus/issues/27678 cherry pick from : https://github.com/milvus-io/milvus/pull/39180, https://github.com/milvus-io/milvus/pull/40429 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-05-28 08:40:28 +08:00
Xianhui Lin	6a0e182e13	enhance: support TTL expiration with queries returning no results (#42086 ) support TTL expiration with queries returning no results issue:https://github.com/milvus-io/milvus/issues/41959 Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-05-27 18:28:27 +08:00
congqixia	f021b3f26a	fix: [AddField] Add protection logic inserting old data into new schema (#41978 ) Related to #39718 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-22 11:30:24 +08:00
Buqian Zheng	ff5c2770e5	feat: cachinglayer: various improvements (#41546 ) issue: https://github.com/milvus-io/milvus/issues/41435 this PR is based on https://github.com/milvus-io/milvus/pull/41436. Improvements include: - Lazy Load support for Storage v1 - Use Low/High watermark to control eviction - Caching Layer related config changes - Removed ChunkCache related configs and code in golang - Add `PinAllCells` helper method to CacheSlot class - Modified ValueAt, RawAt, PrimitiveRawAt to Bulk version, to reduce caching layer overhead - Removed some unclear templated bulk_subscript methods - CachedSearchIterator to store PinWrapper when searching on ChunkedColumn, and removed unused contrustor. --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-05-10 09:19:16 +08:00
foxspy	e2ddbe4962	feat: add cachinglayer to index (#41653 ) issue: #41435 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-05-08 10:12:54 +08:00
congqixia	f3f8227cd0	enhance: [AddField] Trigger check schema in retrieve as well (#41598 ) Related to #39718 Fixes milvus-io/pymilvus#2771 This PR: - Make AsyncRetrieve task triggers "schema check" logic as well - Rename `AddField` related methods to align with code standard Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-29 14:10:49 +08:00
Buqian Zheng	3de904c7ea	feat: add cachinglayer to sealed segment (#41436 ) issue: https://github.com/milvus-io/milvus/issues/41435 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-04-28 10:52:40 +08:00
congqixia	b5443ddbd0	enhance: [AddField] Reopen loaded segments after AddField (#41529 ) Related to #39718 This PR: - Add reopen logic for growing & sealed segments - Lazy reopen when schema version increases - Add FinishLoad api for loading progress --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-26 08:48:39 +08:00
sthuang	1f1c836fb9	feat: Storage v2 growing segment load (#41001 ) support parallel loading sealed and growing segments with storage v2 format by async reading row groups. related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-16 17:14:33 +08:00
Xianhui Lin	3bc24c264f	enhance: Add json key inverted index in stats for optimization (#38039 ) Add json key inverted index in stats for optimization https://github.com/milvus-io/milvus/issues/36995 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-10 15:20:28 +08:00
Bingyi Sun	fcb03b5bd1	feat: add json null/exists expression (#41004 ) issue: #35528 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-04-03 17:48:21 +08:00
smellthemoon	cb1e86e17c	enhance: support add field (#39800 ) after the pr merged, we can support to insert, upsert, build index, query, search in the added field. can only do the above operates in added field after add field request complete, which is a sync operate. compact will be supported in the next pr. #39718 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2025-04-02 14:24:31 +08:00
Chun Han	259f9106ad	enhance: refine variable-length-type memory usage(#38736 ) (#39578 ) related: #38736 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-02-27 21:13:58 +08:00
Bingyi Sun	db4769281c	fix: Fall back to a brute-force search if json index type unmatched (#40076 ) issue: https://github.com/milvus-io/milvus/issues/35528 If the query data type does not match the index type, fall back to a brute-force search --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-24 16:25:57 +08:00
zhagnlu	316534e065	enhance: optimize delete init construct code (#39327 ) #39326 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-02-17 21:05:26 +08:00
Bingyi Sun	b59555057d	feat: support json index (#36750 ) https://github.com/milvus-io/milvus/issues/35528 This PR adds json index support for json and dynamic fields. Now you can only do unary query like 'a["b"] > 1' using this index. We will support more filter type later. basic usage: ``` collection.create_index("json_field", {"index_type": "INVERTED", "params": {"json_cast_type": DataType.STRING, "json_path": 'json_field["a"]["b"]'}}) ``` There are some limits to use this index: 1. If a record does not have the json path you specify, it will be ignored and there will not be an error. 2. If a value of the json path fails to be cast to the type you specify, it will be ignored and there will not be an error. 3. A specific json path can have only one json index. 4. If you try to create more than one json indexes for one json field, sdk(pymilvus<=2.4.7) may return immediately because of internal implementation. This will be fixed in a later version. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-15 14:06:15 +08:00
zhagnlu	01de0afc4e	enhance: refactor delete mvcc function (#38066 ) #37413 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-15 18:02:43 +08:00
Gao	994fc544e7	enhance: support iterative filter execution (#37363 ) issue: #37360 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2024-12-11 11:32:44 +08:00
zhagnlu	e4b6773d0a	fix: fix create text index dir conflict bug (#37693 ) #37623 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-11-15 18:26:30 +08:00
smellthemoon	3389a6b500	enhance: support null in text match index (#37517 ) #37508 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-11-13 11:08:29 +08:00
Bingyi Sun	a75bb85f3a	feat: support chunked column for sealed segment (#35764 ) This PR splits sealed segment to chunked data to avoid unnecessary memory copy and save memory usage when loading segments so that loading can be accelerated. To support rollback to previous version, we add an option `multipleChunkedEnable` which is false by default. Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-10-12 15:04:52 +08:00
zhagnlu	489087d18b	enhance: refactor executor framework V2 (#35251 ) #32636 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-09-13 20:57:09 +08:00
Jiquan Long	89bf226f0b	feat: support keyword text match (#35923 ) fix: #35922 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-09-10 15:11:08 +08:00
zhagnlu	3107701fe8	enhance: optimize retrieve on dynamic field (#35580 ) #35514 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com> Co-authored-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-08-22 14:24:56 +08:00
zhagnlu	4b553b0333	enhance: revert remove duplicated pk function (#35103 ) issue: #34778 Revert "fix: fix query count(*) concurrently" Revert "enhance: mark duplicated pk as deleted " Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-08-05 10:48:17 +08:00
zhagnlu	16dd53e7cf	enhance: remove timestamp_filter after retrieve (#35207 ) #35226 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-08-02 19:32:46 +08:00
smellthemoon	475c333fa2	enhance: add valid_data in span (#35030 ) #31728 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-08-02 15:40:14 +08:00

1 2 3

134 Commits