milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-07 01:28:27 +08:00

Author	SHA1	Message	Date
congqixia	2c50d7e1f8	fix: [2.6] Move FinishLoad before text index creation to ensure raw data availability (#45335 ) Cherry-pick from master pr: #45334 Related to #45333 Fix segment loading failure when adding fields with text match enabled. The issue occurred because text indexes were being loaded before FinishLoad() was called, meaning raw data was not properly available when text index creation attempted to access it, resulting in "failed to create text index, neither raw data nor index are found" errors. Solution is to move the FinishLoad() call to execute after raw data loading but before text index loading. This ensures that: 1. Raw data is properly loaded and available in memory 2. Text indexes can access the raw data they need during creation 3. The segment is in the correct state before any index operations Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-06 17:11:34 +08:00
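The ordering constraint described in this commit can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToySegment`, `load_raw_data`, `finish_load`, `load_text_index`, and `load_segment` are hypothetical names for illustration, not the Milvus segcore API, and the "publish staged data" behaviour of `finish_load` is an assumed simplification of what `FinishLoad()` does.

```python
class ToySegment:
    """Toy stand-in for a segment (hypothetical, not the segcore API)."""

    def __init__(self):
        self._staged = {}       # raw data loaded but not yet published
        self._raw_data = {}     # raw data visible to index builders
        self.text_indexes = {}  # field name -> set of tokens

    def load_raw_data(self, field, rows):
        self._staged[field] = list(rows)

    def finish_load(self):
        # Assumed behaviour: publish staged raw data so later steps can read it.
        self._raw_data.update(self._staged)
        self._staged.clear()

    def load_text_index(self, field):
        if field not in self._raw_data:
            raise RuntimeError(
                "failed to create text index, neither raw data nor index are found"
            )
        rows = self._raw_data[field]
        self.text_indexes[field] = {tok for row in rows for tok in row.split()}


def load_segment(segment, fields):
    """Load raw data, finish the load, and only then build text indexes."""
    for field, rows in fields.items():
        segment.load_raw_data(field, rows)
    segment.finish_load()  # must run before any text index work
    for field in fields:
        segment.load_text_index(field)
    return segment
```

Calling `load_text_index` before `finish_load` in this model raises the same error message quoted in the commit, which is the failure mode the reordering avoids.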
Zhen Ye	122d024df4	enhance: cherry pick patch of new DDL framework and CDC 3 (#45280 ) issue: #43897, #44123 pr: #45266 also pick pr: #45237, #45264, #45244, #45275 fix: kafka should auto reset the offset from earliest to read (#45237) issue: #44172, #45210, #44851, #45244 Kafka automatically resets the offset to "latest" if the offset is out of range, so Milvus WAL recovery cannot read any message from that point. Once the offset is out of range, Kafka should therefore read from the earliest offset to pick up the latest uncleared data. https://kafka.apache.org/documentation/#consumerconfigs_auto.offset.reset enhance: support alter collection/database with WAL-based DDL framework (#45266) issue: #43897 - Alter collection/database is now implemented by the WAL-based DDL framework. - Support AlterCollection/AlterDatabase in the WAL. - Alter operations can now be synced by the new CDC. - Refactor some UTs for alter DDL. fix: milvus role cannot stop at initializing state (#45244) issue: #45243 fix: support upgrading from 2.6.x -> 2.6.5 (#45264) issue: #43897 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-04 20:21:37 +08:00
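The effect of the Kafka `auto.offset.reset` policy on an out-of-range offset can be sketched with a small model. This is an illustration only: `resolve_start_offset` is a hypothetical helper, not a Kafka or Milvus function, and it assumes `log_end` is the next offset to be written (exclusive).

```python
def resolve_start_offset(committed, log_start, log_end, auto_offset_reset):
    """Return the offset a consumer would start reading from (toy model).

    log_start: first retained offset; log_end: next offset to be written.
    """
    if log_start <= committed <= log_end:
        return committed  # offset still valid, no reset needed
    if auto_offset_reset == "earliest":
        return log_start  # re-read everything that is still retained
    if auto_offset_reset == "latest":
        return log_end    # skip all retained messages
    raise ValueError(f"unsupported policy: {auto_offset_reset}")


def readable_messages(committed, log_start, log_end, auto_offset_reset):
    """Number of retained messages the consumer can still read."""
    start = resolve_start_offset(committed, log_start, log_end, auto_offset_reset)
    return log_end - start
```

With a committed offset of 5 and a retained range of 100 to 120, the "latest" policy leaves nothing to read, while "earliest" makes all 20 retained messages available for recovery.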
congqixia	d490a5b4bf	enhance: [2.6] set schema version when creating new collection (#45263 ) (#45269 ) Cherry pick from master pr: #45263 Related to #43028 Initialize the schema version field when creating a new collection instance in QueryNode. The schema version is extracted from loadMetaInfo and assigned to the collection, ensuring proper schema version tracking and consistency across the distributed system. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-04 17:05:34 +08:00
congqixia	7f10c98321	fix: [2.6] update QueryNode NumEntities metrics when collection has no segments (#45147 ) (#45160 ) Cherry-pick from master pr: #45147 Related to #44509 Fix a bug where QueryNodeNumEntities metrics were not updated for collections with zero segments, causing stale metrics when all segments are flushed or compacted. The previous implementation used separate loops: one to update size metrics for all collections, and another to update num entities metrics only for collections present in the grouped segments map. Collections with no segments were skipped in the second loop, leaving their NumEntities metrics stale. Changes: - Consolidate size and num entities metric updates into single loop - Iterate over all collections instead of grouped segments - Get collection metadata from manager instead of segment instances - Correctly set NumEntities to 0 for collections with no segments - Apply the same fix to both growing and sealed segment processing - Add nil check for collection metadata before processing This ensures all collection metrics are updated consistently, even when segment count drops to zero. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-30 14:08:09 +08:00
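A minimal sketch of the consolidated loop described above, assuming a plain dictionary as the gauge. The names `update_num_entities`, `collection_ids`, `segments`, and `gauge` are hypothetical and do not correspond to the QueryNode metrics code.

```python
from collections import defaultdict


def update_num_entities(collection_ids, segments, gauge):
    """Refresh a per-collection row-count gauge (toy model).

    collection_ids: IDs of all loaded collections.
    segments: iterable of (collection_id, num_rows) pairs.
    gauge: dict mapping collection_id -> last reported row count.
    """
    rows = defaultdict(int)
    for collection_id, num_rows in segments:
        rows[collection_id] += num_rows

    # Iterate over every collection, not only those that still have
    # segments, so a collection with zero segments is reset to 0.
    for collection_id in collection_ids:
        gauge[collection_id] = rows.get(collection_id, 0)
    return gauge
```

Iterating over the grouped segments instead would leave the entry for a collection with no segments at its previous value, which is the stale metric the commit fixes.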
aoiasd	fd22dc281a	enhance: [2.6] update some annotations (#44953 ) relate: https://github.com/milvus-io/milvus/issues/43114 pr: https://github.com/milvus-io/milvus/pull/44769 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-20 15:42:08 +08:00
sparknack	64b76b723f	enhance: [2.6] add a disk quota for the loaded binlog size to prevent load failures of querynode (#44932 ) issue: #41435 pr: #44893 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-19 19:46:07 +08:00
sparknack	54fe82756a	enhance: [2.6] add cachinglayer management for TextMatchIndex (#44768 ) issue: #41435, #44502 pr: #44741, #44806 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-15 11:09:59 +08:00
sparknack	c72a19d174	enhance: remove logical usage checks during segment loading (#44770 ) issue: #41435 pr: #44743 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-13 14:23:58 +08:00
zhenshan.cao	fc6fe6e3bd	enhance: Add refine logs for task scheduler in QueryCoord (#44577 ) (#44725 ) issue: https://github.com/milvus-io/milvus/issues/43968 pr: https://github.com/milvus-io/milvus/pull/44577 --------- Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com> Signed-off-by: Wei Liu <wei.liu@zilliz.com> Co-authored-by: wei liu <wei.liu@zilliz.com>	2025-10-11 15:35:57 +08:00
congqixia	07bca45376	fix: [2.6] Pass fs via `FileManagerContext` when loading index (#44734 ) Cherry-pick from master pr: #44733 Related to #44615 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-11 09:57:57 +08:00
Zhen Ye	a110d8cc49	fix: don't use logical resource for metrics of quota center on streaming node (#44613 ) issue: #44599 Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-29 21:34:13 +08:00
aoiasd	78ee76f018	enhance: support preload sealed segment bm25 stats and optimize bm25 stats serialize (#44279 ) relate: https://github.com/milvus-io/milvus/issues/41424 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-09-29 16:35:05 +08:00
Zhen Ye	b6b59bd222	fix: remove redundant initialization of storage v2 (#44597 ) issue: #44596 - QueryNode already initializes storage v2 and segcore, so StreamingNode should not do so again. - It also fixes the GCP object storage access-denied error. Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-29 10:17:04 +08:00
zhagnlu	eac16a577c	enhance:support cachelayer for json stats (#44446 ) #42533 Signed-off-by: zhagnlu <lu.zhang@zilliz.com>	2025-09-24 15:30:04 +08:00
Tianx	2c0c5ef41e	feat: timestamptz expression & index & timezone (#44080 ) issue: https://github.com/milvus-io/milvus/issues/27467 >My plan is as follows. >- [x] M1 Create collection with timestamptz field >- [x] M2 Insert timestamptz field data >- [x] M3 Retrieve timestamptz field data >- [x] M4 Implement handoff >- [x] M5 Implement compare operator >- [x] M6 Implement extract operator >- [x] M8 Support database/collection level default timezone >- [x] M7 Support STL-SORT index for datatype timestamptz --- The third PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M5, M6, M7, M8 described above. ## M8: Default Timezone We will be able to use alter_collection() and alter_database() in a future Python SDK release to modify the default timezone at the collection or database level. For insert requests, the timezone will be resolved using the following order of precedence: String Literal -> Collection Default -> Database Default. For retrieval requests, the timezone will be resolved in this order: Query Parameters -> Collection Default -> Database Default. In both cases, the final fallback timezone is UTC. ## M5: Comparison Operators We can now use the following expression format to filter on the timestamptz field: - `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op} ISO 'iso_string'` - The interval_string follows the ISO 8601 duration format, for example: P1Y2M3DT1H2M3S. - The iso_string follows the ISO 8601 timestamp format, for example: 2025-01-03T00:00:00+08:00. - Example expressions: "tsz + INTERVAL 'P0D' != ISO '2025-01-03T00:00:00+08:00'" or "tsz != ISO '2025-01-03T00:00:00+08:00'". ## M6: Extract We will be able to extract specific time fields via kwargs in a future Python SDK release. The key is `time_fields`, and the value should be one or more of "year, month, day, hour, minute, second, microsecond", separated by commas or spaces. The result for each record is then an array of int64. 
## M7: Indexing Support Expressions without interval arithmetic can be accelerated using an STL-SORT index. However, expressions that include interval arithmetic cannot be indexed. This is because the result of an interval calculation depends on the specific timestamp value. For example, adding one month to a date in February results in a different number of added days than adding one month to a date in March. --- After this PR, the input/output type of timestamptz is an ISO string. Timestamptz values are stored as timestamptz data, which is ultimately int64_t. > for more information, see https://en.wikipedia.org/wiki/ISO_8601 --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-23 10:24:12 +08:00
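Three points from this commit message can be sketched in plain Python: the timezone precedence, the filter expression format, and why interval arithmetic depends on the timestamp value. The helpers `resolve_timezone`, `tsz_filter`, and `add_one_month` are hypothetical illustrations, not part of Milvus or its SDK; only the expression string format is taken from the text above.

```python
import calendar
from datetime import date


def resolve_timezone(explicit=None, collection_default=None, database_default=None):
    """Pick the first configured timezone, falling back to UTC.

    `explicit` stands for the string-literal offset on insert or the
    query parameter on retrieval.
    """
    for candidate in (explicit, collection_default, database_default):
        if candidate:
            return candidate
    return "UTC"


def tsz_filter(field, op, iso, interval=None, sign="+"):
    """Build a filter string in the format quoted in the commit message."""
    lhs = f"{field} {sign} INTERVAL '{interval}'" if interval else field
    return f"{lhs} {op} ISO '{iso}'"


def add_one_month(d):
    """Add one calendar month, clamping the day to the target month length."""
    year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    day = min(d.day, calendar.monthrange(year, month)[1])
    return d.replace(year=year, month=month, day=day)


# One month added to 1 February 2025 spans 28 days; added to 1 March, 31 days.
feb_days = (add_one_month(date(2025, 2, 1)) - date(2025, 2, 1)).days
mar_days = (add_one_month(date(2025, 3, 1)) - date(2025, 3, 1)).days
```

Because the same interval maps to a different number of days depending on the starting date, a sorted index over the stored values cannot answer such a comparison by a simple range lookup, which matches the reasoning given for M7.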
jiaqizho	338ed2fed4	enhance: Introduce sparse filter in query (#44347 ) issue: #44373 The current commit implements sparse filtering in query tasks using the statistical information (Bloom filter/MinMax) of the Primary Key (PK). The statistical information of the PK is bound to the segment during the segment loading phase. A new filter has been added to the segment filter to enable the sparse filtering functionality. Signed-off-by: jiaqizho <jiaqi.zhou@zilliz.com>	2025-09-23 09:58:09 +08:00
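The idea of pruning segments with primary-key statistics can be sketched as follows. This is a toy model under stated assumptions: `TinyBloom`, `SegmentStats`, and `candidate_segments` are hypothetical names, the Bloom filter is a simplified SHA-256 based one, and none of this reflects the actual Milvus data structures.

```python
import hashlib


class TinyBloom:
    """Very small Bloom filter for illustration only."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SegmentStats:
    """PK statistics bound to a segment at load time (toy model)."""

    def __init__(self, segment_id, pks):
        self.segment_id = segment_id
        self.min_pk = min(pks)
        self.max_pk = max(pks)
        self.bloom = TinyBloom()
        for pk in pks:
            self.bloom.add(pk)


def candidate_segments(stats, pk):
    """Keep only segments whose statistics cannot rule out the PK."""
    return [
        s.segment_id
        for s in stats
        if s.min_pk <= pk <= s.max_pk and s.bloom.might_contain(pk)
    ]
```

The min/max check is exact for range exclusion, while the Bloom filter may report false positives but never false negatives, so a segment that really holds the key is never skipped.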
Gao	d3784c6515	enhance: add storage resource usage for vector search (#44308 ) issue: #44212 Implement search/query storage usage statistics on the Go side (result reduce); storage usage is recorded only in the vector search C++ path for now. The query C++ path still needs to be implemented in follow-up PRs. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>	2025-09-19 20:20:02 +08:00
zhenshan.cao	691a8df953	feat: Add RESTful api for rolling upgrade support (#44381 ) issue: https://github.com/milvus-io/milvus/issues/43968 Co-authored-by: chyezh <ye.zhen@zilliz.com>	2025-09-16 20:08:00 +08:00
yihao.dai	51f69f32d0	feat: Add CDC support (#44124 ) This PR implements a new CDC service for Milvus 2.6, providing log-based cross-cluster replication. issue: https://github.com/milvus-io/milvus/issues/44123 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-09-16 16:32:01 +08:00
congqixia	aa861f55e6	enhance: [StorageV2] Reverts #44232 bucket name change (#44390 ) Related to #39173 - Put bucket name concatenation logic back for azure support This reverts commit 8f97eb355fde6b86cf37f166d2191750b4210ba3. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-16 10:10:00 +08:00
sthuang	9140201b8f	fix: add init fs check for querynode and streaming node (#44360 ) related: #44354 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-09-13 10:31:58 +08:00
congqixia	abe22b95c7	enhance: Utilize group info estimating logic usage as well (#44356 ) Related to #44257 #44334 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-12 22:47:57 +08:00
congqixia	9d2ff48d63	enhance: Utilize group split info to estimate usage (#44338 ) Related to #44257 #44334 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-12 14:49:57 +08:00
aoiasd	9add663a08	fix: idf oracle use wrong dir (#44266 ) relate: https://github.com/milvus-io/milvus/issues/44264 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-09-10 14:41:56 +08:00
congqixia	8f97eb355f	enhance: [StorageV2] Make bucket name concatenation transparent to user (#44232 ) Related to #39173 This PR: - Bump milvus-storage commit to handle bucket name concatenation logic in multipart s3 fs - Remove all user-side bucket name concatenation code Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-08 10:15:55 +08:00
zhagnlu	d67f1ea0ab	enhance: add param to modify dump snapshot batch size (#44215 ) issue: #44216 Signed-off-by: luzhang <luzhang@zilliz.com>	2025-09-05 14:29:54 +08:00
Gao	2e98cb0103	enhance: load resource estimation for tiered index (#44171 ) issue: https://github.com/milvus-io/milvus/issues/42032 - Use bytes to estimate load resource in the whole estimation procedure - Add num_rows and dim info for vector index to better estimate - Disable eviction for tiered index's meta --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-09-04 19:41:53 +08:00
Bingyi Sun	0c0630cc38	feat: support dropping index without releasing collection (#42941 ) issue: #42942 This pr includes the following changes: 1. Added checks for index checker in querycoord to generate drop index tasks 2. Added drop index interface to querynode 3. To avoid search failure after dropping the index, the querynode allows the use of lazy mode (warmup=disable) to load raw data even when indexes contain raw data. 4. In segcore, loading the index no longer deletes raw data; instead, it evicts it. 5. In expr, the index is pinned to prevent concurrent errors. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-02 16:17:52 +08:00
congqixia	7721edf32a	enhance: Add mutex and range check preventing concurrent del (#44128 ) This PR adds a mutex to prevent deletes from being applied concurrently to the same segment, and checks latestDeltaTimestamp to skip overlapping delete ranges. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-01 14:29:52 +08:00
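A rough sketch of the pattern this commit describes, assuming that "overlapping" means delete records whose timestamp is not newer than the last applied one. `SegmentDeleteBuffer` and `apply_deletes` are hypothetical names; the exact semantics of the range check in the PR may differ.

```python
import threading


class SegmentDeleteBuffer:
    """Toy per-segment delete applier guarded by a mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self.latest_delta_ts = 0  # timestamp of the newest applied delete
        self.applied = []         # (pk, ts) records applied so far

    def apply_deletes(self, records):
        """Apply (pk, ts) records, skipping those already covered.

        Returns the number of records actually applied.
        """
        with self._lock:  # only one caller mutates this segment at a time
            fresh = [(pk, ts) for pk, ts in records if ts > self.latest_delta_ts]
            if not fresh:
                return 0
            self.applied.extend(fresh)
            self.latest_delta_ts = max(ts for _, ts in fresh)
            return len(fresh)
```

Holding the lock across both the timestamp check and the update keeps two callers from applying the same range twice.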
zhagnlu	fc876639cf	enhance: support json stats with shredding design (#42534 ) #42533 Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-01 10:49:52 +08:00
sparknack	70c8114e85	enhance: cachinglayer: resource management for segment loading (#43846 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-08-29 11:37:50 +08:00
Buqian Zheng	6420d72391	enhance: print as storage size unit MB with 2 digits only, so the log is easier to read (#44085 ) issue: https://github.com/milvus-io/milvus/issues/41435 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-08-27 19:47:50 +08:00
Chun Han	da156981c6	feat: milvus support posix-compatible mode(milvus-io#43942) (#43944 ) related: #43942 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-08-27 16:29:50 +08:00
XuanYang-cn	37a447d166	feat: Add CMEK cipher plugin (#43722 ) 1. Enable Milvus to read cipher configs 2. Enable cipher plugin in binlog reader and writer 3. Add a testCipher for unittests 4. Support pooling for datanode 5. Add encryption in storagev2 See also: #40321 Signed-off-by: yangxuan <xuan.yang@zilliz.com> --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-08-27 11:15:52 +08:00
Spade A	8456f824be	feat: impl StructArray -- miscellaneous stuff for struct array (#43960 ) Ref https://github.com/milvus-io/milvus/issues/42148 1. enable storage v2 2. implement some missing pieces 3. fix some bugs and add tests --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-08-26 21:35:53 +08:00
Tianx	c0d62268ac	feat: add timestamptz data type (#44005 ) issue: https://github.com/milvus-io/milvus/issues/27467 > https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420 > * [x] M1 Create collection with timestamptz field > * [x] M2 Insert timestamptz field data > * [x] M3 Retrieve timestamptz field data > * [x] M4 Implement handoff The second PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4 described above. --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-08-26 15:59:53 +08:00
Gao	e97a618630	enhance: support readAt interface for remote input stream (#43997 ) #42032 Also, fix the cacheoptfield method to work in storagev2. Also, change the sparse related interface for knowhere version bump #43974 . Also, includes https://github.com/milvus-io/milvus/pull/44046 for metric lost. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-26 11:19:58 +08:00
zhagnlu	8934c18792	enhance: support cache result cache for expr (#43923 ) issue: #43878 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-26 10:55:52 +08:00
sparknack	4fae074d56	enhance: add write rate limit for disk file writer (#43912 ) issue: #43040 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-08-25 10:27:47 +08:00
wei liu	399f63300c	enhance: Implement dynamic interval updates for ticker components (#43865 ) issue: #43858 Enable dynamic configuration updates for ticker intervals without restart. This enhancement allows runtime configuration changes to take effect immediately for better operational flexibility. Changes include: - Apply "drain+Reset only when interval changed" pattern across all ticker components to preserve existing timing phases - Fix goroutine variable capture issue in CheckerController.Start() - Remove unnecessary ticker.Stop() in manual trigger paths - Add dynamic interval checking in QueryCoordV2 components: * checkers/controller.go: Various checker intervals * dist/dist_handler.go: DistPullInterval, CheckExecutedFlagInterval * session/cluster.go: CheckNodeSessionInterval * server.go: CheckAutoBalanceConfigInterval * observers/target_observer.go: UpdateNextTargetInterval * observers/collection_observer.go: CollectionObserverInterval - Add dynamic interval checking in QueryNodeV2 components: * segments/disk_usage_fetcher.go: DiskSizeFetchInterval - Ensure thread safety by performing all ticker operations in same goroutine with proper drain before Reset to avoid spurious triggers --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-08-21 10:07:47 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
Xianhui Lin	c7d8dc100a	fix: add segment lock in LoadTextIndex and LoadJSONKeyIndex (#43811 ) issue: https://github.com/milvus-io/milvus/issues/43572 Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-08-18 01:17:52 +08:00
congqixia	de3e5c285b	enhance: Add downgrade tsafe switch param item (#43874 ) Related to #43873 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-15 12:31:43 +08:00
congqixia	f032044125	enhance: Refine segcore param change callback (#43838 ) Related to #43230 This PR - Move segcore setup function to `initcore` package to remove cgo dependency from pkg - Register core callback only for components depends on segcore - Rectify `UpdateLogLevel` implementation Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-13 19:31:44 +08:00
zhagnlu	c04d678ad4	enhance: make segcore params effective without restarting milvus (#43231 ) #43230 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-08 10:33:48 +08:00
wei liu	715b5153b8	enhance: Improve delegator serviceable check logic in PinReadableSegments (#43768 ) issue: #43767 - Enhance serviceable check logic to properly handle full vs partial result requirements - For full result (requiredLoadRatio >= 1.0): check queryView.Serviceable() - For partial result (requiredLoadRatio < 1.0): check load ratio satisfaction - Add comprehensive unit tests covering all serviceable check scenarios This enhancement ensures delegator correctly validates serviceability based on the requested result completeness, improving reliability of query operations. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-08-07 12:13:40 +08:00
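The two branches of the serviceable check can be written as a small predicate. This is a sketch only: `is_serviceable`, `view_serviceable`, and `loaded_ratio` are hypothetical names, and treating "load ratio satisfaction" as `loaded_ratio >= required_load_ratio` is an assumption.

```python
def is_serviceable(required_load_ratio, view_serviceable, loaded_ratio):
    """Decide whether a delegator can serve a request (toy model).

    required_load_ratio: fraction of data the request needs (1.0 = full result).
    view_serviceable: stand-in for the result of queryView.Serviceable().
    loaded_ratio: fraction of data currently loaded.
    """
    if required_load_ratio >= 1.0:
        # Full result requested: rely on the query view's own check.
        return view_serviceable
    # Partial result requested: enough data loaded is sufficient.
    return loaded_ratio >= required_load_ratio
```

For example, a request that accepts 80% of the data is served once 90% is loaded even if the view is not fully serviceable, whereas a full-result request is not.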
Zhen Ye	5551d99425	enhance: remove old arch non-streaming arch code (#43651 ) issue: #41609 - remove all dml dead code at proxy - remove dead code at l0_write_buffer - remove msgstream dependency at proxy - remove timetick reporter from proxy - remove replicate stream implementation --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-06 14:41:40 +08:00
sparknack	544c7c0600	enhance: update cachinglayer default cache ratio to 0.3 (#43723 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-08-05 01:35:39 +08:00
zhagnlu	f14c7d598c	fix: skip load raw data when loading index for storagev2 (#43720 ) #43653 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-04 21:17:39 +08:00
Chun Han	d826d6ac91	fix: try to get span raw data for variable length data type(#43544 ) (#43705 ) related: #43544 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-08-04 11:15:38 +08:00


811 Commits