milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
congqixia	f8c972a102	fix: update EnableDynamicField and SchemaVersion during collection modification (#45615 ) Related to #45614 This commit fixes a bug where certain collection attributes were not properly updated during collection modification, causing metadata errors after cluster restart and collection reload failures. When altering a collection, the `EnableDynamicField` and `SchemaVersion` attributes were not being persisted to the catalog. This caused inconsistencies between the in-memory collection metadata and the persisted state, leading to: - Dynamic field validation failures after restart - Collection loading errors - Metadata state mismatches Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-18 10:05:39 +08:00
wei liu	7aed88113c	enhance: Deduplicate primary keys in upsert request batch (#45249 ) issue: #44320 This change adds deduplication logic to handle duplicate primary keys within a single upsert batch, keeping the last occurrence of each primary key. Key changes: - Add DeduplicateFieldData function to remove duplicate PKs from field data, supporting both Int64 and VarChar primary keys - Refactor fillFieldPropertiesBySchema into two separate functions: validateFieldDataColumns for validation and fillFieldPropertiesOnly for property filling, improving code clarity and reusability - Integrate deduplication logic in upsertTask.PreExecute to automatically deduplicate data before processing - Add comprehensive unit tests for deduplication with various PK types (Int64, VarChar) and field types (scalar, vector) - Add Python integration tests to verify end-to-end behavior --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-11-17 21:35:40 +08:00
congqixia	e9506f1d64	fix: Handle default values correctly during compaction for added fields (#45572 ) Related to #45543 When a field with a default value is added to a collection, the default value becomes null after compaction instead of retaining the expected default value. Root Cause The `appendValueAt` function in `internal/storage/arrow_util.go` incorrectly checked if the entire arrow.Array was nil before handling default values. This meant that default values were only applied when the array itself was nil, not when individual field values were null (which is the correct condition). Changes 1. Early nil check: Added a guard at the function entry to detect nil arrow.Array and return an error immediately, as this is an unexpected condition that should not occur during normal operation. 2. Refactored default value handling: Removed the per-type nil array checks and moved default value logic to handle individual null values within the array (when `IsNull(idx)` returns true). 3. Applied to all types: Updated the logic consistently across all builder types: - BooleanBuilder - Int8Builder, Int16Builder, Int32Builder, Int64Builder - Float32Builder - StringBuilder - BinaryBuilder (added default value support for internal $meta json) - ListBuilder (removed unnecessary nil check) --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-17 19:03:38 +08:00
aoiasd	96d0e780ac	fix: segcore collection schema update is not concurrency-safe (#45337) Related: https://github.com/milvus-io/milvus/issues/45345 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-14 17:51:37 +08:00
Zhen Ye	40e2042728	enhance: add more metrics for DDL framework (#45558 ) issue: #43897 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-14 15:19:37 +08:00
congqixia	0a208d7224	enhance: Move segment loading logic from Go layer to segcore for self-managed loading (#45488 ) Related to #45060 Refactor segment loading architecture to make segments autonomously manage their own loading process, moving the orchestration logic from Go (segment_loader.go) to C++ (segcore). C++ Layer (segcore): - Added `SetLoadInfo()` and `Load()` methods to `SegmentInterface` and implementations - Implemented `ChunkedSegmentSealedImpl::Load()` with parallel loading strategy: - Separates indexed fields from non-indexed fields - Loads indexes concurrently using thread pools - Loads field data for non-indexed fields in parallel - Implemented `SegmentGrowingImpl::Load()` to convert and load field data - Extracted `LoadIndexData()` as a reusable utility function in `Utils.cpp` - Added `SegmentLoad()` C binding in `segment_c.cpp` Go Layer: - Added `Load()` method to segment interfaces - Updated mock implementations and test interfaces - Integrated new C++ `SegmentLoad()` binding in Go segment wrapper --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-14 11:21:37 +08:00
Spade A	0454cdaab3	fix: remove validateFieldName in dropIndex (#45460 ) issue: https://github.com/milvus-io/milvus/issues/45459 This check is unnecessary when dropping index. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-11-14 10:17:37 +08:00
Xiaofan	1c69c7fa17	enhance: Upgrade etcd to 3.5.23 (#44666) Related to #44614. Fixes the issue where embedded etcd is not affected by the quota config. Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2025-11-14 09:47:38 +08:00
cai.zhang	cc07be3c30	fix: Ignore compaction task when from segment is not healthy (#45534 ) issue: #45533 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-13 23:07:39 +08:00
junjiejiangjjj	102481e53f	feat: Support add_function/alter_function/drop_function (#44895 ) https://github.com/milvus-io/milvus/issues/44053 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-11-13 20:53:39 +08:00
Gao	09a3195867	enhance: support max_connections config for remote storage (#45225 ) related: https://github.com/milvus-io/milvus/issues/45344 Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-11-13 15:37:38 +08:00
Spade A	929dc65882	fix: fix index compatibility after upgrade (#45373 ) issue: https://github.com/milvus-io/milvus/issues/45380 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-11-13 12:59:38 +08:00
junjiejiangjjj	50f198e346	feat: Support zilliz models (#45168 ) https://github.com/milvus-io/milvus/issues/35856 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-11-13 12:55:37 +08:00
groot	e48fe7f820	fix: Fix bulkimport bug for Struct field (#45474 ) issue: https://github.com/milvus-io/milvus/issues/45006 Signed-off-by: yhmo <yihua.mo@zilliz.com>	2025-11-13 11:31:41 +08:00
Xiaofan	a9895bb904	enhance: handle etcd server crashes more robustly (#45304) Related to #45303. Fixes the issue where the Milvus pod may restart when the etcd pod starts. Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2025-11-13 10:23:36 +08:00
Chun Han	406fa7b694	fix: failed to get raw data for hybrid index(#45318 ) (#45411 ) related: #45318 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-11-13 10:17:37 +08:00
Zhen Ye	b7fb8ed38c	fix: use the right resource key lock for ddl and use new ddl in transfer replica (#45506) issue: #45452 - Alias/rename-related DDL should use a database-level exclusive lock. - An alias cannot be used as the resource key of a lock; use the collection name instead. - Transfer replica should use the WAL-based framework. Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-12 19:01:38 +08:00
yihao.dai	cabc47ce01	fix: Fix channel not available error and release collection blocking (#45428 ) 1. Ensure replica creation is idempotent. 2. Prevent currentTarget update when replica is missing. 3. Move the wait-for-release logic into the DDL framework's callback, and add a timeout to prevent it from blocking the DDL callback indefinitely. issue: https://github.com/milvus-io/milvus/issues/45301, https://github.com/milvus-io/milvus/issues/45274, https://github.com/milvus-io/milvus/issues/45295 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-11-12 18:55:37 +08:00
XuanYang-cn	28d0755aaa	fix: Set schema properties before broadcast alter collection (#45502) Without this fix, the collection schema properties are empty in the datacoord caches, so compaction and indexing cannot get properties from the schema. See also: #45053, #45159 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-11-12 18:11:41 +08:00
Zhen Ye	8b01af55b9	fix: remove collection meta when drop partition (#45493 ) issue: #45476 Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-11 23:39:36 +08:00
cai.zhang	216c576da2	fix: Retain collection early to prevent it from being released before query completion (#45413 ) issue: #45314 This PR only ensures that no panic occurs. However, we still need to provide protection for the delegator handling ongoing query tasks. Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-11 20:29:37 +08:00
cai.zhang	d0d908e51d	fix: Fix target segment marked dropped for save stats result twice (#45478 ) issue: #45477 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-11 17:19:38 +08:00
sparknack	9d75d0393e	enhance: some optimization of scalar field fetching in tiered storage scenarios (#45360 ) issue: #43611 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-11-11 17:17:41 +08:00
sijie-ni-0214	77dc512b3b	fix: alter collection with alias failed (#45447 ) issue: #45397 Signed-off-by: sijie-ni-0214 <sijie.ni@zilliz.com>	2025-11-11 16:05:36 +08:00
Zhen Ye	4797bb6ab2	fix: wrong update timetick of collection meta info (#45461 ) issue: #45403, #45463 - fix the Nightly E2E failures. - fix the wrong update timetick of altering collection to fix the related load failure. Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-11 16:01:36 +08:00
cai.zhang	e3c1673191	fix: Fix filter geometry for growing with mmap (#45464 ) issue: #45450 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-11 15:39:36 +08:00
Chun Han	69f3aab229	feat: milvus support huawei cloud iam verification(#45298 ) (#45457 ) related: #45298 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-11-11 14:41:41 +08:00
congqixia	382b1d7de6	fix: correct field data offset calculation in rerank functions for bulk search (#45444 ) Related to #45338 When using bulk vector search in hybrid search with rerank functions, the output field values for different queries were all equal to the values returned by the first query, instead of the correct values belonging to each document ID. The document IDs were correct, but the entity field values were wrong. In rerank functions (RRF, weighted, decay, model), when processing multiple queries in a batch, the `idLocations` stored only the relative offset within each result set (`idx`), not accounting for the absolute position within the entire batch. This caused `FillFieldData` to retrieve field data from the wrong positions, always using offsets relative to the first query. This fix ensures that when processing bulk searches with rerank functions, each result correctly retrieves its corresponding field data based on the absolute offset within the entire batch, resolving the issue where all queries returned the first query's field values. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-11 14:39:41 +08:00
XuanYang-cn	dcf490663c	fix: store database event if the key is invalid (#45348 ) See also: #45136, #45124 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-11-11 10:55:36 +08:00
congqixia	8d1ea751a6	fix: Support JSON default values in FillFieldData (#45455 ) Related to #45445 Previously, FillFieldData for JSON fields would assert and fail when a default_value was provided, blocking index creation for JSON fields with default values (including dynamic fields like $meta). This change enables JSON default value support by: - Removing the assertion that blocked default values - Parsing bytes_data into Json objects when default_value is present - Properly filling data_ array and setting valid_data_ bitset to true - Maintaining null behavior when no default_value is provided Impact: - Fixes index creation failure for JSON fields with default values - Resolves upgrade issues from 2.5 to 2.6.5 where dynamic fields with default values couldn't be indexed - Index builds that were stuck in InProgress state can now complete Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-11 10:35:36 +08:00
Spade A	6f4abab6c8	fix: nextFieldID does not consider STRUCT (#45437 ) issue: https://github.com/milvus-io/milvus/issues/45362 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-11-11 10:31:36 +08:00
zhenshan.cao	45907747e2	feat: Add /livez for Liveness Probes (#45454 ) issue: https://github.com/milvus-io/milvus/issues/45443 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2025-11-11 09:51:15 +08:00
Gao	e9a875f7ac	enhance: override index_type while creating segment index (#45416 ) issue: #44752 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-11-11 07:27:36 +08:00
congqixia	0e1de0073a	enhance: Update tantivy-binding with cargo build result (#45458) Related to #44988 This PR commits the newly updated tantivy-binding.h from the cargo build result, which should pass the format check. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-10 18:09:36 +08:00
XuanYang-cn	897ac983c8	feat: Add new config and enable to dynamic update configs (#45170 ) This PR changes the config layout according to the latest design, and adds two external credential configs for aws kms See also: #45169 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-11-10 14:43:35 +08:00
aoiasd	e82bf0e54f	enhance: fix typo of analyzer params (#45299 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-10 14:35:35 +08:00
sparknack	f815f57b82	enhance: check both eviction and warmup when estimate segment loading size (#45222 ) issue: #44857 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-11-10 14:15:36 +08:00
aoiasd	a38a0deb43	enhance: prevent panic by adding null pointer check when clearing InsertRecord _pk2offset_ (#45281 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-10 11:37:35 +08:00
Chun Han	87b466fd83	fix: Group value is nil(#45418 ) (#45422 ) related: #45418 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-11-08 10:29:33 +08:00
Xiaofan	7aa0ca5d4e	enhance: Clean unused conan dependency (#45366 ) fix #45365 Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2025-11-07 17:07:34 +08:00
Buqian Zheng	515a939edf	enhance: remove obsolete code (#45307 ) issue: #44452 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-11-07 16:07:35 +08:00
Amit Kumar	388d56fdc7	enhance: Add support for minimum_should_match in text_match (parser, engine, client, and tests) (#44988) Please see https://github.com/milvus-io/milvus/issues/44593 for the background. This PR makes https://github.com/milvus-io/milvus/pull/44638 redundant, so that PR can be closed. The review comments on the original implementation suggested a better alternative approach, which this PR implements. --- This PR - Adds an optional `minimum_should_match` argument to `text_match(...)` and wires it through the parser, planner/visitor, index bindings, and client-level tests/examples so full-text queries can require a minimum number of tokens to match. Motivation - Provide a way to require an expression to match a minimum number of tokens in lexical search. What changed - Parser / grammar - Added grammar rule and token: `MINIMUM_SHOULD_MATCH` and `textMatchOption` in `internal/parser/planparserv2/Plan.g4`. - Regenerated parser outputs: `internal/parser/planparserv2/generated/*` (parser, lexer, visitor, etc.) to support the new rule. - Planner / visitor - `parser_visitor.go`: parse and validate the `minimum_should_match` integer; propagate as an extra value on the `TextMatch` expression so downstream components receive it. - Added `VisitTextMatchOption` visitor method handling. - Client (Golang) - Added a unit test to verify `text_match(..., minimum_should_match=...)` appears in the generated DSL and is accepted by client code: `client/milvusclient/read_test.go` (new test coverage). - Added an integration-style test for the feature to the go-client testcase suite: `tests/go_client/testcases/full_text_search_test.go` (exercise min=1, min=3, large min). - Added an example demonstrating `text_match` usage: `client/milvusclient/read_example_test.go` (example name conforms to godoc mapping). - Engine / index - Updated C++ index interface: `TextMatchIndex::MatchQuery` - Added/updated unit tests for the index behavior: `internal/core/src/index/TextMatchIndexTest.cpp`. - Tantivy binding - Added `match_query_with_minimum` implementation and unit tests to `internal/core/thirdparty/tantivy/tantivy-binding/src/index_reader_text.rs` that construct boolean queries with minimum required clauses. Behavioral / compatibility notes - This adds an optional argument to `text_match` only; default behavior (no `minimum_should_match`) is unchanged. - Internal API change: `TextMatchIndex::MatchQuery` signature changed (internal component). Callers in the repo were updated accordingly. - Parser changes required regenerating ANTLR outputs. Tests and verification - New/updated tests: - Go client unit test: `client/milvusclient/read_test.go` (mocked Search request asserts DSL contains `minimum_should_match=2`). - Go e2e-style test: `tests/go_client/testcases/full_text_search_test.go` (exercises min=1, 3 and a large min). - C++ unit tests for index behavior: `internal/core/src/index/TextMatchIndexTest.cpp`. - Rust binding unit tests for `match_query_with_minimum`. - Local verification commands to run: - Go client tests: `cd client && go test ./milvusclient -run ^$` (client package) - Go testcases: `cd tests/go_client && go test ./testcases -run TestTextMatchMinimumShouldMatch` (requires a running Milvus instance) - C++ unit tests / build: run core build/test per repo instructions (the change touches core index code). - Rust binding tests: `cd internal/core/thirdparty/tantivy/tantivy-binding && cargo test` (if developing locally). --------- Signed-off-by: Amit Kumar <amit.kumar@reddit.com> Co-authored-by: Amit Kumar <amit.kumar@reddit.com>	2025-11-07 16:07:11 +08:00
aoiasd	6102f001a9	enhance: skip source id check (#45377) Related: https://github.com/milvus-io/milvus/issues/45381 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-07 15:19:34 +08:00
yihao.dai	2fad5b34f7	fix: Fix data race in replicate stream client (#45346 ) issue: https://github.com/milvus-io/milvus/issues/44123 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-11-07 10:17:33 +08:00
cai.zhang	7527ddf50f	enhance: [test] Move R-Tree index tests into the implementation package (#45355 ) Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-07 10:03:33 +08:00
cai.zhang	b8f9384a85	fix: Skip building text index for newly added columns (#45316 ) issue: #45315 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-06 19:47:35 +08:00
XuanYang-cn	2dd2c96eb1	fix: Accidentally ignored sealed segments in L0 Compaction (#45340) When there are no growing segments in the collection, L0 Compaction tries to choose all L0 segments that hit all L1/L2 segments. However, if a Sealed Segment is still being flushed in DataNode at the same time that L0 Compaction selects the matching L1/L2 segments, L0 Compaction ignores this segment because it is not in "FlushState", which is wrong and causes missing deletes on the Sealed Segment. The quick solution here is to fail the L0 compaction task once a Sealed segment is selected. See also: #45339 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-11-06 16:53:38 +08:00
XuanYang-cn	623a9e5156	fix: Accurate size estimation for sliced arrow arrays in compaction (#45294 ) Sliced arrow arrays "incorrectly" returned the original array's size via SizeInBytes(), causing inaccurate memory estimates during compaction. This resulted in segments closing prematurely in mergeSplit mode - expected 500MB compactions produced 4x100+MB segments instead. Fixed by calculating actual byte size of sliced arrays, ensuring proper segment sizing and more accurate memory usage tracking. See also: #45293 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-11-06 14:57:34 +08:00
congqixia	e284733399	fix: Move FinishLoad before text index creation to ensure raw data availability (#45334 ) Related to #45333 Fix segment loading failure when adding fields with text match enabled. The issue occurred because text indexes were being loaded before FinishLoad() was called, meaning raw data was not properly available when text index creation attempted to access it, resulting in "failed to create text index, neither raw data nor index are found" errors. Solution is to move the FinishLoad() call to execute after raw data loading but before text index loading. This ensures that: 1. Raw data is properly loaded and available in memory 2. Text indexes can access the raw data they need during creation 3. The segment is in the correct state before any index operations Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-06 14:49:34 +08:00
zhagnlu	59c64bee07	fix: not use json_shredding for json path is null (#45310 ) #45284 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-11-06 11:43:33 +08:00
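
Commit 7aed88113c above describes deduplicating primary keys within a single upsert batch, keeping the last occurrence of each key. The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `deduplicate_rows`, the row-as-dict layout, and the choice to preserve the relative order of the surviving rows are all hypothetical and are not taken from Milvus's `DeduplicateFieldData`, which operates on columnar field data.

```python
from typing import Any, Dict, Hashable, List


def deduplicate_rows(rows: List[Dict[str, Any]], pk_field: str) -> List[Dict[str, Any]]:
    """Keep only the last row seen for each primary key.

    Hypothetical helper for illustration; surviving rows keep their
    relative order from the input batch (an assumption of this sketch).
    """
    # First pass: remember the index of the last occurrence of every key.
    last_index: Dict[Hashable, int] = {}
    for i, row in enumerate(rows):
        last_index[row[pk_field]] = i

    # Second pass: keep a row only if it is the last occurrence of its key.
    return [row for i, row in enumerate(rows) if last_index[row[pk_field]] == i]


# A batch in which primary key 1 appears twice; the later row should win.
batch = [
    {"id": 1, "v": "a"},
    {"id": 2, "v": "b"},
    {"id": 1, "v": "c"},
]
deduped = deduplicate_rows(batch, "id")
```

Because the helper only requires hashable keys, the same sketch covers both integer and string primary keys, mirroring the Int64 and VarChar cases the commit message mentions.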

11376 Commits