milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
cai.zhang	9b4b0cb808	enhance: [2.5] Estimate the taskSlot based on whether scalar or vector index (#46260 ) issue: #45186 master pr: #45850 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-12-11 15:43:14 +08:00
Buqian Zheng	dcc3975f17	fix: [2.5] move cursor after skip index skipped a chunk (#46078 ) issue: https://github.com/milvus-io/milvus/issues/46053 pr: https://github.com/milvus-io/milvus/pull/46054 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-12-05 13:59:11 +08:00
zhagnlu	f1f11b336b	fix: fix undefined behavior when dumping snapshot (#45613 ) pr: #45611 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-12-05 10:49:12 +08:00
cai.zhang	6ce2df9944	fix: [2.5]Fix setting default value for geometry by restful (#46058 ) (#46065 ) issue: https://github.com/milvus-io/milvus/issues/46056 master pr: https://github.com/milvus-io/milvus/pull/46058 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-12-04 16:35:11 +08:00
congqixia	e70e70699c	enhance: [2.5] skip adding stopping node to resource group in handleNodeUp (#45969 ) (#45982 ) Cherry-pick from master pr: #45969 Related to #45960 Follow-up to #45961 After #45961 ensured that handleNodeUp is always called for nodes discovered during rewatchNodes (including stopping nodes), this change adds a safeguard in ResourceManager.handleNodeUp to skip adding stopping nodes to resource groups. 1. resource_manager.go: Add check for IsStoppingState() in handleNodeUp to prevent stopping nodes from being added to incomingNode set and assigned to resource groups. 2. server.go: - Delete processed nodes from sessionMap to avoid duplicate processing in the subsequent loop - Add warning logs for stopping state transitions during rewatch Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-12-02 10:23:13 +08:00
congqixia	61c80235dd	fix: [2.5] update QueryNode NumEntities metrics when collection has no segments (#45147 ) (#45981 ) Cherry-pick from master pr: #45147 Related to #44509 Fix a bug where QueryNodeNumEntities metrics were not updated for collections with zero segments, causing stale metrics when all segments are flushed or compacted. The previous implementation used separate loops: one to update size metrics for all collections, and another to update num entities metrics only for collections present in the grouped segments map. Collections with no segments were skipped in the second loop, leaving their NumEntities metrics stale. Changes: - Consolidate size and num entities metric updates into single loop - Iterate over all collections instead of grouped segments - Get collection metadata from manager instead of segment instances - Correctly set NumEntities to 0 for collections with no segments - Apply the same fix to both growing and sealed segment processing - Add nil check for collection metadata before processing This ensures all collection metrics are updated consistently, even when segment count drops to zero. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-12-02 10:19:10 +08:00
1mmortal	0ece57325e	fix: [2.5] Correcting the incorrect AllSearchCount value in the result of hybrid_search. (#45843 ) Correcting the incorrect AllSearchCount value in the result of hybrid_search. #45842 Signed-off-by: 1mmortal <lmzzzzz1@163.com>	2025-12-01 15:11:11 +08:00
congqixia	a24a0f11aa	fix: [2.5] always call handleNodeUp in rewatchNodes for proper stopping balance (#45964 ) Cherry-pick from master pr: #45961 Related to #45960 When QueryCoord restarts or reconnects to etcd, the rewatchNodes function previously skipped handleNodeUp for QueryNodes in stopping state. This caused stopping balance to fail because necessary components were not initialized: - Task scheduler executor was not added - Dist handler was not started - Node was not registered in resource manager This fix ensures handleNodeUp is always called for new nodes regardless of their stopping state, followed by handleNodeStopping if the node is stopping. This allows the graceful shutdown process to correctly migrate segments and channels away from stopping nodes. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-12-01 11:11:10 +08:00
Bingyi Sun	ba6198a3b8	fix: Replace json.doc() calls with json.dom_doc() in JsonContainsExpr (#45785 ) issue: https://github.com/milvus-io/milvus/issues/45783 pr: https://github.com/milvus-io/milvus/pull/45573 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-11-25 20:19:07 +08:00
aoiasd	7af4e4076d	enhance: [2.5] optimize bm25 stats load. (#45780 ) relate: https://github.com/milvus-io/milvus/issues/41424 pr: https://github.com/milvus-io/milvus/pull/44279 https://github.com/milvus-io/milvus/pull/44628 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-25 10:39:08 +08:00
cai.zhang	74a0363df7	fix: [2.5] Remove the incorrect reset task step (#45771 ) issue: #45184 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-21 19:47:06 +08:00
Buqian Zheng	1fda4bcae4	enhance: [2.5] add ScalarFieldProto& overload to avoid unnecessary copies (#45744 ) 1. Array.h: Add output_data(ScalarFieldProto&) overload for both Array and ArrayView classes 2. Use std::string_view instead of std::string for VARCHAR and GEOMETRY types to avoid extra string copies 3. Call Reserve(length_) before writing to proto objects to reduce memory reallocations a simple test shows those optimizations improve the Array of Varchar bulk_subscript performance by 20% issue: https://github.com/milvus-io/milvus/issues/45679 pr: https://github.com/milvus-io/milvus/pull/45743 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-11-21 12:39:05 +08:00
wei liu	2232dfc3de	fix: Prevent Close from hanging on etcd reconnection (#45622 ) issue: #45623 When etcd reconnects, the DataCoord rewatches DataNodes and calls ChannelManager.Startup again without closing the previous instance. This causes multiple contexts and goroutines to accumulate, leading to Close hanging indefinitely waiting for untracked goroutines. Root cause: - Etcd reconnection triggers rewatch flow and calls Startup again - Startup was not idempotent, allowing repeated calls - Multiple context cancellations and goroutines accumulated - Close would wait indefinitely for untracked goroutines Changes: - Add started field to ChannelManagerImpl - Refactor Startup to check and handle restart scenario - Add state check in Close to prevent hanging --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-11-19 12:49:06 +08:00
Bingyi Sun	f1844c9841	enhance: optimize term expr performance (#45490 ) issue: https://github.com/milvus-io/milvus/issues/45641 pr: https://github.com/milvus-io/milvus/pull/45491 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-11-19 11:51:06 +08:00
7y-9	a42e847678	fix: [2.5] Fix infinite loop in ResourceManager recovery process (#45563 ) relate: https://github.com/milvus-io/milvus/issues/45557 Signed-off-by: lianyu.sun <lianyu.sun@ly.com>	2025-11-17 15:19:39 +08:00
cai.zhang	6eb77ddc4d	fix: [2.5]Fix target segment marked dropped for save stats result twice (#45480 ) issue: #45477 master pr: #45478 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-13 09:41:37 +08:00
cai.zhang	1d6786545b	fix: [2.5] Fix filter geometry for growing with mmap (#45466 ) issue: #45450 master pr: #45464 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-11 15:41:40 +08:00
aoiasd	7ad68910d9	enhance: [2.5] skip check source id (#45383 ) pr: https://github.com/milvus-io/milvus/pull/45377 relate:https://github.com/milvus-io/milvus/issues/45381 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-07 15:21:42 +08:00
XuanYang-cn	2d6c736448	fix: [2.5]Accidentally ignored sealed segments in L0 Compaction #45341 (#45342 ) When there're no growing segments in the collection, L0 Compaction will try to choose all L0 segments that hits all L1/L2 segments. However, if there's Sealed Segment still under flushing in DataNode at the same time L0 Compaction selects satisfied L1/L2 segments, L0 Compaction will ignore this Segment because it's not in "FlushState", which is wrong, causing missing deletes on the Sealed Segment. This quick solution here is to fail this L0 compaction task once selected a Sealed segment. See also: #45339 pr: #45340 pr: #45341 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-11-06 19:21:35 +08:00
sparknack	91645d9242	enhance: [2.5] unify the aligned buffer for both buffered and direct I/O (#45324 ) issue: #43040 pr: #45323 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-11-06 10:55:35 +08:00
sparknack	561b167f1e	fix:[2.5] avoid potential race conditions when updating the executor (#45231 ) issue: #43030 pr: #45230 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-11-05 10:15:33 +08:00
cai.zhang	2e4502a4fc	fix: [2.5]Skip create tmp dir for growing R-Tree index (#45258 ) issue: https://github.com/milvus-io/milvus/issues/45181 master pr: https://github.com/milvus-io/milvus/pull/45256 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-04 17:35:34 +08:00
cai.zhang	cc9735ff4f	enhance: [2.5]Make GeometryCache an optional configuration (#45197 ) issue: #45187 master pr: #45192 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-03 20:31:34 +08:00
cai.zhang	dfcef7d14d	fix: [2.5]Fix sort stats task failed when segment is compacting (#45185 ) issue: #45184 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-03 11:25:33 +08:00
cai.zhang	0ca74f234f	fix: [2.5] Fix import null geometry data (#45163 ) issue: #44787 master pr: #45161 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-31 12:34:10 +08:00
foxspy	0f0ea4d206	enhance: [2.5] update knowhere version (#45148 ) issue: #42937 /kind branch-feature Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-10-30 10:08:08 +08:00
cai.zhang	e58cd7fcc4	fix: [2.5]Fix bug for importing Geometry data (#45091 ) issue: https://github.com/milvus-io/milvus/issues/44787 , https://github.com/milvus-io/milvus/issues/45012 master pr: https://github.com/milvus-io/milvus/pull/45089 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-29 18:48:13 +08:00
cai.zhang	3ebd1f2f26	fix: [2.5]Fix retrieve geometry null data when enable mmap (#45142 ) issue: #44648 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-29 16:48:12 +08:00
aoiasd	529a31a1bf	enhance: [2.5]support use nullable field as bm25 function input field (#44586 ) (#45118 ) relate: https://github.com/milvus-io/milvus/pull/44586 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-28 19:20:11 +08:00
zhagnlu	78d70db6fd	fix: support skip load json stats when disable jsonstats (#45098 ) pr: #45101 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-10-28 10:50:11 +08:00
congqixia	9ed77d4484	fix: [2.5] prevent data race in querycoord collection notifier update (#45037 ) (#45052 ) Cherry-pick from master pr: #45037 Fixes #45035 This commit addresses a data race issue where refreshCollection was updating the collection notifier without proper lock protection. Changes: - Add UpdateCollection method to CollectionManager with proper locking - Introduce CollectionOperator pattern for thread-safe collection updates - Make setRefreshNotifier private and use it through the operator pattern - Update refreshCollection to use the new UpdateCollection method - Handle collection not found error gracefully in refreshCollection The CollectionOperator pattern ensures all collection modifications go through the CollectionManager's lock, preventing concurrent access issues. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-23 19:34:12 +08:00
wei liu	c633556fee	fix: [2.5] Handle empty FieldsData in reduce/rerank for requery scenario (#44919 ) issue: #44909 pr: #44917 When requery optimization is enabled, search results contain IDs but empty FieldsData. During reduce/rerank operations, if the first shard has empty FieldsData while others have data, PrepareResultFieldData initializes an empty array, causing AppendFieldData to panic when accessing array indices. Changes: - Find first non-empty FieldsData as template in 5 functions: reduceAdvanceGroupBY, reduceSearchResultDataWithGroupBy, reduceSearchResultDataNoGroupBy, rankSearchResultDataByGroup, rankSearchResultDataByPk - Add length check before 4 AppendFieldData calls to prevent panic - Add unit tests for empty and partial empty FieldsData scenarios This fix handles both pure requery (all empty) and mixed scenarios (some empty, some with data) without breaking normal search flow. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-10-21 19:48:04 +08:00
cai.zhang	d43b030b4d	fix: [2.5] Fix bug for gis function to filter geometry (#44968 ) issue: #44961 master pr: #44966 This PR fixes 3 geometry related bugs: 1. Implement ToString interface for GisFunctionFilter. 2. Ignore GisFunctionFilter MoveCursor for growing segment. 3. Don't skip null geometries when building the R-Tree index; they should be recorded in null_offsets. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-21 17:00:13 +08:00
Bingyi Sun	a0201ef98d	enhance: optimize the performance of bitmap reverse lookup (#44804 ) (#44958 ) pr: https://github.com/milvus-io/milvus/pull/44804 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-10-21 14:38:04 +08:00
cai.zhang	c6cc3d2c25	fix: [2.5] Fix the geometry return POINT(0 0) when growing mmap is enabled (#44891 ) issue: #44802 master pr: #44889 After a Geometry object is serialized into WKB, the resulting binary may contain '\0' bytes. When growing mmap is enabled, the append data logic uses strcpy, which stops copying at the first '\0' byte. This causes only part of the WKB, typically the portion up to the geometry type field, to be copied, leading to corrupted data. As a result, during parsing, all POINT geometries are incorrectly interpreted as POINT(0 0). To fix this issue, memcpy is used instead of strcpy. Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-17 17:16:08 +08:00
cai.zhang	f27dfa4490	enhance: [2.5]Support import geometry data by json/csv (#44828 ) issue: #44787 master pr: #44826 2.6 pr: #44827 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-17 17:14:23 +08:00
cqy123456	e4b72977dd	fix:[2.5]remove the limit of deduplicate case when disable autoindex (#44782 ) issue: https://github.com/milvus-io/milvus/issues/44702 related pr: https://github.com/milvus-io/milvus/pull/44825 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-10-17 11:40:02 +08:00
congqixia	93411a388c	fix: [2.5] ensure deterministic search result ordering when scores are equal (#44870 ) (#44885 ) Cherry-pick from master pr: #44870 Related to #44819 This fix addresses an issue(#44819) where the offset parameter did not work correctly during searches when multiple results had identical scores. The problem occurred because results with equal scores were not consistently ordered, leading to unpredictable pagination behavior. The solution adds a new sorting step (SortEqualScoresByPks) in the reduce phase that sorts results with identical scores by their primary keys in ascending order. This ensures deterministic ordering and enables proper offset functionality. Changes: - Add SortEqualScoresByPks() to sort results with equal scores by PK - Add SortEqualScoresOneNQ() to handle per-query sorting logic - Invoke sorting step after FillPrimaryKey() in Reduce() workflow --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-16 19:34:08 +08:00
wei liu	82081eba1b	fix: [2.5] Fix deactivate balance checker also stops stopping balance (#44835 ) issue: #43858 pr: #44834 Fix the issue introduced in PR #43992 where deactivating the balance checker incorrectly stops stopping balance operations. Changes: - Move IsActive() check after stopping balance logic - Only skip normal balance when checker is inactive - Allow stopping balance to proceed regardless of checker state This ensures stopping balance can execute even when the balance checker is deactivated. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-10-15 15:56:01 +08:00
aoiasd	71fc23dd24	fix: [2.5] dropped segment in excluded segment use wrong excluded ts (#44771 ) relate: https://github.com/milvus-io/milvus/issues/43114 pr: https://github.com/milvus-io/milvus/pull/43115 https://github.com/milvus-io/milvus/pull/44769 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-15 15:06:01 +08:00
wei liu	47949fd883	enhance: Implement rewatch mechanism for etcd failure scenarios (#43829 ) (#43920 ) issue: #43828 pr: #43829 #43909 Implement robust rewatch mechanism to handle etcd connection failures and node reconnection scenarios in DataCoord and QueryCoord, along with heartbeat lag monitoring capabilities. Changes include: - Implement rewatchDataNodes/rewatchQueryNodes callbacks for etcd reconnection scenarios - Add idempotent rewatchNodes method to handle etcd session recovery gracefully - Add QueryCoordLastHeartbeatTimeStamp metric for monitoring node heartbeat lag - Clean up heartbeat metrics when nodes go down to prevent metric leaks --------- --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com> Co-authored-by: Zhen Ye <chyezh@outlook.com>	2025-10-15 14:12:01 +08:00
congqixia	1f94b1f5f6	fix: [2.5] avoid concurrent Reset/Add operations on DataCoord metrics (#44789 ) (#44817 ) Cherry-pick from master pr: #44789 This commit addresses issue #44788 where the `datacoord_stored_binlog_size` metric could become inaccurate when multiple concurrent `GetMetrics` calls arrived at DataCoord. ### Problem The original implementation called `Reset()` followed by `Add()` operations on Prometheus metrics within the `GetQuotaInfo()` method. When multiple goroutines invoked this method concurrently, race conditions occurred: - Thread 1: Reset() → Add(value1) - Thread 2: Reset() → Add(value2) - Result: Metrics could be reset multiple times and values added in an interleaved fashion, leading to inaccurate and inflated metric values ### Solution Changed the approach from `Reset() + Add()` to aggregating metric values in local maps first, then using `Set()` to update metrics atomically: 1. Collect segment size data into local maps: - `storedBinlogSize`: tracks size per collection per segment state - `binlogFileSize`: tracks total file count per collection - `coll2DbName`: maps collection IDs to database names 2. After aggregation is complete, use `Set()` (instead of `Add()`) to update metrics in a single operation per label combination This ensures that concurrent `GetMetrics` calls don't interfere with each other, as each invocation works with its own local state and only updates the final metric value atomically. --------- --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-14 20:06:00 +08:00
jiaqizho	00ef6032c6	enhance:[2.5] Introduce sparse filter in query (#44347 ) (#44790 ) pr: #44347 Signed-off-by: jiaqizho <jiaqi.zhou@zilliz.com>	2025-10-14 15:02:01 +08:00
congqixia	c30cb6c283	enhance: [2.5] Add accesslog field for template value length info (#44723 ) (#44791 ) Cherry-pick from master pr: #44723 Related to #36672 Add accesslog field displaying value length for search/query request may help developers debug related issues --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-13 14:29:58 +08:00
wei liu	cbe2761e99	fix: Fix L0 segment duplicate load task generation during channel balance (#44700 ) issue: #44699 Fix the issue where L0 segment checking logic incorrectly identifies L0 segments as missing when they exist on multiple delegators during channel balance process, which blocks sealed segment loading and target progression. Changes include: - Replace GetLatestShardLeaderByFilter with GetByFilter to check all delegators instead of only the latest leader - Iterate through all delegator views to identify which ones lack the L0 segment The original logic only checked the latest shard leader, causing false positive detection of missing L0 segments when they actually exist on other delegators in the same channel during balance operations. This led to continuous generation of duplicate L0 segment load tasks, preventing normal sealed segment loading flow. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-10-11 10:04:00 +08:00
cai.zhang	52ab33ba88	fix: [2.5] Skip empty loop for process growing segment (#44608 ) issue: #43427 master pr: #44606 The GISFunction asserts that the segment_offsets cannot be nullptr. When size is 0, the segment_offsets is nullptr, so the loop is skipped. Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-10 14:55:59 +08:00
congqixia	cb0e88632f	enhance: [2.5] Make accesslog.$consistency_level represent actual value used (#44708 ) Cherry-pick from master pr: #44706 Related to #44703 This PR: - Add `SetActualConsistencyLevel` to `info.AccessInfo` interface and related util method processing it - Make `$consistency_level` returning actual value if set --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-09 21:55:59 +08:00
Bingyi Sun	9434a3bdaa	fix: Fix bulk import with autoid (#44601 ) pr: #44604 issue: #44424 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-10-09 14:51:58 +08:00
congqixia	c86d68bea5	enhance: [2.5] Bump arrow/go to v17 (#44663 ) Related to #40777 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-09 11:47:57 +08:00
wei liu	892d63d26e	enhance: [2.5] Refactor balance checker with priority queue (#43992 ) (#44588 ) issue: #43858 pr: #43992 Refactor the balance checker implementation to use priority queues for managing collection balance operations, improving processing efficiency and order control. Changes include: - Export priority queue interfaces (Item, BaseItem, PriorityQueue) - Replace collection round-robin with priority-based queue system - Add BalanceCheckCollectionMaxCount configuration parameter - Optimize balance task generation with batch processing limits - Refactor processBalanceQueue method for different strategies - Enhance test coverage with comprehensive unit tests The new priority queue system processes collections based on row count or collection ID order, providing better control over balance operation priorities and resource utilization. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-09-28 19:23:05 +08:00
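
Commit 93411a388c describes sorting results with identical scores by primary key in ascending order so that offset-based pagination becomes deterministic. The following is a minimal Go sketch of that idea only, not the code from the commit: the `hit` type and the `sortEqualScoresByPK` and `page` helpers are hypothetical names, and it assumes a higher score ranks first, which may not hold for every metric type.

```go
package main

import (
	"fmt"
	"sort"
)

// hit is a hypothetical, simplified search result: a primary key and its score.
type hit struct {
	PK    int64
	Score float32
}

// sortEqualScoresByPK orders hits by descending score (an assumption of this
// sketch) and breaks ties between identical scores by ascending primary key.
func sortEqualScoresByPK(hits []hit) {
	sort.SliceStable(hits, func(i, j int) bool {
		if hits[i].Score != hits[j].Score {
			return hits[i].Score > hits[j].Score
		}
		return hits[i].PK < hits[j].PK
	})
}

// page applies offset and limit to an already sorted slice.
func page(hits []hit, offset, limit int) []hit {
	if offset >= len(hits) {
		return nil
	}
	end := offset + limit
	if end > len(hits) {
		end = len(hits)
	}
	return hits[offset:end]
}

func main() {
	hits := []hit{
		{PK: 7, Score: 0.9},
		{PK: 3, Score: 0.9},
		{PK: 5, Score: 0.95},
		{PK: 1, Score: 0.9},
	}
	sortEqualScoresByPK(hits)
	// With a fixed tie-break, the same offset always selects the same rows.
	fmt.Println(page(hits, 1, 2))
}
```

Without the tie-break, the three hits that share a score could appear in any order, so two requests with different offsets might return overlapping or missing rows.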

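Commit 1f94b1f5f6 replaces a `Reset()` followed by `Add()` sequence with aggregation into local maps and a final `Set()` per label. The sketch below illustrates that pattern under stated assumptions: `gaugeVec` is a hypothetical stand-in for a labelled gauge rather than the Prometheus client type, `segment` and `report` are illustrative names, and the label is reduced to a single collection name instead of the collection and segment-state combination the commit mentions.

```go
package main

import (
	"fmt"
	"sync"
)

// gaugeVec is a hypothetical stand-in for a labelled gauge metric.
type gaugeVec struct {
	mu     sync.Mutex
	values map[string]float64
}

func newGaugeVec() *gaugeVec {
	return &gaugeVec{values: make(map[string]float64)}
}

// Set overwrites the value stored for one label.
func (g *gaugeVec) Set(label string, v float64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.values[label] = v
}

// Get returns the value stored for one label.
func (g *gaugeVec) Get(label string) float64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.values[label]
}

// segment is a hypothetical, simplified segment record.
type segment struct {
	Collection string
	Size       float64
}

// report sums sizes into a map local to this call, then publishes each total
// with a single Set. Concurrent callers never observe a half-built sum.
func report(g *gaugeVec, segments []segment) {
	local := make(map[string]float64)
	for _, s := range segments {
		local[s.Collection] += s.Size
	}
	for coll, total := range local {
		g.Set(coll, total)
	}
}

func main() {
	g := newGaugeVec()
	segs := []segment{{"c1", 10}, {"c1", 20}, {"c2", 5}}

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			report(g, segs)
		}()
	}
	wg.Wait()
	fmt.Println(g.Get("c1"), g.Get("c2"))
}
```

Because every call writes the same fully aggregated total, repeated or overlapping calls cannot inflate the value the way interleaved `Reset()` and `Add()` calls can.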

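Commit c6cc3d2c25 attributes the POINT(0 0) results to `strcpy` stopping at the first zero byte inside a WKB buffer. The Go sketch below only emulates the two C copy semantics to show why the coordinates are lost; `pointWKB`, `copyCString`, and `copyBytes` are hypothetical helpers, and the zero-filled destination buffer is an assumption of this example rather than a detail stated in the commit.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"math"
)

// pointWKB encodes POINT(x y) as little-endian WKB: a 1-byte order marker,
// a 4-byte geometry type (1 = Point), then two float64 coordinates.
func pointWKB(x, y float64) []byte {
	buf := make([]byte, 21)
	buf[0] = 1 // little-endian byte order marker
	binary.LittleEndian.PutUint32(buf[1:5], 1)
	binary.LittleEndian.PutUint64(buf[5:13], math.Float64bits(x))
	binary.LittleEndian.PutUint64(buf[13:21], math.Float64bits(y))
	return buf
}

// copyCString emulates strcpy: it copies bytes up to, not including,
// the first zero byte and returns how many bytes were copied.
func copyCString(dst, src []byte) int {
	n := bytes.IndexByte(src, 0)
	if n < 0 {
		n = len(src)
	}
	return copy(dst, src[:n])
}

// copyBytes emulates memcpy with an explicit length: every byte is copied.
func copyBytes(dst, src []byte) int {
	return copy(dst, src)
}

func main() {
	wkb := pointWKB(1.5, 2.5)

	truncated := make([]byte, len(wkb))
	full := make([]byte, len(wkb))

	fmt.Println("strcpy-like copied:", copyCString(truncated, wkb))
	fmt.Println("memcpy-like copied:", copyBytes(full, wkb))
	fmt.Println("full copy intact:", bytes.Equal(full, wkb))
}
```

The geometry type field for a point is `01 00 00 00` in little-endian form, so the string-style copy stops inside that field and the coordinate bytes are never written.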
10564 Commits