milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-02 08:55:56 +08:00

Author	SHA1	Message	Date
SimFG	91d40fa558	fix: Update logging context and upgrade dependencies (#41318 ) - issue: #41291 --------- Signed-off-by: SimFG <bang.fu@zilliz.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-04-23 10:52:38 +08:00
congqixia	b36c88f3c8	enhance: [AddField] Broadcast schema change via WAL (#41373 ) Related to #39718 Add Broadcast logic for collection schema change and notifies: - Streamnode - Delegator - Streamnode - Flush component - QueryNodes via grpc --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-22 16:28:37 +08:00
Xianhui Lin	f9febe3bae	enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006 ) Merge RootCoord, DataCoord And QueryCoord into MixCoord Make Session into one issue : https://github.com/milvus-io/milvus/issues/37764 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-04-11 16:36:30 +08:00
wei liu	a839d94c9e	fix: balance checker may enter infinite normal balance loop after balance suspension (#41195 ) issue: #41194 - Refactor hasUnbalancedCollection flag handling to function scope - Ensure tracking sets clearance when no balance needed - Add deferred cleanup for both normal/stopping balance paths - Add unit tests for collection tracking scenarios The changes ensure tracking sets (normalBalanceCollectionsCurrentRound and stoppingBalanceCollectionsCurrentRound) are properly cleared when: - All collections in current round are balanced - Balance checks return early due to unready targets - Balance feature flags are disabled Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-04-10 15:22:29 +08:00
Xianhui Lin	3bc24c264f	enhance: Add json key inverted index in stats for optimization (#38039 ) Add json key inverted index in stats for optimization https://github.com/milvus-io/milvus/issues/36995 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-10 15:20:28 +08:00
wei liu	99270103cf	fix: Offline segment block delegator recovery (#40827 ) issue: #39937 Before PR #39552, whenever a segment was missing in either the `current target` or the `next target`, we would trigger `load segment` to recover the delegator. However, restoring only the missing segments in the `next target` is sufficient to advance the target and complete the recovery process. In PR #39552, we removed the scheduling of L0 segments along with this unnecessary `load segment` logic. However, this exposed a new issue: if the `current target` still has missing segments and there is a flaw in the `checkDelegatorDataReady` logic, it could block the recovery of a delegator that contains `offline segments`. Since `offline segments` are cleaned up asynchronously in this scenario, this PR removes their blocking effect on delegator recovery, ensuring a smoother failure recovery process. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-04-07 14:56:22 +08:00
wei liu	bf8547578f	fix: Address manual balance and balance check issues (#41037 ) issue: #37651 - Fix context propagation for manual balance segment task creation from PR #38080. - Optimize stopping balance by preventing redundant checks per round, addressing performance regression from PR #40297. - Decrease default `checkBalanceInterval` from 3000ms to 300ms. - Correct minor log messages in `BalanceChecker`. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-04-03 15:48:27 +08:00
smellthemoon	cb1e86e17c	enhance: support add field (#39800 ) After this PR is merged, insert, upsert, build index, query, and search are supported on the added field. These operations can be performed on the added field only after the add field request completes, which is a synchronous operation. Compaction will be supported in the next PR. #39718 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2025-04-02 14:24:31 +08:00
wei liu	c02892e9fb	enhance: Balance the collection with the largest row count first (#40297 ) issue: #37651 This PR balances the collection with the largest row count first. This avoids temporarily migrating small-table data to new nodes during their onboarding, only for it to be moved out again after the large table is balanced, which would cause unnecessary load. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-03-31 16:00:19 +08:00
wei liu	0420dc1eb1	fix: use correct delete checkpoint to prevent premature data cleanup (#40366 ) issue: #40292 related to #39552 - Fix incorrect delete checkpoint usage in SyncDistribution - Change checkpoint parameter from action.GetCheckpoint() to action.GetDeleteCP() in SyncTargetVersion call - This resolves the issue where delete buffer data was being cleaned prematurely due to wrong checkpoint reference Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-03-12 15:00:08 +08:00
yihao.dai	c368113233	fix: Fix task delta cache data race (#40259 ) issue: https://github.com/milvus-io/milvus/issues/40258 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-03-02 16:52:09 +08:00
wei liu	b0806bb900	fix: task delta cache leak due to duplicate task id (#40183 ) issue: #40052 The task delta cache relies on the taskID being unique: it calls incDeltaCache at AddTask and decDeltaCache at RemoveTask. However, the taskID allocator is not atomic, so two tasks can end up with the same taskID. In that case incDeltaCache is called twice but decDeltaCache only once, which leaks the delta cache. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-28 10:22:08 +08:00
wei liu	94f55df7fb	enhance: clean shard location cache after collection released (#40088 ) issue: #40077 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-27 19:42:05 +08:00
wei liu	69b8b89369	enhance: Remove QueryCoord's scheduling of L0 segments (#39552 ) issue: #39551 This PR remove querycoord's scheduling of l0 segments: - only load l0 segment when watch channel - only release l0 segment when release channel or sync data distribution --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-26 21:38:00 +08:00
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990 ) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
yihao.dai	2a037a97f1	enhance: Add get vector latency metric and refine request limit error message (#40083 ) issue: https://github.com/milvus-io/milvus/issues/40078 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-02-21 19:41:55 +08:00
wei liu	7d2c948c69	fix: task delta cache leak on reduce task (#40055 ) issue: #40052 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-21 16:47:54 +08:00
wei liu	07578041ba	fix: querycoord panic in corner case (#40057 ) issue: #40050 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-21 11:19:58 +08:00
Zhen Ye	64dad60dc2	fix: delegator doesn't follow with wal if streaming enabled (#39890 ) issue: #38399 Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-17 14:10:15 +08:00
Bingyi Sun	b59555057d	feat: support json index (#36750 ) https://github.com/milvus-io/milvus/issues/35528 This PR adds json index support for json and dynamic fields. Now you can only do unary query like 'a["b"] > 1' using this index. We will support more filter type later. basic usage: ``` collection.create_index("json_field", {"index_type": "INVERTED", "params": {"json_cast_type": DataType.STRING, "json_path": 'json_field["a"]["b"]'}}) ``` There are some limits to use this index: 1. If a record does not have the json path you specify, it will be ignored and there will not be an error. 2. If a value of the json path fails to be cast to the type you specify, it will be ignored and there will not be an error. 3. A specific json path can have only one json index. 4. If you try to create more than one json indexes for one json field, sdk(pymilvus<=2.4.7) may return immediately because of internal implementation. This will be fixed in a later version. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-15 14:06:15 +08:00
wei liu	bfc802297e	enhance: Add management api to check querycoord balance status (#37784 ) issue: #37783 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-14 18:00:14 +08:00
wei liu	b9e3ec7175	enhance: Add trigger interval config for auto balance (#39154 ) issue: #39156 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-14 16:12:15 +08:00
congqixia	58045a3396	fix: Check collection released before target checks (#39841 ) Related to #39840 The target could be updated asynchronously in the previous code. This PR makes removing a collection from the target observer block until all related tasks in the dispatchers are removed, preventing the metrics from being updated after the collection is released. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-14 11:38:14 +08:00
Zhen Ye	0988807160	enhance: enable write ahead buffer for streaming service (#39771 ) issue: #38399 - Make a timetick-commit-based write ahead buffer at write side. - Add a switchable scanner at read side to transfer the state between catchup and tailing read Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-12 20:38:46 +08:00
wei liu	c12c4b4fff	fix: [skip e2e] pr conflict cause ut failed (#39811 ) Related to https://github.com/milvus-io/milvus/pull/39701 & https://github.com/milvus-io/milvus/issues/39681 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-12 11:44:51 +08:00
congqixia	7b51e4839f	fix: Resolve conflict on qc task test (#39796 ) Related to #39701 & #39681 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-11 18:40:45 +08:00
wei liu	ff5c680c99	fix: load collection gets stuck if compaction/gc happens (#39701 ) issue: #39680 If compaction/gc happens, load collection may get stuck due to SegmentNotFound; we should trigger UpdateNextTarget to get a new data view for the loading operation. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-11 15:48:50 +08:00
wei liu	85c9f92ff4	fix: uneven distribution caused by executing task delta cache leak (#39702 ) issue: #39681 this PR maintain workload effect in action instead of computing workload effect from target, which may cause leak if target changes. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-11 14:30:46 +08:00
Zhen Ye	d3e32bb599	enhance: make pchannel level flusher (#39275 ) issue: #38399 - Add a pchannel level checkpoint for flush processing - Refactor the recovery of flushers of wal - make a shared wal scanner first, then make multi datasyncservice on it Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-10 16:32:45 +08:00
jaime	8a4ac8cccd	enhance: expose more metrics data (#39456 ) issue: #36621 #39417 1. Adjust the server-side cache size. 2. Add source information for configurations. 3. Add node ID for compaction and indexing tasks. 4. Resolve localhost access issues to fix health check failures for etcd. Signed-off-by: jaime <yun.zhang@zilliz.com>	2025-02-07 11:50:50 +08:00
wei liu	05ac4041aa	enhance: use rated logger for high frequency log in dist handler (#39452 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-05 15:31:10 +08:00
Zhen Ye	c84a0748c4	enhance: add rw/ro streaming query node replica management (#38677 ) issue: #38399 - Embed the query node into streaming node to make delegator available at streaming node. - The embedded query node has a special server label `QUERYNODE_STREAMING-EMBEDDED`. - Change the balance strategy to make the channel assigned to streaming node as much as possible. Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-24 16:55:07 +08:00
yihao.dai	5fb597b37b	fix: Remove frequently updating metric to avoid mutex contention (#38775 ) issue: https://github.com/milvus-io/milvus/issues/37630 Reduce the frequency updating metrics to avoid holding the mutex for long periods. --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-24 10:31:07 +08:00
yihao.dai	e0b26260f2	enhance: enable task delta cache (#39307 ) When there are many segment tasks in the querycoord scheduler, the traversal in `GetSegmentTaskDelta` checks becomes time-consuming. This PR adds caching for segment deltas. issue: https://github.com/milvus-io/milvus/issues/37630 Signed-off-by: Wei Liu <wei.liu@zilliz.com> Co-authored-by: Wei Liu <wei.liu@zilliz.com>	2025-01-23 14:31:16 +08:00
yihao.dai	38f813bed3	enhance: Read metadata concurrently to accelerate recovery (#38403 ) Read metadata such as segments, binlogs, and partitions concurrently at the collection level. issue: https://github.com/milvus-io/milvus/issues/37630 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-23 14:27:27 +08:00
yihao.dai	e55d6506e3	enhance: Remove frequent observe log (#39413 ) /kind improvement Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-20 11:01:10 +08:00
yihao.dai	657550cf06	fix: Fix slow dist handle and slow observe (#38566 ) 1. Provide partition&channel level indexing in the collection target. 2. Make `SegmentAction` not wait for distribution. 3. Remove scheduler and target manager mutex. 4. Optimize logging to reduce CPU overhead. issue: https://github.com/milvus-io/milvus/issues/37630 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-15 20:17:00 +08:00
wei liu	d2834a1812	enhance: Add logs for check health failed (#39208 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-01-15 17:31:00 +08:00
jaime	e8f76cd2d9	fix: unstable ut in leader_view_manager.go file (#39161 ) issue: #38672 Signed-off-by: jaime <yun.zhang@zilliz.com>	2025-01-15 12:26:59 +08:00
Zhen Ye	3e788f0fbd	enhance: record memory size (uncompressed) item for index (#38770 ) issue: #38715 - Milvus currently uses the serialized (compressed) index size to estimate the resources needed for loading. - Add a new field `MemSize` (the size before compression) for the index, to be used for resource estimation. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-14 10:33:06 +08:00
wei liu	cc5d59392a	fix: channel unbalance during stopping balance progress (#38971 ) issue: #38970 The stopping balance channel policy still uses the row_count_based policy, which may cause channel unbalance in the multi-collection case. This PR implements a score-based stopping balance channel policy. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-01-13 11:21:06 +08:00
wei liu	826b726c86	fix: Prevent leader checker from generating excessive duplicate leader tasks (#39000 ) issue: #39001 Background: Segment Load Version: Each segment load request assigns a timestamp as its version. When multiple copies of a segment are loaded on different QueryNodes, the leader checker uses this version to identify the latest copy and updates the routing table in the leader view to point to it. Delegator Router Version: When a delegator builds a route to a QueryNode that has loaded a segment, it also records the segment's version. Router Table Update Logic: If the leader checker detects that the version of a segment in the routing table does not match the version in the worker, it updates the routing table to point to the QueryNode with the latest version. Additionally, it updates the segment's load version in the QueryNode during this process. Issue: When a channel is undergoing load balancing, the leader checker may sync the routing table to a new delegator. This sync operation modifies the segment's load version, which invalidates the routing in the old delegator. Subsequently, the leader checker updates the routing table in the old delegator, breaking the routing in the new delegator. This cycle continues, causing repeated updates and inconsistencies. Fix: This PR introduces two changes to address the issue: 1. Use NodeID to verify whether the delegator's routing table needs an update, avoiding unnecessary modifications. 2. Ensure compatibility by using the latest segment's load version as the version recorded in the routing table. These changes resolve the cyclic updates and prevent the leader checker from generating excessive duplicate tasks, ensuring routing stability across delegators during load balancing. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-01-10 14:12:57 +08:00
Zhen Ye	bb8d1ab3bf	enhance: make new go package to manage proto (#39114 ) issue: #39095 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-10 10:49:01 +08:00
jaime	f03a85725a	enhance: add db name in replica (#38672 ) issue: #36621 Signed-off-by: jaime <yun.zhang@zilliz.com>	2025-01-09 19:40:59 +08:00
wei liu	47e7ea241e	enhance: Add log for case which target not update as expected (#38944 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-01-07 17:45:03 +08:00
Xiaofan	cb6eca8e91	fix: drop partition can not be successful if load failed (#38793 ) fix #38649 when partition load failed, the partition drop will also fail due to the wrong error message Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2024-12-30 19:42:52 +08:00
wei liu	f49d618382	fix: Querycoord will trigger unexpected balance task after restart (#38630 ) issue: #38606 --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-25 19:30:48 +08:00
wei liu	25f0c82ceb	fix: Fix update loading collection's load config doesn't work (#38595 ) issue: #38594 --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-25 18:02:51 +08:00
wei liu	9c3f59dbbe	fix: Prevent balancer from overloading the same QueryNode (#38719 ) issue: #38718 The balancer calculates the workload of executing tasks as an ongoing score for target nodes. However, a logic issue arises when GetSegmentTaskDelta or GetChannelTaskDelta is called with collectionID=-1, which incorrectly returns zero. Due to the incorrect global score, the executing task's workload is not properly reflected for each collection. Consequently, each collection submits its own balance task, leading to the balancer assigning excessive tasks to the same QueryNode. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-25 16:36:49 +08:00
jaime	5afd0c0a2b	fix: Revert "Expose metrics of stanby coordinators (#27698 )" (#38620 ) issue: #38608 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-12-23 11:46:57 +08:00

664 Commits