milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-02-04 11:18:44 +08:00

Author	SHA1	Message	Date
Bingyi Sun	f9827392bb	enhance: implement external collection update task with source change detection (#45905 ) issue: #45881 Add persistent task management for external collections with automatic detection of external_source and external_spec changes. When source changes, the system aborts running tasks and creates new ones, ensuring only one active task per collection. Tasks validate their source on completion to prevent superseded tasks from committing results. Release notes (auto-generated by coderabbit.ai):
- Core invariant: at most one active UpdateExternalCollection task exists per collection. Tasks are serialized by collectionID (collection-level locking) and any change to external_source or external_spec aborts superseded tasks and causes a new task creation (externalCollectionManager + external_collection_task_meta collection-based locks enforce this).
- What was simplified/removed: per-task fine-grained locking and concurrent multi-task acceptance per collection were replaced by collection-level synchronization (external_collection_task_meta.go) and a single persistent task lifecycle in DataCoord/Index task code; redundant double-concurrent update paths were removed by checking existing task presence in AddTask/LoadOrStore and aborting/overwriting via Drop/Cancel flows.
- Why this does NOT cause data loss or regress behavior: task state transitions and commit are validated against the current external source/spec before applying changes. UpdateStateWithMeta and SetJobInfo verify task metadata and persist via catalog only under matching collection-state; DataNode externalCollectionManager persists task results to in-memory manager and exposes Query/Drop flows (services.go) without modifying existing segment data unless a task successfully finishes and SetJobInfo atomically updates segments via meta/catalog calls, preventing superseded tasks from committing stale results.
- New capability added: end-to-end external collection update workflow. DataCoord Index task + Cluster RPC helpers + DataNode external task runner and ExternalCollectionManager enable creating, querying, cancelling, and applying external collection updates (fragment-to-segment balancing, kept/updated segment handling, allocator integration); accompanying unit tests cover success, failure, cancellation, allocator errors, and balancing logic.
--------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-12-29 19:53:21 +08:00
Zhen Ye	30091a3bb7	enhance: remove redundant channel manager from datacoord (#44532 ) issue: #41611 - After enabling streaming arch, channel manager of data coord is a redundant component. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-10-09 11:01:57 +08:00
yihao.dai	51f69f32d0	feat: Add CDC support (#44124 ) This PR implements a new CDC service for Milvus 2.6, providing log-based cross-cluster replication. issue: https://github.com/milvus-io/milvus/issues/44123 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-09-16 16:32:01 +08:00
cai.zhang	7f470e6bd3	fix: Fix retry state with payload is not nil (#44068 ) issue: #43776 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-08-27 18:11:49 +08:00
wei liu	3e9e830074	enhance: Implement rewatch mechanism for etcd failure scenarios (#43829 ) issue: #43828 Implement robust rewatch mechanism to handle etcd connection failures and node reconnection scenarios in DataCoord and QueryCoord, along with heartbeat lag monitoring capabilities. Changes include: - Implement rewatchDataNodes/rewatchQueryNodes callbacks for etcd reconnection scenarios - Add idempotent rewatchNodes method to handle etcd session recovery gracefully - Add QueryCoordLastHeartbeatTimeStamp metric for monitoring node heartbeat lag - Clean up heartbeat metrics when nodes go down to prevent metric leaks --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-08-14 10:31:44 +08:00
cai.zhang	77f2fb562f	fix: Fix task state is InProgress but payload is nil (#43777 ) issue: #43776 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-08-11 14:13:42 +08:00
yihao.dai	5124ed9758	fix: Fix import fileStats incorrectly set to nil (#43463 ) 1. Ensure that tasks in the InProgress state return valid fileStats. 2. Enhance import logs. issue: https://github.com/milvus-io/milvus/issues/43387 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-22 12:37:01 +08:00
yihao.dai	83c9527e70	enhance: Use QuerySlot interface for tasks (#41989 ) Use `QuerySlot` rpc instead of `QueryTask` for querying slot. issue: https://github.com/milvus-io/milvus/issues/41123 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-05-23 10:30:28 +08:00
yihao.dai	142bd2fc05	enhance: Pooling for data tasks (#41256 ) 1. Add global scheduler for datacoord. 2. Define and implement new CreateTask, QueryTask, DropTask interfaces. 3. Refine Import, Compaction, Stats, Index task. issue: https://github.com/milvus-io/milvus/issues/41123 Co-authored-by: Cai Zhang <cai.zhang@zilliz.com>	2025-05-20 21:06:24 +08:00
SimFG	91d40fa558	fix: Update logging context and upgrade dependencies (#41318 ) - issue: #41291 --------- Signed-off-by: SimFG <bang.fu@zilliz.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-04-23 10:52:38 +08:00
Chun Han	016920b023	fix: solve incompatible problem for none-encoding index(#40838 ) (#41369 ) related: #40838 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-04-20 22:56:44 +08:00
cai.zhang	bc11feae74	fix: Close client before remove worker client (#41253 ) issue: #41252 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-04-15 10:26:31 +08:00
cai.zhang	8a77fb9cdc	enhance: Support slot for index task and stats task (#39084 ) issue: #39101 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-04-08 20:46:25 +08:00
cai.zhang	05e25431d9	enhance: Deprecate disk params about indexing (#41045 ) issue: #40863 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-04-07 11:36:34 +08:00
yihao.dai	b2a8694686	enhance: Merge IndexNode and DataNode (#40272 ) Merge DataNode and IndexNode into DataNode. issue: https://github.com/milvus-io/milvus/issues/39115 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-03-13 14:26:11 +08:00
cai.zhang	5a810400b5	enhance: Optimize Task Scheduling to Enable Concurrent Execution (#40251 ) issue: #39101 2.5 pr: #40104 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-03-02 18:38:00 +08:00
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990 ) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
congqixia	6d8441ad7e	enhance: Use mockery pkg config for datacoord&datanode (#39567 ) Related to #38339 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-01-24 14:25:06 +08:00
Zhen Ye	bb8d1ab3bf	enhance: make new go package to manage proto (#39114 ) issue: #39095 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-10 10:49:01 +08:00
congqixia	3d360c0624	fix: SyncSegments rpc always failed (#38578 ) The patch was missed due to code branching. Previous PR: #38032 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> Co-authored-by: Wei Liu <wei.liu@zilliz.com>	2024-12-19 15:40:45 +08:00
jaime	78438ef41e	fix: revert optimize CPU usage for CheckHealth requests (#35589 ) (#38555 ) issue: #35563 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-12-19 00:38:45 +08:00
jaime	28fdbc4e30	enhance: optimize CPU usage for CheckHealth requests (#35589 ) issue: #35563 1. Use an internal health checker to monitor the cluster's health state, storing the latest state on the coordinator node. The CheckHealth request retrieves the cluster's health from this latest state on the proxy sides, which enhances cluster stability. 2. Each health check will assess all collections and channels, with detailed failure messages temporarily saved in the latest state. 3. Use CheckHealth request instead of the heavy GetMetrics request on the querynode and datanode Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-12-17 11:02:45 +08:00
tinswzy	27229f7907	enhance: refine exists log print with ctx (#38080 ) issue: #35917 Refines exists log print with ctx Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2024-12-14 22:36:44 +08:00
wei liu	f8ac91f1db	fix: datacoord stuck at stopping progress (#36852 ) issue: #36868 If datacoord is stopped while it is syncing segments to a datanode, the stop process hangs until the segment sync finishes. This PR adds a ctx to segment syncing, so the sync fails once datacoord starts stopping. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-10-17 12:13:37 +08:00
Bingyi Sun	6851738fd1	fix: fix `make generate-mockery` panic with go1.22 (#36830 ) https://github.com/milvus-io/milvus/issues/36831 Fix `make generate-mockery` panic. Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-10-17 12:11:31 +08:00
cai.zhang	2c9bb4dfa3	feat: Support stats task to sort segment by PK (#35054 ) issue: #33744 This PR includes the following changes: 1. Added a new task type to the task scheduler in datacoord: stats task, which sorts segments by primary key. 2. Implemented segment sorting in indexnode. 3. Added a new field `FieldStatsLog` to SegmentInfo to store token index information. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-09-02 14:19:03 +08:00
congqixia	582d2eec79	enhance: Move datanode/indexnode manager to session pkg (#35634 ) Related to #28861 Move session manager, worker manager to session package. Also rename each manager after its corresponding node (datanode, indexnode). --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-08-22 16:02:56 +08:00

27 Commits