issue: #46087, #46327
The previous implementation only checked if there were any ready
delegators before updating the current target. This could lead to
partial target updates when only some channels had ready delegators.
This regression was introduced by #46088, which removed the check for
all channels being ready. This fix ensures that
shouldUpdateCurrentTarget returns true only when ALL channels have been
successfully synced, preventing incomplete target updates that could
cause query inconsistencies.
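A minimal sketch of the all-channels-ready gate described above; `shouldUpdateCurrentTarget` is named in this commit, but `channelReadyDelegators` and the overall shape are illustrative rather than the actual Milvus code.

```go
package observer

// channelReadyDelegators maps each channel of a collection to the number of
// delegators that have successfully synced it (hypothetical helper type).
type channelReadyDelegators map[string]int

// shouldUpdateCurrentTarget returns true only when every channel has at
// least one successfully synced delegator; a single unsynced channel keeps
// the current target unchanged.
func shouldUpdateCurrentTarget(channels []string, ready channelReadyDelegators) bool {
	if len(channels) == 0 {
		return false
	}
	for _, ch := range channels {
		if ready[ch] == 0 {
			return false
		}
	}
	return true
}
```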
Added unit tests to cover:
- All channels synced scenario (should return true)
- Partial channels synced scenario (should return false)
- No ready delegators scenario (should return false)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #46087
The previous implementation checked if the total number of ready
delegators >= replicaNum per channel. This could cause target updates to
block indefinitely when dynamically increasing replicas, because some
replicas might lack nodes while the total count still met the threshold.
This change switches to a replica-based check approach:
- Iterate through each replica individually
- For each replica, verify all channels have at least one ready
delegator
- Only sync delegators from fully ready replicas
- Skip replicas that are not ready (e.g., missing nodes for some
channels)
This ensures target updates can proceed with ready replicas while
replicas that lack nodes during dynamic scaling are gracefully skipped.
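A rough sketch of the replica-based check described above, using invented `Replica` and `readyDelegators` types; the real querycoord structures differ.

```go
package observer

// Replica is a simplified stand-in for a querycoord replica.
type Replica struct {
	ID    int64
	Nodes map[int64]struct{} // node IDs owned by this replica
}

// readyDelegators maps channel name -> node IDs that host a ready delegator.
type readyDelegators map[string][]int64

// replicaIsReady reports whether every channel has at least one ready
// delegator on a node belonging to this replica.
func replicaIsReady(r Replica, channels []string, ready readyDelegators) bool {
	for _, ch := range channels {
		covered := false
		for _, node := range ready[ch] {
			if _, ok := r.Nodes[node]; ok {
				covered = true
				break
			}
		}
		if !covered {
			return false
		}
	}
	return true
}

// readyReplicas keeps only fully ready replicas; replicas that lack nodes for
// some channel are skipped instead of blocking the whole target update.
func readyReplicas(replicas []Replica, channels []string, ready readyDelegators) []Replica {
	out := make([]Replica, 0, len(replicas))
	for _, r := range replicas {
		if replicaIsReady(r, channels, ready) {
			out = append(out, r)
		}
	}
	return out
}
```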
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #44014
- The querynode session and the streamingnode session are different.
- So when the streamingnode session goes down first, a streaming query node
will be treated as a plain querynode.
- Use a label rather than the streaming node session to fix it.
Signed-off-by: chyezh <chyezh@outlook.com>
issue: https://github.com/milvus-io/milvus/issues/41690
This commit implements partial search result functionality when query
nodes go down, improving system availability during node failures. The
changes include:
- Enhanced load balancing in proxy (lb_policy.go) to handle node
failures with retry support
- Added partial search result capability in querynode delegator and
distribution logic
- Implemented tests for various partial result scenarios when nodes go
down
- Added metrics to track partial search results in querynode_metrics.go
- Updated parameter configuration to support the partial-result required
data ratio
- Replaced the old partial_search_test.go with a more comprehensive
partial_result_on_node_down_test.go
- Updated proto definitions and improved retry logic
These changes improve query resilience by returning partial results to
users when some query nodes are unavailable, ensuring that queries don't
completely fail when a portion of data remains accessible.
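As a rough illustration of the required-data-ratio idea, the snippet below shows a partial-result admission check; the function and parameter names are made up, and the actual proxy/delegator plumbing is more involved.

```go
package proxy

// canReturnPartialResult decides whether a search may complete with only part
// of the data: the fraction of accessible segments must reach the configured
// partial-result required data ratio (e.g. 0.8 means 80% of the data must
// still be reachable when some query nodes are down).
func canReturnPartialResult(accessibleSegments, totalSegments int, requiredDataRatio float64) bool {
	if totalSegments == 0 {
		return false
	}
	if accessibleSegments == totalSegments {
		return true // full result, nothing is partial
	}
	return float64(accessibleSegments)/float64(totalSegments) >= requiredDataRatio
}
```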
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/41690
- Merge leader view and channel management into ChannelDistManager,
allowing a channel to have multiple delegators.
- Improve shard leader switching to ensure a single replica only has one
shard leader per channel. The shard leader handles all resource loading
and query requests.
- Refine the serviceable mechanism: after QC completes loading, sync the
query view to the delegator. The delegator then determines its
serviceable status based on the query view.
- When a delegator encounters forwarding query or deletion failures,
mark the corresponding segment as offline and transition it to an
unserviceable state.
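An illustrative sketch of a channel distribution that allows multiple delegators per channel while electing a single shard leader per replica; the type and field names are invented for the example and differ from the real ChannelDistManager.

```go
package meta

// Delegator is a simplified view of a delegator serving one channel.
type Delegator struct {
	NodeID      int64
	ReplicaID   int64
	Serviceable bool
}

// ChannelDist tracks every delegator of a channel plus the elected shard
// leader for each replica.
type ChannelDist struct {
	Delegators []Delegator
	Leaders    map[int64]int64 // replicaID -> nodeID of the shard leader
}

// electLeaders picks one serviceable delegator per replica as the shard
// leader; that leader handles all resource loading and query requests for
// the replica on this channel.
func (d *ChannelDist) electLeaders() {
	d.Leaders = make(map[int64]int64)
	for _, dg := range d.Delegators {
		if !dg.Serviceable {
			continue
		}
		if _, ok := d.Leaders[dg.ReplicaID]; !ok {
			d.Leaders[dg.ReplicaID] = dg.NodeID
		}
	}
}
```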
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #40292
related to #39552
- Fix incorrect delete checkpoint usage in SyncDistribution
- Change checkpoint parameter from action.GetCheckpoint() to
action.GetDeleteCP() in SyncTargetVersion call
- This resolves the issue where delete buffer data was being cleaned
prematurely due to wrong checkpoint reference
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #35917
This PR refines the querycoord meta-related interfaces to ensure that each
method includes a ctx parameter.
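A hedged sketch of the pattern: every meta accessor now takes a context.Context so cancellation and tracing propagate. `CollectionMeta` and its methods are hypothetical stand-ins, not the real querycoord interfaces.

```go
package meta

import "context"

// Collection is a placeholder for the stored collection metadata.
type Collection struct {
	ID int64
}

// CollectionMeta shows the refined shape: each method carries a ctx.
type CollectionMeta interface {
	// before: GetCollection(collectionID int64) (*Collection, error)
	GetCollection(ctx context.Context, collectionID int64) (*Collection, error)
	// before: PutCollection(coll *Collection) error
	PutCollection(ctx context.Context, coll *Collection) error
}
```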
Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
issue: #36293, #36242
After a querynode recovers, the delegator may be loaded on a new node. Once
all segments have been loaded, the delegator becomes serviceable, but its
target version has not been synced yet. If a search/query arrives, the
delegator uses the wrong target version and filters the segment list down to
empty, which causes empty search results.
This PR blocks the delegator's serviceable status until the target version is
synced.
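A minimal sketch of gating serviceability on target-version sync, with invented field names; the real delegator tracks considerably more state.

```go
package delegator

import "sync/atomic"

const unsyncedTargetVersion = int64(-1)

type shardDelegator struct {
	allSegmentsLoaded atomic.Bool  // all required segments are loaded
	targetVersion     atomic.Int64 // last synced target version, -1 if never synced
}

// Serviceable is true only when segments are loaded AND a target version has
// been synced, so a freshly recovered delegator never filters segments with a
// stale target and returns empty results.
func (d *shardDelegator) Serviceable() bool {
	return d.allSegmentsLoaded.Load() && d.targetVersion.Load() != unsyncedTargetVersion
}

// SyncTargetVersion records the new target version, unblocking serviceability.
func (d *shardDelegator) SyncTargetVersion(version int64) {
	d.targetVersion.Store(version)
}
```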
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
When there are many loaded collections, they can occupy the target observer
scheduler's pool. This prevents loading collections from updating the current
target in time, slowing down the load process.
This PR adds a separate target dispatcher for loading collections.
issue: https://github.com/milvus-io/milvus/issues/37166
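A rough sketch of routing target-update tasks to a dedicated dispatcher for collections that are still loading, so they are not starved by already-loaded collections; the dispatcher here is a toy stand-in for the real one.

```go
package observer

// dispatcher runs submitted tasks on a fixed pool of workers.
type dispatcher struct {
	tasks chan func()
}

func newDispatcher(workers, queueSize int) *dispatcher {
	d := &dispatcher{tasks: make(chan func(), queueSize)}
	for i := 0; i < workers; i++ {
		go func() {
			for task := range d.tasks {
				task()
			}
		}()
	}
	return d
}

func (d *dispatcher) Submit(task func()) { d.tasks <- task }

// targetObserver keeps a separate lane for loading collections so their
// current-target updates are not queued behind loaded collections.
type targetObserver struct {
	loadedDispatcher  *dispatcher // updates for collections already loaded
	loadingDispatcher *dispatcher // dedicated lane for collections being loaded
}

func (o *targetObserver) submitUpdate(loading bool, update func()) {
	if loading {
		o.loadingDispatcher.Submit(update)
		return
	}
	o.loadedDispatcher.Submit(update)
}
```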
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #33550
A concurrency issue may occur between removing a partition in the target
manager and syncing the segment list to the delegator. When it happens, some
segments may already be released in the delegator while the synced segment
list still includes them, which leaves the delegator unserviceable because it
lacks necessary segments, and search/query fails.
This PR makes sure that all write access to target_manager is executed
serially to avoid the concurrency issue.
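A minimal sketch of the serial-write idea: a single lock guards every mutation of the target manager, so a partition removal can never interleave with a target sync. The type and method names are simplified stand-ins.

```go
package meta

import "sync"

// TargetManager holds the current target segment list per collection.
type TargetManager struct {
	mu      sync.Mutex
	targets map[int64][]int64 // collectionID -> segment IDs in the current target
}

// RemovePartitionSegments and UpdateCollectionTarget take the same lock, so
// writes are serialized and the delegator never sees a half-updated target.
func (m *TargetManager) RemovePartitionSegments(collectionID int64, removed map[int64]struct{}) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.targets == nil {
		return
	}
	kept := make([]int64, 0, len(m.targets[collectionID]))
	for _, seg := range m.targets[collectionID] {
		if _, drop := removed[seg]; !drop {
			kept = append(kept, seg)
		}
	}
	m.targets[collectionID] = kept
}

func (m *TargetManager) UpdateCollectionTarget(collectionID int64, segments []int64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.targets == nil {
		m.targets = make(map[int64][]int64)
	}
	m.targets[collectionID] = segments
}
```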
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #34234
`LoadPartitions` does not guarantee that the current target contains the
loading partitions if some partitions were already loaded before.
This PR checks that the current target contains the partitions to load when
advancing the loading percentage to 100.
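An illustrative check (hypothetical names) that the current target already contains the loading partitions before the load percentage is advanced to 100.

```go
package observer

// canAdvanceTo100 returns false if any loading partition is still missing
// from the current target, so LoadPartitions does not report completion for
// a partition the current target has not picked up yet.
func canAdvanceTo100(loadingPartitions []int64, currentTargetPartitions map[int64]struct{}) bool {
	for _, p := range loadingPartitions {
		if _, ok := currentTargetPartitions[p]; !ok {
			return false
		}
	}
	return true
}
```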
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #32910
* split a replica's node list across channels when creating replicas
* balance nodes among channels when node membership changes
* implement channel-level balance, so balancing happens at the channel level
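A simplified sketch of splitting a replica's node list across channels (round-robin here); the real policy also weighs node load and rebalances at the channel level when membership changes.

```go
package balance

// splitNodesToChannels assigns each node of a replica to exactly one channel
// group, so later balancing can operate per channel instead of per replica.
func splitNodesToChannels(nodes []int64, channels []string) map[string][]int64 {
	out := make(map[string][]int64, len(channels))
	if len(channels) == 0 {
		return out
	}
	for i, node := range nodes {
		ch := channels[i%len(channels)]
		out[ch] = append(out[ch], node)
	}
	return out
}
```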
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30647
- ReplicaManager now manages read-only nodes and always persists the node
distribution of each replica.
- All segment/channel checkers use ReplicaManager, not ResourceManager, to
get read-only or read-write nodes.
- ReplicaManager guarantees that a querynode is assigned to at most one
replica of a given collection (replicas in the same collection never hold the
same querynode at the same time).
- ReplicaManager guarantees a fair node-count assignment policy when multiple
replicas of a collection are assigned to one resource group.
- Move some parameter checks into ReplicaManager to avoid data races.
- Allow transferring a replica to a resource group that already loads a
replica of the same collection.
- Allow transferring nodes between resource groups that load replicas of the
same collection.
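A toy sketch of the uniqueness guarantee: within one collection, a querynode may belong to at most one replica. The ReplicaManager fields and methods here are invented for illustration.

```go
package meta

import "fmt"

// ReplicaManager (simplified) remembers which replica owns each node per
// collection.
type ReplicaManager struct {
	owner map[int64]map[int64]int64 // collectionID -> nodeID -> owning replicaID
}

func NewReplicaManager() *ReplicaManager {
	return &ReplicaManager{owner: make(map[int64]map[int64]int64)}
}

// AssignNode rejects the assignment if the node already belongs to another
// replica of the same collection, keeping replicas of a collection disjoint.
func (m *ReplicaManager) AssignNode(collectionID, replicaID, nodeID int64) error {
	if m.owner[collectionID] == nil {
		m.owner[collectionID] = make(map[int64]int64)
	}
	if cur, ok := m.owner[collectionID][nodeID]; ok && cur != replicaID {
		return fmt.Errorf("node %d already belongs to replica %d of collection %d", nodeID, cur, collectionID)
	}
	m.owner[collectionID][nodeID] = replicaID
	return nil
}
```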
---------
Signed-off-by: chyezh <chyezh@outlook.com>
- Add `taskDispatcher` to submit and run tasks asynchronously and safely
- Change `LeaderObserver` and `TargetObserver` scheduled and manual check actions to submit tasks into the dispatcher
- Fix a logic problem in the collection observer when the manual check returns false
See also #27494
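A condensed sketch of such a dispatcher: tasks are submitted by key, run asynchronously, and duplicate submissions for a key that is still pending are collapsed. The real `taskDispatcher` is generic and has richer lifecycle handling.

```go
package observer

import "sync"

type taskDispatcher struct {
	mu      sync.Mutex
	pending map[int64]bool // keys already queued, to avoid duplicate runs
	tasks   chan int64
	run     func(key int64)
}

func newTaskDispatcher(run func(key int64)) *taskDispatcher {
	d := &taskDispatcher{
		pending: make(map[int64]bool),
		tasks:   make(chan int64, 256),
		run:     run,
	}
	go func() {
		for key := range d.tasks {
			d.run(key)
			d.mu.Lock()
			delete(d.pending, key)
			d.mu.Unlock()
		}
	}()
	return d
}

// AddTask enqueues the key once; submissions while the key is still pending
// are ignored.
func (d *taskDispatcher) AddTask(key int64) {
	d.mu.Lock()
	if d.pending[key] {
		d.mu.Unlock()
		return
	}
	d.pending[key] = true
	d.mu.Unlock()
	d.tasks <- key
}
```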
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>