Cherry-pick from master
pr: #29922
See also: #29650
Either the segment DML position or the channel checkpoint could be the
newer one in some cases. This PR makes PackLoadSegments use the newer
one, improving load performance in cases where there are lots of upserts.
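
A minimal sketch of the "use the newer one" choice, assuming positions
can be compared by timestamp. The `position` type and `newerPosition`
helper are hypothetical names for illustration, not the actual
PackLoadSegments code:

```go
package main

import "fmt"

// position is a hypothetical stand-in for a message-stream position;
// only the timestamp matters for this comparison.
type position struct {
	Channel   string
	Timestamp uint64
}

// newerPosition returns whichever position has the larger timestamp.
// A nil argument is treated as unknown, so the other one wins.
func newerPosition(segmentDML, channelCheckpoint *position) *position {
	switch {
	case segmentDML == nil:
		return channelCheckpoint
	case channelCheckpoint == nil:
		return segmentDML
	case channelCheckpoint.Timestamp > segmentDML.Timestamp:
		return channelCheckpoint
	default:
		return segmentDML
	}
}

func main() {
	seg := &position{Channel: "dml-0", Timestamp: 100}
	cp := &position{Channel: "dml-0", Timestamp: 250}
	fmt.Println(newerPosition(seg, cp).Timestamp)
}
```

Starting from the later position means fewer messages have to be
replayed after the segment is loaded.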
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #29814
pr: #29724
If the channel is not subscribed yet, the generated load segment task
will be removed from the task scheduler, because the load segment task
needs to be transferred to the worker node by the shard leader.
This PR skips generating load segment tasks when the channel is not
subscribed yet.
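
The skip can be sketched as a filter applied before task generation.
The `segment` type, the `subscribed` map and `segmentsToLoad` are
hypothetical names used only for this illustration:

```go
package main

import "fmt"

// segment is a hypothetical, simplified view of a segment to be loaded.
type segment struct {
	ID      int64
	Channel string
}

// segmentsToLoad keeps only the segments whose channel is already
// subscribed, so no task is generated that the scheduler would drop.
func segmentsToLoad(segments []segment, subscribed map[string]bool) []segment {
	var out []segment
	for _, s := range segments {
		if !subscribed[s.Channel] {
			continue // no shard leader yet, try again in a later round
		}
		out = append(out, s)
	}
	return out
}

func main() {
	segments := []segment{{ID: 1, Channel: "dml-0"}, {ID: 2, Channel: "dml-1"}}
	subscribed := map[string]bool{"dml-0": true}
	for _, s := range segmentsToLoad(segments, subscribed) {
		fmt.Println(s.ID)
	}
}
```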
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Cherry-pick from master
pr: #29806
See also #29803
This PR:
- Add trace span for collection/partition load
- Use TraceSpan to generate Segment/ChannelTasks when loading
- Refine BaseTask trace tag usage
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #29582
pr: #29574
This PR rewrites the gen segment plan logic based on assign segment in
`score_based_balancer`.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Cherry pick from master
pr: #29577
Related to #29575
Add `getCollectionTarget` method which is atomic when scope is
`CurrentTargetFirst` or `NextTargetFirst`
Also return error when executor finds no channel in target manager
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #29523
pr: #29525
The readable shard leader should still be the old one during channel
balance if the new shard leader is not ready.
This PR fixes query coord choosing the wrong shard leader during
channel balance.
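
A minimal sketch of the selection rule, assuming readiness is available
as a single flag. The `leader` type and `readableLeader` are
hypothetical, and what "ready" means in query coord is not shown here:

```go
package main

import "fmt"

// leader is a hypothetical, simplified shard leader record.
type leader struct {
	NodeID int64
	Ready  bool
}

// readableLeader prefers the new leader only once it is ready;
// otherwise reads keep going to the old leader.
func readableLeader(old, next *leader) *leader {
	if next != nil && next.Ready {
		return next
	}
	return old
}

func main() {
	old := &leader{NodeID: 1, Ready: true}
	next := &leader{NodeID: 2, Ready: false}
	fmt.Println(readableLeader(old, next).NodeID)

	next.Ready = true
	fmt.Println(readableLeader(old, next).NodeID)
}
```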
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Cherry-pick from master
pr: #29526
`WatchDmChannel` only needs growing segment info, so this PR removes
fetching segmentInfos when filling the watch DML channel request.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
The executor always fetches the latest segment info, so we can consume
from the latest checkpoint, which saves much time when many entities
have been deleted.
pr: #29455
Signed-off-by: yah01 <yang.cen@zilliz.com>
Signed-off-by: yah01 <yah2er0ne@outlook.com>
pr: #29473
The `AssignSegment` method defines how to assign segments to nodes, but
score_based_balance implements another assign logic in
`genStoppingSegmentPlan`.
This PR rewrites gen stopping segment plan based on assign segment.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
pr: #29443
Milvus branch 2.3 adds `loadType` to CollectionLoadInfo, so for
collection meta upgraded from 2.2, we should add `loadType` to
CollectionLoadInfo. This PR updates CollectionLoadInfo with `loadType`
when it meets an old-version CollectionLoadInfo.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #23726
pr: #29231
This PR adds a control config for querycoord's background auto balance
channel operation.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #28622
pr: #29216
After we supported balancing segments by growing segment count
(#28623), if we balance segments and channels at the same time, some
segments need to be rebalanced after the channel balance finishes.
This PR skips balancing segments when channels need to be balanced.
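
The ordering can be sketched as "channel plans first, segment plans
only when there are none". `plansForRound` and the string-valued plans
are hypothetical simplifications, not the balancer's real types:

```go
package main

import "fmt"

// plansForRound returns channel plans if any exist and only falls back
// to segment plans when no channel needs to move in this round.
func plansForRound(genChannelPlans, genSegmentPlans func() []string) []string {
	if channelPlans := genChannelPlans(); len(channelPlans) > 0 {
		return channelPlans // segments wait until channels settle
	}
	return genSegmentPlans()
}

func main() {
	channelPlans := func() []string { return []string{"move dml-0: node1 -> node2"} }
	segmentPlans := func() []string { return []string{"move segment 42: node1 -> node3"} }
	noPlans := func() []string { return nil }

	fmt.Println(plansForRound(channelPlans, segmentPlans))
	fmt.Println(plansForRound(noPlans, segmentPlans))
}
```

Deferring segment moves avoids producing segment plans that a later
channel move would invalidate.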
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #23726
pr: #28469
1. enable auto balance channel between nodes in querycoord
2. make `genSegmentPlan` reuse the `AssignSegment` logic
3. make `genChannelPlan` reuse the `AssignChannel` logic
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
We found that the load probably got stuck, and reviewed the logs.
The target observer seems not to be working. The reason is that the
taskDispatcher removes the task in a goroutine and modifies the task
status after committing the task into the goroutine pool, but this
modification may happen after the task has been removed, so the task is
never removed.
related #29086
pr: #29191
Signed-off-by: yah01 <yang.cen@zilliz.com>
issue: #28622
pr: #28623
A query node with a delegator will have more rows than other query
nodes because the delegator loads all growing rows.
This PR enables balancing segments based on the number of growing rows
in the leader view.
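
A minimal sketch of why growing rows matter for the choice of target
node. The `nodeLoad` type, `rowCount` and `leastLoaded` are hypothetical,
and the real balancer score may weigh these numbers differently:

```go
package main

import "fmt"

// nodeLoad is a hypothetical per-node summary of loaded rows.
type nodeLoad struct {
	NodeID      int64
	SealedRows  int64
	GrowingRows int64 // assumed to come from the leader view
}

// rowCount counts growing rows as well as sealed rows.
func rowCount(n nodeLoad) int64 {
	return n.SealedRows + n.GrowingRows
}

// leastLoaded returns the node with the fewest total rows.
// The second result is false when the input is empty.
func leastLoaded(nodes []nodeLoad) (int64, bool) {
	if len(nodes) == 0 {
		return 0, false
	}
	best := nodes[0]
	for _, n := range nodes[1:] {
		if rowCount(n) < rowCount(best) {
			best = n
		}
	}
	return best.NodeID, true
}

func main() {
	nodes := []nodeLoad{
		{NodeID: 1, SealedRows: 100, GrowingRows: 500}, // hosts the delegator
		{NodeID: 2, SealedRows: 300, GrowingRows: 0},
	}
	id, _ := leastLoaded(nodes)
	fmt.Println(id)
}
```

Counting sealed rows alone would pick node 1 here, even though it
already holds more rows in total.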
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
This pull request enhances the logging functionality in the code related
to target updating. It adds more logs about the condition satisfying
when updating the target. The logs provide additional information about
the collection ID, replica number, channel readiness, segment readiness,
and leader view readiness. These logs will help in troubleshooting and
monitoring the target updating process.
pr: #29090
Signed-off-by: yah01 <yah2er0ne@outlook.com>
Signed-off-by: yah01 <yang.cen@zilliz.com>
pr: #28829
issue: #28831
Releasing the old delegator before the new delegator updates its
distribution may cause a `channel not available` error.
This PR blocks releasing the old delegator until the new delegator
finishes `syncDistribution`.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Cherry-pick from master
pr: #28661
See also #28660
This PR adds a request timeout config item for etcd kv requests.
It also syncs the default timeout to the same value for the etcdKV &
tikv configs.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #28332
pr: #28396
During querycoord's recovery, it tries to call `DescribeCollection` and
`ShowPartitions` on root coord to check whether a collection or
partition has been released in rootcoord. But if rootcoord is not
ready yet, the RPC fails and querycoord panics.
To fix this, we remove the RPC calls during querycoord's start.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Cherry-pick from master
pr: #28472
See also #28466
In `taskDispatcher.schedule`, the same task may be resubmitted if the
previous round did not finish.
In this case, TaskObserver.check may set the current target by mistake,
which may cause random search/query failures.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Remove the "failCount" log field, which is ambiguous.
Replace the status (int32) with a string to improve the readability of
the task-removed log.
pr: #28331
Signed-off-by: yah01 <yah2er0ne@outlook.com>
- Add `taskDispatcher` to submit and run task async safely
- Change `LeaderObserver` and `TargetObserver` schedule and manual check action to submitting task into dispatcher
- Fix logic problem in collection observer when manual check returns false
See also #27494
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>