This PR includes the following adjustments:
1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.
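
The first two adjustments can be sketched as below. This is a minimal
illustration, not the actual Milvus implementation: `checkpoint`,
`cpUpdater`, `AddTask`, and `DrainBatches` are hypothetical names, and
the sketch assumes that the newest task for a vchannel replaces the
pending one. Only the default batch size of 128 comes from the text above.

```go
package main

import (
	"sort"
	"sync"
)

// checkpoint is a hypothetical stand-in for a vchannel checkpoint.
type checkpoint struct {
	VChannel  string
	Timestamp uint64
}

// cpUpdater keeps at most one pending update task per vchannel.
type cpUpdater struct {
	mu    sync.Mutex
	tasks map[string]checkpoint
}

func newCPUpdater() *cpUpdater {
	return &cpUpdater{tasks: make(map[string]checkpoint)}
}

// AddTask overwrites any pending task for the same vchannel, so the
// backlog is bounded by the number of vchannels (assumption: newest wins).
func (u *cpUpdater) AddTask(cp checkpoint) {
	u.mu.Lock()
	defer u.mu.Unlock()
	u.tasks[cp.VChannel] = cp
}

// DrainBatches removes all pending tasks and groups them into batches of
// at most batchSize entries, one batch per (hypothetical) RPC call.
func (u *cpUpdater) DrainBatches(batchSize int) [][]checkpoint {
	if batchSize <= 0 {
		batchSize = 128 // default batch size stated in the description
	}

	u.mu.Lock()
	pending := make([]checkpoint, 0, len(u.tasks))
	for _, cp := range u.tasks {
		pending = append(pending, cp)
	}
	u.tasks = make(map[string]checkpoint)
	u.mu.Unlock()

	// Sorting is only here to make the sketch deterministic.
	sort.Slice(pending, func(i, j int) bool {
		return pending[i].VChannel < pending[j].VChannel
	})

	var batches [][]checkpoint
	for start := 0; start < len(pending); start += batchSize {
		end := start + batchSize
		if end > len(pending) {
			end = len(pending)
		}
		batches = append(batches, pending[start:end])
	}
	return batches
}
```

Keying the pending tasks by vchannel is what bounds the backlog: a slow
RPC can delay a checkpoint, but it cannot cause tasks to pile up.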
issue: https://github.com/milvus-io/milvus/issues/30004
pr: https://github.com/milvus-io/milvus/pull/30941
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
1. Set the graceful stop timeout of the coordinators and the proxy to 5s.
3. Set the graceful stop timeout of the other worker nodes to 900s; this
should potentially be lowered to 600s once graceful stop is smooth.
4. Change the order in which the datacoord components are stopped.
5. `LivenessCheck` no longer performs a graceful shutdown.
issue: https://github.com/milvus-io/milvus/issues/30310
pr: #30317
also see: https://github.com/milvus-io/milvus/pull/30306
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Allows proactive warming up of chunk cache. Original vector data will be
asynchronously loaded into the chunk cache during the load process. It
has the potential to significantly reduce query/search latency for a
certain duration after the load, albeit with a concurrent increase in
disk usage.
issue: https://github.com/milvus-io/milvus/issues/30181
pr: https://github.com/milvus-io/milvus/pull/30182
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #23726
pr: #29231
This PR adds a control config for querycoord's background automatic
channel balance operation.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
To reduce the coroutine's CPU usage and avoid frequently running
time-consuming flowgraph operations when the message stream consists
solely of "ttMsg", this change adds a mechanism for quickly bypassing
the subsequent flowgraph node processing logic.
If "ttMsg" is continuously received for a certain period of time
(coldTime), the flowgraph enters skipMode. Once in skipMode, every
skipNum "ttMsg" messages are merged into one for processing. If a
non-"ttMsg" message is received while in skipMode, the flowgraph exits
skipMode.
pr: #28756
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: wayblink <anyang.wang@zilliz.com>
pr: #28648
With the 100ms heartbeat timeout, it is easy to trigger a timeout when
standalone CPU usage reaches 100%.
This PR increases the heartbeat timeout param to 2000ms.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
1. Set the balance granularity to the replica level to avoid influencing
unrelated replicas.
2. Avoid balancing back and forth.
Signed-off-by: MrPresent-Han <jamesharden11122@gmail.com>