Cherry-pick from master
pr: #31363
See also #31362
This PR make datacoord garbage collection scan operation using differet
interval than other opeartion.
This interval is a newly added param item, which default value is 7*24
hours.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
/kind improvement
pr: #31271
fix: https://github.com/milvus-io/milvus/issues/31272
This pr add more metrics, which are:
Slow query count, which the duration considered as slow can be
configurable;
Number of deleted entities;
Number of entities per collection;
Number of loaded entities per collection;
Number of indexed entities;
Number of indexed entities, per collection, per index and whether it's a
vetor index;
Quota states (LongTimeTickDelay, MemoryExhuasted, DiskQuotaExhuasted)
per database;
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
This PR includes the following adjustments:
1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.
issue: https://github.com/milvus-io/milvus/issues/30004
pr: https://github.com/milvus-io/milvus/pull/30941
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
1. add coordinator and proxy graceful stop timeout to 5s.
3. add other work node graceful stop timeout to 900s, and we should
potentially change this to 600s when graceful stop is smooth
4. change the order of datacoord component while stop.
5. `LivenessCheck` do not perform graceful shutdown now.
issue: https://github.com/milvus-io/milvus/issues/30310
pr: #30317
also see: https://github.com/milvus-io/milvus/pull/30306
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Allows proactive warming up of chunk cache. Original vector data will be
asynchronously loaded into the chunk cache during the load process. It
has the potential to significantly reduce query/search latency for a
certain duration after the load, albeit with a concurrent increase in
disk usage.
issue: https://github.com/milvus-io/milvus/issues/30181
pr: https://github.com/milvus-io/milvus/pull/30182
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
- this much improve the performance for GPU index
- this also reduce 1x copy while parsing index meta
pr: #29678
Signed-off-by: yah01 <yang.cen@zilliz.com>
issue: #23726
pr: #29231
This PR add control config to querycoord's background auto balance
channel operation
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #29223
cherry pick part of master commit
pr: #29224
Make `conc.Pool` resizable by adding Resize method for it.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Cherry pick from master
pr: #29154
See also #29177
Add a config item for partition name as regexp feature and disable it by
default
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
In order to minimize the CPU usage of the coroutine and avoid frequent
execution of time-consuming operations in the flowgraph when the message
stream consists solely of "ttMsg," it is recommended to implement a
mechanism for quickly bypassing the subsequent flowgraph node processing
logic.
If "ttMsg" is continuously received for a certain period of time
(coldTime), the flowgraph enters skipMode. Once in skipMode, every
skipNum "ttMsg" messages are merged into one for processing. If a
non-"ttMsg" message is received while in skipMode, the flowgraph exits
skipMode.
pr: #28756
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: wayblink <anyang.wang@zilliz.com>
issue: #28960
master pr: #28961
add new configuration: builtinRoles
user can define roles in config file: milvus.yaml
there is an example:
db_ro, only have read privileges, include load
db_rw, read and write privileges, include create/drop/rename collection
db_admin, not only read and write privileges, but also user
administration
Signed-off-by: PowderLi <min.li@zilliz.com>
pr: #28648
it's easy to trigger heartbeat timeout after 100ms when standalone cpu
usage reach 100%.
This PR increase the heartbeat timeout param to 2000ms
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
- Add `taskDispatcher` to submit and run task async safely
- Change `LeaderObeserver` and `TargetObserver` schedule and manual check action to submitting task into dispatcher
- Fix logic problem in collection observer when manual check return false
See also #27494
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>