350 Commits

Author SHA1 Message Date
SimFG
5c166a25b9
enhance: [2.4] improve rootcoord task scheduling policy (#37523)
- issue: #30301
- pr: #37352

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-11-08 14:56:27 +08:00
XuanYang-cn
20534a3f7b
fix: [cp24]Saperate L0 and Mix trigger interval (#37319)
See also: #37108
pr: #37190

- Add MixCompactionTriggerInterval, default 60s
- Add L0CompactionTriggerInterval, default 10s
- Export Single related compaction configs
- Raise SingleCompactionDeltaLogMaxSize from 2MB to 16MB

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-06 11:10:26 +08:00
XuanYang-cn
6109e9d69e
fix: Skip mark compaction timeout for mix and l0 compaction (#37118) (#37194)
Timeout is a bad design for long running tasks, especially using a
static timeout config. We should monitor execution progress and fail the
task if the progress has been stale for a long time.

This pr is a small patch to stop DC from marking compaction tasks
timeout, while still waiting for DN to finish. The design is
self-conflicted. After this pr, mix and L0 compaction are no longer
controlled by DC timeout, but clustering is still under timeout control.

The compaction queue capacity grows larger for priority calc, hence
timeout compactions appears more often, and when timeout, the queuing
tasks will be timeout too, no compaction will success after.

See also: #37108, #37015
pr: #37118

---------

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-31 10:36:21 +08:00
presburger
27a4fe002a
enhance:change gpu default mem pool size (#36969)
Signed-off-by: yusheng.ma <yusheng.ma@zilliz.com>
2024-10-23 17:17:28 +08:00
yihao.dai
539f56220f
enhance: Remove bf from datanode (#36367) (#37027)
Remove bf from datanode:
1. When watching vchannels, skip loading **flushed** segments's bf. For
generating merged bf, we need to keep loading **growing** segments's bf.
2. Bypass bloom filter checks for delete messages, directly writing to
L0 segments.
3. In version 2.4, when dropping a partition, marking segments as
dropped depends on having the full segment list in the DataNode. So, we
need to keep syncing the segments every 10 minutes.

issue: https://github.com/milvus-io/milvus/issues/34585

pr: https://github.com/milvus-io/milvus/pull/35902,
https://github.com/milvus-io/milvus/pull/36367,
https://github.com/milvus-io/milvus/pull/36592

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-22 11:15:28 +08:00
yihao.dai
4e0f5845a1
enhance: Limit import job number (#36891) (#36892)
issue: https://github.com/milvus-io/milvus/issues/36890

pr: https://github.com/milvus-io/milvus/pull/36891

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-18 18:13:25 +08:00
yihao.dai
8923936c9a
enhance: Support memory mode chunk cache (#35347) (#35836)
Chunk cache supports loading raw vectors into memory.

issue: https://github.com/milvus-io/milvus/issues/35273

pr: https://github.com/milvus-io/milvus/pull/35347

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-18 17:03:25 +08:00
Ted Xu
22838a8413
enhance: Datacoord to support prioritization of compaction tasks (#36979)
See #36550

pr: #36547 
pr: #36956

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-10-18 14:15:25 +08:00
cqy123456
6934e8da3a
enhance: [2.4]use growingMmapEnabled to control the behavior of interim index, not vectorField (#36391)
issue: https://github.com/milvus-io/milvus/issues/36392
related pr: https://github.com/milvus-io/milvus/pull/36500

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-10-17 20:23:25 +08:00
congqixia
3252d7a64c
fix: [2.4] Load original key if ts is MaxTimestamp (#36934) (#36950)
Cherry-pick from master
pr: #36934 

Related to #36933

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-17 16:05:29 +08:00
XuanYang-cn
e976b41f97
fix: Remove enableLevelZeroSegment config (#36507)
See also: #36504
pr: #36535

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-11 16:41:21 +08:00
yihao.dai
a4ef93457d
enhance: Optimize import scheduling and add time cost metric (#36601) (#36684)
1. Optimize import scheduling strategic:
a. Revise slot weights, calculating them based on the number of files
and segments for both import and pre-import tasks.
b. Ensure that the DN executes tasks in ascending order of task ID.
2. Add time cost metric and log.

issue: https://github.com/milvus-io/milvus/issues/36600,
https://github.com/milvus-io/milvus/issues/36518

pr: https://github.com/milvus-io/milvus/pull/36601

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-11 10:27:22 +08:00
yihao.dai
9cb5396cf6
enhance: Use common gc config (#36668) (#36670)
Use the GC config from `common` and remove the GC config from
`queryNode`.

issue: https://github.com/milvus-io/milvus/issues/36667

pr: https://github.com/milvus-io/milvus/pull/36668

related pr: https://github.com/milvus-io/milvus/pull/34949

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-09 19:49:20 +08:00
congqixia
3a80d1f602
enhance: [2.4] Add streaming forward policy switch for delegator (#36330) (#36712)
Cherry pick from master
pr: #36330
Related to #35303

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-09 17:41:20 +08:00
XuanYang-cn
05f96f5298
fix: [24]raise l0 compaction memory ratio to 0.5 (#36691)
5 percent of free memory is too less for l0 compaction. This pr will
raise it to 50 percent.

See also: #36614
pr: #36690

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-09 17:19:24 +08:00
SimFG
58a763c529
enhance: [2.4] avoid to create many timer object in the target (#36573)
/kind improvement
- pr: #36570

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-09-29 19:27:16 +08:00
wei liu
ad5d24be65
enhance: Optimize workload based replica selection policy (#36181) (#36384)
issue: #35859
pr: #36181

This PR introduce two new param: toleranceFactor and checkRequestNum,
after every checkRequestNum request has been assigned, try to compute
querynode's workload score.

if the diff is less than the toleranceFactor, replica selection policy
will fallback to round_robin, which reduce the average cost to about
500ns.

if the diff is larger than the toleranceFactor, replica selection policy
will compute querynode's score to select the target node with smallest
score in every assigment.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-09-26 11:19:14 +08:00
jaime
b92daa1532
fix: iaccurate size estimation for encoded array data (#36379)
issue: #36029
pr: #36373

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-09-23 21:17:13 +08:00
SimFG
a35d99eabf
fix: [2.4] long buffering causes mq to be unable to receive messages. (#36425)
- issue: #36397
- pr: #36420

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-09-23 16:33:17 +08:00
congqixia
13d443eb2e
enhance: [2.4] Add L0 forward policy to support remote load (#36189) (#36208)
Cherry-pick from master
pr: #36189
Related to #35303

This PR add a param item to support change l0 forward behavior from bf
filtering and forward to remote load.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-12 19:09:08 +08:00
XuanYang-cn
835c9d5c65
fix: Change l0SegmentsRowCount limits to a reasonable value (#36015)
pr: #36014
See also: #36028

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-09-08 16:55:05 +08:00
Ted Xu
45b2049d5d
fix: fallback params may be overridden (#35972) (#36006)
See #35756

---------

pr: #35972

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-09-05 19:05:05 +08:00
congqixia
da0bc22a5f
enhance: [2.4] Add delete buffer related quota logic (#35918) (#35997)
Cherry pick from master
pr: #35128 #35918
See also #35303

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: aoiasd <45024769+aoiasd@users.noreply.github.com>
2024-09-05 16:43:06 +08:00
jaime
2c1fa50412
enhance: remove cooling off in rate limiter for read requests (#35936)
issue: #35934
pr: #35935

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-09-04 14:39:10 +08:00
SimFG
5b5119a51f
feat: [2.4] provide more general configuration to control mmap behavior (#35609)
- issue: #35273
- pr: #35359

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-08-23 12:35:02 +08:00
wei liu
e014ad9280
fix: fix dynamic update config doesn't works for some param (#35572) (#35637)
issue: #35570
pr: #35572
milvus support config cache to spped up config access, but only evict
param's cache when param has been updated. but milvus's param may rely
on other param's value, let's say ParamsA relys on paramsB, when paramsB
updated, it will evict paramB's cache, but the paramA's cache still keep
the old value.

This PR evict all config cache to solve the above issue, cause dynamic
update config won't be much frequetly.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-08-22 16:00:58 +08:00
wei liu
14ec3dc357
enhance: Enable ReadOnly/ReadWrite/Admin Privilege Group to simplify RBAC grant progress (#35472) (#35543)
issue: #35471
pr: #35472 #35515

---------

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-08-19 16:24:54 +08:00
Ted Xu
57d4bcbf15
enhance: adding the msgchannel section in generated yaml (#35466)
See #32168

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-08-14 19:03:11 +08:00
Ted Xu
ce53e79f12
fix: enable milvus.yaml check (#34567) (#35446)
See #32168

pr: #34567 #35152

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-08-13 19:00:23 +08:00
aoiasd
a20cb727eb
enhance:[Cherry-pick] Check by proxy rate limiter when delete get data by query. (#30891) (#35262)
relate: https://github.com/milvus-io/milvus/issues/30927
pr: https://github.com/milvus-io/milvus/pull/30891

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-08-13 14:32:21 +08:00
wei liu
0201e00a2f
enhance: enable to set load config in cluster level (#35293)
issue: #35170
pr: #35169
This PR enable to set load configs in cluster level, such as replicas
and resource groups. then when load collections will use the load
config.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-08-07 12:38:21 +08:00
cai.zhang
2534b30e39
enhance: [cherry-pick] Add monitoring metrics for task execution time in datacoord (#35141)
issue: #35138 

master pr: #35139

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-08-02 19:46:16 +08:00
wei liu
d767f8977a
enhance: Refine param init for MmapDirPath (#35181) (#35214)
pr: #35181

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-08-02 16:30:15 +08:00
congqixia
f8444b900f
enhance: [2.4] Support proxy/delegator qn client pooling (#35195)
Cherry pick from master
pr: #35194
See also #35196
Add param item for proxy/delegator query node client pooling and
implement pooling logic

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-02 11:24:19 +08:00
wei liu
5f601fcc50
enhance: Reduce delegator memory overloaded factor to 0.1 (#35092) (#35164)
pr: #35092

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-08-01 14:20:13 +08:00
cai.zhang
c340f387cf
enhance: [cherry-pick] Change the fixed value to a ratio for clustering segment size (#35075)
issue: #34495 

master pr: #35076

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-31 10:32:00 +08:00
congqixia
935a117396
enhance: [2.4] Support otlp http exporter (#35053) (#35073)
Cherry-pick from master
pr: #35053
See also #35052

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-30 21:00:08 +08:00
Jiquan Long
86edca8c1b
fix: support auto index for array (#35095)
/kind branch-feature
pr: #34450

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
Co-authored-by: Zhagnlu <lu.zhang@zilliz.com>
2024-07-30 17:57:50 +08:00
wei liu
b3bc7f3985
enhance: Limit collection's normal balance speed (#34810) (#34987)
issue: #34798
pr: #34810

after we remove the task priority on query coord, to avoid load/release
segment blocked by too much balance task, we limit the balance task size
in each round. at same time, we reduce the balance interval to trigger
balance more frequently.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-26 10:13:46 +08:00
jaime
77ae127a62
fix: check collection health(queryable) fail for releasing collection (#34948)
issue: #34946
pr: #34947

---------

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-25 10:25:57 +08:00
cai.zhang
74adedf750
enhance: Optimized the GC logic to ensure that memory is released in time (#34950)
issue: #34703 

master pr: #34949

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-24 14:07:43 +08:00
congqixia
7540714c1b
enhance: [2.4] Add l0 segment entry num quota (#34733) (#34837)
Cherry-pick from master
pr: #34733
See also #34670

This PR add quota configuration for l0 segment entry number per
collection. If l0 compaction cannot keep up the insertion/upsertion
rate, this feature could back press the related rate.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-22 12:05:43 +08:00
wayblink
c0c3c5f528
enhance: [cherry-pick] refine clustering compaction configs and logs (#34818)
issue: #30633
pr: #34784

---------

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-07-21 19:27:41 +08:00
yihao.dai
07bc1b6717
enhance: Seal by total growing segments size (#34692) (#34779)
Seals the largest growing segment if the total size of growing segments
of each shard exceeds the size threshold(default 4GB). Introducing this
policy can help keep the size of growing segments within a suitable
level, alleviating the pressure on the delegator.

issue: https://github.com/milvus-io/milvus/issues/34554

pr: https://github.com/milvus-io/milvus/pull/34692

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-19 18:25:50 +08:00
SimFG
0e226502e4
enhance: [2.4] pick default root password and log level pr (#34777)
default root password
- issue: #33058
- pr: #34752

set log level
- issue: #34756
- pr: #34757

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-07-18 13:45:43 +08:00
wayblink
a26e965e6a
enhance:[cherry-pick] Add compaction task slot usage logic (#34625)
issue: #34544
pr: #34581

---------

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-07-18 09:55:43 +08:00
wayblink
83fc26c31a
fix: [cherry-pick] compaction task not be cleaned correctly (#34766)
1.fix compaction task not be cleaned correctly
2.add a new parameter to control compaction gc loop interval
3.remove some useless configs of clustering compaction

bug: #34764
pr: #34765

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-07-17 22:17:42 +08:00
wei liu
cf701a9bf0
enhance: Preserve fixed-size memory in delegator node for growing segment (#34600)
issue: #34595
pr: #34596
When consuming insert data on the delegator node, QueryCoord will move
out some sealed segments to manage its memory usage. After the growing
segment gets flushed, some sealed segments from other workers will be
moved back to the delegator node. To avoid the frequent movement of
segments, we estimate the maximum growing row count and preserve a
fixed-size memory in the delegator node.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-15 20:51:46 +08:00
congqixia
10c04f33c7
enhance: [2.4] Add param item for segmentFlushInterval (#34629) (#34663)
Cherry-pick from master
pr: #34629
See also #28817

Add paramitem for segment flush interval

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-15 14:31:40 +08:00
Gao
a60e2a65ff
enhance: change autoindex default metric type (#34277)
issue: #34304 
pr: #34261

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-07-08 10:52:14 +08:00