20655 Commits

Author SHA1 Message Date
tinswzy
bd08b3b2d3
enhance: Add mmap file usage metric (#38211)
issue: #38156 
cherry-pick from https://github.com/milvus-io/milvus/pull/38193

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-04 20:24:40 +08:00
sthuang
febed0abb7
enhance: [2.4] add list aliases privilege into public role and fix typo (#38208)
cherry-pick from master: https://github.com/milvus-io/milvus/pull/38176,
https://github.com/milvus-io/milvus/pull/38195
related issue: https://github.com/milvus-io/milvus/issues/37031

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-12-04 17:58:40 +08:00
jaime
319f5494cd
enhance: optimize CPU usage for CheckHealth requests (#35595)
issue: #35563
pr: #35589

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-04 14:26:41 +08:00
cai.zhang
0fe4b2c7b7
enhance: Remove pre-marking segments as L2 during clustering compaction (#37653)
issue: #36686

master pr: #36799 

The core of this change is to **ensure that the many-to-many lineage
derivation logic is correct, making sure that both the parent and child
cannot simultaneously exist in the target segment view.**

feature:
- Clustering compaction no longer marks the input segments as L2.
- Add a new field `is_invisible` to `segmentInfo`, and mark segments
that have completed clustering but have not yet built indexes as
`is_invisible` to prevent them from being loaded prematurely.
- Do not mark the input segment as `Dropped` before the clustering
compaction is completed.
- After compaction fails, only the result segment needs to be marked as
`Dropped`.

compatibility:
- If the upgraded task has not failed, there are no compatibility
issues.
- If the status after the upgrade is `MetaSaved`, then skip the stats
task based on whether TmpSegments is empty.
- If the failure occurs before `MetaSaved`:
  - There are no ResultSegments, and InputSegments have not been marked
    as dropped yet.
  - The level of the input segments needs to be reverted to LastLevel.
- If the failure occurs after `MetaSaved`:
  - ResultSegments have already been generated, and InputSegments have
    been marked as Dropped. At this point, simply make the ResultSegments
    visible.
  - The level of ResultSegments needs to be set to L1 (in order to
    participate in mixCompaction).

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-04 10:48:40 +08:00
smellthemoon
ab88d23ec0
enhance: support db request in Restful api(#38140)(#38078) (#38188)
pr: #38078
pr: #38140
issue: #38077

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-12-04 10:40:39 +08:00
smellthemoon
7bd401c019
enhance: enable limiter for restful v1(#38160) (#38190)
pr: #38160

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-12-04 10:18:38 +08:00
yihao.dai
7b82417641
fix: [2.4] Fix inaccurate partition num metric (#38073)
The partition number has already been incremented in
ChangePartitionState, so there is no need to increment it again in
AddPartition.

issue: https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/37996

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-04 10:02:39 +08:00
sthuang
66f2dac5f5
fix: [2.4] fix grant/revoke v2 meta and unclear error messages (#38146)
cherry-pick from https://github.com/milvus-io/milvus/pull/38110,
https://github.com/milvus-io/milvus/pull/38130
related issue: https://github.com/milvus-io/milvus/issues/37031

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-12-03 22:06:41 +08:00
congqixia
03f01d8869
enhance: [2.4] Use fdopen, fwrite to reduce direct syscall (#38157) (#38180)
Cherry-pick from master
pr: #38157
`File.Write` and `File.WriteInt` use `write`, which may be a direct
syscall on some systems. When mapping field data and writing it line by
line, this can cost a lot of CPU time when the row number is large.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-03 20:14:39 +08:00
yihao.dai
4652c4759a
enhance: [2.4] Accelerate observe collection (#38072)
1. A collection should observe the channel only once.
2. A collection should check the CollectionLoadPercent for updates only
once.
3. Skip saving coll/partition meta if there are no changes, primarily to
accelerate collection observation after recovery.

issue: https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/38028

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-03 19:34:39 +08:00
wei liu
b29237e5d5
enhance: Add collection id to search request count metrics (#38069) (#38144)
pr: #38069 #38167

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-12-03 18:16:39 +08:00
yihao.dai
c5c449fc90
fix: [2.4] Fix datacoord metrics (#38164)
issue: https://github.com/milvus-io/milvus/issues/38162

pr: https://github.com/milvus-io/milvus/pull/38163

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-03 14:12:40 +08:00
cai.zhang
dca779debe
enhance: [2.4] Refine clustering compaction log (#38102)
master pr: #38100

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-02 21:02:39 +08:00
congqixia
49fa78c481
enhance: [2.4] Make timeout work for each GetSegmentInfo req (#38132)
Cherry pick from master
pr: #36026
Relate: #36025

Fix the datanode watch-channel timeout when the segment number is too
large.

Previously, one timeout applied to the whole process of fetching segment
info in batches. When the segment number is large, a single RPC timeout
does not work well across multiple rounds of RPCs.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: aoiasd <zhicheng.yue@zilliz.com>
2024-12-02 20:14:39 +08:00
cai.zhang
580caebb31
fix: [2.4]Check if the dynamic fields contain any static fields (#38027)
issue: #38024 

master pr: #38025

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-02 12:22:38 +08:00
XuanYang-cn
c32ad6573c
enhance: [24]Increase task capacity and clean illegal task (#37896) (#38095)
1. taskQueueCapacity 256 is too small for production when we want to
re-write the entire collection

2. tasks should be cleaned up when they cannot be recovered; otherwise
their meta will remain in etcd forever.

pr: #37896

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-12-02 11:58:38 +08:00
congqixia
6a32b06e54
fix: [2.4] Return thread watcher goroutine after closed (#38091) (#38104)
Cherry-pick from master
pr: #38091 
Resolves #38090

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-02 10:08:38 +08:00
XuanYang-cn
8ae7cdd77d
fix: [24]Replace outer lock with concurrent map (#37817) (#37897)
See also: #37493
pr: #37817

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-29 14:26:43 +08:00
yihao.dai
a71d49dcb4
enhance: [2.4] Accelerate the loading of collection (#37841)
Remove unnecessary ListIndex and DescribeCollection RPC call during
loading.

issue: https://github.com/milvus-io/milvus/issues/37166,
https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/37741

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-29 11:52:37 +08:00
cai.zhang
fdf3a8aa0a
fix: [2.4] Use the ID to retrieve the real name when collectionName is empty (#37859) (#37881)
issue: #36989

master pr: #37859

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-11-29 11:36:37 +08:00
cai.zhang
045cf56b6c
fix: [2.4] Handle the error of the compaction queue being full (#37990)
issue: #37988

master pr: #37989

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-11-29 11:08:37 +08:00
wei liu
88b731d393
fix: SyncSegments rpc always failed (#38032)
issue: #38031
Cause: the call to `cli.SyncSegments` uses a ctx that has already been
overridden and canceled, so the SyncSegments RPC always fails.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-28 17:58:36 +08:00
yihao.dai
7805f7d036
fix: [2.4] Fix load slowly (#37735)
When there're a lot of loaded collections, they would occupy the target
observer scheduler’s pool. This prevents loading collections from
updating the current target in time, slowing down the load process. This
PR adds a separate target dispatcher for loading collections.

issue: https://github.com/milvus-io/milvus/issues/37166

pr: https://github.com/milvus-io/milvus/pull/37454

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-28 14:56:37 +08:00
yihao.dai
913a00911b
enhance: [2.4] Reduce GetIndexInfos calls (#37840)
Batch `GetIndexInfos` calls for segments to reduce RPC calls.
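
A small sketch of the batching idea, using a hypothetical in-memory `indexService` whose `GetIndexInfos` signature is invented for illustration. It compares one call per segment against a single call covering all segments.

```go
package main

import "fmt"

// indexService is a hypothetical stand-in for the component that serves
// GetIndexInfos; calls counts how many requests were issued.
type indexService struct{ calls int }

// GetIndexInfos returns index info for all requested segments in one call.
func (s *indexService) GetIndexInfos(segmentIDs []int64) map[int64]string {
	s.calls++
	infos := make(map[int64]string, len(segmentIDs))
	for _, id := range segmentIDs {
		infos[id] = fmt.Sprintf("index-info-%d", id)
	}
	return infos
}

// fetchPerSegment issues one request per segment.
func fetchPerSegment(s *indexService, ids []int64) map[int64]string {
	all := make(map[int64]string, len(ids))
	for _, id := range ids {
		for k, v := range s.GetIndexInfos([]int64{id}) {
			all[k] = v
		}
	}
	return all
}

// fetchBatched issues a single request for all segments.
func fetchBatched(s *indexService, ids []int64) map[int64]string {
	return s.GetIndexInfos(ids)
}

func main() {
	ids := []int64{1, 2, 3, 4}
	a, b := &indexService{}, &indexService{}
	fetchPerSegment(a, ids)
	fetchBatched(b, ids)
	fmt.Println("per-segment requests:", a.calls, "batched requests:", b.calls)
}
```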

issue: https://github.com/milvus-io/milvus/issues/37634

pr: https://github.com/milvus-io/milvus/pull/37695

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-28 14:38:37 +08:00
congqixia
c326a52370
fix: [2.4] Use correct policy merging growing&l0 and add unit tests (#37950) (#37968)
Cherry-pick from master
pr: #37950
Related to #37574

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-28 14:22:37 +08:00
wei liu
fc2886215c
enhance: Optimize param cost in search (#37738) (#38019)
pr: #37738

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-28 14:10:36 +08:00
XuanYang-cn
40fe656de9
enhance: compaction performance by remove paramtable get (#37882)
pr: #37163

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-28 10:38:37 +08:00
Gao
165afbba91
enhance: support retry search when topk is reduced and result not enough (#37093)
issue: #35576 
pr: #35645

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-11-28 10:12:37 +08:00
jaime
09a7b55c87
enhance: set the maximum database configuration to be refreshable (#37932)
pr: #37931

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-11-27 11:26:36 +08:00
congqixia
af4e008cd1
enhance: [2.4] Add thread watcher to provide actual thread num (#37905) (#37921)
Cherry pick from master
pr: #37905 

Related to #37904

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-27 11:24:36 +08:00
sthuang
6088a1fdbe
enhance: [2.4] Grant v2 proxy service supports operatePrivilegeV2 (#37997)
The Proxy should support the operatePrivilegeV2 service so that the SDK can use it.
cherry-pick from master: https://github.com/milvus-io/milvus/pull/37945
issue: https://github.com/milvus-io/milvus/issues/37031

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-11-26 17:44:35 +08:00
jaime
1c8f14eda4
enhance: the actual number of databases should equal the config value (#38009)
pr: #38006

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-11-26 16:10:36 +08:00
zhagnlu
0a6cfff26b
fix: change search latency metric from us unit to ms unit (#37807)
pr: #37806

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-11-26 14:34:36 +08:00
cqy123456
48fe9dbc30
fix:[2.4]GrowingDataGetter get the wrong string data (#37995)
issue: https://github.com/milvus-io/milvus/issues/37994
master pr: https://github.com/milvus-io/milvus/pull/38015

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-11-26 14:22:35 +08:00
smellthemoon
ef6f990040
enhance: do not log out the full req(#36546) (#37948)
pr: #36546

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-11-26 14:00:36 +08:00
wei liu
93063ce1f9
fix: Prevent simultaneous balance of segments and channels (#37850) (#37939)
issue: #33550
pr: #37850
Balancing segments and balancing channels at the same time causes a
bunch of corner cases.

This PR disables simultaneous balancing of segments and channels.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-26 10:26:40 +08:00
congqixia
8601f3ed66
enhance: [2.4] Refine Replica manager colle2Replicas secondary index (#37906) (#37970)
Cherry-pick from master
pr: #37906
Related to #37630

This PR adds a new utility, the coll2Replicas secondary index, to reduce
map access and iteration when getting replicas by collection.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-26 10:20:35 +08:00
wei liu
b24510164e
enhance: Decouple shard client manager from shard cache (#37371) (#37753)
issue: #37115
pr: #37371 #37646 #37729
The old implementation updates the shard cache and the shard client
manager at the same time, which causes many corner cases due to
concurrency issues without a lock.

This PR decouples the shard client manager from the shard cache, so only
the shard cache is updated when a delegator changes. It also makes sure
the shard client manager always returns the right client, creating a new
client if none exists. To guard against client leaks, the shard client
manager purges clients asynchronously every 10 minutes.
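
A minimal sketch of a client manager that is independent of any cache, under stated assumptions: the type and method names are hypothetical, and the purge criterion (drop clients whose address is no longer in use) is assumed, since the commit message does not spell it out.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a hypothetical stand-in for a connection to a query node.
type client struct{ name string }

// clientManager owns clients on its own, separate from any shard cache.
type clientManager struct {
	mu      sync.Mutex
	clients map[string]*client // keyed by node address
}

func newClientManager() *clientManager {
	return &clientManager{clients: make(map[string]*client)}
}

// GetClient returns the existing client for addr, or creates one if missing.
func (m *clientManager) GetClient(addr string) *client {
	m.mu.Lock()
	defer m.mu.Unlock()
	c, ok := m.clients[addr]
	if !ok {
		c = &client{name: fmt.Sprintf("client@%s", addr)}
		m.clients[addr] = c
	}
	return c
}

// Purge removes clients whose address is not in inUse and reports how many
// were dropped. A background goroutine driven by a time.Ticker could call
// this periodically.
func (m *clientManager) Purge(inUse map[string]struct{}) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	removed := 0
	for addr := range m.clients {
		if _, ok := inUse[addr]; !ok {
			delete(m.clients, addr)
			removed++
		}
	}
	return removed
}

func main() {
	m := newClientManager()
	a := m.GetClient("node-1:21123")
	b := m.GetClient("node-1:21123")
	m.GetClient("node-2:21123")
	fmt.Println("same client reused:", a == b)
	fmt.Println("purged:", m.Purge(map[string]struct{}{"node-1:21123": {}}))
}
```

Because lookups go through `GetClient` under a lock, a cache update elsewhere never has to touch the client map.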

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2024-11-25 17:50:34 +08:00
Ted Xu
e928e15bfc
fix: refuse schedule compaction tasks if there is no slot (#37809)
See #37621


pr: #37589

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Co-authored-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
Co-authored-by: yangxuan <xuan.yang@zilliz.com>
Co-authored-by: wei liu <wei.liu@zilliz.com>
2024-11-25 14:02:34 +08:00
wei liu
370f39db67
enhance: Remove unnecessary stack trace in error (#37816) (#37941)
pr: #37816

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-25 11:28:35 +08:00
wei liu
bb66636448
fix: Channel may be released after balance (#37862) (#37940)
issue: #37830
pr: #37862
Cause: the dist handler doesn't set the channel's version, so when the
channel checker tries to dedup channels, it may release the new
delegator after the balance has finished.

This PR fixes this by setting the proper version for the channel.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-25 11:26:44 +08:00
sthuang
d8f1af68e9
enhance: [2.4] RBAC built in privilege groups and grant v2 (#37787)
cherry-pick from master: https://github.com/milvus-io/milvus/pull/37720,
https://github.com/milvus-io/milvus/pull/37785
issue: https://github.com/milvus-io/milvus/issues/37031

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-11-25 11:24:54 +08:00
wei liu
ff6e8e2f2b
fix: [skip e2e] unstable ut TestResourceManager (#37761) (#37936)
issue: #37760
pr: #37761

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-25 11:06:34 +08:00
zhikunyao
cf1c423e9b
enhance: [skip e2e]2.4 update workflow macos to 13 (#37942)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2024-11-22 18:56:32 +08:00
sre-ci-robot
3ceb494403
[automated] Bump milvus version to v2.4.17 (#37935)
Bump milvus version to v2.4.17
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-11-22 17:42:32 +08:00
yellow-shine
fc96133374
enhance: [2.4 pipeline] limit compute resource (#37889)
Signed-off-by: Yellow Shine <sammy.huang@zilliz.com>
v2.4.17
2024-11-22 14:14:38 +08:00
congqixia
4aca68a739
enhance: Bump milvus & proto version to v2.4.17 (#37920)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-22 14:08:34 +08:00
zhenshan.cao
9b3de3ac3e
fix: Revert "enhance: [2.4] Enable RemoteLoad l0 forward policy" (#37875)
issue https://github.com/milvus-io/milvus/issues/35303
pr: https://github.com/milvus-io/milvus/pull/37867
This reverts commit cdf703aabc2ec7e4addded68e808ba6add3ab2cb.

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-11-22 12:40:33 +08:00
wei liu
e63a2f3559
fix: unstable integration test caused by paramtable.GetNodeID (#37910)
issue: #37908
pr: #37909
Cause: paramtable is a global singleton, so paramtable.GetNodeID may
return the wrong server ID in integration tests.

This PR uses node.GetNodeID in place of paramtable.GetNodeID.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-22 10:14:33 +08:00
congqixia
0bd26171d5
enhance: [2.4] Provide secondary index criteria when filter leaderview (#37777) (#37802)
Cherry-pick from master
pr: #37777 
Related to #37630

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-21 10:48:33 +08:00