20624 Commits

Author SHA1 Message Date
congqixia
c4df6b5910
enhance: [10kcp] Refine Replica manager coll2Replicas secondary index (#37907)
Related to #37630

This PR adds a new coll2Replicas secondary index to reduce map
access and iteration when getting replicas by collection.
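
A minimal sketch of the secondary-index idea described above, not the actual Milvus code. The `Replica` and `ReplicaManager` types and their methods are hypothetical; only the `coll2Replicas` name comes from the commit message. The sketch is not thread-safe and only shows how a lookup by collection can avoid scanning every replica.

```go
package main

import "fmt"

// Replica is a simplified, hypothetical stand-in for a query replica.
type Replica struct {
	ID           int64
	CollectionID int64
}

// ReplicaManager keeps a primary map keyed by replica ID and a secondary
// index from collection ID to the IDs of that collection's replicas.
type ReplicaManager struct {
	replicas      map[int64]*Replica
	coll2Replicas map[int64]map[int64]struct{}
}

func NewReplicaManager() *ReplicaManager {
	return &ReplicaManager{
		replicas:      make(map[int64]*Replica),
		coll2Replicas: make(map[int64]map[int64]struct{}),
	}
}

// Put stores a replica and keeps the secondary index in sync.
func (m *ReplicaManager) Put(r *Replica) {
	if old, ok := m.replicas[r.ID]; ok {
		m.unindex(old)
	}
	m.replicas[r.ID] = r
	ids, ok := m.coll2Replicas[r.CollectionID]
	if !ok {
		ids = make(map[int64]struct{})
		m.coll2Replicas[r.CollectionID] = ids
	}
	ids[r.ID] = struct{}{}
}

// Remove deletes a replica from both the primary map and the index.
func (m *ReplicaManager) Remove(id int64) {
	if old, ok := m.replicas[id]; ok {
		m.unindex(old)
		delete(m.replicas, id)
	}
}

// unindex drops a replica from the secondary index only.
func (m *ReplicaManager) unindex(r *Replica) {
	ids := m.coll2Replicas[r.CollectionID]
	delete(ids, r.ID)
	if len(ids) == 0 {
		delete(m.coll2Replicas, r.CollectionID)
	}
}

// GetByCollection visits only the replicas of one collection instead of
// iterating over every replica in the manager.
func (m *ReplicaManager) GetByCollection(collectionID int64) []*Replica {
	ids := m.coll2Replicas[collectionID]
	out := make([]*Replica, 0, len(ids))
	for id := range ids {
		out = append(out, m.replicas[id])
	}
	return out
}

func main() {
	m := NewReplicaManager()
	m.Put(&Replica{ID: 1, CollectionID: 100})
	m.Put(&Replica{ID: 2, CollectionID: 100})
	m.Put(&Replica{ID: 3, CollectionID: 200})
	fmt.Println("replicas of collection 100:", len(m.GetByCollection(100)))
}
```

The cost of the lookup now depends on the number of replicas in the requested collection rather than on the total number of replicas.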

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-05 11:57:29 +08:00
yihao.dai
d75fb5b3f8
enhance: [10kcp] Reduce mutex contention in datacoord meta (#38229)
1. Use a secondary index to avoid retrieving all segments in
GetSegmentsChanPart.
2. Perform batch SetAllocations to reduce the number of times the meta
lock is acquired.
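
A rough sketch of the batching idea in item 2, assuming a single mutex guards the meta. The `Meta` and `Allocation` types and the `lockCount` field are hypothetical and exist only to make the difference in lock acquisitions visible; they are not the datacoord implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// Allocation is a simplified, hypothetical stand-in for a segment allocation.
type Allocation struct {
	SegmentID int64
	NumRows   int64
}

// Meta guards all allocations with one mutex.
type Meta struct {
	mu          sync.Mutex
	allocations map[int64][]Allocation
	lockCount   int // how many times the lock was taken, for illustration only
}

func NewMeta() *Meta {
	return &Meta{allocations: make(map[int64][]Allocation)}
}

// SetAllocation takes the meta lock once per allocation.
func (m *Meta) SetAllocation(a Allocation) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.lockCount++
	m.allocations[a.SegmentID] = append(m.allocations[a.SegmentID], a)
}

// SetAllocations takes the meta lock once for the whole batch.
func (m *Meta) SetAllocations(batch []Allocation) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.lockCount++
	for _, a := range batch {
		m.allocations[a.SegmentID] = append(m.allocations[a.SegmentID], a)
	}
}

// LockCount reports how many times the lock was taken by the setters.
func (m *Meta) LockCount() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.lockCount
}

func main() {
	batch := []Allocation{{1, 10}, {2, 20}, {3, 30}}

	oneByOne := NewMeta()
	for _, a := range batch {
		oneByOne.SetAllocation(a)
	}

	batched := NewMeta()
	batched.SetAllocations(batch)

	fmt.Println("one by one:", oneByOne.LockCount(), "batched:", batched.LockCount())
}
```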

issue: https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/38219

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-05 11:57:07 +08:00
yihao.dai
3219b869a3
fix: [10kcp] Fix timeout when listing meta (#38152)
When there are too many key-value pairs, the etcd list operation may
time out. This PR replaces LoadWithPrefix with WalkWithPrefix in list
operations that could involve many keys.
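
To illustrate why walking helps, here is a sketch that reads matching keys page by page rather than in one large response. It uses a toy in-memory store instead of etcd; the `KV` type, the `page` helper, and this `WalkWithPrefix` signature are assumptions made for the example and may differ from the Milvus kv interface.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// KV is a toy in-memory store standing in for etcd.
type KV struct {
	data map[string]string
}

// page returns up to limit keys that have the prefix and sort strictly
// after the given key. It plays the role of one bounded list request.
func (kv *KV) page(prefix, after string, limit int) []string {
	keys := make([]string, 0)
	for k := range kv.data {
		if strings.HasPrefix(k, prefix) && k > after {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	if len(keys) > limit {
		keys = keys[:limit]
	}
	return keys
}

// WalkWithPrefix visits all pairs under prefix in key order, one page at a
// time, so no single request has to return every key.
func (kv *KV) WalkWithPrefix(prefix string, pageSize int, fn func(k, v string) error) error {
	if pageSize <= 0 {
		return errors.New("pageSize must be positive")
	}
	after := ""
	for {
		keys := kv.page(prefix, after, pageSize)
		for _, k := range keys {
			if err := fn(k, kv.data[k]); err != nil {
				return err
			}
		}
		if len(keys) < pageSize {
			return nil // last page reached
		}
		after = keys[len(keys)-1]
	}
}

func main() {
	kv := &KV{data: map[string]string{
		"seg/1": "a", "seg/2": "b", "seg/3": "c", "other/1": "x",
	}}
	err := kv.WalkWithPrefix("seg/", 2, func(k, v string) error {
		fmt.Println(k, v)
		return nil
	})
	if err != nil {
		fmt.Println("walk failed:", err)
	}
}
```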

issue: https://github.com/milvus-io/milvus/issues/37917

pr: https://github.com/milvus-io/milvus/pull/38151

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-03 14:15:49 +08:00
yihao.dai
0c29d8ff64
enhance: [10kcp] Update segment manager (#38153)
Use a channel-level key lock for segments in segmentManager.
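
A small sketch of what a channel-level key lock can look like: one mutex per channel name, so operations on different channels do not block each other. `KeyLock` and `countPerChannel` are hypothetical names for this example; the real segmentManager code may use a different lock utility, and this sketch never frees per-key mutexes.

```go
package main

import (
	"fmt"
	"sync"
)

// KeyLock hands out one mutex per key (here: per channel name).
type KeyLock struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func NewKeyLock() *KeyLock {
	return &KeyLock{locks: make(map[string]*sync.Mutex)}
}

// get returns the mutex for key, creating it on first use.
func (k *KeyLock) get(key string) *sync.Mutex {
	k.mu.Lock()
	defer k.mu.Unlock()
	l, ok := k.locks[key]
	if !ok {
		l = &sync.Mutex{}
		k.locks[key] = l
	}
	return l
}

// Lock blocks only callers that use the same key.
func (k *KeyLock) Lock(key string) { k.get(key).Lock() }

// Unlock releases the mutex for key; it must follow a Lock on that key.
func (k *KeyLock) Unlock(key string) { k.get(key).Unlock() }

// countPerChannel increments one counter per channel n times from
// concurrent goroutines, each guarded by that channel's lock.
func countPerChannel(kl *KeyLock, channels []string, n int) map[string]int {
	counts := make(map[string]*int, len(channels))
	for _, ch := range channels {
		counts[ch] = new(int)
	}
	var wg sync.WaitGroup
	for _, ch := range channels {
		c := counts[ch]
		for i := 0; i < n; i++ {
			wg.Add(1)
			go func(ch string, c *int) {
				defer wg.Done()
				kl.Lock(ch)
				defer kl.Unlock(ch)
				*c++
			}(ch, c)
		}
	}
	wg.Wait()
	out := make(map[string]int, len(channels))
	for ch, c := range counts {
		out[ch] = *c
	}
	return out
}

func main() {
	counts := countPerChannel(NewKeyLock(), []string{"ch-1", "ch-2"}, 100)
	fmt.Println(counts["ch-1"], counts["ch-2"])
}
```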

issue: https://github.com/milvus-io/milvus/issues/37633,
https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/37836

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-03 14:15:35 +08:00
yihao.dai
338ccc9ff9
enhance: [10kcp] Reduce memory usage of BF in DataNode and QueryNode (#38133)
1. DataNode: Skip generating BF during the insert phase (BF will be
regenerated during the sync phase).
2. QueryNode: Skip generating or maintaining BF for growing segments;
deletion checks will be handled in the segcore.

issue: https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/38129

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-02 14:41:19 +08:00
yihao.dai
0930430a68
enhance: [10kcp] Skip creating partition rate limiters when not enabled (#38062)
issue: https://github.com/milvus-io/milvus/issues/37630

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-28 10:45:46 +08:00
yihao.dai
635d161109
enhance: [10kcp] Accelerate observe collection (#38058)
issue: https://github.com/milvus-io/milvus/issues/37630

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-28 10:05:24 +08:00
yihao.dai
312475d1f1
enhance: [10kcp] remove the rpc level of coordinator (#37984)
issue: https://github.com/milvus-io/milvus/issues/37764

- add a local client to call local server directly for
querycoord/rootcoord/datacoord.
- enable local client if milvus is running in mixcoord or standalone mode.
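
The two points above can be pictured with a sketch like the following, in which a client either calls an in-process server directly or falls back to RPC. Every name here (`RootCoord`, `server`, `localClient`, `rpcClient`, `NewRootCoordClient`, the mode strings, the address) is hypothetical, and the RPC path is only a stub; this is not the Milvus client code.

```go
package main

import (
	"errors"
	"fmt"
)

// RootCoord is a hypothetical, heavily reduced coordinator interface.
type RootCoord interface {
	DescribeCollection(name string) (string, error)
}

// server is the in-process coordinator implementation.
type server struct {
	collections map[string]string
}

func (s *server) DescribeCollection(name string) (string, error) {
	desc, ok := s.collections[name]
	if !ok {
		return "", errors.New("collection not found: " + name)
	}
	return desc, nil
}

// localClient calls the server directly, skipping serialization and the
// network stack.
type localClient struct {
	srv RootCoord
}

func (c *localClient) DescribeCollection(name string) (string, error) {
	return c.srv.DescribeCollection(name)
}

// rpcClient would issue a remote call; it is stubbed out in this sketch.
type rpcClient struct {
	addr string
}

func (c *rpcClient) DescribeCollection(name string) (string, error) {
	return "", errors.New("rpc to " + c.addr + " is not implemented in this sketch")
}

// NewRootCoordClient picks the local client when the coordinator lives in
// the same process (standalone or mixcoord), and the RPC client otherwise.
func NewRootCoordClient(mode string, local RootCoord, addr string) RootCoord {
	if (mode == "standalone" || mode == "mixcoord") && local != nil {
		return &localClient{srv: local}
	}
	return &rpcClient{addr: addr}
}

func main() {
	srv := &server{collections: map[string]string{"docs": "2 shards"}}
	client := NewRootCoordClient("standalone", srv, "")
	desc, err := client.DescribeCollection("docs")
	fmt.Println(desc, err)
}
```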

Signed-off-by: chyezh <chyezh@outlook.com>

---------

Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: Zhen Ye <chyezh@outlook.com>
2024-11-25 14:50:42 +08:00
yihao.dai
e5c16e0676
fix: [10kcp] Fix checkGeneralCapacity slowly (#37981)
Cache the general count to speed up checkGeneralCapacity.
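
A sketch of the caching idea, under the assumption that the general count is the sum of partitions × shards over all collections. The `Catalog` type and its methods are hypothetical; the point is only that the check reads a running total instead of iterating over every collection.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// collInfo holds the two numbers that feed the general count (assumed).
type collInfo struct {
	partitions, shards int
}

// Catalog keeps a running general count that is updated on every change.
type Catalog struct {
	mu           sync.RWMutex
	colls        map[string]collInfo
	generalCount int // cached sum of partitions*shards over all collections
}

func NewCatalog() *Catalog {
	return &Catalog{colls: make(map[string]collInfo)}
}

// AddCollection adds or replaces a collection and adjusts the cached count.
func (c *Catalog) AddCollection(name string, partitions, shards int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if old, ok := c.colls[name]; ok {
		c.generalCount -= old.partitions * old.shards
	}
	c.colls[name] = collInfo{partitions, shards}
	c.generalCount += partitions * shards
}

// DropCollection removes a collection and adjusts the cached count.
func (c *Catalog) DropCollection(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if old, ok := c.colls[name]; ok {
		c.generalCount -= old.partitions * old.shards
		delete(c.colls, name)
	}
}

// CheckGeneralCapacity is O(1): it reads the cached count rather than
// recomputing it from all collections.
func (c *Catalog) CheckGeneralCapacity(partitions, shards, limit int) error {
	c.mu.RLock()
	defer c.mu.RUnlock()
	if c.generalCount+partitions*shards > limit {
		return errors.New("general capacity exceeded")
	}
	return nil
}

func main() {
	c := NewCatalog()
	c.AddCollection("a", 4, 2)
	c.AddCollection("b", 16, 1)
	fmt.Println(c.CheckGeneralCapacity(8, 1, 32))
	fmt.Println(c.CheckGeneralCapacity(9, 1, 32))
}
```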

issue: https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/37976

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-25 14:50:24 +08:00
yihao.dai
fd30034c77
fix: [10kcp] Fix data view and add more ut (#37915)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-21 21:35:42 +08:00
yihao.dai
4845e4d679
enhance: [10kcp] Revert "enhance: remove the rpc level of coordinator (#37914)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-21 21:35:29 +08:00
yihao.dai
bf90e55319
enhance: [10kcp] Reduce GetRecoveryInfo calls (#37891)
1. Introduce a data view mechanism for DataCoord, attempting to update
each collection's data view periodically.
2. QueryCoord maintains a cache of data view versions. Before
batch-fetching recovery info, it retrieves all versions and only fetches
recovery info for collections with updated versions.
3. Return DataCoord's current data view when fetching RecoverInfo.
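
The version-check flow in the three steps above can be sketched as follows. `DataCoord`, `QueryCoord`, `GetDataViewVersions`, `Refresh`, and the `fetches` counter are simplified, hypothetical stand-ins chosen for this example and do not mirror the real RPC signatures.

```go
package main

import "fmt"

// DataCoord holds one data view version per collection (hypothetical).
type DataCoord struct {
	versions map[int64]int64
	fetches  int // number of expensive recovery-info calls, for illustration
}

// GetDataViewVersions is the cheap call: it returns versions only.
func (d *DataCoord) GetDataViewVersions(ids []int64) map[int64]int64 {
	out := make(map[int64]int64, len(ids))
	for _, id := range ids {
		out[id] = d.versions[id]
	}
	return out
}

// GetRecoveryInfo is the expensive call; it also returns the current
// data view version so the caller can update its cache.
func (d *DataCoord) GetRecoveryInfo(id int64) (string, int64) {
	d.fetches++
	v := d.versions[id]
	return fmt.Sprintf("recovery info of collection %d at view %d", id, v), v
}

// QueryCoord caches the last data view version it saw per collection.
type QueryCoord struct {
	seen map[int64]int64
}

// Refresh fetches recovery info only for collections whose data view
// version changed since the last call, and returns their IDs.
func (q *QueryCoord) Refresh(dc *DataCoord, ids []int64) []int64 {
	versions := dc.GetDataViewVersions(ids)
	var fetched []int64
	for _, id := range ids {
		if v, ok := q.seen[id]; ok && v == versions[id] {
			continue // unchanged view, skip the expensive call
		}
		_, v := dc.GetRecoveryInfo(id)
		q.seen[id] = v
		fetched = append(fetched, id)
	}
	return fetched
}

func main() {
	dc := &DataCoord{versions: map[int64]int64{1: 1, 2: 1, 3: 1}}
	qc := &QueryCoord{seen: map[int64]int64{}}
	ids := []int64{1, 2, 3}

	fmt.Println("first round:", qc.Refresh(dc, ids))
	dc.versions[2] = 2 // the data view of collection 2 changes
	fmt.Println("second round:", qc.Refresh(dc, ids))
}
```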

issue: https://github.com/milvus-io/milvus/issues/37743,
https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/37863

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-21 15:43:13 +08:00
Zhen Ye
ce8069c0fd
enhance: remove the rpc layer of coordinator when enabling standalone or mixcoord (#37892)
issue: #37764

- add a local client to call local server directly for
querycoord/rootcoord/datacoord.
- enable local client if milvus is running in mixcoord or standalone mode.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-11-21 15:42:18 +08:00
Zhen Ye
1a6b98be77
enhance: remove the rpc level of coordinator (#37876)
issue: #33285
pr: #37722

- move most cgo operations related to search/query into the segcore
package so that streamingnode can reuse them.
- add Go unit tests for segcore operations.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-11-21 15:21:11 +08:00
yihao.dai
99da46dd0b
fix: [10kcp] Fix load slowly (#37454) (#37878)
When there are many loaded collections, they occupy the target
observer scheduler's pool. This prevents collections that are still
loading from updating the current target in time, slowing down the load
process. This PR adds a separate target dispatcher for loading collections.

issue: https://github.com/milvus-io/milvus/issues/37166

pr: https://github.com/milvus-io/milvus/pull/37454

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-21 15:11:03 +08:00
yihao.dai
ac7b485a08
enhance: [10kcp] Accelerate the loading of collection (#37879)
Remove unnecessary ListIndex and DescribeCollection RPC calls during
loading.

issue: https://github.com/milvus-io/milvus/issues/37166,
https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/37741

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-21 15:10:36 +08:00
yihao.dai
9e1ba0759c
enhance: [10kcp] Optimize segmentManager segments (#37884)
1. Use vchannel and partition indices for segments.
2. Replace coarse-grained mutex with concurrent map.

issue: https://github.com/milvus-io/milvus/issues/37633,
https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/37836

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-21 15:10:04 +08:00
yihao.dai
92ab65ada0
enhance: [10kcp] Reduce GetIndexInfos calls (#37877)
Batch GetIndexInfos calls for segments to reduce RPC calls.

issue: https://github.com/milvus-io/milvus/issues/37634

pr: https://github.com/milvus-io/milvus/pull/37695

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-21 15:09:39 +08:00
congqixia
0bd26171d5
enhance: [2.4] Provide secondary index criteria when filter leaderview (#37777) (#37802)
Cherry-pick from master
pr: #37777 
Related to #37630

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-21 10:48:33 +08:00
congqixia
28adfe4629
enhance: [2.4] Remove unnecessary segment clone updating dist (#37797) (#37833)
Cherry-pick from master
pr: #37797
Related to #37630

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-20 19:48:33 +08:00
sre-ci-robot
5ac4e4839e
[automated] Bump milvus version to v2.4.16 (#37790)
Bump milvus version to v2.4.16
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-11-20 11:46:37 +08:00
congqixia
cffde80e68
enhance: [2.4] Prevent generate "null" search params (#37811)
pr: #37812
Prevent generating null search params in RESTful search requests.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
v2.4.16
2024-11-19 18:20:32 +08:00
Zhen Ye
ebfd917bb6
fix: make asan available when building milvus image (#37804)
issue: #35854
pr: #37041

- USE_ASAN will not enable the Debug mode.
- replace USE_ASAN by `ldd` to generate the right so in the milvus image.

Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: yellow-shine <sammy.huang@zilliz.com>
2024-11-19 17:28:32 +08:00
congqixia
a10f95d71c
enhance: Bump milvus & proto version to v2.4.16 (#37762)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-18 20:36:31 +08:00
congqixia
876e06b862
fix: [2.4] Load l0 delta for growings when using RemoteLoad (#37772)
Cherry-pick from master
pr: #37771
Related to #37574

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-18 20:26:31 +08:00
smellthemoon
46692d7525
enhance: support upsert autoid==true in Restful API and fix some bugs(#37072)(#37487) (#37766)
pr: #37072
pr: #37487

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-11-18 19:44:31 +08:00
wei liu
2a4f54cd4f
fix: L0 segment has been loaded to worker during channel balance (#37758)
issue: https://github.com/milvus-io/milvus/issues/37703
pr: https://github.com/milvus-io/milvus/pull/37748

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-18 17:00:32 +08:00
foxspy
cabb55595a
enhance: update knowhere version (#37763)
/kind branch-feature

knowhere release notes:
https://github.com/zilliztech/knowhere/releases/tag/v2.3.13

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-11-18 16:30:32 +08:00
wei liu
79f676e7d8
enhance: Use batch to speed up list collections from meta kv (#37752)
issue: #36228
pr: #37742

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-18 15:58:33 +08:00
nico
bbd96e1829
test: update pymilvus version and test cases (#37711)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2024-11-18 14:14:32 +08:00
jaime
3ce27ca689
enhance: remove collection queryable check from health check (#37731)
pr: #37712

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-11-18 10:50:38 +08:00
yihao.dai
13f83df019
enhance: [2.4] Remove segment-level tag from monitoring metrics (#37737)
When there are a large number of segments, the metrics consume a lot of
memory. This PR removes the segment-level tag from monitoring metrics.

issue: https://github.com/milvus-io/milvus/issues/37636

pr: https://github.com/milvus-io/milvus/pull/37696

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-16 23:04:33 +08:00
yihao.dai
d29573551b
enhance: [2.4] Remove unnecessary clone in SetState (#37736)
issue: https://github.com/milvus-io/milvus/issues/37637

pr: https://github.com/milvus-io/milvus/pull/37697

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-16 19:04:34 +08:00
congqixia
cdf703aabc
enhance: [2.4] Enable RemoteLoad l0 forward policy by default (#37678) (#37713)
Cherry-pick from master
pr: #37678
Related to #35303

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-15 18:28:31 +08:00
smellthemoon
b3e6482367
enhance: add search params in search request in restful(#36304) (#37673)
pr: #36304 
pr: #36714 
pr: #36448

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-11-15 17:54:30 +08:00
Zhen Ye
4e11fe7adf
enhance: make milvus image with asan available (#37682)
issue: #35854
pr: #37050

Signed-off-by: chyezh <chyezh@outlook.com>
2024-11-15 17:10:30 +08:00
wei liu
1bd502b585
fix: Delegator stuck at unserviceable status (#37694) (#37702)
issue: #37679
pr: #37694

pr #36549 introduced a logic error that updates the current target when
only part of the channels are ready.

This PR fixes the logic error and lets the dist handler keep pulling
distribution from querynode until all delegators become serviceable.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-15 14:52:30 +08:00
congqixia
e222289038
fix: [2.4] Store default value if ErrKeyNotFound is returned (#37691) (#37705)
Cherry-pick from master
pr: #37691
Related to #37690

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-15 14:50:32 +08:00
XuanYang-cn
5d5f899274
fix: [cp24] Change memoryCheck write lock to read lock (#37526)
pr: #37525

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-15 14:42:31 +08:00
wei liu
c50cb8d3ef
fix: Make GetShardLeaders only retries on retriable error (#37687)
issue: #37532
pr: #37684

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-14 21:14:40 +08:00
XuanYang-cn
d5cad01c22
enhance: [cp24] tidy compaction logs (#37595) (#37647)
Remove some annoying logs and lower a log level from warn to info

pr: #37595

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-14 18:44:31 +08:00
nico
2bf8773d58
enhance: update sdk version (#37661)
pr: #37660

Signed-off-by: nico <cheng.yuan@zilliz.com>
2024-11-14 17:46:39 +08:00
XuanYang-cn
d23da2db4f
fix: [cp24] Correct varchar primarykey size calculation (#37619)
See also: #37582
pr: #37617

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-14 14:16:38 +08:00
wei liu
28bcd85bd0
fix: Balance channel may stuck at increasing replica number case (#37642)
issue: #37640
pr: #37641
This fixes a problem introduced by pr #36549.
Balance channel waits until the new delegator becomes serviceable, but
the new delegator must sync the target version before it becomes
serviceable, and syncing the target version must wait until all replicas
finish loading. So if the replica number is increased while balance
channel is running, a logical deadlock occurs.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-13 14:26:30 +08:00
congqixia
8801322371
enhance: [2.4] Invalidate collection cache when release collection (#37577) (#37628)
Cherry-pick from master
pr: #37577
Related to #37395

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-13 14:00:31 +08:00
congqixia
d073f322a4
enhance: [2.4] Add cgo call metrics for load/write API (#37405) (#37627)
Cherry-pick from master
pr: #37405

Cgo API cost is not observable since no metrics are related to it.
This PR adds metrics for some sync cgo calls related to load & write.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-13 13:58:30 +08:00
wei liu
6dc879b1e2
enhance: Enable node assign policy on resource group (#36968) (#37588)
issue: #36977
pr: #36968
With node_label_filter on a resource group, users can add a label to a
querynode with the env `MILVUS_COMPONENT_LABEL`; the resource group will
then prefer to accept nodes that match its node_label_filter.

Querynodes can then be grouped by label, with querynodes sharing a
label placed in the same resource group.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-13 11:10:29 +08:00
wei liu
7d1c899155
fix: Search may return less result after qn recover (#36549) (#37610)
issue: #36293 #36242
pr: #36549
After a qn recovers, the delegator may be loaded on a new node. Once all
segments have been loaded, the delegator becomes serviceable, but its
target version has not been synced yet. If a search/query arrives at that
point, the delegator uses the wrong target version and filters the
segments down to an empty list, which causes an empty search result.

This PR blocks the delegator's serviceable status until the target
version is synced.
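
A sketch of the gating described above: the delegator reports serviceable only when its segments are loaded and a target version has been synced. The `Delegator` type, its fields, and the `noTargetVersion` sentinel are hypothetical and much simpler than the real querynode delegator.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// noTargetVersion marks a delegator whose target version is not synced yet.
const noTargetVersion int64 = 0

// Delegator is a simplified, hypothetical shard delegator.
type Delegator struct {
	mu             sync.RWMutex
	segmentsLoaded bool
	targetVersion  int64
	segments       map[int64][]int64 // target version -> segment IDs
}

func NewDelegator() *Delegator {
	return &Delegator{
		targetVersion: noTargetVersion,
		segments:      make(map[int64][]int64),
	}
}

// MarkSegmentsLoaded records that all segments finished loading.
func (d *Delegator) MarkSegmentsLoaded() {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.segmentsLoaded = true
}

// SyncTargetVersion records the synced target version and its segments.
func (d *Delegator) SyncTargetVersion(version int64, segmentIDs []int64) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.targetVersion = version
	d.segments[version] = segmentIDs
}

// Serviceable requires both loaded segments and a synced target version.
func (d *Delegator) Serviceable() bool {
	d.mu.RLock()
	defer d.mu.RUnlock()
	return d.segmentsLoaded && d.targetVersion != noTargetVersion
}

// Search refuses to run until the delegator is serviceable, instead of
// filtering with an unsynced version and returning an empty result.
func (d *Delegator) Search() ([]int64, error) {
	if !d.Serviceable() {
		return nil, errors.New("delegator not serviceable: target version not synced")
	}
	d.mu.RLock()
	defer d.mu.RUnlock()
	return d.segments[d.targetVersion], nil
}

func main() {
	d := NewDelegator()
	d.MarkSegmentsLoaded()
	_, err := d.Search()
	fmt.Println("before sync:", err)

	d.SyncTargetVersion(7, []int64{101, 102})
	segs, err := d.Search()
	fmt.Println("after sync:", segs, err)
}
```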

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-12 19:16:30 +08:00
cai.zhang
3456e241ac
fix: [2.4] Fix the bug that retrieved from wrong field for L0 segments (#37599)
issue: #37574 

master pr: #37598

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-11-12 19:02:31 +08:00
wei liu
074f8ee696
enhance: optimize describe collection and index (#37490) (#37605)
fix #37489
pr: #34790
combine multiple describe collection and list index into one call

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Co-authored-by: Xiaofan <83447078+xiaofan-luan@users.noreply.github.com>
2024-11-12 16:54:29 +08:00