7919 Commits

Author SHA1 Message Date
yah01
c8a129756f
enhance: filter out the not needed collections while listing (#29690) (#30180)
This improves performance when many collections exist. Resolves #29631.
pr: #29690

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-22 16:52:55 +08:00
MrPresent-Han
6aaccdd5f4
feat: support general capacity restrict for cloud-side resource contro… (#30017)
related: #29844
pr: https://github.com/milvus-io/milvus/pull/29845

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-01-22 16:18:56 +08:00
SimFG
2465d86138
enhance: [2.3] support related privilege for grant api (#30154)
/kind improvement
pr: #30153

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-01-22 14:42:55 +08:00
yah01
ce318f3286
enhance: make the error of parsing expression to ParameterInvalid (#29681) (#29795)
Before this change, the error was reported as an unexpected error.
pr: #29681

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-22 13:36:55 +08:00
yihao.dai
917a4d74f3
fix: Use channel cp as the dml&start position for import segments (#30107) (#30133)
This PR discontinues the subscription to the MQ and instead uses the
channel checkpoint as the DML and start position for import segments.

issue: https://github.com/milvus-io/milvus/issues/30106

pr: https://github.com/milvus-io/milvus/pull/30107

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-22 13:32:55 +08:00
yah01
a8d9b0ccba
enhance: optimize the loading index performance (#29894) (#30018)
this utilizes concurrent loading
pr: #29894

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-22 13:12:56 +08:00
congqixia
bac1a1355b
fix: [Cherry-pick] collection properties not saved for alter collection (#30145) (#30156)
Cherry-pick from master
pr: #30145
Resolves: #30144

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-22 10:08:55 +08:00
yihao.dai
b95f0cc0a1
enhance: Add a counter monitoring for the rate-limit requests (#30109) (#30132)
Add a counter metric for rate-limited RPC requests, with labels: proxy
nodeID, RPC request type, and state.

issue: https://github.com/milvus-io/milvus/issues/30052

pr: https://github.com/milvus-io/milvus/pull/30109

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-21 14:44:59 +08:00
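
The labeled counter described in b95f0cc0a1 can be pictured as a map keyed by label values. The following is a minimal Python sketch, not Milvus code: `RateLimitCounter`, the label order, and the sample label values are all assumptions made for illustration.

```python
from collections import Counter


class RateLimitCounter:
    """Toy labeled counter keyed by (node_id, request_type, state)."""

    def __init__(self):
        self._counts = Counter()

    def inc(self, node_id, request_type, state):
        # Each distinct label combination is counted separately.
        self._counts[(node_id, request_type, state)] += 1

    def value(self, node_id, request_type, state):
        # Unseen label combinations read as zero.
        return self._counts[(node_id, request_type, state)]
```

Keeping the state as a label, rather than as a separate metric, lets one query compare limited and accepted requests for the same node and request type.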
PowderLi
3dc2585d9b
enhance: support dataType: array & json (#30077)
issue: #30075 
master pr: #30076

deal with the array<?> field data correctly

Signed-off-by: PowderLi <min.li@zilliz.com>
2024-01-21 14:00:56 +08:00
wei liu
b2997eb881
fix: Leader checker can't remove segment from leader view (#30152)
issue: #30150
pr: #30151

This PR fixes three problems:

1. The load request generated by the leader checker doesn't set the load
scope.
2. The leader checker uses the wrong node ID when generating a release
task, which causes the release task to finish immediately.
3. The release request generated by the leader checker doesn't set the
force flag, so the operation to clean the leader view on the delegator
fails.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-20 18:58:58 +08:00
congqixia
079ddbfc01
enhance: [Cherry-pick] Shuffle candidates before channel assignment (#30066) (#30089)
Cherry-pick from master
pr: #30066

Shuffle candidates to reduce the cases where several channels are
allocated to the same node

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-19 12:06:54 +08:00
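
One way to read the shuffle in 079ddbfc01: when several candidate nodes look equally good, a fixed iteration order always favors the same node. The sketch below is an assumption about the general idea, not the actual Milvus balancer; `pick_node` and the `load` map are hypothetical names.

```python
import random


def pick_node(nodes, load, rng):
    """Pick the least-loaded node, breaking ties in shuffled order."""
    candidates = list(nodes)
    # min() returns the first minimal element, so shuffling first
    # randomizes which of several equally loaded nodes wins.
    rng.shuffle(candidates)
    return min(candidates, key=lambda node: load.get(node, 0))
```

Without the shuffle, every call with equal loads would return the first node in `nodes`.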
foxspy
0700434c58
fix: patching search cache param when index meta does not hold one (#30116)
Patch the search cache param from the index configs when the index meta
does not hold the search cache size key.

issue: #30113 
pr: #30119

Signed-off-by: xianliang <xianliang.li@zilliz.com>
2024-01-19 11:50:56 +08:00
SimFG
be1470a654
enhance: [2.3] Add load/release partitions to replicate msg stream (#30001)
/kind improvement
pr: #28399

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-01-18 22:50:55 +08:00
wei liu
71e24f0a7f
fix: Remove heartbeat lag logic during get shard leaders (#29999) (#30085)
issue: #29677 #29838
pr: #29999
During get shard leaders, if a querynode doesn't ack the heartbeat for
more than 10s, querycoord treats it as unavailable and won't return
shard leaders on it. But when a querynode is at full CPU usage, it can
easily stall for more than 10s without acking the heartbeat, which
leaves no shard leader available for search/query.

This PR removes the heartbeat lag logic during get shard leaders.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-18 17:48:55 +08:00
congqixia
7f32576f36
enhance: [cherry-pick] replace magic number with ParamItem for dist handler (#30020) (#30070)
Cherry-pick from master
pr: #30020
See also #28817

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-18 15:58:54 +08:00
wei liu
7d73032582
enhance: refactor leader_observer to leader_checker (#29454) (#29984)
issue: #29453
pr: #29452
Syncing distribution by RPC also calls loadSegment/releaseSegment,
which may cause all kinds of concurrency cases on the same segment, such
as concurrent load and release on one segment.
This PR adds leader_checker, which generates load/release tasks to
correct the leader view instead of syncing distribution by RPC.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-18 14:08:54 +08:00
congqixia
ce1ba6808a
enhance: [cherry-pick] change some important request log level to Info (#30062) (#30071)
Cherry-pick from master
pr: #30062 
Logs for some important requests should be at least Info level

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-18 12:44:55 +08:00
congqixia
14aa20b7f7
enhance: [cherry-pick] fix otel config param type & leak (#30068)
cherry pick from master
pr: #29810 #30055 

`SampleFraction` shall be float and all `C.CString` shall be freed

Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-18 12:43:05 +08:00
zhenshan.cao
9aceff5a6e
fix: duplicate dynamic field data by mistake (#30043)
issue: #30000 
pr: https://github.com/milvus-io/milvus/pull/30042

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-01-17 00:20:55 +08:00
zhagnlu
9f6a19c56c
fix: increase expr recursion depth to avoid parse failed (#29860) (#30021)
pr: #29860

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-01-16 19:48:38 +08:00
cai.zhang
88c30b48ce
fix: [pick]Fix bug for read data from azure (#30006)
issue: #30005 
master pr: #30007

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-01-16 15:44:53 +08:00
PowderLi
ff93e8b489
fix: [CHERRY-PICK] CollectionSchema.autoID is deprecated (#30011)
issue: [#30000](https://github.com/milvus-io/milvus/issues/30000)
related to: [milvus-proto #202](https://github.com/milvus-io/milvus-proto/pull/202)
master pr: #30002

1. replace collSchema.AutoID with primaryField.AutoID
2. show `enableDynamic` & `enableDynamicField` at the same time
3. avoid a data race when accessing the metacache

Signed-off-by: PowderLi <min.li@zilliz.com>
2024-01-16 14:32:53 +08:00
congqixia
1dbc2ab8ee
enhance: [Cherry-pick] make compactor use actual buffer size to decide when to sync(#29945) (#29971)
Cherry-pick from master
pr: #29945
See also: #29657

The Datanode compactor uses the estimated row number from the schema to
decide when to sync a batch of data while executing compaction. This
estimate can drift far from the actual size when the schema contains
variable-length fields (say VarChar, JSON, etc.).

This PR makes the compactor check the actual buffer data size, so it can
sync when the buffer actually exceeds the max binlog size.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-16 12:22:52 +08:00
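
The idea in 1dbc2ab8ee, deciding when to sync from the measured buffer size instead of a row-count estimate, can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the Datanode compactor: `SizedBuffer`, `max_bytes`, and the use of raw `bytes` rows are hypothetical.

```python
class SizedBuffer:
    """Buffers rows and reports when the real byte size reaches a limit."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.rows = []
        self.size = 0

    def append(self, row: bytes) -> bool:
        """Add a row; return True when the buffer should be synced."""
        self.rows.append(row)
        # Track the actual payload size, not rows * estimated row size.
        self.size += len(row)
        return self.size >= self.max_bytes

    def sync(self):
        """Hand back the buffered rows and reset the buffer."""
        flushed, self.rows, self.size = self.rows, [], 0
        return flushed
```

With variable-length rows, a row-count threshold can fire too late or too early; a byte counter follows the data that was actually written.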
congqixia
7fc7e1a0d5
enhance: [Cherry-pick] Use newer checkpoint when packing LoadSegmentRequest (#29922) (#29978)
Cherry-pick from master
pr: #29922 
See also: #29650

Either the segment DML position or the channel checkpoint could be the
newer one in some cases. This PR makes PackLoadSegments use the newer
one, improving load performance in cases with lots of upserts.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-16 12:08:53 +08:00
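
The selection described in 7fc7e1a0d5 amounts to comparing two positions and keeping the later one. A minimal sketch, assuming positions can be compared by a single integer timestamp; `Position` and `newer_position` are hypothetical names, not the PackLoadSegments code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    channel: str
    timestamp: int


def newer_position(segment_dml: Position, channel_cp: Position) -> Position:
    """Return whichever position has the larger timestamp."""
    if channel_cp.timestamp > segment_dml.timestamp:
        return channel_cp
    # On a tie, keep the segment's own DML position.
    return segment_dml
```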
wei liu
81fdb6f472
enhance: Skip generate load segment task (#29724) (#29982)
issue: #29814
pr: #29724
If the channel is not subscribed yet, the generated load segment task
will be removed from the task scheduler, because the load segment task
needs to be transferred to the worker node by the shard leader.

This PR skips generating the load segment task when the channel is not
subscribed yet.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-16 10:12:52 +08:00
chyezh
df9b3376dc
fix: Use determined order to lock in BlockAll to avoid deadlock (#29972)
issue: #29104
pr: #29246

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-15 14:32:51 +08:00
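
The title of df9b3376dc names a classic remedy: if every caller acquires a group of locks in the same order, no two callers can each hold a lock the other is waiting for. The sketch below shows that general technique in Python. It assumes the locks are keyed and that sorting the keys defines the order; `KeyLocks` and its methods are hypothetical, not the Milvus `BlockAll` implementation.

```python
import threading


class KeyLocks:
    """A set of per-key locks that are always taken in one fixed order."""

    def __init__(self, keys):
        self._locks = {key: threading.Lock() for key in keys}

    def block_all(self):
        # Sorting gives every caller the same acquisition order.
        for key in sorted(self._locks):
            self._locks[key].acquire()

    def unblock_all(self):
        # Release in reverse order of acquisition.
        for key in sorted(self._locks, reverse=True):
            self._locks[key].release()
```

If the locks were instead taken in an unspecified order (for example, the iteration order of a hash map that differs between calls), two concurrent `block_all` calls could deadlock.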
chyezh
072b11355d
fix: SealedIndexingEntry in SealedIndexingRecord may leak without smart pointer protected (#29966)
possibly related issue: #29828
pr: #29932

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-15 10:30:52 +08:00
cai.zhang
434ac1f6d0
fix: [Pick]Fix error message for indexing (#29906)
issue: #29897 

master pr: #29898

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-01-14 13:30:52 +08:00
chyezh
c8e3a48214
fix: querynode num entity metric is broken by illegal label (#29949)
issue: #29766
also see pr: #29825
pr: #29948

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-14 10:22:59 +08:00
congqixia
227071a754
enhance: [cherry-pick] reduce delete detail log to delete range (#29916) (#29930)
Cherry-pick from master
pr: #29916
The delete detail log is large and hard to read when the log level is
debug. This PR changes the log to use a stringer and print only the pk
range and number.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-12 20:18:51 +08:00
congqixia
c21229b7bb
enhance: [cherry-pick] add trace span for wait tsafe (#29911) (#29929)
Cherry-pick from master
pr: #29911 
Add a tracing span for the time search/query operations spend waiting for tsafe

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-12 20:17:01 +08:00
aoiasd
128f197797
enhance: [Cherry-Pick] support access log print cluster prefix (#29646) (#29831)
relate: https://github.com/milvus-io/milvus/issues/29645
pr: https://github.com/milvus-io/milvus/pull/29646

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-01-12 18:58:52 +08:00
wei liu
86cddd24b5
enhance: Add ctx for load index logs (#29686) (#29905)
pr: #29686
This PR adds ctx for load index logs

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-12 18:56:58 +08:00
SimFG
d573f0ec1a
fix: [2.3] the delete msg disorder issue (#29917)
/kind improvement
pr: #29915

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-01-12 18:04:50 +08:00
wayblink
e1446da83c
feat: [Cherry-pick] Implement DescribeAlias and ListAliases interfaces (#29896)
#22882
pr: #29641

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-01-12 16:30:51 +08:00
congqixia
c56622dea7
enhance: move confusing warning log to error branch (#29891)
`flushInsertData` & `flushDeleteData` print a WARNING log even when no
error is returned. Move the log into the error-handling if block.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-12 15:50:52 +08:00
wei liu
16e7f51033
fix: Dynamic update rate limit config with wrong value (#29902)
pr: #29901 
When applying dynamic config changes, the value should be formatted to
the proper unit.
This PR fixes the rate limit config being updated with a wrong value.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-12 15:10:51 +08:00
chyezh
98aae10273
fix: compact operation on datacoord meta should perform as a transaction (#29776)
issue: #29691
pr: #29775

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-12 14:54:52 +08:00
chyezh
7d3ec9f869
fix: unhealthy datacoord started with unhealthy channel manager (#29849)
issue: #29818
pr: #29848

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-12 14:24:54 +08:00
wei liu
5520bfbb05
enhance: Change some frequency log to rated level (#29720) (#29903)
pr: #29720
This PR changes some frequently printed logs to rated level

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-12 11:46:52 +08:00
yah01
4edcd4d22b
fix: the insert count is zero after set the pointer to nil (#29870) (#29881)
This meant the EntitiesNum metric would never be reduced.

fix: #29766
pr: #29870

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-12 10:20:51 +08:00
chyezh
f0db26107c
fix: panic caused by type assert LocalSegment on Segment (#29018) (#29900)
- Make implementation of LocalWorker and RemoteWorker same.

issue: #29017, #29899
pr: #29018

Signed-off-by: yah01 <yah2er0ne@outlook.com>
Co-authored-by: yah01 <yah2er0ne@outlook.com>
2024-01-12 10:08:50 +08:00
jaime
c0b711e9fb
enhance: Support read hardware metrics for cgroupv2 (#29847)
issue: #29846
pr: #29850

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-01-11 19:20:57 +08:00
congqixia
00c0a5a2ab
enhance: [Cherry-pick] make Load process traceable in querycoord (#29806) (#29869)
Cherry-pick from master
pr: #29806
See also #29803

This PR:
- Add trace span for collection/partition load
- Use TraceSpan to generate Segment/ChannelTasks when loading
- Refine BaseTask trace tag usage

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-11 18:00:52 +08:00
congqixia
cd93954214
enhance: [Cherry-pick] pre-allocate result FieldData space to reduce growslice (#29726) (#29866)
Cherry-pick from master
pr: #29726

See also: #29113

Add a new utility function in `pkg/util/typeutil` to pre-allocate field
data slice capacity according to the search limit. This avoids copying
the data during `AppendFieldData` when the previous slice is out of
space, and should also save CPU time under high payload.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-11 17:59:01 +08:00
wei liu
603cd1fb3f
fix: Drop segment meta info with prefix (#29857)
pr: #29856
If a segment has more than 128 log files, dropping the segment exceeds
the etcd txn ops limit, which fails the drop segment request.
This PR drops segment meta info by prefix to avoid that failure.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-11 15:02:50 +08:00
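
The difference between per-key deletes and a prefix delete in 603cd1fb3f can be shown with an in-memory stand-in. This is a toy model, not the etcd client API: `ToyKV`, its methods, and the key layout are hypothetical, and the 128 threshold is taken from the commit message.

```python
MAX_TXN_OPS = 128  # threshold taken from the commit message


class ToyKV:
    """In-memory stand-in for a KV store with a per-transaction op limit."""

    def __init__(self, data):
        self.data = dict(data)

    def txn_delete(self, keys):
        # One delete op per key: rejected once the op count passes the limit.
        if len(keys) > MAX_TXN_OPS:
            raise ValueError("too many operations in txn")
        for key in keys:
            self.data.pop(key, None)

    def delete_prefix(self, prefix):
        # Modeled as a single op, however many keys match the prefix.
        for key in [k for k in self.data if k.startswith(prefix)]:
            del self.data[key]
```

The prefix should end at a path boundary (such as a trailing `/`) so that deleting `seg/1/` does not also match `seg/10/`.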
zhenshan.cao
7cf2be09b5
fix: Restore the MVCC functionality. (#29749) (#29802)
When the TimeTravel functionality was previously removed, it
inadvertently affected the MVCC functionality within the system. This PR
aims to reintroduce the internal MVCC functionality as follows:

1. Add MvccTimestamp to the requests of Search/Query and the results of
Search internally.
2. When the delegator receives a Query/Search request and there is no
MVCC timestamp set in the request, set the delegator's current tsafe as
the MVCC timestamp of the request. If the request already has an MVCC
timestamp, do not modify it.
3. When the Proxy handles Search and triggers the second phase ReQuery,
divide the ReQuery into different shards and pass the MVCC timestamp to
the corresponding Query requests.

issue: #29656
pr: #29749

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-01-11 14:42:49 +08:00
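
Step 2 of 7cf2be09b5, filling in the MVCC timestamp only when the request does not already carry one, can be sketched as below. This is an illustration under stated assumptions, not the delegator code: `QueryRequest`, `with_mvcc_timestamp`, and the convention that 0 means "not set" are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class QueryRequest:
    expr: str
    mvcc_timestamp: int = 0  # 0 means "not set" in this sketch


def with_mvcc_timestamp(req: QueryRequest, tsafe: int) -> QueryRequest:
    """Fill in the MVCC timestamp from tsafe only when the request has none."""
    if req.mvcc_timestamp == 0:
        return replace(req, mvcc_timestamp=tsafe)
    # A timestamp set upstream (for example by a ReQuery) is left untouched.
    return req
```

Leaving an existing timestamp alone is what lets the second-phase ReQuery read the same snapshot as the original Search.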
yah01
e7e4561da8
fix: the entities num metric may be contributed more than once (#29767) (#29825)
the growing segments contribute to this metric while inserting and
putting into the manager, but the current impl inserts data before
putting the segments into manager, which leads to double contributions

fix: #29766
pr: #29767

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2024-01-11 10:24:51 +08:00
XuanYang-cn
1128b1dd67
fix: [cherry-pick]Save lite WatchInfo into etcd in DataNode (#29751)
See also: #29689
pr: #29687

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-01-10 20:48:50 +08:00
congqixia
6c9a5e347e
fix: [cherry-pick] Assertion all async invocations in test case (#29737) (#29782)
Cherry-pick from master
pr: #29737
Resolves: #29736

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-09 17:48:49 +08:00