9171 Commits

Author SHA1 Message Date
congqixia
1cd8d1bd80
enhance: [2.4] Use stats Handler to record request/response size metrics (#36107) (#36118)
Cherry-pick from master
pr: #36107 
Related to #36102

This PR uses the newly added `grpcSizeStatsHandler` to avoid calling
`proto.Size`, since the gRPC framework already records the request and
response sizes.
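
A minimal sketch of the idea, using only the Go standard library. The
`payloadEvent` and `sizeStatsHandler` types are hypothetical stand-ins for
the framework's per-message events and for the handler named above; they
are not the actual implementation in this PR.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// payloadEvent is a hypothetical stand-in for the per-message event an RPC
// framework hands to a stats handler. wireLength is the encoded size in bytes.
type payloadEvent struct {
	method     string
	inbound    bool
	wireLength int
}

// sizeStatsHandler accumulates request and response bytes from events, so
// the serving code never has to re-serialize a message just to measure it.
type sizeStatsHandler struct {
	requestBytes  atomic.Int64
	responseBytes atomic.Int64
}

// handle adds the already-known wire length to the matching counter.
func (h *sizeStatsHandler) handle(ev payloadEvent) {
	if ev.inbound {
		h.requestBytes.Add(int64(ev.wireLength))
		return
	}
	h.responseBytes.Add(int64(ev.wireLength))
}

func main() {
	h := &sizeStatsHandler{}
	h.handle(payloadEvent{method: "/demo/Search", inbound: true, wireLength: 128})
	h.handle(payloadEvent{method: "/demo/Search", inbound: false, wireLength: 4096})
	fmt.Println(h.requestBytes.Load(), h.responseBytes.Load())
}
```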

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-10 17:13:08 +08:00
congqixia
c166253540
fix: [2.4] Make legacy non-lexicographic branch break switch (#36126)
Cherry-pick from master
pr: #36125
Related to #35941
Previous PR: #36034

This patch corrects the switch branching logic and makes the unit
test work for cases that do not select the whole dataset.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-09 22:49:06 +08:00
zhenshan.cao
9fe846c9e3
fix: binary arith expression on inverted index (#35945) (#36097)
issue: https://github.com/milvus-io/milvus/issues/35946
pr : https://github.com/milvus-io/milvus/pull/35945

---------

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
Co-authored-by: Jiquan Long <jiquan.long@zilliz.com>
2024-09-09 12:51:06 +08:00
SimFG
d3bf7a2d27
fix: [2.4] delay to start the metric server port (#36085)
- issue: #36083
- pr: #36080
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-09-09 10:11:06 +08:00
jaime
256d4e209f
fix: memory leak in proxy meta cache (#36076)
issue: #36074
pr: #36075

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-09-08 17:49:06 +08:00
zhagnlu
08b9db424b
fix:rename mmap file path to avoid directory conflict (#35810) (#35975)
pr: #35810

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-09-08 16:57:06 +08:00
yihao.dai
e16afe78ba
enhance: Log warn on delayed compaction task (#36049) (#36050)
/kind enhancement

pr: https://github.com/milvus-io/milvus/pull/36049

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-09-06 23:31:06 +08:00
wayblink
922f54967d
fix: [cherry-pick] add log in mixCompactionTask and set fail/timeout task to clean (#35967)
#35966
master pr :#35970

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-09-06 23:27:05 +08:00
wei liu
cc414d53b7
fix: Fix logic dead lock when delegator has high memory usage (#36066)
issue: #36064
pr: #36065
When the delegator has high memory usage, loading an L0 segment fails, and
the balance segment task is blocked by the load segment task. The delegator
then cannot free memory by moving out some segments, which causes a logical
deadlock.

This PR removes the limit for balance and permits load segment and balance
tasks to execute in parallel. This does not cause side effects because:
1. a segment can have only one task in qc's scheduler, and a load/release
task will replace a balance task if necessary.
2. balance speed is limited, so it won't block load segment tasks.
3. if a collection has a load task and a balance task at the same time, the
load task is scheduled first due to its higher priority.
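
The first rule can be sketched roughly as follows. The `scheduler`, `task`
and `taskKind` names are hypothetical, and this sketch assumes a load or
release task always replaces a pending balance task, which is simpler than
the "if necessary" condition described above.

```go
package main

import "fmt"

type taskKind int

const (
	balanceTask taskKind = iota
	loadTask
	releaseTask
)

type task struct {
	segmentID int64
	kind      taskKind
}

// scheduler keeps at most one pending task per segment.
type scheduler struct {
	tasks map[int64]task
}

func newScheduler() *scheduler { return &scheduler{tasks: map[int64]task{}} }

// add reports whether t was accepted. A load or release task replaces a
// pending balance task on the same segment; any other conflict is rejected.
func (s *scheduler) add(t task) bool {
	old, exists := s.tasks[t.segmentID]
	if exists && !(old.kind == balanceTask && t.kind != balanceTask) {
		return false
	}
	s.tasks[t.segmentID] = t
	return true
}

func main() {
	s := newScheduler()
	fmt.Println(s.add(task{segmentID: 1, kind: balanceTask})) // true
	fmt.Println(s.add(task{segmentID: 1, kind: loadTask}))    // true: replaces the balance task
	fmt.Println(s.add(task{segmentID: 1, kind: balanceTask})) // false: a load task is pending
}
```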

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-09-06 22:01:07 +08:00
foxspy
32929fac6c
enhance: Update Knowhere version (#36067)
/kind branch-feature

release note: https://github.com/zilliztech/knowhere/releases/tag/v2.3.9

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-09-06 20:11:05 +08:00
XuanYang-cn
6dc7d2041f
fix: Set an empty segment if compaction deleted all inserts (#36045)
See also: #36038 
pr: #36044

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-09-06 20:09:05 +08:00
congqixia
b34b035edc
fix: [2.4] Use SliceSetEqual to compare load field list (#36062)
Cherry-pick from master
pr: #36051
Related to #36037

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-06 19:17:05 +08:00
congqixia
aeb576ec0a
enhance: [2.4] Use MARISA_LABEL_ORDER when building trie index (#36060)
Cherry pick from master
pr: #36034

Related to #35941
Previous PR: #35943

This PR makes the `Trie` index use `MARISA_LABEL_ORDER`, which makes
predictive search iterate in lexicographic order.

When the trie index is built in label order, the lexicographic order can be
used to accelerate `Range` operations.

However, according to the official documentation, using `MARISA_LABEL_ORDER`
makes "exact match lookup, common prefix search, and predictive
search" slower.
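
To illustrate why ordered iteration helps range operations, here is a small
sketch over a sorted slice of keys rather than a real trie. `rangeSorted` is
a hypothetical helper; it only shows that lexicographic order lets a range
be located by binary search instead of visiting every key.

```go
package main

import (
	"fmt"
	"sort"
)

// rangeSorted returns the keys in [lo, hi) from a lexicographically sorted
// slice. Both bounds are found by binary search, so no full scan is needed.
func rangeSorted(sorted []string, lo, hi string) []string {
	start := sort.SearchStrings(sorted, lo)
	end := sort.SearchStrings(sorted, hi)
	if end < start {
		end = start // empty range when hi sorts before lo
	}
	return sorted[start:end]
}

func main() {
	keys := []string{"apple", "banana", "cherry", "date"}
	fmt.Println(rangeSorted(keys, "b", "d")) // [banana cherry]
}
```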

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-06 19:15:16 +08:00
XuanYang-cn
64e109d155
fix: [cp]Change deltalog memory estimation factor to one (#36035)
See also: #36031
pr: #36033

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-09-06 18:09:05 +08:00
congqixia
e21b09cc90
fix: [2.4] Fill load field list from old version load info (#35993) (#36018)
Cherry-pick from master
pr: #35993
See also #35959

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-06 17:27:06 +08:00
congqixia
55b33cd3cf
fix: [2.4] Fix tracing config update logic (#35928) (#35998)
Cherry-pick from master
pr: #35928 
Related to #35927

This PR addresses several issues:
- Use the `ResetTraceConfig` method instead of the init one in the update event handler
- Implement a dynamic stats.Handler to receive tracing config update events
- Update the `enable_trace` flag when `ResetTraceConfig` is invoked
- Change `enable_trace` to `std::atomic<bool>` to avoid a data race
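
The last point is a C++ change; a rough Go analogue of the same pattern is
sketched below. `enableTrace` and `resetTraceConfig` are hypothetical names
that mirror the ones in the list above. The point is only that a flag read
by many goroutines and written by a config-update event should be atomic.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// enableTrace is read on hot paths and written by config-update events.
var enableTrace atomic.Bool

// resetTraceConfig is a hypothetical update hook that flips the flag.
func resetTraceConfig(enabled bool) {
	enableTrace.Store(enabled)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = enableTrace.Load() // concurrent readers, no data race
		}()
	}
	resetTraceConfig(true)
	wg.Wait()
	fmt.Println(enableTrace.Load()) // true
}
```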

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-06 11:19:05 +08:00
wei liu
10211ea056
fix: Fix dynamic release partition may fail search/query request (#35919) (#36019)
issue: #33550
pr: #35919
Cause: a concurrency issue may occur between removing a partition in the
target manager and syncing the segment list to the delegator. When it
happens, some segments may be released in the delegator while also being
synced to it, which makes the delegator unserviceable due to the lack of
necessary segments, and search/query then fails.

This PR makes sure that all write access to target_manager is
executed serially to avoid these concurrency issues.
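
One common way to serialize such access is a single mutex around the shared
state, sketched below. The `targetManager` type here is a hypothetical
simplification and not the actual structure in querycoord; it only shows
that a snapshot taken under the same lock never observes a half-removed
partition.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// targetManager maps partition IDs to segment IDs behind one mutex.
type targetManager struct {
	mu       sync.Mutex
	segments map[int64][]int64
}

func newTargetManager() *targetManager {
	return &targetManager{segments: map[int64][]int64{}}
}

func (m *targetManager) setPartition(partitionID int64, segmentIDs []int64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.segments[partitionID] = append([]int64(nil), segmentIDs...)
}

func (m *targetManager) removePartition(partitionID int64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.segments, partitionID)
}

// snapshot returns a sorted copy of all segment IDs, taken under the lock.
func (m *targetManager) snapshot() []int64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	var out []int64
	for _, ids := range m.segments {
		out = append(out, ids...)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	m := newTargetManager()
	m.setPartition(1, []int64{10, 11})
	m.setPartition(2, []int64{20})

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); m.removePartition(1) }()
	go func() { defer wg.Done(); _ = m.snapshot() }()
	wg.Wait()

	fmt.Println(m.snapshot()) // [20]
}
```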

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-09-06 10:49:05 +08:00
XuanYang-cn
54ec290109
enhance: [cp]Remove too frequent logs in Delete (#35981)
pr: #35980

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-09-06 10:47:13 +08:00
wei liu
ceca666e2a
fix: Fix privilege group hasn't been register for validate (#35938)
issue: #35471
pr: #35937

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-09-05 18:09:11 +08:00
cqy123456
9de431991f
enhance: [2.4]reduce mmap-rss after warmup (#35965)
related pr: https://github.com/milvus-io/milvus/pull/35974

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-09-05 17:41:04 +08:00
yihao.dai
b578064869
fix: Fix DB limiter nodes are mistakenly cleaned up (#35991) (#35992)
This issue only occurs for a short time right after a table is created.
To avoid this, we simply reduce the frequency of cleaning up invalid
limiter nodes.

issue: https://github.com/milvus-io/milvus/issues/35933

pr: https://github.com/milvus-io/milvus/pull/35991

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-09-05 17:31:05 +08:00
congqixia
abf3f68ae9
enhance: [2.4] Fix typo of clustering key not loaded msg (#35948) (#36000)
Cherry-pick from master
pr: #35948
Related to #35415

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-05 16:59:07 +08:00
congqixia
ffa7755136
fix: [2.4] Check all values for trie.predictive_search (#35943) (#35999)
Cherry-pick from master
pr: #35943 
Related to #35941

With the default behavior of marisa trie `predictive_search`, values are
not iterated in lexicographic order.

This PR is a brute-force fix that makes the range operator return correct
values.
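
The brute-force approach can be sketched as follows, using a plain slice in
place of the trie iterator. `rangeUnordered` is a hypothetical helper: when
the iteration order carries no information, every candidate has to be
compared against the bounds and the scan cannot stop early.

```go
package main

import "fmt"

// rangeUnordered returns the keys in [lo, hi) by checking every key,
// because the input order says nothing about where the range ends.
func rangeUnordered(keys []string, lo, hi string) []string {
	var out []string
	for _, k := range keys {
		if k >= lo && k < hi {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	// Keys arrive in an arbitrary, non-lexicographic order.
	keys := []string{"date", "apple", "cherry", "banana"}
	fmt.Println(rangeUnordered(keys, "b", "d")) // [cherry banana]
}
```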

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-05 16:49:04 +08:00
congqixia
da0bc22a5f
enhance: [2.4] Add delete buffer related quota logic (#35918) (#35997)
Cherry pick from master
pr: #35128 #35918
See also #35303

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: aoiasd <45024769+aoiasd@users.noreply.github.com>
2024-09-05 16:43:06 +08:00
congqixia
6158ad37f9
enhance: [2.4] fix cpp-lint issue for recent change (#35989)
fix lint issue in:
internal/core/src/query/SearchOnSealed.cpp

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-05 10:57:05 +08:00
wei liu
c87711d903
fix: Fix some replicas don't participate in the query after the failure recovery (#35850) (#35925)
issue: #35846
pr: #35850
querycoord notifies the proxy to update the shard leader cache after a
delegator's location changes. But during a querynode's failure recovery,
some delegators may become unserviceable due to a lack of segments and
return to serviceable after the segments are loaded, so we also need to
notify the proxy to invalidate the shard leader cache when a delegator's
serviceable state changes.

This PR will maintain querynode's serviceable state during heartbeat,
and notify proxy to invalidate shard leader cache if serviceable state
changes.
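
A minimal sketch of the change-detection step. The `leaderView` type and
the channel name are hypothetical, and this sketch assumes the first
heartbeat for a channel does not count as a change.

```go
package main

import "fmt"

// leaderView remembers the last serviceable state reported per channel.
type leaderView struct {
	serviceable map[string]bool
}

// onHeartbeat records the reported state and returns true when it differs
// from the previous one, which is when the proxy cache should be invalidated.
func (v *leaderView) onHeartbeat(channel string, serviceable bool) bool {
	prev, seen := v.serviceable[channel]
	v.serviceable[channel] = serviceable
	return seen && prev != serviceable
}

func main() {
	v := &leaderView{serviceable: map[string]bool{}}
	fmt.Println(v.onHeartbeat("ch-0", true))  // false: first report
	fmt.Println(v.onHeartbeat("ch-0", false)) // true: became unserviceable
	fmt.Println(v.onHeartbeat("ch-0", false)) // false: unchanged
	fmt.Println(v.onHeartbeat("ch-0", true))  // true: recovered
}
```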

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-09-05 10:09:04 +08:00
cai.zhang
b7a0e08dd3
fix: [cherry-pick]Fix data race for clustering compaction writer (#35958)
issue: #35950 

master pr: #35957

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-05 04:07:10 +08:00
SimFG
084b3efaa1
fix: [2.4] fill the metric type field in the LoadMetaInfo object (#35963)
- issue: #35960
- pr: #35962

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-09-04 16:21:05 +08:00
jaime
2c1fa50412
enhance: remove cooling off in rate limiter for read requests (#35936)
issue: #35934
pr: #35935

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-09-04 14:39:10 +08:00
SimFG
4fdd4d9ec3
feat: [2.4] add static view for the expr interface (#35954)
- issue: #35886
- pr: #35887
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-09-04 13:17:04 +08:00
Zhen Ye
dffbc17a37
fix: SkipIndex cause segment fault (#35908)
issue: #35882
pr: #35907

Signed-off-by: chyezh <chyezh@outlook.com>
2024-09-03 18:17:04 +08:00
zhagnlu
55df25387e
fix: Fix the reference to a variable after it has been moved (#35875) (#35904)
pr: #35875

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-09-03 11:49:03 +08:00
congqixia
df8d1c7ca3
enhance: [2.4] Check load fields for previous loaded collection (#35905) (#35910)
Cherry-pick from master
pr: #35905
Related to #35415

This PR makes querycoord report an error when a load request tries to update
the load field list, which is currently not supported.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-03 11:25:03 +08:00
congqixia
268c60c301
fix: [2.4] Check clustering key skip load behavior (#35865) (#35899)
Cherry-pick from master
pr: #35865
feature issue: #35415
See also #35861

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-02 16:03:04 +08:00
Zhen Ye
90147b1339
fix: memory leak in unittest and open the USE_ASAN option when build unittest (#35857)
issue: #35854
pr: #35855

Signed-off-by: chyezh <chyezh@outlook.com>
2024-09-02 16:01:04 +08:00
SimFG
8b706122a8
enhance: [2.4] support to drop the role which is related the privilege list (#35863)
- issue: #35545
- pr: #35727

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-08-31 21:57:02 +08:00
wei liu
da026b1e28
enhance: Add depguard rules to ban deprecated proto lib (#35140) (#35818)
See also #34394 #34252
pr: #35140

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2024-08-30 14:13:01 +08:00
congqixia
cfc99e63b1
fix: [2.4] Make sure querycoord observers started once (#35811) (#35817)
Cherry-pick from master
pr: #35811
Related to #35809

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-29 19:15:01 +08:00
congqixia
8d3685fadf
enhance: [2.4] Print log only when rate limit updates (#35806) (#35816)
Cherry-pick from master
pr: #35806
The debug log for "RateLimiter register for rateType" is too frequent;
in e2e cases, it may be printed 18M times in one run.

This PR makes the log print only when the rate limit is updated.
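
The pattern is roughly the following. The `rateLimiters` type, its fields
and the "insert" rate type are hypothetical; the sketch only shows comparing
the new limit with the stored one before logging.

```go
package main

import "log"

// rateLimiters stores the current limit per rate type and counts how many
// log lines were emitted.
type rateLimiters struct {
	limits map[string]float64
	logged int
}

// register stores the limit and logs only when the value actually changes.
func (r *rateLimiters) register(rateType string, limit float64) {
	if old, ok := r.limits[rateType]; ok && old == limit {
		return // unchanged: skip the log line
	}
	r.limits[rateType] = limit
	r.logged++
	log.Printf("RateLimiter register for rateType %s, limit %v", rateType, limit)
}

func main() {
	r := &rateLimiters{limits: map[string]float64{}}
	for i := 0; i < 1000; i++ {
		r.register("insert", 100) // logs once
	}
	r.register("insert", 50) // logs again because the limit changed
	log.Printf("log lines emitted: %d", r.logged)
}
```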

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-29 19:09:01 +08:00
Patrick Weizhi Xu
5da64f6d5a
feat: [2.4] support range search pagination retains order (#35739)
issue: https://github.com/milvus-io/milvus/issues/35464
pr: https://github.com/milvus-io/milvus/pull/35738
2024-08-29 14:07:01 +08:00
congqixia
8928c9d570
enhance: [2.4] Change frequent balancer debug log to rated one (#35749) (#35796)
Cherry-pick from master
pr: #35749
"skip balance" log is too frequent in debug level. This PR changes it
into a rated one.
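
A rated log can be sketched as a logger that emits at most one message per
interval. The `ratedLogger` type below is hypothetical and takes timestamps
in milliseconds as arguments so that the example stays deterministic; it is
not the logging helper used in the codebase.

```go
package main

import "fmt"

// ratedLogger emits at most one message per interval.
type ratedLogger struct {
	intervalMs int64
	lastMs     int64
	emitted    bool
}

// debug prints msg and returns true if the interval has elapsed since the
// last emitted message; otherwise it drops the message and returns false.
func (l *ratedLogger) debug(nowMs int64, msg string) bool {
	if l.emitted && nowMs-l.lastMs < l.intervalMs {
		return false
	}
	l.emitted = true
	l.lastMs = nowMs
	fmt.Println(msg)
	return true
}

func main() {
	l := &ratedLogger{intervalMs: 1000}
	for now := int64(0); now < 3000; now += 100 {
		l.debug(now, "skip balance") // printed at 0, 1000 and 2000 only
	}
}
```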

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-29 12:31:00 +08:00
jaime
2fc111eb10
enhance: set database properties to restrict read access (#35754)
issue: #35744
pr: #35745

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-08-29 11:41:01 +08:00
congqixia
21454163bb
fix: [2.4] Check response size before add to counter (#35779)
Cherry-pick from master
pr: #35778 
Related to #35767

A Prometheus counter cannot add a negative value.
When the response is not written (say, on timeout or a broken network), a
panic may happen if the size is not checked.
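
The guard can be sketched as below. The `counter` type is a local stand-in
that mimics a counter panicking on negative increments, and the sketch
assumes an unwritten response is reported as a negative size; both are
illustrative and not the actual metrics code.

```go
package main

import "fmt"

// counter is a stand-in for a monotonic metric: it panics on negative input.
type counter struct {
	value float64
}

func (c *counter) add(v float64) {
	if v < 0 {
		panic("counter cannot decrease in value")
	}
	c.value += v
}

// recordResponseSize skips negative sizes instead of passing them to the
// counter, and reports whether the size was recorded.
func recordResponseSize(c *counter, size int) bool {
	if size < 0 {
		return false
	}
	c.add(float64(size))
	return true
}

func main() {
	c := &counter{}
	fmt.Println(recordResponseSize(c, -1))  // false: nothing was written
	fmt.Println(recordResponseSize(c, 512)) // true
	fmt.Println(c.value)                    // 512
}
```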

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-28 20:01:01 +08:00
Zhen Ye
e25d1ef63c
fix: wrong construction in evalctx (#35773)
issue: #35771
pr: #35772

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-28 18:37:01 +08:00
congqixia
12575885d3
enhance: [2.4] Add skip load validation for create collection task (#35761)
Cherry-pick from master
pr: #35737 
Related to #35415

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-28 18:25:00 +08:00
foxspy
93b118946c
feat: [2.4] Encode traceID and spanID as hex string & upgrade knowhere version (#35568)
Cherry-pick and upgrade knowhere version
issue: https://github.com/zilliztech/knowhere/issues/714
pr:  #34807 

knowhere release notes:
https://github.com/zilliztech/knowhere/releases/tag/v2.3.8

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>
2024-08-28 17:03:00 +08:00
jaime
c75f556769
fix: inconsistent meta view causes rate limit invalid (#35664)
issue: #35663
pr: #35665

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-08-28 11:21:04 +08:00
yihao.dai
8e6ec58652
fix: Fix rate wasn't limited to the expected value (#35699) (#35700)
Each time the rate is reset, the token bucket is fully refilled, so
the rate was not limited to the expected value. This PR addresses the
issue by preventing the token reset.
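
The difference can be sketched with a simplified token bucket. The
`tokenBucket` type and both reset methods are hypothetical, and time-based
refill is left out to keep the example short.

```go
package main

import "fmt"

// tokenBucket is a simplified limiter without time-based refill.
type tokenBucket struct {
	rate   float64 // tokens added per second (unused in this sketch)
	burst  float64 // bucket capacity
	tokens float64 // tokens currently available
}

// allow consumes n tokens if they are available.
func (b *tokenBucket) allow(n float64) bool {
	if b.tokens < n {
		return false
	}
	b.tokens -= n
	return true
}

// resetRefilling shows the problematic behavior: every reset refills the
// bucket, so frequent resets let callers exceed the configured rate.
func (b *tokenBucket) resetRefilling(rate float64) {
	b.rate = rate
	b.tokens = b.burst
}

// resetKeepingTokens updates the rate but leaves the token count alone.
func (b *tokenBucket) resetKeepingTokens(rate float64) {
	b.rate = rate
}

func main() {
	b := &tokenBucket{rate: 10, burst: 10, tokens: 10}
	fmt.Println(b.allow(10)) // true: the bucket is now empty
	b.resetKeepingTokens(10)
	fmt.Println(b.allow(1)) // false: the reset did not add tokens
	b.resetRefilling(10)
	fmt.Println(b.allow(10)) // true: the reset refilled the bucket
}
```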

issue: https://github.com/milvus-io/milvus/issues/35675,
https://github.com/milvus-io/milvus/issues/35702

pr: https://github.com/milvus-io/milvus/pull/35699

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-28 10:29:00 +08:00
jaime
eecc598786
fix: mistaken deletions may occur during GC channel checkpoints (#35708)
issue: #35706
pr: #35707

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-08-28 10:11:05 +08:00
Zhen Ye
a4533f1b8a
enhance: optimize milvus core building (#35660)
issue: #35549,#35611,#35633
pr: #35610

- remove milvus_segcore milvus_indexbuilder..., add libmilvus_core
- core building only link once
- move opendal compilation into cmake
- fix odr

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-27 18:55:00 +08:00