517 Commits

Author SHA1 Message Date
Zhen Ye
a4533f1b8a
enhance: optimize milvus core building (#35660)
issue: #35549,#35611,#35633
pr: #35610

- remove milvus_segcore milvus_indexbuilder..., add libmilvus_core
- link the core build only once
- move opendal compilation into cmake
- fix ODR violations

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-27 18:55:00 +08:00
congqixia
ab261d0f8b
feat: [2.4] Support field partial load collection (#35416) (#35696)
Cherry-pick from master
pr: #35416
Related to #35415

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-27 14:07:00 +08:00
wei liu
35d2f9b210
fix: Fix index memory estimation (#35225) (#35670)
issue: https://github.com/milvus-io/milvus/issues/35229
pr: #32525

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Co-authored-by: Bingyi Sun <sunbingyi1992@gmail.com>
2024-08-24 10:28:57 +08:00
Gao
1687d64c46
enhance: [2.4] add hit segment num metrics for queryHook (#35619)
issue: #35576 
pr: #35577

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-08-23 12:49:02 +08:00
SimFG
5b5119a51f
feat: [2.4] provide more general configuration to control mmap behavior (#35609)
- issue: #35273
- pr: #35359

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-08-23 12:35:02 +08:00
wei liu
e2542a1bf5
enhance: Update protobuf-go to protobuf-go v2 (#34394) (#35555)
issue: #34252
pr: #34394 #35072 #35084

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-21 18:50:58 +08:00
congqixia
bd222e58eb
enhance: [2.4] Exclude L0 segment from readable snapshot (#35510)
Cherry-pick from master
pr: #35507

L0 segments no longer contain insert data, so counting them as sealed
segments may confuse the query hook optimizer.

This PR adds a segment-level flag to the segment entry and excludes L0
segments when getting the readable segment snapshot.
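
A minimal sketch of the filtering idea described above. The types `SegmentLevel` and `SegmentEntry` and the function `readableSealed` are hypothetical stand-ins chosen for illustration; they are not the actual Milvus definitions.

```go
package main

import "fmt"

// SegmentLevel and SegmentEntry are hypothetical illustration types,
// not the real Milvus structures.
type SegmentLevel int

const (
	LevelL0 SegmentLevel = iota
	LevelL1
)

type SegmentEntry struct {
	ID    int64
	Level SegmentLevel
}

// readableSealed returns the entries that should count as sealed
// segments, skipping any entry flagged as L0.
func readableSealed(entries []SegmentEntry) []SegmentEntry {
	out := make([]SegmentEntry, 0, len(entries))
	for _, e := range entries {
		if e.Level == LevelL0 {
			continue // L0 holds no insert data, so it is excluded
		}
		out = append(out, e)
	}
	return out
}

func main() {
	entries := []SegmentEntry{
		{ID: 1, Level: LevelL1},
		{ID: 2, Level: LevelL0},
		{ID: 3, Level: LevelL1},
	}
	fmt.Println(len(readableSealed(entries)))
}
```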

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-16 15:26:54 +08:00
congqixia
e64d27aa51
fix: [2.4] Set corresponding DataScope for loadStreamDelete (#35313)
Cherry-pick from master
pr: #35312
Related to #35311

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-06 22:32:23 +08:00
Chun Han
58f7c35b75
enhance: add log for partition stats(#30376) (#35220)
related: #30376
pr: https://github.com/milvus-io/milvus/pull/35219

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-08-02 19:34:21 +08:00
wei liu
11578772ef
fix: Set legacy level to l0 segment after qc restart (#35197) (#35211)
issue: #35087
pr: #35197
After QC restarts and before the target is ready, if dist_handler tries
to update the segment distribution, it sets the legacy level on the L0
segment, which may cause the L0 segment to be moved to another node and
make search/query fail.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-08-02 18:22:15 +08:00
congqixia
f8444b900f
enhance: [2.4] Support proxy/delegator qn client pooling (#35195)
Cherry pick from master
pr: #35194
See also #35196
Add param item for proxy/delegator query node client pooling and
implement pooling logic

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-02 11:24:19 +08:00
congqixia
824b26c209
enhance: [2.4] Pre-allocate space for reduce data structure (#35118) (#35137)
Cherry-pick from master
pr: #35118 
Growing slices and map.growWork may cost a lot when the segment number
is large for big-K queries. This PR pre-allocates space in the reduce
methods to avoid this cost.
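
As a rough illustration of the pre-allocation idea, the sketch below sizes a slice and a map up front before merging per-segment results. The function `mergeIDs` and its inputs are hypothetical; the real reduce code in Milvus is more involved.

```go
package main

import "fmt"

// mergeIDs is a hypothetical reduce step that merges per-segment ID
// lists and drops duplicates. It is not the Milvus implementation.
func mergeIDs(perSegment [][]int64) []int64 {
	// Count the upper bound on the result size first.
	total := 0
	for _, ids := range perSegment {
		total += len(ids)
	}

	// Sizing both containers once avoids repeated slice reallocation
	// and incremental map growth while appending.
	seen := make(map[int64]struct{}, total)
	merged := make([]int64, 0, total)

	for _, ids := range perSegment {
		for _, id := range ids {
			if _, dup := seen[id]; dup {
				continue
			}
			seen[id] = struct{}{}
			merged = append(merged, id)
		}
	}
	return merged
}

func main() {
	fmt.Println(mergeIDs([][]int64{{1, 2}, {2, 3}}))
}
```

The trade-off is that the capacity is an upper bound, so some memory may go unused when many IDs are duplicates.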

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-01 09:59:50 +08:00
Gao
be0123863f
enhance: add channel num for queryHook optimization (#35105)
pr: #35104

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-07-31 18:23:51 +08:00
zhagnlu
866055527b
enhance: revert remove duplicated pk function (#35102)
pr: #35103

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-30 20:03:50 +08:00
Chun Han
6283fd0b46
fix:nil part stats without l2 compaction (#34977)
related: #34923
pr: https://github.com/milvus-io/milvus/pull/34992

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-25 14:21:46 +08:00
cai.zhang
74adedf750
enhance: Optimized the GC logic to ensure that memory is released in time (#34950)
issue: #34703 

master pr: #34949

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-24 14:07:43 +08:00
Chun Han
ae1636c2be
fix: refine handling type for segment pruner(#34923) (#34926)
related: #34923
pr: https://github.com/milvus-io/milvus/pull/34925

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-24 12:05:44 +08:00
congqixia
c06a0ebef2
enhance: [2.4] Remove useless ops when there is no write (#34767) (#34839)
Cherry pick from master
pr: #34767
Related to: #33235

The querynode pipeline builds a map and calls ProcessInsert even when
there are no write messages, so querynodes show high CPU usage even when
there is no workload.

This PR checks the message length before composing the data struct and
calling the method.
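
The guard can be sketched as an early return. Everything here (`insertMsg`, `processInsert`, `handle`, the `calls` counter) is a hypothetical stand-in for the pipeline code, used only to show the shape of the check.

```go
package main

import "fmt"

// insertMsg is a hypothetical message type for illustration.
type insertMsg struct{ SegmentID int64 }

// calls counts how often the downstream method runs, for demonstration.
var calls int

// processInsert stands in for the downstream ProcessInsert call.
func processInsert(bySegment map[int64][]insertMsg) { calls++ }

// handle skips all allocation and the downstream call when there is
// nothing to write.
func handle(msgs []insertMsg) {
	if len(msgs) == 0 {
		return
	}
	bySegment := make(map[int64][]insertMsg, len(msgs))
	for _, m := range msgs {
		bySegment[m.SegmentID] = append(bySegment[m.SegmentID], m)
	}
	processInsert(bySegment)
}

func main() {
	handle(nil)
	handle([]insertMsg{{SegmentID: 7}})
	fmt.Println("processInsert calls:", calls)
}
```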

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-22 10:23:42 +08:00
congqixia
6a3a14affb
enhance: [2.4] Add lint rule to forbid gogo protobuf (#34594) (#34630)
Cherry pick from master
pr: #34594
github.com/gogo/protobuf is deprecated and could be error-prone after
upgrading protobuf messages to v2.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-12 18:13:36 +08:00
zhagnlu
4e02e57044
enhance: mark duplicated pk as deleted (#34619)
pr: #34586

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-12 10:27:37 +08:00
jaime
bfd386aad7
fix: unstable UT for level0 deletion (#34525)
issue: #34533
pr: #34524

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-11 10:02:56 +08:00
wei liu
7034260721
fix: Query may return deleted records (#34502)
issue: #34500
pr: #34501

The sort in `GetLevel0Deletions` breaks the correspondence between pks
and tss, and the pks and tss are sorted again in the segment.Delete()
interface anyway.

This PR removes this unnecessary and incorrect sort step to avoid
queries returning deleted records.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-09 10:00:14 +08:00
Chun Han
2f38483418
fix: lose partitionIDs when scalar pruning and refine segment prune ratio metrics(#30376) (#34475)
related: #30376
pr: https://github.com/milvus-io/milvus/pull/34477

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-07-08 19:44:13 +08:00
wei liu
d3e94f9861
enhance: Use Blocked Bloom Filter instead of basic bloom filter impl (#34377)
issue: #32995
pr: #33405
To speed up the construction and querying of Bloom filters, we chose a
blocked Bloom filter instead of a basic Bloom filter implementation.

WARN: This PR is compatible with the old Bloom filter implementation,
but falling back to an older Milvus version may cause Bloom filter
deserialization to fail.

In single Bloom filter test cases with a capacity of 1,000,000 and a
false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times
faster than the basic Bloom filter in both querying and construction, at
the cost of a 30% increase in memory usage.

Block BF construct time {"time": "54.128131ms"}
Block BF size {"size": 3021578}
Block BF Test cost {"time": "55.407352ms"}
Basic BF construct time {"time": "210.262183ms"}
Basic BF size {"size": 2396308}
Basic BF Test cost {"time": "192.596229ms"}

In multi Bloom filter test cases with a capacity of 100,000, an FPR of
0.001, and 100 Bloom filters, we reuse the primary key locations for all
Bloom filters to avoid repeated hash computations. As a result, the
blocked Bloom filter is also 5 times faster than the basic Bloom filter
in querying.

Block BF TestLocation cost {"time": "529.97183ms"}
Basic BF TestLocation cost {"time": "3.197430181s"}
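
The ratios behind these log lines can be recomputed directly. The sketch below only does arithmetic on the figures quoted in this message; the helper `speedup` is a hypothetical name. On these particular numbers the single-filter speedups come out at a little under 4x, the multi-filter lookup at about 6x, and the size increase at about 26%.

```go
package main

import (
	"fmt"
	"time"
)

// speedup returns how many times longer the basic run took than the
// blocked run, given two duration strings as printed in the log.
func speedup(basic, blocked string) float64 {
	b, err := time.ParseDuration(basic)
	if err != nil {
		panic(err)
	}
	k, err := time.ParseDuration(blocked)
	if err != nil {
		panic(err)
	}
	return float64(b) / float64(k)
}

func main() {
	fmt.Printf("construct:    %.1fx\n", speedup("210.262183ms", "54.128131ms"))
	fmt.Printf("test:         %.1fx\n", speedup("192.596229ms", "55.407352ms"))
	fmt.Printf("testLocation: %.1fx\n", speedup("3.197430181s", "529.97183ms"))

	// Serialized sizes as logged for the blocked and basic filters.
	blockedSize, basicSize := 3021578.0, 2396308.0
	fmt.Printf("size:         +%.0f%%\n", (blockedSize/basicSize-1)*100)
}
```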

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-05 17:04:10 +08:00
Chun Han
5831908aa2
enhance: reconstruct scalar part's code for segment-pruner(#30376) (#34365)
related: #30376
pr: https://github.com/milvus-io/milvus/pull/34346
1. support more complex expr
2. add more unit tests for unrelated fields

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-04 16:30:10 +08:00
chyezh
a1a0a56f86
enhance: async search and retrieve in cgo (#34200)
issue: #33132
pr: #33133
other pr: #33228, #34084, #33946

- implement future-based cgo utility
- async search and retrieve in cgo
- modify gc configuration document

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-04 13:02:09 +08:00
Chun Han
e12b701c03
enhance: add metrics for segment prune latency(#30376) (#34364)
related: #30376
pr: https://github.com/milvus-io/milvus/pull/34094

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-04 10:14:09 +08:00
aoiasd
7000cec365
enhance: [Cherry-pick] Merge query stream result for reduce delete task (#32855) (#34281)
relate: https://github.com/milvus-io/milvus/issues/32854
pr:  https://github.com/milvus-io/milvus/pull/32855

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-07-03 18:08:09 +08:00
wayblink
c62bf8a0b0
fix: [Cherry-pick] Pick major compaction fixes and optimizations (#34360)
This PR cherry-picks the following commits:

- fix: sync partition stats blocking balance task #33742
- fix: Fix meta prefix overlap bug #33830
- fix: Small fixes of major compaction #33929
- fix: Fix memory buffer error & some renaming #33850
- fix: sync part stats task cannot be finished #34027
- Add an option to enable/disable vector field clustering key #34097
- fix: fix error ignore in compactor #34169
- fix: load major compaction partial result #34052
- Use new stream segment reader in clustering compaction #34232

issue: #30633
pr: #33742 #33830 #33929 #33850 #34027 #34097 #34169 #34052 #34232

---------

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: Chun Han <116052805+MrPresent-Han@users.noreply.github.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-03 09:53:37 +08:00
wayblink
99586066f5
feat: [cherry-pick] Major compaction (#34326)
This PR cherry-picks the following commits:
- fix: speed up segment lookup via channel name in datacoord (#33530),
  needed by the next commit
- feat: Major compaction (#33620)

issue: #30633
pr: #33620

---------

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: yiwangdr <80064917+yiwangdr@users.noreply.github.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
2024-07-02 18:29:01 +08:00
wei liu
c344083f22
enhance: Optimize grow slice cost during query (#34254)
issue: #32252
pr: #34253

This PR pre-allocates FieldData for Reduce operations in the Query
chain using typeutil.PrepareResultFieldData to avoid the overhead of
dynamically growing the slice during the appendFieldData process.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-01 15:18:11 +08:00
jaime
b37d6fa0f9
enhance: decrease cpu overhead during filter segments on datacoord (#34231)
issue: https://github.com/milvus-io/milvus/issues/33129
pr: #33130 
pr: #33373

---------

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-01 10:20:08 +08:00
Bingyi Sun
460815ceab
fix: fix partition loaded num metric (#33316) (#34195)
issue: https://github.com/milvus-io/milvus/issues/32108
related pr: #33316

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-06-28 14:58:05 +08:00
wei liu
18a0efe737
enhance: Avoid search querynode return nil status in response (#34100) (#34189)
pr: #34100

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-26 18:52:04 +08:00
jaime
6423b6c718
enhance: move rocksmq from internal to pkg (#34165)
pr:  https://github.com/milvus-io/milvus/pull/33881
issue:  https://github.com/milvus-io/milvus/issues/33956

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-26 13:36:05 +08:00
cqy123456
f5344abdaf
enhance: [cherry-pick]growing segment support mmap (#34110)
issue: https://github.com/milvus-io/milvus/issues/32984
related pr: https://github.com/milvus-io/milvus/pull/32633,
https://github.com/milvus-io/milvus/pull/33951,
https://github.com/milvus-io/milvus/pull/33993

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-25 14:52:07 +08:00
Jiquan Long
22e6807e9a
feat: support inverted index for array (#33452) (#34053)
pr: https://github.com/milvus-io/milvus/pull/33184
pr: https://github.com/milvus-io/milvus/pull/33452
pr: https://github.com/milvus-io/milvus/pull/33633
issue: https://github.com/milvus-io/milvus/issues/27704
Co-authored-by: xiaocai2333 <cai.zhang@zilliz.com>

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
Co-authored-by: cai.zhang <cai.zhang@zilliz.com>
2024-06-24 10:50:03 +08:00
congqixia
e02a95e3c2
fix: [2.4] Return record with largest timestamp for entries with same PK (#33936) (#34024)
Cherry-pick from master
pr: #33936
See also #33883

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-21 14:14:01 +08:00
congqixia
891a94ad9e
fix: [2.4] Check nodeID wildcard when removing pkOracle (#33895) (#34020)
Cherry-pick from master
pr: #33895
See also #33894

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-21 12:04:00 +08:00
wei liu
fbc8fb3cb2
enhance: Skip return data distribution if no change happen (#32814) (#33985)
issue: #32813
pr: #32814

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-21 10:24:12 +08:00
Gao
08c096cf55
enhance: Use primitive type for vectorType (#33911)
issue: #22837 
pr: #33868 

Use primitive type instead of proto enum type for queryHook to recognize

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-06-17 19:05:58 +08:00
congqixia
e8071830fa
fix: [2.4] Prevent use captured iteration variable partitionID (#33907)
Cherry-pick from master
pr: #33906 
See also #33902

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-17 19:02:00 +08:00
Gao
5fc1370f6f
enhance: [2.4] autoindex for multi data type (#33867)
issue: #22837 
pr: https://github.com/milvus-io/milvus/pull/33868

- opensource autoindex support
- metric type check for different data types
- autoindex data type for search param

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-06-14 23:26:00 +08:00
chyezh
dd6c982bdb
fix: load operation when segment is on releasing (#33699)
issue: #30857
pr: #31340

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-06-14 11:47:57 +08:00
wei liu
25d8b74f71
enhance: Execute bloom filter apply in parallel to speed up segment predict (#33793)
issue: #33610
pr: #33792

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-13 14:14:04 +08:00
wei liu
54feef30e7
enhance: Use BatchPkExist to reduce bloom filter func call cost (#33752)
issue: #33610
pr: #33611

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-12 17:45:58 +08:00
SimFG
c331aa4ad3
enhance: [2.4] add the includeCurrentMsg param for the Seek method (#33743)
/kind improvement

- issue: #33325
- pr: #33326

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-06-11 15:01:55 +08:00
yihao.dai
ed1dee9e38
enhance: Support L0 import (#33514) (#33712)
issue: https://github.com/milvus-io/milvus/issues/33157

pr: https://github.com/milvus-io/milvus/pull/33514

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-08 11:17:52 +08:00
chyezh
93348af5c0
fix: async warmup will be blocked by state lock (#33687)
issue: #33685
pr: #33686

Signed-off-by: chyezh <chyezh@outlook.com>
2024-06-07 14:23:54 +08:00
Xiaofan
d331b403c3
enhance: Remove l0 delete cache (#33537)
Cherry pick from master
pr: #32989
Remove the L0 cache and build the delete pk and ts every time. This
reduces memory usage and also improves code readability.

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-06-06 17:13:50 +08:00