503 Commits

Author SHA1 Message Date
Chun Han
6283fd0b46
fix:nil part stats without l2 compaction (#34977)
related: #34923
pr: https://github.com/milvus-io/milvus/pull/34992

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-25 14:21:46 +08:00
cai.zhang
74adedf750
enhance: Optimized the GC logic to ensure that memory is released in time (#34950)
issue: #34703 

master pr: #34949

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-24 14:07:43 +08:00
Chun Han
ae1636c2be
fix: refine handling type for segment pruner(#34923) (#34926)
related: #34923
pr: https://github.com/milvus-io/milvus/pull/34925

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-24 12:05:44 +08:00
congqixia
c06a0ebef2
enhance: [2.4] Remove useless ops when there is no write (#34767) (#34839)
Cherry pick from master
pr: #34767
Related to: #33235

The querynode pipeline builds a map and calls ProcessInsert even when
there are no write messages, so querynodes show high CPU usage even
when there is no workload.

This PR checks the message length before composing the data struct and
calling the method.
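
A minimal sketch of the guard described above, not the actual Milvus
pipeline code. The `insertMsg` type and `groupBySegment` function are
hypothetical names used only for illustration; the point is that the
length check runs before any map is allocated or any downstream call is
made.

```go
package main

import "fmt"

// insertMsg is a hypothetical stand-in for a write message in the pipeline.
type insertMsg struct {
	SegmentID int64
	RowCount  int
}

// groupBySegment builds the per-segment map only when there is something
// to write. It returns nil for an empty batch so the caller can skip the
// downstream insert call entirely.
func groupBySegment(msgs []insertMsg) map[int64]int {
	if len(msgs) == 0 {
		return nil // no allocation and no downstream work for an idle tick
	}
	rows := make(map[int64]int, len(msgs))
	for _, m := range msgs {
		rows[m.SegmentID] += m.RowCount
	}
	return rows
}

func main() {
	fmt.Println(groupBySegment(nil) == nil)
	fmt.Println(groupBySegment([]insertMsg{{1, 3}, {1, 2}, {2, 5}}))
}
```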

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-22 10:23:42 +08:00
congqixia
6a3a14affb
enhance: [2.4] Add lint rule to forbid gogo protobuf (#34594) (#34630)
Cherry pick from master
pr: #34594
github.com/gogo/protobuf is deprecated and could be error-prone after
upgrading protobuf messages to v2.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-12 18:13:36 +08:00
zhagnlu
4e02e57044
enhance: mark duplicated pk as deleted (#34619)
pr: #34586

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-12 10:27:37 +08:00
jaime
bfd386aad7
fix: unstable UT for level0 deletion (#34525)
issue: #34533
pr: #34524

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-11 10:02:56 +08:00
wei liu
7034260721
fix: Query may return deleted records (#34502)
issue: #34500
pr: #34501

The sort in `GetLevel0Deletions` breaks the correspondence between pks
and tss, and the pks and tss are then sorted in the segment.Delete()
interface.

This PR removes this unnecessary and incorrect sort step so that queries
no longer return deleted records.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-09 10:00:14 +08:00
Chun Han
2f38483418
fix: lose partitionIDs when scalar pruning and refine segment prune ratio metrics(#30376) (#34475)
related: #30376
pr: https://github.com/milvus-io/milvus/pull/34477

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-07-08 19:44:13 +08:00
wei liu
d3e94f9861
enhance: Use Blocked Bloom Filter instead of basic bloom filter impl (#34377)
issue: #32995
pr: #33405
To speed up the construction and querying of Bloom filters, we chose a
blocked Bloom filter instead of a basic Bloom filter implementation.

WARN: This PR is compatible with the old bf implementation, but after
falling back to an older Milvus version, Bloom filter deserialization
may fail.

In single Bloom filter test cases with a capacity of 1,000,000 and a
false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times
faster than the basic Bloom filter in both querying and construction, at
the cost of a 30% increase in memory usage.

Block BF construct time {"time": "54.128131ms"}
Block BF size {"size": 3021578}
Block BF Test cost {"time": "55.407352ms"}
Basic BF construct time {"time": "210.262183ms"}
Basic BF size {"size": 2396308}
Basic BF Test cost {"time": "192.596229ms"}
In multi Bloom filter test cases with a capacity of 100,000, an FPR of
0.001, and 100 Bloom filters, we reuse the primary key locations for all
Bloom filters to avoid repeated hash computations. As a result, the
blocked Bloom filter is also 5 times faster than the basic Bloom filter
in querying.

Block BF TestLocation cost {"time": "529.97183ms"}
Basic BF TestLocation cost {"time": "3.197430181s"}
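
For readers unfamiliar with the technique, the following is a minimal
sketch of a blocked Bloom filter, not the implementation used in this
PR. The block size (one 64-byte block), the probe count, the FNV-1a
hash, and all names are assumptions chosen for illustration. Each key
is hashed once, mapped to a single block, and all of its bits are set
inside that block, so a lookup touches one cache line. The `locate`
helper also hints at how hash locations can be computed once and
reused.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const (
	wordsPerBlock = 8                  // 8 x 64 bits = one 64-byte block
	bitsPerBlock  = wordsPerBlock * 64 // 512 bits per block
	numProbes     = 7                  // bits set per key (assumed value)
)

type blockedBloom struct {
	blocks [][wordsPerBlock]uint64
}

func newBlockedBloom(numBlocks int) *blockedBloom {
	return &blockedBloom{blocks: make([][wordsPerBlock]uint64, numBlocks)}
}

// locate hashes the key once and derives the block index plus the two
// values used for double hashing inside that block.
func (b *blockedBloom) locate(key []byte) (block int, h1, h2 uint32) {
	h := fnv.New64a()
	h.Write(key)
	sum := h.Sum64()
	block = int((sum >> 32) % uint64(len(b.blocks)))
	h1 = uint32(sum) & 0xffff
	h2 = (uint32(sum) >> 16) | 1 // odd step, so probes spread over the block
	return block, h1, h2
}

// Add sets numProbes bits, all within the key's single block.
func (b *blockedBloom) Add(key []byte) {
	block, h1, h2 := b.locate(key)
	for i := uint32(0); i < numProbes; i++ {
		bit := (h1 + i*h2) % bitsPerBlock
		b.blocks[block][bit/64] |= uint64(1) << (bit % 64)
	}
}

// Test reports whether the key may be present (false positives are
// possible, false negatives are not).
func (b *blockedBloom) Test(key []byte) bool {
	block, h1, h2 := b.locate(key)
	for i := uint32(0); i < numProbes; i++ {
		bit := (h1 + i*h2) % bitsPerBlock
		if b.blocks[block][bit/64]&(uint64(1)<<(bit%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	bf := newBlockedBloom(1024)
	bf.Add([]byte("pk-1"))
	fmt.Println(bf.Test([]byte("pk-1")))
}
```

Confining a key to one block is what trades a slightly higher false
positive rate (or more memory for the same rate) for fewer cache
misses.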

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-05 17:04:10 +08:00
Chun Han
5831908aa2
enhance: reconstruct scalar part's code for segment-pruner(#30376) (#34365)
related: #30376
pr: https://github.com/milvus-io/milvus/pull/34346
1. support more complex expr
2. add more unit tests for unrelated fields

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-04 16:30:10 +08:00
chyezh
a1a0a56f86
enhance: async search and retrieve in cgo (#34200)
issue: #33132
pr: #33133
other pr: #33228, #34084, #33946

- implement future-based cgo utility
- async search and retrieve in cgo
- modify gc configuration document

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-04 13:02:09 +08:00
Chun Han
e12b701c03
enhance: add metrics for segment prune latency(#30376) (#34364)
related: #30376
pr: https://github.com/milvus-io/milvus/pull/34094

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-04 10:14:09 +08:00
aoiasd
7000cec365
enhance: [Cherry-pick] Merge query stream result for reduce delete task (#32855) (#34281)
relate: https://github.com/milvus-io/milvus/issues/32854
pr:  https://github.com/milvus-io/milvus/pull/32855

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-07-03 18:08:09 +08:00
wayblink
c62bf8a0b0
fix: [Cherry-pick] Pick major compaction fixes and optimizations (#34360)
This PR cherry-picks the following commits:

- fix: sync partition stats blocking balance task #33742
- fix: Fix meta prefix overlap bug #33830
- fix: Small fixes of major compaction #33929
- fix: Fix memory buffer error & some renaming #33850
- fix: sync part stats task cannot be finished #34027 
- Add an option to enable/disable vector field clustering key #34097
- fix: fix error ignore in compactor #34169
- fix: load major compaction partial result #34052
- Use new stream segment reader in clustering compaction #34232

issue: #30633
pr: #33742 #33830 #33929 #33850 #34027 #34097 #34169 #34052 #34232

---------

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: Chun Han <116052805+MrPresent-Han@users.noreply.github.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-03 09:53:37 +08:00
wayblink
99586066f5
feat: [cherry-pick] Major compaction (#34326)
This PR cherry-picks the following commits:
fix: speed up segment lookup via channel name in datacoord (#33530)
needed by the next commit
  feat: Major compaction (#33620)

issue: #30633
pr: #33620

---------

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: yiwangdr <80064917+yiwangdr@users.noreply.github.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
2024-07-02 18:29:01 +08:00
wei liu
c344083f22
enhance: Optimize grow slice cost during query (#34254)
issue: #32252
pr: #34253

This PR pre-allocates FieldData for Reduce operations in the Query
chain using typeutil.PrepareResultFieldData, avoiding the overhead of
dynamically growing the slice during the appendFieldData process.
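
The general idea of pre-allocation can be sketched as follows. This is
not the signature or behavior of typeutil.PrepareResultFieldData;
`mergeIDs` is a hypothetical helper that only shows how sizing the
destination slice up front removes repeated reallocation during
`append`.

```go
package main

import "fmt"

// mergeIDs concatenates partial results into one slice. The capacity is
// computed first, so append never has to grow and copy the backing array.
func mergeIDs(parts [][]int64) []int64 {
	total := 0
	for _, p := range parts {
		total += len(p)
	}
	out := make([]int64, 0, total) // single allocation of the final size
	for _, p := range parts {
		out = append(out, p...)
	}
	return out
}

func main() {
	merged := mergeIDs([][]int64{{1, 2}, {3}, {4, 5, 6}})
	fmt.Println(merged, len(merged), cap(merged))
}
```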

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-01 15:18:11 +08:00
jaime
b37d6fa0f9
enhance: decrease cpu overhead during filter segments on datacoord (#34231)
issue: https://github.com/milvus-io/milvus/issues/33129
pr: #33130 
pr: #33373

---------

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-01 10:20:08 +08:00
Bingyi Sun
460815ceab
fix: fix partition loaded num metric (#33316) (#34195)
issue: https://github.com/milvus-io/milvus/issues/32108
related pr: #33316

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-06-28 14:58:05 +08:00
wei liu
18a0efe737
enhance: Avoid search querynode return nil status in response (#34100) (#34189)
pr: #34100

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-26 18:52:04 +08:00
jaime
6423b6c718
enhance: move rocksmq from internal to pkg (#34165)
pr:  https://github.com/milvus-io/milvus/pull/33881
issue:  https://github.com/milvus-io/milvus/issues/33956

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-26 13:36:05 +08:00
cqy123456
f5344abdaf
enhance: [cherry-pick]growing segment support mmap (#34110)
issue: https://github.com/milvus-io/milvus/issues/32984
related pr: https://github.com/milvus-io/milvus/pull/32633,
https://github.com/milvus-io/milvus/pull/33951,
https://github.com/milvus-io/milvus/pull/33993

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-25 14:52:07 +08:00
Jiquan Long
22e6807e9a
feat: support inverted index for array (#33452) (#34053)
pr: https://github.com/milvus-io/milvus/pull/33184
pr: https://github.com/milvus-io/milvus/pull/33452
pr: https://github.com/milvus-io/milvus/pull/33633
issue: https://github.com/milvus-io/milvus/issues/27704
Co-authored-by: xiaocai2333 <cai.zhang@zilliz.com>

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
Co-authored-by: cai.zhang <cai.zhang@zilliz.com>
2024-06-24 10:50:03 +08:00
congqixia
e02a95e3c2
fix: [2.4] Return record with largest timestamp for entries with same PK (#33936) (#34024)
Cherry-pick from master
pr: #33936
See also #33883
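
The rule in the title can be sketched as a simple reduction. This is an
illustration only, with a hypothetical `record` type and `latestByPK`
function; it does not reflect how the querynode reduce path is
structured.

```go
package main

import "fmt"

// record is a hypothetical row carrying its primary key and timestamp.
type record struct {
	PK        int64
	Timestamp uint64
	Value     string
}

// latestByPK keeps, for each primary key, the record with the largest
// timestamp.
func latestByPK(records []record) map[int64]record {
	latest := make(map[int64]record, len(records))
	for _, r := range records {
		if cur, ok := latest[r.PK]; !ok || r.Timestamp > cur.Timestamp {
			latest[r.PK] = r
		}
	}
	return latest
}

func main() {
	rows := []record{
		{PK: 1, Timestamp: 10, Value: "old"},
		{PK: 1, Timestamp: 20, Value: "new"},
		{PK: 2, Timestamp: 5, Value: "only"},
	}
	fmt.Println(latestByPK(rows))
}
```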

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-21 14:14:01 +08:00
congqixia
891a94ad9e
fix: [2.4] Check nodeID wildcard when removing pkOracle (#33895) (#34020)
Cherry-pick from master
pr: #33895
See also #33894

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-21 12:04:00 +08:00
wei liu
fbc8fb3cb2
enhance: Skip return data distribution if no change happen (#32814) (#33985)
issue: #32813
pr: #32814

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-21 10:24:12 +08:00
Gao
08c096cf55
enhance: Use primitive type for vectorType (#33911)
issue: #22837 
pr: #33868 

Use a primitive type instead of the proto enum type so that queryHook can recognize it

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-06-17 19:05:58 +08:00
congqixia
e8071830fa
fix: [2.4] Prevent use captured iteration variable partitionID (#33907)
Cherry-pick from master
pr: #33906 
See also #33902
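
As background for this class of bug: before Go 1.22, a `for` loop
reused one variable across iterations, so a closure or goroutine that
captured it could observe a later value. A common fix is to copy the
variable per iteration, as in the sketch below. The `collect` function
is hypothetical and is not the code changed by this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// collect writes each partition ID from its own goroutine. The explicit
// per-iteration copies make the closure safe on Go versions older than
// 1.22; on newer versions they are redundant but harmless.
func collect(partitionIDs []int64) []int64 {
	var wg sync.WaitGroup
	results := make([]int64, len(partitionIDs))
	for i, partitionID := range partitionIDs {
		i, partitionID := i, partitionID // copy, do not capture the loop variable
		wg.Add(1)
		go func() {
			defer wg.Done()
			results[i] = partitionID
		}()
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(collect([]int64{100, 200, 300}))
}
```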

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-17 19:02:00 +08:00
Gao
5fc1370f6f
enhance: [2.4] autoindex for multi data type (#33867)
issue: #22837 
pr: https://github.com/milvus-io/milvus/pull/33868

- opensource autoindex support
- metric type check for different data types
- autoindex data type for search param

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-06-14 23:26:00 +08:00
chyezh
dd6c982bdb
fix: load operation when segment is on releasing (#33699)
issue: #30857
pr: #31340

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-06-14 11:47:57 +08:00
wei liu
25d8b74f71
enhance: Execute bloom filter apply in parallel to speed up segment predict (#33793)
issue: #33610
pr: #33792

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-13 14:14:04 +08:00
wei liu
54feef30e7
enhance: Use BatchPkExist to reduce bloom filter func call cost (#33752)
issue: #33610
pr: #33611

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-12 17:45:58 +08:00
SimFG
c331aa4ad3
enhance: [2.4] add the includeCurrentMsg param for the Seek method (#33743)
/kind improvement

- issue: #33325
- pr: #33326

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-06-11 15:01:55 +08:00
yihao.dai
ed1dee9e38
enhance: Support L0 import (#33514) (#33712)
issue: https://github.com/milvus-io/milvus/issues/33157

pr: https://github.com/milvus-io/milvus/pull/33514

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-08 11:17:52 +08:00
chyezh
93348af5c0
fix: async warmup will be blocked by state lock (#33687)
issue: #33685
pr: #33686

Signed-off-by: chyezh <chyezh@outlook.com>
2024-06-07 14:23:54 +08:00
Xiaofan
d331b403c3
enhance: Remove l0 delete cache (#33537)
Cherry pick from master
pr: #32989
Remove the l0 cache and build the delete pk and ts every time. This
reduces memory usage and also improves code readability.

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-06-06 17:13:50 +08:00
wei liu
0c6354018b
enhance: Avoid load bf in delegator when qn worker has no more memory(#33557) (#33650)
pr: #33557

Query coord sends a load request to the delegator; the delegator loads
the bf first, then forwards the load request to the qn worker. When the
qn worker has no more memory, it returns a load failure immediately,
and the delegator rolls back the loaded bf. Query coord will retry the
load request, so the delegator loads and rolls back the bf again and
again.

This PR delays the bf loading step until the segment load succeeds in
the worker.
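
The reordering can be sketched as below. The `delegator` type, its
`bfLoads` counter, and the `workerLoad` callback are hypothetical names
for illustration; the real delegator and worker interfaces differ.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoMemory = errors.New("worker has no more memory")

// delegator counts Bloom filter loads so the effect of the ordering is
// visible. It is a hypothetical stand-in, not the Milvus type.
type delegator struct {
	bfLoads int
}

// loadSegment asks the worker first and loads the Bloom filter only
// after the worker succeeds, so a failed load leaves nothing to roll back.
func (d *delegator) loadSegment(workerLoad func() error) error {
	if err := workerLoad(); err != nil {
		return err // bf was never loaded, so no rollback is needed
	}
	d.bfLoads++ // stands in for loading the segment's Bloom filter
	return nil
}

func main() {
	d := &delegator{}
	err := d.loadSegment(func() error { return errNoMemory })
	fmt.Println(err, d.bfLoads)
	err = d.loadSegment(func() error { return nil })
	fmt.Println(err, d.bfLoads)
}
```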

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-06 10:41:52 +08:00
Chun Han
627b787aed
fix: query iterator lack results(#33137) (#33422) (#33506)
related: #33137 
pr: https://github.com/milvus-io/milvus/pull/33422

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-06-05 18:51:52 +08:00
jaime
0ad55c6c44
fix: fix loaded entity num is inaccurate (#33522)
issue: #33520

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-04 20:09:54 +08:00
yihao.dai
7384bfe3f8
fix: use separate warmup pool and disable warmup by default (#33348) (#33349)
1. use a small warmup pool to reduce the impact of warmup
2. change the warmup pool to nonblocking mode
3. disable warmup by default
4. remove the maximum size limit of 16 for the load pool
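
A small nonblocking pool, as in points 1 and 2, can be sketched with a
buffered channel used as a semaphore. This is an assumption about the
general shape only; `warmupPool` and `TrySubmit` are hypothetical names,
and the actual pool in Milvus is built differently.

```go
package main

import (
	"fmt"
	"sync"
)

// warmupPool runs at most cap(slots) tasks at once and never blocks the
// caller: when every slot is busy, the task is rejected.
type warmupPool struct {
	slots chan struct{}
	wg    sync.WaitGroup
}

func newWarmupPool(size int) *warmupPool {
	return &warmupPool{slots: make(chan struct{}, size)}
}

// TrySubmit starts the task if a slot is free and reports whether it did.
func (p *warmupPool) TrySubmit(task func()) bool {
	select {
	case p.slots <- struct{}{}: // slot acquired
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			defer func() { <-p.slots }() // release the slot before Done
			task()
		}()
		return true
	default: // pool is full: skip warmup instead of waiting
		return false
	}
}

// Wait blocks until all accepted tasks have finished.
func (p *warmupPool) Wait() { p.wg.Wait() }

func main() {
	pool := newWarmupPool(1)
	release := make(chan struct{})
	first := pool.TrySubmit(func() { <-release })
	second := pool.TrySubmit(func() {}) // rejected: the only slot is busy
	close(release)
	pool.Wait()
	fmt.Println(first, second)
}
```

Rejecting work when the pool is full keeps warmup from delaying the
load path, which matches the intent of making warmup a best-effort
step.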

issue: https://github.com/milvus-io/milvus/issues/32772

pr: https://github.com/milvus-io/milvus/pull/33348

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-05-28 19:27:43 +08:00
jaime
8990b8b051
fix: correct error of metrics stats (#33305)
issue: #32980
cherry pick from master
pr:  #33075 #33255

---------

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-05-24 09:15:41 +08:00
Bingyi Sun
d4a146ef1a
enhance: mmap load raw data if scalar index does not have raw data (#… (#33317)
pr: #33175

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-05-24 09:09:42 +08:00
Jiquan Long
d98e1f6ff5
fix: two-phase retrieval on lru-segment (#32945) (#33313)
Cherry-pick from master
pr: #32945 
issue: #31822

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-23 16:25:40 +08:00
wei liu
a988e7cabc
enhance: Reduce bloom filter lock contention between insert and delete in query coord (#32643) (#33284)
issue: #32530
pr: #32643 

ProcessDelete needs to check whether a pk exists in the Bloom filter,
and ProcessInsert needs to add pks to the Bloom filter. When
ProcessInsert and ProcessDelete execute in parallel, this causes a race
condition on the segment's Bloom filter.

This PR executes ProcessInsert and ProcessDelete serially so that they
do not block each other.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-22 20:53:40 +08:00
cai.zhang
6ea7633bd5
enhance: Add memory size for binlog (#33025)
issue: #33005
1. add `MemorySize` field for insert binlog.
2. `LogSize` means the file size in the storage object.
3. `MemorySize` means the size of the data in the memory.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2024-05-15 12:59:34 +08:00
SimFG
1d48d0aeb2
enhance: use different value to get related data size according to segment type (#33017)
issue: #30436

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-05-14 14:59:33 +08:00
Cai Yudong
4fc7915c70
enhance: unify data generation test APIs (#32955)
Issue: #22837

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-05-14 14:33:33 +08:00
chyezh
96489b814d
fix: remove busy log (#33042)
issue: #32963

Signed-off-by: chyezh <chyezh@outlook.com>
2024-05-14 14:20:32 +08:00
foxspy
f6777267e3
enhance: add score compute consistency config for knowhere (#32997)
issue: https://github.com/milvus-io/milvus/issues/32583
related: #32584

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-05-13 14:21:31 +08:00
chyezh
1c84a1c9b6
fix: lru related issue fixup patch (#32916)
issue: #32206, #32801

- search failure with some assertion, segment not loaded and resource
insufficient.

- segment leak when querying segments

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-05-10 19:17:30 +08:00