1038 Commits

Author SHA1 Message Date
congqixia
d16320705e
enhance: [2.4] Add Segment Level in milvus segment info APIs (#34763) (#35023)
Cherry-pick from master
pr: #34763
See also #34746

This PR adds the segment level field to the responses of
`GetPersistentSegmentInfo` and `GetQuerySegmentInfo`.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-29 10:11:52 +08:00
congqixia
2a43f43916
fix: [2.4] Remove timeout in datanode watch ctx (#35011) (#35017)
Cherry-pick from master
pr: #35011
See also #35008

Use tickle timeout logic instead of a hardcoded context timeout.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-26 11:59:46 +08:00
cai.zhang
9cd6dbcbc9
fix: [cherry-pick] Fix bug for block clustering compaction (#35021)
issue: #34703 

master pr: #35019

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-26 11:33:40 +08:00
cai.zhang
74adedf750
enhance: Optimized the GC logic to ensure that memory is released in time (#34950)
issue: #34703 

master pr: #34949

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-24 14:07:43 +08:00
wei liu
c13c48d99a
fix: Failed to unmarshal field stats's bloom filter (#34922)
PR #34377 introduced this issue by missing some new changes during the
cherry-pick.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-23 16:45:47 +08:00
cai.zhang
6986dfdd5b
enhance:[cherry-pick]Send flush signal when the water level reaches the high watermark (#34908)
issue: #30633 

master pr: #34907

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-23 10:27:42 +08:00
cai.zhang
4ed62e9dbb
enhance: [cherry-pick] Add integration test for clustering compaction (#34860)
issue: #34792 

master pr: #34881

Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2024-07-22 17:49:42 +08:00
wayblink
33bbc614df
enhance: [cherry-pick] add ut for clustering_compactor (#34817)
issue: #34792
pr: #34852

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-07-21 19:25:41 +08:00
cai.zhang
323dee2fbc
fix: [cherry-pick] Fix the issue of concurrent packing of the same segment (#34838)
issue: #34703 

master pr: #34840

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-19 18:27:40 +08:00
wayblink
a26e965e6a
enhance:[cherry-pick] Add compaction task slot usage logic (#34625)
issue: #34544
pr: #34581

---------

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-07-18 09:55:43 +08:00
cai.zhang
d74a5bdc1d
fix: [cherry-pick] Fix bug where binlogs already flushed with new segment during pack (#34760)
issue: #34703

master pr: #34762

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-18 09:15:47 +08:00
cai.zhang
19d3606e0d
fix: [cherry-pick] Fix the bug that caused small segments to flush frequently (#34727)
issue: #34703 

master pr: #34725

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-17 09:47:43 +08:00
XuanYang-cn
5909a62ca6
fix: Fix accidentally exiting MixCompaction task loop (#34689)
See also: #33431, #34460
pr: #34688

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-07-16 16:21:38 +08:00
cai.zhang
0c5aafd2d1
fix:[cherry-pick] Reset flushed row num after pack segment for clustering compaction (#34704)
issue: #34703 
master pr: #34702

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-16 15:57:43 +08:00
SimFG
15adb2feac
enhance: [2.4] add the seal segment when dispatch delete msgs (#34566)
/kind improvement
- pr: #34565

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-07-10 19:18:51 +08:00
SimFG
737bd7c734
enhance: [2.4] release the record in delete codec and add some log for compaction (#34506)
/kind improvement
- pr: #34454

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-07-09 15:40:17 +08:00
yihao.dai
0d7ba810b3
enhance: Check segment existence when FlushSegments and add some key logs (#34438) (#34472)
Check if the segment exists during FlushSegments and add some key logs
in write path.

issue: https://github.com/milvus-io/milvus/issues/34255

pr: https://github.com/milvus-io/milvus/pull/34438

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-08 19:00:13 +08:00
yihao.dai
0732167c87
fix: Fix incorrect segment num rows (#34441) (#34474)
UpdateStatistics was called repeatedly; this PR corrects it.

issue: https://github.com/milvus-io/milvus/issues/34440

pr: https://github.com/milvus-io/milvus/pull/34441

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-08 17:30:12 +08:00
wei liu
d3e94f9861
enhance: Use Blocked Bloom Filter instead of basic bloom filter impl (#34377)
issue: #32995
pr: #33405
To speed up the construction and querying of Bloom filters, we chose a
blocked Bloom filter instead of a basic Bloom filter implementation.

WARN: This PR is compatible with the old bf impl, but after falling back
to an older Milvus version, Bloom filter deserialization may fail.

In single Bloom filter test cases with a capacity of 1,000,000 and a
false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times
faster than the basic Bloom filter in both querying and construction, at
the cost of a 30% increase in memory usage.

Block BF construct time {"time": "54.128131ms"}
Block BF size {"size": 3021578}
Block BF Test cost {"time": "55.407352ms"}
Basic BF construct time {"time": "210.262183ms"}
Basic BF size {"size": 2396308}
Basic BF Test cost {"time": "192.596229ms"}
In multi Bloom filter test cases with a capacity of 100,000, an FPR of
0.001, and 100 Bloom filters, we reuse the primary key locations for all
Bloom filters to avoid repeated hash computations. As a result, the
blocked Bloom filter is also 5 times faster than the basic Bloom filter
in querying.

Block BF TestLocation cost {"time": "529.97183ms"}
Basic BF TestLocation cost {"time": "3.197430181s"}
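
The location reuse described above can be sketched as follows. This is a toy, non-blocked Bloom filter written only to illustrate hashing a primary key once and testing the resulting locations against many filters; it is not the implementation from the PR. The names `filter`, `locations`, `addLocations`, and `testLocations` are hypothetical, and the FNV-based double hashing is an assumption chosen for brevity.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// filter is a toy Bloom filter over a fixed-size bit set (not a blocked one).
type filter struct {
	bits []uint64
	m    uint64 // number of bits
}

func newFilter(m uint64) *filter {
	return &filter{bits: make([]uint64, (m+63)/64), m: m}
}

// locations hashes the key once and derives k raw hash values by double
// hashing. The result does not depend on any particular filter.
func locations(key []byte, k int) []uint64 {
	h := fnv.New64a()
	h.Write(key)
	h1 := h.Sum64()
	h.Write([]byte{0xff})
	h2 := h.Sum64() | 1 // force an odd step so the k values differ
	locs := make([]uint64, k)
	for i := range locs {
		locs[i] = h1 + uint64(i)*h2
	}
	return locs
}

// addLocations sets the bits for precomputed locations.
func (f *filter) addLocations(locs []uint64) {
	for _, l := range locs {
		i := l % f.m
		f.bits[i/64] |= uint64(1) << (i % 64)
	}
}

// testLocations reports whether all bits for the locations are set.
func (f *filter) testLocations(locs []uint64) bool {
	for _, l := range locs {
		i := l % f.m
		if f.bits[i/64]&(uint64(1)<<(i%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	filters := []*filter{newFilter(1 << 16), newFilter(1 << 16), newFilter(1 << 16)}

	locs := locations([]byte("pk-1"), 4) // hash the primary key once
	filters[0].addLocations(locs)

	// Reuse the same locations for every filter instead of rehashing.
	for i, f := range filters {
		fmt.Printf("filter %d may contain pk-1: %v\n", i, f.testLocations(locs))
	}
}
```

Only the first filter reports a possible match here, because the other two are empty and an empty filter can never return true.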

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-05 17:04:10 +08:00
yihao.dai
a57c9e61fc
enhance: [cherry-pick] optimize datanode cpu usage and correct the update logic of ttchecker (#34383)
This PR cherry-picks the following commits:
- Try to improve cpu usage by refactoring the ttchecker logic and
caching string. https://github.com/milvus-io/milvus/pull/33267
- Correct the update logic of timerecorder in the flowgraph to avoid
false failure: "some node(s) haven't received input".
https://github.com/milvus-io/milvus/pull/34339

issue: https://github.com/milvus-io/milvus/issues/33266,
https://github.com/milvus-io/milvus/issues/34337

pr: https://github.com/milvus-io/milvus/pull/33267,
https://github.com/milvus-io/milvus/pull/34339

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: Xiaofan <83447078+xiaofan-luan@users.noreply.github.com>
2024-07-04 16:34:17 +08:00
XuanYang-cn
0f1915ef24
fix: DataNode might OOM by estimating based on MemorySize (#34203)
See also: #34136
pr: #34201

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-07-04 15:24:10 +08:00
cai.zhang
bc1746f96c
enhance: [cherry-pick] Optimize clustering compaction (#34313) (#34398)
issue: #30633

master pr: #34313

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-04 09:52:09 +08:00
aoiasd
07daa8f12b
enhance:[Cherry-pick] avoid maintain checkpoint info in sync manager (#33413) (#34285)
relate: https://github.com/milvus-io/milvus/issues/32915
pr: https://github.com/milvus-io/milvus/pull/33413

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-07-03 19:02:09 +08:00
cai.zhang
0c01ace0d2
fix: [cherry-pick] Only load or release Flushed segment in datanode meta (#34393)
issue: #34376 ,  #34375,  #34379

master pr: #34390

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-03 17:44:11 +08:00
wayblink
c62bf8a0b0
fix: [Cherry-pick] Pick major compaction fixes and optimizations (#34360)
This PR cherry-picks the following commits:

- fix: sync partition stats blocking balance task #33742
- fix: Fix meta prefix overlap bug #33830
- fix: Small fixes of major compaction #33929
- fix: Fix memory buffer error & some renaming #33850
- fix: sync part stats task cannot be finished #34027 
- Add an option to enable/disable vector field clustering key #34097
- fix: fix error ignore in compactor #34169
- fix:load major compaction partial result #34052
- Use new stream segment reader in clustering compaction #34232

issue: #30633
pr: #33742 #33830 #33929 #33850 #34027 #34097 #34169 #34052 #34232

---------

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: Chun Han <116052805+MrPresent-Han@users.noreply.github.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-03 09:53:37 +08:00
cai.zhang
6cb0f1ff74
fix: [cherry-pick] Sync the sealed and flushed segments to datanode (#34301) (#34318)
issue: #33696

master pr: #34301

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-02 19:36:09 +08:00
wayblink
99586066f5
feat: [cherry-pick] Major compaction (#34326)
This PR cherry-picks the following commits:
fix: speed up segment lookup via channel name in datacoord (#33530)
needed by the next commit
  feat: Major compaction (#33620)

issue: #30633
pr: #33620

---------

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: yiwangdr <80064917+yiwangdr@users.noreply.github.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
2024-07-02 18:29:01 +08:00
cai.zhang
f11e421839
enhance: [cherry-pick] Remove compaction plans on the datanode (#33548) (#34312)
issue: #33546

master pr: #33548

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-02 17:39:08 +08:00
yihao.dai
1b95ee7ae8
enhance: [cherry-pick] Batch pick PRs related to compaction (#34315)
This PR cherry-picks the following commits related to compaction:

- Use a pool for CompactionExecutor.
https://github.com/milvus-io/milvus/pull/33558
- Move compaction executor to compaction package.
https://github.com/milvus-io/milvus/pull/33778
- Ensure the idempotency of compaction tasks.
https://github.com/milvus-io/milvus/pull/33872
- Add comment for channel cp updater.
https://github.com/milvus-io/milvus/pull/33759

issue: https://github.com/milvus-io/milvus/issues/33182,
https://github.com/milvus-io/milvus/issues/32451

pr: https://github.com/milvus-io/milvus/pull/33558,
https://github.com/milvus-io/milvus/pull/33778,
https://github.com/milvus-io/milvus/pull/33872,
https://github.com/milvus-io/milvus/pull/33759

---------

Signed-off-by: coldWater <254244460@qq.com>
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: coldWater <254244460@qq.com>
2024-07-02 09:54:07 +08:00
zhenshan.cao
14a11e379c
enhance: Refactor Compaction to enable persistence(#33265) (#34268)
pr : #33265 

issue #33586

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-07-01 19:32:07 +08:00
smellthemoon
af442b936c
enhance:change wrong log(#33447) (#34213)
pr: #33447

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-07-01 18:02:07 +08:00
cai.zhang
1c6e850f73
enhance: [cherry-pick] Periodically synchronize segments to datanode watcher (#33420) (#34186)
This PR primarily picks up the SyncSegments functionality, including the
following commits:
- main functionality: https://github.com/milvus-io/milvus/pull/33420
- related fixes:
  - https://github.com/milvus-io/milvus/pull/33664
  - https://github.com/milvus-io/milvus/pull/33829
  - https://github.com/milvus-io/milvus/pull/34056
  - https://github.com/milvus-io/milvus/pull/34156

issue: #32809 
master pr: #33420, #33664, #33829, #34056, #34156

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-06-27 11:24:05 +08:00
congqixia
f741bb7526
enhance: [2.4] Avoid merging insert data when buffering insert msgs (#34205)
Cherry-pick from master
pr: #33526 #33817
See also #33561

This PR:
- Use zero copy when buffering insert messages
- Make `storage.InsertCodec` support serializing multiple insert data
chunks into the same batch of binlog files

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-27 10:14:05 +08:00
jaime
6423b6c718
enhance: move rocksmq from internal to pkg (#34165)
pr:  https://github.com/milvus-io/milvus/pull/33881
issue:  https://github.com/milvus-io/milvus/issues/33956

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-26 13:36:05 +08:00
congqixia
3cf526e0cc
fix: [2.4] Deep copy ImportTask.segmentsInfo to prevent data race (#34090) (#34126)
Cherry-pick from master
pr: #34090
See also #34089

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-26 11:50:11 +08:00
yihao.dai
b1e74dc7cb
enhance: [cherry-pick] Decouple compaction from shard (#34157)
This PR cherry-picks the following commits:

- Implement task limit control logic in datanode.
https://github.com/milvus-io/milvus/pull/32881
- Load bf from storage instead of memory during L0 compaction.
https://github.com/milvus-io/milvus/pull/32913
- Remove dependencies on shards (e.g. SyncSegments, injection).
https://github.com/milvus-io/milvus/pull/33138
- Rename Compaction interface to CompactionV2.
https://github.com/milvus-io/milvus/pull/33858
- Remove the unused residual compaction logic.
https://github.com/milvus-io/milvus/pull/33932

issue: https://github.com/milvus-io/milvus/issues/32809

pr: https://github.com/milvus-io/milvus/pull/32881,
https://github.com/milvus-io/milvus/pull/32913,
https://github.com/milvus-io/milvus/pull/33138,
https://github.com/milvus-io/milvus/pull/33858,
https://github.com/milvus-io/milvus/pull/33932

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-25 20:22:03 +08:00
XuanYang-cn
a33b68678d
enhance: [cherry-pick] Move compactor into sub package (#34098)
This PR consists of the following commits:

- enhance: Tidy compactor and remove dup codes (#32198)
- fix: Fix l0 compactor may cause DN OOM (#33554)
- enhance: Add deltaRowCount in l0 compaction (#33997)
- enhance: enable stream writer in compactions (#32612)
- fix: turn on compression on stream writers (#34067)
- fix: adding blob memory size in binlog serde (#33324)

See also: #32451, #33547, #33998, #31679
pr: #32198, #33554, #33997, #32612

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Signed-off-by: Ted Xu <ted.xu@zilliz.com>
Co-authored-by: Ted Xu <ted.xu@zilliz.com>
2024-06-25 11:16:02 +08:00
XuanYang-cn
e55fee6b04
enhance: Add deltaRowCount in l0 compaction (#33843)
See also: #33998 
pr: #33997

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-20 19:32:02 +08:00
wei liu
a7ae45c91c
enhance: Add trace for bf cost in l0 compactor (#33898)
pr: #33860

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-20 14:40:15 +08:00
wei liu
25d8b74f71
enhance: Execute bloom filter apply in parallel to speed up segment predict (#33793)
issue: #33610
pr: #33792

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-13 14:14:04 +08:00
congqixia
efd1fa8b8a
fix: [2.4] Prevent restart timetick sender creating ut datanode (#33790) (#33801)
Cherry-pick from master
pr: #33790
See also #33789

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-13 10:03:57 +08:00
wei liu
54feef30e7
enhance: Use BatchPkExist to reduce bloom filter func call cost (#33752)
issue: #33610
pr: #33611

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-12 17:45:58 +08:00
SimFG
c331aa4ad3
enhance: [2.4] add the includeCurrentMsg param for the Seek method (#33743)
/kind improvement

- issue: #33325
- pr: #33326

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-06-11 15:01:55 +08:00
yihao.dai
b71a404776
fix: Check if the import job exists (#33672) (#33673)
issue: https://github.com/milvus-io/milvus/issues/33671

pr: https://github.com/milvus-io/milvus/pull/33672

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-10 21:50:29 +08:00
yihao.dai
ed1dee9e38
enhance: Support L0 import (#33514) (#33712)
issue: https://github.com/milvus-io/milvus/issues/33157

pr: https://github.com/milvus-io/milvus/pull/33514

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-08 11:17:52 +08:00
yihao.dai
e81ae1e5a4
fix: Fix uneven import segment sizes (#33605) (#33634)
The data coordinator already computes the appropriate number of import
segments, so the data node can randomly select one of them when importing.

issue: https://github.com/milvus-io/milvus/issues/33604

pr: https://github.com/milvus-io/milvus/pull/33605

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 18:49:52 +08:00
yihao.dai
e282e1408e
enhance: Abstract Execute interface for import/preimport task (#33234) (#33607)
Abstract Execute interface for import/preimport task, simplify import
scheduler.

issue: https://github.com/milvus-io/milvus/issues/33157

pr: https://github.com/milvus-io/milvus/pull/33234

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 11:17:56 +08:00
XuanYang-cn
95582b0208
fix: [2.4] L0 compactor may cause DN OOM (#33564)
See also: #33547
pr: #33554

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-05 10:51:50 +08:00
congqixia
44e97b7cda
enhance: [2.4] Use map PK to timestamp in buffer insert (#33566) (#33582)
Cherry-pick from master
pr: #33566 
Related to #27675

Store a map from PK to minimal timestamp in `inData`, instead of a bloom
filter, to check whether a delete entry hits the current insert batch.
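
A minimal sketch of the idea, under stated assumptions: the type `pkMinTs` and its methods are hypothetical, primary keys are assumed to be `int64`, and "hit" is assumed to mean that the key was inserted strictly before the delete timestamp. The actual `inData` structure in Milvus may differ.

```go
package main

import "fmt"

// pkMinTs maps a primary key to the smallest insert timestamp seen for it
// in one buffered insert batch. All names here are hypothetical.
type pkMinTs map[int64]uint64

// record keeps only the minimal timestamp per primary key.
func (m pkMinTs) record(pk int64, ts uint64) {
	if old, ok := m[pk]; !ok || ts < old {
		m[pk] = ts
	}
}

// deleteHits reports whether a delete (pk, delTs) affects any row in the
// batch: the key must be present and inserted before the delete timestamp
// (the strict comparison is an assumption).
func (m pkMinTs) deleteHits(pk int64, delTs uint64) bool {
	minTs, ok := m[pk]
	return ok && minTs < delTs
}

func main() {
	batch := pkMinTs{}
	batch.record(1, 100)
	batch.record(1, 90) // an earlier row with the same key lowers the minimum
	batch.record(2, 200)

	fmt.Println(batch.deleteHits(1, 95))  // true: a row with pk 1 exists at ts 90
	fmt.Println(batch.deleteHits(2, 150)) // false: pk 2 was inserted after the delete
	fmt.Println(batch.deleteHits(3, 500)) // false: pk 3 is not in the batch
}
```

Unlike a Bloom filter, an exact map has no false positives and also answers the timestamp question, at the cost of storing one entry per distinct key in the batch.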

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-04 19:21:54 +08:00
XuanYang-cn
07b995fea4
fix: [2.4]Sync dropped segment for dropped partition (#33332)
See also: #33330
pr: #33331

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-05-27 17:57:43 +08:00