cai.zhang
74adedf750
enhance: Optimized the GC logic to ensure that memory is released in time ( #34950 )
...
issue: #34703
master pr: #34949
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-24 14:07:43 +08:00
wei liu
c13c48d99a
fix: Failed to unmarshal field stats's bloom filter ( #34922 )
...
PR #34377 introduced this issue: it missed some new changes during the
cherry-pick.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-23 16:45:47 +08:00
cai.zhang
6986dfdd5b
enhance: [cherry-pick] Send flush signal when the water level reaches the high watermark ( #34908 )
...
issue: #30633
master pr: #34907
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-23 10:27:42 +08:00
cai.zhang
4ed62e9dbb
enhance: [cherry-pick] Add integration test for clustering compaction ( #34860 )
...
issue: #34792
master pr: #34881
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2024-07-22 17:49:42 +08:00
wayblink
33bbc614df
enhance: [cherry-pick] add ut for clustering_compactor ( #34817 )
...
issue: #34792
pr: #34852
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-07-21 19:25:41 +08:00
cai.zhang
323dee2fbc
fix: [cherry-pick] Fix the issue of concurrent packing of the same segment ( #34838 )
...
issue: #34703
master pr: #34840
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-19 18:27:40 +08:00
wayblink
a26e965e6a
enhance:[cherry-pick] Add compaction task slot usage logic ( #34625 )
...
issue: #34544
pr: #34581
---------
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-07-18 09:55:43 +08:00
cai.zhang
d74a5bdc1d
fix: [cherry-pick] Fix bug where binlogs already flushed with new segment during pack ( #34760 )
...
issue: #34703
master pr: #34762
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-18 09:15:47 +08:00
cai.zhang
19d3606e0d
fix: [cherry-pick] Fix the bug that caused small segment flush frequently ( #34727 )
...
issue: #34703
master pr: #34725
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-17 09:47:43 +08:00
XuanYang-cn
5909a62ca6
fix: Fix accidentally exiting the MixCompaction task loop ( #34689 )
...
See also: #33431 , #34460
pr: #34688
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-07-16 16:21:38 +08:00
cai.zhang
0c5aafd2d1
fix:[cherry-pick] Reset flushed row num after pack segment for clustering compaction ( #34704 )
...
issue: #34703
master pr: #34702
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-16 15:57:43 +08:00
SimFG
15adb2feac
enhance: [2.4] add the seal segment when dispatch delete msgs ( #34566 )
...
/kind improvement
- pr: #34565
Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-07-10 19:18:51 +08:00
SimFG
737bd7c734
enhance: [2.4] release the record in delete codec and add some log for compaction ( #34506 )
...
/kind improvement
- pr: #34454
Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-07-09 15:40:17 +08:00
yihao.dai
0d7ba810b3
enhance: Check segment existence when FlushSegments and add some key logs ( #34438 ) ( #34472 )
...
Check if the segment exists during FlushSegments and add some key logs
in write path.
issue: https://github.com/milvus-io/milvus/issues/34255
pr: https://github.com/milvus-io/milvus/pull/34438
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-08 19:00:13 +08:00
yihao.dai
0732167c87
fix: Fix incorrect segment num rows ( #34441 ) ( #34474 )
...
UpdateStatistics was called repeatedly, which produced an incorrect row count; this PR corrects it.
issue: https://github.com/milvus-io/milvus/issues/34440
pr: https://github.com/milvus-io/milvus/pull/34441
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-08 17:30:12 +08:00
wei liu
d3e94f9861
enhance: Use Blocked Bloom Filter instead of basic bloom filter impl ( #34377 )
...
issue: #32995
pr: #33405
To speed up the construction and querying of Bloom filters, we chose a
blocked Bloom filter instead of a basic Bloom filter implementation.
WARN: This PR remains compatible with the old Bloom filter implementation,
but after falling back to an older Milvus version, Bloom filter
deserialization may fail.
In single Bloom filter test cases with a capacity of 1,000,000 and a
false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times
faster than the basic Bloom filter in both querying and construction, at
the cost of a 30% increase in memory usage.
Block BF construct time {"time": "54.128131ms"}
Block BF size {"size": 3021578}
Block BF Test cost {"time": "55.407352ms"}
Basic BF construct time {"time": "210.262183ms"}
Basic BF size {"size": 2396308}
Basic BF Test cost {"time": "192.596229ms"}
In multi Bloom filter test cases with a capacity of 100,000, an FPR of
0.001, and 100 Bloom filters, we reuse the primary key locations for all
Bloom filters to avoid repeated hash computations. As a result, the
blocked Bloom filter is also 5 times faster than the basic Bloom filter
in querying.
Block BF TestLocation cost {"time": "529.97183ms"}
Basic BF TestLocation cost {"time": "3.197430181s"}
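The two ideas described above (confining every probe for a key to a single cache-line-sized block, and computing a key's hash locations once for reuse across many filters) can be sketched as follows. This is a minimal illustration only, not the library or code this PR uses: the names `blockedBF`, `location`, `locate`, and `mix`, the choice of FNV-1a as the base hash, and the constants are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const (
	wordsPerBlock = 8                  // 8 x 64-bit words = one 64-byte cache line
	bitsPerBlock  = wordsPerBlock * 64 // 512 bits per block
	probes        = 8                  // bits set per key (illustrative choice)
)

// location holds everything derived from hashing a key, so that the hash is
// computed once and can be reused against any number of filters.
type location struct {
	blockHash uint64 // selects the block (taken modulo the block count)
	start     uint64 // first bit offset inside the block
	step      uint64 // odd stride, so the probes within a block are distinct
}

// mix is a splitmix64-style finalizer used to derive a second hash value.
func mix(x uint64) uint64 {
	x ^= x >> 30
	x *= 0xbf58476d1ce4e5b9
	x ^= x >> 27
	x *= 0x94d049bb133111eb
	x ^= x >> 31
	return x
}

// locate hashes the key once and returns its reusable probe location.
func locate(key []byte) location {
	h := fnv.New64a()
	h.Write(key)
	h1 := h.Sum64()
	h2 := mix(h1)
	return location{
		blockHash: h1,
		start:     h2 % bitsPerBlock,
		step:      (h2>>9)%bitsPerBlock | 1,
	}
}

// blockedBF is a toy blocked Bloom filter: all probes of a key hit one block.
type blockedBF struct {
	blocks [][wordsPerBlock]uint64
}

func newBlockedBF(numBlocks int) *blockedBF {
	return &blockedBF{blocks: make([][wordsPerBlock]uint64, numBlocks)}
}

// Add sets the key's probe bits inside its single block.
func (f *blockedBF) Add(loc location) {
	block := &f.blocks[loc.blockHash%uint64(len(f.blocks))]
	for i := uint64(0); i < probes; i++ {
		bit := (loc.start + i*loc.step) % bitsPerBlock
		block[bit/64] |= 1 << (bit % 64)
	}
}

// Test reports whether all probe bits are set (false positives are possible).
func (f *blockedBF) Test(loc location) bool {
	block := &f.blocks[loc.blockHash%uint64(len(f.blocks))]
	for i := uint64(0); i < probes; i++ {
		bit := (loc.start + i*loc.step) % bitsPerBlock
		if block[bit/64]&(1<<(bit%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	keys := [][]byte{[]byte("pk-1"), []byte("pk-2"), []byte("pk-3")}

	// Hash every key once.
	locs := make([]location, len(keys))
	for i, k := range keys {
		locs[i] = locate(k)
	}

	// Reuse the same locations against several filters.
	filters := []*blockedBF{newBlockedBF(64), newBlockedBF(64)}
	for _, loc := range locs {
		filters[0].Add(loc) // only the first filter receives the keys
	}
	for i, f := range filters {
		for j, loc := range locs {
			fmt.Printf("filter %d, key %s: %v\n", i, keys[j], f.Test(loc))
		}
	}
}
```

Because each lookup touches one 64-byte block, a query costs at most one cache miss, which is the usual explanation for the speedup; the trade-off is a less even bit distribution, so more memory is needed for the same false positive rate.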
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-05 17:04:10 +08:00
yihao.dai
a57c9e61fc
enhance: [cherry-pick] optimize datanode cpu usage and correct the update logic of ttchecker ( #34383 )
...
This PR cherry-picks the following commits:
- Try to improve cpu usage by refactoring the ttchecker logic and
caching string. https://github.com/milvus-io/milvus/pull/33267
- Correct the update logic of timerecorder in the flowgraph to avoid
false failure: "some node(s) haven't received input".
https://github.com/milvus-io/milvus/pull/34339
issue: https://github.com/milvus-io/milvus/issues/33266 ,
https://github.com/milvus-io/milvus/issues/34337
pr: https://github.com/milvus-io/milvus/pull/33267 ,
https://github.com/milvus-io/milvus/pull/34339
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: Xiaofan <83447078+xiaofan-luan@users.noreply.github.com>
2024-07-04 16:34:17 +08:00
XuanYang-cn
0f1915ef24
fix: DataNode might OOM by estimating based on MemorySize ( #34203 )
...
See also: #34136
pr: #34201
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-07-04 15:24:10 +08:00
cai.zhang
bc1746f96c
enhance: [cherry-pick] Optimize clustering compaction ( #34313 ) ( #34398 )
...
issue: #30633
master pr: #34313
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-04 09:52:09 +08:00
aoiasd
07daa8f12b
enhance:[Cherry-pick] avoid maintain checkpoint info in sync manager ( #33413 ) ( #34285 )
...
relate: https://github.com/milvus-io/milvus/issues/32915
pr: https://github.com/milvus-io/milvus/pull/33413
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-07-03 19:02:09 +08:00
cai.zhang
0c01ace0d2
fix: [cherry-pick] Only load or release Flushed segment in datanode meta ( #34393 )
...
issue: #34376 , #34375 , #34379
master pr: #34390
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-03 17:44:11 +08:00
wayblink
c62bf8a0b0
fix: [Cherry-pick] Pick major compaction fixes and optimizations ( #34360 )
...
This PR cherry-picks the following commits:
- fix: sync partition stats blocking balance task #33742
- fix: Fix meta prefix overlap bug #33830
- fix: Small fixes of major compaction #33929
- fix: Fix memory buffer error & some renaming #33850
- fix: sync part stats task cannot be finished #34027
- Add an option to enable/disable vector field clustering key #34097
- fix: fix error ignore in compactor #34169
- fix:load major compaction partial result #34052
- Use new stream segment reader in clustering compaction #34232
issue: #30633
pr: #33742 #33830 #33929 #33850 #34027 #34097 #34169 #34052 #34232
---------
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: Chun Han <116052805+MrPresent-Han@users.noreply.github.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-03 09:53:37 +08:00
cai.zhang
6cb0f1ff74
fix: [cherry-pick] Sync the sealed and flushed segments to datanode ( #34301 ) ( #34318 )
...
issue: #33696
master pr: #34301
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-02 19:36:09 +08:00
wayblink
99586066f5
feat: [cherry-pick] Major compaction ( #34326 )
...
This PR cherry-picks the following commits:
fix: speed up segment lookup via channel name in datacoord (#33530 )
needed by the next commit
feat: Major compaction (#33620 )
issue: #30633
pr: #33620
---------
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: yiwangdr <80064917+yiwangdr@users.noreply.github.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
2024-07-02 18:29:01 +08:00
cai.zhang
f11e421839
enhance: [cherry-pick] Remove compaction plans on the datanode ( #33548 ) ( #34312 )
...
issue: #33546
master pr: #33548
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-02 17:39:08 +08:00
yihao.dai
1b95ee7ae8
enhance: [cherry-pick] Batch pick PRs related to compaction ( #34315 )
...
This PR cherry-picks the following commits related to compaction:
- Use a pool for CompactionExecutor.
https://github.com/milvus-io/milvus/pull/33558
- Move compaction executor to compaction package.
https://github.com/milvus-io/milvus/pull/33778
- Ensure the idempotency of compaction tasks.
https://github.com/milvus-io/milvus/pull/33872
- Add comment for channel cp updater.
https://github.com/milvus-io/milvus/pull/33759
issue: https://github.com/milvus-io/milvus/issues/33182 ,
https://github.com/milvus-io/milvus/issues/32451
pr: https://github.com/milvus-io/milvus/pull/33558 ,
https://github.com/milvus-io/milvus/pull/33778 ,
https://github.com/milvus-io/milvus/pull/33872 ,
https://github.com/milvus-io/milvus/pull/33759
---------
Signed-off-by: coldWater <254244460@qq.com>
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: coldWater <254244460@qq.com>
2024-07-02 09:54:07 +08:00
zhenshan.cao
14a11e379c
enhance: Refactor Compaction to enable persistence( #33265 ) ( #34268 )
...
pr: #33265
issue: #33586
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-07-01 19:32:07 +08:00
smellthemoon
af442b936c
enhance:change wrong log( #33447 ) ( #34213 )
...
pr: #33447
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-07-01 18:02:07 +08:00
cai.zhang
1c6e850f73
enhance: [cherry-pick] Periodically synchronize segments to datanode watcher ( #33420 ) ( #34186 )
...
This PR primarily picks up the SyncSegments functionality, including the
following commits:
- main functionality: https://github.com/milvus-io/milvus/pull/33420
- related fixes:
- https://github.com/milvus-io/milvus/pull/33664
- https://github.com/milvus-io/milvus/pull/33829
- https://github.com/milvus-io/milvus/pull/34056
- https://github.com/milvus-io/milvus/pull/34156
issue: #32809
master pr: #33420 , #33664 , #33829 , #34056 , #34156
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-06-27 11:24:05 +08:00
congqixia
f741bb7526
enhance: [2.4] Avoid merging insert data when buffering insert msgs ( #34205 )
...
Cherry-pick from master
pr: #33526 #33817
See also #33561
This PR:
- Use zero copy when buffering insert messages
- Make `storage.InsertCodec` support serializing multiple insert data
chunks into the same batch of binlog files
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-27 10:14:05 +08:00
jaime
6423b6c718
enhance: move rocksmq from internal to pkg ( #34165 )
...
pr: https://github.com/milvus-io/milvus/pull/33881
issue: https://github.com/milvus-io/milvus/issues/33956
Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-26 13:36:05 +08:00
congqixia
3cf526e0cc
fix: [2.4] Deep copy ImportTask.segmentsInfo to prevent data race ( #34090 ) ( #34126 )
...
Cherry-pick from master
pr: #34090
See also #34089
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-26 11:50:11 +08:00
yihao.dai
b1e74dc7cb
enhance: [cherry-pick] Decouple compaction from shard ( #34157 )
...
This PR cherry-picks the following commits:
- Implement task limit control logic in datanode.
https://github.com/milvus-io/milvus/pull/32881
- Load bf from storage instead of memory during L0 compaction.
https://github.com/milvus-io/milvus/pull/32913
- Remove dependencies on shards (e.g. SyncSegments, injection).
https://github.com/milvus-io/milvus/pull/33138
- Rename Compaction interface to CompactionV2.
https://github.com/milvus-io/milvus/pull/33858
- Remove the unused residual compaction logic.
https://github.com/milvus-io/milvus/pull/33932
issue: https://github.com/milvus-io/milvus/issues/32809
pr: https://github.com/milvus-io/milvus/pull/32881 ,
https://github.com/milvus-io/milvus/pull/32913 ,
https://github.com/milvus-io/milvus/pull/33138 ,
https://github.com/milvus-io/milvus/pull/33858 ,
https://github.com/milvus-io/milvus/pull/33932
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-25 20:22:03 +08:00
XuanYang-cn
a33b68678d
enhance: [cherry-pick] Move compactor into sub package ( #34098 )
...
This PR consists of the following commits:
- enhance: Tidy compactor and remove dup codes (#32198 )
- fix: Fix l0 compactor may cause DN OOM (#33554 )
- enhance: Add deltaRowCount in l0 compaction (#33997 )
- enhance: enable stream writer in compactions (#32612 )
- fix: turn on compression on stream writers (#34067 )
- fix: adding blob memory size in binlog serde (#33324 )
See also: #32451 , #33547 , #33998 , #31679
pr: #32198 , #33554 , #33997 , #32612
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Signed-off-by: Ted Xu <ted.xu@zilliz.com>
Co-authored-by: Ted Xu <ted.xu@zilliz.com>
2024-06-25 11:16:02 +08:00
XuanYang-cn
e55fee6b04
enhance: Add deltaRowCount in l0 compaction ( #33843 )
...
See also: #33998
pr: #33997
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-20 19:32:02 +08:00
wei liu
a7ae45c91c
enhance: Add trace for bf cost in l0 compactor ( #33898 )
...
pr: #33860
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-20 14:40:15 +08:00
wei liu
25d8b74f71
enhance: Execute bloom filter apply in parallel to speed up segment predict ( #33793 )
...
issue: #33610
pr: #33792
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-13 14:14:04 +08:00
congqixia
efd1fa8b8a
fix: [2.4] Prevent restart timetick sender creating ut datanode ( #33790 ) ( #33801 )
...
Cherry-pick from master
pr: #33790
See also #33789
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-13 10:03:57 +08:00
wei liu
54feef30e7
enhance: Use BatchPkExist to reduce bloom filter func call cost ( #33752 )
...
issue: #33610
pr: #33611
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-12 17:45:58 +08:00
SimFG
c331aa4ad3
enhance: [2.4] add the includeCurrentMsg param for the Seek method ( #33743 )
...
/kind improvement
- issue: #33325
- pr: #33326
Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-06-11 15:01:55 +08:00
yihao.dai
b71a404776
fix: Check if the import job exists ( #33672 ) ( #33673 )
...
issue: https://github.com/milvus-io/milvus/issues/33671
pr: https://github.com/milvus-io/milvus/pull/33672
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-10 21:50:29 +08:00
yihao.dai
ed1dee9e38
enhance: Support L0 import ( #33514 ) ( #33712 )
...
issue: https://github.com/milvus-io/milvus/issues/33157
pr: https://github.com/milvus-io/milvus/pull/33514
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-08 11:17:52 +08:00
yihao.dai
e81ae1e5a4
fix: Fix import segment size is uneven ( #33605 ) ( #33634 )
...
The data coordinator already computes the appropriate number of import
segments, so the data node can select a segment at random when importing.
issue: https://github.com/milvus-io/milvus/issues/33604
pr: https://github.com/milvus-io/milvus/pull/33605
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 18:49:52 +08:00
yihao.dai
e282e1408e
enhance: Abstract Execute interface for import/preimport task ( #33234 ) ( #33607 )
...
Abstract Execute interface for import/preimport task, simplify import
scheduler.
issue: https://github.com/milvus-io/milvus/issues/33157
pr: https://github.com/milvus-io/milvus/pull/33234
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 11:17:56 +08:00
XuanYang-cn
95582b0208
fix: [2.4] L0 compactor may cause DN OOM ( #33564 )
...
See also: #33547
pr: #33554
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-05 10:51:50 +08:00
congqixia
44e97b7cda
enhance: [2.4] Use map PK to timestamp in buffer insert ( #33566 ) ( #33582 )
...
Cherry-pick from master
pr: #33566
Related to #27675
Store a map from each pk to its minimal timestamp in `inData`, instead of
a bloom filter, to check whether a delete entry hits the current insert
batch
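A minimal sketch of the idea, assuming int64 primary keys: the map gives an exact answer (no false positives, unlike a bloom filter) and also lets the check take timestamps into account. The names `pkMinTs`, `buildPkMinTs`, and `deleteHits` are hypothetical, and the rule that a delete only hits when its timestamp is strictly greater than the earliest insert is an assumption made for illustration.

```go
package main

import "fmt"

// pkMinTs maps each primary key in a buffered insert batch to the smallest
// timestamp at which that key was inserted.
type pkMinTs map[int64]uint64

// buildPkMinTs scans the batch once and keeps the minimal timestamp per key.
// pks and tss are parallel slices of equal length.
func buildPkMinTs(pks []int64, tss []uint64) pkMinTs {
	m := make(pkMinTs, len(pks))
	for i, pk := range pks {
		if old, ok := m[pk]; !ok || tss[i] < old {
			m[pk] = tss[i]
		}
	}
	return m
}

// deleteHits reports whether a delete of pk at deleteTs affects this batch.
// Assumption: the delete must be strictly newer than the earliest insert.
func (m pkMinTs) deleteHits(pk int64, deleteTs uint64) bool {
	minTs, ok := m[pk]
	return ok && deleteTs > minTs
}

func main() {
	batch := buildPkMinTs([]int64{1, 2, 1}, []uint64{30, 20, 10})

	fmt.Println(batch.deleteHits(1, 15))  // pk 1 was first inserted at ts 10
	fmt.Println(batch.deleteHits(1, 5))   // delete is older than every insert of pk 1
	fmt.Println(batch.deleteHits(3, 100)) // pk 3 is not in the batch
}
```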
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-04 19:21:54 +08:00
XuanYang-cn
07b995fea4
fix: [2.4]Sync dropped segment for dropped partition ( #33332 )
...
See also: #33330
pr: #33331
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-05-27 17:57:43 +08:00
congqixia
3bd8137062
enhance: [2.4] Use pre-built logger for write buffer frequent ops ( #33273 ) ( #33304 )
...
Cherry-pick from master
pr: #33273
See also #33266
Each `WriteBuffer` has fixed channel and collection id attributes, so a
single pre-built logger can be reused, which reduces logger allocation
and repeated name composition
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-23 15:19:41 +08:00
congqixia
2f3b377479
fix: [2.4] Remove task from syncmgr after task done ( #33303 )
...
Cherry-pick from master
pr: #33302
See also #33247
Introduced in PR #32865
Remove task after task done to keep checkpoint sound and safe
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-23 14:51:39 +08:00
XuanYang-cn
00b05fcc02
fix: Remove L0 compactor in completedCompactor ( #33169 ) ( #33216 )
...
See also: #33168
pr: #33169
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-05-21 19:07:39 +08:00