20 Commits

Author SHA1 Message Date
yihao.dai
f32f2694bc
enhance: Implement new FlushAllMessage and refactor flush all (#45920)
This PR:
1. Define and implement the new FlushAllMessage.
2. Refactor FlushAll to flush the entire cluster.

issue: https://github.com/milvus-io/milvus/issues/45919

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-12-10 19:27:13 +08:00
Zhen Ye
c171280f63
enhance: support replicate message in wal. (#44456)
issue: #44123

- support replicate message  in wal of milvus.
- support CDC-replicate recovery from wal.
- fix some CDC replicator bugs

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-22 17:06:11 +08:00
yihao.dai
51f69f32d0
feat: Add CDC support (#44124)
This PR implements a new CDC service for Milvus 2.6, providing log-based
cross-cluster replication.

issue: https://github.com/milvus-io/milvus/issues/44123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-09-16 16:32:01 +08:00
Zhen Ye
3b01388587
fix: use recovery snapshot checkpoint if no vchannel is on-recovering (#44246)
issue: #44194

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-10 14:15:56 +08:00
Zhen Ye
cbe4c3d231
enhance: get cchannel before build message (#44229)
issue: #43897

- support never expire txn message.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-10 11:09:57 +08:00
Zhen Ye
e9ab73e93d
enhance: add schema version at recovery storage (#43500)
issue: #43072, #43289

- manage the schema version at recovery storage.
- update the schema when creating collection or alter schema.
- get schema at write buffer based on version.
- recover the schema when upgrading from 2.5.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-23 21:38:54 +08:00
Zhen Ye
15a6631147
enhance: add quota limit based on sn consuming lag (#43105)
issue: #42995

- The consuming lag at streaming node will be reported to coordinator.
- The consuming lag will trigger the write limit and deny by quota
center.
- Set the ttProtection by default.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-11 14:10:49 +08:00
Zhen Ye
66cc194ab2
enhance: add partition gc at streaming arch (#42179)
issue: #41976

- make drop partition message as a broadcast message.
- add gc when drop partition message is acked.
- add a call back to handle the broadcast message when ack.
- the ack operation of broadcast message will retry until success.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:20:30 +08:00
Zhen Ye
59ab274dbe
fix: use flusher and recovery checkpoint together to determine the truncate position (#41934)
issue: #41544

- unify the log field of message
- use the minimum one of flusher and recovery storage checkpoint as the
truncate position

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-20 16:10:24 +08:00
Zhen Ye
a3d5ad135e
fix: recover a dropped collection from wal if create collection message can be seen (#41902)
issue: #41654

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-17 07:38:21 +08:00
Zhen Ye
0a465bb5b7
enhance: use recovery+shardmanager, remove segment assignment interceptor (#41824)
issue: #41544

- add lock interceptor into wal.
- use recovery and shardmanager to replace the original implementation
of segment assignment.
- remove redundant implementation and unittest.
- remove redundant proto definition.
- use 2 streamingnode in e2e.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-14 23:00:23 +08:00
Zhen Ye
e675da76e4
enhance: simplify the proto message, make segment assignment code more clean (#41671)
issue: #41544

- simplify the proto message for flush and create segment.
- simplify the msg handler for flowgraph.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-11 20:49:00 +08:00
Zhen Ye
452d6fb709
fix: write buffer leak if the wal flusher is cancelled when recovery (#41719)
issue: #41715

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-10 09:32:56 +08:00
Zhen Ye
9339bccccc
enhance: move sent first timeticksync, make recovery more easier (#41405)
issue: #38399

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-21 17:18:37 +08:00
Xianhui Lin
f9febe3bae
enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006)
Merge RootCoord, DataCoord And QueryCoord into MixCoord
Make Session into one
issue : https://github.com/milvus-io/milvus/issues/37764

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-11 16:36:30 +08:00
Zhen Ye
224728c2d2
fix: catchup cannot work if using StartAfter (#41201)
issue: #41062

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-10 19:04:27 +08:00
Zhen Ye
5735c3ef19
fix: too many memory usage of streaming node (#40606)
issue: #40592

Signed-off-by: chyezh <chyezh@outlook.com>
2025-03-13 07:10:07 +08:00
Zhen Ye
84df80b5e4
enhance: refactor metrics of streaming (#40031)
issue: #38399

- add metrics for broadcaster component.
- add metrics for wal flusher component.
- add metrics for wal interceptors.
- add slow log for wal.
- add more label for some wal metrics. (local or remote/catcup or
tailing...)

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-25 12:25:56 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
Zhen Ye
d3e32bb599
enhance: make pchannel level flusher (#39275)
issue: #38399

- Add a pchannel level checkpoint for flush processing
- Refactor the recovery of flushers of wal
- make a shared wal scanner first, then make multi datasyncservice on it

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-10 16:32:45 +08:00