49 Commits

Author SHA1 Message Date
tinswzy
1427825133
enhance: improve WAL retention strategy (#45350)
issue: #44369 
related woodpecker issue:
[#59](https://github.com/zilliztech/woodpecker/issues/59)

Refactor the WAL retention logic in Milvus StreamingNode:
- Remove the simple sampling-based truncation mechanism.
- After flush, WAL data is directly truncated.
- The retention control is now delegated to the underlying message queue
(MQ) implementation.
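
The flow in the list above can be pictured with a small sketch. This is a toy in-memory model written in Python for brevity, not the Milvus (Go) implementation; `ToyWAL`, `append`, `truncate`, `retained_ids`, and `on_flush` are hypothetical names chosen for illustration, and integer message IDs are an assumption.

```python
class ToyWAL:
    """In-memory stand-in for a WAL; message IDs are increasing integers."""

    def __init__(self):
        self._entries = {}  # message id -> payload
        self._next_id = 0

    def append(self, payload):
        msg_id = self._next_id
        self._entries[msg_id] = payload
        self._next_id += 1
        return msg_id

    def truncate(self, up_to_id):
        """Drop every entry with id <= up_to_id (inclusive)."""
        for msg_id in [i for i in self._entries if i <= up_to_id]:
            del self._entries[msg_id]

    def retained_ids(self):
        return sorted(self._entries)


def on_flush(wal, flushed_up_to_id):
    # Assumption: once data up to this id has been flushed, the WAL prefix is
    # truncated right away. How long the backing queue physically keeps the
    # data is left to the queue's own retention settings.
    wal.truncate(flushed_up_to_id)


wal = ToyWAL()
ids = [wal.append(f"row-{n}") for n in range(5)]
on_flush(wal, ids[2])
print(wal.retained_ids())  # [3, 4]
```

The point of the sketch is only the ordering: truncation is triggered by the flush itself rather than by a periodic sampling task.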

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-11-23 21:41:05 +08:00
Xiaofan
a9895bb904
enhance: add robust handling of etcd server crash (#45304)
related to #45303
fix: the milvus pod may restart when the etcd pod starts

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2025-11-13 10:23:36 +08:00
Zhen Ye
31a609c21d
fix: kafka should auto reset the offset from earliest to read (#45237)
issue: #44172, #45210, #44851

Kafka automatically resets the offset to "latest" when the requested
offset is out of range, so the recovery of the milvus wal cannot read any
messages from that position. Once the offset is out of range, Kafka should
instead reset to "earliest" so that the remaining uncleared data can be read.


https://kafka.apache.org/documentation/#consumerconfigs_auto.offset.reset
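
To make the difference between the two reset policies concrete, here is a simplified model of how a start offset is chosen. It is a sketch of the documented `auto.offset.reset` semantics, not Kafka client code; `resolve_start_offset` and its parameters are hypothetical names, and the offsets in the example are made up.

```python
def resolve_start_offset(requested, log_start, log_end, reset_policy):
    """Pick the offset a consumer starts from.

    Valid offsets satisfy log_start <= offset <= log_end, where log_end is
    the position of the next message to be written.
    """
    if log_start <= requested <= log_end:
        return requested
    if reset_policy == "earliest":
        return log_start
    if reset_policy == "latest":
        return log_end
    raise ValueError(f"offset {requested} is out of range and no reset policy applies")


# Retention has removed offsets 0..99; the recovery checkpoint still points at 40.
print(resolve_start_offset(40, 100, 250, "latest"))    # 250: nothing left to replay
print(resolve_start_offset(40, 100, 250, "earliest"))  # 100: replays 100..249
```

With "latest", recovery starts at the end of the log and sees none of the retained messages; with "earliest", it replays everything that has not yet been cleared.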

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-03 21:07:33 +08:00
tinswzy
f342f49b32
enhance: add support for Azure Blob Storage in wp (#44592)
#44485 
add support for blob in woodpecker

#43638 
upgrade wp v0.1.6

related wp [issue#11](https://github.com/zilliztech/woodpecker/issues/11)

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-09-29 09:51:44 +08:00
tinswzy
c7f21d5a06
enhance: purge small files right after wp segment compaction (#44473)
#43638 
improve wp log output
[wp#43](https://github.com/zilliztech/woodpecker/issues/43)
introduce purging of small files right after segment compaction
[wp#47](https://github.com/zilliztech/woodpecker/issues/47)
The rootpath configured by milvus is uniformly used as the base for wp
local fs storage.
update to v0.1.5

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-09-21 16:32:01 +08:00
yihao.dai
51f69f32d0
feat: Add CDC support (#44124)
This PR implements a new CDC service for Milvus 2.6, providing log-based
cross-cluster replication.

issue: https://github.com/milvus-io/milvus/issues/44123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-09-16 16:32:01 +08:00
Zhen Ye
5bdc593b8a
enhance: use v0.15.1 official pulsar client and add logging for pulsar client (#43913)
issue: #43785

- the pulsar client now writes its logs to the milvus logger.
- the pulsar client enables metrics by default.
- upgrade the pulsar client to v0.15.1 and use the official repo.
- the fixes from milvus-io/pulsar-client-go are already covered by
official v0.15.1.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-26 16:45:53 +08:00
tinswzy
6a342edc5a
fix: empty error returned on append timeout when MinIO is unavailable (#43926)
#43810 
Fixed an issue where an append timeout returned an empty error when
object storage was unavailable, causing the client to mistakenly believe
that the write had succeeded.
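
The class of bug described here can be illustrated with a short sketch. It is written in Python for brevity and does not reproduce the woodpecker code; `wait_append_result` and `append_succeeded` are hypothetical helpers, and the `(entry_id, error)` result shape is an assumption.

```python
import queue


def wait_append_result(results, timeout_s):
    """Wait for an append acknowledgement; return (entry_id, error)."""
    try:
        return results.get(timeout=timeout_s)
    except queue.Empty:
        # The failure mode described above corresponds to returning
        # (None, None) here, which callers would read as success.
        return None, TimeoutError(f"append not acknowledged within {timeout_s}s")


def append_succeeded(result):
    _entry_id, err = result
    return err is None


pending = queue.Queue()  # stands in for a storage backend that never answers
print(append_succeeded(wait_append_result(pending, 0.01)))  # False
```

Returning an explicit error on the timeout path is what lets the caller tell "not acknowledged" apart from "written".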

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-08-19 10:47:45 +08:00
tinswzy
084f777552
enhance: use wp internal writer without lock (#43775)
#43638  #43810 
add internal writer without session lock;
refactor and unify read state and log entry
refactor data reading related methods;
fix bug where a closed writer is reused for finalize;

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-08-18 01:15:44 +08:00
Zhen Ye
8ff118a9ff
fix: call IntoMessageProto instead of Payload when rpc (#43678)
issue: #43677

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-06 14:45:40 +08:00
Zhen Ye
3e3775fb81
fix: panics when describe collection internal failure (#43630)
issue: #43629

- also fix the scanner_switchable panic when the underlying wal scanner
returns a context error.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-29 20:33:36 +08:00
tinswzy
173efe2b98
enhance: wp metrics and update deps to v0.1.0 (#43569)
#43574   #43604 #43431  #43603 
Fix the bug where wp metrics were not registered;
update the wp dependency to v0.1.2-rc1;
improve the advanced reader with concurrent prefetching of blocks;
add a segment rolling policy based on the number of blocks;
improve concurrent compaction;
fix the release-lock-failed bug.

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-07-29 14:51:35 +08:00
Zhen Ye
648994182f
fix: pulsar use more memory for queue (#43565)
issue: #43564

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-28 14:00:56 +08:00
Zhen Ye
070aabd27e
enhance: fix remove flushing state of segment (#43560)
issue: #43559, #42884

- also fix the data loss when streaming resumes from old arch messages.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-25 18:08:54 +08:00
tinswzy
7da62698e0
enhance: improve WP parallel sync mechanism and fencing logic (#42892)
related: #42595 
improve WP parallel sync mechanism and fencing logic; remove redundant
metrics and labels

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-07-13 23:04:49 +08:00
Zhen Ye
6798fdc3b3
fix: rocksmq cannot graceful stop (#42841)
issue: #40532

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-19 19:38:39 +08:00
Zhen Ye
593662970b
fix: reuse consumer for backlog clear and use shared consumer (#42822)
issue: #42820

- fix the issue where the ro pulsar cannot be closed when upgrading milvus.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-19 19:36:48 +08:00
Zhen Ye
1f66b650e9
fix: pulsar cannot work properly if backlog exceed (#42653)
issue: #42649

- the sync operations of different pchannels are now concurrent.
- add an option to notify the backlog clear automatically.
- make pulsar walimpls recoverable from backlog exceed.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-13 14:28:37 +08:00
tinswzy
36a4b74fc0
enhance: Adjust the default parameters of WP according to performance tests (#42598)
#42595

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-06-10 16:30:35 +08:00
tinswzy
f55f900c85
fix insert hang caused by WAL writer writing to a closing logfile (#42078)
related issue #42049 
wp commit
[94de4](94de4cbc60)

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-06-03 09:58:36 +08:00
Zhen Ye
b94cee2413
fix: growing segment from old arch is not flushed after upgrading (#42164)
issue: #42162

- enhance: add read ahead buffer size issue #42129
- fix: rocksmq consumer's close operation may get stuck
- fix: growing segment from old arch is not flushed after upgrading

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:00:28 +08:00
tinswzy
1735f557ca
fix sn oom issue during small file loading in wp (#41946)
#41846  #41894 
Resolve SN OOM issue during small file loading in Woodpecker; 
Correct WP fence/close execution order;

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-05-23 01:30:28 +08:00
tinswzy
4edb1bc6f1
fix: resolve wp WALImpls concurrent read/write bug (#41763)
#41563 #41579 #41842 #41846 #41758
Upgraded the wp dependency to incorporate recent fixes addressing
multiple concurrency bugs in WALImpls.

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-05-16 12:02:27 +08:00
Zhen Ye
0a465bb5b7
enhance: use recovery+shardmanager, remove segment assignment interceptor (#41824)
issue: #41544

- add lock interceptor into wal.
- use recovery and shardmanager to replace the original implementation
of segment assignment.
- remove redundant implementation and unittest.
- remove redundant proto definition.
- use 2 streamingnode in e2e.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-14 23:00:23 +08:00
Zhen Ye
21d6d1669e
fix: wal should be reopen if wal append receive the fence error (#41807)
issue: #41544

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-14 01:02:56 +08:00
Zhen Ye
7beafe99a7
enhance: implement wal garbage collector with truncate api (#41770)
issue: #41544

- add a truncator implementation into wal recovery storage.
- add metrics for recovery storage.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-13 22:08:56 +08:00
Zhen Ye
52950ce392
enhance: add pulsar truncate api to protect pulsar unconsumed message (#41724)
issue: #41465

- implement truncate api for pulsar based on durable subscription.
- truncate api can only be called if wal is read-write.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-11 20:50:55 +08:00
tinswzy
b36ed03141
enhance: Add Truncate Interface to WALImpls for Log Retention Control (#41517)
#41465 Add Truncate Interface to WALImpls for Proactive Log Retention
Management

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-04-29 14:36:49 +08:00
Zhen Ye
dfbb02a5f7
enhance: make streaming message as a log field for easier coding (#41545)
issue: #41544

- make messages loggable as a zap field.
- fix excessive slow logs for woodpecker.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-28 14:38:42 +08:00
tinswzy
6fa68c1f16
enhance: Support Woodpecker as a WAL storage option for Milvus (#41095)
#40916 Support Woodpecker as a WAL storage option for Milvus

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-04-20 22:22:42 +08:00
Zhen Ye
6982f007e2
enhance: add walimpls access mode options (#40591)
issue: #40532

Signed-off-by: chyezh <chyezh@outlook.com>
2025-03-14 10:58:11 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
Zhen Ye
0988807160
enhance: enable write ahead buffer for streaming service (#39771)
issue: #38399

- Make a timetick-commit-based write ahead buffer at write side.
- Add a switchable scanner at read side to transfer the state between
catchup and tailing read
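
The hand-off between the two read modes can be sketched as follows. The commit message does not state the exact switching condition, so this is only one plausible reading, written in Python for brevity: `switchable_scan` is a hypothetical function, integer message ids are an assumption, and the buffer is assumed to hold the newest suffix of the WAL.

```python
def switchable_scan(wal_ids, buffer_ids, start_id):
    """Yield (mode, message_id) pairs starting at start_id.

    wal_ids:    ascending ids persisted in the WAL.
    buffer_ids: ascending ids still held in the write ahead buffer.
    """
    buffer_start = buffer_ids[0] if buffer_ids else None
    # Catch-up phase: read from the WAL until the buffer can take over.
    for msg_id in wal_ids:
        if msg_id < start_id:
            continue
        if buffer_start is not None and msg_id >= buffer_start:
            break
        yield ("catchup", msg_id)
    # Tailing phase: serve everything else straight from the buffer.
    for msg_id in buffer_ids:
        if msg_id >= start_id:
            yield ("tailing", msg_id)


print(list(switchable_scan(range(10), [7, 8, 9], 5)))
# [('catchup', 5), ('catchup', 6), ('tailing', 7), ('tailing', 8), ('tailing', 9)]
```

A reader that starts far behind pays the cost of reading the underlying WAL only until it reaches data that is still buffered.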

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-12 20:38:46 +08:00
Zhen Ye
bb8d1ab3bf
enhance: make new go package to manage proto (#39114)
issue: #39095

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:49:01 +08:00
Zhen Ye
285289d5b0
enhance: implement kafka for wal (#38598)
issue: #38399

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-12-25 10:22:50 +08:00
Zhen Ye
69a9fd6ead
enhance: enable rmq for streaming (#38669)
issue: #38399

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-12-24 20:24:48 +08:00
Zhen Ye
99dff06391
enhance: using streaming service in insert/upsert/flush/delete/querynode (#35406)
issue: #33285

- using streaming service in insert/upsert/flush/delete/querynode
- fixup flusher bugs and refactor the flush operation
- enable streaming service for dml and ddl
- pass the e2e when enabling streaming service
- pass the integration test when enabling streaming service

---------

Signed-off-by: chyezh <chyezh@outlook.com>
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-29 10:03:08 +08:00
Zhen Ye
4d69898cb2
enhance: support single pchannel level transaction (#35289)
issue: #33285

- support transaction on single wal.
- last confirmed message id can still be used when transactions are enabled.
- add fence operation for segment allocation interceptor.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-19 21:22:56 +08:00
chyezh
c725416288
enhance: move streaming proto into pkg (#35284)
issue: #33285

- move streaming related proto into pkg.
- add v2 message type and change flush message into v2 message.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-07 10:34:16 +08:00
chyezh
14051fed7d
enhance: streaming service client (#34656)
issue: #33285

- implement streaming service client.
- implement producing and consuming service client by streaming coord
client and streaming node client.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-05 21:38:15 +08:00
wei liu
c45f38aa61
enhance: Update protobuf-go to protobuf-go v2 (#34394)
issue: #34252

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-29 11:31:51 +08:00
chyezh
4f6cbfd520
enhance: specialized immutable and mutable message (#34951)
issue: #33285

- add specialized mutable and immutable messages to make them type safe.
- add version based constructor and type.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-25 11:57:45 +08:00
chyezh
39c7e06bc5
enhance: add message and msgstream msgpack adaptor (#34874)
issue: #33285

- make message builder and message conversion type safe
- add adaptor type and function to adapt old msgstream msgpack and
message interface

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-22 20:59:42 +08:00
chyezh
cc8f7aa110
fix: streaming service related fix patch (#34696)
issue: #33285

- add idAlloc interface
- fix binary unsafe bug for message
- fix service discovery loss when an address is repeated with a different
server id

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-16 15:49:38 +08:00
chyezh
1bc3c0b925
enhance: implement balancer at streaming coord (#34435)
issue: #33285

- add balancer implementation
- add channel count fair balance policy
- add channel assignment discover grpc service
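
The commit does not spell out how the channel count fair policy works. One plausible reading is a greedy assignment that keeps per-node channel counts within one of each other, sketched below in Python for brevity; `balance_by_channel_count` and the channel and node names are hypothetical.

```python
def balance_by_channel_count(channels, nodes):
    """Assign each channel to the node that currently holds the fewest."""
    if not nodes:
        raise ValueError("at least one node is required")
    assignment = {node: [] for node in nodes}
    for channel in sorted(channels):
        # Ties are broken by node name so the result is deterministic.
        target = min(nodes, key=lambda n: (len(assignment[n]), n))
        assignment[target].append(channel)
    return assignment


print(balance_by_channel_count(["ch0", "ch1", "ch2", "ch3", "ch4"], ["node-1", "node-2"]))
# {'node-1': ['ch0', 'ch2', 'ch4'], 'node-2': ['ch1', 'ch3']}
```

A real balancer would also have to account for existing assignments and node membership changes, which this sketch ignores.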

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-11 09:58:48 +08:00
chyezh
dfe0416a70
enhance: implement streaming node server service (#34166)
issue: #33285

- implement producing and consuming server of message
- implement management operation for streaming node server

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-09 07:58:14 +08:00
chyezh
ba04981a43
enhance: implement wal management on streaming node (#34153)
issue: #33285

- add lifetime control for wal.
- implement distributed-safe wal manager on streaming node.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-05 14:29:42 +08:00
chyezh
3563136c2a
enhance: timetick interceptor implementation (#34238)
issue: #33285

- optimize the message package
- add interceptor package to intercept append operations.
- add timetick interceptor to attach timetick properties for message.
- add timetick background task to send timetick message.
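
The interceptor idea in the list above can be sketched as a chain of callables wrapped around the underlying append. This is an illustration in Python, not the Milvus (Go) interface; `TimeTickInterceptor`, `build_append`, the dict-based message shape, and the local counter standing in for a timestamp allocator are all assumptions.

```python
import itertools


class TimeTickInterceptor:
    """Stamps each message with a monotonically increasing time tick."""

    def __init__(self):
        # Assumption: a local counter stands in for a real timestamp allocator.
        self._ticks = itertools.count(1)

    def __call__(self, message, next_append):
        message["properties"]["timetick"] = next(self._ticks)
        return next_append(message)


def build_append(interceptors, final_append):
    """Chain interceptors so each one runs before the underlying append."""
    append = final_append
    for interceptor in reversed(interceptors):
        append = (lambda msg, i=interceptor, nxt=append: i(msg, nxt))
    return append


log = []
append = build_append([TimeTickInterceptor()], lambda m: log.append(m) or len(log) - 1)
append({"payload": b"a", "properties": {}})
append({"payload": b"b", "properties": {}})
print([m["properties"]["timetick"] for m in log])  # [1, 2]
```

Because the stamping happens inside the append path, every message that reaches the log carries a time tick without the caller having to set one.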

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-02 14:42:08 +08:00
chyezh
d2bc4a53be
enhance: implement rmq and pulsar as wal (#34046)
issue: #33285

- use reader but not consumer for pulsar
- advanced test framework
- move some streaming related package into pkg

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-06-27 15:11:05 +08:00