milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
Zhen Ye	c7b5c23ff6	enhance: filter the empty timetick from consuming side (#46541 ) issue: #46540 The empty timetick is only used to sync the clock between different components in Milvus, so it can be ignored once LSN/MVCC semantics are achieved for timeticks. Currently, some components still need the empty timetick to trigger operations such as flush/tsafe, so for now the empty timetick is only slowed down to roughly one every 5 seconds. - Core invariant: with LSN/MVCC semantics, consumers only need (a) the first timetick that advances the latest-required-MVCC to unblock MVCC-dependent waits and (b) occasional periodic timeticks (~≤5s) for clock synchronization; therefore frequent non-persisted empty timeticks can be suppressed without breaking MVCC correctness. - Logic removed/simplified: per-message dispatch/consumption of frequent non-persisted empty timeticks is suppressed: an MVCC-aware filter emptyTimeTickSlowdowner (internal/util/pipeline/consuming_slowdown.go) short-circuits frequent empty timeticks in the stream pipeline (internal/util/pipeline/stream_pipeline.go), and the WAL flusher rate-limits non-persisted timetick dispatch to one emission per ~5s (internal/streamingnode/server/flusher/flusherimpl/wal_flusher.go); the delegator exposes GetLatestRequiredMVCCTimeTick to drive the filter (internal/querynodev2/delegator/delegator.go). - Why this does NOT introduce data loss or regressions: the slowdowner always refreshes latestRequiredMVCCTimeTick via GetLatestRequiredMVCCTimeTick and (1) never filters timeticks < latestRequiredMVCCTimeTick (so existing tsafe/flush waits stay unblocked) and (2) always lets the first timetick ≥ latestRequiredMVCCTimeTick pass to notify pending MVCC waits; separately, WAL flusher suppression applies only to non-persisted timeticks and still emits when the 5s threshold elapses, preserving periodic clock-sync messages used by flush/tsafe. - Enhancement summary (where it takes effect): adds GetLatestRequiredMVCCTimeTick on ShardDelegator and LastestMVCCTimeTickGetter, wires emptyTimeTickSlowdowner into NewPipelineWithStream (internal/util/pipeline), and adds WAL flusher rate-limiting + metrics (internal/streamingnode/server/flusher/flusherimpl/wal_flusher.go, pkg/metrics) to reduce CPU/dispatch overhead while keeping MVCC correctness and periodic synchronization. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2026-01-06 20:53:24 +08:00
Zhen Ye	7c575a18b0	enhance: support AckSyncUp for broadcaster, and enable it in truncate api (#46313 ) issue: #43897 also for issue: #46166 Add an ack_sync_up flag to the broadcast message header, which indicates whether the broadcast operation needs to be synced up between the streaming node and the coordinator. If ack_sync_up is false, the broadcast operation is acked once the recovery storage sees the message at the current vchannel, so the fast ack operation can be applied to speed up the broadcast operation. If ack_sync_up is true, the broadcast operation is acked only after the checkpoint of the current vchannel reaches the current message; the fast ack operation cannot be applied, because the ack operation needs to be synced up with the streaming node. E.g. if the truncate collection operation wants to call the ack once callback after all segments are flushed at the current vchannel, it should set ack_sync_up to true. TODO: the current implementation doesn't guarantee the ack sync up semantics, it only guarantees that the FastAck operation will not be applied; wait for 3.0 to implement the ack sync up semantics. Only for the truncate api now. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-12-17 16:55:17 +08:00
sijie-ni-0214	f51de1a8ab	feat: support TruncateCollection api to clear collection data (#46167 ) issue: https://github.com/milvus-io/milvus/issues/46166 --------- Signed-off-by: sijie-ni-0214 <sijie.ni@zilliz.com>	2025-12-12 10:31:14 +08:00
yihao.dai	f32f2694bc	enhance: Implement new FlushAllMessage and refactor flush all (#45920 ) This PR: 1. Define and implement the new FlushAllMessage. 2. Refactor FlushAll to flush the entire cluster. issue: https://github.com/milvus-io/milvus/issues/45919 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-12-10 19:27:13 +08:00
Zhen Ye	576084fe86	enhance: support alter collection/database with WAL-based DDL framework (#45266 ) issue: #43897 - Alter collection/database is implemented by WAL-based DDL framework now. - Support AlterCollection/AlterDatabase in wal now. - Alter operation can be synced by new CDC now. - Refactor some UT for alter DDL. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-04 09:59:33 +08:00
Zhen Ye	309d564796	enhance: support collection and index with WAL-based DDL framework (#45033 ) issue: #43897 - Part of collection/index related DDL is implemented by WAL-based DDL framework now. - Support the following message types in wal: CreateCollection, DropCollection, CreatePartition, DropPartition, CreateIndex, AlterIndex, DropIndex. - Part of collection/index related DDL can be synced by new CDC now. - Refactor some UT for collection/index DDL. - Add Tombstone scheduler to manage the tombstone GC for collection or partition meta. - Move the vchannel allocation into streaming pchannel manager. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-10-30 14:24:08 +08:00
Zhen Ye	c171280f63	enhance: support replicate message in wal. (#44456 ) issue: #44123 - support replicate message in wal of milvus. - support CDC-replicate recovery from wal. - fix some CDC replicator bugs Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-22 17:06:11 +08:00
yihao.dai	51f69f32d0	feat: Add CDC support (#44124 ) This PR implements a new CDC service for Milvus 2.6, providing log-based cross-cluster replication. issue: https://github.com/milvus-io/milvus/issues/44123 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-09-16 16:32:01 +08:00
Zhen Ye	3b01388587	fix: use recovery snapshot checkpoint if no vchannel is on-recovering (#44246 ) issue: #44194 Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-10 14:15:56 +08:00
Zhen Ye	cbe4c3d231	enhance: get cchannel before build message (#44229 ) issue: #43897 - support never expire txn message. Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-10 11:09:57 +08:00
Zhen Ye	a86b6f2a54	enhance: extend the stats manage at streaming shard manager for L0 (#43371 ) issue: #42416 - Rename the InsertMetric into ModifiedMetric. - Add L0 control configuration. - Add collection of some current L0 state. Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-18 20:41:46 +08:00
Zhen Ye	7b005c48bf	enhance: support util template generation for messages (#43881 ) issue: #43880 Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-18 01:19:44 +08:00
Zhen Ye	e9ab73e93d	enhance: add schema version at recovery storage (#43500 ) issue: #43072, #43289 - manage the schema version at recovery storage. - update the schema when creating collection or alter schema. - get schema at write buffer based on version. - recover the schema when upgrading from 2.5. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-23 21:38:54 +08:00
Zhen Ye	15a6631147	enhance: add quota limit based on sn consuming lag (#43105 ) issue: #42995 - The consuming lag at streaming node will be reported to coordinator. - The consuming lag will trigger the write limit and deny by quota center. - Set the ttProtection by default. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-11 14:10:49 +08:00
yihao.dai	e6da4a64b5	fix: Pre-check import message to prevent pipeline block indefinitely (#42415 ) Pre-check import message to prevent pipeline block indefinitely. issue: https://github.com/milvus-io/milvus/issues/42414 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-06-11 13:40:38 +08:00
Zhen Ye	fc010e44a8	fix: release memory after pop from heap (#42482 ) issue: #42481 Signed-off-by: chyezh <chyezh@outlook.com>	2025-06-04 10:00:32 +08:00
Zhen Ye	e479467582	fix: panic when upgrading from old arch (#42422 ) issue: #42405 - add delete rows into header when upsert. Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-31 22:56:29 +08:00
Zhen Ye	66cc194ab2	enhance: add partition gc at streaming arch (#42179 ) issue: #41976 - make drop partition message as a broadcast message. - add gc when drop partition message is acked. - add a call back to handle the broadcast message when ack. - the ack operation of broadcast message will retry until success. Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-29 23:20:30 +08:00
Zhen Ye	b94cee2413	fix: growing segment from old arch is not flushed after upgrading (#42164 ) issue: #42162 - enhance: add read ahead buffer size issue #42129 - fix: rocksmq consumer's close operation may get stuck - fix: growing segment from old arch is not flushed after upgrading --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-29 23:00:28 +08:00
Zhen Ye	c9b0748ff9	enhance: add delete rows into delete msg header and more metric (#41952 ) issue: #41544 - add delete rows into delete message header - add more insert/delete metrics - fix non-broadcast message has broadcast header Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-22 20:28:26 +08:00
Zhen Ye	458ab86894	fix: stop retry if collection not found too much when get recovery from coord (#41980 ) issue: #41966 Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-22 16:22:24 +08:00
Zhen Ye	59ab274dbe	fix: use flusher and recovery checkpoint together to determine the truncate position (#41934 ) issue: #41544 - unify the log field of message - use the minimum one of flusher and recovery storage checkpoint as the truncate position Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-20 16:10:24 +08:00
Zhen Ye	59dff668dc	enhance: schema change without manual flush (#41882 ) issue: #39718 - remove the manual flush message from schema change operation - add flush segment id handle into schema change processes Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: congqixia <congqi.xia@zilliz.com>	2025-05-19 10:14:22 +08:00
Zhen Ye	a3d5ad135e	fix: recover a dropped collection from wal if create collection message can be seen (#41902 ) issue: #41654 Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-17 07:38:21 +08:00
Zhen Ye	0a465bb5b7	enhance: use recovery+shardmanager, remove segment assignment interceptor (#41824 ) issue: #41544 - add lock interceptor into wal. - use recovery and shardmanager to replace the original implementation of segment assignment. - remove redundant implementation and unittest. - remove redundant proto definition. - use 2 streamingnode in e2e. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-14 23:00:23 +08:00
Zhen Ye	e675da76e4	enhance: simplify the proto message, make segment assignment code more clean (#41671 ) issue: #41544 - simplify the proto message for flush and create segment. - simplify the msg handler for flowgraph. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-11 20:49:00 +08:00
Zhen Ye	452d6fb709	fix: write buffer leak if the wal flusher is cancelled when recovery (#41719 ) issue: #41715 Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-10 09:32:56 +08:00
Zhen Ye	de8f0af20d	enhance: use dispatcher at delegator when enable streaming (#41266 ) issue: #38399 - add an adaptor type to adapt the streaming service client and msgstream client to reuse the msgdispatcher. Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-06 01:12:53 +08:00
Zhen Ye	dfbb02a5f7	enhance: make streaming message as a log field for easier coding (#41545 ) issue: #41544 - implement message can be logged as a field by zap. - fix too many slow log for woodpecker. Signed-off-by: chyezh <chyezh@outlook.com>	2025-04-28 14:38:42 +08:00
Zhen Ye	ecfc868dcb	fix: write buffer not unregistered when datasyncservice is gone (#41496 ) issue: #41495 Signed-off-by: chyezh <chyezh@outlook.com>	2025-04-24 19:38:38 +08:00
congqixia	b36c88f3c8	enhance: [AddField] Broadcast schema change via WAL (#41373 ) Related to #39718 Add Broadcast logic for collection schema change and notifies: - Streamnode - Delegator - Streamnode - Flush component - QueryNodes via grpc --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-22 16:28:37 +08:00
Zhen Ye	9339bccccc	enhance: move sent first timeticksync, make recovery easier (#41405 ) issue: #38399 Signed-off-by: chyezh <chyezh@outlook.com>	2025-04-21 17:18:37 +08:00
Xianhui Lin	f9febe3bae	enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006 ) Merge RootCoord, DataCoord And QueryCoord into MixCoord Make Session into one issue : https://github.com/milvus-io/milvus/issues/37764 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-04-11 16:36:30 +08:00
Zhen Ye	224728c2d2	fix: catchup cannot work if using StartAfter (#41201 ) issue: #41062 Signed-off-by: chyezh <chyezh@outlook.com>	2025-04-10 19:04:27 +08:00
Ted Xu	1bcea2a775	fix: assigning the correct storage version in sync and index tasks (#41093 ) See #39663 #40667 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-04-08 10:14:25 +08:00
Zhen Ye	af80a4dac2	fix: auto flush all segment that is not created by streaming service (#40767 ) issue: #40532 Signed-off-by: chyezh <chyezh@outlook.com>	2025-03-26 16:32:22 +08:00
Zhen Ye	f6fb4bc442	fix: backoff will retry infinitely after reaching max elapse (#40589 ) issue: #40588 Signed-off-by: chyezh <chyezh@outlook.com>	2025-03-13 16:24:06 +08:00
Zhen Ye	5735c3ef19	fix: too many memory usage of streaming node (#40606 ) issue: #40592 Signed-off-by: chyezh <chyezh@outlook.com>	2025-03-13 07:10:07 +08:00
sthuang	63a7c4570e	feat: storage v2 sync (#39663 ) related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-03-05 11:22:15 +08:00
Zhen Ye	84df80b5e4	enhance: refactor metrics of streaming (#40031 ) issue: #38399 - add metrics for broadcaster component. - add metrics for wal flusher component. - add metrics for wal interceptors. - add slow log for wal. - add more labels for some wal metrics. (local or remote/catchup or tailing...) Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-25 12:25:56 +08:00
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990 ) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
SimFG	047254665d	feat: support to replicate import msg (#39171 ) - issue: #39849 --------- Signed-off-by: SimFG <bang.fu@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-02-16 00:08:13 +08:00
Zhen Ye	034575396f	fix: streaming consume checkpoint is always nil and limit resource of ci (#39781 ) issue: #38399 - fix the nil pointer bug - limit the resource usage for streaming e2e - enhance the go test - fix: rootcoord block when graceful stop --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-13 19:18:14 +08:00
Zhen Ye	0988807160	enhance: enable write ahead buffer for streaming service (#39771 ) issue: #38399 - Make a timetick-commit-based write ahead buffer at write side. - Add a switchable scanner at read side to transfer the state between catchup and tailing read Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-12 20:38:46 +08:00
Zhen Ye	d3e32bb599	enhance: make pchannel level flusher (#39275 ) issue: #38399 - Add a pchannel level checkpoint for flush processing - Refactor the recovery of flushers of wal - make a shared wal scanner first, then make multi datasyncservice on it Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-10 16:32:45 +08:00
Zhen Ye	5669016af0	enhance: erase the rpc level when wal is located at same node (#38858 ) issue: #38399 - Make the wal scanner interface same with streaming scanner. - Use wal if the wal is located at current node. - Otherwise fallback the old logic. Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-05 22:25:10 +08:00
Zhen Ye	c84a0748c4	enhance: add rw/ro streaming query node replica management (#38677 ) issue: #38399 - Embed the query node into streaming node to make delegator available at streaming node. - The embedded query node has a special server label `QUERYNODE_STREAMING-EMBEDDED`. - Change the balance strategy to make the channel assigned to streaming node as much as possible. Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-24 16:55:07 +08:00
aoiasd	9cb4c4e8ac	fix: bm25 import segment without bm25 stats meta (#38855 ) relate: https://github.com/milvus-io/milvus/issues/38854 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-01-21 11:09:04 +08:00
Zhen Ye	bb8d1ab3bf	enhance: make new go package to manage proto (#39114 ) issue: #39095 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-10 10:49:01 +08:00
Zhen Ye	3bcdd92915	enhance: add broadcast for streaming service (#39020 ) issue: #38399 - Add a new rpc to transfer broadcasts to the streaming coord - Add a broadcast service at the streaming coord so that broadcast messages are sent atomically Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-09 16:24:55 +08:00

65 Commits