348 Commits

Author SHA1 Message Date
aoiasd
ee216877bb
enhance: support compaction with file resource in ref mode (#46399)
Add support for DataNode compaction using file resources in ref mode.
SortCompation and StatsJobs will build text indexes, which may use file
resources.
relate: https://github.com/milvus-io/milvus/issues/43687

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: file resources (analyzer binaries/metadata) are only
fetched, downloaded and used when the node is configured in Ref mode
(fileresource.IsRefMode via CommonCfg.QNFileResourceMode /
DNFileResourceMode); Sync now carries a version and managers track
per-resource versions/resource IDs so newer resource sets win and older
entries are pruned (RefManager/SynchManager resource maps).
- Logic removed / simplified: component-specific FileResourceMode flags
and an indirection through a long-lived BinlogIO wrapper were
consolidated — file-resource mode moved to CommonCfg, Sync/Download APIs
became version- and context-aware, and compaction/index tasks accept a
ChunkManager directly (binlog IO wrapper creation inlined). This
eliminates duplicated config checks and wrapper indirection while
preserving the same chunk/IO semantics.
- Why no data loss or behavior regression: all file-resource code paths
are gated by the configured mode (default remains "sync"); when not in
ref-mode or when no resources exist, compaction and stats flows follow
existing code paths unchanged. Versioned Sync + resourceID maps ensure
newly synced sets replace older ones and RefManager prunes stale files;
GetFileResources returns an error if requested IDs are missing (prevents
silent use of wrong resources). Analyzer naming/parameter changes add
analyzer_extra_info but default-callers pass "" so existing analyzers
and index contents remain unchanged.
- New capability: DataNode compaction and StatsJobs can now build text
indexes using external file resources in Ref mode — DataCoord exposes
GetFileResources and populates CompactionPlan.file_resources;
SortCompaction/StatsTask download resources via fileresource.Manager,
produce an analyzer_extra_info JSON (storage + resource->id map) via
analyzer.BuildExtraResourceInfo, and propagate analyzer_extra_info into
BuildIndexInfo so the tantivy bindings can load custom analyzers during
text index creation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2026-01-06 16:31:31 +08:00
Bingyi Sun
f9827392bb
enhance: implement external collection update task with source change detection (#45905)
issue: #45881 
Add persistent task management for external collections with automatic
detection of external_source and external_spec changes. When source
changes, the system aborts running tasks and creates new ones, ensuring
only one active task per collection. Tasks validate their source on
completion to prevent superseded tasks from committing results.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: at most one active UpdateExternalCollection task
exists per collection — tasks are serialized by collectionID
(collection-level locking) and any change to external_source or
external_spec aborts superseded tasks and causes a new task creation
(externalCollectionManager + external_collection_task_meta
collection-based locks enforce this).
- What was simplified/removed: per-task fine-grained locking and
concurrent multi-task acceptance per collection were replaced by
collection-level synchronization (external_collection_task_meta.go) and
a single persistent task lifecycle in DataCoord/Index task code;
redundant double-concurrent update paths were removed by checking
existing task presence in AddTask/LoadOrStore and aborting/overwriting
via Drop/Cancel flows.
- Why this does NOT cause data loss or regress behavior: task state
transitions and commit are validated against the current external
source/spec before applying changes — UpdateStateWithMeta and SetJobInfo
verify task metadata and persist via catalog only under matching
collection-state; DataNode externalCollectionManager persists task
results to in-memory manager and exposes Query/Drop flows (services.go)
without modifying existing segment data unless a task successfully
finishes and SetJobInfo atomically updates segments via meta/catalog
calls, preventing superseded tasks from committing stale results.
- New capability added: end-to-end external collection update workflow —
DataCoord Index task + Cluster RPC helpers + DataNode external task
runner and ExternalCollectionManager enable creating, querying,
cancelling, and applying external collection updates
(fragment-to-segment balancing, kept/updated segment handling, allocator
integration); accompanying unit tests cover success, failure,
cancellation, allocator errors, and balancing logic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-12-29 19:53:21 +08:00
cai.zhang
7fca6e759f
enhance: Execute text indexes for multiple fields concurrently (#46279)
issue: #46274 

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Performance Improvements**
* Field-level text index creation and JSON-key statistics now run
concurrently, reducing overall indexing time and speeding task
completion.

* **Observability Enhancements**
* Per-task and per-field logging expanded with richer context and
per-phase elapsed-time reporting for improved monitoring and
diagnostics.

* **Refactor**
* Node slot handling simplified to compute slot counts on demand instead
of storing them.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-12-23 21:05:18 +08:00
aoiasd
354ab2f55e
enhance: sync file resource to querynode and datanode (#44480)
relate:https://github.com/milvus-io/milvus/issues/43687
Support use file resource with sync mode.
Auto download or remove file resource to local when user add or remove
file resource.
Sync file resource to node when find new node session.

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-04 16:23:11 +08:00
Zhen Ye
2ef18c5b4f
enhance: remove watch at session liveness check (#45968)
issue: #45724

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-01 17:55:10 +08:00
Chun Han
da156981c6
feat: milvus support posix-compatible mode(milvus-io#43942) (#43944)
related: #43942

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-08-27 16:29:50 +08:00
XuanYang-cn
37a447d166
feat: Add CMEK cipher plugin (#43722)
1. Enable Milvus to read cipher configs
2. Enable cipher plugin in binlog reader and writer
3. Add a testCipher for unittests
4. Support pooling for datanode
5. Add encryption in storagev2

See also: #40321 
Signed-off-by: yangxuan <xuan.yang@zilliz.com>

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-08-27 11:15:52 +08:00
cai.zhang
f6b2a71c95
enhance: Remove chunkmanager-related dependencies from datanode (#43021)
issue: #41611

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-03 14:44:45 +08:00
cai.zhang
a9dcd4a380
enhance: ChunkManager is no longer created during datanode initialization (#42791)
issue: #41611

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-17 17:06:38 +08:00
yihao.dai
dccfc69660
enhance: Get compaction params from request (#41125)
Make DataNode use compaction parameters from request instead of
configuration.

issue: https://github.com/milvus-io/milvus/issues/41123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-04-15 10:28:53 +08:00
Xianhui Lin
f9febe3bae
enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006)
Merge RootCoord, DataCoord And QueryCoord into MixCoord
Make Session into one
issue : https://github.com/milvus-io/milvus/issues/37764

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-11 16:36:30 +08:00
XuanYang-cn
e7a53da025
enhance: remove not inused util/* in datanode (#41177)
See also: #41229

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-04-11 10:34:29 +08:00
cai.zhang
8a77fb9cdc
enhance: Support slot for index task and stats task (#39084)
issue: #39101

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-04-08 20:46:25 +08:00
sthuang
d7df78a6c9
feat: Storage v2 compaction (#40667)
- Feat: Support Mix compaction. Covering tests include compatibility and
rollback ability.
  - Read v1 segments and compact with v2 format.
  - Read both v1 and v2 segments and compact with v2 format.
  - Read v2 segments and compact with v2 format.
  - Compact with duplicate primary key test.
  - Compact with bm25 segments.
  - Compact with merge sort segments.
  - Compact with no expiration segments.
  - Compact with lack binlog segments.
  - Compact with nullable field segments.
- Feat: Support Clustering compaction. Covering tests include
compatibility and rollback ability.
  - Read v1 segments and compact with v2 format.
  - Read both v1 and v2 segments and compact with v2 format.
  - Read v2 segments and compact with v2 format.
  - Compact bm25 segments with v2 format.
  - Compact with memory limit.
- Enhance: Use serdeMap serialize in BuildRecord function to support all
Milvus data types.
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-21 10:16:12 +08:00
yihao.dai
b2a8694686
enhance: Merge IndexNode and DataNode (#40272)
Merge DataNode and IndexNode into DataNode.

issue: https://github.com/milvus-io/milvus/issues/39115

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-13 14:26:11 +08:00
XuanYang-cn
837ac295fa
enhance: Remove iterators in datanode (#40301)
Iterators are long deprecated, but sort are still using it. This PR
unifies stats task with the latest compaction common functions and
remove the usage of iterators.

1. Rename `datanode/compaction` to `datanode/compactor`
2. Add `internal/compaction` and move some compaction commons into it.
3. Replace `DeltalogIterators` with `ComposeDeleteFromDeltalogs`
4. Remove `datanode/iterators`

See also: #39242

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-04 12:14:00 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
SimFG
047254665d
feat: support to replicate import msg (#39171)
- issue: #39849

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-02-16 00:08:13 +08:00
jaime
f03a85725a
enhance: add db name in replica (#38672)
issue: #36621

Signed-off-by: jaime <yun.zhang@zilliz.com>
2025-01-09 19:40:59 +08:00
jaime
29e620fa6d
fix: sync task still running after DataNode has stopped (#38377)
issue: #38319

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-17 18:06:44 +08:00
tinswzy
27229f7907
enhance: refine exists log print with ctx (#38080)
issue: #35917 
Refines exists log print with ctx

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-14 22:36:44 +08:00
jaime
7bbfe86bcd
enhance: add list index and segment index retrieval API for WebUI (#37861)
issue: #36621

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-11-22 16:58:34 +08:00
jaime
f348bd9441
feat: add segment,pipeline, replica and resourcegroup api for WebUI (#37344)
issue: #36621

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-11-07 11:52:25 +08:00
jaime
9d16b972ea
feat: add tasks page into management WebUI (#37002)
issue: #36621

1. Add API to access task runtime metrics, including:
  - build index task
  - compaction task
  - import task
- balance (including load/release of segments/channels and some leader
tasks on querycoord)
  - sync task
2. Add a debug model to the webpage by using debug=true or debug=false
in the URL query parameters to enable or disable debug mode.

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-10-28 10:13:29 +08:00
XuanYang-cn
b172ea1093
fix: Remove enableLevelZeroSegment config (#36535)
See also: #36504

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-17 11:59:24 +08:00
yihao.dai
9e8cafcbe2
enhance: Skip loading bf in datanode (#36367)
Skip loading bf in datanode:
1. When watching vchannels, skip loading bloom filters for segments.
2. Bypass bloom filter checks for delete messages, directly writing to
L0 segments.
3. Remove flushed segments proactively after flush.

issue: https://github.com/milvus-io/milvus/issues/34585

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-09-26 10:11:15 +08:00
congqixia
3352030a84
enhance: Graceful stop flowgraph manager when stopping datanode (#36229)
Flowgraph manager is not stopped durong datanode stopping procedure
which may lead to unexpect flowgraph behavior during/after datanode stop
progress.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-14 15:55:08 +08:00
Zhen Ye
99dff06391
enhance: using streaming service in insert/upsert/flush/delete/querynode (#35406)
issue: #33285

- using streaming service in insert/upsert/flush/delete/querynode
- fixup flusher bugs and refactor the flush operation
- enable streaming service for dml and ddl
- pass the e2e when enabling streaming service
- pass the integration tst when enabling streaming service

---------

Signed-off-by: chyezh <chyezh@outlook.com>
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-29 10:03:08 +08:00
yihao.dai
a4439cc911
enhance: Implement flusher in streamingNode (#34942)
- Implement flusher to:
  - Manage the pipelines (creation, deletion, etc.)
  - Manage the segment write buffer
  - Manage sync operation (including receive flushMsg and execute flush)
- Add a new `GetChannelRecoveryInfo` RPC in DataCoord.
- Reorganize packages: `flushcommon` and `datanode`.

issue: https://github.com/milvus-io/milvus/issues/33285

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-02 18:30:23 +08:00
yihao.dai
8aab6cbfac
enhance: Organize the common modules of streamingNode and dataNode (#34773)
1. Move the common modules of streamingNode and dataNode to flushcommon
2. Add new GetVChannels interface for rootcoord

issue: https://github.com/milvus-io/milvus/issues/33285

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-22 11:33:51 +08:00
yihao.dai
4e5f1d5f75
enhance: Pre-allocate ids for import (#33958)
The import is dependent on syncTask, which in turn relies on the
allocator. This PR pre-allocate the necessary IDs for import syncTask.

issue: https://github.com/milvus-io/milvus/issues/33957

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-07 21:26:14 +08:00
jaime
21fc5f5d46
enhance: Remove datanode reporting TT based on MQ implementation (#34421)
issue: #34420

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-05 15:48:09 +08:00
jaime
d6afb31b94
enhance: make subfunctions of datanode component modular (#33992)
issue: #33994

also remove deprecated channel manager based on the etcd implementation

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-01 14:46:07 +08:00
jaime
9630974fbb
enhance: move rocksmq from internal to pkg module (#33881)
issue: #33956

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-25 21:18:15 +08:00
yihao.dai
86a36b105a
enhance: Tidy compaction executor (#33778)
Move compaction executor to compaction pacakge.

issue: https://github.com/milvus-io/milvus/issues/32451

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-14 14:34:01 +08:00
XuanYang-cn
4dd0c54ca0
fix: Fix l0 compactor may cause DN from OOM (#33554)
See also: #33547

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-06 14:33:52 +08:00
cai.zhang
77637180fa
enhance: Periodically synchronize segments to datanode watcher (#33420)
issue: #32809

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-05-30 13:37:44 +08:00
yihao.dai
895799ec61
enhance: Abstract Execute interface for import/preimport task (#33234)
Abstract Execute interface for import/preimport task, simplify import
scheduler.

issue: https://github.com/milvus-io/milvus/issues/33157

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-05-23 11:29:41 +08:00
congqixia
892fe66b57
enhance: Refine channelCpUpdater field & test (#33083)
Avoid passing datanode around preparing datanode code directory
refactory.

Also refine unit test code for same component. The `Await` shall return
first before checking the counter number since when lock cost is heavy
(using deadlock.RWMutex See PR #33069.) case may fail due to long
running time submitting tasks.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-16 14:19:34 +08:00
yihao.dai
a984e46a29
enhance: Remove rootcoord from datanode broker (#32818)
issue: https://github.com/milvus-io/milvus/issues/32827

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-05-14 10:03:32 +08:00
yiwangdr
d6e537c91c
fix: allow datanode's server id to be updated (#31597)
issue: #31516

background: the server id field in data node is redundant. session id
already provides the source of truth.

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-05-08 14:03:29 +08:00
yiwangdr
b1eacb2ae8
feat: datacoord/node watch based on rpc (#32036)
issue: https://github.com/milvus-io/milvus/issues/25309

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-05-07 15:49:30 +08:00
SimFG
1af084ea6b
enhance: Make datanode exit and case TestProxy faster (#32218)
/kind improvement
issue: #32219

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-04-16 10:49:20 +08:00
XuanYang-cn
aad3ed3835
fix: [cherry-pick]Skip changing meta if nodeID not match with channel (#31672)
See also: #31648
pr: #31665, #31694

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-10 15:09:18 +08:00
yihao.dai
4e264003bf
enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629)
Feature Introduced:
1. Ensure ImportV2 waits for the index to be built

Enhancements Introduced:
1. Utilization of local time for timeout ts instead of allocating ts
from rootcoord.
3. Enhanced input file length check for binlog import.
4. Removal of duplicated manager in datanode.
5. Renaming of executor to scheduler in datanode.
6. Utilization of a thread pool in the scheduler in datanode.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-01 20:09:13 +08:00
XuanYang-cn
39337e09b8
fix: Using zero serverID for metrics (#31518)
Fixes: #31516

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-01 16:55:19 +08:00
congqixia
d9efea2fea
fix: Cleanup write buffer when flowgraph released (#31376)
See also #30137

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-19 01:33:05 +08:00
jaime
db79be3ae0
fix: ctx cancel should be the last step while stopping server (#31220)
issue: #31219

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-15 10:33:05 +08:00
XuanYang-cn
a52a52064d
fix: Use lock and map instead of concurrentMap (#31212)
See also: #31209

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-14 18:39:04 +08:00
yihao.dai
c411cb4a49
enhance: Prevent the backlog of channelCP update tasks, perform batch updates of channelCPs (#30941)
This PR includes the following adjustments:
1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.

issue: https://github.com/milvus-io/milvus/issues/30004

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2024-03-07 20:39:02 +08:00