37 Commits

Author SHA1 Message Date
congqixia
c01fd94a6a
enhance: integrate Storage V2 FFI interface for unified storage access (#45723)
Related #44956
This commit integrates the Storage V2 FFI (Foreign Function Interface)
interface throughout the Milvus codebase, enabling unified storage
access through the Loon FFI layer. This is a significant step towards
standardizing storage operations across different storage versions.

1. Configuration Support
- **configs/milvus.yaml**: Added `useLoonFFI` configuration flag under
`common.storage.file.splitByAvgSize` section
- Allows runtime toggle between traditional binlog readers and new
FFI-based manifest readers
  - Default: `false` (maintains backward compatibility)

2. Core FFI Infrastructure

Enhanced Utilities (internal/core/src/storage/loon_ffi/util.cpp/h)
- **ToCStorageConfig()**: Converts Go's `StorageConfig` to C's
`CStorageConfig` struct for FFI calls
- **GetManifest()**: Parses manifest JSON and retrieves latest column
groups using FFI
  - Accepts manifest path with `base_path` and `ver` fields
  - Calls `get_latest_column_groups()` FFI function
  - Returns column group information as string
  - Comprehensive error handling for JSON parsing and FFI errors

3. Dependency Updates
- **internal/core/thirdparty/milvus-storage/CMakeLists.txt**:
  - Updated milvus-storage version from `0883026` to `302143c`
  - Ensures compatibility with latest FFI interfaces

4. Data Coordinator Changes

All compaction task builders now include manifest path in segment
binlogs:

- **compaction_task_clustering.go**: Added `Manifest:
segInfo.GetManifestPath()` to segment binlogs
- **compaction_task_l0.go**: Added manifest path to both L0 segment
selection and compaction plan building
- **compaction_task_mix.go**: Added manifest path to mixed compaction
segment binlogs
- **meta.go**: Updated metadata completion logic:
- `completeClusterCompactionMutation()`: Set `ManifestPath` in new
segment info
- `completeMixCompactionMutation()`: Preserve manifest path in compacted
segments
- `completeSortCompactionMutation()`: Include manifest path in sorted
segments

5. Data Node Compactor Enhancements

All compactors updated to support dual-mode reading (binlog vs
manifest):

6. Flush & Sync Manager Updates

Pack Writer V2 (pack_writer_v2.go)
- **BulkPackWriterV2.Write()**: Extended return signature to include
`manifest string`
- Implementation:
  - Generate manifest path: `path.Join(pack.segmentID, "manifest.json")`
  - Write packed data using FFI-based writer
  - Return manifest path along with binlogs, deltas, and stats

Task Handling (task.go)
- Updated all sync task result handling to accommodate new manifest
return value
- Ensured backward compatibility for callers not using manifest

7. Go Storage Layer Integration

New Interfaces and Implementations
- **record_reader.go**: Interface for unified record reading across
storage versions
- **record_writer.go**: Interface for unified record writing across
storage versions
- **binlog_record_writer.go**: Concrete implementation for traditional
binlog-based writing

Enhanced Schema Support (schema.go, schema_test.go)
- Schema conversion utilities to support FFI-based storage operations
- Ensures proper Arrow schema mapping for V2 storage

Serialization Updates
- **serde.go, serde_events.go, serde_events_v2.go**: Updated to work
with new reader/writer interfaces
- Test files updated to validate dual-mode serialization

8. Storage V2 Packed Format

FFI Common (storagev2/packed/ffi_common.go)
- Common FFI utilities and type conversions for packed storage format

Packed Writer FFI (storagev2/packed/packed_writer_ffi.go)
- FFI-based implementation of packed writer
- Integrates with Loon storage layer for efficient columnar writes

Packed Reader FFI (storagev2/packed/packed_reader_ffi.go)
- Already existed, now complemented by writer implementation

9. Protocol Buffer Updates

data_coord.proto & datapb/data_coord.pb.go
- Added `manifest` field to compaction segment messages
- Enables passing manifest metadata through compaction pipeline

worker.proto & workerpb/worker.pb.go
- Added compaction parameter for `useLoonFFI` flag
- Allows workers to receive FFI configuration from coordinator

10. Parameter Configuration

component_param.go
- Added `UseLoonFFI` parameter to compaction configuration
- Reads from `common.storage.file.useLoonFFI` config path
- Default: `false` for safe rollout

11. Test Updates
- **clustering_compactor_storage_v2_test.go**: Updated signatures to
handle manifest return value
- **mix_compactor_storage_v2_test.go**: Updated test helpers for
manifest support
- **namespace_compactor_test.go**: Adjusted writer calls to expect
manifest
- **pack_writer_v2_test.go**: Validated manifest generation in pack
writing

This integration follows a **dual-mode approach**:
1. **Legacy Path**: Traditional binlog-based reading/writing (when
`useLoonFFI=false` or no manifest)
2. **FFI Path**: Manifest-based reading/writing through Loon FFI (when
`useLoonFFI=true` and manifest exists)

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-24 19:57:07 +08:00
aoiasd
ed69375f00
enhance: remove resource type from file resource config (#45103)
File resource type was useless till now, remove it before new release.
relate: https://github.com/milvus-io/milvus/issues/43687

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-03 10:15:32 +08:00
Zhen Ye
309d564796
enhance: support collection and index with WAL-based DDL framework (#45033)
issue: #43897

- Part of collection/index related DDL is implemented by WAL-based DDL
framework now.
- Support following message type in wal, CreateCollection,
DropCollection, CreatePartition, DropPartition, CreateIndex, AlterIndex,
DropIndex.
- Part of collection/index related DDL can be synced by new CDC now.
- Refactor some UT for collection/index DDL.
- Add Tombstone scheduler to manage the tombstone GC for collection or
partition meta.
- Move the vchannel allocation into streaming pchannel manager.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-30 14:24:08 +08:00
congqixia
569a5b40d2
enhance: [StorageV2] add manifest path support for FFI integration (#44991)
Related to #44956

Add manifest_path field throughout the data path to support LOON Storage
V2 manifest tracking. The manifest stores metadata for segment data
files and enables the unified Storage V2 FFI interface.

Changes include:
- Add manifest_path field to SegmentInfo and SaveBinlogPathsRequest
proto messages
- Add UpdateManifest operator to datacoord meta operations
- Update metacache, sync manager, and meta writer to propagate manifest
paths
- Include manifest_path in segment load info for query coordinator

This is part of the Storage V2 FFI interface integration.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-27 19:24:10 +08:00
yihao.dai
8d11373376
enhance: Show create time for import job (#45058)
issue: https://github.com/milvus-io/milvus/issues/45056

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-10-27 12:14:08 +08:00
Bingyi Sun
58277c8eb0
feat: Auto add namespace field data if namespace is enabled (#44933)
issue: #44011

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-10-24 18:40:05 +08:00
wei liu
92d2fb6360
enhance: Add granular flush targets support for FlushAll operation (#44234)
issue: #44156
Enhance FlushAll functionality to support targeting specific collections
within databases instead of only database-level flushing.

Changes include:

- Add FlushAllTarget message in data_coord.proto for granular targeting
- Support collection-specific flush operations within databases
- Maintain backward compatibility with deprecated db_name field

This enhancement allows users to flush specific collections without
affecting other collections in the same database, providing more precise
control over data persistence operations.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-09-19 18:38:01 +08:00
zhenshan.cao
691a8df953
feat: Add RESTful api for rolling upgrade support (#44381)
issue: https://github.com/milvus-io/milvus/issues/43968

Co-authored-by: chyezh <ye.zhen@zilliz.com>
2025-09-16 20:08:00 +08:00
cai.zhang
76f6768ea1
enhance: Remove timeout for compaction task (#44277)
issue: #44272

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-09-15 11:03:58 +08:00
congqixia
f5618d5153
enhance: [StorageV2] Utilized advance split policy and persist in meta (#44282)
Related to #44257

This PR:
- Utilize configurable split policy for storage v2, enabling system
field policy
- Store split result in field binlog struct
- Adapt legacy binlog without child fields

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-10 14:47:57 +08:00
XuanYang-cn
37a447d166
feat: Add CMEK cipher plugin (#43722)
1. Enable Milvus to read cipher configs
2. Enable cipher plugin in binlog reader and writer
3. Add a testCipher for unittests
4. Support pooling for datanode
5. Add encryption in storagev2

See also: #40321 
Signed-off-by: yangxuan <xuan.yang@zilliz.com>

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-08-27 11:15:52 +08:00
aoiasd
eca51ed2c6
enhance: add file resource api (#43766)
relate: https://github.com/milvus-io/milvus/issues/43687

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-08 14:17:41 +08:00
wei liu
1fae8f5ae3
enhance: Optimize FlushAll performance for multi-table scenarios (#43339)
Replace multiple per-table flush RPC calls with single FlushAll RPC to
improve performance in multi-table scenarios.
issue: #43338
- Implement server-side FlushAll request processing in
DataCoord/MixCoord
- Add flushAllTask to handle unified flush operations across all tables
- Replace proxy-side per-table flush iteration with single RPC call
- Support both streaming and non-streaming service execution paths
- Add comprehensive unit tests for new FlushAll implementation

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-30 15:37:37 +08:00
Zhen Ye
cd38d65417
fix: make savebinlogpath idompotent at binlog level (#43615)
issue: #43574

- update all binlog every time when calling udpate savebinlogpath.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-29 19:47:36 +08:00
Zhen Ye
e9ab73e93d
enhance: add schema version at recovery storage (#43500)
issue: #43072, #43289

- manage the schema version at recovery storage.
- update the schema when creating collection or alter schema.
- get schema at write buffer based on version.
- recover the schema when upgrading from 2.5.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-23 21:38:54 +08:00
yihao.dai
a839017e81
fix: Handle retry state in import task (#43474)
issue: https://github.com/milvus-io/milvus/issues/43473

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-22 14:52:53 +08:00
cai.zhang
3ffd44f302
fix: Fix remaining issues with Datanode pooling and StorageV2 (#43147)
issue: #43146

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-10 14:26:48 +08:00
cai.zhang
6989e18599
enhance: Move sort stats task to sort compaction (#42562)
issue: #42560

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-08 20:22:47 +08:00
yihao.dai
9cbd194c6b
fix: Prevent import from generating small binlogs (#43132)
- Introduce dynamic buffer sizing to avoid generating small binlogs
during import
- Refactor import slot calculation based on CPU and memory constraints
- Implement dynamic pool sizing for sync manager and import tasks
according to CPU core count

issue: https://github.com/milvus-io/milvus/issues/43131

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-07 21:32:47 +08:00
cai.zhang
a9dcd4a380
enhance: ChunkManager is no longer created during datanode initialization (#42791)
issue: #41611

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-17 17:06:38 +08:00
Zhen Ye
b94cee2413
fix: growing segment from old arch is not flushed after upgrading (#42164)
issue: #42162

- enhance: add read ahead buffer size issue #42129
- fix: rocksmq consumer's close operation may get stucked
- fix: growing segment from old arch is not flushed after upgrading

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:00:28 +08:00
yihao.dai
83c9527e70
enhance: Use QuerySlot interface for tasks (#41989)
Use `QuerySlot` rpc instead of `QueryTask` for querying slot.

issue: https://github.com/milvus-io/milvus/issues/41123

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-23 10:30:28 +08:00
Zhen Ye
d32b802752
fix: remove the max_row_num for segment meta (#41680)
issue: #41544

- remove the estimate logic, so the create segment operation will not
check the collection meta anymore

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-09 20:46:53 +08:00
yihao.dai
dccfc69660
enhance: Get compaction params from request (#41125)
Make DataNode use compaction parameters from request instead of
configuration.

issue: https://github.com/milvus-io/milvus/issues/41123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-04-15 10:28:53 +08:00
Xianhui Lin
f9febe3bae
enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006)
Merge RootCoord, DataCoord And QueryCoord into MixCoord
Make Session into one
issue : https://github.com/milvus-io/milvus/issues/37764

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-11 16:36:30 +08:00
Xianhui Lin
3bc24c264f
enhance: Add json key inverted index in stats for optimization (#38039)
Add json key inverted index in stats for optimization
https://github.com/milvus-io/milvus/issues/36995

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-04-10 15:20:28 +08:00
smellthemoon
cb1e86e17c
enhance: support add field (#39800)
after the pr merged, we can support to insert, upsert, build index,
query, search in the added field.
can only do the above operates in added field after add field request
complete, which is a sync operate.

compact will be supported in the next pr.
#39718

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-04-02 14:24:31 +08:00
yihao.dai
5b78ef0a49
fix: Fix delete data loss due to duplicate binlogID (#40960)
With concurrenct L0 compaction
(https://github.com/milvus-io/milvus/pull/36816), delta logs might be
written to the same L1 segment, causing logID duplication when using the
incremental beginLogID. This PR removes the beginLogID mechanism and
instead passes a log ID range, where the number of IDs in the range
equals the number of compaction segment binlogs multiplied by an
expansion factor.

issue: https://github.com/milvus-io/milvus/issues/40207

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-04-01 10:36:22 +08:00
Zhen Ye
af80a4dac2
fix: auto flush all segment that is not created by streaming service (#40767)
issue: #40532

Signed-off-by: chyezh <chyezh@outlook.com>
2025-03-26 16:32:22 +08:00
XuanYang-cn
4bebca6416
enhance: Replace currRows with NumOfRows (#40074)
See also: #40068

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-10 12:16:03 +08:00
sthuang
63a7c4570e
feat: storage v2 sync (#39663)
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-05 11:22:15 +08:00
wei liu
69b8b89369
enhance: Remove QueryCoord's scheduling of L0 segments (#39552)
issue: #39551
This PR remove querycoord's scheduling of l0 segments:
  - only load l0 segment when watch channel
- only release l0 segment when release channel or sync data distribution

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-02-26 21:38:00 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
Ted Xu
8562a102ec
enhance: API integration with storage v2 in mix-compactions (#40008)
See #39173

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-02-22 14:23:54 +08:00
SimFG
047254665d
feat: support to replicate import msg (#39171)
- issue: #39849

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-02-16 00:08:13 +08:00
aoiasd
9cb4c4e8ac
fix: bm25 import segment without bm25 stats meta (#38855)
relate: https://github.com/milvus-io/milvus/issues/38854

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-01-21 11:09:04 +08:00
Zhen Ye
bb8d1ab3bf
enhance: make new go package to manage proto (#39114)
issue: #39095

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:49:01 +08:00