74 Commits

Author SHA1 Message Date
congqixia
0425336635
fix: [skip e2e] resolve flaky TestKeyLockDispatcher unit test (#46454)
Related to #46453

The test was flaky because Submit() returns a Future and executes
asynchronously. The test was setting sig=true immediately after Submit()
returned, but the task's Run() might not have completed yet, causing
mock expectation failures.

Fix by calling future.Await() to wait for task execution to complete
before signaling. Also remove dead commented code.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-19 14:07:19 +08:00
congqixia
46c14781be
enhance: support useLoonFFI flag in import workflow (#46363)
Related to #44956

This change propagates the useLoonFFI configuration through the import
pipeline to enable LOON FFI usage during data import operations.

Key changes:
- Add use_loon_ffi field to ImportRequest protobuf message
- Add manifest_path field to ImportSegmentInfo for tracking manifest
- Initialize manifest path when creating segments (both import and
growing)
- Pass useLoonFFI flag through NewSyncTask in import tasks
- Simplify pack_writer_v2 by removing GetManifestInfo method and relying
on pre-initialized manifest path from segment creation
- Update segment meta with manifest path after import completion

This allows the import workflow to use the LOON FFI based packed writer
when the common.useLoonFFI configuration is enabled.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-17 16:35:16 +08:00
XuanYang-cn
0bbb134e39
feat: Enable to backup and reload ez (#46332)
see also: #40013

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-12-16 17:19:16 +08:00
Zhen Ye
675a6b9ba0
fix: illegal reference count of record in binlog writer (#46344)
issue: #46205

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-15 22:51:15 +08:00
congqixia
18fbaaca0a
enhance: support specified version manifest write (#46331)
Related to #44956

**Support specified version manifest write**
- Add `baseVersion` parameter to `NewPackedRecordManifestWriter` and
`NewFFIPackedWriter` to support writing manifest based on a specific
version instead of always overwriting the latest
- Add `manifestPath` tracking in `BulkPackWriterV2` to maintain manifest
state across writes
- Add `GetManifestInfo` method to parse existing manifest path and
extract base path and version
- Add `UpdateManifestPath` metacache action to track manifest path in
segment info
- Update `transaction_begin` FFI call to use the specified base version

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-15 19:49:14 +08:00
Zhen Ye
d24cd6200b
fix: always retry when writing binlog (#46309)
issue: #46205

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-12 18:27:15 +08:00
congqixia
c01fd94a6a
enhance: integrate Storage V2 FFI interface for unified storage access (#45723)
Related #44956
This commit integrates the Storage V2 FFI (Foreign Function Interface)
interface throughout the Milvus codebase, enabling unified storage
access through the Loon FFI layer. This is a significant step towards
standardizing storage operations across different storage versions.

1. Configuration Support
- **configs/milvus.yaml**: Added `useLoonFFI` configuration flag under
`common.storage.file.splitByAvgSize` section
- Allows runtime toggle between traditional binlog readers and new
FFI-based manifest readers
  - Default: `false` (maintains backward compatibility)

2. Core FFI Infrastructure

Enhanced Utilities (internal/core/src/storage/loon_ffi/util.cpp/h)
- **ToCStorageConfig()**: Converts Go's `StorageConfig` to C's
`CStorageConfig` struct for FFI calls
- **GetManifest()**: Parses manifest JSON and retrieves latest column
groups using FFI
  - Accepts manifest path with `base_path` and `ver` fields
  - Calls `get_latest_column_groups()` FFI function
  - Returns column group information as string
  - Comprehensive error handling for JSON parsing and FFI errors

3. Dependency Updates
- **internal/core/thirdparty/milvus-storage/CMakeLists.txt**:
  - Updated milvus-storage version from `0883026` to `302143c`
  - Ensures compatibility with latest FFI interfaces

4. Data Coordinator Changes

All compaction task builders now include manifest path in segment
binlogs:

- **compaction_task_clustering.go**: Added `Manifest:
segInfo.GetManifestPath()` to segment binlogs
- **compaction_task_l0.go**: Added manifest path to both L0 segment
selection and compaction plan building
- **compaction_task_mix.go**: Added manifest path to mixed compaction
segment binlogs
- **meta.go**: Updated metadata completion logic:
- `completeClusterCompactionMutation()`: Set `ManifestPath` in new
segment info
- `completeMixCompactionMutation()`: Preserve manifest path in compacted
segments
- `completeSortCompactionMutation()`: Include manifest path in sorted
segments

5. Data Node Compactor Enhancements

All compactors updated to support dual-mode reading (binlog vs
manifest):

6. Flush & Sync Manager Updates

Pack Writer V2 (pack_writer_v2.go)
- **BulkPackWriterV2.Write()**: Extended return signature to include
`manifest string`
- Implementation:
  - Generate manifest path: `path.Join(pack.segmentID, "manifest.json")`
  - Write packed data using FFI-based writer
  - Return manifest path along with binlogs, deltas, and stats

Task Handling (task.go)
- Updated all sync task result handling to accommodate new manifest
return value
- Ensured backward compatibility for callers not using manifest

7. Go Storage Layer Integration

New Interfaces and Implementations
- **record_reader.go**: Interface for unified record reading across
storage versions
- **record_writer.go**: Interface for unified record writing across
storage versions
- **binlog_record_writer.go**: Concrete implementation for traditional
binlog-based writing

Enhanced Schema Support (schema.go, schema_test.go)
- Schema conversion utilities to support FFI-based storage operations
- Ensures proper Arrow schema mapping for V2 storage

Serialization Updates
- **serde.go, serde_events.go, serde_events_v2.go**: Updated to work
with new reader/writer interfaces
- Test files updated to validate dual-mode serialization

8. Storage V2 Packed Format

FFI Common (storagev2/packed/ffi_common.go)
- Common FFI utilities and type conversions for packed storage format

Packed Writer FFI (storagev2/packed/packed_writer_ffi.go)
- FFI-based implementation of packed writer
- Integrates with Loon storage layer for efficient columnar writes

Packed Reader FFI (storagev2/packed/packed_reader_ffi.go)
- Already existed, now complemented by writer implementation

9. Protocol Buffer Updates

data_coord.proto & datapb/data_coord.pb.go
- Added `manifest` field to compaction segment messages
- Enables passing manifest metadata through compaction pipeline

worker.proto & workerpb/worker.pb.go
- Added compaction parameter for `useLoonFFI` flag
- Allows workers to receive FFI configuration from coordinator

10. Parameter Configuration

component_param.go
- Added `UseLoonFFI` parameter to compaction configuration
- Reads from `common.storage.file.useLoonFFI` config path
- Default: `false` for safe rollout

11. Test Updates
- **clustering_compactor_storage_v2_test.go**: Updated signatures to
handle manifest return value
- **mix_compactor_storage_v2_test.go**: Updated test helpers for
manifest support
- **namespace_compactor_test.go**: Adjusted writer calls to expect
manifest
- **pack_writer_v2_test.go**: Validated manifest generation in pack
writing

This integration follows a **dual-mode approach**:
1. **Legacy Path**: Traditional binlog-based reading/writing (when
`useLoonFFI=false` or no manifest)
2. **FFI Path**: Manifest-based reading/writing through Loon FFI (when
`useLoonFFI=true` and manifest exists)

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-24 19:57:07 +08:00
XuanYang-cn
c082317681
fix: Use base64 to encode not utf-8 bytes (#45655)
See also: #45654

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-11-24 18:23:06 +08:00
congqixia
569a5b40d2
enhance: [StorageV2] add manifest path support for FFI integration (#44991)
Related to #44956

Add manifest_path field throughout the data path to support LOON Storage
V2 manifest tracking. The manifest stores metadata for segment data
files and enables the unified Storage V2 FFI interface.

Changes include:
- Add manifest_path field to SegmentInfo and SaveBinlogPathsRequest
proto messages
- Add UpdateManifest operator to datacoord meta operations
- Update metacache, sync manager, and meta writer to propagate manifest
paths
- Include manifest_path in segment load info for query coordinator

This is part of the Storage V2 FFI interface integration.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-27 19:24:10 +08:00
Ted Xu
196006b4ce
enhance: update delta log serialization APIs to integrate storage V2 (#44998)
See #39173

In this PR:

- Adjusted the delta log serialization APIs.
- Refactored the stats collector to improve the collection and digest of
primary key and BM25 statistics.
- Introduced new tests for the delta log reader/writer and stats
collectors to ensure functionality and correctness.

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-10-22 15:58:12 +08:00
congqixia
1e3ec42e54
fix: Check child fields len instead of nil (#44405)
Related to #44398

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-17 11:36:00 +08:00
congqixia
6f7318a731
enhance: [StorageV2] Use compressed size as log file size (#44402)
Related to #39173

backlog issue that memory size and log size shared same value. This
patch add `GetFileSize` api to get remote compressed binlog size as meta
log file size to calculate usage more accurate.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-16 21:20:02 +08:00
congqixia
fc968ff1c2
enhance: [StorageV2] Pass args for avg size split policy (#44301)
Related to #44257

This PR
- Pass column stats for avg size split policy
- Add param items for policy configuration

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-11 10:43:57 +08:00
congqixia
f5618d5153
enhance: [StorageV2] Utilized advance split policy and persist in meta (#44282)
Related to #44257

This PR:
- Utilize configurable split policy for storage v2, enabling system
field policy
- Store split result in field binlog struct
- Adapt legacy binlog without child fields

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-10 14:47:57 +08:00
cai.zhang
eddf188452
fix: Using bucket name in storage config for bulk writer v2 (#44083)
issue: #44082

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-08-27 23:47:51 +08:00
XuanYang-cn
37a447d166
feat: Add CMEK cipher plugin (#43722)
1. Enable Milvus to read cipher configs
2. Enable cipher plugin in binlog reader and writer
3. Add a testCipher for unittests
4. Support pooling for datanode
5. Add encryption in storagev2

See also: #40321 
Signed-off-by: yangxuan <xuan.yang@zilliz.com>

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-08-27 11:15:52 +08:00
Spade A
8456f824be
feat: impl StructArray -- miscellaneous staffs for struct array (#43960)
Ref https://github.com/milvus-io/milvus/issues/42148

1. enable storage v2
2. implement some missing staffs
3. fix some bugs and add tests

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-26 21:35:53 +08:00
sthuang
a2c7ed2780
fix: [StorageV2] sort field binlogs paths for packed reader and writer (#43585)
key changes:
* fix unstable storage v2 compaction unit test by guaranteeing the order
of paths during sync.
* bump milvus-storage version, include
https://github.com/milvus-io/milvus-storage/pull/222
https://github.com/milvus-io/milvus-storage/pull/223
https://github.com/milvus-io/milvus-storage/pull/224
https://github.com/milvus-io/milvus-storage/pull/225
https://github.com/milvus-io/milvus-storage/pull/226
* Also fix the below related oom issue.
related: https://github.com/milvus-io/milvus/issues/43310

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-30 08:09:36 +08:00
Zhen Ye
cd38d65417
fix: make savebinlogpath idompotent at binlog level (#43615)
issue: #43574

- update all binlog every time when calling udpate savebinlogpath.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-29 19:47:36 +08:00
XuanYang-cn
0ccb95303e
feat: [CMEK] Add utils to load plugins (#42986)
See also: #40321

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-07-29 17:17:36 +08:00
Spade A
faeb7fd410
feat: impl StructArray -- create schema, insert, and retrieve data (#42855)
Ref https://github.com/milvus-io/milvus/issues/42148

https://github.com/milvus-io/milvus/pull/42406 impls the segcore part of
storage for handling with VectorArray.
This PR:
1. impls the go part of storage for VectorArray
2. impls the collection creation with StructArrayField and VectorArray
3. insert and retrieve data from the collection.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
2025-07-27 01:30:55 +08:00
sthuang
a0c9f499ee
fix: [StorageV2] sync panic with nullable add field (#43142)
related: https://github.com/milvus-io/milvus/pull/42932
fix: https://github.com/milvus-io/milvus/issues/43072

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-25 10:08:53 +08:00
Zhen Ye
e9ab73e93d
enhance: add schema version at recovery storage (#43500)
issue: #43072, #43289

- manage the schema version at recovery storage.
- update the schema when creating collection or alter schema.
- get schema at write buffer based on version.
- recover the schema when upgrading from 2.5.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-23 21:38:54 +08:00
Zhen Ye
25b76e1fde
fix: cannot auto balance the channel from old arch to streamingnode (#43424)
issue: #43416, #43413

- also fix the panic on streamingnode when concurrent sync

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-20 23:00:52 +08:00
cai.zhang
3ffd44f302
fix: Fix remaining issues with Datanode pooling and StorageV2 (#43147)
issue: #43146

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-10 14:26:48 +08:00
yihao.dai
9cbd194c6b
fix: Prevent import from generating small binlogs (#43132)
- Introduce dynamic buffer sizing to avoid generating small binlogs
during import
- Refactor import slot calculation based on CPU and memory constraints
- Implement dynamic pool sizing for sync manager and import tasks
according to CPU core count

issue: https://github.com/milvus-io/milvus/issues/43131

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-07 21:32:47 +08:00
sthuang
238bd30f42
fix: [StorageV2] end to end minor issues for sync, stats, and load (#42948)
Fix issues in end-to-end tests: 
1. **Split column groups based on schema**, rather than estimating by
average chunk row size. **Ensure column group consistency within a
segment**, to avoid errors caused by loading multiple column group
chunks simultaneously.
2. **Use sorted segmentId** when generating the stats binlog path, to
ensure consistent and correct file path resolution.
3. **Determine field IDs as follows**:
For multi-column column groups, retrieve the field ID list from
metadata.
For single-column column groups, use the column group ID directly as the
field ID.

related: #39173 
fix: #42862

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-27 14:44:42 +08:00
sthuang
d4260b47fa
fix: [StorageV2] sync panic with add field (#42932)
related: #39663

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-25 10:08:40 +08:00
sthuang
4a0a2441f2
enhance: [StorageV2] field id as meta path for wide column (#42787)
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-19 15:00:38 +08:00
cai.zhang
a9dcd4a380
enhance: ChunkManager is no longer created during datanode initialization (#42791)
issue: #41611

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-17 17:06:38 +08:00
sthuang
ed5dbf3eaa
enhance: [StorageV2] sync separate vector datatype into its own column group (#42638)
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-16 11:48:37 +08:00
yihao.dai
ed55b14484
fix: Release data memory after sync task completes (#42627)
Release data memory after sync task completes to prevent datanode oom
during import.

issue: https://github.com/milvus-io/milvus/issues/42608

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-10 16:28:34 +08:00
sthuang
89c3afb12e
fix: [StorageV2] index/stats task level storage v2 fs (#42191)
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-10 11:06:35 +08:00
congqixia
b8d7045539
enhance: [Add Field] Use consistent schema for single buffer (#41891)
Related to #41873

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-17 19:46:22 +08:00
congqixia
a6d09ff4cd
enhance: [StorageV2] fix issues integrating basic RW operations (#41834)
Related to #39173

This PR:
- Upgrade milvus-storage commit to fix filesystem finalized issue
- Add bucket-name as prefix for all fs style access io
- Initial arrow fs on querynodes startup
- Fix timestamp access when loading sealed segment

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-15 09:52:23 +08:00
SimFG
91d40fa558
fix: Update logging context and upgrade dependencies (#41318)
- issue: #41291

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-23 10:52:38 +08:00
Ted Xu
1bcea2a775
fix: assigning the correct storage version in sync and index tasks (#41093)
See #39663 #40667

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-04-08 10:14:25 +08:00
Ted Xu
128efaa3e3
enhance: simplify size calculation in file writers (#40808)
See: #40342

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-03-26 20:04:22 +08:00
sthuang
d7df78a6c9
feat: Storage v2 compaction (#40667)
- Feat: Support Mix compaction. Covering tests include compatibility and
rollback ability.
  - Read v1 segments and compact with v2 format.
  - Read both v1 and v2 segments and compact with v2 format.
  - Read v2 segments and compact with v2 format.
  - Compact with duplicate primary key test.
  - Compact with bm25 segments.
  - Compact with merge sort segments.
  - Compact with no expiration segments.
  - Compact with lack binlog segments.
  - Compact with nullable field segments.
- Feat: Support Clustering compaction. Covering tests include
compatibility and rollback ability.
  - Read v1 segments and compact with v2 format.
  - Read both v1 and v2 segments and compact with v2 format.
  - Read v2 segments and compact with v2 format.
  - Compact bm25 segments with v2 format.
  - Compact with memory limit.
- Enhance: Use serdeMap serialize in BuildRecord function to support all
Milvus data types.
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-21 10:16:12 +08:00
XuanYang-cn
015a8f7631
fix: L0 brings its own start pos when syncing (#40663)
See also: #40388, #40207

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-18 15:56:16 +08:00
Ted Xu
df4285c9ef
enhance: API integration with storage v2 in clustering-compactions (#40133)
See #39173

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-03-13 14:12:06 +08:00
jaime
c8a96377bb
enhance: move object storage client creation to pkg package (#40440)
issue: #40439

Signed-off-by: jaime <yun.zhang@zilliz.com>
2025-03-12 20:38:07 +08:00
XuanYang-cn
e6c46a25ea
enhance: Use correct counter metrics for overall wa calculation (#40394)
- Use CounterVec to calculate sum of increase during a time period.
- Use entries number instead of binlog size

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-10 16:34:06 +08:00
Ted Xu
878ce56079
fix: correct memory size estimation on arrays (#40312)
See: #40342

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-03-05 16:54:09 +08:00
sthuang
63a7c4570e
feat: storage v2 sync (#39663)
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-05 11:22:15 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
Zhen Ye
d3e32bb599
enhance: make pchannel level flusher (#39275)
issue: #38399

- Add a pchannel level checkpoint for flush processing
- Refactor the recovery of flushers of wal
- make a shared wal scanner first, then make multi datasyncservice on it

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-10 16:32:45 +08:00
XuanYang-cn
1f14053c70
enhance: Enable to observe write amplification (#39661)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-02-08 18:38:43 +08:00
jaime
8a4ac8cccd
enhance: expose more metrics data (#39456)
issue: #36621 #39417
1. Adjust the server-side cache size.
2. Add source information for configurations.
3. Add node ID for compaction and indexing tasks.
4. Resolve localhost access issues to fix health check failures for
etcd.

Signed-off-by: jaime <yun.zhang@zilliz.com>
2025-02-07 11:50:50 +08:00
Ted Xu
f007465942
fix: unstable UT pack_writer_test (#39641)
See #39601

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-02-05 16:39:11 +08:00