milvus/pkg/proto/data_coord.proto
congqixia c01fd94a6a
enhance: integrate Storage V2 FFI interface for unified storage access (#45723)
Related #44956
This commit integrates the Storage V2 FFI (Foreign Function Interface)
interface throughout the Milvus codebase, enabling unified storage
access through the Loon FFI layer. This is a significant step towards
standardizing storage operations across different storage versions.

1. Configuration Support
- **configs/milvus.yaml**: Added `useLoonFFI` configuration flag under
`common.storage.file.splitByAvgSize` section
- Allows runtime toggle between traditional binlog readers and new
FFI-based manifest readers
  - Default: `false` (maintains backward compatibility)

2. Core FFI Infrastructure

Enhanced Utilities (internal/core/src/storage/loon_ffi/util.cpp/h)
- **ToCStorageConfig()**: Converts Go's `StorageConfig` to C's
`CStorageConfig` struct for FFI calls
- **GetManifest()**: Parses manifest JSON and retrieves latest column
groups using FFI
  - Accepts manifest path with `base_path` and `ver` fields
  - Calls `get_latest_column_groups()` FFI function
  - Returns column group information as string
  - Comprehensive error handling for JSON parsing and FFI errors

3. Dependency Updates
- **internal/core/thirdparty/milvus-storage/CMakeLists.txt**:
  - Updated milvus-storage version from `0883026` to `302143c`
  - Ensures compatibility with latest FFI interfaces

4. Data Coordinator Changes

All compaction task builders now include manifest path in segment
binlogs:

- **compaction_task_clustering.go**: Added `Manifest:
segInfo.GetManifestPath()` to segment binlogs
- **compaction_task_l0.go**: Added manifest path to both L0 segment
selection and compaction plan building
- **compaction_task_mix.go**: Added manifest path to mixed compaction
segment binlogs
- **meta.go**: Updated metadata completion logic:
- `completeClusterCompactionMutation()`: Set `ManifestPath` in new
segment info
- `completeMixCompactionMutation()`: Preserve manifest path in compacted
segments
- `completeSortCompactionMutation()`: Include manifest path in sorted
segments

5. Data Node Compactor Enhancements

All compactors updated to support dual-mode reading (binlog vs
manifest):

6. Flush & Sync Manager Updates

Pack Writer V2 (pack_writer_v2.go)
- **BulkPackWriterV2.Write()**: Extended return signature to include
`manifest string`
- Implementation:
  - Generate manifest path: `path.Join(pack.segmentID, "manifest.json")`
  - Write packed data using FFI-based writer
  - Return manifest path along with binlogs, deltas, and stats

Task Handling (task.go)
- Updated all sync task result handling to accommodate new manifest
return value
- Ensured backward compatibility for callers not using manifest

7. Go Storage Layer Integration

New Interfaces and Implementations
- **record_reader.go**: Interface for unified record reading across
storage versions
- **record_writer.go**: Interface for unified record writing across
storage versions
- **binlog_record_writer.go**: Concrete implementation for traditional
binlog-based writing

Enhanced Schema Support (schema.go, schema_test.go)
- Schema conversion utilities to support FFI-based storage operations
- Ensures proper Arrow schema mapping for V2 storage

Serialization Updates
- **serde.go, serde_events.go, serde_events_v2.go**: Updated to work
with new reader/writer interfaces
- Test files updated to validate dual-mode serialization

8. Storage V2 Packed Format

FFI Common (storagev2/packed/ffi_common.go)
- Common FFI utilities and type conversions for packed storage format

Packed Writer FFI (storagev2/packed/packed_writer_ffi.go)
- FFI-based implementation of packed writer
- Integrates with Loon storage layer for efficient columnar writes

Packed Reader FFI (storagev2/packed/packed_reader_ffi.go)
- Already existed, now complemented by writer implementation

9. Protocol Buffer Updates

data_coord.proto & datapb/data_coord.pb.go
- Added `manifest` field to compaction segment messages
- Enables passing manifest metadata through compaction pipeline

worker.proto & workerpb/worker.pb.go
- Added compaction parameter for `useLoonFFI` flag
- Allows workers to receive FFI configuration from coordinator

10. Parameter Configuration

component_param.go
- Added `UseLoonFFI` parameter to compaction configuration
- Reads from `common.storage.file.useLoonFFI` config path
- Default: `false` for safe rollout

11. Test Updates
- **clustering_compactor_storage_v2_test.go**: Updated signatures to
handle manifest return value
- **mix_compactor_storage_v2_test.go**: Updated test helpers for
manifest support
- **namespace_compactor_test.go**: Adjusted writer calls to expect
manifest
- **pack_writer_v2_test.go**: Validated manifest generation in pack
writing

This integration follows a **dual-mode approach**:
1. **Legacy Path**: Traditional binlog-based reading/writing (when
`useLoonFFI=false` or no manifest)
2. **FFI Path**: Manifest-based reading/writing through Loon FFI (when
`useLoonFFI=true` and manifest exists)

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-24 19:57:07 +08:00

1139 lines
33 KiB
Protocol Buffer

syntax = "proto3";
package milvus.proto.data;
option go_package = "github.com/milvus-io/milvus/pkg/v2/proto/datapb";
import "common.proto";
import "internal.proto";
import "milvus.proto";
import "schema.proto";
import "msg.proto";
import "index_coord.proto";
// TODO: import google/protobuf/empty.proto
message Empty {}
enum SegmentType {
New = 0;
Normal = 1;
Flushed = 2;
Compacted = 3;
}
enum SegmentLevel {
Legacy = 0; // zero value for legacy logic
L0 = 1; // L0 segment, contains delta data for current channel
L1 = 2; // L1 segment, normal segment, with no extra compaction attribute
L2 = 3; // L2 segment, segment with extra data distribution info
}
service DataCoord {
rpc Flush(FlushRequest) returns (FlushResponse) {}
rpc FlushAll(FlushAllRequest) returns(FlushAllResponse) {}
// AllocSegment alloc a new growing segment, add it into segment meta.
rpc AllocSegment(AllocSegmentRequest) returns (AllocSegmentResponse) {}
rpc AssignSegmentID(AssignSegmentIDRequest) returns (AssignSegmentIDResponse) {
option deprecated = true;
}
rpc GetSegmentInfo(GetSegmentInfoRequest) returns (GetSegmentInfoResponse) {}
rpc GetSegmentStates(GetSegmentStatesRequest) returns (GetSegmentStatesResponse) {}
rpc GetInsertBinlogPaths(GetInsertBinlogPathsRequest) returns (GetInsertBinlogPathsResponse) {}
rpc GetCollectionStatistics(GetCollectionStatisticsRequest) returns (GetCollectionStatisticsResponse) {}
rpc GetPartitionStatistics(GetPartitionStatisticsRequest) returns (GetPartitionStatisticsResponse) {}
rpc GetSegmentInfoChannel(GetSegmentInfoChannelRequest) returns (milvus.StringResponse){}
rpc SaveBinlogPaths(SaveBinlogPathsRequest) returns (common.Status){}
rpc GetRecoveryInfo(GetRecoveryInfoRequest) returns (GetRecoveryInfoResponse){}
rpc GetRecoveryInfoV2(GetRecoveryInfoRequestV2) returns (GetRecoveryInfoResponseV2){}
rpc GetChannelRecoveryInfo(GetChannelRecoveryInfoRequest) returns (GetChannelRecoveryInfoResponse){}
rpc GetFlushedSegments(GetFlushedSegmentsRequest) returns(GetFlushedSegmentsResponse){}
rpc GetSegmentsByStates(GetSegmentsByStatesRequest) returns(GetSegmentsByStatesResponse){}
rpc GetFlushAllState(milvus.GetFlushAllStateRequest) returns(milvus.GetFlushAllStateResponse) {}
rpc ShowConfigurations(internal.ShowConfigurationsRequest) returns (internal.ShowConfigurationsResponse){}
// https://wiki.lfaidata.foundation/display/MIL/MEP+8+--+Add+metrics+for+proxy
rpc GetMetrics(milvus.GetMetricsRequest) returns (milvus.GetMetricsResponse) {}
rpc ManualCompaction(milvus.ManualCompactionRequest) returns (milvus.ManualCompactionResponse) {}
rpc GetCompactionState(milvus.GetCompactionStateRequest) returns (milvus.GetCompactionStateResponse) {}
rpc GetCompactionStateWithPlans(milvus.GetCompactionPlansRequest) returns (milvus.GetCompactionPlansResponse) {}
rpc WatchChannels(WatchChannelsRequest) returns (WatchChannelsResponse) {
option deprecated = true;
}
rpc GetFlushState(GetFlushStateRequest) returns (milvus.GetFlushStateResponse) {}
rpc DropVirtualChannel(DropVirtualChannelRequest) returns (DropVirtualChannelResponse) {}
rpc SetSegmentState(SetSegmentStateRequest) returns (SetSegmentStateResponse) {}
// Deprecated
rpc UpdateSegmentStatistics(UpdateSegmentStatisticsRequest) returns (common.Status) {}
rpc UpdateChannelCheckpoint(UpdateChannelCheckpointRequest) returns (common.Status) {}
rpc MarkSegmentsDropped(MarkSegmentsDroppedRequest) returns(common.Status) {}
rpc BroadcastAlteredCollection(AlterCollectionRequest) returns (common.Status) {}
rpc CheckHealth(milvus.CheckHealthRequest) returns (milvus.CheckHealthResponse) {}
rpc CreateIndex(index.CreateIndexRequest) returns (common.Status){}
rpc AlterIndex(index.AlterIndexRequest) returns (common.Status){}
// Deprecated: use DescribeIndex instead
rpc GetIndexState(index.GetIndexStateRequest) returns (index.GetIndexStateResponse) {}
rpc GetSegmentIndexState(index.GetSegmentIndexStateRequest) returns (index.GetSegmentIndexStateResponse) {}
rpc GetIndexInfos(index.GetIndexInfoRequest) returns (index.GetIndexInfoResponse){}
rpc DropIndex(index.DropIndexRequest) returns (common.Status) {}
rpc DescribeIndex(index.DescribeIndexRequest) returns (index.DescribeIndexResponse) {}
rpc GetIndexStatistics(index.GetIndexStatisticsRequest) returns (index.GetIndexStatisticsResponse) {}
// Deprecated: use DescribeIndex instead
rpc GetIndexBuildProgress(index.GetIndexBuildProgressRequest) returns (index.GetIndexBuildProgressResponse) {}
rpc ListIndexes(index.ListIndexesRequest) returns (index.ListIndexesResponse) {}
rpc GcConfirm(GcConfirmRequest) returns (GcConfirmResponse) {}
rpc ReportDataNodeTtMsgs(ReportDataNodeTtMsgsRequest) returns (common.Status) {}
rpc GcControl(GcControlRequest) returns(common.Status){}
// importV2
rpc ImportV2(internal.ImportRequestInternal) returns(internal.ImportResponse){}
rpc GetImportProgress(internal.GetImportProgressRequest) returns(internal.GetImportProgressResponse){}
rpc ListImports(internal.ListImportsRequestInternal) returns(internal.ListImportsResponse){}
// File Resource Management
rpc AddFileResource(milvus.AddFileResourceRequest) returns (common.Status) {}
rpc RemoveFileResource(milvus.RemoveFileResourceRequest) returns (common.Status) {}
rpc ListFileResources(milvus.ListFileResourcesRequest) returns (milvus.ListFileResourcesResponse) {}
}
service DataNode {
rpc GetComponentStates(milvus.GetComponentStatesRequest) returns (milvus.ComponentStates) {}
rpc GetStatisticsChannel(internal.GetStatisticsChannelRequest) returns (milvus.StringResponse) {}
rpc WatchDmChannels(WatchDmChannelsRequest) returns (common.Status) {}
rpc FlushSegments(FlushSegmentsRequest) returns(common.Status) {}
rpc ShowConfigurations(internal.ShowConfigurationsRequest) returns (internal.ShowConfigurationsResponse){}
// https://wiki.lfaidata.foundation/display/MIL/MEP+8+--+Add+metrics+for+proxy
rpc GetMetrics(milvus.GetMetricsRequest) returns (milvus.GetMetricsResponse) {}
rpc CompactionV2(CompactionPlan) returns (common.Status) {}
rpc GetCompactionState(CompactionStateRequest) returns (CompactionStateResponse) {}
rpc SyncSegments(SyncSegmentsRequest) returns (common.Status) {}
// Deprecated
rpc ResendSegmentStats(ResendSegmentStatsRequest) returns(ResendSegmentStatsResponse) {}
rpc FlushChannels(FlushChannelsRequest) returns(common.Status) {}
rpc NotifyChannelOperation(ChannelOperationsRequest) returns(common.Status) {}
rpc CheckChannelOperationProgress(ChannelWatchInfo) returns(ChannelOperationProgressResponse) {}
// import v2
rpc PreImport(PreImportRequest) returns(common.Status) {}
rpc ImportV2(ImportRequest) returns(common.Status) {}
rpc QueryPreImport(QueryPreImportRequest) returns(QueryPreImportResponse) {}
rpc QueryImport(QueryImportRequest) returns(QueryImportResponse) {}
rpc DropImport(DropImportRequest) returns(common.Status) {}
rpc QuerySlot(QuerySlotRequest) returns(QuerySlotResponse) {}
rpc DropCompactionPlan(DropCompactionPlanRequest) returns(common.Status) {}
}
message FlushRequest {
common.MsgBase base = 1;
int64 dbID = 2;
repeated int64 segmentIDs = 3;
int64 collectionID = 4;
bool isImport = 5; // deprecated
}
message FlushResponse {
common.Status status = 1;
int64 dbID = 2;
int64 collectionID = 3;
repeated int64 segmentIDs = 4; // newly sealed segments
repeated int64 flushSegmentIDs = 5; // old flushed segment
int64 timeOfSeal = 6;
uint64 flush_ts = 7;
map<string, msg.MsgPosition> channel_cps = 8;
}
message FlushResult {
int64 collectionID = 1;
repeated int64 segmentIDs =2; // newly sealed segments
repeated int64 flushSegmentIDs = 3; // old flushed segment
int64 timeOfSeal = 4;
uint64 flush_ts = 5;
map<string, msg.MsgPosition> channel_cps = 6;
string db_name = 7; // database name for this flush result
string collection_name = 8; // collection name for this flush result
}
message FlushAllRequest {
common.MsgBase base = 1;
string dbName = 2; // Deprecated: use flush_targets instead
// List of specific databases and collections to flush
repeated FlushAllTarget flush_targets = 3;
}
// Specific collection to flush with database context
// This message allows targeting specific collections within a database for flush operations
message FlushAllTarget {
// Database name to target for flush operation
string db_name = 1;
// Collections within this database to flush
// If empty, flush all collections in this database
repeated int64 collection_ids = 3;
}
message FlushAllResponse {
common.Status status = 1;
uint64 flushTs = 2;
// Detailed flush results for each target
repeated FlushResult flush_results = 3;
}
message FlushChannelsRequest {
common.MsgBase base = 1;
uint64 flush_ts = 2;
repeated string channels = 3;
}
message SegmentIDRequest {
uint32 count = 1;
string channel_name = 2;
int64 collectionID = 3;
int64 partitionID = 4;
bool isImport = 5; // deprecated
int64 importTaskID = 6; // deprecated
SegmentLevel level = 7; // deprecated
int64 storage_version = 8;
}
message AllocSegmentRequest {
int64 collection_id = 1;
int64 partition_id = 2;
int64 segment_id = 3; // segment id must be allocate from rootcoord idalloc service.
string vchannel = 4;
int64 storage_version = 5;
bool is_created_by_streaming = 6;
}
message AllocSegmentResponse {
SegmentInfo segment_info = 1;
common.Status status = 2;
}
message AssignSegmentIDRequest {
int64 nodeID = 1;
string peer_role = 2;
repeated SegmentIDRequest segmentIDRequests = 3;
}
message SegmentIDAssignment {
int64 segID = 1;
string channel_name = 2;
uint32 count = 3;
int64 collectionID = 4;
int64 partitionID = 5;
uint64 expire_time = 6;
common.Status status = 7;
}
message AssignSegmentIDResponse {
repeated SegmentIDAssignment segIDAssignments = 1;
common.Status status = 2;
}
message GetSegmentStatesRequest {
common.MsgBase base = 1;
repeated int64 segmentIDs = 2;
}
message SegmentStateInfo {
int64 segmentID = 1;
common.SegmentState state = 2;
msg.MsgPosition start_position = 3;
msg.MsgPosition end_position = 4;
common.Status status = 5;
}
message GetSegmentStatesResponse {
common.Status status = 1;
repeated SegmentStateInfo states = 2;
}
message GetSegmentInfoRequest {
common.MsgBase base = 1;
repeated int64 segmentIDs = 2;
bool includeUnHealthy =3;
}
message GetSegmentInfoResponse {
common.Status status = 1;
repeated SegmentInfo infos = 2;
map<string, msg.MsgPosition> channel_checkpoint = 3;
}
message GetInsertBinlogPathsRequest {
common.MsgBase base = 1;
int64 segmentID = 2;
}
message GetInsertBinlogPathsResponse {
repeated int64 fieldIDs = 1;
repeated internal.StringList paths = 2;
common.Status status = 3;
}
message GetCollectionStatisticsRequest {
common.MsgBase base = 1;
int64 dbID = 2;
int64 collectionID = 3;
}
message GetCollectionStatisticsResponse {
repeated common.KeyValuePair stats = 1;
common.Status status = 2;
}
message GetPartitionStatisticsRequest{
common.MsgBase base = 1;
int64 dbID = 2;
int64 collectionID = 3;
repeated int64 partitionIDs = 4;
}
message GetPartitionStatisticsResponse {
repeated common.KeyValuePair stats = 1;
common.Status status = 2;
}
message GetSegmentInfoChannelRequest {
}
message VchannelInfo {
int64 collectionID = 1;
string channelName = 2;
msg.MsgPosition seek_position = 3;
repeated SegmentInfo unflushedSegments = 4 [deprecated = true]; // deprecated, keep it for compatibility
repeated SegmentInfo flushedSegments = 5 [deprecated = true]; // deprecated, keep it for compatibility
repeated SegmentInfo dropped_segments = 6 [deprecated = true]; // deprecated, keep it for compatibility
repeated int64 unflushedSegmentIds = 7;
repeated int64 flushedSegmentIds = 8;
repeated int64 dropped_segmentIds = 9;
repeated int64 indexed_segmentIds = 10; // deprecated, keep it for compatibility
repeated SegmentInfo indexed_segments = 11; // deprecated, keep it for compatibility
repeated int64 level_zero_segment_ids = 12;
map<int64, int64> partition_stats_versions = 13;
// delete record which ts is smaller than delete_checkpoint already be dispatch to sealed segments.
msg.MsgPosition delete_checkpoint = 14;
}
message WatchDmChannelsRequest {
common.MsgBase base = 1;
repeated VchannelInfo vchannels = 2;
}
message FlushSegmentsRequest {
common.MsgBase base = 1;
int64 dbID = 2;
int64 collectionID = 3;
repeated int64 segmentIDs = 4; // segments to flush
string channelName = 5; // vchannel name to flush
}
message SegmentMsg{
common.MsgBase base = 1;
SegmentInfo segment = 2;
}
message SegmentInfo {
int64 ID = 1;
int64 collectionID = 2;
int64 partitionID = 3;
string insert_channel = 4;
int64 num_of_rows = 5;
common.SegmentState state = 6;
int64 max_row_num = 7 [deprecated = true]; // deprecated, we use the binary size to control the segment size but not a estimate rows.
uint64 last_expire_time = 8;
msg.MsgPosition start_position = 9;
msg.MsgPosition dml_position = 10;
// binlogs consist of insert binlogs
repeated FieldBinlog binlogs = 11;
repeated FieldBinlog statslogs = 12;
// deltalogs consists of delete binlogs. FieldID is not used yet since delete is always applied on primary key
repeated FieldBinlog deltalogs = 13;
bool createdByCompaction = 14;
repeated int64 compactionFrom = 15;
uint64 dropped_at = 16; // timestamp when segment marked drop
// A flag indicating if:
// (1) this segment is created by bulk insert, and
// (2) the bulk insert task that creates this segment has not yet reached `ImportCompleted` state.
bool is_importing = 17;
bool is_fake = 18;
// denote if this segment is compacted to other segment.
// For compatibility reasons, this flag of an old compacted segment may still be False.
// As for new fields added in the message, they will be populated with their respective field types' default values.
bool compacted = 19;
// Segment level, indicating compaction segment level
// Available value: Legacy, L0, L1, L2
// For legacy level, it represent old segment before segment level introduced
// so segments with Legacy level shall be treated as L1 segment
SegmentLevel level = 20;
int64 storage_version = 21;
int64 partition_stats_version = 22;
// use in major compaction, if compaction fail, should revert segment level to last value
SegmentLevel last_level = 23;
// use in major compaction, if compaction fail, should revert partition stats version to last value
int64 last_partition_stats_version = 24;
// used to indicate whether the segment is sorted by primary key.
bool is_sorted = 25;
// textStatsLogs is used to record tokenization index for fields.
map<int64, TextIndexStats> textStatsLogs = 26;
repeated FieldBinlog bm25statslogs = 27;
// This field is used to indicate that some intermediate state segments should not be loaded.
// For example, segments that have been clustered but haven't undergone stats yet.
bool is_invisible = 28;
// jsonKeyStats is used to record json key index for fields.
map<int64, JsonKeyStats> jsonKeyStats = 29;
// This field is used to indicate that the segment is created by streaming service.
// This field is meaningful only when the segment state is growing.
// If the segment is created by streaming service, it will be a true.
// A segment generated by datacoord of old arch, will be false.
// After the growing segment is full managed by streamingnode, the true value can never be seen at coordinator.
bool is_created_by_streaming = 30;
bool is_partition_key_sorted = 31;
// manifest_path stores the fullpath of LOON manifest file of segemnt data files.
// we could keep the fullpath since one segment shall only have one active manifest
// and we could keep the possiblity that manifest stores out side of collection/partition/segment path
string manifest_path = 32;
}
message SegmentStartPosition {
msg.MsgPosition start_position = 1;
int64 segmentID = 2;
}
message SaveBinlogPathsRequest {
common.MsgBase base = 1;
int64 segmentID = 2;
int64 collectionID = 3;
repeated FieldBinlog field2BinlogPaths = 4;
repeated CheckPoint checkPoints = 5;
repeated SegmentStartPosition start_positions = 6;
bool flushed = 7;
repeated FieldBinlog field2StatslogPaths = 8;
repeated FieldBinlog deltalogs = 9;
bool dropped = 10;
bool importing = 11; // deprecated
string channel = 12; // report channel name for verification
SegmentLevel seg_level =13;
int64 partitionID =14; // report partitionID for create L0 segment
int64 storageVersion = 15;
repeated FieldBinlog field2Bm25logPaths = 16;
bool with_full_binlogs = 17; // report with full data for verification.
string manifest_path = 18; //
}
message CheckPoint {
int64 segmentID = 1;
msg.MsgPosition position = 2;
int64 num_of_rows = 3;
}
message DeltaLogInfo {
uint64 record_entries = 1;
uint64 timestamp_from = 2;
uint64 timestamp_to = 3;
string delta_log_path = 4;
int64 delta_log_size = 5;
}
enum ChannelWatchState {
Uncomplete = 0; // deprecated, keep it for compatibility
Complete = 1; // deprecated, keep it for compatibility
ToWatch = 2;
WatchSuccess = 3;
WatchFailure = 4;
ToRelease = 5;
ReleaseSuccess = 6;
ReleaseFailure = 7;
}
message ChannelStatus {
string name = 1;
ChannelWatchState state=2;
int64 collectionID = 3;
}
message DataNodeInfo {
string address = 1;
int64 version = 2;
repeated ChannelStatus channels = 3;
}
message SegmentBinlogs {
int64 segmentID = 1;
repeated FieldBinlog fieldBinlogs = 2;
int64 num_of_rows = 3;
repeated FieldBinlog statslogs = 4;
repeated FieldBinlog deltalogs = 5;
string insert_channel = 6;
map<int64, TextIndexStats> textStatsLogs = 7;
}
message FieldBinlog{
int64 fieldID = 1;
repeated Binlog binlogs = 2;
repeated int64 child_fields = 3;
}
message TextIndexStats {
int64 fieldID = 1;
int64 version = 2;
repeated string files = 3;
int64 log_size = 4;
int64 memory_size = 5;
int64 buildID = 6;
}
message JsonKeyStats {
int64 fieldID = 1;
int64 version = 2;
repeated string files = 3;
int64 log_size = 4;
int64 memory_size = 5;
int64 buildID = 6;
int64 json_key_stats_data_format =7;
}
message Binlog {
int64 entries_num = 1;
uint64 timestamp_from = 2;
uint64 timestamp_to = 3;
// deprecated
string log_path = 4;
int64 log_size = 5;
int64 logID = 6;
// memory_size represents the size occupied by loading data into memory.
// log_size represents the size after data serialized.
// for stats_log, the memory_size always equal log_size.
int64 memory_size = 7;
}
message GetRecoveryInfoResponse {
common.Status status = 1;
repeated VchannelInfo channels = 2;
repeated SegmentBinlogs binlogs = 3;
}
message GetRecoveryInfoRequest {
common.MsgBase base = 1;
int64 collectionID = 2;
int64 partitionID = 3;
}
message GetRecoveryInfoResponseV2 {
common.Status status = 1;
repeated VchannelInfo channels = 2;
repeated SegmentInfo segments = 3;
}
message GetRecoveryInfoRequestV2 {
common.MsgBase base = 1;
int64 collectionID = 2;
repeated int64 partitionIDs = 3;
}
message GetChannelRecoveryInfoRequest {
common.MsgBase base = 1;
string vchannel = 2;
}
message GetChannelRecoveryInfoResponse {
common.Status status = 1;
VchannelInfo info = 2;
schema.CollectionSchema schema = 3 [deprecated = true]; // schema is managed by streaming node itself now, so it should not be passed by rpc.
repeated SegmentNotCreatedByStreaming segments_not_created_by_streaming = 4; // Should be flushed by streaming service when upgrading.
}
message SegmentNotCreatedByStreaming {
int64 collection_id = 1;
int64 partition_id = 2;
int64 segment_id = 3;
}
message GetSegmentsByStatesRequest {
common.MsgBase base = 1;
int64 collectionID = 2;
int64 partitionID = 3;
repeated common.SegmentState states = 4;
}
message GetSegmentsByStatesResponse {
common.Status status = 1;
repeated int64 segments = 2;
}
message GetFlushedSegmentsRequest {
common.MsgBase base = 1;
int64 collectionID = 2;
int64 partitionID = 3;
bool includeUnhealthy = 4;
}
message GetFlushedSegmentsResponse {
common.Status status = 1;
repeated int64 segments = 2;
}
message SegmentFlushCompletedMsg {
common.MsgBase base = 1;
SegmentInfo segment = 2;
}
message ChannelWatchInfo {
VchannelInfo vchan= 1;
int64 startTs = 2;
ChannelWatchState state = 3;
// the timeout ts, datanode shall do nothing after it
// NOT USED.
int64 timeoutTs = 4;
// the schema of the collection to watch, to avoid get schema rpc issues.
schema.CollectionSchema schema = 5;
// watch progress, deprecated
int32 progress = 6;
int64 opID = 7;
repeated common.KeyValuePair dbProperties = 8;
}
enum CompactionType {
UndefinedCompaction = 0;
reserved 1;
MergeCompaction = 2;
MixCompaction = 3;
// compactionV2
SingleCompaction = 4;
MinorCompaction = 5;
MajorCompaction = 6;
Level0DeleteCompaction = 7;
ClusteringCompaction = 8;
SortCompaction = 9;
PartitionKeySortCompaction = 10;
ClusteringPartitionKeySortCompaction = 11;
}
message CompactionStateRequest {
common.MsgBase base = 1;
int64 planID = 2;
}
message SyncSegmentInfo {
int64 segment_id = 1;
FieldBinlog pk_stats_log = 2;
common.SegmentState state = 3;
SegmentLevel level = 4;
int64 num_of_rows = 5;
}
message SyncSegmentsRequest {
// Deprecated, after v2.4.3
int64 planID = 1;
// Deprecated, after v2.4.3
int64 compacted_to = 2;
// Deprecated, after v2.4.3
int64 num_of_rows = 3;
// Deprecated, after v2.4.3
repeated int64 compacted_from = 4;
// Deprecated, after v2.4.3
repeated FieldBinlog stats_logs = 5;
string channel_name = 6;
int64 partition_id = 7;
int64 collection_id = 8;
map<int64, SyncSegmentInfo> segment_infos = 9;
}
message CompactionSegmentBinlogs {
int64 segmentID = 1;
repeated FieldBinlog fieldBinlogs = 2;
repeated FieldBinlog field2StatslogPaths = 3;
repeated FieldBinlog deltalogs = 4;
string insert_channel = 5;
SegmentLevel level = 6;
int64 collectionID = 7;
int64 partitionID = 8;
bool is_sorted = 9;
int64 storage_version = 10;
string manifest = 11;
}
message CompactionPlan {
int64 planID = 1;
repeated CompactionSegmentBinlogs segmentBinlogs = 2;
int64 start_time = 3;
int32 timeout_in_seconds = 4 [deprecated = true];
CompactionType type = 5;
uint64 timetravel = 6;
string channel = 7;
int64 collection_ttl = 8; // nanoseconds
int64 total_rows = 9;
schema.CollectionSchema schema = 10;
int64 clustering_key_field = 11;
int64 max_segment_rows = 12;
int64 prefer_segment_rows = 13;
string analyze_result_path = 14;
repeated int64 analyze_segment_ids = 15;
int32 state = 16;
int64 begin_logID = 17; // deprecated, use pre_allocated_logIDs instead.
IDRange pre_allocated_segmentIDs = 18;
int64 slot_usage = 19;
int64 max_size = 20;
// bf path for importing
// collection is importing
IDRange pre_allocated_logIDs = 21;
string json_params = 22;
int32 current_scalar_index_version = 23;
repeated common.KeyValuePair plugin_context = 29;
}
message CompactionSegment {
int64 planID = 1; // deprecated after 2.3.4
int64 segmentID = 2;
int64 num_of_rows = 3;
repeated FieldBinlog insert_logs = 4;
repeated FieldBinlog field2StatslogPaths = 5;
repeated FieldBinlog deltalogs = 6;
string channel = 7;
bool is_sorted = 8;
repeated FieldBinlog bm25logs = 9;
int64 storage_version = 10;
map<int64, data.TextIndexStats> text_stats_logs = 11;
string manifest = 12;
}
message CompactionPlanResult {
int64 planID = 1;
CompactionTaskState state = 2;
repeated CompactionSegment segments = 3;
string channel = 4;
CompactionType type = 5;
}
message CompactionStateResponse {
common.Status status = 1;
repeated CompactionPlanResult results = 2;
}
// Deprecated
message SegmentFieldBinlogMeta {
int64 fieldID = 1;
string binlog_path = 2;
}
message WatchChannelsRequest {
int64 collectionID = 1;
repeated string channelNames = 2;
repeated common.KeyDataPair start_positions = 3;
schema.CollectionSchema schema = 4;
uint64 create_timestamp = 5;
repeated common.KeyValuePair db_properties = 6;
}
message WatchChannelsResponse {
common.Status status = 1;
}
message SetSegmentStateRequest {
common.MsgBase base = 1;
int64 segment_id = 2;
common.SegmentState new_state = 3;
}
message SetSegmentStateResponse {
common.Status status = 1;
}
message DropVirtualChannelRequest {
common.MsgBase base = 1;
string channel_name = 2;
repeated DropVirtualChannelSegment segments = 3;
}
message DropVirtualChannelSegment {
int64 segmentID = 1;
int64 collectionID = 2;
repeated FieldBinlog field2BinlogPaths = 3;
repeated FieldBinlog field2StatslogPaths = 4;
repeated FieldBinlog deltalogs = 5;
msg.MsgPosition startPosition = 6;
msg.MsgPosition checkPoint = 7;
int64 numOfRows = 8;
}
message DropVirtualChannelResponse {
common.Status status = 1;
}
message UpdateSegmentStatisticsRequest {
common.MsgBase base = 1;
repeated common.SegmentStats stats = 2;
}
message UpdateChannelCheckpointRequest {
common.MsgBase base = 1;
string vChannel = 2; // deprecated, keep it for compatibility
msg.MsgPosition position = 3; // deprecated, keep it for compatibility
repeated msg.MsgPosition channel_checkpoints = 4;
}
message ResendSegmentStatsRequest {
common.MsgBase base = 1;
}
message ResendSegmentStatsResponse {
common.Status status = 1;
repeated int64 seg_resent = 2;
}
message MarkSegmentsDroppedRequest {
common.MsgBase base = 1;
repeated int64 segment_ids = 2; // IDs of segments that needs to be marked as `dropped`.
}
message SegmentReferenceLock {
int64 taskID = 1;
int64 nodeID = 2;
repeated int64 segmentIDs = 3;
}
message AlterCollectionRequest {
int64 collectionID = 1;
schema.CollectionSchema schema = 2;
repeated int64 partitionIDs = 3;
repeated common.KeyDataPair start_positions = 4;
repeated common.KeyValuePair properties = 5;
int64 dbID = 6;
repeated string vChannels = 7;
}
message AddCollectionFieldRequest {
int64 collectionID = 1;
int64 dbID = 2;
schema.FieldSchema field_schema = 3;
schema.CollectionSchema schema = 4;
repeated int64 partitionIDs = 5;
repeated common.KeyDataPair start_positions = 6;
repeated common.KeyValuePair properties = 7;
repeated string vChannels = 8;
}
message GcConfirmRequest {
int64 collection_id = 1;
int64 partition_id = 2; // -1 means whole collection.
}
message GcConfirmResponse {
common.Status status = 1;
bool gc_finished = 2;
}
message ReportDataNodeTtMsgsRequest {
common.MsgBase base = 1;
repeated msg.DataNodeTtMsg msgs = 2; // -1 means whole collection.
}
message GetFlushStateRequest {
repeated int64 segmentIDs = 1;
uint64 flush_ts = 2;
string db_name = 3;
string collection_name = 4;
int64 collectionID = 5;
}
message ChannelOperationsRequest {
repeated ChannelWatchInfo infos = 1;
}
message ChannelOperationProgressResponse {
common.Status status = 1;
int64 opID = 2;
ChannelWatchState state = 3;
int32 progress = 4;
}
message PreImportRequest {
string clusterID = 1;
int64 jobID = 2;
int64 taskID = 3;
int64 collectionID = 4;
repeated int64 partitionIDs = 5;
repeated string vchannels = 6;
schema.CollectionSchema schema = 7;
repeated internal.ImportFile import_files = 8;
repeated common.KeyValuePair options = 9;
index.StorageConfig storage_config = 10;
int64 task_slot = 11;
repeated common.KeyValuePair plugin_context = 12;
}
message IDRange {
int64 begin = 1;
int64 end = 2;
}
message ImportRequestSegment {
int64 segmentID = 1;
int64 partitionID = 2;
string vchannel = 3;
}
message ImportRequest {
string clusterID = 1;
int64 jobID = 2;
int64 taskID = 3;
int64 collectionID = 4;
repeated int64 partitionIDs = 5;
repeated string vchannels = 6;
schema.CollectionSchema schema = 7;
repeated internal.ImportFile files = 8;
repeated common.KeyValuePair options = 9;
uint64 ts = 10;
IDRange ID_range = 11;
repeated ImportRequestSegment request_segments = 12;
index.StorageConfig storage_config = 13;
int64 task_slot = 14;
int64 storage_version = 15;
repeated common.KeyValuePair plugin_context = 16;
}
message QueryPreImportRequest {
string clusterID = 1;
int64 jobID = 2;
int64 taskID = 3;
}
message PartitionImportStats {
map<int64, int64> partition_rows = 1; // partitionID -> numRows
map<int64, int64> partition_data_size = 2; // partitionID -> dataSize
}
message ImportFileStats {
internal.ImportFile import_file = 1;
int64 file_size = 2;
int64 total_rows = 3;
int64 total_memory_size = 4;
map<string, PartitionImportStats> hashed_stats = 5; // channel -> PartitionImportStats
}
message QueryPreImportResponse {
common.Status status = 1;
int64 taskID = 2;
ImportTaskStateV2 state = 3;
string reason = 4;
int64 slots = 5;
repeated ImportFileStats file_stats = 6;
}
message QueryImportRequest {
string clusterID = 1;
int64 jobID = 2;
int64 taskID = 3;
bool querySlot = 4;
}
message ImportSegmentInfo {
int64 segmentID = 1;
int64 imported_rows = 2;
repeated FieldBinlog binlogs = 3;
repeated FieldBinlog statslogs = 4;
repeated FieldBinlog deltalogs = 5;
repeated FieldBinlog bm25logs = 6;
}
message QueryImportResponse {
common.Status status = 1;
int64 taskID = 2;
ImportTaskStateV2 state = 3;
string reason = 4;
int64 slots = 5;
repeated ImportSegmentInfo import_segments_info = 6;
}
message DropImportRequest {
string clusterID = 1;
int64 jobID = 2;
int64 taskID = 3;
}
message ImportJob {
int64 jobID = 1;
int64 dbID = 2;
int64 collectionID = 3;
string collection_name = 4;
repeated int64 partitionIDs = 5;
repeated string vchannels = 6;
schema.CollectionSchema schema = 7;
uint64 timeout_ts = 8;
uint64 cleanup_ts = 9;
int64 requestedDiskSize = 10;
internal.ImportJobState state = 11;
string reason = 12;
string complete_time = 13;
repeated internal.ImportFile files = 14;
repeated common.KeyValuePair options = 15;
string create_time = 16;
repeated string ready_vchannels = 17;
uint64 data_ts = 18;
}
enum ImportTaskStateV2 {
None = 0;
Pending = 1;
InProgress = 2;
Failed = 3;
Completed = 4;
Retry = 5;
}
enum ImportTaskSourceV2 {
Request = 0;
L0Compaction = 1;
}
message PreImportTask {
int64 jobID = 1;
int64 taskID = 2;
int64 collectionID = 3;
int64 nodeID = 6;
ImportTaskStateV2 state = 7;
string reason = 8;
repeated ImportFileStats file_stats = 10;
string created_time = 11;
string complete_time = 12;
}
message ImportTaskV2 {
int64 jobID = 1;
int64 taskID = 2;
int64 collectionID = 3;
repeated int64 segmentIDs = 4;
int64 nodeID = 5;
ImportTaskStateV2 state = 6;
string reason = 7;
string complete_time = 8;
repeated ImportFileStats file_stats = 9;
repeated int64 sorted_segmentIDs = 10;
string created_time = 11;
ImportTaskSourceV2 source = 12;
}
enum GcCommand {
_ = 0;
Pause = 1;
Resume = 2;
}
message GcControlRequest {
common.MsgBase base = 1;
GcCommand command = 2;
repeated common.KeyValuePair params = 3;
}
// The response message for GetGcStatus.
message GetGcStatusResponse {
// is_paused is true if the garbage collector is currently paused.
bool is_paused = 1;
// time_remaining_seconds is the remaining duration in seconds until the GC resumes.
int32 time_remaining_seconds = 2;
}
message QuerySlotRequest {}
message QuerySlotResponse {
common.Status status = 1;
int64 available_slots = 2;
}
enum CompactionTaskState {
unknown = 0;
executing = 1;
pipelining = 2;
completed = 3;
failed = 4;
timeout = 5;
analyzing = 6;
indexing = 7;
cleaned = 8;
meta_saved = 9;
statistic = 10;
}
message CompactionTask{
int64 planID = 1;
int64 triggerID = 2;
int64 collectionID = 3;
int64 partitionID = 4;
string channel = 5;
CompactionType type = 6;
CompactionTaskState state = 7;
string fail_reason = 8;
int64 start_time = 9;
int64 end_time = 10;
int32 timeout_in_seconds = 11 [deprecated = true];
int32 retry_times = 12;
int64 collection_ttl = 13;
int64 total_rows = 14;
repeated int64 inputSegments = 15;
repeated int64 resultSegments = 16;
msg.MsgPosition pos = 17;
int64 nodeID = 18;
schema.CollectionSchema schema = 19;
schema.FieldSchema clustering_key_field = 20;
int64 max_segment_rows = 21;
int64 prefer_segment_rows = 22;
int64 analyzeTaskID = 23;
int64 analyzeVersion = 24;
int64 lastStateStartTime = 25;
int64 max_size = 26;
repeated int64 tmpSegments = 27;
IDRange pre_allocated_segmentIDs = 28;
}
message PartitionStatsInfo {
int64 collectionID = 1;
int64 partitionID = 2;
string vChannel = 3;
int64 version = 4;
repeated int64 segmentIDs = 5;
int64 analyzeTaskID = 6;
int64 commitTime = 7;
}
message DropCompactionPlanRequest {
int64 planID = 1;
}
message FileResourceInfo {
string name = 1;
string path = 2;
int64 resource_id = 3;
}