milvus/pkg/proto/data_coord.proto
congqixia f94b04e642
feat: [2.6] integrate Loon FFI for manifest-based segment loading and index building (#46076)
Cherry-pick from master
pr: #45061 #45488 #45803 #46017 #44991 #45132 #45723 #45726 #45798
#45897 #45918 #44998

This feature integrates the Storage V2 (Loon) FFI interface as a unified
storage layer for segment loading and index building in Milvus. It
enables
manifest-based data access, replacing the traditional binlog-based
approach
with a more efficient columnar storage format.

Key changes:

### Segment Self-Managed Loading Architecture
- Move segment loading orchestration from Go layer to C++ segcore
- Add NewSegmentWithLoadInfo() API for passing load info during segment
creation
- Implement SetLoadInfo() and Load() methods in SegmentInterface
- Support parallel loading of indexed and non-indexed fields
- Enable both sealed and growing segments to self-manage loading

### Storage V2 FFI Integration
- Integrate milvus-storage library's FFI interface for packed columnar
data
- Add manifest path support throughout the data path (SegmentInfo,
LoadInfo)
- Implement ManifestReader for generating manifests from binlogs
- Support zero-copy data exchange using Arrow C Data Interface
- Add ToCStorageConfig() for Go-to-C storage config conversion

### Manifest-Based Index Building
- Extend FileManagerContext to carry loon_ffi_properties
- Implement GetFieldDatasFromManifest() using Arrow C Stream interface
- Support manifest-based reading in DiskFileManagerImpl and
MemFileManagerImpl
- Add fallback to traditional segment insert files when manifest
unavailable

### Compaction Pipeline Updates
- Include manifest path in all compaction task builders (clustering, L0,
mix)
- Update BulkPackWriterV2 to return manifest path
- Propagate manifest metadata through compaction pipeline

### Configuration & Protocol
- Add common.storageV2.useLoonFFI config option (default: false)
- Add manifest_path field to SegmentLoadInfo and related proto messages
- Add manifest field to compaction segment messages

### Bug Fixes
- Fix mmap settings not applied during segment load (key typo fix)
- Populate index info after segment loading to prevent redundant load
tasks
- Fix memory corruption by removing premature transaction handle
destruction

Related issues: #44956, #45060, #39173

## Individual Cherry-Picked Commits

1. **e1c923b5cc** - fix: apply mmap settings correctly during segment
load (#46017)
2. **63b912370b** - enhance: use milvus-storage internal C++ Reader API
for Loon FFI (#45897)
3. **bfc192faa5** - enhance: Resolve issues integrating loon FFI
(#45918)
4. **fb18564631** - enhance: support manifest-based index building with
Loon FFI reader (#45726)
5. **b9ec2392b9** - enhance: integrate StorageV2 FFI interface for
manifest-based segment loading (#45798)
6. **66db3c32e6** - enhance: integrate Storage V2 FFI interface for
unified storage access (#45723)
7. **ae789273ac** - fix: populate index info after segment loading to
prevent redundant load tasks (#45803)
8. **49688b0be2** - enhance: Move segment loading logic from Go layer to
segcore for self-managed loading (#45488)
9. **5b2df88bac** - enhance: [StorageV2] Integrate FFI interface for
packed reader (#45132)
10. **91ff5706ac** - enhance: [StorageV2] add manifest path support for
FFI integration (#44991)
11. **2192bb4a85** - enhance: add NewSegmentWithLoadInfo API to support
segment self-managed loading (#45061)
12. **4296b01da0** - enhance: update delta log serialization APIs to
integrate storage V2 (#44998)

## Technical Details

### Architecture Changes
- **Before**: Go layer orchestrated segment loading, making multiple CGO
calls
- **After**: Segments autonomously manage loading in C++ layer with
single entry point

### Storage Access Pattern
- **Before**: Read individual binlog files through Go storage layer
- **After**: Read manifest file that references packed columnar data via
FFI

### Benefits
- Reduced cross-language call overhead
- Better resource management at C++ level
- Improved I/O performance through batched streaming reads
- Cleaner separation of concerns between Go and C++ layers
- Foundation for proactive schema evolution handling

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: Ted Xu <ted.xu@zilliz.com>
2025-12-04 17:09:12 +08:00

1137 lines
32 KiB
Protocol Buffer

syntax = "proto3";
package milvus.proto.data;
option go_package = "github.com/milvus-io/milvus/pkg/v2/proto/datapb";
import "common.proto";
import "internal.proto";
import "milvus.proto";
import "schema.proto";
import "msg.proto";
import "index_coord.proto";
// TODO: import google/protobuf/empty.proto
message Empty {}
enum SegmentType {
New = 0;
Normal = 1;
Flushed = 2;
Compacted = 3;
}
enum SegmentLevel {
Legacy = 0; // zero value for legacy logic
L0 = 1; // L0 segment, contains delta data for current channel
L1 = 2; // L1 segment, normal segment, with no extra compaction attribute
L2 = 3; // L2 segment, segment with extra data distribution info
}
service DataCoord {
rpc Flush(FlushRequest) returns (FlushResponse) {}
rpc FlushAll(FlushAllRequest) returns(FlushAllResponse) {}
// AllocSegment alloc a new growing segment, add it into segment meta.
rpc AllocSegment(AllocSegmentRequest) returns (AllocSegmentResponse) {}
rpc AssignSegmentID(AssignSegmentIDRequest) returns (AssignSegmentIDResponse) {
option deprecated = true;
}
rpc GetSegmentInfo(GetSegmentInfoRequest) returns (GetSegmentInfoResponse) {}
rpc GetSegmentStates(GetSegmentStatesRequest) returns (GetSegmentStatesResponse) {}
rpc GetInsertBinlogPaths(GetInsertBinlogPathsRequest) returns (GetInsertBinlogPathsResponse) {}
rpc GetCollectionStatistics(GetCollectionStatisticsRequest) returns (GetCollectionStatisticsResponse) {}
rpc GetPartitionStatistics(GetPartitionStatisticsRequest) returns (GetPartitionStatisticsResponse) {}
rpc GetSegmentInfoChannel(GetSegmentInfoChannelRequest) returns (milvus.StringResponse){}
rpc SaveBinlogPaths(SaveBinlogPathsRequest) returns (common.Status){}
rpc GetRecoveryInfo(GetRecoveryInfoRequest) returns (GetRecoveryInfoResponse){}
rpc GetRecoveryInfoV2(GetRecoveryInfoRequestV2) returns (GetRecoveryInfoResponseV2){}
rpc GetChannelRecoveryInfo(GetChannelRecoveryInfoRequest) returns (GetChannelRecoveryInfoResponse){}
rpc GetFlushedSegments(GetFlushedSegmentsRequest) returns(GetFlushedSegmentsResponse){}
rpc GetSegmentsByStates(GetSegmentsByStatesRequest) returns(GetSegmentsByStatesResponse){}
rpc GetFlushAllState(milvus.GetFlushAllStateRequest) returns(milvus.GetFlushAllStateResponse) {}
rpc ShowConfigurations(internal.ShowConfigurationsRequest) returns (internal.ShowConfigurationsResponse){}
// https://wiki.lfaidata.foundation/display/MIL/MEP+8+--+Add+metrics+for+proxy
rpc GetMetrics(milvus.GetMetricsRequest) returns (milvus.GetMetricsResponse) {}
rpc ManualCompaction(milvus.ManualCompactionRequest) returns (milvus.ManualCompactionResponse) {}
rpc GetCompactionState(milvus.GetCompactionStateRequest) returns (milvus.GetCompactionStateResponse) {}
rpc GetCompactionStateWithPlans(milvus.GetCompactionPlansRequest) returns (milvus.GetCompactionPlansResponse) {}
rpc WatchChannels(WatchChannelsRequest) returns (WatchChannelsResponse) {
option deprecated = true;
}
rpc GetFlushState(GetFlushStateRequest) returns (milvus.GetFlushStateResponse) {}
rpc DropVirtualChannel(DropVirtualChannelRequest) returns (DropVirtualChannelResponse) {}
rpc SetSegmentState(SetSegmentStateRequest) returns (SetSegmentStateResponse) {}
// Deprecated
rpc UpdateSegmentStatistics(UpdateSegmentStatisticsRequest) returns (common.Status) {}
rpc UpdateChannelCheckpoint(UpdateChannelCheckpointRequest) returns (common.Status) {}
rpc MarkSegmentsDropped(MarkSegmentsDroppedRequest) returns(common.Status) {}
rpc BroadcastAlteredCollection(AlterCollectionRequest) returns (common.Status) {}
rpc CheckHealth(milvus.CheckHealthRequest) returns (milvus.CheckHealthResponse) {}
rpc CreateIndex(index.CreateIndexRequest) returns (common.Status){}
rpc AlterIndex(index.AlterIndexRequest) returns (common.Status){}
// Deprecated: use DescribeIndex instead
rpc GetIndexState(index.GetIndexStateRequest) returns (index.GetIndexStateResponse) {}
rpc GetSegmentIndexState(index.GetSegmentIndexStateRequest) returns (index.GetSegmentIndexStateResponse) {}
rpc GetIndexInfos(index.GetIndexInfoRequest) returns (index.GetIndexInfoResponse){}
rpc DropIndex(index.DropIndexRequest) returns (common.Status) {}
rpc DescribeIndex(index.DescribeIndexRequest) returns (index.DescribeIndexResponse) {}
rpc GetIndexStatistics(index.GetIndexStatisticsRequest) returns (index.GetIndexStatisticsResponse) {}
// Deprecated: use DescribeIndex instead
rpc GetIndexBuildProgress(index.GetIndexBuildProgressRequest) returns (index.GetIndexBuildProgressResponse) {}
rpc ListIndexes(index.ListIndexesRequest) returns (index.ListIndexesResponse) {}
rpc GcConfirm(GcConfirmRequest) returns (GcConfirmResponse) {}
rpc ReportDataNodeTtMsgs(ReportDataNodeTtMsgsRequest) returns (common.Status) {}
rpc GcControl(GcControlRequest) returns(common.Status){}
// importV2
rpc ImportV2(internal.ImportRequestInternal) returns(internal.ImportResponse){}
rpc GetImportProgress(internal.GetImportProgressRequest) returns(internal.GetImportProgressResponse){}
rpc ListImports(internal.ListImportsRequestInternal) returns(internal.ListImportsResponse){}
// File Resource Management
rpc AddFileResource(milvus.AddFileResourceRequest) returns (common.Status) {}
rpc RemoveFileResource(milvus.RemoveFileResourceRequest) returns (common.Status) {}
rpc ListFileResources(milvus.ListFileResourcesRequest) returns (milvus.ListFileResourcesResponse) {}
}
service DataNode {
rpc GetComponentStates(milvus.GetComponentStatesRequest) returns (milvus.ComponentStates) {}
rpc GetStatisticsChannel(internal.GetStatisticsChannelRequest) returns (milvus.StringResponse) {}
rpc WatchDmChannels(WatchDmChannelsRequest) returns (common.Status) {}
rpc FlushSegments(FlushSegmentsRequest) returns(common.Status) {}
rpc ShowConfigurations(internal.ShowConfigurationsRequest) returns (internal.ShowConfigurationsResponse){}
// https://wiki.lfaidata.foundation/display/MIL/MEP+8+--+Add+metrics+for+proxy
rpc GetMetrics(milvus.GetMetricsRequest) returns (milvus.GetMetricsResponse) {}
rpc CompactionV2(CompactionPlan) returns (common.Status) {}
rpc GetCompactionState(CompactionStateRequest) returns (CompactionStateResponse) {}
rpc SyncSegments(SyncSegmentsRequest) returns (common.Status) {}
// Deprecated
rpc ResendSegmentStats(ResendSegmentStatsRequest) returns(ResendSegmentStatsResponse) {}
rpc FlushChannels(FlushChannelsRequest) returns(common.Status) {}
rpc NotifyChannelOperation(ChannelOperationsRequest) returns(common.Status) {}
rpc CheckChannelOperationProgress(ChannelWatchInfo) returns(ChannelOperationProgressResponse) {}
// import v2
rpc PreImport(PreImportRequest) returns(common.Status) {}
rpc ImportV2(ImportRequest) returns(common.Status) {}
rpc QueryPreImport(QueryPreImportRequest) returns(QueryPreImportResponse) {}
rpc QueryImport(QueryImportRequest) returns(QueryImportResponse) {}
rpc DropImport(DropImportRequest) returns(common.Status) {}
rpc QuerySlot(QuerySlotRequest) returns(QuerySlotResponse) {}
rpc DropCompactionPlan(DropCompactionPlanRequest) returns(common.Status) {}
}
message FlushRequest {
common.MsgBase base = 1;
int64 dbID = 2;
repeated int64 segmentIDs = 3;
int64 collectionID = 4;
bool isImport = 5; // deprecated
}
message FlushResponse {
common.Status status = 1;
int64 dbID = 2;
int64 collectionID = 3;
repeated int64 segmentIDs = 4; // newly sealed segments
repeated int64 flushSegmentIDs = 5; // old flushed segment
int64 timeOfSeal = 6;
uint64 flush_ts = 7;
map<string, msg.MsgPosition> channel_cps = 8;
}
message FlushResult {
int64 collectionID = 1;
repeated int64 segmentIDs =2; // newly sealed segments
repeated int64 flushSegmentIDs = 3; // old flushed segment
int64 timeOfSeal = 4;
uint64 flush_ts = 5;
map<string, msg.MsgPosition> channel_cps = 6;
string db_name = 7; // database name for this flush result
string collection_name = 8; // collection name for this flush result
}
message FlushAllRequest {
common.MsgBase base = 1;
string dbName = 2; // Deprecated: use flush_targets instead
// List of specific databases and collections to flush
repeated FlushAllTarget flush_targets = 3;
}
// Specific collection to flush with database context
// This message allows targeting specific collections within a database for flush operations
message FlushAllTarget {
// Database name to target for flush operation
string db_name = 1;
// Collections within this database to flush
// If empty, flush all collections in this database
repeated int64 collection_ids = 3;
}
message FlushAllResponse {
common.Status status = 1;
uint64 flushTs = 2;
// Detailed flush results for each target
repeated FlushResult flush_results = 3;
}
message FlushChannelsRequest {
common.MsgBase base = 1;
uint64 flush_ts = 2;
repeated string channels = 3;
}
message SegmentIDRequest {
uint32 count = 1;
string channel_name = 2;
int64 collectionID = 3;
int64 partitionID = 4;
bool isImport = 5; // deprecated
int64 importTaskID = 6; // deprecated
SegmentLevel level = 7; // deprecated
int64 storage_version = 8;
}
message AllocSegmentRequest {
int64 collection_id = 1;
int64 partition_id = 2;
int64 segment_id = 3; // segment id must be allocate from rootcoord idalloc service.
string vchannel = 4;
int64 storage_version = 5;
bool is_created_by_streaming = 6;
}
message AllocSegmentResponse {
SegmentInfo segment_info = 1;
common.Status status = 2;
}
message AssignSegmentIDRequest {
int64 nodeID = 1;
string peer_role = 2;
repeated SegmentIDRequest segmentIDRequests = 3;
}
message SegmentIDAssignment {
int64 segID = 1;
string channel_name = 2;
uint32 count = 3;
int64 collectionID = 4;
int64 partitionID = 5;
uint64 expire_time = 6;
common.Status status = 7;
}
message AssignSegmentIDResponse {
repeated SegmentIDAssignment segIDAssignments = 1;
common.Status status = 2;
}
message GetSegmentStatesRequest {
common.MsgBase base = 1;
repeated int64 segmentIDs = 2;
}
message SegmentStateInfo {
int64 segmentID = 1;
common.SegmentState state = 2;
msg.MsgPosition start_position = 3;
msg.MsgPosition end_position = 4;
common.Status status = 5;
}
message GetSegmentStatesResponse {
common.Status status = 1;
repeated SegmentStateInfo states = 2;
}
message GetSegmentInfoRequest {
common.MsgBase base = 1;
repeated int64 segmentIDs = 2;
bool includeUnHealthy =3;
}
message GetSegmentInfoResponse {
common.Status status = 1;
repeated SegmentInfo infos = 2;
map<string, msg.MsgPosition> channel_checkpoint = 3;
}
message GetInsertBinlogPathsRequest {
common.MsgBase base = 1;
int64 segmentID = 2;
}
message GetInsertBinlogPathsResponse {
repeated int64 fieldIDs = 1;
repeated internal.StringList paths = 2;
common.Status status = 3;
}
message GetCollectionStatisticsRequest {
common.MsgBase base = 1;
int64 dbID = 2;
int64 collectionID = 3;
}
message GetCollectionStatisticsResponse {
repeated common.KeyValuePair stats = 1;
common.Status status = 2;
}
message GetPartitionStatisticsRequest{
common.MsgBase base = 1;
int64 dbID = 2;
int64 collectionID = 3;
repeated int64 partitionIDs = 4;
}
message GetPartitionStatisticsResponse {
repeated common.KeyValuePair stats = 1;
common.Status status = 2;
}
message GetSegmentInfoChannelRequest {
}
message VchannelInfo {
int64 collectionID = 1;
string channelName = 2;
msg.MsgPosition seek_position = 3;
repeated SegmentInfo unflushedSegments = 4 [deprecated = true]; // deprecated, keep it for compatibility
repeated SegmentInfo flushedSegments = 5 [deprecated = true]; // deprecated, keep it for compatibility
repeated SegmentInfo dropped_segments = 6 [deprecated = true]; // deprecated, keep it for compatibility
repeated int64 unflushedSegmentIds = 7;
repeated int64 flushedSegmentIds = 8;
repeated int64 dropped_segmentIds = 9;
repeated int64 indexed_segmentIds = 10; // deprecated, keep it for compatibility
repeated SegmentInfo indexed_segments = 11; // deprecated, keep it for compatibility
repeated int64 level_zero_segment_ids = 12;
map<int64, int64> partition_stats_versions = 13;
// delete record which ts is smaller than delete_checkpoint already be dispatch to sealed segments.
msg.MsgPosition delete_checkpoint = 14;
}
message WatchDmChannelsRequest {
common.MsgBase base = 1;
repeated VchannelInfo vchannels = 2;
}
message FlushSegmentsRequest {
common.MsgBase base = 1;
int64 dbID = 2;
int64 collectionID = 3;
repeated int64 segmentIDs = 4; // segments to flush
string channelName = 5; // vchannel name to flush
}
message SegmentMsg{
common.MsgBase base = 1;
SegmentInfo segment = 2;
}
message SegmentInfo {
int64 ID = 1;
int64 collectionID = 2;
int64 partitionID = 3;
string insert_channel = 4;
int64 num_of_rows = 5;
common.SegmentState state = 6;
int64 max_row_num = 7 [deprecated = true]; // deprecated, we use the binary size to control the segment size but not a estimate rows.
uint64 last_expire_time = 8;
msg.MsgPosition start_position = 9;
msg.MsgPosition dml_position = 10;
// binlogs consist of insert binlogs
repeated FieldBinlog binlogs = 11;
repeated FieldBinlog statslogs = 12;
// deltalogs consists of delete binlogs. FieldID is not used yet since delete is always applied on primary key
repeated FieldBinlog deltalogs = 13;
bool createdByCompaction = 14;
repeated int64 compactionFrom = 15;
uint64 dropped_at = 16; // timestamp when segment marked drop
// A flag indicating if:
// (1) this segment is created by bulk insert, and
// (2) the bulk insert task that creates this segment has not yet reached `ImportCompleted` state.
bool is_importing = 17;
bool is_fake = 18;
// denote if this segment is compacted to other segment.
// For compatibility reasons, this flag of an old compacted segment may still be False.
// As for new fields added in the message, they will be populated with their respective field types' default values.
bool compacted = 19;
// Segment level, indicating compaction segment level
// Available value: Legacy, L0, L1, L2
// For legacy level, it represent old segment before segment level introduced
// so segments with Legacy level shall be treated as L1 segment
SegmentLevel level = 20;
int64 storage_version = 21;
int64 partition_stats_version = 22;
// use in major compaction, if compaction fail, should revert segment level to last value
SegmentLevel last_level = 23;
// use in major compaction, if compaction fail, should revert partition stats version to last value
int64 last_partition_stats_version = 24;
// used to indicate whether the segment is sorted by primary key.
bool is_sorted = 25;
// textStatsLogs is used to record tokenization index for fields.
map<int64, TextIndexStats> textStatsLogs = 26;
repeated FieldBinlog bm25statslogs = 27;
// This field is used to indicate that some intermediate state segments should not be loaded.
// For example, segments that have been clustered but haven't undergone stats yet.
bool is_invisible = 28;
// jsonKeyStats is used to record json key index for fields.
map<int64, JsonKeyStats> jsonKeyStats = 29;
// This field is used to indicate that the segment is created by streaming service.
// This field is meaningful only when the segment state is growing.
// If the segment is created by streaming service, it will be a true.
// A segment generated by datacoord of old arch, will be false.
// After the growing segment is full managed by streamingnode, the true value can never be seen at coordinator.
bool is_created_by_streaming = 30;
bool is_partition_key_sorted = 31;
// manifest_path stores the fullpath of LOON manifest file of segemnt data files.
// we could keep the fullpath since one segment shall only have one active manifest
// and we could keep the possiblity that manifest stores out side of collection/partition/segment path
string manifest_path = 32;
}
message SegmentStartPosition {
msg.MsgPosition start_position = 1;
int64 segmentID = 2;
}
message SaveBinlogPathsRequest {
common.MsgBase base = 1;
int64 segmentID = 2;
int64 collectionID = 3;
repeated FieldBinlog field2BinlogPaths = 4;
repeated CheckPoint checkPoints = 5;
repeated SegmentStartPosition start_positions = 6;
bool flushed = 7;
repeated FieldBinlog field2StatslogPaths = 8;
repeated FieldBinlog deltalogs = 9;
bool dropped = 10;
bool importing = 11; // deprecated
string channel = 12; // report channel name for verification
SegmentLevel seg_level =13;
int64 partitionID =14; // report partitionID for create L0 segment
int64 storageVersion = 15;
repeated FieldBinlog field2Bm25logPaths = 16;
bool with_full_binlogs = 17; // report with full data for verification.
string manifest_path = 18; //
}
message CheckPoint {
int64 segmentID = 1;
msg.MsgPosition position = 2;
int64 num_of_rows = 3;
}
message DeltaLogInfo {
uint64 record_entries = 1;
uint64 timestamp_from = 2;
uint64 timestamp_to = 3;
string delta_log_path = 4;
int64 delta_log_size = 5;
}
enum ChannelWatchState {
Uncomplete = 0; // deprecated, keep it for compatibility
Complete = 1; // deprecated, keep it for compatibility
ToWatch = 2;
WatchSuccess = 3;
WatchFailure = 4;
ToRelease = 5;
ReleaseSuccess = 6;
ReleaseFailure = 7;
}
message ChannelStatus {
string name = 1;
ChannelWatchState state=2;
int64 collectionID = 3;
}
message DataNodeInfo {
string address = 1;
int64 version = 2;
repeated ChannelStatus channels = 3;
}
message SegmentBinlogs {
int64 segmentID = 1;
repeated FieldBinlog fieldBinlogs = 2;
int64 num_of_rows = 3;
repeated FieldBinlog statslogs = 4;
repeated FieldBinlog deltalogs = 5;
string insert_channel = 6;
map<int64, TextIndexStats> textStatsLogs = 7;
}
message FieldBinlog{
int64 fieldID = 1;
repeated Binlog binlogs = 2;
repeated int64 child_fields = 3;
}
message TextIndexStats {
int64 fieldID = 1;
int64 version = 2;
repeated string files = 3;
int64 log_size = 4;
int64 memory_size = 5;
int64 buildID = 6;
}
message JsonKeyStats {
int64 fieldID = 1;
int64 version = 2;
repeated string files = 3;
int64 log_size = 4;
int64 memory_size = 5;
int64 buildID = 6;
int64 json_key_stats_data_format =7;
}
message Binlog {
int64 entries_num = 1;
uint64 timestamp_from = 2;
uint64 timestamp_to = 3;
// deprecated
string log_path = 4;
int64 log_size = 5;
int64 logID = 6;
// memory_size represents the size occupied by loading data into memory.
// log_size represents the size after data serialized.
// for stats_log, the memory_size always equal log_size.
int64 memory_size = 7;
}
message GetRecoveryInfoResponse {
common.Status status = 1;
repeated VchannelInfo channels = 2;
repeated SegmentBinlogs binlogs = 3;
}
message GetRecoveryInfoRequest {
common.MsgBase base = 1;
int64 collectionID = 2;
int64 partitionID = 3;
}
message GetRecoveryInfoResponseV2 {
common.Status status = 1;
repeated VchannelInfo channels = 2;
repeated SegmentInfo segments = 3;
}
message GetRecoveryInfoRequestV2 {
common.MsgBase base = 1;
int64 collectionID = 2;
repeated int64 partitionIDs = 3;
}
message GetChannelRecoveryInfoRequest {
common.MsgBase base = 1;
string vchannel = 2;
}
message GetChannelRecoveryInfoResponse {
common.Status status = 1;
VchannelInfo info = 2;
schema.CollectionSchema schema = 3 [deprecated = true]; // schema is managed by streaming node itself now, so it should not be passed by rpc.
repeated SegmentNotCreatedByStreaming segments_not_created_by_streaming = 4; // Should be flushed by streaming service when upgrading.
}
message SegmentNotCreatedByStreaming {
int64 collection_id = 1;
int64 partition_id = 2;
int64 segment_id = 3;
}
message GetSegmentsByStatesRequest {
common.MsgBase base = 1;
int64 collectionID = 2;
int64 partitionID = 3;
repeated common.SegmentState states = 4;
}
message GetSegmentsByStatesResponse {
common.Status status = 1;
repeated int64 segments = 2;
}
message GetFlushedSegmentsRequest {
common.MsgBase base = 1;
int64 collectionID = 2;
int64 partitionID = 3;
bool includeUnhealthy = 4;
}
message GetFlushedSegmentsResponse {
common.Status status = 1;
repeated int64 segments = 2;
}
message SegmentFlushCompletedMsg {
common.MsgBase base = 1;
SegmentInfo segment = 2;
}
message ChannelWatchInfo {
VchannelInfo vchan= 1;
int64 startTs = 2;
ChannelWatchState state = 3;
// the timeout ts, datanode shall do nothing after it
// NOT USED.
int64 timeoutTs = 4;
// the schema of the collection to watch, to avoid get schema rpc issues.
schema.CollectionSchema schema = 5;
// watch progress, deprecated
int32 progress = 6;
int64 opID = 7;
repeated common.KeyValuePair dbProperties = 8;
}
enum CompactionType {
UndefinedCompaction = 0;
reserved 1;
MergeCompaction = 2;
MixCompaction = 3;
// compactionV2
SingleCompaction = 4;
MinorCompaction = 5;
MajorCompaction = 6;
Level0DeleteCompaction = 7;
ClusteringCompaction = 8;
SortCompaction = 9;
}
message CompactionStateRequest {
common.MsgBase base = 1;
int64 planID = 2;
}
message SyncSegmentInfo {
int64 segment_id = 1;
FieldBinlog pk_stats_log = 2;
common.SegmentState state = 3;
SegmentLevel level = 4;
int64 num_of_rows = 5;
}
message SyncSegmentsRequest {
// Deprecated, after v2.4.3
int64 planID = 1;
// Deprecated, after v2.4.3
int64 compacted_to = 2;
// Deprecated, after v2.4.3
int64 num_of_rows = 3;
// Deprecated, after v2.4.3
repeated int64 compacted_from = 4;
// Deprecated, after v2.4.3
repeated FieldBinlog stats_logs = 5;
string channel_name = 6;
int64 partition_id = 7;
int64 collection_id = 8;
map<int64, SyncSegmentInfo> segment_infos = 9;
}
message CompactionSegmentBinlogs {
int64 segmentID = 1;
repeated FieldBinlog fieldBinlogs = 2;
repeated FieldBinlog field2StatslogPaths = 3;
repeated FieldBinlog deltalogs = 4;
string insert_channel = 5;
SegmentLevel level = 6;
int64 collectionID = 7;
int64 partitionID = 8;
bool is_sorted = 9;
int64 storage_version = 10;
string manifest = 11;
}
message CompactionPlan {
int64 planID = 1;
repeated CompactionSegmentBinlogs segmentBinlogs = 2;
int64 start_time = 3;
int32 timeout_in_seconds = 4 [deprecated = true];
CompactionType type = 5;
uint64 timetravel = 6;
string channel = 7;
int64 collection_ttl = 8; // nanoseconds
int64 total_rows = 9;
schema.CollectionSchema schema = 10;
int64 clustering_key_field = 11;
int64 max_segment_rows = 12;
int64 prefer_segment_rows = 13;
string analyze_result_path = 14;
repeated int64 analyze_segment_ids = 15;
int32 state = 16;
int64 begin_logID = 17; // deprecated, use pre_allocated_logIDs instead.
IDRange pre_allocated_segmentIDs = 18;
int64 slot_usage = 19;
int64 max_size = 20;
// bf path for importing
// collection is importing
IDRange pre_allocated_logIDs = 21;
string json_params = 22;
int32 current_scalar_index_version = 23;
repeated common.KeyValuePair plugin_context = 29;
}
message CompactionSegment {
int64 planID = 1; // deprecated after 2.3.4
int64 segmentID = 2;
int64 num_of_rows = 3;
repeated FieldBinlog insert_logs = 4;
repeated FieldBinlog field2StatslogPaths = 5;
repeated FieldBinlog deltalogs = 6;
string channel = 7;
bool is_sorted = 8;
repeated FieldBinlog bm25logs = 9;
int64 storage_version = 10;
map<int64, data.TextIndexStats> text_stats_logs = 11;
string manifest = 12;
}
message CompactionPlanResult {
int64 planID = 1;
CompactionTaskState state = 2;
repeated CompactionSegment segments = 3;
string channel = 4;
CompactionType type = 5;
}
message CompactionStateResponse {
common.Status status = 1;
repeated CompactionPlanResult results = 2;
}
// Deprecated
message SegmentFieldBinlogMeta {
int64 fieldID = 1;
string binlog_path = 2;
}
message WatchChannelsRequest {
int64 collectionID = 1;
repeated string channelNames = 2;
repeated common.KeyDataPair start_positions = 3;
schema.CollectionSchema schema = 4;
uint64 create_timestamp = 5;
repeated common.KeyValuePair db_properties = 6;
}
message WatchChannelsResponse {
common.Status status = 1;
}
message SetSegmentStateRequest {
common.MsgBase base = 1;
int64 segment_id = 2;
common.SegmentState new_state = 3;
}
message SetSegmentStateResponse {
common.Status status = 1;
}
message DropVirtualChannelRequest {
common.MsgBase base = 1;
string channel_name = 2;
repeated DropVirtualChannelSegment segments = 3;
}
message DropVirtualChannelSegment {
int64 segmentID = 1;
int64 collectionID = 2;
repeated FieldBinlog field2BinlogPaths = 3;
repeated FieldBinlog field2StatslogPaths = 4;
repeated FieldBinlog deltalogs = 5;
msg.MsgPosition startPosition = 6;
msg.MsgPosition checkPoint = 7;
int64 numOfRows = 8;
}
message DropVirtualChannelResponse {
common.Status status = 1;
}
message UpdateSegmentStatisticsRequest {
common.MsgBase base = 1;
repeated common.SegmentStats stats = 2;
}
message UpdateChannelCheckpointRequest {
common.MsgBase base = 1;
string vChannel = 2; // deprecated, keep it for compatibility
msg.MsgPosition position = 3; // deprecated, keep it for compatibility
repeated msg.MsgPosition channel_checkpoints = 4;
}
message ResendSegmentStatsRequest {
common.MsgBase base = 1;
}
message ResendSegmentStatsResponse {
common.Status status = 1;
repeated int64 seg_resent = 2;
}
message MarkSegmentsDroppedRequest {
common.MsgBase base = 1;
repeated int64 segment_ids = 2; // IDs of segments that needs to be marked as `dropped`.
}
message SegmentReferenceLock {
int64 taskID = 1;
int64 nodeID = 2;
repeated int64 segmentIDs = 3;
}
message AlterCollectionRequest {
int64 collectionID = 1;
schema.CollectionSchema schema = 2;
repeated int64 partitionIDs = 3;
repeated common.KeyDataPair start_positions = 4;
repeated common.KeyValuePair properties = 5;
int64 dbID = 6;
repeated string vChannels = 7;
}
message AddCollectionFieldRequest {
int64 collectionID = 1;
int64 dbID = 2;
schema.FieldSchema field_schema = 3;
schema.CollectionSchema schema = 4;
repeated int64 partitionIDs = 5;
repeated common.KeyDataPair start_positions = 6;
repeated common.KeyValuePair properties = 7;
repeated string vChannels = 8;
}
message GcConfirmRequest {
int64 collection_id = 1;
int64 partition_id = 2; // -1 means whole collection.
}
message GcConfirmResponse {
common.Status status = 1;
bool gc_finished = 2;
}
message ReportDataNodeTtMsgsRequest {
common.MsgBase base = 1;
repeated msg.DataNodeTtMsg msgs = 2; // -1 means whole collection.
}
message GetFlushStateRequest {
repeated int64 segmentIDs = 1;
uint64 flush_ts = 2;
string db_name = 3;
string collection_name = 4;
int64 collectionID = 5;
}
message ChannelOperationsRequest {
repeated ChannelWatchInfo infos = 1;
}
message ChannelOperationProgressResponse {
common.Status status = 1;
int64 opID = 2;
ChannelWatchState state = 3;
int32 progress = 4;
}
message PreImportRequest {
string clusterID = 1;
int64 jobID = 2;
int64 taskID = 3;
int64 collectionID = 4;
repeated int64 partitionIDs = 5;
repeated string vchannels = 6;
schema.CollectionSchema schema = 7;
repeated internal.ImportFile import_files = 8;
repeated common.KeyValuePair options = 9;
index.StorageConfig storage_config = 10;
int64 task_slot = 11;
repeated common.KeyValuePair plugin_context = 12;
}
message IDRange {
int64 begin = 1;
int64 end = 2;
}
message ImportRequestSegment {
int64 segmentID = 1;
int64 partitionID = 2;
string vchannel = 3;
}
message ImportRequest {
string clusterID = 1;
int64 jobID = 2;
int64 taskID = 3;
int64 collectionID = 4;
repeated int64 partitionIDs = 5;
repeated string vchannels = 6;
schema.CollectionSchema schema = 7;
repeated internal.ImportFile files = 8;
repeated common.KeyValuePair options = 9;
uint64 ts = 10;
IDRange ID_range = 11;
repeated ImportRequestSegment request_segments = 12;
index.StorageConfig storage_config = 13;
int64 task_slot = 14;
int64 storage_version = 15;
repeated common.KeyValuePair plugin_context = 16;
}
message QueryPreImportRequest {
string clusterID = 1;
int64 jobID = 2;
int64 taskID = 3;
}
message PartitionImportStats {
map<int64, int64> partition_rows = 1; // partitionID -> numRows
map<int64, int64> partition_data_size = 2; // partitionID -> dataSize
}
message ImportFileStats {
internal.ImportFile import_file = 1;
int64 file_size = 2;
int64 total_rows = 3;
int64 total_memory_size = 4;
map<string, PartitionImportStats> hashed_stats = 5; // channel -> PartitionImportStats
}
message QueryPreImportResponse {
common.Status status = 1;
int64 taskID = 2;
ImportTaskStateV2 state = 3;
string reason = 4;
int64 slots = 5;
repeated ImportFileStats file_stats = 6;
}
message QueryImportRequest {
string clusterID = 1;
int64 jobID = 2;
int64 taskID = 3;
bool querySlot = 4;
}
message ImportSegmentInfo {
int64 segmentID = 1;
int64 imported_rows = 2;
repeated FieldBinlog binlogs = 3;
repeated FieldBinlog statslogs = 4;
repeated FieldBinlog deltalogs = 5;
repeated FieldBinlog bm25logs = 6;
}
message QueryImportResponse {
common.Status status = 1;
int64 taskID = 2;
ImportTaskStateV2 state = 3;
string reason = 4;
int64 slots = 5;
repeated ImportSegmentInfo import_segments_info = 6;
}
message DropImportRequest {
string clusterID = 1;
int64 jobID = 2;
int64 taskID = 3;
}
message ImportJob {
int64 jobID = 1;
int64 dbID = 2;
int64 collectionID = 3;
string collection_name = 4;
repeated int64 partitionIDs = 5;
repeated string vchannels = 6;
schema.CollectionSchema schema = 7;
uint64 timeout_ts = 8;
uint64 cleanup_ts = 9;
int64 requestedDiskSize = 10;
internal.ImportJobState state = 11;
string reason = 12;
string complete_time = 13;
repeated internal.ImportFile files = 14;
repeated common.KeyValuePair options = 15;
string create_time = 16;
repeated string ready_vchannels = 17;
uint64 data_ts = 18;
}
enum ImportTaskStateV2 {
None = 0;
Pending = 1;
InProgress = 2;
Failed = 3;
Completed = 4;
Retry = 5;
}
enum ImportTaskSourceV2 {
Request = 0;
L0Compaction = 1;
}
message PreImportTask {
int64 jobID = 1;
int64 taskID = 2;
int64 collectionID = 3;
int64 nodeID = 6;
ImportTaskStateV2 state = 7;
string reason = 8;
repeated ImportFileStats file_stats = 10;
string created_time = 11;
string complete_time = 12;
}
message ImportTaskV2 {
int64 jobID = 1;
int64 taskID = 2;
int64 collectionID = 3;
repeated int64 segmentIDs = 4;
int64 nodeID = 5;
ImportTaskStateV2 state = 6;
string reason = 7;
string complete_time = 8;
repeated ImportFileStats file_stats = 9;
repeated int64 sorted_segmentIDs = 10;
string created_time = 11;
ImportTaskSourceV2 source = 12;
}
enum GcCommand {
_ = 0;
Pause = 1;
Resume = 2;
}
message GcControlRequest {
common.MsgBase base = 1;
GcCommand command = 2;
repeated common.KeyValuePair params = 3;
}
// The response message for GetGcStatus.
message GetGcStatusResponse {
// is_paused is true if the garbage collector is currently paused.
bool is_paused = 1;
// time_remaining_seconds is the remaining duration in seconds until the GC resumes.
int32 time_remaining_seconds = 2;
}
message QuerySlotRequest {}
message QuerySlotResponse {
common.Status status = 1;
int64 available_slots = 2;
}
enum CompactionTaskState {
unknown = 0;
executing = 1;
pipelining = 2;
completed = 3;
failed = 4;
timeout = 5;
analyzing = 6;
indexing = 7;
cleaned = 8;
meta_saved = 9;
statistic = 10;
}
message CompactionTask{
int64 planID = 1;
int64 triggerID = 2;
int64 collectionID = 3;
int64 partitionID = 4;
string channel = 5;
CompactionType type = 6;
CompactionTaskState state = 7;
string fail_reason = 8;
int64 start_time = 9;
int64 end_time = 10;
int32 timeout_in_seconds = 11 [deprecated = true];
int32 retry_times = 12;
int64 collection_ttl = 13;
int64 total_rows = 14;
repeated int64 inputSegments = 15;
repeated int64 resultSegments = 16;
msg.MsgPosition pos = 17;
int64 nodeID = 18;
schema.CollectionSchema schema = 19;
schema.FieldSchema clustering_key_field = 20;
int64 max_segment_rows = 21;
int64 prefer_segment_rows = 22;
int64 analyzeTaskID = 23;
int64 analyzeVersion = 24;
int64 lastStateStartTime = 25;
int64 max_size = 26;
repeated int64 tmpSegments = 27;
IDRange pre_allocated_segmentIDs = 28;
}
message PartitionStatsInfo {
int64 collectionID = 1;
int64 partitionID = 2;
string vChannel = 3;
int64 version = 4;
repeated int64 segmentIDs = 5;
int64 analyzeTaskID = 6;
int64 commitTime = 7;
}
message DropCompactionPlanRequest {
int64 planID = 1;
}
message FileResourceInfo {
string name = 1;
string path = 2;
int64 resource_id = 3;
}