milvus/pkg/proto/segcore.proto
congqixia 36a887b38b
enhance: add NewSegmentWithLoadInfo API to support segment self-managed loading (#45061)
This commit introduces the foundation for enabling segments to manage
their own loading process by passing load information during segment
creation.

Changes:

C++ Layer:
- Add NewSegmentWithLoadInfo() C API to create segments with serialized
load info
- Add SetLoadInfo() method to SegmentInterface for storing load
information
- Refactor segment creation logic into shared CreateSegment() helper
function
- Add comprehensive documentation for the new API

Go Layer:
- Extend CreateCSegmentRequest to support optional LoadInfo field
- Update segment creation in querynode to pass SegmentLoadInfo when
available
- Add ConvertToSegcoreSegmentLoadInfo() and helper converters for proto
translation

Proto Definitions:
- Add segcorepb.SegmentLoadInfo message with essential loading metadata
- Add supporting messages: Binlog, FieldBinlog, FieldIndexInfo,
TextIndexStats, JsonKeyStats
- Remove dependency on data_coord.proto by creating segcore-specific
definitions

Testing:
- Add comprehensive unit tests for proto conversion functions
- Test edge cases including nil inputs, empty data, and nil array/map
elements
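
The nil-handling behaviour covered by the tests can be sketched as follows. This is a minimal illustration, not the code from this commit: the Src*/Dst* types are hypothetical stand-ins for the generated source and segcorepb structs with only a few fields kept, the helper names are invented for the example, and preserving nil elements as nil entries is an assumption about the intended behaviour.

```go
package main

import "fmt"

// SrcBinlog and SrcFieldBinlog are hypothetical stand-ins for the
// source-side generated types; only a few fields are kept.
type SrcBinlog struct {
	EntriesNum int64
	LogPath    string
	LogID      int64
}

type SrcFieldBinlog struct {
	FieldID     int64
	Binlogs     []*SrcBinlog
	ChildFields []int64
}

// DstBinlog and DstFieldBinlog are hypothetical stand-ins for the
// segcore-side messages of the same shape.
type DstBinlog struct {
	EntriesNum int64
	LogPath    string
	LogID      int64
}

type DstFieldBinlog struct {
	FieldID     int64
	Binlogs     []*DstBinlog
	ChildFields []int64
}

// convertBinlog copies one binlog entry; a nil input yields nil.
func convertBinlog(src *SrcBinlog) *DstBinlog {
	if src == nil {
		return nil
	}
	return &DstBinlog{EntriesNum: src.EntriesNum, LogPath: src.LogPath, LogID: src.LogID}
}

// convertFieldBinlog copies one field's binlog list; a nil input yields nil.
func convertFieldBinlog(src *SrcFieldBinlog) *DstFieldBinlog {
	if src == nil {
		return nil
	}
	dst := &DstFieldBinlog{
		FieldID:     src.FieldID,
		ChildFields: append([]int64(nil), src.ChildFields...), // copy, do not alias
	}
	if src.Binlogs != nil {
		dst.Binlogs = make([]*DstBinlog, 0, len(src.Binlogs))
		for _, b := range src.Binlogs {
			// Assumption: nil elements are kept as nil entries.
			dst.Binlogs = append(dst.Binlogs, convertBinlog(b))
		}
	}
	return dst
}

// convertFieldBinlogs converts a slice; a nil slice stays nil.
func convertFieldBinlogs(src []*SrcFieldBinlog) []*DstFieldBinlog {
	if src == nil {
		return nil
	}
	out := make([]*DstFieldBinlog, 0, len(src))
	for _, fb := range src {
		out = append(out, convertFieldBinlog(fb))
	}
	return out
}

func main() {
	src := []*SrcFieldBinlog{
		{FieldID: 100, Binlogs: []*SrcBinlog{{EntriesNum: 10, LogID: 1}, nil}},
		nil,
	}
	dst := convertFieldBinlogs(src)
	fmt.Println(len(dst), dst[0].FieldID, len(dst[0].Binlogs), dst[1] == nil)
}
```

Checking for nil at every level keeps the converters safe to call on partially populated load info, which is the situation the edge-case tests exercise.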

This is the first step toward issue #45060 - enabling segments to
autonomously manage their loading process, which will:
- Clarify responsibilities between Go and C++ layers
- Reduce cross-language call overhead
- Enable precise resource management at the C++ level
- Support better integration with the caching layer
- Enable proactive schema evolution handling

Related to #45060

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-27 15:28:12 +08:00

syntax = "proto3";

package milvus.proto.segcore;

option go_package = "github.com/milvus-io/milvus/pkg/v2/proto/segcorepb";

import "schema.proto";
import "common.proto";

message Binlog {
  int64 entries_num = 1;
  uint64 timestamp_from = 2;
  uint64 timestamp_to = 3;
  string log_path = 4;
  int64 log_size = 5;
  int64 logID = 6;
  int64 memory_size = 7;
}

message FieldBinlog {
  int64 fieldID = 1;
  repeated Binlog binlogs = 2;
  repeated int64 child_fields = 3;
}

message TextIndexStats {
  int64 fieldID = 1;
  int64 version = 2;
  repeated string files = 3;
  int64 log_size = 4;
  int64 memory_size = 5;
  int64 buildID = 6;
}

message JsonKeyStats {
  int64 fieldID = 1;
  int64 version = 2;
  repeated string files = 3;
  int64 log_size = 4;
  int64 memory_size = 5;
  int64 buildID = 6;
  int64 json_key_stats_data_format = 7;
}

message RetrieveResults {
  schema.IDs ids = 1;
  repeated int64 offset = 2;
  repeated schema.FieldData fields_data = 3;
  int64 all_retrieve_count = 4;
  bool has_more_result = 5;
  int64 scanned_remote_bytes = 6;
  int64 scanned_total_bytes = 7;
}

message LoadFieldMeta {
  int64 min_timestamp = 1;
  int64 max_timestamp = 2;
  int64 row_count = 3;
}

message LoadSegmentMeta {
  // TODOs
  repeated LoadFieldMeta metas = 1;
  int64 total_size = 2;
}

message InsertRecord {
  repeated schema.FieldData fields_data = 1;
  int64 num_rows = 2;
}

message FieldIndexMeta {
  int64 fieldID = 1;
  int64 collectionID = 2;
  string index_name = 3;
  repeated common.KeyValuePair type_params = 4;
  repeated common.KeyValuePair index_params = 5;
  bool is_auto_index = 6;
  repeated common.KeyValuePair user_index_params = 7;
}

message CollectionIndexMeta {
  int64 maxIndexRowCount = 1;
  repeated FieldIndexMeta index_metas = 2;
}

message FieldIndexInfo {
  int64 fieldID = 1;
  bool enable_index = 2 [deprecated = true];
  string index_name = 3;
  int64 indexID = 4;
  int64 buildID = 5;
  repeated common.KeyValuePair index_params = 6;
  repeated string index_file_paths = 7;
  int64 index_size = 8;
  int64 index_version = 9;
  int64 num_rows = 10;
  int32 current_index_version = 11;
  int64 index_store_version = 12;
}

message SegmentLoadInfo {
  int64 segmentID = 1;
  int64 partitionID = 2;
  int64 collectionID = 3;
  int64 dbID = 4;
  int64 flush_time = 5;
  repeated FieldBinlog binlog_paths = 6;
  int64 num_of_rows = 7;
  repeated FieldBinlog statslogs = 8;
  repeated FieldBinlog deltalogs = 9;
  repeated int64 compactionFrom = 10; // segmentIDs compacted from
  repeated FieldIndexInfo index_infos = 11;
  int64 segment_size = 12 [deprecated = true];
  string insert_channel = 13;
  int64 readableVersion = 14;
  int64 storageVersion = 15;
  bool is_sorted = 16;
  map<int64, TextIndexStats> textStatsLogs = 17;
  repeated FieldBinlog bm25logs = 18;
  map<int64, JsonKeyStats> jsonKeyStatsLogs = 19;
  common.LoadPriority priority = 20;
}