congqixia 3f8c146831
enhance: support manifest-based index building with Loon FFI reader (#45726)
This PR adds support for reading data from StorageV2 using manifest
files and the Loon FFI interface during index building, providing an
alternative to the traditional segment insert files approach.

Key changes:

Core C++ changes:
- Add SEGMENT_MANIFEST_KEY and LOON_FFI_PROPERTIES_KEY constants for
manifest handling
- Extend FileManagerContext to carry loon_ffi_properties for FFI
operations
- Update index_c.cpp to pass manifest and loon properties to file
managers for all index types (vector, JSON key, text)
- Implement GetFieldDatasFromManifest() in Util.cpp using Arrow C Stream
interface:
  * Create Arrow schema from field metadata
  * Initialize FFI reader with manifest content and storage properties
  * Import record batches from C data interface
  * Convert to FieldData for index building
- Update DiskFileManagerImpl and MemFileManagerImpl to support
manifest-based data reading with fallback to traditional paths

Loon FFI utilities (internal/core/src/storage/loon_ffi/):
- Add ToCStorageConfig() to convert StorageConfig to C-compatible
structure
- Implement GetManifest() to parse manifest JSON and retrieve column
groups via FFI
- Enhance MakePropertiesFromStorageConfig() integration

Storage V2 integration:
- Update milvus-storage dependency from 0883026 to 302143c for the
latest FFI support

Protobuf changes:
- Add manifest field to BuildIndexInfo for passing the manifest path to
the C++ layer

Configuration:
- Add common.storageV2.useLoonFFI config option (default: false) as a
feature toggle

This change is part of issue #44956 to integrate the StorageV2 FFI
interface as the unified storage layer. The implementation maintains
backward compatibility by checking for manifest presence and falling
back to the existing segment insert files approach when no manifest is
provided.
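
The fallback rule can be sketched roughly as follows. This is a minimal
illustration, not the code in this PR: the Config alias, the DataSource
enum, ChooseDataSource(), and the "segment_manifest" key string are all
hypothetical stand-ins chosen for the example.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the build config passed to the file managers.
using Config = std::map<std::string, std::string>;

// Hypothetical tag for the two read paths described in this commit.
enum class DataSource { Manifest, SegmentInsertFiles };

// Assumed key string, for illustration only; the real value of
// SEGMENT_MANIFEST_KEY is defined in the Milvus core sources.
const std::string kSegmentManifestKey = "segment_manifest";

// Pick the manifest path only when a non-empty manifest is present;
// otherwise fall back to the traditional segment insert files.
DataSource ChooseDataSource(const Config& config) {
    auto it = config.find(kSegmentManifestKey);
    if (it != config.end() && !it->second.empty()) {
        return DataSource::Manifest;
    }
    return DataSource::SegmentInsertFiles;
}
```

In this sketch an empty manifest value is treated the same as a missing
one, so existing callers that never set the key keep their current
behavior.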

Related issue: #44956

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-26 12:43:08 +08:00