2306 Commits

Author SHA1 Message Date
congqixia
3daff1ab2b
enhance: use specified manifest version in loon ffi reader (#46101)
Related to #44956

Use the exact manifest version from the path parameter instead of always
fetching the latest manifest. This ensures data consistency by reading
from the specific version that was requested.

Changes:
- Update GetColumnGroups to use transaction.begin(version) with the
specified version from the path JSON
- Replace get_latest_manifest() with get_current_manifest() after
beginning transaction at the target version
- Update Go FFI binding to call get_column_groups_by_version instead of
get_latest_column_groups
- Remove unused GetManifest function from util.cpp/util.h
- Bump milvus-storage version from 5fff4f5 to 33bf815

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-05 11:45:11 +08:00
Buqian Zheng
1372e84d7f
fix: move cursor after skip index skipped a chunk (#46054)
issue: #46053

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-05 10:47:11 +08:00
congqixia
3bbc5d0825
fix: correct loop logic for timestamptz scalar index output (#46100)
Related to #46098

Fix the ReverseDataFromIndex function where the assignment of raw_data
to scalar_array and the break statement were incorrectly placed inside
the for loop for TIMESTAMPTZ data type. This caused QueryNode to panic
when outputting timestamptz fields from scalar index.

Move the assignment and break statement outside the loop to match the
pattern used by other data types like VARCHAR.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-05 00:09:11 +08:00
Xinyi7
59752f216d
fix: add check in batch_score function to prevent query node seg fault (#46025)
previously we saw that when doing reranker with phrase matching, the
query node throws a segmentation fault error.

github issue link: https://github.com/milvus-io/milvus/issues/45990

---------

Signed-off-by: Xinyi Jiang <xinyi.jiang@reddit.com>
Co-authored-by: Xinyi Jiang <xinyi.jiang@reddit.com>
2025-12-04 17:35:17 +08:00
XuanYang-cn
64d19fb4f3
fix: Set plugin context when cipher enabled (#46050)
See also: #46008

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-12-04 15:17:15 +08:00
Ted Xu
20ce9fdc23
feat: bump loon version (#46029)
See: #44956

This PR upgrades loon to the latest version and resolves building
conflicts.

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-04 10:57:12 +08:00
aoiasd
b9487cf8e7
enhance: support load stop words file and add chinese default stop words (#45577)
relate: https://github.com/milvus-io/milvus/issues/45576

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-03 14:33:10 +08:00
Spade A
3fc309bdfc
fix: add more logs related to tantivy upload/cache (#46019)
issue: https://github.com/milvus-io/milvus/issues/45590

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-12-03 10:47:09 +08:00
congqixia
5d0c8b1b40
fix: apply mmap settings correctly during segment load (#46017)
Previously, mmap settings configured at the collection or field level
were not being applied during segment loading in segcore. This was
caused by:

1. A typo in the key name: "mmap.enable" instead of "mmap.enabled"
2. Missing logic to parse and apply mmap settings from schema

This commit fixes the issue by:
- Correcting the key name to "mmap.enabled" to match the standard
- Adding Schema::MmapEnabled() method to retrieve field/collection level
mmap settings with proper fallback logic
- Parsing mmap settings from field type_params and collection properties
during schema parsing
- Applying computed mmap settings in LoadColumnGroup() and Load()
methods instead of hardcoded false values
- Using global MmapConfig as fallback when no explicit setting exists

The mmap setting priority is now:
1. Field-level mmap setting (from type_params)
2. Collection-level mmap setting (from properties)
3. Global mmap config (from MmapManager)

For column groups, if any field has mmap enabled, the entire group uses
mmap (since they are loaded together).

Related issue: #45060

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-03 10:31:10 +08:00
aoiasd
f62297db92
feat: support synonym filter in analyzer (#45540)
relate: https://github.com/milvus-io/milvus/issues/45539

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-02 10:23:13 +08:00
zhagnlu
d5bd17315c
enhance: remove some meta cache for json shredding (#45888)
#42533

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-12-01 11:57:09 +08:00
Buqian Zheng
b886b14291
fix: term expr to correctly handle in of string in json (#45955)
issue: #45887

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-29 21:53:08 +08:00
sparknack
8ef35de7ca
enhance: always use buffered io for high load priority (#45900)
issue: #43040

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-29 00:03:08 +08:00
congqixia
d2e4278b18
enhance: use milvus-storage internal C++ Reader API for Loon FFI (#45897)
This PR refactors the Loon FFI reader implementation to use
milvus-storage's internal C++ Reader API directly instead of the
external FFI interface.

Key changes:
- Replace external FFI calls (get_record_batch_reader, reader_destroy)
with direct C++ Reader API calls
- Add GetLoonReader() helper function to create Reader instances using
milvus-storage::api::Reader::create()
- Use MakeInternalPropertiesFromStorageConfig() instead of
MakePropertiesFromStorageConfig() to get internal properties
- Update NewPackedFFIReaderWithManifest() to deserialize column groups
from JSON manifest content directly
- Simplify GetFFIReaderStream() to use Reader::get_record_batch_reader()
and arrow::ExportRecordBatchReader() for Arrow stream export
- Change CFFIPackedReader typedef from ReaderHandle to void* for
flexibility
- Update milvus-storage dependency version to ba7df7b

This change improves code maintainability by using the native C++ API
directly and eliminates the overhead of going through the external FFI
layer.

issue: #44956

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-28 18:55:07 +08:00
congqixia
ae256c52ae
enhance: Resolve issues integrating loon FFI (#45918)
Related to #44956

- Update milvus-storage version to ba7df7b for chunk reader fix
- Pass manifest path to index build request in DataCoord/DataNode
- Add null chunk assertion with detailed debug info in
ManifestGroupTranslator
- Fix memory corruption by removing premature transaction handle
destruction
- Clean up log message in ChunkedSegmentSealedImpl

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-28 18:41:08 +08:00
zhagnlu
1b58844319
enhance: support mmap for jsonstats shared key index (#44914)
#42533

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-11-27 16:01:08 +08:00
Xiaofan
f455910bee
fix: support azure blob storage with federated token (#45632)
fix #44582 
related to #44583
Co-authored-by: DuMinhLe<https://github.com/ducminhle>

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2025-11-27 14:29:07 +08:00
Buqian Zheng
6c0a80d8c3
enhance: pk binary range in sealed segment to use binary search (#45829)
issue: https://github.com/milvus-io/milvus/discussions/44935
pr: https://github.com/milvus-io/milvus/pull/45328

this pr is to improve pk range op

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-26 17:17:08 +08:00
congqixia
3f8c146831
enhance: support manifest-based index building with Loon FFI reader (#45726)
This PR adds support for reading data from StorageV2 using manifest
files and the Loon FFI interface during index building, providing an
alternative to the traditional segment insert files approach.

Key changes:

Core C++ changes:
- Add SEGMENT_MANIFEST_KEY and LOON_FFI_PROPERTIES_KEY constants for
manifest handling
- Extend FileManagerContext to carry loon_ffi_properties for FFI
operations
- Update index_c.cpp to pass manifest and loon properties to file
managers for all index types (vector, JSON key, text)
- Implement GetFieldDatasFromManifest() in Util.cpp using Arrow C Stream
interface:
  * Create Arrow schema from field metadata
  * Initialize FFI reader with manifest content and storage properties
  * Import record batches from C data interface
  * Convert to FieldData for index building
- Update DiskFileManagerImpl and MemFileManagerImpl to support
manifest-based data reading with fallback to traditional paths

Loon FFI utilities (internal/core/src/storage/loon_ffi/):
- Add ToCStorageConfig() to convert StorageConfig to C-compatible
structure
- Implement GetManifest() to parse manifest JSON and retrieve column
groups via FFI
- Enhance MakePropertiesFromStorageConfig() integration

Storage V2 integration:
- Update milvus-storage dependency from 0883026 to 302143c for latest
FFI support

Protobuf changes:
- Add manifest field to BuildIndexInfo for passing manifest path to C++
layer

Configuration:
- Add common.storageV2.useLoonFFI config option (default: false) for
feature toggle

This change is part of issue #44956 to integrate the StorageV2 FFI
interface as the unified storage layer. The implementation maintains
backward compatibility by checking for manifest presence and falling
back to existing segment insert files approach when manifest is not
provided.

Related issue: #44956

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-26 12:43:08 +08:00
sparknack
4b14ab14e3
enhance: mmap once for each group chunk (#45487)
issue: #45486

This commit refactors the chunk writing system by introducing a
two-phase
approach: size calculation followed by writing to a target. This enables
efficient group chunk creation where multiple fields share a single mmap
region, significantly reducing the number of mmap system calls and VMAs.

- Optimize `mmap` usage: single `mmap` per group chunk instead of per
field
- Split ChunkWriter into two phases:
  - `calculate_size()`: Pre-compute required memory without allocation
  - `write_to_target()`: Write data to a provided ChunkTarget
- Implement `ChunkMmapGuard` for unified mmap region lifecycle
management
  - Handles `munmap` and file cleanup via RAII
  - Shared via `std::shared_ptr` across multiple chunks in a group

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-26 10:37:08 +08:00
sparknack
0392db6976
enhance: add cancellation checking in each operator and expr (#45354)
issue: #45353

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-26 10:15:07 +08:00
congqixia
03f5d7c0a5
enhance: integrate StorageV2 FFI interface for manifest-based segment loading (#45798)
Related to #44956

**New Translator (C++)**
- Added `ManifestGroupTranslator`
(`internal/core/src/segcore/storagev2translator/`)
  - Translates manifest-based column groups to Milvus internal format
  - Implements `GroupCTMeta` interface for chunk-based column access
  - Supports both memory and mmap storage modes
  - Handles cache warmup policies for vector and scalar data

**ChunkedSegmentSealedImpl**
(`internal/core/src/segcore/ChunkedSegmentSealedImpl.cpp:333`)
- Added `LoadColumnGroups(const std::string& manifest_path)`: Main entry
point for manifest-based loading
  - Creates milvus-storage Reader from manifest file
  - Parallelizes column group loading using thread pool
  - Aggregates loading exceptions and reports errors
- Added `LoadColumnGroup()`: Loads individual column group
  - Extracts field IDs from column group metadata
  - Creates ManifestGroupTranslator for each column group
  - Builds ProxyChunkColumn for field access
  - Special handling for timestamp field index construction

**SegmentGrowingImpl**
(`internal/core/src/segcore/SegmentGrowingImpl.cpp`)
- Added similar `LoadColumnGroups()` and `LoadColumnGroup()` methods for
growing segments
- Maintains consistency with sealed segment loading path

Storage FFI Utilities

**loon_ffi/util** (`internal/core/src/storage/loon_ffi/util.cpp`)
- Added `MakeInternalPropertiesFromStorageConfig()`: Converts C storage
config to internal Properties
  - Maps all storage configuration fields (S3, GCS, Azure, local)
  - Handles SSL, IAM, virtual host settings
  - Configures connection timeouts and max connections
- Added `MakeInternalLocalProperies()`: Creates local filesystem
properties
- Added `ToCStorageConfig()`: Converts Go StorageConfig to C
representation
- Added `GetColumnGroups()`: Extracts column groups from manifest file
using Transaction API

Protocol Buffer Changes

**segcore.proto** (`pkg/proto/segcore.proto:121`)
- Added `manifest_path` field to `SegmentLoadInfo` message
- Enables passing manifest file path from Go layer to C++ core

Go Integration

**segment.go** (`internal/util/segcore/segment.go:372`)
- Updated `ConvertToSegcoreSegmentLoadInfo()` to propagate
`ManifestPath` field
- Bridges QueryNode segment load info to Segcore format

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-25 17:27:07 +08:00
Buqian Zheng
7078f403f1
enhance: add vector reserve to improve memory allocation in segcore (#45757)
This commit optimizes std::vector usage across segcore by adding
reserve() calls where the size is known in advance, reducing memory
reallocations during push_back operations.

Changes:
- TimestampIndex.cpp: Reserve space for prefix_sums and
timestamp_barriers
- SegmentGrowingImpl.cpp: Reserve space for binlog info vectors
- ChunkedSegmentSealedImpl.cpp: Reserve space for futures and field data
vectors
- storagev2translator/GroupChunkTranslator.cpp: Reserve space for
metadata vectors

This improves performance by avoiding multiple memory reallocations when
the vector size is predictable.

issue: https://github.com/milvus-io/milvus/issues/45679

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-25 14:19:07 +08:00
zhagnlu
346449d87f
fix:fix undefined behavior for dump snapshot (#45611)
#45610

this fix add a little cost for execute:
=== Lower Bound Overhead (isolated) ===
Position 1 (list len = 90000): 39 ns per lower_bound
Position 2 (list len =180000): 45 ns per lower_bound
Position 3 (list len =270000): 46 ns per lower_bound
Position 4 (list len =360000): 38 ns per lower_bound
Position 5 (list len =450000): 42 ns per lower_bound
Position 6 (list len =540000): 55 ns per lower_bound
Position 7 (list len =630000): 56 ns per lower_bound
Position 8 (list len =720000): 49 ns per lower_bound
Position 9 (list len =810000): 48 ns per lower_bound

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-11-25 14:17:07 +08:00
Bingyi Sun
929cb42fcc
fix: Replace json.doc() calls with json.dom_doc() in JsonContainsExpr (#45573)
issue: https://github.com/milvus-io/milvus/issues/45783
for simdjson ondemand api, a iterator can only be used once. use dom api
to prevent crashes when processing JSON contains operations with
different types.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-25 14:15:12 +08:00
aoiasd
322caafe18
feat: support file params in analyzer and set jieba dict file (#45206)
relate: https://github.com/milvus-io/milvus/issues/43687
Support use user provice file by file params, in analyzer params.
Could use local file or remote file resource.
Support use file params in jieba extern dict.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-25 11:33:06 +08:00
congqixia
c01fd94a6a
enhance: integrate Storage V2 FFI interface for unified storage access (#45723)
Related #44956
This commit integrates the Storage V2 FFI (Foreign Function Interface)
interface throughout the Milvus codebase, enabling unified storage
access through the Loon FFI layer. This is a significant step towards
standardizing storage operations across different storage versions.

1. Configuration Support
- **configs/milvus.yaml**: Added `useLoonFFI` configuration flag under
`common.storage.file.splitByAvgSize` section
- Allows runtime toggle between traditional binlog readers and new
FFI-based manifest readers
  - Default: `false` (maintains backward compatibility)

2. Core FFI Infrastructure

Enhanced Utilities (internal/core/src/storage/loon_ffi/util.cpp/h)
- **ToCStorageConfig()**: Converts Go's `StorageConfig` to C's
`CStorageConfig` struct for FFI calls
- **GetManifest()**: Parses manifest JSON and retrieves latest column
groups using FFI
  - Accepts manifest path with `base_path` and `ver` fields
  - Calls `get_latest_column_groups()` FFI function
  - Returns column group information as string
  - Comprehensive error handling for JSON parsing and FFI errors

3. Dependency Updates
- **internal/core/thirdparty/milvus-storage/CMakeLists.txt**:
  - Updated milvus-storage version from `0883026` to `302143c`
  - Ensures compatibility with latest FFI interfaces

4. Data Coordinator Changes

All compaction task builders now include manifest path in segment
binlogs:

- **compaction_task_clustering.go**: Added `Manifest:
segInfo.GetManifestPath()` to segment binlogs
- **compaction_task_l0.go**: Added manifest path to both L0 segment
selection and compaction plan building
- **compaction_task_mix.go**: Added manifest path to mixed compaction
segment binlogs
- **meta.go**: Updated metadata completion logic:
- `completeClusterCompactionMutation()`: Set `ManifestPath` in new
segment info
- `completeMixCompactionMutation()`: Preserve manifest path in compacted
segments
- `completeSortCompactionMutation()`: Include manifest path in sorted
segments

5. Data Node Compactor Enhancements

All compactors updated to support dual-mode reading (binlog vs
manifest):

6. Flush & Sync Manager Updates

Pack Writer V2 (pack_writer_v2.go)
- **BulkPackWriterV2.Write()**: Extended return signature to include
`manifest string`
- Implementation:
  - Generate manifest path: `path.Join(pack.segmentID, "manifest.json")`
  - Write packed data using FFI-based writer
  - Return manifest path along with binlogs, deltas, and stats

Task Handling (task.go)
- Updated all sync task result handling to accommodate new manifest
return value
- Ensured backward compatibility for callers not using manifest

7. Go Storage Layer Integration

New Interfaces and Implementations
- **record_reader.go**: Interface for unified record reading across
storage versions
- **record_writer.go**: Interface for unified record writing across
storage versions
- **binlog_record_writer.go**: Concrete implementation for traditional
binlog-based writing

Enhanced Schema Support (schema.go, schema_test.go)
- Schema conversion utilities to support FFI-based storage operations
- Ensures proper Arrow schema mapping for V2 storage

Serialization Updates
- **serde.go, serde_events.go, serde_events_v2.go**: Updated to work
with new reader/writer interfaces
- Test files updated to validate dual-mode serialization

8. Storage V2 Packed Format

FFI Common (storagev2/packed/ffi_common.go)
- Common FFI utilities and type conversions for packed storage format

Packed Writer FFI (storagev2/packed/packed_writer_ffi.go)
- FFI-based implementation of packed writer
- Integrates with Loon storage layer for efficient columnar writes

Packed Reader FFI (storagev2/packed/packed_reader_ffi.go)
- Already existed, now complemented by writer implementation

9. Protocol Buffer Updates

data_coord.proto & datapb/data_coord.pb.go
- Added `manifest` field to compaction segment messages
- Enables passing manifest metadata through compaction pipeline

worker.proto & workerpb/worker.pb.go
- Added compaction parameter for `useLoonFFI` flag
- Allows workers to receive FFI configuration from coordinator

10. Parameter Configuration

component_param.go
- Added `UseLoonFFI` parameter to compaction configuration
- Reads from `common.storage.file.useLoonFFI` config path
- Default: `false` for safe rollout

11. Test Updates
- **clustering_compactor_storage_v2_test.go**: Updated signatures to
handle manifest return value
- **mix_compactor_storage_v2_test.go**: Updated test helpers for
manifest support
- **namespace_compactor_test.go**: Adjusted writer calls to expect
manifest
- **pack_writer_v2_test.go**: Validated manifest generation in pack
writing

This integration follows a **dual-mode approach**:
1. **Legacy Path**: Traditional binlog-based reading/writing (when
`useLoonFFI=false` or no manifest)
2. **FFI Path**: Manifest-based reading/writing through Loon FFI (when
`useLoonFFI=true` and manifest exists)

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-24 19:57:07 +08:00
Buqian Zheng
2cf1e0e452
enhance: optimize pk search to use binary search, and 2 pointers for in expr (#45328)
issue: #44935

this is somewhat related to #44935, but on pk instead of stl_sort index

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-21 19:01:05 +08:00
Buqian Zheng
e00ad1098f
enhance: add ScalarFieldProto& overload to avoid unnecessary copies (#45743)
1. Array.h: Add output_data(ScalarFieldProto&) overload for both Array
and ArrayView classes
2. Use std::string_view instead of std::string for VARCHAR and GEOMETRY
types to avoid extra string copies
3. Call Reserve(length_) before writing to proto objects to reduce
memory reallocations

a simple test shows those optimizations improve the Array of Varchar
bulk_subscript performance by 20%

issue: https://github.com/milvus-io/milvus/issues/45679

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-21 18:35:05 +08:00
Bingyi Sun
275a5b9afc
enhance: optimize term expr performance (#45491)
issue: https://github.com/milvus-io/milvus/issues/45641

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-21 11:01:05 +08:00
Zhen Ye
1cd0ef943e
fix: use latest timetick to expire cache (#45717)
issue: #45697

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-20 21:39:04 +08:00
congqixia
79926b412c
fix: protect tbb concurrent_map emplace to avoid race condition deadlock (#45681)
Related to #44974

The emplace() operation on tbb::concurrent_hash_map was not protected,
allowing other threads to erase entries between the emplace attempt and
the subsequent lookup.

Solution:
1. Add shared_lock protection around the emplace() operation to prevent
concurrent erasure during insertion
2. Instead of returning nullptr when the key is not found on retry,
recursively call Get(key) to retry the entire operation
3. Fix typo: "earsed" -> "erased"

This ensures that concurrent Get() operations are properly synchronized
and will eventually succeed even under high contention.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-20 11:57:06 +08:00
Bingyi Sun
a3add6a391
fix: Fix json indices can not be loaded (#45620)
issue: https://github.com/milvus-io/milvus/issues/45575

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-20 10:41:06 +08:00
Buqian Zheng
5b85f0e4dc
enhance: updated multiple places where the expr copies the input values in every loop (#45680)
issue: https://github.com/milvus-io/milvus/issues/45679

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-20 01:51:07 +08:00
Gao
8ee8c01bcf
enhance: prefetch vector chunks for sealed non-indexed segments (#45665)
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-11-19 18:39:07 +08:00
sparknack
16acf8829b
enhance: expr: only prefetch chunks once (#45554)
issue: https://github.com/milvus-io/milvus/issues/43611

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-18 15:51:41 +08:00
862103595
a0e2fe78f3
enhance: Add ST_IsValid operator implementation for gis (#45501)
issue:#43427

---------

Signed-off-by: xiejh <862103595@qq.com>
2025-11-18 15:09:40 +08:00
aoiasd
96d0e780ac
fix: segcore collection schema update not concurrent safe. (#45337)
relate: https://github.com/milvus-io/milvus/issues/45345

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-14 17:51:37 +08:00
congqixia
0a208d7224
enhance: Move segment loading logic from Go layer to segcore for self-managed loading (#45488)
Related to #45060

Refactor segment loading architecture to make segments autonomously
manage their own loading process, moving the orchestration logic from Go
(segment_loader.go) to C++ (segcore).

**C++ Layer (segcore):**
- Added `SetLoadInfo()` and `Load()` methods to `SegmentInterface` and
implementations
- Implemented `ChunkedSegmentSealedImpl::Load()` with parallel loading
strategy:
  - Separates indexed fields from non-indexed fields
  - Loads indexes concurrently using thread pools
  - Loads field data for non-indexed fields in parallel
- Implemented `SegmentGrowingImpl::Load()` to convert and load field
data
- Extracted `LoadIndexData()` as a reusable utility function in
`Utils.cpp`
- Added `SegmentLoad()` C binding in `segment_c.cpp`

**Go Layer:**
- Added `Load()` method to segment interfaces
- Updated mock implementations and test interfaces
- Integrated new C++ `SegmentLoad()` binding in Go segment wrapper

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-14 11:21:37 +08:00
Gao
09a3195867
enhance: support max_connections config for remote storage (#45225)
related: https://github.com/milvus-io/milvus/issues/45344

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-11-13 15:37:38 +08:00
Spade A
929dc65882
fix: fix index compatibility after upgrade (#45373)
issue: https://github.com/milvus-io/milvus/issues/45380

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-11-13 12:59:38 +08:00
Chun Han
406fa7b694
fix: failed to get raw data for hybrid index(#45318) (#45411)
related: #45318

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-11-13 10:17:37 +08:00
sparknack
9d75d0393e
enhance: some optimization of scalar field fetching in tiered storage scenarios (#45360)
issue: #43611

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-11 17:17:41 +08:00
cai.zhang
e3c1673191
fix: Fix filter geometry for growing with mmap (#45464)
issue: #45450

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-11 15:39:36 +08:00
Chun Han
69f3aab229
feat: milvus support huawei cloud iam verification(#45298) (#45457)
related: #45298

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-11-11 14:41:41 +08:00
congqixia
8d1ea751a6
fix: Support JSON default values in FillFieldData (#45455)
Related to #45445

Previously, FillFieldData for JSON fields would assert and fail when a
default_value was provided, blocking index creation for JSON fields with
default values (including dynamic fields like $meta).

This change enables JSON default value support by:
- Removing the assertion that blocked default values
- Parsing bytes_data into Json objects when default_value is present
- Properly filling data_ array and setting valid_data_ bitset to true
- Maintaining null behavior when no default_value is provided

Impact:
- Fixes index creation failure for JSON fields with default values
- Resolves upgrade issues from 2.5 to 2.6.5 where dynamic fields with
default values couldn't be indexed
- Index builds that were stuck in InProgress state can now complete

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-11 10:35:36 +08:00
Gao
e9a875f7ac
enhance: override index_type while creating segment index (#45416)
issue: #44752

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-11-11 07:27:36 +08:00
congqixia
0e1de0073a
enhance: Update tantivy-binding with cargo build result (#45458)
Related to #44988

This PR commit newly updated tantivy-binding.h with cargo build result
which shall passes format check.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-10 18:09:36 +08:00
aoiasd
e82bf0e54f
enhance: fix typo of analyzer params (#45299)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-10 14:35:35 +08:00
aoiasd
a38a0deb43
enhance: prevent panic by adding null pointer check when clearing InsertRecord _pk2offset_ (#45281)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-10 11:37:35 +08:00