This commit optimizes std::vector usage across segcore by adding
reserve() calls where the size is known in advance, reducing memory
reallocations during push_back operations.
Changes:
- TimestampIndex.cpp: Reserve space for prefix_sums and
timestamp_barriers
- SegmentGrowingImpl.cpp: Reserve space for binlog info vectors
- ChunkedSegmentSealedImpl.cpp: Reserve space for futures and field data
vectors
- storagev2translator/GroupChunkTranslator.cpp: Reserve space for
metadata vectors
This improves performance by avoiding multiple memory reallocations when
the vector size is predictable.
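The change itself is in C++ (`std::vector::reserve`). As a rough Go analogue of the same idea, the sketch below pre-sizes a slice's capacity when the final length is known. `prefixSums` is a hypothetical helper for illustration and is not code from segcore.

```go
package main

import "fmt"

// prefixSums is a hypothetical helper, not code from segcore.
// The output length is known up front (len(counts)+1), so the backing
// array is allocated once instead of growing during append.
func prefixSums(counts []int) []int {
	sums := make([]int, 0, len(counts)+1) // capacity reserved, length still 0
	total := 0
	sums = append(sums, total)
	for _, c := range counts {
		total += c
		sums = append(sums, total)
	}
	return sums
}

func main() {
	sums := prefixSums([]int{3, 1, 4})
	fmt.Println(sums, len(sums), cap(sums))
}
```

Because the capacity matches the final length, none of the `append` calls needs to reallocate.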
issue: https://github.com/milvus-io/milvus/issues/45679
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
#45610
This fix adds a small execution cost:
=== Lower Bound Overhead (isolated) ===
Position 1 (list len = 90000): 39 ns per lower_bound
Position 2 (list len = 180000): 45 ns per lower_bound
Position 3 (list len = 270000): 46 ns per lower_bound
Position 4 (list len = 360000): 38 ns per lower_bound
Position 5 (list len = 450000): 42 ns per lower_bound
Position 6 (list len = 540000): 55 ns per lower_bound
Position 7 (list len = 630000): 56 ns per lower_bound
Position 8 (list len = 720000): 49 ns per lower_bound
Position 9 (list len = 810000): 48 ns per lower_bound
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/45783
With the simdjson On-Demand API, an iterator can only be consumed once.
Use the DOM API instead to prevent crashes when processing JSON contains
operations with different types.
Signed-off-by: sunby <sunbingyi1992@gmail.com>
relate: https://github.com/milvus-io/milvus/issues/43687
Support user-provided files through file params in analyzer params.
The file can be a local file or a remote file resource.
Support file params for the jieba extern dict.
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
Skip redundant failure marking for completed import jobs when the
collection is dropped or the import jobs time out.
issue: https://github.com/milvus-io/milvus/issues/45766
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #45782
- Because the zero value of repeated and bytes fields in proto is
ignored or treated as an empty value rather than a nil pointer, fix the
recovery info of the broadcast task restored from proto to keep the
in-memory state consistent.
Signed-off-by: chyezh <chyezh@outlook.com>
Related #44956
This commit integrates the Storage V2 FFI (Foreign Function Interface)
throughout the Milvus codebase, enabling unified storage
access through the Loon FFI layer. This is a significant step towards
standardizing storage operations across different storage versions.
1. Configuration Support
- **configs/milvus.yaml**: Added `useLoonFFI` configuration flag under
`common.storage.file.splitByAvgSize` section
- Allows runtime toggle between traditional binlog readers and new
FFI-based manifest readers
- Default: `false` (maintains backward compatibility)
2. Core FFI Infrastructure
Enhanced Utilities (internal/core/src/storage/loon_ffi/util.cpp/h)
- **ToCStorageConfig()**: Converts Go's `StorageConfig` to C's
`CStorageConfig` struct for FFI calls
- **GetManifest()**: Parses manifest JSON and retrieves latest column
groups using FFI
- Accepts manifest path with `base_path` and `ver` fields
- Calls `get_latest_column_groups()` FFI function
- Returns column group information as string
- Comprehensive error handling for JSON parsing and FFI errors
3. Dependency Updates
- **internal/core/thirdparty/milvus-storage/CMakeLists.txt**:
- Updated milvus-storage version from `0883026` to `302143c`
- Ensures compatibility with latest FFI interfaces
4. Data Coordinator Changes
All compaction task builders now include manifest path in segment
binlogs:
- **compaction_task_clustering.go**: Added `Manifest:
segInfo.GetManifestPath()` to segment binlogs
- **compaction_task_l0.go**: Added manifest path to both L0 segment
selection and compaction plan building
- **compaction_task_mix.go**: Added manifest path to mixed compaction
segment binlogs
- **meta.go**: Updated metadata completion logic:
- `completeClusterCompactionMutation()`: Set `ManifestPath` in new
segment info
- `completeMixCompactionMutation()`: Preserve manifest path in compacted
segments
- `completeSortCompactionMutation()`: Include manifest path in sorted
segments
5. Data Node Compactor Enhancements
All compactors updated to support dual-mode reading (binlog vs
manifest):
6. Flush & Sync Manager Updates
Pack Writer V2 (pack_writer_v2.go)
- **BulkPackWriterV2.Write()**: Extended return signature to include
`manifest string`
- Implementation:
- Generate manifest path: `path.Join(pack.segmentID, "manifest.json")`
- Write packed data using FFI-based writer
- Return manifest path along with binlogs, deltas, and stats
Task Handling (task.go)
- Updated all sync task result handling to accommodate new manifest
return value
- Ensured backward compatibility for callers not using manifest
7. Go Storage Layer Integration
New Interfaces and Implementations
- **record_reader.go**: Interface for unified record reading across
storage versions
- **record_writer.go**: Interface for unified record writing across
storage versions
- **binlog_record_writer.go**: Concrete implementation for traditional
binlog-based writing
Enhanced Schema Support (schema.go, schema_test.go)
- Schema conversion utilities to support FFI-based storage operations
- Ensures proper Arrow schema mapping for V2 storage
Serialization Updates
- **serde.go, serde_events.go, serde_events_v2.go**: Updated to work
with new reader/writer interfaces
- Test files updated to validate dual-mode serialization
8. Storage V2 Packed Format
FFI Common (storagev2/packed/ffi_common.go)
- Common FFI utilities and type conversions for packed storage format
Packed Writer FFI (storagev2/packed/packed_writer_ffi.go)
- FFI-based implementation of packed writer
- Integrates with Loon storage layer for efficient columnar writes
Packed Reader FFI (storagev2/packed/packed_reader_ffi.go)
- Already existed, now complemented by writer implementation
9. Protocol Buffer Updates
data_coord.proto & datapb/data_coord.pb.go
- Added `manifest` field to compaction segment messages
- Enables passing manifest metadata through compaction pipeline
worker.proto & workerpb/worker.pb.go
- Added compaction parameter for `useLoonFFI` flag
- Allows workers to receive FFI configuration from coordinator
10. Parameter Configuration
component_param.go
- Added `UseLoonFFI` parameter to compaction configuration
- Reads from `common.storage.file.useLoonFFI` config path
- Default: `false` for safe rollout
11. Test Updates
- **clustering_compactor_storage_v2_test.go**: Updated signatures to
handle manifest return value
- **mix_compactor_storage_v2_test.go**: Updated test helpers for
manifest support
- **namespace_compactor_test.go**: Adjusted writer calls to expect
manifest
- **pack_writer_v2_test.go**: Validated manifest generation in pack
writing
This integration follows a **dual-mode approach**:
1. **Legacy Path**: Traditional binlog-based reading/writing (when
`useLoonFFI=false` or no manifest)
2. **FFI Path**: Manifest-based reading/writing through Loon FFI (when
`useLoonFFI=true` and manifest exists)
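The selection rule above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `chooseReadPath` helper and hypothetical mode names; the actual branching in the compactors and writers may be structured differently.

```go
package main

import "fmt"

// readPath names the two modes described above (names are illustrative).
type readPath string

const (
	legacyPath readPath = "legacy-binlog"
	ffiPath    readPath = "loon-ffi-manifest"
)

// chooseReadPath is a hypothetical helper: the FFI path is taken only when
// the flag is enabled AND the segment carries a manifest path; every other
// combination falls back to the legacy binlog path.
func chooseReadPath(useLoonFFI bool, manifest string) readPath {
	if useLoonFFI && manifest != "" {
		return ffiPath
	}
	return legacyPath
}

func main() {
	fmt.Println(chooseReadPath(false, "1001/manifest.json"))
	fmt.Println(chooseReadPath(true, ""))
	fmt.Println(chooseReadPath(true, "1001/manifest.json"))
}
```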
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
After segments gained self-management capabilities for loading, the
index information from the initial load was not being preserved in the
Go-side segment metadata. This caused QueryCoord to repeatedly dispatch
load index tasks, which would fail in segcore since the indexes were
already loaded.
**Root Cause:**
The segment's `fieldIndexes` map was not being populated with index
metadata after calling `FinishLoad`, leading to a mismatch between the
Go-side metadata and segcore's internal state.
**Solution:**
After successfully loading a sealed segment, iterate through
`loadInfo.IndexInfos` and insert each index entry into the segment's
`fieldIndexes` map. This ensures the Go-side metadata stays in sync with
segcore and prevents redundant load index operations.
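A minimal sketch of the fix, assuming simplified stand-in types: `indexInfo`, `loadInfo`, `segment`, `recordLoadedIndexes`, and `needsIndexLoad` are hypothetical names, and keying the map by index ID is an assumption made for the example.

```go
package main

import "fmt"

// indexInfo, loadInfo and segment are simplified stand-ins for the real types.
type indexInfo struct {
	FieldID int64
	IndexID int64
}

type loadInfo struct {
	IndexInfos []indexInfo
}

type segment struct {
	fieldIndexes map[int64]indexInfo // keyed by index ID in this sketch
}

// recordLoadedIndexes copies every index entry from the load info into the
// segment's Go-side metadata once the load has finished successfully.
func (s *segment) recordLoadedIndexes(info loadInfo) {
	if s.fieldIndexes == nil {
		s.fieldIndexes = make(map[int64]indexInfo, len(info.IndexInfos))
	}
	for _, idx := range info.IndexInfos {
		s.fieldIndexes[idx.IndexID] = idx
	}
}

// needsIndexLoad reports whether a load index task would still be required.
func (s *segment) needsIndexLoad(indexID int64) bool {
	_, loaded := s.fieldIndexes[indexID]
	return !loaded
}

func main() {
	seg := &segment{}
	fmt.Println(seg.needsIndexLoad(42)) // metadata still empty
	seg.recordLoadedIndexes(loadInfo{IndexInfos: []indexInfo{{FieldID: 101, IndexID: 42}}})
	fmt.Println(seg.needsIndexLoad(42)) // metadata now matches what was loaded
}
```

Once the map is populated, a coordinator-side check like `needsIndexLoad` no longer reports already-loaded indexes as missing.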
Fixes #45802
Related to #45060
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #44369
woodpecker related issue:
[#59](https://github.com/zilliztech/woodpecker/issues/59)
Refactor the WAL retention logic in Milvus StreamingNode:
- Remove the simple sampling-based truncation mechanism.
- After flush, WAL data is directly truncated.
- The retention control is now delegated to the underlying message queue
(MQ) implementation.
Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
1. Array.h: Add output_data(ScalarFieldProto&) overload for both Array
and ArrayView classes
2. Use std::string_view instead of std::string for VARCHAR and GEOMETRY
types to avoid extra string copies
3. Call Reserve(length_) before writing to proto objects to reduce
memory reallocations
A simple test shows these optimizations improve Array of Varchar
bulk_subscript performance by 20%.
issue: https://github.com/milvus-io/milvus/issues/45679
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
Related to #44620
Related to unstable unit test "internal/querycoordv2 TestServer/TestNodeUp"
Introduce SessionWatcher interface to fix race condition and goroutine
leak that caused unstable unit test TestServer/TestNodeUp.
Changes:
- Add SessionWatcher interface with EventChannel() and Stop() methods
- Refactor WatchServices() to return SessionWatcher instead of raw
channel
- Fix cleanup order in QueryCoordV2: stop watcher before session
- Update DataCoord, ConnectionManager to use SessionWatcher
- Add MockSessionWatcher for testing
Fixes race condition between session context cancellation and internal
loop exit. Eliminates goroutine leak by providing explicit lifecycle
management.
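The shape of such a watcher can be sketched as below. Only the method names `EventChannel()` and `Stop()` come from this commit; the signatures, the `sessionEvent` payload, and the `chanWatcher` implementation are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// sessionEvent is a hypothetical payload type for this sketch.
type sessionEvent struct {
	ServerID int64
	Kind     string
}

// SessionWatcher mirrors the two methods named above; signatures are assumed.
type SessionWatcher interface {
	EventChannel() <-chan sessionEvent
	Stop()
}

// chanWatcher forwards events from a source channel until it is stopped.
type chanWatcher struct {
	events   chan sessionEvent
	stopCh   chan struct{}
	done     chan struct{}
	stopOnce sync.Once
}

func newChanWatcher(source <-chan sessionEvent) *chanWatcher {
	w := &chanWatcher{
		events: make(chan sessionEvent),
		stopCh: make(chan struct{}),
		done:   make(chan struct{}),
	}
	go func() {
		defer close(w.done)   // runs last: signals that the loop has exited
		defer close(w.events) // runs first: lets consumers observe the end
		for {
			select {
			case <-w.stopCh:
				return
			case ev, ok := <-source:
				if !ok {
					return
				}
				select {
				case w.events <- ev:
				case <-w.stopCh:
					return
				}
			}
		}
	}()
	return w
}

func (w *chanWatcher) EventChannel() <-chan sessionEvent { return w.events }

// Stop signals the loop and waits for it to exit, so no goroutine is left
// behind. It is safe to call more than once.
func (w *chanWatcher) Stop() {
	w.stopOnce.Do(func() { close(w.stopCh) })
	<-w.done
}

func main() {
	source := make(chan sessionEvent, 1)
	var w SessionWatcher = newChanWatcher(source)
	source <- sessionEvent{ServerID: 1, Kind: "add"}
	fmt.Println(<-w.EventChannel())
	w.Stop() // stop the watcher first, then tear down whatever feeds it
	_, open := <-w.EventChannel()
	fmt.Println("channel open after Stop:", open)
}
```

Returning an object with an explicit `Stop()` rather than a raw channel gives the caller a way to wait for the internal loop to finish before cleaning up the session.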
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
https://github.com/milvus-io/milvus/issues/45544
- Add batch_factor configuration parameter (default: 5) to control
embedding provider batch sizes
- Add disable_func_runtime_check property to bypass function validation
during collection creation
- Add database interceptor support for AddCollectionFunction,
AlterCollectionFunction, and DropCollectionFunction requests
Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
issue: #45718
Logging complete segment ID arrays caused excessive log volume (3-6 TB
for 200k segments). Remove arrays from logger fields and keep only
segment counts for observability.
Changes:
- Remove requestSegments/preparedSegments arrays from Load logger
- Remove segmentIDs from BM25 stats logs
- Remove entries structure from sync distribution log
This reduces log volume by 99.99% for large-scale operations.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Related to #44974
The emplace() operation on tbb::concurrent_hash_map was not protected,
allowing other threads to erase entries between the emplace attempt and
the subsequent lookup.
Solution:
1. Add shared_lock protection around the emplace() operation to prevent
concurrent erasure during insertion
2. Instead of returning nullptr when the key is not found on retry,
recursively call Get(key) to retry the entire operation
3. Fix typo: "earsed" -> "erased"
This ensures that concurrent Get() operations are properly synchronized
and will eventually succeed even under high contention.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
The tests were failing with "grpc: Server.RegisterService after
Server.Serve" because setupMockServer() was starting the gRPC server
before tests could register their services. gRPC requires all services
to be registered before Server.Serve() is called.
Changes:
- Remove s.Serve() from setupMockServer() helper function
- Add s.Serve() to each test after service registration
- Apply fix consistently to all 6 affected tests:
* TestZillizClient_Embedding
* TestZillizClient_Embedding_Error
* TestZillizClient_Rerank
* TestZillizClient_Rerank_Error
* TestNewZilliClient_WithMockServer
* TestZillizClient_Embedding_EmptyResponse
This follows the correct gRPC server lifecycle:
1. Create server
2. Register services
3. Start serving
Related to #44620
Case: "internal/util/function/models/zilliz TestZillizClient_Rerank"
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #45210
If the underlying WAL fails to open, the size of the streaming coord
recovery info `streamingcoord-meta/pchannel` grows quickly until it
reaches the etcd limit.
Compact the assignment history by serverID to reduce the
`streamingcoord-meta/pchannel` size.
Signed-off-by: chyezh <chyezh@outlook.com>
Related to #45614
This commit fixes a bug where certain collection attributes were not
properly updated during collection modification, causing metadata errors
after cluster restart and collection reload failures.
When altering a collection, the `EnableDynamicField` and `SchemaVersion`
attributes were not being persisted to the catalog. This caused
inconsistencies between the in-memory collection metadata and the
persisted state, leading to:
- Dynamic field validation failures after restart
- Collection loading errors
- Metadata state mismatches
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #44320
This change adds deduplication logic to handle duplicate primary keys
within a single upsert batch, keeping the last occurrence of each
primary key.
Key changes:
- Add DeduplicateFieldData function to remove duplicate PKs from field
data, supporting both Int64 and VarChar primary keys
- Refactor fillFieldPropertiesBySchema into two separate functions:
validateFieldDataColumns for validation and fillFieldPropertiesOnly for
property filling, improving code clarity and reusability
- Integrate deduplication logic in upsertTask.PreExecute to
automatically deduplicate data before processing
- Add comprehensive unit tests for deduplication with various PK types
(Int64, VarChar) and field types (scalar, vector)
- Add Python integration tests to verify end-to-end behavior
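The keep-last rule can be sketched as below. This is not the actual `DeduplicateFieldData` implementation: `lastOccurrenceIndexes` and `selectRows` are hypothetical helpers, and preserving the relative order of the surviving rows is an assumption.

```go
package main

import "fmt"

// lastOccurrenceIndexes returns, in ascending order, the row indexes to keep
// so that each primary key appears once and its last occurrence wins.
// The type parameter covers both int64 and string (VarChar) keys.
func lastOccurrenceIndexes[K comparable](pks []K) []int {
	last := make(map[K]int, len(pks))
	for i, pk := range pks {
		last[pk] = i // later rows overwrite earlier ones
	}
	keep := make([]int, 0, len(last))
	for i, pk := range pks {
		if last[pk] == i {
			keep = append(keep, i)
		}
	}
	return keep
}

// selectRows applies the kept indexes to one column of the batch.
func selectRows[T any](col []T, keep []int) []T {
	out := make([]T, 0, len(keep))
	for _, i := range keep {
		out = append(out, col[i])
	}
	return out
}

func main() {
	pks := []int64{1, 2, 1, 3, 2}
	vals := []string{"a", "b", "c", "d", "e"}
	keep := lastOccurrenceIndexes(pks)
	fmt.Println(selectRows(pks, keep), selectRows(vals, keep))
}
```

Computing the kept indexes once and applying them to every column keeps scalar and vector fields aligned with the deduplicated primary keys.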
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Related to #45543
When a field with a default value is added to a collection, the default
value becomes null after compaction instead of retaining the expected
default value.
**Root Cause**
The `appendValueAt` function in `internal/storage/arrow_util.go`
incorrectly checked if the entire arrow.Array was nil before handling
default values. This meant that default values were only applied when
the array itself was nil, not when individual field values were null
(which is the correct condition).
**Changes**
1. **Early nil check**: Added a guard at the function entry to detect
nil arrow.Array and return an error immediately, as this is an
unexpected condition that should not occur during normal operation.
2. **Refactored default value handling**: Removed the per-type nil array
checks and moved default value logic to handle individual null values
within the array (when `IsNull(idx)` returns true).
3. **Applied to all types**: Updated the logic consistently across all
builder types:
- BooleanBuilder
- Int8Builder, Int16Builder, Int32Builder, Int64Builder
- Float32Builder
- StringBuilder
- BinaryBuilder (added default value support for internal $meta json)
- ListBuilder (removed unnecessary nil check)
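The corrected control flow can be sketched for a single type as below. This is a simplified model and not the code in `internal/storage/arrow_util.go`: `int64Column` and `int64Builder` are hypothetical stand-ins for the Arrow array and builder, and the signature of `appendValueAt` here is assumed.

```go
package main

import (
	"errors"
	"fmt"
)

// int64Column is a simplified stand-in for a nullable Arrow array.
type int64Column struct {
	values []int64
	valid  []bool
}

func (c *int64Column) IsNull(i int) bool { return !c.valid[i] }

// int64Builder is a simplified stand-in for an Arrow builder.
type int64Builder struct {
	values []int64
	valid  []bool
}

func (b *int64Builder) Append(v int64) {
	b.values = append(b.values, v)
	b.valid = append(b.valid, true)
}

func (b *int64Builder) AppendNull() {
	b.values = append(b.values, 0)
	b.valid = append(b.valid, false)
}

// appendValueAt rejects a nil column up front, then applies the default
// per value: only when the individual entry at idx is null.
func appendValueAt(b *int64Builder, col *int64Column, idx int, defaultValue *int64) error {
	if col == nil {
		return errors.New("unexpected nil column")
	}
	if col.IsNull(idx) {
		if defaultValue != nil {
			b.Append(*defaultValue)
		} else {
			b.AppendNull()
		}
		return nil
	}
	b.Append(col.values[idx])
	return nil
}

func main() {
	col := &int64Column{values: []int64{10, 0, 30}, valid: []bool{true, false, true}}
	def := int64(7)
	b := &int64Builder{}
	for i := range col.values {
		if err := appendValueAt(b, col, i, &def); err != nil {
			panic(err)
		}
	}
	fmt.Println(b.values, b.valid)
}
```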
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #45060
Refactor segment loading architecture to make segments autonomously
manage their own loading process, moving the orchestration logic from Go
(segment_loader.go) to C++ (segcore).
**C++ Layer (segcore):**
- Added `SetLoadInfo()` and `Load()` methods to `SegmentInterface` and
implementations
- Implemented `ChunkedSegmentSealedImpl::Load()` with parallel loading
strategy:
- Separates indexed fields from non-indexed fields
- Loads indexes concurrently using thread pools
- Loads field data for non-indexed fields in parallel
- Implemented `SegmentGrowingImpl::Load()` to convert and load field
data
- Extracted `LoadIndexData()` as a reusable utility function in
`Utils.cpp`
- Added `SegmentLoad()` C binding in `segment_c.cpp`
**Go Layer:**
- Added `Load()` method to segment interfaces
- Updated mock implementations and test interfaces
- Integrated new C++ `SegmentLoad()` binding in Go segment wrapper
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>