Related to #46358
Refactor segment loading to use a unified diff-based approach for both
initial Load and Reopen operations:
- Extract ApplyLoadDiff from Reopen to share loading logic
- Add GetLoadDiff to compute diff from empty state for initial load
- Change column_groups_to_load from map to vector<pair> to preserve
order
- Add validation for empty index file paths in diff computation
- Add comprehensive unit tests for GetLoadDiff
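The idea of treating an initial load as a diff against an empty state can be sketched as follows. This is a minimal illustration, not the segcore implementation: `LoadState`, `LoadDiff`, `compute_diff` and `get_load_diff` are hypothetical names, and raising an error on an empty index path is an assumption about what the validation does.

```python
from dataclasses import dataclass, field


@dataclass
class LoadState:
    """Hypothetical stand-in for a segment's load info."""
    indexes: dict = field(default_factory=dict)        # field_id -> [index file paths]
    column_groups: list = field(default_factory=list)  # ordered [(group_id, [field_ids])]


@dataclass
class LoadDiff:
    indexes_to_load: dict = field(default_factory=dict)
    indexes_to_drop: list = field(default_factory=list)
    # A list of pairs (not a dict) so the load order is explicit and preserved.
    column_groups_to_load: list = field(default_factory=list)


def compute_diff(old, new):
    """Return what must be loaded or dropped to move from `old` to `new`."""
    diff = LoadDiff()
    for field_id, paths in new.indexes.items():
        # Assumed validation: reject missing or empty index file paths.
        if not paths or any(not p for p in paths):
            raise ValueError(f"empty index file path for field {field_id}")
        if old.indexes.get(field_id) != paths:
            diff.indexes_to_load[field_id] = paths
    diff.indexes_to_drop = [f for f in old.indexes if f not in new.indexes]
    old_groups = dict(old.column_groups)
    for group_id, field_ids in new.column_groups:
        if old_groups.get(group_id) != field_ids:
            diff.column_groups_to_load.append((group_id, field_ids))
    return diff


def get_load_diff(new):
    """Initial load is just a diff computed against an empty state."""
    return compute_diff(LoadState(), new)
```

With this shape, Load and Reopen differ only in which `old` state they pass in, so a single apply step can serve both.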
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Performance**
* Improved segment loading efficiency through incremental updates,
reducing memory overhead and enhancing performance during data updates.
* **Tests**
* Expanded test coverage for load operation scenarios.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
related: #45993
This commit extends nullable vector support to the proxy layer and
querynode, and adds comprehensive validation, search reduce, and field data
handling for nullable vectors with sparse storage.
Proxy layer changes:
- Update validate_util.go checkAligned() with getExpectedVectorRows()
helper
to validate nullable vector field alignment using valid data count
- Update checkFloatVectorFieldData/checkSparseFloatVectorFieldData for
nullable vector validation with proper row count expectations
- Add FieldDataIdxComputer in typeutil/schema.go for logical-to-physical
index translation during search reduce operations
- Update search_reduce_util.go reduceSearchResultData to use
idxComputers
for correct field data indexing with nullable vectors
- Update task.go, task_query.go, task_upsert.go for nullable vector
handling
- Update msg_pack.go with nullable vector field data processing
QueryNode layer changes:
- Update segments/result.go for nullable vector result handling
- Update segments/search_reduce.go with nullable vector offset
translation
Storage and index changes:
- Update data_codec.go and utils.go for nullable vector serialization
- Update indexcgowrapper/dataset.go and index.go for nullable vector
indexing
Utility changes:
- Add FieldDataIdxComputer struct with Compute() method for efficient
logical-to-physical index mapping across multiple field data
- Update EstimateEntitySize() and AppendFieldData() with fieldIdxs
parameter
- Update funcutil.go with nullable vector support functions
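The logical-to-physical translation mentioned above can be sketched as follows, assuming sparse storage keeps only the valid (non-null) vectors so that a row's physical position equals the number of valid rows before it. The class name `IdxComputer` and the `-1` sentinel for null rows are hypothetical; the real `FieldDataIdxComputer` in typeutil/schema.go may differ in shape.

```python
from itertools import accumulate


class IdxComputer:
    """Maps a logical row index to its position in sparse (valid-only) storage."""

    def __init__(self, valid_data):
        self._valid = list(valid_data)
        # prefix[i] = number of valid rows among valid_data[:i]
        self._prefix = [0, *accumulate(int(v) for v in self._valid)]

    def compute(self, logical_idx):
        # Null rows have no physical slot; -1 is an assumed sentinel.
        if not self._valid[logical_idx]:
            return -1
        return self._prefix[logical_idx]


computer = IdxComputer([True, False, True, True, False])
physical = [computer.compute(i) for i in range(5)]
```

Precomputing the prefix counts makes each lookup O(1), which matters when search reduce touches many rows.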
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Full support for nullable vector fields (float, binary, float16,
bfloat16, int8, sparse) across ingest, storage, indexing, search and
retrieval; logical↔physical offset mapping preserves row semantics.
* Client: compaction control and compaction-state APIs.
* **Bug Fixes**
* Improved validation for adding vector fields (nullable + dimension
checks) and corrected search/query behavior for nullable vectors.
* **Chores**
* Persisted validity maps with indexes and on-disk formats.
* **Tests**
* Extensive new and updated end-to-end nullable-vector tests.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: marcelo-cjl <marcelo.chen@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/45525
See the added README.md for the optimizations introduced.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added query expression optimization feature with a new `optimizeExpr`
configuration flag to enable automatic simplification of filter
predicates, including range predicate optimization, merging of IN/NOT IN
conditions, and flattening of nested logical operators.
* **Bug Fixes**
* Adjusted delete operation behavior to correctly handle expression
evaluation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
issue: #45486
Introduce row group batching to reduce cache cell granularity and improve
memory and disk efficiency. Previously, each parquet row group mapped 1:1 to
a cache cell. Now, up to `kRowGroupsPerCell` (4) row groups are merged into
one cell. This reduces the number of cache cells (and associated overhead) by
~4x while maintaining the same data granularity for loading.
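The cell arithmetic implied by this batching can be sketched as below. It assumes a single flat sequence of row groups; the helper names are hypothetical, and the real translator may additionally respect file boundaries.

```python
K_ROW_GROUPS_PER_CELL = 4  # value stated in the description above


def num_cells(num_row_groups, per_cell=K_ROW_GROUPS_PER_CELL):
    """Ceiling division: the last cell may hold fewer row groups."""
    return (num_row_groups + per_cell - 1) // per_cell


def cell_of_row_group(row_group_idx, per_cell=K_ROW_GROUPS_PER_CELL):
    return row_group_idx // per_cell


def row_group_range(cell_id, num_row_groups, per_cell=K_ROW_GROUPS_PER_CELL):
    """Half-open [start, end) range of row groups covered by a cell."""
    if not 0 <= cell_id < num_cells(num_row_groups, per_cell):
        raise IndexError(f"cell {cell_id} out of range")
    start = cell_id * per_cell
    return start, min(start + per_cell, num_row_groups)
```

For example, 10 row groups yield 3 cells, and the last cell covers row groups 8 and 9 only.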
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Refactor**
* Switched to cell-based grouping that merges multiple row groups for
more efficient multi-file aggregation and reads.
* Chunk loading now combines multiple source batches/tables per cell and
better supports mmap-backed storage.
* **New Features**
* Exposed helpers to query row-group ranges and global row-group offsets
for diagnostics and testing.
* Translators now accept chunk-type and mmap/load hints to control
on-disk vs in-memory behavior.
* **Bug Fixes**
* Improved bounds checks and clearer error messages for out-of-range
cell requests.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
### **User description**
Related to #44956
Add manifest-based data loading path for optional fields in
`cache_opt_field_memory_v2`. When a manifest file is provided in the
config, the function now retrieves field data directly from the manifest
using `GetFieldDatasFromManifest` instead of reading from segment insert
files. This enables storage v2 compatibility for building indexes with
optional fields.
___
### **PR Type**
Enhancement
___
### **Description**
- Add manifest-based data loading for optional fields in index building
- Support storage v2 compatibility via `GetFieldDatasFromManifest`
function
- Enable PK isolation optional field handling without segment insert
files
___
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/44399
This PR also adds `ByteSize()` methods for scalar indexes. They are
currently not used in Milvus code, but are used in the scalar benchmark, and
may be used by cachinglayer in the future.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Refactor**
* Improved and standardized memory-size computation and caching across
index types so reported index footprints are more accurate and
consistent.
* **Chores**
* Ensured byte-size metrics are refreshed immediately after index
build/load operations to keep memory accounting in sync with runtime
state.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
issue: #46349
When using brute-force search, the iterator results from multiple chunks
are merged. The merge must take the metric type into account, because the
metric determines whether a larger or a smaller score ranks first.
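A metric-aware merge of per-chunk results might look like the sketch below. It is an illustration only: the function name is hypothetical, each chunk's results are assumed to be already sorted best-first, and the set of "larger is better" metrics is an assumption made for this example.

```python
import heapq
from itertools import islice

# Assumption for this sketch: similarity metrics rank larger scores first,
# distance metrics (such as L2) rank smaller scores first.
LARGER_IS_BETTER = {"IP", "COSINE"}


def merge_chunk_results(chunk_results, metric, limit):
    """Merge per-chunk (offset, score) lists, each sorted best-first."""
    reverse = metric in LARGER_IS_BETTER
    merged = heapq.merge(*chunk_results, key=lambda hit: hit[1], reverse=reverse)
    return list(islice(merged, limit))


l2_top = merge_chunk_results(
    [[(1, 0.1), (2, 0.5)], [(10, 0.2), (11, 0.9)]], "L2", 3)
ip_top = merge_chunk_results(
    [[(1, 0.9), (2, 0.3)], [(10, 0.8), (11, 0.1)]], "IP", 3)
```

Merging with the wrong direction would still return `limit` hits, but the worst ones instead of the best, which is why the metric has to be threaded through the merge.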
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
Related to #44956
When loading column groups with mmap enabled, the
ManifestGroupTranslator needs the mmap directory path to properly handle
memory-mapped data loading. This change retrieves the root path from
LocalChunkManagerSingleton and passes it to the translator during
construction.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/42053
The split literals in `match` execution should be combined with `and`
semantics rather than `or`.
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
issue: https://github.com/milvus-io/milvus/issues/45890
ComputePhraseMatchSlop accepts three params:
1. A string: the query text
2. Some strings: the data texts
3. Analyzer params
The slop is calculated for the query text against each data text in the
context of phrase match, where both are tokenized by a tokenizer built from
the analyzer params.
Two arrays are returned:
1. is_match: whether phrase match can succeed
2. slop: the corresponding slop if phrase match can succeed, or -1 if it
cannot.
---------
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
issue: #46358
This PR implements segment reopening functionality on query nodes,
enabling the application of data or schema changes to already-loaded
segments without requiring a full reload.
### Core (C++)
**New SegmentLoadInfo class**
(`internal/core/src/segcore/SegmentLoadInfo.h/cpp`):
- Encapsulates segment load configuration with structured access
- Implements `ComputeDiff()` to calculate differences between old and
new load states
- Tracks indexes, binlogs, and column groups that need to be loaded or
dropped
- Provides `ConvertFieldIndexInfoToLoadIndexInfo()` for index loading
**ChunkedSegmentSealedImpl modifications**:
- Added `Reopen(const SegmentLoadInfo&)` method to apply incremental
changes based on computed diff
- Refactored `LoadColumnGroups()` and `LoadColumnGroup()` to support
selective loading via field ID map
- Extracted `LoadBatchIndexes()` and `LoadBatchFieldData()` for reusable
batch loading logic
- Added `LoadManifest()` for manifest-based loading path
- Updated all methods to use `SegmentLoadInfo` wrapper instead of direct
proto access
**SegmentGrowingImpl modifications**:
- Added `Reopen()` stub method for interface compliance
**C API additions** (`segment_c.h/cpp`):
- Added `ReopenSegment()` function exposing reopen to Go layer
### Go Side
**QueryNode handlers** (`internal/querynodev2/`):
- Added `HandleReopen()` in handlers.go
- Added `ReopenSegments()` RPC in services.go
**Segment interface** (`internal/querynodev2/segments/`):
- Extended `Segment` interface with `Reopen()` method
- Implemented `Reopen()` in LocalSegment
- Added `Reopen()` to segment loader
**Segcore wrapper** (`internal/util/segcore/`):
- Added `Reopen()` method in segment.go
- Added `ReopenSegmentRequest` in requests.go
### Proto
- Added new fields to support reopen in `query_coord.proto`
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
fixes: https://github.com/milvus-io/milvus/issues/45934
pinIndex is const and only performs read operations, so an rlock is the
right choice for performance.
Signed-off-by: Lanqing Yang <lanqingy93@gmail.com>
issue: https://github.com/milvus-io/milvus/issues/42148
For a vector field inside a STRUCT, since a STRUCT can only appear as
the element type of an ARRAY field, the vector field in STRUCT is
effectively an array of vectors, i.e. an embedding list.
Milvus already supports searching embedding lists with metrics whose
names start with the prefix MAX_SIM_.
This PR allows Milvus to search embeddings inside an embedding list
using the same metrics as normal embedding fields. Each embedding in the
list is treated as an independent vector and participates in ANN search.
Further, since STRUCT may contain scalar fields that are highly related
to the embedding field, this PR introduces an element-level filter
expression to refine search results.
The grammar of the element-level filter is:
element_filter(structFieldName, $[subFieldName] == 3)
where $[subFieldName] refers to the value of subFieldName in each
element of the STRUCT array structFieldName.
It can be combined with existing filter expressions, for example:
"varcharField == 'aaa' && element_filter(struct_field, $[struct_int] ==
3)"
A full example:
```
struct_schema = milvus_client.create_struct_field_schema()
struct_schema.add_field("struct_str", DataType.VARCHAR, max_length=65535)
struct_schema.add_field("struct_int", DataType.INT32)
struct_schema.add_field("struct_float_vec", DataType.FLOAT_VECTOR, dim=EMBEDDING_DIM)
schema.add_field(
"struct_field",
datatype=DataType.ARRAY,
element_type=DataType.STRUCT,
struct_schema=struct_schema,
max_capacity=1000,
)
...
filter = "varcharField == 'aaa' && element_filter(struct_field, $[struct_int] == 3 && $[struct_str] == 'abc')"
res = milvus_client.search(
COLLECTION_NAME,
data=query_embeddings,
limit=10,
anns_field="struct_field[struct_float_vec]",
filter=filter,
output_fields=["struct_field[struct_int]", "varcharField"],
)
```
TODO:
1. When an `element_filter` expression is used, a regular filter
expression must also be present. Remove this restriction.
2. Implement `element_filter` expressions in the `query`.
---------
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Related to #46225
Replace the heterogeneous insert data handling logic that modified
schema_ while holding a shared lock with an assertion. The previous
implementation had a concurrency bug where schema modification
operations were performed under a shared_lock, which violates mutex
semantics and can lead to data races.
Issue: #46225 reported two problems:
1. Schema modification under shared_lock (not exclusive lock)
2. Access to schema_ not protected by mutex in growing segment
The removed code attempted to handle "added fields" by:
- Adding new field to schema (schema_->AddField)
- Appending field metadata to insert_record_
- Setting default data for existing rows
All these write operations were performed while holding only a
shared_lock, which is incorrect since shared_locks are meant for
read-only operations.
This fix replaces the unsafe modification with an assertion that fails
if an unexpected new field is encountered in a growing segment with
existing data. The proper handling of schema changes should go through
the Reopen() path which correctly acquires a unique_lock before
modifying schema_.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #45511
Our tantivy inverted index currently does not include the item index when
the value is an array, so we can't do `a[0] == 'b'` type of lookup in the
inverted index. For such expressions, we need to skip the index and use
brute-force search.
We may improve our index in the future, so this is a temporary solution.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
This PR fixes two issues related to segment loading and index
deserialization:
1. Fill partition_id in LoadIndexInfo when converting field index info,
which is required by cardinal (DiskANN) index deserialization.
2. Close RemoteOutputStream in destructor to ensure buffer flushed and
resources released properly.
issue: #46141
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This PR adds Customer Managed Encryption Keys (CMEK) support to the
StorageV2 FFI layer, enabling data encryption/decryption through the
cipher plugin system.
Changes:
- Add ffi_writer_c.cpp/h with GetEncParams() to retrieve encryption
parameters (key and metadata) from cipher plugin for data encryption
- Extend GetLoonReader() in ffi_reader_c.cpp to support CMEK decryption
by configuring KeyRetriever when plugin context is provided
- Add encryption property constants in ffi_common.go for writer config
- Integrate CMEK encryption in NewFFIPackedWriter() to pass encryption
parameters to the underlying storage writer
issue: #44956
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #46098
This fix addresses a bug where the segment loader incorrectly determined
whether scalar fields have raw data in their indexes, leading to
unnecessary field data loading or skipping indexed raw data retrieval.
- Build `field_ids` vector that handles both single field and column
group cases (when `child_fields_size() > 0`)
- Move the mmap setting and index_has_raw_data checks before the skip
decision, iterating over the correctly built `field_ids`
- Fix the boolean AND logic in both `Load()` and `LoadColumnGroup()` to
properly check if ALL fields in the group have raw data in their indexes
This bug was hiding the root cause of issue #46098, where QueryNode
panics when outputting timestamptz data from scalar index with raw data.
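The corrected skip decision can be sketched as follows. The function names and the dictionary-based lookup are hypothetical; the sketch only illustrates that field data loading may be skipped when every field in the group has raw data in its index.

```python
def group_field_ids(field_id, child_fields):
    """Use child fields for a column group, else the single field itself."""
    return list(child_fields) if child_fields else [field_id]


def can_skip_field_data(field_id, child_fields, index_has_raw_data):
    """True only if ALL fields in the group have raw data in their indexes."""
    ids = group_field_ids(field_id, child_fields)
    return all(index_has_raw_data.get(fid, False) for fid in ids)
```

A field with no entry in `index_has_raw_data` is treated as lacking raw data, so a group with any unindexed field is still loaded.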
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #44956
Use the exact manifest version from the path parameter instead of always
fetching the latest manifest. This ensures data consistency by reading
from the specific version that was requested.
Changes:
- Update GetColumnGroups to use transaction.begin(version) with the
specified version from the path JSON
- Replace get_latest_manifest() with get_current_manifest() after
beginning transaction at the target version
- Update Go FFI binding to call get_column_groups_by_version instead of
get_latest_column_groups
- Remove unused GetManifest function from util.cpp/util.h
- Bump milvus-storage version from 5fff4f5 to 33bf815
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #46098
Fix the ReverseDataFromIndex function where the assignment of raw_data
to scalar_array and the break statement were incorrectly placed inside
the for loop for TIMESTAMPTZ data type. This caused QueryNode to panic
when outputting timestamptz fields from scalar index.
Move the assignment and break statement outside the loop to match the
pattern used by other data types like VARCHAR.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See: #44956
This PR upgrades loon to the latest version and resolves building
conflicts.
---------
Signed-off-by: Ted Xu <ted.xu@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
Previously, mmap settings configured at the collection or field level
were not being applied during segment loading in segcore. This was
caused by:
1. A typo in the key name: "mmap.enable" instead of "mmap.enabled"
2. Missing logic to parse and apply mmap settings from schema
This commit fixes the issue by:
- Correcting the key name to "mmap.enabled" to match the standard
- Adding Schema::MmapEnabled() method to retrieve field/collection level
mmap settings with proper fallback logic
- Parsing mmap settings from field type_params and collection properties
during schema parsing
- Applying computed mmap settings in LoadColumnGroup() and Load()
methods instead of hardcoded false values
- Using global MmapConfig as fallback when no explicit setting exists
The mmap setting priority is now:
1. Field-level mmap setting (from type_params)
2. Collection-level mmap setting (from properties)
3. Global mmap config (from MmapManager)
For column groups, if any field has mmap enabled, the entire group uses
mmap (since they are loaded together).
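The priority order and the column group rule above can be sketched as below, with `None` standing for "no explicit setting at this level". The function names are hypothetical and do not mirror `Schema::MmapEnabled()` exactly.

```python
from typing import Iterable, Optional


def resolve_mmap(field_setting: Optional[bool],
                 collection_setting: Optional[bool],
                 global_default: bool) -> bool:
    """Field level wins, then collection level, then the global config."""
    if field_setting is not None:
        return field_setting
    if collection_setting is not None:
        return collection_setting
    return global_default


def group_uses_mmap(field_settings: Iterable[Optional[bool]],
                    collection_setting: Optional[bool],
                    global_default: bool) -> bool:
    """A column group is loaded together, so any mmap field enables mmap."""
    return any(resolve_mmap(s, collection_setting, global_default)
               for s in field_settings)
```

Note that an explicit `False` at the field level overrides a `True` at the collection level; only a missing setting falls through.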
Related issue: #45060
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This PR refactors the Loon FFI reader implementation to use
milvus-storage's internal C++ Reader API directly instead of the
external FFI interface.
Key changes:
- Replace external FFI calls (get_record_batch_reader, reader_destroy)
with direct C++ Reader API calls
- Add GetLoonReader() helper function to create Reader instances using
milvus-storage::api::Reader::create()
- Use MakeInternalPropertiesFromStorageConfig() instead of
MakePropertiesFromStorageConfig() to get internal properties
- Update NewPackedFFIReaderWithManifest() to deserialize column groups
from JSON manifest content directly
- Simplify GetFFIReaderStream() to use Reader::get_record_batch_reader()
and arrow::ExportRecordBatchReader() for Arrow stream export
- Change CFFIPackedReader typedef from ReaderHandle to void* for
flexibility
- Update milvus-storage dependency version to ba7df7b
This change improves code maintainability by using the native C++ API
directly and eliminates the overhead of going through the external FFI
layer.
issue: #44956
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #44956
- Update milvus-storage version to ba7df7b for chunk reader fix
- Pass manifest path to index build request in DataCoord/DataNode
- Add null chunk assertion with detailed debug info in
ManifestGroupTranslator
- Fix memory corruption by removing premature transaction handle
destruction
- Clean up log message in ChunkedSegmentSealedImpl
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This PR adds support for reading data from StorageV2 using manifest
files and the Loon FFI interface during index building, providing an
alternative to the traditional segment insert files approach.
Key changes:
Core C++ changes:
- Add SEGMENT_MANIFEST_KEY and LOON_FFI_PROPERTIES_KEY constants for
manifest handling
- Extend FileManagerContext to carry loon_ffi_properties for FFI
operations
- Update index_c.cpp to pass manifest and loon properties to file
managers for all index types (vector, JSON key, text)
- Implement GetFieldDatasFromManifest() in Util.cpp using Arrow C Stream
interface:
* Create Arrow schema from field metadata
* Initialize FFI reader with manifest content and storage properties
* Import record batches from C data interface
* Convert to FieldData for index building
- Update DiskFileManagerImpl and MemFileManagerImpl to support
manifest-based data reading with fallback to traditional paths
Loon FFI utilities (internal/core/src/storage/loon_ffi/):
- Add ToCStorageConfig() to convert StorageConfig to C-compatible
structure
- Implement GetManifest() to parse manifest JSON and retrieve column
groups via FFI
- Enhance MakePropertiesFromStorageConfig() integration
Storage V2 integration:
- Update milvus-storage dependency from 0883026 to 302143c for latest
FFI support
Protobuf changes:
- Add manifest field to BuildIndexInfo for passing manifest path to C++
layer
Configuration:
- Add common.storageV2.useLoonFFI config option (default: false) for
feature toggle
This change is part of issue #44956 to integrate the StorageV2 FFI
interface as the unified storage layer. The implementation maintains
backward compatibility by checking for manifest presence and falling
back to existing segment insert files approach when manifest is not
provided.
Related issue: #44956
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #45486
This commit refactors the chunk writing system by introducing a two-phase
approach: size calculation followed by writing to a target. This enables
efficient group chunk creation where multiple fields share a single mmap
region, significantly reducing the number of mmap system calls and VMAs.
- Optimize `mmap` usage: single `mmap` per group chunk instead of per
field
- Split ChunkWriter into two phases:
- `calculate_size()`: Pre-compute required memory without allocation
- `write_to_target()`: Write data to a provided ChunkTarget
- Implement `ChunkMmapGuard` for unified mmap region lifecycle
management
- Handles `munmap` and file cleanup via RAII
- Shared via `std::shared_ptr` across multiple chunks in a group
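The two-phase pattern can be illustrated with the sketch below, which sizes every field first and then writes all of them into one anonymous mmap region. The class and function names are hypothetical Python stand-ins for the C++ types named above, and the RAII cleanup done by `ChunkMmapGuard` is not modeled.

```python
import mmap


class ChunkTarget:
    """Write cursor over a reserved span of a shared region."""

    def __init__(self, region, offset, size):
        self._region, self._pos, self._end = region, offset, offset + size

    def write(self, data: bytes):
        end = self._pos + len(data)
        if end > self._end:
            raise ValueError("write exceeds reserved span")
        self._region[self._pos:end] = data
        self._pos = end


class BytesFieldWriter:
    def __init__(self, values):
        self._values = values

    def calculate_size(self) -> int:
        # Phase 1: compute the required bytes without allocating anything.
        return sum(len(v) for v in self._values)

    def write_to_target(self, target: ChunkTarget):
        # Phase 2: write into the span handed out by the caller.
        for v in self._values:
            target.write(v)


def write_group_chunk(writers):
    sizes = [w.calculate_size() for w in writers]
    total = sum(sizes)
    region = mmap.mmap(-1, max(total, 1))  # one mapping for the whole group
    spans, offset = [], 0
    for writer, size in zip(writers, sizes):
        writer.write_to_target(ChunkTarget(region, offset, size))
        spans.append((offset, size))
        offset += size
    return region, spans


region, spans = write_group_chunk(
    [BytesFieldWriter([b"ab", b"cd"]), BytesFieldWriter([b"xyz"])])
```

Because the total size is known before mapping, the group needs a single `mmap` call regardless of how many fields it contains.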
Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
---------
Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
Related to #44956
**New Translator (C++)**
- Added `ManifestGroupTranslator`
(`internal/core/src/segcore/storagev2translator/`)
- Translates manifest-based column groups to Milvus internal format
- Implements `GroupCTMeta` interface for chunk-based column access
- Supports both memory and mmap storage modes
- Handles cache warmup policies for vector and scalar data
**ChunkedSegmentSealedImpl**
(`internal/core/src/segcore/ChunkedSegmentSealedImpl.cpp:333`)
- Added `LoadColumnGroups(const std::string& manifest_path)`: Main entry
point for manifest-based loading
- Creates milvus-storage Reader from manifest file
- Parallelizes column group loading using thread pool
- Aggregates loading exceptions and reports errors
- Added `LoadColumnGroup()`: Loads individual column group
- Extracts field IDs from column group metadata
- Creates ManifestGroupTranslator for each column group
- Builds ProxyChunkColumn for field access
- Special handling for timestamp field index construction
**SegmentGrowingImpl**
(`internal/core/src/segcore/SegmentGrowingImpl.cpp`)
- Added similar `LoadColumnGroups()` and `LoadColumnGroup()` methods for
growing segments
- Maintains consistency with sealed segment loading path
Storage FFI Utilities
**loon_ffi/util** (`internal/core/src/storage/loon_ffi/util.cpp`)
- Added `MakeInternalPropertiesFromStorageConfig()`: Converts C storage
config to internal Properties
- Maps all storage configuration fields (S3, GCS, Azure, local)
- Handles SSL, IAM, virtual host settings
- Configures connection timeouts and max connections
- Added `MakeInternalLocalProperies()`: Creates local filesystem
properties
- Added `ToCStorageConfig()`: Converts Go StorageConfig to C
representation
- Added `GetColumnGroups()`: Extracts column groups from manifest file
using Transaction API
Protocol Buffer Changes
**segcore.proto** (`pkg/proto/segcore.proto:121`)
- Added `manifest_path` field to `SegmentLoadInfo` message
- Enables passing manifest file path from Go layer to C++ core
Go Integration
**segment.go** (`internal/util/segcore/segment.go:372`)
- Updated `ConvertToSegcoreSegmentLoadInfo()` to propagate
`ManifestPath` field
- Bridges QueryNode segment load info to Segcore format
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This commit optimizes std::vector usage across segcore by adding
reserve() calls where the size is known in advance, reducing memory
reallocations during push_back operations.
Changes:
- TimestampIndex.cpp: Reserve space for prefix_sums and
timestamp_barriers
- SegmentGrowingImpl.cpp: Reserve space for binlog info vectors
- ChunkedSegmentSealedImpl.cpp: Reserve space for futures and field data
vectors
- storagev2translator/GroupChunkTranslator.cpp: Reserve space for
metadata vectors
This improves performance by avoiding multiple memory reallocations when
the vector size is predictable.
issue: https://github.com/milvus-io/milvus/issues/45679
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
#45610
This fix adds a small cost to execution:
=== Lower Bound Overhead (isolated) ===
Position 1 (list len = 90000): 39 ns per lower_bound
Position 2 (list len =180000): 45 ns per lower_bound
Position 3 (list len =270000): 46 ns per lower_bound
Position 4 (list len =360000): 38 ns per lower_bound
Position 5 (list len =450000): 42 ns per lower_bound
Position 6 (list len =540000): 55 ns per lower_bound
Position 7 (list len =630000): 56 ns per lower_bound
Position 8 (list len =720000): 49 ns per lower_bound
Position 9 (list len =810000): 48 ns per lower_bound
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/45783
With the simdjson ondemand API, an iterator can only be used once. Use the
DOM API to prevent crashes when processing JSON contains operations with
different types.
Signed-off-by: sunby <sunbingyi1992@gmail.com>