When there are no growing segments in the collection, L0 Compaction will
try to choose all L0 segments, which hit all L1/L2 segments.
However, if a Sealed Segment is still being flushed in the DataNode at
the time L0 Compaction selects the matching L1/L2 segments, L0
Compaction will ignore that segment because it is not in "FlushState".
This is wrong and causes deletes to be missed on the Sealed Segment.
The quick solution here is to fail the L0 compaction task once such a
Sealed segment is selected.
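A rough sketch of the check (hypothetical names and states, not the actual compaction code): the selection step fails the task as soon as a not-yet-flushed target segment shows up.

```go
package compaction

import "fmt"

// SegmentState and SegmentInfo are illustrative stand-ins for the real
// segment metadata types.
type SegmentState int

const (
	SegmentStateFlushing SegmentState = iota
	SegmentStateFlushed
)

type SegmentInfo struct {
	ID    int64
	State SegmentState
}

// failOnFlushingTargets aborts the L0 compaction task if any selected
// L1/L2 target segment has not reached the Flushed state yet, so its
// deletes are not silently dropped.
func failOnFlushingTargets(targets []*SegmentInfo) error {
	for _, seg := range targets {
		if seg.State != SegmentStateFlushed {
			return fmt.Errorf("l0 compaction: segment %d is still flushing, failing task", seg.ID)
		}
	}
	return nil
}
```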
See also: #45339
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Sliced Arrow arrays "incorrectly" returned the original array's size via
SizeInBytes(), causing inaccurate memory estimates during compaction.
This resulted in segments closing prematurely in mergeSplit mode:
compactions expected to produce 500MB segments produced four segments of
100+MB each instead.
Fixed by calculating the actual byte size of sliced arrays, ensuring proper
segment sizing and more accurate memory usage tracking.
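A minimal Go sketch of the idea (not the actual fix in the compaction code): report only the bytes a slice actually covers instead of the parent array's full buffers. The fixed-width case is shown; variable-width types would additionally walk the offsets buffer.

```go
package sizing

// slicedFixedWidthSize estimates the bytes owned by a slice of a fixed-width
// Arrow-style array: the values it covers plus its share of the validity
// bitmap, rather than the parent array's full buffer size.
func slicedFixedWidthSize(elemWidth, sliceLen int, hasValidity bool) int {
	size := elemWidth * sliceLen
	if hasValidity {
		size += (sliceLen + 7) / 8 // one validity bit per row, rounded up to whole bytes
	}
	return size
}
```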
See also: #45293
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Related to #45333
Fix segment loading failure when adding fields with text match enabled.
The issue occurred because text indexes were being loaded before
FinishLoad() was called, meaning raw data was not properly available
when text index creation attempted to access it, resulting in "failed to
create text index, neither raw data nor index are found" errors.
The solution is to move the FinishLoad() call so it executes after raw data
loading but before text index loading (see the sketch after the list below).
This ensures that:
1. Raw data is properly loaded and available in memory
2. Text indexes can access the raw data they need during creation
3. The segment is in the correct state before any index operations
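A minimal sketch of the corrected ordering; the interface below only mirrors the calls described above, and the names are illustrative rather than the exact QueryNode API.

```go
package loading

// segmentLoader captures only the calls relevant to this ordering fix.
type segmentLoader interface {
	LoadRawData() error
	FinishLoad() error
	LoadTextIndexes() error
}

// loadSegment loads raw data, marks the load finished, and only then builds
// text indexes, so index creation can find the raw data it needs.
func loadSegment(seg segmentLoader) error {
	if err := seg.LoadRawData(); err != nil { // 1. raw data into memory
		return err
	}
	if err := seg.FinishLoad(); err != nil { // 2. segment state: raw data available
		return err
	}
	return seg.LoadTextIndexes() // 3. text index creation can access raw data
}
```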
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #45329
Fix compaction failure when handling newly added dynamic fields with
storage v1 binlogs. The issue occurred because the
`GenerateEmptyArrayFromSchema` function did not support default values
for the JSON data type, causing "Unexpected default value" errors during
compaction.
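A rough Go sketch of the idea (the types and the empty JSON value here are illustrative, not the exact `GenerateEmptyArrayFromSchema` code): the data-type switch gains a JSON branch instead of falling through to the error.

```go
package emptyfield

import "fmt"

// DataType stands in for the schema data-type enum.
type DataType int

const (
	DataTypeInt64 DataType = iota
	DataTypeVarChar
	DataTypeJSON
)

// emptyDefault returns an empty per-row value for a newly added field that has
// no binlog yet; the JSON branch is the one that was missing.
func emptyDefault(dt DataType) (any, error) {
	switch dt {
	case DataTypeInt64:
		return int64(0), nil
	case DataTypeVarChar:
		return "", nil
	case DataTypeJSON:
		return []byte("{}"), nil // previously fell through to the error below
	default:
		return nil, fmt.Errorf("Unexpected default value for data type %d", dt)
	}
}
```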
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Currently, RAM usage estimation for the AiSAQ index type is not
calculated correctly.
The AiSAQ index type consumes less RAM than DISKANN while loading the
index, and the QueryNode module is missing a RAM usage estimation
implementation for AiSAQ.
We suggest calculating the AiSAQ RAM usage estimation as follows:
UsedDiskMemoryRatioAisaq = 1024 (contrary to UsedDiskMemoryRatio,
which is 4)
neededMemSize = indexInfo.IndexSize / UsedDiskMemoryRatioAisaq
neededDiskSize = indexInfo.IndexSize
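In Go terms, the proposed estimation boils down to the following sketch (the surrounding QueryNode resource-estimation code is omitted):

```go
package estimation

const (
	UsedDiskMemoryRatio      = 4    // existing DiskANN ratio
	UsedDiskMemoryRatioAisaq = 1024 // proposed AiSAQ ratio
)

// estimateAisaqResource applies the suggested calculation: AiSAQ keeps almost
// all of the index on disk, so only a small fraction of IndexSize is charged
// against memory.
func estimateAisaqResource(indexSize int64) (neededMemSize, neededDiskSize int64) {
	neededMemSize = indexSize / UsedDiskMemoryRatioAisaq
	neededDiskSize = indexSize
	return
}
```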
Reported issue is #45247
---------
Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>
Signed-off-by: friedl <lior.friedman@kioxia.com>
Co-authored-by: friedl <lior.friedman@kioxia.com>
When getting disk usage, files or directories may be removed
concurrently due to segment release. This PR ignores “file or directory
does not exist” errors in such cases.
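A minimal sketch of the tolerant walk (not the exact Milvus helper): entries that vanish between listing and stat-ing are skipped rather than failing the whole disk-usage calculation.

```go
package diskusage

import (
	"errors"
	"io/fs"
	"path/filepath"
)

// dirSize sums file sizes under root, ignoring entries that are removed
// concurrently (e.g. by segment release) while the walk is in progress.
func dirSize(root string) (int64, error) {
	var total int64
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			if errors.Is(err, fs.ErrNotExist) {
				return nil // file/dir removed concurrently; skip it
			}
			return err
		}
		info, statErr := d.Info()
		if statErr != nil {
			if errors.Is(statErr, fs.ErrNotExist) {
				return nil
			}
			return statErr
		}
		if !info.IsDir() {
			total += info.Size()
		}
		return nil
	})
	return total, err
}
```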
issue: https://github.com/milvus-io/milvus/issues/45239
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Related to #44956
Integrate the StorageV2 FFI interface as the unified storage layer for
reading packed columnar data, replacing the custom iterative reader with
a manifest-based approach using the milvus-storage library.
Changes:
- Add C++ FFI reader implementation (ffi_reader_c.cpp/h) with Arrow C
Stream interface
- Implement utility functions to convert CStorageConfig to
milvus-storage Properties
- Create ManifestReader in Go that generates manifests from binlogs
- Add FFI packed reader CGO bindings (packed_reader_ffi.go)
- Refactor NewBinlogRecordReader to use ManifestReader for V2 storage
- Support both manifest file paths and direct manifest content
- Enable configurable buffer sizes and column projection
Technical improvements:
- Zero-copy data exchange using Arrow C Data Interface
- Optimized I/O operations through milvus-storage library
- Simplified code path with manifest-based reading
- Better performance with batched streaming reads
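A very rough sketch of the manifest idea (all types and names below are hypothetical, not the milvus-storage FFI API): per-field binlog paths are grouped into a manifest once, and the packed reader consumes that manifest instead of iterating binlogs itself.

```go
package storagev2sketch

// ColumnGroup and Manifest are illustrative stand-ins for the structures a
// manifest-based reader would produce for the FFI packed reader.
type ColumnGroup struct {
	FieldIDs []int64
	Files    []string // packed columnar files holding these fields
}

type Manifest struct {
	Version int64
	Groups  []ColumnGroup
}

// buildManifest converts per-field binlog paths into a single manifest that a
// packed reader can then stream over the Arrow C Stream interface.
func buildManifest(binlogs map[int64][]string) Manifest {
	m := Manifest{Version: 2}
	for fieldID, paths := range binlogs {
		m.Groups = append(m.Groups, ColumnGroup{
			FieldIDs: []int64{fieldID},
			Files:    paths,
		})
	}
	return m
}
```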
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #42148
Add comprehensive support for the struct array field type in the Go SDK,
including data structure definitions, column operations, schema
construction, and full test coverage.
**Struct Array Column Implementation (`client/column/struct.go`)**
- Add `columnStructArray` type to handle struct array fields
- Implement `Column` interface methods:
- `NewColumnStructArray()`: Create new struct array column from
sub-fields
- `Name()`, `Type()`: Basic metadata accessors
- `Slice()`: Support slicing across all sub-fields
- `FieldData()`: Convert to protobuf `StructArrayField` format
- `Get()`: Retrieve struct values as `map[string]any`
- `ValidateNullable()`, `CompactNullableValues()`: Nullable support
- Placeholder implementations for unsupported operations (AppendValue,
GetAsX, IsNull, AppendNull)
**Struct Array Parsing (`client/column/columns.go`)**
- Add `parseStructArrayData()` function to parse `StructArrayField` from
protobuf
- Update `FieldDataColumn()` to detect and parse struct array fields
- Support range-based slicing for struct array data
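A hedged usage sketch: the sub-field column constructors exist in the SDK, but the exact `NewColumnStructArray` call shape shown here is an assumption based on the description above, not the final API.

```go
package main

import (
	"fmt"

	"github.com/milvus-io/milvus/client/v2/column"
)

func main() {
	// Hypothetical call shape: wrap existing sub-field columns into one
	// struct array column named "meta".
	names := column.NewColumnVarChar("name", []string{"a", "b"})
	scores := column.NewColumnInt64("score", []int64{1, 2})
	structCol := column.NewColumnStructArray("meta", names, scores)

	fmt.Println(structCol.Name(), structCol.Type())

	// Get() returns each row as map[string]any keyed by sub-field name.
	if v, err := structCol.Get(0); err == nil {
		fmt.Printf("%v\n", v)
	}
}
```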
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #45080, #45274, #45285
- LoadCollection doesn't ignore the ignorable request for a false field
array.
- CreateIndex doesn't ignore the ignorable request for a wrong index.
- Index meta is not thread safe.
- Missing parameter checks for DDL.
- The DDL Ack scheduler may get stuck, blocking DDL until the next incoming
DDL arrives.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Issue: #45129
test: add new test cases; delete a duplicate test case.
modified: milvus_client/test_milvus_client_partial_update.py
modified: milvus_client/test_milvus_client_upsert.py
---------
Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
Related to #43028
Initialize the schema version field when creating a new collection
instance in QueryNode. The schema version is extracted from loadMetaInfo
and assigned to the collection, ensuring proper schema version tracking
and consistency across the distributed system.
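A minimal sketch of the change (struct and accessor names are illustrative): the schema version carried in loadMetaInfo is copied onto the collection at construction time.

```go
package querycollection

// LoadMetaInfo stands in for the load metadata carrying the schema version.
type LoadMetaInfo struct {
	SchemaVersion uint64
}

func (l *LoadMetaInfo) GetSchemaVersion() uint64 { return l.SchemaVersion }

type Collection struct {
	id            int64
	schemaVersion uint64
}

// newCollection initializes the schema version from loadMetaInfo so the
// QueryNode tracks the version from the moment the collection is created.
func newCollection(collectionID int64, loadMeta *LoadMetaInfo) *Collection {
	return &Collection{
		id:            collectionID,
		schemaVersion: loadMeta.GetSchemaVersion(),
	}
}
```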
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #43897
- Alter collection/database is now implemented by the WAL-based DDL
framework.
- AlterCollection/AlterDatabase are now supported in the WAL.
- Alter operations can now be synced by the new CDC.
- Refactor some UTs for alter DDL.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #44172, #45210, #44851
Kafka will auto-reset the offset to "latest" if the offset is
out of range, so Milvus WAL recovery cannot read any messages from
there. Once the offset is out of range, Kafka should instead read from
"earliest" to pick up the remaining uncleared data.
https://kafka.apache.org/documentation/#consumerconfigs_auto.offset.reset
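A small configuration sketch with confluent-kafka-go (the surrounding consumer setup is simplified; `auto.offset.reset` is the standard Kafka consumer config key):

```go
package walkafka

import "github.com/confluentinc/confluent-kafka-go/kafka"

// newWALConsumer configures the consumer so an out-of-range offset falls back
// to "earliest" instead of the broker default "latest", letting WAL recovery
// re-read the oldest uncleared data.
func newWALConsumer(brokers, group string) (*kafka.Consumer, error) {
	return kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers": brokers,
		"group.id":          group,
		// If the stored offset is out of range, start from the earliest
		// retained message rather than silently jumping to "latest".
		"auto.offset.reset": "earliest",
	})
}
```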
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #43897
- Load/Release collection/partition is now implemented by the WAL-based DDL
framework.
- AlterLoadConfig/DropLoadConfig are now supported in the WAL.
- Load/Release operations can now be synced by the new CDC.
- Refactor some UTs for load/release DDL.
---------
Signed-off-by: chyezh <chyezh@outlook.com>