23559 Commits

Author SHA1 Message Date
Lior Friedman
a4d69031f1
fix: Add AiSAQ index type RAM estimation implementation on the query node. (#45246)
Currently, the index type AiSAQ RAM usage estimation is not being
calculated correctly.
AiSAQ index type consumes less RAM usage while loading the index than
DISKANN does, and the query node module is missing the implementation of
the RAM usage estimation for that AiSAQ index type.
We suggest that the AiSAQ RAM usage estimation calculation should be as
follows:
 
UsedDiskMemoryRatioAisaq = 1024 (contrary to the UsedDiskMemoryRatio,
which is 4)
neededMemSize = indexInfo.IndexSize / UsedDiskMemoryRatioAisaq 
neededDiskSize = indexInfo.IndexSize

Reported issue is #45247

---------

Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>
Signed-off-by: friedl <lior.friedman@kioxia.com>
Co-authored-by: friedl <lior.friedman@kioxia.com>
2025-11-06 08:53:34 +08:00
yihao.dai
121eb912ba
fix: Fix load segment failed due to get disk usage error (#45255)
When getting disk usage, files or directories may be removed
concurrently due to segment release. This PR ignores “file or directory
does not exist” errors in such cases.

issue: https://github.com/milvus-io/milvus/issues/45239

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-11-06 08:51:33 +08:00
congqixia
55bfd610b6
enhance: [StorageV2] Integrate FFI interface for packed reader (#45132)
Related to #44956

Integrate the StorageV2 FFI interface as the unified storage layer for
reading packed columnar data, replacing the custom iterative reader with
a manifest-based approach using the milvus-storage library.

Changes:
- Add C++ FFI reader implementation (ffi_reader_c.cpp/h) with Arrow C
Stream interface
- Implement utility functions to convert CStorageConfig to
milvus-storage Properties
- Create ManifestReader in Go that generates manifests from binlogs
- Add FFI packed reader CGO bindings (packed_reader_ffi.go)
- Refactor NewBinlogRecordReader to use ManifestReader for V2 storage
- Support both manifest file paths and direct manifest content
- Enable configurable buffer sizes and column projection

Technical improvements:
- Zero-copy data exchange using Arrow C Data Interface
- Optimized I/O operations through milvus-storage library
- Simplified code path with manifest-based reading
- Better performance with batched streaming reads

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-05 19:57:34 +08:00
XuanYang-cn
d036fd5422
test: Increase PyMilvus version to 2.7.0rc54 for master branch (#45273)
Automated daily bump from pymilvus master branch. Updates
tests/python_client/requirements.txt.

Signed-off-by: XuanYang-cn <xuan.yang@zilliz.com>
2025-11-05 19:35:33 +08:00
zhenshan.cao
4a936fae8e
fix: timezone parameter ingored in Search/Query (#45320)
issue: https://github.com/milvus-io/milvus/issues/44598

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-05 19:19:33 +08:00
congqixia
1e48911825
enhance: [GoSDK] Support struct array field type (#45291)
Related to #42148

Add comprehensive support for struct array field type in the Go SDK,
including data structure definitions, column operations, schema
construction, and full test coverage.

**Struct Array Column Implementation (`client/column/struct.go`)**
- Add `columnStructArray` type to handle struct array fields
- Implement `Column` interface methods:
- `NewColumnStructArray()`: Create new struct array column from
sub-fields
  - `Name()`, `Type()`: Basic metadata accessors
  - `Slice()`: Support slicing across all sub-fields
  - `FieldData()`: Convert to protobuf `StructArrayField` format
  - `Get()`: Retrieve struct values as `map[string]any`
  - `ValidateNullable()`, `CompactNullableValues()`: Nullable support
- Placeholder implementations for unsupported operations (AppendValue,
GetAsX, IsNull, AppendNull)

**Struct Array Parsing (`client/column/columns.go`)**
- Add `parseStructArrayData()` function to parse `StructArrayField` from
protobuf
- Update `FieldDataColumn()` to detect and parse struct array fields
- Support range-based slicing for struct array data

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-05 15:43:33 +08:00
zhenshan.cao
490a618c30
fix: Handle timestamptz import errors (#45287)
issue: https://github.com/milvus-io/milvus/issues/44585

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-05 15:05:33 +08:00
foxspy
95d7302cf4
enhance: update knowhere version (#45270)
issue: #42937

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-11-05 11:09:32 +08:00
Zhen Ye
a2ce70d252
fix: ddl framework bug patch (#45290)
issue: #45080, #45274, #45285

- LoadCollection doesn't ignore the ignorable request, for false field
array.
- CreatIndex doesn't ignore the ignorable request, for wrong index.
- index meta is not thread safe.
- lost parameter check of DDL.
- DDL Ack scheduler may get stuck and DDL is block until next incoming
DDL.
- lost parameter checker of ddl

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-04 22:25:33 +08:00
cai.zhang
fa3d4ebfbe
fix: Compute the correct batch size for the geometry index of the growing segment (#45253)
issue: #44648

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-04 20:25:37 +08:00
zhagnlu
792e931fcb
enhance: rename jsonstats related user config params (#45254)
#44132

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-11-04 20:21:36 +08:00
Spade A
c0029b788d
fix: alter collection failed with MMAP setting for STRUCT (#45173)
issue: https://github.com/milvus-io/milvus/issues/45001
ref: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Co-authored-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-04 20:19:33 +08:00
sijie-ni-0214
19af1903f6
fix: etcd data persistence by adding --data-dir parameter (#45200)
issue: https://github.com/milvus-io/milvus/issues/45174

Signed-off-by: sijie-ni-0214 <sijie.ni@zilliz.com>
2025-11-04 19:37:34 +08:00
Gao
8f645760af
enhance: make knowhere thread pool config refreshable (#45190)
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-11-04 18:33:33 +08:00
Zhen Ye
966ebfbcab
fix: support upgrading from 2.6.x to 2.6.5 (#45264)
issue: #43897

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-04 18:31:32 +08:00
zhuwenxing
06933c25b8
test: add geometry datatype in import testcases (#45014)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-04 16:55:33 +08:00
zhenshan.cao
6327c9a514
fix: Fix bugs related to TimestampTz (#45111)
issue: https://github.com/milvus-io/milvus/issues/44527
https://github.com/milvus-io/milvus/issues/44537
https://github.com/milvus-io/milvus/issues/44538
https://github.com/milvus-io/milvus/issues/44585
https://github.com/milvus-io/milvus/issues/44622

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-04 16:51:33 +08:00
Feilong Hou
9e4975bdfa
test: added test case for partial update on duplicate pk (#45130)
Issue: #45129
 <test>: <add new test case>
 <also delete duplicate test case>

 On branch feature/partial-update
 Changes to be committed:
	modified:   milvus_client/test_milvus_client_partial_update.py
	modified:   milvus_client/test_milvus_client_upsert.py

---------

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-11-04 15:47:32 +08:00
zhikunyao
7193d01808
test: support e2e-amd helm in gcp milvus cluster (#45175)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-11-04 15:07:32 +08:00
sparknack
40b5e6b134
fix: avoid potential race conditions when updating the executor (#45230)
issue: #43040

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-04 14:25:33 +08:00
yihao.dai
ab11fddc66
enhance: Wait for replicate stream client to finish (#45259)
Make channel replicator stop more gracefully.

issue: https://github.com/milvus-io/milvus/issues/44123

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-11-04 14:19:33 +08:00
cai.zhang
617891b436
fix: Skip create tmp dir for growing R-Tree index (#45256)
issue: #45181

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-04 13:01:32 +08:00
Spade A
2b5241fe5a
fix: allow "[" and "]" in index name (#45193)
issue: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-11-04 11:59:34 +08:00
Spade A
cd0b36c39e
feat: impl StructArray -- support diskann index (#45223)
issue: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-11-04 11:57:33 +08:00
zhagnlu
653e95aaad
fix: fix bug for shredding json when empty json but not null (#45221)
#45157

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-11-04 11:11:33 +08:00
Zhen Ye
d320ccab99
fix: milvus role cannot stop at initializing state (#45244)
issue: #45243

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-04 10:47:32 +08:00
congqixia
e6be590b97
enhance: set schema version when creating new collection (#45263)
Related to #43028

Initialize the schema version field when creating a new collection
instance in QueryNode. The schema version is extracted from loadMetaInfo
and assigned to the collection, ensuring proper schema version tracking
and consistency across the distributed system.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-04 10:15:32 +08:00
Zhen Ye
576084fe86
enhance: support alter collection/database with WAL-based DDL framework (#45266)
issue: #43897

- Alter collection/database is implemented by WAL-based DDL framework
now.
- Support AlterCollection/AlterDatabase in wal now.
- Alter operation can be synced by new CDC now.
- Refactor some UT for alter DDL.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-04 09:59:33 +08:00
Zhen Ye
31a609c21d
fix: kafka should auto reset the offset from earliest to read (#45237)
issue: #44172, #45210, #44851

kafka will auto reset the offset to "latest" if the offset is
Out-of-range. the recovery of milvus wal cannot read any message from
that. So once the offset is out-of-range, kafka should read from eariest
to read the latest uncleared data.


https://kafka.apache.org/documentation/#consumerconfigs_auto.offset.reset

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-03 21:07:33 +08:00
cai.zhang
01cf5c9341
enhance: Add log to debug index task (#45198)
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-03 20:01:34 +08:00
cai.zhang
ed8ba4a28c
enhance: Make GeometryCache an optional configuration (#45192)
issue: #45187

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-03 19:59:32 +08:00
Spade A
ae03dee116
feat: implement ngram tokenizer with token_chars and custom_token_chars (#45040)
issue: https://github.com/milvus-io/milvus/issues/45039

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-11-03 18:09:33 +08:00
zhuwenxing
434e0847fd
test: remove xfail after fix (#45114)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-03 17:21:37 +08:00
zhuwenxing
a03c398986
test: add import case for struct array (#45146)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-03 17:19:39 +08:00
Zhen Ye
25e0485a56
fix: unrecoverable when replicate from old (#45224)
issue: #44962

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-03 15:07:36 +08:00
yihao.dai
27734982fa
enhance: Don't start cdc by default (#45216)
issue: https://github.com/milvus-io/milvus/issues/44123

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-11-03 13:15:32 +08:00
zhuwenxing
a47c168dd7
test: add json dumps for json string data (#45189)
/kind improvement

issue: https://github.com/milvus-io/milvus/issues/44982

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-03 10:37:33 +08:00
aoiasd
ed69375f00
enhance: remove resource type from file resource config (#45103)
File resource type was useless till now, remove it before new release.
relate: https://github.com/milvus-io/milvus/issues/43687

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-03 10:15:32 +08:00
Zhen Ye
00d8d2c33d
enhance: support load/release collection/partition with WAL-based DDL framework (#45154)
issue: #43897

- Load/Release collection/partition is implemented by WAL-based DDL
framework now.
- Support AlterLoadConfig/DropLoadConfig in wal now.
- Load/Release operation can be synced by new CDC now.
- Refactor some UT for load/release DDL.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-02 18:39:32 +08:00
Jingsong Yin
e25ee08566
fix: fix LoadMetrics bool type error (#45209)
#44584

Signed-off-by: thekingking <1677273255@qq.com>
2025-11-01 01:19:32 +08:00
Jingsong Yin
0cc79772e7
enhance: Extend SkipIndex with IN/Match support and BloomFilter (#44581)
issue: #44584

---------

Signed-off-by: thekingking <1677273255@qq.com>
2025-10-31 22:39:32 +08:00
zhikunyao
950e8f1f92
test: update helm to 5.0.6 for e2e (#45204)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-10-31 18:46:08 +08:00
congqixia
22098c1785
fix: add null check for packed_writer_ in JsonStatsParquetWriter::Close() (#45158)
Related to #45157

Fix a bug where DataNode panics when building json stats index throws an
exception before the writer is initialized. The destructor would call
Close() on an uninitialized packed_writer_ pointer, causing a null
pointer dereference.

Changes:
- Add null check for packed_writer_ before calling Flush() and Close()
- Prevents null pointer dereference in edge cases
- Ignore close status as this is a cleanup operation

This ensures safe cleanup even when initialization fails due to
exceptions.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-30 17:40:09 +08:00
Zhen Ye
309d564796
enhance: support collection and index with WAL-based DDL framework (#45033)
issue: #43897

- Part of collection/index related DDL is implemented by WAL-based DDL
framework now.
- Support following message type in wal, CreateCollection,
DropCollection, CreatePartition, DropPartition, CreateIndex, AlterIndex,
DropIndex.
- Part of collection/index related DDL can be synced by new CDC now.
- Refactor some UT for collection/index DDL.
- Add Tombstone scheduler to manage the tombstone GC for collection or
partition meta.
- Move the vchannel allocation into streaming pchannel manager.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-30 14:24:08 +08:00
cai.zhang
3c9aa3e784
fix: Fix import null geometry data (#45161)
issue: #44787

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-10-30 14:18:07 +08:00
wei liu
3566cb745c
enhance: remove max vector field number limit (#45151)
issue: #45150
Removed the maximum limit constraint (value range [1, 10]) for vector
fields in a collection to support more flexible schema design.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-10-30 12:42:07 +08:00
cqy123456
35d8213a00
fix: fail to mmap emb_list_meta in embedding list (#45127)
issue: https://github.com/milvus-io/milvus/issues/44965

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-10-30 11:00:09 +08:00
congqixia
8c98adfeb3
fix: update QueryNode NumEntities metrics when collection has no segments (#45147)
Related to #44509

Fix a bug where QueryNodeNumEntities metrics were not updated for
collections with zero segments, causing stale metrics when all segments
are flushed or compacted.

The previous implementation used separate loops: one to update size
metrics for all collections, and another to update num entities metrics
only for collections present in the grouped segments map. Collections
with no segments were skipped in the second loop, leaving their
NumEntities metrics stale.

Changes:
- Consolidate size and num entities metric updates into single loop
- Iterate over all collections instead of grouped segments
- Get collection metadata from manager instead of segment instances
- Correctly set NumEntities to 0 for collections with no segments
- Apply the same fix to both growing and sealed segment processing
- Add nil check for collection metadata before processing

This ensures all collection metrics are updated consistently, even when
segment count drops to zero.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-30 10:12:08 +08:00
Zhen Ye
6e5189fe19
fix: make ack of broadcaster cannot canceled by client (#45145)
issue: #45141

- make ack of broadcaster cannot canceled by rpc.
- make clone for assignment snapshot of wal balancer.
- add server id for GetReplicateCheckpoint to avoid failure.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-29 20:34:11 +08:00
Jingsong Yin
653dfcca41
fix: fix BloomFilter type name mapping is reversed in bfNames map (#45024)
#45017

Signed-off-by: thekingking <1677273255@qq.com>
2025-10-29 16:28:12 +08:00