11512 Commits

Author SHA1 Message Date
congqixia
ab90dd287f
fix: bump milvus-storage to fix initialization race condition (#46336)
Related to #44647

Update milvus-storage from 91df193 to 839a8e5 to include
milvus-io/milvus-storage#342, which fixes a race condition in
S3GlobalContext initialization.

The fix moves the is_initialized_ flag update from before DoInitialize()
to after it completes. This ensures the initialization flag is only set
to true after the actual initialization is done, preventing potential
issues if DoInitialize() fails or if other code checks the flag during
initialization.
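
The ordering described above can be sketched as follows. This is a minimal Python illustration, not the milvus-storage C++ code; the class `GlobalContext` and the method `ensure_initialized` are hypothetical names chosen for the example.

```python
import threading


class GlobalContext:
    """Hypothetical stand-in for a lazily initialized global context."""

    def __init__(self, do_initialize):
        self._lock = threading.Lock()
        self._do_initialize = do_initialize  # the real initialization work
        self.is_initialized = False

    def ensure_initialized(self):
        with self._lock:
            if self.is_initialized:
                return
            # Run the real work first; if it raises, the flag stays False.
            self._do_initialize()
            # Publish the "ready" state only after the work has completed.
            self.is_initialized = True
```

With this ordering, a failed initialization leaves the flag unset, so a later call can retry instead of assuming the context is ready.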

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-15 19:51:15 +08:00
congqixia
18fbaaca0a
enhance: support specified version manifest write (#46331)
Related to #44956

**Support specified version manifest write**
- Add `baseVersion` parameter to `NewPackedRecordManifestWriter` and
`NewFFIPackedWriter` to support writing manifest based on a specific
version instead of always overwriting the latest
- Add `manifestPath` tracking in `BulkPackWriterV2` to maintain manifest
state across writes
- Add `GetManifestInfo` method to parse existing manifest path and
extract base path and version
- Add `UpdateManifestPath` metacache action to track manifest path in
segment info
- Update `transaction_begin` FFI call to use the specified base version

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-15 19:49:14 +08:00
Zhen Ye
9ce5f08cc7
fix: lost broadcasting persisted before making message broadcast (#46328)
issue: #43897

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-15 13:59:15 +08:00
Spade A
f6f716bcfd
feat: impl StructArray -- support searching embeddings in embedding list with element-level filter expression (#45830)
issue: https://github.com/milvus-io/milvus/issues/42148

For a vector field inside a STRUCT, since a STRUCT can only appear as
the element type of an ARRAY field, the vector field in STRUCT is
effectively an array of vectors, i.e. an embedding list.
Milvus already supports searching embedding lists with metrics whose
names start with the prefix MAX_SIM_.

This PR allows Milvus to search embeddings inside an embedding list
using the same metrics as normal embedding fields. Each embedding in the
list is treated as an independent vector and participates in ANN search.

Further, since STRUCT may contain scalar fields that are highly related
to the embedding field, this PR introduces an element-level filter
expression to refine search results.
The grammar of the element-level filter is:

element_filter(structFieldName, $[subFieldName] == 3)

where $[subFieldName] refers to the value of subFieldName in each
element of the STRUCT array structFieldName.

It can be combined with existing filter expressions, for example:

"varcharField == 'aaa' && element_filter(struct_field, $[struct_int] == 3)"

A full example:
```
struct_schema = milvus_client.create_struct_field_schema()
struct_schema.add_field("struct_str", DataType.VARCHAR, max_length=65535)
struct_schema.add_field("struct_int", DataType.INT32)
struct_schema.add_field("struct_float_vec", DataType.FLOAT_VECTOR, dim=EMBEDDING_DIM)

schema.add_field(
    "struct_field",
    datatype=DataType.ARRAY,
    element_type=DataType.STRUCT,
    struct_schema=struct_schema,
    max_capacity=1000,
)
...

filter = "varcharField == 'aaa' && element_filter(struct_field, $[struct_int] == 3 && $[struct_str] == 'abc')"
res = milvus_client.search(
    COLLECTION_NAME,
    data=query_embeddings,
    limit=10,
    anns_field="struct_field[struct_float_vec]",
    filter=filter,
    output_fields=["struct_field[struct_int]", "varcharField"],
)

```
TODO:
1. When an `element_filter` expression is used, a regular filter
expression must also be present. Remove this restriction.
2. Implement `element_filter` expressions in the `query`.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-12-15 12:01:15 +08:00
Xiaofan
ca2e27f576
enhance: remove unnecessary segment size estimation and make it configurable (#46302)
fix #46300
Remove the unused segment size estimation and make size estimation configurable.

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2025-12-13 02:58:46 +08:00
Zhen Ye
05b8b3b4c6
fix: stack overflow when gc json or json key (#46317)
issue: #46316

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-12 20:05:15 +08:00
huanghaoyuanhhy
addb66f89c
fix: fix DescribeCollection always returning db_id = 0 (#46092)
fix: #46089

Signed-off-by: huanghaoyuanhhy <haoyuan.huang@zilliz.com>
2025-12-12 20:03:14 +08:00
Zhen Ye
d24cd6200b
fix: always retry when writing binlog (#46309)
issue: #46205

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-12 18:27:15 +08:00
Buqian Zheng
76aa00a4c6
fix: fix CanUseIndexForJson (#46286)
issue: https://github.com/milvus-io/milvus/issues/46269

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-12 18:25:20 +08:00
aoiasd
0c54875832
enhance: ValidateAnalyzer returns ValidateAnalyzerResponse instead of common.Status (#46292)
Prepare to return more info when validating an analyzer.
relate: https://github.com/milvus-io/milvus/issues/43687

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-12 10:35:14 +08:00
sijie-ni-0214
f51de1a8ab
feat: support TruncateCollection api to clear collection data (#46167)
issue: https://github.com/milvus-io/milvus/issues/46166

---------

Signed-off-by: sijie-ni-0214 <sijie.ni@zilliz.com>
2025-12-12 10:31:14 +08:00
wei liu
d2c403ce4b
enhance: Improve disk quota metrics update when cluster quota changes (#46278)
issue: #46277

- Update db/collection/partition disk quota metrics when cluster disk
quota changes, since they use cluster quota as default value
- Fix incorrect label "collection" to "partition" in disk quota per
partition watcher

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-12-11 20:45:14 +08:00
aoiasd
82e1dfc7d0
fix: highlight queries do not work for non-BM25 search (#46288)
Should always init highlight queries.
relate: https://github.com/milvus-io/milvus/issues/42589

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-11 20:07:14 +08:00
wei liu
a195c33b71
fix: Prevent target update blocking when replica lacks nodes during scaling (#46088)
issue: #46087
The previous implementation checked if the total number of ready
delegators >= replicaNum per channel. This could cause target updates to
block indefinitely when dynamically increasing replicas, because some
replicas might lack nodes while the total count still met the threshold.

This change switches to a replica-based check approach:
- Iterate through each replica individually
- For each replica, verify all channels have at least one ready
delegator
- Only sync delegators from fully ready replicas
- Skip replicas that are not ready (e.g., missing nodes for some
channels)

This ensures target updates can proceed with ready replicas while
replicas that lack nodes during dynamic scaling are gracefully skipped.
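
The replica-based check can be sketched like this. It is a simplified Python model under stated assumptions, not the QueryCoord Go code: `ready_replicas` is a hypothetical helper, and each replica is modeled as a mapping from channel name to the list of nodes holding a ready delegator.

```python
def ready_replicas(replicas, channels):
    """Return only the replicas in which every channel has a ready delegator.

    replicas: dict of replica_id -> dict of channel -> list of ready node ids
    channels: iterable of channel names the collection uses
    """
    ready = {}
    for replica_id, delegators in replicas.items():
        # A replica qualifies only if each channel has at least one ready delegator.
        if all(delegators.get(channel) for channel in channels):
            ready[replica_id] = delegators
        # Otherwise the replica is skipped and does not block the others.
    return ready
```

Replicas that are still missing nodes are left out of the result, so the caller can sync the ready ones and pick up the rest on a later pass.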

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-12-11 17:09:14 +08:00
zhagnlu
a86b8b7a12
enhance: move jsonshredding meta from parquet to meta.json (#46130)
#42533

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-12-11 14:01:13 +08:00
congqixia
b2c49d0197
enhance: bump milvus-storage to resolve credentials provider namespace conflict (#46263)
Upgrade milvus-storage from 33bf815 to 91df193.

This includes the fix from milvus-io/milvus-storage#337, which resolves
a namespace collision where both Milvus and milvus-storage defined
identical credentials provider classes in the same namespace. Although
no compile-time redefinition errors occurred, the dynamic linker could
resolve to the wrong implementation at runtime, potentially causing
cloud authentication failures due to configuration mismatches.

The fix changes milvus-storage's credentials provider namespace to
`milvus_storage`, ensuring each project uses its own implementation.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-11 10:09:13 +08:00
Zhen Ye
15f8dfc7ad
enhance: introduce a tolerance duration to delay the drop operation (#46251)
issue: #46214

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-10 19:57:13 +08:00
yihao.dai
f32f2694bc
enhance: Implement new FlushAllMessage and refactor flush all (#45920)
This PR:
1. Define and implement the new FlushAllMessage.
2. Refactor FlushAll to flush the entire cluster.

issue: https://github.com/milvus-io/milvus/issues/45919

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-12-10 19:27:13 +08:00
congqixia
8780e12570
fix: use assertion instead of modifying schema under shared lock (#46242)
Related to #46225

Replace the heterogeneous insert data handling logic that modified
schema_ while holding a shared lock with an assertion. The previous
implementation had a concurrency bug where schema modification
operations were performed under a shared_lock, which violates mutex
semantics and can lead to data races.

Issue: #46225 reported two problems:
1. Schema modification under shared_lock (not exclusive lock)
2. Access to schema_ not protected by mutex in growing segment

The removed code attempted to handle "added fields" by:
- Adding new field to schema (schema_->AddField)
- Appending field metadata to insert_record_
- Setting default data for existing rows

All these write operations were performed while holding only a
shared_lock, which is incorrect since shared_locks are meant for
read-only operations.

This fix replaces the unsafe modification with an assertion that fails
if an unexpected new field is encountered in a growing segment with
existing data. The proper handling of schema changes should go through
the Reopen() path which correctly acquires a unique_lock before
modifying schema_.
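
As a rough illustration of the replacement check (a Python sketch, not the segcore C++ code; the function name and arguments are hypothetical), the insert path now only verifies and never mutates the schema:

```python
def assert_no_unexpected_fields(schema_field_ids, insert_field_ids, existing_row_count):
    """Fail fast instead of mutating the schema from a read-only code path."""
    unexpected = sorted(set(insert_field_ids) - set(schema_field_ids))
    if unexpected and existing_row_count > 0:
        # Schema changes are expected to arrive through a separate path that
        # takes an exclusive lock, so seeing one here indicates a bug.
        raise AssertionError(
            f"unexpected new fields {unexpected} in growing segment with existing data"
        )
```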

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-10 16:25:13 +08:00
Chun Han
d9f8e38d6a
fix: query failed for int value on edge(#46075) (#46126)
related: #46075

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-12-10 15:59:12 +08:00
Buqian Zheng
ab2e51b1c7
fix: VectorArrayChunkWriter::calculate_size (#46244)
issue: #46238

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-10 15:27:14 +08:00
sparknack
5fb420b156
fix: milvus-common update (#45929)
issue: #41435

Fix some usage-tracking bugs in the caching layer.

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-12-10 14:53:13 +08:00
aoiasd
c84b6d56f8
fix: char_group tokenizer only supports one-byte chars as delimiters (#46193)
relate: https://github.com/milvus-io/milvus/issues/46192

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-10 14:33:13 +08:00
wgcn
6e2872c982
fix: wrong reduce latency metric (#46233)
#46248

Signed-off-by: wgcn <wangg48@chinatelecom.cn>
Co-authored-by: wgcn <wangg48@chinatelecom.cn>
2025-12-10 14:17:13 +08:00
Buqian Zheng
85a7a7b1e3
fix: skip json path index if the query path includes number (#46200)
issue: #45511

Our tantivy inverted index currently does not record the item index when
the value is an array, so lookups such as `a[0] == 'b'` cannot be served
from the inverted index. For such queries we need to skip the index and
fall back to brute-force search.

We may improve the index in the future, so this is a temporary solution.
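
The decision can be sketched as below. This is a Python illustration only; it assumes the query path has already been parsed into segments, with `str` for object keys and `int` for array positions, which is a modeling choice for the example rather than the actual internal representation.

```python
def can_use_json_path_index(path_segments):
    """Return False when the path contains an array position.

    path_segments: parsed path, e.g. ["a", 0] for a[0] (hypothetical format).
    """
    for segment in path_segments:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(segment, int) and not isinstance(segment, bool):
            return False  # fall back to brute-force evaluation
    return True
```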

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-10 13:59:13 +08:00
cai.zhang
bb486c0db3
fix: Fix path concatenation error when rootPath = "." in minio (#46220)
issue: #46219

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-12-10 13:53:13 +08:00
liliu-z
3f063a29b0
feat: Support Search By PK (#45820)
issue: #39157

Overview:
Support search by PK by resolving IDs to vectors on Proxy side. Upgrade
go-api to adapt to new proto definitions.

Design:
- Upgrade milvus-proto/go-api to latest master.
- Implement handleIfSearchByPK in Proxy: resolve IDs to vectors via
internal Query, then rewrite SearchRequest.
- Adapt to 'SearchInput' oneof field in SearchRequest across client and
handlers.
- Fix binary vector stride calculation bug in placeholder utils.

Compatibility:
- Older PyMilvus versions still work without this feature

What is included:
- Dense and Sparse
- Multi vector fields
- Rejection on BM25

What is **not** included:
- Hybrid Search
- EmbeddingList
- Restful API

Signed-off-by: Li Liu <li.liu@zilliz.com>
2025-12-10 10:59:14 +08:00
cai.zhang
b5e11f810d
fix: Fix panic when search returns empty result with output geometry field (#46230)
issue: #46146

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-12-09 20:37:13 +08:00
zhagnlu
8f0b7983ec
enhance: add jemalloc cached monitor (#46041)
#46133

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-12-09 19:53:13 +08:00
wei liu
046693eaf7
test: [skip e2e] fix race condition in TestQueryNodePipeline/TestBasic (#46218)
issue: #46217
The test was failing intermittently because it didn't wait for the
pipeline to finish processing messages before exiting. The test sent a
message to the pipeline and immediately returned, causing the deferred
Close() to execute before ProcessInsert, ProcessDelete, and UpdateTSafe
could be called.

Fix by:
- Moving message construction before mock expectations setup
- Adding a done channel to synchronize on UpdateTSafe completion
- Waiting for the signal before test exits
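
The synchronization pattern can be sketched in Python with `threading.Event` standing in for the Go done channel. The names `process_async` and `on_update_tsafe` are hypothetical; the real test uses Go mocks.

```python
import threading


def process_async(message, on_update_tsafe):
    """Hypothetical pipeline: handles the message on a background thread."""
    worker = threading.Thread(target=on_update_tsafe, args=(message,))
    worker.start()
    return worker


def run_test():
    done = threading.Event()
    seen = []

    def on_update_tsafe(message):
        seen.append(message)
        done.set()  # signal that the last pipeline step has run

    # Build the message first, then start processing.
    message = "insert-1"
    worker = process_async(message, on_update_tsafe)

    # Wait for the signal before the test returns and cleanup runs.
    finished = done.wait(timeout=5)
    worker.join()
    return finished, seen
```

Without the wait, the test body could return (and trigger cleanup) before the background step has been called.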

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-12-09 17:57:14 +08:00
zhenshan.cao
765768b0e4
fix: restfulv2 parsing fixes and schema defaults support with timestamptz (#46057)
issue: https://github.com/milvus-io/milvus/issues/44585

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-12-09 17:53:17 +08:00
wei liu
d7050c417f
fix: Add field data alignment validation to prevent partial update panic (#46177)
issue: #46176

- Add checkAligned validation before processing partial update field
data to prevent index out of range panic when field data arrays have
mismatched lengths
- Fix GetNumRowOfFieldDataWithSchema to handle Timestamptz string format
and Geometry WKT format properly
- Add unit tests for empty data array scenarios in partial update
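
An alignment check of this kind can be sketched as follows. This is a Python illustration, not the Go `checkAligned` implementation; the dict-based input format is an assumption made for the example.

```python
def check_aligned(field_data):
    """Raise if the per-field row counts differ; return the common row count.

    field_data: dict of field_name -> list of row values (hypothetical format).
    """
    lengths = {name: len(rows) for name, rows in field_data.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"field data is not aligned: {lengths}")
    # An empty request has zero rows.
    return next(iter(lengths.values()), 0)
```

Validating up front turns a later index-out-of-range failure into an explicit error on the request.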

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-12-09 14:17:12 +08:00
congqixia
728cdc15b2
fix: fill partition_id in load index info and close RemoteOutputStream properly (#46203)
This PR fixes two issues related to segment loading and index
deserialization:

1. Fill partition_id in LoadIndexInfo when converting field index info,
which is required by cardinal (DiskANN) index deserialization.

2. Close RemoteOutputStream in destructor to ensure the buffer is flushed
and resources are released properly.

issue: #46141

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-09 13:27:13 +08:00
Zhen Ye
b8086cb62b
fix: lost database in restful v2 (#46171)
issue: #45812

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-09 10:59:13 +08:00
Zhen Ye
459425ac84
fix: wrong context used by session of grpc client (#46183)
issue: #46182

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-08 21:47:12 +08:00
congqixia
a042a6e1e8
enhance: support pause GC at collection level (#45943)
Add collection-level granularity to the garbage collector pause/resume
mechanism. Previously, GC pause affected all collections globally. Now
operators can pause GC for specific collections while allowing other
collections to continue normal GC operations.

Changes:
- Add `pausedCollection` concurrent map to track per-collection pause
state
- Extend `Pause()` and `Resume()` methods with `collectionID` parameter
- Add `collectionGCPaused()` helper to check collection pause status
- Skip dropped segment recycling when collection GC is paused
- Update management API to accept optional `collection_id` query
parameter
- Add `GetInt64Value()` utility function for parsing int64 from KV pairs
- Maintain backward compatibility: collectionID <= 0 triggers global
pause

This provides DevOps with finer control over Milvus data lifecycle.
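
The pause semantics listed above can be modeled roughly as follows. This Python sketch is not the DataCoord Go code: `GCPauser` and its methods are hypothetical, pause durations are left out, and treating a global pause as covering every collection is an assumption of the example.

```python
import threading


class GCPauser:
    """Hypothetical model of global and per-collection GC pause state."""

    def __init__(self):
        self._lock = threading.Lock()
        self._global_paused = False
        self._paused_collections = set()

    def pause(self, collection_id=0):
        with self._lock:
            if collection_id <= 0:
                # Backward-compatible behavior: no collection means global pause.
                self._global_paused = True
            else:
                self._paused_collections.add(collection_id)

    def resume(self, collection_id=0):
        with self._lock:
            if collection_id <= 0:
                self._global_paused = False
            else:
                self._paused_collections.discard(collection_id)

    def collection_gc_paused(self, collection_id):
        with self._lock:
            # Assumption: a global pause also pauses every collection.
            return self._global_paused or collection_id in self._paused_collections
```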

issue: #45941

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-08 19:33:15 +08:00
Zhen Ye
354927374f
fix: wrong auth of restfulv2 forward when upgrading (#46139)
issue: #45812

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-08 12:45:12 +08:00
Buqian Zheng
95a535cb4d
fix: struct reduce incorrect (#46150)
issue: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-08 10:23:11 +08:00
yihao.dai
b69f4ab1cd
fix: Fix replicate lag metric calculation to prevent false-positive health (#46120)
This change fixes the calculation by using timestamp subtraction (WAL
confirmed time - Last replicate time). This ensures the lag metric
immediately spikes when replication is blocked, providing reliable
monitoring.
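
The subtraction can be sketched as below. This is a Python illustration with a hypothetical function name; clamping negative values to zero is an assumption of the sketch, not something the commit message states.

```python
from datetime import datetime, timedelta


def replicate_lag(wal_confirmed_time, last_replicate_time):
    """Lag = WAL confirmed time - last replicate time."""
    lag = wal_confirmed_time - last_replicate_time
    # Assumption: never report a negative lag.
    return max(lag, timedelta(0))
```

If replication stalls while the WAL keeps confirming new messages, `wal_confirmed_time` advances and `last_replicate_time` does not, so the reported lag grows right away.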

issue: https://github.com/milvus-io/milvus/issues/46116

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-12-07 21:51:12 +08:00
foxspy
3a3163e613
fix: skip gpu init for streaming node (#45650)
issue: #45597

The streaming node currently cannot use GPU resources and does not need
to perform initialization.

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-12-07 13:59:11 +08:00
congqixia
d4450b2f57
enhance: [StorageV2] Integrate CMEK support into Loon FFI interface (#46123)
This PR adds Customer Managed Encryption Keys (CMEK) support to the
StorageV2 FFI layer, enabling data encryption/decryption through the
cipher plugin system.

Changes:
- Add ffi_writer_c.cpp/h with GetEncParams() to retrieve encryption
parameters (key and metadata) from cipher plugin for data encryption
- Extend GetLoonReader() in ffi_reader_c.cpp to support CMEK decryption
by configuring KeyRetriever when plugin context is provided
- Add encryption property constants in ffi_common.go for writer config
- Integrate CMEK encryption in NewFFIPackedWriter() to pass encryption
parameters to the underlying storage writer

issue: #44956

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-05 17:59:12 +08:00
congqixia
8e82631282
fix: correct index_has_raw_data logic for fielddata loading (#46117)
Related to #46098
This fix addresses a bug where the segment loader incorrectly determined
whether scalar fields have raw data in their indexes, leading to
unnecessary field data loading or skipping indexed raw data retrieval.

- Build `field_ids` vector that handles both single field and column
group cases (when `child_fields_size() > 0`)
- Move the mmap setting and index_has_raw_data checks before the skip
decision, iterating over the correctly built `field_ids`
- Fix the boolean AND logic in both `Load()` and `LoadColumnGroup()` to
properly check if ALL fields in the group have raw data in their indexes
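
The corrected logic can be sketched in Python as follows. The helper names and the dict-based lookup are hypothetical; the real code lives in the C++ segment loader.

```python
def build_field_ids(field_id, child_fields):
    """Use the child fields for a column group, else the single field."""
    return list(child_fields) if child_fields else [field_id]


def group_index_has_raw_data(field_ids, index_has_raw_data):
    """True only if EVERY field in the group has raw data in its index.

    index_has_raw_data: dict of field_id -> bool (hypothetical lookup).
    """
    return bool(field_ids) and all(
        index_has_raw_data.get(field_id, False) for field_id in field_ids
    )
```

Field data loading can be skipped only when this returns `True`; a single field without raw data in its index means the group's field data still has to be loaded.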

This bug was hiding the root cause of issue #46098, where QueryNode
panics when outputting timestamptz data from scalar index with raw data.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-05 17:47:12 +08:00
cai.zhang
141547d8a8
enhance: Add log with segment size for tasks (#46118)
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-12-05 16:45:11 +08:00
aoiasd
d8c9d15c07
fix: highlighter returns error when search returns empty result (#46107)
relate: https://github.com/milvus-io/milvus/issues/42589

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-05 14:23:10 +08:00
wei liu
354fe9c9d2
fix: unstable test case TestTask_VarCharPrimaryKey (#46106)
issue: #46105

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-12-05 14:01:12 +08:00
congqixia
3daff1ab2b
enhance: use specified manifest version in loon ffi reader (#46101)
Related to #44956

Use the exact manifest version from the path parameter instead of always
fetching the latest manifest. This ensures data consistency by reading
from the specific version that was requested.

Changes:
- Update GetColumnGroups to use transaction.begin(version) with the
specified version from the path JSON
- Replace get_latest_manifest() with get_current_manifest() after
beginning transaction at the target version
- Update Go FFI binding to call get_column_groups_by_version instead of
get_latest_column_groups
- Remove unused GetManifest function from util.cpp/util.h
- Bump milvus-storage version from 5fff4f5 to 33bf815

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-05 11:45:11 +08:00
Buqian Zheng
1372e84d7f
fix: move cursor after skip index skipped a chunk (#46054)
issue: #46053

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-05 10:47:11 +08:00
congqixia
3bbc5d0825
fix: correct loop logic for timestamptz scalar index output (#46100)
Related to #46098

Fix the ReverseDataFromIndex function where the assignment of raw_data
to scalar_array and the break statement were incorrectly placed inside
the for loop for TIMESTAMPTZ data type. This caused QueryNode to panic
when outputting timestamptz fields from scalar index.

Move the assignment and break statement outside the loop to match the
pattern used by other data types like VARCHAR.
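
The shape of the bug can be illustrated with a simplified Python analogue. It is not the actual C++ `ReverseDataFromIndex` code, and both function names are hypothetical: the point is only where the assignment sits relative to the loop.

```python
def collect_buggy(values):
    """Assignment and break inside the loop: stops after the first element."""
    collected = []
    result = None
    for value in values:
        collected.append(value)
        result = list(collected)
        break
    return result


def collect_fixed(values):
    """Assignment after the loop: every element is collected first."""
    collected = []
    for value in values:
        collected.append(value)
    result = list(collected)
    return result
```

In this analogue the buggy variant returns a truncated result for multi-row input and no result at all for empty input, while the fixed variant handles both.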

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-05 00:09:11 +08:00
Xinyi7
59752f216d
fix: add check in batch_score function to prevent query node seg fault (#46025)
Previously, running a reranker with phrase matching caused the query
node to throw a segmentation fault.

GitHub issue link: https://github.com/milvus-io/milvus/issues/45990

---------

Signed-off-by: Xinyi Jiang <xinyi.jiang@reddit.com>
Co-authored-by: Xinyi Jiang <xinyi.jiang@reddit.com>
2025-12-04 17:35:17 +08:00
Zhen Ye
c22cdbbf9a
enhance: support proxy DQL forward (#46036)
issue: #45812

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-04 17:11:12 +08:00