23676 Commits

Author SHA1 Message Date
congqixia
21ed1fabfd
feat: support reopen segment for data/schema changes (#46359)
issue: #46358

This PR implements segment reopening functionality on query nodes,
enabling the application of data or schema changes to already-loaded
segments without requiring a full reload.

### Core (C++)

**New SegmentLoadInfo class**
(`internal/core/src/segcore/SegmentLoadInfo.h/cpp`):
- Encapsulates segment load configuration with structured access
- Implements `ComputeDiff()` to calculate differences between old and
new load states
- Tracks indexes, binlogs, and column groups that need to be loaded or
dropped
- Provides `ConvertFieldIndexInfoToLoadIndexInfo()` for index loading

**ChunkedSegmentSealedImpl modifications**:
- Added `Reopen(const SegmentLoadInfo&)` method to apply incremental
changes based on computed diff
- Refactored `LoadColumnGroups()` and `LoadColumnGroup()` to support
selective loading via field ID map
- Extracted `LoadBatchIndexes()` and `LoadBatchFieldData()` for reusable
batch loading logic
- Added `LoadManifest()` for manifest-based loading path
- Updated all methods to use `SegmentLoadInfo` wrapper instead of direct
proto access

**SegmentGrowingImpl modifications**:
- Added `Reopen()` stub method for interface compliance

**C API additions** (`segment_c.h/cpp`):
- Added `ReopenSegment()` function exposing reopen to Go layer

### Go Side

**QueryNode handlers** (`internal/querynodev2/`):
- Added `HandleReopen()` in handlers.go
- Added `ReopenSegments()` RPC in services.go

**Segment interface** (`internal/querynodev2/segments/`):
- Extended `Segment` interface with `Reopen()` method
- Implemented `Reopen()` in LocalSegment
- Added `Reopen()` to segment loader

**Segcore wrapper** (`internal/util/segcore/`):
- Added `Reopen()` method in segment.go
- Added `ReopenSegmentRequest` in requests.go

### Proto

- Added new fields to support reopen in `query_coord.proto`

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-17 15:49:16 +08:00
groot
d63ec2d8c6
fix: Enable search iterator for binary vector BIN_FLAT (#46340)
issue: https://github.com/milvus-io/milvus/issues/46339
https://github.com/milvus-io/milvus/discussions/46326

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-12-17 14:13:16 +08:00
Chun Han
f0265dde18
fix: catch exception from LoadWithStrategy(#46380) (#46381)
related: #46380

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-12-17 11:37:17 +08:00
liliu-z
28061ec2f4
enhance: eliminate race condition in TestTargetObserver causing intermitt… (#46375)
This commit addresses an intermittent test failure in TestTargetObserver
with a mock panic error.

Problem:
--------
The original test TestTriggerUpdateTarget was a monolithic test that
cleared and recreated mock expectations mid-test execution. This created
a race condition:

1. Background goroutine in TargetObserver runs every 3 seconds, calling
broker.ListIndexes() and broker.DescribeCollection()
2. Test cleared all mock expectations at line 200 to prepare for next
phase
3. Test only re-mocked GetRecoveryInfoV2, leaving ListIndexes unmocked
4. If background goroutine triggered during this ~0.01s window (lines
200-213), it would call the unmocked ListIndexes() method, causing panic
and timeout

Error observed:
```
panic: test timed out after 10m0s
mock: I don't know what to return because the method call was unexpected.
Either do Mock.On("ListIndexes").Return(...) first, or remove the call.
```

Solution:
---------
Split the monolithic test into two independent test cases:

1. TestInitialLoad_ShouldNotUpdateCurrentTarget
   - Tests that CurrentTarget remains empty during initial load
   - Verifies the two-phase update mechanism works correctly

2. TestIncrementalUpdate_WithNewSegment
   - Tests incremental updates when new segments arrive
   - Properly sets up ALL required mocks before Eventually() calls
   - Lines 241-242 now include ListIndexes and DescribeCollection mocks

Benefits:
---------
- Eliminates race condition entirely (no mid-test mock clearing)
- Better test isolation and maintainability
- Clearer test intent with descriptive names
- Tests can run independently and in parallel
- Follows FIRST principles (Fast, Isolated, Repeatable, Self-validating,
Timely)

Signed-off-by: Li Liu <li.liu@zilliz.com>
2025-12-17 11:35:16 +08:00
congqixia
efa7ccdf81
fix: pass manifest path when loading growing segments (#46378)
Related to #44956

Pass ManifestPath field to SegmentLoadInfo when loading growing segments
in loadGrowingSegments function. This ensures storage v2 can properly
locate segment data via manifest path, consistent with other segment
loading paths.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-17 10:19:15 +08:00
wei liu
c1844d2aae
test: Temporarily disable partialResultCounter assertion in flaky tests (#46364)
issue: #46352

Comment out partialResultCounter assertions in partial search tests due
to concurrent issue between segment_checker and leader_checker during
heartbeat (500ms). This assertion sometimes fails because partial
results may be returned unexpectedly before segments are properly
distributed.

Affected tests:
- TestSingleNodeDownOnSingleReplica
- TestAllNodeDownOnSingleReplica
- TestSingleNodeDownOnMultiReplica
- TestPartialResultRequiredDataRatioTooHigh
- TestSkipWaitTSafe

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-12-16 17:29:16 +08:00
XuanYang-cn
0bbb134e39
feat: Enable to backup and reload ez (#46332)
see also: #40013

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-12-16 17:19:16 +08:00
wei liu
ac23beefb5
fix: ensure all channels synced before updating current target (#46348)
issue: #46087, #46327

The previous implementation only checked if there were any ready
delegators before updating the current target. This could lead to
partial target updates when only some channels had ready delegators.

This regression was introduced by #46088, which removed the check for
all channels being ready. This fix ensures that
shouldUpdateCurrentTarget returns true only when ALL channels have been
successfully synced, preventing incomplete target updates that could
cause query inconsistencies.

Added unit tests to cover:
- All channels synced scenario (should return true)
- Partial channels synced scenario (should return false)
- No ready delegators scenario (should return false)

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-12-16 14:45:17 +08:00
aoiasd
df80f54151
feat: support use user's file as dictionary for analyzer filter (#46145)
relate: https://github.com/milvus-io/milvus/issues/43687

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-16 11:45:16 +08:00
congqixia
bb2a08ed71
enhance: pass manifest path to stats task for storage v2 support (#46350)
Related #44956

Add manifest_path field to CreateStatsRequest and propagate it through
the stats task pipeline. This enables stats tasks and text index
building to access segment manifest for storage v2 format operations.

- Add manifest_path field to CreateStatsRequest proto
- Set ManifestPath from segment metadata in DataCoord
- Pass manifest to BuildIndexInfo in stats task builder
- Include manifest in compaction text index creation

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-16 11:11:16 +08:00
yihao.dai
889505872a
enhance: Return FlushAllMsg in response (#46347)
issue: https://github.com/milvus-io/milvus/issues/45919

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-12-16 10:35:16 +08:00
Zhen Ye
675a6b9ba0
fix: illegal reference count of record in binlog writer (#46344)
issue: #46205

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-15 22:51:15 +08:00
Lanqing Yang
3e15604f2e
fix: use rlock for pinindex (#45932)
fixes: https://github.com/milvus-io/milvus/issues/45934
pinIndex is a const and only do read operations rlock would be the right
choice for performance

Signed-off-by: Lanqing Yang <lanqingy93@gmail.com>
2025-12-15 22:33:16 +08:00
congqixia
ab90dd287f
fix: bump milvus-storage to fix initialization race condition (#46336)
Related to #44647

Update milvus-storage from 91df193 to 839a8e5 to include
milvus-io/milvus-storage#342, which fixes a race condition in
S3GlobalContext initialization.

The fix moves the is_initialized_ flag update from before DoInitialize()
to after it completes. This ensures the initialization flag is only set
to true after the actual initialization is done, preventing potential
issues if DoInitialize() fails or if other code checks the flag during
initialization.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-15 19:51:15 +08:00
congqixia
18fbaaca0a
enhance: support specified version manifest write (#46331)
Related to #44956

**Support specified version manifest write**
- Add `baseVersion` parameter to `NewPackedRecordManifestWriter` and
`NewFFIPackedWriter` to support writing manifest based on a specific
version instead of always overwriting the latest
- Add `manifestPath` tracking in `BulkPackWriterV2` to maintain manifest
state across writes
- Add `GetManifestInfo` method to parse existing manifest path and
extract base path and version
- Add `UpdateManifestPath` metacache action to track manifest path in
segment info
- Update `transaction_begin` FFI call to use the specified base version

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-15 19:49:14 +08:00
Feilong Hou
971085b033
test: enable debug_mode to observe test case instability. (#46341)
Issue: #46333

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-12-15 17:55:16 +08:00
zhuwenxing
5f8daa0f6d
test: Add geometry operations test suite for RESTful API (#46174)
/kind improvement

---------

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-15 15:45:15 +08:00
Zhen Ye
9ce5f08cc7
fix: lost broadcasting persisted before making message broadcast (#46328)
issue: #43897

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-15 13:59:15 +08:00
Spade A
f6f716bcfd
feat: impl StructArray -- support embedding searches embeddings in embedding list with element level filter expression (#45830)
issue: https://github.com/milvus-io/milvus/issues/42148

For a vector field inside a STRUCT, since a STRUCT can only appear as
the element type of an ARRAY field, the vector field in STRUCT is
effectively an array of vectors, i.e. an embedding list.
Milvus already supports searching embedding lists with metrics whose
names start with the prefix MAX_SIM_.

This PR allows Milvus to search embeddings inside an embedding list
using the same metrics as normal embedding fields. Each embedding in the
list is treated as an independent vector and participates in ANN search.

Further, since STRUCT may contain scalar fields that are highly related
to the embedding field, this PR introduces an element-level filter
expression to refine search results.
The grammar of the element-level filter is:

element_filter(structFieldName, $[subFieldName] == 3)

where $[subFieldName] refers to the value of subFieldName in each
element of the STRUCT array structFieldName.

It can be combined with existing filter expressions, for example:

"varcharField == 'aaa' && element_filter(struct_field, $[struct_int] ==
3)"

A full example:
```
struct_schema = milvus_client.create_struct_field_schema()
struct_schema.add_field("struct_str", DataType.VARCHAR, max_length=65535)
struct_schema.add_field("struct_int", DataType.INT32)
struct_schema.add_field("struct_float_vec", DataType.FLOAT_VECTOR, dim=EMBEDDING_DIM)

schema.add_field(
    "struct_field",
    datatype=DataType.ARRAY,
    element_type=DataType.STRUCT,
    struct_schema=struct_schema,
    max_capacity=1000,
)
...

filter = "varcharField == 'aaa' && element_filter(struct_field, $[struct_int] == 3 && $[struct_str] == 'abc')"
res = milvus_client.search(
    COLLECTION_NAME,
    data=query_embeddings,
    limit=10,
    anns_field="struct_field[struct_float_vec]",
    filter=filter,
    output_fields=["struct_field[struct_int]", "varcharField"],
)

```
TODO:
1. When an `element_filter` expression is used, a regular filter
expression must also be present. Remove this restriction.
2. Implement `element_filter` expressions in the `query`.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-12-15 12:01:15 +08:00
Xiaofan
ca2e27f576
enhance: remove uncessary segment size estimation and make it configurable (#46302)
fix #46300
remove unused segment size estimation, and make size estimation configurable

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2025-12-13 02:58:46 +08:00
Zhen Ye
05b8b3b4c6
fix: stack overflow when gc json or json key (#46317)
issue: #46316

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-12 20:05:15 +08:00
huanghaoyuanhhy
addb66f89c
fix: fix DescribeCollection always returning db_id = 0 (#46092)
fix: #46089

Signed-off-by: huanghaoyuanhhy <haoyuan.huang@zilliz.com>
2025-12-12 20:03:14 +08:00
Zhen Ye
d24cd6200b
fix: always retry when writing binlog (#46309)
issue: #46205

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-12 18:27:15 +08:00
Buqian Zheng
76aa00a4c6
fix: fix CanUseIndexForJson (#46286)
issue: https://github.com/milvus-io/milvus/issues/46269

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-12 18:25:20 +08:00
aoiasd
0c54875832
enhance: ValidateAnalyzer return ValidateAnalyzerResponse instead common.Status (#46292)
Prepare for return more info when validate analyzer.
relate: https://github.com/milvus-io/milvus/issues/43687

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-12 10:35:14 +08:00
sijie-ni-0214
f51de1a8ab
feat: support TruncateCollection api to clear collection data (#46167)
issue: https://github.com/milvus-io/milvus/issues/46166

---------

Signed-off-by: sijie-ni-0214 <sijie.ni@zilliz.com>
2025-12-12 10:31:14 +08:00
wei liu
d2c403ce4b
enhance: Improve disk quota metrics update when cluster quota changes (#46278)
issue: #46277

- Update db/collection/partition disk quota metrics when cluster disk
quota changes, since they use cluster quota as default value
- Fix incorrect label "collection" to "partition" in disk quota per
partition watcher

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-12-11 20:45:14 +08:00
aoiasd
82e1dfc7d0
fix: highlight queries not work when not BM25 search (#46288)
Should aways init highlight queries.
relate: https://github.com/milvus-io/milvus/issues/42589

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-11 20:07:14 +08:00
wei liu
a195c33b71
fix: Prevent target update blocking when replica lacks nodes during scaling (#46088)
issue: #46087
The previous implementation checked if the total number of ready
delegators >= replicaNum per channel. This could cause target updates to
block indefinitely when dynamically increasing replicas, because some
replicas might lack nodes while the total count still met the threshold.

This change switches to a replica-based check approach:
- Iterate through each replica individually
- For each replica, verify all channels have at least one ready
delegator
- Only sync delegators from fully ready replicas
- Skip replicas that are not ready (e.g., missing nodes for some
channels)

This ensures target updates can proceed with ready replicas while
replicas that lack nodes during dynamic scaling are gracefully skipped.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-12-11 17:09:14 +08:00
Feilong Hou
224a7943ad
test: add e2e case for timestamptz on restful (#46254)
Issue: #46253

 On branch feature/timestamps
 Changes to be committed:
	new file:   testcases/test_timestamptz.py

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-12-11 14:21:13 +08:00
zhagnlu
a86b8b7a12
enhance: move jsonshredding meta from parquet to meta.json (#46130)
#42533

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-12-11 14:01:13 +08:00
zhuwenxing
3aa0b769e5
test: add unique error message collection in chaos checker (#46262)
/kind improvement

- Add normalize_error_message function to extract and normalize error
text
- Collect unique error messages during chaos test execution
- Display error details in assertion messages for better debugging

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-11 13:49:12 +08:00
zhuwenxing
75d6f0d509
test: add ST_ISVALID geometry function test cases (#46232)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-11 13:47:21 +08:00
congqixia
b2c49d0197
enhance: bump milvus-storage to resolve credentials provider namespace conflict (#46263)
Upgrade milvus-storage from 33bf815 to 91df193.

This includes the fix from milvus-io/milvus-storage#337, which resolves
a namespace collision where both Milvus and milvus-storage defined
identical credentials provider classes in the same namespace. Although
no compile-time redefinition errors occurred, the dynamic linker could
resolve to the wrong implementation at runtime, potentially causing
cloud authentication failures due to configuration mismatches.

The fix changes milvus-storage's credentials provider namespace to
`milvus_storage`, ensuring each project uses its own implementation.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-11 10:09:13 +08:00
Zhen Ye
15f8dfc7ad
enhance: introduce a tolerance duration to delay the drop operation (#46251)
issue: #46214

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-10 19:57:13 +08:00
congqixia
416113d11a
enhance: Change RootCoord default port to non-ephemeral port (#46256)
Related to #46255

Change RootCoord default gRPC port from 53100 to 22125 to avoid
conflicts with ephemeral port range.

The previous port 53100 falls within the Linux ephemeral port range
(32768-60999), which could cause conflicts with other connections
including Milvus's own outbound connections. Port 22125 is outside the
ephemeral range and provides more reliable service binding.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-10 19:47:13 +08:00
yihao.dai
f32f2694bc
enhance: Implement new FlushAllMessage and refactor flush all (#45920)
This PR:
1. Define and implement the new FlushAllMessage.
2. Refactor FlushAll to flush the entire cluster.

issue: https://github.com/milvus-io/milvus/issues/45919

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-12-10 19:27:13 +08:00
congqixia
8780e12570
fix: use assertion instead of modifying schema under shared lock (#46242)
Related to #46225

Replace the heterogeneous insert data handling logic that modified
schema_ while holding a shared lock with an assertion. The previous
implementation had a concurrency bug where schema modification
operations were performed under a shared_lock, which violates mutex
semantics and can lead to data races.

Issue: #46225 reported two problems:
1. Schema modification under shared_lock (not exclusive lock)
2. Access to schema_ not protected by mutex in growing segment

The removed code attempted to handle "added fields" by:
- Adding new field to schema (schema_->AddField)
- Appending field metadata to insert_record_
- Setting default data for existing rows

All these write operations were performed while holding only a
shared_lock, which is incorrect since shared_locks are meant for
read-only operations.

This fix replaces the unsafe modification with an assertion that fails
if an unexpected new field is encountered in a growing segment with
existing data. The proper handling of schema changes should go through
the Reopen() path which correctly acquires a unique_lock before
modifying schema_.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-10 16:25:13 +08:00
Chun Han
d9f8e38d6a
fix: query failed for int value on edge(#46075) (#46126)
related: #46075

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-12-10 15:59:12 +08:00
Buqian Zheng
ab2e51b1c7
fix: VectorArrayChunkWriter::calculate_size (#46244)
issue: #46238

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-10 15:27:14 +08:00
sparknack
5fb420b156
fix: milvus-common update (#45929)
issue: #41435

fix some usage tracking bugs in caching layer.

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-12-10 14:53:13 +08:00
aoiasd
c84b6d56f8
fix: char_group tokenizer only support one byte char as delimiters (#46193)
relate: https://github.com/milvus-io/milvus/issues/46192

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-10 14:33:13 +08:00
wgcn
6e2872c982
fix: wrong reduce lantency metric (#46233)
#46248

Signed-off-by: wgcn <wangg48@chinatelecom.cn>
Co-authored-by: wgcn <wangg48@chinatelecom.cn>
2025-12-10 14:17:13 +08:00
Buqian Zheng
85a7a7b1e3
fix: skip json path index if the query path includes number (#46200)
issue: #45511

our tantivy inverted index currently does not include item index if the
value is an array, thus we can't do `a[0] == 'b'` type of look up in the
inverted index. for such, we need to skip the index and use brute force
search.

we may improve our index in the future, so this is a temp solution

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-10 13:59:13 +08:00
cai.zhang
bb486c0db3
fix: Fix path concatenation error when rootPath = "." in minio (#46220)
issue: #46219

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-12-10 13:53:13 +08:00
liliu-z
3f063a29b0
feat: Support Search By PK (#45820)
issue: #39157

Overview:
Support search by PK by resolving IDs to vectors on Proxy side. Upgrade
go-api to adapt to new proto definitions.

Design:
- Upgrade milvus-proto/go-api to latest master.
- Implement handleIfSearchByPK in Proxy: resolve IDs to vectors via
internal Query, then rewrite SearchRequest.
- Adapt to 'SearchInput' oneof field in SearchRequest across client and
handlers.
- Fix binary vector stride calculation bug in placeholder utils.

Compatibility:
- Old Pymilvus can still work w/o this feature

What is included:
- Dense and Sparse
- Multi vector fields
- Rejection on BM25

What is **not** include:
- Hybrid Search
- EmbeddingList
- Restful API

Signed-off-by: Li Liu <li.liu@zilliz.com>
2025-12-10 10:59:14 +08:00
cai.zhang
b5e11f810d
fix: Fix panic when search empty result with output geometry field (#46230)
issue: #46146

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-12-09 20:37:13 +08:00
zhagnlu
8f0b7983ec
enhance: add jemalloc cached monitor (#46041)
#46133

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-12-09 19:53:13 +08:00
zhuwenxing
f9ff0e8402
test: add testcases for add/alter/drop text embedding function (#46229)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-09 19:23:14 +08:00
zhuwenxing
abe0318bec
test: use predefined fake_de instead of creating new Faker instances to reduce run time (#46194)
related: https://github.com/milvus-io/milvus/issues/46014

/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-09 17:59:14 +08:00