23522 Commits

Author SHA1 Message Date
zhuwenxing
256e073e8d
test: add more testcases for geo and struct (#45414)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-25 10:51:06 +08:00
Zhen Ye
446e0b7bf5
fix: keep memory state consistent when recovering broadcast task from proto (#45787)
issue: #45782

- The zero value of a repeated field or bytes field in proto is dropped or
decoded as an empty value rather than a nil pointer, so the recovery of a
broadcast task from proto must be fixed to keep the in-memory state
consistent.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-24 20:05:07 +08:00
congqixia
c01fd94a6a
enhance: integrate Storage V2 FFI interface for unified storage access (#45723)
Related #44956
This commit integrates the Storage V2 FFI (Foreign Function Interface)
interface throughout the Milvus codebase, enabling unified storage
access through the Loon FFI layer. This is a significant step towards
standardizing storage operations across different storage versions.

1. Configuration Support
- **configs/milvus.yaml**: Added `useLoonFFI` configuration flag under
`common.storage.file.splitByAvgSize` section
  - Allows runtime toggle between traditional binlog readers and new
  FFI-based manifest readers
  - Default: `false` (maintains backward compatibility)

2. Core FFI Infrastructure

Enhanced Utilities (internal/core/src/storage/loon_ffi/util.cpp/h)
- **ToCStorageConfig()**: Converts Go's `StorageConfig` to C's
`CStorageConfig` struct for FFI calls
- **GetManifest()**: Parses manifest JSON and retrieves latest column
groups using FFI
  - Accepts manifest path with `base_path` and `ver` fields
  - Calls `get_latest_column_groups()` FFI function
  - Returns column group information as string
  - Comprehensive error handling for JSON parsing and FFI errors

3. Dependency Updates
- **internal/core/thirdparty/milvus-storage/CMakeLists.txt**:
  - Updated milvus-storage version from `0883026` to `302143c`
  - Ensures compatibility with latest FFI interfaces

4. Data Coordinator Changes

All compaction task builders now include manifest path in segment
binlogs:

- **compaction_task_clustering.go**: Added `Manifest:
segInfo.GetManifestPath()` to segment binlogs
- **compaction_task_l0.go**: Added manifest path to both L0 segment
selection and compaction plan building
- **compaction_task_mix.go**: Added manifest path to mixed compaction
segment binlogs
- **meta.go**: Updated metadata completion logic:
  - `completeClusterCompactionMutation()`: Set `ManifestPath` in new
  segment info
  - `completeMixCompactionMutation()`: Preserve manifest path in compacted
  segments
  - `completeSortCompactionMutation()`: Include manifest path in sorted
  segments

5. Data Node Compactor Enhancements

All compactors updated to support dual-mode reading (binlog vs
manifest):

6. Flush & Sync Manager Updates

Pack Writer V2 (pack_writer_v2.go)
- **BulkPackWriterV2.Write()**: Extended return signature to include
`manifest string`
- Implementation:
  - Generate manifest path: `path.Join(pack.segmentID, "manifest.json")`
  - Write packed data using FFI-based writer
  - Return manifest path along with binlogs, deltas, and stats

Task Handling (task.go)
- Updated all sync task result handling to accommodate new manifest
return value
- Ensured backward compatibility for callers not using manifest

7. Go Storage Layer Integration

New Interfaces and Implementations
- **record_reader.go**: Interface for unified record reading across
storage versions
- **record_writer.go**: Interface for unified record writing across
storage versions
- **binlog_record_writer.go**: Concrete implementation for traditional
binlog-based writing

Enhanced Schema Support (schema.go, schema_test.go)
- Schema conversion utilities to support FFI-based storage operations
- Ensures proper Arrow schema mapping for V2 storage

Serialization Updates
- **serde.go, serde_events.go, serde_events_v2.go**: Updated to work
with new reader/writer interfaces
- Test files updated to validate dual-mode serialization

8. Storage V2 Packed Format

FFI Common (storagev2/packed/ffi_common.go)
- Common FFI utilities and type conversions for packed storage format

Packed Writer FFI (storagev2/packed/packed_writer_ffi.go)
- FFI-based implementation of packed writer
- Integrates with Loon storage layer for efficient columnar writes

Packed Reader FFI (storagev2/packed/packed_reader_ffi.go)
- Already existed, now complemented by writer implementation

9. Protocol Buffer Updates

data_coord.proto & datapb/data_coord.pb.go
- Added `manifest` field to compaction segment messages
- Enables passing manifest metadata through compaction pipeline

worker.proto & workerpb/worker.pb.go
- Added compaction parameter for `useLoonFFI` flag
- Allows workers to receive FFI configuration from coordinator

10. Parameter Configuration

component_param.go
- Added `UseLoonFFI` parameter to compaction configuration
- Reads from `common.storage.file.useLoonFFI` config path
- Default: `false` for safe rollout

11. Test Updates
- **clustering_compactor_storage_v2_test.go**: Updated signatures to
handle manifest return value
- **mix_compactor_storage_v2_test.go**: Updated test helpers for
manifest support
- **namespace_compactor_test.go**: Adjusted writer calls to expect
manifest
- **pack_writer_v2_test.go**: Validated manifest generation in pack
writing

This integration follows a **dual-mode approach**:
1. **Legacy Path**: Traditional binlog-based reading/writing (when
`useLoonFFI=false` or no manifest)
2. **FFI Path**: Manifest-based reading/writing through Loon FFI (when
`useLoonFFI=true` and manifest exists)
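
The selection rule between the two paths can be sketched as follows. `chooseReader` and `readerKind` are hypothetical names used only for illustration; the real readers are wired differently.

```go
package main

import "fmt"

// readerKind names the two read paths described above.
type readerKind string

const (
	legacyBinlog readerKind = "binlog"
	ffiManifest  readerKind = "manifest"
)

// chooseReader is a hypothetical helper: the FFI path is taken only when the
// flag is on AND a manifest path is present; everything else stays legacy.
func chooseReader(useLoonFFI bool, manifestPath string) readerKind {
	if useLoonFFI && manifestPath != "" {
		return ffiManifest
	}
	return legacyBinlog
}

func main() {
	fmt.Println(chooseReader(false, "seg/manifest.json"))
	fmt.Println(chooseReader(true, ""))
	fmt.Println(chooseReader(true, "seg/manifest.json"))
}
```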

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-24 19:57:07 +08:00
congqixia
a7275e190e
fix: populate index info after segment loading to prevent redundant load tasks (#45803)
After segments gained self-management capabilities for loading, the
index information from the initial load was not being preserved in the
Go-side segment metadata. This caused QueryCoord to repeatedly dispatch
load index tasks, which would fail in segcore since the indexes were
already loaded.

**Root Cause:**
The segment's `fieldIndexes` map was not being populated with index
metadata after calling `FinishLoad`, leading to a mismatch between the
Go-side metadata and segcore's internal state.

**Solution:**
After successfully loading a sealed segment, iterate through
`loadInfo.IndexInfos` and insert each index entry into the segment's
`fieldIndexes` map. This ensures the Go-side metadata stays in sync with
segcore and prevents redundant load index operations.
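
A minimal sketch of the solution, assuming simplified types: `segment`, `indexInfo`, `recordLoadedIndexes`, and `needsIndexLoad` are hypothetical and only mirror the idea of filling `fieldIndexes` from `loadInfo.IndexInfos`.

```go
package main

import "fmt"

// indexInfo is a simplified, hypothetical stand-in for one loaded index entry.
type indexInfo struct {
	FieldID int64
	IndexID int64
}

// segment keeps Go-side metadata about which indexes are already loaded.
type segment struct {
	fieldIndexes map[int64]indexInfo // keyed by index ID in this sketch
}

// recordLoadedIndexes copies every entry of the load request into the
// segment's map, to be called once the load has finished successfully.
func (s *segment) recordLoadedIndexes(infos []indexInfo) {
	if s.fieldIndexes == nil {
		s.fieldIndexes = make(map[int64]indexInfo, len(infos))
	}
	for _, info := range infos {
		s.fieldIndexes[info.IndexID] = info
	}
}

// needsIndexLoad reports whether a load-index task would still be dispatched.
func (s *segment) needsIndexLoad(indexID int64) bool {
	_, loaded := s.fieldIndexes[indexID]
	return !loaded
}

func main() {
	seg := &segment{}
	fmt.Println(seg.needsIndexLoad(100)) // metadata empty before the load
	seg.recordLoadedIndexes([]indexInfo{{FieldID: 1, IndexID: 100}})
	fmt.Println(seg.needsIndexLoad(100)) // metadata now matches the load
}
```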

Fixes #45802
Related to #45060

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-24 19:55:07 +08:00
XuanYang-cn
c082317681
fix: Use base64 to encode not utf-8 bytes (#45655)
See also: #45654
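
One way to read the title, sketched with the Go standard library only: keep bytes that are valid UTF-8 as text and fall back to base64 otherwise. The `printable` helper is hypothetical, and the actual change may apply base64 in a different place.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"unicode/utf8"
)

// printable returns b as a string when it is valid UTF-8 and falls back to
// standard base64 otherwise, so the result is always safe to embed in text.
func printable(b []byte) string {
	if utf8.Valid(b) {
		return string(b)
	}
	return base64.StdEncoding.EncodeToString(b)
}

func main() {
	fmt.Println(printable([]byte("pk-1")))
	fmt.Println(printable([]byte{0xff, 0xfe, 0x00}))
}
```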

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-11-24 18:23:06 +08:00
yanliang567
1da75c0ee2
test: Update hybrid search tests to milvus client style (#45772)
related issue: #45326

---------

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-11-24 17:55:07 +08:00
aoiasd
5efb0cedc8
feat: support use fragment config for highlight (#45099)
relate: https://github.com/milvus-io/milvus/issues/42589

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-24 17:07:06 +08:00
Feilong Hou
228eb0f5d0
test: add more test cases and add bulk insert scenario (#45770)
Issue: #45756
1. add bulk insert scenario
2. fix small issue in e2e cases
3. add search group by test case
4. add timestamptz to gen_all_datatype_collection_schema
5. modify partial update testcase to ensure correct result from
timestamptz field

 On branch feature/timestamps
 Changes to be committed:
	modified:   common/bulk_insert_data.py
	modified:   common/common_func.py
	modified:   common/common_type.py
	modified:   milvus_client/test_milvus_client_partial_update.py
	modified:   milvus_client/test_milvus_client_timestamptz.py
	modified:   pytest.ini
	modified:   testcases/test_bulk_insert.py

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-11-24 15:21:06 +08:00
zhikunyao
2134f83aa3
test: update macos github runner cache (#45778)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-11-23 22:47:24 +08:00
tinswzy
1427825133
enhance: improve WAL retention strategy (#45350)
issue: #44369
woodpecker related issue:
[#59](https://github.com/zilliztech/woodpecker/issues/59)

Refactor the WAL retention logic in Milvus StreamingNode:
- Remove the simple sampling-based truncation mechanism.
- After flush, WAL data is directly truncated.
- The retention control is now delegated to the underlying message queue
(MQ) implementation.

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-11-23 21:41:05 +08:00
Zhen Ye
823c7f7e3e
fix: use remote wal when local wal shutdown (#45753)
issue: #45750

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-22 16:17:05 +08:00
zhikunyao
eea9c8093d
test: update macos checker to macos-15-intel (#45673)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-11-22 00:07:06 +08:00
Buqian Zheng
2cf1e0e452
enhance: optimize pk search to use binary search, and 2 pointers for in expr (#45328)
issue: #44935

This is somewhat related to #44935, but applies to the pk rather than the stl_sort index.
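
The two techniques named in the title can be sketched in Go, assuming the primary keys are stored in ascending order. The real change lives in the C++ core; `findPK` and `matchIn` are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// findPK binary-searches an ascending pk column and returns the offset of
// target, or -1 when it is absent.
func findPK(sorted []int64, target int64) int {
	i := sort.Search(len(sorted), func(i int) bool { return sorted[i] >= target })
	if i < len(sorted) && sorted[i] == target {
		return i
	}
	return -1
}

// matchIn evaluates "pk in terms" with two pointers. Both slices must be
// ascending; the result holds the offsets in pks whose value is in terms.
func matchIn(pks, terms []int64) []int {
	var hits []int
	i, j := 0, 0
	for i < len(pks) && j < len(terms) {
		switch {
		case pks[i] < terms[j]:
			i++
		case pks[i] > terms[j]:
			j++
		default:
			hits = append(hits, i)
			i++ // keep j in place so repeated pks also match
		}
	}
	return hits
}

func main() {
	pks := []int64{1, 3, 5, 7, 9}
	fmt.Println(findPK(pks, 7))
	fmt.Println(matchIn(pks, []int64{3, 4, 9}))
}
```

The two-pointer pass costs one linear scan over both lists instead of one search per term, which pays off when the term list is long.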

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-21 19:01:05 +08:00
zhenshan.cao
bec6d1d1e1
enhance: timestamptz support groupby (#45762)
issue: https://github.com/milvus-io/milvus/issues/45761

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-21 18:39:05 +08:00
Buqian Zheng
e00ad1098f
enhance: add ScalarFieldProto& overload to avoid unnecessary copies (#45743)
1. Array.h: Add output_data(ScalarFieldProto&) overload for both Array
and ArrayView classes
2. Use std::string_view instead of std::string for VARCHAR and GEOMETRY
types to avoid extra string copies
3. Call Reserve(length_) before writing to proto objects to reduce
memory reallocations

A simple test shows that these optimizations improve Array of Varchar
bulk_subscript performance by 20%.

issue: https://github.com/milvus-io/milvus/issues/45679

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-21 18:35:05 +08:00
congqixia
f51fcc09ae
fix: resolve SessionWatcher goroutine leak and unstable UT in querycoordv2 (#45627)
Related to #44620
Related to unstable ut "internal/querycoordv2 TestServer/TestNodeUp"

Introduce SessionWatcher interface to fix race condition and goroutine
leak that caused unstable unit test TestServer/TestNodeUp.

Changes:
- Add SessionWatcher interface with EventChannel() and Stop() methods
- Refactor WatchServices() to return SessionWatcher instead of raw
channel
- Fix cleanup order in QueryCoordV2: stop watcher before session
- Update DataCoord, ConnectionManager to use SessionWatcher
- Add MockSessionWatcher for testing
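
A minimal sketch of such an interface, assuming a simplified event type. Only the method names `EventChannel()` and `Stop()` come from the list above; `SessionEvent`, `watcher`, and `newWatcher` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// SessionEvent is a simplified, hypothetical event payload.
type SessionEvent struct{ NodeID int64 }

// SessionWatcher exposes the events plus an explicit way to end the loop.
type SessionWatcher interface {
	EventChannel() <-chan SessionEvent
	Stop()
}

type watcher struct {
	events   chan SessionEvent
	done     chan struct{}
	stopOnce sync.Once
	wg       sync.WaitGroup
}

// newWatcher forwards events from source until Stop is called or source closes.
func newWatcher(source <-chan SessionEvent) *watcher {
	w := &watcher{events: make(chan SessionEvent), done: make(chan struct{})}
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		defer close(w.events)
		for {
			select {
			case <-w.done:
				return
			case ev, ok := <-source:
				if !ok {
					return
				}
				select {
				case w.events <- ev:
				case <-w.done:
					return
				}
			}
		}
	}()
	return w
}

func (w *watcher) EventChannel() <-chan SessionEvent { return w.events }

// Stop signals the loop and waits for it to exit, so no goroutine is leaked.
func (w *watcher) Stop() {
	w.stopOnce.Do(func() { close(w.done) })
	w.wg.Wait()
}

func main() {
	source := make(chan SessionEvent, 1)
	var w SessionWatcher = newWatcher(source)
	source <- SessionEvent{NodeID: 1}
	fmt.Println((<-w.EventChannel()).NodeID)
	w.Stop()
	_, open := <-w.EventChannel()
	fmt.Println(open)
}
```

Because `Stop()` blocks until the loop has returned, a caller can stop the watcher first and tear down the session afterwards, matching the cleanup order listed above.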

Fixes race condition between session context cancellation and internal
loop exit. Eliminates goroutine leak by providing explicit lifecycle
management.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-21 18:33:06 +08:00
sre-ci-robot
937fd99354
[automated] Bump milvus version to v2.6.6 (#45769)
Bump milvus version to v2.6.6
Signed-off-by: sre-ci-robot <sre-ci-robot@users.noreply.github.com>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-21 17:23:05 +08:00
sre-ci-robot
386ca2e7cf
[automated] Bump milvus version to v2.6.6 (#45764)
Bump milvus version to v2.6.6
Signed-off-by: sre-ci-robot <sre-ci-robot@users.noreply.github.com>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-21 16:41:04 +08:00
Zhen Ye
a0c269dfe7
fix: use 2.6.6 for milvus DDL upgrading (#45738)
issue: #43897

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-21 11:45:04 +08:00
Feilong Hou
0231a3edf8
test: enable all timestamptz case (#45128)
Issue: #44518

---------

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-11-21 11:03:06 +08:00
Bingyi Sun
275a5b9afc
enhance: optimize term expr performance (#45491)
issue: https://github.com/milvus-io/milvus/issues/45641

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-21 11:01:05 +08:00
qixuan
3202847092
test: add field case about dynamic and compaction (#45694)
related issue: #42126

Signed-off-by: qixuan <673771573@qq.com>
2025-11-21 10:07:05 +08:00
Zhen Ye
1cd0ef943e
fix: use latest timetick to expire cache (#45717)
issue: #45697

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-20 21:39:04 +08:00
zhenshan.cao
352a8d06ec
fix: Partial update panic with TIMESTAMPTZ (#45740)
issue: https://github.com/milvus-io/milvus/issues/45729

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-20 21:20:12 +08:00
junjiejiangjjj
d3164e8030
feat: add configurable batch factor and runtime check bypass for embedding functions (#45592)
https://github.com/milvus-io/milvus/issues/45544
- Add batch_factor configuration parameter (default: 5) to control
embedding provider batch sizes
- Add disable_func_runtime_check property to bypass function validation
during collection creation
- Add database interceptor support for AddCollectionFunction,
AlterCollectionFunction, and DropCollectionFunction requests

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-11-20 19:55:04 +08:00
liliu-z
bbf1a3118d
enhance: Fix CVE-2025-63811 (#45659)
Signed-off-by: Li Liu <li.liu@zilliz.com>
2025-11-20 17:19:44 +08:00
wei liu
3fbee154f6
enhance: Remove large segment ID arrays from QueryNode logs (#45719)
issue: #45718

Logging complete segment ID arrays caused excessive log volume (3-6 TB
for 200k segments). Remove arrays from logger fields and keep only
segment counts for observability.

Changes:
- Remove requestSegments/preparedSegments arrays from Load logger
- Remove segmentIDs from BM25 stats logs
- Remove entries structure from sync distribution log

This reduces log volume by 99.99% for large-scale operations.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-11-20 17:18:14 +08:00
Zhen Ye
3c90dddebf
fix: streamingnode should exit when initializing failure (#45731)
issue: #45721

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-20 17:12:38 +08:00
XuanYang-cn
b95bbaffae
test: Increase PyMilvus version to 2.7.0rc60 for master branch (#45600)
Automated daily bump from pymilvus master branch. Updates
tests/python_client/requirements.txt.

Signed-off-by: XuanYang-cn <xuan.yang@zilliz.com>
2025-11-20 16:53:08 +08:00
zhikunyao
aa0870d2ff
test: add e2e-v2 helm for amd (#45621)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-11-20 13:45:11 +08:00
zhuwenxing
e0df44481d
test: refactor checker to using milvus client (#45524)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-20 11:59:08 +08:00
congqixia
79926b412c
fix: protect tbb concurrent_map emplace to avoid race condition deadlock (#45681)
Related to #44974

The emplace() operation on tbb::concurrent_hash_map was not protected,
allowing other threads to erase entries between the emplace attempt and
the subsequent lookup.

Solution:
1. Add shared_lock protection around the emplace() operation to prevent
concurrent erasure during insertion
2. Instead of returning nullptr when the key is not found on retry,
recursively call Get(key) to retry the entire operation
3. Fix typo: "earsed" -> "erased"

This ensures that concurrent Get() operations are properly synchronized
and will eventually succeed even under high contention.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-20 11:57:06 +08:00
Zhen Ye
f6411abbd7
fix: panic when streaming coord shutdown but query coord still work (#45695)
issue: #44984

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-20 11:07:06 +08:00
Bingyi Sun
a3add6a391
fix: Fix json indices can not be loaded (#45620)
issue: https://github.com/milvus-io/milvus/issues/45575

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-20 10:41:06 +08:00
Zhen Ye
87f9a79a6a
fix: inconsistent proxy cache when multiple DDL is executing with DML (#45698)
issue: #45697

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-20 02:53:06 +08:00
Buqian Zheng
5b85f0e4dc
enhance: updated multiple places where the expr copies the input values in every loop (#45680)
issue: https://github.com/milvus-io/milvus/issues/45679

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-20 01:51:07 +08:00
Gao
8ee8c01bcf
enhance: prefetch vector chunks for sealed non-indexed segments (#45665)
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-11-19 18:39:07 +08:00
cai.zhang
03a244844e
fix: Set task init when worker doesn't have task (#45675)
issue: #45674

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-19 18:03:07 +08:00
XuanYang-cn
40fdf1e828
enhance: Enable to merge sort one segment (#45652)
Remove the log stack when setting isCompacting

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-11-19 15:21:05 +08:00
Zhen Ye
c8073eb90b
fix: panic when double close channel of ack broadcast (#45661)
issue: #45635

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-19 14:25:05 +08:00
zhenshan.cao
a3b8bcb198
fix: correct default value backfill during AddField (#45634)
issue: https://github.com/milvus-io/milvus/issues/44585

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-18 23:05:42 +08:00
aoiasd
947c8855f3
feat: support search bm25 with highlight (#44923)
relate: https://github.com/milvus-io/milvus/issues/42589

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-18 16:09:39 +08:00
sparknack
16acf8829b
enhance: expr: only prefetch chunks once (#45554)
issue: https://github.com/milvus-io/milvus/issues/43611

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-18 15:51:41 +08:00
wei liu
7708abd8fe
fix: Prevent deadlock in runComponent when Prepare fails (#45609)
issue: #45068
When component.Prepare() fails (e.g., net listener creation error), the
sign channel was never closed, causing runComponent to block
indefinitely at <-sign. This resulted in the entire process hanging
after logging the error message.

Changes:
- Move close(sign) to defer statement in runComponent goroutine
- Ensures sign channel is always closed regardless of success/failure
- Allows proper error propagation through future.Await() mechanism

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-11-18 15:41:40 +08:00
congqixia
b734de5398
fix: [skip e2e] move gRPC server start after service registration in zilliz client tests (#45645)
The tests were failing with "grpc: Server.RegisterService after
Server.Serve" because setupMockServer() was starting the gRPC server
before tests could register their services. gRPC requires all services
to be registered before Server.Serve() is called.

Changes:
- Remove s.Serve() from setupMockServer() helper function
- Add s.Serve() to each test after service registration
- Apply fix consistently to all 6 affected tests:
  * TestZillizClient_Embedding
  * TestZillizClient_Embedding_Error
  * TestZillizClient_Rerank
  * TestZillizClient_Rerank_Error
  * TestNewZilliClient_WithMockServer
  * TestZillizClient_Embedding_EmptyResponse

This follows the correct gRPC server lifecycle:
1. Create server
2. Register services
3. Start serving

Related to #44620
Case: "internal/util/function/models/zilliz TestZillizClient_Rerank"

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-18 15:31:39 +08:00
862103595
a0e2fe78f3
enhance: Add ST_IsValid operator implementation for gis (#45501)
issue: #43427

---------

Signed-off-by: xiejh <862103595@qq.com>
2025-11-18 15:09:40 +08:00
Zhen Ye
caed0fe470
fix: compact the assignment history of channel to decrease the size of assignment recovery info (#45606)
issue: #45210

If the underlying WAL fails to open, the streaming coord recovery info
stored under `streamingcoord-meta/pchannel` grows quickly until it reaches
the etcd size limit.
Compact the assignment history by serverID to reduce the size of
`streamingcoord-meta/pchannel`.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-18 12:47:39 +08:00
Bingyi Sun
1ba75eea62
enhance: skip test_milvus_client_search_json_path_index_default (#45604)
To prevent this issue from blocking other PRs, we are temporarily
disabling this test. A proper fix will be implemented before the 2.6.6
release.

issue: https://github.com/milvus-io/milvus/issues/45511

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-18 10:54:09 +08:00
congqixia
f8c972a102
fix: update EnableDynamicField and SchemaVersion during collection modification (#45615)
Related to #45614

This commit fixes a bug where certain collection attributes were not
properly updated during collection modification, causing metadata errors
after cluster restart and collection reload failures.

When altering a collection, the `EnableDynamicField` and `SchemaVersion`
attributes were not being persisted to the catalog. This caused
inconsistencies between the in-memory collection metadata and the
persisted state, leading to:
- Dynamic field validation failures after restart
- Collection loading errors
- Metadata state mismatches

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-18 10:05:39 +08:00
wei liu
7aed88113c
enhance: Deduplicate primary keys in upsert request batch (#45249)
issue: #44320

This change adds deduplication logic to handle duplicate primary keys
within a single upsert batch, keeping the last occurrence of each
primary key.

Key changes:
- Add DeduplicateFieldData function to remove duplicate PKs from field
data, supporting both Int64 and VarChar primary keys
- Refactor fillFieldPropertiesBySchema into two separate functions:
validateFieldDataColumns for validation and fillFieldPropertiesOnly for
property filling, improving code clarity and reusability
- Integrate deduplication logic in upsertTask.PreExecute to
automatically deduplicate data before processing
- Add comprehensive unit tests for deduplication with various PK types
(Int64, VarChar) and field types (scalar, vector)
- Add Python integration tests to verify end-to-end behavior
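
The keep-last rule can be sketched on a simplified row type. The `row` struct and `dedupKeepLast` are hypothetical; the real `DeduplicateFieldData` works on column-oriented field data and also handles VarChar keys.

```go
package main

import "fmt"

// row is a hypothetical, simplified upsert row with an Int64 primary key.
type row struct {
	PK    int64
	Value string
}

// dedupKeepLast drops every row whose primary key appears again later in the
// batch, preserving the relative order of the surviving rows.
func dedupKeepLast(rows []row) []row {
	last := make(map[int64]int, len(rows))
	for i, r := range rows {
		last[r.PK] = i // later occurrences overwrite earlier ones
	}
	out := make([]row, 0, len(last))
	for i, r := range rows {
		if last[r.PK] == i {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	batch := []row{{1, "a"}, {2, "b"}, {1, "c"}}
	fmt.Println(dedupKeepLast(batch))
}
```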

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-11-17 21:35:40 +08:00