11029 Commits

Author SHA1 Message Date
Zhen Ye
3327df72e4
enhance: make immutable message as the param of ack operation for cdc (#43900)
issue: #43897

- The original broadcast ack operation need to recover message from
etcd, which can not support cdc.
- immutable message will set as the ack parameter to fix it.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-01 10:21:52 +08:00
XuanYang-cn
3160f41821
enhance: [cmek]Merge cipher.yml with hook.yml (#44118)
See also: #40321

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-08-29 18:37:51 +08:00
wei liu
16af4e230a
fix: Prevent panic in upsert due to missing nullable fields [Proxy] (#44070)
issue: #43980
Fixes a panic that occurred when a partial update was converted to an
insert due to a non-existent primary key. The panic was caused by
missing nullable fields that were not provided in the original partial
update request.

The upsert pre-execution logic is refactored to handle this correctly:
- Explicitly splits upsert data into 'insert' and 'update' batches.
- Automatically generates data for missing nullable or default-value
fields during inserts, preventing the panic.
- Enhances `typeutil.UpdateFieldData` to support different source and
destination indexes for flexible data merging.
- Adds comprehensive unit tests for mixed upsert, pure insert, and pure
update scenarios.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-08-29 18:33:51 +08:00
cai.zhang
c16296a53f
fix: Handle compaction retry state (#44119)
issue: #43776

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-08-29 13:31:51 +08:00
wei liu
d84c4c580a
enhance: [DataCoord] Remove full-collection index work from metrics (#43859)
issue: #43858

- Remove full-collection index handling in getCollectionMetrics
- Avoid heavy metadata scans and RPC calls during metrics
- Reduce latency and CPU/memory usage on large datasets
- No functional change to metrics semantics

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-08-29 12:05:50 +08:00
sparknack
70c8114e85
enhance: cachinglayer: resource management for segment loading (#43846)
issue: #41435

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-29 11:37:50 +08:00
Zhen Ye
7b04107863
fix: unrecoverable if lease expire when standby mode (#44112)
issue: #44111

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-29 10:47:51 +08:00
Zhen Ye
23085ae437
fix: use query node label check if streamingnode (#44099)
issue: #44014

- Because the session of querynode and streamingnode is different.
- So when streamingnode session down first, a streaming query node will
be treated as querynode.
- Use label but not streaming node session to fix it.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-29 10:45:59 +08:00
Buqian Zheng
6b22661c06
fix: use tbb::concurrent_unordered_map for ChunkedSegmentSealedImpl::fields_ (#44084)
issue: https://github.com/milvus-io/milvus/issues/44078

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-08-29 10:01:51 +08:00
cqy123456
844caf5cfe
enhance: estimate the size of interim index (#44104)
issue: #41435

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-08-28 19:37:51 +08:00
congqixia
ba88cfa7a9
enhance: Add unified GRPC latency metrics in inteceptor (#44089)
Related to #43966

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-28 09:53:51 +08:00
cai.zhang
eddf188452
fix: Using bucket name in storage config for bulk writer v2 (#44083)
issue: #44082

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-08-27 23:47:51 +08:00
congqixia
e3b3502287
fix: Use correct regex for cppcheck (#44077)
Related to #44076

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-27 20:57:50 +08:00
Buqian Zheng
6420d72391
enhance: print as storage size unit MB with 2 digits only, so the log is easier to read (#44085)
issue: https://github.com/milvus-io/milvus/issues/41435

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-08-27 19:47:50 +08:00
cai.zhang
7f470e6bd3
fix: Fix retry state with palyload is not nil (#44068)
issue: #43776

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-08-27 18:11:49 +08:00
marcelo-cjl
e13e19cd2c
enhance: add sparse_u32_f32 data type for sparse vertor (#43974)
issue: #43973

Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
2025-08-27 16:47:50 +08:00
Chun Han
da156981c6
feat: milvus support posix-compatible mode(milvus-io#43942) (#43944)
related: #43942

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-08-27 16:29:50 +08:00
XuanYang-cn
09b29a88aa
enhance: Remove not inused allocator (#43821)
See also: #44039

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-08-27 14:31:50 +08:00
congqixia
d3fa305785
enhance: Add grpc metadata header for client request time (#44059)
Related to #44058

This PR:
- Add common grpc metadata key for client request time
- Add gosdk & milvus inteceptor related logic for this attribute
- Bump go sdk version

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-27 14:27:49 +08:00
XuanYang-cn
37a447d166
feat: Add CMEK cipher plugin (#43722)
1. Enable Milvus to read cipher configs
2. Enable cipher plugin in binlog reader and writer
3. Add a testCipher for unittests
4. Support pooling for datanode
5. Add encryption in storagev2

See also: #40321 
Signed-off-by: yangxuan <xuan.yang@zilliz.com>

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-08-27 11:15:52 +08:00
groot
55b24b7a78
fix: Hide sensitive items for restful get configs (#44057)
issue:https://github.com/milvus-io/milvus/issues/44065

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-08-27 11:09:52 +08:00
aoiasd
208a345a3d
enhance: package analyzer code in Go and fix named analyzer as tokenizer (#43694)
relate: https://github.com/milvus-io/milvus/issues/43687

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-27 10:59:52 +08:00
Spade A
90a7e63665
enhance: collect doc_id from posting list directly for text match (#43899)
issue: https://github.com/milvus-io/milvus/issues/43898

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-27 10:39:52 +08:00
aoiasd
e205c30f7d
fix: boost panic if search return empty result (#44042)
relate: https://github.com/milvus-io/milvus/issues/44041
Skip rescore node if no valid offsets.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-27 05:01:52 +08:00
Spade A
8456f824be
feat: impl StructArray -- miscellaneous staffs for struct array (#43960)
Ref https://github.com/milvus-io/milvus/issues/42148

1. enable storage v2
2. implement some missing staffs
3. fix some bugs and add tests

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-26 21:35:53 +08:00
Zhen Ye
5bdc593b8a
enhance: use v0.15.1 official pulsar client and add logging for pulsar client (#43913)
issue: #43785

- pulsar client will print log into milvus logger now.
- pulsar client open the metric by default.
- upgrade the pulsar client to v0.15.1, and use offical repo.
- the fixing of milvus-io/pulsar-client-go is already covered by
official v0.15.1.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-26 16:45:53 +08:00
Tianx
c0d62268ac
feat: add timesatmptz data type (#44005)
issue: https://github.com/milvus-io/milvus/issues/27467
>
https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420
> * [x]  M1 Create collection with timestamptz field
> * [x]  M2 Insert timestamptz field data
> * [x]  M3 Retrieve timestamptz field data
> * [x]  M4 Implement handoff[ ]  

The second PR of issue:
https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4
described above.

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-08-26 15:59:53 +08:00
groot
ccb0db92e7
fix: Not allow to import null element of array field from parquet (#43964)
issue: https://github.com/milvus-io/milvus/issues/43819

Before this fix: null elements are converted to zero or empty strings
After this fix: import job will return error "array element is not
allowed to be null value for field xxx"

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-08-26 14:45:51 +08:00
Zhen Ye
575345ae7b
fix: get streamingnodes from service discovery without channel assign (#44033)
issue: #43767

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-26 14:29:51 +08:00
Gao
e97a618630
enhance: support readAt interface for remote input stream (#43997)
#42032 

Also, fix the cacheoptfield method to work in storagev2.
Also, change the sparse related interface for knowhere version bump
#43974 .
Also, includes https://github.com/milvus-io/milvus/pull/44046 for metric
lost.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-26 11:19:58 +08:00
zhagnlu
8934c18792
enhance: support cache result cache for expr (#43923)
issue: #43878

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-26 10:55:52 +08:00
junjiejiangjjj
f1ce84996d
enhance: refactor model service configuration and environment variables (#44036)
- Add enable configuration for all model service providers
- Migrate environment variables from MILVUSAI_* to MILVUS_* prefix with
backward compatibility
- Unify model service enable/disable logic using configuration
- Add tests for environment variable parsing with fallback support

#35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-08-26 10:49:52 +08:00
cqy123456
d987dd7103
enhance: Make build ratio of interim index configurable (#43939)
issue: https://github.com/milvus-io/milvus/issues/43993

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-08-25 14:43:51 +08:00
sparknack
4fae074d56
enhance: add write rate limit for disk file writer (#43912)
issue: #43040

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-25 10:27:47 +08:00
Zhen Ye
d0e3a33c37
enhance: add IsRebalanceSuspended interface for wal balancer (#44026)
issue: #43968

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-24 09:19:47 +08:00
Zhen Ye
cbb9392564
fix: filter the streaming node from resource group (#43984)
issue: #43981

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-22 19:21:47 +08:00
junjiejiangjjj
f3d7e47227
feat: Supports more rerankers (#43270)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjiejiangjjj <junjie.jiang@zilliz.com>
2025-08-22 17:29:47 +08:00
congqixia
d5ecf49319
enhance: Add request stats interceptor collecting req metrics (#43967)
Related to #43966 #43809

This PR:
- Replace distributed request metrics collection into one interceptor
- Add `Retry` and `Reject` label represents auth rejection and
retry-able error cases

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-22 13:09:47 +08:00
congqixia
606d4c24cd
enhance: Use function def determine field IsFunctionOutput only (#43979)
Related to #35853

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-22 04:49:46 +08:00
congqixia
847b79e197
enhance: Use RLock for ListPrivilegeGroups (#43998)
Related to #43901

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-21 23:45:46 +08:00
Zhen Ye
082ca62ec1
enhance: support balancer interface for streaming client to fetch streaming node information (#43969)
issue: #43968

- Add ListStreamingNode/GetWALDistribution to  fetch streaming node info
- Add SuspendRebalance/ResumeRebalance to enable or stop balance
- Add FreezeNodeIDs/DefreezeNodeIDs to freeze target node

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-21 15:55:47 +08:00
Spade A
8e1ce15146
fix: ngram index is mistakenly used for unsopported operations (#43955)
issue: https://github.com/milvus-io/milvus/issues/43917

1. fix ngrma index to be mistakenly used for unsopported operation
2. fix potential uaf problem

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-21 14:41:46 +08:00
Zhen Ye
f5cee0012a
fix: remove panic for message type in recovery storage and marshal log (#43976)
issue: #43897

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-21 14:23:47 +08:00
Tianx
26c5c779bf
feat: temporarily disable Timestamptz collection creation (#43935)
issue: https://github.com/milvus-io/milvus/issues/27467

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-08-21 11:17:46 +08:00
zhagnlu
d904c4e677
enhance: optimize compare expr performance for pk field (#43154)
#43153

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-21 10:59:46 +08:00
congqixia
7963b17ac1
fix: Revert "fix: Use folly::SharedMutex preventing starvation (#43937)" (#43959)
Related to #43958

This reverts commit 580350495ab40b3c0a2ec473882258edf6d7dd08.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-21 10:09:47 +08:00
wei liu
399f63300c
enhance: Implement dynamic interval updates for ticker components (#43865)
issue: #43858

Enable dynamic configuration updates for ticker intervals without
restart. This enhancement allows runtime configuration changes to take
effect immediately for better operational flexibility.

Changes include:
- Apply "drain+Reset only when interval changed" pattern across all
ticker components to preserve existing timing phases
- Fix goroutine variable capture issue in CheckerController.Start()
- Remove unnecessary ticker.Stop() in manual trigger paths
- Add dynamic interval checking in QueryCoordV2 components:
  * checkers/controller.go: Various checker intervals
  * dist/dist_handler.go: DistPullInterval, CheckExecutedFlagInterval
  * session/cluster.go: CheckNodeSessionInterval
  * server.go: CheckAutoBalanceConfigInterval
  * observers/target_observer.go: UpdateNextTargetInterval
  * observers/collection_observer.go: CollectionObserverInterval
- Add dynamic interval checking in QueryNodeV2 components:
  * segments/disk_usage_fetcher.go: DiskSizeFetchInterval
- Ensure thread safety by performing all ticker operations in same
goroutine with proper drain before Reset to avoid spurious triggers

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-08-21 10:07:47 +08:00
wei liu
384c493d0e
fix: Fix node status inconsistency after QueryCoord restart (#43941)
issue: #43933

Fix the issue where QueryCoord restart leads to node status
inconsistency in resource manager, causing segment loading failures and
incorrect resource group assignments.

Changes include:
- Add CheckNodesInResourceGroup method to sync node status after restart
- Implement proper cleanup of offline/stopping nodes from resource
groups
- Add automatic discovery and assignment of new nodes to resource groups
- Enhance rewatchNodes process to include resource manager
synchronization

This ensures resource manager maintains correct node status and
assignments even after QueryCoord restarts, preventing segment loading
failures and improving system reliability.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-08-20 14:13:46 +08:00
aoiasd
8d49ffcc8b
enhance: report field name when text match or pharse match failed because field not enable match (#43366)
relate: https://github.com/milvus-io/milvus/issues/41953

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-20 10:59:46 +08:00
Spade A
d6a428e880
feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726)
Ref https://github.com/milvus-io/milvus/issues/42148

This PR supports create index for vector array (now, only for
`DataType.FLOAT_VECTOR`) and search on it.
The index type supported in this PR is `EMB_LIST_HNSW` and the metric
type is `MAX_SIM` only.

The way to use it:
```python
milvus_client = MilvusClient("xxx:19530")
schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True)
...
struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field")
...
struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000)
...
schema.add_struct_array_field(struct_schema)
index_params = milvus_client.prepare_index_params()
index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128})
...
milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params)
```

Note: This PR uses `Lims` to convey offsets of the vector array to
knowhere where vectors of multiple vector arrays are concatenated and we
need offsets to specify which vectors belong to which vector array.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-08-20 10:27:46 +08:00