11226 Commits

Author SHA1 Message Date
Zhen Ye
80bb09f7c2
enhance: support rbac with WAL-based DDL framework (#44735)
issue: #43897

- RBAC(Roles/Users/Privileges/Privilege Groups) is implemented by
WAL-based DDL framework now.
- Support following message type in wal `AlterUser`, `DropUser`,
`AlterRole`, `DropRole`, `AlterUserRole`, `DropUserRole`,
`AlterPrivilege`, `DropPrivilege`, `AlterPrivilegeGroup`,
`DropPrivilegeGroup`, `RestoreRBAC`.
- RBAC can be synced by new CDC now.
- Refactor some UT for RBAC.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-16 16:02:01 +08:00
foxspy
b13b4dabbe
fix: fix the size of diskann thread cache (#44887)
issue: https://github.com/milvus-io/milvus/issues/44857

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-10-16 15:32:01 +08:00
1mmortal
e18e7d3b32
fix: Pingpong load balancing issue when segment has only 1 row(#44840) (#44841)
Use math.Ceil to calculate Priority uniformly
issue: https://github.com/milvus-io/milvus/issues/44840

Signed-off-by: 1mmortal <lmzzzzz1@163.com>
2025-10-16 11:18:00 +08:00
congqixia
684018ca7b
fix: ensure deterministic search result ordering when scores are equal (#44870)
Related to #44819
This fix addresses an issue(#44819) where the offset parameter did not
work correctly during searches when multiple results had identical
scores. The problem occurred because results with equal scores were not
consistently ordered, leading to unpredictable pagination behavior.

The solution adds a new sorting step (SortEqualScoresByPks) in the
reduce phase that sorts results with identical scores by their primary
keys in ascending order. This ensures deterministic ordering and enables
proper offset functionality.

Changes:
- Add SortEqualScoresByPks() to sort results with equal scores by PK
- Add SortEqualScoresOneNQ() to handle per-query sorting logic
- Invoke sorting step after FillPrimaryKey() in Reduce() workflow

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-16 10:04:00 +08:00
foxspy
b91878857e
fix: update aisaq param (#44861)
issue: #44365

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-10-15 19:18:00 +08:00
Bingyi Sun
26d06c6340
feat: load skip index using parquet statistics (#44252)
#44011

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-10-15 19:16:00 +08:00
wei liu
38833b0e1d
fix: Fix deactivate balance checker also stops stopping balance (#44834)
issue: #43858
Fix the issue introduced in PR #43992 where deactivating the balance
checker incorrectly stops stopping balance operations.

Changes:
- Move IsActive() check after stopping balance logic
- Only skip normal balance when checker is inactive
- Allow stopping balance to proceed regardless of checker state

This ensures stopping balance can execute even when the balance checker
is deactivated.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-10-15 15:50:04 +08:00
Zhen Ye
8bf7d6ae72
enhance: refactor update replicate config operation using wal-broadcast-based DDL/DCL framework (#44560)
issue: #43897

- UpdateReplicateConfig operation will broadcast AlterReplicateConfig
message into all pchannels with cluster-exclusive-lock.
- Begin txn message will use commit message timetick now (to avoid
timetick rollback when CDC with txn message).
- If current cluster is secondary, the UpdateReplicateConfig will wait
until the replicate configuration is consistent with the config
replicated from primary.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-15 15:26:01 +08:00
cqy123456
822588302a
enhance: embedding_list support mmap in MemVectorIndex (#44764)
issue: https://github.com/milvus-io/milvus/issues/44702

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-10-15 15:22:00 +08:00
wei liu
529c98520c
enhance: Add nullable support for Geometry and Timestamptz types (#44846)
issue: #44800
This commit enhances the upsert and validation logic to properly handle
nullable Geometry (WKT/WKB) and Timestamptz data types:

- Add ToCompressedFormatNullable support for TimestamptzData,
GeometryWktData, and GeometryData to filter out null values during data
compression
- Implement GenNullableFieldData for Timestamptz and Geometry types to
generate nullable field data structures
- Update FillWithNullValue to handle both GeometryData and
GeometryWktData with null value filling logic
- Add UpdateFieldData support for Timestamptz, GeometryData, and
GeometryWktData field updates
- Comprehensive unit tests covering all new data type handling scenarios

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-10-15 14:04:00 +08:00
sparknack
3d3fa44745
fix: milvus-common update (#44798)
issue: #44501

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-10-15 11:24:00 +08:00
Spade A
c4f3f0ce4c
feat: impl StructArray -- support more types of vector in STRUCT (#44736)
ref: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-10-15 10:25:59 +08:00
yihao.dai
5ad8a29c0b
enhance: Speed up CDC scheduling (#44564)
Make CDC watch etcd replicate pchannel meta instead of listing them
periodically.

issue: https://github.com/milvus-io/milvus/issues/44123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-10-15 10:15:59 +08:00
XuanYang-cn
fc46668812
fix: Disk encryption config missing (#44820)
See also: #44823

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-10-14 17:22:00 +08:00
Spade A
b8df1c0cc5
enhance: improve observability in trace for segcore scalar expression (#44260)
Ref https://github.com/milvus-io/milvus/issues/44259

This PR connects the trace between go and segcore, and add full traces
for scalar expression calling chain:
<img width="2418" height="960" alt="image"
src="https://github.com/user-attachments/assets/8cad69d7-bcb7-4002-a4e3-679a3641e229"
/>
<img width="2452" height="850" alt="image"
src="https://github.com/user-attachments/assets/8b44aed0-0f03-48a7-baa0-b022fee994ce"
/>
<img width="2403" height="707" alt="image"
src="https://github.com/user-attachments/assets/cd6f0601-0d5c-4087-8ed8-2385f1bc740b"
/>

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-10-14 17:15:59 +08:00
congqixia
92910117a8
fix: binlog count use correct value (#44830)
Related to #44789

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-14 16:48:05 +08:00
cqy123456
4b84ba2189
fix:remove the limit of deduplicate case when disable autoindex (#44825)
issue: https://github.com/milvus-io/milvus/issues/44702

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-10-14 16:14:00 +08:00
yihao.dai
cebe923d4a
enhance: Make GetReplicateInfo API work at the pchannel level (#44809)
issue: https://github.com/milvus-io/milvus/issues/44123

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-10-14 15:12:00 +08:00
Chun Han
2ea8d85c2f
feat: restful support run analyzer(#44803) (#44805)
related: #44803

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-10-14 14:41:59 +08:00
Bingyi Sun
6cb1f7d7c6
enhance: optimize the performace of bitmap reverse lookup (#44804)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-10-14 11:57:58 +08:00
zhagnlu
2f178f810f
fix:fix json_contains(path, int) bug (#44814)
#44816

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-10-14 00:19:59 +08:00
sparknack
df6a4dc1a0
fix: cachinglayer: avoid eviction during json handling (#44812)
issue: #44797

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-10-13 22:07:58 +08:00
congqixia
33060d9cf2
fix: avoid concurrent Reset/Add operations on DataCoord metrics (#44789)
This commit addresses issue #44788 where the
`datacoord_stored_binlog_size` metric could become inaccurate when
multiple concurrent `GetMetrics` calls arrived at DataCoord.

### Problem

The original implementation called `Reset()` followed by `Add()`
operations on Prometheus metrics within the `GetQuotaInfo()` method.
When multiple goroutines invoked this method concurrently, race
conditions occurred:
- Thread 1: Reset() → Add(value1)
- Thread 2: Reset() → Add(value2)
- Result: Metrics could be reset multiple times and values added in an
interleaved fashion, leading to inaccurate and inflated metric values

### Solution

Changed the approach from `Reset() + Add()` to aggregating metric values
in local maps first, then using `Set()` to update metrics atomically:

1. Collect segment size data into local maps:
   - `storedBinlogSize`: tracks size per collection per segment state
   - `binlogFileSize`: tracks total file count per collection
   - `coll2DbName`: maps collection IDs to database names

2. After aggregation is complete, use `Set()` (instead of `Add()`) to
update metrics in a single operation per label combination

This ensures that concurrent `GetMetrics` calls don't interfere with
each other, as each invocation works with its own local state and only
updates the final metric value atomically.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-13 18:39:59 +08:00
aoiasd
1b17e16fc7
fix: expr filter return wrong result when skipped (#44778)
relate: https://github.com/milvus-io/milvus/issues/44777
Should return res with false if skipped. But now return vaild[0], it
almost be true.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-10-13 18:33:59 +08:00
zhagnlu
3dd5deb70a
fix:disable using shredding for json_path contains digital (#44724)
#44132

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-10-13 17:25:59 +08:00
Zhen Ye
53e8f150e8
fix: check if qn is sqn with label and streamingnode list (#44792)
issue: #44014

- On standalone, the query node inside need to load segment and watch
channel, so the querynode is not a embeded querynode in streamingnode
without `LabelStreamingNodeEmbeddedQueryNode`. The channel dist manager
can not confirm a standalone node is a embededStreamingNode.

Bug is introduced by #44099

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-13 16:33:59 +08:00
sparknack
c8a4d6e2ef
enhance: add cachinglayer management for TextMatchIndex (#44741)
issue: #41435, #44502

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-10-13 14:37:58 +08:00
sparknack
6d5b41644b
enhance: remove logical usage checks during segment loading (#44743)
issue: #41435

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-10-13 14:21:58 +08:00
congqixia
f5f053f1d2
enhance: Refactor privilege management by extracting privilege cache into separate package (#44762)
Related to #44761

This commit refactors the privilege management system in the proxy
component by:

1. **Separation of Concerns**: Extracts privilege-related functionality
from MetaCache into a dedicated `internal/proxy/privilege` package,
improving code organization and maintainability.

2. **New Package Structure**: Creates `internal/proxy/privilege/` with:
   - `cache.go`: Core privilege cache implementation (PrivilegeCache)
   - `result_cache.go`: Privilege enforcement result caching
   - `model.go`: Casbin model and policy enforcement functions
   - `meta_cache_adapter.go`: Casbin adapter for MetaCache integration
   - Corresponding test files and mock implementations

3. **MetaCache Simplification**: Removes privilege and credential
management methods from MetaCache interface and implementation:
   - Removed: GetCredentialInfo, RemoveCredential, UpdateCredential
- Removed: GetPrivilegeInfo, GetUserRole, RefreshPolicyInfo,
InitPolicyInfo
   - Deleted: meta_cache_adapter.go, privilege_cache.go and their tests

4. **Updated References**: Updates all callsites to use the new
privilegeCache global:
- Authentication interceptor now uses privilegeCache for password
verification
- Credential cache operations (InvalidateCredentialCache,
UpdateCredentialCache, UpdateCredential) now use privilegeCache
- Policy refresh operations (RefreshPolicyInfoCache) now use
privilegeCache
- Privilege interceptor uses new privilege.GetEnforcer() and privilege
result cache

5. **Improved API**: Renames cache functions for clarity:
   - GetPrivilegeCache → GetResultCache
   - SetPrivilegeCache → SetResultCache
   - CleanPrivilegeCache → CleanResultCache

This refactoring makes the codebase more modular, separates privilege
management concerns from general metadata caching, and provides a
clearer API for privilege enforcement operations.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-13 11:15:58 +08:00
Zhen Ye
369c6eb206
enhance: support remove cluster from replicate topology (#44642)
issue: #44558, #44123
- Update config(A->C) to A and C, config(B) to B on replicate topology
(A->B,A->C) can remove the B from replicate topology
- Fix some metric error of CDC

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-13 11:07:58 +08:00
congqixia
7b8ecdaad5
enhance: Add accesslog field for template value length info (#44723)
Related to #36672

Add accesslog field displaying value length for search/query request may
help developers debug related issues

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-11 18:23:57 +08:00
XuanYang-cn
a3bdabb328
enhance: Unify compaction executor task state management (#44721)
Remove stopTask.
Replace multiple task tracking maps with single unified taskState map.
Fix slot tracking, improve state transitions, and add comprehensive test

See also: #44714

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-10-11 17:53:57 +08:00
aoiasd
09865a5da5
fix: BM25 with boost return result not ordered. (#44744)
relate: https://github.com/milvus-io/milvus/issues/44758
Wrong code which should be `(result.seg_offsets_[i] >= 0 &&
result.seg_offsets_[j] < 0)`, but was `(result.seg_offsets_[j] >= 0 &&
result.seg_offsets_[j] < 0) ` now.
But because all placeholder which was offset -1, will fill with worst
distance value.
For IP, L2 or COSIN, it will be +inf or -inf. So sort distance was
enough.
But when use BM25, it will be NAN. Will case sort out of ordered.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-10-11 17:17:58 +08:00
cai.zhang
d8beafd6d0
fix: Skip running scalar index when segment was compacted (#44690)
issue: #44689

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-10-11 12:05:56 +08:00
cai.zhang
7a93cfe890
fix: Fix bug for nullable geometry (#44732)
issue: #44648

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-10-11 11:27:57 +08:00
congqixia
5ece760d73
fix: Pass fs via FileManagerContext when loading index (#44733)
Related to #44615

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-11 09:55:57 +08:00
wei liu
33d1e7de83
fix: Replace incorrect log import with milvus v2 log package (#44731)
issue: #44730
Fix the issue where logs were not outputting as expected due to
incorrect log package imports across multiple components.

Changes include:
- Add golangci-lint rule to forbid github.com/pingcap/log usage
- Replace github.com/pingcap/log with
github.com/milvus-io/milvus/pkg/v2/log

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-10-10 20:27:57 +08:00
sparknack
7e750190b6
enhance: add a size getter for tantivy inverted index (#44609)
issue: #41435

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-10-10 17:43:57 +08:00
Spade A
208481a070
feat: impl StructArray -- support same names in different STRUCT (#44557)
ref: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-10-10 15:53:56 +08:00
groot
81f0d498be
enhance: Support JSONL/NDJSON files for bulkinsert (#44602)
issue: https://github.com/milvus-io/milvus/issues/44567

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-10-10 10:43:57 +08:00
zhenshan.cao
4279f166c6
enhance: Add refine logs for task scheduler in QueryCoord (#44577)
issue: https://github.com/milvus-io/milvus/issues/43968

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-10-10 10:07:55 +08:00
congqixia
e83c7e0c92
fix: Use eventually & fix task id appear in both executing&completed (#44698)
Related to #44620

This PR:
- Use eventually instead of `time.Sleep` in accesslog writer unit test
- Make sure compaction task results have only one state from executor
API

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-10 10:05:56 +08:00
congqixia
efbe6669ec
enhance: Make accesslog.$consistency_level represent actual value used (#44706)
Related to #44703

This PR:
- Add `SetActualConsistencyLevel` to `info.AccessInfo` interface and
related util method processing it
- Make `$consistency_level` returning actual value if set

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-09 21:03:58 +08:00
Zhen Ye
97c987f313
fix: remove wrong start timetick to avoid filtering DML whose timetick is less than it. (#44691)
issue: #41611

- introduced by #44532

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-09 20:39:58 +08:00
yihao.dai
4ae44f1aa9
fix: Fix CDC OOM due to high buffer size (#44607)
Fix CDC OOM by:
1. free msg buffer manually.
2. limit max msg buffer size.
3. reduce scanner msg hander buffer size.

issue: https://github.com/milvus-io/milvus/issues/44123

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-10-09 19:31:58 +08:00
liliu-z
174e1e4071
enhance: Upgrade Knohwere Version to fix GPU Compilation issue (#44704)
Signed-off-by: Li Liu <li.liu@zilliz.com>
2025-10-09 19:17:57 +08:00
congqixia
8a443c699e
fix: Make aws credential provider singleton (#44687)
Related to #44647

This patch make milvus-storage using singleton credential provider in
case of data race when concurrent index build task recieved.

See also milvus-io/milvus-storage#44647

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-09 16:11:58 +08:00
Bingyi Sun
c25166a202
fix: Fix bulk import with autoid (#44604)
issue: #44424

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-10-09 12:09:56 +08:00
Zhen Ye
30091a3bb7
enhance: remove redundant channel manager from datacoord (#44532)
issue: #41611

- After enabling streaming arch, channel manager of data coord is a
redundant component.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-09 11:01:57 +08:00
congqixia
1d85b83215
enhance: [backlog] Fix unittest and remove fs fallback logic (#44615)
Related to #44535

This PR:
- Fix the unittest creating `DiskFileManagerImpl` without `filesystem`
- Add comments for methods need `fs_`
- Remove fallback creation and add assertion for nullptr fs

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-09 10:41:57 +08:00