239 Commits

Author SHA1 Message Date
junjiejiangjjj
f1a4526bac
enhance: refactor rrf and weighted rerank (#42154)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-06-10 18:08:35 +08:00
wei liu
c262f987db
test: Fix unstable integration test for partial result (#42511)
issue: #42510

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-05 10:16:31 +08:00
Zhen Ye
e479467582
fix: panic when upgrading from old arch (#42422)
issue: #42405

- add the deleted rows to the header when upserting.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-31 22:56:29 +08:00
Zhen Ye
4bad293655
enhance: make upgrading from 2.5.x less down time (#42082)
issue: #40532

- start timeticksync at rootcoord if the streaming service is not
available
- stop timeticksync if the streaming service is available
- open a read-only WAL if some nodes in the cluster have not been
upgraded to 2.6
- allow opening a read-write WAL after all nodes in the cluster have been
upgraded to 2.6

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:02:29 +08:00
wei liu
54619eaa2c
feat: Implement partial result support on node down (#42009)
issue: https://github.com/milvus-io/milvus/issues/41690
This commit implements partial search result functionality when query
nodes go down, improving system availability during node failures. The
changes include:

- Enhanced load balancing in proxy (lb_policy.go) to handle node
failures with retry support
- Added partial search result capability in querynode delegator and
distribution logic
- Implemented tests for various partial result scenarios when nodes go
down
- Added metrics to track partial search results in querynode_metrics.go
- Updated parameter configuration to support partial result required
data ratio
- Replaced old partial_search_test.go with more comprehensive
partial_result_on_node_down_test.go
- Updated proto definitions and improved retry logic

These changes improve query resilience by returning partial results to
users when some query nodes are unavailable, ensuring that queries don't
completely fail when a portion of data remains accessible.
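
The "partial result required data ratio" mentioned above can be pictured as a simple threshold check. The sketch below is only an illustration: the function name, its parameters, and the choice of counting rows are assumptions, not the actual Milvus implementation.

```go
package main

// canReturnPartial is a hypothetical helper: it reports whether a query may
// return a partial result, given how much data is still reachable.
// accessibleRows and totalRows are assumed inputs; the real system may
// measure accessible data differently (for example, per segment or channel).
func canReturnPartial(accessibleRows, totalRows int64, requiredDataRatio float64) bool {
	if totalRows <= 0 {
		// Nothing to search, so nothing is missing.
		return true
	}
	ratio := float64(accessibleRows) / float64(totalRows)
	return ratio >= requiredDataRatio
}
```

Under these assumptions, with a required ratio of 0.8, a query that can still reach 90 of 100 rows would be answered partially, while one that can reach only 50 of 100 would fail.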

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-05-28 00:12:28 +08:00
yihao.dai
142bd2fc05
enhance: Pooling for data tasks (#41256)
1. Add global scheduler for datacoord.
2. Define and implement new CreateTask, QueryTask, DropTask interfaces.
3. Refine Import, Compaction, Stats, Index task.

issue: https://github.com/milvus-io/milvus/issues/41123

Co-authored-by: Cai Zhang <cai.zhang@zilliz.com>
2025-05-20 21:06:24 +08:00
congqixia
b099926f24
fix: Read group by field in sparse format in it (#41943)
Related to #41942

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-20 16:02:23 +08:00
Ted Xu
ae32203d3a
fix: support group by with nullable grouping keys (#41797)
See #36264

In this PR:
- Enhanced error handling when parsing the grouping field.
- Fixed null handling in reduce tasks in proxy nodes.
- Updated tests to reflect changes in error handling and data processing
logic.

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-05-17 20:54:22 +08:00
yihao.dai
36e9e41627
fix: Fix no candidate segments error for small import (#41771)
When autoID is enabled, the preimport task estimates row distribution by
evenly dividing the total row count (numRows) across all vchannels:
`estimatedCount = numRows / vchannelNum`.
However, the actual import task hashes real auto-generated IDs to determine
the target vchannel. This mismatch can lead to inaccurate row distribution
estimation in corner cases such as:
- Importing 1 row into 2 vchannels:
  - Preimport: 1 / 2 = 0 → both v0 and v1 are estimated to have 0 rows
  - Import: real autoID (e.g., 457975852966809057) hashes to v1
    → actual result: v0 = 0, v1 = 1

To resolve this corner case, we now allocate at least one segment for each
vchannel when autoID is enabled, ensuring all vchannels are prepared to
receive data even if no rows are estimated for them.
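
The arithmetic behind this corner case can be sketched as follows. Only the formula `estimatedCount = numRows / vchannelNum` comes from the commit message; the helper names and the `rowsPerSegment` capacity are hypothetical and exist only for illustration.

```go
package main

// estimateRowsPerVChannel mirrors the preimport estimate quoted above:
// integer division of the total row count by the number of vchannels.
func estimateRowsPerVChannel(numRows, vchannelNum int64) int64 {
	return numRows / vchannelNum
}

// segmentsToAllocate is a hypothetical helper showing the described fix:
// when autoID is enabled, every vchannel gets at least one segment even if
// its estimated row count is zero. rowsPerSegment is an assumed capacity
// and must be positive.
func segmentsToAllocate(estimatedRows, rowsPerSegment int64, autoID bool) int64 {
	// Ceiling division: segments needed to hold the estimated rows.
	n := (estimatedRows + rowsPerSegment - 1) / rowsPerSegment
	if autoID && n < 1 {
		n = 1
	}
	return n
}
```

For 1 row and 2 vchannels the estimate is 0 per vchannel, so without the minimum no segment would be prepared for the vchannel that the real autoID hashes to.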

issue: https://github.com/milvus-io/milvus/issues/41759

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-14 15:30:21 +08:00
SimFG
91d40fa558
fix: Update logging context and upgrade dependencies (#41318)
- issue: #41291

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-23 10:52:38 +08:00
Xianhui Lin
f9febe3bae
enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006)
Merge RootCoord, DataCoord and QueryCoord into MixCoord.
Unify their sessions into a single session.
issue: https://github.com/milvus-io/milvus/issues/37764

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-11 16:36:30 +08:00
Ted Xu
1bcea2a775
fix: assigning the correct storage version in sync and index tasks (#41093)
See #39663 #40667

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-04-08 10:14:25 +08:00
sthuang
a85e36bad2
fix: create collection task check failed after restart (#40982)
Field and partition information is stored and fetched with different
prefixes in the metadata. In the CreateCollectionTask, the RootCoord
checks the existing collection information against the metadata. This
check fails if the order of the fields or partitions differs, leading to
an error after restarting Milvus. To resolve this, the check logic should
use a map so that the comparison does not depend on order.
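
An order-independent comparison of this kind might look like the sketch below. The `fieldInfo` type and `sameFields` function are hypothetical, simplified stand-ins, not the actual RootCoord code, and field names are assumed to be unique within a collection.

```go
package main

// fieldInfo is a hypothetical, simplified stand-in for field metadata.
type fieldInfo struct {
	Name     string
	DataType string
}

// sameFields reports whether a and b contain the same fields regardless of
// their order. Field names are assumed to be unique.
func sameFields(a, b []fieldInfo) bool {
	if len(a) != len(b) {
		return false
	}
	// Index one side by name so lookups do not depend on slice order.
	byName := make(map[string]fieldInfo, len(a))
	for _, f := range a {
		byName[f.Name] = f
	}
	for _, f := range b {
		got, ok := byName[f.Name]
		if !ok || got != f {
			return false
		}
		// Remove matched entries so a repeated name in b cannot match twice.
		delete(byName, f.Name)
	}
	return true
}
```

With a slice-by-slice comparison, the same two fields listed in a different order would be reported as a mismatch; the map lookup avoids that.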

related: https://github.com/milvus-io/milvus/issues/40955

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-04-05 06:58:22 +08:00
Zhen Ye
f18aa85083
enhance: vchannel fair balance policy for streaming (#40959)
issue: #40638 

- Add `ChannelID` for streaming replica in future.
- Remove the pchannel count fair balance policy for streaming.
- Add Score based vchannel fair balance policy for streaming.
- Add pchannel stats manager to collect the stats of pchannel for
balancer.
- Add configuration and metrics for new balance policy

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-04 10:12:22 +08:00
yihao.dai
b2a8694686
enhance: Merge IndexNode and DataNode (#40272)
Merge DataNode and IndexNode into DataNode.

issue: https://github.com/milvus-io/milvus/issues/39115

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-13 14:26:11 +08:00
Ted Xu
df4285c9ef
enhance: API integration with storage v2 in clustering-compactions (#40133)
See #39173

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-03-13 14:12:06 +08:00
wei liu
69b8b89369
enhance: Remove QueryCoord's scheduling of L0 segments (#39552)
issue: #39551
This PR removes QueryCoord's scheduling of L0 segments:
- only load L0 segments when watching a channel
- only release L0 segments when releasing a channel or syncing data distribution

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-02-26 21:38:00 +08:00
sthuang
90acc8a58f
enhance: upgrade go arrow version from 12.0.1 to 17.0.0 (#39916)
related: https://github.com/milvus-io/milvus/issues/39915

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-02-25 10:30:02 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
wei liu
946a344966
fix: [skip e2e] data race in load test (#39845)
Related to #39701

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-02-18 16:16:51 +08:00
SimFG
047254665d
feat: support to replicate import msg (#39171)
- issue: #39849

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-02-16 00:08:13 +08:00
wei liu
ff5c680c99
fix: load collection stucks if compaction/gc happens (#39701)
issue: #39680
If compaction/GC happens, load collection may get stuck due to
SegmentNotFound. We should trigger UpdateNextTarget to get a new data
view and execute the loading operation.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-02-11 15:48:50 +08:00
Zhen Ye
d3e32bb599
enhance: make pchannel level flusher (#39275)
issue: #38399

- Add a pchannel level checkpoint for flush processing
- Refactor the recovery of flushers of wal
- Make a shared WAL scanner first, then build multiple datasyncservice instances on it

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-10 16:32:45 +08:00
Zhen Ye
a9e0e0a852
enhance: broadcast with event-based notification (#39522)
issue: #38399

- a broadcast message can now carry multiple resource keys.
- implement event-based notification for broadcast messages.
- a broadcast message uses the broadcast ID as its unique identifier.
- a broadcasted message on a vchannel now keeps the list of broadcasted
vchannels.
- broadcasted messages and broadcast messages now share a common broadcast
header.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-07 11:14:43 +08:00
Zhen Ye
5669016af0
enhance: erase the rpc level when wal is located at same node (#38858)
issue: #38399

- Make the WAL scanner interface the same as the streaming scanner.
- Use the WAL directly if it is located on the current node.
- Otherwise, fall back to the old logic.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-05 22:25:10 +08:00
Cai Yudong
5730b69e56
feat: Enable more VECTOR_INT8 unittest (#39569)
Issue: #38666

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2025-01-24 17:03:07 +08:00
Zhen Ye
fd84ed817c
enhance: add broadcast operation for msgstream (#39040)
issue: #38399

- make the broadcast service available for msgstream by reusing the
streaming service architecture

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-14 15:14:59 +08:00
sthuang
5c5948cb70
fix: rbac custom group privilege level check (#39164)
related: https://github.com/milvus-io/milvus/issues/39086

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-01-13 11:24:59 +08:00
sthuang
3cd74037db
fix: restore rbac with empty meta panic (#39141)
related: https://github.com/milvus-io/milvus/issues/38985

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-01-13 10:02:57 +08:00
Zhen Ye
bb8d1ab3bf
enhance: make new go package to manage proto (#39114)
issue: #39095

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:49:01 +08:00
wei liu
25f0c82ceb
fix: Fix update loading collection's load config doesn't work (#38595)
issue: #38594

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-12-25 18:02:51 +08:00
sthuang
6bc799061e
fix: fix privilege group list and list collections (#38684)
related: #37031
* built-in privilege group privileges in listPrivilegeGroups() should be
the same as in milvus.yaml
* collections granted by a collection-level built-in privilege group
should be listed in showCollections()

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-12-25 18:00:51 +08:00
XuanYang-cn
ca7ec23198
enhance: Use partitionID when delete by partitionKey (#38231)
When deleting by partition_key, Milvus generates L0 segments globally.
During L0 compaction, those L0 segments touch all partitions
collection-wide. Due to the false-positive rate of segment bloom filters,
L0 compactions append false deltalogs to completely irrelevant
partitions, which causes partition deletion amplification.

This PR uses partition_key to set the targeted partitionID when producing
deleteMsgs into MsgStreams. This narrows the scope of L0 segments down to
the partition level and removes the collection-wide false-positive
influence.

However, due to the DeleteMsg structure, one deleteMsg can only be
labeled with one partition, so this enhancement does not apply if the
user deletes over 2 partition_keys in one deletion.

See also: #34665

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-12-20 11:18:46 +08:00
tinswzy
27229f7907
enhance: refine exists log print with ctx (#38080)
issue: #35917 
Refines existing log printing to use ctx

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-14 22:36:44 +08:00
wei liu
a118ca14a7
fix: Fix role be dropped when grant still exist. (#38342)
issue: #38325
The old implementation only checked grants in the default db before
dropping a role, so a role could be dropped while grants still existed.
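
The idea of the fix, checking every database rather than only the default one, can be sketched as below. The `grantsByDB` layout and the function name are hypothetical and do not reflect the actual RootCoord metadata API.

```go
package main

// roleHasGrants is a hypothetical helper: grantsByDB maps a database name
// to the names of roles that hold at least one grant in that database.
// It returns true if the role still has a grant in any database, in which
// case dropping the role should be refused.
func roleHasGrants(grantsByDB map[string][]string, role string) bool {
	for _, roles := range grantsByDB {
		for _, r := range roles {
			if r == role {
				return true
			}
		}
	}
	return false
}
```

A check limited to the `default` entry would miss a role whose only grants live in another database.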

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-12-11 11:24:42 +08:00
cai.zhang
73aa95f596
fix: Add version to the proxy cache to resolve concurrency issues (#38067)
issue: #36989

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-04 18:06:39 +08:00
shaoyue
1f66b9ebfb
feat: add config field to set internal tls sni (#38124)
/cc @xiaofan-luan @jaime0815 @nish112022

part of https://github.com/milvus-io/milvus/issues/36864

Signed-off-by: haorenfsa <haorenfsa@gmail.com>
2024-12-04 14:56:47 +08:00
sthuang
a5e0a56a8e
fix: move grant/revoke v2 params check from rootcoord to proxy (#38130)
related issue: https://github.com/milvus-io/milvus/issues/37031

fixed issue #38042: The interface "grant_v2" does not support empty
collectionName while the error says it does

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-12-02 19:48:37 +08:00
sthuang
23dc313c44
fix: fix grant/revoke v2 meta and unclear error messages (#38110)
related issue: https://github.com/milvus-io/milvus/issues/37031

fixed issues:
#37974: better error messages for grant v2 interface
#37903: fix meta built-in privilege group object name
#37843: better error messages for custom privilege group interface 
#38002: fix built-in privilege group meta to pass proxy interceptor
check
#38008: fix revoke v2 to support revoking v1 granted privileges

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-12-02 11:36:39 +08:00
sthuang
19572f5b06
enhance: RBAC new grant/revoke privilege (#37785)
issue: https://github.com/milvus-io/milvus/issues/37031
also fix issues: https://github.com/milvus-io/milvus/issues/37843,
https://github.com/milvus-io/milvus/issues/37842,
https://github.com/milvus-io/milvus/issues/37887

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-11-21 22:20:34 +08:00
nish112022
484c6b5c44
feat: Added code for Internal-tls (#36865)
issue : https://github.com/milvus-io/milvus/issues/36864

I have a few questions regarding my approach. I will consolidate them
here for feedback and review. Thanks

---------

Signed-off-by: Nischay Yadav <nischay.yadav@ibm.com>
Signed-off-by: Nischay <Nischay.Yadav@ibm.com>
2024-11-20 06:00:32 +08:00
sthuang
2d72ad33f2
enhance: RBAC built in privilege groups (#37720)
issue: #37031

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-11-18 20:38:39 +08:00
wei liu
a1b6be1253
fix: Delegator stuck at unserviceable status (#37694)
issue: #37679

PR #36549 introduced a logic error that updates the current target when
only some of the channels are ready.

This PR fixes the logic error and lets the dist handler keep pulling
distribution from the querynode until all delegators become serviceable.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-15 10:20:31 +08:00
jaime
1e8ea4a7e7
feat: add segment/channel/task/slow query render (#37561)
issue: #36621

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-11-12 17:44:29 +08:00
wei liu
266f8ef1f5
fix: Search may return less result after qn recover (#36549)
issue: #36293 #36242
After a QN recovers, the delegator may be loaded on a new node. Once all
segments have been loaded, the delegator becomes serviceable, but its
target version has not been synced yet. If a search/query arrives, the
delegator uses the wrong target version and filters down to an empty
segment list, which causes an empty search result.

This PR blocks the delegator's serviceable status until the target
version is synced.
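
The gating described above can be pictured with the following sketch. The `delegatorState` type, its fields, and the version comparison are assumptions made for illustration; the actual delegator code is structured differently.

```go
package main

// delegatorState is a hypothetical, simplified view of a shard delegator.
type delegatorState struct {
	segmentsLoaded      bool  // all segments of the channel are loaded
	syncedTargetVersion int64 // target version last synced to the delegator
}

// serviceable reports whether the delegator may serve search/query requests.
// Loading all segments is not enough: the target version must also match
// the current one (assumed to be supplied by the coordinator).
func (d delegatorState) serviceable(currentTargetVersion int64) bool {
	return d.segmentsLoaded && d.syncedTargetVersion == currentTargetVersion
}
```

In this sketch, a delegator that has loaded every segment but still holds a stale target version stays unserviceable, which avoids filtering with the wrong version.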

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-12 16:34:28 +08:00
Zhen Ye
b5b003551e
enhance: use localhost for it and ut (#37529)
issue: #37528

Signed-off-by: chyezh <chyezh@outlook.com>
2024-11-12 11:36:27 +08:00
XuanYang-cn
a45a288a25
fix: Separate L0 and Mix trigger interval (#37190)
See also: #37108

- Add MixCompactionTriggerInterval, default 60s
- Add L0CompactionTriggerInterval, default 10s
- Export Single related compaction configs
- Raise SingleCompactionDeltaLogMaxSize from 2MB to 16MB

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-12 10:56:37 +08:00
sthuang
ff00a12805
enhance: RBAC custom privilege group ut coverage (#37558)
issue: https://github.com/milvus-io/milvus/issues/37031

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-11-09 20:40:25 +08:00
sthuang
70605cf5b3
enhance: Support custom privilege group for RBAC (#37087)
issue: #37031

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-11-09 08:44:28 +08:00
cai.zhang
ae227e3934
enhance: Add integration test for stats task (#37506)
issue: #33744

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-11-08 15:48:26 +08:00