82 Commits

Author SHA1 Message Date
wei liu
1fae8f5ae3
enhance: Optimize FlushAll performance for multi-table scenarios (#43339)
Replace multiple per-table flush RPC calls with single FlushAll RPC to
improve performance in multi-table scenarios.
issue: #43338
- Implement server-side FlushAll request processing in
DataCoord/MixCoord
- Add flushAllTask to handle unified flush operations across all tables
- Replace proxy-side per-table flush iteration with single RPC call
- Support both streaming and non-streaming service execution paths
- Add comprehensive unit tests for new FlushAll implementation

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-30 15:37:37 +08:00
Zhen Ye
cd38d65417
fix: make savebinlogpath idompotent at binlog level (#43615)
issue: #43574

- update all binlog every time when calling udpate savebinlogpath.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-29 19:47:36 +08:00
Zhen Ye
e9ab73e93d
enhance: add schema version at recovery storage (#43500)
issue: #43072, #43289

- manage the schema version at recovery storage.
- update the schema when creating collection or alter schema.
- get schema at write buffer based on version.
- recover the schema when upgrading from 2.5.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-23 21:38:54 +08:00
yihao.dai
a839017e81
fix: Handle retry state in import task (#43474)
issue: https://github.com/milvus-io/milvus/issues/43473

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-22 14:52:53 +08:00
Zhen Ye
07fa2cbdd3
enhance: wal balance consider the wal status on streamingnode (#43265)
issue: #42995

- don't balance the wal if the producing-consuming lag is too long.
- don't balance if the rebalance is set as false.
- don't balance if the wal is balanced recently.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-18 11:10:51 +08:00
congqixia
5d90b65342
enhance: [StorageV2] Add storage version in Data/Query view resp (#43348)
Related to #39173

Add `storage_version` in data/query view segment info response

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-16 15:52:51 +08:00
wei liu
b2597c6329
enhance: apply load config changes after QueryCoord restart (#43108)
issue: #43107 
- Add checkLoadConfigChanges() to apply load config during startup
- Call config check in startQueryCoord() after restart
- Skip auto-updates for collections with user-specified replica numbers
- Add is_user_specified_replica_mode field to preserve user settings
- Add comprehensive unit tests with mockey

Ensures existing collections use latest cluster-level config after
restart.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-10 14:28:48 +08:00
cai.zhang
3ffd44f302
fix: Fix remaining issues with Datanode pooling and StorageV2 (#43147)
issue: #43146

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-10 14:26:48 +08:00
cai.zhang
6989e18599
enhance: Move sort stats task to sort compaction (#42562)
issue: #42560

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-08 20:22:47 +08:00
yihao.dai
9cbd194c6b
fix: Prevent import from generating small binlogs (#43132)
- Introduce dynamic buffer sizing to avoid generating small binlogs
during import
- Refactor import slot calculation based on CPU and memory constraints
- Implement dynamic pool sizing for sync manager and import tasks
according to CPU core count

issue: https://github.com/milvus-io/milvus/issues/43131

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-07 21:32:47 +08:00
Zhen Ye
46b6f1b9e2
fix: panic when logging a old message should be skipped (#43076)
issue: #43074

- fix: panic when logging a old message should be skipped, #43074 
- fix: make the ack of broadcaster idompotent, #43026
- fix: lost dropping collection when upgrading, #43092
- fix: panic when DropPartition happen after DropCollection, #43027,
#43078

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-04 16:04:44 +08:00
congqixia
7bc7b18ed5
fix: [AddField] Prevent concurrent load during UpdateSchema (#43043)
Related to #43028

This PR:
- Add mutex prevent concurrent load segment & schema change
- Add schema verison field in load meta
- Update schema in PutOrRef if schema verison is larger

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-02 17:38:44 +08:00
Spade A
e2c85eec81
fix: load stats index based on mmap config (#42788)
ref https://github.com/milvus-io/milvus/issues/42626

This PR makes text match index and json key stats index be loaded based
on mmap config.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-06-19 10:10:39 +08:00
cai.zhang
a9dcd4a380
enhance: ChunkManager is no longer created during datanode initialization (#42791)
issue: #41611

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-17 17:06:38 +08:00
Chun Han
001619aef9
feat: supporing load priority for loading (#42413)
related: #40781

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-06-17 15:22:38 +08:00
Bingyi Sun
1bf960b1a8
enhance: Check loaded segments before gc (#42639)
issue: https://github.com/milvus-io/milvus/issues/42412

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-13 17:44:38 +08:00
wei liu
e7c0a6ffbb
enhance: Refine QueryNode task parallelism based on CPU core count (#42166)
issue: #42165
Implement dynamic task execution capacity calculation based on QueryNode
CPU core count instead of static configuration for better resource
utilization.

Changes include:
- Add CpuCoreNum() method and WithCpuCoreNum() option to NodeInfo
- Implement GetTaskExecutionCap() for dynamic capacity calculation
- Add QueryNodeTaskParallelismFactor parameter for tuning
- Update proto definition to include cpu_core_num field
- Add unit tests for new functionality

This allows QueryCoord to automatically adjust task parallelism based on
actual hardware resources.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-11 13:20:35 +08:00
cai.zhang
5566a85bcc
enhance: Add proxy task queue metrics (#42156)
issue: #42155

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-04 11:26:32 +08:00
Zhen Ye
4bad293655
enhance: make upgrading from 2.5.x less down time (#42082)
issue: #40532

- start timeticksync at rootcoord if the streaming service is not
available
- stop timeticksync if the streaming service is available
- open a read-only wal if some nodes in cluster is not upgrading to 2.6
- allow to open read-write wal after all nodes in cluster is upgrading
to 2.6

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:02:29 +08:00
Zhen Ye
b94cee2413
fix: growing segment from old arch is not flushed after upgrading (#42164)
issue: #42162

- enhance: add read ahead buffer size issue #42129
- fix: rocksmq consumer's close operation may get stucked
- fix: growing segment from old arch is not flushed after upgrading

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:00:28 +08:00
aoiasd
3a74044149
fix: hybird search sub requset not set analyzer name (#41896)
relate: https://github.com/milvus-io/milvus/issues/41213

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-29 14:56:28 +08:00
aoiasd
2ae4d80120
enhance: support run analyzer by loaded collection field (#42113)
relate: https://github.com/milvus-io/milvus/issues/42094

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-29 10:54:30 +08:00
wei liu
54619eaa2c
feat: Implement partial result support on node down (#42009)
issue: https://github.com/milvus-io/milvus/issues/41690
This commit implements partial search result functionality when query
nodes go down, improving system availability during node failures. The
changes include:

- Enhanced load balancing in proxy (lb_policy.go) to handle node
failures with retry support
- Added partial search result capability in querynode delegator and
distribution logic
- Implemented tests for various partial result scenarios when nodes go
down
- Added metrics to track partial search results in querynode_metrics.go
- Updated parameter configuration to support partial result required
data ratio
- Replaced old partial_search_test.go with more comprehensive
partial_result_on_node_down_test.go
- Updated proto definitions and improved retry logic

These changes improve query resilience by returning partial results to
users when some query nodes are unavailable, ensuring that queries don't
completely fail when a portion of data remains accessible.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-05-28 00:12:28 +08:00
Xianhui Lin
6a0e182e13
enhance: support TTL expiration with queries returning no results (#42086)
support TTL expiration with queries returning no results
issue:https://github.com/milvus-io/milvus/issues/41959

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-27 18:28:27 +08:00
yihao.dai
83c9527e70
enhance: Use QuerySlot interface for tasks (#41989)
Use `QuerySlot` rpc instead of `QueryTask` for querying slot.

issue: https://github.com/milvus-io/milvus/issues/41123

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-23 10:30:28 +08:00
Zhen Ye
c9b0748ff9
enhance: add delete rows into delete msg header and more metric (#41952)
issue: #41544

- add delete rows into delete messsage header
- add more insert/delete metrics
- fix non-broadcast message has broadcast header

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-22 20:28:26 +08:00
wei liu
78010262f0
enhance: Optimize shard serviceable mechanism (#41937)
issue: https://github.com/milvus-io/milvus/issues/41690
- Merge leader view and channel management into ChannelDistManager,
allowing a channel to have multiple delegators.
- Improve shard leader switching to ensure a single replica only has one
shard leader per channel. The shard leader handles all resource loading
and query requests.
- Refine the serviceable mechanism: after QC completes loading, sync the
query view to the delegator. The delegator then determines its
serviceable status based on the query view.
- When a delegator encounters forwarding query or deletion failures,
mark the corresponding segment as offline and transition it to an
unserviceable state.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-05-22 11:38:24 +08:00
yihao.dai
142bd2fc05
enhance: Pooling for data tasks (#41256)
1. Add global scheduler for datacoord.
2. Define and implement new CreateTask, QueryTask, DropTask interfaces.
3. Refine Import, Compaction, Stats, Index task.

issue: https://github.com/milvus-io/milvus/issues/41123

Co-authored-by: Cai Zhang <cai.zhang@zilliz.com>
2025-05-20 21:06:24 +08:00
Zhen Ye
59dff668dc
enhance: schema change without manual flush (#41882)
issue: #39718

- remove the manual flush message from schema change operation
- add flush segment id handle into schema change processes

Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2025-05-19 10:14:22 +08:00
foxspy
1c794be119
enhance: Output index version information in the DescribeIndex interface (#41847)
issue: https://github.com/milvus-io/milvus/issues/41431

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-05-15 14:36:22 +08:00
Zhen Ye
0a465bb5b7
enhance: use recovery+shardmanager, remove segment assignment interceptor (#41824)
issue: #41544

- add lock interceptor into wal.
- use recovery and shardmanager to replace the original implementation
of segment assignment.
- remove redundant implementation and unittest.
- remove redundant proto definition.
- use 2 streamingnode in e2e.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-14 23:00:23 +08:00
Zhen Ye
e675da76e4
enhance: simplify the proto message, make segment assignment code more clean (#41671)
issue: #41544

- simplify the proto message for flush and create segment.
- simplify the msg handler for flowgraph.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-11 20:49:00 +08:00
Zhen Ye
d32b802752
fix: remove the max_row_num for segment meta (#41680)
issue: #41544

- remove the estimate logic, so the create segment operation will not
check the collection meta anymore

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-09 20:46:53 +08:00
zhagnlu
39e7ad33d7
enhance: add optimize for like expr (#41066)
#41065

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-05-08 14:28:52 +08:00
sthuang
6c377b6e86
feat: Storage v2 index and stats raw data (#41534)
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-04-30 08:48:54 +08:00
XuanYang-cn
b6f3fd0de1
enhance: Update to the latest CipherPlugin API (#41599)
See also: #41585

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-04-29 14:50:50 +08:00
Zhen Ye
dfbb02a5f7
enhance: make streaming message as a log field for easier coding (#41545)
issue: #41544

- implement message can be logged as a field by zap.
- fix too many slow log for woodpecker.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-28 14:38:42 +08:00
cai.zhang
640f526301
fix: Update current scalar index version to compatible tantivy different versions (#41141)
issue: #40823

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-04-27 20:44:39 +08:00
congqixia
b5443ddbd0
enhance: [AddField] Reopen loaded segments after AddField (#41529)
Related to #39718

This PR:
- Add reopen logic for growing & sealed segments
- Lazy reopen when schema version increases
- Add FinishLoad api for loading progress

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-26 08:48:39 +08:00
aoiasd
f52c2909c4
feat: support multi analyzer for bm25 function (#41351)
relate: https://github.com/milvus-io/milvus/issues/41213

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-23 18:22:38 +08:00
congqixia
b36c88f3c8
enhance: [AddField] Broadcast schema change via WAL (#41373)
Related to #39718

Add Broadcast logic for collection schema change and notifies:
- Streamnode - Delegator
- Streamnode - Flush component
- QueryNodes via grpc

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-22 16:28:37 +08:00
yihao.dai
dccfc69660
enhance: Get compaction params from request (#41125)
Make DataNode use compaction parameters from request instead of
configuration.

issue: https://github.com/milvus-io/milvus/issues/41123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-04-15 10:28:53 +08:00
Xianhui Lin
f9febe3bae
enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006)
Merge RootCoord, DataCoord And QueryCoord into MixCoord
Make Session into one
issue : https://github.com/milvus-io/milvus/issues/37764

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-11 16:36:30 +08:00
Xianhui Lin
144911aec6
fix: CreateStatsRequest change storage_version to 25 consistent with 2.5 (#41217)
fix: CreateStatsRequest change storage_version to 25 consistent with 2.5
relate-pr:https://github.com/milvus-io/milvus/pull/38039
issue: https://github.com/milvus-io/milvus/issues/36995

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-11 11:16:43 +08:00
Xianhui Lin
3bc24c264f
enhance: Add json key inverted index in stats for optimization (#38039)
Add json key inverted index in stats for optimization
https://github.com/milvus-io/milvus/issues/36995

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-04-10 15:20:28 +08:00
cai.zhang
8a77fb9cdc
enhance: Support slot for index task and stats task (#39084)
issue: #39101

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-04-08 20:46:25 +08:00
Ted Xu
1bcea2a775
fix: assigning the correct storage version in sync and index tasks (#41093)
See #39663 #40667

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-04-08 10:14:25 +08:00
cai.zhang
05e25431d9
enhance: Deprecate disk params about indexing (#41045)
issue: #40863

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-04-07 11:36:34 +08:00
smellthemoon
cb1e86e17c
enhance: support add field (#39800)
after the pr merged, we can support to insert, upsert, build index,
query, search in the added field.
can only do the above operates in added field after add field request
complete, which is a sync operate.

compact will be supported in the next pr.
#39718

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-04-02 14:24:31 +08:00
yihao.dai
5b78ef0a49
fix: Fix delete data loss due to duplicate binlogID (#40960)
With concurrenct L0 compaction
(https://github.com/milvus-io/milvus/pull/36816), delta logs might be
written to the same L1 segment, causing logID duplication when using the
incremental beginLogID. This PR removes the beginLogID mechanism and
instead passes a log ID range, where the number of IDs in the range
equals the number of compaction segment binlogs multiplied by an
expansion factor.

issue: https://github.com/milvus-io/milvus/issues/40207

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-04-01 10:36:22 +08:00