22875 Commits

Author SHA1 Message Date
cai.zhang
d8a3236e44
fix: Reorder worker proto fields to ensure compatibility (#43735)
issue: #43734

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-08-05 14:59:38 +08:00
ThreadDao
4b14e94206
test: update partial load case due to server behavior changing (#43717)
issue: #33419
pr: #41476

Signed-off-by: ThreadDao <yufen.zong@zilliz.com>
2025-08-05 11:23:37 +08:00
yanliang567
d45274512c
test: Refactor search tests and remove useless common functions (#43608)
related issue: #40698

---------

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-08-05 11:15:39 +08:00
sparknack
544c7c0600
enhance: update cachinglayer default cache ratio to 0.3 (#43723)
issue: #41435

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-05 01:35:39 +08:00
yihao.dai
cb7be8885d
enhance: Deep copy arrow array (#43724)
Deep copy arrow array and make a new RecordBatch with the copied array.

issue: https://github.com/milvus-io/milvus/issues/43310

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-08-05 00:31:38 +08:00
zhagnlu
f14c7d598c
fix: skip load raw data when loading index for storagev2 (#43720)
#43653

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-04 21:17:39 +08:00
zhikunyao
6d5fb73fa4
test: add rust to source code mergify rule (#43727)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-08-04 18:43:38 +08:00
congqixia
c1638afd3f
enhance: [StorageV2] Update enablev2 default param value (#43713)
Related to #43652

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-04 16:59:38 +08:00
sthuang
e66a2cb4dd
fix: skip binlog v2 milvus tools build (#43701)
related: #43648, #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-08-04 16:03:38 +08:00
zhuwenxing
174804e61a
test: add more testcase for analyzer (#43367)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-08-04 11:23:38 +08:00
Chun Han
d826d6ac91
fix: try to get span raw data for variable length data type(#43544) (#43705)
related: #43544

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-08-04 11:15:38 +08:00
aoiasd
4f02b06abc
enhance: support set lindera dict build dir and download url in yaml (#43541)
relate: https://github.com/milvus-io/milvus/issues/43120

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-04 09:47:38 +08:00
zhuwenxing
e305a3fa35
test: add hybrid search offset testcase in restful api (#43646)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-08-03 18:43:37 +08:00
congqixia
4aff581007
enhance: Pass callback in search batch pks to avoid large result (#43695)
Related to #43660

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-02 17:57:37 +08:00
tinswzy
75666153e3
enhance: add internal writer without session lock (#43675)
#43638 
- add internal writer without session lock
- modify lastReadState pb type

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-08-02 09:25:37 +08:00
Buqian Zheng
01baf582d5
fix: GroupChunkTranslator to correctly identify vector field (#43706)
issue: #43653

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-08-02 00:49:37 +08:00
Bingyi Sun
b59bc5e2c0
fix: make json path index non exists offsets compatible with 2.5 (#43691)
issue: https://github.com/milvus-io/milvus/issues/43666

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-08-01 23:22:23 +08:00
sparknack
bdd65871ea
enhance: tiered storage: estimate segment loading resource usage while considering eviction (#43323)
issue: #41435 

After introducing the caching layer's lazy loading and eviction
mechanisms, most parts of a segment won't be loaded into memory or disk
immediately, even if the segment is marked as LOADED. This means
physical resource usage may be very low. However, we still need to
reserve enough resources for the segments marked as LOADED. Thus, the
logic of resource usage estimation during segment loading, which is
currently based on physical resource usage only, should be changed.

To address this issue, we introduced the concept of logical resource
usage in this patch. This can be thought of as the base reserved
resource for each LOADED segment.

A segment’s logical resource usage is derived from its final evictable
and inevictable resource usage and calculated as follows:

```
SLR = SFPIER + evitable_cache_ratio * SFPER
```

This is equivalent to:

```
SLR = (SFPIER + SFPER) - (1.0 - evitable_cache_ratio) * SFPER
```

`SLR`: The logical resource usage of a segment.
`SFPIER`: The final physical inevictable resource usage of a segment.
`SFPER`: The final physical evictable resource usage of a segment.
`evitable_cache_ratio`: The ratio of a segment's evictable resources
that can be cached locally. The higher the ratio, the more physical
memory is reserved for evictable memory.
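
The two expressions above can be checked numerically. The following is a minimal sketch, not the code from this patch: the function names and the sample values (2 and 8 units, ratio 0.3) are illustrative assumptions, and `cache_ratio` stands in for `evitable_cache_ratio`.

```python
def segment_logical_usage(sfpier: float, sfper: float, cache_ratio: float) -> float:
    """SLR = SFPIER + cache_ratio * SFPER (first form in the text)."""
    if not 0.0 <= cache_ratio <= 1.0:
        raise ValueError("cache_ratio must be within [0, 1]")
    return sfpier + cache_ratio * sfper


def segment_logical_usage_alt(sfpier: float, sfper: float, cache_ratio: float) -> float:
    """SLR = (SFPIER + SFPER) - (1 - cache_ratio) * SFPER (second form)."""
    if not 0.0 <= cache_ratio <= 1.0:
        raise ValueError("cache_ratio must be within [0, 1]")
    return (sfpier + sfper) - (1.0 - cache_ratio) * sfper


# Hypothetical segment: 2 units inevictable, 8 units evictable, ratio 0.3.
slr = segment_logical_usage(2.0, 8.0, 0.3)
slr_alt = segment_logical_usage_alt(2.0, 8.0, 0.3)
# Both forms agree up to floating-point rounding.
assert abs(slr - slr_alt) < 1e-9
```

With a ratio of 1.0 the logical usage equals the full final physical usage; with 0.0 only the inevictable part is reserved.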

When loading a segment, two types of resource usage are taken into
account.

First is the estimated maximum physical resource usage:

```
PPR = HPR + CPR + SMPR - SFPER
```

`PPR`: The predicted physical resource usage after the current segment
is allowed to load.
`HPR`: The physical resource usage obtained from hardware information.  
`CPR`: The total physical resource usage of segments that have been
committed but not yet loaded. When one new segment is allowed to load,
`CPR' = CPR + (SMPR - SFPER)`. When one of the committed segments is
loaded, `CPR' = CPR - (SMPR - SFPER)`.
`SMPR`: The maximum physical resource usage of the current segment.
`SFPER`: The final physical evictable resource usage of the current
segment.

Second is the estimated logical resource usage. This check only applies
when eviction is enabled:

```
PLR = LLR + CLR + SLR
```

`PLR`: The predicted logical resource usage after the current segment is
allowed to load.
`LLR`: The total logical resource usage of all loaded segments. When a
new segment is loaded, `LLR` should be updated to `LLR' = LLR + SLR`.
`CLR`: The total logical resource usage of segments that have been
committed but not yet loaded. When one new segment is allowed to load,
`CLR' = CLR + SLR`. When one of the committed segments is loaded,
`CLR' = CLR - SLR`.
`SLR`: The logical resource usage of the current segment.

The segment is allowed to load only when `PPR < PRL && PLR < PRL`, where
`PRL` is the physical resource limit of the querynode.
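
The admission rule described above can be sketched as follows. This is an illustrative model only, assuming a single resource dimension; the `NodeState`, `SegmentEstimate`, and `can_load` names are hypothetical and do not come from the patch.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    hpr: float  # physical usage obtained from hardware information
    cpr: float  # physical usage of committed-but-not-loaded segments
    llr: float  # logical usage of all loaded segments
    clr: float  # logical usage of committed-but-not-loaded segments
    prl: float  # physical resource limit of the querynode


@dataclass
class SegmentEstimate:
    smpr: float   # maximum physical usage of the segment
    sfper: float  # final physical evictable usage of the segment
    slr: float    # logical usage of the segment


def can_load(node: NodeState, seg: SegmentEstimate, eviction_enabled: bool = True) -> bool:
    # Physical check: PPR = HPR + CPR + SMPR - SFPER must stay below PRL.
    ppr = node.hpr + node.cpr + seg.smpr - seg.sfper
    if ppr >= node.prl:
        return False
    # Logical check: PLR = LLR + CLR + SLR, only applied when eviction is enabled.
    if eviction_enabled:
        plr = node.llr + node.clr + seg.slr
        if plr >= node.prl:
            return False
    return True


# Hypothetical numbers: the segment fits both physically and logically.
node = NodeState(hpr=10.0, cpr=2.0, llr=6.0, clr=1.0, prl=32.0)
seg = SegmentEstimate(smpr=12.0, sfper=8.0, slr=4.4)
allowed = can_load(node, seg)
```

Keeping the two checks separate makes it visible which limit rejected a segment, which may help when tuning the cache ratio.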

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-01 21:31:37 +08:00
Buqian Zheng
b0226ef47c
fix: added more comprehensive container limit detection (#43693)
issue: #41435

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-08-01 20:37:37 +08:00
wei liu
ecc2ac0426
fix: apply load config changes failed after restart (#43554)
issue: #43107

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-08-01 20:13:37 +08:00
yihao.dai
50f621abf2
fix: Fix compaction failed due to ID exhausted (#43699)
Change default `compaction.preAllocateIDExpansionFactor` to 10000.

issue: https://github.com/milvus-io/milvus/issues/43673

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-08-01 19:17:37 +08:00
Xianhui Lin
0f0edff7f0
fix: increment offset for null data rows in JsonKeyStats (#43679)
fix: increment offset for null data rows in JsonKeyStatsInvertedIndex
issue: https://github.com/milvus-io/milvus/issues/43151

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-08-01 15:53:37 +08:00
Buqian Zheng
21cec95fe8
fix: fix disk path sent to cachinglayer (#43685)
`localDataRootPath` is used to init the local chunk manager and has
`querynode` appended to it, so it is not the correct path to send

#41435

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-08-01 13:19:36 +08:00
zhagnlu
2594250906
fix: fix miss loading index for storagev2 (#43674)
#43653

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-01 13:07:36 +08:00
congqixia
5f2f4eb3d6
enhance: Ignore entry with same ts when DeleteRecord search pks (#43669)
Related to #43660

This patch drops the unwanted offset&ts entries that have the same
timestamp as the delete record. Under a large number of upserts, these
false hits could greatly increase memory usage while applying deletes.

The next step could be passing a callback to `search_pk_func_` to handle
hit entries in a streaming fashion.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-01 10:15:36 +08:00
Ted Xu
e37cd19da2
enhance: enable storage v2 by default (#43652)
Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-08-01 08:59:36 +08:00
zhagnlu
239f743a18
fix: add enable_mmap key to load config (#43672)
#43670

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-31 21:35:37 +08:00
sthuang
df02014b3b
enhance: [rbac] privilege groups add import and add field privileges (#43664)
related: https://github.com/milvus-io/milvus/issues/29367,
https://github.com/milvus-io/milvus/pull/42687

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-31 20:47:36 +08:00
sparknack
4aabe23a45
enhance: update flat_hash_map.hpp to a modified version (#43506)
issue: #41435

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-07-31 20:09:36 +08:00
Chun Han
d72c0357ff
fix: empty hybridsearch result due to one-sub-search empty(#43537) (#43647)
related: #43537

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-07-31 19:47:37 +08:00
congqixia
f29964bd17
fix: Add padding for sorted index preventing 0 length mmap (#43663)
Related to #43655

This patch adds padding when writing the mmap file for ScalarSortedIndex
to prevent mmap failure caused by a zero mmap length.
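
The general technique can be illustrated outside of Milvus. The sketch below is not the patch's C++ code; it uses Python's standard `mmap` module, which refuses to map an empty file, and a hypothetical one-byte `PADDING` so that even an empty payload produces a mappable file. The payload length is assumed to be tracked separately by the caller.

```python
import mmap
import os
import tempfile

PADDING = b"\0"  # hypothetical: one trailing byte so the file is never empty


def write_padded(path: str, payload: bytes) -> None:
    """Write the payload followed by padding, so the file length is >= 1."""
    with open(path, "wb") as f:
        f.write(payload)
        f.write(PADDING)


def read_payload(path: str, payload_len: int) -> bytes:
    """Map the whole file and return only the first payload_len bytes."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[:payload_len]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sorted_index.bin")
    # An empty payload would yield a 0-byte file without the padding.
    write_padded(path, b"")
    empty_roundtrip = read_payload(path, 0)
```

The reader ignores the padding because it slices by the known payload length rather than by file size.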

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-31 18:53:36 +08:00
sthuang
43c3c160ff
feat: [StorageV2] cmd binlog tool (#43648)
related: #39173 

Core Features
* Parquet File Analysis: Analyze Milvus binlog Parquet files with
metadata extraction
* MinIO Integration: Direct connection to MinIO storage for remote file
analysis
* Vector Data Deserialization: Specialized handling of Milvus vector
data in binlog files
* Interactive CLI: Command-line interface with interactive exploration

Analysis Capabilities
* Metadata & Vector Analysis: Extract schema info, row counts, and
vector statistics
* Data Export: Export data to JSON format with configurable limits
* Query Functionality: Search for specific records by ID
* Batch Processing: Analyze multiple Parquet files simultaneously

User Experience
* Verbose Output: Detailed logging for debugging
* Error Handling: Robust error handling for file access and parsing
* Flexible Output: Support for single file and batch analysis formats

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
Co-authored-by: nico <109071306+NicoYuan1986@users.noreply.github.com>
2025-07-31 15:05:37 +08:00
tinswzy
1fe60520ae
enhance: update wp version v0.1.3 (#43658)
#43638 
update wp to v0.1.3.
- Fix the goroutine leak of adv file reader.
- Refactor the log reader time wait logic. 
- The server segment file reuses the reader singleton.

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-07-31 14:17:37 +08:00
zhagnlu
708e426bb3
enhance: using set element for string term type (#43049)
issue: #43048

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-31 10:35:37 +08:00
zhagnlu
31801f5937
fix: fix pk in [..] skip next batch when using multi-chunk segment (#43618)
#43494

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-31 10:15:37 +08:00
congqixia
089f02bcca
fix: [StorageV2] Align null bitmap offset for fixed-length datatype (#43654)
Related to #43626

Similar to the previous PR #43321, the null bitmap could be misaligned if
the bitset pointer does not account for the array offset.
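
For context, Arrow validity bitmaps are LSB-first, and a sliced array shares its parent's buffers while recording a logical offset. The sketch below shows how ignoring that offset misreads nulls; it is a plain-Python illustration with hypothetical helper names, not the code changed by this patch.

```python
def is_valid(bitmap: bytes, array_offset: int, i: int) -> bool:
    """Validity of logical element i in an array that starts at array_offset."""
    bit = array_offset + i
    return (bitmap[bit >> 3] >> (bit & 7)) & 1 == 1


def validity(bitmap: bytes, array_offset: int, length: int) -> list:
    return [is_valid(bitmap, array_offset, i) for i in range(length)]


# Parent bitmap, LSB-first: bits 0..7 are 1,0,1,1,0,1,0,0.
parent_bitmap = bytes([0b00101101])

# A slice covering parent elements 2..5 has offset 2 and length 4.
with_offset = validity(parent_bitmap, 2, 4)
# Reading from bit 0 instead returns the validity of elements 0..3.
without_offset = validity(parent_bitmap, 0, 4)
```

Here `with_offset` marks the third element of the slice as null, while `without_offset` wrongly marks the second one.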

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-31 09:55:36 +08:00
Zhen Ye
0d5e0ca795
fix: close timetick protection by default (#43650)
issue: #43266

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-30 19:51:37 +08:00
congqixia
6a74a7de66
enhance: Make DeleteRecord search pks by batch and PinAll (#43640)
Related to #43592

When delete records are large, searching pks one by one results in many
`Pincells` calls, which create lots of futures.

This patch makes the pk search execute in batches to reduce this cost.

It also adds the `GetAllChunks` API, which uses `PinAllCells` to reduce pins.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-30 19:15:36 +08:00
SimFG
9ffcc55b55
fix: Clean privilege cache after loading policy in InitPolicyInfo (#43642)
- issue: #43641

Signed-off-by: SimFG <bang.fu@zilliz.com>
2025-07-30 16:57:37 +08:00
wei liu
1fae8f5ae3
enhance: Optimize FlushAll performance for multi-table scenarios (#43339)
Replace multiple per-table flush RPC calls with single FlushAll RPC to
improve performance in multi-table scenarios.
issue: #43338
- Implement server-side FlushAll request processing in
DataCoord/MixCoord
- Add flushAllTask to handle unified flush operations across all tables
- Replace proxy-side per-table flush iteration with single RPC call
- Support both streaming and non-streaming service execution paths
- Add comprehensive unit tests for new FlushAll implementation

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-30 15:37:37 +08:00
tinswzy
1718b0d141
enhance: update wp version v0.1.2 (#43636)
#43638 
update wp to v0.1.2
fix read failure when minio is killed during data reading. related wp
commit: aabd1c4eb2

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-07-30 14:39:36 +08:00
9Eurydice9
93884d219a
test: add create collection V2 cases for milvus client (#43600)
issue: #43590
Migrate duplicate collection name test cases from TestcaseBase to
TestMilvusClientV2Base

@yanliang567

---------

Signed-off-by: Orpheus Wang <orpheus.wang@zilliz.com>
2025-07-30 10:25:36 +08:00
nico
909347dc66
test: update nightly cases (#43591)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2025-07-30 09:51:37 +08:00
sthuang
a2c7ed2780
fix: [StorageV2] sort field binlogs paths for packed reader and writer (#43585)
key changes:
* fix unstable storage v2 compaction unit test by guaranteeing the order
of paths during sync.
* bump milvus-storage version, include
https://github.com/milvus-io/milvus-storage/pull/222
https://github.com/milvus-io/milvus-storage/pull/223
https://github.com/milvus-io/milvus-storage/pull/224
https://github.com/milvus-io/milvus-storage/pull/225
https://github.com/milvus-io/milvus-storage/pull/226
* Also fix the below related oom issue.
related: https://github.com/milvus-io/milvus/issues/43310

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-30 08:09:36 +08:00
congqixia
4fe55e3008
fix: [StorageV2] Use separate channel for get_cells (#43632)
Related to #43584

There might be concurrent calls on `translator.get_cells`. The channel
cannot be shared among these calls, otherwise the logic will break.

This patch creates a new channel for each `get_cells` invocation to avoid
data races.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-29 20:59:38 +08:00
Zhen Ye
3e3775fb81
fix: panics when describe collection internal failure (#43630)
issue: #43629

- also fix the scanner_switchable panic when the underlying wal scanner
returns a context error.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-29 20:33:36 +08:00
Zhen Ye
cd38d65417
fix: make savebinlogpath idempotent at binlog level (#43615)
issue: #43574

- update all binlogs every time update savebinlogpath is called.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-29 19:47:36 +08:00
foxspy
d57890449f
enhance: update knowhere version (#43528)
issue: #42937

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-07-29 17:21:36 +08:00
XuanYang-cn
0ccb95303e
feat: [CMEK] Add utils to load plugins (#42986)
See also: #40321

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-07-29 17:17:36 +08:00
Buqian Zheng
052fb6c562
feat: add time based eviction to data managed by cachinglayer (#43490)
issue: https://github.com/milvus-io/milvus/issues/41435

also added disk capacity protection

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-29 16:17:35 +08:00