10951 Commits

Author SHA1 Message Date
cai.zhang
77f2fb562f
fix: Fix task state is InProgress but payload is nil (#43777)
issue: #43776

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-08-11 14:13:42 +08:00
Gao
81a0915c29
enhance: add milvus-common module to decouple knowhere & segcore (#43624)
issue: https://github.com/milvus-io/milvus/issues/42032
https://github.com/milvus-io/milvus/issues/41435

based on pr: https://github.com/milvus-io/milvus/pull/42124

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
Co-authored-by: xianliang.li <xianliang.li@zilliz.com>
2025-08-11 14:09:42 +08:00
yihao.dai
ad950368fe
enhance: Fix parquet import OOM (#43756)
Each ColumnReader consumes ReaderProperties.BufferSize memory
independently. Therefore, the bufferSize should be divided by the number
of columns to ensure total memory usage stays within the intended limit.
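
A minimal sketch of the arithmetic described above, not the actual import code. The function name, its parameters, and the `minimum` floor are hypothetical and only illustrate dividing one budget across independent column readers:

```python
def per_column_buffer_size(total_budget: int, num_columns: int, minimum: int = 1) -> int:
    """Split a total read-buffer budget evenly across column readers."""
    if num_columns <= 0:
        raise ValueError("num_columns must be positive")
    # Every column reader allocates its own buffer, so giving each one the
    # full budget would use num_columns times the intended memory.
    return max(minimum, total_budget // num_columns)


budget = 64 * 1024 * 1024  # intended total: 64 MiB
size = per_column_buffer_size(budget, num_columns=16)
print(size, size * 16 <= budget)
```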

issue: https://github.com/milvus-io/milvus/issues/43755

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-08-08 18:57:40 +08:00
aoiasd
eca51ed2c6
enhance: add file resource api (#43766)
relate: https://github.com/milvus-io/milvus/issues/43687

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-08 14:17:41 +08:00
zhagnlu
5b83975d39
enhance: convert multi not equal to not in (#43690)
#43689

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-08 10:37:40 +08:00
sparknack
169be30a76
enhance: cachinglayer: reserve resource for inevictable cachecell (#43602)
issue: #41435

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-08 10:35:49 +08:00
zhagnlu
c04d678ad4
enhance: make segcore params effective without restarting milvus (#43231)
#43230

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-08 10:33:48 +08:00
congqixia
1561a4ae8c
enhance: [StorageV2] Avoid create local parent dir if fs remote (#43790)
Related to #43752
milvus-storage pr: milvus-io/milvus-storage#230

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-08 10:19:40 +08:00
congqixia
b6199acb05
enhance: Utilize search_batch_pks for search_ids of PkTerm (#43751)
Related to #43660

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-07 14:19:40 +08:00
wei liu
715b5153b8
enhance: Improve delegator serviceable check logic in PinReadableSegments (#43768)
issue: #43767
- Enhance serviceable check logic to properly handle full vs partial
result requirements
- For full result (requiredLoadRatio >= 1.0): check
queryView.Serviceable()
- For partial result (requiredLoadRatio < 1.0): check load ratio
satisfaction
- Add comprehensive unit tests covering all serviceable check scenarios

This enhancement ensures delegator correctly validates serviceability
based on the requested result completeness, improving reliability of
query operations.
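
The decision described above can be sketched as follows. This is an illustration only: `is_serviceable` and its parameters are hypothetical names, and comparing the loaded ratio with `>=` is an assumption about what "load ratio satisfaction" means:

```python
def is_serviceable(required_load_ratio: float,
                   view_serviceable: bool,
                   loaded_ratio: float) -> bool:
    """Pick the serviceable check based on the requested completeness."""
    if required_load_ratio >= 1.0:
        # Full result: rely on the query view's own serviceable flag.
        return view_serviceable
    # Partial result: enough data loaded to meet the requested ratio (assumed >=).
    return loaded_ratio >= required_load_ratio


print(is_serviceable(1.0, view_serviceable=False, loaded_ratio=1.0))
print(is_serviceable(0.8, view_serviceable=False, loaded_ratio=0.9))
```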

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-08-07 12:13:40 +08:00
wei liu
46dfe260da
enhance: Add timestamp filtering support to L0Reader (#43747)
issue: #43745
Add timestamp filtering capability to L0Reader to match the
functionality available in the regular Reader. This enhancement allows
filtering delete records based on timestamp range during L0 import
operations.

Changes include:
- Add tsStart and tsEnd fields to l0Reader struct for timestamp
filtering
- Modify NewL0Reader function signature to accept tsStart and tsEnd
parameters
- Implement timestamp filtering logic in Read method to skip records
outside the specified range
- Update L0ImportTask and L0PreImportTask to parse timestamp parameters
from request options and pass them to NewL0Reader
- Add comprehensive test case TestL0Reader_ReadWithTsFilter to verify ts
filtering functionality using mockey framework
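
A rough sketch of the filtering step, assuming an inclusive `[ts_start, ts_end]` range. `DeleteRecord` and `filter_by_ts` are hypothetical stand-ins, not the real l0Reader types:

```python
from typing import Iterable, Iterator, NamedTuple


class DeleteRecord(NamedTuple):
    pk: int
    ts: int


def filter_by_ts(records: Iterable[DeleteRecord],
                 ts_start: int, ts_end: int) -> Iterator[DeleteRecord]:
    """Yield only delete records whose timestamp lies in the range."""
    for record in records:
        # Inclusive bounds are an assumption made for this sketch.
        if ts_start <= record.ts <= ts_end:
            yield record


records = [DeleteRecord(1, 5), DeleteRecord(2, 10), DeleteRecord(3, 15)]
kept = list(filter_by_ts(records, ts_start=6, ts_end=15))
print(kept)
```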

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-08-06 16:49:39 +08:00
Zhen Ye
8ff118a9ff
fix: call IntoMessageProto instead of Payload when rpc (#43678)
issue: #43677

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-06 14:45:40 +08:00
Zhen Ye
5551d99425
enhance: remove old non-streaming arch code (#43651)
issue: #41609

- remove all dml dead code at proxy
- remove dead code at l0_write_buffer
- remove msgstream dependency at proxy
- remove timetick reporter from proxy
- remove replicate stream implementation

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-06 14:41:40 +08:00
congqixia
d414f6bd4d
enhance: Add assertion preventing reload same field (#43736)
Related to #43725

This patch adds an assertion preventing a segment from reloading the same
field column. It also improves the message when the pk already exists.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-05 19:35:39 +08:00
cai.zhang
d8a3236e44
fix: Reorder worker proto fields to ensure compatibility (#43735)
issue: #43734

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-08-05 14:59:38 +08:00
sparknack
544c7c0600
enhance: update cachinglayer default cache ratio to 0.3 (#43723)
issue: #41435

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-05 01:35:39 +08:00
yihao.dai
cb7be8885d
enhance: Deep copy arrow array (#43724)
Deep copy arrow array and make a new RecordBatch with the copied array.

issue: https://github.com/milvus-io/milvus/issues/43310

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-08-05 00:31:38 +08:00
zhagnlu
f14c7d598c
fix: skip load raw data when loading index for storagev2 (#43720)
#43653

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-04 21:17:39 +08:00
Chun Han
d826d6ac91
fix: try to get span raw data for variable length data type(#43544) (#43705)
related: #43544

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-08-04 11:15:38 +08:00
aoiasd
4f02b06abc
enhance: support set lindera dict build dir and download url in yaml (#43541)
relate: https://github.com/milvus-io/milvus/issues/43120

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-04 09:47:38 +08:00
congqixia
4aff581007
enhance: Pass callback in search batch pks to avoid large result (#43695)
Related to #43660

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-02 17:57:37 +08:00
Buqian Zheng
01baf582d5
fix: GroupChunkTranslator to correctly identify vector field (#43706)
issue: #43653

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-08-02 00:49:37 +08:00
Bingyi Sun
b59bc5e2c0
fix: make json path index non exists offsets compatible with 2.5 (#43691)
issue: https://github.com/milvus-io/milvus/issues/43666

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-08-01 23:22:23 +08:00
sparknack
bdd65871ea
enhance: tiered storage: estimate segment loading resource usage while considering eviction (#43323)
issue: #41435 

After introducing the caching layer's lazy loading and eviction
mechanisms, most parts of a segment won't be loaded into memory or disk
immediately, even if the segment is marked as LOADED. This means
physical resource usage may be very low. However, we still need to
reserve enough resources for the segments marked as LOADED. Thus, the
logic of resource usage estimation during segment loading, which is
currently based on physical resource usage only, should be changed.

To address this issue, we introduced the concept of logical resource
usage in this patch. This can be thought of as the base reserved
resource for each LOADED segment.

A segment’s logical resource usage is derived from its final evictable
and inevictable resource usage and calculated as follows:

```
SLR = SFPIER + evitable_cache_ratio * SFPER
```

This is equivalent to

```
SLR = (SFPIER + SFPER) - (1.0 - evitable_cache_ratio) * SFPER
```

`SLR`: The logical resource usage of a segment.
`SFPIER`: The final physical inevictable resource usage of a segment.
`SFPER`: The final physical evictable resource usage of a segment.
`evitable_cache_ratio`: The ratio of a segment's evictable resources
that can be cached locally. The higher the ratio, the more physical
memory is reserved for evictable memory.
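
As a worked example of the two equivalent forms above (the numbers are made up and the function names are hypothetical; `ratio` stands for `evitable_cache_ratio`):

```python
import math


def slr_direct(sfpier: float, sfper: float, ratio: float) -> float:
    # SLR = SFPIER + ratio * SFPER
    return sfpier + ratio * sfper


def slr_from_total(sfpier: float, sfper: float, ratio: float) -> float:
    # SLR = (SFPIER + SFPER) - (1.0 - ratio) * SFPER
    return (sfpier + sfper) - (1.0 - ratio) * sfper


# 2 GB inevictable, 8 GB evictable, half of the evictable part cached.
a = slr_direct(2.0, 8.0, 0.5)
b = slr_from_total(2.0, 8.0, 0.5)
print(a, b, math.isclose(a, b))
```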

When loading a segment, two types of resource usage are taken into
account.

First is the estimated maximum physical resource usage:

```
PPR = HPR + CPR + SMPR - SFPER
```

`PPR`: The predicted physical resource usage after the current segment
is allowed to load.
`HPR`: The physical resource usage obtained from hardware information.  
`CPR`: The total physical resource usage of segments that have been
committed but not yet loaded. When one new segment is allowed to load,
`CPR' = CPR + (SMPR - SFPER)`. When one of the committed segments is
loaded, `CPR' = CPR - (SMPR - SFPER)`.
`SMPR`: The maximum physical resource usage of the current segment.
`SFPER`: The final physical evictable resource usage of the current
segment.

Second is the estimated logical resource usage. This check only applies
when eviction is enabled:

```
PLR = LLR + CLR + SLR
```

`PLR`: The predicted logical resource usage after the current segment is
allowed to load.
`LLR`: The total logical resource usage of all loaded segments. When a
new segment is loaded, `LLR` should be updated to `LLR' = LLR + SLR`.
`CLR`: The total logical resource usage of segments that have been
committed but not yet loaded. When one new segment is allowed to load,
`CLR' = CLR + SLR`. When one of the committed segments is loaded,
`CLR' = CLR - SLR`.
`SLR`: The logical resource usage of the current segment.

The segment is allowed to load only when `PPR < PRL && PLR < PRL`, where
`PRL` is the physical resource limit of the querynode.
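
The admission check can be sketched like this, using the symbols defined above as parameter names. `can_load` is a hypothetical helper, and skipping the logical check when eviction is disabled is this sketch's reading of "only valid when eviction is enabled":

```python
def can_load(hpr: float, cpr: float, smpr: float, sfper: float,
             llr: float, clr: float, slr: float, prl: float,
             eviction_enabled: bool = True) -> bool:
    """Return True if the segment passes both resource checks."""
    # PPR = HPR + CPR + SMPR - SFPER
    ppr = hpr + cpr + smpr - sfper
    if ppr >= prl:
        return False
    if eviction_enabled:
        # PLR = LLR + CLR + SLR
        plr = llr + clr + slr
        if plr >= prl:
            return False
    return True


print(can_load(hpr=10, cpr=2, smpr=6, sfper=4, llr=8, clr=1, slr=3, prl=16))
```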

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-01 21:31:37 +08:00
Buqian Zheng
b0226ef47c
fix: added more comprehensive container limit detection (#43693)
issue: #41435

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-08-01 20:37:37 +08:00
wei liu
ecc2ac0426
fix: apply load config changes failed after restart (#43554)
issue: #43107

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-08-01 20:13:37 +08:00
Xianhui Lin
0f0edff7f0
fix: increment offset for null data rows in JsonKeyStats (#43679)
fix: increment offset for null data rows in JsonKeyStatsInvertedIndex
issue: https://github.com/milvus-io/milvus/issues/43151

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-08-01 15:53:37 +08:00
Buqian Zheng
21cec95fe8
fix: fix disk path sent to cachinglayer (#43685)
`localDataRootPath` is used to initialize the local chunk manager and has
`querynode` appended to it, so it is not the correct path to send.

#41435

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-08-01 13:19:36 +08:00
zhagnlu
2594250906
fix: fix miss loading index for storagev2 (#43674)
#43653

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-01 13:07:36 +08:00
congqixia
5f2f4eb3d6
enhance: Ignore entry with same ts when DeleteRecord search pks (#43669)
Related to #43660

This patch drops the unwanted offset&ts entries that have the same
timestamp as the delete record. Under a large volume of upserts, these
false hits could greatly increase memory usage while applying deletes.

The next step could be passing a callback to `search_pk_func_` to handle
hit entries in a streaming fashion.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-01 10:15:36 +08:00
Ted Xu
e37cd19da2
enhance: enable storage v2 by default (#43652)
Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-08-01 08:59:36 +08:00
zhagnlu
239f743a18
fix: add enable_mmap key to load config (#43672)
#43670

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-31 21:35:37 +08:00
sparknack
4aabe23a45
enhance: update flat_hash_map.hpp to a modified version (#43506)
issue: #41435

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-07-31 20:09:36 +08:00
Chun Han
d72c0357ff
fix: empty hybridsearch result due to one-sub-search empty(#43537) (#43647)
related: #43537

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-07-31 19:47:37 +08:00
congqixia
f29964bd17
fix: Add padding for sorted index preventing 0 length mmap (#43663)
Related to #43655

This patch adds padding when writing the mmap file for ScalarSortedIndex,
to avoid an mmap failure caused by a zero mmap length.
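
To illustrate the idea only (this is not the ScalarSortedIndex code): mapping an empty file fails, so the writer appends a pad and the reader keeps the real payload length. The one-byte pad and the helper names are assumptions of this sketch:

```python
import mmap
import os
import tempfile

PADDING = b"\0"  # hypothetical one-byte pad so the file is never empty


def write_with_padding(path: str, payload: bytes) -> int:
    """Write payload plus padding; return the real payload length."""
    with open(path, "wb") as f:
        f.write(payload)
        f.write(PADDING)
    return len(payload)


def map_payload(path: str, payload_len: int) -> bytes:
    """mmap the whole file and return only the payload bytes."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[:payload_len]


path = os.path.join(tempfile.mkdtemp(), "sorted_index.bin")
empty_len = write_with_padding(path, b"")  # empty index data still maps
empty_view = map_payload(path, empty_len)
print(os.path.getsize(path), empty_view)
```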

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-31 18:53:36 +08:00
zhagnlu
708e426bb3
enhance: using set element for string term type (#43049)
issue: #43048

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-31 10:35:37 +08:00
zhagnlu
31801f5937
fix: fix pk in [..] skip next batch when using multi-chunk segment (#43618)
#43494

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-31 10:15:37 +08:00
congqixia
089f02bcca
fix: [StorageV2] Align null bitmap offset for fixed-length datatype (#43654)
Related to #43626

Similar to the previous PR #43321, the null bitmap could be misaligned if
the bitset pointer does not account for the array offset.
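
A small sketch of why the offset matters, assuming Arrow's LSB-ordered validity bitmap, where a sliced array shares its parent's buffer. `is_valid` is a hypothetical helper, not the fixed code:

```python
def is_valid(validity: bytes, array_offset: int, i: int) -> bool:
    """Check the validity bit for logical row i of a sliced array."""
    # The bit for logical row i lives at position (array_offset + i) in the
    # shared bitmap; ignoring array_offset reads a neighbouring row's bit.
    bit = array_offset + i
    return bool((validity[bit // 8] >> (bit % 8)) & 1)


bitmap = bytes([0b00001010])  # parent rows 1 and 3 are non-null
with_offset = [is_valid(bitmap, 1, i) for i in range(3)]
without_offset = [is_valid(bitmap, 0, i) for i in range(3)]
print(with_offset, without_offset)
```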

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-31 09:55:36 +08:00
congqixia
6a74a7de66
enhance: Make DeleteRecord search pks by batch and PinAll (#43640)
Related to #43592

When there are many delete records, searching pks one by one results in
many `Pincells` calls, which create lots of futures.

This patch makes the pk search execute in batches to reduce this cost.

It also adds a `GetAllChunks` API that utilizes `PinAllCells` to reduce pins.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-30 19:15:36 +08:00
SimFG
9ffcc55b55
fix: Clean privilege cache after loading policy in InitPolicyInfo (#43642)
- issue: #43641

Signed-off-by: SimFG <bang.fu@zilliz.com>
2025-07-30 16:57:37 +08:00
wei liu
1fae8f5ae3
enhance: Optimize FlushAll performance for multi-table scenarios (#43339)
Replace multiple per-table flush RPC calls with single FlushAll RPC to
improve performance in multi-table scenarios.
issue: #43338
- Implement server-side FlushAll request processing in
DataCoord/MixCoord
- Add flushAllTask to handle unified flush operations across all tables
- Replace proxy-side per-table flush iteration with single RPC call
- Support both streaming and non-streaming service execution paths
- Add comprehensive unit tests for new FlushAll implementation

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-30 15:37:37 +08:00
sthuang
a2c7ed2780
fix: [StorageV2] sort field binlogs paths for packed reader and writer (#43585)
key changes:
* fix unstable storage v2 compaction unit test by guaranteeing the order
of paths during sync.
* bump milvus-storage version, include
https://github.com/milvus-io/milvus-storage/pull/222
https://github.com/milvus-io/milvus-storage/pull/223
https://github.com/milvus-io/milvus-storage/pull/224
https://github.com/milvus-io/milvus-storage/pull/225
https://github.com/milvus-io/milvus-storage/pull/226
* Also fix the below related oom issue.
related: https://github.com/milvus-io/milvus/issues/43310

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-30 08:09:36 +08:00
congqixia
4fe55e3008
fix: [StorageV2] Use separate channel for get_cells (#43632)
Related to #43584

There might be concurrent calls on `translator.get_cells`. The channel
cannot be shared among these calls, otherwise the logic will break.

This patch creates a new channel for each `get_cells` invocation to avoid
a data race.
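
The per-invocation channel pattern can be sketched with a queue standing in for the channel. This is an analogy, not the translator code: `get_cells` here is a hypothetical function that fakes cell loading with strings:

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of one call's results


def get_cells(cell_ids):
    """Collect results through a channel owned by this call alone."""
    channel = queue.Queue()  # created per invocation, never shared

    def produce():
        for cid in cell_ids:
            channel.put((cid, f"cell-{cid}"))
        channel.put(_DONE)

    worker = threading.Thread(target=produce)
    worker.start()
    results = []
    while True:
        item = channel.get()
        if item is _DONE:
            break
        results.append(item)
    worker.join()
    return results


print(get_cells([1, 2, 3]))
```

With a single shared queue, two concurrent callers could consume each other's items or sentinel, which is the breakage the commit describes.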

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-29 20:59:38 +08:00
Zhen Ye
3e3775fb81
fix: panics when describe collection internal failure (#43630)
issue: #43629

- also fix the scanner_switchable panic when the underlying wal scanner
returns a context error.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-29 20:33:36 +08:00
Zhen Ye
cd38d65417
fix: make savebinlogpath idempotent at binlog level (#43615)
issue: #43574

- update all binlogs every time update savebinlogpath is called.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-29 19:47:36 +08:00
foxspy
d57890449f
enhance: update knowhere version (#43528)
issue: #42937

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-07-29 17:21:36 +08:00
XuanYang-cn
0ccb95303e
feat: [CMEK] Add utils to load plugins (#42986)
See also: #40321

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-07-29 17:17:36 +08:00
Buqian Zheng
052fb6c562
feat: add time based eviction to data managed by cachinglayer (#43490)
issue: https://github.com/milvus-io/milvus/issues/41435

also added disk capacity protection

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-29 16:17:35 +08:00
Bingyi Sun
a765cd1eaa
enhance: unlink mmap file when chunk and index are destructed (#43524)
issue: https://github.com/milvus-io/milvus/issues/41636

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-07-29 16:05:36 +08:00
congqixia
268f1cdace
fix: Hold field shared_ptr in case of being released (#43614)
Related to #43584

Directly accessing `fields_` in `get_raw_data` may race if a vector index
load happens concurrently while the raw data is being read.

This PR makes `bulk_subscript` hold a shared_ptr to the field column,
preventing the field column from being released while it is read.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-29 12:15:36 +08:00