2146 Commits

Author SHA1 Message Date
Buqian Zheng
9bf2b5c10c
enhance: moved more segcore unit test files (#44210)
issue: https://github.com/milvus-io/milvus/issues/43931

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-09-08 10:21:55 +08:00
congqixia
8f97eb355f
enhance: [StorageV2] Make bucket name concatenation transparent to user (#44232)
Related to #39173

This PR:
- Bump milvus-storage commit to handle bucket name concatenation logic
in multipart s3 fs
- Remove all user-side bucket name concatenation code

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-08 10:15:55 +08:00
Spade A
ba4cd68edb
fix: adjust params to make CPP UT run faster (#44223)
fix: https://github.com/milvus-io/milvus/issues/44224

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-06 14:13:54 +08:00
aoiasd
c71b47b52c
enhance: add internal core latency metric for rescore node (#44010)
For fetching latency of boost.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-05 17:37:54 +08:00
cqy123456
1d4d721859
test: Reduce the run time of interim index cpp ut (#44200)
issue: https://github.com/milvus-io/milvus/issues/44176

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-09-05 16:45:53 +08:00
zhagnlu
d67f1ea0ab
enhance: add param to modify dump snapshot batch size (#44215)
issue: #44216

Signed-off-by: luzhang <luzhang@zilliz.com>
2025-09-05 14:29:54 +08:00
Gao
2e98cb0103
enhance: load resource estimation for tiered index (#44171)
issue: https://github.com/milvus-io/milvus/issues/42032

- Use bytes to estimate load resource in the whole estimation procedure
- Add num_rows and dim info for vector index to better estimate
- Disable eviction for tiered index's meta

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-09-04 19:41:53 +08:00
Buqian Zheng
b76bf13fc3
enhance: move c++ unit test file to aside of the production code (#43932)
issue: https://github.com/milvus-io/milvus/issues/43931

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-09-03 23:45:53 +08:00
Spade A
825a134739
feat: impl StructArray -- reject json types for struct (#44190)
issue: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-03 19:33:53 +08:00
Spade A
7cb15ef141
feat: impl StructArray -- optimize vector array serialization (#44035)
issue: https://github.com/milvus-io/milvus/issues/42148

Optimized from
Go VectorArray → VectorArray Proto → Binary → C++ VectorArray Proto →
C++ VectorArray local impl → Memory
to
Go VectorArray → Arrow ListArray  → Memory

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-03 16:39:53 +08:00
Buqian Zheng
ad16441aa0
enhance: removed unused VectorFunction (#44178)
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-09-03 14:37:53 +08:00
foxspy
d55bf49bf1
enhance: update knowhere version (#44144)
issue: #42937

---------

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-09-03 01:31:53 +08:00
Bingyi Sun
0c0630cc38
feat: support dropping index without releasing collection (#42941)
issue: #42942

This pr includes the following changes:
1. Added checks for index checker in querycoord to generate drop index
tasks
2. Added drop index interface to querynode
3. To avoid search failure after dropping the index, the querynode
allows the use of lazy mode (warmup=disable) to load raw data even when
indexes contain raw data.
4. In segcore, loading the index no longer deletes raw data; instead, it
evicts it.
5. In expr, the index is pinned to prevent concurrent errors.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-02 16:17:52 +08:00
congqixia
aa4ef9c996
feat: Support enabling dynamic schema on existing collection (#44151)
Related to #44150

This PR make enabling `dynamic schema` feature for an existing
collection possible.

This related API is to reuse `AlterCollection` and underhood its
redirected to `adding nullable json field`

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-02 15:51:52 +08:00
Bingyi Sun
c420e7bd27
enhance: align the behavior of exist expr between brute force and index (#44030)
https://github.com/milvus-io/milvus/issues/44031

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-01 15:03:52 +08:00
zhagnlu
e2f34d7b78
fix:expand lock range for dump_snapshot (#44130)
issue: #44129

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-01 14:43:52 +08:00
zhagnlu
fc876639cf
enhance: support json stats with shredding design (#42534)
#42533

Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-01 10:49:52 +08:00
sparknack
70c8114e85
enhance: cachinglayer: resource management for segment loading (#43846)
issue: #41435

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-29 11:37:50 +08:00
Buqian Zheng
6b22661c06
fix: use tbb::concurrent_unordered_map for ChunkedSegmentSealedImpl::fields_ (#44084)
issue: https://github.com/milvus-io/milvus/issues/44078

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-08-29 10:01:51 +08:00
cqy123456
844caf5cfe
enhance: estimate the size of interim index (#44104)
issue: #41435

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-08-28 19:37:51 +08:00
congqixia
e3b3502287
fix: Use correct regex for cppcheck (#44077)
Related to #44076

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-27 20:57:50 +08:00
marcelo-cjl
e13e19cd2c
enhance: add sparse_u32_f32 data type for sparse vertor (#43974)
issue: #43973

Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
2025-08-27 16:47:50 +08:00
Chun Han
da156981c6
feat: milvus support posix-compatible mode(milvus-io#43942) (#43944)
related: #43942

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-08-27 16:29:50 +08:00
XuanYang-cn
37a447d166
feat: Add CMEK cipher plugin (#43722)
1. Enable Milvus to read cipher configs
2. Enable cipher plugin in binlog reader and writer
3. Add a testCipher for unittests
4. Support pooling for datanode
5. Add encryption in storagev2

See also: #40321 
Signed-off-by: yangxuan <xuan.yang@zilliz.com>

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-08-27 11:15:52 +08:00
Spade A
90a7e63665
enhance: collect doc_id from posting list directly for text match (#43899)
issue: https://github.com/milvus-io/milvus/issues/43898

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-27 10:39:52 +08:00
aoiasd
e205c30f7d
fix: boost panic if search return empty result (#44042)
relate: https://github.com/milvus-io/milvus/issues/44041
Skip rescore node if no valid offsets.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-27 05:01:52 +08:00
Spade A
8456f824be
feat: impl StructArray -- miscellaneous staffs for struct array (#43960)
Ref https://github.com/milvus-io/milvus/issues/42148

1. enable storage v2
2. implement some missing staffs
3. fix some bugs and add tests

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-26 21:35:53 +08:00
Tianx
c0d62268ac
feat: add timesatmptz data type (#44005)
issue: https://github.com/milvus-io/milvus/issues/27467
>
https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420
> * [x]  M1 Create collection with timestamptz field
> * [x]  M2 Insert timestamptz field data
> * [x]  M3 Retrieve timestamptz field data
> * [x]  M4 Implement handoff[ ]  

The second PR of issue:
https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4
described above.

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-08-26 15:59:53 +08:00
Gao
e97a618630
enhance: support readAt interface for remote input stream (#43997)
#42032 

Also, fix the cacheoptfield method to work in storagev2.
Also, change the sparse related interface for knowhere version bump
#43974 .
Also, includes https://github.com/milvus-io/milvus/pull/44046 for metric
lost.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-26 11:19:58 +08:00
zhagnlu
8934c18792
enhance: support cache result cache for expr (#43923)
issue: #43878

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-26 10:55:52 +08:00
cqy123456
d987dd7103
enhance: Make build ratio of interim index configurable (#43939)
issue: https://github.com/milvus-io/milvus/issues/43993

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-08-25 14:43:51 +08:00
sparknack
4fae074d56
enhance: add write rate limit for disk file writer (#43912)
issue: #43040

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-25 10:27:47 +08:00
Spade A
8e1ce15146
fix: ngram index is mistakenly used for unsopported operations (#43955)
issue: https://github.com/milvus-io/milvus/issues/43917

1. fix ngrma index to be mistakenly used for unsopported operation
2. fix potential uaf problem

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-21 14:41:46 +08:00
zhagnlu
d904c4e677
enhance: optimize compare expr performance for pk field (#43154)
#43153

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-21 10:59:46 +08:00
congqixia
7963b17ac1
fix: Revert "fix: Use folly::SharedMutex preventing starvation (#43937)" (#43959)
Related to #43958

This reverts commit 580350495ab40b3c0a2ec473882258edf6d7dd08.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-21 10:09:47 +08:00
Spade A
d6a428e880
feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726)
Ref https://github.com/milvus-io/milvus/issues/42148

This PR supports create index for vector array (now, only for
`DataType.FLOAT_VECTOR`) and search on it.
The index type supported in this PR is `EMB_LIST_HNSW` and the metric
type is `MAX_SIM` only.

The way to use it:
```python
milvus_client = MilvusClient("xxx:19530")
schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True)
...
struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field")
...
struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000)
...
schema.add_struct_array_field(struct_schema)
index_params = milvus_client.prepare_index_params()
index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128})
...
milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params)
```

Note: This PR uses `Lims` to convey offsets of the vector array to
knowhere where vectors of multiple vector arrays are concatenated and we
need offsets to specify which vectors belong to which vector array.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-08-20 10:27:46 +08:00
Alexander Guzhva
cfdb17a088
enhance: Fix ArithHelperI64 for SVE in bitset (#43952)
missing ArithHelperI64<ArithOpType::Div, CmpOp>

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2025-08-19 22:48:58 +08:00
Alexander Guzhva
e179a5635f
enhance: remove duplicate code in ArithHelperF32 in SVE for bitset (#43950)
fixes a problem of https://github.com/milvus-io/milvus/pull/43949

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2025-08-19 22:35:47 +08:00
liliu-z
7dd2b103b0
enhance: Fix template declaration order for ArithHelperF32 in SVE implementation (#43949)
Signed-off-by: Li Liu <li.liu@zilliz.com>
2025-08-19 21:58:22 +08:00
congqixia
580350495a
fix: Use folly::SharedMutex preventing starvation (#43937)
Related to #43936

This PR:
- Use `folly::SharedMutex` instead of `std::shared_mutex` preventing
starvation
- Use `folly::SharedMutex::WriteHolder/ReadHolder` instead of
std::shared_lock and std::unique_lock to get better performance

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-19 20:05:46 +08:00
aoiasd
dcf04a58b8
feat: support use score function on segment search and use filter (#43868)
relate: https://github.com/milvus-io/milvus/issues/43867
Support boost function score, multiply by the weight if match filter.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-19 16:15:45 +08:00
Gao
b602b4187d
enhance: upgrade aws-sdk from 1.9.234 to 1.11.352 (#43916)
issue: #43908

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-08-19 11:11:45 +08:00
yihao.dai
64ab3d2681
enhance: Improve error message when query vector and dim mismatch (#43835)
/kind improvement

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-08-18 01:07:44 +08:00
foxspy
647c2bca2d
enhance: Support streaming read and write of vector index files (#43824)
issue: #42032

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-08-15 23:41:43 +08:00
Alexander Guzhva
ebb10dfae0
enhance: better auto-detect of SVE options for the bitset library (#43833)
Enables the compilation of SVE code for the bitset library if a C++
compiler supports it.

There are two conditions for enabling the SVE code
* a C++ compiler needs to have a `-march=armv8-a+sve`
* `arm_sve.h` header must be available

AFAIK, `gcc 7 does not support SVE`, `gcc 8` and `gcc 9` support SVE,
but have no `arm_sve.h` file, and only `gcc 10` has both.

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2025-08-15 22:37:44 +08:00
sthuang
5e4eb4a6e0
enhance: [StorageV2] bump storage version (#43871)
related: https://github.com/milvus-io/milvus/issues/43869

bump storage version. include the following feature:
* https://github.com/milvus-io/milvus-storage/pull/231
* https://github.com/milvus-io/milvus-storage/pull/232
* https://github.com/milvus-io/milvus-storage/pull/233

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-08-15 17:59:43 +08:00
sthuang
c102fa8b0b
enhance: [StorageV2] zero copy for packed writer record batch (#43779)
The Out of Memory (OOM) error occurs because a handler retains the
entire ImportRecordBatch in memory. Consequently, even when child arrays
within the batch are flushed, the memory for the complete batch is not
released. We temporarily fixed by deep copying record batch in #43724.

The proposed fix is to split the RecordBatch into smaller sub-batches by
column group. These sub-batches will be transferred via CGO, then
reassembled before being written to storage using the Storage V2 API.
Thus we can achieve zero-copy and only transferring references in CGO.

related: #43310

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-08-15 10:11:44 +08:00
congqixia
f032044125
enhance: Refine segcore param change callback (#43838)
Related to #43230

This PR
- Move segcore setup function to `initcore` package to remove cgo
dependency from pkg
- Register core callback only for components depends on segcore
- Rectify `UpdateLogLevel` implementation

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-13 19:31:44 +08:00
zhagnlu
b7c7df9440
fix: fix delete consumer bug for cocurrency R-W (#43831)
#41570

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-12 11:37:42 +08:00
Gao
81a0915c29
enhance: add milvus-common module to decouple knwhere & segcore (#43624)
issue: https://github.com/milvus-io/milvus/issues/42032
https://github.com/milvus-io/milvus/issues/41435

based on pr: https://github.com/milvus-io/milvus/pull/42124

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
Co-authored-by: xianliang.li <xianliang.li@zilliz.com>
2025-08-11 14:09:42 +08:00