420 Commits

Author SHA1 Message Date
sparknack
ab64afba2f
enhance: add storage resource usage for scalar search (#44414)
issue: #44212

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-09-22 14:28:06 +08:00
Gao
d3784c6515
enhance: add storage resource usage for vector search (#44308)
issue: #44212 

Implement search/query storage usage statistics in go side(result
reduce), only record storage usage in vector search C++ path. Need to be
implemented in query c++ path in next prs.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>
2025-09-19 20:20:02 +08:00
zhagnlu
e9bbb6aa9b
fix: fix json_contains bug for stats (#44325)
#42533

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-15 10:16:07 +08:00
Chun Han
26a024625d
feat: support search by on json field and dynamic field(#43124) (#43203)
related: #43124

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-09-09 21:51:56 +08:00
Buqian Zheng
9bf2b5c10c
enhance: moved more segcore unit test files (#44210)
issue: https://github.com/milvus-io/milvus/issues/43931

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-09-08 10:21:55 +08:00
Buqian Zheng
b76bf13fc3
enhance: move c++ unit test file to aside of the production code (#43932)
issue: https://github.com/milvus-io/milvus/issues/43931

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-09-03 23:45:53 +08:00
zhagnlu
fc876639cf
enhance: support json stats with shredding design (#42534)
#42533

Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-01 10:49:52 +08:00
congqixia
e3b3502287
fix: Use correct regex for cppcheck (#44077)
Related to #44076

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-27 20:57:50 +08:00
marcelo-cjl
e13e19cd2c
enhance: add sparse_u32_f32 data type for sparse vertor (#43974)
issue: #43973

Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
2025-08-27 16:47:50 +08:00
aoiasd
e205c30f7d
fix: boost panic if search return empty result (#44042)
relate: https://github.com/milvus-io/milvus/issues/44041
Skip rescore node if no valid offsets.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-27 05:01:52 +08:00
Spade A
8456f824be
feat: impl StructArray -- miscellaneous staffs for struct array (#43960)
Ref https://github.com/milvus-io/milvus/issues/42148

1. enable storage v2
2. implement some missing staffs
3. fix some bugs and add tests

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-26 21:35:53 +08:00
Tianx
c0d62268ac
feat: add timesatmptz data type (#44005)
issue: https://github.com/milvus-io/milvus/issues/27467
>
https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420
> * [x]  M1 Create collection with timestamptz field
> * [x]  M2 Insert timestamptz field data
> * [x]  M3 Retrieve timestamptz field data
> * [x]  M4 Implement handoff[ ]  

The second PR of issue:
https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4
described above.

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-08-26 15:59:53 +08:00
Gao
e97a618630
enhance: support readAt interface for remote input stream (#43997)
#42032 

Also, fix the cacheoptfield method to work in storagev2.
Also, change the sparse related interface for knowhere version bump
#43974 .
Also, includes https://github.com/milvus-io/milvus/pull/44046 for metric
lost.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-26 11:19:58 +08:00
Spade A
d6a428e880
feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726)
Ref https://github.com/milvus-io/milvus/issues/42148

This PR supports create index for vector array (now, only for
`DataType.FLOAT_VECTOR`) and search on it.
The index type supported in this PR is `EMB_LIST_HNSW` and the metric
type is `MAX_SIM` only.

The way to use it:
```python
milvus_client = MilvusClient("xxx:19530")
schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True)
...
struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field")
...
struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000)
...
schema.add_struct_array_field(struct_schema)
index_params = milvus_client.prepare_index_params()
index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128})
...
milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params)
```

Note: This PR uses `Lims` to convey offsets of the vector array to
knowhere where vectors of multiple vector arrays are concatenated and we
need offsets to specify which vectors belong to which vector array.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-08-20 10:27:46 +08:00
aoiasd
dcf04a58b8
feat: support use score function on segment search and use filter (#43868)
relate: https://github.com/milvus-io/milvus/issues/43867
Support boost function score, multiply by the weight if match filter.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-19 16:15:45 +08:00
yihao.dai
64ab3d2681
enhance: Improve error message when query vector and dim mismatch (#43835)
/kind improvement

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-08-18 01:07:44 +08:00
Buqian Zheng
0599113a4b
enhance: add timeout to resource reservation (#43441)
issue: https://github.com/milvus-io/milvus/issues/41435

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-22 15:24:53 +08:00
Buqian Zheng
389104d200
enhance: rename PanicInfo to ThrowInfo (#43384)
issue: #41435

this is to prevent AI from thinking of our exception throwing as a
dangerous PANIC operation that terminates the program.

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-19 20:22:52 +08:00
Spade A
d750816ba0
fix: remove std::string support for stlsort index (#43355)
fix: https://github.com/milvus-io/milvus/issues/43354

The current implementation of stdsort index is not supported for
std::string. Remove the code.

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-16 17:46:51 +08:00
congqixia
c9bc70f272
fix: [AddField] Use shared_ptr of schema in plan fixing dangling ref (#42693)
Related to #42640

The search/query plan holded a reference to schema, which could be
destructed after schema change. This PR make plan hold a shared ptr to
it fixing dangling reference problem under concurrent read & schema
change.

This PR also remove field binlog check for loading index for old segment
with old schema may have binlog lack.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-12 20:46:36 +08:00
cqy123456
5fe7015f63
enhance: InterimIndex support more index type and data type (#41021)
issue: https://github.com/milvus-io/milvus/issues/27678
cherry pick from : https://github.com/milvus-io/milvus/pull/39180,
https://github.com/milvus-io/milvus/pull/40429

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-05-28 08:40:28 +08:00
Xianhui Lin
6a0e182e13
enhance: support TTL expiration with queries returning no results (#42086)
support TTL expiration with queries returning no results
issue:https://github.com/milvus-io/milvus/issues/41959

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-27 18:28:27 +08:00
Buqian Zheng
ff5c2770e5
feat: cachinglayer: various improvements (#41546)
issue: https://github.com/milvus-io/milvus/issues/41435

this PR is based on https://github.com/milvus-io/milvus/pull/41436. 

Improvements include:

- Lazy Load support for Storage v1
- Use Low/High watermark to control eviction
- Caching Layer related config changes
- Removed ChunkCache related configs and code in golang
- Add `PinAllCells` helper method to CacheSlot class
- Modified ValueAt, RawAt, PrimitiveRawAt to Bulk version, to reduce
caching layer overhead
- Removed some unclear templated bulk_subscript methods
- CachedSearchIterator to store PinWrapper when searching on
ChunkedColumn, and removed unused contrustor.

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-05-10 09:19:16 +08:00
zhagnlu
39e7ad33d7
enhance: add optimize for like expr (#41066)
#41065

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-05-08 14:28:52 +08:00
foxspy
e2ddbe4962
feat: add cachinglayer to index (#41653)
issue: #41435

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-05-08 10:12:54 +08:00
sthuang
e9442f575d
feat: storage v2 seal segment load (#41567)
storage v2 chunked seal segment loading is based on caching layer. A
cell unit in storage v2 is a parquet row group in remote object storage,
containing all fields. Therefore, each field needs a proxy to do related
one field operations.

<img width="965" alt="Screenshot 2025-04-28 at 10 59 30"
src="https://github.com/user-attachments/assets/83e93a10-3b1d-4066-ac17-b996d5650416"
/>

related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-04-30 14:22:58 +08:00
Buqian Zheng
3de904c7ea
feat: add cachinglayer to sealed segment (#41436)
issue: https://github.com/milvus-io/milvus/issues/41435

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-04-28 10:52:40 +08:00
Xianhui Lin
3bc24c264f
enhance: Add json key inverted index in stats for optimization (#38039)
Add json key inverted index in stats for optimization
https://github.com/milvus-io/milvus/issues/36995

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-04-10 15:20:28 +08:00
Spade A
9f3bd55755
fix: avoid panic when field not exists in schema in query node (#40541)
ref #40473

This PR is a workaround to avoid the panic described in the issue.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-12 22:44:08 +08:00
Cai Yudong
2bd2cca04a
enhance: Truly support multi vector data types in SearchBruteForce (#40499)
Issue: #38666

Signed-off-by: CaiYudong <yudong.cai@zilliz.com>
2025-03-10 18:36:03 +08:00
Spade A
0dc21f0aeb
feat: support random sample (#39532)
issue: #39541

This PR implements random sample, the syntax is:
```
filter="random_sample(factor)"
or 
filter="boolean_expression && random_sample(factor)"

where 
factor is a float between (0, 1) and 
boolean_expression is like
 "1 <= number < 10", "color in ["read, "blue"]" or others
```

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-18 12:40:50 +08:00
congqixia
844df76cc0
enhance: Rectify run_clang_format grep command (#39534)
Previously the grep with regex does not work and failed to match lots of
.cpp files

This PR:
- use "-E" flag to use regex match
- commit the fixed result of current cpp code

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-23 17:07:05 +08:00
Cai Yudong
341d6c1eb7
feat: Update segcore for VECTOR_INT8 (#39415)
Issue: #38666

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2025-01-21 11:03:03 +08:00
Gao
75d7978a18
enhance: pass partition key scalar info if enable for vector mem index (#39123)
issue: #34332

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-01-16 14:33:03 +08:00
Spade A
032292a432
feat: support phrase match query (#38869)
The relevant issue: https://github.com/milvus-io/milvus/issues/38930

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-01-12 20:24:58 +08:00
Gao
f0dae81494
fix: set iterative filter hint to false when no expr specified (#39033)
issue: https://github.com/milvus-io/milvus/issues/39013

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-01-08 12:56:56 +08:00
smellthemoon
907fc24f85
enhance: support null expr (#38772)
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-01-02 14:16:54 +08:00
Patrick Weizhi Xu
85f462be1a
enhance: speed up search iterator stage 1 (#37947)
issue: #37548

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-12-26 10:32:49 +08:00
Gao
363d7f31ef
fix: report error when hints not supported (#38717)
issue: #38705

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-12-25 19:02:56 +08:00
Gao
994fc544e7
enhance: support iterative filter execution (#37363)
issue: #37360

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-12-11 11:32:44 +08:00
cqy123456
8216345b07
enhance: reduce copy of bitset and id conversion of brurtforce search (#37675)
issue: https://github.com/milvus-io/milvus/issues/37798

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-11-19 15:48:40 +08:00
aoiasd
e9391acf80
fix: bm25 brute force search need index params k1 and b (#37721)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-18 15:44:31 +08:00
aoiasd
993051bb49
fix: brute force bm25 search lack avgdl param (#37650)
relate: https://github.com/milvus-io/milvus/issues/35853

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-14 14:58:31 +08:00
Chun Han
2d29dcd30c
enhance:refine group_strict_size parameter(#37482) (#37483)
related: #37482

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-11-12 09:56:28 +08:00
Bingyi Sun
bd04cac4b3
fix: fix group by on chunked segment (#37292)
https://github.com/milvus-io/milvus/issues/37244

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-11-05 17:12:23 +08:00
Yinzuo Jiang
5aad000a93
enhance: [skip ci] remove tools/core_gen after #35251 (#36306)
`tools/core_gen` python scripts are useless after
https://github.com/milvus-io/milvus/pull/35251

fixes: #36305

Signed-off-by: Yinzuo Jiang <yinzuo.jiang@zilliz.com>
Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
2024-11-01 13:08:21 +08:00
zhenshan.cao
63843dce33
fix: Fix conan gdal building problem (#37338)
issue:https://github.com/milvus-io/milvus/issues/27576

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-10-31 21:04:16 +08:00
Hao Tan
67c4340565
feat: Geospatial Data Type and GIS Function Support for milvus server (#35990)
issue:https://github.com/milvus-io/milvus/issues/27576

# Main Goals
1. Create and describe collections with geospatial fields, enabling both
client and server to recognize and process geo fields.
2. Insert geospatial data as payload values in the insert binlog, and
print the values for verification.
3. Load segments containing geospatial data into memory.
4. Ensure query outputs can display geospatial data.
5. Support filtering on GIS functions for geospatial columns.

# Solution
1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing.Check the modification
in pymilvus.

---------

Signed-off-by: tasty-gumi <1021989072@qq.com>
2024-10-31 20:58:20 +08:00
Bingyi Sun
90948e9444
fix: add SearchOnSealed unit test and fix a bug (#37241)
issue: https://github.com/milvus-io/milvus/issues/37244

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-10-30 10:26:19 +08:00
Yinzuo Jiang
3628593d20
feat: Implement custom function module in milvus expr (#36560)
OSPP 2024 project:
https://summer-ospp.ac.cn/org/prodetail/247410235?list=org&navpage=org

Solutions:

- parser (planparserv2)
    - add CallExpr in planparserv2/Plan.g4
    - update parser_visitor and show_visitor
- grpc protobuf
    - add CallExpr in plan.proto
- execution (`core/src/exec`)
- add `CallExpr` `ValueExpr` and `ColumnExpr` (both logical and
physical) for function call and function parameters
- function factory (`core/src/exec/expression/function`)
    - create a global hashmap when starting milvus (see server.go)
- the global hashmap stores function signatures and their function
pointers, the CallExpr in execution engine can get the function pointer
by function signature.
- custom functions
    - empty(string)
    - starts_with(string, string)
- add cpp/go unittests and E2E tests

closes: #36559

Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
2024-10-25 15:25:30 +08:00