174 Commits

Author SHA1 Message Date
ZhuXi
cd931a0388
feat: Geospatial Data Type and GIS Function support for Milvus (#43661)
issue: #43427
pr: #37417

The main goal of this PR is to merge #37417 into Milvus 2.5 without conflicts.

# Main Goals

1. Create and describe collections with geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search to display geospatial data
5. Support using GIS functions like ST_EQUALS in query

# Solution

1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions. Currently only brute-force search is
supported.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing. See the corresponding
changes in pymilvus.
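
The WKT-to-WKB conversion described in step 4 can be illustrated with a minimal sketch. It is not the proxy's actual code (the PR relies on go-geom for that); it handles only a 2D `POINT`, and the helper names `parsePointWKT` and `pointToWKB` are hypothetical. It assumes the standard little-endian WKB layout: one byte-order byte, a uint32 geometry type (1 for Point), then X and Y as float64.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parsePointWKT parses a 2D "POINT (x y)" string. Only this one geometry
// type is handled here; real WKT has many more forms.
func parsePointWKT(wkt string) (x, y float64, err error) {
	s := strings.TrimSpace(wkt)
	if !strings.HasPrefix(strings.ToUpper(s), "POINT") {
		return 0, 0, fmt.Errorf("not a POINT: %q", wkt)
	}
	s = strings.TrimSpace(s[len("POINT"):])
	if !strings.HasPrefix(s, "(") || !strings.HasSuffix(s, ")") {
		return 0, 0, fmt.Errorf("malformed coordinates: %q", wkt)
	}
	fields := strings.Fields(s[1 : len(s)-1])
	if len(fields) != 2 {
		return 0, 0, fmt.Errorf("expected 2 coordinates, got %d", len(fields))
	}
	if x, err = strconv.ParseFloat(fields[0], 64); err != nil {
		return 0, 0, err
	}
	if y, err = strconv.ParseFloat(fields[1], 64); err != nil {
		return 0, 0, err
	}
	return x, y, nil
}

// pointToWKB encodes a 2D point as 21 bytes of little-endian WKB.
func pointToWKB(x, y float64) []byte {
	out := make([]byte, 21)
	out[0] = 1                                 // byte order: 1 = little endian
	binary.LittleEndian.PutUint32(out[1:5], 1) // geometry type: 1 = Point
	binary.LittleEndian.PutUint64(out[5:13], math.Float64bits(x))
	binary.LittleEndian.PutUint64(out[13:21], math.Float64bits(y))
	return out
}

func main() {
	x, y, err := parsePointWKT("POINT (1 2)")
	if err != nil {
		panic(err)
	}
	// Prints 0101000000000000000000f03f0000000000000040
	fmt.Println(hex.EncodeToString(pointToWKB(x, y)))
}
```

The binary form is fixed-width and avoids re-parsing text downstream, which is one plausible reason to convert once at the proxy.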

---------

Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: cai.zhang <cai.zhang@zilliz.com>
2025-08-26 19:11:55 +08:00
congqixia
b1bbc56b54
fix: [2.5] Use task timestamp to calculate TTL timestamp (#42944)
Cherry-pick from master
pr: #42920
Related to #42918

Previously, `CollectionTtlTimestamp` could overflow when
guarantee_ts==1, which is the value used for the `Eventually`
consistency level.

This patch uses the task timestamp, allocated by the scheduler, to
generate the TTL timestamp, so a very small guarantee timestamp no
longer affects the calculation.

It also adds an overflow check for the calculated TTL timestamp.
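
A minimal sketch of the failure mode and the check, not the patched code itself. It assumes a hybrid timestamp with physical milliseconds in the high bits and 18 logical bits in the low bits; `composeTS`, `naiveCutoff`, and `safeCutoff` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// logicalBits is an assumption about the hybrid timestamp layout.
const logicalBits = 18

// composeTS packs physical milliseconds and a logical counter into one uint64.
func composeTS(physicalMs, logical int64) uint64 {
	return uint64(physicalMs)<<logicalBits | uint64(logical)
}

// naiveCutoff subtracts the TTL directly. Unsigned subtraction wraps around,
// so a tiny ts (such as 1) yields a huge, meaningless cutoff.
func naiveCutoff(ts uint64, ttl time.Duration) uint64 {
	return ts - composeTS(ttl.Milliseconds(), 0)
}

// safeCutoff reports ok=false instead of wrapping when ts is smaller than the TTL.
func safeCutoff(ts uint64, ttl time.Duration) (cutoff uint64, ok bool) {
	ttlTicks := composeTS(ttl.Milliseconds(), 0)
	if ttlTicks > ts {
		return 0, false
	}
	return ts - ttlTicks, true
}

func main() {
	ttl := time.Hour
	guaranteeTS := uint64(1) // value used for Eventually consistency
	taskTS := composeTS(time.Now().UnixMilli(), 0)

	fmt.Println("naive cutoff from guarantee ts:", naiveCutoff(guaranteeTS, ttl))
	if _, ok := safeCutoff(guaranteeTS, ttl); !ok {
		fmt.Println("guarantee ts too small, cutoff rejected")
	}
	if cutoff, ok := safeCutoff(taskTS, ttl); ok {
		fmt.Println("cutoff from task ts:", cutoff)
	}
}
```

Using a timestamp that reflects real wall-clock time, as the task timestamp does, keeps the subtraction in range; the explicit check guards the remaining edge cases.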

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-25 20:56:42 +08:00
Xianhui Lin
0490344442
fix: support TTL expiration with queries returning no results (#42103)
support TTL expiration with queries returning no results
issue: https://github.com/milvus-io/milvus/issues/41959
pr: https://github.com/milvus-io/milvus/pull/42086

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-27 15:18:28 +08:00
Xianhui Lin
0574fc7b7b
enhance: support TTL expiration with queries returning no results (#41960)
support TTL expiration with queries returning no results
issue: https://github.com/milvus-io/milvus/issues/41959
pr: https://github.com/milvus-io/milvus/pull/41720

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-26 15:52:28 +08:00
aoiasd
daf745ffa3
fix: [2.5] hybrid search sub-request does not set analyzer name (#41897)
relate: https://github.com/milvus-io/milvus/issues/41213
pr: https://github.com/milvus-io/milvus/pull/41896

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-16 17:58:22 +08:00
aoiasd
544493e3e2
feat: [2.5] support multi analyzer for bm25 function (#41456)
relate: https://github.com/milvus-io/milvus/issues/41213
pr: https://github.com/milvus-io/milvus/pull/41351

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-23 20:52:39 +08:00
congqixia
c5f87c1b6d
fix: [2.5] Avoid update original search/query request (#41127)
Cherry-pick from master
pr: #41126
Related to #41034

A recent PR, #40842, introduced logic to avoid re-querying the pk column.
That logic updates the original request, so the request is no longer
equivalent to the one the user sent.

When a retry happens due to an incomplete request error, the mutated
request makes the final result set lack the pk column even when the user
specifies it explicitly.
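
The general remedy, deriving a modified copy and leaving the caller's request untouched, can be sketched as follows. The `queryRequest` type and `dropField` helper are hypothetical stand-ins, not the proxy's real types.

```go
package main

import "fmt"

// queryRequest is a hypothetical stand-in for a search/query request.
type queryRequest struct {
	OutputFields []string
}

// dropField returns a copy of req without the given output field.
// It builds a fresh slice, so req.OutputFields is never modified.
func dropField(req queryRequest, field string) queryRequest {
	kept := make([]string, 0, len(req.OutputFields))
	for _, f := range req.OutputFields {
		if f != field {
			kept = append(kept, f)
		}
	}
	return queryRequest{OutputFields: kept}
}

func main() {
	original := queryRequest{OutputFields: []string{"pk", "title"}}
	internal := dropField(original, "pk")

	// A retry can safely start again from the original request.
	fmt.Println(original.OutputFields) // [pk title]
	fmt.Println(internal.OutputFields) // [title]
}
```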

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-07 22:13:29 +08:00
congqixia
618054d836
enhance: [2.5] Use %v for missing id log (#41036) (#41090)
Cherry-pick from master
pr: #41036
The error message `incomplete query result, missing id %!s(int64=348),
len(searchIDs) = 10, len(queryIDs) = 9` is formatted incorrectly when
the missing id is an int64.
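
The `%!s(int64=348)` text is what Go's `fmt` package emits when the `%s` verb is applied to an integer. A small reproduction, with hypothetical helper names and a shortened message:

```go
package main

import "fmt"

// withS formats the id with %s, which is only valid for strings and similar types.
func withS(id interface{}) string {
	return fmt.Sprintf("missing id %s", id)
}

// withV formats the id with %v, which works for any type.
func withV(id interface{}) string {
	return fmt.Sprintf("missing id %v", id)
}

func main() {
	fmt.Println(withS(int64(348))) // missing id %!s(int64=348)
	fmt.Println(withV(int64(348))) // missing id 348
	fmt.Println(withV("abc"))      // missing id abc
}
```

Since primary keys can be either int64 or string, `%v` is the verb that renders both cleanly.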

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-03 18:52:23 +08:00
Buqian Zheng
8504a7b98d
enhance: [2.5] avoid re-query if hybrid search requested only pk as output field (#40906)
pr: https://github.com/milvus-io/milvus/pull/40842
issue: https://github.com/milvus-io/milvus/issues/40833

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-03-28 14:50:26 +08:00
Buqian Zheng
cff0e82f57
enhance: [2.5] improve sparse query nnz metric (#40714)
add query type and field id label; add metric for hybrid search

issue: https://github.com/milvus-io/milvus/issues/35853
pr: https://github.com/milvus-io/milvus/pull/40713

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-03-20 14:14:21 +08:00
congqixia
709594f158
enhance: [2.5] Use v2 package name for pkg module (#40117)
Cherry-pick from master
pr: #39990
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-23 00:46:01 +08:00
Zhen Ye
56c1a8d462
fix: ignore growing option is lost at hybrid search (#39900)
issue: #39892
pr: #39799

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-17 15:30:15 +08:00
Xianhui Lin
f0964f769d
enhance: [2.5] Add json key inverted index in stats for optimization (#39876)
Add json key inverted index in stats for optimization
issue: https://github.com/milvus-io/milvus/issues/36995
pr: https://github.com/milvus-io/milvus/pull/38039

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-02-16 20:12:15 +08:00
cai.zhang
667c84740c
enhance: [2.5] Add metrics for parse expression (#39716)
master pr: #39654

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-02-12 10:30:46 +08:00
Gao
dd44a58381
enhance: [2.5] pass partition key scalar info if enable for vector mem index (#39245)
issue: #34332 
pr: #39123

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-01-15 21:45:01 +08:00
Chun Han
4c91e05a5d
enhance: fix inconsistency of alias and db for query iterator (#39045) (#39248)
related: #39045
pr: https://github.com/milvus-io/milvus/pull/39216

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-01-15 10:36:59 +08:00
zhenshan.cao
99a8274326
enhance: Add primary field name in SearchResult and QueryResults (#39222)
pr: https://github.com/milvus-io/milvus/pull/39220
issue: https://github.com/milvus-io/milvus/issues/39219

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-01-14 16:10:58 +08:00
Zhen Ye
95809ca767
enhance: make new go package to manage proto (#39128)
issue: #39095
pr: #39114

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:53:01 +08:00
Patrick Weizhi Xu
ef400227ad
enhance: [2.5][cp] speed up search iterator stage 1 (#38678)
pr: https://github.com/milvus-io/milvus/pull/37947
issue: https://github.com/milvus-io/milvus/issues/37548

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
(cherry picked from commit 9016c4adcd765c0766b01e7e5d465c915e176a6f)
2024-12-27 18:48:52 +08:00
tinswzy
27229f7907
enhance: refine existing log print with ctx (#38080)
issue: #35917
Refines existing log print with ctx

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-14 22:36:44 +08:00
Buqian Zheng
75e64b993f
enhance: add metrics for counting number of non-zeros/tokens of sparse/FTS search (#38329)
Sparse vectors may have an arbitrary number of non-zeros, and it is hard
to optimize without knowing the actual distribution of nnz. This PR adds
a metric for analyzing that.

issue: https://github.com/milvus-io/milvus/issues/35853

Compared with https://github.com/milvus-io/milvus/pull/38328, this
also includes a metric for FTS in the query node delegator.

Also fixes a bug in sparse search when searching by pk.

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-12-12 16:22:43 +08:00
Gao
8977454311
enhance: support recall estimation (#38017)
issue: #37899 
Only `search` api will be supported

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-12-11 20:40:48 +08:00
jaime
1e8ea4a7e7
feat: add segment/channel/task/slow query render (#37561)
issue: #36621

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-11-12 17:44:29 +08:00
aoiasd
d67853fa89
feat: Tokenizer support build with params and clone for concurrency (#37048)
relate: https://github.com/milvus-io/milvus/issues/35853
https://github.com/milvus-io/milvus/issues/36751

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-06 17:48:24 +08:00
zhenshan.cao
63843dce33
fix: Fix conan gdal building problem (#37338)
issue: https://github.com/milvus-io/milvus/issues/27576

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-10-31 21:04:16 +08:00
Hao Tan
67c4340565
feat: Geospatial Data Type and GIS Function Support for milvus server (#35990)
issue: https://github.com/milvus-io/milvus/issues/27576

# Main Goals
1. Create and describe collections with geospatial fields, enabling both
client and server to recognize and process geo fields.
2. Insert geospatial data as payload values in the insert binlog, and
print the values for verification.
3. Load segments containing geospatial data into memory.
4. Ensure query outputs can display geospatial data.
5. Support filtering on GIS functions for geospatial columns.

# Solution
1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing. See the corresponding
changes in pymilvus.

---------

Signed-off-by: tasty-gumi <1021989072@qq.com>
2024-10-31 20:58:20 +08:00
cai.zhang
2ef6cbbf59
feat: The expression supports filling elements through templates (#37033)
issue: #36672

The expression supports filling elements through templates, which helps
to reduce the overhead of parsing the elements.
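
As a rough sketch of why templates reduce parsing work: the expression text is parsed once with a placeholder, and the (possibly very large) value list is attached afterwards without ever being serialized into or parsed out of the expression string. The `{name}` placeholder syntax, the `termTemplate` and `boundTerm` types, and the restriction to `field in {param}` are all assumptions made for this illustration, not the actual parser.

```go
package main

import (
	"fmt"
	"strings"
)

// termTemplate is a parsed "field in {param}" expression with no values yet.
type termTemplate struct {
	Field string
	Param string
}

// boundTerm is a template with its values attached.
type boundTerm struct {
	Field  string
	Values []int64
}

// parseTermTemplate accepts only the form "<field> in {<param>}".
func parseTermTemplate(expr string) (termTemplate, error) {
	parts := strings.Fields(expr)
	if len(parts) != 3 || parts[1] != "in" {
		return termTemplate{}, fmt.Errorf("unsupported expression: %q", expr)
	}
	p := parts[2]
	if len(p) < 3 || p[0] != '{' || p[len(p)-1] != '}' {
		return termTemplate{}, fmt.Errorf("expected a {placeholder}, got %q", p)
	}
	return termTemplate{Field: parts[0], Param: p[1 : len(p)-1]}, nil
}

// bind attaches the values directly; the elements are never parsed as text.
func (t termTemplate) bind(params map[string][]int64) (boundTerm, error) {
	vals, ok := params[t.Param]
	if !ok {
		return boundTerm{}, fmt.Errorf("missing template value for %q", t.Param)
	}
	return boundTerm{Field: t.Field, Values: vals}, nil
}

func main() {
	tmpl, err := parseTermTemplate("id in {ids}")
	if err != nil {
		panic(err)
	}
	bound, err := tmpl.bind(map[string][]int64{"ids": {1, 2, 3}})
	if err != nil {
		panic(err)
	}
	fmt.Println(bound.Field, len(bound.Values)) // id 3
}
```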

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-10-31 14:20:22 +08:00
Patrick Weizhi Xu
43ad9af529
fix: use max MvccTs for iterator (#37247)
issue: #37158

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-10-30 13:58:20 +08:00
Patrick Weizhi Xu
fc69df44a1
fix: set guarantee ts for search/query iterator (#37180)
issue: #37158

Return the GuaranteeTS so that the subsequent requests follow the
correct TS.

BeginTS is the current timestamp when the task is created.
The GuaranteeTS is parsed from both the consistency level and
BeginTS, in PreExecute of the task on Proxy.
The delegator will wait until GuaranteeTS is met.
In PostExecute of the task on Proxy, the TS of the first iterator
request is returned to the SDK, which adds it to the subsequent
requests.
Hence, if the default consistency level is Eventually or Bounded, the
order of TS will be
> GuaranteeTS < BeginTS

If it returns the BeginTS, the second request will need to catch up,
resulting in up to 200ms of extra latency, which looks something like

| Call | Latency |
| --- | --- |
| first call on `Next()` | 30ms |
| second call on `Next()` | 210ms |
| third call on `Next()` | 10ms |
| fourth call on `Next()` | 11ms |
| ... | ... |

where we expect

| Call | Latency |
| --- | --- |
| first call on `Next()` | 30ms |
| second call on `Next()` | 10ms |
| third call on `Next()` | 10ms |
| fourth call on `Next()` | 11ms |
| ... | ... |
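
The latency difference can be modeled with a toy calculation. This is only a sketch under simplifying assumptions: timestamps are plain milliseconds, the delegator's serviceable time lags BeginTS by a fixed 200ms, and `waitMs` is a hypothetical helper rather than anything in the codebase.

```go
package main

import "fmt"

// waitMs models how long a delegator would block before serving a request
// that must observe data up to targetMs, given that it has only caught up
// to serviceableMs.
func waitMs(targetMs, serviceableMs int64) int64 {
	if targetMs <= serviceableMs {
		return 0
	}
	return targetMs - serviceableMs
}

func main() {
	const lagMs = 200 // assumed lag between BeginTS and the serviceable time

	beginTS := int64(10000)
	guaranteeTS := beginTS - lagMs // Bounded-style: GuaranteeTS < BeginTS
	serviceable := beginTS - lagMs

	// Second call reuses the GuaranteeTS returned by the first call: no wait.
	fmt.Println(waitMs(guaranteeTS, serviceable)) // 0
	// Second call reuses BeginTS instead: it must wait for the delegator to catch up.
	fmt.Println(waitMs(beginTS, serviceable)) // 200
}
```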

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-10-28 15:57:35 +08:00
Gao
1d61b604e1
enhance: support retry search when topk is reduced and result not enough (#35645)
issue: #35576 

This PR covers cases where queryHook optimizes the search params and
makes the result size insufficient. It adds a retry search mechanism
and related metrics for alerting.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-10-23 19:19:30 +08:00
Chun Han
903450f5c6
enhance: add ts support for iterator(#22718) (#36572)
related: #22718

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-10-16 18:51:23 +08:00
Buqian Zheng
16b533cbf0
feat: Restful support for BM25 function (#36713)
issue: https://github.com/milvus-io/milvus/issues/35853

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-13 17:41:21 +08:00
aoiasd
db34572c56
feat: support load and query with bm25 metric (#36071)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-11 10:23:20 +08:00
Chun Han
eb23e23cd2
enhance: refine parameter relationship for hybridsearch_group_by(#35096) (#36289)
related: #35096

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-09-20 14:55:11 +08:00
zhenshan.cao
9d8d332c88
fix: Fix improper use of offset in HybridSearch (#36244)
issue: https://github.com/milvus-io/milvus/issues/36243

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-09-13 22:05:15 +08:00
Ted Xu
e7ea1d7a04
enhance: improve log encoding performance on proxy nodes (#36123)
See #36122

This PR is designed to enhance log performance through two improvements:

1. Optimize JSON encoding by switching the JSON serializer to
`json-iterator`.
2. Add support for lazy initialization via `WithLazy`.

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-09-11 14:51:07 +08:00
Chun Han
e480b103bd
feat: support hybrid search group_by (#35982)
related: #35096

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-09-08 17:09:04 +08:00
Chun Han
bfd9d86fe9
feat: support groupby size on go-layer(#33544) (#33845)
related: #33544

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-08-27 14:21:00 +08:00
zhagnlu
3107701fe8
enhance: optimize retrieve on dynamic field (#35580)
#35514

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
Co-authored-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-22 14:24:56 +08:00
Patrick Weizhi Xu
3c7f73137e
fix: disallow expr when partition key isolation is enabled (#35031)
issue: #34336 

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
(cherry picked from commit 2889481ce9a14230e9c7f1c8f9c3c1decde77e03)
2024-07-29 14:21:50 +08:00
wei liu
c45f38aa61
enhance: Update protobuf-go to protobuf-go v2 (#34394)
issue: #34252

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-29 11:31:51 +08:00
congqixia
e2f40fc2a8
fix: Check legacy guarantee ts when skipping alloc ts (#34981)
See also #34980

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-25 10:17:45 +08:00
Jiquan Long
a2ac84bd64
feat: record the duration waiting in the proxy queue (#34744)
fix: https://github.com/milvus-io/milvus/issues/34743

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-07-23 14:23:52 +08:00
Patrick Weizhi Xu
104d0966b7
feat: support partition key isolation (#34336)
issue: #34332

---------

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-07-11 19:01:35 +08:00
aoiasd
186757e622
enhance: support mark error as user error (#33498)
relate: https://github.com/milvus-io/milvus/issues/33492

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-07-01 14:56:12 +08:00
Jiquan Long
1f58cda957
enhance: add more trace for search & query (#32734)
issue: https://github.com/milvus-io/milvus/issues/32728

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-07 13:03:29 +08:00
smellthemoon
365e50b63e
fix: revert add range search params check in proxy (#32366)
no need to check params in empty segment.
#30365

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-04-23 17:41:23 +08:00
SimFG
8594b55ad5
enhance: add max insert request size and must use partition key configs (#32433)
issue: https://github.com/milvus-io/milvus/issues/30577
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-04-19 10:31:20 +08:00
zhenshan.cao
02f17b842a
fix: fix incomplete hybrid search result when nq > 1 (#32177)
issue: https://github.com/milvus-io/milvus/issues/25639

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-04-17 17:09:32 +08:00
zhenshan.cao
88c6828d6c
fix: failed to raise metric_type not match error (#32202)
issue: https://github.com/milvus-io/milvus/issues/32176

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-04-12 16:19:18 +08:00