201 Commits

Author SHA1 Message Date
zhenshan.cao
6327c9a514
fix: Fix bugs related to TimestampTz (#45111)
issue: https://github.com/milvus-io/milvus/issues/44527
https://github.com/milvus-io/milvus/issues/44537
https://github.com/milvus-io/milvus/issues/44538
https://github.com/milvus-io/milvus/issues/44585
https://github.com/milvus-io/milvus/issues/44622

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-04 16:51:33 +08:00
congqixia
6c34386ff2
enhance: extract shard client logic into dedicated package (#45018)
Related to #44761

Refactor proxy shard client management by creating a new
internal/proxy/shardclient package. This improves code organization and
modularity by:

- Moving load balancing logic (LookAsideBalancer, RoundRobinBalancer) to
shardclient package
- Extracting shard client manager and related interfaces into separate
package
- Relocating shard leader management and client lifecycle code
- Adding package documentation (README.md, OWNERS)
- Updating proxy code to use the new shardclient package interfaces

This change makes the shard client functionality more maintainable and
better encapsulated, reducing coupling in the proxy layer.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-22 10:22:04 +08:00
Bingyi Sun
633cae9461
enhance: add namespace for query and search request (#44343)
issue: #44011

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-10-16 17:52:01 +08:00
congqixia
efbe6669ec
enhance: Make accesslog.$consistency_level represent actual value used (#44706)
Related to #44703

This PR:
- Add `SetActualConsistencyLevel` to `info.AccessInfo` interface and
related util method processing it
- Make `$consistency_level` returning actual value if set

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-09 21:03:58 +08:00
cai.zhang
19346fa389
feat: Geospatial Data Type and GIS Function support for milvus (#44547)
issue: #43427

This pr's main goal is merge #37417 to milvus 2.5 without conflicts.

# Main Goals

1. Create and describe collections with geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search can display  geospatial data
5. Support using GIS funtions like ST_EQUALS in query
6. Support R-Tree index for geometry type

# Solution

1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions.Now only support brutal search
7. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing.Check the modification
in pymilvus.

---------

Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>
2025-09-28 19:43:05 +08:00
congqixia
99598ae5ec
enhance: Add param item for hybrid search requery policy (#44466)
Related to #39757

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-24 17:32:04 +08:00
junjiejiangjjj
f07979f91d
enhance: add support for controlling function output field insertion (#44162)
#44053

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-09-24 17:26:04 +08:00
Tianx
2c0c5ef41e
feat: timestamptz expression & index & timezone (#44080)
issue: https://github.com/milvus-io/milvus/issues/27467

>My plan is as follows.
>- [x] M1 Create collection with timestamptz field
>- [x] M2 Insert timestamptz field data
>- [x] M3 Retrieve timestamptz field data
>- [x] M4 Implement handoff
>- [x] M5 Implement compare operator
>- [x] M6 Implement extract operator
 >- [x] M8 Support database/collection level default timezone
>- [x] M7 Support STL-SORT index for datatype timestamptz

---

The third PR of issue: https://github.com/milvus-io/milvus/issues/27467,
which completes M5, M6, M7, M8 described above.

## M8 Default Timezone

We will be able to use alter_collection() and alter_database() in a
future Python SDK release to modify the default timezone at the
collection or database level.

For insert requests, the timezone will be resolved using the following
order of precedence: String Literal-> Collection Default -> Database
Default.
For retrieval requests, the timezone will be resolved in this order:
Query Parameters -> Collection Default -> Database Default.
In both cases, the final fallback timezone is UTC.


## M5: Comparison Operators

We can now use the following expression format to filter on the
timestamptz field:

- `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op}
ISO 'iso_string' `

- The interval_string follows the ISO 8601 duration format, for example:
P1Y2M3DT1H2M3S.

- The iso_string follows the ISO 8601 timestamp format, for example:
2025-01-03T00:00:00+08:00.

- Example expressions: "tsz + INTERVAL 'P0D' != ISO
'2025-01-03T00:00:00+08:00'" or "tsz != ISO
'2025-01-03T00:00:00+08:00'".

## M6: Extract

We will be able to extract sepecific time filed by kwargs in a future
Python SDK release.
The key is `time_fields`, and value should be one or more of "year,
month, day, hour, minute, second, microsecond", seperated by comma or
space. Then the result of each record would be an array of int64.



## M7: Indexing Support

Expressions without interval arithmetic can be accelerated using an
STL-SORT index. However, expressions that include interval arithmetic
cannot be indexed. This is because the result of an interval calculation
depends on the specific timestamp value. For example, adding one month
to a date in February results in a different number of added days than
adding one month to a date in March.

--- 

After this PR, the input / output type of timestamptz would be iso
string. Timestampz would be stored as timestamptz data, which is int64_t
finally.

> for more information, see https://en.wikipedia.org/wiki/ISO_8601

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-09-23 10:24:12 +08:00
Gao
d3784c6515
enhance: add storage resource usage for vector search (#44308)
issue: #44212 

Implement search/query storage usage statistics in go side(result
reduce), only record storage usage in vector search C++ path. Need to be
implemented in query c++ path in next prs.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>
2025-09-19 20:20:02 +08:00
junjiejiangjjj
f3d7e47227
feat: Supports more rerankers (#43270)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjiejiangjjj <junjie.jiang@zilliz.com>
2025-08-22 17:29:47 +08:00
Spade A
d6a428e880
feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726)
Ref https://github.com/milvus-io/milvus/issues/42148

This PR supports create index for vector array (now, only for
`DataType.FLOAT_VECTOR`) and search on it.
The index type supported in this PR is `EMB_LIST_HNSW` and the metric
type is `MAX_SIM` only.

The way to use it:
```python
milvus_client = MilvusClient("xxx:19530")
schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True)
...
struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field")
...
struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000)
...
schema.add_struct_array_field(struct_schema)
index_params = milvus_client.prepare_index_params()
index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128})
...
milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params)
```

Note: This PR uses `Lims` to convey offsets of the vector array to
knowhere where vectors of multiple vector arrays are concatenated and we
need offsets to specify which vectors belong to which vector array.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-08-20 10:27:46 +08:00
aoiasd
dcf04a58b8
feat: support use score function on segment search and use filter (#43868)
relate: https://github.com/milvus-io/milvus/issues/43867
Support boost function score, multiply by the weight if match filter.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-19 16:15:45 +08:00
Spade A
faeb7fd410
feat: impl StructArray -- create schema, insert, and retrieve data (#42855)
Ref https://github.com/milvus-io/milvus/issues/42148

https://github.com/milvus-io/milvus/pull/42406 impls the segcore part of
storage for handling with VectorArray.
This PR:
1. impls the go part of storage for VectorArray
2. impls the collection creation with StructArrayField and VectorArray
3. insert and retrieve data from the collection.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
2025-07-27 01:30:55 +08:00
junjiejiangjjj
77f3a1f213
enhance: Add search post pipeline (#43065)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjiejiangjjj <junjie.jiang@zilliz.com>
2025-07-21 11:10:52 +08:00
Bingyi Sun
13f6e2130b
fix: Fix hybrid search return back empty set if one result is emtpy (#43209)
issue: https://github.com/milvus-io/milvus/issues/43160

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-07-10 10:34:55 +08:00
congqixia
942055fa7d
fix: Use task timestamp to calculate TTL timestamp (#42920)
Related to #42918

Previously the `CollectionTtlTimestamp` could be overflowed when the
guarantee_ts==1, which means using `Eventually` consistency level.

This patch use task timestamp, allocated by scheduler, to generate ttl
timestamp ignore the potential very small timestamp being used.

Also add overflow check for ttl timestamp calculated.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-25 20:48:42 +08:00
junjiejiangjjj
f1a4526bac
enhance: refactor rrf and weighted rerank (#42154)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-06-10 18:08:35 +08:00
Ted Xu
35c17523de
feat: limit search result entries (#42522)
See: #42521

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-06-05 12:08:33 +08:00
aoiasd
3a74044149
fix: hybird search sub requset not set analyzer name (#41896)
relate: https://github.com/milvus-io/milvus/issues/41213

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-29 14:56:28 +08:00
junjiejiangjjj
4202c775ba
feat: Support vllm and tei rerank (#41947)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-05-28 19:18:28 +08:00
Xianhui Lin
6a0e182e13
enhance: support TTL expiration with queries returning no results (#42086)
support TTL expiration with queries returning no results
issue:https://github.com/milvus-io/milvus/issues/41959

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-27 18:28:27 +08:00
junjiejiangjjj
0bbbf98a5b
enhance: refactor decay rerank (#41778)
https://github.com/milvus-io/milvus/issues/35856
1. Optimizing decay function
2. Since the decay function is larger, the more similar it is, the
smaller the L2/JACCARD/HAMMING metrics scores the more similar they are.
For these metrics, the decay function regenerates new scores.

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-05-20 15:22:24 +08:00
Ted Xu
ae32203d3a
fix: support group by with nullable grouping keys (#41797)
See #36264

In this PR:
- Enhanced error handling in parse of grouping field.
- Fixed null handling in reduce tasks in proxy nodes. 
- Updated tests to reflect changes in error handling and data processing
logic.

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-05-17 20:54:22 +08:00
junjiejiangjjj
f337d2989b
enhance : New decay function (#41634)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-05-08 17:24:54 +08:00
junjiejiangjjj
bb7df40fc1
feat: Http interface supports rerank (#41486)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-04-28 23:02:50 +08:00
junjiejiangjjj
f23df95a77
feat : Support decay rerank (#41223)
https://github.com/milvus-io/milvus/issues/35856
#41312

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-04-23 20:48:39 +08:00
aoiasd
f52c2909c4
feat: support multi analyzer for bm25 function (#41351)
relate: https://github.com/milvus-io/milvus/issues/41213

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-23 18:22:38 +08:00
congqixia
6f4e0d8e38
enhance: [AddField] Use schema update ts as guarantee ts (#41430)
Related to #39718

Use schema update ts when it's greater than calculated guarantee
timestamp to make sure that all read request using updated schema shall
wait all schema change event processed.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-22 17:12:45 +08:00
Xianhui Lin
f9febe3bae
enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006)
Merge RootCoord, DataCoord And QueryCoord into MixCoord
Make Session into one
issue : https://github.com/milvus-io/milvus/issues/37764

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-11 16:36:30 +08:00
Xianhui Lin
3bc24c264f
enhance: Add json key inverted index in stats for optimization (#38039)
Add json key inverted index in stats for optimization
https://github.com/milvus-io/milvus/issues/36995

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-04-10 15:20:28 +08:00
zhenshan.cao
ecc2d80915
enhance: Add primary field name in SearchResult and QueryResults (#39220)
issue: https://github.com/milvus-io/milvus/issues/39219

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-04-09 10:48:25 +08:00
congqixia
96eca2531f
fix: Avoid update original search/query request (#41126)
Related to #41034

Recent pr #40842 introduced logic to avoid requery pk column, which
updates the original request which makes the request not equavilant to
the original one.

When retry happens due to incomplete request error, this change makes
the final result set lacks the pk column even when user specifies it
explicitly.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-08 14:26:28 +08:00
smellthemoon
cb1e86e17c
enhance: support add field (#39800)
after the pr merged, we can support to insert, upsert, build index,
query, search in the added field.
can only do the above operates in added field after add field request
complete, which is a sync operate.

compact will be supported in the next pr.
#39718

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-04-02 14:24:31 +08:00
congqixia
37cf9a0dc1
enhance: Use %v for missing id log (#41036)
`incomplete query result, missing id %!s(int64=348), len(searchIDs) =
10, len(queryIDs) = 9` error message format with error when missing id
is int64

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-02 11:54:23 +08:00
Buqian Zheng
7a056aff9d
enhance: avoid re-query if hybrid search requested only pk as output field (#40842)
proxy to always remove pk field from output field when forwarding
request to QN, and if user requested pk, fill it from IDs.

issue: https://github.com/milvus-io/milvus/issues/40833

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-03-28 14:32:18 +08:00
Ted Xu
688505ab1c
enhance: cleanup lint check exclusions (#40829)
See: #40828

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-03-21 18:12:14 +08:00
Buqian Zheng
c12abf4e2a
enhance: improve sparse query nnz metric (#40713)
add query type and field id label; add metric for hybrid search

issue: https://github.com/milvus-io/milvus/issues/35853

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-03-18 17:20:16 +08:00
junjiejiangjjj
359e7efd8e
feat: Add function running monitoring (#40358)
#35856 
#40004 
1. Optimize model verification logic
2. Add profiling code

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-03-10 22:28:05 +08:00
cai.zhang
13aff35a83
enhance: Add metrics for parse expression (#39654)
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-02-28 10:07:58 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
Zhen Ye
21724ab52c
enhance: generate guaranteets at delegator if local wal (#39799)
issue: #38399, #39892

- use mvcc timestamp of wal as guaranteets if wal and delegator is
located at same node.
- fix: ignore growing option is lost at hibridsearch

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-17 15:22:15 +08:00
junjiejiangjjj
16cbdfb3b1
feat: Add Text Embedding Function (#36366)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-01-24 14:23:06 +08:00
Gao
75d7978a18
enhance: pass partition key scalar info if enable for vector mem index (#39123)
issue: #34332

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-01-16 14:33:03 +08:00
Chun Han
ed31a5a4bf
enhance: fix inconsistenty of alias and db for query iterator(#39045) (#39216)
related: #39045

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-01-15 09:48:59 +08:00
Zhen Ye
bb8d1ab3bf
enhance: make new go package to manage proto (#39114)
issue: #39095

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:49:01 +08:00
Patrick Weizhi Xu
85f462be1a
enhance: speed up search iterator stage 1 (#37947)
issue: #37548

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-12-26 10:32:49 +08:00
tinswzy
27229f7907
enhance: refine exists log print with ctx (#38080)
issue: #35917 
Refines exists log print with ctx

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-14 22:36:44 +08:00
Buqian Zheng
75e64b993f
enhance: add metrics for counting number of nun-zeros/tokens of sparse/FTS search (#38329)
sparse vectors may have arbitrary number of non zeros and it is hard to
optimize without knowing the actual distribution of nnz. this PR adds a
metric for analyzing that.

issue: https://github.com/milvus-io/milvus/issues/35853

comparing with https://github.com/milvus-io/milvus/pull/38328, this
includes also metric for FTS in query node delegator

also fixed a bug of sparse when searching by pk

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-12-12 16:22:43 +08:00
Gao
8977454311
enhance: support recall estimation (#38017)
issue: #37899 
Only `search` api will be supported

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-12-11 20:40:48 +08:00
jaime
1e8ea4a7e7
feat: add segment/channel/task/slow query render (#37561)
issue: #36621

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-11-12 17:44:29 +08:00