2181 Commits

Author SHA1 Message Date
Bingyi Sun
f0446fd9a0
enhance: optimize the performance of binary_search_string (#44469)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-23 10:52:13 +08:00
Tianx
2c0c5ef41e
feat: timestamptz expression & index & timezone (#44080)
issue: https://github.com/milvus-io/milvus/issues/27467

>My plan is as follows.
>- [x] M1 Create collection with timestamptz field
>- [x] M2 Insert timestamptz field data
>- [x] M3 Retrieve timestamptz field data
>- [x] M4 Implement handoff
>- [x] M5 Implement compare operator
>- [x] M6 Implement extract operator
 >- [x] M8 Support database/collection level default timezone
>- [x] M7 Support STL-SORT index for datatype timestamptz

---

The third PR of issue: https://github.com/milvus-io/milvus/issues/27467,
which completes M5, M6, M7, M8 described above.

## M8 Default Timezone

We will be able to use alter_collection() and alter_database() in a
future Python SDK release to modify the default timezone at the
collection or database level.

For insert requests, the timezone will be resolved using the following
order of precedence: String Literal-> Collection Default -> Database
Default.
For retrieval requests, the timezone will be resolved in this order:
Query Parameters -> Collection Default -> Database Default.
In both cases, the final fallback timezone is UTC.


## M5: Comparison Operators

We can now use the following expression format to filter on the
timestamptz field:

- `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op}
ISO 'iso_string' `

- The interval_string follows the ISO 8601 duration format, for example:
P1Y2M3DT1H2M3S.

- The iso_string follows the ISO 8601 timestamp format, for example:
2025-01-03T00:00:00+08:00.

- Example expressions: "tsz + INTERVAL 'P0D' != ISO
'2025-01-03T00:00:00+08:00'" or "tsz != ISO
'2025-01-03T00:00:00+08:00'".

## M6: Extract

We will be able to extract sepecific time filed by kwargs in a future
Python SDK release.
The key is `time_fields`, and value should be one or more of "year,
month, day, hour, minute, second, microsecond", seperated by comma or
space. Then the result of each record would be an array of int64.



## M7: Indexing Support

Expressions without interval arithmetic can be accelerated using an
STL-SORT index. However, expressions that include interval arithmetic
cannot be indexed. This is because the result of an interval calculation
depends on the specific timestamp value. For example, adding one month
to a date in February results in a different number of added days than
adding one month to a date in March.

--- 

After this PR, the input / output type of timestamptz would be iso
string. Timestampz would be stored as timestamptz data, which is int64_t
finally.

> for more information, see https://en.wikipedia.org/wiki/ISO_8601

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-09-23 10:24:12 +08:00
Gao
539f17f1ad
enhance: tiered index updates (#44433)
issue: #42032 #44212 

- special case for warmup param and cell storage size for tiered index
- add a config to enable/disable storage usage tracking

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-09-22 21:34:11 +08:00
Buqian Zheng
75557f3eb8
enhance: Use std::shared_lock and std::unique_lock for mutexes (#44459)
issue: https://github.com/milvus-io/milvus/issues/44452

Signed-off-by: zhengbuqian <zhengbuqian@gmail.com>
Co-authored-by: buqian.zheng <buqian.zheng@zilliz.com>
2025-09-22 18:02:09 +08:00
Buqian Zheng
846cf52a95
enhance: Remove unused vector plan node subclasses (#44453)
Remove redundant `VectorPlanNode` subclasses and simplify the visitor
pattern by consolidating to a single `VectorPlanNode`.

The previous design used distinct `VectorPlanNode` subclasses and a
templated `VectorVisitorImpl` for type-directed dispatch. However, the
template parameter was not functionally used to implement different
logic for each vector type, making the subclasses redundant for their
intended purpose.

This PR is created by Cursor Agent and manually moved from
https://github.com/zhengbuqian/milvus/pull/14.

Signed-off-by: zhengbuqian <zhengbuqian@gmail.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: buqian.zheng <buqian.zheng@zilliz.com>
2025-09-22 18:00:27 +08:00
sthuang
edd250ffef
fix: [StorageV2] force virtual host for oss and cos (#44484)
related: #44481

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-09-22 16:58:11 +08:00
sparknack
ab64afba2f
enhance: add storage resource usage for scalar search (#44414)
issue: #44212

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-09-22 14:28:06 +08:00
Gao
d3784c6515
enhance: add storage resource usage for vector search (#44308)
issue: #44212 

Implement search/query storage usage statistics in go side(result
reduce), only record storage usage in vector search C++ path. Need to be
implemented in query c++ path in next prs.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>
2025-09-19 20:20:02 +08:00
sangheee
bed94fc061
feat: support grpc tokenizer (#41994)
relate: https://github.com/milvus-io/milvus/issues/41035

This PR adds support for a gRPC-based tokenizer.

- The protobuf definition was added in
[milvus-proto#445](https://github.com/milvus-io/milvus-proto/pull/445).
- Based on this, the corresponding Rust client code was generated and
added under `tantivi-binding`.
  - The generated file is `milvus.proto.tokenizer.rs`.

I'm not very experienced with Rust, so there might be parts of the code
that could be improved.
I’d appreciate any suggestions or improvements.

---------

Signed-off-by: park.sanghee <park.sanghee@navercorp.com>
2025-09-19 17:40:01 +08:00
congqixia
b532a3e026
enhance: Move c API unittest aside to src files (#44458)
Related to #43931

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-19 10:30:01 +08:00
congqixia
7b83314bf3
enhance: [StorageV2] Make datanode use non-singleton fs (#44418)
Related to #39173

According to the current design, datanode shall create fs from storage
config in request instead of using singleton fs. This PR upgrade
milvus-storage and make packed reader/writer compose new fs from storage
config.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-18 20:06:00 +08:00
zhagnlu
9b6703626d
fix:fix unescaped bug for json stats (#44421)
#42533

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-17 20:54:01 +08:00
sthuang
2f70a73258
fix: turn on azure by default (#44377)
related: #44354, #44138, #43869

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-09-17 10:12:01 +08:00
congqixia
6f7318a731
enhance: [StorageV2] Use compressed size as log file size (#44402)
Related to #39173

backlog issue that memory size and log size shared same value. This
patch add `GetFileSize` api to get remote compressed binlog size as meta
log file size to calculate usage more accurate.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-16 21:20:02 +08:00
congqixia
98d23de36c
enhance: [StorageV2] Make load info contains child info (#44384)
Related to #44257

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-16 16:14:00 +08:00
congqixia
aa861f55e6
enhance: [StorageV2] Reverts #44232 bucket name change (#44390)
Related to #39173

- Put bucket name concatenation logic back for azure support

This reverts commit 8f97eb355fde6b86cf37f166d2191750b4210ba3.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-16 10:10:00 +08:00
zhagnlu
baa84e0b2b
fix: avoid mvcc when doing pk compare expr (#44353)
#44352

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-15 10:17:59 +08:00
zhagnlu
e9bbb6aa9b
fix: fix json_contains bug for stats (#44325)
#42533

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-15 10:16:07 +08:00
sthuang
b38013352d
enhance: [StorageV2] enable build with azure (#44177)
related: #43869

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-09-14 08:05:58 +08:00
Bingyi Sun
1931dcd9b5
fix: Fix initialize timestamp index concurrently (#44317)
#issue: https://github.com/milvus-io/milvus/issues/44341

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-12 14:25:57 +08:00
sparknack
060fc61e80
fix: milvus-common commits update (#44339)
issue: #41435
related: #44268

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-09-12 12:43:57 +08:00
aoiasd
fb58701cbb
enhance: update rust version (#44322)
relate: https://github.com/milvus-io/milvus/issues/44321

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-12 10:53:57 +08:00
zhagnlu
16e6b6aa8a
fix:fix build json stats bug for nested object (#44303)
issue: #44132

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-11 14:13:56 +08:00
cqy123456
f5c6138793
enhance: update knowhere version (#44294)
issue: https://github.com/milvus-io/milvus/issues/42937

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-09-11 11:21:56 +08:00
sparknack
e821468d2a
fix: milvus-common commit update (#44304)
issue: #41435 
related: #44268

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-09-11 10:19:56 +08:00
zhagnlu
77f7d19400
fix:avoid mmap rewrite by multi json fields (#44299)
issue: #44127

Signed-off-by: zhagnlu <lu.zhang@zilliz.com>
2025-09-11 10:13:57 +08:00
congqixia
f5618d5153
enhance: [StorageV2] Utilized advance split policy and persist in meta (#44282)
Related to #44257

This PR:
- Utilize configurable split policy for storage v2, enabling system
field policy
- Store split result in field binlog struct
- Adapt legacy binlog without child fields

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-10 14:47:57 +08:00
sparknack
4a01c726f3
enhance: cachinglayer: some metric and params update (#44276)
issue: #41435

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-09-10 11:03:57 +08:00
zhagnlu
2f8620fa79
fix: fix like failed and add max columns limit (#44233)
#44137

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-10 10:33:57 +08:00
Spade A
45adf2d426
fix: load resource considers ngram index (#44237)
fix https://github.com/milvus-io/milvus/issues/44236

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-10 10:27:56 +08:00
Chun Han
26a024625d
feat: support search by on json field and dynamic field(#43124) (#43203)
related: #43124

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-09-09 21:51:56 +08:00
sthuang
dfc2335144
enhance: [StorageV2] storage file system error messages (#44255)
related: https://github.com/milvus-io/milvus/issues/44138

bump milvus storage version, include the followings:
* https://github.com/milvus-io/milvus-storage/pull/243
* https://github.com/milvus-io/milvus-storage/pull/240
* https://github.com/milvus-io/milvus-storage/pull/245

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-09-09 19:37:56 +08:00
Spade A
575d490af6
fix: ngram index is mistakenly used for unsopported operations 2 (#44142)
issue: https://github.com/milvus-io/milvus/issues/44020
https://github.com/milvus-io/milvus/pull/43955 only fixed unary
expression
This fixes all expressions and add more tests.

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-09 19:05:56 +08:00
Buqian Zheng
dae0fd0e90
enhance: removed unused map_c (#44183)
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-09-09 16:46:04 +08:00
aoiasd
92fedb8280
enhance: forbid panic when tantivy index path not exist (#44135)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-08 15:21:56 +08:00
Buqian Zheng
9bf2b5c10c
enhance: moved more segcore unit test files (#44210)
issue: https://github.com/milvus-io/milvus/issues/43931

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-09-08 10:21:55 +08:00
congqixia
8f97eb355f
enhance: [StorageV2] Make bucket name concatenation transparent to user (#44232)
Related to #39173

This PR:
- Bump milvus-storage commit to handle bucket name concatenation logic
in multipart s3 fs
- Remove all user-side bucket name concatenation code

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-08 10:15:55 +08:00
Spade A
ba4cd68edb
fix: adjust params to make CPP UT run faster (#44223)
fix: https://github.com/milvus-io/milvus/issues/44224

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-06 14:13:54 +08:00
aoiasd
c71b47b52c
enhance: add internal core latency metric for rescore node (#44010)
For fetching latency of boost.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-05 17:37:54 +08:00
cqy123456
1d4d721859
test: Reduce the run time of interim index cpp ut (#44200)
issue: https://github.com/milvus-io/milvus/issues/44176

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-09-05 16:45:53 +08:00
zhagnlu
d67f1ea0ab
enhance: add param to modify dump snapshot batch size (#44215)
issue: #44216

Signed-off-by: luzhang <luzhang@zilliz.com>
2025-09-05 14:29:54 +08:00
Gao
2e98cb0103
enhance: load resource estimation for tiered index (#44171)
issue: https://github.com/milvus-io/milvus/issues/42032

- Use bytes to estimate load resource in the whole estimation procedure
- Add num_rows and dim info for vector index to better estimate
- Disable eviction for tiered index's meta

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-09-04 19:41:53 +08:00
Buqian Zheng
b76bf13fc3
enhance: move c++ unit test file to aside of the production code (#43932)
issue: https://github.com/milvus-io/milvus/issues/43931

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-09-03 23:45:53 +08:00
Spade A
825a134739
feat: impl StructArray -- reject json types for struct (#44190)
issue: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-03 19:33:53 +08:00
Spade A
7cb15ef141
feat: impl StructArray -- optimize vector array serialization (#44035)
issue: https://github.com/milvus-io/milvus/issues/42148

Optimized from
Go VectorArray → VectorArray Proto → Binary → C++ VectorArray Proto →
C++ VectorArray local impl → Memory
to
Go VectorArray → Arrow ListArray  → Memory

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-03 16:39:53 +08:00
Buqian Zheng
ad16441aa0
enhance: removed unused VectorFunction (#44178)
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-09-03 14:37:53 +08:00
foxspy
d55bf49bf1
enhance: update knowhere version (#44144)
issue: #42937

---------

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-09-03 01:31:53 +08:00
Bingyi Sun
0c0630cc38
feat: support dropping index without releasing collection (#42941)
issue: #42942

This pr includes the following changes:
1. Added checks for index checker in querycoord to generate drop index
tasks
2. Added drop index interface to querynode
3. To avoid search failure after dropping the index, the querynode
allows the use of lazy mode (warmup=disable) to load raw data even when
indexes contain raw data.
4. In segcore, loading the index no longer deletes raw data; instead, it
evicts it.
5. In expr, the index is pinned to prevent concurrent errors.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-02 16:17:52 +08:00
congqixia
aa4ef9c996
feat: Support enabling dynamic schema on existing collection (#44151)
Related to #44150

This PR make enabling `dynamic schema` feature for an existing
collection possible.

This related API is to reuse `AlterCollection` and underhood its
redirected to `adding nullable json field`

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-02 15:51:52 +08:00
Bingyi Sun
c420e7bd27
enhance: align the behavior of exist expr between brute force and index (#44030)
https://github.com/milvus-io/milvus/issues/44031

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-01 15:03:52 +08:00