1601 Commits

Author SHA1 Message Date
Buqian Zheng
3140bd0ca6
enhance: enable default json stats (#44810)
issue: https://github.com/milvus-io/milvus/issues/44132

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-10-13 21:29:59 +08:00
sparknack
c8a4d6e2ef
enhance: add cachinglayer management for TextMatchIndex (#44741)
issue: #41435, #44502

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-10-13 14:37:58 +08:00
sparknack
6d5b41644b
enhance: remove logical usage checks during segment loading (#44743)
issue: #41435

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-10-13 14:21:58 +08:00
Zhen Ye
369c6eb206
enhance: support remove cluster from replicate topology (#44642)
issue: #44558, #44123
- On replicate topology (A->B, A->C), applying config(A->C) to A and C
and config(B) to B removes B from the replicate topology
- Fix some CDC metric errors

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-13 11:07:58 +08:00
foxspy
e7a91f514c
enhance: overwriting current index type during index build stage (#44753)
issue: #44752

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-10-11 18:31:58 +08:00
foxspy
0eb42a7870
enhance: support load params for vector index (#44747)
issue: #44746 

Support modifying vector index behavior during loading by changing or
adding the param knowhere.xxx.load.xxx.

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-10-11 18:30:08 +08:00
congqixia
7b8ecdaad5
enhance: Add accesslog field for template value length info (#44723)
Related to #36672

Add an accesslog field displaying the value length for search/query
requests, which may help developers debug related issues.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-11 18:23:57 +08:00
congqixia
faaf215913
enhance: Bump go version & builder image tag (#44757)
Bump go version to 1.24.6 fixing CVE-2025-47907

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-11 13:49:57 +08:00
Zhen Ye
30091a3bb7
enhance: remove redundant channel manager from datacoord (#44532)
issue: #41611

- After enabling streaming arch, channel manager of data coord is a
redundant component.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-09 11:01:57 +08:00
congqixia
d76d92d33f
enhance: Use relative path in proto following convention (#44650)
Previous pr #44163

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-30 18:43:52 +08:00
congqixia
1185fbec0a
enhance: Bump milvus & proto version to v2.6.3 (#44633)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-30 17:33:49 +08:00
Xiaofan
7c00f292bc
enhance: add config for meta batch(#44569) (#44645)
fix: https://github.com/milvus-io/milvus/issues/44569
Add a new config to control the meta batch so that it does not grow too large.

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2025-09-30 17:31:02 +08:00
Agnes George
aea0418713
fix: resolve CVE-2020-25576, WS-2023-0223 (#44163)
fix: issue https://github.com/milvus-io/milvus/issues/44160

WS-2023-0223 reported for
[atty-0.2.14.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=9c622063-376a-446b-bece-d7f6fd096758;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be)
CVE-2020-25576 reported for
[rand_core-0.3.1.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=20e2ad1b-c84c-4f18-98a9-4f27643b29ff;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be)

[atty-0.2.14.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=9c622063-376a-446b-bece-d7f6fd096758;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be)
is a transitive dependency coming from the root libraries
'cbindgen-0.26.0.crate' and 'criterion-0.4.0.crate'

[rand_core-0.3.1.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=20e2ad1b-c84c-4f18-98a9-4f27643b29ff;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be)
is also a transitive dependency coming from 'rand-0.3.23.crate' library
Path to dependency file:
/workspace/app/milvus/internal/core/thirdparty/tantivy/tantivy-binding/Cargo.toml
For remediation, since these vulnerabilities are transitive, the
root libraries should be updated to the latest non-vulnerable version.

---------

Co-authored-by: Agnes-George1 <agnes.george1@ibm.com>
Co-authored-by: Abita Ann Augustine <abitaaugustine@gmail.com>
Co-authored-by: gifi-siby <gifi.s@ibm.com>
2025-09-30 16:25:53 +08:00
zhagnlu
4c49295c3d
Revert "enhance: enable default json stats (#44559)" (#44644)
This reverts commit 1b5191974c71eee342e4f7a8c804e1d95cfd094b.
 #44132

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-30 12:11:53 +08:00
Gao
3cc59a0d69
enhance: add storage usage for delete/upsert/restful (#44512)
#44212 

Also, record metrics only when storageUsageTracking is enabled.
Use MB for scanned_remote counter and scanned_total counter metrics to
avoid overflow.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-09-30 00:31:06 +08:00
tinswzy
f342f49b32
enhance: add support for Azure Blob Storage in wp (#44592)
#44485 
add support for blob in woodpecker

#43638 
upgrade wp v0.1.6

related wp [issue#11](https://github.com/zilliztech/woodpecker/issues/11)

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-09-29 09:51:44 +08:00
cai.zhang
19346fa389
feat: Geospatial Data Type and GIS Function support for milvus (#44547)
issue: #43427

This PR's main goal is to merge #37417 into milvus 2.5 without conflicts.

# Main Goals

1. Create and describe collections with geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search to display geospatial data
5. Support using GIS functions like ST_EQUALS in query
6. Support R-Tree index for geometry type

# Solution

1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions. Currently only brute-force search is
supported.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing. See the corresponding
changes in pymilvus.
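
The WKT-to-WKB conversion mentioned in the Data Pipeline step can be illustrated with a minimal sketch. It handles only a 2D `POINT` and uses the standard WKB layout (byte-order flag, geometry type, coordinates). The function name `point_wkt_to_wkb` is hypothetical; this is not the proxy's actual code, which relies on the geometry libraries listed above.

```python
import struct


def point_wkt_to_wkb(wkt: str) -> bytes:
    """Encode a 2D 'POINT (x y)' WKT string as little-endian WKB (sketch only)."""
    text = wkt.strip()
    if not text.upper().startswith("POINT"):
        raise ValueError("only POINT is handled in this sketch")
    body = text[len("POINT"):].strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError("malformed WKT")
    x_str, y_str = body[1:-1].split()
    # Layout: 1-byte order flag (1 = little endian), uint32 geometry type
    # (1 = Point), then the x and y coordinates as float64.
    return struct.pack("<BIdd", 1, 1, float(x_str), float(y_str))


wkb = point_wkt_to_wkb("POINT (30 10)")
print(len(wkb), wkb.hex())
```

A WKB point is fixed-size (21 bytes here), which is one reason a binary form is convenient for downstream storage compared with variable-length text.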

---------

Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>
2025-09-28 19:43:05 +08:00
yihao.dai
f61952adfc
fix: Fix compaction task blocking due to executor loop exit (#44543)
1. Use goroutine pool instead of sem.
2. Remove compaction executor from pipeline, since in streaming mode
pipeline should be decoupled from compaction.

issue: https://github.com/milvus-io/milvus/issues/44541

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-09-28 11:03:04 +08:00
zhagnlu
1b5191974c
enhance: enable default json stats (#44559)
#44132

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-28 10:45:04 +08:00
yihao.dai
2807d1d1b2
fix: Make default local storage path effective (#44514)
Make default local storage path effective instead of empty when yaml
config file is missing.

issue: https://github.com/milvus-io/milvus/issues/44513

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-09-24 21:00:06 +08:00
Zhen Ye
19e5e9f910
enhance: broadcaster will lock resource until message acked (#44508)
issue: #43897

- Return LastConfirmedMessageID from the WAL append operation.
- Add resource-key-based locker for broadcast-ack operation to protect
the coord state when executing DDL.
- Resource-key-based locker is held until the broadcast operation is
acked.
- ResourceKey supports shared and exclusive locks.
- Add FastAck, which executes the ack right after the broadcast is done,
to speed up DDL.
- Ack callback now supports the broadcast message result.
- Add tombstone for broadcaster to avoid repeatedly committing DDL and
the ABA issue.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-24 20:58:05 +08:00
aoiasd
1b20e956be
enhance: support random score for boost function score (#44214)
Also support setting the function mode and boost mode when running a
search with boost.

RandomScore returns a random function score in [0, weight).
FunctionMode decides how the boost score is calculated from multiple
boost function scores.
BoostMode decides how the final score is calculated from the origin
score and the boost score.
relate: https://github.com/milvus-io/milvus/issues/43867
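
How the three pieces might fit together can be sketched as follows. The mode names `"sum"` and `"multiply"` and all function names are assumptions made for illustration; the commit message does not list the actual modes or their identifiers.

```python
import math
import random
from typing import Sequence


def random_score(weight: float, rng: random.Random) -> float:
    """Return a random function score in [0, weight)."""
    # Random.random() yields a float in [0.0, 1.0).
    return rng.random() * weight


def combine_function_scores(scores: Sequence[float], function_mode: str) -> float:
    """Fold several boost function scores into a single boost score."""
    if function_mode == "sum":  # hypothetical mode name
        return sum(scores)
    if function_mode == "multiply":  # hypothetical mode name
        return math.prod(scores)
    raise ValueError(f"unknown function mode: {function_mode}")


def final_score(origin: float, boost: float, boost_mode: str) -> float:
    """Merge the original search score with the boost score."""
    if boost_mode == "sum":  # hypothetical mode name
        return origin + boost
    if boost_mode == "multiply":  # hypothetical mode name
        return origin * boost
    raise ValueError(f"unknown boost mode: {boost_mode}")


rng = random.Random(42)
boost = combine_function_scores([1.5, random_score(2.0, rng)], "sum")
print(final_score(0.8, boost, "multiply"))
```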

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-24 17:50:04 +08:00
foxspy
13c3b0b909
enhance: add autoindex configuration for the int8 vector type (#44554)
issue: #38666 

Add int8 support for autoindex to ensure it can be independently
configured. At the same time, remove the restriction on int8 type for
vectorDiskIndex (note that vectorDiskIndex only determines the building
and loading method of the index, not the index type).

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-09-24 17:48:04 +08:00
congqixia
99598ae5ec
enhance: Add param item for hybrid search requery policy (#44466)
Related to #39757

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-24 17:32:04 +08:00
junjiejiangjjj
f07979f91d
enhance: add support for controlling function output field insertion (#44162)
#44053

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-09-24 17:26:04 +08:00
zhagnlu
eac16a577c
enhance:support cachelayer for json stats (#44446)
#42533

Signed-off-by: zhagnlu <lu.zhang@zilliz.com>
2025-09-24 15:30:04 +08:00
yihao.dai
20411e5218
fix: Fix replicator cannot stop and enhance replicate config validator (#44531)
1. Fix replicator cannot stop if error occurs on replicate stream RPC.
2. Simplify replicate stream client.
3. Enhance replicate config validator:
  1. Compare the incoming replicate config; cluster attributes must not
  be changed.
  2. Cluster URI must be unique.
  3. Remove the check of pchannel prefix.
 
issue: https://github.com/milvus-io/milvus/issues/44123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-09-24 11:54:03 +08:00
Bingyi Sun
96e1de4e22
feat: allow users to write pk field when autoid is enabled (#44424)
https://github.com/milvus-io/milvus/issues/44425

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-23 16:10:04 +08:00
Chun Han
1b7562a766
feat: support mannual compact l0(#44439) (#44440)
related: #44439

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-09-23 12:44:07 +08:00
congqixia
0e5fb8ac6f
fix: Cleanup collection metrics after dropped on rootcoord (#44511)
Related to #44509

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-23 11:02:06 +08:00
Tianx
2c0c5ef41e
feat: timestamptz expression & index & timezone (#44080)
issue: https://github.com/milvus-io/milvus/issues/27467

>My plan is as follows.
>- [x] M1 Create collection with timestamptz field
>- [x] M2 Insert timestamptz field data
>- [x] M3 Retrieve timestamptz field data
>- [x] M4 Implement handoff
>- [x] M5 Implement compare operator
>- [x] M6 Implement extract operator
>- [x] M7 Support STL-SORT index for datatype timestamptz
>- [x] M8 Support database/collection level default timezone

---

The third PR of issue: https://github.com/milvus-io/milvus/issues/27467,
which completes M5, M6, M7, M8 described above.

## M8 Default Timezone

We will be able to use alter_collection() and alter_database() in a
future Python SDK release to modify the default timezone at the
collection or database level.

For insert requests, the timezone will be resolved using the following
order of precedence: String Literal -> Collection Default -> Database
Default.
For retrieval requests, the timezone will be resolved in this order:
Query Parameters -> Collection Default -> Database Default.
In both cases, the final fallback timezone is UTC.
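
The precedence chain above amounts to a first-non-empty lookup with a UTC fallback. A minimal sketch, where `resolve_timezone` and its parameter names are hypothetical and not taken from the Milvus code base:

```python
from typing import Optional


def resolve_timezone(
    request_tz: Optional[str],
    collection_tz: Optional[str],
    database_tz: Optional[str],
) -> str:
    """Pick the first configured timezone, falling back to UTC.

    request_tz stands for the timezone in the string literal (insert) or
    in the query parameters (retrieval).
    """
    for candidate in (request_tz, collection_tz, database_tz):
        if candidate:
            return candidate
    return "UTC"


print(resolve_timezone(None, "Asia/Shanghai", "Europe/Berlin"))
print(resolve_timezone(None, None, None))
```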


## M5: Comparison Operators

We can now use the following expression format to filter on the
timestamptz field:

- `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op}
ISO 'iso_string' `

- The interval_string follows the ISO 8601 duration format, for example:
P1Y2M3DT1H2M3S.

- The iso_string follows the ISO 8601 timestamp format, for example:
2025-01-03T00:00:00+08:00.

- Example expressions: "tsz + INTERVAL 'P0D' != ISO
'2025-01-03T00:00:00+08:00'" or "tsz != ISO
'2025-01-03T00:00:00+08:00'".
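
A small helper for assembling such filter strings on the client side might look like the sketch below. `build_tsz_filter` is a hypothetical name, and the set of accepted comparison operators is an assumption; the call to `datetime.fromisoformat` is only a rough sanity check on the timestamp, not a full ISO 8601 validator.

```python
from datetime import datetime
from typing import Optional

# Assumed operator set; the commit message only says {comparison_op}.
COMPARISON_OPS = {"==", "!=", "<", "<=", ">", ">="}


def build_tsz_filter(
    field: str,
    op: str,
    iso_string: str,
    interval: Optional[str] = None,
    sign: str = "+",
) -> str:
    """Build 'field [+/- INTERVAL '...'] op ISO '...'' as a filter string."""
    if op not in COMPARISON_OPS:
        raise ValueError(f"unsupported comparison operator: {op}")
    if sign not in ("+", "-"):
        raise ValueError("sign must be '+' or '-'")
    # Raises ValueError if the timestamp is clearly malformed.
    datetime.fromisoformat(iso_string)
    left = field if interval is None else f"{field} {sign} INTERVAL '{interval}'"
    return f"{left} {op} ISO '{iso_string}'"


print(build_tsz_filter("tsz", "!=", "2025-01-03T00:00:00+08:00", interval="P0D"))
print(build_tsz_filter("tsz", "!=", "2025-01-03T00:00:00+08:00"))
```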

## M6: Extract

We will be able to extract specific time fields via kwargs in a future
Python SDK release.
The key is `time_fields`, and the value should be one or more of "year,
month, day, hour, minute, second, microsecond", separated by commas or
spaces. The result for each record is then an array of int64.
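
The shape of the extracted result can be illustrated with plain `datetime`. This sketch only mimics the described behavior on the client side; `extract_time_fields` is a hypothetical name, and it reads the fields in the offset carried by the literal itself, whereas the timezone used by the server-side extraction is not stated here.

```python
import re
from datetime import datetime
from typing import List

TIME_FIELDS = ("year", "month", "day", "hour", "minute", "second", "microsecond")


def extract_time_fields(iso_string: str, time_fields: str) -> List[int]:
    """Return the requested fields of an ISO timestamp as a list of ints."""
    # Field names may be separated by commas, spaces, or both.
    names = [n for n in re.split(r"[,\s]+", time_fields.strip()) if n]
    unknown = [n for n in names if n not in TIME_FIELDS]
    if unknown:
        raise ValueError(f"unknown time fields: {unknown}")
    moment = datetime.fromisoformat(iso_string)
    return [getattr(moment, n) for n in names]


print(extract_time_fields("2025-01-03T04:05:06+08:00", "year, month day"))
```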



## M7: Indexing Support

Expressions without interval arithmetic can be accelerated using an
STL-SORT index. However, expressions that include interval arithmetic
cannot be indexed. This is because the result of an interval calculation
depends on the specific timestamp value. For example, adding one month
to a date in February results in a different number of added days than
adding one month to a date in March.
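
The February/March example can be checked with a short calculation. `add_one_month` is a hypothetical helper that clamps the day to the end of the target month; it is used here only to show that the same interval spans a different number of days depending on the starting date.

```python
import calendar
from datetime import date


def add_one_month(d: date) -> date:
    """Add one calendar month, clamping the day to the target month's length."""
    year = d.year + d.month // 12
    month = d.month % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


feb = date(2025, 2, 10)
mar = date(2025, 3, 10)
print((add_one_month(feb) - feb).days)  # days spanned by one month from February
print((add_one_month(mar) - mar).days)  # days spanned by one month from March
```

Because the offset is not a constant, a sorted index over the raw stored values cannot be searched with a single shifted bound, which matches the restriction stated above.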

--- 

After this PR, the input/output type of timestamptz is an ISO
string. Timestamptz values are stored as timestamptz data, which is
ultimately int64_t.

> for more information, see https://en.wikipedia.org/wiki/ISO_8601

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-09-23 10:24:12 +08:00
jiaqizho
338ed2fed4
enhance: Introduce sparse filter in query (#44347)
issue: #44373

The current commit implements sparse filtering in query tasks using the
statistical information (Bloom filter/MinMax) of the Primary Key (PK).

The statistical information of the PK is bound to the segment during the
segment loading phase. A new filter has been added to the segment filter
to enable the sparse filtering functionality.
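
The MinMax half of the idea can be sketched as a simple pruning step. `SegmentStats` and `prune_segments` are hypothetical names, not the types used in the query node; a Bloom filter check would be a further test applied to the segments that survive this range check.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SegmentStats:
    segment_id: int
    pk_min: int  # smallest primary key stored in the segment
    pk_max: int  # largest primary key stored in the segment


def prune_segments(segments: List[SegmentStats], pk: int) -> List[int]:
    """Keep only the segments whose PK range could contain the given key."""
    return [s.segment_id for s in segments if s.pk_min <= pk <= s.pk_max]


segments = [
    SegmentStats(segment_id=1, pk_min=0, pk_max=99),
    SegmentStats(segment_id=2, pk_min=100, pk_max=199),
    SegmentStats(segment_id=3, pk_min=150, pk_max=300),
]
print(prune_segments(segments, 160))
```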

Signed-off-by: jiaqizho <jiaqi.zhou@zilliz.com>
2025-09-23 09:58:09 +08:00
Gao
539f17f1ad
enhance: tiered index updates (#44433)
issue: #42032 #44212 

- special case for warmup param and cell storage size for tiered index
- add a config to enable/disable storage usage tracking

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-09-22 21:34:11 +08:00
Zhen Ye
c171280f63
enhance: support replicate message in wal. (#44456)
issue: #44123

- support replicate message in wal of milvus.
- support CDC-replicate recovery from wal.
- fix some CDC replicator bugs

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-22 17:06:11 +08:00
sthuang
edd250ffef
fix: [StorageV2] force virtual host for oss and cos (#44484)
related: #44481

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-09-22 16:58:11 +08:00
Bingyi Sun
94d53a5ac6
feat: encode cluster id in auto id (#44471)
https://github.com/milvus-io/milvus/issues/44326
prev:
[physical_ts][logical_ts]
after
[sign_bit][cluster_id][physical_ts][logical_ts]
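
The new layout can be pictured as bit packing into a signed 64-bit integer. The field widths below (1 + 3 + 42 + 18 bits) are assumptions chosen only so that the fields add up to 64 bits; the commit message does not state the real widths, and the function names are hypothetical.

```python
from typing import Tuple

# Hypothetical widths: 1 sign bit + 3 + 42 + 18 = 64 bits.
CLUSTER_BITS = 3
PHYSICAL_BITS = 42
LOGICAL_BITS = 18


def encode_auto_id(cluster_id: int, physical_ts: int, logical_ts: int) -> int:
    """Pack [sign_bit][cluster_id][physical_ts][logical_ts] into one integer."""
    if not 0 <= cluster_id < (1 << CLUSTER_BITS):
        raise ValueError("cluster_id out of range")
    if not 0 <= physical_ts < (1 << PHYSICAL_BITS):
        raise ValueError("physical_ts out of range")
    if not 0 <= logical_ts < (1 << LOGICAL_BITS):
        raise ValueError("logical_ts out of range")
    # The sign bit stays 0, so the ID is non-negative as a signed int64.
    return (
        (cluster_id << (PHYSICAL_BITS + LOGICAL_BITS))
        | (physical_ts << LOGICAL_BITS)
        | logical_ts
    )


def decode_auto_id(auto_id: int) -> Tuple[int, int, int]:
    """Split an ID back into (cluster_id, physical_ts, logical_ts)."""
    logical_ts = auto_id & ((1 << LOGICAL_BITS) - 1)
    physical_ts = (auto_id >> LOGICAL_BITS) & ((1 << PHYSICAL_BITS) - 1)
    cluster_id = auto_id >> (PHYSICAL_BITS + LOGICAL_BITS)
    return cluster_id, physical_ts, logical_ts


auto_id = encode_auto_id(cluster_id=5, physical_ts=1_758_000_000_000, logical_ts=7)
print(auto_id, decode_auto_id(auto_id))
```

Placing the cluster ID in the high bits keeps IDs from different clusters in disjoint ranges while preserving timestamp ordering within a cluster.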

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-22 10:40:02 +08:00
tinswzy
c7f21d5a06
enhance: purge small files right after wp segment compaction (#44473)
#43638 
improve wp log output
[wp#43](https://github.com/zilliztech/woodpecker/issues/43)
introduce purging of small files right after segment compaction
[wp#47](https://github.com/zilliztech/woodpecker/issues/47)
The rootpath configured by milvus is uniformly used as the base for wp
local fs storage.
update to v0.1.5

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-09-21 16:32:01 +08:00
Gao
d3784c6515
enhance: add storage resource usage for vector search (#44308)
issue: #44212 

Implement search/query storage usage statistics on the Go side (result
reduce); storage usage is recorded only in the vector search C++ path
for now. The query C++ path needs to be implemented in follow-up PRs.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>
2025-09-19 20:20:02 +08:00
wei liu
92d2fb6360
enhance: Add granular flush targets support for FlushAll operation (#44234)
issue: #44156
Enhance FlushAll functionality to support targeting specific collections
within databases instead of only database-level flushing.

Changes include:

- Add FlushAllTarget message in data_coord.proto for granular targeting
- Support collection-specific flush operations within databases
- Maintain backward compatibility with deprecated db_name field

This enhancement allows users to flush specific collections without
affecting other collections in the same database, providing more precise
control over data persistence operations.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-09-19 18:38:01 +08:00
wei liu
6d4961b978
enhance: Refactor balance checker with priority queue (#43992)
issue: #43858
Refactor the balance checker implementation to use priority queues for
managing collection balance operations, improving processing efficiency
and order control.

Changes include:
- Export priority queue interfaces (Item, BaseItem, PriorityQueue)
- Replace collection round-robin with priority-based queue system
- Add BalanceCheckCollectionMaxCount configuration parameter
- Optimize balance task generation with batch processing limits
- Refactor processBalanceQueue method for different strategies
- Enhance test coverage with comprehensive unit tests

The new priority queue system processes collections based on row count
or collection ID order, providing better control over balance operation
priorities and resource utilization.
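
The priority-based selection can be illustrated with a heap. This is a sketch in Python rather than the project's Go code; the function name is hypothetical, `max_count` stands in for the role of BalanceCheckCollectionMaxCount, and treating larger row counts as higher priority is an assumption.

```python
import heapq
from typing import Dict, List


def next_collections_to_balance(
    row_counts: Dict[int, int],
    max_count: int,
    by_row_count: bool = True,
) -> List[int]:
    """Pop up to max_count collection IDs from a priority queue."""
    if by_row_count:
        # Negate row counts so the min-heap pops the largest collection first.
        heap = [(-rows, cid) for cid, rows in row_counts.items()]
    else:
        # Fall back to ascending collection ID order.
        heap = [(cid, cid) for cid in row_counts]
    heapq.heapify(heap)
    picked: List[int] = []
    while heap and len(picked) < max_count:
        _, cid = heapq.heappop(heap)
        picked.append(cid)
    return picked


row_counts = {101: 1_000, 102: 50_000, 103: 7_500}
print(next_collections_to_balance(row_counts, max_count=2))
print(next_collections_to_balance(row_counts, max_count=2, by_row_count=False))
```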

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-09-19 17:46:01 +08:00
Zhen Ye
ba289891c0
enhance: add all ddl message into messages (#44407)
issue: #43897

- add ddl messages proto and add some message utilities.
- support shared/exclusive resource-key-lock.
- add all ddl callbacks future into broadcast registry.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-18 10:08:00 +08:00
congqixia
c142974853
enhance: Bump milvus & proto version to v2.6.2 (#44427)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-17 21:12:01 +08:00
Bingyi Sun
5cd2d99799
enhance: Revert "feat: encode cluster id in auto id (#44324)" (#44426)
This reverts commit 7af159410395f0e7079d4875d96544c01f1d477b
2025-09-17 17:56:01 +08:00
Bingyi Sun
7af1594103
feat: encode cluster id in auto id (#44324)
https://github.com/milvus-io/milvus/issues/44326
prev:
`[physical_ts][logical_ts]`
after
`[sign_bit][cluster_id][physical_ts][logical_ts]`

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-17 16:56:01 +08:00
zhenshan.cao
691a8df953
feat: Add RESTful api for rolling upgrade support (#44381)
issue: https://github.com/milvus-io/milvus/issues/43968

Co-authored-by: chyezh <ye.zhen@zilliz.com>
2025-09-16 20:08:00 +08:00
yihao.dai
51f69f32d0
feat: Add CDC support (#44124)
This PR implements a new CDC service for Milvus 2.6, providing log-based
cross-cluster replication.

issue: https://github.com/milvus-io/milvus/issues/44123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-09-16 16:32:01 +08:00
congqixia
103db5ae3e
enhance: [StorageV2] Include partition & clustering key to sys group (#44372)
Related to #44257

This PR makes the partition key & clustering key candidates for the
system field group and adds a param item controlling the policy.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-16 12:08:00 +08:00
Spade A
eb793531b9
feat: impl StructArray -- support import for CSV/JSON/PARQUET/BINLOG (#44201)
Ref https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-15 20:41:59 +08:00
cai.zhang
76f6768ea1
enhance: Remove timeout for compaction task (#44277)
issue: #44272

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-09-15 11:03:58 +08:00
congqixia
bfc9e80e14
enhance: Add param item forcing all indices ready for segment (#44313)
Related to #44312

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-12 17:51:58 +08:00