848 Commits

Author SHA1 Message Date
wei liu
33ac8707dc
enhance:[2.5] Introduce sparse filter in query (#44347) (#44790) (#44867)
pr: #44347

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Co-authored-by: jiaqizho <smalinuxer@gmail.com>
Co-authored-by: jiaqizho <jiaqi.zhou@zilliz.com>
2025-10-15 18:08:00 +08:00
wei liu
892d63d26e
enhance: [2.5] Refactor balance checker with priority queue (#43992) (#44588)
issue: #43858
pr: #43992
Refactor the balance checker implementation to use priority queues for
managing collection balance operations, improving processing efficiency
and order control.

Changes include:
- Export priority queue interfaces (Item, BaseItem, PriorityQueue)
- Replace collection round-robin with priority-based queue system
- Add BalanceCheckCollectionMaxCount configuration parameter
- Optimize balance task generation with batch processing limits
- Refactor processBalanceQueue method for different strategies
- Enhance test coverage with comprehensive unit tests

The new priority queue system processes collections based on row count
or collection ID order, providing better control over balance operation
priorities and resource utilization.
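
A minimal sketch of priority-based collection ordering, assuming a binary heap keyed either by row count or by collection ID. The names `build_queue` and `next_batch` are hypothetical, and the choice to process larger collections first is an assumption; the commit message does not state the direction.

```python
import heapq


def build_queue(row_counts, by_row_count=True):
    """Build a heap of (priority, collection_id) from {collection_id: row_count}."""
    heap = []
    for collection_id, rows in row_counts.items():
        # heapq is a min-heap, so the row count is negated to pop the
        # largest collection first (assumed ordering, not from the commit).
        priority = -rows if by_row_count else collection_id
        heapq.heappush(heap, (priority, collection_id))
    return heap


def next_batch(heap, max_count):
    """Pop at most max_count collection IDs, in priority order."""
    batch = []
    while heap and len(batch) < max_count:
        _, collection_id = heapq.heappop(heap)
        batch.append(collection_id)
    return batch
```

Here `max_count` plays the role that a limit such as BalanceCheckCollectionMaxCount might play: it caps how many collections are handled per round.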

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-09-28 19:23:05 +08:00
wei liu
3a7a08f2b3
enhance: [2.5] Add granular flush targets support for FlushAll operation (#44431)
issue: #44156
pr: #44234
Enhance FlushAll functionality to support targeting specific collections
within databases instead of only database-level flushing.

Changes include:

- Add FlushAllTarget message in data_coord.proto for granular targeting
- Support collection-specific flush operations within databases
- Maintain backward compatibility with deprecated db_name field

This enhancement allows users to flush specific collections without
affecting other collections in the same database, providing more precise
control over data persistence operations.
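
As a rough illustration of how granular targets could coexist with the deprecated field, the sketch below resolves a request into a list of targets. The class `FlushTarget`, its fields, and `resolve_targets` are hypothetical and do not mirror the actual proto definition; the meaning of an empty list is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FlushTarget:
    db_name: str
    # Empty list is assumed to mean "every collection in the database".
    collection_names: List[str] = field(default_factory=list)


def resolve_targets(targets, deprecated_db_name=""):
    """Prefer explicit targets; fall back to the deprecated db_name field."""
    if targets:
        return list(targets)
    if deprecated_db_name:
        # Old-style request: flush the whole database.
        return [FlushTarget(db_name=deprecated_db_name)]
    # No targets and no db_name: assumed to mean "flush everything".
    return []
```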

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-09-28 10:37:06 +08:00
congqixia
d251e102b6
enhance: [2.5] Add param item for hybrid search requery policy (#44467)
Cherry-pick from master
pr: #44466
related to #39757

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-24 17:52:07 +08:00
Bingyi Sun
8b0bfe4cd8
feat: encode cluster id in auto id (#44471) (#44500)
pr: #44471 
https://github.com/milvus-io/milvus/issues/44326
prev:
[physical_ts][logical_ts]
after
[sign_bit][cluster_id][physical_ts][logical_ts]
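
The layout above can be sketched as plain bit packing into a 64-bit signed integer. The field widths below (7 bits for the cluster ID, 18 bits for the logical timestamp, the remainder for the physical timestamp) are illustrative assumptions, not values taken from the commit, and the function names are hypothetical.

```python
CLUSTER_BITS = 7    # assumed width
LOGICAL_BITS = 18   # assumed width
PHYSICAL_BITS = 63 - CLUSTER_BITS - LOGICAL_BITS  # sign bit stays 0


def encode_auto_id(cluster_id, physical_ts, logical_ts):
    """Pack [sign_bit=0][cluster_id][physical_ts][logical_ts] into one int."""
    if not 0 <= cluster_id < (1 << CLUSTER_BITS):
        raise ValueError("cluster_id out of range")
    if not 0 <= physical_ts < (1 << PHYSICAL_BITS):
        raise ValueError("physical_ts out of range")
    if not 0 <= logical_ts < (1 << LOGICAL_BITS):
        raise ValueError("logical_ts out of range")
    return (
        (cluster_id << (PHYSICAL_BITS + LOGICAL_BITS))
        | (physical_ts << LOGICAL_BITS)
        | logical_ts
    )


def decode_auto_id(auto_id):
    """Return (cluster_id, physical_ts, logical_ts) from a packed ID."""
    logical_ts = auto_id & ((1 << LOGICAL_BITS) - 1)
    physical_ts = (auto_id >> LOGICAL_BITS) & ((1 << PHYSICAL_BITS) - 1)
    cluster_id = (auto_id >> (PHYSICAL_BITS + LOGICAL_BITS)) & ((1 << CLUSTER_BITS) - 1)
    return cluster_id, physical_ts, logical_ts
```

With a cluster ID of 0, the packed value reduces to the previous `[physical_ts][logical_ts]` layout under these widths.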

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-22 20:30:10 +08:00
cai.zhang
a2bb36a6dc
enhance: [2.5] Remove timeout for compaction task (#44278)
issue: #44272 
master pr: #44277

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-09-15 21:24:01 +08:00
congqixia
5d54d84438
enhance: [2.5] Add param item forcing all indices ready for segment (#44329)
Cherry-pick from master
pr: #44313
Related to #44312

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-12 19:29:58 +08:00
cai.zhang
877e68f851
enhance: Support R-Tree index for geometry datatype (#44069)
issue: #43427
pr: #37417

Support R-Tree index for geometry datatype.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>
2025-09-11 14:19:58 +08:00
zhagnlu
802026569d
enhance:add param to modify delete snapshot size (#44213)
pr: #44215

Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-05 14:31:56 +08:00
cqy123456
c17ce3cf90
enhance:[2.5]minhash support and add autoindex config (#44015)
master pr: https://github.com/milvus-io/milvus/pull/44186

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-09-03 17:39:54 +08:00
ZhuXi
cd931a0388
feat:Geospatial Data Type and GIS Function support for milvus (#43661)
issue: #43427
pr: #37417

This PR's main goal is to merge #37417 into Milvus 2.5 without conflicts.

# Main Goals

1. Create and describe collections with geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search to display geospatial data
5. Support using GIS functions like ST_EQUALS in query

# Solution

1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions. Currently, only brute-force search is
supported.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing. See the corresponding
changes in pymilvus.
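
To make the WKT-to-WKB step in the data pipeline concrete, here is a minimal sketch that converts a 2D `POINT` from WKT text to little-endian WKB bytes using only the standard library. It is not how the proxy does it (the commit relies on GDAL and go-geom); the function name is hypothetical and only the `POINT (x y)` form is handled.

```python
import re
import struct

_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)\s*$", re.IGNORECASE
)


def point_wkt_to_wkb(wkt):
    """Convert 'POINT (x y)' to 21 bytes of little-endian WKB."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError("only 2D POINT WKT is supported in this sketch")
    x, y = float(match.group(1)), float(match.group(2))
    # Byte 1 = little-endian marker, uint32 1 = Point geometry type,
    # followed by x and y as IEEE 754 doubles.
    return struct.pack("<BIdd", 1, 1, x, y)
```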

---------

Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: cai.zhang <cai.zhang@zilliz.com>
2025-08-26 19:11:55 +08:00
Ted Xu
8821743c17
enhance: returning collection metadata from cache (#42823) (#43911)
See #43187
pr: #42823

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-08-26 14:23:54 +08:00
zhagnlu
6c29689ca2
enhance: support expr result cache (#43882)
cherry-pick from pr: #43923

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-26 11:19:57 +08:00
cqy123456
a1ff6c89be
enhance:[2.5] Make build ratio of interim index configurable (#43938)
issue: https://github.com/milvus-io/milvus/issues/43993
master pr: https://github.com/milvus-io/milvus/pull/43939

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-08-25 16:01:52 +08:00
sparknack
b57d104742
enhance: [2.5] add write rate limit for disk file writer (#43856)
issue: https://github.com/milvus-io/milvus/issues/43040
pr: #43912

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-18 23:33:46 +08:00
congqixia
1f7bb41102
enhance: [2.5] Add downgrade tsafe switch param item (#43874) (#43886)
Cherry-pick from master
pr: #43874
Related to #43873

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-18 10:41:46 +08:00
yihao.dai
16d1947d5c
enhance: [2.5] Adjust import task concurrency based on CPU count (#43817)
pr: https://github.com/milvus-io/milvus/pull/43132

issue: https://github.com/milvus-io/milvus/issues/43131

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-08-18 01:11:45 +08:00
sparknack
4d944aecf7
enhance: add disk file writer with Direct IO support (#43692)
issue: #43040
pr: #42665 

This patch introduces a disk file writer that supports Direct IO.

Currently, it is exclusively utilized during the QueryNode load process.

Its parameters are as follows:

1. `common.diskWriteMode` This parameter controls the write mode of the
local disk, which is used to write temporary data downloaded from remote
storage. Currently, only QueryNode uses 'common.diskWrite*' parameters.
Support for other components will be added in the future.
The options include 'direct' and 'buffered'. The default value is
'buffered'.

2. `common.diskWriteBufferSizeKb` Disk write buffer size in KB, only
used when disk write mode is 'direct', default is 64KB.
Current valid range is [4, 65536]. If the value is not aligned to 4KB,
it will be rounded up to the nearest multiple of 4KB.

3. `common.diskWriteNumThreads` This parameter controls the number of
writer threads used for disk write operations. The valid range is [0,
hardware_concurrency]. It is designed to limit the maximum concurrency
of disk write operations to reduce the impact on disk read performance.
For example, if you want to limit the maximum concurrency of disk write
operations to 1, you can set this parameter to 1.
The default value is 0, which means the caller will perform write
operations directly without using an additional writer thread pool. In
this case, the maximum concurrency of disk write operations is
determined by the caller's thread pool size.

Both parameters can be updated during runtime.
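
The 4KB alignment rule for `common.diskWriteBufferSizeKb` amounts to rounding up to the next multiple of 4 within the valid range. The sketch below illustrates that arithmetic; the function name is hypothetical, and rejecting out-of-range values with an error is an assumption, since the message does not say how they are handled.

```python
MIN_KB = 4
MAX_KB = 65536
ALIGN_KB = 4


def normalize_buffer_size_kb(value_kb):
    """Round value_kb up to the nearest multiple of 4KB within [4, 65536]."""
    if not MIN_KB <= value_kb <= MAX_KB:
        # Assumed behavior: reject rather than clamp.
        raise ValueError("buffer size must be in [4, 65536] KB")
    return ((value_kb + ALIGN_KB - 1) // ALIGN_KB) * ALIGN_KB
```

For example, a configured value of 5 would become 8, while the default of 64 is already aligned and stays unchanged.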

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-08 12:13:41 +08:00
XuanYang-cn
1165a5300f
fix: [cp25]Use diskSegmentMaxSize for coll with sparse and dense vectors (#43195)
Previous code uses diskSegmentMaxSize if and only if all of the
collection's vector fields are indexed with DiskANN index.

When introducing sparse vectors, since sparse vector cannot be indexed
with DiskANN index, collections with both dense and sparse vectors will
use maxSize instead.

This PR changes the requirement for using diskSegmentMaxSize: all dense
vectors must be indexed with DiskANN indexes; sparse vector fields are ignored.
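
The revised rule can be sketched as a small predicate. The function name, the tuple representation of a vector field, and the `"DISKANN"` string are hypothetical; returning `False` when a collection has no dense vector fields at all is an assumption.

```python
def use_disk_segment_max_size(vector_fields):
    """vector_fields: list of (is_sparse, index_type) tuples, one per vector field."""
    # Sparse fields cannot use DiskANN, so they are ignored in the check.
    dense_index_types = [
        index_type for is_sparse, index_type in vector_fields if not is_sparse
    ]
    if not dense_index_types:
        return False  # assumed: nothing dense, so fall back to maxSize
    return all(index_type == "DISKANN" for index_type in dense_index_types)
```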

See also: #43193
pr: #43194

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-07-18 11:16:52 +08:00
wei liu
73210303a9
fix: Fix exclude nodes clearing logic position in load balancer retry (#43002)
issue: #42994
pr: #42577 #40438

cp partial logic from pr #42577 and #40438

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-30 16:10:44 +08:00
Chun Han
bfa9688da3
enhance: supporting separate chunk cache pool(#42803) (#42901)
related: #42803

1. Add new thread pools using folly::CPUThreadPoolExecutor, named
FThreadPools.
2. Reading vectors from the chunk cache now uses the separate
CHUNKCACHE_POOL to avoid being influenced by load collection.
3. Note: for safety on the cloud side in 2.5.x, only read-chunk-cache
operations use the newly created thread pools; other thread pool call
sites will be changed in the near future.
4. The master branch doesn't need this PR, as the caching layer has unified
the chunk cache behaviour.

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-06-26 15:52:43 +08:00
aoiasd
7feeeabca5
enhance: [2.5] bm25 stats local cache use local storage path (#42924)
relate: https://github.com/milvus-io/milvus/pull/42923

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-25 13:44:46 +08:00
cai.zhang
78b66a29b6
fix: [2.5] Reduce task slot for standalone to 1/4 of normal datanode (#42809)
issue: #42129 

master pr: #42808

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-20 16:38:46 +08:00
cai.zhang
e30fc0fbaf
enhance: [2.5] Make Web UI toggleable via config (#42815)
issue: #42813

master pr: #42814

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-18 12:20:39 +08:00
yihao.dai
f978641d6a
enhance: [2.5] Enhance import integration tests and logs (#42696)
1. Optimize the import process: skip subsequent steps and mark the task
as complete if the number of imported rows is 0.
2. Improve import integration tests:
a. Add a test to verify that autoIDs are not duplicated
b. Add a test for the corner case where all data is deleted
c. Shorten test execution time
3. Enhance import logging:
a. Print imported segment information upon completion
b. Include file name in failure logs

issue: https://github.com/milvus-io/milvus/issues/42488,
https://github.com/milvus-io/milvus/issues/42518

pr: https://github.com/milvus-io/milvus/pull/42612

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-16 20:06:38 +08:00
aoiasd
5110130b2e
enhance: add segment bm25 stats local cache (#41775) (#42646)
relate: https://github.com/milvus-io/milvus/issues/41424
pr: https://github.com/milvus-io/milvus/pull/41775

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-13 16:50:37 +08:00
Bingyi Sun
a32b55ed71
enhance: support auto index type for json index (#42161)
issue: #42070 
pr: #42071

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-09 21:20:34 +08:00
yihao.dai
28aa364bf7
enhance: [2.5] Adjust default import buffer size (#42542)
Increase insert buffer size from 16MB to 64MB, while keeping delete
buffer size at 16MB.

issue: https://github.com/milvus-io/milvus/issues/42518

pr: https://github.com/milvus-io/milvus/pull/42541

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-05 18:46:33 +08:00
yihao.dai
72a8777c9d
enhance: [2.5] Accelerate dispatcher building (#42544)
Reduce check interval to accelerate dispatcher building.

issue: https://github.com/milvus-io/milvus/issues/42067

pr: https://github.com/milvus-io/milvus/pull/42500

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-05 18:16:33 +08:00
yihao.dai
fdfb78b9e5
fix: [2.5] Fix duplicate autoID between import and insert (#42520)
Remove the unlimited logID mechanism and switch to redundantly
allocating a large number of IDs.

issue: https://github.com/milvus-io/milvus/issues/42518

pr: https://github.com/milvus-io/milvus/pull/42519

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-05 00:54:33 +08:00
liliu-z
1cab5dc2b2
enhance: Make cagra gpu image default (#42193)
pr: #41906
issue: #41907

Signed-off-by: yusheng.ma <yusheng.ma@zilliz.com>
Signed-off-by: Li Liu <li.liu@zilliz.com>
Co-authored-by: presburger <yusheng.ma@zilliz.com>
2025-05-30 03:12:30 +08:00
wei liu
4a05180f88
enhance: [2.5] support balancing multiple collections in single trigger (#41875) (#42134)
issue: #41874
pr: #41875
- Optimize balance_checker to support balancing multiple collections
simultaneously
- Add new parameters for segment and channel balancing batch sizes
- Add enableBalanceOnMultipleCollections parameter
- Update tests for balance checker

This change improves resource utilization by allowing the system to
balance multiple collections in a single trigger with configurable batch
sizes.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-05-28 23:18:30 +08:00
congqixia
6c17cdffd8
enhance: [2.5] Take nq into slow query consideration (#42109) (#42125)
Cherry-pick from master
pr: #42109
Related to #40756

A large nq naturally increases query time, which causes many slow-query
log entries when users' nq values are very large.

This PR makes the slow search check use the span per nq (the average value)
to decide whether a request is slow.
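
A minimal sketch of the per-nq check described above, with a hypothetical function name and millisecond units chosen for illustration:

```python
def is_slow_search(total_span_ms, nq, threshold_ms):
    """Compare the average span per query vector against the slow threshold."""
    # Guard against nq <= 0 so the division is always defined.
    span_per_nq = total_span_ms / max(nq, 1)
    return span_per_nq > threshold_ms
```

Under this rule a 10-second request with nq=100 averages 100 ms per query vector, so it would not be flagged against a 5-second threshold, whereas the same duration with nq=1 would be.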

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-28 19:02:30 +08:00
Chun Han
81ed143132
enhance: refine expiring compaction(#41336) (#42052)
related: #41336
pr: https://github.com/milvus-io/milvus/pull/42056

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-05-25 16:44:27 +08:00
yihao.dai
83ca664150
fix: [2.5] Fix import slot assignment (#41982)
Assign the import task to the worker with the most available slots, even
if availableSlots < requiredSlots. This ensures tasks won’t be blocked
indefinitely.
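
The selection rule can be sketched as follows. The function name and the dictionary of worker IDs to available slots are hypothetical; the point is that the required slot count no longer gates the choice.

```python
def pick_worker(available_slots):
    """available_slots: {worker_id: available slot count}. Returns a worker ID or None."""
    if not available_slots:
        return None
    # Choose the worker with the most free slots, even if that is fewer
    # than the task requires, so the task is never left unassigned.
    return max(available_slots, key=available_slots.get)
```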

issue: https://github.com/milvus-io/milvus/issues/41981

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-23 01:36:30 +08:00
Chun Han
043e333290
enhance: support strict expiry compaction for milvus(#41855) (#41856)
related: #41855

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-05-19 09:50:24 +08:00
yihao.dai
7c8370ccd2
fix: [2.5] Fix ants.Pool goroutine leak (#41893)
1. Release the pool after it is no longer in use.
2. Upgrade ants.Pool to fix the goroutine leak issue (see
https://github.com/panjf2000/ants/pull/287).

issue: https://github.com/milvus-io/milvus/issues/41838

pr: https://github.com/milvus-io/milvus/pull/41892

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-16 19:12:22 +08:00
cai.zhang
dc1e9e2f81
fix: [2.5] Don't create index for unsorted importing segment when enable stats (#41865)
issue: #41863 

master pr: #41864

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-05-16 10:18:35 +08:00
wei liu
aed074d83e
fix: unexpected password for root user (#41818)
issue: #41816
pr: #41817
PR #37983 introduced an issue: if `defaultRootPassword` is not specified
in milvus.yaml, then `"Milvus"` (including the quotes) is used as the
default password for the root user, instead of `Milvus`.

This PR fixes the unexpected root password and adds a comment noting that
a large numeric password requires double quotes.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-05-14 19:40:22 +08:00
Julien Salleyron
9b8a8f7607
fix: Allow to compile on windows (#41617)
This PR fixes https://github.com/milvus-io/milvus/issues/41384 on 2.5.

Related to #41448.

When using the Milvus client and compiling on Windows, compilation failed
with an undefined RSS error.

On Windows, the way to get the memory used is the same as on Darwin.

Signed-off-by: Julien Salleyron <julien.salleyron@gmail.com>
2025-05-14 11:33:40 +08:00
Xianhui Lin
548754a5e3
fix: fallback to mixcoord session when upgrade to mixCoord (#41773)
fix: fallback to mixcoord session when upgrade to mixCoord
issue: https://github.com/milvus-io/milvus/issues/41737

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-13 23:12:58 +08:00
zhagnlu
5b8ea84d38
fix: add params to ignore config type exception (#41777)
pr: #41776

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-05-13 11:28:57 +08:00
Chun Han
69a80b9ce3
enhance: resize high priority wqthreadpool dynamically(#40838) (#41549)
related: #40838

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-04-28 00:44:39 +08:00
SimFG
18eb627533
fix: [2.5] Update logging context and upgrade dependencies (#41319)
- issue: #41291
- pr: #41318

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-24 23:50:40 +08:00
foxspy
d5977ec521
enhance: [2.5] add force rebuild index configuration (#41432)
issue: #41431 
pr: #41473

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-04-23 21:44:38 +08:00
aoiasd
544493e3e2
feat:[2.5] support multi analyzer for bm25 function (#41456)
relate: https://github.com/milvus-io/milvus/issues/41213
pr: https://github.com/milvus-io/milvus/pull/41351

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-23 20:52:39 +08:00
congqixia
f2a5542996
enhance: [2.5] Adapt hyphen in grpc metadata header (#41358) (#41372)
Cherry-pick from master
pr: #41358
Related to #41357

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-17 19:12:39 +08:00
Ted Xu
daa48f6806
fix: erroneous deadlock report in unittests (#41350) (#41377)
See #41349 #41291
pr: #41350

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-04-17 18:00:37 +08:00
congqixia
b073112e16
enhance: [2.5][Restful] Make default timeout configurable (#41211) (#41225)
Cherry-pick from master
pr: #41211

The RESTful API default timeout was hard-coded. This PR makes this
timeout value configurable via paramtable.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-11 13:12:27 +08:00
congqixia
d75596456a
enhance: [2.5] Rectify client_request_id logic (#41089) (#41149)
Cherry-pick from master
pr: #41089 
The traceID is not initialized from the client_request_id in the context.
If the client sends a valid traceID, the Milvus log prints two different
traceIDs, which is weird.

This PR adds logic to try parsing the incoming `client_request_id` into a
traceID. If parsing succeeds, it is used as the request traceID; otherwise
it is set in a separate field named `client_request_id`.
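
The fallback logic can be sketched as below, assuming a valid traceID is 32 hexadecimal characters and not all zeros (the W3C Trace Context format). The function name, the returned dictionary shape, and the lowercasing step are hypothetical choices for illustration.

```python
import re

_TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")


def resolve_trace_fields(client_request_id, generated_trace_id):
    """Return the log fields to attach for one request."""
    candidate = client_request_id.lower()
    if _TRACE_ID_RE.fullmatch(candidate) and candidate != "0" * 32:
        # The client value parses as a trace ID, so reuse it directly.
        return {"traceID": candidate}
    # Otherwise keep the server-side trace ID and log the client value
    # under its own field.
    return {
        "traceID": generated_trace_id,
        "client_request_id": client_request_id,
    }
```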

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-09 10:26:27 +08:00