22749 Commits

Author SHA1 Message Date
Zhen Ye
e97e44d56e
enhance: limit the gc concurrency when cpu is high (#43059)
issue: #42833

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-04 09:22:43 +08:00
congqixia
1d9a9a993d
fix: [StorageV2] Use correct template typename for cache_raw_data_to_disk_common (#43104)
Related to #43099

Previously `cache_raw_data_to_disk_common` used `milvus::DataType`
template typename, which shall be `knowhere::bf16` or other actual
datatype.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-03 18:50:46 +08:00
Zhen Ye
bbbc7d4517
enhance: collect all cgo calling into metric and log slow cgo call (#43035)
issue: #42833

- also fix the error metric for async cgo.
- also make sure the roles can be seen when node startup, #43041.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-03 15:00:44 +08:00
cai.zhang
f6b2a71c95
enhance: Remove chunkmanager-related dependencies from datanode (#43021)
issue: #41611

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-03 14:44:45 +08:00
congqixia
1fae5230fe
fix: Check field mmap property before apply collection level one (#43090)
Related to #43089

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-03 14:30:44 +08:00
Bingyi Sun
6e38e9d18f
fix: Add json cast type for flat index (#42970)
issue: #42916

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-07-03 14:14:44 +08:00
nico
a8d174a841
test: update nightly test cases (#42991)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2025-07-03 09:18:51 +08:00
sparknack
7e855f1046
enhance: add disk file writer with Direct IO support (#42665)
issue: #43040 

This patch introduces a disk file writer that supports Direct IO.

Currently, it is exclusively utilized during the QueryNode load process.

Below is its parameters:

1. `common.diskWriteMode`
This parameter controls the write mode of the local disk, which is used
to write temporary data downloaded from remote storage.
Currently, only QueryNode uses 'common.diskWrite*' parameters. Support
for other components will be added in the future.
The options include 'direct' and 'buffered'. The default value is
'buffered'.

2. `common.diskWriteBufferSizeKb`
Disk write buffer size in KB, only used when disk write mode is
'direct', default is 64KB.
Current valid range is [4, 65536]. If the value is not aligned to 4KB,
it will be rounded up to the nearest multiple of 4KB.

3. `common.diskWriteNumThreads`
This parameter controls the number of writer threads used for disk write
operations. The valid range is [0, hardware_concurrency].
It is designed to limit the maximum concurrency of disk write operations
to reduce the impact on disk read performance.
For example, if you want to limit the maximum concurrency of disk write
operations to 1, you can set this parameter to 1.
The default value is 0, which means the caller will perform write
operations directly without using an additional writer thread pool.
In this case, the maximum concurrency of disk write operations is
determined by the caller's thread pool size.

Both parameters can be updated during runtime.

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-07-02 22:18:44 +08:00
congqixia
7bc7b18ed5
fix: [AddField] Prevent concurrent load during UpdateSchema (#43043)
Related to #43028

This PR:
- Add mutex prevent concurrent load segment & schema change
- Add schema verison field in load meta
- Update schema in PutOrRef if schema verison is larger

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-02 17:38:44 +08:00
zhikunyao
a71196ee65
enhance: update minio to RELEASE.2024-05-28T17-19-04Z (#43063)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-07-02 15:30:43 +08:00
congqixia
8962b0058d
fix: [StorageV2] Check writer nil when closing not written one (#43056)
Related to #43047

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-02 14:22:43 +08:00
Zhen Ye
906430f291
fix: enable tls integration test after tls fixed at streaming arch (#43025)
issue: #42680

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-02 11:42:43 +08:00
zhikunyao
9886d7d4b6
enhance: Master updatecmake zhikun (#43023)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-07-02 11:36:43 +08:00
zhuwenxing
d0e9547869
test: add restful case for add field (#43044)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-07-02 11:18:43 +08:00
zhuwenxing
7b26cef3be
test: add group by for fts hybrid search (#43037)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-07-02 11:16:50 +08:00
yanliang567
e8011908ac
test: Add tests for partition key filter issue and ttl eventually search (#43052)
related issue: #42918
1. add tests for ttl eventually search
2. add tests for partition key filter 
3. improve check query results for output fields 
4. verify some fix for rabitq index and update the test accordingly
5. update gen random float vector in (-1, 1) instead of (0,1)

---------

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-07-02 11:02:43 +08:00
Zhen Ye
09c6df62d8
fix: use impl and remove the close method of broadcast service (#42992)
issue: #38399

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-02 10:30:44 +08:00
wei liu
c381bf3e41
enhance: add logs for count(*) (#43001)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-01 19:36:43 +08:00
Zhen Ye
08fff353af
fix: Revert "enhance: Enable mergeSort by default starting from version 2.6.0 (#42981)" (#43046)
issue: #43034

- implementation of mergeSortMultipleSegments is wrong.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-01 17:30:29 +08:00
Spade A
26ec841feb
feat: optimize Like query with n-gram (#41803)
Ref #42053

This is the first PR for optimizing `LIKE` with ngram inverted index.
Now, only VARCHAR data type is supported and only InnerMatch LIKE
(%xxx%) query is supported.


How to use it:
```
milvus_client = MilvusClient("http://localhost:19530")
schema = milvus_client.create_schema()
...
schema.add_field("content_ngram", DataType.VARCHAR, max_length=10000)
...
index_params = milvus_client.prepare_index_params()
index_params.add_index(field_name="content_ngram", index_type="NGRAM", index_name="ngram_index", min_gram=2, max_gram=3)
milvus_client.create_collection(COLLECTION_NAME, ...)
```

min_gram and max_gram controls how we tokenize the documents. For
example, for min_gram=2 and max_gram=4, we will tokenize each document
with 2-gram, 3-gram and 4-gram.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-07-01 10:08:44 +08:00
wei liu
396120ade5
enhance: Improve delegator serviceable check with coordinator sync state (#42975)
issue: #42404
Add syncedByCoord field to ensure delegator only becomes serviceable
after coordinator sync, preventing unreliable service state when memory
is insufficient.

Issue: When memory is low, delegator may become serviceable before
current target is ready, but segments can be released at any time,
making the serviceable state unreliable.

Changes include:
- Add syncedByCoord field to track coordinator sync status
- Update Serviceable() to require both data readiness and coord sync
- Set syncedByCoord=true in SyncTargetVersion
- Add comprehensive test coverage

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-01 10:00:43 +08:00
cai.zhang
c82943dca1
enhance: Enable mergeSort by default starting from version 2.6.0 (#42981)
issue: #42980 

Enable mergeSort for mix compaction to reduce sort stats tasks.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-30 21:46:43 +08:00
Zhen Ye
ecb24e7232
enhance: use multi-process framework in integration test (#42976)
issue: #41609

- add env `MILVUS_NODE_ID_FOR_TESTING` to set up a node id for milvus
process.
- add env `MILVUS_CONFIG_REFRESH_INTERVAL` to set up the refresh
interval of paramtable.
- Init paramtable when calling `paramtable.Get()`.
- add new multi process framework for integration test.
- change all integration test into multi process.
- merge some test case into one suite to speed up it.
- modify some test, which need to wait for issue #42966, #42685.
- remove the waittssync for delete collection to fix issue: #42989

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-30 14:22:43 +08:00
wei liu
c919340763
enhance: Optimize channel node balancing for uneven QN distribution (#42786)
issue: #42860
Fix channel node allocation when QueryNode count is not a multiple of
channel count. The previous algorithm used simple division which caused
uneven distribution with remainders.

Key improvements:
- Implement smart remainder distribution algorithm
- Refactor large function into focused helper functions
- Support two-phase rebalancing (release then allocate)
- Handle edge cases like insufficient nodes gracefully

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-30 12:14:42 +08:00
sre-ci-robot
519f57de67
[automated] Bump milvus version to v2.5.14 (#43012)
Bump milvus version to v2.5.14
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-06-30 10:24:54 +08:00
sre-ci-robot
d327fc5ee6
[automated] Bump milvus version to v2.5.14 (#43005)
Bump milvus version to v2.5.14
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-06-28 15:54:47 +08:00
rhys
48661655d6
fix: streamingcoord and streamingnode client support internal tls (#42685)
https://github.com/milvus-io/milvus/issues/42680

streamingnode/streamingcoord support internal tls

Signed-off-by: rhys <sdbwlr@163.com>
2025-06-27 17:50:42 +08:00
Zhen Ye
8367e4ec6a
fix: set 72h for wal retention (#42910)
issue: #42706

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-27 17:36:43 +08:00
zhikunyao
cfe7fd356d
test: [skip e2e]update helm to fix kafka image update (#42993)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-06-27 17:09:14 +08:00
Bingyi Sun
23c784cf69
fix: Fix querynode crash caused by json index (#42982)
issue: https://github.com/milvus-io/milvus/issues/42978

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-27 16:44:41 +08:00
XuanYang-cn
17f1ab71bb
enhance: Remove not inused BuildIndexInfo (#42926)
1. removed not inuse cgo methods in index_c.h/cpp
2. removed indexcogowrapper/build_index_info.go

See also: #39242

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-06-27 15:00:42 +08:00
congqixia
9b06ecb72f
enhance: [StorageV2] Release record and close reader (#42983)
Related to #39173

This PR
- Close packed reader after sort
- Release arrow.Record preventing memory leakage
- Invoke `pack_reader->Close()` for CloseReader

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-27 14:46:43 +08:00
sthuang
238bd30f42
fix: [StorageV2] end to end minor issues for sync, stats, and load (#42948)
Fix issues in end-to-end tests: 
1. **Split column groups based on schema**, rather than estimating by
average chunk row size. **Ensure column group consistency within a
segment**, to avoid errors caused by loading multiple column group
chunks simultaneously.
2. **Use sorted segmentId** when generating the stats binlog path, to
ensure consistent and correct file path resolution.
3. **Determine field IDs as follows**:
For multi-column column groups, retrieve the field ID list from
metadata.
For single-column column groups, use the column group ID directly as the
field ID.

related: #39173 
fix: #42862

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-27 14:44:42 +08:00
zhikunyao
94e2b05ffd
enhance: init v2 ci pipelines policy 2 (#42853)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-06-27 14:33:41 +08:00
zhuwenxing
76004eb6cf
test: add model rerank testcases (#42657)
/kind improvement

---------

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-06-27 10:48:43 +08:00
zhuwenxing
8bda337f52
test: add text embedding function restful api test (#42977)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-06-27 10:46:41 +08:00
Zhen Ye
2d73e6eaa8
fix: mixcoord will not handle timetick anymore (#42965)
issue: #42954

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-26 19:14:42 +08:00
Zhen Ye
3602817c53
fix: dynamic log level for streaming node (#42964)
issue: #42963

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-26 19:12:50 +08:00
congqixia
5dd1f841d2
enhance: [AddField] Add Restful API for addfield (#42972)
Related to #39718

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-26 18:46:41 +08:00
Bingyi Sun
289b8b85d3
enhance: remove name check for alter index task (#42953)
issue: https://github.com/milvus-io/milvus/issues/42952

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-26 16:32:41 +08:00
congqixia
1139ebb964
enhance: [GoSDK] Return SchemaMismatch error to retry (#42950)
Related to #39718

After adding field, composing write request may failure and shall
trigger retry with new schema. This PR make composing error returns
SchemaMismatch error to trigger retry policy

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-26 10:28:41 +08:00
foxspy
be05b653c1
enhance: update knowhere version (#42938)
issue: #42937

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-06-26 01:22:41 +08:00
yihao.dai
d7c9914eff
fix: Consider fields number when preallocating ids for import (#42810)
In corner cases where there are many fields but only a small number of
rows to import, the default preallocated IDs may be insufficient. To
address this, consider the number of fields when preallocating IDs.

issue: https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-25 23:38:41 +08:00
wei liu
be492c2939
fix: Add missing keylocks in ReleasePartition operation (#42940)
issue: #42098
Fix concurrent access issue by adding proper locking around
ReleasePartition operation to prevent race conditions when releasing
partitions on the same collection.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-25 21:48:42 +08:00
congqixia
336e743b55
fix: [AddField] Respect growing mmap setting adding empty field (#42933)
Related to #42856

Data under mmapped growing segment shall be treated respecting
growingMmap setting. Otherwise, varchar datatype could be treated with
logic error.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-25 21:10:42 +08:00
congqixia
942055fa7d
fix: Use task timestamp to calculate TTL timestamp (#42920)
Related to #42918

Previously the `CollectionTtlTimestamp` could be overflowed when the
guarantee_ts==1, which means using `Eventually` consistency level.

This patch use task timestamp, allocated by scheduler, to generate ttl
timestamp ignore the potential very small timestamp being used.

Also add overflow check for ttl timestamp calculated.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-25 20:48:42 +08:00
zhagnlu
69872f45ad
fix: fix is_not_in for trie index (#42716)
#42604

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-25 16:52:42 +08:00
cai.zhang
ebe1c95bb1
enhance: Add Size interface to FileReader to eliminate the StatObject call during Read (#42908)
issue: #42907

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-25 14:36:41 +08:00
aoiasd
e2566c0e92
enhance: bm25 stats local cache use local storage path (#42923)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-25 13:44:46 +08:00
XuanYang-cn
0dfe5308e1
enhance: Tidy Download and decode in segcore storage (#42902)
1. Unify calling from GetObjectData
2. Move SetData inside Deserialize

See also: #40013

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-06-25 11:10:43 +08:00