10423 Commits

Author SHA1 Message Date
Zhen Ye
78fca7e88d
fix: transaction should retry if transaction is expired (#41379)
issue: #41248

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-20 22:38:36 +08:00
tinswzy
6fa68c1f16
enhance: Support Woodpecker as a WAL storage option for Milvus (#41095)
#40916 Support Woodpecker as a WAL storage option for Milvus

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-04-20 22:22:42 +08:00
Zhen Ye
c893344289
fix: closing the WAL is blocked during recovery (#41326)
issue: #41307

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-18 16:14:35 +08:00
Xianhui Lin
c43f8f7944
feat: Ignore reporting index metrics for non-existent indexes (#41294)
feat: Ignore reporting index metrics for non-existent indexes

Remove the reporting of index metrics for non-existent indexes in the
`getCollectionMetrics` function. This change improves the code by
skipping unnecessary operations and reduces log noise.
issue: https://github.com/milvus-io/milvus/issues/41280

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-18 10:36:36 +08:00
Ted Xu
d50781c8cc
enhance: support nullable group by keys (#41313)
See #36264

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-04-18 10:08:34 +08:00
Spade A
62293cb582
fix: revert batch add (#41374)
issue: #41375

TODO: fix the problems described in the issue.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-17 22:32:38 +08:00
Bingyi Sun
4552dd4b23
fix: Fix json index does not work for string filter (#41382)
issue: #35528

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-04-17 20:10:39 +08:00
Xiaowei Shi
1a35374672
fix: correct wrong querynode metric labels (#41344)
issue: https://github.com/milvus-io/milvus/issues/41343

Signed-off-by: Xiaowei Shi <shallwe.shih@gmail.com>
2025-04-16 21:32:33 +08:00
cai.zhang
5fd8a196f6
fix: Fix panic with nil pointer dereference when get indexed segment (#41297)
issue: #41288

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-04-16 20:54:00 +08:00
Xianhui Lin
deb610e5d3
fix: update MixCoord registration in MilvusRoles (#41337)
enhance: update MixCoord registration in MilvusRoles

The `runMixCoord` function in `MilvusRoles` was updated to use the
`RegisterMixCoord` function from the `rootcoord_metrics` package instead
of `RegisterRootCoord`. This change aligns with the recent modifications
made to the `rootcoord_metrics` package.
issue: https://github.com/milvus-io/milvus/issues/41338

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-16 19:49:54 +08:00
박상호
4be6d0e967
fix: skip dim check for non-vector fields in PreCheck (#41287) (#41289)
## What this PR does

This PR fixes an issue where the `PreCheck` function in DataCoord logs
unnecessary warnings when attempting to retrieve 'dim' from non-vector
fields.

The change adds a check to only call `GetDimFromParams` when the field
type is a vector type.
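
The guard described above can be sketched as follows. This is a minimal illustration, not the actual DataCoord code: `FieldType`, `Field`, `isVectorType`, `getDim`, and `preCheckDim` are hypothetical stand-ins for the real schema types and for `GetDimFromParams`.

```go
package main

import (
	"fmt"
	"strconv"
)

// FieldType is a hypothetical stand-in for the schema data type enum.
type FieldType int

const (
	Int64 FieldType = iota
	VarChar
	FloatVector
	BinaryVector
)

// Field is a hypothetical, simplified field schema.
type Field struct {
	Name   string
	Type   FieldType
	Params map[string]string
}

// isVectorType reports whether the field stores vectors and therefore
// is expected to carry a "dim" parameter.
func isVectorType(t FieldType) bool {
	return t == FloatVector || t == BinaryVector
}

// getDim parses "dim" from the field params; it fails when "dim" is absent.
func getDim(params map[string]string) (int64, error) {
	raw, ok := params["dim"]
	if !ok {
		return 0, fmt.Errorf("dim not found in params")
	}
	return strconv.ParseInt(raw, 10, 64)
}

// preCheckDim looks up dim only for vector fields, so scalar fields never
// reach getDim and never produce a spurious warning.
func preCheckDim(f Field) (dim int64, checked bool, err error) {
	if !isVectorType(f.Type) {
		return 0, false, nil
	}
	dim, err = getDim(f.Params)
	return dim, true, err
}

func main() {
	fields := []Field{
		{Name: "id", Type: Int64},
		{Name: "embedding", Type: FloatVector, Params: map[string]string{"dim": "128"}},
	}
	for _, f := range fields {
		dim, checked, err := preCheckDim(f)
		fmt.Println(f.Name, dim, checked, err)
	}
}
```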

## Related issue

Fixes #41287

---------

Signed-off-by: 박상호 <sangho@rapportlabs.kr>
Signed-off-by: Sangho Park <hoyaspark@gmail.com>
2025-04-16 17:52:32 +08:00
Xiaowei Shi
a6606ce9c6
fix: check PreCreatedTopic first in shard number validation (#41274)
issue: https://github.com/milvus-io/milvus/issues/41271

Signed-off-by: Xiaowei Shi <shallwe.shih@gmail.com>
2025-04-16 17:38:34 +08:00
sthuang
e46e3a1708
enhance: optimize error log message for list policy (#41251)
related: #41250

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-04-16 17:16:32 +08:00
sthuang
1f1c836fb9
feat: Storage v2 growing segment load (#41001)
Support loading sealed and growing segments in parallel with the storage
v2 format by reading row groups asynchronously.
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-04-16 17:14:33 +08:00
Spade A
70d13dcf61
enhance: update tantivy for removing "doc_id" fast field (#41198)
Issue: #41210

After https://github.com/zilliztech/tantivy/pull/5, we can provide the
milvus row id directly to tantivy rather than recording it in the fast
field "doc_id".
Instead of searching for the tantivy doc id and then looking up the
milvus row id in "doc_id", the tantivy doc id returned by a search is now
the milvus row id, which eliminates the expensive row id lookup phase.

The following is a simple benchmark that inserts **1M** docs in which
every row is "hello"; latency is measured at the **segcore** level on a 9900K CPU:

![image](https://github.com/user-attachments/assets/d8e72134-56b5-430b-8628-36c3bed8eaad)
**The latency is 2.02 and 2.1 times respectively.**

Benchmark code:
```
TEST(TextMatch, TestPerf) {
    auto schema = GenTestSchema({}, true);
    auto seg = CreateSealedSegment(schema, empty_index_meta);
    int64_t N = 1000000;
    uint64_t seed = 19190504;
    auto raw_data = DataGen(schema, N, seed);
    auto str_col = raw_data.raw_->mutable_fields_data()
                       ->at(1)
                       .mutable_scalars()
                       ->mutable_string_data()
                       ->mutable_data();
    for (int64_t i = 0; i < N - 1; i++) {
        str_col->at(i) = "hello";
    }
    SealedLoadFieldData(raw_data, *seg);
    seg->CreateTextIndex(FieldId(101));

    auto now = std::chrono::high_resolution_clock::now();
    auto expr = GetMatchExpr(schema, "hello", OpType::TextMatch);
    auto final = ExecuteQueryExpr(expr, seg.get(), N, MAX_TIMESTAMP);
    auto end = std::chrono::high_resolution_clock::now();
    auto duration =
        std::chrono::duration_cast<std::chrono::microseconds>(end - now);
    std::cout << "TextMatch query time: " << duration.count() << "us"
              << std::endl;
}
```

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-15 20:20:32 +08:00
Bingyi Sun
a953eaeaf0
enhance: support binary range expression for json path index (#41025)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-04-15 19:32:33 +08:00
congqixia
1d564a2d95
fix: Make TestScannerAdaptorReadError stable (#41303)
Related to #41302

Previously, waiting for 200 milliseconds could make this unit test
unstable. This PR makes the unit test wait for a specific function call
instead of waiting for a fixed amount of time.
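
A minimal sketch of the general technique, assuming a hand-written test double: the fake signals a channel the first time it is called, and the test blocks on that signal with a generous upper bound instead of sleeping for a fixed interval. `fakeReader`, `newFakeReader`, and `waitForCall` are hypothetical names and are not taken from the actual test.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// fakeReader is a hypothetical test double that records its first call.
type fakeReader struct {
	once   sync.Once
	called chan struct{}
}

func newFakeReader() *fakeReader {
	return &fakeReader{called: make(chan struct{})}
}

// Read signals the first invocation and then returns an injected error.
func (r *fakeReader) Read() error {
	r.once.Do(func() { close(r.called) })
	return errors.New("injected read error")
}

// waitForCall blocks until Read has been invoked, or gives up after timeout.
// The timeout is only a safety net; the happy path returns as soon as the
// call happens, however fast or slow the machine is.
func waitForCall(r *fakeReader, timeout time.Duration) bool {
	select {
	case <-r.called:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	r := newFakeReader()
	// The code under test calls Read at some point we cannot predict.
	go func() { _ = r.Read() }()
	fmt.Println("read observed:", waitForCall(r, 5*time.Second))
}
```

Compared with a fixed sleep, this keeps the test fast when the call happens early and avoids false failures when the call happens late.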

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-15 14:54:34 +08:00
congqixia
a53f3024cf
fix: Add save field schema log for kv_catalog.AlterCollection (#41242)
Related to #41241

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-15 12:14:32 +08:00
yihao.dai
dccfc69660
enhance: Get compaction params from request (#41125)
Make DataNode use compaction parameters from request instead of
configuration.

issue: https://github.com/milvus-io/milvus/issues/41123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-04-15 10:28:53 +08:00
cai.zhang
bc11feae74
fix: Close client before remove worker client (#41253)
issue: #41252

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-04-15 10:26:31 +08:00
Xianhui Lin
3963fc818f
fix: Add debug memory freeing in sortStats (#41284)
issue: https://github.com/milvus-io/milvus/issues/41218

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-15 09:56:29 +08:00
Xianhui Lin
23f9226250
fix: Initialize streaming coordinator during mixCoord initialization (#41283)
related-pr: https://github.com/milvus-io/milvus/pull/41006
issue: https://github.com/milvus-io/milvus/issues/41282

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-15 09:44:30 +08:00
Chun Han
59b14d38f5
enhance: Optimize index format for improved load performance (#40838) (#40839)
related: https://github.com/milvus-io/milvus/issues/40838

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-04-15 03:10:30 +08:00
Spade A
736512a59e
fix: change log info to debug for collection ref (#41267)
issue: #41268

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-14 15:16:30 +08:00
Bingyi Sun
bf617115ca
enhance: Remove single chunk segment related codes (#39249)
https://github.com/milvus-io/milvus/issues/39112

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-04-11 18:56:29 +08:00
congqixia
154a2a68e0
enhance: Fill dbname for AddCollectionFieldRequest (#41237)
Related to #39718

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-11 18:54:29 +08:00
Xianhui Lin
f9febe3bae
enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006)
Merge RootCoord, DataCoord And QueryCoord into MixCoord
Make Session into one
issue: https://github.com/milvus-io/milvus/issues/37764

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-11 16:36:30 +08:00
Spade A
9ce3e3cb44
enhance: add documents in batch for json key stats (#41228)
issue: https://github.com/milvus-io/milvus/issues/40897

After this change, the scheduling duration of document add operations
decreases from roughly 6s to 0.9s for the case in the issue.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-11 14:08:26 +08:00
Bingyi Sun
b9b8419cbf
fix: Use int32 when creating array index for element type int8/int16 (#41185)
issue: #41172
Elements of type int8 or int16 in an Array are encoded as int32, so we
should parse them as int32 when creating the index.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-04-11 13:18:25 +08:00
Xianhui Lin
144911aec6
fix: CreateStatsRequest change storage_version to 25 consistent with 2.5 (#41217)
fix: CreateStatsRequest change storage_version to 25 consistent with 2.5
related-pr: https://github.com/milvus-io/milvus/pull/38039
issue: https://github.com/milvus-io/milvus/issues/36995

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-11 11:16:43 +08:00
Sungyun Hur
7d946af00f
fix: correct typo kvmetestore -> kvmetastore (#40965)
Fixes a minor typo by replacing `kvmetestore` with `kvmetastore`.

Signed-off-by: lambert <lambert@daangn.com>
2025-04-11 11:13:35 +08:00
XuanYang-cn
e7a53da025
enhance: remove unused util/* in datanode (#41177)
See also: #41229

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-04-11 10:34:29 +08:00
foxspy
17e10beba0
fix: avoid segmentation faults caused by retrieving empty vector datasets (#40545)
issue: #40544

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-04-10 20:16:29 +08:00
Zhen Ye
224728c2d2
fix: catchup cannot work if using StartAfter (#41201)
issue: #41062

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-10 19:04:27 +08:00
XuanYang-cn
793fdeafe1
enhance: Refine logs in compaction trigger (#41171)
See also: #41118

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-04-10 18:08:26 +08:00
congqixia
486141cc12
enhance: [Restful] Make default timeout configurable (#41211)
The RESTful API default timeout was hard-coded. This PR makes the
timeout value configurable via paramtable.
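
As a rough sketch of the idea, the timeout can be read from configuration with the previous hard-coded value kept as a fallback. The key name `restful.defaultTimeoutSeconds`, the 30-second fallback, and the map-based config are assumptions for illustration only; the real change goes through Milvus's paramtable, which is not reproduced here.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// fallbackTimeout is an illustrative default, not the value used by Milvus.
const fallbackTimeout = 30 * time.Second

// timeoutFromConfig reads a timeout in seconds from a key/value config and
// falls back to the default when the key is missing or invalid.
func timeoutFromConfig(cfg map[string]string, key string) time.Duration {
	raw, ok := cfg[key]
	if !ok {
		return fallbackTimeout
	}
	secs, err := strconv.ParseInt(raw, 10, 64)
	if err != nil || secs <= 0 {
		return fallbackTimeout
	}
	return time.Duration(secs) * time.Second
}

func main() {
	// Hypothetical key name, used only for this sketch.
	const key = "restful.defaultTimeoutSeconds"

	cfg := map[string]string{key: "10"}
	fmt.Println("configured:", timeoutFromConfig(cfg, key))
	fmt.Println("missing key:", timeoutFromConfig(map[string]string{}, key))
}
```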

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-10 16:14:28 +08:00
wei liu
a839d94c9e
fix: balance checker may enter infinite normal balance loop after balance suspension (#41195)
issue: #41194
- Refactor hasUnbalancedCollection flag handling to function scope
- Ensure tracking sets are cleared when no balance is needed
- Add deferred cleanup for both normal/stopping balance paths
- Add unit tests for collection tracking scenarios

The changes ensure tracking sets (normalBalanceCollectionsCurrentRound
and stoppingBalanceCollectionsCurrentRound) are properly cleared when:
- All collections in current round are balanced
- Balance checks return early due to unready targets
- Balance feature flags are disabled
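
The cleanup pattern described above can be sketched as follows, assuming a heavily reduced checker. `checker`, `visited`, and `pickCollections` are hypothetical names; the real code tracks normalBalanceCollectionsCurrentRound and stoppingBalanceCollectionsCurrentRound separately and has more return paths.

```go
package main

import "fmt"

// checker is a hypothetical, reduced balance checker that remembers which
// collections it has already visited in the current round.
type checker struct {
	enabled bool
	visited map[int64]struct{}
}

// pickCollections returns up to batch unvisited collections that need
// balancing. The hasUnbalanced flag is function-scoped, and the deferred
// cleanup clears the tracking set on every return path that found nothing
// to balance, including early returns.
func (c *checker) pickCollections(all []int64, needsBalance func(int64) bool, batch int) []int64 {
	hasUnbalanced := false
	defer func() {
		if !hasUnbalanced {
			// Nothing to balance this time: start a fresh round on the next call.
			c.visited = make(map[int64]struct{})
		}
	}()

	if !c.enabled {
		return nil // early return still runs the deferred cleanup
	}

	var picked []int64
	for _, id := range all {
		if _, seen := c.visited[id]; seen {
			continue
		}
		c.visited[id] = struct{}{}
		if needsBalance(id) {
			hasUnbalanced = true
			picked = append(picked, id)
			if len(picked) >= batch {
				break
			}
		}
	}
	return picked
}

func main() {
	c := &checker{enabled: true, visited: make(map[int64]struct{})}
	onlyTwo := func(id int64) bool { return id == 2 }

	fmt.Println(c.pickCollections([]int64{1, 2, 3}, onlyTwo, 1), len(c.visited))
	fmt.Println(c.pickCollections([]int64{1, 2, 3}, onlyTwo, 1), len(c.visited))
}
```

Putting the cleanup in a `defer` means a newly added early return cannot forget to reset the tracking set, which is the kind of omission that can leave a checker stuck on a stale round.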

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-04-10 15:22:29 +08:00
Xianhui Lin
3bc24c264f
enhance: Add json key inverted index in stats for optimization (#38039)
Add json key inverted index in stats for optimization
https://github.com/milvus-io/milvus/issues/36995

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-04-10 15:20:28 +08:00
SimFG
a308d2c886
fix: get replicate channel position (#41188)
- issue: #41187

Signed-off-by: SimFG <bang.fu@zilliz.com>
2025-04-10 15:14:28 +08:00
cai.zhang
6f4dc8dda2
fix: Revert add a sign (positive or negative) to constants (#41191)
issue: #41174

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-04-10 14:54:28 +08:00
congqixia
b593bfd9a5
fix: [RESTFUL] Return error when writer body closed (#41183)
Related to #41181

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-10 14:34:28 +08:00
sthuang
0d45b24599
fix: show collections support custom privilege groups granted objects (#41203)
related: https://github.com/milvus-io/milvus/issues/41200

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-04-10 12:12:25 +08:00
Spade A
e9fa30f462
fix: remove single segment logic in V7 (#41159)
Ref: https://github.com/milvus-io/milvus/issues/40823

It does not make sense to create a single-segment tantivy index for
old versions such as 2.4 when using tantivy V7, so this PR removes the
relevant code.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-09 19:54:27 +08:00
zhagnlu
3ed23a5f48
fix: removing index type fails when remote storage is in local mode (#41164)
#41142

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-04-09 16:42:26 +08:00
zhagnlu
ee1faf80dd
fix: add clear bitmap for batch skip mode (#41166)
#41086 #41150

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-04-09 13:08:27 +08:00
zhenshan.cao
ecc2d80915
enhance: Add primary field name in SearchResult and QueryResults (#39220)
issue: https://github.com/milvus-io/milvus/issues/39219

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-04-09 10:48:25 +08:00
sthuang
50e02e3598
enhance: update packed reader api (#41055)
related: https://github.com/milvus-io/milvus/issues/39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-04-09 10:18:26 +08:00
congqixia
e2d8adb963
fix: Use element_type for Array is null operator (#41157)
Related to #41156

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-09 10:16:24 +08:00
cai.zhang
8a77fb9cdc
enhance: Support slot for index task and stats task (#39084)
issue: #39101

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-04-08 20:46:25 +08:00
Spade A
c6a0c2ab64
enhance: process tantivy document add by batch (#40124)
issue: https://github.com/milvus-io/milvus/issues/40006

This PR makes tantivy add documents in batches. Adding documents in
batches can greatly reduce the latency of scheduling the document add
operation (calling tantivy `add_document` only schedules the add
operation and returns immediately once it is scheduled), because each
call involves a tokio block_on, which is relatively heavy.

Reducing the scheduling time does not necessarily reduce the overall
latency if the index writer threads do not process indexing quickly enough.
But if scheduling itself is slow, the overall performance can still be
limited even when the index writer threads process indexing very fast
(by increasing the thread number).

The following code benchmarks the PR (note that the duration only covers
scheduling, not commit):
```
fn test_performance() {
    let field_name = "text";
    let dir = TempDir::new().unwrap();
    let mut index_wrapper = IndexWriterWrapper::create_text_writer(
        field_name,
        dir.path().to_str().unwrap(),
        "default",
        "",
        1,
        50_000_000,
        false,
        TantivyIndexVersion::V7,
    )
    .unwrap();

    let mut batch = vec![];
    for i in 0..1_000_000 {
        batch.push(format!("hello{:04}", i));
    }
    let batch_ref = batch.iter().map(|s| s.as_str()).collect::<Vec<_>>();

    let now = std::time::Instant::now();
    index_wrapper
        .add_data_by_batch(&batch_ref, Some(0))
        .unwrap();
    let elapsed = now.elapsed();
    println!("add_data_by_batch elapsed: {:?}", elapsed);
}
```
Latency roughly reduces from 1.4s to 558ms.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-08 19:50:24 +08:00