22211 Commits

Author SHA1 Message Date
SimFG
91d40fa558
fix: Update logging context and upgrade dependencies (#41318)
- issue: #41291

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-23 10:52:38 +08:00
Mario Camou
6b30e9ae60
feat: Export milvusclient.annRequest (issue: #41261) (#41356)
Issue: #41261

`milvusclient.NewHybridSearchOption` receives a variadic `annRequests`
parameter. However, since `milvusclient.annRequest` is private, there is
no way to declare a slice, therefore there is no way to make it fully
generic (as in, create a slice of `milvusclient.annRequest`s and pass
them to `NewHybridSearchOption`. This PR renames
`milvusclient.annRequest` to `milvusclient.AnnRequest` to export it.

This is an API change since it's renaming a struct. However, since the
struct was previously private no external code depends on it, unless
it's doing nasty things with reflection (in which case it should not
depend on the name).

Signed-off-by: Mario Camou <mcamou@users.noreply.github.com>

Signed-off-by: mcamou <mcamou@users.noreply.github.com>
2025-04-23 02:38:38 +08:00
aoiasd
11f2fae42e
feat: support extend default dict for jieba tokenizer (#41360)
relate: https://github.com/milvus-io/milvus/issues/41213

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-22 20:34:37 +08:00
Shubhendra Kushwaha
e615e3daa9
fix: add missing OS_NAME variable in build script for Linux (#40242)
issue: #40243
This helps identify the running OS version during the build process,
ensuring better logging.

Signed-off-by: Shubhendra Kushwaha <shubhendrakushwaha94@gmail.com>
2025-04-22 18:06:37 +08:00
congqixia
481938297c
enhance: [AddField] Use next field id instead of global allocation (#41440)
Related to #39718

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-22 17:14:37 +08:00
congqixia
6f4e0d8e38
enhance: [AddField] Use schema update ts as guarantee ts (#41430)
Related to #39718

Use schema update ts when it's greater than calculated guarantee
timestamp to make sure that all read request using updated schema shall
wait all schema change event processed.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-22 17:12:45 +08:00
congqixia
b36c88f3c8
enhance: [AddField] Broadcast schema change via WAL (#41373)
Related to #39718

Add Broadcast logic for collection schema change and notifies:
- Streamnode - Delegator
- Streamnode - Flush component
- QueryNodes via grpc

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-22 16:28:37 +08:00
aoiasd
110c5aaaf4
feat: support icu and language identifier tokenizer (#41214)
relate: https://github.com/milvus-io/milvus/issues/41213

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-22 15:56:37 +08:00
cqy123456
5219d9a723
fix: Inserting null and non-null array at the same time will cause milvus crash when growing mmap open (#41051)
issue: https://github.com/milvus-io/milvus/issues/40981
2.5 pr: https://github.com/milvus-io/milvus/pull/41052

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-04-22 12:26:37 +08:00
ThreadDao
7cec96f892
test: [skip-e2e] add nightly tests for woodpecker mq (#41427)
Signed-off-by: ThreadDao <yufen.zong@zilliz.com>
2025-04-22 11:46:36 +08:00
Zhen Ye
7f5a9a6046
fix: unstable timeticksync unittest (#41437)
issue: #38399

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-22 10:53:29 +08:00
Zhen Ye
9339bccccc
enhance: move sent first timeticksync, make recovery more easier (#41405)
issue: #38399

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-21 17:18:37 +08:00
zhikunyao
ac1e04372f
enhance: Update go env to 1.24.1 (#41415)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-04-21 16:29:50 +08:00
aoiasd
f166843c5e
enhance: support use lindera tag filter (#40416)
relate: https://github.com/milvus-io/milvus/issues/39659

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-21 15:56:36 +08:00
Xianhui Lin
c5428c12eb
feat: Add support for modifying max capacity of array fields (#41404)
feat: Add support for modifying max capacity of array fields

This commit adds support for modifying the max capacity of array fields
in the `alterCollectionFieldTask` function. It checks if the field is an
array type and then validates and updates the max capacity value. This
change improves the flexibility of array fields in the collection.

Issue: https://github.com/milvus-io/milvus/issues/41363

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-21 15:52:37 +08:00
sparknack
8ccb875e41
enhance: add simde package (#40943)
issue: #40942

Add simde package, which can make porting SIMD code to other
architectures much easier.

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-04-21 12:18:40 +08:00
aoiasd
24eb70f382
enhance: [GOSDK] support run analyzer for go client (#39973)
relate: https://github.com/milvus-io/milvus/issues/39705

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-21 10:24:40 +08:00
Spade A
5b1430f27e
enhance: tantivy collector set bitset directly (#39748)
fix: #39755

The following shows a simple benchmark where insert 1M docs where all
rows are "hello", the latency is segcore level, CPU is 9900K:
master: 2.62ms
this PR: 2.11ms

bench mark code:

```
TEST(TextMatch, TestPerf) {
    auto schema = GenTestSchema({}, true);
    auto seg = CreateSealedSegment(schema, empty_index_meta);
    int64_t N = 1000000;
    uint64_t seed = 19190504;
    auto raw_data = DataGen(schema, N, seed);
    auto str_col = raw_data.raw_->mutable_fields_data()
                       ->at(1)
                       .mutable_scalars()
                       ->mutable_string_data()
                       ->mutable_data();
    for (int64_t i = 0; i < N - 1; i++) {
        str_col->at(i) = "hello";
    }
    SealedLoadFieldData(raw_data, *seg);
    seg->CreateTextIndex(FieldId(101));

    auto now = std::chrono::high_resolution_clock::now();
    auto expr = GetMatchExpr(schema, "hello", OpType::TextMatch);
    auto final = ExecuteQueryExpr(expr, seg.get(), N, MAX_TIMESTAMP);
    auto end = std::chrono::high_resolution_clock::now();
    auto duration =
        std::chrono::duration_cast<std::chrono::microseconds>(end - now);
    std::cout << "TextMatch query time: " << duration.count() << "ms"
              << std::endl;
}
```

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-20 23:02:41 +08:00
Chun Han
016920b023
fix: solve incompitable problem for none-encoding index(#40838) (#41369)
related: #40838

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-04-20 22:56:44 +08:00
Zhen Ye
c4a41cc32b
fix: add node id check to avoid double flush at most time (#41236)
issue: #41028

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-20 22:44:38 +08:00
Zhen Ye
ef4923e66b
fix: catchup scan never done if wal truncate (#41345)
issue: #41062

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-20 22:40:37 +08:00
Zhen Ye
78fca7e88d
fix: transaction should retry if transaction is expired (#41379)
issue: #41248

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-20 22:38:36 +08:00
tinswzy
6fa68c1f16
enhance: Support Woodpecker as a WAL storage option for Milvus (#41095)
#40916 Support Woodpecker as a WAL storage option for Milvus

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-04-20 22:22:42 +08:00
Zhen Ye
c893344289
fix: close of wal is block when recovery (#41326)
issue: #41307

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-18 16:14:35 +08:00
sre-ci-robot
43d982bd11
[automated] Bump milvus version to v2.5.10 (#41399)
Bump milvus version to v2.5.10
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-18 14:12:33 +08:00
sre-ci-robot
d7f0ff02d5
[automated] Bump milvus version to v2.5.10 (#41397)
Bump milvus version to v2.5.10
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-18 14:08:41 +08:00
Xianhui Lin
c43f8f7944
feat: Ignore reporting index metrics for non-existent indexes (#41294)
feat: Ignore reporting index metrics for non-existent indexes

Remove the reporting of index metrics for non-existent indexes in the
`getCollectionMetrics` function. This change improves the code by
skipping unnecessary operations and reduces log noise.
issue: https://github.com/milvus-io/milvus/issues/41280

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-18 10:36:36 +08:00
Ted Xu
d50781c8cc
enhance: support nullable group by keys (#41313)
See #36264

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-04-18 10:08:34 +08:00
Spade A
62293cb582
fix: revert batch add (#41374)
issue: #41375

todo: to fix the problems fixed in the issue.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-17 22:32:38 +08:00
Bingyi Sun
4552dd4b23
fix: Fix json index does not work for string filter (#41382)
issue: #35528

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-04-17 20:10:39 +08:00
zhuwenxing
511c4d373a
test: add concurrent index creation verification (#40174)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-04-17 17:18:32 +08:00
yanliang567
6a9ecb760f
test: Update binary pagination search assertion for more stable (#41370)
related issue: #41362

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-04-17 14:16:32 +08:00
Ted Xu
93788255eb
fix: errorous deadlock report in unittests (#41350)
See #41349

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-04-17 13:16:31 +08:00
congqixia
52c7d62012
enhance: Adapt hyphen in grpc metadata header (#41358)
Related to #41357

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-17 10:52:33 +08:00
Xiaowei Shi
1a35374672
fix: correct wrong querynode metric labels (#41344)
issue: https://github.com/milvus-io/milvus/issues/41343

Signed-off-by: Xiaowei Shi <shallwe.shih@gmail.com>
2025-04-16 21:32:33 +08:00
Sungyun Hur
fbd595eae1
fix: correct a minor typo in log message of kafka client (#41216)
fixes a minor typo that notify users that some special kafka configs are
overwritten in Milvus.

Signed-off-by: lambert <lambert@daangn.com>
2025-04-16 21:14:20 +08:00
cai.zhang
5fd8a196f6
fix: Fix panic with nil pointer dereference when get indexed segment (#41297)
issue: #41288

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-04-16 20:54:00 +08:00
Xianhui Lin
deb610e5d3
fix: update MixCoord registration in MilvusRoles (#41337)
enhance: update MixCoord registration in MilvusRoles

The `runMixCoord` function in `MilvusRoles` was updated to use the
`RegisterMixCoord` function from the `rootcoord_metrics` package instead
of `RegisterRootCoord`. This change aligns with the recent modifications
made to the `rootcoord_metrics` package.
issue:https://github.com/milvus-io/milvus/issues/41338

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-16 19:49:54 +08:00
박상호
4be6d0e967
fix: skip dim check for non-vector fields in PreCheck (#41287) (#41289)
## What this PR does

This PR fixes an issue where the `PreCheck` function in DataCoord logs
unnecessary warnings
when attempting to retrieve 'dim' from non-vector fields.

The change adds a check to only call `GetDimFromParams` when the field
type is a vector type.

## Related issue

Fixes #41287

---------

Signed-off-by: 박상호 <sangho@rapportlabs.kr>
Signed-off-by: Sangho Park <hoyaspark@gmail.com>
2025-04-16 17:52:32 +08:00
Xiaowei Shi
a6606ce9c6
fix: check PreCreatedTopic first in shard number validation (#41274)
issue : https://github.com/milvus-io/milvus/issues/41271

Signed-off-by: Xiaowei Shi <shallwe.shih@gmail.com>
2025-04-16 17:38:34 +08:00
sthuang
e46e3a1708
enhance: optimize error log message for list policy (#41251)
related: #41250

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-04-16 17:16:32 +08:00
sthuang
1f1c836fb9
feat: Storage v2 growing segment load (#41001)
support parallel loading sealed and growing segments with storage v2
format by async reading row groups.
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-04-16 17:14:33 +08:00
zhuwenxing
d4b56ea4ab
test: [skip e2e]fix mixcoord label selectors (#41340)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-04-16 15:18:31 +08:00
Spade A
70d13dcf61
enhance: update tantivy for removing "doc_id" fast field (#41198)
Issue: #41210

After https://github.com/zilliztech/tantivy/pull/5, we can provide
milvus row id directly to tantivy rather than record it in the fast
field "doc_id".
So rather than search tantivy doc id and then get milvus row id from
"doc_id", now, the searched tantivy doc id is the milvus row id,
eliminating the expensive acquiring row id phase.

The following shows a simple benchmark where insert **1M** docs where
all rows are "hello", the latency is **segcore** level, CPU is 9900K:

![image](https://github.com/user-attachments/assets/d8e72134-56b5-430b-8628-36c3bed8eaad)
**The latency is 2.02 and 2.1 times respectively.**

bench mark code:
```
TEST(TextMatch, TestPerf) {
    auto schema = GenTestSchema({}, true);
    auto seg = CreateSealedSegment(schema, empty_index_meta);
    int64_t N = 1000000;
    uint64_t seed = 19190504;
    auto raw_data = DataGen(schema, N, seed);
    auto str_col = raw_data.raw_->mutable_fields_data()
                       ->at(1)
                       .mutable_scalars()
                       ->mutable_string_data()
                       ->mutable_data();
    for (int64_t i = 0; i < N - 1; i++) {
        str_col->at(i) = "hello";
    }
    SealedLoadFieldData(raw_data, *seg);
    seg->CreateTextIndex(FieldId(101));

    auto now = std::chrono::high_resolution_clock::now();
    auto expr = GetMatchExpr(schema, "hello", OpType::TextMatch);
    auto final = ExecuteQueryExpr(expr, seg.get(), N, MAX_TIMESTAMP);
    auto end = std::chrono::high_resolution_clock::now();
    auto duration =
        std::chrono::duration_cast<std::chrono::microseconds>(end - now);
    std::cout << "TextMatch query time: " << duration.count() << "ms"
              << std::endl;
}
```

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-15 20:20:32 +08:00
Bingyi Sun
a953eaeaf0
enhance: support binary range expression for json path index (#41025)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-04-15 19:32:33 +08:00
congqixia
1d564a2d95
fix: Make TestScannerAdaptorReadError stable (#41303)
Related to #41302

Previously wait for 200 milliseconds could cause unsable behavior of
this unittest. This PR make unittest wait for certain function call
instead of wait for some time.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-15 14:54:34 +08:00
congqixia
a53f3024cf
fix: Add save field schema log for kv_catalog.AlterCollection (#41242)
Related to #41241

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-15 12:14:32 +08:00
zhuwenxing
6a12304d1e
test: add alter collection checker (#41281)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-04-15 11:02:34 +08:00
yihao.dai
dccfc69660
enhance: Get compaction params from request (#41125)
Make DataNode use compaction parameters from request instead of
configuration.

issue: https://github.com/milvus-io/milvus/issues/41123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-04-15 10:28:53 +08:00
cai.zhang
bc11feae74
fix: Close client before remove worker client (#41253)
issue: #41252

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-04-15 10:26:31 +08:00