22784 Commits

Author SHA1 Message Date
XuanYang-cn
dab39c610b
enhance: remove unused DDLCodec (#41485)
See also: #39242

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-04-25 17:26:37 +08:00
Zhen Ye
01c0356ed3
fix: make add segment operation in meta idempotent (#41515)
issue: #41514
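
The commit does not show how idempotency is achieved. One common approach, sketched here with hypothetical types (`segmentMeta`, `metaStore`) that are not taken from the Milvus code, is to treat a repeated add of an already-known segment ID as a no-op:

```go
package main

import "fmt"

// segmentMeta is a hypothetical, simplified stand-in for real segment metadata.
type segmentMeta struct {
	ID   int64
	Rows int64
}

// metaStore is a hypothetical in-memory metadata table keyed by segment ID.
type metaStore struct {
	segments map[int64]segmentMeta
}

func newMetaStore() *metaStore {
	return &metaStore{segments: make(map[int64]segmentMeta)}
}

// AddSegment inserts seg once. Repeating the call with the same ID is a no-op
// rather than an error, so a retried request leaves the state unchanged.
// It reports whether the segment was newly added.
func (m *metaStore) AddSegment(seg segmentMeta) bool {
	if _, ok := m.segments[seg.ID]; ok {
		return false
	}
	m.segments[seg.ID] = seg
	return true
}

func main() {
	store := newMetaStore()
	fmt.Println(store.AddSegment(segmentMeta{ID: 1, Rows: 100}))
	fmt.Println(store.AddSegment(segmentMeta{ID: 1, Rows: 100}))
	fmt.Println(len(store.segments))
}
```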

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-25 16:50:38 +08:00
Jiang Chen
0b75ea56a4
doc: clarify the "fork and pull" process in contributing guide (#41474)
Signed-off-by: codingjaguar <jiang.chen@zilliz.com>
2025-04-25 16:23:52 +08:00
yanliang567
70b311735b
test: [E2e Refactor] use vector datatype instead of hard-coded datatype names (#41497)
related issue: #40698
1. use vector data types instead of hard-coded datatype names
2. update search pagination tests
3. remove checking distances in search results checking, because knowhere
customizes the distances for different metrics and indexes. Now only
assert that the distances are sorted correctly.
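
The "sorted only" assertion from item 3 can be sketched as follows. The helper name `distancesSorted` is hypothetical, and the ordering rule is an assumption: ascending for distance-style metrics such as L2, descending for similarity-style metrics such as IP or COSINE.

```go
package main

import (
	"fmt"
	"sort"
)

// distancesSorted reports whether distances run from best hit to worst hit.
// Assumption: ascending order for distance-style metrics (e.g. L2),
// descending order for similarity-style metrics (e.g. IP, COSINE).
// Equal neighbours are accepted in both directions.
func distancesSorted(distances []float32, ascending bool) bool {
	return sort.SliceIsSorted(distances, func(i, j int) bool {
		if ascending {
			return distances[i] < distances[j]
		}
		return distances[i] > distances[j]
	})
}

func main() {
	fmt.Println(distancesSorted([]float32{0.1, 0.2, 0.2, 0.9}, true))
	fmt.Println(distancesSorted([]float32{0.9, 0.5, 0.1}, false))
	fmt.Println(distancesSorted([]float32{0.1, 0.5, 0.3}, true))
}
```

Checking only the order keeps the test independent of the exact distance values an index returns.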

---------

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-04-25 10:46:38 +08:00
congqixia
6084930854
fix: [GoSDK] Loosen row-based insert data check (#41498)
Related to #41460

This PR loosens the insert data check based on schema. These checks should
actually happen on the milvus server side.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-25 10:38:38 +08:00
Xianhui Lin
1a6838b496
fix: json stats add map null check before insert into tantivy (#41505)
JSON stats: add a null check on the map before inserting into tantivy. The JSON
stats index may fail if there is no data.
issue: https://github.com/milvus-io/milvus/issues/41494

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-24 21:06:37 +08:00
Sungyun Hur
52f255c958
fix: mark CacheMemoryLimit as deprecated in configuration (#41451)
The `CacheMemoryLimit` configuration is never used in the current code, so
mark it as deprecated.

---------

Signed-off-by: lambert <lambert@daangn.com>
2025-04-24 20:38:40 +08:00
Zhen Ye
a3d621cb5e
fix: remove the concurrent limits for streaming service (#41484)
issue: #41479

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-24 20:36:38 +08:00
Julien Salleyron
9de0c84576
fix: Allow compiling on Windows (#41448)
This PR fixes #41384.

When the milvus client is compiled on Windows, the compilation fails
with an undefined RSS error.

On Windows, memory usage is obtained the same way as on darwin.

Signed-off-by: Julien Salleyron <julien.salleyron@gmail.com>
2025-04-24 20:34:38 +08:00
Zhen Ye
ecfc868dcb
fix: write buffer not unregistered when datasyncservice is gone (#41496)
issue: #41495

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-24 19:38:38 +08:00
zhuwenxing
b5fe6a5243
test: add icu tokenizer testcases (#41501)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-04-24 18:54:38 +08:00
ThreadDao
bcec6cc3c5
test: [skip-e2e] fix nightly wp deploy config (#41472)
Signed-off-by: ThreadDao <yufen.zong@zilliz.com>
2025-04-24 17:36:39 +08:00
junjiejiangjjj
e56adc121b
enhance: refactor embedding credentials manager (#41442)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-04-24 14:34:38 +08:00
congqixia
dbe54c2df8
enhance: [AddField] Resolve conflicts & make WAL ts collection updatets (#41476)
Related to #39718

This PR:
- Use WAL broadcast timestamp as Collection update timestamp
- Remove request_fields size assertion
- Remove proxy schema cache loaded field check & skip related cases
- other minor issues

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-24 12:06:39 +08:00
XuanYang-cn
540456041f
enhance: Remove unused binlog iterator (#41359)
See also: #41466

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-04-24 12:04:38 +08:00
Spade A
f3d878ab3f
fix: update tantivy for fixing phrase match (#41450)
issue: #41454
https://github.com/zilliztech/tantivy/pull/8 fixes the problem; this PR
updates tantivy.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-24 10:52:37 +08:00
Zhen Ye
5fd47c3c89
fix: mockery tool unavailable after upgrading golang version (#41481)
issue: #41291
pr: #41318

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-24 10:46:43 +08:00
junjiejiangjjj
f23df95a77
feat: Support decay rerank (#41223)
https://github.com/milvus-io/milvus/issues/35856
#41312

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-04-23 20:48:39 +08:00
aoiasd
f52c2909c4
feat: support multi analyzer for bm25 function (#41351)
relate: https://github.com/milvus-io/milvus/issues/41213

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-23 18:22:38 +08:00
congqixia
85ed200529
fix: Save update timestamp in catalog.AlterCollection API (#41468)
Related to #41467

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-23 16:48:37 +08:00
Xianhui Lin
3d4889586d
fix: JsonStats filter by conjunctExpr and improve the task slot calculation logic (#41459)
Optimized JSON filter execution by introducing
ProcessJsonStatsChunkPos() for unified position calculation and
GetNextBatchSize() for better batch processing.
Improved JSON key generation by replacing manual path joining with
milvus::Json::pointer() and adjusted slot size calculation for JSON key
index jobs.
Updated the task slot calculation logic in calculateStatsTaskSlot() to
handle the increased resource needs of JSON key index jobs.
issue: https://github.com/milvus-io/milvus/issues/41378
https://github.com/milvus-io/milvus/issues/41218
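
To illustrate why a JSON pointer is safer than manual path joining, here is a minimal sketch that assumes `milvus::Json::pointer()` follows RFC 6901 escaping; the commit does not state this. The helper `jsonPointer` is hypothetical and is not the Milvus implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// pointerEscaper applies the RFC 6901 escapes in a single pass:
// "~" becomes "~0" and "/" becomes "~1".
var pointerEscaper = strings.NewReplacer("~", "~0", "/", "~1")

// jsonPointer builds an RFC 6901 JSON pointer from a list of object keys.
// An empty key list yields "", which refers to the whole document.
func jsonPointer(keys []string) string {
	var b strings.Builder
	for _, k := range keys {
		b.WriteByte('/')
		b.WriteString(pointerEscaper.Replace(k))
	}
	return b.String()
}

func main() {
	keys := []string{"user", "tags/primary"}
	// Escaped pointer: the "/" inside the second key stays distinguishable.
	fmt.Println(jsonPointer(keys))
	// Naive joining: the result looks like three nested keys.
	fmt.Println("/" + strings.Join(keys, "/"))
}
```

With naive joining, a key that itself contains `/` is indistinguishable from a nested path, which is the ambiguity the escaping removes.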

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-23 16:30:37 +08:00
aoiasd
655cc7fe06
fix: bm25 stats idf oracle leak (#41425)
relate: https://github.com/milvus-io/milvus/issues/41424

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-23 14:28:37 +08:00
aoiasd
a16bd6263b
feat: support more languages for built-in stop words and add remove punct, regex filter (#41412)
relate: https://github.com/milvus-io/milvus/issues/41213

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-23 11:44:37 +08:00
SimFG
91d40fa558
fix: Update logging context and upgrade dependencies (#41318)
- issue: #41291

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-23 10:52:38 +08:00
Mario Camou
6b30e9ae60
feat: Export milvusclient.annRequest (issue: #41261) (#41356)
Issue: #41261

`milvusclient.NewHybridSearchOption` receives a variadic `annRequests`
parameter. However, since `milvusclient.annRequest` is private, there is
no way to declare a slice of it outside the package, and therefore no way
to make the call fully generic (as in, create a slice of
`milvusclient.annRequest`s and pass them to `NewHybridSearchOption`).
This PR renames `milvusclient.annRequest` to `milvusclient.AnnRequest` to
export it.

This is an API change since it's renaming a struct. However, since the
struct was previously private no external code depends on it, unless
it's doing nasty things with reflection (in which case it should not
depend on the name).
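
A minimal sketch of the pattern this enables, using hypothetical stand-ins rather than the real `milvusclient` types: the fields of `AnnRequest`, the `hybridSearchOption` struct and both helper functions are assumptions for illustration. Within a single package the capitalisation makes no difference, so the comments mark where it would matter for an external caller.

```go
package main

import "fmt"

// AnnRequest is a hypothetical stand-in for the exported type. Because the
// name is exported, code outside the defining package could write []*AnnRequest.
type AnnRequest struct {
	Field string
	Limit int
}

// hybridSearchOption is a hypothetical stand-in for the option being built.
type hybridSearchOption struct {
	Limit    int
	Requests []*AnnRequest
}

// newHybridSearchOption mimics a constructor with a variadic request parameter.
func newHybridSearchOption(limit int, reqs ...*AnnRequest) *hybridSearchOption {
	return &hybridSearchOption{Limit: limit, Requests: reqs}
}

// buildRequests assembles a slice at runtime. An external caller can only
// declare this slice type if the element type is exported.
func buildRequests(fields []string, limit int) []*AnnRequest {
	reqs := make([]*AnnRequest, 0, len(fields))
	for _, f := range fields {
		reqs = append(reqs, &AnnRequest{Field: f, Limit: limit})
	}
	return reqs
}

func main() {
	reqs := buildRequests([]string{"dense", "sparse"}, 10)
	// Spread the slice into the variadic parameter.
	opt := newHybridSearchOption(10, reqs...)
	fmt.Println(len(opt.Requests))
}
```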

Signed-off-by: Mario Camou <mcamou@users.noreply.github.com>

Signed-off-by: mcamou <mcamou@users.noreply.github.com>
2025-04-23 02:38:38 +08:00
aoiasd
11f2fae42e
feat: support extend default dict for jieba tokenizer (#41360)
relate: https://github.com/milvus-io/milvus/issues/41213

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-22 20:34:37 +08:00
Shubhendra Kushwaha
e615e3daa9
fix: add missing OS_NAME variable in build script for Linux (#40242)
issue: #40243
This helps identify the running OS version during the build process,
ensuring better logging.

Signed-off-by: Shubhendra Kushwaha <shubhendrakushwaha94@gmail.com>
2025-04-22 18:06:37 +08:00
congqixia
481938297c
enhance: [AddField] Use next field id instead of global allocation (#41440)
Related to #39718

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-22 17:14:37 +08:00
congqixia
6f4e0d8e38
enhance: [AddField] Use schema update ts as guarantee ts (#41430)
Related to #39718

Use the schema update ts when it is greater than the calculated guarantee
timestamp, so that all read requests using the updated schema wait until
all schema change events have been processed.
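
The rule above amounts to taking the larger of the two timestamps. A minimal sketch, with the hypothetical helper name `effectiveGuaranteeTs` and plain `uint64` timestamps as assumptions:

```go
package main

import "fmt"

// effectiveGuaranteeTs returns the timestamp a read request should wait for:
// the calculated guarantee timestamp, raised to the schema update timestamp
// when the schema changed later than that.
func effectiveGuaranteeTs(calculatedTs, schemaUpdateTs uint64) uint64 {
	if schemaUpdateTs > calculatedTs {
		return schemaUpdateTs
	}
	return calculatedTs
}

func main() {
	// The schema changed after the calculated guarantee point: wait for the schema change.
	fmt.Println(effectiveGuaranteeTs(100, 120))
	// The schema change is older: the calculated guarantee timestamp is kept.
	fmt.Println(effectiveGuaranteeTs(100, 80))
}
```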

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-22 17:12:45 +08:00
congqixia
b36c88f3c8
enhance: [AddField] Broadcast schema change via WAL (#41373)
Related to #39718

Add broadcast logic for collection schema changes, which notifies:
- Streamnode - Delegator
- Streamnode - Flush component
- QueryNodes via grpc

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-22 16:28:37 +08:00
aoiasd
110c5aaaf4
feat: support icu and language identifier tokenizer (#41214)
relate: https://github.com/milvus-io/milvus/issues/41213

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-22 15:56:37 +08:00
cqy123456
5219d9a723
fix: Inserting null and non-null arrays at the same time causes milvus to crash when growing mmap is enabled (#41051)
issue: https://github.com/milvus-io/milvus/issues/40981
2.5 pr: https://github.com/milvus-io/milvus/pull/41052

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-04-22 12:26:37 +08:00
ThreadDao
7cec96f892
test: [skip-e2e] add nightly tests for woodpecker mq (#41427)
Signed-off-by: ThreadDao <yufen.zong@zilliz.com>
2025-04-22 11:46:36 +08:00
Zhen Ye
7f5a9a6046
fix: unstable timeticksync unittest (#41437)
issue: #38399

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-22 10:53:29 +08:00
Zhen Ye
9339bccccc
enhance: move sent first timeticksync, make recovery easier (#41405)
issue: #38399

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-21 17:18:37 +08:00
zhikunyao
ac1e04372f
enhance: Update go env to 1.24.1 (#41415)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-04-21 16:29:50 +08:00
aoiasd
f166843c5e
enhance: support use lindera tag filter (#40416)
relate: https://github.com/milvus-io/milvus/issues/39659

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-21 15:56:36 +08:00
Xianhui Lin
c5428c12eb
feat: Add support for modifying max capacity of array fields (#41404)
feat: Add support for modifying max capacity of array fields

This commit adds support for modifying the max capacity of array fields
in the `alterCollectionFieldTask` function. It checks if the field is an
array type and then validates and updates the max capacity value. This
change improves the flexibility of array fields in the collection.
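
The validate-then-update step can be sketched as below. The helper `validateMaxCapacity` and the upper bound passed to it are assumptions for illustration; the actual check in `alterCollectionFieldTask` is not shown in the commit message.

```go
package main

import (
	"fmt"
	"strconv"
)

// validateMaxCapacity parses a new max capacity value supplied as a string
// property and checks that it lies in the range (0, limit].
func validateMaxCapacity(raw string, limit int64) (int64, error) {
	v, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("max capacity %q is not an integer: %w", raw, err)
	}
	if v <= 0 || v > limit {
		return 0, fmt.Errorf("max capacity %d is out of range (0, %d]", v, limit)
	}
	return v, nil
}

func main() {
	const assumedLimit = 4096 // illustrative upper bound, not taken from the commit
	for _, raw := range []string{"128", "0", "abc"} {
		v, err := validateMaxCapacity(raw, assumedLimit)
		fmt.Println(v, err)
	}
}
```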

Issue: https://github.com/milvus-io/milvus/issues/41363

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-21 15:52:37 +08:00
sparknack
8ccb875e41
enhance: add simde package (#40943)
issue: #40942

Add simde package, which can make porting SIMD code to other
architectures much easier.

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-04-21 12:18:40 +08:00
aoiasd
24eb70f382
enhance: [GOSDK] support run analyzer for go client (#39973)
relate: https://github.com/milvus-io/milvus/issues/39705

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-21 10:24:40 +08:00
Spade A
5b1430f27e
enhance: tantivy collector set bitset directly (#39748)
fix: #39755

The following shows a simple benchmark that inserts 1M docs where all
rows are "hello". The latency is measured at the segcore level; the CPU is a 9900K:
master: 2.62ms
this PR: 2.11ms

Benchmark code:

```
TEST(TextMatch, TestPerf) {
    auto schema = GenTestSchema({}, true);
    auto seg = CreateSealedSegment(schema, empty_index_meta);
    int64_t N = 1000000;
    uint64_t seed = 19190504;
    auto raw_data = DataGen(schema, N, seed);
    auto str_col = raw_data.raw_->mutable_fields_data()
                       ->at(1)
                       .mutable_scalars()
                       ->mutable_string_data()
                       ->mutable_data();
    for (int64_t i = 0; i < N - 1; i++) {
        str_col->at(i) = "hello";
    }
    SealedLoadFieldData(raw_data, *seg);
    seg->CreateTextIndex(FieldId(101));

    auto now = std::chrono::high_resolution_clock::now();
    auto expr = GetMatchExpr(schema, "hello", OpType::TextMatch);
    auto final = ExecuteQueryExpr(expr, seg.get(), N, MAX_TIMESTAMP);
    auto end = std::chrono::high_resolution_clock::now();
    auto duration =
        std::chrono::duration_cast<std::chrono::microseconds>(end - now);
    // duration is cast to microseconds above, so label the output accordingly
    std::cout << "TextMatch query time: " << duration.count() << "us"
              << std::endl;
}
```

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-20 23:02:41 +08:00
Chun Han
016920b023
fix: solve incompatible problem for non-encoding index(#40838) (#41369)
related: #40838

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-04-20 22:56:44 +08:00
Zhen Ye
c4a41cc32b
fix: add node id check to avoid double flush in most cases (#41236)
issue: #41028

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-20 22:44:38 +08:00
Zhen Ye
ef4923e66b
fix: catchup scan never finishes if the WAL is truncated (#41345)
issue: #41062

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-20 22:40:37 +08:00
Zhen Ye
78fca7e88d
fix: transaction should retry if transaction is expired (#41379)
issue: #41248

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-20 22:38:36 +08:00
tinswzy
6fa68c1f16
enhance: Support Woodpecker as a WAL storage option for Milvus (#41095)
#40916 Support Woodpecker as a WAL storage option for Milvus

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-04-20 22:22:42 +08:00
Zhen Ye
c893344289
fix: close of WAL is blocked during recovery (#41326)
issue: #41307

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-18 16:14:35 +08:00
sre-ci-robot
43d982bd11
[automated] Bump milvus version to v2.5.10 (#41399)
Bump milvus version to v2.5.10
Signed-off-by: sre-ci-robot <sre-ci-robot@users.noreply.github.com>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-18 14:12:33 +08:00
sre-ci-robot
d7f0ff02d5
[automated] Bump milvus version to v2.5.10 (#41397)
Bump milvus version to v2.5.10
Signed-off-by: sre-ci-robot <sre-ci-robot@users.noreply.github.com>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-18 14:08:41 +08:00
Xianhui Lin
c43f8f7944
feat: Ignore reporting index metrics for non-existent indexes (#41294)
feat: Ignore reporting index metrics for non-existent indexes

Remove the reporting of index metrics for non-existent indexes in the
`getCollectionMetrics` function. This change improves the code by
skipping unnecessary operations and reduces log noise.
issue: https://github.com/milvus-io/milvus/issues/41280

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-18 10:36:36 +08:00