10283 Commits

Author SHA1 Message Date
Spade A
001fc992df
enhance: get doc ids by batch (#40608)
issue: #40607

tantivy change: https://github.com/zilliztech/tantivy/pull/3

Benchmarks:
Test Environment: CPU 9900K
The data is inserted by:
```
for i in 0..N {
    for j in 0..UNIQUE {
        let key = format!("hello{}", j);
        index_writer.add_string(&key, i * UNIQUE + j).unwrap();
    }
}
```
So UNIQUE influences the locality of the matched docs.
The latency is the average latency over 1000 repeated queries.
The result shows a 22.5%-34.8% latency reduction.
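
To illustrate why UNIQUE controls locality: in the loop above, the key `hello{j}` is added with the values `j, UNIQUE + j, 2 * UNIQUE + j, ...`, so consecutive matches for one key are UNIQUE apart. The sketch below reproduces only that arithmetic. It assumes the second argument of `add_string` is the doc id, and `doc_ids_for_key` is a hypothetical helper that is not part of tantivy or Milvus.

```rust
/// Doc ids that the insertion loop assigns to key `hello{j}`:
/// iteration `i` adds the id `i * unique + j`.
fn doc_ids_for_key(n: u64, unique: u64, j: u64) -> Vec<u64> {
    (0..n).map(|i| i * unique + j).collect()
}

fn main() {
    // UNIQUE = 1: a single key matches every doc, so the ids are contiguous.
    println!("{:?}", doc_ids_for_key(4, 1, 0));
    // UNIQUE = 100: the matches for one key are 100 ids apart.
    println!("{:?}", doc_ids_for_key(4, 100, 7));
}
```

A larger UNIQUE spreads the matched doc ids further apart, which is the locality effect the benchmark varies.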

![image](https://github.com/user-attachments/assets/dd8af75a-ddc3-445d-92df-50d354dd5645)

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-14 15:48:09 +08:00
cai.zhang
6dbe5d475e
enhance: Refine task meta with key lock (#40613)
issue: #39101

2.5 pr: #40146 #40353

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-03-14 15:44:22 +08:00
SimFG
bf4fc6a8c6
feat: add DDLDB rate type and related quota configurations (#40651)
- issue: #40650

Signed-off-by: SimFG <bang.fu@zilliz.com>
2025-03-14 15:38:09 +08:00
Spade A
f36d1562bd
enhance: add metrics for random sample (#40634)
issue: #39541

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-13 21:42:11 +08:00
yihao.dai
bab30a41bf
enhance: Improve import error msgs (#40567)
issue: https://github.com/milvus-io/milvus/issues/40208

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-13 21:02:07 +08:00
Zhen Ye
f6fb4bc442
fix: backoff will retry infinitely after reaching max elapse (#40589)
issue: #40588

Signed-off-by: chyezh <chyezh@outlook.com>
2025-03-13 16:24:06 +08:00
yihao.dai
b2a8694686
enhance: Merge IndexNode and DataNode (#40272)
Merge DataNode and IndexNode into DataNode.

issue: https://github.com/milvus-io/milvus/issues/39115

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-13 14:26:11 +08:00
Ted Xu
df4285c9ef
enhance: API integration with storage v2 in clustering-compactions (#40133)
See #39173

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-03-13 14:12:06 +08:00
Zhen Ye
5735c3ef19
fix: too many memory usage of streaming node (#40606)
issue: #40592

Signed-off-by: chyezh <chyezh@outlook.com>
2025-03-13 07:10:07 +08:00
Xiaofan
fb48b3c7ac
fix: empty sparse row in importer (#40585)
fix #40584

The parquet bulk writer cannot finish a 0-dim sparse vector.

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2025-03-13 01:29:41 +08:00
Spade A
9f3bd55755
fix: avoid panic when field not exists in schema in query node (#40541)
ref #40473

This PR is a workaround to avoid the panic described in the issue.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-12 22:44:08 +08:00
jaime
c8a96377bb
enhance: move object storage client creation to pkg package (#40440)
issue: #40439

Signed-off-by: jaime <yun.zhang@zilliz.com>
2025-03-12 20:38:07 +08:00
yihao.dai
27c7cbbc72
fix: Fix QueryNodeNumEntities metric (#40602)
fix QueryNodeNumEntities metric introduced by pr
https://github.com/milvus-io/milvus/pull/39536

issue: https://github.com/milvus-io/milvus/issues/38162

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-12 19:08:05 +08:00
wei liu
0420dc1eb1
fix: use correct delete checkpoint to prevent premature data cleanup (#40366)
issue: #40292
related to #39552

- Fix incorrect delete checkpoint usage in SyncDistribution
- Change checkpoint parameter from action.GetCheckpoint() to
action.GetDeleteCP() in SyncTargetVersion call
- This resolves the issue where delete buffer data was being cleaned
prematurely due to wrong checkpoint reference

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-03-12 15:00:08 +08:00
sthuang
c0e03b6ca4
fix: rbac star privilege return empty when listing policy (#40553)
related: https://github.com/milvus-io/milvus/issues/40547

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-12 14:16:05 +08:00
Spade A
95e2680a36
fix: ref collection for search/query (#40549)
ref https://github.com/milvus-io/milvus/issues/40473

The collection is obtained without taking a ref, which means it could be
released and the struct freed during the search, leading to schema
inconsistency.

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-12 11:30:07 +08:00
Bingyi Sun
0698d04f7d
enhance: Upgrade simdjson version (#40538)
issue: https://github.com/milvus-io/milvus/issues/40519
Newer versions of simdjson return better error codes.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-11 15:04:05 +08:00
cai.zhang
e5f50076ec
enhance: Only check element type with not null array (#40446)
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-03-11 14:58:07 +08:00
yihao.dai
a33c9372ce
fix: Fix channel not balance on datanodes (#40422)
1. Prevent channels from being assigned to only one datanode during
datacoord startup.
2. Optimize the channel assignment policy by considering newly assigned
channels.
3. Make msgdispatcher manager lock-free.

issue: https://github.com/milvus-io/milvus/issues/40421,
https://github.com/milvus-io/milvus/issues/37630

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-11 14:56:16 +08:00
Bingyi Sun
a729bb84ba
enhance: add json path escape and replace $meta with dynamic field name (#40407)
issue: #35528

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-11 14:00:05 +08:00
Zhen Ye
d9fe8f0dcf
fix: [skip e2e] wab unittest may failure (#40470)
issue: #38399

Signed-off-by: chyezh <chyezh@outlook.com>
2025-03-11 11:34:06 +08:00
congqixia
3899b0f0d4
fix: Add duplicated type/index params check creating collection (#40462)
Related to #40461

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-03-11 10:14:11 +08:00
junjiejiangjjj
359e7efd8e
feat: Add function running monitoring (#40358)
#35856 
#40004 
1. Optimize model verification logic
2. Add profiling code

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-03-10 22:28:05 +08:00
Bingyi Sun
0a7e692b6f
fix: Fix null offset loading in inverted index (#40523)
issue: #40516

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-10 22:12:04 +08:00
Cai Yudong
2bd2cca04a
enhance: Truly support multi vector data types in SearchBruteForce (#40499)
Issue: #38666

Signed-off-by: CaiYudong <yudong.cai@zilliz.com>
2025-03-10 18:36:03 +08:00
yihao.dai
2ca2e2dbc8
fix: Fix parsing import endTs (#40332)
Parse import beginTs and endTs as hybrid timestamps.

issue: https://github.com/milvus-io/milvus/issues/40326

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-10 17:38:04 +08:00
XuanYang-cn
e6c46a25ea
enhance: Use correct counter metrics for overall wa calculation (#40394)
- Use CounterVec to calculate sum of increase during a time period.
- Use entries number instead of binlog size

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-10 16:34:06 +08:00
congqixia
391804c7fb
enhance: Add channel seal policy based on blocking l0 (#40505)
Related to #40502

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-03-10 16:28:04 +08:00
XuanYang-cn
6f70e6d1e1
enhance: Log start position of delete msgs (#40315)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-10 14:58:05 +08:00
sre-ci-robot
a6d4121034
[automated] Update Knowhere Commit (#40486)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-03-10 12:28:04 +08:00
XuanYang-cn
4bebca6416
enhance: Replace currRows with NumOfRows (#40074)
See also: #40068

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-10 12:16:03 +08:00
cai.zhang
d6a650bd14
fix: Skip executing stats for zero segment (#40448)
issue: #40241

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-03-09 21:14:02 +08:00
smellthemoon
faae8ee518
fix: store wrong offset when build tantivy in nullable field (#40452)
#40454

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-03-09 09:34:04 +08:00
Bingyi Sun
37b118d55d
fix: Skip loading primary key if index has raw data (#39921)
issue: https://github.com/milvus-io/milvus/issues/39907

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-06 17:46:02 +08:00
congqixia
7fbeb5624e
enhance: Avoid convert body byte slice to string in httpserver (#40405)
Converting a byte slice to a string may copy the underlying data, which
may cost extra memory and CPU time in httpserver.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-03-06 16:28:02 +08:00
congqixia
5c5273f95e
fix: Pass Knapsack ptr to avoid compact multiple times (#40400)
Related to #40388

Small segments may be put into a bucket twice because Knapsack.packWith
takes its parameter by value.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-03-06 15:42:03 +08:00
sthuang
e0ec1aceeb
fix: skip storage v2 unstable ut for now (#40378)
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-05 20:30:39 +08:00
congqixia
fde80bc8b7
enhance: Remove debug log in rg handler v2 (#40376)
Remove debug log in resource group handler

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-03-05 19:18:00 +08:00
Spade A
3db56560fb
fix: fix concurrent issues in null offset (#40363)
issue: #40308
This PR fixes two concurrency issues:
1. Elements in null_offset are used to set a bitset whose size is
initialized from the tantivy document count. However, some documents may
be recorded as null in null_offset before they are committed in tantivy,
so an out-of-range array access can occur.
2. null_offset can be read and written concurrently, but there is no
synchronization protection.
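
One possible shape of these two fixes, as a hedged sketch in Rust rather than the actual Milvus code: the null offsets sit behind a lock, and offsets beyond the committed document count are skipped when filling the bitset. `NullOffsets`, `record_null`, and `null_bitset` are hypothetical names, and the bitset is simplified to a `Vec<bool>`.

```rust
use std::sync::RwLock;

/// Hypothetical holder for the offsets of null rows, shared by writers and readers.
struct NullOffsets {
    offsets: RwLock<Vec<usize>>,
}

impl NullOffsets {
    fn new() -> Self {
        Self { offsets: RwLock::new(Vec::new()) }
    }

    /// Writer side: record that the row at `offset` is null.
    fn record_null(&self, offset: usize) {
        self.offsets.write().unwrap().push(offset);
    }

    /// Reader side: build a bitset sized by the committed document count.
    /// Offsets of rows that are not committed yet are skipped instead of
    /// indexing past the end of the bitset.
    fn null_bitset(&self, committed_docs: usize) -> Vec<bool> {
        let mut bitset = vec![false; committed_docs];
        for &offset in self.offsets.read().unwrap().iter() {
            if offset < committed_docs {
                bitset[offset] = true;
            }
        }
        bitset
    }
}

fn main() {
    let nulls = NullOffsets::new();
    nulls.record_null(1);
    nulls.record_null(5); // row 5 is null but not committed yet
    println!("{:?}", nulls.null_bitset(3));
}
```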

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-05 17:48:00 +08:00
Ted Xu
878ce56079
fix: correct memory size estimation on arrays (#40312)
See: #40342

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-03-05 16:54:09 +08:00
Zhen Ye
1637cf5664
enhance: better logging for grpc resolver (#40337)
issue: #40311

- better logging for grpc resolver
- remove the redundant streaming node manage client when streaming
service is disabled

Signed-off-by: chyezh <chyezh@outlook.com>
2025-03-05 15:12:01 +08:00
Ted Xu
96952ad3c5
fix: compaction task cannot be generated if size greater than max size (#40348)
See: #40343

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
2025-03-05 14:40:01 +08:00
yihao.dai
004a1875dc
enhance: Introduce batch subscription in msgdispatcher (#39863)
Introduce a batch subscription mechanism in msgdispatcher: the
msgdispatcher now includes a vchannel watch task queue, where all
vchannels in the queue will subscribe to the MQ only once and pull
messages from the oldest vchannel checkpoint to the latest.
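
A rough way to picture the batching, as a sketch rather than the msgdispatcher implementation: the queued vchannels share one subscription that starts at the oldest checkpoint, and each pulled message is forwarded only to the vchannels whose own checkpoint it has reached. All names below are hypothetical, and positions are simplified to integers.

```rust
/// Hypothetical watch task: a vchannel and the checkpoint it resumes from.
struct WatchTask {
    vchannel: String,
    checkpoint: u64,
}

/// Position at which the single shared subscription should start:
/// the oldest checkpoint among the queued vchannels.
fn start_position(tasks: &[WatchTask]) -> Option<u64> {
    tasks.iter().map(|t| t.checkpoint).min()
}

/// Vchannels that should receive a message at `position`:
/// those whose checkpoint is not newer than the message.
fn targets(tasks: &[WatchTask], position: u64) -> Vec<&str> {
    tasks
        .iter()
        .filter(|t| t.checkpoint <= position)
        .map(|t| t.vchannel.as_str())
        .collect()
}

fn main() {
    let tasks = vec![
        WatchTask { vchannel: "v1".to_string(), checkpoint: 10 },
        WatchTask { vchannel: "v2".to_string(), checkpoint: 25 },
    ];
    println!("{:?}", start_position(&tasks));
    println!("{:?}", targets(&tasks, 12));
    println!("{:?}", targets(&tasks, 30));
}
```

In this simplified model a message between the two checkpoints reaches only the older vchannel, so one pass over the stream serves every queued vchannel.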

issue: https://github.com/milvus-io/milvus/issues/39862

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-05 14:38:02 +08:00
Bingyi Sun
be4d09561b
fix: Fix missing null or non-exist key in json index (#40336)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-05 11:48:02 +08:00
sthuang
63a7c4570e
feat: storage v2 sync (#39663)
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-05 11:22:15 +08:00
SimFG
a3755cf409
fix: improve error handling and unit tests for InitMetaCache function (#40322)
- issue: #40320

Signed-off-by: SimFG <bang.fu@zilliz.com>
2025-03-05 11:08:13 +08:00
junjiejiangjjj
b2e630b1a1
feat: Support TEI serving and support int8 embedding (#40199)
#35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-03-05 10:18:00 +08:00
Zhen Ye
9ca5088f62
fix: duplicate consuming from stream for invisible segment (#40316)
issue: #40207

Signed-off-by: chyezh <chyezh@outlook.com>
2025-03-04 15:54:00 +08:00
sthuang
d77756cf2d
fix: fix storage v2 cgo mem leak (#40305)
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-04 14:16:00 +08:00
XuanYang-cn
837ac295fa
enhance: Remove iterators in datanode (#40301)
Iterators have long been deprecated, but sort still uses them. This PR
unifies the stats task with the latest compaction common functions and
removes the usage of iterators.

1. Rename `datanode/compaction` to `datanode/compactor`
2. Add `internal/compaction` and move some compaction commons into it.
3. Replace `DeltalogIterators` with `ComposeDeleteFromDeltalogs`
4. Remove `datanode/iterators`

See also: #39242

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-04 12:14:00 +08:00