10205 Commits

Author SHA1 Message Date
yihao.dai
8f077089ba
enhance: Accelerate listing objects during binlog import (#40047)
issue: https://github.com/milvus-io/milvus/issues/40030

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-02-26 15:55:57 +08:00
Chun Han
190ac11cd1
fix: cancel sub contexts casade when http request timeout(#40030) (#40059)
related: #40030

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-02-26 11:33:57 +08:00
junjiejiangjjj
162d241063
feat: Add siliconflow text embedding (#39867)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-02-26 11:01:56 +08:00
Bingyi Sun
f05e9628f6
fix: Fix search failure of null expression (#40129)
issue: #40095

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-25 20:43:55 +08:00
congqixia
e0b028ade5
enhance: Integrate holmes as pprof dumper (#40151)
Related to #40150

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-25 19:01:55 +08:00
XuanYang-cn
315cfb7f32
fix: Negative -1 executing compaction tasks (#39954)
See also: #39675

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-02-25 18:07:55 +08:00
Zhen Ye
84df80b5e4
enhance: refactor metrics of streaming (#40031)
issue: #38399

- add metrics for broadcaster component.
- add metrics for wal flusher component.
- add metrics for wal interceptors.
- add slow log for wal.
- add more label for some wal metrics. (local or remote/catcup or
tailing...)

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-25 12:25:56 +08:00
sthuang
90acc8a58f
enhance: upgrade go arrow version from 12.0.1 to 17.0.0 (#39916)
related: https://github.com/milvus-io/milvus/issues/39915

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-02-25 10:30:02 +08:00
Bingyi Sun
db4769281c
fix: Fall back to a brute-force search if json index type unmatched (#40076)
issue: https://github.com/milvus-io/milvus/issues/35528
If the query data type does not match the index type, fall back to a
brute-force search

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-24 16:25:57 +08:00
aoiasd
38f1608910
enhance: pack analyzer code and support lindera tokenizer (#39660)
relate: https://github.com/milvus-io/milvus/issues/39659

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-02-24 12:13:55 +08:00
congqixia
dd68814c15
enhance: Remove hardcoded partition num in restful handler (#40112)
The partition num shall be determined by core logic if user did not
specifiy the partition num in request.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-23 15:33:54 +08:00
cai.zhang
9f5b488f9a
enhance: Export request timeout interval in config (#40119)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-02-23 15:15:54 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
congqixia
a774f05ea7
fix: Add sub task pool for multi-stage tasks (#40079)
Related to #40078

Add a subTaskPool to execute sub task in case of logic deadlock
described in issue.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 16:37:54 +08:00
congqixia
e1b5b37195
enhance: Avoid stringtoslicebytes copy for BatchPKExists (#40096)
Using unsafe.Slice to convert string to []byte by directly using
underlying data could avoid lots of copy and cpu time

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 15:23:52 +08:00
Ted Xu
8562a102ec
enhance: API integration with storage v2 in mix-compactions (#40008)
See #39173

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-02-22 14:23:54 +08:00
SimFG
ad36347fb3
fix: add BeginTimestamp and EndTimestamp to insert and upsert messages (#40110)
- issue: #40109
- caused by: #38656

Signed-off-by: SimFG <bang.fu@zilliz.com>
2025-02-22 12:29:53 +08:00
smellthemoon
8b974c5742
enhance: support compact if lack of binlog (#40000)
https://github.com/milvus-io/milvus/issues/39718

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-02-22 10:51:56 +08:00
sre-ci-robot
dd1347d041
[automated] Update Knowhere Commit (#40103)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-02-22 01:01:53 +08:00
sthuang
3eb3af5f08
feat: explicitly specify column groups for storage v2 api (#39790)
* use the new packed reader and writer api to be compatible with current
etcd meta
* For the new packed writer API: column groups and paths are explicitly
defined by users and won't split column groups by memory in storage v2.
Packed writer follows the user-defined column groups to split arrow
record and write into the corresponding file path.
* For the new packed reader API: read paths are explicitly defined by
users.
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-02-21 22:03:54 +08:00
yihao.dai
2a037a97f1
enhance: Add get vector latency metric and refine request limit error message (#40083)
issue: https://github.com/milvus-io/milvus/issues/40078

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-02-21 19:41:55 +08:00
Chun Han
d6699b5f50
enhance: support return configable properties when describing index(#39951) (#40042)
related: #39951

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-02-21 19:07:53 +08:00
XuanYang-cn
fb969cf636
fix: A segment may never transfer from sealed to flushing (#39993)
See also: #39717

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-02-21 16:51:54 +08:00
wei liu
7d2c948c69
fix: task delta cache leak on reduce task (#40055)
issue: #40052

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-02-21 16:47:54 +08:00
Spade A
d34d70582d
fix: fix misleading name *_add_multi_* (#39997)
fix: #39995

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-21 16:45:55 +08:00
wei liu
07578041ba
fix: querycoord panic in cornor case (#40057)
issue: #40050

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-02-21 11:19:58 +08:00
SimFG
b562f8e644
fix: add filter to exclude L0 import jobs in compaction trigger (#40045)
- issue: #39849

Signed-off-by: SimFG <bang.fu@zilliz.com>
2025-02-21 10:45:53 +08:00
sre-ci-robot
f0d3d98c3f
[automated] Update Knowhere Commit (#40063)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-02-21 01:19:54 +08:00
Chun Han
1dc31619f8
enhance: support create collection with description(#40022) (#40023)
related: #40022

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-02-20 22:31:53 +08:00
SimFG
aba39ff98f
fix: enhance isBalanced function to correctly count quote pairs (#40001)
- issue: #39999

Signed-off-by: SimFG <bang.fu@zilliz.com>
2025-02-19 20:19:00 +08:00
Zhen Ye
fd701eca71
fix: local wal perform different with remote wal (#39967)
issue: #38399

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-19 19:12:51 +08:00
sthuang
f47320e0e7
enhance: clean up legacy storage v2 (#39987)
related: https://github.com/milvus-io/milvus/issues/39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-02-19 15:42:52 +08:00
congqixia
5d83deb3f8
fix: Use start pos ts instead for sealSegmentByLifetime policy (#39982)
Related to #39981

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-19 15:41:06 +08:00
Patrick Weizhi Xu
04fff74a56
feat: introduce Text data type (#39874)
issue: https://github.com/milvus-io/milvus/issues/39818

This PR mimics Varchar data type, allows insert, search, query, delete,
full-text search and others.
Functionalities related to filter expressions are disabled temporarily. 

Storage changes for Text data type will be in the following PRs.

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2025-02-19 11:04:51 +08:00
Spade A
52c7d7dd80
fix: offset combined with term should be based on Token positions in phrase match (#39931)
fix: #39711

Unlike English sentence where each words are parsed exactly once and one
after one with position length 1, one Chinese word may be parsed to
multiple words with position length larger than 1.

For example, "badminton and skiing" will be parsed to Token{ start: 0,
length: 1, text: "badminton" }, Token{ start: 1, length: 1, text: "and"
}, and Token{ start: 2, length: 1, text: "tennis" }.

While for exmaple for Chinsese: "羽毛球和滑雪" may be parsed to Token{ start:
0, length: 2, text: "羽毛" }, Token{ start: 0, length: 3, text: "羽毛球" },
Token{ start: 3, length: 1, text: "和" }, and Token{ start: 4, length: 2,
text: "滑雪" }.

This PR fix that the code not recognizes this situation.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-18 20:38:51 +08:00
congqixia
59881a7f73
fix: Remove load field & schema column size check (#39833)
Related to #39788

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-18 16:24:51 +08:00
cqy123456
1b8a837758
fix: Adjust segment loader's memory estimate for intermin indexes (#39507)
issue: https://github.com/milvus-io/milvus/issues/27678
related 2.4 pr: https://github.com/milvus-io/milvus/pull/39508
related 2.5 pr: https://github.com/milvus-io/milvus/pull/39509
related master pr: https://github.com/milvus-io/milvus/pull/39507

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-02-18 14:44:50 +08:00
Spade A
0dc21f0aeb
feat: support random sample (#39532)
issue: #39541

This PR implements random sample, the syntax is:
```
filter="random_sample(factor)"
or 
filter="boolean_expression && random_sample(factor)"

where 
factor is a float between (0, 1) and 
boolean_expression is like
 "1 <= number < 10", "color in ["read, "blue"]" or others
```

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-18 12:40:50 +08:00
Zhen Ye
ae700e7519
enhance: make compatitle with old msgstream for new streaming service (#39943)
issue: #38399

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-18 11:21:08 +08:00
zhagnlu
316534e065
enhance: optimize delete init construct code (#39327)
#39326

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-02-17 21:05:26 +08:00
congqixia
7ccde3300e
fix: Use text_log prefix for TextMatchIndex null offset file (#39935)
Related to #39933

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-17 20:17:25 +08:00
Xianhui Lin
a4dbbc2e52
fix: AlterCollection modify ConsistencyLevel test confict (#39919)
fix: AlterCollection unable to modify ConsistencyLevel
issue: https://github.com/milvus-io/milvus/issues/39707
relate-pr:https://github.com/milvus-io/milvus/pull/39708

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-02-17 15:54:14 +08:00
Zhen Ye
21724ab52c
enhance: generate guaranteets at delegator if local wal (#39799)
issue: #38399, #39892

- use mvcc timestamp of wal as guaranteets if wal and delegator is
located at same node.
- fix: ignore growing option is lost at hibridsearch

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-17 15:22:15 +08:00
Zhen Ye
64dad60dc2
fix: delegator doesn't follow with wal if streaming enabled (#39890)
issue: #38399

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-17 14:10:15 +08:00
smellthemoon
38cfd38b31
enhance: return topks when search in restful v2 (#39812)
if nq>2, restful will flatten all the res. If one nq res has duplicate
pks, the length of this slice will be less then topk. This pr
will attach topks in the output.

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-02-17 13:52:14 +08:00
zhagnlu
8a9f02ef71
enhance: optimize expr performace for some points (#39695)
1. skip get expr arguments which deserialize proto for every batch
execute.
2. replace unordered_set with sort array that has better performace for
small set.

#39688

Co-authored-by: luzhang <luzhang@zilliz.com>
2025-02-16 20:32:14 +08:00
Xianhui Lin
d827dd8b2f
fix: AlterCollection unable to modify ConsistencyLevel (#39708)
fix: AlterCollection unable to modify ConsistencyLevel
issue: https://github.com/milvus-io/milvus/issues/39707

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-02-16 20:10:14 +08:00
sre-ci-robot
61cc22354e
[automated] Update Knowhere Commit (#39898)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-02-16 01:32:13 +08:00
SimFG
047254665d
feat: support to replicate import msg (#39171)
- issue: #39849

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-02-16 00:08:13 +08:00
Bingyi Sun
b59555057d
feat: support json index (#36750)
https://github.com/milvus-io/milvus/issues/35528

This PR adds json index support for json and dynamic fields. Now you can
only do unary query like 'a["b"] > 1' using this index. We will support
more filter type later.

basic usage:
```
collection.create_index("json_field", {"index_type": "INVERTED",
    "params": {"json_cast_type": DataType.STRING, "json_path":
'json_field["a"]["b"]'}})
```

There are some limits to use this index:
1. If a record does not have the json path you specify, it will be
ignored and there will not be an error.
2. If a value of the json path fails to be cast to the type you specify,
it will be ignored and there will not be an error.
3. A specific json path can have only one json index.
4. If you try to create more than one json indexes for one json field,
sdk(pymilvus<=2.4.7) may return immediately because of internal
implementation. This will be fixed in a later version.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-15 14:06:15 +08:00