1501 Commits

Author SHA1 Message Date
cai.zhang
e5f50076ec
enhance: Only check element type with not null array (#40446)
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-03-11 14:58:07 +08:00
Bingyi Sun
0a7e692b6f
fix: Fix null offset loading in inverted index (#40523)
issue: #40516

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-10 22:12:04 +08:00
Cai Yudong
2bd2cca04a
enhance: Truly support multi vector data types in SearchBruteForce (#40499)
Issue: #38666

Signed-off-by: CaiYudong <yudong.cai@zilliz.com>
2025-03-10 18:36:03 +08:00
smellthemoon
faae8ee518
fix: store wrong offset when build tantivy in nullable field (#40452)
#40454

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-03-09 09:34:04 +08:00
Bingyi Sun
37b118d55d
fix: Skip loading primary key if index has raw data (#39921)
issue: https://github.com/milvus-io/milvus/issues/39907

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-06 17:46:02 +08:00
Spade A
3db56560fb
fix: fix concurrent issues in null offset (#40363)
issue: #40308
This PR fixes two concurrency issues:
1. Elements of null_offset are used to set bits in a bitset whose size is
initialized from the tantivy document count. However, some documents may
be recorded as null in null_offset before they are committed in tantivy,
so an out-of-range access can occur.
2. null_offset can be read and written concurrently, but there is no
synchronization protection.
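
The two problems can be illustrated with a minimal Python sketch. It is not the Milvus implementation: the class `NullOffsets` and the parameter `committed_count` are hypothetical names chosen for illustration. It shows a lock around the shared offset list and a bounds check against the committed document count.

```python
import threading


class NullOffsets:
    """Hypothetical stand-in for a null_offset container (not Milvus code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._offsets = []

    def add(self, offset):
        # Writers take the lock so concurrent readers never see a partial update.
        with self._lock:
            self._offsets.append(offset)

    def to_bitset(self, committed_count):
        """Return a list of booleans sized by the committed document count."""
        bits = [False] * committed_count
        # Copy under the lock, then work on the snapshot.
        with self._lock:
            snapshot = list(self._offsets)
        for off in snapshot:
            # Offsets of rows that are not committed yet fall outside the
            # bitset and are skipped instead of causing an out-of-range write.
            if off < committed_count:
                bits[off] = True
        return bits


nulls = NullOffsets()
for off in (1, 3, 7):
    nulls.add(off)

# Only 5 documents are committed, so offset 7 is ignored for now.
bitset = nulls.to_bitset(committed_count=5)
```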

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-05 17:48:00 +08:00
Bingyi Sun
be4d09561b
fix: Fix missing null or non-exist key in json index (#40336)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-05 11:48:02 +08:00
Bingyi Sun
7040ba1c12
enhance: make json path index support term filter (#40140)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-04 11:56:02 +08:00
Zhen Ye
8eb662b4dc
enhance: add more metrics for async cgo component (#40136)
issue: #40014

Signed-off-by: chyezh <chyezh@outlook.com>
2025-03-03 09:56:03 +08:00
zhagnlu
7a17fb68ec
enhance: add monitor metric for retrieve raw data (#40141)
#40078

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-02 18:30:01 +08:00
zhagnlu
8c19e5c4a7
enhance: decrease delete record dump snapshot limit (#40101)
#40100

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-02 17:55:59 +08:00
Chun Han
259f9106ad
enhance: refine variable-length-type memory usage(#38736) (#39578)
related: #38736

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-02-27 21:13:58 +08:00
Spade A
476cf61d98
fix: random sample consider empty input (#40201)
issue: #40198

Fix random sample not handling empty input, that is, the case where no
data is hit by the filter expression.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-26 16:15:58 +08:00
Bingyi Sun
f05e9628f6
fix: Fix search failure of null expression (#40129)
issue: #40095

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-25 20:43:55 +08:00
Bingyi Sun
db4769281c
fix: Fall back to a brute-force search if json index type unmatched (#40076)
issue: https://github.com/milvus-io/milvus/issues/35528
If the query data type does not match the index type, fall back to a
brute-force search

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-24 16:25:57 +08:00
sthuang
3eb3af5f08
feat: explicitly specify column groups for storage v2 api (#39790)
* use the new packed reader and writer api to be compatible with current
etcd meta
* For the new packed writer API: column groups and paths are explicitly
defined by users and won't split column groups by memory in storage v2.
Packed writer follows the user-defined column groups to split arrow
record and write into the corresponding file path.
* For the new packed reader API: read paths are explicitly defined by
users.
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-02-21 22:03:54 +08:00
yihao.dai
2a037a97f1
enhance: Add get vector latency metric and refine request limit error message (#40083)
issue: https://github.com/milvus-io/milvus/issues/40078

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-02-21 19:41:55 +08:00
Spade A
d34d70582d
fix: fix misleading name *_add_multi_* (#39997)
fix: #39995

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-21 16:45:55 +08:00
Patrick Weizhi Xu
04fff74a56
feat: introduce Text data type (#39874)
issue: https://github.com/milvus-io/milvus/issues/39818

This PR mimics the Varchar data type and allows insert, search, query,
delete, full-text search, and other operations.
Functionalities related to filter expressions are temporarily disabled.

Storage changes for Text data type will be in the following PRs.

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2025-02-19 11:04:51 +08:00
congqixia
59881a7f73
fix: Remove load field & schema column size check (#39833)
Related to #39788

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-18 16:24:51 +08:00
Spade A
0dc21f0aeb
feat: support random sample (#39532)
issue: #39541

This PR implements random sample, the syntax is:
```
filter="random_sample(factor)"
or 
filter="boolean_expression && random_sample(factor)"

where 
factor is a float between (0, 1) and 
boolean_expression is like
 "1 <= number < 10", "color in ["red", "blue"]" or others
```
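
A small helper for composing such filter strings might look like the sketch below. The function `random_sample_filter` is a hypothetical name, not part of any SDK. It only builds the string according to the syntax described above and rejects factors outside (0, 1).

```python
def random_sample_filter(factor, boolean_expression=None):
    """Build a filter string using the random_sample syntax (illustrative only)."""
    # The commit message states that factor is a float in the open interval (0, 1).
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must be strictly between 0 and 1")
    sample = f"random_sample({factor})"
    if boolean_expression is None:
        return sample
    # The boolean expression comes first, followed by the sampling clause.
    return f"{boolean_expression} && {sample}"


only_sample = random_sample_filter(0.1)
combined = random_sample_filter(0.5, "1 <= number < 10")
```

The resulting string would then be passed as the `filter` argument of a search or query call.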

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-18 12:40:50 +08:00
zhagnlu
316534e065
enhance: optimize delete init construct code (#39327)
#39326

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-02-17 21:05:26 +08:00
congqixia
7ccde3300e
fix: Use text_log prefix for TextMatchIndex null offset file (#39935)
Related to #39933

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-17 20:17:25 +08:00
zhagnlu
8a9f02ef71
enhance: optimize expr performance for some points (#39695)
1. skip fetching expr arguments, which deserializes the proto, on every
batch execution.
2. replace unordered_set with a sorted array, which has better
performance for small sets.
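
The data-structure idea in point 2 can be sketched in Python as follows. This only illustrates membership lookup in a sorted array via binary search. It does not reproduce the C++ change or demonstrate its performance characteristics, and the function names are hypothetical.

```python
from bisect import bisect_left


def build_sorted_terms(values):
    """Deduplicate and sort the terms once, before any batch is evaluated."""
    return sorted(set(values))


def contains(sorted_terms, value):
    """Binary-search membership test on a sorted list."""
    i = bisect_left(sorted_terms, value)
    return i < len(sorted_terms) and sorted_terms[i] == value


terms = build_sorted_terms([7, 3, 9, 3])
hit = contains(terms, 7)
miss = contains(terms, 8)
```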

#39688

Co-authored-by: luzhang <luzhang@zilliz.com>
2025-02-16 20:32:14 +08:00
Bingyi Sun
b59555057d
feat: support json index (#36750)
https://github.com/milvus-io/milvus/issues/35528

This PR adds json index support for json and dynamic fields. For now,
this index only serves unary queries like 'a["b"] > 1'. More filter
types will be supported later.

basic usage:
```
collection.create_index("json_field", {"index_type": "INVERTED",
    "params": {"json_cast_type": DataType.STRING, "json_path":
'json_field["a"]["b"]'}})
```

There are some limitations when using this index:
1. If a record does not have the json path you specify, it will be
ignored and there will not be an error.
2. If a value of the json path fails to be cast to the type you specify,
it will be ignored and there will not be an error.
3. A specific json path can have only one json index.
4. If you try to create more than one json index for one json field, the
sdk (pymilvus<=2.4.7) may return immediately because of its internal
implementation. This will be fixed in a later version.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-15 14:06:15 +08:00
Spade A
f7d9587720
enhance: add tantivy collector for i64 (#39850)
issue: #39852

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-14 15:50:15 +08:00
aoiasd
24d2bbc441
enhance: unmarshal ts msg in dispatcher instead of in msgstream (#38656)
relate: https://github.com/milvus-io/milvus/issues/38655

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-02-14 12:04:13 +08:00
cai.zhang
9e6e477c5d
fix: Fix modulo for long type (#39722)
issue: #39640

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-02-11 20:04:46 +08:00
sparknack
2d9bef44d4
fix: sparse: add inverted_index_algo and dim_max_score_ratio config (#39358)
issue: #39332

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-02-07 16:40:44 +08:00
Gao
c1794cc490
enhance: update knowhere version and IsAdditionalScalarSupported interface (#39573)
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-02-05 19:51:10 +08:00
sthuang
c4ae9f4ece
feat: introduce third-party milvus-storage (#39418)
related: https://github.com/milvus-io/milvus/issues/39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-01-24 17:21:13 +08:00
Cai Yudong
5730b69e56
feat: Enable more VECTOR_INT8 unittest (#39569)
Issue: #38666

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2025-01-24 17:03:07 +08:00
zhagnlu
8117d59f85
fix: fix GetValueFromConfig for bool type (#39526)
#39525

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-01-24 16:17:05 +08:00
congqixia
844df76cc0
enhance: Rectify run_clang_format grep command (#39534)
Previously the grep with regex did not work and failed to match many
.cpp files

This PR:
- uses the "-E" flag to enable extended regex matching
- commits the reformatted result of the current cpp code

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-23 17:07:05 +08:00
Spade A
547c686027
fix: fix assignment operator in AssertInfo to comparison operator (#39347)
fix: #39346

Remove the problem line as it's redundant.

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-01-23 14:23:18 +08:00
Cai Yudong
341d6c1eb7
feat: Update segcore for VECTOR_INT8 (#39415)
Issue: #38666

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2025-01-21 11:03:03 +08:00
congqixia
45d49df89b
fix: Skip load extra indexes for sorted segment pk field (#39389)
Related to #39339

Extra indexes can be ignored in most cases since the sorted pk column
already provides indexing capability

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-20 18:13:15 +08:00
Gao
1a680c29e2
fix: correct remote centroids path in clustering compaction (#39398)
issue: https://github.com/milvus-io/milvus/issues/39353
The path was modified unintentionally, change it back.

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-01-20 10:59:10 +08:00
congqixia
7cac87caca
fix: Skip erase field if index build on PK field (#39370)
Related to #39339

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-17 20:31:02 +08:00
Spade A
0461ddf776
fix: phrase match does not support offset input (#39338)
fix: #39337

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-01-16 22:05:01 +08:00
Gao
75d7978a18
enhance: pass partition key scalar info if enable for vector mem index (#39123)
issue: #34332

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-01-16 14:33:03 +08:00
Spade A
8c4ba70a4c
fix: enable to build index with single segment (#39233)
fix https://github.com/milvus-io/milvus/issues/39232

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-01-16 11:01:06 +08:00
congqixia
eb63334312
enhance: Add try-catch and return CStatus for NewCollection (#39279)
Related to #28795

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-15 19:17:01 +08:00
Cai Yudong
5bf1b2b929
feat: Support Int8Vector in go (#38990)
Issue: #38666

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2025-01-14 20:43:06 +08:00
congqixia
da1b786ef8
enhance: Utilize "find0" in segment.find_first (#39229)
Related to #39003

The previous PR #39004 had to clone & flip the bitset because the bitset
did not support a find0 operation. #39176 added this feature, so the
clone & flip can now be removed.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-14 14:14:58 +08:00
Zhen Ye
3e788f0fbd
enhance: record memory size (uncompressed) item for index (#38770)
issue: #38715

- Currently milvus uses the serialized (compressed) index size to
estimate resources for loading.
- Add a new field `MemSize` (size before compression) to the index for
resource estimation.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-14 10:33:06 +08:00
Alexander Guzhva
3447ff7310
enhance: [bitset] extend op_find() to be able to search both 0 and 1 (#39176)
issue: #39124 

`bitset::find_first()` and `bitset::find_next()` now accept one more
parameter, which allows searching for a `0` bit instead of a `1` bit
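
The behavior can be sketched in Python over a plain list of booleans. The function names mirror the ones in the commit message, but the signatures (a `target` keyword, `None` when nothing is found) are assumptions made for this sketch and not the C++ bitset API.

```python
def find_first(bits, target=True):
    """Return the index of the first bit equal to target, or None."""
    for i, bit in enumerate(bits):
        if bit == target:
            return i
    return None


def find_next(bits, prev, target=True):
    """Return the index of the first bit equal to target after prev, or None."""
    for i in range(prev + 1, len(bits)):
        if bits[i] == target:
            return i
    return None


bits = [True, True, False, True, False]

first_zero = find_first(bits, target=False)
next_zero = find_next(bits, first_zero, target=False)

# Without a 0-search, the same answer needs a flipped copy of the bitset.
flipped_first_one = find_first([not b for b in bits])
```

Searching for `0` directly gives the same result as searching for `1` in a flipped copy, without allocating the copy.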

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2025-01-14 09:50:58 +08:00
Bingyi Sun
a00ba861a4
fix: Fix empty search result of "in" filter if pk type is varchar (#39106)
https://github.com/milvus-io/milvus/issues/39107

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-01-13 16:14:58 +08:00
smellthemoon
accc9e7fbf
fix: fail to get empty index num rows (#39155)
#39125

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-01-13 16:04:58 +08:00
Zhen Ye
5f94954bb4
fix: data race when accessing field_ when retrieving (#39151)
issue: #39148

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-13 11:23:04 +08:00