298 Commits

Author SHA1 Message Date
Bingyi Sun
6249335859
fix: Catch invalid json pointer error (#40625)
issue: #35528

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-14 16:56:08 +08:00
Bingyi Sun
d3adab15ac
fix: Build double index for all json numeric field (#40619)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-14 16:52:11 +08:00
cai.zhang
e5f50076ec
enhance: Only check element type with not null array (#40446)
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-03-11 14:58:07 +08:00
Bingyi Sun
0a7e692b6f
fix: Fix null offset loading in inverted index (#40523)
issue: #40516

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-10 22:12:04 +08:00
smellthemoon
faae8ee518
fix: store wrong offset when build tantivy in nullable field (#40452)
#40454

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-03-09 09:34:04 +08:00
Spade A
3db56560fb
fix: fix concurrent issues in null offset (#40363)
issue: #40308
This issue fixes these two concurrent issues:
1. element in null_offset is used to set bitset where the size of bitset
is initialized by tantivy document count. However, there may still be
some documents that are not committed in tantivy but are null in
null_offset. So array out of range occurs.
2. null_offset can be read and write concurrently but there's no
synchronization protection.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-05 17:48:00 +08:00
Bingyi Sun
be4d09561b
fix: Fix missing null or non-exist key in json index (#40336)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-05 11:48:02 +08:00
Bingyi Sun
db4769281c
fix: Fall back to a brute-force search if json index type unmatched (#40076)
issue: https://github.com/milvus-io/milvus/issues/35528
If the query data type does not match the index type, fall back to a
brute-force search

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-24 16:25:57 +08:00
Spade A
d34d70582d
fix: fix misleading name *_add_multi_* (#39997)
fix: #39995

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-21 16:45:55 +08:00
congqixia
7ccde3300e
fix: Use text_log prefix for TextMatchIndex null offset file (#39935)
Related to #39933

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-17 20:17:25 +08:00
Bingyi Sun
b59555057d
feat: support json index (#36750)
https://github.com/milvus-io/milvus/issues/35528

This PR adds json index support for json and dynamic fields. Now you can
only do unary query like 'a["b"] > 1' using this index. We will support
more filter type later.

basic usage:
```
collection.create_index("json_field", {"index_type": "INVERTED",
    "params": {"json_cast_type": DataType.STRING, "json_path":
'json_field["a"]["b"]'}})
```

There are some limits to use this index:
1. If a record does not have the json path you specify, it will be
ignored and there will not be an error.
2. If a value of the json path fails to be cast to the type you specify,
it will be ignored and there will not be an error.
3. A specific json path can have only one json index.
4. If you try to create more than one json indexes for one json field,
sdk(pymilvus<=2.4.7) may return immediately because of internal
implementation. This will be fixed in a later version.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-15 14:06:15 +08:00
Gao
c1794cc490
enhance: update knowhere version and IsAdditionalScalarSupported interface (#39573)
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-02-05 19:51:10 +08:00
zhagnlu
8117d59f85
fix:fix GetValueFromConfig for bool type (#39526)
#39525

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-01-24 16:17:05 +08:00
Cai Yudong
341d6c1eb7
feat: Update segcore for VECTOR_INT8 (#39415)
Issue: #38666

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2025-01-21 11:03:03 +08:00
Gao
75d7978a18
enhance: pass partition key scalar info if enable for vector mem index (#39123)
issue: #34332

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-01-16 14:33:03 +08:00
Spade A
8c4ba70a4c
fix: enable to build index with single segment (#39233)
fix https://github.com/milvus-io/milvus/issues/39232

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-01-16 11:01:06 +08:00
Cai Yudong
5bf1b2b929
feat: Support Int8Vector in go (#38990)
Issue: #38666

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2025-01-14 20:43:06 +08:00
Zhen Ye
3e788f0fbd
enhance: record memory size (uncompressed) item for index (#38770)
issue: #38715

- Current milvus use a serialized index size(compressed) for estimate
resource for loading.
- Add a new field `MemSize` (before compressing) for index to estimate
resource.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-14 10:33:06 +08:00
smellthemoon
accc9e7fbf
fix: fail to get empty index num rows (#39155)
#39125

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-01-13 16:04:58 +08:00
Spade A
032292a432
feat: support phrase match query (#38869)
The relevant issue: https://github.com/milvus-io/milvus/issues/38930

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-01-12 20:24:58 +08:00
Spade A
8abf6c9149
fix: build text index when loading field data (#39070)
fix: https://github.com/milvus-io/milvus/issues/39053
may fix https://github.com/milvus-io/milvus/issues/38644 which could be
caused by https://github.com/milvus-io/milvus/issues/39053

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-01-09 15:24:56 +08:00
Bingyi Sun
f0cddfd160
fix: Fix panic caused by removing directory (#38622)
https://github.com/milvus-io/milvus/issues/38604

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-01-06 10:54:54 +08:00
cai.zhang
ba3c2e6fb1
fix: Only generate the index_null_offset file when the field support null value (#38833)
issue: #38832

Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2024-12-30 18:02:52 +08:00
Patrick Weizhi Xu
85f462be1a
enhance: speed up search iterator stage 1 (#37947)
issue: #37548

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-12-26 10:32:49 +08:00
zhagnlu
8fcb33c21d
fix:fix delete record assert failed (#38580)
#38472

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-12-19 18:22:47 +08:00
Zhen Ye
b537a72309
fix: interted index out of range (#38577)
issue: #38546, #38486

Signed-off-by: chyezh <chyezh@outlook.com>
2024-12-19 15:20:47 +08:00
Bingyi Sun
f0096ec292
fix: Fix IsMmapSupported for scalar index (#38135)
https://github.com/milvus-io/milvus/issues/38134

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-12-17 20:30:44 +08:00
zhagnlu
9afcc5bc5c
fix:fix incorrect dir operations when create or load inverted index (#38359)
#37944

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-12-17 20:06:45 +08:00
Chun Han
c1f9158996
fix: search-group-by failed to get data from multi-chunked-segment(##… (#38383)
related: #38343

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-12-13 16:54:43 +08:00
zhagnlu
32f575be0f
enhance: change bitmap index mmap mode to view mode (#38179)
#38138

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-12-08 17:26:41 +08:00
zhagnlu
e4b6773d0a
fix: fix create text index dir conflict bug (#37693)
#37623

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-11-15 18:26:30 +08:00
smellthemoon
3389a6b500
enhance: support null in text match index (#37517)
#37508

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-11-13 11:08:29 +08:00
aoiasd
12951f0abb
enhance: rename tokenizer to analyzer and check analyzer params (#37478)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-10 16:12:26 +08:00
congqixia
5310d3469f
fix: Escape brace of dumped JSON for index err message (#37504)
Related to #37503

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-07 18:58:25 +08:00
smellthemoon
9b6dd23f8e
fix: wrong path spelling when use rootpath in segcore (#37453)
#36532

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-11-07 11:22:25 +08:00
aoiasd
d67853fa89
feat: Tokenizer support build with params and clone for concurrency (#37048)
relate: https://github.com/milvus-io/milvus/issues/35853
https://github.com/milvus-io/milvus/issues/36751

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-06 17:48:24 +08:00
smellthemoon
2b3f5bec07
fix: panic when create index on all none data (#37046)
#37045

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-24 17:09:28 +08:00
smellthemoon
6bedc7e8c8
fix: not set valid_data in bitmap index when mmap (#37023)
#37013

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-22 12:03:26 +08:00
SimFG
903c18ba26
enhance: consider the mmap chunck cache config when resource usage estimate (#36814)
- issue: #36530

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-10-18 10:17:23 +08:00
foxspy
3de57ec4fa
enhance: add vector index mgr to remove vector index type dependency (#36843)
issue: #34298

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-10-17 22:15:25 +08:00
smellthemoon
eb3e4583ec
enhance: all op(Null) is false in expr (#35527)
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-17 21:14:30 +08:00
Bingyi Sun
a75bb85f3a
feat: support chunked column for sealed segment (#35764)
This PR splits sealed segment to chunked data to avoid unnecessary
memory copy and save memory usage when loading segments so that loading
can be accelerated.

To support rollback to previous version, we add an option
`multipleChunkedEnable` which is false by default.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-10-12 15:04:52 +08:00
aoiasd
db34572c56
feat: support load and query with bm25 metric (#36071)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-11 10:23:20 +08:00
SimFG
130a923dec
enhance: the estimate method when loading the collection (#36307)
- issue: #36530

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
Co-authored-by: xianliang.li <xianliang.li@zilliz.com>
2024-10-09 17:35:19 +08:00
Bingyi Sun
23b95aeba3
fix: remove element type check (#35828)
https://github.com/milvus-io/milvus/issues/36275
Array's element type is not same with schema's. It is INT32 for INT16
and INT8

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-09-18 11:37:10 +08:00
jaime
2ff3765058
enhance: catch std::stoi exception and improve error msg (#36267)
issue: #36255

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-09-14 16:17:08 +08:00
zhagnlu
5e5e87cc2f
enhance: rename some params and reduce default bitmapCardinalityLimit… (#36138)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-09-12 12:09:08 +08:00
Jiquan Long
89bf226f0b
feat: support keyword text match (#35923)
fix: #35922

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-10 15:11:08 +08:00
congqixia
851f3b9883
fix: Make legacy non-lexicographic branch break swtich (#36125)
Related to #35941
Previous PR: #36034

This patch makes the switch branching logic correct and make the unit
test work for cases which does not select the whole dataset.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-10 10:15:07 +08:00
congqixia
3123093dd7
enhance: Use MARISA_LABEL_ORDER when building trie index (#36034)
Related to #35941
Previous PR: #35943

This PR make `Trie` index using `MARISA_LABEL_ORDER`, which make
predictive search iterating in lexicographic order.

When trie index is build in label order, lexicographc could be utilized
accelerating `Range` operations.

However according to the official document, using `MARISA_LABEL_ORDER`
will make "exact match lookup, common prefix search, and predictive
search" slower.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-09 14:29:05 +08:00