271 Commits

Author SHA1 Message Date
zhagnlu
9afcc5bc5c
fix:fix incorrect dir operations when create or load inverted index (#38359)
#37944

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-12-17 20:06:45 +08:00
Chun Han
c1f9158996
fix: search-group-by failed to get data from multi-chunked-segment(##… (#38383)
related: #38343

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-12-13 16:54:43 +08:00
zhagnlu
32f575be0f
enhance: change bitmap index mmap mode to view mode (#38179)
#38138

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-12-08 17:26:41 +08:00
zhagnlu
e4b6773d0a
fix: fix create text index dir conflict bug (#37693)
#37623

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-11-15 18:26:30 +08:00
smellthemoon
3389a6b500
enhance: support null in text match index (#37517)
#37508

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-11-13 11:08:29 +08:00
aoiasd
12951f0abb
enhance: rename tokenizer to analyzer and check analyzer params (#37478)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-10 16:12:26 +08:00
congqixia
5310d3469f
fix: Escape brace of dumped JSON for index err message (#37504)
Related to #37503

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-07 18:58:25 +08:00
smellthemoon
9b6dd23f8e
fix: wrong path spelling when use rootpath in segcore (#37453)
#36532

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-11-07 11:22:25 +08:00
aoiasd
d67853fa89
feat: Tokenizer support build with params and clone for concurrency (#37048)
relate: https://github.com/milvus-io/milvus/issues/35853
https://github.com/milvus-io/milvus/issues/36751

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-06 17:48:24 +08:00
smellthemoon
2b3f5bec07
fix: panic when create index on all none data (#37046)
#37045

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-24 17:09:28 +08:00
smellthemoon
6bedc7e8c8
fix: not set valid_data in bitmap index when mmap (#37023)
#37013

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-22 12:03:26 +08:00
SimFG
903c18ba26
enhance: consider the mmap chunk cache config when resource usage estimate (#36814)
- issue: #36530

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-10-18 10:17:23 +08:00
foxspy
3de57ec4fa
enhance: add vector index mgr to remove vector index type dependency (#36843)
issue: #34298

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-10-17 22:15:25 +08:00
smellthemoon
eb3e4583ec
enhance: all op(Null) is false in expr (#35527)
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-17 21:14:30 +08:00
Bingyi Sun
a75bb85f3a
feat: support chunked column for sealed segment (#35764)
This PR splits sealed segments into chunked data to avoid unnecessary
memory copies and reduce memory usage when loading segments, which
speeds up loading.

To support rollback to the previous version, we add an option
`multipleChunkedEnable`, which is false by default.
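
As a rough illustration of the chunked-column idea described above, the sketch below keeps each loaded chunk as-is and resolves a row offset to a chunk with a binary search over cumulative row counts, so no contiguous copy is needed. `ChunkedColumn` and `locate` are hypothetical names used only for this example; they are not the segcore classes.

```python
from bisect import bisect_right
from itertools import accumulate


class ChunkedColumn:
    """Hypothetical stand-in for a sealed-segment column kept as separate chunks."""

    def __init__(self, chunks):
        # Keep references to the loaded chunks; nothing is concatenated or copied.
        self.chunks = list(chunks)
        # ends[i] is the total number of rows in chunks[0..i].
        self.ends = list(accumulate(len(c) for c in self.chunks))

    def __len__(self):
        return self.ends[-1] if self.ends else 0

    def locate(self, row):
        """Map a segment-level row offset to (chunk index, offset inside the chunk)."""
        if not 0 <= row < len(self):
            raise IndexError(row)
        chunk_id = bisect_right(self.ends, row)
        start = self.ends[chunk_id - 1] if chunk_id else 0
        return chunk_id, row - start

    def __getitem__(self, row):
        chunk_id, offset = self.locate(row)
        return self.chunks[chunk_id][offset]


column = ChunkedColumn([[10, 11, 12], [13, 14], [15]])
print(column.locate(3), column[3])  # (1, 0) 13
```

The trade-off in this sketch is one extra lookup per access in exchange for skipping the merge step at load time.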

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-10-12 15:04:52 +08:00
aoiasd
db34572c56
feat: support load and query with bm25 metric (#36071)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-11 10:23:20 +08:00
SimFG
130a923dec
enhance: the estimate method when loading the collection (#36307)
- issue: #36530

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
Co-authored-by: xianliang.li <xianliang.li@zilliz.com>
2024-10-09 17:35:19 +08:00
Bingyi Sun
23b95aeba3
fix: remove element type check (#35828)
https://github.com/milvus-io/milvus/issues/36275
The array's element type is not the same as the schema's: it is INT32 for
INT16 and INT8.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-09-18 11:37:10 +08:00
jaime
2ff3765058
enhance: catch std::stoi exception and improve error msg (#36267)
issue: #36255

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-09-14 16:17:08 +08:00
zhagnlu
5e5e87cc2f
enhance: rename some params and reduce default bitmapCardinalityLimit… (#36138)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-09-12 12:09:08 +08:00
Jiquan Long
89bf226f0b
feat: support keyword text match (#35923)
fix: #35922

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-10 15:11:08 +08:00
congqixia
851f3b9883
fix: Make legacy non-lexicographic branch break switch (#36125)
Related to #35941
Previous PR: #36034

This patch makes the switch branching logic correct and makes the unit
tests work for cases that do not select the whole dataset.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-10 10:15:07 +08:00
congqixia
3123093dd7
enhance: Use MARISA_LABEL_ORDER when building trie index (#36034)
Related to #35941
Previous PR: #35943

This PR makes the `Trie` index use `MARISA_LABEL_ORDER`, which makes
predictive search iterate in lexicographic order.

When the trie index is built in label order, lexicographic order can be
used to accelerate `Range` operations.

However, according to the official documentation, using `MARISA_LABEL_ORDER`
makes "exact match lookup, common prefix search, and predictive
search" slower.
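
To show why lexicographic iteration helps `Range` operations, here is a minimal sketch that uses a sorted Python list as a stand-in for a label-ordered trie. It does not call the marisa API; `range_scan_sorted` is a hypothetical helper, and the point is only that ordered iteration allows the scan to stop at the first key past the upper bound.

```python
def range_scan_sorted(keys_in_order, lower, upper):
    """Collect keys in [lower, upper), assuming lexicographic iteration order."""
    matched = []
    visited = 0
    for key in keys_in_order:
        visited += 1
        if key >= upper:
            break  # every later key is also >= upper, so stop here
        if key >= lower:
            matched.append(key)
    return matched, visited


# A sorted list plays the role of a trie iterated in label order.
keys = sorted(["apple", "banana", "cherry", "date", "fig", "grape"])
matched, visited = range_scan_sorted(keys, "b", "d")
print(matched, visited)  # ['banana', 'cherry'] 4
```

Only four of the six keys are visited here; without an ordering guarantee every key would have to be checked.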

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-09 14:29:05 +08:00
congqixia
7b21032d19
fix: Check all values for trie.predictive_search (#35943)
Related to #35941

With marisa trie's default `predictive_search` behavior, values are not
iterated in lexicographic order.

This PR is a brute-force fix that makes the range operator return correct
values.
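
The difference between the two behaviors can be sketched in plain Python, with an unordered list standing in for the trie's default iteration order. Both functions are hypothetical illustrations, not the code from this PR: the first stops early and is only correct for ordered input, while the second checks every value.

```python
def naive_early_stop(keys, lower, upper):
    """Stops at the first key >= upper; only correct if keys arrive sorted."""
    out = []
    for key in keys:
        if key >= upper:
            break
        if key >= lower:
            out.append(key)
    return out


def brute_force_range(keys, lower, upper):
    """Checks every key, so the result does not depend on iteration order."""
    return sorted(key for key in keys if lower <= key < upper)


# Iteration order that is not lexicographic.
unordered = ["fig", "banana", "grape", "apple", "date", "cherry"]
print(naive_early_stop(unordered, "b", "d"))   # [] (wrong: stops at "fig")
print(brute_force_range(unordered, "b", "d"))  # ['banana', 'cherry']
```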

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-05 15:01:04 +08:00
Zhen Ye
f68df9a11e
fix: SkipIndex cause segment fault (#35907)
issue: #35882

Signed-off-by: chyezh <chyezh@outlook.com>
2024-09-03 17:15:03 +08:00
zhagnlu
671112d17b
enhance: add more info to hybrid index log (#35808)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-29 21:07:04 +08:00
Patrick Weizhi Xu
b3089b5bdc
feat: support range search pagination retains order (#35738)
issue: #35464

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-08-29 14:09:00 +08:00
smellthemoon
b51b4a2838
fix: try get not exist file after upgrade (#35740)
https://github.com/milvus-io/milvus/issues/35741

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-08-29 11:09:01 +08:00
Jiquan Long
a52ba3d09d
enhance: allow many segments for inverted index (#35616)
fix: https://github.com/milvus-io/milvus/issues/35615

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-08-28 11:30:59 +08:00
zhagnlu
4d2f96c760
enhance: support bitmap mmap (#35399)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-27 16:34:59 +08:00
cai.zhang
615a653988
fix: Fix offset out of range for creating Trie index (#35553)
issue: #35550

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-08-25 15:50:57 +08:00
zhagnlu
42f7800b5b
enhance: add bitmap offset cache to speed up retrieve raw data (#35498)
#35458

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-24 01:40:58 +08:00
Zhen Ye
a773836b89
enhance: optimize milvus core building (#35610)
issue: #35549,#35611,#35633

- remove milvus_segcore milvus_indexbuilder..., add libmilvus_core
- core building only link once
- move opendal compilation into cmake
- fix odr

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-23 12:35:02 +08:00
smellthemoon
80dbe87759
enhance: support null value in index (#35238)
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-08-16 15:30:54 +08:00
zhagnlu
626b1b2f5e
fix:redefine hybrid internal index type (#35314)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-08 10:32:16 +08:00
zhagnlu
c19fe95154
fix: support string match for hybrid and bitmap index (#35294)
#34841

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-07 09:54:22 +08:00
Cai Yudong
3c9a47c8db
feat: Encode traceID and spanID as hex string (#34807)
Issue: https://github.com/zilliztech/knowhere/pull/714

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-08-06 15:20:16 +08:00
Jiquan Long
91df03afe8
feat: put inverted index into ram (#35222)
fix: https://github.com/milvus-io/milvus/issues/35224

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-08-06 11:54:16 +08:00
zhagnlu
f8c1b138a8
fix:fix get array error for int type (#35154)
#35055

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-01 14:30:12 +08:00
zhenshan.cao
aa247f192d
enhance: remove unused code for StorageV2 (#35132)
issue: https://github.com/milvus-io/milvus/issues/34168

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-01 12:08:13 +08:00
zhagnlu
804ec24c02
fix:fix retrieve raw data from bitmap array index (#34848)
#34795

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-27 01:53:47 +08:00
smellthemoon
5616b7e8d2
enhance: support null in c data_datacodec and load null value (#32183)
1. Support reading and writing null in segcore.
    valid_data is stored in fieldData (as uint8_t to save memory).
2. Support loading null.
    The binlog reader reads data and writes it into the column (sealed
    segment) or insertRecord (growing segment). In a sealed segment,
    valid_data is stored directly. In a growing segment, to stay close to
    the prior implementation and keep the code readable, uint8_t is
    converted to fbvector<bool>, which may be optimized in the future.
3. Retrieve valid_data.
    valid_data is parsed in search/query.
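
The uint8_t-to-bool conversion mentioned in step 2 can be sketched as follows. This assumes valid_data is a bit-packed bitmap with the least significant bit first, which the commit message does not spell out; `unpack_valid_data` and `apply_validity` are hypothetical helpers for illustration only.

```python
def unpack_valid_data(valid_data, num_rows):
    """Expand a bit-packed validity bitmap (assumed LSB first) into one bool per row."""
    if len(valid_data) * 8 < num_rows:
        raise ValueError("bitmap too short for num_rows")
    return [bool((valid_data[i >> 3] >> (i & 7)) & 1) for i in range(num_rows)]


def apply_validity(values, valid):
    """Replace values whose validity flag is False with None."""
    return [value if ok else None for value, ok in zip(values, valid)]


valid = unpack_valid_data(bytes([0b00001101]), 4)
print(valid)                                # [True, False, True, True]
print(apply_validity([7, 0, 9, 4], valid))  # [7, None, 9, 4]
```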
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-07-23 16:07:51 +08:00
Min Tian
a4aed9b0b5
enhance: new knowhere param for range_search (#34686)
issue: #34685 
knowhere needs a new json param `range_search_k` for RangeSearch to
early terminate the iterator.
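
One plausible reading of this parameter is sketched below: candidates arrive from an iterator in ascending distance order, and the search stops once `range_search_k` results inside the radius have been collected. The function, the `-1` "no limit" convention, and the candidate format are all assumptions made for this example, not knowhere's actual interface.

```python
def range_search(candidates, radius, range_search_k=-1):
    """Collect (id, distance) pairs within radius from an ascending-distance iterator.

    range_search_k < 0 means no limit (an assumed convention for this sketch).
    """
    if range_search_k == 0:
        return []
    results = []
    for item_id, distance in candidates:
        if distance > radius:
            break  # later candidates are farther away
        results.append((item_id, distance))
        if range_search_k > 0 and len(results) == range_search_k:
            break  # early termination: enough results collected
    return results


pulled = []


def candidate_stream():
    """Yield candidates in ascending distance order and record which were pulled."""
    for item in [(1, 0.1), (2, 0.2), (3, 0.3), (4, 0.9)]:
        pulled.append(item[0])
        yield item


hits = range_search(candidate_stream(), radius=0.5, range_search_k=2)
print(hits, pulled)  # [(1, 0.1), (2, 0.2)] [1, 2]
```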

Signed-off-by: min.tian <min.tian.cn@gmail.com>
2024-07-23 11:45:43 +08:00
foxspy
8e64bf929c
enhance: add scalar filtering and vector search latency metrics (#34785)
Add scalar filtering and vector search latency metrics to distinguish
the cost of scalar filtering.
To add metrics to the query chain, add a monitor module and move the
metric files out of the original storage module.
issue: #34780

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-07-19 14:01:39 +08:00
zhagnlu
f1b2f7b640
enhance: refactor bitmap index and internal hybrid index (#34450)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-18 10:39:42 +08:00
Patrick Weizhi Xu
104d0966b7
feat: support partition key isolation (#34336)
issue: #34332

---------

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-07-11 19:01:35 +08:00
zhagnlu
cc1bc07bfd
enhance: add log to bitmap index (#34197)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-30 20:02:06 +08:00
Cai Yudong
ad90360162
enhance: Update knowhere commit (#34223)
Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-06-27 18:20:06 +08:00
zhagnlu
03a3f50892
enhance: add skip using array index when some situation (#33947)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-23 21:26:02 +08:00
zhagnlu
0d7ea8ec42
enhance: Enhance and correct exception module (#33705)
#33704

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-23 21:22:01 +08:00