166 Commits

Author SHA1 Message Date
Jiquan Long
3f46c6d459
feat: support inverted index (#28783)
issue: https://github.com/milvus-io/milvus/issues/27704

Add an inverted index for some data types in Milvus. Compared with
loading all data into RAM, this index type can save a lot of memory,
and it speeds up term queries and range queries.

Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL`
and `VARCHAR`.

Not supported: `ARRAY` and `JSON`.

Note:
- The inverted index for `VARCHAR` is not designed to serve full-text
search yet. Every row is treated as a single keyword rather than being
tokenized into multiple terms.
- The inverted index does not support retrieval well. If you create an
inverted index on a field, operations that depend on the raw data fall
back to chunk storage, which costs some performance. Examples are
comparisons between two columns and retrieval of output fields.
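
To illustrate the whole-keyword behavior and the two query types noted
above, here is a minimal pure-Python sketch of an inverted index. It is
not the Milvus implementation; `ToyInvertedIndex` and its methods are
hypothetical names used only for illustration, and the sample data is
made up.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class ToyInvertedIndex:
    """Maps each distinct value to the list of row offsets that hold it."""

    def __init__(self, values):
        postings = defaultdict(list)
        for offset, value in enumerate(values):
            # The whole value is the key; strings are not tokenized.
            postings[value].append(offset)
        self.postings = dict(postings)
        self.sorted_keys = sorted(self.postings)

    def term_query(self, terms):
        """Rows whose value equals any of `terms` (like `field in [...]`)."""
        hits = set()
        for term in terms:
            hits.update(self.postings.get(term, []))
        return sorted(hits)

    def range_query(self, lower, upper):
        """Rows with lower < value < upper (both bounds exclusive)."""
        start = bisect_right(self.sorted_keys, lower)
        end = bisect_left(self.sorted_keys, upper)
        hits = []
        for key in self.sorted_keys[start:end]:
            hits.extend(self.postings[key])
        return sorted(hits)


int_index = ToyInvertedIndex([3, 1, 4, 1, 5, 9, 2, 6])
term_hits = int_index.term_query([1, 2, 3])
range_hits = int_index.range_query(1, 5)

str_index = ToyInvertedIndex(["hello world", "hello", "world"])
whole_hits = str_index.term_query(["hello world"])
partial_hits = str_index.term_query(["hello"])
```

In this sketch, looking up `"hello"` matches only the row whose entire
value is `"hello"`, not the row containing `"hello world"`, which
mirrors the whole-keyword treatment described for `VARCHAR`.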

The inverted index is easy to use.

Take the collection below as an example:

```python
from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

# Assumption: a Milvus instance is reachable at this address.
connections.connect(host="localhost", port="19530")

dim = 128  # Assumption: example vector dimension; the original leaves `dim` undefined.

fields = [
    FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),
    FieldSchema(name="int8", dtype=DataType.INT8),
    FieldSchema(name="int16", dtype=DataType.INT16),
    FieldSchema(name="int32", dtype=DataType.INT32),
    FieldSchema(name="int64", dtype=DataType.INT64),
    FieldSchema(name="float", dtype=DataType.FLOAT),
    FieldSchema(name="double", dtype=DataType.DOUBLE),
    FieldSchema(name="bool", dtype=DataType.BOOL),
    FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name="random", dtype=DataType.DOUBLE),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields)
collection = Collection("demo", schema)
```

Then create an inverted index on each field:

```python
index_type = "INVERTED"
collection.create_index("int8", {"index_type": index_type})
collection.create_index("int16", {"index_type": index_type})
collection.create_index("int32", {"index_type": index_type})
collection.create_index("int64", {"index_type": index_type})
collection.create_index("float", {"index_type": index_type})
collection.create_index("double", {"index_type": index_type})
collection.create_index("bool", {"index_type": index_type})
collection.create_index("varchar", {"index_type": index_type})
```

Term queries and range queries on these fields are then sped up
automatically by the inverted index:

```python
result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"])
result = collection.query(expr='int64 < 5', output_fields=["pk"])
result = collection.query(expr='int64 > 2997', output_fields=["pk"])
result = collection.query(expr='1 < int64 < 5', output_fields=["pk"])
```

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-12-31 19:50:47 +08:00
sre-ci-robot
c2345daf3a
[automated] Update Knowhere Commit (#29578)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-12-29 18:56:46 +08:00
sre-ci-robot
fce1a8dafb
[automated] Update Knowhere Commit (#29412)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-12-25 17:58:46 +08:00
sre-ci-robot
3e66e78508
[automated] Update Knowhere Commit (#29178)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-12-14 17:16:39 +08:00
Bingyi Sun
36f69ea031
feat: integrate storagev2 in building index of segcore (#28768)
issue: https://github.com/milvus-io/milvus/issues/28655

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-12-05 16:48:54 +08:00
sre-ci-robot
f01e507b15
[automated] Update Knowhere Commit (#28965)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-12-05 15:56:35 +08:00
sre-ci-robot
9b6cbe956a
[automated] Update Knowhere Commit (#28917)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-12-04 15:42:34 +08:00
sre-ci-robot
ecc3ca374c
[automated] Update Knowhere Commit (#28882)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-12-01 02:28:31 +08:00
sre-ci-robot
86ccb8e146
[automated] Update Knowhere Commit (#28704)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-11-24 16:56:24 +08:00
sre-ci-robot
b7b31ce0bc
Update knowhere commit (#28285)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-11-09 14:24:19 +08:00
sre-ci-robot
b1df3ead0e
Update knowhere commit (#28176)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-11-06 22:44:19 +08:00
sre-ci-robot
7f28e9d2f3
Update knowhere commit (#28087)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-11-03 10:24:16 +08:00
Enwei Jiao
8ae9c947ae
Use OpenDAL to access object store (#25642)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-11-01 09:00:14 +08:00
sre-ci-robot
1ae6e5d8c8
Update knowhere commit (#27993)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-10-28 04:24:10 +08:00
sre-ci-robot
8c605ca858
Update knowhere commit (#27865)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-10-24 02:34:08 +08:00
sre-ci-robot
b6e07d6fe3
Update knowhere commit (#27812)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-10-20 03:54:09 +08:00
Gao
9dd369dd99
Update knowhere version to v2.2.2 (#27810)
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2023-10-19 21:34:07 +08:00
sre-ci-robot
6b79d2b7d6
Update knowhere commit (#27752)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-10-18 07:14:09 +08:00
sre-ci-robot
75343b2cb4
Update knowhere commit (#27706)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-10-14 03:03:37 +08:00
Enwei Jiao
0f2f4a0a75
Remove useless parameters for Makefile (#27622)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-10-11 20:45:35 +08:00
Gao
7a65b6fb85
Limit faiss ivf index build thread num and fix ut (#27567)
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2023-10-11 10:33:33 +08:00
Sheldon
5ba158a3f5
fix knowhere version-changing (#27508)
Update KNOWHERE_VERSION for the first occurrence

Signed-off-by: Sheldon <chuanfeng.liu@zilliz.com>
2023-10-08 08:35:32 +08:00
zhenshan.cao
dbdb9e15d8
Update Knowhere version (#27445)
Signed-off-by: Li Liu <li.liu@zilliz.com>
Co-authored-by: Li Liu <li.liu@zilliz.com>
2023-09-29 14:23:28 +08:00
sre-ci-robot
e02228b5ad
Update knowhere commit (#27357)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-09-26 18:57:37 +08:00
foxspy
5db4a0489e
dynamic index version control (#27335)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-25 21:39:27 +08:00
foxspy
fa033e586a
disable growing index for flat (#27309)
Signed-off-by: xianliang <xianliang.li@zilliz.com>
2023-09-22 14:19:24 +08:00
foxspy
370b6fde58
milvus support multi index engine (#27178)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-22 09:59:26 +08:00
sre-ci-robot
fc694bd56d
Update knowhere commit (#27190)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-09-19 10:35:22 +08:00
sre-ci-robot
a11136b158
Update knowhere commit (#27159)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-09-18 09:41:22 +08:00
sre-ci-robot
c85c255eb1
Update knowhere commit (#27109)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-09-15 13:59:19 +08:00
Enwei Jiao
0afdfdb9af
Remove other Exceptions, keeps SegcoreError only (#27017)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-14 14:05:20 +08:00
sre-ci-robot
dde3cd2f93
Update knowhere commit (#26998)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-09-12 09:49:18 +08:00
sre-ci-robot
56a6559fe7
Update knowhere commit (#26888)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-09-07 09:19:16 +08:00
sre-ci-robot
78a2638fd4
Update knowhere commit (#26861)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-09-06 09:21:49 +08:00
sre-ci-robot
c132c53b1a
Update knowhere commit (#26840)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-09-05 09:05:49 +08:00
sre-ci-robot
b47da91f3c
Update knowhere commit (#26792)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-09-01 09:23:01 +08:00
sre-ci-robot
1dbe1e63a4
Update knowhere commit (#26604)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-08-27 17:04:25 +08:00
liliu-z
e17cda23f4
update knowhere's version to 2.2.0 (#26553)
Signed-off-by: Li Liu <li.liu@zilliz.com>
2023-08-23 00:52:21 +08:00
cqy123456
fd37860e57
init knowhere build/search thread pool; (#26449)
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2023-08-18 14:00:19 +08:00
Patrick Weizhi Xu
09da953a19
Use Knowhere AIO Context Init Default Value and Panic when Fail (#26286)
Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2023-08-13 20:53:30 +08:00
Gao
b6fcbb0998
Support ScaNN index (#26099)
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2023-08-11 14:21:29 +08:00
sre-ci-robot
ae3a3d148c
Update knowhere commit (#26049)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-08-07 11:27:07 +08:00
sre-ci-robot
16e342d6b5
Update knowhere commit (#26018)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-07-31 14:25:04 +08:00
yah01
b986e3af81
Upgrade Knowhere to 4ba9091 (#25950)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-07-27 13:59:03 +08:00
Cai Yudong
5e3cbb584a
Update Knowhere Commit (#25875)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2023-07-25 11:01:00 +08:00
sre-ci-robot
8f33d87843
Update knowhere commit (#25827)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-07-21 14:09:05 +08:00
sre-ci-robot
4e5d2a311b
Update knowhere commit (#25640)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-07-17 14:04:34 +08:00
Enwei Jiao
4aed32ff61
Use librdkafka for all platform (#25538)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-07-13 15:34:33 +08:00
sre-ci-robot
74c4b28ef1
Update knowhere commit (#25519)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-07-12 14:44:30 +08:00
foxspy
31173727b2
growing segment index memory opt & get vector bugfix (#25272)
Signed-off-by: xianliang <xianliang.li@zilliz.com>
2023-07-05 00:04:25 +08:00