120 Commits

Author SHA1 Message Date
groot
c81909bfab
enhance: Support MinIO TLS connection (#31311)
issue: https://github.com/milvus-io/milvus/issues/30709
pr: #31292

Signed-off-by: yhmo <yihua.mo@zilliz.com>
Co-authored-by: Chen Rao <chenrao317328@163.com>
2024-03-21 11:15:20 +08:00
Buqian Zheng
070dfc77bf
feat: [Sparse Float Vector] segcore basics and index building (#30357)
This commit adds sparse float vector support to segcore with the
following:

1. data type enum declarations
2. Adds corresponding data structures for handling sparse float vectors
in various scenarios, including:
* FieldData as a bridge between the binlog and the in memory data
structures
* mmap::Column as the in memory representation of a sparse float vector
column of a sealed segment;
* ConcurrentVector as the in memory representation of a sparse float
vector of a growing segment which supports inserts.
3. Adds logic in the payload reader/writer to serialize to and
deserialize from binlog
4. Allows the index node to build sparse float vector indexes
5. Allows the query node to build a growing index for growing segments
and a temporary index for sealed segments that have no index built
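
As a rough illustration of what a sparse float vector row holds, the sketch below stores only non-zero entries as index-sorted `(dimension, value)` pairs and computes an inner product by merging two rows. This is an assumption for illustration only: the names `SparseRow`, `to_sparse`, and `inner_product` are hypothetical and do not reflect the segcore data structures listed above.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (dimension index, value) pairs sorted by index.
SparseRow = List[Tuple[int, float]]


def to_sparse(entries: Dict[int, float]) -> SparseRow:
    """Keep only non-zero entries, sorted by dimension index."""
    return sorted((i, v) for i, v in entries.items() if v != 0.0)


def inner_product(a: SparseRow, b: SparseRow) -> float:
    """Merge-style inner product over two index-sorted sparse rows."""
    i = j = 0
    total = 0.0
    while i < len(a) and j < len(b):
        if a[i][0] == b[j][0]:
            # Both rows have a value in this dimension.
            total += a[i][1] * b[j][1]
            i += 1
            j += 1
        elif a[i][0] < b[j][0]:
            i += 1
        else:
            j += 1
    return total


row_a = to_sparse({5: 0.5, 1: 2.0, 9: 0.0})
row_b = to_sparse({5: 4.0, 7: 1.0})
score = inner_product(row_a, row_b)
```

Only dimension 5 is present in both rows, so it is the only one that contributes to `score`.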

This commit also includes some code cleanup, comment improvements, and
unit tests for sparse vectors.

https://github.com/milvus-io/milvus/issues/29419

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-11 14:45:02 +08:00
MrPresent-Han
77eb6defb1
feat: support groupby on growing and non-indexed sealed segment (#30307) (#30644)
related: #30308

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-02-21 14:02:53 +08:00
Patrick Weizhi Xu
0907d76253
enhance: pass partition key scalar info if enabled when build vector index (#29931)
issue: #29892 

Pass optional scalar IVF offsets to Cardinal

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-01-24 00:04:55 +08:00
Bingyi Sun
e1258b8cad
feat: integrate storagev2 into loading segment (#29336)
issue: #29335

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-01-12 18:10:51 +08:00
Xu Tong
e429965f32
Add float16 approve for multi-type part (#28427)
issue: https://github.com/milvus-io/milvus/issues/22837

Add bfloat16 vector, add the index part of float16 vector.

Signed-off-by: Writer-X <1256866856@qq.com>
2024-01-11 15:48:51 +08:00
Jiquan Long
3f46c6d459
feat: support inverted index (#28783)
issue: https://github.com/milvus-io/milvus/issues/27704

Add inverted index for some data types in Milvus. This index type can
save a lot of memory compared to loading all data into RAM and speed up
the term query and range query.

Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL`
and `VARCHAR`.

Not supported: `ARRAY` and `JSON`.

Note:
- The inverted index for `VARCHAR` does not currently serve full-text
search. Every row is treated as a single keyword rather than being
tokenized into multiple terms.
- The inverted index doesn't support retrieval well, so if you create an
inverted index on a field, operations that depend on the raw data
fall back to chunk storage, which costs some performance.
Examples are comparisons between two columns and retrieval of
output fields.
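
To make the notes above concrete, here is a toy sketch of an inverted index that maps each distinct value to the row offsets holding it. It is not the Milvus implementation; the class `ToyInvertedIndex` and its methods are hypothetical names used only for illustration. It shows why term and range queries are cheap, and why a `VARCHAR` row matches only as a whole keyword.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class ToyInvertedIndex:
    """Maps each distinct value to the row offsets that hold it."""

    def __init__(self, values):
        postings = defaultdict(list)
        for row, value in enumerate(values):
            postings[value].append(row)
        self._postings = dict(postings)
        # Sorted distinct values allow range lookups by binary search.
        self._sorted_terms = sorted(self._postings)

    def term_query(self, terms):
        """Rows whose value equals any given term (like `field in [...]`)."""
        rows = []
        for term in terms:
            rows.extend(self._postings.get(term, []))
        return sorted(rows)

    def range_query(self, low, high):
        """Rows whose value v satisfies low < v < high."""
        start = bisect_right(self._sorted_terms, low)
        stop = bisect_left(self._sorted_terms, high)
        rows = []
        for term in self._sorted_terms[start:stop]:
            rows.extend(self._postings[term])
        return sorted(rows)


ints = ToyInvertedIndex([3, 1, 4, 1, 5])
strs = ToyInvertedIndex(["red apple", "apple", "red apple"])
```

Because each string is indexed whole, `strs.term_query(["red"])` finds nothing even though two rows contain the word "red". The index also stores row offsets only, which is why reading the original values back requires a separate raw-data store.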

The inverted index is easy to use.

Take the following collection as an example:

```python
from pymilvus import Collection, CollectionSchema, DataType, FieldSchema

# Assumes a connection is already open (e.g. via pymilvus.connections.connect).
dim = 8  # example vector dimension; any positive value works

fields = [
    FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),
    FieldSchema(name="int8", dtype=DataType.INT8),
    FieldSchema(name="int16", dtype=DataType.INT16),
    FieldSchema(name="int32", dtype=DataType.INT32),
    FieldSchema(name="int64", dtype=DataType.INT64),
    FieldSchema(name="float", dtype=DataType.FLOAT),
    FieldSchema(name="double", dtype=DataType.DOUBLE),
    FieldSchema(name="bool", dtype=DataType.BOOL),
    FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name="random", dtype=DataType.DOUBLE),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields)
collection = Collection("demo", schema)
```

Then create an inverted index on each field:

```python
index_type = "INVERTED"
collection.create_index("int8", {"index_type": index_type})
collection.create_index("int16", {"index_type": index_type})
collection.create_index("int32", {"index_type": index_type})
collection.create_index("int64", {"index_type": index_type})
collection.create_index("float", {"index_type": index_type})
collection.create_index("double", {"index_type": index_type})
collection.create_index("bool", {"index_type": index_type})
collection.create_index("varchar", {"index_type": index_type})
```

Term queries and range queries on these fields are then sped up
automatically by the inverted index:

```python
result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"])
result = collection.query(expr='int64 < 5', output_fields=["pk"])
result = collection.query(expr='int64 > 2997', output_fields=["pk"])
result = collection.query(expr='1 < int64 < 5', output_fields=["pk"])
```

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-12-31 19:50:47 +08:00
yah01
aef483806d
enhance: improve the segcore logs (#29372)
- remove the streaming logging
- refine existing logs

fix #29366

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-12-23 21:52:43 +08:00
yah01
8f89e9cf75
enhance: remove all unnecessary string formatting (#29323)
Done with two regular expressions:
- `PanicInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)`
- `AssertInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)`
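
A sketch of how these two patterns could be applied with Python's `re` module. The commit does not state the replacement text, so the templates `PanicInfo(\1, \2)` and `AssertInfo(\1, \2)` are assumptions, as are the helper name `strip_fmt_format` and the sample C++ line (including `ErrorCode::Unsupported`).

```python
import re

# The two patterns from the commit message, verbatim.
PANIC_INFO = re.compile(r"PanicInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)")
ASSERT_INFO = re.compile(r"AssertInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)")


def strip_fmt_format(source: str) -> str:
    """Drop the inner fmt::format(...) wrapper, keeping its arguments.

    The replacement templates are assumed; the commit only lists the patterns.
    """
    source = PANIC_INFO.sub(r"PanicInfo(\1, \2)", source)
    source = ASSERT_INFO.sub(r"AssertInfo(\1, \2)", source)
    return source


# Hypothetical input line spanning two lines, as clang-format might wrap it.
before = (
    "PanicInfo(ErrorCode::Unsupported,\n"
    '          fmt::format("unsupported type {}", type));\n'
)
after = strip_fmt_format(before)
```

The lazy group `([.\s\S]+?)` stops at the first `))`, so a `fmt::format` call whose arguments themselves end in a nested call such as `foo(bar())` would be cut short and would need manual review.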

related: #28811

---------

Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-20 10:04:43 +08:00
Bingyi Sun
36f69ea031
feat: integrate storagev2 in building index of segcore (#28768)
issue: https://github.com/milvus-io/milvus/issues/28655

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-12-05 16:48:54 +08:00
Enwei Jiao
8ae9c947ae
Use OpenDAL to access object store (#25642)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-11-01 09:00:14 +08:00
Enwei Jiao
4a33391b8f
rename createindex (#27903)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-10-27 10:12:14 +08:00
zhagnlu
6060dd7ea8
Add chunk manager request timeout (#27692)
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2023-10-23 20:08:08 +08:00
foxspy
5db4a0489e
dynamic index version control (#27335)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-25 21:39:27 +08:00
foxspy
370b6fde58
milvus support multi index engine (#27178)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-22 09:59:26 +08:00
PowderLi
4feb3fa7c6
support azure (#26398)
Signed-off-by: PowderLi <min.li@zilliz.com>
2023-09-19 10:01:23 +08:00
Enwei Jiao
0afdfdb9af
Remove other Exceptions, keeps SegcoreError only (#27017)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-14 14:05:20 +08:00
Enwei Jiao
c3f15c6b95
Refactor duplicate error class into one place (#26985)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-11 20:43:17 +08:00
zhagnlu
411f9ac823
Upgrade minio-go and add region and virtual host config for segcore chunk manager (#26194)
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2023-08-11 10:37:36 +08:00
xige-16
04082b3de2
Migrate the ability to upload and download binlog to cpp (#22984)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2023-06-25 14:38:44 +08:00
Cai Yudong
8aebc6f3b7
Remove faiss GPU index support (#22966)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2023-03-24 17:53:58 +08:00
Cai Yudong
0e9a4478e3
Remove useless index mode (#22934)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2023-03-23 21:39:59 +08:00
Jiquan Long
dff15c3488
Check dimension of inserted records (#22819)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-03-17 17:33:58 +08:00
Cai Yudong
ab3cbdfc61
Partial change to prepare for GPU index type support (#22591)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2023-03-14 23:21:56 +08:00
yah01
bdd6bc7695
Re-format cpp code (#22513)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-03-02 15:55:49 +08:00
presburger
9950cacd10
support knowhere 2.0 (#21857)
Signed-off-by: Yusheng.Ma <Yusheng.Ma@zilliz.com>
2023-02-10 14:24:32 +08:00
xige-16
158787811e
Move assemble/disassemble func to core (#19420)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-10-16 21:17:25 +08:00
xige-16
8c9c1672ae
Assign different storage config for indexes (#19517)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-10-14 14:45:23 +08:00
xige-16
d4bc00423c
Fix start milvus failed on macos (#19394)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-09-23 16:54:50 +08:00
xige-16
428840178c
Support diskann index for vector field (#19093)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-09-21 20:16:51 +08:00
Cai Yudong
686b0ce796
Upgrade to knowhere-v1.3.0, remove following index support: (#18935)
- IVF_SQ8H
- RHNSW_FLAT/RHNSW_PQ/RHNSW_SQ
- NGT
- NSG
- SPTAG

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-09-05 10:41:11 +08:00
Cai Yudong
94ffa8a275
Upgrade to knowhere-v1.2.0 (#18746)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-08-23 10:20:58 +08:00
bigsheeper
cef8b1e7cc
Enable jemalloc (#18349)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2022-07-20 22:22:31 +08:00
Cai Yudong
a001412e12
Replace faiss::MetricType with knowhere::MetricType (#17891)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-06-29 14:20:19 +08:00
Enwei Jiao
16c3aedc15
Refine compile configuration (#17502)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2022-06-24 21:12:15 +08:00
Cai Yudong
7385770014
Upgrade to knowhere-v1.1.12 (#17692)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-06-24 10:34:18 +08:00
bigsheeper
92d06b2e30
Purge memory by the memory state and try to purge after each search (#17565)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2022-06-17 17:46:10 +08:00
groot
8736372fd2
Fix bulkload bugs (#16760)
Signed-off-by: groot <yihua.mo@zilliz.com>
2022-05-06 11:21:50 +08:00
Cai Yudong
6a62ff18bf
Support easylogging config for segcore and knowhere (#16751)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-05-03 08:39:49 +08:00
Ten Thousand Leaves
6f75d02c65
Disable knowhere logging for embedded Milvus and some other tweaks (#16496)
/kind enhancement

issue: #15711
Signed-off-by: Yuchen Gao <yuchen.gao@zilliz.com>
2022-04-20 17:23:46 +08:00
Cai Yudong
71cd7ba67a
Add configuration common.indexSliceSize (#16438)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-04-08 20:29:33 +08:00
Cai Yudong
a37479d728
Upgrade to knowhere-v1.1.2 to support all index types for mac (#16416)
Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2022-04-08 15:29:31 +08:00
Jiquan Long
fd589baca7
Integrates marisa trie index (#16192)
Signed-off-by: dragondriver <jiquan.long@zilliz.com>
2022-04-01 15:31:29 +08:00
Cai Yudong
f4ebd3a9ce
Upgrade to knowhere v1.1.0 (#16186)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-03-25 13:49:25 +08:00
Jiquan Long
48706f416f
Migrate scalar index from knowhere (#16174)
Signed-off-by: dragondriver <jiquan.long@zilliz.com>
2022-03-24 14:57:26 +08:00
Jiquan Long
f8d9bc919d
Unify interface of vector index & scalar index. (#15959)
Signed-off-by: dragondriver <jiquan.long@zilliz.com>
2022-03-21 14:23:24 +08:00
Ji Bin
3cd28420f1
Support compile under windows (#15786)
This patch makes Milvus compile under Windows (MSYS), including:
- some C++ adaptations for compiling under msys/gcc-10.3
- a toolchain installation script for MinGW/MSYS setup, `scripts/install_deps_msys.sh`
- adaptations for POSIX API use in Go
  * using gofrs/flock instead of syscall.Flock
  * using x/exp/mmap instead of syscall.Mmap
- GitHub Actions for building milvus.exe under Windows/MSYS
- a RocksDB patch for MSYS
- adaptations for compiling knowhere under Windows
- a Windows packaging script that builds a zip file, `scripts/package_windows.sh`

issue #7706

Signed-off-by: Ji Bin <matrixji@live.com>
2022-03-17 17:17:22 +08:00
Cai Yudong
503724be19
Optimize CMakeLists.txt under internal/core (#15770)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-03-01 10:31:55 +08:00
zhenshan.cao
142848fcc3
Abandon using protobuf to pass binaryset parameter (#15626)
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2022-02-18 18:39:50 +08:00
jaime
307a8ce535
Support compile and run on Mac (#15491)
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>
Co-authored-by: Jenny Li <jing.li@zilliz.com>
Co-authored-by: Nemo <yuchen.gao@zilliz.com>
Signed-off-by: yun.zhang <yun.zhang@zilliz.com>
2022-02-09 14:27:46 +08:00