milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
cqy123456	aba4993c6c	fix: fix some fp16/bf16 code miss in segcore. (#31771 ) issue：https://github.com/milvus-io/milvus/issues/22837 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-04-07 14:13:16 +08:00
Cai Yudong	246586be27	enhance: Unify data type check APIs under internal/core (#31800 ) Issue: #22837 Move and rename following C++ APIs: datatype_sizeof() ==> GetDataTypeSize() datatype_name() ==> GetDataTypeName() datatype_is_vector() / IsVectorType() ==> IsVectorDataType() datatype_is_variable() ==> IsVariableDataType() datatype_is_sparse_vector() ==> IsSparseFloatVectorDataType() datatype_is_string() / IsString() ==> IsDataTypeString() datatype_is_floating() / IsFloat() ==> IsDataTypeFloat() datatype_is_binary() ==> IsDataTypeBinary() datatype_is_json() ==> IsDataTypeJson() datatype_is_array() ==> IsDataTypeArray() datatype_is_variable() == IsDataTypeVariable() datatype_is_integer() / IsIntegral() ==> IsDataTypeInteger() Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-04-02 19:15:14 +08:00
PowderLi	d299fa502e	fix: use milvus-io/vcpkg (#31770 ) issue: #31769 GitHub Disables The XZ Repository because of CVE-2024-3094 Signed-off-by: PowderLi <min.li@zilliz.com>	2024-04-01 15:01:13 +08:00
chyezh	5655ec4fc0	enhance: add mmap usage metrics (#31708 ) issue: #31707 Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-01 11:35:12 +08:00
groot	5be395354c	fix: minio ssl compatible issue (#31607 ) issue: https://github.com/milvus-io/milvus/issues/30709 Signed-off-by: yhmo <yihua.mo@zilliz.com>	2024-03-27 14:41:20 +08:00
groot	c81909bfab	enhance: Support MinIO TLS connection (#31311 ) issue: https://github.com/milvus-io/milvus/issues/30709 pr: #31292 Signed-off-by: yhmo <yihua.mo@zilliz.com> Co-authored-by: Chen Rao <chenrao317328@163.com>	2024-03-21 11:15:20 +08:00
Buqian Zheng	070dfc77bf	feat: [Sparse Float Vector] segcore basics and index building (#30357 ) This commit adds sparse float vector support to segcore with the following: 1. data type enum declarations 2. Adds corresponding data structures for handling sparse float vectors in various scenarios, including: * FieldData as a bridge between the binlog and the in-memory data structures * mmap::Column as the in-memory representation of a sparse float vector column of a sealed segment; * ConcurrentVector as the in-memory representation of a sparse float vector of a growing segment which supports inserts. 3. Adds logic in payload reader/writer to serialize/deserialize from/to binlog 4. Adds the ability to allow the index node to build sparse float vector index 5. Adds the ability to allow the query node to build growing index for growing segment and temp index for sealed segment without index built This commit also includes some code cleanup, comment improvements, and some unit tests for sparse vector. https://github.com/milvus-io/milvus/issues/29419 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-11 14:45:02 +08:00
zhagnlu	976b6fc0e4	enhance: change opendal as compile configurable (#30384 ) #30373 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-02-20 19:16:52 +08:00
congqixia	18c351efa6	fix: Prevent ChunkCache use absolute path in All-in-one mode (#30666 ) See also #30651 The append operator (`/`) of `std::filesystem::path` replaces the whole path when its right-hand operand is an absolute path. In "All-in-one" mode, this causes ChunkCache to remove the original vector data file when building the chunk cache during or after the load procedure. This PR moves the ChunkCache path generation logic into a separate function that checks whether the file path is absolute. If it is, the function removes the root path prefix and returns the concatenated file path. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-19 20:58:51 +08:00
PowderLi	5cf9bb236e	enhance: restful support import jobs (#30343 ) issue: #28521 #29732 include 1. list collection's import jobs 2. create a new import job 3. get the progress of an import job fix: 1. mix the order of dbName & collectionName #29728 2. trace log keep the same as v1 3. support traceID 4. azure precheck, blob name cannot end with / #29703 --------- Signed-off-by: PowderLi <min.li@zilliz.com>	2024-01-31 17:57:04 +08:00
yah01	878c4c9463	enhance: limit the max pool size to 16 (#30371 ) according to our benchmark, concurrency level 16 is enough to fully utilize the object storage network bandwidth Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-31 14:13:06 +08:00
yihao.dai	c02fb64ad6	enhance: Allows proactive warming up of chunk cache (#30182 ) Allows proactive warming up of chunk cache. Original vector data will be asynchronously loaded into the chunk cache during the load process. It has the potential to significantly reduce query/search latency for a certain duration after the load, albeit with a concurrent increase in disk usage. issue: https://github.com/milvus-io/milvus/issues/30181 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-25 19:55:39 +08:00
yah01	a27c0e86fd	enhance: reduce many I/O operations while loading disk index (#30189 ) Before this, every write of index chunk data to disk took 4 I/O operations: - open the file - seek to the offset - write the data - close the file. This change opens the file only once and writes all data continuously. It also makes loading the files from object storage concurrent. Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-25 15:23:02 +08:00
Patrick Weizhi Xu	0907d76253	enhance: pass partition key scalar info if enabled when build vector index (#29931 ) issue: #29892 Pass optional scalar IVF offsets to Cardinal Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2024-01-24 00:04:55 +08:00
cai.zhang	6cf2f09b60	feat: Support tencent cloud object storage for milvus (#30163 ) issue: #30162 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-01-23 11:28:56 +08:00
yah01	a77693aa19	enhance: convert the `GetObject` util to async (#30166 ) This makes it much easier to use Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-22 19:20:57 +08:00
Bingyi Sun	e1258b8cad	feat: integrate storagev2 into loading segment (#29336 ) issue: #29335 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-01-12 18:10:51 +08:00
Xu Tong	e429965f32	Add float16 approve for multi-type part (#28427 ) issue：https://github.com/milvus-io/milvus/issues/22837 Add bfloat16 vector, add the index part of float16 vector. Signed-off-by: Writer-X <1256866856@qq.com>	2024-01-11 15:48:51 +08:00
PowderLi	c8db36a63a	enhance: get a blob to check object storage config (#29703 ) issue: #29672 the storage account needs at least the privileges of actions `Microsoft.Storage/storageAccounts/blobServices/containers/blobs/*` Signed-off-by: PowderLi <min.li@zilliz.com>	2024-01-05 14:50:46 +08:00
yah01	0ae90443ba	enhance: fill missed info for segcore error (#29610 ) - fill missed error info - format the error message directly Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-04 17:54:46 +08:00
PowderLi	5f00bad4b8	fix: link with install path's libblob-chunk-manager (#29496 ) issue: #29494 1. link with install path's libblob-chunk-manager 2. performance of `ShouldBindWith` is better than `ShouldBindBodyWith` 3. the middleware shouldn't read the unrefreshed parameter repeatedly Signed-off-by: PowderLi <min.li@zilliz.com>	2023-12-31 20:02:48 +08:00
Jiquan Long	3f46c6d459	feat: support inverted index (#28783 ) issue: https://github.com/milvus-io/milvus/issues/27704 Add inverted index for some data types in Milvus. This index type can save a lot of memory compared to loading all data into RAM and speed up the term query and range query. Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL` and `VARCHAR`. Not supported: `ARRAY` and `JSON`. Note: - The inverted index for `VARCHAR` is not currently designed to serve full-text search. Every row is treated as a whole keyword instead of being tokenized into multiple terms. - The inverted index doesn't support retrieval well, so if you create an inverted index for a field, operations that depend on the raw data will fall back to chunk storage, which brings some performance loss. For example, comparisons between two columns and retrieval of output fields. The inverted index is easy to use. Take the collection below as an example: ```python fields = [ FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100), FieldSchema(name="int8", dtype=DataType.INT8), FieldSchema(name="int16", dtype=DataType.INT16), FieldSchema(name="int32", dtype=DataType.INT32), FieldSchema(name="int64", dtype=DataType.INT64), FieldSchema(name="float", dtype=DataType.FLOAT), FieldSchema(name="double", dtype=DataType.DOUBLE), FieldSchema(name="bool", dtype=DataType.BOOL), FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000), FieldSchema(name="random", dtype=DataType.DOUBLE), FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim), ] schema = CollectionSchema(fields) collection = Collection("demo", schema) ``` Then we can create an inverted index for each field via: ```python index_type = "INVERTED" collection.create_index("int8", {"index_type": index_type}) collection.create_index("int16", {"index_type": index_type}) collection.create_index("int32", {"index_type": index_type}) collection.create_index("int64", {"index_type": index_type}) collection.create_index("float", {"index_type": index_type}) collection.create_index("double", {"index_type": index_type}) collection.create_index("bool", {"index_type": index_type}) collection.create_index("varchar", {"index_type": index_type}) ``` Then term queries and range queries on these fields are sped up automatically by the inverted index: ```python result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"]) result = collection.query(expr='int64 < 5', output_fields=["pk"]) result = collection.query(expr='int64 > 2997', output_fields=["pk"]) result = collection.query(expr='1 < int64 < 5', output_fields=["pk"]) ``` --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-12-31 19:50:47 +08:00
yah01	aef483806d	enhance: improve the segcore logs (#29372 ) - remove the streaming logging - refine existing logs fix #29366 --------- Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-12-23 21:52:43 +08:00
Bingyi Sun	89b208d27a	enhance: Fix format message (#29159 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-12-20 09:30:44 +08:00
zhagnlu	a602171d06	enhance: Refactor runtime and expr framework (#28166 ) #28165 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-12-18 12:04:42 +08:00
Bingyi Sun	36f69ea031	feat: integrate storagev2 in building index of segcore (#28768 ) issue: https://github.com/milvus-io/milvus/issues/28655 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-12-05 16:48:54 +08:00
PowderLi	20fc90c591	enhance: find collection schema from cache (#28782 ) issue: #28781 #28329 1. There is no need to call `DescribeCollection` if the collection's schema is found in the globalMetaCache 2. call `GetProperties` to check access to the Azure Blob Service while constructing the ChunkManager Signed-off-by: PowderLi <min.li@zilliz.com>	2023-12-03 19:22:33 +08:00
PowderLi	cac802ef7f	enhance: use already installed vcpkg (#28703 ) issue #28686 1. Update Builder gpu image changes, see changes #28505 2. update azure-identity-cpp from beta to release Signed-off-by: PowderLi <min.li@zilliz.com>	2023-11-30 15:58:32 +08:00
zhagnlu	0d9d098186	enhance: Add precheck when chunk manager init (#28330 ) #28329 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-11-23 19:56:32 +08:00
PowderLi	a1c505dbd5	add internal storage metrics (#28278 ) /kind improvement issue: #28277 Signed-off-by: PowderLi <min.li@zilliz.com>	2023-11-19 17:22:25 +08:00
zhagnlu	3920bbc55f	Force set aliyun use_virtual_host to true for all (#28158 ) Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-11-10 10:32:20 +08:00
PowderLi	9f9726f8b7	print azure sdk log (#28240 ) Signed-off-by: PowderLi <min.li@zilliz.com>	2023-11-08 17:50:18 +08:00
PowderLi	7bb0fa9c70	reduce useless ObjectExists (#28156 ) replace ListBlobs() with GetProperties() unified style std::string& / char* config azure requestTimeoutMs Signed-off-by: PowderLi <min.li@zilliz.com>	2023-11-07 16:32:20 +08:00
yihao.dai	873b29e226	Fix unstable cpp ut (#28083 ) Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-11-02 00:58:16 +08:00
PowderLi	0c0f012e03	add a custom http header: Accept-Type-Allow-Int64 (#27901 ) Signed-off-by: PowderLi <min.li@zilliz.com>	2023-11-01 11:42:16 +08:00
Enwei Jiao	8ae9c947ae	Use OpenDAL to access object store (#25642 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-11-01 09:00:14 +08:00
yah01	ab6dbf7659	Limit max thread num for pool (#28018 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-10-31 13:00:13 +08:00
yah01	2af46d7333	Increase the ChunkManager request timeout (#28015 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-10-31 09:06:13 +08:00
yihao.dai	ab6b0103a3	Get vector concurrently (#27838 ) Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-10-30 15:44:12 +08:00
zhagnlu	6060dd7ea8	Add chunk manager request timeout (#27692 ) Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-10-23 20:08:08 +08:00
Xiaofan	d83869aaeb	Refine minio chunks manager (#27510 ) Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>	2023-10-13 14:15:35 +08:00
PowderLi	8d3069b1db	update openssl to 3.1.2 (#27399 ) deal with root path's normalization Signed-off-by: PowderLi <min.li@zilliz.com>	2023-10-08 19:17:31 +08:00
yihao.dai	106c17f304	Make read ahead policy in ChunkCache configurable (#27291 ) Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-09-28 15:47:27 +08:00
Enwei Jiao	b80a3e19d3	Add code for PanicInfo (#27364 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-09-27 12:01:28 +08:00
foxspy	370b6fde58	milvus support multi index engine (#27178 ) Co-authored-by: longjiquan <jiquan.long@zilliz.com>	2023-09-22 09:59:26 +08:00
cai.zhang	a362bb1457	Support array datatype (#26369 ) Signed-off-by: cai.zhang <cai.zhang@zilliz.com>	2023-09-19 14:23:23 +08:00
PowderLi	4feb3fa7c6	support azure (#26398 ) Signed-off-by: PowderLi <min.li@zilliz.com>	2023-09-19 10:01:23 +08:00
yihao.dai	060d3563ba	Fix compile error at core/storage (#27121 ) Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-09-15 14:41:21 +08:00
yihao.dai	bb6711f28c	Add ChunkCache: support get vector from storage (#26142 ) Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-09-15 10:21:20 +08:00
Enwei Jiao	0afdfdb9af	Remove other Exceptions, keeps SegcoreError only (#27017 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-09-14 14:05:20 +08:00
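
The ChunkCache fix (#30666) listed above hinges on a path-joining rule: appending an absolute path discards the left-hand side. The commit describes this for C++ `std::filesystem::path`; Python's `pathlib` follows the same rule, so the behavior can be sketched in the language of the examples already on this page. This is only an illustration under that assumption, not the Milvus implementation; `naive_join` and `safe_cache_path` are hypothetical names.

```python
from pathlib import PurePosixPath


def naive_join(root: str, file_path: str) -> PurePosixPath:
    # Joining with "/" discards `root` entirely when `file_path` is absolute.
    return PurePosixPath(root) / file_path


def safe_cache_path(root: str, file_path: str) -> PurePosixPath:
    # Hypothetical helper: strip the leading root ("/") from an absolute
    # path so the joined result always stays underneath `root`.
    path = PurePosixPath(file_path)
    if path.is_absolute():
        path = path.relative_to(path.anchor)
    return PurePosixPath(root) / path


if __name__ == "__main__":
    # The naive join points back at the original data file, not the cache.
    print(naive_join("/cache", "/data/vec.bin"))
    # The safe join keeps the cached copy under the cache root.
    print(safe_cache_path("/cache", "/data/vec.bin"))
```

With the naive join, a cache rooted at `/cache` would resolve to the original file's own location, which is how cleaning up the "cache" entry could end up removing the source data.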

108 Commits