milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-07 09:38:39 +08:00

Author	SHA1	Message	Date
SimFG	130a923dec	enhance: the estimate method when loading the collection (#36307 ) - issue: #36530 --------- Signed-off-by: SimFG <bang.fu@zilliz.com> Signed-off-by: xianliang.li <xianliang.li@zilliz.com> Co-authored-by: xianliang.li <xianliang.li@zilliz.com>	2024-10-09 17:35:19 +08:00
zhenshan.cao	aa247f192d	enhance: remove unused code for StorageV2 (#35132 ) issue: https://github.com/milvus-io/milvus/issues/34168 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-08-01 12:08:13 +08:00
zhagnlu	f1b2f7b640	enhance: refactor bitmap index and internal hybrid index (#34450 ) #32900 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-07-18 10:39:42 +08:00
zhagnlu	0d7ea8ec42	enhance: Enhance and correct exception module (#33705 ) #33704 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-06-23 21:22:01 +08:00
Gao	0d20303e54	fix: fix binary vector data size (#33750 ) issue: https://github.com/milvus-io/milvus/issues/22837 - fix byte size wrong for binary vectors - fix the expect/actual error msg Signed-off-by: chasingegg <chao.gao@zilliz.com>	2024-06-18 21:39:59 +08:00
zhagnlu	d43ec4db0b	enhance: support array bitmap index (#33527 ) #32900 --------- Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-06-16 21:51:58 +08:00
Buqian Zheng	47b04ea167	enhance: support sparse cardinal hnsw index (#33656 ) issue: #29419 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-06-12 16:57:55 +08:00
cqy123456	703fc73f71	enhance: disk index support binary vector (#33631) issue: https://github.com/milvus-io/milvus/issues/22837 related: https://github.com/milvus-io/milvus/pull/33575 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-06-05 19:37:57 +08:00
Jiquan Long	0c5d8660aa	feat: support inverted index for array (#33452 ) issue: https://github.com/milvus-io/milvus/issues/27704 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-05-31 09:47:47 +08:00
zhagnlu	589d4dfd82	enhance: optimize bitmap index (#33358 ) #32900 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-05-30 13:09:43 +08:00
zhagnlu	d669fbcf46	enhance: support bitmap index for scalar type (#32902 ) #32900 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-05-19 21:49:38 +08:00
cqy123456	aba4993c6c	fix: fix some fp16/bf16 code miss in segcore. (#31771) issue: https://github.com/milvus-io/milvus/issues/22837 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-04-07 14:13:16 +08:00
Cai Yudong	246586be27	enhance: Unify data type check APIs under internal/core (#31800) Issue: #22837 Move and rename the following C++ APIs:
- datatype_sizeof() ==> GetDataTypeSize()
- datatype_name() ==> GetDataTypeName()
- datatype_is_vector() / IsVectorType() ==> IsVectorDataType()
- datatype_is_variable() ==> IsVariableDataType()
- datatype_is_sparse_vector() ==> IsSparseFloatVectorDataType()
- datatype_is_string() / IsString() ==> IsDataTypeString()
- datatype_is_floating() / IsFloat() ==> IsDataTypeFloat()
- datatype_is_binary() ==> IsDataTypeBinary()
- datatype_is_json() ==> IsDataTypeJson()
- datatype_is_array() ==> IsDataTypeArray()
- datatype_is_variable() ==> IsDataTypeVariable()
- datatype_is_integer() / IsIntegral() ==> IsDataTypeInteger()
Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-04-02 19:15:14 +08:00
Buqian Zheng	070dfc77bf	feat: [Sparse Float Vector] segcore basics and index building (#30357) This commit adds sparse float vector support to segcore with the following:
1. data type enum declarations
2. Adds corresponding data structures for handling sparse float vectors in various scenarios, including:
   * FieldData as a bridge between the binlog and the in memory data structures
   * mmap::Column as the in memory representation of a sparse float vector column of a sealed segment;
   * ConcurrentVector as the in memory representation of a sparse float vector of a growing segment which supports inserts.
3. Adds logic in payload reader/writer to serialize/deserialize from/to binlog
4. Adds the ability to allow the index node to build sparse float vector index
5. Adds the ability to allow the query node to build growing index for growing segment and temp index for sealed segment without index built
This commit also includes some code cleanness, comment improvement, and some unit tests for sparse vector. https://github.com/milvus-io/milvus/issues/29419 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-11 14:45:02 +08:00
Xu Tong	e429965f32	Add float16 approve for multi-type part (#28427) issue: https://github.com/milvus-io/milvus/issues/22837 Add bfloat16 vector, add the index part of float16 vector. Signed-off-by: Writer-X <1256866856@qq.com>	2024-01-11 15:48:51 +08:00
yah01	99e0f1e65a	enhance: unable to compile C++ tests (#29616) The tests need to call a private method. Milvus used `#define` to replace `private` with `public`; the hack works but breaks if the include order changes. This change uses `friend` instead so that the tests compile regardless of include order. Signed-off-by: yah01 <yang.cen@zilliz.com> Signed-off-by: yah01 <yah2er0ne@outlook.com>	2024-01-04 13:20:46 +08:00
Jiquan Long	3f46c6d459	feat: support inverted index (#28783) issue: https://github.com/milvus-io/milvus/issues/27704 Add inverted index for some data types in Milvus. This index type can save a lot of memory compared to loading all data into RAM, and it speeds up term queries and range queries.
Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL` and `VARCHAR`.
Not supported: `ARRAY` and `JSON`.
Note:
- The inverted index for `VARCHAR` is not currently designed to serve full-text search. Every row is treated as a single keyword instead of being tokenized into multiple terms.
- The inverted index does not support retrieval well, so if you create an inverted index on a field, operations that depend on the raw data fall back to chunk storage, which brings some performance loss. Examples are comparisons between two columns and retrieval of output fields.
The inverted index is easy to use. Take the collection below as an example:
```python
from pymilvus import Collection, CollectionSchema, DataType, FieldSchema

# `dim` (the embedding dimension) is assumed to be defined by the caller.
fields = [
    FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),
    FieldSchema(name="int8", dtype=DataType.INT8),
    FieldSchema(name="int16", dtype=DataType.INT16),
    FieldSchema(name="int32", dtype=DataType.INT32),
    FieldSchema(name="int64", dtype=DataType.INT64),
    FieldSchema(name="float", dtype=DataType.FLOAT),
    FieldSchema(name="double", dtype=DataType.DOUBLE),
    FieldSchema(name="bool", dtype=DataType.BOOL),
    FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name="random", dtype=DataType.DOUBLE),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields)
collection = Collection("demo", schema)
```
Then create an inverted index on each field:
```python
index_type = "INVERTED"
collection.create_index("int8", {"index_type": index_type})
collection.create_index("int16", {"index_type": index_type})
collection.create_index("int32", {"index_type": index_type})
collection.create_index("int64", {"index_type": index_type})
collection.create_index("float", {"index_type": index_type})
collection.create_index("double", {"index_type": index_type})
collection.create_index("bool", {"index_type": index_type})
collection.create_index("varchar", {"index_type": index_type})
```
Term queries and range queries on these fields are then sped up automatically by the inverted index:
```python
result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"])
result = collection.query(expr='int64 < 5', output_fields=["pk"])
result = collection.query(expr='int64 > 2997', output_fields=["pk"])
result = collection.query(expr='1 < int64 < 5', output_fields=["pk"])
```
--------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-12-31 19:50:47 +08:00
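Editor's note on the commit above: the sketch below illustrates, in plain Python, why an inverted index helps term and range queries. It is a toy model and not Milvus's implementation; the class `ToyInvertedIndex` and its methods are hypothetical names introduced only for illustration. The idea is that each distinct value maps to the row ids holding it, and the sorted list of distinct values lets a range query use binary search instead of scanning every row.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class ToyInvertedIndex:
    """Hypothetical toy index: maps each distinct value to the row ids holding it."""

    def __init__(self, values):
        postings = defaultdict(list)
        for row_id, value in enumerate(values):
            postings[value].append(row_id)
        self.postings = dict(postings)
        # Sorted distinct values allow binary search for range queries.
        self.sorted_keys = sorted(self.postings)

    def term_query(self, terms):
        """Rows whose value is one of `terms` (like `field in [1, 2, 3]`)."""
        rows = set()
        for term in terms:
            rows.update(self.postings.get(term, []))
        return sorted(rows)

    def range_query(self, lower=None, upper=None):
        """Rows with lower < value < upper; both bounds are exclusive and optional."""
        lo = 0 if lower is None else bisect_right(self.sorted_keys, lower)
        hi = len(self.sorted_keys) if upper is None else bisect_left(self.sorted_keys, upper)
        rows = []
        for key in self.sorted_keys[lo:hi]:
            rows.extend(self.postings[key])
        return sorted(rows)


index = ToyInvertedIndex([3, 1, 4, 1, 5, 9, 2, 6])
term_rows = index.term_query([1, 2, 3])            # analogous to 'int64 in [1, 2, 3]'
range_rows = index.range_query(lower=1, upper=5)   # analogous to '1 < int64 < 5'
print(term_rows, range_rows)
```

Note that the toy index answers both queries with row ids only. Recovering the stored values for output fields needs the raw data, which matches the commit's remark that retrieval falls back to chunk storage.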
Bingyi Sun	36f69ea031	feat: integrate storagev2 in building index of segcore (#28768 ) issue: https://github.com/milvus-io/milvus/issues/28655 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-12-05 16:48:54 +08:00
Bingyi Sun	e5ce385ffd	enhance: remove -inl.h files (#28674 ) issue: https://github.com/milvus-io/milvus/issues/28673 Move template implementations from -inl.h to .cpp file and make explicit instantiation Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-11-23 17:20:25 +08:00
Xu Tong	8ec85f5f4c	Add template for VectorMemIndex (#28324 ) Signed-off-by: Writer-X <1256866856@qq.com>	2023-11-11 13:20:22 +08:00
foxspy	370b6fde58	milvus support multi index engine (#27178 ) Co-authored-by: longjiquan <jiquan.long@zilliz.com>	2023-09-22 09:59:26 +08:00
Enwei Jiao	0afdfdb9af	Remove other Exceptions, keeps SegcoreError only (#27017 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-09-14 14:05:20 +08:00
cqy123456	0ff4ddc76c	remove VectorMemNMIndex (#27000 ) Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2023-09-12 17:13:18 +08:00
xige-16	04082b3de2	Migrate the ability to upload and download binlog to cpp (#22984 ) Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-06-25 14:38:44 +08:00
Cai Yudong	0e9a4478e3	Remove useless index mode (#22934 ) Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>	2023-03-23 21:39:59 +08:00
yah01	bdd6bc7695	Re-format cpp code (#22513 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-03-02 15:55:49 +08:00
xige-16	4a66965df4	Delete RAW_DATA copy when load IVF_FLAT index data (#20274 ) Signed-off-by: xige-16 <xi.ge@zilliz.com> Signed-off-by: xige-16 <xi.ge@zilliz.com>	2022-11-05 17:33:05 +08:00
xige-16	428840178c	Support diskann index for vector field (#19093 ) Signed-off-by: xige-16 <xi.ge@zilliz.com> Signed-off-by: xige-16 <xi.ge@zilliz.com>	2022-09-21 20:16:51 +08:00
Jiquan Long	fd589baca7	Integrates marisa trie index (#16192 ) Signed-off-by: dragondriver <jiquan.long@zilliz.com>	2022-04-01 15:31:29 +08:00
Jiquan Long	48706f416f	Migrate scalar index from knowhere (#16174 ) Signed-off-by: dragondriver <jiquan.long@zilliz.com>	2022-03-24 14:57:26 +08:00

30 Commits