milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
Spade A	8abf6c9149	fix: build text index when loading field data (#39070 ) fix: https://github.com/milvus-io/milvus/issues/39053 may fix https://github.com/milvus-io/milvus/issues/38644 which could be caused by https://github.com/milvus-io/milvus/issues/39053 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-09 15:24:56 +08:00
Spade A	4245c5bed1	fix: text match panics when enable_match is set be false (#38950 ) fix: https://github.com/milvus-io/milvus/issues/38949 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-03 14:20:55 +08:00
zhagnlu	01de0afc4e	enhance: refactor delete mvcc function (#38066 ) #37413 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-15 18:02:43 +08:00
Gao	994fc544e7	enhance: support iterative filter execution (#37363 ) issue: #37360 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2024-12-11 11:32:44 +08:00
zhagnlu	e4b6773d0a	fix: fix create text index dir conflict bug (#37693 ) #37623 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-11-15 18:26:30 +08:00
smellthemoon	3389a6b500	enhance: support null in text match index (#37517 ) #37508 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-11-13 11:08:29 +08:00
aoiasd	12951f0abb	enhance: rename tokenizer to analyzer and check analyzer params (#37478 ) relate: https://github.com/milvus-io/milvus/issues/35853 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-10 16:12:26 +08:00
aoiasd	d67853fa89	feat: Tokenizer support build with params and clone for concurrency (#37048 ) relate: https://github.com/milvus-io/milvus/issues/35853 https://github.com/milvus-io/milvus/issues/36751 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-06 17:48:24 +08:00
zhenshan.cao	63843dce33	fix: Fix conan gdal building problem (#37338 ) issue:https://github.com/milvus-io/milvus/issues/27576 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-10-31 21:04:16 +08:00
Hao Tan	67c4340565	feat: Geospatial Data Type and GIS Function Support for milvus server (#35990 ) issue:https://github.com/milvus-io/milvus/issues/27576 # Main Goals 1. Create and describe collections with geospatial fields, enabling both client and server to recognize and process geo fields. 2. Insert geospatial data as payload values in the insert binlog, and print the values for verification. 3. Load segments containing geospatial data into memory. 4. Ensure query outputs can display geospatial data. 5. Support filtering on GIS functions for geospatial columns. # Solution 1. Add Type: Modify the Milvus core by adding a Geospatial type in both the C++ and Go code layers, defining the Geospatial data structure and the corresponding interfaces. 2. Dependency Libraries: Introduce necessary geospatial data processing libraries. In the C++ source code, use Conan package management to include the GDAL library. In the Go source code, add the go-geom library to the go.mod file. 3. Protocol Interface: Revise the Milvus protocol to provide mechanisms for Geospatial message serialization and deserialization. 4. Data Pipeline: Facilitate interaction between the client and proxy using the WKT format for geospatial data. The proxy will convert all data into WKB format for downstream processing, providing column data interfaces, segment encapsulation, segment loading, payload writing, and cache block management. 5. Query Operators: Implement simple display and support for filter queries. Initially, focus on filtering based on spatial relationships for a single column of geospatial literal values, providing parsing and execution for query expressions. 6. Client Modification: Enable the client to handle user input for geospatial data and facilitate end-to-end testing.Check the modification in pymilvus. --------- Signed-off-by: tasty-gumi <1021989072@qq.com>	2024-10-31 20:58:20 +08:00
Bingyi Sun	a75bb85f3a	feat: support chunked column for sealed segment (#35764 ) This PR splits sealed segment to chunked data to avoid unnecessary memory copy and save memory usage when loading segments so that loading can be accelerated. To support rollback to previous version, we add an option `multipleChunkedEnable` which is false by default. Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-10-12 15:04:52 +08:00
zhagnlu	489087d18b	enhance: refactor executor framework V2 (#35251 ) #32636 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-09-13 20:57:09 +08:00
congqixia	58d3200986	enhance: Filter out non-hit delete records during load delta (#36207 ) Related to #35303 This PR utilizes pk index in segment to exclude non-hit delete record during load delete records. This ability is crucial when l0/delete forward policy only replies on segment itself(without BF filtering). --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-09-13 19:05:08 +08:00
Jiquan Long	89bf226f0b	feat: support keyword text match (#35923 ) fix: #35922 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-09-10 15:11:08 +08:00
zhagnlu	3107701fe8	enhance: optimize retrieve on dynamic field (#35580 ) #35514 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com> Co-authored-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-08-22 14:24:56 +08:00
zhagnlu	4b553b0333	enhance: revert remove duplicated pk function (#35103 ) issue: #34778 Revert "fix: fix query count(*) concurrently" Revert "enhance: mark duplicated pk as deleted " Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-08-05 10:48:17 +08:00
smellthemoon	475c333fa2	enhance: add valid_data in span (#35030 ) #31728 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-08-02 15:40:14 +08:00
zhenshan.cao	aa247f192d	enhance: remove unused code for StorageV2 (#35132 ) issue: https://github.com/milvus-io/milvus/issues/34168 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-08-01 12:08:13 +08:00
smellthemoon	5616b7e8d2	enhance: support null in c data_datacodec and load null value (#32183 ) 1. support read and write null in segcore will store valid_data(use uint8_t type to save memory) in fieldData. 2. support load null binlog reader read and write data into column(sealed segment), insertRecord(growing segment). In sealed segment, store valid_data directly. In growing segment, considering prior implementation and easy code reading, it covert uint8_t to fbvector<bool>, which may optimize in future. 3. retrieve valid_data. parse valid_data in search/query. #31728 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-07-23 16:07:51 +08:00
zhagnlu	804dd5409a	enhance: mark duplicated pk as deleted (#34586 ) fix #34247 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-07-16 14:25:39 +08:00
zhagnlu	3030e4625e	enhance: refactor variable column to reduce memory cost (#33875 ) #33874 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-06-30 20:16:06 +08:00
cqy123456	32f685ff12	enhance: growing segment support mmap (#32633 ) issue: https://github.com/milvus-io/milvus/issues/32984 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-06-18 14:42:00 +08:00
zhagnlu	c6f8a73bb2	enhance: optimize some cache to reduce memory usage (#33534 ) #33533 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-06-04 14:09:47 +08:00
chyezh	e19d17076f	fix: delete may lost when enable lru cache, some field should be reset when ReleaseData (#32012 ) issue: #30361 - Delete may be lost when segment is not data-loaded status in lru cache. skip filtering to fix it. - `stats_` and `variable_fields_avg_size_` should be reset when `ReleaseData` - Remove repeat load delta log operation in lru. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-16 11:17:20 +08:00
Cai Yudong	246586be27	enhance: Unify data type check APIs under internal/core (#31800 ) Issue: #22837 Move and rename following C++ APIs: datatype_sizeof() ==> GetDataTypeSize() datatype_name() ==> GetDataTypeName() datatype_is_vector() / IsVectorType() ==> IsVectorDataType() datatype_is_variable() ==> IsVariableDataType() datatype_is_sparse_vector() ==> IsSparseFloatVectorDataType() datatype_is_string() / IsString() ==> IsDataTypeString() datatype_is_floating() / IsFloat() ==> IsDataTypeFloat() datatype_is_binary() ==> IsDataTypeBinary() datatype_is_json() ==> IsDataTypeJson() datatype_is_array() ==> IsDataTypeArray() datatype_is_variable() == IsDataTypeVariable() datatype_is_integer() / IsIntegral() ==> IsDataTypeInteger() Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-04-02 19:15:14 +08:00
Buqian Zheng	7fc3094a42	fix: fix growing index data race and properly handle build error (#31170 ) issue: https://github.com/milvus-io/milvus/issues/31169 also properly handling index build error by re-create a new index so that nothing will be left in the previous failed index build attempt. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-13 20:19:04 +08:00
Buqian Zheng	96cfae55a5	feat: [Sparse Float Vector] segcore to support sparse vector search and get raw vector by id (#30629 ) This PR adds the ability to search/get sparse float vectors in segcore, and added unit tests by modifying lots of existing tests into parameterized ones. https://github.com/milvus-io/milvus/issues/29419 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-12 09:16:30 -07:00
Buqian Zheng	070dfc77bf	feat: [Sparse Float Vector] segcore basics and index building (#30357 ) This commit adds sparse float vector support to segcore with the following: 1. data type enum declarations 2. Adds corresponding data structures for handling sparse float vectors in various scenarios, including: * FieldData as a bridge between the binlog and the in memory data structures * mmap::Column as the in memory representation of a sparse float vector column of a sealed segment; * ConcurrentVector as the in memory representation of a sparse float vector of a growing segment which supports inserts. 3. Adds logic in payload reader/writer to serialize/deserialize from/to binlog 4. Adds the ability to allow the index node to build sparse float vector index 5. Adds the ability to allow the query node to build growing index for growing segment and temp index for sealed segment without index built This commit also includes some code cleanness, comment improvement, and some unit tests for sparse vector. https://github.com/milvus-io/milvus/issues/29419 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-11 14:45:02 +08:00
MrPresent-Han	77eb6defb1	feat: support groupby on growing and non-indexed sealed egment(#30307 ) (#30644 ) related: #30308 Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2024-02-21 14:02:53 +08:00
yah01	f542bdbf3c	enhance: calc the accurate mem size of segment (#30093 ) this stats the real memory size of segment, also reduces the memory usage in mmap mode resolve #30095 Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-19 12:32:53 +08:00
yah01	6c477ce3a7	enhance: optimize the loading strategy (#29910 ) as we have the pool size limit so we don't need to limit the concurrency manually Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-12 14:26:50 +08:00
Xu Tong	e429965f32	Add float16 approve for multi-type part (#28427 ) issue：https://github.com/milvus-io/milvus/issues/22837 Add bfloat16 vector, add the index part of float16 vector. Signed-off-by: Writer-X <1256866856@qq.com>	2024-01-11 15:48:51 +08:00
MrPresent-Han	9e2e7157e9	feat: support search_group_by for milvus(#25324 ) (#28983 ) related: #25324 Search GroupBy function, used to aggregate result entities based on a specific scalar column. several points to mention: 1. Temporarliy, the whole groupby is implemented separated from iterative expr framework for the first period 2. In the long term, the groupBy operation will be incorporated into the iterative expr framework:https://github.com/milvus-io/milvus/pull/28166 3. This pr includes some unrelated mocked interface regarding alterIndex due to some unworth-to-mention reasons. All these un-associated content will be removed before the final pr is merged. This version of pr is only for review 4. All other related details were commented in the files comparison Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2024-01-05 15:50:47 +08:00
yah01	aef483806d	enhance: improve the segcore logs (#29372 ) - remove the streaming logging - refine existing logs fix #29366 --------- Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-12-23 21:52:43 +08:00
zhagnlu	a602171d06	enhance: Refactor runtime and expr framework (#28166 ) #28165 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2023-12-18 12:04:42 +08:00
Bingyi Sun	36f69ea031	feat: integrate storagev2 in building index of segcore (#28768 ) issue: https://github.com/milvus-io/milvus/issues/28655 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-12-05 16:48:54 +08:00
congqixia	1dc086496f	fix: schema->size() check logic with system field (#28802 ) Now segcore load system field info as well, the growing segment assertion shall not pass with "+ 2" value This will cause all growing segments load failure Fix #28801 Related to #28478 See also #28524 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-29 22:40:28 +08:00
yah01	02c5a649cf	enhance: store system fields in segcore (#28524 ) we need the system fields info for some usacase fix: #28523 --------- Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-11-21 09:28:22 +08:00
yah01	f7d2ab6677	enhance: reduce 1x copy for variable length field while retrieving (#28345 ) - Reduce 1x copy for varchar/string/JSON/array types while retrieving - Reduce 1x copy for int8/int16 while retrieving Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-11-15 18:08:20 +08:00
yah01	267c67dfee	enhance: reduce 1x copy while retrieving data from growing segment (#28323 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-11-10 15:44:22 +08:00
cqy123456	4fbe3c9142	replace loaded binlog with binlog index for search performance (#27673 ) Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2023-11-01 02:20:15 +08:00
yah01	f212158d61	Fix delete records timestamp may be reordered (#27941 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-10-27 10:24:10 +08:00
Enwei Jiao	b80a3e19d3	Add code for PanicInfo (#27364 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-09-27 12:01:28 +08:00
yah01	93e2eb78c9	Delete only if primary keys exist (#25292 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-09-20 19:03:25 +08:00
cai.zhang	a362bb1457	Support array datatype (#26369 ) Signed-off-by: cai.zhang <cai.zhang@zilliz.com>	2023-09-19 14:23:23 +08:00
Enwei Jiao	0afdfdb9af	Remove other Exceptions, keeps SegcoreError only (#27017 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-09-14 14:05:20 +08:00
yah01	3203ce1654	Reduce copy while retrieving primary keys (#26616 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-09-11 21:31:18 +08:00
Enwei Jiao	c3f15c6b95	Refactor duplicate error class into one place (#26985 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-09-11 20:43:17 +08:00
Xu Tong	9166011c4a	Add float16 vector (#25852 ) Signed-off-by: Writer-X <1256866856@qq.com>	2023-09-08 10:03:16 +08:00
yah01	b475f25042	Remove invalid offset check while filling data (#26666 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-08-30 09:52:27 +08:00

1 2 3

123 Commits