issue:https://github.com/milvus-io/milvus/issues/27576
# Main Goals
1. Create and describe collections with geospatial fields, enabling both
client and server to recognize and process geo fields.
2. Insert geospatial data as payload values in the insert binlog, and
print the values for verification.
3. Load segments containing geospatial data into memory.
4. Ensure query outputs can display geospatial data.
5. Support filtering on GIS functions for geospatial columns.
# Solution
1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy converts all
data into WKB format for downstream processing, providing column data
interfaces, segment encapsulation, segment loading, payload writing, and
cache block management (a conversion sketch follows this list).
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering by spatial relationships between
a single geospatial column and literal values, providing parsing and
execution for the query expressions.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing. Check the
modifications in pymilvus.
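
As a rough illustration of step 4, here is a minimal Go sketch of the
WKT-to-WKB conversion the proxy would perform, assuming the go-geom
encoding packages; the function name `wktToWKB` is illustrative, not
the actual proxy code:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"log"

	"github.com/twpayne/go-geom/encoding/wkb"
	"github.com/twpayne/go-geom/encoding/wkt"
)

// wktToWKB converts client-facing WKT text into the WKB bytes used for
// downstream processing and storage.
func wktToWKB(text string) ([]byte, error) {
	g, err := wkt.Unmarshal(text) // parse the human-readable form
	if err != nil {
		return nil, fmt.Errorf("invalid WKT %q: %w", text, err)
	}
	return wkb.Marshal(g, binary.LittleEndian) // binary form for storage
}

func main() {
	b, err := wktToWKB("POINT (30.123 -10.456)")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("WKB payload: % x\n", b)
}
```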
---------
Signed-off-by: tasty-gumi <1021989072@qq.com>
Related to #36887
The logic that removes non-hit pk delete records does not work, because
`insert_record_.contain` is broken by a logic problem.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
OSPP 2024 project:
https://summer-ospp.ac.cn/org/prodetail/247410235?list=org&navpage=org
Solutions:
- parser (planparserv2)
- add CallExpr in planparserv2/Plan.g4
- update parser_visitor and show_visitor
- gRPC protobuf
- add CallExpr in plan.proto
- execution (`core/src/exec`)
- add `CallExpr`, `ValueExpr`, and `ColumnExpr` (both logical and
physical) for function calls and function parameters
- function factory (`core/src/exec/expression/function`)
- create a global hashmap when starting Milvus (see server.go)
- the global hashmap stores function signatures and their function
pointers; a CallExpr in the execution engine looks up the function
pointer by its function signature (a sketch follows this list).
- custom functions
- empty(string)
- starts_with(string, string)
- add cpp/go unittests and E2E tests
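
For illustration, a minimal Go sketch of the registry idea; the actual
factory lives in C++ under `core/src/exec/expression/function`, so every
name here is hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// FilterFunc is a scalar filter function evaluated per row.
type FilterFunc func(args ...any) (bool, error)

// functionRegistry maps a signature such as "starts_with(string,string)"
// to its implementation; populated once at server startup.
var functionRegistry = map[string]FilterFunc{
	"empty(string)": func(args ...any) (bool, error) {
		return args[0].(string) == "", nil
	},
	"starts_with(string,string)": func(args ...any) (bool, error) {
		return strings.HasPrefix(args[0].(string), args[1].(string)), nil
	},
}

// resolve is what a CallExpr would do: look up by function signature.
func resolve(signature string) (FilterFunc, error) {
	fn, ok := functionRegistry[signature]
	if !ok {
		return nil, fmt.Errorf("unknown function: %s", signature)
	}
	return fn, nil
}

func main() {
	fn, _ := resolve("starts_with(string,string)")
	hit, _ := fn("milvus", "mil")
	fmt.Println(hit) // true
}
```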
closes: #36559
Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
This PR splits sealed segments into chunked data to avoid unnecessary
memory copies and reduce memory usage when loading segments, so that
loading can be accelerated.
To support rolling back to the previous version, we add an option
`multipleChunkedEnable`, which is false by default.
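
A toy Go sketch of the idea, with hypothetical names: instead of copying
every chunk into one contiguous buffer at load time, the column keeps
the chunks and maps a global row offset onto them:

```go
package main

import "fmt"

// chunkedColumn keeps each loaded chunk as-is instead of copying all
// chunks into one contiguous buffer, trading a per-access lookup for a
// lower peak memory footprint during load.
type chunkedColumn struct {
	chunks [][]int64 // e.g. one slice of int64 field data per binlog
}

// get maps a global row offset to the owning chunk and local offset.
func (c *chunkedColumn) get(row int) int64 {
	for _, ch := range c.chunks {
		if row < len(ch) {
			return ch[row]
		}
		row -= len(ch)
	}
	panic("row out of range")
}

func main() {
	col := &chunkedColumn{chunks: [][]int64{{1, 2}, {3, 4, 5}}}
	fmt.Println(col.get(3)) // 4
}
```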
Signed-off-by: sunby <sunbingyi1992@gmail.com>
issue: https://github.com/milvus-io/milvus/issues/36182
* improved `Column.h` to make the code much more readable and
maintainable, and added detailed comments.
* fixed an issue where `ArrayColumn::NumRows()` always returned 0 when
the mmap backing storage is a file.
* removed unused `ColumnBase` constructors and unnecessary members to
avoid confusion.
* updated `test_chunk_cache.cpp` to parameterize the tests so they run
with mmap both enabled and disabled, and added a sparse field to the
test for extra coverage.
* re-enabled the test `Sealed::GetSparseVectorFromChunkCache`.
* two other disabled tests, `Sealed::WarmupChunkCache` and
`Sealed::GetVectorFromChunkCache`, remain disabled since they still
appear to have errors. @bigsheeper PTAL.
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
Related to #35303
This PR utilizes the pk index in a segment to exclude non-hit delete
records while loading delete records. This ability is crucial when the
l0/delete forward policy relies only on the segment itself (without BF
filtering).
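
A minimal Go sketch of the load-time filtering idea, with hypothetical
types standing in for the real segment structures:

```go
package main

import "fmt"

// pkIndex stands in for a segment's primary-key index; membership tells
// us whether the segment actually holds a given pk.
type pkIndex map[int64]bool

// loadDeleteRecords keeps only delete records whose pk hits the segment,
// so non-hit records never enter the segment's delete buffer.
func loadDeleteRecords(idx pkIndex, pks []int64) []int64 {
	hit := pks[:0] // filter in place
	for _, pk := range pks {
		if idx[pk] {
			hit = append(hit, pk)
		}
	}
	return hit
}

func main() {
	idx := pkIndex{1: true, 3: true}
	fmt.Println(loadDeleteRecords(idx, []int64{1, 2, 3, 4})) // [1 3]
}
```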
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Fix valid data not being appended when transferring to the insert
record, and add a small check for the groupBy field.
#35924
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
issue: #33744
This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key (see the sketch after this list).
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.
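
As a rough Go sketch of what the sort in item 1 amounts to (the types
here are hypothetical, not the indexnode code):

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical flattened view of one segment row.
type row struct {
	pk   int64
	data string
}

// sortByPK rewrites a segment's rows in primary-key order, which is the
// core of what a stats task would do before the segment is sealed.
func sortByPK(rows []row) {
	sort.Slice(rows, func(i, j int) bool { return rows[i].pk < rows[j].pk })
}

func main() {
	rows := []row{{3, "c"}, {1, "a"}, {2, "b"}}
	sortByPK(rows)
	fmt.Println(rows) // [{1 a} {2 b} {3 c}]
}
```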
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #29419
* If a sparse vector with zero non-zero values is inserted, no ANN
search on this sparse vector field will return it as a result. The user
may still retrieve this row via a scalar query or an ANN search on
another vector field.
* If the user uses an empty sparse vector as the query vector for an ANN
search, no neighbors will be returned.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
#34778 #34849
Fix two problems:
1. count(*) could be incorrect: if a growing segment inserted duplicated
(pk, timestamp) pairs whose pk and timestamp are all the same, only one
pair should be kept (see the dedup sketch below).
2. count(*) could core dump: the get_real_count interface could take a
snapshot and perform MVCC in an inconsistent state, which mainly happens
under concurrency.
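
A minimal Go sketch of the dedup rule in problem 1, with hypothetical
types:

```go
package main

import "fmt"

// entry is one (pk, timestamp) pair in a growing segment.
type entry struct {
	pk int64
	ts uint64
}

// dedupCount counts rows while keeping exactly one entry per identical
// (pk, timestamp) pair, mirroring the fix described above.
func dedupCount(entries []entry) int {
	seen := make(map[entry]struct{}, len(entries))
	for _, e := range entries {
		seen[e] = struct{}{}
	}
	return len(seen)
}

func main() {
	rows := []entry{{1, 100}, {1, 100}, {2, 101}}
	fmt.Println(dedupCount(rows)) // 2
}
```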
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
1. Support reading and writing null in segcore.
valid_data is stored in FieldData (using the uint8_t type to save
memory).
2. Support loading null.
The binlog reader reads and writes data into the column (sealed segment)
or insertRecord (growing segment). In a sealed segment, valid_data is
stored directly. In a growing segment, considering the prior
implementation and ease of reading the code, uint8_t is converted to
fbvector<bool>, which may be optimized in the future.
3. Retrieve valid_data.
valid_data is parsed in search/query (see the sketch below).
#31728
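
A minimal Go sketch of the valid_data idea; whether segcore packs one
bit or one byte per row is an assumption of this sketch, and the names
are hypothetical:

```go
package main

import "fmt"

// validData packs one validity flag per row into uint8 bytes
// (bit-per-row here; the real segcore layout may differ).
type validData []uint8

func (v validData) set(row int)          { v[row/8] |= 1 << (row % 8) }
func (v validData) isValid(row int) bool { return v[row/8]&(1<<(row%8)) != 0 }

// toBools mirrors the growing-segment conversion to a bool vector.
func (v validData) toBools(numRows int) []bool {
	out := make([]bool, numRows)
	for i := range out {
		out[i] = v.isValid(i)
	}
	return out
}

func main() {
	v := make(validData, 1) // room for 8 rows
	v.set(0)
	v.set(2)
	fmt.Println(v.toBools(3)) // [true false true]
}
```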
---------
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
related: #33544
The main changes cover three aspects:
1. enable setting group_size for the group-by function (see the sketch
below)
2. separate normal reduce from group-by reduce
3. eliminate unnecessary padding in search results during reduce
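
A minimal Go sketch of the group-by reduce in item 1, assuming hits are
already sorted by descending score (a simplification; the names are
hypothetical):

```go
package main

import "fmt"

// hit is one search result with its group-by field value.
type hit struct {
	id    int64
	group string
	score float32
}

// groupReduce keeps at most groupSize hits per distinct group value,
// which is what a group_size-aware reduce amounts to.
func groupReduce(hits []hit, groupSize int) []hit {
	counts := make(map[string]int)
	out := make([]hit, 0, len(hits))
	for _, h := range hits {
		if counts[h.group] < groupSize {
			counts[h.group]++
			out = append(out, h)
		}
	}
	return out
}

func main() {
	hits := []hit{{1, "a", 0.9}, {2, "a", 0.8}, {3, "b", 0.7}, {4, "a", 0.6}}
	fmt.Println(groupReduce(hits, 2)) // keeps ids 1, 2, 3
}
```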
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
Related to #34508
The padding bytes shall be written only at the end of the mmap file, not
after the chunk of each field's data file.
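
A minimal Go sketch of the resulting layout, with a hypothetical
page-aligned padding rule:

```go
package main

import "fmt"

// layoutMmapFile computes where each field-data chunk lands when chunks
// are written back to back and padding is appended only once, at the
// end of the file (sizes in bytes; pageSize is a hypothetical parameter).
func layoutMmapFile(chunkSizes []int64, pageSize int64) (offsets []int64, fileSize int64) {
	var off int64
	for _, sz := range chunkSizes {
		offsets = append(offsets, off)
		off += sz // no per-chunk padding
	}
	if rem := off % pageSize; rem != 0 {
		off += pageSize - rem // single padding at the file end
	}
	return offsets, off
}

func main() {
	offs, size := layoutMmapFile([]int64{100, 200, 50}, 4096)
	fmt.Println(offs, size) // [0 100 300] 4096
}
```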
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>