milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
zhagnlu	0a378dc308	fix:fix format error for json (#41026 ) #40963 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-07 10:22:22 +08:00
smellthemoon	cb1e86e17c	enhance: support add field (#39800 ) After this PR is merged, insert, upsert, index building, query, and search are supported on the added field. These operations are available on the added field only after the add-field request completes, which is a synchronous operation. Compaction will be supported in the next PR. #39718 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2025-04-02 14:24:31 +08:00
cqy123456	6dc0f42830	fix:growing mmap data type crashed by nullable input (#40994 ) issue: https://github.com/milvus-io/milvus/issues/40981 2.5 pr: https://github.com/milvus-io/milvus/pull/40980 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-03-31 20:32:19 +08:00
Bingyi Sun	9676365af9	fix: Fix json index not equal filter (#40647 ) issue: #35528 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-27 23:06:23 +08:00
Xiaofan	8788e591cd	enhance: add detailed stack for error message (#40883 ) fix #40882 Adds a stack trace when operator execution fails. Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2025-03-26 13:24:20 +08:00
zhagnlu	7fdb2e144f	enhance:change multi or expr to in expr (#40757 ) #40752 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-03-25 11:06:18 +08:00
cai.zhang	a41cb942f6	fix: Do not delete the centroids file when sampling fails instead wait GC (#40701 ) issue: #40700 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-03-21 10:32:12 +08:00
zhagnlu	6c55db44f1	enhance: reorder sub expr for conjunct expr (#39872 ) Two points: (1) reorder the conjunct expr's subexprs to postpone heavy operations; sequence: int(column) -> index(column) -> string(column) -> light conjunct ...... -> json(column) -> heavy conjunct -> two_column_compare (2) support pre-filtering during expr execution, skipping the scan of raw data already filtered out by the preceding expr's result. #39869 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-03-19 14:50:14 +08:00
zhagnlu	7ebe3d7038	enhance: refine chunk access logic and add some comment on data (#40618 ) #40367 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-03-16 22:20:08 +08:00
Chun Han	259f9106ad	enhance: refine variable-length-type memory usage(#38736 ) (#39578 ) related: #38736 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-02-27 21:13:58 +08:00
Bingyi Sun	db4769281c	fix: Fall back to a brute-force search if json index type unmatched (#40076 ) issue: https://github.com/milvus-io/milvus/issues/35528 If the query data type does not match the index type, fall back to a brute-force search --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-24 16:25:57 +08:00
sthuang	3eb3af5f08	feat: explicitly specify column groups for storage v2 api (#39790 ) * use the new packed reader and writer api to be compatible with current etcd meta * For the new packed writer API: column groups and paths are explicitly defined by users and won't split column groups by memory in storage v2. Packed writer follows the user-defined column groups to split arrow record and write into the corresponding file path. * For the new packed reader API: read paths are explicitly defined by users. related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-02-21 22:03:54 +08:00
Patrick Weizhi Xu	04fff74a56	feat: introduce Text data type (#39874 ) issue: https://github.com/milvus-io/milvus/issues/39818 This PR mimics Varchar data type, allows insert, search, query, delete, full-text search and others. Functionalities related to filter expressions are disabled temporarily. Storage changes for Text data type will be in the following PRs. Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2025-02-19 11:04:51 +08:00
Bingyi Sun	b59555057d	feat: support json index (#36750 ) https://github.com/milvus-io/milvus/issues/35528 This PR adds json index support for json and dynamic fields. For now, this index supports only unary queries such as 'a["b"] > 1'. More filter types will be supported later. Basic usage: ``` collection.create_index("json_field", {"index_type": "INVERTED", "params": {"json_cast_type": DataType.STRING, "json_path": 'json_field["a"]["b"]'}}) ``` There are some limits when using this index: 1. If a record does not have the json path you specify, it will be ignored and there will not be an error. 2. If a value of the json path fails to be cast to the type you specify, it will be ignored and there will not be an error. 3. A specific json path can have only one json index. 4. If you try to create more than one json index for one json field, sdk(pymilvus<=2.4.7) may return immediately because of internal implementation. This will be fixed in a later version. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-15 14:06:15 +08:00
Cai Yudong	341d6c1eb7	feat: Update segcore for VECTOR_INT8 (#39415 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-21 11:03:03 +08:00
Spade A	0461ddf776	fix: phrase match does not support offset input (#39338 ) fix: #39337 Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-16 22:05:01 +08:00
Gao	75d7978a18	enhance: pass partition key scalar info if enable for vector mem index (#39123 ) issue: #34332 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-01-16 14:33:03 +08:00
Cai Yudong	5bf1b2b929	feat: Support Int8Vector in go (#38990 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-14 20:43:06 +08:00
Zhen Ye	3e788f0fbd	enhance: record memory size (uncompressed) item for index (#38770 ) issue: #38715 - Milvus currently uses the serialized (compressed) index size to estimate the resources needed for loading. - Add a new field `MemSize` (size before compression) to the index for resource estimation. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-14 10:33:06 +08:00
Cai Yudong	2a02bbe3ee	enhance: Use template to remove unittest duplication (#39144 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-13 09:58:57 +08:00
Spade A	032292a432	feat: support phrase match query (#38869 ) The relevant issue: https://github.com/milvus-io/milvus/issues/38930 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-12 20:24:58 +08:00
Cai Yudong	d6206ad2de	fix: Remove duplicated Macro definition (#39076 ) Issue: #39102 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-09 15:26:56 +08:00
Chun Han	3739446a33	enhance: refine array view to optimize memory usage(#38736 ) (#38808 ) related: #38736 700m data, array_length=10 non-mmap_offsets_uint64: 2.0G mmap_offsets_uint64: 1.1G mmap_offsets_uint32: 880MB Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-01-07 13:26:55 +08:00
Spade A	4245c5bed1	fix: text match panics when enable_match is set to false (#38950 ) fix: https://github.com/milvus-io/milvus/issues/38949 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-03 14:20:55 +08:00
Bingyi Sun	aa0a87eda7	fix: Block warmup submit if pool full in sync mode (#38690 ) https://github.com/milvus-io/milvus/issues/38692 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-01-02 15:04:58 +08:00
smellthemoon	907fc24f85	enhance: support null expr (#38772 ) #31728 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2025-01-02 14:16:54 +08:00
congqixia	19052ef3e5	enhance: Add buffered writer to reduce fwrite syscall (#38570 ) Related to previous PR #38157 If mmapped rows are too small, frequent fwrite calls still cost too much CPU time in context switching. This PR adds a buffered writer, with an extra buffer per variable field, to avoid this bad case. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-12-27 12:20:50 +08:00
Patrick Weizhi Xu	85f462be1a	enhance: speed up search iterator stage 1 (#37947 ) issue: #37548 Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2024-12-26 10:32:49 +08:00
Ted Xu	acc8fb7af6	enhance: eliminate compile warnings (part2) (#38535 ) See #38435 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-12-25 15:30:50 +08:00
Ted Xu	4919ccf543	enhance: eliminate compile warnings (#38420 ) See: #38435 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-12-16 09:58:43 +08:00
zhagnlu	01de0afc4e	enhance: refactor delete mvcc function (#38066 ) #37413 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-15 18:02:43 +08:00
Gao	994fc544e7	enhance: support iterative filter execution (#37363 ) issue: #37360 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2024-12-11 11:32:44 +08:00
congqixia	767b7e6218	enhance: Use fdopen, fwrite to reduce direct syscall (#38157 ) `File.Write` and `File.WriteInt` use `write`, which may be a direct syscall on some systems. When mapping field data and writing it line by line, this can cost a lot of CPU time when the row count is large. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-12-03 15:24:39 +08:00
cqy123456	8216345b07	enhance: reduce copy of bitset and id conversion of brute-force search (#37675 ) issue: https://github.com/milvus-io/milvus/issues/37798 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-11-19 15:48:40 +08:00
Zhen Ye	3f1614e9d9	enhance: add trace_id into segcore logs (#37656 ) issue: #37655 Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-18 10:20:30 +08:00
smellthemoon	7999367c0c	fix: use not retried err when get wrong parameter (#37707 ) #37508 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-11-15 19:14:30 +08:00
Chun Han	2d29dcd30c	enhance:refine group_strict_size parameter(#37482 ) (#37483 ) related: #37482 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2024-11-12 09:56:28 +08:00
aoiasd	12951f0abb	enhance: rename tokenizer to analyzer and check analyzer params (#37478 ) relate: https://github.com/milvus-io/milvus/issues/35853 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-10 16:12:26 +08:00
aoiasd	d67853fa89	feat: Tokenizer support build with params and clone for concurrency (#37048 ) relate: https://github.com/milvus-io/milvus/issues/35853 https://github.com/milvus-io/milvus/issues/36751 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-06 17:48:24 +08:00
cai.zhang	625b6176cd	fix: Search for pk using raw data to reduce the overhead caused by views (#37202 ) issue: #37152 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-11-05 20:36:24 +08:00
Bingyi Sun	bd04cac4b3	fix: fix group by on chunked segment (#37292 ) https://github.com/milvus-io/milvus/issues/37244 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-05 17:12:23 +08:00
Zhen Ye	9a0e1c82bc	fix: repeated error code in milvus and segcore (#37359 ) issue: #37357 Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-05 16:28:23 +08:00
smellthemoon	51cb2fbf97	fix: parse fail in empty json (#37294 ) #37200 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-11-03 19:00:21 +08:00
zhenshan.cao	63843dce33	fix: Fix conan gdal building problem (#37338 ) issue:https://github.com/milvus-io/milvus/issues/27576 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-10-31 21:04:16 +08:00
Hao Tan	67c4340565	feat: Geospatial Data Type and GIS Function Support for milvus server (#35990 ) issue: https://github.com/milvus-io/milvus/issues/27576 # Main Goals 1. Create and describe collections with geospatial fields, enabling both client and server to recognize and process geo fields. 2. Insert geospatial data as payload values in the insert binlog, and print the values for verification. 3. Load segments containing geospatial data into memory. 4. Ensure query outputs can display geospatial data. 5. Support filtering on GIS functions for geospatial columns. # Solution 1. Add Type: Modify the Milvus core by adding a Geospatial type in both the C++ and Go code layers, defining the Geospatial data structure and the corresponding interfaces. 2. Dependency Libraries: Introduce necessary geospatial data processing libraries. In the C++ source code, use Conan package management to include the GDAL library. In the Go source code, add the go-geom library to the go.mod file. 3. Protocol Interface: Revise the Milvus protocol to provide mechanisms for Geospatial message serialization and deserialization. 4. Data Pipeline: Facilitate interaction between the client and proxy using the WKT format for geospatial data. The proxy will convert all data into WKB format for downstream processing, providing column data interfaces, segment encapsulation, segment loading, payload writing, and cache block management. 5. Query Operators: Implement simple display and support for filter queries. Initially, focus on filtering based on spatial relationships for a single column of geospatial literal values, providing parsing and execution for query expressions. 6. Client Modification: Enable the client to handle user input for geospatial data and facilitate end-to-end testing. Check the modification in pymilvus. --------- Signed-off-by: tasty-gumi <1021989072@qq.com>	2024-10-31 20:58:20 +08:00
Bingyi Sun	b81f162f6a	fix: fix several bugs and refactor some codes related with chunked segment (#37168 ) issue: https://github.com/milvus-io/milvus/issues/37147 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-10-28 14:17:30 +08:00
smellthemoon	44ddcb5a63	fix: not check has_value before get value in JSON (#37128 ) https://github.com/milvus-io/milvus/issues/36236 also: https://github.com/milvus-io/milvus/issues/37113 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-10-25 17:19:28 +08:00
Yinzuo Jiang	3628593d20	feat: Implement custom function module in milvus expr (#36560 ) OSPP 2024 project: https://summer-ospp.ac.cn/org/prodetail/247410235?list=org&navpage=org Solutions: - parser (planparserv2) - add CallExpr in planparserv2/Plan.g4 - update parser_visitor and show_visitor - grpc protobuf - add CallExpr in plan.proto - execution (`core/src/exec`) - add `CallExpr` `ValueExpr` and `ColumnExpr` (both logical and physical) for function call and function parameters - function factory (`core/src/exec/expression/function`) - create a global hashmap when starting milvus (see server.go) - the global hashmap stores function signatures and their function pointers, the CallExpr in execution engine can get the function pointer by function signature. - custom functions - empty(string) - starts_with(string, string) - add cpp/go unittests and E2E tests closes: #36559 Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>	2024-10-25 15:25:30 +08:00
Bingyi Sun	bf956a3ec2	fix: fix string field has invalid utf-8 (#37104 ) issue: https://github.com/milvus-io/milvus/issues/37083 We use a vector of string_view to hold data temporarily, but the real string data is released after the record batch is destructed. Change it to a vector of string to avoid memory corruption. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-10-24 18:33:47 -07:00
smellthemoon	eb3e4583ec	enhance: all op(Null) is false in expr (#35527 ) #31728 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-10-17 21:14:30 +08:00

378 Commits