milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-28 22:45:26 +08:00

Author	SHA1	Message	Date
Tianx	2c0c5ef41e	feat: timestamptz expression & index & timezone (#44080 ) issue: https://github.com/milvus-io/milvus/issues/27467 >My plan is as follows. >- [x] M1 Create collection with timestamptz field >- [x] M2 Insert timestamptz field data >- [x] M3 Retrieve timestamptz field data >- [x] M4 Implement handoff >- [x] M5 Implement compare operator >- [x] M6 Implement extract operator >- [x] M8 Support database/collection level default timezone >- [x] M7 Support STL-SORT index for datatype timestamptz --- The third PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M5, M6, M7, M8 described above. ## M8 Default Timezone We will be able to use alter_collection() and alter_database() in a future Python SDK release to modify the default timezone at the collection or database level. For insert requests, the timezone will be resolved using the following order of precedence: String Literal-> Collection Default -> Database Default. For retrieval requests, the timezone will be resolved in this order: Query Parameters -> Collection Default -> Database Default. In both cases, the final fallback timezone is UTC. ## M5: Comparison Operators We can now use the following expression format to filter on the timestamptz field: - `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op} ISO 'iso_string' ` - The interval_string follows the ISO 8601 duration format, for example: P1Y2M3DT1H2M3S. - The iso_string follows the ISO 8601 timestamp format, for example: 2025-01-03T00:00:00+08:00. - Example expressions: "tsz + INTERVAL 'P0D' != ISO '2025-01-03T00:00:00+08:00'" or "tsz != ISO '2025-01-03T00:00:00+08:00'". ## M6: Extract We will be able to extract sepecific time filed by kwargs in a future Python SDK release. The key is `time_fields`, and value should be one or more of "year, month, day, hour, minute, second, microsecond", seperated by comma or space. Then the result of each record would be an array of int64. ## M7: Indexing Support Expressions without interval arithmetic can be accelerated using an STL-SORT index. However, expressions that include interval arithmetic cannot be indexed. This is because the result of an interval calculation depends on the specific timestamp value. For example, adding one month to a date in February results in a different number of added days than adding one month to a date in March. --- After this PR, the input / output type of timestamptz would be iso string. Timestampz would be stored as timestamptz data, which is int64_t finally. > for more information, see https://en.wikipedia.org/wiki/ISO_8601 --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-23 10:24:12 +08:00
yihao.dai	51f69f32d0	feat: Add CDC support (#44124 ) This PR implements a new CDC service for Milvus 2.6, providing log-based cross-cluster replication. issue: https://github.com/milvus-io/milvus/issues/44123 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-09-16 16:32:01 +08:00
Spade A	eb793531b9	feat: impl StructArray -- support import for CSV/JSON/PARQUET/BINLOG (#44201 ) Ref https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-15 20:41:59 +08:00
wei liu	18371773dd	enhance: Optimize partial update merge logic by unifying nullable format (#44197 ) issue: #43980 This commit optimizes the partial update merge logic by standardizing nullable field representation before merge operations to avoid corner cases during the merge process. Key changes: - Unify nullable field data format to FULL FORMAT before merge execution - Add extensive unit tests for bounds checking and edge cases The optimization ensures: - Consistent nullable field representation across SDK and internal - Proper handling of null values during merge operations - Prevention of index out-of-bounds errors in vector field updates - Better error handling and validation for partial update scenarios This resolves issues where different nullable field formats could cause merge failures or data corruption during partial update operations. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-09-10 17:27:56 +08:00
wei liu	5ef793c393	fix: Fix panic when upsert with partial_update=true on empty table (#44155 ) issue: #43980 Fix panic issue caused by incorrect nullable field merging logic when upsert converts to insert operation on empty tables. - Add AppendFieldDataWithNullData to handle nullable field merging - Fix existing data merge with skipAppendNullData=false - Fix insert data merge with skipAppendNullData=true - Add unit tests for nullable field data appending scenarios Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-09-02 16:47:52 +08:00
wei liu	16af4e230a	fix: Prevent panic in upsert due to missing nullable fields [Proxy] (#44070 ) issue: #43980 Fixes a panic that occurred when a partial update was converted to an insert due to a non-existent primary key. The panic was caused by missing nullable fields that were not provided in the original partial update request. The upsert pre-execution logic is refactored to handle this correctly: - Explicitly splits upsert data into 'insert' and 'update' batches. - Automatically generates data for missing nullable or default-value fields during inserts, preventing the panic. - Enhances `typeutil.UpdateFieldData` to support different source and destination indexes for flexible data merging. - Adds comprehensive unit tests for mixed upsert, pure insert, and pure update scenarios. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-08-29 18:33:51 +08:00
Spade A	8456f824be	feat: impl StructArray -- miscellaneous staffs for struct array (#43960 ) Ref https://github.com/milvus-io/milvus/issues/42148 1. enable storage v2 2. implement some missing staffs 3. fix some bugs and add tests --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-08-26 21:35:53 +08:00
Tianx	c0d62268ac	feat: add timesatmptz data type (#44005 ) issue: https://github.com/milvus-io/milvus/issues/27467 > https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420 > * [x] M1 Create collection with timestamptz field > * [x] M2 Insert timestamptz field data > * [x] M3 Retrieve timestamptz field data > * [x] M4 Implement handoff[ ] The second PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4 described above. --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-08-26 15:59:53 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
wei liu	d3c95eaa77	enhance: Support partial field updates with upsert API (#42877 ) issue: #29735 Implement partial field update functionality for upsert operations, supporting scalar, vector, and dynamic JSON fields without requiring all collection fields. Changes: - Add queryPreExecute to retrieve existing records before upsert - Implement UpdateFieldData function for merging data - Add IDsChecker utility for efficient primary key lookups - Fix JSON data creation in tests using proper map marshaling - Add test cases for partial updates of different field types Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-08-19 11:15:45 +08:00
Spade A	faeb7fd410	feat: impl StructArray -- create schema, insert, and retrieve data (#42855 ) Ref https://github.com/milvus-io/milvus/issues/42148 https://github.com/milvus-io/milvus/pull/42406 impls the segcore part of storage for handling with VectorArray. This PR: 1. impls the go part of storage for VectorArray 2. impls the collection creation with StructArrayField and VectorArray 3. insert and retrieve data from the collection. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>	2025-07-27 01:30:55 +08:00
XuanYang-cn	4dcaa97682	fix: Use diskSegmentMaxSize for coll with sparse and dense vectors (#43194 ) Previous code uses diskSegmentMaxSize if and only if all of the collection's vector fields are indexed with DiskANN index. When introducing sparse vectors, since sparse vector cannot be indexed with DiskANN index, collections with both dense and sparse vectors will use maxSize instead. This PR changes the requirments of using diskSegmentMaxSize to all dense vectors are indexed with DiskANN indexs, ignoring sparse vector fields. See also: #43193 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-07-16 18:04:52 +08:00
Chun Han	07745439b5	fix: empty search groupby result causing crash(#43137 ) (#43214 ) related: #43137 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-07-10 12:04:48 +08:00
congqixia	74ea57bac1	enhance: Remove unused load field check from proxy (#42816 ) Related to #42489 Since load list works as hint after cachelayer implemented, the related check logic could be removed to keep code logic clean. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-19 19:34:47 +08:00
Zhen Ye	fc010e44a8	fix: release memory after pop from heap (#42482 ) issue: #42481 Signed-off-by: chyezh <chyezh@outlook.com>	2025-06-04 10:00:32 +08:00
Ted Xu	ae32203d3a	fix: support group by with nullable grouping keys (#41797 ) See #36264 In this PR: - Enhanced error handling in parse of grouping field. - Fixed null handling in reduce tasks in proxy nodes. - Updated tests to reflect changes in error handling and data processing logic. --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-05-17 20:54:22 +08:00
aoiasd	f52c2909c4	feat: support multi analyzer for bm25 function (#41351 ) relate: https://github.com/milvus-io/milvus/issues/41213 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-23 18:22:38 +08:00
SimFG	91d40fa558	fix: Update logging context and upgrade dependencies (#41318 ) - issue: #41291 --------- Signed-off-by: SimFG <bang.fu@zilliz.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-04-23 10:52:38 +08:00
Xianhui Lin	f9febe3bae	enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006 ) Merge RootCoord, DataCoord And QueryCoord into MixCoord Make Session into one issue : https://github.com/milvus-io/milvus/issues/37764 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-04-11 16:36:30 +08:00
Xianhui Lin	3bc24c264f	enhance: Add json key inverted index in stats for optimization (#38039 ) Add json key inverted index in stats for optimization https://github.com/milvus-io/milvus/issues/36995 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-10 15:20:28 +08:00
yihao.dai	b4cb8a4b13	enhance: Add UTF-8 string validation for import (#40694 ) issue: https://github.com/milvus-io/milvus/issues/40684 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-04-01 19:04:21 +08:00
Ted Xu	688505ab1c	enhance: cleanup lint check exclusions (#40829 ) See: #40828 Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-03-21 18:12:14 +08:00
Buqian Zheng	c12abf4e2a	enhance: improve sparse query nnz metric (#40713 ) add query type and field id label; add metric for hybrid search issue: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-03-18 17:20:16 +08:00
Zhen Ye	f6fb4bc442	fix: backoff will retry infinitely after reaching max elapse (#40589 ) issue: #40588 Signed-off-by: chyezh <chyezh@outlook.com>	2025-03-13 16:24:06 +08:00
Zhen Ye	96a010da7a	fix: msgstream adaptor may not gc quickly after comsumed (#40555 ) issue: #40540 Signed-off-by: chyezh <chyezh@outlook.com>	2025-03-12 14:00:04 +08:00
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990 ) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
yihao.dai	d72d2281ca	fix: Fix concurrent map (#39775 ) issue: https://github.com/milvus-io/milvus/issues/39778 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-02-22 09:51:54 +08:00
Patrick Weizhi Xu	04fff74a56	feat: introduce Text data type (#39874 ) issue: https://github.com/milvus-io/milvus/issues/39818 This PR mimics Varchar data type, allows insert, search, query, delete, full-text search and others. Functionalities related to filter expressions are disabled temporarily. Storage changes for Text data type will be in the following PRs. Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2025-02-19 11:04:51 +08:00
Cai Yudong	5bf1b2b929	feat: Support Int8Vector in go (#38990 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-14 20:43:06 +08:00
SimFG	357eaf0d71	fix: use the object heap to keep the min ddl ts order (#39118 ) issue: #39002 Signed-off-by: SimFG <bang.fu@zilliz.com>	2025-01-10 18:16:58 +08:00
Zhen Ye	3bcdd92915	enhance: add broadcast for streaming service (#39020 ) issue: #38399 - Add new rpc for transfer broadcast to streaming coord - Add broadcast service at streaming coord to make broadcast message sent automicly Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-09 16:24:55 +08:00
Spade A	4245c5bed1	fix: text match panics when enable_match is set be false (#38950 ) fix: https://github.com/milvus-io/milvus/issues/38949 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-03 14:20:55 +08:00
Zhen Ye	afac153c26	enhance: move the lifetime implementation out of server level lifetime (#38442 ) issue: #38399 - move the lifetime implementation of common code out of the server level lifetime implementation Signed-off-by: chyezh <chyezh@outlook.com>	2024-12-17 11:42:44 +08:00
Buqian Zheng	75e64b993f	enhance: add metrics for counting number of nun-zeros/tokens of sparse/FTS search (#38329 ) sparse vectors may have arbitrary number of non zeros and it is hard to optimize without knowing the actual distribution of nnz. this PR adds a metric for analyzing that. issue: https://github.com/milvus-io/milvus/issues/35853 comparing with https://github.com/milvus-io/milvus/pull/38328, this includes also metric for FTS in query node delegator also fixed a bug of sparse when searching by pk Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-12-12 16:22:43 +08:00
Xianhui Lin	db05d4f976	enhance: alterindex & altercollection supports altering properties (#37437 ) enhance : 1. alterindex delete properties We have introduced a new parameter deleteKeys to the alterindex functionality, which allows for the deletion of properties within an index. This enhancement provides users with the flexibility to manage index properties more effectively by removing specific keys as needed. 2. altercollection delete properties We have introduced a new parameter deleteKeys to the altercollection functionality, which allows for the deletion of properties within an collection. This enhancement provides users with the flexibility to manage collection properties more effectively by removing specific keys as needed. 3.support altercollectionfield We currently support modifying the fieldparams of a field in a collection using altercollectionfield, which only allows changes to the max-length attribute. Key Points: - New Parameter - deleteKeys: This new parameter enables the deletion of specified properties from an index. By passing a list of keys to deleteKeys, users can remove the corresponding properties from the index. - Mutual Exclusivity: The deleteKeys parameter cannot be used in conjunction with the extraParams parameter. Users must choose one parameter to pass based on their requirement. If deleteKeys is provided, it indicates an intent to delete properties; if extraParams is provided, it signifies the addition or update of properties. issue: https://github.com/milvus-io/milvus/issues/37436 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2024-12-11 10:20:42 +08:00
aoiasd	f16232f834	fix: sparse inc mast less than uint32 max (#38250 ) relate: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-12-06 15:14:41 +08:00
Yinzuo Jiang	5a06faca39	feat: fp32 vector to fp16/bf16 vector conversion for RESTful API (#37556 ) RESTful API. The influenced API are as follows: - Handler. insert - HandlerV1. insert/upsert - HandlerV2. insert/upsert/search We do not modify search API in Handler/HandlerV1 because they do not support fp16/bf16 vectors. module github.com/milvus-io/milvus/pkg: Add `Float32ArrayToBFloat16Bytes()`, `Float32ArrayToFloat16Bytes()` and `Float32ArrayToBytes()`. These method will be used in GoSDK in the future. issue: #37448 Signed-off-by: Yinzuo Jiang <yinzuo.jiang@zilliz.com> Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>	2024-11-24 17:46:33 +08:00
Buqian Zheng	511edd29fd	enhance: disallow get raw vector data of a BM25 Function output field (#37800 ) issue: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-11-20 14:22:30 +08:00
jaime	1d06d4324b	fix: Int64 overflow in JSON encoding (#37657 ) issue: ##36621 - For simple types in a struct, add "string" to the JSON tag for automatic string conversion during JSON encoding. - For complex types in a struct, replace "int64" with "string." Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-11-14 22:52:30 +08:00
aoiasd	12951f0abb	enhance: rename tokenizer to analyzer and check analyzer params (#37478 ) relate: https://github.com/milvus-io/milvus/issues/35853 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-10 16:12:26 +08:00
jaime	f348bd9441	feat: add segment,pipeline, replica and resourcegroup api for WebUI (#37344 ) issue: #36621 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-11-07 11:52:25 +08:00
foxspy	642a651f60	enhance: remove useless vector index param checker (#37329 ) issue: #34298 because all vector index config checker has been moved into vector_index_checker, then the useless checkers can be removed. Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-11-01 06:20:21 +08:00
zhenshan.cao	63843dce33	fix: Fix conan gdal building problem (#37338 ) issue:https://github.com/milvus-io/milvus/issues/27576 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-10-31 21:04:16 +08:00
Hao Tan	67c4340565	feat: Geospatial Data Type and GIS Function Support for milvus server (#35990 ) issue:https://github.com/milvus-io/milvus/issues/27576 # Main Goals 1. Create and describe collections with geospatial fields, enabling both client and server to recognize and process geo fields. 2. Insert geospatial data as payload values in the insert binlog, and print the values for verification. 3. Load segments containing geospatial data into memory. 4. Ensure query outputs can display geospatial data. 5. Support filtering on GIS functions for geospatial columns. # Solution 1. Add Type: Modify the Milvus core by adding a Geospatial type in both the C++ and Go code layers, defining the Geospatial data structure and the corresponding interfaces. 2. Dependency Libraries: Introduce necessary geospatial data processing libraries. In the C++ source code, use Conan package management to include the GDAL library. In the Go source code, add the go-geom library to the go.mod file. 3. Protocol Interface: Revise the Milvus protocol to provide mechanisms for Geospatial message serialization and deserialization. 4. Data Pipeline: Facilitate interaction between the client and proxy using the WKT format for geospatial data. The proxy will convert all data into WKB format for downstream processing, providing column data interfaces, segment encapsulation, segment loading, payload writing, and cache block management. 5. Query Operators: Implement simple display and support for filter queries. Initially, focus on filtering based on spatial relationships for a single column of geospatial literal values, providing parsing and execution for query expressions. 6. Client Modification: Enable the client to handle user input for geospatial data and facilitate end-to-end testing.Check the modification in pymilvus. --------- Signed-off-by: tasty-gumi <1021989072@qq.com>	2024-10-31 20:58:20 +08:00
SimFG	1cc9cb49ad	enhance: allow to delete data when disk quota exhausted (#37134 ) - issue: #37133 Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-10-25 16:47:29 +08:00
Buqian Zheng	f7b811450d	feat: add enable_tokenizer params to VarChar field (#36480 ) issue: #35922 add an enable_tokenizer param to varchar field: must be set to true so that a varchar field can enable_match or used as input of BM25 function --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-10-10 20:33:21 +08:00
SimFG	9c1772f659	enhance: avoid to create many timer object in the target (#36570 ) /kind improvement Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-09-29 19:23:16 +08:00
aoiasd	ffc12fb5c4	fix: split delete task msg to MaxMessageSize to avoid mq message too large error (#36197 ) relate: https://github.com/milvus-io/milvus/issues/36089 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-09-27 18:15:19 +08:00
jaime	52cce4de58	fix: iaccurate size estimation for encoded array data (#36373 ) issue: #36029 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-09-24 14:51:14 +08:00
Ted Xu	363004fd44	enhance: simplify reduction on single search result (#36334 ) See: #36122 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-09-20 11:59:10 +08:00

1 2 3 4

151 Commits