milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
cai.zhang	b069eeecd2	fix: Added GetMetrics back to IndexNodeServer to ensure compatibility (#45073 ) issue: #45070 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-24 17:00:06 +08:00
aoiasd	cfeb095ad7	enhance: forbid build analyzer at proxy (#44067 ) relate: https://github.com/milvus-io/milvus/issues/43687 We used to run the temporary analyzer and validate analyzer on the proxy, but the proxy should not be a computation-heavy node. This PR move all analyzer calculations to the streaming node. --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-23 10:58:12 +08:00
Zhen Ye	21076196bf	enhance: support resource group with WAL-based DDL framework (#44874 ) issue: #43897 - Resource group related DDL is implemented by WAL-based DDL framework now. - Support following message type in wal AlterResourceGroup, DropResourceGroup. - Resource group DDL can be synced by new CDC now. - Refactor some UT for resource group DDL. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-10-21 09:58:03 +08:00
Bingyi Sun	633cae9461	enhance: add namespace for query and search request (#44343 ) issue: #44011 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-10-16 17:52:01 +08:00
Zhen Ye	80bb09f7c2	enhance: support rbac with WAL-based DDL framework (#44735 ) issue: #43897 - RBAC(Roles/Users/Privileges/Privilege Groups) is implemented by WAL-based DDL framework now. - Support following message type in wal `AlterUser`, `DropUser`, `AlterRole`, `DropRole`, `AlterUserRole`, `DropUserRole`, `AlterPrivilege`, `DropPrivilege`, `AlterPrivilegeGroup`, `DropPrivilegeGroup`, `RestoreRBAC`. - RBAC can be synced by new CDC now. - Refactor some UT for RBAC. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-10-16 16:02:01 +08:00
Spade A	c4f3f0ce4c	feat: impl StructArray -- support more types of vector in STRUCT (#44736 ) ref: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-10-15 10:25:59 +08:00
sparknack	c8a4d6e2ef	enhance: add cachinglayer management for TextMatchIndex (#44741 ) issue: #41435, #44502 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-13 14:37:58 +08:00
congqixia	d76d92d33f	enhance: Use relative path in proto following convention (#44650 ) Previous pr #44163 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-30 18:43:52 +08:00
Agnes George	aea0418713	fix: resolve CVE-2020-25576, WS-2023-0223 (#44163 ) fix: issue https://github.com/milvus-io/milvus/issues/44160 WS-2023-0223 reported for [atty-0.2.14.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=9c622063-376a-446b-bece-d7f6fd096758;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be) CVE-2020-25576 reported for [rand_core-0.3.1.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=20e2ad1b-c84c-4f18-98a9-4f27643b29ff;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be) [atty-0.2.14.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=9c622063-376a-446b-bece-d7f6fd096758;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be) is a transitive dependency coming from the root libraries 'cbindgen-0.26.0.crate' and 'criterion-0.4.0.crate' [rand_core-0.3.1.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=20e2ad1b-c84c-4f18-98a9-4f27643b29ff;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be) is also a transitive dependency coming from 'rand-0.3.23.crate' library Path to dependency file: /workspace/app/milvus/internal/core/thirdparty/tantivy/tantivy-binding/Cargo.toml For Remediation, since these vulnerabilities are transitive one, the root libraries should be updated to the latest non-vulnerable version --------- Co-authored-by: Agnes-George1 <agnes.george1@ibm.com> Co-authored-by: Abita Ann Augustine <abitaaugustine@gmail.com> Co-authored-by: gifi-siby <gifi.s@ibm.com>	2025-09-30 16:25:53 +08:00
cai.zhang	19346fa389	feat: Geospatial Data Type and GIS Function support for milvus (#44547 ) issue: #43427 This pr's main goal is merge #37417 to milvus 2.5 without conflicts. # Main Goals 1. Create and describe collections with geospatial type 2. Insert geospatial data into the insert binlog 3. Load segments containing geospatial data into memory 4. Enable query and search can display geospatial data 5. Support using GIS funtions like ST_EQUALS in query 6. Support R-Tree index for geometry type # Solution 1. Add Type: Modify the Milvus core by adding a Geospatial type in both the C++ and Go code layers, defining the Geospatial data structure and the corresponding interfaces. 2. Dependency Libraries: Introduce necessary geospatial data processing libraries. In the C++ source code, use Conan package management to include the GDAL library. In the Go source code, add the go-geom library to the go.mod file. 3. Protocol Interface: Revise the Milvus protocol to provide mechanisms for Geospatial message serialization and deserialization. 4. Data Pipeline: Facilitate interaction between the client and proxy using the WKT format for geospatial data. The proxy will convert all data into WKB format for downstream processing, providing column data interfaces, segment encapsulation, segment loading, payload writing, and cache block management. 5. Query Operators: Implement simple display and support for filter queries. Initially, focus on filtering based on spatial relationships for a single column of geospatial literal values, providing parsing and execution for query expressions.Now only support brutal search 7. Client Modification: Enable the client to handle user input for geospatial data and facilitate end-to-end testing.Check the modification in pymilvus. --------- Signed-off-by: Yinwei Li <yinwei.li@zilliz.com> Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>	2025-09-28 19:43:05 +08:00
aoiasd	1b20e956be	enhance: support random score for boost function score (#44214 ) And support set function mode and boost mode when run search with boost. RandomScore support get random function score between [0, weight). FunctionMode decide how to calculate boost score for multiple boost function scores. BoostMode decide how to calculate final score for origin score and boost score. relate: https://github.com/milvus-io/milvus/issues/43867 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-09-24 17:50:04 +08:00
zhagnlu	eac16a577c	enhance:support cachelayer for json stats (#44446 ) #42533 Signed-off-by: zhagnlu <lu.zhang@zilliz.com>	2025-09-24 15:30:04 +08:00
Tianx	2c0c5ef41e	feat: timestamptz expression & index & timezone (#44080 ) issue: https://github.com/milvus-io/milvus/issues/27467 >My plan is as follows. >- [x] M1 Create collection with timestamptz field >- [x] M2 Insert timestamptz field data >- [x] M3 Retrieve timestamptz field data >- [x] M4 Implement handoff >- [x] M5 Implement compare operator >- [x] M6 Implement extract operator >- [x] M8 Support database/collection level default timezone >- [x] M7 Support STL-SORT index for datatype timestamptz --- The third PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M5, M6, M7, M8 described above. ## M8 Default Timezone We will be able to use alter_collection() and alter_database() in a future Python SDK release to modify the default timezone at the collection or database level. For insert requests, the timezone will be resolved using the following order of precedence: String Literal-> Collection Default -> Database Default. For retrieval requests, the timezone will be resolved in this order: Query Parameters -> Collection Default -> Database Default. In both cases, the final fallback timezone is UTC. ## M5: Comparison Operators We can now use the following expression format to filter on the timestamptz field: - `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op} ISO 'iso_string' ` - The interval_string follows the ISO 8601 duration format, for example: P1Y2M3DT1H2M3S. - The iso_string follows the ISO 8601 timestamp format, for example: 2025-01-03T00:00:00+08:00. - Example expressions: "tsz + INTERVAL 'P0D' != ISO '2025-01-03T00:00:00+08:00'" or "tsz != ISO '2025-01-03T00:00:00+08:00'". ## M6: Extract We will be able to extract sepecific time filed by kwargs in a future Python SDK release. The key is `time_fields`, and value should be one or more of "year, month, day, hour, minute, second, microsecond", seperated by comma or space. Then the result of each record would be an array of int64. ## M7: Indexing Support Expressions without interval arithmetic can be accelerated using an STL-SORT index. However, expressions that include interval arithmetic cannot be indexed. This is because the result of an interval calculation depends on the specific timestamp value. For example, adding one month to a date in February results in a different number of added days than adding one month to a date in March. --- After this PR, the input / output type of timestamptz would be iso string. Timestampz would be stored as timestamptz data, which is int64_t finally. > for more information, see https://en.wikipedia.org/wiki/ISO_8601 --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-23 10:24:12 +08:00
Gao	d3784c6515	enhance: add storage resource usage for vector search (#44308 ) issue: #44212 Implement search/query storage usage statistics in go side(result reduce), only record storage usage in vector search C++ path. Need to be implemented in query c++ path in next prs. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>	2025-09-19 20:20:02 +08:00
wei liu	92d2fb6360	enhance: Add granular flush targets support for FlushAll operation (#44234 ) issue: #44156 Enhance FlushAll functionality to support targeting specific collections within databases instead of only database-level flushing. Changes include: - Add FlushAllTarget message in data_coord.proto for granular targeting - Support collection-specific flush operations within databases - Maintain backward compatibility with deprecated db_name field This enhancement allows users to flush specific collections without affecting other collections in the same database, providing more precise control over data persistence operations. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-09-19 18:38:01 +08:00
Zhen Ye	ba289891c0	enhance: add all ddl message into messages (#44407 ) issue: #43897 - add ddl messages proto and add some message utilities. - support shard/exclusive resource-key-lock. - add all ddl callbacks future into broadcast registry. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-18 10:08:00 +08:00
zhenshan.cao	691a8df953	feat: Add RESTful api for rolling upgrade support (#44381 ) issue: https://github.com/milvus-io/milvus/issues/43968 Co-authored-by: chyezh <ye.zhen@zilliz.com>	2025-09-16 20:08:00 +08:00
yihao.dai	51f69f32d0	feat: Add CDC support (#44124 ) This PR implements a new CDC service for Milvus 2.6, providing log-based cross-cluster replication. issue: https://github.com/milvus-io/milvus/issues/44123 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-09-16 16:32:01 +08:00
cai.zhang	76f6768ea1	enhance: Remove timeout for compaction task (#44277 ) issue: #44272 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-09-15 11:03:58 +08:00
congqixia	f5618d5153	enhance: [StorageV2] Utilized advance split policy and persist in meta (#44282 ) Related to #44257 This PR: - Utilize configurable split policy for storage v2, enabling system field policy - Store split result in field binlog struct - Adapt legacy binlog without child fields --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-10 14:47:57 +08:00
Chun Han	26a024625d	feat: support search by on json field and dynamic field(#43124 ) (#43203 ) related: #43124 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-09-09 21:51:56 +08:00
Gao	2e98cb0103	enhance: load resource estimation for tiered index (#44171 ) issue: https://github.com/milvus-io/milvus/issues/42032 - Use bytes to estimate load resource in the whole estimation procedure - Add num_rows and dim info for vector index to better estimate - Disable eviction for tiered index's meta --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-09-04 19:41:53 +08:00
Spade A	7cb15ef141	feat: impl StructArray -- optimize vector array serialization (#44035 ) issue: https://github.com/milvus-io/milvus/issues/42148 Optimized from Go VectorArray → VectorArray Proto → Binary → C++ VectorArray Proto → C++ VectorArray local impl → Memory to Go VectorArray → Arrow ListArray → Memory --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-03 16:39:53 +08:00
Bingyi Sun	0c0630cc38	feat: support dropping index without releasing collection (#42941 ) issue: #42942 This pr includes the following changes: 1. Added checks for index checker in querycoord to generate drop index tasks 2. Added drop index interface to querynode 3. To avoid search failure after dropping the index, the querynode allows the use of lazy mode (warmup=disable) to load raw data even when indexes contain raw data. 4. In segcore, loading the index no longer deletes raw data; instead, it evicts it. 5. In expr, the index is pinned to prevent concurrent errors. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-02 16:17:52 +08:00
Zhen Ye	9e2d1963d4	enhance: support cchannel for streaming service (#44143 ) issue: #43897 - add cchannel as a special vchannel to hold some ddl and dcl. Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-02 10:05:52 +08:00
zhagnlu	fc876639cf	enhance: support json stats with shredding design (#42534 ) #42533 Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-01 10:49:52 +08:00
Zhen Ye	3327df72e4	enhance: make immutable message as the param of ack operation for cdc (#43900 ) issue: #43897 - The original broadcast ack operation need to recover message from etcd, which can not support cdc. - immutable message will set as the ack parameter to fix it. Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-01 10:21:52 +08:00
Chun Han	da156981c6	feat: milvus support posix-compatible mode(milvus-io#43942) (#43944 ) related: #43942 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-08-27 16:29:50 +08:00
XuanYang-cn	37a447d166	feat: Add CMEK cipher plugin (#43722 ) 1. Enable Milvus to read cipher configs 2. Enable cipher plugin in binlog reader and writer 3. Add a testCipher for unittests 4. Support pooling for datanode 5. Add encryption in storagev2 See also: #40321 Signed-off-by: yangxuan <xuan.yang@zilliz.com> --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-08-27 11:15:52 +08:00
Zhen Ye	d0e3a33c37	enhance: add IsRebalanceSuspended interface for wal balancer (#44026 ) issue: #43968 Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-24 09:19:47 +08:00
Zhen Ye	082ca62ec1	enhance: support balancer interface for streaming client to fetch streaming node information (#43969 ) issue: #43968 - Add ListStreamingNode/GetWALDistribution to fetch streaming node info - Add SuspendRebalance/ResumeRebalance to enable or stop balance - Add FreezeNodeIDs/DefreezeNodeIDs to freeze target node Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-21 15:55:47 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
aoiasd	dcf04a58b8	feat: support use score function on segment search and use filter (#43868 ) relate: https://github.com/milvus-io/milvus/issues/43867 Support boost function score, multiply by the weight if match filter. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-19 16:15:45 +08:00
Zhen Ye	a86b6f2a54	enhance: extend the stats manage at streaming shard manager for L0 (#43371 ) issue: #42416 - Rename the InsertMetric into ModifiedMetric. - Add L0 control configuration. - Add some L0 current state collect. Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-18 20:41:46 +08:00
aoiasd	eca51ed2c6	enhance: add file resource api (#43766 ) relate: https://github.com/milvus-io/milvus/issues/43687 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-08 14:17:41 +08:00
cai.zhang	d8a3236e44	fix: Reorder worker proto fields to ensure compatibility (#43735 ) issue: #43734 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-08-05 14:59:38 +08:00
wei liu	1fae8f5ae3	enhance: Optimize FlushAll performance for multi-table scenarios (#43339 ) Replace multiple per-table flush RPC calls with single FlushAll RPC to improve performance in multi-table scenarios. issue: #43338 - Implement server-side FlushAll request processing in DataCoord/MixCoord - Add flushAllTask to handle unified flush operations across all tables - Replace proxy-side per-table flush iteration with single RPC call - Support both streaming and non-streaming service execution paths - Add comprehensive unit tests for new FlushAll implementation --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-30 15:37:37 +08:00
Zhen Ye	cd38d65417	fix: make savebinlogpath idompotent at binlog level (#43615 ) issue: #43574 - update all binlog every time when calling udpate savebinlogpath. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-29 19:47:36 +08:00
Zhen Ye	e9ab73e93d	enhance: add schema version at recovery storage (#43500 ) issue: #43072, #43289 - manage the schema version at recovery storage. - update the schema when creating collection or alter schema. - get schema at write buffer based on version. - recover the schema when upgrading from 2.5. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-23 21:38:54 +08:00
yihao.dai	a839017e81	fix: Handle retry state in import task (#43474 ) issue: https://github.com/milvus-io/milvus/issues/43473 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-22 14:52:53 +08:00
Zhen Ye	07fa2cbdd3	enhance: wal balance consider the wal status on streamingnode (#43265 ) issue: #42995 - don't balance the wal if the producing-consuming lag is too long. - don't balance if the rebalance is set as false. - don't balance if the wal is balanced recently. Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-18 11:10:51 +08:00
congqixia	5d90b65342	enhance: [StorageV2] Add storage version in Data/Query view resp (#43348 ) Related to #39173 Add `storage_version` in data/query view segment info response --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-16 15:52:51 +08:00
wei liu	b2597c6329	enhance: apply load config changes after QueryCoord restart (#43108 ) issue: #43107 - Add checkLoadConfigChanges() to apply load config during startup - Call config check in startQueryCoord() after restart - Skip auto-updates for collections with user-specified replica numbers - Add is_user_specified_replica_mode field to preserve user settings - Add comprehensive unit tests with mockey Ensures existing collections use latest cluster-level config after restart. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-10 14:28:48 +08:00
cai.zhang	3ffd44f302	fix: Fix remaining issues with Datanode pooling and StorageV2 (#43147 ) issue: #43146 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-10 14:26:48 +08:00
cai.zhang	6989e18599	enhance: Move sort stats task to sort compaction (#42562 ) issue: #42560 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-08 20:22:47 +08:00
yihao.dai	9cbd194c6b	fix: Prevent import from generating small binlogs (#43132 ) - Introduce dynamic buffer sizing to avoid generating small binlogs during import - Refactor import slot calculation based on CPU and memory constraints - Implement dynamic pool sizing for sync manager and import tasks according to CPU core count issue: https://github.com/milvus-io/milvus/issues/43131 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-07 21:32:47 +08:00
Zhen Ye	46b6f1b9e2	fix: panic when logging a old message should be skipped (#43076 ) issue: #43074 - fix: panic when logging a old message should be skipped, #43074 - fix: make the ack of broadcaster idompotent, #43026 - fix: lost dropping collection when upgrading, #43092 - fix: panic when DropPartition happen after DropCollection, #43027, #43078 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-04 16:04:44 +08:00
congqixia	7bc7b18ed5	fix: [AddField] Prevent concurrent load during UpdateSchema (#43043 ) Related to #43028 This PR: - Add mutex prevent concurrent load segment & schema change - Add schema verison field in load meta - Update schema in PutOrRef if schema verison is larger --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-02 17:38:44 +08:00
Spade A	e2c85eec81	fix: load stats index based on mmap config (#42788 ) ref https://github.com/milvus-io/milvus/issues/42626 This PR makes text match index and json key stats index be loaded based on mmap config. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-06-19 10:10:39 +08:00
cai.zhang	a9dcd4a380	enhance: ChunkManager is no longer created during datanode initialization (#42791 ) issue: #41611 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-06-17 17:06:38 +08:00

1 2 3

118 Commits