milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-06 17:18:35 +08:00

Author	SHA1	Message	Date
congqixia	f94b04e642	feat: [2.6] integrate Loon FFI for manifest-based segment loading and index building (#46076 ) Cherry-pick from master pr: #45061 #45488 #45803 #46017 #44991 #45132 #45723 #45726 #45798 #45897 #45918 #44998 This feature integrates the Storage V2 (Loon) FFI interface as a unified storage layer for segment loading and index building in Milvus. It enables manifest-based data access, replacing the traditional binlog-based approach with a more efficient columnar storage format. Key changes: ### Segment Self-Managed Loading Architecture - Move segment loading orchestration from Go layer to C++ segcore - Add NewSegmentWithLoadInfo() API for passing load info during segment creation - Implement SetLoadInfo() and Load() methods in SegmentInterface - Support parallel loading of indexed and non-indexed fields - Enable both sealed and growing segments to self-manage loading ### Storage V2 FFI Integration - Integrate milvus-storage library's FFI interface for packed columnar data - Add manifest path support throughout the data path (SegmentInfo, LoadInfo) - Implement ManifestReader for generating manifests from binlogs - Support zero-copy data exchange using Arrow C Data Interface - Add ToCStorageConfig() for Go-to-C storage config conversion ### Manifest-Based Index Building - Extend FileManagerContext to carry loon_ffi_properties - Implement GetFieldDatasFromManifest() using Arrow C Stream interface - Support manifest-based reading in DiskFileManagerImpl and MemFileManagerImpl - Add fallback to traditional segment insert files when manifest unavailable ### Compaction Pipeline Updates - Include manifest path in all compaction task builders (clustering, L0, mix) - Update BulkPackWriterV2 to return manifest path - Propagate manifest metadata through compaction pipeline ### Configuration & Protocol - Add common.storageV2.useLoonFFI config option (default: false) - Add manifest_path field to SegmentLoadInfo and related proto messages - Add manifest field to 
compaction segment messages ### Bug Fixes - Fix mmap settings not applied during segment load (key typo fix) - Populate index info after segment loading to prevent redundant load tasks - Fix memory corruption by removing premature transaction handle destruction Related issues: #44956, #45060, #39173 ## Individual Cherry-Picked Commits 1. e1c923b5cc - fix: apply mmap settings correctly during segment load (#46017) 2. 63b912370b - enhance: use milvus-storage internal C++ Reader API for Loon FFI (#45897) 3. bfc192faa5 - enhance: Resolve issues integrating loon FFI (#45918) 4. fb18564631 - enhance: support manifest-based index building with Loon FFI reader (#45726) 5. b9ec2392b9 - enhance: integrate StorageV2 FFI interface for manifest-based segment loading (#45798) 6. 66db3c32e6 - enhance: integrate Storage V2 FFI interface for unified storage access (#45723) 7. ae789273ac - fix: populate index info after segment loading to prevent redundant load tasks (#45803) 8. 49688b0be2 - enhance: Move segment loading logic from Go layer to segcore for self-managed loading (#45488) 9. 5b2df88bac - enhance: [StorageV2] Integrate FFI interface for packed reader (#45132) 10. 91ff5706ac - enhance: [StorageV2] add manifest path support for FFI integration (#44991) 11. 2192bb4a85 - enhance: add NewSegmentWithLoadInfo API to support segment self-managed loading (#45061) 12. 
4296b01da0 - enhance: update delta log serialization APIs to integrate storage V2 (#44998) ## Technical Details ### Architecture Changes - Before: Go layer orchestrated segment loading, making multiple CGO calls - After: Segments autonomously manage loading in C++ layer with single entry point ### Storage Access Pattern - Before: Read individual binlog files through Go storage layer - After: Read manifest file that references packed columnar data via FFI ### Benefits - Reduced cross-language call overhead - Better resource management at C++ level - Improved I/O performance through batched streaming reads - Cleaner separation of concerns between Go and C++ layers - Foundation for proactive schema evolution handling --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com> Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> Co-authored-by: Ted Xu <ted.xu@zilliz.com>	2025-12-04 17:09:12 +08:00
aoiasd	8bdbc4379e	feat: [2.6] Support search with highlighter (#46052 ) relate: https://github.com/milvus-io/milvus/issues/42589 pr: https://github.com/milvus-io/milvus/pull/45736 https://github.com/milvus-io/milvus/pull/45099 https://github.com/milvus-io/milvus/pull/44923 https://github.com/milvus-io/milvus/pull/45984 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-04 10:41:11 +08:00
Zhen Ye	e9d920a785	enhance: support proxy DML forward (#45922 ) issue: #45812 pr: #45921 - 2.6 proxy will try to forward DML to 2.5 proxy if streaming service is not ready Signed-off-by: chyezh <chyezh@outlook.com>	2025-12-01 19:39:10 +08:00
aoiasd	37b163613b	enhance: [2.6] remove resource type from file resource config (#45103 ) (#45727 ) File resource type was useless till now, remove it before new release. relate: https://github.com/milvus-io/milvus/issues/43687 pr: https://github.com/milvus-io/milvus/pull/45103 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-24 11:37:06 +08:00
Gao	df5da9c2b5	enhance: [2.6] support max_connection config for remote storage (#45364 ) issue: #45344 pr: #45225 Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-11-13 15:41:37 +08:00
Gao	1398a069d3	enhance: override index_type while creating segment index (#45417 ) issue: #44752 pr: #45416 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-11-11 09:45:36 +08:00
Zhen Ye	02e2170601	enhance: cherry pick patch of new DDL framework and CDC 2 (#45241 ) issue: #43897, #44123 pr: #45224 also pick pr: #45216,#45154,#45033,#45145,#45092,#45058,#45029 enhance: Close channel replicator more gracefully (#45029) issue: https://github.com/milvus-io/milvus/issues/44123 enhance: Show create time for import job (#45058) issue: https://github.com/milvus-io/milvus/issues/45056 fix: wal state may be inconsistent after recovering from crash (#45092) issue: #45088, #45086 - Message on control channel should trigger the checkpoint update. - LastConfirmedMessageID should be recovered from the minimum of checkpoint or the LastConfirmedMessageID of uncommitted txn. - Add more log info for wal debugging. fix: ack of broadcaster cannot be canceled by client (#45145) issue: #45141 - ack of broadcaster cannot be canceled by rpc. - make clone for assignment snapshot of wal balancer. - add server id for GetReplicateCheckpoint to avoid failure. enhance: support collection and index with WAL-based DDL framework (#45033) issue: #43897 - Part of collection/index related DDL is implemented by WAL-based DDL framework now. - Support following message type in wal, CreateCollection, DropCollection, CreatePartition, DropPartition, CreateIndex, AlterIndex, DropIndex. - Part of collection/index related DDL can be synced by new CDC now. - Refactor some UT for collection/index DDL. - Add Tombstone scheduler to manage the tombstone GC for collection or partition meta. - Move the vchannel allocation into streaming pchannel manager. enhance: support load/release collection/partition with WAL-based DDL framework (#45154) issue: #43897 - Load/Release collection/partition is implemented by WAL-based DDL framework now. - Support AlterLoadConfig/DropLoadConfig in wal now. - Load/Release operation can be synced by new CDC now. - Refactor some UT for load/release DDL. 
enhance: Don't start cdc by default (#45216) issue: https://github.com/milvus-io/milvus/issues/44123 fix: unrecoverable when replicate from old (#45224) issue: #44962 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: yihao.dai <yihao.dai@zilliz.com>	2025-11-04 01:35:33 +08:00
Zhen Ye	318db122b8	enhance: cherry pick patch of new DDL framework and CDC (#45025 ) issue: #43897, #44123 pr: #44898 related pr: #44607 #44642 #44792 #44809 #44564 #44560 #44735 #44822 #44865 #44850 #44942 #44874 #44963 #44886 #44898 enhance: remove redundant channel manager from datacoord (#44532) issue: #41611 - After enabling streaming arch, channel manager of data coord is a redundant component. fix: Fix CDC OOM due to high buffer size (#44607) Fix CDC OOM by: 1. free msg buffer manually. 2. limit max msg buffer size. 3. reduce scanner msg handler buffer size. issue: https://github.com/milvus-io/milvus/issues/44123 fix: remove wrong start timetick to avoid filtering DML whose timetick is less than it. (#44691) issue: #41611 - introduced by #44532 enhance: support remove cluster from replicate topology (#44642) issue: #44558, #44123 - Update config(A->C) to A and C, config(B) to B on replicate topology (A->B,A->C) can remove the B from replicate topology - Fix some metric error of CDC fix: check if qn is sqn with label and streamingnode list (#44792) issue: #44014 - On standalone, the query node inside needs to load segments and watch channels, so the querynode is not an embedded querynode in streamingnode without `LabelStreamingNodeEmbeddedQueryNode`. The channel dist manager cannot confirm that a standalone node is an embedded streaming node. Bug is introduced by #44099 enhance: Make GetReplicateInfo API work at the pchannel level (#44809) issue: https://github.com/milvus-io/milvus/issues/44123 enhance: Speed up CDC scheduling (#44564) Make CDC watch etcd replicate pchannel meta instead of listing them periodically. issue: https://github.com/milvus-io/milvus/issues/44123 enhance: refactor update replicate config operation using wal-broadcast-based DDL/DCL framework (#44560) issue: #43897 - UpdateReplicateConfig operation will broadcast AlterReplicateConfig message into all pchannels with cluster-exclusive-lock. 
- Begin txn message will use commit message timetick now (to avoid timetick rollback when CDC with txn message). - If current cluster is secondary, the UpdateReplicateConfig will wait until the replicate configuration is consistent with the config replicated from primary. enhance: support rbac with WAL-based DDL framework (#44735) issue: #43897 - RBAC(Roles/Users/Privileges/Privilege Groups) is implemented by WAL-based DDL framework now. - Support following message type in wal `AlterUser`, `DropUser`, `AlterRole`, `DropRole`, `AlterUserRole`, `DropUserRole`, `AlterPrivilege`, `DropPrivilege`, `AlterPrivilegeGroup`, `DropPrivilegeGroup`, `RestoreRBAC`. - RBAC can be synced by new CDC now. - Refactor some UT for RBAC. enhance: support database with WAL-based DDL framework (#44822) issue: #43897 - Database related DDL is implemented by WAL-based DDL framework now. - Support following message type in wal CreateDatabase, AlterDatabase, DropDatabase. - Database DDL can be synced by new CDC now. - Refactor some UT for Database DDL. enhance: support alias with WAL-based DDL framework (#44865) issue: #43897 - Alias related DDL is implemented by WAL-based DDL framework now. - Support following message type in wal AlterAlias, DropAlias. - Alias DDL can be synced by new CDC now. - Refactor some UT for Alias DDL. enhance: Disable import for replicating cluster (#44850) 1. Import in replicating cluster is not supported yet, so disable it for now. 2. Remove GetReplicateConfiguration wal API issue: https://github.com/milvus-io/milvus/issues/44123 fix: use short debug string to avoid newline in debug logs (#44925) issue: #44924 fix: rerank before requery if reranker didn't use field data (#44942) issue: #44918 enhance: support resource group with WAL-based DDL framework (#44874) issue: #43897 - Resource group related DDL is implemented by WAL-based DDL framework now. - Support following message type in wal AlterResourceGroup, DropResourceGroup. 
- Resource group DDL can be synced by new CDC now. - Refactor some UT for resource group DDL. fix: Fix replication txn data loss during chaos (#44963) Only confirm CommitMsg for txn messages to prevent data loss. issue: https://github.com/milvus-io/milvus/issues/44962, https://github.com/milvus-io/milvus/issues/44123 fix: wrong execution order of DDL/DCL on secondary (#44886) issue: #44697, #44696 - The DDL execution order on the secondary now follows the order of the control channel timetick. - Filter the control channel operation on shard manager of streamingnode to avoid wrong vchannel of create segment. - Fix that the immutable txn message lost its replicate header. fix: Fix primary-secondary replication switch blocking (#44898) 1. Fix primary-secondary replication switchover blocking by deleting replicate pchannel meta using modRevision. 2. Stop channel replicator(scanner) when cluster role changes to prevent continued message consumption and replication. 3. Close Milvus client to prevent goroutine leak. 4. Create Milvus client once for a channel replicator. 5. Simplify CDC controller and resources. issue: https://github.com/milvus-io/milvus/issues/44123 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: yihao.dai <yihao.dai@zilliz.com>	2025-11-03 15:39:33 +08:00
yihao.dai	60a802d3c8	enhance: [2.6] Show create time for import job (#45059 ) issue: https://github.com/milvus-io/milvus/issues/45056 pr: https://github.com/milvus-io/milvus/pull/45058 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-10-27 14:20:19 +08:00
cai.zhang	f9a49c60e4	fix: [2.6] Added GetMetrics back to IndexNodeServer to ensure compatibility (#45074 ) issue: #45070 master pr: #45073 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-24 17:32:06 +08:00
sparknack	54fe82756a	enhance: [2.6] add cachinglayer management for TextMatchIndex (#44768 ) issue: #41435, #44502 pr: #44741, #44806 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-15 11:09:59 +08:00
Spade A	661b803d12	feat: impl StructArray -- support more types of vector in STRUCT [2.6] (#44845 ) ref: https://github.com/milvus-io/milvus/issues/42148 pr: https://github.com/milvus-io/milvus/pull/44736 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-10-15 10:34:00 +08:00
congqixia	6c0a234ad6	enhance: [2.6] Use relative path in proto following convention (#44650 ) (#44653 ) Cherry-pick from master pr: #44650 Previous pr #44163 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-09 10:01:57 +08:00
Agnes George	aea0418713	fix: resolve CVE-2020-25576, WS-2023-0223 (#44163 ) fix: issue https://github.com/milvus-io/milvus/issues/44160 WS-2023-0223 reported for [atty-0.2.14.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=9c622063-376a-446b-bece-d7f6fd096758;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be) CVE-2020-25576 reported for [rand_core-0.3.1.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=20e2ad1b-c84c-4f18-98a9-4f27643b29ff;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be) [atty-0.2.14.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=9c622063-376a-446b-bece-d7f6fd096758;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be) is a transitive dependency coming from the root libraries 'cbindgen-0.26.0.crate' and 'criterion-0.4.0.crate' [rand_core-0.3.1.crate](https://ibmets.whitesourcesoftware.com/Wss/WSS.html#!libraryDetails;uuid=20e2ad1b-c84c-4f18-98a9-4f27643b29ff;project=7300448;orgToken=79623fcf-07fe-42b8-90bf-513fafeb41be) is also a transitive dependency coming from 'rand-0.3.23.crate' library Path to dependency file: /workspace/app/milvus/internal/core/thirdparty/tantivy/tantivy-binding/Cargo.toml For Remediation, since these vulnerabilities are transitive one, the root libraries should be updated to the latest non-vulnerable version --------- Co-authored-by: Agnes-George1 <agnes.george1@ibm.com> Co-authored-by: Abita Ann Augustine <abitaaugustine@gmail.com> Co-authored-by: gifi-siby <gifi.s@ibm.com>	2025-09-30 16:25:53 +08:00
cai.zhang	19346fa389	feat: Geospatial Data Type and GIS Function support for milvus (#44547 ) issue: #43427 This pr's main goal is to merge #37417 to milvus 2.5 without conflicts. # Main Goals 1. Create and describe collections with geospatial type 2. Insert geospatial data into the insert binlog 3. Load segments containing geospatial data into memory 4. Enable query and search to display geospatial data 5. Support using GIS functions like ST_EQUALS in query 6. Support R-Tree index for geometry type # Solution 1. Add Type: Modify the Milvus core by adding a Geospatial type in both the C++ and Go code layers, defining the Geospatial data structure and the corresponding interfaces. 2. Dependency Libraries: Introduce necessary geospatial data processing libraries. In the C++ source code, use Conan package management to include the GDAL library. In the Go source code, add the go-geom library to the go.mod file. 3. Protocol Interface: Revise the Milvus protocol to provide mechanisms for Geospatial message serialization and deserialization. 4. Data Pipeline: Facilitate interaction between the client and proxy using the WKT format for geospatial data. The proxy will convert all data into WKB format for downstream processing, providing column data interfaces, segment encapsulation, segment loading, payload writing, and cache block management. 5. Query Operators: Implement simple display and support for filter queries. Initially, focus on filtering based on spatial relationships for a single column of geospatial literal values, providing parsing and execution for query expressions. Only brute-force search is supported for now. 7. Client Modification: Enable the client to handle user input for geospatial data and facilitate end-to-end testing. Check the modification in pymilvus. --------- Signed-off-by: Yinwei Li <yinwei.li@zilliz.com> Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>	2025-09-28 19:43:05 +08:00
aoiasd	1b20e956be	enhance: support random score for boost function score (#44214 ) And support set function mode and boost mode when run search with boost. RandomScore support get random function score between [0, weight). FunctionMode decide how to calculate boost score for multiple boost function scores. BoostMode decide how to calculate final score for origin score and boost score. relate: https://github.com/milvus-io/milvus/issues/43867 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-09-24 17:50:04 +08:00
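The scoring pipeline described in the commit above (a random score in [0, weight), a function mode that merges several boost scores, and a boost mode that merges the origin score with the merged boost score) could be sketched roughly as follows. This is only an illustration: the helper names and the mode strings `"sum"` and `"multiply"` are assumptions and are not taken from the commit or from the Milvus API.

```python
import random
from typing import List


def random_score(weight: float, rng: random.Random) -> float:
    """Hypothetical helper: draw a random function score in [0, weight)."""
    return rng.random() * weight


def combine(values: List[float], mode: str) -> float:
    """Merge scores. The mode names "sum" and "multiply" are assumed for illustration."""
    if mode == "sum":
        return sum(values)
    if mode == "multiply":
        result = 1.0
        for value in values:
            result *= value
        return result
    raise ValueError(f"unknown mode: {mode}")


def final_score(origin: float, boosts: List[float],
                function_mode: str, boost_mode: str) -> float:
    """function_mode merges the boost scores; boost_mode merges origin and boost."""
    boost = combine(boosts, function_mode)
    return combine([origin, boost], boost_mode)


rng = random.Random(0)
sample = random_score(2.0, rng)
score = final_score(0.5, [2.0, 3.0], function_mode="sum", boost_mode="multiply")
```

Here the two boost scores are summed to 5.0 and then multiplied with the origin score 0.5.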
zhagnlu	eac16a577c	enhance: support cachelayer for json stats (#44446 ) #42533 Signed-off-by: zhagnlu <lu.zhang@zilliz.com>	2025-09-24 15:30:04 +08:00
Tianx	2c0c5ef41e	feat: timestamptz expression & index & timezone (#44080 ) issue: https://github.com/milvus-io/milvus/issues/27467 >My plan is as follows. >- [x] M1 Create collection with timestamptz field >- [x] M2 Insert timestamptz field data >- [x] M3 Retrieve timestamptz field data >- [x] M4 Implement handoff >- [x] M5 Implement compare operator >- [x] M6 Implement extract operator >- [x] M8 Support database/collection level default timezone >- [x] M7 Support STL-SORT index for datatype timestamptz --- The third PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M5, M6, M7, M8 described above. ## M8 Default Timezone We will be able to use alter_collection() and alter_database() in a future Python SDK release to modify the default timezone at the collection or database level. For insert requests, the timezone will be resolved using the following order of precedence: String Literal-> Collection Default -> Database Default. For retrieval requests, the timezone will be resolved in this order: Query Parameters -> Collection Default -> Database Default. In both cases, the final fallback timezone is UTC. ## M5: Comparison Operators We can now use the following expression format to filter on the timestamptz field: - `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op} ISO 'iso_string' ` - The interval_string follows the ISO 8601 duration format, for example: P1Y2M3DT1H2M3S. - The iso_string follows the ISO 8601 timestamp format, for example: 2025-01-03T00:00:00+08:00. - Example expressions: "tsz + INTERVAL 'P0D' != ISO '2025-01-03T00:00:00+08:00'" or "tsz != ISO '2025-01-03T00:00:00+08:00'". ## M6: Extract We will be able to extract specific time fields by kwargs in a future Python SDK release. The key is `time_fields`, and the value should be one or more of "year, month, day, hour, minute, second, microsecond", separated by comma or space. Then the result of each record would be an array of int64. 
## M7: Indexing Support Expressions without interval arithmetic can be accelerated using an STL-SORT index. However, expressions that include interval arithmetic cannot be indexed. This is because the result of an interval calculation depends on the specific timestamp value. For example, adding one month to a date in February results in a different number of added days than adding one month to a date in March. --- After this PR, the input / output type of timestamptz is an ISO string. Timestamptz values are stored as timestamptz data, which is ultimately int64_t. > for more information, see https://en.wikipedia.org/wiki/ISO_8601 --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-23 10:24:12 +08:00
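Two points from the timestamptz commit above can be illustrated with the Python standard library alone: the timezone fallback chain that ends at UTC, and why a calendar interval such as one month is not a fixed offset, which is the reason given for interval expressions not using the STL-SORT index. The function names `resolve_timezone` and `add_one_month` are hypothetical helpers for this sketch and are not Milvus or pymilvus APIs.

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_timezone(*candidates: Optional[str]) -> str:
    """Return the first timezone that is set, in precedence order, else UTC."""
    for tz in candidates:
        if tz:
            return tz
    return "UTC"


def add_one_month(ts: datetime) -> datetime:
    """Naive calendar-month addition; assumes the day exists in the next month."""
    year = ts.year + ts.month // 12
    month = ts.month % 12 + 1
    return ts.replace(year=year, month=month)


# Insert path: string literal -> collection default -> database default -> UTC.
insert_tz = resolve_timezone(None, "Asia/Shanghai", "America/New_York")

# The same "one month" interval adds a different number of days
# depending on the timestamp it is applied to.
feb = datetime(2025, 2, 1, tzinfo=timezone.utc)
mar = datetime(2025, 3, 1, tzinfo=timezone.utc)
feb_days = (add_one_month(feb) - feb).days
mar_days = (add_one_month(mar) - mar).days
```

Because `feb_days` and `mar_days` differ, `tsz + INTERVAL 'P1M'` cannot be rewritten as a comparison of `tsz` against one constant, so a sorted index over the raw values does not directly apply.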
Gao	d3784c6515	enhance: add storage resource usage for vector search (#44308 ) issue: #44212 Implement search/query storage usage statistics in go side(result reduce), only record storage usage in vector search C++ path. Need to be implemented in query c++ path in next prs. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>	2025-09-19 20:20:02 +08:00
wei liu	92d2fb6360	enhance: Add granular flush targets support for FlushAll operation (#44234 ) issue: #44156 Enhance FlushAll functionality to support targeting specific collections within databases instead of only database-level flushing. Changes include: - Add FlushAllTarget message in data_coord.proto for granular targeting - Support collection-specific flush operations within databases - Maintain backward compatibility with deprecated db_name field This enhancement allows users to flush specific collections without affecting other collections in the same database, providing more precise control over data persistence operations. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-09-19 18:38:01 +08:00
Zhen Ye	ba289891c0	enhance: add all ddl message into messages (#44407 ) issue: #43897 - add ddl messages proto and add some message utilities. - support shard/exclusive resource-key-lock. - add all ddl callbacks future into broadcast registry. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-18 10:08:00 +08:00
zhenshan.cao	691a8df953	feat: Add RESTful api for rolling upgrade support (#44381 ) issue: https://github.com/milvus-io/milvus/issues/43968 Co-authored-by: chyezh <ye.zhen@zilliz.com>	2025-09-16 20:08:00 +08:00
yihao.dai	51f69f32d0	feat: Add CDC support (#44124 ) This PR implements a new CDC service for Milvus 2.6, providing log-based cross-cluster replication. issue: https://github.com/milvus-io/milvus/issues/44123 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-09-16 16:32:01 +08:00
cai.zhang	76f6768ea1	enhance: Remove timeout for compaction task (#44277 ) issue: #44272 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-09-15 11:03:58 +08:00
congqixia	f5618d5153	enhance: [StorageV2] Utilized advance split policy and persist in meta (#44282 ) Related to #44257 This PR: - Utilize configurable split policy for storage v2, enabling system field policy - Store split result in field binlog struct - Adapt legacy binlog without child fields --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-10 14:47:57 +08:00
Chun Han	26a024625d	feat: support search by on json field and dynamic field(#43124 ) (#43203 ) related: #43124 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-09-09 21:51:56 +08:00
Gao	2e98cb0103	enhance: load resource estimation for tiered index (#44171 ) issue: https://github.com/milvus-io/milvus/issues/42032 - Use bytes to estimate load resource in the whole estimation procedure - Add num_rows and dim info for vector index to better estimate - Disable eviction for tiered index's meta --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-09-04 19:41:53 +08:00
Spade A	7cb15ef141	feat: impl StructArray -- optimize vector array serialization (#44035 ) issue: https://github.com/milvus-io/milvus/issues/42148 Optimized from Go VectorArray → VectorArray Proto → Binary → C++ VectorArray Proto → C++ VectorArray local impl → Memory to Go VectorArray → Arrow ListArray → Memory --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-03 16:39:53 +08:00
Bingyi Sun	0c0630cc38	feat: support dropping index without releasing collection (#42941 ) issue: #42942 This pr includes the following changes: 1. Added checks for index checker in querycoord to generate drop index tasks 2. Added drop index interface to querynode 3. To avoid search failure after dropping the index, the querynode allows the use of lazy mode (warmup=disable) to load raw data even when indexes contain raw data. 4. In segcore, loading the index no longer deletes raw data; instead, it evicts it. 5. In expr, the index is pinned to prevent concurrent errors. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-02 16:17:52 +08:00
Zhen Ye	9e2d1963d4	enhance: support cchannel for streaming service (#44143 ) issue: #43897 - add cchannel as a special vchannel to hold some ddl and dcl. Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-02 10:05:52 +08:00
zhagnlu	fc876639cf	enhance: support json stats with shredding design (#42534 ) #42533 Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-01 10:49:52 +08:00
Zhen Ye	3327df72e4	enhance: make immutable message as the param of ack operation for cdc (#43900 ) issue: #43897 - The original broadcast ack operation need to recover message from etcd, which can not support cdc. - immutable message will set as the ack parameter to fix it. Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-01 10:21:52 +08:00
Chun Han	da156981c6	feat: milvus support posix-compatible mode(milvus-io#43942) (#43944 ) related: #43942 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-08-27 16:29:50 +08:00
XuanYang-cn	37a447d166	feat: Add CMEK cipher plugin (#43722 ) 1. Enable Milvus to read cipher configs 2. Enable cipher plugin in binlog reader and writer 3. Add a testCipher for unittests 4. Support pooling for datanode 5. Add encryption in storagev2 See also: #40321 Signed-off-by: yangxuan <xuan.yang@zilliz.com> --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-08-27 11:15:52 +08:00
Zhen Ye	d0e3a33c37	enhance: add IsRebalanceSuspended interface for wal balancer (#44026 ) issue: #43968 Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-24 09:19:47 +08:00
Zhen Ye	082ca62ec1	enhance: support balancer interface for streaming client to fetch streaming node information (#43969 ) issue: #43968 - Add ListStreamingNode/GetWALDistribution to fetch streaming node info - Add SuspendRebalance/ResumeRebalance to enable or stop balance - Add FreezeNodeIDs/DefreezeNodeIDs to freeze target node Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-21 15:55:47 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
aoiasd	dcf04a58b8	feat: support use score function on segment search and use filter (#43868 ) relate: https://github.com/milvus-io/milvus/issues/43867 Support boost function score, multiply by the weight if match filter. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-19 16:15:45 +08:00
Zhen Ye	a86b6f2a54	enhance: extend the stats manage at streaming shard manager for L0 (#43371 ) issue: #42416 - Rename the InsertMetric into ModifiedMetric. - Add L0 control configuration. - Add some L0 current state collect. Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-18 20:41:46 +08:00
aoiasd	eca51ed2c6	enhance: add file resource api (#43766 ) relate: https://github.com/milvus-io/milvus/issues/43687 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-08 14:17:41 +08:00
cai.zhang	d8a3236e44	fix: Reorder worker proto fields to ensure compatibility (#43735 ) issue: #43734 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-08-05 14:59:38 +08:00
wei liu	1fae8f5ae3	enhance: Optimize FlushAll performance for multi-table scenarios (#43339 ) Replace multiple per-table flush RPC calls with single FlushAll RPC to improve performance in multi-table scenarios. issue: #43338 - Implement server-side FlushAll request processing in DataCoord/MixCoord - Add flushAllTask to handle unified flush operations across all tables - Replace proxy-side per-table flush iteration with single RPC call - Support both streaming and non-streaming service execution paths - Add comprehensive unit tests for new FlushAll implementation --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-30 15:37:37 +08:00
Zhen Ye	cd38d65417	fix: make savebinlogpath idempotent at binlog level (#43615 ) issue: #43574 - update all binlog every time when calling update savebinlogpath. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-29 19:47:36 +08:00
Zhen Ye	e9ab73e93d	enhance: add schema version at recovery storage (#43500 ) issue: #43072, #43289 - manage the schema version at recovery storage. - update the schema when creating collection or alter schema. - get schema at write buffer based on version. - recover the schema when upgrading from 2.5. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-23 21:38:54 +08:00
yihao.dai	a839017e81	fix: Handle retry state in import task (#43474 ) issue: https://github.com/milvus-io/milvus/issues/43473 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-22 14:52:53 +08:00
Zhen Ye	07fa2cbdd3	enhance: wal balance consider the wal status on streamingnode (#43265 ) issue: #42995 - don't balance the wal if the producing-consuming lag is too long. - don't balance if the rebalance is set as false. - don't balance if the wal is balanced recently. Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-18 11:10:51 +08:00
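The three skip conditions listed in the commit above can be read as a single guard predicate. The following is a minimal sketch under that reading; `WalStatus`, `should_balance`, and every parameter name and threshold are hypothetical and do not come from the Milvus code base.

```python
from dataclasses import dataclass


@dataclass
class WalStatus:
    """Hypothetical snapshot of one WAL on a streaming node."""
    lag_seconds: float                 # producing-consuming lag
    seconds_since_last_balance: float  # time since this WAL was last balanced


def should_balance(status: WalStatus, rebalance_enabled: bool,
                   max_lag_seconds: float, min_interval_seconds: float) -> bool:
    """Return True only if none of the three skip conditions applies."""
    if not rebalance_enabled:
        return False  # rebalance is set to false
    if status.lag_seconds > max_lag_seconds:
        return False  # producing-consuming lag is too long
    if status.seconds_since_last_balance < min_interval_seconds:
        return False  # the WAL was balanced recently
    return True


healthy = WalStatus(lag_seconds=1.0, seconds_since_last_balance=600.0)
lagging = WalStatus(lag_seconds=120.0, seconds_since_last_balance=600.0)
```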
congqixia	5d90b65342	enhance: [StorageV2] Add storage version in Data/Query view resp (#43348 ) Related to #39173 Add `storage_version` in data/query view segment info response --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-16 15:52:51 +08:00
wei liu	b2597c6329	enhance: apply load config changes after QueryCoord restart (#43108 ) issue: #43107 - Add checkLoadConfigChanges() to apply load config during startup - Call config check in startQueryCoord() after restart - Skip auto-updates for collections with user-specified replica numbers - Add is_user_specified_replica_mode field to preserve user settings - Add comprehensive unit tests with mockey Ensures existing collections use latest cluster-level config after restart. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-10 14:28:48 +08:00
cai.zhang	3ffd44f302	fix: Fix remaining issues with Datanode pooling and StorageV2 (#43147 ) issue: #43146 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-10 14:26:48 +08:00
cai.zhang	6989e18599	enhance: Move sort stats task to sort compaction (#42562 ) issue: #42560 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-08 20:22:47 +08:00


123 Commits