milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
Zhen Ye	c3fe6473b8	enhance: support async write syncer for milvus logging (#45805 ) issue: #45640 - log may be dropped if the underlying file system is busy. - use async write syncer to avoid the log operation block the milvus major system. - remove some log dependency from the until function to avoid dependency-loop. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-28 17:43:11 +08:00
congqixia	c01fd94a6a	enhance: integrate Storage V2 FFI interface for unified storage access (#45723 ) Related #44956 This commit integrates the Storage V2 FFI (Foreign Function Interface) interface throughout the Milvus codebase, enabling unified storage access through the Loon FFI layer. This is a significant step towards standardizing storage operations across different storage versions. 1. Configuration Support - configs/milvus.yaml: Added `useLoonFFI` configuration flag under `common.storage.file.splitByAvgSize` section - Allows runtime toggle between traditional binlog readers and new FFI-based manifest readers - Default: `false` (maintains backward compatibility) 2. Core FFI Infrastructure Enhanced Utilities (internal/core/src/storage/loon_ffi/util.cpp/h) - ToCStorageConfig(): Converts Go's `StorageConfig` to C's `CStorageConfig` struct for FFI calls - GetManifest(): Parses manifest JSON and retrieves latest column groups using FFI - Accepts manifest path with `base_path` and `ver` fields - Calls `get_latest_column_groups()` FFI function - Returns column group information as string - Comprehensive error handling for JSON parsing and FFI errors 3. Dependency Updates - internal/core/thirdparty/milvus-storage/CMakeLists.txt: - Updated milvus-storage version from `0883026` to `302143c` - Ensures compatibility with latest FFI interfaces 4. Data Coordinator Changes All compaction task builders now include manifest path in segment binlogs: - compaction_task_clustering.go: Added `Manifest: segInfo.GetManifestPath()` to segment binlogs - compaction_task_l0.go: Added manifest path to both L0 segment selection and compaction plan building - compaction_task_mix.go: Added manifest path to mixed compaction segment binlogs - meta.go: Updated metadata completion logic: - `completeClusterCompactionMutation()`: Set `ManifestPath` in new segment info - `completeMixCompactionMutation()`: Preserve manifest path in compacted segments - `completeSortCompactionMutation()`: Include manifest path in sorted segments 5. Data Node Compactor Enhancements All compactors updated to support dual-mode reading (binlog vs manifest): 6. Flush & Sync Manager Updates Pack Writer V2 (pack_writer_v2.go) - BulkPackWriterV2.Write(): Extended return signature to include `manifest string` - Implementation: - Generate manifest path: `path.Join(pack.segmentID, "manifest.json")` - Write packed data using FFI-based writer - Return manifest path along with binlogs, deltas, and stats Task Handling (task.go) - Updated all sync task result handling to accommodate new manifest return value - Ensured backward compatibility for callers not using manifest 7. Go Storage Layer Integration New Interfaces and Implementations - record_reader.go: Interface for unified record reading across storage versions - record_writer.go: Interface for unified record writing across storage versions - binlog_record_writer.go: Concrete implementation for traditional binlog-based writing Enhanced Schema Support (schema.go, schema_test.go) - Schema conversion utilities to support FFI-based storage operations - Ensures proper Arrow schema mapping for V2 storage Serialization Updates - serde.go, serde_events.go, serde_events_v2.go: Updated to work with new reader/writer interfaces - Test files updated to validate dual-mode serialization 8. Storage V2 Packed Format FFI Common (storagev2/packed/ffi_common.go) - Common FFI utilities and type conversions for packed storage format Packed Writer FFI (storagev2/packed/packed_writer_ffi.go) - FFI-based implementation of packed writer - Integrates with Loon storage layer for efficient columnar writes Packed Reader FFI (storagev2/packed/packed_reader_ffi.go) - Already existed, now complemented by writer implementation 9. Protocol Buffer Updates data_coord.proto & datapb/data_coord.pb.go - Added `manifest` field to compaction segment messages - Enables passing manifest metadata through compaction pipeline worker.proto & workerpb/worker.pb.go - Added compaction parameter for `useLoonFFI` flag - Allows workers to receive FFI configuration from coordinator 10. Parameter Configuration component_param.go - Added `UseLoonFFI` parameter to compaction configuration - Reads from `common.storage.file.useLoonFFI` config path - Default: `false` for safe rollout 11. Test Updates - clustering_compactor_storage_v2_test.go: Updated signatures to handle manifest return value - mix_compactor_storage_v2_test.go: Updated test helpers for manifest support - namespace_compactor_test.go: Adjusted writer calls to expect manifest - pack_writer_v2_test.go: Validated manifest generation in pack writing This integration follows a dual-mode approach: 1. Legacy Path: Traditional binlog-based reading/writing (when `useLoonFFI=false` or no manifest) 2. FFI Path: Manifest-based reading/writing through Loon FFI (when `useLoonFFI=true` and manifest exists) --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-24 19:57:07 +08:00
tinswzy	1427825133	enhance: improve WAL retention strategy (#45350 ) issue: #44369 woodpecker related[ issue: #59](https://github.com/zilliztech/woodpecker/issues/59) Refactor the WAL retention logic in Milvus StreamingNode: - Remove the simple sampling-based truncation mechanism. - After flush, WAL data is directly truncated. - The retention control is now delegated to the underlying message queue (MQ) implementation. Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2025-11-23 21:41:05 +08:00
junjiejiangjjj	d3164e8030	feat: add configurable batch factor and runtime check bypass for embedding functions (#45592 ) https://github.com/milvus-io/milvus/issues/45544 - Add batch_factor configuration parameter (default: 5) to control embedding provider batch sizes - Add disable_func_runtime_check property to bypass function validation during collection creation - Add database interceptor support for AddCollectionFunction, AlterCollectionFunction, and DropCollectionFunction requests Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-11-20 19:55:04 +08:00
Zhen Ye	c8073eb90b	fix: panic when double close channel of ack broadcast (#45661 ) issue: #45635 Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-19 14:25:05 +08:00
Gao	d7a5a87b11	enhance: update maxConnections config version (#45546 ) issue: #45344 Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-11-13 17:57:38 +08:00
Gao	09a3195867	enhance: support max_connections config for remote storage (#45225 ) related: https://github.com/milvus-io/milvus/issues/45344 Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-11-13 15:37:38 +08:00
junjiejiangjjj	50f198e346	feat: Support zilliz models (#45168 ) https://github.com/milvus-io/milvus/issues/35856 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-11-13 12:55:37 +08:00
Xiaofan	a9895bb904	enhance: add robust handle etcd servercrash (#45304 ) related to #45303 fix milvus pod may restart when etcd pod start Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2025-11-13 10:23:36 +08:00
Gao	e9a875f7ac	enhance: override index_type while creating segment index (#45416 ) issue: #44752 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-11-11 07:27:36 +08:00
wei liu	931d4bf95f	enhance: increase session TTL from 10s to 30s (#45228 ) issue: #45227 Increase the default session TTL to 30 seconds to tolerate etcd failover time. This prevents session expiration during etcd cluster failover, improving system stability. When etcd undergoes failover (leader election or node restart), the previous 10s TTL was too short to survive the failover window, causing unnecessary session expiration and component restarts. The new 30s TTL provides sufficient buffer for etcd to complete failover while maintaining session liveness. Changes: - Update DefaultSessionTTL constant from 10 to 30 - Update SessionTTL ParamItem DefaultValue from "10" to "30" Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-11-10 16:47:36 +08:00
XuanYang-cn	897ac983c8	feat: Add new config and enable to dynamic update configs (#45170 ) This PR changes the config layout according to the latest design, and adds two external credential configs for aws kms See also: #45169 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-11-10 14:43:35 +08:00
sparknack	9032bb7668	enhance: unify the aligned buffer for both buffered and direct I/O (#45323 ) issue: #43040 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-11-06 10:53:33 +08:00
zhagnlu	792e931fcb	enhance: rename jsonstats related user config params (#45254 ) #44132 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-11-04 20:21:36 +08:00
Gao	8f645760af	enhance: make knowhere thread pool config refreshable (#45190 ) Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-11-04 18:33:33 +08:00
zhenshan.cao	6327c9a514	fix: Fix bugs related to TimestampTz (#45111 ) issue: https://github.com/milvus-io/milvus/issues/44527 https://github.com/milvus-io/milvus/issues/44537 https://github.com/milvus-io/milvus/issues/44538 https://github.com/milvus-io/milvus/issues/44585 https://github.com/milvus-io/milvus/issues/44622 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2025-11-04 16:51:33 +08:00
cai.zhang	ed8ba4a28c	enhance: Make GeometryCache an optional configuration (#45192 ) issue: #45187 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-03 19:59:32 +08:00
Zhen Ye	309d564796	enhance: support collection and index with WAL-based DDL framework (#45033 ) issue: #43897 - Part of collection/index related DDL is implemented by WAL-based DDL framework now. - Support following message type in wal, CreateCollection, DropCollection, CreatePartition, DropPartition, CreateIndex, AlterIndex, DropIndex. - Part of collection/index related DDL can be synced by new CDC now. - Refactor some UT for collection/index DDL. - Add Tombstone scheduler to manage the tombstone GC for collection or partition meta. - Move the vchannel allocation into streaming pchannel manager. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-10-30 14:24:08 +08:00
wei liu	3566cb745c	enhance: remove max vector field number limit (#45151 ) issue: #45150 Removed the maximum limit constraint (value range [1, 10]) for vector fields in a collection to support more flexible schema design. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-10-30 12:42:07 +08:00
aoiasd	ad9a0cae48	enhance: add global analyzer options (#44684 ) relate: https://github.com/milvus-io/milvus/issues/43687 Add global analyzer options, avoid having to merge some milvus params into user's analyzer params. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-28 14:52:10 +08:00
sparknack	935160840c	enhance: add a disk quota for the loaded binlog size to prevent load failure of querynode (#44893 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-19 19:44:01 +08:00
foxspy	b91878857e	fix: update aisaq param (#44861 ) issue: #44365 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-10-15 19:18:00 +08:00
XuanYang-cn	fc46668812	fix: Disk encryption config missing (#44820 ) See also: #44823 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-10-14 17:22:00 +08:00
Buqian Zheng	3140bd0ca6	enhance: enable default json stats (#44810 ) issue: https://github.com/milvus-io/milvus/issues/44132 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-10-13 21:29:59 +08:00
sparknack	6d5b41644b	enhance: remove logical usage checks during segment loading (#44743 ) issue: #41435 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-13 14:21:58 +08:00
foxspy	e7a91f514c	enhance: overwriting current index type during index build stage (#44753 ) issue: #44752 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-10-11 18:31:58 +08:00
Xiaofan	7c00f292bc	enhance: add config for meta batch(#44569 ) (#44645 ) fix: https://github.com/milvus-io/milvus/issues/44569 add a new config to control meta batch to avoid too large Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2025-09-30 17:31:02 +08:00
zhagnlu	4c49295c3d	Revert "enhance: enable default json stats (#44559 )" (#44644 ) This reverts commit 1b5191974c71eee342e4f7a8c804e1d95cfd094b. #44132 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-30 12:11:53 +08:00
cai.zhang	19346fa389	feat: Geospatial Data Type and GIS Function support for milvus (#44547 ) issue: #43427 This pr's main goal is merge #37417 to milvus 2.5 without conflicts. # Main Goals 1. Create and describe collections with geospatial type 2. Insert geospatial data into the insert binlog 3. Load segments containing geospatial data into memory 4. Enable query and search can display geospatial data 5. Support using GIS funtions like ST_EQUALS in query 6. Support R-Tree index for geometry type # Solution 1. Add Type: Modify the Milvus core by adding a Geospatial type in both the C++ and Go code layers, defining the Geospatial data structure and the corresponding interfaces. 2. Dependency Libraries: Introduce necessary geospatial data processing libraries. In the C++ source code, use Conan package management to include the GDAL library. In the Go source code, add the go-geom library to the go.mod file. 3. Protocol Interface: Revise the Milvus protocol to provide mechanisms for Geospatial message serialization and deserialization. 4. Data Pipeline: Facilitate interaction between the client and proxy using the WKT format for geospatial data. The proxy will convert all data into WKB format for downstream processing, providing column data interfaces, segment encapsulation, segment loading, payload writing, and cache block management. 5. Query Operators: Implement simple display and support for filter queries. Initially, focus on filtering based on spatial relationships for a single column of geospatial literal values, providing parsing and execution for query expressions.Now only support brutal search 7. Client Modification: Enable the client to handle user input for geospatial data and facilitate end-to-end testing.Check the modification in pymilvus. --------- Signed-off-by: Yinwei Li <yinwei.li@zilliz.com> Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>	2025-09-28 19:43:05 +08:00
yihao.dai	f61952adfc	fix: Fix compaction task blocking due to executor loop exit (#44543 ) 1. Use goroutine pool instead of sem. 2. Remove compaction executor from pipeline, since in streaming mode pipeline should be decoupled from compaction. issue: https://github.com/milvus-io/milvus/issues/44541 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-09-28 11:03:04 +08:00
zhagnlu	1b5191974c	enhance: enable default json stats (#44559 ) #44132 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-28 10:45:04 +08:00
yihao.dai	2807d1d1b2	fix: Make default local storage path effective (#44514 ) Make default local storage path effective instead of empty when yaml config file is missing. issue: https://github.com/milvus-io/milvus/issues/44513 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-09-24 21:00:06 +08:00
Zhen Ye	19e5e9f910	enhance: broadcaster will lock resource until message acked (#44508 ) issue: #43897 - Return LastConfirmedMessageID when wal append operation. - Add resource-key-based locker for broadcast-ack operation to protect the coord state when executing ddl. - Resource-key-based locker is held until the broadcast operation is acked. - ResourceKey support shared and exclusive lock. - Add FastAck execute ack right away after the broadcast done to speed up ddl. - Ack callback will support broadcast message result now. - Add tombstone for broadcaster to avoid to repeatedly commit DDL and ABA issue. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-24 20:58:05 +08:00
foxspy	13c3b0b909	enhance: add autoindex configuration for the int8 vector type (#44554 ) issue: #38666 Add int8 support for autoindex to ensure it can be independently configured. At the same time, remove the restriction on int8 type for vectorDiskIndex (note that vectorDiskIndex only determines the building and loading method of the index, not the index type). Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-09-24 17:48:04 +08:00
congqixia	99598ae5ec	enhance: Add param item for hybrid search requery policy (#44466 ) Related to #39757 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-24 17:32:04 +08:00
jiaqizho	338ed2fed4	enhance: Introduce sparse filter in query (#44347 ) issue: #44373 The current commit implements sparse filtering in query tasks using the statistical information (Bloom filter/MinMax) of the Primary Key (PK). The statistical information of the PK is bound to the segment during the segment loading phase. A new filter has been added to the segment filter to enable the sparse filtering functionality. Signed-off-by: jiaqizho <jiaqi.zhou@zilliz.com>	2025-09-23 09:58:09 +08:00
Gao	539f17f1ad	enhance: tiered index updates (#44433 ) issue: #42032 #44212 - special case for warmup param and cell storage size for tiered index - add a config to enable/disable storage usage tracking --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-09-22 21:34:11 +08:00
sthuang	edd250ffef	fix: [StorageV2] force virtual host for oss and cos (#44484 ) related: #44481 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-09-22 16:58:11 +08:00
Bingyi Sun	94d53a5ac6	feat: encode cluster id in auto id (#44471 ) https://github.com/milvus-io/milvus/issues/44326 prev: [physical_ts][logical_ts] after [sign_bit][cluster_id][physical_ts][logical_ts] --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-22 10:40:02 +08:00
tinswzy	c7f21d5a06	enhance: purge small files right after wp segment compaction (#44473 ) #43638 improve wp log output [wp#43](https://github.com/zilliztech/woodpecker/issues/43) intro purge small files right after segment compaction [wp#47](https://github.com/zilliztech/woodpecker/issues/47) The rootpath configured by milvus is uniformly used as the base for wp local fs storage. update to v0.1.5 Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2025-09-21 16:32:01 +08:00
wei liu	6d4961b978	enhance: Refactor balance checker with priority queue (#43992 ) issue: #43858 Refactor the balance checker implementation to use priority queues for managing collection balance operations, improving processing efficiency and order control. Changes include: - Export priority queue interfaces (Item, BaseItem, PriorityQueue) - Replace collection round-robin with priority-based queue system - Add BalanceCheckCollectionMaxCount configuration parameter - Optimize balance task generation with batch processing limits - Refactor processBalanceQueue method for different strategies - Enhance test coverage with comprehensive unit tests The new priority queue system processes collections based on row count or collection ID order, providing better control over balance operation priorities and resource utilization. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-09-19 17:46:01 +08:00
Bingyi Sun	5cd2d99799	enhance: Revert "feat: encode cluster id in auto id (#44324 )" (#44426 ) This reverts commit 7af159410395f0e7079d4875d96544c01f1d477b	2025-09-17 17:56:01 +08:00
Bingyi Sun	7af1594103	feat: encode cluster id in auto id (#44324 ) https://github.com/milvus-io/milvus/issues/44326 prev: `[physical_ts][logical_ts]` after `[sign_bit][cluster_id][physical_ts][logical_ts]` --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-17 16:56:01 +08:00
congqixia	103db5ae3e	enhance: [StorageV2] Include partition & clustering key to sys group (#44372 ) Related to #44257 This PR makes partition key & clustering candidates of system field group and adds param item controlling the policy --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-16 12:08:00 +08:00
cai.zhang	76f6768ea1	enhance: Remove timeout for compaction task (#44277 ) issue: #44272 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-09-15 11:03:58 +08:00
congqixia	bfc9e80e14	enhance: Add param item forcing all indices ready for segment (#44313 ) Related to #44312 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-12 17:51:58 +08:00
congqixia	fc968ff1c2	enhance: [StorageV2] Pass args for avg size split policy (#44301 ) Related to #44257 This PR - Pass column stats for avg size split policy - Add param items for policy configuration --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-11 10:43:57 +08:00
sparknack	4a01c726f3	enhance: cachinglayer: some metric and params update (#44276 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-09-10 11:03:57 +08:00
zhagnlu	2f8620fa79	fix: fix like failed and add max columns limit (#44233 ) #44137 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-10 10:33:57 +08:00
zhagnlu	d67f1ea0ab	enhance: add param to modify dump snapshot batch size (#44215 ) issue: #44216 Signed-off-by: luzhang <luzhang@zilliz.com>	2025-09-05 14:29:54 +08:00

1 2 3 4 5 ...

619 Commits