milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
aoiasd	ee216877bb	enhance: support compaction with file resource in ref mode (#46399 ) Add support for DataNode compaction using file resources in ref mode. SortCompation and StatsJobs will build text indexes, which may use file resources. relate: https://github.com/milvus-io/milvus/issues/43687 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: file resources (analyzer binaries/metadata) are only fetched, downloaded and used when the node is configured in Ref mode (fileresource.IsRefMode via CommonCfg.QNFileResourceMode / DNFileResourceMode); Sync now carries a version and managers track per-resource versions/resource IDs so newer resource sets win and older entries are pruned (RefManager/SynchManager resource maps). - Logic removed / simplified: component-specific FileResourceMode flags and an indirection through a long-lived BinlogIO wrapper were consolidated — file-resource mode moved to CommonCfg, Sync/Download APIs became version- and context-aware, and compaction/index tasks accept a ChunkManager directly (binlog IO wrapper creation inlined). This eliminates duplicated config checks and wrapper indirection while preserving the same chunk/IO semantics. - Why no data loss or behavior regression: all file-resource code paths are gated by the configured mode (default remains "sync"); when not in ref-mode or when no resources exist, compaction and stats flows follow existing code paths unchanged. Versioned Sync + resourceID maps ensure newly synced sets replace older ones and RefManager prunes stale files; GetFileResources returns an error if requested IDs are missing (prevents silent use of wrong resources). Analyzer naming/parameter changes add analyzer_extra_info but default-callers pass "" so existing analyzers and index contents remain unchanged. - New capability: DataNode compaction and StatsJobs can now build text indexes using external file resources in Ref mode — DataCoord exposes GetFileResources and populates CompactionPlan.file_resources; SortCompaction/StatsTask download resources via fileresource.Manager, produce an analyzer_extra_info JSON (storage + resource->id map) via analyzer.BuildExtraResourceInfo, and propagate analyzer_extra_info into BuildIndexInfo so the tantivy bindings can load custom analyzers during text index creation. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2026-01-06 16:31:31 +08:00
wei liu	975c91df16	feat: Add comprehensive snapshot functionality for collections (#44361 ) issue: #44358 Implement complete snapshot management system including creation, deletion, listing, description, and restoration capabilities across all system components. Key features: - Create snapshots for entire collections - Drop snapshots by name with proper cleanup - List snapshots with collection filtering - Describe snapshot details and metadata Components added/modified: - Client SDK with full snapshot API support and options - DataCoord snapshot service with metadata management - Proxy layer with task-based snapshot operations - Protocol buffer definitions for snapshot RPCs - Comprehensive unit tests with mockey framework - Integration tests for end-to-end validation Technical implementation: - Snapshot metadata storage in etcd with proper indexing - File-based snapshot data persistence in object storage - Garbage collection integration for snapshot cleanup - Error handling and validation across all operations - Thread-safe operations with proper locking mechanisms <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant/assumption: snapshots are immutable point‑in‑time captures identified by (collection, snapshot name/ID); etcd snapshot metadata is authoritative for lifecycle (PENDING → COMMITTED → DELETING) and per‑segment manifests live in object storage (Avro / StorageV2). GC and restore logic must see snapshotRefIndex loaded (snapshotMeta.IsRefIndexLoaded) before reclaiming or relying on segment/index files. - New capability added: full end‑to‑end snapshot subsystem — client SDK APIs (Create/Drop/List/Describe/Restore + restore job queries), DataCoord SnapshotWriter/Reader (Avro + StorageV2 manifests), snapshotMeta in meta, SnapshotManager orchestration (create/drop/describe/list/restore), copy‑segment restore tasks/inspector/checker, proxy & RPC surface, GC integration, and docs/tests — enabling point‑in‑time collection snapshots persisted to object storage and restorations orchestrated across components. - Logic removed/simplified and why: duplicated recursive compaction/delta‑log traversal and ad‑hoc lookup code were consolidated behind two focused APIs/owners (Handler.GetDeltaLogFromCompactTo for delta traversal and SnapshotManager/SnapshotReader for snapshot I/O). MixCoord/coordinator broker paths were converted to thin RPC proxies. This eliminates multiple implementations of the same traversal/lookup, reducing divergence and simplifying responsibility boundaries. - Why this does NOT introduce data loss or regressions: snapshot create/drop use explicit two‑phase semantics (PENDING → COMMIT/DELETING) with SnapshotWriter writing manifests and metadata before commit; GC uses snapshotRefIndex guards and IsRefIndexLoaded/GetSnapshotBySegment/GetSnapshotByIndex checks to avoid removing referenced files; restore flow pre‑allocates job IDs, validates resources (partitions/indexes), performs rollback on failure (rollbackRestoreSnapshot), and converts/updates segment/index metadata only after successful copy tasks. Extensive unit and integration tests exercise pending/deleting/GC/restore/error paths to ensure idempotence and protection against premature deletion. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2026-01-06 10:15:24 +08:00
Bingyi Sun	b6532d3e44	enhance: implement external collection update task with source change detection (#45690 ) issue: https://github.com/milvus-io/milvus/issues/45691 Add persistent task management for external collections with automatic detection of external_source and external_spec changes. When source changes, the system aborts running tasks and creates new ones, ensuring only one active task per collection. Tasks validate their source on completion to prevent superseded tasks from committing results. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-11-27 15:33:08 +08:00
Gao	09a3195867	enhance: support max_connections config for remote storage (#45225 ) related: https://github.com/milvus-io/milvus/issues/45344 Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-11-13 15:37:38 +08:00
Gao	e9a875f7ac	enhance: override index_type while creating segment index (#45416 ) issue: #44752 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-11-11 07:27:36 +08:00
Spade A	7cb15ef141	feat: impl StructArray -- optimize vector array serialization (#44035 ) issue: https://github.com/milvus-io/milvus/issues/42148 Optimized from Go VectorArray → VectorArray Proto → Binary → C++ VectorArray Proto → C++ VectorArray local impl → Memory to Go VectorArray → Arrow ListArray → Memory --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-03 16:39:53 +08:00
cai.zhang	6989e18599	enhance: Move sort stats task to sort compaction (#42562 ) issue: #42560 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-08 20:22:47 +08:00
foxspy	1c794be119	enhance: Output index version information in the DescribeIndex interface (#41847 ) issue: https://github.com/milvus-io/milvus/issues/41431 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-05-15 14:36:22 +08:00
Xianhui Lin	f9febe3bae	enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006 ) Merge RootCoord, DataCoord And QueryCoord into MixCoord Make Session into one issue : https://github.com/milvus-io/milvus/issues/37764 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-04-11 16:36:30 +08:00
Xianhui Lin	3bc24c264f	enhance: Add json key inverted index in stats for optimization (#38039 ) Add json key inverted index in stats for optimization https://github.com/milvus-io/milvus/issues/36995 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-10 15:20:28 +08:00
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990 ) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
cai.zhang	6d45dd5666	fix: Add scalar index engine version for compatibility (#39204 ) issue: #39203 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-01-15 12:25:00 +08:00
Zhen Ye	3e788f0fbd	enhance: record memory size (uncompressed) item for index (#38770 ) issue: #38715 - Current milvus use a serialized index size(compressed) for estimate resource for loading. - Add a new field `MemSize` (before compressing) for index to estimate resource. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-14 10:33:06 +08:00
Zhen Ye	bb8d1ab3bf	enhance: make new go package to manage proto (#39114 ) issue: #39095 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-10 10:49:01 +08:00

14 Commits