milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
wei liu	975c91df16	feat: Add comprehensive snapshot functionality for collections (#44361 ) issue: #44358 Implement complete snapshot management system including creation, deletion, listing, description, and restoration capabilities across all system components. Key features: - Create snapshots for entire collections - Drop snapshots by name with proper cleanup - List snapshots with collection filtering - Describe snapshot details and metadata Components added/modified: - Client SDK with full snapshot API support and options - DataCoord snapshot service with metadata management - Proxy layer with task-based snapshot operations - Protocol buffer definitions for snapshot RPCs - Comprehensive unit tests with mockey framework - Integration tests for end-to-end validation Technical implementation: - Snapshot metadata storage in etcd with proper indexing - File-based snapshot data persistence in object storage - Garbage collection integration for snapshot cleanup - Error handling and validation across all operations - Thread-safe operations with proper locking mechanisms <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant/assumption: snapshots are immutable point‑in‑time captures identified by (collection, snapshot name/ID); etcd snapshot metadata is authoritative for lifecycle (PENDING → COMMITTED → DELETING) and per‑segment manifests live in object storage (Avro / StorageV2). GC and restore logic must see snapshotRefIndex loaded (snapshotMeta.IsRefIndexLoaded) before reclaiming or relying on segment/index files. - New capability added: full end‑to‑end snapshot subsystem — client SDK APIs (Create/Drop/List/Describe/Restore + restore job queries), DataCoord SnapshotWriter/Reader (Avro + StorageV2 manifests), snapshotMeta in meta, SnapshotManager orchestration (create/drop/describe/list/restore), copy‑segment restore tasks/inspector/checker, proxy & RPC surface, GC integration, and docs/tests — enabling point‑in‑time collection snapshots persisted to object storage and restorations orchestrated across components. - Logic removed/simplified and why: duplicated recursive compaction/delta‑log traversal and ad‑hoc lookup code were consolidated behind two focused APIs/owners (Handler.GetDeltaLogFromCompactTo for delta traversal and SnapshotManager/SnapshotReader for snapshot I/O). MixCoord/coordinator broker paths were converted to thin RPC proxies. This eliminates multiple implementations of the same traversal/lookup, reducing divergence and simplifying responsibility boundaries. - Why this does NOT introduce data loss or regressions: snapshot create/drop use explicit two‑phase semantics (PENDING → COMMIT/DELETING) with SnapshotWriter writing manifests and metadata before commit; GC uses snapshotRefIndex guards and IsRefIndexLoaded/GetSnapshotBySegment/GetSnapshotByIndex checks to avoid removing referenced files; restore flow pre‑allocates job IDs, validates resources (partitions/indexes), performs rollback on failure (rollbackRestoreSnapshot), and converts/updates segment/index metadata only after successful copy tasks. Extensive unit and integration tests exercise pending/deleting/GC/restore/error paths to ensure idempotence and protection against premature deletion. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2026-01-06 10:15:24 +08:00
aoiasd	354ab2f55e	enhance: sync file resource to querynode and datanode (#44480 ) relate:https://github.com/milvus-io/milvus/issues/43687 Support use file resource with sync mode. Auto download or remove file resource to local when user add or remove file resource. Sync file resource to node when find new node session. --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-04 16:23:11 +08:00
aoiasd	eca51ed2c6	enhance: add file resource api (#43766 ) relate: https://github.com/milvus-io/milvus/issues/43687 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-08 14:17:41 +08:00
yihao.dai	ee9a95189a	enhance: Print segments info after import done (#43200 ) issue: https://github.com/milvus-io/milvus/issues/42488 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-10 12:38:47 +08:00
cai.zhang	6989e18599	enhance: Move sort stats task to sort compaction (#42562 ) issue: #42560 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-08 20:22:47 +08:00
yihao.dai	9cbd194c6b	fix: Prevent import from generating small binlogs (#43132 ) - Introduce dynamic buffer sizing to avoid generating small binlogs during import - Refactor import slot calculation based on CPU and memory constraints - Implement dynamic pool sizing for sync manager and import tasks according to CPU core count issue: https://github.com/milvus-io/milvus/issues/43131 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-07 21:32:47 +08:00
yihao.dai	e6da4a64b5	fix: Pre-check import message to prevent pipeline block indefinitely (#42415 ) Pre-check import message to prevent pipeline block indefinitely. issue: https://github.com/milvus-io/milvus/issues/42414 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-06-11 13:40:38 +08:00
yihao.dai	f71930e8db	enhance: Enhance import context (#42021 ) Rename `imeta` to `importMeta` to improve readability, and enhance import related context usage. issue: https://github.com/milvus-io/milvus/issues/41123 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-05-23 12:58:27 +08:00
yihao.dai	142bd2fc05	enhance: Pooling for data tasks (#41256 ) 1. Add global scheduler for datacoord. 2. Define and implement new CreateTask, QueryTask, DropTask interfaces. 3. Refine Import, Compaction, Stats, Index task. issue: https://github.com/milvus-io/milvus/issues/41123 Co-authored-by: Cai Zhang <cai.zhang@zilliz.com>	2025-05-20 21:06:24 +08:00
yihao.dai	36e9e41627	fix: Fix no candidate segments error for small import (#41771 ) When autoID is enabled, the preimport task estimates row distribution by evenly dividing the total row count (numRows) across all vchannels: `estimatedCount = numRows / vchannelNum`. However, the actual import task hashes real auto-generated IDs to determine the target vchannel. This mismatch can lead to inaccurate row distribution estimation in such corner cases: - Importing 1 row into 2 vchannels: • Preimport: 1 / 2 = 0 → both v0 and v1 are estimated to have 0 rows • Import: real autoID (e.g., 457975852966809057) hashes to v1 → actual result: v0 = 0, v1 = 1 To resolve such corner case, we now allocate at least one segment for each vchannel when autoID is enabled, ensuring all vchannels are prepared to receive data even if no rows are estimated for them. issue: https://github.com/milvus-io/milvus/issues/41759 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-05-14 15:30:21 +08:00
XuanYang-cn	4bebca6416	enhance: Replace currRows with NumOfRows (#40074 ) See also: #40068 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-03-10 12:16:03 +08:00
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990 ) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
SimFG	047254665d	feat: support to replicate import msg (#39171 ) - issue: #39849 --------- Signed-off-by: SimFG <bang.fu@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-02-16 00:08:13 +08:00
yihao.dai	38f813bed3	enhance: Read metadata concurrently to accelerate recovery (#38403 ) Read metadata such as segments, binlogs, and partitions concurrently at the collection level. issue: https://github.com/milvus-io/milvus/issues/37630 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-23 14:27:27 +08:00
Zhen Ye	bb8d1ab3bf	enhance: make new go package to manage proto (#39114 ) issue: #39095 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-10 10:49:01 +08:00
cai.zhang	0d7a89a4f8	fix: Use the correct RootPath when decompressing binlog in stats task (#38341 ) issue: #38336 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-12-11 16:16:42 +08:00
tinswzy	1dbb6cd7cb	enhance: refine the datacoord meta related interfaces (#37957 ) issue: #35917 This PR refines the meta-related APIs in datacoord to allow the ctx to be passed down to the catalog operation interfaces Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2024-11-26 19:46:34 +08:00
congqixia	b0bd290a6e	enhance: Use internal json(sonic) to replace std json lib (#37708 ) Related to #35020 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-18 10:46:31 +08:00
jaime	9d16b972ea	feat: add tasks page into management WebUI (#37002 ) issue: #36621 1. Add API to access task runtime metrics, including: - build index task - compaction task - import task - balance (including load/release of segments/channels and some leader tasks on querycoord) - sync task 2. Add a debug model to the webpage by using debug=true or debug=false in the URL query parameters to enable or disable debug mode. Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-10-28 10:13:29 +08:00
yihao.dai	1f47d5510b	fix: Fix import segments leak in segment manager (#36602 ) Directly add import segments from the meta, eliminating the dependency on the segment manager. issue: https://github.com/milvus-io/milvus/issues/34648 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-10-08 10:11:22 +08:00
yihao.dai	a61668c77e	feat: Introduce stats task for import (#35868 ) This PR introduce stats task for import: 1. Define new `Stats` and `IndexBuilding` states for importJob 2. Add new stats step to the import process: trigger the stats task and wait for its completion 3. Abort stats task if import job failed issue: https://github.com/milvus-io/milvus/issues/33744 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-09-15 15:17:08 +08:00
cai.zhang	2c9bb4dfa3	feat: Support stats task to sort segment by PK (#35054 ) issue: #33744 This PR includes the following changes: 1. Added a new task type to the task scheduler in datacoord: stats task, which sorts segments by primary key. 2. Implemented segment sorting in indexnode. 3. Added a new field `FieldStatsLog` to SegmentInfo to store token index information. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-09-02 14:19:03 +08:00
congqixia	c992a61a23	enhance: Separate allocator pkg in datacoord (#35622 ) Related to #28861 Move allocator interface and implementation into separate package. Also update some unittest logic. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-08-22 10:06:56 +08:00
yihao.dai	678018d9ca	enhance: Avoid unnecessary compaction (#35148 ) Estimate the import segment size based on DiskSegmentMaxSize(2G) to avoid unnecessary compaction after import completed. issue: https://github.com/milvus-io/milvus/issues/35147 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-08-06 10:30:21 +08:00
yihao.dai	b71e058bc5	enhance: Add import option to skip disk quota check (#35274 ) Add an option to skip the disk quota check for backup-restore import. issue: https://github.com/milvus-io/milvus/issues/33775 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-08-05 16:40:16 +08:00
yihao.dai	eb5d4de390	fix: Check if the import job exists (#33672 ) issue: https://github.com/milvus-io/milvus/issues/33671 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-06-10 21:51:55 +08:00
wayblink	a1232fafda	feat: Major compaction (#33620 ) #30633 Signed-off-by: wayblink <anyang.wang@zilliz.com> Co-authored-by: MrPresent-Han <chun.han@zilliz.com>	2024-06-10 21:34:08 +08:00
yihao.dai	3540eee977	enhance: Support L0 import (#33514 ) issue: https://github.com/milvus-io/milvus/issues/33157 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-06-07 14:17:20 +08:00
cai.zhang	27cc9f2630	enhance: Support analyze data (#33651 ) issue: #30633 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: chasingegg <chao.gao@zilliz.com>	2024-06-06 17:37:51 +08:00
zhenshan.cao	ac4f3997ce	enhance: Reconstructing Compaction to possess persistence capability (#33265 ) issue #33586 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-06-05 10:17:50 +08:00
chyezh	2586c2f1b3	enhance: use WalkWithPrefix api for oss, enable piplined file gc (#31740 ) issue: #19095,#29655,#31718 - Change `ListWithPrefix` to `WalkWithPrefix` of OOS into a pipeline mode. - File garbage collection is performed in other goroutine. - Segment Index Recycle clean index file too. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-25 20:41:27 +08:00
yihao.dai	4e264003bf	enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629 ) Feature Introduced: 1. Ensure ImportV2 waits for the index to be built Enhancements Introduced: 1. Utilization of local time for timeout ts instead of allocating ts from rootcoord. 3. Enhanced input file length check for binlog import. 4. Removal of duplicated manager in datanode. 5. Renaming of executor to scheduler in datanode. 6. Utilization of a thread pool in the scheduler in datanode. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-01 20:09:13 +08:00
yihao.dai	9a13b9822f	enhance: Return more fields in import progress response (#31539 ) Return more fields in import progress response, include importedRows and totalRows. Additionally, ensure compatibility with the old import progress response by retaining fields of create timestamp and row count. issue: https://github.com/milvus-io/milvus/issues/31448 https://github.com/milvus-io/milvus/issues/31237 https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-24 21:57:06 +08:00
yihao.dai	776709e5ff	fix: Fix binlog import (#31310 ) Fix binlog import functionality by removing the existing check and refining the size retrieval process. issue: https://github.com/milvus-io/milvus/issues/31221, https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-17 20:59:04 +08:00
yihao.dai	c408a32db6	feat: Add disk quota checks for import V2 (#31131 ) Return quota error when the files to be imported exceed the disk quota. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-15 14:43:03 +08:00
yihao.dai	811316d2ba	fix: Fix binlog import and refine error reporting (#31241 ) 1. Fix binlog import with partition key. 2. Refine binlog import error reportins. 3. Avoid division by zero when retrieving import progress. issue: https://github.com/milvus-io/milvus/issues/31221, https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-15 10:55:05 +08:00
yihao.dai	7d7ef388df	enhance: Remove adding import segments to the datanode (#31244 ) With the presence of L0 segments, there's no longer a need to add import segments to the datanode. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-15 06:53:03 +08:00
yihao.dai	b5c67948b7	enhance: Enhance and modify the return content of ImportV2 (#31192 ) 1. The Import APIs now provide detailed progress information for each imported file, including details such as file name, file size, progress, and more. 2. The APIs now return the collection name and the completion time. 3. Other modifications include changing jobID to jobId and other similar adjustments. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-13 19:51:03 +08:00
yihao.dai	a434d33e75	feat: Add import scheduler and manager (#29367 ) This PR introduces novel managerial roles for importv2: 1. ImportMeta: To manage all the import tasks; 2. ImportScheduler: To process tasks and modify their states; 3. ImportChecker: To ascertain the completion of all tasks and instigate relevant operations. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-01 18:31:02 +08:00

39 Commits