mirror of
https://gitee.com/milvus-io/milvus.git
synced 2026-01-07 19:31:51 +08:00
issue: #44358 Implement complete snapshot management system including creation, deletion, listing, description, and restoration capabilities across all system components. Key features: - Create snapshots for entire collections - Drop snapshots by name with proper cleanup - List snapshots with collection filtering - Describe snapshot details and metadata Components added/modified: - Client SDK with full snapshot API support and options - DataCoord snapshot service with metadata management - Proxy layer with task-based snapshot operations - Protocol buffer definitions for snapshot RPCs - Comprehensive unit tests with mockey framework - Integration tests for end-to-end validation Technical implementation: - Snapshot metadata storage in etcd with proper indexing - File-based snapshot data persistence in object storage - Garbage collection integration for snapshot cleanup - Error handling and validation across all operations - Thread-safe operations with proper locking mechanisms <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant/assumption: snapshots are immutable point‑in‑time captures identified by (collection, snapshot name/ID); etcd snapshot metadata is authoritative for lifecycle (PENDING → COMMITTED → DELETING) and per‑segment manifests live in object storage (Avro / StorageV2). GC and restore logic must see snapshotRefIndex loaded (snapshotMeta.IsRefIndexLoaded) before reclaiming or relying on segment/index files. - New capability added: full end‑to‑end snapshot subsystem — client SDK APIs (Create/Drop/List/Describe/Restore + restore job queries), DataCoord SnapshotWriter/Reader (Avro + StorageV2 manifests), snapshotMeta in meta, SnapshotManager orchestration (create/drop/describe/list/restore), copy‑segment restore tasks/inspector/checker, proxy & RPC surface, GC integration, and docs/tests — enabling point‑in‑time collection snapshots persisted to object storage and restorations orchestrated across components. - Logic removed/simplified and why: duplicated recursive compaction/delta‑log traversal and ad‑hoc lookup code were consolidated behind two focused APIs/owners (Handler.GetDeltaLogFromCompactTo for delta traversal and SnapshotManager/SnapshotReader for snapshot I/O). MixCoord/coordinator broker paths were converted to thin RPC proxies. This eliminates multiple implementations of the same traversal/lookup, reducing divergence and simplifying responsibility boundaries. - Why this does NOT introduce data loss or regressions: snapshot create/drop use explicit two‑phase semantics (PENDING → COMMIT/DELETING) with SnapshotWriter writing manifests and metadata before commit; GC uses snapshotRefIndex guards and IsRefIndexLoaded/GetSnapshotBySegment/GetSnapshotByIndex checks to avoid removing referenced files; restore flow pre‑allocates job IDs, validates resources (partitions/indexes), performs rollback on failure (rollbackRestoreSnapshot), and converts/updates segment/index metadata only after successful copy tasks. Extensive unit and integration tests exercise pending/deleting/GC/restore/error paths to ensure idempotence and protection against premature deletion. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Tests
E2E Test
Configuration Requirements
Operating System
| Operating System | Version |
|---|---|
| Amazon Linux | 2023 or above |
| Ubuntu | 20.04 or above |
| Mac | 10.14 or above |
Hardware
| Hardware Type | Recommended Configuration |
|---|---|
| CPU | x86_64 architecture Intel CPU Sandy Bridge or above CPU Instruction Set - SSE4_2 - AVX - AVX2 - AVX512 or arm64 Linux/MacOS |
| Memory | 16 GB or more |
Software
| Software Name | Version |
|---|---|
| Docker | 19.05 or above |
| Docker Compose | 1.25.5 or above |
| jq | 1.3 or above |
| kubectl | 1.14 or above |
| helm | 3.0 or above |
| kind | 0.10.0 or above |
Installing Dependencies
Troubleshooting Docker and Docker Compose
- Confirm that Docker Daemon is running:
$ docker info
-
Ensure that Docker is installed. Refer to the official installation instructions for Docker CE/EE.
-
Start the Docker Daemon if it is not already started.
-
To run Docker without
rootprivileges, create a user group labeleddocker, then add a user to the group withsudo usermod -aG docker $USER. Log out and log back into the terminal for the changes to take effect. For more information, see the official Docker documentation for Managing Docker as a Non-Root User.
- Check the version of Docker-Compose
$ docker compose version
docker compose version 1.25.5, build 8a1c60f6
docker-py version: 4.1.0
CPython version: 3.7.5
OpenSSL version: OpenSSL 1.1.1f 31 Mar 2020
- To install Docker-Compose, see Install Docker Compose
Install jq
Install kubectl
Install helm
- Refer to https://helm.sh/docs/intro/install/
Install kind
Run E2E Tests
$ cd tests/scripts
$ ./e2e-k8s.sh
Getting help
You can get help with the following command:
$ ./e2e-k8s.sh --help