14 Commits

Author SHA1 Message Date
wei liu
975c91df16
feat: Add comprehensive snapshot functionality for collections (#44361)
issue: #44358

Implement complete snapshot management system including creation,
deletion, listing, description, and restoration capabilities across all
system components.

Key features:
- Create snapshots for entire collections
- Drop snapshots by name with proper cleanup
- List snapshots with collection filtering
- Describe snapshot details and metadata

Components added/modified:
- Client SDK with full snapshot API support and options
- DataCoord snapshot service with metadata management
- Proxy layer with task-based snapshot operations
- Protocol buffer definitions for snapshot RPCs
- Comprehensive unit tests with mockey framework
- Integration tests for end-to-end validation

Technical implementation:
- Snapshot metadata storage in etcd with proper indexing
- File-based snapshot data persistence in object storage
- Garbage collection integration for snapshot cleanup
- Error handling and validation across all operations
- Thread-safe operations with proper locking mechanisms

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant/assumption: snapshots are immutable point‑in‑time
captures identified by (collection, snapshot name/ID); etcd snapshot
metadata is authoritative for lifecycle (PENDING → COMMITTED → DELETING)
and per‑segment manifests live in object storage (Avro / StorageV2). GC
and restore logic must see snapshotRefIndex loaded
(snapshotMeta.IsRefIndexLoaded) before reclaiming or relying on
segment/index files.

- New capability added: full end‑to‑end snapshot subsystem — client SDK
APIs (Create/Drop/List/Describe/Restore + restore job queries),
DataCoord SnapshotWriter/Reader (Avro + StorageV2 manifests),
snapshotMeta in meta, SnapshotManager orchestration
(create/drop/describe/list/restore), copy‑segment restore
tasks/inspector/checker, proxy & RPC surface, GC integration, and
docs/tests — enabling point‑in‑time collection snapshots persisted to
object storage and restorations orchestrated across components.

- Logic removed/simplified and why: duplicated recursive
compaction/delta‑log traversal and ad‑hoc lookup code were consolidated
behind two focused APIs/owners (Handler.GetDeltaLogFromCompactTo for
delta traversal and SnapshotManager/SnapshotReader for snapshot I/O).
MixCoord/coordinator broker paths were converted to thin RPC proxies.
This eliminates multiple implementations of the same traversal/lookup,
reducing divergence and simplifying responsibility boundaries.

- Why this does NOT introduce data loss or regressions: snapshot
create/drop use explicit two‑phase semantics (PENDING → COMMIT/DELETING)
with SnapshotWriter writing manifests and metadata before commit; GC
uses snapshotRefIndex guards and
IsRefIndexLoaded/GetSnapshotBySegment/GetSnapshotByIndex checks to avoid
removing referenced files; restore flow pre‑allocates job IDs, validates
resources (partitions/indexes), performs rollback on failure
(rollbackRestoreSnapshot), and converts/updates segment/index metadata
only after successful copy tasks. Extensive unit and integration tests
exercise pending/deleting/GC/restore/error paths to ensure idempotence
and protection against premature deletion.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2026-01-06 10:15:24 +08:00
Zhen Ye
309d564796
enhance: support collection and index with WAL-based DDL framework (#45033)
issue: #43897

- Part of collection/index related DDL is implemented by WAL-based DDL
framework now.
- Support following message type in wal, CreateCollection,
DropCollection, CreatePartition, DropPartition, CreateIndex, AlterIndex,
DropIndex.
- Part of collection/index related DDL can be synced by new CDC now.
- Refactor some UT for collection/index DDL.
- Add Tombstone scheduler to manage the tombstone GC for collection or
partition meta.
- Move the vchannel allocation into streaming pchannel manager.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-30 14:24:08 +08:00
Zhen Ye
30091a3bb7
enhance: remove redundant channel manager from datacoord (#44532)
issue: #41611

- After enabling streaming arch, channel manager of data coord is a
redundant component.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-09 11:01:57 +08:00
zhenshan.cao
758cf29e77
fix: create multiple idential indexes by accident (#40179)
issue: https://github.com/milvus-io/milvus/issues/40163

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-04-08 15:06:25 +08:00
wei liu
0420dc1eb1
fix: use correct delete checkpoint to prevent premature data cleanup (#40366)
issue: #40292
related to #39552

- Fix incorrect delete checkpoint usage in SyncDistribution
- Change checkpoint parameter from action.GetCheckpoint() to
action.GetDeleteCP() in SyncTargetVersion call
- This resolves the issue where delete buffer data was being cleaned
prematurely due to wrong checkpoint reference

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-03-12 15:00:08 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
cai.zhang
3a6408b237
fix: Record a map to avoid repeatedly traversing the CompactionFrom (#38925)
issue: #38811

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-01-15 10:02:58 +08:00
Zhen Ye
bb8d1ab3bf
enhance: make new go package to manage proto (#39114)
issue: #39095

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:49:01 +08:00
cai.zhang
a348122758
fix: Support get segments from current segments view (#38512)
issue: #38511

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-18 18:00:54 +08:00
tinswzy
1dbb6cd7cb
enhance: refine the datacoord meta related interfaces (#37957)
issue: #35917 
This PR refines the meta-related APIs in datacoord to allow the ctx to
be passed down to the catalog operation interfaces

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-11-26 19:46:34 +08:00
cai.zhang
4dc684126e
enhance: Handoff growing segment after sorted (#37385)
issue: #33744 

1. Segments generated from inserts will be loaded as growing until they
are sorted by primary key.
2. This PR may increase memory pressure on the delegator, but we need to
test the performance of stats. In local testing, the speed of stats is
greater than the insert speed.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-11-07 16:08:24 +08:00
congqixia
d8db3e8761
enhance: Add metrics for querynode delete buffer info (#37081)
Related to #35303

This PR add metrics for querynode delegator delete buffer information,
which is related to dml quota logic.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-24 10:47:28 +08:00
cai.zhang
ac8c5fcd5d
enhance: Remove pre-marking segments as L2 during clustering compaction (#36799)
issue: #36686 

This pr will remove pre-marking segments as L2 during clustering
compaction in version 2.5, and ensure compatibility with version 2.4.

The core of this change is to **ensure that the many-to-many lineage
derivation logic is correct, making sure that both the parent and child
cannot simultaneously exist in the target segment view.**

feature:
  - Clustering compaction no longer marks the input segments as L2.
- Add a new field `is_invisible` to `segmentInfo`, and mark segments
that have completed clustering but have not yet built indexes as
`is_invisible` to prevent them from being loaded prematurely."
- Do not mark the input segment as `Dropped` before the clustering
compaction is completed.
- After compaction fails, only the result segment needs to be marked as
Dropped.

compatibility:
- If the upgraded task has not failed, there are no compatibility
issues.
- If the status after the upgrade is `MetaSaved`, then skip the stats
task based on whether TmpSegments is empty.
  - If the failure occurs before `MetaSaved`:
- there are no ResultSegments, and InputSegments have not been marked as
dropped yet.
    - the level of input segments need to revert to LastLevel
  - If the failure occurs after `MetaSaved`:
- ResultSegments have already been generated, and InputSegments have
been marked as Dropped. At this point, simply make the ResultSegments
visible.
- the level of ResultSegments needs to be set to L1(in order to
participate in mixCompaction)

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-10-23 17:15:28 +08:00
XuanYang-cn
4e0ea39235
fix: Remove neighbors if compactTo is unindexed (#36503)
See also: #36360

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-08 14:15:19 +08:00