issue: #44358
Implement complete snapshot management system including creation,
deletion, listing, description, and restoration capabilities across all
system components.
Key features:
- Create snapshots for entire collections
- Drop snapshots by name with proper cleanup
- List snapshots with collection filtering
- Describe snapshot details and metadata
Components added/modified:
- Client SDK with full snapshot API support and options
- DataCoord snapshot service with metadata management
- Proxy layer with task-based snapshot operations
- Protocol buffer definitions for snapshot RPCs
- Comprehensive unit tests with mockey framework
- Integration tests for end-to-end validation
Technical implementation:
- Snapshot metadata storage in etcd with proper indexing
- File-based snapshot data persistence in object storage
- Garbage collection integration for snapshot cleanup
- Error handling and validation across all operations
- Thread-safe operations with proper locking mechanisms
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant/assumption: snapshots are immutable point‑in‑time
captures identified by (collection, snapshot name/ID); etcd snapshot
metadata is authoritative for lifecycle (PENDING → COMMITTED → DELETING)
and per‑segment manifests live in object storage (Avro / StorageV2). GC
and restore logic must see snapshotRefIndex loaded
(snapshotMeta.IsRefIndexLoaded) before reclaiming or relying on
segment/index files.
- New capability added: full end‑to‑end snapshot subsystem — client SDK
APIs (Create/Drop/List/Describe/Restore + restore job queries),
DataCoord SnapshotWriter/Reader (Avro + StorageV2 manifests),
snapshotMeta in meta, SnapshotManager orchestration
(create/drop/describe/list/restore), copy‑segment restore
tasks/inspector/checker, proxy & RPC surface, GC integration, and
docs/tests — enabling point‑in‑time collection snapshots persisted to
object storage and restorations orchestrated across components.
- Logic removed/simplified and why: duplicated recursive
compaction/delta‑log traversal and ad‑hoc lookup code were consolidated
behind two focused APIs/owners (Handler.GetDeltaLogFromCompactTo for
delta traversal and SnapshotManager/SnapshotReader for snapshot I/O).
MixCoord/coordinator broker paths were converted to thin RPC proxies.
This eliminates multiple implementations of the same traversal/lookup,
reducing divergence and simplifying responsibility boundaries.
- Why this does NOT introduce data loss or regressions: snapshot
create/drop use explicit two‑phase semantics (PENDING → COMMIT/DELETING)
with SnapshotWriter writing manifests and metadata before commit; GC
uses snapshotRefIndex guards and
IsRefIndexLoaded/GetSnapshotBySegment/GetSnapshotByIndex checks to avoid
removing referenced files; restore flow pre‑allocates job IDs, validates
resources (partitions/indexes), performs rollback on failure
(rollbackRestoreSnapshot), and converts/updates segment/index metadata
only after successful copy tasks. Extensive unit and integration tests
exercise pending/deleting/GC/restore/error paths to ensure idempotence
and protection against premature deletion.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #46635
## Summary
- Fix spelling error in constant name: `CredentialSeperator` ->
`CredentialSeparator`
- Updated all usages across the codebase to use the correct spelling
## Changes
- `pkg/util/constant.go`: Renamed the constant
- `pkg/util/contextutil/context_util.go`: Updated usage
- `pkg/util/contextutil/context_util_test.go`: Updated usage
- `internal/proxy/authentication_interceptor.go`: Updated usage
- `internal/proxy/util.go`: Updated usage
- `internal/proxy/util_test.go`: Updated usage
- `internal/proxy/trace_log_interceptor_test.go`: Updated usage
- `internal/proxy/accesslog/info/util.go`: Updated usage
- `internal/distributed/proxy/service.go`: Updated usage
- `internal/distributed/proxy/httpserver/utils.go`: Updated usage
## Test Plan
- [x] All references updated consistently
- [x] No functional changes - only constant name spelling correction
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: the separator character for credentials remains ":"
everywhere — only the exported identifier was renamed from
CredentialSeperator → CredentialSeparator; the constant value and
split/join semantics are unchanged.
- Change (bug fix): corrected the misspelled exported constant in
pkg/util/constant.go and updated all references across the codebase
(parsing, token construction, header handling and tests) to use the new
identifier; this is an identifier rename that removes an inconsistent
symbol and prevents compile-time/reference errors.
- Logic simplified/redundant work removed: no runtime logic was removed;
the simplification is purely maintenance-focused — eliminating a
misspelled exported name that could cause developers to introduce
duplicate or incorrect constants.
- No data loss or behavior regression: runtime code paths are unchanged
— e.g., GetAuthInfoFromContext, ParseUsernamePassword,
AuthenticationInterceptor, proxy service token construction and
access-log extraction still use ":" to split/join credentials; updated
and added unit tests (parsing and metadata extraction) exercise these
paths and validate identical semantics.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: majiayu000 <1835304752@qq.com>
Signed-off-by: lif <1835304752@qq.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
issue: https://github.com/milvus-io/milvus/issues/45890
ComputePhraseMatchSlop accepts three params:
1. A string: the query text
2. A list of strings: the data texts
3. Analyzer params
Slop is calculated between the query text and each data text in the
context of phrase match, where both are tokenized by a tokenizer built
from the analyzer params.
Two arrays are returned:
1. is_match: whether the phrase match can succeed
2. slop: the corresponding slop if the phrase match can succeed, or -1
if it cannot.
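To illustrate the shape of the result, here is a minimal sketch under simplifying assumptions. It is not the actual implementation: `computePhraseMatchSlopSketch` is a hypothetical name, analyzer params are replaced by a lowercase whitespace tokenizer, and slop is modeled as the spread of position offsets between query and data tokens, which may differ from the real definition.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenize stands in for a real analyzer: lowercase, then split on whitespace.
func tokenize(text string) []string {
	return strings.Fields(strings.ToLower(text))
}

// minSlop returns the smallest slop under which every query token can be
// matched to a distinct doc position, or -1 if some token cannot be matched.
// Slop is modeled as max(p_i - i) - min(p_i - i) over the chosen positions.
func minSlop(query, doc []string) int64 {
	if len(query) == 0 {
		return -1
	}
	positions := make([][]int, len(query))
	for i, q := range query {
		for p, d := range doc {
			if d == q {
				positions[i] = append(positions[i], p)
			}
		}
		if len(positions[i]) == 0 {
			return -1
		}
	}
	best := int64(-1)
	used := map[int]bool{}
	var walk func(i, lo, hi int)
	walk = func(i, lo, hi int) {
		if i == len(query) {
			if s := int64(hi - lo); best < 0 || s < best {
				best = s
			}
			return
		}
		for _, p := range positions[i] {
			if used[p] {
				continue
			}
			shifted := p - i
			nlo, nhi := lo, hi
			if i == 0 || shifted < lo {
				nlo = shifted
			}
			if i == 0 || shifted > hi {
				nhi = shifted
			}
			used[p] = true
			walk(i+1, nlo, nhi)
			used[p] = false
		}
	}
	walk(0, 0, 0)
	return best
}

// computePhraseMatchSlopSketch returns two parallel arrays, one entry per data text.
func computePhraseMatchSlopSketch(query string, datas []string) ([]bool, []int64) {
	q := tokenize(query)
	isMatch := make([]bool, len(datas))
	slop := make([]int64, len(datas))
	for i, d := range datas {
		slop[i] = minSlop(q, tokenize(d))
		isMatch[i] = slop[i] >= 0
	}
	return isMatch, slop
}

func main() {
	isMatch, slop := computePhraseMatchSlopSketch(
		"quick fox",
		[]string{"the quick brown fox", "quick fox", "lazy dog"},
	)
	fmt.Println(isMatch, slop) // [true true false] [1 0 -1]
}
```

The brute-force search is exponential in the worst case; it is only meant to make the two-array contract concrete.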
---------
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
issue: #43785
- The pulsar client now prints its logs through the milvus logger.
- The pulsar client enables metrics by default.
- Upgrade the pulsar client to v0.15.1 and use the official repo.
- The fixes in milvus-io/pulsar-client-go are already covered by the
official v0.15.1.
Signed-off-by: chyezh <chyezh@outlook.com>
Related to #43966 #43809
This PR:
- Consolidate distributed request metrics collection into one
interceptor
- Add `Retry` and `Reject` labels, representing retry-able error and
auth rejection cases respectively
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Merge RootCoord, DataCoord, and QueryCoord into MixCoord
Merge their sessions into one
issue : https://github.com/milvus-io/milvus/issues/37764
---------
Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
After this PR is merged, insert, upsert, build index, query, and search
are supported on the added field.
These operations can only be performed on the added field after the add
field request completes, which is a synchronous operation.
Compaction will be supported in the next PR.
#39718
---------
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
enhance:
1. alterindex delete properties
We have introduced a new parameter deleteKeys to the alterindex
functionality, which allows for the deletion of properties within an
index. This enhancement provides users with the flexibility to manage
index properties more effectively by removing specific keys as needed.
2. altercollection delete properties
We have introduced a new parameter deleteKeys to the altercollection
functionality, which allows for the deletion of properties within a
collection. This enhancement provides users with the flexibility to
manage collection properties more effectively by removing specific keys
as needed.
3. support altercollectionfield
We currently support modifying the fieldparams of a field in a
collection using altercollectionfield, which only allows changes to the
max-length attribute.
Key Points:
- New Parameter - deleteKeys: This new parameter enables the deletion of
specified properties from an index. By passing a list of keys to
deleteKeys, users can remove the corresponding properties from the
index.
- Mutual Exclusivity: The deleteKeys parameter cannot be used in
conjunction with the extraParams parameter. Users must choose one
parameter to pass based on their requirement. If deleteKeys is provided,
it indicates an intent to delete properties; if extraParams is provided,
it signifies the addition or update of properties.
issue: https://github.com/milvus-io/milvus/issues/37436
---------
Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
issue : https://github.com/milvus-io/milvus/issues/36864
I have a few questions regarding my approach. I will consolidate them
here for feedback and review. Thanks.
---------
Signed-off-by: Nischay Yadav <nischay.yadav@ibm.com>
Signed-off-by: Nischay <Nischay.Yadav@ibm.com>
issue: #36621
- For simple types in a struct, add "string" to the JSON tag for
automatic string conversion during JSON encoding.
- For complex types in a struct, replace "int64" with "string."
Signed-off-by: jaime <yun.zhang@zilliz.com>
issue: #36621
1. Add API to access task runtime metrics, including:
- build index task
- compaction task
- import task
- balance (including load/release of segments/channels and some leader
tasks on querycoord)
- sync task
2. Add a debug mode to the webpage, enabled or disabled by passing
debug=true or debug=false in the URL query parameters.
Signed-off-by: jaime <yun.zhang@zilliz.com>
Related to #36102
This PR uses the newly added `grpcSizeStatsHandler` to reduce calls to
`proto.Size`, since the request & response size info is already recorded
by the grpc framework.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #35927
There are several issues this PR addresses:
- Use the `ResetTraceConfig` method instead of initializing one in the
update event handler
- Implement dynamic stats.Handler to receive tracing config update event
- Update `enable_trace` flag when `ResetTraceConfig` is invoked
- Change `enable_trace` to `std::atomic<bool>` to avoid a data race
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #35767
A prometheus counter cannot add a negative value.
When the response is not written (e.g. timeout or broken network), a
panic may happen if the value is not checked.
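The guard can be sketched as follows. The `counter` type below is a stand-in that mimics the Prometheus Go client's behavior of panicking on a negative `Add`. That an unwritten response reports a negative size (such as -1) is an assumption, and `addSize` is a hypothetical helper.

```go
package main

import "fmt"

// counter is a stand-in for a Prometheus counter, which panics when asked
// to add a negative value.
type counter struct {
	value float64
}

func (c *counter) Add(delta float64) {
	if delta < 0 {
		panic("counter cannot decrease in value")
	}
	c.value += delta
}

// addSize records a response size only when it is non-negative and reports
// whether the observation was recorded. Assumption: an unwritten response
// (timeout, broken connection) yields a negative size such as -1.
func addSize(c *counter, size int) bool {
	if size < 0 {
		return false
	}
	c.Add(float64(size))
	return true
}

func main() {
	sent := &counter{}
	fmt.Println(addSize(sent, 512)) // recorded
	fmt.Println(addSize(sent, -1))  // skipped instead of panicking
	fmt.Println(sent.value)
}
```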
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #32698
This PR adds two REST APIs for component stop and status check:
1. `/management/stop?role=querynode` can stop the specified component
2. `/management/check/ready?role=rootcoord` can check whether the target
component is serviceable
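A minimal client-side sketch for building these request URLs. The paths and the `role` parameter come from the list above. The `managementURL` helper, the base address `http://localhost:9091`, and the HTTP method to use are assumptions.

```go
package main

import (
	"fmt"
	"net/url"
)

// managementURL builds a management endpoint URL for the given role.
// It is a hypothetical helper, not part of the Milvus codebase.
func managementURL(base, path, role string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = path
	q := url.Values{}
	q.Set("role", role)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// Assumed base address; adjust to wherever the management port is exposed.
	base := "http://localhost:9091"

	stopURL, err := managementURL(base, "/management/stop", "querynode")
	if err != nil {
		panic(err)
	}
	readyURL, err := managementURL(base, "/management/check/ready", "rootcoord")
	if err != nil {
		panic(err)
	}
	fmt.Println(stopURL)
	fmt.Println(readyURL)
}
```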
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #32466
This PR makes the proxy update its shard leader cache when a shard
location changes, so that in a query node failover case the proxy can
find the recovered replica.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>