issue: https://github.com/milvus-io/milvus/issues/46713
/kind bug
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: a Limiter's hasUpdated flag controls whether its
non-Infinite limit is exported to proxies via toRequestLimiter(); only
limiters with hasUpdated==true and non-Inf rates produce rate updates
delivered to proxies (pkg/util/ratelimitutil/limiter.go: HasUpdated /
toRequestLimiter behavior unchanged).
- Exact bug and fix (issue #46713): collection-level limiters created
from configured collection/partition/database properties were
constructed with the correct limits but left hasUpdated==false, so they
were skipped by the existing !HasUpdated() check and never sent to
proxies. Fix: add Limiter.SetHasUpdated(updated bool) and call a new
updateLimiterHasUpdated helper immediately after creating limiter nodes
during initialization/reset (internal/rootcoord/quota_center.go) to mark
newly created non-Infinite limiters as updated so they are included in
toRequestLimiter exports (see the sketch after these notes).
- Logic simplified / redundancy removed: initialization now explicitly
sets limiter initialization state (hasUpdated) for newly-created
non-Infinite limiters instead of relying on implicit later side-effects
to toggle the flag; this removes the implicit gap between creation and
the expectation that a configured limiter should be published.
- No data-loss or behavior regression: the change only mutates the
in-memory hasUpdated flag for freshly created limiter instances
(pkg/util/ratelimitutil/limiter.go: SetHasUpdated) and sets it in the
limiter initialization path (internal/rootcoord/quota_center.go). It
does not alter token accounting (advance, AllowN, Cancel), rate
computation, SetLimit semantics, persistence, or proxy filtering
logic—only ensures intended collection-level rates are delivered to
proxies—so no persisted data or runtime rate behavior is removed or
degraded.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
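Below is a minimal Go sketch of the flag-based export gating described
in these notes. The names Limiter, SetHasUpdated, and the marking helper
mirror the PR, but the bodies are simplified stand-ins, not the actual
pkg/util/ratelimitutil implementation.

```go
package main

import (
	"fmt"
	"math"
)

// Limiter is a simplified stand-in for pkg/util/ratelimitutil.Limiter.
type Limiter struct {
	limit      float64 // requests per second; math.Inf(1) means "unlimited"
	hasUpdated bool
}

func (l *Limiter) Limit() float64             { return l.limit }
func (l *Limiter) HasUpdated() bool           { return l.hasUpdated }
func (l *Limiter) SetHasUpdated(updated bool) { l.hasUpdated = updated }

// markNewLimiters mirrors the intent of the updateLimiterHasUpdated helper:
// freshly created limiters with a finite configured rate are flagged so the
// export path below no longer skips them.
func markNewLimiters(limiters []*Limiter) {
	for _, l := range limiters {
		if !math.IsInf(l.Limit(), 1) {
			l.SetHasUpdated(true)
		}
	}
}

// exportRates mimics the toRequestLimiter gating: only updated, finite
// limiters produce rate updates that would be delivered to proxies.
func exportRates(limiters []*Limiter) []float64 {
	var rates []float64
	for _, l := range limiters {
		if !l.HasUpdated() || math.IsInf(l.Limit(), 1) {
			continue
		}
		rates = append(rates, l.Limit())
	}
	return rates
}

func main() {
	limiters := []*Limiter{{limit: 100}, {limit: math.Inf(1)}}
	fmt.Println(exportRates(limiters)) // prints []: the bug, configured limiter never exported
	markNewLimiters(limiters)
	fmt.Println(exportRates(limiters)) // prints [100]: after the fix
}
```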
Signed-off-by: guanghuihuang <guanghuihuang@didiglobal.com>
Co-authored-by: guanghuihuang <guanghuihuang@didiglobal.com>
issue: #44358
Implement a complete snapshot management system covering creation,
deletion, listing, description, and restoration across all system
components. A hedged client-side usage sketch follows the
implementation notes below.
Key features:
- Create snapshots for entire collections
- Drop snapshots by name with proper cleanup
- List snapshots with collection filtering
- Describe snapshot details and metadata
Components added/modified:
- Client SDK with full snapshot API support and options
- DataCoord snapshot service with metadata management
- Proxy layer with task-based snapshot operations
- Protocol buffer definitions for snapshot RPCs
- Comprehensive unit tests with mockey framework
- Integration tests for end-to-end validation
Technical implementation:
- Snapshot metadata storage in etcd with proper indexing
- File-based snapshot data persistence in object storage
- Garbage collection integration for snapshot cleanup
- Error handling and validation across all operations
- Thread-safe operations with proper locking mechanisms
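The sketch below shows the intended call pattern for the snapshot
lifecycle (create, describe, list, drop). The interface and method
signatures are hypothetical stand-ins inferred from the feature list
above; the real client SDK surface and option types may differ.

```go
package main

import (
	"context"
	"fmt"
)

// SnapshotInfo is a trimmed, illustrative description of a snapshot.
type SnapshotInfo struct {
	Name         string
	Collection   string
	CreateTimeMs int64
}

// SnapshotClient is a hypothetical, trimmed-down view of the snapshot
// surface described above (Create/Drop/List/Describe).
type SnapshotClient interface {
	CreateSnapshot(ctx context.Context, collection, name string) error
	DropSnapshot(ctx context.Context, collection, name string) error
	ListSnapshots(ctx context.Context, collection string) ([]string, error)
	DescribeSnapshot(ctx context.Context, collection, name string) (*SnapshotInfo, error)
}

// backupDaily shows the intended call pattern: create, verify via describe,
// list snapshots for the collection, and drop stale ones.
func backupDaily(ctx context.Context, c SnapshotClient, collection, name string) error {
	if err := c.CreateSnapshot(ctx, collection, name); err != nil {
		return fmt.Errorf("create snapshot %q: %w", name, err)
	}
	info, err := c.DescribeSnapshot(ctx, collection, name)
	if err != nil {
		return fmt.Errorf("describe snapshot %q: %w", name, err)
	}
	fmt.Printf("snapshot %s on %s created at %d\n", info.Name, info.Collection, info.CreateTimeMs)

	names, err := c.ListSnapshots(ctx, collection)
	if err != nil {
		return err
	}
	for _, old := range names {
		if old != name { // keep only the latest snapshot in this toy policy
			if err := c.DropSnapshot(ctx, collection, old); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() { _ = backupDaily } // wiring a real client is out of scope for this sketch
```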
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant/assumption: snapshots are immutable point‑in‑time
captures identified by (collection, snapshot name/ID); etcd snapshot
metadata is authoritative for lifecycle (PENDING → COMMITTED → DELETING)
and per‑segment manifests live in object storage (Avro / StorageV2). GC
and restore logic must see snapshotRefIndex loaded
(snapshotMeta.IsRefIndexLoaded) before reclaiming or relying on
segment/index files.
- New capability added: full end‑to‑end snapshot subsystem — client SDK
APIs (Create/Drop/List/Describe/Restore + restore job queries),
DataCoord SnapshotWriter/Reader (Avro + StorageV2 manifests),
snapshotMeta in meta, SnapshotManager orchestration
(create/drop/describe/list/restore), copy‑segment restore
tasks/inspector/checker, proxy & RPC surface, GC integration, and
docs/tests — enabling point‑in‑time collection snapshots persisted to
object storage and restorations orchestrated across components.
- Logic removed/simplified and why: duplicated recursive
compaction/delta‑log traversal and ad‑hoc lookup code were consolidated
behind two focused APIs/owners (Handler.GetDeltaLogFromCompactTo for
delta traversal and SnapshotManager/SnapshotReader for snapshot I/O).
MixCoord/coordinator broker paths were converted to thin RPC proxies.
This eliminates multiple implementations of the same traversal/lookup,
reducing divergence and simplifying responsibility boundaries.
- Why this does NOT introduce data loss or regressions: snapshot
create/drop use explicit two‑phase semantics (PENDING → COMMIT/DELETING)
with SnapshotWriter writing manifests and metadata before commit; GC
uses snapshotRefIndex guards and
IsRefIndexLoaded/GetSnapshotBySegment/GetSnapshotByIndex checks to avoid
removing referenced files; restore flow pre‑allocates job IDs, validates
resources (partitions/indexes), performs rollback on failure
(rollbackRestoreSnapshot), and converts/updates segment/index metadata
only after successful copy tasks. Extensive unit and integration tests
exercise pending/deleting/GC/restore/error paths to ensure idempotence
and protection against premature deletion.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #46033
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Pull Request Summary: Entity-Level TTL Field Support
### Core Invariant and Design
This PR introduces **per-entity TTL (time-to-live) expiration** via a
dedicated TIMESTAMPTZ field as a fine-grained alternative to
collection-level TTL. The key invariant is **mutual exclusivity**:
collection-level TTL and entity-level TTL field cannot coexist on the
same collection. Validation is enforced at the proxy layer during
collection creation/alteration (`validateTTL()` prevents both being set
simultaneously).
### What Is Removed and Why
- **Global `EntityExpirationTTL` parameter** removed from config
(`configs/milvus.yaml`, `pkg/util/paramtable/component_param.go`). This
was the only mechanism for collection-level expiration. The removal is
safe because:
- The collection-level TTL path (`isEntityExpired(ts)` check) remains
intact in the codebase for backward compatibility
- TTL field check (`isEntityExpiredByTTLField()`) is a secondary path
invoked only when a TTL field is configured
- Existing deployments using collection TTL can continue without
modification
The global parameter was removed because, with entity-level TTL (and the
existing collection-level TTL property) available, a deployment-wide
expiration setting is redundant; the PR chooses one TTL mechanism per
collection rather than layering both.
### No Data Loss or Behavior Regression
**TTL filtering logic is additive and safe:**
1. **Collection-level TTL unaffected**: The `isEntityExpired(ts)` check
still applies when no TTL field is configured; callers of
`EntityFilter.Filtered()` pass `-1` as the TTL expiration timestamp when
no field exists, causing `isEntityExpiredByTTLField()` to return false
immediately
2. **Null/invalid TTL values treated safely**: Rows with null TTL or TTL
≤ 0 are marked as "never expire" (using sentinel value `int64(^uint64(0)
>> 1)`) and are preserved across compactions; percentile calculations
only include positive TTL values
3. **Query-time filtering automatic**: TTL filtering is transparently
added to expression compilation via `AddTTLFieldFilterExpressions()`,
which appends `(ttl_field IS NULL OR ttl_field > current_time)` to the
filter pipeline. Entities with null TTL always pass the filter
4. **Compaction triggering granular**: Percentile-based expiration (20%,
40%, 60%, 80%, 100%) allows configurable compaction thresholds via
`SingleCompactionRatioThreshold`, preventing premature data deletion
### Capability Added: Per-Entity Expiration with Data Distribution
Awareness
Users can now specify a TIMESTAMPTZ collection property `ttl_field`
naming a schema field. During data writes, TTL values are collected per
segment and percentile quantiles (5-value array) are computed and stored
in segment metadata. At query time, the TTL field is automatically
filtered. At compaction time, segment-level percentiles drive
expiration-based compaction decisions, enabling intelligent compaction
of segments where a configurable fraction of data has expired (e.g.,
compact when 40% of rows are expired, controlled by threshold ratio).
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
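A minimal Go sketch of the filtering behavior described above, assuming
an illustrative field name (`expire_at`) and microsecond timestamps;
only the `(ttl_field IS NULL OR ttl_field > current_time)` shape and the
never-expire sentinel come from the PR.

```go
package main

import (
	"fmt"
	"time"
)

// neverExpire mirrors the sentinel used for null / non-positive TTL values:
// the maximum int64, i.e. "never expire".
const neverExpire = int64(^uint64(0) >> 1)

// normalizeTTL maps null (ok == false) or non-positive TTL values to the
// never-expire sentinel, as described in point 2 above.
func normalizeTTL(ttl int64, ok bool) int64 {
	if !ok || ttl <= 0 {
		return neverExpire
	}
	return ttl
}

// withTTLFilter sketches the effect of AddTTLFieldFilterExpressions: the TTL
// predicate is ANDed onto whatever filter the user supplied, so rows with a
// null TTL or a TTL in the future always pass. The timestamp unit here is an
// assumption for illustration.
func withTTLFilter(userExpr, ttlField string, now time.Time) string {
	ttlExpr := fmt.Sprintf("(%s IS NULL OR %s > %d)", ttlField, ttlField, now.UnixMicro())
	if userExpr == "" {
		return ttlExpr
	}
	return fmt.Sprintf("(%s) AND %s", userExpr, ttlExpr)
}

func main() {
	fmt.Println(normalizeTTL(0, false)) // 9223372036854775807: preserved across compactions
	fmt.Println(withTTLFilter("color == \"red\"", "expire_at", time.Now()))
}
```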
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #45841
- C++ logs: emit multi-line output as a single debug line by removing
the "\n\t" separators.
- Remove some logs that add no value.
- Slow down some frequent logs, such as those in ChannelDistManager.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: logging is purely observational — this PR only
reduces, consolidates, or reformats diagnostic output (removing
per-item/noise logs, consolidating batched logs, and converting
multi-line log strings) while preserving all control flow, return
values, and state mutations across affected code paths.
- Removed / simplified logic: deleted low-value per-operation debug/info
logs (e.g., ListIndexes, GetRecoveryInfo, GcConfirm,
push-to-reorder-buffer, several streaming/wal/debug traces); replaced
per-item inline logs with single batched deferred logs in
querynodev2/delegator (logExcludeInfo) and CleanInvalid (see the sketch
after these notes); changed C++ PlanNode ToString() multi-line output to
a compact single-line bracketed format (removed "\n\t"); and added
thresholded interceptor logging (InterceptorMetrics.ShouldBeLogged) and
message-type-driven log levels to avoid verbose entries.
- Why this does NOT cause data loss or behavioral regression: no
function signatures, branching, state updates, persistence calls, or
return values were changed — examples: ListIndexes still returns the
same Status/IndexInfos; GcConfirm still constructs and returns
resp.GetGcFinished(); Insert and CleanInvalid still perform the same
insert/removal operations (only their per-item logging was aggregated);
PlanNode ToString changes only affect emitted debug strings. All error
handling and control flow paths remain intact.
- Enhancement intent: reduce log volume and improve signal-to-noise for
debugging by removing redundant, noisy logs and emitting concise,
rate-/threshold-limited summaries while preserving necessary diagnostics
and original program behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
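A minimal sketch of the "aggregate, then log once" pattern referenced
above. The CleanInvalid/logExcludeInfo names come from the PR; this
zap-based body is illustrative, not the actual delegator code.

```go
package main

import (
	"go.uber.org/zap"
)

// cleanInvalid sketches the pattern: instead of one Info line per removed
// entry, the removed keys are collected and a single batched log is emitted
// via defer when the function returns.
func cleanInvalid(logger *zap.Logger, entries map[string]bool) {
	var removed []string
	defer func() {
		if len(removed) > 0 {
			// one summary line instead of len(removed) per-item lines
			logger.Info("cleaned invalid entries",
				zap.Int("count", len(removed)),
				zap.Strings("keys", removed))
		}
	}()

	for key, valid := range entries {
		if !valid {
			delete(entries, key) // same state mutation as before; only the logging changed
			removed = append(removed, key)
		}
	}
}

func main() {
	logger, _ := zap.NewDevelopment()
	defer logger.Sync()
	cleanInvalid(logger, map[string]bool{"a": true, "b": false, "c": false})
}
```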
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: https://github.com/milvus-io/milvus/issues/44123
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: legacy in-cluster CDC/replication plumbing
(ReplicateMsg types, ReplicateID-based guards and flags) is obsolete —
the system relies on standard msgstream positions, subPos/end-ts
semantics and timetick ordering as the single source of truth for
message ordering and skipping, so replication-specific
channels/types/guards can be removed safely.
- Removed/simplified logic (what and why): removed replication feature
flags and params (ReplicateMsgChannel, TTMsgEnabled,
CollectionReplicateEnable), ReplicateMsg type and its tests, ReplicateID
constants/helpers and MergeProperties hooks, ReplicateConfig and its
propagation (streamPipeline, StreamConfig, dispatcher, target),
replicate-aware dispatcher/pipeline branches, and replicate-mode
pre-checks/timestamp-allocation in proxy tasks — these implemented a
redundant alternate “replicate-mode” pathway that duplicated
position/end-ts and timetick logic.
- Why this does NOT cause data loss or regression (concrete code paths):
no persistence or core write paths were removed — proxy PreExecute flows
(internal/proxy/task_*.go) still perform the same schema/ID/size
validations and then follow the normal non-replicate execution path;
dispatcher and pipeline continue to use position/subPos and
pullback/end-ts in Seek/grouping (pkg/mq/msgdispatcher/dispatcher.go,
internal/util/pipeline/stream_pipeline.go), so skipping and ordering
behavior remains unchanged; timetick emission in rootcoord
(sendMinDdlTsAsTt) is now ungated (no silent suppression), preserving or
increasing timetick delivery rather than removing it.
- PR type and net effect: Enhancement/Refactor — removes deprecated
replication API surface (types, helpers, config, tests) and replication
branches, simplifies public APIs and constructor signatures, and reduces
surface area for future maintenance while keeping DML/DDL persistence,
ordering, and seek semantics intact.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/46651
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Enhancement: Add Context-Aware Logging for Proxy and RootCoord Meta
Table Operations
**Core Invariant**: All changes maintain existing cache behavior and
state transition logic by purely enhancing observability through
context-aware logging without modifying control flow, return values, or
data structures.
**Logic Simplified Without Regression**:
- Removed internal helper method `getFullCollectionInfo` from MetaCache
by inlining its logic directly into GetCollectionInfo, eliminating an
unnecessary abstraction layer while preserving the exact same
cache-hit/miss and fetch-or-update paths
- This consolidation has no impact on behavior because the helper was
only called from one location and the inlined logic executes identically
**Enhanced Logging for Observability (No Behavior Changes)**:
- Added context-aware logging (log.Ctx(ctx)) to cache-miss scenarios and
timestamp comparisons in the proxy MetaCache, enabling request tracing
without altering cache lookup logic (see the sketch after these notes)
- Expanded RootCoord MetaTable's internal helper method signatures to
propagate context for contextual logging across collection lifecycle
events (begin truncate, update state, remove names/aliases, delete from
collections map), while keeping all call sites and state transitions
unchanged
- Enhanced DescribeCollection logging in proxy to capture request scope
(role, database, collection IDs, timestamp) and response schema at
operation boundaries
**Type**: Enhancement focused on improved observability. All
modifications are strictly additive logging; no data structures, caching
strategies, or core logic paths were altered.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
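A rough stand-in for the context-aware logging pattern described above.
The real helper is Milvus's log.Ctx(ctx) in pkg/log; the ctxLogger
function and trace-ID plumbing here are illustrative only.

```go
package main

import (
	"context"

	"go.uber.org/zap"
)

type traceIDKey struct{}

// ctxLogger approximates what the PR means by context-aware logging: the
// returned logger carries request-scoped fields such as a trace ID, so
// cache-miss logs can be correlated with the originating request.
func ctxLogger(ctx context.Context, base *zap.Logger) *zap.Logger {
	if traceID, ok := ctx.Value(traceIDKey{}).(string); ok {
		return base.With(zap.String("traceID", traceID))
	}
	return base
}

func getCollectionInfo(ctx context.Context, base *zap.Logger, cache map[string]int64, name string) int64 {
	if id, ok := cache[name]; ok {
		return id // cache hit: behavior unchanged, nothing extra logged
	}
	// cache miss: observability only; control flow and return value are the same
	ctxLogger(ctx, base).Info("collection cache miss, fetching from rootcoord",
		zap.String("collection", name))
	cache[name] = 42 // stand-in for the fetch-and-update path
	return cache[name]
}

func main() {
	base, _ := zap.NewDevelopment()
	ctx := context.WithValue(context.Background(), traceIDKey{}, "req-123")
	_ = getCollectionInfo(ctx, base, map[string]int64{}, "demo_collection")
}
```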
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Support creating analyzers with file resource info, and return the used
file resource IDs when validating an analyzer.
Save the related resource IDs in the collection schema. A hedged sketch
of the Go-side wiring follows the notes below.
relate: https://github.com/milvus-io/milvus/issues/43687
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: analyzer file-resource resolution is deterministic and
traceable by threading a FileResourcePathHelper (collecting used
resource IDs in a HashSet) through all tokenizer/analyzer construction
and validation paths; validate_analyzer(params, extra_info) returns the
collected Vec<i64> which is propagated through C/Rust/Go layers to
callers (CValidateResult → RustResult::from_vec_i64 → Go []int64 →
querypb.ValidateAnalyzerResponse.ResourceIds →
CollectionSchema.FileResourceIds).
- Logic removed/simplified: ad‑hoc, scattered resource-path lookups and
per-filter file helpers (e.g., read_synonyms_file and other inline
file-reading logic) were consolidated into ResourceInfo +
FileResourcePathHelper and a centralized get_resource_path(helper, ...)
API; filter/tokenizer builder APIs now accept &mut
FileResourcePathHelper so all file path resolution and ID collection use
the same path and bookkeeping logic (redundant duplicated lookups
removed).
- Why no data loss or behavior regression: changes are additive and
default-preserving — existing call sites pass extra_info = "" so
analyzer creation/validation behavior and error paths remain unchanged;
new Collection.FileResourceIds is populated from resp.ResourceIds in
validateSchema and round‑tripped through marshal/unmarshal
(model.Collection ↔ schemapb.CollectionSchema) so schema persistence
uses the new list without overwriting other schema fields; proto change
adds a repeated field (resource_ids) which is wire‑compatible (older
clients ignore extra field). Concrete code paths: analyzer creation
still uses create_analyzer (now with extra_info ""), tokenizer
validation still returns errors as before but now also returns IDs via
CValidateResult/RustResult, and rootcoord.validateSchema assigns
resp.ResourceIds → schema.FileResourceIds.
- New capability added: end‑to‑end discovery, return, and persistence of
file resource IDs used by analyzers — validate flows now return resource
IDs and the system stores them in collection schema (affects tantivy
analyzer binding, canalyzer C bindings, internal/util analyzer APIs,
querynode ValidateAnalyzer response, and rootcoord/create_collection
flow).
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
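A hedged sketch of the Go-side wiring described in these notes, using
trimmed stand-ins for querypb.ValidateAnalyzerResponse and
schemapb.CollectionSchema (only the fields relevant to this PR are
modeled).

```go
package main

import "fmt"

// ValidateAnalyzerResponse is a trimmed stand-in carrying only the resource
// IDs discovered while validating the analyzer.
type ValidateAnalyzerResponse struct {
	ResourceIds []int64
}

func (r *ValidateAnalyzerResponse) GetResourceIds() []int64 {
	if r == nil {
		return nil
	}
	return r.ResourceIds
}

// CollectionSchema is a trimmed stand-in with the new repeated field.
type CollectionSchema struct {
	Name            string
	FileResourceIds []int64
}

// validateSchema sketches the assignment done in rootcoord: the resource IDs
// returned by analyzer validation are persisted on the schema, without
// touching any other schema field.
func validateSchema(schema *CollectionSchema, resp *ValidateAnalyzerResponse) {
	schema.FileResourceIds = resp.GetResourceIds()
}

func main() {
	schema := &CollectionSchema{Name: "docs"}
	resp := &ValidateAnalyzerResponse{ResourceIds: []int64{1001, 1002}} // e.g. synonym/stopword files
	validateSchema(schema, resp)
	fmt.Println(schema.FileResourceIds) // [1001 1002]
}
```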
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
related: #45993
This commit extends nullable vector support to the proxy layer,
querynode,
and adds comprehensive validation, search reduce, and field data
handling
for nullable vectors with sparse storage.
Proxy layer changes:
- Update validate_util.go checkAligned() with getExpectedVectorRows()
helper
to validate nullable vector field alignment using valid data count
- Update checkFloatVectorFieldData/checkSparseFloatVectorFieldData for
nullable vector validation with proper row count expectations
- Add FieldDataIdxComputer in typeutil/schema.go for logical-to-physical
index translation during search reduce operations
- Update search_reduce_util.go reduceSearchResultData to use
idxComputers
for correct field data indexing with nullable vectors
- Update task.go, task_query.go, task_upsert.go for nullable vector
handling
- Update msg_pack.go with nullable vector field data processing
QueryNode layer changes:
- Update segments/result.go for nullable vector result handling
- Update segments/search_reduce.go with nullable vector offset
translation
Storage and index changes:
- Update data_codec.go and utils.go for nullable vector serialization
- Update indexcgowrapper/dataset.go and index.go for nullable vector
indexing
Utility changes:
- Add FieldDataIdxComputer struct with Compute() method for efficient
logical-to-physical index mapping across multiple field data (see the
sketch after this list)
- Update EstimateEntitySize() and AppendFieldData() with fieldIdxs
parameter
- Update funcutil.go with nullable vector support functions
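A minimal sketch of the logical-to-physical mapping behind
FieldDataIdxComputer, assuming a simplified model where a nullable
vector field physically stores only its valid rows and a validity map
drives the translation; the real struct and Compute() signature may
differ.

```go
package main

import "fmt"

// fieldIdxComputer is a simplified take on the FieldDataIdxComputer idea:
// nullable vector fields physically store only valid rows, so a logical row
// index must be translated to a physical offset via the validity map.
type fieldIdxComputer struct {
	valid  []bool // validity map, one entry per logical row
	prefix []int  // prefix[i] = number of valid rows before logical row i
}

func newFieldIdxComputer(valid []bool) *fieldIdxComputer {
	prefix := make([]int, len(valid))
	count := 0
	for i, v := range valid {
		prefix[i] = count
		if v {
			count++
		}
	}
	return &fieldIdxComputer{valid: valid, prefix: prefix}
}

// Compute returns the physical offset for a logical row, and false when the
// row is null (i.e. it has no physical vector stored).
func (c *fieldIdxComputer) Compute(logical int) (int, bool) {
	if logical < 0 || logical >= len(c.valid) || !c.valid[logical] {
		return -1, false
	}
	return c.prefix[logical], true
}

func main() {
	c := newFieldIdxComputer([]bool{true, false, true, true, false})
	for logical := 0; logical < 5; logical++ {
		phys, ok := c.Compute(logical)
		fmt.Printf("logical %d -> physical %d (valid=%v)\n", logical, phys, ok)
	}
}
```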
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
  * Full support for nullable vector fields (float, binary, float16,
    bfloat16, int8, sparse) across ingest, storage, indexing, search and
    retrieval; logical↔physical offset mapping preserves row semantics.
  * Client: compaction control and compaction-state APIs.
* **Bug Fixes**
  * Improved validation for adding vector fields (nullable + dimension
    checks) and corrected search/query behavior for nullable vectors.
* **Chores**
  * Persisted validity maps with indexes and on-disk formats.
* **Tests**
  * Extensive new and updated end-to-end nullable-vector tests.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: marcelo-cjl <marcelo.chen@zilliz.com>
AddProxyClients now removes clients not in the new snapshot before
adding new ones. This ensures proper cleanup when ProxyWatcher
re-watches etcd.
issue: https://github.com/milvus-io/milvus/issues/46397
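A minimal sketch of the remove-then-add reconciliation described above,
assuming a plain map from session ID to address; the real proxy client
manager types differ.

```go
package main

import "fmt"

// reconcileProxyClients mirrors the AddProxyClients behavior described above:
// clients whose sessions are absent from the new etcd snapshot are removed
// first, then clients for new sessions are added, so a re-watch cannot leave
// stale proxies behind.
func reconcileProxyClients(current map[int64]string, snapshot map[int64]string) {
	for id := range current {
		if _, ok := snapshot[id]; !ok {
			delete(current, id) // stale proxy: gone from the new snapshot
		}
	}
	for id, addr := range snapshot {
		if _, ok := current[id]; !ok {
			current[id] = addr // new proxy discovered by the re-watch
		}
	}
}

func main() {
	current := map[int64]string{1: "proxy-1:19530", 2: "proxy-2:19530"}
	snapshot := map[int64]string{2: "proxy-2:19530", 3: "proxy-3:19530"}
	reconcileProxyClients(current, snapshot)
	fmt.Println(current) // map[2:proxy-2:19530 3:proxy-3:19530]
}
```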
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Set the auto-appended dynamic field to be nullable with a default value
of an empty JSON object `{}`. This allows collections with a dynamic
schema to handle rows that don't have any dynamic fields more
gracefully, avoiding potential null-reference issues when the dynamic
field is not explicitly set during insert.
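A sketch of the resulting field definition, using a trimmed stand-in for
schemapb.FieldSchema; the `$meta` name and the exact default-value
encoding are assumptions for illustration.

```go
package main

import "fmt"

// FieldSchema is a trimmed stand-in with only the attributes relevant here.
type FieldSchema struct {
	Name         string
	IsDynamic    bool
	Nullable     bool
	DefaultValue string
}

// appendDynamicField shows the intended shape of the auto-appended dynamic
// field: nullable, defaulting to an empty JSON object, so rows that set no
// dynamic keys still have a well-defined value.
func appendDynamicField(fields []*FieldSchema) []*FieldSchema {
	return append(fields, &FieldSchema{
		Name:         "$meta", // conventional name of the dynamic field (assumed here)
		IsDynamic:    true,
		Nullable:     true,
		DefaultValue: "{}",
	})
}

func main() {
	fields := appendDynamicField(nil)
	fmt.Printf("%+v\n", fields[0])
}
```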
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #43897
also for issue: #46166
Add an ack_sync_up flag to the broadcast message header, indicating
whether the broadcast operation needs to be synced up between the
streaming node and the coordinator.
If ack_sync_up is false, the broadcast operation is acked as soon as the
recovery storage sees the message on the current vchannel, so the fast
ack path can be applied to speed up the broadcast operation.
If ack_sync_up is true, the broadcast operation is acked only after the
checkpoint of the current vchannel reaches the message; the fast ack
path cannot be applied, because the ack must be synced up with the
streaming node.
e.g. if the truncate collection operation wants its ack-once callback to
fire after all segments are flushed on the current vchannel, it should
set ack_sync_up to true.
TODO: the current implementation does not guarantee the ack-sync-up
semantic; it only guarantees that the FastAck operation will not be
applied. The full semantic is deferred to 3.0. Used only by the truncate
API for now.
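A minimal sketch of the two ack paths, with an illustrative stand-in for
the broadcast header; the real streaming proto and ack plumbing are more
involved.

```go
package main

import "fmt"

// BroadcastHeader is an illustrative stand-in for the broadcast message
// header carrying the new flag.
type BroadcastHeader struct {
	AckSyncUp bool
}

// ackBroadcast sketches the decision described above: with ack_sync_up ==
// false the fast-ack path fires as soon as the recovery storage has seen the
// message on the vchannel; with ack_sync_up == true the ack waits until the
// vchannel checkpoint reaches the message (e.g. truncate collection, which
// wants all segments flushed first).
func ackBroadcast(h BroadcastHeader, seenByRecoveryStorage bool, checkpointReached bool) bool {
	if !h.AckSyncUp {
		return seenByRecoveryStorage // fast ack
	}
	return checkpointReached // synced ack, no fast-ack shortcut
}

func main() {
	fmt.Println(ackBroadcast(BroadcastHeader{AckSyncUp: false}, true, false)) // true: fast ack
	fmt.Println(ackBroadcast(BroadcastHeader{AckSyncUp: true}, true, false))  // false: wait for checkpoint
}
```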
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #46277
- Update db/collection/partition disk quota metrics when the cluster
disk quota changes, since they use the cluster quota as their default
value
- Fix the incorrect label "collection" (should be "partition") in the
per-partition disk quota watcher
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
This avoids endlessly retrying CreateDatabase/DropDatabase when the
cipherPlugin fails in the new DDL framework.
See also: #45826
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
issue: #45452
- Alias/rename-related DDL should use a database-level exclusive lock.
- An alias cannot be used as the lock's resource key; use the collection
name instead.
- Transfer replica should use the WAL-based framework.
Signed-off-by: chyezh <chyezh@outlook.com>
This caused collection schema properties to be empty in datacoord
caches, making compaction and indexing unable to get properties from the
schema.
See also: #45053, #45159
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
issue: #45403, #45463
- Fix the nightly E2E failures.
- Fix the wrong timetick update when altering a collection, which caused
the related load failure.
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #45080, #45274, #45285
- LoadCollection doesn't ignore ignorable requests (false field array).
- CreateIndex doesn't ignore ignorable requests (wrong index).
- Index meta is not thread safe.
- Missing parameter checks for DDL.
- The DDL Ack scheduler may get stuck, blocking DDL until the next
incoming DDL arrives.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #43897
- Alter collection/database is now implemented by the WAL-based DDL
framework.
- AlterCollection/AlterDatabase are now supported in the WAL.
- Alter operations can now be synced by the new CDC.
- Refactor some UTs for alter DDL.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #43897
- Load/Release collection/partition is now implemented by the WAL-based
DDL framework.
- AlterLoadConfig/DropLoadConfig are now supported in the WAL.
- Load/Release operations can now be synced by the new CDC.
- Refactor some UTs for load/release DDL.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #43897
- Part of the collection/index-related DDL is now implemented by the
WAL-based DDL framework.
- Support the following message types in the WAL: CreateCollection,
DropCollection, CreatePartition, DropPartition, CreateIndex, AlterIndex,
DropIndex.
- Part of the collection/index-related DDL can now be synced by the new
CDC.
- Refactor some UTs for collection/index DDL.
- Add a Tombstone scheduler to manage tombstone GC for collection and
partition meta.
- Move vchannel allocation into the streaming pchannel manager.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
relate: https://github.com/milvus-io/milvus/issues/43687
We used to run the temporary analyzer and validate analyzers on the
proxy, but the proxy should not be a computation-heavy node. This PR
moves all analyzer calculations to the streaming node.
---------
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
issue: #43897
- Alias-related DDL is now implemented by the WAL-based DDL framework.
- Support the following message types in the WAL: AlterAlias, DropAlias.
- Alias DDL can now be synced by the new CDC.
- Refactor some UTs for alias DDL.
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #43897
- Database-related DDL is now implemented by the WAL-based DDL
framework.
- Support the following message types in the WAL: CreateDatabase,
AlterDatabase, DropDatabase.
- Database DDL can now be synced by the new CDC.
- Refactor some UTs for database DDL.
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #43897
- RBAC (Roles/Users/Privileges/Privilege Groups) is now implemented by
the WAL-based DDL framework.
- Support the following message types in the WAL: `AlterUser`,
`DropUser`, `AlterRole`, `DropRole`, `AlterUserRole`, `DropUserRole`,
`AlterPrivilege`, `DropPrivilege`, `AlterPrivilegeGroup`,
`DropPrivilegeGroup`, `RestoreRBAC`.
- RBAC can now be synced by the new CDC.
- Refactor some UTs for RBAC.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
The collection-not-found error could contain the db ID in its message,
which is not meaningful to users.
This patch makes the error message wrap the db name instead of the db
ID.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #43427
This PR's main goal is to merge #37417 into Milvus 2.5 without
conflicts.
# Main Goals
1. Create and describe collections with a geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search to display geospatial data
5. Support using GIS functions like ST_EQUALS in queries
6. Support an R-Tree index for the geometry type
# Solution
1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions. Only brute-force search is supported
for now (see the hedged sketch after this list).
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing. Check the
corresponding modifications in pymilvus.
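A hedged client-side sketch for Main Goal 5. The ST_EQUALS call and WKT
literal follow the examples in this PR, but the exact expression grammar
accepted by the parser is an assumption, and the helper below is purely
illustrative rather than a real SDK call.

```go
package main

import "fmt"

// buildGeoFilter assembles a filter expression in the style described above:
// a GIS predicate over a geometry column and a WKT literal.
func buildGeoFilter(field, wkt string) string {
	return fmt.Sprintf("ST_EQUALS(%s, '%s')", field, wkt)
}

func main() {
	// Client <-> proxy traffic uses WKT; the proxy converts to WKB downstream.
	expr := buildGeoFilter("geo", "POINT (30 10)")
	fmt.Println(expr) // ST_EQUALS(geo, 'POINT (30 10)')
}
```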
---------
Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>
issue: #43897
- Return LastConfirmedMessageID from the WAL append operation.
- Add a resource-key-based locker for the broadcast-ack operation to
protect the coordinator state when executing DDL.
- The resource-key-based locker is held until the broadcast operation is
acked.
- ResourceKey supports shared and exclusive locks (see the sketch after
this list).
- Add FastAck, which executes the ack right away after the broadcast is
done, to speed up DDL.
- The ack callback now supports the broadcast message result.
- Add a tombstone for the broadcaster to avoid repeatedly committing DDL
and the ABA issue.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: https://github.com/milvus-io/milvus/issues/27467
>My plan is as follows.
>- [x] M1 Create collection with timestamptz field
>- [x] M2 Insert timestamptz field data
>- [x] M3 Retrieve timestamptz field data
>- [x] M4 Implement handoff
>- [x] M5 Implement compare operator
>- [x] M6 Implement extract operator
>- [x] M8 Support database/collection level default timezone
>- [x] M7 Support STL-SORT index for datatype timestamptz
---
The third PR of issue: https://github.com/milvus-io/milvus/issues/27467,
which completes M5, M6, M7, M8 described above.
## M8 Default Timezone
We will be able to use alter_collection() and alter_database() in a
future Python SDK release to modify the default timezone at the
collection or database level.
For insert requests, the timezone will be resolved using the following
order of precedence: String Literal -> Collection Default -> Database
Default.
For retrieval requests, the timezone will be resolved in this order:
Query Parameters -> Collection Default -> Database Default.
In both cases, the final fallback timezone is UTC.
## M5: Comparison Operators
We can now use the following expression format to filter on the
timestamptz field (a hedged usage sketch follows this list):
- `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op}
ISO 'iso_string' `
- The interval_string follows the ISO 8601 duration format, for example:
P1Y2M3DT1H2M3S.
- The iso_string follows the ISO 8601 timestamp format, for example:
2025-01-03T00:00:00+08:00.
- Example expressions: "tsz + INTERVAL 'P0D' != ISO
'2025-01-03T00:00:00+08:00'" or "tsz != ISO
'2025-01-03T00:00:00+08:00'".
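A hedged sketch that composes the two expression shapes above; only the
grammar (INTERVAL / ISO literals) comes from this PR, the helper itself
is illustrative.

```go
package main

import "fmt"

// tszFilter builds the two expression shapes shown above: with and without an
// ISO 8601 interval offset on the timestamptz field.
func tszFilter(field, interval, op, iso string) string {
	if interval != "" {
		return fmt.Sprintf("%s + INTERVAL '%s' %s ISO '%s'", field, interval, op, iso)
	}
	return fmt.Sprintf("%s %s ISO '%s'", field, op, iso)
}

func main() {
	fmt.Println(tszFilter("tsz", "P1Y2M3DT1H2M3S", ">=", "2025-01-03T00:00:00+08:00"))
	fmt.Println(tszFilter("tsz", "", "!=", "2025-01-03T00:00:00+08:00"))
}
```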
## M6: Extract
We will be able to extract specific time fields via kwargs in a future
Python SDK release.
The key is `time_fields`, and the value should be one or more of "year,
month, day, hour, minute, second, microsecond", separated by comma or
space. The result for each record would then be an array of int64.
## M7: Indexing Support
Expressions without interval arithmetic can be accelerated using an
STL-SORT index. However, expressions that include interval arithmetic
cannot be indexed. This is because the result of an interval calculation
depends on the specific timestamp value. For example, adding one month
to a date in February results in a different number of added days than
adding one month to a date in March.
---
After this PR, the input/output type of timestamptz is an ISO string.
Timestamptz is stored as timestamptz data, which is ultimately an
int64_t.
> for more information, see https://en.wikipedia.org/wiki/ISO_8601
---------
Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
issue: #44429
Fix the issue where dynamic modification of collection replica count
doesn't take effect due to incorrect parameter usage in load config
parsing.
Changes include:
- Replace DatabaseLevelReplicaNumber with CollectionLevelReplicaNumber
- Replace DatabaseLevelResourceGroups with CollectionLevelResourceGroups
- Update test cases to use correct collection-level constants
- Ensure dynamic replica changes are properly applied to collections
Signed-off-by: Wei Liu <wei.liu@zilliz.com>