diff --git a/docs/design_docs/entitiy_level_ttl.md b/docs/design_docs/entitiy_level_ttl.md
new file mode 100644
index 0000000000..3992ddc160
--- /dev/null
+++ b/docs/design_docs/entitiy_level_ttl.md
@@ -0,0 +1,298 @@

# Entity-level TTL Design

## Background

Currently, Milvus supports **collection-level TTL** for data expiration, but does not support defining an independent expiration time for individual entities (rows). Application scenarios are becoming more diverse, for example:

* Data from different tenants or businesses stored in the same collection but with different lifecycles;
* Hot and cold data mixed together, where short-lived data should be cleaned up automatically while long-term data is retained;
* IoT / logging / MLOps data that requires record-level retention policies.

Collection-level TTL alone can no longer satisfy these requirements. If users want to retain only part of the data, they must periodically perform **upsert** operations to refresh the timestamps of those entities. This approach is unintuitive and increases operational and maintenance costs.

Therefore, **entity-level TTL** becomes a necessary feature.

Related issues:

* [milvus-io/milvus#45917](https://github.com/milvus-io/milvus/issues/45917)
* [milvus-io/milvus#45923](https://github.com/milvus-io/milvus/issues/45923)

---

## Design Principles

* Fully compatible with existing collection-level TTL behavior.
* Allow users to choose whether to enable entity-level TTL.
* User-controllable: support explicit declaration in the schema or transparent system management.
* Minimize changes to compaction and query logic; expiration is determined only by the TTL column and the write timestamp.
* Support dynamic upgrade of existing collections.

---

## Basic Approach

Milvus already supports the `TIMESTAMPTZ` data type. Entity TTL information is therefore stored in a field of this type.

---

## Design Details

Entity-level TTL is implemented by allowing users to explicitly add a `TIMESTAMPTZ` column to the schema and mark it in the collection properties:

```text
"collection.ttl.field": "ttl"
```

Here, `ttl` is the name of the column that stores TTL information. This mechanism is **mutually exclusive** with collection-level TTL.

---

### Terminology and Conventions

* **TTL column / TTL field**: A field of type `TIMESTAMPTZ` declared in the schema and marked with `is_ttl = true`.
* **ExpireAt**: The value stored in the TTL field, representing the absolute expiration timestamp of an entity (UTC by default if no timezone is specified).
* **Collection-level TTL**: The existing mechanism where a retention duration is defined at the collection level (e.g., retain for 30 days).
* **insert_ts / mvcc_ts**: Existing Milvus write / MVCC timestamps, used as a fallback when needed.
* **expirationTimeByPercentile**: The time points at which given percentiles of a segment's data have expired, used to quickly determine whether compaction should be triggered. For example:
  * 20% of the data expires by time `t1`;
  * 40% of the data expires by time `t2`.

---

### 1. Collection Properties and Constraints

* Only fields with `DataType == TIMESTAMPTZ` can be configured as the TTL field.
* Mutually exclusive with collection-level TTL:
  * If collection-level TTL is enabled, specifying a TTL field is not allowed.
  * Collection-level TTL must be disabled first.
* One TTL field per collection:
  * A collection may contain multiple `TIMESTAMPTZ` fields, but only one can be designated as the TTL field.

---

### 2. Storage Semantics

* Unified convention: the TTL field stores an **absolute expiration time** (`ExpireAt`).
* Duration-based TTL is not supported.
* `NULL` value semantics: a `NULL` TTL value means the entity never expires.

---

### 3. Compatibility Rules

#### Existing Collections

To enable entity-level TTL on an existing collection:

1. Disable collection-level TTL using `AlterCollection`.
2. Add a new `TIMESTAMPTZ` field using `AddField`.
3. Update the collection properties via `AlterCollection` to mark the new field as the TTL field.

If historical data should also have expiration times, users must perform an **upsert** operation to backfill the TTL field.
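The steps above map to PyMilvus roughly as follows. This is a minimal sketch, assuming `client` is a connected `MilvusClient`, `collection_name` is an existing collection, and that `alter_collection_properties` / `add_collection_field` are the client-side counterparts of `AlterCollection` / `AddField`; the property values shown are illustrative:

```python
from pymilvus import DataType

# 1. Disable collection-level TTL (mutually exclusive with the TTL field).
client.alter_collection_properties(
    collection_name,
    properties={"collection.ttl.seconds": 0},
)

# 2. Add a nullable TIMESTAMPTZ field; NULL means "never expires",
#    so existing rows stay valid until they are backfilled.
client.add_collection_field(
    collection_name,
    field_name="ttl",
    data_type=DataType.TIMESTAMPTZ,
    nullable=True,
)

# 3. Mark the new field as the TTL field.
client.alter_collection_properties(
    collection_name,
    properties={"collection.ttl.field": "ttl"},
)

# Optional: backfill expiration times for historical entities via upsert,
# re-supplying their full rows with the desired "ttl" value.
```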
---

### 4. SegmentInfo Extension and Compaction Trigger

#### 4.1 SegmentInfo Metadata Extension

A new field `expirationTimeByPercentile` is added to the segment metadata:

```proto
message SegmentInfo {
  int64 ID = 1;
  int64 collectionID = 2;
  int64 partitionID = 3;
  string insert_channel = 4;
  int64 num_of_rows = 5;
  common.SegmentState state = 6;
  int64 max_row_num = 7 [deprecated = true]; // deprecated: the binary size, not an estimated row count, controls the segment size
  uint64 last_expire_time = 8;
  msg.MsgPosition start_position = 9;
  msg.MsgPosition dml_position = 10;
  // binlogs consist of insert binlogs
  repeated FieldBinlog binlogs = 11;
  repeated FieldBinlog statslogs = 12;
  // deltalogs consist of delete binlogs. FieldID is not used yet since delete is always applied on the primary key
  repeated FieldBinlog deltalogs = 13;
  bool createdByCompaction = 14;
  repeated int64 compactionFrom = 15;
  uint64 dropped_at = 16; // timestamp when the segment was marked dropped
  // A flag indicating that:
  // (1) this segment was created by bulk insert, and
  // (2) the bulk insert task that created it has not yet reached the `ImportCompleted` state.
  bool is_importing = 17;
  bool is_fake = 18;

  // Denotes whether this segment has been compacted into another segment.
  // For compatibility reasons, this flag may still be false on an old compacted segment.
  // New fields added to the message are populated with their field types' default values.
  bool compacted = 19;

  // Segment level, indicating the compaction segment level.
  // Available values: Legacy, L0, L1, L2.
  // Legacy represents old segments created before segment levels were introduced,
  // so segments with the Legacy level shall be treated as L1 segments.
  SegmentLevel level = 20;
  int64 storage_version = 21;

  int64 partition_stats_version = 22;
  // Used in major compaction; if compaction fails, the segment level should be reverted to the last value.
  SegmentLevel last_level = 23;
  // Used in major compaction; if compaction fails, the partition stats version should be reverted to the last value.
  int64 last_partition_stats_version = 24;

  // Indicates whether the segment is sorted by primary key.
  bool is_sorted = 25;

  // textStatsLogs records the tokenization index for fields.
  map<int64, TextIndexStats> textStatsLogs = 26;
  repeated FieldBinlog bm25statslogs = 27;

  // Indicates that some intermediate-state segments should not be loaded,
  // e.g., segments that have been clustered but have not undergone stats yet.
  bool is_invisible = 28;

  // jsonKeyStats records the JSON key index for fields.
  map<int64, JsonKeyStats> jsonKeyStats = 29;
  // Indicates that the segment was created by the streaming service.
  // Meaningful only when the segment state is growing.
  // True if the segment was created by the streaming service; false for segments
  // generated by the datacoord of the old architecture. Once growing segments are
  // fully managed by the streaming node, the true value will never be seen at the coordinator.
  bool is_created_by_streaming = 30;
  bool is_partition_key_sorted = 31;

  // manifest_path stores the full path of the LOON manifest file for the segment's data files.
  // Keeping the full path is safe since one segment has only one active manifest,
  // and it preserves the possibility of storing the manifest outside the
  // collection/partition/segment path.
  string manifest_path = 32;

  // expirationTimeByPercentile records the expiration timestamps of the segment
  // at the 20%, 40%, 60%, 80%, and 100% data distribution levels.
  repeated int64 expirationTimeByPercentile = 33;
}
```

Meaning:

* `expirationTimeByPercentile`: the expiration timestamps corresponding to the 20%, 40%, 60%, 80%, and 100% percentiles of the data within the segment.

---

#### 4.2 Metadata Writing

* Statistics are collected **only during compaction**.
* `expirationTimeByPercentile` is computed during sort or mix compaction tasks.
* For streaming segments, sort compaction is required as the first step, which makes this approach sufficient.
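For illustration, a sketch of how the percentile statistics could be derived while a compaction task rewrites a segment. The function name is hypothetical and the TTL values are assumed to be epoch timestamps, with `None` standing in for a `NULL` (never-expiring) TTL; the real computation lives in the compaction path:

```python
def expiration_time_by_percentile(ttl_values):
    """ttl_values: per-row expiration timestamps; None = never expires."""
    # NULL TTL values never expire and contribute no expiration statistics.
    expiring = sorted(v for v in ttl_values if v is not None)
    if not expiring:
        return []  # no expiring rows: the segment is treated as non-expiring
    n = len(expiring)
    # Timestamp by which 20% / 40% / 60% / 80% / 100% of the expiring rows
    # have expired, i.e. the value at each percentile boundary.
    return [expiring[max(0, (n * p) // 100 - 1)] for p in (20, 40, 60, 80, 100)]
```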
---

#### 4.3 Compaction Trigger Strategy

* Based on a configured expired-data ratio, select the corresponding percentile from `expirationTimeByPercentile` (rounded down).
* Compare the selected expiration time with the current time.
* If the expiration condition is met, trigger a compaction task.

Special cases:

* If `expirationTimeByPercentile` is not populated, the segment is treated as non-expiring.
* For old segments without a TTL field, the expiration logic is skipped.
* Subsequent upsert operations will trigger the corresponding L0 compaction.
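A sketch of the trigger check under these rules. The `expired_ratio` parameter stands in for the configured expired-data ratio (its actual config key is not specified here), and timestamps are again assumed to be epoch seconds:

```python
import time

def should_trigger_ttl_compaction(expiration_time_by_percentile, expired_ratio, now=None):
    """Return True if at least `expired_ratio` of the segment has expired."""
    if not expiration_time_by_percentile:
        return False  # no stats recorded: segment is treated as non-expiring
    now = int(time.time()) if now is None else now
    # Round the configured ratio down to the nearest recorded percentile
    # (20/40/60/80/100), then compare that expiration time with now.
    idx = max(0, int(expired_ratio * 100) // 20 - 1)
    return expiration_time_by_percentile[idx] <= now
```

For example, with `expired_ratio = 0.5` the check rounds down to the 40% percentile: compaction triggers once 40% of the segment's rows are past their `ExpireAt`.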
---

### 5. Query / Search Logic

* Each query executes with an MVCC timestamp assigned by Milvus.
* When loading a collection, the system records which field is configured as the TTL field.

During query execution, expired data is filtered out by comparing the TTL field value with the MVCC timestamp inside `mask_with_timestamps`.

---

## PyMilvus Example

### 1. Create Collection

Specify the TTL field in the schema:

```python
import random

from pymilvus import DataType, MilvusClient

# Example setup; adjust the URI and parameters for your deployment.
client = MilvusClient(uri="http://localhost:19530")
collection_name = "entity_ttl_demo"
dim = 8

schema = client.create_schema(auto_id=False, description="test entity ttl")
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("ttl", DataType.TIMESTAMPTZ, nullable=True)
schema.add_field("vector", DataType.FLOAT_VECTOR, dim=dim)

prop = {"collection.ttl.field": "ttl"}
client.create_collection(
    collection_name,
    schema=schema,
    enable_dynamic_field=True,
    properties=prop,
)
```

---

### 2. Insert Data

Insert data the same way as for a normal `TIMESTAMPTZ` field:

```python
rows = [
    {"id": 0, "vector": [random.random() for _ in range(dim)], "ttl": None},
    {"id": 1, "vector": [random.random() for _ in range(dim)], "ttl": None},
    {"id": 2, "vector": [random.random() for _ in range(dim)], "ttl": None},
    {"id": 3, "vector": [random.random() for _ in range(dim)], "ttl": "2025-12-31T00:00:00Z"},
    {"id": 4, "vector": [random.random() for _ in range(dim)], "ttl": "2025-12-31T01:00:00Z"},
    {"id": 5, "vector": [random.random() for _ in range(dim)], "ttl": "2025-12-31T02:00:00Z"},
    {"id": 6, "vector": [random.random() for _ in range(dim)], "ttl": "2025-12-31T03:00:00Z"},
    {"id": 7, "vector": [random.random() for _ in range(dim)], "ttl": "2025-12-31T04:00:00Z"},
    {"id": 8, "vector": [random.random() for _ in range(dim)], "ttl": "2025-12-31T23:59:59Z"},
]

insert_result = client.insert(collection_name, rows)
client.flush(collection_name)
```

---

### 3. Index and Load

Index creation and loading are unaffected. Indexes can still be built on the TTL field if needed.

```python
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_type="IVF_FLAT",
    index_name="vector",
    metric_type="L2",
    params={"nlist": 128},
)
client.create_index(collection_name, index_params=index_params)

client.load_collection(collection_name)
```

---

### 4. Search / Query

Use queries at different timestamps to validate expiration behavior:

```python
query_expr = "id > 0"
res = client.query(
    collection_name,
    query_expr,
    output_fields=["id", "ttl"],
    limit=100,
)
print(res)
```
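For example, expiration can be validated end to end with the data inserted above. This is a sketch; exact visibility timing depends on the query's MVCC timestamp relative to each row's `ExpireAt`:

```python
# Before any "ttl" has passed, all inserted entities are visible.
res_before = client.query(collection_name, "id >= 0", output_fields=["id", "ttl"])
print("visible ids:", sorted(r["id"] for r in res_before))

# After the ExpireAt values above have passed (2025-12-31 in this data set),
# the same query no longer returns entities whose "ttl" has elapsed, while
# NULL-ttl entities (ids 0-2) remain visible indefinitely.
res_after = client.query(collection_name, "id >= 0", output_fields=["id", "ttl"])
print("visible ids:", sorted(r["id"] for r in res_after))
```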