From 5227db3fb47356c0cbb242b3e650946e2bf7d073 Mon Sep 17 00:00:00 2001 From: Tumao Date: Thu, 11 Nov 2021 13:19:14 +0800 Subject: [PATCH] [skip ci]Improve milvus_timesync_en.md (#11623) Signed-off-by: tumao --- docs/design_docs/milvus_timesync_en.md | 32 +++++++++++++++----------- 1 file changed, 19 insertions(+), 13 deletions(-) diff --git a/docs/design_docs/milvus_timesync_en.md b/docs/design_docs/milvus_timesync_en.md index 28f9ab7c2b..7abb2abff3 100644 --- a/docs/design_docs/milvus_timesync_en.md +++ b/docs/design_docs/milvus_timesync_en.md @@ -3,6 +3,7 @@ `Time Synchronization` is the kernel part of Milvus 2.0; it affects all components of the system. This document describes the detailed design of `Time Synchronization`. There are 2 kinds of events in Milvus 2.0: + - DDL events - create collection - drop collection @@ -17,20 +18,20 @@ All events have a `Timestamp` to indicate when this event occurs. Suppose there are two users, `u1` and `u2`. They connect to Milvus and do the following operations at the respective timestamps. -| ts | u1 | u2 | -|-----------|----------------------|--------------| -| t0 | create Collection C0 | - | -| t2 | - | search on C0 | -| t5 | insert A1 into C0 | - | -| t7 | - | search on C0 | -| t10 | insert A2 | - | -| t12 | - | search on C0 | -| t15 | delete A1 from C0 | - | -| t17 | - | search on C0 | +| ts | u1 | u2 | +| --- | -------------------- | ------------ | +| t0 | create Collection C0 | - | +| t2 | - | search on C0 | +| t5 | insert A1 into C0 | - | +| t7 | - | search on C0 | +| t10 | insert A2 | - | +| t12 | - | search on C0 | +| t15 | delete A1 from C0 | - | +| t17 | - | search on C0 | Ideally, `u2` expects `C0` to be empty at `t2`, and could only see `A1` at `t7`; while `u2` could see both `A1` and `A2` at `t12`, but only see `A2` at `t17`. -It's easy to achieve this in a `single-node` database. But for a `Distributed System`, such like `Milvus`, it's a little difficult; the following problems need to be solved. +It's easy to achieve this in a `single-node` database. But for a `Distributed System`, such as `Milvus`, it's a little difficult; the following problems need to be solved. 1. If `u1` and `u2` are on different nodes, and their time clock is not synchronized. To give an extreme example, suppose that the time of `u2` is 24 hours later than `u1`, then all the operations of `u1` can't been seen by `u2` until next day. 2. Network latency. If `u2` starts the `Search on C0` at `t17`, then how can it be guaranteed that all the `events` before `t17` have been processed? If the events of `delete A1 from C0` has been delayed due to the network latency, then it would lead to incorrect state: `u2` would see both `A1` and `A2` at `t17`. @@ -47,7 +48,7 @@ Like [TiKV](https://github.com/tikv/tikv), Milvus 2.0 provides `TSO` service. Al service RootCoord { ... rpc AllocTimestamp(AllocTimestampRequest) returns (AllocTimestampResponse) {} - ... + ... } message AllocTimestampRequest { @@ -61,6 +62,7 @@ message AllocTimestampResponse { uint32 count = 3; } ``` + `Timestamp` is of type `uint64`, containing physical and logical parts. This is the format of `Timestamp` @@ -70,8 +72,10 @@ This is the format of `Timestamp` In an `AllocTimestamp` request, if `AllocTimestampRequest.count` is greater than `1`, `AllocTimestampResponse.timestamp` indicates the first available timestamp in the response. ## Time Synchronization + To understand the `Time Synchronization` better, let's introduce the data operation of Milvus 2.0 briefly. Taking `Insert Operation` as an example. + - User can configure lots of `Proxy` to achieve load balancing, in `Milvus 2.0` - User can use `SDK` to connect to any `Proxy` - When `Proxy` receives `Insert` Request from `SDK`, it splits `InsertMsg` into different `MsgStream` according to the hash value of `Primary Key` @@ -82,6 +86,7 @@ Taking `Insert Operation` as an example. ![proxy insert](./graphs/timesync_proxy_insert_msg.png) Based on the above information, we know that the `MsgStream` has the following characteristics: + - In `MsgStream`, `InsertMsg` from the same `Proxy` must be incremented in timestamp - In `MsgStream`, `InsertMsg` from different `Proxy` have no relationship in timestamp @@ -96,6 +101,7 @@ So the second problem has turned into this: after reading a message from `MsgStr For example, when reading a message with timestamp `110` produced by `Proxy2`, but the message with timestamp `80` produced by `Proxy1`, is still in the `MsgStream`. How can this situation be handled? The following graph shows the core logic of `Time Synchronization System` in `Milvus 2.0`; it should solve the second problem. + - Each `Proxy` will periodically report its latest timestamp of every `MsgStream` to `RootCoord`; the default interval is `200ms` - For each `Msgstream`, `Rootcoord` finds the minimum timestamp of all `Proxy` on this `Msgstream`, and inserts this minimum timestamp into the `Msgstream` - When the consumer reads the timestamp inserted by the `RootCoord` on the `MsgStream`, it indicates that all messages with smaller timestamp have been consumed, so all actions that depend on this timestamp can be executed safely @@ -109,7 +115,7 @@ This is the `Proto` that used by `Proxy` to report timestamp to `RootCoord`: service RootCoord { ... rpc UpdateChannelTimeTick(internal.ChannelTimeTickMsg) returns (common.Status) {} - ... + ... } message ChannelTimeTickMsg {