Add documentation for grwing segment (#9177)

Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
This commit is contained in:
Xiaofan 2021-10-04 08:28:25 +08:00 committed by GitHub
parent 0ab1f597ae
commit 6b9d580913
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -0,0 +1,21 @@
# SegmentGrowing
Segmentgrowing has the following additional interfaces:
1. `PreInsert(size) -> reseveredOffset`: serial interface, which reserves space for future insertion and returns the insertion `reseveredOffset`.
2. `Insert(reseveredOffset, size, ...Data...)`: write `...Data...` into range `[reseveredOffset, reseveredOffset + size)`. this interface is allowed to be called concurrently.
1. `...Data...` contains row_ids, timestamps two system attributes, and other columns
2. data column can be stored either row based or columne based.
3. `PreDelete & Delete(reseveredOffset, row_ids, timestamps)` is the delete interface similar to intert interface.
Growing segment stores data in the form of chunk. The number of rows in each chunk is restricted by configs,
Rows per segment is controlled by parameters `size_per_Chunk ` control
When insert, first allocate enough space to ensure `total_size <= num_chunk * size_per_chunk`, and then convert data in row format to column format.
During search, each 'chunk' will be searched, and the search results will be saved as 'subquery result', then merged.
GrGowing Segment also implements small batch index for vectors. The parameters of small batch index are preset in 'segcoreconfig'
When `metric type ` is specified in the schema, the default parameters will build index for each chunk to accelerate query