mirror of
https://gitee.com/milvus-io/milvus.git
synced 2025-12-28 22:45:26 +08:00
issue: https://github.com/milvus-io/milvus/issues/44399 This PR implements STL_SORT for VARCHAR data type for both RAM and MMAP mode. The general idea is that we deduplicate field values and maintains a posting list for each unique value. The serialization format of the index is: ``` [unique_count][string_offsets][string_data][post_list_offsets][post_list_data][magic_code] string_offsets: array of offsets into string_data section string_data: str_len1, str1, str_len2, str2, ... post_list_offsets: array of offsets into post_list_data section post_list_data: post_list_len1, row_id1, row_id2, ..., post_list_len2, row_id1, row_id2, ... ``` --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>