issue: #33744
This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #33005
1. add `MemorySize` field for insert binlog.
2. `LogSize` means the file size in the storage object.
3. `MemorySize` means the size of the data in the memory.
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
issue: #32206, #32801
- search failure with some assertion, segment not loaded and resource
insufficient.
- segment leak when query segments
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue:#32899
This PR fix the wrong metric value of load index, which introduced by
pr#32567, use wrong time unit for load index metrics
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #32530
when try to match segment bloom filter with pk, we can reuse the hash
locations. This PR maintain the max hash Func, and compute hash location
once for all segment, reuse hash location can speed up bf access
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #32748
This PR:
- Add `metautil.Channel` utiltiy which convert virtual name to physical
channel name, collectionID and shard idx
- Add channel mapper interface & implementation to convert limited
physical channel name into int index
- Apply `metautil.Channel` filter in querynode segment manager logic
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #32663
- Use new param to control request resource timeout for lazy load.
- Remove the timeout parameter of `Do`, remove `DoWait`. use `context`
to control the timeout.
- Use `VersionedNotifier` to avoid notify event lost and broadcast,
remove the redundant goroutine in cache.
related dev pr: #32684
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #30361
- Delete may be lost when segment is not data-loaded status in lru
cache. skip filtering to fix it.
- `stats_` and `variable_fields_avg_size_` should be reset when
`ReleaseData`
- Remove repeat load delta log operation in lru.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
related: #31959
1. reset segment index status after evicting to lazyload=true
2. reset num_rows to null_opt
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
issue: #30931
- move resource estimate function outside from segment loader.
- add load info and collection to base segment.
- add resource usage method for sealed segment.
Signed-off-by: chyezh <chyezh@outlook.com>
This PR add metrics for load segment progress:
1. add metrics for load segment/index concurrency
2. add metrics for load index latency
3. change load segment latency's time unit to ms
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Segment load memory usage is underestimated due to removing the load
memroy factor. This PR adds it back to protect querynode OOM during some
extreme memory cases.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #30950
due to segment version doesn't update as expected.
This PR will update segment version until segment become loaded
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #30756
This PR:
- Request disk resource when index type, version loaded with disk
- Add attribute cache for index utility
- Add `typeutil.Pair`
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #30191
It turned out that in auto id and batch delete scenario actual memory
size of deltalog maybe way larger than deltalog file size. This PR add a
configurable expansion rate for deltalog memory usage to prevent
out-of-memory panicking during loading deltalogs.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
the old version Knowhere would copy the index data while loading, we
need to consider this to avoid OOM.
Knowhere provides a util function to indicate whether it will load the
index with disk, if not, we need to double the memory usage prediction
for index data
Signed-off-by: yah01 <yang.cen@zilliz.com>
Related to #30191
When loading segment, segment loader shall check memory usage for
current loading task. Previously l0 segment was ignored but level zero
segment may actually cost lots of memory.
This PR adds back memory resource check for Level zero segment loading.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Allows proactive warming up of chunk cache. Original vector data will be
asynchronously loaded into the chunk cache during the load process. It
has the potential to significantly reduce query/search latency for a
certain duration after the load, albeit with a concurrent increase in
disk usage.
issue: https://github.com/milvus-io/milvus/issues/30181
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
don't store logPath in meta to reduce memory, when service get
segmentinfo, generate logpath from logid.
#28885
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>