1. Use blocking memory allocation to wait until memory becomes available
2. Perform memory allocation at the file level instead of per task
3. Limit Parquet file reader batch size to prevent excessive memory
consumption
4. Limit import buffer size from 20% to 10% of total memory
issue: https://github.com/milvus-io/milvus/issues/43387,
https://github.com/milvus-io/milvus/issues/43131
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
- Introduce dynamic buffer sizing to avoid generating small binlogs
during import
- Refactor import slot calculation based on CPU and memory constraints
- Implement dynamic pool sizing for sync manager and import tasks
according to CPU core count
issue: https://github.com/milvus-io/milvus/issues/43131
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
1. Optimize the import process: skip subsequent steps and mark the task
as complete if the number of imported rows is 0.
2. Improve import integration tests:
a. Add a test to verify that autoIDs are not duplicated
b. Add a test for the corner case where all data is deleted
c. Shorten test execution time
3. Enhance import logging:
a. Print imported segment information upon completion
b. Include file name in failure logs
issue: https://github.com/milvus-io/milvus/issues/42488,
https://github.com/milvus-io/milvus/issues/42518
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Increase insert buffer size from 16MB to 64MB, while keeping delete
buffer size at 16MB.
issue: https://github.com/milvus-io/milvus/issues/42518
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Remove the unlimited logID mechanism and switch to redundantly
allocating a large number of IDs.
issue: https://github.com/milvus-io/milvus/issues/42518
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
1. Optimize import scheduling strategic:
a. Revise slot weights, calculating them based on the number of files
and segments for both import and pre-import tasks.
b. Ensure that the DN executes tasks in ascending order of task ID.
2. Add time cost metric and log.
issue: https://github.com/milvus-io/milvus/issues/36600,
https://github.com/milvus-io/milvus/issues/36518
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
1. Move the common modules of streamingNode and dataNode to flushcommon
2. Add new GetVChannels interface for rootcoord
issue: https://github.com/milvus-io/milvus/issues/33285
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
This PR removes the dependency of compaction on the ID allocator by
pre-allocating the logID and segmentID.
issue: https://github.com/milvus-io/milvus/issues/33957
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
The import is dependent on syncTask, which in turn relies on the
allocator. This PR pre-allocate the necessary IDs for import syncTask.
issue: https://github.com/milvus-io/milvus/issues/33957
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
The data coordinator computed the appropriate number of import segments,
thus when importing in the data node, one can randomly select a segment.
issue: https://github.com/milvus-io/milvus/issues/33604
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>