9 Commits

Author SHA1 Message Date
Cai Yudong
dc89c6f810
enhance: remove duplicated data generation APIs for bulk insert test (#32889)
Issue: #22837

including following changes:
1. Add API CreateInsertData() and BuildArrayData() in
internal/util/testutil
2. Remove duplicated test APIs from importutilv2 unittest and bulk
insert integration test

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-05-10 15:27:31 +08:00
Cai Yudong
bcdbd1966e
feat: Support sparse float vector bulk insert for binlog/json/parquet (#32649)
Issue: #22837

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-05-07 18:43:30 +08:00
chyezh
2586c2f1b3
enhance: use WalkWithPrefix api for oss, enable piplined file gc (#31740)
issue: #19095,#29655,#31718

- Change `ListWithPrefix` to `WalkWithPrefix` of OOS into a pipeline
mode.

- File garbage collection is performed in other goroutine.

- Segment Index Recycle clean index file too.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-25 20:41:27 +08:00
Cai Yudong
5fc439c600
feat: Bulk insert support fp16/bf16 (#32157)
Issue: #22837

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-04-22 10:05:22 +08:00
yihao.dai
4e264003bf
enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629)
Feature Introduced:
1. Ensure ImportV2 waits for the index to be built

Enhancements Introduced:
1. Utilization of local time for timeout ts instead of allocating ts
from rootcoord.
3. Enhanced input file length check for binlog import.
4. Removal of duplicated manager in datanode.
5. Renaming of executor to scheduler in datanode.
6. Utilization of a thread pool in the scheduler in datanode.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-01 20:09:13 +08:00
yihao.dai
31cf849f68
enhance: Support retriving file size from importutilv2.Reader (#31533)
To reduce the overhead caused by listing the S3 objects, add an
interface to importutil.Reader to retrieve file sizes.

issue: https://github.com/milvus-io/milvus/issues/31532,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-25 20:29:07 +08:00
yihao.dai
87b3c25b15
fix: Fix binlog import (#31205)
1. File type validation is omitted during binlog import.
2. System fields are appended to the schema during binlog import.

issue: https://github.com/milvus-io/milvus/issues/28521

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-13 10:35:04 +08:00
yihao.dai
c5918290e6
feat: Add import executor and manager for datanode (#29438)
This PR introduces novel importv2 roles for datanode:
1. Executor: To execute tasks, a import task will be divided into the
following steps: read data -> hash data -> sync data;
2. Manager: To manage all the tasks;

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-31 20:45:04 +08:00
yihao.dai
3561586edf
feat: Add import reader for binlog (#28910)
This PR defines the new import reader interfaces and implement a binlog
reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-05 11:48:47 +08:00