40 Commits

Author SHA1 Message Date
yihao.dai
5e525eb3bf
enhance: Retry reads from object storage on rate limit error (#46455)
This PR improves the robustness of object storage operations by retrying
explicit throttling errors (e.g. HTTP 429, SlowDown, ServerBusy).
These errors commonly occur under high concurrency and are typically
recoverable with bounded retries.

issue: https://github.com/milvus-io/milvus/issues/44772

## Summary by CodeRabbit

* **New Features**
  * Configurable retry support for reads from object storage and improved
    mapping of transient/rate-limit errors.
  * Added a retryable reader wrapper used by CSV/JSON/Parquet/Numpy import
    paths.

* **Configuration**
  * New parameter to control storage read retry attempts.

* **Tests**
  * Expanded unit tests covering error mapping and retry behaviors across
    storage backends.
  * Standardized mock readers and test initialization to simplify test
    setups.


---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-12-23 11:03:18 +08:00
groot
a545ebc702
fix: Fix a bug that bulkimport cannot handle empty struct list (#45693)
issue: https://github.com/milvus-io/milvus/issues/42148

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-11-25 17:21:06 +08:00
zhenshan.cao
a3b8bcb198
fix: correct default value backfill during AddField (#45634)
issue: https://github.com/milvus-io/milvus/issues/44585

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-18 23:05:42 +08:00
zhenshan.cao
490a618c30
fix: Handle timestamptz import errors (#45287)
issue: https://github.com/milvus-io/milvus/issues/44585

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-05 15:05:33 +08:00
cai.zhang
c33d221536
fix: Fix bug for importing Geometry data (#45089)
issue: #44787, #45012

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-10-27 20:34:11 +08:00
Spade A
d8591f9548
fix: csv/json import with STRUCT adapts concatenated struct name (#45000)
After https://github.com/milvus-io/milvus/pull/44557, the name of a field
inside a STRUCT field becomes STRUCT_NAME[FIELD_NAME].
This PR makes import account for that change.

issue: https://github.com/milvus-io/milvus/issues/45006
ref: https://github.com/milvus-io/milvus/issues/42148

TODO: parquet is much more complex than csv/json, and I will leave it to
a separate PR.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-10-24 10:22:15 +08:00
cai.zhang
d5ecb63f53
enhance: Support import geometry data by json/csv (#44826)
issue: #44787

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-10-17 17:08:02 +08:00
groot
81f0d498be
enhance: Support JSONL/NDJSON files for bulkinsert (#44602)
issue: https://github.com/milvus-io/milvus/issues/44567

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-10-10 10:43:57 +08:00
Bingyi Sun
c25166a202
fix: Fix bulk import with autoid (#44604)
issue: #44424

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-10-09 12:09:56 +08:00
Bingyi Sun
96e1de4e22
feat: allow users to write pk field when autoid is enabled (#44424)
https://github.com/milvus-io/milvus/issues/44425

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-23 16:10:04 +08:00
Spade A
eb793531b9
feat: impl StructArray -- support import for CSV/JSON/PARQUET/BINLOG (#44201)
Ref https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-15 20:41:59 +08:00
Tianx
c0d62268ac
feat: add timestamptz data type (#44005)
issue: https://github.com/milvus-io/milvus/issues/27467
> https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420
> * [x] M1 Create collection with timestamptz field
> * [x] M2 Insert timestamptz field data
> * [x] M3 Retrieve timestamptz field data
> * [x] M4 Implement handoff

The second PR of issue:
https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4
described above.

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-08-26 15:59:53 +08:00
groot
1ee8cea35b
enhance: bulkinsert handle nullable/defaultValue/functionOutput fields (#42956)
issue: https://github.com/milvus-io/milvus/issues/42173

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-07-04 14:20:44 +08:00
cai.zhang
ebe1c95bb1
enhance: Add Size interface to FileReader to eliminate the StatObject call during Read (#42908)
issue: #42907

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-25 14:36:41 +08:00
yihao.dai
6c1a37fca1
fix: Fix import reader goroutine leak (#41869)
Close the chunk manager's reader after the import completes to prevent
goroutine leaks.

issues: https://github.com/milvus-io/milvus/issues/41868

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-16 10:18:35 +08:00
SimFG
91d40fa558
fix: Update logging context and upgrade dependencies (#41318)
- issue: #41291

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-23 10:52:38 +08:00
yihao.dai
b4cb8a4b13
enhance: Add UTF-8 string validation for import (#40694)
issue: https://github.com/milvus-io/milvus/issues/40684

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-04-01 19:04:21 +08:00
yihao.dai
bab30a41bf
enhance: Improve import error msgs (#40567)
issue: https://github.com/milvus-io/milvus/issues/40208

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-13 21:02:07 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
Cai Yudong
7476eb3625
feat: Support bulk insert for Int8Vector (#39499)
Issue: #38666

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2025-01-23 10:19:06 +08:00
smellthemoon
92a2d608ac
fix: Bulk insert failed when the nullable/default_value field does not exist (#39063)
#39036

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-01-09 19:27:03 +08:00
congqixia
b0bd290a6e
enhance: Use internal json(sonic) to replace std json lib (#37708)
Related to #35020

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-18 10:46:31 +08:00
yihao.dai
b45cf2d49f
enhance: Add max length check for csv import (#37077)
1. Add max length check for csv import.
2. Tidy import options.
3. Tidy common import util functions.

issue: https://github.com/milvus-io/milvus/issues/34150

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-25 14:37:29 +08:00
smellthemoon
463c47ced1
enhance: support default value in import (#36700)
https://github.com/milvus-io/milvus/issues/31728

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-17 12:05:24 +08:00
Buqian Zheng
82c5cf2fa2
feat: add bulk insert support for Functions (#36715)
issue: https://github.com/milvus-io/milvus/issues/35853 and
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-12 17:19:20 +08:00
Xiaofan
50fcfe8ef1
enhance: add nan and inf check (#35683)
fix #35594
add float check on files

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-08-25 15:22:57 +08:00
smellthemoon
80a7c78f28
enhance: import supports null in parquet and json formats (#35558)
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-08-20 16:50:55 +08:00
nish112022
3948bd4e79
fix: Added check for validating varchar and array max length (#35499)
issue: https://github.com/milvus-io/milvus/issues/34150

This is for numpy,parquet,json readers.

---------

Signed-off-by: Nischay Yadav <nischay.yadav@ibm.com>
2024-08-20 11:42:55 +08:00
yihao.dai
b1d46eb34b
fix: Fix multiple vector fields import (#33723)
1. Fix dim mismatch with multi-vector fields and JSON import
2. Enhance: do not display file ID in GetImportResponse.

issue: https://github.com/milvus-io/milvus/issues/33681,
https://github.com/milvus-io/milvus/issues/33682

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-10 21:57:54 +08:00
Cai Yudong
4004e4c545
enhance: Optimize bulk insert unittest (#33224)
Issue: #22837

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-05-24 10:23:41 +08:00
Cai Yudong
b560602885
enhance: Store SparseFloatVector into parquet as JSON string (#33101)
Issue: #22837

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-05-17 15:01:37 +08:00
Cai Yudong
4ef163fb70
enhance: Support readable JSON file import for Float16/BFloat16/SparseFloat (#33064)
Issue: #22837

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-05-16 14:47:35 +08:00
Cai Yudong
dc89c6f810
enhance: remove duplicated data generation APIs for bulk insert test (#32889)
Issue: #22837

including following changes:
1. Add API CreateInsertData() and BuildArrayData() in
internal/util/testutil
2. Remove duplicated test APIs from importutilv2 unittest and bulk
insert integration test

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-05-10 15:27:31 +08:00
Cai Yudong
bcdbd1966e
feat: Support sparse float vector bulk insert for binlog/json/parquet (#32649)
Issue: #22837

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-05-07 18:43:30 +08:00
Cai Yudong
5fc439c600
feat: Bulk insert support fp16/bf16 (#32157)
Issue: #22837

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-04-22 10:05:22 +08:00
yihao.dai
1b5554c8cb
enhance: Support $meta key for json import (#32013)
During JSON import:
1. Allow the specification of the $meta key
2. Prohibit duplicated keys within the $meta field, for instance,
`{"id": 1, "vector": [], "x": 6, "$meta": {"x": 8}}`
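
The duplicate-key rule above can be sketched as follows. The function `mergeDynamic` and the `schemaFields` set are hypothetical names for illustration, and the sketch assumes that any top-level key not in the schema is treated as a dynamic key; it is not the reader's actual implementation:

```go
package main

import "fmt"

// mergeDynamic collects the dynamic (non-schema) keys of one JSON row into a
// single map. It rejects keys that appear both at the top level and in $meta.
func mergeDynamic(row map[string]any, schemaFields map[string]bool) (map[string]any, error) {
	dynamic := map[string]any{}
	if raw, ok := row["$meta"]; ok {
		meta, ok := raw.(map[string]any)
		if !ok {
			return nil, fmt.Errorf("%s must be a JSON object", "$meta")
		}
		for k, v := range meta {
			dynamic[k] = v
		}
	}
	for k, v := range row {
		if k == "$meta" || schemaFields[k] {
			continue // schema fields and $meta itself are not dynamic keys
		}
		if _, dup := dynamic[k]; dup {
			return nil, fmt.Errorf("duplicated key %q in the dynamic field", k)
		}
		dynamic[k] = v
	}
	return dynamic, nil
}

func main() {
	// Decoded form of {"id": 1, "vector": [], "x": 6, "$meta": {"x": 8}}.
	row := map[string]any{
		"id":     1.0,
		"vector": []any{},
		"x":      6.0,
		"$meta":  map[string]any{"x": 8.0},
	}
	_, err := mergeDynamic(row, map[string]bool{"id": true, "vector": true})
	fmt.Println(err) // non-nil: "x" appears both at the top level and in $meta
}
```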

issue: https://github.com/milvus-io/milvus/issues/31835

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-10 17:27:17 +08:00
yihao.dai
31cf849f68
enhance: Support retrieving file size from importutilv2.Reader (#31533)
To reduce the overhead caused by listing the S3 objects, add an
interface to importutilv2.Reader to retrieve file sizes.

issue: https://github.com/milvus-io/milvus/issues/31532,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-25 20:29:07 +08:00
cai.zhang
de2c95d00c
enhance: Constrain dynamic field to key-value format (#31183)
issue: #31051

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-12 12:45:03 +08:00
yihao.dai
c5918290e6
feat: Add import executor and manager for datanode (#29438)
This PR introduces novel importv2 roles for datanode:
1. Executor: To execute tasks, an import task will be divided into the
following steps: read data -> hash data -> sync data;
2. Manager: To manage all the tasks;
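
The three executor steps can be illustrated with a toy pipeline. Everything here (`Row`, `readData`, `hashData`, `syncData`, `runImportTask`, and the choice of FNV hashing on a string primary key) is a hypothetical stand-in, not the datanode's actual types or hashing scheme:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Row is a hypothetical imported record with a string primary key.
type Row struct {
	PK      string
	Payload string
}

// readData stands in for the "read data" step (e.g. parsing an import file).
func readData() []Row {
	return []Row{{"a", "1"}, {"b", "2"}, {"c", "3"}, {"a", "4"}}
}

// hashData is the "hash data" step: it assigns each row to one of numShards
// buckets by hashing its primary key, so equal keys land in the same bucket.
func hashData(rows []Row, numShards int) ([][]Row, error) {
	if numShards < 1 {
		return nil, fmt.Errorf("numShards must be positive, got %d", numShards)
	}
	buckets := make([][]Row, numShards)
	for _, r := range rows {
		h := fnv.New32a()
		h.Write([]byte(r.PK)) // hash.Hash.Write never returns an error
		idx := int(h.Sum32() % uint32(numShards))
		buckets[idx] = append(buckets[idx], r)
	}
	return buckets, nil
}

// syncData stands in for the "sync data" step and reports rows per bucket.
func syncData(buckets [][]Row) []int {
	written := make([]int, len(buckets))
	for i, b := range buckets {
		written[i] = len(b) // a real sync step would flush b to storage here
	}
	return written
}

// runImportTask chains the steps: read data -> hash data -> sync data.
func runImportTask(numShards int) ([]int, error) {
	buckets, err := hashData(readData(), numShards)
	if err != nil {
		return nil, err
	}
	return syncData(buckets), nil
}

func main() {
	written, err := runImportTask(2)
	fmt.Println(written, err)
}
```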

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-31 20:45:04 +08:00
yihao.dai
23183ffb0f
feat: Add import reader for json (#29252)
This PR implements a new json reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-05 18:12:48 +08:00