milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-28 22:45:26 +08:00

Author	SHA1	Message	Date
Spade A	d8591f9548	fix: csv/json import with STRUCT adapts concatenated struct name (#45000 ) After https://github.com/milvus-io/milvus/pull/44557, the name of a field inside a STRUCT becomes STRUCT_NAME[FIELD_NAME]. This PR makes import account for that change. issue: https://github.com/milvus-io/milvus/issues/45006 ref: https://github.com/milvus-io/milvus/issues/42148 TODO: parquet is much more complex than csv/json, and I will leave it to a separate PR. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-24 10:22:15 +08:00
Spade A	c4f3f0ce4c	feat: impl StructArray -- support more types of vector in STRUCT (#44736 ) ref: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-10-15 10:25:59 +08:00
Spade A	208481a070	feat: impl StructArray -- support same names in different STRUCT (#44557 ) ref: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-10 15:53:56 +08:00
cai.zhang	19346fa389	feat: Geospatial Data Type and GIS Function support for milvus (#44547 ) issue: #43427 The main goal of this PR is to merge #37417 into milvus 2.5 without conflicts. # Main Goals 1. Create and describe collections with geospatial type 2. Insert geospatial data into the insert binlog 3. Load segments containing geospatial data into memory 4. Enable query and search to display geospatial data 5. Support using GIS functions like ST_EQUALS in query 6. Support R-Tree index for geometry type # Solution 1. Add Type: Modify the Milvus core by adding a Geospatial type in both the C++ and Go code layers, defining the Geospatial data structure and the corresponding interfaces. 2. Dependency Libraries: Introduce necessary geospatial data processing libraries. In the C++ source code, use Conan package management to include the GDAL library. In the Go source code, add the go-geom library to the go.mod file. 3. Protocol Interface: Revise the Milvus protocol to provide mechanisms for Geospatial message serialization and deserialization. 4. Data Pipeline: Facilitate interaction between the client and proxy using the WKT format for geospatial data. The proxy will convert all data into WKB format for downstream processing, providing column data interfaces, segment encapsulation, segment loading, payload writing, and cache block management. 5. Query Operators: Implement simple display and support for filter queries. Initially, focus on filtering based on spatial relationships for a single column of geospatial literal values, providing parsing and execution for query expressions. Only brute-force search is supported for now. 6. Client Modification: Enable the client to handle user input for geospatial data and facilitate end-to-end testing. Check the modification in pymilvus. --------- Signed-off-by: Yinwei Li <yinwei.li@zilliz.com> Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>	2025-09-28 19:43:05 +08:00
Spade A	eb793531b9	feat: impl StructArray -- support import for CSV/JSON/PARQUET/BINLOG (#44201 ) Ref https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-15 20:41:59 +08:00
Spade A	8456f824be	feat: impl StructArray -- miscellaneous stuff for struct array (#43960 ) Ref https://github.com/milvus-io/milvus/issues/42148 1. enable storage v2 2. implement some missing stuff 3. fix some bugs and add tests --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-08-26 21:35:53 +08:00
Ted Xu	e37cd19da2	enhance: enable storage v2 by default (#43652 ) Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-08-01 08:59:36 +08:00
yihao.dai	df8ceb123b	enhance: Support parallel execution of L0 import tasks (#43213 ) issue: https://github.com/milvus-io/milvus/issues/43212 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-17 10:14:50 +08:00
yihao.dai	9cbd194c6b	fix: Prevent import from generating small binlogs (#43132 ) - Introduce dynamic buffer sizing to avoid generating small binlogs during import - Refactor import slot calculation based on CPU and memory constraints - Implement dynamic pool sizing for sync manager and import tasks according to CPU core count issue: https://github.com/milvus-io/milvus/issues/43131 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-07 21:32:47 +08:00
Zhen Ye	ecb24e7232	enhance: use multi-process framework in integration test (#42976 ) issue: #41609 - add env `MILVUS_NODE_ID_FOR_TESTING` to set up a node id for milvus process. - add env `MILVUS_CONFIG_REFRESH_INTERVAL` to set up the refresh interval of paramtable. - Init paramtable when calling `paramtable.Get()`. - add new multi-process framework for integration test. - change all integration tests to multi-process. - merge some test cases into one suite to speed it up. - modify some tests, which need to wait for issues #42966, #42685. - remove the waittssync for delete collection to fix issue: #42989 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-06-30 14:22:43 +08:00
yihao.dai	86876682da	enhance: Enhance import integration tests and logs (#42612 ) 1. Optimize the import process: skip subsequent steps and mark the task as complete if the number of imported rows is 0. 2. Improve import integration tests: a. Add a test to verify that autoIDs are not duplicated b. Add a test for the corner case where all data is deleted c. Shorten test execution time 3. Enhance import logging: a. Print imported segment information upon completion b. Include file name in failure logs issue: https://github.com/milvus-io/milvus/issues/42488, https://github.com/milvus-io/milvus/issues/42518 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-06-12 20:02:35 +08:00
Zhen Ye	4bad293655	enhance: make upgrading from 2.5.x incur less downtime (#42082 ) issue: #40532 - start timeticksync at rootcoord if the streaming service is not available - stop timeticksync if the streaming service is available - open a read-only wal if some nodes in the cluster have not been upgraded to 2.6 - allow opening a read-write wal after all nodes in the cluster have been upgraded to 2.6 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-29 23:02:29 +08:00
yihao.dai	36e9e41627	fix: Fix no candidate segments error for small import (#41771 ) When autoID is enabled, the preimport task estimates row distribution by evenly dividing the total row count (numRows) across all vchannels: `estimatedCount = numRows / vchannelNum`. However, the actual import task hashes real auto-generated IDs to determine the target vchannel. This mismatch can lead to inaccurate row distribution estimates in corner cases such as: - Importing 1 row into 2 vchannels: • Preimport: 1 / 2 = 0 → both v0 and v1 are estimated to have 0 rows • Import: real autoID (e.g., 457975852966809057) hashes to v1 → actual result: v0 = 0, v1 = 1 To resolve this corner case, we now allocate at least one segment for each vchannel when autoID is enabled, ensuring all vchannels are prepared to receive data even if no rows are estimated for them. issue: https://github.com/milvus-io/milvus/issues/41759 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-05-14 15:30:21 +08:00
SimFG	91d40fa558	fix: Update logging context and upgrade dependencies (#41318 ) - issue: #41291 --------- Signed-off-by: SimFG <bang.fu@zilliz.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-04-23 10:52:38 +08:00
Xianhui Lin	f9febe3bae	enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006 ) Merge RootCoord, DataCoord and QueryCoord into MixCoord. Merge their sessions into one. issue: https://github.com/milvus-io/milvus/issues/37764 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-04-11 16:36:30 +08:00
sthuang	90acc8a58f	enhance: upgrade go arrow version from 12.0.1 to 17.0.0 (#39916 ) related: https://github.com/milvus-io/milvus/issues/39915 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-02-25 10:30:02 +08:00
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990 ) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
SimFG	047254665d	feat: support to replicate import msg (#39171 ) - issue: #39849 --------- Signed-off-by: SimFG <bang.fu@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-02-16 00:08:13 +08:00
Cai Yudong	5730b69e56	feat: Enable more VECTOR_INT8 unittest (#39569 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-24 17:03:07 +08:00
Zhen Ye	bb8d1ab3bf	enhance: make new go package to manage proto (#39114 ) issue: #39095 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-10 10:49:01 +08:00
zhenshan.cao	63843dce33	fix: Fix conan gdal building problem (#37338 ) issue:https://github.com/milvus-io/milvus/issues/27576 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-10-31 21:04:16 +08:00
Hao Tan	67c4340565	feat: Geospatial Data Type and GIS Function Support for milvus server (#35990 ) issue:https://github.com/milvus-io/milvus/issues/27576 # Main Goals 1. Create and describe collections with geospatial fields, enabling both client and server to recognize and process geo fields. 2. Insert geospatial data as payload values in the insert binlog, and print the values for verification. 3. Load segments containing geospatial data into memory. 4. Ensure query outputs can display geospatial data. 5. Support filtering on GIS functions for geospatial columns. # Solution 1. Add Type: Modify the Milvus core by adding a Geospatial type in both the C++ and Go code layers, defining the Geospatial data structure and the corresponding interfaces. 2. Dependency Libraries: Introduce necessary geospatial data processing libraries. In the C++ source code, use Conan package management to include the GDAL library. In the Go source code, add the go-geom library to the go.mod file. 3. Protocol Interface: Revise the Milvus protocol to provide mechanisms for Geospatial message serialization and deserialization. 4. Data Pipeline: Facilitate interaction between the client and proxy using the WKT format for geospatial data. The proxy will convert all data into WKB format for downstream processing, providing column data interfaces, segment encapsulation, segment loading, payload writing, and cache block management. 5. Query Operators: Implement simple display and support for filter queries. Initially, focus on filtering based on spatial relationships for a single column of geospatial literal values, providing parsing and execution for query expressions. 6. Client Modification: Enable the client to handle user input for geospatial data and facilitate end-to-end testing. Check the modification in pymilvus. --------- Signed-off-by: tasty-gumi <1021989072@qq.com>	2024-10-31 20:58:20 +08:00
foxspy	d7b2ffe5aa	enhance: add a unified vector index config checker (#36844 ) issue: #34298 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-10-28 10:11:37 +08:00
smellthemoon	89397d1e66	enhance: adjust parquet reader type check with null type (#36266 ) #36252 Remove the unneeded type check, so that parquet files written with a null-type writer can be imported successfully. Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-09-19 18:43:10 +08:00
yihao.dai	a61668c77e	feat: Introduce stats task for import (#35868 ) This PR introduces a stats task for import: 1. Define new `Stats` and `IndexBuilding` states for importJob 2. Add a new stats step to the import process: trigger the stats task and wait for its completion 3. Abort the stats task if the import job fails issue: https://github.com/milvus-io/milvus/issues/33744 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-09-15 15:17:08 +08:00
OxalisCu	3a381bc247	enhance: Bulkinsert supports null in csv formats (#35912 ) see details in this issue https://github.com/milvus-io/milvus/issues/35911 --------- Signed-off-by: OxalisCu <2127298698@qq.com>	2024-09-09 19:17:07 +08:00
cai.zhang	2c9bb4dfa3	feat: Support stats task to sort segment by PK (#35054 ) issue: #33744 This PR includes the following changes: 1. Added a new task type to the task scheduler in datacoord: stats task, which sorts segments by primary key. 2. Implemented segment sorting in indexnode. 3. Added a new field `FieldStatsLog` to SegmentInfo to store token index information. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-09-02 14:19:03 +08:00
OxalisCu	ed4eaffc9d	enhance: add csv support for bulkinsert (#34938 ) See this issue for details: #34937 --------- Signed-off-by: OxalisCu <2127298698@qq.com>	2024-08-21 17:47:01 +08:00
smellthemoon	80a7c78f28	enhance: import supports null in parquet and json formats (#35558 ) #31728 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-08-20 16:50:55 +08:00
wei liu	c45f38aa61	enhance: Update protobuf-go to protobuf-go v2 (#34394 ) issue: #34252 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-07-29 11:31:51 +08:00
congqixia	3333160b8d	enhance: Fix lint issues from recent PRs (#34482 ) See also #34483 Some lint issues are introduced due to lack of static check run. This PR fixes these problems. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-09 10:06:24 +08:00
Cai Yudong	9d4535ce0b	enhance: Handle Float16Vector/BFloat16Vector numpy bulk insert as same as BinaryVector (#33760 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-06-12 17:17:55 +08:00
yihao.dai	b1d46eb34b	fix: Fix multiple vector fields import (#33723 ) 1. Fix dim mismatch with multi-vector fields and JSON import 2. Enhance: do not display file ID in GetImportResponse. issue: https://github.com/milvus-io/milvus/issues/33681, https://github.com/milvus-io/milvus/issues/33682 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-06-10 21:57:54 +08:00
yihao.dai	3540eee977	enhance: Support L0 import (#33514 ) issue: https://github.com/milvus-io/milvus/issues/33157 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-06-07 14:17:20 +08:00
yihao.dai	35532a3e7d	fix: Fill stats log id and check validity (#33477 ) 1. Fill log ID of stats log from import 2. Add a check to validate the log ID before writing to meta issue: https://github.com/milvus-io/milvus/issues/33476 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-06-05 11:17:56 +08:00
Cai Yudong	4004e4c545	enhance: Optimize bulk insert unittest (#33224 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-05-24 10:23:41 +08:00
yihao.dai	9ff023ee35	fix: Fix filtering by partition key fails for importing data (#33274 ) Before executing the import, partition IDs should be reordered according to partition names. Otherwise, the data might be hashed to the wrong partition during import. This PR corrects this error. issue: https://github.com/milvus-io/milvus/issues/33237 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-05-23 11:13:40 +08:00
Cai Yudong	b560602885	enhance: Store SparseFloatVector into parquet as JSON string (#33101 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-05-17 15:01:37 +08:00
Cai Yudong	4ef163fb70	enhance: Support readable JSON file import for Float16/BFloat16/SparseFloat (#33064 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-05-16 14:47:35 +08:00
Cai Yudong	dc89c6f810	enhance: remove duplicated data generation APIs for bulk insert test (#32889 ) Issue: #22837 including following changes: 1. Add API CreateInsertData() and BuildArrayData() in internal/util/testutil 2. Remove duplicated test APIs from importutilv2 unittest and bulk insert integration test Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-05-10 15:27:31 +08:00
Cai Yudong	bcdbd1966e	feat: Support sparse float vector bulk insert for binlog/json/parquet (#32649 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-05-07 18:43:30 +08:00
yihao.dai	53874ce245	fix: Fix cannot specify partition name in binlog import (#32730 ) issue: https://github.com/milvus-io/milvus/issues/32807 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-05-07 17:19:30 +08:00
yihao.dai	4de063ae14	fix: Make the dynamic column optional in parquet import (#32738 ) issue: https://github.com/milvus-io/milvus/issues/32729 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-05-07 11:21:29 +08:00
congqixia	ecd8e52b53	fix: Use default integration case timeout for `TestBinlogImport` (#32701 ) See also #32700 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-29 19:07:27 +08:00
yihao.dai	1594122c0a	enhance: Make the dynamic field file optional during numpy import (#32596 ) 1. Make the dynamic field file optional during numpy import 2. Add integration importing test with dynamic 3. Disallow file of pk when autoID=true during numpy import issue: https://github.com/milvus-io/milvus/issues/32542 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-28 19:39:25 +08:00
Cai Yudong	5fc439c600	feat: Bulk insert support fp16/bf16 (#32157 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-04-22 10:05:22 +08:00
yihao.dai	558feed5ed	fix: Use pk from binlog during import (#32118 ) During binlog import, even if the primary key's autoID is set to true, the primary key from the binlog should be used instead of being reassigned. issue: https://github.com/milvus-io/milvus/discussions/31943, https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-16 14:51:20 +08:00
yihao.dai	273df98e20	enhance: Add binlog import integration test (#32112 ) issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-11 10:31:18 +08:00
yihao.dai	c408a32db6	feat: Add disk quota checks for import V2 (#31131 ) Return quota error when the files to be imported exceed the disk quota. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-15 14:43:03 +08:00
Bingyi Sun	5c0bb40549	fix: merge index params when creating index (#31127 ) issue: https://github.com/milvus-io/milvus/issues/31102 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-03-11 17:31:03 +08:00
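
The row-distribution mismatch described in commit 36e9e41627 can be sketched in a few lines. This is an illustration only, written in Python rather than the Go used by Milvus. The function names are hypothetical, and `pk % vchannel_num` is a stand-in for whatever hash the import task really applies to primary keys. The only parts taken from the commit message are the integer division `numRows / vchannelNum` and the rule of allocating at least one segment per vchannel when autoID is enabled.

```python
def estimate_rows_per_vchannel(num_rows, vchannel_num):
    """Preimport-style estimate: split the row count evenly (integer division)."""
    return {f"v{i}": num_rows // vchannel_num for i in range(vchannel_num)}


def actual_rows_per_vchannel(auto_ids, vchannel_num):
    """Import-style routing: each generated ID picks its vchannel.

    Assumption: modulo stands in for the real hash function.
    """
    counts = {f"v{i}": 0 for i in range(vchannel_num)}
    for pk in auto_ids:
        counts[f"v{pk % vchannel_num}"] += 1
    return counts


def vchannels_needing_segment(estimated, auto_id_enabled):
    """With autoID, prepare every vchannel; otherwise only those with estimated rows."""
    if auto_id_enabled:
        return sorted(estimated)
    return sorted(ch for ch, rows in estimated.items() if rows > 0)


estimated = estimate_rows_per_vchannel(1, 2)
actual = actual_rows_per_vchannel([457975852966809057], 2)
print(estimated)  # both vchannels are estimated at 0 rows
print(actual)     # the single row still lands on one vchannel
print(vchannels_needing_segment(estimated, auto_id_enabled=True))
```

With one row and two vchannels the estimate is zero everywhere, so allocating segments from the estimate alone leaves the receiving vchannel without a candidate segment. Allocating one segment per vchannel up front removes the dependence on the estimate.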

51 Commits