milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-07 01:28:27 +08:00

Author	SHA1	Message	Date
XuanYang-cn	623a9e5156	fix: Accurate size estimation for sliced arrow arrays in compaction (#45294) Sliced arrow arrays returned the original array's size via SizeInBytes(), causing inaccurate memory estimates during compaction. As a result, segments closed prematurely in mergeSplit mode: an expected 500MB compaction produced four segments of 100+MB each instead. Fixed by calculating the actual byte size of sliced arrays, ensuring proper segment sizing and more accurate memory usage tracking. See also: #45293 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-11-06 14:57:34 +08:00
Spade A	c4f3f0ce4c	feat: impl StructArray -- support more types of vector in STRUCT (#44736) ref: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-10-15 10:25:59 +08:00
Spade A	7cb15ef141	feat: impl StructArray -- optimize vector array serialization (#44035) issue: https://github.com/milvus-io/milvus/issues/42148 Optimized the serialization path from "Go VectorArray → VectorArray Proto → Binary → C++ VectorArray Proto → C++ VectorArray local impl → Memory" to "Go VectorArray → Arrow ListArray → Memory". --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-03 16:39:53 +08:00
Spade A	faeb7fd410	feat: impl StructArray -- create schema, insert, and retrieve data (#42855) Ref https://github.com/milvus-io/milvus/issues/42148 https://github.com/milvus-io/milvus/pull/42406 implements the segcore part of storage for handling VectorArray. This PR: 1. implements the Go part of storage for VectorArray; 2. implements collection creation with StructArrayField and VectorArray; 3. inserts data into and retrieves data from the collection. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>	2025-07-27 01:30:55 +08:00
Ted Xu	9041bf1b9a	fix: including shouldCopy parameter in file readers (#43578) This parameter determines whether the returned value should be a copy or a reference from the arrow array. The updates enhance memory management and provide more control over data handling during deserialization. See #43186 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-07-26 17:30:55 +08:00
sthuang	238bd30f42	fix: [StorageV2] end to end minor issues for sync, stats, and load (#42948) Fix issues in end-to-end tests: 1. Split column groups based on schema, rather than estimating by average chunk row size. Ensure column group consistency within a segment, to avoid errors caused by loading multiple column group chunks simultaneously. 2. Use sorted segmentId when generating the stats binlog path, to ensure consistent and correct file path resolution. 3. Determine field IDs as follows: For multi-column column groups, retrieve the field ID list from metadata. For single-column column groups, use the column group ID directly as the field ID. related: #39173 fix: #42862 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-27 14:44:42 +08:00
sthuang	ed5dbf3eaa	enhance: [StorageV2] sync separate vector datatype into its own column group (#42638) related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-16 11:48:37 +08:00
sthuang	9439eaef52	fix: [StorageV2] sync with int8 vector data type core dumped (#42616) related: https://github.com/milvus-io/milvus/issues/42613, #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-10 11:42:35 +08:00
XuanYang-cn	540456041f	enhance: Remove unused binlog iterator (#41359) See also: #41466 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-04-24 12:04:38 +08:00
Ted Xu	128efaa3e3	enhance: simplify size calculation in file writers (#40808) See: #40342 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-03-26 20:04:22 +08:00
sthuang	d7df78a6c9	feat: Storage v2 compaction (#40667) Feat: Support Mix compaction. Covering tests include compatibility and rollback ability: read v1 segments and compact with v2 format; read both v1 and v2 segments and compact with v2 format; read v2 segments and compact with v2 format; compact with duplicate primary keys; compact with bm25 segments; compact with merge sort segments; compact with no-expiration segments; compact with segments that lack binlogs; compact with nullable field segments. Feat: Support Clustering compaction. Covering tests include compatibility and rollback ability: read v1 segments and compact with v2 format; read both v1 and v2 segments and compact with v2 format; read v2 segments and compact with v2 format; compact bm25 segments with v2 format; compact with memory limit. Enhance: Use serdeMap serialize in BuildRecord function to support all Milvus data types. related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-03-21 10:16:12 +08:00
Ted Xu	df4285c9ef	enhance: API integration with storage v2 in clustering-compactions (#40133) See #39173 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-03-13 14:12:06 +08:00
sthuang	90acc8a58f	enhance: upgrade go arrow version from 12.0.1 to 17.0.0 (#39916) related: https://github.com/milvus-io/milvus/issues/39915 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-02-25 10:30:02 +08:00
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
Ted Xu	2978b0890e	enhance: iterative download data during compaction to reduce memory cost (#39724) See #37234 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-02-13 10:36:47 +08:00
Ted Xu	427b6a4c94	enhance: reduce stats task cost by skipping ser/de (#39568) See #37234 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-02-06 17:14:45 +08:00
zhagnlu	6ee94d00b9	fix: fix arrow nested type calculation and add unit tests (#38527) #37767 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-18 11:54:44 +08:00
shaoting-huang	f4dd7c7efb	enhance: add delta log stream new format reader and writer (#34116) issue: #34123 Benchmark case: The benchmark runs the Go benchmark function `BenchmarkDeltalogFormat`, which is included in the changed files. It tests the performance of serializing and deserializing two different data formats on a dataset of 10 million delete logs. Metrics: The benchmarks measure the average time taken per operation (ns/op), memory allocated per operation (MB/op), and the number of memory allocations per operation (allocs/op). Results (change relative to the baseline in parentheses): one_string_format_reader (baseline): 2,781,990,000 ns/op, 2,422 MB/op, 20,336,539 allocs/op; pk_ts_separate_format_reader: 480,682,639 ns/op (-82.72%), 1,765 MB/op (-27.14%), 20,396,958 allocs/op (+0.30%); one_string_format_writer (baseline): 5,483,436,041 ns/op, 13,900 MB/op, 70,057,473 allocs/op; pk_and_ts_separate_format_writer: 798,591,584 ns/op (-85.43%), 2,178 MB/op (-84.34%), 30,270,488 allocs/op (-56.78%). Both read and write operations show significant improvements in speed and memory allocation. Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-07-06 09:08:09 +08:00
Ted Xu	6d5747cb3e	feat: adding deltalog stream reader and writer (#33844) See #31679 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-06-19 14:42:01 +08:00
Ted Xu	066c8ea175	feat: stream reader/writer to support nulls (#33080) See: #31728 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-05-27 16:27:42 +08:00
Ted Xu	a8bd9bea39	fix: adding blob memory size in binlog serde (#33324) See: #33280 Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-05-24 10:33:40 +08:00
Ted Xu	a9c7ce72b8	enhance: enable stream writer in compactions (#32612) See #31679 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-05-17 15:05:37 +08:00
Ted Xu	dc5ea6f17c	feat: adding binlog streaming writer (#31537) See #31679 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-04-11 10:33:20 +08:00
Buqian Zheng	d7dbc3c9d8	fix: [sparse float vector] support the new streaming deserialize reader (#31325) issue: https://github.com/milvus-io/milvus/issues/31324 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-17 13:59:04 +08:00
Ted Xu	987d9023a5	enhance: Enable binlog deserialize reader in datanode compaction (#31036) See #30863 Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-03-08 18:25:02 +08:00
Ted Xu	71adafa933	enhance: adding a streaming deserialize reader for binlogs (#30860) See #30863 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-03-04 19:31:09 +08:00

26 Commits
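
Note on commit 623a9e5156: the message describes sliced arrays reporting the size of the array they were sliced from. The sketch below illustrates why that overestimates memory for a variable-length column. It is not the Milvus fix and does not use the Arrow API; `StringArrayView`, `make_view`, `parent_size`, and `sliced_size` are hypothetical names, 4-byte offsets are assumed, and the validity bitmap is ignored for simplicity.

```python
from dataclasses import dataclass


@dataclass
class StringArrayView:
    """Toy stand-in for a (possibly sliced) variable-length array.

    Hypothetical type for illustration only. A slice shares the parent's
    offsets and data buffers and only narrows the [start, start + length) window.
    """
    offsets: list   # n + 1 offsets into `data` for the parent array
    data: bytes     # value buffer shared by the parent and all slices
    start: int      # index of the first visible element
    length: int     # number of visible elements

    def slice(self, start, length):
        # Zero-copy slice: same buffers, different window.
        return StringArrayView(self.offsets, self.data, self.start + start, length)


def make_view(values):
    """Build an unsliced view from a list of byte strings."""
    offsets = [0]
    for v in values:
        offsets.append(offsets[-1] + len(v))
    return StringArrayView(offsets, b"".join(values), 0, len(values))


def parent_size(view):
    """Naive estimate: counts the whole shared buffers, even for a slice."""
    return 4 * len(view.offsets) + len(view.data)


def sliced_size(view):
    """Estimate that counts only the bytes the visible window refers to."""
    first = view.offsets[view.start]
    last = view.offsets[view.start + view.length]
    return 4 * (view.length + 1) + (last - first)


full = make_view([b"aa", b"bbbb", b"c", b"dddddd"])
part = full.slice(1, 2)  # covers b"bbbb" and b"c"
```

For the unsliced view both estimates agree, while for `part` the naive estimate still reports the full parent size. A writer that closes a segment once the estimated size reaches a threshold would therefore close it too early when it is fed slices, which matches the symptom reported in the commit message.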