milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
Spade A	7cb15ef141	feat: impl StructArray -- optimize vector array serialization (#44035 ) issue: https://github.com/milvus-io/milvus/issues/42148 Optimized from Go VectorArray → VectorArray Proto → Binary → C++ VectorArray Proto → C++ VectorArray local impl → Memory to Go VectorArray → Arrow ListArray → Memory --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-03 16:39:53 +08:00
Bingyi Sun	6624011927	enhance: storage sort can sort by multiple fields (#43994 ) https://github.com/milvus-io/milvus/issues/44011 this is to support compaction that sorts records by partition key and pk in the future --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-03 10:11:52 +08:00
XuanYang-cn	37a447d166	feat: Add CMEK cipher plugin (#43722 ) 1. Enable Milvus to read cipher configs 2. Enable cipher plugin in binlog reader and writer 3. Add a testCipher for unittests 4. Support pooling for datanode 5. Add encryption in storagev2 See also: #40321 Signed-off-by: yangxuan <xuan.yang@zilliz.com> --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-08-27 11:15:52 +08:00
Spade A	8456f824be	feat: impl StructArray -- miscellaneous stuff for struct array (#43960 ) Ref https://github.com/milvus-io/milvus/issues/42148 1. enable storage v2 2. implement some missing stuff 3. fix some bugs and add tests --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-08-26 21:35:53 +08:00
Tianx	c0d62268ac	feat: add timestamptz data type (#44005 ) issue: https://github.com/milvus-io/milvus/issues/27467 > https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420 > * [x] M1 Create collection with timestamptz field > * [x] M2 Insert timestamptz field data > * [x] M3 Retrieve timestamptz field data > * [x] M4 Implement handoff[ ] The second PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4 described above. --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-08-26 15:59:53 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
Ted Xu	e37cd19da2	enhance: enable storage v2 by default (#43652 ) Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-08-01 08:59:36 +08:00
sthuang	a2c7ed2780	fix: [StorageV2] sort field binlogs paths for packed reader and writer (#43585 ) key changes: * fix unstable storage v2 compaction unit test by guaranteeing the order of paths during sync. * bump milvus-storage version, include https://github.com/milvus-io/milvus-storage/pull/222 https://github.com/milvus-io/milvus-storage/pull/223 https://github.com/milvus-io/milvus-storage/pull/224 https://github.com/milvus-io/milvus-storage/pull/225 https://github.com/milvus-io/milvus-storage/pull/226 * Also fix the below related oom issue. related: https://github.com/milvus-io/milvus/issues/43310 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-30 08:09:36 +08:00
congqixia	34d3f0c0f8	enhance: Reserve builder space for ValueSerializer (#43570 ) Add `arrowBuild.Reserve` call for `ValueSerializer` to reduce repeated resizing buffer when write size is large Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-28 11:02:55 +08:00
Spade A	faeb7fd410	feat: impl StructArray -- create schema, insert, and retrieve data (#42855 ) Ref https://github.com/milvus-io/milvus/issues/42148 https://github.com/milvus-io/milvus/pull/42406 impls the segcore part of storage for handling with VectorArray. This PR: 1. impls the go part of storage for VectorArray 2. impls the collection creation with StructArrayField and VectorArray 3. insert and retrieve data from the collection. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>	2025-07-27 01:30:55 +08:00
Ted Xu	9041bf1b9a	fix: including shouldCopy parameter in file readers (#43578 ) This parameter determines whether the returned value should be a copy or a reference from the arrow array. The updates enhance memory management and provide more control over data handling during deserialization. See #43186 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-07-26 17:30:55 +08:00
sthuang	a0c9f499ee	fix: [StorageV2] sync panic with nullable add field (#43142 ) related: https://github.com/milvus-io/milvus/pull/42932 fix: https://github.com/milvus-io/milvus/issues/43072 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-25 10:08:53 +08:00
congqixia	1cf8ed505f	fix: Implement `NeededFields` feature in `RecordReader` (#43523 ) Related to #43522 Currently, passing a partial schema to the storage v2 packed reader may trigger a SEGV during the clustering compaction unit test. This patch implements `NeededFields` differently in each `RecordReader` implementation. For now, v2 implements it as a no-op; it will be supported once the packed reader supports this API. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-24 00:22:54 +08:00
congqixia	563e2935c5	enhance: [StorageV2] Fill ts range default values for `PackedBinlogRecordWriter` (#43454 ) This PR fills default values for the `PackedBinlogRecordWriter` timestamp range so that the target segment meta contains the correct timestamp range. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-22 12:04:53 +08:00
yihao.dai	b69e601fe1	fix: [StorageV2] Correct read and write buffer size (#43335 ) Correct the read and write buffer sizes to 64MB to prevent OOM during clustering compaction. issue: https://github.com/milvus-io/milvus/issues/43310 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-16 14:28:52 +08:00
yihao.dai	1984be646c	fix: Fix storagev2 binlog import (#43221 ) issue: https://github.com/milvus-io/milvus/issues/43218 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-13 22:52:49 +08:00
congqixia	5a9efb3f81	enhance: [StorageV2] Refine storage rw option usage & validation (#43175 ) Related to #39173 This PR: - Make all datanode tasks pass storage config via the storage config option - Remove legacy comments, rootPath & bucketName parameters - Fix clustering compaction option behavior - Add validation logic for `rwOptions` - Use correct storageType from storageConfig - Add storage config in sync task --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-11 01:14:48 +08:00
cai.zhang	95e767611a	fix: Fix merge sort loss data when last row in a record is deleted (#43216 ) issue: #43207 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-09 22:18:48 +08:00
cai.zhang	8720feeb79	fix: Fix enqueuing when current batch is fully deleted (#43174 ) issue: #43045 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-08 12:20:46 +08:00
congqixia	ab818dcbca	fix: [StorageV2] Pass storage config for compaction rw (#43167 ) Related to #43148 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-07 15:32:46 +08:00
congqixia	d09764508a	fix: [Storagev2] Close segment readers in mergeSort (#43116 ) Related to #43062 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-04 23:56:44 +08:00
cai.zhang	4133e3b8fd	fix: Enable merge sort and fix sort bug (#43080 ) issue: #42980, #43034 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-04 10:18:44 +08:00
congqixia	8962b0058d	fix: [StorageV2] Check writer nil when closing not written one (#43056 ) Related to #43047 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-02 14:22:43 +08:00
congqixia	9b06ecb72f	enhance: [StorageV2] Release record and close reader (#42983 ) Related to #39173 This PR - Close packed reader after sort - Release arrow.Record preventing memory leakage - Invoke `pack_reader->Close()` for CloseReader --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-27 14:46:43 +08:00
sthuang	238bd30f42	fix: [StorageV2] end to end minor issues for sync, stats, and load (#42948 ) Fix issues in end-to-end tests: 1. Split column groups based on schema, rather than estimating by average chunk row size. Ensure column group consistency within a segment, to avoid errors caused by loading multiple column group chunks simultaneously. 2. Use sorted segmentId when generating the stats binlog path, to ensure consistent and correct file path resolution. 3. Determine field IDs as follows: For multi-column column groups, retrieve the field ID list from metadata. For single-column column groups, use the column group ID directly as the field ID. related: #39173 fix: #42862 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-27 14:44:42 +08:00
cai.zhang	ebe1c95bb1	enhance: Add Size interface to FileReader to eliminate the StatObject call during Read (#42908 ) issue: #42907 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-06-25 14:36:41 +08:00
congqixia	ee056f0bff	fix: [AddField] Fill default value in serde logic when field missing (#42891 ) Related to #42856 Default values go missing after a segment gets sorted/compacted. This PR is a temporary workaround; in the long term, default values should be filled by the storage engine instead. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-23 14:20:41 +08:00
sthuang	4a0a2441f2	enhance: [StorageV2] field id as meta path for wide column (#42787 ) related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-19 15:00:38 +08:00
congqixia	f01ff57f3f	fix: [StorageV2] Use correct offset filling null bitmap (#42774 ) Related to #39173 `null_bitmap_data()` returns a raw pointer to the null bitmap of an Array. After slicing, this bitmap is not rewritten because of the zero-copy implementation, so the current start position may be non-zero when FillFieldData generates the column `valid_data` array. This PR adds an `offset` param to the `FillFieldData` method and forces all invocations to pass the correct offset for the `null_bitmap_data` ptr. It also updates the milvus-storage commit, fixing a problem where the reader failed to return data when the buffer size was smaller than the row group size. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-17 10:08:38 +08:00
sthuang	ed5dbf3eaa	enhance: [StorageV2] sync separate vector datatype into its own column group (#42638 ) related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-16 11:48:37 +08:00
congqixia	ef8829c5bc	fix: [AddField] Skip missing nullable field in insertCodec (#42724 ) Related to #42723 Previous PR #42684 permitted insert msg transformation, but insertCodec did not adopt the same skip logic, which causes a panic. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-13 19:56:36 +08:00
Zhen Ye	1f66b650e9	fix: pulsar cannot work properly if backlog exceed (#42653 ) issue: #42649 - the sync operation of different pchannel is concurrent now. - add a option to notify the backlog clear automatically. - make pulsar walimpls can be recovered from backlog exceed. Signed-off-by: chyezh <chyezh@outlook.com>	2025-06-13 14:28:37 +08:00
congqixia	cbed31933a	fix: [AddField] Permit missing new nullable field in InsertMsg (#42684 ) Related to #41858 #41951 #42084 When the insert msg consumer (pipeline/flowgraph) has a newer schema than the insertMsg, it has to adapt an insert msg that used the old schema (missing the newly added field). Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-13 13:52:35 +08:00
Zhen Ye	43f0c56ce7	fix: limit the concurrency of zstd compression and decrease the memory usage of binlog generation (#42630 ) issue: #42028 - limit the concurrency of zstd compression. - zstd.go modified from `github.com/apache/arrow/go/v17/parquet/compress/ztsd.go` - may be related to #42129 Signed-off-by: chyezh <chyezh@outlook.com>	2025-06-11 09:06:34 +08:00
sthuang	9439eaef52	fix: [StorageV2] sync with int8 vector data type core dumped (#42616 ) related: https://github.com/milvus-io/milvus/issues/42613, #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-10 11:42:35 +08:00
sthuang	89c3afb12e	fix: [StorageV2] index/stats task level storage v2 fs (#42191 ) related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-10 11:06:35 +08:00
congqixia	118684afbb	enhance: [storageV2] Pass nullable converting insertMsg fieldData (#42584 ) Related to #39173 The `nullable` flag is crucial for the serde logic of the v2 writer; missing this flag causes a logic bug for v2 nullable data. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-10 10:06:34 +08:00
Chun Han	e9b5d9e8bc	enhance: refine compaction trigger to reduce read/write amplification (#41336 ) (#41728 ) related: #41336 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-06-04 11:24:38 +08:00
yihao.dai	e0113b375e	fix: Fix sort stats generates large binlogs (#42456 ) Remove the hardcoded batchSize of 100,000 and instead trigger a write every 64MB based on actual data size. This prevents sort stats from generating excessively large binlog files. issue: https://github.com/milvus-io/milvus/issues/42400 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-06-04 09:56:39 +08:00
congqixia	a22088a380	enhance: [StorageV2] Make packed reader use correct path (#41919 ) Related to #39173 This PR - Use updated path with bucketName for packedReader - Update milvus-storage commit to report reader/writer initialization failure, see also milvus-io/milvus-storage#192 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-20 10:36:23 +08:00
Ted Xu	7660be0993	feat: bulk insert support storage v2 (#41843 ) See #39173 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-05-19 10:34:24 +08:00
congqixia	a6d09ff4cd	enhance: [StorageV2] fix issues integrating basic RW operations (#41834 ) Related to #39173 This PR: - Upgrade milvus-storage commit to fix filesystem finalized issue - Add bucket-name as prefix for all fs style access io - Initial arrow fs on querynodes startup - Fix timestamp access when loading sealed segment --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-15 09:52:23 +08:00
aoiasd	3892451880	fix: bm25 search failed when avgdl == nan (#41502 ) relate: https://github.com/milvus-io/milvus/issues/41490 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-27 17:34:38 +08:00
XuanYang-cn	dab39c610b	enhance: remove unused DDLCodec (#41485 ) See also: #39242 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-04-25 17:26:37 +08:00
XuanYang-cn	540456041f	enhance: Remove unused binlog iterator (#41359 ) See also: #41466 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-04-24 12:04:38 +08:00
SimFG	91d40fa558	fix: Update logging context and upgrade dependencies (#41318 ) - issue: #41291 --------- Signed-off-by: SimFG <bang.fu@zilliz.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-04-23 10:52:38 +08:00
sthuang	50e02e3598	enhance: update packed reader api (#41055 ) related: https://github.com/milvus-io/milvus/issues/39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-09 10:18:26 +08:00
Ted Xu	1bcea2a775	fix: assigning the correct storage version in sync and index tasks (#41093 ) See #39663 #40667 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-04-08 10:14:25 +08:00
smellthemoon	cb1e86e17c	enhance: support add field (#39800 ) After this PR is merged, insert, upsert, build index, query, and search are supported on the added field. These operations can only be performed on the added field after the add field request completes, which is a synchronous operation. Compaction will be supported in the next PR. #39718 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2025-04-02 14:24:31 +08:00
Ted Xu	128efaa3e3	enhance: simplify size calculation in file writers (#40808 ) See: #40342 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-03-26 20:04:22 +08:00

578 Commits