milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
tinswzy	173efe2b98	enhance: wp metrics and update deps to v0.1.0 (#43569 ) #43574 #43604 #43431 #43603 Fix wp metrics not registered bug; Update the version dependent on wp to v0.1.2-rc1; improve advanced reader with concurrent prefetch blks; add the segment rolling policy based on the number of blocks; improve concurrent compaction release lock failed bug Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2025-07-29 14:51:35 +08:00
congqixia	268f1cdace	fix: Hold field shared_ptr in case of being released (#43614 ) Related to #43584 Directly accessing `fields_` in `get_raw_data` may race if a vector index load happens concurrently while getting raw data. This PR makes `bulk_subscript` hold a shared_ptr to the field column, preventing the column from being released while it is being read. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-29 12:15:36 +08:00
Chun Han	4ee9f63f72	fix: return id by default(#43595 ) (#43601 ) related: #43595 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-07-29 12:07:36 +08:00
zhenshan.cao	4835ef9db8	enhance: update committers (#43620 ) Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2025-07-29 12:05:43 +08:00
congqixia	18d8dc82b8	feat: [GoSDK] Support search iterator v2 (#43612 ) Related to #37548 Also link #43122 This patch implements basic functions of search iterator v2. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-29 11:21:35 +08:00
aoiasd	c9412434c8	enhance: add char group tokenizer (#42793 ) relate: https://github.com/milvus-io/milvus/issues/42792 Add a char group tokenizer that supports using a custom char group or some built-in char groups as delimiters. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-29 11:11:35 +08:00
congqixia	f666d89919	fix: [StorageV2] Access future result to get exception if any (#43613 ) Related to #43584 When `LoadWithStrategy` throws an exception, the exception is wrapped in the returned future. If the future is not handled, the exception is ignored. This patch adds `future.get()` to surface the exception, if any. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-28 22:33:35 +08:00
Xiaofan	bd31b32167	fix: hybridsearch should support offset param in restful api (#43586 ) Add support for the offset param in the RESTful API and refine some constant usage. related: #43556 Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2025-07-28 22:15:36 +08:00
yihao.dai	a29b3272b0	fix: Improve import memory management to prevent OOM (#43568 ) 1. Use blocking memory allocation to wait until memory becomes available 2. Perform memory allocation at the file level instead of per task 3. Limit Parquet file reader batch size to prevent excessive memory consumption 4. Limit import buffer size from 20% to 10% of total memory issue: https://github.com/milvus-io/milvus/issues/43387, https://github.com/milvus-io/milvus/issues/43131 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-28 21:25:35 +08:00
Zhen Ye	5b9b895cb0	fix: get schema panics when recover from channel checkpoint (#43605 ) issue: #43597, #43598 Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-28 16:42:56 +08:00
Spade A	864d1b93b1	enhance: enable stlsort with mmap support (#43359 ) issue: https://github.com/milvus-io/milvus/issues/43358 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-28 15:32:55 +08:00
zhagnlu	9bf1cb02d5	fix: add array_contains_all int to float converter (#43593 ) #43334 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-07-28 14:14:55 +08:00
Zhen Ye	648994182f	fix: pulsar use more memory for queue (#43565 ) issue: #43564 Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-28 14:00:56 +08:00
yihao.dai	192521c6bd	enhance: Fix unbalanced task scheduling (#43581 ) Make scheduler always pick the node with the most available slots. issue: https://github.com/milvus-io/milvus/issues/43580 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-28 12:58:55 +08:00
congqixia	34d3f0c0f8	enhance: Reserve builder space for ValueSerializer (#43570 ) Add `arrowBuild.Reserve` call for `ValueSerializer` to reduce repeated resizing buffer when write size is large Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-28 11:02:55 +08:00
yanliang567	10ec3ce2bf	test: Upgrade pymilvus to 2.6.0rc166 (#43543 ) related issue: #40698 Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>	2025-07-28 10:30:55 +08:00
wei liu	7b8bf6393b	enhance: Improve partial result evaluation with row count based strategy (#43361 ) issue: #43360 Enhance the partial result evaluation mechanism in delegator to use row count based data ratio instead of simple segment count ratio for better accuracy. Key improvements: - Introduce PartialResultEvaluator interface for flexible evaluation strategy - Implement NewRowCountBasedEvaluator using sealed segment row count data - Replace segment count based ratio with row count based data ratio calculation - Update PinReadableSegments to return sealedRowCount information - Modify executeSubTasks to use configurable evaluator for partial result decisions - Add comprehensive unit tests for the new row count based evaluation logic This change provides more accurate partial result evaluation by considering the actual data volume rather than just segment quantity, leading to better query performance and consistency when some segments are unavailable. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-28 10:18:55 +08:00
Zhen Ye	7877aaa96c	fix: dirty cp metrics after drop (#43567 ) issue: #42688 - The channel cp is dropped by garbage collector - The channel is dropped and the cp is marked as math.Uint64 - If we drop it here, the update channel checkpoints will write the dirty cp back. Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-27 23:22:55 +08:00
Zhen Ye	feb5db60f2	fix: make flush save binlog paths idempotent (#43579 ) issue: #43574 Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-27 23:14:55 +08:00
Spade A	faeb7fd410	feat: impl StructArray -- create schema, insert, and retrieve data (#42855 ) Ref https://github.com/milvus-io/milvus/issues/42148 https://github.com/milvus-io/milvus/pull/42406 impls the segcore part of storage for handling with VectorArray. This PR: 1. impls the go part of storage for VectorArray 2. impls the collection creation with StructArrayField and VectorArray 3. insert and retrieve data from the collection. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>	2025-07-27 01:30:55 +08:00
Buqian Zheng	b497d3d7a4	fix: call promise->setValue only after released the ListNode mtx (#43547 ) issue: #43261 `promise->setValue(folly::Unit());` may run callbacks inline and some of them may attempt to grab `mtx_`. So we should not call `promise->setValue(folly::Unit());` while holding the lock. --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-26 18:34:55 +08:00
Ted Xu	9041bf1b9a	fix: including shouldCopy parameter in file readers (#43578 ) This parameter determines whether the returned value should be a copy or a reference from the arrow array. The updates enhance memory management and provide more control over data handling during deserialization. See #43186 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-07-26 17:30:55 +08:00
Bingyi Sun	742d72a6c2	fix: Fix wrong null offsets for json path index (#43390 ) issue: https://github.com/milvus-io/milvus/issues/43315 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-26 17:26:54 +08:00
Bingyi Sun	a89e579485	fix: use tantivy version to make json index compatible with milvus 2.5 (#43563 ) issue: https://github.com/milvus-io/milvus/issues/43562 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-26 17:18:55 +08:00
congqixia	0b860b4aec	fix: Revert "enhance: DataCodec to release ownership of input_data after initialization (#43542 )" (#43571 )	2025-07-25 20:48:16 +08:00
Zhen Ye	070aabd27e	enhance: fix remove flushing state of segment (#43560 ) issue: #43559, #42884 - also fix the data lost when streaming resuming from old arch message. Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-25 18:08:54 +08:00
congqixia	2a7b7a811a	fix: [StorageV2] Throw exception when read rg fails (#43561 ) Related to #43261 Read errors were caught in `LoadWithStrategy`, so the caller could not detect a read failure when an error occurred. This patch makes `LoadWithStrategy` throw the exception instead of swallowing it. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-25 17:40:55 +08:00
yihao.dai	0e1f367164	enhance: Fail compaction task to prevent data loss (#43545 ) We’ve frequently observed data loss caused by broken mutual exclusion in compaction tasks. This PR introduces a post-check: before modifying metadata upon compaction task completion, it verifies the state of the input segments. If any input segment has been dropped, the compaction task will be marked as failed. issue: https://github.com/milvus-io/milvus/issues/43513 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-25 16:24:54 +08:00
Ted Xu	078ccf5e08	fix: the underlying record got released in clustering compaction (#43551 ) See: #43186 In this PR: 1. Flush is renamed to FlushChunk, while a new Flush primitive is introduced to serialize values to records. 2. Segment mapping in clustering compaction now processes data by records instead of values; it calls flush on all buffers after each record is processed. Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-07-25 15:04:54 +08:00
Buqian Zheng	d23205b718	enhance: DataCodec to release ownership of input_data after initialization (#43542 ) issue: https://github.com/milvus-io/milvus/issues/43088 issue: https://github.com/milvus-io/milvus/issues/43038 see also https://github.com/milvus-io/milvus/pull/43533. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-25 14:24:54 +08:00
wei liu	369a811ae1	fix: only clear exclude node list after refresh shard leader cache (#43553 ) issue: #43511 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-25 14:18:54 +08:00
sthuang	5cebc9f7f6	fix: [StorageV2] handle correct cid with multiple files and add storage v2 prefix logs (#43539 ) related: #43372 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-25 11:22:54 +08:00
Shuyoou	87326a5a64	fix: [skip e2e] webui collection filter params error (#42969 ) Fix Issue: #40929 Signed-off-by: Shuyoou <shuyoou@outlook.com>	2025-07-25 10:40:53 +08:00
tinswzy	83f6811dbd	fix: local fs incomplete block read bug (#43444 ) #43340 fix log reader bug #43370 list object goroutine leak; block flush bug #43431 #43356 improve read latency other fix: local FS block CRC fix; incomplete block read bugfix; multi-segment rolling not complete bug; local fs concurrent flush bug other enhance: log reader EOF-based segment end detection; revisioned log/segment meta updates. Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2025-07-25 10:30:54 +08:00
Spade A	10fe53ff59	feat: support json for ngram (#43170 ) Ref https://github.com/milvus-io/milvus/issues/42053 This PR enables ngram to support the json data type. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-25 10:28:54 +08:00
sthuang	a0c9f499ee	fix: [StorageV2] sync panic with nullable add field (#43142 ) related: https://github.com/milvus-io/milvus/pull/42932 fix: https://github.com/milvus-io/milvus/issues/43072 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-25 10:08:53 +08:00
zhagnlu	c86307aef0	enhance: forbid two column comparison with json type in parser stage (#43382 ) #43381 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-07-24 19:42:54 +08:00
congqixia	fe16de702b	test: [GoSDK] Use strong consistency level for hybrid search cases (#43536 ) There are some unstable cases in the Go SDK e2e tests that used the default bounded consistency level. This patch makes these cases use the strong level to avoid unstable test results. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-24 15:26:54 +08:00
yihao.dai	804a7692a6	fix: Fix delete loss caused by missing mutual exclusion in sort compaction (#43540 ) issue: https://github.com/milvus-io/milvus/issues/43513 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-24 14:53:34 +08:00
qixuan	3d8f728091	test: modify add field case about skipped cases (#43461 ) related issue: #42126 Signed-off-by: qixuan <673771573@qq.com>	2025-07-24 14:26:54 +08:00
Buqian Zheng	d367770649	enhance: greatly reduce the loading memory overhead - by up to 25% (#43533 ) issue: #43088 issue: #43038 The current loading process: * When loading an index, we first download the index files into a list of buffers, say A * then construct (copy) them into a vector of FieldDatas (each file is a FieldData), say B * assemble them together as a huge BinarySet, say C * lastly, copy into the actual index data structure, say D The problem: * After each step, we no longer need the data from the previous step. * But currently, we release the memory of A, B, C only after we have finished constructing D * This leads to peak memory usage of up to 4x the raw index size during the loading process * This PR allows timely release of B once C has been assembled. So after this PR, the peak memory usage during loading will be up to 3x the raw index size. I will create another PR to release A after B is created; that seems more complicated and needs more work. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-24 11:26:54 +08:00
congqixia	4bdb5ccafa	fix: Close segment writer when reader returns error (#43531 ) Related to #43520 Datanode may leak memory when the reader returns an error. In the previously mentioned issue, datanodes were OOM-killed due to continuous errors in the read path. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-24 11:18:54 +08:00
Jean-Francois Weber-Marx	1bd66b09e3	enhance: allow '.' and '-' characters in usernames (#42417 ) (#42588 ) related: #42417 - update the isValidUsername function to accept dots and hyphens in addition to letters, digits, and underscores - this change improves compatibility with common username formats and addresses feedback in issue #42417 Signed-off-by: Jean-Francois Weber-Marx <jfwm@hotmail.com> Signed-off-by: Jean-Francois Weber-Marx <jf.webermarx@criteo.com>	2025-07-24 09:54:54 +08:00
wei liu	990a25e51a	fix: Prevent delete records loss during slow segment loading [QueryNodeV2] (#43527 ) issue: #42884 Fixes an issue where delete records for a segment are lost from the delete buffer if `load segment` execution on the delegator is too slow, causing `syncTargetVersion` or other cleanup operations to clear them prematurely. Changes include: - Introduced `Pin` and `Unpin` methods in `DeleteBuffer` interface and its implementations (`doubleCacheBuffer`, `listDeleteBuffer`). - Added a `pinnedTimestamps` map to track timestamps protected from cleanup by specific segments. - Modified `LoadSegments` in `shardDelegator` to `Pin` relevant segment delete records before loading and `Unpin` them afterwards. - Added `isPinned` check in `UnRegister` and `TryDiscard` methods of `listDeleteBuffer` to skip cleanup if corresponding timestamps are pinned. - Added comprehensive unit tests for `Pin`, `Unpin`, and `isPinned` functionality, covering basic, multiple pins, concurrent, and edge cases. This ensures the integrity of delete records by preventing their premature removal from the delete buffer during segment loading. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-24 01:00:54 +08:00
congqixia	1cf8ed505f	fix: Implement `NeededFields` feature in `RecordReader` (#43523 ) Related to #43522 Currently, passing a partial schema to the storage v2 packed reader may trigger SEGV during the clustering compaction unit test. This patch implements `NeededFields` differently in each `RecordReader` implementation. For now, v2 is implemented as a no-op. This will be supported once the packed reader supports this API. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-24 00:22:54 +08:00
yanliang567	abb3aeacdf	test: Refactor diskann and hnsw index, and update gen data functions (#43452 ) related issue #40698 1. add diskann and hnsw index test 2. update gen_row_data and gen_column_data functions --------- Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>	2025-07-23 22:04:54 +08:00
Zhen Ye	e9ab73e93d	enhance: add schema version at recovery storage (#43500 ) issue: #43072, #43289 - manage the schema version at recovery storage. - update the schema when creating collection or alter schema. - get schema at write buffer based on version. - recover the schema when upgrading from 2.5. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-23 21:38:54 +08:00
yihao.dai	9fbd41a97d	fix: Adjust binlog and parquet reader buffer size for import (#43495 ) 1. Modify the binlog reader to stop reading a fixed 4096 rows and instead use the calculated bufferSize to avoid generating small binlogs. 2. Use a fixed bufferSize (32MB) for the Parquet reader to prevent OOM. issue: https://github.com/milvus-io/milvus/issues/43387 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-23 21:28:54 +08:00
foxspy	ed57650b52	fix: remove invalid restrictions on dim for int8 vector (#43469 ) issue: #43466 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-07-23 20:22:54 +08:00
cai.zhang	74c08069ef	fix: Set result storage version for sort compaction (#43521 ) issue: #43520 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-23 19:04:53 +08:00

22824 Commits