milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
Buqian Zheng	d23205b718	enhance: DataCodec to release ownership of input_data after initialization (#43542 ) issue: https://github.com/milvus-io/milvus/issues/43088 issue: https://github.com/milvus-io/milvus/issues/43038 see also https://github.com/milvus-io/milvus/pull/43533. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-25 14:24:54 +08:00
wei liu	369a811ae1	fix: only clear exclude node list after refresh shard leader cache (#43553 ) issue: #43511 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-25 14:18:54 +08:00
sthuang	5cebc9f7f6	fix: [StorageV2] handle correct cid with multiple files and add storage v2 prefix logs (#43539 ) related: #43372 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-25 11:22:54 +08:00
Shuyoou	87326a5a64	fix: [skip e2e] webui collection filter params error (#42969 ) Fix Issue: #40929 Signed-off-by: Shuyoou <shuyoou@outlook.com>	2025-07-25 10:40:53 +08:00
tinswzy	83f6811dbd	fix: local fs incomplete block read bug (#43444 ) #43340 fix log reader bug #43370 list object goroutine leak ; block flush bug #43431 #43356 improve read latency other fix: local FS block CRC fix; incomplete block read bugfix; multi-segment rolling not complete bug; local fs concurrent flush bug other enhance: log reader EOF-based segment end detection ; revisioned log/segment meta updates. Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2025-07-25 10:30:54 +08:00
Spade A	10fe53ff59	feat: support json for ngram (#43170 ) Ref https://github.com/milvus-io/milvus/issues/42053 This PR enables ngram to support the json data type. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-25 10:28:54 +08:00
sthuang	a0c9f499ee	fix: [StorageV2] sync panic with nullable add field (#43142 ) related: https://github.com/milvus-io/milvus/pull/42932 fix: https://github.com/milvus-io/milvus/issues/43072 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-25 10:08:53 +08:00
zhagnlu	c86307aef0	enhance: forbid two column comparison with json type in parser stage (#43382 ) #43381 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-07-24 19:42:54 +08:00
congqixia	fe16de702b	test: [GoSDK] Use strong consistency level for hybrid search cases (#43536 ) Some go sdk e2e cases are unstable because they use the default bounded consistency level. This patch makes these cases use the strong level to avoid unstable test results. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-24 15:26:54 +08:00
yihao.dai	804a7692a6	fix: Fix delete loss caused by missing mutual exclusion in sort compaction (#43540 ) issue: https://github.com/milvus-io/milvus/issues/43513 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-24 14:53:34 +08:00
qixuan	3d8f728091	test: modify add field case about skipped cases (#43461 ) related issue: #42126 Signed-off-by: qixuan <673771573@qq.com>	2025-07-24 14:26:54 +08:00
Buqian Zheng	d367770649	enhance: greatly reduce the loading memory overhead - by up to 25% (#43533 ) issue: #43088 issue: #43038 The current loading process: * When loading an index, we first download the index files into a list of buffers, say A * then construct (copy) them into a vector of FieldDatas (each file is a FieldData), say B * assemble them together as a huge BinarySet, say C * lastly, copy into the actual index data structure, say D The problem: * After each step, we no longer need the data from the previous step. * But currently, we release the memory of A, B, C only after we have finished constructing D * This leads to peak memory usage of up to 4x the raw index size during the loading process * This PR allows timely release of B after we have assembled C. So after this PR, the peak memory usage during loading will be up to 3x the raw index size. I will create another PR to release A after we have created B; that seems more complicated and needs more work. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-24 11:26:54 +08:00
congqixia	4bdb5ccafa	fix: Close segment writer when reader returns error (#43531 ) Related #43520 Datanode may leak memory when the reader returns an error. In the previously mentioned issue, datanodes got OOM killed due to continuous errors in the read path. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-24 11:18:54 +08:00
Jean-Francois Weber-Marx	1bd66b09e3	enhance: allow '.' and '-' characters in usernames (#42417 ) (#42588 ) related: #42417 - update the isValidUsername function to accept dots and hyphens in addition to letters, digits, and underscores - this change improves compatibility with common username formats and addresses feedback in issue #42417 Signed-off-by: Jean-Francois Weber-Marx <jfwm@hotmail.com> Signed-off-by: Jean-Francois Weber-Marx <jf.webermarx@criteo.com>	2025-07-24 09:54:54 +08:00
wei liu	990a25e51a	fix: Prevent delete records loss during slow segment loading [QueryNodeV2] (#43527 ) issue: #42884 Fixes an issue where delete records for a segment are lost from the delete buffer if `load segment` execution on the delegator is too slow, causing `syncTargetVersion` or other cleanup operations to clear them prematurely. Changes include: - Introduced `Pin` and `Unpin` methods in `DeleteBuffer` interface and its implementations (`doubleCacheBuffer`, `listDeleteBuffer`). - Added a `pinnedTimestamps` map to track timestamps protected from cleanup by specific segments. - Modified `LoadSegments` in `shardDelegator` to `Pin` relevant segment delete records before loading and `Unpin` them afterwards. - Added `isPinned` check in `UnRegister` and `TryDiscard` methods of `listDeleteBuffer` to skip cleanup if corresponding timestamps are pinned. - Added comprehensive unit tests for `Pin`, `Unpin`, and `isPinned` functionality, covering basic, multiple pins, concurrent, and edge cases. This ensures the integrity of delete records by preventing their premature removal from the delete buffer during segment loading. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-24 01:00:54 +08:00
congqixia	1cf8ed505f	fix: Implement `NeededFields` feature in `RecordReader` (#43523 ) Related to #43522 Currently, passing a partial schema to the storage v2 packed reader may trigger SEGV during the clustering compaction unit test. This patch implements `NeededFields` differently in each `RecordReader` implementation. For now, v2 is implemented as a no-op. This will be supported after the packed reader supports this API. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-24 00:22:54 +08:00
yanliang567	abb3aeacdf	test: Refactor diskann and hsnw index, and update gen data functions (#43452 ) related issue #40698 1. add diskann and hnsw index test 2. update gen_row_data and gen_column_data functions --------- Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>	2025-07-23 22:04:54 +08:00
Zhen Ye	e9ab73e93d	enhance: add schema version at recovery storage (#43500 ) issue: #43072, #43289 - manage the schema version at recovery storage. - update the schema when creating collection or alter schema. - get schema at write buffer based on version. - recover the schema when upgrading from 2.5. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-23 21:38:54 +08:00
yihao.dai	9fbd41a97d	fix: Adjust binlog and parquet reader buffer size for import (#43495 ) 1. Modify the binlog reader to stop reading a fixed 4096 rows and instead use the calculated bufferSize to avoid generating small binlogs. 2. Use a fixed bufferSize (32MB) for the Parquet reader to prevent OOM. issue: https://github.com/milvus-io/milvus/issues/43387 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-23 21:28:54 +08:00
foxspy	ed57650b52	fix: remove invalid restrictions on dim for int8 vector (#43469 ) issue: #43466 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-07-23 20:22:54 +08:00
cai.zhang	74c08069ef	fix: Set result storage version for sort compaction (#43521 ) issue: #43520 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-23 19:04:53 +08:00
zhagnlu	d64dceea47	fix:add convert int to float function to array_contains related expr (#43468 ) #43281 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-07-23 15:20:53 +08:00
junjiejiangjjj	4db877f76c	fix: Fix weighted rerank (#43503 ) #43478 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-07-23 14:54:53 +08:00
Buqian Zheng	7ced9fc5d9	fix: fix loading resource estimation (#43509 ) currently we multiplied the requesting size when adding to loading, but did not do so when estimating projected usage. issue: #43088 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-23 10:36:53 +08:00
congqixia	cc1034fe96	fix: [AddField] Resolve FieldIndexing dangling reference (#43499 ) Related to #43113 This PR: - Change member of FieldIndex from `FieldMeta &` to needed `DataType` and dim member resolving dangling reference after schema change - Add double check after acquiring lock to reduce multiple assignment - Change `auto schema` to `auto& schema` to reduce schema copy Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-23 00:14:52 +08:00
sthuang	59bbdd93f5	fix: [StorageV2] fill the correct group chunk into cell (#43486 ) The root cause of the issue lies in the fact that when a sealed segment contains multiple row groups, the get_cells function may receive unordered cids. This can result in row groups being written into incorrect cells during data retrieval. Previously, this issue was hard to reproduce because the old Storage V2 writer had a bug that caused it to write row groups larger than 1MB. These large row groups could lead to uncontrolled memory usage and eventually an OOM (Out of Memory) error. Additionally, compaction typically produced a single large row group, which avoided the incorrect cell-filling issue during query execution. related: https://github.com/milvus-io/milvus/issues/43388, https://github.com/milvus-io/milvus/issues/43372, https://github.com/milvus-io/milvus/issues/43464, #43446, #43453 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-22 22:22:53 +08:00
XuanYang-cn	92f4fc0e8b	fix: Set status when err is not empty (#43403 ) See also: #43341 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-07-22 17:48:53 +08:00
cai.zhang	f19e0ef6e4	fix: Ensure task execution order by using a priority queue (#43271 ) issue: #43260 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-22 17:42:53 +08:00
cai.zhang	e26a532504	enhance: Only download necessary fields during clustering analyze phase (#43322 ) issue: #43310 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-07-22 16:40:52 +08:00
Zhen Ye	df7e507c49	fix: balance may not trigger at balance checker when upgrading (#43462 ) issue: #43416 Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-22 16:02:53 +08:00
Buqian Zheng	0599113a4b	enhance: add timeout to resource reservation (#43441 ) issue: https://github.com/milvus-io/milvus/issues/41435 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-22 15:24:53 +08:00
yihao.dai	a839017e81	fix: Handle retry state in import task (#43474 ) issue: https://github.com/milvus-io/milvus/issues/43473 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-22 14:52:53 +08:00
Chun Han	5a1092304c	fix: refine judgement for batch views(#38736 ) (#43481 ) related: #38736 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-07-22 14:20:53 +08:00
congqixia	5c0f0ee765	enhance: [StorageV2] Return EOF when packedReader closed (#43465 ) This patch makes `PackedReader` return EOF when `ReadNext` is called after closing it. This behavior lets the importv2.binlog reader retry after EOF is reached and act normally. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-22 14:04:52 +08:00
yihao.dai	5124ed9758	fix: Fix import fileStats incorrectly set to nil (#43463 ) 1. Ensure that tasks in the InProgress state return valid fileStats. 2. Enhance import logs. issue: https://github.com/milvus-io/milvus/issues/43387 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-22 12:37:01 +08:00
congqixia	563e2935c5	enhance: [StorageV2] Fill ts range default values for `PackedBinlogRecordWriter` (#43454 ) This PR fills default values for the `PackedBinlogRecordWriter` timestamp range so the target segment meta will contain the correct timestamp range. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-22 12:04:53 +08:00
zhikunyao	c5bb236e1e	test: add jobs=8 in e2e and go-sdk tests (#43475 ) Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>	2025-07-22 11:48:53 +08:00
sthuang	f77571d5c1	fix: [StorageV2] file writer write row group split to default size (#43471 ) Bumped milvus storage version. related: https://github.com/milvus-io/milvus/issues/43310 * https://github.com/milvus-io/milvus-storage/pull/213 * https://github.com/milvus-io/milvus-storage/pull/217 * https://github.com/milvus-io/milvus-storage/pull/220 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-22 09:52:52 +08:00
sre-ci-robot	ac94428e1b	Update all contributors Signed-off-by: sre-ci-robot <sre-ci-robot@zilliz.com>	2025-07-21 12:00:46 +00:00
sthuang	6c5f5f1e32	enhance: [StorageV2] refactor group chunk translator (#43406 ) related: #43372 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-21 19:46:53 +08:00
sparknack	81694739ef	fix: revert ska::flat_hash_set to std::unordered_set to address an un… (#43428 ) issue: #43388 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-07-21 17:39:40 +08:00
aoiasd	e9fc140eaf	fix: jieba tokenizer cause panic when dict word was empty string (#43337 ) relate: https://github.com/milvus-io/milvus/issues/42779 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-21 16:34:53 +08:00
nico	d451476a62	test: update nightly cases (#43410 ) Signed-off-by: nico <cheng.yuan@zilliz.com>	2025-07-21 15:12:53 +08:00
aoiasd	c7b53ed43b	enhance: run rust format (#43447 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-21 14:12:53 +08:00
zhikunyao	26d6918010	test: add jobs=8 in ut-cpp (#43436 ) Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>	2025-07-21 11:40:52 +08:00
zhuwenxing	b619684ca2	test: add collection rename checker in chaos test (#43412 ) /kind improvement Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>	2025-07-21 11:34:53 +08:00
junjiejiangjjj	77f3a1f213	enhance: Add search post pipeline (#43065 ) https://github.com/milvus-io/milvus/issues/35856 Signed-off-by: junjiejiangjjj <junjie.jiang@zilliz.com>	2025-07-21 11:10:52 +08:00
Bingyi Sun	21e71f6eb2	fix: Check json nested path before validating data type (#43329 ) issue: #43279 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-21 10:30:54 +08:00
Zhen Ye	69c8c2660b	fix: create nil start position segment if sync start position before insert (#43435 ) issue: #43434 - the segment start position can be carried by another segment's sync operation, so the sync start position operation can happen before insert. - TODO: This is a weird design that should be removed. Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-21 09:50:52 +08:00
Bingyi Sun	09b6407e63	enhance: optimize error msg for json index inconsistent parameters (#43345 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-21 00:32:52 +08:00

22795 Commits