milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
wei liu	039564199c	fix: Prevent duplicate segment results in count queries (#43173 ) issue: #41570 Fix issue where growing and sealed segments could be searched simultaneously, causing inflated count() results. This was caused by logic introduced in PR #42009 that made sealed segments readable before target version advancement. Changes include: - Fix conditional filtering logic in PinReadableSegments to prevent sealed segments from becoming readable prematurely - Use target version filter for full results (ratio=1.0) to ensure sealed segments only become readable after target advancement - Use query view segment list filter for partial results (ratio<1.0) to maintain backward compatibility - Simplify target version setting in AddDistributions to prevent premature segment readability - Add logging for redundant growing segments during sync - Add comprehensive unit tests covering the duplicate segment scenario This fix ensures count() queries return accurate results by preventing the same segment from being counted in both growing and sealed states. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-14 11:10:49 +08:00
Zhen Ye	15a6631147	enhance: add quota limit based on sn consuming lag (#43105 ) issue: #42995 - The consuming lag at streaming node will be reported to coordinator. - The consuming lag will trigger the write limit and deny by quota center. - Set the ttProtection by default. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-11 14:10:49 +08:00
Chun Han	07745439b5	fix: empty search groupby result causing crash(#43137 ) (#43214 ) related: #43137 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-07-10 12:04:48 +08:00
congqixia	f027eea545	enhance: [AddField] Add log for segcore segment schema change (#43215 ) Related to #39178 This PR add logs for segment schema change operations. Also fixes the nit comments from PR #42490 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-10 10:22:47 +08:00
aoiasd	97b1c3ed96	enhance: add warn log if some segment's bm25 stats lacks (#43111 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-09 23:22:47 +08:00
aoiasd	54cc0b60f2	fix: dropped segment in excluded segment use wrong excluded ts (#43115 ) cause some excluded growing data insert again relate: https://github.com/milvus-io/milvus/issues/43114 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-08 18:04:46 +08:00
sparknack	7e855f1046	enhance: add disk file writer with Direct IO support (#42665 ) issue: #43040 This patch introduces a disk file writer that supports Direct IO. Currently, it is exclusively utilized during the QueryNode load process. Below is its parameters: 1. `common.diskWriteMode` This parameter controls the write mode of the local disk, which is used to write temporary data downloaded from remote storage. Currently, only QueryNode uses 'common.diskWrite*' parameters. Support for other components will be added in the future. The options include 'direct' and 'buffered'. The default value is 'buffered'. 2. `common.diskWriteBufferSizeKb` Disk write buffer size in KB, only used when disk write mode is 'direct', default is 64KB. Current valid range is [4, 65536]. If the value is not aligned to 4KB, it will be rounded up to the nearest multiple of 4KB. 3. `common.diskWriteNumThreads` This parameter controls the number of writer threads used for disk write operations. The valid range is [0, hardware_concurrency]. It is designed to limit the maximum concurrency of disk write operations to reduce the impact on disk read performance. For example, if you want to limit the maximum concurrency of disk write operations to 1, you can set this parameter to 1. The default value is 0, which means the caller will perform write operations directly without using an additional writer thread pool. In this case, the maximum concurrency of disk write operations is determined by the caller's thread pool size. Both parameters can be updated during runtime. --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-07-02 22:18:44 +08:00
congqixia	7bc7b18ed5	fix: [AddField] Prevent concurrent load during UpdateSchema (#43043 ) Related to #43028 This PR: - Add mutex to prevent concurrent segment load & schema change - Add schema version field in load meta - Update schema in PutOrRef if schema version is larger --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-02 17:38:44 +08:00
wei liu	c381bf3e41	enhance: add logs for count(*) (#43001 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-01 19:36:43 +08:00
Spade A	26ec841feb	feat: optimize `Like` query with n-gram (#41803 ) Ref #42053 This is the first PR for optimizing `LIKE` with ngram inverted index. Now, only VARCHAR data type is supported and only InnerMatch LIKE (%xxx%) query is supported. How to use it: ``` milvus_client = MilvusClient("http://localhost:19530") schema = milvus_client.create_schema() ... schema.add_field("content_ngram", DataType.VARCHAR, max_length=10000) ... index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="content_ngram", index_type="NGRAM", index_name="ngram_index", min_gram=2, max_gram=3) milvus_client.create_collection(COLLECTION_NAME, ...) ``` min_gram and max_gram controls how we tokenize the documents. For example, for min_gram=2 and max_gram=4, we will tokenize each document with 2-gram, 3-gram and 4-gram. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-07-01 10:08:44 +08:00
wei liu	396120ade5	enhance: Improve delegator serviceable check with coordinator sync state (#42975 ) issue: #42404 Add syncedByCoord field to ensure delegator only becomes serviceable after coordinator sync, preventing unreliable service state when memory is insufficient. Issue: When memory is low, delegator may become serviceable before current target is ready, but segments can be released at any time, making the serviceable state unreliable. Changes include: - Add syncedByCoord field to track coordinator sync status - Update Serviceable() to require both data readiness and coord sync - Set syncedByCoord=true in SyncTargetVersion - Add comprehensive test coverage Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-07-01 10:00:43 +08:00
aoiasd	e2566c0e92	enhance: bm25 stats local cache use local storage path (#42923 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-06-25 13:44:46 +08:00
sthuang	0d57acb13a	enhance: [StorageV2] field id as meta path for wide column when load (#42863 ) related: #42862 #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-25 11:08:48 +08:00
Zhen Ye	6798fdc3b3	fix: rocksmq cannot graceful stop (#42841 ) issue: #40532 Signed-off-by: chyezh <chyezh@outlook.com>	2025-06-19 19:38:39 +08:00
congqixia	4ba177cd2c	enhance: [StorageV2] Handle narrow column group resource estimation (#42842 ) Related to #39173 In storage v2, "narrow" column group could have group id not mapped schema, which causing loading fails or resource estimation result inaccurate. This PR handles this case by mapping binlog from index instead of vice versa. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-19 14:44:39 +08:00
wei liu	bf5fde1431	fix: Prevent delegator unserviceable due to shard leader change (#42689 ) issue: #42098 #42404 Fix critical issue where concurrent balance segment and balance channel operations cause delegator view inconsistency. When shard leader switches between load and release phases of segment balance, it results in loading segments on old delegator but releasing on new delegator, making the new delegator unserviceable. The root cause is that balance segment modifies delegator views, and if these modifications happen on different delegators due to leader change, it corrupts the delegator state and affects query availability. Changes include: - Add shardLeaderID field to SegmentTask to track delegator for load - Record shard leader ID during segment loading in move operations - Skip release if shard leader changed from the one used for loading - Add comprehensive unit tests for leader change scenarios This ensures balance segment operations are atomic on single delegator, preventing view corruption and maintaining delegator serviceability. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-06-19 12:10:38 +08:00
Spade A	e2c85eec81	fix: load stats index based on mmap config (#42788 ) ref https://github.com/milvus-io/milvus/issues/42626 This PR makes text match index and json key stats index be loaded based on mmap config. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-06-19 10:10:39 +08:00
Chun Han	001619aef9	feat: supporting load priority for loading (#42413 ) related: #40781 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-06-17 15:22:38 +08:00
wei liu	679930bb93	enhance: refine delegator state checking error msg (#42673 ) issue: #42661 Add NotStopped() and IsWorking() methods to shardDelegator for better state management and error handling. Changes include: - Add instance state checking methods with proper error messages - Replace lifetime package calls with delegator instance methods - Add comprehensive unit tests for state transitions and error cases - Improve error reporting with channel name for better debugging Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-06-17 10:40:38 +08:00
aoiasd	201e980d3d	fix: flow graph should free function resource after all node close (#42731 ) relate: https://github.com/milvus-io/milvus/issues/42730 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-06-13 22:14:37 +08:00
wei liu	78c39edbce	fix: Fix potential panic when DeleteCheckpoint is nil (#42664 ) issue: #42663 Fix panic issue when processing VchannelInfo messages from older coordinator versions that don't have DeleteCheckpoint field. Changes: - Add null safety check for DeleteCheckpoint before accessing methods - Maintain backward compatibility with legacy message formats - Improve seek position selection logic for both old and new versions --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-06-13 14:26:36 +08:00
Zhen Ye	ca48603f35	fix: msg dispatcher lost data at streaming service (#42670 ) issue: #41570 Signed-off-by: chyezh <chyezh@outlook.com>	2025-06-13 11:54:36 +08:00
congqixia	c9bc70f272	fix: [AddField] Use shared_ptr of schema in plan fixing dangling ref (#42693 ) Related to #42640 The search/query plan held a reference to the schema, which could be destructed after a schema change. This PR makes the plan hold a shared ptr to it, fixing the dangling reference problem under concurrent read & schema change. This PR also removes the field binlog check when loading an index, since old segments with an old schema may lack binlogs. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-12 20:46:36 +08:00
Buqian Zheng	8511ede5f8	feat: add back queryNode.cache.warmup for compatibility (#42621 ) issue: https://github.com/milvus-io/milvus/issues/41435 also make ChunkTranslator to load in parallel --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-06-12 10:56:40 +08:00
wei liu	e7c0a6ffbb	enhance: Refine QueryNode task parallelism based on CPU core count (#42166 ) issue: #42165 Implement dynamic task execution capacity calculation based on QueryNode CPU core count instead of static configuration for better resource utilization. Changes include: - Add CpuCoreNum() method and WithCpuCoreNum() option to NodeInfo - Implement GetTaskExecutionCap() for dynamic capacity calculation - Add QueryNodeTaskParallelismFactor parameter for tuning - Update proto definition to include cpu_core_num field - Add unit tests for new functionality This allows QueryCoord to automatically adjust task parallelism based on actual hardware resources. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-06-11 13:20:35 +08:00
XuanYang-cn	83877b9faf	enhance: remove extra get collection (#42042 ) Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-06-10 18:34:35 +08:00
wei liu	f3fe117840	fix: Use delete checkpoint to prevent delete record loss in L0 refactoring (#42628 ) issue: #39333 #41570 Fix delete record missing issue introduced in PR #39552 L0 refactoring: - Use delete checkpoint as consume start position when deleteCP < channelCP - Add logging when delete checkpoint is used instead of seek position - Prevent delete record loss when deleteCP is earlier than default channelCP Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-06-10 17:34:35 +08:00
aoiasd	13330bd466	fix: add concurrency and close protect for bm25 function (#42597 ) relate: https://github.com/milvus-io/milvus/issues/42576 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-06-10 11:36:34 +08:00
aoiasd	2eb24fbe7c	fix: analyzer memory leak because function runner not close (#41839 ) relate: https://github.com/milvus-io/milvus/issues/41213 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-06-05 14:24:40 +08:00
Zhen Ye	0567f512b3	fix: streamingnode gets stuck when stopping (#42501 ) issue: #42498 - fix: sealed segment cannot be flushed after upgrading - fix: get mvcc panic when upgrading - ignore the L0 segment during graceful stop of querynode. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-06-05 12:22:31 +08:00
cai.zhang	5566a85bcc	enhance: Add proxy task queue metrics (#42156 ) issue: #42155 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-06-04 11:26:32 +08:00
Zhen Ye	508264f953	fix: querynode upgrading from 2.5 gets stuck (#42502 ) issue: #42492 - consider the old RO query node (not streaming node) when balancing channel. - querynode graceful stop can be done if only L0 segments exist. Signed-off-by: chyezh <chyezh@outlook.com>	2025-06-04 11:20:30 +08:00
congqixia	b76478378a	feat: [Tiered] Make load list work as warmup hint (#42490 ) Related to #42489 See also #41435 This PR's main target is to make partial load field list work as caching layer warmup policy hint. If user specify load field list, the fields not included in the list shall use `disabled` warmup policy and be able to lazily loaded if any read op uses them. The major changes are listed here: - Pass load list to segcore and creating collection&schema - Add util functions to check field shall be proactively loaded - Adapt storage v2 column group, which may lead to hint fail if columns share same group --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-04 10:28:32 +08:00
congqixia	cc42d49769	fix: [StorageV2][AddField] Handle lack binlog rows in storage v2 (#42186 ) Related to #39173 #39718 In storage v2, the `lack_bin_rows` cannot be used since field id is not column group id, which will not be matched forever. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-31 02:44:30 +08:00
congqixia	6d2ad519b1	enhance:[StorageV2] Adapt local storage & other minor issue (#42167 ) Related to #39173 This PR - Handle storage v2 log path in local storage mode on querynode - Ignore field info check when append index for loaded sealed segment when using storage v2 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-30 10:22:29 +08:00
Chun Han	ed0df38605	enhance: resize high priority wqthreadpool dynamically(#40838 ) (#41549 ) (#41929 ) related: #40838 pr: https://github.com/milvus-io/milvus/pull/41549 Signed-off-by: MrPresent-Han <chun.han@gmail.com>	2025-05-30 10:18:36 +08:00
Zhen Ye	4bad293655	enhance: make upgrading from 2.5.x less down time (#42082 ) issue: #40532 - start timeticksync at rootcoord if the streaming service is not available - stop timeticksync if the streaming service is available - open a read-only wal if some nodes in cluster is not upgrading to 2.6 - allow to open read-write wal after all nodes in cluster is upgrading to 2.6 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-29 23:02:29 +08:00
aoiasd	3a74044149	fix: hybrid search sub request does not set analyzer name (#41896 ) relate: https://github.com/milvus-io/milvus/issues/41213 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-05-29 14:56:28 +08:00
aoiasd	2ae4d80120	enhance: support run analyzer by loaded collection field (#42113 ) relate: https://github.com/milvus-io/milvus/issues/42094 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-05-29 10:54:30 +08:00
Xianhui Lin	da30e1e4df	fix: pass the ttl duration in the search request for ttl filter (#42122 ) fix: pass the TTL duration in the search request for TTL filter issue:https://github.com/milvus-io/milvus/issues/41959 Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-05-28 11:08:29 +08:00
cqy123456	5fe7015f63	enhance: InterimIndex support more index type and data type (#41021 ) issue: https://github.com/milvus-io/milvus/issues/27678 cherry pick from : https://github.com/milvus-io/milvus/pull/39180, https://github.com/milvus-io/milvus/pull/40429 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-05-28 08:40:28 +08:00
wei liu	54619eaa2c	feat: Implement partial result support on node down (#42009 ) issue: https://github.com/milvus-io/milvus/issues/41690 This commit implements partial search result functionality when query nodes go down, improving system availability during node failures. The changes include: - Enhanced load balancing in proxy (lb_policy.go) to handle node failures with retry support - Added partial search result capability in querynode delegator and distribution logic - Implemented tests for various partial result scenarios when nodes go down - Added metrics to track partial search results in querynode_metrics.go - Updated parameter configuration to support partial result required data ratio - Replaced old partial_search_test.go with more comprehensive partial_result_on_node_down_test.go - Updated proto definitions and improved retry logic These changes improve query resilience by returning partial results to users when some query nodes are unavailable, ensuring that queries don't completely fail when a portion of data remains accessible. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-05-28 00:12:28 +08:00
Xianhui Lin	6a0e182e13	enhance: support TTL expiration with queries returning no results (#42086 ) support TTL expiration with queries returning no results issue:https://github.com/milvus-io/milvus/issues/41959 Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-05-27 18:28:27 +08:00
aoiasd	0fafb706ba	enhance: add segment bm25 stats local cache (#41775 ) relate: https://github.com/milvus-io/milvus/issues/41424 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-05-26 18:44:27 +08:00
Zhen Ye	38c804fb01	fix: more stable recovery graceful closing and stable unittest (#42013 ) issue: #41544 Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-23 17:52:26 +08:00
wei liu	78010262f0	enhance: Optimize shard serviceable mechanism (#41937 ) issue: https://github.com/milvus-io/milvus/issues/41690 - Merge leader view and channel management into ChannelDistManager, allowing a channel to have multiple delegators. - Improve shard leader switching to ensure a single replica only has one shard leader per channel. The shard leader handles all resource loading and query requests. - Refine the serviceable mechanism: after QC completes loading, sync the query view to the delegator. The delegator then determines its serviceable status based on the query view. - When a delegator encounters forwarding query or deletion failures, mark the corresponding segment as offline and transition it to an unserviceable state. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-05-22 11:38:24 +08:00
wei liu	dad43a3894	fix: cost metrics collection logic for replica selection (#41965 ) issue: #41621 - Deprecate EnableWorkerSQCostMetrics parameter - Always collect cost metrics from all search and retrieve results - Update code with comments explaining the changes rationale Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-05-22 10:20:25 +08:00
Ted Xu	ae32203d3a	fix: support group by with nullable grouping keys (#41797 ) See #36264 In this PR: - Enhanced error handling in parse of grouping field. - Fixed null handling in reduce tasks in proxy nodes. - Updated tests to reflect changes in error handling and data processing logic. --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-05-17 20:54:22 +08:00
Buqian Zheng	b0260d8676	feat: manual evict cache after built interim index (#41836 ) issue: https://github.com/milvus-io/milvus/issues/41435 this PR also makes HasRawData of ChunkedSegmentSealedImpl to return based on metadata, without needing to load the cache just to answer this simple question. --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-05-16 16:34:23 +08:00
cai.zhang	9eebb9b464	fix: Collect entities num grouped by collection instead of partition (#41788 ) issue: #41787 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-05-15 12:04:22 +08:00
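
Note on commit 26ec841feb (n-gram `LIKE` optimization): the message says that with min_gram=2 and max_gram=4 each document is tokenized into 2-grams, 3-grams and 4-grams. The sketch below illustrates that idea only. The function name `char_ngrams` is hypothetical, and splitting on Python characters is an assumption; the actual tokenizer in Milvus may differ (for example in how it treats bytes versus characters).

```python
from typing import List


def char_ngrams(text: str, min_gram: int, max_gram: int) -> List[str]:
    """Return every substring of length min_gram..max_gram, shortest first.

    Hypothetical helper for illustration; not part of Milvus or pymilvus.
    """
    if min_gram < 1 or max_gram < min_gram:
        raise ValueError("require 1 <= min_gram <= max_gram")
    grams: List[str] = []
    for n in range(min_gram, max_gram + 1):
        # Slide a window of width n across the text.
        for start in range(len(text) - n + 1):
            grams.append(text[start:start + n])
    return grams


if __name__ == "__main__":
    print(char_ngrams("milvus", 2, 4))
```

An inner-match query such as `%lvu%` can then be narrowed by looking up the grams of the pattern itself, which is presumably why the index only helps when the pattern is at least min_gram long.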


749 Commits
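
Note on commit 7e855f1046 (Direct IO disk writer): the message states that `common.diskWriteBufferSizeKb` has a valid range of [4, 65536] and that a value not aligned to 4KB is rounded up to the nearest multiple of 4KB. A minimal sketch of that rounding rule follows. The function name is hypothetical, and raising an error for out-of-range values is an assumption, since the commit message does not say how Milvus handles them.

```python
ALIGNMENT_KB = 4          # alignment stated in the commit message
MIN_BUFFER_KB = 4         # lower bound of the stated valid range
MAX_BUFFER_KB = 65536     # upper bound of the stated valid range


def normalize_buffer_size_kb(value_kb: int) -> int:
    """Round a buffer size up to the next multiple of 4KB.

    Hypothetical helper for illustration; rejecting out-of-range input
    is an assumption, not documented Milvus behavior.
    """
    if not MIN_BUFFER_KB <= value_kb <= MAX_BUFFER_KB:
        raise ValueError(f"buffer size {value_kb}KB outside [4, 65536]")
    # Integer ceiling division, then scale back to KB.
    return -(-value_kb // ALIGNMENT_KB) * ALIGNMENT_KB


if __name__ == "__main__":
    for size in (4, 5, 64, 65535):
        print(size, "->", normalize_buffer_size_kb(size))
```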