milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-07 01:28:27 +08:00

Author	SHA1	Message	Date
Zhen Ye	b94cee2413	fix: growing segment from old arch is not flushed after upgrading (#42164 ) issue: #42162 - enhance: add read ahead buffer size issue #42129 - fix: rocksmq consumer's close operation may get stuck - fix: growing segment from old arch is not flushed after upgrading --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-29 23:00:28 +08:00
cqy123456	5fe7015f63	enhance: InterimIndex support more index type and data type (#41021 ) issue: https://github.com/milvus-io/milvus/issues/27678 cherry pick from : https://github.com/milvus-io/milvus/pull/39180, https://github.com/milvus-io/milvus/pull/40429 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-05-28 08:40:28 +08:00
wei liu	54619eaa2c	feat: Implement partial result support on node down (#42009 ) issue: https://github.com/milvus-io/milvus/issues/41690 This commit implements partial search result functionality when query nodes go down, improving system availability during node failures. The changes include: - Enhanced load balancing in proxy (lb_policy.go) to handle node failures with retry support - Added partial search result capability in querynode delegator and distribution logic - Implemented tests for various partial result scenarios when nodes go down - Added metrics to track partial search results in querynode_metrics.go - Updated parameter configuration to support partial result required data ratio - Replaced old partial_search_test.go with more comprehensive partial_result_on_node_down_test.go - Updated proto definitions and improved retry logic These changes improve query resilience by returning partial results to users when some query nodes are unavailable, ensuring that queries don't completely fail when a portion of data remains accessible. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-05-28 00:12:28 +08:00
Zhen Ye	212e17c4c5	fix: modify param to use less memory when flush and sync (#42102 ) issue: #42097 Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-27 10:12:27 +08:00
wei liu	4e1208f4f6	enhance: support balancing multiple collections in single trigger (#41875 ) issue: #41874 - Optimize balance_checker to support balancing multiple collections simultaneously - Add new parameters for segment and channel balancing batch sizes - Add enableBalanceOnMultipleCollections parameter - Update tests for balance checker This change improves resource utilization by allowing the system to balance multiple collections in a single trigger with configurable batch sizes. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-05-21 21:38:25 +08:00
Zhen Ye	7beafe99a7	enhance: implement wal garbage collector with truncate api (#41770 ) issue: #41544 - add a truncator implementation into wal recovery storage. - add metrics for recovery storage. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-13 22:08:56 +08:00
Zhen Ye	61b6ca5b73	enhance: add in mem shard manager (#41749 ) issue: #41544 - Implement in-memory shard manager to maintain the shard state at write ahead. - Remove all rpc and meta operation at write ahead, make the segment assignment logic only use wal and memory. - Refactor global stats management, add node-level flush policy. - Fix the recovery storage inconsistency bug when graceful close. Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-13 12:04:56 +08:00
Zhen Ye	e675da76e4	enhance: simplify the proto message, make segment assignment code more clean (#41671 ) issue: #41544 - simplify the proto message for flush and create segment. - simplify the msg handler for flowgraph. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-05-11 20:49:00 +08:00
Buqian Zheng	ff5c2770e5	feat: cachinglayer: various improvements (#41546 ) issue: https://github.com/milvus-io/milvus/issues/41435 this PR is based on https://github.com/milvus-io/milvus/pull/41436. Improvements include: - Lazy Load support for Storage v1 - Use Low/High watermark to control eviction - Caching Layer related config changes - Removed ChunkCache related configs and code in golang - Add `PinAllCells` helper method to CacheSlot class - Modified ValueAt, RawAt, PrimitiveRawAt to Bulk version, to reduce caching layer overhead - Removed some unclear templated bulk_subscript methods - CachedSearchIterator to store PinWrapper when searching on ChunkedColumn, and removed unused constructor. --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-05-10 09:19:16 +08:00
Zhen Ye	dfbb02a5f7	enhance: make streaming message as a log field for easier coding (#41545 ) issue: #41544 - implement message can be logged as a field by zap. - fix too many slow log for woodpecker. Signed-off-by: chyezh <chyezh@outlook.com>	2025-04-28 14:38:42 +08:00
Zhen Ye	1f2077b68f	fix: remove dead config queryNode.grouping.enabled (#41244 ) issue: #41243 Signed-off-by: chyezh <chyezh@outlook.com>	2025-04-14 14:36:32 +08:00
cai.zhang	05e25431d9	enhance: Deprecate disk params about indexing (#41045 ) issue: #40863 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-04-07 11:36:34 +08:00
Zhen Ye	f18aa85083	enhance: vchannel fair balance policy for streaming (#40959 ) issue: #40638 - Add `ChannelID` for streaming replica in future. - Remove the pchannel count fair balance policy for streaming. - Add Score based vchannel fair balance policy for streaming. - Add pchannel stats manager to collect the stats of pchannel for balancer. - Add configuration and metrics for new balance policy --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-04-04 10:12:22 +08:00
wei liu	bf8547578f	fix: Address manual balance and balance check issues (#41037 ) issue: #37651 - Fix context propagation for manual balance segment task creation from PR #38080. - Optimize stopping balance by preventing redundant checks per round, addressing performance regression from PR #40297. - Decrease default `checkBalanceInterval` from 3000ms to 300ms. - Correct minor log messages in `BalanceChecker`. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-04-03 15:48:27 +08:00
yihao.dai	5b78ef0a49	fix: Fix delete data loss due to duplicate binlogID (#40960 ) With concurrent L0 compaction (https://github.com/milvus-io/milvus/pull/36816), delta logs might be written to the same L1 segment, causing logID duplication when using the incremental beginLogID. This PR removes the beginLogID mechanism and instead passes a log ID range, where the number of IDs in the range equals the number of compaction segment binlogs multiplied by an expansion factor. issue: https://github.com/milvus-io/milvus/issues/40207 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-04-01 10:36:22 +08:00
yihao.dai	b2a8694686	enhance: Merge IndexNode and DataNode (#40272 ) Merge DataNode and IndexNode into DataNode. issue: https://github.com/milvus-io/milvus/issues/39115 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-03-13 14:26:11 +08:00
cai.zhang	762a644d76	enhance: Limit the speed of the generating stats task (#39644 ) Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-02-28 10:27:59 +08:00
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990 ) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
wei liu	b9e3ec7175	enhance: Add trigger interval config for auto balance (#39154 ) issue: #39156 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-14 16:12:15 +08:00
Xiaofan	13d908f302	enhance: improve bloomfilter performance (#39730 ) 1. remove unnecessary allocations 2. reduce the concurrency to avoid extra context switches Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2025-02-13 22:12:14 +08:00
Zhen Ye	0988807160	enhance: enable write ahead buffer for streaming service (#39771 ) issue: #38399 - Make a timetick-commit-based write ahead buffer at write side. - Add a switchable scanner at read side to transfer the state between catchup and tailing read Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-12 20:38:46 +08:00
yihao.dai	a5a83a0904	fix: Fix consume blocked due to too many consumers (#38455 ) This PR limits the maximum number of consumers per pchannel to 10 for each QueryNode and DataNode. issue: https://github.com/milvus-io/milvus/issues/37630 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-15 21:37:01 +08:00
yihao.dai	ce41778fe6	enhance: Optimize GetLocalDiskSize and segment loader mutex (#38599 ) 1. Make the segment loader lock protect only the resource. 2. Optimize GetDiskUsage to avoid excessive overhead. issue: https://github.com/milvus-io/milvus/issues/37630 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-15 15:45:01 +08:00
yihao.dai	ec2e77b5d7	enhance: Reduce memory usage of BF in DataNode and QueryNode (#38129 ) 1. DataNode: Skip generating BF during the insert phase (BF will be regenerated during the sync phase). 2. QueryNode: Skip generating or maintaining BF for growing segments; deletion checks will be handled in the segcore. issue: https://github.com/milvus-io/milvus/issues/37630 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-15 01:59:01 +08:00
Zhen Ye	fd84ed817c	enhance: add broadcast operation for msgstream (#39040 ) issue: #38399 - make broadcast service available for msgstream by reusing the architecture streaming service --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-14 15:14:59 +08:00
jaime	78438ef41e	fix: revert optimize CPU usage for CheckHealth requests (#35589 ) (#38555 ) issue: #35563 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-12-19 00:38:45 +08:00
jaime	29e620fa6d	fix: sync task still running after DataNode has stopped (#38377 ) issue: #38319 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-12-17 18:06:44 +08:00
jaime	28fdbc4e30	enhance: optimize CPU usage for CheckHealth requests (#35589 ) issue: #35563 1. Use an internal health checker to monitor the cluster's health state, storing the latest state on the coordinator node. The CheckHealth request retrieves the cluster's health from this latest state on the proxy sides, which enhances cluster stability. 2. Each health check will assess all collections and channels, with detailed failure messages temporarily saved in the latest state. 3. Use CheckHealth request instead of the heavy GetMetrics request on the querynode and datanode Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-12-17 11:02:45 +08:00
wei liu	e279ccf109	enhance: Enable score based balance channel policy (#38143 ) issue: #38142 The current balance channel policy only considers the current collection's distribution, so if every collection has 1 channel and all channels have been loaded on the same querynode, balance channel won't be triggered after the querynode count increases. This PR enables a score based balance channel policy to: 1. distribute all channels evenly across multiple querynodes 2. distribute each collection's channels evenly across multiple querynodes. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-11 17:20:43 +08:00
SimFG	49ee46ec1d	enhance: support to config the default db properties (#38035 ) - issue: #38034 Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-11-27 10:04:34 +08:00
SimFG	2208b7c2ef	fix: the too long default root password does not take effect (#37983 ) - issue: #36987 Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-11-26 17:24:35 +08:00
Zhen Ye	2b4f211d84	enhance: add switch for local rpc enabled (#37985 ) issue: #33285 - Add switch for local rpc --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-26 17:00:54 +08:00
wei liu	0a440e0d38	fix: Prevent simultaneous balance of segments and channels (#37850 ) issue: #33550 Balance segment and balance channel can execute at the same time, which causes a bunch of corner cases. This PR disables simultaneous balance of segments and channels. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-11-21 17:56:55 +08:00
yihao.dai	0fc0d1a888	fix: Limit the concurrency of channel tasks (#37740 ) Limit the maximum concurrency of channel tasks for each DataNode to prevent excessive subscriptions from causing DataNode OOM. issue: https://github.com/milvus-io/milvus/issues/37665 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-11-18 16:26:30 +08:00
Zhen Ye	81fa7dd52c	fix: add ddl and dcl concurrency to avoid competition (#37672 ) issue: #37166 Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-15 15:04:31 +08:00
yihao.dai	f0b3942a08	enhance: Limit import job number (#36891 ) issue: https://github.com/milvus-io/milvus/issues/36890 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-10-23 16:01:28 +08:00
yihao.dai	0fc2a4aa53	enhance: Optimize import scheduling and add time cost metric (#36601 ) 1. Optimize import scheduling strategy: a. Revise slot weights, calculating them based on the number of files and segments for both import and pre-import tasks. b. Ensure that the DN executes tasks in ascending order of task ID. 2. Add time cost metric and log. issue: https://github.com/milvus-io/milvus/issues/36600, https://github.com/milvus-io/milvus/issues/36518 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-10-09 14:41:20 +08:00
Zhen Ye	a6545b2e29	fix: refactor milvus config and change default txn timeout (#36522 ) issue: #36498 Signed-off-by: chyezh <chyezh@outlook.com>	2024-09-29 11:01:15 +08:00
SimFG	c50fe71163	fix: long buffering causes mq to be unable to receive messages. (#36420 ) - issue: #36397 Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-09-23 16:33:18 +08:00
wei liu	3b10085f61	enhance: Optimize workload based replica selection policy (#36181 ) issue: #35859 This PR introduces two new params: toleranceFactor and checkRequestNum. After every checkRequestNum requests have been assigned, it computes each querynode's workload score. If the diff is less than the toleranceFactor, the replica selection policy falls back to round_robin, which reduces the average cost to about 500ns. If the diff is larger than the toleranceFactor, the replica selection policy computes each querynode's score and selects the target node with the smallest score in every assignment. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-09-20 12:33:11 +08:00
yihao.dai	763fd0dfc5	enhance: Use a separate mmap config for chunk cache (#36276 ) issue: https://github.com/milvus-io/milvus/issues/35273 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-09-15 16:23:09 +08:00
Ted Xu	d9a40784a2	fix: fallback params may be overridden (#35972 ) See #35756 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-09-05 16:19:04 +08:00
wei liu	cf242f9e09	fix: fix dynamic update config doesn't work for some params (#35572 ) issue: #35570 Milvus supports a config cache to speed up config access, but only evicts a param's cache when that param has been updated. However, a param may rely on another param's value: say paramA relies on paramB. When paramB is updated, paramB's cache is evicted, but paramA's cache still keeps the old value. This PR evicts all config caches to solve the above issue, because dynamic config updates are not frequent. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-08-21 11:02:56 +08:00
wei liu	a570567644	enhance: Enable ReadOnly/ReadWrite/Admin Privilege Group to simplify RBAC grant progress (#35472 ) issue: #35471 --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-08-16 14:18:54 +08:00
wei liu	344dc6a9f8	enhance: enable to set load config in cluster level (#35169 ) issue: #35170 This PR enables setting load configs at the cluster level, such as replicas and resource groups. Loading a collection then uses the cluster-level load config. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-08-07 12:38:21 +08:00
cai.zhang	6542c1ab0e	enhance: Add monitoring metrics for task execution time in datacoord (#35139 ) issue: #35138 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-08-05 16:26:17 +08:00
jaime	fcec4c21b9	fix: check collection health(queryable) fail for releasing collection (#34947 ) issue: #34946 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-08-02 17:20:15 +08:00
wei liu	3b735b4b02	enhance: Refine param init for MmapDirPath (#35181 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-08-02 11:12:14 +08:00
cai.zhang	196a7986b3	enhance: Change the fixed value to a ratio for clustering segment size (#35076 ) issue: #34495 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-08-01 22:04:14 +08:00
wei liu	e9d61daa3f	enhance: Reduce delegator memory overloaded factor to 0.1 (#35092 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-08-01 10:21:50 +08:00

127 Commits