1195 Commits

Author SHA1 Message Date
cai.zhang
afaabc2a38
enhance: [2.4] clean compaction task in compactionHandler (#38170) (#38584)
issue: #35711

master pr: #38170

Signed-off-by: wayblink <anyang.wang@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: wayblink <anyang.wang@zilliz.com>
2024-12-24 15:34:50 +08:00
XuanYang-cn
21d76ad1ce
enhance: Use partitionID when delete by partitionKey (#38232)
When deleting by partition_key, Milvus generates L0 segments
globally. During L0 compaction, those L0 segments touch all
partitions collection-wide. Due to the false-positive rate of segment
bloom filters, L0 compactions append false deltalogs to completely
irrelevant partitions, which causes partition deletion amplification.

This PR uses partition_key to set the targeted partitionID when producing
deleteMsgs into MsgStreams. This narrows the scope of L0 segments down to
the partition level and removes the collection-wide false-positive
influence.

However, due to the DeleteMsg structure, we can only label one partition
per deleteMsg, so this enhancement does not apply when the user deletes by
multiple partition_keys in one deletion.

pr: #38231 
See also: #34665

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-12-23 13:52:51 +08:00
XuanYang-cn
e82af48706
fix: State trans error in concurrent Release and Watching (#38591)
See also: #38589
pr: #38590

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-12-19 21:46:47 +08:00
wei liu
83e162f5f1
enhance: Enable score based balance channel policy (#38143) (#38378)
issue: #38142
pr: #38143
The current balance channel policy only considers the current collection's
distribution, so if every collection has 1 channel and all channels have
been loaded on the same querynode, balance channel won't be triggered
after the number of querynodes increases.

This PR enables a score based balance channel policy, to achieve:
1. distribute all channels evenly across multiple querynodes
2. distribute each collection's channels evenly across multiple
querynodes.
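
The two goals above can be illustrated with a small greedy sketch. This is not the policy implemented in the PR: the `channel` type, `assignChannels`, and the scoring order (same-collection count first, then overall count) are assumptions made for illustration only.

```go
package main

import "fmt"

// channel identifies one channel of a collection (illustrative type).
type channel struct {
	Collection string
	Name       string
}

// assignChannels greedily places each channel on the node with the lowest
// score. The score compares, in order: how many channels of the same
// collection the node already holds, then how many channels it holds overall.
// Ties go to the node listed first. This scoring is an assumption, not the
// formula used by Milvus.
func assignChannels(channels []channel, nodes []int64) map[string]int64 {
	placement := make(map[string]int64, len(channels))
	if len(nodes) == 0 {
		return placement
	}
	total := make(map[int64]int)
	perCollection := make(map[string]map[int64]int)
	for _, ch := range channels {
		counts := perCollection[ch.Collection]
		if counts == nil {
			counts = make(map[int64]int)
			perCollection[ch.Collection] = counts
		}
		best := nodes[0]
		for _, n := range nodes[1:] {
			if counts[n] < counts[best] ||
				(counts[n] == counts[best] && total[n] < total[best]) {
				best = n
			}
		}
		placement[ch.Name] = best
		counts[best]++
		total[best]++
	}
	return placement
}

func main() {
	// Four collections with one channel each, two query nodes.
	channels := []channel{
		{"c1", "c1-ch0"}, {"c2", "c2-ch0"}, {"c3", "c3-ch0"}, {"c4", "c4-ch0"},
	}
	fmt.Println(assignChannels(channels, []int64{1, 2}))
}
```

A policy that only looks at one collection at a time sees each single-channel collection as already balanced; counting channels across all collections is what spreads them over the new nodes.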

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-12-13 10:28:44 +08:00
Ted Xu
e959860451
enhance: remove unnecessary clone in meta cache (#36628) (#38392)
See #36627

---------

pr: #36628

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-12-12 11:28:43 +08:00
cai.zhang
dde9d6c54f
fix:[2.4]Set the correct compactionFroms for clustering segments (#38376)
issue: #38373 
master pr: #36799 
This bug was introduced by PR #37653.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-11 19:02:43 +08:00
yihao.dai
9765db2465
fix: [2.4] Fix empty import task result (#38317)
Ensure the idempotency of import tasks to prevent duplicate tasks in
DataNode.

issue: https://github.com/milvus-io/milvus/issues/38313

pr: https://github.com/milvus-io/milvus/pull/38316

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-11 15:42:49 +08:00
cai.zhang
e758d8e4e8
fix: [2.4] Set the start time for index tasks that no need actual building (#38353)
issue: #38354

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-11 15:36:44 +08:00
cai.zhang
e843a464e1
enhance: [2.4]Skip create index for l0 segment (#38335)
master pr: #38334

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-11 15:02:44 +08:00
Zhen Ye
6b310e16dc
enhance: remove the rpc layer of coordinator when enabling standalone or mixcoord (#38207)
issue: #37764
pr: #37815 
also see: #38259

- add a local client to call local server directly for
querycoord/rootcoord/datacoord.
- enable local client if milvus is running mixcoord or standalone mode.
- after removing the rpc layer from mixcoord, the querycoord in standby
mode will be blocked forever during a rolling deployment

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-12-10 20:38:44 +08:00
yihao.dai
12cc500009
enhance: [2.4] Reduce segmentManager lock granularity (#37869)
Use a channel level key lock for segments in segmentManager.
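
As a rough illustration of what a channel-level key lock means, here is a minimal sketch. `KeyLock` and `countPerChannel` are hypothetical names, not the types used in segmentManager; the point is only that work on different channels no longer contends on one global lock.

```go
package main

import (
	"fmt"
	"sync"
)

// KeyLock is a hypothetical per-key mutex: operations on different keys
// (here, channel names) do not block each other.
type KeyLock struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func NewKeyLock() *KeyLock {
	return &KeyLock{locks: make(map[string]*sync.Mutex)}
}

// Lock blocks until the mutex for key is held and returns its unlock function.
// Entries are never removed in this sketch.
func (k *KeyLock) Lock(key string) (unlock func()) {
	k.mu.Lock()
	m, ok := k.locks[key]
	if !ok {
		m = &sync.Mutex{}
		k.locks[key] = m
	}
	k.mu.Unlock()
	m.Lock()
	return m.Unlock
}

// countPerChannel increments a per-channel counter from many goroutines,
// serializing only the goroutines that share a channel.
func countPerChannel(channels []string, workers int) map[string]int {
	kl := NewKeyLock()
	counters := make(map[string]*int, len(channels))
	for _, ch := range channels {
		v := 0
		counters[ch] = &v
	}
	var wg sync.WaitGroup
	for _, ch := range channels {
		for i := 0; i < workers; i++ {
			wg.Add(1)
			go func(ch string) {
				defer wg.Done()
				unlock := kl.Lock(ch)
				defer unlock()
				*counters[ch]++
			}(ch)
		}
	}
	wg.Wait()
	out := make(map[string]int, len(counters))
	for ch, v := range counters {
		out[ch] = *v
	}
	return out
}

func main() {
	fmt.Println(countPerChannel([]string{"ch-1", "ch-2"}, 100))
}
```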

issue: https://github.com/milvus-io/milvus/issues/37633,
https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/37836

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-06 16:34:41 +08:00
Xianhui Lin
a51647569b
enhance: [2.4]alterindex & altercollection supports altering properties (#38111)
enhance:

1. alterindex delete properties
We have introduced a new parameter deleteKeys to the alterindex
functionality, which allows for the deletion of properties within an
index. This enhancement provides users with the flexibility to manage
index properties more effectively by removing specific keys as needed.
2. altercollection delete properties
We have introduced a new parameter deleteKeys to the altercollection
functionality, which allows for the deletion of properties within a
collection. This enhancement provides users with the flexibility to
manage collection properties more effectively by removing specific keys
as needed.
3. support altercollectionfield
We currently support modifying the fieldparams of a field in a
collection using altercollectionfield, which only allows changes to the
max-length attribute.

Key Points:

New Parameter - deleteKeys: This new parameter enables the deletion of
specified properties from an index. By passing a list of keys to
deleteKeys, users can remove the corresponding properties from the
index.

Mutual Exclusivity: The deleteKeys parameter cannot be used in
conjunction with the extraParams parameter. Users must choose one
parameter to pass based on their requirement. If deleteKeys is provided,
it indicates an intent to delete properties; if extraParams is provided,
it signifies the addition or update of properties.
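
The mutual-exclusivity rule can be sketched as a small validation step. `applyAlter` and the property keys below are hypothetical and only mirror the behavior described above; the actual request handling in Milvus may differ.

```go
package main

import (
	"errors"
	"fmt"
)

var errMutuallyExclusive = errors.New("deleteKeys and extraParams cannot be used together")

// applyAlter is a hypothetical helper: it returns a copy of props with either
// extraParams added/updated or deleteKeys removed, and rejects requests that
// pass both at once.
func applyAlter(props, extraParams map[string]string, deleteKeys []string) (map[string]string, error) {
	if len(extraParams) > 0 && len(deleteKeys) > 0 {
		return nil, errMutuallyExclusive
	}
	out := make(map[string]string, len(props)+len(extraParams))
	for k, v := range props {
		out[k] = v
	}
	for k, v := range extraParams {
		out[k] = v // add or update
	}
	for _, k := range deleteKeys {
		delete(out, k) // remove
	}
	return out, nil
}

func main() {
	// Illustrative property keys, not real Milvus property names.
	props := map[string]string{"example.a": "1", "example.b": "2"}

	updated, err := applyAlter(props, map[string]string{"example.c": "3"}, nil)
	fmt.Println(updated, err)

	trimmed, err := applyAlter(props, nil, []string{"example.a"})
	fmt.Println(trimmed, err)

	_, err = applyAlter(props, map[string]string{"example.c": "3"}, []string{"example.a"})
	fmt.Println(err)
}
```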

issue: https://github.com/milvus-io/milvus/issues/37436
pr: https://github.com/milvus-io/milvus/pull/37437

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2024-12-06 14:50:41 +08:00
jaime
51eb25438c
fix: nil pointer in health check request (#38266)
issue: #35563
pr: #35589

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-06 13:34:41 +08:00
jaime
319f5494cd
enhance: optimize CPU usage for CheckHealth requests (#35595)
issue: #35563
pr: #35589

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-04 14:26:41 +08:00
cai.zhang
0fe4b2c7b7
enhance: Remove pre-marking segments as L2 during clustering compaction (#37653)
issue: #36686

master pr: #36799 

The core of this change is to **ensure that the many-to-many lineage
derivation logic is correct, making sure that both the parent and child
cannot simultaneously exist in the target segment view.**

feature:
- Clustering compaction no longer marks the input segments as L2.
- Add a new field `is_invisible` to `segmentInfo`, and mark segments
  that have completed clustering but have not yet built indexes as
  `is_invisible` to prevent them from being loaded prematurely.
- Do not mark the input segment as `Dropped` before the clustering
  compaction is completed.
- After compaction fails, only the result segment needs to be marked as
  Dropped.

compatibility:
- If the upgraded task has not failed, there are no compatibility
  issues.
- If the status after the upgrade is `MetaSaved`, then skip the stats
  task based on whether TmpSegments is empty.
- If the failure occurs before `MetaSaved`:
  - there are no ResultSegments, and InputSegments have not been marked as
    dropped yet.
  - the level of input segments needs to revert to LastLevel
- If the failure occurs after `MetaSaved`:
  - ResultSegments have already been generated, and InputSegments have
    been marked as Dropped. At this point, simply make the ResultSegments
    visible.
  - the level of ResultSegments needs to be set to L1 (in order to
    participate in mixCompaction)

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-04 10:48:40 +08:00
yihao.dai
c5c449fc90
fix: [2.4] Fix datacoord metrics (#38164)
issue: https://github.com/milvus-io/milvus/issues/38162

pr: https://github.com/milvus-io/milvus/pull/38163

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-03 14:12:40 +08:00
cai.zhang
dca779debe
enhance: [2.4] Refine clustering compaction log (#38102)
master pr: #38100

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-02 21:02:39 +08:00
XuanYang-cn
c32ad6573c
enhance: [24]Increase task capacity and clean illegal task (#37896) (#38095)
1. A taskQueueCapacity of 256 is too small for production when we want to
re-write the entire collection

2. Tasks should be cleaned up when they cannot be recovered; otherwise
their meta will remain in etcd forever.

pr: #37896

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-12-02 11:58:38 +08:00
cai.zhang
045cf56b6c
fix: [2.4] Handle the error of the compaction queue being full (#37990)
issue: #37988

master pr: #37989

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-11-29 11:08:37 +08:00
wei liu
88b731d393
fix: SyncSegments rpc always failed (#38032)
issue: #38031
Cause: the call to `cli.SyncSegments` uses a ctx that has already been
overridden and canceled, so the SyncSegments rpc always fails.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-28 17:58:36 +08:00
yihao.dai
913a00911b
enhance: [2.4] Reduce GetIndexInfos calls (#37840)
Batch `GetIndexInfos` calls for segments to reduce RPC calls.

issue: https://github.com/milvus-io/milvus/issues/37634

pr: https://github.com/milvus-io/milvus/pull/37695

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-28 14:38:37 +08:00
Ted Xu
e928e15bfc
fix: refuse schedule compaction tasks if there is no slot (#37809)
See #37621


pr: #37589

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Co-authored-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
Co-authored-by: yangxuan <xuan.yang@zilliz.com>
Co-authored-by: wei liu <wei.liu@zilliz.com>
2024-11-25 14:02:34 +08:00
yihao.dai
13f83df019
enhance: [2.4] Remove segment-level tag from monitoring metrics (#37737)
When there are a large number of segments, the metrics consume a lot of
memory. This PR removes the segment-level tag from monitoring metrics.

issue: https://github.com/milvus-io/milvus/issues/37636

pr: https://github.com/milvus-io/milvus/pull/37696

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-16 23:04:33 +08:00
yihao.dai
d29573551b
enhance: [2.4] Remove unnecessary clone in SetState (#37736)
issue: https://github.com/milvus-io/milvus/issues/37637

pr: https://github.com/milvus-io/milvus/pull/37697

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-16 19:04:34 +08:00
XuanYang-cn
5d5f899274
fix: [cp24]Change memoryCheck write lock to read lock (#37526)
pr: #37525

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-15 14:42:31 +08:00
XuanYang-cn
d5cad01c22
enhance: [cp24]tidy compaction logs (#37595) (#37647)
Remove some annoying logs and lower a log level from warn to info

pr: #37595

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-14 18:44:31 +08:00
sthuang
9e8b6ace6d
enhance: [2.4] RBAC custom privilege group (#37560)
Cherry-pick from master
pr: https://github.com/milvus-io/milvus/pull/37087,
https://github.com/milvus-io/milvus/pull/37558
issue: #37031

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-11-11 14:20:29 +08:00
yihao.dai
fd1ca73b61
fix: Fix large growing segment (#37388) (#37540)
Consider the `sealProportion` factor during segment allocation.

issue: https://github.com/milvus-io/milvus/issues/37387

pr: https://github.com/milvus-io/milvus/pull/37388

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-08 17:34:27 +08:00
XuanYang-cn
dd0cf20ee0
fix: [cp24]Correct dropped segment num metrics (#37471)
See also: #31891
pr: #37410

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-07 16:46:33 +08:00
XuanYang-cn
20534a3f7b
fix: [cp24]Separate L0 and Mix trigger interval (#37319)
See also: #37108
pr: #37190

- Add MixCompactionTriggerInterval, default 60s
- Add L0CompactionTriggerInterval, default 10s
- Export Single related compaction configs
- Raise SingleCompactionDeltaLogMaxSize from 2MB to 16MB

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-06 11:10:26 +08:00
yihao.dai
380662153f
fix: [2.4] Revert "enhance: Support db for bulkinsert (#37012) (#37017)" (#37421)
This reverts commit d6adc62765665d1555039c4d256a75d1144d49d0.

issue: https://github.com/milvus-io/milvus/issues/31273

pr: https://github.com/milvus-io/milvus/pull/37420

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-05 10:48:24 +08:00
XuanYang-cn
28fd217e27
fix: [cp24]l0RowCount metrics value always empty (#37307)
See also: #36953
pr: #37306

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-04 15:34:24 +08:00
XuanYang-cn
6109e9d69e
fix: Skip mark compaction timeout for mix and l0 compaction (#37118) (#37194)
Timeout is a bad design for long running tasks, especially using a
static timeout config. We should monitor execution progress and fail the
task if the progress has been stale for a long time.

This pr is a small patch to stop DC from marking compaction tasks as
timed out, while still waiting for DN to finish. The design is
self-contradictory: after this pr, mix and L0 compaction are no longer
controlled by the DC timeout, but clustering is still under timeout control.

The compaction queue capacity has grown larger for priority calculation,
so timed-out compactions appear more often, and when one times out, the
queued tasks time out too, and no compaction succeeds afterwards.
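
The alternative mentioned in the first paragraph (fail a task when its progress has been stale for a long time, instead of using a static timeout) could look roughly like this. It is a sketch only: `progressTracker` and its methods are hypothetical and are not part of this patch.

```go
package main

import (
	"fmt"
	"time"
)

// progressTracker is a hypothetical helper that remembers when a task last
// made progress.
type progressTracker struct {
	lastProgress int
	lastAdvance  time.Time
}

// Observe records a progress report taken at time now. The timestamp only
// moves forward when the reported progress actually increases.
func (p *progressTracker) Observe(progress int, now time.Time) {
	if p.lastAdvance.IsZero() || progress > p.lastProgress {
		p.lastProgress = progress
		p.lastAdvance = now
	}
}

// Stale reports whether the task has made no progress for longer than maxIdle.
// A task that has never reported is not considered stale.
func (p *progressTracker) Stale(now time.Time, maxIdle time.Duration) bool {
	return !p.lastAdvance.IsZero() && now.Sub(p.lastAdvance) > maxIdle
}

func main() {
	var tr progressTracker
	start := time.Now()
	maxIdle := 10 * time.Minute

	tr.Observe(10, start)
	tr.Observe(10, start.Add(5*time.Minute)) // no progress since start
	fmt.Println(tr.Stale(start.Add(5*time.Minute), maxIdle))

	// Still at 10% after 15 minutes: longer than maxIdle without an advance.
	fmt.Println(tr.Stale(start.Add(15*time.Minute), maxIdle))

	tr.Observe(11, start.Add(15*time.Minute)) // progress resumes
	fmt.Println(tr.Stale(start.Add(16*time.Minute), maxIdle))
}
```

Unlike a static timeout, a long-running task is never failed here as long as it keeps advancing.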

See also: #37108, #37015
pr: #37118

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-31 10:36:21 +08:00
aoiasd
8370caa4a6
enhance: [Cherry-pick]Add collection name label for some metric (#36951) (#37159)
pr: https://github.com/milvus-io/milvus/pull/36951

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-29 17:38:22 +08:00
XuanYang-cn
4cb5b2c3b5
fix: [cp24]Exclude L0 compaction when clustering is executing (#37142)
Also remove the conflict check when executing L0. The exclusivity is
already guaranteed in the scheduler

See also: #37140
pr: #37141

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-28 15:01:30 +08:00
yihao.dai
ca2057c57d
enhance: Tidy import options (#37077) (#37078)
1. Tidy import options.
2. Tidy common import util functions.

issue: https://github.com/milvus-io/milvus/issues/34150

pr: https://github.com/milvus-io/milvus/pull/37077

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-25 14:35:45 +08:00
yihao.dai
d6adc62765
enhance: Support db for bulkinsert (#37012) (#37017)
issue: https://github.com/milvus-io/milvus/issues/31273

pr: https://github.com/milvus-io/milvus/pull/37012

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-23 16:31:29 +08:00
congqixia
7acf1d53c1
enhance: [2.4] Preallocate delete data slice to avoid growslice (#37044)
Rewritten based on master pr
pr: #37043

Related to #36887
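
For readers unfamiliar with `growslice`: it is the Go runtime routine that reallocates a slice's backing array when `append` runs out of capacity. A minimal sketch of the preallocation idea follows; `collectDeletes` and its input shape are hypothetical, not the code touched by this commit.

```go
package main

import "fmt"

// collectDeletes flattens batches of primary keys into one slice. The total
// size is computed first so the backing array is allocated once and append
// never has to grow it.
func collectDeletes(batches [][]int64) []int64 {
	total := 0
	for _, b := range batches {
		total += len(b)
	}
	pks := make([]int64, 0, total) // reserve capacity up front
	for _, b := range batches {
		pks = append(pks, b...)
	}
	return pks
}

func main() {
	pks := collectDeletes([][]int64{{1, 2}, {3}, {4, 5, 6}})
	fmt.Println(pks, len(pks), cap(pks))
}
```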

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-22 14:15:28 +08:00
yihao.dai
4e0f5845a1
enhance: Limit import job number (#36891) (#36892)
issue: https://github.com/milvus-io/milvus/issues/36890

pr: https://github.com/milvus-io/milvus/pull/36891

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-18 18:13:25 +08:00
Ted Xu
2742524508
enhance: enable parallel execution of L0 compactions (#36816) (#36985)
pr: #36816

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-10-18 16:17:25 +08:00
wei liu
b88d610e42
fix: datacoord stuck at stopping progress (#36852) (#36961)
issue: #36868
pr: #36852
If datacoord is syncing segments to a datanode when it is stopped,
datacoord's stop process gets stuck until segment syncing has
finished.

This PR adds a ctx to segment syncing, so that syncing fails once
datacoord starts stopping.
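
The general pattern is to tie the long-running call to a context that is canceled on stop. Below is a minimal sketch under that assumption; `syncSegments` and `stopDuringSync` are hypothetical stand-ins, not datacoord functions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// syncSegments is a hypothetical stand-in for a long-running sync call. It
// returns early with ctx.Err() once the context is canceled.
func syncSegments(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// stopDuringSync starts a sync that would take `work`, then cancels the
// context the way a server Stop would, and returns the sync's error.
func stopDuringSync(work time.Duration) error {
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan error, 1)
	go func() { done <- syncSegments(ctx, work) }()
	cancel() // Stop(): cancel the server-scoped context
	return <-done
}

func main() {
	err := stopDuringSync(time.Hour)
	fmt.Println(errors.Is(err, context.Canceled))
}
```

Without the context, the stop path would have to wait for the full duration of the sync.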

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-10-18 14:43:24 +08:00
Ted Xu
22838a8413
enhance: Datacoord to support prioritization of compaction tasks (#36979)
See #36550

pr: #36547 
pr: #36956

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-10-18 14:15:25 +08:00
cai.zhang
2bfd22f217
fix: [cherry-pick] Fix clustering compaction task leak (#36803)
issue: #36686
master pr: #36800 

bug reason:
- The clustering compaction tasks on the datanode were never cleaned up.
- The clustering compaction task contains a mapping from clustering key
to buffer, this caused a large memory leak.

fix:
- datacoord cleans the tasks on datanode when clustering compaction
finishes.
- reset the mapping from clustering key to buffer on datanode when
clustering finishes.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-10-17 20:43:30 +08:00
wei liu
7c35c51e15
fix: Rootcoord stuck at graceful stop progress (#36881)
issue: #34553
pr: #36880
When rootcoord triggers graceful stop, it blocks until all rpcs have
finished. For a create collection request, rootcoord needs to block
until datacoord finishes watching all channels, but datacoord needs to call
`rootcoord.Alloc` while watching a channel, and rootcoord no longer responds
to new requests. This causes create collection to get stuck, and graceful
stop gets stuck with it.

This PR removes the call to `rootcoord.Alloc` to resolve the logical
deadlock during graceful stop.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-10-17 12:13:37 +08:00
yihao.dai
604e346585
enhance: Enhance segment log (#36848) (#36849)
/kind improvement

pr: https://github.com/milvus-io/milvus/pull/36848

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-15 20:43:30 +08:00
XuanYang-cn
e976b41f97
fix: Remove enableLevelZeroSegment config (#36507)
See also: #36504
pr: #36535

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-11 16:41:21 +08:00
yihao.dai
a4ef93457d
enhance: Optimize import scheduling and add time cost metric (#36601) (#36684)
1. Optimize import scheduling strategy:
   a. Revise slot weights, calculating them based on the number of files
   and segments for both import and pre-import tasks.
   b. Ensure that the DN executes tasks in ascending order of task ID.
2. Add time cost metric and log.
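
Executing tasks in ascending order of task ID amounts to a sort before scheduling. The sketch below shows only that part; `importTask` and `orderByTaskID` are hypothetical names, and the revised slot-weight calculation is not reproduced here because its formula is not given in this message.

```go
package main

import (
	"fmt"
	"sort"
)

// importTask is a hypothetical, minimal view of a queued import task.
type importTask struct {
	ID    int64
	Files int
}

// orderByTaskID returns a copy of tasks sorted by ascending task ID, leaving
// the input slice untouched.
func orderByTaskID(tasks []importTask) []importTask {
	ordered := append([]importTask(nil), tasks...)
	sort.Slice(ordered, func(i, j int) bool { return ordered[i].ID < ordered[j].ID })
	return ordered
}

func main() {
	queue := []importTask{{ID: 42, Files: 1}, {ID: 7, Files: 3}, {ID: 19, Files: 2}}
	for _, t := range orderByTaskID(queue) {
		fmt.Println(t.ID, t.Files)
	}
}
```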

issue: https://github.com/milvus-io/milvus/issues/36600,
https://github.com/milvus-io/milvus/issues/36518

pr: https://github.com/milvus-io/milvus/pull/36601

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-11 10:27:22 +08:00
aoiasd
eaa948752b
enhance: [Cherry-Pick] UpdateSegmentsInfo should update remaining segment info even if some one not exist (#36729)
pr: https://github.com/milvus-io/milvus/pull/36726

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-10 15:11:20 +08:00
yihao.dai
93a4574a35
Add buildIndex state for import job (#36705)
issue: https://github.com/milvus-io/milvus/issues/36698

pr: https://github.com/milvus-io/milvus/pull/35868,
https://github.com/milvus-io/milvus/pull/36699

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-09 16:57:20 +08:00
XuanYang-cn
0ff8e13232
fix: [24]Remove neighbors if compactTo is unindexed (#36503) (#36694)
See also: #36360
pr: #36503

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-09 15:43:20 +08:00