241 Commits

Author SHA1 Message Date
zhenshan.cao
ac4f3997ce
enhance: Reconstructing Compaction to possess persistence capability (#33265)
issue #33586

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-06-05 10:17:50 +08:00
congqixia
f31a20faad
fix: [Backport] Mark channel checkpoint dropped prevent cp lag metrics leakage (#32454) (#33198)
Cherry-pick from 2.3
pr: #32454
See also #31506 #31508

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-21 11:59:39 +08:00
wayblink
259bc97a2b
fix: Fix segments lost in flush response (#33061)
#33055

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-05-15 13:49:34 +08:00
jaime
f48a7ff8ff
enhance: use Delete instead of DeletePartialMatch to remove metrics (#33029)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-05-14 18:49:33 +08:00
yiwangdr
b1eacb2ae8
feat: datacoord/node watch based on rpc (#32036)
issue: https://github.com/milvus-io/milvus/issues/25309

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-05-07 15:49:30 +08:00
wayblink
42d0412e93
enhance: Add channelCPs in FlushResponse (#32044)
#32609

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-04-30 09:45:27 +08:00
zhagnlu
e2c38750c7
fix: modify retry error (#32351)
#32322

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-04-18 14:25:14 +08:00
zhagnlu
4586bcef9f
fix: correct AssignSegmentID return and add retry for loadCollectionF… (#32335)
#32322
#31942

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-04-16 10:20:10 -07:00
SimFG
c012e6786f
feat: support rate limiter based on db and partition levels (#31070)
issue: https://github.com/milvus-io/milvus/issues/30577
co-author: @jaime0815

---------

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
Signed-off-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-04-12 16:01:19 +08:00
XuanYang-cn
4617d22482
enhance: Use channel manager interface in server_test (#31621)
Tidy the following test codes

    - Remove channel in newTestServer
    - Remove newTestServerWithMeta
    - Remove newTestServer2
    - Remove testDataCoordBase
    - Use the same func for handleTTmsg and handleRPCTTmsg

See also: #31620

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-12 14:59:20 +08:00
SimFG
ac26908cc4
enhance: Remove the storage info report (#31772)
issue: #30436
origin pr: #30438

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-04-01 20:50:59 -07:00
congqixia
e191827a87
fix: Clone child segment info before decompress its deltalog (#31792)
Related to #31791

This segment meta is implemented in COW pattern. All modification on
segment info shall happen on the copied version of it.

This PR clones the child segment info for `GetSegmentInfo` in case data
race problem.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-01 21:41:15 +08:00
yihao.dai
4e264003bf
enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629)
Feature Introduced:
1. Ensure ImportV2 waits for the index to be built

Enhancements Introduced:
1. Utilization of local time for timeout ts instead of allocating ts
from rootcoord.
3. Enhanced input file length check for binlog import.
4. Removal of duplicated manager in datanode.
5. Renaming of executor to scheduler in datanode.
6. Utilization of a thread pool in the scheduler in datanode.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-01 20:09:13 +08:00
SimFG
b1a1cca10b
feat: add more operation detail info for better allocation (#30438)
issue: #30436

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-03-28 06:33:11 +08:00
yihao.dai
9a13b9822f
enhance: Return more fields in import progress response (#31539)
Return more fields in import progress response, include importedRows and
totalRows. Additionally, ensure compatibility with the old import
progress response by retaining fields of create timestamp and row count.

issue: https://github.com/milvus-io/milvus/issues/31448
https://github.com/milvus-io/milvus/issues/31237
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-24 21:57:06 +08:00
yihao.dai
0fe5e90e8b
enhance: Remove import v1 (#31403)
Remove all code and logic related to import v1.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-22 15:29:09 +08:00
congqixia
5c5f53d11b
fix: Check nodeID before update channel checkpoint (#31473)
See also #31470

This PR adds nodeID assignment verification before updating channel
checkpoints.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-22 10:07:06 +08:00
aoiasd
0c153a5820
enhance: Rename update segment operator (#31121)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-03-20 17:53:14 +08:00
Jiquan Long
dc2cdbe387
enhance: add more metrics (#31271)
/kind improvement
fix: #31272 

This pr add more metrics, which are:
- Slow query count, which the duration considered as slow can be
configurable;
- Number of deleted entities;
- Number of entities imported;
- Number of entities per collection;
- Number of loaded entities per collection;
- Number of indexed entities;
- Number of indexed entities, per collection, per index and whether it's
a vetor index;
- Quota states (LongTimeTickDelay, MemoryExhuasted, DiskQuotaExhuasted)
per database;

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-03-19 15:23:06 +08:00
yihao.dai
776709e5ff
fix: Fix binlog import (#31310)
Fix binlog import functionality by removing the existing check and
refining the size retrieval process.

issue: https://github.com/milvus-io/milvus/issues/31221,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-17 20:59:04 +08:00
yihao.dai
811316d2ba
fix: Fix binlog import and refine error reporting (#31241)
1. Fix binlog import with partition key.
2. Refine binlog import error reportins.
3. Avoid division by zero when retrieving import progress.

issue: https://github.com/milvus-io/milvus/issues/31221,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-15 10:55:05 +08:00
yihao.dai
b5c67948b7
enhance: Enhance and modify the return content of ImportV2 (#31192)
1. The Import APIs now provide detailed progress information for each
imported file, including details such as file name, file size, progress,
and more.
2. The APIs now return the collection name and the completion time.
3. Other modifications include changing jobID to jobId and other similar
adjustments.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-13 19:51:03 +08:00
yihao.dai
8cb06acfed
feat: Replacing the current import API with the v2 implementation (#31046)
Replacing the current import API v1 implementation with the v2
implementation.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-10 12:23:02 +08:00
yihao.dai
c411cb4a49
enhance: Prevent the backlog of channelCP update tasks, perform batch updates of channelCPs (#30941)
This PR includes the following adjustments:
1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.

issue: https://github.com/milvus-io/milvus/issues/30004

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2024-03-07 20:39:02 +08:00
chyezh
8f7019468f
fix: starve lock caused by slow GetCompactionTo method when too much segments (#30963)
issue: #30823

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-05 10:04:59 +08:00
yihao.dai
a434d33e75
feat: Add import scheduler and manager (#29367)
This PR introduces novel managerial roles for importv2:
1. ImportMeta: To manage all the import tasks;
2. ImportScheduler: To process tasks and modify their states;
3. ImportChecker: To ascertain the completion of all tasks and instigate
relevant operations.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-01 18:31:02 +08:00
congqixia
6387403639
fix: Prevent clone when selecting segments from meta (#30928)
See also #30538

Previously the `SelectSegments` changed to clone all return value
preventing possible update to returned info.

Since meta is implemented following COW rules, this shall not happen and
any update on segment shall have copy before it.

This PR:
- Remove clone for read-only Get segment info
- Add Segment Operator abstraction for changing segment
- Implemnt COW for updating MaxRowNum

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-01 13:59:02 +08:00
aoiasd
2180e2cfae
fix: wrong segment binlog path cause load segment failed #30726 (#30959)
DataCoord GetSegmentInfo should return binlog info with logpath instead
logid when segment merge child segment's
binlog.
relate: https://github.com/milvus-io/milvus/issues/30366
https://github.com/milvus-io/milvus/issues/30165
https://github.com/milvus-io/milvus/issues/30550

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-03-01 13:09:01 +08:00
XuanYang-cn
cdc5ce5d6f
fix: Donot set LogPath when executing compaction (#30537)
Compaction would copy logPaths from comapctFrom segA to compactTo segB,
and previous code would copy the logPath directly, causing there're
full-logPaths-of-segA in compactTo segB's meta. So, for the next
compaction of segB, if segA has been GCed, Download would report error
"The sperified key not found".

This PR makes sure compactTo segment's meta contains logID only. And
this PR also refines CompleteComapctionMutation, increasing some
readability and merge two methods into one.

See also: #30496

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-28 19:03:01 +08:00
cai.zhang
16b4c9a79e
fix: Skip filling segmentID in indexBuildCh to prevent flush blocked (#30747)
issue: #30580

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-02-22 20:40:53 +08:00
XuanYang-cn
6959630652
fix: donot set l0 segment as growing when savebinlogs (#29194)
This PR fixes negative growing L0 segments in Metrics

See also: #29204, #30441

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-04 10:21:06 +08:00
yihao.dai
7ce876a072
fix: Decoupling importing segment from flush process (#30402)
This pr decoups importing segment from flush process by:
1. Exclude the importing segment from the flush policy, this approch
avoids notifying the datanode to flush the importing segment, which may
not exist.
2. When RootCoord call Flush, DataCoord directly set the importing
segment state to `Flushed`.

issue: https://github.com/milvus-io/milvus/issues/30359

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-02-03 13:01:12 +08:00
XuanYang-cn
fb5e09d94d
fix: call injectDone after compaction failed (#30277)
syncMgr.Block() will lock the segment when executing compaction.

Previous implementation was unable to Unblock thoese segments when
compaction failed. If next compaction of the same segments arrives,
it'll stuck forever and block all later compation tasks.

This PR makes sure compaction executor would Unblock these segments
after a failure compaction.

Apart form that, this PR also refines some logs and clean some codes of
compaction, compactor:

1. Log segment count instead of segmentIDs to avoid logging too many
segments
2. Flush RPC returns L1 segments only, skip L0 and L2
3. CompactionType is checked in `Compaction`, no need to check again
inside compactor
4. Use ligter method to replace `getSegmentMeta`
5. Log information for L0 compaction when encounters an error

See also: #30213

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-01 14:25:04 +08:00
yihao.dai
8780d65b66
fix: Use channel cp as the dml&start position for import segments (#30107)
This PR discontinuing the subscription to the mq and, instead, employing
the channel checkpoint as the DML and starting position for the import
segments.

issue: https://github.com/milvus-io/milvus/issues/30106

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-22 14:36:55 +08:00
Bingyi Sun
d8025177fa
fix: return correct compaction plan count by datacoord (#29980)
issue: #29943

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-01-19 21:06:55 +08:00
smellthemoon
e52ce370b6
enhance:don't store logPath in meta to reduce memory (#28873)
don't store logPath in meta to reduce memory, when service get
segmentinfo, generate logpath from logid.
#28885

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-18 22:06:31 +08:00
XuanYang-cn
75e6b65c60
enhance: Use ChannelManger interface in Server (#29629)
See also: #29447

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-01-08 17:46:47 +08:00
aoiasd
010d8362ad
fix: datanode L0 segment num wrong (#29050)
relate:  https://github.com/milvus-io/milvus/issues/27675

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2023-12-20 11:20:44 +08:00
congqixia
cbf0f9c527
enhance: change cp metric to absolute unix ts (#29328)
See also #29327

Change channel checkpoint metrics to unix seconds instead of checkpoint
timestamp lag value

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-20 11:04:45 +08:00
congqixia
8a63e53421
enhance: Add http method to control datacoord garbage collection (#29052)
See also #29051

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-14 19:26:39 +08:00
yah01
58dbb7872a
fix: forbid balancing level zero segments (#29168)
related #29128
Signed-off-by: yah01 <yah2er0ne@outlook.com>

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-14 14:30:39 +08:00
Bingyi Sun
ad866d2889
feat: integrate storagev2 into index build process (#28995)
issue: https://github.com/milvus-io/milvus/issues/28994

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-12-13 17:24:38 +08:00
SimFG
ef18e6a532
enhance: Use a non-blocking method to trigger compaction when saveBinlogPath is executed (#28941)
/kind improvement
issue: #28924

Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-12-12 10:16:39 +08:00
aoiasd
3c32ba2407
enhance: pack datacoord Cluster and SessionManager with interface and mock them (#28869)
relate: https://github.com/milvus-io/milvus/issues/28861
https://github.com/milvus-io/milvus/issues/28854

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2023-12-11 17:52:37 +08:00
XuanYang-cn
5bac7f7897
fix: Fix L0 compaction in datacoord (#28814)
See also: #27606

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-12-05 18:44:37 +08:00
XuanYang-cn
321c5c32e3
fix: Separate schedule and check results loop (#28692)
This PR:

- Separates compaction scheduler and check results loop So that slow in
check-loop doesn't influence execution.

- Cleans compaction tasks when drop a vchannel so dropped-channel's
compaction tasks won't be checked over and over again.

  - Skips meta change when meta's already changed, avoid panic
  - Remove not inuse injectDone(bool) parameter

See also: #28628, #28209

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-29 10:50:29 +08:00
XuanYang-cn
606ec77b66
enhance: Unify levelzero segment config in DN (#28720)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-28 18:04:26 +08:00
congqixia
5025c10e8b
enhance: Remove channel cp lag metrics when DropChannel (#28784)
See also #28765
Remove metric when Drop channel grpc execute succeed

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-28 14:24:25 +08:00
aoiasd
8a4cfb7d6a
enhance: add l0 metric and fix datacoord no need drop l0 segment when flush (#28373)
relate: https://github.com/milvus-io/milvus/issues/27675

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2023-11-24 15:58:24 +08:00
Bingyi Sun
4fedff6d47
feat: integrate storage v2 into the write path (#28440)
#28378

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-11-23 17:26:24 +08:00