970 Commits

Author SHA1 Message Date
cai.zhang
4dca57535f
fix: Fix bug for get segment index state (#31427)
issue: #31361

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-20 14:45:10 +08:00
Jiquan Long
dc2cdbe387
enhance: add more metrics (#31271)
/kind improvement
fix: #31272 

This PR adds more metrics:
- Slow query count, where the duration considered slow is
configurable;
- Number of deleted entities;
- Number of entities imported;
- Number of entities per collection;
- Number of loaded entities per collection;
- Number of indexed entities;
- Number of indexed entities, per collection, per index, and whether it's
a vector index;
- Quota states (LongTimeTickDelay, MemoryExhausted, DiskQuotaExhausted)
per database;
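
As an illustration of the first item, a slow-query counter with a configurable threshold might look like the sketch below. `SlowQueryCounter` and its methods are hypothetical names chosen for this example; they are not the metrics code added by this PR.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// SlowQueryCounter is a hypothetical counter, not the Milvus metrics API.
// Queries that take at least `threshold` are counted as slow.
type SlowQueryCounter struct {
	threshold time.Duration
	count     int64
}

// NewSlowQueryCounter builds a counter with a configurable threshold.
func NewSlowQueryCounter(threshold time.Duration) *SlowQueryCounter {
	return &SlowQueryCounter{threshold: threshold}
}

// Observe records one query duration and reports whether it counted as slow.
func (c *SlowQueryCounter) Observe(d time.Duration) bool {
	if d < c.threshold {
		return false
	}
	atomic.AddInt64(&c.count, 1)
	return true
}

// Count returns the number of slow queries observed so far.
func (c *SlowQueryCounter) Count() int64 {
	return atomic.LoadInt64(&c.count)
}

func main() {
	c := NewSlowQueryCounter(5 * time.Second)
	c.Observe(100 * time.Millisecond) // fast, not counted
	c.Observe(7 * time.Second)        // slow, counted
	fmt.Println("slow queries:", c.Count())
}
```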

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-03-19 15:23:06 +08:00
congqixia
16c661c722
enhance: Use different interval for gc scan (#31363)
See also #31362

This PR makes the datacoord garbage collection scan operation use a
different interval than other operations.

This interval is a newly added param item whose default value is 7*24
hours.
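
A minimal sketch of the idea, assuming a single GC loop that tracks when each kind of work last ran. The type names, the one-hour regular interval, and the `due` helper are all hypothetical; only the 7*24 hour scan default comes from the description above.

```go
package main

import (
	"fmt"
	"time"
)

// gcIntervals holds hypothetical settings: a short interval for regular GC
// work and a longer one for the expensive scan.
type gcIntervals struct {
	check time.Duration
	scan  time.Duration
}

// gcState remembers when each kind of work last ran.
type gcState struct {
	lastCheck time.Time
	lastScan  time.Time
}

// due reports which operations should run at `now` and updates the state
// for every operation it reports as due.
func (s *gcState) due(now time.Time, iv gcIntervals) (runCheck, runScan bool) {
	if now.Sub(s.lastCheck) >= iv.check {
		s.lastCheck = now
		runCheck = true
	}
	if now.Sub(s.lastScan) >= iv.scan {
		s.lastScan = now
		runScan = true
	}
	return runCheck, runScan
}

func main() {
	iv := gcIntervals{check: time.Hour, scan: 7 * 24 * time.Hour}
	start := time.Date(2024, 3, 19, 0, 0, 0, 0, time.UTC)
	st := &gcState{lastCheck: start, lastScan: start}
	for _, h := range []int{1, 2, 168} {
		check, scan := st.due(start.Add(time.Duration(h)*time.Hour), iv)
		fmt.Printf("after %dh: check=%v scan=%v\n", h, check, scan)
	}
}
```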

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-19 11:27:06 +08:00
XuanYang-cn
0066c016b6
enhance: Skip submit empty l0 tasks in DC (#31280)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-19 10:13:14 +08:00
Xiaofan
8c43c5b6cb
fix: get compaction failure when datanode is actually alive (#31353)
Don't mark the compaction as failed if GetCompactionPlansResults simply
returns an RPC error.
see #31352

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-03-18 10:01:36 -07:00
Bingyi Sun
bdc70dfc6a
feat: Add global mmap enable configuration (#31267)
https://github.com/milvus-io/milvus/issues/31279

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-18 15:17:10 +08:00
yihao.dai
776709e5ff
fix: Fix binlog import (#31310)
Fix binlog import functionality by removing the existing check and
refining the size retrieval process.

issue: https://github.com/milvus-io/milvus/issues/31221,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-17 20:59:04 +08:00
cai.zhang
4871786a7b
enhance: When describing an index, fetch the index info in batches (#31238)
issue: #29313

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-15 16:37:09 +08:00
yihao.dai
c408a32db6
feat: Add disk quota checks for import V2 (#31131)
Return quota error when the files to be imported exceed the disk quota.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-15 14:43:03 +08:00
yihao.dai
2b035ba2d4
enhance: Allow import tasks to retry for more errors (#31268)
Allow import tasks to retry for a wider range of errors, including all
gRPC errors and unexpected status codes from Milvus.

issue: https://github.com/milvus-io/milvus/issues/31227,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-15 11:05:04 +08:00
yihao.dai
811316d2ba
fix: Fix binlog import and refine error reporting (#31241)
1. Fix binlog import with partition key.
2. Refine binlog import error reporting.
3. Avoid division by zero when retrieving import progress.
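
For item 3, a guard of the following shape avoids the division by zero. This is only a sketch: `importProgress` is a hypothetical helper, and returning 0 when the total is unknown is an assumption rather than the behavior confirmed by this commit.

```go
package main

import "fmt"

// importProgress returns completion as a percentage in [0, 100].
// A non-positive total is treated as 0% instead of dividing by zero
// (assumed behavior for this sketch).
func importProgress(importedRows, totalRows int64) int64 {
	if totalRows <= 0 {
		return 0
	}
	p := importedRows * 100 / totalRows
	if p > 100 {
		p = 100
	}
	return p
}

func main() {
	fmt.Println(importProgress(0, 0))
	fmt.Println(importProgress(50, 200))
}
```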

issue: https://github.com/milvus-io/milvus/issues/31221,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-15 10:55:05 +08:00
jaime
db79be3ae0
fix: ctx cancel should be the last step while stopping server (#31220)
issue: #31219

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-15 10:33:05 +08:00
XuanYang-cn
a1386bae7f
fix: Skip to submit l0 tasks when scheduler full (#31270)
See also: #31242

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-15 10:21:12 +08:00
yihao.dai
7d7ef388df
enhance: Remove adding import segments to the datanode (#31244)
With the presence of L0 segments, there's no longer a need to add import
segments to the datanode.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-15 06:53:03 +08:00
XuanYang-cn
1fafc72077
fix: Correct the last empty l0 views (#31198)
See also: #31191

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-14 10:31:04 +08:00
Buqian Zheng
3c80083f51
feat: [Sparse Float Vector] add sparse vector support to milvus components (#30630)
add sparse float vector support to different milvus components,
including proxy, data node to receive and write sparse float vectors to
binlog, query node to handle search requests, index node to build index
for sparse float column, etc.

https://github.com/milvus-io/milvus/issues/29419

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-13 14:32:54 -07:00
yihao.dai
b5c67948b7
enhance: Enhance and modify the return content of ImportV2 (#31192)
1. The Import APIs now provide detailed progress information for each
imported file, including details such as file name, file size, progress,
and more.
2. The APIs now return the collection name and the completion time.
3. Other modifications include changing jobID to jobId and other similar
adjustments.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-13 19:51:03 +08:00
yihao.dai
69e132e05b
Use logID instead of logPath for import segment (#31182)
Currently, the logPath in the querycoord should be replaced with logID.
This PR updates the import segment's logPath to logID.

issue: https://github.com/milvus-io/milvus/issues/31123,
https://github.com/milvus-io/milvus/issues/28885

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-12 10:13:03 +08:00
Bingyi Sun
5c0bb40549
fix: merge index params when creating index (#31127)
issue: https://github.com/milvus-io/milvus/issues/31102

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-11 17:31:03 +08:00
Bingyi Sun
425da78b38
fix: alter index request's index name can not be empty (#31128)
issue: https://github.com/milvus-io/milvus/issues/31138

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-11 13:05:02 +08:00
yihao.dai
8cb06acfed
feat: Replacing the current import API with the v2 implementation (#31046)
Replacing the current import API v1 implementation with the v2
implementation.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-10 12:23:02 +08:00
yihao.dai
c411cb4a49
enhance: Prevent the backlog of channelCP update tasks, perform batch updates of channelCPs (#30941)
This PR includes the following adjustments:
1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.
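
Points 1 and 2 can be sketched as a queue that keeps one pending checkpoint per vchannel and drains in fixed-size batches. The `cpUpdater` type below is a hypothetical stand-in, not the updater in the datanode; it is not goroutine-safe, so real code would need a lock around it.

```go
package main

import (
	"fmt"
	"sort"
)

// checkpoint is a simplified stand-in for a channel checkpoint.
type checkpoint struct {
	VChannel  string
	Timestamp uint64
}

// cpUpdater keeps at most one pending update per vchannel (hypothetical type).
type cpUpdater struct {
	pending map[string]checkpoint
}

func newCPUpdater() *cpUpdater {
	return &cpUpdater{pending: map[string]checkpoint{}}
}

// Submit replaces an older pending checkpoint for the same vchannel, so
// tasks for one vchannel never pile up.
func (u *cpUpdater) Submit(cp checkpoint) {
	if old, ok := u.pending[cp.VChannel]; ok && old.Timestamp >= cp.Timestamp {
		return
	}
	u.pending[cp.VChannel] = cp
}

// Drain clears the queue and returns the pending checkpoints split into
// batches of at most batchSize, each intended for one RPC.
func (u *cpUpdater) Drain(batchSize int) [][]checkpoint {
	if batchSize <= 0 {
		batchSize = 128 // default batch size named in the description
	}
	all := make([]checkpoint, 0, len(u.pending))
	for _, cp := range u.pending {
		all = append(all, cp)
	}
	// Sort for deterministic output; map iteration order is random.
	sort.Slice(all, func(i, j int) bool { return all[i].VChannel < all[j].VChannel })
	u.pending = map[string]checkpoint{}

	var batches [][]checkpoint
	for start := 0; start < len(all); start += batchSize {
		end := start + batchSize
		if end > len(all) {
			end = len(all)
		}
		batches = append(batches, all[start:end])
	}
	return batches
}

func main() {
	u := newCPUpdater()
	u.Submit(checkpoint{VChannel: "ch-1", Timestamp: 10})
	u.Submit(checkpoint{VChannel: "ch-1", Timestamp: 20}) // supersedes the first
	u.Submit(checkpoint{VChannel: "ch-2", Timestamp: 15})
	for i, batch := range u.Drain(128) {
		fmt.Println("batch", i, "size", len(batch))
	}
}
```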

issue: https://github.com/milvus-io/milvus/issues/30004

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2024-03-07 20:39:02 +08:00
congqixia
d81ba164c8
enhance: Add ListIndexes API from datacoord (#31104)
See also #31103

This PR adds a `listIndexes` API to the datacoord server to list all
indexes for the provided collection.
Compared to the existing `DescribeIndex` API, the new one does NOT
check the segment index building progress, which eases the burden of
invoking it.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-07 17:37:01 +08:00
congqixia
196f0c1e1d
fix: Skip invalid compaction plan (#31045)
See also #31044

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-06 21:36:59 +08:00
Bingyi Sun
df7aafa3ec
fix: filter mmap key when checking index params (#31030)
issue: https://github.com/milvus-io/milvus/issues/31031

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-06 16:03:00 +08:00
XuanYang-cn
def72947c7
fix: Trigger l0 compaction when l0 views don't change (#30729)
Trigger l0 compaction when l0 views don't change

So that leftover l0 segments would be compacted in the end.

1. Refresh LevelZero plans in compactionPlanHandler, remove the meta
dependency of compaction trigger v2
2. Add ForceTrigger method for CompactionView interface
3. rename mu to taskGuard
4. Add a new TriggerTypeLevelZeroViewIDLE
5. Add an idleTicker for compaction view manager

See also: #30098, #30556

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-05 16:37:00 +08:00
zhagnlu
b9775a1816
fix: add GetSegments optimization to avoid meta mutex competition (#31025)
#30835

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-03-05 14:47:00 +08:00
congqixia
1936aa4caa
enhance: Check channel cp lag before generate compaction task (#30997)
See also #30996

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-05 13:39:01 +08:00
chyezh
8f7019468f
fix: starve lock caused by slow GetCompactionTo method when too much segments (#30963)
issue: #30823

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-05 10:04:59 +08:00
jaime
4b0c3dd377
enhance: index meta use independent rather than global meta lock (#30869)
issue: https://github.com/milvus-io/milvus/issues/30837

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-04 16:56:59 +08:00
cai.zhang
f6ff2588cd
enhance: Optimize DescribeIndex to reduce lock contention (#30939)
issue: #29313 
issue: #30443

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-03 19:00:59 +08:00
yihao.dai
a434d33e75
feat: Add import scheduler and manager (#29367)
This PR introduces novel managerial roles for importv2:
1. ImportMeta: To manage all the import tasks;
2. ImportScheduler: To process tasks and modify their states;
3. ImportChecker: To ascertain the completion of all tasks and instigate
relevant operations.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-01 18:31:02 +08:00
congqixia
6387403639
fix: Prevent clone when selecting segments from meta (#30928)
See also #30538

Previously, `SelectSegments` was changed to clone all returned values
to prevent possible updates to the returned info.

Since meta follows copy-on-write (COW) rules, this should not happen,
and any update to a segment should copy it first.

This PR:
- Remove clone for read-only Get segment info
- Add Segment Operator abstraction for changing segment
- Implement COW for updating MaxRowNum
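
The copy-on-write pattern described here can be sketched as follows. The types are simplified, hypothetical stand-ins (the real segment info and operator signatures may differ): reads return the stored pointer without cloning, and updates apply operators to a copy before swapping it in.

```go
package main

import "fmt"

// SegmentInfo is a simplified stand-in for the real segment metadata.
type SegmentInfo struct {
	ID        int64
	MaxRowNum int64
}

// SegmentOperator mutates a private copy of a segment (hypothetical shape).
type SegmentOperator func(seg *SegmentInfo)

// SetMaxRowNum returns an operator that updates MaxRowNum.
func SetMaxRowNum(n int64) SegmentOperator {
	return func(seg *SegmentInfo) { seg.MaxRowNum = n }
}

type meta struct {
	segments map[int64]*SegmentInfo
}

// GetSegment returns the stored pointer without cloning; callers must
// treat the result as read-only.
func (m *meta) GetSegment(id int64) *SegmentInfo {
	return m.segments[id]
}

// UpdateSegment copies the segment, applies the operators to the copy,
// and then swaps the pointer, so earlier readers keep a consistent view.
func (m *meta) UpdateSegment(id int64, ops ...SegmentOperator) bool {
	old, ok := m.segments[id]
	if !ok {
		return false
	}
	cloned := *old // a shallow copy is enough for this flat struct
	for _, op := range ops {
		op(&cloned)
	}
	m.segments[id] = &cloned
	return true
}

func main() {
	m := &meta{segments: map[int64]*SegmentInfo{1: {ID: 1, MaxRowNum: 100}}}
	before := m.GetSegment(1)
	m.UpdateSegment(1, SetMaxRowNum(200))
	fmt.Println(before.MaxRowNum, m.GetSegment(1).MaxRowNum)
}
```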

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-01 13:59:02 +08:00
aoiasd
2180e2cfae
fix: wrong segment binlog path cause load segment failed #30726 (#30959)
DataCoord GetSegmentInfo should return binlog info with logPath instead
of logID when a segment merges its child segments' binlogs.
relate: https://github.com/milvus-io/milvus/issues/30366
https://github.com/milvus-io/milvus/issues/30165
https://github.com/milvus-io/milvus/issues/30550

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-03-01 13:09:01 +08:00
XuanYang-cn
2867f50fcc
fix: Clear DN unknown compaction tasks (#30850)
If DC restarted, those unknown compaction tasks
would never get a callback in DN, so the segments in the compaction
task would stay locked, unable to sync or compact again, blocking cp
advance and compaction execution.

See also: #30137

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-01 11:31:00 +08:00
chyezh
0c7474d7e8
enhance: add graceful stop timeout to avoid node stop hang under extreme cases (#30317)
1. set the coordinator graceful stop timeout to 5s
2. change the stop order of the datacoord components
3. change the querynode graceful stop timeout to 900s, and we should
potentially change this to 600s when graceful stop is smooth
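
A bounded graceful stop can be sketched with a goroutine and a timer. `stopWithTimeout` is a hypothetical helper written for this example, not the function used by the coordinators or the querynode.

```go
package main

import (
	"fmt"
	"time"
)

// stopWithTimeout runs stop in a goroutine and waits at most timeout for
// it to return. It reports whether stop finished in time. If it did not,
// the caller is expected to proceed with a forced shutdown.
func stopWithTimeout(stop func(), timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		defer close(done)
		stop()
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	// A stop routine that finishes well within the 5s budget.
	ok := stopWithTimeout(func() { time.Sleep(10 * time.Millisecond) }, 5*time.Second)
	fmt.Println("graceful stop finished in time:", ok)
}
```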

issue: #30310
also see pr: #30306

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-02-29 17:01:50 +08:00
Bingyi Sun
816ed671aa
fix: alter_index should return error if index not found (#30786)
issue: https://github.com/milvus-io/milvus/issues/30932

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-02-29 15:55:01 +08:00
XuanYang-cn
cdc5ce5d6f
fix: Donot set LogPath when executing compaction (#30537)
Compaction would copy logPaths from compactFrom segA to compactTo segB,
and the previous code copied the logPath directly, leaving the full
logPaths of segA in compactTo segB's meta. So, for the next
compaction of segB, if segA has been GCed, Download would report the
error "The specified key not found".

This PR makes sure the compactTo segment's meta contains logID only.
This PR also refines CompleteCompactionMutation, improving readability
and merging two methods into one.

See also: #30496

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-28 19:03:01 +08:00
foxspy
e1e87d572b
fix: compatibility for diskann cache param (#30119)
Patch the search cache param from index configs when the index meta does
not contain the search cache size key.
issue: #30113

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-02-26 16:54:55 +08:00
cai.zhang
16b4c9a79e
fix: Skip filling segmentID in indexBuildCh to prevent flush blocked (#30747)
issue: #30580

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-02-22 20:40:53 +08:00
wayblink
b74264881c
enhance: Refine compaction interfaces to support major compaction (#30632)
Refine compaction interfaces in datacoord, support compaction result
with more than one segment. Prepare for major compaction.

related: #30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-02-19 20:52:50 +08:00
XuanYang-cn
44d436d0b6
enhance: Add force trigger (#30641)
1. Increase maxCount of L0 compaction tasks to 30

This could reduce the number of L0 compaction tasks by 30% for small,
frequently generated L0 segments, with the maximum size of 64MB
unchanged, so that L0 segments accumulate more slowly and the memory
pressure that L0 segments put on QueryNode decreases.
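
To show how a count cap and a size cap interact, here is a greedy selection sketch. The 30-segment and 64MB limits come from the description above; `pickL0Segments` and its in-order greedy policy are assumptions made for illustration and may not match the actual selection logic.

```go
package main

import "fmt"

const mb = int64(1 << 20)

// pickL0Segments takes segments in order until either maxCount segments
// are picked or adding the next one would exceed maxSize bytes. The first
// segment is always taken (when maxCount >= 1) so an oversized segment
// cannot stall selection. This policy is assumed for this sketch.
func pickL0Segments(sizes []int64, maxCount int, maxSize int64) (picked, rest []int64) {
	var total int64
	n := 0
	for n < len(sizes) && n < maxCount {
		if n > 0 && total+sizes[n] > maxSize {
			break
		}
		total += sizes[n]
		n++
	}
	return sizes[:n], sizes[n:]
}

func main() {
	// 35 small L0 segments of 1MB each: the count cap is hit before the size cap.
	sizes := make([]int64, 35)
	for i := range sizes {
		sizes[i] = mb
	}
	picked, rest := pickL0Segments(sizes, 30, 64*mb)
	fmt.Println("picked:", len(picked), "left for the next task:", len(rest))
}
```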

2. Add force Trigger for later manual timely l0 compaction triggers.

See also: #30191, #30556

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-19 18:40:50 +08:00
wayblink
1635211c3f
enhance: Add log when garbage collection resumed (#30535)
/kind enhancement

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-02-05 17:09:15 +08:00
XuanYang-cn
e184c891ff
fix: [skip-e2e] Fix unstable SaveBinlogPath ut (#30508)
Fixes: #30507

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-05 16:00:59 +08:00
XuanYang-cn
6959630652
fix: donot set l0 segment as growing when savebinlogs (#29194)
This PR fixes negative growing L0 segments in Metrics

See also: #29204, #30441

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-04 10:21:06 +08:00
yihao.dai
7ce876a072
fix: Decoupling importing segment from flush process (#30402)
This PR decouples importing segments from the flush process by:
1. Excluding the importing segment from the flush policy. This approach
avoids notifying the datanode to flush the importing segment, which may
not exist.
2. When RootCoord calls Flush, DataCoord directly sets the importing
segment state to `Flushed`.

issue: https://github.com/milvus-io/milvus/issues/30359

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-02-03 13:01:12 +08:00
cai.zhang
36d3fd41e1
fix: Only use bound indexnodes in bound mode (#30461)
issue: #30463

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-02-03 11:01:47 +08:00
XuanYang-cn
e0ed5647b3
fix: Limit L0 Compaction segment size and count (#30374)
See also: #30191

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-01 20:39:03 +08:00
XuanYang-cn
fb5e09d94d
fix: call injectDone after compaction failed (#30277)
syncMgr.Block() will lock the segment when executing compaction.

The previous implementation was unable to Unblock those segments when
compaction failed. If the next compaction of the same segments arrived,
it would be stuck forever and block all later compaction tasks.

This PR makes sure the compaction executor Unblocks these segments
after a failed compaction.
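
One common way to guarantee the release on failure is a deferred unblock, sketched below. The `syncManager` here is a toy stand-in with hypothetical methods, and this sketch releases the segments on every exit path, which is simpler than what the real executor may do.

```go
package main

import (
	"errors"
	"fmt"
)

var errCompactionFailed = errors.New("compaction failed")

// syncManager is a toy stand-in that only tracks which segments are blocked.
type syncManager struct {
	blocked map[int64]bool
}

func newSyncManager() *syncManager {
	return &syncManager{blocked: map[int64]bool{}}
}

func (m *syncManager) Block(id int64)          { m.blocked[id] = true }
func (m *syncManager) Unblock(id int64)        { delete(m.blocked, id) }
func (m *syncManager) IsBlocked(id int64) bool { return m.blocked[id] }

// runCompaction blocks the segments, runs compact, and unblocks them on
// every exit path (including errors) via defer.
func runCompaction(m *syncManager, segmentIDs []int64, compact func() error) error {
	for _, id := range segmentIDs {
		m.Block(id)
	}
	defer func() {
		for _, id := range segmentIDs {
			m.Unblock(id)
		}
	}()
	return compact()
}

func main() {
	m := newSyncManager()
	err := runCompaction(m, []int64{1, 2}, func() error { return errCompactionFailed })
	fmt.Println("error:", err, "segment 1 still blocked:", m.IsBlocked(1))
}
```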

Apart from that, this PR also refines some logs and cleans up some code
in compaction and compactor:

1. Log segment count instead of segmentIDs to avoid logging too many
segments
2. Flush RPC returns L1 segments only, skip L0 and L2
3. CompactionType is checked in `Compaction`, no need to check again
inside compactor
4. Use a lighter method to replace `getSegmentMeta`
5. Log information for L0 compaction when it encounters an error

See also: #30213

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-01 14:25:04 +08:00
congqixia
0c7a96b48d
enhance: Make compaction log has traceID (#30338)
See also #30167

Now that OpenTelemetry tracing is supported, we want the logs to carry
the traceID as well. This PR adds util functions to set the traceID with
a span and propagate the traceID between different contexts.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-30 10:09:03 +08:00