71 Commits

Author SHA1 Message Date
congqixia
81b197267a
enhance: [Cherry-Pick] Add back load memory factor when estimating memory resource (#30999)
Cherry-pick from master
pr: #30994
Segment load memory usage is underestimated because the load
memory factor was removed. This PR adds it back to protect the querynode
from OOM in some extreme memory cases.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-05 09:15:00 +08:00
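The effect of the load memory factor restored in 81b197267a can be illustrated with a minimal sketch. The function names, the threshold parameter, and the numbers below are hypothetical and are not taken from the Milvus source; the sketch only shows how scaling the predicted size by a factor greater than 1 makes the admission check more conservative.

```go
package main

import "fmt"

// estimateLoadMemory scales the raw predicted segment size by a safety
// factor. The name and signature are assumptions made for illustration.
func estimateLoadMemory(predictedBytes uint64, loadMemoryFactor float64) uint64 {
	return uint64(float64(predictedBytes) * loadMemoryFactor)
}

// canLoad reports whether the scaled estimate still fits under a usage
// threshold expressed as a fraction of total memory (hypothetical parameter).
func canLoad(usedBytes, totalBytes, predictedBytes uint64, loadMemoryFactor, threshold float64) bool {
	need := estimateLoadMemory(predictedBytes, loadMemoryFactor)
	limit := uint64(float64(totalBytes) * threshold)
	return usedBytes+need <= limit
}

func main() {
	const gib = 1 << 30
	// Without a factor the segment appears to fit; with a factor of 1.5 it does not.
	fmt.Println(canLoad(4*gib, 8*gib, 3*gib, 1.0, 0.9))
	fmt.Println(canLoad(4*gib, 8*gib, 3*gib, 1.5, 0.9))
}
```

With a factor of 1.0 the raw prediction is used as-is, which is the underestimating behavior the commit message describes.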
wei liu
b0c7f8653f
fix: Segment version doesn't update as expected (#30953)
issue: #30950 
pr: #30951

The segment version doesn't update as expected.
This PR keeps updating the segment version until the segment becomes loaded.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-01 14:21:10 +08:00
congqixia
c3f831fce4
fix: [Cherry-pick] Disk resource is not requested for index loaded with disk (#30757) (#30948)
Cherry pick from master
pr: #30757
See also #30756

This PR:
- Request disk resource when the index type and version are loaded with disk
- Add attribute cache for index utility
- Add `typeutil.Pair`

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-01 13:07:00 +08:00
chyezh
be1bd9615a
enhance: add configurable memory index load predict memory usage factor (#30563)
pr: #30561

related pr: #30475

Signed-off-by: chyezh <chyezh@outlook.com>
2024-02-06 22:00:49 +08:00
yah01
655e235230
enhance: calculate the accurate memory usage while loading segment (#30473) (#30475)
The old version of Knowhere copies the index data while loading, so we
need to account for this to avoid OOM.

Knowhere provides a util function that indicates whether it will load the
index with disk; if not, we need to double the memory usage prediction
for index data.

pr: #30473

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-02-03 13:01:12 +08:00
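The doubling rule described in 655e235230 can be sketched as follows. The function name and the boolean parameter are hypothetical: the boolean stands in for whatever the Knowhere utility reports, and leaving the prediction unscaled for disk-loaded indexes is an assumption, since the commit message only states what happens in the in-memory case.

```go
package main

import "fmt"

// indexCopyFactor models the extra copy of index data made during an
// in-memory load, as described in the commit message.
const indexCopyFactor = 2

// predictIndexMemory returns the predicted memory for loading an index.
// loadsWithDisk is a stand-in for the answer of the Knowhere utility;
// the unscaled result in the disk case is an assumption of this sketch.
func predictIndexMemory(indexSizeBytes uint64, loadsWithDisk bool) uint64 {
	if loadsWithDisk {
		return indexSizeBytes
	}
	return indexSizeBytes * indexCopyFactor
}

func main() {
	fmt.Println(predictIndexMemory(512, false)) // in-memory index: doubled
	fmt.Println(predictIndexMemory(512, true))  // disk-loaded index: unchanged here
}
```

The later commit be1bd9615a makes this factor configurable, which in this sketch would correspond to passing the factor in rather than hard-coding it.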
yihao.dai
e0f987ee9b
enhance: Allows proactive warming up of chunk cache (#30182) (#30289)
Allows proactive warming up of chunk cache. Original vector data will be
asynchronously loaded into the chunk cache during the load process. It
has the potential to significantly reduce query/search latency for a
certain duration after the load, albeit with a concurrent increase in
disk usage.

issue: https://github.com/milvus-io/milvus/issues/30181

pr: https://github.com/milvus-io/milvus/pull/30182

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-26 09:57:01 +08:00
congqixia
35e4165722
enhance: [2.3] make Load process traceable in querynode & segcore (#30187)
Cherry-pick from master, modified some files since branching
pr: #29858
See also #29803

This PR:
- Add trace span for LoadIndex & LoadFieldData in segment loader
- Add TraceCtx parameter for Index.Load in segcore
- Add span for ReadFiles & Engine Load for Memory/Disk Vector index

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-23 15:58:57 +08:00
chyezh
c8e3a48214
fix: querynode num entity metric is broken by illegal label (#29949)
issue: #29766
also see pr: #29825
pr: #29948

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-14 10:22:59 +08:00
wei liu
86cddd24b5
enhance: Add ctx for load index logs (#29686) (#29905)
pr: #29686
This PR adds ctx for load index logs

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-12 18:56:58 +08:00
jaime
c0b711e9fb
enhance: Support read hardware metrics for cgroupv2 (#29847)
issue: #29846
pr: #29850

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-01-11 19:20:57 +08:00
yah01
e7e4561da8
fix: the entities num metric may be contributed more than once (#29767) (#29825)
Growing segments contribute to this metric both when data is inserted and
when they are put into the manager, but the current implementation inserts data
before putting the segments into the manager, which leads to double contributions

fix: #29766
pr: #29767

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2024-01-11 10:24:51 +08:00
congqixia
dd52a674aa
enhance: [cherry-pick] add ctx for HandleCStatus and callers (#29517) (#29546)
Cherry-pick from master
pr: #29517 
See also #29516

Make `HandleCStatus` print trace id for better logging

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-28 10:20:47 +08:00
congqixia
fc5dd524c5
enhance: [Cherry-pick] add log when release segment created for load failure (#29464) (#29500)
Cherry-pick from master
pr: #29464 
Add log for releasing segment created during load process when load
error happens

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-27 20:00:48 +08:00
cqy123456
8fd38c8eea
enhance:[cherry-pick] Use binlog index for better search performance (#29012)
This PR is cherry-picked from master:
pr: https://github.com/milvus-io/milvus/pull/28528
pr: https://github.com/milvus-io/milvus/pull/27673
related issue:
issue: https://github.com/milvus-io/milvus/issues/27678

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2023-12-07 09:52:34 +08:00
yah01
a1b861ed7a
enhance: improve load speed (#28518) (#28719)
This check rejects a load request when the pool runs out of workers, but
a small segment finishes loading quickly, and the other segments are only
retried after a check interval, which leads to slow loading for the collection

Block the request with the Go pool instead

pr: #28518

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-26 22:10:26 +08:00
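The change in a1b861ed7a (block on the pool instead of rejecting and retrying after an interval) can be illustrated with a small bounded pool. This is a generic sketch built on a buffered channel, not the pool implementation Milvus uses; all names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pool is a hypothetical bounded worker pool: Submit blocks while all
// slots are busy instead of rejecting the task.
type pool struct {
	slots chan struct{}
	wg    sync.WaitGroup
}

func newPool(size int) *pool {
	return &pool{slots: make(chan struct{}, size)}
}

// Submit waits for a free slot, then runs the task in its own goroutine.
func (p *pool) Submit(task func()) {
	p.slots <- struct{}{} // blocks here when the pool is full
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		defer func() { <-p.slots }() // release the slot after the task returns
		task()
	}()
}

// Wait blocks until every submitted task has finished.
func (p *pool) Wait() { p.wg.Wait() }

// runTasks submits n trivial tasks and reports how many completed and the
// highest number observed running at the same time.
func runTasks(poolSize, n int) (completed, peak int32) {
	p := newPool(poolSize)
	var running, maxSeen, done atomic.Int32
	for i := 0; i < n; i++ {
		p.Submit(func() {
			cur := running.Add(1)
			for {
				old := maxSeen.Load()
				if cur <= old || maxSeen.CompareAndSwap(old, cur) {
					break
				}
			}
			running.Add(-1)
			done.Add(1)
		})
	}
	p.Wait()
	return done.Load(), maxSeen.Load()
}

func main() {
	completed, peak := runTasks(2, 20)
	fmt.Println("completed:", completed, "peak concurrency:", peak)
}
```

Because the caller waits in `Submit`, a new segment starts loading as soon as a worker frees up, rather than after the next check interval.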
yah01
d10a82dba4
Fix getting incorrect CPU num (#28178)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-11-07 11:52:22 +08:00
yah01
5c444218a2
Limit max thread num for pool (#28018) (#28115)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-06 10:50:17 +08:00
congqixia
f492e33343
Refine log level when request resource fail for loading segments (#28004) (#28077)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-01 23:16:26 +08:00
zhenshan.cao
dbdb9e15d8
Update Knowhere version (#27445)
Signed-off-by: Li Liu <li.liu@zilliz.com>
Co-authored-by: Li Liu <li.liu@zilliz.com>
2023-09-29 14:23:28 +08:00
Jiquan Long
370fdaf50d
Record engine version for segment index (#27384)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-09-28 18:03:28 +08:00
foxspy
5db4a0489e
dynamic index version control (#27335)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-25 21:39:27 +08:00
foxspy
370b6fde58
milvus support multi index engine (#27178)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-22 09:59:26 +08:00
SimFG
26f06dd732
Format the code (#27275)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-21 09:45:27 +08:00
yah01
0a750408d0
Skip delta logs have been applied (#26971)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-19 16:21:23 +08:00
MrPresent-Han
7939f0e7d5
enable ctx traceId for assignsegment on dc(#26972) (#27108) (#27030)
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-09-18 11:39:20 +08:00
yihao.dai
bb6711f28c
Add ChunkCache: support get vector from storage (#26142)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-09-15 10:21:20 +08:00
congqixia
394c898b4c
Discard SyncDistribution set action from legacy querycoord (#27027)
In Milvus versions lower than 2.3.0, the set action carries no load info,
which may corrupt data integrity and cause a panic

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-13 09:35:18 +08:00
congqixia
ac45af585b
Make segment loaded successful put in manager even ctx done (#26992)
Keep successfully loaded segments in the manager even if waiting for other segments failed
See also #26908
Fix error case in distributed scenario

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-11 21:33:18 +08:00
congqixia
2a5d574a0d
Fix querynodev2 concurrent load logic (#26959)
Fix logic error from #26926
function `waitSegmentLoadDone` shall return error when context is done

Make the delegator control concurrent loads of the same segment
Related to #26908

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-10 07:41:18 +08:00
congqixia
c6116d1819
Remove segment to LocalSegment type assertion (#26931)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-09 10:35:16 +08:00
congqixia
c8f9f22c4a
Fix segment loader return false success (#26926)
`waitSegmentLoadDone` did not check whether the waitCh result was a success or a failure.
When load returns without error, the delegator assumes all segments are loaded.

This PR replaces waitCh with loadResult, which uses `sync.Cond` and an `atomic.Int32` to represent the status

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-08 16:41:16 +08:00
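The pattern named in c8f9f22c4a, a result object built from `sync.Cond` and an `atomic.Int32` status, can be sketched as below. Only the name `loadResult` and the two primitives come from the commit message; the status constants, method names, and error value are assumptions, and the sketch does not cover context cancellation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// Hypothetical status values for a segment load.
const (
	statusLoading int32 = iota
	statusSuccess
	statusFailure
)

var errLoadFailed = errors.New("segment load failed")

// loadResult lets waiters distinguish success from failure, which a bare
// "done" channel cannot express.
type loadResult struct {
	status atomic.Int32
	cond   *sync.Cond
}

func newLoadResult() *loadResult {
	return &loadResult{cond: sync.NewCond(&sync.Mutex{})}
}

// setResult records the final status and wakes every waiter.
func (r *loadResult) setResult(status int32) {
	r.cond.L.Lock()
	defer r.cond.L.Unlock()
	r.status.Store(status)
	r.cond.Broadcast()
}

// wait blocks until the load finishes and returns an error on failure.
func (r *loadResult) wait() error {
	r.cond.L.Lock()
	defer r.cond.L.Unlock()
	for r.status.Load() == statusLoading {
		r.cond.Wait()
	}
	if r.status.Load() == statusFailure {
		return errLoadFailed
	}
	return nil
}

func main() {
	ok := newLoadResult()
	go ok.setResult(statusSuccess)
	fmt.Println("first load error:", ok.wait())

	bad := newLoadResult()
	go bad.setResult(statusFailure)
	fmt.Println("second load error:", bad.wait())
}
```

Storing the status under the condition's lock ensures a waiter cannot miss the broadcast between checking the status and calling `Wait`.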
MrPresent-Han
528948559f
fix false load failure for long unserviceable period(#26813) (#26818)
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-09-06 12:57:15 +08:00
XuanYang-cn
ef75784715
Fix LoadSegmentLatency metric p99 (#26761)
See also: #26743

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-09-01 11:07:07 +08:00
MrPresent-Han
8330c18dc9
add log for loading segment(#26564) (#26640)
/kind improvement

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-08-31 12:03:00 +08:00
yah01
bfcc691129
Fix segment leaked if task canceled (#26685)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-08-30 14:17:03 +08:00
congqixia
1cf6e00fa6
Improve segment manager interface (#26637)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-08-29 15:46:27 +08:00
congqixia
ec65a4e048
Use float64 for resource estimation log (#26335)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-08-15 14:25:33 +08:00
Enwei Jiao
7d61355ab0
Refactor log for Query (#26310)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-08-14 18:57:32 +08:00
yah01
a173486d2e
Fix calculation of memory usage prediction for mmap mode (#26264)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-08-12 17:19:31 +08:00
yah01
48422dd4c5
Fix spawn too many threads (#26293)
- Lower the thread pool cap
- Limit CGO calls concurrency

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-08-11 18:29:29 +08:00
wei liu
b47a72bfcf
fix set dirty segment distribution to leader view (#26180)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-08-11 11:21:32 +08:00
xige-16
1971d98897
Add disk metric info (#25675)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2023-08-11 10:35:42 +08:00
xige-16
1055c90456
Add default retrieve limit (#24782)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2023-08-10 14:11:15 +08:00
yah01
300fef446b
Enable mmap for vector index (#25877)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-08-10 13:59:15 +08:00
yah01
889424b3f9
Fix load index with empty file list (#26236)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-08-09 18:39:16 +08:00
MrPresent-Han
68ecf49ed5
fix lost parameter for threadCoefficient(#25781) (#26109)
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-08-03 12:05:06 +08:00
yah01
9c55a7f422
Add worker num as one of load resource (#26045)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-08-01 21:47:06 +08:00
yah01
39b00b97a6
Add more logs for committed resources (#26026)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-07-31 19:27:03 +08:00
yah01
8245e078c0
Add max segment size log (#26015)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-07-31 14:37:04 +08:00
congqixia
5fbe6e99fd
Remove debug log from segment loader (#25937)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-07-26 18:51:01 +08:00