419 Commits

Author SHA1 Message Date
congqixia
618f0cb728
enhance: Put release segment and other misc cgo call into pool (#38186)
Related to #30273

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-05 11:04:40 +08:00
Zhen Ye
c6dcef7b84
enhance: move segcore codes of segment into one package (#37722)
issue: #33285

- move most cgo opeartions related to search/query into segcore package
for reusing for streamingnode.
- add go unittest for segcore operations.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-11-29 10:22:36 +08:00
congqixia
cb6542339e
enhance: Mark cgo thread with tag name (#38000)
Related to #37999

This PR add `SetThreadName` API for marking cgo thread and utilize it
when initializing cgo worker.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-26 11:22:35 +08:00
congqixia
1ed686783f
enhance: Use PrimaryKeys to replace interface slice for segment delete (#37880)
Related to #35303

Reduce temporary memory usage for PK interface for segment delete.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-22 11:52:33 +08:00
congqixia
92e6ee6285
enhance: Use load pool for CreateTextIndex (#37898)
Related to #37895

Only resolves the starving issue which caused goroutine leakage

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-22 10:06:33 +08:00
congqixia
ee54a98578
enhance: Add cgo call metrics for load/write API (#37405)
Cgo API cost is not observerable since not metrics is related to them.
This PR add metrics for some sync cgo call related to load & write

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-07 10:06:25 +08:00
congqixia
3106384fc4
enhance: Return deltadata for DeleteCodec.Deserialize (#37214)
Related to #35303 #30404

This PR change return type of `DeleteCodec.Deserialize` from
`storage.DeleteData` to `DeltaData`, which
reduces the memory usage of interface header.

Also refine `storage.DeltaData` methods to make it easier to usage.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 12:04:24 +08:00
congqixia
5a0135727d
fix: Check resource when loading deltalogs (#37195)
Related to #36887

`LoadDeltaLogs` API did not check memory usage. When system is under
high delete load pressure, this could result into OOM quit.

This PR add resource check for `LoadDeltaLogs` actions and separate
internal deltalog loading function with public one.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 10:04:25 +08:00
congqixia
224d797f94
fix: Use singleton delete pool and avoid goroutine leakage (#37220)
Related to #36887

Previously using newly create pool per request shall cause goroutine
leakage. This PR change this behavior by using singleton delete pool.
This change could also provide better concurrency control over delete
memory usage.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 10:02:24 +08:00
congqixia
7774b7275e
enhance: Replace PrimaryKey slice with PrimaryKeys saving memory (#37127)
Related to #35303

Slice of `storage.PrimaryKey` will have extra interface cost for each
element, which may cause notable memory usage when delta row count
number is large.

This PR replaces PrimaryKey slice with PrimaryKeys interface saving the
extra interface cost.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-28 10:29:30 +08:00
foxspy
d7b2ffe5aa
enhance: add an unify vector index config checker (#36844)
issue: #34298

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-10-28 10:11:37 +08:00
yihao.dai
ed37c27bda
fix: Fix collection leak in querynode (#37061)
Unref the removed L0 segment count.

issue: https://github.com/milvus-io/milvus/issues/36918

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-25 19:59:29 +08:00
aoiasd
22b917a1e6
enhance: Add collection name label for some metric (#36951)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-25 14:29:47 +08:00
congqixia
f43527ef6f
enhance: Batch forward delete when using DirectForward (#37076)
Relatedt #36887

DirectFoward streaming delete will cause memory usage explode if the
segments number was large. This PR add batching delete API and using it
for direct forward implementation.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-24 10:39:28 +08:00
Gao
1d61b604e1
enhance: support retry search when topk is reduced and result not enough (#35645)
issue: #35576 

This pr is to cover those cases when queryHook optimize search params
and make the result size insufficient, add retry search mechanism and
add related metrics for alarming.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-10-23 19:19:30 +08:00
congqixia
5dd3f44cc1
enhance: Preallocate delete data slice to avoid growslice (#37043)
Related to #36887

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-22 19:07:29 +08:00
congqixia
0d8f20f7ce
fix: Pass full field list when partial load enabled (#37053)
Related to #37038

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-22 18:43:27 +08:00
SimFG
903c18ba26
enhance: consider the mmap chunck cache config when resource usage estimate (#36814)
- issue: #36530

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-10-18 10:17:23 +08:00
foxspy
3de57ec4fa
enhance: add vector index mgr to remove vector index type dependency (#36843)
issue: #34298

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-10-17 22:15:25 +08:00
cqy123456
b474374ea5
enhance: use growingMmapEnabled to control the behavior of interim index, not vectorField (#36500)
issue:https://github.com/milvus-io/milvus/issues/36392
related pr: https://github.com/milvus-io/milvus/pull/36391

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-10-17 20:25:24 +08:00
Bingyi Sun
6851738fd1
fix: fix make generate-mockery panic with go1.22 (#36830)
https://github.com/milvus-io/milvus/issues/36831
Fix `make generate-mockery` panic.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-10-17 12:11:31 +08:00
yihao.dai
f3b6792a25
enhance: Enhance segment log (#36848)
/kind improvement

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-15 20:43:30 +08:00
congqixia
ba25320aea
fix: Unify loaded partition check to delegator (#36879)
Related to #36370

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-15 19:15:23 +08:00
Bingyi Sun
a75bb85f3a
feat: support chunked column for sealed segment (#35764)
This PR splits sealed segment to chunked data to avoid unnecessary
memory copy and save memory usage when loading segments so that loading
can be accelerated.

To support rollback to previous version, we add an option
`multipleChunkedEnable` which is false by default.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-10-12 15:04:52 +08:00
aoiasd
db34572c56
feat: support load and query with bm25 metric (#36071)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-11 10:23:20 +08:00
SimFG
130a923dec
enhance: the estimate method when loading the collection (#36307)
- issue: #36530

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
Co-authored-by: xianliang.li <xianliang.li@zilliz.com>
2024-10-09 17:35:19 +08:00
congqixia
ddc3e76803
fix: Add defer Unpin when error happens (#36620)
Resolves: #36619

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-30 19:49:17 +08:00
cai.zhang
ecb2b242e2
enhance: Add sorted for segment info (#36469)
issue: #33744

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-30 10:01:16 +08:00
congqixia
4fd9b0a8e3
enhance: Return segment id hint in QueryStream response (#36487)
Related to #36482

This PR reuses `SealedSegmentIDsRetrieved` field in `RetrieveResults`
struct to store segment id hint.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-26 10:13:14 +08:00
congqixia
ed95568a05
enhance: Fix PR conflict in reduce unit test (#36470)
Related to #36433 #36180

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-24 18:01:13 +08:00
congqixia
98a917c5d4
enhance: [skip e2e] Add unittest for reducing duplicated pk from multi segments (#36433)
Related to #35505 #36362

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-24 14:11:13 +08:00
Chun Han
df7ae08851
fix: iterator cursor progress too fast(#36179) (#36180)
related: #36179

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-09-24 11:45:13 +08:00
yihao.dai
763fd0dfc5
enhance: Use a separate mmap config for chunk cache (#36276)
issue: https://github.com/milvus-io/milvus/issues/35273

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-09-15 16:23:09 +08:00
congqixia
3bc7d63be9
fix: overwrite correct selection when pk duplicated (#35826)
Related to #35505

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-14 10:27:08 +08:00
Jiquan Long
89bf226f0b
feat: support keyword text match (#35923)
fix: #35922

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-10 15:11:08 +08:00
Chun Han
9d0aa5c202
fix: empty result when having only one subReq(#36098) (#36128)
related: #36098

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-09-10 14:25:07 +08:00
zhagnlu
208c8a2328
fix:support config index offsetcache and fix create same index again (#35985)
#35971

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-09-08 18:23:05 +08:00
Chun Han
e480b103bd
feat: supporing hybrid search group_by (#35982)
related: #35096

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-09-08 17:09:04 +08:00
XuanYang-cn
7859faf8ea
fix: Change deltalog memory estimation factor to one (#36033)
See also: #36031

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-09-06 18:21:05 +08:00
cai.zhang
2c9bb4dfa3
feat: Support stats task to sort segment by PK (#35054)
issue: #33744 

This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-02 14:19:03 +08:00
Chun Han
bfd9d86fe9
feat: support groupby size on go-layer(#33544) (#33845)
related: #33544

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-08-27 14:21:00 +08:00
yihao.dai
f2b83d316b
enhance: Support memory mode chunk cache (#35347)
Chunk cache supports loading raw vectors into memory.

issue: https://github.com/milvus-io/milvus/issues/35273

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-25 15:42:58 +08:00
Zhen Ye
a773836b89
enhance: optimize milvus core building (#35610)
issue: #35549,#35611,#35633

- remove milvus_segcore milvus_indexbuilder..., add libmilvus_core
- core building only link once
- move opendal compilation into cmake
- fix odr

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-23 12:35:02 +08:00
SimFG
731d45abbe
enhance: provide more general configuration to control mmap behavior (#35359)
- issue: #35273

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-08-21 00:22:54 +08:00
Ted Xu
41646c8439
feat: integrate new deltalog format (#35522)
See #34123

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-08-20 19:06:56 +08:00
congqixia
2fbc628994
feat: Support field partial load collection (#35416)
Related to #35415

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-20 16:49:02 +08:00
zhagnlu
4b553b0333
enhance: revert remove duplicated pk function (#35103)
issue: #34778
 Revert "fix: fix query count(*) concurrently"
 Revert "enhance: mark duplicated pk as deleted "

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-05 10:48:17 +08:00
Bingyi Sun
3641ae6611
fix: Fix index memory estimation (#35225)
issue: https://github.com/milvus-io/milvus/issues/35229

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-08-02 16:24:15 +08:00
zhenshan.cao
aa247f192d
enhance: remove unused code for StorageV2 (#35132)
issue: https://github.com/milvus-io/milvus/issues/34168

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-01 12:08:13 +08:00
congqixia
f7f9a729c9
enhance: Pre-allocate space for reduce data structure (#35118)
Grow slice & map.growWork may cause a lot when segment number is large
for big K query. This PR pre-allocate space for reduce methods to avoid
this cost.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-31 10:35:49 +08:00