824 Commits

Author SHA1 Message Date
congqixia
10460ed3f8
enhance: Simplify querynode tsafe & reduce goroutine number (#38416)
Related to #37630

TSafe manager is too complex for current implementation and each
delegator need one goroutine waiting for tsafe update event.

Tsafe updating could be executed in pipeline. This PR remove tsafe
manager and simplify the entire logic of tsafe updating.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-13 10:56:43 +08:00
Buqian Zheng
75e64b993f
enhance: add metrics for counting number of nun-zeros/tokens of sparse/FTS search (#38329)
sparse vectors may have arbitrary number of non zeros and it is hard to
optimize without knowing the actual distribution of nnz. this PR adds a
metric for analyzing that.

issue: https://github.com/milvus-io/milvus/issues/35853

comparing with https://github.com/milvus-io/milvus/pull/38328, this
includes also metric for FTS in query node delegator

also fixed a bug of sparse when searching by pk

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-12-12 16:22:43 +08:00
Gao
8977454311
enhance: support recall estimation (#38017)
issue: #37899 
Only `search` api will be supported

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-12-11 20:40:48 +08:00
congqixia
7c55649585
enhance: Refine querynode collection number metrics (#38350)
Related to #37630

Previously the loaded collection metrics was calculated via scanning all
loaded segment in segment manager, which is slow and buggy
implementation.

This PR:

- Move collection num metrics to collection manager
- Remove deprecated loaded partition metrics update logic

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-11 11:36:43 +08:00
congqixia
7ea9c983d2
enhance: Add mockery package config for QC&QN (#38340)
Related to #38339

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-10 19:18:42 +08:00
congqixia
b8e3795374
enhance: Add secondary index for querynode segment manager (#38311)
Related to #37630

Add secondary index with vchannel to reduce `GetBy` rlock holding time
when segment number is large.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-10 18:18:42 +08:00
wei liu
28c5189c6d
enhance: Refine the error msg for filter node (#38278)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-12-06 18:22:40 +08:00
jaime
8ed019735c
enhance: add disk stats within system metrics (#38033)
issue: ##36621

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-06 16:32:41 +08:00
congqixia
051bc280dd
enhance: Make dynamic load/release partition follow targets (#38059)
Related to #37849

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-05 16:24:40 +08:00
congqixia
618f0cb728
enhance: Put release segment and other misc cgo call into pool (#38186)
Related to #30273

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-05 11:04:40 +08:00
congqixia
3c6a373a75
enhance: Fill version for load delta request (#38212)
Version is needed for load delta request in case of false alarm warning
about version go backward

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-05 10:54:39 +08:00
wei liu
108434969f
enhance: Add collection id to search request count metrics (#38069)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-29 17:28:38 +08:00
Zhen Ye
c6dcef7b84
enhance: move segcore codes of segment into one package (#37722)
issue: #33285

- move most cgo opeartions related to search/query into segcore package
for reusing for streamingnode.
- add go unittest for segcore operations.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-11-29 10:22:36 +08:00
aoiasd
db3c002788
fix:wrong warning log when idf oracle find L0 segment lacks (#37873)
relate: https://github.com/milvus-io/milvus/issues/35853

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-26 16:26:41 +08:00
congqixia
cb6542339e
enhance: Mark cgo thread with tag name (#38000)
Related to #37999

This PR add `SetThreadName` API for marking cgo thread and utilize it
when initializing cgo worker.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-26 11:22:35 +08:00
congqixia
1e76d2b9f1
fix: Use correct policy merging growing&l0 and add unit tests (#37950)
Related to #37574

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-24 17:10:33 +08:00
jaime
7bbfe86bcd
enhance: add list index and segment index retrieval API for WebUI (#37861)
issue: #36621

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-11-22 16:58:34 +08:00
congqixia
1ed686783f
enhance: Use PrimaryKeys to replace interface slice for segment delete (#37880)
Related to #35303

Reduce temporary memory usage for PK interface for segment delete.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-22 11:52:33 +08:00
congqixia
92e6ee6285
enhance: Use load pool for CreateTextIndex (#37898)
Related to #37895

Only resolves the starving issue which caused goroutine leakage

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-22 10:06:33 +08:00
SimFG
54aaeda63f
fix: add the request ctx for stream pipeline interface (#37835)
- issue: #37834

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-11-22 03:10:34 +08:00
wei liu
5f3601a6a5
fix: unstable integration test caused by paramtable.GetNodeID (#37909)
issue: #37908
cause paramtable is global single instance, which cause
paramtable.GetNodeID may return wrong server id in integration test.

This PR use node.GetNodeID to replace paramtable.GetNodeID

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-21 22:16:33 +08:00
congqixia
c79fbd5eab
fix: Load l0 delta for growings when using RemoteLoad (#37771)
Related to #37574

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-19 10:38:31 +08:00
wei liu
351463b67e
fix: L0 segment has been loaded to worker during channel balance (#37748)
issue: #37703

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-18 20:58:30 +08:00
Chun Han
f2a2fd6808
enhance: shallow copy search request for large nq cases(#37732) (#37749)
related: #37732

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-11-18 13:06:32 +08:00
congqixia
b0bd290a6e
enhance: Use internal json(sonic) to replace std json lib (#37708)
Related to #35020

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-18 10:46:31 +08:00
jaime
1d06d4324b
fix: Int64 overflow in JSON encoding (#37657)
issue: ##36621

- For simple types in a struct, add "string" to the JSON tag for
automatic string conversion during JSON encoding.
- For complex types in a struct, replace "int64" with "string."

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-11-14 22:52:30 +08:00
aoiasd
993051bb49
fix: brute force bm25 search lack avgdl param (#37650)
relate: https://github.com/milvus-io/milvus/issues/35853

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-14 14:58:31 +08:00
XuanYang-cn
31a8d08bdd
fix: Correct varchar primarykey size calculation (#37617)
See also: #37582

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-14 14:16:38 +08:00
cai.zhang
14e007d6fb
fix: Fix the bug that retrieved from wrong field for L0 segments (#37598)
issue: #37574

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-11-12 20:58:29 +08:00
Zhen Ye
fca946dee1
enhance: move the search utilities from querynode into new module (#37531)
issue: #33285

- The search utilities will be shared between query node and streaming
node.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-11-11 11:10:27 +08:00
jaime
f348bd9441
feat: add segment,pipeline, replica and resourcegroup api for WebUI (#37344)
issue: #36621

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-11-07 11:52:25 +08:00
congqixia
ee54a98578
enhance: Add cgo call metrics for load/write API (#37405)
Cgo API cost is not observerable since not metrics is related to them.
This PR add metrics for some sync cgo call related to load & write

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-07 10:06:25 +08:00
congqixia
f54cf41830
enhance: Move forward l0 logic out of delta lock (#37337)
Related to #35303

`deleteMut` shall be protecting streaming delete buffer, forward l0
could be move out of the rlock section to reduce tsafe impact from
loading segments.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-04 10:42:22 +08:00
congqixia
be71b98146
enhance: Fix case error msg after knowhere upgrade (#37339)
Previous PR: #37315

Error message updated after knowhere behavior change, this PR update
corresponding test error message.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-01 14:00:22 +08:00
cai.zhang
2ef6cbbf59
feat: The expression supports filling elements through templates (#37033)
issue: #36672

The expression supports filling elements through templates, which helps
to reduce the overhead of parsing the elements.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-10-31 14:20:22 +08:00
congqixia
3106384fc4
enhance: Return deltadata for DeleteCodec.Deserialize (#37214)
Related to #35303 #30404

This PR change return type of `DeleteCodec.Deserialize` from
`storage.DeleteData` to `DeltaData`, which
reduces the memory usage of interface header.

Also refine `storage.DeltaData` methods to make it easier to usage.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 12:04:24 +08:00
congqixia
5a0135727d
fix: Check resource when loading deltalogs (#37195)
Related to #36887

`LoadDeltaLogs` API did not check memory usage. When system is under
high delete load pressure, this could result into OOM quit.

This PR add resource check for `LoadDeltaLogs` actions and separate
internal deltalog loading function with public one.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 10:04:25 +08:00
congqixia
224d797f94
fix: Use singleton delete pool and avoid goroutine leakage (#37220)
Related to #36887

Previously using newly create pool per request shall cause goroutine
leakage. This PR change this behavior by using singleton delete pool.
This change could also provide better concurrency control over delete
memory usage.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 10:02:24 +08:00
congqixia
f87acdf2a2
fix: Ref collection meta when load l0 segment meta only (#37178)
Related to #37177

Previous PR #37160

Collection meta is not ref-ed when loading l0 segment in `RemoteLoad`
policy, which cause collection meta release when lots of l0 segment
released.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-28 15:49:38 +08:00
congqixia
7774b7275e
enhance: Replace PrimaryKey slice with PrimaryKeys saving memory (#37127)
Related to #35303

Slice of `storage.PrimaryKey` will have extra interface cost for each
element, which may cause notable memory usage when delta row count
number is large.

This PR replaces PrimaryKey slice with PrimaryKeys interface saving the
extra interface cost.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-28 10:29:30 +08:00
jaime
9d16b972ea
feat: add tasks page into management WebUI (#37002)
issue: #36621

1. Add API to access task runtime metrics, including:
  - build index task
  - compaction task
  - import task
- balance (including load/release of segments/channels and some leader
tasks on querycoord)
  - sync task
2. Add a debug model to the webpage by using debug=true or debug=false
in the URL query parameters to enable or disable debug mode.

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-10-28 10:13:29 +08:00
foxspy
d7b2ffe5aa
enhance: add an unify vector index config checker (#36844)
issue: #34298

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-10-28 10:11:37 +08:00
congqixia
05f880708d
enhance: Make skip load work for all branches (#37160)
Related to #37112

Skip load logic used to work only when there is multiple segment load
info entires in load request. In continous delete case, delegator still
loads l0 segment, which occupies lot of memory.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-25 23:37:29 +08:00
yihao.dai
ed37c27bda
fix: Fix collection leak in querynode (#37061)
Unref the removed L0 segment count.

issue: https://github.com/milvus-io/milvus/issues/36918

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-25 19:59:29 +08:00
Yinzuo Jiang
3628593d20
feat: Implement custom function module in milvus expr (#36560)
OSPP 2024 project:
https://summer-ospp.ac.cn/org/prodetail/247410235?list=org&navpage=org

Solutions:

- parser (planparserv2)
    - add CallExpr in planparserv2/Plan.g4
    - update parser_visitor and show_visitor
- grpc protobuf
    - add CallExpr in plan.proto
- execution (`core/src/exec`)
- add `CallExpr` `ValueExpr` and `ColumnExpr` (both logical and
physical) for function call and function parameters
- function factory (`core/src/exec/expression/function`)
    - create a global hashmap when starting milvus (see server.go)
- the global hashmap stores function signatures and their function
pointers, the CallExpr in execution engine can get the function pointer
by function signature.
- custom functions
    - empty(string)
    - starts_with(string, string)
- add cpp/go unittests and E2E tests

closes: #36559

Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
2024-10-25 15:25:30 +08:00
Buqian Zheng
088d5d7d76
fix: optimize BM25 err message (#37074)
issue: https://github.com/milvus-io/milvus/issues/37022

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-25 14:35:45 +08:00
aoiasd
22b917a1e6
enhance: Add collection name label for some metric (#36951)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-25 14:29:47 +08:00
congqixia
b086ef6b19
enhance: Skip load delta data in delegater when using RemoteLoad (#37082)
Related to #35303

Delta data is not needed when using `RemoteLoad` l0 forward policy. By
skipping load delta data, memory pressure could be eased if l0 segment
size/number is large.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-24 16:21:37 +08:00
congqixia
d8db3e8761
enhance: Add metrics for querynode delete buffer info (#37081)
Related to #35303

This PR add metrics for querynode delegator delete buffer information,
which is related to dml quota logic.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-24 10:47:28 +08:00
congqixia
f43527ef6f
enhance: Batch forward delete when using DirectForward (#37076)
Relatedt #36887

DirectFoward streaming delete will cause memory usage explode if the
segments number was large. This PR add batching delete API and using it
for direct forward implementation.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-24 10:39:28 +08:00