206 Commits

Author SHA1 Message Date
congqixia
05f880708d
enhance: Make skip load work for all branches (#37160)
Related to #37112

Skip load logic used to work only when there were multiple segment load
info entries in the load request. In the continuous delete case, the
delegator still loads L0 segments, which occupy a lot of memory.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-25 23:37:29 +08:00
Buqian Zheng
088d5d7d76
fix: optimize BM25 err message (#37074)
issue: https://github.com/milvus-io/milvus/issues/37022

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-25 14:35:45 +08:00
congqixia
b086ef6b19
enhance: Skip load delta data in delegator when using RemoteLoad (#37082)
Related to #35303

Delta data is not needed when using the `RemoteLoad` L0 forward policy.
Skipping the delta data load eases memory pressure when the L0 segment
size or count is large.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-24 16:21:37 +08:00
congqixia
d8db3e8761
enhance: Add metrics for querynode delete buffer info (#37081)
Related to #35303

This PR adds metrics for querynode delegator delete buffer information,
which is related to the DML quota logic.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-24 10:47:28 +08:00
congqixia
f43527ef6f
enhance: Batch forward delete when using DirectForward (#37076)
Related to #36887

DirectForward streaming delete causes memory usage to explode when the
number of segments is large. This PR adds a batching delete API and uses
it for the direct forward implementation.
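
The commit message does not describe the shape of the batching API. As a rough illustration only, the sketch below assumes the idea is to group target segments so that one delete payload is forwarded per group rather than per segment; `batchSegments` and its parameters are hypothetical names, not the Milvus API.

```go
package main

import "fmt"

// batchSegments splits target segment IDs into groups of at most batchSize,
// so that one delete payload can be sent per group instead of per segment.
// Hypothetical helper for illustration; not taken from the Milvus code base.
func batchSegments(segmentIDs []int64, batchSize int) [][]int64 {
	if batchSize <= 0 {
		batchSize = 1 // defensive default for this sketch
	}
	var batches [][]int64
	for start := 0; start < len(segmentIDs); start += batchSize {
		end := start + batchSize
		if end > len(segmentIDs) {
			end = len(segmentIDs)
		}
		batches = append(batches, segmentIDs[start:end])
	}
	return batches
}

func main() {
	// Five target segments, forwarded two per request.
	for _, batch := range batchSegments([]int64{1, 2, 3, 4, 5}, 2) {
		fmt.Println("forward delete to segments", batch)
	}
}
```

With a fixed batch size, the number of in-flight delete payloads is bounded by the number of batches rather than the number of segments.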

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-24 10:39:28 +08:00
Gao
1d61b604e1
enhance: support retry search when topk is reduced and result not enough (#35645)
issue: #35576 

This PR covers the cases where queryHook optimizes the search params and
leaves the result size insufficient. It adds a retry search mechanism
and related metrics for alarming.
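
A minimal sketch of the retry idea, assuming the hook lowers the per-shard topk and the merged result is then checked against the requested topk. The functions `searchShards` and `searchWithRetry` are hypothetical, and a real merge would rank hits by score rather than truncate.

```go
package main

import "fmt"

// searchShards takes up to perShardTopK hits from each shard and merges them.
// Shards and hit IDs are hypothetical stand-ins for real search results.
func searchShards(shards [][]int64, perShardTopK int) []int64 {
	var merged []int64
	for _, hits := range shards {
		n := perShardTopK
		if n > len(hits) {
			n = len(hits)
		}
		merged = append(merged, hits[:n]...)
	}
	return merged
}

// searchWithRetry searches with the reduced topk first. If the topk was
// reduced and the merged result is smaller than requested, it retries once
// with the original topk and reports that a retry happened.
func searchWithRetry(shards [][]int64, requestedTopK, reducedTopK int) ([]int64, bool) {
	results := searchShards(shards, reducedTopK)
	retried := false
	if reducedTopK < requestedTopK && len(results) < requestedTopK {
		results = searchShards(shards, requestedTopK)
		retried = true
	}
	if len(results) > requestedTopK {
		results = results[:requestedTopK] // simplified: no score-based ranking
	}
	return results, retried
}

func main() {
	shards := [][]int64{{1, 2, 3, 4, 5}, {6}}
	results, retried := searchWithRetry(shards, 4, 2)
	fmt.Println(results, retried) // [1 2 3 4] true
}
```

The `retried` flag is the natural place to hang a counter metric for alarming.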

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-10-23 19:19:30 +08:00
Zhen Ye
ac178eeea5
enhance: make delegator lock critical smaller (#36997)
issue: #36804

Signed-off-by: chyezh <chyezh@outlook.com>
2024-10-21 11:33:25 +08:00
aoiasd
fbe177d6e7
fix: avoid panic when load segment with pkoracle and idforacle already exist (#36959)
relate: https://github.com/milvus-io/milvus/issues/36949

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-18 11:57:24 +08:00
Bingyi Sun
6851738fd1
fix: fix make generate-mockery panic with go1.22 (#36830)
https://github.com/milvus-io/milvus/issues/36831
Fix `make generate-mockery` panic.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-10-17 12:11:31 +08:00
Buqian Zheng
06b5e186a7
fix: return error if searching against BM25 output field with incorrect metric type (#36910)
issue: https://github.com/milvus-io/milvus/issues/36835

Currently, searching a BM25 output field using IP ends up in an error
in segcore that is hard to understand. This change returns the error in
the query node delegator and provides a more useful error message.

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-16 19:45:23 +08:00
Chun Han
903450f5c6
enhance: add ts support for iterator(#22718) (#36572)
related: #22718

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-10-16 18:51:23 +08:00
congqixia
447ff342fb
fix: Direct forward delta exclude l0 segments (#36899)
Related to #36887

Forwarding a delete to an L0 segment returns an error and marks the L0
segment offline, making the delegator unserviceable.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-16 14:05:23 +08:00
aoiasd
72dc07ba48
fix: bm25 search failed when nq > 1 and remove idf oracle when no bm25 field exist. (#36886)
relate: https://github.com/milvus-io/milvus/issues/35853

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-16 12:51:23 +08:00
congqixia
ba25320aea
fix: Unify loaded partition check to delegator (#36879)
Related to #36370

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-15 19:15:23 +08:00
aoiasd
5ec4163d0f
feat: support bm25 logs mixcompaction (#36072)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-14 16:57:22 +08:00
Buqian Zheng
383350c120
feat: added more checks for function creation check (#36766)
issue: https://github.com/milvus-io/milvus/issues/35853

* BM25 Function now takes no params; k1 and b should be passed via index
params
* support BM25 full text search when metric type is not present in
search request
* add stricter validation of functions at collection creation time

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-13 17:43:22 +08:00
Buqian Zheng
16b533cbf0
feat: Restful support for BM25 function (#36713)
issue: https://github.com/milvus-io/milvus/issues/35853

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-13 17:41:21 +08:00
aoiasd
db34572c56
feat: support load and query with bm25 metric (#36071)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-11 10:23:20 +08:00
congqixia
1833913f44
enhance: Add streaming forward policy switch for delegator (#36330)
Related to #35303

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-23 18:01:12 +08:00
congqixia
11dbe1e755
enhance: Add L0 forward policy to support remote load (#36189)
Related to #35303

This PR adds a param item to support changing the L0 forward behavior
from BF filtering and forwarding to remote load.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-12 12:01:08 +08:00
Chun Han
e480b103bd
feat: supporting hybrid search group_by (#35982)
related: #35096

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-09-08 17:09:04 +08:00
congqixia
8593c4580a
enhance: Add delete buffer related quota logic (#35918)
See also #35303

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-05 11:39:03 +08:00
Zhen Ye
99dff06391
enhance: using streaming service in insert/upsert/flush/delete/querynode (#35406)
issue: #33285

- using streaming service in insert/upsert/flush/delete/querynode
- fixup flusher bugs and refactor the flush operation
- enable streaming service for dml and ddl
- pass the e2e when enabling streaming service
- pass the integration test when enabling streaming service

---------

Signed-off-by: chyezh <chyezh@outlook.com>
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-29 10:03:08 +08:00
Chun Han
bfd9d86fe9
feat: support groupby size on go-layer(#33544) (#33845)
related: #33544

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-08-27 14:21:00 +08:00
aoiasd
fe83805d56
fix: loss data bug for deprecated querynode DoubleBuffer (#35128)
relate: https://github.com/milvus-io/milvus/issues/31548

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-08-27 14:10:59 +08:00
congqixia
f87af9bc54
enhance: Exclude L0 segment from readable snapshot (#35507)
L0 segments no longer contain insert data and may confuse the query
hook optimizer if counted in the sealed segment number.

This PR adds a segment level flag to the segment entry and excludes L0
segments when getting the readable segment snapshot.
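
As a sketch of what a level flag on a segment entry and the corresponding filter could look like (the types `segmentLevel` and `segmentEntry` and the function `readableSealed` are hypothetical, not the actual Milvus definitions):

```go
package main

import "fmt"

// segmentLevel is a hypothetical per-segment level flag.
type segmentLevel int

const (
	levelL0 segmentLevel = iota // delete-only segment, no insert data
	levelL1
)

// segmentEntry is a stand-in for an entry in the delegator's distribution.
type segmentEntry struct {
	SegmentID int64
	Level     segmentLevel
}

// readableSealed returns the entries that should count as readable sealed
// segments, skipping L0 segments.
func readableSealed(entries []segmentEntry) []segmentEntry {
	readable := make([]segmentEntry, 0, len(entries))
	for _, e := range entries {
		if e.Level == levelL0 {
			continue
		}
		readable = append(readable, e)
	}
	return readable
}

func main() {
	entries := []segmentEntry{{1, levelL1}, {2, levelL0}, {3, levelL1}}
	fmt.Println(len(readableSealed(entries))) // 2
}
```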

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-16 15:28:53 +08:00
congqixia
6ff238e88a
fix: Set corresponding DataScope for loadStreamDelete (#35312)
Related to #35311

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-06 22:32:23 +08:00
Chun Han
3faef63a25
enhance: add log for partition stats( #30376) (#35219)
related:  #30376

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-08-02 19:34:22 +08:00
congqixia
de8a266d8a
enhance: Enable linux code checker (#35084)
See also #34483

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-30 15:53:51 +08:00
wei liu
c45f38aa61
enhance: Update protobuf-go to protobuf-go v2 (#34394)
issue: #34252

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-29 11:31:51 +08:00
Chun Han
e2e38e98df
fix: nil part stats without l2 compaction(#34923) (#34992)
related: #34923

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-29 11:07:48 +08:00
Chun Han
c46c401112
fix: refine handling type for segment pruner(#34923) (#34925)
related: #34923

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-25 13:57:45 +08:00
jaime
3b62138c5c
fix: unstable UT for level0 deletion (#34524)
issue: #34533

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-11 10:02:56 +08:00
wei liu
eeb03a0d6a
fix: Query may return deleted records (#34501)
issue: #34500
Cause: the sort in `GetLevel0Deletions` breaks the correspondence
between pks and tss; the pks and tss will be sorted in the
segment.Delete() interface anyway.

This PR removes this unnecessary and incorrect sort step so that queries
no longer return deleted records.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-09 10:46:11 +08:00
Chun Han
8af187f673
fix: lose partitionIDs when scalar pruning and refine segment prune ratio metrics(#30376) (#34477)
related: #30376
fix: partitionIDs lost when no partitions are set
enhance: refine metrics for segment prune

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-07-08 19:54:15 +08:00
Chun Han
fcafdb6d5f
enhance: reconstruct scalar part's code for segment-pruner(#30376) (#34346)
related: #30376
1. support more complex expr
2. add more ut test for unrelated fields

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-04 16:36:09 +08:00
Chun Han
34bec2ea5e
enhance: add metrics for segment prune latency(#30376) (#34094)
related: #30376

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-03 10:04:07 +08:00
jaime
9630974fbb
enhance: move rocksmq from internal to pkg module (#33881)
issue: #33956

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-25 21:18:15 +08:00
wayblink
f9a0f7bb25
Add an option to enable/disable vector field clustering key (#34097)
#30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-06-25 18:52:04 +08:00
Chun Han
ca7ef26e4b
fix: sync part stats task cannot be finished(#30376) (#34027)
related: #30376
also: refine log output for query_coord task by rephrasing action string

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-06-24 10:16:02 +08:00
cqy123456
32f685ff12
enhance: growing segment support mmap (#32633)
issue: https://github.com/milvus-io/milvus/issues/32984

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-18 14:42:00 +08:00
congqixia
2a04b0929a
fix: Prevent use captured iteration variable partitionID (#33906)
See also #33902
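
For context, in Go versions before 1.22 a `for` loop variable is shared across iterations, so a closure or goroutine that captures it may observe a later value. A minimal sketch of the usual fix, copying the variable per iteration (`scalePartitions` is a hypothetical function, not the patched Milvus code):

```go
package main

import (
	"fmt"
	"sync"
)

// scalePartitions processes each partition ID in its own goroutine.
// The per-iteration copies make the captured values stable on every Go
// version; since Go 1.22 the language does this automatically.
func scalePartitions(partitionIDs []int64) []int64 {
	results := make([]int64, len(partitionIDs))
	var wg sync.WaitGroup
	for i, partitionID := range partitionIDs {
		i, partitionID := i, partitionID // shadow with per-iteration copies
		wg.Add(1)
		go func() {
			defer wg.Done()
			results[i] = partitionID * 10 // each goroutine writes its own slot
		}()
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(scalePartitions([]int64{1, 2, 3})) // [10 20 30]
}
```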

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-17 19:11:59 +08:00
wei liu
4987067375
enhance: Execute bloom filter apply in parallel to speed up segment predict (#33792)
issue: #33610

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-14 11:37:56 +08:00
wei liu
ab93d9c23d
enhance: Use BatchPkExist to reduce bloom filter func call cost (#33611)
issue: #33610

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-13 17:57:56 +08:00
wayblink
a1232fafda
feat: Major compaction (#33620)
#30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
2024-06-10 21:34:08 +08:00
wei liu
34c6a989ab
enhance: Avoid load bf in delegator when qn worker has no more memory (#33557)
Query coord sends a load request to the delegator; the delegator loads
the BF first, then forwards the load request to the QN worker. But when
the QN worker has no more memory, it returns a load failure immediately,
and the delegator rolls back the loaded BF. Query coord will retry the
load request, and the delegator will load and roll back the BF again and
again.

This PR delays the BF loading step until the segment load succeeds in
the worker.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-03 19:23:45 +08:00
wei liu
c6a1c49e02
enhance: Use Blocked Bloom Filter instead of basic bloom filter impl. (#33405)
issue: #32995
To speed up the construction and querying of Bloom filters, we chose a
blocked Bloom filter instead of a basic Bloom filter implementation.

WARN: This PR is compatible with the old version of the BF impl, but
falling back to an old Milvus version may cause bloom filter
deserialization to fail.

In single Bloom filter test cases with a capacity of 1,000,000 and a
false positive rate (FPR) of 0.001, the figures below show the blocked
Bloom filter to be roughly 3.5 to 3.9 times faster than the basic Bloom
filter in both querying and construction, at the cost of about a 26%
increase in memory usage.

- Block BF construct time: {"time": "54.128131ms"}
- Block BF size: {"size": 3021578}
- Block BF Test cost: {"time": "55.407352ms"}
- Basic BF construct time: {"time": "210.262183ms"}
- Basic BF size: {"size": 2396308}
- Basic BF Test cost: {"time": "192.596229ms"}

In multi Bloom filter test cases with a capacity of 100,000, an FPR of
0.001, and 100 Bloom filters, we reuse the primary key locations for all
Bloom filters to avoid repeated hash computations. In this setup the
figures below show the blocked Bloom filter to be about 6 times faster
than the basic Bloom filter in querying.

- Block BF TestLocation cost: {"time": "529.97183ms"}
- Basic BF TestLocation cost: {"time": "3.197430181s"}
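
For readers unfamiliar with the technique, a blocked Bloom filter confines all bits for one key to a single small block (typically one cache line), trading some extra space for fewer memory accesses. The sketch below is a generic illustration with assumed parameters (512-bit blocks, FNV-1a hashing, double hashing inside the block); it is not the implementation adopted by the PR.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const wordsPerBlock = 8 // 8 x 64 bits = one 512-bit block

// blockedBloom keeps all k bits of a key inside one block.
type blockedBloom struct {
	blocks [][wordsPerBlock]uint64
	k      int
}

func newBlockedBloom(numBlocks, k int) *blockedBloom {
	return &blockedBloom{blocks: make([][wordsPerBlock]uint64, numBlocks), k: k}
}

// locate hashes the key once, picks a block, and derives k bit offsets
// inside that block via double hashing.
func (b *blockedBloom) locate(key []byte) (int, []uint) {
	h := fnv.New64a()
	_, _ = h.Write(key)
	sum := h.Sum64()
	block := int(sum % uint64(len(b.blocks)))
	h1, h2 := uint32(sum>>32), uint32(sum)|1
	bits := make([]uint, b.k)
	for i := range bits {
		bits[i] = uint(h1+uint32(i)*h2) % (wordsPerBlock * 64)
	}
	return block, bits
}

// Add sets the key's k bits in its block.
func (b *blockedBloom) Add(key []byte) {
	block, bits := b.locate(key)
	for _, bit := range bits {
		b.blocks[block][bit/64] |= uint64(1) << (bit % 64)
	}
}

// Test reports whether the key may be present (false means definitely absent).
func (b *blockedBloom) Test(key []byte) bool {
	block, bits := b.locate(key)
	for _, bit := range bits {
		if b.blocks[block][bit/64]&(uint64(1)<<(bit%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	bf := newBlockedBloom(1024, 4)
	bf.Add([]byte("pk-1"))
	fmt.Println(bf.Test([]byte("pk-1"))) // true: added keys are never reported absent
}
```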

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-31 17:49:45 +08:00
SimFG
cb99e3db34
enhance: add the includeCurrentMsg param for the Seek method (#33326)
/kind improvement
- issue: #33325

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-05-27 10:31:41 +08:00
Xiaofan
3d105fcb4d
enhance: Remove l0 delete cache (#32990)
fix #32979
Remove the L0 cache and build delete pk and ts every time. This reduces
memory usage and also improves code readability.

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-05-21 22:53:40 +08:00
wei liu
5038036ece
enhance: Reuse hash locations during access bloom filter (#32642)
issue: #32530 

When trying to match segment bloom filters with a pk, we can reuse the
hash locations. This PR maintains the max hash Func and computes the
hash locations once for all segments; reusing hash locations speeds up
BF access.
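
A rough sketch of the reuse pattern, assuming each segment's filter may use a different number of hash functions and that raw locations are computed once for the largest count. The `bloom` type and `hashLocations` are hypothetical; the real code may derive locations differently.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a deliberately simple filter: one bool per bit, k hash functions.
type bloom struct {
	bits []bool
	k    int
}

func newBloom(m, k int) *bloom { return &bloom{bits: make([]bool, m), k: k} }

// hashLocations computes maxK raw locations for a key once, using double
// hashing. Each filter later reduces them modulo its own size.
func hashLocations(key []byte, maxK int) []uint64 {
	h := fnv.New64a()
	_, _ = h.Write(key)
	sum := h.Sum64()
	h1, h2 := sum>>32, (sum&0xffffffff)|1
	locs := make([]uint64, maxK)
	for i := range locs {
		locs[i] = h1 + uint64(i)*h2
	}
	return locs
}

// addLocations sets the first k precomputed locations (needs len(locs) >= k).
func (f *bloom) addLocations(locs []uint64) {
	for _, loc := range locs[:f.k] {
		f.bits[loc%uint64(len(f.bits))] = true
	}
}

// testLocations checks the first k precomputed locations without rehashing.
func (f *bloom) testLocations(locs []uint64) bool {
	for _, loc := range locs[:f.k] {
		if !f.bits[loc%uint64(len(f.bits))] {
			return false
		}
	}
	return true
}

func main() {
	filters := []*bloom{newBloom(1000, 3), newBloom(2000, 5)}
	maxK := 5 // largest k across all segment filters

	locs := hashLocations([]byte("pk-7"), maxK) // hashed once
	filters[1].addLocations(locs)

	for i, f := range filters {
		fmt.Println(i, f.testLocations(locs)) // reused for every filter
	}
}
```

The key is hashed once per primary key instead of once per segment, which is where the saving comes from when many segments are probed.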

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-07 06:13:47 -07:00