18757 Commits

Author SHA1 Message Date
wei liu
c8658d17f8
fix: Grpcclient return unrecoverable error (#31256) (#31452)
issue: #31222
pr: #31256

grpcclient's `call` func returns an unrecoverable error, which also
breaks the caller's retry policy.

This PR introduces `retry.Handle`. The new func takes a `func() (bool,
error)` as its input parameter, which returns `shouldRetry` directly, so
that grpcclient no longer returns an unrecoverable error.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-21 11:59:12 +08:00
wei liu
6b761204ce
fix: Set node unreachable when get shard client failed (#31277) (#31451)
issue: #30531
pr: #31277

Failing to get a client from `shardClientMgr` doesn't mean the query node
is unavailable: because of the ref counter policy in `shardClientMgr`,
the client is cleaned up if no collection uses the qn as shard leader.

This PR fixes the case where a node was set unreachable because getting
the shard client failed.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-21 11:57:08 +08:00
wei liu
5994b6a7b0
fix: Search doesn't expire shard leader cache (#31380) (#31450)
issue: #31351
pr: #31380
This PR fixes search not expiring the shard leader cache when sending a
request to a query node fails, which made every request keep trying to
connect to an offline query node.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-21 11:55:07 +08:00
groot
1ca7cba222
enhance: Support MinIO TLS connection (#31292)
issue: https://github.com/milvus-io/milvus/issues/30709
master pr: #31311

Signed-off-by: yhmo <yihua.mo@zilliz.com>
Co-authored-by: Chen Rao <chenrao317328@163.com>
2024-03-21 11:15:20 +08:00
congqixia
94f3aec80a
enhance: [Cherry-pick] Add metrics for querycoord current target cp lag (#31391) (#31463)
Cherry-pick from master
pr: #31391 #31399
See also #31390

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-21 10:17:07 +08:00
wei liu
fef430daed
fix: Wrong behavior of CurrentTargetFirst/NextTargetFirst in target manager(#31379) (#31419)
issue: #31162
pr: #31379

When the scope CurrentTargetFirst/NextTargetFirst is given, both the
current and next targets are expected to be scanned.

This PR fixes the wrong behavior of CurrentTargetFirst/NextTargetFirst in
target manager, which may cause unexpected tasks to be generated and load
collection to get stuck forever due to a dirty leader view.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-20 23:39:07 +08:00
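
One way to read the scope semantics described in the commit above is as an ordered fallback over the two targets. The sketch below assumes that reading; the type and function names are hypothetical and the targets are reduced to plain maps.

```go
package main

import "fmt"

type scope int

const (
	currentTarget scope = iota
	nextTarget
	currentTargetFirst
	nextTargetFirst
)

// targetManager holds two simplified targets: segmentID -> channel name.
type targetManager struct {
	current map[int64]string
	next    map[int64]string
}

// order lists the targets to scan for a scope, in priority order.
func (m *targetManager) order(s scope) []map[int64]string {
	switch s {
	case currentTarget:
		return []map[int64]string{m.current}
	case nextTarget:
		return []map[int64]string{m.next}
	case currentTargetFirst:
		return []map[int64]string{m.current, m.next}
	case nextTargetFirst:
		return []map[int64]string{m.next, m.current}
	}
	return nil
}

// getSegment scans every target allowed by the scope and returns the first hit.
func (m *targetManager) getSegment(id int64, s scope) (string, bool) {
	for _, target := range m.order(s) {
		if ch, ok := target[id]; ok {
			return ch, true
		}
	}
	return "", false
}

func main() {
	m := &targetManager{
		current: map[int64]string{1: "ch-a"},
		next:    map[int64]string{1: "ch-b", 2: "ch-b"},
	}
	fmt.Println(m.getSegment(2, currentTargetFirst)) // falls through to next
	fmt.Println(m.getSegment(1, nextTargetFirst))    // next wins
	fmt.Println(m.getSegment(2, currentTarget))      // current only, no fallback
}
```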
cai.zhang
52a7eb9548
fix: Fix bug for get segment index state (#31429)
issue: #31361 
master pr: #31427 
2.4 pr: #31428

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-20 15:05:06 +08:00
congqixia
86e347a1a4
enhance: [2.3] Cache formatted key for param item (#31388) (#31402)
Cherry-pick from master
pr: #31388 
See also #30806

`formatKey` may cost a lot of CPU on string processing under high-QPS
scenarios. This PR adds a formattedKeys cache to avoid string operations
on each param value lookup.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-19 19:25:10 +08:00
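
A rough sketch of the caching idea in the commit above: format each key once and serve later lookups from a concurrent map. The `keyCache` type and the normalization rules inside `formatKey` here are assumptions for illustration; the real paramtable rules may differ.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// keyCache memoizes formatted keys so the string work runs once per key.
type keyCache struct {
	formatted sync.Map // raw key (string) -> formatted key (string)
}

// formatKey is a hypothetical normalizer: lower-case the key and strip
// separators. It stands in for the expensive string processing.
func formatKey(key string) string {
	replacer := strings.NewReplacer("/", "", "_", "", ".", "")
	return replacer.Replace(strings.ToLower(key))
}

// get returns the formatted key, computing it only on the first lookup.
func (c *keyCache) get(key string) string {
	if v, ok := c.formatted.Load(key); ok {
		return v.(string)
	}
	v, _ := c.formatted.LoadOrStore(key, formatKey(key))
	return v.(string)
}

func main() {
	c := &keyCache{}
	fmt.Println(c.get("queryNode.Cache_Size")) // formatted and stored
	fmt.Println(c.get("queryNode.Cache_Size")) // served from the cache
}
```

`sync.Map` suits this access pattern because keys are written once and then read many times.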
cai.zhang
ef530a2324
enhance: When describing an index, fetch the index info in batches (#31239)
issue: #29313 
master pr: #31238

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-15 16:37:09 +08:00
sre-ci-robot
e77afcb5d5
[automated] Bump milvus version to v2.3.12 (#31303)
Bump milvus version to v2.3.12
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-03-15 16:19:05 +08:00
nico
75a86bc2d3
test: update test cases (#31253)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2024-03-15 15:23:10 +08:00
Jiquan Long
50bfde92f2
fix: wrong num_entities used when mmap variable length data (#30848) (#31274)
https://github.com/milvus-io/milvus/issues/30728
pr: #30848

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
v2.3.12
2024-03-14 20:33:03 +08:00
congqixia
4e48a4de0e
enhance: Bump milvus & proto version to v2.3.12 (#31193)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-14 19:09:04 +08:00
jaime
5ddb0b435f
fix: revoke session may be ignored due to server context cancellation in advance (#31213)
issue: #31219
pr: #31220

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-14 19:05:04 +08:00
sre-ci-robot
a33751a2d7
[automated] Update Pytest image changes (#31235)
Update Pytest image changes
See changes:
645cc0bdc3
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-03-14 09:59:11 +08:00
nico
645cc0bdc3
test: update test cases (#31161)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2024-03-13 19:05:11 +08:00
sre-ci-robot
5386a2c43e
[automated] Update Pytest image changes (#31108)
Update Pytest image changes
See changes:
005dbf2b24
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-03-13 11:21:19 +08:00
chyezh
7105e0b261
fix: lost dbname when only passing collection id to describeCollection (#31177)
issue: #30931
pr: #31167

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-11 19:51:03 +08:00
aoiasd
e747f15c80
fix: flush insert data with nil buffer (#31159)
relate: https://github.com/milvus-io/milvus/issues/31165

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-03-11 17:43:03 +08:00
wei liu
9d712f4dd4
fix: Balance param use duplicated key (#31112) (#31141)
pr: #31112
issue: #31115
This PR fixes the balance check interval param using a duplicated key

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-11 15:03:02 +08:00
wei liu
855f71ac89
fix: Dirty sealed segment won't release after channel balance (#31095) (#31126)
issue: #31074
pr: #31095
This PR fixes dirty sealed segments not being released after channel
balance. A dirty sealed segment is a segment that doesn't exist in targets.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-11 15:01:11 +08:00
congqixia
3e7f2e8e7d
enhance: [Cherry-Pick] Use ListIndexes instead of DescribeIndex for qc broker (#31163)
Cherry pick from master 
pr: #31122

See also #31103

Since querycoord needs only the index meta information from datacoord,
the broker shall use `ListIndexes` to skip the segment index building
check logic in datacoord.

This PR is also related to #30538, in which DescribeIndex caused heavy
memory usage and eventually led to OOM.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-11 14:41:02 +08:00
pingliu
1dd4f4b4dc
enhance: jemalloc aarch64 platform use 64k pagesize. (#31114)
pr: https://github.com/milvus-io/milvus/pull/29522
enhance: jemalloc aarch64 platform use 64k pagesize.

Signed-off-by: ping.liu <ping.liu@zilliz.com>
2024-03-11 12:03:02 +08:00
congqixia
3c90475d55
enhance: [Cherry-pick] Add ListIndexes API from datacoord (#31104) (#31150)
Cherry-pick from master
pr: #31104
See also #31103

This PR adds a `listIndexes` API to the datacoord server to list all
indexes for the provided collection.
Compared to the existing `DescribeIndex` API, the new one does NOT
check the segment index building progress, which eases the burden of
invoking it.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-11 10:47:02 +08:00
sre-ci-robot
8211af3a95
[automated] Bump milvus version to v2.3.11 (#31148)
Bump milvus version to v2.3.11
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-03-08 18:05:01 +08:00
Jiquan Long
c37b7792f4
enhance: purge client infos periodically (#31037) (#31092)
https://github.com/milvus-io/milvus/issues/31007
pr: #31037 

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-03-08 10:17:01 +08:00
zhuwenxing
542b46fb1e
test: add json and array datatype check in restful v1 (#31096)
pr: https://github.com/milvus-io/milvus/pull/31097

* When the collection is created using an SDK and includes array and
JSON datatypes in the schema, data can be inserted using the RESTful
API.
* When the collection is created using the RESTful API and includes JSON
and array datatypes in dynamic fields, data can also be inserted using
the RESTful API.

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
v2.3.11
2024-03-07 19:25:01 +08:00
nico
005dbf2b24
enhance: update pymilvus version (#31004)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2024-03-07 15:17:02 +08:00
congqixia
383ff8b0b1
enhance: [2.3] Add flush trigger for channel cp updater (#31082)
See also #31024  #31058

Flush cost rose from 2 seconds to 5 or more after the change to the
channel updater. This PR adds a manual trigger method to accelerate the
flush procedure.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-07 15:15:01 +08:00
yihao.dai
3eeeae8519
fix: Fix errors in the Index service APIs are ignored (#31077) (#31086)
In Index service APIs, return the error if one occurs instead of always
returning nil. Additionally, add more tests to cover this scenario.

issue: https://github.com/milvus-io/milvus/issues/31069,
https://github.com/milvus-io/milvus/issues/31027

pr: https://github.com/milvus-io/milvus/pull/31077

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-06 22:55:00 +08:00
congqixia
53f5a67112
enhance: [Cherry-pick] Fix misleading log content & possible nil panic (#31021) (#31054)
Cherry pick from master
pr: #31021 

- Change load field log from "dy pool" to "load pool"
- Also defer delete when there is no error

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-06 16:09:01 +08:00
congqixia
6b5e19f6b7
enhance: Bump milvus & proto version to v2.3.11 (#31035)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-05 17:15:00 +08:00
zhagnlu
095c94305c
fix: add GetSegments optimization to avoid meta mutex competition (#31026)
pr: #31025

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-03-05 14:49:01 +08:00
yihao.dai
91d17870d6
enhance: Prevent the backlog of channelCP update tasks, perform batch updates of channelCPs (#30941) (#31024)
This PR includes the following adjustments:

1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.

issue: https://github.com/milvus-io/milvus/issues/30004

pr: https://github.com/milvus-io/milvus/pull/30941

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-05 14:27:01 +08:00
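
Points 1 and 2 of the commit above can be sketched as a pending set keyed by vchannel that is drained in fixed-size batches. The types and names are hypothetical, the sketch assumes the newest checkpoint for a vchannel replaces the older one, and it omits the locking that a real updater would need.

```go
package main

import (
	"fmt"
	"sort"
)

// checkpoint is a simplified, hypothetical channel checkpoint.
type checkpoint struct {
	vchannel  string
	timestamp uint64
}

// updater keeps at most one pending checkpoint per vchannel.
type updater struct {
	pending map[string]checkpoint
}

func newUpdater() *updater {
	return &updater{pending: make(map[string]checkpoint)}
}

// add replaces an older pending checkpoint for the same vchannel, so the
// backlog is bounded by the number of vchannels.
func (u *updater) add(cp checkpoint) {
	if old, ok := u.pending[cp.vchannel]; ok && old.timestamp >= cp.timestamp {
		return
	}
	u.pending[cp.vchannel] = cp
}

// drain clears the pending set and returns it split into batches of at most
// batchSize entries (batchSize must be positive).
func (u *updater) drain(batchSize int) [][]checkpoint {
	all := make([]checkpoint, 0, len(u.pending))
	for _, cp := range u.pending {
		all = append(all, cp)
	}
	sort.Slice(all, func(i, j int) bool { return all[i].vchannel < all[j].vchannel })
	u.pending = make(map[string]checkpoint)

	var batches [][]checkpoint
	for start := 0; start < len(all); start += batchSize {
		end := start + batchSize
		if end > len(all) {
			end = len(all)
		}
		batches = append(batches, all[start:end])
	}
	return batches
}

func main() {
	u := newUpdater()
	for i := 0; i < 300; i++ {
		u.add(checkpoint{vchannel: fmt.Sprintf("ch-%03d", i), timestamp: uint64(i)})
	}
	u.add(checkpoint{vchannel: "ch-000", timestamp: 999}) // supersedes the older task
	for _, batch := range u.drain(128) {
		fmt.Println(len(batch)) // each batch would map to one RPC
	}
}
```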
congqixia
b7635ed989
enhance: [Cherry-pick] Change proxy connection manager to concurrent safe (#31009)
Cherry-pick from master
pr: #31008 
See also #31007

This PR:
- Add param item for connection manager behavior: TTL & check interval
- Change clientInfo map to concurrent map

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-05 14:13:00 +08:00
yihao.dai
a5350f64a5
enhance: Reduce the memory usage of the timeTickSender (#30968) (#30991)
In the cache of the timeTickSender, retain only the latest stats instead
of storing stats for every time tick.

issue: https://github.com/milvus-io/milvus/issues/30967

pr: https://github.com/milvus-io/milvus/pull/30968

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-05 10:59:01 +08:00
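
To illustrate why keeping only the latest stats bounds memory, here is a small sketch. The `statsCache` layout and the `segmentStats` fields are assumptions, not the actual timeTickSender structures.

```go
package main

import "fmt"

// segmentStats is a hypothetical, simplified stats record.
type segmentStats struct {
	segmentID int64
	numRows   int64
}

// statsCache keeps one entry per segment per channel instead of one per tick.
type statsCache struct {
	latest map[string]map[int64]segmentStats // channel -> segmentID -> stats
}

func newStatsCache() *statsCache {
	return &statsCache{latest: make(map[string]map[int64]segmentStats)}
}

// update overwrites the previous stats of each segment, so memory grows with
// the number of segments rather than the number of time ticks.
func (c *statsCache) update(channel string, stats []segmentStats) {
	perChannel, ok := c.latest[channel]
	if !ok {
		perChannel = make(map[int64]segmentStats)
		c.latest[channel] = perChannel
	}
	for _, s := range stats {
		perChannel[s.segmentID] = s
	}
}

func main() {
	c := newStatsCache()
	for tick := int64(1); tick <= 1000; tick++ {
		c.update("ch-0", []segmentStats{{segmentID: 7, numRows: tick}})
	}
	fmt.Println(len(c.latest["ch-0"]), c.latest["ch-0"][7].numRows)
}
```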
congqixia
81b197267a
enhance: [Cherry-Pick] Add back load memory factor when estimating memory resource (#30999)
Cherry-pick from master
pr: #30994
Segment load memory usage is underestimated due to removing the load
memory factor. This PR adds it back to protect querynode from OOM in some
extreme memory cases.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-05 09:15:00 +08:00
jaime
336e0ae45e
enhance: index meta use independent rather than global meta lock (#30986)
issue: https://github.com/milvus-io/milvus/issues/30837
pr: #30869

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-05 08:48:59 +08:00
chyezh
df09222029
fix: starve lock caused by slow GetCompactionTo method when too many segments (#30965)
issue: #30823
pr: #30963

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-04 20:51:00 +08:00
shaoyue
7014d352b3
fix: permissions on /milvus for OpenShift compatibility (#30937)
Fixes #25565
Cherry-pick 
pr: #30775

Signed-off-by: Guillaume Moutier <guillaume.moutier@gmail.com>
Signed-off-by: shaoyue.chen <shaoyue.chen@zilliz.com>
Co-authored-by: Guillaume Moutier <guimou@users.noreply.github.com>
2024-03-04 16:59:00 +08:00
XuanYang-cn
bb2de0d964
fix: [cherry-pick] Clear DN unknown compaction tasks (#30972)
If DC restarted, those unknown compaction tasks
would never get a callback in DN, so the segments in the compaction
task would stay locked, unable to sync or compact again, blocking cp
advance and compaction execution.

See also: #30137
pr: #30850

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-04 16:52:59 +08:00
wei liu
db49b8524d
fix: Skip generate balance task when target not ready (#30725)
issue: #30723
pr: #30724

This PR skips generating balance tasks when the collection's target isn't
ready. It also refines the stale check logic in query coord's scheduler:
if the channel exists in the current or next target, the task won't be
canceled.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-04 11:38:59 +08:00
wei liu
af54c3ba85
fix: Make datacoord client retry on index api (#30656)
pr: #30654

This PR adds retry on all interfaces which belonged to indexcoord in
milvus 2.2 and moved to data coord in milvus 2.3, to prevent hitting
unimplemented errors during rolling upgrade from milvus 2.2 to 2.3.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-04 11:37:09 +08:00
cai.zhang
38e3d6af3e
enhance: Optimize DescribeIndex to reduce lock contention (#30975)
issue: #29313
issue: #30443
master pr: #30939

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-04 11:30:59 +08:00
SimFG
b0569f430b
enhance: [2.3] retry the read when s3 gets an unexpected EOF error (#30976)
issue: https://github.com/milvus-io/milvus/issues/30877
pr: #30861

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-03-04 10:42:59 +08:00
PowderLi
c93f127c7d
fix: [cherry-pick] [restful v1] bug list (#30873)
master pr: #30871 issue: #30870
fix: vector field cannot be empty on insert
check in advance whether the vector field is empty

master pr: #30740
fix:
1. spelling mistake about metricsType #30643
2. int64 precision #20415
3. insert into collection which has multiple vector fields #30674

enhance: support dataType: Float16Vector & BFloat16Vector #22837
#30980(master pr: #30969)
enhance: describe collection will show the field is partition key or not
#30789

---------

Signed-off-by: PowderLi <min.li@zilliz.com>
2024-03-03 17:56:59 +08:00
groot
5b695d7e86
fix: Clean kafka default configuration (#30925)
issue: #30917
pr: #30924

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2024-03-01 18:15:29 +08:00
SimFG
ef84d40e54
enhance: [2.3] make the watch dm channel request better compatibility (#30954)
pr: #30952
issue: https://github.com/milvus-io/milvus/issues/30938

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-03-01 16:09:01 +08:00
congqixia
430e10c8e2
fix: [Cherry-pick] Use localStorage path to check disk cap (#30944) (#30966)
Cherry-pick from master
pr: #30944
See also #30943

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-01 15:11:01 +08:00
wei liu
b0c7f8653f
fix: Segment version doesn't update as expected (#30953)
issue: #30950 
pr: #30951

The segment version doesn't update as expected.
This PR updates the segment version until the segment becomes loaded.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-01 14:21:10 +08:00