130 Commits

Author SHA1 Message Date
Xianhui Lin
4662aff36e
fix: retry old session existence in ProcessActiveStandBy (#44208)
fix: retry old session existence in ProcessActiveStandBy
issue: https://github.com/milvus-io/milvus/issues/44205

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-09-04 15:45:56 +08:00
Zhen Ye
7b04107863
fix: unrecoverable if lease expires in standby mode (#44112)
issue: #44111

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-29 10:47:51 +08:00
Zhen Ye
ecb24e7232
enhance: use multi-process framework in integration test (#42976)
issue: #41609

- add env `MILVUS_NODE_ID_FOR_TESTING` to set the node id of a milvus
process.
- add env `MILVUS_CONFIG_REFRESH_INTERVAL` to set the refresh
interval of paramtable.
- init paramtable when calling `paramtable.Get()`.
- add a new multi-process framework for integration tests.
- change all integration tests to run as multiple processes.
- merge some test cases into one suite to speed them up.
- modify some tests that need to wait for issues #42966 and #42685.
- remove the waittssync for delete collection to fix issue #42989.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-30 14:22:43 +08:00
SimFG
91d40fa558
fix: Update logging context and upgrade dependencies (#41318)
- issue: #41291

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-23 10:52:38 +08:00
Chun Han
016920b023
fix: solve incompatible problem for none-encoding index (#40838) (#41369)
related: #40838

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-04-20 22:56:44 +08:00
Xianhui Lin
f9febe3bae
enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006)
Merge RootCoord, DataCoord And QueryCoord into MixCoord
Make Session into one
issue: https://github.com/milvus-io/milvus/issues/37764

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-11 16:36:30 +08:00
cai.zhang
05e25431d9
enhance: Deprecate disk params about indexing (#41045)
issue: #40863

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-04-07 11:36:34 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
Zhen Ye
c84a0748c4
enhance: add rw/ro streaming query node replica management (#38677)
issue: #38399

- Embed the query node into the streaming node to make the delegator
available at the streaming node.
- The embedded query node has a special server label
`QUERYNODE_STREAMING-EMBEDDED`.
- Change the balance strategy so that channels are assigned to streaming
nodes as much as possible.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-24 16:55:07 +08:00
cai.zhang
6d45dd5666
fix: Add scalar index engine version for compatibility (#39204)
issue: #39203

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-01-15 12:25:00 +08:00
tinswzy
27229f7907
enhance: refine exists log print with ctx (#38080)
issue: #35917 
Refines exists log print with ctx

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-14 22:36:44 +08:00
tinswzy
7944538ade
enhance: Add ctx param to KV operation interfaces (#38154)
issue: #35917 
Refine KV operation interfaces by adding a ctx param

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-05 15:16:41 +08:00
congqixia
b0bd290a6e
enhance: Use internal json(sonic) to replace std json lib (#37708)
Related to #35020

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-18 10:46:31 +08:00
wei liu
a03157838b
enhance: Enable node assign policy on resource group (#36968)
issue: #36977
With node_label_filter on a resource group, users can add labels to a
querynode with env `MILVUS_COMPONENT_LABEL`; the resource group will then
prefer to accept nodes that match its node_label_filter.

Querynodes can then be grouped by label, with querynodes that share a
label placed in the same resource group.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-08 11:18:27 +08:00
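The label preference described in the commit above can be sketched in Go as follows. This is only an illustration under stated assumptions: the `node` type, the `matches` and `preferMatching` helpers, and the `hardware` label key are hypothetical and are not taken from the Milvus code base, and the filter is assumed to be a set of key/value pairs that must all be present on a node.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical stand-in for a querynode and its labels.
type node struct {
	id     int64
	labels map[string]string
}

// matches reports whether labels contains every key/value pair in filter.
func matches(filter, labels map[string]string) bool {
	for k, want := range filter {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// preferMatching returns a copy of nodes ordered so that nodes matching
// the filter come first; ties are broken by node id.
func preferMatching(filter map[string]string, nodes []node) []node {
	out := append([]node(nil), nodes...)
	sort.SliceStable(out, func(i, j int) bool {
		mi := matches(filter, out[i].labels)
		mj := matches(filter, out[j].labels)
		if mi != mj {
			return mi
		}
		return out[i].id < out[j].id
	})
	return out
}

func main() {
	filter := map[string]string{"hardware": "gpu"} // hypothetical label
	nodes := []node{
		{id: 2, labels: map[string]string{"hardware": "cpu"}},
		{id: 3, labels: map[string]string{"hardware": "gpu"}},
		{id: 1, labels: map[string]string{"hardware": "gpu"}},
	}
	for _, n := range preferMatching(filter, nodes) {
		fmt.Println(n.id, n.labels)
	}
}
```

Non-matching nodes stay in the candidate list here because the commit message says the resource group "prefers" matching nodes rather than requiring them.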
chyezh
cc8f7aa110
fix: streaming service related fix patch (#34696)
issue: #33285

- add idAlloc interface
- fix binary unsafe bug for message
- fix service discovery lost when repeated address with different server
id

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-16 15:49:38 +08:00
congqixia
25a1c9ecf0
fix: Make coordinator Register not blocked on ProcessActiveStandby (#32069)
See also #32066

This PR lets coordinator registration succeed without waiting and runs
`ProcessActiveStandBy` asynchronously, so roles can still receive a stop
signal and notify servers.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-10 18:49:18 +08:00
congqixia
357fe814ce
fix: Remove unnecessary deleteSession operation (#31647)
See also #31628

The `Revoke` operation deletes all keys attached to the lease. The extra
`deleteSession` operation may also remove the session key of the next
epoch by mistake and leave the session status inconsistent.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-29 13:57:11 +08:00
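The difference between revoking a lease and deleting a key by name, as described in the commit above, can be modeled with a toy in-memory store. This sketch does not use etcd; the `store` type, its methods, and the `session/coord` key are hypothetical, and the model assumes that a key belongs to the lease of its most recent put.

```go
package main

import "fmt"

type leaseID int64

// store is a toy model: each key remembers the lease it was last put with.
type store struct {
	owner map[string]leaseID
}

func newStore() *store { return &store{owner: make(map[string]leaseID)} }

func (s *store) put(key string, l leaseID) { s.owner[key] = l }

func (s *store) has(key string) bool {
	_, ok := s.owner[key]
	return ok
}

// revoke removes only the keys currently attached to lease l.
func (s *store) revoke(l leaseID) {
	for k, o := range s.owner {
		if o == l {
			delete(s.owner, k)
		}
	}
}

// deleteKey removes the key no matter which lease owns it.
func (s *store) deleteKey(key string) { delete(s.owner, key) }

func main() {
	const key = "session/coord" // hypothetical session key
	s := newStore()
	s.put(key, 1) // old epoch registers with lease 1
	s.put(key, 2) // next epoch re-registers the same key with lease 2

	s.revoke(1) // old epoch shuts down: nothing owned by lease 1 remains
	fmt.Println("after revoke(1):", s.has(key))

	s.deleteKey(key) // an extra delete by name hits the new epoch's key
	fmt.Println("after deleteKey:", s.has(key))
}
```

In this model the revoke is safe for the next epoch, while the delete by name is what removes the newer session key.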
jaime
db79be3ae0
fix: ctx cancel should be the last step while stopping server (#31220)
issue: #31219

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-15 10:33:05 +08:00
yiwangdr
32cff25f97
enhance: decrease coordinator init time (#29822)
This PR mainly improves two items:
1. Target observer should refresh loading status during init time. An
uninitialized loading status blocks search/query. Currently, the target
observer refreshes every 10 seconds, i.e. we'd need to wait for 10s for
no reason. That's also the reason why we constantly see false log
"collection unloaded" upon mixcoord restarts.
2. Delete session when service is stopped. So that the new service
doesn't need to wait for the previous session to expire (~10s).

Item 1 is the major improvement of this PR, which should speed up init
time by 10s.
Item 2 is not a big concern in most cases as coordinators usually shut
down after stop(). In those cases, coordinator restart triggers serverID
change which further triggers an existing logic that deletes expired
session. This PR only fixes rare cases where serverID doesn't change.

integration test:
`go test -tags dynamic -v -coverprofile=profile.out -covermode=atomic
tests/integration/coordrecovery/coord_recovery_test.go -timeout=20m`
Performance after the change:
Average init time of coordinators: 10s
Hardware: M2 Pro
Test setup: 1000 collections with 1000 rows (dim=128) per collection.


issue: #29409

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-02-05 14:00:12 +08:00
smellthemoon
1c1f2a1371
enhance: change some logs (#29579)
related #29588

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-05 16:12:48 +08:00
wei liu
5b45a138b1
disable auto balance when old node exists (#28191)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-07 14:02:20 +08:00
Xiaofan
da19e49daf
Support purge old session for standalone (#28184)
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-11-06 21:21:42 +08:00
wei liu
ecec5dfcfd
fix retry on offline node (#28079)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-03 10:14:16 +08:00
yah01
9658367a3c
Refine chunk manager errors (#27590)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-10-31 12:18:15 +08:00
Filip Haltmayer
6b1a106a31
Moving etcd client into session (#27069)
Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
2023-10-27 07:36:12 +08:00
SimFG
9b0ecbdca7
Support to replicate the mq message (#27240)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-10-20 14:26:09 +08:00
jaime
ac2d1bb5c2
Support receive signals from parent process (#27756)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-10-18 20:20:11 +08:00
congqixia
2f201c25e2
Remove deprecated io/ioutil usage (#27747)
The `io/ioutil` package is deprecated; use the `io` and `os` packages instead.
Also adds a golangci-lint rule to block future references.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: guoguangwu <guoguangwu@magic-shield.com>
2023-10-17 20:32:09 +08:00
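For reference, the kind of replacement the commit above describes looks like this in Go. The mapping in the comments is the standard one since Go 1.16; the `roundTrip` helper and the file name are only illustrative.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes data to a temporary file and reads it back using only
// the io and os replacements for the deprecated io/ioutil helpers.
func roundTrip(data string) (string, error) {
	// os.MkdirTemp replaces ioutil.TempDir.
	dir, err := os.MkdirTemp("", "ioutil-migration")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "sample.txt")
	// os.WriteFile replaces ioutil.WriteFile.
	if err := os.WriteFile(path, []byte(data), 0o600); err != nil {
		return "", err
	}
	// os.ReadFile replaces ioutil.ReadFile.
	b, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	// io.ReadAll replaces ioutil.ReadAll.
	all, err := io.ReadAll(strings.NewReader(string(b)))
	if err != nil {
		return "", err
	}
	return string(all), nil
}

func main() {
	out, err := roundTrip("hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```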
jaime
ec1fe3549e
Add a stop hook to clean session (#27564)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-10-16 10:24:10 +08:00
Jiquan Long
e4f73cc805
Add host & enable_disk to session (#27507)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-10-08 20:05:31 +08:00
Jiquan Long
5c1abfa2cc
Print the server id when active-standby switch (#27119)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-10-07 10:01:31 +08:00
Jiquan Long
0f14d18201
Optimize the codec code of session (#27360)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-10-01 10:33:30 +08:00
SimFG
c9653b1683
Add some logs and improve TestSessionProcessActiveStandBy test case (#27403)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-28 09:35:27 +08:00
foxspy
5db4a0489e
dynamic index version control (#27335)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-25 21:39:27 +08:00
wei liu
9433a24f5d
fix component not exit when liveness check failed (#27236)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-09-22 19:13:25 +08:00
SimFG
26f06dd732
Format the code (#27275)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-21 09:45:27 +08:00
congqixia
16b35e07b3
Fix TestSessionSuite/TestKeepAliveRetryActiveCancel unit test logic (#27231)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-20 18:59:23 +08:00
congqixia
f0d0651989
Do not reset connection immediately if grpc code is Canceled or DeadlineExceeded (#27014)
We found many connection resets and cancellations caused by the recent retry change.
The current implementation resets the connection no matter what the error code is.
To match the previous retry behavior, the connection is no longer reset on a cancel error unless cancel errors happen too often.

Also adds a config item, minResetInterval, for grpc connection resets.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-13 15:01:18 +08:00
congqixia
adfb5298c6
Refine TestSessionProcessActiveStandBy unit test logic (#26980)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-11 18:13:17 +08:00
wei liu
0e2085b77f
fix dc standby to active (#26810)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-09-06 10:41:49 +08:00
Enwei Jiao
fb0705df1b
Decouple basetable and componentparam (#26725)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-05 10:31:48 +08:00
congqixia
145387fdcb
Bump proto go-api to v2.3.0 (#26561)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-08-23 20:18:23 +08:00
congqixia
2b367b6bb0
Fix sessionutil liveness check blocking in watch forever (#26248)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-08-10 14:07:16 +08:00
congqixia
7dfc8fbf0a
Fix data race on keepAliveCancel (#26087)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-08-02 18:55:07 +08:00
congqixia
8b11636e72
Cancel previous ctx for session retry keepalive (#26050)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-08-02 12:09:05 +08:00
wayblink
587237a3c9
Fix dead loop in session (#25451)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-07-13 18:02:29 +08:00
yah01
cd29b863d0
Fix data race in session (#25354)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-07-06 14:52:25 +08:00
wayblink
b7ecb7f56b
Disable retryKeepAlive when LivenessCheck's Context close (#25161)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-06-27 17:08:45 +08:00
wayblink
b752a29995
Add timeout for keepalive in session (#25077)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-06-26 12:30:44 +08:00
SimFG
0c3f92d7d7
Improve the panic code about the rootcoord/session/rocksmq (#24859) (#25024)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-06-21 11:24:42 +08:00