issue: #42561
Move the exclude nodes clearing logic from ExecuteWithRetry to
selectNode after shard leader cache refresh to ensure proper retry
behavior:
- Remove premature exclude clearing in ExecuteWithRetry that happened
before shard leader cache update
- Add exclude clearing logic in selectNode after refreshing shard leader
cache when all replicas are excluded
- Ensure multiple retries can properly update shard leader cache and
clear exclude list when needed
- Add comprehensive tests for edge cases including empty shard leaders
and mixed serviceable node scenarios
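
The ordering described above (refresh the shard leader cache first, clear the exclude list only if every refreshed replica is still excluded) can be sketched roughly as follows. This is a minimal illustration, not the actual lb_policy.go code: the `selectNode` signature, the `refresh` callback, and the use of plain `int64` node IDs are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// selectNode is a hypothetical sketch, not the real proxy implementation.
// leaders is the cached shard leader list, excluded holds node IDs that
// failed on earlier attempts, and refresh reloads the shard leader cache.
func selectNode(leaders []int64, excluded map[int64]struct{}, refresh func() []int64) (int64, error) {
	// pick returns the first candidate that is not excluded.
	pick := func(candidates []int64) (int64, bool) {
		for _, id := range candidates {
			if _, skip := excluded[id]; !skip {
				return id, true
			}
		}
		return 0, false
	}

	if id, ok := pick(leaders); ok {
		return id, nil
	}

	// No usable node in the cached list: refresh the cache first.
	leaders = refresh()
	if len(leaders) == 0 {
		return 0, errors.New("no shard leaders available")
	}
	if id, ok := pick(leaders); ok {
		return id, nil
	}

	// Every refreshed replica is excluded: clear the exclude list only now,
	// after the refresh, so this retry can reconsider all replicas.
	for id := range excluded {
		delete(excluded, id)
	}
	id, _ := pick(leaders)
	return id, nil
}

func main() {
	excluded := map[int64]struct{}{1: {}, 2: {}}
	refresh := func() []int64 { return []int64{1, 2} }
	id, err := selectNode([]int64{1, 2}, excluded, refresh)
	fmt.Println(id, err, len(excluded))
}
```

Clearing inside the selection step rather than in the retry wrapper means a refreshed cache that reveals a new replica is used as-is, and the exclude list is only dropped when nothing else is left to try.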
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/41690
This commit implements partial search result functionality when query
nodes go down, improving system availability during node failures. The
changes include:
- Enhanced load balancing in proxy (lb_policy.go) to handle node
failures with retry support
- Added partial search result capability in querynode delegator and
distribution logic
- Implemented tests for various partial result scenarios when nodes go
down
- Added metrics to track partial search results in querynode_metrics.go
- Updated parameter configuration to support partial result required
data ratio
- Replaced old partial_search_test.go with more comprehensive
partial_result_on_node_down_test.go
- Updated proto definitions and improved retry logic
These changes improve query resilience by returning partial results to
users when some query nodes are unavailable, ensuring that queries don't
completely fail when a portion of data remains accessible.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Merge RootCoord, DataCoord, and QueryCoord into MixCoord.
Unify their sessions into a single session.
issue: https://github.com/milvus-io/milvus/issues/37764
---------
Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
After this PR is merged, insert, upsert, index building, query, and search
are supported on an added field.
These operations are available on the added field only after the add-field
request completes, which is a synchronous operation.
Compaction will be supported in the next PR.
#39718
---------
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
/kind improvement
Only the system fields need to be filtered out here; there is no need to
recreate the response. Recreating the response makes it easy to miss this
part when fields are added later.
Signed-off-by: SimFG <bang.fu@zilliz.com>
issue: #37115
The old implementation updated the shard cache and the shard client manager
at the same time, which caused many corner cases due to concurrent access
without a lock.
This PR decouples the shard client manager from the shard cache, so only the
shard cache is updated when a delegator changes. The shard client manager
always returns the right client and creates a new one if none exists. To
guard against client leaks, the shard client manager asynchronously purges
clients every 10 minutes.
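
A rough sketch of the decoupled design follows: a manager that hands out clients on demand and is cleaned up by a periodic purge. All names here (`clientManager`, `getClient`, `purge`, `purgeLoop`) are hypothetical, and the purge criterion (drop clients for nodes that the caller reports as no longer in use) is an assumption, since the commit message does not state it.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// client is a hypothetical stand-in for a query node client.
type client struct{ nodeID int64 }

// clientManager owns clients independently of any shard cache.
type clientManager struct {
	mu      sync.Mutex
	clients map[int64]*client
}

func newClientManager() *clientManager {
	return &clientManager{clients: make(map[int64]*client)}
}

// getClient returns the client for nodeID, creating it if it does not exist.
func (m *clientManager) getClient(nodeID int64) *client {
	m.mu.Lock()
	defer m.mu.Unlock()
	if c, ok := m.clients[nodeID]; ok {
		return c
	}
	c := &client{nodeID: nodeID}
	m.clients[nodeID] = c
	return c
}

// purge removes clients whose node is not in inUse and returns how many
// were removed. The "not in use" criterion is an assumption of this sketch.
func (m *clientManager) purge(inUse map[int64]struct{}) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	removed := 0
	for id := range m.clients {
		if _, ok := inUse[id]; !ok {
			delete(m.clients, id)
			removed++
		}
	}
	return removed
}

// purgeLoop runs purge on every tick until stop is closed, for example
// go m.purgeLoop(10*time.Minute, inUseFn, stop).
func (m *clientManager) purgeLoop(interval time.Duration, inUse func() map[int64]struct{}, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			m.purge(inUse())
		case <-stop:
			return
		}
	}
}

func main() {
	m := newClientManager()
	m.getClient(1)
	m.getClient(2)
	removed := m.purge(map[int64]struct{}{1: {}})
	fmt.Println("purged clients:", removed)
}
```

Because lookups create missing clients on demand, a cache update never has to touch the manager, and a stale entry is cleaned up by the background purge rather than by the request path.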
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #32466
This PR updates the proxy's shard leader cache when a shard's location
changes, so that after a query node failover the proxy can find the
recovered replica.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #29772
The shardLeaders cache does not actively expire, so update the cache when a
search/query fails.
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
See also #29113
The collection schema is crucial when performing search/query, but some
of the information is recalculated for every request.
This PR changes the schema field of the cached collection info into a
utility `schemaInfo` type that stores stable results such as the pk field
and partitionKeyEnabled. It also provides a field-name-to-ID map for
search/query services.
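
As an illustration of the idea, the sketch below precomputes the stable parts of a schema once, when the cache entry is built, instead of on every request. The struct layouts, `newSchemaInfo`, and `fieldID` are hypothetical simplifications for this example and do not reproduce the actual `schemaInfo` type.

```go
package main

import "fmt"

// fieldSchema is a simplified, hypothetical stand-in for a schema field.
type fieldSchema struct {
	ID             int64
	Name           string
	IsPrimaryKey   bool
	IsPartitionKey bool
}

// schemaInfo caches results that stay stable for a given schema.
type schemaInfo struct {
	fields              []fieldSchema
	fieldIDs            map[string]int64 // field name -> field ID
	pkField             *fieldSchema
	partitionKeyEnabled bool
}

// newSchemaInfo computes the derived data once, at cache-fill time.
func newSchemaInfo(fields []fieldSchema) *schemaInfo {
	s := &schemaInfo{
		fields:   append([]fieldSchema(nil), fields...), // private copy
		fieldIDs: make(map[string]int64, len(fields)),
	}
	for i := range s.fields {
		f := &s.fields[i]
		s.fieldIDs[f.Name] = f.ID
		if f.IsPrimaryKey {
			s.pkField = f
		}
		if f.IsPartitionKey {
			s.partitionKeyEnabled = true
		}
	}
	return s
}

// fieldID looks up a field ID by name without rescanning the schema.
func (s *schemaInfo) fieldID(name string) (int64, bool) {
	id, ok := s.fieldIDs[name]
	return id, ok
}

func main() {
	info := newSchemaInfo([]fieldSchema{
		{ID: 100, Name: "id", IsPrimaryKey: true},
		{ID: 101, Name: "tenant", IsPartitionKey: true},
		{ID: 102, Name: "vector"},
	})
	id, ok := info.fieldID("vector")
	fmt.Println(id, ok, info.pkField.Name, info.partitionKeyEnabled)
}
```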
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #29113
- Unify partition info refresh logic
- Avoid parsing partition names for each partition key search request
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #28781 #28329
1. There is no need to call `DescribeCollection` if the collection's
schema is found in the globalMetaCache
2. Call `GetProperties` to check access to Azure Blob Service while
constructing the ChunkManager
Signed-off-by: PowderLi <min.li@zilliz.com>
Support Database (#23742)
Fix db nonexists error for FlushAll (#24222)
Fix check collection limits fails (#24235)
Backward compatibility with empty DB name (#24317)
Fix GetFlushAllState with DB (#24347)
Remove db from global meta cache after drop database (#24474)
Fix db name is empty for describe collection response (#24603)
Add RBAC for Database API (#24653)
Fix missed loading of a same-name collection during the recovery stage (#24941)
RBAC supports Database validation (#23609)
Fix to list grant with db return empty (#23922)
Optimize PrivilegeAll permission check (#23972)
Add the default db value for the rbac request (#24307)
Signed-off-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: longjiquan <jiquan.long@zilliz.com>