Cherry-pick from master
pr: #45975
Related to #45976
Bump golang.org/x/crypto to v0.45.0 fixing CVE-2025-47914
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #43828
pr: #43829, #43909
Implement robust rewatch mechanism to handle etcd connection failures
and node reconnection scenarios in DataCoord and QueryCoord, along with
heartbeat lag monitoring capabilities.
Changes include:
- Implement rewatchDataNodes/rewatchQueryNodes callbacks for etcd
reconnection scenarios
- Add idempotent rewatchNodes method to handle etcd session recovery
gracefully
- Add QueryCoordLastHeartbeatTimeStamp metric for monitoring node
heartbeat lag
- Clean up heartbeat metrics when nodes go down to prevent metric leaks
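The idempotent rewatch described above can be sketched as a reconcile step. This is a minimal illustration, not the code from the PR: `nodeTracker`, its fields, and the `rewatchNodes` signature are hypothetical, and a plain map stands in for the heartbeat metric.

```go
package main

import (
	"fmt"
	"sort"
)

// nodeTracker is a hypothetical stand-in for a coordinator's node manager.
type nodeTracker struct {
	nodes         map[int64]string // nodeID -> address
	lastHeartbeat map[int64]int64  // nodeID -> timestamp (stands in for the metric)
}

func newNodeTracker() *nodeTracker {
	return &nodeTracker{
		nodes:         map[int64]string{},
		lastHeartbeat: map[int64]int64{},
	}
}

// rewatchNodes reconciles the tracked nodes against the sessions currently
// alive in etcd. Calling it again with the same sessions changes nothing,
// which is what makes it safe to run on every reconnection.
func (t *nodeTracker) rewatchNodes(sessions map[int64]string) (added, removed []int64) {
	for id, addr := range sessions {
		if _, ok := t.nodes[id]; !ok {
			t.nodes[id] = addr
			added = append(added, id)
		}
	}
	for id := range t.nodes {
		if _, ok := sessions[id]; !ok {
			delete(t.nodes, id)
			delete(t.lastHeartbeat, id) // drop the heartbeat entry so it does not leak
			removed = append(removed, id)
		}
	}
	sort.Slice(added, func(i, j int) bool { return added[i] < added[j] })
	sort.Slice(removed, func(i, j int) bool { return removed[i] < removed[j] })
	return added, removed
}

func main() {
	t := newNodeTracker()
	t.nodes[1], t.nodes[2] = "10.0.0.1", "10.0.0.2"
	t.lastHeartbeat[2] = 100

	alive := map[int64]string{1: "10.0.0.1", 3: "10.0.0.3"}
	added, removed := t.rewatchNodes(alive)
	fmt.Println("first rewatch:", added, removed)

	added, removed = t.rewatchNodes(alive)
	fmt.Println("second rewatch:", added, removed)
}
```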
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Co-authored-by: Zhen Ye <chyezh@outlook.com>
Cherry-pick from master
pr: #44723
Related to #36672
Add an accesslog field that displays the value length for search/query
requests, which may help developers debug related issues
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #43858
pr: #43992
Refactor the balance checker implementation to use priority queues for
managing collection balance operations, improving processing efficiency
and order control.
Changes include:
- Export priority queue interfaces (Item, BaseItem, PriorityQueue)
- Replace collection round-robin with priority-based queue system
- Add BalanceCheckCollectionMaxCount configuration parameter
- Optimize balance task generation with batch processing limits
- Refactor processBalanceQueue method for different strategies
- Enhance test coverage with comprehensive unit tests
The new priority queue system processes collections based on row count
or collection ID order, providing better control over balance operation
priorities and resource utilization.
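A rough sketch of priority-based selection with a per-round limit, using Go's `container/heap`. The type names, the choice of "larger row count first", and the tie-break on collection ID are assumptions made for illustration; `maxCount` plays the role of `BalanceCheckCollectionMaxCount`, and this is not the exported `PriorityQueue` interface from the PR.

```go
package main

import (
	"container/heap"
	"fmt"
)

// collectionItem is a hypothetical queue entry.
type collectionItem struct {
	collectionID int64
	rowCount     int64
}

// byRowCount implements heap.Interface.
// Assumption: larger row counts are served first, ties broken by smaller ID.
type byRowCount []collectionItem

func (q byRowCount) Len() int { return len(q) }
func (q byRowCount) Less(i, j int) bool {
	if q[i].rowCount != q[j].rowCount {
		return q[i].rowCount > q[j].rowCount
	}
	return q[i].collectionID < q[j].collectionID
}
func (q byRowCount) Swap(i, j int) { q[i], q[j] = q[j], q[i] }
func (q *byRowCount) Push(x any)   { *q = append(*q, x.(collectionItem)) }
func (q *byRowCount) Pop() any {
	old := *q
	n := len(old)
	item := old[n-1]
	*q = old[:n-1]
	return item
}

// newQueue builds a heap from the given items.
func newQueue(items ...collectionItem) *byRowCount {
	q := make(byRowCount, len(items))
	copy(q, items)
	heap.Init(&q)
	return &q
}

// nextBatch pops at most maxCount collections in priority order.
func nextBatch(q *byRowCount, maxCount int) []int64 {
	if maxCount <= 0 {
		return nil
	}
	batch := make([]int64, 0, maxCount)
	for q.Len() > 0 && len(batch) < maxCount {
		item := heap.Pop(q).(collectionItem)
		batch = append(batch, item.collectionID)
	}
	return batch
}

func main() {
	q := newQueue(
		collectionItem{collectionID: 1, rowCount: 100},
		collectionItem{collectionID: 2, rowCount: 500},
		collectionItem{collectionID: 3, rowCount: 500},
		collectionItem{collectionID: 4, rowCount: 10},
	)
	fmt.Println(nextBatch(q, 2))
	fmt.Println(nextBatch(q, 2))
}
```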
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #44156
pr: #44234
Enhance FlushAll functionality to support targeting specific collections
within databases instead of only database-level flushing.
Changes include:
- Add FlushAllTarget message in data_coord.proto for granular targeting
- Support collection-specific flush operations within databases
- Maintain backward compatibility with deprecated db_name field
This enhancement allows users to flush specific collections without
affecting other collections in the same database, providing more precise
control over data persistence operations.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #43427
pr: #37417
Support R-Tree index for geometry datatype.
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>
issue: #43427
pr: #37417
The main goal of this PR is to merge #37417 into Milvus 2.5 without conflicts.
# Main Goals
1. Create and describe collections with geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search to display geospatial data
5. Support using GIS functions such as ST_EQUALS in queries
# Solution
1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions. Currently only brute-force search is supported.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing. See the corresponding
changes in pymilvus.
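To illustrate the WKT-to-WKB step in the data pipeline, here is a hand-rolled sketch that converts a 2D `POINT` using only the Go standard library. It is not the proxy's code and does not use go-geom; the function name `pointWKTToWKB` is hypothetical, and only the little-endian WKB point layout (byte-order flag, geometry type 1, two float64 coordinates) is handled.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
	"strconv"
	"strings"
)

// pointWKTToWKB converts text such as "POINT (1 2)" into 21 bytes of
// little-endian WKB: 1 byte order flag, 4 byte type, 2 x 8 byte coordinates.
func pointWKTToWKB(wkt string) ([]byte, error) {
	s := strings.TrimSpace(wkt)
	if len(s) < 5 || !strings.EqualFold(s[:5], "POINT") {
		return nil, fmt.Errorf("unsupported geometry: %q", wkt)
	}
	body := strings.TrimSpace(s[5:])
	if !strings.HasPrefix(body, "(") || !strings.HasSuffix(body, ")") {
		return nil, fmt.Errorf("malformed point: %q", wkt)
	}
	fields := strings.Fields(body[1 : len(body)-1])
	if len(fields) != 2 {
		return nil, fmt.Errorf("expected 2 coordinates, got %d", len(fields))
	}
	x, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return nil, fmt.Errorf("bad x coordinate: %w", err)
	}
	y, err := strconv.ParseFloat(fields[1], 64)
	if err != nil {
		return nil, fmt.Errorf("bad y coordinate: %w", err)
	}

	buf := make([]byte, 21)
	buf[0] = 1                                  // 1 = little-endian byte order
	binary.LittleEndian.PutUint32(buf[1:5], 1) // 1 = Point geometry type
	binary.LittleEndian.PutUint64(buf[5:13], math.Float64bits(x))
	binary.LittleEndian.PutUint64(buf[13:21], math.Float64bits(y))
	return buf, nil
}

func main() {
	wkb, err := pointWKTToWKB("POINT (1 2)")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d bytes: %x\n", len(wkb), wkb)
}
```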
---------
Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: cai.zhang <cai.zhang@zilliz.com>
issue: #43040
pr: #42665
This patch introduces a disk file writer that supports Direct IO.
Currently, it is exclusively utilized during the QueryNode load process.
Its parameters are as follows:
1. `common.diskWriteMode` This parameter controls the write mode of the
local disk, which is used to write temporary data downloaded from remote
storage. Currently, only QueryNode uses 'common.diskWrite*' parameters.
Support for other components will be added in the future.
The options include 'direct' and 'buffered'. The default value is
'buffered'.
2. `common.diskWriteBufferSizeKb` Disk write buffer size in KB, only
used when disk write mode is 'direct', default is 64KB.
Current valid range is [4, 65536]. If the value is not aligned to 4KB,
it will be rounded up to the nearest multiple of 4KB.
3. `common.diskWriteNumThreads` This parameter controls the number of
writer threads used for disk write operations. The valid range is [0,
hardware_concurrency]. It is designed to limit the maximum concurrency
of disk write operations to reduce the impact on disk read performance.
For example, if you want to limit the maximum concurrency of disk write
operations to 1, you can set this parameter to 1.
The default value is 0, which means the caller will perform write
operations directly without using an additional writer thread pool. In
this case, the maximum concurrency of disk write operations is
determined by the caller's thread pool size.
Both parameters can be updated during runtime.
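The range and alignment rule for `common.diskWriteBufferSizeKb` can be sketched as follows. The function name is hypothetical, and rejecting out-of-range values with an error is an assumption; the description above only states the valid range and the round-up to a multiple of 4KB.

```go
package main

import "fmt"

const (
	minBufferKB = 4     // lower bound of the valid range
	maxBufferKB = 65536 // upper bound of the valid range
	alignKB     = 4     // sizes are aligned to 4KB
)

// normalizeBufferSizeKB validates the configured size and rounds it up to
// the nearest multiple of 4KB.
// Assumption: out-of-range values are rejected rather than clamped.
func normalizeBufferSizeKB(kb int) (int, error) {
	if kb < minBufferKB || kb > maxBufferKB {
		return 0, fmt.Errorf("buffer size %dKB outside [%d, %d]", kb, minBufferKB, maxBufferKB)
	}
	return (kb + alignKB - 1) / alignKB * alignKB, nil
}

func main() {
	for _, kb := range []int{64, 5, 3} {
		size, err := normalizeBufferSizeKB(kb)
		fmt.Println(kb, size, err)
	}
}
```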
---------
Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
Cherry-pick from master
pr: #43408
Related to #43407
When a `MultiSaveAndRemove`-like operation contains the same key in both
its saves and its removals, data may be lost if the save is executed
before the removal.
This PR makes all kv implementations execute removals first and then save
the new values. Even when the same key appears in both saves and
removals, the new value is kept.
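The ordering can be illustrated with an in-memory map. This is only a sketch of the removal-then-save order, with a hypothetical `multiSaveAndRemove` helper; it is not the signature of any real kv implementation.

```go
package main

import "fmt"

// multiSaveAndRemove applies removals first, then saves, so a key that
// appears in both ends up holding the newly saved value.
func multiSaveAndRemove(store, saves map[string]string, removals []string) {
	for _, key := range removals {
		delete(store, key)
	}
	for key, value := range saves {
		store[key] = value
	}
}

func main() {
	store := map[string]string{"a": "old", "b": "stale"}
	multiSaveAndRemove(store, map[string]string{"a": "new"}, []string{"a", "b"})
	fmt.Println(store)
}
```

With the opposite order (save, then remove), key `a` would be written and immediately deleted, which is the data loss described above.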
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
The previous code uses diskSegmentMaxSize if and only if all of the
collection's vector fields are indexed with the DiskANN index.
Since sparse vectors cannot be indexed with DiskANN, collections with
both dense and sparse vectors fall back to maxSize instead.
This PR changes the requirement for using diskSegmentMaxSize to all dense
vector fields being indexed with DiskANN, ignoring sparse vector fields.
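A minimal sketch of the new rule. The `vectorField` struct, the `"DISKANN"` index-type string, and the behavior for collections with no dense vector field (returning false) are assumptions made for illustration.

```go
package main

import "fmt"

// vectorField is a hypothetical, simplified view of a vector field schema.
type vectorField struct {
	name      string
	sparse    bool
	indexType string
}

// useDiskSegmentMaxSize reports whether every dense vector field is indexed
// with DiskANN. Sparse vector fields are skipped.
// Assumption: a collection without any dense vector field returns false.
func useDiskSegmentMaxSize(fields []vectorField) bool {
	denseSeen := false
	for _, f := range fields {
		if f.sparse {
			continue
		}
		denseSeen = true
		if f.indexType != "DISKANN" {
			return false
		}
	}
	return denseSeen
}

func main() {
	fields := []vectorField{
		{name: "dense", indexType: "DISKANN"},
		{name: "sparse", sparse: true, indexType: "SPARSE_INVERTED_INDEX"},
	}
	fmt.Println(useDiskSegmentMaxSize(fields))
}
```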
See also: #43193
pr: #43194
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
issue: #43107
pr: #43108
- Add checkLoadConfigChanges() to apply load config during startup
- Call config check in startQueryCoord() after restart
- Skip auto-updates for collections with user-specified replica numbers
- Add is_user_specified_replica_mode field to preserve user settings
- Add comprehensive unit tests with mockey
Ensures existing collections use latest cluster-level config after
restart.
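The startup check can be pictured roughly as below. All names here (`collectionLoadInfo`, `collectionsToUpdate`) are hypothetical and only mirror the idea of skipping collections whose replica number was set by the user; the real `checkLoadConfigChanges()` may look quite different.

```go
package main

import "fmt"

// collectionLoadInfo is a hypothetical, simplified load record.
type collectionLoadInfo struct {
	collectionID         int64
	replicaNumber        int32
	userSpecifiedReplica bool // mirrors is_user_specified_replica_mode
}

// collectionsToUpdate returns the collections whose replica number should be
// aligned with the cluster-level config, skipping user-specified ones.
func collectionsToUpdate(cols []collectionLoadInfo, clusterReplicas int32) []int64 {
	var ids []int64
	for _, c := range cols {
		if c.userSpecifiedReplica {
			continue // preserve the user's explicit setting
		}
		if c.replicaNumber != clusterReplicas {
			ids = append(ids, c.collectionID)
		}
	}
	return ids
}

func main() {
	cols := []collectionLoadInfo{
		{collectionID: 1, replicaNumber: 1},
		{collectionID: 2, replicaNumber: 3, userSpecifiedReplica: true},
		{collectionID: 3, replicaNumber: 2},
	}
	fmt.Println(collectionsToUpdate(cols, 2))
}
```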
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>