issue: #45865
pr: #45949
- Modified leader_checker.go to include all nodes (RO + RW) instead of
only RW nodes, preventing channel balance from getting stuck on RO nodes
- Added debug logging in segment_checker.go when no shard leader found
- Enhanced target_observer.go with detailed logging for delegator check
failures to improve debugging visibility
- Fixed integration tests:
  - Temporarily disabled the partial result counter assertion in
    partial_result_on_node_down_test.go pending a fix for a concurrency
    issue
  - Increased the transfer channel timeout from 10s to 20s in
    manual_rolling_upgrade_test.go to avoid test flakiness caused by the
    target update interval (10s)
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Follow up for #45971
Update the `milvus/pkg/v2` dependency in both root and client modules to
align with the latest v2.6.7 release, and improve Makefile lint-fix
target logging.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
master pr: https://github.com/milvus-io/milvus/pull/45985
Replace direct self.schema access and describe_collection() calls with
get_schema() method to ensure consistent schema handling with complete
struct_fields information. Also fix FlushChecker error handling and
change schema log level from info to debug.
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
Bump milvus & proto version
Also bump golang.org/x/crypto to v0.45.0 fixing CVE-2025-47914
Related to #45976
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Cherry-pick from master
pr: #45961
Related to #45960
When QueryCoord restarts or reconnects to etcd, the rewatchNodes
function previously skipped handleNodeUp for QueryNodes in stopping
state. This caused stopping balance to fail because necessary components
were not initialized:
- Task scheduler executor was not added
- Dist handler was not started
- Node was not registered in resource manager
This fix ensures handleNodeUp is always called for new nodes regardless
of their stopping state, followed by handleNodeStopping if the node is
stopping. This allows the graceful shutdown process to correctly migrate
segments and channels away from stopping nodes.
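The ordering described above can be sketched as follows. This is a minimal illustration, not the QueryCoord code: `nodeInfo`, `coordinator`, and the recorded call log are hypothetical stand-ins, and only the names `rewatchNodes`, `handleNodeUp`, and `handleNodeStopping` come from this message.

```go
package main

import "fmt"

// nodeInfo is a hypothetical stand-in for a QueryNode session entry.
type nodeInfo struct {
	ID       int64
	Stopping bool
}

// coordinator is a hypothetical stand-in that records the handler calls
// made while re-watching nodes.
type coordinator struct {
	known map[int64]bool
	calls []string
}

func newCoordinator() *coordinator {
	return &coordinator{known: map[int64]bool{}}
}

// handleNodeUp stands in for the initialization work (executor, dist
// handler, resource manager registration).
func (c *coordinator) handleNodeUp(id int64) {
	c.known[id] = true
	c.calls = append(c.calls, fmt.Sprintf("up:%d", id))
}

// handleNodeStopping stands in for marking the node as stopping.
func (c *coordinator) handleNodeStopping(id int64) {
	c.calls = append(c.calls, fmt.Sprintf("stopping:%d", id))
}

// rewatchNodes initializes every newly seen node first, and only then
// marks it as stopping, so a stopping node is never left uninitialized.
func (c *coordinator) rewatchNodes(sessions []nodeInfo) {
	for _, n := range sessions {
		if c.known[n.ID] {
			continue
		}
		c.handleNodeUp(n.ID)
		if n.Stopping {
			c.handleNodeStopping(n.ID)
		}
	}
}

func main() {
	c := newCoordinator()
	c.rewatchNodes([]nodeInfo{{ID: 1}, {ID: 2, Stopping: true}})
	fmt.Println(c.calls) // [up:1 up:2 stopping:2]
}
```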
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #45486
pr: #45487
This commit refactors the chunk writing system by introducing a two-phase
approach: size calculation followed by writing to a target. This enables
efficient group chunk creation where multiple fields share a single mmap
region, significantly reducing the number of mmap system calls and VMAs.
- Optimize `mmap` usage: single `mmap` per group chunk instead of per
  field
- Split ChunkWriter into two phases:
  - `calculate_size()`: Pre-compute required memory without allocation
  - `write_to_target()`: Write data to a provided ChunkTarget
- Implement `ChunkMmapGuard` for unified mmap region lifecycle
  management
  - Handles `munmap` and file cleanup via RAII
  - Shared via `std::shared_ptr` across multiple chunks in a group
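The two-phase idea can be illustrated with a small sketch, written in Go rather than the C++ of the actual change. Everything here is an assumption made for illustration: `fieldWriter`, `bytesField`, and `buildGroupChunk` are hypothetical names, and a single byte slice stands in for the shared mmap region.

```go
package main

import "fmt"

// fieldWriter is a hypothetical two-phase writer: report the size first,
// then write into a caller-provided target.
type fieldWriter interface {
	calculateSize() int
	writeTo(target []byte) int
}

// bytesField is a trivial field whose payload is a raw byte slice.
type bytesField struct{ data []byte }

func (f bytesField) calculateSize() int        { return len(f.data) }
func (f bytesField) writeTo(target []byte) int { return copy(target, f.data) }

// buildGroupChunk sums all field sizes (phase 1), makes one allocation
// that stands in for a single mmap region, then lets each field write
// into its own slice of that region (phase 2).
func buildGroupChunk(fields []fieldWriter) (region []byte, offsets []int) {
	total := 0
	offsets = make([]int, 0, len(fields))
	for _, f := range fields {
		offsets = append(offsets, total)
		total += f.calculateSize()
	}
	region = make([]byte, total) // one region shared by every field
	for i, f := range fields {
		end := offsets[i] + f.calculateSize()
		f.writeTo(region[offsets[i]:end])
	}
	return region, offsets
}

func main() {
	region, offsets := buildGroupChunk([]fieldWriter{
		bytesField{data: []byte("abc")},
		bytesField{data: []byte("de")},
	})
	fmt.Printf("%q %v\n", region, offsets) // "abcde" [0 3]
}
```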
---------
Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
issue: #45728
pr: #45730
When mixcoord is in standby mode and shutdown is triggered, the
ProcessActiveStandBy goroutine may panic if context cancellation occurs.
This happens because the error handling didn't check for
context.Canceled errors before panicking.
Changes:
- Add context cancellation check in mix_coord Register() before panic
- Check s.ctx.Err() == context.Canceled and gracefully exit
- Remove unused ForceActiveStandby() function from session_util
This ensures standby mixcoord can shutdown gracefully without panic when
context is cancelled during the standby process.
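A minimal sketch of the check, assuming a simplified shape for the standby loop. `processActiveStandby`, `register`, and `cancelledContext` are hypothetical stand-ins, not the mix_coord code; only the `ctx.Err() == context.Canceled` comparison is taken from the change list above.

```go
package main

import (
	"context"
	"fmt"
)

// processActiveStandby is a hypothetical stand-in for the standby loop:
// it blocks until the context ends and returns the context error.
func processActiveStandby(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// register reports true when the standby process ended because of a
// shutdown (context cancellation); any other error still panics.
func register(ctx context.Context) bool {
	if err := processActiveStandby(ctx); err != nil {
		if ctx.Err() == context.Canceled {
			return true // shutdown was requested, exit gracefully
		}
		panic(err)
	}
	return false
}

// cancelledContext returns a context that is already cancelled,
// simulating a shutdown triggered while in standby mode.
func cancelledContext() context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx
}

func main() {
	fmt.Println("graceful exit:", register(cancelledContext())) // graceful exit: true
}
```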
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #45640
pr: #45805
- Logs may be dropped if the underlying file system is busy.
- Use an async write syncer so that logging does not block the main
  Milvus system.
- Remove some log dependencies from the util functions to avoid a
  dependency loop.
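The behavior in the first two points can be sketched roughly as below. This is an assumption-laden illustration, not the syncer added by the PR: `asyncWriter`, its queue capacity, and the drop-when-full policy are hypothetical, and it uses only the standard library.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync/atomic"
)

// asyncWriter is a hypothetical non-blocking writer: entries are queued
// and flushed by a background goroutine.
type asyncWriter struct {
	queue   chan []byte
	done    chan struct{}
	dropped int64
}

func newAsyncWriter(sink io.Writer, capacity int) *asyncWriter {
	w := &asyncWriter{
		queue: make(chan []byte, capacity),
		done:  make(chan struct{}),
	}
	go func() {
		defer close(w.done)
		for entry := range w.queue {
			_, _ = sink.Write(entry) // a slow sink only delays this goroutine
		}
	}()
	return w
}

// Write never blocks the caller: when the queue is full (for example
// because the file system is busy), the entry is dropped and counted.
func (w *asyncWriter) Write(p []byte) (int, error) {
	entry := append([]byte(nil), p...) // copy, the caller may reuse p
	select {
	case w.queue <- entry:
	default:
		atomic.AddInt64(&w.dropped, 1)
	}
	return len(p), nil
}

// Close flushes the queued entries and waits for the background goroutine.
func (w *asyncWriter) Close() {
	close(w.queue)
	<-w.done
}

func (w *asyncWriter) Dropped() int64 { return atomic.LoadInt64(&w.dropped) }

// runDemo writes n entries and reports how many reached the sink and
// how many were dropped.
func runDemo(n int) (written, dropped int) {
	var sink bytes.Buffer
	w := newAsyncWriter(&sink, 8)
	for i := 0; i < n; i++ {
		fmt.Fprintf(w, "entry %d\n", i)
	}
	w.Close()
	return bytes.Count(sink.Bytes(), []byte("\n")), int(w.Dropped())
}

func main() {
	written, dropped := runDemo(100)
	fmt.Println("written:", written, "dropped:", dropped)
}
```

The split between written and dropped entries depends on scheduling, but the two counts always add up to the number of entries submitted.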
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #45847
master pr: #45908
After a collection is successfully loaded, the shard-leader state on the
QC may still not be marked as serviceable. It becomes serviceable only
after the scheduled distribution update runs, which will also invalidate
the shard-leader cache on the proxy. Therefore, even if queries are
already executable, the shard-leader mapping on the proxy may still
change afterward.
Try to ensure, as far as possible, that the proxy’s shard-leader cache
remains stable before killing the mixcoord.
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #43117
pr: #45859
If the check is enabled when loading segments, every segment must be
loaded by a streamingnode rather than a 2.5 querynode, which makes some
searches and queries fail during an upgrade. If the check is disabled,
some search and query results may be wrong during an upgrade. We choose
to disable the check for now to keep search and query available during
upgrades.
Also see pr: #43346
Signed-off-by: chyezh <chyezh@outlook.com>
Cherry-pick from master
pr: #45911
Related to #45910
When IndexNodeBinding mode is enabled, DataCoord skips session watching
for datanodes but the dnSessionWatcher field remains nil. This causes a
panic when other code attempts to access the watcher.
This fix introduces an EmptySessionWatcher as a placeholder for the
IndexNodeBinding mode scenario. The empty watcher implements the
SessionWatcher interface with no-op methods, preventing nil pointer
dereferences while maintaining the expected interface contract.
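A minimal sketch of the placeholder pattern. The `SessionWatcher` and `SessionEvent` shapes below are simplified assumptions for illustration and may not match the real interface; `NewEmptySessionWatcher` is likewise a hypothetical constructor name.

```go
package main

import "fmt"

// SessionEvent and SessionWatcher are simplified, assumed shapes used
// only for this illustration.
type SessionEvent struct{ NodeID int64 }

type SessionWatcher interface {
	EventChannel() <-chan *SessionEvent
	Stop()
}

// emptySessionWatcher satisfies SessionWatcher with no-op methods, so
// callers can hold a non-nil watcher even when nothing is watched.
type emptySessionWatcher struct{}

// EventChannel returns a nil channel: receiving from it blocks forever,
// so a select loop simply never sees a session event.
func (emptySessionWatcher) EventChannel() <-chan *SessionEvent { return nil }

// Stop has nothing to release.
func (emptySessionWatcher) Stop() {}

func NewEmptySessionWatcher() SessionWatcher { return emptySessionWatcher{} }

func main() {
	w := NewEmptySessionWatcher()
	select {
	case ev := <-w.EventChannel():
		fmt.Println("unexpected event:", ev)
	default:
		fmt.Println("no events; watcher is a no-op")
	}
	w.Stop() // safe to call, no nil pointer dereference
}
```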
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This commit optimizes std::vector usage across segcore by adding
reserve() calls where the size is known in advance, reducing memory
reallocations during push_back operations.
Changes:
- TimestampIndex.cpp: Reserve space for prefix_sums and timestamp_barriers
- SegmentGrowingImpl.cpp: Reserve space for binlog info vectors
- ChunkedSegmentSealedImpl.cpp: Reserve space for futures and field data
  vectors
- storagev2translator/GroupChunkTranslator.cpp: Reserve space for metadata
  vectors
This improves performance by avoiding multiple memory reallocations when
the vector size is predictable.
issue: https://github.com/milvus-io/milvus/issues/45679
pr: https://github.com/milvus-io/milvus/pull/45757
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>