From 4f8a8dd7ef67916586057a61691b4d0fe641ab7f Mon Sep 17 00:00:00 2001 From: tianhang Date: Mon, 5 Jan 2026 17:41:24 +0800 Subject: [PATCH] =?UTF-8?q?fix:=20Register=20all=20sealed=20segments?= =?UTF-8?q?=E2=80=99=20BM25=20stats=20in=20IDFOracle=20(#46726)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If QueryNode loads multiple sealed segments with BM25 enabled, BM25 stats registration into IDFOracle could stop after the first segment due to an early-terminating ConcurrentMap.Range callback. This change: Register BM25 stats for all sealed segments by continuing iteration (return true) during sealed-segment load Prevent repeated warnings like idf oracle lack some sealed segment Ensure IDF/BM25 statistics are not silently incomplete (improving BM25 ranking correctness) issue: #46725 Core invariant: for any BM25-enabled collection, every loaded sealed segment with available BM25 stats must be registered into IDFOracle, so SyncDistribution can always find the sealed segments present in the distribution snapshot. Bug fix: ConcurrentMap.Range respects the callback’s boolean return; returning false stops iteration. The sealed BM25 stats registration callback previously returned false, which could register only the first sealed segment and leave the rest missing—causing IDFOracle to warn idf oracle lack some sealed segment and potentially compute IDF from incomplete stats. Fixed by returning true to continue iterating and registering all segments. No behavior regression: the change only affects the sealed-segment BM25 stats registration loop; it does not alter segment loading, distribution snapshot generation, or non-BM25 codepaths. For collections without BM25 (or when BM25 stats are nil), behavior remains unchanged. - Core invariant: For any BM25-enabled collection, every loaded sealed segment with available BM25 stats must be registered into IDFOracle so SyncDistribution can discover them in distribution snapshots. - Bug fix (links to #46725): The BM25 stats registration callback used with bm25Stats.Range() in loadStreamDelete() returned false, which prematurely stopped iteration after the first sealed segment and left subsequent sealed segments unregistered. The fix changes the callback to return true so the Range loop completes and registers BM25 stats for all sealed segments. - Logic simplified/removed: The early-return (false) in the ConcurrentMap.Range callback that aborted further registrations has been removed (replaced by returning true). That early abort was redundant and incorrect because registration must proceed for every entry; allowing Range to continue restores the intended one-to-many registration behavior. - No data loss or regression: The change is narrowly scoped to the sealed-segment BM25 stats registration loop in internal/querynodev2/delegator/delegator_data.go and does not modify segment loading, distribution snapshot generation, growing-segment handling, or non-BM25 codepaths. Returning true only permits full iteration and registration; it does not delete or alter existing data structures or load state, so IDF/BM25 statistics become complete without changing other behaviors. Signed-off-by: thangTang --- internal/querynodev2/delegator/delegator_data.go | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/internal/querynodev2/delegator/delegator_data.go b/internal/querynodev2/delegator/delegator_data.go index ab35368576..08f0389d12 100644 --- a/internal/querynodev2/delegator/delegator_data.go +++ b/internal/querynodev2/delegator/delegator_data.go @@ -760,7 +760,8 @@ func (sd *shardDelegator) loadStreamDelete(ctx context.Context, zap.Int64("segmentID", segmentID), ) sd.idfOracle.Register(segmentID, stats, segments.SegmentTypeSealed) - return false + // continue iterating to register all segments' stats + return true }) }