11294 Commits

Author SHA1 Message Date
congqixia
9ff5731c7d
fix: [2.6] Support JSON default values in FillFieldData (#45455) (#45470)
Cherry-pick from master
pr: #45455
Related to #45445

Previously, FillFieldData for JSON fields would assert and fail when a
default_value was provided, blocking index creation for JSON fields with
default values (including dynamic fields like $meta).

This change enables JSON default value support by:
- Removing the assertion that blocked default values
- Parsing bytes_data into Json objects when default_value is present
- Properly filling data_ array and setting valid_data_ bitset to true
- Maintaining null behavior when no default_value is provided

Impact:
- Fixes index creation failure for JSON fields with default values
- Resolves upgrade issues from 2.5 to 2.6.5 where dynamic fields with
default values couldn't be indexed
- Index builds that were stuck in InProgress state can now complete

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-11 18:21:37 +08:00
congqixia
7d13bdcf4c
fix: [2.6] correct field data offset calculation in rerank functions for bulk search (#45444) (#45482)
Cherry-pick from master
pr: #45444
Related to #45338

When using bulk vector search in hybrid search with rerank functions,
the output field values for different queries were all equal to the
values returned by the first query, instead of the correct values
belonging to each document ID. The document IDs were correct, but the
entity field values were wrong.

In rerank functions (RRF, weighted, decay, model), when processing
multiple queries in a batch, the `idLocations` stored only the relative
offset within each result set (`idx`), not accounting for the absolute
position within the entire batch. This caused `FillFieldData` to
retrieve field data from the wrong positions, always using offsets
relative to the first query.

This fix ensures that when processing bulk searches with rerank
functions, each result correctly retrieves its corresponding field data
based on the absolute offset within the entire batch, resolving the
issue where all queries returned the first query's field values.
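
A minimal sketch of the offset calculation described above, assuming the hits of all queries in a batch are stored back to back. The function name `absoluteOffsets` and the `topks` parameter are hypothetical and are not taken from the Milvus code:

```go
package main

import "fmt"

// absoluteOffsets converts per-query relative indices into absolute offsets
// within a flattened batch. topks[q] is the number of hits of query q, so
// the hits of query q start at sum(topks[:q]).
func absoluteOffsets(topks []int) [][]int {
	out := make([][]int, len(topks))
	start := 0
	for q, k := range topks {
		out[q] = make([]int, k)
		for idx := 0; idx < k; idx++ {
			// Storing only idx would make every query read the rows
			// that belong to the first query.
			out[q][idx] = start + idx
		}
		start += k
	}
	return out
}

func main() {
	fmt.Println(absoluteOffsets([]int{3, 2, 4}))
}
```

With relative offsets only, the second query above would read rows 0 and 1 instead of rows 3 and 4.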

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-11 18:15:37 +08:00
sparknack
aaa8f4335d
enhance: [2.6] some optimization of scalar field fetching in tiered storage scenarios (#45361)
issue: #43611
pr: #45360

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-11 17:27:40 +08:00
cai.zhang
8b16216e01
fix: [2.6]Fix filter geometry for growing with mmap (#45465)
issue: #45450
master pr: #45464

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-11 15:41:40 +08:00
Chun Han
85c8cca094
feat: milvus support huawei cloud iam verification(#45298) (#45312)
related: #45298
pr: https://github.com/milvus-io/milvus/pull/45457

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-11-11 15:11:36 +08:00
Spade A
7cee398df1
fix: nextFieldID does not consider STRUCT [2.6] (#45438)
issue: https://github.com/milvus-io/milvus/issues/45362
pr: https://github.com/milvus-io/milvus/pull/45437

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-11-11 11:29:35 +08:00
aoiasd
2637854a12
enhance: [2.6] fix typo of analyzer params (#45434)
pr: https://github.com/milvus-io/milvus/pull/45299

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-11 10:33:36 +08:00
Gao
1398a069d3
enhance: override index_type while creating segment index (#45417)
issue: #44752
pr: #45416

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-11-11 09:45:36 +08:00
Chun Han
94563fb4f2
fix: Group value is nil(#45418) (#45419)
related: #45418 
pr: https://github.com/milvus-io/milvus/pull/45422

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-11-10 16:17:38 +08:00
XuanYang-cn
7c37e444a2
fix: [cp26]Accurate size estimation for sliced arrow arrays in compaction (#45352)
Sliced Arrow arrays incorrectly reported the original array's size via
SizeInBytes(), causing inaccurate memory estimates during compaction.

This resulted in segments closing prematurely in mergeSplit mode -
expected 500MB compactions produced 4x100+MB segments instead.

Fixed by calculating actual byte size of sliced arrays, ensuring proper
segment sizing and more accurate memory usage tracking.
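
The difference between the two sizes can be sketched with a simplified stand-in for a fixed-width array. The `int64Array` type and its methods are hypothetical and do not use the Arrow library. Real Arrow arrays also carry validity bitmaps and, for variable-width types, offset buffers:

```go
package main

import "fmt"

// int64Array is a simplified stand-in for a fixed-width columnar array.
// A slice is a zero-copy view: it shares buf and only changes offset/length.
type int64Array struct {
	buf    []int64
	offset int
	length int
}

// slice returns a view over rows [from, to) of the array.
func (a int64Array) slice(from, to int) int64Array {
	return int64Array{buf: a.buf, offset: a.offset + from, length: to - from}
}

// bufferBytes is the size of the whole shared buffer, which is what a naive
// size query reports even for a small slice.
func (a int64Array) bufferBytes() int { return len(a.buf) * 8 }

// slicedBytes is the size of the rows this view actually references.
func (a int64Array) slicedBytes() int { return a.length * 8 }

func main() {
	full := int64Array{buf: make([]int64, 1000), length: 1000}
	part := full.slice(0, 250)
	fmt.Println(part.bufferBytes(), part.slicedBytes())
}
```

Estimating with `bufferBytes` overcounts each slice by the size of the rows it does not reference, which is consistent with segments being closed earlier than intended.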

See also: #45293
pr: #45294

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-11-07 18:07:35 +08:00
yihao.dai
0bfb5f6012
fix: [2.6] Fix data race in replicate stream client (#45347)
issue: https://github.com/milvus-io/milvus/issues/44123

pr: https://github.com/milvus-io/milvus/pull/45346

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-11-07 16:43:35 +08:00
cai.zhang
12e3fb7655
fix: [2.6]Skip building text index for newly added columns (#45317)
issue: #45315 
master pr: #45316

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-07 15:21:42 +08:00
XuanYang-cn
f3e5a53fc5
fix: [2.6]Accidentally ignored sealed segments in L0 Compaction (#45341)
When there are no growing segments in the collection, L0 Compaction
tries to select all L0 segments that hit all L1/L2 segments.

However, if a Sealed Segment is still being flushed in DataNode at the
same time that L0 Compaction selects the matching L1/L2 segments, L0
Compaction ignores this segment because it's not in "FlushState",
which is wrong and causes missing deletes on the Sealed Segment.

The quick solution here is to fail the L0 compaction task once it
selects a Sealed segment.

See also: #45339
pr: #45340

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-11-07 11:49:34 +08:00
cai.zhang
b33c58807a
enhance: [2.6] [test] Move R-Tree index tests into the implementation package (#45356)
master pr: #45355

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-07 10:05:35 +08:00
congqixia
2c50d7e1f8
fix: [2.6] Move FinishLoad before text index creation to ensure raw data availability (#45335)
Cherry-pick from master
pr: #45334
Related to #45333

Fix segment loading failure when adding fields with text match enabled.
The issue occurred because text indexes were being loaded before
FinishLoad() was called, meaning raw data was not properly available
when text index creation attempted to access it, resulting in "failed to
create text index, neither raw data nor index are found" errors.

Solution is to move the FinishLoad() call to execute after raw data
loading but before text index loading. This ensures that:
1. Raw data is properly loaded and available in memory
2. Text indexes can access the raw data they need during creation
3. The segment is in the correct state before any index operations

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-06 17:11:34 +08:00
zhagnlu
d91470e59e
fix: not use json_shredding for json path is null (#45311)
pr: #45310

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-11-06 11:49:33 +08:00
zhenshan.cao
a42d3248e1
fix: cherry pick fixes related to timestamptz (#45321)
pr: https://github.com/milvus-io/milvus/pull/45111
https://github.com/milvus-io/milvus/pull/45287
issue: https://github.com/milvus-io/milvus/issues/44527
https://github.com/milvus-io/milvus/issues/44537
https://github.com/milvus-io/milvus/issues/44538
https://github.com/milvus-io/milvus/issues/44585
https://github.com/milvus-io/milvus/issues/44622

---------

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-06 11:05:34 +08:00
sparknack
fb1b16186a
enhance: [2.6] unify the aligned buffer for both buffered and direct I/O (#45325)
issue: https://github.com/milvus-io/milvus/issues/43040
pr: https://github.com/milvus-io/milvus/pull/45323

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-06 10:59:33 +08:00
yihao.dai
c19fa9f5c3
fix: [2.6] Fix load segment failed due to get disk usage error (#45300)
When getting disk usage, files or directories may be removed
concurrently due to segment release. This PR ignores “file or directory
does not exist” errors in such cases.
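
A minimal sketch of this pattern using only the Go standard library. The function name `diskUsage` is hypothetical, and the actual Milvus implementation may differ:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"path/filepath"
)

// diskUsage sums the sizes of regular files under root. Entries that vanish
// while the walk is running (for example, removed by a concurrent release)
// are skipped instead of failing the whole computation.
func diskUsage(root string) (int64, error) {
	var total int64
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			if errors.Is(walkErr, fs.ErrNotExist) {
				return nil // removed concurrently, ignore
			}
			return walkErr
		}
		if d.IsDir() {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			if errors.Is(err, fs.ErrNotExist) {
				return nil // removed between listing and stat, ignore
			}
			return err
		}
		total += info.Size()
		return nil
	})
	if err != nil {
		return 0, fmt.Errorf("walk %s: %w", root, err)
	}
	return total, nil
}

func main() {
	size, err := diskUsage(".")
	fmt.Println(size, err)
}
```

Any other error (for example, permission denied) is still returned to the caller.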

issue: https://github.com/milvus-io/milvus/issues/45239

pr: https://github.com/milvus-io/milvus/pull/45255

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-11-06 10:35:34 +08:00
congqixia
9a26b11614
fix: [2.6] Support JSON default value in compaction (#45331)
Cherry-pick from master
pr: #45330
Related to #45329

Fix compaction failure when handling newly added dynamic fields with
storage v1 binlogs. The issue occurred because the
`GenerateEmptyArrayFromSchema` function did not support JSON data type
default values, causing "Unexpected default value" errors during
compaction.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-06 10:21:34 +08:00
cai.zhang
3df5f89cb0
fix: [2.6] Compute the correct batch size for the geometry index of the growing segment (#45261)
issue: #44648 
master pr: #45253

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-05 15:51:33 +08:00
foxspy
e1ea30b04c
enhance: [2.6] update knowhere version (#45271)
issue: #42937 
pr: #45270

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-11-05 11:11:33 +08:00
Zhen Ye
40806f9162
fix: ddl framework bug patch (#45292)
issue: #45080, #45274, #45285
pr: #45290

- LoadCollection did not ignore an ignorable request because of an
incorrect field array.
- CreateIndex did not ignore an ignorable request because of a wrong index.
- Index meta was not thread-safe.
- Parameter checks for DDL were missing.
- The DDL Ack scheduler could get stuck, blocking DDL until the next
incoming DDL.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-05 00:29:34 +08:00
Gao
844fa8c999
enhance: [2.6] make knowhere thread pool config refreshable (#45191)
pr: #45190

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-11-04 20:43:34 +08:00
Spade A
282798371d
fix: alter collection failed with MMAP setting for STRUCT [2.6] (#45240)
pr: https://github.com/milvus-io/milvus/pull/45173
issue: https://github.com/milvus-io/milvus/issues/45001
ref: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-11-04 20:25:37 +08:00
Zhen Ye
122d024df4
enhance: cherry pick patch of new DDL framework and CDC 3 (#45280)
issue: #43897, #44123
pr: #45266
also pick pr: #45237, #45264,#45244,#45275

fix: kafka should auto reset the offset from earliest to read (#45237)

issue: #44172, #45210, #44851,#45244

Kafka auto-resets the offset to "latest" if the offset is out of range,
so the recovery of the Milvus WAL cannot read any messages from that
position. Once the offset is out of range, Kafka should therefore read
from "earliest" to pick up the latest uncleared data.


https://kafka.apache.org/documentation/#consumerconfigs_auto.offset.reset

enhance: support alter collection/database with WAL-based DDL framework
(#45266)

issue: #43897

- Alter collection/database is implemented by WAL-based DDL framework
now.
- Support AlterCollection/AlterDatabase in wal now.
- Alter operation can be synced by new CDC now.
- Refactor some UT for alter DDL.

fix: milvus role cannot stop at initializing state (#45244)

issue: #45243

fix: support upgrading from 2.6.x -> 2.6.5 (#45264)

issue: #43897

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-04 20:21:37 +08:00
congqixia
9d7ef929e1
fix: [2.6] Initialize timestamp range in composite binlog writer (#45283)
Related to #45282

Initialize `tsFrom` and `tsTo` fields in the composite binlog record
writer constructor to prevent timestamp range information loss in stats
tasks.

The composite binlog writer now properly initializes the timestamp range
fields, ensuring that:
1. The first timestamp update will correctly set the minimum (`tsFrom`)
2. The first timestamp update will correctly set the maximum (`tsTo`)
3. All subsequent data writes will maintain accurate timestamp range
tracking
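
A minimal sketch of why the initial values matter. It assumes `tsFrom` starts at the maximum `uint64` and `tsTo` at zero, which is an assumption about the fix and is not stated above. The `tsRange` type and its methods are hypothetical:

```go
package main

import (
	"fmt"
	"math"
)

// tsRange tracks the minimum and maximum timestamp seen so far.
type tsRange struct {
	tsFrom uint64
	tsTo   uint64
}

// newTsRange starts with an "empty" range so that the first observed
// timestamp becomes both the minimum and the maximum.
func newTsRange() *tsRange {
	return &tsRange{tsFrom: math.MaxUint64, tsTo: 0}
}

// observe widens the range to include ts.
func (r *tsRange) observe(ts uint64) {
	if ts < r.tsFrom {
		r.tsFrom = ts
	}
	if ts > r.tsTo {
		r.tsTo = ts
	}
}

// String renders the range as a closed interval.
func (r *tsRange) String() string {
	return fmt.Sprintf("[%d, %d]", r.tsFrom, r.tsTo)
}

func main() {
	r := newTsRange()
	for _, ts := range []uint64{42, 7, 100} {
		r.observe(ts)
	}
	fmt.Println(r)
}
```

With a zero-valued `tsFrom`, no timestamp is ever smaller than the initial value, so the minimum would stay at 0 and the real lower bound would be lost.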

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-04 19:47:34 +08:00
cai.zhang
852b801e90
fix: [2.6] Skip create tmp dir for growing R-Tree index (#45257)
issue: #45181 

master pr: #45256

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-04 17:43:33 +08:00
congqixia
d490a5b4bf
enhance: [2.6] set schema version when creating new collection (#45263) (#45269)
Cherry pick from master
pr: #45263
Related to #43028

Initialize the schema version field when creating a new collection
instance in QueryNode. The schema version is extracted from loadMetaInfo
and assigned to the collection, ensuring proper schema version tracking
and consistency across the distributed system.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-04 17:05:34 +08:00
sparknack
efaa538238
fix:[2.6] avoid potential race conditions when updating the executor (#45232)
issue: #43040 
pr: #45230

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-04 16:11:33 +08:00
groot
f21d7ce05b
enhance: Support JSONL/NDJSON files for bulkinsert (#44602) (#44717)
issue: https://github.com/milvus-io/milvus/issues/44567
pr: https://github.com/milvus-io/milvus/pull/44602

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-11-04 16:01:33 +08:00
yihao.dai
b8257facf2
enhance: [2.6] Wait for replicate stream client to finish (#45260)
Make channel replicator stop more gracefully.

issue: https://github.com/milvus-io/milvus/issues/44123

pr: https://github.com/milvus-io/milvus/pull/45259

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-11-04 14:23:33 +08:00
Spade A
bc47935600
feat: impl StructArray -- support diskann index [2.6] (#45234)
pr: https://github.com/milvus-io/milvus/pull/45223
issue: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-11-04 12:13:34 +08:00
Spade A
f26ce204ce
fix: allow "[" and "]" in index name [2.6] (#45194)
pr: https://github.com/milvus-io/milvus/pull/45193
issue: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-11-04 12:11:34 +08:00
zhagnlu
cb2bc2b41b
fix: fix bug for shredding json when empty but not null json (#45214)
pr: #45221

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-11-04 11:13:34 +08:00
cai.zhang
4bc2dc86a0
enhance: [2.6]Make GeometryCache an optional configuration (#45196)
issue: #45187 
master pr: #45192

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-04 09:51:33 +08:00
Zhen Ye
02e2170601
enhance: cherry pick patch of new DDL framework and CDC 2 (#45241)
issue: #43897, #44123
pr: #45224
also pick pr: #45216,#45154,#45033,#45145,#45092,#45058,#45029

enhance: Close channel replicator more gracefully (#45029)

issue: https://github.com/milvus-io/milvus/issues/44123

enhance: Show create time for import job (#45058)

issue: https://github.com/milvus-io/milvus/issues/45056

fix: wal state may be inconsistent after recovering from crash (#45092)

issue: #45088, #45086

- Message on control channel should trigger the checkpoint update.
- LastConfirmedMessageID should be recovered from the minimum of the
checkpoint and the LastConfirmedMessageID of uncommitted txns.
- Add more log info for wal debugging.
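
The recovery rule for LastConfirmedMessageID can be sketched as follows. Message IDs are modeled as plain `int64` values for illustration, and the function name `recoverLastConfirmed` is hypothetical:

```go
package main

import "fmt"

// recoverLastConfirmed returns the smaller of the checkpoint position and
// the smallest LastConfirmedMessageID among the uncommitted transactions.
func recoverLastConfirmed(checkpoint int64, uncommittedTxn []int64) int64 {
	last := checkpoint
	for _, id := range uncommittedTxn {
		if id < last {
			last = id
		}
	}
	return last
}

func main() {
	fmt.Println(recoverLastConfirmed(100, []int64{120, 90}))
}
```

Taking the minimum keeps the messages of a still-open transaction readable after recovery, even when the checkpoint has already moved past them.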

fix: make ack of broadcaster cannot canceled by client (#45145)

issue: #45141

- Make the broadcaster ack non-cancelable by RPC.
- Clone the assignment snapshot of the wal balancer.
- Add the server ID to GetReplicateCheckpoint to avoid failure.

enhance: support collection and index with WAL-based DDL framework
(#45033)

issue: #43897

- Part of collection/index related DDL is implemented by WAL-based DDL
framework now.
- Support the following message types in wal: CreateCollection,
DropCollection, CreatePartition, DropPartition, CreateIndex, AlterIndex,
DropIndex.
- Part of collection/index related DDL can be synced by new CDC now.
- Refactor some UT for collection/index DDL.
- Add Tombstone scheduler to manage the tombstone GC for collection or
partition meta.
- Move the vchannel allocation into streaming pchannel manager.

enhance: support load/release collection/partition with WAL-based DDL
framework (#45154)

issue: #43897

- Load/Release collection/partition is implemented by WAL-based DDL
framework now.
- Support AlterLoadConfig/DropLoadConfig in wal now.
- Load/Release operation can be synced by new CDC now.
- Refactor some UT for load/release DDL.

enhance: Don't start cdc by default (#45216)

issue: https://github.com/milvus-io/milvus/issues/44123


fix: unrecoverable when replicate from old (#45224)

issue: #44962

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: yihao.dai <yihao.dai@zilliz.com>
2025-11-04 01:35:33 +08:00
cai.zhang
7451d89a22
enhance: [2.6]Add log to debug index task (#45199)
master pr: #45198

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-03 20:03:34 +08:00
Spade A
902eea8e2c
feat: implement ngram tokenizer with token_chars and custom_token_chars [2.6] (#45046)
pr: https://github.com/milvus-io/milvus/pull/45040
issue: https://github.com/milvus-io/milvus/issues/45039

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-11-03 18:11:34 +08:00
Zhen Ye
318db122b8
enhance: cherry pick patch of new DDL framework and CDC (#45025)
issue: #43897, #44123
pr: #44898
related pr: #44607 #44642 #44792 #44809 #44564 #44560 #44735 #44822
#44865 #44850 #44942 #44874 #44963 #44886 #44898

enhance: remove redundant channel manager from datacoord (#44532)

issue: #41611

- After enabling streaming arch, channel manager of data coord is a
redundant component.


fix: Fix CDC OOM due to high buffer size (#44607)

Fix CDC OOM by:
1. Free the msg buffer manually.
2. Limit the max msg buffer size.
3. Reduce the scanner msg handler buffer size.

issue: https://github.com/milvus-io/milvus/issues/44123

fix: remove wrong start timetick to avoid filtering DML whose timetick
is less than it. (#44691)

issue: #41611

- introduced by #44532

enhance: support remove cluster from replicate topology (#44642)

issue: #44558, #44123
- On replicate topology (A->B, A->C), updating config (A->C) to A and C,
and config (B) to B, removes B from the replicate topology.
- Fix some CDC metric errors.

fix: check if qn is sqn with label and streamingnode list (#44792)

issue: #44014

- On standalone, the internal query node needs to load segments and watch
channels, so the querynode is not an embedded querynode in streamingnode
without `LabelStreamingNodeEmbeddedQueryNode`. The channel dist manager
cannot confirm that a standalone node is an embedded streamingnode.

Bug is introduced by #44099

enhance: Make GetReplicateInfo API work at the pchannel level (#44809)

issue: https://github.com/milvus-io/milvus/issues/44123

enhance: Speed up CDC scheduling (#44564)

Make CDC watch etcd replicate pchannel meta instead of listing them
periodically.

issue: https://github.com/milvus-io/milvus/issues/44123


enhance: refactor update replicate config operation using
wal-broadcast-based DDL/DCL framework (#44560)

issue: #43897

- UpdateReplicateConfig operation will broadcast AlterReplicateConfig
message into all pchannels with cluster-exclusive-lock.
- Begin txn message will use commit message timetick now (to avoid
timetick rollback when CDC with txn message).
- If current cluster is secondary, the UpdateReplicateConfig will wait
until the replicate configuration is consistent with the config
replicated from primary.


enhance: support rbac with WAL-based DDL framework (#44735)

issue: #43897

- RBAC(Roles/Users/Privileges/Privilege Groups) is implemented by
WAL-based DDL framework now.
- Support the following message types in wal: `AlterUser`, `DropUser`,
`AlterRole`, `DropRole`, `AlterUserRole`, `DropUserRole`,
`AlterPrivilege`, `DropPrivilege`, `AlterPrivilegeGroup`,
`DropPrivilegeGroup`, `RestoreRBAC`.
- RBAC can be synced by new CDC now.
- Refactor some UT for RBAC.


enhance: support database with WAL-based DDL framework (#44822)

issue: #43897

- Database related DDL is implemented by WAL-based DDL framework now.
- Support the following message types in wal: CreateDatabase,
AlterDatabase, DropDatabase.
- Database DDL can be synced by new CDC now.
- Refactor some UT for Database DDL.

enhance: support alias with WAL-based DDL framework (#44865)

issue: #43897

- Alias related DDL is implemented by WAL-based DDL framework now.
- Support the following message types in wal: AlterAlias, DropAlias.
- Alias DDL can be synced by new CDC now.
- Refactor some UT for Alias DDL.

enhance: Disable import for replicating cluster (#44850)

1. Import in replicating cluster is not supported yet, so disable it for
now.
2. Remove GetReplicateConfiguration wal API

issue: https://github.com/milvus-io/milvus/issues/44123


fix: use short debug string to avoid newline in debug logs (#44925)

issue: #44924

fix: rerank before requery if reranker didn't use field data (#44942)

issue: #44918


enhance: support resource group with WAL-based DDL framework (#44874)

issue: #43897

- Resource group related DDL is implemented by WAL-based DDL framework
now.
- Support the following message types in wal: AlterResourceGroup,
DropResourceGroup.
- Resource group DDL can be synced by new CDC now.
- Refactor some UT for resource group DDL.


fix: Fix replication txn data loss during chaos (#44963)

Only confirm CommitMsg for txn messages to prevent data loss.

issue: https://github.com/milvus-io/milvus/issues/44962,
https://github.com/milvus-io/milvus/issues/44123

fix: wrong execution order of DDL/DCL on secondary (#44886)

issue: #44697, #44696

- The DDL execution order on the secondary now follows the order of the
control channel timetick.
- Filter control channel operations on the shard manager of
streamingnode to avoid creating segments with the wrong vchannel.
- Fix the immutable txn message losing its replicate header.


fix: Fix primary-secondary replication switch blocking (#44898)

1. Fix primary-secondary replication switchover blocking by deleting
replicate pchannel meta using modRevision.
2. Stop channel replicator(scanner) when cluster role changes to prevent
continued message consumption and replication.
3. Close Milvus client to prevent goroutine leak.
4. Create Milvus client once for a channel replicator.
5. Simplify CDC controller and resources.

issue: https://github.com/milvus-io/milvus/issues/44123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: yihao.dai <yihao.dai@zilliz.com>
2025-11-03 15:39:33 +08:00
Zhen Ye
022241cc14
fix: append operation can be only canceled by the wal itself but not the rpc (#45079)
issue: #45077
pr: #45078

We need to guarantee that the state of the wal is consistent with the
memory state of the streamingnode. So we don't allow the append caller to
cancel the append operation, to avoid leaving an alive wal in an
inconsistent state. The wal append operation can only be cancelled when
the wal is shutting down.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-03 15:07:36 +08:00
cai.zhang
e70ee327f1
fix: [2.6] Fix import null geometry data (#45162)
issue: #44787 

master pr: #45161

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-10-31 13:42:08 +08:00
congqixia
f3175d2964
fix: [2.6] add null check for packed_writer_ in JsonStatsParquetWriter::Close() (#45158) (#45176)
Cherry-pick from master
pr: #45158

Related to #45157

Fix a bug where DataNode panics when building json stats index throws an
exception before the writer is initialized. The destructor would call
Close() on an uninitialized packed_writer_ pointer, causing a null
pointer dereference.

Changes:
- Add null check for packed_writer_ before calling Flush() and Close()
- Prevents null pointer dereference in edge cases
- Ignore close status as this is a cleanup operation

This ensures safe cleanup even when initialization fails due to
exceptions.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-31 10:04:10 +08:00
cqy123456
7878293c46
fix: fail to mmap emb_list_meta in embedding list (#45126)
issue: https://github.com/milvus-io/milvus/issues/44965
related pr: https://github.com/milvus-io/milvus/pull/45127

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-10-30 17:04:08 +08:00
congqixia
7f10c98321
fix: [2.6] update QueryNode NumEntities metrics when collection has no segments (#45147) (#45160)
Cherry-pick from master
pr: #45147 

Related to #44509

Fix a bug where QueryNodeNumEntities metrics were not updated for
collections with zero segments, causing stale metrics when all segments
are flushed or compacted.

The previous implementation used separate loops: one to update size
metrics for all collections, and another to update num entities metrics
only for collections present in the grouped segments map. Collections
with no segments were skipped in the second loop, leaving their
NumEntities metrics stale.

Changes:
- Consolidate size and num entities metric updates into single loop
- Iterate over all collections instead of grouped segments
- Get collection metadata from manager instead of segment instances
- Correctly set NumEntities to 0 for collections with no segments
- Apply the same fix to both growing and sealed segment processing
- Add nil check for collection metadata before processing

This ensures all collection metrics are updated consistently, even when
segment count drops to zero.
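
A minimal sketch of the consolidated loop described above. The `segment` type and the `numEntities` function are hypothetical simplifications, not the QueryNode metrics code:

```go
package main

import "fmt"

// segment is a simplified stand-in for a loaded segment.
type segment struct {
	collectionID int64
	numRows      int64
}

// numEntities computes the metric value for every known collection.
// It iterates over the collections, not over the grouped segments, so a
// collection without segments is reported as 0 instead of being skipped.
func numEntities(collections []int64, segments []segment) map[int64]int64 {
	grouped := make(map[int64]int64)
	for _, s := range segments {
		grouped[s.collectionID] += s.numRows
	}
	out := make(map[int64]int64, len(collections))
	for _, id := range collections {
		out[id] = grouped[id] // a missing key yields 0
	}
	return out
}

func main() {
	segs := []segment{
		{collectionID: 1, numRows: 10},
		{collectionID: 1, numRows: 5},
	}
	fmt.Println(numEntities([]int64{1, 2}, segs))
}
```

Iterating over `grouped` instead would never visit collection 2, which is how its metric could stay at an old non-zero value.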

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-30 14:08:09 +08:00
wei liu
71bd07bcf7
fix: Handle empty FieldsData in reduce/rerank for requery scenario (#44917) (#45137)
issue: #44909
pr: #44917

When requery optimization is enabled, search results contain IDs but
empty FieldsData. During reduce/rerank operations, if the first shard
has empty FieldsData while others have data, PrepareResultFieldData
initializes an empty array, causing AppendFieldData to panic when
accessing array indices.

Changes:
- Find first non-empty FieldsData as template in 3 functions:
reduceAdvanceGroupBy, reduceSearchResultDataWithGroupBy,
reduceSearchResultDataNoGroupBy
- Add length check before 2 AppendFieldData calls in reduce functions to
prevent panic
- Improve newRerankOutputs to find first non-empty fieldData using
len(FieldsData) check instead of GetSizeOfIDs
- Add length check in appendResult before AppendFieldData
- Add comprehensive unit tests for empty and partial empty FieldsData
scenarios in both reduce and rerank functions

This fix handles both pure requery (all empty) and mixed scenarios (some
empty, some with data) without breaking normal search flow. The key
improvement is checking FieldsData length directly rather than IDs, as
requery may have IDs but empty FieldsData.
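
The template selection and the length check can be sketched as below. The `fieldData` and `shardResult` types and the `merge` function are hypothetical simplifications of the real result structures:

```go
package main

import "fmt"

// fieldData is a simplified stand-in for one output column.
type fieldData struct {
	name   string
	values []string
}

// shardResult is a simplified stand-in for one shard's search result.
type shardResult struct {
	ids        []int64
	fieldsData []fieldData
}

// templateFields returns the first non-empty fieldsData among the results,
// or nil when every shard has empty fieldsData (pure requery).
func templateFields(results []shardResult) []fieldData {
	for _, r := range results {
		if len(r.fieldsData) > 0 {
			return r.fieldsData
		}
	}
	return nil
}

// merge concatenates IDs from all shards and copies field values only from
// shards whose fieldsData matches the template length.
func merge(results []shardResult) shardResult {
	tmpl := templateFields(results)
	out := shardResult{fieldsData: make([]fieldData, len(tmpl))}
	for i, f := range tmpl {
		out.fieldsData[i].name = f.name
	}
	for _, r := range results {
		out.ids = append(out.ids, r.ids...)
		if len(r.fieldsData) != len(out.fieldsData) {
			continue // IDs only: indexing fieldsData here would panic
		}
		for i, f := range r.fieldsData {
			out.fieldsData[i].values = append(out.fieldsData[i].values, f.values...)
		}
	}
	return out
}

func main() {
	merged := merge([]shardResult{
		{ids: []int64{1, 2}}, // first shard has IDs but no field data
		{ids: []int64{3}, fieldsData: []fieldData{{name: "title", values: []string{"c"}}}},
	})
	fmt.Println(merged.ids, merged.fieldsData)
}
```

Using the first shard unconditionally as the template would produce an empty column list here, and appending the second shard's column would index past its end.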

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-10-29 20:12:13 +08:00
yihao.dai
a3006b7116
fix: [2.6] Fix panic when gracefully stopping cdc (#45095)
issue: https://github.com/milvus-io/milvus/issues/45093,
https://github.com/milvus-io/milvus/issues/44123

pr: https://github.com/milvus-io/milvus/pull/45094

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-10-28 17:14:13 +08:00
tinswzy
76b2d67c6a
fix: [2.6] auth token contamination, OSS/COS support, redundant sync err logs (#45106)
Cherry-pick from master

pr: #44964, related issue #44892: fix invalid auth token caused by
context contamination

pr: #44879: enable WP support for aliyun oss/tencent cos

pr: #44934, related issue #44713: fix redundant wp sync error logs

---------

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-10-28 15:06:12 +08:00
congqixia
d84348b7a2
fix: [2.6] Handle all-null data in StringIndexSort to prevent load timeout(#45100) (#45104)
Cherry-pick from master
pr: #45100

Related to #45081

StringIndexSort now properly handles collections with all-null string
fields by:
- Removing the error thrown when unique_count is 0 in ParseBinaryData
- Adding alignment and padding support in mmap serialization (similar to
ScalarIndexSort)
- Separating data_size_ from mmap_size_ to correctly parse data without
reading padding

This fixes load collection timeout failures when all string field data
is null, particularly affecting STL_SORT and TRIE index types.
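
A minimal sketch of keeping the payload size separate from the padded size. The 32-byte alignment, the `mappedIndex` type, and the function names are assumptions for illustration and do not reflect the actual C++ layout:

```go
package main

import "fmt"

// alignment is a hypothetical boundary for the serialized region.
const alignment = 32

// alignUp rounds n up to the next multiple of align.
func alignUp(n, align int) int {
	return (n + align - 1) / align * align
}

// mappedIndex keeps the payload size and the padded size separately, so
// parsing never reads the padding bytes.
type mappedIndex struct {
	buf      []byte
	dataSize int // bytes of real data
	mmapSize int // bytes including padding
}

// serialize copies data into a zero-padded, aligned buffer.
func serialize(data []byte) mappedIndex {
	mmapSize := alignUp(len(data), alignment)
	buf := make([]byte, mmapSize)
	copy(buf, data)
	return mappedIndex{buf: buf, dataSize: len(data), mmapSize: mmapSize}
}

// payload returns only the real data, excluding the padding.
func (m mappedIndex) payload() []byte {
	return m.buf[:m.dataSize]
}

func main() {
	m := serialize([]byte("abcde"))
	fmt.Println(m.dataSize, m.mmapSize, string(m.payload()))

	// All-null data: there is nothing to serialize, and that is not an error.
	empty := serialize(nil)
	fmt.Println(empty.dataSize, empty.mmapSize, len(empty.payload()))
}
```

Parsing by `mmapSize` instead of `dataSize` would treat the padding as data, which is the confusion the separation avoids.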

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-28 12:10:11 +08:00
zhagnlu
1df958f444
fix: disable build old version jsonstats from request (#45102)
pr: #45101

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-10-28 11:14:11 +08:00