11434 Commits

Author SHA1 Message Date
congqixia
d2e4278b18
enhance: use milvus-storage internal C++ Reader API for Loon FFI (#45897)
This PR refactors the Loon FFI reader implementation to use
milvus-storage's internal C++ Reader API directly instead of the
external FFI interface.

Key changes:
- Replace external FFI calls (get_record_batch_reader, reader_destroy)
with direct C++ Reader API calls
- Add GetLoonReader() helper function to create Reader instances using
milvus-storage::api::Reader::create()
- Use MakeInternalPropertiesFromStorageConfig() instead of
MakePropertiesFromStorageConfig() to get internal properties
- Update NewPackedFFIReaderWithManifest() to deserialize column groups
from JSON manifest content directly
- Simplify GetFFIReaderStream() to use Reader::get_record_batch_reader()
and arrow::ExportRecordBatchReader() for Arrow stream export
- Change CFFIPackedReader typedef from ReaderHandle to void* for
flexibility
- Update milvus-storage dependency version to ba7df7b

This change improves code maintainability by using the native C++ API
directly and eliminates the overhead of going through the external FFI
layer.

issue: #44956

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-28 18:55:07 +08:00
congqixia
ae256c52ae
enhance: Resolve issues integrating loon FFI (#45918)
Related to #44956

- Update milvus-storage version to ba7df7b for chunk reader fix
- Pass manifest path to index build request in DataCoord/DataNode
- Add null chunk assertion with detailed debug info in
ManifestGroupTranslator
- Fix memory corruption by removing premature transaction handle
destruction
- Clean up log message in ChunkedSegmentSealedImpl

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-28 18:41:08 +08:00
Spade A
08142a4854
fix: fix false negative panic on missing fields (#45902)
issue: https://github.com/milvus-io/milvus/issues/45834

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-11-28 18:05:08 +08:00
Zhen Ye
c3fe6473b8
enhance: support async write syncer for milvus logging (#45805)
issue: #45640

- Logs may be dropped if the underlying file system is busy.
- Use an async write syncer so that logging does not block the main
Milvus system.
- Remove some log dependencies from the util functions to avoid a
dependency loop.
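
The async write syncer idea above can be sketched roughly as follows. This
is not the Milvus implementation: `asyncWriter`, the queue size, and the
drop-when-full policy are assumptions made only for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
	"sync/atomic"
)

// asyncWriter is a hypothetical wrapper: Write never blocks on the
// underlying writer; a background goroutine drains a bounded queue.
type asyncWriter struct {
	queue   chan []byte
	done    chan struct{}
	dropped atomic.Int64
	once    sync.Once
}

func newAsyncWriter(dst io.Writer, size int) *asyncWriter {
	w := &asyncWriter{
		queue: make(chan []byte, size),
		done:  make(chan struct{}),
	}
	go func() {
		defer close(w.done)
		for p := range w.queue {
			_, _ = dst.Write(p) // write errors are ignored in this sketch
		}
	}()
	return w
}

// Write copies p and enqueues it. If the queue is full (for example
// because the file system is busy), the entry is dropped and counted.
// Write must not be called after Close.
func (w *asyncWriter) Write(p []byte) (int, error) {
	buf := append([]byte(nil), p...)
	select {
	case w.queue <- buf:
	default:
		w.dropped.Add(1)
	}
	return len(p), nil
}

// Close flushes the queued entries and waits for the drain goroutine.
func (w *asyncWriter) Close() {
	w.once.Do(func() { close(w.queue) })
	<-w.done
}

func demoAsyncLog() (string, int64) {
	var sink bytes.Buffer
	w := newAsyncWriter(&sink, 16)
	fmt.Fprintln(w, "segment loaded")
	fmt.Fprintln(w, "index built")
	w.Close()
	return sink.String(), w.dropped.Load()
}

func main() {
	out, dropped := demoAsyncLog()
	fmt.Print(out)
	fmt.Println("dropped:", dropped)
}
```

Dropping on a full queue trades completeness of the log for never stalling
the caller, which matches the first bullet above.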

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-28 17:43:11 +08:00
Zhen Ye
4f080bd3a0
fix: remove the streamingnode checking when loading segment (#45859)
issue: #43117

If the check is enabled when loading segments, every segment must be
loaded by the streamingnode rather than a 2.5 querynode, which makes
some searches and queries fail during an upgrade. Without the check,
some search and query results may be wrong during an upgrade. We disable
the check for now to keep search and query available while upgrading.

See also PR: #43346

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-28 10:09:08 +08:00
Zhen Ye
31976d8adb
fix: executor/scheduler should be latest replica meta but not replica copy (#45877)
issue: #45865

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-28 06:59:08 +08:00
zhagnlu
1b58844319
enhance: support mmap for jsonstats shared key index (#44914)
#42533

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-11-27 16:01:08 +08:00
Bingyi Sun
b6532d3e44
enhance: implement external collection update task with source change detection (#45690)
issue: https://github.com/milvus-io/milvus/issues/45691
Add persistent task management for external collections with automatic
detection of external_source and external_spec changes. When source
changes, the system aborts running tasks and creates new ones, ensuring
only one active task per collection. Tasks validate their source on
completion to prevent superseded tasks from committing results.
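
A minimal in-memory sketch of the task lifecycle described above. The
names `manager`, `Update`, and `Commit` are hypothetical, a single
`source` string stands in for both external_source and external_spec,
and persistence is omitted.

```go
package main

import (
	"fmt"
	"sync"
)

// task is a hypothetical update task bound to the source it was created for.
type task struct {
	id      int
	source  string
	aborted bool
}

// manager keeps at most one active task per collection.
type manager struct {
	mu     sync.Mutex
	nextID int
	active map[int64]*task
}

func newManager() *manager {
	return &manager{active: make(map[int64]*task)}
}

// Update returns the active task if the source is unchanged. Otherwise it
// aborts the running task and creates a new one for the new source.
func (m *manager) Update(collectionID int64, source string) *task {
	m.mu.Lock()
	defer m.mu.Unlock()
	if cur, ok := m.active[collectionID]; ok {
		if cur.source == source {
			return cur
		}
		cur.aborted = true
	}
	m.nextID++
	t := &task{id: m.nextID, source: source}
	m.active[collectionID] = t
	return t
}

// Commit succeeds only for the task that is still the active one, so a
// superseded task cannot commit its results.
func (m *manager) Commit(collectionID int64, t *task) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	cur, ok := m.active[collectionID]
	if !ok || cur != t || t.aborted {
		return false
	}
	delete(m.active, collectionID)
	return true
}

func demoTasks() (bool, bool, bool) {
	m := newManager()
	oldTask := m.Update(1, "s3://bucket/a")
	newTask := m.Update(1, "s3://bucket/b") // source changed
	return m.Commit(1, oldTask), m.Commit(1, newTask), oldTask.aborted
}

func main() {
	oldOK, newOK, aborted := demoTasks()
	fmt.Println(oldOK, newOK, aborted)
}
```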

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-27 15:33:08 +08:00
cai.zhang
7c9a9c6f7e
fix: Reduce querycoord check node in replica interval for test (#45837)
issue: #45791

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-27 15:07:07 +08:00
Xiaofan
f455910bee
fix: support azure blob storage with federated token (#45632)
fix #44582 
related to #44583
Co-authored-by: DuMinhLe<https://github.com/ducminhle>

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2025-11-27 14:29:07 +08:00
Zhen Ye
8e0ae6433d
fix: LastConfirmedMessageID may be wrong if high concurrent writing (#45873)
issue: #45872

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-27 12:01:07 +08:00
Chun Han
f34eb3ae90
enhance: remove useless code(#30376) (#45685)
related: #30376

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-11-27 10:43:07 +08:00
liliu-z
95ee65e950
fix: Fix race condition in Zilliz client tests (#45832)
issue: #45831

This PR fixes a race condition in `TestZillizClient` test cases where
`t.Logf` could be called after the test function had returned, leading
to a panic or data race failure.

Root cause:
1. Incorrect defer order: the deferred `s.Stop()` ran after the deferred
   `lis.Close()`. This caused `lis.Accept()` to return an error before
   `s.Stop()` had set the quit flag, triggering the error handling path
   in the server goroutine.
2. Unsafe logging: the error handling path used `t.Logf` inside a
   goroutine, which is unsafe if the main test function exits before the
   log is printed.

Fixes:
- Changed defer order to ensure `s.Stop()` is called before
`lis.Close()`.
- Replaced `t.Logf` with `fmt.Printf` in the background goroutine to
avoid panic on test completion.
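
The ordering fix can be illustrated with a small, network-free sketch.
`fakeListener` and `fakeServer` are hypothetical stand-ins, not the types
used in the test; the point is only that deferred calls run last-in,
first-out, so `s.Stop()` must be deferred after `lis.Close()` to run
before it.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// fakeListener blocks in Accept until it is closed, then returns an error.
type fakeListener struct {
	closed chan struct{}
	once   sync.Once
}

func (l *fakeListener) Accept() error {
	<-l.closed
	return errors.New("listener closed")
}

func (l *fakeListener) Close() { l.once.Do(func() { close(l.closed) }) }

// fakeServer treats an Accept error as unexpected unless quit is set.
type fakeServer struct {
	lis        *fakeListener
	quit       atomic.Bool
	wg         sync.WaitGroup
	unexpected atomic.Int32
}

func (s *fakeServer) serve() {
	defer s.wg.Done()
	if err := s.lis.Accept(); err != nil && !s.quit.Load() {
		s.unexpected.Add(1)
		// fmt.Printf rather than t.Logf: safe even after a test returns.
		fmt.Printf("unexpected accept error: %v\n", err)
	}
}

// Stop sets the quit flag before closing the listener, then waits.
func (s *fakeServer) Stop() {
	s.quit.Store(true)
	s.lis.Close()
	s.wg.Wait()
}

func runFixedOrder() int32 {
	lis := &fakeListener{closed: make(chan struct{})}
	s := &fakeServer{lis: lis}
	s.wg.Add(1)
	go s.serve()
	func() {
		defer lis.Close() // registered first, runs last
		defer s.Stop()    // registered last, runs first
	}()
	return s.unexpected.Load()
}

func main() {
	fmt.Println("unexpected accept errors:", runFixedOrder())
}
```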

Signed-off-by: Li Liu <li.liu@zilliz.com>
2025-11-26 21:13:09 +08:00
Buqian Zheng
6c0a80d8c3
enhance: pk binary range in sealed segment to use binary search (#45829)
issue: https://github.com/milvus-io/milvus/discussions/44935
pr: https://github.com/milvus-io/milvus/pull/45328

This PR improves the PK range operation.

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-26 17:17:08 +08:00
Xiaofan
e8419a8074
fix: listImport and getImportProgress should follow import access (#45822)
related to #45709
listImport and DescribeImportJob shouldn't use global access

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2025-11-26 14:59:08 +08:00
congqixia
3f8c146831
enhance: support manifest-based index building with Loon FFI reader (#45726)
This PR adds support for reading data from StorageV2 using manifest
files and the Loon FFI interface during index building, providing an
alternative to the traditional segment insert files approach.

Key changes:

Core C++ changes:
- Add SEGMENT_MANIFEST_KEY and LOON_FFI_PROPERTIES_KEY constants for
manifest handling
- Extend FileManagerContext to carry loon_ffi_properties for FFI
operations
- Update index_c.cpp to pass manifest and loon properties to file
managers for all index types (vector, JSON key, text)
- Implement GetFieldDatasFromManifest() in Util.cpp using Arrow C Stream
interface:
  * Create Arrow schema from field metadata
  * Initialize FFI reader with manifest content and storage properties
  * Import record batches from C data interface
  * Convert to FieldData for index building
- Update DiskFileManagerImpl and MemFileManagerImpl to support
manifest-based data reading with fallback to traditional paths

Loon FFI utilities (internal/core/src/storage/loon_ffi/):
- Add ToCStorageConfig() to convert StorageConfig to C-compatible
structure
- Implement GetManifest() to parse manifest JSON and retrieve column
groups via FFI
- Enhance MakePropertiesFromStorageConfig() integration

Storage V2 integration:
- Update milvus-storage dependency from 0883026 to 302143c for latest
FFI support

Protobuf changes:
- Add manifest field to BuildIndexInfo for passing manifest path to C++
layer

Configuration:
- Add common.storageV2.useLoonFFI config option (default: false) for
feature toggle

This change is part of issue #44956 to integrate the StorageV2 FFI
interface as the unified storage layer. The implementation maintains
backward compatibility by checking for manifest presence and falling
back to existing segment insert files approach when manifest is not
provided.

Related issue: #44956

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-26 12:43:08 +08:00
sparknack
4b14ab14e3
enhance: mmap once for each group chunk (#45487)
issue: #45486

This commit refactors the chunk writing system by introducing a
two-phase
approach: size calculation followed by writing to a target. This enables
efficient group chunk creation where multiple fields share a single mmap
region, significantly reducing the number of mmap system calls and VMAs.

- Optimize `mmap` usage: single `mmap` per group chunk instead of per
field
- Split ChunkWriter into two phases:
  - `calculate_size()`: Pre-compute required memory without allocation
  - `write_to_target()`: Write data to a provided ChunkTarget
- Implement `ChunkMmapGuard` for unified mmap region lifecycle
management
  - Handles `munmap` and file cleanup via RAII
  - Shared via `std::shared_ptr` across multiple chunks in a group
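
The two-phase idea can be sketched in Go as an analogue of the C++
change. An ordinary byte slice stands in for the single mmap region, and
`fieldWriter`, `calculateSize`, and `writeTo` are hypothetical names that
only mirror `calculate_size()` and `write_to_target()`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fieldWriter is a hypothetical per-field writer for fixed-width values.
type fieldWriter struct {
	values []int64
}

// calculateSize is phase one: report the bytes needed, allocate nothing.
func (w fieldWriter) calculateSize() int { return 8 * len(w.values) }

// writeTo is phase two: write into a target provided by the caller.
func (w fieldWriter) writeTo(target []byte) {
	for i, v := range w.values {
		binary.LittleEndian.PutUint64(target[i*8:], uint64(v))
	}
}

// buildGroupChunk sizes every field first, allocates one shared region,
// and lets each field write into its own sub-range of that region.
func buildGroupChunk(writers []fieldWriter) ([]byte, []int) {
	offsets := make([]int, len(writers))
	total := 0
	for i, w := range writers {
		offsets[i] = total
		total += w.calculateSize()
	}
	region := make([]byte, total) // one allocation for the whole group
	for i, w := range writers {
		w.writeTo(region[offsets[i] : offsets[i]+w.calculateSize()])
	}
	return region, offsets
}

func main() {
	region, offsets := buildGroupChunk([]fieldWriter{
		{values: []int64{1, 2}},
		{values: []int64{3, 4, 5}},
	})
	fmt.Println(len(region), offsets)
}
```

Knowing every size up front is what makes a single region per group
possible; without phase one, each field would need its own allocation.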

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-26 10:37:08 +08:00
sparknack
0392db6976
enhance: add cancellation checking in each operator and expr (#45354)
issue: #45353

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-26 10:15:07 +08:00
XuanYang-cn
471b3e4e09
fix: Move Init and Remove EZ logic out of metatable (#45827)
This avoids endlessly retrying CreateDatabase/DropDatabase when
cipherPlugin fails in the new DDL framework.

See also: #45826

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-11-25 20:01:06 +08:00
wei liu
4d6b130af4
fix: prevent panic in standby mixcoord during shutdown (#45730)
issue: #45728
When mixcoord is in standby mode and shutdown is triggered, the
ProcessActiveStandBy goroutine may panic if context cancellation occurs.
This happens because the error handling didn't check for
context.Canceled errors before panicking.

Changes:
- Add context cancellation check in mix_coord Register() before panic
- Check s.ctx.Err() == context.Canceled and gracefully exit
- Remove unused ForceActiveStandby() function from session_util

This ensures standby mixcoord can shutdown gracefully without panic when
context is cancelled during the standby process.
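
A minimal sketch of the check described above. `processActiveStandby` and
`register` are hypothetical stand-ins for the real functions, and
`errors.Is` is used here instead of the direct `==` comparison so that
wrapped errors also match.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// processActiveStandby is a hypothetical stand-in for the blocking standby
// call: it returns the context error once the context is done.
func processActiveStandby(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// register reports true when it exits gracefully because of cancellation.
// Any other error still panics, as before.
func register(ctx context.Context) bool {
	if err := processActiveStandby(ctx); err != nil {
		if errors.Is(err, context.Canceled) {
			return true // shutdown in progress: exit without panicking
		}
		panic(err)
	}
	return false
}

func demoShutdown() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate shutdown while still in standby
	return register(ctx)
}

func main() {
	fmt.Println("graceful exit:", demoShutdown())
}
```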

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-11-25 19:27:07 +08:00
congqixia
03f5d7c0a5
enhance: integrate StorageV2 FFI interface for manifest-based segment loading (#45798)
Related to #44956

**New Translator (C++)**
- Added `ManifestGroupTranslator`
(`internal/core/src/segcore/storagev2translator/`)
  - Translates manifest-based column groups to Milvus internal format
  - Implements `GroupCTMeta` interface for chunk-based column access
  - Supports both memory and mmap storage modes
  - Handles cache warmup policies for vector and scalar data

**ChunkedSegmentSealedImpl**
(`internal/core/src/segcore/ChunkedSegmentSealedImpl.cpp:333`)
- Added `LoadColumnGroups(const std::string& manifest_path)`: Main entry
point for manifest-based loading
  - Creates milvus-storage Reader from manifest file
  - Parallelizes column group loading using thread pool
  - Aggregates loading exceptions and reports errors
- Added `LoadColumnGroup()`: Loads individual column group
  - Extracts field IDs from column group metadata
  - Creates ManifestGroupTranslator for each column group
  - Builds ProxyChunkColumn for field access
  - Special handling for timestamp field index construction

**SegmentGrowingImpl**
(`internal/core/src/segcore/SegmentGrowingImpl.cpp`)
- Added similar `LoadColumnGroups()` and `LoadColumnGroup()` methods for
growing segments
- Maintains consistency with sealed segment loading path

Storage FFI Utilities

**loon_ffi/util** (`internal/core/src/storage/loon_ffi/util.cpp`)
- Added `MakeInternalPropertiesFromStorageConfig()`: Converts C storage
config to internal Properties
  - Maps all storage configuration fields (S3, GCS, Azure, local)
  - Handles SSL, IAM, virtual host settings
  - Configures connection timeouts and max connections
- Added `MakeInternalLocalProperies()`: Creates local filesystem
properties
- Added `ToCStorageConfig()`: Converts Go StorageConfig to C
representation
- Added `GetColumnGroups()`: Extracts column groups from manifest file
using Transaction API

Protocol Buffer Changes

**segcore.proto** (`pkg/proto/segcore.proto:121`)
- Added `manifest_path` field to `SegmentLoadInfo` message
- Enables passing manifest file path from Go layer to C++ core

Go Integration

**segment.go** (`internal/util/segcore/segment.go:372`)
- Updated `ConvertToSegcoreSegmentLoadInfo()` to propagate
`ManifestPath` field
- Bridges QueryNode segment load info to Segcore format

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-25 17:27:07 +08:00
groot
a545ebc702
fix: Fix a bug that bulkimport cannot handle empty struct list (#45693)
issue: https://github.com/milvus-io/milvus/issues/42148

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-11-25 17:21:06 +08:00
Buqian Zheng
7078f403f1
enhance: add vector reserve to improve memory allocation in segcore (#45757)
This commit optimizes std::vector usage across segcore by adding
reserve() calls where the size is known in advance, reducing memory
reallocations during push_back operations.

Changes:
- TimestampIndex.cpp: Reserve space for prefix_sums and
timestamp_barriers
- SegmentGrowingImpl.cpp: Reserve space for binlog info vectors
- ChunkedSegmentSealedImpl.cpp: Reserve space for futures and field data
vectors
- storagev2translator/GroupChunkTranslator.cpp: Reserve space for
metadata vectors

This improves performance by avoiding multiple memory reallocations when
the vector size is predictable.

issue: https://github.com/milvus-io/milvus/issues/45679

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-25 14:19:07 +08:00
zhagnlu
346449d87f
fix:fix undefined behavior for dump snapshot (#45611)
#45610

This fix adds a small cost at execution time:
=== Lower Bound Overhead (isolated) ===
Position 1 (list len = 90000): 39 ns per lower_bound
Position 2 (list len =180000): 45 ns per lower_bound
Position 3 (list len =270000): 46 ns per lower_bound
Position 4 (list len =360000): 38 ns per lower_bound
Position 5 (list len =450000): 42 ns per lower_bound
Position 6 (list len =540000): 55 ns per lower_bound
Position 7 (list len =630000): 56 ns per lower_bound
Position 8 (list len =720000): 49 ns per lower_bound
Position 9 (list len =810000): 48 ns per lower_bound

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-11-25 14:17:07 +08:00
Bingyi Sun
929cb42fcc
fix: Replace json.doc() calls with json.dom_doc() in JsonContainsExpr (#45573)
issue: https://github.com/milvus-io/milvus/issues/45783
With the simdjson On-Demand API, an iterator can only be used once. Use
the DOM API to prevent crashes when processing JSON contains operations
with different types.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-25 14:15:12 +08:00
aoiasd
322caafe18
feat: support file params in analyzer and set jieba dict file (#45206)
relate: https://github.com/milvus-io/milvus/issues/43687
Support user-provided files via file params in analyzer params.
A file can be a local file or a remote file resource.
Support file params for the jieba external dictionary.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-25 11:33:06 +08:00
yihao.dai
5e6fdf3ba7
enhance: Skip redundant failure marking for completed import jobs (#45767)
Skip redundant failure marking for completed import jobs when the
collection is dropped or the import jobs time out.

issue: https://github.com/milvus-io/milvus/issues/45766

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-11-25 11:21:07 +08:00
Zhen Ye
446e0b7bf5
fix: keep memory state consistent when recovering broadcast task from proto (#45787)
issue: #45782

- The zero value of repeated and bytes fields in proto is ignored or
treated as an empty value rather than a nil pointer, so the recovery
info of the broadcast task restored from proto must be fixed up to keep
the in-memory state consistent.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-24 20:05:07 +08:00
congqixia
c01fd94a6a
enhance: integrate Storage V2 FFI interface for unified storage access (#45723)
Related #44956
This commit integrates the Storage V2 FFI (Foreign Function Interface)
interface throughout the Milvus codebase, enabling unified storage
access through the Loon FFI layer. This is a significant step towards
standardizing storage operations across different storage versions.

1. Configuration Support
- **configs/milvus.yaml**: Added `useLoonFFI` configuration flag under
`common.storage.file.splitByAvgSize` section
- Allows runtime toggle between traditional binlog readers and new
FFI-based manifest readers
  - Default: `false` (maintains backward compatibility)

2. Core FFI Infrastructure

Enhanced Utilities (internal/core/src/storage/loon_ffi/util.cpp/h)
- **ToCStorageConfig()**: Converts Go's `StorageConfig` to C's
`CStorageConfig` struct for FFI calls
- **GetManifest()**: Parses manifest JSON and retrieves latest column
groups using FFI
  - Accepts manifest path with `base_path` and `ver` fields
  - Calls `get_latest_column_groups()` FFI function
  - Returns column group information as string
  - Comprehensive error handling for JSON parsing and FFI errors

3. Dependency Updates
- **internal/core/thirdparty/milvus-storage/CMakeLists.txt**:
  - Updated milvus-storage version from `0883026` to `302143c`
  - Ensures compatibility with latest FFI interfaces

4. Data Coordinator Changes

All compaction task builders now include manifest path in segment
binlogs:

- **compaction_task_clustering.go**: Added `Manifest:
segInfo.GetManifestPath()` to segment binlogs
- **compaction_task_l0.go**: Added manifest path to both L0 segment
selection and compaction plan building
- **compaction_task_mix.go**: Added manifest path to mixed compaction
segment binlogs
- **meta.go**: Updated metadata completion logic:
- `completeClusterCompactionMutation()`: Set `ManifestPath` in new
segment info
- `completeMixCompactionMutation()`: Preserve manifest path in compacted
segments
- `completeSortCompactionMutation()`: Include manifest path in sorted
segments

5. Data Node Compactor Enhancements

All compactors updated to support dual-mode reading (binlog vs
manifest):

6. Flush & Sync Manager Updates

Pack Writer V2 (pack_writer_v2.go)
- **BulkPackWriterV2.Write()**: Extended return signature to include
`manifest string`
- Implementation:
  - Generate manifest path: `path.Join(pack.segmentID, "manifest.json")`
  - Write packed data using FFI-based writer
  - Return manifest path along with binlogs, deltas, and stats

Task Handling (task.go)
- Updated all sync task result handling to accommodate new manifest
return value
- Ensured backward compatibility for callers not using manifest

7. Go Storage Layer Integration

New Interfaces and Implementations
- **record_reader.go**: Interface for unified record reading across
storage versions
- **record_writer.go**: Interface for unified record writing across
storage versions
- **binlog_record_writer.go**: Concrete implementation for traditional
binlog-based writing

Enhanced Schema Support (schema.go, schema_test.go)
- Schema conversion utilities to support FFI-based storage operations
- Ensures proper Arrow schema mapping for V2 storage

Serialization Updates
- **serde.go, serde_events.go, serde_events_v2.go**: Updated to work
with new reader/writer interfaces
- Test files updated to validate dual-mode serialization

8. Storage V2 Packed Format

FFI Common (storagev2/packed/ffi_common.go)
- Common FFI utilities and type conversions for packed storage format

Packed Writer FFI (storagev2/packed/packed_writer_ffi.go)
- FFI-based implementation of packed writer
- Integrates with Loon storage layer for efficient columnar writes

Packed Reader FFI (storagev2/packed/packed_reader_ffi.go)
- Already existed, now complemented by writer implementation

9. Protocol Buffer Updates

data_coord.proto & datapb/data_coord.pb.go
- Added `manifest` field to compaction segment messages
- Enables passing manifest metadata through compaction pipeline

worker.proto & workerpb/worker.pb.go
- Added compaction parameter for `useLoonFFI` flag
- Allows workers to receive FFI configuration from coordinator

10. Parameter Configuration

component_param.go
- Added `UseLoonFFI` parameter to compaction configuration
- Reads from `common.storage.file.useLoonFFI` config path
- Default: `false` for safe rollout

11. Test Updates
- **clustering_compactor_storage_v2_test.go**: Updated signatures to
handle manifest return value
- **mix_compactor_storage_v2_test.go**: Updated test helpers for
manifest support
- **namespace_compactor_test.go**: Adjusted writer calls to expect
manifest
- **pack_writer_v2_test.go**: Validated manifest generation in pack
writing

This integration follows a **dual-mode approach**:
1. **Legacy Path**: Traditional binlog-based reading/writing (when
`useLoonFFI=false` or no manifest)
2. **FFI Path**: Manifest-based reading/writing through Loon FFI (when
`useLoonFFI=true` and manifest exists)
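
The selection rule above amounts to a small predicate. A sketch, with
`readPath` and `choosePath` as hypothetical names:

```go
package main

import "fmt"

// readPath names the two modes described above (hypothetical type).
type readPath string

const (
	legacyBinlog readPath = "binlog"
	loonFFI      readPath = "loon-ffi"
)

// choosePath picks the FFI path only when the flag is on AND a manifest
// exists; every other combination falls back to the legacy binlog path.
func choosePath(useLoonFFI bool, manifest string) readPath {
	if useLoonFFI && manifest != "" {
		return loonFFI
	}
	return legacyBinlog
}

func main() {
	fmt.Println(choosePath(false, "1001/manifest.json"))
	fmt.Println(choosePath(true, ""))
	fmt.Println(choosePath(true, "1001/manifest.json"))
}
```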

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-24 19:57:07 +08:00
congqixia
a7275e190e
fix: populate index info after segment loading to prevent redundant load tasks (#45803)
After segments gained self-management capabilities for loading, the
index information from the initial load was not being preserved in the
Go-side segment metadata. This caused QueryCoord to repeatedly dispatch
load index tasks, which would fail in segcore since the indexes were
already loaded.

**Root Cause:**
The segment's `fieldIndexes` map was not being populated with index
metadata after calling `FinishLoad`, leading to a mismatch between the
Go-side metadata and segcore's internal state.

**Solution:**
After successfully loading a sealed segment, iterate through
`loadInfo.IndexInfos` and insert each index entry into the segment's
`fieldIndexes` map. This ensures the Go-side metadata stays in sync with
segcore and prevents redundant load index operations.
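
A rough sketch of the fix, using simplified hypothetical types. The real
`fieldIndexes` map and `IndexInfos` entries carry more fields, and keying
the map by index ID is an assumption made here.

```go
package main

import (
	"fmt"
	"sync"
)

// IndexInfo and LoadInfo are simplified, hypothetical stand-ins.
type IndexInfo struct {
	FieldID int64
	IndexID int64
}

type LoadInfo struct {
	IndexInfos []*IndexInfo
}

// Segment holds the Go-side view of which indexes are already loaded.
type Segment struct {
	mu           sync.RWMutex
	fieldIndexes map[int64]*IndexInfo // keyed by index ID (assumption)
}

// recordLoadedIndexes mirrors the fix: after a successful load, copy every
// entry of loadInfo.IndexInfos into the segment's fieldIndexes map.
func (s *Segment) recordLoadedIndexes(info *LoadInfo) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.fieldIndexes == nil {
		s.fieldIndexes = make(map[int64]*IndexInfo)
	}
	for _, idx := range info.IndexInfos {
		s.fieldIndexes[idx.IndexID] = idx
	}
}

// hasIndex is what a coordinator-side check would consult before
// dispatching another load-index task.
func (s *Segment) hasIndex(indexID int64) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	_, ok := s.fieldIndexes[indexID]
	return ok
}

func demoIndexSync() (bool, bool) {
	seg := &Segment{}
	seg.recordLoadedIndexes(&LoadInfo{
		IndexInfos: []*IndexInfo{{FieldID: 100, IndexID: 9001}},
	})
	return seg.hasIndex(9001), seg.hasIndex(9002)
}

func main() {
	loaded, missing := demoIndexSync()
	fmt.Println(loaded, missing)
}
```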

Fixes #45802
Related to #45060

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-24 19:55:07 +08:00
XuanYang-cn
c082317681
fix: Use base64 to encode not utf-8 bytes (#45655)
See also: #45654

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-11-24 18:23:06 +08:00
aoiasd
5efb0cedc8
feat: support use fragment config for highlight (#45099)
relate: https://github.com/milvus-io/milvus/issues/42589

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-24 17:07:06 +08:00
tinswzy
1427825133
enhance: improve WAL retention strategy (#45350)
issue: #44369 
woodpecker related issue: [#59](https://github.com/zilliztech/woodpecker/issues/59)

Refactor the WAL retention logic in Milvus StreamingNode:
- Remove the simple sampling-based truncation mechanism.
- After flush, WAL data is directly truncated.
- The retention control is now delegated to the underlying message queue
(MQ) implementation.

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-11-23 21:41:05 +08:00
Zhen Ye
823c7f7e3e
fix: use remote wal when local wal shutdown (#45753)
issue: #45750

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-22 16:17:05 +08:00
Buqian Zheng
2cf1e0e452
enhance: optimize pk search to use binary search, and 2 pointers for in expr (#45328)
issue: #44935

This is somewhat related to #44935, but applies to the PK rather than
the stl_sort index.
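
The two techniques named in the title can be sketched as follows. This is
a Go illustration on a sorted `[]int64`, not the segcore C++ code;
`rangeBounds` and `matchIn` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// rangeBounds returns the half-open index range [lo, hi) of primary keys
// in sorted that satisfy low <= pk <= high, using two binary searches.
func rangeBounds(sorted []int64, low, high int64) (int, int) {
	lo := sort.Search(len(sorted), func(i int) bool { return sorted[i] >= low })
	hi := sort.Search(len(sorted), func(i int) bool { return sorted[i] > high })
	if hi < lo {
		hi = lo // empty range when low > high
	}
	return lo, hi
}

// matchIn walks two sorted lists with two pointers and returns the indexes
// in sorted whose primary key appears in terms.
func matchIn(sorted, terms []int64) []int {
	var out []int
	i, j := 0, 0
	for i < len(sorted) && j < len(terms) {
		switch {
		case sorted[i] < terms[j]:
			i++
		case sorted[i] > terms[j]:
			j++
		default:
			out = append(out, i)
			i++ // keep j so repeated keys in sorted also match
		}
	}
	return out
}

func main() {
	pks := []int64{1, 3, 5, 7, 9}
	lo, hi := rangeBounds(pks, 3, 7)
	fmt.Println(lo, hi)
	fmt.Println(matchIn(pks, []int64{3, 4, 9}))
}
```

Both helpers rely on the primary keys being sorted; the range lookup costs
two binary searches and the IN match costs one linear pass over both lists.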

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-21 19:01:05 +08:00
Buqian Zheng
e00ad1098f
enhance: add ScalarFieldProto& overload to avoid unnecessary copies (#45743)
1. Array.h: Add output_data(ScalarFieldProto&) overload for both Array
and ArrayView classes
2. Use std::string_view instead of std::string for VARCHAR and GEOMETRY
types to avoid extra string copies
3. Call Reserve(length_) before writing to proto objects to reduce
memory reallocations

A simple test shows these optimizations improve bulk_subscript
performance for Array of Varchar by 20%.

issue: https://github.com/milvus-io/milvus/issues/45679

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-21 18:35:05 +08:00
congqixia
f51fcc09ae
fix: resolve SessionWatcher goroutine leak and unstable UT in querycoordv2 (#45627)
Related to #44620
Related to unstable ut "internal/querycoordv2 TestServer/TestNodeUp"

Introduce SessionWatcher interface to fix race condition and goroutine
leak that caused unstable unit test TestServer/TestNodeUp.

Changes:
- Add SessionWatcher interface with EventChannel() and Stop() methods
- Refactor WatchServices() to return SessionWatcher instead of raw
channel
- Fix cleanup order in QueryCoordV2: stop watcher before session
- Update DataCoord, ConnectionManager to use SessionWatcher
- Add MockSessionWatcher for testing

Fixes race condition between session context cancellation and internal
loop exit. Eliminates goroutine leak by providing explicit lifecycle
management.
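
A sketch of what such an interface might look like. Only the method names
`EventChannel()` and `Stop()` come from the description above; the
`SessionEvent` type, the `watcher` implementation, and `newWatcher` are
assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// SessionEvent is a hypothetical event payload.
type SessionEvent struct {
	NodeID int64
}

// SessionWatcher exposes events plus an explicit way to end the watch.
type SessionWatcher interface {
	EventChannel() <-chan SessionEvent
	Stop()
}

type watcher struct {
	events chan SessionEvent
	stop   chan struct{}
	done   chan struct{}
	once   sync.Once
}

// newWatcher forwards events from src until Stop is called or src closes.
func newWatcher(src <-chan SessionEvent) *watcher {
	w := &watcher{
		events: make(chan SessionEvent),
		stop:   make(chan struct{}),
		done:   make(chan struct{}),
	}
	go func() {
		defer close(w.done)
		defer close(w.events) // runs first: consumers see a closed channel
		for {
			select {
			case <-w.stop:
				return
			case ev, ok := <-src:
				if !ok {
					return
				}
				select {
				case w.events <- ev:
				case <-w.stop:
					return
				}
			}
		}
	}()
	return w
}

func (w *watcher) EventChannel() <-chan SessionEvent { return w.events }

// Stop is idempotent and returns only after the goroutine has exited,
// so no goroutine is leaked.
func (w *watcher) Stop() {
	w.once.Do(func() { close(w.stop) })
	<-w.done
}

func demoWatcher() (int64, bool) {
	src := make(chan SessionEvent, 1)
	src <- SessionEvent{NodeID: 7}
	var w SessionWatcher = newWatcher(src)
	ev := <-w.EventChannel()
	w.Stop()
	_, open := <-w.EventChannel()
	return ev.NodeID, open
}

func main() {
	id, open := demoWatcher()
	fmt.Println(id, open)
}
```

Returning an object with `Stop()` instead of a raw channel gives the caller
a way to control teardown order, for example stopping the watcher before
the session as described in the changes above.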

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-21 18:33:06 +08:00
Zhen Ye
a0c269dfe7
fix: use 2.6.6 for milvus DDL upgrading (#45738)
issue: #43897

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-21 11:45:04 +08:00
Bingyi Sun
275a5b9afc
enhance: optimize term expr performance (#45491)
issue: https://github.com/milvus-io/milvus/issues/45641

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-21 11:01:05 +08:00
Zhen Ye
1cd0ef943e
fix: use latest timetick to expire cache (#45717)
issue: #45697

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-20 21:39:04 +08:00
junjiejiangjjj
d3164e8030
feat: add configurable batch factor and runtime check bypass for embedding functions (#45592)
https://github.com/milvus-io/milvus/issues/45544
- Add batch_factor configuration parameter (default: 5) to control
embedding provider batch sizes
- Add disable_func_runtime_check property to bypass function validation
during collection creation
- Add database interceptor support for AddCollectionFunction,
AlterCollectionFunction, and DropCollectionFunction requests

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-11-20 19:55:04 +08:00
wei liu
3fbee154f6
enhance: Remove large segment ID arrays from QueryNode logs (#45719)
issue: #45718

Logging complete segment ID arrays caused excessive log volume (3-6 TB
for 200k segments). Remove arrays from logger fields and keep only
segment counts for observability.

Changes:
- Remove requestSegments/preparedSegments arrays from Load logger
- Remove segmentIDs from BM25 stats logs
- Remove entries structure from sync distribution log

This reduces log volume by 99.99% for large-scale operations.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-11-20 17:18:14 +08:00
Zhen Ye
3c90dddebf
fix: streamingnode should exit when initializing failure (#45731)
issue: #45721

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-20 17:12:38 +08:00
congqixia
79926b412c
fix: protect tbb concurrent_map emplace to avoid race condition deadlock (#45681)
Related to #44974

The emplace() operation on tbb::concurrent_hash_map was not protected,
allowing other threads to erase entries between the emplace attempt and
the subsequent lookup.

Solution:
1. Add shared_lock protection around the emplace() operation to prevent
concurrent erasure during insertion
2. Instead of returning nullptr when the key is not found on retry,
recursively call Get(key) to retry the entire operation
3. Fix typo: "earsed" -> "erased"

This ensures that concurrent Get() operations are properly synchronized
and will eventually succeed even under high contention.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-20 11:57:06 +08:00
Zhen Ye
f6411abbd7
fix: panic when streaming coord shutdown but query coord still work (#45695)
issue: #44984

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-20 11:07:06 +08:00
Bingyi Sun
a3add6a391
fix: Fix json indices can not be loaded (#45620)
issue: https://github.com/milvus-io/milvus/issues/45575

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-20 10:41:06 +08:00
Zhen Ye
87f9a79a6a
fix: inconsistent proxy cache when multiple DDL is executing with DML (#45698)
issue: #45697

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-20 02:53:06 +08:00
Buqian Zheng
5b85f0e4dc
enhance: updated multiple places where the expr copies the input values in every loop (#45680)
issue: https://github.com/milvus-io/milvus/issues/45679

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-20 01:51:07 +08:00
Gao
8ee8c01bcf
enhance: prefetch vector chunks for sealed non-indexed segments (#45665)
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-11-19 18:39:07 +08:00
cai.zhang
03a244844e
fix: Set task init when worker doesn't have task (#45675)
issue: #45674

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-19 18:03:07 +08:00