2004 Commits

Author SHA1 Message Date
aoiasd
55feb7ded8
feat: set related resource ids in collection schema (#46423)
Support crate analyzer with file resource info, and return used file
resource ids when validate analyzer.
Save the related resource ids in collection schema.
relate: https://github.com/milvus-io/milvus/issues/43687

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: analyzer file-resource resolution is deterministic and
traceable by threading a FileResourcePathHelper (collecting used
resource IDs in a HashSet) through all tokenizer/analyzer construction
and validation paths; validate_analyzer(params, extra_info) returns the
collected Vec<i64) which is propagated through C/Rust/Go layers to
callers (CValidateResult → RustResult::from_vec_i64 → Go []int64 →
querypb.ValidateAnalyzerResponse.ResourceIds →
CollectionSchema.FileResourceIds).

- Logic removed/simplified: ad‑hoc, scattered resource-path lookups and
per-filter file helpers (e.g., read_synonyms_file and other inline
file-reading logic) were consolidated into ResourceInfo +
FileResourcePathHelper and a centralized get_resource_path(helper, ...)
API; filter/tokenizer builder APIs now accept &mut
FileResourcePathHelper so all file path resolution and ID collection use
the same path and bookkeeping logic (redundant duplicated lookups
removed).

- Why no data loss or behavior regression: changes are additive and
default-preserving — existing call sites pass extra_info = "" so
analyzer creation/validation behavior and error paths remain unchanged;
new Collection.FileResourceIds is populated from resp.ResourceIds in
validateSchema and round‑tripped through marshal/unmarshal
(model.Collection ↔ schemapb.CollectionSchema) so schema persistence
uses the new list without overwriting other schema fields; proto change
adds a repeated field (resource_ids) which is wire‑compatible (older
clients ignore extra field). Concrete code paths: analyzer creation
still uses create_analyzer (now with extra_info ""), tokenizer
validation still returns errors as before but now also returns IDs via
CValidateResult/RustResult, and rootcoord.validateSchema assigns
resp.ResourceIds → schema.FileResourceIds.

- New capability added: end‑to‑end discovery, return, and persistence of
file resource IDs used by analyzers — validate flows now return resource
IDs and the system stores them in collection schema (affects tantivy
analyzer binding, canalyzer C bindings, internal/util analyzer APIs,
querynode ValidateAnalyzer response, and rootcoord/create_collection
flow).
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-26 22:49:19 +08:00
Zhen Ye
2c2cbe89c2
fix: flush log when os exit (#46608)
issue: #45640

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-26 14:25:18 +08:00
sijie-ni-0214
fc45905ee0
enhance: Optimize QuotaCenter CPU usage (#46388)
issue: https://github.com/milvus-io/milvus/issues/46387

---------

Signed-off-by: sijie-ni-0214 <sijie.ni@zilliz.com>
2025-12-26 10:09:19 +08:00
congqixia
6452d146af
enhance: move jemalloc_stats from pkg to internal/util/segcore (#46560)
Related to #46133

Move jemalloc_stats.go and its test file from pkg/util/hardware to
internal/util/segcore. This is a more appropriate location because:
- jemalloc_stats depends on milvus_core C++ library via cgo
- The pkg directory should remain independent of internal C++
dependencies
- segcore is the natural home for core memory allocator utilities

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Refactor**
* Improved internal code organization by reorganizing memory statistics
collection infrastructure for better maintainability and modularity. No
impact on end-user functionality or behavior.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-24 19:03:18 +08:00
Zhen Ye
1cae4a6194
enhance: support new server label rule for milvus and MILVUS_SERVER_LABEL_RESOURCE_GROUP (#46401)
issue: #46400

- add new server label rule.
- add `MILVUS_SERVER_LABEL_RESOURCE_GROUP` to determine the resource
group of querynode.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Automatic creation of resource groups when nodes with resource-group
labels join.
* Expanded server-label system supporting role-specific and global
labels.

* **Bug Fixes**
* Node acceptance now enforces resource-group name compatibility,
preventing cross-group assignment.

* **Refactor**
* Node handling flows updated to use richer node information for
assignment and validation.

* **Tests**
* Added tests validating resource-group labeling and node acceptance
behavior.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-24 14:23:18 +08:00
marcelo-cjl
3b599441fd
feat: Add nullable vector support for proxy and querynode (#46305)
related: #45993 

This commit extends nullable vector support to the proxy layer,
querynode,
and adds comprehensive validation, search reduce, and field data
handling
    for nullable vectors with sparse storage.
    
    Proxy layer changes:
- Update validate_util.go checkAligned() with getExpectedVectorRows()
helper
      to validate nullable vector field alignment using valid data count
- Update checkFloatVectorFieldData/checkSparseFloatVectorFieldData for
      nullable vector validation with proper row count expectations
- Add FieldDataIdxComputer in typeutil/schema.go for logical-to-physical
      index translation during search reduce operations
- Update search_reduce_util.go reduceSearchResultData to use
idxComputers
      for correct field data indexing with nullable vectors
- Update task.go, task_query.go, task_upsert.go for nullable vector
handling
    - Update msg_pack.go with nullable vector field data processing
    
    QueryNode layer changes:
    - Update segments/result.go for nullable vector result handling
- Update segments/search_reduce.go with nullable vector offset
translation
    
    Storage and index changes:
- Update data_codec.go and utils.go for nullable vector serialization
- Update indexcgowrapper/dataset.go and index.go for nullable vector
indexing
    
    Utility changes:
- Add FieldDataIdxComputer struct with Compute() method for efficient
      logical-to-physical index mapping across multiple field data
- Update EstimateEntitySize() and AppendFieldData() with fieldIdxs
parameter
    - Update funcutil.go with nullable vector support functions

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Full support for nullable vector fields (float, binary, float16,
bfloat16, int8, sparse) across ingest, storage, indexing, search and
retrieval; logical↔physical offset mapping preserves row semantics.
  * Client: compaction control and compaction-state APIs.

* **Bug Fixes**
* Improved validation for adding vector fields (nullable + dimension
checks) and corrected search/query behavior for nullable vectors.

* **Chores**
  * Persisted validity maps with indexes and on-disk formats.

* **Tests**
  * Extensive new and updated end-to-end nullable-vector tests.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: marcelo-cjl <marcelo.chen@zilliz.com>
2025-12-24 10:13:19 +08:00
Buqian Zheng
e379b1f0f4
enhance: moved query optimization to proxy, added various optimizations (#45526)
issue: https://github.com/milvus-io/milvus/issues/45525

see added README.md for added optimizations

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added query expression optimization feature with a new `optimizeExpr`
configuration flag to enable automatic simplification of filter
predicates, including range predicate optimization, merging of IN/NOT IN
conditions, and flattening of nested logical operators.

* **Bug Fixes**
* Adjusted delete operation behavior to correctly handle expression
evaluation.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-24 00:39:19 +08:00
aoiasd
0203aefad1
enhance: add concurrency pool for analyzer (#46185)
relate: https://github.com/milvus-io/milvus/issues/42589

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## New Features
- Added `concurrency_per_cpu_core` configuration parameter for the
analyzer component, enabling customizable per-CPU concurrency tuning
(default: 8).

## Tests
- Added test coverage for batch analysis operations.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-23 16:01:18 +08:00
yihao.dai
5e525eb3bf
enhance: Retry reads from object storage on rate limit error (#46455)
This PR improves the robustness of object storage operations by retrying
both explicit throttling errors (e.g. HTTP 429, SlowDown, ServerBusy).
These errors commonly occur under high concurrency and are typically
recoverable with bounded retries.

issue: https://github.com/milvus-io/milvus/issues/44772

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Configurable retry support for reads from object storage and improved
mapping of transient/rate-limit errors.
* Added a retryable reader wrapper used by CSV/JSON/Parquet/Numpy import
paths.

* **Configuration**
  * New parameter to control storage read retry attempts.

* **Tests**
* Expanded unit tests covering error mapping and retry behaviors across
storage backends.
* Standardized mock readers and test initialization to simplify test
setups.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-12-23 11:03:18 +08:00
Zhen Ye
2edc9ee236
enhance: support milvus version when coordinator startup (#46456)
issue: #46451

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Session versioning added to validate coordinator compatibility during
registration and active takeover.

* **Changes**
* Active–standby flow simplified: standby-to-active activation now
always enabled and initialized unconditionally.
* Registration uses version-aware transactions to ensure version
consistency during takeover.
  * Startup/health startup path streamlined.

* **Tests**
* Added version-key integration test; removed test for disabling
active-standby.
  * Updated flush test to assert rate-limiter errors occur.

* **Chores**
  * Removed centralized connection manager and its test suite.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-22 20:29:18 +08:00
yihao.dai
32809c1053
fix: Remove stale proxy clients on rewatch etcd (#46398)
AddProxyClients now removes clients not in the new snapshot before
adding new ones. This ensures proper cleanup when ProxyWatcher re-watche
etcd.

issue: https://github.com/milvus-io/milvus/issues/46397

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-12-21 19:11:16 +08:00
Spade A
ad8aba7cb4
feat: impl ComputePhraseMatchSlop for compute min slop for phrase match query (#45892)
issue: https://github.com/milvus-io/milvus/issues/45890

ComputePhraseMatchSlop accepts three pararms:
1. A string: query text
2. Some trings: data texts
3. Analyzer params,

Slop will be calculated for the query text with each data text in the
context of phrase match where they are tokenized with tokenizer with
analyzer params.

So two array will be returned:
1. is_match: is phrase match can sucess
2. slop: the related slop if phrase match can sucess, or -1 is cannot.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-12-19 16:03:18 +08:00
junjiejiangjjj
617a77b0bd
enhance: Add embedding model and schema field type checks (#46421)
https://github.com/milvus-io/milvus/issues/46415

- Add output type validation when creating functions
- Fix improper error handling in bulk insert tasks

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-12-19 11:05:19 +08:00
aoiasd
7e4f87e351
fix: Init analyzer at delegator for all field with enable analyzer (#46361)
To support text match highlight
relate: https://github.com/milvus-io/milvus/issues/46308

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-19 10:23:18 +08:00
congqixia
21ed1fabfd
feat: support reopen segment for data/schema changes (#46359)
issue: #46358

This PR implements segment reopening functionality on query nodes,
enabling the application of data or schema changes to already-loaded
segments without requiring a full reload.

### Core (C++)

**New SegmentLoadInfo class**
(`internal/core/src/segcore/SegmentLoadInfo.h/cpp`):
- Encapsulates segment load configuration with structured access
- Implements `ComputeDiff()` to calculate differences between old and
new load states
- Tracks indexes, binlogs, and column groups that need to be loaded or
dropped
- Provides `ConvertFieldIndexInfoToLoadIndexInfo()` for index loading

**ChunkedSegmentSealedImpl modifications**:
- Added `Reopen(const SegmentLoadInfo&)` method to apply incremental
changes based on computed diff
- Refactored `LoadColumnGroups()` and `LoadColumnGroup()` to support
selective loading via field ID map
- Extracted `LoadBatchIndexes()` and `LoadBatchFieldData()` for reusable
batch loading logic
- Added `LoadManifest()` for manifest-based loading path
- Updated all methods to use `SegmentLoadInfo` wrapper instead of direct
proto access

**SegmentGrowingImpl modifications**:
- Added `Reopen()` stub method for interface compliance

**C API additions** (`segment_c.h/cpp`):
- Added `ReopenSegment()` function exposing reopen to Go layer

### Go Side

**QueryNode handlers** (`internal/querynodev2/`):
- Added `HandleReopen()` in handlers.go
- Added `ReopenSegments()` RPC in services.go

**Segment interface** (`internal/querynodev2/segments/`):
- Extended `Segment` interface with `Reopen()` method
- Implemented `Reopen()` in LocalSegment
- Added `Reopen()` to segment loader

**Segcore wrapper** (`internal/util/segcore/`):
- Added `Reopen()` method in segment.go
- Added `ReopenSegmentRequest` in requests.go

### Proto

- Added new fields to support reopen in `query_coord.proto`

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-17 15:49:16 +08:00
XuanYang-cn
0bbb134e39
feat: Enable to backup and reload ez (#46332)
see also: #40013

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-12-16 17:19:16 +08:00
Spade A
f6f716bcfd
feat: impl StructArray -- support embedding searches embeddings in embedding list with element level filter expression (#45830)
issue: https://github.com/milvus-io/milvus/issues/42148

For a vector field inside a STRUCT, since a STRUCT can only appear as
the element type of an ARRAY field, the vector field in STRUCT is
effectively an array of vectors, i.e. an embedding list.
Milvus already supports searching embedding lists with metrics whose
names start with the prefix MAX_SIM_.

This PR allows Milvus to search embeddings inside an embedding list
using the same metrics as normal embedding fields. Each embedding in the
list is treated as an independent vector and participates in ANN search.

Further, since STRUCT may contain scalar fields that are highly related
to the embedding field, this PR introduces an element-level filter
expression to refine search results.
The grammar of the element-level filter is:

element_filter(structFieldName, $[subFieldName] == 3)

where $[subFieldName] refers to the value of subFieldName in each
element of the STRUCT array structFieldName.

It can be combined with existing filter expressions, for example:

"varcharField == 'aaa' && element_filter(struct_field, $[struct_int] ==
3)"

A full example:
```
struct_schema = milvus_client.create_struct_field_schema()
struct_schema.add_field("struct_str", DataType.VARCHAR, max_length=65535)
struct_schema.add_field("struct_int", DataType.INT32)
struct_schema.add_field("struct_float_vec", DataType.FLOAT_VECTOR, dim=EMBEDDING_DIM)

schema.add_field(
    "struct_field",
    datatype=DataType.ARRAY,
    element_type=DataType.STRUCT,
    struct_schema=struct_schema,
    max_capacity=1000,
)
...

filter = "varcharField == 'aaa' && element_filter(struct_field, $[struct_int] == 3 && $[struct_str] == 'abc')"
res = milvus_client.search(
    COLLECTION_NAME,
    data=query_embeddings,
    limit=10,
    anns_field="struct_field[struct_float_vec]",
    filter=filter,
    output_fields=["struct_field[struct_int]", "varcharField"],
)

```
TODO:
1. When an `element_filter` expression is used, a regular filter
expression must also be present. Remove this restriction.
2. Implement `element_filter` expressions in the `query`.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-12-15 12:01:15 +08:00
aoiasd
0c54875832
enhance: ValidateAnalyzer return ValidateAnalyzerResponse instead common.Status (#46292)
Prepare for return more info when validate analyzer.
relate: https://github.com/milvus-io/milvus/issues/43687

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-12 10:35:14 +08:00
sijie-ni-0214
f51de1a8ab
feat: support TruncateCollection api to clear collection data (#46167)
issue: https://github.com/milvus-io/milvus/issues/46166

---------

Signed-off-by: sijie-ni-0214 <sijie.ni@zilliz.com>
2025-12-12 10:31:14 +08:00
zhagnlu
8f0b7983ec
enhance: add jemalloc cached monitor (#46041)
#46133

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-12-09 19:53:13 +08:00
Zhen Ye
459425ac84
fix: wrong context using by session of grpc client (#46183)
issue: #46182

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-08 21:47:12 +08:00
foxspy
3a3163e613
fix: skip gpu init for streaming node (#45650)
issue: #45597

The streaming node currently cannot use GPU resources and does not need
to perform initialization.

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-12-07 13:59:11 +08:00
aoiasd
354ab2f55e
enhance: sync file resource to querynode and datanode (#44480)
relate:https://github.com/milvus-io/milvus/issues/43687
Support use file resource with sync mode.
Auto download or remove file resource to local when user add or remove
file resource.
Sync file resource to node when find new node session.

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-04 16:23:11 +08:00
Zhen Ye
adbdf916e1
enhance: support proxy DML forward (#45921)
issue: #45812

- 2.6 proxy will try to forward DWL to 2.5 proxy if streaming service is
not ready

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-01 19:37:10 +08:00
Zhen Ye
2ef18c5b4f
enhance: remove watch at session liveness check (#45968)
issue: #45724

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-12-01 17:55:10 +08:00
zhagnlu
3901f112ae
enhance: make estimate json stats size more accurate (#45875)
#42533

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-12-01 15:31:10 +08:00
congqixia
ea4bbbda3a
fix: Add EmptySessionWatcher to prevent panic in IndexNodeBinding mode (#45911)
Related to #45910

When IndexNodeBinding mode is enabled, DataCoord skips session watching
for datanodes but the dnSessionWatcher field remains nil. This causes a
panic when other code attempts to access the watcher.

This fix introduces an EmptySessionWatcher as a placeholder for the
IndexNodeBinding mode scenario. The empty watcher implements the
SessionWatcher interface with no-op methods, preventing nil pointer
dereferences while maintaining the expected interface contract.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-28 19:07:08 +08:00
Zhen Ye
c3fe6473b8
enhance: support async write syncer for milvus logging (#45805)
issue: #45640

- log may be dropped if the underlying file system is busy.
- use async write syncer to avoid the log operation block the milvus
major system.
- remove some log dependency from the until function to avoid
dependency-loop.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-28 17:43:11 +08:00
Chun Han
f34eb3ae90
enhance: remove useless code(#30376) (#45685)
related: #30376

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-11-27 10:43:07 +08:00
liliu-z
95ee65e950
fix: Fix race condition in Zilliz client tests (#45832)
issue: #45831

This PR fixes a race condition in `TestZillizClient` test cases where
`t.Logf`
could be called after the test function had returned, leading to a panic
or data race failure.

Root cause:
1. Incorrect defer order: `defer s.Stop()` was called after `defer
lis.Close()`.
This caused `lis.Accept()` to return an error before `s.Stop()` had set
the quit flag,
   triggering the error handling path in the server goroutine.
2. Unsafe logging: The error handling path used `t.Logf` inside a
goroutine, which
   is unsafe if the main test function exits before the log is printed.

Fixes:
- Changed defer order to ensure `s.Stop()` is called before
`lis.Close()`.
- Replaced `t.Logf` with `fmt.Printf` in the background goroutine to
avoid panic on test completion.

Signed-off-by: Li Liu <li.liu@zilliz.com>
2025-11-26 21:13:09 +08:00
wei liu
4d6b130af4
fix: prevent panic in standby mixcoord during shutdown (#45730)
issue: #45728
When mixcoord is in standby mode and shutdown is triggered, the
ProcessActiveStandBy goroutine may panic if context cancellation occurs.
This happens because the error handling didn't check for
context.Canceled errors before panicking.

Changes:
- Add context cancellation check in mix_coord Register() before panic
- Check s.ctx.Err() == context.Canceled and gracefully exit
- Remove unused ForceActiveStandby() function from session_util

This ensures standby mixcoord can shutdown gracefully without panic when
context is cancelled during the standby process.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-11-25 19:27:07 +08:00
congqixia
03f5d7c0a5
enhance: integrate StorageV2 FFI interface for manifest-based segment loading (#45798)
Related to #44956

**New Translator (C++)**
- Added `ManifestGroupTranslator`
(`internal/core/src/segcore/storagev2translator/`)
  - Translates manifest-based column groups to Milvus internal format
  - Implements `GroupCTMeta` interface for chunk-based column access
  - Supports both memory and mmap storage modes
  - Handles cache warmup policies for vector and scalar data

**ChunkedSegmentSealedImpl**
(`internal/core/src/segcore/ChunkedSegmentSealedImpl.cpp:333`)
- Added `LoadColumnGroups(const std::string& manifest_path)`: Main entry
point for manifest-based loading
  - Creates milvus-storage Reader from manifest file
  - Parallelizes column group loading using thread pool
  - Aggregates loading exceptions and reports errors
- Added `LoadColumnGroup()`: Loads individual column group
  - Extracts field IDs from column group metadata
  - Creates ManifestGroupTranslator for each column group
  - Builds ProxyChunkColumn for field access
  - Special handling for timestamp field index construction

**SegmentGrowingImpl**
(`internal/core/src/segcore/SegmentGrowingImpl.cpp`)
- Added similar `LoadColumnGroups()` and `LoadColumnGroup()` methods for
growing segments
- Maintains consistency with sealed segment loading path

Storage FFI Utilities

**loon_ffi/util** (`internal/core/src/storage/loon_ffi/util.cpp`)
- Added `MakeInternalPropertiesFromStorageConfig()`: Converts C storage
config to internal Properties
  - Maps all storage configuration fields (S3, GCS, Azure, local)
  - Handles SSL, IAM, virtual host settings
  - Configures connection timeouts and max connections
- Added `MakeInternalLocalProperies()`: Creates local filesystem
properties
- Added `ToCStorageConfig()`: Converts Go StorageConfig to C
representation
- Added `GetColumnGroups()`: Extracts column groups from manifest file
using Transaction API

Protocol Buffer Changes

**segcore.proto** (`pkg/proto/segcore.proto:121`)
- Added `manifest_path` field to `SegmentLoadInfo` message
- Enables passing manifest file path from Go layer to C++ core

Go Integration

**segment.go** (`internal/util/segcore/segment.go:372`)
- Updated `ConvertToSegcoreSegmentLoadInfo()` to propagate
`ManifestPath` field
- Bridges QueryNode segment load info to Segcore format

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-25 17:27:07 +08:00
groot
a545ebc702
fix: Fix a bug that bulkimport cannot handle empty struct list (#45693)
issue: https://github.com/milvus-io/milvus/issues/42148

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-11-25 17:21:06 +08:00
XuanYang-cn
c082317681
fix: Use base64 to encode not utf-8 bytes (#45655)
See also: #45654

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-11-24 18:23:06 +08:00
congqixia
f51fcc09ae
fix: resolve SessionWatcher goroutine leak and unstable UT in querycoordv2 (#45627)
Related to #44620
Related to unstable ut "internal/querycoordv2 TestServer/TestNodeUp"

Introduce SessionWatcher interface to fix race condition and goroutine
leak that caused unstable unit test TestServer/TestNodeUp.

Changes:
- Add SessionWatcher interface with EventChannel() and Stop() methods
- Refactor WatchServices() to return SessionWatcher instead of raw
channel
- Fix cleanup order in QueryCoordV2: stop watcher before session
- Update DataCoord, ConnectionManager to use SessionWatcher
- Add MockSessionWatcher for testing

Fixes race condition between session context cancellation and internal
loop exit. Eliminates goroutine leak by providing explicit lifecycle
management.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-21 18:33:06 +08:00
junjiejiangjjj
d3164e8030
feat: add configurable batch factor and runtime check bypass for embedding functions (#45592)
https://github.com/milvus-io/milvus/issues/45544
- Add batch_factor configuration parameter (default: 5) to control
embedding provider batch sizes
- Add disable_func_runtime_check property to bypass function validation
during collection creation
- Add database interceptor support for AddCollectionFunction,
AlterCollectionFunction, and DropCollectionFunction requests

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-11-20 19:55:04 +08:00
zhenshan.cao
a3b8bcb198
fix: correct default value backfill during AddField (#45634)
issue: https://github.com/milvus-io/milvus/issues/44585

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-18 23:05:42 +08:00
aoiasd
947c8855f3
feat: support search bm25 with highlight (#44923)
relate: https://github.com/milvus-io/milvus/issues/42589

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-18 16:09:39 +08:00
congqixia
b734de5398
fix: [skip e2e] move gRPC server start after service registration in zilliz client tests (#45645)
The tests were failing with "grpc: Server.RegisterService after
Server.Serve" because setupMockServer() was starting the gRPC server
before tests could register their services. gRPC requires all services
to be registered before Server.Serve() is called.

Changes:
- Remove s.Serve() from setupMockServer() helper function
- Add s.Serve() to each test after service registration
- Apply fix consistently to all 6 affected tests:
  * TestZillizClient_Embedding
  * TestZillizClient_Embedding_Error
  * TestZillizClient_Rerank
  * TestZillizClient_Rerank_Error
  * TestNewZilliClient_WithMockServer
  * TestZillizClient_Embedding_EmptyResponse

This follows the correct gRPC server lifecycle:
1. Create server
2. Register services
3. Start serving

Related to #44620
Case: "internal/util/function/models/zilliz TestZillizClient_Rerank"

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-18 15:31:39 +08:00
congqixia
0a208d7224
enhance: Move segment loading logic from Go layer to segcore for self-managed loading (#45488)
Related to #45060

Refactor segment loading architecture to make segments autonomously
manage their own loading process, moving the orchestration logic from Go
(segment_loader.go) to C++ (segcore).

**C++ Layer (segcore):**
- Added `SetLoadInfo()` and `Load()` methods to `SegmentInterface` and
implementations
- Implemented `ChunkedSegmentSealedImpl::Load()` with parallel loading
strategy:
  - Separates indexed fields from non-indexed fields
  - Loads indexes concurrently using thread pools
  - Loads field data for non-indexed fields in parallel
- Implemented `SegmentGrowingImpl::Load()` to convert and load field
data
- Extracted `LoadIndexData()` as a reusable utility function in
`Utils.cpp`
- Added `SegmentLoad()` C binding in `segment_c.cpp`

**Go Layer:**
- Added `Load()` method to segment interfaces
- Updated mock implementations and test interfaces
- Integrated new C++ `SegmentLoad()` binding in Go segment wrapper

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-14 11:21:37 +08:00
Xiaofan
1c69c7fa17
enhance: Upgrade etcd to 3.5.23 (#44666)
related to #44614
fix the issue embedded etcd are not affected by quota config

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2025-11-14 09:47:38 +08:00
junjiejiangjjj
102481e53f
feat: Support add_function/alter_function/drop_function (#44895)
https://github.com/milvus-io/milvus/issues/44053

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-11-13 20:53:39 +08:00
Gao
09a3195867
enhance: support max_connections config for remote storage (#45225)
related: https://github.com/milvus-io/milvus/issues/45344

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-11-13 15:37:38 +08:00
Spade A
929dc65882
fix: fix index compatibility after upgrade (#45373)
issue: https://github.com/milvus-io/milvus/issues/45380

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-11-13 12:59:38 +08:00
junjiejiangjjj
50f198e346
feat: Support zilliz models (#45168)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-11-13 12:55:37 +08:00
groot
e48fe7f820
fix: Fix bulkimport bug for Struct field (#45474)
issue: https://github.com/milvus-io/milvus/issues/45006

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-11-13 11:31:41 +08:00
Xiaofan
a9895bb904
enhance: add robust handle etcd servercrash (#45304)
related to #45303
fix milvus pod may restart when etcd pod start

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2025-11-13 10:23:36 +08:00
congqixia
382b1d7de6
fix: correct field data offset calculation in rerank functions for bulk search (#45444)
Related to #45338

When using bulk vector search in hybrid search with rerank functions,
the output field values for different queries were all equal to the
values returned by the first query, instead of the correct values
belonging to each document ID. The document IDs were correct, but the
entity field values were wrong.

In rerank functions (RRF, weighted, decay, model), when processing
multiple queries in a batch, the `idLocations` stored only the relative
offset within each result set (`idx`), not accounting for the absolute
position within the entire batch. This caused `FillFieldData` to
retrieve field data from the wrong positions, always using offsets
relative to the first query.

This fix ensures that when processing bulk searches with rerank
functions, each result correctly retrieves its corresponding field data
based on the absolute offset within the entire batch, resolving the
issue where all queries returned the first query's field values.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-11 14:39:41 +08:00
XuanYang-cn
897ac983c8
feat: Add new config and enable to dynamic update configs (#45170)
This PR changes the config layout according to the latest design, and
adds two external credential configs for aws kms

See also: #45169

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-11-10 14:43:35 +08:00
Chun Han
87b466fd83
fix: Group value is nil(#45418) (#45422)
related: #45418

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-11-08 10:29:33 +08:00