3574 Commits

Author SHA1 Message Date
Amit Kumar
388d56fdc7
enhance: Add support for minimum_should_match in text_match (parser, engine, client, and tests) (#44988)
Please see: https://github.com/milvus-io/milvus/issues/44593 for the
background

This PR supersedes https://github.com/milvus-io/milvus/pull/44638,
which can now be closed. The review comments on the original
implementation suggested a better alternative approach; this new PR
implements it.

---

This PR

- Adds an optional `minimum_should_match` argument to `text_match(...)`
and wires it through the parser, planner/visitor, index bindings, and
client-level tests/examples so full-text queries can require a minimum
number of tokens to match.

Motivation
- Provide a way to require an expression to match a minimum number of
tokens in lexical search.

What changed
- Parser / grammar
  - Added grammar rule and token: `MINIMUM_SHOULD_MATCH` and
    `textMatchOption` in `internal/parser/planparserv2/Plan.g4`.
  - Regenerated parser outputs: `internal/parser/planparserv2/generated/*`
    (parser, lexer, visitor, etc.) to support the new rule.
- Planner / visitor
  - `parser_visitor.go`: parse and validate the `minimum_should_match`
    integer; propagate as an extra value on the `TextMatch` expression so
    downstream components receive it.
  - Added `VisitTextMatchOption` visitor method handling.
- Client (Golang)
  - Added a unit test to verify
    `text_match(..., minimum_should_match=...)` appears in the generated
    DSL and is accepted by client code:
    `client/milvusclient/read_test.go` (new test coverage).
  - Added an integration-style test for the feature to the go-client
    testcase suite: `tests/go_client/testcases/full_text_search_test.go`
    (exercise min=1, min=3, large min).
  - Added an example demonstrating `text_match` usage:
    `client/milvusclient/read_example_test.go` (example name conforms to
    godoc mapping).
- Engine / index
  - Updated C++ index interface: `TextMatchIndex::MatchQuery`
  - Added/updated unit tests for the index behavior:
    `internal/core/src/index/TextMatchIndexTest.cpp`.
- Tantivy binding
  - Added `match_query_with_minimum` implementation and unit tests to
    `internal/core/thirdparty/tantivy/tantivy-binding/src/index_reader_text.rs`
    that construct boolean queries with minimum required clauses.

Behavioral / compatibility notes
- This adds an optional argument to `text_match` only; default behavior
(no `minimum_should_match`) is unchanged.
- Internal API change: `TextMatchIndex::MatchQuery` signature changed
(internal component). Callers in the repo were updated accordingly.
- Parser changes required regenerating ANTLR outputs 
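
To illustrate the intended semantics, here is a minimal Python sketch. It
is not the Milvus implementation: the helper names `text_match_expr` and
`matches` are hypothetical, and whitespace tokenization stands in for the
real analyzer. Only the `text_match(..., minimum_should_match=...)`
expression form is taken from this PR.

```python
from typing import Optional


def text_match_expr(field: str, query: str,
                    minimum_should_match: Optional[int] = None) -> str:
    """Build a filter string (hypothetical helper, not a client API).

    Omitting the option yields the plain two-argument form, so the
    default behavior is unchanged.
    """
    if minimum_should_match is None:
        return f'text_match({field}, "{query}")'
    return (f'text_match({field}, "{query}", '
            f'minimum_should_match={minimum_should_match})')


def matches(document: str, query: str, minimum_should_match: int = 1) -> bool:
    """Toy model: a document matches when at least `minimum_should_match`
    distinct query tokens occur in it. Assumes whitespace tokenization."""
    doc_tokens = set(document.lower().split())
    query_tokens = set(query.lower().split())
    return len(doc_tokens & query_tokens) >= minimum_should_match


doc = "milvus is a vector database"
print(text_match_expr("text", "vector database search", 2))
print(matches(doc, "vector database search", 2))   # two tokens overlap
print(matches(doc, "vector database search", 3))   # "search" is absent
```

In this toy model, a minimum larger than the number of query tokens can
never be satisfied, which corresponds to the "large min" test case.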

Tests and verification
- New/updated tests:
  - Go client unit test: `client/milvusclient/read_test.go` (mocked Search
    request asserts DSL contains `minimum_should_match=2`).
  - Go e2e-style test:
    `tests/go_client/testcases/full_text_search_test.go` (exercises min=1, 3
    and a large min).
  - C++ unit tests for index behavior:
    `internal/core/src/index/TextMatchIndexTest.cpp`.
  - Rust binding unit tests for `match_query_with_minimum`.
- Local verification commands to run:
  - Go client tests: `cd client && go test ./milvusclient -run ^$` (client
    package)
  - Go testcases:
    `cd tests/go_client && go test ./testcases -run TestTextMatchMinimumShouldMatch`
    (requires a running Milvus instance)
  - C++ unit tests / build: run core build/test per repo instructions (the
    change touches core index code).
  - Rust binding tests:
    `cd internal/core/thirdparty/tantivy/tantivy-binding && cargo test`
    (if developing locally).

---------

Signed-off-by: Amit Kumar <amit.kumar@reddit.com>
Co-authored-by: Amit Kumar <amit.kumar@reddit.com>
2025-11-07 16:07:11 +08:00
aoiasd
6102f001a9
enhance: skip check source id (#45377)
relate:https://github.com/milvus-io/milvus/issues/45381

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-07 15:19:34 +08:00
congqixia
4a6e8d822c
enhance: Bump go version to 1.24.9 (#45359)
Fixing CVE-2025-58187

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-07 10:13:35 +08:00
yanliang567
a2282d61cb
test: Add more async tests (#45327)
related issue: #45326

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-11-06 15:43:33 +08:00
XuanYang-cn
d036fd5422
test: Increase PyMilvus version to 2.7.0rc54 for master branch (#45273)
Automated daily bump from pymilvus master branch. Updates
tests/python_client/requirements.txt.

Signed-off-by: XuanYang-cn <xuan.yang@zilliz.com>
2025-11-05 19:35:33 +08:00
congqixia
1e48911825
enhance: [GoSDK] Support struct array field type (#45291)
Related to #42148

Add comprehensive support for struct array field type in the Go SDK,
including data structure definitions, column operations, schema
construction, and full test coverage.

**Struct Array Column Implementation (`client/column/struct.go`)**
- Add `columnStructArray` type to handle struct array fields
- Implement `Column` interface methods:
  - `NewColumnStructArray()`: Create new struct array column from
    sub-fields
  - `Name()`, `Type()`: Basic metadata accessors
  - `Slice()`: Support slicing across all sub-fields
  - `FieldData()`: Convert to protobuf `StructArrayField` format
  - `Get()`: Retrieve struct values as `map[string]any`
  - `ValidateNullable()`, `CompactNullableValues()`: Nullable support
- Placeholder implementations for unsupported operations (AppendValue,
GetAsX, IsNull, AppendNull)

**Struct Array Parsing (`client/column/columns.go`)**
- Add `parseStructArrayData()` function to parse `StructArrayField` from
protobuf
- Update `FieldDataColumn()` to detect and parse struct array fields
- Support range-based slicing for struct array data

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-11-05 15:43:33 +08:00
zhuwenxing
06933c25b8
test: add geometry datatype in import testcases (#45014)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-04 16:55:33 +08:00
zhenshan.cao
6327c9a514
fix: Fix bugs related to TimestampTz (#45111)
issue: https://github.com/milvus-io/milvus/issues/44527
https://github.com/milvus-io/milvus/issues/44537
https://github.com/milvus-io/milvus/issues/44538
https://github.com/milvus-io/milvus/issues/44585
https://github.com/milvus-io/milvus/issues/44622

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-11-04 16:51:33 +08:00
Feilong Hou
9e4975bdfa
test: added test case for partial update on duplicate pk (#45130)
Issue: #45129
 <test>: <add new test case>
 <also delete duplicate test case>

 On branch feature/partial-update
 Changes to be committed:
	modified:   milvus_client/test_milvus_client_partial_update.py
	modified:   milvus_client/test_milvus_client_upsert.py

---------

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-11-04 15:47:32 +08:00
zhikunyao
7193d01808
test: support e2e-amd helm in gcp milvus cluster (#45175)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-11-04 15:07:32 +08:00
Zhen Ye
576084fe86
enhance: support alter collection/database with WAL-based DDL framework (#45266)
issue: #43897

- Alter collection/database is implemented by WAL-based DDL framework
now.
- Support AlterCollection/AlterDatabase in wal now.
- Alter operation can be synced by new CDC now.
- Refactor some UT for alter DDL.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-04 09:59:33 +08:00
zhuwenxing
434e0847fd
test: remove xfail after fix (#45114)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-03 17:21:37 +08:00
zhuwenxing
a03c398986
test: add import case for struct array (#45146)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-03 17:19:39 +08:00
zhuwenxing
a47c168dd7
test: add json dumps for json string data (#45189)
/kind improvement

issue: https://github.com/milvus-io/milvus/issues/44982

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-03 10:37:33 +08:00
Zhen Ye
309d564796
enhance: support collection and index with WAL-based DDL framework (#45033)
issue: #43897

- Part of collection/index related DDL is implemented by WAL-based DDL
framework now.
- Support following message type in wal, CreateCollection,
DropCollection, CreatePartition, DropPartition, CreateIndex, AlterIndex,
DropIndex.
- Part of collection/index related DDL can be synced by new CDC now.
- Refactor some UT for collection/index DDL.
- Add Tombstone scheduler to manage the tombstone GC for collection or
partition meta.
- Move the vchannel allocation into streaming pchannel manager.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-30 14:24:08 +08:00
congqixia
511a04a6a5
enhance: Refactor go_client test wrapper to use embedding and improve test structure (#45113)
Related to #45105

This commit refactors the test MilvusClient wrapper to leverage Go's
embedding pattern and improves test organization with subtests.

**File**: `tests/go_client/base/milvus_client.go`

- **Use `typeutil.NewSet` for rate limiting**: Replace map-based
`rateLogMethods` with `typeutil.NewSet` for cleaner and more efficient
membership checking
- **Embed `*client.Client` directly**: Change `MilvusClient` structure
from wrapping the client as a field to embedding it directly
- **Remove ~380 lines of wrapper methods**: All wrapper methods
(database, collection, partition, index, read/write, RBAC, etc.) are now
unnecessary thanks to Go's embedding feature, which automatically
promotes embedded methods to the outer type
- **Simplify initialization**: Update `NewMilvusClient` and `Close` to
use embedded client directly
- **Fix typo**: Correct comment "Ike the actual method" → "Invoke the
actual method"

**File**: `tests/go_client/testcases/search_test.go`

- **Wrap assertions in subtests**: Each search expression test is now
wrapped in `t.Run()` with descriptive names
- **Dynamic subtest naming**: Format:
`expr={expression}_dynamic-{true/false}` for clear test identification

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-28 16:16:10 +08:00
aoiasd
ad9a0cae48
enhance: add global analyzer options (#44684)
relate: https://github.com/milvus-io/milvus/issues/43687
Add global analyzer options, avoid having to merge some milvus params
into user's analyzer params.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-10-28 14:52:10 +08:00
Spade A
ce2862d325
fix: fix parquet import bug in STRUCT (#45028)
issue: https://github.com/milvus-io/milvus/issues/45006
ref: https://github.com/milvus-io/milvus/issues/42148

Previously, the parquet import assumed that a STRUCT in a parquet file
is stored with each struct field in its own column.
However, from the user's perspective, an array of STRUCT (say STRUCT_A)
holds data like this for one row:
[struct{field1_1, field2_1, field3_1}, struct{field1_2,
field2_2, field3_2}, ...], rather than {[field1_1, field1_2, ...],
[field2_1, field2_2, ...], [field3_1, field3_2, field3_3, ...]}.

This PR fixes this.
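
The difference between the two layouts can be sketched in Python. This is
an illustration only, not the import code: `rows_to_columns` is a
hypothetical helper, and plain dicts stand in for parquet structs.

```python
def rows_to_columns(struct_array):
    """Convert one row's array of structs (user-facing layout) into one
    list per sub-field (the column-per-field layout).

    Assumes every struct in the array carries the same field names.
    """
    columns = {}
    for element in struct_array:
        for name, value in element.items():
            columns.setdefault(name, []).append(value)
    return columns


# One row as the user sees it: a list of structs.
row = [
    {"field1": 1, "field2": "a", "field3": 0.1},
    {"field1": 2, "field2": "b", "field3": 0.2},
]
print(rows_to_columns(row))
```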

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-10-27 10:26:06 +08:00
cai.zhang
b069eeecd2
fix: Added GetMetrics back to IndexNodeServer to ensure compatibility (#45073)
issue: #45070

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-10-24 17:00:06 +08:00
zhuwenxing
1e130683be
test: add geometry datatype in checker (#44794)
/kind improvement

---------

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-10-24 11:28:04 +08:00
Spade A
d8591f9548
fix: csv/json import with STRUCT adapts concatenated struct name (#45000)
After https://github.com/milvus-io/milvus/pull/44557, the field name in
STRUCT field becomes STRUCT_NAME[FIELD_NAME]
This PR makes import account for the change.
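
As a rough illustration of the naming scheme, the Python sketch below
builds and splits names of the form STRUCT_NAME[FIELD_NAME]. The helpers
`join_name` and `split_name` are hypothetical and assume that neither part
contains square brackets; the actual import code may differ.

```python
import re

# Matches "STRUCT_NAME[FIELD_NAME]" where neither part contains brackets.
_CONCAT_NAME = re.compile(r"^(?P<struct>[^\[\]]+)\[(?P<field>[^\[\]]+)\]$")


def join_name(struct_name: str, field_name: str) -> str:
    """Build the concatenated sub-field name."""
    return f"{struct_name}[{field_name}]"


def split_name(full_name: str):
    """Return (struct_name, field_name), or None for a plain field name."""
    m = _CONCAT_NAME.match(full_name)
    if m is None:
        return None
    return m.group("struct"), m.group("field")


print(join_name("clips", "embedding"))
print(split_name("clips[embedding]"))
print(split_name("plain_field"))
```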

issue: https://github.com/milvus-io/milvus/issues/45006
ref: https://github.com/milvus-io/milvus/issues/42148

TODO: parquet is much more complex than csv/json, and I will leave it to
a separate PR.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-10-24 10:22:15 +08:00
Feilong Hou
7aa56e1fb6
test: change test_milvus_client_search_json_path_index_all_expressions to L1 (#44986)
Issue: #44989 
On branch feature/json-shredding
Changes to be committed:
modified: milvus_client/test_milvus_client_query.py

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-10-23 16:14:05 +08:00
Spade A
6077178553
enhance: enable STL_SORT to support VARCHAR (#44401)
issue: https://github.com/milvus-io/milvus/issues/44399

This PR implements STL_SORT for VARCHAR data type for both RAM and MMAP
mode.
The general idea is that we deduplicate field values and maintain a
posting list for each unique value.

The serialization format of the index is:
```
[unique_count][string_offsets][string_data][post_list_offsets][post_list_data][magic_code]
string_offsets: array of offsets into string_data section
string_data: str_len1, str1, str_len2, str2, ...
post_list_offsets: array of offsets into post_list_data section
post_list_data: post_list_len1, row_id1, row_id2, ..., post_list_len2, row_id1, row_id2, ...
```
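
A minimal Python sketch of this layout follows. It is not the C++
implementation: the little-endian 32-bit integer widths, the magic value,
and the function names are assumptions made for illustration. Only the
section order comes from the description above.

```python
import struct
from collections import defaultdict

MAGIC_CODE = 0x12345678  # placeholder; the real magic value is not given here


def build_postings(values):
    """Deduplicate values and collect the row ids for each unique value,
    sorted by value so lookups could use binary search."""
    postings = defaultdict(list)
    for row_id, value in enumerate(values):
        postings[value].append(row_id)
    return sorted(postings.items())


def serialize(values) -> bytes:
    entries = build_postings(values)
    string_offsets, post_list_offsets = [], []
    string_data, post_list_data = bytearray(), bytearray()
    for value, row_ids in entries:
        raw = value.encode("utf-8")
        # string_data: str_len, str, ...
        string_offsets.append(len(string_data))
        string_data += struct.pack("<I", len(raw)) + raw
        # post_list_data: post_list_len, row_id1, row_id2, ...
        post_list_offsets.append(len(post_list_data))
        post_list_data += struct.pack(f"<I{len(row_ids)}I", len(row_ids), *row_ids)
    n = len(entries)
    return b"".join([
        struct.pack("<I", n),                      # unique_count
        struct.pack(f"<{n}I", *string_offsets),    # string_offsets
        bytes(string_data),                        # string_data
        struct.pack(f"<{n}I", *post_list_offsets), # post_list_offsets
        bytes(post_list_data),                     # post_list_data
        struct.pack("<I", MAGIC_CODE),             # magic_code
    ])


blob = serialize(["b", "a", "b"])
print(build_postings(["b", "a", "b"]))
print(len(blob))
```

Keeping offsets in separate arrays means a reader can jump to the i-th
string or posting list without scanning the variable-length data before it.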

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-10-23 11:00:05 +08:00
zhuwenxing
b497dd0b45
test: add geometry datatype testcases (#44646)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-10-21 19:56:03 +08:00
zhuwenxing
2f4b66d9ab
test: add struct array testcases (#44940)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-10-20 17:34:03 +08:00
nico
a4935d2eaa
test: update rba test cases 2 (#44954)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2025-10-20 16:32:03 +08:00
zhikunyao
c2ed2cfc39
test: support e2e on arm (#44880)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-10-20 10:44:03 +08:00
Feilong Hou
16ff5db79d
test: Add e2e case for timestamptz (currently skipping them) (#44871)
Issue: #44518

On branch feature/timestamps
Changes to be committed:
    modified: common/common_func.py
    new file: milvus_client/test_milvus_client_timestamptz.py

---------

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-10-20 10:04:02 +08:00
nico
eae6aff644
test: update cases for binary vectors support HNSW (#44928)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2025-10-18 11:56:01 +08:00
zhagnlu
b7935557e1
fix:unified json exists path semantic (#44916)
#44927

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-10-17 16:40:02 +08:00
nico
9f2937fd0f
test: updatec rba test cases (#44863)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2025-10-17 15:14:02 +08:00
Spade A
c4f3f0ce4c
feat: impl StructArray -- support more types of vector in STRUCT (#44736)
ref: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-10-15 10:25:59 +08:00
yihao.dai
cebe923d4a
enhance: Make GetReplicateInfo API work at the pchannel level (#44809)
issue: https://github.com/milvus-io/milvus/issues/44123

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-10-14 15:12:00 +08:00
yanliang567
9e5f9277c0
test: Split insert file and add test for allow insert auto id (#44801)
related issue: #44425
1. split insert.py into a few files: upsert.py, insert.py,
partial_upsert.py ...
2. add test for allow insert auto id

---------

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-10-14 14:28:00 +08:00
zhagnlu
2f178f810f
fix:fix json_contains(path, int) bug (#44814)
#44816

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-10-14 00:19:59 +08:00
XuanYang-cn
a444e2f937
test: Increase PyMilvus version to 2.7.0rc44 for master branch (#44796)
Automated daily bump from pymilvus master branch. Updates
tests/python_client/requirements.txt.

Signed-off-by: XuanYang-cn <xuan.yang@zilliz.com>
2025-10-13 15:54:03 +08:00
XuanYang-cn
a3bdabb328
enhance: Unify compaction executor task state management (#44721)
Remove stopTask.
Replace multiple task tracking maps with single unified taskState map.
Fix slot tracking, improve state transitions, and add comprehensive tests.
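
The idea of a single unified state map can be sketched in Python as below.
This is a hypothetical illustration, not the Go executor: the state names,
the allowed transitions, and the `Executor` class are all assumptions.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    EXECUTING = "executing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition table: terminal states allow no further moves.
_ALLOWED = {
    State.PENDING: {State.EXECUTING, State.FAILED},
    State.EXECUTING: {State.COMPLETED, State.FAILED},
    State.COMPLETED: set(),
    State.FAILED: set(),
}


class Executor:
    def __init__(self):
        # One map instead of separate pending/executing/completed maps:
        # task_id -> (state, slots)
        self.task_state = {}

    def enqueue(self, task_id, slots):
        if task_id in self.task_state:
            return False  # duplicate task
        self.task_state[task_id] = (State.PENDING, slots)
        return True

    def transition(self, task_id, new_state):
        state, slots = self.task_state[task_id]
        if new_state not in _ALLOWED[state]:
            raise ValueError(f"illegal transition {state} -> {new_state}")
        self.task_state[task_id] = (new_state, slots)

    def used_slots(self):
        # Slots are derived from the same map, so they cannot drift.
        active = (State.PENDING, State.EXECUTING)
        return sum(s for st, s in self.task_state.values() if st in active)


ex = Executor()
ex.enqueue(1, 4)
ex.enqueue(2, 2)
ex.transition(1, State.EXECUTING)
ex.transition(1, State.COMPLETED)
print(ex.used_slots())
```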

See also: #44714

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-10-11 17:53:57 +08:00
congqixia
faaf215913
enhance: Bump go version & builder image tag (#44757)
Bump go version to 1.24.6 fixing CVE-2025-47907

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-11 13:49:57 +08:00
congqixia
78b266a44f
enhance: Bump builder image go version to v1.24.6 (#44739)
Bump go version fixing CVE issues

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-11 09:53:57 +08:00
Spade A
208481a070
feat: impl StructArray -- support same names in different STRUCT (#44557)
ref: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-10-10 15:53:56 +08:00
nico
e5378a64bc
test: update test cases (#44651)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2025-10-10 11:35:56 +08:00
congqixia
e55e63288e
enhance: [GoSDK] bump pkg dep to v2.6.3 (#44619)
See also #31293 

Bump pkg dependency version to v2.6.2

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-09 19:43:57 +08:00
congqixia
1185fbec0a
enhance: Bump milvus & proto version to v2.6.3 (#44633)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-30 17:33:49 +08:00
yanliang567
c9f01a73cc
test:Skip unstable test (#44649)
related issue: #44620

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-09-30 16:47:52 +08:00
congqixia
31670c5489
enhance: Use dbName in error message (#44618)
The collection-not-found error could contain the db id in its message,
which is not meaningful to users.

This patch makes the error message use the dbName instead of the db id.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-30 12:25:05 +08:00
yanliang567
7438b00108
test:Fix a space issue for config update (#44630)
related issue: https://github.com/milvus-io/milvus/issues/44623

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-09-29 20:51:06 +08:00
yanliang567
424075d26f
test: Update querynode.segcore.exprEvalBatchSize to 512 to cover multiple batches in segcore (#44624)
related issue: #44623

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-09-29 18:01:05 +08:00
XuanYang-cn
e9aa270713
test: Increase PyMilvus version to 2.7.0rc38 for master branch (#44579)
Automated daily bump from pymilvus master branch. Updates
tests/python_client/requirements.txt.

Signed-off-by: XuanYang-cn <xuan.yang@zilliz.com>
2025-09-29 11:15:06 +08:00
aoiasd
294282f1d2
enhance: support use nullable field as bm25 function input field (#44586)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-29 10:25:05 +08:00
cai.zhang
19346fa389
feat: Geospatial Data Type and GIS Function support for milvus (#44547)
issue: #43427

This PR's main goal is to merge #37417 into milvus 2.5 without conflicts.

# Main Goals

1. Create and describe collections with geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search to display geospatial data
5. Support using GIS functions like ST_EQUALS in query
6. Support R-Tree index for geometry type

# Solution

1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions. Currently only brute-force search is
supported.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing. See the corresponding
changes in pymilvus.
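
To make the WKT-to-WKB step in the data pipeline concrete, here is a
minimal Python sketch for a 2D POINT only. It is hand-rolled for
illustration and is not the proxy's code, which relies on geospatial
libraries; `point_wkt_to_wkb` is a hypothetical name. The byte layout
(byte-order flag, geometry type 1, two doubles) follows the standard WKB
encoding for points.

```python
import re
import struct

_POINT_WKT = re.compile(
    r"^\s*POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)\s*$", re.IGNORECASE
)


def point_wkt_to_wkb(wkt: str) -> bytes:
    """Encode a 2D WKT point as little-endian WKB (21 bytes)."""
    m = _POINT_WKT.match(wkt)
    if m is None:
        raise ValueError(f"not a 2D POINT: {wkt!r}")
    x, y = float(m.group(1)), float(m.group(2))
    # 1 byte: byte order (1 = little endian), 4 bytes: geometry type
    # (1 = Point), then x and y as 8-byte doubles.
    return struct.pack("<BIdd", 1, 1, x, y)


wkb = point_wkt_to_wkb("POINT (1 2)")
print(wkb.hex())
```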

---------

Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>
2025-09-28 19:43:05 +08:00