2128 Commits

Author SHA1 Message Date
Feilong Hou
a1721bb47b
test: add more test case on partial update (#46628)
Issue: #46627  
add one more test case to cover duplicate pk partial update

 On branch feature/partial-update
 Changes to be committed:
	modified:   milvus_client/test_milvus_client_partial_update.py

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: upserts with partial_update=True consolidate records
by primary key (PK) rather than creating duplicate rows; this test
verifies the partial-update upsert path preserves PK identity and merge
semantics.
- Change: adds test
test_milvus_client_partial_update_duplicate_pk_partial_update which
inserts duplicate-PK batches, then calls client.upsert(...,
partial_update=True) on a subset of fields and asserts final row count
equals default_nb, exercising the partial-update code path (upsert →
partial update handling → query) not previously covered.
- No production logic removed/simplified: this PR only adds test
coverage (no code paths removed or altered); nothing in production code
is changed or simplified by the PR.
- No data loss or regression introduced: the test validates concrete
code paths — upsert with partial_update True followed by
query(out_fields/with_vec, pk checks) — and asserts deduplication
(2×default_nb → default_nb). Because the PR only adds assertions against
existing behavior and does not modify runtime logic, it cannot cause
data loss or behavioral regressions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-12-30 16:01:26 +08:00
Feilong Hou
69a2d202b0
test: cover more timesamptz e2e (#46575)
Issue: #46424  
test:add_collection_field(invalid_default_value)
       hybrid_search(NOT supported_
simplify some test cases using one single collection to save time.
       query with different time shift and timezone settings

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: TIMESTAMPTZ values are treated as absolute instants
(timezone-preserving). Tests assume conversions between stored instants
and display timezones/time-shifts are deterministic and reversible; the
PR validates queries/reads across different timezone and time-shift
settings against that invariant.

- Removed/simplified logic: duplicated per-test create/insert/teardown
flows and several isolated timestamptz unit cases (edge_case, Feb_29,
partial_update, standalone query) were consolidated into a module-scoped
fixture that creates a single COLLECTION_NAME, inserts ROWS, and handles
teardown. This removes redundant setup/teardown code and repeated
scaffolding while preserving the same API exercise points
(create_collection, insert, query, alter_collection_properties,
alter_database_properties, describe_collection, describe_database).

- No data loss or behavior regression: only test code was reorganized
and new assertions exercise the same production APIs and code paths used
previously (create_collection → insert → query / alter_properties →
describe). The fixture inserts the same ROWS and tests still
convert/compare timestamptz values via cf.convert_timestamptz and query
check routines; the new invalid-default-value test only asserts error
handling when adding a TIMESTAMPTZ field with an invalid default and
does not mutate persisted data or change production logic.

- PR type (Enhancement/Test): expands and reorganizes E2E test coverage
for TIMESTAMPTZ—centralizes collection setup to reduce runtime and
flakiness, adds explicit coverage for invalid-default-value behavior,
and increases timezone/time-shift query scenarios without altering
product behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-12-30 15:59:21 +08:00
zhuwenxing
e3a85be435
test: replace parquet with jsonl for EventRecords and RequestRecords in checker (#46671)
/kind improvement

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: tests' persistence of EventRecords and RequestRecords
must be append-safe under concurrent writers; this PR replaces Parquet
with JSONL and uses per-file locks and explicit buffer flushes to
guarantee atomic, append-safe writes (EventRecords uses event_lock +
append per line; RequestRecords buffers under request_lock and flushes
to file when threshold or on sink()).

- Logic removed/simplified and rationale: DataFrame-based parquet
append/read logic (pyarrow/fastparquet) and implicit parquet buffering
were removed in favor of simple line-oriented JSON writes and explicit
buffer management. The complex Parquet append/merge paths were redundant
because parquet append under concurrent test-writer patterns caused
corruption; JSONL removes the append-mode complexity and the
parquet-specific buffering/serialization code.

- Why no data loss or behavior regression (concrete code paths):
EventRecords.insert writes a complete JSON object per event under
event_lock to /tmp/ci_logs/event_records_*.jsonl and get_records_df
reads every JSON line under the same lock (or returns an empty DataFrame
with the same schema on FileNotFound/Error), preserving all fields
event_name/event_status/event_ts. RequestRecords.insert appends to an
in-memory buffer under request_lock and triggers _flush_buffer() when
len(buffer) >= 100; _flush_buffer() writes each buffered JSON line to
/tmp/ci_logs/request_records_*.jsonl and clears the buffer; sink() calls
_flush_buffer() under request_lock before get_records_df() reads the
file — ensuring all buffered records are persisted before reads. Both
read paths handle FileNotFoundError and exceptions by returning empty
DataFrames with identical column schemas, so external callers see the
same API and no silent record loss.

- Enhancement summary (concrete): Replaces flaky Parquet append/read
with JSONL + explicit locking and deterministic flush semantics,
removing the root cause of parquet append corruption in tests while
keeping the original DataFrame-based analysis consumers unchanged
(get_records_df returns equivalent schemas).
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-30 14:13:21 +08:00
zhuwenxing
f4f5e0f4dc
test: add highlighter in checker (#46289)
/kind improvement

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
  * Updated test suite dependencies to pymilvus 2.7.0rc91.
  * Enhanced text highlighting validation in test checkers.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-30 14:11:26 +08:00
jac
898e6d6e94
fix: [master]update ngram test error codes to match actual server responses (#46675)
- Updates e2e test error code and message after pymilvus update

see #46677

Signed-off-by: silas.jiang <silas.jiang@zilliz.com>
Co-authored-by: silas.jiang <silas.jiang@zilliz.com>
2025-12-30 13:07:20 +08:00
jiamingli-maker
ebe82db4fe
test: Add HNSW_PQ test cases and update HNSW_SQ (#46604)
/kind improvement

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: test infrastructure treats insertion granularity as
orthogonal to data semantics—bulk generation
gen_row_data_by_schema(nb=2000, start=0, random_pk=False) yields the
same sequential PKs and vector payloads as prior multi-batch inserts, so
tests relying on collection lifecycle, flush, index build, load and
search behave identically.
- What changed / simplified: added a full HNSW_PQ parameterized test
suite (tests/python_client/testcases/indexes/idx_hnsw_pq.py and
test_hnsw_pq.py) and simplified HNSW_SQ test insertion by replacing
looped per-batch generation+insert with a single bulk
gen_row_data_by_schema(...) + insert. The per-batch PK sequencing and
repeated vector generation were redundant for correctness and were
removed to reduce complexity.
- Why this does NOT cause data loss or behavior regression: the
post-insert code paths remain unchanged—tests still call client.flush(),
create_index(...), util.wait_for_index_ready(), collection.load(), and
perform searches that assert describe_index and search outputs. Because
start=0 and random_pk=False reproduce identical sequential PKs (0..1999)
and the same vectors, index creation and search validation operate on
identical data and index parameters, preserving previous assertions and
outcomes.
- New capability: comprehensive HNSW_PQ coverage (build params: M,
efConstruction, m, nbits, refine, refine_type; search params: ef,
refine_k) across vector types (FLOAT_VECTOR, FLOAT16_VECTOR,
BFLOAT16_VECTOR, INT8_VECTOR) and metrics (L2, IP, COSINE), implemented
as data-driven tests to validate success and failure/error messages for
boundary, type-mismatch and inter-parameter constraints.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: zilliz <jiaming.li@zilliz.com>
2025-12-30 10:07:21 +08:00
nico
db3f065a61
test: test case json_contains_any is not stable (#46506)
### **User description**
issue: #46367


___

### **PR Type**
Bug fix, Tests


___

### **Description**
- Fix unstable test case by adjusting float precision

- Change listMix float value from 1.1 to 1.111

- Improves test stability for json_contains_any query


___

### Diagram Walkthrough


```mermaid
flowchart LR
  A["Test Data Generation"] -- "Adjust float precision" --> B["listMix field value"]
  B -- "1.1 to 1.111" --> C["Improved test stability"]
```



<details><summary><h3>File Walkthrough</h3></summary>

<table><thead><tr><th></th><th align="left">Relevant
files</th></tr></thead><tbody><tr><td><strong>Bug
fix</strong></td><td><table>
<tr>
  <td>
    <details>
<summary><strong>test_milvus_client_query.py</strong><dd><code>Adjust
float precision in test data</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary>
<hr>

tests/python_client/milvus_client/test_milvus_client_query.py

<ul><li>Modified test data generation in
<br><code>test_milvus_client_query_expr_all_datatype_json_contains_all</code>
method<br> <li> Changed <code>listMix</code> field float value from
<code>1.1</code> to <code>1.111</code> for improved <br>precision<br>
<li> Addresses test instability issue by adjusting floating-point test
data</ul>


</details>


  </td>
<td><a
href="https://github.com/milvus-io/milvus/pull/46506/files#diff-d6fe357e4678415bc62596b802571043fa571c7d1b8e841aa43124437dd2f739">+1/-1</a>&nbsp;
&nbsp; &nbsp; </td>

</tr>
</table></td></tr></tbody></table>

</details>

___



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: the test assumes stable float equality/containment
behavior for JSON-typed fields when generating test rows; small changes
in stored float precision can make json_contains_any assertions flaky.
- Exact fix for the bug (refs #46367): in
tests/python_client/milvus_client/test_milvus_client_query.py, the test
data value for the second element of the "listMix" JSON field was
adjusted from i * 1.1 to i * 1.111 in
test_milvus_client_query_expr_all_datatype_json_contains_all to increase
numeric precision and remove instability in json_contains_any
assertions.
- Logic removed/simplified: no production logic was changed or removed —
only a one-line test-data change. There is no control-flow or
algorithmic simplification because the test’s intent and checks remain
identical; the change removes the redundant dependence on a borderline
float value that caused flakiness.
- No data loss or behavior regression: this change only updates
test-generated input
(test_milvus_client_query_expr_all_datatype_json_contains_all) and does
not touch any library or runtime code paths. Production code paths
(query parsing/execution, JSON handling) are unchanged, so no persisted
data, API behavior, or client logic is affected.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: nico <cheng.yuan@zilliz.com>
2025-12-26 15:13:19 +08:00
congqixia
6f94d8c41a
fix: Handle legacy binlog format (v1) in segment load diff computation (#46598)
When computing load diff, binlogs in v1/legacy format have empty
child_fields. In this case, the field_id itself should be used as the
child_id (group_id == field_id for legacy format).

Without this fix, legacy format binlogs are not recognized during diff
computation, causing segments to fail loading and TestProxy to timeout.

Changes:
- Add fallback to use fieldid as child_id when child_fields is empty
- Add LoadDiff::ToString() for debugging
- Add logging for diff in Load/Reopen operations
- Add comprehensive unit tests for legacy format handling

Related to #46594

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: load-diff computation must enumerate every binlog
child group for a field so current vs new segment state comparisons
include all column-group/binlog groups; for legacy (v1) binlogs that
have empty child_fields, the code must treat group_id == field_id to
preserve that mapping.
- Bug fix (resolves #46594): SegmentLoadInfo now normalizes
field_binlog.child_fields() into a vector and falls back to using
field_id as the single child group when child_fields is empty; the same
normalization is applied for both current and new-info paths, ensuring
legacy v1 binlogs are discovered and included in Load/ComputeDiff
results so segments load correctly.
- Logic simplified: removed the implicit assumption that child_fields is
always present by centralizing a single normalization/fallback step used
symmetrically for both diff paths, avoiding ad-hoc special-casing and
unifying iteration over child groups.
- No data loss / no behavior regression: the fallback only activates
when child_fields is empty — non-legacy binlogs continue to use their
child_fields unchanged. Add/drop semantics are preserved because the
same normalization is applied to both sides of the diff. Unit tests
(v1-only, v4-only, mixed cases) were added to validate correctness;
LoadDiff::ToString() and extra logging are diagnostic only.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Cai Zhang <cai.zhang@zilliz.com>

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-25 23:33:19 +08:00
aoiasd
342ba550bf
enhance: update highlight ci (#46573)
relate: https://github.com/milvus-io/milvus/issues/46571

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: the LexicalHighlighter API now expects the match
queries under the parameter name highlight_query (not queries); all call
sites must pass highlight_query to supply match data. This PR assumes
the underlying highlighter behavior and processing of those query values
are unchanged.
- Logic simplified/removed: removed the legacy keyword queries in tests
and updated calls to use highlight_query
(tests/python_client/milvus_client/test_milvus_client_highlighter.py).
This eliminates a redundant/incorrect keyword alias and aligns tests
with the consolidated LexicalHighlighter constructor parameter name.
- Why this does NOT introduce data loss or behavior regression: the
change is a parameter-name rename only — no parsing, matching, or
storage logic was modified. Tests now construct LexicalHighlighter with
pre_tags/post_tags/highlight_search_text/fragment_* and pass the query
list under highlight_query; the highlighter execution path
(client.search → highlighter processing → result['highlight']) is
untouched, so existing highlight outputs and stored data remain
unchanged.
- Other changes: bumped pymilvus test dependency to 2.7.0rc93 in
tests/python_client/requirements.txt to match the updated constructor
signature; scope of change is limited to tests and dependency pinning
(no production code changes).
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-24 19:07:18 +08:00
marcelo-cjl
3b599441fd
feat: Add nullable vector support for proxy and querynode (#46305)
related: #45993 

This commit extends nullable vector support to the proxy layer,
querynode,
and adds comprehensive validation, search reduce, and field data
handling
    for nullable vectors with sparse storage.
    
    Proxy layer changes:
- Update validate_util.go checkAligned() with getExpectedVectorRows()
helper
      to validate nullable vector field alignment using valid data count
- Update checkFloatVectorFieldData/checkSparseFloatVectorFieldData for
      nullable vector validation with proper row count expectations
- Add FieldDataIdxComputer in typeutil/schema.go for logical-to-physical
      index translation during search reduce operations
- Update search_reduce_util.go reduceSearchResultData to use
idxComputers
      for correct field data indexing with nullable vectors
- Update task.go, task_query.go, task_upsert.go for nullable vector
handling
    - Update msg_pack.go with nullable vector field data processing
    
    QueryNode layer changes:
    - Update segments/result.go for nullable vector result handling
- Update segments/search_reduce.go with nullable vector offset
translation
    
    Storage and index changes:
- Update data_codec.go and utils.go for nullable vector serialization
- Update indexcgowrapper/dataset.go and index.go for nullable vector
indexing
    
    Utility changes:
- Add FieldDataIdxComputer struct with Compute() method for efficient
      logical-to-physical index mapping across multiple field data
- Update EstimateEntitySize() and AppendFieldData() with fieldIdxs
parameter
    - Update funcutil.go with nullable vector support functions

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Full support for nullable vector fields (float, binary, float16,
bfloat16, int8, sparse) across ingest, storage, indexing, search and
retrieval; logical↔physical offset mapping preserves row semantics.
  * Client: compaction control and compaction-state APIs.

* **Bug Fixes**
* Improved validation for adding vector fields (nullable + dimension
checks) and corrected search/query behavior for nullable vectors.

* **Chores**
  * Persisted validity maps with indexes and on-disk formats.

* **Tests**
  * Extensive new and updated end-to-end nullable-vector tests.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: marcelo-cjl <marcelo.chen@zilliz.com>
2025-12-24 10:13:19 +08:00
Feilong Hou
e4b0f48bc0
test: add e2e test cases for highlighter (#46505)
### **User description**
Issue: #46504 
test: create e2e test case for highlighter

 On branch feature/highlighter
 Changes to be committed:
	new file:   milvus_client/test_milvus_client_highlighter.py


___

### **PR Type**
Tests


___

### **Description**
- Add comprehensive e2e test suite for LexicalHighlighter functionality

- Test highlighter initialization with collection setup and data
insertion

- Validate highlighter with various parameters (tags, fragments,
offsets)

- Test edge cases including Chinese characters, long text, and invalid
inputs

- Verify error handling for invalid fragment sizes, offsets, and
configurations


___

### Diagram Walkthrough


```mermaid
flowchart LR
  A["Test Suite Setup"] --> B["Highlighter Init Tests"]
  B --> C["Valid Test Cases"]
  C --> D["Fragment Parameters"]
  C --> E["Search Variations"]
  C --> F["Language Support"]
  B --> G["Invalid Test Cases"]
  G --> H["Parameter Validation"]
  G --> I["Error Handling"]
```



<details><summary><h3>File Walkthrough</h3></summary>

<table><thead><tr><th></th><th align="left">Relevant
files</th></tr></thead><tbody><tr><td><strong>Tests</strong></td><td><table>
<tr>
  <td>
    <details>

<summary><strong>test_milvus_client_highlighter.py</strong><dd><code>Add
comprehensive LexicalHighlighter e2e test suite</code>&nbsp; &nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; </dd></summary>
<hr>

tests/python_client/milvus_client/test_milvus_client_highlighter.py

<ul><li>Create new test file with 1163 lines of comprehensive
highlighter test <br>cases<br> <li> Implement
<code>TestMilvusClientHighlighterInit</code> class to initialize
<br>collection with pre-defined test data including English, Chinese,
and <br>long text samples<br> <li> Implement
<code>TestMilvusClientHighlighterValid</code> class with 15+ test
methods <br>covering basic usage, multiple tags, fragment parameters,
offsets, <br>numbers, sentences, and language support<br> <li> Implement
<code>TestMilvusClientHighlighterInvalid</code> class with 8+ test
<br>methods validating error handling for invalid parameters and
<br>configurations<br> <li> Test highlighter with BM25 search, text
matching, and various analyzer <br>configurations</ul>


</details>


  </td>
<td><a
href="https://github.com/milvus-io/milvus/pull/46505/files#diff-443e3fefb65fbdb088d5920083306ecfe3605745b1e2714198c6566ca67b3736">+1163/-0</a></td>

</tr>
</table></td></tr></tbody></table>

</details>

___



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
  * Added a comprehensive highlighter test suite covering:
- Core highlighting with single and multi-analyzer setups and multi-tag
variations
    - Fragment parameter behaviors and edge cases (size, offset, count)
- Text-match and query-based highlighting, including BM25 and vector
interactions
- Sub-word, long-text/tag, case sensitivity, Chinese/multi-language
scenarios
- Error handling for invalid parameters, no-match cases, and other edge
conditions
- Module-scoped fixture preparing multilingual, long-form test data and
teardown

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-12-24 09:49:19 +08:00
Buqian Zheng
e379b1f0f4
enhance: moved query optimization to proxy, added various optimizations (#45526)
issue: https://github.com/milvus-io/milvus/issues/45525

see added README.md for added optimizations

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added query expression optimization feature with a new `optimizeExpr`
configuration flag to enable automatic simplification of filter
predicates, including range predicate optimization, merging of IN/NOT IN
conditions, and flattening of nested logical operators.

* **Bug Fixes**
* Adjusted delete operation behavior to correctly handle expression
evaluation.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-24 00:39:19 +08:00
foxspy
ab03521588
fix: fix chunk iterator merge order (#46461)
issue: #46349 
When using brute-force search, the iterator results from multiple chunks
are merged; at that point, we need to pay attention to how the metric
affects result ranking.

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-12-23 10:33:17 +08:00
jiamingli-maker
b9fe8e9f9e
test: add HNSW_SQ test cases (#46428)
/kind improvement
/assign @yanliang567

Signed-off-by: zilliz <jiaming.li@zilliz.com>
2025-12-22 11:29:18 +08:00
nico
51350f4ef8
test: optimize ci test about compaction and flush (#46097)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2025-12-20 12:37:21 +08:00
Feilong Hou
a7eb327746
test: fix unstable timestamptz test cases (#46403)
Issue: #46333 

test: re-write convert timestamp logic to cover daylight saving time

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-12-17 21:13:16 +08:00
aoiasd
df80f54151
feat: support use user's file as dictionary for analyzer filter (#46145)
relate: https://github.com/milvus-io/milvus/issues/43687

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-12-16 11:45:16 +08:00
Feilong Hou
971085b033
test: enable debug_mode to observe test case instability. (#46341)
Issue: #46333

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-12-15 17:55:16 +08:00
zhuwenxing
3aa0b769e5
test: add unique error message collection in chaos checker (#46262)
/kind improvement

- Add normalize_error_message function to extract and normalize error
text
- Collect unique error messages during chaos test execution
- Display error details in assertion messages for better debugging

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-11 13:49:12 +08:00
zhuwenxing
75d6f0d509
test: add ST_ISVALID geometry function test cases (#46232)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-11 13:47:21 +08:00
Buqian Zheng
85a7a7b1e3
fix: skip json path index if the query path includes number (#46200)
issue: #45511

our tantivy inverted index currently does not include item index if the
value is an array, thus we can't do `a[0] == 'b'` type of look up in the
inverted index. for such, we need to skip the index and use brute force
search.

we may improve our index in the future, so this is a temp solution

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-12-10 13:59:13 +08:00
zhuwenxing
f9ff0e8402
test: add testcases for add/alter/drop text embedding function (#46229)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-09 19:23:14 +08:00
zhuwenxing
abe0318bec
test: use predefined fake_de instead of creating new Faker instances to reduce run time (#46194)
related: https://github.com/milvus-io/milvus/issues/46014

/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-09 17:59:14 +08:00
Feilong Hou
624147740b
test: fix timestamptz e2e case failure on Jenkins Weekly (#46210)
Issue: #46188 

Bug was caused by inconsistent version of tzdata as well as wrong month
assignment in convert_timestamptz function.
Also fix when debug_mode=True the compare function can correctly return
True or False.

---------

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-12-09 16:09:15 +08:00
zhuwenxing
8fac376afd
test: upgrade minio sdk from 7.1.5 to 7.2.0 (#46186)
/kind improvement


fix minio sdk error
```
2025-12-08T07:54:48Z {container="step-test"} def _put_object(self, bucket_name, object_name, data, headers, 

2025-12-08T07:54:48Z {container="step-test"} query_params=None): 

2025-12-08T07:54:48Z {container="step-test"} """Execute PutObject S3 API.""" 

2025-12-08T07:54:48Z {container="step-test"} response = self._execute( 

2025-12-08T07:54:48Z {container="step-test"} "PUT", 

2025-12-08T07:54:48Z {container="step-test"} bucket_name, 

2025-12-08T07:54:48Z {container="step-test"} object_name, 

2025-12-08T07:54:48Z {container="step-test"} body=data, 

2025-12-08T07:54:48Z {container="step-test"} headers=headers, 

2025-12-08T07:54:48Z {container="step-test"} query_params=query_params, 

2025-12-08T07:54:48Z {container="step-test"} no_body_trace=True, 

2025-12-08T07:54:48Z {container="step-test"} ) 

2025-12-08T07:54:48Z {container="step-test"} return ObjectWriteResult( 

2025-12-08T07:54:48Z {container="step-test"} bucket_name, 

2025-12-08T07:54:48Z {container="step-test"} object_name, 

2025-12-08T07:54:48Z {container="step-test"} >           response.getheader("x-amz-version-id"), 

2025-12-08T07:54:48Z {container="step-test"} response.getheader("etag").replace('"', ""), 

2025-12-08T07:54:48Z {container="step-test"} response.getheaders(), 

2025-12-08T07:54:48Z {container="step-test"} ) 

2025-12-08T07:54:48Z {container="step-test"} E       AttributeError: 'HTTPResponse' object has no attribute 'getheader' 

2025-12-08T07:54:48Z {container="step-test"}  

2025-12-08T07:54:48Z {container="step-test"} /usr/local/lib/python3.10/site-packages/minio/api.py:1582: AttributeError
```

---------

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-09 11:39:12 +08:00
zhuwenxing
4fe41ff14d
test: add emb list recall test (#46135)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-08 19:21:13 +08:00
XuanYang-cn
2cb6073bdd
test: Increase PyMilvus version to 2.7.0rc83 for master branch (#46072)
Automated daily bump from pymilvus master branch. Updates
tests/python_client/requirements.txt.

Signed-off-by: XuanYang-cn <xuan.yang@zilliz.com>
2025-12-04 17:33:20 +08:00
nico
43fe215787
test: update sdk version and skip some debug log (#46040)
Signed-off-by: nico <cheng.yuan@zilliz.com>
2025-12-04 10:33:11 +08:00
wei liu
f85e86a6ec
fix: change upsert duplicate PK behavior from dedup to error (#45997)
issue: #44320

Replace the DeduplicateFieldData function with CheckDuplicatePkExist
that returns an error when duplicate primary keys are detected in the
same batch, instead of silently deduplicating.

Changes:
- Replace DeduplicateFieldData with CheckDuplicatePkExist in util.go
- Update upsertTask.PreExecute to return error on duplicate PKs
- Simplify helper function from findLastOccurrenceIndices to
hasDuplicates
- Update unit tests to verify the new error behavior
- Add Python integration tests for duplicate PK error cases

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-12-04 10:23:11 +08:00
Feilong Hou
dd3797f3ac
test: add timestamptz to more bulk writer case (#46016)
Issue #46015
 <test>: <add timestamptz to more bulk writer>

 On branch feature/timestamps
 Changes to be committed:
	modified:   testcases/test_bulk_insert.py

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-12-03 10:05:10 +08:00
yanliang567
13a52016ac
test: Update hybrid search tests with milvus client (#46003)
related issue: https://github.com/milvus-io/milvus/issues/45326

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-12-02 18:11:10 +08:00
zhuwenxing
f68bd44f35
test: unify schema retrieval to use get_schema() method in chaos checker (#45985)
/kind improvement


Replace direct self.schema access and describe_collection() calls with
get_schema() method to ensure consistent schema handling with complete
struct_fields information. Also fix FlushChecker error handling and
change schema log level from info to debug.

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-02 09:43:10 +08:00
zhuwenxing
2d7574b5a3
test: refactor connection method to prioritize uri/token and add query limit (#45901)
- Refactor connection logic to prioritize uri and token parameters over
host/port/user/password for a more modern connection approach
- Add explicit limit parameter (limit=5) to search and query operations
in chaos checkers to avoid returning unlimited results
- Migrate test_all_collections_after_chaos.py from Collection wrapper to
MilvusClient API style
- Update pytest fixtures in chaos test files to support uri/token params

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-27 19:25:07 +08:00
zhuwenxing
464a805c63
test: add dynamicfield.enabled property alter in chaos checker (#45625)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-27 14:53:08 +08:00
jac
97da6bd7e3
test: Increase PyMilvus version to 2.7.0rc72 for master branch and fix async teardown logic (#45809)
Signed-off-by: silas.jiang <silas.jiang@zilliz.com>
Co-authored-by: silas.jiang <silas.jiang@zilliz.com>
2025-11-25 11:01:07 +08:00
zhuwenxing
256e073e8d
test: add more testcases for geo and struct (#45414)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-25 10:51:06 +08:00
yanliang567
1da75c0ee2
test: Update hybrid search tests to milvus client style (#45772)
related issue: #45326

---------

Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
2025-11-24 17:55:07 +08:00
Feilong Hou
228eb0f5d0
test: add more test cases and add bulk insert scenario (#45770)
Issue: #45756 
1. add bulk insert scenario
 2. fix small issue in e2e cases
 3. add search group by test case
 4. add timestampstz to gen_all_datatype_collection_schema
5. modify partial update testcase to ensure correct result from
timestamptz field

 On branch feature/timestamps
 Changes to be committed:
	modified:   common/bulk_insert_data.py
	modified:   common/common_func.py
	modified:   common/common_type.py
	modified:   milvus_client/test_milvus_client_partial_update.py
	modified:   milvus_client/test_milvus_client_timestamptz.py
	modified:   pytest.ini
	modified:   testcases/test_bulk_insert.py

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-11-24 15:21:06 +08:00
Feilong Hou
0231a3edf8
test: enable all timestamptz case (#45128)
Issue: #44518

---------

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-11-21 11:03:06 +08:00
qixuan
3202847092
test: add field case about dynamic and compaction (#45694)
related issue: #42126

Signed-off-by: qixuan <673771573@qq.com>
2025-11-21 10:07:05 +08:00
XuanYang-cn
b95bbaffae
test: Increase PyMilvus version to 2.7.0rc60 for master branch (#45600)
Automated daily bump from pymilvus master branch. Updates
tests/python_client/requirements.txt.

Signed-off-by: XuanYang-cn <xuan.yang@zilliz.com>
2025-11-20 16:53:08 +08:00
zhikunyao
aa0870d2ff
test: add e2e-v2 helm for amd (#45621)
Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>
2025-11-20 13:45:11 +08:00
zhuwenxing
e0df44481d
test: refactor checker to using milvus client (#45524)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-20 11:59:08 +08:00
Bingyi Sun
1ba75eea62
enhance: skip test_milvus_client_search_json_path_index_default (#45604)
To prevent this issue from blocking other PRs, we are temporarily
disabling this test. A proper fix will be implemented before the 2.6.6
release.

issue: https://github.com/milvus-io/milvus/issues/45511

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-18 10:54:09 +08:00
wei liu
7aed88113c
enhance: Deduplicate primary keys in upsert request batch (#45249)
issue: #44320

This change adds deduplication logic to handle duplicate primary keys
within a single upsert batch, keeping the last occurrence of each
primary key.

Key changes:
- Add DeduplicateFieldData function to remove duplicate PKs from field
data, supporting both Int64 and VarChar primary keys
- Refactor fillFieldPropertiesBySchema into two separate functions:
validateFieldDataColumns for validation and fillFieldPropertiesOnly for
property filling, improving code clarity and reusability
- Integrate deduplication logic in upsertTask.PreExecute to
automatically deduplicate data before processing
- Add comprehensive unit tests for deduplication with various PK types
(Int64, VarChar) and field types (scalar, vector)
- Add Python integration tests to verify end-to-end behavior

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-11-17 21:35:40 +08:00
Spade A
0454cdaab3
fix: remove validateFieldName in dropIndex (#45460)
issue: https://github.com/milvus-io/milvus/issues/45459

This check is unnecessary when dropping index.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-11-14 10:17:37 +08:00
Feilong Hou
db42b2df75
test: fix partial update chaos checker (#45492)
Issue: #45489 
<fix>: <fix partial update chaos checker>

 On branch feature/partial-update
 Changes to be committed:
	modified:   chaos/checker.py

---------

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-11-13 17:29:37 +08:00
XuanYang-cn
e31eec2921
test: Increase PyMilvus version to 2.7.0rc56 for master branch (#45515)
Automated daily bump from pymilvus master branch. Updates
tests/python_client/requirements.txt.

Signed-off-by: XuanYang-cn <xuan.yang@zilliz.com>
2025-11-13 11:33:41 +08:00
Zhen Ye
4797bb6ab2
fix: wrong update timetick of collection meta info (#45461)
issue: #45403, #45463

- fix the Nightly E2E failures.
- fix the wrong update timetick of altering collection to fix the
related load failure.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-11-11 16:01:36 +08:00
zhuwenxing
b1af0df9f3
test: add struct array mmap testcases (#45309)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-10 16:49:36 +08:00