issue: #44320
Replace the DeduplicateFieldData function with CheckDuplicatePkExist
that returns an error when duplicate primary keys are detected in the
same batch, instead of silently deduplicating.
Changes:
- Replace DeduplicateFieldData with CheckDuplicatePkExist in util.go
- Update upsertTask.PreExecute to return error on duplicate PKs
- Simplify helper function from findLastOccurrenceIndices to
hasDuplicates
- Update unit tests to verify the new error behavior
- Add Python integration tests for duplicate PK error cases
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Issue #46015
<test>: <add timestamptz to more bulk writer>
On branch feature/timestamps
Changes to be committed:
modified: testcases/test_bulk_insert.py
Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
/kind improvement
Replace direct self.schema access and describe_collection() calls with
get_schema() method to ensure consistent schema handling with complete
struct_fields information. Also fix FlushChecker error handling and
change schema log level from info to debug.
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
- Refactor connection logic to prioritize uri and token parameters over
host/port/user/password for a more modern connection approach
- Add explicit limit parameter (limit=5) to search and query operations
in chaos checkers to avoid returning unlimited results
- Migrate test_all_collections_after_chaos.py from Collection wrapper to
MilvusClient API style
- Update pytest fixtures in chaos test files to support uri/token params
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
Issue: #45756
1. add bulk insert scenario
2. fix small issue in e2e cases
3. add search group by test case
4. add timestampstz to gen_all_datatype_collection_schema
5. modify partial update testcase to ensure correct result from
timestamptz field
On branch feature/timestamps
Changes to be committed:
modified: common/bulk_insert_data.py
modified: common/common_func.py
modified: common/common_type.py
modified: milvus_client/test_milvus_client_partial_update.py
modified: milvus_client/test_milvus_client_timestamptz.py
modified: pytest.ini
modified: testcases/test_bulk_insert.py
Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
To prevent this issue from blocking other PRs, we are temporarily
disabling this test. A proper fix will be implemented before the 2.6.6
release.
issue: https://github.com/milvus-io/milvus/issues/45511
---------
Signed-off-by: sunby <sunbingyi1992@gmail.com>
issue: #44320
This change adds deduplication logic to handle duplicate primary keys
within a single upsert batch, keeping the last occurrence of each
primary key.
Key changes:
- Add DeduplicateFieldData function to remove duplicate PKs from field
data, supporting both Int64 and VarChar primary keys
- Refactor fillFieldPropertiesBySchema into two separate functions:
validateFieldDataColumns for validation and fillFieldPropertiesOnly for
property filling, improving code clarity and reusability
- Integrate deduplication logic in upsertTask.PreExecute to
automatically deduplicate data before processing
- Add comprehensive unit tests for deduplication with various PK types
(Int64, VarChar) and field types (scalar, vector)
- Add Python integration tests to verify end-to-end behavior
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #45403, #45463
- fix the Nightly E2E failures.
- fix the wrong update timetick of altering collection to fix the
related load failure.
Signed-off-by: chyezh <chyezh@outlook.com>
Issue: #45129
<test>: <add new test case>
<also delete duplicate test case>
On branch feature/partial-update
Changes to be committed:
modified: milvus_client/test_milvus_client_partial_update.py
modified: milvus_client/test_milvus_client_upsert.py
---------
Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
issue: #43897
- Alter collection/database is implemented by WAL-based DDL framework
now.
- Support AlterCollection/AlterDatabase in wal now.
- Alter operation can be synced by new CDC now.
- Refactor some UT for alter DDL.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Issue: #44989
On branch feature/json-shredding
Changes to be committed:
modified: milvus_client/test_milvus_client_query.py
Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/44399
This PR implements STL_SORT for VARCHAR data type for both RAM and MMAP
mode.
The general idea is that we deduplicate field values and maintains a
posting list for each unique value.
The serialization format of the index is:
```
[unique_count][string_offsets][string_data][post_list_offsets][post_list_data][magic_code]
string_offsets: array of offsets into string_data section
string_data: str_len1, str1, str_len2, str2, ...
post_list_offsets: array of offsets into post_list_data section
post_list_data: post_list_len1, row_id1, row_id2, ..., post_list_len2, row_id1, row_id2, ...
```
---------
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Issue: #44518
On branch feature/timestamps
Changes to be committed:
modified: common/common_func.py
new file: milvus_client/test_milvus_client_timestamptz.py
---------
Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
related issue: #44425
1. split insert.py into a few files: upsert.py, insert.py,
partial_upsert.py ...
2. add test for allow insert auto id
---------
Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>
The collection not found err could contains db id in err message, which
is not meaningful to users.
This patch make error message wrapping dbname instead of db id.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>