issue: #44320
This change adds deduplication logic to handle duplicate primary keys
within a single upsert batch, keeping the last occurrence of each
primary key.
Key changes:
- Add DeduplicateFieldData function to remove duplicate PKs from field
data, supporting both Int64 and VarChar primary keys
- Refactor fillFieldPropertiesBySchema into two separate functions:
validateFieldDataColumns for validation and fillFieldPropertiesOnly for
property filling, improving code clarity and reusability
- Integrate deduplication logic in upsertTask.PreExecute to
automatically deduplicate data before processing
- Add comprehensive unit tests for deduplication with various PK types
(Int64, VarChar) and field types (scalar, vector)
- Add Python integration tests to verify end-to-end behavior
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Issue: #45129
<test>: <add new test case>
<also delete duplicate test case>
On branch feature/partial-update
Changes to be committed:
modified: milvus_client/test_milvus_client_partial_update.py
modified: milvus_client/test_milvus_client_upsert.py
---------
Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
related issue: #44425
1. split insert.py into a few files: upsert.py, insert.py,
partial_upsert.py ...
2. add test for allow insert auto id
---------
Signed-off-by: yanliang567 <yanliang.qiao@zilliz.com>