milvus/tests/python_client/requirements.txt
congqixia 6f94d8c41a
fix: Handle legacy binlog format (v1) in segment load diff computation (#46598)
When computing load diff, binlogs in v1/legacy format have empty
child_fields. In this case, the field_id itself should be used as the
child_id (group_id == field_id for legacy format).

Without this fix, legacy format binlogs are not recognized during diff
computation, causing segments to fail loading and TestProxy to time out.

Changes:
- Add fallback to use field_id as child_id when child_fields is empty
- Add LoadDiff::ToString() for debugging
- Add logging for diff in Load/Reopen operations
- Add comprehensive unit tests for legacy format handling
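
The fallback from the first change can be sketched roughly as follows. This is a minimal Python illustration of the idea only, not the actual segcore code; `normalize_child_ids` and its parameters are hypothetical names chosen for this example.

```python
from typing import Iterable, List


def normalize_child_ids(field_id: int, child_fields: Iterable[int]) -> List[int]:
    """Return the child ids for one field binlog entry.

    Hypothetical helper: legacy (v1) binlogs carry no child_fields, so the
    field_id itself stands in as the single child id (group_id == field_id).
    """
    children = list(child_fields)
    return children if children else [field_id]


# Legacy (v1) entry: empty child_fields falls back to the field id.
print(normalize_child_ids(101, []))
# Non-legacy entry: child_fields are used unchanged.
print(normalize_child_ids(0, [100, 101, 102]))
```

Because the fallback only triggers on an empty list, entries that already carry child_fields pass through untouched.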

Related to #46594

- Core invariant: load-diff computation must enumerate every binlog
child group for a field so current vs new segment state comparisons
include all column-group/binlog groups; for legacy (v1) binlogs that
have empty child_fields, the code must treat group_id == field_id to
preserve that mapping.
- Bug fix (resolves #46594): SegmentLoadInfo now normalizes
field_binlog.child_fields() into a vector and falls back to using
field_id as the single child group when child_fields is empty; the same
normalization is applied for both current and new-info paths, ensuring
legacy v1 binlogs are discovered and included in Load/ComputeDiff
results so segments load correctly.
- Logic simplified: removed the implicit assumption that child_fields is
always present by centralizing a single normalization/fallback step used
symmetrically for both diff paths, avoiding ad-hoc special-casing and
unifying iteration over child groups.
- No data loss / no behavior regression: the fallback only activates
when child_fields is empty — non-legacy binlogs continue to use their
child_fields unchanged. Add/drop semantics are preserved because the
same normalization is applied to both sides of the diff. Unit tests
(v1-only, v4-only, mixed cases) were added to validate correctness;
LoadDiff::ToString() and extra logging are diagnostic only.
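
The symmetric normalization described in the notes above might look like the sketch below. It assumes a simplified, hypothetical data shape (a dict mapping each binlog group id to its child_fields list) and hypothetical names (`child_pairs`, `compute_diff`, `to_load`, `to_drop`); the real LoadDiff structure may differ.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical shape: group_id (field_id of the binlog entry) -> child_fields.
FieldBinlogs = Dict[int, List[int]]


def child_pairs(binlogs: FieldBinlogs) -> Set[Tuple[int, int]]:
    """Flatten binlogs into (child_id, group_id) pairs.

    An empty child_fields list (legacy v1 format) falls back to the
    group id itself, so group_id == field_id for those entries.
    """
    pairs = set()
    for group_id, child_fields in binlogs.items():
        for child_id in (child_fields or [group_id]):
            pairs.add((child_id, group_id))
    return pairs


def compute_diff(current: FieldBinlogs, new: FieldBinlogs) -> Dict[str, list]:
    """Compare both sides after applying the same normalization to each."""
    cur, nxt = child_pairs(current), child_pairs(new)
    return {"to_load": sorted(nxt - cur), "to_drop": sorted(cur - nxt)}


# Legacy-only example: field 102 is new, nothing is dropped.
legacy_current = {100: [], 101: []}
legacy_new = {100: [], 101: [], 102: []}
print(compute_diff(legacy_current, legacy_new))
# {'to_load': [(102, 102)], 'to_drop': []}
```

Applying one normalization step to both sides keeps the add and drop sets consistent; without the fallback, the legacy entries above would contribute no pairs at all and field 102 would never appear in the diff.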

Co-authored-by: Cai Zhang <cai.zhang@zilliz.com>

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-25 23:33:19 +08:00


--extra-index-url https://test.pypi.org/simple/
pytest-cov==2.8.1
requests>=2.32.4
scikit-learn>=1.5.2
timeout_decorator==0.5.0
ujson==5.5.0
pytest==8.3.4
pytest-asyncio==0.24.0
pytest-assume==2.4.3
pytest-timeout==1.3.3
pytest-repeat==0.8.0
allure-pytest==2.7.0
pytest-print==0.2.1
pytest-level==0.1.1
pytest-xdist==2.5.0
pytest-rerunfailures==14.0
pytest_tagging==1.6.0
ndg-httpsclient
pyopenssl
pyasn1
pytest-html==3.1.1
delayed-assert==0.3.5
kubernetes==17.17.0
PyYAML==6.0
pytest-sugar==0.9.5
pytest-parallel
pytest-random-order
# pymilvus
pymilvus==2.7.0rc95
pymilvus[bulk_writer]==2.7.0rc95
# for protobuf
protobuf>=5.29.5
# for customize config test
python-benedict==0.24.3
timeout-decorator==0.5.0
# for bulk insert test
minio==7.2.0
npy-append-array==0.9.15
Faker==19.2.0
# for benchmark
h5py==3.8.0
# for log
loguru==0.7.0
# util
psutil==5.9.4
pandas==1.5.3
numpy==1.26.4
tenacity==8.1.0
rich==13.7.0
# for standby test
etcd-sdk-python==0.0.6
deepdiff==8.6.1
# for test result analyzer
prettytable==3.8.0
pyarrow==14.0.1
fastparquet==2023.7.0
# for bf16 datatype
ml-dtypes==0.2.0
# for full text search
tantivy==0.22.0
bm25s==0.2.0
jieba==0.42.1
Unidecode==1.3.8
# for perf test
locust==2.25.0
# for supporting higher python version
typing_extensions==4.12.2
# for env configuration
python-dotenv<=2.0.0
# for geometry
shapely==2.1.1
# for time zone
tzdata>=2024.1