congqixia 6f94d8c41a
fix: Handle legacy binlog format (v1) in segment load diff computation (#46598)
When computing load diff, binlogs in v1/legacy format have empty
child_fields. In this case, the field_id itself should be used as the
child_id (group_id == field_id for legacy format).

Without this fix, legacy format binlogs are not recognized during diff
computation, causing segments to fail loading and TestProxy to timeout.

Changes:
- Add fallback to use fieldid as child_id when child_fields is empty
- Add LoadDiff::ToString() for debugging
- Add logging for diff in Load/Reopen operations
- Add comprehensive unit tests for legacy format handling

Related to #46594

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: load-diff computation must enumerate every binlog
child group for a field so current vs new segment state comparisons
include all column-group/binlog groups; for legacy (v1) binlogs that
have empty child_fields, the code must treat group_id == field_id to
preserve that mapping.
- Bug fix (resolves #46594): SegmentLoadInfo now normalizes
field_binlog.child_fields() into a vector and falls back to using
field_id as the single child group when child_fields is empty; the same
normalization is applied for both current and new-info paths, ensuring
legacy v1 binlogs are discovered and included in Load/ComputeDiff
results so segments load correctly.
- Logic simplified: removed the implicit assumption that child_fields is
always present by centralizing a single normalization/fallback step used
symmetrically for both diff paths, avoiding ad-hoc special-casing and
unifying iteration over child groups.
- No data loss / no behavior regression: the fallback only activates
when child_fields is empty — non-legacy binlogs continue to use their
child_fields unchanged. Add/drop semantics are preserved because the
same normalization is applied to both sides of the diff. Unit tests
(v1-only, v4-only, mixed cases) were added to validate correctness;
LoadDiff::ToString() and extra logging are diagnostic only.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Cai Zhang <cai.zhang@zilliz.com>

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-12-25 23:33:19 +08:00
..
2021-11-16 15:41:11 +08:00

Tests

E2E Test

Configuration Requirements

Operating System
Operating System Version
Amazon Linux 2023 or above
Ubuntu 20.04 or above
Mac 10.14 or above
Hardware
Hardware Type Recommended Configuration
CPU x86_64 architecture
Intel CPU Sandy Bridge or above
CPU Instruction Set
- SSE4_2
- AVX
- AVX2
- AVX512 or arm64 Linux/MacOS
Memory 16 GB or more
Software
Software Name Version
Docker 19.05 or above
Docker Compose 1.25.5 or above
jq 1.3 or above
kubectl 1.14 or above
helm 3.0 or above
kind 0.10.0 or above

Installing Dependencies

Troubleshooting Docker and Docker Compose
  1. Confirm that Docker Daemon is running
$ docker info
  • Ensure that Docker is installed. Refer to the official installation instructions for Docker CE/EE.

  • Start the Docker Daemon if it is not already started.

  • To run Docker without root privileges, create a user group labeled docker, then add a user to the group with sudo usermod -aG docker $USER. Log out and log back into the terminal for the changes to take effect. For more information, see the official Docker documentation for Managing Docker as a Non-Root User.

  1. Check the version of Docker-Compose
$ docker compose version

docker compose version 1.25.5, build 8a1c60f6
docker-py version: 4.1.0
CPython version: 3.7.5
OpenSSL version: OpenSSL 1.1.1f  31 Mar 2020
Install jq
Install kubectl
Install helm
Install kind

Run E2E Tests

$ cd tests/scripts
$ ./e2e-k8s.sh

Getting help

You can get help with the following command:

$ ./e2e-k8s.sh --help