3 Commits

Author SHA1 Message Date
foxspy
53a300db83
enhance: update knowhere version (#45564)
issue: #42937 

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: callers must explicitly close output streams (call
Close()) instead of relying on RemoteOutputStream's destructor to
perform closure.
- Logic removed/simplified: RemoteOutputStream's destructor no longer
closes or asserts on the underlying arrow::io::OutputStream; an explicit
public Close() method was added and closure responsibility moved to that
code path.
- Why this is safe (no data loss/regression): callers now invoke Close()
before reading or destroying streams (e.g.,
DiskFileManagerTest::ReadAndWriteWithStream calls os->Close() before
opening the input stream). Write paths remain unchanged
(RemoteOutputStream::Write -> output_stream_->Write). Close() invokes
output_stream_->Close() and asserts on the returned status, so data is
flushed and confirmed through the same API as before. Removing the
destructor-side asserts prevents unexpected failures during object
destruction without changing write/close semantics.
- Chore: updated third-party pins.
internal/core/thirdparty/knowhere CMakeLists.txt:
KNOWHERE_VERSION -> a59816e;
internal/core/thirdparty/milvus-common CMakeLists.txt:
MILVUS-COMMON-VERSION -> b6629f7.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
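
The explicit-close contract described above can be modeled in a few lines. This is only a Python analogy for illustration: the real RemoteOutputStream is C++, and the class and function names below (`ExplicitCloseStream`, `write_then_read`) are hypothetical, not part of the Milvus codebase.

```python
import io


class ExplicitCloseStream:
    """Hypothetical stand-in for a stream whose owner must call close()."""

    def __init__(self, sink: io.BytesIO) -> None:
        self._sink = sink
        self._pending = bytearray()
        self.closed = False

    def write(self, data: bytes) -> None:
        if self.closed:
            raise ValueError("write after close")
        self._pending.extend(data)

    def close(self) -> None:
        """Flush buffered bytes to the sink; errors surface here, not in a finalizer."""
        if self.closed:
            return
        self._sink.write(bytes(self._pending))
        self._pending.clear()
        self.closed = True

    # Deliberately no __del__: destroying the object never flushes or asserts.


def write_then_read(payload: bytes) -> bytes:
    """Mirror the pattern from the note: close the writer before reading back."""
    sink = io.BytesIO()
    stream = ExplicitCloseStream(sink)
    stream.write(payload)
    stream.close()  # explicit close before the data is read
    return sink.getvalue()
```

In this sketch, skipping `close()` leaves the sink empty, which is the kind of mistake the caller-side `Close()` calls are meant to rule out.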

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2026-01-07 10:39:24 +08:00
jiamingli-maker
c10cf53b4b
test: Add HNSW_PRQ test cases and fix HNSW_PQ (#46680)
/kind improvement

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: index parameter validation and test expectations for
the HNSW family must be explicit, consistent, and deterministic. This
PR enforces that by adding exhaustive parameter matrices for HNSW_PRQ
(tests/python_client/testcases/indexes/{idx_hnsw_prq.py,
test_hnsw_prq.py}) and normalizing expectations in idx_hnsw_pq.py via a
shared success variable.
- Logic removed / simplified: brittle, ad-hoc string expectations were
consolidated. Literal "success" occurrences were replaced with a single
success variable, and ambiguous short error messages were replaced by the
canonical descriptive error text. This reduces duplicated assertion
logic in tests and removes dependence on fragile, truncated messages.
- Bug fix (tests): corrected HNSW_PQ test expectations to assert the
full, authoritative error for invalid PQ m ("The dimension of the vector
(dim) should be a multiple of the number of subquantizers (m).") and
aligned HNSW_PRQ test matrices (idx_hnsw_prq.py) to the same explicit
expectations. The change targets test assertions only and fixes false
negatives caused by mismatched messages.
- No data loss or behavior regression: only test code is added/modified
(tests/python_client/testcases/indexes/*). Production code paths remain
unmodified: collection creation, insert/flush, client.create_index,
wait_for_index_ready, load_collection, search, and client.describe_index
are invoked by tests but not changed. Persisted data, index artifacts,
and runtime behavior are therefore unaffected.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
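
A minimal sketch of the shared-success-variable pattern, assuming a simplified data-driven layout. The error text is the one quoted above; the names `SUCCESS`, `DIM_M_ERROR`, `expected_outcome`, and `CASES` are hypothetical and do not reproduce the actual contents of idx_hnsw_pq.py or idx_hnsw_prq.py. Only the dim/m divisibility rule is modeled here; the real parameter checks cover more constraints.

```python
# Single shared marker instead of repeating the literal "success" in every case.
SUCCESS = "success"

# Full error text as quoted in the release note above.
DIM_M_ERROR = (
    "The dimension of the vector (dim) should be a multiple of "
    "the number of subquantizers (m)."
)


def expected_outcome(dim: int, m: int) -> str:
    """Hypothetical helper: model only the dim % m == 0 constraint."""
    return SUCCESS if dim % m == 0 else DIM_M_ERROR


# Hypothetical case matrix: each entry pairs build params with one expectation.
CASES = [
    {"dim": 128, "m": 32, "expected": SUCCESS},
    {"dim": 128, "m": 1, "expected": SUCCESS},
    {"dim": 128, "m": 3, "expected": DIM_M_ERROR},
    {"dim": 100, "m": 8, "expected": DIM_M_ERROR},
]


def check_cases(cases: list) -> list:
    """Return the cases whose stated expectation disagrees with the model."""
    return [c for c in cases if expected_outcome(c["dim"], c["m"]) != c["expected"]]
```

Because every case refers to the same two constants, a wording change in the error text needs one edit rather than one per case.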

Signed-off-by: zilliz <jiaming.li@zilliz.com>
2026-01-04 18:57:22 +08:00
jiamingli-maker
ebe82db4fe
test: Add HNSW_PQ test cases and update HNSW_SQ (#46604)
/kind improvement

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: test infrastructure treats insertion granularity as
orthogonal to data semantics. Bulk generation with
gen_row_data_by_schema(nb=2000, start=0, random_pk=False) yields the
same sequential PKs and vector payloads as prior multi-batch inserts, so
tests relying on collection lifecycle, flush, index build, load and
search behave identically.
- What changed / simplified: added a full HNSW_PQ parameterized test
suite (tests/python_client/testcases/indexes/idx_hnsw_pq.py and
test_hnsw_pq.py) and simplified HNSW_SQ test insertion by replacing
looped per-batch generation+insert with a single bulk
gen_row_data_by_schema(...) + insert. The per-batch PK sequencing and
repeated vector generation were redundant for correctness and were
removed to reduce complexity.
- Why this does NOT cause data loss or behavior regression: the
post-insert code paths remain unchanged. Tests still call client.flush(),
create_index(...), util.wait_for_index_ready(), and collection.load(),
then run searches and assert on the describe_index and search outputs.
Because start=0 and random_pk=False reproduce identical sequential PKs
(0..1999) and the same vectors, index creation and search validation
operate on identical data and index parameters, preserving previous
assertions and outcomes.
- New capability: comprehensive HNSW_PQ coverage (build params: M,
efConstruction, m, nbits, refine, refine_type; search params: ef,
refine_k) across vector types (FLOAT_VECTOR, FLOAT16_VECTOR,
BFLOAT16_VECTOR, INT8_VECTOR) and metrics (L2, IP, COSINE), implemented
as data-driven tests to validate success and failure/error messages for
boundary, type-mismatch and inter-parameter constraints.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
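
The batch-versus-bulk equivalence claimed above can be illustrated with a small stand-in generator. This is a sketch under stated assumptions: `gen_rows` and `gen_in_batches` are hypothetical and are not the real gen_row_data_by_schema, whose signature and vector generation are not reproduced here. Vectors are derived deterministically from the primary key so the two paths can be compared directly.

```python
def gen_rows(nb: int, start: int = 0, dim: int = 4) -> list:
    """Hypothetical generator: sequential PKs from `start`, vectors derived from the PK."""
    return [
        {"id": pk, "vector": [float(pk + j) for j in range(dim)]}
        for pk in range(start, start + nb)
    ]


def gen_in_batches(total: int, batch: int, dim: int = 4) -> list:
    """Old-style looped generation: one call per batch, each offset by `start`."""
    rows = []
    for start in range(0, total, batch):
        rows.extend(gen_rows(min(batch, total - start), start=start, dim=dim))
    return rows


# Bulk generation in a single call, analogous to nb=2000, start=0.
bulk = gen_rows(2000, start=0)
batched = gen_in_batches(2000, batch=500)
same_data = bulk == batched
```

Under these assumptions the two paths produce the same rows, so anything downstream of the insert sees identical input regardless of how the rows were generated.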

Signed-off-by: zilliz <jiaming.li@zilliz.com>
2025-12-30 10:07:21 +08:00