cqy123456 eb63a363bd
enhance: support minhash function in DIDO (#45322)
issue : https://github.com/milvus-io/milvus/issues/41746
This PR adds MinHash "DIDO" (Data In, Data Out) support to Milvus, which
allows computing MinHash signatures on-the-fly during search operations
instead of requiring pre-stored vectors.

  Key changes:
- Implemented SIMD-optimized C++ MinHash computation (AVX2/AVX512 for
x86, NEON/SVE for ARM)
- Added runtime CPU detection and function hooks to automatically select
the best SIMD implementation
- Integrated MinHash computation into search pipeline (brute force
search, growing segment search)
- Added support for LSH-based MinHash search with configurable band
width and bit width parameters
- Enabled direct text-to-signature conversion during query execution,
reducing storage overhead

This enables efficient text deduplication and similarity search without
storing pre-computed MinHash vectors.

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2026-01-14 14:19:27 +08:00
..