milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-28 14:35:27 +08:00

enhance: map multi row groups into one cache cell (#46249)

issue: #45486

Introduce row group batching to coarsen cache cell granularity and
improve memory and disk efficiency. Previously, each Parquet row group
mapped 1:1 to a cache cell. Now, up to `kRowGroupsPerCell` (4) row
groups are merged into one cell. This reduces the number of cache cells
(and the associated overhead) by ~4x while maintaining the same data
granularity for loading.
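
The row-group-to-cell mapping described above can be sketched as follows. This is a minimal illustration, not the code from the PR: only `kRowGroupsPerCell` and its value of 4 come from the commit message, while `CellCount`, `CellOfRowGroup`, and `RowGroupRange` are hypothetical names, and the sketch assumes a single file whose row groups are batched in order.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// kRowGroupsPerCell and its value (4) come from the commit message.
// Every other name in this sketch is hypothetical.
constexpr std::size_t kRowGroupsPerCell = 4;

// Number of cells needed to cover num_row_groups (ceiling division).
constexpr std::size_t
CellCount(std::size_t num_row_groups) {
    return (num_row_groups + kRowGroupsPerCell - 1) / kRowGroupsPerCell;
}

// Index of the cell that holds the given row group.
constexpr std::size_t
CellOfRowGroup(std::size_t row_group) {
    return row_group / kRowGroupsPerCell;
}

// Half-open range [begin, end) of row groups merged into cell_id.
// The last cell may hold fewer than kRowGroupsPerCell row groups.
inline std::pair<std::size_t, std::size_t>
RowGroupRange(std::size_t cell_id, std::size_t num_row_groups) {
    const std::size_t cells = CellCount(num_row_groups);
    if (cell_id >= cells) {
        throw std::out_of_range("cell_id " + std::to_string(cell_id) +
                                " is out of range; cell count is " +
                                std::to_string(cells));
    }
    const std::size_t begin = cell_id * kRowGroupsPerCell;
    const std::size_t end =
        std::min(begin + kRowGroupsPerCell, num_row_groups);
    return {begin, end};
}
```

Under these assumptions, a file with 10 row groups needs 3 cells instead of 10, and the last cell covers only row groups 8 and 9. Each row group is still addressable on its own inside the cell, which is how the loading granularity can stay unchanged.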

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Refactor**
  * Switched to cell-based grouping that merges multiple row groups for more efficient multi-file aggregation and reads.
  * Chunk loading now combines multiple source batches/tables per cell and better supports mmap-backed storage.

* **New Features**
  * Exposed helpers to query row-group ranges and global row-group offsets for diagnostics and testing.
  * Translators now accept chunk-type and mmap/load hints to control on-disk vs in-memory behavior.

* **Bug Fixes**
  * Improved bounds checks and clearer error messages for out-of-range cell requests.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>

2025-12-23 14:57:18 +08:00

build-support

fix: clang format broken under osx (#38427)

2025-01-17 10:43:03 +08:00

cmake

enhance: move c++ unit test file to aside of the production code (#43932)

2025-09-03 23:45:53 +08:00

src

enhance: map multi row groups into one cache cell (#46249)

2025-12-23 14:57:18 +08:00

thirdparty

feat: impl ComputePhraseMatchSlop for compute min slop for phrase match query (#45892)