See #36264
In this PR:
- Enhanced error handling in parse of grouping field.
- Fixed null handling in reduce tasks in proxy nodes.
- Updated tests to reflect changes in error handling and data processing
logic.
---------
Signed-off-by: Ted Xu <ted.xu@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/41435
this PR also makes HasRawData of ChunkedSegmentSealedImpl to return
based on metadata, without needing to load the cache just to answer this
simple question.
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
issue: #41853
- persist the estimated binary size for insert message into wal.
- add metric to record the total growing rows of channel.
Signed-off-by: chyezh <chyezh@outlook.com>
Close the chunk manager's reader after the import completes to prevent
goroutine leaks.
issues: https://github.com/milvus-io/milvus/issues/41868
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Related to #39173
This PR:
- Upgrade milvus-storage commit to fix filesystem finalized issue
- Add bucket-name as prefix for all fs style access io
- Initial arrow fs on querynodes startup
- Fix timestamp access when loading sealed segment
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/41435
this PR also:
1. fixed the skip index for VARCHAR. before this PR, skip index of
VARCHAR uses the minmax of the entire column as the minmax of chunk 0,
and provides no minmax for other chunks.
2. refactored some skip index loading related code
3. partly fixed a bug in test_expr.cpp
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
issue: #41544
- add lock interceptor into wal.
- use recovery and shardmanager to replace the original implementation
of segment assignment.
- remove redundant implementation and unittest.
- remove redundant proto definition.
- use 2 streamingnode in e2e.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Related to #41726#41736
The load field list blocks the new field from being loaded.
`load_fields` shall work as hint after tiered storage support API to
specifiy this behavior.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
When autoID is enabled, the preimport task estimates row distribution by
evenly dividing the total row count (numRows) across all vchannels:
`estimatedCount = numRows / vchannelNum`.
However, the actual import task hashes real auto-generated IDs to
determine
the target vchannel. This mismatch can lead to inaccurate row
distribution estimation
in such corner cases:
- Importing 1 row into 2 vchannels:
• Preimport: 1 / 2 = 0 → both v0 and v1 are estimated to have 0 rows
• Import: real autoID (e.g., 457975852966809057) hashes to v1
→ actual result: v0 = 0, v1 = 1
To resolve such corner case, we now allocate at least one segment for
each vchannel
when autoID is enabled, ensuring all vchannels are prepared to receive
data even
if no rows are estimated for them.
issue: https://github.com/milvus-io/milvus/issues/41759
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #41348
related and optimized by #41347
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: Sangho Park <hoyaspark@gmail.com>
issue: #41544
- Implement in-memory shard manager to maintain the shard state at write
ahead.
- Remove all rpc and meta operation at write ahead, make the segment
assignment logic only use wal and memory.
- Refactor global stats management, add node-level flush policy.
- Fix the recovery storage inconsistency bug when graceful close.
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #41544
- simplify the proto message for flush and create segment.
- simplify the msg handler for flowgraph.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: https://github.com/milvus-io/milvus/issues/41435
this PR is based on https://github.com/milvus-io/milvus/pull/41436.
Improvements include:
- Lazy Load support for Storage v1
- Use Low/High watermark to control eviction
- Caching Layer related config changes
- Removed ChunkCache related configs and code in golang
- Add `PinAllCells` helper method to CacheSlot class
- Modified ValueAt, RawAt, PrimitiveRawAt to Bulk version, to reduce
caching layer overhead
- Removed some unclear templated bulk_subscript methods
- CachedSearchIterator to store PinWrapper when searching on
ChunkedColumn, and removed unused contrustor.
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
issue: #41544
- remove the estimate logic, so the create segment operation will not
check the collection meta anymore
Signed-off-by: chyezh <chyezh@outlook.com>
Related to #39718
After support add field with dynamic fields enabled, the masked dynamic
field shall be able to return with `$meta["name"]`
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>