Previously, diskSegmentMaxSize was used if and only if all of the
collection's vector fields were indexed with DiskANN.
Because sparse vectors cannot be indexed with DiskANN, collections with
both dense and sparse vectors always fell back to maxSize once sparse
vectors were introduced.
This PR changes the requirement for using diskSegmentMaxSize: all dense
vector fields must be indexed with DiskANN, and sparse vector fields are
ignored.
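A minimal sketch of the new selection rule, not the actual Milvus code. The `VectorField` type, the type and index name strings, and the handling of collections with no dense vector field (falling back to maxSize) are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical marker for sparse vector fields (assumption, not a Milvus constant).
SPARSE_TYPE = "SparseFloatVector"


@dataclass
class VectorField:
    name: str
    data_type: str
    index_type: str


def use_disk_segment_max_size(fields):
    """Return True when every dense vector field is indexed with DiskANN.

    Sparse vector fields are ignored. A collection without any dense
    vector field falls back to maxSize (an assumption of this sketch).
    """
    dense = [f for f in fields if f.data_type != SPARSE_TYPE]
    return len(dense) > 0 and all(f.index_type == "DISKANN" for f in dense)


fields = [
    VectorField("dense_vec", "FloatVector", "DISKANN"),
    VectorField("sparse_vec", SPARSE_TYPE, "SPARSE_INVERTED_INDEX"),
]
# Under the old rule the sparse field would force maxSize; under the new
# rule only the dense field is considered.
print(use_disk_segment_max_size(fields))
```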
See also: #43193
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
fix: https://github.com/milvus-io/milvus/issues/43354
The current stdsort index implementation does not support std::string.
Remove the related code.
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Correct read and buffer size to 64MB to prevent OOM during clustering
compaction.
issue: https://github.com/milvus-io/milvus/issues/43310
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Related to #43262
This patch fixes the following logic bugs:
- When multiple chunks are loaded and a chunk's size is not divisible by
8, simply appending its bitmap bytes (uint8_t) dislocates the null bitmap
- `null_bitmap_data()` points to the start of the whole row group, which
may not correspond to the current `arrow::Array`
The current solution is:
- Reorganize the null_bitmap with the correct size & offset
- Pass `array->offset()` in the tuple to convey the current offset
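The dislocation and the offset handling described above can be illustrated with a bit-level copy. This is a sketch in Python rather than the C++ of the patch; `append_bits` and its parameters are hypothetical names. It assumes Arrow's LSB-first validity bitmap layout and that unused trailing bits in the destination are zero.

```python
def get_bit(buf, i):
    """Read bit i from an LSB-first bitmap (Arrow validity bitmap layout)."""
    return (buf[i >> 3] >> (i & 7)) & 1


def append_bits(dst, dst_len, src, src_offset, length):
    """Append `length` bits of `src`, starting at bit `src_offset`, to `dst`.

    `dst` is a bytearray currently holding `dst_len` valid bits.
    Returns the new number of valid bits in `dst`.
    """
    needed = (dst_len + length + 7) // 8
    dst.extend(b"\x00" * (needed - len(dst)))
    for i in range(length):
        if get_bit(src, src_offset + i):
            j = dst_len + i
            dst[j >> 3] |= 1 << (j & 7)
    return dst_len + length


merged = bytearray()
n = 0
# Chunk A: 3 rows, validity [1, 0, 1], bitmap starts at bit 0.
n = append_bits(merged, n, bytes([0b00000101]), 0, 3)
# Chunk B: 4 rows, validity [1, 1, 0, 1], stored at bit offset 2 of a
# row-group-wide bitmap (the role played by array->offset()).
n = append_bits(merged, n, bytes([0b00101100]), 2, 4)
print(n, bin(merged[0]))
```

Appending the two source bytes unchanged would instead place chunk B's first row at bit 8 and ignore its offset, which is the dislocation described in the first bullet.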
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Ref https://github.com/milvus-io/milvus/issues/42053
This PR enables ngram to support more kinds of matches, such as prefix
and postfix match.
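For readers unfamiliar with the general technique, here is one way an n-gram inverted index can serve prefix, postfix, and inner matches: use the index to narrow candidates, then verify each candidate against the literal. This is a generic sketch under that assumption, not the implementation in this PR; all function names are hypothetical.

```python
from collections import defaultdict


def ngrams(s, n):
    """All distinct substrings of length n in s."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def build_index(rows, n):
    """Map each n-gram to the set of row ids containing it."""
    index = defaultdict(set)
    for rid, text in enumerate(rows):
        for g in ngrams(text, n):
            index[g].add(rid)
    return index


def match(rows, index, n, literal, kind):
    """Return sorted row ids where `literal` matches as prefix, postfix or inner."""
    checks = {
        "prefix": str.startswith,
        "postfix": str.endswith,
        "inner": str.__contains__,
    }
    check = checks[kind]
    if len(literal) < n:
        # Literal too short to produce an n-gram: scan every row.
        candidates = range(len(rows))
    else:
        postings = [index.get(g, set()) for g in ngrams(literal, n)]
        candidates = set.intersection(*postings)
    # The index only narrows candidates; the final check decides.
    return sorted(r for r in candidates if check(rows[r], literal))


rows = ["milvus", "vusmil", "amilvus", "mil"]
index = build_index(rows, 3)
print(match(rows, index, 3, "mil", "prefix"))
print(match(rows, index, 3, "vus", "postfix"))
```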
---------
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Related to #43250
Use FieldIDList to check for missing fields. If a column is missing,
return an empty result set.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #41570
Fix issue where growing and sealed segments could be searched
simultaneously, causing inflated count(*) results. This was caused by
logic introduced in PR #42009 that made sealed segments readable before
target version advancement.
Changes include:
- Fix conditional filtering logic in PinReadableSegments to prevent
sealed segments from becoming readable prematurely
- Use target version filter for full results (ratio=1.0) to ensure
sealed segments only become readable after target advancement
- Use query view segment list filter for partial results (ratio<1.0) to
maintain backward compatibility
- Simplify target version setting in AddDistributions to prevent
premature segment readability
- Add logging for redundant growing segments during sync
- Add comprehensive unit tests covering the duplicate segment scenario
This fix ensures count(*) queries return accurate results by preventing
the same segment from being counted in both growing and sealed states.
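The choice between the two filters can be sketched as follows. This is an illustration only: the `SealedSegment` type, the function signature, and the exact readability condition (a sealed segment is readable once the current target version has reached the segment's version) are assumptions, not the actual `PinReadableSegments` code.

```python
from dataclasses import dataclass


@dataclass
class SealedSegment:
    segment_id: int
    target_version: int


def pin_readable_sealed(segments, required_ratio, current_target_version,
                        query_view_segment_ids):
    """Pick readable sealed segments (hypothetical simplification)."""
    if required_ratio >= 1.0:
        # Full result: a sealed segment is readable only after the target
        # version has advanced far enough to include it.
        return [s for s in segments
                if s.target_version <= current_target_version]
    # Partial result: keep the query view segment list filter for
    # backward compatibility.
    return [s for s in segments if s.segment_id in query_view_segment_ids]


segments = [SealedSegment(1, 5), SealedSegment(2, 6)]
# Segment 2 was loaded for the next target (version 6); while the current
# target is still 5, its growing counterpart keeps serving reads.
print([s.segment_id for s in pin_readable_sealed(segments, 1.0, 5, set())])
```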
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/42900
@sunby Unfortunately, this is not as easy to fix as was thought in
#43177
Upd: also handles `Inf` and `NaN` values, and the division by zero case
for `fp32` and `fp64`
Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
issue: #42995
- The consuming lag at the streaming node is reported to the coordinator.
- The consuming lag triggers write limiting and denial by the quota
center.
- Set ttProtection by default.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Related to #43003
When nq > 1, returning nullable data in the search result leads to a
parsing error. This patch adds logic to slice the valid data per query so
that the nullable parsing validation works.
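A minimal sketch of the slicing idea, assuming the result carries one flat value array and one flat validity array covering all queries, plus a per-query hit count. The function name and the layout are assumptions for illustration, not the client's actual data structures.

```python
def slice_per_query(values, valid_data, topks):
    """Split flat field data and its validity flags into per-query hits.

    `values` and `valid_data` cover all nq queries back to back; `topks`
    holds the number of hits for each query. Invalid entries become None.
    """
    hits = []
    offset = 0
    for k in topks:
        vals = values[offset:offset + k]
        valid = valid_data[offset:offset + k]
        hits.append([v if ok else None for v, ok in zip(vals, valid)])
        offset += k
    return hits


# nq = 2: the first query has 2 hits, the second has 3.
values = [10, 0, 30, 0, 50]
valid_data = [True, False, True, False, True]
print(slice_per_query(values, valid_data, [2, 3]))
```

Without the slice, each query's hits would be validated against the full validity array, so the lengths disagree as soon as nq > 1.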
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #43113
When a schema change is in progress, inserts must not run concurrently;
otherwise:
- A data race may occur, causing insertion failure
- The data schema may become inconsistent
This PR adds a shared_lock to prevent this data race.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #43107
- Add checkLoadConfigChanges() to apply load config during startup
- Call config check in startQueryCoord() after restart
- Skip auto-updates for collections with user-specified replica numbers
- Add is_user_specified_replica_mode field to preserve user settings
- Add comprehensive unit tests with mockey
Ensures existing collections use latest cluster-level config after
restart.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Related to #39178
This PR adds logs for segment schema change operations.
It also addresses the nit comments from PR #42490.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>