related: #45993
This commit extends nullable vector support to the proxy layer,
querynode,
and adds comprehensive validation, search reduce, and field data
handling
for nullable vectors with sparse storage.
Proxy layer changes:
- Update validate_util.go checkAligned() with getExpectedVectorRows()
helper
to validate nullable vector field alignment using valid data count
- Update checkFloatVectorFieldData/checkSparseFloatVectorFieldData for
nullable vector validation with proper row count expectations
- Add FieldDataIdxComputer in typeutil/schema.go for logical-to-physical
index translation during search reduce operations
- Update search_reduce_util.go reduceSearchResultData to use
idxComputers
for correct field data indexing with nullable vectors
- Update task.go, task_query.go, task_upsert.go for nullable vector
handling
- Update msg_pack.go with nullable vector field data processing
QueryNode layer changes:
- Update segments/result.go for nullable vector result handling
- Update segments/search_reduce.go with nullable vector offset
translation
Storage and index changes:
- Update data_codec.go and utils.go for nullable vector serialization
- Update indexcgowrapper/dataset.go and index.go for nullable vector
indexing
Utility changes:
- Add FieldDataIdxComputer struct with Compute() method for efficient
logical-to-physical index mapping across multiple field data
- Update EstimateEntitySize() and AppendFieldData() with fieldIdxs
parameter
- Update funcutil.go with nullable vector support functions
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Full support for nullable vector fields (float, binary, float16,
bfloat16, int8, sparse) across ingest, storage, indexing, search and
retrieval; logical↔physical offset mapping preserves row semantics.
* Client: compaction control and compaction-state APIs.
* **Bug Fixes**
* Improved validation for adding vector fields (nullable + dimension
checks) and corrected search/query behavior for nullable vectors.
* **Chores**
* Persisted validity maps with indexes and on-disk formats.
* **Tests**
* Extensive new and updated end-to-end nullable-vector tests.
<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: marcelo-cjl <marcelo.chen@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/44105
- I have added support to set this property
**queued.max.messages.kbytes** in kafka consumers from the user side.
- It limits the size (in KB) of the consumer’s local message queue
(buffer) where messages are temporarily stored after being fetched from
Kafka but before your application actually processes them
---------
Signed-off-by: Nischay Yadav <Nischay.Yadav@ibm.com>
issue: #43785
- pulsar client will print log into milvus logger now.
- pulsar client open the metric by default.
- upgrade the pulsar client to v0.15.1, and use offical repo.
- the fixing of milvus-io/pulsar-client-go is already covered by
official v0.15.1.
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #42162
- enhance: add read ahead buffer size issue #42129
- fix: rocksmq consumer's close operation may get stucked
- fix: growing segment from old arch is not flushed after upgrading
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #38399
- add an adaptor type to adapt the streaming service client and
msgstream client to reuse the msgdispatcher.
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #40532
- balance should enable only when there's no proxy and datanode which
version is lower than 2.6.0
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Introduce a batch subscription mechanism in msgdispatcher: the
msgdispatcher now includes a vchannel watch task queue, where all
vchannels in the queue will subscribe to the MQ only once and pull
messages from the oldest vchannel checkpoint to the latest.
issue: https://github.com/milvus-io/milvus/issues/39862
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #38399
related PR: #39522
- Just implement exclusive broadcaster between broadcast message with
same resource key to keep same order in different wal.
- After simplify the broadcast model, original watch-based broadcast is
too complicated and redundant, remove it.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
This PR limits the maximum number of consumers per pchannel to 10 for
each QueryNode and DataNode.
issue: https://github.com/milvus-io/milvus/issues/37630
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #38263
Revert "fix: Move init kafka pool into once (#37786)"
Revert "enhance: Use pool to limit kafka cgo thread number (#37744)"
Signed-off-by: jaime <yun.zhang@zilliz.com>