issue: #40607
tantivy change: https://github.com/zilliztech/tantivy/pull/3
Benchmarks:
Test environment: CPU 9900K
The data is inserted by:
```
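// N rounds; each round adds one doc per unique key.
for i in 0..N {
    for j in 0..UNIQUE {
        let key = format!("hello{}", j);
        // Doc ID is i * UNIQUE + j, so docs sharing a key are UNIQUE apart.
        index_writer.add_string(&key, i * UNIQUE + j).unwrap();
    }
}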
```
So `UNIQUE` determines the locality of the matched docs: the docs that
match a given key are spaced `UNIQUE` doc IDs apart.
The latency is the average over 1000 repeated queries.
The results show a 22.5%-34.8% latency reduction.
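
To make the locality point concrete, here is a minimal sketch that lists
the doc IDs matched by one key under the insertion loop above. The
function name `matching_doc_ids` is hypothetical and is not part of the
tantivy change.

```rust
/// Doc IDs matched by the key `hello{j}` when docs are inserted with
/// ID `i * unique + j` for every round `i` in `0..n`.
fn matching_doc_ids(n: u64, unique: u64, j: u64) -> Vec<u64> {
    (0..n).map(|i| i * unique + j).collect()
}
```

With `unique = 1` the matched doc IDs are contiguous; with a larger
`unique` they are spread out with a stride of `unique`.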

---------
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
issue: #40292
related to #39552
- Fix incorrect delete checkpoint usage in SyncDistribution
- Change checkpoint parameter from action.GetCheckpoint() to
action.GetDeleteCP() in SyncTargetVersion call
- This resolves the issue where delete buffer data was being cleaned
prematurely due to wrong checkpoint reference
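
A rough sketch of why the checkpoint choice matters, written in Rust for
illustration only. It assumes the delete buffer drops every entry older
than the checkpoint it is given, and that the checkpoint that was passed
by mistake can be ahead of the delete checkpoint. `clean_before` and the
`(timestamp, primary key)` tuple layout are hypothetical, not the actual
Milvus types.

```rust
/// Drop buffered deletes whose timestamp is older than `checkpoint`.
/// Each entry is a hypothetical `(timestamp, primary_key)` pair.
fn clean_before(buffer: &mut Vec<(u64, i64)>, checkpoint: u64) {
    buffer.retain(|&(ts, _)| ts >= checkpoint);
}
```

If the buffer holds deletes at timestamps 10, 20 and 30, cleaning with a
delete checkpoint of 15 keeps the last two entries, while cleaning with a
checkpoint of 25 would also drop the delete at 20 too early.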
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
ref https://github.com/milvus-io/milvus/issues/40473
The collection is obtained without taking a reference, which means it
could be released and its struct freed during the search, which leads
to schema inconsistency.
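
The idea of holding a reference for the duration of a search can be
sketched as follows. This is a Rust analogy using `Arc`, not the Milvus
implementation; `CollectionManager`, `get_ref`, `release` and `search`
are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

struct Collection {
    schema: String,
}

struct CollectionManager {
    collections: Mutex<HashMap<i64, Arc<Collection>>>,
}

impl CollectionManager {
    fn new() -> Self {
        Self { collections: Mutex::new(HashMap::new()) }
    }

    fn put(&self, id: i64, schema: &str) {
        let collection = Arc::new(Collection { schema: schema.to_string() });
        self.collections.lock().unwrap().insert(id, collection);
    }

    /// Hand out a counted reference; the collection stays alive while
    /// the caller holds it, even if it is released in the meantime.
    fn get_ref(&self, id: i64) -> Option<Arc<Collection>> {
        self.collections.lock().unwrap().get(&id).cloned()
    }

    /// Drop the manager's own reference to the collection.
    fn release(&self, id: i64) {
        self.collections.lock().unwrap().remove(&id);
    }
}

/// A search keeps reading the schema it started with.
fn search(collection: &Collection) -> String {
    format!("searched with schema {}", collection.schema)
}
```

A search that called `get_ref` before a concurrent `release` still sees a
valid, unchanged schema until it drops its reference.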
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
- Use CounterVec to calculate sum of increase during a time period.
- Use the number of entries instead of binlog size
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Converting a byte slice to a string may copy the underlying data, which
may cost extra memory and CPU time in httpserver
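
The difference between a copying and a non-copying conversion can be
shown with a small Rust analogy. It is only an illustration of the cost
being avoided, not the httpserver change itself; `view_as_str` and
`copy_to_string` are hypothetical helpers.

```rust
/// Borrow the bytes as `&str` without copying; `None` on invalid UTF-8.
fn view_as_str(body: &[u8]) -> Option<&str> {
    std::str::from_utf8(body).ok()
}

/// Copy the bytes into a newly allocated `String`.
fn copy_to_string(body: &[u8]) -> Option<String> {
    std::str::from_utf8(body).ok().map(|s| s.to_owned())
}
```

The borrowed view points at the same memory as the request body, while
the owned string allocates and copies once per request.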
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #40388
Small segments may be put into a bucket twice due to the by-value
parameter of `Knapsack.packWith`
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #40308
This PR fixes two concurrency issues:
1. Elements in null_offset are used to set bits in a bitset whose size
is initialized from the tantivy document count. However, some documents
may already be recorded as null in null_offset while not yet committed
in tantivy, so an out-of-range access occurs.
2. null_offset can be read and written concurrently, but there is no
synchronization protection.
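
Both points can be sketched together in Rust: a lock guards the null
offsets, and offsets beyond the committed document count are skipped
when the bitset is built. This is an illustrative sketch under those
assumptions, not the actual fix; `NullTracker` and its methods are
hypothetical names.

```rust
use std::sync::RwLock;

struct NullTracker {
    // Offsets of null documents, guarded against concurrent access.
    null_offsets: RwLock<Vec<usize>>,
}

impl NullTracker {
    fn new() -> Self {
        Self { null_offsets: RwLock::new(Vec::new()) }
    }

    /// Writer side: record a null document.
    fn add_null(&self, offset: usize) {
        self.null_offsets.write().unwrap().push(offset);
    }

    /// Reader side: build a bitset sized by the committed doc count,
    /// ignoring nulls whose documents are not committed yet.
    fn null_bitset(&self, committed_docs: usize) -> Vec<bool> {
        let mut bitset = vec![false; committed_docs];
        let offsets = self.null_offsets.read().unwrap();
        for &offset in offsets.iter() {
            if offset < committed_docs {
                bitset[offset] = true;
            }
        }
        bitset
    }
}
```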
---------
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
issue: #40311
- better logging for grpc resolver
- remove the redundant streaming node manage client when streaming
service is disabled
Signed-off-by: chyezh <chyezh@outlook.com>
Introduce a batch subscription mechanism in msgdispatcher: the
msgdispatcher now includes a vchannel watch task queue, where all
vchannels in the queue will subscribe to the MQ only once and pull
messages from the oldest vchannel checkpoint to the latest.
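
A minimal sketch of the batching idea, in Rust for illustration. It
assumes checkpoints are comparable timestamps and that each vchannel only
needs messages newer than its own checkpoint; `WatchTask`, `pull_range`
and `receivers` are hypothetical names, not the msgdispatcher API.

```rust
struct WatchTask {
    vchannel: String,
    checkpoint: u64,
}

/// One subscription covers the whole queue: pull from the oldest
/// checkpoint up to the latest one. `None` if the queue is empty.
fn pull_range(queue: &[WatchTask]) -> Option<(u64, u64)> {
    let oldest = queue.iter().map(|t| t.checkpoint).min()?;
    let latest = queue.iter().map(|t| t.checkpoint).max()?;
    Some((oldest, latest))
}

/// Assumption: a pulled message is only relevant to vchannels whose
/// checkpoint is older than the message timestamp.
fn receivers(queue: &[WatchTask], msg_ts: u64) -> Vec<&str> {
    queue
        .iter()
        .filter(|t| msg_ts > t.checkpoint)
        .map(|t| t.vchannel.as_str())
        .collect()
}
```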
issue: https://github.com/milvus-io/milvus/issues/39862
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Iterators have long been deprecated, but sort is still using them. This
PR unifies the stats task with the latest compaction common functions
and removes the usage of iterators.
1. Rename `datanode/compaction` to `datanode/compactor`
2. Add `internal/compaction` and move some compaction commons into it.
3. Replace `DeltalogIterators` with `ComposeDeleteFromDeltalogs`
4. Remove `datanode/iterators`
See also: #39242
Signed-off-by: yangxuan <xuan.yang@zilliz.com>