With concurrent L0 compaction
(https://github.com/milvus-io/milvus/pull/36816), delta logs might be
written to the same L1 segment, causing logID duplication when using the
incremental beginLogID. This PR removes the beginLogID mechanism and
instead passes a log ID range, where the number of IDs in the range
equals the number of compaction segment binlogs multiplied by an
expansion factor.
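
The sizing rule above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `LogIDRange`, `newLogIDRange`, and the parameter names are hypothetical, and the actual allocator and expansion factor may differ.

```go
package main

// LogIDRange is a hypothetical half-open range [Begin, End) of log IDs
// reserved for one compaction task.
type LogIDRange struct {
	Begin int64
	End   int64
}

// Size returns how many IDs the range holds.
func (r LogIDRange) Size() int64 { return r.End - r.Begin }

// newLogIDRange reserves binlogCount * expansionFactor IDs starting at begin.
// Both the helper and its parameters are assumptions made for this sketch.
func newLogIDRange(begin int64, binlogCount, expansionFactor int) LogIDRange {
	n := int64(binlogCount) * int64(expansionFactor)
	return LogIDRange{Begin: begin, End: begin + n}
}
```

Two concurrent tasks that each receive their own range (the second starting at the first range's `End`) cannot produce the same log ID, which is the property the incremental beginLogID could not guarantee.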
issue: https://github.com/milvus-io/milvus/issues/40207
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #37651
This PR balances the collection with the largest row count first.
Otherwise, data from small collections may be migrated temporarily to
newly onboarded nodes, only to be moved out again once the large
collections are balanced, which causes unnecessary load.
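
A minimal sketch of the ordering, assuming a hypothetical `collectionStat` type and `orderForBalance` helper (neither is taken from the Milvus code base):

```go
package main

import "sort"

// collectionStat is a hypothetical summary of one loaded collection.
type collectionStat struct {
	ID       int64
	RowCount int64
}

// orderForBalance returns a copy of stats sorted by row count, largest
// first, so the balancer handles big collections before small ones.
func orderForBalance(stats []collectionStat) []collectionStat {
	out := make([]collectionStat, len(stats))
	copy(out, stats)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].RowCount > out[j].RowCount
	})
	return out
}
```

The stable sort keeps the original relative order of collections with equal row counts, so the balance order stays deterministic between rounds.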
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
#35856
1. Add function-related configuration in milvus.yaml
2. Add null and empty value check to TextEmbeddingFunction
Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
Two points:
(1) Reorder the sub-expressions of a conjunct expression to postpone
heavy operations.
Sequence: int(column) -> index(column) -> string(column) -> light conjunct
...... -> json(column) -> heavy conjunct -> two_column_compare
(2) Support pre-filtering during expression execution: skip scanning raw
data for rows that were already filtered out by the results of preceding
expressions.
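
Both ideas can be illustrated with a small sketch. It is written in Go for illustration only and is not the implementation in this PR; `exprKind`, `subExpr`, `reorder`, and `evalConjunct` are hypothetical names, and the kinds elided by "......" in the sequence above are left out.

```go
package main

import "sort"

// exprKind ranks sub-expressions from cheap to expensive, following the
// sequence in the description. The numeric values are illustrative only.
type exprKind int

const (
	kindInt exprKind = iota
	kindIndex
	kindString
	kindLightConjunct
	kindJSON
	kindHeavyConjunct
	kindTwoColumnCompare
)

// subExpr is one operand of an AND expression.
type subExpr struct {
	Kind exprKind
	Eval func(row int) bool
}

// reorder returns the sub-expressions sorted so cheap kinds run first.
func reorder(exprs []subExpr) []subExpr {
	out := make([]subExpr, len(exprs))
	copy(out, exprs)
	sort.SliceStable(out, func(i, j int) bool { return out[i].Kind < out[j].Kind })
	return out
}

// evalConjunct evaluates the AND of all sub-expressions over numRows rows.
// Rows already rejected by an earlier sub-expression are skipped, so later
// (heavier) sub-expressions never touch their raw data.
func evalConjunct(exprs []subExpr, numRows int) []bool {
	active := make([]bool, numRows)
	for i := range active {
		active[i] = true
	}
	for _, e := range reorder(exprs) {
		for row := 0; row < numRows; row++ {
			if !active[row] {
				continue // pre-filter: row was rejected earlier
			}
			if !e.Eval(row) {
				active[row] = false
			}
		}
	}
	return active
}
```

With an int filter that rejects half of the rows and a JSON filter listed before it, the JSON filter is still evaluated last and only on the surviving half.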
#39869
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
See also #40558
Related to #35303 & #38066 as well
This PR:
- Add `BufferedForward` to limit memory usage when forwarding streaming
deletes
- Add a `UseLoad` flag to determine whether `Delete` uses `segment.Delete`
or `segment.LoadDelta`
- Fix the delegator accidentally using an always-true candidate while
loading streaming delta
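
As a rough illustration of the buffering idea behind `BufferedForward`, the sketch below batches delete records and forwards a batch whenever a limit is reached. The type `bufferedForwarder`, its methods, and the record-count limit are assumptions for this example; the actual function in this PR may use a different signature and a byte-size limit.

```go
package main

// deleteRecord is a hypothetical delete entry: a primary key and a timestamp.
type deleteRecord struct {
	PK int64
	TS uint64
}

// bufferedForwarder holds at most limit records in memory before handing
// them to forward, instead of materializing the whole delete stream.
type bufferedForwarder struct {
	limit   int
	buf     []deleteRecord
	forward func(batch []deleteRecord) error
}

func newBufferedForwarder(limit int, forward func([]deleteRecord) error) *bufferedForwarder {
	if limit < 1 {
		limit = 1
	}
	return &bufferedForwarder{limit: limit, buf: make([]deleteRecord, 0, limit), forward: forward}
}

// Add buffers one record and flushes when the buffer is full.
func (b *bufferedForwarder) Add(r deleteRecord) error {
	b.buf = append(b.buf, r)
	if len(b.buf) >= b.limit {
		return b.Flush()
	}
	return nil
}

// Flush forwards whatever is buffered; call it once more at end of stream.
func (b *bufferedForwarder) Flush() error {
	if len(b.buf) == 0 {
		return nil
	}
	batch := b.buf
	b.buf = make([]deleteRecord, 0, b.limit)
	return b.forward(batch)
}
```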
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/39818
This PR mimics the Varchar data type, allowing insert, search, query,
delete, full-text search, and other operations.
Functionalities related to filter expressions are disabled temporarily.
Storage changes for Text data type will be in the following PRs.
Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
The default value in code differs from the value in the yaml, which may
cause confusion when upgrading from an older version.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #38399
- Add a timetick-commit-based write-ahead buffer on the write side.
- Add a switchable scanner on the read side to transfer state between
catch-up and tailing reads.
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #36621, #39417
1. Adjust the server-side cache size.
2. Add source information for configurations.
3. Add node ID for compaction and indexing tasks.
4. Resolve localhost access issues to fix health check failures for
etcd.
Signed-off-by: jaime <yun.zhang@zilliz.com>
This PR limits the maximum number of consumers per pchannel to 10 for
each QueryNode and DataNode.
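
A minimal sketch of such a per-pchannel cap, assuming a hypothetical `consumerLimiter` type (the limit of 10 comes from the description; everything else is illustrative and not the code in this PR):

```go
package main

import (
	"errors"
	"sync"
)

// maxConsumersPerPChannel mirrors the limit stated in the description.
const maxConsumersPerPChannel = 10

var errTooManyConsumers = errors.New("too many consumers on pchannel")

// consumerLimiter counts live consumers per pchannel on one node.
type consumerLimiter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newConsumerLimiter() *consumerLimiter {
	return &consumerLimiter{counts: make(map[string]int)}
}

// Acquire registers a consumer or fails if the pchannel is at the cap.
func (l *consumerLimiter) Acquire(pchannel string) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.counts[pchannel] >= maxConsumersPerPChannel {
		return errTooManyConsumers
	}
	l.counts[pchannel]++
	return nil
}

// Release unregisters a consumer previously admitted by Acquire.
func (l *consumerLimiter) Release(pchannel string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.counts[pchannel] > 0 {
		l.counts[pchannel]--
	}
}
```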
issue: https://github.com/milvus-io/milvus/issues/37630
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
1. Make the segment loader lock protect only the resource.
2. Optimize GetDiskUsage to avoid excessive overhead.
issue: https://github.com/milvus-io/milvus/issues/37630
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
1. DataNode: Skip generating BF during the insert phase (BF will be
regenerated during the sync phase).
2. QueryNode: Skip generating or maintaining BF for growing segments;
deletion checks will be handled in the segcore.
issue: https://github.com/milvus-io/milvus/issues/37630
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Related to #39205
This PR merges `RLock` and `PinIfNotReleased` into a single `PinIf`
function, preventing a segment from being released before in-flight read
operations finish.
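
The pattern can be sketched as below. This is not the signature introduced by this PR: the `segment` type, the predicate argument, and the `Unpin`/`Release` methods are assumptions made to show how a conditional pin keeps a release from completing while readers are active.

```go
package main

import "sync"

// segment is a hypothetical segment handle guarded by a pin counter.
type segment struct {
	mu       sync.Mutex
	cond     *sync.Cond
	released bool
	pins     int
}

func newSegment() *segment {
	s := &segment{}
	s.cond = sync.NewCond(&s.mu)
	return s
}

// PinIf pins the segment only if it is not released and pred holds.
// The check and the pin happen under one lock, so no release can slip
// in between them.
func (s *segment) PinIf(pred func() bool) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.released || !pred() {
		return false
	}
	s.pins++
	return true
}

// Unpin drops one pin and wakes a waiting Release when none remain.
func (s *segment) Unpin() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.pins--
	if s.pins == 0 {
		s.cond.Broadcast()
	}
}

// Release rejects new pins, then waits for in-flight readers to finish.
func (s *segment) Release() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.released = true
	for s.pins > 0 {
		s.cond.Wait()
	}
}
```

A reader calls `PinIf`, performs its read only when it returns true, and calls `Unpin` afterwards; `Release` returns only after every such reader has unpinned.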
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #38399
- Make the broadcast service available for msgstream by reusing the
architecture of the streaming service
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #35563
1. Use an internal health checker to monitor the cluster's health state,
storing the latest state on the coordinator node. The CheckHealth
request retrieves the cluster's health from this latest state on the
proxy side, which enhances cluster stability.
2. Each health check will assess all collections and channels, with
detailed failure messages temporarily saved in the latest state.
3. Use the CheckHealth request instead of the heavy GetMetrics request on
the querynode and datanode.
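
The caching idea in point 1 can be sketched as follows, assuming a hypothetical `healthChecker` with a pluggable `probe` function. None of these names come from the Milvus code base; the sketch only shows that requests read a stored state instead of triggering a fresh cluster-wide check.

```go
package main

import (
	"sync"
	"time"
)

// healthState is the latest result kept in memory.
type healthState struct {
	Healthy   bool
	Reasons   []string
	CheckedAt time.Time
}

// healthChecker runs probe periodically and serves the stored result.
// Until the first check completes, Latest reports the zero state.
type healthChecker struct {
	mu     sync.RWMutex
	latest healthState
	probe  func() (healthy bool, reasons []string)
}

func newHealthChecker(probe func() (bool, []string)) *healthChecker {
	return &healthChecker{probe: probe}
}

// runOnce performs one check and stores its outcome.
func (h *healthChecker) runOnce() {
	healthy, reasons := h.probe()
	h.mu.Lock()
	h.latest = healthState{Healthy: healthy, Reasons: reasons, CheckedAt: time.Now()}
	h.mu.Unlock()
}

// Start runs checks every interval (must be positive) until stop is closed.
func (h *healthChecker) Start(interval time.Duration, stop <-chan struct{}) {
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				h.runOnce()
			case <-stop:
				return
			}
		}
	}()
}

// Latest returns a copy of the stored state without calling probe.
func (h *healthChecker) Latest() healthState {
	h.mu.RLock()
	defer h.mu.RUnlock()
	s := h.latest
	s.Reasons = append([]string(nil), h.latest.Reasons...)
	return s
}
```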
Signed-off-by: jaime <yun.zhang@zilliz.com>
issue: #38142
The current channel balance policy only considers the current
collection's distribution. If every collection has one channel and all
channels are loaded on the same querynode, channel balancing is not
triggered after the number of querynodes increases.
This PR enables a score-based channel balance policy to:
1. distribute all channels evenly across multiple querynodes
2. distribute each collection's channels evenly across multiple
querynodes.
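
One way to combine the two goals into a single score is sketched below. The scoring formula, the `collectionWeight` constant, and all type and function names are assumptions for illustration; the policy in this PR may compute scores differently.

```go
package main

// collectionWeight is a hypothetical factor that makes a node less
// attractive when it already hosts channels of the same collection.
const collectionWeight = 10

// nodeLoad is a hypothetical view of the channels on one querynode.
type nodeLoad struct {
	NodeID       int64
	Total        int           // channels of all collections on the node
	ByCollection map[int64]int // channels per collection on the node
}

// score is lower for nodes that are emptier overall and that hold fewer
// channels of the given collection.
func score(n *nodeLoad, collectionID int64) int {
	return n.Total + collectionWeight*n.ByCollection[collectionID]
}

// assign places one channel of collectionID on the lowest-score node,
// updates that node's counters, and returns it (nil if nodes is empty).
func assign(nodes []*nodeLoad, collectionID int64) *nodeLoad {
	var best *nodeLoad
	for _, n := range nodes {
		if best == nil || score(n, collectionID) < score(best, collectionID) {
			best = n
		}
	}
	if best == nil {
		return nil
	}
	if best.ByCollection == nil {
		best.ByCollection = make(map[int64]int)
	}
	best.Total++
	best.ByCollection[collectionID]++
	return best
}
```

Because the score includes the node's total channel count, a newly added empty querynode attracts channels even when every collection has only one channel, which is the case the per-collection policy missed.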
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
1. A taskQueueCapacity of 256 is too small for production when we want to
rewrite the entire collection.
2. Tasks should be cleaned up when they cannot be recovered; otherwise
their meta remains in etcd forever.
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/36864
I have a few questions regarding my approach. I will consolidate them
here for feedback and review. Thanks.
---------
Signed-off-by: Nischay Yadav <nischay.yadav@ibm.com>
Signed-off-by: Nischay <Nischay.Yadav@ibm.com>