8055 Commits

Author SHA1 Message Date
zhagnlu
5164d30287
fix: increase expr recursion depth to avoid parse failures (#29860)
#29759

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-01-11 10:26:50 +08:00
yah01
031243fee7
feat: support mmap for marisa trie (#29613)
this supports mmap for marisa trie index
related https://github.com/milvus-io/milvus/issues/21866

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-11 10:22:50 +08:00
congqixia
d6429933a7
enhance: make Load process traceable in querynode & segcore (#29858)
See also #29803

This PR:
- Add trace span for `LoadIndex` & `LoadFieldData` in segment loader
- Add `TraceCtx` parameter for `Index.Load` in segcore
- Add span for ReadFiles & Engine Load for Memory/Disk Vector index

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-10 21:58:51 +08:00
aoiasd
73cfdab776
fix: Release collection delete proxy collection meta (#29854)
pr: https://github.com/milvus-io/milvus/issues/29675

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-01-10 21:54:49 +08:00
XuanYang-cn
9c8fd5e51d
fix: Save lite WatchInfo into etcd in DataNode (#29687)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-01-10 21:18:49 +08:00
congqixia
a040692129
enhance: Use estimated batch size to initialize BF (#29842)
See also: #27675

The bloom filter set initialized each new BF with a fixed, configured `n`. This
value is always larger than the actual batch size, so the generated BF
uses more memory than necessary.

This PR makes the write buffer initialize the BF with a batch size estimated
from the schema & configuration value.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-10 20:36:50 +08:00
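To illustrate why the size estimate in a040692129 (#29842) matters, here is a minimal Go sketch. It is not the Milvus implementation: `estimateBatchSize`, `bloomBits`, and their parameters are hypothetical names, and the sizing uses the textbook Bloom filter formula m = -n ln(p) / (ln 2)^2, which may differ from what the BF library used in the PR does internally.

```go
package main

import "math"

// bloomBits returns the textbook-optimal number of bits for a Bloom filter
// that holds n items at false-positive rate p: m = -n*ln(p)/(ln 2)^2.
func bloomBits(n uint, p float64) uint {
	return uint(math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2)))
}

// estimateBatchSize guesses how many rows fit in one write buffer.
// bufferBytes and rowBytes are assumed inputs derived from configuration
// and schema; the result never exceeds the configured upper bound.
func estimateBatchSize(bufferBytes, rowBytes, configuredN uint) uint {
	if rowBytes == 0 {
		return configuredN // no estimate available, fall back to the config value
	}
	n := bufferBytes / rowBytes
	if n == 0 {
		n = 1
	}
	if n > configuredN {
		return configuredN
	}
	return n
}
```

Under these assumptions, a 16 MiB buffer with 1 KiB rows yields an estimate of 16384 rows, and `bloomBits(16384, p)` is proportionally smaller than `bloomBits(100000, p)` for the same `p`, since the bit count grows linearly with `n`.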
congqixia
93f87417fd
enhance: remove .git folder for unit test workflow (#29833)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-10 16:46:49 +08:00
yah01
e8496d4d49
enhance: filter out the not needed collections while listing (#29690)
this improves performance while many collections exist
resolve #29631

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-10 15:18:48 +08:00
Buqian Zheng
d506d33a8d
fix: meta cache in datanode incorrectly tracking row nums (#29817)
... of compacted segments

issue: #29816

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-01-10 13:22:48 +08:00
Cai Yudong
cb9d9ec0f0
enhance: Correct sampleFraction's type to float (#29810)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2024-01-10 13:18:50 +08:00
Cai Yudong
600f6eff06
enhance: Upgrade gtest to 1.13.0 (#29805)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2024-01-10 13:16:57 +08:00
yah01
d357139064
fix: the entities num metric may be contributed more than once (#29767)
the growing segments contribute to this metric while inserting and
putting into the manager, but the current impl inserts data before
putting the segments into manager, which leads to double contributions

fix: #29766

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2024-01-10 10:00:51 +08:00
congqixia
c4ddfff2a7
enhance: make Load process traceable in querycoord (#29806)
See also #29803

This PR:
- Add trace span for collection/partition load
- Use TraceSpan to generate Segment/ChannelTasks when loading
- Refine BaseTask trace tag usage

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-10 09:58:49 +08:00
zhagnlu
601a8b801b
fix: add move cursor function to physical expr (#29603)
#29570

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-01-09 17:08:48 +08:00
congqixia
8a6e1a4b27
enhance: pre-allocate result FieldData space to reduce copy & growslice (#29726)
See also: #29113

Add a new utility function in `pkg/util/typeutil` to pre-allocate field
data slice capacity according to the search limit. This avoids copying
the data during `AppendFieldData` when the previous slice runs out of space,
and also saves CPU time under high payload.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-09 15:48:55 +08:00
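The pre-allocation idea in 8a6e1a4b27 (#29726) can be sketched in plain Go as below. `mergeIDs` is a hypothetical helper, not the utility added to `pkg/util/typeutil`; it only shows how sizing the result slice to the search limit up front avoids the repeated grow-and-copy that `append` would otherwise trigger.

```go
package main

// mergeIDs gathers at most limit IDs from several partial results.
// The result slice is allocated once with capacity == limit, so none of
// the append calls below has to reallocate and copy the backing array.
func mergeIDs(parts [][]int64, limit int) []int64 {
	if limit < 0 {
		limit = 0
	}
	out := make([]int64, 0, limit)
	for _, part := range parts {
		for _, id := range part {
			if len(out) == limit {
				return out
			}
			out = append(out, id)
		}
	}
	return out
}
```

Because the capacity is fixed at `limit`, `cap(out)` stays constant for the whole merge, which is the property the commit relies on to cut `growslice` calls.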
yah01
f030f31d92
enhance: make the error of parsing expression to ParameterInvalid (#29681)
before this, the error is unexpected error

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-09 15:36:47 +08:00
congqixia
f18a7191f2
enhance: make ColumnBasedInsertMsgToInsertData check field missing (#29758)
fix: #29757

In previous code, `ColumnBasedInsertMsgToInsertData` adds empty field if
the insertMsg parameter does not have the column schema defined. This
may lead to unexpected behavior of caller functions.

This PR:
- Add column missing check
- Add column length check
- Generate BlobInfo for ColumnBasedInsertMsgToInsertData result

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-09 11:50:48 +08:00
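The two checks listed in f18a7191f2 (#29758) can be sketched as follows. This is an assumption-laden illustration, not the real `ColumnBasedInsertMsgToInsertData`: `checkColumns` is a hypothetical function, and columns are modeled as a plain `map[string][]int64` instead of Milvus field data.

```go
package main

import "fmt"

// checkColumns verifies that every field named in the schema has a column
// in the insert message and that each column holds exactly numRows values.
// It returns an error instead of silently adding an empty column.
func checkColumns(schemaFields []string, columns map[string][]int64, numRows int) error {
	for _, name := range schemaFields {
		col, ok := columns[name]
		if !ok {
			return fmt.Errorf("column %q is missing from the insert message", name)
		}
		if len(col) != numRows {
			return fmt.Errorf("column %q has %d rows, expected %d", name, len(col), numRows)
		}
	}
	return nil
}
```

Returning an error early makes the failure visible to the caller, which is the behavior change the commit describes.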
zhenshan.cao
60e88fb833
fix: Restore the MVCC functionality. (#29749)
When the TimeTravel functionality was previously removed, it
inadvertently affected the MVCC functionality within the system. This PR
aims to reintroduce the internal MVCC functionality as follows:

1. Add MvccTimestamp to the requests of Search/Query and the results of
Search internally.
2. When the delegator receives a Query/Search request and there is no
MVCC timestamp set in the request, set the delegator's current tsafe as
the MVCC timestamp of the request. If the request already has an MVCC
timestamp, do not modify it.
3. When the Proxy handles Search and triggers the second phase ReQuery,
divide the ReQuery into different shards and pass the MVCC timestamp to
the corresponding Query requests.

issue: #29656

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-01-09 11:38:48 +08:00
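Step 2 of 60e88fb833 (#29749) amounts to a "fill only if unset" rule. A minimal sketch, assuming a zero `MvccTimestamp` means "not set" (an assumption; the `request` type and `fillMvccTimestamp` are hypothetical and not the Milvus request structs):

```go
package main

// request is a stand-in for a Search/Query request that carries an
// optional MVCC timestamp; zero is assumed to mean "not set".
type request struct {
	MvccTimestamp uint64
}

// fillMvccTimestamp sets the request's MVCC timestamp to the delegator's
// current tsafe only when the caller has not supplied one already.
func fillMvccTimestamp(req *request, tsafe uint64) {
	if req.MvccTimestamp == 0 {
		req.MvccTimestamp = tsafe
	}
}
```

Leaving an existing timestamp untouched is what lets the Proxy pin the second-phase ReQuery to the same snapshot the first-phase Search used.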
aoiasd
cb18f18c1d
fix: compacted segment status was flushing instead of flushed and L0 segment triggers gc slowly (#29587)
relate: https://github.com/milvus-io/milvus/issues/29492

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-01-09 10:52:49 +08:00
yihao.dai
3d07b6682c
feat: Add import reader for numpy (#29253)
This PR implements a new numpy reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-08 19:42:49 +08:00
XuanYang-cn
75e6b65c60
enhance: Use ChannelManager interface in Server (#29629)
See also: #29447

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-01-08 17:46:47 +08:00
yah01
97e4ec5a69
enhance: use random root path for minio unit tests (#29753)
this avoids the conflicts while running multiple unit tests

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2024-01-08 15:58:48 +08:00
xige-16
9702cef2b5
feat: Support multiple vector search (#29433)
issue #25639 

Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-01-08 15:34:48 +08:00
zhenshan.cao
7e6f73a12d
feat: Authorize users to query grant info of their roles (#29747)
Once a role is granted to a user, the user should automatically possess
the privilege information associated with that role.

issue: #29710

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-01-08 15:10:49 +08:00
congqixia
fe47deebf3
fix: Set & Return correct SegmentLevel in querynode segment manager (#29740)
See also #27349

The segment level label in querynode used `Legacy` before the segment level
was correctly passed in the Load request. This attribute is still set to
`Legacy`, so the metrics do not look right.

This PR adds a parameter to `NewSegment` and passes the correct value for each
invocation.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-08 14:16:48 +08:00
Jiquan Long
e9f3df3626
fix: inverted index file not found (#29695)
issue: https://github.com/milvus-io/milvus/issues/29654

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-01-07 20:26:49 +08:00
Jiquan Long
20fb847521
enhance: load delta logs concurrently (#29623)
This PR makes Milvus load delta logs concurrently, which should
decrease the latency of loading a segment.
/kind improvement

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-01-07 20:22:48 +08:00
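The concurrent loading described in 20fb847521 (#29623) follows a common fan-out pattern. The sketch below uses only `sync.WaitGroup` from the standard library; `loadAll` and the `load` callback are hypothetical, and the actual PR may use a different concurrency helper or bound the parallelism.

```go
package main

import "sync"

// loadAll fetches every path concurrently and returns the payloads in the
// same order as paths. Each goroutine writes only to its own index, so no
// mutex is needed. The first non-nil error (in path order) is returned.
func loadAll(paths []string, load func(string) ([]byte, error)) ([][]byte, error) {
	results := make([][]byte, len(paths))
	errs := make([]error, len(paths))

	var wg sync.WaitGroup
	for i, p := range paths {
		wg.Add(1)
		go func(i int, p string) {
			defer wg.Done()
			results[i], errs[i] = load(p)
		}(i, p)
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}
```

With this shape, total latency approaches that of the slowest single read instead of the sum of all reads, at the cost of holding all payloads in memory at once.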
zhagnlu
d07197ab1a
enhance: add compare simd function (#29432)
#26137

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-01-07 20:20:57 +08:00
foxspy
271edc6669
fix: throw exception when upload file failed for DiskIndex (#29627)
related to : #29417 

Cardinal indexes upload index files in the `Serialize` interface and throw
an exception when `Serialize` fails.

Signed-off-by: xianliang <xianliang.li@zilliz.com>
2024-01-07 20:03:13 +08:00
wayblink
635a7f777c
feat: add clustering key in create/describe collection (#29506)
#28410
/kind feature

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-01-07 19:56:48 +08:00
yihao.dai
156a0dd450
feat: Add import reader for Parquet (#29618)
This PR implements a Parquet reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-07 19:38:49 +08:00
cai.zhang
5dc300c4a9
fix: Fix bug for pk index doesn't have raw data (#29711)
issue: #29697

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-01-07 19:36:48 +08:00
congqixia
b5f039a221
fix: Assert all async invocations in test case (#29737)
Resolves: #29736

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-07 15:54:47 +08:00
yah01
a0cec4047a
fix: make the entity num metric accurate (#29643)
fix #29642

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-05 18:24:47 +08:00
yihao.dai
23183ffb0f
feat: Add import reader for json (#29252)
This PR implements a new json reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-05 18:12:48 +08:00
aoiasd
70ec00cd5d
enhance: support access log print cluster prefix (#29646)
relate: https://github.com/milvus-io/milvus/issues/29645

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-01-05 16:34:47 +08:00
smellthemoon
1c1f2a1371
enhance: change some logs (#29579)
related #29588

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-05 16:12:48 +08:00
wei liu
e98c62abbb
enhance: refactor leader_observer to leader_checker (#29454)
issue: #29453 

Syncing distribution by RPC also calls loadSegment/releaseSegment,
which may cause all kinds of concurrency issues on the same segment, such as
concurrent load and release on one segment.
This PR adds leader_checker, which generates load/release tasks to correct
the leader view instead of syncing distribution by RPC.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-05 15:54:55 +08:00
MrPresent-Han
9e2e7157e9
feat: support search_group_by for milvus(#25324) (#28983)
related: #25324

The Search GroupBy function is used to aggregate result entities based on a
specific scalar column.
Several points to mention:

1. Temporarily, the whole groupBy is implemented separately from the
iterative expr framework **for the first period**.
2. In the long term, the groupBy operation will be incorporated into the
iterative expr framework: https://github.com/milvus-io/milvus/pull/28166
3. This PR includes some unrelated mocked interfaces regarding alterIndex
for reasons not worth mentioning. All this unrelated content
will be removed before the final PR is merged. This version of the PR is
only for review.
4. All other related details are commented in the file comparison.

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-01-05 15:50:47 +08:00
cqy123456
22bb84fa9d
feat: add new gpu index: GPU_BRUTE_FORCE and limit gpu index metric type (#29590)
issue: https://github.com/milvus-io/milvus/issues/29230
This PR does the following:
1. adds GPU brute force;
2. limits GPU indexes to the L2 / IP metric types;

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-01-05 15:24:48 +08:00
PowderLi
c8db36a63a
enhance: get a blob to check object storage config (#29703)
issue: #29672
the storage account needs at least the privileges for the actions
`Microsoft.Storage/storageAccounts/blobServices/containers/blobs/*`

Signed-off-by: PowderLi <min.li@zilliz.com>
2024-01-05 14:50:46 +08:00
wei liu
b45d08b47b
enhance: Add ctx for load index logs (#29686)
This PR adds ctx to the load index logs

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-05 14:24:49 +08:00
yihao.dai
3561586edf
feat: Add import reader for binlog (#28910)
This PR defines the new import reader interfaces and implements a binlog
reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-05 11:48:47 +08:00
congqixia
3626f49025
fix: make sure balance candidate is always pushed back (#29702)
See also #29699

Querycoord panicked when it tried to pop from an empty heap. We assume the
heap shall not be empty, but in some branches the candidate is never
pushed back.

This PR puts pop & push in a closure and adds a defer call to push the item
back.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-05 10:08:47 +08:00
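The pop-then-deferred-push pattern from 3626f49025 (#29702) can be sketched with `container/heap` as below. The `node` type, `nodeHeap`, and `assign` are hypothetical and far simpler than the querycoord balancer; the point is only that `defer` guarantees the candidate returns to the heap on every exit path.

```go
package main

import "container/heap"

// node is a hypothetical balance candidate ordered by its current load.
type node struct {
	id   int
	load int
}

// nodeHeap is a min-heap of nodes keyed on load.
type nodeHeap []*node

func (h nodeHeap) Len() int            { return len(h) }
func (h nodeHeap) Less(i, j int) bool  { return h[i].load < h[j].load }
func (h nodeHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *nodeHeap) Push(x interface{}) { *h = append(*h, x.(*node)) }
func (h *nodeHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// assign pops the least-loaded node and asks accept whether to use it.
// The deferred Push runs on every return path, so an early return can no
// longer leave the heap one element short.
func assign(h *nodeHeap, accept func(*node) bool) bool {
	if h.Len() == 0 {
		return false
	}
	n := heap.Pop(h).(*node)
	defer heap.Push(h, n)

	if !accept(n) {
		return false // early exit: n is still pushed back
	}
	n.load++ // the updated load is what heap.Push re-sorts on
	return true
}
```

Because `n` is a pointer, the load update made before returning is visible when the deferred `heap.Push` restores heap order.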
congqixia
dc6a6a50fa
enhance: reduce SyncTask AllocID call and refine code (#29701)
See also #27675

`Allocator.Alloc` and `Allocator.AllocOne` might be invoked multiple
times if there were multiple blobs set in one sync task.

This PR adds pre-fetch logic for all blobs and caches logIDs in the sync task
so that at most one call to the allocator is needed.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-05 10:04:46 +08:00
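The batching idea in dc6a6a50fa (#29701) can be illustrated as follows. The `idAllocator` interface is modeled on the `Alloc` call named in the commit message, but its exact signature here is an assumption, and `counterAllocator` and `prefetchLogIDs` are hypothetical names used only for this sketch.

```go
package main

import "errors"

// idAllocator is an assumed shape for the allocator: Alloc reserves count
// consecutive IDs and returns the half-open range [start, end).
type idAllocator interface {
	Alloc(count uint32) (start, end int64, err error)
}

// counterAllocator is a toy in-memory allocator that counts its calls.
type counterAllocator struct {
	next  int64
	calls int
}

func (c *counterAllocator) Alloc(count uint32) (int64, int64, error) {
	c.calls++
	start := c.next
	c.next += int64(count)
	return start, c.next, nil
}

// prefetchLogIDs reserves one ID per blob with a single allocator call and
// returns them as a slice that the sync task can cache and hand out.
func prefetchLogIDs(a idAllocator, blobCount int) ([]int64, error) {
	if blobCount <= 0 {
		return nil, nil
	}
	start, end, err := a.Alloc(uint32(blobCount))
	if err != nil {
		return nil, err
	}
	if end-start < int64(blobCount) {
		return nil, errors.New("allocator returned fewer IDs than requested")
	}
	ids := make([]int64, blobCount)
	for i := range ids {
		ids[i] = start + int64(i)
	}
	return ids, nil
}
```

However many blobs the task carries, the allocator is called once, which is the reduction the commit is after.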
cai.zhang
dc8b5c1130
enhance: Read azure file without ReadAll (#29602)
issue: #29292

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-01-04 20:50:46 +08:00
wayblink
05d735c322
enhance: Rename SearchV2 to HybridSearch (#29592)
related: https://github.com/milvus-io/milvus-proto/pull/233
issue: #29593
/kind enhancement

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-01-04 19:22:46 +08:00
yah01
0ae90443ba
enhance: fill missed info for segcore error (#29610)
- fill missed error info
- format the error message directly

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-04 17:54:46 +08:00
yah01
9e0163e12f
enhance: use GPU pool for gpu tasks (#29678)
- this greatly improves performance for GPU indexes

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-04 17:50:46 +08:00
congqixia
4f8c540c77
enhance: cache collection schema attributes to reduce proxy cpu (#29668)
See also #29113

The collection schema is crucial when performing search/query, but some
of its information is recalculated for every request.

This PR changes the schema field of the cached collection info into a utility
`schemaInfo` type that stores some stable results, such as the pk field and
partitionKeyEnabled, and provides a field-name-to-ID map for
search/query services.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-04 17:28:46 +08:00
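The caching approach in 4f8c540c77 (#29668) can be sketched as a small wrapper that derives the stable attributes once. Only the name `schemaInfo` comes from the commit message; the `field` type, the struct layout, and the helper names below are assumptions made for illustration.

```go
package main

// field is a simplified stand-in for a collection schema field.
type field struct {
	ID             int64
	Name           string
	IsPrimaryKey   bool
	IsPartitionKey bool
}

// schemaInfo caches attributes that are stable for a given schema so that
// search/query requests do not have to rescan the field list each time.
type schemaInfo struct {
	fields              []field
	fieldIDs            map[string]int64
	pkField             *field
	partitionKeyEnabled bool
}

// newSchemaInfo walks the fields once and stores the derived results.
func newSchemaInfo(fields []field) *schemaInfo {
	info := &schemaInfo{
		fields:   fields,
		fieldIDs: make(map[string]int64, len(fields)),
	}
	for i := range info.fields {
		f := &info.fields[i]
		info.fieldIDs[f.Name] = f.ID
		if f.IsPrimaryKey {
			info.pkField = f
		}
		if f.IsPartitionKey {
			info.partitionKeyEnabled = true
		}
	}
	return info
}

// fieldID is an O(1) lookup that replaces a per-request linear scan.
func (s *schemaInfo) fieldID(name string) (int64, bool) {
	id, ok := s.fieldIDs[name]
	return id, ok
}
```

The one-time scan in `newSchemaInfo` trades a little memory per cached collection for less repeated work on the proxy's request path.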