Zhen Ye
a773836b89
enhance: optimize milvus core building ( #35610 )
...
issue: #35549,#35611,#35633
- remove milvus_segcore milvus_indexbuilder..., add libmilvus_core
- core building only link once
- move opendal compilation into cmake
- fix odr
---------
Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-23 12:35:02 +08:00
smellthemoon
80dbe87759
enhance: support null value in index ( #35238 )
...
#31728
---------
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-08-16 15:30:54 +08:00
Buqian Zheng
f4a91e135b
enhance: Allow empty sparse row ( #34700 )
...
issue: #29419
* If a sparse vector with 0 non-zero value is inserted, no ANN search on
this sparse vector field will return it as a result. User may retrieve
this row via scalar query or ANN search on another vector field though.
* If the user uses an empty sparse vector as the query vector for a ANN
search, no neighbor will be returned.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-08-16 14:14:54 +08:00
zhagnlu
626b1b2f5e
fix:redefine hybrid internal index type ( #35314 )
...
#32900
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-08 10:32:16 +08:00
Cai Yudong
3c9a47c8db
feat: Encode traceID and spanID as hex string ( #34807 )
...
Issue: https://github.com/zilliztech/knowhere/pull/714
Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-08-06 15:20:16 +08:00
zhagnlu
4b553b0333
enhance: revert remove duplicated pk function ( #35103 )
...
issue: #34778
Revert "fix: fix query count(*) concurrently"
Revert "enhance: mark duplicated pk as deleted "
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-05 10:48:17 +08:00
zhagnlu
16dd53e7cf
enhance: remove timestamp_filter after retrieve ( #35207 )
...
#35226
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-02 19:32:46 +08:00
smellthemoon
475c333fa2
enhance: add valid_data in span ( #35030 )
...
#31728
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-08-02 15:40:14 +08:00
zhagnlu
f8c1b138a8
fix:fix get array error for int type ( #35154 )
...
#35055
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-01 14:30:12 +08:00
zhenshan.cao
aa247f192d
enhance: remove unused code for StorageV2 ( #35132 )
...
issue: https://github.com/milvus-io/milvus/issues/34168
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-01 12:08:13 +08:00
Bingyi Sun
f229f244d2
enhance: add chunk basic impl ( #34634 )
...
https://github.com/milvus-io/milvus/issues/35112
This pr would not affect milvus functionality by now.
It implments a Chunk memory layout that looks like
```
VariableColumn
|offset|offset|offset|
|data|data|data|
```
We maybe move offsets to the beginning and add null bitmaps later but
not in this PR.
And mmap test will also be added in another PR.
---------
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-08-01 10:29:51 +08:00
congqixia
de8a266d8a
enhance: Enable linux code checker ( #35084 )
...
See also #34483
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-30 15:53:51 +08:00
zhagnlu
86322e0468
fix: fix query count(*) concurrently ( #35007 )
...
#34778
#34849
fix two problems:
1. count(*) incorrect, if growing insert duplicated (pk, timestamp)
pairs that pk and timestamp all same, need to keep just one pair.
2. count(*) may core dump, if get_real_count interface get snapshot and
do mvcc at not consistency status, mainly happens under concurrency.
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-29 19:53:50 +08:00
smellthemoon
7ec9d856f3
fix: access address was not malloc ( #34971 )
...
issue: #34972
fix string type data use memcpy to fill cause segv for not malloc enough
memory in advance.
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-07-25 02:31:44 +08:00
smellthemoon
5616b7e8d2
enhance: support null in c data_datacodec and load null value ( #32183 )
...
1. support read and write null in segcore
will store valid_data(use uint8_t type to save memory) in fieldData.
2. support load null
binlog reader read and write data into column(sealed segment),
insertRecord(growing segment). In sealed segment, store valid_data
directly. In growing segment, considering prior implementation and easy
code reading, it covert uint8_t to fbvector<bool>, which may optimize in
future.
3. retrieve valid_data.
parse valid_data in search/query.
#31728
---------
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-07-23 16:07:51 +08:00
Chun Han
ed057e6fce
fix: non-init seg_offset for growing raw-data when doing groupby ( #34748 )
...
related: #34713
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-19 17:01:40 +08:00
foxspy
8e64bf929c
enhance: add scalar filtering and vector search latency metrics ( #34785 )
...
add scalar filtering and vector search latency metrics to distinguish
the cost of scalar filtering.
To add metrics in query chain, add a monitor module and move the metric
files from original storage module.
issue: #34780
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-07-19 14:01:39 +08:00
zhagnlu
f1b2f7b640
enhance: refactor bitmap index and internal hybrid index ( #34450 )
...
#32900
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-18 10:39:42 +08:00
zhagnlu
804dd5409a
enhance: mark duplicated pk as deleted ( #34586 )
...
fix #34247
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-16 14:25:39 +08:00
Chun Han
f00c529aea
feat: support group_size for search_group_by( #33544 ) ( #33720 )
...
related: #33544
mainly changes in three aspects:
1. enable setting group_size for group by function
2. separate normal reduce and group by reduce
3. eleminate uncessary padding in search result for reducing
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-12 10:17:36 +08:00
yihao.dai
734415b8a2
fix: Reduce duplicate PKs in segcore ( #34267 )
...
issue: https://github.com/milvus-io/milvus/issues/34247
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-01 17:42:06 +08:00
zhagnlu
3030e4625e
enhance: refactor variable column to reduce memory cost ( #33875 )
...
#33874
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-30 20:16:06 +08:00
Cai Yudong
ad90360162
enhance: Update knowhere commit ( #34223 )
...
Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-06-27 18:20:06 +08:00
Gao
a933f6731b
fix: centroids file not removed when data skew in major compaction ( #34050 )
...
issue: https://github.com/milvus-io/milvus/issues/30633
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-06-26 10:48:04 +08:00
Jiquan Long
aa36f9feed
fix: [ut] regex query under unsupported index ( #34087 )
...
/kind improvement
issue: https://github.com/milvus-io/milvus/issues/29988
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-06-24 14:08:03 +08:00
Patrick Weizhi Xu
b961767005
enhance: support integral type for MV and skip MV if there is only one category ( #33161 )
...
issue: #29892
---------
Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-06-24 10:20:01 +08:00
zhagnlu
03a3f50892
enhance: add skip using array index when some situation ( #33947 )
...
#32900
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-23 21:26:02 +08:00
chyezh
259a682673
enhance: async search and retrieve in cgo ( #33228 )
...
issue: #30926 , #33132
related pr: #33133
---------
Signed-off-by: chyezh <chyezh@outlook.com>
2024-06-22 09:38:02 +08:00
cqy123456
dc4437ff82
enhance: use segment id and type to register in MmapChunkManager and opt malloc in variableChunk ( #33993 )
...
issue: https://github.com/milvus-io/milvus/issues/32984
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-20 17:42:02 +08:00
Gao
0d20303e54
fix: fix binary vector data size ( #33750 )
...
issue: https://github.com/milvus-io/milvus/issues/22837
- fix byte size wrong for binary vectors
- fix the expect/actual error msg
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-06-18 21:39:59 +08:00
cqy123456
32f685ff12
enhance: growing segment support mmap ( #32633 )
...
issue: https://github.com/milvus-io/milvus/issues/32984
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-18 14:42:00 +08:00
zhagnlu
d43ec4db0b
enhance: support array bitmap index ( #33527 )
...
#32900
---------
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-16 21:51:58 +08:00
Buqian Zheng
8cb350598c
enhance: Improve GetVectorById of Sparse Float Vector ( #33209 )
...
issue: #29419
* sparse float vector to support raw data mmap
For get vector from chunk cache, I added a unit test but marking it as
skipped due to a known issue. I have tested it locally.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-06-12 10:09:55 +08:00
Jiquan Long
ecf2bcee42
enhance: speed up array-equal operator via inverted index ( #33633 )
...
fix : #33632
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-06-11 14:13:54 +08:00
chyezh
f53ab54c5d
enhance: async cgo utility ( #33133 )
...
issue: #30926 , #33132
- implement future-based cgo utility.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
2024-06-09 22:55:53 +08:00
cai.zhang
27cc9f2630
enhance: Support analyze data ( #33651 )
...
issue: #30633
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: chasingegg <chao.gao@zilliz.com>
2024-06-06 17:37:51 +08:00
Gao
545d4725fb
fix: correct get vector data size for bf16/fp16/binary vector ( #33377 )
...
related #22837
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-06-05 14:31:57 +08:00
congqixia
597f4c5e03
enhance: Make hasMoreResult accurate when hit number larger than limit ( #33609 )
...
See also milvus-io/milvus-sdk-go#756
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-05 11:51:51 +08:00
Jiquan Long
0c5d8660aa
feat: support inverted index for array ( #33452 )
...
issue: https://github.com/milvus-io/milvus/issues/27704
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-31 09:47:47 +08:00
Chun Han
416a2cf507
fix: query iterator lack results( #33137 ) ( #33422 )
...
related: #33137
adding has_more_result_tag for various level's reduce to rectify
reduce_stop_for_best
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-05-30 17:51:44 +08:00
zhagnlu
589d4dfd82
enhance: optimize bitmap index ( #33358 )
...
#32900
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-05-30 13:09:43 +08:00
Buqian Zheng
c5918ffbdb
enhance: mark sparse inverted index as mmap-able ( #33281 )
...
issue: #29419
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-05-23 14:11:42 +08:00
cai.zhang
be77ceba84
enhance: Use proto for passing info in cgo ( #33184 )
...
issue: #33183
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-05-23 10:31:40 +08:00
zhagnlu
d669fbcf46
enhance: support bitmap index for scalar type ( #32902 )
...
#32900
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-05-19 21:49:38 +08:00
Buqian Zheng
bb7765cbd6
fix: fix Indexing.Iterator ut: build index with all data at once ( #32844 )
...
issue: #32843
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-05-10 11:31:30 +08:00
Chun Han
01c2684355
enhance: [skip e2e] disable unstable ut temporarily ( #32836 )
...
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-05-08 12:17:29 +08:00
Jiquan Long
9837ad6a8d
enhance: remove deprecated api ( #32808 )
...
issue: #32728
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-07 20:03:30 +08:00
Jiquan Long
1f58cda957
enhance: add more trace for search & query ( #32734 )
...
issue: https://github.com/milvus-io/milvus/issues/32728
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-07 13:03:29 +08:00
Bingyi Sun
fecd9c21ba
feat: LRU cache implementation ( #32567 )
...
issue: https://github.com/milvus-io/milvus/issues/32783
This pr is the implementation of lru cache on branch lru-dev.
Signed-off-by: sunby <sunbingyi1992@gmail.com>
Co-authored-by: chyezh <chyezh@outlook.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
Co-authored-by: Ted Xu <ted.xu@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: wayblink <anyang.wang@zilliz.com>
2024-05-06 20:29:30 +08:00
Buqian Zheng
858599d831
enhance: sparse float vector to support brute force iterator and range search ( #32635 )
...
issue: #29419
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-04-29 14:35:26 +08:00