Jiquan Long
f0f2fb4cf0
enhance: span tracing of c++ part ( #36205 )
...
fix: https://github.com/milvus-io/milvus/issues/36204
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-13 11:19:09 +08:00
Jiquan Long
89bf226f0b
feat: support keyword text match ( #35923 )
...
fix : #35922
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-10 15:11:08 +08:00
smellthemoon
21b135c7c2
fix: not append valid data when transfer to insert record ( #36027 )
...
Fix valid data not being appended when transferring to the insert record,
and add a small check for the groupBy field.
#35924
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-09-06 14:53:04 +08:00
SimFG
5247631289
fix: fill the metric type field in the LoadMetaInfo object ( #35962 )
...
- issue: #35960
Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-09-05 20:50:23 -07:00
cqy123456
560e8e70b0
enhance: reduce mmap_rss after chunkcache warmup ( #35974 )
...
related pr: https://github.com/milvus-io/milvus/pull/35965
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-09-05 18:07:05 +08:00
zhagnlu
74048ce34f
fix: rename mmap file path to avoid directory conflict ( #35810 )
...
#35784
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-09-03 16:05:03 +08:00
cai.zhang
2c9bb4dfa3
feat: Support stats task to sort segment by PK ( #35054 )
...
issue: #33744
This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-02 14:19:03 +08:00
Jiquan Long
5ea2454fdf
feat: tantivy tokenizer binding ( #35801 )
...
fix : #35800
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-01 17:13:03 +08:00
yihao.dai
f2b83d316b
enhance: Support memory mode chunk cache ( #35347 )
...
Chunk cache supports loading raw vectors into memory.
issue: https://github.com/milvus-io/milvus/issues/35273
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-25 15:42:58 +08:00
zhagnlu
42f7800b5b
enhance: add bitmap offset cache to speed up retrieve raw data ( #35498 )
...
#35458
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-24 01:40:58 +08:00
Zhen Ye
a773836b89
enhance: optimize milvus core building ( #35610 )
...
issue: #35549,#35611,#35633
- remove milvus_segcore milvus_indexbuilder..., add libmilvus_core
- core building only link once
- move opendal compilation into cmake
- fix odr
---------
Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-23 12:35:02 +08:00
zhagnlu
3107701fe8
enhance: optimize retrieve on dynamic field ( #35580 )
...
#35514
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
Co-authored-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-22 14:24:56 +08:00
smellthemoon
80dbe87759
enhance: support null value in index ( #35238 )
...
#31728
---------
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-08-16 15:30:54 +08:00
Buqian Zheng
f4a91e135b
enhance: Allow empty sparse row ( #34700 )
...
issue: #29419
* If a sparse vector with no non-zero values is inserted, no ANN search on
this sparse vector field will return it as a result. Users can still retrieve
this row via a scalar query or an ANN search on another vector field.
* If the user uses an empty sparse vector as the query vector for an ANN
search, no neighbors are returned.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-08-16 14:14:54 +08:00
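The two rules stated in #34700 can be pictured with a small brute-force model. This is only a sketch under assumptions: the dict-based vector layout, the inner-product scoring, and the names `sparse_ip` and `ann_search` are hypothetical and are not taken from the Milvus code.

```python
# Hypothetical in-memory model of the empty-sparse-row rules; not Milvus code.
# A sparse vector is modelled as a dict {dimension: value}.

def sparse_ip(a, b):
    """Inner product of two sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b[k] for k, v in a.items() if k in b)


def ann_search(rows, query, limit):
    """Brute-force search over rows = {row_id: sparse_vector}."""
    if not query:
        # An empty query vector yields no neighbors.
        return []
    hits = []
    for row_id, vec in rows.items():
        if not vec:
            # An empty sparse row is never returned by search on this field.
            continue
        hits.append((row_id, sparse_ip(vec, query)))
    hits.sort(key=lambda h: h[1], reverse=True)
    return hits[:limit]


rows = {1: {0: 1.0}, 2: {}, 3: {0: 0.5, 4: 2.0}}
result = ann_search(rows, {0: 2.0}, limit=10)
empty_query_result = ann_search(rows, {}, limit=10)
```

Row 2 stays in `rows`, so it remains reachable by other means (for example a scalar lookup by id); it is only skipped by the search on this field.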
zhagnlu
4b553b0333
enhance: revert remove duplicated pk function ( #35103 )
...
issue: #34778
Revert "fix: fix query count(*) concurrently"
Revert "enhance: mark duplicated pk as deleted "
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-05 10:48:17 +08:00
zhagnlu
16dd53e7cf
enhance: remove timestamp_filter after retrieve ( #35207 )
...
#35226
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-02 19:32:46 +08:00
smellthemoon
475c333fa2
enhance: add valid_data in span ( #35030 )
...
#31728
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-08-02 15:40:14 +08:00
zhenshan.cao
aa247f192d
enhance: remove unused code for StorageV2 ( #35132 )
...
issue: https://github.com/milvus-io/milvus/issues/34168
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-01 12:08:13 +08:00
zhagnlu
dd0c26cf58
enhance: redefine variable column block size ( #35040 )
...
#35013
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-30 19:23:50 +08:00
congqixia
de8a266d8a
enhance: Enable linux code checker ( #35084 )
...
See also #34483
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-30 15:53:51 +08:00
zhagnlu
86322e0468
fix: fix query count(*) concurrently ( #35007 )
...
#34778
#34849
Fixes two problems:
1. count(*) is incorrect if a growing segment inserts duplicated
(pk, timestamp) pairs whose pk and timestamp are both identical; only one
such pair should be kept.
2. count(*) may core dump if the get_real_count interface takes a snapshot
and applies MVCC in an inconsistent state, which mainly happens under
concurrency.
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-29 19:53:50 +08:00
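The first problem in #35007 amounts to counting each identical (pk, timestamp) pair once. A minimal sketch of that idea, assuming rows are plain tuples; the function name `real_count` is borrowed loosely from the commit text and the body is hypothetical, not the segcore implementation:

```python
# Hypothetical sketch: count rows while keeping a single copy of every
# (pk, timestamp) pair that appears more than once.

def real_count(rows):
    """rows is an iterable of (pk, timestamp) tuples."""
    return len(set(rows))


rows = [(1, 100), (1, 100), (2, 100)]
count = real_count(rows)
```

The concurrency problem (taking a snapshot in an inconsistent state) is not modelled here.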
Aldrin
9463eeef2b
fix: Avoided dereferencing NULL pointer ( #34836 )
...
issue : https://github.com/milvus-io/milvus/issues/34835
Signed-off-by: Ald392 <imagesai32@gmail.com>
2024-07-27 17:27:52 +08:00
smellthemoon
5616b7e8d2
enhance: support null in c data_datacodec and load null value ( #32183 )
...
1. Support reading and writing null in segcore:
valid_data is stored in fieldData (as uint8_t to save memory).
2. Support loading null:
the binlog reader reads data and writes it into a column (sealed segment)
or insertRecord (growing segment). In a sealed segment, valid_data is stored
directly. In a growing segment, to stay close to the prior implementation
and keep the code easy to read, uint8_t is converted to fbvector<bool>,
which may be optimized in the future.
3. Retrieve valid_data:
parse valid_data in search/query.
#31728
---------
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-07-23 16:07:51 +08:00
Chun Han
6c19f9baf8
enhance: optimize search reduce perf( #32507 ) ( #34607 )
...
related: #32507
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-22 20:57:43 +08:00
zhagnlu
804dd5409a
enhance: mark duplicated pk as deleted ( #34586 )
...
fix #34247
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-16 14:25:39 +08:00
Chun Han
f00c529aea
feat: support group_size for search_group_by( #33544 ) ( #33720 )
...
related: #33544
mainly changes in three aspects:
1. enable setting group_size for group by function
2. separate normal reduce and group by reduce
3. eliminate unnecessary padding in search results during reduce
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-12 10:17:36 +08:00
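One way to read the group_size change in #33720 is as a reduce that keeps up to `group_size` hits per group instead of one. The sketch below rests on assumptions: the tuple layout, the "higher score is better" ordering, and the function `group_by_reduce` are all hypothetical, and the real reduce in segcore may differ.

```python
# Hypothetical sketch of a group-by reduce with a per-group size limit.
# hits: list of (score, pk, group_value); a higher score is assumed better.

def group_by_reduce(hits, limit, group_size):
    """Keep at most `limit` groups and at most `group_size` hits per group."""
    groups = {}
    for score, pk, group in sorted(hits, key=lambda h: h[0], reverse=True):
        bucket = groups.get(group)
        if bucket is None:
            if len(groups) >= limit:
                continue  # enough groups already, drop hits from new groups
            bucket = groups[group] = []
        if len(bucket) < group_size:
            bucket.append((score, pk))
    return groups


hits = [(0.9, 1, "a"), (0.8, 2, "a"), (0.7, 3, "b"), (0.6, 4, "a"), (0.5, 5, "c")]
reduced = group_by_reduce(hits, limit=2, group_size=2)
```

Groups that end up with fewer than `group_size` hits are returned as they are, without filler entries, in the spirit of the commit's third point about avoiding padding.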
congqixia
4850336ca3
fix: Write padding at end of mmap file not chunk ( #34529 )
...
Related to #34508
The padding bytes shall be written only once, at the end of the mmap file,
not after the chunk of each field data file.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-10 11:12:14 +08:00
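The layout rule in #34529 can be illustrated as follows. This is a sketch only: the padding size, the `PADDING` constant, and `write_field_file` are hypothetical and do not reflect the actual mmap writer.

```python
# Hypothetical sketch: chunks are written back to back and the padding is
# appended a single time after the last chunk.
import io

PADDING = b"\x00" * 8  # assumed size, for illustration only


def write_field_file(chunks, out):
    """Write all chunks, then the padding once; return total bytes written."""
    total = 0
    for chunk in chunks:
        total += out.write(chunk)  # no padding between chunks
    total += out.write(PADDING)    # padding only at the end of the file
    return total


buf = io.BytesIO()
written = write_field_file([b"abc", b"de"], buf)
```

With per-chunk padding the payload would be interrupted after `abc`; here the chunk bytes stay contiguous.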
yihao.dai
734415b8a2
fix: Reduce duplicate PKs in segcore ( #34267 )
...
issue: https://github.com/milvus-io/milvus/issues/34247
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-01 17:42:06 +08:00
zhagnlu
3030e4625e
enhance: refactor variable column to reduce memory cost ( #33875 )
...
#33874
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-30 20:16:06 +08:00
zhagnlu
03a3f50892
enhance: add skip using array index when some situation ( #33947 )
...
#32900
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-23 21:26:02 +08:00
zhagnlu
0d7ea8ec42
enhance: Enhance and correct exception module ( #33705 )
...
#33704
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-23 21:22:01 +08:00
chyezh
259a682673
enhance: async search and retrieve in cgo ( #33228 )
...
issue: #30926 , #33132
related pr: #33133
---------
Signed-off-by: chyezh <chyezh@outlook.com>
2024-06-22 09:38:02 +08:00
cqy123456
dc4437ff82
enhance: use segment id and type to register in MmapChunkManager and opt malloc in variableChunk ( #33993 )
...
issue: https://github.com/milvus-io/milvus/issues/32984
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-20 17:42:02 +08:00
cqy123456
298e50b834
enhance: check index with data type ( #33880 )
...
issue: https://github.com/milvus-io/milvus/issues/22837
related: https://github.com/milvus-io/milvus/pull/33878
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-19 10:15:59 +08:00
cqy123456
b460862537
fix: can't find Chunk struct after growing support mmap ( #33951 )
...
issue: https://github.com/milvus-io/milvus/issues/32984
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-18 18:37:58 +08:00
congqixia
3fdaae8792
fix: Return record with largest timestamp for entries with same PK ( #33936 )
...
See also #33883
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-18 15:55:59 +08:00
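The behavior named in the title of #33936 can be sketched as a pass that keeps, for every primary key, the record with the largest timestamp. The record layout and the function `latest_per_pk` are hypothetical illustrations, not the segcore code.

```python
# Hypothetical sketch: among records sharing a PK, return the newest one.
# records: iterable of (pk, timestamp, value).

def latest_per_pk(records):
    latest = {}
    for pk, ts, value in records:
        if pk not in latest or ts > latest[pk][0]:
            latest[pk] = (ts, value)
    return {pk: value for pk, (ts, value) in latest.items()}


records = [(1, 100, "old"), (2, 105, "only"), (1, 110, "new")]
newest = latest_per_pk(records)
```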
cqy123456
32f685ff12
enhance: growing segment support mmap ( #32633 )
...
issue: https://github.com/milvus-io/milvus/issues/32984
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-18 14:42:00 +08:00
zhagnlu
e422168f09
fix: readd timestamp index because segment timestamp not ordered ( #33856 )
...
#33533
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-16 21:48:01 +08:00
Buqian Zheng
47b04ea167
enhance: support sparse cardinal hnsw index ( #33656 )
...
issue: #29419
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-06-12 16:57:55 +08:00
Buqian Zheng
8cb350598c
enhance: Improve GetVectorById of Sparse Float Vector ( #33209 )
...
issue: #29419
* sparse float vector to support raw data mmap
For getting vectors from the chunk cache, I added a unit test but marked it
as skipped due to a known issue. I have tested it locally.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-06-12 10:09:55 +08:00
zhagnlu
8ad26093ba
fix: fix load failure ( #33599 )
...
issue: #33533
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-05 19:19:51 +08:00
congqixia
597f4c5e03
enhance: Make hasMoreResult accurate when hit number larger than limit ( #33609 )
...
See also milvus-io/milvus-sdk-go#756
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-05 11:51:51 +08:00
zhagnlu
c6f8a73bb2
enhance: optimize some cache to reduce memory usage ( #33534 )
...
#33533
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-04 14:09:47 +08:00
Jiquan Long
0c5d8660aa
feat: support inverted index for array ( #33452 )
...
issue: https://github.com/milvus-io/milvus/issues/27704
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-31 09:47:47 +08:00
Chun Han
416a2cf507
fix: query iterator lack results( #33137 ) ( #33422 )
...
related: #33137
Adds has_more_result_tag to the reduce at each level to rectify
reduce_stop_for_best.
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-05-30 17:51:44 +08:00
yihao.dai
5cf4161394
fix: Fix exception info is missing ( #33393 )
...
Replace based std::exception to prevent "object slicing"
issue: https://github.com/milvus-io/milvus/issues/33392
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-05-27 14:33:41 +08:00
congqixia
3c4df81261
enhance: Assert insert data length not overflow int ( #33248 )
...
When InsertData is too large for cpp proto unmarshalling, the error
message is confusing because the length has overflowed.
This PR adds an assertion on the insert data length.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-22 15:11:39 +08:00
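The guard described in #33248 can be sketched as an explicit bounds check before a length is handed to an API that takes a 32-bit signed int. The limit constant and the function `check_insert_data_size` are hypothetical; the actual assertion lives in the Milvus codebase and may look different.

```python
# Hypothetical sketch: fail early with a clear message instead of letting
# the length wrap around when it is narrowed to a 32-bit signed int.

INT32_MAX = 2**31 - 1


def check_insert_data_size(size):
    """Return size unchanged if it fits in an int32, otherwise raise."""
    if size > INT32_MAX:
        raise ValueError(
            f"insert data size {size} exceeds int32 limit {INT32_MAX}"
        )
    return size


ok = check_insert_data_size(1024)
try:
    check_insert_data_size(2**31)
    overflow_rejected = False
except ValueError:
    overflow_rejected = True
```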
foxspy
f6777267e3
enhance: add score compute consistency config for knowhere ( #32997 )
...
issue: https://github.com/milvus-io/milvus/issues/32583
related: #32584
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-05-13 14:21:31 +08:00
Jiquan Long
9837ad6a8d
enhance: remove deprecated api ( #32808 )
...
issue: #32728
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-07 20:03:30 +08:00
Jiquan Long
1f58cda957
enhance: add more trace for search & query ( #32734 )
...
issue: https://github.com/milvus-io/milvus/issues/32728
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-07 13:03:29 +08:00