milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
Bingyi Sun	a75bb85f3a	feat: support chunked column for sealed segment (#35764 ) This PR splits sealed segment to chunked data to avoid unnecessary memory copy and save memory usage when loading segments so that loading can be accelerated. To support rollback to previous version, we add an option `multipleChunkedEnable` which is false by default. Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-10-12 15:04:52 +08:00
Rijin-N	a05a37a583	enhance: GCS native support (GCS implemented using Google Cloud Storage libraries) (#36214 ) Native support for Google Cloud Storage using the Google Cloud Storage libraries. Authentication is performed using GCS service account credentials JSON. Currently, Milvus supports Google Cloud Storage using S3-compatible APIs via the AWS SDK. This approach has the following limitations: 1. Overhead: Translating requests between S3-compatible APIs and GCS can introduce additional overhead. 2. Compatibility Limitations: Some features of the original S3 API may not fully translate or work as expected with GCS. This enhancement addresses these limitations. Related Issue: #36212	2024-09-30 13:23:32 +08:00
Buqian Zheng	8495bc6bbc	fix: fix broken Sparse Float Vector raw data mmap (#36183 ) issue: https://github.com/milvus-io/milvus/issues/36182 * improved `Column.h` to make the code much more readable and maintainable, and added detailed comments. * fixed an issue where `ArrayColumn::NumRows()` always returns 0 when the mmap backing storage is a file. * removed unused `ColumnBase` constructors and unnecessary members so we don't get confused. * Updated `test_chunk_cache.cpp` to make the tests parameterized: to test both mmap enabled and disabled. Added sparse field in the test to add coverage. * re-enabled test `Sealed::GetSparseVectorFromChunkCache`. * But 2 other disabled tests `Sealed::WarmupChunkCache` and `Sealed::GetVectorFromChunkCache` remain disabled, there seems to be errors. @bigsheeper PTAL. --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-09-25 18:59:13 +08:00
yihao.dai	8cda48a96a	enhance: Use mmap.scalarIndex config for text index (#36400 ) issue: https://github.com/milvus-io/milvus/issues/35273 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-09-24 12:21:13 +08:00
jaime	2ff3765058	enhance: catch std::stoi exception and improve error msg (#36267 ) issue: #36255 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-09-14 16:17:08 +08:00
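The idea in commit 2ff3765058 (catching `std::stoi` exceptions and reporting a clearer error) can be sketched as follows. This is a minimal illustration, not the Milvus code: `ParseIntOrThrow` and its `name` parameter are hypothetical, and rejecting trailing characters is an extra assumption the commit message does not mention. `std::stoi` itself throws `std::invalid_argument` when no conversion is possible and `std::out_of_range` when the value does not fit in an `int`.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Milvus): parse `value` as an int and turn
// the terse std::stoi exceptions into an error that names the offending input.
int ParseIntOrThrow(const std::string& name, const std::string& value) {
    try {
        std::size_t pos = 0;
        int parsed = std::stoi(value, &pos);
        // Assumption for this sketch: input such as "12abc" is rejected too.
        if (pos != value.size()) {
            throw std::invalid_argument("trailing characters");
        }
        return parsed;
    } catch (const std::invalid_argument&) {
        throw std::runtime_error("invalid integer for '" + name + "': \"" +
                                 value + "\"");
    } catch (const std::out_of_range&) {
        throw std::runtime_error("integer out of range for '" + name +
                                 "': \"" + value + "\"");
    }
}
```

Calling `ParseIntOrThrow("port", "abc")` raises a `std::runtime_error` whose message includes both the setting name and the bad value, which is usually easier to act on than the bare message from `std::stoi`.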
Jiquan Long	89bf226f0b	feat: support keyword text match (#35923 ) fix: #35922 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-09-10 15:11:08 +08:00
cqy123456	560e8e70b0	enhance: reduce mmap_rss after chunkcache warmup (#35974 ) related pr: https://github.com/milvus-io/milvus/pull/35965 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-09-05 18:07:05 +08:00
foxspy	9da86529a7	enhance: Add disk filemanager parallel load control to reduce the memory consumption (#35281 ) issue: #35280 add parallel control to limit the memory consumption during index file loading Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-09-03 18:01:03 +08:00
smellthemoon	a3f2f044d6	fix: not set nullable when stream writer write headers (#35799 ) #35802 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-08-29 20:59:00 +08:00
zhagnlu	4d2f96c760	enhance: support bitmap mmap (#35399 ) #32900 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-08-27 16:34:59 +08:00
yihao.dai	f2b83d316b	enhance: Support memory mode chunk cache (#35347 ) Chunk cache supports loading raw vectors into memory. issue: https://github.com/milvus-io/milvus/issues/35273 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-08-25 15:42:58 +08:00
Zhen Ye	a773836b89	enhance: optimize milvus core building (#35610 ) issue: #35549,#35611,#35633 - remove milvus_segcore milvus_indexbuilder..., add libmilvus_core - core building only link once - move opendal compilation into cmake - fix odr --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-08-23 12:35:02 +08:00
Chun Han	337e065902	fix: querynode hang when failing to allocate disk space for mmap(#35184 ) (#35187 ) related: #35184 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2024-08-19 15:30:55 +08:00
smellthemoon	80dbe87759	enhance: support null value in index (#35238 ) #31728 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-08-16 15:30:54 +08:00
Jiquan Long	91df03afe8	feat: put inverted index into ram (#35222 ) fix: https://github.com/milvus-io/milvus/issues/35224 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-08-06 11:54:16 +08:00
zhenshan.cao	aa247f192d	enhance: remove unused code for StorageV2 (#35132 ) issue: https://github.com/milvus-io/milvus/issues/34168 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-08-01 12:08:13 +08:00
congqixia	de8a266d8a	enhance: Enable linux code checker (#35084 ) See also #34483 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-30 15:53:51 +08:00
smellthemoon	5616b7e8d2	enhance: support null in c data_datacodec and load null value (#32183 ) 1. Support reading and writing null in segcore: valid_data is stored in fieldData (as uint8_t to save memory). 2. Support loading null: the binlog reader reads data and writes it into the column (sealed segment) or insertRecord (growing segment). In a sealed segment, valid_data is stored directly. In a growing segment, for consistency with the prior implementation and for readability, uint8_t is converted to fbvector<bool>, which may be optimized in the future. 3. Retrieve valid_data: parse valid_data in search/query. #31728 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-07-23 16:07:51 +08:00
foxspy	8e64bf929c	enhance: add scalar filtering and vector search latency metrics (#34785 ) add scalar filtering and vector search latency metrics to distinguish the cost of scalar filtering. To add metrics in query chain, add a monitor module and move the metric files from original storage module. issue: #34780 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-07-19 14:01:39 +08:00
zhagnlu	f1b2f7b640	enhance: refactor bitmap index and internal hybrid index (#34450 ) #32900 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-07-18 10:39:42 +08:00
smellthemoon	ef3ced8138	fix: parse error because descriptor events from previous versions have no nullable field (#34235 ) #34176 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-07-01 16:38:06 +08:00
congqixia	14e827dc6c	fix: Implement singleflight for segcore ChunkCache (#34250 ) See also #34249 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-01 11:46:06 +08:00
Patrick Weizhi Xu	b961767005	enhance: support integral type for MV and skip MV if there is only one category (#33161 ) issue: #29892 --------- Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2024-06-24 10:20:01 +08:00
zhagnlu	0d7ea8ec42	enhance: Enhance and correct exception module (#33705 ) #33704 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-06-23 21:22:01 +08:00
cqy123456	dc4437ff82	enhance: use segment id and type to register in MmapChunkManager and opt malloc in variableChunk (#33993 ) issue: https://github.com/milvus-io/milvus/issues/32984 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-06-20 17:42:02 +08:00
smellthemoon	2a1356985d	enhance: support null in go payload (#32296 ) #31728 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-06-19 17:08:00 +08:00
Gao	0d20303e54	fix: fix binary vector data size (#33750 ) issue: https://github.com/milvus-io/milvus/issues/22837 - fix byte size wrong for binary vectors - fix the expect/actual error msg Signed-off-by: chasingegg <chao.gao@zilliz.com>	2024-06-18 21:39:59 +08:00
cqy123456	32f685ff12	enhance: growing segment support mmap (#32633 ) issue: https://github.com/milvus-io/milvus/issues/32984 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-06-18 14:42:00 +08:00
zhagnlu	d43ec4db0b	enhance: support array bitmap index (#33527 ) #32900 --------- Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-06-16 21:51:58 +08:00
Buqian Zheng	47b04ea167	enhance: support sparse cardinal hnsw index (#33656 ) issue: #29419 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-06-12 16:57:55 +08:00
Buqian Zheng	8cb350598c	enhance: Improve GetVectorById of Sparse Float Vector (#33209 ) issue: #29419 * sparse float vector to support raw data mmap For getting vectors from the chunk cache, I added a unit test but marked it as skipped due to a known issue. I have tested it locally. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-06-12 10:09:55 +08:00
cai.zhang	27cc9f2630	enhance: Support analyze data (#33651 ) issue: #30633 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: chasingegg <chao.gao@zilliz.com>	2024-06-06 17:37:51 +08:00
wei liu	b69740c8f3	enhance: Remove unnecessary log info during load segment (#33663 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-06-06 14:13:50 +08:00
cqy123456	703fc73f71	enhance: disk index support binary vector (#33631 ) issue:https://github.com/milvus-io/milvus/issues/22837 related https://github.com/milvus-io/milvus/pull/33575 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-06-05 19:37:57 +08:00
Jiquan Long	0c5d8660aa	feat: support inverted index for array (#33452 ) issue: https://github.com/milvus-io/milvus/issues/27704 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-05-31 09:47:47 +08:00
cai.zhang	32d3e22d7d	fix: Throw an exception after all the threads in thread pool finished (#32810 ) issue: #32487 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-05-23 11:47:40 +08:00
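The behavior named in commit 32d3e22d7d (raise the error only after every worker has finished) can be sketched with standard futures. This is an assumption-laden illustration rather than the Milvus thread pool: `RunAllThenRethrow` is a hypothetical helper, and `std::async` stands in for whatever pool the real code uses.

```cpp
#include <exception>
#include <functional>
#include <future>
#include <vector>

// Hypothetical helper (not part of Milvus): start every task, wait for ALL of
// them to finish, and only then rethrow the first exception that was observed.
void RunAllThenRethrow(const std::vector<std::function<void()>>& tasks) {
    std::vector<std::future<void>> futures;
    futures.reserve(tasks.size());
    for (const auto& task : tasks) {
        futures.push_back(std::async(std::launch::async, task));
    }

    std::exception_ptr first_error;
    for (auto& future : futures) {
        try {
            future.get();  // blocks until this task is done
        } catch (...) {
            // Remember the first failure but keep draining the other futures,
            // so no task is still running when the caller sees the error.
            if (!first_error) {
                first_error = std::current_exception();
            }
        }
    }
    if (first_error) {
        std::rethrow_exception(first_error);
    }
}
```

Deferring the rethrow matters when the tasks capture references to caller-owned state: if the caller unwound on the first failure, the remaining tasks could still be touching objects that are being destroyed.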
Bingyi Sun	fecd9c21ba	feat: LRU cache implementation (#32567 ) issue: https://github.com/milvus-io/milvus/issues/32783 This pr is the implementation of lru cache on branch lru-dev. Signed-off-by: sunby <sunbingyi1992@gmail.com> Co-authored-by: chyezh <chyezh@outlook.com> Co-authored-by: MrPresent-Han <chun.han@zilliz.com> Co-authored-by: Ted Xu <ted.xu@zilliz.com> Co-authored-by: jaime <yun.zhang@zilliz.com> Co-authored-by: wayblink <anyang.wang@zilliz.com>	2024-05-06 20:29:30 +08:00
PowderLi	6289f3a9eb	fix: build milvus in rockylinux8 (#32619 ) issue: #32299 1. xz utils recovers 2. forget to install ninja Signed-off-by: PowderLi <min.li@zilliz.com>	2024-04-29 14:53:26 +08:00
smellthemoon	4fb8044a27	enhance: delete some no lint code (#32182 ) #31728 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-04-26 14:15:26 +08:00
cqy123456	aba4993c6c	fix: fix some fp16/bf16 code miss in segcore. (#31771 ) issue: https://github.com/milvus-io/milvus/issues/22837 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-04-07 14:13:16 +08:00
Cai Yudong	246586be27	enhance: Unify data type check APIs under internal/core (#31800 ) Issue: #22837 Move and rename following C++ APIs: datatype_sizeof() ==> GetDataTypeSize() datatype_name() ==> GetDataTypeName() datatype_is_vector() / IsVectorType() ==> IsVectorDataType() datatype_is_variable() ==> IsVariableDataType() datatype_is_sparse_vector() ==> IsSparseFloatVectorDataType() datatype_is_string() / IsString() ==> IsDataTypeString() datatype_is_floating() / IsFloat() ==> IsDataTypeFloat() datatype_is_binary() ==> IsDataTypeBinary() datatype_is_json() ==> IsDataTypeJson() datatype_is_array() ==> IsDataTypeArray() datatype_is_variable() ==> IsDataTypeVariable() datatype_is_integer() / IsIntegral() ==> IsDataTypeInteger() Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-04-02 19:15:14 +08:00
PowderLi	d299fa502e	fix: use milvus-io/vcpkg (#31770 ) issue: #31769 GitHub Disables The XZ Repository because of CVE-2024-3094 Signed-off-by: PowderLi <min.li@zilliz.com>	2024-04-01 15:01:13 +08:00
chyezh	5655ec4fc0	enhance: add mmap usage metrics (#31708 ) issue: #31707 Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-01 11:35:12 +08:00
groot	5be395354c	fix: minio ssl compatible issue (#31607 ) issue: https://github.com/milvus-io/milvus/issues/30709 Signed-off-by: yhmo <yihua.mo@zilliz.com>	2024-03-27 14:41:20 +08:00
groot	c81909bfab	enhance: Support MinIO TLS connection (#31311 ) issue: https://github.com/milvus-io/milvus/issues/30709 pr: #31292 Signed-off-by: yhmo <yihua.mo@zilliz.com> Co-authored-by: Chen Rao <chenrao317328@163.com>	2024-03-21 11:15:20 +08:00
Buqian Zheng	070dfc77bf	feat: [Sparse Float Vector] segcore basics and index building (#30357 ) This commit adds sparse float vector support to segcore with the following: 1. data type enum declarations 2. Adds corresponding data structures for handling sparse float vectors in various scenarios, including: * FieldData as a bridge between the binlog and the in memory data structures * mmap::Column as the in memory representation of a sparse float vector column of a sealed segment; * ConcurrentVector as the in memory representation of a sparse float vector of a growing segment which supports inserts. 3. Adds logic in payload reader/writer to serialize/deserialize from/to binlog 4. Adds the ability to allow the index node to build sparse float vector index 5. Adds the ability to allow the query node to build growing index for growing segment and temp index for sealed segment without index built This commit also includes some code cleanup, comment improvements, and some unit tests for sparse vector. https://github.com/milvus-io/milvus/issues/29419 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-11 14:45:02 +08:00
zhagnlu	976b6fc0e4	enhance: change opendal as compile configurable (#30384 ) #30373 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-02-20 19:16:52 +08:00
congqixia	18c351efa6	fix: Prevent ChunkCache use absolute path in All-in-one mode (#30666 ) See also #30651 The append operator of `std::filesystem::path` replaces the whole path when the right-hand operand of the "/" operation is an absolute path. In "All-in-one" mode, this causes ChunkCache to remove the original vector data file when building the chunk cache during or after the load procedure. This PR moves the ChunkCache path generation logic into a separate function that checks whether the file path is absolute. If it is, the function removes the root path prefix and returns the concatenated file path. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-19 20:58:51 +08:00
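The `std::filesystem::path` pitfall described in commit 18c351efa6 can be reproduced with a few lines. The sketch below assumes POSIX-style paths; `NaiveJoin` and `JoinUnderRoot` are hypothetical names, and stripping the root with `relative_path()` is one possible reading of "removes the root path prefix", not necessarily what the actual PR does.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Naive concatenation: if `file` is absolute, operator/ discards `root`
// entirely and the result is just `file`.
fs::path NaiveJoin(const fs::path& root, const fs::path& file) {
    return root / file;
}

// Hypothetical safer variant (not the Milvus implementation):
// relative_path() drops the root name and root directory, so
// "/data/vec.bin" becomes "data/vec.bin" and the result stays under `root`.
// For an already relative `file`, relative_path() returns it unchanged.
fs::path JoinUnderRoot(const fs::path& root, const fs::path& file) {
    return root / file.relative_path();
}
```

With these definitions, `NaiveJoin("/cache", "/data/vec.bin")` yields `/data/vec.bin`, which is the original file rather than a cache location, while `JoinUnderRoot("/cache", "/data/vec.bin")` yields `/cache/data/vec.bin`.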
PowderLi	5cf9bb236e	enhance: restful support import jobs (#30343 ) issue: #28521 #29732 include 1. list collection's import jobs 2. create a new import job 3. get the progress of an import job fix: 1. mix the order of dbName & collectionName #29728 2. trace log keep the same as v1 3. support traceID 4. azure precheck, blob name cannot end with / #29703 --------- Signed-off-by: PowderLi <min.li@zilliz.com>	2024-01-31 17:57:04 +08:00
yah01	878c4c9463	enhance: limit the max pool size to 16 (#30371 ) according to our benchmark, concurrency level 16 is enough to fully utilize the object storage network bandwidth Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-31 14:13:06 +08:00


147 Commits