milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-02-02 01:06:41 +08:00

Author	SHA1	Message	Date
wei liu	e2542a1bf5	enhance: Update protobuf-go to protobuf-go v2 (#34394 ) (#35555 ) issue: #34252 pr: #34394 #35072 #35084 Signed-off-by: Wei Liu <wei.liu@zilliz.com> Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>	2024-08-21 18:50:58 +08:00
cai.zhang	09aea3fbf1	enhance: [cherry-pick] Optimize the use of locks and avoid double flush clustering buffer writer (#35490 ) issue: #35436 master pr: #35486 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-08-16 02:24:59 +08:00
congqixia	537a817be9	fix: [2.4] Use k locations only for basic BF test location (#35381 ) Cherry-pick from master pr: #35380 Related to #35379 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-08-09 07:52:22 +08:00
congqixia	d16320705e	enhance: [2.4] Add Segment Level in milvus segment info APIs (#34763 ) (#35023 ) Cherry-pick from master pr: #34763 See also #34746 This PR adds a segment level field to the responses of `GetPersistentSegmentInfo` and `GetQuerySegmentInfo` --------- --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-29 10:11:52 +08:00
Chun Han	ae1636c2be	fix: refine handling type for segment pruner(#34923 ) (#34926 ) related: #34923 pr: https://github.com/milvus-io/milvus/pull/34925 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2024-07-24 12:05:44 +08:00
wei liu	c13c48d99a	fix: Failed to unmarshal field stats's bloom filter (#34922 ) PR #34377 introduced this issue by missing some new changes during the cherry-pick. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-07-23 16:45:47 +08:00
congqixia	6a3a14affb	enhance: [2.4] Add lint rule to forbid gogo protobuf (#34594 ) (#34630 ) Cherry-pick from master pr: #34594 github.com/gogo/protobuf is deprecated and could be error-prone after upgrading protobuf messages to v2. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-12 18:13:36 +08:00
SimFG	737bd7c734	enhance: [2.4] release the record in delete codec and add some log for compaction (#34506 ) /kind improvement - pr: #34454 Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-07-09 15:40:17 +08:00
wei liu	d3e94f9861	enhance: Use Blocked Bloom Filter instead of basic bloom filter impl (#34377 ) issue: #32995 pr: #33405 To speed up the construction and querying of Bloom filters, we chose a blocked Bloom filter instead of a basic Bloom filter implementation. WARN: This PR is compatible with the old BF implementation, but falling back to an older Milvus version may cause Bloom filter deserialization to fail. In single Bloom filter test cases with a capacity of 1,000,000 and a false positive rate (FPR) of 0.001, the blocked Bloom filter is roughly 4 times faster than the basic Bloom filter (about 3.9x in construction and 3.5x in querying by the timings below), at the cost of about a 26% increase in memory usage. Block BF construct time {"time": "54.128131ms"} Block BF size {"size": 3021578} Block BF Test cost {"time": "55.407352ms"} Basic BF construct time {"time": "210.262183ms"} Basic BF size {"size": 2396308} Basic BF Test cost {"time": "192.596229ms"} In multi Bloom filter test cases with a capacity of 100,000, an FPR of 0.001, and 100 Bloom filters, we reuse the primary key locations for all Bloom filters to avoid repeated hash computations. As a result, the blocked Bloom filter is about 6 times faster than the basic Bloom filter in querying. Block BF TestLocation cost {"time": "529.97183ms"} Basic BF TestLocation cost {"time": "3.197430181s"} Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-07-05 17:04:10 +08:00
Chun Han	5831908aa2	enhance: reconstruct scalar part's code for segment-pruner(#30376 ) (#34365 ) related: #30376 pr: https://github.com/milvus-io/milvus/pull/34346 1. support more complex expr 2. add more ut test for unrelated fields Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2024-07-04 16:30:10 +08:00
shaoting-huang	dd4dfbcd8d	enhance: [cherry-pick] Batch pick PRs related to data codec (#34345 ) This PR cherry-picks the following commits related to data codec - Fix data codec writer close. #33818 - Legacy code clean up. #33838 issue: #33813 #33839 pr: #33818 #33838 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-07-04 15:08:11 +08:00
wayblink	c62bf8a0b0	fix: [Cherry-pick]Pick major compaction fixs and optimizations (#34360 ) This PR cherry-picks the following commits: - fix: sync partitiion stats blocking balance task #33742 - fix: Fix meta prefix overlap bug #33830 - fix: Small fixs of major compaction #33929 - fix: Fix memory buffer error & some renaming #33850 - fix: sync part stats task cannot be finished #34027 - Add an option to enable/disable vector field clustering key #34097 - fix: fix error ignore in compactor #34169 - fix:load major compaction partial result #34052 - Use new stream segment reader in clustering compaction #34232 issue: #30633 pr: #33742 #33830 #33929 #33850 #34027 #34097 #34169 #34052 #34232 --------- Signed-off-by: MrPresent-Han <chun.han@zilliz.com> Signed-off-by: wayblink <anyang.wang@zilliz.com> Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: Chun Han <116052805+MrPresent-Han@users.noreply.github.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2024-07-03 09:53:37 +08:00
wayblink	99586066f5	feat: [cherry-pick] Major compaction (#34326 ) This PR cherry-picks the following commits: fix: speed up segment lookup via channel name in datacoord (#33530) needed by the next commit feat: Major compaction (#33620) issue: #30633 pr: #33620 --------- Signed-off-by: yiwangdr <yiwangdr@gmail.com> Signed-off-by: wayblink <anyang.wang@zilliz.com> Co-authored-by: yiwangdr <80064917+yiwangdr@users.noreply.github.com> Co-authored-by: MrPresent-Han <chun.han@zilliz.com>	2024-07-02 18:29:01 +08:00
congqixia	f741bb7526	enhance: [2.4] Avoid merging insert data when buffering insert msgs (#34205 ) Cherry-pick from master pr: #33526 #33817 See also #33561 This PR: - Use zero copy when buffering insert messages - Make `storage.InsertCodec` support serialize multiple insert data chunk into same batch binlog files --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-27 10:14:05 +08:00
congqixia	aae94d7c40	enhance: [2.4] Unify DeleteLog parsing code (#34188 ) Cherry-pick from master pr: #34009 See also #33787 Delete log parsing is scattered across many places, which is not recommended and is hard to maintain. This PR abstracts the common parsing logic into the DeleteLog.Parse method to unify the implementation and make it easier to replace the JSON parsing lib. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-27 10:12:13 +08:00
XuanYang-cn	a33b68678d	enhance: [cherry-pick] Move compactor into sub package (#34098 ) This PR consists of the following commits: - enhance: Tidy compactor and remove dup codes (#32198) - fix: Fix l0 compactor may cause DN from OOM (#33554) - enhance: Add deltaRowCount in l0 compaction (#33997) - enhance: enable stream writer in compactions (#32612) - fix: turn on compression on stream writers (#34067) - fix: adding blob memory size in binlog serde (#33324) See also: #32451, #33547, #33998, #31679 pr: #32198, #33554, #33997, #32612 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com> Signed-off-by: Ted Xu <ted.xu@zilliz.com> Co-authored-by: Ted Xu <ted.xu@zilliz.com>	2024-06-25 11:16:02 +08:00
XuanYang-cn	a446e754b4	fix: [2.4]DeleteData merges wrong data, causing data loss (#33821 ) See also: #33819 pr: #33820 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-06-13 16:07:56 +08:00
congqixia	86f3433053	enhance: [2.4]Use fastjson lib for unmarshal delete log (#33787 ) (#33802 ) Cherry-pick from master pr: #33878 ``` goos: linux goarch: amd64 GOMAXPROC=1 cpu: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz BenchmarkJsonSerdeStd 343872 3568 ns/op 1335 B/op 25 allocs/op BenchmarkJsonSerdeFastjson 5124177 234.9 ns/op 16 B/op 1 allocs/op ``` --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-13 10:27:57 +08:00
wei liu	54feef30e7	enhance: Use BatchPkExist to reduce bloom filter func call cost (#33752 ) issue: #33610 pr: #33611 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-06-12 17:45:58 +08:00
wei liu	f2917f5bdf	enhance: Remove StringPrimaryKey to reduce unnecessary copy and function call cost (#33486 ) (#33649 ) issue: #33497 pr: #33486 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-06-06 10:40:01 +08:00
Cai Yudong	68e2d532d8	enhance: Cherry-pick following SparseFloatVector bulk insert PRs to Milvus2.4 (#33391 ) Cherry pick from master pr: #33064 #33101 #33187 #33259 #33224 #33064 Support readable JSON file import for Float16/BFloat16/SparseFloat #33101 Store SparseFloatVector into parquet as JSON string #33187 Fix SparseFloatVector data parse error for parquet #33259 Fix SparseFloatVector data parse error for json #33224 Optimize bulk insert unittest Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-05-30 10:31:45 +08:00
congqixia	e2626c7b9e	fix: [2.4]Allocate new slice for each batch in streaming reader (#33360 ) Cherry-pick from master pr: #33359 Related to #33268 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-24 18:59:42 +08:00
cai.zhang	6ea7633bd5	enhance: Add memory size for binlog (#33025 ) issue: #33005 1. add `MemorySize` field for insert binlog. 2. `LogSize` means the file size in the storage object. 3. `MemorySize` means the size of the data in the memory. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Signed-off-by: cai.zhang <cai.zhang@zilliz.com>	2024-05-15 12:59:34 +08:00
Cai Yudong	4fc7915c70	enhance: unify data generation test APIs (#32955 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-05-14 14:33:33 +08:00
congqixia	0e5765b116	enhance: Utilize `TestLocations` ability to accelerate write & compaction (#32948 ) See also #32642 This PR reuses hash locations for bloom filter prediction utilizing `storage.Location`, like enhancement #32642. Also adds a utility struct in storage: `LocationCache` to store locations for variable K (number of hash functions) --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-13 10:15:32 +08:00
wei liu	5038036ece	enhance: Reuse hash locations when accessing bloom filter (#32642 ) issue: #32530 When matching a pk against segment bloom filters, the hash locations can be reused. This PR maintains the max hash Func and computes the hash locations once for all segments; reusing the hash locations speeds up BF access --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-07 06:13:47 -07:00
Cai Yudong	bcdbd1966e	feat: Support sparse float vector bulk insert for binlog/json/parquet (#32649 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-05-07 18:43:30 +08:00
aoiasd	31dca3249e	enhance: add type info for payload writer error message and add log when querynode find new collection (#32522 ) relate: https://github.com/milvus-io/milvus/issues/32668 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-05-07 14:45:29 +08:00
Aldrin	cb8dbc3c83	fix: Removed minio bucket after use in test (#32624 ) issue: https://github.com/milvus-io/milvus/issues/32616 - Forcefully deleted the non empty minio bucket with dummy data. Signed-off-by: Aldrin <imagesai32@gmail.com>	2024-04-28 13:51:26 +08:00
chyezh	2586c2f1b3	enhance: use WalkWithPrefix api for oss, enable pipelined file gc (#31740 ) issue: #19095,#29655,#31718 - Change `ListWithPrefix` to `WalkWithPrefix` of OSS into a pipeline mode. - File garbage collection is performed in another goroutine. - Segment Index Recycle cleans index files too. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-25 20:41:27 +08:00
Buqian Zheng	8a1017a152	enhance: add helpers to parse sparse float vector in JSON (#32543 ) issue: #29419 added helper functions to parse JSON representation of sparse float vectors, will be used by both the restful server and the import utils. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-04-25 14:47:24 +08:00
Cai Yudong	5fc439c600	feat: Bulk insert support fp16/bf16 (#32157 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-04-22 10:05:22 +08:00
Ted Xu	dc5ea6f17c	feat: adding binlog streaming writer (#31537 ) See #31679 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-04-11 10:33:20 +08:00
aoiasd	5b693c466d	fix: delegator filter out all partition's delete msg when loading segment (#31585 ) May cause deleted data to remain queryable for a period of time. relate: https://github.com/milvus-io/milvus/issues/31484 https://github.com/milvus-io/milvus/issues/31548 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-04-09 15:21:24 +08:00
Cai Yudong	00438f408f	enhance: Unify data type check APIs for go (#31887 ) Issue: #22837 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2024-04-07 14:27:22 +08:00
cqy123456	976928ecd1	fix: fix fp16/bf16 some code missing and add more fp16/bf16 test (#31612 ) issue: #31534 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-03-28 14:11:10 +08:00
SimFG	b1a1cca10b	feat: add more operation detail info for better allocation (#30438 ) issue: #30436 --------- Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-03-28 06:33:11 +08:00
groot	5be395354c	fix: minio ssl compatible issue (#31607 ) issue: https://github.com/milvus-io/milvus/issues/30709 Signed-off-by: yhmo <yihua.mo@zilliz.com>	2024-03-27 14:41:20 +08:00
yihao.dai	31cf849f68	enhance: Support retrieving file size from importutilv2.Reader (#31533 ) To reduce the overhead caused by listing the S3 objects, add an interface to importutil.Reader to retrieve file sizes. issue: https://github.com/milvus-io/milvus/issues/31532, https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-25 20:29:07 +08:00
Chun Han	c3264ca3e3	feat: support segment pruner (#31003 ) related: #30376	2024-03-22 13:57:06 +08:00
groot	c81909bfab	enhance: Support MinIO TLS connection (#31311 ) issue: https://github.com/milvus-io/milvus/issues/30709 pr: #31292 Signed-off-by: yhmo <yihua.mo@zilliz.com> Co-authored-by: Chen Rao <chenrao317328@163.com>	2024-03-21 11:15:20 +08:00
Buqian Zheng	d7dbc3c9d8	fix: [sparse float vector] support the new streaming deserialize reader (#31325 ) issue: https://github.com/milvus-io/milvus/issues/31324 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-17 13:59:04 +08:00
Buqian Zheng	3c80083f51	feat: [Sparse Float Vector] add sparse vector support to milvus components (#30630 ) add sparse float vector support to different milvus components, including proxy, data node to receive and write sparse float vectors to binlog, query node to handle search requests, index node to build index for sparse float column, etc. https://github.com/milvus-io/milvus/issues/29419 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-13 14:32:54 -07:00
Ted Xu	987d9023a5	enhance: Enable binlog deserialize reader in datanode compaction (#31036 ) See #30863 Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-03-08 18:25:02 +08:00
wayblink	875036b81b	feat: Define FieldValue, FieldStats and PartitionStats (#30286 ) Define FieldValue, FieldStats, PartitionStats. FieldValue is largely copied from PrimaryKey. FieldStats is largely copied from PrimaryKeyStats. PartitionStats is map[segmentid][]FieldStats. Each partition can have a PartitionStats file. /kind feature related: #30287 related: #30633 --------- Signed-off-by: wayblink <anyang.wang@zilliz.com>	2024-03-06 20:42:37 -08:00
Ted Xu	71adafa933	enhance: adding a streaming deserialize reader for binlogs (#30860 ) See #30863 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-03-04 19:31:09 +08:00
yihao.dai	a434d33e75	feat: Add import scheduler and manager (#29367 ) This PR introduces novel managerial roles for importv2: 1. ImportMeta: To manage all the import tasks; 2. ImportScheduler: To process tasks and modify their states; 3. ImportChecker: To ascertain the completion of all tasks and instigate relevant operations. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-01 18:31:02 +08:00
SimFG	229fc4f755	enhance: retry the read when S3 returns an unexpected EOF error (#30861 ) /kind improvement issue: #30877 Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-02-28 16:28:53 +08:00
Ted Xu	12acaf3e4f	enhance: Adding a generic stream payload reader (#30682 ) See: #30404 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-02-21 17:10:52 +08:00
wayblink	f976385421	enhance: replace binlogIO with io.BinlogIO in datanode (#29725 ) #30633 Signed-off-by: wayblink <anyang.wang@zilliz.com>	2024-02-20 14:38:51 +08:00
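
Several of the commits listed here (#32642, #32948, #34377) describe hashing a primary key once and reusing the resulting locations across many segment Bloom filters. The idea can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the Milvus implementation: the `filter` type, `locations`, `addLocations` and `testLocations` are hypothetical names, the filter is a plain (non-blocked) bit array, and FNV-based double hashing is an arbitrary choice made only to keep the example self-contained.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// locations derives k raw hash positions for key using double hashing
// (h1 + i*h2). The positions are not reduced modulo any filter size here,
// so one slice can be checked against filters of different sizes.
// Hypothetical helper, not a Milvus API.
func locations(key []byte, k int) []uint64 {
	h := fnv.New64a()
	h.Write(key)
	h1 := h.Sum64()
	h.Write([]byte{0xff}) // extend the input to obtain a second hash value
	h2 := h.Sum64() | 1   // force odd so successive positions differ
	locs := make([]uint64, k)
	for i := range locs {
		locs[i] = h1 + uint64(i)*h2
	}
	return locs
}

// filter is a basic Bloom filter over m bits (illustrative only).
type filter struct {
	bits []uint64
	m    uint64
}

func newFilter(m uint64) *filter {
	return &filter{bits: make([]uint64, (m+63)/64), m: m}
}

// addLocations sets the bit for every precomputed location.
func (f *filter) addLocations(locs []uint64) {
	for _, l := range locs {
		i := l % f.m
		f.bits[i/64] |= uint64(1) << (i % 64)
	}
}

// testLocations reports whether every precomputed location is set.
// No hashing happens here; that is the point of reusing locations.
func (f *filter) testLocations(locs []uint64) bool {
	for _, l := range locs {
		i := l % f.m
		if f.bits[i/64]&(uint64(1)<<(i%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	const k = 4
	// One filter per segment, with different sizes.
	segments := []*filter{newFilter(1 << 10), newFilter(1 << 12), newFilter(1 << 14)}
	segments[0].addLocations(locations([]byte("pk-1"), k))
	segments[1].addLocations(locations([]byte("pk-42"), k))
	segments[2].addLocations(locations([]byte("pk-42"), k))

	// Hash the key once, then test the same locations against every segment.
	locs := locations([]byte("pk-42"), k)
	for i, seg := range segments {
		fmt.Printf("segment %d may contain pk-42: %v\n", i, seg.testLocations(locs))
	}
}
```

With N segments, the key is hashed once rather than N times, and each per-segment check reduces to k bit lookups. When filters use different numbers of hash functions, the locations would need to be computed for the largest k and truncated per filter, which appears to be what the "max hash Func" wording in #32642 refers to.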


451 Commits