milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-01 08:28:10 +08:00

Author	SHA1	Message	Date
foxspy	3dbad0306a	fix: Add bypass thread pool mode to avoid growing indexes blocking insert/load (#41012 ) issue: #40825 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-05-20 14:30:24 +08:00
congqixia	f2a8330f87	fix: [StorageV2] Use correct group building index (#41925 ) Related to #39173 #41534 This pr fixes an issue that building mem index may report datatype not match error when collection split fields into multiple groups --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-20 13:26:23 +08:00
congqixia	a22088a380	enhance: [StorageV2] Make packed reader use correct path (#41919 ) Related to #39173 This PR - Use updated path with bucketName for packedReader - Update milvus-storage commit to report reader/writer initialization failure, see also milvus-io/milvus-storage#192 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-20 10:36:23 +08:00
congqixia	3bbc0fa560	enhance: [StorageV2] update storage to pass endpoint as-is (#41889 ) Related to milvus-io/milvus-storage#190 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-16 18:06:21 +08:00
Bingyi Sun	b006d738b2	fix: Fix skip much more rows when moving cursor (#41862 ) issue: https://github.com/milvus-io/milvus/issues/41790 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-05-16 16:46:22 +08:00
Buqian Zheng	b0260d8676	feat: manual evict cache after built interim index (#41836 ) issue: https://github.com/milvus-io/milvus/issues/41435 this PR also makes HasRawData of ChunkedSegmentSealedImpl to return based on metadata, without needing to load the cache just to answer this simple question. --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-05-16 16:34:23 +08:00
congqixia	a6d09ff4cd	enhance: [StorageV2] fix issues integrating basic RW operations (#41834 ) Related to #39173 This PR: - Upgrade milvus-storage commit to fix filesystem finalized issue - Add bucket-name as prefix for all fs style access io - Initial arrow fs on querynodes startup - Fix timestamp access when loading sealed segment --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-15 09:52:23 +08:00
Buqian Zheng	cae0091071	feat: make SkipIndex lazyload (#41826 ) issue: https://github.com/milvus-io/milvus/issues/41435 this PR also: 1. fixed the skip index for VARCHAR. before this PR, skip index of VARCHAR uses the minmax of the entire column as the minmax of chunk 0, and provides no minmax for other chunks. 2. refactored some skip index loading related code 3. partly fixed a bug in test_expr.cpp --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-05-15 01:30:23 +08:00
cai.zhang	4ead8caaba	fix: prevent crash when contains_all/any is used with empty array (#41739 ) issue: #41348 related and optimized by #41347 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: Sangho Park <hoyaspark@gmail.com>	2025-05-14 14:32:22 +08:00
foxspy	358bc150df	enhance: add force rebuild index configuration (#41473 ) issue: #41431 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-05-14 10:52:21 +08:00
zhagnlu	f094d026f8	fix: add params to ignore config type exception (#41776 ) #41707 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-05-13 13:48:56 +08:00
Buqian Zheng	ff5c2770e5	feat: cachinglayer: various improvements (#41546 ) issue: https://github.com/milvus-io/milvus/issues/41435 this PR is based on https://github.com/milvus-io/milvus/pull/41436. Improvements include: - Lazy Load support for Storage v1 - Use Low/High watermark to control eviction - Caching Layer related config changes - Removed ChunkCache related configs and code in golang - Add `PinAllCells` helper method to CacheSlot class - Modified ValueAt, RawAt, PrimitiveRawAt to Bulk version, to reduce caching layer overhead - Removed some unclear templated bulk_subscript methods - CachedSearchIterator to store PinWrapper when searching on ChunkedColumn, and removed unused constructor. --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-05-10 09:19:16 +08:00
congqixia	bcf94a0754	fix: Remove noexcept from `CacheIndexToDiskInternal` (#41725 ) Related to #41219 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-09 14:16:53 +08:00
zhagnlu	f674e232b9	fix: GetValueFromConfig return nullopt instead of exception for null value (#41709 ) #41707 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-05-09 11:18:53 +08:00
Xianhui Lin	26cbc74478	fix: support infix and suffix match types in JsonStats (#41720 ) fix: support infix and suffix match types in JsonStats issue:https://github.com/milvus-io/milvus/issues/41386 Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-05-09 10:42:53 +08:00
zhagnlu	e3c81ba1cc	enhance: use scan mode for like although inverted index exists (#41325 ) #41065 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-05-09 10:36:54 +08:00
zhagnlu	39e7ad33d7	enhance: add optimize for like expr (#41066 ) #41065 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-05-08 14:28:52 +08:00
foxspy	e2ddbe4962	feat: add cachinglayer to index (#41653 ) issue: #41435 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-05-08 10:12:54 +08:00
congqixia	b1f3fe1f07	fix: Use sum of num_rows instead of last one (#41685 ) Related to #41656 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-07 19:40:53 +08:00
Bingyi Sun	0dee3ccfd7	enhance: Make user specified doc id selectable for tantivy index writer (#41528 ) issue: https://github.com/milvus-io/milvus/issues/41527 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-05-07 10:48:53 +08:00
Bingyi Sun	4c08090687	feat: Add json index support for json contains expr (#41478 ) issue: #35528 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-05-06 11:44:52 +08:00
Buqian Zheng	73bbf4c674	fix: error when lack_binlog_rows = 0 (#41644 ) issue: https://github.com/milvus-io/milvus/issues/41643 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-05-04 00:24:56 +08:00
sthuang	e9442f575d	feat: storage v2 seal segment load (#41567 ) storage v2 chunked seal segment loading is based on caching layer. A cell unit in storage v2 is a parquet row group in remote object storage, containing all fields. Therefore, each field needs a proxy to do related one field operations. related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-30 14:22:58 +08:00
sthuang	6c377b6e86	feat: Storage v2 index and stats raw data (#41534 ) related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-30 08:48:54 +08:00
zhagnlu	cd60b965c8	enhance: add expr filter ratio monitor params (#41402 ) #41401 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-29 17:02:54 +08:00
foxspy	1d99f8bd67	enhance: add force rebuild index configuration (#41473 ) issue: #41431 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-04-29 16:20:56 +08:00
congqixia	f3f8227cd0	enhance: [AddField] Trigger check schema in retrieve as well (#41598 ) Related to #39718 Fixes milvus-io/pymilvus#2771 This PR: - Make AsyncRetrieve task triggers "schema check" logic as well - Rename `AddField` related methods to align with code standard Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-29 14:10:49 +08:00
Spade A	910f68c986	fix: update tantivy to fix tantivy doc out of order when merge (#41596 ) issue: #41597 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-29 13:46:49 +08:00
Spade A	f35e8f7420	fix: fix arm64 compile issue (#41593 ) issue: https://github.com/milvus-io/milvus/issues/41059, https://github.com/milvus-io/milvus/issues/41510 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-29 13:19:25 +08:00
Buqian Zheng	3de904c7ea	feat: add cachinglayer to sealed segment (#41436 ) issue: https://github.com/milvus-io/milvus/issues/41435 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-04-28 10:52:40 +08:00
cai.zhang	640f526301	fix: Update current scalar index version to compatible tantivy different versions (#41141 ) issue: #40823 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-04-27 20:44:39 +08:00
Chun Han	12cde913b5	fix: fail to get string views due to chunk bound empty loop(#41300 ) (#41452 ) related: #41300 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-04-27 10:40:38 +08:00
congqixia	b5443ddbd0	enhance: [AddField] Reopen loaded segments after AddField (#41529 ) Related to #39718 This PR: - Add reopen logic for growing & sealed segments - Lazy reopen when schema version increases - Add FinishLoad api for loading progress --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-26 08:48:39 +08:00
Buqian Zheng	1c8b9c127d	fix: Make sure segment in ut is destroyed before static MmapManager singleton (#41508 ) issue: #41507 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-04-25 18:50:38 +08:00
Xianhui Lin	1a6838b496	fix: json stats add map null check before insert into tantivy (#41505 ) json stats add map null check before insert into tantivy. Json stats index may fail if there is no data. issue: https://github.com/milvus-io/milvus/issues/41494 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-04-24 21:06:37 +08:00
congqixia	dbe54c2df8	enhance: [AddField] Resolve conflicts & make WAL ts collection updatets (#41476 ) Related to #39718 This PR: - Use WAL broadcast timestamp as Collection update timestamp - Remove request_fields size assertion - Remove proxy schema cache loaded field check & skip related cases - other minor issues --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-24 12:06:39 +08:00
Spade A	f3d878ab3f	fix: update tantivy for fixing phrase match (#41450 ) issue: #41454 https://github.com/zilliztech/tantivy/pull/8 fixes the problem, this PR update the tantivy. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-24 10:52:37 +08:00
aoiasd	f52c2909c4	feat: support multi analyzer for bm25 function (#41351 ) relate: https://github.com/milvus-io/milvus/issues/41213 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-23 18:22:38 +08:00
Xianhui Lin	3d4889586d	fix: JsonStats filter by conjunctExpr and improve the task slot calculation logic (#41459 ) Optimized JSON filter execution by introducing ProcessJsonStatsChunkPos() for unified position calculation and GetNextBatchSize() for better batch processing. Improved JSON key generation by replacing manual path joining with milvus::Json::pointer() and adjusted slot size calculation for JSON key index jobs. Updated the task slot calculation logic in calculateStatsTaskSlot() to handle the increased resource needs of JSON key index jobs. issue: https://github.com/milvus-io/milvus/issues/41378 https://github.com/milvus-io/milvus/issues/41218 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-04-23 16:30:37 +08:00
aoiasd	a16bd6263b	feat: support more languages for built-in stop words and add remove punct, regex filter (#41412 ) relate: https://github.com/milvus-io/milvus/issues/41213 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-23 11:44:37 +08:00
aoiasd	11f2fae42e	feat: support extend default dict for jieba tokenizer (#41360 ) relate: https://github.com/milvus-io/milvus/issues/41213 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-22 20:34:37 +08:00
congqixia	b36c88f3c8	enhance: [AddField] Broadcast schema change via WAL (#41373 ) Related to #39718 Add Broadcast logic for collection schema change and notifies: - Streamnode - Delegator - Streamnode - Flush component - QueryNodes via grpc --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-22 16:28:37 +08:00
aoiasd	110c5aaaf4	feat: support icu and language identifier tokenizer (#41214 ) relate: https://github.com/milvus-io/milvus/issues/41213 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-22 15:56:37 +08:00
cqy123456	5219d9a723	fix: Inserting null and non-null array at the same time will cause milvus crash when growing mmap open (#41051 ) issue: https://github.com/milvus-io/milvus/issues/40981 2.5 pr: https://github.com/milvus-io/milvus/pull/41052 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-04-22 12:26:37 +08:00
aoiasd	f166843c5e	enhance: support use lindera tag filter (#40416 ) relate: https://github.com/milvus-io/milvus/issues/39659 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-21 15:56:36 +08:00
sparknack	8ccb875e41	enhance: add simde package (#40943 ) issue: #40942 Add simde package, which can make porting SIMD code to other architectures much easier. Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-04-21 12:18:40 +08:00
Spade A	5b1430f27e	enhance: tantivy collector set bitset directly (#39748 ) fix: #39755 The following shows a simple benchmark where insert 1M docs where all rows are "hello", the latency is segcore level, CPU is 9900K: master: 2.62ms this PR: 2.11ms bench mark code:
```
TEST(TextMatch, TestPerf) {
    auto schema = GenTestSchema({}, true);
    auto seg = CreateSealedSegment(schema, empty_index_meta);
    int64_t N = 1000000;
    uint64_t seed = 19190504;
    auto raw_data = DataGen(schema, N, seed);
    auto str_col = raw_data.raw_->mutable_fields_data()
                       ->at(1)
                       .mutable_scalars()
                       ->mutable_string_data()
                       ->mutable_data();
    for (int64_t i = 0; i < N - 1; i++) {
        str_col->at(i) = "hello";
    }
    SealedLoadFieldData(raw_data, *seg);
    seg->CreateTextIndex(FieldId(101));

    auto now = std::chrono::high_resolution_clock::now();
    auto expr = GetMatchExpr(schema, "hello", OpType::TextMatch);
    auto final = ExecuteQueryExpr(expr, seg.get(), N, MAX_TIMESTAMP);
    auto end = std::chrono::high_resolution_clock::now();
    auto duration =
        std::chrono::duration_cast<std::chrono::microseconds>(end - now);
    std::cout << "TextMatch query time: " << duration.count() << "ms"
              << std::endl;
}
```
--------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-20 23:02:41 +08:00
Chun Han	016920b023	fix: solve incompatible problem for none-encoding index(#40838 ) (#41369 ) related: #40838 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-04-20 22:56:44 +08:00
Ted Xu	d50781c8cc	enhance: support nullable group by keys (#41313 ) See #36264 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-04-18 10:08:34 +08:00
Spade A	62293cb582	fix: revert batch add (#41374 ) issue: #41375 todo: to fix the problems fixed in the issue. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-17 22:32:38 +08:00

1945 Commits