milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-05 18:31:59 +08:00

Author	SHA1	Message	Date
Buqian Zheng	389104d200	enhance: rename PanicInfo to ThrowInfo (#43384 ) issue: #41435 this is to prevent AI from thinking of our exception throwing as a dangerous PANIC operation that terminates the program. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-19 20:22:52 +08:00
Buqian Zheng	f7b262a702	feat: make storagev1 to support eviction (#43219 ) issue: https://github.com/milvus-io/milvus/issues/41435 turns out we have per file binlog size in golang code, by passing it into segcore we can support eviction in storage v1 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-19 02:02:52 +08:00
congqixia	ae48f0e484	fix: [StorageV2] Handle missing column creating index (#43292 ) Related to #43250 Use FieldIDList to check missing field. If column is missing, return empty resultset Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-14 17:06:50 +08:00
sthuang	276c52490d	fix: [StorageV2] missing arrow fs when building index (#43162 ) fix: https://github.com/milvus-io/milvus/issues/43150, https://github.com/milvus-io/milvus/issues/43149 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-07 15:26:46 +08:00
congqixia	1d9a9a993d	fix: [StorageV2] Use correct template typename for `cache_raw_data_to_disk_common` (#43104 ) Related to #43099 Previously `cache_raw_data_to_disk_common` used `milvus::DataType` template typename, which shall be `knowhere::bf16` or other actual datatype. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-03 18:50:46 +08:00
Zhen Ye	bbbc7d4517	enhance: collect all cgo calling into metric and log slow cgo call (#43035 ) issue: #42833 - also fix the error metric for async cgo. - also make sure the roles can be seen when node startup, #43041. Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-03 15:00:44 +08:00
sparknack	7e855f1046	enhance: add disk file writer with Direct IO support (#42665 ) issue: #43040 This patch introduces a disk file writer that supports Direct IO. Currently, it is used exclusively during the QueryNode load process. Its parameters are: 1. `common.diskWriteMode` This parameter controls the write mode of the local disk, which is used to write temporary data downloaded from remote storage. Currently, only QueryNode uses the 'common.diskWrite*' parameters. Support for other components will be added in the future. The options are 'direct' and 'buffered'. The default value is 'buffered'. 2. `common.diskWriteBufferSizeKb` Disk write buffer size in KB, only used when the disk write mode is 'direct'. The default is 64KB. The current valid range is [4, 65536]. If the value is not aligned to 4KB, it will be rounded up to the nearest multiple of 4KB. 3. `common.diskWriteNumThreads` This parameter controls the number of writer threads used for disk write operations. The valid range is [0, hardware_concurrency]. It is designed to limit the maximum concurrency of disk write operations to reduce the impact on disk read performance. For example, if you want to limit the maximum concurrency of disk write operations to 1, you can set this parameter to 1. The default value is 0, which means the caller will perform write operations directly without using an additional writer thread pool. In this case, the maximum concurrency of disk write operations is determined by the caller's thread pool size. Both parameters can be updated during runtime. --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-07-02 22:18:44 +08:00
Spade A	26ec841feb	feat: optimize `Like` query with n-gram (#41803 ) Ref #42053 This is the first PR for optimizing `LIKE` with an n-gram inverted index. Currently, only the VARCHAR data type and only InnerMatch LIKE (%xxx%) queries are supported. How to use it: ``` milvus_client = MilvusClient("http://localhost:19530") schema = milvus_client.create_schema() ... schema.add_field("content_ngram", DataType.VARCHAR, max_length=10000) ... index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="content_ngram", index_type="NGRAM", index_name="ngram_index", min_gram=2, max_gram=3) milvus_client.create_collection(COLLECTION_NAME, ...) ``` min_gram and max_gram control how documents are tokenized. For example, with min_gram=2 and max_gram=4, each document is tokenized into 2-grams, 3-grams, and 4-grams. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-07-01 10:08:44 +08:00
sthuang	238bd30f42	fix: [StorageV2] end to end minor issues for sync, stats, and load (#42948 ) Fix issues in end-to-end tests: 1. Split column groups based on schema, rather than estimating by average chunk row size. Ensure column group consistency within a segment, to avoid errors caused by loading multiple column group chunks simultaneously. 2. Use sorted segmentId when generating the stats binlog path, to ensure consistent and correct file path resolution. 3. Determine field IDs as follows: For multi-column column groups, retrieve the field ID list from metadata. For single-column column groups, use the column group ID directly as the field ID. related: #39173 fix: #42862 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-27 14:44:42 +08:00
XuanYang-cn	0dfe5308e1	enhance: Tidy Download and decode in segcore storage (#42902 ) 1. Unify calling from GetObjectData 2. Move SetData inside Deserialize See also: #40013 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-06-25 11:10:43 +08:00
sthuang	0d57acb13a	enhance: [StorageV2] field id as meta path for wide column when load (#42863 ) related: #42862 #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-25 11:08:48 +08:00
Xianhui Lin	b902960057	fix: revert remote jsonstats path (#42882 ) fix: revert remote jsonstats path relate-pr:https://github.com/milvus-io/milvus/pull/42676 issue:https://github.com/milvus-io/milvus/issues/42872 Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-06-21 13:24:39 +08:00
sthuang	4a0a2441f2	enhance: [StorageV2] field id as meta path for wide column (#42787 ) related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-19 15:00:38 +08:00
Spade A	e2c85eec81	fix: load stats index based on mmap config (#42788 ) ref https://github.com/milvus-io/milvus/issues/42626 This PR makes text match index and json key stats index be loaded based on mmap config. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-06-19 10:10:39 +08:00
Spade A	80f1d707f7	fix: tidy up path for scalar index (#42676 ) Ref #42626 This patch tidies up the paths for scalar indexes, including the path for loading an index from remote storage and the temporary path for building an index. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-06-18 00:42:38 +08:00
Chun Han	001619aef9	feat: supporting load priority for loading (#42413 ) related: #40781 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-06-17 15:22:38 +08:00
congqixia	f01ff57f3f	fix: [StorageV2] Use correct offset filling null bitmap (#42774 ) Related to #39173 `null_bitmap_data()` returns a raw pointer to the null bitmap of an Array. After slicing, this bitmap is not rewritten because of the zero-copy implementation, so the current start position may be non-zero when `FillFieldData` generates the column's `valid_data` array. This PR adds an `offset` param to the `FillFieldData` method and forces every invocation to pass the correct offset for the `null_bitmap_data` pointer. It also updates the milvus-storage commit to fix the reader failing to return data when the buffer size is smaller than the row group size. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-17 10:08:38 +08:00
Spade A	9873e0ee78	fix: fix text match index / json key stats index leak when segment released (#42655 ) Ref https://github.com/milvus-io/milvus/issues/42626 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-06-13 04:28:37 +08:00
congqixia	c9bc70f272	fix: [AddField] Use shared_ptr of schema in plan fixing dangling ref (#42693 ) Related to #42640 The search/query plan held a reference to the schema, which could be destructed after a schema change. This PR makes the plan hold a shared_ptr to it, fixing the dangling reference under concurrent reads and schema changes. This PR also removes the field binlog check when loading an index, since an old segment with an old schema may lack binlogs. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-12 20:46:36 +08:00
Spade A	911a8df17c	feat: impl StructArray -- data storage support in segcore (#42406 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR mainly enables segcore to support array of vector (read and write, but not indexing). Now only float vector as the element type is supported. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-06-12 14:38:35 +08:00
Buqian Zheng	8511ede5f8	feat: add back queryNode.cache.warmup for compatibility (#42621 ) issue: https://github.com/milvus-io/milvus/issues/41435 also make ChunkTranslator to load in parallel --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-06-12 10:56:40 +08:00
congqixia	499e9a0a73	fix: [AddField] Use corresponding datatype for int8/int16 def val (#42633 ) Related to #42629 This PR handles converting the default value to an int8/int16 scalar when the default value is defined as int32 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-11 11:54:34 +08:00
Bingyi Sun	fbf5cb4e62	feat: Add json flat index (#39917 ) issue: https://github.com/milvus-io/milvus/issues/35528 This PR introduces a JSON flat index that allows indexing JSON fields and dynamic fields in the same way as other field types. In a previous PR (#36750), we implemented a JSON index that requires specifying a JSON path and casting a type. The only distinction lies in the json_cast_type parameter. When json_cast_type is set to JSON type, Milvus automatically creates a JSON flat index. For details on how Tantivy interprets JSON data, refer to the [tantivy documentation](https://github.com/quickwit-oss/tantivy/blob/main/doc/src/json.md#pitfalls-limitation-and-corner-cases). Limitations Array handling: Arrays do not function as nested objects. See the [limitations section](https://github.com/quickwit-oss/tantivy/blob/main/doc/src/json.md#arrays-do-not-work-like-nested-object) for more details. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-06-10 19:14:35 +08:00
sthuang	89c3afb12e	fix: [StorageV2] index/stats task level storage v2 fs (#42191 ) related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-10 11:06:35 +08:00
congqixia	f1188b6781	enhance: [storagev2] Support partition key isolation index (#42574 ) Related to #39173 This patch makes storage v2 support the partition key isolation index feature Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-09 14:02:33 +08:00
congqixia	b50c4a7973	enhance: Make segcore thread name set correctly (#42497 ) Previous PR #42017 did not work; this PR fixes the following points: - Initialize the `name_map`, which was not touched at all before - Trim the thread name to at most 15 characters to fit the syscall limit --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-06 16:26:32 +08:00
sthuang	490827974d	enhance: avoid shutdown sdk api in minio cm destructor (#42459 ) related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-04 09:58:39 +08:00
cqy123456	727f4ec24b	enhance:mmapchunkmanager allocates MmapChunkDescriptor itself (#42150 ) issue: https://github.com/milvus-io/milvus/issues/42157 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-06-03 14:42:31 +08:00
congqixia	cc42d49769	fix: [StorageV2][AddField] Handle lack binlog rows in storage v2 (#42186 ) Related to #39173 #39718 In storage v2, `lack_bin_rows` cannot be used because the field id is not the column group id, so the two will never match. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-31 02:44:30 +08:00
Chun Han	ed0df38605	enhance: resize high priority wqthreadpool dynamically(#40838 ) (#41549 ) (#41929 ) related: #40838 pr: https://github.com/milvus-io/milvus/pull/41549 Signed-off-by: MrPresent-Han <chun.han@gmail.com>	2025-05-30 10:18:36 +08:00
cqy123456	5fe7015f63	enhance: InterimIndex support more index type and data type (#41021 ) issue: https://github.com/milvus-io/milvus/issues/27678 cherry pick from : https://github.com/milvus-io/milvus/pull/39180, https://github.com/milvus-io/milvus/pull/40429 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-05-28 08:40:28 +08:00
sthuang	b9b554676c	fix: storage v2 get field data with correct column group files (#42107 ) related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-05-27 15:26:28 +08:00
congqixia	9fb0257bfa	enhance: Set thread name for segcore thread pool (#42017 ) Thread name could be helpful when debugging thread explosion issues Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-22 19:06:27 +08:00
Buqian Zheng	8a85bc4213	fix: fixes async warmup deadlock (#41995 ) issue: https://github.com/milvus-io/milvus/issues/41993 also updated cachinglayer metrics Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-05-22 09:54:24 +08:00
congqixia	f2a8330f87	fix: [StorageV2] Use correct group building index (#41925 ) Related to #39173 #41534 This PR fixes an issue where building a mem index may report a datatype mismatch error when the collection splits fields into multiple groups --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-20 13:26:23 +08:00
Buqian Zheng	ff5c2770e5	feat: cachinglayer: various improvements (#41546 ) issue: https://github.com/milvus-io/milvus/issues/41435 this PR is based on https://github.com/milvus-io/milvus/pull/41436. Improvements include: - Lazy Load support for Storage v1 - Use Low/High watermark to control eviction - Caching Layer related config changes - Removed ChunkCache related configs and code in golang - Add `PinAllCells` helper method to CacheSlot class - Modified ValueAt, RawAt, PrimitiveRawAt to Bulk version, to reduce caching layer overhead - Removed some unclear templated bulk_subscript methods - CachedSearchIterator to store PinWrapper when searching on ChunkedColumn, and removed unused constructor. --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-05-10 09:19:16 +08:00
congqixia	bcf94a0754	fix: Remove noexcept from `CacheIndexToDiskInternal` (#41725 ) Related to #41219 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-09 14:16:53 +08:00
congqixia	b1f3fe1f07	fix: Use sum of num_rows instead of last one (#41685 ) Related to #41656 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-07 19:40:53 +08:00
sthuang	6c377b6e86	feat: Storage v2 index and stats raw data (#41534 ) related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-30 08:48:54 +08:00
Buqian Zheng	3de904c7ea	feat: add cachinglayer to sealed segment (#41436 ) issue: https://github.com/milvus-io/milvus/issues/41435 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-04-28 10:52:40 +08:00
Chun Han	016920b023	fix: solve incompatible problem for none-encoding index(#40838 ) (#41369 ) related: #40838 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-04-20 22:56:44 +08:00
sthuang	1f1c836fb9	feat: Storage v2 growing segment load (#41001 ) support parallel loading sealed and growing segments with storage v2 format by async reading row groups. related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-16 17:14:33 +08:00
Chun Han	59b14d38f5	enhance: Optimize index format for improved load performance(#40838 ) (#40839 ) related: https://github.com/milvus-io/milvus/issues/40838 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-04-15 03:10:30 +08:00
Bingyi Sun	bf617115ca	enhance: Remove single chunk segment related codes (#39249 ) https://github.com/milvus-io/milvus/issues/39112 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-04-11 18:56:29 +08:00
Xianhui Lin	3bc24c264f	enhance: Add json key inverted index in stats for optimization (#38039 ) Add json key inverted index in stats for optimization https://github.com/milvus-io/milvus/issues/36995 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-10 15:20:28 +08:00
zhagnlu	10a63b3f2e	enhance: add formatter for several types to remove compile warning (#41094 ) #41091 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-07 11:54:24 +08:00
Spade A	216be1494b	fix: add log for object storage operation fail (#40666 ) fix: #40665 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-02 01:26:21 +08:00
Bingyi Sun	8fbacf3583	fix: Null expr does not work for json field (#40456 ) issue: https://github.com/milvus-io/milvus/issues/40455 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-14 16:06:08 +08:00
Chun Han	259f9106ad	enhance: refine variable-length-type memory usage(#38736 ) (#39578 ) related: #38736 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-02-27 21:13:58 +08:00
Patrick Weizhi Xu	04fff74a56	feat: introduce Text data type (#39874 ) issue: https://github.com/milvus-io/milvus/issues/39818 This PR mimics Varchar data type, allows insert, search, query, delete, full-text search and others. Functionalities related to filter expressions are disabled temporarily. Storage changes for Text data type will be in the following PRs. Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2025-02-19 11:04:51 +08:00
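
The n-gram commit above (26ec841feb) says that `min_gram` and `max_gram` control how documents are tokenized. A minimal sketch of that idea follows, assuming plain character-level sliding windows. The helper name `char_ngrams` is hypothetical and is not part of Milvus or pymilvus; the real index may tokenize differently.

```python
def char_ngrams(text: str, min_gram: int, max_gram: int) -> list[str]:
    """Return every character n-gram of `text` for n in [min_gram, max_gram].

    Illustrative only: assumes simple character windows, which may not match
    the tokenizer used by the actual NGRAM index.
    """
    if min_gram < 1 or max_gram < min_gram:
        raise ValueError("require 1 <= min_gram <= max_gram")
    grams = []
    for n in range(min_gram, max_gram + 1):
        # Slide a window of width n across the text; empty if text is shorter than n.
        for start in range(len(text) - n + 1):
            grams.append(text[start:start + n])
    return grams


if __name__ == "__main__":
    # With min_gram=2 and max_gram=4 the text yields 2-grams, 3-grams, and 4-grams.
    print(char_ngrams("milvus", 2, 4))
```

Under this assumption, an InnerMatch query such as `%lvu%` could be narrowed to documents whose gram set contains the grams of `lvu` before the exact match is verified.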

216 Commits
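
The Direct IO commit (7e855f1046) states that `common.diskWriteBufferSizeKb` has a valid range of [4, 65536] and is rounded up to the nearest multiple of 4KB when unaligned. The sketch below illustrates that arithmetic only. The function name `normalize_buffer_size_kb` is hypothetical, and rejecting out-of-range values with an error is an assumption; the commit message does not say how Milvus handles them.

```python
ALIGN_KB = 4               # alignment stated in the commit message
MIN_KB, MAX_KB = 4, 65536  # valid range stated in the commit message


def normalize_buffer_size_kb(value_kb: int) -> int:
    """Round a buffer size in KB up to the next multiple of 4 KB.

    Assumption: out-of-range values raise an error; the real behavior
    for such values is not described in the source.
    """
    if not MIN_KB <= value_kb <= MAX_KB:
        raise ValueError(f"buffer size must be in [{MIN_KB}, {MAX_KB}] KB")
    # Integer ceiling division, then scale back to KB.
    return ((value_kb + ALIGN_KB - 1) // ALIGN_KB) * ALIGN_KB


if __name__ == "__main__":
    for size in (4, 5, 64, 65533):
        print(size, "->", normalize_buffer_size_kb(size))
```

Because 65536 is itself a multiple of 4, rounding up an in-range value never pushes it past the upper bound.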