milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-05 18:31:59 +08:00

Author	SHA1	Message	Date
Buqian Zheng	b497d3d7a4	fix: call promise->setValue only after releasing the ListNode mtx (#43547) issue: #43261 `promise->setValue(folly::Unit());` may run callbacks inline, and some of them may attempt to grab `mtx_`. So we should not call `promise->setValue(folly::Unit());` while holding the lock. --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-26 18:34:55 +08:00
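The locking hazard described in this commit can be illustrated with a minimal sketch. It does not use folly: `InlinePromise` and `Node` are hypothetical stand-ins, written under the assumption that fulfilling the promise runs its continuation inline on the calling thread. The point is the shape of the fix: take the promise out while locked, release the mutex, then fulfil it.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a promise whose continuation may run inline
// inside setValue(). This is NOT folly's API.
class InlinePromise {
 public:
    void onFulfilled(std::function<void()> cb) { callback_ = std::move(cb); }
    void setValue() {
        if (callback_) callback_();  // runs on the caller's thread, inline
    }

 private:
    std::function<void()> callback_;
};

// Hypothetical node guarded by a non-recursive mutex.
class Node {
 public:
    // Registers a waiter whose continuation needs mtx_.
    void addWaiter() {
        std::lock_guard<std::mutex> lock(mtx_);
        promise_ = std::make_unique<InlinePromise>();
        promise_->onFulfilled([this] {
            // Would self-deadlock if the fulfilling thread still held mtx_.
            std::lock_guard<std::mutex> inner(mtx_);
            assert(loaded_);
            ++notified_;
        });
    }

    // Publishes the state under the lock, then fulfils the promise after
    // the lock has been released.
    void markLoaded() {
        std::unique_ptr<InlinePromise> to_fulfil;
        {
            std::lock_guard<std::mutex> lock(mtx_);
            loaded_ = true;
            to_fulfil = std::move(promise_);  // take ownership while locked
        }  // mtx_ is released here
        if (to_fulfil) to_fulfil->setValue();  // callback may lock mtx_ safely
    }

    int notified() {
        std::lock_guard<std::mutex> lock(mtx_);
        return notified_;
    }

 private:
    std::mutex mtx_;
    bool loaded_ = false;
    int notified_ = 0;
    std::unique_ptr<InlinePromise> promise_;
};
```

Calling `setValue()` inside the locked scope instead would make the continuation try to lock `mtx_` a second time on the same thread, which is undefined behavior for `std::mutex` and typically a deadlock.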
Bingyi Sun	742d72a6c2	fix: Fix wrong null offsets for json path index (#43390 ) issue: https://github.com/milvus-io/milvus/issues/43315 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-26 17:26:54 +08:00
Bingyi Sun	a89e579485	fix: use tantivy version to make json index compatible with milvus 2.5 (#43563 ) issue: https://github.com/milvus-io/milvus/issues/43562 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-26 17:18:55 +08:00
congqixia	0b860b4aec	fix: Revert "enhance: DataCodec to release ownership of input_data after initialization (#43542 )" (#43571 )	2025-07-25 20:48:16 +08:00
congqixia	2a7b7a811a	fix: [StorageV2] Throw exception when read rg fails (#43561) Related to #43261 Read errors were caught inside `LoadWithStrategy`, so the caller could not detect a read failure when an error occurred. This patch makes `LoadWithStrategy` throw the exception instead of swallowing it. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-25 17:40:55 +08:00
Buqian Zheng	d23205b718	enhance: DataCodec to release ownership of input_data after initialization (#43542 ) issue: https://github.com/milvus-io/milvus/issues/43088 issue: https://github.com/milvus-io/milvus/issues/43038 see also https://github.com/milvus-io/milvus/pull/43533. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-25 14:24:54 +08:00
sthuang	5cebc9f7f6	fix: [StorageV2] handle correct cid with multiple files and add storage v2 prefix logs (#43539 ) related: #43372 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-25 11:22:54 +08:00
Spade A	10fe53ff59	feat: support json for ngram (#43170) Ref https://github.com/milvus-io/milvus/issues/42053 This PR enables ngram to support the JSON data type. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-25 10:28:54 +08:00
Buqian Zheng	d367770649	enhance: greatly reduce the loading memory overhead - by up to 25% (#43533) issue: #43088 issue: #43038
The current loading process:
* When loading an index, we first download the index files into a list of buffers, say A
* then construct (copy) them into a vector of FieldDatas (each file is a FieldData), say B
* assemble them together as a huge BinarySet, say C
* lastly, copy into the actual index data structure, say D
The problem:
* After each step, we no longer need the data from the previous step.
* But currently, we release the memory of A, B, and C only after we have finished constructing D.
* This leads to peak memory usage of up to 4x the raw index size during the loading process.
* This PR allows timely release of B after we have assembled C. So after this PR, the peak memory usage during loading will be up to 3x the raw index size.
I will create another PR to release A after we have created B; that seems more complicated and needs more work. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-24 11:26:54 +08:00
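The 4x versus 3x figures in this commit can be checked with a small counting model. This is only a sketch, assuming each of the stages A, B, C, and D holds one full copy of the raw index; `CopyTracker` and `peakCopiesDuringLoad` are hypothetical names, not Milvus code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Counts how many full copies of the raw index are alive at the same time.
class CopyTracker {
 public:
    void acquire() {
        ++live_;
        peak_ = std::max(peak_, live_);
    }
    void release() {
        assert(live_ > 0);
        --live_;
    }
    std::size_t peak() const { return peak_; }

 private:
    std::size_t live_ = 0;
    std::size_t peak_ = 0;
};

// Walks through the four loading stages from the commit message and returns
// the peak number of simultaneous copies.
inline std::size_t peakCopiesDuringLoad(bool release_b_after_c) {
    CopyTracker tracker;
    tracker.acquire();  // A: downloaded file buffers
    tracker.acquire();  // B: FieldDatas copied from A
    tracker.acquire();  // C: BinarySet assembled from B
    if (release_b_after_c) {
        tracker.release();  // B is no longer needed once C exists
    }
    tracker.acquire();  // D: final index structure copied from C
    // Whatever is still alive besides D is released only after D is built,
    // so it does not lower the peak.
    return tracker.peak();
}
```

Under this model the peak drops from 4 copies to 3, which matches the "up to 25%" reduction in the commit title.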
zhagnlu	d64dceea47	fix: add convert int to float function to array_contains related expr (#43468) #43281 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-07-23 15:20:53 +08:00
Buqian Zheng	7ced9fc5d9	fix: fix loading resource estimation (#43509) Currently we multiply the requested size when adding to loading, but do not do so when estimating projected usage. issue: #43088 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-23 10:36:53 +08:00
congqixia	cc1034fe96	fix: [AddField] Resolve FieldIndexing dangling reference (#43499) Related to #43113 This PR: - Changes the member of FieldIndex from `FieldMeta &` to the needed `DataType` and dim members, resolving the dangling reference after a schema change - Adds a double check after acquiring the lock to reduce multiple assignment - Changes `auto schema` to `auto& schema` to reduce schema copies Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-23 00:14:52 +08:00
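The dangling-reference fix in this commit follows a general pattern: copy the few values you need instead of holding a reference into an object that may be replaced. The sketch below uses simplified, hypothetical types that only mirror the names in the commit message; it is not the segcore code.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Simplified, hypothetical types for illustration only.
enum class DataType { Int64, FloatVector };

struct FieldMeta {
    DataType type;
    int64_t dim;
};

struct Schema {
    std::vector<FieldMeta> fields;
};

// Risky shape: keeps a reference into a Schema it does not own.
// If the Schema object is replaced, meta_ dangles and dim() is undefined.
class FieldIndexingByRef {
 public:
    explicit FieldIndexingByRef(const FieldMeta& meta) : meta_(meta) {}
    int64_t dim() const { return meta_.dim; }

 private:
    const FieldMeta& meta_;
};

// Safer shape: copies only the values it needs at construction time,
// so a later schema replacement cannot invalidate them.
class FieldIndexingByValue {
 public:
    explicit FieldIndexingByValue(const FieldMeta& meta)
        : data_type_(meta.type), dim_(meta.dim) {
        assert(dim_ > 0);
    }
    DataType dataType() const { return data_type_; }
    int64_t dim() const { return dim_; }

 private:
    DataType data_type_;
    int64_t dim_;
};
```

With the by-value variant, destroying or swapping the schema after construction leaves `dataType()` and `dim()` valid, at the cost of not observing later changes to the field metadata.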
sthuang	59bbdd93f5	fix: [StorageV2] fill the correct group chunk into cell (#43486 ) The root cause of the issue lies in the fact that when a sealed segment contains multiple row groups, the get_cells function may receive unordered cids. This can result in row groups being written into incorrect cells during data retrieval. Previously, this issue was hard to reproduce because the old Storage V2 writer had a bug that caused it to write row groups larger than 1MB. These large row groups could lead to uncontrolled memory usage and eventually an OOM (Out of Memory) error. Additionally, compaction typically produced a single large row group, which avoided the incorrect cell-filling issue during query execution. related: https://github.com/milvus-io/milvus/issues/43388, https://github.com/milvus-io/milvus/issues/43372, https://github.com/milvus-io/milvus/issues/43464, #43446, #43453 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-22 22:22:53 +08:00
Buqian Zheng	0599113a4b	enhance: add timeout to resource reservation (#43441 ) issue: https://github.com/milvus-io/milvus/issues/41435 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-22 15:24:53 +08:00
Chun Han	5a1092304c	fix: refine judgement for batch views(#38736 ) (#43481 ) related: #38736 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-07-22 14:20:53 +08:00
sthuang	f77571d5c1	fix: [StorageV2] file writer write row group split to default size (#43471 ) Bumped milvus storage version. related: https://github.com/milvus-io/milvus/issues/43310 * https://github.com/milvus-io/milvus-storage/pull/213 * https://github.com/milvus-io/milvus-storage/pull/217 * https://github.com/milvus-io/milvus-storage/pull/220 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-22 09:52:52 +08:00
sthuang	6c5f5f1e32	enhance: [StorageV2] refactor group chunk translator (#43406 ) related: #43372 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-21 19:46:53 +08:00
sparknack	81694739ef	fix: revert ska::flat_hash_set to std::unordered_set to address an un… (#43428 ) issue: #43388 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-07-21 17:39:40 +08:00
aoiasd	e9fc140eaf	fix: jieba tokenizer cause panic when dict word was empty string (#43337 ) relate: https://github.com/milvus-io/milvus/issues/42779 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-21 16:34:53 +08:00
aoiasd	c7b53ed43b	enhance: run rust format (#43447 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-21 14:12:53 +08:00
Bingyi Sun	21e71f6eb2	fix: Check json nested path before validating data type (#43329 ) issue: #43279 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-21 10:30:54 +08:00
Xianhui Lin	c13393418c	fix: invalid string error when enabled json stats (#43380) issue: https://github.com/milvus-io/milvus/issues/43151 Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-07-20 23:38:53 +08:00
aoiasd	f7e1f1c382	enhance: support download lindera system dictionary online (#43121 ) relate: https://github.com/milvus-io/milvus/issues/43120 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-20 23:24:52 +08:00
Buqian Zheng	389104d200	enhance: rename PanicInfo to ThrowInfo (#43384 ) issue: #41435 this is to prevent AI from thinking of our exception throwing as a dangerous PANIC operation that terminates the program. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-19 20:22:52 +08:00
Buqian Zheng	f7b262a702	feat: make storagev1 to support eviction (#43219 ) issue: https://github.com/milvus-io/milvus/issues/41435 turns out we have per file binlog size in golang code, by passing it into segcore we can support eviction in storage v1 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-19 02:02:52 +08:00
Spade A	42ad786f75	fix: update tantivy for fixing dir removing race condition (#43399 ) fix: https://github.com/milvus-io/milvus/issues/43258 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-18 15:44:56 +08:00
Buqian Zheng	d793def47c	feat: impose a physical memory limit when loading cells (#43222) issue: #41435 issue: https://github.com/milvus-io/milvus/issues/43038 This PR also: 1. removes the ERROR state from ListNode 2. makes CacheSlot call reserveMemory once for all requested cells after updating the state to LOADING, so a cell now transitions to LOADING before its resource reservation 3. rejects a resource reservation directly if size >= max_size --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-18 11:18:52 +08:00
Spade A	8612a2c946	enhance: optimize in by batch-in (#43268 ) fix: https://github.com/milvus-io/milvus/issues/43267 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-17 19:40:52 +08:00
sparknack	9b4081e110	enhance: cachinglayer: some performance optimization (#42858) issue: #41435
We compared the performance using the modified test_sealed.cpp, which randomly accesses all rows in all chunks and counts the number of runs within 3s.

## performance data comparison (ops/second)

chunk config: 1x1000

| Field Type | w/o cachinglayer (commit 640f526301) | w/ cachinglayer | w/ cachinglayer + opt |
|---|---|---|---|
| Bool field | 82428 | -63.6% (29983) | +2.7% (84675) |
| Int8 field | 82228 | -63.3% (30166) | +2.4% (84163) |
| Int16 field | 82572 | -63.8% (29867) | +1.8% (84036) |
| Int32 field | 82797 | -63.7% (30031) | +1.5% (84043) |
| Int64 field | 81077 | -62.9% (30107) | +0.6% (81604) |
| Float field | 82678 | -63.4% (30266) | +1.8% (84146) |
| Double field | 81925 | -63.4% (29974) | +0.2% (82097) |
| Varchar field | 19933 | -19.6% (16027) | +18.9% (23690) |
| JSON field | 16519 | -96.8% (533) | +2.5% (16927) |
| Int array field | 7325 | -13.7% (6321) | -1.4% (7220) |
| Long array field | 6347 | -8.9% (5781) | -0.1% (6344) |
| Bool array field | 8275 | -14.0% (7116) | +0.4% (8311) |
| String array field | 2281 | -5.0% (2168) | +0.2% (2287) |
| Double array field | 6427 | -13.3% (5574) | -2.0% (6301) |
| Float array field | 7291 | -13.0% (6346) | -1.5% (7183) |
| Vector field | 27487 | -40.4% (16371) | -4.7% (26192) |
| Float16 vector field | 49773 | -54.6% (22601) | -5.9% (46834) |
| BFloat16 vector field | 49783 | -53.1% (23350) | -5.7% (46934) |
| Int8 vector field | 63871 | -59.0% (26179) | -6.2% (59926) |

---

chunk config: 10x1000

| Field Type | w/o cachinglayer (commit 640f526301) | w/ cachinglayer | w/ cachinglayer + opt |
|---|---|---|---|
| Bool field | 3659 | -48.6% (1879) | +110.1% (7686) |
| Int8 field | 3410 | -45.3% (1864) | +123.9% (7636) |
| Int16 field | 3647 | -48.6% (1874) | +110.1% (7661) |
| Int32 field | 3647 | -48.8% (1866) | +109.6% (7645) |
| Int64 field | 3645 | -48.9% (1863) | +107.8% (7573) |
| Float field | 3647 | -49.0% (1861) | +109.5% (7639) |
| Double field | 3640 | -45.1% (1998) | +108.4% (7586) |
| Varchar field | 1594 | -23.9% (1213) | +20.6% (1922) |
| JSON field | 1202 | -26.5% (884) | +16.1% (1396) |
| Int array field | 602 | -12.3% (528) | +12.7% (678) |
| Long array field | 529 | -12.2% (465) | +7.5% (569) |
| Double array field | 537 | -13.0% (467) | +6.4% (571) |
| Vector field | 1520 | -37.9% (943) | -5.5% (1437) |
| Float16 vector field | 2607 | -47.0% (1382) | +6.4% (2774) |
| BFloat16 vector field | 2586 | -46.5% (1383) | +8.8% (2813) |
| Int8 vector field | 3101 | -47.3% (1633) | +41.9% (4400) |

--------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-07-17 11:20:51 +08:00
zhagnlu	ee43954534	fix: fix text_match bug because of not adapting to multi-chunk model (#43303) https://github.com/milvus-io/milvus/issues/43296 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-07-17 10:32:51 +08:00
Spade A	d750816ba0	fix: remove std::string support for stlsort index (#43355) fix: https://github.com/milvus-io/milvus/issues/43354 The current implementation of the stlsort index does not support std::string, so this PR removes the code. Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-16 17:46:51 +08:00
foxspy	58a9e49066	enhance: update knowhere version (#43331 ) issue: #42937 #43294 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-07-16 15:04:50 +08:00
Bingyi Sun	1b8c958cff	enhance: fix tantivy wrapper is freed after json flat executor is destructed (#43233 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-16 10:58:50 +08:00
congqixia	fe8de016d5	fix: [StorageV2] Align null bitmap offset when loading multi-chunk (#43321) Related to #43262 This patch fixes the following logic bugs: - When multiple chunks are loaded and the size is not divisible by 8, simply appending uint8_t values as the bitmap causes null bitmap dislocation - `null_bitmap_data()` points to the start of the whole row group, which may not correspond to the current `arrow::Array` The current solution is: - Reorganize the null_bitmap with the correct size & offset - Pass `array->offset()` in the tuple to convey the current offset Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-15 19:22:50 +08:00
Bingyi Sun	ccfaa7bee8	fix: Fix the bug when offsets is nullptr in bulk api (#43127 ) issue: https://github.com/milvus-io/milvus/issues/42978 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-15 17:54:50 +08:00
Spade A	db91d85dbc	feat: more types of matches for ngram (#43081) Ref https://github.com/milvus-io/milvus/issues/42053 This PR enables ngram to support more kinds of matches, such as prefix and postfix matches. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-14 20:34:50 +08:00
Spade A	e14a52721e	enhance: use stl sort with high cardinality for data_type int (#43305 ) fix: https://github.com/milvus-io/milvus/issues/43304 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-14 18:40:50 +08:00
congqixia	ae48f0e484	fix: [StorageV2] Handle missing column creating index (#43292 ) Related to #43250 Use FieldIDList to check missing field. If column is missing, return empty resultset Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-14 17:06:50 +08:00
foxspy	8171a2a0b5	enhance: update knowhere version (#43246 ) issue: #42937 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-07-14 11:06:49 +08:00
Alexander Guzhva	a848c4a8c5	fix: fix incorrect bitset for the division comparison when the right is < 0 (#43179 ) issue: https://github.com/milvus-io/milvus/issues/42900 @sunby Unfortunately, it is not that easy to fix as it was thought in #43177 Upd: also handles `Inf` and `NaN` values, and the division by zero case for `fp32` and `fp64` Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>	2025-07-11 19:04:49 +08:00
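Why the sign of the right-hand side matters can be shown with a scalar sketch. It assumes the comparison `x / d < v` is evaluated through a multiplication-based rewrite; that is an assumption made for illustration, and the functions below are hypothetical and do not reproduce the bitset code touched by the PR.

```cpp
#include <cassert>
#include <limits>

// Reference semantics: perform the division, then compare.
inline bool divLessReference(double x, double d, double v) {
    return x / d < v;
}

// Naive rewrite that ignores the sign of d. Wrong for d < 0.
inline bool divLessNaive(double x, double d, double v) {
    return x < v * d;
}

// Sign-aware rewrite: multiplying both sides by a negative d flips the
// inequality. For d == 0 or d == NaN it falls back to the real division so
// that Inf and NaN follow IEEE 754 semantics.
inline bool divLessRewritten(double x, double d, double v) {
    if (d > 0) return x < v * d;
    if (d < 0) return x > v * d;
    return x / d < v;
}

// Usage checks with exactly representable values.
inline void runExamples() {
    const double inf = std::numeric_limits<double>::infinity();
    const double nan = std::numeric_limits<double>::quiet_NaN();
    // 6 / -2 = -3, and -3 < -2 holds.
    assert(divLessReference(6.0, -2.0, -2.0));
    assert(divLessRewritten(6.0, -2.0, -2.0));
    assert(!divLessNaive(6.0, -2.0, -2.0));  // the unflipped form disagrees
    // Division by zero: 1 / 0 = +Inf, which is not < 5.
    assert(!divLessRewritten(1.0, 0.0, 5.0));
    // Inf and NaN inputs agree with the reference.
    assert(divLessRewritten(inf, -2.0, 0.0) == divLessReference(inf, -2.0, 0.0));
    assert(divLessRewritten(nan, -2.0, 0.0) == divLessReference(nan, -2.0, 0.0));
}
```

Because `x / d` and `v * d` round differently, the rewrite is not guaranteed to match the reference bit for bit on every input; the checks use values where both sides are exact.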
congqixia	6bbed3b019	fix: [AddField] Add shared_lock for insert to prevent race (#43229) Related to #43113 When a schema change happens, inserts must not run concurrently; otherwise: - A data race may occur, causing insertion failure - The data schema may become inconsistent This PR adds a shared_lock to prevent this data race. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-10 21:26:48 +08:00
PjJinchen	a90694165b	feat: Supports tracing services that require header-based authentication. (#43211) issue: https://github.com/milvus-io/milvus/issues/43082
Support tracing services that require header-based authentication, for example aliyun SLS, volcengine LogService, etc.

[aliyun SLS](https://help.aliyun.com/zh/sls/import-trace-data-from-golang-applications-to-log-service-by-using-opentelemetry-sdk-for-golang?spm=a2c4g.11186623.help-menu-search-28958.d_1#section-ktk-xxz-8om)
Add a headers config in trace config
```
trace:
  exporter: otlp
  sampleFraction: 1
  otlp:
    endpoint: milvus-cn-beijing-pre.cn-beijing.log.aliyuncs.com:10010
    method: # otlp export method, acceptable values: ["grpc", "http"], using "grpc" by default
    secure: true
    headers: # base64
  initTimeoutSeconds: 10
```
It is encoded as base64; the raw data is JSON
```
{
  "x-sls-otel-project": "milvus-cn-beijing-pre",
  "x-sls-otel-instance-id": "milvus-cn-beijing-pre",
  "x-sls-otel-ak-id": "xxx",
  "x-sls-otel-ak-secret": "xxx"
}
```

[volcengine tls](https://www.volcengine.com/docs/6470/812322#grpc-%E5%8D%8F%E8%AE%AE%E5%88%9D%E5%A7%8B%E5%8C%96%E7%A4%BA%E4%BE%8B)
Add a headers config in trace config
```
trace:
  exporter: otlp
  sampleFraction: 1
  otlp:
    endpoint: xxx
    method: # otlp export method, acceptable values: ["grpc", "http"], using "grpc" by default
    secure: true
    headers: # base64
  initTimeoutSeconds: 10
```
It is encoded as base64; the raw data is JSON
```
{
  "x-tls-otel-region": "cn-beijing",
  "x-tls-otel-tracetopic": "milvus-cn-beijing-pre",
  "x-tls-otel-ak": "xxx",
  "x-tls-otel-sk": "xxx"
}
```
Signed-off-by: PjJinchen <6268414+pj1987111@users.noreply.github.com>	2025-07-10 17:32:48 +08:00
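Producing the `headers` value described in this commit amounts to base64-encoding the raw JSON. The sketch below assumes the standard RFC 4648 alphabet with `=` padding, which the commit message does not spell out; `base64Encode` and `exampleHeadersValue` are hypothetical helpers, and the JSON keys are the placeholder values from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Standard base64 (RFC 4648) with '=' padding.
inline std::string base64Encode(const std::string& in) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    auto byte = [&in](std::size_t k) {
        return static_cast<uint32_t>(static_cast<unsigned char>(in[k]));
    };
    std::string out;
    out.reserve((in.size() + 2) / 3 * 4);
    std::size_t i = 0;
    // Full 3-byte groups become 4 output characters.
    for (; i + 2 < in.size(); i += 3) {
        const uint32_t n = (byte(i) << 16) | (byte(i + 1) << 8) | byte(i + 2);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += kAlphabet[n & 63];
    }
    // A trailing group of 1 or 2 bytes is padded with '='.
    const std::size_t rem = in.size() - i;
    if (rem > 0) {
        uint32_t n = byte(i) << 16;
        if (rem == 2) n |= byte(i + 1) << 8;
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += (rem == 2) ? kAlphabet[(n >> 6) & 63] : '=';
        out += '=';
    }
    assert(out.size() % 4 == 0);
    return out;
}

// Encodes the placeholder aliyun SLS header JSON from the commit message.
inline std::string exampleHeadersValue() {
    const std::string raw =
        R"({"x-sls-otel-project":"milvus-cn-beijing-pre",)"
        R"("x-sls-otel-instance-id":"milvus-cn-beijing-pre",)"
        R"("x-sls-otel-ak-id":"xxx","x-sls-otel-ak-secret":"xxx"})";
    return base64Encode(raw);
}
```

The returned string is what would be pasted after `headers:` in the trace config, with the `xxx` placeholders replaced by real credentials before encoding.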
Chun Han	07745439b5	fix: empty search groupby result causing crash(#43137 ) (#43214 ) related: #43137 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-07-10 12:04:48 +08:00
congqixia	f027eea545	enhance: [AddField] Add log for segcore segment schema change (#43215) Related to #39178 This PR adds logs for segment schema change operations. It also fixes the nit comments from PR #42490 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-10 10:22:47 +08:00
zhagnlu	21d1fb2aa3	fix: fix move cursor bug for chunk segment with index (#43095 ) #42974 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-07-09 17:38:47 +08:00
Spade A	d41eec6f10	fix: avoid copy when getting json chunk (#43183) fix: https://github.com/milvus-io/milvus/issues/43182 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-08 15:28:46 +08:00
sthuang	a0ae5bccc9	fix: [StorageV2] load growing segment get dim datatype check (#43168 ) related: https://github.com/milvus-io/milvus/issues/43072 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-07 15:46:47 +08:00
sthuang	276c52490d	fix: [StorageV2] missing arrow fs when building index (#43162 ) fix: https://github.com/milvus-io/milvus/issues/43150, https://github.com/milvus-io/milvus/issues/43149 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-07 15:26:46 +08:00
sthuang	9f361a228e	enhance: storage v2 chunked column memory size from meta (#43130 ) use meta to get chunked column memory size to avoid getting cells actually from storage. related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-07 14:24:46 +08:00
Spade A	fce0bbe2ae	fix: remove redundant locks for null_offset (#43103) Ref: https://github.com/milvus-io/milvus/issues/40308 https://github.com/milvus-io/milvus/pull/40363 added a lock to protect concurrent reads/writes of the null offset, but we don't need this for sealed segments. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-04 10:10:45 +08:00

2063 Commits