1869 Commits

Author SHA1 Message Date
Chun Han
afa519b4c7
fix: array is null failed(#40686) (#41027)
related: #40686

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-04-02 18:20:22 +08:00
smellthemoon
cb1e86e17c
enhance: support add field (#39800)
After this PR is merged, insert, upsert, index building, query, and
search are supported on the added field.
These operations are available on the added field only after the
add-field request completes, which is a synchronous operation.

Compaction will be supported in the next PR.
#39718

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-04-02 14:24:31 +08:00
Spade A
216be1494b
fix: add log for object storage operation fail (#40666)
fix: #40665

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-02 01:26:21 +08:00
cqy123456
6dc0f42830
fix:growing mmap data type crashed by nullable input (#40994)
issue: https://github.com/milvus-io/milvus/issues/40981
2.5 pr: https://github.com/milvus-io/milvus/pull/40980

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-03-31 20:32:19 +08:00
Bingyi Sun
27ff3a42e7
enhance: Record simdjson error (#41003)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-31 17:56:19 +08:00
Bingyi Sun
15ec7bae4d
fix: Fix using json index when iterative_filter is specified (#40945)
issue: #40934

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-31 15:26:19 +08:00
Bingyi Sun
9676365af9
fix: Fix json index not equal filter (#40647)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-27 23:06:23 +08:00
aoiasd
384d39ef5a
enhance: not build lindera features by default and support make milvus with tantivy features (#40813)
relate: https://github.com/milvus-io/milvus/issues/39659

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-03-27 14:08:22 +08:00
zhagnlu
87e7d6d79f
fix:fix exception when do arith expr with using index (#40794)
#40783

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-27 11:10:21 +08:00
Xiaofan
8788e591cd
enhance: add detailed stack for error message (#40883)
fix #40882
Add a stack trace to the error message when operator execution fails.

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2025-03-26 13:24:20 +08:00
zhagnlu
7fdb2e144f
enhance:change multi or expr to in expr (#40757)
#40752
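
A minimal sketch of the rewrite named in the title, assuming each OR branch is an equality test on the same column. The `(column, operator, value)` tuple form and the `or_to_in` helper are hypothetical illustrations, not Milvus code:

```python
from typing import List, Optional, Tuple

# Hypothetical term representation: (column, operator, value).
Term = Tuple[str, str, object]


def or_to_in(terms: List[Term]) -> Optional[Tuple[str, str, list]]:
    """Collapse `c == v1 or c == v2 or ...` into `c in [v1, v2, ...]`.

    Returns None when the branches cannot be merged (mixed columns or
    operators other than equality), so the caller keeps the original OR.
    """
    if len(terms) < 2:
        return None
    column = terms[0][0]
    if any(col != column or op != "==" for col, op, _ in terms):
        return None
    values = []
    for _, _, value in terms:
        if value not in values:  # drop duplicate values, keep first-seen order
            values.append(value)
    return (column, "in", values)
```

A single `in` term can be evaluated in one pass over the column instead of one pass per OR branch.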

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-25 11:06:18 +08:00
cai.zhang
a41cb942f6
fix: Do not delete the centroids file when sampling fails instead wait GC (#40701)
issue: #40700

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-03-21 10:32:12 +08:00
sthuang
d7df78a6c9
feat: Storage v2 compaction (#40667)
- Feat: Support Mix compaction. Covering tests include compatibility and
rollback ability.
  - Read v1 segments and compact with v2 format.
  - Read both v1 and v2 segments and compact with v2 format.
  - Read v2 segments and compact with v2 format.
  - Compact with duplicate primary key test.
  - Compact with bm25 segments.
  - Compact with merge sort segments.
  - Compact with no expiration segments.
  - Compact with lack binlog segments.
  - Compact with nullable field segments.
- Feat: Support Clustering compaction. Covering tests include
compatibility and rollback ability.
  - Read v1 segments and compact with v2 format.
  - Read both v1 and v2 segments and compact with v2 format.
  - Read v2 segments and compact with v2 format.
  - Compact bm25 segments with v2 format.
  - Compact with memory limit.
- Enhance: Use serdeMap serialize in BuildRecord function to support all
Milvus data types.
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-21 10:16:12 +08:00
Bingyi Sun
5a6b4e56d5
fix: Fix tasks will panic if one of them throw an exception. (#40691)
issue: https://github.com/milvus-io/milvus/issues/40690

The variable rcm is left dangling if a future throws an exception and
the function returns.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-19 16:52:09 +08:00
aoiasd
92bdf7a0c1
enhance: support run analyzer returning detailed tokens (#40458)
relate: https://github.com/milvus-io/milvus/issues/39705

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-03-19 15:48:15 +08:00
zhagnlu
6c55db44f1
enhance: reorder sub expr for conjunct expr (#39872)
Two points:
(1) Reorder the sub-expressions of a conjunct expression so that heavy
operations are postponed.
Sequence: int(column) -> index(column) -> string(column) -> light conjunct
...... -> json(column) -> heavy conjunct -> two_column_compare
(2) Support pre-filtering during expression execution: skip scanning raw
data that has already been filtered out by the result of a preceding
expression.
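
The two points above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Milvus implementation: `SubExpr`, `COST_RANK`, `reorder`, and `eval_and` are hypothetical names, and the rank table covers only the kinds listed in the sequence above (the elided entries are left out).

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical cost ranks, following the order listed in the commit message.
COST_RANK = {
    "int": 0,
    "index": 1,
    "string": 2,
    "light_conjunct": 3,
    "json": 4,
    "heavy_conjunct": 5,
    "two_column_compare": 6,
}


@dataclass
class SubExpr:
    kind: str                          # one of the COST_RANK keys
    predicate: Callable[[dict], bool]  # evaluated against a single row


def reorder(sub_exprs: List[SubExpr]) -> List[SubExpr]:
    # sorted() is stable, so sub-expressions of equal cost keep their order.
    return sorted(sub_exprs, key=lambda e: COST_RANK[e.kind])


def eval_and(sub_exprs: List[SubExpr], rows: List[dict]) -> List[bool]:
    alive = [True] * len(rows)
    for expr in reorder(sub_exprs):
        for i, row in enumerate(rows):
            # Pre-filter: a row rejected by a preceding sub-expression
            # is never scanned again by the heavier ones that follow.
            if alive[i] and not expr.predicate(row):
                alive[i] = False
    return alive
```

Because cheap predicates run first, the expensive ones (JSON access, two-column comparison) only touch rows that are still candidates.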

#39869

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-19 14:50:14 +08:00
Zhen Ye
8db708f67d
enhance: enable memory prof based on jemalloc (#40731)
issue: #40730

also see: https://github.com/milvus-io/cgosymbolizer/pull/2

After these PRs, on Linux:

- Milvus always enables jemalloc by default.
- jemalloc is always compiled with the --enable-prof option.
- All images enable jemalloc profiling by default.
- A pprof HTTP service for jemalloc is registered in the RESTful service
at `/debug/jemalloc/`.
- `jeprof` can profile the memory of Milvus remotely.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-03-19 14:46:18 +08:00
zhagnlu
7ebe3d7038
enhance: refine chunk access logic and add some comment on data (#40618)
#40367

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-16 22:20:08 +08:00
Bingyi Sun
6249335859
fix: Catch invalid json pointer error (#40625)
issue: #35528

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-14 16:56:08 +08:00
Bingyi Sun
d3adab15ac
fix: Build double index for all json numeric field (#40619)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-14 16:52:11 +08:00
Bingyi Sun
8fbacf3583
fix: Null expr does not work for json field (#40456)
issue: https://github.com/milvus-io/milvus/issues/40455

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-14 16:06:08 +08:00
Spade A
001fc992df
enhance: get doc ids by batch (#40608)
issue: #40607

tantivy change: https://github.com/zilliztech/tantivy/pull/3

Benchmarks:
Test environment: CPU 9900K
The data is inserted by:
```
for i in 0..N {
    for j in 0..UNIQUE {
        let key = format!("hello{}", j);
        index_writer.add_string(&key, i * UNIQUE + j).unwrap();
    }
}
```
So UNIQUE influences the locality of the matched docs.
The latency is the average latency over 1000 repeated queries.
The result shows 22.5%-34.8% latency reduction.
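
To make the locality point concrete, here is a small sketch that mirrors the insert loop above. It assumes the second argument of `add_string` is the document id; `doc_ids_for_key` is a hypothetical helper used only for illustration:

```python
from typing import List


def doc_ids_for_key(j: int, n: int, unique: int) -> List[int]:
    """Ids written for key "hello{j}" by the insert loop: i * UNIQUE + j."""
    return [i * unique + j for i in range(n)]
```

With UNIQUE = 1 the ids matched by a key are contiguous; with a larger UNIQUE they are spread out with a stride of UNIQUE, so the matched docs for one key are less local.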

![image](https://github.com/user-attachments/assets/dd8af75a-ddc3-445d-92df-50d354dd5645)

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-14 15:48:09 +08:00
Spade A
f36d1562bd
enhance: add metrics for random sample (#40634)
issue: #39541

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-13 21:42:11 +08:00
Spade A
9f3bd55755
fix: avoid panic when field not exists in schema in query node (#40541)
ref #40473

This PR is a workaround to avoid the panic described in the issue.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-12 22:44:08 +08:00
Bingyi Sun
0698d04f7d
enhance: Upgrade simdjson version (#40538)
issue: https://github.com/milvus-io/milvus/issues/40519
Newer versions of simdjson return better error codes.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-11 15:04:05 +08:00
cai.zhang
e5f50076ec
enhance: Only check element type with not null array (#40446)
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-03-11 14:58:07 +08:00
Bingyi Sun
0a7e692b6f
fix: Fix null offset loading in inverted index (#40523)
issue: #40516

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-10 22:12:04 +08:00
Cai Yudong
2bd2cca04a
enhance: Truly support multi vector data types in SearchBruteForce (#40499)
Issue: #38666

Signed-off-by: CaiYudong <yudong.cai@zilliz.com>
2025-03-10 18:36:03 +08:00
sre-ci-robot
a6d4121034
[automated] Update Knowhere Commit (#40486)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-03-10 12:28:04 +08:00
smellthemoon
faae8ee518
fix: store wrong offset when build tantivy in nullable field (#40452)
#40454

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-03-09 09:34:04 +08:00
Bingyi Sun
37b118d55d
fix: Skip loading primary key if index has raw data (#39921)
issue: https://github.com/milvus-io/milvus/issues/39907

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-06 17:46:02 +08:00
Spade A
3db56560fb
fix: fix concurrent issues in null offset (#40363)
issue: #40308
This PR fixes two concurrency issues:
1. Elements in null_offset are used to set bits in a bitset whose size
is initialized from the tantivy document count. However, some documents
may not yet be committed in tantivy while already being recorded as null
in null_offset, so an out-of-range array access occurs.
2. null_offset can be read and written concurrently, but there is no
synchronization protection.
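
A minimal sketch of one way to guard against both problems, assuming a lock around the offset list and a bounds check against the committed document count. `NullOffsets` is a hypothetical stand-in; the actual fix in the PR may differ:

```python
import threading
from typing import List


class NullOffsets:
    """Hypothetical stand-in for the null_offset container."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._offsets: List[int] = []

    def add(self, offset: int) -> None:
        # Writers take the lock so readers never see a half-updated list.
        with self._lock:
            self._offsets.append(offset)

    def apply(self, committed_doc_count: int) -> List[bool]:
        """Build a null bitset sized by the committed document count."""
        bitset = [False] * committed_doc_count
        with self._lock:
            snapshot = list(self._offsets)  # copy under the lock
        for offset in snapshot:
            # Offsets of rows not yet committed would fall outside the
            # bitset, so they are skipped instead of indexed blindly.
            if offset < committed_doc_count:
                bitset[offset] = True
        return bitset
```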

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-03-05 17:48:00 +08:00
Bingyi Sun
be4d09561b
fix: Fix missing null or non-exist key in json index (#40336)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-05 11:48:02 +08:00
Bingyi Sun
7040ba1c12
enhance: make json path index support term filter (#40140)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-04 11:56:02 +08:00
Zhen Ye
8eb662b4dc
enhance: add more metrics for async cgo component (#40136)
issue: #40014

Signed-off-by: chyezh <chyezh@outlook.com>
2025-03-03 09:56:03 +08:00
sre-ci-robot
6a57a1973f
[automated] Update Knowhere Commit (#40283)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-03-03 01:11:58 +08:00
zhagnlu
7a17fb68ec
enhance: add monitor metric for retrieve raw data (#40141)
#40078

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-02 18:30:01 +08:00
zhagnlu
8c19e5c4a7
enhance: decrease delete record dump snapshot limit (#40101)
#40100

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-02 17:55:59 +08:00
Micka
5cc104b412
fix: Change CMake variable for switch to knowhere-cuvs (#40105)
issue: https://github.com/milvus-io/milvus/issues/39883

Signed-off-by: Mickael Ide <mide@nvidia.com>
2025-02-27 22:05:58 +08:00
Chun Han
259f9106ad
enhance: refine variable-length-type memory usage(#38736) (#39578)
related: #38736

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-02-27 21:13:58 +08:00
sre-ci-robot
b2769fb357
[automated] Update Knowhere Commit (#40223)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-02-27 01:35:59 +08:00
Spade A
476cf61d98
fix: random sample consider empty input (#40201)
issue: #40198

Fix random sample not handling empty input, that is, the case where no
data is hit by the filter expression.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-26 16:15:58 +08:00
Bingyi Sun
f05e9628f6
fix: Fix search failure of null expression (#40129)
issue: #40095

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-25 20:43:55 +08:00
Bingyi Sun
db4769281c
fix: Fall back to a brute-force search if json index type unmatched (#40076)
issue: https://github.com/milvus-io/milvus/issues/35528
If the query data type does not match the index type, fall back to a
brute-force search
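
The fallback can be sketched as follows, assuming an index that covers a single JSON path and a single value type. `JsonPathIndex` and `term_filter` are hypothetical names for illustration, not the Milvus API:

```python
from typing import Any, Dict, List, Tuple


class JsonPathIndex:
    """Hypothetical single-type index over one JSON path."""

    def __init__(self, path: str, cast_type: type, docs: List[Dict[str, Any]]) -> None:
        self.path = path
        self.cast_type = cast_type
        self.postings: Dict[Any, List[int]] = {}
        for row_id, doc in enumerate(docs):
            value = doc.get(path)
            if type(value) is cast_type:  # only values of the index type are indexed
                self.postings.setdefault(value, []).append(row_id)


def term_filter(index: JsonPathIndex, docs: List[Dict[str, Any]], value: Any) -> Tuple[List[int], str]:
    if type(value) is index.cast_type:
        # Query type matches the index type: answer from the index.
        return index.postings.get(value, []), "index"
    # Type mismatch: the index cannot answer, so scan the raw documents.
    hits = [i for i, doc in enumerate(docs) if doc.get(index.path) == value]
    return hits, "brute_force"
```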

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-24 16:25:57 +08:00
aoiasd
38f1608910
enhance: pack analyzer code and support lindera tokenizer (#39660)
relate: https://github.com/milvus-io/milvus/issues/39659

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-02-24 12:13:55 +08:00
sre-ci-robot
dd1347d041
[automated] Update Knowhere Commit (#40103)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-02-22 01:01:53 +08:00
sthuang
3eb3af5f08
feat: explicitly specify column groups for storage v2 api (#39790)
* use the new packed reader and writer api to be compatible with current
etcd meta
* For the new packed writer API: column groups and paths are explicitly
defined by users and won't split column groups by memory in storage v2.
Packed writer follows the user-defined column groups to split arrow
record and write into the corresponding file path.
* For the new packed reader API: read paths are explicitly defined by
users.
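
A rough sketch of the writer-side behaviour described above, using plain dictionaries instead of Arrow records. `split_by_column_groups` and the example paths are hypothetical; the real packed writer API is not shown in this commit message:

```python
from typing import Dict, List


def split_by_column_groups(
    record: Dict[str, list],
    column_groups: List[List[str]],
    paths: List[str],
) -> Dict[str, Dict[str, list]]:
    """Split one record into per-path sub-records, one per user-defined group."""
    if len(column_groups) != len(paths):
        raise ValueError("each column group needs exactly one file path")
    out: Dict[str, Dict[str, list]] = {}
    for group, path in zip(column_groups, paths):
        # The split follows the user-defined groups only; nothing is
        # regrouped by memory size.
        out[path] = {name: record[name] for name in group}
    return out
```

For example, grouping `pk` and `title` together and keeping `vec` on its own yields two sub-records keyed by their target paths.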
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-02-21 22:03:54 +08:00
yihao.dai
2a037a97f1
enhance: Add get vector latency metric and refine request limit error message (#40083)
issue: https://github.com/milvus-io/milvus/issues/40078

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-02-21 19:41:55 +08:00
Spade A
d34d70582d
fix: fix misleading name *_add_multi_* (#39997)
fix: #39995

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-21 16:45:55 +08:00
sre-ci-robot
f0d3d98c3f
[automated] Update Knowhere Commit (#40063)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-02-21 01:19:54 +08:00