zhagnlu
eac16a577c
enhance: support cachelayer for json stats ( #44446 )
...
#42533
Signed-off-by: zhagnlu <lu.zhang@zilliz.com>
2025-09-24 15:30:04 +08:00
sparknack
ab64afba2f
enhance: add storage resource usage for scalar search ( #44414 )
...
issue: #44212
---------
Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-09-22 14:28:06 +08:00
zhagnlu
e9bbb6aa9b
fix: fix json_contains bug for stats ( #44325 )
...
#42533
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-15 10:16:07 +08:00
Spade A
575d490af6
fix: ngram index is mistakenly used for unsupported operations 2 ( #44142 )
...
issue: https://github.com/milvus-io/milvus/issues/44020
https://github.com/milvus-io/milvus/pull/43955 only fixed unary
expressions.
This PR fixes all expressions and adds more tests.
---------
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-09 19:05:56 +08:00
zhagnlu
fc876639cf
enhance: support json stats with shredding design ( #42534 )
...
#42533
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-01 10:49:52 +08:00
zhagnlu
8934c18792
enhance: support cache result cache for expr ( #43923 )
...
issue: #43878
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-26 10:55:52 +08:00
congqixia
b6199acb05
enhance: Utilize search_batch_pks for search_ids of PkTerm ( #43751 )
...
Related to #43660
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-07 14:19:40 +08:00
Chun Han
d826d6ac91
fix: try to get span raw data for variable length data type( #43544 ) ( #43705 )
...
related: #43544
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-08-04 11:15:38 +08:00
zhagnlu
31801f5937
fix: fix pk in [..] skip next batch when using multi-chunk segment ( #43618 )
...
#43494
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-31 10:15:37 +08:00
Spade A
864d1b93b1
enhance: enable stlsort with mmap support ( #43359 )
...
issue: https://github.com/milvus-io/milvus/issues/43358
---------
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-28 15:32:55 +08:00
zhagnlu
9bf1cb02d5
fix: add array_contains_all int to float converter ( #43593 )
...
#43334
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-28 14:14:55 +08:00
zhagnlu
d64dceea47
fix: add int-to-float conversion to array_contains-related exprs ( #43468 )
...
#43281
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-23 15:20:53 +08:00
Xianhui Lin
c13393418c
fix: invalid string error when enabled json stats ( #43380 )
...
fix: invalid string error when enabled json stats
issue: https://github.com/milvus-io/milvus/issues/43151
Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-07-20 23:38:53 +08:00
Buqian Zheng
389104d200
enhance: rename PanicInfo to ThrowInfo ( #43384 )
...
issue: #41435
This is to prevent AI from thinking of our exception throwing as a
dangerous PANIC operation that terminates the program.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-19 20:22:52 +08:00
Spade A
26ec841feb
feat: optimize Like query with n-gram ( #41803 )
...
Ref #42053
This is the first PR for optimizing `LIKE` with an ngram inverted index.
Currently, only the VARCHAR data type and InnerMatch LIKE (%xxx%)
queries are supported.
How to use it:
```
from pymilvus import MilvusClient, DataType

milvus_client = MilvusClient("http://localhost:19530")
schema = milvus_client.create_schema()
...
schema.add_field("content_ngram", DataType.VARCHAR, max_length=10000)
...
index_params = milvus_client.prepare_index_params()
index_params.add_index(field_name="content_ngram", index_type="NGRAM", index_name="ngram_index", min_gram=2, max_gram=3)
milvus_client.create_collection(COLLECTION_NAME, ...)
```
min_gram and max_gram control how documents are tokenized. For
example, with min_gram=2 and max_gram=4, each document is tokenized
into 2-grams, 3-grams and 4-grams.
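The effect of the two parameters can be illustrated with a minimal sketch. It assumes plain character-level sliding windows; the helper name `ngrams` is hypothetical, and the tokenizer actually used by the index may differ in details such as how it handles multi-byte characters.

```python
def ngrams(text: str, min_gram: int, max_gram: int) -> list[str]:
    """Return every substring of length min_gram..max_gram, shortest first."""
    tokens = []
    for n in range(min_gram, max_gram + 1):
        # Slide a window of width n over the text; empty if the text is shorter than n.
        for start in range(len(text) - n + 1):
            tokens.append(text[start:start + n])
    return tokens


tokens = ngrams("milvus", 2, 4)
print(tokens)
```

For "milvus" with min_gram=2 and max_gram=4 this yields five 2-grams, four 3-grams and three 4-grams. An InnerMatch query such as `%lvu%` can then be answered by looking up the term "lvu" instead of scanning every document.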
---------
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-07-01 10:08:44 +08:00
Bingyi Sun
6c16d3dbee
enhance: Add bulk api for json data ( #42407 )
...
issue: https://github.com/milvus-io/milvus/issues/42409
---------
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-12 10:40:39 +08:00
cai.zhang
4ead8caaba
fix: prevent crash when contains_all/any is used with empty array ( #41739 )
...
issue: #41348
related and optimized by #41347
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: Sangho Park <hoyaspark@gmail.com>
2025-05-14 14:32:22 +08:00
Bingyi Sun
4c08090687
feat: Add json index support for json contains expr ( #41478 )
...
issue: #35528
---------
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-05-06 11:44:52 +08:00
Buqian Zheng
3de904c7ea
feat: add cachinglayer to sealed segment ( #41436 )
...
issue: https://github.com/milvus-io/milvus/issues/41435
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-04-28 10:52:40 +08:00
Xianhui Lin
3d4889586d
fix: JsonStats filter by conjunctExpr and improve the task slot calculation logic ( #41459 )
...
Optimized JSON filter execution by introducing
ProcessJsonStatsChunkPos() for unified position calculation and
GetNextBatchSize() for better batch processing.
Improved JSON key generation by replacing manual path joining with
milvus::Json::pointer() and adjusted slot size calculation for JSON key
index jobs.
Updated the task slot calculation logic in calculateStatsTaskSlot() to
handle the increased resource needs of JSON key index jobs.
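For context on the key-generation change, here is a minimal sketch of building a JSON Pointer from path segments, following the escaping rules of RFC 6901. The function name `json_pointer` is hypothetical, and whether milvus::Json::pointer() behaves exactly like this is an assumption, not something the commit message states.

```python
def json_pointer(segments: list[str]) -> str:
    """Join path segments into an RFC 6901 JSON Pointer string."""
    # "~" must be escaped before "/" so that the "~1" produced for "/" is not re-escaped.
    escaped = (seg.replace("~", "~0").replace("/", "~1") for seg in segments)
    return "".join("/" + seg for seg in escaped)


print(json_pointer(["a", "b/c", "d~e"]))
```

Naive joining with "/" would make the key `b/c` indistinguishable from the nested path `b` then `c`, which is the kind of ambiguity an escaping scheme avoids.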
issue: https://github.com/milvus-io/milvus/issues/41378
https://github.com/milvus-io/milvus/issues/41218
---------
Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-23 16:30:37 +08:00
Xianhui Lin
3bc24c264f
enhance: Add json key inverted index in stats for optimization ( #38039 )
...
Add json key inverted index in stats for optimization
https://github.com/milvus-io/milvus/issues/36995
---------
Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-04-10 15:20:28 +08:00
zhagnlu
6c55db44f1
enhance: reorder sub expr for conjunct expr ( #39872 )
...
Two points:
(1) Reorder the conjunct expr's subexprs to postpone heavy operations.
Sequence: int(column) -> index(column) -> string(column) -> light
conjunct ...... -> json(column) -> heavy conjunct -> two_column_compare
(2) Support pre-filtering during expr execution: skip scanning raw data
for rows that have already been excluded by the result of a preceding
expr.
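The two points can be sketched together as follows. This is an illustration only: the category labels, the `COST_RANK` table and both function names are hypothetical, and the categories elided by "......" in the sequence are not represented.

```python
# Hypothetical cost ranks; lower runs first. The labels are invented for this
# sketch and follow the order given in the commit message.
COST_RANK = {
    "int_column": 0,
    "indexed_column": 1,
    "string_column": 2,
    "light_conjunct": 3,
    "json_column": 4,
    "heavy_conjunct": 5,
    "two_column_compare": 6,
}


def reorder(subexprs):
    """Stable-sort (kind, predicate) pairs so that cheap kinds run first."""
    return sorted(subexprs, key=lambda item: COST_RANK[item[0]])


def evaluate_conjunct(subexprs, rows):
    """AND all predicates, skipping rows an earlier predicate already rejected."""
    active = [True] * len(rows)
    for _kind, predicate in reorder(subexprs):
        for i, row in enumerate(rows):
            if active[i]:  # pre-filter: never touch rows that are already False
                active[i] = predicate(row)
    return active


rows = [
    {"id": 1, "meta": {"tag": "a"}},
    {"id": 7, "meta": {"tag": "b"}},
    {"id": 9, "meta": {"tag": "a"}},
]
json_calls = []


def json_pred(row):
    json_calls.append(row["id"])  # record which rows the expensive predicate saw
    return row["meta"]["tag"] == "a"


subexprs = [("json_column", json_pred), ("int_column", lambda row: row["id"] > 5)]
result = evaluate_conjunct(subexprs, rows)
print(result, json_calls)
```

Although the JSON predicate is listed first, it runs second and only sees the two rows that passed the integer comparison.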
#39869
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-19 14:50:14 +08:00
zhagnlu
8a9f02ef71
enhance: optimize expr performance for some points ( #39695 )
...
1. Avoid fetching expr arguments, which deserializes the proto, on every
batch execution.
2. Replace unordered_set with a sorted array, which performs better for
small sets.
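The data-structure swap in point 2 can be sketched as a sorted array with binary-search membership. The class name `SortedArraySet` is hypothetical, and the performance claim applies to the C++ implementation; this Python version only shows the shape of the idea.

```python
from bisect import bisect_left


class SortedArraySet:
    """Immutable set backed by a sorted list; lookups use binary search."""

    def __init__(self, values):
        # Deduplicate once, then keep the elements contiguous and ordered.
        self._items = sorted(set(values))

    def __contains__(self, value) -> bool:
        pos = bisect_left(self._items, value)
        return pos < len(self._items) and self._items[pos] == value

    def __len__(self) -> int:
        return len(self._items)


terms = SortedArraySet([42, 7, 19, 7])
print(19 in terms, 20 in terms, len(terms))
```

For a handful of elements, a contiguous array tends to stay in cache and avoids hashing, which is the usual reason it can beat a hash set at small sizes.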
#39688
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-02-16 20:32:14 +08:00
Gao
994fc544e7
enhance: support iterative filter execution ( #37363 )
...
issue: #37360
---------
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-12-11 11:32:44 +08:00
smellthemoon
eb3e4583ec
enhance: all op(Null) is false in expr ( #35527 )
...
#31728
---------
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-17 21:14:30 +08:00
zhagnlu
3030e4625e
enhance: refactor variable column to reduce memory cost ( #33875 )
...
#33874
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-30 20:16:06 +08:00
Jiquan Long
0c5d8660aa
feat: support inverted index for array ( #33452 )
...
issue: https://github.com/milvus-io/milvus/issues/27704
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-31 09:47:47 +08:00
Cai Yudong
246586be27
enhance: Unify data type check APIs under internal/core ( #31800 )
...
Issue: #22837
Move and rename the following C++ APIs:
datatype_sizeof() ==> GetDataTypeSize()
datatype_name() ==> GetDataTypeName()
datatype_is_vector() / IsVectorType() ==> IsVectorDataType()
datatype_is_variable() ==> IsVariableDataType()
datatype_is_sparse_vector() ==> IsSparseFloatVectorDataType()
datatype_is_string() / IsString() ==> IsDataTypeString()
datatype_is_floating() / IsFloat() ==> IsDataTypeFloat()
datatype_is_binary() ==> IsDataTypeBinary()
datatype_is_json() ==> IsDataTypeJson()
datatype_is_array() ==> IsDataTypeArray()
datatype_is_variable() ==> IsDataTypeVariable()
datatype_is_integer() / IsIntegral() ==> IsDataTypeInteger()
Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-04-02 19:15:14 +08:00
Alexander Guzhva
c4b37fb285
enhance: Custom bitset and bitsetview prototypes ( #30454 )
...
Issue: #31285
Basically, I've replaced `FixedVector<bool>` and `boost::dynamic_bitset`
with custom bitset and bitsetview in order to reduce the memory
bandwidth & increase performance for the filtering.
This PR is for internal use only.
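As a rough illustration of why a packed bitset reduces memory traffic compared with one bool per element, here is a minimal sketch that stores 64 flags per word and combines filters word by word. The class and method names are hypothetical and do not reflect the API of the custom bitset introduced by this PR.

```python
class Bitset:
    """Fixed-size bitset that packs 64 flags into each integer word."""

    WORD_BITS = 64

    def __init__(self, size: int):
        self.size = size
        self.words = [0] * ((size + self.WORD_BITS - 1) // self.WORD_BITS)

    def set(self, index: int, value: bool = True) -> None:
        word, bit = divmod(index, self.WORD_BITS)
        if value:
            self.words[word] |= 1 << bit
        else:
            self.words[word] &= ~(1 << bit)

    def test(self, index: int) -> bool:
        word, bit = divmod(index, self.WORD_BITS)
        return ((self.words[word] >> bit) & 1) == 1

    def inplace_and(self, other: "Bitset") -> None:
        """Combine two filter results, processing 64 flags per operation."""
        assert self.size == other.size
        for i, word in enumerate(other.words):
            self.words[i] &= word

    def count(self) -> int:
        return sum(bin(word).count("1") for word in self.words)


a = Bitset(130)
b = Bitset(130)
for i in (0, 64, 129):
    a.set(i)
for i in (5, 64, 129):
    b.set(i)
a.inplace_and(b)
print(a.count(), a.test(64), a.test(0))
```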
Current progress (numbers are for GCC 9.5.0 on Ubuntu 22.04 LTS;
clang-17 produces better performance numbers):
Baseline:
```
[ RUN ] CApiTest.AssembeChunkPerfTest
start test
cost: 17903us
[ OK ] CApiTest.AssembeChunkPerfTest (183 ms)
[ RUN ] Expr.TestMultiLogicalExprsOptimization
cost: 1391us
cost: 5us
cost: 4us
cost: 4us
cost: 6us
cost: 4us
cost: 4us
cost: 4us
cost: 4us
cost: 4us
143
cost: 10us
cost: 8us
cost: 10us
cost: 8us
cost: 8us
cost: 8us
cost: 8us
cost: 8us
cost: 8us
cost: 9us
8
/home/ubuntu/zilliz/milvus4/milvus/internal/core/unittest/test_expr.cpp:1561: Failure
Expected: (cost_op) < (cost_no_op), actual: 143 vs 8
[ FAILED ] Expr.TestMultiLogicalExprsOptimization (7 ms)
[ RUN ] Expr.TestExprs
start test
3cost: 889us
start test
10cost: 2us
start test
20cost: 2us
start test
30cost: 2us
start test
50cost: 3us
start test
100cost: 7us
start test
200cost: 16us
[ OK ] Expr.TestExprs (9 ms)
[ RUN ] Expr.TestUnaryBenchTest
start test type:2
cost: 124.8us
start test type:3
cost: 163.1us
start test type:4
cost: 275.9us
start test type:5
cost: 590.9us
start test type:10
cost: 62.7us
start test type:11
cost: 65.9us
[ OK ] Expr.TestUnaryBenchTest (1153 ms)
[ RUN ] Expr.TestBinaryRangeBenchTest
start test type:2
cost: 151.4us
start test type:3
cost: 198.4us
start test type:4
cost: 361.9us
start test type:5
cost: 753.9us
start test type:10
cost: 64.6us
start test type:11
cost: 62.2us
[ OK ] Expr.TestBinaryRangeBenchTest (1151 ms)
[ RUN ] Expr.TestLogicalUnaryBenchTest
start test type:2
cost: 121.14us
start test type:3
cost: 156.84us
start test type:4
cost: 249.76us
start test type:5
cost: 534.44us
start test type:10
cost: 82.2us
start test type:11
cost: 83.52us
[ OK ] Expr.TestLogicalUnaryBenchTest (1202 ms)
[ RUN ] Expr.TestBinaryLogicalBenchTest
start test type:2
cost: 80.64us
start test type:3
cost: 78.22us
start test type:4
cost: 255.76us
start test type:5
cost: 532.04us
start test type:10
cost: 89.26us
start test type:11
cost: 90us
[ OK ] Expr.TestBinaryLogicalBenchTest (1198 ms)
[ RUN ] Expr.TestBinaryArithOpEvalRangeBenchExpr
start test type:2
cost: 401.7us
start test type:3
cost: 420.96us
start test type:4
cost: 418.04us
start test type:5
cost: 470.54us
start test type:10
cost: 250.32us
start test type:11
cost: 850.08us
[ OK ] Expr.TestBinaryArithOpEvalRangeBenchExpr (1273 ms)
[ RUN ] Expr.TestCompareExprBenchTest
start test type:2
cost: 162us
start test type:3
cost: 142us
start test type:4
cost: 374us
start test type:5
cost: 674us
start test type:10
cost: 366us
start test type:11
cost: 645us
[ OK ] Expr.TestCompareExprBenchTest (1214 ms)
[ RUN ] Expr.TestRefactorExprs
start test
3cost: 1253us
start test
10cost: 1060us
start test
20cost: 681us
start test
30cost: 522us
start test
50cost: 511us
start test
100cost: 506us
start test
200cost: 497us
[ OK ] Expr.TestRefactorExprs (1142 ms)
```
Candidate:
```
[ RUN ] CApiTest.AssembeChunkPerfTest
start test
cost: 6099us
[ OK ] CApiTest.AssembeChunkPerfTest (153 ms)
[ RUN ] Expr.TestMultiLogicalExprsOptimization
cost: 42us
cost: 15us
cost: 15us
cost: 14us
cost: 15us
cost: 15us
cost: 15us
cost: 15us
cost: 15us
cost: 15us
17
cost: 41us
cost: 39us
cost: 33us
cost: 33us
cost: 33us
cost: 33us
cost: 34us
cost: 41us
cost: 34us
cost: 34us
35
[ OK ] Expr.TestMultiLogicalExprsOptimization (6 ms)
[ RUN ] Expr.TestExprs
start test
3cost: 20us
start test
10cost: 2us
start test
20cost: 2us
start test
30cost: 2us
start test
50cost: 4us
start test
100cost: 8us
start test
200cost: 15us
[ OK ] Expr.TestExprs (8 ms)
[ RUN ] Expr.TestUnaryBenchTest
start test type:2
cost: 55.7us
start test type:3
cost: 79.8us
start test type:4
cost: 177.6us
start test type:5
cost: 337.2us
start test type:10
cost: 16.9us
start test type:11
cost: 15.7us
[ OK ] Expr.TestUnaryBenchTest (1140 ms)
[ RUN ] Expr.TestBinaryRangeBenchTest
start test type:2
cost: 57.1us
start test type:3
cost: 87us
start test type:4
cost: 177.5us
start test type:5
cost: 342.7us
start test type:10
cost: 17.9us
start test type:11
cost: 16.7us
[ OK ] Expr.TestBinaryRangeBenchTest (1152 ms)
[ RUN ] Expr.TestLogicalUnaryBenchTest
start test type:2
cost: 34.58us
start test type:3
cost: 68.86us
start test type:4
cost: 151.38us
start test type:5
cost: 286.8us
start test type:10
cost: 16.54us
start test type:11
cost: 16.7us
[ OK ] Expr.TestLogicalUnaryBenchTest (1165 ms)
[ RUN ] Expr.TestBinaryLogicalBenchTest
start test type:2
cost: 20us
start test type:3
cost: 17.1us
start test type:4
cost: 154.12us
start test type:5
cost: 286.1us
start test type:10
cost: 19.6us
start test type:11
cost: 19.24us
[ OK ] Expr.TestBinaryLogicalBenchTest (1188 ms)
[ RUN ] Expr.TestBinaryArithOpEvalRangeBenchExpr
start test type:2
cost: 125.7us
start test type:3
cost: 111.34us
start test type:4
cost: 148.02us
start test type:5
cost: 306.7us
start test type:10
cost: 149.3us
start test type:11
cost: 282.94us
[ OK ] Expr.TestBinaryArithOpEvalRangeBenchExpr (1221 ms)
[ RUN ] Expr.TestCompareExprBenchTest
start test type:2
cost: 89us
start test type:3
cost: 79us
start test type:4
cost: 323us
start test type:5
cost: 629us
start test type:10
cost: 313us
start test type:11
cost: 591us
[ OK ] Expr.TestCompareExprBenchTest (1228 ms)
[ RUN ] Expr.TestRefactorExprs
start test
3cost: 874us
start test
10cost: 611us
start test
20cost: 290us
start test
30cost: 294us
start test
50cost: 272us
start test
100cost: 278us
start test
200cost: 279us
[ OK ] Expr.TestRefactorExprs (1149 ms)
```
Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2024-03-24 21:49:07 +08:00
yah01
8f89e9cf75
enhance: remove all unnecessary string formatting ( #29323 )
...
Done with two regular expressions:
- `PanicInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)`
- `AssertInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)`
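A minimal sketch of how the first pattern could be applied. The commit message gives only the search patterns, so the replacement string `PanicInfo(\1, \2)`, the function name and the sample input below are assumptions made for illustration.

```python
import re

# Search pattern copied from the commit message.
PATTERN = re.compile(r"PanicInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)")


def strip_fmt_format(source: str) -> str:
    """Pass the format string and its arguments to PanicInfo directly."""
    # Assumed replacement: keep the error code (group 1) and the
    # fmt::format arguments (group 2), drop the fmt::format wrapper.
    return PATTERN.sub(r"PanicInfo(\1, \2)", source)


before = (
    "PanicInfo(DataTypeInvalid,\n"
    '          fmt::format("unsupported data type {}", type));'
)
after = strip_fmt_format(before)
print(after)
```

The lazy second group stops at the first `))`, so a format call whose arguments themselves end in a closing parenthesis would be cut short and would need manual review.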
related: #28811
---------
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-20 10:04:43 +08:00
zhagnlu
a602171d06
enhance: Refactor runtime and expr framework ( #28166 )
...
#28165
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2023-12-18 12:04:42 +08:00