55 Commits

Author SHA1 Message Date
Cai Yudong
246586be27
enhance: Unify data type check APIs under internal/core (#31800)
Issue: #22837 

Move and rename following C++ APIs:
datatype_sizeof() ==> GetDataTypeSize()
datatype_name() ==> GetDataTypeName()
datatype_is_vector() / IsVectorType() ==> IsVectorDataType()
datatype_is_variable() ==> IsVariableDataType()
datatype_is_sparse_vector() ==> IsSparseFloatVectorDataType()
datatype_is_string() / IsString() ==> IsDataTypeString()
datatype_is_floating() / IsFloat() ==> IsDataTypeFloat()
datatype_is_binary() ==> IsDataTypeBinary()
datatype_is_json() ==> IsDataTypeJson()
datatype_is_array() ==> IsDataTypeArray()
datatype_is_variable() ==> IsDataTypeVariable()
datatype_is_integer() / IsIntegral() ==> IsDataTypeInteger()
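
To illustrate the naming convention in the list above, here is a minimal sketch of what a unified set of data type checks could look like. Only the function names come from this commit message; the reduced `DataType` enum, the function bodies, and the `dim` parameter are assumptions made for illustration and do not reproduce the code under internal/core.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical, reduced enum for illustration only.
enum class DataType {
    BOOL, INT8, INT16, INT32, INT64, FLOAT, DOUBLE,
    VARCHAR, JSON, ARRAY, VECTOR_BINARY, VECTOR_FLOAT,
};

inline bool IsDataTypeInteger(DataType t) {
    return t == DataType::INT8 || t == DataType::INT16 ||
           t == DataType::INT32 || t == DataType::INT64;
}

inline bool IsDataTypeFloat(DataType t) {
    return t == DataType::FLOAT || t == DataType::DOUBLE;
}

inline bool IsDataTypeString(DataType t) {
    return t == DataType::VARCHAR;
}

inline bool IsVectorDataType(DataType t) {
    return t == DataType::VECTOR_BINARY || t == DataType::VECTOR_FLOAT;
}

// Variable-length types have no fixed per-element size.
inline bool IsVariableDataType(DataType t) {
    return t == DataType::VARCHAR || t == DataType::JSON ||
           t == DataType::ARRAY;
}

// Size in bytes of one element; `dim` (an assumed parameter) only
// matters for vector types. Must not be called for variable-length types.
inline std::size_t GetDataTypeSize(DataType t, std::size_t dim = 1) {
    assert(!IsVariableDataType(t));
    switch (t) {
        case DataType::BOOL:          return sizeof(bool);
        case DataType::INT8:          return sizeof(std::int8_t);
        case DataType::INT16:         return sizeof(std::int16_t);
        case DataType::INT32:         return sizeof(std::int32_t);
        case DataType::INT64:         return sizeof(std::int64_t);
        case DataType::FLOAT:         return sizeof(float);
        case DataType::DOUBLE:        return sizeof(double);
        case DataType::VECTOR_FLOAT:  return sizeof(float) * dim;
        case DataType::VECTOR_BINARY: return dim / 8;  // one bit per dimension
        default:                      return 0;
    }
}

inline std::string GetDataTypeName(DataType t) {
    switch (t) {
        case DataType::BOOL:          return "BOOL";
        case DataType::INT8:          return "INT8";
        case DataType::INT16:         return "INT16";
        case DataType::INT32:         return "INT32";
        case DataType::INT64:         return "INT64";
        case DataType::FLOAT:         return "FLOAT";
        case DataType::DOUBLE:        return "DOUBLE";
        case DataType::VARCHAR:       return "VARCHAR";
        case DataType::JSON:          return "JSON";
        case DataType::ARRAY:         return "ARRAY";
        case DataType::VECTOR_BINARY: return "VECTOR_BINARY";
        case DataType::VECTOR_FLOAT:  return "VECTOR_FLOAT";
    }
    return "UNKNOWN";
}
```

Keeping every predicate as a free function over a single enum means callers need only one header, which appears to be the point of unifying the older `datatype_is_*()` and `Is*()` variants.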

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-04-02 19:15:14 +08:00
SimFG
b1a1cca10b
feat: add more operation detail info for better allocation (#30438)
issue: #30436

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-03-28 06:33:11 +08:00
Alexander Guzhva
c4b37fb285
enhance: Custom bitset and bitsetview prototypes (#30454)
Issue: #31285 

I've replaced `FixedVector<bool>` and `boost::dynamic_bitset` with a
custom bitset and bitsetview in order to reduce memory bandwidth and
improve filtering performance.

This PR is for internal use only. 
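
As a rough illustration of the replacement described above, the following sketch packs flags into 64-bit words and exposes a non-owning view over a sub-range. The class names `PackedBitset` and `PackedBitsetView` and all of their members are hypothetical and are not the types introduced by this PR. The sketch assumes that the bandwidth saving comes from storing one bit per flag and from combining filters one word at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical non-owning view over a bit range inside packed 64-bit words.
class PackedBitsetView {
 public:
    PackedBitsetView(std::uint64_t* words, std::size_t offset, std::size_t size)
        : words_(words), offset_(offset), size_(size) {}

    std::size_t size() const { return size_; }

    bool test(std::size_t i) const {
        assert(i < size_);
        const std::size_t pos = offset_ + i;
        return (words_[pos / 64] >> (pos % 64)) & 1u;
    }

    void set(std::size_t i, bool value = true) {
        assert(i < size_);
        const std::size_t pos = offset_ + i;
        const std::uint64_t mask = std::uint64_t{1} << (pos % 64);
        if (value) {
            words_[pos / 64] |= mask;
        } else {
            words_[pos / 64] &= ~mask;
        }
    }

    // Number of set bits in the viewed range (simple bit-by-bit loop).
    std::size_t count() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < size_; ++i) {
            n += test(i) ? 1 : 0;
        }
        return n;
    }

 private:
    std::uint64_t* words_;
    std::size_t offset_;
    std::size_t size_;
};

// Hypothetical owning bitset: one bit per flag, zero-initialized.
class PackedBitset {
 public:
    explicit PackedBitset(std::size_t size)
        : words_((size + 63) / 64, 0), size_(size) {}

    std::size_t size() const { return size_; }

    // View over the whole bitset.
    PackedBitsetView view() {
        return PackedBitsetView(words_.data(), 0, size_);
    }

    // View over [offset, offset + size), without copying any bits.
    PackedBitsetView view(std::size_t offset, std::size_t size) {
        assert(offset + size <= size_);
        return PackedBitsetView(words_.data(), offset, size);
    }

    // In-place AND, processing 64 flags per memory access.
    void and_with(const PackedBitset& other) {
        assert(size_ == other.size_);
        for (std::size_t i = 0; i < words_.size(); ++i) {
            words_[i] &= other.words_[i];
        }
    }

 private:
    std::vector<std::uint64_t> words_;
    std::size_t size_;
};
```

A filter step could, for example, call `and_with()` to intersect two predicate results and then hand `view(offset, size)` for one chunk to the next stage without copying.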

Current progress (numbers are for GCC 9.5.0 on Ubuntu 22.04 LTS;
clang-17 produces better performance numbers):
Baseline:
```
[ RUN      ] CApiTest.AssembeChunkPerfTest
start test
cost: 17903us
[       OK ] CApiTest.AssembeChunkPerfTest (183 ms)

[ RUN      ] Expr.TestMultiLogicalExprsOptimization
cost: 1391us
cost: 5us
cost: 4us
cost: 4us
cost: 6us
cost: 4us
cost: 4us
cost: 4us
cost: 4us
cost: 4us
143
cost: 10us
cost: 8us
cost: 10us
cost: 8us
cost: 8us
cost: 8us
cost: 8us
cost: 8us
cost: 8us
cost: 9us
8
/home/ubuntu/zilliz/milvus4/milvus/internal/core/unittest/test_expr.cpp:1561: Failure
Expected: (cost_op) < (cost_no_op), actual: 143 vs 8
[  FAILED  ] Expr.TestMultiLogicalExprsOptimization (7 ms)
[ RUN      ] Expr.TestExprs
start test
3cost: 889us
start test
10cost: 2us
start test
20cost: 2us
start test
30cost: 2us
start test
50cost: 3us
start test
100cost: 7us
start test
200cost: 16us
[       OK ] Expr.TestExprs (9 ms)

[ RUN      ] Expr.TestUnaryBenchTest
start test type:2
 cost: 124.8us
start test type:3
 cost: 163.1us
start test type:4
 cost: 275.9us
start test type:5
 cost: 590.9us
start test type:10
 cost: 62.7us
start test type:11
 cost: 65.9us
[       OK ] Expr.TestUnaryBenchTest (1153 ms)
[ RUN      ] Expr.TestBinaryRangeBenchTest
start test type:2
 cost: 151.4us
start test type:3
 cost: 198.4us
start test type:4
 cost: 361.9us
start test type:5
 cost: 753.9us
start test type:10
 cost: 64.6us
start test type:11
 cost: 62.2us
[       OK ] Expr.TestBinaryRangeBenchTest (1151 ms)
[ RUN      ] Expr.TestLogicalUnaryBenchTest
start test type:2
 cost: 121.14us
start test type:3
 cost: 156.84us
start test type:4
 cost: 249.76us
start test type:5
 cost: 534.44us
start test type:10
 cost: 82.2us
start test type:11
 cost: 83.52us
[       OK ] Expr.TestLogicalUnaryBenchTest (1202 ms)
[ RUN      ] Expr.TestBinaryLogicalBenchTest
start test type:2
 cost: 80.64us
start test type:3
 cost: 78.22us
start test type:4
 cost: 255.76us
start test type:5
 cost: 532.04us
start test type:10
 cost: 89.26us
start test type:11
 cost: 90us
[       OK ] Expr.TestBinaryLogicalBenchTest (1198 ms)
[ RUN      ] Expr.TestBinaryArithOpEvalRangeBenchExpr
start test type:2
 cost: 401.7us
start test type:3
 cost: 420.96us
start test type:4
 cost: 418.04us
start test type:5
 cost: 470.54us
start test type:10
 cost: 250.32us
start test type:11
 cost: 850.08us
[       OK ] Expr.TestBinaryArithOpEvalRangeBenchExpr (1273 ms)
[ RUN      ] Expr.TestCompareExprBenchTest
start test type:2
 cost: 162us
start test type:3
 cost: 142us
start test type:4
 cost: 374us
start test type:5
 cost: 674us
start test type:10
 cost: 366us
start test type:11
 cost: 645us
[       OK ] Expr.TestCompareExprBenchTest (1214 ms)
[ RUN      ] Expr.TestRefactorExprs
start test
3cost: 1253us
start test
10cost: 1060us
start test
20cost: 681us
start test
30cost: 522us
start test
50cost: 511us
start test
100cost: 506us
start test
200cost: 497us
[       OK ] Expr.TestRefactorExprs (1142 ms)

```

Candidate:
```
[ RUN      ] CApiTest.AssembeChunkPerfTest
start test
cost: 6099us
[       OK ] CApiTest.AssembeChunkPerfTest (153 ms)

[ RUN      ] Expr.TestMultiLogicalExprsOptimization
cost: 42us
cost: 15us
cost: 15us
cost: 14us
cost: 15us
cost: 15us
cost: 15us
cost: 15us
cost: 15us
cost: 15us
17
cost: 41us
cost: 39us
cost: 33us
cost: 33us
cost: 33us
cost: 33us
cost: 34us
cost: 41us
cost: 34us
cost: 34us
35
[       OK ] Expr.TestMultiLogicalExprsOptimization (6 ms)
[ RUN      ] Expr.TestExprs
start test
3cost: 20us
start test
10cost: 2us
start test
20cost: 2us
start test
30cost: 2us
start test
50cost: 4us
start test
100cost: 8us
start test
200cost: 15us
[       OK ] Expr.TestExprs (8 ms)

[ RUN      ] Expr.TestUnaryBenchTest
start test type:2
 cost: 55.7us
start test type:3
 cost: 79.8us
start test type:4
 cost: 177.6us
start test type:5
 cost: 337.2us
start test type:10
 cost: 16.9us
start test type:11
 cost: 15.7us
[       OK ] Expr.TestUnaryBenchTest (1140 ms)
[ RUN      ] Expr.TestBinaryRangeBenchTest
start test type:2
 cost: 57.1us
start test type:3
 cost: 87us
start test type:4
 cost: 177.5us
start test type:5
 cost: 342.7us
start test type:10
 cost: 17.9us
start test type:11
 cost: 16.7us
[       OK ] Expr.TestBinaryRangeBenchTest (1152 ms)
[ RUN      ] Expr.TestLogicalUnaryBenchTest
start test type:2
 cost: 34.58us
start test type:3
 cost: 68.86us
start test type:4
 cost: 151.38us
start test type:5
 cost: 286.8us
start test type:10
 cost: 16.54us
start test type:11
 cost: 16.7us
[       OK ] Expr.TestLogicalUnaryBenchTest (1165 ms)
[ RUN      ] Expr.TestBinaryLogicalBenchTest
start test type:2
 cost: 20us
start test type:3
 cost: 17.1us
start test type:4
 cost: 154.12us
start test type:5
 cost: 286.1us
start test type:10
 cost: 19.6us
start test type:11
 cost: 19.24us
[       OK ] Expr.TestBinaryLogicalBenchTest (1188 ms)
[ RUN      ] Expr.TestBinaryArithOpEvalRangeBenchExpr
start test type:2
 cost: 125.7us
start test type:3
 cost: 111.34us
start test type:4
 cost: 148.02us
start test type:5
 cost: 306.7us
start test type:10
 cost: 149.3us
start test type:11
 cost: 282.94us
[       OK ] Expr.TestBinaryArithOpEvalRangeBenchExpr (1221 ms)
[ RUN      ] Expr.TestCompareExprBenchTest
start test type:2
 cost: 89us
start test type:3
 cost: 79us
start test type:4
 cost: 323us
start test type:5
 cost: 629us
start test type:10
 cost: 313us
start test type:11
 cost: 591us
[       OK ] Expr.TestCompareExprBenchTest (1228 ms)
[ RUN      ] Expr.TestRefactorExprs
start test
3cost: 874us
start test
10cost: 611us
start test
20cost: 290us
start test
30cost: 294us
start test
50cost: 272us
start test
100cost: 278us
start test
200cost: 279us
[       OK ] Expr.TestRefactorExprs (1149 ms)

```

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2024-03-24 21:49:07 +08:00
xige-16
e9fdd2475d
fix: fix searchPlan metricType modified concurrently (#30227)
issue: #30225
/kind bug
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-01-26 14:03:09 +08:00
yah01
f542bdbf3c
enhance: calc the accurate mem size of segment (#30093)
This reports the real memory size of a segment and also reduces memory
usage in mmap mode.
resolve #30095

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-19 12:32:53 +08:00
xige-16
fa7cf587b0
enhance: Optimize the error message for a metric type mismatch (#29927)
issue: #29791 
/kind improvement
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-01-17 20:25:03 +08:00
zhenshan.cao
60e88fb833
fix: Restore the MVCC functionality. (#29749)
When the TimeTravel functionality was previously removed, it
inadvertently affected the MVCC functionality within the system. This PR
aims to reintroduce the internal MVCC functionality as follows:

1. Add MvccTimestamp to the requests of Search/Query and the results of
Search internally.
2. When the delegator receives a Query/Search request and there is no
MVCC timestamp set in the request, set the delegator's current tsafe as
the MVCC timestamp of the request. If the request already has an MVCC
timestamp, do not modify it.
3. When the Proxy handles Search and triggers the second phase ReQuery,
divide the ReQuery into different shards and pass the MVCC timestamp to
the corresponding Query requests.
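
The rule in step 2 can be sketched as follows. This is an illustration only: the `QueryRequest` struct, the `AssignMvccTimestamp` function, and the convention that `0` means "no MVCC timestamp set" are assumptions made for the example, not names or conventions confirmed by this PR.

```cpp
#include <cassert>
#include <cstdint>

using Timestamp = std::uint64_t;

// Hypothetical request type; 0 is assumed to mean "no MVCC timestamp set".
struct QueryRequest {
    Timestamp mvcc_timestamp = 0;
};

// Step 2: fill in the delegator's current tsafe only when the request
// does not already carry an MVCC timestamp; never overwrite an existing one.
inline void AssignMvccTimestamp(QueryRequest& req, Timestamp delegator_tsafe) {
    assert(delegator_tsafe != 0);  // a valid tsafe is assumed nonzero here
    if (req.mvcc_timestamp == 0) {
        req.mvcc_timestamp = delegator_tsafe;
    }
}
```

Because an existing timestamp is left untouched, a second-phase ReQuery that carries the timestamp from the first phase reads the same snapshot as the original Search.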

issue: #29656

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-01-09 11:38:48 +08:00
MrPresent-Han
836f300536
support skip-index based on chunk-metrics to accelerate expr filter(#27925) (#28297)
related: #27925

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-11-15 11:20:19 +08:00
Enwei Jiao
b80a3e19d3
Add code for PanicInfo (#27364)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-27 12:01:28 +08:00
cai.zhang
a362bb1457
Support array datatype (#26369)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-09-19 14:23:23 +08:00
Enwei Jiao
0afdfdb9af
Remove other Exceptions, keeps SegcoreError only (#27017)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-14 14:05:20 +08:00
MrPresent-Han
7d5a4b2994
add more event for segcore search(#26277) (#26688)
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-08-30 14:15:01 +08:00
yah01
9605c03c3c
Fix incorrect number of rows of column (#26347)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-08-16 13:35:33 +08:00
Enwei Jiao
ca1349708b
Remove time travel related testcase (#26119)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-08-10 18:53:17 +08:00
xige-16
1055c90456
Add default retrieve limit (#24782)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2023-08-10 14:11:15 +08:00
Jiquan Long
bafb183a2b
Optimize bitset usage (#26096)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-08-03 15:25:09 +08:00
Jiquan Long
5c1f79dc54
Push down the limit operator to segcore (#25959)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-08-01 20:29:05 +08:00
yah01
546080dcdd
Support to retrieve json (#23563)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-04-21 11:46:32 +08:00
yah01
005d178a0e
Optimize performance of insert & query & search (#22829)
- Reduce 1x copy of inserting int8/int16 into growing segment
- Reduce 1x copy of retrieving primary keys
- Reduce 1x copy of inserting/loading/deleting/filtering primary keys
- Reduce 1x copy of reducing string results

Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-03-20 10:19:56 +08:00
Jiquan Long
8139106b51
Feat: count entities by expression (#22765)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-03-16 19:31:55 +08:00
yah01
bdd6bc7695
Re-format cpp code (#22513)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-03-02 15:55:49 +08:00
yah01
7478e44911
Support using mmap to load data (#22052)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-03-01 18:07:49 +08:00
smellthemoon
f5ab719f21
Let the timestamp decide when the pks are the same (#20166)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2022-11-21 10:55:10 +08:00
Jeng.Gwan
638f6c36e9
Support to get real row count of segment (#18115)
Signed-off-by: xaxys <zheng.guan@zilliz.com>
2022-07-18 09:58:28 +08:00
bigsheeper
b657e58370
Remove temporary variable to prevent memory fragmentation (#17728)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2022-06-28 16:30:23 +08:00
bigsheeper
f38637c227
Pass PlaceholderGroup pointer to prevent memory copy in SegCore (#17389)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2022-06-06 21:34:05 +08:00
Cai Yudong
d5db4ae463
Merge utils/Types.h with common/Types.h (#16445)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-05-02 22:21:51 +08:00
xige-16
515d0369de
Support string type in segcore (#16546)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
Co-authored-by: dragondriver <jiquan.long@zilliz.com>
2022-04-29 13:35:49 +08:00
xige-16
27b4cbc098
Cherry-pick the commit that removes translateHits to master (#16436)
Signed-off-by: xige-16 <xi.ge@zilliz.com>

Co-authored-by: bigsheeper <yihao.dai@zilliz.com>
2022-04-08 20:27:31 +08:00
Cai Yudong
d40af885b9
Update header files for segcore/SegmentInterface.cpp (#12770)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-12-07 10:17:51 +08:00
Cai Yudong
cbb01051f0
Update Search return type (#12578)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-12-02 11:45:32 +08:00
Cai Yudong
8f1e75718c
Clean code (#12433)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-11-30 18:37:41 +08:00
Cai Yudong
cb952d6036
Rename SearchResult fields for better readability (#12327)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-11-29 14:57:18 +08:00
Cai Yudong
a35db8eda0
Optimize retrieve to use batch mode assignment (#11647)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-11-12 10:06:44 +08:00
Cai Yudong
db2a0a3bd3
Fix reduce panic (#11325)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-11-05 18:17:00 +08:00
Cai Yudong
48648c818b
Remove duplicated search results in segcore reduce (#10117)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-10-20 15:40:49 +08:00
yukun
5c997710ab
Change Assert() to AssertInfo() to return error messages (#7843)
Signed-off-by: fishpenguin <kun.yu@zilliz.com>
2021-09-14 10:06:40 +08:00
yukun
94272bba87
Support query by expression (#7386)
Signed-off-by: fishpenguin <kun.yu@zilliz.com>
2021-09-03 17:12:55 +08:00
Cai Yudong
e771bda92f
optimize retrieve output vector code structure (#7102)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-08-14 11:18:10 +08:00
Cai Yudong
6c75301c70
optimize search reduce logic (#7066)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-08-12 18:00:11 +08:00
Cai Yudong
a992dcf6a8
Support query return vector output field (#6570)
* improve code readability

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* add offset in RetrieveResults

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* add VectorFieldInfo into Segment struct

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* add new interface for query vector

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* update load vector field logic

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* update load vector field logic

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* fill in field name in query result

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* add FieldId into FieldData

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* add fillVectorOutputFieldsIfNeeded

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* update data_codec_test.go

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* add DeserializeFieldData

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* realize query return vector output field

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* fix static-check

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* disable query vector case

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-07-16 17:19:55 +08:00
Cai Yudong
724f10b9a0
Unify the usage of query and search (#6467)
Unify the usage of query and search

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-07-13 22:20:33 +08:00
FluorineDog
bf8b2be4a7
Deprecate num_groups to simplify search API (#6230)
Signed-off-by: fluorinedog <fluorinedog@gmail.com>
2021-07-01 10:32:15 +08:00
FluorineDog
bec9f2c182
Split segcore and plan proto for future feature (#5767)
* Split segcore and plan proto for future feature

Signed-off-by: fluorinedog <fluorinedog@gmail.com>

* lint

Signed-off-by: fluorinedog <fluorinedog@gmail.com>
2021-06-15 14:43:57 +08:00
FluorineDog
006dae35c3
fix retrieve bug (#5727)
Signed-off-by: fluorinedog <fluorinedog@gmail.com>
2021-06-11 00:06:55 +08:00
FluorineDog
9a90313390
Support GetEntityByIDs in CGo, fix segcore bugs (#5563)
Signed-off-by: fluorinedog <fluorinedog@gmail.com>
2021-06-04 10:38:34 +08:00
FluorineDog
b1a9aea6a6
support get entity by ids in segcore (#5456)
Signed-off-by: fluorinedog <fluorinedog@gmail.com>
2021-05-28 10:39:30 +08:00
xige-16
a6f1de036b
Optimize search performance in query node
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2021-03-30 22:16:58 +08:00
xige-16
2ca53fa668
Fix msgstream deadlock when loadCollection
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2021-02-07 15:47:10 +08:00
cai.zhang
f940cc455a
Add dockerfile for index
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2021-01-26 09:38:40 +08:00