80 Commits

Author SHA1 Message Date
Buqian Zheng
3de904c7ea
feat: add cachinglayer to sealed segment (#41436)
issue: https://github.com/milvus-io/milvus/issues/41435

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-04-28 10:52:40 +08:00
Chun Han
12cde913b5
fix: fail to get string views due to chunk bound empty loop(#41300) (#41452)
related: #41300

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-04-27 10:40:38 +08:00
Xianhui Lin
3d4889586d
fix: JsonStats filter by conjunctExpr and improve the task slot calculation logic (#41459)
Optimized JSON filter execution by introducing
ProcessJsonStatsChunkPos() for unified position calculation and
GetNextBatchSize() for better batch processing.
Improved JSON key generation by replacing manual path joining with
milvus::Json::pointer() and adjusted slot size calculation for JSON key
index jobs.
Updated the task slot calculation logic in calculateStatsTaskSlot() to
handle the increased resource needs of JSON key index jobs.
issue: https://github.com/milvus-io/milvus/issues/41378
https://github.com/milvus-io/milvus/issues/41218

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-23 16:30:37 +08:00
Bingyi Sun
a953eaeaf0
enhance: support binary range expression for json path index (#41025)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-04-15 19:32:33 +08:00
Bingyi Sun
bf617115ca
enhance: Remove single chunk segment related codes (#39249)
https://github.com/milvus-io/milvus/issues/39112

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-04-11 18:56:29 +08:00
Spade A
9ce3e3cb44
enhance: add documents in batch for json key stats (#41228)
issue: https://github.com/milvus-io/milvus/issues/40897

After this, the document add operations scheduling duration is decreased
roughly from 6s to 0.9s for the case in the issue.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-04-11 14:08:26 +08:00
Xianhui Lin
3bc24c264f
enhance: Add json key inverted index in stats for optimization (#38039)
Add json key inverted index in stats for optimization
https://github.com/milvus-io/milvus/issues/36995

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-04-10 15:20:28 +08:00
zhagnlu
ee1faf80dd
fix:add clear bitmap for batch skip mode (#41166)
#41086 #41150

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-04-09 13:08:27 +08:00
congqixia
e2d8adb963
fix: Use element_type for Array is null operator (#41157)
Related to #41156

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-09 10:16:24 +08:00
Bingyi Sun
da21640ac3
fix: Fix the bug that null data can not be filtered by null expr (#41124)
issue: https://github.com/milvus-io/milvus/issues/41063

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-04-08 19:12:24 +08:00
Bingyi Sun
fcb03b5bd1
feat: add json null/exists expression (#41004)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-04-03 17:48:21 +08:00
Chun Han
afa519b4c7
fix: array is null failed(#40686) (#41027)
related: #40686

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-04-02 18:20:22 +08:00
Bingyi Sun
15ec7bae4d
fix: Fix using json index when iterative_filter is specified (#40945)
issue: #40934

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-31 15:26:19 +08:00
zhagnlu
87e7d6d79f
fix:fix exception when do arith expr with using index (#40794)
#40783

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-27 11:10:21 +08:00
zhagnlu
7fdb2e144f
enhance:change multi or expr to in expr (#40757)
#40752

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-25 11:06:18 +08:00
zhagnlu
6c55db44f1
enhance: reorder sub expr for conjunct expr (#39872)
two point:
 (1) reoder conjucts expr's subexpr, postpone heavy operations
sequence: int(column) -> index(column) -> string(column) -> light
conjuct
...... -> json(column) -> heavy conjuct -> two_column_compare
(2) support pre filter for expr execute, skip scan raw data that had
been skipped
     because of preceding expr result.

#39869

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-03-19 14:50:14 +08:00
Bingyi Sun
d3adab15ac
fix: Build double index for all json numeric field (#40619)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-14 16:52:11 +08:00
Bingyi Sun
7040ba1c12
enhance: make json path index support term filter (#40140)
issue: #35528

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-03-04 11:56:02 +08:00
Chun Han
259f9106ad
enhance: refine variable-length-type memory usage(#38736) (#39578)
related: #38736

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-02-27 21:13:58 +08:00
Bingyi Sun
f05e9628f6
fix: Fix search failure of null expression (#40129)
issue: #40095

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-25 20:43:55 +08:00
Bingyi Sun
db4769281c
fix: Fall back to a brute-force search if json index type unmatched (#40076)
issue: https://github.com/milvus-io/milvus/issues/35528
If the query data type does not match the index type, fall back to a
brute-force search

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-24 16:25:57 +08:00
congqixia
7ccde3300e
fix: Use text_log prefix for TextMatchIndex null offset file (#39935)
Related to #39933

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-17 20:17:25 +08:00
zhagnlu
8a9f02ef71
enhance: optimize expr performace for some points (#39695)
1. skip get expr arguments which deserialize proto for every batch
execute.
2. replace unordered_set with sort array that has better performace for
small set.

#39688

Co-authored-by: luzhang <luzhang@zilliz.com>
2025-02-16 20:32:14 +08:00
Bingyi Sun
b59555057d
feat: support json index (#36750)
https://github.com/milvus-io/milvus/issues/35528

This PR adds json index support for json and dynamic fields. Now you can
only do unary query like 'a["b"] > 1' using this index. We will support
more filter type later.

basic usage:
```
collection.create_index("json_field", {"index_type": "INVERTED",
    "params": {"json_cast_type": DataType.STRING, "json_path":
'json_field["a"]["b"]'}})
```

There are some limits to use this index:
1. If a record does not have the json path you specify, it will be
ignored and there will not be an error.
2. If a value of the json path fails to be cast to the type you specify,
it will be ignored and there will not be an error.
3. A specific json path can have only one json index.
4. If you try to create more than one json indexes for one json field,
sdk(pymilvus<=2.4.7) may return immediately because of internal
implementation. This will be fixed in a later version.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-15 14:06:15 +08:00
Spade A
f7d9587720
enhance: add tantivy collector for i64 (#39850)
issue: #39852

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-02-14 15:50:15 +08:00
cai.zhang
9e6e477c5d
fix: Fix modulo for long type (#39722)
issue: #39640

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-02-11 20:04:46 +08:00
Spade A
0461ddf776
fix: phrase match does not support offset input (#39338)
fix: #39337

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-01-16 22:05:01 +08:00
Spade A
032292a432
feat: support phrase match query (#38869)
The relevant issue: https://github.com/milvus-io/milvus/issues/38930

---------

Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-01-12 20:24:58 +08:00
congqixia
182cac03e5
enhance: Use bitset or instead of bitwise set (#39037)
Related to #39003

Copying bitset value bit by bit is slow and CPU heavy, this PR utilizes
bitset operator "|=" to accelerate this procedure

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-07 15:02:56 +08:00
smellthemoon
907fc24f85
enhance: support null expr (#38772)
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-01-02 14:16:54 +08:00
Ted Xu
acc8fb7af6
enhance: eliminate compile warnings (part2) (#38535)
See #38435

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-12-25 15:30:50 +08:00
Ted Xu
4919ccf543
enhance: eliminate compile warnings (#38420)
See: #38435

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-12-16 09:58:43 +08:00
Gao
994fc544e7
enhance: support iterative filter execution (#37363)
issue: #37360

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-12-11 11:32:44 +08:00
Xianhui Lin
6d0a4fdb31
fix: Fix bug for Search fails with filter expression contains underscore (#38085)
Enhance the matching for elements within the UnaryRangeArray
https://github.com/milvus-io/milvus/issues/38068

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2024-12-05 10:18:40 +08:00
Bingyi Sun
90064cd47b
fix: Fix variable redeclaration in term filter (#38045)
https://github.com/milvus-io/milvus/issues/38046

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-12-02 15:10:38 +08:00
Bingyi Sun
d1596297d9
fix: Fix query failure with inverted index (#37686)
https://github.com/milvus-io/milvus/issues/37649

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-11-15 10:28:31 +08:00
smellthemoon
3389a6b500
enhance: support null in text match index (#37517)
#37508

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-11-13 11:08:29 +08:00
Bingyi Sun
40ba5a3414
fix: fix chunked segment term filter expression and add ut (#37392)
issue: https://github.com/milvus-io/milvus/issues/37143

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-11-07 11:04:19 -08:00
Bingyi Sun
cd2655c861
fix: fix wrong method is called to fetch variable valid data (#37304)
issue: https://github.com/milvus-io/milvus/issues/37147

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-11-01 01:52:20 +08:00
zhenshan.cao
63843dce33
fix: Fix conan gdal building problem (#37338)
issue:https://github.com/milvus-io/milvus/issues/27576

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-10-31 21:04:16 +08:00
Hao Tan
67c4340565
feat: Geospatial Data Type and GIS Function Support for milvus server (#35990)
issue:https://github.com/milvus-io/milvus/issues/27576

# Main Goals
1. Create and describe collections with geospatial fields, enabling both
client and server to recognize and process geo fields.
2. Insert geospatial data as payload values in the insert binlog, and
print the values for verification.
3. Load segments containing geospatial data into memory.
4. Ensure query outputs can display geospatial data.
5. Support filtering on GIS functions for geospatial columns.

# Solution
1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing.Check the modification
in pymilvus.

---------

Signed-off-by: tasty-gumi <1021989072@qq.com>
2024-10-31 20:58:20 +08:00
smellthemoon
b8492498ac
fix: mask with valid data when preCheckOverflow (#37221)
#37175

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-31 10:44:26 +08:00
Bingyi Sun
b81f162f6a
fix: fix several bugs and refactor some codes related with chunked segment (#37168)
issue: https://github.com/milvus-io/milvus/issues/37147

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-10-28 14:17:30 +08:00
Yinzuo Jiang
3628593d20
feat: Implement custom function module in milvus expr (#36560)
OSPP 2024 project:
https://summer-ospp.ac.cn/org/prodetail/247410235?list=org&navpage=org

Solutions:

- parser (planparserv2)
    - add CallExpr in planparserv2/Plan.g4
    - update parser_visitor and show_visitor
- grpc protobuf
    - add CallExpr in plan.proto
- execution (`core/src/exec`)
- add `CallExpr` `ValueExpr` and `ColumnExpr` (both logical and
physical) for function call and function parameters
- function factory (`core/src/exec/expression/function`)
    - create a global hashmap when starting milvus (see server.go)
- the global hashmap stores function signatures and their function
pointers, the CallExpr in execution engine can get the function pointer
by function signature.
- custom functions
    - empty(string)
    - starts_with(string, string)
- add cpp/go unittests and E2E tests

closes: #36559

Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
2024-10-25 15:25:30 +08:00
smellthemoon
6ef014d931
fix: get correct size when sealed segment chunked (#37062)
#37019

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-25 12:01:31 +08:00
Gao
ad2df904c6
fix: correctly set ExecTermArrayVariableInField bitset result (#37111)
issue: https://github.com/milvus-io/milvus/issues/37110

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-10-24 18:52:02 -07:00
smellthemoon
eb3e4583ec
enhance: all op(Null) is false in expr (#35527)
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-17 21:14:30 +08:00
Bingyi Sun
a75bb85f3a
feat: support chunked column for sealed segment (#35764)
This PR splits sealed segment to chunked data to avoid unnecessary
memory copy and save memory usage when loading segments so that loading
can be accelerated.

To support rollback to previous version, we add an option
`multipleChunkedEnable` which is false by default.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-10-12 15:04:52 +08:00
zhagnlu
b1e678dcba
fix: fix json in [] expr bug (#36721)
#36718

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-10-11 01:11:20 +08:00
zhagnlu
0799d927c6
fix:fix term expr overflow bug (#36525)
#36520

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-09-26 15:01:14 +08:00