Spade A 0114bd1dc9
feat: support match operator family (#46518)
issue: https://github.com/milvus-io/milvus/issues/46517
ref: https://github.com/milvus-io/milvus/issues/42148

This PR adds support for the match operator family. For now it works
only on struct arrays and only with brute-force search.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
- Core invariant: match operators only target struct-array element-level
predicates and assume callers provide a correct row_start so element
indices form a contiguous range; IArrayOffsets implementations convert
row-level bitmaps/rows (starting at row_start) into element-level
bitmaps or a contiguous element-offset vector used by brute-force
evaluation.

- New capability added: end-to-end support for MATCH_* semantics
(match_any, match_all, match_least, match_most, match_exact) — parser
(grammar + proto), planner (ParseMatchExprs), expr model
(expr::MatchExpr), compilation (Expr→PhyMatchFilterExpr), execution
(PhyMatchFilterExpr::Eval uses element offsets/bitmaps), and unit tests
(MatchExprTest + parser tests). Implementation currently works for
struct-array inputs and uses brute-force element counting via
RowBitsetToElementOffsets/RowBitsetToElementBitset.

- Logic removed or simplified and why: removed the ad-hoc
DocBitsetToElementOffsets helper and consolidated offset/bitset
derivation into IArrayOffsets::RowBitsetToElementOffsets and a
row_start-aware RowBitsetToElementBitset, and removed EvalCtx overloads
that embedded ExprSet (now EvalCtx(exec_ctx, offset_input)). This
centralizes array-layout logic in ArrayOffsets and removes duplicated
offset conversion and EvalCtx variants that were redundant for
element-level evaluation.

- No data loss / no behavior regression: persistent formats are
unchanged (no proto storage or on-disk layout changed); callers were
updated to supply row_start and now route through the centralized
ArrayOffsets APIs which still use the authoritative
row_to_element_start_ mapping, preserving exact element index mappings.
Eval logic changes are limited to in-memory plumbing (how
offsets/bitmaps are produced and how EvalCtx is constructed); expression
evaluation still invokes exprs_->Eval where needed, so existing behavior
and stored data remain intact.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
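
The row-to-element conversion and the brute-force counting described in the
notes above can be sketched roughly as follows. This is an illustrative Python
model, not the C++ code from the PR: the function names, the prefix-offset
layout of row_to_element_start, and the exact meaning of each MATCH_* operator
(read here as any / all / at least n / at most n / exactly n matching
elements) are assumptions inferred from the names in the commit message.

```python
from typing import List, Sequence


def row_bitset_to_element_offsets(
    row_to_element_start: Sequence[int],
    row_bitset: Sequence[bool],
    row_start: int,
) -> List[int]:
    """Expand selected rows into the element offsets they own.

    Assumption: row_to_element_start is a prefix array of length
    num_rows + 1, so row r owns elements
    [row_to_element_start[r], row_to_element_start[r + 1]).
    row_bitset[i] refers to row (row_start + i).
    """
    offsets: List[int] = []
    for i, selected in enumerate(row_bitset):
        if not selected:
            continue
        row = row_start + i
        offsets.extend(
            range(row_to_element_start[row], row_to_element_start[row + 1])
        )
    return offsets


def eval_match(op: str, hits: int, total: int, n: int = 0) -> bool:
    """Decide one row from its count of matching elements (hypothetical).

    hits  = elements of the row that satisfy the element-level predicate
    total = number of elements in the row
    n     = threshold used by match_least / match_most / match_exact
    Empty-array behavior of match_all (vacuously true here) is an assumption.
    """
    if op == "match_any":
        return hits >= 1
    if op == "match_all":
        return hits == total
    if op == "match_least":
        return hits >= n
    if op == "match_most":
        return hits <= n
    if op == "match_exact":
        return hits == n
    raise ValueError(f"unknown match operator: {op}")


def brute_force_match(
    op: str,
    element_hits: Sequence[bool],
    row_to_element_start: Sequence[int],
    n: int = 0,
) -> List[bool]:
    """Count matching elements per row and apply the operator to every row."""
    result: List[bool] = []
    for row in range(len(row_to_element_start) - 1):
        start = row_to_element_start[row]
        end = row_to_element_start[row + 1]
        hits = sum(1 for h in element_hits[start:end] if h)
        result.append(eval_match(op, hits, end - start, n))
    return result
```

With three rows holding 2, 3, and 1 elements (row_to_element_start =
[0, 2, 5, 6]), a bitset [True, False] applied at row_start = 1 selects only
row 1, whose elements sit at offsets 2 through 4.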

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-12-29 11:03:26 +08:00


T__0=1
T__1=2
T__2=3
T__3=4
T__4=5
LBRACE=6
RBRACE=7
LT=8
LE=9
GT=10
GE=11
EQ=12
NE=13
LIKE=14
EXISTS=15
TEXTMATCH=16
PHRASEMATCH=17
RANDOMSAMPLE=18
MATCH_ALL=19
MATCH_ANY=20
MATCH_LEAST=21
MATCH_MOST=22
MATCH_EXACT=23
INTERVAL=24
ISO=25
MINIMUM_SHOULD_MATCH=26
THRESHOLD=27
ASSIGN=28
ADD=29
SUB=30
MUL=31
DIV=32
MOD=33
POW=34
SHL=35
SHR=36
BAND=37
BOR=38
BXOR=39
AND=40
OR=41
ISNULL=42
ISNOTNULL=43
BNOT=44
NOT=45
IN=46
EmptyArray=47
JSONContains=48
JSONContainsAll=49
JSONContainsAny=50
ArrayContains=51
ArrayContainsAll=52
ArrayContainsAny=53
ArrayLength=54
ElementFilter=55
STEuqals=56
STTouches=57
STOverlaps=58
STCrosses=59
STContains=60
STIntersects=61
STWithin=62
STDWithin=63
STIsValid=64
BooleanConstant=65
IntegerConstant=66
FloatingConstant=67
Identifier=68
Meta=69
StringLiteral=70
JSONIdentifier=71
StructSubFieldIdentifier=72
Whitespace=73
Newline=74
'('=1
')'=2
'['=3
','=4
']'=5
'{'=6
'}'=7
'<'=8
'<='=9
'>'=10
'>='=11
'=='=12
'!='=13
'='=28
'+'=29
'-'=30
'*'=31
'/'=32
'%'=33
'**'=34
'<<'=35
'>>'=36
'&'=37
'|'=38
'^'=39
'~'=44
'$meta'=69
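
For readers who want to inspect this vocabulary programmatically, here is a
small sketch of a loader for the NAME=type and 'literal'=type lines shown
above. The function name parse_tokens and the embedded sample are illustrative
only; the sketch assumes one entry per line and does not handle escaped quotes
inside literals.

```python
from typing import Dict, Tuple

# A few lines copied from the listing above, used as sample input.
SAMPLE = """\
EQ=12
MATCH_ALL=19
MATCH_EXACT=23
ASSIGN=28
'=='=12
'='=28
'$meta'=69
"""


def parse_tokens(text: str) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Split a .tokens listing into symbolic names and quoted literals.

    Returns (names, literals), each mapping the token text to its type.
    The split is done on the LAST '=' so literals such as '==' and '='
    survive intact.
    """
    names: Dict[str, int] = {}
    literals: Dict[str, int] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, _, value = line.rpartition("=")
        token_type = int(value)
        if len(key) >= 2 and key.startswith("'") and key.endswith("'"):
            literals[key[1:-1]] = token_type  # drop the surrounding quotes
        else:
            names[key] = token_type
    return names, literals


if __name__ == "__main__":
    names, literals = parse_tokens(SAMPLE)
    print(names["MATCH_ALL"], literals["=="])
```

Splitting on the last '=' rather than the first matters here because two of
the literals are themselves made of '=' characters.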