mirror of
https://gitee.com/milvus-io/milvus.git
synced 2025-12-08 01:58:34 +08:00
### Is there an existing issue for this?

- [x] I have searched the existing issues

---

Please see https://github.com/milvus-io/milvus/issues/44593 for the background.

This PR makes https://github.com/milvus-io/milvus/pull/44638 redundant, so that PR can be closed. The review comments on the original implementation suggested a better alternative approach; this PR implements it.

---

This PR

- Adds an optional `minimum_should_match` argument to `text_match(...)` and wires it through the parser, planner/visitor, index bindings, and client-level tests/examples so full-text queries can require a minimum number of tokens to match.

Motivation

- Provide a way to require an expression to match a minimum number of tokens in lexical search.

What changed

- Parser / grammar
  - Added grammar rule and token: `MINIMUM_SHOULD_MATCH` and `textMatchOption` in `internal/parser/planparserv2/Plan.g4`.
  - Regenerated parser outputs: `internal/parser/planparserv2/generated/*` (parser, lexer, visitor, etc.) to support the new rule.
- Planner / visitor
  - `parser_visitor.go`: parse and validate the `minimum_should_match` integer; propagate it as an extra value on the `TextMatch` expression so downstream components receive it.
  - Added `VisitTextMatchOption` visitor method handling.
- Client (Golang)
  - Added a unit test to verify that `text_match(..., minimum_should_match=...)` appears in the generated DSL and is accepted by client code: `client/milvusclient/read_test.go` (new test coverage).
  - Added an integration-style test for the feature to the go-client testcase suite: `tests/go_client/testcases/full_text_search_test.go` (exercises min=1, min=3, large min).
  - Added an example demonstrating `text_match` usage: `client/milvusclient/read_example_test.go` (example name conforms to godoc mapping).
- Engine / index
  - Updated C++ index interface: `TextMatchIndex::MatchQuery`
  - Added/updated unit tests for the index behavior: `internal/core/src/index/TextMatchIndexTest.cpp`.
- Tantivy binding
  - Added a `match_query_with_minimum` implementation and unit tests to `internal/core/thirdparty/tantivy/tantivy-binding/src/index_reader_text.rs` that construct boolean queries with minimum required clauses.

Behavioral / compatibility notes

- This adds an optional argument to `text_match` only; default behavior (no `minimum_should_match`) is unchanged.
- Internal API change: the `TextMatchIndex::MatchQuery` signature changed (internal component). Callers in the repo were updated accordingly.
- Parser changes required regenerating ANTLR outputs.

Tests and verification

- New/updated tests:
  - Go client unit test: `client/milvusclient/read_test.go` (mocked Search request asserts DSL contains `minimum_should_match=2`).
  - Go e2e-style test: `tests/go_client/testcases/full_text_search_test.go` (exercises min=1, 3 and a large min).
  - C++ unit tests for index behavior: `internal/core/src/index/TextMatchIndexTest.cpp`.
  - Rust binding unit tests for `match_query_with_minimum`.
- Local verification commands to run:
  - Go client tests: `cd client && go test ./milvusclient -run ^$` (client package)
  - Go testcases: `cd tests/go_client && go test ./testcases -run TestTextMatchMinimumShouldMatch` (requires a running Milvus instance)
  - C++ unit tests / build: run core build/test per repo instructions (the change touches core index code).
  - Rust binding tests: `cd internal/core/thirdparty/tantivy/tantivy-binding && cargo test` (if developing locally).

---------

Signed-off-by: Amit Kumar <amit.kumar@reddit.com>
Co-authored-by: Amit Kumar <amit.kumar@reddit.com>
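To make the new argument concrete, here is a minimal Go sketch of the expression shape and of the intended matching rule. Both helpers (`textMatchExpr`, `matchesAtLeast`) are hypothetical and are not part of the Milvus client or engine. The expression layout follows the `text_match(..., minimum_should_match=...)` form described above. The whitespace tokenizer, the de-duplication of query tokens, and the use of `strconv.Quote` for the string literal are simplifying assumptions; the real analyzer, the Tantivy query, and the parser's escaping rules may behave differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// textMatchExpr builds a filter string of the form
// text_match(field, "query", minimum_should_match=N).
// Hypothetical helper for illustration only. A non-positive
// minShouldMatch omits the option, which keeps the default behavior.
func textMatchExpr(field, query string, minShouldMatch int) string {
	if minShouldMatch <= 0 {
		return fmt.Sprintf("text_match(%s, %s)", field, strconv.Quote(query))
	}
	return fmt.Sprintf("text_match(%s, %s, minimum_should_match=%d)",
		field, strconv.Quote(query), minShouldMatch)
}

// matchesAtLeast models the intended rule with a naive tokenizer:
// a document matches when at least min distinct query tokens occur in it.
// Assumption: lowercase + whitespace split stands in for the real analyzer.
func matchesAtLeast(doc, query string, min int) bool {
	docTokens := make(map[string]bool)
	for _, t := range strings.Fields(strings.ToLower(doc)) {
		docTokens[t] = true
	}
	seen := make(map[string]bool)
	matched := 0
	for _, t := range strings.Fields(strings.ToLower(query)) {
		if seen[t] {
			continue // count each distinct query token once
		}
		seen[t] = true
		if docTokens[t] {
			matched++
		}
	}
	return matched >= min
}

func main() {
	fmt.Println(textMatchExpr("text", "milvus vector database", 2))
	fmt.Println(matchesAtLeast("Milvus is a vector store", "milvus vector database", 2))
	fmt.Println(matchesAtLeast("Milvus is a vector store", "milvus vector database", 3))
}
```

In this sketch the sample document contains two of the three query tokens, so it passes with a minimum of 2 and fails with a minimum of 3, which mirrors the min=1, min=3 and large-min cases the tests above exercise.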
143 lines
8.9 KiB
Plaintext
token literal names:
null
'('
')'
'['
','
']'
'{'
'}'
'<'
'<='
'>'
'>='
'=='
'!='
null
null
null
null
null
null
null
null
'='
'+'
'-'
'*'
'/'
'%'
'**'
'<<'
'>>'
'&'
'|'
'^'
null
null
null
null
'~'
null
null
null
null
null
null
null
null
null
null
null
null
null
null
null
null
null
null
null
null
null
null
'$meta'
null
null
null
null

token symbolic names:
null
null
null
null
null
null
LBRACE
RBRACE
LT
LE
GT
GE
EQ
NE
LIKE
EXISTS
TEXTMATCH
PHRASEMATCH
RANDOMSAMPLE
INTERVAL
ISO
MINIMUM_SHOULD_MATCH
ASSIGN
ADD
SUB
MUL
DIV
MOD
POW
SHL
SHR
BAND
BOR
BXOR
AND
OR
ISNULL
ISNOTNULL
BNOT
NOT
IN
EmptyArray
JSONContains
JSONContainsAll
JSONContainsAny
ArrayContains
ArrayContainsAll
ArrayContainsAny
ArrayLength
STEuqals
STTouches
STOverlaps
STCrosses
STContains
STIntersects
STWithin
STDWithin
BooleanConstant
IntegerConstant
FloatingConstant
Identifier
Meta
StringLiteral
JSONIdentifier
Whitespace
Newline

rule names:
expr
textMatchOption


atn:
[4, 1, 65, 240, 2, 0, 7, 0, 2, 1, 7, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 3, 0, 10, 8, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 3, 0, 22, 8, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 5, 0, 41, 8, 0, 10, 0, 12, 0, 44, 9, 0, 1, 0, 3, 0, 47, 8, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 3, 0, 61, 8, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 3, 0, 71, 8, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 5, 0, 162, 8, 0, 10, 0, 12, 0, 165, 9, 0, 1, 0, 3, 0, 168, 8, 0, 3, 0, 170, 8, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 3, 0, 177, 8, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 3, 0, 193, 8, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 5, 0, 231, 8, 0, 10, 0, 12, 0, 234, 9, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 2, 0, 2, 0, 14, 1, 0, 23, 24, 1, 0, 8, 13, 1, 0, 60, 61, 2, 0, 23, 24, 38, 39, 2, 0, 42, 42, 45, 45, 2, 0, 43, 43, 46, 46, 2, 0, 44, 44, 47, 47, 2, 0, 60, 60, 63, 63, 1, 0, 25, 27, 1, 0, 29, 30, 1, 0, 8, 9, 1, 0, 10, 11, 1, 0, 8, 11, 1, 0, 12, 13, 293, 0, 176, 1, 0, 0, 0, 2, 235, 1, 0, 0, 0, 4, 5, 6, 0, -1, 0, 5, 9, 5, 60, 0, 0, 6, 7, 7, 0, 0, 0, 7, 8, 5, 19, 0, 0, 8, 10, 5, 62, 0, 0, 9, 6, 1, 0, 0, 0, 9, 10, 1, 0, 0, 0, 10, 11, 1, 0, 0, 0, 11, 12, 7, 1, 
0, 0, 12, 13, 5, 20, 0, 0, 13, 177, 5, 62, 0, 0, 14, 15, 5, 20, 0, 0, 15, 16, 5, 62, 0, 0, 16, 17, 7, 1, 0, 0, 17, 21, 5, 60, 0, 0, 18, 19, 7, 0, 0, 0, 19, 20, 5, 19, 0, 0, 20, 22, 5, 62, 0, 0, 21, 18, 1, 0, 0, 0, 21, 22, 1, 0, 0, 0, 22, 177, 1, 0, 0, 0, 23, 177, 5, 58, 0, 0, 24, 177, 5, 59, 0, 0, 25, 177, 5, 57, 0, 0, 26, 177, 5, 62, 0, 0, 27, 177, 7, 2, 0, 0, 28, 177, 5, 63, 0, 0, 29, 30, 5, 6, 0, 0, 30, 31, 5, 60, 0, 0, 31, 177, 5, 7, 0, 0, 32, 33, 5, 1, 0, 0, 33, 34, 3, 0, 0, 0, 34, 35, 5, 2, 0, 0, 35, 177, 1, 0, 0, 0, 36, 37, 5, 3, 0, 0, 37, 42, 3, 0, 0, 0, 38, 39, 5, 4, 0, 0, 39, 41, 3, 0, 0, 0, 40, 38, 1, 0, 0, 0, 41, 44, 1, 0, 0, 0, 42, 40, 1, 0, 0, 0, 42, 43, 1, 0, 0, 0, 43, 46, 1, 0, 0, 0, 44, 42, 1, 0, 0, 0, 45, 47, 5, 4, 0, 0, 46, 45, 1, 0, 0, 0, 46, 47, 1, 0, 0, 0, 47, 48, 1, 0, 0, 0, 48, 49, 5, 5, 0, 0, 49, 177, 1, 0, 0, 0, 50, 177, 5, 41, 0, 0, 51, 52, 5, 15, 0, 0, 52, 177, 3, 0, 0, 35, 53, 54, 5, 16, 0, 0, 54, 55, 5, 1, 0, 0, 55, 56, 5, 60, 0, 0, 56, 57, 5, 4, 0, 0, 57, 60, 5, 62, 0, 0, 58, 59, 5, 4, 0, 0, 59, 61, 3, 2, 1, 0, 60, 58, 1, 0, 0, 0, 60, 61, 1, 0, 0, 0, 61, 62, 1, 0, 0, 0, 62, 177, 5, 2, 0, 0, 63, 64, 5, 17, 0, 0, 64, 65, 5, 1, 0, 0, 65, 66, 5, 60, 0, 0, 66, 67, 5, 4, 0, 0, 67, 70, 5, 62, 0, 0, 68, 69, 5, 4, 0, 0, 69, 71, 3, 0, 0, 0, 70, 68, 1, 0, 0, 0, 70, 71, 1, 0, 0, 0, 71, 72, 1, 0, 0, 0, 72, 177, 5, 2, 0, 0, 73, 74, 5, 18, 0, 0, 74, 75, 5, 1, 0, 0, 75, 76, 3, 0, 0, 0, 76, 77, 5, 2, 0, 0, 77, 177, 1, 0, 0, 0, 78, 79, 7, 3, 0, 0, 79, 177, 3, 0, 0, 29, 80, 81, 7, 4, 0, 0, 81, 82, 5, 1, 0, 0, 82, 83, 3, 0, 0, 0, 83, 84, 5, 4, 0, 0, 84, 85, 3, 0, 0, 0, 85, 86, 5, 2, 0, 0, 86, 177, 1, 0, 0, 0, 87, 88, 7, 5, 0, 0, 88, 89, 5, 1, 0, 0, 89, 90, 3, 0, 0, 0, 90, 91, 5, 4, 0, 0, 91, 92, 3, 0, 0, 0, 92, 93, 5, 2, 0, 0, 93, 177, 1, 0, 0, 0, 94, 95, 7, 6, 0, 0, 95, 96, 5, 1, 0, 0, 96, 97, 3, 0, 0, 0, 97, 98, 5, 4, 0, 0, 98, 99, 3, 0, 0, 0, 99, 100, 5, 2, 0, 0, 100, 177, 1, 0, 0, 0, 101, 102, 5, 49, 0, 0, 102, 103, 5, 1, 0, 0, 103, 104, 5, 60, 0, 0, 
104, 105, 5, 4, 0, 0, 105, 106, 5, 62, 0, 0, 106, 177, 5, 2, 0, 0, 107, 108, 5, 50, 0, 0, 108, 109, 5, 1, 0, 0, 109, 110, 5, 60, 0, 0, 110, 111, 5, 4, 0, 0, 111, 112, 5, 62, 0, 0, 112, 177, 5, 2, 0, 0, 113, 114, 5, 51, 0, 0, 114, 115, 5, 1, 0, 0, 115, 116, 5, 60, 0, 0, 116, 117, 5, 4, 0, 0, 117, 118, 5, 62, 0, 0, 118, 177, 5, 2, 0, 0, 119, 120, 5, 52, 0, 0, 120, 121, 5, 1, 0, 0, 121, 122, 5, 60, 0, 0, 122, 123, 5, 4, 0, 0, 123, 124, 5, 62, 0, 0, 124, 177, 5, 2, 0, 0, 125, 126, 5, 53, 0, 0, 126, 127, 5, 1, 0, 0, 127, 128, 5, 60, 0, 0, 128, 129, 5, 4, 0, 0, 129, 130, 5, 62, 0, 0, 130, 177, 5, 2, 0, 0, 131, 132, 5, 54, 0, 0, 132, 133, 5, 1, 0, 0, 133, 134, 5, 60, 0, 0, 134, 135, 5, 4, 0, 0, 135, 136, 5, 62, 0, 0, 136, 177, 5, 2, 0, 0, 137, 138, 5, 55, 0, 0, 138, 139, 5, 1, 0, 0, 139, 140, 5, 60, 0, 0, 140, 141, 5, 4, 0, 0, 141, 142, 5, 62, 0, 0, 142, 177, 5, 2, 0, 0, 143, 144, 5, 56, 0, 0, 144, 145, 5, 1, 0, 0, 145, 146, 5, 60, 0, 0, 146, 147, 5, 4, 0, 0, 147, 148, 5, 62, 0, 0, 148, 149, 5, 4, 0, 0, 149, 150, 3, 0, 0, 0, 150, 151, 5, 2, 0, 0, 151, 177, 1, 0, 0, 0, 152, 153, 5, 48, 0, 0, 153, 154, 5, 1, 0, 0, 154, 155, 7, 7, 0, 0, 155, 177, 5, 2, 0, 0, 156, 157, 5, 60, 0, 0, 157, 169, 5, 1, 0, 0, 158, 163, 3, 0, 0, 0, 159, 160, 5, 4, 0, 0, 160, 162, 3, 0, 0, 0, 161, 159, 1, 0, 0, 0, 162, 165, 1, 0, 0, 0, 163, 161, 1, 0, 0, 0, 163, 164, 1, 0, 0, 0, 164, 167, 1, 0, 0, 0, 165, 163, 1, 0, 0, 0, 166, 168, 5, 4, 0, 0, 167, 166, 1, 0, 0, 0, 167, 168, 1, 0, 0, 0, 168, 170, 1, 0, 0, 0, 169, 158, 1, 0, 0, 0, 169, 170, 1, 0, 0, 0, 170, 171, 1, 0, 0, 0, 171, 177, 5, 2, 0, 0, 172, 173, 7, 7, 0, 0, 173, 177, 5, 36, 0, 0, 174, 175, 7, 7, 0, 0, 175, 177, 5, 37, 0, 0, 176, 4, 1, 0, 0, 0, 176, 14, 1, 0, 0, 0, 176, 23, 1, 0, 0, 0, 176, 24, 1, 0, 0, 0, 176, 25, 1, 0, 0, 0, 176, 26, 1, 0, 0, 0, 176, 27, 1, 0, 0, 0, 176, 28, 1, 0, 0, 0, 176, 29, 1, 0, 0, 0, 176, 32, 1, 0, 0, 0, 176, 36, 1, 0, 0, 0, 176, 50, 1, 0, 0, 0, 176, 51, 1, 0, 0, 0, 176, 53, 1, 0, 0, 0, 176, 63, 1, 0, 0, 0, 176, 73, 
1, 0, 0, 0, 176, 78, 1, 0, 0, 0, 176, 80, 1, 0, 0, 0, 176, 87, 1, 0, 0, 0, 176, 94, 1, 0, 0, 0, 176, 101, 1, 0, 0, 0, 176, 107, 1, 0, 0, 0, 176, 113, 1, 0, 0, 0, 176, 119, 1, 0, 0, 0, 176, 125, 1, 0, 0, 0, 176, 131, 1, 0, 0, 0, 176, 137, 1, 0, 0, 0, 176, 143, 1, 0, 0, 0, 176, 152, 1, 0, 0, 0, 176, 156, 1, 0, 0, 0, 176, 172, 1, 0, 0, 0, 176, 174, 1, 0, 0, 0, 177, 232, 1, 0, 0, 0, 178, 179, 10, 30, 0, 0, 179, 180, 5, 28, 0, 0, 180, 231, 3, 0, 0, 31, 181, 182, 10, 28, 0, 0, 182, 183, 7, 8, 0, 0, 183, 231, 3, 0, 0, 29, 184, 185, 10, 27, 0, 0, 185, 186, 7, 0, 0, 0, 186, 231, 3, 0, 0, 28, 187, 188, 10, 26, 0, 0, 188, 189, 7, 9, 0, 0, 189, 231, 3, 0, 0, 27, 190, 192, 10, 25, 0, 0, 191, 193, 5, 39, 0, 0, 192, 191, 1, 0, 0, 0, 192, 193, 1, 0, 0, 0, 193, 194, 1, 0, 0, 0, 194, 195, 5, 40, 0, 0, 195, 231, 3, 0, 0, 26, 196, 197, 10, 11, 0, 0, 197, 198, 7, 10, 0, 0, 198, 199, 7, 7, 0, 0, 199, 200, 7, 10, 0, 0, 200, 231, 3, 0, 0, 12, 201, 202, 10, 10, 0, 0, 202, 203, 7, 11, 0, 0, 203, 204, 7, 7, 0, 0, 204, 205, 7, 11, 0, 0, 205, 231, 3, 0, 0, 11, 206, 207, 10, 9, 0, 0, 207, 208, 7, 12, 0, 0, 208, 231, 3, 0, 0, 10, 209, 210, 10, 8, 0, 0, 210, 211, 7, 13, 0, 0, 211, 231, 3, 0, 0, 9, 212, 213, 10, 7, 0, 0, 213, 214, 5, 31, 0, 0, 214, 231, 3, 0, 0, 8, 215, 216, 10, 6, 0, 0, 216, 217, 5, 33, 0, 0, 217, 231, 3, 0, 0, 7, 218, 219, 10, 5, 0, 0, 219, 220, 5, 32, 0, 0, 220, 231, 3, 0, 0, 6, 221, 222, 10, 4, 0, 0, 222, 223, 5, 34, 0, 0, 223, 231, 3, 0, 0, 5, 224, 225, 10, 3, 0, 0, 225, 226, 5, 35, 0, 0, 226, 231, 3, 0, 0, 4, 227, 228, 10, 34, 0, 0, 228, 229, 5, 14, 0, 0, 229, 231, 5, 62, 0, 0, 230, 178, 1, 0, 0, 0, 230, 181, 1, 0, 0, 0, 230, 184, 1, 0, 0, 0, 230, 187, 1, 0, 0, 0, 230, 190, 1, 0, 0, 0, 230, 196, 1, 0, 0, 0, 230, 201, 1, 0, 0, 0, 230, 206, 1, 0, 0, 0, 230, 209, 1, 0, 0, 0, 230, 212, 1, 0, 0, 0, 230, 215, 1, 0, 0, 0, 230, 218, 1, 0, 0, 0, 230, 221, 1, 0, 0, 0, 230, 224, 1, 0, 0, 0, 230, 227, 1, 0, 0, 0, 231, 234, 1, 0, 0, 0, 232, 230, 1, 0, 0, 0, 232, 233, 1, 0, 0, 0, 233, 1, 
1, 0, 0, 0, 234, 232, 1, 0, 0, 0, 235, 236, 5, 21, 0, 0, 236, 237, 5, 22, 0, 0, 237, 238, 5, 58, 0, 0, 238, 3, 1, 0, 0, 0, 13, 9, 21, 42, 46, 60, 70, 163, 167, 169, 176, 192, 230, 232]