Amit Kumar 388d56fdc7
enhance: Add support for minimum_should_match in text_match (parser, engine, client, and tests) (#44988)
### Is there an existing issue for this?

- [x] I have searched the existing issues

---

Please see: https://github.com/milvus-io/milvus/issues/44593 for the
background

This PR supersedes https://github.com/milvus-io/milvus/pull/44638, which
can now be closed. Review comments on that original implementation
suggested a better alternative approach; this PR implements it.

---

This PR

- Adds an optional `minimum_should_match` argument to `text_match(...)`
and wires it through the parser, planner/visitor, index bindings, and
client-level tests/examples so full-text queries can require a minimum
number of tokens to match.
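
As an illustration of the new argument, here is a minimal Go sketch that builds a `text_match` filter string with the optional `minimum_should_match=N` suffix. The helper name `buildTextMatch` and the field name `document_text` are hypothetical, and using Go-style quoting for the query text is an assumption about what the expression parser accepts.

```go
package main

import (
	"fmt"
	"strconv"
)

// buildTextMatch is a hypothetical helper (not part of the Milvus client).
// It returns a text_match filter expression; a minimum of 0 or less omits
// the option, which leaves the default text_match behavior in place.
func buildTextMatch(field, query string, minimumShouldMatch int) string {
	// Assumption: Go-style double-quoted strings are valid in the expression.
	quoted := strconv.Quote(query)
	if minimumShouldMatch <= 0 {
		return fmt.Sprintf("text_match(%s, %s)", field, quoted)
	}
	return fmt.Sprintf("text_match(%s, %s, minimum_should_match=%d)",
		field, quoted, minimumShouldMatch)
}

func main() {
	// "document_text" is an illustrative field name.
	fmt.Println(buildTextMatch("document_text", "vector database search", 2))
	fmt.Println(buildTextMatch("document_text", "vector database search", 0))
}
```

The resulting string would be passed wherever the client accepts a filter expression.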

Motivation
- Provide a way to require an expression to match a minimum number of
tokens in lexical search.

What changed
- Parser / grammar
  - Added the `MINIMUM_SHOULD_MATCH` token and the `textMatchOption`
    rule in `internal/parser/planparserv2/Plan.g4`.
  - Regenerated parser outputs: `internal/parser/planparserv2/generated/*`
    (parser, lexer, visitor, etc.) to support the new rule.
- Planner / visitor
  - `parser_visitor.go`: parses and validates the `minimum_should_match`
    integer and propagates it as an extra value on the `TextMatch`
    expression so downstream components receive it.
  - Added `VisitTextMatchOption` visitor method handling.
- Client (Golang)
  - Added a unit test to verify that `text_match(...,
    minimum_should_match=...)` appears in the generated DSL and is
    accepted by client code: `client/milvusclient/read_test.go` (new test
    coverage).
  - Added an integration-style test for the feature to the go-client
    testcase suite: `tests/go_client/testcases/full_text_search_test.go`
    (exercises min=1, min=3, and a large min).
  - Added an example demonstrating `text_match` usage:
    `client/milvusclient/read_example_test.go` (example name conforms to
    godoc mapping).
- Engine / index
  - Updated the C++ index interface: `TextMatchIndex::MatchQuery`.
  - Added/updated unit tests for the index behavior:
    `internal/core/src/index/TextMatchIndexTest.cpp`.
- Tantivy binding
  - Added a `match_query_with_minimum` implementation and unit tests to
    `internal/core/thirdparty/tantivy/tantivy-binding/src/index_reader_text.rs`
    that construct boolean queries with minimum required clauses.
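
To make the "minimum required clauses" idea concrete, the following Go sketch models the matching rule on plain strings. It is not the Tantivy binding's code. The function name `matchesWithMinimum`, the lowercase-and-split-on-whitespace tokenization, and the clamping of values below 1 are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesWithMinimum reports whether at least `minimum` distinct query
// tokens occur in doc. Tokenization here is lowercase + whitespace split,
// which is a simplification; a real analyzer may tokenize differently.
func matchesWithMinimum(doc, query string, minimum int) bool {
	// Assumption: values below 1 behave like 1 (any single token matches).
	if minimum < 1 {
		minimum = 1
	}
	docTokens := make(map[string]struct{})
	for _, t := range strings.Fields(strings.ToLower(doc)) {
		docTokens[t] = struct{}{}
	}
	seen := make(map[string]struct{})
	matched := 0
	for _, t := range strings.Fields(strings.ToLower(query)) {
		if _, dup := seen[t]; dup {
			continue // count each distinct query token once
		}
		seen[t] = struct{}{}
		if _, ok := docTokens[t]; ok {
			matched++
		}
	}
	return matched >= minimum
}

func main() {
	doc := "Milvus supports full text search"
	query := "text search engine"
	for _, m := range []int{1, 2, 3} {
		fmt.Printf("minimum=%d matched=%v\n", m, matchesWithMinimum(doc, query, m))
	}
}
```

In this model a minimum larger than the number of query tokens can never be satisfied. The "large min" test case mentioned above appears to exercise a case of that kind.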

Behavioral / compatibility notes
- This adds an optional argument to `text_match` only; default behavior
(no `minimum_should_match`) is unchanged.
- Internal API change: `TextMatchIndex::MatchQuery` signature changed
(internal component). Callers in the repo were updated accordingly.
- Parser changes required regenerating the ANTLR outputs.
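
The notes above describe an optional integer that is parsed and validated, with unchanged behavior when it is omitted. A minimal Go sketch of that shape follows. The function name `parseMinimumShouldMatch`, the use of an empty string to mean "option omitted", and the lower bound of 1 are assumptions for illustration, not the rules implemented in `parser_visitor.go`.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseMinimumShouldMatch is a hypothetical helper. It parses the raw text
// of the option. An empty string means the option was omitted; the boolean
// result reports whether a value was explicitly set.
func parseMinimumShouldMatch(raw string) (int64, bool, error) {
	if raw == "" {
		return 0, false, nil // omitted: caller keeps the default behavior
	}
	v, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, false, fmt.Errorf("minimum_should_match must be an integer: %w", err)
	}
	// Assumption: values below 1 are rejected.
	if v < 1 {
		return 0, false, fmt.Errorf("minimum_should_match must be >= 1, got %d", v)
	}
	return v, true, nil
}

func main() {
	for _, raw := range []string{"", "2", "0", "abc"} {
		v, set, err := parseMinimumShouldMatch(raw)
		fmt.Printf("raw=%q value=%d set=%v err=%v\n", raw, v, set, err)
	}
}
```

Returning a separate "set" flag keeps the omitted case distinct from any numeric value, so the default path stays untouched.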

Tests and verification
- New/updated tests:
  - Go client unit test: `client/milvusclient/read_test.go` (mocked Search
    request asserts DSL contains `minimum_should_match=2`).
  - Go e2e-style test:
    `tests/go_client/testcases/full_text_search_test.go` (exercises min=1,
    min=3, and a large min).
  - C++ unit tests for index behavior:
    `internal/core/src/index/TextMatchIndexTest.cpp`.
  - Rust binding unit tests for `match_query_with_minimum`.
- Local verification commands to run:
  - Go client tests: `cd client && go test ./milvusclient -run ^$`
    (compiles the client package tests without running any; omit
    `-run ^$` to run them).
  - Go testcases:
    `cd tests/go_client && go test ./testcases -run TestTextMatchMinimumShouldMatch`
    (requires a running Milvus instance).
  - C++ unit tests / build: run the core build/test per repo instructions
    (the change touches core index code).
  - Rust binding tests:
    `cd internal/core/thirdparty/tantivy/tantivy-binding && cargo test`
    (if developing locally).

---------

Signed-off-by: Amit Kumar <amit.kumar@reddit.com>
Co-authored-by: Amit Kumar <amit.kumar@reddit.com>
2025-11-07 16:07:11 +08:00

// Code generated from Plan.g4 by ANTLR 4.13.2. DO NOT EDIT.

package planparserv2 // Plan

import "github.com/antlr4-go/antlr/v4"

// A complete Visitor for a parse tree produced by PlanParser.
type PlanVisitor interface {
	antlr.ParseTreeVisitor

	// Visit a parse tree produced by PlanParser#JSONIdentifier.
	VisitJSONIdentifier(ctx *JSONIdentifierContext) interface{}

	// Visit a parse tree produced by PlanParser#RandomSample.
	VisitRandomSample(ctx *RandomSampleContext) interface{}

	// Visit a parse tree produced by PlanParser#Parens.
	VisitParens(ctx *ParensContext) interface{}

	// Visit a parse tree produced by PlanParser#String.
	VisitString(ctx *StringContext) interface{}

	// Visit a parse tree produced by PlanParser#Floating.
	VisitFloating(ctx *FloatingContext) interface{}

	// Visit a parse tree produced by PlanParser#JSONContainsAll.
	VisitJSONContainsAll(ctx *JSONContainsAllContext) interface{}

	// Visit a parse tree produced by PlanParser#LogicalOr.
	VisitLogicalOr(ctx *LogicalOrContext) interface{}

	// Visit a parse tree produced by PlanParser#IsNotNull.
	VisitIsNotNull(ctx *IsNotNullContext) interface{}

	// Visit a parse tree produced by PlanParser#MulDivMod.
	VisitMulDivMod(ctx *MulDivModContext) interface{}

	// Visit a parse tree produced by PlanParser#Identifier.
	VisitIdentifier(ctx *IdentifierContext) interface{}

	// Visit a parse tree produced by PlanParser#STIntersects.
	VisitSTIntersects(ctx *STIntersectsContext) interface{}

	// Visit a parse tree produced by PlanParser#Like.
	VisitLike(ctx *LikeContext) interface{}

	// Visit a parse tree produced by PlanParser#LogicalAnd.
	VisitLogicalAnd(ctx *LogicalAndContext) interface{}

	// Visit a parse tree produced by PlanParser#TemplateVariable.
	VisitTemplateVariable(ctx *TemplateVariableContext) interface{}

	// Visit a parse tree produced by PlanParser#Equality.
	VisitEquality(ctx *EqualityContext) interface{}

	// Visit a parse tree produced by PlanParser#Boolean.
	VisitBoolean(ctx *BooleanContext) interface{}

	// Visit a parse tree produced by PlanParser#TimestamptzCompareReverse.
	VisitTimestamptzCompareReverse(ctx *TimestamptzCompareReverseContext) interface{}

	// Visit a parse tree produced by PlanParser#STDWithin.
	VisitSTDWithin(ctx *STDWithinContext) interface{}

	// Visit a parse tree produced by PlanParser#Shift.
	VisitShift(ctx *ShiftContext) interface{}

	// Visit a parse tree produced by PlanParser#TimestamptzCompareForward.
	VisitTimestamptzCompareForward(ctx *TimestamptzCompareForwardContext) interface{}

	// Visit a parse tree produced by PlanParser#Call.
	VisitCall(ctx *CallContext) interface{}

	// Visit a parse tree produced by PlanParser#STCrosses.
	VisitSTCrosses(ctx *STCrossesContext) interface{}

	// Visit a parse tree produced by PlanParser#ReverseRange.
	VisitReverseRange(ctx *ReverseRangeContext) interface{}

	// Visit a parse tree produced by PlanParser#BitOr.
	VisitBitOr(ctx *BitOrContext) interface{}

	// Visit a parse tree produced by PlanParser#EmptyArray.
	VisitEmptyArray(ctx *EmptyArrayContext) interface{}

	// Visit a parse tree produced by PlanParser#AddSub.
	VisitAddSub(ctx *AddSubContext) interface{}

	// Visit a parse tree produced by PlanParser#PhraseMatch.
	VisitPhraseMatch(ctx *PhraseMatchContext) interface{}

	// Visit a parse tree produced by PlanParser#Relational.
	VisitRelational(ctx *RelationalContext) interface{}

	// Visit a parse tree produced by PlanParser#ArrayLength.
	VisitArrayLength(ctx *ArrayLengthContext) interface{}

	// Visit a parse tree produced by PlanParser#TextMatch.
	VisitTextMatch(ctx *TextMatchContext) interface{}

	// Visit a parse tree produced by PlanParser#STTouches.
	VisitSTTouches(ctx *STTouchesContext) interface{}

	// Visit a parse tree produced by PlanParser#STContains.
	VisitSTContains(ctx *STContainsContext) interface{}

	// Visit a parse tree produced by PlanParser#Term.
	VisitTerm(ctx *TermContext) interface{}

	// Visit a parse tree produced by PlanParser#JSONContains.
	VisitJSONContains(ctx *JSONContainsContext) interface{}

	// Visit a parse tree produced by PlanParser#STWithin.
	VisitSTWithin(ctx *STWithinContext) interface{}

	// Visit a parse tree produced by PlanParser#Range.
	VisitRange(ctx *RangeContext) interface{}

	// Visit a parse tree produced by PlanParser#Unary.
	VisitUnary(ctx *UnaryContext) interface{}

	// Visit a parse tree produced by PlanParser#Integer.
	VisitInteger(ctx *IntegerContext) interface{}

	// Visit a parse tree produced by PlanParser#Array.
	VisitArray(ctx *ArrayContext) interface{}

	// Visit a parse tree produced by PlanParser#JSONContainsAny.
	VisitJSONContainsAny(ctx *JSONContainsAnyContext) interface{}

	// Visit a parse tree produced by PlanParser#BitXor.
	VisitBitXor(ctx *BitXorContext) interface{}

	// Visit a parse tree produced by PlanParser#Exists.
	VisitExists(ctx *ExistsContext) interface{}

	// Visit a parse tree produced by PlanParser#BitAnd.
	VisitBitAnd(ctx *BitAndContext) interface{}

	// Visit a parse tree produced by PlanParser#STEuqals.
	VisitSTEuqals(ctx *STEuqalsContext) interface{}

	// Visit a parse tree produced by PlanParser#IsNull.
	VisitIsNull(ctx *IsNullContext) interface{}

	// Visit a parse tree produced by PlanParser#Power.
	VisitPower(ctx *PowerContext) interface{}

	// Visit a parse tree produced by PlanParser#STOverlaps.
	VisitSTOverlaps(ctx *STOverlapsContext) interface{}

	// Visit a parse tree produced by PlanParser#textMatchOption.
	VisitTextMatchOption(ctx *TextMatchOptionContext) interface{}
}