Milvus Go Client Test Framework
Overview
This is a comprehensive test framework for the Milvus Go Client, designed to validate various functionalities of the Milvus vector database client. The framework provides a structured approach to writing tests with reusable components and helper functions.
Framework Architecture
Directory Structure
/go_client/
├── testcases/              # Main test cases
│   ├── helper/             # Helper functions and utilities
│   │   ├── helper.go
│   │   ├── data_helper.go
│   │   └── collection_helper.go
│   ├── search_test.go      # Search functionality tests
│   ├── index_test.go       # Index management tests
│   └── ...
├── common/                 # Common utilities and constants
└── base/                   # Base infrastructure code
Key Components
- Collection Preparation: Utilities for creating and managing collections
- Data Generation: Tools for generating test data
- Helper Functions: Common operations and validations
- Test Cases: Organized by functionality
Writing Test Cases
Basic Test Structure
func TestYourFeature(t *testing.T) {
    // 1. Setup context and client
    ctx := hp.CreateContext(t, time.Second*common.DefaultTimeout)
    mc := createDefaultMilvusClient(ctx, t)

    // 2. Prepare collection
    prepare, schema := hp.CollPrepare.CreateCollection(
        ctx, t, mc,
        hp.NewCreateCollectionParams(hp.Int64Vec),
        hp.TNewFieldsOption(),
        hp.TNewSchemaOption(),
    )

    // 3. Insert test data
    prepare.InsertData(ctx, t, mc,
        hp.NewInsertParams(schema),
        hp.TNewDataOption(),
    )

    // 4. Execute test operations
    // ... your test logic here ...

    // 5. Validate results
    require.NoError(t, err)
    require.Equal(t, expected, actual)
}
Using Custom Parameters
- Collection Creation Parameters
fieldsOption := hp.TNewFieldsOption().
    TWithEnableAnalyzer(true).
    TWithAnalyzerParams(map[string]any{
        "tokenizer": "standard",
    })

schemaOption := hp.TNewSchemaOption().
    TWithEnableDynamicField(true).
    TWithDescription("Custom schema").
    TWithAutoID(false)
- Data Insertion Options
insertOption := hp.TNewDataOption().
    TWithNb(1000).       // Number of records
    TWithDim(128).       // Vector dimension
    TWithStart(100).     // Starting ID
    TWithMaxLen(256).    // Maximum length
    TWithTextLang("en")  // Text language
- Index Parameters
indexParams := hp.TNewIndexParams(schema).
    TWithFieldIndex(map[string]index.Index{
        common.DefaultVectorFieldName: index.NewIVFSQIndex(
            &index.IVFSQConfig{
                MetricType: entity.L2,
                NList:      128,
            },
        ),
    })
- Search Parameters
searchOpt := client.NewSearchOption(schema.CollectionName, 100, vectors).
    WithOffset(0).
    WithLimit(100).
    WithConsistencyLevel(entity.ClStrong).
    WithFilter("int64 >= 100").
    WithOutputFields([]string{"*"}).
    WithSearchParams(map[string]any{
        "nprobe": 16,
        "ef":     64,
    })
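To make the data insertion options above more concrete, here is a minimal, standard-library-only sketch of what the record count, vector dimension, and starting ID plausibly control. The function names `genInt64PKs` and `genFloatVectors` are hypothetical and are not part of the helper package; the real helpers may generate data differently.

```go
package main

import (
	"fmt"
	"math/rand"
)

// genInt64PKs returns nb consecutive primary keys beginning at start.
// Hypothetical helper: it only illustrates the "Nb" and "Start" options.
func genInt64PKs(nb int, start int64) []int64 {
	pks := make([]int64, nb)
	for i := range pks {
		pks[i] = start + int64(i)
	}
	return pks
}

// genFloatVectors returns nb random vectors of the given dimension.
// A fixed seed keeps the generated data reproducible between runs.
func genFloatVectors(nb, dim int, seed int64) [][]float32 {
	rng := rand.New(rand.NewSource(seed))
	vectors := make([][]float32, nb)
	for i := range vectors {
		vec := make([]float32, dim)
		for j := range vec {
			vec[j] = rng.Float32()
		}
		vectors[i] = vec
	}
	return vectors
}

func main() {
	pks := genInt64PKs(1000, 100)
	vectors := genFloatVectors(1000, 128, 42)
	fmt.Println(len(pks), pks[0], pks[len(pks)-1]) // 1000 100 1099
	fmt.Println(len(vectors), len(vectors[0]))     // 1000 128
}
```

With a starting ID of 100 and 1000 records, a filter such as "int64 >= 100" would match every generated row under these assumptions.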
Adding New Parameters
- Define New Option Type
// In helper/data_helper.go
type YourNewOption struct {
    newParam1 string
    newParam2 int
}
- Add Constructor
func TNewYourOption() *YourNewOption {
    return &YourNewOption{
        newParam1: "default",
        newParam2: 0,
    }
}
- Add Parameter Methods
func (opt *YourNewOption) TWithNewParam1(value string) *YourNewOption {
    opt.newParam1 = value
    return opt
}

func (opt *YourNewOption) TWithNewParam2(value int) *YourNewOption {
    opt.newParam2 = value
    return opt
}
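The three steps above combine into a fluent option builder. The sketch below repeats the placeholder type so that it compiles on its own, and adds a hypothetical `describe` function (not part of the framework) standing in for whatever helper would consume the option. It shows that parameters left unset keep the constructor's defaults.

```go
package main

import "fmt"

// YourNewOption mirrors the placeholder option type from the steps above.
type YourNewOption struct {
	newParam1 string
	newParam2 int
}

// TNewYourOption returns an option populated with default values.
func TNewYourOption() *YourNewOption {
	return &YourNewOption{newParam1: "default", newParam2: 0}
}

// TWithNewParam1 overrides newParam1 and returns the option for chaining.
func (opt *YourNewOption) TWithNewParam1(value string) *YourNewOption {
	opt.newParam1 = value
	return opt
}

// TWithNewParam2 overrides newParam2 and returns the option for chaining.
func (opt *YourNewOption) TWithNewParam2(value int) *YourNewOption {
	opt.newParam2 = value
	return opt
}

// describe is a hypothetical consumer that reads the option's fields.
func describe(opt *YourNewOption) string {
	return fmt.Sprintf("newParam1=%s newParam2=%d", opt.newParam1, opt.newParam2)
}

func main() {
	fmt.Println(describe(TNewYourOption()))                   // newParam1=default newParam2=0
	fmt.Println(describe(TNewYourOption().TWithNewParam2(8))) // newParam1=default newParam2=8
	fmt.Println(describe(TNewYourOption().
		TWithNewParam1("custom").
		TWithNewParam2(8))) // newParam1=custom newParam2=8
}
```

Returning the receiver from every setter is what allows the calls to be chained in any order.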
Best Practices
- Test Organization
  - Group related tests in the same file
  - Use clear and descriptive test names
  - Add comments explaining test purpose
- Data Generation
  - Use helper functions for generating test data
  - Ensure data is appropriate for the test case
  - Clean up test data after use
- Error Handling
  - Use common.CheckErr for consistent error checking
  - Test both success and failure scenarios
  - Validate error messages when appropriate
- Performance Considerations
  - Use appropriate timeouts
  - Clean up resources after tests
  - Consider test execution time
Running Tests
# Run all tests
go test ./testcases/...
# Run specific test
go test -run TestYourFeature ./testcases/
# Run with verbose output
go test -v ./testcases/...
# gotestsum is recommended: https://github.com/gotestyourself/gotestsum
# Run all default cases
gotestsum --format testname --hide-summary=output -v ./testcases/... --addr=127.0.0.1:19530 -timeout=30m
# Run a specified file
gotestsum --format testname --hide-summary=output ./testcases/collection_test.go ./testcases/main_test.go --addr=127.0.0.1:19530
# Run L3 resource group (rg) cases
gotestsum --format testname --hide-summary=output -v ./testcases/advcases/... --addr=127.0.0.1:19530 -timeout=30m -tags=rg
# Run advanced rg cases together with the default cases
# rg cases conflict with default cases, so -p 1 is required to run packages serially
gotestsum --format testname --hide-summary=output -v ./testcases/... --addr=127.0.0.1:19530 -timeout=30m -tags=rg -p 1
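The commands above pass --addr through to the test binary. As a rough illustration of how such a custom flag can be read, the sketch below parses it with the standard flag package. The function name `parseAddr` and the default value `localhost:19530` are assumptions made for this example; the framework's own flag handling may differ.

```go
package main

import (
	"flag"
	"fmt"
)

// parseAddr reads an -addr / --addr flag from args.
// Hypothetical helper: the default address is an assumption.
func parseAddr(args []string) (string, error) {
	fs := flag.NewFlagSet("testcases", flag.ContinueOnError)
	addr := fs.String("addr", "localhost:19530", "Milvus server address (host:port)")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *addr, nil
}

func main() {
	addr, err := parseAddr([]string{"--addr=127.0.0.1:19530"})
	if err != nil {
		panic(err)
	}
	fmt.Println(addr) // 127.0.0.1:19530
}
```

In a real test package a flag like this would typically be registered on the default flag set before the tests run, for example in a TestMain function.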
Contributing
- Follow the existing code structure
- Add comprehensive test cases
- Document new parameters and options
- Update this README for significant changes
- Ensure code quality standards:
  - Run golangci-lint run to check for style mistakes
  - Use gofmt -w your/code/path to format your code before submitting
  - CI will verify both golint and go format compliance