mirror of
https://gitee.com/milvus-io/milvus.git
synced 2026-01-07 19:31:51 +08:00
Add support for DataNode compaction using file resources in ref mode. SortCompaction and StatsJobs build text indexes, which may use file resources.

relate: https://github.com/milvus-io/milvus/issues/43687

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
- Core invariant: file resources (analyzer binaries/metadata) are fetched, downloaded and used only when the node is configured in Ref mode (fileresource.IsRefMode via CommonCfg.QNFileResourceMode / DNFileResourceMode). Sync now carries a version, and managers track per-resource versions/resource IDs so that newer resource sets win and older entries are pruned (RefManager/SynchManager resource maps).
- Logic removed / simplified: component-specific FileResourceMode flags and an indirection through a long-lived BinlogIO wrapper were consolidated. File-resource mode moved to CommonCfg, Sync/Download APIs became version- and context-aware, and compaction/index tasks accept a ChunkManager directly (binlog IO wrapper creation inlined). This eliminates duplicated config checks and wrapper indirection while preserving the same chunk/IO semantics.
- Why no data loss or behavior regression: all file-resource code paths are gated by the configured mode (default remains "sync"); when the node is not in ref mode, or when no resources exist, compaction and stats flows follow the existing code paths unchanged. Versioned Sync plus resourceID maps ensure that newly synced sets replace older ones, and RefManager prunes stale files. GetFileResources returns an error if requested IDs are missing, which prevents silent use of wrong resources. Analyzer naming/parameter changes add analyzer_extra_info, but default callers pass "", so existing analyzers and index contents remain unchanged.
- New capability: DataNode compaction and StatsJobs can now build text indexes using external file resources in Ref mode. DataCoord exposes GetFileResources and populates CompactionPlan.file_resources; SortCompaction/StatsTask download resources via fileresource.Manager, produce an analyzer_extra_info JSON (storage + resource->id map) via analyzer.BuildExtraResourceInfo, and propagate analyzer_extra_info into BuildIndexInfo so that the tantivy bindings can load custom analyzers during text index creation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
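The "newer resource sets win, older entries are pruned, missing resources are an error" invariant described in the notes above can be sketched as follows. This is a minimal illustration, not the Milvus implementation: `resourceSet`, `Sync`, and `Lookup` are hypothetical names invented for this example, and the real RefManager/SynchManager types may be structured quite differently.

```go
package main

import "fmt"

// resourceSet is a hypothetical stand-in for a manager's resource map
// (resource name -> resource ID), tagged with the version of the sync
// that produced it. It is not a Milvus type.
type resourceSet struct {
	version   uint64
	resources map[string]int64
}

// Sync applies the incoming set only if its version is strictly newer.
// The map is replaced wholesale, so entries absent from the newer set
// are pruned. It reports whether the incoming set was applied.
func (s *resourceSet) Sync(version uint64, incoming map[string]int64) bool {
	if version <= s.version {
		return false // stale or duplicate sync: keep the current set
	}
	next := make(map[string]int64, len(incoming))
	for name, id := range incoming {
		next[name] = id
	}
	s.version = version
	s.resources = next
	return true
}

// Lookup resolves resource names to IDs and fails loudly if any name is
// missing, instead of silently returning a partial result.
func (s *resourceSet) Lookup(names ...string) ([]int64, error) {
	ids := make([]int64, 0, len(names))
	for _, name := range names {
		id, ok := s.resources[name]
		if !ok {
			return nil, fmt.Errorf("file resource %q not found at version %d", name, s.version)
		}
		ids = append(ids, id)
	}
	return ids, nil
}

func main() {
	s := &resourceSet{}
	fmt.Println(s.Sync(2, map[string]int64{"user_dict.txt": 7})) // newer version
	fmt.Println(s.Sync(1, map[string]int64{"stale.txt": 1}))     // older version
	_, err := s.Lookup("stale.txt")
	fmt.Println(err)
}
```

Replacing the whole map on each accepted sync keeps pruning implicit: nothing from an older version can survive unless the newer set lists it again.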
33 lines
885 B
Go
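// Package analyzer is a thin wrapper that forwards every call to the
// canalyzer package and re-exports the analyzer interfaces.
package analyzer

import (
	"github.com/milvus-io/milvus/internal/util/analyzer/canalyzer"
	"github.com/milvus-io/milvus/internal/util/analyzer/interfaces"
	"github.com/milvus-io/milvus/pkg/v2/proto/internalpb"
)

// Re-exported so callers do not need to import the interfaces package.
type (
	Analyzer    interfaces.Analyzer
	TokenStream interfaces.TokenStream
)

// NewAnalyzer creates an analyzer from param. extraInfo carries the
// analyzer_extra_info string; callers without file resources pass "".
func NewAnalyzer(param string, extraInfo string) (Analyzer, error) {
	return canalyzer.NewAnalyzer(param, extraInfo)
}

// ValidateAnalyzer validates param (with extraInfo) via canalyzer.
func ValidateAnalyzer(param string, extraInfo string) ([]int64, error) {
	return canalyzer.ValidateAnalyzer(param, extraInfo)
}

// UpdateGlobalResourceInfo forwards the resource name -> ID map to canalyzer.
func UpdateGlobalResourceInfo(resourceMap map[string]int64) error {
	return canalyzer.UpdateGlobalResourceInfo(resourceMap)
}

// BuildExtraResourceInfo builds the analyzer_extra_info string from the
// storage name and the given file resources.
func BuildExtraResourceInfo(storage string, resources []*internalpb.FileResourceInfo) (string, error) {
	return canalyzer.BuildExtraResourceInfo(storage, resources)
}

// InitOptions forwards to canalyzer.InitOptions.
func InitOptions() {
	canalyzer.InitOptions()
}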