milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
zhagnlu	031acf5711	enhance: convert jsonstats translator to bson_index translator (#45036 ) issue: #42533 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-12-31 10:39:21 +08:00
yihao.dai	b18ebd9468	enhance: Remove legacy cdc/replication (#46603 ) issue: https://github.com/milvus-io/milvus/issues/44123 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: legacy in-cluster CDC/replication plumbing (ReplicateMsg types, ReplicateID-based guards and flags) is obsolete — the system relies on standard msgstream positions, subPos/end-ts semantics and timetick ordering as the single source of truth for message ordering and skipping, so replication-specific channels/types/guards can be removed safely. - Removed/simplified logic (what and why): removed replication feature flags and params (ReplicateMsgChannel, TTMsgEnabled, CollectionReplicateEnable), ReplicateMsg type and its tests, ReplicateID constants/helpers and MergeProperties hooks, ReplicateConfig and its propagation (streamPipeline, StreamConfig, dispatcher, target), replicate-aware dispatcher/pipeline branches, and replicate-mode pre-checks/timestamp-allocation in proxy tasks — these implemented a redundant alternate “replicate-mode” pathway that duplicated position/end-ts and timetick logic. - Why this does NOT cause data loss or regression (concrete code paths): no persistence or core write paths were removed — proxy PreExecute flows (internal/proxy/task_*.go) still perform the same schema/ID/size validations and then follow the normal non-replicate execution path; dispatcher and pipeline continue to use position/subPos and pullback/end-ts in Seek/grouping (pkg/mq/msgdispatcher/dispatcher.go, internal/util/pipeline/stream_pipeline.go), so skipping and ordering behavior remains unchanged; timetick emission in rootcoord (sendMinDdlTsAsTt) is now ungated (no silent suppression), preserving or increasing timetick delivery rather than removing it. - PR type and net effect: Enhancement/Refactor — removes deprecated replication API surface (types, helpers, config, tests) and replication branches, simplifies public APIs and constructor signatures, and reduces surface area for future maintenance while keeping DML/DDL persistence, ordering, and seek semantics intact. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-12-30 14:53:21 +08:00
congqixia	b4682b7352	fix: use LoadDeltaData instead of Delete for L0 growing forward (#46657 ) Related to #46660 Replace segment.Delete() with segment.LoadDeltaData() when forwarding L0 deletions to growing segments. LoadDeltaData is the more appropriate API for bulk loading delta data compared to individual Delete calls. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> • Core invariant: forwarding L0 deletions to growing segments must use the bulk-delta API (storage.DeltaData + segment.LoadDeltaData) because LoadDeltaData preserves paired primary keys and timestamps as a single atomic delta payload; segment.Delete was intended for per-delete RPCs and not for loading L0 delta payloads. • Logic removed/simplified: addL0GrowingBF() no longer calls segment.Delete for buffered L0 keys. Instead the buffered callback builds a storage.DeltaData via storage.NewDeltaDataWithData(pks, tss) and calls segment.LoadDeltaData(ctx, dd). This eliminates the previous per-batch Delete call path and centralizes forwarding as a single delta-load operation. • Why this does not cause data loss or regression: the new path supplies identical PK+timestamp pairs to the segment via DeltaData; LoadDeltaData applies the same delete semantics but accepts batched delta payloads. The change is limited to the L0→growing Bloom-Filter forward path (addL0GrowingBF/addL0ForGrowingLoad), leaving sealed-segment deletes, streaming direct forwarding, and remote-load policies unchanged. Also, the prior code would fail on L0Segment.Delete (L0 segments prohibit Delete), so switching to LoadDeltaData prevents lost-forwarding caused by unsupported Delete calls. • Category: Enhancement / Refactor — replaces inappropriate per-delete calls with the correct bulk delta-load API, simplifying error handling around NewDeltaDataWithData and ensuring API contract correctness for L0→growing forwarding. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-12-30 14:05:21 +08:00
wei liu	293838bb67	enhance: add delegator catching up streaming data state tracking (#46551 ) issue: #46550 - Add CatchUpStreamingDataTsLag parameter to control tolerable lag threshold for delegator to be considered caught up - Add catchingUpStreamingData field in delegator to track whether delegator has caught up with streaming data - Add catching_up_streaming_data field in LeaderViewStatus proto - Check catching up status in CheckDelegatorDataReady, return not ready when delegator is still catching up streaming data - Add unit tests for the new functionality When tsafe lag exceeds the threshold, the distribution will not be considered serviceable, preventing queries from timing out in waitTSafe. This is useful when streaming message queue consumption is slow. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: a delegator must not be considered serviceable while its tsafe lags behind the latest committed timestamp beyond a configurable tolerance; a delegator is "caught-up" only when (latestTsafe - delegator.GetTSafe()) < CatchUpStreamingDataTsLag (configured by queryNode.delegator.catchUpStreamingDataTsLag, default 1s). - New capability and where it takes effect: adds streaming-catchup tracking to QueryNode/QueryCoord — an atomic catchingUpStreamingData flag on shardDelegator (internal/querynodev2/delegator/delegator.go), a new param CatchUpStreamingDataTsLag (pkg/util/paramtable/component_param.go), and a LeaderViewStatus.catching_up_streaming_data field in the proto (pkg/proto/query_coord.proto). The flag is exposed in GetDataDistribution (internal/querynodev2/services.go) and used by QueryCoord readiness checks (internal/querycoordv2/utils/util.go::CheckDelegatorDataReady) to reject leaders that are still catching up. - What logic is simplified/added (not removed): instead of relying solely on segment distribution/worker heartbeats, the PR adds an explicit readiness gate that returns "not available" when the delegator reports catching-up-streaming-data. This is strictly additive — no existing checks are removed; the new precondition runs before segment availability validation to prevent premature routing to slow-consuming delegators. - Why this does NOT cause data loss or regress behavior: the change only controls serviceability visibility and routing — it never drops or mutates data. Concretely: shardDelegator starts with catchingUpStreamingData=true and flips to false in UpdateTSafe once the sampled lag falls below the configured threshold (internal/querynodev2/delegator/delegator.go::UpdateTSafe). QueryCoord will short-circuit in CheckDelegatorDataReady when leader.Status.GetCatchingUpStreamingData() is true (internal/querycoordv2/utils/util.go), returning a channel-not-available error before any segment checks; when the flag clears, existing segment-distribution checks (same code paths) resume. Tests added cover both catching-up and caught-up paths (internal/querynodev2/delegator/delegator_test.go, internal/querycoordv2/utils/util_test.go, internal/querynodev2/services_test.go), demonstrating convergence without changed data flows or deletion of data. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-12-29 17:15:21 +08:00
XuanYang-cn	4230a5beaa	enhance: revert mmap configs (#46581 ) Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-12-29 15:09:21 +08:00
wei liu	0d70d2b98c	enhance: simplify seek position selection in WatchDmChannels (#46567 ) issue: #46566 Remove the complex comparison logic between seekPosition and deleteCheckpoint. Use seekPosition directly since: - L0 segments are loaded before consuming message stream, which contain delete records from [deleteCheckpoint, L0.endPosition] - DataCoord ensures seekPosition is based on channel checkpoint, updated after data (including deletes) is flushed - L0 segments should cover up to seekPosition, avoiding data loss - This eliminates redundant message consumption when seekPosition > deleteCheckpoint <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: L0 segments are loaded before consuming the DM channel stream and contain delete records for range [deleteCheckpoint, L0.endPosition]; DataCoord guarantees channel.GetSeekPosition() is derived from the channel checkpoint after data (including deletes) is flushed, so L0 segments collectively cover up to that seekPosition. - Change made: removed the prior branching that built a synthetic seek position from deleteCheckpoint vs. channel checkpoint and instead always calls channel.GetSeekPosition() (used directly in ConsumeMsgStream). Added an informational log comparing seekPosition and deleteCheckpoint. - Why the removed logic was redundant: deleteCheckpoint represented the smallest start position of L0 segments and was used to avoid re-consuming delete messages already present in loaded L0 segments. Because L0 segments already include deletes up to the channel checkpoint and DataCoord updates the channel checkpoint after flush, using deleteCheckpoint to alter the seek introduces duplicate consumption without benefit. - Why this is safe (no data loss/regression): L0 segments are guaranteed to be loaded before consumption, so deletes present in L0 cover the range up to channel.GetSeekPosition(); delete records earlier than deleteCheckpoint have been compacted to L1 and can be evicted from the delete buffer. The code path still calls ConsumeMsgStream with the channel seek position, preserving original consumption/error handling, so no messages are skipped and no additional delete application occurs beyond what L0/L1 already cover. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-12-29 10:53:22 +08:00
sijie-ni-0214	0a54c93227	fix: etcd RPC size limit exceeded when dropping collection (#46414 ) issue: https://github.com/milvus-io/milvus/issues/46410 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: etcd metadata and in-memory Segment/TextIndex records must store only compact filenames for text-index files; full object keys are deterministically reconstructed at use-sites from a stable root + common.TextIndexPath + IDs via metautil.BuildTextLogPaths. - Bug & fix (issue #46410): the etcd RPC size overflow was caused by persisting full upload keys in segment/TextIndex metadata. Fix: at upload/creation sites (internal/datanode/compactor/sort_compaction.go and internal/datanode/index/task_stats.go) store only filenames using metautil.ExtractTextLogFilenames; at consumption/use sites (internal/datacoord/garbage_collector.go, internal/querynodev2/segments/segment.go, and other GC/loader code) reconstruct full paths with metautil.BuildTextLogPaths before accessing object storage. - Simplified/removed logic: removed the redundant practice of carrying full object keys through metadata and in-memory structures; callers now persist compact filenames and perform on-demand path reconstruction. This eliminates large payloads in etcd and reduces memory pressure while preserving the same runtime control flow and error handling. - No data loss / no regression: filename extraction is a deterministic suffix operation (metautil.ExtractTextLogFilenames) and reloadFromKV performs backward compatibility (internal/datacoord/meta.go converts existing full-path entries to filenames before caching). All read paths reconstruct full paths at runtime (garbage_collector.getTextLogs, LocalSegment.LoadTextIndex, GC/loader), so no files are modified/deleted and access semantics remain identical. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: sijie-ni-0214 <sijie.ni@zilliz.com>	2025-12-28 15:31:19 +08:00
aoiasd	55feb7ded8	feat: set related resource ids in collection schema (#46423 ) Support crate analyzer with file resource info, and return used file resource ids when validate analyzer. Save the related resource ids in collection schema. relate: https://github.com/milvus-io/milvus/issues/43687 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: analyzer file-resource resolution is deterministic and traceable by threading a FileResourcePathHelper (collecting used resource IDs in a HashSet) through all tokenizer/analyzer construction and validation paths; validate_analyzer(params, extra_info) returns the collected Vec<i64) which is propagated through C/Rust/Go layers to callers (CValidateResult → RustResult::from_vec_i64 → Go []int64 → querypb.ValidateAnalyzerResponse.ResourceIds → CollectionSchema.FileResourceIds). - Logic removed/simplified: ad‑hoc, scattered resource-path lookups and per-filter file helpers (e.g., read_synonyms_file and other inline file-reading logic) were consolidated into ResourceInfo + FileResourcePathHelper and a centralized get_resource_path(helper, ...) API; filter/tokenizer builder APIs now accept &mut FileResourcePathHelper so all file path resolution and ID collection use the same path and bookkeeping logic (redundant duplicated lookups removed). - Why no data loss or behavior regression: changes are additive and default-preserving — existing call sites pass extra_info = "" so analyzer creation/validation behavior and error paths remain unchanged; new Collection.FileResourceIds is populated from resp.ResourceIds in validateSchema and round‑tripped through marshal/unmarshal (model.Collection ↔ schemapb.CollectionSchema) so schema persistence uses the new list without overwriting other schema fields; proto change adds a repeated field (resource_ids) which is wire‑compatible (older clients ignore extra field). Concrete code paths: analyzer creation still uses create_analyzer (now with extra_info ""), tokenizer validation still returns errors as before but now also returns IDs via CValidateResult/RustResult, and rootcoord.validateSchema assigns resp.ResourceIds → schema.FileResourceIds. - New capability added: end‑to‑end discovery, return, and persistence of file resource IDs used by analyzers — validate flows now return resource IDs and the system stores them in collection schema (affects tantivy analyzer binding, canalyzer C bindings, internal/util analyzer APIs, querynode ValidateAnalyzer response, and rootcoord/create_collection flow). <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-26 22:49:19 +08:00
zhagnlu	9ba0c4e501	fix:add json stats version because previous change #46130 (#46467 ) #42533 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-12-24 19:17:18 +08:00
congqixia	6452d146af	enhance: move jemalloc_stats from pkg to internal/util/segcore (#46560 ) Related to #46133 Move jemalloc_stats.go and its test file from pkg/util/hardware to internal/util/segcore. This is a more appropriate location because: - jemalloc_stats depends on milvus_core C++ library via cgo - The pkg directory should remain independent of internal C++ dependencies - segcore is the natural home for core memory allocator utilities <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Refactor * Improved internal code organization by reorganizing memory statistics collection infrastructure for better maintainability and modularity. No impact on end-user functionality or behavior. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-12-24 19:03:18 +08:00
marcelo-cjl	3b599441fd	feat: Add nullable vector support for proxy and querynode (#46305 ) related: #45993 This commit extends nullable vector support to the proxy layer, querynode, and adds comprehensive validation, search reduce, and field data handling for nullable vectors with sparse storage. Proxy layer changes: - Update validate_util.go checkAligned() with getExpectedVectorRows() helper to validate nullable vector field alignment using valid data count - Update checkFloatVectorFieldData/checkSparseFloatVectorFieldData for nullable vector validation with proper row count expectations - Add FieldDataIdxComputer in typeutil/schema.go for logical-to-physical index translation during search reduce operations - Update search_reduce_util.go reduceSearchResultData to use idxComputers for correct field data indexing with nullable vectors - Update task.go, task_query.go, task_upsert.go for nullable vector handling - Update msg_pack.go with nullable vector field data processing QueryNode layer changes: - Update segments/result.go for nullable vector result handling - Update segments/search_reduce.go with nullable vector offset translation Storage and index changes: - Update data_codec.go and utils.go for nullable vector serialization - Update indexcgowrapper/dataset.go and index.go for nullable vector indexing Utility changes: - Add FieldDataIdxComputer struct with Compute() method for efficient logical-to-physical index mapping across multiple field data - Update EstimateEntitySize() and AppendFieldData() with fieldIdxs parameter - Update funcutil.go with nullable vector support functions <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Full support for nullable vector fields (float, binary, float16, bfloat16, int8, sparse) across ingest, storage, indexing, search and retrieval; logical↔physical offset mapping preserves row semantics. * Client: compaction control and compaction-state APIs. * Bug Fixes * Improved validation for adding vector fields (nullable + dimension checks) and corrected search/query behavior for nullable vectors. * Chores * Persisted validity maps with indexes and on-disk formats. * Tests * Extensive new and updated end-to-end nullable-vector tests. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: marcelo-cjl <marcelo.chen@zilliz.com>	2025-12-24 10:13:19 +08:00
XuanYang-cn	0507db2015	feat: Add force merge (#45556 ) See also: #46043 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-12-19 18:03:18 +08:00
Spade A	ad8aba7cb4	feat: impl ComputePhraseMatchSlop for compute min slop for phrase match query (#45892 ) issue: https://github.com/milvus-io/milvus/issues/45890 ComputePhraseMatchSlop accepts three pararms: 1. A string: query text 2. Some trings: data texts 3. Analyzer params, Slop will be calculated for the query text with each data text in the context of phrase match where they are tokenized with tokenizer with analyzer params. So two array will be returned: 1. is_match: is phrase match can sucess 2. slop: the related slop if phrase match can sucess, or -1 is cannot. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-12-19 16:03:18 +08:00
aoiasd	7e4f87e351	fix: Init analyzer at delegator for all field with enable analyzer (#46361 ) To support text match highlight relate: https://github.com/milvus-io/milvus/issues/46308 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-19 10:23:18 +08:00
congqixia	21ed1fabfd	feat: support reopen segment for data/schema changes (#46359 ) issue: #46358 This PR implements segment reopening functionality on query nodes, enabling the application of data or schema changes to already-loaded segments without requiring a full reload. ### Core (C++) New SegmentLoadInfo class (`internal/core/src/segcore/SegmentLoadInfo.h/cpp`): - Encapsulates segment load configuration with structured access - Implements `ComputeDiff()` to calculate differences between old and new load states - Tracks indexes, binlogs, and column groups that need to be loaded or dropped - Provides `ConvertFieldIndexInfoToLoadIndexInfo()` for index loading ChunkedSegmentSealedImpl modifications: - Added `Reopen(const SegmentLoadInfo&)` method to apply incremental changes based on computed diff - Refactored `LoadColumnGroups()` and `LoadColumnGroup()` to support selective loading via field ID map - Extracted `LoadBatchIndexes()` and `LoadBatchFieldData()` for reusable batch loading logic - Added `LoadManifest()` for manifest-based loading path - Updated all methods to use `SegmentLoadInfo` wrapper instead of direct proto access SegmentGrowingImpl modifications: - Added `Reopen()` stub method for interface compliance C API additions (`segment_c.h/cpp`): - Added `ReopenSegment()` function exposing reopen to Go layer ### Go Side QueryNode handlers (`internal/querynodev2/`): - Added `HandleReopen()` in handlers.go - Added `ReopenSegments()` RPC in services.go Segment interface (`internal/querynodev2/segments/`): - Extended `Segment` interface with `Reopen()` method - Implemented `Reopen()` in LocalSegment - Added `Reopen()` to segment loader Segcore wrapper (`internal/util/segcore/`): - Added `Reopen()` method in segment.go - Added `ReopenSegmentRequest` in requests.go ### Proto - Added new fields to support reopen in `query_coord.proto` --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-12-17 15:49:16 +08:00
congqixia	efa7ccdf81	fix: pass manifest path when loading growing segments (#46378 ) Related to #44956 Pass ManifestPath field to SegmentLoadInfo when loading growing segments in loadGrowingSegments function. This ensures storage v2 can properly locate segment data via manifest path, consistent with other segment loading paths. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-12-17 10:19:15 +08:00
XuanYang-cn	0bbb134e39	feat: Enable to backup and reload ez (#46332 ) see also: #40013 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-12-16 17:19:16 +08:00
Spade A	f6f716bcfd	feat: impl StructArray -- support embedding searches embeddings in embedding list with element level filter expression (#45830 ) issue: https://github.com/milvus-io/milvus/issues/42148 For a vector field inside a STRUCT, since a STRUCT can only appear as the element type of an ARRAY field, the vector field in STRUCT is effectively an array of vectors, i.e. an embedding list. Milvus already supports searching embedding lists with metrics whose names start with the prefix MAX_SIM_. This PR allows Milvus to search embeddings inside an embedding list using the same metrics as normal embedding fields. Each embedding in the list is treated as an independent vector and participates in ANN search. Further, since STRUCT may contain scalar fields that are highly related to the embedding field, this PR introduces an element-level filter expression to refine search results. The grammar of the element-level filter is: element_filter(structFieldName, $[subFieldName] == 3) where $[subFieldName] refers to the value of subFieldName in each element of the STRUCT array structFieldName. It can be combined with existing filter expressions, for example: "varcharField == 'aaa' && element_filter(struct_field, $[struct_int] == 3)" A full example: ``` struct_schema = milvus_client.create_struct_field_schema() struct_schema.add_field("struct_str", DataType.VARCHAR, max_length=65535) struct_schema.add_field("struct_int", DataType.INT32) struct_schema.add_field("struct_float_vec", DataType.FLOAT_VECTOR, dim=EMBEDDING_DIM) schema.add_field( "struct_field", datatype=DataType.ARRAY, element_type=DataType.STRUCT, struct_schema=struct_schema, max_capacity=1000, ) ... filter = "varcharField == 'aaa' && element_filter(struct_field, $[struct_int] == 3 && $[struct_str] == 'abc')" res = milvus_client.search( COLLECTION_NAME, data=query_embeddings, limit=10, anns_field="struct_field[struct_float_vec]", filter=filter, output_fields=["struct_field[struct_int]", "varcharField"], ) ``` TODO: 1. When an `element_filter` expression is used, a regular filter expression must also be present. Remove this restriction. 2. Implement `element_filter` expressions in the `query`. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-12-15 12:01:15 +08:00
aoiasd	0c54875832	enhance: ValidateAnalyzer return ValidateAnalyzerResponse instead common.Status (#46292 ) Prepare for return more info when validate analyzer. relate: https://github.com/milvus-io/milvus/issues/43687 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-12 10:35:14 +08:00
wgcn	6e2872c982	fix: wrong reduce lantency metric (#46233 ) #46248 Signed-off-by: wgcn <wangg48@chinatelecom.cn> Co-authored-by: wgcn <wangg48@chinatelecom.cn>	2025-12-10 14:17:13 +08:00
zhagnlu	8f0b7983ec	enhance: add jemalloc cached monitor (#46041 ) #46133 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-12-09 19:53:13 +08:00
wei liu	046693eaf7	test: [skip e2e] fix race condition in TestQueryNodePipeline/TestBasic (#46218 ) issue: #46217 The test was failing intermittently because it didn't wait for the pipeline to finish processing messages before exiting. The test sent a message to the pipeline and immediately returned, causing the deferred Close() to execute before ProcessInsert, ProcessDelete, and UpdateTSafe could be called. Fix by: - Moving message construction before mock expectations setup - Adding a done channel to synchronize on UpdateTSafe completion - Waiting for the signal before test exits Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-12-09 17:57:14 +08:00
Zhen Ye	459425ac84	fix: wrong context using by session of grpc client (#46183 ) issue: #46182 Signed-off-by: chyezh <chyezh@outlook.com>	2025-12-08 21:47:12 +08:00
aoiasd	354ab2f55e	enhance: sync file resource to querynode and datanode (#44480 ) relate:https://github.com/milvus-io/milvus/issues/43687 Support use file resource with sync mode. Auto download or remove file resource to local when user add or remove file resource. Sync file resource to node when find new node session. --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-04 16:23:11 +08:00
wei liu	e70c01362d	enhance: Add resource exhaustion querynode penalty policy (#45808 ) issue: #40513 for querynode which return resource exhausted error, add a penalty duration on it, and suspend loading new resource until penalty duration expired. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-12-02 16:59:11 +08:00
Zhen Ye	2ef18c5b4f	enhance: remove watch at session liveness check (#45968 ) issue: #45724 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-12-01 17:55:10 +08:00
zhagnlu	3901f112ae	enhance: make estimate json stats size more accurate (#45875 ) #42533 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-12-01 15:31:10 +08:00
aoiasd	7d19c40e3c	feat: support search highlight with queries (#45736 ) Previously, search with highlight only supported using BM25 search text as the highlight target. This PR adds support for highlighting with user-defined queries. relate: https://github.com/milvus-io/milvus/issues/42589 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-01 10:17:09 +08:00
congqixia	a7275e190e	fix: populate index info after segment loading to prevent redundant load tasks (#45803 ) After segments gained self-management capabilities for loading, the index information from the initial load was not being preserved in the Go-side segment metadata. This caused QueryCoord to repeatedly dispatch load index tasks, which would fail in segcore since the indexes were already loaded. Root Cause: The segment's `fieldIndexes` map was not being populated with index metadata after calling `FinishLoad`, leading to a mismatch between the Go-side metadata and segcore's internal state. Solution: After successfully loading a sealed segment, iterate through `loadInfo.IndexInfos` and insert each index entry into the segment's `fieldIndexes` map. This ensures the Go-side metadata stays in sync with segcore and prevents redundant load index operations. Fixes #45802 Related to #45060 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-24 19:55:07 +08:00
XuanYang-cn	c082317681	fix: Use base64 to encode not utf-8 bytes (#45655 ) See also: #45654 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-11-24 18:23:06 +08:00
aoiasd	5efb0cedc8	feat: support use fragment config for highlight (#45099 ) relate: https://github.com/milvus-io/milvus/issues/42589 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-24 17:07:06 +08:00
wei liu	3fbee154f6	enhance: Remove large segment ID arrays from QueryNode logs (#45719 ) issue: #45718 Logging complete segment ID arrays caused excessive log volume (3-6 TB for 200k segments). Remove arrays from logger fields and keep only segment counts for observability. Changes: - Remove requestSegments/preparedSegments arrays from Load logger - Remove segmentIDs from BM25 stats logs - Remove entries structure from sync distribution log This reduces log volume by 99.99% for large-scale operations. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-11-20 17:18:14 +08:00
aoiasd	947c8855f3	feat: support search bm25 with highlight (#44923 ) relate: https://github.com/milvus-io/milvus/issues/42589 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-18 16:09:39 +08:00
congqixia	0a208d7224	enhance: Move segment loading logic from Go layer to segcore for self-managed loading (#45488 ) Related to #45060 Refactor segment loading architecture to make segments autonomously manage their own loading process, moving the orchestration logic from Go (segment_loader.go) to C++ (segcore). C++ Layer (segcore): - Added `SetLoadInfo()` and `Load()` methods to `SegmentInterface` and implementations - Implemented `ChunkedSegmentSealedImpl::Load()` with parallel loading strategy: - Separates indexed fields from non-indexed fields - Loads indexes concurrently using thread pools - Loads field data for non-indexed fields in parallel - Implemented `SegmentGrowingImpl::Load()` to convert and load field data - Extracted `LoadIndexData()` as a reusable utility function in `Utils.cpp` - Added `SegmentLoad()` C binding in `segment_c.cpp` Go Layer: - Added `Load()` method to segment interfaces - Updated mock implementations and test interfaces - Integrated new C++ `SegmentLoad()` binding in Go segment wrapper --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-14 11:21:37 +08:00
cai.zhang	216c576da2	fix: Retain collection early to prevent it from being released before query completion (#45413 ) issue: #45314 This PR only ensures that no panic occurs. However, we still need to provide protection for the delegator handling ongoing query tasks. Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-11 20:29:37 +08:00
sparknack	f815f57b82	enhance: check both eviction and warmup when estimate segment loading size (#45222 ) issue: #44857 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-11-10 14:15:36 +08:00
congqixia	e284733399	fix: Move FinishLoad before text index creation to ensure raw data availability (#45334 ) Related to #45333 Fix segment loading failure when adding fields with text match enabled. The issue occurred because text indexes were being loaded before FinishLoad() was called, meaning raw data was not properly available when text index creation attempted to access it, resulting in "failed to create text index, neither raw data nor index are found" errors. Solution is to move the FinishLoad() call to execute after raw data loading but before text index loading. This ensures that: 1. Raw data is properly loaded and available in memory 2. Text indexes can access the raw data they need during creation 3. The segment is in the correct state before any index operations Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-06 14:49:34 +08:00
Lior Friedman	a4d69031f1	fix: Add AiSAQ index type RAM estimation implementation on the query node. (#45246 ) Currently, the index type AiSAQ RAM usage estimation is not being calculated correctly. AiSAQ index type consumes less RAM usage while loading the index than DISKANN does, and the query node module is missing the implementation of the RAM usage estimation for that AiSAQ index type. We suggest that the AiSAQ RAM usage estimation calculation should be as follows: UsedDiskMemoryRatioAisaq = 1024 (contrary to the UsedDiskMemoryRatio, which is 4) neededMemSize = indexInfo.IndexSize / UsedDiskMemoryRatioAisaq neededDiskSize = indexInfo.IndexSize Reported issue is #45247 --------- Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com> Signed-off-by: friedl <lior.friedman@kioxia.com> Co-authored-by: friedl <lior.friedman@kioxia.com>	2025-11-06 08:53:34 +08:00
congqixia	e6be590b97	enhance: set schema version when creating new collection (#45263 ) Related to #43028 Initialize the schema version field when creating a new collection instance in QueryNode. The schema version is extracted from loadMetaInfo and assigned to the collection, ensuring proper schema version tracking and consistency across the distributed system. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-04 10:15:32 +08:00
Zhen Ye	576084fe86	enhance: support alter collection/database with WAL-based DDL framework (#45266 ) issue: #43897 - Alter collection/database is implemented by WAL-based DDL framework now. - Support AlterCollection/AlterDatabase in wal now. - Alter operation can be synced by new CDC now. - Refactor some UT for alter DDL. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-04 09:59:33 +08:00
congqixia	8c98adfeb3	fix: update QueryNode NumEntities metrics when collection has no segments (#45147 ) Related to #44509 Fix a bug where QueryNodeNumEntities metrics were not updated for collections with zero segments, causing stale metrics when all segments are flushed or compacted. The previous implementation used separate loops: one to update size metrics for all collections, and another to update num entities metrics only for collections present in the grouped segments map. Collections with no segments were skipped in the second loop, leaving their NumEntities metrics stale. Changes: - Consolidate size and num entities metric updates into single loop - Iterate over all collections instead of grouped segments - Get collection metadata from manager instead of segment instances - Correctly set NumEntities to 0 for collections with no segments - Apply the same fix to both growing and sealed segment processing - Add nil check for collection metadata before processing This ensures all collection metrics are updated consistently, even when segment count drops to zero. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-30 10:12:08 +08:00
aoiasd	ad9a0cae48	enhance: add global analyzer options (#44684 ) relate: https://github.com/milvus-io/milvus/issues/43687 Add global analyzer options, avoid having to merge some milvus params into user's analyzer params. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-28 14:52:10 +08:00
congqixia	36a887b38b	enhance: add NewSegmentWithLoadInfo API to support segment self-managed loading (#45061 ) This commit introduces the foundation for enabling segments to manage their own loading process by passing load information during segment creation. Changes: C++ Layer: - Add NewSegmentWithLoadInfo() C API to create segments with serialized load info - Add SetLoadInfo() method to SegmentInterface for storing load information - Refactor segment creation logic into shared CreateSegment() helper function - Add comprehensive documentation for the new API Go Layer: - Extend CreateCSegmentRequest to support optional LoadInfo field - Update segment creation in querynode to pass SegmentLoadInfo when available - Add ConvertToSegcoreSegmentLoadInfo() and helper converters for proto translation Proto Definitions: - Add segcorepb.SegmentLoadInfo message with essential loading metadata - Add supporting messages: Binlog, FieldBinlog, FieldIndexInfo, TextIndexStats, JsonKeyStats - Remove dependency on data_coord.proto by creating segcore-specific definitions Testing: - Add comprehensive unit tests for proto conversion functions - Test edge cases including nil inputs, empty data, and nil array/map elements This is the first step toward issue #45060 - enabling segments to autonomously manage their loading process, which will: - Clarify responsibilities between Go and C++ layers - Reduce cross-language call overhead - Enable precise resource management at the C++ level - Support better integration with caching layer - Enable proactive schema evolution handling Related to #45060 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-27 15:28:12 +08:00
aoiasd	cfeb095ad7	enhance: forbid build analyzer at proxy (#44067 ) relate: https://github.com/milvus-io/milvus/issues/43687 We used to run the temporary analyzer and validate analyzer on the proxy, but the proxy should not be a computation-heavy node. This PR move all analyzer calculations to the streaming node. --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-23 10:58:12 +08:00
aoiasd	ac82bad0b3	enhance: optimize idf oracle sync logic (#44628 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-20 15:42:08 +08:00
sparknack	935160840c	enhance: add a disk quota for the loaded binlog size to prevent load failure of querynode (#44893 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-19 19:44:01 +08:00
aoiasd	754997ac2b	enhance: update some annotations (#44769 ) relate: https://github.com/milvus-io/milvus/issues/43114 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-17 16:22:02 +08:00
sparknack	c8a4d6e2ef	enhance: add cachinglayer management for TextMatchIndex (#44741 ) issue: #41435, #44502 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-13 14:37:58 +08:00
sparknack	6d5b41644b	enhance: remove logical usage checks during segment loading (#44743 ) issue: #41435 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-13 14:21:58 +08:00
congqixia	5ece760d73	fix: Pass fs via `FileManagerContext` when loading index (#44733 ) Related to #44615 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-11 09:55:57 +08:00

1 2 3 4 5 ...

852 Commits