milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-28 14:35:27 +08:00

Author	SHA1	Message	Date
aoiasd	55feb7ded8	feat: set related resource ids in collection schema (#46423 ) Support creating an analyzer with file resource info, and return the used file resource IDs when validating an analyzer. Save the related resource IDs in the collection schema. relate: https://github.com/milvus-io/milvus/issues/43687 - Core invariant: analyzer file-resource resolution is deterministic and traceable by threading a FileResourcePathHelper (collecting used resource IDs in a HashSet) through all tokenizer/analyzer construction and validation paths; validate_analyzer(params, extra_info) returns the collected Vec<i64>, which is propagated through C/Rust/Go layers to callers (CValidateResult → RustResult::from_vec_i64 → Go []int64 → querypb.ValidateAnalyzerResponse.ResourceIds → CollectionSchema.FileResourceIds). - Logic removed/simplified: ad-hoc, scattered resource-path lookups and per-filter file helpers (e.g., read_synonyms_file and other inline file-reading logic) were consolidated into ResourceInfo + FileResourcePathHelper and a centralized get_resource_path(helper, ...) API; filter/tokenizer builder APIs now accept &mut FileResourcePathHelper so all file path resolution and ID collection use the same path and bookkeeping logic (redundant duplicated lookups removed). - Why no data loss or behavior regression: changes are additive and default-preserving: existing call sites pass extra_info = "" so analyzer creation/validation behavior and error paths remain unchanged; new Collection.FileResourceIds is populated from resp.ResourceIds in validateSchema and round-tripped through marshal/unmarshal (model.Collection ↔ schemapb.CollectionSchema) so schema persistence uses the new list without overwriting other schema fields; proto change adds a repeated field (resource_ids) which is wire-compatible (older clients ignore extra field).
Concrete code paths: analyzer creation still uses create_analyzer (now with extra_info ""), tokenizer validation still returns errors as before but now also returns IDs via CValidateResult/RustResult, and rootcoord.validateSchema assigns resp.ResourceIds → schema.FileResourceIds. - New capability added: end-to-end discovery, return, and persistence of file resource IDs used by analyzers: validate flows now return resource IDs and the system stores them in collection schema (affects tantivy analyzer binding, canalyzer C bindings, internal/util analyzer APIs, querynode ValidateAnalyzer response, and rootcoord/create_collection flow). Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-26 22:49:19 +08:00
zhagnlu	9ba0c4e501	fix:add json stats version because previous change #46130 (#46467 ) #42533 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-12-24 19:17:18 +08:00
congqixia	6452d146af	enhance: move jemalloc_stats from pkg to internal/util/segcore (#46560 ) Related to #46133 Move jemalloc_stats.go and its test file from pkg/util/hardware to internal/util/segcore. This is a more appropriate location because: - jemalloc_stats depends on milvus_core C++ library via cgo - The pkg directory should remain independent of internal C++ dependencies - segcore is the natural home for core memory allocator utilities ## Summary by CodeRabbit * Refactor * Improved internal code organization by reorganizing memory statistics collection infrastructure for better maintainability and modularity. No impact on end-user functionality or behavior. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-12-24 19:03:18 +08:00
marcelo-cjl	3b599441fd	feat: Add nullable vector support for proxy and querynode (#46305 ) related: #45993 This commit extends nullable vector support to the proxy layer, querynode, and adds comprehensive validation, search reduce, and field data handling for nullable vectors with sparse storage. Proxy layer changes: - Update validate_util.go checkAligned() with getExpectedVectorRows() helper to validate nullable vector field alignment using valid data count - Update checkFloatVectorFieldData/checkSparseFloatVectorFieldData for nullable vector validation with proper row count expectations - Add FieldDataIdxComputer in typeutil/schema.go for logical-to-physical index translation during search reduce operations - Update search_reduce_util.go reduceSearchResultData to use idxComputers for correct field data indexing with nullable vectors - Update task.go, task_query.go, task_upsert.go for nullable vector handling - Update msg_pack.go with nullable vector field data processing QueryNode layer changes: - Update segments/result.go for nullable vector result handling - Update segments/search_reduce.go with nullable vector offset translation Storage and index changes: - Update data_codec.go and utils.go for nullable vector serialization - Update indexcgowrapper/dataset.go and index.go for nullable vector indexing Utility changes: - Add FieldDataIdxComputer struct with Compute() method for efficient logical-to-physical index mapping across multiple field data - Update EstimateEntitySize() and AppendFieldData() with fieldIdxs parameter - Update funcutil.go with nullable vector support functions ## Summary by CodeRabbit * New Features * Full support for nullable vector fields (float, binary, float16, bfloat16, int8, sparse) across ingest, storage, indexing, search and retrieval; logical↔physical offset mapping preserves row semantics. * Client: compaction control and compaction-state APIs.
* Bug Fixes * Improved validation for adding vector fields (nullable + dimension checks) and corrected search/query behavior for nullable vectors. * Chores * Persisted validity maps with indexes and on-disk formats. * Tests * Extensive new and updated end-to-end nullable-vector tests. --------- Signed-off-by: marcelo-cjl <marcelo.chen@zilliz.com>	2025-12-24 10:13:19 +08:00
XuanYang-cn	0507db2015	feat: Add force merge (#45556 ) See also: #46043 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-12-19 18:03:18 +08:00
Spade A	ad8aba7cb4	feat: impl ComputePhraseMatchSlop for compute min slop for phrase match query (#45892 ) issue: https://github.com/milvus-io/milvus/issues/45890 ComputePhraseMatchSlop accepts three params: 1. A string: the query text 2. Some strings: the data texts 3. Analyzer params. The slop is calculated for the query text against each data text in the context of phrase match, where both are tokenized by a tokenizer built from the analyzer params. Two arrays are returned: 1. is_match: whether the phrase match can succeed 2. slop: the related slop if the phrase match can succeed, or -1 if it cannot. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-12-19 16:03:18 +08:00
aoiasd	7e4f87e351	fix: Init analyzer at delegator for all field with enable analyzer (#46361 ) To support text match highlight relate: https://github.com/milvus-io/milvus/issues/46308 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-19 10:23:18 +08:00
congqixia	21ed1fabfd	feat: support reopen segment for data/schema changes (#46359 ) issue: #46358 This PR implements segment reopening functionality on query nodes, enabling the application of data or schema changes to already-loaded segments without requiring a full reload. ### Core (C++) New SegmentLoadInfo class (`internal/core/src/segcore/SegmentLoadInfo.h/cpp`): - Encapsulates segment load configuration with structured access - Implements `ComputeDiff()` to calculate differences between old and new load states - Tracks indexes, binlogs, and column groups that need to be loaded or dropped - Provides `ConvertFieldIndexInfoToLoadIndexInfo()` for index loading ChunkedSegmentSealedImpl modifications: - Added `Reopen(const SegmentLoadInfo&)` method to apply incremental changes based on computed diff - Refactored `LoadColumnGroups()` and `LoadColumnGroup()` to support selective loading via field ID map - Extracted `LoadBatchIndexes()` and `LoadBatchFieldData()` for reusable batch loading logic - Added `LoadManifest()` for manifest-based loading path - Updated all methods to use `SegmentLoadInfo` wrapper instead of direct proto access SegmentGrowingImpl modifications: - Added `Reopen()` stub method for interface compliance C API additions (`segment_c.h/cpp`): - Added `ReopenSegment()` function exposing reopen to Go layer ### Go Side QueryNode handlers (`internal/querynodev2/`): - Added `HandleReopen()` in handlers.go - Added `ReopenSegments()` RPC in services.go Segment interface (`internal/querynodev2/segments/`): - Extended `Segment` interface with `Reopen()` method - Implemented `Reopen()` in LocalSegment - Added `Reopen()` to segment loader Segcore wrapper (`internal/util/segcore/`): - Added `Reopen()` method in segment.go - Added `ReopenSegmentRequest` in requests.go ### Proto - Added new fields to support reopen in `query_coord.proto` --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-12-17 15:49:16 +08:00
congqixia	efa7ccdf81	fix: pass manifest path when loading growing segments (#46378 ) Related to #44956 Pass ManifestPath field to SegmentLoadInfo when loading growing segments in loadGrowingSegments function. This ensures storage v2 can properly locate segment data via manifest path, consistent with other segment loading paths. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-12-17 10:19:15 +08:00
XuanYang-cn	0bbb134e39	feat: Enable to backup and reload ez (#46332 ) see also: #40013 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-12-16 17:19:16 +08:00
Spade A	f6f716bcfd	feat: impl StructArray -- support embedding searches embeddings in embedding list with element level filter expression (#45830 ) issue: https://github.com/milvus-io/milvus/issues/42148 For a vector field inside a STRUCT, since a STRUCT can only appear as the element type of an ARRAY field, the vector field in STRUCT is effectively an array of vectors, i.e. an embedding list. Milvus already supports searching embedding lists with metrics whose names start with the prefix MAX_SIM_. This PR allows Milvus to search embeddings inside an embedding list using the same metrics as normal embedding fields. Each embedding in the list is treated as an independent vector and participates in ANN search. Further, since STRUCT may contain scalar fields that are highly related to the embedding field, this PR introduces an element-level filter expression to refine search results. The grammar of the element-level filter is: element_filter(structFieldName, $[subFieldName] == 3) where $[subFieldName] refers to the value of subFieldName in each element of the STRUCT array structFieldName. It can be combined with existing filter expressions, for example: "varcharField == 'aaa' && element_filter(struct_field, $[struct_int] == 3)" A full example: ``` struct_schema = milvus_client.create_struct_field_schema() struct_schema.add_field("struct_str", DataType.VARCHAR, max_length=65535) struct_schema.add_field("struct_int", DataType.INT32) struct_schema.add_field("struct_float_vec", DataType.FLOAT_VECTOR, dim=EMBEDDING_DIM) schema.add_field( "struct_field", datatype=DataType.ARRAY, element_type=DataType.STRUCT, struct_schema=struct_schema, max_capacity=1000, ) ... 
filter = "varcharField == 'aaa' && element_filter(struct_field, $[struct_int] == 3 && $[struct_str] == 'abc')" res = milvus_client.search( COLLECTION_NAME, data=query_embeddings, limit=10, anns_field="struct_field[struct_float_vec]", filter=filter, output_fields=["struct_field[struct_int]", "varcharField"], ) ``` TODO: 1. When an `element_filter` expression is used, a regular filter expression must also be present. Remove this restriction. 2. Implement `element_filter` expressions in the `query`. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-12-15 12:01:15 +08:00
aoiasd	0c54875832	enhance: ValidateAnalyzer return ValidateAnalyzerResponse instead common.Status (#46292 ) Prepares for returning more information when validating an analyzer. relate: https://github.com/milvus-io/milvus/issues/43687 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-12 10:35:14 +08:00
wgcn	6e2872c982	fix: wrong reduce lantency metric (#46233 ) #46248 Signed-off-by: wgcn <wangg48@chinatelecom.cn> Co-authored-by: wgcn <wangg48@chinatelecom.cn>	2025-12-10 14:17:13 +08:00
zhagnlu	8f0b7983ec	enhance: add jemalloc cached monitor (#46041 ) #46133 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-12-09 19:53:13 +08:00
wei liu	046693eaf7	test: [skip e2e] fix race condition in TestQueryNodePipeline/TestBasic (#46218 ) issue: #46217 The test was failing intermittently because it didn't wait for the pipeline to finish processing messages before exiting. The test sent a message to the pipeline and immediately returned, causing the deferred Close() to execute before ProcessInsert, ProcessDelete, and UpdateTSafe could be called. Fix by: - Moving message construction before mock expectations setup - Adding a done channel to synchronize on UpdateTSafe completion - Waiting for the signal before test exits Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-12-09 17:57:14 +08:00
Zhen Ye	459425ac84	fix: wrong context using by session of grpc client (#46183 ) issue: #46182 Signed-off-by: chyezh <chyezh@outlook.com>	2025-12-08 21:47:12 +08:00
aoiasd	354ab2f55e	enhance: sync file resource to querynode and datanode (#44480 ) relate: https://github.com/milvus-io/milvus/issues/43687 Support using file resources in sync mode. Automatically download file resources to local storage, or remove them, when a user adds or removes a file resource. Sync file resources to a node when a new node session is found. --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-04 16:23:11 +08:00
wei liu	e70c01362d	enhance: Add resource exhaustion querynode penalty policy (#45808 ) issue: #40513 For a querynode that returns a resource exhausted error, add a penalty duration to it and suspend loading new resources on it until the penalty duration expires. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-12-02 16:59:11 +08:00
Zhen Ye	2ef18c5b4f	enhance: remove watch at session liveness check (#45968 ) issue: #45724 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-12-01 17:55:10 +08:00
zhagnlu	3901f112ae	enhance: make estimate json stats size more accurate (#45875 ) #42533 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-12-01 15:31:10 +08:00
aoiasd	7d19c40e3c	feat: support search highlight with queries (#45736 ) Previously, search with highlight only supported using BM25 search text as the highlight target. This PR adds support for highlighting with user-defined queries. relate: https://github.com/milvus-io/milvus/issues/42589 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-01 10:17:09 +08:00
congqixia	a7275e190e	fix: populate index info after segment loading to prevent redundant load tasks (#45803 ) After segments gained self-management capabilities for loading, the index information from the initial load was not being preserved in the Go-side segment metadata. This caused QueryCoord to repeatedly dispatch load index tasks, which would fail in segcore since the indexes were already loaded. Root Cause: The segment's `fieldIndexes` map was not being populated with index metadata after calling `FinishLoad`, leading to a mismatch between the Go-side metadata and segcore's internal state. Solution: After successfully loading a sealed segment, iterate through `loadInfo.IndexInfos` and insert each index entry into the segment's `fieldIndexes` map. This ensures the Go-side metadata stays in sync with segcore and prevents redundant load index operations. Fixes #45802 Related to #45060 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-24 19:55:07 +08:00
XuanYang-cn	c082317681	fix: Use base64 to encode not utf-8 bytes (#45655 ) See also: #45654 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-11-24 18:23:06 +08:00
aoiasd	5efb0cedc8	feat: support use fragment config for highlight (#45099 ) relate: https://github.com/milvus-io/milvus/issues/42589 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-24 17:07:06 +08:00
wei liu	3fbee154f6	enhance: Remove large segment ID arrays from QueryNode logs (#45719 ) issue: #45718 Logging complete segment ID arrays caused excessive log volume (3-6 TB for 200k segments). Remove arrays from logger fields and keep only segment counts for observability. Changes: - Remove requestSegments/preparedSegments arrays from Load logger - Remove segmentIDs from BM25 stats logs - Remove entries structure from sync distribution log This reduces log volume by 99.99% for large-scale operations. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-11-20 17:18:14 +08:00
aoiasd	947c8855f3	feat: support search bm25 with highlight (#44923 ) relate: https://github.com/milvus-io/milvus/issues/42589 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-18 16:09:39 +08:00
congqixia	0a208d7224	enhance: Move segment loading logic from Go layer to segcore for self-managed loading (#45488 ) Related to #45060 Refactor segment loading architecture to make segments autonomously manage their own loading process, moving the orchestration logic from Go (segment_loader.go) to C++ (segcore). C++ Layer (segcore): - Added `SetLoadInfo()` and `Load()` methods to `SegmentInterface` and implementations - Implemented `ChunkedSegmentSealedImpl::Load()` with parallel loading strategy: - Separates indexed fields from non-indexed fields - Loads indexes concurrently using thread pools - Loads field data for non-indexed fields in parallel - Implemented `SegmentGrowingImpl::Load()` to convert and load field data - Extracted `LoadIndexData()` as a reusable utility function in `Utils.cpp` - Added `SegmentLoad()` C binding in `segment_c.cpp` Go Layer: - Added `Load()` method to segment interfaces - Updated mock implementations and test interfaces - Integrated new C++ `SegmentLoad()` binding in Go segment wrapper --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-14 11:21:37 +08:00
cai.zhang	216c576da2	fix: Retain collection early to prevent it from being released before query completion (#45413 ) issue: #45314 This PR only ensures that no panic occurs. However, we still need to provide protection for the delegator handling ongoing query tasks. Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-11 20:29:37 +08:00
sparknack	f815f57b82	enhance: check both eviction and warmup when estimate segment loading size (#45222 ) issue: #44857 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-11-10 14:15:36 +08:00
congqixia	e284733399	fix: Move FinishLoad before text index creation to ensure raw data availability (#45334 ) Related to #45333 Fix segment loading failure when adding fields with text match enabled. The issue occurred because text indexes were being loaded before FinishLoad() was called, meaning raw data was not properly available when text index creation attempted to access it, resulting in "failed to create text index, neither raw data nor index are found" errors. Solution is to move the FinishLoad() call to execute after raw data loading but before text index loading. This ensures that: 1. Raw data is properly loaded and available in memory 2. Text indexes can access the raw data they need during creation 3. The segment is in the correct state before any index operations Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-06 14:49:34 +08:00
Lior Friedman	a4d69031f1	fix: Add AiSAQ index type RAM estimation implementation on the query node. (#45246 ) Currently, RAM usage for the AiSAQ index type is not estimated correctly. AiSAQ consumes less RAM while loading the index than DISKANN does, and the query node is missing a RAM usage estimation for the AiSAQ index type. We suggest calculating the AiSAQ RAM usage estimation as follows: UsedDiskMemoryRatioAisaq = 1024 (in contrast to UsedDiskMemoryRatio, which is 4); neededMemSize = indexInfo.IndexSize / UsedDiskMemoryRatioAisaq; neededDiskSize = indexInfo.IndexSize. Reported issue is #45247 --------- Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com> Signed-off-by: friedl <lior.friedman@kioxia.com> Co-authored-by: friedl <lior.friedman@kioxia.com>	2025-11-06 08:53:34 +08:00
congqixia	e6be590b97	enhance: set schema version when creating new collection (#45263 ) Related to #43028 Initialize the schema version field when creating a new collection instance in QueryNode. The schema version is extracted from loadMetaInfo and assigned to the collection, ensuring proper schema version tracking and consistency across the distributed system. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-04 10:15:32 +08:00
Zhen Ye	576084fe86	enhance: support alter collection/database with WAL-based DDL framework (#45266 ) issue: #43897 - Alter collection/database is implemented by WAL-based DDL framework now. - Support AlterCollection/AlterDatabase in wal now. - Alter operation can be synced by new CDC now. - Refactor some UT for alter DDL. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-04 09:59:33 +08:00
congqixia	8c98adfeb3	fix: update QueryNode NumEntities metrics when collection has no segments (#45147 ) Related to #44509 Fix a bug where QueryNodeNumEntities metrics were not updated for collections with zero segments, causing stale metrics when all segments are flushed or compacted. The previous implementation used separate loops: one to update size metrics for all collections, and another to update num entities metrics only for collections present in the grouped segments map. Collections with no segments were skipped in the second loop, leaving their NumEntities metrics stale. Changes: - Consolidate size and num entities metric updates into single loop - Iterate over all collections instead of grouped segments - Get collection metadata from manager instead of segment instances - Correctly set NumEntities to 0 for collections with no segments - Apply the same fix to both growing and sealed segment processing - Add nil check for collection metadata before processing This ensures all collection metrics are updated consistently, even when segment count drops to zero. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-30 10:12:08 +08:00
aoiasd	ad9a0cae48	enhance: add global analyzer options (#44684 ) relate: https://github.com/milvus-io/milvus/issues/43687 Add global analyzer options, avoid having to merge some milvus params into user's analyzer params. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-28 14:52:10 +08:00
congqixia	36a887b38b	enhance: add NewSegmentWithLoadInfo API to support segment self-managed loading (#45061 ) This commit introduces the foundation for enabling segments to manage their own loading process by passing load information during segment creation. Changes: C++ Layer: - Add NewSegmentWithLoadInfo() C API to create segments with serialized load info - Add SetLoadInfo() method to SegmentInterface for storing load information - Refactor segment creation logic into shared CreateSegment() helper function - Add comprehensive documentation for the new API Go Layer: - Extend CreateCSegmentRequest to support optional LoadInfo field - Update segment creation in querynode to pass SegmentLoadInfo when available - Add ConvertToSegcoreSegmentLoadInfo() and helper converters for proto translation Proto Definitions: - Add segcorepb.SegmentLoadInfo message with essential loading metadata - Add supporting messages: Binlog, FieldBinlog, FieldIndexInfo, TextIndexStats, JsonKeyStats - Remove dependency on data_coord.proto by creating segcore-specific definitions Testing: - Add comprehensive unit tests for proto conversion functions - Test edge cases including nil inputs, empty data, and nil array/map elements This is the first step toward issue #45060 - enabling segments to autonomously manage their loading process, which will: - Clarify responsibilities between Go and C++ layers - Reduce cross-language call overhead - Enable precise resource management at the C++ level - Support better integration with caching layer - Enable proactive schema evolution handling Related to #45060 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-27 15:28:12 +08:00
aoiasd	cfeb095ad7	enhance: forbid build analyzer at proxy (#44067 ) relate: https://github.com/milvus-io/milvus/issues/43687 We used to run the temporary analyzer and validate analyzer on the proxy, but the proxy should not be a computation-heavy node. This PR moves all analyzer calculations to the streaming node. --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-23 10:58:12 +08:00
aoiasd	ac82bad0b3	enhance: optimize idf oracle sync logic (#44628 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-20 15:42:08 +08:00
sparknack	935160840c	enhance: add a disk quota for the loaded binlog size to prevent load failure of querynode (#44893 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-19 19:44:01 +08:00
aoiasd	754997ac2b	enhance: update some annotations (#44769 ) relate: https://github.com/milvus-io/milvus/issues/43114 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-17 16:22:02 +08:00
sparknack	c8a4d6e2ef	enhance: add cachinglayer management for TextMatchIndex (#44741 ) issue: #41435, #44502 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-13 14:37:58 +08:00
sparknack	6d5b41644b	enhance: remove logical usage checks during segment loading (#44743 ) issue: #41435 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-13 14:21:58 +08:00
congqixia	5ece760d73	fix: Pass fs via `FileManagerContext` when loading index (#44733 ) Related to #44615 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-11 09:55:57 +08:00
wei liu	33d1e7de83	fix: Replace incorrect log import with milvus v2 log package (#44731 ) issue: #44730 Fix the issue where logs were not outputting as expected due to incorrect log package imports across multiple components. Changes include: - Add golangci-lint rule to forbid github.com/pingcap/log usage - Replace github.com/pingcap/log with github.com/milvus-io/milvus/pkg/v2/log Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-10-10 20:27:57 +08:00
Zhen Ye	a110d8cc49	fix: don't use logical resource for metrics of quota center on streaming node (#44613 ) issue: #44599 Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-29 21:34:13 +08:00
aoiasd	78ee76f018	enhance: support preload sealed segment bm25 stats and optimize bm25 stats serialize (#44279 ) relate: https://github.com/milvus-io/milvus/issues/41424 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-09-29 16:35:05 +08:00
Zhen Ye	b6b59bd222	fix: remove redundant initialization of storage v2 (#44597 ) issue: #44596 - querynode already initializes storage v2 and segcore, so streamingnode should not do this again. - It also fixes the GCP object storage access denied error. Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-29 10:17:04 +08:00
zhagnlu	eac16a577c	enhance:support cachelayer for json stats (#44446 ) #42533 Signed-off-by: zhagnlu <lu.zhang@zilliz.com>	2025-09-24 15:30:04 +08:00
Tianx	2c0c5ef41e	feat: timestamptz expression & index & timezone (#44080 ) issue: https://github.com/milvus-io/milvus/issues/27467 >My plan is as follows. >- [x] M1 Create collection with timestamptz field >- [x] M2 Insert timestamptz field data >- [x] M3 Retrieve timestamptz field data >- [x] M4 Implement handoff >- [x] M5 Implement compare operator >- [x] M6 Implement extract operator >- [x] M8 Support database/collection level default timezone >- [x] M7 Support STL-SORT index for datatype timestamptz --- The third PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M5, M6, M7, M8 described above. ## M8 Default Timezone We will be able to use alter_collection() and alter_database() in a future Python SDK release to modify the default timezone at the collection or database level. For insert requests, the timezone will be resolved using the following order of precedence: String Literal -> Collection Default -> Database Default. For retrieval requests, the timezone will be resolved in this order: Query Parameters -> Collection Default -> Database Default. In both cases, the final fallback timezone is UTC. ## M5: Comparison Operators We can now use the following expression format to filter on the timestamptz field: - `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op} ISO 'iso_string' ` - The interval_string follows the ISO 8601 duration format, for example: P1Y2M3DT1H2M3S. - The iso_string follows the ISO 8601 timestamp format, for example: 2025-01-03T00:00:00+08:00. - Example expressions: "tsz + INTERVAL 'P0D' != ISO '2025-01-03T00:00:00+08:00'" or "tsz != ISO '2025-01-03T00:00:00+08:00'". ## M6: Extract We will be able to extract specific time fields via kwargs in a future Python SDK release. The key is `time_fields`, and the value should be one or more of "year, month, day, hour, minute, second, microsecond", separated by commas or spaces. The result for each record is then an array of int64.
## M7: Indexing Support Expressions without interval arithmetic can be accelerated using an STL-SORT index. However, expressions that include interval arithmetic cannot be indexed. This is because the result of an interval calculation depends on the specific timestamp value. For example, adding one month to a date in February results in a different number of added days than adding one month to a date in March. --- After this PR, the input and output type of timestamptz is an ISO string. Timestamptz values are stored as timestamptz data, which is ultimately int64_t. > for more information, see https://en.wikipedia.org/wiki/ISO_8601 --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-23 10:24:12 +08:00
jiaqizho	338ed2fed4	enhance: Introduce sparse filter in query (#44347 ) issue: #44373 The current commit implements sparse filtering in query tasks using the statistical information (Bloom filter/MinMax) of the Primary Key (PK). The statistical information of the PK is bound to the segment during the segment loading phase. A new filter has been added to the segment filter to enable the sparse filtering functionality. Signed-off-by: jiaqizho <jiaqi.zhou@zilliz.com>	2025-09-23 09:58:09 +08:00
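
The sparse filtering described in the last commit above (#44347) prunes segments using primary-key statistics. A minimal Python sketch of the MinMax part of that idea follows. `SegmentPKStats` and `prune_segments` are hypothetical names used only for illustration, not Milvus types or APIs, and the Bloom filter check is omitted.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SegmentPKStats:
    """Hypothetical per-segment primary-key statistics (not a Milvus type)."""
    segment_id: int
    min_pk: int
    max_pk: int

    def may_contain(self, pk: int) -> bool:
        # MinMax check: a PK outside [min_pk, max_pk] cannot be in this segment.
        return self.min_pk <= pk <= self.max_pk


def prune_segments(segments: Iterable[SegmentPKStats], target_pks: List[int]) -> List[int]:
    """Return IDs of segments whose PK range could hold at least one target PK."""
    return [
        seg.segment_id
        for seg in segments
        if any(seg.may_contain(pk) for pk in target_pks)
    ]


segments = [
    SegmentPKStats(segment_id=1, min_pk=0, max_pk=99),
    SegmentPKStats(segment_id=2, min_pk=100, max_pk=199),
    SegmentPKStats(segment_id=3, min_pk=200, max_pk=299),
]
# Segment 2 is skipped because neither 5 nor 250 falls inside [100, 199].
print(prune_segments(segments, [5, 250]))
```

A MinMax check can only rule segments out; a segment that passes may still not contain the key, which is where a Bloom filter or the actual scan takes over.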

845 Commits