milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
Chun Han	b7ee93fc52	feat: support query aggregtion(#36380 ) (#44394 ) related: #36380 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: aggregation is centralized and schema-aware — all aggregate functions are created via the exec Aggregate registry (milvus::exec::Aggregate) and validated by ValidateAggFieldType, use a single in-memory accumulator layout (Accumulator/RowContainer) and grouping primitives (GroupingSet, HashTable, VectorHasher), ensuring consistent typing, null semantics and offsets across planner → exec → reducer conversion paths (toAggregateInfo, Aggregate::create, GroupingSet, AggResult converters). - Removed / simplified logic: removed ad‑hoc count/group-by and reducer code (CountNode/PhyCountNode, GroupByNode/PhyGroupByNode, cntReducer and its tests) and consolidated into a unified AggregationNode → PhyAggregationNode + GroupingSet + HashTable execution path and centralized reducers (MilvusAggReducer, InternalAggReducer, SegcoreAggReducer). AVG now implemented compositionally (SUM + COUNT) rather than a bespoke operator, eliminating duplicate implementations. - Why this does NOT cause data loss or regressions: existing data-access and serialization paths are preserved and explicitly validated — bulk_subscript / bulk_script_field_data and FieldData creation are used for output materialization; converters (InternalResult2AggResult ↔ AggResult2internalResult, SegcoreResults2AggResult ↔ AggResult2segcoreResult) enforce shape/type/row-count validation; proxy and plan-level checks (MatchAggregationExpression, translateOutputFields, ValidateAggFieldType, translateGroupByFieldIds) reject unsupported inputs (ARRAY/JSON, unsupported datatypes) early. Empty-result generation and explicit error returns guard against silent corruption. - New capability and scope: end-to-end GROUP BY and aggregation support added across the stack — proto (plan.proto, RetrieveRequest fields group_by_field_ids/aggregates), planner nodes (AggregationNode, ProjectNode, SearchGroupByNode), exec operators (PhyAggregationNode, PhyProjectNode) and aggregation core (Aggregate implementations: Sum/Count/Min/Max, SimpleNumericAggregate, RowContainer, GroupingSet, HashTable) plus proxy/querynode reducers and tests — enabling grouped and global aggregation (sum, count, min, max, avg via sum+count) with schema-aware validation and reduction. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2026-01-06 16:29:25 +08:00
aoiasd	90809d1d86	fix: highlight with multi analyzer failed (#46527 ) relate: https://github.com/milvus-io/milvus/issues/46498 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: text fields configured with multi_analyzer_params must include a "by_field" string that names another field containing per-row analyzer choices; schemaInfo.GetMultiAnalyzerNameFieldID caches and returns the dependent field ID (or 0 if none) and relies on that mapping to make per-row analyzer names available to the highlighter. - What changed / simplified: the highlighter is now schema-aware — addTaskWithSearchText accepts *schemaInfo and uses GetMultiAnalyzerNameFieldID to resolve the analyzer-name field; resolution and caching moved into schemaInfo.multiAnalyzerFieldMap (meta_cache.go), eliminating ad-hoc/typeutil-only lookups and duplicated logic; GetMultiAnalyzerParams now gates on EnableAnalyzer(), centralizing analyzer enablement checks. - Why this fixes the bug (root cause): fixes #46498 — previously the highlighter failed when the analyzer-by-field was not in output_fields. The change (1) populates task.AnalyzerNames (defaulting missing names to "default") when multi-analyzer is configured and (2) appends the analyzer-name field ID to LexicalHighlighter.extraFields so FieldIDs includes it; the operator then requests the analyzer-name column at search time, ensuring per-row analyzer selection is available for highlighting. - No data-loss or regression: when no multi-analyzer is configured GetMultiAnalyzerNameFieldID returns 0 and behavior is unchanged; the patch only adds the analyzer-name field to requested output IDs (no mutation of stored data). Error handling on malformed params is preserved (errors are returned instead of silently changing data), and single-analyzer behavior remains untouched. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-12-30 11:55:21 +08:00
Buqian Zheng	dce44f2a20	test: reduce test time for TestSparsePlaceholderGroupSize (#46637 ) issue: #44452 ## Summary Reduce test combinations in `TestSparsePlaceholderGroupSize` to decrease test execution time: - `nqs`: from `[1, 10, 100, 1000, 10000]` to `[1, 100, 10000]` - `averageNNZs`: from `[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]` to `[1, 4, 16, 64, 256, 1024]` <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## TestSparsePlaceholderGroupSize Test Reduction Core Invariant: The sparse vector NNZ estimation algorithm (`EstimateSparseVectorNNZFromPlaceholderGroup`) must maintain accuracy within bounded error thresholds—individual cases < 10% error and no more than 2% of cases exceeding 5% error—across representative parameter ranges. Test Coverage Optimized, Not Removed: Test combinations reduced from 60 to 18 by pruning redundant parameter points while retaining critical coverage: nqs now tests [1, 100, 10000] (min, mid, max) and averageNNZs tests [1, 4, 16, 64, 256, 1024] (exponential spacing). Variant generation logic (powers of 2 scaling) remains unchanged, ensuring error scenarios are still exercised. No Behavioral Regression: The algorithm under test is untouched; only test case frequency decreases. The same assertions validate error bounds are satisfied—individual assertions (`assert.Less(errorRatio, 10.0)`) and statistical assertions (`assert.Less(largeErrorRatio, 2.0)`) remain identical, confirming that estimation quality is still verified. Why Safe: Exponential spacing of removed parameters (e.g., nqs: 10, 1000 removed; averageNNZs: 2, 8, 32, 128, 512, 2048 removed) addresses diminishing returns—intermediate values provide no new error scenarios beyond what surrounding powers-of-2 values expose, while keeping test execution time proportional to coverage value gained. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-12-29 09:51:21 +08:00
marcelo-cjl	3b599441fd	feat: Add nullable vector support for proxy and querynode (#46305 ) related: #45993 This commit extends nullable vector support to the proxy layer, querynode, and adds comprehensive validation, search reduce, and field data handling for nullable vectors with sparse storage. Proxy layer changes: - Update validate_util.go checkAligned() with getExpectedVectorRows() helper to validate nullable vector field alignment using valid data count - Update checkFloatVectorFieldData/checkSparseFloatVectorFieldData for nullable vector validation with proper row count expectations - Add FieldDataIdxComputer in typeutil/schema.go for logical-to-physical index translation during search reduce operations - Update search_reduce_util.go reduceSearchResultData to use idxComputers for correct field data indexing with nullable vectors - Update task.go, task_query.go, task_upsert.go for nullable vector handling - Update msg_pack.go with nullable vector field data processing QueryNode layer changes: - Update segments/result.go for nullable vector result handling - Update segments/search_reduce.go with nullable vector offset translation Storage and index changes: - Update data_codec.go and utils.go for nullable vector serialization - Update indexcgowrapper/dataset.go and index.go for nullable vector indexing Utility changes: - Add FieldDataIdxComputer struct with Compute() method for efficient logical-to-physical index mapping across multiple field data - Update EstimateEntitySize() and AppendFieldData() with fieldIdxs parameter - Update funcutil.go with nullable vector support functions <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Full support for nullable vector fields (float, binary, float16, bfloat16, int8, sparse) across ingest, storage, indexing, search and retrieval; logical↔physical offset mapping preserves row semantics. * Client: compaction control and compaction-state APIs. * Bug Fixes * Improved validation for adding vector fields (nullable + dimension checks) and corrected search/query behavior for nullable vectors. * Chores * Persisted validity maps with indexes and on-disk formats. * Tests * Extensive new and updated end-to-end nullable-vector tests. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: marcelo-cjl <marcelo.chen@zilliz.com>	2025-12-24 10:13:19 +08:00
Xiaofan	ca2e27f576	enhance: remove uncessary segment size estimation and make it configurable (#46302 ) fix #46300 remove unused segment size estimation, and make size estimation configurable Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2025-12-13 02:58:46 +08:00
Chun Han	d9f8e38d6a	fix: query failed for int value on edge(#46075 ) (#46126 ) related: #46075 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-12-10 15:59:12 +08:00
Zhen Ye	c3fe6473b8	enhance: support async write syncer for milvus logging (#45805 ) issue: #45640 - log may be dropped if the underlying file system is busy. - use async write syncer to avoid the log operation block the milvus major system. - remove some log dependency from the until function to avoid dependency-loop. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-28 17:43:11 +08:00
zhenshan.cao	bec6d1d1e1	enhance: timestamptz support groupby (#45762 ) issue: https://github.com/milvus-io/milvus/issues/45761 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2025-11-21 18:39:05 +08:00
zhenshan.cao	352a8d06ec	fix: Partial update panic with TIMESTAMPTZ (#45740 ) issue: https://github.com/milvus-io/milvus/issues/45729 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2025-11-20 21:20:12 +08:00
zhenshan.cao	6327c9a514	fix: Fix bugs related to TimestampTz (#45111 ) issue: https://github.com/milvus-io/milvus/issues/44527 https://github.com/milvus-io/milvus/issues/44537 https://github.com/milvus-io/milvus/issues/44538 https://github.com/milvus-io/milvus/issues/44585 https://github.com/milvus-io/milvus/issues/44622 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2025-11-04 16:51:33 +08:00
Spade A	ce2862d325	fix: fix parquet import bug in STRUCT (#45028 ) issue: https://github.com/milvus-io/milvus/issues/45006 ref: https://github.com/milvus-io/milvus/issues/42148 Previsouly, the parquet import is implemented based on that the STRUCT in the parquet files is hanlded in the way that each field in struct is stored in a single column. However, in the user's perspective, the array of STRUCT contains data is something like STRUCT_A: for one row, [struct{field1_1, field2_1, field3_1}, struct{field1_2, field2_2, field3_2}, ...], rather than {[field1_1, field1_2, ...], [field2_1, field2_2, ...], [field3_1, field3_2, field3_3, ...]}. This PR fixes this. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-27 10:26:06 +08:00
Spade A	d8591f9548	fix: csv/json import with STRUCT adapts concatenated struct name (#45000 ) After https://github.com/milvus-io/milvus/pull/44557, the field name in STRUCT field becomes STRUCT_NAME[FIELD_NAME] This PR make import consider the change. issue: https://github.com/milvus-io/milvus/issues/45006 ref: https://github.com/milvus-io/milvus/issues/42148 TODO: parquet is much more complex than csv/json, and I will leave it to a separate PR. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-24 10:22:15 +08:00
aoiasd	cfeb095ad7	enhance: forbid build analyzer at proxy (#44067 ) relate: https://github.com/milvus-io/milvus/issues/43687 We used to run the temporary analyzer and validate analyzer on the proxy, but the proxy should not be a computation-heavy node. This PR move all analyzer calculations to the streaming node. --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-23 10:58:12 +08:00
wei liu	529c98520c	enhance: Add nullable support for Geometry and Timestamptz types (#44846 ) issue: #44800 This commit enhances the upsert and validation logic to properly handle nullable Geometry (WKT/WKB) and Timestamptz data types: - Add ToCompressedFormatNullable support for TimestamptzData, GeometryWktData, and GeometryData to filter out null values during data compression - Implement GenNullableFieldData for Timestamptz and Geometry types to generate nullable field data structures - Update FillWithNullValue to handle both GeometryData and GeometryWktData with null value filling logic - Add UpdateFieldData support for Timestamptz, GeometryData, and GeometryWktData field updates - Comprehensive unit tests covering all new data type handling scenarios Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-10-15 14:04:00 +08:00
cai.zhang	19346fa389	feat: Geospatial Data Type and GIS Function support for milvus (#44547 ) issue: #43427 This pr's main goal is merge #37417 to milvus 2.5 without conflicts. # Main Goals 1. Create and describe collections with geospatial type 2. Insert geospatial data into the insert binlog 3. Load segments containing geospatial data into memory 4. Enable query and search can display geospatial data 5. Support using GIS funtions like ST_EQUALS in query 6. Support R-Tree index for geometry type # Solution 1. Add Type: Modify the Milvus core by adding a Geospatial type in both the C++ and Go code layers, defining the Geospatial data structure and the corresponding interfaces. 2. Dependency Libraries: Introduce necessary geospatial data processing libraries. In the C++ source code, use Conan package management to include the GDAL library. In the Go source code, add the go-geom library to the go.mod file. 3. Protocol Interface: Revise the Milvus protocol to provide mechanisms for Geospatial message serialization and deserialization. 4. Data Pipeline: Facilitate interaction between the client and proxy using the WKT format for geospatial data. The proxy will convert all data into WKB format for downstream processing, providing column data interfaces, segment encapsulation, segment loading, payload writing, and cache block management. 5. Query Operators: Implement simple display and support for filter queries. Initially, focus on filtering based on spatial relationships for a single column of geospatial literal values, providing parsing and execution for query expressions.Now only support brutal search 7. Client Modification: Enable the client to handle user input for geospatial data and facilitate end-to-end testing.Check the modification in pymilvus. --------- Signed-off-by: Yinwei Li <yinwei.li@zilliz.com> Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>	2025-09-28 19:43:05 +08:00
junjiejiangjjj	f07979f91d	enhance: add support for controlling function output field insertion (#44162 ) #44053 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-09-24 17:26:04 +08:00
Tianx	2c0c5ef41e	feat: timestamptz expression & index & timezone (#44080 ) issue: https://github.com/milvus-io/milvus/issues/27467 >My plan is as follows. >- [x] M1 Create collection with timestamptz field >- [x] M2 Insert timestamptz field data >- [x] M3 Retrieve timestamptz field data >- [x] M4 Implement handoff >- [x] M5 Implement compare operator >- [x] M6 Implement extract operator >- [x] M8 Support database/collection level default timezone >- [x] M7 Support STL-SORT index for datatype timestamptz --- The third PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M5, M6, M7, M8 described above. ## M8 Default Timezone We will be able to use alter_collection() and alter_database() in a future Python SDK release to modify the default timezone at the collection or database level. For insert requests, the timezone will be resolved using the following order of precedence: String Literal-> Collection Default -> Database Default. For retrieval requests, the timezone will be resolved in this order: Query Parameters -> Collection Default -> Database Default. In both cases, the final fallback timezone is UTC. ## M5: Comparison Operators We can now use the following expression format to filter on the timestamptz field: - `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op} ISO 'iso_string' ` - The interval_string follows the ISO 8601 duration format, for example: P1Y2M3DT1H2M3S. - The iso_string follows the ISO 8601 timestamp format, for example: 2025-01-03T00:00:00+08:00. - Example expressions: "tsz + INTERVAL 'P0D' != ISO '2025-01-03T00:00:00+08:00'" or "tsz != ISO '2025-01-03T00:00:00+08:00'". ## M6: Extract We will be able to extract sepecific time filed by kwargs in a future Python SDK release. The key is `time_fields`, and value should be one or more of "year, month, day, hour, minute, second, microsecond", seperated by comma or space. Then the result of each record would be an array of int64. ## M7: Indexing Support Expressions without interval arithmetic can be accelerated using an STL-SORT index. However, expressions that include interval arithmetic cannot be indexed. This is because the result of an interval calculation depends on the specific timestamp value. For example, adding one month to a date in February results in a different number of added days than adding one month to a date in March. --- After this PR, the input / output type of timestamptz would be iso string. Timestampz would be stored as timestamptz data, which is int64_t finally. > for more information, see https://en.wikipedia.org/wiki/ISO_8601 --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-23 10:24:12 +08:00
yihao.dai	51f69f32d0	feat: Add CDC support (#44124 ) This PR implements a new CDC service for Milvus 2.6, providing log-based cross-cluster replication. issue: https://github.com/milvus-io/milvus/issues/44123 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: chyezh <chyezh@outlook.com>	2025-09-16 16:32:01 +08:00
Spade A	eb793531b9	feat: impl StructArray -- support import for CSV/JSON/PARQUET/BINLOG (#44201 ) Ref https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-15 20:41:59 +08:00
wei liu	18371773dd	enhance: Optimize partial update merge logic by unifying nullable format (#44197 ) issue: #43980 This commit optimizes the partial update merge logic by standardizing nullable field representation before merge operations to avoid corner cases during the merge process. Key changes: - Unify nullable field data format to FULL FORMAT before merge execution - Add extensive unit tests for bounds checking and edge cases The optimization ensures: - Consistent nullable field representation across SDK and internal - Proper handling of null values during merge operations - Prevention of index out-of-bounds errors in vector field updates - Better error handling and validation for partial update scenarios This resolves issues where different nullable field formats could cause merge failures or data corruption during partial update operations. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-09-10 17:27:56 +08:00
wei liu	5ef793c393	fix: Fix panic when upsert with partial_update=true on empty table (#44155 ) issue: #43980 Fix panic issue caused by incorrect nullable field merging logic when upsert converts to insert operation on empty tables. - Add AppendFieldDataWithNullData to handle nullable field merging - Fix existing data merge with skipAppendNullData=false - Fix insert data merge with skipAppendNullData=true - Add unit tests for nullable field data appending scenarios Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-09-02 16:47:52 +08:00
wei liu	16af4e230a	fix: Prevent panic in upsert due to missing nullable fields [Proxy] (#44070 ) issue: #43980 Fixes a panic that occurred when a partial update was converted to an insert due to a non-existent primary key. The panic was caused by missing nullable fields that were not provided in the original partial update request. The upsert pre-execution logic is refactored to handle this correctly: - Explicitly splits upsert data into 'insert' and 'update' batches. - Automatically generates data for missing nullable or default-value fields during inserts, preventing the panic. - Enhances `typeutil.UpdateFieldData` to support different source and destination indexes for flexible data merging. - Adds comprehensive unit tests for mixed upsert, pure insert, and pure update scenarios. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-08-29 18:33:51 +08:00
Spade A	8456f824be	feat: impl StructArray -- miscellaneous staffs for struct array (#43960 ) Ref https://github.com/milvus-io/milvus/issues/42148 1. enable storage v2 2. implement some missing staffs 3. fix some bugs and add tests --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-08-26 21:35:53 +08:00
Tianx	c0d62268ac	feat: add timesatmptz data type (#44005 ) issue: https://github.com/milvus-io/milvus/issues/27467 > https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420 > * [x] M1 Create collection with timestamptz field > * [x] M2 Insert timestamptz field data > * [x] M3 Retrieve timestamptz field data > * [x] M4 Implement handoff[ ] The second PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4 described above. --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-08-26 15:59:53 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
wei liu	d3c95eaa77	enhance: Support partial field updates with upsert API (#42877 ) issue: #29735 Implement partial field update functionality for upsert operations, supporting scalar, vector, and dynamic JSON fields without requiring all collection fields. Changes: - Add queryPreExecute to retrieve existing records before upsert - Implement UpdateFieldData function for merging data - Add IDsChecker utility for efficient primary key lookups - Fix JSON data creation in tests using proper map marshaling - Add test cases for partial updates of different field types Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-08-19 11:15:45 +08:00
Spade A	faeb7fd410	feat: impl StructArray -- create schema, insert, and retrieve data (#42855 ) Ref https://github.com/milvus-io/milvus/issues/42148 https://github.com/milvus-io/milvus/pull/42406 impls the segcore part of storage for handling with VectorArray. This PR: 1. impls the go part of storage for VectorArray 2. impls the collection creation with StructArrayField and VectorArray 3. insert and retrieve data from the collection. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>	2025-07-27 01:30:55 +08:00
XuanYang-cn	4dcaa97682	fix: Use diskSegmentMaxSize for coll with sparse and dense vectors (#43194 ) Previous code uses diskSegmentMaxSize if and only if all of the collection's vector fields are indexed with DiskANN index. When introducing sparse vectors, since sparse vector cannot be indexed with DiskANN index, collections with both dense and sparse vectors will use maxSize instead. This PR changes the requirments of using diskSegmentMaxSize to all dense vectors are indexed with DiskANN indexs, ignoring sparse vector fields. See also: #43193 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-07-16 18:04:52 +08:00
Chun Han	07745439b5	fix: empty search groupby result causing crash(#43137 ) (#43214 ) related: #43137 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-07-10 12:04:48 +08:00
congqixia	74ea57bac1	enhance: Remove unused load field check from proxy (#42816 ) Related to #42489 Since load list works as hint after cachelayer implemented, the related check logic could be removed to keep code logic clean. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-19 19:34:47 +08:00
Zhen Ye	fc010e44a8	fix: release memory after pop from heap (#42482 ) issue: #42481 Signed-off-by: chyezh <chyezh@outlook.com>	2025-06-04 10:00:32 +08:00
Ted Xu	ae32203d3a	fix: support group by with nullable grouping keys (#41797 ) See #36264 In this PR: - Enhanced error handling in parse of grouping field. - Fixed null handling in reduce tasks in proxy nodes. - Updated tests to reflect changes in error handling and data processing logic. --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-05-17 20:54:22 +08:00
aoiasd	f52c2909c4	feat: support multi analyzer for bm25 function (#41351 ) relate: https://github.com/milvus-io/milvus/issues/41213 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-23 18:22:38 +08:00
SimFG	91d40fa558	fix: Update logging context and upgrade dependencies (#41318 ) - issue: #41291 --------- Signed-off-by: SimFG <bang.fu@zilliz.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-04-23 10:52:38 +08:00
Xianhui Lin	f9febe3bae	enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006 ) Merge RootCoord, DataCoord And QueryCoord into MixCoord Make Session into one issue : https://github.com/milvus-io/milvus/issues/37764 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-04-11 16:36:30 +08:00
Xianhui Lin	3bc24c264f	enhance: Add json key inverted index in stats for optimization (#38039 ) Add json key inverted index in stats for optimization https://github.com/milvus-io/milvus/issues/36995 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-10 15:20:28 +08:00
yihao.dai	b4cb8a4b13	enhance: Add UTF-8 string validation for import (#40694 ) issue: https://github.com/milvus-io/milvus/issues/40684 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-04-01 19:04:21 +08:00
Ted Xu	688505ab1c	enhance: cleanup lint check exclusions (#40829 ) See: #40828 Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-03-21 18:12:14 +08:00
Buqian Zheng	c12abf4e2a	enhance: improve sparse query nnz metric (#40713 ) add query type and field id label; add metric for hybrid search issue: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-03-18 17:20:16 +08:00
Zhen Ye	f6fb4bc442	fix: backoff will retry infinitely after reaching max elapse (#40589 ) issue: #40588 Signed-off-by: chyezh <chyezh@outlook.com>	2025-03-13 16:24:06 +08:00
Zhen Ye	96a010da7a	fix: msgstream adaptor may not gc quickly after comsumed (#40555 ) issue: #40540 Signed-off-by: chyezh <chyezh@outlook.com>	2025-03-12 14:00:04 +08:00
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990 ) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
yihao.dai	d72d2281ca	fix: Fix concurrent map (#39775 ) issue: https://github.com/milvus-io/milvus/issues/39778 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-02-22 09:51:54 +08:00
Patrick Weizhi Xu	04fff74a56	feat: introduce Text data type (#39874 ) issue: https://github.com/milvus-io/milvus/issues/39818 This PR mimics Varchar data type, allows insert, search, query, delete, full-text search and others. Functionalities related to filter expressions are disabled temporarily. Storage changes for Text data type will be in the following PRs. Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2025-02-19 11:04:51 +08:00
Cai Yudong	5bf1b2b929	feat: Support Int8Vector in go (#38990 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-14 20:43:06 +08:00
SimFG	357eaf0d71	fix: use the object heap to keep the min ddl ts order (#39118 ) issue: #39002 Signed-off-by: SimFG <bang.fu@zilliz.com>	2025-01-10 18:16:58 +08:00
Zhen Ye	3bcdd92915	enhance: add broadcast for streaming service (#39020 ) issue: #38399 - Add new rpc for transfer broadcast to streaming coord - Add broadcast service at streaming coord to make broadcast message sent automicly Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-09 16:24:55 +08:00
Spade A	4245c5bed1	fix: text match panics when enable_match is set be false (#38950 ) fix: https://github.com/milvus-io/milvus/issues/38949 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-03 14:20:55 +08:00
Zhen Ye	afac153c26	enhance: move the lifetime implementation out of server level lifetime (#38442 ) issue: #38399 - move the lifetime implementation of common code out of the server level lifetime implementation Signed-off-by: chyezh <chyezh@outlook.com>	2024-12-17 11:42:44 +08:00
Buqian Zheng	75e64b993f	enhance: add metrics for counting number of nun-zeros/tokens of sparse/FTS search (#38329 ) sparse vectors may have arbitrary number of non zeros and it is hard to optimize without knowing the actual distribution of nnz. this PR adds a metric for analyzing that. issue: https://github.com/milvus-io/milvus/issues/35853 comparing with https://github.com/milvus-io/milvus/pull/38328, this includes also metric for FTS in query node delegator also fixed a bug of sparse when searching by pk Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-12-12 16:22:43 +08:00

1 2 3 4

167 Commits