milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
Chun Han	b7ee93fc52	feat: support query aggregtion(#36380 ) (#44394 ) related: #36380 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: aggregation is centralized and schema-aware — all aggregate functions are created via the exec Aggregate registry (milvus::exec::Aggregate) and validated by ValidateAggFieldType, use a single in-memory accumulator layout (Accumulator/RowContainer) and grouping primitives (GroupingSet, HashTable, VectorHasher), ensuring consistent typing, null semantics and offsets across planner → exec → reducer conversion paths (toAggregateInfo, Aggregate::create, GroupingSet, AggResult converters). - Removed / simplified logic: removed ad‑hoc count/group-by and reducer code (CountNode/PhyCountNode, GroupByNode/PhyGroupByNode, cntReducer and its tests) and consolidated into a unified AggregationNode → PhyAggregationNode + GroupingSet + HashTable execution path and centralized reducers (MilvusAggReducer, InternalAggReducer, SegcoreAggReducer). AVG now implemented compositionally (SUM + COUNT) rather than a bespoke operator, eliminating duplicate implementations. - Why this does NOT cause data loss or regressions: existing data-access and serialization paths are preserved and explicitly validated — bulk_subscript / bulk_script_field_data and FieldData creation are used for output materialization; converters (InternalResult2AggResult ↔ AggResult2internalResult, SegcoreResults2AggResult ↔ AggResult2segcoreResult) enforce shape/type/row-count validation; proxy and plan-level checks (MatchAggregationExpression, translateOutputFields, ValidateAggFieldType, translateGroupByFieldIds) reject unsupported inputs (ARRAY/JSON, unsupported datatypes) early. Empty-result generation and explicit error returns guard against silent corruption. - New capability and scope: end-to-end GROUP BY and aggregation support added across the stack — proto (plan.proto, RetrieveRequest fields group_by_field_ids/aggregates), planner nodes (AggregationNode, ProjectNode, SearchGroupByNode), exec operators (PhyAggregationNode, PhyProjectNode) and aggregation core (Aggregate implementations: Sum/Count/Min/Max, SimpleNumericAggregate, RowContainer, GroupingSet, HashTable) plus proxy/querynode reducers and tests — enabling grouped and global aggregation (sum, count, min, max, avg via sum+count) with schema-aware validation and reduction. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2026-01-06 16:29:25 +08:00
lif	65cca5d046	fix: correct typo CredentialSeperator to CredentialSeparator (#46631 ) issue: #46635 ## Summary - Fix spelling error in constant name: `CredentialSeperator` -> `CredentialSeparator` - Updated all usages across the codebase to use the correct spelling ## Changes - `pkg/util/constant.go`: Renamed the constant - `pkg/util/contextutil/context_util.go`: Updated usage - `pkg/util/contextutil/context_util_test.go`: Updated usage - `internal/proxy/authentication_interceptor.go`: Updated usage - `internal/proxy/util.go`: Updated usage - `internal/proxy/util_test.go`: Updated usage - `internal/proxy/trace_log_interceptor_test.go`: Updated usage - `internal/proxy/accesslog/info/util.go`: Updated usage - `internal/distributed/proxy/service.go`: Updated usage - `internal/distributed/proxy/httpserver/utils.go`: Updated usage ## Test Plan - [x] All references updated consistently - [x] No functional changes - only constant name spelling correction <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: the separator character for credentials remains ":" everywhere — only the exported identifier was renamed from CredentialSeperator → CredentialSeparator; the constant value and split/join semantics are unchanged. - Change (bug fix): corrected the misspelled exported constant in pkg/util/constant.go and updated all references across the codebase (parsing, token construction, header handling and tests) to use the new identifier; this is an identifier rename that removes an inconsistent symbol and prevents compile-time/reference errors. - Logic simplified/redundant work removed: no runtime logic was removed; the simplification is purely maintenance-focused — eliminating a misspelled exported name that could cause developers to introduce duplicate or incorrect constants. - No data loss or behavior regression: runtime code paths are unchanged — e.g., GetAuthInfoFromContext, ParseUsernamePassword, AuthenticationInterceptor, proxy service token construction and access-log extraction still use ":" to split/join credentials; updated and added unit tests (parsing and metadata extraction) exercise these paths and validate identical semantics. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: majiayu000 <1835304752@qq.com> Signed-off-by: lif <1835304752@qq.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 14:37:24 +08:00
cai.zhang	a16d04f5d1	feat: Support ttl field for entity level expiration (#46342 ) issue： #46033 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Pull Request Summary: Entity-Level TTL Field Support ### Core Invariant and Design This PR introduces per-entity TTL (time-to-live) expiration via a dedicated TIMESTAMPTZ field as a fine-grained alternative to collection-level TTL. The key invariant is mutual exclusivity: collection-level TTL and entity-level TTL field cannot coexist on the same collection. Validation is enforced at the proxy layer during collection creation/alteration (`validateTTL()` prevents both being set simultaneously). ### What Is Removed and Why - Global `EntityExpirationTTL` parameter removed from config (`configs/milvus.yaml`, `pkg/util/paramtable/component_param.go`). This was the only mechanism for collection-level expiration. The removal is safe because: - The collection-level TTL path (`isEntityExpired(ts)` check) remains intact in the codebase for backward compatibility - TTL field check (`isEntityExpiredByTTLField()`) is a secondary path invoked only when a TTL field is configured - Existing deployments using collection TTL can continue without modification The global parameter was removed specifically because entity-level TTL makes per-entity control redundant with a collection-wide setting, and the PR chooses one mechanism per collection rather than layering both. ### No Data Loss or Behavior Regression TTL filtering logic is additive and safe: 1. Collection-level TTL unaffected: The `isEntityExpired(ts)` check still applies when no TTL field is configured; callers of `EntityFilter.Filtered()` pass `-1` as the TTL expiration timestamp when no field exists, causing `isEntityExpiredByTTLField()` to return false immediately 2. Null/invalid TTL values treated safely: Rows with null TTL or TTL ≤ 0 are marked as "never expire" (using sentinel value `int64(^uint64(0) >> 1)`) and are preserved across compactions; percentile calculations only include positive TTL values 3. Query-time filtering automatic: TTL filtering is transparently added to expression compilation via `AddTTLFieldFilterExpressions()`, which appends `(ttl_field IS NULL OR ttl_field > current_time)` to the filter pipeline. Entities with null TTL always pass the filter 4. Compaction triggering granular: Percentile-based expiration (20%, 40%, 60%, 80%, 100%) allows configurable compaction thresholds via `SingleCompactionRatioThreshold`, preventing premature data deletion ### Capability Added: Per-Entity Expiration with Data Distribution Awareness Users can now specify a TIMESTAMPTZ collection property `ttl_field` naming a schema field. During data writes, TTL values are collected per segment and percentile quantiles (5-value array) are computed and stored in segment metadata. At query time, the TTL field is automatically filtered. At compaction time, segment-level percentiles drive expiration-based compaction decisions, enabling intelligent compaction of segments where a configurable fraction of data has expired (e.g., compact when 40% of rows are expired, controlled by threshold ratio). <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2026-01-05 10:27:24 +08:00
congqixia	92c0c38e24	fix: validate collection TTL property to prevent compaction stuck (#46717 ) If collection TTL property is malformed (e.g., non-numeric value), compaction tasks would fail silently and get stuck. This change: - Add centralized GetCollectionTTL/GetCollectionTTLFromMap functions in pkg/common to handle TTL parsing with proper error handling - Validate TTL property in createCollectionTask and alterCollectionTask PreExecute to reject invalid values early - Refactor datacoord compaction policies to use the new common functions - Remove duplicated getCollectionTTL from datacoord/util.go issue: #46716 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: collection.ttl.seconds must be a parseable int64 and validated at collection creation/alter time so malformed TTLs never reach compaction/execution codepaths. - Bug fix (resolves #46716): malformed/non-numeric TTLs could silently cause compaction tasks to fail/stall; fixed by adding centralized parsing helpers pkg/common.GetCollectionTTL and GetCollectionTTLFromMap and validating TTL in createCollectionTask.PreExecute and alterCollectionTask.PreExecute (calls with default -1 and return parameter-invalid errors on parse failure). - Simplification / removed redundancy: eliminated duplicated getCollectionTTL in internal/datacoord/util.go and replaced ad-hoc TTL parsing across datacoord (compaction policies, import_util, compaction triggers) and proxy util with the common helpers, centralizing error handling and defaulting logic. - No data loss or behavior regression: valid TTL parsing semantics unchanged (helpers use identical int64 parsing and default fallback from paramtable/CommonCfg); validation occurs in PreExecute so existing valid collections proceed unchanged while malformed values are rejected early—compaction codepaths now receive only validated TTL values (or explicit defaults), preventing silent skips without altering valid execution flows. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2026-01-01 08:13:22 +08:00
yihao.dai	b18ebd9468	enhance: Remove legacy cdc/replication (#46603 ) issue: https://github.com/milvus-io/milvus/issues/44123 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> - Core invariant: legacy in-cluster CDC/replication plumbing (ReplicateMsg types, ReplicateID-based guards and flags) is obsolete — the system relies on standard msgstream positions, subPos/end-ts semantics and timetick ordering as the single source of truth for message ordering and skipping, so replication-specific channels/types/guards can be removed safely. - Removed/simplified logic (what and why): removed replication feature flags and params (ReplicateMsgChannel, TTMsgEnabled, CollectionReplicateEnable), ReplicateMsg type and its tests, ReplicateID constants/helpers and MergeProperties hooks, ReplicateConfig and its propagation (streamPipeline, StreamConfig, dispatcher, target), replicate-aware dispatcher/pipeline branches, and replicate-mode pre-checks/timestamp-allocation in proxy tasks — these implemented a redundant alternate “replicate-mode” pathway that duplicated position/end-ts and timetick logic. - Why this does NOT cause data loss or regression (concrete code paths): no persistence or core write paths were removed — proxy PreExecute flows (internal/proxy/task_*.go) still perform the same schema/ID/size validations and then follow the normal non-replicate execution path; dispatcher and pipeline continue to use position/subPos and pullback/end-ts in Seek/grouping (pkg/mq/msgdispatcher/dispatcher.go, internal/util/pipeline/stream_pipeline.go), so skipping and ordering behavior remains unchanged; timetick emission in rootcoord (sendMinDdlTsAsTt) is now ungated (no silent suppression), preserving or increasing timetick delivery rather than removing it. - PR type and net effect: Enhancement/Refactor — removes deprecated replication API surface (types, helpers, config, tests) and replication branches, simplifies public APIs and constructor signatures, and reduces surface area for future maintenance while keeping DML/DDL persistence, ordering, and seek semantics intact. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-12-30 14:53:21 +08:00
wei liu	f85e86a6ec	fix: change upsert duplicate PK behavior from dedup to error (#45997 ) issue: #44320 Replace the DeduplicateFieldData function with CheckDuplicatePkExist that returns an error when duplicate primary keys are detected in the same batch, instead of silently deduplicating. Changes: - Replace DeduplicateFieldData with CheckDuplicatePkExist in util.go - Update upsertTask.PreExecute to return error on duplicate PKs - Simplify helper function from findLastOccurrenceIndices to hasDuplicates - Update unit tests to verify the new error behavior - Add Python integration tests for duplicate PK error cases Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-12-04 10:23:11 +08:00
junjiejiangjjj	d3164e8030	feat: add configurable batch factor and runtime check bypass for embedding functions (#45592 ) https://github.com/milvus-io/milvus/issues/45544 - Add batch_factor configuration parameter (default: 5) to control embedding provider batch sizes - Add disable_func_runtime_check property to bypass function validation during collection creation - Add database interceptor support for AddCollectionFunction, AlterCollectionFunction, and DropCollectionFunction requests Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-11-20 19:55:04 +08:00
zhenshan.cao	a3b8bcb198	fix: correct default value backfill during AddField (#45634 ) issue: https://github.com/milvus-io/milvus/issues/44585 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2025-11-18 23:05:42 +08:00
aoiasd	947c8855f3	feat: support search bm25 with highlight (#44923 ) relate: https://github.com/milvus-io/milvus/issues/42589 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-18 16:09:39 +08:00
wei liu	7aed88113c	enhance: Deduplicate primary keys in upsert request batch (#45249 ) issue: #44320 This change adds deduplication logic to handle duplicate primary keys within a single upsert batch, keeping the last occurrence of each primary key. Key changes: - Add DeduplicateFieldData function to remove duplicate PKs from field data, supporting both Int64 and VarChar primary keys - Refactor fillFieldPropertiesBySchema into two separate functions: validateFieldDataColumns for validation and fillFieldPropertiesOnly for property filling, improving code clarity and reusability - Integrate deduplication logic in upsertTask.PreExecute to automatically deduplicate data before processing - Add comprehensive unit tests for deduplication with various PK types (Int64, VarChar) and field types (scalar, vector) - Add Python integration tests to verify end-to-end behavior --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-11-17 21:35:40 +08:00
junjiejiangjjj	50f198e346	feat: Support zilliz models (#45168 ) https://github.com/milvus-io/milvus/issues/35856 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-11-13 12:55:37 +08:00
zhenshan.cao	6327c9a514	fix: Fix bugs related to TimestampTz (#45111 ) issue: https://github.com/milvus-io/milvus/issues/44527 https://github.com/milvus-io/milvus/issues/44537 https://github.com/milvus-io/milvus/issues/44538 https://github.com/milvus-io/milvus/issues/44585 https://github.com/milvus-io/milvus/issues/44622 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2025-11-04 16:51:33 +08:00
Spade A	2b5241fe5a	fix: allow "[" and "]" in index name (#45193 ) issue: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-11-04 11:59:34 +08:00
Spade A	d8591f9548	fix: csv/json import with STRUCT adapts concatenated struct name (#45000 ) After https://github.com/milvus-io/milvus/pull/44557, the field name in STRUCT field becomes STRUCT_NAME[FIELD_NAME] This PR make import consider the change. issue: https://github.com/milvus-io/milvus/issues/45006 ref: https://github.com/milvus-io/milvus/issues/42148 TODO: parquet is much more complex than csv/json, and I will leave it to a separate PR. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-24 10:22:15 +08:00
aoiasd	cfeb095ad7	enhance: forbid build analyzer at proxy (#44067 ) relate: https://github.com/milvus-io/milvus/issues/43687 We used to run the temporary analyzer and validate analyzer on the proxy, but the proxy should not be a computation-heavy node. This PR move all analyzer calculations to the streaming node. --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-23 10:58:12 +08:00
Spade A	34f54da155	fix: reject GEOMETRY and TIMESTAMPTZ in STRUCT (#44937 ) issue: https://github.com/milvus-io/milvus/issues/44930 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-20 11:32:05 +08:00
Spade A	6c8e353439	feat: impl StructArray -- ban non-float-vector for now (#44875 ) ref https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-17 10:26:09 +08:00
Bingyi Sun	633cae9461	enhance: add namespace for query and search request (#44343 ) issue: #44011 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-10-16 17:52:01 +08:00
congqixia	f5f053f1d2	enhance: Refactor privilege management by extracting privilege cache into separate package (#44762 ) Related to #44761 This commit refactors the privilege management system in the proxy component by: 1. Separation of Concerns: Extracts privilege-related functionality from MetaCache into a dedicated `internal/proxy/privilege` package, improving code organization and maintainability. 2. New Package Structure: Creates `internal/proxy/privilege/` with: - `cache.go`: Core privilege cache implementation (PrivilegeCache) - `result_cache.go`: Privilege enforcement result caching - `model.go`: Casbin model and policy enforcement functions - `meta_cache_adapter.go`: Casbin adapter for MetaCache integration - Corresponding test files and mock implementations 3. MetaCache Simplification: Removes privilege and credential management methods from MetaCache interface and implementation: - Removed: GetCredentialInfo, RemoveCredential, UpdateCredential - Removed: GetPrivilegeInfo, GetUserRole, RefreshPolicyInfo, InitPolicyInfo - Deleted: meta_cache_adapter.go, privilege_cache.go and their tests 4. Updated References: Updates all callsites to use the new privilegeCache global: - Authentication interceptor now uses privilegeCache for password verification - Credential cache operations (InvalidateCredentialCache, UpdateCredentialCache, UpdateCredential) now use privilegeCache - Policy refresh operations (RefreshPolicyInfoCache) now use privilegeCache - Privilege interceptor uses new privilege.GetEnforcer() and privilege result cache 5. Improved API: Renames cache functions for clarity: - GetPrivilegeCache → GetResultCache - SetPrivilegeCache → SetResultCache - CleanPrivilegeCache → CleanResultCache This refactoring makes the codebase more modular, separates privilege management concerns from general metadata caching, and provides a clearer API for privilege enforcement operations. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-13 11:15:58 +08:00
Spade A	208481a070	feat: impl StructArray -- support same names in different STRUCT (#44557 ) ref: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-10 15:53:56 +08:00
Gao	3cc59a0d69	enhance: add storage usage for delete/upsert/restful (#44512 ) #44212 Also, record metrics only when storageUsageTracking is enabled. Use MB for scanned_remote counter and scanned_total counter metrics to avoid overflow. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-09-30 00:31:06 +08:00
aoiasd	294282f1d2	enhance: support use nullable field as bm25 function input field (#44586 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-09-29 10:25:05 +08:00
junjiejiangjjj	f07979f91d	enhance: add support for controlling function output field insertion (#44162 ) #44053 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-09-24 17:26:04 +08:00
Tianx	4d5afec9a8	fix: upsert error for timestamptz (#44548 ) issue: https://github.com/milvus-io/milvus/issues/44527 Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-24 10:28:04 +08:00
Gao	d5255b5eef	fix: set report_value default value when extrainfo is not nil for compatibility (#44529 ) https://github.com/milvus-io/milvus/issues/44523 Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-09-23 22:26:05 +08:00
Bingyi Sun	96e1de4e22	feat: allow users to write pk field when autoid is enabled (#44424 ) https://github.com/milvus-io/milvus/issues/44425 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-23 16:10:04 +08:00
Tianx	2c0c5ef41e	feat: timestamptz expression & index & timezone (#44080 ) issue: https://github.com/milvus-io/milvus/issues/27467 >My plan is as follows. >- [x] M1 Create collection with timestamptz field >- [x] M2 Insert timestamptz field data >- [x] M3 Retrieve timestamptz field data >- [x] M4 Implement handoff >- [x] M5 Implement compare operator >- [x] M6 Implement extract operator >- [x] M8 Support database/collection level default timezone >- [x] M7 Support STL-SORT index for datatype timestamptz --- The third PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M5, M6, M7, M8 described above. ## M8 Default Timezone We will be able to use alter_collection() and alter_database() in a future Python SDK release to modify the default timezone at the collection or database level. For insert requests, the timezone will be resolved using the following order of precedence: String Literal-> Collection Default -> Database Default. For retrieval requests, the timezone will be resolved in this order: Query Parameters -> Collection Default -> Database Default. In both cases, the final fallback timezone is UTC. ## M5: Comparison Operators We can now use the following expression format to filter on the timestamptz field: - `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op} ISO 'iso_string' ` - The interval_string follows the ISO 8601 duration format, for example: P1Y2M3DT1H2M3S. - The iso_string follows the ISO 8601 timestamp format, for example: 2025-01-03T00:00:00+08:00. - Example expressions: "tsz + INTERVAL 'P0D' != ISO '2025-01-03T00:00:00+08:00'" or "tsz != ISO '2025-01-03T00:00:00+08:00'". ## M6: Extract We will be able to extract sepecific time filed by kwargs in a future Python SDK release. The key is `time_fields`, and value should be one or more of "year, month, day, hour, minute, second, microsecond", seperated by comma or space. Then the result of each record would be an array of int64. ## M7: Indexing Support Expressions without interval arithmetic can be accelerated using an STL-SORT index. However, expressions that include interval arithmetic cannot be indexed. This is because the result of an interval calculation depends on the specific timestamp value. For example, adding one month to a date in February results in a different number of added days than adding one month to a date in March. --- After this PR, the input / output type of timestamptz would be iso string. Timestampz would be stored as timestamptz data, which is int64_t finally. > for more information, see https://en.wikipedia.org/wiki/ISO_8601 --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-23 10:24:12 +08:00
Gao	d3784c6515	enhance: add storage resource usage for vector search (#44308 ) issue: #44212 Implement search/query storage usage statistics in go side(result reduce), only record storage usage in vector search C++ path. Need to be implemented in query c++ path in next prs. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>	2025-09-19 20:20:02 +08:00
cai.zhang	62d416bf4f	fix: Check if ArrayData is nil to prevent panic (#44332 ) issue: #44331 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-09-12 14:17:57 +08:00
Bingyi Sun	e2eb8562f1	feat: Auto add namespace field data if namespace is enabled (#44198 ) issue: #44011 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-09 16:17:56 +08:00
Spade A	825a134739	feat: impl StructArray -- reject json types for struct (#44190 ) issue: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-03 19:33:53 +08:00
Jean-Francois Weber-Marx	330a871979	enhance: add configuration to allow custom characters in names (#42417 ) (#44063 ) related: #42417 - Add NameValidationAllowedChars and RoleNameValidationAllowedChars configuration parameters to specify additional characters allowed respectively in (generic) names and a role names - All validations in validateName method is moved to a the new method validateNameWithCustomChars which is called by both validateName and ValidateRoleName while specifying characters allowed Signed-off-by: Jean-Francois Weber-Marx <jfwm@hotmail.com> Signed-off-by: Jean-Francois Weber-Marx <jf.webermarx@criteo.com>	2025-09-02 11:57:52 +08:00
wei liu	16af4e230a	fix: Prevent panic in upsert due to missing nullable fields [Proxy] (#44070 ) issue: #43980 Fixes a panic that occurred when a partial update was converted to an insert due to a non-existent primary key. The panic was caused by missing nullable fields that were not provided in the original partial update request. The upsert pre-execution logic is refactored to handle this correctly: - Explicitly splits upsert data into 'insert' and 'update' batches. - Automatically generates data for missing nullable or default-value fields during inserts, preventing the panic. - Enhances `typeutil.UpdateFieldData` to support different source and destination indexes for flexible data merging. - Adds comprehensive unit tests for mixed upsert, pure insert, and pure update scenarios. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-08-29 18:33:51 +08:00
aoiasd	208a345a3d	enhance: package analyzer code in Go and fix named analyzer as tokenizer (#43694 ) relate: https://github.com/milvus-io/milvus/issues/43687 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-27 10:59:52 +08:00
junjiejiangjjj	f3d7e47227	feat: Supports more rerankers (#43270 ) https://github.com/milvus-io/milvus/issues/35856 Signed-off-by: junjiejiangjjj <junjie.jiang@zilliz.com>	2025-08-22 17:29:47 +08:00
congqixia	606d4c24cd	enhance: Use function def determine field `IsFunctionOutput` only (#43979 ) Related to #35853 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-22 04:49:46 +08:00
Tianx	26c5c779bf	feat: temporarily disable Timestamptz collection creation (#43935 ) issue: https://github.com/milvus-io/milvus/issues/27467 Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-08-21 11:17:46 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
wei liu	d3c95eaa77	enhance: Support partial field updates with upsert API (#42877 ) issue: #29735 Implement partial field update functionality for upsert operations, supporting scalar, vector, and dynamic JSON fields without requiring all collection fields. Changes: - Add queryPreExecute to retrieve existing records before upsert - Implement UpdateFieldData function for merging data - Add IDsChecker utility for efficient primary key lookups - Fix JSON data creation in tests using proper map marshaling - Add test cases for partial updates of different field types Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-08-19 11:15:45 +08:00
Zhen Ye	5551d99425	enhance: remove old arch non-streaming arch code (#43651 ) issue: #41609 - remove all dml dead code at proxy - remove dead code at l0_write_buffer - remove msgstream dependency at proxy - remove timetick reporter from proxy - remove replicate stream implementation --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-06 14:41:40 +08:00
Spade A	faeb7fd410	feat: impl StructArray -- create schema, insert, and retrieve data (#42855 ) Ref https://github.com/milvus-io/milvus/issues/42148 https://github.com/milvus-io/milvus/pull/42406 impls the segcore part of storage for handling with VectorArray. This PR: 1. impls the go part of storage for VectorArray 2. impls the collection creation with StructArrayField and VectorArray 3. insert and retrieve data from the collection. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>	2025-07-27 01:30:55 +08:00
Jean-Francois Weber-Marx	1bd66b09e3	enhance: allow '.' and '-' characters in usernames (#42417 ) (#42588 ) related: #42417 - update the isValidUsername function to accept dots and hyphens in addition to letters, digits, and underscores - this change improves compatibility with common username formats and addresses feedback in issue #42417 Signed-off-by: Jean-Francois Weber-Marx <jfwm@hotmail.com> Signed-off-by: Jean-Francois Weber-Marx <jf.webermarx@criteo.com>	2025-07-24 09:54:54 +08:00
foxspy	ed57650b52	fix: remove invalid restrictions on dim for int8 vector (#43469 ) issue: #43466 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-07-23 20:22:54 +08:00
congqixia	684f027496	fix: Remove trimming space logic when validating collection name (#43064 ) Related to #43031 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-04 11:00:45 +08:00
congqixia	74ea57bac1	enhance: Remove unused load field check from proxy (#42816 ) Related to #42489 Since load list works as hint after cachelayer implemented, the related check logic could be removed to keep code logic clean. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-19 19:34:47 +08:00
junjiejiangjjj	f1a4526bac	enhance: refactor rrf and weighted rerank (#42154 ) https://github.com/milvus-io/milvus/issues/35856 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-06-10 18:08:35 +08:00
cqy123456	317bbfbf81	enhance: milvus support minhash vector and mhjaccard metric (#42036 ) issue: https://github.com/issues/assigned?issue=milvus-io%7Cmilvus%7C41746 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-06-10 14:38:34 +08:00
aoiasd	2ae4d80120	enhance: support run analyzer by loaded collection field (#42113 ) relate: https://github.com/milvus-io/milvus/issues/42094 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-05-29 10:54:30 +08:00
congqixia	4cab236bca	enhance: [AddField][Nullable] Fill absent nullable field server-side (#42095 ) Related to #39718 The absent nullable field shall be filled at server-side in nullable design. While the implementation here was buggy causing the feature was not able to serve. This PR make proxy fill the field data in correct format so that field data with absent column(s) will be accepted. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-27 18:50:28 +08:00
Xianhui Lin	6a0e182e13	enhance: support TTL expiration with queries returning no results (#42086 ) support TTL expiration with queries returning no results issue:https://github.com/milvus-io/milvus/issues/41959 Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-05-27 18:28:27 +08:00

1 2 3 4 5

243 Commits