milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-07 01:28:27 +08:00

Author	SHA1	Message	Date
wei liu	7aed88113c	enhance: Deduplicate primary keys in upsert request batch (#45249 ) issue: #44320 This change adds deduplication logic to handle duplicate primary keys within a single upsert batch, keeping the last occurrence of each primary key. Key changes: - Add DeduplicateFieldData function to remove duplicate PKs from field data, supporting both Int64 and VarChar primary keys - Refactor fillFieldPropertiesBySchema into two separate functions: validateFieldDataColumns for validation and fillFieldPropertiesOnly for property filling, improving code clarity and reusability - Integrate deduplication logic in upsertTask.PreExecute to automatically deduplicate data before processing - Add comprehensive unit tests for deduplication with various PK types (Int64, VarChar) and field types (scalar, vector) - Add Python integration tests to verify end-to-end behavior --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-11-17 21:35:40 +08:00
zhenshan.cao	6327c9a514	fix: Fix bugs related to TimestampTz (#45111 ) issue: https://github.com/milvus-io/milvus/issues/44527 https://github.com/milvus-io/milvus/issues/44537 https://github.com/milvus-io/milvus/issues/44538 https://github.com/milvus-io/milvus/issues/44585 https://github.com/milvus-io/milvus/issues/44622 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2025-11-04 16:51:33 +08:00
junjiejiangjjj	f07979f91d	enhance: add support for controlling function output field insertion (#44162 ) #44053 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-09-24 17:26:04 +08:00
Tianx	4d5afec9a8	fix: upsert error for timestamptz (#44548 ) issue: https://github.com/milvus-io/milvus/issues/44527 Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-24 10:28:04 +08:00
Tianx	2c0c5ef41e	feat: timestamptz expression & index & timezone (#44080 ) issue: https://github.com/milvus-io/milvus/issues/27467 >My plan is as follows. >- [x] M1 Create collection with timestamptz field >- [x] M2 Insert timestamptz field data >- [x] M3 Retrieve timestamptz field data >- [x] M4 Implement handoff >- [x] M5 Implement compare operator >- [x] M6 Implement extract operator >- [x] M8 Support database/collection level default timezone >- [x] M7 Support STL-SORT index for datatype timestamptz --- The third PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M5, M6, M7, M8 described above. ## M8 Default Timezone We will be able to use alter_collection() and alter_database() in a future Python SDK release to modify the default timezone at the collection or database level. For insert requests, the timezone will be resolved using the following order of precedence: String Literal -> Collection Default -> Database Default. For retrieval requests, the timezone will be resolved in this order: Query Parameters -> Collection Default -> Database Default. In both cases, the final fallback timezone is UTC. ## M5: Comparison Operators We can now use the following expression format to filter on the timestamptz field: - `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op} ISO 'iso_string' ` - The interval_string follows the ISO 8601 duration format, for example: P1Y2M3DT1H2M3S. - The iso_string follows the ISO 8601 timestamp format, for example: 2025-01-03T00:00:00+08:00. - Example expressions: "tsz + INTERVAL 'P0D' != ISO '2025-01-03T00:00:00+08:00'" or "tsz != ISO '2025-01-03T00:00:00+08:00'". ## M6: Extract We will be able to extract specific time fields via kwargs in a future Python SDK release. The key is `time_fields`, and the value should be one or more of "year, month, day, hour, minute, second, microsecond", separated by commas or spaces. The result for each record would then be an array of int64. 
## M7: Indexing Support Expressions without interval arithmetic can be accelerated using an STL-SORT index. However, expressions that include interval arithmetic cannot be indexed. This is because the result of an interval calculation depends on the specific timestamp value. For example, adding one month to a date in February results in a different number of added days than adding one month to a date in March. --- After this PR, the input / output type of timestamptz would be an ISO string. Timestamptz values would be stored as timestamptz data, which is ultimately int64_t. > for more information, see https://en.wikipedia.org/wiki/ISO_8601 --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-23 10:24:12 +08:00
Bingyi Sun	94d53a5ac6	feat: encode cluster id in auto id (#44471 ) https://github.com/milvus-io/milvus/issues/44326 prev: [physical_ts][logical_ts] after [sign_bit][cluster_id][physical_ts][logical_ts] --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-22 10:40:02 +08:00
Bingyi Sun	5cd2d99799	enhance: Revert "feat: encode cluster id in auto id (#44324 )" (#44426 ) This reverts commit 7af159410395f0e7079d4875d96544c01f1d477b	2025-09-17 17:56:01 +08:00
Bingyi Sun	7af1594103	feat: encode cluster id in auto id (#44324 ) https://github.com/milvus-io/milvus/issues/44326 prev: `[physical_ts][logical_ts]` after `[sign_bit][cluster_id][physical_ts][logical_ts]` --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-17 16:56:01 +08:00
Bingyi Sun	e2eb8562f1	feat: Auto add namespace field data if namespace is enabled (#44198 ) issue: #44011 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-09 16:17:56 +08:00
junjiejiangjjj	f3d7e47227	feat: Supports more rerankers (#43270 ) https://github.com/milvus-io/milvus/issues/35856 Signed-off-by: junjiejiangjjj <junjie.jiang@zilliz.com>	2025-08-22 17:29:47 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
aoiasd	06006939f8	feat: support use cipher hook in streaming node (#40562 ) relate: https://github.com/milvus-io/milvus/issues/40321 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-19 10:41:44 +08:00
Zhen Ye	5551d99425	enhance: remove old arch non-streaming arch code (#43651 ) issue: #41609 - remove all dml dead code at proxy - remove dead code at l0_write_buffer - remove msgstream dependency at proxy - remove timetick reporter from proxy - remove replicate stream implementation --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-06 14:41:40 +08:00
Spade A	faeb7fd410	feat: impl StructArray -- create schema, insert, and retrieve data (#42855 ) Ref https://github.com/milvus-io/milvus/issues/42148 https://github.com/milvus-io/milvus/pull/42406 impls the segcore part of storage for handling with VectorArray. This PR: 1. impls the go part of storage for VectorArray 2. impls the collection creation with StructArrayField and VectorArray 3. insert and retrieve data from the collection. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>	2025-07-27 01:30:55 +08:00
congqixia	880915e08b	enhance: Print out-of-date schema ts when returning ErrSchemaMismatch (#42790 ) Related to #41858 This PR adds logging to help debug schema mismatches between the pymilvus cache and the proxy schema. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-17 10:38:37 +08:00
cai.zhang	63246c040f	fix: Use locking to ensure the atomicity of dropping segment indexes (#42075 ) issue: #41288 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-05-28 10:00:28 +08:00
junjiejiangjjj	359e7efd8e	feat: Add function running monitoring (#40358 ) #35856 #40004 1. Optimize model verification logic 2. Add profiling code Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-03-10 22:28:05 +08:00
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990 ) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
SimFG	ad36347fb3	fix: add BeginTimestamp and EndTimestamp to insert and upsert messages (#40110 ) - issue: #40109 - caused by: #38656 Signed-off-by: SimFG <bang.fu@zilliz.com>	2025-02-22 12:29:53 +08:00
Patrick Weizhi Xu	04fff74a56	feat: introduce Text data type (#39874 ) issue: https://github.com/milvus-io/milvus/issues/39818 This PR mimics Varchar data type, allows insert, search, query, delete, full-text search and others. Functionalities related to filter expressions are disabled temporarily. Storage changes for Text data type will be in the following PRs. Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2025-02-19 11:04:51 +08:00
Xianhui Lin	82f9689711	enhance: Add schema update time verification for insert and upsert to use cache (#39096 ) enhance: Add schema update time verification for insert and upsert to use cache issue: https://github.com/milvus-io/milvus/issues/39093 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-02-07 14:10:45 +08:00
aoiasd	2b4caba76e	fix: check utf-8 format for varchar with analyzer open (#39299 ) relate: https://github.com/milvus-io/milvus/issues/39285 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-02-06 17:11:51 +08:00
junjiejiangjjj	16cbdfb3b1	feat: Add Text Embedding Function (#36366 ) https://github.com/milvus-io/milvus/issues/35856 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-01-24 14:23:06 +08:00
SimFG	2afe2eaf3e	feat: support to replicate collection when the services contains the system tt msg (#37559 ) - issue: #37105 --------- Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-12-17 09:08:46 +08:00
tinswzy	27229f7907	enhance: refine exists log print with ctx (#38080 ) issue: #35917 Refines exists log print with ctx Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2024-12-14 22:36:44 +08:00
tinswzy	5768dbbb5d	enhance: refine pulsar related mq interfaces (#38007 ) issue: #35917 Refines the pulsar-related mq APIs to allow the ctx to be passed down Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2024-12-04 20:50:39 +08:00
SimFG	302650ae0e	fix: use the default partition for the limit quota when the request partition name is empty (#38005 ) - issue: #37685 Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-11-27 11:00:36 +08:00
jaime	52cce4de58	fix: inaccurate size estimation for encoded array data (#36373 ) issue: #36029 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-09-24 14:51:14 +08:00
congqixia	fe20366b5c	enhance: Remove duplicated schema helper creation in proxy (#35489 ) Related to PRs of #35415 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-08-15 19:18:53 +08:00
smellthemoon	6106a48acb	fix: upsert result use the previous pk (#34672 ) #34668 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-07-31 15:25:51 +08:00
Jiquan Long	a2ac84bd64	feat: record the duration waiting in the proxy queue (#34744 ) fix: https://github.com/milvus-io/milvus/issues/34743 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-07-23 14:23:52 +08:00
smellthemoon	07b94b4615	enhance: support upsert autoid==true (#30342 ) related with: #29258 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-07-11 16:53:35 +08:00
aoiasd	186757e622	enhance: support mark error as user error (#33498 ) relate: https://github.com/milvus-io/milvus/issues/33492 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-07-01 14:56:12 +08:00
SimFG	8594b55ad5	enhance: add `max insert request size` and `must use partition key` configs (#32433 ) issue: https://github.com/milvus-io/milvus/issues/30577 /kind improvement Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-04-19 10:31:20 +08:00
cai.zhang	40ca98f57f	enhance: Skip timestamp allocation when search/query consistency level is eventually (#29773 ) issue: #29772 1. Skip timestamp allocation when search/query consistency level is eventually. Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-02-21 09:52:59 +08:00
congqixia	4f8c540c77	enhance: cache collection schema attributes to reduce proxy cpu (#29668 ) See also #29113 The collection schema is crucial when performing search/query but some of the information is calculated for every request. This PR change schema field of cached collection info into a utility `schemaInfo` type to store some stable result, say pk field, partitionKeyEnabled, etc. And provided field name to id map for search/query services. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-04 17:28:46 +08:00
yah01	be980fbc38	Refine state check (#27541 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-10-11 21:01:35 +08:00
yah01	6539a5ae2c	Refine DataCoord status (#27262 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-09-26 17:15:27 +08:00
cai.zhang	a362bb1457	Support array datatype (#26369 ) Signed-off-by: cai.zhang <cai.zhang@zilliz.com>	2023-09-19 14:23:23 +08:00
yah01	3349db4aa7	Refine errors to remove changes breaking design (#26521 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-09-04 09:57:09 +08:00
smellthemoon	87ecaac703	Add dynamic schema check in upsert (#26644 ) Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2023-08-30 10:52:26 +08:00
MrPresent-Han	d30a920226	add log trace for segcore(#26277 ) (#26339 ) Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2023-08-16 11:41:33 +08:00
SimFG	69d274d233	Improve the operation log (#25589 ) Signed-off-by: SimFG <bang.fu@zilliz.com>	2023-07-14 16:08:31 +08:00
Enwei Jiao	66fdc71479	Refactor logs in DataCoord & DataNode (#25574 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-07-14 15:56:31 +08:00
jaime	18df2ba6fd	[Cherry-Pick] Support Database (#24769 ) Support Database(#23742) Fix db nonexists error for FlushAll (#24222) Fix check collection limits fails (#24235) backward compatibility with empty DB name (#24317) Fix GetFlushAllState with DB (#24347) Remove db from global meta cache after drop database (#24474) Fix db name is empty for describe collection response (#24603) Add RBAC for Database API (#24653) Fix miss load the same name collection during recover stage (#24941) RBAC supports Database validation (#23609) Fix to list grant with db return empty (#23922) Optimize PrivilegeAll permission check (#23972) Add the default db value for the rbac request (#24307) Signed-off-by: jaime <yun.zhang@zilliz.com> Co-authored-by: SimFG <bang.fu@zilliz.com> Co-authored-by: longjiquan <jiquan.long@zilliz.com>	2023-06-25 17:20:43 +08:00
Enwei Jiao	d143682d7d	Refactor logs in proxy package. (#24936 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-06-19 13:28:41 +08:00
yihao.dai	b62429070c	Set pchannels before dml enqueue to prevent panic (#24828 ) Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-06-16 16:36:40 +08:00
smellthemoon	db31e88a73	Add length check when insert and upsert (#24759 ) Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2023-06-15 10:24:38 +08:00
congqixia	41af0a98fa	Use go-api/v2 for milvus-proto (#24770 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-06-09 01:28:37 +08:00
xige-16	732fe54775	Support partition Key (#24047 ) Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-06-06 10:24:34 +08:00
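
Note on commit 7aed88113c: the message describes keeping the last occurrence of each primary key within a single upsert batch. The snippet below is only a minimal Python sketch of that keep-last rule on column-oriented data. It is not the Go `DeduplicateFieldData` implementation; the function name `deduplicate_last_wins`, the column layout, and the choice to preserve the relative order of the surviving rows are all assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Sequence


def deduplicate_last_wins(
    pks: Sequence[Hashable],
    columns: Dict[str, Sequence],
) -> Dict[str, List]:
    """Drop rows whose primary key reappears later in the batch.

    Hypothetical helper: `pks` holds one primary key per row (int or str),
    and `columns` maps a field name to its per-row values.
    """
    # Remember the index of the last row seen for each primary key.
    last_index: Dict[Hashable, int] = {}
    for i, pk in enumerate(pks):
        last_index[pk] = i  # later occurrences overwrite earlier ones

    # Assumption: surviving rows keep their original relative order.
    keep = sorted(last_index.values())
    return {name: [values[i] for i in keep] for name, values in columns.items()}


batch = {
    "id": [1, 2, 1, 3, 2],
    "text": ["a", "b", "c", "d", "e"],
}
deduped = deduplicate_last_wins(batch["id"], batch)
```

In this sketch the first rows for keys 1 and 2 are discarded, so only the values written last in the batch would reach later processing.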


76 Commits