805 Commits

Author SHA1 Message Date
sparknack
935160840c
enhance: add a disk quota for the loaded binlog size to prevent load failure of querynode (#44893)
issue: #41435

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-10-19 19:44:01 +08:00
Zhen Ye
496331ffa8
enhance: support alias with WAL-based DDL framework (#44865)
issue: #43897

- Alias related DDL is implemented by WAL-based DDL framework now.
- Support following message type in wal AlterAlias, DropAlias.
- Alias DDL can be synced by new CDC now.
- Refactor some UT for Alias DDL.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-18 15:12:01 +08:00
Zhen Ye
4dc75a6e2c
enhance: support database with WAL-based DDL framework (#44822)
issue: #43897

- Database related DDL is implemented by WAL-based DDL framework now.
- Support following message type in wal CreateDatabase, AlterDatabase,
DropDatabase.
- Database DDL can be synced by new CDC now.
- Refactor some UT for Database DDL.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-17 16:38:10 +08:00
Zhen Ye
80bb09f7c2
enhance: support rbac with WAL-based DDL framework (#44735)
issue: #43897

- RBAC(Roles/Users/Privileges/Privilege Groups) is implemented by
WAL-based DDL framework now.
- Support following message type in wal `AlterUser`, `DropUser`,
`AlterRole`, `DropRole`, `AlterUserRole`, `DropUserRole`,
`AlterPrivilege`, `DropPrivilege`, `AlterPrivilegeGroup`,
`DropPrivilegeGroup`, `RestoreRBAC`.
- RBAC can be synced by new CDC now.
- Refactor some UT for RBAC.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-10-16 16:02:01 +08:00
congqixia
31670c5489
enhance: Use dbName in error message (#44618)
The collection not found err could contains db id in err message, which
is not meaningful to users.

This patch make error message wrapping dbname instead of db id.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-30 12:25:05 +08:00
cai.zhang
19346fa389
feat: Geospatial Data Type and GIS Function support for milvus (#44547)
issue: #43427

This pr's main goal is merge #37417 to milvus 2.5 without conflicts.

# Main Goals

1. Create and describe collections with geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search can display  geospatial data
5. Support using GIS funtions like ST_EQUALS in query
6. Support R-Tree index for geometry type

# Solution

1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions.Now only support brutal search
7. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing.Check the modification
in pymilvus.

---------

Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>
2025-09-28 19:43:05 +08:00
Zhen Ye
19e5e9f910
enhance: broadcaster will lock resource until message acked (#44508)
issue: #43897

- Return LastConfirmedMessageID when wal append operation.
- Add resource-key-based locker for broadcast-ack operation to protect
the coord state when executing ddl.
- Resource-key-based locker is held until the broadcast operation is
acked.
- ResourceKey support shared and exclusive lock.
- Add FastAck execute ack right away after the broadcast done to speed
up ddl.
- Ack callback will support broadcast message result now.
- Add tombstone for broadcaster to avoid to repeatedly commit DDL and
ABA issue.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-24 20:58:05 +08:00
congqixia
0e5fb8ac6f
fix: Cleanup collection metrics after dropped on rootcoord (#44511)
Related to #44509

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-23 11:02:06 +08:00
Tianx
2c0c5ef41e
feat: timestamptz expression & index & timezone (#44080)
issue: https://github.com/milvus-io/milvus/issues/27467

>My plan is as follows.
>- [x] M1 Create collection with timestamptz field
>- [x] M2 Insert timestamptz field data
>- [x] M3 Retrieve timestamptz field data
>- [x] M4 Implement handoff
>- [x] M5 Implement compare operator
>- [x] M6 Implement extract operator
 >- [x] M8 Support database/collection level default timezone
>- [x] M7 Support STL-SORT index for datatype timestamptz

---

The third PR of issue: https://github.com/milvus-io/milvus/issues/27467,
which completes M5, M6, M7, M8 described above.

## M8 Default Timezone

We will be able to use alter_collection() and alter_database() in a
future Python SDK release to modify the default timezone at the
collection or database level.

For insert requests, the timezone will be resolved using the following
order of precedence: String Literal-> Collection Default -> Database
Default.
For retrieval requests, the timezone will be resolved in this order:
Query Parameters -> Collection Default -> Database Default.
In both cases, the final fallback timezone is UTC.


## M5: Comparison Operators

We can now use the following expression format to filter on the
timestamptz field:

- `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op}
ISO 'iso_string' `

- The interval_string follows the ISO 8601 duration format, for example:
P1Y2M3DT1H2M3S.

- The iso_string follows the ISO 8601 timestamp format, for example:
2025-01-03T00:00:00+08:00.

- Example expressions: "tsz + INTERVAL 'P0D' != ISO
'2025-01-03T00:00:00+08:00'" or "tsz != ISO
'2025-01-03T00:00:00+08:00'".

## M6: Extract

We will be able to extract sepecific time filed by kwargs in a future
Python SDK release.
The key is `time_fields`, and value should be one or more of "year,
month, day, hour, minute, second, microsecond", seperated by comma or
space. Then the result of each record would be an array of int64.



## M7: Indexing Support

Expressions without interval arithmetic can be accelerated using an
STL-SORT index. However, expressions that include interval arithmetic
cannot be indexed. This is because the result of an interval calculation
depends on the specific timestamp value. For example, adding one month
to a date in February results in a different number of added days than
adding one month to a date in March.

--- 

After this PR, the input / output type of timestamptz would be iso
string. Timestampz would be stored as timestamptz data, which is int64_t
finally.

> for more information, see https://en.wikipedia.org/wiki/ISO_8601

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-09-23 10:24:12 +08:00
wei liu
3715ddcc81
fix: Fix load config parsing using wrong parameters for dynamic replica (#44430)
issue: #44429
Fix the issue where dynamic modification of collection replica count
doesn't take effect due to incorrect parameter usage in load config
parsing.

Changes include:
- Replace DatabaseLevelReplicaNumber with CollectionLevelReplicaNumber
- Replace DatabaseLevelResourceGroups with CollectionLevelResourceGroups
- Update test cases to use correct collection-level constants
- Ensure dynamic replica changes are properly applied to collections

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-09-22 10:42:02 +08:00
XuanYang-cn
d1b40b7742
fix: Rename Collection use wrong method to check if collection exists (#44436)
See also: #44435

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-09-18 14:37:07 +08:00
XuanYang-cn
ad86292bed
fix: Deny RenameCollection when db enabled encryption (#44225)
Also fix the validation of enabling encyption when create db

See also: #44217, #44218

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-09-16 10:08:07 +08:00
congqixia
634026a177
fix: [AddField] Forbid delete dynamicfield.enable property (#44335)
Related to #44311

This PR forbids delete `dynamicfield.enable` property and returns user
understandable error message to user.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-15 14:20:00 +08:00
Bingyi Sun
e2eb8562f1
feat: Auto add namespace field data if namespace is enabled (#44198)
issue: #44011

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-09 16:17:56 +08:00
Bingyi Sun
e3ecacca9e
feat: Add namespace prop (#43962)
issue: https://github.com/milvus-io/milvus/issues/44011
namespace is an alias for tenant. if this property is enabled, milvus
will add a __namespace_id field.
Modifications in the future will use this property to do compaction and
search.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-03 12:57:53 +08:00
congqixia
aa4ef9c996
feat: Support enabling dynamic schema on existing collection (#44151)
Related to #44150

This PR make enabling `dynamic schema` feature for an existing
collection possible.

This related API is to reuse `AlterCollection` and underhood its
redirected to `adding nullable json field`

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-02 15:51:52 +08:00
Zhen Ye
9e2d1963d4
enhance: support cchannel for streaming service (#44143)
issue: #43897

- add cchannel as a special vchannel to hold some ddl and dcl.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-02 10:05:52 +08:00
XuanYang-cn
37a447d166
feat: Add CMEK cipher plugin (#43722)
1. Enable Milvus to read cipher configs
2. Enable cipher plugin in binlog reader and writer
3. Add a testCipher for unittests
4. Support pooling for datanode
5. Add encryption in storagev2

See also: #40321 
Signed-off-by: yangxuan <xuan.yang@zilliz.com>

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-08-27 11:15:52 +08:00
Spade A
8456f824be
feat: impl StructArray -- miscellaneous staffs for struct array (#43960)
Ref https://github.com/milvus-io/milvus/issues/42148

1. enable storage v2
2. implement some missing staffs
3. fix some bugs and add tests

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-26 21:35:53 +08:00
Tianx
c0d62268ac
feat: add timesatmptz data type (#44005)
issue: https://github.com/milvus-io/milvus/issues/27467
>
https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420
> * [x]  M1 Create collection with timestamptz field
> * [x]  M2 Insert timestamptz field data
> * [x]  M3 Retrieve timestamptz field data
> * [x]  M4 Implement handoff[ ]  

The second PR of issue:
https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4
described above.

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-08-26 15:59:53 +08:00
congqixia
847b79e197
enhance: Use RLock for ListPrivilegeGroups (#43998)
Related to #43901

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-21 23:45:46 +08:00
Spade A
d6a428e880
feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726)
Ref https://github.com/milvus-io/milvus/issues/42148

This PR supports create index for vector array (now, only for
`DataType.FLOAT_VECTOR`) and search on it.
The index type supported in this PR is `EMB_LIST_HNSW` and the metric
type is `MAX_SIM` only.

The way to use it:
```python
milvus_client = MilvusClient("xxx:19530")
schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True)
...
struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field")
...
struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000)
...
schema.add_struct_array_field(struct_schema)
index_params = milvus_client.prepare_index_params()
index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128})
...
milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params)
```

Note: This PR uses `Lims` to convey offsets of the vector array to
knowhere where vectors of multiple vector arrays are concatenated and we
need offsets to specify which vectors belong to which vector array.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-08-20 10:27:46 +08:00
sthuang
fc03fe7623
enhance: avoid frequent LoadWithPrefix etcd calls in ShowCollections … (#43902)
related: https://github.com/milvus-io/milvus/issues/43901

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-08-18 14:57:44 +08:00
congqixia
e75fddcc15
fix: Invalidate proxy cache for create alias op (#43854)
Related to #43853

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-18 12:01:45 +08:00
Xianhui Lin
b98b3b16a3
feat:add BatchDescribeCollection interface (#43786)
feat:add BatchDescribeCollection interface
issue: https://github.com/milvus-io/milvus/issues/43781

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-08-18 01:23:45 +08:00
PjJinchen
64633cc5b3
fix: Metrics with collectionName but no databaseName label are causing name conflicts and confusion (#43277) (#43808)
issue: https://github.com/milvus-io/milvus/issues/43277

---------

Signed-off-by: PjJinchen <6268414+pj1987111@users.noreply.github.com>
2025-08-15 01:37:44 +08:00
Zhen Ye
5551d99425
enhance: remove old arch non-streaming arch code (#43651)
issue: #41609

- remove all dml dead code at proxy
- remove dead code at l0_write_buffer
- remove msgstream dependency at proxy
- remove timetick reporter from proxy
- remove replicate stream implementation

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-06 14:41:40 +08:00
Spade A
faeb7fd410
feat: impl StructArray -- create schema, insert, and retrieve data (#42855)
Ref https://github.com/milvus-io/milvus/issues/42148

https://github.com/milvus-io/milvus/pull/42406 impls the segcore part of
storage for handling with VectorArray.
This PR:
1. impls the go part of storage for VectorArray
2. impls the collection creation with StructArrayField and VectorArray
3. insert and retrieve data from the collection.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
2025-07-27 01:30:55 +08:00
Zhen Ye
e9ab73e93d
enhance: add schema version at recovery storage (#43500)
issue: #43072, #43289

- manage the schema version at recovery storage.
- update the schema when creating collection or alter schema.
- get schema at write buffer based on version.
- recover the schema when upgrading from 2.5.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-23 21:38:54 +08:00
cai.zhang
2adc6ce0bc
fix: Call AlterCollection when only rename collection (#43420)
issue: #43407

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-18 15:46:56 +08:00
Zhen Ye
5aa7a116d2
fix: change maxTimeTickDelay from 5m into 20m (#43377)
issue: #43266

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-18 11:29:42 +08:00
Zhen Ye
15a6631147
enhance: add quota limit based on sn consuming lag (#43105)
issue: #42995

- The consuming lag at streaming node will be reported to coordinator.
- The consuming lag will trigger the write limit and deny by quota
center.
- Set the ttProtection by default.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-11 14:10:49 +08:00
Zhen Ye
46b6f1b9e2
fix: panic when logging a old message should be skipped (#43076)
issue: #43074

- fix: panic when logging a old message should be skipped, #43074 
- fix: make the ack of broadcaster idompotent, #43026
- fix: lost dropping collection when upgrading, #43092
- fix: panic when DropPartition happen after DropCollection, #43027,
#43078

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-04 16:04:44 +08:00
Zhen Ye
ecb24e7232
enhance: use multi-process framework in integration test (#42976)
issue: #41609

- add env `MILVUS_NODE_ID_FOR_TESTING` to set up a node id for milvus
process.
- add env `MILVUS_CONFIG_REFRESH_INTERVAL` to set up the refresh
interval of paramtable.
- Init paramtable when calling `paramtable.Get()`.
- add new multi process framework for integration test.
- change all integration test into multi process.
- merge some test case into one suite to speed up it.
- modify some test, which need to wait for issue #42966, #42685.
- remove the waittssync for delete collection to fix issue: #42989

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-30 14:22:43 +08:00
cai.zhang
59b003adac
enhance: Skip modify field meta when rename collection or rename dbName (#42875)
issue: #42873

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-23 17:04:41 +08:00
Zhen Ye
fadc053d7a
fix: filter new proxy when initializing proxy session at timeticksync (#42831)
issue: #40532

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-19 16:44:40 +08:00
Spade A
911a8df17c
feat: impl StructArray -- data storage support in segcore (#42406)
Ref https://github.com/milvus-io/milvus/issues/42148
This PR mainly enables segcore to support array of vector (read and
write, but not indexing). Now only float vector as the element type is
supported.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-06-12 14:38:35 +08:00
Zhen Ye
af0881ee5d
fix: timetick cannot push forward when upgrading (#42567)
issue #42492

- streamingcoord start before old rootcoord.
- streaming balancer will check the node session synchronously to avoid
redundant operation when cluster startup.
- ddl operation will check if streaming enabled, if the streaming is not
enabled, it will use msgstream.
- msgstream will initialize if streaming is not enabled, and stop when
streaming is enabled.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-10 14:52:42 +08:00
cai.zhang
5566a85bcc
enhance: Add proxy task queue metrics (#42156)
issue: #42155

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-04 11:26:32 +08:00
Xiaowei Shi
729d0b666e
enhance: use parsed physical timestamp in metrics (#41784)
issue: https://github.com/milvus-io/milvus/issues/38809
pr: https://github.com/milvus-io/milvus/pull/38810 failed to reopen

Signed-off-by: Xiaowei Shi <shallwe.shih@gmail.com>
2025-05-30 10:20:37 +08:00
Zhen Ye
66cc194ab2
enhance: add partition gc at streaming arch (#42179)
issue: #41976

- make drop partition message as a broadcast message.
- add gc when drop partition message is acked.
- add a call back to handle the broadcast message when ack.
- the ack operation of broadcast message will retry until success.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:20:30 +08:00
Zhen Ye
4bad293655
enhance: make upgrading from 2.5.x less down time (#42082)
issue: #40532

- start timeticksync at rootcoord if the streaming service is not
available
- stop timeticksync if the streaming service is available
- open a read-only wal if some nodes in cluster is not upgrading to 2.6
- allow to open read-write wal after all nodes in cluster is upgrading
to 2.6

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:02:29 +08:00
groot
c00005bdaa
feat: support to drop properties of field (#41996)
issue: https://github.com/milvus-io/milvus/issues/41990

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-05-27 14:32:34 +08:00
congqixia
22ef03b040
fix: [AddField] Expire metacache with rc task timestamp (#42010)
Related to #39718 #41681

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-22 17:20:25 +08:00
Zhen Ye
e4f04fd692
fix: vchannel count of wal balancer is wrong after recovering (#41922)
issue: #40638

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-19 22:14:23 +08:00
Zhen Ye
59dff668dc
enhance: schema change without manual flush (#41882)
issue: #39718

- remove the manual flush message from schema change operation
- add flush segment id handle into schema change processes

Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2025-05-19 10:14:22 +08:00
SimFG
9fa50e0b1a
enhance: implement authorization checks for DescribeCollection and DescribeDatabase tasks (#41798)
- issue: #41694

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
2025-05-15 17:52:23 +08:00
groot
1574673a8c
enhance: Alter collection description (#41558)
issue: https://github.com/milvus-io/milvus/issues/41557

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-05-12 14:16:55 +08:00
Xianhui Lin
2278e9e03f
fix: set quota center metrics configuration before watch (#41706)
fix: set quota center metrics configuration before watch
issue: https://github.com/milvus-io/milvus/issues/35177

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-09 14:32:54 +08:00
Xianhui Lin
3124d6f599
feat: add string ts type in describecollection (#41612)
feat: add string  ts type in describecollection
issue:https://github.com/milvus-io/milvus/issues/39093

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-06 14:28:52 +08:00