1982 Commits

Author SHA1 Message Date
wei liu
33d1e7de83
fix: Replace incorrect log import with milvus v2 log package (#44731)
issue: #44730
Fix the issue where logs were not outputting as expected due to
incorrect log package imports across multiple components.

Changes include:
- Add golangci-lint rule to forbid github.com/pingcap/log usage
- Replace github.com/pingcap/log with
github.com/milvus-io/milvus/pkg/v2/log

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-10-10 20:27:57 +08:00
groot
81f0d498be
enhance: Support JSONL/NDJSON files for bulkinsert (#44602)
issue: https://github.com/milvus-io/milvus/issues/44567

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-10-10 10:43:57 +08:00
Bingyi Sun
c25166a202
fix: Fix bulk import with autoid (#44604)
issue: #44424

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-10-09 12:09:56 +08:00
PjJinchen
d8efe8a6fb
feat: support ali qwen rerank (#44363) (#44364)
issue: #44363

---------

Signed-off-by: PjJinchen <6268414+pj1987111@users.noreply.github.com>
2025-09-30 23:19:52 +08:00
congqixia
079cd97c0a
fix: [skip e2e] Wait eventch in TestUpdateSessions (#44640)
Related to #44620

Make unittest stable by waiting eventch instead of async read
len(eventCh), which is unstable

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-30 09:53:05 +08:00
aoiasd
294282f1d2
enhance: support use nullable field as bm25 function input field (#44586)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-29 10:25:05 +08:00
Zhen Ye
b6b59bd222
fix: remove redundant initialization of storage v2 (#44597)
issue: #44596

- querynode already init the storage v2 and segcore, so streamingnode
should not do this again.
- It also fix the gcp object storage access denied.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-29 10:17:04 +08:00
cai.zhang
19346fa389
feat: Geospatial Data Type and GIS Function support for milvus (#44547)
issue: #43427

This pr's main goal is merge #37417 to milvus 2.5 without conflicts.

# Main Goals

1. Create and describe collections with geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search can display  geospatial data
5. Support using GIS funtions like ST_EQUALS in query
6. Support R-Tree index for geometry type

# Solution

1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions.Now only support brutal search
7. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing.Check the modification
in pymilvus.

---------

Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>
2025-09-28 19:43:05 +08:00
congqixia
cc53b25ba4
fix: [skip e2e] Update unit test after hnsw support binary vector (#44575)
fix: #44574

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-26 18:21:04 +08:00
Zhen Ye
19e5e9f910
enhance: broadcaster will lock resource until message acked (#44508)
issue: #43897

- Return LastConfirmedMessageID when wal append operation.
- Add resource-key-based locker for broadcast-ack operation to protect
the coord state when executing ddl.
- Resource-key-based locker is held until the broadcast operation is
acked.
- ResourceKey support shared and exclusive lock.
- Add FastAck execute ack right away after the broadcast done to speed
up ddl.
- Ack callback will support broadcast message result now.
- Add tombstone for broadcaster to avoid to repeatedly commit DDL and
ABA issue.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-24 20:58:05 +08:00
congqixia
99598ae5ec
enhance: Add param item for hybrid search requery policy (#44466)
Related to #39757

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-24 17:32:04 +08:00
junjiejiangjjj
f07979f91d
enhance: add support for controlling function output field insertion (#44162)
#44053

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-09-24 17:26:04 +08:00
Bingyi Sun
96e1de4e22
feat: allow users to write pk field when autoid is enabled (#44424)
https://github.com/milvus-io/milvus/issues/44425

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-23 16:10:04 +08:00
junjiejiangjjj
71563d5d0e
enhance: optimize decay function with configurable score merging and … (#44066)
…normalization

- Add configurable score merge functions (max, avg, sum) for decay
reranking
- Introduce norm_score parameter to control score normalization behavior
- Refactor score normalization logic into reusable utility functions

#44051

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-09-23 14:18:06 +08:00
Tianx
2c0c5ef41e
feat: timestamptz expression & index & timezone (#44080)
issue: https://github.com/milvus-io/milvus/issues/27467

>My plan is as follows.
>- [x] M1 Create collection with timestamptz field
>- [x] M2 Insert timestamptz field data
>- [x] M3 Retrieve timestamptz field data
>- [x] M4 Implement handoff
>- [x] M5 Implement compare operator
>- [x] M6 Implement extract operator
 >- [x] M8 Support database/collection level default timezone
>- [x] M7 Support STL-SORT index for datatype timestamptz

---

The third PR of issue: https://github.com/milvus-io/milvus/issues/27467,
which completes M5, M6, M7, M8 described above.

## M8 Default Timezone

We will be able to use alter_collection() and alter_database() in a
future Python SDK release to modify the default timezone at the
collection or database level.

For insert requests, the timezone will be resolved using the following
order of precedence: String Literal-> Collection Default -> Database
Default.
For retrieval requests, the timezone will be resolved in this order:
Query Parameters -> Collection Default -> Database Default.
In both cases, the final fallback timezone is UTC.


## M5: Comparison Operators

We can now use the following expression format to filter on the
timestamptz field:

- `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op}
ISO 'iso_string' `

- The interval_string follows the ISO 8601 duration format, for example:
P1Y2M3DT1H2M3S.

- The iso_string follows the ISO 8601 timestamp format, for example:
2025-01-03T00:00:00+08:00.

- Example expressions: "tsz + INTERVAL 'P0D' != ISO
'2025-01-03T00:00:00+08:00'" or "tsz != ISO
'2025-01-03T00:00:00+08:00'".

## M6: Extract

We will be able to extract sepecific time filed by kwargs in a future
Python SDK release.
The key is `time_fields`, and value should be one or more of "year,
month, day, hour, minute, second, microsecond", seperated by comma or
space. Then the result of each record would be an array of int64.



## M7: Indexing Support

Expressions without interval arithmetic can be accelerated using an
STL-SORT index. However, expressions that include interval arithmetic
cannot be indexed. This is because the result of an interval calculation
depends on the specific timestamp value. For example, adding one month
to a date in February results in a different number of added days than
adding one month to a date in March.

--- 

After this PR, the input / output type of timestamptz would be iso
string. Timestampz would be stored as timestamptz data, which is int64_t
finally.

> for more information, see https://en.wikipedia.org/wiki/ISO_8601

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-09-23 10:24:12 +08:00
Gao
539f17f1ad
enhance: tiered index updates (#44433)
issue: #42032 #44212 

- special case for warmup param and cell storage size for tiered index
- add a config to enable/disable storage usage tracking

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2025-09-22 21:34:11 +08:00
Zhen Ye
c171280f63
enhance: support replicate message in wal. (#44456)
issue: #44123

- support replicate message  in wal of milvus.
- support CDC-replicate recovery from wal.
- fix some CDC replicator bugs

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-22 17:06:11 +08:00
XuanYang-cn
200ee4cb18
enhance: Remove annoying warning log (#44499)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-09-22 15:44:18 +08:00
Gao
d3784c6515
enhance: add storage resource usage for vector search (#44308)
issue: #44212 

Implement search/query storage usage statistics in go side(result
reduce), only record storage usage in vector search C++ path. Need to be
implemented in query c++ path in next prs.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>
2025-09-19 20:20:02 +08:00
sangheee
bed94fc061
feat: support grpc tokenizer (#41994)
relate: https://github.com/milvus-io/milvus/issues/41035

This PR adds support for a gRPC-based tokenizer.

- The protobuf definition was added in
[milvus-proto#445](https://github.com/milvus-io/milvus-proto/pull/445).
- Based on this, the corresponding Rust client code was generated and
added under `tantivi-binding`.
  - The generated file is `milvus.proto.tokenizer.rs`.

I'm not very experienced with Rust, so there might be parts of the code
that could be improved.
I’d appreciate any suggestions or improvements.

---------

Signed-off-by: park.sanghee <park.sanghee@navercorp.com>
2025-09-19 17:40:01 +08:00
yihao.dai
51f69f32d0
feat: Add CDC support (#44124)
This PR implements a new CDC service for Milvus 2.6, providing log-based
cross-cluster replication.

issue: https://github.com/milvus-io/milvus/issues/44123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-09-16 16:32:01 +08:00
congqixia
98d23de36c
enhance: [StorageV2] Make load info contains child info (#44384)
Related to #44257

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-16 16:14:00 +08:00
XuanYang-cn
ad86292bed
fix: Deny RenameCollection when db enabled encryption (#44225)
Also fix the validation of enabling encyption when create db

See also: #44217, #44218

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-09-16 10:08:07 +08:00
Spade A
eb793531b9
feat: impl StructArray -- support import for CSV/JSON/PARQUET/BINLOG (#44201)
Ref https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-15 20:41:59 +08:00
sparknack
4a01c726f3
enhance: cachinglayer: some metric and params update (#44276)
issue: #41435

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-09-10 11:03:57 +08:00
Chun Han
26a024625d
feat: support search by on json field and dynamic field(#43124) (#43203)
related: #43124

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-09-09 21:51:56 +08:00
Buqian Zheng
dae0fd0e90
enhance: removed unused map_c (#44183)
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-09-09 16:46:04 +08:00
aoiasd
f98dd84dec
fix: add utf8 check before bm25 functoin run (#44220)
relate: https://github.com/milvus-io/milvus/issues/44219

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-08 14:43:56 +08:00
zhagnlu
d67f1ea0ab
enhance: add param to modify dump snapshot batch size (#44215)
issue: #44216

Signed-off-by: luzhang <luzhang@zilliz.com>
2025-09-05 14:29:54 +08:00
Xianhui Lin
4662aff36e
fix: retry old session existence in ProcessActiveStandBy (#44208)
fix: retry old session existence in ProcessActiveStandBy
issue: https://github.com/milvus-io/milvus/issues/44205

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-09-04 15:45:56 +08:00
cqy123456
d50b365375
enhance: add autoindex config for deduplication case (#44186)
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-09-03 17:19:53 +08:00
Spade A
03c46e686f
fix: ngram index for json rejects type of non-varchar field (#44157)
issue: https://github.com/milvus-io/milvus/issues/43934

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-03 16:45:54 +08:00
Spade A
1b583e4b54
fix: fixing ngram index rejecting mmap (#44175)
issue: https://github.com/milvus-io/milvus/issues/44164

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-03 14:35:53 +08:00
Bingyi Sun
0c0630cc38
feat: support dropping index without releasing collection (#42941)
issue: #42942

This pr includes the following changes:
1. Added checks for index checker in querycoord to generate drop index
tasks
2. Added drop index interface to querynode
3. To avoid search failure after dropping the index, the querynode
allows the use of lazy mode (warmup=disable) to load raw data even when
indexes contain raw data.
4. In segcore, loading the index no longer deletes raw data; instead, it
evicts it.
5. In expr, the index is pinned to prevent concurrent errors.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-02 16:17:52 +08:00
Zhen Ye
9e2d1963d4
enhance: support cchannel for streaming service (#44143)
issue: #43897

- add cchannel as a special vchannel to hold some ddl and dcl.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-02 10:05:52 +08:00
zhagnlu
fc876639cf
enhance: support json stats with shredding design (#42534)
#42533

Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-01 10:49:52 +08:00
Zhen Ye
3327df72e4
enhance: make immutable message as the param of ack operation for cdc (#43900)
issue: #43897

- The original broadcast ack operation need to recover message from
etcd, which can not support cdc.
- immutable message will set as the ack parameter to fix it.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-09-01 10:21:52 +08:00
XuanYang-cn
3160f41821
enhance: [cmek]Merge cipher.yml with hook.yml (#44118)
See also: #40321

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-08-29 18:37:51 +08:00
sparknack
70c8114e85
enhance: cachinglayer: resource management for segment loading (#43846)
issue: #41435

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-29 11:37:50 +08:00
Zhen Ye
7b04107863
fix: unrecoverable if lease expire when standby mode (#44112)
issue: #44111

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-29 10:47:51 +08:00
Chun Han
da156981c6
feat: milvus support posix-compatible mode(milvus-io#43942) (#43944)
related: #43942

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-08-27 16:29:50 +08:00
XuanYang-cn
37a447d166
feat: Add CMEK cipher plugin (#43722)
1. Enable Milvus to read cipher configs
2. Enable cipher plugin in binlog reader and writer
3. Add a testCipher for unittests
4. Support pooling for datanode
5. Add encryption in storagev2

See also: #40321 
Signed-off-by: yangxuan <xuan.yang@zilliz.com>

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-08-27 11:15:52 +08:00
aoiasd
208a345a3d
enhance: package analyzer code in Go and fix named analyzer as tokenizer (#43694)
relate: https://github.com/milvus-io/milvus/issues/43687

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-27 10:59:52 +08:00
Spade A
8456f824be
feat: impl StructArray -- miscellaneous staffs for struct array (#43960)
Ref https://github.com/milvus-io/milvus/issues/42148

1. enable storage v2
2. implement some missing staffs
3. fix some bugs and add tests

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-26 21:35:53 +08:00
Zhen Ye
5bdc593b8a
enhance: use v0.15.1 official pulsar client and add logging for pulsar client (#43913)
issue: #43785

- pulsar client will print log into milvus logger now.
- pulsar client open the metric by default.
- upgrade the pulsar client to v0.15.1, and use offical repo.
- the fixing of milvus-io/pulsar-client-go is already covered by
official v0.15.1.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-26 16:45:53 +08:00
Tianx
c0d62268ac
feat: add timesatmptz data type (#44005)
issue: https://github.com/milvus-io/milvus/issues/27467
>
https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420
> * [x]  M1 Create collection with timestamptz field
> * [x]  M2 Insert timestamptz field data
> * [x]  M3 Retrieve timestamptz field data
> * [x]  M4 Implement handoff[ ]  

The second PR of issue:
https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4
described above.

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-08-26 15:59:53 +08:00
groot
ccb0db92e7
fix: Not allow to import null element of array field from parquet (#43964)
issue: https://github.com/milvus-io/milvus/issues/43819

Before this fix: null elements are converted to zero or empty strings
After this fix: import job will return error "array element is not
allowed to be null value for field xxx"

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-08-26 14:45:51 +08:00
zhagnlu
8934c18792
enhance: support cache result cache for expr (#43923)
issue: #43878

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-26 10:55:52 +08:00
junjiejiangjjj
f1ce84996d
enhance: refactor model service configuration and environment variables (#44036)
- Add enable configuration for all model service providers
- Migrate environment variables from MILVUSAI_* to MILVUS_* prefix with
backward compatibility
- Unify model service enable/disable logic using configuration
- Add tests for environment variable parsing with fallback support

#35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-08-26 10:49:52 +08:00
cqy123456
d987dd7103
enhance: Make build ratio of interim index configurable (#43939)
issue: https://github.com/milvus-io/milvus/issues/43993

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-08-25 14:43:51 +08:00