milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
zhagnlu	d67f1ea0ab	enhance: add param to modify dump snapshot batch size (#44215 ) issue: #44216 Signed-off-by: luzhang <luzhang@zilliz.com>	2025-09-05 14:29:54 +08:00
Xianhui Lin	4662aff36e	fix: retry old session existence in ProcessActiveStandBy (#44208 ) fix: retry old session existence in ProcessActiveStandBy issue: https://github.com/milvus-io/milvus/issues/44205 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-09-04 15:45:56 +08:00
cqy123456	d50b365375	enhance: add autoindex config for deduplication case (#44186 ) Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-09-03 17:19:53 +08:00
Spade A	03c46e686f	fix: ngram index for json rejects type of non-varchar field (#44157 ) issue: https://github.com/milvus-io/milvus/issues/43934 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-03 16:45:54 +08:00
Spade A	1b583e4b54	fix: fixing ngram index rejecting mmap (#44175 ) issue: https://github.com/milvus-io/milvus/issues/44164 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-03 14:35:53 +08:00
Bingyi Sun	0c0630cc38	feat: support dropping index without releasing collection (#42941 ) issue: #42942 This pr includes the following changes: 1. Added checks for index checker in querycoord to generate drop index tasks 2. Added drop index interface to querynode 3. To avoid search failure after dropping the index, the querynode allows the use of lazy mode (warmup=disable) to load raw data even when indexes contain raw data. 4. In segcore, loading the index no longer deletes raw data; instead, it evicts it. 5. In expr, the index is pinned to prevent concurrent errors. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-02 16:17:52 +08:00
Zhen Ye	9e2d1963d4	enhance: support cchannel for streaming service (#44143 ) issue: #43897 - add cchannel as a special vchannel to hold some ddl and dcl. Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-02 10:05:52 +08:00
zhagnlu	fc876639cf	enhance: support json stats with shredding design (#42534 ) #42533 Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-01 10:49:52 +08:00
Zhen Ye	3327df72e4	enhance: make immutable message as the param of ack operation for cdc (#43900 ) issue: #43897 - The original broadcast ack operation need to recover message from etcd, which can not support cdc. - immutable message will set as the ack parameter to fix it. Signed-off-by: chyezh <chyezh@outlook.com>	2025-09-01 10:21:52 +08:00
XuanYang-cn	3160f41821	enhance: [cmek]Merge cipher.yml with hook.yml (#44118 ) See also: #40321 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-08-29 18:37:51 +08:00
sparknack	70c8114e85	enhance: cachinglayer: resource management for segment loading (#43846 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-08-29 11:37:50 +08:00
Zhen Ye	7b04107863	fix: unrecoverable if lease expire when standby mode (#44112 ) issue: #44111 Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-29 10:47:51 +08:00
Chun Han	da156981c6	feat: milvus support posix-compatible mode(milvus-io#43942) (#43944 ) related: #43942 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-08-27 16:29:50 +08:00
XuanYang-cn	37a447d166	feat: Add CMEK cipher plugin (#43722 ) 1. Enable Milvus to read cipher configs 2. Enable cipher plugin in binlog reader and writer 3. Add a testCipher for unittests 4. Support pooling for datanode 5. Add encryption in storagev2 See also: #40321 Signed-off-by: yangxuan <xuan.yang@zilliz.com> --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-08-27 11:15:52 +08:00
aoiasd	208a345a3d	enhance: package analyzer code in Go and fix named analyzer as tokenizer (#43694 ) relate: https://github.com/milvus-io/milvus/issues/43687 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-27 10:59:52 +08:00
Spade A	8456f824be	feat: impl StructArray -- miscellaneous stuff for struct array (#43960 ) Ref https://github.com/milvus-io/milvus/issues/42148 1. enable storage v2 2. implement some missing stuff 3. fix some bugs and add tests --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-08-26 21:35:53 +08:00
Zhen Ye	5bdc593b8a	enhance: use v0.15.1 official pulsar client and add logging for pulsar client (#43913 ) issue: #43785 - pulsar client will print log into milvus logger now. - pulsar client opens the metric by default. - upgrade the pulsar client to v0.15.1, and use the official repo. - the fixes in milvus-io/pulsar-client-go are already covered by official v0.15.1. Signed-off-by: chyezh <chyezh@outlook.com>	2025-08-26 16:45:53 +08:00
Tianx	c0d62268ac	feat: add timestamptz data type (#44005 ) issue: https://github.com/milvus-io/milvus/issues/27467 > https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420 > * [x] M1 Create collection with timestamptz field > * [x] M2 Insert timestamptz field data > * [x] M3 Retrieve timestamptz field data > * [x] M4 Implement handoff[ ] The second PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4 described above. --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-08-26 15:59:53 +08:00
groot	ccb0db92e7	fix: Not allow to import null element of array field from parquet (#43964 ) issue: https://github.com/milvus-io/milvus/issues/43819 Before this fix: null elements are converted to zero or empty strings After this fix: import job will return error "array element is not allowed to be null value for field xxx" Signed-off-by: yhmo <yihua.mo@zilliz.com>	2025-08-26 14:45:51 +08:00
zhagnlu	8934c18792	enhance: support cache result cache for expr (#43923 ) issue: #43878 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-26 10:55:52 +08:00
junjiejiangjjj	f1ce84996d	enhance: refactor model service configuration and environment variables (#44036 ) - Add enable configuration for all model service providers - Migrate environment variables from MILVUSAI_* to MILVUS_* prefix with backward compatibility - Unify model service enable/disable logic using configuration - Add tests for environment variable parsing with fallback support #35856 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-08-26 10:49:52 +08:00
cqy123456	d987dd7103	enhance: Make build ratio of interim index configurable (#43939 ) issue: https://github.com/milvus-io/milvus/issues/43993 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-08-25 14:43:51 +08:00
sparknack	4fae074d56	enhance: add write rate limit for disk file writer (#43912 ) issue: #43040 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-08-25 10:27:47 +08:00
junjiejiangjjj	f3d7e47227	feat: Supports more rerankers (#43270 ) https://github.com/milvus-io/milvus/issues/35856 Signed-off-by: junjiejiangjjj <junjie.jiang@zilliz.com>	2025-08-22 17:29:47 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
aoiasd	dcf04a58b8	feat: support use score function on segment search and use filter (#43868 ) relate: https://github.com/milvus-io/milvus/issues/43867 Support boost function score, multiply by the weight if match filter. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-19 16:15:45 +08:00
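The boost behavior described in the commit above (multiply the score by a weight when a hit matches the filter) can be sketched as follows. This is only an illustration under stated assumptions: `boost_scores`, the `(primary_key, score)` tuple layout, and the predicate callback are hypothetical names, not the segcore implementation.

```python
from typing import Callable, List, Tuple

Hit = Tuple[int, float]  # (primary key, raw score); hypothetical layout


def boost_scores(hits: List[Hit],
                 matches_filter: Callable[[int], bool],
                 weight: float) -> List[Hit]:
    """Multiply the score of every hit that matches the filter by `weight`.

    Hits that do not match keep their original score.
    """
    return [
        (pk, score * weight if matches_filter(pk) else score)
        for pk, score in hits
    ]


# Illustrative data: boost hits with an even primary key by 2x.
hits = [(1, 0.25), (2, 0.5)]
boosted = boost_scores(hits, lambda pk: pk % 2 == 0, weight=2.0)
```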
aoiasd	06006939f8	feat: support use cipher hook in streaming node (#40562 ) relate: https://github.com/milvus-io/milvus/issues/40321 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-19 10:41:44 +08:00
congqixia	f032044125	enhance: Refine segcore param change callback (#43838 ) Related to #43230 This PR - Move segcore setup function to `initcore` package to remove cgo dependency from pkg - Register core callback only for components depends on segcore - Rectify `UpdateLogLevel` implementation Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-13 19:31:44 +08:00
yihao.dai	ad950368fe	enhance: Fix parquet import OOM (#43756 ) Each ColumnReader consumes ReaderProperties.BufferSize memory independently. Therefore, the bufferSize should be divided by the number of columns to ensure total memory usage stays within the intended limit. issue: https://github.com/milvus-io/milvus/issues/43755 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-08-08 18:57:40 +08:00
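The sizing rule in the commit above (each column reader holds its own buffer, so the budget is divided by the column count) can be sketched as below. The actual change lives in Milvus's Go import code; the function name and the 64 MiB budget here are hypothetical and chosen only for illustration.

```python
def per_column_buffer_size(total_buffer_size: int, num_columns: int) -> int:
    """Split a total read-buffer budget evenly across column readers.

    Each column reader allocates its own buffer, so giving every reader
    the full budget would use num_columns times the intended memory.
    """
    if num_columns <= 0:
        raise ValueError("num_columns must be positive")
    # Floor division keeps the sum of all per-column buffers within the budget.
    return total_buffer_size // num_columns


total_budget = 64 * 1024 * 1024  # illustrative 64 MiB budget
num_columns = 8
per_column = per_column_buffer_size(total_budget, num_columns)
```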
wei liu	46dfe260da	enhance: Add timestamp filtering support to L0Reader (#43747 ) issue: #43745 Add timestamp filtering capability to L0Reader to match the functionality available in the regular Reader. This enhancement allows filtering delete records based on timestamp range during L0 import operations. Changes include: - Add tsStart and tsEnd fields to l0Reader struct for timestamp filtering - Modify NewL0Reader function signature to accept tsStart and tsEnd parameters - Implement timestamp filtering logic in Read method to skip records outside the specified range - Update L0ImportTask and L0PreImportTask to parse timestamp parameters from request options and pass them to NewL0Reader - Add comprehensive test case TestL0Reader_ReadWithTsFilter to verify ts filtering functionality using mockey framework Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-08-06 16:49:39 +08:00
aoiasd	4f02b06abc	enhance: support set lindera dict build dir and download url in yaml (#43541 ) relate: https://github.com/milvus-io/milvus/issues/43120 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-04 09:47:38 +08:00
XuanYang-cn	0ccb95303e	feat: [CMEK] Add utils to load plugins (#42986 ) See also: #40321 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-07-29 17:17:36 +08:00
yihao.dai	a29b3272b0	fix: Improve import memory management to prevent OOM (#43568 ) 1. Use blocking memory allocation to wait until memory becomes available 2. Perform memory allocation at the file level instead of per task 3. Limit Parquet file reader batch size to prevent excessive memory consumption 4. Limit import buffer size from 20% to 10% of total memory issue: https://github.com/milvus-io/milvus/issues/43387, https://github.com/milvus-io/milvus/issues/43131 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-28 21:25:35 +08:00
Spade A	864d1b93b1	enhance: enable stlsort with mmap support (#43359 ) issue: https://github.com/milvus-io/milvus/issues/43358 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-28 15:32:55 +08:00
Spade A	faeb7fd410	feat: impl StructArray -- create schema, insert, and retrieve data (#42855 ) Ref https://github.com/milvus-io/milvus/issues/42148 https://github.com/milvus-io/milvus/pull/42406 impls the segcore part of storage for handling with VectorArray. This PR: 1. impls the go part of storage for VectorArray 2. impls the collection creation with StructArrayField and VectorArray 3. insert and retrieve data from the collection. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>	2025-07-27 01:30:55 +08:00
Ted Xu	9041bf1b9a	fix: including shouldCopy parameter in file readers (#43578 ) This parameter determines whether the returned value should be a copy or a reference from the arrow array. The updates enhance memory management and provide more control over data handling during deserialization. See #43186 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-07-26 17:30:55 +08:00
Spade A	10fe53ff59	feat: support json for ngram (#43170 ) Ref https://github.com/milvus-io/milvus/issues/42053 This PR enable ngram to support json data type. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-25 10:28:54 +08:00
yihao.dai	9fbd41a97d	fix: Adjust binlog and parquet reader buffer size for import (#43495 ) 1. Modify the binlog reader to stop reading a fixed 4096 rows and instead use the calculated bufferSize to avoid generating small binlogs. 2. Use a fixed bufferSize (32MB) for the Parquet reader to prevent OOM. issue: https://github.com/milvus-io/milvus/issues/43387 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-23 21:28:54 +08:00
junjiejiangjjj	4db877f76c	fix: Fix weighted rerank (#43503 ) #43478 Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>	2025-07-23 14:54:53 +08:00
junjiejiangjjj	77f3a1f213	enhance: Add search post pipeline (#43065 ) https://github.com/milvus-io/milvus/issues/35856 Signed-off-by: junjiejiangjjj <junjie.jiang@zilliz.com>	2025-07-21 11:10:52 +08:00
Buqian Zheng	f7b262a702	feat: make storagev1 to support eviction (#43219 ) issue: https://github.com/milvus-io/milvus/issues/41435 turns out we have per file binlog size in golang code, by passing it into segcore we can support eviction in storage v1 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-19 02:02:52 +08:00
Zhen Ye	3aacd179f7	fix: balance channel before balance segment when upgrading (#43346 ) issue: #43117, #42966, #43373 - also fix channel balance may not work at 2.6. - fix error lost at delete path - add mvcc into s/q log - change the log level for TestCoordDownSearch Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-17 20:16:52 +08:00
yihao.dai	1984be646c	fix: Fix storagev2 binlog import (#43221 ) issue: https://github.com/milvus-io/milvus/issues/43218 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-07-13 22:52:49 +08:00
congqixia	5a9efb3f81	enhance: [StorageV2] Refine storage rw option usage & validation (#43175 ) Related to #39173 This PR: - Make all datanode task passes storage config via storage config option - Remove legacy comments, rootPath & bucketName parameters - Fix clustering compaction option behavior - Add validation logic for `rwOptions` - Use correct storageType from storageConfig - Add storage config in sync task --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-11 01:14:48 +08:00
PjJinchen	a90694165b	feat: Supports tracing services that require header-based authentication. (#43211 ) issue: https://github.com/milvus-io/milvus/issues/43082 support tracing services that require header-based authentication. for example: aliyun SLS, volcengine LogService etc... [aliyun SLS](https://help.aliyun.com/zh/sls/import-trace-data-from-golang-applications-to-log-service-by-using-opentelemetry-sdk-for-golang?spm=a2c4g.11186623.help-menu-search-28958.d_1#section-ktk-xxz-8om) Add a headers config in trace config ``` trace: exporter: otlp sampleFraction: 1 otlp: endpoint: milvus-cn-beijing-pre.cn-beijing.log.aliyuncs.com:10010 method: # otlp export method, acceptable values: ["grpc", "http"], using "grpc" by default secure: true headers: # base64 initTimeoutSeconds: 10 ``` it is encoded as base64, raw data is json ``` { "x-sls-otel-project": "milvus-cn-beijing-pre", "x-sls-otel-instance-id": "milvus-cn-beijing-pre", "x-sls-otel-ak-id": "xxx", "x-sls-otel-ak-secret": "xxx" } ``` [volcengine tls](https://www.volcengine.com/docs/6470/812322#grpc-%E5%8D%8F%E8%AE%AE%E5%88%9D%E5%A7%8B%E5%8C%96%E7%A4%BA%E4%BE%8B) Add a headers config in trace config ``` trace: exporter: otlp sampleFraction: 1 otlp: endpoint: xxx method: # otlp export method, acceptable values: ["grpc", "http"], using "grpc" by default secure: true headers: # base64 initTimeoutSeconds: 10 ``` it is encoded as base64, raw data is json ``` { "x-tls-otel-region": "cn-beijing", "x-tls-otel-tracetopic": "milvus-cn-beijing-pre", "x-tls-otel-ak": "xxx", "x-tls-otel-sk": "xxx" } ``` Signed-off-by: PjJinchen <6268414+pj1987111@users.noreply.github.com>	2025-07-10 17:32:48 +08:00
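The commit above states that the `headers` value is base64-encoded JSON. A minimal sketch of producing and checking such a value, assuming standard (not URL-safe) base64 over UTF-8 JSON; the helper names are hypothetical and the `"xxx"` values are the placeholders from the commit message:

```python
import base64
import json


def encode_trace_headers(headers: dict) -> str:
    """Serialize headers to JSON and base64-encode the result."""
    raw = json.dumps(headers).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_trace_headers(encoded: str) -> dict:
    """Inverse of encode_trace_headers, useful for verifying a config value."""
    return json.loads(base64.b64decode(encoded))


headers = {
    "x-sls-otel-project": "milvus-cn-beijing-pre",
    "x-sls-otel-instance-id": "milvus-cn-beijing-pre",
    "x-sls-otel-ak-id": "xxx",
    "x-sls-otel-ak-secret": "xxx",
}
encoded = encode_trace_headers(headers)  # value to place under trace.otlp.headers
```

Decoding the string again with `decode_trace_headers` is a quick way to confirm the value round-trips before putting it in the config.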
groot	1ee8cea35b	enhance: bulkinsert handle nullable/defaultValue/functionOutput fields (#42956 ) issue: https://github.com/milvus-io/milvus/issues/42173 Signed-off-by: yhmo <yihua.mo@zilliz.com>	2025-07-04 14:20:44 +08:00
Bingyi Sun	6e38e9d18f	fix: Add json cast type for flat index (#42970 ) issue: #42916 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-03 14:14:44 +08:00
sparknack	7e855f1046	enhance: add disk file writer with Direct IO support (#42665 ) issue: #43040 This patch introduces a disk file writer that supports Direct IO. Currently, it is exclusively utilized during the QueryNode load process. Below is its parameters: 1. `common.diskWriteMode` This parameter controls the write mode of the local disk, which is used to write temporary data downloaded from remote storage. Currently, only QueryNode uses 'common.diskWrite*' parameters. Support for other components will be added in the future. The options include 'direct' and 'buffered'. The default value is 'buffered'. 2. `common.diskWriteBufferSizeKb` Disk write buffer size in KB, only used when disk write mode is 'direct', default is 64KB. Current valid range is [4, 65536]. If the value is not aligned to 4KB, it will be rounded up to the nearest multiple of 4KB. 3. `common.diskWriteNumThreads` This parameter controls the number of writer threads used for disk write operations. The valid range is [0, hardware_concurrency]. It is designed to limit the maximum concurrency of disk write operations to reduce the impact on disk read performance. For example, if you want to limit the maximum concurrency of disk write operations to 1, you can set this parameter to 1. The default value is 0, which means the caller will perform write operations directly without using an additional writer thread pool. In this case, the maximum concurrency of disk write operations is determined by the caller's thread pool size. Both parameters can be updated during runtime. --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-07-02 22:18:44 +08:00
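The rounding rule for `common.diskWriteBufferSizeKb` in the commit above (valid range [4, 65536], rounded up to the nearest multiple of 4KB) can be sketched as below. The function name is hypothetical, and raising an error for out-of-range values is an assumption; the commit message does not say how Milvus handles them.

```python
MIN_BUFFER_KB = 4
MAX_BUFFER_KB = 65536
ALIGNMENT_KB = 4


def align_write_buffer_kb(size_kb: int) -> int:
    """Round a buffer size in KB up to the nearest multiple of 4KB."""
    if not MIN_BUFFER_KB <= size_kb <= MAX_BUFFER_KB:
        # Assumption: reject out-of-range values instead of clamping.
        raise ValueError(f"size_kb must be in [{MIN_BUFFER_KB}, {MAX_BUFFER_KB}]")
    # Add (alignment - 1) before floor division to round up.
    return (size_kb + ALIGNMENT_KB - 1) // ALIGNMENT_KB * ALIGNMENT_KB


default_size = align_write_buffer_kb(64)  # already aligned, unchanged
rounded_size = align_write_buffer_kb(5)   # not aligned, rounded up
```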
Spade A	26ec841feb	feat: optimize `Like` query with n-gram (#41803 ) Ref #42053 This is the first PR for optimizing `LIKE` with ngram inverted index. Now, only VARCHAR data type is supported and only InnerMatch LIKE (%xxx%) query is supported. How to use it: ``` milvus_client = MilvusClient("http://localhost:19530") schema = milvus_client.create_schema() ... schema.add_field("content_ngram", DataType.VARCHAR, max_length=10000) ... index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="content_ngram", index_type="NGRAM", index_name="ngram_index", min_gram=2, max_gram=3) milvus_client.create_collection(COLLECTION_NAME, ...) ``` min_gram and max_gram controls how we tokenize the documents. For example, for min_gram=2 and max_gram=4, we will tokenize each document with 2-gram, 3-gram and 4-gram. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-07-01 10:08:44 +08:00
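The role of `min_gram` and `max_gram` described in the commit above can be illustrated with a plain character n-gram generator. This is a sketch assuming character-level grams over a Python string; `char_ngrams` is a hypothetical helper, and the tokenizer inside Milvus may treat Unicode and edge cases differently.

```python
from typing import List


def char_ngrams(text: str, min_gram: int, max_gram: int) -> List[str]:
    """Return every character n-gram of text for n in [min_gram, max_gram]."""
    if min_gram < 1 or max_gram < min_gram:
        raise ValueError("require 1 <= min_gram <= max_gram")
    grams = []
    for n in range(min_gram, max_gram + 1):
        # A string of length L has L - n + 1 grams of size n (none if n > L).
        for i in range(len(text) - n + 1):
            grams.append(text[i:i + n])
    return grams


grams = char_ngrams("milvus", min_gram=2, max_gram=4)
```

With `min_gram=2` and `max_gram=4`, the word `milvus` yields its five 2-grams, four 3-grams, and three 4-grams.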
Zhen Ye	ecb24e7232	enhance: use multi-process framework in integration test (#42976 ) issue: #41609 - add env `MILVUS_NODE_ID_FOR_TESTING` to set up a node id for milvus process. - add env `MILVUS_CONFIG_REFRESH_INTERVAL` to set up the refresh interval of paramtable. - Init paramtable when calling `paramtable.Get()`. - add new multi process framework for integration test. - change all integration test into multi process. - merge some test case into one suite to speed up it. - modify some test, which need to wait for issue #42966, #42685. - remove the waittssync for delete collection to fix issue: #42989 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-06-30 14:22:43 +08:00


1904 Commits