milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2025-12-28 22:45:26 +08:00

Author	SHA1	Message	Date
Tianx	2c0c5ef41e	feat: timestamptz expression & index & timezone (#44080 ) issue: https://github.com/milvus-io/milvus/issues/27467 >My plan is as follows. >- [x] M1 Create collection with timestamptz field >- [x] M2 Insert timestamptz field data >- [x] M3 Retrieve timestamptz field data >- [x] M4 Implement handoff >- [x] M5 Implement compare operator >- [x] M6 Implement extract operator >- [x] M8 Support database/collection level default timezone >- [x] M7 Support STL-SORT index for datatype timestamptz --- The third PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M5, M6, M7, M8 described above. ## M8 Default Timezone We will be able to use alter_collection() and alter_database() in a future Python SDK release to modify the default timezone at the collection or database level. For insert requests, the timezone will be resolved using the following order of precedence: String Literal-> Collection Default -> Database Default. For retrieval requests, the timezone will be resolved in this order: Query Parameters -> Collection Default -> Database Default. In both cases, the final fallback timezone is UTC. ## M5: Comparison Operators We can now use the following expression format to filter on the timestamptz field: - `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op} ISO 'iso_string' ` - The interval_string follows the ISO 8601 duration format, for example: P1Y2M3DT1H2M3S. - The iso_string follows the ISO 8601 timestamp format, for example: 2025-01-03T00:00:00+08:00. - Example expressions: "tsz + INTERVAL 'P0D' != ISO '2025-01-03T00:00:00+08:00'" or "tsz != ISO '2025-01-03T00:00:00+08:00'". ## M6: Extract We will be able to extract sepecific time filed by kwargs in a future Python SDK release. The key is `time_fields`, and value should be one or more of "year, month, day, hour, minute, second, microsecond", seperated by comma or space. Then the result of each record would be an array of int64. ## M7: Indexing Support Expressions without interval arithmetic can be accelerated using an STL-SORT index. However, expressions that include interval arithmetic cannot be indexed. This is because the result of an interval calculation depends on the specific timestamp value. For example, adding one month to a date in February results in a different number of added days than adding one month to a date in March. --- After this PR, the input / output type of timestamptz would be iso string. Timestampz would be stored as timestamptz data, which is int64_t finally. > for more information, see https://en.wikipedia.org/wiki/ISO_8601 --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-23 10:24:12 +08:00
sparknack	ab64afba2f	enhance: add storage resource usage for scalar search (#44414 ) issue: #44212 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-09-22 14:28:06 +08:00
Gao	d3784c6515	enhance: add storage resource usage for vector search (#44308 ) issue: #44212 Implement search/query storage usage statistics in go side(result reduce), only record storage usage in vector search C++ path. Need to be implemented in query c++ path in next prs. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>	2025-09-19 20:20:02 +08:00
congqixia	98d23de36c	enhance: [StorageV2] Make load info contains child info (#44384 ) Related to #44257 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-16 16:14:00 +08:00
zhagnlu	77f7d19400	fix:avoid mmap rewrite by multi json fields (#44299 ) issue: #44127 Signed-off-by: zhagnlu <lu.zhang@zilliz.com>	2025-09-11 10:13:57 +08:00
zhagnlu	2f8620fa79	fix: fix like failed and add max columns limit (#44233 ) #44137 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-10 10:33:57 +08:00
Chun Han	26a024625d	feat: support search by on json field and dynamic field(#43124 ) (#43203 ) related: #43124 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-09-09 21:51:56 +08:00
Buqian Zheng	9bf2b5c10c	enhance: moved more segcore unit test files (#44210 ) issue: https://github.com/milvus-io/milvus/issues/43931 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-09-08 10:21:55 +08:00
zhagnlu	d67f1ea0ab	enhance: add param to modify dump snapshot batch size (#44215 ) issue: #44216 Signed-off-by: luzhang <luzhang@zilliz.com>	2025-09-05 14:29:54 +08:00
Gao	2e98cb0103	enhance: load resource estimation for tiered index (#44171 ) issue: https://github.com/milvus-io/milvus/issues/42032 - Use bytes to estimate load resource in the whole estimation procedure - Add num_rows and dim info for vector index to better estimate - Disable eviction for tiered index's meta --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-09-04 19:41:53 +08:00
Buqian Zheng	b76bf13fc3	enhance: move c++ unit test file to aside of the production code (#43932 ) issue: https://github.com/milvus-io/milvus/issues/43931 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-09-03 23:45:53 +08:00
Spade A	7cb15ef141	feat: impl StructArray -- optimize vector array serialization (#44035 ) issue: https://github.com/milvus-io/milvus/issues/42148 Optimized from Go VectorArray → VectorArray Proto → Binary → C++ VectorArray Proto → C++ VectorArray local impl → Memory to Go VectorArray → Arrow ListArray → Memory --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-03 16:39:53 +08:00
Bingyi Sun	0c0630cc38	feat: support dropping index without releasing collection (#42941 ) issue: #42942 This pr includes the following changes: 1. Added checks for index checker in querycoord to generate drop index tasks 2. Added drop index interface to querynode 3. To avoid search failure after dropping the index, the querynode allows the use of lazy mode (warmup=disable) to load raw data even when indexes contain raw data. 4. In segcore, loading the index no longer deletes raw data; instead, it evicts it. 5. In expr, the index is pinned to prevent concurrent errors. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-02 16:17:52 +08:00
Bingyi Sun	c420e7bd27	enhance: align the behavior of exist expr between brute force and index (#44030 ) https://github.com/milvus-io/milvus/issues/44031 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-01 15:03:52 +08:00
zhagnlu	fc876639cf	enhance: support json stats with shredding design (#42534 ) #42533 Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-01 10:49:52 +08:00
sparknack	70c8114e85	enhance: cachinglayer: resource management for segment loading (#43846 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-08-29 11:37:50 +08:00
congqixia	e3b3502287	fix: Use correct regex for cppcheck (#44077 ) Related to #44076 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-27 20:57:50 +08:00
marcelo-cjl	e13e19cd2c	enhance: add sparse_u32_f32 data type for sparse vertor (#43974 ) issue: #43973 Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>	2025-08-27 16:47:50 +08:00
Chun Han	da156981c6	feat: milvus support posix-compatible mode(milvus-io#43942) (#43944 ) related: #43942 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-08-27 16:29:50 +08:00
XuanYang-cn	37a447d166	feat: Add CMEK cipher plugin (#43722 ) 1. Enable Milvus to read cipher configs 2. Enable cipher plugin in binlog reader and writer 3. Add a testCipher for unittests 4. Support pooling for datanode 5. Add encryption in storagev2 See also: #40321 Signed-off-by: yangxuan <xuan.yang@zilliz.com> --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-08-27 11:15:52 +08:00
Spade A	8456f824be	feat: impl StructArray -- miscellaneous staffs for struct array (#43960 ) Ref https://github.com/milvus-io/milvus/issues/42148 1. enable storage v2 2. implement some missing staffs 3. fix some bugs and add tests --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-08-26 21:35:53 +08:00
Tianx	c0d62268ac	feat: add timesatmptz data type (#44005 ) issue: https://github.com/milvus-io/milvus/issues/27467 > https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420 > * [x] M1 Create collection with timestamptz field > * [x] M2 Insert timestamptz field data > * [x] M3 Retrieve timestamptz field data > * [x] M4 Implement handoff[ ] The second PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4 described above. --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-08-26 15:59:53 +08:00
Gao	e97a618630	enhance: support readAt interface for remote input stream (#43997 ) #42032 Also, fix the cacheoptfield method to work in storagev2. Also, change the sparse related interface for knowhere version bump #43974 . Also, includes https://github.com/milvus-io/milvus/pull/44046 for metric lost. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-26 11:19:58 +08:00
zhagnlu	8934c18792	enhance: support cache result cache for expr (#43923 ) issue: #43878 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-26 10:55:52 +08:00
sparknack	4fae074d56	enhance: add write rate limit for disk file writer (#43912 ) issue: #43040 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-08-25 10:27:47 +08:00
zhagnlu	d904c4e677	enhance: optimize compare expr performance for pk field (#43154 ) #43153 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-21 10:59:46 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
congqixia	f032044125	enhance: Refine segcore param change callback (#43838 ) Related to #43230 This PR - Move segcore setup function to `initcore` package to remove cgo dependency from pkg - Register core callback only for components depends on segcore - Rectify `UpdateLogLevel` implementation Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-13 19:31:44 +08:00
Gao	81a0915c29	enhance: add milvus-common module to decouple knwhere & segcore (#43624 ) issue: https://github.com/milvus-io/milvus/issues/42032 https://github.com/milvus-io/milvus/issues/41435 based on pr: https://github.com/milvus-io/milvus/pull/42124 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Co-authored-by: xianliang.li <xianliang.li@zilliz.com>	2025-08-11 14:09:42 +08:00
zhagnlu	c04d678ad4	enhance: make segcore params effective without restarting milvus (#43231 ) #43230 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-08 10:33:48 +08:00
Chun Han	d826d6ac91	fix: try to get span raw data for variable length data type(#43544 ) (#43705 ) related: #43544 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-08-04 11:15:38 +08:00
congqixia	089f02bcca	fix: [StorageV2] Align null bitmap offset for fixed-length datatype (#43654 ) Related to #43626 Similar to previous pr #43321, null bitmap could be dislocated if the bitset ptr does not count the offset of array Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-31 09:55:36 +08:00
Bingyi Sun	a765cd1eaa	enhance: unlink mmap file when chunk and index are destructed (#43524 ) issue: https://github.com/milvus-io/milvus/issues/41636 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-29 16:05:36 +08:00
Spade A	864d1b93b1	enhance: enable stlsort with mmap support (#43359 ) issue: https://github.com/milvus-io/milvus/issues/43358 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-28 15:32:55 +08:00
sthuang	5cebc9f7f6	fix: [StorageV2] handle correct cid with multiple files and add storage v2 prefix logs (#43539 ) related: #43372 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-25 11:22:54 +08:00
sthuang	59bbdd93f5	fix: [StorageV2] fill the correct group chunk into cell (#43486 ) The root cause of the issue lies in the fact that when a sealed segment contains multiple row groups, the get_cells function may receive unordered cids. This can result in row groups being written into incorrect cells during data retrieval. Previously, this issue was hard to reproduce because the old Storage V2 writer had a bug that caused it to write row groups larger than 1MB. These large row groups could lead to uncontrolled memory usage and eventually an OOM (Out of Memory) error. Additionally, compaction typically produced a single large row group, which avoided the incorrect cell-filling issue during query execution. related: https://github.com/milvus-io/milvus/issues/43388, https://github.com/milvus-io/milvus/issues/43372, https://github.com/milvus-io/milvus/issues/43464, #43446, #43453 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-22 22:22:53 +08:00
Buqian Zheng	0599113a4b	enhance: add timeout to resource reservation (#43441 ) issue: https://github.com/milvus-io/milvus/issues/41435 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-22 15:24:53 +08:00
Chun Han	5a1092304c	fix: refine judgement for batch views(#38736 ) (#43481 ) related: #38736 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-07-22 14:20:53 +08:00
Buqian Zheng	389104d200	enhance: rename PanicInfo to ThrowInfo (#43384 ) issue: #41435 this is to prevent AI from thinking of our exception throwing as a dangerous PANIC operation that terminates the program. Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-19 20:22:52 +08:00
Buqian Zheng	f7b262a702	feat: make storagev1 to support eviction (#43219 ) issue: https://github.com/milvus-io/milvus/issues/41435 turns out we have per file binlog size in golang code, by passing it into segcore we can support eviction in storage v1 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-19 02:02:52 +08:00
Buqian Zheng	d793def47c	feat: impose a physical memory limit when loading cells (#43222 ) issue: #41435 issue: https://github.com/milvus-io/milvus/issues/43038 This PR also: 1. removed ERROR state from ListNode 2. CacheSlot will do reserveMemory once for all requested cells after updating the state to LOADING, so now we transit a cell to LOADING before its resource reservation 3. reject resource reservation directly if size >= max_size --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-18 11:18:52 +08:00
sparknack	9b4081e110	enhance: cachinglayer: some performance optimization (#42858 ) issue: #41435 We compared the performance using the modified test_sealed.cpp, which randomly accesses all rows in all chunks and counts the number of runs within 3s. ## performance data comparison (ops/second) chunk config: 1x1000 \| Field Type \| w/o cachinglayer (commit 640f526301) \| w/ cachinglayer \| w/ cachinglayer + opt \| \|---\|---\|---\|---\| \| Bool field \| 82428 \| -63.6% (29983) \| +2.7% (84675) \| \| Int8 field \| 82228 \| -63.3% (30166) \| +2.4% (84163) \| \| Int16 field \| 82572 \| -63.8% (29867) \| +1.8% (84036) \| \| Int32 field \| 82797 \| -63.7% (30031) \| +1.5% (84043) \| \| Int64 field \| 81077 \| -62.9% (30107) \| +0.6% (81604) \| \| Float field \| 82678 \| -63.4% (30266) \| +1.8% (84146) \| \| Double field \| 81925 \| -63.4% (29974) \| +0.2% (82097) \| \| Varchar field \| 19933 \| -19.6% (16027) \| +18.9% (23690) \| \| JSON field \| 16519 \| -96.8% (533) \| +2.5% (16927) \| \| Int array field \| 7325 \| -13.7% (6321) \| -1.4% (7220) \| \| Long array field \| 6347 \| -8.9% (5781) \| -0.1% (6344) \| \| Bool array field \| 8275 \| -14.0% (7116) \| +0.4% (8311) \| \| String array field \| 2281 \| -5.0% (2168) \| +0.2% (2287) \| \| Double array field \| 6427 \| -13.3% (5574) \| -2.0% (6301) \| \| Float array field \| 7291 \| -13.0% (6346) \| -1.5% (7183) \| \| Vector field \| 27487 \| -40.4% (16371) \| -4.7% (26192) \| \| Float16 vector field \| 49773 \| -54.6% (22601) \| -5.9% (46834) \| \| BFloat16 vector field \| 49783 \| -53.1% (23350) \| -5.7% (46934) \| \| Int8 vector field \| 63871 \| -59.0% (26179) \| -6.2% (59926) \| --- chunk config: 10x1000 \| Field Type \| w/o cachinglayer (commit 640f526301) \| w/ cachinglayer \| w/ cachinglayer + opt \| \|---\|---\|---\|---\| \| Bool field \| 3659 \| -48.6% (1879) \| +110.1% (7686) \| \| Int8 field \| 3410 \| -45.3% (1864) \| +123.9% (7636) \| \| Int16 field \| 3647 \| -48.6% (1874) \| +110.1% (7661) \| \| Int32 field \| 3647 \| -48.8% (1866) \| +109.6% (7645) \| \| Int64 field \| 3645 \| -48.9% (1863) \| +107.8% (7573) \| \| Float field \| 3647 \| -49.0% (1861) \| +109.5% (7639) \| \| Double field \| 3640 \| -45.1% (1998) \| +108.4% (7586) \| \| Varchar field \| 1594 \| -23.9% (1213) \| +20.6% (1922) \| \| JSON field \| 1202 \| -26.5% (884) \| +16.1% (1396) \| \| Int array field \| 602 \| -12.3% (528) \| +12.7% (678) \| \| Long array field \| 529 \| -12.2% (465) \| +7.5% (569) \| \| Double array field \| 537 \| -13.0% (467) \| +6.4% (571) \| \| Vector field \| 1520 \| -37.9% (943) \| -5.5% (1437) \| \| Float16 vector field \| 2607 \| -47.0% (1382) \| +6.4% (2774) \| \| BFloat16 vector field \| 2586 \| -46.5% (1383) \| +8.8% (2813) \| \| Int8 vector field \| 3101 \| -47.3% (1633) \| +41.9% (4400) \| --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-07-17 11:20:51 +08:00
congqixia	fe8de016d5	fix: [StorageV2] Align null bitmap offset when loading multi-chunk (#43321 ) Related to #43262 This patch fixes following logic bug: - When multiple chunks are loaded and size cannot be divided by 8, just appending uint8_t as bitmap will cause null bitmap dislocation - `null_bitmap_data()` points to start of whole row group, which may not stand for current `arrow::Array` The current solutions is: - Reorganize the null_bitmap with currect size & offset - Pass `array->offset()` in tuple to info the current offset Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-15 19:22:50 +08:00
PjJinchen	a90694165b	feat: Supports tracing services that require header-based authentication. (#43211 ) issue: https://github.com/milvus-io/milvus/issues/43082 support tracing services that require header-based authentication. for example: aliyun SLS, volcengine LogService etc... [aliyun SLS](https://help.aliyun.com/zh/sls/import-trace-data-from-golang-applications-to-log-service-by-using-opentelemetry-sdk-for-golang?spm=a2c4g.11186623.help-menu-search-28958.d_1#section-ktk-xxz-8om) Add a headers config in trace config ``` trace: exporter: otlp sampleFraction: 1 otlp: endpoint: milvus-cn-beijing-pre.cn-beijing.log.aliyuncs.com:10010 method: # otlp export method, acceptable values: ["grpc", "http"], using "grpc" by default secure: true headers: # base64 initTimeoutSeconds: 10 ``` it is encoded as base64, raw data is json ``` { "x-sls-otel-project": "milvus-cn-beijing-pre", "x-sls-otel-instance-id": "milvus-cn-beijing-pre", "x-sls-otel-ak-id": "xxx", "x-sls-otel-ak-secret": "xxx" } ``` [volcengine tls](https://www.volcengine.com/docs/6470/812322#grpc-%E5%8D%8F%E8%AE%AE%E5%88%9D%E5%A7%8B%E5%8C%96%E7%A4%BA%E4%BE%8B) Add a headers config in trace config ``` trace: exporter: otlp sampleFraction: 1 otlp: endpoint: xxx method: # otlp export method, acceptable values: ["grpc", "http"], using "grpc" by default secure: true headers: # base64 initTimeoutSeconds: 10 ``` it is encoded as base64, raw data is json ``` { "x-tls-otel-region": "cn-beijing", "x-tls-otel-tracetopic": "milvus-cn-beijing-pre", "x-tls-otel-ak": "xxx", "x-tls-otel-sk": "xxx" } ``` Signed-off-by: PjJinchen <6268414+pj1987111@users.noreply.github.com>	2025-07-10 17:32:48 +08:00
congqixia	f027eea545	enhance: [AddField] Add log for segcore segment schema change (#43215 ) Related to #39178 This PR add logs for segment schema change operations. Also fixes the nit comments from PR #42490 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-10 10:22:47 +08:00
Spade A	d41eec6f10	fix: void copy when getting json chunk (#43183 ) fix: https://github.com/milvus-io/milvus/issues/43182 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-08 15:28:46 +08:00
Zhen Ye	bbbc7d4517	enhance: collect all cgo calling into metric and log slow cgo call (#43035 ) issue: #42833 - also fix the error metric for async cgo. - also make sure the roles can be seen when node startup, #43041. Signed-off-by: chyezh <chyezh@outlook.com>	2025-07-03 15:00:44 +08:00
sparknack	7e855f1046	enhance: add disk file writer with Direct IO support (#42665 ) issue: #43040 This patch introduces a disk file writer that supports Direct IO. Currently, it is exclusively utilized during the QueryNode load process. Below is its parameters: 1. `common.diskWriteMode` This parameter controls the write mode of the local disk, which is used to write temporary data downloaded from remote storage. Currently, only QueryNode uses 'common.diskWrite*' parameters. Support for other components will be added in the future. The options include 'direct' and 'buffered'. The default value is 'buffered'. 2. `common.diskWriteBufferSizeKb` Disk write buffer size in KB, only used when disk write mode is 'direct', default is 64KB. Current valid range is [4, 65536]. If the value is not aligned to 4KB, it will be rounded up to the nearest multiple of 4KB. 3. `common.diskWriteNumThreads` This parameter controls the number of writer threads used for disk write operations. The valid range is [0, hardware_concurrency]. It is designed to limit the maximum concurrency of disk write operations to reduce the impact on disk read performance. For example, if you want to limit the maximum concurrency of disk write operations to 1, you can set this parameter to 1. The default value is 0, which means the caller will perform write operations directly without using an additional writer thread pool. In this case, the maximum concurrency of disk write operations is determined by the caller's thread pool size. Both parameters can be updated during runtime. --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-07-02 22:18:44 +08:00
Spade A	26ec841feb	feat: optimize `Like` query with n-gram (#41803 ) Ref #42053 This is the first PR for optimizing `LIKE` with ngram inverted index. Now, only VARCHAR data type is supported and only InnerMatch LIKE (%xxx%) query is supported. How to use it: ``` milvus_client = MilvusClient("http://localhost:19530") schema = milvus_client.create_schema() ... schema.add_field("content_ngram", DataType.VARCHAR, max_length=10000) ... index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="content_ngram", index_type="NGRAM", index_name="ngram_index", min_gram=2, max_gram=3) milvus_client.create_collection(COLLECTION_NAME, ...) ``` min_gram and max_gram controls how we tokenize the documents. For example, for min_gram=2 and max_gram=4, we will tokenize each document with 2-gram, 3-gram and 4-gram. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-07-01 10:08:44 +08:00
Bingyi Sun	669ea51ce5	enhance: Make json index compatible with caching layer (#42484 ) issue: https://github.com/milvus-io/milvus/issues/42483 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-06-24 15:16:41 +08:00

1 2 3 4 5 ...

458 Commits