milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
Bingyi Sun	f0446fd9a0	enhance: optimize the performance of binary_search_string (#44469 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-23 10:52:13 +08:00
Tianx	2c0c5ef41e	feat: timestamptz expression & index & timezone (#44080 ) issue: https://github.com/milvus-io/milvus/issues/27467 >My plan is as follows. >- [x] M1 Create collection with timestamptz field >- [x] M2 Insert timestamptz field data >- [x] M3 Retrieve timestamptz field data >- [x] M4 Implement handoff >- [x] M5 Implement compare operator >- [x] M6 Implement extract operator >- [x] M8 Support database/collection level default timezone >- [x] M7 Support STL-SORT index for datatype timestamptz --- The third PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M5, M6, M7, M8 described above. ## M8 Default Timezone We will be able to use alter_collection() and alter_database() in a future Python SDK release to modify the default timezone at the collection or database level. For insert requests, the timezone will be resolved using the following order of precedence: String Literal -> Collection Default -> Database Default. For retrieval requests, the timezone will be resolved in this order: Query Parameters -> Collection Default -> Database Default. In both cases, the final fallback timezone is UTC. ## M5: Comparison Operators We can now use the following expression format to filter on the timestamptz field: - `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op} ISO 'iso_string' ` - The interval_string follows the ISO 8601 duration format, for example: P1Y2M3DT1H2M3S. - The iso_string follows the ISO 8601 timestamp format, for example: 2025-01-03T00:00:00+08:00. - Example expressions: "tsz + INTERVAL 'P0D' != ISO '2025-01-03T00:00:00+08:00'" or "tsz != ISO '2025-01-03T00:00:00+08:00'". ## M6: Extract We will be able to extract specific time fields via kwargs in a future Python SDK release. The key is `time_fields`, and the value should be one or more of "year, month, day, hour, minute, second, microsecond", separated by commas or spaces. The result for each record is then an array of int64. 
## M7: Indexing Support Expressions without interval arithmetic can be accelerated using an STL-SORT index. However, expressions that include interval arithmetic cannot be indexed. This is because the result of an interval calculation depends on the specific timestamp value. For example, adding one month to a date in February results in a different number of added days than adding one month to a date in March. --- After this PR, the input/output type of timestamptz is an ISO string. Timestamptz is stored as timestamptz data, which is ultimately an int64_t. > for more information, see https://en.wikipedia.org/wiki/ISO_8601 --------- Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-23 10:24:12 +08:00
Gao	539f17f1ad	enhance: tiered index updates (#44433 ) issue: #42032 #44212 - special case for warmup param and cell storage size for tiered index - add a config to enable/disable storage usage tracking --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-09-22 21:34:11 +08:00
Buqian Zheng	75557f3eb8	enhance: Use std::shared_lock and std::unique_lock for mutexes (#44459 ) issue: https://github.com/milvus-io/milvus/issues/44452 Signed-off-by: zhengbuqian <zhengbuqian@gmail.com> Co-authored-by: buqian.zheng <buqian.zheng@zilliz.com>	2025-09-22 18:02:09 +08:00
Buqian Zheng	846cf52a95	enhance: Remove unused vector plan node subclasses (#44453 ) Remove redundant `VectorPlanNode` subclasses and simplify the visitor pattern by consolidating to a single `VectorPlanNode`. The previous design used distinct `VectorPlanNode` subclasses and a templated `VectorVisitorImpl` for type-directed dispatch. However, the template parameter was not functionally used to implement different logic for each vector type, making the subclasses redundant for their intended purpose. This PR is created by Cursor Agent and manually moved from https://github.com/zhengbuqian/milvus/pull/14. Signed-off-by: zhengbuqian <zhengbuqian@gmail.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: buqian.zheng <buqian.zheng@zilliz.com>	2025-09-22 18:00:27 +08:00
sthuang	edd250ffef	fix: [StorageV2] force virtual host for oss and cos (#44484 ) related: #44481 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-09-22 16:58:11 +08:00
sparknack	ab64afba2f	enhance: add storage resource usage for scalar search (#44414 ) issue: #44212 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-09-22 14:28:06 +08:00
Gao	d3784c6515	enhance: add storage resource usage for vector search (#44308 ) issue: #44212 Implement search/query storage usage statistics on the Go side (result reduce); storage usage is recorded only in the vector search C++ path. The query C++ path still needs to be implemented in follow-up PRs. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>	2025-09-19 20:20:02 +08:00
sangheee	bed94fc061	feat: support grpc tokenizer (#41994 ) relate: https://github.com/milvus-io/milvus/issues/41035 This PR adds support for a gRPC-based tokenizer. - The protobuf definition was added in [milvus-proto#445](https://github.com/milvus-io/milvus-proto/pull/445). - Based on this, the corresponding Rust client code was generated and added under `tantivi-binding`. - The generated file is `milvus.proto.tokenizer.rs`. I'm not very experienced with Rust, so there might be parts of the code that could be improved. I’d appreciate any suggestions or improvements. --------- Signed-off-by: park.sanghee <park.sanghee@navercorp.com>	2025-09-19 17:40:01 +08:00
congqixia	b532a3e026	enhance: Move c API unittest aside to src files (#44458 ) Related to #43931 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-19 10:30:01 +08:00
congqixia	7b83314bf3	enhance: [StorageV2] Make datanode use non-singleton fs (#44418 ) Related to #39173 According to the current design, the datanode shall create the fs from the storage config in the request instead of using the singleton fs. This PR upgrades milvus-storage and makes the packed reader/writer compose a new fs from the storage config. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-18 20:06:00 +08:00
zhagnlu	9b6703626d	fix:fix unescaped bug for json stats (#44421 ) #42533 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-17 20:54:01 +08:00
sthuang	2f70a73258	fix: turn on azure by default (#44377 ) related: #44354, #44138, #43869 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-09-17 10:12:01 +08:00
congqixia	6f7318a731	enhance: [StorageV2] Use compressed size as log file size (#44402 ) Related to #39173, a backlog issue in which memory size and log size shared the same value. This patch adds a `GetFileSize` API to get the remote compressed binlog size and use it as the meta log file size, so usage is calculated more accurately. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-16 21:20:02 +08:00
congqixia	98d23de36c	enhance: [StorageV2] Make load info contains child info (#44384 ) Related to #44257 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-16 16:14:00 +08:00
congqixia	aa861f55e6	enhance: [StorageV2] Reverts #44232 bucket name change (#44390 ) Related to #39173 - Put bucket name concatenation logic back for azure support This reverts commit 8f97eb355fde6b86cf37f166d2191750b4210ba3. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-16 10:10:00 +08:00
zhagnlu	baa84e0b2b	fix: avoid mvcc when doing pk compare expr (#44353 ) #44352 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-15 10:17:59 +08:00
zhagnlu	e9bbb6aa9b	fix: fix json_contains bug for stats (#44325 ) #42533 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-15 10:16:07 +08:00
sthuang	b38013352d	enhance: [StorageV2] enable build with azure (#44177 ) related: #43869 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-09-14 08:05:58 +08:00
Bingyi Sun	1931dcd9b5	fix: Fix initialize timestamp index concurrently (#44317 ) #issue: https://github.com/milvus-io/milvus/issues/44341 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-12 14:25:57 +08:00
sparknack	060fc61e80	fix: milvus-common commits update (#44339 ) issue: #41435 related: #44268 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-09-12 12:43:57 +08:00
aoiasd	fb58701cbb	enhance: update rust version (#44322 ) relate: https://github.com/milvus-io/milvus/issues/44321 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-09-12 10:53:57 +08:00
zhagnlu	16e6b6aa8a	fix:fix build json stats bug for nested object (#44303 ) issue: #44132 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-11 14:13:56 +08:00
cqy123456	f5c6138793	enhance: update knowhere version (#44294 ) issue: https://github.com/milvus-io/milvus/issues/42937 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-09-11 11:21:56 +08:00
sparknack	e821468d2a	fix: milvus-common commit update (#44304 ) issue: #41435 related: #44268 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-09-11 10:19:56 +08:00
zhagnlu	77f7d19400	fix:avoid mmap rewrite by multi json fields (#44299 ) issue: #44127 Signed-off-by: zhagnlu <lu.zhang@zilliz.com>	2025-09-11 10:13:57 +08:00
congqixia	f5618d5153	enhance: [StorageV2] Utilized advance split policy and persist in meta (#44282 ) Related to #44257 This PR: - Utilize configurable split policy for storage v2, enabling system field policy - Store split result in field binlog struct - Adapt legacy binlog without child fields --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-10 14:47:57 +08:00
sparknack	4a01c726f3	enhance: cachinglayer: some metric and params update (#44276 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-09-10 11:03:57 +08:00
zhagnlu	2f8620fa79	fix: fix like failed and add max columns limit (#44233 ) #44137 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-09-10 10:33:57 +08:00
Spade A	45adf2d426	fix: load resource considers ngram index (#44237 ) fix https://github.com/milvus-io/milvus/issues/44236 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-10 10:27:56 +08:00
Chun Han	26a024625d	feat: support search by on json field and dynamic field(#43124 ) (#43203 ) related: #43124 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-09-09 21:51:56 +08:00
sthuang	dfc2335144	enhance: [StorageV2] storage file system error messages (#44255 ) related: https://github.com/milvus-io/milvus/issues/44138 bump milvus storage version, include the followings: * https://github.com/milvus-io/milvus-storage/pull/243 * https://github.com/milvus-io/milvus-storage/pull/240 * https://github.com/milvus-io/milvus-storage/pull/245 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-09-09 19:37:56 +08:00
Spade A	575d490af6	fix: ngram index is mistakenly used for unsupported operations 2 (#44142 ) issue: https://github.com/milvus-io/milvus/issues/44020 https://github.com/milvus-io/milvus/pull/43955 only fixed unary expressions. This fixes all expressions and adds more tests. --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com> Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-09 19:05:56 +08:00
Buqian Zheng	dae0fd0e90	enhance: removed unused map_c (#44183 ) Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-09-09 16:46:04 +08:00
aoiasd	92fedb8280	enhance: forbid panic when tantivy index path not exist (#44135 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-09-08 15:21:56 +08:00
Buqian Zheng	9bf2b5c10c	enhance: moved more segcore unit test files (#44210 ) issue: https://github.com/milvus-io/milvus/issues/43931 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-09-08 10:21:55 +08:00
congqixia	8f97eb355f	enhance: [StorageV2] Make bucket name concatenation transparent to user (#44232 ) Related to #39173 This PR: - Bump milvus-storage commit to handle bucket name concatenation logic in multipart s3 fs - Remove all user-side bucket name concatenation code Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-08 10:15:55 +08:00
Spade A	ba4cd68edb	fix: adjust params to make CPP UT run faster (#44223 ) fix: https://github.com/milvus-io/milvus/issues/44224 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-06 14:13:54 +08:00
aoiasd	c71b47b52c	enhance: add internal core latency metric for rescore node (#44010 ) For fetching latency of boost. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-09-05 17:37:54 +08:00
cqy123456	1d4d721859	test: Reduce the run time of interim index cpp ut (#44200 ) issue: https://github.com/milvus-io/milvus/issues/44176 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-09-05 16:45:53 +08:00
zhagnlu	d67f1ea0ab	enhance: add param to modify dump snapshot batch size (#44215 ) issue: #44216 Signed-off-by: luzhang <luzhang@zilliz.com>	2025-09-05 14:29:54 +08:00
Gao	2e98cb0103	enhance: load resource estimation for tiered index (#44171 ) issue: https://github.com/milvus-io/milvus/issues/42032 - Use bytes to estimate load resource in the whole estimation procedure - Add num_rows and dim info for vector index to better estimate - Disable eviction for tiered index's meta --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-09-04 19:41:53 +08:00
Buqian Zheng	b76bf13fc3	enhance: move c++ unit test file to aside of the production code (#43932 ) issue: https://github.com/milvus-io/milvus/issues/43931 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-09-03 23:45:53 +08:00
Spade A	825a134739	feat: impl StructArray -- reject json types for struct (#44190 ) issue: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-03 19:33:53 +08:00
Spade A	7cb15ef141	feat: impl StructArray -- optimize vector array serialization (#44035 ) issue: https://github.com/milvus-io/milvus/issues/42148 Optimized from Go VectorArray → VectorArray Proto → Binary → C++ VectorArray Proto → C++ VectorArray local impl → Memory to Go VectorArray → Arrow ListArray → Memory --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-09-03 16:39:53 +08:00
Buqian Zheng	ad16441aa0	enhance: removed unused VectorFunction (#44178 ) Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-09-03 14:37:53 +08:00
foxspy	d55bf49bf1	enhance: update knowhere version (#44144 ) issue: #42937 --------- Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-09-03 01:31:53 +08:00
Bingyi Sun	0c0630cc38	feat: support dropping index without releasing collection (#42941 ) issue: #42942 This pr includes the following changes: 1. Added checks for index checker in querycoord to generate drop index tasks 2. Added drop index interface to querynode 3. To avoid search failure after dropping the index, the querynode allows the use of lazy mode (warmup=disable) to load raw data even when indexes contain raw data. 4. In segcore, loading the index no longer deletes raw data; instead, it evicts it. 5. In expr, the index is pinned to prevent concurrent errors. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-02 16:17:52 +08:00
congqixia	aa4ef9c996	feat: Support enabling dynamic schema on existing collection (#44151 ) Related to #44150 This PR makes it possible to enable the `dynamic schema` feature on an existing collection. The related API reuses `AlterCollection`, and under the hood it is redirected to `adding nullable json field` --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-02 15:51:52 +08:00
Bingyi Sun	c420e7bd27	enhance: align the behavior of exist expr between brute force and index (#44030 ) https://github.com/milvus-io/milvus/issues/44031 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-09-01 15:03:52 +08:00
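
Note on commit 2c0c5ef41e (timestamptz): the message describes a timezone precedence order (M8) and explains why interval arithmetic cannot use the STL-SORT index (M7). The sketch below illustrates both ideas in plain Python. It is not Milvus code: `resolve_timezone` and `add_months` are hypothetical helper names introduced only for illustration, and the clamping rule used for month addition is an assumption, not something the commit specifies.

```python
import calendar
from datetime import datetime
from typing import Optional


def resolve_timezone(*candidates: Optional[str]) -> str:
    """Return the first non-empty candidate, falling back to UTC.

    Hypothetical helper: callers pass candidates in precedence order.
    """
    for tz in candidates:
        if tz:
            return tz
    return "UTC"


def add_months(ts: datetime, months: int) -> datetime:
    """Add calendar months, clamping the day to the target month's length.

    The clamping rule is an assumption made for this illustration.
    """
    month_index = ts.month - 1 + months
    year = ts.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return ts.replace(year=year, month=month, day=min(ts.day, last_day))


# Insert path: string literal -> collection default -> database default -> UTC.
insert_tz = resolve_timezone(None, "Asia/Shanghai", "Europe/Berlin")
# Retrieval path: query parameter -> collection default -> database default -> UTC.
query_tz = resolve_timezone(None, None, None)

# The same "one month" interval spans a different number of days
# depending on the timestamp it is applied to.
feb = datetime(2025, 2, 1)
mar = datetime(2025, 3, 1)
feb_days = (add_months(feb, 1) - feb).days
mar_days = (add_months(mar, 1) - mar).days
print(insert_tz, query_tz, feb_days, mar_days)
```

Because the offset produced by a month interval varies per row, a comparison such as `tsz + INTERVAL 'P1M' > ISO '...'` cannot be rewritten as a single fixed bound on `tsz`, which is consistent with the commit's statement that such expressions are not accelerated by the sorted index.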


2181 Commits