From 8a37609dcc04440b466a616855d78d670505b770 Mon Sep 17 00:00:00 2001
From: jielinxu <52057195+jielinxu@users.noreply.github.com>
Date: Wed, 20 Nov 2019 15:44:25 +0800
Subject: [PATCH 01/32] [skip ci] Update README

---
 README.md | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 08ac3de4a0..5b2fc4454b 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@
 ![LICENSE](https://img.shields.io/badge/license-Apache--2.0-brightgreen)
 ![Language](https://img.shields.io/badge/language-C%2B%2B-blue)
 [![codebeat badge](https://codebeat.co/badges/e030a4f6-b126-4475-a938-4723d54ec3a7?style=plastic)](https://codebeat.co/projects/github-com-jinhai-cn-milvus-master)
-![Release](https://img.shields.io/badge/release-v0.5.2-yellowgreen)
+![Release](https://img.shields.io/badge/release-v0.5.3-yellowgreen)
 ![Release_date](https://img.shields.io/badge/release%20date-November-yellowgreen)
 
 [中文版](README_CN.md) | [日本語版](README_JP.md)
@@ -18,7 +18,7 @@ For more detailed introduction of Milvus and its architecture, see [Milvus overv
 
 Milvus provides stable [Python](https://github.com/milvus-io/pymilvus), [Java](https://github.com/milvus-io/milvus-sdk-java) and [C++](https://github.com/milvus-io/milvus/tree/master/core/src/sdk) APIs.
 
-Keep up-to-date with newest releases and latest updates by reading Milvus [release notes](https://www.milvus.io/docs/en/release/v0.5.2/).
+Keep up-to-date with newest releases and latest updates by reading Milvus [release notes](https://www.milvus.io/docs/en/release/v0.5.3/).
 
 ## Get started
 
@@ -52,12 +52,13 @@ We use [GitHub issues](https://github.com/milvus-io/milvus/issues) to track issu
 
 To connect with other users and contributors, welcome to join our [Slack channel](https://join.slack.com/t/milvusio/shared_invite/enQtNzY1OTQ0NDI3NjMzLWNmYmM1NmNjOTQ5MGI5NDhhYmRhMGU5M2NhNzhhMDMzY2MzNDdlYjM5ODQ5MmE3ODFlYzU3YjJkNmVlNDQ2ZTk).
 
-## Thanks
+## Contributors
 
-We greatly appreciate the help of the following people.
+Below is a list of Milvus contributors. We greatly appreciate your contributions!
 
 - [akihoni](https://github.com/akihoni) provided the CN version of README, and found a broken link in the doc.
 - [goodhamgupta](https://github.com/goodhamgupta) fixed a filename typo in the bootcamp doc.
+- [erdustiggen](https://github.com/erdustiggen) changed from std::cout to LOG for error messages, and fixed a clang format issue as well as some grammatical errors.
 
 ## Resources
 
@@ -65,6 +66,8 @@ We greatly appreciate the help of the following people.
 
 - [Milvus bootcamp](https://github.com/milvus-io/bootcamp)
 
+- [Milvus test reports](https://github.com/milvus-io/milvus/tree/master/docs/test_report)
+
 - [Milvus Medium](https://medium.com/@milvusio)
 
 - [Milvus CSDN](https://zilliz.blog.csdn.net/)

From e5ee302ef8058e5f5ee11e448bfa81ce3beeb891 Mon Sep 17 00:00:00 2001
From: jielinxu <52057195+jielinxu@users.noreply.github.com>
Date: Wed, 20 Nov 2019 16:03:43 +0800
Subject: [PATCH 02/32] [skip ci] Update README_CN

---
 README_CN.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/README_CN.md b/README_CN.md
index d5de0b1cd6..979c476ebf 100644
--- a/README_CN.md
+++ b/README_CN.md
@@ -4,7 +4,7 @@
 ![LICENSE](https://img.shields.io/badge/license-Apache--2.0-brightgreen)
 ![Language](https://img.shields.io/badge/language-C%2B%2B-blue)
 [![codebeat badge](https://codebeat.co/badges/e030a4f6-b126-4475-a938-4723d54ec3a7?style=plastic)](https://codebeat.co/projects/github-com-jinhai-cn-milvus-master)
-![Release](https://img.shields.io/badge/release-v0.5.2-orange)
+![Release](https://img.shields.io/badge/release-v0.5.3-yellowgreen)
 ![Release_date](https://img.shields.io/badge/release_date-October-yellowgreen)
 
 # 欢迎来到 Milvus
@@ -17,7 +17,7 @@ Milvus 是一款开源的、针对海量特征向量的相似性搜索引擎。
 
 Milvus 提供稳定的 [Python](https://github.com/milvus-io/pymilvus)、[Java](https://github.com/milvus-io/milvus-sdk-java) 以及 C++ 的 API 接口。
 
-通过 [版本发布说明](https://milvus.io/docs/zh-CN/release/v0.5.2/) 获取最新版本的功能和更新。
+通过 [版本发布说明](https://milvus.io/docs/zh-CN/release/v0.5.3/) 获取最新版本的功能和更新。
 
 ## 开始使用 Milvus
 
@@ -57,6 +57,7 @@ Milvus 提供稳定的 [Python](https://github.com/milvus-io/pymilvus)、[Java](
 
 - [akihoni](https://github.com/akihoni) 提供了中文版 README,并发现了 README 中的无效链接。
 - [goodhamgupta](https://github.com/goodhamgupta) 发现并修正了在线训练营文档中的文件名拼写错误。
+- [erdustiggen](https://github.com/erdustiggen) 将错误信息里的 std::cout 修改为 LOG,修正了一个 Clang 格式问题和一些语法错误。
 
 ## 相关链接
 
@@ -64,6 +65,8 @@ Milvus 提供稳定的 [Python](https://github.com/milvus-io/pymilvus)、[Java](
 
 - [Milvus 在线训练营](https://github.com/milvus-io/bootcamp)
 
+- [Milvus 测试报告](https://github.com/milvus-io/milvus/tree/master/docs/test_report_cn)
+
 - [Milvus Medium](https://medium.com/@milvusio)
 
 - [Milvus CSDN](https://zilliz.blog.csdn.net/)

From 84c1483d969ea3d3d8c5f64f2d575271d4d2d1aa Mon Sep 17 00:00:00 2001
From: jielinxu <52057195+jielinxu@users.noreply.github.com>
Date: Wed, 20 Nov 2019 16:10:36 +0800
Subject: [PATCH 03/32] [skip ci] Update README_JP

---
 README_JP.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/README_JP.md b/README_JP.md
index fd80b5d2ca..d55fea4a14 100644
--- a/README_JP.md
+++ b/README_JP.md
@@ -5,7 +5,7 @@
 ![LICENSE](https://img.shields.io/badge/license-Apache--2.0-brightgreen)
 ![Language](https://img.shields.io/badge/language-C%2B%2B-blue)
 [![codebeat badge](https://codebeat.co/badges/e030a4f6-b126-4475-a938-4723d54ec3a7?style=plastic)](https://codebeat.co/projects/github-com-jinhai-cn-milvus-master)
-![Release](https://img.shields.io/badge/release-v0.5.2-yellowgreen)
+![Release](https://img.shields.io/badge/release-v0.5.3-yellowgreen)
 ![Release_date](https://img.shields.io/badge/release%20date-November-yellowgreen)
 
@@ -15,9 +15,9 @@
 
 Milvusは世界中一番早い特徴ベクトルにむかう類似性検索エンジンです。不均質な計算アーキテクチャーに基づいて効率を最大化出来ます。数十億のベクタの中に目標を検索できるまで数ミリ秒しかかからず、最低限の計算資源だけが必要です。
 
-Milvusは安定的なPython、Java又は C++ APIsを提供します。
+Milvusは安定的な[Python](https://github.com/milvus-io/pymilvus)、[Java](https://github.com/milvus-io/milvus-sdk-java)又は [C++](https://github.com/milvus-io/milvus/tree/master/core/src/sdk) APIsを提供します。
 
-Milvus [リリースノート](https://milvus.io/docs/en/release/v0.5.2/)を読んで最新バージョンや更新情報を手に入れます。
+Milvus [リリースノート](https://milvus.io/docs/en/release/v0.5.3/)を読んで最新バージョンや更新情報を手に入れます。(https://github.com/milvus-io/milvus/tree/master/core/src/sdk)
 
 ## はじめに
 
@@ -46,7 +46,7 @@ C++サンプルコードを実行するために、次のコマンドをつか
 本プロジェクトへの貢献に心より感謝いたします。 Milvusを貢献したいと思うなら、[貢献規約](CONTRIBUTING.md)を読んでください。 本プロジェクトはMilvusの[行動規範](CODE_OF_CONDUCT.md)に従います。プロジェクトに参加したい場合は、行動規範を従ってください。
 
-[GitHub issues](https://github.com/milvus-io/milvus/issues/new/choose) を使って問題やバッグなとを報告しでください。 一般てきな問題なら, Milvusコミュニティに参加してください。
+[GitHub issues](https://github.com/milvus-io/milvus/issues) を使って問題やバッグなとを報告しでください。 一般てきな問題なら, Milvusコミュニティに参加してください。
 
 ## Milvusコミュニティを参加する
 
@@ -59,6 +59,8 @@ C++サンプルコードを実行するために、次のコマンドをつか
 
 - [Milvus](https://github.com/milvus-io/bootcamp)
 
+- [Milvus テストレポート](https://github.com/milvus-io/milvus/tree/master/docs/test_report)
+
 - [Milvus Medium](https://medium.com/@milvusio)
 
 - [Milvus CSDN](https://zilliz.blog.csdn.net/)

From a54df0d3ccf06c0b36df9ada37919016cce2a95c Mon Sep 17 00:00:00 2001
From: jielinxu <52057195+jielinxu@users.noreply.github.com>
Date: Wed, 20 Nov 2019 16:57:22 +0800
Subject: [PATCH 04/32] [skip ci] minor change

---
 README_CN.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README_CN.md b/README_CN.md
index 979c476ebf..df407f1a5f 100644
--- a/README_CN.md
+++ b/README_CN.md
@@ -65,7 +65,7 @@ Milvus 提供稳定的 [Python](https://github.com/milvus-io/pymilvus)、[Java](
 
 - [Milvus 在线训练营](https://github.com/milvus-io/bootcamp)
 
-- [Milvus 测试报告](https://github.com/milvus-io/milvus/tree/master/docs/test_report_cn)
+- [Milvus 测试报告](https://github.com/milvus-io/milvus/tree/master/docs/test_report)
 
 - [Milvus Medium](https://medium.com/@milvusio)

From fd9c1a123065b4c5c4a28299ace4dd02aae88b20 Mon Sep 17 00:00:00 2001
From: jielinxu <52057195+jielinxu@users.noreply.github.com>
Date: Wed, 20 Nov 2019 17:21:10 +0800
Subject: [PATCH 05/32] [skip ci] Update test reports link

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 5b2fc4454b..d8d5d80b11 100644
--- a/README.md
+++ b/README.md
@@ -66,7 +66,7 @@ Below is a list of Milvus contributors. We greatly appreciate your contributions
 
 - [Milvus bootcamp](https://github.com/milvus-io/bootcamp)
 
-- [Milvus test reports](https://github.com/milvus-io/milvus/tree/master/docs/test_report)
+- [Milvus test reports](https://github.com/milvus-io/milvus/tree/master/docs)
 
 - [Milvus Medium](https://medium.com/@milvusio)

From fd9beb8d3b173f4522bf0308637a98ae7a00172c Mon Sep 17 00:00:00 2001
From: jielinxu <52057195+jielinxu@users.noreply.github.com>
Date: Wed, 20 Nov 2019 17:22:18 +0800
Subject: [PATCH 06/32] [skip ci] Update test report link

---
 README_CN.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README_CN.md b/README_CN.md
index df407f1a5f..df5445d931 100644
--- a/README_CN.md
+++ b/README_CN.md
@@ -65,7 +65,7 @@ Milvus 提供稳定的 [Python](https://github.com/milvus-io/pymilvus)、[Java](
 
 - [Milvus 在线训练营](https://github.com/milvus-io/bootcamp)
 
-- [Milvus 测试报告](https://github.com/milvus-io/milvus/tree/master/docs/test_report)
+- [Milvus 测试报告](https://github.com/milvus-io/milvus/tree/master/docs)
 
 - [Milvus Medium](https://medium.com/@milvusio)

From e8f97ece6f047dac0059f1decb33d5bfe6f13a92 Mon Sep 17 00:00:00 2001
From: jielinxu <52057195+jielinxu@users.noreply.github.com>
Date: Wed, 20 Nov 2019 17:22:53 +0800
Subject: [PATCH 07/32] [skip ci] Update test report link

---
 README_JP.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README_JP.md b/README_JP.md
index d55fea4a14..b5001a476c 100644
--- a/README_JP.md
+++ b/README_JP.md
@@ -59,7 +59,7 @@ C++サンプルコードを実行するために、次のコマンドをつか
 
 - [Milvus](https://github.com/milvus-io/bootcamp)
 
-- [Milvus テストレポート](https://github.com/milvus-io/milvus/tree/master/docs/test_report)
+- [Milvus テストレポート](https://github.com/milvus-io/milvus/tree/master/docs)
 
 - [Milvus Medium](https://medium.com/@milvusio)

From 388e7fc315ca4e19b2e4269c419d39f8c970b615 Mon Sep 17 00:00:00 2001
From: jielinxu <52057195+jielinxu@users.noreply.github.com>
Date: Wed, 20 Nov 2019 17:23:47 +0800
Subject: [PATCH 08/32] [skip ci] minor delete

---
 README_JP.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README_JP.md b/README_JP.md
index b5001a476c..4a1d67738d 100644
--- a/README_JP.md
+++ b/README_JP.md
@@ -17,7 +17,7 @@ Milvusは世界中一番早い特徴ベクトルにむかう類似性検索エ
 
 Milvusは安定的な[Python](https://github.com/milvus-io/pymilvus)、[Java](https://github.com/milvus-io/milvus-sdk-java)又は [C++](https://github.com/milvus-io/milvus/tree/master/core/src/sdk) APIsを提供します。
 
-Milvus [リリースノート](https://milvus.io/docs/en/release/v0.5.3/)を読んで最新バージョンや更新情報を手に入れます。(https://github.com/milvus-io/milvus/tree/master/core/src/sdk)
+Milvus [リリースノート](https://milvus.io/docs/en/release/v0.5.3/)を読んで最新バージョンや更新情報を手に入れます。
 
 ## はじめに

From 7fa7a2a65bd697585a62f93ed0af40cc3a18ceff Mon Sep 17 00:00:00 2001
From: groot
Date: Thu, 21 Nov 2019 11:28:28 +0800
Subject: [PATCH 09/32] #449 Add ShowPartitions example for C++ SDK

---
 CHANGELOG.md                                        | 1 +
 core/src/sdk/examples/partition/src/ClientTest.cpp  | 9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index a8b243546e..e7c52eb5bb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -41,6 +41,7 @@ Please mark all change in change log and use the ticket from JIRA.
 - \#404 - Add virtual method Init() in Pass abstract class
 - \#409 - Add a Fallback pass in optimizer
 - \#433 - C++ SDK query result is not easy to use
+- \#449 - Add ShowPartitions example for C++ SDK
 
 ## Task

diff --git a/core/src/sdk/examples/partition/src/ClientTest.cpp b/core/src/sdk/examples/partition/src/ClientTest.cpp
index a12a7ff50e..ecbf0a80e7 100644
--- a/core/src/sdk/examples/partition/src/ClientTest.cpp
+++ b/core/src/sdk/examples/partition/src/ClientTest.cpp
@@ -93,6 +93,15 @@ ClientTest::Test(const std::string& address, const std::string& port) {
             std::cout << "CreatePartition function call status: " << stat.message() << std::endl;
             milvus_sdk::Utils::PrintPartitionParam(partition_param);
         }
+
+        // show partitions
+        milvus::PartitionList partition_array;
+        stat = conn->ShowPartitions(TABLE_NAME, partition_array);
+
+        std::cout << partition_array.size() << " partitions created:" << std::endl;
+        for (auto& partition : partition_array) {
+            std::cout << "\t" << partition.partition_name << "\t tag = " << partition.partition_tag << std::endl;
+        }
     }
 
     {  // insert vectors
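For readers following the SDKs in Python rather than C++, the partition listing shown above has a pymilvus counterpart, which the tests added later in this series exercise. A minimal sketch — the server address and table name are placeholders, and the `partition_tag` attribute is assumed to mirror the C++ `PartitionParam` field printed above:

```python
from milvus import Milvus

client = Milvus()
client.connect(host="127.0.0.1", port="19530")  # placeholder address

# show_partitions returns (Status, list of partition descriptors)
status, partitions = client.show_partitions("demo_table")
print("%d partitions created:" % len(partitions))
for partition in partitions:
    # partition_name is what the Python tests below rely on;
    # partition_tag is assumed to match the C++ field of the same name
    print("\t%s\t tag = %s" % (partition.partition_name, partition.partition_tag))
```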
From 046c5543b8e5a6c79a3cad7b670f89e16c7ae69c Mon Sep 17 00:00:00 2001
From: "peng.xu"
Date: Thu, 21 Nov 2019 13:56:37 +0800
Subject: [PATCH 10/32] [skip ci] (doc/shards): update doc and demo

---
 shards/README.md                 |  2 +-
 shards/README_CN.md              |  2 +-
 shards/all_in_one/all_in_one.yml |  5 ++--
 shards/all_in_one/ro_server.yml  |  8 +++----
 shards/all_in_one/wr_server.yml  | 41 ++++++++++++++++++++++++++++++++
 shards/mishards/.env.example     |  4 ++--
 6 files changed, 52 insertions(+), 10 deletions(-)
 create mode 100644 shards/all_in_one/wr_server.yml

diff --git a/shards/README.md b/shards/README.md
index f59eca0460..1e0b000af5 100644
--- a/shards/README.md
+++ b/shards/README.md
@@ -54,7 +54,7 @@ Follow below steps to start a standalone Milvus instance with Mishards from sour
 3. Start Milvus server.
 
    ```shell
-   $ sudo nvidia-docker run --rm -d -p 19530:19530 -v /tmp/milvus/db:/opt/milvus/db milvusdb/milvus:0.5.0-d102119-ede20b
+   $ sudo nvidia-docker run --rm -d -p 19530:19530 -v /tmp/milvus/db:/opt/milvus/db milvusdb/milvus
   ```
 
 4. Update path permissions.

diff --git a/shards/README_CN.md b/shards/README_CN.md
index 24e019d001..98264b206b 100644
--- a/shards/README_CN.md
+++ b/shards/README_CN.md
@@ -48,7 +48,7 @@ Python 版本为3.6及以上。
 3. 启动 Milvus 服务。
 
    ```shell
-   $ sudo nvidia-docker run --rm -d -p 19530:19530 -v /tmp/milvus/db:/opt/milvus/db milvusdb/milvus:0.5.0-d102119-ede20b
+   $ sudo nvidia-docker run --rm -d -p 19530:19530 -v /tmp/milvus/db:/opt/milvus/db milvusdb/milvus
   ```
 
 4. 更改目录权限。

diff --git a/shards/all_in_one/all_in_one.yml b/shards/all_in_one/all_in_one.yml
index 40473fe8b9..75a3340068 100644
--- a/shards/all_in_one/all_in_one.yml
+++ b/shards/all_in_one/all_in_one.yml
@@ -3,14 +3,15 @@ services:
     milvus_wr:
         runtime: nvidia
         restart: always
-        image: milvusdb/milvus:0.5.0-d102119-ede20b
+        image: milvusdb/milvus
        volumes:
             - /tmp/milvus/db:/opt/milvus/db
+            - ./wr_server.yml:/opt/milvus/conf/server_config.yaml
 
     milvus_ro:
         runtime: nvidia
         restart: always
-        image: milvusdb/milvus:0.5.0-d102119-ede20b
+        image: milvusdb/milvus
        volumes:
             - /tmp/milvus/db:/opt/milvus/db
             - ./ro_server.yml:/opt/milvus/conf/server_config.yaml

diff --git a/shards/all_in_one/ro_server.yml b/shards/all_in_one/ro_server.yml
index 10cf695448..09857ee9c8 100644
--- a/shards/all_in_one/ro_server.yml
+++ b/shards/all_in_one/ro_server.yml
@@ -12,7 +12,7 @@ db_config:
                                  # Keep 'dialect://:@:/', and replace other texts with real values
                                  # Replace 'dialect' with 'mysql' or 'sqlite'
 
-    insert_buffer_size: 4        # GB, maximum insert buffer size allowed
+    insert_buffer_size: 1        # GB, maximum insert buffer size allowed
                                  # sum of insert_buffer_size and cpu_cache_capacity cannot exceed total memory
 
     preload_table:               # preload data at startup, '*' means load all tables, empty value means no preload
@@ -25,14 +25,14 @@ metric_config:
         port: 8080               # port prometheus uses to fetch metrics
 
 cache_config:
-    cpu_cache_capacity: 16       # GB, CPU memory used for cache
+    cpu_cache_capacity: 4        # GB, CPU memory used for cache
     cpu_cache_threshold: 0.85    # percentage of data that will be kept when cache cleanup is triggered
-    gpu_cache_capacity: 4        # GB, GPU memory used for cache
+    gpu_cache_capacity: 1        # GB, GPU memory used for cache
     gpu_cache_threshold: 0.85    # percentage of data that will be kept when cache cleanup is triggered
     cache_insert_data: false     # whether to load inserted data into cache
 
 engine_config:
-    use_blas_threshold: 20       # if nq < use_blas_threshold, use SSE, faster with fluctuated response times
+    use_blas_threshold: 800      # if nq < use_blas_threshold, use SSE, faster with fluctuated response times
                                  # if nq >= use_blas_threshold, use OpenBlas, slower with stable response times
 
 resource_config:

diff --git a/shards/all_in_one/wr_server.yml b/shards/all_in_one/wr_server.yml
new file mode 100644
index 0000000000..5d7d855c03
--- /dev/null
+++ b/shards/all_in_one/wr_server.yml
@@ -0,0 +1,41 @@
+server_config:
+    address: 0.0.0.0             # milvus server ip address (IPv4)
+    port: 19530                  # port range: 1025 ~ 65534
+    deploy_mode: cluster_writable  # deployment type: single, cluster_readonly, cluster_writable
+    time_zone: UTC+8
+
+db_config:
+    primary_path: /opt/milvus    # path used to store data and meta
+    secondary_path:              # path used to store data only, split by semicolon
+
+    backend_url: sqlite://:@:/   # URI format: dialect://username:password@host:port/database
+                                 # Keep 'dialect://:@:/', and replace other texts with real values
+                                 # Replace 'dialect' with 'mysql' or 'sqlite'
+
+    insert_buffer_size: 2        # GB, maximum insert buffer size allowed
+                                 # sum of insert_buffer_size and cpu_cache_capacity cannot exceed total memory
+
+    preload_table:               # preload data at startup, '*' means load all tables, empty value means no preload
+                                 # you can specify preload tables like this: table1,table2,table3
+
+metric_config:
+    enable_monitor: false        # enable monitoring or not
+    collector: prometheus        # prometheus
+    prometheus_config:
+        port: 8080               # port prometheus uses to fetch metrics
+
+cache_config:
+    cpu_cache_capacity: 2        # GB, CPU memory used for cache
+    cpu_cache_threshold: 0.85    # percentage of data that will be kept when cache cleanup is triggered
+    gpu_cache_capacity: 2        # GB, GPU memory used for cache
+    gpu_cache_threshold: 0.85    # percentage of data that will be kept when cache cleanup is triggered
+    cache_insert_data: false     # whether to load inserted data into cache
+
+engine_config:
+    use_blas_threshold: 800      # if nq < use_blas_threshold, use SSE, faster with fluctuated response times
+                                 # if nq >= use_blas_threshold, use OpenBlas, slower with stable response times
+
+resource_config:
+    search_resources:            # define the GPUs used for search computation, valid value: gpux
+        - gpu0
+    index_build_device: gpu0     # GPU used for building index

diff --git a/shards/mishards/.env.example b/shards/mishards/.env.example
index f1c812a269..91b67760af 100644
--- a/shards/mishards/.env.example
+++ b/shards/mishards/.env.example
@@ -1,7 +1,7 @@
 DEBUG=True
 
 WOSERVER=tcp://127.0.0.1:19530
-SERVER_PORT=19532
+SERVER_PORT=19535
 SERVER_TEST_PORT=19888
 
 #SQLALCHEMY_DATABASE_URI=mysql+pymysql://root:root@127.0.0.1:3306/milvus?charset=utf8mb4
@@ -19,7 +19,7 @@ TRACER_CLASS_NAME=jaeger
 TRACING_SERVICE_NAME=fortest
 TRACING_SAMPLER_TYPE=const
 TRACING_SAMPLER_PARAM=1
-TRACING_LOG_PAYLOAD=True
+TRACING_LOG_PAYLOAD=False
 
 #TRACING_SAMPLER_TYPE=probabilistic
 #TRACING_SAMPLER_PARAM=0.5
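If I read the demo layout above correctly, the mishards proxy listens on SERVER_PORT (19535 in .env.example) and forwards write traffic to the cluster_writable node named by WOSERVER (port 19530), while the cluster_readonly node serves searches; treat that routing description as an assumption. A minimal client sketch under those assumptions — ports taken from the files above, everything else a placeholder:

```python
from milvus import Milvus, MetricType

# The client only needs the mishards endpoint (SERVER_PORT=19535 above);
# routing to the writable node (WOSERVER, port 19530) happens server-side.
client = Milvus()
client.connect(host="127.0.0.1", port="19535")

param = {'table_name': 'demo_table', 'dimension': 128,
         'index_file_size': 10, 'metric_type': MetricType.L2}
client.create_table(param)  # assumed to be forwarded to the cluster_writable instance
```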
From 1ac30913e73d9eb621eb770a7dd0125fd5c2c6a8 Mon Sep 17 00:00:00 2001
From: "xiaojun.lin"
Date: Thu, 21 Nov 2019 15:06:00 +0800
Subject: [PATCH 11/32] move seal to Load

---
 CHANGELOG.md                                                        | 1 +
 .../knowhere/knowhere/index/vector_index/FaissBaseIndex.cpp         | 4 +++-
 .../knowhere/knowhere/index/vector_index/IndexGPUIVF.cpp            | 3 ---
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index a8b243546e..af4abe71a5 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -18,6 +18,7 @@ Please mark all change in change log and use the ticket from JIRA.
 - \#412 - Message returned is confused when partition created with null partition name
 - \#416 - Drop the same partition success repeatally
 - \#440 - Query API in customization still uses old version
+- \#458 - Index data is not compatible between 0.5 and 0.6
 
 ## Feature
 - \#12 - Pure CPU version for Milvus

diff --git a/core/src/index/knowhere/knowhere/index/vector_index/FaissBaseIndex.cpp b/core/src/index/knowhere/knowhere/index/vector_index/FaissBaseIndex.cpp
index 783487be3a..8fce37a81e 100644
--- a/core/src/index/knowhere/knowhere/index/vector_index/FaissBaseIndex.cpp
+++ b/core/src/index/knowhere/knowhere/index/vector_index/FaissBaseIndex.cpp
@@ -33,7 +33,7 @@ FaissBaseIndex::SerializeImpl() {
     try {
         faiss::Index* index = index_.get();
 
-        SealImpl();
+        // SealImpl();
 
         MemoryIOWriter writer;
         faiss::write_index(index, &writer);
@@ -60,6 +60,8 @@ FaissBaseIndex::LoadImpl(const BinarySet& index_binary) {
 
     faiss::Index* index = faiss::read_index(&reader);
     index_.reset(index);
+
+    SealImpl();
 }
 
 void

diff --git a/core/src/index/knowhere/knowhere/index/vector_index/IndexGPUIVF.cpp b/core/src/index/knowhere/knowhere/index/vector_index/IndexGPUIVF.cpp
index 251dfc12ed..d69f87a061 100644
--- a/core/src/index/knowhere/knowhere/index/vector_index/IndexGPUIVF.cpp
+++ b/core/src/index/knowhere/knowhere/index/vector_index/IndexGPUIVF.cpp
@@ -86,9 +86,6 @@ GPUIVF::SerializeImpl() {
         faiss::Index* index = index_.get();
         faiss::Index* host_index = faiss::gpu::index_gpu_to_cpu(index);
 
-        // TODO(linxj): support seal
-        // SealImpl();
-
         faiss::write_index(host_index, &writer);
         delete host_index;
     }

From 32e5bba61aafcc1285acadfca63a28e39b9045a9 Mon Sep 17 00:00:00 2001
From: "xiaojun.lin"
Date: Thu, 21 Nov 2019 16:54:51 +0800
Subject: [PATCH 12/32] fix

---
 core/src/index/knowhere/knowhere/index/vector_index/IndexIVF.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/core/src/index/knowhere/knowhere/index/vector_index/IndexIVF.cpp b/core/src/index/knowhere/knowhere/index/vector_index/IndexIVF.cpp
index 7f30a97ea0..8b734abdc6 100644
--- a/core/src/index/knowhere/knowhere/index/vector_index/IndexIVF.cpp
+++ b/core/src/index/knowhere/knowhere/index/vector_index/IndexIVF.cpp
@@ -97,7 +97,6 @@ IVF::Serialize() {
     }
 
     std::lock_guard lk(mutex_);
-    Seal();
     return SerializeImpl();
 }
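The patch below adds Python tests that exercise the new partition API end to end. As a reader's summary of the workflow those tests verify, a minimal sketch — it assumes a running Milvus server on localhost:19530 and an existing 128-dimensional table named 'demo_table'; every call signature here appears verbatim in the tests that follow:

```python
from milvus import Milvus

client = Milvus()
client.connect(host="127.0.0.1", port="19530")

# create a partition under the table, identified by a tag
client.create_partition("demo_table", "demo_partition_01", "1970-01-01")

# insert into the partition by tag
vectors = [[0.1] * 128 for _ in range(5)]
status, ids = client.add_vectors("demo_table", vectors, partition_tag="1970-01-01")

# top_k=5, nprobe=1; restrict the search to one partition via its tag
status, results = client.search_vectors("demo_table", 5, 1,
                                        vectors, partition_tags=["1970-01-01"])

# drop the partition by tag when done
client.drop_partition("demo_table", "1970-01-01")
```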
From acf4d0459d37a8760a4c931abeb29b7a4e9e916f Mon Sep 17 00:00:00 2001
From: zhenwu
Date: Thu, 21 Nov 2019 17:23:06 +0800
Subject: [PATCH 13/32] Add partition case

---
 tests/milvus_python_test/pytest.ini            |   2 +-
 tests/milvus_python_test/requirements.txt      |   2 +-
 .../requirements_no_pymilvus.txt               |   1 -
 tests/milvus_python_test/test_add_vectors.py   |  96 +++-
 tests/milvus_python_test/test_connect.py       |  13 +-
 tests/milvus_python_test/test_index.py         | 216 ++++++++-
 tests/milvus_python_test/test_mix.py           |   3 +-
 tests/milvus_python_test/test_partition.py     | 431 ++++++++++++++++++
 .../milvus_python_test/test_search_vectors.py  | 238 +++++++++-
 tests/milvus_python_test/test_table.py         |   2 +-
 tests/milvus_python_test/test_table_count.py   |  88 +++-
 11 files changed, 1065 insertions(+), 27 deletions(-)
 create mode 100644 tests/milvus_python_test/test_partition.py

diff --git a/tests/milvus_python_test/pytest.ini b/tests/milvus_python_test/pytest.ini
index 3f95dc29b8..3ae6a790db 100644
--- a/tests/milvus_python_test/pytest.ini
+++ b/tests/milvus_python_test/pytest.ini
@@ -4,6 +4,6 @@ log_format = [%(asctime)s-%(levelname)s-%(name)s]: %(message)s (%(filename)s:%(l
 log_cli = true
 log_level = 20
 
-timeout = 300
+timeout = 600
 
 level = 1
\ No newline at end of file

diff --git a/tests/milvus_python_test/requirements.txt b/tests/milvus_python_test/requirements.txt
index c8fc02c096..016c8dedfc 100644
--- a/tests/milvus_python_test/requirements.txt
+++ b/tests/milvus_python_test/requirements.txt
@@ -22,4 +22,4 @@ wcwidth==0.1.7
 wrapt==1.11.1
 zipp==0.5.1
 scikit-learn>=0.19.1
-pymilvus-test>=0.2.0
\ No newline at end of file
+pymilvus-test>=0.2.0

diff --git a/tests/milvus_python_test/requirements_no_pymilvus.txt b/tests/milvus_python_test/requirements_no_pymilvus.txt
index 45884c0c71..c6a933736e 100644
--- a/tests/milvus_python_test/requirements_no_pymilvus.txt
+++ b/tests/milvus_python_test/requirements_no_pymilvus.txt
@@ -17,7 +17,6 @@ allure-pytest==2.7.0
 pytest-print==0.1.2
 pytest-level==0.1.1
 six==1.12.0
-thrift==0.11.0
 typed-ast==1.3.5
 wcwidth==0.1.7
 wrapt==1.11.1

diff --git a/tests/milvus_python_test/test_add_vectors.py b/tests/milvus_python_test/test_add_vectors.py
index f9f7f7d4ca..7245d51ea2 100644
--- a/tests/milvus_python_test/test_add_vectors.py
+++ b/tests/milvus_python_test/test_add_vectors.py
@@ -15,7 +15,7 @@ table_id = "test_add"
 ADD_TIMEOUT = 60
 nprobe = 1
 epsilon = 0.0001
-
+tag = "1970-01-01"
 
 class TestAddBase:
     """
@@ -186,6 +186,7 @@ class TestAddBase:
         expected: status ok
         '''
        index_param = get_simple_index_params
+        logging.getLogger().info(index_param)
         vector = gen_single_vector(dim)
         status, ids = connect.add_vectors(table, vector)
         status = connect.create_index(table, index_param)
@@ -439,6 +440,80 @@ class TestAddBase:
         assert status.OK()
         assert len(ids) == nq
 
+    @pytest.mark.timeout(ADD_TIMEOUT)
+    def test_add_vectors_tag(self, connect, table):
+        '''
+        target: test add vectors in table created before
+        method: create table and add vectors in it, with the partition_tag param
+        expected: the table row count equals nq
+        '''
+        nq = 5
+        partition_name = gen_unique_str()
+        vectors = gen_vectors(nq, dim)
+        status = connect.create_partition(table, partition_name, tag)
+        status, ids = connect.add_vectors(table, vectors, partition_tag=tag)
+        assert status.OK()
+        assert len(ids) == nq
+
+    @pytest.mark.timeout(ADD_TIMEOUT)
+    def test_add_vectors_tag_A(self, connect, table):
+        '''
+        target: test add vectors in table created before
+        method: create partition and add vectors in it
+        expected: the table row count equals nq
+        '''
+        nq = 5
+        partition_name = gen_unique_str()
+        vectors = gen_vectors(nq, dim)
+        status = connect.create_partition(table, partition_name, tag)
+        status, ids = connect.add_vectors(partition_name, vectors)
+        assert status.OK()
+        assert len(ids) == nq
+
+    @pytest.mark.timeout(ADD_TIMEOUT)
+    def test_add_vectors_tag_not_existed(self, connect, table):
+        '''
+        target: test add vectors in table created before
+        method: create table and add vectors in it, with the not existed partition_tag param
+        expected: status not ok
+        '''
+        nq = 5
+        vectors = gen_vectors(nq, dim)
+        status, ids = connect.add_vectors(table, vectors, partition_tag=tag)
+        assert not status.OK()
+
+    @pytest.mark.timeout(ADD_TIMEOUT)
+    def test_add_vectors_tag_not_existed_A(self, connect, table):
+        '''
+        target: test add vectors in table created before
+        method: create partition, add vectors with the not existed partition_tag param
+        expected: status not ok
+        '''
+        nq = 5
+        vectors = gen_vectors(nq, dim)
+        new_tag = "new_tag"
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        status, ids = connect.add_vectors(table, vectors, partition_tag=new_tag)
+        assert not status.OK()
+
+    @pytest.mark.timeout(ADD_TIMEOUT)
+    def test_add_vectors_tag_existed(self, connect, table):
+        '''
+        target: test add vectors in table created before
+        method: create table and add vectors in it repeatedly, with the partition_tag param
+        expected: the table row count equals nq
+        '''
+        nq = 5
+        partition_name = gen_unique_str()
+        vectors = gen_vectors(nq, dim)
+        status = connect.create_partition(table, partition_name, tag)
+        status, ids = connect.add_vectors(table, vectors, partition_tag=tag)
+        for i in range(5):
+            status, ids = connect.add_vectors(table, vectors, partition_tag=tag)
+        assert status.OK()
+        assert len(ids) == nq
+
     @pytest.mark.level(2)
     def test_add_vectors_without_connect(self, dis_connect, table):
         '''
@@ -1198,7 +1273,8 @@ class TestAddAdvance:
         assert len(ids) == nb
         assert status.OK()
 
-class TestAddTableNameInvalid(object):
+
+class TestNameInvalid(object):
     """
     Test adding vectors with invalid table names
     """
@@ -1209,13 +1285,27 @@ class TestNameInvalid(object):
     def get_table_name(self, request):
         yield request.param
 
+    @pytest.fixture(
+        scope="function",
+        params=gen_invalid_table_names()
+    )
+    def get_tag_name(self, request):
+        yield request.param
+
     @pytest.mark.level(2)
-    def test_add_vectors_with_invalid_tablename(self, connect, get_table_name):
+    def test_add_vectors_with_invalid_table_name(self, connect, get_table_name):
         table_name = get_table_name
         vectors = gen_vectors(1, dim)
         status, result = connect.add_vectors(table_name, vectors)
         assert not status.OK()
 
+    @pytest.mark.level(2)
+    def test_add_vectors_with_invalid_tag_name(self, connect, table, get_tag_name):
+        tag_name = get_tag_name
+        vectors = gen_vectors(1, dim)
+        status, result = connect.add_vectors(table, vectors, partition_tag=tag_name)
+        assert not status.OK()
+
 
 class TestAddTableVectorsInvalid(object):
     single_vector = gen_single_vector(dim)
diff --git a/tests/milvus_python_test/test_connect.py b/tests/milvus_python_test/test_connect.py
index dd7e80c1f9..143ac4d8bf 100644
--- a/tests/milvus_python_test/test_connect.py
+++ b/tests/milvus_python_test/test_connect.py
@@ -149,15 +149,14 @@ class TestConnect:
             milvus.connect(uri=uri_value, timeout=1)
         assert not milvus.connected()
 
-    # TODO: enable
-    def _test_connect_with_multiprocess(self, args):
+    def test_connect_with_multiprocess(self, args):
         '''
         target: test uri connect with multiprocess
         method: set correct uri, test with multiprocessing connecting
         expected: all connection is connected
         '''
         uri_value = "tcp://%s:%s" % (args["ip"], args["port"])
-        process_num = 4
+        process_num = 10
         processes = []
 
         def connect(milvus):
@@ -248,7 +247,7 @@ class TestConnect:
         expected: connect raise an exception and connected is false
         '''
         milvus = Milvus()
-        uri_value = "tcp://%s:19540" % args["ip"]
+        uri_value = "tcp://%s:39540" % args["ip"]
         with pytest.raises(Exception) as e:
             milvus.connect(host=args["ip"], port="", uri=uri_value)
 
@@ -264,6 +263,7 @@ class TestConnect:
             milvus.connect(host="", port=args["port"], uri=uri_value, timeout=1)
         assert not milvus.connected()
 
+    # Disabled (issue: https://github.com/milvus-io/milvus/issues/288)
     def test_connect_param_priority_both_hostip_uri(self, args):
         '''
         target: both host_ip_port / uri are both given, and not null, use the uri params
@@ -273,8 +273,9 @@ class TestConnect:
         milvus = Milvus()
         uri_value = "tcp://%s:%s" % (args["ip"], args["port"])
         with pytest.raises(Exception) as e:
-            milvus.connect(host=args["ip"], port=19540, uri=uri_value, timeout=1)
-            assert not milvus.connected()
+            res = milvus.connect(host=args["ip"], port=39540, uri=uri_value, timeout=1)
+            logging.getLogger().info(res)
+            # assert not milvus.connected()
 
     def _test_add_vector_and_disconnect_concurrently(self):
         '''

diff --git a/tests/milvus_python_test/test_index.py b/tests/milvus_python_test/test_index.py
index 269e6137da..39aadb9d33 100644
--- a/tests/milvus_python_test/test_index.py
+++ b/tests/milvus_python_test/test_index.py
@@ -20,6 +20,7 @@ vectors = sklearn.preprocessing.normalize(vectors, axis=1, norm='l2')
 vectors = vectors.tolist()
 BUILD_TIMEOUT = 60
 nprobe = 1
+tag = "1970-01-01"
 
 
 class TestIndexBase:
@@ -62,6 +63,21 @@ class TestIndexBase:
         status = connect.create_index(table, index_params)
         assert status.OK()
 
+    @pytest.mark.timeout(BUILD_TIMEOUT)
+    def test_create_index_partition(self, connect, table, get_index_params):
+        '''
+        target: test create index interface
+        method: create table, create partition, and add vectors in it, create index
+        expected: return code equals to 0, and search success
+        '''
+        partition_name = gen_unique_str()
+        index_params = get_index_params
+        logging.getLogger().info(index_params)
+        status = connect.create_partition(table, partition_name, tag)
+        status, ids = connect.add_vectors(table, vectors, partition_tag=tag)
+        status = connect.create_index(table, index_params)
+        assert status.OK()
+
     @pytest.mark.level(2)
     def test_create_index_without_connect(self, dis_connect, table):
         '''
@@ -555,6 +571,21 @@ class TestIndexIP:
         status = connect.create_index(ip_table, index_params)
         assert status.OK()
 
+    @pytest.mark.timeout(BUILD_TIMEOUT)
+    def test_create_index_partition(self, connect, ip_table, get_index_params):
+        '''
+        target: test create index interface
+        method: create table, create partition, and add vectors in it, create index
+        expected: return code equals to 0, and search success
+        '''
+        partition_name = gen_unique_str()
+        index_params = get_index_params
+        logging.getLogger().info(index_params)
+        status = connect.create_partition(ip_table, partition_name, tag)
+        status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
+        status = connect.create_index(partition_name, index_params)
+        assert status.OK()
+
     @pytest.mark.level(2)
     def test_create_index_without_connect(self, dis_connect, ip_table):
         '''
@@ -583,9 +614,9 @@ class TestIndexIP:
         query_vecs = [vectors[0], vectors[1], vectors[2]]
         top_k = 5
         status, result = connect.search_vectors(ip_table, top_k, nprobe, query_vecs)
+        logging.getLogger().info(result)
         assert status.OK()
         assert len(result) == len(query_vecs)
-        # logging.getLogger().info(result)
 
     # TODO: enable
     @pytest.mark.timeout(BUILD_TIMEOUT)
@@ -743,13 +774,13 @@ class TestIndexIP:
     ******************************************************************
     """
 
-    def test_describe_index(self, connect, ip_table, get_index_params):
+    def test_describe_index(self, connect, ip_table, get_simple_index_params):
         '''
         target: test describe index interface
         method: create table and add vectors in it, create index, call describe index
         expected: return code 0, and index instructure
         '''
-        index_params = get_index_params
+        index_params = get_simple_index_params
         logging.getLogger().info(index_params)
         status, ids = connect.add_vectors(ip_table, vectors)
         status = connect.create_index(ip_table, index_params)
@@ -759,6 +790,80 @@ class TestIndexIP:
         assert result._nlist == index_params["nlist"]
         assert result._table_name == ip_table
         assert result._index_type == index_params["index_type"]
 
+    def test_describe_index_partition(self, connect, ip_table, get_simple_index_params):
+        '''
+        target: test describe index interface
+        method: create table, create partition and add vectors in it, create index, call describe index
+        expected: return code 0, and index structure
+        '''
+        partition_name = gen_unique_str()
+        index_params = get_simple_index_params
+        logging.getLogger().info(index_params)
+        status = connect.create_partition(ip_table, partition_name, tag)
+        status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
+        status = connect.create_index(ip_table, index_params)
+        status, result = connect.describe_index(ip_table)
+        logging.getLogger().info(result)
+        assert result._nlist == index_params["nlist"]
+        assert result._table_name == ip_table
+        assert result._index_type == index_params["index_type"]
+        status, result = connect.describe_index(partition_name)
+        logging.getLogger().info(result)
+        assert result._nlist == index_params["nlist"]
+        assert result._table_name == partition_name
+        assert result._index_type == index_params["index_type"]
+
+    def test_describe_index_partition_A(self, connect, ip_table, get_simple_index_params):
+        '''
+        target: test describe index interface
+        method: create table, create partition and add vectors in it, create index on partition, call describe index
+        expected: return code 0, and index structure
+        '''
+        partition_name = gen_unique_str()
+        index_params = get_simple_index_params
+        logging.getLogger().info(index_params)
+        status = connect.create_partition(ip_table, partition_name, tag)
+        status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
+        status = connect.create_index(partition_name, index_params)
+        status, result = connect.describe_index(ip_table)
+        logging.getLogger().info(result)
+        assert result._nlist == 16384
+        assert result._table_name == ip_table
+        assert result._index_type == IndexType.FLAT
+        status, result = connect.describe_index(partition_name)
+        logging.getLogger().info(result)
+        assert result._nlist == index_params["nlist"]
+        assert result._table_name == partition_name
+        assert result._index_type == index_params["index_type"]
+
+    def test_describe_index_partition_B(self, connect, ip_table, get_simple_index_params):
+        '''
+        target: test describe index interface
+        method: create table, create partitions and add vectors in it, create index on partitions, call describe index
+        expected: return code 0, and index structure
+        '''
+        partition_name = gen_unique_str()
+        new_partition_name = gen_unique_str()
+        new_tag = "new_tag"
+        index_params = get_simple_index_params
+        logging.getLogger().info(index_params)
+        status = connect.create_partition(ip_table, partition_name, tag)
+        status = connect.create_partition(ip_table, new_partition_name, new_tag)
+        status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
+        status, ids = connect.add_vectors(ip_table, vectors, partition_tag=new_tag)
+        status = connect.create_index(partition_name, index_params)
+        status = connect.create_index(new_partition_name, index_params)
+        status, result = connect.describe_index(ip_table)
+        logging.getLogger().info(result)
+        assert result._nlist == 16384
+        assert result._table_name == ip_table
+        assert result._index_type == IndexType.FLAT
+        status, result = connect.describe_index(new_partition_name)
+        logging.getLogger().info(result)
+        assert result._nlist == index_params["nlist"]
+        assert result._table_name == new_partition_name
+        assert result._index_type == index_params["index_type"]
+
     def test_describe_and_drop_index_multi_tables(self, connect, get_simple_index_params):
         '''
         target: test create, describe and drop index interface with multiple tables of IP
@@ -849,6 +954,111 @@ class TestIndexIP:
         assert result._table_name == ip_table
         assert result._index_type == IndexType.FLAT
 
+    def test_drop_index_partition(self, connect, ip_table, get_simple_index_params):
+        '''
+        target: test drop index interface
+        method: create table, create partition and add vectors in it, create index on table, call drop table index
+        expected: return code 0, and default index param
+        '''
+        partition_name = gen_unique_str()
+        index_params = get_simple_index_params
+        status = connect.create_partition(ip_table, partition_name, tag)
+        status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
+        status = connect.create_index(ip_table, index_params)
+        assert status.OK()
+        status, result = connect.describe_index(ip_table)
+        logging.getLogger().info(result)
+        status = connect.drop_index(ip_table)
+        assert status.OK()
+        status, result = connect.describe_index(ip_table)
+        logging.getLogger().info(result)
+        assert result._nlist == 16384
+        assert result._table_name == ip_table
+        assert result._index_type == IndexType.FLAT
+
+    def test_drop_index_partition_A(self, connect, ip_table, get_simple_index_params):
+        '''
+        target: test drop index interface
+        method: create table, create partition and add vectors in it, create index on partition, call drop table index
+        expected: return code 0, and default index param
+        '''
+        partition_name = gen_unique_str()
+        index_params = get_simple_index_params
+        status = connect.create_partition(ip_table, partition_name, tag)
+        status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
+        status = connect.create_index(partition_name, index_params)
+        assert status.OK()
+        status = connect.drop_index(ip_table)
+        assert status.OK()
+        status, result = connect.describe_index(ip_table)
+        logging.getLogger().info(result)
+        assert result._nlist == 16384
+        assert result._table_name == ip_table
+        assert result._index_type == IndexType.FLAT
+        status, result = connect.describe_index(partition_name)
+        logging.getLogger().info(result)
+        assert result._nlist == 16384
+        assert result._table_name == partition_name
+        assert result._index_type == IndexType.FLAT
+
+    def test_drop_index_partition_B(self, connect, ip_table, get_simple_index_params):
+        '''
+        target: test drop index interface
+        method: create table, create partition and add vectors in it, create index on partition, call drop partition index
+        expected: return code 0, and default index param
+        '''
+        partition_name = gen_unique_str()
+        index_params = get_simple_index_params
+        status = connect.create_partition(ip_table, partition_name, tag)
+        status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
+        status = connect.create_index(partition_name, index_params)
+        assert status.OK()
+        status = connect.drop_index(partition_name)
+        assert status.OK()
+        status, result = connect.describe_index(ip_table)
+        logging.getLogger().info(result)
+        assert result._nlist == 16384
+        assert result._table_name == ip_table
+        assert result._index_type == IndexType.FLAT
+        status, result = connect.describe_index(partition_name)
+        logging.getLogger().info(result)
+        assert result._nlist == 16384
+        assert result._table_name == partition_name
+        assert result._index_type == IndexType.FLAT
+
+    def test_drop_index_partition_C(self, connect, ip_table, get_simple_index_params):
+        '''
+        target: test drop index interface
+        method: create table, create partitions and add vectors in it, create index on partitions, call drop partition index
+        expected: return code 0, and default index param
+        '''
+        partition_name = gen_unique_str()
+        new_partition_name = gen_unique_str()
+        new_tag = "new_tag"
+        index_params = get_simple_index_params
+        status = connect.create_partition(ip_table, partition_name, tag)
+        status = connect.create_partition(ip_table, new_partition_name, new_tag)
+        status, ids = connect.add_vectors(ip_table, vectors)
+        status = connect.create_index(ip_table, index_params)
+        assert status.OK()
+        status = connect.drop_index(new_partition_name)
+        assert status.OK()
+        status, result = connect.describe_index(new_partition_name)
+        logging.getLogger().info(result)
+        assert result._nlist == 16384
+        assert result._table_name == new_partition_name
+        assert result._index_type == IndexType.FLAT
+        status, result = connect.describe_index(partition_name)
+        logging.getLogger().info(result)
+        assert result._nlist == index_params["nlist"]
+        assert result._table_name == partition_name
+        assert result._index_type == index_params["index_type"]
+        status, result = connect.describe_index(ip_table)
+        logging.getLogger().info(result)
+        assert result._nlist == index_params["nlist"]
+        assert result._table_name == ip_table
+        assert result._index_type == index_params["index_type"]
+
     def test_drop_index_repeatly(self, connect, ip_table, get_simple_index_params):
         '''
         target: test drop index repeatly
diff --git a/tests/milvus_python_test/test_mix.py b/tests/milvus_python_test/test_mix.py
index f099db5c31..5ef9ba2cde 100644
--- a/tests/milvus_python_test/test_mix.py
+++ b/tests/milvus_python_test/test_mix.py
@@ -25,9 +25,8 @@ index_params = {'index_type': IndexType.IVFLAT, 'nlist': 16384}
 
 class TestMixBase:
 
-    # TODO: enable
     def test_search_during_createIndex(self, args):
-        loops = 100000
+        loops = 10000
         table = gen_unique_str()
         query_vecs = [vectors[0], vectors[1]]
         uri = "tcp://%s:%s" % (args["ip"], args["port"])

diff --git a/tests/milvus_python_test/test_partition.py b/tests/milvus_python_test/test_partition.py
new file mode 100644
index 0000000000..cbb0b5bc8e
--- /dev/null
+++ b/tests/milvus_python_test/test_partition.py
@@ -0,0 +1,431 @@
+import time
+import random
+import pdb
+import threading
+import logging
+from multiprocessing import Pool, Process
+import pytest
+from milvus import Milvus, IndexType, MetricType
+from utils import *
+
+
+dim = 128
+index_file_size = 10
+table_id = "test_add"
+ADD_TIMEOUT = 60
+nprobe = 1
+epsilon = 0.0001
+tag = "1970-01-01"
+
+
+class TestCreateBase:
+
+    """
+    ******************************************************************
+      The following cases are used to test `create_partition` function
+    ******************************************************************
+    """
+    def test_create_partition(self, connect, table):
+        '''
+        target: test create partition, check status returned
+        method: call function: create_partition
+        expected: status ok
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        assert status.OK()
+
+    def test_create_partition_repeat(self, connect, table):
+        '''
+        target: test create partition, check status returned
+        method: call function: create_partition
+        expected: status ok
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        status = connect.create_partition(table, partition_name, tag)
+        assert not status.OK()
+
+    def test_create_partition_recursively(self, connect, table):
+        '''
+        target: test create partition, and create partition in parent partition, check status returned
+        method: call function: create_partition
+        expected: status not ok
+        '''
+        partition_name = gen_unique_str()
+        new_partition_name = gen_unique_str()
+        new_tag = "new_tag"
+        status = connect.create_partition(table, partition_name, tag)
+        status = connect.create_partition(partition_name, new_partition_name, new_tag)
+        assert not status.OK()
+
+    def test_create_partition_table_not_existed(self, connect):
+        '''
+        target: test create partition whose owner table does not exist in db, check status returned
+        method: call function: create_partition
+        expected: status not ok
+        '''
+        table_name = gen_unique_str()
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table_name, partition_name, tag)
+        assert not status.OK()
+
+    def test_create_partition_partition_name_existed(self, connect, table):
+        '''
+        target: test create partition, and create the same partition again, check status returned
+        method: call function: create_partition
+        expected: status not ok
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        assert status.OK()
+        tag_new = "tag_new"
+        status = connect.create_partition(table, partition_name, tag_new)
+        assert not status.OK()
+
+    def test_create_partition_partition_name_equals_table(self, connect, table):
+        '''
+        target: test create partition where the partition name equals the table name, check status returned
+        method: call function: create_partition
+        expected: status not ok
+        '''
+        status = connect.create_partition(table, table, tag)
+        assert not status.OK()
+
+    def test_create_partition_partition_name_None(self, connect, table):
+        '''
+        target: test create partition, partition name set None, check status returned
+        method: call function: create_partition
+        expected: status not ok
+        '''
+        partition_name = None
+        status = connect.create_partition(table, partition_name, tag)
+        assert not status.OK()
+
+    def test_create_partition_tag_name_None(self, connect, table):
+        '''
+        target: test create partition, tag name set None, check status returned
+        method: call function: create_partition
+        expected: status ok
+        '''
+        tag_name = None
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag_name)
+        assert not status.OK()
+
+    def test_create_different_partition_tag_name_existed(self, connect, table):
+        '''
+        target: test create partition, and create the same partition tag again, check status returned
+        method: call function: create_partition with the same tag name
+        expected: status not ok
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        assert status.OK()
+        new_partition_name = gen_unique_str()
+        status = connect.create_partition(table, new_partition_name, tag)
+        assert not status.OK()
+
+    def test_create_partition_add_vectors(self, connect, table):
+        '''
+        target: test create partition, and insert vectors, check status returned
+        method: call function: create_partition
+        expected: status ok
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        assert status.OK()
+        nq = 100
+        vectors = gen_vectors(nq, dim)
+        ids = [i for i in range(nq)]
+        status, ids = connect.insert(table, vectors, ids)
+        assert status.OK()
+
+    def test_create_partition_insert_with_tag(self, connect, table):
+        '''
+        target: test create partition, and insert vectors, check status returned
+        method: call function: create_partition
+        expected: status ok
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        assert status.OK()
+        nq = 100
+        vectors = gen_vectors(nq, dim)
+        ids = [i for i in range(nq)]
+        status, ids = connect.insert(table, vectors, ids, partition_tag=tag)
+        assert status.OK()
+
+    def test_create_partition_insert_with_tag_not_existed(self, connect, table):
+        '''
+        target: test create partition, and insert vectors, check status returned
+        method: call function: create_partition
+        expected: status not ok
+        '''
+        tag_new = "tag_new"
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        assert status.OK()
+        nq = 100
+        vectors = gen_vectors(nq, dim)
+        ids = [i for i in range(nq)]
+        status, ids = connect.insert(table, vectors, ids, partition_tag=tag_new)
+        assert not status.OK()
+
+    def test_create_partition_insert_same_tags(self, connect, table):
+        '''
+        target: test create partition, and insert vectors, check status returned
+        method: call function: create_partition
+        expected: status ok
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        assert status.OK()
+        nq = 100
+        vectors = gen_vectors(nq, dim)
+        ids = [i for i in range(nq)]
+        status, ids = connect.insert(table, vectors, ids, partition_tag=tag)
+        ids = [(i+100) for i in range(nq)]
+        status, ids = connect.insert(table, vectors, ids, partition_tag=tag)
+        assert status.OK()
+        time.sleep(1)
+        status, res = connect.get_table_row_count(partition_name)
+        assert res == nq * 2
+
+    def test_create_partition_insert_same_tags_two_tables(self, connect, table):
+        '''
+        target: test create two partitions, and insert vectors with the same tag to each table, check status returned
+        method: call function: create_partition
+        expected: status ok, table length is correct
+        '''
+        partition_name = gen_unique_str()
+        table_new = gen_unique_str()
+        new_partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        assert status.OK()
+        param = {'table_name': table_new,
+                 'dimension': dim,
+                 'index_file_size': index_file_size,
+                 'metric_type': MetricType.L2}
+        status = connect.create_table(param)
+        status = connect.create_partition(table_new, new_partition_name, tag)
+        assert status.OK()
+        nq = 100
+        vectors = gen_vectors(nq, dim)
+        ids = [i for i in range(nq)]
+        status, ids = connect.insert(table, vectors, ids, partition_tag=tag)
+        ids = [(i+100) for i in range(nq)]
+        status, ids = connect.insert(table_new, vectors, ids, partition_tag=tag)
+        assert status.OK()
+        time.sleep(1)
+        status, res = connect.get_table_row_count(new_partition_name)
+        assert res == nq
+
+
+class TestShowBase:
+
+    """
+    ******************************************************************
+      The following cases are used to test `show_partitions` function
+    ******************************************************************
+    """
+    def test_show_partitions(self, connect, table):
+        '''
+        target: test show partitions, check status and partitions returned
+        method: create partition first, then call function: show_partitions
+        expected: status ok, partition correct
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        status, res = connect.show_partitions(table)
+        assert status.OK()
+
+    def test_show_partitions_no_partition(self, connect, table):
+        '''
+        target: test show partitions with table name, check status and partitions returned
+        method: call function: show_partitions
+        expected: status ok, partitions correct
+        '''
+        partition_name = gen_unique_str()
+        status, res = connect.show_partitions(table)
+        assert status.OK()
+
+    def test_show_partitions_no_partition_recursive(self, connect, table):
+        '''
+        target: test show partitions with partition name, check status and partitions returned
+        method: call function: show_partitions
+        expected: status ok, no partitions
+        '''
+        partition_name = gen_unique_str()
+        status, res = connect.show_partitions(partition_name)
+        assert status.OK()
+        assert len(res) == 0
+
+    def test_show_multi_partitions(self, connect, table):
+        '''
+        target: test show partitions, check status and partitions returned
+        method: create partitions first, then call function: show_partitions
+        expected: status ok, partitions correct
+        '''
+        partition_name = gen_unique_str()
+        new_partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        status = connect.create_partition(table, new_partition_name, tag)
+        status, res = connect.show_partitions(table)
+        assert status.OK()
+
+
+class TestDropBase:
+
+    """
+    ******************************************************************
+      The following cases are used to test `drop_partition` function
+    ******************************************************************
+    """
+    def test_drop_partition(self, connect, table):
+        '''
+        target: test drop partition, check status and partition if existed
+        method: create partitions first, then call function: drop_partition
+        expected: status ok, no partitions in db
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        status = connect.drop_partition(table, tag)
+        assert status.OK()
+        # check if the partition existed
+        status, res = connect.show_partitions(table)
+        assert partition_name not in res
+
+    def test_drop_partition_tag_not_existed(self, connect, table):
+        '''
+        target: test drop partition, but tag not existed
+        method: create partitions first, then call function: drop_partition
+        expected: status not ok
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        new_tag = "new_tag"
+        status = connect.drop_partition(table, new_tag)
+        assert not status.OK()
+
+    def test_drop_partition_tag_not_existed_A(self, connect, table):
+        '''
+        target: test drop partition, but table not existed
+        method: create partitions first, then call function: drop_partition
+        expected: status not ok
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        new_table = gen_unique_str()
+        status = connect.drop_partition(new_table, tag)
+        assert not status.OK()
+
+    def test_drop_partition_repeatedly(self, connect, table):
+        '''
+        target: test drop partition twice, check status and partition if existed
+        method: create partitions first, then call function: drop_partition
+        expected: status not ok, no partitions in db
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        status = connect.drop_partition(table, tag)
+        status = connect.drop_partition(table, tag)
+        time.sleep(2)
+        assert not status.OK()
+        status, res = connect.show_partitions(table)
+        assert partition_name not in res
+
+    def test_drop_partition_create(self, connect, table):
+        '''
+        target: test drop partition, and create again, check status
+        method: create partitions first, then call function: drop_partition, create_partition
+        expected: status not ok, partition in db
+        '''
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        status = connect.drop_partition(table, tag)
+        time.sleep(2)
+        status = connect.create_partition(table, partition_name, tag)
+        assert status.OK()
+        status, res = connect.show_partitions(table)
+        assert partition_name == res[0].partition_name
+
+
+class TestNameInvalid(object):
+    @pytest.fixture(
+        scope="function",
+        params=gen_invalid_table_names()
+    )
+    def get_partition_name(self, request):
+        yield request.param
+
+    @pytest.fixture(
+        scope="function",
+        params=gen_invalid_table_names()
+    )
+    def get_tag_name(self, request):
+        yield request.param
+
+    @pytest.fixture(
+        scope="function",
+        params=gen_invalid_table_names()
+    )
+    def get_table_name(self, request):
+        yield request.param
+
+    def test_create_partition_with_invalid_partition_name(self, connect, table, get_partition_name):
+        '''
+        target: test create partition, with invalid partition name, check status returned
+        method: call function: create_partition
+        expected: status not ok
+        '''
+        partition_name = get_partition_name
+        status = connect.create_partition(table, partition_name, tag)
+        assert not status.OK()
+
+    def test_create_partition_with_invalid_tag_name(self, connect, table):
+        '''
+        target: test create partition, with invalid tag name, check status returned
+        method: call function: create_partition
+        expected: status not ok
+        '''
+        tag_name = " "
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag_name)
+        assert not status.OK()
+
+    def test_drop_partition_with_invalid_table_name(self, connect, table, get_table_name):
+        '''
+        target: test drop partition, with invalid table name, check status returned
+        method: call function: drop_partition
+        expected: status not ok
+        '''
+        table_name = get_table_name
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        status = connect.drop_partition(table_name, tag)
+        assert not status.OK()
+
+    def test_drop_partition_with_invalid_tag_name(self, connect, table, get_tag_name):
+        '''
+        target: test drop partition, with invalid tag name, check status returned
+        method: call function: drop_partition
+        expected: status not ok
+        '''
+        tag_name = get_tag_name
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        status = connect.drop_partition(table, tag_name)
+        assert not status.OK()
+
+    def test_show_partitions_with_invalid_table_name(self, connect, table, get_table_name):
+        '''
+        target: test show partitions, with invalid table name, check status returned
+        method: call function: show_partitions
+        expected: status not ok
+        '''
+        table_name = get_table_name
+        partition_name = gen_unique_str()
+        status = connect.create_partition(table, partition_name, tag)
+        status, res = connect.show_partitions(table_name)
+        assert not status.OK()
\ No newline at end of file
connect, table): + ''' + target: test show partitions with partition name, check status and partitions returned + method: call function: show_partitions + expected: status ok, no partitions + ''' + partition_name = gen_unique_str() + status, res = connect.show_partitions(partition_name) + assert status.OK() + assert len(res) == 0 + + def test_show_multi_partitions(self, connect, table): + ''' + target: test show partitions, check status and partitions returned + method: create partitions first, then call function: show_partitions + expected: status ok, partitions correct + ''' + partition_name = gen_unique_str() + new_partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + status = connect.create_partition(table, new_partition_name, tag) + status, res = connect.show_partitions(table) + assert status.OK() + + +class TestDropBase: + + """ + ****************************************************************** + The following cases are used to test `drop_partition` function + ****************************************************************** + """ + def test_drop_partition(self, connect, table): + ''' + target: test drop partition, check status and partition if existed + method: create partitions first, then call function: drop_partition + expected: status ok, no partitions in db + ''' + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + status = connect.drop_partition(table, tag) + assert status.OK() + # check if the partition existed + status, res = connect.show_partitions(table) + assert partition_name not in res + + def test_drop_partition_tag_not_existed(self, connect, table): + ''' + target: test drop partition, but tag not existed + method: create partitions first, then call function: drop_partition + expected: status not ok + ''' + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + new_tag = "new_tag" + status = connect.drop_partition(table, new_tag) + assert not status.OK() + + def test_drop_partition_tag_not_existed_A(self, connect, table): + ''' + target: test drop partition, but table not existed + method: create partitions first, then call function: drop_partition + expected: status not ok + ''' + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + new_table = gen_unique_str() + status = connect.drop_partition(new_table, tag) + assert not status.OK() + + def test_drop_partition_repeatedly(self, connect, table): + ''' + target: test drop partition twice, check status and partition if existed + method: create partitions first, then call function: drop_partition + expected: status not ok, no partitions in db + ''' + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + status = connect.drop_partition(table, tag) + status = connect.drop_partition(table, tag) + time.sleep(2) + assert not status.OK() + status, res = connect.show_partitions(table) + assert partition_name not in res + + def test_drop_partition_create(self, connect, table): + ''' + target: test drop partition, and create again, check status + method: create partitions first, then call function: drop_partition, create_partition + expected: status not ok, partition in db + ''' + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + status = connect.drop_partition(table, tag) + time.sleep(2) + status = connect.create_partition(table, partition_name, 
tag) + assert status.OK() + status, res = connect.show_partitions(table) + assert partition_name == res[0].partition_name + + +class TestNameInvalid(object): + @pytest.fixture( + scope="function", + params=gen_invalid_table_names() + ) + def get_partition_name(self, request): + yield request.param + + @pytest.fixture( + scope="function", + params=gen_invalid_table_names() + ) + def get_tag_name(self, request): + yield request.param + + @pytest.fixture( + scope="function", + params=gen_invalid_table_names() + ) + def get_table_name(self, request): + yield request.param + + def test_create_partition_with_invalid_partition_name(self, connect, table, get_partition_name): + ''' + target: test create partition, with invalid partition name, check status returned + method: call function: create_partition + expected: status not ok + ''' + partition_name = get_partition_name + status = connect.create_partition(table, partition_name, tag) + assert not status.OK() + + def test_create_partition_with_invalid_tag_name(self, connect, table): + ''' + target: test create partition, with invalid partition name, check status returned + method: call function: create_partition + expected: status not ok + ''' + tag_name = " " + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag_name) + assert not status.OK() + + def test_drop_partition_with_invalid_table_name(self, connect, table, get_table_name): + ''' + target: test drop partition, with invalid table name, check status returned + method: call function: drop_partition + expected: status not ok + ''' + table_name = get_table_name + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + status = connect.drop_partition(table_name, tag) + assert not status.OK() + + def test_drop_partition_with_invalid_tag_name(self, connect, table, get_tag_name): + ''' + target: test drop partition, with invalid tag name, check status returned + method: call function: drop_partition + expected: status not ok + ''' + tag_name = get_tag_name + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + status = connect.drop_partition(table, tag_name) + assert not status.OK() + + def test_show_partitions_with_invalid_table_name(self, connect, table, get_table_name): + ''' + target: test show partitions, with invalid table name, check status returned + method: call function: show_partitions + expected: status not ok + ''' + table_name = get_table_name + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + status, res = connect.show_partitions(table_name) + assert not status.OK() \ No newline at end of file diff --git a/tests/milvus_python_test/test_search_vectors.py b/tests/milvus_python_test/test_search_vectors.py index 10892d6de3..e0b1bc09ea 100644 --- a/tests/milvus_python_test/test_search_vectors.py +++ b/tests/milvus_python_test/test_search_vectors.py @@ -16,8 +16,9 @@ add_interval_time = 2 vectors = gen_vectors(100, dim) # vectors /= numpy.linalg.norm(vectors) # vectors = vectors.tolist() -nrpobe = 1 +nprobe = 1 epsilon = 0.001 +tag = "1970-01-01" class TestSearchBase: @@ -49,6 +50,15 @@ class TestSearchBase: pytest.skip("sq8h not support in open source") return request.param + @pytest.fixture( + scope="function", + params=gen_simple_index_params() + ) + def get_simple_index_params(self, request, args): + if "internal" not in args: + if request.param["index_type"] == IndexType.IVF_SQ8H: + 
pytest.skip("sq8h not support in open source") + return request.param """ generate top-k params """ @@ -70,7 +80,7 @@ class TestSearchBase: query_vec = [vectors[0]] top_k = get_top_k nprobe = 1 - status, result = connect.search_vectors(table, top_k, nrpobe, query_vec) + status, result = connect.search_vectors(table, top_k, nprobe, query_vec) if top_k <= 2048: assert status.OK() assert len(result[0]) == min(len(vectors), top_k) @@ -85,7 +95,6 @@ class TestSearchBase: method: search with the given vectors, check the result expected: search status ok, and the length of the result is top_k ''' - index_params = get_index_params logging.getLogger().info(index_params) vectors, ids = self.init_data(connect, table) @@ -93,7 +102,7 @@ class TestSearchBase: query_vec = [vectors[0]] top_k = 10 nprobe = 1 - status, result = connect.search_vectors(table, top_k, nrpobe, query_vec) + status, result = connect.search_vectors(table, top_k, nprobe, query_vec) logging.getLogger().info(result) if top_k <= 1024: assert status.OK() @@ -103,6 +112,160 @@ class TestSearchBase: else: assert not status.OK() + def test_search_l2_index_params_partition(self, connect, table, get_simple_index_params): + ''' + target: test basic search fuction, all the search params is corrent, test all index params, and build + method: add vectors into table, search with the given vectors, check the result + expected: search status ok, and the length of the result is top_k, search table with partition tag return empty + ''' + index_params = get_simple_index_params + logging.getLogger().info(index_params) + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + vectors, ids = self.init_data(connect, table) + status = connect.create_index(table, index_params) + query_vec = [vectors[0]] + top_k = 10 + nprobe = 1 + status, result = connect.search_vectors(table, top_k, nprobe, query_vec) + logging.getLogger().info(result) + assert status.OK() + assert len(result[0]) == min(len(vectors), top_k) + assert check_result(result[0], ids[0]) + assert result[0][0].distance <= epsilon + status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=[tag]) + logging.getLogger().info(result) + assert status.OK() + assert len(result) == 0 + + def test_search_l2_index_params_partition_A(self, connect, table, get_simple_index_params): + ''' + target: test basic search fuction, all the search params is corrent, test all index params, and build + method: search partition with the given vectors, check the result + expected: search status ok, and the length of the result is 0 + ''' + index_params = get_simple_index_params + logging.getLogger().info(index_params) + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + vectors, ids = self.init_data(connect, table) + status = connect.create_index(table, index_params) + query_vec = [vectors[0]] + top_k = 10 + nprobe = 1 + status, result = connect.search_vectors(partition_name, top_k, nprobe, query_vec, partition_tags=[tag]) + logging.getLogger().info(result) + assert status.OK() + assert len(result) == 0 + + def test_search_l2_index_params_partition_B(self, connect, table, get_simple_index_params): + ''' + target: test basic search fuction, all the search params is corrent, test all index params, and build + method: search with the given vectors, check the result + expected: search status ok, and the length of the result is top_k + ''' + index_params = get_simple_index_params + 
+ logging.getLogger().info(index_params) + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + vectors, ids = self.init_data(connect, partition_name) + status = connect.create_index(table, index_params) + query_vec = [vectors[0]] + top_k = 10 + nprobe = 1 + status, result = connect.search_vectors(table, top_k, nprobe, query_vec) + logging.getLogger().info(result) + assert status.OK() + assert len(result[0]) == min(len(vectors), top_k) + assert check_result(result[0], ids[0]) + assert result[0][0].distance <= epsilon + status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=[tag]) + logging.getLogger().info(result) + assert status.OK() + assert len(result[0]) == min(len(vectors), top_k) + assert check_result(result[0], ids[0]) + assert result[0][0].distance <= epsilon + status, result = connect.search_vectors(partition_name, top_k, nprobe, query_vec, partition_tags=[tag]) + logging.getLogger().info(result) + assert status.OK() + assert len(result) == 0 + + def test_search_l2_index_params_partition_C(self, connect, table, get_simple_index_params): + ''' + target: test basic search function, all the search params are correct, test all index params, and build + method: search with the given vectors and tags (one of the tags does not exist in the table), check the result + expected: search status ok, and the length of the result is top_k + ''' + index_params = get_simple_index_params + logging.getLogger().info(index_params) + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + vectors, ids = self.init_data(connect, partition_name) + status = connect.create_index(table, index_params) + query_vec = [vectors[0]] + top_k = 10 + nprobe = 1 + status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=[tag, "new_tag"]) + logging.getLogger().info(result) + assert status.OK() + assert len(result[0]) == min(len(vectors), top_k) + assert check_result(result[0], ids[0]) + assert result[0][0].distance <= epsilon + + def test_search_l2_index_params_partition_D(self, connect, table, get_simple_index_params): + ''' + target: test basic search function, all the search params are correct, test all index params, and build + method: search with the given vectors and tag (tag name does not exist in the table), check the result + expected: search status ok, and the result is empty + ''' + index_params = get_simple_index_params + logging.getLogger().info(index_params) + partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag) + vectors, ids = self.init_data(connect, partition_name) + status = connect.create_index(table, index_params) + query_vec = [vectors[0]] + top_k = 10 + nprobe = 1 + status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=["new_tag"]) + logging.getLogger().info(result) + assert status.OK() + assert len(result) == 0 + + def test_search_l2_index_params_partition_E(self, connect, table, get_simple_index_params): + ''' + target: test basic search function, all the search params are correct, test all index params, and build + method: search table with the given vectors and tags, check the result + expected: search status ok, and the length of the result is top_k + ''' + new_tag = "new_tag" + index_params = get_simple_index_params + logging.getLogger().info(index_params) + partition_name = gen_unique_str() + new_partition_name = gen_unique_str() + status = connect.create_partition(table, partition_name, tag)
+ status = connect.create_partition(table, new_partition_name, new_tag) + vectors, ids = self.init_data(connect, partition_name) + new_vectors, new_ids = self.init_data(connect, new_partition_name, nb=1000) + status = connect.create_index(table, index_params) + query_vec = [vectors[0], new_vectors[0]] + top_k = 10 + nprobe = 1 + status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=[tag, new_tag]) + logging.getLogger().info(result) + assert status.OK() + assert len(result[0]) == min(len(vectors), top_k) + assert check_result(result[0], ids[0]) + assert check_result(result[1], new_ids[0]) + assert result[0][0].distance <= epsilon + assert result[1][0].distance <= epsilon + status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=[new_tag]) + logging.getLogger().info(result) + assert status.OK() + assert len(result[0]) == min(len(vectors), top_k) + assert check_result(result[1], new_ids[0]) + assert result[1][0].distance <= epsilon + def test_search_ip_index_params(self, connect, ip_table, get_index_params): ''' target: test basic search function, all the search params are correct, test all index params, and build @@ -117,7 +280,7 @@ class TestSearchBase: query_vec = [vectors[0]] top_k = 10 nprobe = 1 - status, result = connect.search_vectors(ip_table, top_k, nrpobe, query_vec) + status, result = connect.search_vectors(ip_table, top_k, nprobe, query_vec) logging.getLogger().info(result) if top_k <= 1024: @@ -128,6 +291,59 @@ class TestSearchBase: else: assert not status.OK() + def test_search_ip_index_params_partition(self, connect, ip_table, get_simple_index_params): + ''' + target: test basic search function, all the search params are correct, test all index params, and build + method: search with the given vectors, check the result + expected: search status ok, and the length of the result is top_k; searching with the partition tag returns empty + ''' + index_params = get_simple_index_params + logging.getLogger().info(index_params) + partition_name = gen_unique_str() + status = connect.create_partition(ip_table, partition_name, tag) + vectors, ids = self.init_data(connect, ip_table) + status = connect.create_index(ip_table, index_params) + query_vec = [vectors[0]] + top_k = 10 + nprobe = 1 + status, result = connect.search_vectors(ip_table, top_k, nprobe, query_vec) + logging.getLogger().info(result) + assert status.OK() + assert len(result[0]) == min(len(vectors), top_k) + assert check_result(result[0], ids[0]) + assert abs(result[0][0].distance - numpy.inner(numpy.array(query_vec[0]), numpy.array(query_vec[0]))) <= gen_inaccuracy(result[0][0].distance) + status, result = connect.search_vectors(ip_table, top_k, nprobe, query_vec, partition_tags=[tag]) + logging.getLogger().info(result) + assert status.OK() + assert len(result) == 0 + + def test_search_ip_index_params_partition_A(self, connect, ip_table, get_simple_index_params): + ''' + target: test basic search function, all the search params are correct, test all index params, and build + method: search with the given vectors and tag, check the result + expected: search status ok, and the length of the result is top_k + ''' + index_params = get_simple_index_params + logging.getLogger().info(index_params) + partition_name = gen_unique_str() + status = connect.create_partition(ip_table, partition_name, tag) + vectors, ids = self.init_data(connect, partition_name) + status = connect.create_index(ip_table, index_params) + query_vec = [vectors[0]] + top_k = 10 + nprobe = 1
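+ # with the IP metric, the top match of a query against the data it was drawn from should score close to the vector's self inner product; the distance assertions below check this within the gen_inaccuracy tolerance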
+ status, result = connect.search_vectors(ip_table, top_k, nprobe, query_vec, partition_tags=[tag]) + logging.getLogger().info(result) + assert status.OK() + assert len(result[0]) == min(len(vectors), top_k) + assert check_result(result[0], ids[0]) + assert abs(result[0][0].distance - numpy.inner(numpy.array(query_vec[0]), numpy.array(query_vec[0]))) <= gen_inaccuracy(result[0][0].distance) + status, result = connect.search_vectors(partition_name, top_k, nprobe, query_vec) + logging.getLogger().info(result) + assert status.OK() + assert len(result[0]) == min(len(vectors), top_k) + assert check_result(result[0], ids[0]) + @pytest.mark.level(2) def test_search_vectors_without_connect(self, dis_connect, table): ''' @@ -518,6 +734,14 @@ class TestSearchParamsInvalid(object): status, result = connect.search_vectors(table_name, top_k, nprobe, query_vecs) assert not status.OK() + @pytest.mark.level(1) + def test_search_with_invalid_tag_format(self, connect, table): + top_k = 1 + nprobe = 1 + query_vecs = gen_vectors(1, dim) + with pytest.raises(Exception) as e: + status, result = connect.search_vectors(table_name, top_k, nprobe, query_vecs, partition_tags="tag") + """ Test search table with invalid top-k """ @@ -574,7 +798,7 @@ class TestSearchParamsInvalid(object): yield request.param @pytest.mark.level(1) - def test_search_with_invalid_nrpobe(self, connect, table, get_nprobes): + def test_search_with_invalid_nprobe(self, connect, table, get_nprobes): ''' target: test search function, with the wrong nprobe method: search with an invalid nprobe @@ -592,7 +816,7 @@ class TestSearchParamsInvalid(object): status, result = connect.search_vectors(table, top_k, nprobe, query_vecs) @pytest.mark.level(2) - def test_search_with_invalid_nrpobe_ip(self, connect, ip_table, get_nprobes): + def test_search_with_invalid_nprobe_ip(self, connect, ip_table, get_nprobes): ''' target: test search function, with the wrong nprobe method: search with an invalid nprobe diff --git a/tests/milvus_python_test/test_table.py b/tests/milvus_python_test/test_table.py index 6af38bac15..40b0850859 100644 --- a/tests/milvus_python_test/test_table.py +++ b/tests/milvus_python_test/test_table.py @@ -297,7 +297,7 @@ class TestTable: ''' table_name = gen_unique_str("test_table") status = connect.delete_table(table_name) - assert not status.code==0 + assert not status.OK() def test_delete_table_repeatedly(self, connect): ''' diff --git a/tests/milvus_python_test/test_table_count.py b/tests/milvus_python_test/test_table_count.py index 4e8a780c62..77780c8faa 100644 --- a/tests/milvus_python_test/test_table_count.py +++ b/tests/milvus_python_test/test_table_count.py @@ -13,8 +13,8 @@ from milvus import IndexType, MetricType dim = 128 index_file_size = 10 -add_time_interval = 5 - +add_time_interval = 3 +tag = "1970-01-01" class TestTableCount: """ @@ -58,6 +58,90 @@ class TestTableCount: status, res = connect.get_table_row_count(table) assert res == nb + def test_table_rows_count_partition(self, connect, table, add_vectors_nb): + ''' + target: test table rows_count is correct or not + method: create table, create partition and add vectors into it, + assert the value returned by get_table_row_count method is equal to length of vectors + expected: the count is equal to the length of vectors + ''' + nb = add_vectors_nb + partition_name = gen_unique_str() + vectors = gen_vectors(nb, dim) + status = connect.create_partition(table, partition_name, tag) + assert status.OK() + res = connect.add_vectors(table_name=table, records=vectors, partition_tag=tag)
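+ # vectors added under a partition tag are still counted against the parent table; the sleep below gives the server time to flush buffered vectors before get_table_row_count is called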
time.sleep(add_time_interval) + status, res = connect.get_table_row_count(table) + assert res == nb + + def test_table_rows_count_multi_partitions_A(self, connect, table, add_vectors_nb): + ''' + target: test table rows_count is correct or not + method: create table, create partitions and add vectors in it, + assert the value returned by get_table_row_count method is equal to length of vectors + expected: the count is equal to the length of vectors + ''' + new_tag = "new_tag" + nb = add_vectors_nb + partition_name = gen_unique_str() + new_partition_name = gen_unique_str() + vectors = gen_vectors(nb, dim) + status = connect.create_partition(table, partition_name, tag) + status = connect.create_partition(table, new_partition_name, new_tag) + assert status.OK() + res = connect.add_vectors(table_name=table, records=vectors) + time.sleep(add_time_interval) + status, res = connect.get_table_row_count(table) + assert res == nb + + def test_table_rows_count_multi_partitions_B(self, connect, table, add_vectors_nb): + ''' + target: test table rows_count is correct or not + method: create table, create partitions and add vectors in one of the partitions, + assert the value returned by get_table_row_count method is equal to length of vectors + expected: the count is equal to the length of vectors + ''' + new_tag = "new_tag" + nb = add_vectors_nb + partition_name = gen_unique_str() + new_partition_name = gen_unique_str() + vectors = gen_vectors(nb, dim) + status = connect.create_partition(table, partition_name, tag) + status = connect.create_partition(table, new_partition_name, new_tag) + assert status.OK() + res = connect.add_vectors(table_name=table, records=vectors, partition_tag=tag) + time.sleep(add_time_interval) + status, res = connect.get_table_row_count(partition_name) + assert res == nb + status, res = connect.get_table_row_count(new_partition_name) + assert res == 0 + + def test_table_rows_count_multi_partitions_C(self, connect, table, add_vectors_nb): + ''' + target: test table rows_count is correct or not + method: create table, create partitions and add vectors in one of the partitions, + assert the value returned by get_table_row_count method is equal to length of vectors + expected: the table count is equal to the length of vectors + ''' + new_tag = "new_tag" + nb = add_vectors_nb + partition_name = gen_unique_str() + new_partition_name = gen_unique_str() + vectors = gen_vectors(nb, dim) + status = connect.create_partition(table, partition_name, tag) + status = connect.create_partition(table, new_partition_name, new_tag) + assert status.OK() + res = connect.add_vectors(table_name=table, records=vectors, partition_tag=tag) + res = connect.add_vectors(table_name=table, records=vectors, partition_tag=new_tag) + time.sleep(add_time_interval) + status, res = connect.get_table_row_count(partition_name) + assert res == nb + status, res = connect.get_table_row_count(new_partition_name) + assert res == nb + status, res = connect.get_table_row_count(table) + assert res == nb * 2 + def test_table_rows_count_after_index_created(self, connect, table, get_simple_index_params): ''' target: test get_table_row_count, after index have been created From a9bc655cfb5dbcd06131ca4fe70fc26816cb2c51 Mon Sep 17 00:00:00 2001 From: wxyu Date: Thu, 21 Nov 2019 17:26:11 +0800 Subject: [PATCH 14/32] Read gpu config only gpu_resource_config.enable=true fix #467 --- CHANGELOG.md | 1 + core/src/scheduler/SchedInst.cpp | 52 ++++++++++++++------------- core/src/scheduler/SchedInst.h | 15 +++++--- 
core/src/server/Config.cpp | 42 +++++++++++----------- core/src/wrapper/KnowhereResource.cpp | 12 +++++-- 5 files changed, 71 insertions(+), 51 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index a8b243546e..c68f655077 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -18,6 +18,7 @@ Please mark all change in change log and use the ticket from JIRA. - \#412 - Message returned is confused when partition created with null partition name - \#416 - Drop the same partition success repeatedly - \#440 - Query API in customization still uses old version +- \#467 - Server cannot startup with gpu_resource_config.enable=false in GPU version ## Feature - \#12 - Pure CPU version for Milvus diff --git a/core/src/scheduler/SchedInst.cpp b/core/src/scheduler/SchedInst.cpp index 69d293f986..f86b6f44b3 100644 --- a/core/src/scheduler/SchedInst.cpp +++ b/core/src/scheduler/SchedInst.cpp @@ -54,36 +54,40 @@ load_simple_config() { // get resources #ifdef MILVUS_GPU_VERSION + bool enable_gpu = false; server::Config& config = server::Config::GetInstance(); - std::vector gpu_ids; - config.GetGpuResourceConfigSearchResources(gpu_ids); - std::vector build_gpu_ids; - config.GetGpuResourceConfigBuildIndexResources(build_gpu_ids); - auto pcie = Connection("pcie", 12000); + config.GetGpuResourceConfigEnable(enable_gpu); + if (enable_gpu) { + std::vector gpu_ids; + config.GetGpuResourceConfigSearchResources(gpu_ids); + std::vector build_gpu_ids; + config.GetGpuResourceConfigBuildIndexResources(build_gpu_ids); + auto pcie = Connection("pcie", 12000); - std::vector not_find_build_ids; - for (auto& build_id : build_gpu_ids) { - bool find_gpu_id = false; - for (auto& gpu_id : gpu_ids) { - if (gpu_id == build_id) { - find_gpu_id = true; - break; + std::vector not_find_build_ids; + for (auto& build_id : build_gpu_ids) { + bool find_gpu_id = false; + for (auto& gpu_id : gpu_ids) { + if (gpu_id == build_id) { + find_gpu_id = true; + break; + } + } + if (not find_gpu_id) { + not_find_build_ids.emplace_back(build_id); } } - if (not find_gpu_id) { - not_find_build_ids.emplace_back(build_id); + + for (auto& gpu_id : gpu_ids) { + ResMgrInst::GetInstance()->Add(ResourceFactory::Create(std::to_string(gpu_id), "GPU", gpu_id, true, true)); + ResMgrInst::GetInstance()->Connect("cpu", std::to_string(gpu_id), pcie); } - } - for (auto& gpu_id : gpu_ids) { - ResMgrInst::GetInstance()->Add(ResourceFactory::Create(std::to_string(gpu_id), "GPU", gpu_id, true, true)); - ResMgrInst::GetInstance()->Connect("cpu", std::to_string(gpu_id), pcie); - } - - for (auto& not_find_id : not_find_build_ids) { - ResMgrInst::GetInstance()->Add( - ResourceFactory::Create(std::to_string(not_find_id), "GPU", not_find_id, true, true)); - ResMgrInst::GetInstance()->Connect("cpu", std::to_string(not_find_id), pcie); + for (auto& not_find_id : not_find_build_ids) { + ResMgrInst::GetInstance()->Add( + ResourceFactory::Create(std::to_string(not_find_id), "GPU", not_find_id, true, true)); + ResMgrInst::GetInstance()->Connect("cpu", std::to_string(not_find_id), pcie); + } } #endif } diff --git a/core/src/scheduler/SchedInst.h b/core/src/scheduler/SchedInst.h index dc2d5ade35..6273af7a9f 100644 --- a/core/src/scheduler/SchedInst.h +++ b/core/src/scheduler/SchedInst.h @@ -102,11 +102,16 @@ class OptimizerInst { if (instance == nullptr) { std::vector pass_list; #ifdef MILVUS_GPU_VERSION - pass_list.push_back(std::make_shared()); - pass_list.push_back(std::make_shared()); - pass_list.push_back(std::make_shared()); - pass_list.push_back(std::make_shared()); -
pass_list.push_back(std::make_shared()); + bool enable_gpu = false; + server::Config& config = server::Config::GetInstance(); + config.GetGpuResourceConfigEnable(enable_gpu); + if (enable_gpu) { + pass_list.push_back(std::make_shared()); + pass_list.push_back(std::make_shared()); + pass_list.push_back(std::make_shared()); + pass_list.push_back(std::make_shared()); + pass_list.push_back(std::make_shared()); + } #endif pass_list.push_back(std::make_shared()); instance = std::make_shared(pass_list); diff --git a/core/src/server/Config.cpp b/core/src/server/Config.cpp index f3efcff0cc..5465c6c505 100644 --- a/core/src/server/Config.cpp +++ b/core/src/server/Config.cpp @@ -189,35 +189,37 @@ Config::ValidateConfig() { } /* gpu resource config */ -#ifdef MILVUS_GPU_VERSION bool gpu_resource_enable; s = GetGpuResourceConfigEnable(gpu_resource_enable); if (!s.ok()) { return s; } - int64_t resource_cache_capacity; - s = GetGpuResourceConfigCacheCapacity(resource_cache_capacity); - if (!s.ok()) { - return s; - } +#ifdef MILVUS_GPU_VERSION + if (gpu_resource_enable) { + int64_t resource_cache_capacity; + s = GetGpuResourceConfigCacheCapacity(resource_cache_capacity); + if (!s.ok()) { + return s; + } - float resource_cache_threshold; - s = GetGpuResourceConfigCacheThreshold(resource_cache_threshold); - if (!s.ok()) { - return s; - } + float resource_cache_threshold; + s = GetGpuResourceConfigCacheThreshold(resource_cache_threshold); + if (!s.ok()) { + return s; + } - std::vector search_resources; - s = GetGpuResourceConfigSearchResources(search_resources); - if (!s.ok()) { - return s; - } + std::vector search_resources; + s = GetGpuResourceConfigSearchResources(search_resources); + if (!s.ok()) { + return s; + } - std::vector index_build_resources; - s = GetGpuResourceConfigBuildIndexResources(index_build_resources); - if (!s.ok()) { - return s; + std::vector index_build_resources; + s = GetGpuResourceConfigBuildIndexResources(index_build_resources); + if (!s.ok()) { + return s; + } } #endif diff --git a/core/src/wrapper/KnowhereResource.cpp b/core/src/wrapper/KnowhereResource.cpp index 5a2296b16e..42105777aa 100644 --- a/core/src/wrapper/KnowhereResource.cpp +++ b/core/src/wrapper/KnowhereResource.cpp @@ -37,6 +37,16 @@ constexpr int64_t M_BYTE = 1024 * 1024; Status KnowhereResource::Initialize() { #ifdef MILVUS_GPU_VERSION + Status s; + bool enable_gpu = false; + server::Config& config = server::Config::GetInstance(); + s = config.GetGpuResourceConfigEnable(enable_gpu); + if (!s.ok()) + return s; + + if (not enable_gpu) + return Status::OK(); + struct GpuResourceSetting { int64_t pinned_memory = 300 * M_BYTE; int64_t temp_memory = 300 * M_BYTE; @@ -44,10 +54,8 @@ KnowhereResource::Initialize() { }; using GpuResourcesArray = std::map; GpuResourcesArray gpu_resources; - Status s; // get build index gpu resource - server::Config& config = server::Config::GetInstance(); std::vector build_index_gpus; s = config.GetGpuResourceConfigBuildIndexResources(build_index_gpus); if (!s.ok()) From 29d9ac4954b88b087116706108ada657b548e0cd Mon Sep 17 00:00:00 2001 From: zhenwu Date: Fri, 22 Nov 2019 11:13:43 +0800 Subject: [PATCH 15/32] [skip ci] Add ann-dataset accuracy pipeline --- .../ci/function/file_transfer.groovy | 10 ++ .../ci/jenkinsfile/acc_test.groovy | 16 ++ .../ci/jenkinsfile/cleanup.groovy | 13 ++ .../jenkinsfile/deploy_default_server.groovy | 22 +++ .../ci/jenkinsfile/notify.groovy | 15 ++ tests/milvus_ann_acc/ci/main_jenkinsfile | 130 ++++++++++++++ .../pod_containers/milvus-testframework.yaml | 13 ++ 
tests/milvus_ann_acc/client.py | 33 ++-- tests/milvus_ann_acc/main.py | 65 +++++-- tests/milvus_ann_acc/requirements.txt | 5 + tests/milvus_ann_acc/runner.py | 162 ++++++++++++++++++ tests/milvus_ann_acc/suite.yaml | 29 ++++ tests/milvus_ann_acc/suite.yaml.bak | 11 ++ tests/milvus_ann_acc/suite_czr.yaml | 20 +++ tests/milvus_ann_acc/suite_debug.yaml | 10 ++ tests/milvus_ann_acc/test.py | 157 ++++------------- 16 files changed, 547 insertions(+), 164 deletions(-) create mode 100644 tests/milvus_ann_acc/ci/function/file_transfer.groovy create mode 100644 tests/milvus_ann_acc/ci/jenkinsfile/acc_test.groovy create mode 100644 tests/milvus_ann_acc/ci/jenkinsfile/cleanup.groovy create mode 100644 tests/milvus_ann_acc/ci/jenkinsfile/deploy_default_server.groovy create mode 100644 tests/milvus_ann_acc/ci/jenkinsfile/notify.groovy create mode 100644 tests/milvus_ann_acc/ci/main_jenkinsfile create mode 100644 tests/milvus_ann_acc/ci/pod_containers/milvus-testframework.yaml create mode 100644 tests/milvus_ann_acc/runner.py create mode 100644 tests/milvus_ann_acc/suite.yaml create mode 100644 tests/milvus_ann_acc/suite.yaml.bak create mode 100644 tests/milvus_ann_acc/suite_czr.yaml create mode 100644 tests/milvus_ann_acc/suite_debug.yaml diff --git a/tests/milvus_ann_acc/ci/function/file_transfer.groovy b/tests/milvus_ann_acc/ci/function/file_transfer.groovy new file mode 100644 index 0000000000..bebae14832 --- /dev/null +++ b/tests/milvus_ann_acc/ci/function/file_transfer.groovy @@ -0,0 +1,10 @@ +def FileTransfer (sourceFiles, remoteDirectory, remoteIP, protocol = "ftp", makeEmptyDirs = true) { + if (protocol == "ftp") { + ftpPublisher masterNodeName: '', paramPublish: [parameterName: ''], alwaysPublishFromMaster: false, continueOnError: false, failOnError: true, publishers: [ + [configName: "${remoteIP}", transfers: [ + [asciiMode: false, cleanRemote: false, excludes: '', flatten: false, makeEmptyDirs: "${makeEmptyDirs}", noDefaultExcludes: false, patternSeparator: '[, ]+', remoteDirectory: "${remoteDirectory}", remoteDirectorySDF: false, removePrefix: '', sourceFiles: "${sourceFiles}"]], usePromotionTimestamp: true, useWorkspaceInPromotion: false, verbose: true + ] + ] + } +} +return this diff --git a/tests/milvus_ann_acc/ci/jenkinsfile/acc_test.groovy b/tests/milvus_ann_acc/ci/jenkinsfile/acc_test.groovy new file mode 100644 index 0000000000..1ce327b802 --- /dev/null +++ b/tests/milvus_ann_acc/ci/jenkinsfile/acc_test.groovy @@ -0,0 +1,16 @@ +timeout(time: 1800, unit: 'MINUTES') { + try { + dir ("milvu_ann_acc") { + print "Git clone url: ${TEST_URL}:${TEST_BRANCH}" + checkout([$class: 'GitSCM', branches: [[name: "${TEST_BRANCH}"]], doGenerateSubmoduleConfigurations: false, extensions: [], submoduleCfg: [], userRemoteConfigs: [[credentialsId: "${params.GIT_USER}", url: "${TEST_URL}", name: 'origin', refspec: "+refs/heads/${TEST_BRANCH}:refs/remotes/origin/${TEST_BRANCH}"]]]) + print "Install requirements" + sh 'python3 -m pip install -r requirements.txt -i http://pypi.douban.com/simple --trusted-host pypi.douban.com' + // sleep(120000) + sh "python3 main.py --suite=${params.SUITE} --host=acc-test-${env.JOB_NAME}-${env.BUILD_NUMBER}-engine.milvus.svc.cluster.local --port=19530" + } + } catch (exc) { + echo 'Milvus Ann Accuracy Test Failed !' 
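+ // rethrow so the stage is marked as failed and the pipeline's cleanup and post steps still run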
+ throw exc + } +} + diff --git a/tests/milvus_ann_acc/ci/jenkinsfile/cleanup.groovy b/tests/milvus_ann_acc/ci/jenkinsfile/cleanup.groovy new file mode 100644 index 0000000000..2e9332fa6e --- /dev/null +++ b/tests/milvus_ann_acc/ci/jenkinsfile/cleanup.groovy @@ -0,0 +1,13 @@ +try { + def result = sh script: "helm status ${env.JOB_NAME}-${env.BUILD_NUMBER}", returnStatus: true + if (!result) { + sh "helm del --purge ${env.JOB_NAME}-${env.BUILD_NUMBER}" + } +} catch (exc) { + def result = sh script: "helm status ${env.JOB_NAME}-${env.BUILD_NUMBER}", returnStatus: true + if (!result) { + sh "helm del --purge ${env.JOB_NAME}-${env.BUILD_NUMBER}" + } + throw exc +} + diff --git a/tests/milvus_ann_acc/ci/jenkinsfile/deploy_default_server.groovy b/tests/milvus_ann_acc/ci/jenkinsfile/deploy_default_server.groovy new file mode 100644 index 0000000000..951bb69941 --- /dev/null +++ b/tests/milvus_ann_acc/ci/jenkinsfile/deploy_default_server.groovy @@ -0,0 +1,22 @@ +timeout(time: 30, unit: 'MINUTES') { + try { + dir ("milvus") { + sh 'helm init --client-only --skip-refresh --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts' + sh 'helm repo update' + checkout([$class: 'GitSCM', branches: [[name: "${HELM_BRANCH}"]], userRemoteConfigs: [[url: "${HELM_URL}", name: 'origin', refspec: "+refs/heads/${HELM_BRANCH}:refs/remotes/origin/${HELM_BRANCH}"]]]) + dir ("milvus") { + sh "helm install --wait --timeout 300 --set engine.image.tag=${IMAGE_TAG} --set expose.type=clusterIP --name acc-test-${env.JOB_NAME}-${env.BUILD_NUMBER} -f ci/db_backend/sqlite_${params.IMAGE_TYPE}_values.yaml -f ci/filebeat/values.yaml --namespace milvus --version ${HELM_BRANCH} ." + } + } + // dir ("milvus") { + // checkout([$class: 'GitSCM', branches: [[name: "${env.SERVER_BRANCH}"]], userRemoteConfigs: [[url: "${env.SERVER_URL}", name: 'origin', refspec: "+refs/heads/${env.SERVER_BRANCH}:refs/remotes/origin/${env.SERVER_BRANCH}"]]]) + // dir ("milvus") { + // load "ci/jenkins/step/deploySingle2Dev.groovy" + // } + // } + } catch (exc) { + echo 'Deploy Milvus Server Failed !' 
+ throw exc + } +} + diff --git a/tests/milvus_ann_acc/ci/jenkinsfile/notify.groovy b/tests/milvus_ann_acc/ci/jenkinsfile/notify.groovy new file mode 100644 index 0000000000..0a257b8cd8 --- /dev/null +++ b/tests/milvus_ann_acc/ci/jenkinsfile/notify.groovy @@ -0,0 +1,15 @@ +def notify() { + if (!currentBuild.resultIsBetterOrEqualTo('SUCCESS')) { + // Send an email only if the build status has changed from green/unstable to red + emailext subject: '$DEFAULT_SUBJECT', + body: '$DEFAULT_CONTENT', + recipientProviders: [ + [$class: 'DevelopersRecipientProvider'], + [$class: 'RequesterRecipientProvider'] + ], + replyTo: '$DEFAULT_REPLYTO', + to: '$DEFAULT_RECIPIENTS' + } +} +return this + diff --git a/tests/milvus_ann_acc/ci/main_jenkinsfile b/tests/milvus_ann_acc/ci/main_jenkinsfile new file mode 100644 index 0000000000..9fdac4fc6e --- /dev/null +++ b/tests/milvus_ann_acc/ci/main_jenkinsfile @@ -0,0 +1,130 @@ +pipeline { + agent none + + options { + timestamps() + } + + parameters{ + choice choices: ['cpu', 'gpu'], description: 'cpu or gpu version', name: 'IMAGE_TYPE' + string defaultValue: '0.6.0', description: 'server image version', name: 'IMAGE_VERSION', trim: true + string defaultValue: 'suite.yaml', description: 'test suite config yaml', name: 'SUITE', trim: true + string defaultValue: '09509e53-9125-4f5d-9ce8-42855987ad67', description: 'git credentials', name: 'GIT_USER', trim: true + } + + environment { + IMAGE_TAG = "${params.IMAGE_VERSION}-${params.IMAGE_TYPE}-ubuntu18.04-release" + HELM_URL = "https://github.com/milvus-io/milvus-helm.git" + HELM_BRANCH = "0.6.0" + TEST_URL = "git@192.168.1.105:Test/milvus_ann_acc.git" + TEST_BRANCH = "0.6.0" + } + + stages { + stage("Setup env") { + agent { + kubernetes { + label 'dev-test' + defaultContainer 'jnlp' + yaml """ + apiVersion: v1 + kind: Pod + metadata: + labels: + app: milvus + componet: test + spec: + containers: + - name: milvus-testframework + image: registry.zilliz.com/milvus/milvus-test:v0.2 + command: + - cat + tty: true + volumeMounts: + - name: kubeconf + mountPath: /root/.kube/ + readOnly: true + - name: hdf5-path + mountPath: /test + readOnly: true + volumes: + - name: kubeconf + secret: + secretName: test-cluster-config + - name: hdf5-path + flexVolume: + driver: "fstab/cifs" + fsType: "cifs" + secretRef: + name: "cifs-test-secret" + options: + networkPath: "//192.168.1.126/test" + mountOptions: "vers=1.0" + """ + } + } + + stages { + stage("Deploy Default Server") { + steps { + gitlabCommitStatus(name: 'Accuracy Test') { + container('milvus-testframework') { + script { + print "In Deploy Default Server Stage" + load "${env.WORKSPACE}/ci/jenkinsfile/deploy_default_server.groovy" + } + } + } + } + } + stage("Acc Test") { + steps { + gitlabCommitStatus(name: 'Accuracy Test') { + container('milvus-testframework') { + script { + print "In Acc test stage" + load "${env.WORKSPACE}/ci/jenkinsfile/acc_test.groovy" + } + } + } + } + } + stage ("Cleanup Env") { + steps { + gitlabCommitStatus(name: 'Cleanup Env') { + container('milvus-testframework') { + script { + load "${env.WORKSPACE}/ci/jenkinsfile/cleanup.groovy" + } + } + } + } + } + } + post { + always { + container('milvus-testframework') { + script { + load "${env.WORKSPACE}/ci/jenkinsfile/cleanup.groovy" + } + } + } + success { + script { + echo "Milvus ann-accuracy test success !" + } + } + aborted { + script { + echo "Milvus ann-accuracy test aborted !" + } + } + failure { + script { + echo "Milvus ann-accuracy test failed !" 
+ } + } + } + } + } +} diff --git a/tests/milvus_ann_acc/ci/pod_containers/milvus-testframework.yaml b/tests/milvus_ann_acc/ci/pod_containers/milvus-testframework.yaml new file mode 100644 index 0000000000..6b1d6c7dfd --- /dev/null +++ b/tests/milvus_ann_acc/ci/pod_containers/milvus-testframework.yaml @@ -0,0 +1,13 @@ +apiVersion: v1 +kind: Pod +metadata: + labels: + app: milvus + componet: testframework +spec: + containers: + - name: milvus-testframework + image: registry.zilliz.com/milvus/milvus-test:v0.2 + command: + - cat + tty: true diff --git a/tests/milvus_ann_acc/client.py b/tests/milvus_ann_acc/client.py index de4ef17cb6..6fec829612 100644 --- a/tests/milvus_ann_acc/client.py +++ b/tests/milvus_ann_acc/client.py @@ -8,7 +8,7 @@ import numpy import sklearn.preprocessing from milvus import Milvus, IndexType, MetricType -logger = logging.getLogger("milvus_ann_acc.client") +logger = logging.getLogger("milvus_acc.client") SERVER_HOST_DEFAULT = "127.0.0.1" SERVER_PORT_DEFAULT = 19530 @@ -28,17 +28,17 @@ def time_wrapper(func): class MilvusClient(object): - def __init__(self, table_name=None, ip=None, port=None): + def __init__(self, table_name=None, host=None, port=None): self._milvus = Milvus() self._table_name = table_name try: - if not ip: + if not host: self._milvus.connect( host = SERVER_HOST_DEFAULT, port = SERVER_PORT_DEFAULT) else: self._milvus.connect( - host = ip, + host = host, port = port) except Exception as e: raise e @@ -113,7 +113,6 @@ class MilvusClient(object): X = X.astype(numpy.float32) status, results = self._milvus.search_vectors(self._table_name, top_k, nprobe, X.tolist()) self.check_status(status) - # logger.info(results[0]) ids = [] for result in results: tmp_ids = [] @@ -125,24 +124,20 @@ class MilvusClient(object): def count(self): return self._milvus.get_table_row_count(self._table_name)[1] - def delete(self, timeout=60): - logger.info("Start delete table: %s" % self._table_name) - self._milvus.delete_table(self._table_name) - i = 0 - while i < timeout: - if self.count(): - time.sleep(1) - i = i + 1 - else: - break - if i >= timeout: - logger.error("Delete table timeout") + def delete(self, table_name): + logger.info("Start delete table: %s" % table_name) + return self._milvus.delete_table(table_name) def describe(self): return self._milvus.describe_table(self._table_name) - def exists_table(self): - return self._milvus.has_table(self._table_name) + def exists_table(self, table_name): + return self._milvus.has_table(table_name) + + def get_server_version(self): + status, res = self._milvus.server_version() + self.check_status(status) + return res @time_wrapper def preload_table(self): diff --git a/tests/milvus_ann_acc/main.py b/tests/milvus_ann_acc/main.py index 308e8246c7..703303232d 100644 --- a/tests/milvus_ann_acc/main.py +++ b/tests/milvus_ann_acc/main.py @@ -1,26 +1,57 @@ - +import os +import sys import argparse +from yaml import load, dump +import logging +from logging import handlers +from client import MilvusClient +import runner + +LOG_FOLDER = "logs" +logger = logging.getLogger("milvus_acc") +formatter = logging.Formatter('[%(asctime)s] [%(levelname)-4s] [%(pathname)s:%(lineno)d] %(message)s') +if not os.path.exists(LOG_FOLDER): + os.system('mkdir -p %s' % LOG_FOLDER) +fileTimeHandler = handlers.TimedRotatingFileHandler(os.path.join(LOG_FOLDER, 'acc'), "D", 1, 10) +fileTimeHandler.suffix = "%Y%m%d.log" +fileTimeHandler.setFormatter(formatter) +logging.basicConfig(level=logging.DEBUG) +fileTimeHandler.setFormatter(formatter) 
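+# attach the timed rotating file handler (daily rotation, 10 backups kept) to the module logger so accuracy runs are persisted under the logs/ folder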
+logger.addHandler(fileTimeHandler) + + def main(): parser = argparse.ArgumentParser( formatter_class=argparse.ArgumentDefaultsHelpFormatter) parser.add_argument( - '--dataset', - metavar='NAME', - help='the dataset to load training points from', - default='glove-100-angular', - choices=DATASETS.keys()) + "--host", + default="127.0.0.1", + help="server host") parser.add_argument( - "-k", "--count", - default=10, - type=positive_int, - help="the number of near neighbours to search for") + "--port", + default=19530, + help="server port") parser.add_argument( - '--definitions', + '--suite', metavar='FILE', - help='load algorithm definitions from FILE', - default='algos.yaml') - parser.add_argument( - '--image-tag', - default=None, - help='pull image first') \ No newline at end of file + help='load config definitions from FILE', + default='suite_czr.yaml') + args = parser.parse_args() + if args.suite: + with open(args.suite, "r") as f: + suite = load(f) + hdf5_path = suite["hdf5_path"] + dataset_configs = suite["datasets"] + if not hdf5_path or not dataset_configs: + logger.warning("No datasets given") + sys.exit() + for dataset_config in dataset_configs: + logger.debug(dataset_config) + milvus_instance = MilvusClient(host=args.host, port=args.port) + runner.run(milvus_instance, dataset_config, hdf5_path) + + +if __name__ == "__main__": + main() \ No newline at end of file diff --git a/tests/milvus_ann_acc/requirements.txt b/tests/milvus_ann_acc/requirements.txt index 8c10e71b1f..1f2b337423 100644 --- a/tests/milvus_ann_acc/requirements.txt +++ b/tests/milvus_ann_acc/requirements.txt @@ -2,3 +2,8 @@ numpy==1.16.3 pymilvus>=0.2.0 scikit-learn==0.19.1 h5py==2.7.1 +influxdb==5.2.2 +pyyaml==3.12 +tableprint==0.8.0 +ansicolors==1.1.8 +scipy==1.3.1 \ No newline at end of file diff --git a/tests/milvus_ann_acc/runner.py b/tests/milvus_ann_acc/runner.py new file mode 100644 index 0000000000..88a5d24016 --- /dev/null +++ b/tests/milvus_ann_acc/runner.py @@ -0,0 +1,162 @@ +import os +import pdb +import time +import random +import sys +import logging +import h5py +import numpy +from influxdb import InfluxDBClient + +INSERT_INTERVAL = 100000 +# seconds +DELETE_INTERVAL_TIME = 5 +INFLUXDB_HOST = "192.168.1.194" +INFLUXDB_PORT = 8086 +INFLUXDB_USER = "admin" +INFLUXDB_PASSWD = "admin" +INFLUXDB_NAME = "test_result" +influxdb_client = InfluxDBClient(host=INFLUXDB_HOST, port=INFLUXDB_PORT, username=INFLUXDB_USER, password=INFLUXDB_PASSWD, database=INFLUXDB_NAME) + +logger = logging.getLogger("milvus_acc.runner") + + +def parse_dataset_name(dataset_name): + data_type = dataset_name.split("-")[0] + dimension = int(dataset_name.split("-")[1]) + metric = dataset_name.split("-")[-1] + # metric = dataset.attrs['distance'] + # dimension = len(dataset["train"][0]) + if metric == "euclidean": + metric_type = "l2" + elif metric == "angular": + metric_type = "ip" + return ("ann"+data_type, dimension, metric_type) + + +def get_dataset(hdf5_path, dataset_name): + file_path = os.path.join(hdf5_path, '%s.hdf5' % dataset_name) + if not os.path.exists(file_path): + raise Exception("%s does not exist" % file_path) + dataset = h5py.File(file_path) + return dataset + + +def get_table_name(hdf5_path, dataset_name, index_file_size): + data_type, dimension, metric_type = parse_dataset_name(dataset_name) + dataset = get_dataset(hdf5_path, dataset_name) + table_size = len(dataset["train"]) + table_size = str(table_size // 1000000)+"m" + table_name =
data_type+'_'+table_size+'_'+str(index_file_size)+'_'+str(dimension)+'_'+metric_type + return table_name + + +def recall_calc(result_ids, true_ids, top_k, recall_k): + sum_intersect_num = 0 + recall = 0.0 + for index, result_item in enumerate(result_ids): + if len(set(true_ids[index][:top_k])) != len(set(result_item)): + logger.warning("Error happened: query result length is wrong") + continue + tmp = set(true_ids[index][:recall_k]).intersection(set(result_item)) + sum_intersect_num = sum_intersect_num + len(tmp) + recall = round(sum_intersect_num / (len(result_ids) * recall_k), 4) + return recall + + +def run(milvus, config, hdf5_path, force=True): + server_version = milvus.get_server_version() + logger.info(server_version) + + for dataset_name, config_value in config.items(): + dataset = get_dataset(hdf5_path, dataset_name) + index_file_sizes = config_value["index_file_sizes"] + index_types = config_value["index_types"] + nlists = config_value["nlists"] + search_param = config_value["search_param"] + top_ks = search_param["top_ks"] + nprobes = search_param["nprobes"] + nqs = search_param["nqs"] + + for index_file_size in index_file_sizes: + table_name = get_table_name(hdf5_path, dataset_name, index_file_size) + if milvus.exists_table(table_name): + if force is True: + logger.info("Re-create table: %s" % table_name) + milvus.delete(table_name) + time.sleep(DELETE_INTERVAL_TIME) + else: + logger.warning("Table name: %s existed" % table_name) + continue + data_type, dimension, metric_type = parse_dataset_name(dataset_name) + milvus.create_table(table_name, dimension, index_file_size, metric_type) + logger.info(milvus.describe()) + insert_vectors = numpy.array(dataset["train"]) + # milvus.insert(insert_vectors) + + loops = len(insert_vectors) // INSERT_INTERVAL + 1 + for i in range(loops): + start = i*INSERT_INTERVAL + end = min((i+1)*INSERT_INTERVAL, len(insert_vectors)) + tmp_vectors = insert_vectors[start:end] + if start < end: + milvus.insert(tmp_vectors, ids=[i for i in range(start, end)]) + time.sleep(20) + row_count = milvus.count() + logger.info("Table: %s, row count: %s" % (table_name, row_count)) + if milvus.count() != len(insert_vectors): + logger.error("Table row count is not equal to insert vectors") + return + for index_type in index_types: + for nlist in nlists: + milvus.create_index(index_type, nlist) + logger.info(milvus.describe_index()) + logger.info("Start preload table: %s, index_type: %s, nlist: %s" % (table_name, index_type, nlist)) + milvus.preload_table() + true_ids = numpy.array(dataset["neighbors"]) + for nprobe in nprobes: + for nq in nqs: + query_vectors = numpy.array(dataset["test"][:nq]) + for top_k in top_ks: + rec1 = 0.0 + rec10 = 0.0 + rec100 = 0.0 + result_ids = milvus.query(query_vectors, top_k, nprobe) + logger.info("Query result: %s" % len(result_ids)) + rec1 = recall_calc(result_ids, true_ids, top_k, 1) + if top_k == 10: + rec10 = recall_calc(result_ids, true_ids, top_k, 10) + if top_k == 100: + rec10 = recall_calc(result_ids, true_ids, top_k, 10) + rec100 = recall_calc(result_ids, true_ids, top_k, 100) + avg_radio = recall_calc(result_ids, true_ids, top_k, top_k) + logger.debug("Recall_1: %s" % rec1) + logger.debug("Recall_10: %s" % rec10) + logger.debug("Recall_100: %s" % rec100) + logger.debug("Accuracy: %s" % avg_radio) + acc_record = [{ + "measurement": "accuracy", + "tags": { + "server_version": server_version, + "dataset": dataset_name, + "index_file_size": index_file_size, + "index_type": index_type, + "nlist": nlist, + "search_nprobe": nprobe, 
+ "top_k": top_k, + "nq": len(query_vectors) + }, + # "time": time.ctime(), + "time": time.strftime("%Y-%m-%dT%H:%M:%SZ"), + "fields": { + "recall1": rec1, + "recall10": rec10, + "recall100": rec100, + "avg_radio": avg_radio + } + }] + logger.info(acc_record) + try: + res = influxdb_client.write_points(acc_record) + except Exception as e: + logger.error("Insert infuxdb failed: %s" % str(e)) diff --git a/tests/milvus_ann_acc/suite.yaml b/tests/milvus_ann_acc/suite.yaml new file mode 100644 index 0000000000..1137ccfa64 --- /dev/null +++ b/tests/milvus_ann_acc/suite.yaml @@ -0,0 +1,29 @@ +datasets: + - sift-128-euclidean: + index_file_sizes: [50, 1024] + index_types: ['ivf_flat', 'ivf_sq8', 'ivf_sq8h'] + # index_types: ['ivf_sq8'] + nlists: [16384] + search_param: + nprobes: [1, 32, 128, 256] + top_ks: [10] + nqs: [10000] + - glove-25-angular: + index_file_sizes: [50, 1024] + index_types: ['ivf_flat', 'ivf_sq8', 'ivf_sq8h'] + # index_types: ['ivf_sq8'] + nlists: [16384] + search_param: + nprobes: [1, 32, 128, 256] + top_ks: [10] + nqs: [10000] + - glove-200-angular: + index_file_sizes: [50, 1024] + index_types: ['ivf_flat', 'ivf_sq8', 'ivf_sq8h'] + # index_types: ['ivf_sq8'] + nlists: [16384] + search_param: + nprobes: [1, 32, 128, 256] + top_ks: [10] + nqs: [10000] +hdf5_path: /test/milvus/ann_hdf5/ \ No newline at end of file diff --git a/tests/milvus_ann_acc/suite.yaml.bak b/tests/milvus_ann_acc/suite.yaml.bak new file mode 100644 index 0000000000..7736786d03 --- /dev/null +++ b/tests/milvus_ann_acc/suite.yaml.bak @@ -0,0 +1,11 @@ +datasets: + - glove-200-angular: + index_file_sizes: [1024] + index_types: ['ivf_sq8'] + # index_types: ['ivf_sq8'] + nlists: [16384] + search_param: + nprobes: [256, 400, 256] + top_ks: [100] + nqs: [10000] +hdf5_path: /test/milvus/ann_hdf5/ diff --git a/tests/milvus_ann_acc/suite_czr.yaml b/tests/milvus_ann_acc/suite_czr.yaml new file mode 100644 index 0000000000..7e2b0c8708 --- /dev/null +++ b/tests/milvus_ann_acc/suite_czr.yaml @@ -0,0 +1,20 @@ +datasets: + - sift-128-euclidean: + index_file_sizes: [1024] + index_types: ['ivf_sq8', 'ivf_sq8h'] + # index_types: ['ivf_sq8'] + nlists: [16384] + search_param: + nprobes: [16, 128, 1024] + top_ks: [1, 10, 100] + nqs: [10, 100, 1000] + - glove-200-angular: + index_file_sizes: [1024] + index_types: ['ivf_sq8', 'ivf_sq8h'] + # index_types: ['ivf_sq8'] + nlists: [16384] + search_param: + nprobes: [16, 128, 1024] + top_ks: [1, 10, 100] + nqs: [10, 100, 1000] +hdf5_path: /test/milvus/ann_hdf5/ \ No newline at end of file diff --git a/tests/milvus_ann_acc/suite_debug.yaml b/tests/milvus_ann_acc/suite_debug.yaml new file mode 100644 index 0000000000..ca463a9c40 --- /dev/null +++ b/tests/milvus_ann_acc/suite_debug.yaml @@ -0,0 +1,10 @@ +datasets: + - sift-128-euclidean: + index_file_sizes: [1024] + index_types: ['ivf_flat'] + nlists: [16384] + search_param: + nprobes: [1, 256] + top_ks: [10] + nqs: [10000] +hdf5_path: /test/milvus/ann_hdf5/ diff --git a/tests/milvus_ann_acc/test.py b/tests/milvus_ann_acc/test.py index c4fbc33195..44ffd53051 100644 --- a/tests/milvus_ann_acc/test.py +++ b/tests/milvus_ann_acc/test.py @@ -1,132 +1,33 @@ -import os -import pdb import time -import random -import sys -import h5py -import numpy -import logging -from logging import handlers +from influxdb import InfluxDBClient -from client import MilvusClient +INFLUXDB_HOST = "192.168.1.194" +INFLUXDB_PORT = 8086 +INFLUXDB_USER = "admin" +INFLUXDB_PASSWD = "admin" +INFLUXDB_NAME = "test_result" -LOG_FOLDER = "logs" -logger = 
logging.getLogger("milvus_ann_acc") +client = InfluxDBClient(host=INFLUXDB_HOST, port=INFLUXDB_PORT, username=INFLUXDB_USER, password=INFLUXDB_PASSWD, database=INFLUXDB_NAME) -formatter = logging.Formatter('[%(asctime)s] [%(levelname)-4s] [%(pathname)s:%(lineno)d] %(message)s') -if not os.path.exists(LOG_FOLDER): - os.system('mkdir -p %s' % LOG_FOLDER) -fileTimeHandler = handlers.TimedRotatingFileHandler(os.path.join(LOG_FOLDER, 'acc'), "D", 1, 10) -fileTimeHandler.suffix = "%Y%m%d.log" -fileTimeHandler.setFormatter(formatter) -logging.basicConfig(level=logging.DEBUG) -fileTimeHandler.setFormatter(formatter) -logger.addHandler(fileTimeHandler) - - -def get_dataset_fn(dataset_name): - file_path = "/test/milvus/ann_hdf5/" - if not os.path.exists(file_path): - raise Exception("%s not exists" % file_path) - return os.path.join(file_path, '%s.hdf5' % dataset_name) - - -def get_dataset(dataset_name): - hdf5_fn = get_dataset_fn(dataset_name) - hdf5_f = h5py.File(hdf5_fn) - return hdf5_f - - -def parse_dataset_name(dataset_name): - data_type = dataset_name.split("-")[0] - dimension = int(dataset_name.split("-")[1]) - metric = dataset_name.split("-")[-1] - # metric = dataset.attrs['distance'] - # dimension = len(dataset["train"][0]) - if metric == "euclidean": - metric_type = "l2" - elif metric == "angular": - metric_type = "ip" - return ("ann"+data_type, dimension, metric_type) - - -def get_table_name(dataset_name, index_file_size): - data_type, dimension, metric_type = parse_dataset_name(dataset_name) - dataset = get_dataset(dataset_name) - table_size = len(dataset["train"]) - table_size = str(table_size // 1000000)+"m" - table_name = data_type+'_'+table_size+'_'+str(index_file_size)+'_'+str(dimension)+'_'+metric_type - return table_name - - -def main(dataset_name, index_file_size, nlist=16384, force=False): - top_k = 10 - nprobes = [32, 128] - - dataset = get_dataset(dataset_name) - table_name = get_table_name(dataset_name, index_file_size) - m = MilvusClient(table_name) - if m.exists_table(): - if force is True: - logger.info("Re-create table: %s" % table_name) - m.delete() - time.sleep(10) - else: - logger.info("Table name: %s existed" % table_name) - return - data_type, dimension, metric_type = parse_dataset_name(dataset_name) - m.create_table(table_name, dimension, index_file_size, metric_type) - print(m.describe()) - vectors = numpy.array(dataset["train"]) - query_vectors = numpy.array(dataset["test"]) - # m.insert(vectors) - - interval = 100000 - loops = len(vectors) // interval + 1 - - for i in range(loops): - start = i*interval - end = min((i+1)*interval, len(vectors)) - tmp_vectors = vectors[start:end] - if start < end: - m.insert(tmp_vectors, ids=[i for i in range(start, end)]) - - time.sleep(60) - print(m.count()) - - for index_type in ["ivf_flat", "ivf_sq8", "ivf_sq8h"]: - m.create_index(index_type, nlist) - print(m.describe_index()) - if m.count() != len(vectors): - return - m.preload_table() - true_ids = numpy.array(dataset["neighbors"]) - for nprobe in nprobes: - print("nprobe: %s" % nprobe) - sum_radio = 0.0; avg_radio = 0.0 - result_ids = m.query(query_vectors, top_k, nprobe) - # print(result_ids[:10]) - for index, result_item in enumerate(result_ids): - if len(set(true_ids[index][:top_k])) != len(set(result_item)): - logger.info("Error happened") - # logger.info(query_vectors[index]) - # logger.info(true_ids[index][:top_k], result_item) - tmp = set(true_ids[index][:top_k]).intersection(set(result_item)) - sum_radio = sum_radio + (len(tmp) / top_k) - avg_radio = round(sum_radio 
/ len(result_ids), 4) - logger.info(avg_radio) - m.drop_index() - - -if __name__ == "__main__": - print("glove-25-angular") - # main("sift-128-euclidean", 1024, force=True) - for index_file_size in [50, 1024]: - print("Index file size: %d" % index_file_size) - main("glove-25-angular", index_file_size, force=True) - - print("sift-128-euclidean") - for index_file_size in [50, 1024]: - print("Index file size: %d" % index_file_size) - main("sift-128-euclidean", index_file_size, force=True) - # m = MilvusClient() \ No newline at end of file +print(client.get_list_database()) +acc_record = [{ + "measurement": "accuracy", + "tags": { + "server_version": "0.4.3", + "dataset": "test", + "index_type": "test", + "nlist": 12, + "search_nprobe": 12, + "top_k": 1, + "nq": 1 + }, + "time": time.ctime(), + "fields": { + "accuracy": 0.1 + } +}] +try: + res = client.write_points(acc_record) + print(res) +except Exception as e: + print(str(e)) \ No newline at end of file From e96c97c8f70d11f430833b8f88dedab58a28c48d Mon Sep 17 00:00:00 2001 From: groot Date: Fri, 22 Nov 2019 11:28:31 +0800 Subject: [PATCH 16/32] #470 raw files should not be build index --- CHANGELOG.md | 1 + core/src/db/DBImpl.cpp | 27 +++- core/src/db/DBImpl.h | 4 + core/src/db/meta/Meta.h | 3 +- core/src/db/meta/MetaConsts.h | 2 + core/src/db/meta/MySQLMetaImpl.cpp | 27 +++- core/src/db/meta/MySQLMetaImpl.h | 2 +- core/src/db/meta/SqliteMetaImpl.cpp | 226 +++++++++++++++------------ core/src/db/meta/SqliteMetaImpl.h | 2 +- core/unittest/db/test_meta.cpp | 10 +- core/unittest/db/test_meta_mysql.cpp | 16 +- 11 files changed, 188 insertions(+), 132 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index e7c52eb5bb..8830180941 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -42,6 +42,7 @@ Please mark all change in change log and use the ticket from JIRA. - \#409 - Add a Fallback pass in optimizer - \#433 - C++ SDK query result is not easy to use - \#449 - Add ShowPartitions example for C++ SDK +- \#470 - Small raw files should not be build index ## Task diff --git a/core/src/db/DBImpl.cpp b/core/src/db/DBImpl.cpp index dd230ce0d1..51ea665064 100644 --- a/core/src/db/DBImpl.cpp +++ b/core/src/db/DBImpl.cpp @@ -838,6 +838,25 @@ DBImpl::BackgroundBuildIndex() { // ENGINE_LOG_TRACE << "Background build index thread exit"; } +Status +DBImpl::GetFilesToBuildIndex(const std::string& table_id, const std::vector& file_types, + meta::TableFilesSchema& files) { + files.clear(); + auto status = meta_ptr_->FilesByType(table_id, file_types, files); + + // only build index for files that row count greater than certain threshold + for (auto it = files.begin(); it != files.end();) { + if ((*it).file_type_ == static_cast(meta::TableFileSchema::RAW) && + (*it).row_count_ < meta::BUILD_INDEX_THRESHOLD) { + it = files.erase(it); + } else { + it++; + } + } + + return Status::OK(); +} + Status DBImpl::GetFilesToSearch(const std::string& table_id, const std::vector& file_ids, const meta::DatesT& dates, meta::TableFilesSchema& files) { @@ -946,18 +965,18 @@ DBImpl::BuildTableIndexRecursively(const std::string& table_id, const TableIndex } // get files to build index - std::vector file_ids; - auto status = meta_ptr_->FilesByType(table_id, file_types, file_ids); + meta::TableFilesSchema table_files; + auto status = GetFilesToBuildIndex(table_id, file_types, table_files); int times = 1; - while (!file_ids.empty()) { + while (!table_files.empty()) { ENGINE_LOG_DEBUG << "Non index files detected! 
Will build index " << times; if (index.engine_type_ != (int)EngineType::FAISS_IDMAP) { status = meta_ptr_->UpdateTableFilesToIndex(table_id); } std::this_thread::sleep_for(std::chrono::milliseconds(std::min(10 * 1000, times * 100))); - status = meta_ptr_->FilesByType(table_id, file_types, file_ids); + GetFilesToBuildIndex(table_id, file_types, table_files); times++; } diff --git a/core/src/db/DBImpl.h b/core/src/db/DBImpl.h index a0c5cc356d..bff56efded 100644 --- a/core/src/db/DBImpl.h +++ b/core/src/db/DBImpl.h @@ -152,6 +152,10 @@ class DBImpl : public DB { Status MemSerialize(); + Status + GetFilesToBuildIndex(const std::string& table_id, const std::vector& file_types, + meta::TableFilesSchema& files); + Status GetFilesToSearch(const std::string& table_id, const std::vector& file_ids, const meta::DatesT& dates, meta::TableFilesSchema& files); diff --git a/core/src/db/meta/Meta.h b/core/src/db/meta/Meta.h index f538bebce6..52fe86fe69 100644 --- a/core/src/db/meta/Meta.h +++ b/core/src/db/meta/Meta.h @@ -109,8 +109,7 @@ class Meta { FilesToIndex(TableFilesSchema&) = 0; virtual Status - FilesByType(const std::string& table_id, const std::vector& file_types, - std::vector& file_ids) = 0; + FilesByType(const std::string& table_id, const std::vector& file_types, TableFilesSchema& table_files) = 0; virtual Status Size(uint64_t& result) = 0; diff --git a/core/src/db/meta/MetaConsts.h b/core/src/db/meta/MetaConsts.h index 4e40ff7731..c21a749fc8 100644 --- a/core/src/db/meta/MetaConsts.h +++ b/core/src/db/meta/MetaConsts.h @@ -32,6 +32,8 @@ const size_t H_SEC = 60 * M_SEC; const size_t D_SEC = 24 * H_SEC; const size_t W_SEC = 7 * D_SEC; +const size_t BUILD_INDEX_THRESHOLD = 1000; + } // namespace meta } // namespace engine } // namespace milvus diff --git a/core/src/db/meta/MySQLMetaImpl.cpp b/core/src/db/meta/MySQLMetaImpl.cpp index 4406b87f7e..6d13cad248 100644 --- a/core/src/db/meta/MySQLMetaImpl.cpp +++ b/core/src/db/meta/MySQLMetaImpl.cpp @@ -959,6 +959,7 @@ MySQLMetaImpl::UpdateTableFilesToIndex(const std::string& table_id) { updateTableFilesToIndexQuery << "UPDATE " << META_TABLEFILES << " SET file_type = " << std::to_string(TableFileSchema::TO_INDEX) << " WHERE table_id = " << mysqlpp::quote << table_id + << " AND row_count >= " << std::to_string(meta::BUILD_INDEX_THRESHOLD) << " AND file_type = " << std::to_string(TableFileSchema::RAW) << ";"; ENGINE_LOG_DEBUG << "MySQLMetaImpl::UpdateTableFilesToIndex: " << updateTableFilesToIndexQuery.str(); @@ -1527,13 +1528,13 @@ MySQLMetaImpl::FilesToIndex(TableFilesSchema& files) { Status MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector& file_types, - std::vector& file_ids) { + TableFilesSchema& table_files) { if (file_types.empty()) { return Status(DB_ERROR, "file types array is empty"); } try { - file_ids.clear(); + table_files.clear(); mysqlpp::StoreQueryResult res; { @@ -1553,9 +1554,10 @@ MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector& mysqlpp::Query hasNonIndexFilesQuery = connectionPtr->query(); // since table_id is a unique column we just need to check whether it exists or not - hasNonIndexFilesQuery << "SELECT file_id, file_type" - << " FROM " << META_TABLEFILES << " WHERE table_id = " << mysqlpp::quote << table_id - << " AND file_type in (" << types << ");"; + hasNonIndexFilesQuery + << "SELECT id, engine_type, file_id, file_type, file_size, row_count, date, created_on" + << " FROM " << META_TABLEFILES << " WHERE table_id = " << mysqlpp::quote << table_id + << " AND file_type in (" << 
types << ");"; ENGINE_LOG_DEBUG << "MySQLMetaImpl::FilesByType: " << hasNonIndexFilesQuery.str(); @@ -1566,9 +1568,18 @@ MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector& int raw_count = 0, new_count = 0, new_merge_count = 0, new_index_count = 0; int to_index_count = 0, index_count = 0, backup_count = 0; for (auto& resRow : res) { - std::string file_id; - resRow["file_id"].to_string(file_id); - file_ids.push_back(file_id); + TableFileSchema file_schema; + file_schema.id_ = resRow["id"]; + file_schema.table_id_ = table_id; + file_schema.engine_type_ = resRow["engine_type"]; + resRow["file_id"].to_string(file_schema.file_id_); + file_schema.file_type_ = resRow["file_type"]; + file_schema.file_size_ = resRow["file_size"]; + file_schema.row_count_ = resRow["row_count"]; + file_schema.date_ = resRow["date"]; + file_schema.created_on_ = resRow["created_on"]; + + table_files.emplace_back(file_schema); int32_t file_type = resRow["file_type"]; switch (file_type) { diff --git a/core/src/db/meta/MySQLMetaImpl.h b/core/src/db/meta/MySQLMetaImpl.h index 00b7627548..dd882fca2e 100644 --- a/core/src/db/meta/MySQLMetaImpl.h +++ b/core/src/db/meta/MySQLMetaImpl.h @@ -108,7 +108,7 @@ class MySQLMetaImpl : public Meta { Status FilesByType(const std::string& table_id, const std::vector& file_types, - std::vector& file_ids) override; + TableFilesSchema& table_files) override; Status Archive() override; diff --git a/core/src/db/meta/SqliteMetaImpl.cpp b/core/src/db/meta/SqliteMetaImpl.cpp index 12128c074d..19ec684728 100644 --- a/core/src/db/meta/SqliteMetaImpl.cpp +++ b/core/src/db/meta/SqliteMetaImpl.cpp @@ -58,7 +58,7 @@ HandleException(const std::string& desc, const char* what = nullptr) { } // namespace inline auto -StoragePrototype(const std::string &path) { +StoragePrototype(const std::string& path) { return make_storage(path, make_table(META_TABLES, make_column("id", &TableSchema::id_, primary_key()), @@ -160,7 +160,7 @@ SqliteMetaImpl::Initialize() { } Status -SqliteMetaImpl::CreateTable(TableSchema &table_schema) { +SqliteMetaImpl::CreateTable(TableSchema& table_schema) { try { server::MetricCollector metric; @@ -188,20 +188,20 @@ SqliteMetaImpl::CreateTable(TableSchema &table_schema) { try { auto id = ConnectorPtr->insert(table_schema); table_schema.id_ = id; - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when create table", e.what()); } ENGINE_LOG_DEBUG << "Successfully create table: " << table_schema.table_id_; return utils::CreateTablePath(options_, table_schema.table_id_); - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when create table", e.what()); } } Status -SqliteMetaImpl::DescribeTable(TableSchema &table_schema) { +SqliteMetaImpl::DescribeTable(TableSchema& table_schema) { try { server::MetricCollector metric; @@ -218,7 +218,7 @@ SqliteMetaImpl::DescribeTable(TableSchema &table_schema) { &TableSchema::partition_tag_, &TableSchema::version_), where(c(&TableSchema::table_id_) == table_schema.table_id_ - and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE)); + and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE)); if (groups.size() == 1) { table_schema.id_ = std::get<0>(groups[0]); @@ -236,7 +236,7 @@ SqliteMetaImpl::DescribeTable(TableSchema &table_schema) { } else { return Status(DB_NOT_FOUND, "Table " + table_schema.table_id_ + " not found"); } - } catch (std::exception &e) { + } catch (std::exception& e) { return 
HandleException("Encounter exception when describe table", e.what()); } @@ -244,20 +244,20 @@ SqliteMetaImpl::DescribeTable(TableSchema &table_schema) { } Status -SqliteMetaImpl::HasTable(const std::string &table_id, bool &has_or_not) { +SqliteMetaImpl::HasTable(const std::string& table_id, bool& has_or_not) { has_or_not = false; try { server::MetricCollector metric; auto tables = ConnectorPtr->select(columns(&TableSchema::id_), where(c(&TableSchema::table_id_) == table_id - and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE)); + and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE)); if (tables.size() == 1) { has_or_not = true; } else { has_or_not = false; } - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when lookup table", e.what()); } @@ -265,7 +265,7 @@ SqliteMetaImpl::HasTable(const std::string &table_id, bool &has_or_not) { } Status -SqliteMetaImpl::AllTables(std::vector &table_schema_array) { +SqliteMetaImpl::AllTables(std::vector& table_schema_array) { try { server::MetricCollector metric; @@ -281,8 +281,8 @@ SqliteMetaImpl::AllTables(std::vector &table_schema_array) { &TableSchema::owner_table_, &TableSchema::partition_tag_, &TableSchema::version_), - where(c(&TableSchema::state_) != (int) TableSchema::TO_DELETE)); - for (auto &table : selected) { + where(c(&TableSchema::state_) != (int)TableSchema::TO_DELETE)); + for (auto& table : selected) { TableSchema schema; schema.id_ = std::get<0>(table); schema.table_id_ = std::get<1>(table); @@ -299,7 +299,7 @@ SqliteMetaImpl::AllTables(std::vector &table_schema_array) { table_schema_array.emplace_back(schema); } - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when lookup all tables", e.what()); } @@ -307,7 +307,7 @@ SqliteMetaImpl::AllTables(std::vector &table_schema_array) { } Status -SqliteMetaImpl::DropTable(const std::string &table_id) { +SqliteMetaImpl::DropTable(const std::string& table_id) { try { server::MetricCollector metric; @@ -317,13 +317,13 @@ SqliteMetaImpl::DropTable(const std::string &table_id) { //soft delete table ConnectorPtr->update_all( set( - c(&TableSchema::state_) = (int) TableSchema::TO_DELETE), + c(&TableSchema::state_) = (int)TableSchema::TO_DELETE), where( c(&TableSchema::table_id_) == table_id and - c(&TableSchema::state_) != (int) TableSchema::TO_DELETE)); + c(&TableSchema::state_) != (int)TableSchema::TO_DELETE)); ENGINE_LOG_DEBUG << "Successfully delete table, table id = " << table_id; - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when delete table", e.what()); } @@ -331,7 +331,7 @@ SqliteMetaImpl::DropTable(const std::string &table_id) { } Status -SqliteMetaImpl::DeleteTableFiles(const std::string &table_id) { +SqliteMetaImpl::DeleteTableFiles(const std::string& table_id) { try { server::MetricCollector metric; @@ -341,14 +341,14 @@ SqliteMetaImpl::DeleteTableFiles(const std::string &table_id) { //soft delete table files ConnectorPtr->update_all( set( - c(&TableFileSchema::file_type_) = (int) TableFileSchema::TO_DELETE, + c(&TableFileSchema::file_type_) = (int)TableFileSchema::TO_DELETE, c(&TableFileSchema::updated_time_) = utils::GetMicroSecTimeStamp()), where( c(&TableFileSchema::table_id_) == table_id and - c(&TableFileSchema::file_type_) != (int) TableFileSchema::TO_DELETE)); + c(&TableFileSchema::file_type_) != (int)TableFileSchema::TO_DELETE)); ENGINE_LOG_DEBUG << "Successfully delete table files, table 
id = " << table_id; - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when delete table files", e.what()); } @@ -356,7 +356,7 @@ SqliteMetaImpl::DeleteTableFiles(const std::string &table_id) { } Status -SqliteMetaImpl::CreateTableFile(TableFileSchema &file_schema) { +SqliteMetaImpl::CreateTableFile(TableFileSchema& file_schema) { if (file_schema.date_ == EmptyDate) { file_schema.date_ = utils::GetDate(); } @@ -389,7 +389,7 @@ SqliteMetaImpl::CreateTableFile(TableFileSchema &file_schema) { ENGINE_LOG_DEBUG << "Successfully create table file, file id = " << file_schema.file_id_; return utils::CreateTableFilePath(options_, file_schema); - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when create table file", e.what()); } @@ -398,8 +398,8 @@ SqliteMetaImpl::CreateTableFile(TableFileSchema &file_schema) { // TODO(myh): Delete single vecotor by id Status -SqliteMetaImpl::DropDataByDate(const std::string &table_id, - const DatesT &dates) { +SqliteMetaImpl::DropDataByDate(const std::string& table_id, + const DatesT& dates) { if (dates.empty()) { return Status::OK(); } @@ -440,7 +440,7 @@ SqliteMetaImpl::DropDataByDate(const std::string &table_id, } ENGINE_LOG_DEBUG << "Successfully drop data by date, table id = " << table_schema.table_id_; - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when drop partition", e.what()); } @@ -448,9 +448,9 @@ SqliteMetaImpl::DropDataByDate(const std::string &table_id, } Status -SqliteMetaImpl::GetTableFiles(const std::string &table_id, - const std::vector &ids, - TableFilesSchema &table_files) { +SqliteMetaImpl::GetTableFiles(const std::string& table_id, + const std::vector& ids, + TableFilesSchema& table_files) { try { table_files.clear(); auto files = ConnectorPtr->select(columns(&TableFileSchema::id_, @@ -463,7 +463,7 @@ SqliteMetaImpl::GetTableFiles(const std::string &table_id, &TableFileSchema::created_on_), where(c(&TableFileSchema::table_id_) == table_id and in(&TableFileSchema::id_, ids) and - c(&TableFileSchema::file_type_) != (int) TableFileSchema::TO_DELETE)); + c(&TableFileSchema::file_type_) != (int)TableFileSchema::TO_DELETE)); TableSchema table_schema; table_schema.table_id_ = table_id; auto status = DescribeTable(table_schema); @@ -472,7 +472,7 @@ SqliteMetaImpl::GetTableFiles(const std::string &table_id, } Status result; - for (auto &file : files) { + for (auto& file : files) { TableFileSchema file_schema; file_schema.table_id_ = table_id; file_schema.id_ = std::get<0>(file); @@ -495,13 +495,13 @@ SqliteMetaImpl::GetTableFiles(const std::string &table_id, ENGINE_LOG_DEBUG << "Get table files by id"; return result; - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when lookup table files", e.what()); } } Status -SqliteMetaImpl::UpdateTableFlag(const std::string &table_id, int64_t flag) { +SqliteMetaImpl::UpdateTableFlag(const std::string& table_id, int64_t flag) { try { server::MetricCollector metric; @@ -512,7 +512,7 @@ SqliteMetaImpl::UpdateTableFlag(const std::string &table_id, int64_t flag) { where( c(&TableSchema::table_id_) == table_id)); ENGINE_LOG_DEBUG << "Successfully update table flag, table id = " << table_id; - } catch (std::exception &e) { + } catch (std::exception& e) { std::string msg = "Encounter exception when update table flag: table_id = " + table_id; return HandleException(msg, e.what()); } @@ 
-521,7 +521,7 @@ SqliteMetaImpl::UpdateTableFlag(const std::string &table_id, int64_t flag) { } Status -SqliteMetaImpl::UpdateTableFile(TableFileSchema &file_schema) { +SqliteMetaImpl::UpdateTableFile(TableFileSchema& file_schema) { file_schema.updated_time_ = utils::GetMicroSecTimeStamp(); try { server::MetricCollector metric; @@ -534,14 +534,14 @@ SqliteMetaImpl::UpdateTableFile(TableFileSchema &file_schema) { //if the table has been deleted, just mark the table file as TO_DELETE //clean thread will delete the file later - if (tables.size() < 1 || std::get<0>(tables[0]) == (int) TableSchema::TO_DELETE) { + if (tables.size() < 1 || std::get<0>(tables[0]) == (int)TableSchema::TO_DELETE) { file_schema.file_type_ = TableFileSchema::TO_DELETE; } ConnectorPtr->update(file_schema); ENGINE_LOG_DEBUG << "Update single table file, file id = " << file_schema.file_id_; - } catch (std::exception &e) { + } catch (std::exception& e) { std::string msg = "Exception update table file: table_id = " + file_schema.table_id_ + " file_id = " + file_schema.file_id_; return HandleException(msg, e.what()); @@ -550,7 +550,7 @@ SqliteMetaImpl::UpdateTableFile(TableFileSchema &file_schema) { } Status -SqliteMetaImpl::UpdateTableFiles(TableFilesSchema &files) { +SqliteMetaImpl::UpdateTableFiles(TableFilesSchema& files) { try { server::MetricCollector metric; @@ -558,13 +558,13 @@ SqliteMetaImpl::UpdateTableFiles(TableFilesSchema &files) { std::lock_guard meta_lock(meta_mutex_); std::map has_tables; - for (auto &file : files) { + for (auto& file : files) { if (has_tables.find(file.table_id_) != has_tables.end()) { continue; } auto tables = ConnectorPtr->select(columns(&TableSchema::id_), where(c(&TableSchema::table_id_) == file.table_id_ - and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE)); + and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE)); if (tables.size() >= 1) { has_tables[file.table_id_] = true; } else { @@ -573,7 +573,7 @@ SqliteMetaImpl::UpdateTableFiles(TableFilesSchema &files) { } auto commited = ConnectorPtr->transaction([&]() mutable { - for (auto &file : files) { + for (auto& file : files) { if (!has_tables[file.table_id_]) { file.file_type_ = TableFileSchema::TO_DELETE; } @@ -589,7 +589,7 @@ SqliteMetaImpl::UpdateTableFiles(TableFilesSchema &files) { } ENGINE_LOG_DEBUG << "Update " << files.size() << " table files"; - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when update table files", e.what()); } return Status::OK(); @@ -613,7 +613,7 @@ SqliteMetaImpl::UpdateTableIndex(const std::string& table_id, const TableIndex& &TableSchema::partition_tag_, &TableSchema::version_), where(c(&TableSchema::table_id_) == table_id - and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE)); + and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE)); if (tables.size() > 0) { meta::TableSchema table_schema; @@ -639,11 +639,11 @@ SqliteMetaImpl::UpdateTableIndex(const std::string& table_id, const TableIndex& //set all backup file to raw ConnectorPtr->update_all( set( - c(&TableFileSchema::file_type_) = (int) TableFileSchema::RAW, + c(&TableFileSchema::file_type_) = (int)TableFileSchema::RAW, c(&TableFileSchema::updated_time_) = utils::GetMicroSecTimeStamp()), where( c(&TableFileSchema::table_id_) == table_id and - c(&TableFileSchema::file_type_) == (int) TableFileSchema::BACKUP)); + c(&TableFileSchema::file_type_) == (int)TableFileSchema::BACKUP)); ENGINE_LOG_DEBUG << "Successfully update table index, table id = " << table_id; } 
catch (std::exception& e) { @@ -655,7 +655,7 @@ SqliteMetaImpl::UpdateTableIndex(const std::string& table_id, const TableIndex& } Status -SqliteMetaImpl::UpdateTableFilesToIndex(const std::string &table_id) { +SqliteMetaImpl::UpdateTableFilesToIndex(const std::string& table_id) { try { server::MetricCollector metric; @@ -664,13 +664,14 @@ SqliteMetaImpl::UpdateTableFilesToIndex(const std::string &table_id) { ConnectorPtr->update_all( set( - c(&TableFileSchema::file_type_) = (int) TableFileSchema::TO_INDEX), + c(&TableFileSchema::file_type_) = (int)TableFileSchema::TO_INDEX), where( c(&TableFileSchema::table_id_) == table_id and - c(&TableFileSchema::file_type_) == (int) TableFileSchema::RAW)); + c(&TableFileSchema::row_count_) >= meta::BUILD_INDEX_THRESHOLD and + c(&TableFileSchema::file_type_) == (int)TableFileSchema::RAW)); ENGINE_LOG_DEBUG << "Update files to to_index, table id = " << table_id; - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when update table files to to_index", e.what()); } @@ -686,7 +687,7 @@ SqliteMetaImpl::DescribeTableIndex(const std::string& table_id, TableIndex& inde &TableSchema::nlist_, &TableSchema::metric_type_), where(c(&TableSchema::table_id_) == table_id - and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE)); + and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE)); if (groups.size() == 1) { index.engine_type_ = std::get<0>(groups[0]); @@ -713,20 +714,20 @@ SqliteMetaImpl::DropTableIndex(const std::string& table_id) { //soft delete index files ConnectorPtr->update_all( set( - c(&TableFileSchema::file_type_) = (int) TableFileSchema::TO_DELETE, + c(&TableFileSchema::file_type_) = (int)TableFileSchema::TO_DELETE, c(&TableFileSchema::updated_time_) = utils::GetMicroSecTimeStamp()), where( c(&TableFileSchema::table_id_) == table_id and - c(&TableFileSchema::file_type_) == (int) TableFileSchema::INDEX)); + c(&TableFileSchema::file_type_) == (int)TableFileSchema::INDEX)); //set all backup file to raw ConnectorPtr->update_all( set( - c(&TableFileSchema::file_type_) = (int) TableFileSchema::RAW, + c(&TableFileSchema::file_type_) = (int)TableFileSchema::RAW, c(&TableFileSchema::updated_time_) = utils::GetMicroSecTimeStamp()), where( c(&TableFileSchema::table_id_) == table_id and - c(&TableFileSchema::file_type_) == (int) TableFileSchema::BACKUP)); + c(&TableFileSchema::file_type_) == (int)TableFileSchema::BACKUP)); //set table index type to raw ConnectorPtr->update_all( @@ -738,7 +739,7 @@ SqliteMetaImpl::DropTableIndex(const std::string& table_id) { c(&TableSchema::table_id_) == table_id)); ENGINE_LOG_DEBUG << "Successfully drop table index, table id = " << table_id; - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when delete table index files", e.what()); } @@ -746,7 +747,9 @@ SqliteMetaImpl::DropTableIndex(const std::string& table_id) { } Status -SqliteMetaImpl::CreatePartition(const std::string& table_id, const std::string& partition_name, const std::string& tag) { +SqliteMetaImpl::CreatePartition(const std::string& table_id, + const std::string& partition_name, + const std::string& tag) { server::MetricCollector metric; TableSchema table_schema; @@ -757,7 +760,7 @@ SqliteMetaImpl::CreatePartition(const std::string& table_id, const std::string& } // not allow create partition under partition - if(!table_schema.owner_table_.empty()) { + if (!table_schema.owner_table_.empty()) { return Status(DB_ERROR, "Nested partition is not 
allowed"); } @@ -769,7 +772,7 @@ SqliteMetaImpl::CreatePartition(const std::string& table_id, const std::string& // not allow duplicated partition std::string exist_partition; GetPartitionName(table_id, valid_tag, exist_partition); - if(!exist_partition.empty()) { + if (!exist_partition.empty()) { return Status(DB_ERROR, "Duplicate partition is not allowed"); } @@ -805,16 +808,16 @@ SqliteMetaImpl::ShowPartitions(const std::string& table_id, std::vectorselect(columns(&TableSchema::table_id_), - where(c(&TableSchema::owner_table_) == table_id - and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE)); - for(size_t i = 0; i < partitions.size(); i++) { + where(c(&TableSchema::owner_table_) == table_id + and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE)); + for (size_t i = 0; i < partitions.size(); i++) { std::string partition_name = std::get<0>(partitions[i]); meta::TableSchema partition_schema; partition_schema.table_id_ = partition_name; DescribeTable(partition_schema); partiton_schema_array.emplace_back(partition_schema); } - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when show partitions", e.what()); } @@ -832,14 +835,14 @@ SqliteMetaImpl::GetPartitionName(const std::string& table_id, const std::string& server::StringHelpFunctions::TrimStringBlank(valid_tag); auto name = ConnectorPtr->select(columns(&TableSchema::table_id_), - where(c(&TableSchema::owner_table_) == table_id - and c(&TableSchema::partition_tag_) == valid_tag)); + where(c(&TableSchema::owner_table_) == table_id + and c(&TableSchema::partition_tag_) == valid_tag)); if (name.size() > 0) { partition_name = std::get<0>(name[0]); } else { return Status(DB_NOT_FOUND, "Table " + table_id + "'s partition " + valid_tag + " not found"); } - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when get partition name", e.what()); } @@ -1032,7 +1035,7 @@ SqliteMetaImpl::FilesToMerge(const std::string& table_id, DatePartionedTableFile } Status -SqliteMetaImpl::FilesToIndex(TableFilesSchema &files) { +SqliteMetaImpl::FilesToIndex(TableFilesSchema& files) { files.clear(); try { @@ -1048,13 +1051,13 @@ SqliteMetaImpl::FilesToIndex(TableFilesSchema &files) { &TableFileSchema::engine_type_, &TableFileSchema::created_on_), where(c(&TableFileSchema::file_type_) - == (int) TableFileSchema::TO_INDEX)); + == (int)TableFileSchema::TO_INDEX)); std::map groups; TableFileSchema table_file; Status ret; - for (auto &file : selected) { + for (auto& file : selected) { table_file.id_ = std::get<0>(file); table_file.table_id_ = std::get<1>(file); table_file.file_id_ = std::get<2>(file); @@ -1090,48 +1093,66 @@ SqliteMetaImpl::FilesToIndex(TableFilesSchema &files) { ENGINE_LOG_DEBUG << "Collect " << selected.size() << " to-index files"; } return ret; - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when iterate raw files", e.what()); } } Status -SqliteMetaImpl::FilesByType(const std::string &table_id, - const std::vector &file_types, - std::vector &file_ids) { +SqliteMetaImpl::FilesByType(const std::string& table_id, + const std::vector& file_types, + TableFilesSchema& table_files) { if (file_types.empty()) { return Status(DB_ERROR, "file types array is empty"); } try { - file_ids.clear(); - auto selected = ConnectorPtr->select(columns(&TableFileSchema::file_id_, - &TableFileSchema::file_type_), + table_files.clear(); + auto selected = 
ConnectorPtr->select(columns(&TableFileSchema::id_, + &TableFileSchema::file_id_, + &TableFileSchema::file_type_, + &TableFileSchema::file_size_, + &TableFileSchema::row_count_, + &TableFileSchema::date_, + &TableFileSchema::engine_type_, + &TableFileSchema::created_on_), where(in(&TableFileSchema::file_type_, file_types) and c(&TableFileSchema::table_id_) == table_id)); if (selected.size() >= 1) { int raw_count = 0, new_count = 0, new_merge_count = 0, new_index_count = 0; int to_index_count = 0, index_count = 0, backup_count = 0; - for (auto &file : selected) { - file_ids.push_back(std::get<0>(file)); - switch (std::get<1>(file)) { - case (int) TableFileSchema::RAW:raw_count++; + for (auto& file : selected) { + TableFileSchema file_schema; + file_schema.table_id_ = table_id; + file_schema.id_ = std::get<0>(file); + file_schema.file_id_ = std::get<1>(file); + file_schema.file_type_ = std::get<2>(file); + file_schema.file_size_ = std::get<3>(file); + file_schema.row_count_ = std::get<4>(file); + file_schema.date_ = std::get<5>(file); + file_schema.engine_type_ = std::get<6>(file); + file_schema.created_on_ = std::get<7>(file); + + switch (file_schema.file_type_) { + case (int)TableFileSchema::RAW:raw_count++; break; - case (int) TableFileSchema::NEW:new_count++; + case (int)TableFileSchema::NEW:new_count++; break; - case (int) TableFileSchema::NEW_MERGE:new_merge_count++; + case (int)TableFileSchema::NEW_MERGE:new_merge_count++; break; - case (int) TableFileSchema::NEW_INDEX:new_index_count++; + case (int)TableFileSchema::NEW_INDEX:new_index_count++; break; - case (int) TableFileSchema::TO_INDEX:to_index_count++; + case (int)TableFileSchema::TO_INDEX:to_index_count++; break; - case (int) TableFileSchema::INDEX:index_count++; + case (int)TableFileSchema::INDEX:index_count++; break; - case (int) TableFileSchema::BACKUP:backup_count++; + case (int)TableFileSchema::BACKUP:backup_count++; break; default:break; } + + table_files.emplace_back(file_schema); } ENGINE_LOG_DEBUG << "Table " << table_id << " currently has raw files:" << raw_count @@ -1139,13 +1160,12 @@ SqliteMetaImpl::FilesByType(const std::string &table_id, << " new_index files:" << new_index_count << " to_index files:" << to_index_count << " index files:" << index_count << " backup files:" << backup_count; } - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when check non index files", e.what()); } return Status::OK(); } - // TODO(myh): Support swap to cloud storage Status SqliteMetaImpl::Archive() { @@ -1166,11 +1186,11 @@ SqliteMetaImpl::Archive() { ConnectorPtr->update_all( set( - c(&TableFileSchema::file_type_) = (int) TableFileSchema::TO_DELETE), + c(&TableFileSchema::file_type_) = (int)TableFileSchema::TO_DELETE), where( - c(&TableFileSchema::created_on_) < (int64_t) (now - usecs) and - c(&TableFileSchema::file_type_) != (int) TableFileSchema::TO_DELETE)); - } catch (std::exception &e) { + c(&TableFileSchema::created_on_) < (int64_t)(now - usecs) and + c(&TableFileSchema::file_type_) != (int)TableFileSchema::TO_DELETE)); + } catch (std::exception& e) { return HandleException("Encounter exception when update table files", e.what()); } @@ -1218,15 +1238,15 @@ SqliteMetaImpl::CleanUp() { std::lock_guard meta_lock(meta_mutex_); std::vector file_types = { - (int) TableFileSchema::NEW, - (int) TableFileSchema::NEW_INDEX, - (int) TableFileSchema::NEW_MERGE + (int)TableFileSchema::NEW, + (int)TableFileSchema::NEW_INDEX, + (int)TableFileSchema::NEW_MERGE }; auto files = 
ConnectorPtr->select(columns(&TableFileSchema::id_), where(in(&TableFileSchema::file_type_, file_types))); auto commited = ConnectorPtr->transaction([&]() mutable { - for (auto &file : files) { + for (auto& file : files) { ENGINE_LOG_DEBUG << "Remove table file type as NEW"; ConnectorPtr->remove(std::get<0>(file)); } @@ -1240,7 +1260,7 @@ SqliteMetaImpl::CleanUp() { if (files.size() > 0) { ENGINE_LOG_DEBUG << "Clean " << files.size() << " files"; } - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when clean table file", e.what()); } @@ -1265,7 +1285,7 @@ SqliteMetaImpl::CleanUpFilesWithTTL(uint16_t seconds) { &TableFileSchema::date_), where( c(&TableFileSchema::file_type_) == - (int) TableFileSchema::TO_DELETE + (int)TableFileSchema::TO_DELETE and c(&TableFileSchema::updated_time_) < now - seconds * US_PS)); @@ -1354,7 +1374,7 @@ SqliteMetaImpl::CleanUpFilesWithTTL(uint16_t seconds) { } Status -SqliteMetaImpl::Count(const std::string &table_id, uint64_t &result) { +SqliteMetaImpl::Count(const std::string& table_id, uint64_t& result) { try { server::MetricCollector metric; @@ -1414,14 +1434,14 @@ SqliteMetaImpl::DiscardFiles(int64_t to_discard_size) { auto selected = ConnectorPtr->select(columns(&TableFileSchema::id_, &TableFileSchema::file_size_), where(c(&TableFileSchema::file_type_) - != (int) TableFileSchema::TO_DELETE), + != (int)TableFileSchema::TO_DELETE), order_by(&TableFileSchema::id_), limit(10)); std::vector ids; TableFileSchema table_file; - for (auto &file : selected) { + for (auto& file : selected) { if (to_discard_size <= 0) break; table_file.id_ = std::get<0>(file); table_file.file_size_ = std::get<1>(file); @@ -1437,7 +1457,7 @@ SqliteMetaImpl::DiscardFiles(int64_t to_discard_size) { ConnectorPtr->update_all( set( - c(&TableFileSchema::file_type_) = (int) TableFileSchema::TO_DELETE, + c(&TableFileSchema::file_type_) = (int)TableFileSchema::TO_DELETE, c(&TableFileSchema::updated_time_) = utils::GetMicroSecTimeStamp()), where( in(&TableFileSchema::id_, ids))); @@ -1448,7 +1468,7 @@ SqliteMetaImpl::DiscardFiles(int64_t to_discard_size) { if (!commited) { return HandleException("DiscardFiles error: sqlite transaction failed"); } - } catch (std::exception &e) { + } catch (std::exception& e) { return HandleException("Encounter exception when discard table file", e.what()); } diff --git a/core/src/db/meta/SqliteMetaImpl.h b/core/src/db/meta/SqliteMetaImpl.h index 84d97ed49d..8e821d81de 100644 --- a/core/src/db/meta/SqliteMetaImpl.h +++ b/core/src/db/meta/SqliteMetaImpl.h @@ -108,7 +108,7 @@ class SqliteMetaImpl : public Meta { Status FilesByType(const std::string& table_id, const std::vector& file_types, - std::vector& file_ids) override; + TableFilesSchema& table_files) override; Status Size(uint64_t& result) override; diff --git a/core/unittest/db/test_meta.cpp b/core/unittest/db/test_meta.cpp index 097f004bd1..143bf39383 100644 --- a/core/unittest/db/test_meta.cpp +++ b/core/unittest/db/test_meta.cpp @@ -306,9 +306,9 @@ TEST_F(MetaTest, TABLE_FILES_TEST) { ASSERT_EQ(dated_files[table_file.date_].size(), 0); std::vector file_types; - std::vector file_ids; - status = impl_->FilesByType(table.table_id_, file_types, file_ids); - ASSERT_TRUE(file_ids.empty()); + milvus::engine::meta::TableFilesSchema table_files; + status = impl_->FilesByType(table.table_id_, file_types, table_files); + ASSERT_TRUE(table_files.empty()); ASSERT_FALSE(status.ok()); file_types = { @@ -317,11 +317,11 @@ TEST_F(MetaTest, TABLE_FILES_TEST) { 
milvus::engine::meta::TableFileSchema::INDEX, milvus::engine::meta::TableFileSchema::RAW, milvus::engine::meta::TableFileSchema::BACKUP, }; - status = impl_->FilesByType(table.table_id_, file_types, file_ids); + status = impl_->FilesByType(table.table_id_, file_types, table_files); ASSERT_TRUE(status.ok()); uint64_t total_cnt = new_index_files_cnt + new_merge_files_cnt + backup_files_cnt + new_files_cnt + raw_files_cnt + to_index_files_cnt + index_files_cnt; - ASSERT_EQ(file_ids.size(), total_cnt); + ASSERT_EQ(table_files.size(), total_cnt); status = impl_->DeleteTableFiles(table_id); ASSERT_TRUE(status.ok()); diff --git a/core/unittest/db/test_meta_mysql.cpp b/core/unittest/db/test_meta_mysql.cpp index b9a82c0748..9a52a01b7b 100644 --- a/core/unittest/db/test_meta_mysql.cpp +++ b/core/unittest/db/test_meta_mysql.cpp @@ -169,9 +169,9 @@ TEST_F(MySqlMetaTest, ARCHIVE_TEST_DAYS) { std::vector file_types = { (int)milvus::engine::meta::TableFileSchema::NEW, }; - std::vector file_ids; - status = impl.FilesByType(table_id, file_types, file_ids); - ASSERT_FALSE(file_ids.empty()); + milvus::engine::meta::TableFilesSchema table_files; + status = impl.FilesByType(table_id, file_types, table_files); + ASSERT_FALSE(table_files.empty()); status = impl.UpdateTableFilesToIndex(table_id); ASSERT_TRUE(status.ok()); @@ -326,9 +326,9 @@ TEST_F(MySqlMetaTest, TABLE_FILES_TEST) { ASSERT_EQ(dated_files[table_file.date_].size(), 0); std::vector file_types; - std::vector file_ids; - status = impl_->FilesByType(table.table_id_, file_types, file_ids); - ASSERT_TRUE(file_ids.empty()); + milvus::engine::meta::TableFilesSchema table_files; + status = impl_->FilesByType(table.table_id_, file_types, table_files); + ASSERT_TRUE(table_files.empty()); ASSERT_FALSE(status.ok()); file_types = { @@ -337,11 +337,11 @@ TEST_F(MySqlMetaTest, TABLE_FILES_TEST) { milvus::engine::meta::TableFileSchema::INDEX, milvus::engine::meta::TableFileSchema::RAW, milvus::engine::meta::TableFileSchema::BACKUP, }; - status = impl_->FilesByType(table.table_id_, file_types, file_ids); + status = impl_->FilesByType(table.table_id_, file_types, table_files); ASSERT_TRUE(status.ok()); uint64_t total_cnt = new_index_files_cnt + new_merge_files_cnt + backup_files_cnt + new_files_cnt + raw_files_cnt + to_index_files_cnt + index_files_cnt; - ASSERT_EQ(file_ids.size(), total_cnt); + ASSERT_EQ(table_files.size(), total_cnt); status = impl_->DeleteTableFiles(table_id); ASSERT_TRUE(status.ok()); From eb0270aee5a761955de7ee6781b73ae14d18be3c Mon Sep 17 00:00:00 2001 From: groot Date: Fri, 22 Nov 2019 12:05:40 +0800 Subject: [PATCH 17/32] #470 raw files should not be built into index --- core/src/db/meta/MetaConsts.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/core/src/db/meta/MetaConsts.h b/core/src/db/meta/MetaConsts.h index c21a749fc8..0c77dc2599 100644 --- a/core/src/db/meta/MetaConsts.h +++ b/core/src/db/meta/MetaConsts.h @@ -32,7 +32,12 @@ const size_t H_SEC = 60 * M_SEC; const size_t D_SEC = 24 * H_SEC; const size_t W_SEC = 7 * D_SEC; -const size_t BUILD_INDEX_THRESHOLD = 1000; +// This value is used to ignore small raw files when building index. +// The reason is: +// 1. The performance of brute-force search on small raw files could be better than on small index files. +// 2. Small raw files can be merged into larger files, thus reducing the count of fragmented files. +// We decided the value based on testing with small raw/index files. 
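+// Example (hypothetical row counts): with the threshold of 5000 below, a RAW file holding +// 3000 rows is skipped by GetFilesToBuildIndex() and keeps serving brute-force search until +// it is merged, while a RAW file holding 8000 rows is marked TO_INDEX.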
+const size_t BUILD_INDEX_THRESHOLD = 5000; } // namespace meta } // namespace engine From 54e113ba31763b2cec4370c38558641e10802ed3 Mon Sep 17 00:00:00 2001 From: zhenwu Date: Fri, 22 Nov 2019 14:12:57 +0800 Subject: [PATCH 18/32] Disable multiprocess cases --- tests/milvus_python_test/test_connect.py | 3 ++- tests/milvus_python_test/test_mix.py | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/tests/milvus_python_test/test_connect.py b/tests/milvus_python_test/test_connect.py index 143ac4d8bf..b5b55f3ee1 100644 --- a/tests/milvus_python_test/test_connect.py +++ b/tests/milvus_python_test/test_connect.py @@ -149,7 +149,8 @@ class TestConnect: milvus.connect(uri=uri_value, timeout=1) assert not milvus.connected() - def test_connect_with_multiprocess(self, args): + # disable + def _test_connect_with_multiprocess(self, args): ''' target: test uri connect with multiprocess method: set correct uri, test with multiprocessing connecting diff --git a/tests/milvus_python_test/test_mix.py b/tests/milvus_python_test/test_mix.py index 5ef9ba2cde..d9331151dd 100644 --- a/tests/milvus_python_test/test_mix.py +++ b/tests/milvus_python_test/test_mix.py @@ -25,7 +25,8 @@ index_params = {'index_type': IndexType.IVFLAT, 'nlist': 16384} class TestMixBase: - def test_search_during_createIndex(self, args): + # disable + def _test_search_during_createIndex(self, args): loops = 10000 table = gen_unique_str() query_vecs = [vectors[0], vectors[1]] From 5268a1628ccfc021bc425915bb7151a28e775d34 Mon Sep 17 00:00:00 2001 From: zhenwu Date: Fri, 22 Nov 2019 14:18:33 +0800 Subject: [PATCH 19/32] Update case run timeout --- tests/milvus_ann_acc/ci/jenkinsfile/acc_test.groovy | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/milvus_ann_acc/ci/jenkinsfile/acc_test.groovy b/tests/milvus_ann_acc/ci/jenkinsfile/acc_test.groovy index 1ce327b802..c1b2b2ed64 100644 --- a/tests/milvus_ann_acc/ci/jenkinsfile/acc_test.groovy +++ b/tests/milvus_ann_acc/ci/jenkinsfile/acc_test.groovy @@ -1,4 +1,4 @@ -timeout(time: 1800, unit: 'MINUTES') { +timeout(time: 7200, unit: 'MINUTES') { try { dir ("milvu_ann_acc") { print "Git clone url: ${TEST_URL}:${TEST_BRANCH}" From 91c33882b54d6f41f1b14f257d7518e6308740b2 Mon Sep 17 00:00:00 2001 From: zerowe-seven Date: Fri, 22 Nov 2019 06:56:53 -0800 Subject: [PATCH 20/32] #329 The error message in TableNotExistMsg has a grammatical error --- core/src/server/grpc_impl/request/GrpcBaseRequest.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/core/src/server/grpc_impl/request/GrpcBaseRequest.cpp b/core/src/server/grpc_impl/request/GrpcBaseRequest.cpp index e1eb07af5c..2aa96c5331 100644 --- a/core/src/server/grpc_impl/request/GrpcBaseRequest.cpp +++ b/core/src/server/grpc_impl/request/GrpcBaseRequest.cpp @@ -90,7 +90,7 @@ GrpcBaseRequest::SetStatus(ErrorCode error_code, const std::string& error_msg) { std::string GrpcBaseRequest::TableNotExistMsg(const std::string& table_name) { return "Table " + table_name + - " not exist. Use milvus.has_table to verify whether the table exists. You also can check if the table name " + " does not exist. Use milvus.has_table to verify whether the table exists. 
You also can check whether the table name " "exists."; } Status From e40c06740052fe79ae4c4ced5f6c5ff337f3280f Mon Sep 17 00:00:00 2001 From: groot Date: Fri, 22 Nov 2019 17:45:50 +0800 Subject: [PATCH 21/32] #416 Dropping the same partition succeeds repeatedly --- core/src/db/meta/SqliteMetaImpl.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/core/src/db/meta/SqliteMetaImpl.cpp b/core/src/db/meta/SqliteMetaImpl.cpp index 19ec684728..74460c1b4d 100644 --- a/core/src/db/meta/SqliteMetaImpl.cpp +++ b/core/src/db/meta/SqliteMetaImpl.cpp @@ -836,7 +836,8 @@ SqliteMetaImpl::GetPartitionName(const std::string& table_id, const std::string& auto name = ConnectorPtr->select(columns(&TableSchema::table_id_), where(c(&TableSchema::owner_table_) == table_id - and c(&TableSchema::partition_tag_) == valid_tag)); + and c(&TableSchema::partition_tag_) == valid_tag + and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE)); if (name.size() > 0) { partition_name = std::get<0>(name[0]); } else { From d3ab5ffa407eee56a74e9ab2c8e3290725127c7f Mon Sep 17 00:00:00 2001 From: groot Date: Fri, 22 Nov 2019 19:35:42 +0800 Subject: [PATCH 22/32] add unittest case --- core/unittest/server/test_config.cpp | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/core/unittest/server/test_config.cpp b/core/unittest/server/test_config.cpp index 664a08d631..ce0ae9fa80 100644 --- a/core/unittest/server/test_config.cpp +++ b/core/unittest/server/test_config.cpp @@ -25,6 +25,8 @@ #include "utils/StringHelpFunctions.h" #include "utils/ValidationUtil.h" +#include + namespace { static constexpr uint64_t KB = 1024; @@ -63,9 +65,21 @@ TEST_F(ConfigTest, CONFIG_TEST) { int64_t port = server_config.GetInt64Value("port"); ASSERT_NE(port, 0); - server_config.SetValue("test", "2.5"); - double test = server_config.GetDoubleValue("test"); - ASSERT_EQ(test, 2.5); + server_config.SetValue("float_test", "2.5"); + double dbl = server_config.GetDoubleValue("float_test"); + ASSERT_LE(abs(dbl - 2.5), std::numeric_limits::epsilon()); + float flt = server_config.GetFloatValue("float_test"); + ASSERT_LE(abs(flt - 2.5), std::numeric_limits::epsilon()); + + server_config.SetValue("bool_test", "true"); + bool blt = server_config.GetBoolValue("bool_test"); + ASSERT_TRUE(blt); + + server_config.SetValue("int_test", "34"); + int32_t it32 = server_config.GetInt32Value("int_test"); + ASSERT_EQ(it32, 34); + int64_t it64 = server_config.GetInt64Value("int_test"); + ASSERT_EQ(it64, 34); milvus::server::ConfigNode fake; server_config.AddChild("fake", fake); From 97807df0af0e527a205593d28497866e79003e54 Mon Sep 17 00:00:00 2001 From: JinHai-CN Date: Fri, 22 Nov 2019 20:30:44 +0800 Subject: [PATCH 23/32] Fix lint --- core/src/server/grpc_impl/request/GrpcBaseRequest.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/core/src/server/grpc_impl/request/GrpcBaseRequest.cpp b/core/src/server/grpc_impl/request/GrpcBaseRequest.cpp index 2aa96c5331..0f46217057 100644 --- a/core/src/server/grpc_impl/request/GrpcBaseRequest.cpp +++ b/core/src/server/grpc_impl/request/GrpcBaseRequest.cpp @@ -90,7 +90,7 @@ GrpcBaseRequest::SetStatus(ErrorCode error_code, const std::string& error_msg) { std::string GrpcBaseRequest::TableNotExistMsg(const std::string& table_name) { return "Table " + table_name + - " does not exist. Use milvus.has_table to verify whether the table exists. You also can check whether the table name " + " does not exist. Use milvus.has_table to verify whether the table exists. 
" + "You also can check whether the table name exists."; } Status From b68f4f43c1ecf9671876658907c68673bbe46ade Mon Sep 17 00:00:00 2001 From: fishpenguin Date: Sat, 23 Nov 2019 10:29:11 +0800 Subject: [PATCH 24/32] Add log in scheduler/optimizer --- CHANGELOG.md | 1 + core/src/scheduler/optimizer/BuildIndexPass.cpp | 4 +++- core/src/scheduler/optimizer/FaissFlatPass.cpp | 2 ++ core/src/scheduler/optimizer/FaissIVFFlatPass.cpp | 3 +++ core/src/scheduler/optimizer/FaissIVFSQ8HPass.cpp | 3 +++ core/src/scheduler/optimizer/FaissIVFSQ8Pass.cpp | 3 +++ core/src/scheduler/optimizer/FallbackPass.cpp | 1 + 7 files changed, 16 insertions(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index f08ca3bf42..7a518a8d41 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -28,6 +28,7 @@ Please mark all change in change log and use the ticket from JIRA. - \#226 - Experimental shards middleware for Milvus - \#227 - Support new index types SPTAG-KDT and SPTAG-BKT - \#346 - Support build index with multiple gpu +- \#488 - Add log in scheduler/optimizer ## Improvement - \#255 - Add ivfsq8 test report detailed version diff --git a/core/src/scheduler/optimizer/BuildIndexPass.cpp b/core/src/scheduler/optimizer/BuildIndexPass.cpp index d535b9675f..faa451bc56 100644 --- a/core/src/scheduler/optimizer/BuildIndexPass.cpp +++ b/core/src/scheduler/optimizer/BuildIndexPass.cpp @@ -38,8 +38,10 @@ BuildIndexPass::Run(const TaskPtr& task) { if (task->Type() != TaskType::BuildIndexTask) return false; - if (build_gpu_ids_.empty()) + if (build_gpu_ids_.empty()) { + SERVER_LOG_WARNING << "BUildIndexPass cannot get build index gpu!"; return false; + } ResourcePtr res_ptr; res_ptr = ResMgrInst::GetInstance()->GetResource(ResourceType::GPU, build_gpu_ids_[specified_gpu_id_]); diff --git a/core/src/scheduler/optimizer/FaissFlatPass.cpp b/core/src/scheduler/optimizer/FaissFlatPass.cpp index 61ca1b9ec9..c78f7d57e1 100644 --- a/core/src/scheduler/optimizer/FaissFlatPass.cpp +++ b/core/src/scheduler/optimizer/FaissFlatPass.cpp @@ -54,9 +54,11 @@ FaissFlatPass::Run(const TaskPtr& task) { auto search_job = std::static_pointer_cast(search_task->job_.lock()); ResourcePtr res_ptr; if (search_job->nq() < threshold_) { + SERVER_LOG_DEBUG << "FaissFlatPass: nq < gpu_search_threshold, specify cpu to search!"; res_ptr = ResMgrInst::GetInstance()->GetResource("cpu"); } else { auto best_device_id = count_ % gpus.size(); + SERVER_LOG_DEBUG << "FaissFlatPass: nq > gpu_search_threshold, specify gpu" << best_device_id << " to search!"; count_++; res_ptr = ResMgrInst::GetInstance()->GetResource(ResourceType::GPU, best_device_id); } diff --git a/core/src/scheduler/optimizer/FaissIVFFlatPass.cpp b/core/src/scheduler/optimizer/FaissIVFFlatPass.cpp index 1f1efb374b..17067cf24e 100644 --- a/core/src/scheduler/optimizer/FaissIVFFlatPass.cpp +++ b/core/src/scheduler/optimizer/FaissIVFFlatPass.cpp @@ -54,9 +54,12 @@ FaissIVFFlatPass::Run(const TaskPtr& task) { auto search_job = std::static_pointer_cast(search_task->job_.lock()); ResourcePtr res_ptr; if (search_job->nq() < threshold_) { + SERVER_LOG_DEBUG << "FaissIVFFlatPass: nq < gpu_search_threshold, specify cpu to search!"; res_ptr = ResMgrInst::GetInstance()->GetResource("cpu"); } else { auto best_device_id = count_ % gpus.size(); + SERVER_LOG_DEBUG << "FaissIVFFlatPass: nq > gpu_search_threshold, specify gpu" << best_device_id + << " to search!"; count_++; res_ptr = ResMgrInst::GetInstance()->GetResource(ResourceType::GPU, best_device_id); } diff --git 
a/core/src/scheduler/optimizer/FaissIVFSQ8HPass.cpp b/core/src/scheduler/optimizer/FaissIVFSQ8HPass.cpp index a99e861e03..676ed1720e 100644 --- a/core/src/scheduler/optimizer/FaissIVFSQ8HPass.cpp +++ b/core/src/scheduler/optimizer/FaissIVFSQ8HPass.cpp @@ -51,9 +51,12 @@ FaissIVFSQ8HPass::Run(const TaskPtr& task) { auto search_job = std::static_pointer_cast(search_task->job_.lock()); ResourcePtr res_ptr; if (search_job->nq() < threshold_) { + SERVER_LOG_DEBUG << "FaissIVFSQ8HPass: nq < gpu_search_threshold, specify cpu to search!"; res_ptr = ResMgrInst::GetInstance()->GetResource("cpu"); } else { auto best_device_id = count_ % gpus.size(); + SERVER_LOG_DEBUG << "FaissIVFSQ8HPass: nq > gpu_search_threshold, specify gpu" << best_device_id + << " to search!"; count_++; res_ptr = ResMgrInst::GetInstance()->GetResource(ResourceType::GPU, best_device_id); } diff --git a/core/src/scheduler/optimizer/FaissIVFSQ8Pass.cpp b/core/src/scheduler/optimizer/FaissIVFSQ8Pass.cpp index 30dd306b3b..4f06c1e1fc 100644 --- a/core/src/scheduler/optimizer/FaissIVFSQ8Pass.cpp +++ b/core/src/scheduler/optimizer/FaissIVFSQ8Pass.cpp @@ -54,9 +54,12 @@ FaissIVFSQ8Pass::Run(const TaskPtr& task) { auto search_job = std::static_pointer_cast(search_task->job_.lock()); ResourcePtr res_ptr; if (search_job->nq() < threshold_) { + SERVER_LOG_DEBUG << "FaissIVFSQ8Pass: nq < gpu_search_threshold, specify cpu to search!"; res_ptr = ResMgrInst::GetInstance()->GetResource("cpu"); } else { auto best_device_id = count_ % gpus.size(); + SERVER_LOG_DEBUG << "FaissIVFSQ8Pass: nq > gpu_search_threshold, specify gpu" << best_device_id + << " to search!"; count_++; res_ptr = ResMgrInst::GetInstance()->GetResource(ResourceType::GPU, best_device_id); } diff --git a/core/src/scheduler/optimizer/FallbackPass.cpp b/core/src/scheduler/optimizer/FallbackPass.cpp index 2e275ede4b..15687bb9bb 100644 --- a/core/src/scheduler/optimizer/FallbackPass.cpp +++ b/core/src/scheduler/optimizer/FallbackPass.cpp @@ -33,6 +33,7 @@ FallbackPass::Run(const TaskPtr& task) { return false; } // NEVER be empty + SERVER_LOG_DEBUG << "FallbackPass!"; auto cpu = ResMgrInst::GetInstance()->GetCpuResources()[0]; auto label = std::make_shared(cpu); task->label() = label; From 288490f980f279e5ebe74735e06e3845054aca85 Mon Sep 17 00:00:00 2001 From: fishpenguin Date: Sat, 23 Nov 2019 11:07:42 +0800 Subject: [PATCH 25/32] GPU not used during index building --- CHANGELOG.md | 1 + core/src/scheduler/optimizer/BuildIndexPass.cpp | 4 ++-- core/src/scheduler/optimizer/BuildIndexPass.h | 2 +- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 7a518a8d41..ccbbe9f64e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -20,6 +20,7 @@ Please mark all change in change log and use the ticket from JIRA. 
- \#440 - Query API in customization still uses old version - \#440 - Server cannot startup with gpu_resource_config.enable=false in GPU version - \#458 - Index data is not compatible between 0.5 and 0.6 +- \#486 - GPU not used during index building ## Feature - \#12 - Pure CPU version for Milvus diff --git a/core/src/scheduler/optimizer/BuildIndexPass.cpp b/core/src/scheduler/optimizer/BuildIndexPass.cpp index faa451bc56..55e28ec672 100644 --- a/core/src/scheduler/optimizer/BuildIndexPass.cpp +++ b/core/src/scheduler/optimizer/BuildIndexPass.cpp @@ -26,8 +26,7 @@ namespace scheduler { void BuildIndexPass::Init() { server::Config& config = server::Config::GetInstance(); - std::vector build_resources; - Status s = config.GetGpuResourceConfigBuildIndexResources(build_resources); + Status s = config.GetGpuResourceConfigBuildIndexResources(build_gpu_ids_); if (!s.ok()) { throw; } @@ -47,6 +46,7 @@ BuildIndexPass::Run(const TaskPtr& task) { res_ptr = ResMgrInst::GetInstance()->GetResource(ResourceType::GPU, build_gpu_ids_[specified_gpu_id_]); auto label = std::make_shared(std::weak_ptr(res_ptr)); task->label() = label; + SERVER_LOG_DEBUG << "Specify gpu" << specified_gpu_id_ << " to build index!"; specified_gpu_id_ = (specified_gpu_id_ + 1) % build_gpu_ids_.size(); return true; diff --git a/core/src/scheduler/optimizer/BuildIndexPass.h b/core/src/scheduler/optimizer/BuildIndexPass.h index 4f7117fc4e..3adf1259a7 100644 --- a/core/src/scheduler/optimizer/BuildIndexPass.h +++ b/core/src/scheduler/optimizer/BuildIndexPass.h @@ -45,7 +45,7 @@ class BuildIndexPass : public Pass { private: uint64_t specified_gpu_id_ = 0; - std::vector build_gpu_ids_; + std::vector build_gpu_ids_; }; using BuildIndexPassPtr = std::shared_ptr; From 0097962642a4b7ae60eed06ee5662651ae713db8 Mon Sep 17 00:00:00 2001 From: Yukikaze-CZR Date: Sat, 23 Nov 2019 11:15:40 +0800 Subject: [PATCH 26/32] [skip ci] Small change on test report --- ..._ivfsq8_test_report_detailed_version_cn.md | 28 +++++++++---------- ...ivfsq8h_test_report_detailed_version_cn.md | 2 +- 2 files changed, 15 insertions(+), 15 deletions(-) diff --git a/docs/test_report/milvus_ivfsq8_test_report_detailed_version_cn.md b/docs/test_report/milvus_ivfsq8_test_report_detailed_version_cn.md index a6e5e75ea4..098f9e69a4 100644 --- a/docs/test_report/milvus_ivfsq8_test_report_detailed_version_cn.md +++ b/docs/test_report/milvus_ivfsq8_test_report_detailed_version_cn.md @@ -16,25 +16,25 @@ ### 软硬件环境 -操作系统: CentOS Linux release 7.6.1810 (Core) +操作系统:CentOS Linux release 7.6.1810 (Core) -CPU: Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz +CPU:Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz -GPU0: GeForce GTX 1080 +GPU0:GeForce GTX 1080 -GPU1: GeForce GTX 1080 +GPU1:GeForce GTX 1080 -内存: 503GB +内存:503GB -Docker版本: 18.09 +Docker版本:18.09 -NVIDIA Driver版本: 430.34 +NVIDIA Driver版本:430.34 -Milvus版本: 0.5.3 +Milvus版本:0.5.3 -SDK接口: Python 3.6.8 +SDK接口:Python 3.6.8 -pymilvus版本: 0.2.5 +pymilvus版本:0.2.5 @@ -51,7 +51,7 @@ pymilvus版本: 0.2.5 ### 测试指标 -- Query Elapsed Time: 数据库查询所有向量的时间(以秒计)。影响Query Elapsed Time的变量: +- Query Elapsed Time:数据库查询所有向量的时间(以秒计)。影响Query Elapsed Time的变量: - nq (被查询向量的数量) > > 被查询向量的数量nq将按照 [1, 5, 10, 200, 400, 600, 800, 1000]的数量分组。 -- Recall: 实际返回的正确结果占总数之比 . 
影响Recall的变量: +- Recall:实际返回的正确结果占总数之比。影响Recall的变量: - nq (被查询向量的数量) - topk (单条查询中最相似的K个结果) @@ -76,7 +76,7 @@ pymilvus版本: 0.2.5 ### 测试环境 -数据集: sift1b-1,000,000,000向量, 128维 +数据集:sift1b-1,000,000,000向量,128维 表格属性: @@ -143,7 +143,7 @@ search_resources: cpu, gpu0 | nq=800 | 23.24 | | nq=1000 | 27.41 | -当nq为1000时,在GPU模式下查询一条128维向量需要耗时约27毫秒。 +当nq为1000时,在CPU模式下查询一条128维向量需要耗时约27毫秒。 diff --git a/docs/test_report/milvus_ivfsq8h_test_report_detailed_version_cn.md b/docs/test_report/milvus_ivfsq8h_test_report_detailed_version_cn.md index b50d00f9bd..daac2af545 100644 --- a/docs/test_report/milvus_ivfsq8h_test_report_detailed_version_cn.md +++ b/docs/test_report/milvus_ivfsq8h_test_report_detailed_version_cn.md @@ -139,7 +139,7 @@ topk = 100 **总结** -当nq小于1200时,查询耗时随nq的增长快速增大;当nq大于1200时,查询耗时的增大则缓慢许多。这是因为gpu_search_threshold这一参数的值被设为1200,当nq<1200时,选择CPU进行操作,否则选择GPU进行操作。与CPU。 +当nq小于1200时,查询耗时随nq的增长快速增大;当nq大于1200时,查询耗时的增大则缓慢许多。这是因为gpu_search_threshold这一参数的值被设为1200,当nq小于1200时,选择CPU进行操作,否则选择GPU进行操作。 在GPU模式下的查询耗时由两部分组成:(1)索引从CPU到GPU的拷贝时间;(2)所有分桶的查询时间。当nq小于500时,索引从CPU到GPU 的拷贝时间无法被有效均摊,此时CPU模式时一个更优的选择;当nq大于500时,选择GPU模式更合理。和CPU相比,GPU具有更多的核数和更强的算力。当nq较大时,GPU在计算上的优势能被更好地被体现。 From 2b805489a1e29a7f6d39a80387edeff45660bca6 Mon Sep 17 00:00:00 2001 From: groot Date: Sat, 23 Nov 2019 11:34:45 +0800 Subject: [PATCH 27/32] #485 Increase code coverage rate --- core/src/cache/GpuCacheMgr.cpp | 2 + core/src/cache/GpuCacheMgr.h | 2 + core/src/db/engine/ExecutionEngineImpl.cpp | 13 ++++++- core/src/main.cpp | 4 +- .../scheduler/optimizer/BuildIndexPass.cpp | 2 + .../src/scheduler/optimizer/FaissFlatPass.cpp | 2 + .../scheduler/optimizer/FaissIVFFlatPass.cpp | 2 + .../scheduler/optimizer/FaissIVFSQ8HPass.cpp | 2 + .../scheduler/optimizer/FaissIVFSQ8Pass.cpp | 2 + core/src/server/Config.cpp | 14 +++++-- core/src/server/Config.h | 15 ++++++-- core/src/server/Server.cpp | 6 ++- core/src/utils/StringHelpFunctions.h | 6 +++ core/src/utils/ValidationUtil.cpp | 6 +-- core/src/utils/ValidationUtil.h | 2 + core/unittest/db/utils.cpp | 2 +- core/unittest/scheduler/test_scheduler.cpp | 38 ++++++++----------- core/unittest/server/test_cache.cpp | 2 + core/unittest/server/test_config.cpp | 6 +-- core/unittest/server/test_rpc.cpp | 3 ++ core/unittest/server/test_util.cpp | 15 ++++++++ 21 files changed, 104 insertions(+), 42 deletions(-) diff --git a/core/src/cache/GpuCacheMgr.cpp b/core/src/cache/GpuCacheMgr.cpp index 72229527fa..1802fb3935 100644 --- a/core/src/cache/GpuCacheMgr.cpp +++ b/core/src/cache/GpuCacheMgr.cpp @@ -25,6 +25,7 @@ namespace milvus { namespace cache { +#ifdef MILVUS_GPU_VERSION std::mutex GpuCacheMgr::mutex_; std::unordered_map GpuCacheMgr::instance_; @@ -76,6 +77,7 @@ GpuCacheMgr::GetIndex(const std::string& key) { DataObjPtr obj = GetItem(key); return obj; } +#endif } // namespace cache } // namespace milvus diff --git a/core/src/cache/GpuCacheMgr.h b/core/src/cache/GpuCacheMgr.h index 4d434b2cfb..06dd44cca2 100644 --- a/core/src/cache/GpuCacheMgr.h +++ b/core/src/cache/GpuCacheMgr.h @@ -25,6 +25,7 @@ namespace milvus { namespace cache { +#ifdef MILVUS_GPU_VERSION class GpuCacheMgr; using GpuCacheMgrPtr = std::shared_ptr; @@ -42,6 +43,7 @@ class GpuCacheMgr : public CacheMgr { static std::mutex mutex_; static std::unordered_map instance_; }; +#endif } // namespace cache } // namespace milvus diff --git a/core/src/db/engine/ExecutionEngineImpl.cpp b/core/src/db/engine/ExecutionEngineImpl.cpp index ca307b90fc..c0ab4e829e 100644 --- a/core/src/db/engine/ExecutionEngineImpl.cpp +++ 
b/core/src/db/engine/ExecutionEngineImpl.cpp @@ -151,6 +151,7 @@ ExecutionEngineImpl::HybridLoad() const { return; } +#ifdef MILVUS_GPU_VERSION const std::string key = location_ + ".quantizer"; server::Config& config = server::Config::GetInstance(); @@ -205,6 +206,7 @@ ExecutionEngineImpl::HybridLoad() const { auto cache_quantizer = std::make_shared(quantizer); cache::GpuCacheMgr::GetInstance(best_device_id)->InsertItem(key, cache_quantizer); } +#endif } void @@ -342,6 +344,7 @@ ExecutionEngineImpl::CopyToGpu(uint64_t device_id, bool hybrid) { } #endif +#ifdef MILVUS_GPU_VERSION auto index = std::static_pointer_cast(cache::GpuCacheMgr::GetInstance(device_id)->GetIndex(location_)); bool already_in_cache = (index != nullptr); if (already_in_cache) { @@ -364,16 +367,19 @@ ExecutionEngineImpl::CopyToGpu(uint64_t device_id, bool hybrid) { if (!already_in_cache) { GpuCache(device_id); } +#endif return Status::OK(); } Status ExecutionEngineImpl::CopyToIndexFileToGpu(uint64_t device_id) { +#ifdef MILVUS_GPU_VERSION gpu_num_ = device_id; auto to_index_data = std::make_shared(PhysicalSize()); cache::DataObjPtr obj = std::static_pointer_cast(to_index_data); milvus::cache::GpuCacheMgr::GetInstance(device_id)->InsertItem(location_, obj); +#endif return Status::OK(); } @@ -584,15 +590,17 @@ ExecutionEngineImpl::Cache() { Status ExecutionEngineImpl::GpuCache(uint64_t gpu_id) { +#ifdef MILVUS_GPU_VERSION cache::DataObjPtr obj = std::static_pointer_cast(index_); milvus::cache::GpuCacheMgr::GetInstance(gpu_id)->InsertItem(location_, obj); - +#endif return Status::OK(); } // TODO(linxj): remove. Status ExecutionEngineImpl::Init() { +#ifdef MILVUS_GPU_VERSION server::Config& config = server::Config::GetInstance(); std::vector gpu_ids; Status s = config.GetGpuResourceConfigBuildIndexResources(gpu_ids); @@ -604,6 +612,9 @@ ExecutionEngineImpl::Init() { std::string msg = "Invalid gpu_num"; return Status(SERVER_INVALID_ARGUMENT, msg); +#else + return Status::OK(); +#endif } } // namespace engine diff --git a/core/src/main.cpp b/core/src/main.cpp index b39ba87997..5c97a061d2 100644 --- a/core/src/main.cpp +++ b/core/src/main.cpp @@ -59,9 +59,9 @@ print_banner() { #endif << " library." 
<< std::endl; #ifdef MILVUS_CPU_VERSION - std::cout << "You are using Milvus CPU version" << std::endl; + std::cout << "You are using Milvus CPU edition" << std::endl; #else - std::cout << "You are using Milvus GPU version" << std::endl; + std::cout << "You are using Milvus GPU edition" << std::endl; #endif std::cout << std::endl; } diff --git a/core/src/scheduler/optimizer/BuildIndexPass.cpp b/core/src/scheduler/optimizer/BuildIndexPass.cpp index d535b9675f..b5cd8eb0f1 100644 --- a/core/src/scheduler/optimizer/BuildIndexPass.cpp +++ b/core/src/scheduler/optimizer/BuildIndexPass.cpp @@ -25,12 +25,14 @@ namespace scheduler { void BuildIndexPass::Init() { +#ifdef MILVUS_GPU_VERSION server::Config& config = server::Config::GetInstance(); std::vector build_resources; Status s = config.GetGpuResourceConfigBuildIndexResources(build_resources); if (!s.ok()) { throw; } +#endif } bool diff --git a/core/src/scheduler/optimizer/FaissFlatPass.cpp b/core/src/scheduler/optimizer/FaissFlatPass.cpp index 61ca1b9ec9..a34d9a5951 100644 --- a/core/src/scheduler/optimizer/FaissFlatPass.cpp +++ b/core/src/scheduler/optimizer/FaissFlatPass.cpp @@ -29,6 +29,7 @@ namespace scheduler { void FaissFlatPass::Init() { +#ifdef MILVUS_GPU_VERSION server::Config& config = server::Config::GetInstance(); Status s = config.GetEngineConfigGpuSearchThreshold(threshold_); if (!s.ok()) { @@ -38,6 +39,7 @@ FaissFlatPass::Init() { if (!s.ok()) { throw; } +#endif } bool diff --git a/core/src/scheduler/optimizer/FaissIVFFlatPass.cpp b/core/src/scheduler/optimizer/FaissIVFFlatPass.cpp index 1f1efb374b..cec44ddfe9 100644 --- a/core/src/scheduler/optimizer/FaissIVFFlatPass.cpp +++ b/core/src/scheduler/optimizer/FaissIVFFlatPass.cpp @@ -29,6 +29,7 @@ namespace scheduler { void FaissIVFFlatPass::Init() { +#ifdef MILVUS_GPU_VERSION server::Config& config = server::Config::GetInstance(); Status s = config.GetEngineConfigGpuSearchThreshold(threshold_); if (!s.ok()) { @@ -38,6 +39,7 @@ FaissIVFFlatPass::Init() { if (!s.ok()) { throw; } +#endif } bool diff --git a/core/src/scheduler/optimizer/FaissIVFSQ8HPass.cpp b/core/src/scheduler/optimizer/FaissIVFSQ8HPass.cpp index a99e861e03..f9d8b91bd8 100644 --- a/core/src/scheduler/optimizer/FaissIVFSQ8HPass.cpp +++ b/core/src/scheduler/optimizer/FaissIVFSQ8HPass.cpp @@ -29,12 +29,14 @@ namespace scheduler { void FaissIVFSQ8HPass::Init() { +#ifdef MILVUS_GPU_VERSION server::Config& config = server::Config::GetInstance(); Status s = config.GetEngineConfigGpuSearchThreshold(threshold_); if (!s.ok()) { threshold_ = std::numeric_limits::max(); } s = config.GetGpuResourceConfigSearchResources(gpus); +#endif } bool diff --git a/core/src/scheduler/optimizer/FaissIVFSQ8Pass.cpp b/core/src/scheduler/optimizer/FaissIVFSQ8Pass.cpp index 30dd306b3b..4b6d199f0a 100644 --- a/core/src/scheduler/optimizer/FaissIVFSQ8Pass.cpp +++ b/core/src/scheduler/optimizer/FaissIVFSQ8Pass.cpp @@ -29,6 +29,7 @@ namespace scheduler { void FaissIVFSQ8Pass::Init() { +#ifdef MILVUS_GPU_VERSION server::Config& config = server::Config::GetInstance(); Status s = config.GetEngineConfigGpuSearchThreshold(threshold_); if (!s.ok()) { @@ -38,6 +39,7 @@ FaissIVFSQ8Pass::Init() { if (!s.ok()) { throw; } +#endif } bool diff --git a/core/src/server/Config.cpp b/core/src/server/Config.cpp index 5465c6c505..95bab84391 100644 --- a/core/src/server/Config.cpp +++ b/core/src/server/Config.cpp @@ -182,6 +182,7 @@ Config::ValidateConfig() { return s; } +#ifdef MILVUS_GPU_VERSION int64_t engine_gpu_search_threshold; s = 
     if (!s.ok()) {
@@ -195,7 +196,6 @@ Config::ValidateConfig() {
         return s;
     }
 
-#ifdef MILVUS_GPU_VERSION
     if (gpu_resource_enable) {
         int64_t resource_cache_capacity;
         s = GetGpuResourceConfigCacheCapacity(resource_cache_capacity);
@@ -325,13 +325,13 @@ Config::ResetDefaultConfig() {
         return s;
     }
 
+#ifdef MILVUS_GPU_VERSION
+    /* gpu resource config */
     s = SetEngineConfigGpuSearchThreshold(CONFIG_ENGINE_GPU_SEARCH_THRESHOLD_DEFAULT);
     if (!s.ok()) {
         return s;
     }
 
-    /* gpu resource config */
-#ifdef MILVUS_GPU_VERSION
     s = SetGpuResourceConfigEnable(CONFIG_GPU_RESOURCE_ENABLE_DEFAULT);
     if (!s.ok()) {
         return s;
@@ -632,6 +632,7 @@ Config::CheckEngineConfigOmpThreadNum(const std::string& value) {
     return Status::OK();
 }
 
+#ifdef MILVUS_GPU_VERSION
 Status
 Config::CheckEngineConfigGpuSearchThreshold(const std::string& value) {
     if (!ValidationUtil::ValidateStringIsNumber(value).ok()) {
@@ -761,6 +762,7 @@ Config::CheckGpuResourceConfigBuildIndexResources(const std::vector<std::string>& value) {
     return Status::OK();
 }
+#endif
 
 ////////////////////////////////////////////////////////////////////////////////
 ConfigNode&
@@ -981,6 +983,7 @@ Config::GetEngineConfigOmpThreadNum(int64_t& value) {
     return Status::OK();
 }
 
+#ifdef MILVUS_GPU_VERSION
 Status
 Config::GetEngineConfigGpuSearchThreshold(int64_t& value) {
     std::string str =
@@ -1097,6 +1100,7 @@ Config::GetGpuResourceConfigBuildIndexResources(std::vector<int64_t>& value) {
     }
     return Status::OK();
 }
+#endif
 
 ///////////////////////////////////////////////////////////////////////////////
 /* server config */
@@ -1284,6 +1288,8 @@ Config::SetEngineConfigOmpThreadNum(const std::string& value) {
     return Status::OK();
 }
 
+#ifdef MILVUS_GPU_VERSION
+/* gpu resource config */
 Status
 Config::SetEngineConfigGpuSearchThreshold(const std::string& value) {
     Status s = CheckEngineConfigGpuSearchThreshold(value);
@@ -1294,7 +1300,6 @@ Config::SetEngineConfigGpuSearchThreshold(const std::string& value) {
     return Status::OK();
 }
 
-/* gpu resource config */
 Status
 Config::SetGpuResourceConfigEnable(const std::string& value) {
     Status s = CheckGpuResourceConfigEnable(value);
@@ -1348,6 +1353,7 @@ Config::SetGpuResourceConfigBuildIndexResources(const std::string& value) {
     SetConfigValueInMem(CONFIG_GPU_RESOURCE, CONFIG_GPU_RESOURCE_BUILD_INDEX_RESOURCES, value);
     return Status::OK();
 }  // namespace server
+#endif
 
 }  // namespace server
 }  // namespace milvus
diff --git a/core/src/server/Config.h b/core/src/server/Config.h
index 0907080a6f..4e4923ee07 100644
--- a/core/src/server/Config.h
+++ b/core/src/server/Config.h
@@ -170,10 +170,11 @@ class Config {
     CheckEngineConfigUseBlasThreshold(const std::string& value);
     Status
     CheckEngineConfigOmpThreadNum(const std::string& value);
+
+#ifdef MILVUS_GPU_VERSION
+    /* gpu resource config */
     Status
     CheckEngineConfigGpuSearchThreshold(const std::string& value);
-
-    /* gpu resource config */
     Status
     CheckGpuResourceConfigEnable(const std::string& value);
     Status
@@ -184,6 +185,7 @@ class Config {
     CheckGpuResourceConfigSearchResources(const std::vector<std::string>& value);
     Status
     CheckGpuResourceConfigBuildIndexResources(const std::vector<std::string>& value);
+#endif
 
     std::string
     GetConfigStr(const std::string& parent_key, const std::string& child_key, const std::string& default_value = "");
@@ -239,6 +241,8 @@ class Config {
     GetEngineConfigUseBlasThreshold(int64_t& value);
     Status
     GetEngineConfigOmpThreadNum(int64_t& value);
+
+#ifdef MILVUS_GPU_VERSION
     Status
     GetEngineConfigGpuSearchThreshold(int64_t& value);
 
@@ -253,6 +257,7 @@ class Config {
     GetGpuResourceConfigSearchResources(std::vector<int64_t>& value);
     Status
     GetGpuResourceConfigBuildIndexResources(std::vector<int64_t>& value);
+#endif
 
  public:
     /* server config */
@@ -300,10 +305,11 @@ class Config {
     SetEngineConfigUseBlasThreshold(const std::string& value);
     Status
     SetEngineConfigOmpThreadNum(const std::string& value);
+
+#ifdef MILVUS_GPU_VERSION
+    /* gpu resource config */
     Status
     SetEngineConfigGpuSearchThreshold(const std::string& value);
-
-    /* gpu resource config */
     Status
     SetGpuResourceConfigEnable(const std::string& value);
     Status
@@ -314,6 +320,7 @@ class Config {
     SetGpuResourceConfigSearchResources(const std::string& value);
     Status
     SetGpuResourceConfigBuildIndexResources(const std::string& value);
+#endif
 
  private:
     std::unordered_map<std::string, std::unordered_map<std::string, std::string>> config_map_;
diff --git a/core/src/server/Server.cpp b/core/src/server/Server.cpp
index 5676504722..169463080e 100644
--- a/core/src/server/Server.cpp
+++ b/core/src/server/Server.cpp
@@ -183,7 +183,11 @@ Server::Start() {
         // print version information
         SERVER_LOG_INFO << "Milvus " << BUILD_TYPE << " version: v" << MILVUS_VERSION << ", built at " << BUILD_TIME;
-
+#ifdef MILVUS_CPU_VERSION
+        SERVER_LOG_INFO << "CPU edition";
+#else
+        SERVER_LOG_INFO << "GPU edition";
+#endif
         server::Metrics::GetInstance().Init();
         server::SystemInfo::GetInstance().Init();
 
diff --git a/core/src/utils/StringHelpFunctions.h b/core/src/utils/StringHelpFunctions.h
index 3a41e53f4b..51812fae40 100644
--- a/core/src/utils/StringHelpFunctions.h
+++ b/core/src/utils/StringHelpFunctions.h
@@ -30,9 +30,13 @@ class StringHelpFunctions {
     StringHelpFunctions() = default;
 
  public:
+    // trim blanks from begin and end
+    // " a b c " => "a b c"
     static void
     TrimStringBlank(std::string& string);
 
+    // trim quotes from begin and end
+    // "'abc'" => "abc"
     static void
     TrimStringQuote(std::string& string, const std::string& qoute);
 
@@ -46,6 +50,8 @@ class StringHelpFunctions {
     static void
     SplitStringByDelimeter(const std::string& str, const std::string& delimeter, std::vector<std::string>& result);
 
+    // merge strings with delimeter
+    // "a", "b", "c" => "a,b,c"
     static void
     MergeStringWithDelimeter(const std::vector<std::string>& strs, const std::string& delimeter, std::string& result);
 
diff --git a/core/src/utils/ValidationUtil.cpp b/core/src/utils/ValidationUtil.cpp
index 12b2372fc5..2d1a0e257e 100644
--- a/core/src/utils/ValidationUtil.cpp
+++ b/core/src/utils/ValidationUtil.cpp
@@ -218,10 +218,9 @@ ValidationUtil::ValidateGpuIndex(int32_t gpu_index) {
     return Status::OK();
 }
 
+#ifdef MILVUS_GPU_VERSION
 Status
 ValidationUtil::GetGpuMemory(int32_t gpu_index, size_t& memory) {
-#ifdef MILVUS_GPU_VERSION
-
     cudaDeviceProp deviceProp;
     auto cuda_err = cudaGetDeviceProperties(&deviceProp, gpu_index);
     if (cuda_err) {
@@ -232,10 +231,9 @@ ValidationUtil::GetGpuMemory(int32_t gpu_index, size_t& memory) {
 
     memory = deviceProp.totalGlobalMem;
-#endif
-
     return Status::OK();
 }
+#endif
 
 Status
 ValidationUtil::ValidateIpAddress(const std::string& ip_address) {
diff --git a/core/src/utils/ValidationUtil.h b/core/src/utils/ValidationUtil.h
index ab32c35c40..bc523654e5 100644
--- a/core/src/utils/ValidationUtil.h
+++ b/core/src/utils/ValidationUtil.h
@@ -64,8 +64,10 @@ class ValidationUtil {
     static Status
     ValidateGpuIndex(int32_t gpu_index);
 
+#ifdef MILVUS_GPU_VERSION
     static Status
     GetGpuMemory(int32_t gpu_index, size_t& memory);
+#endif
 
     static Status
     ValidateIpAddress(const std::string& ip_address);
 
diff --git a/core/unittest/db/utils.cpp b/core/unittest/db/utils.cpp
index afa1d39006..293eeccc69 100644
--- a/core/unittest/db/utils.cpp
+++ b/core/unittest/db/utils.cpp
@@ -132,8 +132,8 @@ BaseTest::SetUp() {
 void
 BaseTest::TearDown() {
     milvus::cache::CpuCacheMgr::GetInstance()->ClearCache();
-    milvus::cache::GpuCacheMgr::GetInstance(0)->ClearCache();
 #ifdef MILVUS_GPU_VERSION
+    milvus::cache::GpuCacheMgr::GetInstance(0)->ClearCache();
     knowhere::FaissGpuResourceMgr::GetInstance().Free();
 #endif
 }
diff --git a/core/unittest/scheduler/test_scheduler.cpp b/core/unittest/scheduler/test_scheduler.cpp
index 72538113c3..c839307958 100644
--- a/core/unittest/scheduler/test_scheduler.cpp
+++ b/core/unittest/scheduler/test_scheduler.cpp
@@ -98,24 +98,25 @@ class SchedulerTest : public testing::Test {
  protected:
     void
     SetUp() override {
+        res_mgr_ = std::make_shared<ResourceMgr>();
+        ResourcePtr disk = ResourceFactory::Create("disk", "DISK", 0, true, false);
+        ResourcePtr cpu = ResourceFactory::Create("cpu", "CPU", 0, true, false);
+        disk_resource_ = res_mgr_->Add(std::move(disk));
+        cpu_resource_ = res_mgr_->Add(std::move(cpu));
+
+#ifdef MILVUS_GPU_VERSION
         constexpr int64_t cache_cap = 1024 * 1024 * 1024;
         cache::GpuCacheMgr::GetInstance(0)->SetCapacity(cache_cap);
         cache::GpuCacheMgr::GetInstance(1)->SetCapacity(cache_cap);
-
-        ResourcePtr disk = ResourceFactory::Create("disk", "DISK", 0, true, false);
-        ResourcePtr cpu = ResourceFactory::Create("cpu", "CPU", 0, true, false);
         ResourcePtr gpu_0 = ResourceFactory::Create("gpu0", "GPU", 0);
         ResourcePtr gpu_1 = ResourceFactory::Create("gpu1", "GPU", 1);
-
-        res_mgr_ = std::make_shared<ResourceMgr>();
-        disk_resource_ = res_mgr_->Add(std::move(disk));
-        cpu_resource_ = res_mgr_->Add(std::move(cpu));
         gpu_resource_0_ = res_mgr_->Add(std::move(gpu_0));
         gpu_resource_1_ = res_mgr_->Add(std::move(gpu_1));
 
         auto PCIE = Connection("IO", 11000.0);
         res_mgr_->Connect("cpu", "gpu0", PCIE);
         res_mgr_->Connect("cpu", "gpu1", PCIE);
+#endif
 
         scheduler_ = std::make_shared<Scheduler>(res_mgr_);
 
@@ -138,17 +139,6 @@ class SchedulerTest : public testing::Test {
     std::shared_ptr<Scheduler> scheduler_;
 };
 
-void
-insert_dummy_index_into_gpu_cache(uint64_t device_id) {
-    MockVecIndex* mock_index = new MockVecIndex();
-    mock_index->ntotal_ = 1000;
-    engine::VecIndexPtr index(mock_index);
-
-    cache::DataObjPtr obj = std::static_pointer_cast<cache::DataObj>(index);
-
-    cache::GpuCacheMgr::GetInstance(device_id)->InsertItem("location", obj);
-}
-
 class SchedulerTest2 : public testing::Test {
  protected:
     void
     SetUp() override {
         ResourcePtr disk = ResourceFactory::Create("disk", "DISK", 0, true, false);
         ResourcePtr cpu0 = ResourceFactory::Create("cpu0", "CPU", 0, true, false);
         ResourcePtr cpu1 = ResourceFactory::Create("cpu1", "CPU", 1, true, false);
         ResourcePtr cpu2 = ResourceFactory::Create("cpu2", "CPU", 2, true, false);
-        ResourcePtr gpu0 = ResourceFactory::Create("gpu0", "GPU", 0, true, true);
-        ResourcePtr gpu1 = ResourceFactory::Create("gpu1", "GPU", 1, true, true);
 
         res_mgr_ = std::make_shared<ResourceMgr>();
         disk_ = res_mgr_->Add(std::move(disk));
         cpu_0_ = res_mgr_->Add(std::move(cpu0));
         cpu_1_ = res_mgr_->Add(std::move(cpu1));
         cpu_2_ = res_mgr_->Add(std::move(cpu2));
-        gpu_0_ = res_mgr_->Add(std::move(gpu0));
-        gpu_1_ = res_mgr_->Add(std::move(gpu1));
+
         auto IO = Connection("IO", 5.0);
         auto PCIE1 = Connection("PCIE", 11.0);
         auto PCIE2 = Connection("PCIE", 20.0);
         res_mgr_->Connect("cpu0", "cpu1", IO);
         res_mgr_->Connect("cpu1", "cpu2", IO);
         res_mgr_->Connect("cpu0", "cpu2", IO);
+
+#ifdef MILVUS_GPU_VERSION
+        ResourcePtr gpu0 = ResourceFactory::Create("gpu0", "GPU", 0, true, true);
+        ResourcePtr gpu1 = ResourceFactory::Create("gpu1", "GPU", 1, true, true);
+        gpu_0_ = res_mgr_->Add(std::move(gpu0));
+        gpu_1_ = res_mgr_->Add(std::move(gpu1));
         res_mgr_->Connect("cpu1", "gpu0", PCIE1);
         res_mgr_->Connect("cpu2", "gpu1", PCIE2);
+#endif
 
         scheduler_ = std::make_shared<Scheduler>(res_mgr_);
 
diff --git a/core/unittest/server/test_cache.cpp b/core/unittest/server/test_cache.cpp
index 67e9664d2c..92e09d4a26 100644
--- a/core/unittest/server/test_cache.cpp
+++ b/core/unittest/server/test_cache.cpp
@@ -175,6 +175,7 @@ TEST(CacheTest, CPU_CACHE_TEST) {
     cpu_mgr->PrintInfo();
 }
 
+#ifdef MILVUS_GPU_VERSION
 TEST(CacheTest, GPU_CACHE_TEST) {
     auto gpu_mgr = milvus::cache::GpuCacheMgr::GetInstance(0);
 
@@ -202,6 +203,7 @@ TEST(CacheTest, GPU_CACHE_TEST) {
     gpu_mgr->ClearCache();
     ASSERT_EQ(gpu_mgr->ItemCount(), 0);
 }
+#endif
 
 TEST(CacheTest, INVALID_TEST) {
     {
diff --git a/core/unittest/server/test_config.cpp b/core/unittest/server/test_config.cpp
index ce0ae9fa80..3a24c02a45 100644
--- a/core/unittest/server/test_config.cpp
+++ b/core/unittest/server/test_config.cpp
@@ -250,6 +250,7 @@ TEST_F(ConfigTest, SERVER_CONFIG_VALID_TEST) {
     ASSERT_TRUE(s.ok());
     ASSERT_TRUE(int64_val == engine_omp_thread_num);
 
+#ifdef MILVUS_GPU_VERSION
     int64_t engine_gpu_search_threshold = 800;
     s = config.SetEngineConfigGpuSearchThreshold(std::to_string(engine_gpu_search_threshold));
     ASSERT_TRUE(s.ok());
@@ -265,7 +266,6 @@ TEST_F(ConfigTest, SERVER_CONFIG_VALID_TEST) {
     ASSERT_TRUE(s.ok());
     ASSERT_TRUE(bool_val == resource_enable_gpu);
 
-#ifdef MILVUS_GPU_VERSION
     int64_t gpu_cache_capacity = 1;
     s = config.SetGpuResourceConfigCacheCapacity(std::to_string(gpu_cache_capacity));
     ASSERT_TRUE(s.ok());
@@ -403,14 +403,14 @@ TEST_F(ConfigTest, SERVER_CONFIG_INVALID_TEST) {
     s = config.SetEngineConfigOmpThreadNum("10000");
     ASSERT_FALSE(s.ok());
 
+#ifdef MILVUS_GPU_VERSION
+    /* gpu resource config */
     s = config.SetEngineConfigGpuSearchThreshold("-1");
     ASSERT_FALSE(s.ok());
 
-    /* gpu resource config */
     s = config.SetGpuResourceConfigEnable("ok");
     ASSERT_FALSE(s.ok());
 
-#ifdef MILVUS_GPU_VERSION
     s = config.SetGpuResourceConfigCacheCapacity("a");
     ASSERT_FALSE(s.ok());
     s = config.SetGpuResourceConfigCacheCapacity("128");
diff --git a/core/unittest/server/test_rpc.cpp b/core/unittest/server/test_rpc.cpp
index 5753c68422..e04411fad3 100644
--- a/core/unittest/server/test_rpc.cpp
+++ b/core/unittest/server/test_rpc.cpp
@@ -313,6 +313,9 @@ TEST_F(RpcHandlerTest, TABLES_TEST) {
     std::vector<std::vector<float>> record_array;
     BuildVectors(0, VECTOR_COUNT, record_array);
     ::milvus::grpc::VectorIds vector_ids;
+    for (int64_t i = 0; i < VECTOR_COUNT; i++) {
+        vector_ids.add_vector_id_array(i);
+    }
     // Insert vectors
     // test invalid table name
     handler->Insert(&context, &request, &vector_ids);
diff --git a/core/unittest/server/test_util.cpp b/core/unittest/server/test_util.cpp
index e5884cac65..01cf713bcd 100644
--- a/core/unittest/server/test_util.cpp
+++ b/core/unittest/server/test_util.cpp
@@ -120,7 +120,13 @@ TEST(UtilTest, STRINGFUNCTIONS_TEST) {
     milvus::server::StringHelpFunctions::SplitStringByDelimeter(str, ",", result);
     ASSERT_EQ(result.size(), 3UL);
 
+    std::string merge_str;
+    milvus::server::StringHelpFunctions::MergeStringWithDelimeter(result, ",", merge_str);
+    ASSERT_EQ(merge_str, "a,b,c");
     result.clear();
+    milvus::server::StringHelpFunctions::MergeStringWithDelimeter(result, ",", merge_str);
+    ASSERT_TRUE(merge_str.empty());
+
     auto status = milvus::server::StringHelpFunctions::SplitStringByQuote(str, ",", "\"", result);
     ASSERT_TRUE(status.ok());
     ASSERT_EQ(result.size(), 3UL);
@@ -211,6 +217,11 @@ TEST(UtilTest, STATUS_TEST) {
     str = status.ToString();
     ASSERT_FALSE(str.empty());
 
+    status = milvus::Status(milvus::DB_INVALID_PATH, "mistake");
+    ASSERT_EQ(status.code(), milvus::DB_INVALID_PATH);
+    str = status.ToString();
+    ASSERT_FALSE(str.empty());
+
     status = milvus::Status(milvus::DB_META_TRANSACTION_FAILED, "mistake");
     ASSERT_EQ(status.code(), milvus::DB_META_TRANSACTION_FAILED);
     str = status.ToString();
     ASSERT_FALSE(str.empty());
@@ -261,6 +272,10 @@ TEST(ValidationUtilTest, VALIDATE_TABLENAME_TEST) {
     table_name = std::string(10000, 'a');
     status = milvus::server::ValidationUtil::ValidateTableName(table_name);
     ASSERT_EQ(status.code(), milvus::SERVER_INVALID_TABLE_NAME);
+
+    table_name = "";
+    status = milvus::server::ValidationUtil::ValidatePartitionName(table_name);
+    ASSERT_EQ(status.code(), milvus::SERVER_INVALID_TABLE_NAME);
 }
 
 TEST(ValidationUtilTest, VALIDATE_DIMENSION_TEST) {
From 8a4cc9263e06708299e74261be8c23f0d9aa3412 Mon Sep 17 00:00:00 2001
From: fishpenguin
Date: Sat, 23 Nov 2019 11:56:34 +0800
Subject: [PATCH 28/32] [skip ci] Add more log msg

---
 core/src/scheduler/SchedInst.h                 | 19 +++++++++++++++++++
 .../scheduler/optimizer/BuildIndexPass.cpp     |  2 +-
 core/src/scheduler/task/BuildIndexTask.cpp     |  2 +-
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/core/src/scheduler/SchedInst.h b/core/src/scheduler/SchedInst.h
index 6273af7a9f..1e8a7acf2e 100644
--- a/core/src/scheduler/SchedInst.h
+++ b/core/src/scheduler/SchedInst.h
@@ -106,6 +106,25 @@ class OptimizerInst {
             server::Config& config = server::Config::GetInstance();
             config.GetGpuResourceConfigEnable(enable_gpu);
             if (enable_gpu) {
+                std::vector<int64_t> build_gpus;
+                std::vector<int64_t> search_gpus;
+                int64_t gpu_search_threshold;
+                config.GetGpuResourceConfigBuildIndexResources(build_gpus);
+                config.GetGpuResourceConfigSearchResources(search_gpus);
+                config.GetEngineConfigGpuSearchThreshold(gpu_search_threshold);
+                std::string build_msg = "Build index gpu:";
+                for (auto build_id : build_gpus) {
+                    build_msg.append(" gpu" + std::to_string(build_id));
+                }
+                SERVER_LOG_DEBUG << build_msg;
+
+                std::string search_msg = "Search gpu:";
+                for (auto search_id : search_gpus) {
+                    search_msg.append(" gpu" + std::to_string(search_id));
+                }
+                search_msg.append(". gpu_search_threshold:" + std::to_string(gpu_search_threshold));
gpu_search_threshold:" + std::to_string(gpu_search_threshold)); + SERVER_LOG_DEBUG << search_msg; + pass_list.push_back(std::make_shared()); pass_list.push_back(std::make_shared()); pass_list.push_back(std::make_shared()); diff --git a/core/src/scheduler/optimizer/BuildIndexPass.cpp b/core/src/scheduler/optimizer/BuildIndexPass.cpp index 55e28ec672..5e5719a1bd 100644 --- a/core/src/scheduler/optimizer/BuildIndexPass.cpp +++ b/core/src/scheduler/optimizer/BuildIndexPass.cpp @@ -38,7 +38,7 @@ BuildIndexPass::Run(const TaskPtr& task) { return false; if (build_gpu_ids_.empty()) { - SERVER_LOG_WARNING << "BUildIndexPass cannot get build index gpu!"; + SERVER_LOG_WARNING << "BuildIndexPass cannot get build index gpu!"; return false; } diff --git a/core/src/scheduler/task/BuildIndexTask.cpp b/core/src/scheduler/task/BuildIndexTask.cpp index f561fa947d..e952bd0938 100644 --- a/core/src/scheduler/task/BuildIndexTask.cpp +++ b/core/src/scheduler/task/BuildIndexTask.cpp @@ -85,7 +85,7 @@ XBuildIndexTask::Load(milvus::scheduler::LoadType type, uint8_t device_id) { size_t file_size = to_index_engine_->PhysicalSize(); - std::string info = "Load file id:" + std::to_string(file_->id_) + + std::string info = "Load file id:" + std::to_string(file_->id_) + " " + type_str + " file type:" + std::to_string(file_->file_type_) + " size:" + std::to_string(file_size) + " bytes from location: " + file_->location_ + " totally cost"; double span = rc.ElapseFromBegin(info); From 8aac809ad78ed4fedf913bcd53fb8ade90afa0ad Mon Sep 17 00:00:00 2001 From: groot Date: Sat, 23 Nov 2019 14:36:22 +0800 Subject: [PATCH 29/32] fix typo --- core/src/server/Config.h | 6 ++++-- core/unittest/server/test_config.cpp | 2 +- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/core/src/server/Config.h b/core/src/server/Config.h index 4e4923ee07..281a832d57 100644 --- a/core/src/server/Config.h +++ b/core/src/server/Config.h @@ -172,9 +172,10 @@ class Config { CheckEngineConfigOmpThreadNum(const std::string& value); #ifdef MILVUS_GPU_VERSION - /* gpu resource config */ Status CheckEngineConfigGpuSearchThreshold(const std::string& value); + + /* gpu resource config */ Status CheckGpuResourceConfigEnable(const std::string& value); Status @@ -307,9 +308,10 @@ class Config { SetEngineConfigOmpThreadNum(const std::string& value); #ifdef MILVUS_GPU_VERSION - /* gpu resource config */ Status SetEngineConfigGpuSearchThreshold(const std::string& value); + + /* gpu resource config */ Status SetGpuResourceConfigEnable(const std::string& value); Status diff --git a/core/unittest/server/test_config.cpp b/core/unittest/server/test_config.cpp index 3a24c02a45..791876ee8b 100644 --- a/core/unittest/server/test_config.cpp +++ b/core/unittest/server/test_config.cpp @@ -404,10 +404,10 @@ TEST_F(ConfigTest, SERVER_CONFIG_INVALID_TEST) { ASSERT_FALSE(s.ok()); #ifdef MILVUS_GPU_VERSION - /* gpu resource config */ s = config.SetEngineConfigGpuSearchThreshold("-1"); ASSERT_FALSE(s.ok()); + /* gpu resource config */ s = config.SetGpuResourceConfigEnable("ok"); ASSERT_FALSE(s.ok()); From b2e37f1e94cb1ebc247ef6cb26782c6763702dae Mon Sep 17 00:00:00 2001 From: Yukikaze-CZR Date: Sat, 23 Nov 2019 15:37:36 +0800 Subject: [PATCH 30/32] #502 C++ SDK support IVFPQ and SPTAG --- core/src/sdk/examples/utils/Utils.cpp | 6 ++++++ core/src/sdk/include/MilvusApi.h | 3 +++ 2 files changed, 9 insertions(+) diff --git a/core/src/sdk/examples/utils/Utils.cpp b/core/src/sdk/examples/utils/Utils.cpp index fa373cd498..d3bf9eec25 100644 --- 
+++ b/core/src/sdk/examples/utils/Utils.cpp
@@ -99,6 +99,12 @@ Utils::IndexTypeName(const milvus::IndexType& index_type) {
             return "NSG";
         case milvus::IndexType::IVFSQ8H:
             return "IVFSQ8H";
+        case milvus::IndexType::IVFPQ:
+            return "IVFPQ";
+        case milvus::IndexType::SPTAGKDT:
+            return "SPTAGKDT";
+        case milvus::IndexType::SPTAGBKT:
+            return "SPTAGBKT";
         default:
             return "Unknown index type";
     }
diff --git a/core/src/sdk/include/MilvusApi.h b/core/src/sdk/include/MilvusApi.h
index 9fa98deb40..5c7736d4e2 100644
--- a/core/src/sdk/include/MilvusApi.h
+++ b/core/src/sdk/include/MilvusApi.h
@@ -37,6 +37,9 @@ enum class IndexType {
     IVFSQ8 = 3,
     NSG = 4,
     IVFSQ8H = 5,
+    IVFPQ = 6,
+    SPTAGKDT = 7,
+    SPTAGBKT = 8,
 };
 
 enum class MetricType {
From d3940919287513fdccf38c00a7007334623dd7b3 Mon Sep 17 00:00:00 2001
From: Yukikaze-CZR
Date: Sat, 23 Nov 2019 15:49:33 +0800
Subject: [PATCH 31/32] #502 Changelog upgrade

---
 CHANGELOG.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index ccbbe9f64e..53e427dcb9 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -30,6 +30,7 @@ Please mark all change in change log and use the ticket from JIRA.
 - \#227 - Support new index types SPTAG-KDT and SPTAG-BKT
 - \#346 - Support build index with multiple gpu
 - \#488 - Add log in scheduler/optimizer
+- \#502 - C++ SDK support IVFPQ and SPTAG
 
 ## Improvement
 - \#255 - Add ivfsq8 test report detailed version
From 7924bd874f713250efae3791323972a95a2b3127 Mon Sep 17 00:00:00 2001
From: "G.Y Feng"
Date: Sat, 23 Nov 2019 18:02:36 +0800
Subject: [PATCH 32/32] Update README_CN.md

add a C++ sdk link
---
 README_CN.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README_CN.md b/README_CN.md
index 374cefa9bd..b101ea0570 100644
--- a/README_CN.md
+++ b/README_CN.md
@@ -15,7 +15,7 @@ Milvus 是一款开源的、针对海量特征向量的相似性搜索引擎。
 若要了解 Milvus 详细介绍和整体架构，请访问 [Milvus 简介](https://www.milvus.io/docs/zh-CN/aboutmilvus/overview/)。
 
-Milvus 提供稳定的 [Python](https://github.com/milvus-io/pymilvus)、[Java](https://github.com/milvus-io/milvus-sdk-java) 以及 C++ 的 API 接口。
+Milvus 提供稳定的 [Python](https://github.com/milvus-io/pymilvus)、[Java](https://github.com/milvus-io/milvus-sdk-java) 以及[C++](https://github.com/milvus-io/milvus/tree/master/core/src/sdk) 的 API 接口。
 
 通过 [版本发布说明](https://milvus.io/docs/zh-CN/release/v0.5.3/) 获取最新版本的功能和更新。
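
A closing note on the pattern that dominates the patches above: every GPU-only code path is fenced behind the MILVUS_GPU_VERSION compile-time switch, so the CPU edition never references GpuCacheMgr or CUDA symbols, while the GPU edition keeps its full behavior. The stand-alone sketch below illustrates the shape of that guard; the type and function names in it are illustrative assumptions for this example only, not the actual Milvus API.

// Minimal sketch of the MILVUS_GPU_VERSION guard pattern (illustrative names).
// Compile with -DMILVUS_GPU_VERSION to take the GPU branch; without the flag
// the GPU-only code is removed entirely at preprocessing time.
#include <cstdint>
#include <iostream>
#include <vector>

struct Status {
    int code = 0;
    static Status OK() { return Status{0}; }
};

Status
InitBuildIndexResources() {
#ifdef MILVUS_GPU_VERSION
    // GPU edition: the device list would come from the server config in
    // Milvus; it is hard-coded here so the example is self-contained.
    std::vector<int64_t> gpu_ids = {0};
    if (gpu_ids.empty()) {
        return Status{1};  // mirrors the "Invalid gpu_num" error path
    }
    std::cout << "GPU edition, build devices: " << gpu_ids.size() << std::endl;
    return Status::OK();
#else
    // CPU edition: nothing to initialize, succeed trivially.
    std::cout << "CPU edition" << std::endl;
    return Status::OK();
#endif
}

int
main() {
    return InitBuildIndexResources().code;
}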
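
The new comments in StringHelpFunctions.h and the extended STRINGFUNCTIONS_TEST case document a split/merge round trip: splitting "a,b,c" on "," yields three elements, merging them back with "," yields "a,b,c", and merging an empty vector yields an empty string. A minimal re-implementation of that merge contract, written for illustration only and not taken from the Milvus tree, could look like this:

// Stand-alone sketch of the merge behaviour documented for
// MergeStringWithDelimeter and exercised by the new test case.
#include <cassert>
#include <string>
#include <vector>

static void
MergeWithDelimiter(const std::vector<std::string>& strs, const std::string& delim, std::string& result) {
    result.clear();
    for (size_t i = 0; i < strs.size(); ++i) {
        if (i > 0) {
            result += delim;  // delimiter goes between elements, not after the last one
        }
        result += strs[i];
    }
}

int
main() {
    std::string merged;
    MergeWithDelimiter({"a", "b", "c"}, ",", merged);
    assert(merged == "a,b,c");  // "a", "b", "c" => "a,b,c"

    MergeWithDelimiter({}, ",", merged);
    assert(merged.empty());     // merging nothing yields an empty string
    return 0;
}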
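
PATCH 30/32 extends the C++ SDK enum with IVFPQ = 6, SPTAGKDT = 7 and SPTAGBKT = 8, and teaches Utils::IndexTypeName to print them. The sketch below shows that enum-to-name mapping in isolation; only the enumerators visible in the diff are reproduced, and the surrounding scaffolding is assumed rather than copied from the SDK.

// Minimal sketch of the IndexType-to-name mapping added in PATCH 30/32.
// The enumerators mirror the MilvusApi.h diff; earlier values are omitted.
#include <iostream>
#include <string>

enum class IndexType {
    IVFSQ8 = 3,
    NSG = 4,
    IVFSQ8H = 5,
    IVFPQ = 6,
    SPTAGKDT = 7,
    SPTAGBKT = 8,
};

static std::string
IndexTypeName(IndexType type) {
    switch (type) {
        case IndexType::IVFPQ:
            return "IVFPQ";
        case IndexType::SPTAGKDT:
            return "SPTAGKDT";
        case IndexType::SPTAGBKT:
            return "SPTAGBKT";
        default:
            return "Unknown index type";
    }
}

int
main() {
    std::cout << IndexTypeName(IndexType::IVFPQ) << std::endl;  // prints "IVFPQ"
    return 0;
}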