diff --git a/shards/README.md b/shards/README.md new file mode 100644 index 0000000000..f59eca0460 --- /dev/null +++ b/shards/README.md @@ -0,0 +1,266 @@ +# Mishards - An Experimental Sharding Middleware + +[中文版](README_CN.md) + +Milvus aims to achieve efficient similarity search and analytics for massive-scale vectors. A standalone Milvus instance can easily handle vector search among billion-scale vectors. However, for 10 billion, 100 billion or even larger datasets, a Milvus cluster is needed. + +Ideally, this cluster can be accessed and used just like a standalone instance while still meeting business requirements such as low latency and high concurrency. + +This page demonstrates how to use Mishards, an experimental sharding middleware for Milvus, to establish an orchestrated cluster. + +## What is Mishards + +Mishards is a middleware developed in Python. It works as a proxy for the Milvus system and lets memory and computation capacity scale out through request forwarding, read/write splitting, horizontal scaling and dynamic expansion. + +Using Mishards in a Milvus cluster deployment is an experimental feature, available for user testing and feedback. + +## How Mishards works + +Mishards splits upstream requests into sub-requests and forwards them to Milvus servers. When the search computation is completed, Mishards collects all results and sends them back to the client. + +The following diagram illustrates the process: + +![mishards](https://raw.githubusercontent.com/milvus-io/docs/master/assets/mishards.png) + +## Mishards examples + +The following examples demonstrate how to build a Milvus server with Mishards from source code on a standalone machine, as well as how to use Kubernetes to establish a Milvus cluster with Mishards. + +Before executing these examples, make sure you meet the prerequisites of [Milvus installation](https://www.milvus.io/docs/en/userguide/install_milvus/).
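To make the request flow described under "How Mishards works" more concrete, here is a minimal scatter-gather sketch. It is not Mishards' actual code: `search_shard` and `scatter_gather_search` are hypothetical names, each shard is modeled as an in-memory list of `(id, value)` pairs instead of a Milvus server, and distance is a plain absolute difference.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, query, top_k):
    """Hypothetical stand-in for a search request sent to one Milvus server.

    Returns the top_k closest entries of this shard as (distance, id) pairs.
    """
    scored = [(abs(value - query), vec_id) for vec_id, value in shard]
    return heapq.nsmallest(top_k, scored)


def scatter_gather_search(shards, query, top_k):
    """Forward the query to every shard, then merge the partial results."""
    # Scatter: one sub-request per shard, executed concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(lambda s: search_shard(s, query, top_k), shards))
    # Gather: keep only the global top_k across all partial results.
    return heapq.nsmallest(top_k, (hit for part in partials for hit in part))


if __name__ == "__main__":
    # Two toy shards; integer values keep the distances exact.
    shards = [
        [(1, 10), (2, 90)],
        [(3, 45), (4, 52)],
    ]
    print(scatter_gather_search(shards, query=50, top_k=2))
```

Each shard only returns its own `top_k` candidates, so the proxy merges at most `top_k` hits per shard regardless of how much data each server holds.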
+ +### Build from source code + +#### Prerequisites + +Make sure Python 3.6 or higher is installed. + +#### Start Milvus and Mishards from source code + +Follow the steps below to start a standalone Milvus instance with Mishards from source code: + +1. Clone the milvus repository. + + ```shell + git clone + ``` + +2. Install Mishards dependencies. + + ```shell + $ cd milvus/shards + $ pip install -r requirements.txt + ``` + +3. Start the Milvus server. + + ```shell + $ sudo nvidia-docker run --rm -d -p 19530:19530 -v /tmp/milvus/db:/opt/milvus/db milvusdb/milvus:0.5.0-d102119-ede20b + ``` + +4. Update path permissions. + + ```shell + $ sudo chown -R $USER:$USER /tmp/milvus + ``` + +5. Configure Mishards environment variables. + + ```shell + $ cp mishards/.env.example mishards/.env + ``` + +6. Start the Mishards server. + + ```shell + $ python mishards/main.py + ``` + +### Docker example + +The `all_in_one` example shows how to use Docker containers to start 2 Milvus instances, 1 Mishards instance and 1 Jaeger instance. + + 1. Install [Docker Compose](https://docs.docker.com/compose/install/). + + 2. Build Docker images for these instances. + + ```shell + $ make build + ``` + + 3. Start all instances. + + ```shell + $ make deploy + ``` + + 4. Confirm instance status. + + ```shell + $ make probe_deploy + Pass ==> Pass: Connected + Fail ==> Error: Fail connecting to server on 127.0.0.1:19530. Timeout + ``` + +To check the service traces, open the [Jaeger page](http://127.0.0.1:16686/) in your browser. + +![jaegerui](https://raw.githubusercontent.com/milvus-io/docs/master/assets/jaegerui.png) + +![jaegertraces](https://raw.githubusercontent.com/milvus-io/docs/master/assets/jaegertraces.png) + +To stop all instances, use the following command: + +```shell +$ make clean_deploy +``` + +### Kubernetes example + +Deploying a Milvus cluster with Kubernetes requires a basic understanding of the [general concepts](https://kubernetes.io/docs/concepts/) of Kubernetes.
+ +This example demonstrates how to use Kubernetes to establish a Milvus cluster containing 2 Milvus instances (1 read instance and 1 write instance), 1 MySQL instance and 1 Mishards instance. + +This example does not cover tasks such as setting up a Kubernetes cluster, [installing shared storage](https://kubernetes.io/docs/concepts/storage/volumes/) or installing command-line tools such as [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/). + +The architecture of the Milvus cluster built on Kubernetes is shown below: + +![k8s_arch](https://raw.githubusercontent.com/milvus-io/docs/master/assets/k8s_arch.png) + +#### Prerequisites + +- A Kubernetes cluster is already established. +- [nvidia-docker 2.0](https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)) is already installed. +- Shared storage is already installed. +- kubectl is installed and can access the Kubernetes cluster. + +#### Use Kubernetes to build a Milvus cluster + +1. Start the Milvus cluster. + + ```shell + $ make cluster + ``` + +2. Confirm that Mishards is connected to Milvus.
+ + ```shell + $ make probe_cluster + Pass ==> Pass: Connected + ``` + +To check cluster status: + +```shell +$ make cluster_status +``` + +To delete the cluster: + +```shell +$ make clean_cluster +``` + +To scale the read instances to 2: + +```shell +$ cd kubernetes_demo +$ ./start.sh scale-ro-server 2 +``` + +To scale the proxy instances to 2: + +```shell +$ cd kubernetes_demo +$ ./start.sh scale-proxy 2 +``` + +To check cluster logs: + +```shell +$ kubectl logs -f --tail=1000 -n milvus milvus-ro-servers-0 +``` + +## Mishards unit tests + +**Unit test** + +```shell +$ cd milvus/shards +$ make test +``` + +**Code coverage test** + +```shell +$ cd milvus/shards +$ make coverage +``` + +**Code format check** + +```shell +$ cd milvus/shards +$ make style +``` + +## Mishards configuration + +### Overall configuration + +| Name | Required | Type | Default | Description | +| ------------- | -------- | ------- | ------- | ------------------------------------------------------------ | +| `Debug` | No | boolean | `True` | Choose whether to enable `Debug` work mode. | +| `TIMEZONE` | No | string | `UTC` | Time zone. | +| `MAX_RETRY` | No | integer | `3` | The maximum number of retries when connecting to Milvus. | +| `SERVER_PORT` | No | integer | `19530` | Define the server port of Mishards. | +| `WOSERVER` | **Yes** | string | ` ` | Define the address of the Milvus write instance. Currently, only static settings are supported. Format for reference: `tcp://127.0.0.1:19530`. | + +### Metadata + +| Name | Required | Type | Default | Description | +| ------------------------------ | -------- | ------- | ------- | ------------------------------------------------------------ | +| `SQLALCHEMY_DATABASE_URI` | **Yes** | string | ` ` | Define the database address for metadata storage, as an RFC-738-style URL. For example: `mysql+pymysql://root:root@127.0.0.1:3306/milvus?charset=utf8mb4`. | +| `SQL_ECHO` | No | boolean | `False` | Choose whether to print SQL statements.
| +| `SQLALCHEMY_DATABASE_TEST_URI` | No | string | ` ` | Define the database address for metadata storage in the test environment. | +| `SQL_TEST_ECHO` | No | boolean | `False` | Choose whether to print SQL statements in the test environment. | + +### Service discovery + +| Name | Required | Type | Default | Description | +| ------------------------------------- | -------- | ------- | ------------- | ------------------------------------------------------------ | +| `DISCOVERY_PLUGIN_PATH` | No | string | ` ` | Define the search path to locate the plug-in. The default path is used if the value is not set. | +| `DISCOVERY_CLASS_NAME` | No | string | `static` | Under the plug-in search path, search for the class by class name and instantiate it. Currently, the system provides 2 classes: `static` and `kubernetes`. | +| `DISCOVERY_STATIC_HOSTS` | No | list | `[]` | When `DISCOVERY_CLASS_NAME` is `static`, define a comma-separated service address list, for example `192.168.1.188,192.168.1.190`. | +| `DISCOVERY_STATIC_PORT` | No | integer | `19530` | When `DISCOVERY_CLASS_NAME` is `static`, define the server port. | +| `DISCOVERY_KUBERNETES_NAMESPACE` | No | string | ` ` | When `DISCOVERY_CLASS_NAME` is `kubernetes`, define the namespace of the Milvus cluster. | +| `DISCOVERY_KUBERNETES_IN_CLUSTER` | No | boolean | `False` | When `DISCOVERY_CLASS_NAME` is `kubernetes`, choose whether service discovery runs inside the Kubernetes cluster. | +| `DISCOVERY_KUBERNETES_POLL_INTERVAL` | No | integer | `5` (seconds) | When `DISCOVERY_CLASS_NAME` is `kubernetes`, define the polling interval of service discovery. | +| `DISCOVERY_KUBERNETES_POD_PATT` | No | string | ` ` | When `DISCOVERY_CLASS_NAME` is `kubernetes`, the regular expression used to match Milvus Pod names. | +| `DISCOVERY_KUBERNETES_LABEL_SELECTOR` | No | string | ` ` | When `DISCOVERY_CLASS_NAME` is `kubernetes`, the label used to match Milvus Pods. For example: `tier=ro-servers`.
| + +### Tracing + +| Name | Required | Type | Default | Description | +| ----------------------- | -------- | ------- | ---------- | ------------------------------------------------------------ | +| `TRACER_PLUGIN_PATH` | No | string | ` ` | Define the search path to locate the tracing plug-in. The default path is used if the value is not set. | +| `TRACER_CLASS_NAME` | No | string | ` ` | Under the plug-in search path, search for the class by class name and instantiate it. Currently, only `Jaeger` is supported. | +| `TRACING_SERVICE_NAME` | No | string | `mishards` | When `TRACER_CLASS_NAME` is [`Jaeger`](https://www.jaegertracing.io/docs/1.14/), the name of the tracing service. | +| `TRACING_SAMPLER_TYPE` | No | string | `const` | When `TRACER_CLASS_NAME` is [`Jaeger`](https://www.jaegertracing.io/docs/1.14/), the [sampling type](https://www.jaegertracing.io/docs/1.14/sampling/) of the tracing service. | +| `TRACING_SAMPLER_PARAM` | No | integer | `1` | When `TRACER_CLASS_NAME` is [`Jaeger`](https://www.jaegertracing.io/docs/1.14/), the [sampling frequency](https://www.jaegertracing.io/docs/1.14/sampling/) of the tracing service. | +| `TRACING_LOG_PAYLOAD` | No | boolean | `False` | When `TRACER_CLASS_NAME` is [`Jaeger`](https://www.jaegertracing.io/docs/1.14/), choose whether to record the payload. | + +### Logging + +| Name | Required | Type | Default | Description | +| ----------- | -------- | ------ | --------------- | ------------------------------------------------------------ | +| `LOG_LEVEL` | No | string | `DEBUG` | Log level. Currently, `DEBUG`, `INFO`, `WARNING` and `ERROR` are supported. | +| `LOG_PATH` | No | string | `/tmp/mishards` | Log file path. | +| `LOG_NAME` | No | string | `logfile` | Log file name.
| + +### Routing + +| Name | Required | Type | Default | Description | +| ------------------------ | -------- | ------ | ------------------------- | ------------------------------------------------------------ | +| `ROUTER_PLUGIN_PATH` | No | string | ` ` | Define the search path to locate the routing plug-in. The default path is used if the value is not set. | +| `ROUTER_CLASS_NAME` | No | string | `FileBasedHashRingRouter` | Under the plug-in search path, search for the class by class name and instantiate it. Currently, only `FileBasedHashRingRouter` is supported. | +| `ROUTER_CLASS_TEST_NAME` | No | string | `FileBasedHashRingRouter` | Under the plug-in search path, search for the class by class name and instantiate it. Currently, only `FileBasedHashRingRouter` is supported. Used in the test environment only. | + diff --git a/shards/README_CN.md b/shards/README_CN.md new file mode 100644 index 0000000000..24e019d001 --- /dev/null +++ b/shards/README_CN.md @@ -0,0 +1,261 @@ +# Mishards - A Sharding Middleware for Milvus Clusters + +Milvus aims to help users achieve approximate search and analytics over massive unstructured data. A single Milvus instance can handle billion-scale data, whereas data at the 10-billion or 100-billion scale requires a Milvus cluster. Upper-layer applications can use the cluster just like a standalone instance, while it meets the low-latency, high-concurrency requirements of massive data. + +This page shows how to use the Mishards sharding middleware to build a Milvus cluster. + +## What is Mishards + +Mishards is a sharding middleware for Milvus clusters, developed in Python. It handles request forwarding, read/write splitting, horizontal scaling and dynamic expansion internally, giving users a Milvus instance whose memory and computing power can be expanded without limit. + +The design of Mishards is not yet complete and it is a trial feature. Testing and feedback are welcome. + +## How Mishards works + +Mishards splits upstream requests, routes them to the internal sub-services, then aggregates the sub-service results and returns them upstream. + +![mishards](https://raw.githubusercontent.com/milvus-io/docs/master/assets/mishards.png) + +## Mishards examples + +The following shows how to start Mishards and Milvus from source code on a standalone machine, and how to use Kubernetes to start a Milvus cluster with Mishards. + +For the prerequisites of starting Milvus, see [Milvus installation](https://www.milvus.io/docs/zh-CN/userguide/install_milvus/). + +### Start from source code + +#### Prerequisites + +Python 3.6 or higher. + +#### Start Milvus and Mishards from source code + +Follow the steps below to start a single Milvus instance and the Mishards service on a standalone machine: + +1. Clone the milvus repository to your local machine. + + ```shell + git clone + ``` + +2. 
Install Mishards dependencies. + + ```shell + $ cd milvus/shards + $ pip install -r requirements.txt + ``` + +3. Start the Milvus service. + + ```shell + $ sudo nvidia-docker run --rm -d -p 19530:19530 -v /tmp/milvus/db:/opt/milvus/db milvusdb/milvus:0.5.0-d102119-ede20b + ``` + +4. Change directory permissions. + + ```shell + $ sudo chown -R $USER:$USER /tmp/milvus + ``` + +5. Configure Mishards environment variables. + + ```shell + $ cp mishards/.env.example mishards/.env + ``` + +6. Start the Mishards service. + + ```shell + $ python mishards/main.py + ``` + +### Docker example + +`all_in_one` uses Docker containers to start 2 Milvus instances, 1 Mishards middleware instance and 1 Jaeger tracing instance. + + 1. Install [Docker Compose](https://docs.docker.com/compose/install/). + + 2. Build the instance images. + + ```shell + $ make build + ``` + + 3. Start all services. + + ```shell + $ make deploy + ``` + + 4. Check and confirm service status. + + ```shell + $ make probe_deploy + Pass ==> Pass: Connected + Fail ==> Error: Fail connecting to server on 127.0.0.1:19530. Timeout + ``` + +To view the service traces, open the [Jaeger page](http://127.0.0.1:16686/) in a browser. + +![jaegerui](https://github.com/milvus-io/docs/blob/master/assets/jaegerui.png) + +![jaegertraces](https://github.com/milvus-io/docs/blob/master/assets/jaegertraces.png) + +To clean up all services, use the following command: + +```shell +$ make clean_deploy +``` + +### Kubernetes example + +Deploying a distributed Milvus cluster with Kubernetes requires a basic understanding of the [general concepts](https://kubernetes.io/docs/concepts/) and operation of Kubernetes. + +This example shows how to use Kubernetes to build a Milvus cluster containing 2 Milvus instances (1 read instance and 1 write instance), 1 MySQL instance and 1 Mishards instance. + +This example does not cover how to set up a Kubernetes cluster, how to install [shared storage](https://kubernetes.io/docs/concepts/storage/volumes/) or how to install the [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) command-line tool. + +The architecture of the Kubernetes example is shown below: + +![k8s_arch](https://github.com/milvus-io/docs/blob/master/assets/k8s_arch.png) + +#### Prerequisites + +Before using Kubernetes to start multiple Milvus instances, make sure the following conditions are met: + +- A Kubernetes cluster has been created +- [nvidia-docker 2.0](https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)) is installed +- Shared storage is installed +- kubectl is installed and can access the cluster + +#### Start the cluster with Kubernetes + +1. Start the Milvus cluster. + + ```shell + $ make cluster + ``` + +2. 
Confirm that Mishards is available. + + ```shell + $ make probe_cluster + Pass ==> Pass: Connected + ``` + +To check cluster status: + +```shell +$ make cluster_status +``` + +To delete the Milvus cluster: + +```shell +$ make clean_cluster +``` + +To scale the Milvus read instances to 2: + +```shell +$ cd kubernetes_demo +$ ./start.sh scale-ro-server 2 +``` + +To scale the Mishards (proxy) instances to 2: + +```shell +$ cd kubernetes_demo +$ ./start.sh scale-proxy 2 +``` + +To check the logs of the compute node `milvus-ro-servers-0`: + +```shell +$ kubectl logs -f --tail=1000 -n milvus milvus-ro-servers-0 +``` + +## Unit tests + +**Unit test** + +```shell +$ cd milvus/shards +$ make test +``` + +**Code coverage test** + +```shell +$ cd milvus/shards +$ make coverage +``` + +**Code format check** + +```shell +$ cd milvus/shards +$ make style +``` + +## Mishards configuration + +### Global configuration + +| Name | Required | Type | Default | Description | +| ------------- | -------- | ------- | ------- | ------------------------------------------------------------ | +| `Debug` | No | boolean | `True` | Choose whether to enable `Debug` work mode. | +| `TIMEZONE` | No | string | `UTC` | Time zone. | +| `MAX_RETRY` | No | integer | `3` | The maximum number of retries when Mishards connects to Milvus. | +| `SERVER_PORT` | No | integer | `19530` | Define the service port of Mishards. | +| `WOSERVER` | **Yes** | string | ` ` | Define the address of the Milvus write instance. Currently, only static settings are supported. Format for reference: `tcp://127.0.0.1:19530`. | + +### Metadata + +| Name | Required | Type | Default | Description | +| ------------------------------ | -------- | ------- | ------- | ------------------------------------------------------------ | +| `SQLALCHEMY_DATABASE_URI` | **Yes** | string | ` ` | Define the database address for metadata storage, as an RFC-738-style URL. For example: `mysql+pymysql://root:root@127.0.0.1:3306/milvus?charset=utf8mb4`. | +| `SQL_ECHO` | No | boolean | `False` | Choose whether to print detailed SQL statements. | +| `SQLALCHEMY_DATABASE_TEST_URI` | No | string | ` ` | Define the database address for metadata storage in the test environment. | +| `SQL_TEST_ECHO` | No | boolean | `False` | Choose whether to print detailed SQL statements in the test environment. | + +### Service discovery + +| Name | Required | Type | Default | Description | +| ------------------------------------- | -------- | ------- | -------- | ------------------------------------------------------------ | +| `DISCOVERY_PLUGIN_PATH` | No | string | ` ` | 
User-defined search path for the service discovery plug-in. The system search path is used by default. | +| `DISCOVERY_CLASS_NAME` | No | string | `static` | Under the plug-in search path, search for the class by class name and instantiate it. Currently, the system provides two classes, `static` and `kubernetes`; `static` is used by default. | +| `DISCOVERY_STATIC_HOSTS` | No | list | `[]` | When `DISCOVERY_CLASS_NAME` is `static`, define a comma-separated list of service addresses, for example `192.168.1.188,192.168.1.190`. | +| `DISCOVERY_STATIC_PORT` | No | integer | `19530` | When `DISCOVERY_CLASS_NAME` is `static`, define the port that the service addresses listen on. | +| `DISCOVERY_KUBERNETES_NAMESPACE` | No | string | ` ` | When `DISCOVERY_CLASS_NAME` is `kubernetes`, define the namespace of the Milvus cluster. | +| `DISCOVERY_KUBERNETES_IN_CLUSTER` | No | boolean | `False` | When `DISCOVERY_CLASS_NAME` is `kubernetes`, choose whether service discovery runs inside the cluster. | +| `DISCOVERY_KUBERNETES_POLL_INTERVAL` | No | integer | `5` | When `DISCOVERY_CLASS_NAME` is `kubernetes`, define the polling interval of service discovery, in seconds. | +| `DISCOVERY_KUBERNETES_POD_PATT` | No | string | ` ` | When `DISCOVERY_CLASS_NAME` is `kubernetes`, the regular expression used to match Milvus Pod names. | +| `DISCOVERY_KUBERNETES_LABEL_SELECTOR` | No | string | ` ` | When `DISCOVERY_CLASS_NAME` is `kubernetes`, the label used to match Milvus Pods. For example: `tier=ro-servers`. | + +### Tracing + +| Name | Required | Type | Default | Description | +| ----------------------- | -------- | ------- | ---------- | ------------------------------------------------------------ | +| `TRACER_PLUGIN_PATH` | No | string | ` ` | User-defined search path for the tracing plug-in. The system search path is used by default. | +| `TRACER_CLASS_NAME` | No | string | ` ` | Under the plug-in search path, search for the class by class name and instantiate it. Currently, only `Jaeger` is supported; tracing is disabled by default. | +| `TRACING_SERVICE_NAME` | No | string | `mishards` | When `TRACER_CLASS_NAME` is [`Jaeger`](https://www.jaegertracing.io/docs/1.14/), the service name used for tracing. | +| `TRACING_SAMPLER_TYPE` | No | string | `const` | When `TRACER_CLASS_NAME` is `Jaeger`, the [sampling type](https://www.jaegertracing.io/docs/1.14/sampling/) of tracing. | +| `TRACING_SAMPLER_PARAM` | No | integer | `1` | When `TRACER_CLASS_NAME` is `Jaeger`, the [sampling frequency](https://www.jaegertracing.io/docs/1.14/sampling/) of tracing. | +| `TRACING_LOG_PAYLOAD` | No | boolean | `False` | When `TRACER_CLASS_NAME` is `Jaeger`, choose whether tracing records the payload. | + +### Logging + +| Name | Required | Type | Default | Description | +| ----------- | -------- | 
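------ | --------------- | ------------------------------------------------------------ | +| `LOG_LEVEL` | No | string | `DEBUG` | Log level. Currently, `DEBUG`, `INFO`, `WARNING` and `ERROR` are supported. | +| `LOG_PATH` | No | string | `/tmp/mishards` | Log file path. | +| `LOG_NAME` | No | string | `logfile` | Log file name. | + +### Routing + +| Name | Required | Type | Default | Description | +| ------------------------ | -------- | ------ | ------------------------- | ------------------------------------------------------------ | +| `ROUTER_PLUGIN_PATH` | No | string | ` ` | User-defined search path for the routing plug-in. The system search path is used by default. | +| `ROUTER_CLASS_NAME` | No | string | `FileBasedHashRingRouter` | Under the plug-in search path, search for the routing class by class name and instantiate it. Currently, the system only provides `FileBasedHashRingRouter`. | +| `ROUTER_CLASS_TEST_NAME` | No | string | `FileBasedHashRingRouter` | Under the plug-in search path, search for the routing class by class name and instantiate it. Currently, the system only provides `FileBasedHashRingRouter`. Used in the test environment only. | diff --git a/shards/Tutorial_CN.md b/shards/Tutorial_CN.md deleted file mode 100644 index 192a0fd285..0000000000 --- a/shards/Tutorial_CN.md +++ /dev/null @@ -1,147 +0,0 @@ -# Mishards User Guide ---- -Milvus aims to help users achieve approximate search and analytics over massive unstructured data. A single Milvus instance can handle billion-scale data, whereas data at the 10-billion or 100-billion scale requires a Milvus cluster, which upper-layer applications can use just like a standalone instance while it meets the low-latency, high-concurrency requirements of massive data. Mishards is such a cluster middleware: it handles request forwarding, read/write splitting, horizontal scaling and dynamic expansion internally, giving users a Milvus instance whose memory and computing power can be expanded without limit. - -## Runtime environment ---- - -### Quick start on a standalone machine -**`python >= 3.4` environment** - -``` -1. cd milvus/shards -2. pip install -r requirements.txt -3. nvidia-docker run --rm -d -p 19530:19530 -v /tmp/milvus/db:/opt/milvus/db milvusdb/milvus:0.5.0-d102119-ede20b -4. sudo chown -R $USER:$USER /tmp/milvus -5. cp mishards/.env.example mishards/.env -6. python mishards/main.py # configure Mishards in .env to listen on port 19532 -7. make probe port=19532 # health check -``` - -### Start with containers -`all_in_one` starts two Milvus instances, one Mishards instance and one Jaeger tracing instance on the server - -**Start** -``` -cd milvus/shards -1. Install docker-compose -2. make build -3. make deploy # listens on port 19531 -4. make clean_deploy # clean up services -5. 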
make probe_deploy # health check -``` - -**Open the Jaeger UI** -``` -Open "http://127.0.0.1:16686/" in a browser -``` - -### Quick start in Kubernetes -**Preparation** -``` -- A Kubernetes cluster -- nvidia-docker installed -- Shared storage -- kubectl installed and able to access the cluster -``` - -**Steps** -``` -cd milvus/shards -1. make deploy_cluster # start the cluster -2. make probe_cluster # health check -3. make clean_cluster # shut down the cluster -``` - -**Scale compute instances** -``` -cd milvus/shards/kubernetes_demo/ -./start.sh scale-ro-server 2 # scale compute instances to 2 -``` - -**Scale proxy instances** -``` -cd milvus/shards/kubernetes_demo/ -./start.sh scale-proxy 2 # scale proxy server instances to 2 -``` - -**View logs** -``` -kubectl logs -f --tail=1000 -n milvus milvus-ro-servers-0 # view the logs of compute node milvus-ro-servers-0 -``` - -## Testing - -**Run unit tests** -``` -1. cd milvus/shards -2. make test -``` - -**Unit test coverage** -``` -1. cd milvus/shards -2. make coverage -``` - -**Code style check** -``` -1. cd milvus/shards -2. make style -``` - -## Mishards configuration in detail - -### Global -| Name | Required | Type | Default Value | Explanation | -| --------------------------- | -------- | -------- | ------------- | ------------- | -| Debug | No | bool | True | Whether to run in Debug mode | -| TIMEZONE | No | string | "UTC" | Time zone | -| MAX_RETRY | No | int | 3 | Maximum number of connection retries | -| SERVER_PORT | No | int | 19530 | Service port | -| WOSERVER | **Yes** | str | - | Address of the backend writable Milvus instance. Currently, only static settings are supported, e.g. "tcp://127.0.0.1:19530" | - -### Metadata -| Name | Required | Type | Default Value | Explanation | -| --------------------------- | -------- | -------- | ------------- | ------------- | -| SQLALCHEMY_DATABASE_URI | **Yes** | string | - | Database address for metadata storage | -| SQL_ECHO | No | bool | False | Whether to print detailed SQL statements | -| SQLALCHEMY_DATABASE_TEST_URI | No | string | - | Database address for metadata storage in the test environment | -| SQL_TEST_ECHO | No | bool | False | Whether to print detailed SQL statements in the test environment | - -### Service discovery -| Name | Required | Type | Default Value | Explanation | -| --------------------------- | -------- | -------- | ------------- | ------------- | -| DISCOVERY_PLUGIN_PATH | No | string | - | User-defined search path for the service discovery plug-in. The system search path is used by default | -| DISCOVERY_CLASS_NAME | No | string | static | Search for the class under the service discovery plug-in search path and instantiate it. Currently, the system provides two classes, **static** and **kubernetes**; the default is 
**static** | -| DISCOVERY_STATIC_HOSTS | No | list | [] | When **DISCOVERY_CLASS_NAME** is **static**, the list of service addresses, e.g. "192.168.1.188,192.168.1.190" | -| DISCOVERY_STATIC_PORT | No | int | 19530 | When **DISCOVERY_CLASS_NAME** is **static**, the port the hosts listen on | -| DISCOVERY_KUBERNETES_NAMESPACE | No | string | - | When **DISCOVERY_CLASS_NAME** is **kubernetes**, the cluster namespace | -| DISCOVERY_KUBERNETES_IN_CLUSTER | No | bool | False | When **DISCOVERY_CLASS_NAME** is **kubernetes**, whether service discovery runs inside the cluster | -| DISCOVERY_KUBERNETES_POLL_INTERVAL | No | int | 5 | When **DISCOVERY_CLASS_NAME** is **kubernetes**, how often service discovery polls the service list, in seconds | -| DISCOVERY_KUBERNETES_POD_PATT | No | string | - | When **DISCOVERY_CLASS_NAME** is **kubernetes**, the regular expression used to match readable Milvus instances | -| DISCOVERY_KUBERNETES_LABEL_SELECTOR | No | string | - | When **SD_PROVIDER** is **Kubernetes**, the label selector used to match readable Milvus instances | - -### Tracing -| Name | Required | Type | Default Value | Explanation | -| --------------------------- | -------- | -------- | ------------- | ------------- | -| TRACER_PLUGIN_PATH | No | string | - | User-defined search path for the tracing plug-in. The system search path is used by default | -| TRACER_CLASS_NAME | No | string | "" | Tracing backend. Currently, only **Jaeger** is implemented; tracing is disabled by default | -| TRACING_SERVICE_NAME | No | string | "mishards" | When **TRACING_TYPE** is **Jaeger**, the tracing service name | -| TRACING_SAMPLER_TYPE | No | string | "const" | When **TRACING_TYPE** is **Jaeger**, the tracing sampling type | -| TRACING_SAMPLER_PARAM | No | int | 1 | When **TRACING_TYPE** is **Jaeger**, the tracing sampling frequency | -| TRACING_LOG_PAYLOAD | No | bool | False | When **TRACING_TYPE** is **Jaeger**, whether tracing records the payload | - -### Logging -| Name | Required | Type | Default Value | Explanation | -| --------------------------- | -------- | -------- | ------------- | ------------- | -| LOG_LEVEL | No | string | "DEBUG" if Debug is ON else "INFO" | Log level | -| LOG_PATH | No | string | "/tmp/mishards" | Log file path | -| LOG_NAME | No | string | "logfile" | Log file name | - -### Routing -| Name | Required | Type | Default Value | Explanation | -| --------------------------- | -------- | -------- | ------------- | ------------- | -| 
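ROUTER_PLUGIN_PATH | No | string | - | User-defined search path for the routing plug-in. The system search path is used by default | -| ROUTER_CLASS_NAME | No | string | FileBasedHashRingRouter | Name of the class that handles request routing. Custom classes can be registered. Currently, the system only provides the class **FileBasedHashRingRouter** | -| ROUTER_CLASS_TEST_NAME | No | string | FileBasedHashRingRouter | Name of the class that handles request routing in the test environment. Custom classes can be registered |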
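As an illustration of how the settings in the configuration tables could be consumed, the sketch below reads a few of them from environment-style key/value pairs, using the documented defaults and treating `WOSERVER` and `SQLALCHEMY_DATABASE_URI` as required. It is a standard-library sketch under those assumptions, not Mishards' own settings code; `load_settings` and its helpers are hypothetical names.

```python
import os

_TRUE_VALUES = {"1", "true", "yes", "on"}
_REQUIRED = ("WOSERVER", "SQLALCHEMY_DATABASE_URI")


def _get_bool(env, name, default):
    """Parse a boolean setting; fall back to the default when it is unset."""
    raw = env.get(name)
    return default if raw is None else raw.strip().lower() in _TRUE_VALUES


def _get_list(env, name):
    """Parse a comma-separated setting into a list, ignoring empty items."""
    raw = env.get(name, "")
    return [item.strip() for item in raw.split(",") if item.strip()]


def load_settings(env=None):
    """Build a settings dict from a mapping (os.environ when none is given)."""
    env = os.environ if env is None else env
    missing = [name for name in _REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    return {
        "WOSERVER": env["WOSERVER"],
        "SQLALCHEMY_DATABASE_URI": env["SQLALCHEMY_DATABASE_URI"],
        "SERVER_PORT": int(env.get("SERVER_PORT", 19530)),
        "MAX_RETRY": int(env.get("MAX_RETRY", 3)),
        "SQL_ECHO": _get_bool(env, "SQL_ECHO", False),
        "DISCOVERY_CLASS_NAME": env.get("DISCOVERY_CLASS_NAME", "static"),
        "DISCOVERY_STATIC_HOSTS": _get_list(env, "DISCOVERY_STATIC_HOSTS"),
        "DISCOVERY_STATIC_PORT": int(env.get("DISCOVERY_STATIC_PORT", 19530)),
    }


if __name__ == "__main__":
    # Example values taken from the formats shown in the tables.
    example_env = {
        "WOSERVER": "tcp://127.0.0.1:19530",
        "SQLALCHEMY_DATABASE_URI": "mysql+pymysql://root:root@127.0.0.1:3306/milvus?charset=utf8mb4",
        "DISCOVERY_STATIC_HOSTS": "192.168.1.188,192.168.1.190",
    }
    print(load_settings(example_env))
```

Passing the mapping explicitly keeps the sketch easy to test; calling `load_settings()` with no argument reads the process environment instead.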