This is an automated email from the ASF dual-hosted git repository. xxyu pushed a commit to branch document in repository https://gitbox.apache.org/repos/asf/kylin.git
The following commit(s) were added to refs/heads/document by this push: new 41e496e add document for KYLIN-4485 41e496e is described below commit 41e496e8aad7fb995bac14d95cb2e19f0a6e3e0c Author: Zhichao Zhang <441586...@qq.com> AuthorDate: Fri Jul 3 15:30:38 2020 +0800 add document for KYLIN-4485 1. fix according to comments 2. add CN version --- website/_docs/gettingstarted/quickstart.cn.md | 26 ++-- website/_docs/gettingstarted/quickstart.md | 40 +++--- website/_docs/howto/howto_use_mr_hive_dict.md | 11 +- website/_docs/install/kylin_docker.cn.md | 2 +- website/_docs/install/kylin_docker.md | 2 +- website/_docs/tutorial/cube_migration.cn.md | 164 +++++++++++++++++++++++++ website/_docs/tutorial/cube_migration.md | 52 +++++--- website/images/docs/quickstart/pull_docker.png | Bin 311525 -> 75775 bytes 8 files changed, 235 insertions(+), 62 deletions(-) diff --git a/website/_docs/gettingstarted/quickstart.cn.md b/website/_docs/gettingstarted/quickstart.cn.md index 18274dd..518c398 100644 --- a/website/_docs/gettingstarted/quickstart.cn.md +++ b/website/_docs/gettingstarted/quickstart.cn.md @@ -14,25 +14,23 @@ since: v0.6.x ### 一、 从docker镜像安装使用kylin(不需要提前准备hadoop环境) 为了让用户方便的试用 Kylin,我们提供了 Kylin 的 docker 镜像。该镜像中,Kylin 依赖的各个服务均已正确的安装及部署,包括: -- jdk 1.8 +- JDK 1.8 - Hadoop 2.7.0 - Hive 1.2.1 -- Hbase 1.1.2 +- Hbase 1.1.2 (with Zookeeper) - Spark 2.3.1 -- Zookeeper 3.4.6 - Kafka 1.1.1 -- Mysql -- Maven 3.6.1 +- MySQL 5.1.73 我们已将面向用户的 Kylin 镜像上传至 docker 仓库,用户无需在本地构建镜像,只需要安装docker,就可以体验kylin的一键安装。 #### step1、首先执行以下命令从 docker 仓库 pull 镜像: ``` -docker pull apachekylin/apache-kylin-standalone:3.0.1 +docker pull apachekylin/apache-kylin-standalone:3.1.0 ``` -此处的镜像包含的是kylin最新Release版本kylin 3.0.1。由于该镜像中包含了所有kylin依赖的大数据组件,所以拉取镜像需要的时间较长,请耐心等待。Pull成功后显示如下: +此处的镜像包含的是kylin最新Release版本kylin 3.1.0。由于该镜像中包含了所有kylin依赖的大数据组件,所以拉取镜像需要的时间较长,请耐心等待。Pull成功后显示如下:  #### step2、执行以下命令来启动容器: @@ -46,7 +44,7 @@ docker run -d \ -p 8032:8032 \ -p 8042:8042 \ -p 16010:16010 \ 
-apachekylin/apache-kylin-standalone:3.0.1 +apachekylin/apache-kylin-standalone:3.1.0 ``` 容器会很快启动,由于容器内指定端口已经映射到本机端口,可以直接在本机浏览器中打开各个服务的页面,如: @@ -74,7 +72,7 @@ KAFKA_HOME=/home/admin/kafka_2.11-1.1.1 SPARK_HOME=/home/admin/spark-2.3.1-bin-hadoop2.6 HBASE_HOME=/home/admin/hbase-1.1.2 HIVE_HOME=/home/admin/apache-hive-1.2.1-bin -KYLIN_HOME=/home/admin/apache-kylin-3.0.0-alpha2-bin-hbase1x +KYLIN_HOME=/home/admin/apache-kylin-3.1.0-bin-hbase1x ``` 使用ADMIN/KYLIN的用户名和密码组合登陆Kylin后,用户可以使用sample cube来体验cube的构建和查询,也可以按照下面“基于hadoop环境安装使用kylin”中从step8之后的教程来创建并查询属于自己的model和cube。 @@ -105,11 +103,11 @@ CentOS 6.5+ 或Ubuntu 16.0.4+ #### step1、下载kylin压缩包 -从[Apache Kylin Download Site](https://kylin.apache.org/download/)下载一个适用于你的Hadoop版本的二进制文件。目前最新Release版本是kylin 3.0.1和kylin 2.6.5,其中3.0版本支持实时摄入数据进行预计算的功能。以CDH 5.的hadoop环境为例,可以使用如下命令行下载kylin 3.0.0: +从[Apache Kylin Download Site](https://kylin.apache.org/download/)下载一个适用于你的Hadoop版本的二进制文件。目前最新Release版本是kylin 3.1.0和kylin 2.6.6,其中3.0版本支持实时摄入数据进行预计算的功能。以CDH 5.的hadoop环境为例,可以使用如下命令行下载kylin 3.1.0: ``` cd /usr/local/ -wget http://apache.website-solution.net/kylin/apache-kylin-3.0.0/apache-kylin-3.0.0-bin-cdh57.tar.gz +wget http://apache.website-solution.net/kylin/apache-kylin-3.1.0/apache-kylin-3.1.0-bin-cdh57.tar.gz ``` #### step2、解压kylin @@ -117,8 +115,8 @@ wget http://apache.website-solution.net/kylin/apache-kylin-3.0.0/apache-kylin-3. 解压下载得到的kylin压缩包,并配置环境变量KYLIN_HOME指向解压目录: ``` -tar -zxvf apache-kylin-3.0.0-bin-cdh57.tar.gz -cd apache-kylin-3.0.0-bin-cdh57 +tar -zxvf apache-kylin-3.1.0-bin-cdh57.tar.gz +cd apache-kylin-3.1.0-bin-cdh57 export KYLIN_HOME=`pwd` ``` @@ -160,7 +158,7 @@ $KYLIN_HOME/bin/kylin.sh start ``` A new Kylin instance is started by root. 
To stop it, run 'kylin.sh stop' -Check the log at /usr/local/apache-kylin-3.0.0-bin-cdh57/logs/kylin.log +Check the log at /usr/local/apache-kylin-3.1.0-bin-cdh57/logs/kylin.log Web UI is at http://<hostname>:7070/kylin ``` diff --git a/website/_docs/gettingstarted/quickstart.md b/website/_docs/gettingstarted/quickstart.md index 161696e..27fe8dc 100644 --- a/website/_docs/gettingstarted/quickstart.md +++ b/website/_docs/gettingstarted/quickstart.md @@ -1,5 +1,5 @@ --- -layout: docs-cn +layout: docs title: Quick Start categories: start permalink: /docs/gettingstarted/kylin-quickstart.html @@ -14,15 +14,13 @@ Users can follow these steps to get an initial understanding of how to use Kylin In order to make it easy for users to try out Kylin, Zhu Weibin of Ant Financial has contributed “Kylin Docker Image” to the community. In this image, various services that Kylin depends on have been installed and deployed, including: -- Jdk 1.8 +- JDK 1.8 - Hadoop 2.7.0 - Hive 1.2.1 -- Hbase 1.1.2 +- Hbase 1.1.2 (with Zookeeper) - Spark 2.3.1 -- Zookeeper 3.4.6 - Kafka 1.1.1 -- Mysql -- Maven 3.6.1 +- MySQL 5.1.73 We have uploaded the user facing Kylin image to the Docker repository. Users do not need to build the image locally; they only need to install Docker to experience Kylin’s one-click installation. @@ -30,10 +28,10 @@ We have uploaded the user facing Kylin image to the Docker repository. Users do First, execute the following command to pull the image from the Docker repository: ``` -docker pull apachekylin/apache-kylin-standalone:3.0.1 +docker pull apachekylin/apache-kylin-standalone:3.1.0 ``` -The image here contains the latest version of Kylin: Kylin v3.0.1. This image contains all of the big data components that Kylin depends on, so it takes a long time to pull the image – please be patient. After the pull is successful, it is displayed as follows: +The image here contains the latest version of Kylin: Kylin v3.1.0. 
This image contains all of the big data components that Kylin depends on, so it takes a long time to pull the image – please be patient. After the pull is successful, it is displayed as follows:  @@ -49,7 +47,7 @@ docker run -d \ -p 8032:8032 \ -p 8042:8042 \ -p 16010:16010 \ -apachekylin/apache-kylin-standalone:3.0.1 +apachekylin/apache-kylin-standalone:3.1.0 ``` The container will start shortly. Since the specified port in the container has been mapped to the local port, you can directly open the pages of each service in the local browser, such as: @@ -68,13 +66,13 @@ When the container starts, the following services are automatically started: It will also automatically run $ KYLIN_HOME / bin / sample.sh and create a kylin_streaming_topic in Kafka and continue to send data to that topic to allow users to experience building and querying cubes in batches and streams as soon as the container is launched. Users can enter the container through the docker exec command. The relevant environment variables in the container are as follows: -- JAVA_HOME = / home / admin / jdk1.8.0_141 -- HADOOP_HOME = / home / admin / hadoop-2.7.0 -- KAFKA_HOME = / home / admin / kafka_2.11-1.1.1 -- SPARK_HOME = / home / admin / spark-2.3.1-bin-hadoop2.6 -- HBASE_HOME = / home / admin / hbase-1.1.2 -- HIVE_HOME = / home / admin / apache-hive-1.2.1-bin -- KYLIN_HOME = / home / admin / apache-kylin-3.0.0-alpha2-bin-hbase1x +- JAVA_HOME = /home/admin/jdk1.8.0_141 +- HADOOP_HOME = /home/admin/hadoop-2.7.0 +- KAFKA_HOME = /home/admin/kafka_2.11-1.1.1 +- SPARK_HOME = /home/admin/spark-2.3.1-bin-hadoop2.6 +- HBASE_HOME = /home/admin/hbase-1.1.2 +- HIVE_HOME = /home/admin/apache-hive-1.2.1-bin +- KYLIN_HOME = /home/admin/apache-kylin-3.1.0-bin-hbase1x After logging in to Kylin with user/password of ADMIN/KYLIN, users can use the sample cube to experience the construction and query of the cube, or they can create and query their own models and cubes by following the tutorial from Step 8 in “Install 
and Use Kylin Based on a Hadoop Environment” below. @@ -110,19 +108,19 @@ It is recommended to use an integrated Hadoop environment for Kylin installation When your environment meets the above prerequisites, you can install and start using Kylin. #### Step1. Download the Kylin Archive -Download a binary for your version of Hadoop from [Apache Kylin Download Site](https://kylin.apache.org/download/). Currently, the latest versions are Kylin 3.0.1 and Kylin 2.6.5, of which, version 3.0 supports the function of ingesting data in real time for pre-calculation. If your Hadoop environment is CDH 5.7, you can download Kylin 3.0.0 using the following command line: +Download a binary for your version of Hadoop from [Apache Kylin Download Site](https://kylin.apache.org/download/). Currently, the latest versions are Kylin 3.1.0 and Kylin 2.6.6, of which, version 3.0 supports the function of ingesting data in real time for pre-calculation. If your Hadoop environment is CDH 5.7, you can download Kylin 3.1.0 using the following command line: ``` cd /usr/local/ -wget http://apache.website-solution.net/kylin/apache-kylin-3.0.0/apache-kylin-3.0.0-bin-cdh57.tar.gz +wget http://apache.website-solution.net/kylin/apache-kylin-3.1.0/apache-kylin-3.1.0-bin-cdh57.tar.gz ``` #### Step2. Extract Kylin Extract the downloaded Kylin archive and configure the environment variable KYLIN_HOME to point to the extracted directory: ``` -tar -zxvf apache-kylin-3.0.0-bin-cdh57.tar.gz -cd apache-kylin-3.0.0-bin-cdh57 +tar -zxvf apache-kylin-3.1.0-bin-cdh57.tar.gz +cd apache-kylin-3.1.0-bin-cdh57 export KYLIN_HOME=`pwd` ``` @@ -157,7 +155,7 @@ Start script to start Kylin. If the startup is successful, the following will be ``` A new Kylin instance is started by root. 
To stop it, run 'kylin.sh stop'
-Check the log at /usr/local/apache-kylin-3.0.0-bin-cdh57/logs/kylin.log
+Check the log at /usr/local/apache-kylin-3.1.0-bin-cdh57/logs/kylin.log
Web UI is at http://<hostname>:7070/kylin
```
diff --git a/website/_docs/howto/howto_use_mr_hive_dict.md b/website/_docs/howto/howto_use_mr_hive_dict.md index b9f5c96..bfaf483 100644
--- a/website/_docs/howto/howto_use_mr_hive_dict.md
+++ b/website/_docs/howto/howto_use_mr_hive_dict.md
@@ -8,11 +8,12 @@ permalink: /docs/howto/howto_use_hive_mr_dict.html
## Global Dictionary in Hive
### Background
-Count distinct(bitmap) measure is very important for many scenario, such as PageView statistics, and Kylin support count distinct since 1.5.3 .
-Apache Kylin implements precisely count distinct measure based on bitmap, and use global dictionary to encode string value into integer.
-Currently we have to build global dictionary in single process/JVM, which may take a lot of time and memory for UHC.
-Kylin v3.0.0 introduce Hive global dictionary v1(KYLIN-3841). By this feature, we use Hive, a distributed SQL engine to build global dictionary.
-For improve performance, kylin v3.1.0 use MapReduce replace HQL in some steps, introduce Hive global dictionary v2(KYLIN-4342).
+
+- The count distinct (bitmap) measure is very important for many scenarios, such as PageView statistics; Kylin has supported count distinct since v1.5.3.
+- Apache Kylin implements a precise count distinct measure based on bitmaps, and uses a global dictionary to encode string values into integers.
+- Previously, the global dictionary had to be built in a single process/JVM, which may take a lot of time and memory for ultra-high-cardinality (UHC) columns.
+- Kylin v3.0.0 introduced Hive global dictionary v1 (KYLIN-3841), which uses Hive, a distributed SQL engine, to build the global dictionary.
+- To improve performance, Kylin v3.1.0 replaces HQL with MapReduce in some steps, introducing Hive global dictionary v2 (KYLIN-4342).
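The idea behind the background above — a global dictionary assigns each distinct string a stable integer ID, so per-segment bitmaps of IDs can be merged for an exact count distinct — can be sketched as a toy (this is illustrative only, not Kylin's actual implementation; real deployments use Roaring bitmaps, not Python integers):

```python
# Toy sketch of a global dictionary for exact count distinct (NOT Kylin's code).
# The mapping from value to ID must stay consistent across all cube segments,
# otherwise bitmaps from different segments cannot be merged correctly.

class GlobalDictionary:
    def __init__(self):
        self._ids = {}

    def encode(self, value):
        # Assign the next integer ID the first time a value is seen.
        if value not in self._ids:
            self._ids[value] = len(self._ids)
        return self._ids[value]

def bitmap(dictionary, values):
    # Encode each value and set the corresponding bit.
    bits = 0
    for v in values:
        bits |= 1 << dictionary.encode(v)
    return bits

d = GlobalDictionary()
seg1 = bitmap(d, ["user_a", "user_b"])  # built while processing one segment
seg2 = bitmap(d, ["user_b", "user_c"])  # built while processing another
merged = seg1 | seg2                    # merging segments is a cheap bitwise OR
print(bin(merged).count("1"))           # exact distinct count: 3
```

Because the dictionary is shared, "user_b" gets the same ID in both segments and is counted only once after the merge; building this shared dictionary is exactly the step that Hive global dictionary v1/v2 distribute.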
### Benefit Summary 1.Build Global Dictionary in distributed way, thus building job spent less time. diff --git a/website/_docs/install/kylin_docker.cn.md b/website/_docs/install/kylin_docker.cn.md index 9d63554..e0bf234 100644 --- a/website/_docs/install/kylin_docker.cn.md +++ b/website/_docs/install/kylin_docker.cn.md @@ -8,7 +8,7 @@ since: v3.0.0 为了让用户方便的试用 Kylin,以及方便开发者在修改了源码后进行验证及调试。我们提供了 Kylin 的 docker 镜像。该镜像中,Kylin 依赖的各个服务均已正确的安装及部署,包括: -- Jdk 1.8 +- JDK 1.8 - Hadoop 2.7.0 - Hive 1.2.1 - Hbase 1.1.2 (with Zookeeper) diff --git a/website/_docs/install/kylin_docker.md b/website/_docs/install/kylin_docker.md index 3521b77..aba06ec 100644 --- a/website/_docs/install/kylin_docker.md +++ b/website/_docs/install/kylin_docker.md @@ -8,7 +8,7 @@ since: v3.0.0 In order to allow users to easily try Kylin, and to facilitate developers to verify and debug after modifying the source code. We provide Kylin's docker image. In this image, each service that Kylin relies on is properly installed and deployed, including: -- Jdk 1.8 +- JDK 1.8 - Hadoop 2.7.0 - Hive 1.2.1 - Hbase 1.1.2 (with Zookeeper) diff --git a/website/_docs/tutorial/cube_migration.cn.md b/website/_docs/tutorial/cube_migration.cn.md new file mode 100644 index 0000000..d9cde69 --- /dev/null +++ b/website/_docs/tutorial/cube_migration.cn.md @@ -0,0 +1,164 @@ +--- +layout: docs-cn +title: "Cube 迁移" +categories: 教程 +permalink: /cn/docs/tutorial/cube_migration.html +since: v3.1.0 +--- + +Cube迁移功能主要用于把QA环境下的Cube迁移到PROD环境下,Kylin v3.1.0对这个功能进行了加强,加强的功能列表如下: + +- 在迁移前,Kylin会使用内部定义的一些规则对Cube的质量及兼容性做校验,之前的版本则需要人工去校验; +- 通过邮件的方式发送迁移请求及迁移结果通知,取代之前的人工沟通; +- 支持跨Hadoop集群的迁移功能; + +## I. 在同一个Hadoop集群下的Cube迁移 + +提供如下两种方式来迁移同一个Hadoop集群下的Cube: + +- 使用Kylin portal; +- 使用工具类'CubeMigrationCLI.java'; + +### 1. 迁移的前置条件 + +1. Cube迁移的操作按钮只有Cube的管理员才可见。 +2. 在迁移前,必须对要迁移的Cube进行构建,确认查询性能,Cube的状态必须是**READY**。 +3. 配置项'**kylin.cube.migration.enabled**'必须是true。 +4. 确保Cube要迁移的目标项目(PROD环境下)必须存在。 +5. 
QA环境和PROD环境必须在同一个Hadoop集群下, 即具有相同的 HDFS, HBase and HIVE等。 + +### 2. 通过Web界面进行Cube迁移的步骤 + +首先,要确保有操作Cube的权限。 + +#### 步骤 1 +在QA环境里的 'Model' 页面,点击'Actions'列中的'Action'下拉列表,选择'Migrate'操作: + +  + +#### 步骤 2 +在点击'Migrate'按钮后, 将会出现一个弹出框: + +  + +#### 步骤 3 +在弹出框中输入PROD环境的目标项目名称,使用QA环境的项目名称作为默认值。 + +#### 步骤 4 +在弹出框中点击'Validate'按钮,将会在后端对迁移的Cube做一些验证,待验证完毕,会出现验证结果的弹出框。 + + **验证异常及解决方法** + + - `The target project XXX does not exist on PROD-KYLIN-INSTANCE:7070`: 输入的PROD环境的目标项目名称必须存在。 + + - `Cube email notification list is not set or empty`: 要迁移的Cube的邮件通知列表不能为空。 + + **建议性提示** + + - `Auto merge time range for cube XXXX is not set`: 建议设置Cube的配置项:'Auto Merge Threshold'。 + - `ExpansionRateRule: failed on expansion rate check with exceeding 5`: Cube的膨胀率超过配置项'kylin.cube.migration.expansion-rate'配置的值,可以设置为一个合理的值。 + - `Failed on query latency check with average cost 5617 exceeding 2000ms`: 如果设置配置项'kylin.cube.migration.rule-query-latency-enabled'为true, 在验证阶段后端会自动生成一些SQL来测试Cube的查询性能,可以合理设置配置项'kylin.cube.migration.query-latency-seconds'的值。 + +#### 步骤 5 + +待验证通过,点击'Submit'按钮发起Cube迁移请求给Cube的管理员。后端会自动发送请求邮件给Cube管理员: + +  + +#### 步骤 6 +Cube管理员在接收到Cube迁移请求邮件后,可以通过'Model'页面里'Admins'列的'Action'下拉列表,选择'Approve Migration'操作还是'Reject Migration'操作,同时后端会自动发送请求结果邮件给请求者: + +  + +#### 步骤 7 +如果Cube管理员选择'Approve Migration',将会出现如下弹出框: + +  + +在弹出框输入正确的目标项目名称,点击'Approve'按钮,后端开始迁移Cube。 + +#### 步骤 8 +迁移Cube成功后,将会出现如下弹出框,显示迁移成功: + +  + +#### 步骤 9 +最后, 在PROD环境下的'Model'页面,迁移的Cube会出现在列表中,且状态是**DISABLED**。 + +### 3. 
使用'CubeMigrationCLI.java'工具类进行迁移 + +#### 作用 +CubeMigrationCLI.java 用于迁移 cubes。例如:将 cube 从测试环境迁移到生产环境。请注意,不同的环境是共享相同的 Hadoop 集群,包括 HDFS,HBase 和 HIVE。此 CLI 不支持跨 Hadoop 集群的数据迁移。 + +#### 如何使用 + +前八个参数必须有且次序不能改变。 +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI <srcKylinConfigUri> <dstKylinConfigUri> <cubeName> <projectName> <copyAclOrNot> <purgeOrNot> <overwriteIfExists> <realExecute> <migrateSegmentOrNot> +{% endhighlight %} +例如: +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI kylin-qa:7070 kylin-prod:7070 kylin_sales_cube learn_kylin true false false true false +{% endhighlight %} +命令执行成功后,请 reload metadata,您想要迁移的 cube 将会存在于迁移后的 project 中。 + +下面会列出所有支持的参数: +- 如果您使用 `cubeName` 这个参数,但想要迁移的 cube 所对应的 model 在要迁移的环境中不存在,model 的数据也会迁移过去。 +- 如果您将 `overwriteIfExists` 设置为 false,且该 cube 已存在于要迁移的环境中,当您运行命令,cube 存在的提示信息将会出现。 +- 如果您将 `migrateSegmentOrNot` 设置为 true,请保证 Kylin metadata 的 HDFS 目录存在且 Cube 的状态为 READY。 + +| Parameter | Description | +| ------------------- | :----------------------------------------------------------------------------------------- | +| srcKylinConfigUri | The URL of the source environment's Kylin configuration. It can be `host:7070`, or an absolute file path to the `kylin.properties`. | +| dstKylinConfigUri | The URL of the target environment's Kylin configuration. | +| cubeName | the name of Cube to be migrated.(Make sure it exist) | +| projectName | The target project in the target environment.(Make sure it exist) | +| copyAclOrNot | `true` or `false`: whether copy Cube ACL to target environment. | +| purgeOrNot | `true` or `false`: whether purge the Cube from src server after the migration. | +| overwriteIfExists | `true` or `false`: overwrite cube if it already exists in the target environment. | +| realExecute | `true` or `false`: if false, just print the operations to take, if true, do the real migration. 
| +| migrateSegmentOrNot | (Optional) true or false: whether copy segment data to target environment. Default true. | + +## II. 跨Hadoop集群下的Cube迁移 + +**注意**: + +- 当前只支持使用工具类'CubeMigrationCrossClusterCLI.java'来进行跨Hadoop集群下的Cube迁移。 +- 跨Hadoop集群的Cube迁移,支持同时把Cube数据从QA环境迁移到PROD环境。 + + +### 1. 迁移的前置条件 +1. 在迁移前,必须对要迁移的Cube进行构建Segment,确认查询性能,Cube的状态必须是**READY**。 +2. PROD环境下的目标项目名称必须和QA环境下的项目名称一致。 + +### 2. 如何使用工具类'CubeMigrationCrossClusterCLI.java'来迁移Cube + +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.migration.CubeMigrationCrossClusterCLI <kylinUriSrc> <kylinUriDst> <updateMappingPath> <cube> <hybrid> <project> <all> <dstHiveCheck> <overwrite> <schemaOnly> <execute> <coprocessorPath> <codeOfFSHAEnabled> <distCpJobQueue> <distCpJobMemory> <nThread> +{% endhighlight %} +例如: +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.migration.CubeMigrationCrossClusterCLI -kylinUriSrc ADMIN:ky...@qa.env:17070 -kylinUriDst ADMIN:ky...@prod.env:17777 -cube kylin_sales_cube -updateMappingPath $KYLIN_HOME/updateTableMapping.json -execute true -schemaOnly false -overwrite true +{% endhighlight %} +命令执行成功后,在PROD环境下的'Model'页面,迁移的Cube会出现在列表中,且状态是**READY**。 + +下面会列出所有支持的参数: + +| Parameter | Description | +| ------------------- | :----------------------------------------------------------------------------------------- | +| kylinUriSrc | (Required) The source kylin uri with format user:pwd@host:port. | +| kylinUriDst | (Required) The target kylin uri with format user:pwd@host:port. | +| updateMappingPath | (Optional) The path for the update Hive table mapping file, the format is json. | +| cube | The cubes which you want to migrate, separated by ','. | +| hybrid | The hybrids which you want to migrate, separated by ','. | +| project | The projects which you want to migrate, separated by ','. | +| all | Migrate all projects. **Note**: You must add only one of above four parameters: 'cube', 'hybrid', 'project' or 'all'. 
| dstHiveCheck | (Optional) Whether to check target Hive tables, the default value is true. |
| overwrite | (Optional) Whether to overwrite existing cubes, the default value is false. |
| schemaOnly | (Optional) Whether to migrate only the cube-related schema, the default value is true. **Note**: If set to false, it will migrate cube data too. |
| execute | (Optional) Whether to actually execute the migration, the default value is false. |
| coprocessorPath | (Optional) The path of the coprocessor to be deployed, the default value is obtained from KylinConfigBase.getCoprocessorLocalJar(). |
| codeOfFSHAEnabled | (Optional) Whether NameNode HA is enabled on the clusters. |
| distCpJobQueue | (Optional) The mapreduce.job.queuename for the DistCp job. |
| distCpJobMemory | (Optional) The mapreduce.map.memory.mb for the DistCp job. |
| nThread | (Optional) The number of threads for migrating cube data in parallel. |
diff --git a/website/_docs/tutorial/cube_migration.md b/website/_docs/tutorial/cube_migration.md index e580fed..cc4ca67 100644
--- a/website/_docs/tutorial/cube_migration.md
+++ b/website/_docs/tutorial/cube_migration.md
@@ -6,17 +6,29 @@ permalink: /docs/tutorial/cube_migration.html since: v3.1.0 ---
-## Migrate on the same Hadoop cluster
+Cube migration is used to migrate a cube from the QA env to the PROD env. Kylin v3.1.0 enhances this feature as follows:
-### Pre-requisites to use cube migration
+- Kylin checks the quality and compatibility of the cube against internal rules before migration, instead of relying on manual checks;
+- Migration requests and result notifications are sent by email, replacing manual communication;
+- Migration across two Hadoop clusters is supported;
+
+## I. Migrate on the same Hadoop cluster
+
+There are two ways to migrate a cube from the QA env to the PROD env on the same Hadoop cluster:
+
+- Use the Kylin portal;
+- Use the 'CubeMigrationCLI.java' CLI;
+
+### 1. Prerequisites to use cube migration
1.
Only cube admins can migrate cubes, as the "migrate" button is **ONLY** visible to cube admins.
-2. The cube status must be **ready** before migration which you have built the segment and confirmed the performance.
+2. The cube status must be **READY** before migration, which means you have built the segments and confirmed the query performance.
3. The property '**kylin.cube.migration.enabled**' must be true.
4. The target project must exist on the Kylin PROD env before migration.
5. The QA env and PROD env must share the same Hadoop cluster, including HDFS, HBase and HIVE.
-### Steps to migrate a cube through the Kylin portal
+### 2. Steps to migrate a cube through the Kylin portal
+
First of all, make sure that you have the required permissions on the cube you want to migrate.

#### Step 1 @@ -33,7 +45,7 @@ After you click the 'Migrate' button, you will see a pop-up window: Check if the target project name is what you want. It uses the same project name on QA env as the default target project name. If the target project name is different on PROD env, please replace it with the correct one.

#### Step 4
-Click 'Validate' button to verify the cube validity. It may take couple of minutes to validate the cube on the backend and show the validity results on a pop-up window:
+Click the 'Validate' button to verify the cube's validity. It may take a couple of minutes to validate the cube on the backend and show the validity results in a pop-up window.

**Common exceptions and suggested solutions** @@ -54,7 +66,7 @@ If validations are ok, click the 'Submit' button to send the migration request email 

#### Step 6
-Cubes administrator will receive a migration request email, and can click the 'Action' drop down button in the 'Actions' column and select operation 'Approve Migration' button to migrate cube or select 'Reject Migration' button to reject request.
It also will send a notification email to the migration requester:
+The cube administrator will receive a migration request email, and can click the 'Action' drop-down button in the 'Admins' column and select 'Approve Migration' to migrate the cube or 'Reject Migration' to reject the request. A notification email will also be sent to the migration requester:



#### Step 7 @@ -73,7 +85,7 @@ If the migration succeeds, it will show the message below:

#### Step 9
Finally, go to the Kylin portal on PROD env and refresh the 'Model' page; you will see the cube you migrated from QA env, and the status of this cube is **DISABLED**.

-### Use 'CubeMigrationCLI.java' CLI to migrate cube
+### 3. Use 'CubeMigrationCLI.java' CLI to migrate cube

#### Function
CubeMigrationCLI.java can migrate a cube from one Kylin environment to another, for example, promoting a well-tested cube from the testing env to the production env. Note that the different Kylin environments should share the same Hadoop cluster, including HDFS, HBase and HIVE. @@ -92,9 +104,10 @@ For example: After the command is successfully executed, please reload Kylin metadata; the cube you want to migrate will appear in the target environment.

All supported parameters are listed below: - If the data model of the cube you want to migrate does not exist in the target environment, this tool will also migrate the model. - If you set `overwriteIfExists` to `false`, and the cube exists in the target environment, the tool will stop to proceed. - If you set `migrateSegmentOrNot` to `true`, please make sure the cube has `READY` segments, they will be migrated to target environment together.
+
+- If the data model of the cube you want to migrate does not exist in the target environment, this tool will also migrate the model.
+- If you set `overwriteIfExists` to `false`, and the cube already exists in the target environment, the tool will stop and not proceed.
+- If you set `migrateSegmentOrNot` to `true`, please make sure the cube has `READY` segments; they will be migrated to the target environment together.

| Parameter | Description |
| ------------------- | :----------------------------------------------------------------------------------------- | @@ -108,15 +121,18 @@ All supported parameters are listed below:
| realExecute | `true` or `false`: If false, just print the operations to take (dry-run mode); if true, do the real migration. |
| migrateSegmentOrNot | (Optional) `true` or `false`: whether copy segment info to the target environment. Default true. |

-## Migrate across two Hadoop clusters
+## II. Migrate across two Hadoop clusters

-**Note**: Currently it just supports to use 'CubeMigrationCrossClusterCLI.java' CLI to migrate cube across two Hadoop clusters.
+**Note**:

-### Pre-requisitions to use cube migration
-1. The cube status must be **ready** before migration which you have built the segment and confirmed the performance.
+- Currently, migrating a cube across two Hadoop clusters is only supported via the 'CubeMigrationCrossClusterCLI.java' CLI.
+- It also supports migrating cube data (segment data on HBase) from the QA env to the PROD env.
+
+### 1. Prerequisites to use cube migration
+1. The cube status must be **READY** before migration, which means you have built the segments and confirmed the query performance.
2. The target project name on PROD env must be the same as the one on QA env.

-### How to use 'CubeMigrationCrossClusterCLI.java' CLI to migrate cube
+### 2.
How to use 'CubeMigrationCrossClusterCLI.java' CLI to migrate cube

{% highlight Groff markup %}
./bin/kylin.sh org.apache.kylin.tool.migration.CubeMigrationCrossClusterCLI <kylinUriSrc> <kylinUriDst> <updateMappingPath> <cube> <hybrid> <project> <all> <dstHiveCheck> <overwrite> <schemaOnly> <execute> <coprocessorPath> <codeOfFSHAEnabled> <distCpJobQueue> <distCpJobMemory> <nThread>
@@ -140,14 +156,10 @@ All supported parameters are listed below:
| all | Migrate all projects. **Note**: You must add exactly one of the above four parameters: 'cube', 'hybrid', 'project' or 'all'. |
| dstHiveCheck | (Optional) Whether to check target Hive tables, the default value is true. |
| overwrite | (Optional) Whether to overwrite existing cubes, the default value is false. |
-| schemaOnly | (Optional) Whether only migrate cube related schema, the default value is true. |
+| schemaOnly | (Optional) Whether to migrate only the cube-related schema, the default value is true. **Note**: If set to false, it will migrate cube data too. |
| execute | (Optional) Whether to actually execute the migration, the default value is false. |
| coprocessorPath | (Optional) The path of the coprocessor to be deployed, the default value is obtained from KylinConfigBase.getCoprocessorLocalJar(). |
| codeOfFSHAEnabled | (Optional) Whether NameNode HA is enabled on the clusters. |
| distCpJobQueue | (Optional) The mapreduce.job.queuename for the DistCp job. |
| distCpJobMemory | (Optional) The mapreduce.map.memory.mb for the DistCp job. |
| nThread | (Optional) The number of threads for migrating cube data in parallel. |
-
-
-
-
diff --git a/website/images/docs/quickstart/pull_docker.png b/website/images/docs/quickstart/pull_docker.png index 5b5a88c..16236e4 100644 Binary files a/website/images/docs/quickstart/pull_docker.png and b/website/images/docs/quickstart/pull_docker.png differ
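The CubeMigrationCLI invocation shown in this patch takes its first eight arguments positionally, in a fixed order, which is easy to get wrong by hand. As an illustration only (this helper is hypothetical and not part of Kylin), the command line can be assembled programmatically so the documented order is always respected:

```python
# Hypothetical helper (NOT shipped with Kylin): builds the CubeMigrationCLI
# command line, enforcing the positional-argument order documented above:
# srcUri dstUri cubeName projectName copyAcl purge overwrite realExecute migrateSegment

def build_migration_cmd(src_uri, dst_uri, cube, project,
                        copy_acl=True, purge=False, overwrite=False,
                        real_execute=False, migrate_segment=True):
    def flag(b):
        # The CLI expects literal lowercase "true"/"false" strings.
        return "true" if b else "false"
    return ["./bin/kylin.sh", "org.apache.kylin.tool.CubeMigrationCLI",
            src_uri, dst_uri, cube, project,
            flag(copy_acl), flag(purge), flag(overwrite),
            flag(real_execute), flag(migrate_segment)]

# Reproduces the example invocation from the documentation.
cmd = build_migration_cmd("kylin-qa:7070", "kylin-prod:7070",
                          "kylin_sales_cube", "learn_kylin",
                          copy_acl=True, real_execute=True,
                          migrate_segment=False)
print(" ".join(cmd))
# ./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI kylin-qa:7070 kylin-prod:7070 kylin_sales_cube learn_kylin true false false true false
```

Passing the resulting list to `subprocess.run(cmd)` on a machine with a Kylin checkout would perform the migration; with `real_execute=False` the CLI itself runs in dry-run mode and only prints the operations it would take.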