This is an automated email from the ASF dual-hosted git repository. xxyu pushed a commit to branch document in repository https://gitbox.apache.org/repos/asf/kylin.git
commit 7f024d4af9acfc457e4eac96f553115bbc9fd1ec Author: Zhichao Zhang <441586...@qq.com> AuthorDate: Tue Jun 30 11:23:23 2020 +0800 add documents for KYLIN-4485 --- website/_data/docs-cn.yml | 1 + website/_data/docs.yml | 3 +- website/_data/docs31-cn.yml | 1 + website/_data/docs31.yml | 1 + website/_docs/howto/howto_use_cli.cn.md | 33 ----- website/_docs/howto/howto_use_cli.md | 36 ----- website/_docs/index.cn.md | 19 +-- website/_docs/index.md | 19 +-- website/_docs/install/kylin_docker.cn.md | 10 +- website/_docs/install/kylin_docker.md | 8 +- website/_docs/tutorial/cube_migration.cn.md | 10 ++ website/_docs/tutorial/cube_migration.md | 149 ++++++++++++++++++++ website/_docs31/howto/howto_use_cli.cn.md | 33 ----- website/_docs31/howto/howto_use_cli.md | 36 ----- website/_docs31/index.cn.md | 19 +-- website/_docs31/index.md | 19 +-- website/_docs31/install/kylin_docker.cn.md | 8 +- website/_docs31/install/kylin_docker.md | 8 +- website/_docs31/tutorial/cube_migration.cn.md | 7 + website/_docs31/tutorial/cube_migration.md | 153 +++++++++++++++++++++ .../Kylin-Cube-Migration/1_request_migration.png | Bin 0 -> 163499 bytes .../2_input_target_project.png | Bin 0 -> 63083 bytes .../3_cube_migration_request_succ.png | Bin 0 -> 60100 bytes .../3.1/Kylin-Cube-Migration/4_approve_reject.png | Bin 0 -> 88781 bytes .../Kylin-Cube-Migration/5_approve_migration.png | Bin 0 -> 36918 bytes .../6_migration_successfully.png | Bin 0 -> 42744 bytes 26 files changed, 377 insertions(+), 196 deletions(-) diff --git a/website/_data/docs-cn.yml b/website/_data/docs-cn.yml index c8c31fe..5384355 100644 --- a/website/_data/docs-cn.yml +++ b/website/_data/docs-cn.yml @@ -33,6 +33,7 @@ - tutorial/web - tutorial/create_cube - tutorial/cube_build_job + - tutorial/cube_migration - tutorial/sql_reference - tutorial/project_level_acl - tutorial/cube_spark diff --git a/website/_data/docs.yml b/website/_data/docs.yml index aaccf40..d49f98f 100644 --- a/website/_data/docs.yml +++ 
b/website/_data/docs.yml @@ -41,6 +41,7 @@ - tutorial/web - tutorial/create_cube - tutorial/cube_build_job + - tutorial/cube_migration - tutorial/sql_reference - tutorial/project_level_acl - tutorial/cube_spark @@ -91,4 +92,4 @@ - title: Security docs: - - security \ No newline at end of file + - security diff --git a/website/_data/docs31-cn.yml b/website/_data/docs31-cn.yml index 570e363..8295b94 100644 --- a/website/_data/docs31-cn.yml +++ b/website/_data/docs31-cn.yml @@ -32,6 +32,7 @@ - tutorial/web - tutorial/create_cube - tutorial/cube_build_job + - tutorial/cube_migration - tutorial/sql_reference - tutorial/project_level_acl - tutorial/cube_spark diff --git a/website/_data/docs31.yml b/website/_data/docs31.yml index 6b604cf..f25d092 100644 --- a/website/_data/docs31.yml +++ b/website/_data/docs31.yml @@ -40,6 +40,7 @@ - tutorial/web - tutorial/create_cube - tutorial/cube_build_job + - tutorial/cube_migration - tutorial/sql_reference - tutorial/project_level_acl - tutorial/cube_spark diff --git a/website/_docs/howto/howto_use_cli.cn.md b/website/_docs/howto/howto_use_cli.cn.md index fd18a49..a73f3f1 100644 --- a/website/_docs/howto/howto_use_cli.cn.md +++ b/website/_docs/howto/howto_use_cli.cn.md @@ -99,39 +99,6 @@ CubeMetaIngester.java 将提取的 cube 注入到另一个 metadata store 中。 | project <project> | (Required) Specify the target project for the new cubes. | srcPath <srcPath> | (Required) Specify the path to the extracted Cube metadata zip file. 
| -## CubeMigrationCLI.java - -### 作用 -CubeMigrationCLI.java 用于迁移 cubes。例如:将 cube 从测试环境迁移到生产环境。请注意,不同的环境是共享相同的 Hadoop 集群,包括 HDFS,HBase 和 HIVE。此 CLI 不支持跨 Hadoop 集群的数据迁移。 - -### 如何使用 -前八个参数必须有且次序不能改变。 -{% highlight Groff markup %} -./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI <srcKylinConfigUri> <dstKylinConfigUri> <cubeName> <projectName> <copyAclOrNot> <purgeOrNot> <overwriteIfExists> <realExecute> <migrateSegmentOrNot> -{% endhighlight %} -例如: -{% highlight Groff markup %} -./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI kylin-qa:7070 kylin-prod:7070 kylin_sales_cube learn_kylin true false false true false -{% endhighlight %} -命令执行成功后,请 reload metadata,您想要迁移的 cube 将会存在于迁移后的 project 中。 - -下面会列出所有支持的参数: - 如果您使用 `cubeName` 这个参数,但想要迁移的 cube 所对应的 model 在要迁移的环境中不存在,model 的数据也会迁移过去。 - 如果您将 `overwriteIfExists` 设置为 false,且该 cube 已存在于要迁移的环境中,当您运行命令,cube 存在的提示信息将会出现。 - 如果您将 `migrateSegmentOrNot` 设置为 true,请保证 Kylin metadata 的 HDFS 目录存在且 Cube 的状态为 READY。 - -| Parameter | Description | -| ------------------- | :----------------------------------------------------------------------------------------- | -| srcKylinConfigUri | The URL of the source environment's Kylin configuration. It can be `host:7070`, or an absolute file path to the `kylin.properties`. | -| dstKylinConfigUri | The URL of the target environment's Kylin configuration. | -| cubeName | the name of Cube to be migrated.(Make sure it exist) | -| projectName | The target project in the target environment.(Make sure it exist) | -| copyAclOrNot | `true` or `false`: whether copy Cube ACL to target environment. | -| purgeOrNot | `true` or `false`: whether purge the Cube from src server after the migration. | -| overwriteIfExists | `true` or `false`: overwrite cube if it already exists in the target environment. | -| realExecute | `true` or `false`: if false, just print the operations to take, if true, do the real migration. 
| -| migrateSegmentOrNot | (Optional) true or false: whether copy segment data to target environment. Default true. | - ## CubeMigrationCheckCLI.java ### 作用 diff --git a/website/_docs/howto/howto_use_cli.md b/website/_docs/howto/howto_use_cli.md index 1024295..aadaa7a 100644 --- a/website/_docs/howto/howto_use_cli.md +++ b/website/_docs/howto/howto_use_cli.md @@ -103,42 +103,6 @@ All supported parameters are listed below: | project <project> | (Required) Specify the target project for the new cubes. | | srcPath <srcPath> | (Required) Specify the path to the extracted Cube metadata zip file. | -## CubeMigrationCLI.java - -### Function -CubeMigrationCLI.java can migrate a cube from a Kylin environment to another, for example, promote a well tested cube from the testing env to production env. Note that the different Kylin environments should share the same Hadoop cluster, including HDFS, HBase and HIVE. - -Please note, this tool will migrate the Kylin metadata, rename the Kylin HDFS folders and update HBase table's metadata. It doesn't migrate data across Hadoop clusters. - -### How to use - - -{% highlight Groff markup %} -./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI <srcKylinConfigUri> <dstKylinConfigUri> <cubeName> <projectName> <copyAclOrNot> <purgeOrNot> <overwriteIfExists> <realExecute> <migrateSegmentOrNot> -{% endhighlight %} -For example: -{% highlight Groff markup %} -./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI kylin-qa:7070 kylin-prod:7070 kylin_sales_cube learn_kylin true false false true false -{% endhighlight %} -After the command is successfully executed, please reload Kylin metadata, the cube you want to migrate will appear in the target environment. - -All supported parameters are listed below: - If the data model of the cube you want to migrate does not exist in the target environment, this tool will also migrate the model. 
- If you set `overwriteIfExists` to `false`, and the cube exists in the target environment, the tool will stop to proceed. - If you set `migrateSegmentOrNot` to `true`, please make sure the cube has `READY` segments, they will be migrated to target environment together. - -| Parameter | Description | -| ------------------- | :----------------------------------------------------------------------------------------- | -| srcKylinConfigUri | The URL of the source environment's Kylin configuration. It can be `host:7070`, or an absolute file path to the `kylin.properties`. | -| dstKylinConfigUri | The URL of the target environment's Kylin configuration. | -| cubeName | the name of cube to be migrated. | -| projectName | The target project in the target environment. If it doesn't exist, create it before run this command. | -| copyAclOrNot | `true` or `false`: whether copy the cube ACL to target environment. | -| purgeOrNot | `true` or `false`: whether to purge the cube from source environment after it be migrated to target environment. | -| overwriteIfExists | `true` or `false`: whether to overwrite if it already exists in the target environment. | -| realExecute | `true` or `false`: If false, just print the operations to take (dry-run mode); if true, do the real migration. | -| migrateSegmentOrNot | (Optional) `true` or `false`: whether copy segment info to the target environment. Default true. | - ## CubeMigrationCheckCLI.java ### Function diff --git a/website/_docs/index.cn.md b/website/_docs/index.cn.md index 882419e..9dca4b9 100644 --- a/website/_docs/index.cn.md +++ b/website/_docs/index.cn.md @@ -30,15 +30,16 @@ Apache Kylin™是一个开源的、分布式的分析型数据仓库,提供 H 2. [Web 界面](tutorial/web.html) 3. [Cube 创建](tutorial/create_cube.html) 4. [Cube 构建和 Job 监控](tutorial/cube_build_job.html) -5. [SQL 快速参考](tutorial/sql_reference.html) -6. [用 Kafka 流构建 Cube](tutorial/cube_streaming.html) -7. [用 Spark 构建 Cube](tutorial/cube_spark.html) -8. 
[优化 Cube 构建](tutorial/cube_build_performance.html) -9. [查询下压](tutorial/query_pushdown.html) -10. [建立 System Cube](tutorial/setup_systemcube.html) -11. [使用 Cube Planner](tutorial/use_cube_planner.html) -12. [使用 Dashboard](tutorial/use_dashboard.html) -13. [建立 JDBC 数据源](tutorial/setup_jdbc_datasource.html) +5. [Cube 迁移](tutorial/cube_migration.html) +6. [SQL 快速参考](tutorial/sql_reference.html) +7. [用 Kafka 流构建 Cube](tutorial/cube_streaming.html) +8. [用 Spark 构建 Cube](tutorial/cube_spark.html) +9. [优化 Cube 构建](tutorial/cube_build_performance.html) +10. [查询下压](tutorial/query_pushdown.html) +11. [建立 System Cube](tutorial/setup_systemcube.html) +12. [使用 Cube Planner](tutorial/use_cube_planner.html) +13. [使用 Dashboard](tutorial/use_dashboard.html) +14. [建立 JDBC 数据源](tutorial/setup_jdbc_datasource.html) 工具集成 diff --git a/website/_docs/index.md b/website/_docs/index.md index 1351335..79e930d 100644 --- a/website/_docs/index.md +++ b/website/_docs/index.md @@ -30,15 +30,16 @@ Tutorial 2. [Web Interface](tutorial/web.html) 3. [Cube Wizard](tutorial/create_cube.html) 4. [Cube Build and Job Monitoring](tutorial/cube_build_job.html) -5. [SQL reference](tutorial/sql_reference.html) -6. [Build Cube with Streaming Data](tutorial/cube_streaming.html) -7. [Build Cube with Spark Engine](tutorial/cube_spark.html) -8. [Cube Build Tuning](tutorial/cube_build_performance.html) -9. [Enable Query Pushdown](tutorial/query_pushdown.html) -10. [Setup System Cube](tutorial/setup_systemcube.html) -11. [Optimize with Cube Planner](tutorial/use_cube_planner.html) -12. [Use System Dashboard](tutorial/use_dashboard.html) -13. [Setup JDBC Data Source](tutorial/setup_jdbc_datasource.html) +5. [Cube Migration](tutorial/cube_migration.html) +6. [SQL reference](tutorial/sql_reference.html) +7. [Build Cube with Streaming Data](tutorial/cube_streaming.html) +8. [Build Cube with Spark Engine](tutorial/cube_spark.html) +9. [Cube Build Tuning](tutorial/cube_build_performance.html) +10. 
[Enable Query Pushdown](tutorial/query_pushdown.html) +11. [Setup System Cube](tutorial/setup_systemcube.html) +12. [Optimize with Cube Planner](tutorial/use_cube_planner.html) +13. [Use System Dashboard](tutorial/use_dashboard.html) +14. [Setup JDBC Data Source](tutorial/setup_jdbc_datasource.html) Connectivity and APIs diff --git a/website/_docs/install/kylin_docker.cn.md b/website/_docs/install/kylin_docker.cn.md index b97d2ae..9d63554 100644 --- a/website/_docs/install/kylin_docker.cn.md +++ b/website/_docs/install/kylin_docker.cn.md @@ -8,22 +8,20 @@ since: v3.0.0 为了让用户方便的试用 Kylin,以及方便开发者在修改了源码后进行验证及调试。我们提供了 Kylin 的 docker 镜像。该镜像中,Kylin 依赖的各个服务均已正确的安装及部署,包括: -- JDK 1.8 +- Jdk 1.8 - Hadoop 2.7.0 - Hive 1.2.1 -- HBase 1.1.2 +- Hbase 1.1.2 (with Zookeeper) - Spark 2.3.1 -- Zookeeper 3.4.6 - Kafka 1.1.1 - MySQL 5.1.73 -- Maven 3.6.1 ## 快速试用 Kylin 我们已将面向用户的 Kylin 镜像上传至 docker 仓库,用户无需在本地构建镜像,直接执行以下命令从 docker 仓库 pull 镜像: {% highlight Groff markup %} -docker pull apachekylin/apache-kylin-standalone:3.0.1 +docker pull apachekylin/apache-kylin-standalone:3.1.0 {% endhighlight %} pull 成功后,执行以下命令启动容器: @@ -37,7 +35,7 @@ docker run -d \ -p 8032:8032 \ -p 8042:8042 \ -p 16010:16010 \ -apachekylin/apache-kylin-standalone:3.0.1 +apachekylin/apache-kylin-standalone:3.1.0 {% endhighlight %} 在容器启动时,会自动启动以下服务: diff --git a/website/_docs/install/kylin_docker.md b/website/_docs/install/kylin_docker.md index 58da0bb..3521b77 100644 --- a/website/_docs/install/kylin_docker.md +++ b/website/_docs/install/kylin_docker.md @@ -11,19 +11,17 @@ In order to allow users to easily try Kylin, and to facilitate developers to ver - Jdk 1.8 - Hadoop 2.7.0 - Hive 1.2.1 -- Hbase 1.1.2 +- Hbase 1.1.2 (with Zookeeper) - Spark 2.3.1 -- Zookeeper 3.4.6 - Kafka 1.1.1 - MySQL 5.1.73 -- Maven 3.6.1 ## Quickly try Kylin We have pushed the Kylin image for the user to the docker hub. 
Users do not need to build the image locally, just execute the following command to pull the image from the docker hub: {% highlight Groff markup %} -docker pull apachekylin/apache-kylin-standalone:3.0.1 +docker pull apachekylin/apache-kylin-standalone:3.1.0 {% endhighlight %} After the pull is successful, execute the following command to start the container: @@ -37,7 +35,7 @@ docker run -d \ -p 8032:8032 \ -p 8042:8042 \ -p 16010:16010 \ -apachekylin/apache-kylin-standalone:3.0.1 +apachekylin/apache-kylin-standalone:3.1.0 {% endhighlight %} The following services are automatically started when the container starts: diff --git a/website/_docs/tutorial/cube_migration.cn.md b/website/_docs/tutorial/cube_migration.cn.md new file mode 100644 index 0000000..322d474 --- /dev/null +++ b/website/_docs/tutorial/cube_migration.cn.md @@ -0,0 +1,10 @@ +--- +layout: docs-cn +title: "Cube 迁移" +categories: 教程 +permalink: /cn/docs/tutorial/cube_migration.html +since: v3.1.0 +--- + + + diff --git a/website/_docs/tutorial/cube_migration.md b/website/_docs/tutorial/cube_migration.md new file mode 100644 index 0000000..dfb0511 --- /dev/null +++ b/website/_docs/tutorial/cube_migration.md @@ -0,0 +1,149 @@ +--- +layout: docs +title: Cube Migration +categories: tutorial +permalink: /docs/tutorial/cube_migration.html +since: v3.1.0 +--- + +## Migrate on the same Hadoop cluster + +### Prerequisites for cube migration + +1. Only a cube admin can migrate cubes, because the "Migrate" button is visible **ONLY** to cube admins. +2. The cube status must be **READY** before migration, meaning you have built the segments and confirmed the performance. +3. The property '**kylin.cube.migration.enabled**' must be set to true. +4. The target project must exist on the Kylin PROD env before migration. +5. The QA env and the PROD env must share the same Hadoop cluster, including HDFS, HBase and HIVE.
+ +### Steps to migrate a cube through the Kylin portal +First of all, make sure that you have admin rights on the cube you want to migrate. + +#### Step 1 +On the 'Model' page, click the 'Action' drop-down button in the 'Actions' column and select the 'Migrate' operation: + +  + +#### Step 2 +After you click the 'Migrate' button, you will see a pop-up window: + +  + +#### Step 3 +Check that the target project name is the one you want. By default, the target project name is the same as the project name on the QA env. If the project name on the PROD env is different, replace it with the correct one. + +#### Step 4 +Click the 'Validate' button to verify the cube's validity. It may take a couple of minutes to validate the cube on the backend and show the validation results in a pop-up window: + + **Common exceptions and suggested solutions** + + - `The target project XXX does not exist on PROD-KYLIN-INSTANCE:7070`: enter the correct project name or create the expected project on the PROD env. + + - `Cube email notification list is not set or empty`: add a notification email to the Notification List of the cube. + + **Some advisory messages** + + - `Auto merge time range for cube XXXX is not set`: this message is shown if 'Auto Merge Threshold' is not set. You can ignore it or set 'Auto Merge Threshold' in the 'Refresh Setting' section of the cube. + - `ExpansionRateRule: failed on expansion rate check with exceeding 5`: the expansion rate of the cube you want to migrate exceeds 5, which is the value of the property 'kylin.cube.migration.expansion-rate'. You can set a value appropriate for the cube. + - `Failed on query latency check with average cost 5617 exceeding 2000ms`: if the property 'kylin.cube.migration.rule-query-latency-enabled' is set to true, Kylin generates some test SQL to measure the average query latency of the cube you want to migrate. You can set an appropriate value for the property 'kylin.cube.migration.query-latency-seconds'.
+ +#### Step 5 + +If the validation passes, click the 'Submit' button to send the migration request email to the cube administrator. If the email is sent successfully, a message like this is shown: + +  + +#### Step 6 +The cube administrator receives a migration request email and can then click the 'Action' drop-down button in the 'Actions' column and select 'Approve Migration' to migrate the cube, or 'Reject Migration' to reject the request. A notification email is also sent to the migration requester: + +  + +#### Step 7 +If the cube administrator selects 'Approve Migration', a pop-up window is shown: + +  + +After you enter the target project name and click the 'Approve' button, the cube migration starts. + +#### Step 8 +If the migration succeeds, the message below is shown: + +  + +#### Step 9 +Finally, go to the Kylin portal on the PROD env and refresh the 'Model' page. You will see the cube you migrated from the QA env, and its status is **DISABLED**. + +### Use the 'CubeMigrationCLI.java' CLI to migrate a cube + +#### Function +CubeMigrationCLI.java migrates a cube from one Kylin environment to another, for example to promote a well-tested cube from the testing env to the production env. Note that the two Kylin environments should share the same Hadoop cluster, including HDFS, HBase and HIVE. + +Please note that this tool migrates the Kylin metadata, renames the Kylin HDFS folders and updates the HBase table's metadata. It doesn't migrate data across Hadoop clusters.
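Because the tool changes metadata, HDFS folders and HBase table metadata, it may be worth previewing a migration first. The sketch below is a hypothetical wrapper, not part of the commit: it only assembles a command line in which `realExecute` (the eighth argument in the usage documented for this tool) is `false`, which that documentation describes as a mode that just prints the operations to take. The function name `build_dry_run_cmd` is an assumption, and the host names, cube and project are the placeholder values from the documented example.

```shell
#!/bin/sh
# Hypothetical helper: assemble a CubeMigrationCLI command line for a dry run.
# Arguments: <srcKylinConfigUri> <dstKylinConfigUri> <cubeName> <projectName>
build_dry_run_cmd() {
    src="$1"; dst="$2"; cube="$3"; project="$4"
    # Fixed flags, in documented order:
    #   copyAclOrNot=true purgeOrNot=false overwriteIfExists=false
    #   realExecute=false (dry run) migrateSegmentOrNot=false
    echo "./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI $src $dst $cube $project true false false false false"
}

# Print the command instead of running it, so it can be reviewed first.
build_dry_run_cmd kylin-qa:7070 kylin-prod:7070 kylin_sales_cube learn_kylin
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed line would be run from the Kylin installation directory once it has been reviewed.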
+ +#### How to use + +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI <srcKylinConfigUri> <dstKylinConfigUri> <cubeName> <projectName> <copyAclOrNot> <purgeOrNot> <overwriteIfExists> <realExecute> <migrateSegmentOrNot> +{% endhighlight %} +For example: +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI kylin-qa:7070 kylin-prod:7070 kylin_sales_cube learn_kylin true false false true false +{% endhighlight %} +After the command has executed successfully, reload the Kylin metadata; the migrated cube will then appear in the target environment. + +All supported parameters are listed below: + If the data model of the cube you want to migrate does not exist in the target environment, this tool will also migrate the model. + If you set `overwriteIfExists` to `false` and the cube already exists in the target environment, the tool will stop. + If you set `migrateSegmentOrNot` to `true`, make sure the cube has `READY` segments; they will be migrated to the target environment together with the cube. + +| Parameter | Description | +| ------------------- | :----------------------------------------------------------------------------------------- | +| srcKylinConfigUri | The URL of the source environment's Kylin configuration. It can be `host:7070`, or an absolute file path to the `kylin.properties`. | +| dstKylinConfigUri | The URL of the target environment's Kylin configuration. | +| cubeName | The name of the cube to be migrated. | +| projectName | The target project in the target environment. If it doesn't exist, create it before running this command. | +| copyAclOrNot | `true` or `false`: whether to copy the cube ACL to the target environment. | +| purgeOrNot | `true` or `false`: whether to purge the cube from the source environment after it has been migrated to the target environment. | +| overwriteIfExists | `true` or `false`: whether to overwrite the cube if it already exists in the target environment.
| +| realExecute | `true` or `false`: if false, just print the operations to take (dry-run mode); if true, do the real migration. | +| migrateSegmentOrNot | (Optional) `true` or `false`: whether to copy segment info to the target environment. Default true. | + +## Migrate across two Hadoop clusters + +**Note**: Currently, migrating a cube across two Hadoop clusters is supported only through the 'CubeMigrationCrossClusterCLI.java' CLI. + +### Prerequisites for cube migration +1. The cube status must be **READY** before migration, meaning you have built the segments and confirmed the performance. +2. The target project name on the PROD env must be the same as the one on the QA env. + +### How to use the 'CubeMigrationCrossClusterCLI.java' CLI to migrate a cube + +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.migration.CubeMigrationCrossClusterCLI <kylinUriSrc> <kylinUriDst> <updateMappingPath> <cube> <hybrid> <project> <all> <dstHiveCheck> <overwrite> <schemaOnly> <execute> <coprocessorPath> <codeOfFSHAEnabled> <distCpJobQueue> <distCpJobMemory> <nThread> +{% endhighlight %} +For example: +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.migration.CubeMigrationCrossClusterCLI -kylinUriSrc ADMIN:ky...@qa.env:17070 -kylinUriDst ADMIN:ky...@prod.env:17777 -cube kylin_sales_cube -updateMappingPath $KYLIN_HOME/updateTableMapping.json -execute true -schemaOnly false -overwrite true +{% endhighlight %} +After the command has executed successfully, go to the Kylin PROD portal and find the migrated cube on the 'Model' page; the status of the cube is **READY**. + +All supported parameters are listed below: + +| Parameter | Description | +| ------------------- | :----------------------------------------------------------------------------------------- | +| kylinUriSrc | (Required) The source Kylin URI, in the format user:pwd@host:port. | +| kylinUriDst | (Required) The target Kylin URI, in the format user:pwd@host:port.
| +| updateMappingPath | (Optional) The path of the Hive table mapping update file, in JSON format. | +| cube | The cubes you want to migrate, separated by ','. | +| hybrid | The hybrids you want to migrate, separated by ','. | +| project | The projects you want to migrate, separated by ','. | +| all | Migrate all projects. **Note**: You must specify exactly one of the four parameters above: 'cube', 'hybrid', 'project' or 'all'. | +| dstHiveCheck | (Optional) Whether to check the target Hive tables; the default value is true. | +| overwrite | (Optional) Whether to overwrite existing cubes; the default value is false. | +| schemaOnly | (Optional) Whether to migrate only the cube-related schema; the default value is true. | +| execute | (Optional) Whether to actually execute the migration; the default value is false. | +| coprocessorPath | (Optional) The path of the coprocessor to be deployed; the default value is obtained from KylinConfigBase.getCoprocessorLocalJar(). | +| codeOfFSHAEnabled | (Optional) Whether to enable the NameNode HA of the clusters. | +| distCpJobQueue | (Optional) The mapreduce.job.queuename for the DistCp job. | +| distCpJobMemory | (Optional) The mapreduce.map.memory.mb for the DistCp job. | +| nThread | (Optional) The number of threads for migrating cube data in parallel. | diff --git a/website/_docs31/howto/howto_use_cli.cn.md b/website/_docs31/howto/howto_use_cli.cn.md index efd96b8..ebf8fda 100644 --- a/website/_docs31/howto/howto_use_cli.cn.md +++ b/website/_docs31/howto/howto_use_cli.cn.md @@ -99,39 +99,6 @@ CubeMetaIngester.java 将提取的 cube 注入到另一个 metadata store 中。 | project <project> | (Required) Specify the target project for the new cubes. | srcPath <srcPath> | (Required) Specify the path to the extracted Cube metadata zip file.
| -## CubeMigrationCLI.java - -### 作用 -CubeMigrationCLI.java 用于迁移 cubes。例如:将 cube 从测试环境迁移到生产环境。请注意,不同的环境是共享相同的 Hadoop 集群,包括 HDFS,HBase 和 HIVE。此 CLI 不支持跨 Hadoop 集群的数据迁移。 - -### 如何使用 -前八个参数必须有且次序不能改变。 -{% highlight Groff markup %} -./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI <srcKylinConfigUri> <dstKylinConfigUri> <cubeName> <projectName> <copyAclOrNot> <purgeOrNot> <overwriteIfExists> <realExecute> <migrateSegmentOrNot> -{% endhighlight %} -例如: -{% highlight Groff markup %} -./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI kylin-qa:7070 kylin-prod:7070 kylin_sales_cube learn_kylin true false false true false -{% endhighlight %} -命令执行成功后,请 reload metadata,您想要迁移的 cube 将会存在于迁移后的 project 中。 - -下面会列出所有支持的参数: - 如果您使用 `cubeName` 这个参数,但想要迁移的 cube 所对应的 model 在要迁移的环境中不存在,model 的数据也会迁移过去。 - 如果您将 `overwriteIfExists` 设置为 false,且该 cube 已存在于要迁移的环境中,当您运行命令,cube 存在的提示信息将会出现。 - 如果您将 `migrateSegmentOrNot` 设置为 true,请保证 Kylin metadata 的 HDFS 目录存在且 Cube 的状态为 READY。 - -| Parameter | Description | -| ------------------- | :----------------------------------------------------------------------------------------- | -| srcKylinConfigUri | The URL of the source environment's Kylin configuration. It can be `host:7070`, or an absolute file path to the `kylin.properties`. | -| dstKylinConfigUri | The URL of the target environment's Kylin configuration. | -| cubeName | the name of Cube to be migrated.(Make sure it exist) | -| projectName | The target project in the target environment.(Make sure it exist) | -| copyAclOrNot | `true` or `false`: whether copy Cube ACL to target environment. | -| purgeOrNot | `true` or `false`: whether purge the Cube from src server after the migration. | -| overwriteIfExists | `true` or `false`: overwrite cube if it already exists in the target environment. | -| realExecute | `true` or `false`: if false, just print the operations to take, if true, do the real migration. 
| -| migrateSegmentOrNot | (Optional) true or false: whether copy segment data to target environment. Default true. | - ## CubeMigrationCheckCLI.java ### 作用 diff --git a/website/_docs31/howto/howto_use_cli.md b/website/_docs31/howto/howto_use_cli.md index 01023cc..c7a09a9 100644 --- a/website/_docs31/howto/howto_use_cli.md +++ b/website/_docs31/howto/howto_use_cli.md @@ -103,42 +103,6 @@ All supported parameters are listed below: | project <project> | (Required) Specify the target project for the new cubes. | | srcPath <srcPath> | (Required) Specify the path to the extracted Cube metadata zip file. | -## CubeMigrationCLI.java - -### Function -CubeMigrationCLI.java can migrate a cube from a Kylin environment to another, for example, promote a well tested cube from the testing env to production env. Note that the different Kylin environments should share the same Hadoop cluster, including HDFS, HBase and HIVE. - -Please note, this tool will migrate the Kylin metadata, rename the Kylin HDFS folders and update HBase table's metadata. It doesn't migrate data across Hadoop clusters. - -### How to use - - -{% highlight Groff markup %} -./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI <srcKylinConfigUri> <dstKylinConfigUri> <cubeName> <projectName> <copyAclOrNot> <purgeOrNot> <overwriteIfExists> <realExecute> <migrateSegmentOrNot> -{% endhighlight %} -For example: -{% highlight Groff markup %} -./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI kylin-qa:7070 kylin-prod:7070 kylin_sales_cube learn_kylin true false false true false -{% endhighlight %} -After the command is successfully executed, please reload Kylin metadata, the cube you want to migrate will appear in the target environment. - -All supported parameters are listed below: - If the data model of the cube you want to migrate does not exist in the target environment, this tool will also migrate the model. 
- If you set `overwriteIfExists` to `false`, and the cube exists in the target environment, the tool will stop to proceed. - If you set `migrateSegmentOrNot` to `true`, please make sure the cube has `READY` segments, they will be migrated to target environment together. - -| Parameter | Description | -| ------------------- | :----------------------------------------------------------------------------------------- | -| srcKylinConfigUri | The URL of the source environment's Kylin configuration. It can be `host:7070`, or an absolute file path to the `kylin.properties`. | -| dstKylinConfigUri | The URL of the target environment's Kylin configuration. | -| cubeName | the name of cube to be migrated. | -| projectName | The target project in the target environment. If it doesn't exist, create it before run this command. | -| copyAclOrNot | `true` or `false`: whether copy the cube ACL to target environment. | -| purgeOrNot | `true` or `false`: whether to purge the cube from source environment after it be migrated to target environment. | -| overwriteIfExists | `true` or `false`: whether to overwrite if it already exists in the target environment. | -| realExecute | `true` or `false`: If false, just print the operations to take (dry-run mode); if true, do the real migration. | -| migrateSegmentOrNot | (Optional) `true` or `false`: whether copy segment info to the target environment. Default true. | - ## CubeMigrationCheckCLI.java ### Function diff --git a/website/_docs31/index.cn.md b/website/_docs31/index.cn.md index 44b1455..f287534 100644 --- a/website/_docs31/index.cn.md +++ b/website/_docs31/index.cn.md @@ -33,15 +33,16 @@ Apache Kylin™是一个开源的、分布式的分析型数据仓库,提供 H 2. [Web 界面](tutorial/web.html) 3. [Cube 创建](tutorial/create_cube.html) 4. [Cube 构建和 Job 监控](tutorial/cube_build_job.html) -5. [SQL 快速参考](tutorial/sql_reference.html) -6. [用 Kafka 流构建 Cube](tutorial/cube_streaming.html) -7. [用 Spark 构建 Cube](tutorial/cube_spark.html) -8. 
[优化 Cube 构建](tutorial/cube_build_performance.html) -9. [查询下压](tutorial/query_pushdown.html) -10. [建立 System Cube](tutorial/setup_systemcube.html) -11. [使用 Cube Planner](tutorial/use_cube_planner.html) -12. [使用 Dashboard](tutorial/use_dashboard.html) -13. [建立 JDBC 数据源](tutorial/setup_jdbc_datasource.html) +5. [Cube 迁移](tutorial/cube_migration.html) +6. [SQL 快速参考](tutorial/sql_reference.html) +7. [用 Kafka 流构建 Cube](tutorial/cube_streaming.html) +8. [用 Spark 构建 Cube](tutorial/cube_spark.html) +9. [优化 Cube 构建](tutorial/cube_build_performance.html) +10. [查询下压](tutorial/query_pushdown.html) +11. [建立 System Cube](tutorial/setup_systemcube.html) +12. [使用 Cube Planner](tutorial/use_cube_planner.html) +13. [使用 Dashboard](tutorial/use_dashboard.html) +14. [建立 JDBC 数据源](tutorial/setup_jdbc_datasource.html) 工具集成 diff --git a/website/_docs31/index.md b/website/_docs31/index.md index e2fb10d..90b9cf0 100644 --- a/website/_docs31/index.md +++ b/website/_docs31/index.md @@ -33,15 +33,16 @@ Tutorial 2. [Web Interface](tutorial/web.html) 3. [Cube Wizard](tutorial/create_cube.html) 4. [Cube Build and Job Monitoring](tutorial/cube_build_job.html) -5. [SQL reference](tutorial/sql_reference.html) -6. [Build Cube with Streaming Data](tutorial/cube_streaming.html) -7. [Build Cube with Spark Engine](tutorial/cube_spark.html) -8. [Cube Build Tuning](tutorial/cube_build_performance.html) -9. [Enable Query Pushdown](tutorial/query_pushdown.html) -10. [Setup System Cube](tutorial/setup_systemcube.html) -11. [Optimize with Cube Planner](tutorial/use_cube_planner.html) -12. [Use System Dashboard](tutorial/use_dashboard.html) -13. [Setup JDBC Data Source](tutorial/setup_jdbc_datasource.html) +5. [Cube Migration](tutorial/cube_migration.html) +6. [SQL reference](tutorial/sql_reference.html) +7. [Build Cube with Streaming Data](tutorial/cube_streaming.html) +8. [Build Cube with Spark Engine](tutorial/cube_spark.html) +9. [Cube Build Tuning](tutorial/cube_build_performance.html) +10. 
[Enable Query Pushdown](tutorial/query_pushdown.html) +11. [Setup System Cube](tutorial/setup_systemcube.html) +12. [Optimize with Cube Planner](tutorial/use_cube_planner.html) +13. [Use System Dashboard](tutorial/use_dashboard.html) +14. [Setup JDBC Data Source](tutorial/setup_jdbc_datasource.html) Connectivity and APIs diff --git a/website/_docs31/install/kylin_docker.cn.md b/website/_docs31/install/kylin_docker.cn.md index c409a2f..64d1884 100644 --- a/website/_docs31/install/kylin_docker.cn.md +++ b/website/_docs31/install/kylin_docker.cn.md @@ -11,19 +11,17 @@ since: v3.0.0 - JDK 1.8 - Hadoop 2.7.0 - Hive 1.2.1 -- HBase 1.1.2 +- HBase 1.1.2 (with Zookeeper) - Spark 2.3.1 -- Zookeeper 3.4.6 - Kafka 1.1.1 - MySQL 5.1.73 -- Maven 3.6.1 ## 快速试用 Kylin 我们已将面向用户的 Kylin 镜像上传至 docker 仓库,用户无需在本地构建镜像,直接执行以下命令从 docker 仓库 pull 镜像: {% highlight Groff markup %} -docker pull apachekylin/apache-kylin-standalone:3.0.1 +docker pull apachekylin/apache-kylin-standalone:3.1.0 {% endhighlight %} pull 成功后,执行以下命令启动容器: @@ -37,7 +35,7 @@ docker run -d \ -p 8032:8032 \ -p 8042:8042 \ -p 16010:16010 \ -apachekylin/apache-kylin-standalone:3.0.1 +apachekylin/apache-kylin-standalone:3.1.0 {% endhighlight %} 在容器启动时,会自动启动以下服务: diff --git a/website/_docs31/install/kylin_docker.md b/website/_docs31/install/kylin_docker.md index 9847303..637cd72 100644 --- a/website/_docs31/install/kylin_docker.md +++ b/website/_docs31/install/kylin_docker.md @@ -11,19 +11,17 @@ In order to allow users to easily try Kylin, and to facilitate developers to ver - Jdk 1.8 - Hadoop 2.7.0 - Hive 1.2.1 -- Hbase 1.1.2 +- Hbase 1.1.2 (with Zookeeper) - Spark 2.3.1 -- Zookeeper 3.4.6 - Kafka 1.1.1 - MySQL 5.1.73 -- Maven 3.6.1 ## Quickly try Kylin We have pushed the Kylin image for the user to the docker hub. 
Users do not need to build the image locally, just execute the following command to pull the image from the docker hub: {% highlight Groff markup %} -docker pull apachekylin/apache-kylin-standalone:3.0.1 +docker pull apachekylin/apache-kylin-standalone:3.1.0 {% endhighlight %} After the pull is successful, execute the following command to start the container: @@ -37,7 +35,7 @@ docker run -d \ -p 8032:8032 \ -p 8042:8042 \ -p 16010:16010 \ -apachekylin/apache-kylin-standalone:3.0.1 +apachekylin/apache-kylin-standalone:3.1.0 {% endhighlight %} The following services are automatically started when the container starts: diff --git a/website/_docs31/tutorial/cube_migration.cn.md b/website/_docs31/tutorial/cube_migration.cn.md new file mode 100644 index 0000000..72d341e --- /dev/null +++ b/website/_docs31/tutorial/cube_migration.cn.md @@ -0,0 +1,7 @@ +--- +layout: docs31-cn +title: "Cube 迁移" +categories: 教程 +permalink: /cn/docs31/tutorial/cube_migration.html +since: v3.1.0 +--- diff --git a/website/_docs31/tutorial/cube_migration.md b/website/_docs31/tutorial/cube_migration.md new file mode 100644 index 0000000..b1b6d79 --- /dev/null +++ b/website/_docs31/tutorial/cube_migration.md @@ -0,0 +1,153 @@ +--- +layout: docs31 +title: Cube Migration +categories: tutorial +permalink: /docs31/tutorial/cube_migration.html +since: v3.1.0 +--- + +## Migrate on the same Hadoop cluster + +### Prerequisites for cube migration + +1. Only a cube admin can migrate cubes, because the 'Migrate' button is visible **ONLY** to cube admins. +2. The cube status must be **READY** before migration, meaning you have built the segments and confirmed the performance. +3. The property '**kylin.cube.migration.enabled**' must be true. +4. The target project must exist on the Kylin PROD env before migration. +5. The QA env and the PROD env must share the same Hadoop cluster, including HDFS, HBase and Hive.
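Prerequisite 3 above can be checked from the command line before opening the portal. The following is a minimal sketch, assuming the property lives in a plain `key=value` properties file; the function name `is_migration_enabled` and the path used in the usage note are illustrative assumptions, not something this commit defines.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kylin): succeeds (exit 0) only if the
# given properties file sets kylin.cube.migration.enabled to true.
# Commented-out lines are ignored; if the key appears more than once,
# the last assignment is the one that is checked.
is_migration_enabled() {
    props_file="$1"
    value=$(grep -E '^[[:space:]]*kylin\.cube\.migration\.enabled[[:space:]]*=' "$props_file" \
        | tail -n 1 \
        | cut -d '=' -f 2- \
        | tr -d '[:space:]')
    [ "$value" = "true" ]
}
```

For example, `is_migration_enabled "$KYLIN_HOME/conf/kylin.properties" && echo "migration enabled"` prints the message only when the property is set to true in that file. Whether your deployment keeps the setting in that file or in an override file is an assumption to verify locally.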
+ +### Steps to migrate a cube through the Kylin portal +First of all, make sure that you have permission on the cube you want to migrate. + +#### Step 1 +On the 'Model' page, click the 'Action' drop-down button in the 'Actions' column and select the 'Migrate' operation: + +  + +#### Step 2 +After you click the 'Migrate' button, a pop-up window appears: + +  + +#### Step 3 +Check that the target project name is what you want. By default, the target project name is the same as the project name on the QA env. If the project name on the PROD env is different, replace it with the correct one. + +#### Step 4 +Click the 'Validate' button to verify the cube's validity. It may take a couple of minutes to validate the cube on the backend and show the results in a pop-up window: + + **Common exceptions and suggested solutions** + + - `The target project XXX does not exist on PROD-KYLIN-INSTANCE:7070`: enter the correct project name or create the expected project on the PROD env. + + - `Cube email notification list is not set or empty`: add a notification email to the Notification List of the cube. + + **Some advisory messages** + + - `Auto merge time range for cube XXXX is not set`: this message is shown if 'Auto Merge Threshold' is not set. You can ignore it or set 'Auto Merge Threshold' in the 'Refresh Setting' section of the cube. + - `ExpansionRateRule: failed on expansion rate check with exceeding 5`: the expansion rate of the cube you want to migrate exceeds 5, which is the value of the property 'kylin.cube.migration.expansion-rate'. You can set a proper value for the cube. + - `Failed on query latency check with average cost 5617 exceeding 2000ms`: if the property 'kylin.cube.migration.rule-query-latency-enabled' is set to true, Kylin generates some test SQL to measure the average query latency of the cube you want to migrate. You can set a proper value for the property 'kylin.cube.migration.query-latency-seconds'.
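The two threshold properties named in the messages above can be inspected before you click 'Validate'. This is a minimal sketch, assuming a plain `key=value` properties file; `kylin_prop` is a hypothetical helper name, and the fallback values 5 and 2 in the usage note are only inferred from the sample messages (`exceeding 5`, `exceeding 2000ms`), not confirmed defaults.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kylin): print the value of a property
# from a properties file, or the supplied fallback when the property is
# not set. Commented-out lines are ignored; the last assignment wins.
kylin_prop() {
    props_file="$1"
    key="$2"
    fallback="$3"
    # Escape dots so they match literally instead of "any character".
    escaped=$(printf '%s' "$key" | sed 's/\./\\./g')
    value=$(sed -n "s/^[[:space:]]*${escaped}[[:space:]]*=[[:space:]]*//p" "$props_file" \
        | sed 's/[[:space:]]*$//' \
        | tail -n 1)
    printf '%s\n' "${value:-$fallback}"
}
```

For example, `kylin_prop "$KYLIN_HOME/conf/kylin.properties" kylin.cube.migration.expansion-rate 5` and `kylin_prop "$KYLIN_HOME/conf/kylin.properties" kylin.cube.migration.query-latency-seconds 2` print the configured thresholds, falling back to the assumed values when the keys are absent. The file path is an assumption about a typical installation.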
+ +#### Step 5 + +If the validation passes, click the 'Submit' button to send the migration request email to the cubes administrator. If the email is sent successfully, a message like this is shown: + +  + +#### Step 6 +The cubes administrator receives the migration request email and can click the 'Action' drop-down button in the 'Actions' column, then select 'Approve Migration' to migrate the cube or 'Reject Migration' to reject the request. A notification email is also sent to the migration requester: + +  + +#### Step 7 +If the cubes administrator selects 'Approve Migration', a pop-up window appears: + +  + +After the administrator enters the target project name and clicks the 'Approve' button, the cube migration starts. + +#### Step 8 +If the migration succeeds, the message below is shown: + +  + +#### Step 9 +Finally, go to the Kylin portal on the PROD env and refresh the 'Model' page. You will see the cube you migrated from the QA env, with the status **DISABLED**. + +### Use 'CubeMigrationCLI.java' CLI to migrate cube + +#### Function +CubeMigrationCLI.java can migrate a cube from one Kylin environment to another, for example, to promote a well-tested cube from the testing env to the production env. Note that the two Kylin environments must share the same Hadoop cluster, including HDFS, HBase and Hive. + +Please note that this tool migrates the Kylin metadata, renames the Kylin HDFS folders and updates the HBase tables' metadata. It doesn't migrate data across Hadoop clusters.
+ +#### How to use + +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI <srcKylinConfigUri> <dstKylinConfigUri> <cubeName> <projectName> <copyAclOrNot> <purgeOrNot> <overwriteIfExists> <realExecute> <migrateSegmentOrNot> +{% endhighlight %} +For example: +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.CubeMigrationCLI kylin-qa:7070 kylin-prod:7070 kylin_sales_cube learn_kylin true false false true false +{% endhighlight %} +After the command executes successfully, reload the Kylin metadata; the migrated cube will then appear in the target environment. + +All supported parameters are listed below: + If the data model of the cube you want to migrate does not exist in the target environment, this tool will also migrate the model. + If you set `overwriteIfExists` to `false` and the cube already exists in the target environment, the tool will stop. + If you set `migrateSegmentOrNot` to `true`, make sure the cube has `READY` segments; they will be migrated to the target environment together with the cube. + +| Parameter | Description | +| ------------------- | :----------------------------------------------------------------------------------------- | +| srcKylinConfigUri | The URL of the source environment's Kylin configuration. It can be `host:7070`, or an absolute file path to the `kylin.properties`. | +| dstKylinConfigUri | The URL of the target environment's Kylin configuration. | +| cubeName | The name of the cube to be migrated. | +| projectName | The target project in the target environment. If it doesn't exist, create it before running this command. | +| copyAclOrNot | `true` or `false`: whether to copy the cube ACL to the target environment. | +| purgeOrNot | `true` or `false`: whether to purge the cube from the source environment after it has been migrated to the target environment. | +| overwriteIfExists | `true` or `false`: whether to overwrite the cube if it already exists in the target environment. 
| +| realExecute | `true` or `false`: if `false`, just print the operations that would be performed (dry-run mode); if `true`, do the real migration. | +| migrateSegmentOrNot | (Optional) `true` or `false`: whether to copy segment info to the target environment. The default is `true`. | + +## Migrate across two Hadoop clusters + +**Note**: Currently, migrating a cube across two Hadoop clusters is supported only through the 'CubeMigrationCrossClusterCLI.java' CLI. + +### Prerequisites for cube migration +1. The cube status must be **READY** before migration, meaning you have built the segments and confirmed the performance. +2. The target project name on the PROD env must be the same as the one on the QA env. + +### How to use 'CubeMigrationCrossClusterCLI.java' CLI to migrate cube + +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.migration.CubeMigrationCrossClusterCLI <kylinUriSrc> <kylinUriDst> <updateMappingPath> <cube> <hybrid> <project> <all> <dstHiveCheck> <overwrite> <schemaOnly> <execute> <coprocessorPath> <codeOfFSHAEnabled> <distCpJobQueue> <distCpJobMemory> <nThread> +{% endhighlight %} +For example: +{% highlight Groff markup %} +./bin/kylin.sh org.apache.kylin.tool.migration.CubeMigrationCrossClusterCLI -kylinUriSrc ADMIN:ky...@qa.env:17070 -kylinUriDst ADMIN:ky...@prod.env:17777 -cube kylin_sales_cube -updateMappingPath $KYLIN_HOME/updateTableMapping.json -execute true -schemaOnly false -overwrite true +{% endhighlight %} +After the command executes successfully, go to the Kylin PROD portal and find the migrated cube on the 'Model' page; its status is **READY**. + +All supported parameters are listed below: + +| Parameter | Description | +| ------------------- | :----------------------------------------------------------------------------------------- | +| kylinUriSrc | (Required) The source Kylin URI, in the format user:pwd@host:port. | +| kylinUriDst | (Required) The target Kylin URI, in the format user:pwd@host:port. 
| +| updateMappingPath | (Optional) The path of the file that defines the Hive table mapping updates, in JSON format. | +| cube | The cubes to migrate, separated by ','. | +| hybrid | The hybrids to migrate, separated by ','. | +| project | The projects to migrate, separated by ','. | +| all | Migrate all projects. **Note**: You must specify exactly one of these four parameters: 'cube', 'hybrid', 'project' or 'all'. | +| dstHiveCheck | (Optional) Whether to check the target Hive tables; the default value is true. | +| overwrite | (Optional) Whether to overwrite existing cubes; the default value is false. | +| schemaOnly | (Optional) Whether to migrate only the cube-related schema; the default value is true. | +| execute | (Optional) Whether to actually execute the migration; the default value is false. | +| coprocessorPath | (Optional) The path of the coprocessor to be deployed; the default value is obtained from KylinConfigBase.getCoprocessorLocalJar(). | +| codeOfFSHAEnabled | (Optional) Whether NameNode HA is enabled on the clusters. | +| distCpJobQueue | (Optional) The mapreduce.job.queuename for the DistCp job. | +| distCpJobMemory | (Optional) The mapreduce.map.memory.mb for the DistCp job. | +| nThread | (Optional) The number of threads for migrating cube data in parallel. 
| + + + + diff --git a/website/images/tutorial/3.1/Kylin-Cube-Migration/1_request_migration.png b/website/images/tutorial/3.1/Kylin-Cube-Migration/1_request_migration.png new file mode 100644 index 0000000..b6f1945 Binary files /dev/null and b/website/images/tutorial/3.1/Kylin-Cube-Migration/1_request_migration.png differ diff --git a/website/images/tutorial/3.1/Kylin-Cube-Migration/2_input_target_project.png b/website/images/tutorial/3.1/Kylin-Cube-Migration/2_input_target_project.png new file mode 100644 index 0000000..f910f26 Binary files /dev/null and b/website/images/tutorial/3.1/Kylin-Cube-Migration/2_input_target_project.png differ diff --git a/website/images/tutorial/3.1/Kylin-Cube-Migration/3_cube_migration_request_succ.png b/website/images/tutorial/3.1/Kylin-Cube-Migration/3_cube_migration_request_succ.png new file mode 100644 index 0000000..bbaf620 Binary files /dev/null and b/website/images/tutorial/3.1/Kylin-Cube-Migration/3_cube_migration_request_succ.png differ diff --git a/website/images/tutorial/3.1/Kylin-Cube-Migration/4_approve_reject.png b/website/images/tutorial/3.1/Kylin-Cube-Migration/4_approve_reject.png new file mode 100644 index 0000000..97e63ee Binary files /dev/null and b/website/images/tutorial/3.1/Kylin-Cube-Migration/4_approve_reject.png differ diff --git a/website/images/tutorial/3.1/Kylin-Cube-Migration/5_approve_migration.png b/website/images/tutorial/3.1/Kylin-Cube-Migration/5_approve_migration.png new file mode 100644 index 0000000..99fbebc Binary files /dev/null and b/website/images/tutorial/3.1/Kylin-Cube-Migration/5_approve_migration.png differ diff --git a/website/images/tutorial/3.1/Kylin-Cube-Migration/6_migration_successfully.png b/website/images/tutorial/3.1/Kylin-Cube-Migration/6_migration_successfully.png new file mode 100644 index 0000000..9a1332b Binary files /dev/null and b/website/images/tutorial/3.1/Kylin-Cube-Migration/6_migration_successfully.png differ