This is an automated email from the ASF dual-hosted git repository.

xxyu pushed a commit to branch document
in repository https://gitbox.apache.org/repos/asf/kylin.git

The following commit(s) were added to refs/heads/document by this push:
     new 5df6477  Update howto_use_cli document for docs40
5df6477 is described below

commit 5df6477a57bc20abb16e36564f8dd61b3a902572
Author: yaqian.zhang <598593...@qq.com>
AuthorDate: Mon Aug 9 11:25:50 2021 +0800

    Update howto_use_cli document for docs40
---
 website/_data/docs40-cn.yml               |  4 ++--
 website/_data/docs40.yml                  |  4 ++--
 website/_dev40/dev_env.cn.md              | 16 ++++++++--------
 website/_dev40/dev_env.md                 | 16 ++++++++--------
 website/_docs/install/index.cn.md         |  8 ++++----
 website/_docs/install/index.md            |  6 +++---
 website/_docs40/howto/howto_use_cli.cn.md | 11 +++++++++++
 website/_docs40/howto/howto_use_cli.md    | 11 +++++++++++
 website/_docs40/index.cn.md               |  3 +++
 website/_docs40/index.md                  |  3 +++
 website/_docs40/install/index.cn.md       | 14 +++++++-------
 website/_docs40/install/index.md          | 14 +++++++-------
 12 files changed, 69 insertions(+), 41 deletions(-)

diff --git a/website/_data/docs40-cn.yml b/website/_data/docs40-cn.yml
index 3bbee1a..f0de5b3 100644
--- a/website/_data/docs40-cn.yml
+++ b/website/_data/docs40-cn.yml
@@ -21,11 +21,11 @@
 - title: 安装
   docs:
   - install/index
+  - install/deploy_without_hadoop
   - install/kylin_cluster
+  - install/kylin_docker
   - install/configuration
   - install/advance_settings
-  - install/kylin_docker
-  - install/deploy_without_hadoop
 
 - title: 教程
   docs:
diff --git a/website/_data/docs40.yml b/website/_data/docs40.yml
index c94d3cf..2c07665 100644
--- a/website/_data/docs40.yml
+++ b/website/_data/docs40.yml
@@ -29,11 +29,11 @@
 - title: Installation
   docs:
   - install/index
+  - install/deploy_without_hadoop
   - install/kylin_cluster
+  - install/kylin_docker
   - install/configuration
   - install/advance_settings
-  - install/kylin_docker
-  - install/deploy_without_hadoop
 
 - title: Tutorial
   docs:
diff --git a/website/_dev40/dev_env.cn.md b/website/_dev40/dev_env.cn.md
index 8c031e0..cf808a7 100644
--- a/website/_dev40/dev_env.cn.md
+++ b/website/_dev40/dev_env.cn.md
@@ -13,24 +13,24 @@ permalink: /cn/development40/dev_env.html
 
 ### 安装 Maven
 
-最新的 Maven 下载地址:<http://maven.apache.org/download.cgi>,然后创建一个软链接,以便 `mvn` 可以在任何地方运行。
+下载 Maven 3.5.4 及以上版本:<http://maven.apache.org/download.cgi>,然后创建一个软链接,以便 `mvn` 可以在任何地方运行。
 
 {% highlight Groff markup %}
 cd ~
-wget http://xenia.sote.hu/ftp/mirrors/www.apache.org/maven/maven-3/3.2.5/binaries/apache-maven-3.2.5-bin.tar.gz
-tar -xzvf apache-maven-3.2.5-bin.tar.gz
-ln -s /root/apache-maven-3.2.5/bin/mvn /usr/bin/mvn
+wget http://xenia.sote.hu/ftp/mirrors/www.apache.org/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
+tar -xzvf apache-maven-3.5.4-bin.tar.gz
+ln -s /root/apache-maven-3.5.4/bin/mvn /usr/bin/mvn
 {% endhighlight %}
 
 ### 安装 Spark
 
-在像 /usr/local/spark 这样的本地文件夹下手动安装 Spark;Kylin4 支持 Spark2.4.6,你需要从 Spark 下载页面获取下载链接。
+在像 /usr/local/spark 这样的本地文件夹下手动安装 Spark;Kylin4 支持 Spark 2.4.7,你需要从 Spark 下载页面获取下载链接。
 
 {% highlight Groff markup %}
-wget -O /tmp/spark-2.4.6-bin-hadoop2.7.tgz https://archive.apache.org/dist/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz
+wget -O /tmp/spark-2.4.7-bin-hadoop2.7.tgz https://archive.apache.org/dist/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.7.tgz
 cd /usr/local
-tar -zxvf /tmp/spark-2.4.6-bin-hadoop2.7.tgz
-ln -s spark-2.4.6-bin-hadoop2.7 spark
+tar -zxvf /tmp/spark-2.4.7-bin-hadoop2.7.tgz
+ln -s spark-2.4.7-bin-hadoop2.7 spark
 {% endhighlight %}
 
 ### 编译
diff --git a/website/_dev40/dev_env.md b/website/_dev40/dev_env.md
index 8a88f2d..4240dc1 100644
--- a/website/_dev40/dev_env.md
+++ b/website/_dev40/dev_env.md
@@ -14,24 +14,24 @@ Following this tutorial, you can easily build a kylin4 development environment o
 
 ### Install Maven
 
-The latest maven can be found at <http://maven.apache.org/download.cgi>, we create a symbolic so that `mvn` can be run anywhere.
+Download Maven 3.5.4 and above version: <http://maven.apache.org/download.cgi>, we create a symbolic so that `mvn` can be run anywhere.
 
 {% highlight Groff markup %}
 cd ~
-wget http://xenia.sote.hu/ftp/mirrors/www.apache.org/maven/maven-3/3.2.5/binaries/apache-maven-3.2.5-bin.tar.gz
-tar -xzvf apache-maven-3.2.5-bin.tar.gz
-ln -s /root/apache-maven-3.2.5/bin/mvn /usr/bin/mvn
+wget http://xenia.sote.hu/ftp/mirrors/www.apache.org/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
+tar -xzvf apache-maven-3.5.4-bin.tar.gz
+ln -s /root/apache-maven-3.5.4/bin/mvn /usr/bin/mvn
 {% endhighlight %}
 
 ### Install Spark
 
-Manually install the Spark binary in a local folder like /usr/local/spark. Kylin4 supports spark 2.4.6, you need to get the download link from the spark download page.
+Manually install the Spark binary in a local folder like /usr/local/spark. Kylin4 supports Spark 2.4.7, you need to get the download link from the spark download page.
 
 {% highlight Groff markup %}
-wget -O /tmp/spark-2.4.6-bin-hadoop2.7.tgz https://archive.apache.org/dist/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz
+wget -O /tmp/spark-2.4.7-bin-hadoop2.7.tgz https://archive.apache.org/dist/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.7.tgz
 cd /usr/local
-tar -zxvf /tmp/spark-2.4.6-bin-hadoop2.7.tgz
-ln -s spark-2.4.6-bin-hadoop2.7 spark
+tar -zxvf /tmp/spark-2.4.7-bin-hadoop2.7.tgz
+ln -s spark-2.4.7-bin-hadoop2.7 spark
 {% endhighlight %}
 
 ### Compile
diff --git a/website/_docs/install/index.cn.md b/website/_docs/install/index.cn.md
index 2843535..0117d5a 100644
--- a/website/_docs/install/index.cn.md
+++ b/website/_docs/install/index.cn.md
@@ -36,14 +36,14 @@ Kylin 可以在 Hadoop 集群的任意节点上启动。方便起见,您可以
 
 ### Kylin 安装
 
-1. 从 [Apache Kylin下载网站](https://kylin.apache.org/download/) 下载一个适用于您 Hadoop 版本的二进制文件。例如,适用于 HBase 1.x 的 Kylin 2.5.0 可通过如下命令行下载得到:
+- 从 [Apache Kylin下载网站](https://kylin.apache.org/download/) 下载一个适用于您 Hadoop 版本的二进制文件。例如,适用于 HBase 1.x 的 Kylin 2.5.0 可通过如下命令行下载得到:
 
 ```shell
 cd /usr/local/
 wget http://mirror.bit.edu.cn/apache/kylin/apache-kylin-2.5.0/apache-kylin-2.5.0-bin-hbase1x.tar.gz
 ```
-
-2. 解压 tar 包,配置环境变量 `$KYLIN_HOME` 指向 Kylin 文件夹。
+
+- 解压 tar 包,配置环境变量 `$KYLIN_HOME` 指向 Kylin 文件夹。
 
 ```shell
 tar -zxvf apache-kylin-2.5.0-bin-hbase1x.tar.gz
@@ -51,7 +51,7 @@ cd apache-kylin-2.5.0-bin-hbase1x
 export KYLIN_HOME=`pwd`
 ```
 
-3. 从 v2.6.1 开始, Kylin 不再包含 Spark 二进制包; 您需要另外下载 Spark,然后设置 `SPARK_HOME` 系统变量到 Spark 安装目录:
+- 从 v2.6.1 开始, Kylin 不再包含 Spark 二进制包; 您需要另外下载 Spark,然后设置 `SPARK_HOME` 系统变量到 Spark 安装目录:
 
 ```shell
 export SPARK_HOME=/path/to/spark
diff --git a/website/_docs/install/index.md b/website/_docs/install/index.md
index 4db5c98..a11bf2f 100644
--- a/website/_docs/install/index.md
+++ b/website/_docs/install/index.md
@@ -38,14 +38,14 @@ Linux accounts running Kylin must have access to the Hadoop cluster, including t
 
 ### Kylin Installation
 
-1. Download a binary package for your Hadoop version from the [Apache Kylin Download Site](https://kylin.apache.org/download/). For example, Kylin 2.5.0 for HBase 1.x can be downloaded from the following command line:
+- Download a binary package for your Hadoop version from the [Apache Kylin Download Site](https://kylin.apache.org/download/). For example, Kylin 2.5.0 for HBase 1.x can be downloaded from the following command line:
 
 ```shell
 cd /usr/local/
 wget http://mirror.bit.edu.cn/apache/kylin/apache-kylin-2.5.0/apache-kylin-2.5.0-bin-hbase1x.tar.gz
 ```
 
-2. Unzip the tarball and configure the environment variable `$KYLIN_HOME` to the Kylin folder.
+- Unzip the tarball and configure the environment variable `$KYLIN_HOME` to the Kylin folder.
 
 ```shell
 tar -zxvf apache-kylin-2.5.0-bin-hbase1x.tar.gz
@@ -53,7 +53,7 @@ cd apache-kylin-2.5.0-bin-hbase1x
 export KYLIN_HOME=`pwd`
 ```
 
-From v2.6.1, Kylin will not ship Spark binary anymore; You need to install Spark seperately, and then point `SPARK_HOME` system environment variable to it:
+- From v2.6.1, Kylin will not ship Spark binary anymore; You need to install Spark seperately, and then point `SPARK_HOME` system environment variable to it:
 
 ```shell
 export SPARK_HOME=/path/to/spark
diff --git a/website/_docs40/howto/howto_use_cli.cn.md b/website/_docs40/howto/howto_use_cli.cn.md
index 2c06c92..a72a014 100644
--- a/website/_docs40/howto/howto_use_cli.cn.md
+++ b/website/_docs40/howto/howto_use_cli.cn.md
@@ -98,3 +98,14 @@ CubeMetaIngester.java 将提取的 cube 注入到另一个 metadata store 中。
 | overwriteTables <overwriteTables> | If table meta conflicts, overwrite the one in metadata store with the one in srcPath. Use in caution because it might break existing cubes! Suggest to backup metadata store first. Default false. |
 | project <project> | (Required) Specify the target project for the new cubes.
 | srcPath <srcPath> | (Required) Specify the path to the extracted Cube metadata zip file. |
+
+## CubeMigrationCLI.java
+
+### 作用
+自 Apache kylin 2.0 以来提供了迁移工具来支持跨不同集群迁移元数据。在 kylin4.0 中,我们对 CubeMigration 工具进行了改进并添加了新功能,增强功能列表如下所示:
+-支持迁移源集群中的所有多维数据集
+-支持在源集群中迁移整个项目
+-支持将元数据从旧版本迁移和升级到Kylin 4
+
+### 如何使用
+请参考文档:[How to migrate metadata to Kylin4](https://cwiki.apache.org/confluence/display/KYLIN/How+to+migrate+metadata+to+Kylin+4)
\ No newline at end of file
diff --git a/website/_docs40/howto/howto_use_cli.md b/website/_docs40/howto/howto_use_cli.md
index c26b2d1..f8f60ab 100644
--- a/website/_docs40/howto/howto_use_cli.md
+++ b/website/_docs40/howto/howto_use_cli.md
@@ -102,3 +102,14 @@ All supported parameters are listed below:
 | overwriteTables <overwriteTables> | If table meta conflicts, overwrite the one in metadata store with the one in srcPath. Use in caution because it might break existing cubes! Suggest to backup metadata store first. Default false. |
 | project <project> | (Required) Specify the target project for the new cubes. |
 | srcPath <srcPath> | (Required) Specify the path to the extracted Cube metadata zip file. |
+
+## CubeMigrationCLI.java
+
+## Function
+Apache Kylin have provided migration tool to support migrating metadata across different clusters since version 2.0. Recently, we have refined and added new ability to CubeMigration tool, The list of enhanced functions is showed as below:
+- Support migrating all cubes in source cluster
+- Support migrating a whole project in source cluster
+- Support migrating and upgrading metadata from older version to Kylin 4
+
+### How to use
+Please check: [How to migrate metadata to Kylin4](https://cwiki.apache.org/confluence/display/KYLIN/How+to+migrate+metadata+to+Kylin+4)
\ No newline at end of file
diff --git a/website/_docs40/index.cn.md b/website/_docs40/index.cn.md
index 5bd4b4c..3240b83 100644
--- a/website/_docs40/index.cn.md
+++ b/website/_docs40/index.cn.md
@@ -128,6 +128,9 @@ Kylin4 的查询引擎 `Sparder(SparderContext)` 是由 spark application 后端
 从查询结果对比中可以看出,对于***简单查询***,kylin3 与 Kylin4 不相上下,kylin4 略有不足;而对于***复杂查询***,kylin4 则体现出了明显的优势,查询速度比 kylin3 快很多。
 并且,Kylin4 中的***简单查询***的性能还存在很大的优化空间。在有赞使用 Kylin4 的实践中,对于***简单查询***的性能可以优化到 1 秒以内。
 
+## 如何升级
+请参考文档:[How to migrate metadata to Kylin4](https://cwiki.apache.org/confluence/display/KYLIN/How+to+migrate+metadata+to+Kylin+4)
+
 ## Kylin 4.0 查询和构建调优
 对于 Kylin4 的调优,请参考:[How to improve cube building and query performance](/docs40/howto/howto_optimize_build_and_query.html)
 
diff --git a/website/_docs40/index.md b/website/_docs40/index.md
index ccaf410..a6dd45a 100644
--- a/website/_docs40/index.md
+++ b/website/_docs40/index.md
@@ -132,6 +132,9 @@ The test results can reflect the following two points:
 From the comparison of query results, it can be seen that kylin3 and kylin4 are the same for ***simple query***, kylin4 is slightly insufficient; However, kylin4 has obvious advantages over kylin3 for ***complex query***.
 Moreover, there is still a lot of room to optimize the performance of ***simple query*** in kylin4. In the practice of Youzan using kylin4, the performance of ***simple query*** can be optimized to less than 1 second.
 
+## How to upgrade
+Please check: [How to migrate metadata to Kylin4](https://cwiki.apache.org/confluence/display/KYLIN/How+to+migrate+metadata+to+Kylin+4)
+
 ## Kylin 4.0 query and build tuning
 For kylin4 tuning, please refer to: [How to improve cube building and query performance](/docs40/howto/howto_optimize_build_and_query.html)
 
diff --git a/website/_docs40/install/index.cn.md b/website/_docs40/install/index.cn.md
index ca608da..e3c6d2b 100644
--- a/website/_docs40/install/index.cn.md
+++ b/website/_docs40/install/index.cn.md
@@ -9,7 +9,7 @@ permalink: /cn/docs40/install/index.html
 
 * Hadoop: cdh5.x, cdh6.x, hdp2.x, EMR5.x, EMR6.x, HDI4.x
 * Hive: 0.13 - 1.2.1+
-* Spark: 2.4.6
+* Spark: 2.4.7
 * Mysql: 5.1.17及以上
 * JDK: 1.8+
 * OS: Linux only, CentOS 6.5+ or Ubuntu 16.0.4+
@@ -33,14 +33,14 @@ Kylin 可以在 Hadoop 集群的任意节点上启动。方便起见,您可以
 
 ### Kylin 安装
 
-1. 从 [Apache Kylin下载网站](https://kylin.apache.org/download/) 下载一个 Apache Kylin 4.0 的二进制文件。可通过如下命令行下载得到:
+- 从 [Apache Kylin下载网站](https://kylin.apache.org/download/) 下载一个 Apache Kylin 4.0 的二进制文件。可通过如下命令行下载得到:
 
 ```shell
 cd /usr/local/
 wget http://mirror.bit.edu.cn/apache/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin.tar.gz
 ```
 
-2. 解压 tar 包,配置环境变量 `$KYLIN_HOME` 指向 Kylin 文件夹。
+- 解压 tar 包,配置环境变量 `$KYLIN_HOME` 指向 Kylin 文件夹。
 
 ```shell
 tar -zxvf apache-kylin-4.0.0-bin.tar.gz
@@ -48,15 +48,15 @@ cd apache-kylin-4.0.0-bin
 export KYLIN_HOME=`pwd`
 ```
 
-使用脚本下载spark:
+- 使用脚本下载spark:
 
 ```shell
 $KYLIN_HOME/bin/download-spark.sh
 ```
 
-或者配置 SPARK_HOME 指向环境中的 spark2.4.6 的路径。
+或者配置 SPARK_HOME 指向环境中的 spark2.4.7 的路径。
 
-3. 配置 Mysql 元数据
+- 配置 Mysql 元数据
 
 Kylin 4.0 使用 Mysql 作为元数据存储,需要在 kylin.properties 中做如下配置:
 
@@ -65,7 +65,7 @@ kylin.metadata.url=kylin_metadata@jdbc,driverClassName=com.mysql.jdbc.Driver,url
 kylin.env.zookeeper-connect-string=ip
 ```
 
-你需要修改其中的 Mysql 用户名和密码,以及存储元数据的database和table。
+你需要修改其中的 Mysql 用户名和密码,以及存储元数据的database和table。并将 mysql jdbc connector 放在 `$KYLIN_HOME/ext` 目录下,没有该目录时请自行创建。
 请参考 [配置 Mysql 为 Metastore](/_docs40/tutorial/mysql_metastore.html) 了解 Mysql 作为 Metastore 的详细配置。
 
 ### Kylin tarball 目录
diff --git a/website/_docs40/install/index.md b/website/_docs40/install/index.md
index b0cb776..73b5b4f 100644
--- a/website/_docs40/install/index.md
+++ b/website/_docs40/install/index.md
@@ -9,7 +9,7 @@ permalink: /docs40/install/index.html
 
 * Hadoop: cdh5.x, cdh6.x, hdp2.x, EMR5.x, EMR6.x, HDI4.x
 * Hive: 0.13 - 1.2.1+
-* Spark: 2.4.6
+* Spark: 2.4.7
 * Mysql: 5.1.17及以上
 * JDK: 1.8+
 * OS: Linux only, CentOS 6.5+ or Ubuntu 16.0.4+
@@ -37,14 +37,14 @@ Linux accounts running Kylin must have access to the Hadoop cluster, including t
 
 ### Kylin Installation
 
-1. Download a Apache kylin 4.0.0 binary package from the [Apache Kylin Download Site](https://kylin.apache.org/download/). For example, the following command line can be used:
+- Download a Apache kylin 4.0.0 binary package from the [Apache Kylin Download Site](https://kylin.apache.org/download/). For example, the following command line can be used:
 
 ```shell
 cd /usr/local/
 wget http://mirror.bit.edu.cn/apache/kylin/apache-kylin-4.0.0/apache-kylin-4.0.0-bin.tar.gz
 ```
 
-2. Unzip the tarball and configure the environment variable `$KYLIN_HOME` to the Kylin folder.
+- Unzip the tarball and configure the environment variable `$KYLIN_HOME` to the Kylin folder.
 
 ```shell
 tar -zxvf apache-kylin-4.0.0-bin.tar.gz
@@ -52,15 +52,15 @@ cd apache-kylin-4.0.0-bin
 export KYLIN_HOME=`pwd`
 ```
 
-Run the script to download spark:
+- Run the script to download spark:
 
 ```shell
 $KYLIN_HOME/bin/download-spark.sh
 ```
 
-Or configure SPARK_HOME points to the path of spark2.4.6 in the environment.
+Or configure SPARK_HOME points to the path of spark2.4.7 in the environment.
 
-3. Configure MySQL metastore
+- Configure MySQL metastore
 
 Kylin 4.0 uses MySQL as metadata storage, make the following configuration in `kylin.properties`:
 
@@ -69,7 +69,7 @@ kylin.metadata.url=kylin_metadata@jdbc,driverClassName=com.mysql.jdbc.Driver,url
 kylin.env.zookeeper-connect-string=ip:2181
 ```
 
-You need to change the Mysql user name and password, as well as the database and table where the metadata is stored.
+You need to change the Mysql user name and password, as well as the database and table where the metadata is stored. And put mysql jdbc connector into `$KYLIN_HOME/ext/`, if there is no such directory, please create it.
 Please refer to [配置 Mysql 为 Metastore](/_docs40/tutorial/mysql_metastore.html) learn about the detailed configuration of MySQL as a Metastore.
 
 ### Kylin tarball structure