This is an automated email from the ASF dual-hosted git repository.

shaofengshi pushed a commit to branch document
in repository https://gitbox.apache.org/repos/asf/kylin.git
commit 4943da2135e6da55bc858586d131bddf9b1cebfa
Author: shaofengshi <shaofeng...@apache.org>
AuthorDate: Tue Jun 26 17:20:22 2018 +0800

    Refine document
---
 website/_docs/tutorial/cube_spark.md         | 34 ++++++----------------------
 website/_docs/tutorial/use_cube_planner.md   | 13 ++++++-----
 website/_docs23/tutorial/use_cube_planner.md | 11 +++++----
 3 files changed, 21 insertions(+), 37 deletions(-)

diff --git a/website/_docs/tutorial/cube_spark.md b/website/_docs/tutorial/cube_spark.md
index 1a486b7..4770e48 100644
--- a/website/_docs/tutorial/cube_spark.md
+++ b/website/_docs/tutorial/cube_spark.md
@@ -10,42 +10,22 @@ Kylin v2.0 introduces the Spark cube engine, it uses Apache Spark to replace Map
 ## Preparation
 To finish this tutorial, you need a Hadoop environment which has Kylin v2.1.0 or above installed. Here we will use Hortonworks HDP 2.4 Sandbox VM, the Hadoop components as well as Hive/HBase has already been started.
-## Install Kylin v2.1.0 or above
+## Install Kylin v2.4.0 or above
-Download the Kylin v2.1.0 for HBase 1.x from Kylin's download page, and then uncompress the tar ball into */usr/local/* folder:
+Download the Kylin binary for HBase 1.x from Kylin's download page, and then uncompress the tar ball into */usr/local/* folder:
 {% highlight Groff markup %}
-wget http://www-us.apache.org/dist/kylin/apache-kylin-2.1.0/apache-kylin-2.1.0-bin-hbase1x.tar.gz -P /tmp
+wget http://www-us.apache.org/dist/kylin/apache-kylin-2.4.0/apache-kylin-2.4.0-bin-hbase1x.tar.gz -P /tmp
-tar -zxvf /tmp/apache-kylin-2.1.0-bin-hbase1x.tar.gz -C /usr/local/
+tar -zxvf /tmp/apache-kylin-2.4.0-bin-hbase1x.tar.gz -C /usr/local/
-export KYLIN_HOME=/usr/local/apache-kylin-2.1.0-bin-hbase1x
+export KYLIN_HOME=/usr/local/apache-kylin-2.4.0-bin-hbase1x
 {% endhighlight %}
 ## Prepare "kylin.env.hadoop-conf-dir"
-To run Spark on Yarn, need specify **HADOOP_CONF_DIR** environment variable, which is the directory that contains the (client side) configuration files for Hadoop. In many Hadoop distributions the directory is "/etc/hadoop/conf"; But Kylin not only need access HDFS, Yarn and Hive, but also HBase, so the default directory might not have all necessary files. In this case, you need create a new directory and then copying or linking those client files (core-site.xml, hdfs-site.xml, yarn-site [...]
-
-{% highlight Groff markup %}
-
-mkdir $KYLIN_HOME/hadoop-conf
-ln -s /etc/hadoop/conf/core-site.xml $KYLIN_HOME/hadoop-conf/core-site.xml
-ln -s /etc/hadoop/conf/hdfs-site.xml $KYLIN_HOME/hadoop-conf/hdfs-site.xml
-ln -s /etc/hadoop/conf/yarn-site.xml $KYLIN_HOME/hadoop-conf/yarn-site.xml
-ln -s /etc/hbase/2.4.0.0-169/0/hbase-site.xml $KYLIN_HOME/hadoop-conf/hbase-site.xml
-cp /etc/hive/2.4.0.0-169/0/hive-site.xml $KYLIN_HOME/hadoop-conf/hive-site.xml
-vi $KYLIN_HOME/hadoop-conf/hive-site.xml (change "hive.execution.engine" value from "tez" to "mr")
-
-{% endhighlight %}
-
-Now, let Kylin know this directory with property "kylin.env.hadoop-conf-dir" in kylin.properties:
-
-{% highlight Groff markup %}
-kylin.env.hadoop-conf-dir=/usr/local/apache-kylin-2.1.0-bin-hbase1x/hadoop-conf
-{% endhighlight %}
-
-If this property isn't set, Kylin will use the directory that "hive-site.xml" locates in; while that folder may have no "hbase-site.xml", will get HBase/ZK connection error in Spark.
+To run Spark on Yarn, need specify **HADOOP_CONF_DIR** environment variable, which is the directory that contains the (client side) configuration files for Hadoop.
+In many Hadoop distributions the directory is "/etc/hadoop/conf"; Kylin can automatically detect this folder from Hadoop configuration, so by default you don't need to set this property. If your configuration files are not in default folder, please set this property explicitly.
 ## Check Spark configuration
@@ -142,7 +122,7 @@ After all steps be successfully executed, the Cube becomes "Ready" and you can q
 When getting error, you should check "logs/kylin.log" firstly. There has the full Spark command that Kylin executes, e.g:
 {% highlight Groff markup %}
-2017-03-06 14:44:38,574 INFO [Job 2d5c1178-c6f6-4b50-8937-8e5e3b39227e-306] spark.SparkExecutable:121 : cmd:export HADOOP_CONF_DIR=/usr/local/apache-kylin-2.1.0-bin-hbase1x/hadoop-conf && /usr/local/apache-kylin-2.1.0-bin-hbase1x/spark/bin/spark-submit --class org.apache.kylin.common.util.SparkEntry --conf spark.executor.instances=1 --conf spark.yarn.queue=default --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=current --conf spark.history.fs.logDirectory=hdfs:///kylin/spark-his [...]
+2017-03-06 14:44:38,574 INFO [Job 2d5c1178-c6f6-4b50-8937-8e5e3b39227e-306] spark.SparkExecutable:121 : cmd:export HADOOP_CONF_DIR=/etc/hadoop/conf && /usr/local/apache-kylin-2.4.0-bin-hbase1x/spark/bin/spark-submit --class org.apache.kylin.common.util.SparkEntry --conf spark.executor.instances=1 --conf spark.yarn.queue=default --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=current --conf spark.history.fs.logDirectory=hdfs:///kylin/spark-history --conf spark.driver.extraJavaOp [...]
 {% endhighlight %}

diff --git a/website/_docs/tutorial/use_cube_planner.md b/website/_docs/tutorial/use_cube_planner.md
index 476c91c..3d5340d 100644
--- a/website/_docs/tutorial/use_cube_planner.md
+++ b/website/_docs/tutorial/use_cube_planner.md
@@ -17,10 +17,11 @@ Cube Planner makes Apache Kylin to be more resource efficient. It intelligently
 
-## Prerequisites
+Read more at [eBay tech blog](https://www.ebayinc.com/stories/blogs/tech/cube-planner-build-an-apache-kylin-olap-cube-efficiently-and-intelligently/)
-To enable Dashboard on WebUI, you need to set **kylin.cube.cubeplanner.enabled=true** and other properties in **kylin.properties**.
+## Prerequisites
+To enable Dashboard on WebUI, you need to set `kylin.cube.cubeplanner.enabled=true` and other properties in`kylin.properties`
 {% highlight Groff markup %}
 kylin.cube.cubeplanner.enabled=true
@@ -44,16 +45,14 @@ kylin.metrics.monitor-enabled=true
 You should make sure the status of the Cube is '**READY**'
- If the status of the Cube is '**DISABLED**', you will not be able to use the Cube planner.
-
- You should change the status of the Cube from '**DISABLED**' to '**READY**' by building it or enabling it if it has been built before.
+ If the status of the Cube is '**DISABLED**', you will not be able to use the Cube planner. You should change the status of the Cube from '**DISABLED**' to '**READY**' by building it or enabling it if it has been built before.
 #### Step 3:
 a. Click the '**Planner**' button to view the '**Current Cuboid Distribution**' of the Cube.
-- The data will be displayed in *[Sunburst Chart](https://en.wikipedia.org/wiki/Pie_chart#Ring_chart_.2F_Sunburst_chart_.2F_Multilevel_pie_chart)*.
+- The data will be displayed in Sunburst Chart.
 - Each part refers to a cuboid, is shown in different colors determined by the query **frequency** against this cuboid.
@@ -120,6 +119,8 @@ c. Click the '**Optimize**' button to optimize the Cube.
 - User is able to get to know the last optimized time of the Cube in Cube Planner tab page.
+Please note: if you don't see the last optimized time, upgrade to Kylin v2.3.2 or above, check KYLIN-3404.
+
 
 - User is able to receive an email notification for a Cube optimization job.

diff --git a/website/_docs23/tutorial/use_cube_planner.md b/website/_docs23/tutorial/use_cube_planner.md
index 4d889ff..d1acb96 100644
--- a/website/_docs23/tutorial/use_cube_planner.md
+++ b/website/_docs23/tutorial/use_cube_planner.md
@@ -17,10 +17,11 @@ Cube Planner makes Apache Kylin to be more resource efficient. It intelligently
 
-## Prerequisites
+Read more at [eBay tech blog](https://www.ebayinc.com/stories/blogs/tech/cube-planner-build-an-apache-kylin-olap-cube-efficiently-and-intelligently/)
-To enable Dashboard on WebUI, you need to set **kylin.cube.cubeplanner.enabled=true** and other properties in **kylin.properties**.
+## Prerequisites
+To enable Dashboard on WebUI, you need to set `kylin.cube.cubeplanner.enabled=true` and other properties in `kylin.properties`
 {% highlight Groff markup %}
 kylin.cube.cubeplanner.enabled=true
@@ -44,7 +45,7 @@ kylin.metrics.monitor-enabled=true
 You should make sure the status of the Cube is '**READY**'
- If the status of the Cube is '**DISABLED**', you will not be able to use the Cube planner.
+ If the status of the Cube is '**DISABLED**', you will not be able to use the Cube planner.
 You should change the status of the Cube from '**DISABLED**' to '**READY**' by building it or enabling it if it has been built before.
@@ -53,7 +54,7 @@ kylin.metrics.monitor-enabled=true
 a. Click the '**Planner**' button to view the '**Current Cuboid Distribution**' of the Cube.
-- The data will be displayed in *[Sunburst Chart](https://en.wikipedia.org/wiki/Pie_chart#Ring_chart_.2F_Sunburst_chart_.2F_Multilevel_pie_chart)*.
+- The data will be displayed in **Sunburst Chart**.
 - Each part refers to a cuboid, is shown in different colors determined by the query **frequency** against this cuboid.
@@ -120,6 +121,8 @@ c. Click the '**Optimize**' button to optimize the Cube.
 - User is able to get to know the last optimized time of the Cube in Cube Planner tab page.
+Please note: if you don't see the last optimized time, upgrade to Kylin v2.3.2 or above, check KYLIN-3404.
+
 
 - User is able to receive an email notification for a Cube optimization job.
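If Kylin's auto-detection does not pick up the right client configuration (the case the refined cube_spark.md text covers with "If your configuration files are not in default folder, please set this property explicitly"), the property can still be set by hand in kylin.properties. A minimal sketch, assuming a merged configuration directory under $KYLIN_HOME; the exact path is illustrative and not part of this commit:

{% highlight Groff markup %}
# Hypothetical kylin.properties entry; point it at whatever directory actually holds
# core-site.xml, hdfs-site.xml, yarn-site.xml, hbase-site.xml and hive-site.xml
kylin.env.hadoop-conf-dir=/usr/local/apache-kylin-2.4.0-bin-hbase1x/hadoop-conf
{% endhighlight %}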