[kylin] branch doc5.0 updated: KYLIN-5221 add apache hadoop installation

xxyu Tue, 23 Aug 2022 05:12:51 -0700

This is an automated email from the ASF dual-hosted git repository.

xxyu pushed a commit to branch doc5.0
in repository https://gitbox.apache.org/repos/asf/kylin.git



The following commit(s) were added to refs/heads/doc5.0 by this push:
     new 1954e12c72 KYLIN-5221 add apache hadoop installation
1954e12c72 is described below

commit 1954e12c72dc594a011b5a7d7efc471407a45998
Author: Mukvin <boyboys...@163.com>
AuthorDate: Tue Aug 23 19:17:34 2022 +0800

    KYLIN-5221 add apache hadoop installation
---
 .../platform/install_on_apache_hadoop.md           |  52 ++++
 .../docs/deployment/installation/platform/intro.md |  18 ++
 website/docs/development/how_to_package.md         |  24 +-
 website/docs/quickstart/expert_mode_tutorial.md    |   8 +-
 website/docs/quickstart/images/gss_negotiate.png   | Bin 0 -> 19292 bytes
 .../images/installation_query_result.png           | Bin 0 -> 127355 bytes
 website/docs/quickstart/images/list.png            | Bin 0 -> 153847 bytes
 website/docs/quickstart/quick_start.md             | 268 +++++++++++++++++++++
 website/sidebars.js                                |  38 ++-
 9 files changed, 399 insertions(+), 9 deletions(-)

diff --git 
a/website/docs/deployment/installation/platform/install_on_apache_hadoop.md 
b/website/docs/deployment/installation/platform/install_on_apache_hadoop.md
new file mode 100644
index 0000000000..2a5bdcc28e
--- /dev/null
+++ b/website/docs/deployment/installation/platform/install_on_apache_hadoop.md
@@ -0,0 +1,52 @@
+---
+title: Install on Apache Hadoop Platform
+language: en
+sidebar_label: Install on Apache Hadoop Platform
+pagination_label: Install on Apache Hadoop Platform
+toc_min_heading_level: 2
+toc_max_heading_level: 6
+pagination_prev: null
+pagination_next: null
+keywords:
+    - install
+    - hadoop
+draft: false
+last_update:
+    date: 08/12/2022
+---
+
+
+### Prepare Environment
+
+First, **make sure you allocate sufficient resources for the environment**. See [Prerequisites](../../../deployment/on-premises/prerequisite.md) for Kylin's detailed resource requirements. Also make sure that `HDFS`, `YARN`, `Hive`, `ZooKeeper`, and other components are running normally and report no warnings.
+
+
+
+#### Apache Hadoop Supported Version
+
+Kylin supports the following Apache Hadoop versions:
+
+- Apache Hadoop 3.2.1
+
+**Note**: Apache Hadoop 3.2.1 environments with Kerberos enabled are not currently supported.
+
+#### Additional configuration required for Apache Hadoop version
+
+Add the following two configurations in `$KYLIN_HOME/conf/kylin.properties`:
+
+- `kylin.env.apache-hadoop-conf-dir`: the Hadoop conf directory in the Hadoop environment
+- `kylin.env.apache-hive-conf-dir`: the Hive conf directory in the Hadoop environment
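+
+As a minimal sketch, the two entries might look like the following. Both paths are hypothetical placeholders rather than defaults; point them at the actual configuration directories of your Hadoop and Hive installations.
+
+```properties
+# Hypothetical paths, shown for illustration only; replace with your own directories
+kylin.env.apache-hadoop-conf-dir=/opt/hadoop-3.2.1/etc/hadoop
+kylin.env.apache-hive-conf-dir=/opt/apache-hive/conf
+```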
+
+
+
+#### Jar package required by Apache Hadoop version
+
+In Apache Hadoop 3.2.1, you also need to provide the MySQL JDBC driver in Kylin's runtime environment.
+
+The MySQL 5.1 JDBC driver jar can be downloaded here: [mysql-connector-java-5.1.41.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.41/mysql-connector-java-5.1.41.jar). For other versions, you need to obtain the driver yourself. Place the JDBC driver that matches your MySQL version in the `$KYLIN_HOME/lib/ext` directory.
+
+
+
+### Install Kylin
+
+After setting up the environment, please refer to [Quick 
Start](../../../quickstart/quick_start.md) to continue.
diff --git a/website/docs/deployment/installation/platform/intro.md 
b/website/docs/deployment/installation/platform/intro.md
new file mode 100644
index 0000000000..fb44cb572c
--- /dev/null
+++ b/website/docs/deployment/installation/platform/intro.md
@@ -0,0 +1,18 @@
+---
+title: Install On Platforms
+language: en
+sidebar_label: Install On Platforms
+pagination_label: Install On Platforms
+toc_min_heading_level: 2
+toc_max_heading_level: 6
+pagination_prev: null
+pagination_next: null
+keywords:
+    - install
+    - platforms
+draft: false
+last_update:
+    date: 08/12/2022
+---
+
+This chapter describes how to install Kylin on different platforms.
diff --git a/website/docs/development/how_to_package.md 
b/website/docs/development/how_to_package.md
index 72d0708f45..a52c95a8c0 100644
--- a/website/docs/development/how_to_package.md
+++ b/website/docs/development/how_to_package.md
@@ -1,5 +1,17 @@
 ---
-sidebar_position: 1
+title: How to package
+language: en
+sidebar_label: How to package
+pagination_label: How to package
+toc_min_heading_level: 2
+toc_max_heading_level: 6
+pagination_prev: null
+pagination_next: null
+keywords:
+    - package
+draft: false
+last_update:
+    date: 08/22/2022
 ---
 
 # How to package
@@ -24,6 +36,11 @@ sidebar_position: 1
 | -skipFront           | If add this option, front-end won't be build and 
packaging |
 | -skipCompile         | Add this option will assume java source code no need 
be compiled again |
 
+### Other Options for Packaging Script
+|         Option       |     Comment                                        | 
+|--------------------  | ---------------------------------------------------|
+| -P hadoop3           | Build a Kylin 5.0 software package that runs on Hadoop 3.0+ platforms. |
+
 ### Package Content
 
 |         Option       |     Comment    | 
@@ -46,6 +63,9 @@ For example, an unofficial package could be 
`apache-kylin-5.0.0-SNAPSHOT.2022081
 ## Case 2: Official apache release,  kylin binary for deploy on Hadoop3+ and 
Hive2.3+, 
 # and third party cannot be distributed because of apache distribution 
policy(size and license)
 ./build/release/release.sh -noSpark -official 
+
+## Case 3: A package for running on the Apache Hadoop 3 platform
+./build/release/release.sh -P hadoop3
 ```
 
 ### How to switch to older node.js
@@ -60,4 +80,4 @@ nvm use 12.14.0
 
 ## switch to original version
 nvm use system
-```
\ No newline at end of file
+```
diff --git a/website/docs/quickstart/expert_mode_tutorial.md 
b/website/docs/quickstart/expert_mode_tutorial.md
index 66905f9150..cbafa0496e 100644
--- a/website/docs/quickstart/expert_mode_tutorial.md
+++ b/website/docs/quickstart/expert_mode_tutorial.md
@@ -1,14 +1,14 @@
 ---
-title: Quick Start
+title: Expert Mode Tutorial
 language: en
-sidebar_label: Quick Start
-pagination_label: Quick Start
+sidebar_label: Expert Mode Tutorial
+pagination_label: Expert Mode Tutorial
 toc_min_heading_level: 2
 toc_max_heading_level: 6
 pagination_prev: null
 pagination_next: null
 keywords:
-      - quick start
+      - expert mode tutorial
 draft: true
 last_update:
       date: 08/12/2022
diff --git a/website/docs/quickstart/images/gss_negotiate.png 
b/website/docs/quickstart/images/gss_negotiate.png
new file mode 100644
index 0000000000..2eca44b918
Binary files /dev/null and b/website/docs/quickstart/images/gss_negotiate.png 
differ
diff --git a/website/docs/quickstart/images/installation_query_result.png 
b/website/docs/quickstart/images/installation_query_result.png
new file mode 100644
index 0000000000..f2bd43f594
Binary files /dev/null and 
b/website/docs/quickstart/images/installation_query_result.png differ
diff --git a/website/docs/quickstart/images/list.png 
b/website/docs/quickstart/images/list.png
new file mode 100644
index 0000000000..937e7782c0
Binary files /dev/null and b/website/docs/quickstart/images/list.png differ
diff --git a/website/docs/quickstart/quick_start.md 
b/website/docs/quickstart/quick_start.md
new file mode 100644
index 0000000000..69c91df583
--- /dev/null
+++ b/website/docs/quickstart/quick_start.md
@@ -0,0 +1,268 @@
+---
+title: Quick Start
+language: en
+sidebar_label: Quick Start
+pagination_label: Quick Start
+toc_min_heading_level: 2
+toc_max_heading_level: 6
+pagination_prev: null
+pagination_next: null
+keywords:
+    - quick start
+draft: true
+last_update:
+    date: 08/12/2022
+---
+
+In this guide, we will explain how to quickly install and start Kylin 5.
+
+Before proceeding, please make sure the 
[Prerequisite](../deployment/on-premises/prerequisite.md) is met.
+
+
+### <span id="install">Download and Install</span>
+
+1. Get the Kylin installation package.
+
+   Please refer to [How To Package](../development/how_to_package.md).
+
+2. Decide the installation location and the Linux account to run Kylin. All 
the examples below are based on the following assumptions:
+
+   - The installation location is `/usr/local/`
+   - Linux account to run Kylin is `KyAdmin`. It is called the **Linux 
account** hereafter.
+   - **For all commands in the rest of the document**, please replace the 
above parameters with your real installation location and Linux account. 
+
+3. Copy the Kylin software package to your server or virtual machine and uncompress it.
+
+   ```shell
+   cd /usr/local
+   tar -zxvf Kylin5.0-Beta-[Version].tar.gz
+   ```
+   The decompressed directory is referred to as **$KYLIN_HOME** or **root 
directory**.
+
+4. Prepare the RDBMS metastore.
+
+   If PostgreSQL or MySQL is already installed in your environment, you can choose either of them as the metastore.
+   
+   **Note**: 
+   
+   + For production environments, we recommend setting up a dedicated metastore. You can use the PostgreSQL that is shipped with Kylin 5.x.
+   + The database name of the metastore **must start with an English letter**.
+   
+   Please refer to the links below for the complete installation and configuration steps:
+   
+   * [Use PostgreSQL as Metastore](../deployment/on-premises/rdbms_metastore/postgresql/default_metastore.md).
+   * [Use MySQL as Metastore](../deployment/on-premises/rdbms_metastore/mysql/mysql_metastore.md).
+   
+5. (optional) Install InfluxDB.
+  
+   Kylin uses InfluxDB to save various system monitoring information. If you 
do not need to view related information, you can skip this step. It is strongly 
recommended to complete this step in a production environment and use related 
monitoring functions.
+   
+   ```sh
+   cd $KYLIN_HOME/influxdb
+   
+   # install influxdb
+   rpm -ivh influxdb-1.6.5.x86_64.rpm
+   ```
+   
+   For more details, please refer to [Use InfluxDB as Time-Series 
Database](../operations/monitoring/influxdb/influxdb.md).
+   
+6. Create a working directory on HDFS and grant permissions.
+
+   The default working directory is `/kylin`. Make sure the Linux account has access to its home directory on HDFS, and create the directory `/kylin/spark-history` to store the Spark log files.
+
+   ```sh
+   hadoop fs -mkdir -p /kylin
+   hadoop fs -chown root /kylin
+   hadoop fs -mkdir -p /kylin/spark-history
+   hadoop fs -chown root /kylin/spark-history
+   ```
+
+   If necessary, you can modify the path of the Kylin working directory in 
`$KYLIN_HOME/conf/kylin.properties`.
+
+   **Note**: If you do not have the permission to create 
`/kylin/spark-history`, you can configure 
`kylin.engine.spark-conf.spark.eventLog.dir` and 
`kylin.engine.spark-conf.spark.history.fs.logDirectory` with an available 
directory.
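+
+   For instance, assuming a hypothetical writable HDFS directory `/tmp/kylin/spark-history` (an illustrative path, not a Kylin default), the two properties named in the note could be set as in this sketch:
+
+   ```properties
+   # Hypothetical directory; use any HDFS path the Linux account can write to
+   kylin.engine.spark-conf.spark.eventLog.dir=hdfs:///tmp/kylin/spark-history
+   kylin.engine.spark-conf.spark.history.fs.logDirectory=hdfs:///tmp/kylin/spark-history
+   ```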
+
+### <span id="configuration">Quick Configuration</span>
+
+In the `conf` directory under the root directory of the installation package, 
you should configure the parameters in the file `kylin.properties` as follows:
+
+1. Configure the following metadata parameter according to your PostgreSQL setup. Replace `{metadata_name}`, `{host}`, `{port}`, `{user}`, and `{password}` with your actual values. The maximum length allowed for `metadata_name` is 28.
+
+   ```properties
+   
kylin.metadata.url={metadata_name}@jdbc,driverClassName=org.postgresql.Driver,url=jdbc:postgresql://{host}:{port}/kylin,username={user},password={password}
+   ```
+   For more details on PostgreSQL configuration, please refer to [Use PostgreSQL as Metastore](../deployment/on-premises/rdbms_metastore/postgresql/default_metastore.md). For MySQL configuration, please refer to [Use MySQL as Metastore](../deployment/on-premises/rdbms_metastore/mysql/mysql_metastore.md).
+
+   > **Note**: name `{metadata_name}` using letters, numbers, or underscores. The name cannot start with a number. For example, `1a` is invalid and `a1` is valid.
+
+2. When executing jobs, Kylin submits build tasks to YARN. To have build tasks submitted to a specific queue, replace `{queue_name}` in the following parameter with the queue you actually use.
+
+   ```properties
+   kylin.engine.spark-conf.spark.yarn.queue={queue_name}
+   ```
+
+
+3. Configure the ZooKeeper service.
+
+   Kylin uses ZooKeeper for service discovery. In a cluster deployment, this ensures that when an instance starts, stops, or unexpectedly loses communication, the other instances in the cluster automatically discover the change and update its status. For more details, please refer to [Service Discovery](../deployment/on-premises/deploy_mode/service_discovery.md).
+   
+   Add the ZooKeeper connection configuration `kylin.env.zookeeper-connect-string=host:port`, setting the cluster addresses and ports as in the following example.
+   
+   ```properties
+   kylin.env.zookeeper-connect-string=10.1.2.1:2181,10.1.2.2:2181,10.1.2.3:2181
+   ```
+   
+4. (optional) Configure Spark client node information.
+
+   Spark is started in yarn-client mode. If the IP address of the Kylin node is not configured in the hosts file of the Hadoop cluster, add the following configurations to `kylin.properties`:
+
+   - `kylin.storage.columnar.spark-conf.spark.driver.host={hostIp}`
+   - `kylin.engine.spark-conf.spark.driver.host={hostIp}`
+
+   Replace `{hostIp}` as in the following example:
+
+   ```properties
+   kylin.storage.columnar.spark-conf.spark.driver.host=10.1.3.71
+   kylin.engine.spark-conf.spark.driver.host=10.1.3.71
+   ```
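+
+Taken together, the Quick Configuration entries in `kylin.properties` might look like the sketch below. Every value is an illustrative assumption: the metadata name `kylin_meta`, the PostgreSQL host `10.1.2.10` and port `5432`, the user name and password, and the queue name `default` are placeholders to replace with your own settings. The ZooKeeper and driver host values simply repeat the examples above.
+
+```properties
+# Illustrative values only; names, hosts, ports, and credentials are placeholders
+kylin.metadata.url=kylin_meta@jdbc,driverClassName=org.postgresql.Driver,url=jdbc:postgresql://10.1.2.10:5432/kylin,username=kylin_user,password=change_me
+kylin.engine.spark-conf.spark.yarn.queue=default
+kylin.env.zookeeper-connect-string=10.1.2.1:2181,10.1.2.2:2181,10.1.2.3:2181
+# Optional: only needed when the Hadoop cluster cannot resolve the Kylin node
+kylin.storage.columnar.spark-conf.spark.driver.host=10.1.3.71
+kylin.engine.spark-conf.spark.driver.host=10.1.3.71
+```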
+
+
+
+
+### <span id="start">Start Kylin</span>
+
+1. Check the version of `curl`.
+
+   `check-env.sh` relies on GSS-Negotiate support during installation, so we recommend checking the features of your curl first. Run the following command in your environment:
+
+   ```shell
+   curl --version
+   ```
+   If GSS-Negotiate appears in the output, your curl is suitable. If not, reinstall curl or add GSS-Negotiate support.
+   ![Check GSS-Negotiate dependency](images/gss_negotiate.png)
+
+2. Start Kylin with the startup script.
+   Run the following command to start Kylin. When it is first started, the 
system will run a series of scripts to check whether the system environment has 
met the requirements. For details, please refer to the [Environment Dependency 
Check](../operations/system-operation/cli_tool/environment_dependency_check.md) 
chapter.
+   
+   ```shell
+   ${KYLIN_HOME}/bin/kylin.sh start
+   ```
+   > **Note**: If you want to observe the detailed startup progress, run:
+   >
+   > ```shell
+   > tail -f $KYLIN_HOME/logs/kylin.log
+   > ```
+   
+
+   Once startup is complete, a prompt appears in the console. You can run the command below at any time to check the Kylin process.
+
+   ```shell
+   ps -ef | grep kylin
+   ```
+
+3. Get login information.
+
+   After the startup script has finished, the random password of the default user `ADMIN` is displayed on the console. We strongly recommend saving this password. If you lose it, please refer to [ADMIN User Reset Password](../operations/access-control/user_management.md).
+
+### <span id="use">How to Use</span>
+
+After Kylin is started, open the web GUI at `http://{host}:7070/kylin`. Replace `{host}` with your host name, IP address, or domain name. The default port is `7070`.
+
+The default user name is `ADMIN`. A randomly generated password is displayed on the console when Kylin is started for the first time. After the first login, please reset the administrator password according to the following password rules:
+
+- At least 8 characters.
+- Contains at least one number, one letter, and one special character 
```(~!@#$%^&*(){}|:"<>?[];',./`)```.
+
+Kylin uses the open source **SSB** (Star Schema Benchmark) dataset, designed for star schema OLAP scenarios, as its test dataset. You can verify that the installation is successful by running a script that imports the SSB dataset into Hive. The SSB data comes from multiple CSV files.
+
+**Import Sample Data**
+
+Run the following command to import the sample data:
+
+```shell
+$KYLIN_HOME/bin/sample.sh
+```
+
+The script creates one database, **SSB**, and 6 Hive tables, then imports data into them.
+
+After running successfully, you should be able to see the following 
information in the console:
+
+```shell
+Sample hive tables are created successfully
+```
+
+
+Several sections of this product manual use the SSB dataset as sample data to introduce Kylin. The SSB dataset simulates the transaction data of an online store. For more details, see [Sample Dataset](sample_dataset.md). Below is a brief introduction.
+
+
+| Table       | Description                           | Introduction |
+| ----------- | ------------------------------------- | ------------ |
+| CUSTOMER    | customer information                  | includes customer name, address, contact information, etc. |
+| DATES       | order date                            | includes an order's specific date, week, month, year, etc. |
+| LINEORDER   | order information                     | includes basic information such as order date, order amount, order revenue, supplier ID, commodity ID, customer ID, etc. |
+| PART        | product information                   | includes basic information such as product name, category, brand, etc. |
+| P_LINEORDER | view based on order information table | includes all content in the order information table and new content in the view |
+| SUPPLIER    | supplier information                  | includes supplier name, address, contact information, etc. |
+
+
+**Validate Product Functions**
+
+You can create a sample project and model according to [Expert Mode 
Tutorial](expert_mode_tutorial.md). The project should validate basic features 
such as source table loading, model creation, index build etc. 
+
+On the **Data Asset -> Model** page, you should see an example model whose storage is greater than 0.00 KB. This indicates that data has been loaded for the model.
+
+![model list](images/list.png)
+
+On the **Monitor** page, you can see that all jobs have completed successfully on the **Batch Job** and **Streaming Job** pages.
+
+![job monitor](images/job.png)
+
+**Validate Query Analysis**
+
+Once the metadata is loaded successfully, the 6 sample Hive tables are shown in the left panel of the **Insight** page, and you can run queries against them. For example, the following SQL statement sums the revenue for each `LO_PARTKEY` over orders dated between `19930601` and `19940601`, and sorts the results by total revenue in descending order:
+
+```sql
+SELECT LO_PARTKEY, SUM(LO_REVENUE) AS TOTAL_REVENUE
+FROM SSB.P_LINEORDER
+WHERE LO_ORDERDATE BETWEEN '19930601' AND '19940601'
+GROUP BY LO_PARTKEY
+ORDER BY SUM(LO_REVENUE) DESC
+```
+
+
+The query result is displayed on the **Insight** page and shows that the query hit the sample model.
+
+![query result](images/installation_query_result.png)
+
+You can also use the same SQL statement to query on Hive to verify the result 
and performance.
+
+
+
+### <span id="stop">Stop Kylin</span>
+
+Run the following command to stop Kylin:
+
+```shell
+$KYLIN_HOME/bin/kylin.sh stop
+```
+
+You can run the following command to check if the Kylin process has stopped.
+
+```shell
+ps -ef | grep kylin
+```
+
+### <span id="faq">FAQ</span>
+
+**Q: How do I change the service default port?**
+
+Modify the following configuration in `$KYLIN_HOME/conf/kylin.properties`. The example below sets the server port to 7070.
+
+```properties
+server.port=7070
+```
+
+**Q: Does Kylin support Kerberos integration?**
+
+Yes. If your cluster has Kerberos authentication enabled, the Spark embedded in Kylin must be configured properly to access your cluster resources securely. For more information, please refer to [Integrate with Kerberos](#TODO) (detailed documentation is coming soon).
+
+**Q: Is the query pushdown engine turned on by default?**
+
+Yes. To turn it off, please refer to [Pushdown to SparkSQL](../query/pushdown/pushdown_to_embedded_spark.md).
+
diff --git a/website/sidebars.js b/website/sidebars.js
index fd175df36d..7307aa5b33 100644
--- a/website/sidebars.js
+++ b/website/sidebars.js
@@ -35,6 +35,10 @@ const sidebars = {
                 id: 'quickstart/intro',
             },
             items: [
+                {
+                    type: 'doc',
+                    id: 'quickstart/quick_start',
+                },
                 {
                     type: 'doc',
                     id: 'quickstart/expert_mode_tutorial',
@@ -214,9 +218,37 @@ const sidebars = {
                     ],
                 },
                 {
-                    type: 'doc',
-                    id: 'deployment/installation/uninstallation'
-                }
+                    type: 'category',
+                    label: 'Install and Uninstall',
+                    link: {
+                        type: 'doc',
+                        id: 'deployment/installation/intro',
+                    },
+                    items: [
+                        {
+                            type: 'category',
+                            label: 'Install On Platforms',
+                            link: {
+                                type: 'doc',
+                                id: 'deployment/installation/platform/intro',
+                            },
+                            items: [
+                                {
+                                    type: 'doc',
+                                    id: 
'deployment/installation/platform/install_on_apache_hadoop',
+                                },
+                            ],
+                        },
+                        {
+                            type: 'doc',
+                            id: 'deployment/installation/uninstallation',
+                        },
+                        {
+                            type: 'doc',
+                            id: 'deployment/installation/install_validation',
+                        },
+                    ],
+                },
             ],
         },
         {
