This is an automated email from the ASF dual-hosted git repository.

yaqian pushed a commit to branch kylin4_on_cloud
in repository https://gitbox.apache.org/repos/asf/kylin.git
The following commit(s) were added to refs/heads/kylin4_on_cloud by this push:
     new 113e417  Modify docs (#1847)
113e417 is described below

commit 113e417c6096d424c3c1bf8bcdf9ca78abb8013d
Author: Yaqian Zhang <598593...@qq.com>
AuthorDate: Sat Apr 2 16:53:51 2022 +0800

    Modify docs (#1847)
---
 README.md                                          |  5 ++-
 instances/aws_instance.py                          |  2 +-
 ...vanced_configs.md => advanced_configuration.md} | 10 ++---
 readme/commands.md                                 | 43 +++++++++++++++-------
 readme/{configs.md => configuration.md}            | 16 +++++---
 readme/quick_start.md                              | 27 ++++++++------
 readme/quick_start_for_multiple_clusters.md        |  4 +-
 7 files changed, 68 insertions(+), 39 deletions(-)

diff --git a/README.md b/README.md
index c4704e4..2599bc9 100644
--- a/README.md
+++ b/README.md
@@ -40,7 +40,7 @@ When cluster(s) created, services and nodes will be like below:
 1. For more details about `cost` of tool, see document [cost calculation](./readme/cost_calculation.md).
 2. For more details about `commands` of tool, see document [commands](./readme/commands.md).
 3. For more details about the `prerequisites` of tool, see document [prerequisites](./readme/prerequisites.md).
-4. For more details about `advanced configs` of tool, see document [advanced configs](./readme/advanced_configs.md).
+4. For more details about `advanced configs` of tool, see document [configuration](./readme/configuration.md) and [advanced configuration](./readme/advanced_configuration.md).
 5. For more details about `monitor services` supported by tool, see document [monitor](./readme/monitor.md).
 6. For more details about `troubleshooting`, see document [troubleshooting](./readme/trouble_shooting.md).
 7. The current tool has already opened the public port for some services. You can access the service by `public IP` of related EC2 instances.
@@ -49,6 +49,7 @@ When cluster(s) created, services and nodes will be like below:
     3. `Prometheus`: 9090, 9100.
     4. `Kylin`: 7070.
     5. `Spark`: 8080, 4040.
+    6. `MDX for Kylin`: 7080.
 8. More about cloudformation syntax, please check [aws website](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html).
-9. The current Kylin version is 4.0.0.
+9. The current Kylin version is 4.0.1.
 10. The current Spark version is 3.1.1.
diff --git a/instances/aws_instance.py b/instances/aws_instance.py
index f370d95..9260e79 100644
--- a/instances/aws_instance.py
+++ b/instances/aws_instance.py
@@ -1808,7 +1808,7 @@ class AWSInstance:
         logger.info(f"Fetching messages successfully ...")
 
         header_msg = '\n=================== List Alive Nodes ===========================\n'
-        result = header_msg + f"Stack Name\t\tInstance ID\t\tPrivate Ip\t\tPublic Ip\t\t\n"
+        result = header_msg + f"Node Name\t\tInstance ID\t\tPrivate Ip\t\tPublic Ip\t\t\n"
         for msg in msgs:
             result += msg + '\n'
         result += header_msg
diff --git a/readme/advanced_configs.md b/readme/advanced_configuration.md
similarity index 83%
rename from readme/advanced_configs.md
rename to readme/advanced_configuration.md
index cd36d3b..0218650 100644
--- a/readme/advanced_configs.md
+++ b/readme/advanced_configuration.md
@@ -16,7 +16,7 @@ There are `9` modules params for tools. Introductions as below:
 
 - EC2_KYLIN4_PARAMS: These params of the module are for creating a Kylin4.
 
-- EC2_SPARK_WORKER_PARAMS: These params of the module are for creating **Spark Workers**, the default is **3** spark workers for all clusters.
+- EC2_SPARK_WORKER_PARAMS: These params of the module are for creating **Spark Workers**; the default is **3** spark workers for all of the clusters.
 
 - EC2_KYLIN4_SCALE_PARAMS: these params of the module are for scaling **Kylin4 nodes**, the range of **Kylin4 nodes** is related to `KYLIN_SCALE_UP_NODES` and `KYLIN_SCALE_DOWN_NODES`.
@@ -25,16 +25,16 @@ There are `9` modules params for tools. Introductions as below:
   > 1. `KYLIN_SCALE_UP_NODES` is for the range of Kylin nodes to scale up.
   > 2. `KYLIN_SCALE_DOWN_NODES` is for the range of Kylin nodes to scale down.
   > 3. The range of `KYLIN_SCALE_UP_NODES` must contain the range of `KYLIN_SCALE_DOWN_NODES`.
-  > 4. **They are effective to all clusters which are not only `default cluster` but also another cluster whose index is in `${CLUSTER_INDEXES}`.**
+  > 4. **They are effective for all of the clusters, not only the `default cluster` but also any cluster whose index is in `${CLUSTER_INDEXES}`.**
 
 - EC2_SPARK_SCALE_SLAVE_PARAMS: these params of the module are for scaling **Spark workers**, the range of **Spark Workers is related to `SPARK_WORKER_SCALE_UP_NODES` and `SPARK_WORKER_SCALE_DOWN_NODES`.
 
   > Note:
   >
-  > 1. `SPARK_WORKER_SCALE_UP_NODES` is for the range for spark workers to scale up. **It's effective to all clusters which are not only `default cluster` but also another cluster whose index is in `${CLUSTER_INDEXES}`.**
-  > 2. `SPARK_WORKER_SCALE_DOWN_NODES` is for the range for spark workers to scale down. **It's effective to all clusters which are not only `default cluster` but also another cluster whose index is in `${CLUSTER_INDEXES}`.**
+  > 1. `SPARK_WORKER_SCALE_UP_NODES` is for the range of spark workers to scale up. **It's effective for all of the clusters, not only the `default cluster` but also any cluster whose index is in `${CLUSTER_INDEXES}`.**
+  > 2. `SPARK_WORKER_SCALE_DOWN_NODES` is for the range of spark workers to scale down. **It's effective for all of the clusters, not only the `default cluster` but also any cluster whose index is in `${CLUSTER_INDEXES}`.**
   > 3. The range of `SPARK_WORKER_SCALE_UP_NODES` must contain the range of `SPARK_WORKER_SCALE_DOWN_NODES`.
-  > 4. **They are effective to all clusters which are not only `default cluster` but also another cluster whose index is in `${CLUSTER_INDEXES}`.**
+  > 4. **They are effective for all of the clusters, not only the `default cluster` but also any cluster whose index is in `${CLUSTER_INDEXES}`.**
 
 ### Customize Configs
diff --git a/readme/commands.md b/readme/commands.md
index 7bca88f..81dd510 100644
--- a/readme/commands.md
+++ b/readme/commands.md
@@ -1,16 +1,21 @@
 ## Commands<a name="run"></a>
 
 Command:
 
+> Note:
+>
+> Options are placed in `[]`, and different options are separated by `|`.
 ```shell
-python deploy.py --type [deploy|destroy|list|scale] --scale-type [up|down] --node-type [kylin|spark_worker] [--cluster {1..6}|all|default]
+python deploy.py --type [deploy|destroy|destroy-all|list|scale] --mode [all|job|query] --scale-type [up|down] --node-type [kylin|spark_worker] --cluster [{1..6}|all|default]
 ```
 
 - deploy: create cluster(s).
 
-- destroy: destroy created cluster(s).
+- destroy: destroy created cluster(s), including kylin node, spark master node, spark slave node and zookeeper node.
+
+- destroy-all: destroy all of the nodes, including kylin node, spark master node, spark slave node, zookeeper node, rds node, monitor node and vpc node.
 
-- list: list alive nodes which are with stack name, instance id, private IP, and public IP.
+- list: list alive nodes with node name, instance id, private IP, and public IP.
 
 - scale: Must be used with `--scale-type` and `--node-type`.
 
@@ -19,14 +24,26 @@ python deploy.py --type [deploy|destroy|list|scale] --scale-type [up|down] --nod
 > 1. Current support to scale up/down `kylin` or `spark_worker` for a specific cluster.
 > 2. Before scaling up/down `kylin` or `spark_worker` nodes, Cluster services must be ready.
 > 3. If you want to scale a `kylin` or `spark_worker` node to a specified cluster, please add the `--cluster ${cluster ID}` to specify the expected node add to the cluster `${cluster ID}`.
-> 4. For details about the index of the cluster, please check [Indexes of clusters](./configs.md#indexofcluster).
+> 4. For details about the index of the cluster, please check [Indexes of clusters](./configuration.md#indexofcluster).
 
 ### Command for deploy
 
-- Deploy a default cluster
+- Deploy a cluster; the mode of the kylin node is `all`
+
+```shell
+$ python deploy.py --type deploy
+```
+
+- Deploy a cluster; the mode of the kylin node is `job`
+
+```shell
+$ python deploy.py --type deploy --mode job
+```
+
+- Deploy a cluster; the mode of the kylin node is `query`
 
 ```shell
-$ python deploy.py --type deploy [--cluster default]
+$ python deploy.py --type deploy --mode query
 ```
 
 - Deploy a cluster with a specific cluster index. <a name="deploycluster"></a>
@@ -37,7 +54,7 @@ $ python deploy.py --type deploy --cluster ${cluster ID}
 
 > Note: the `${cluster ID}` must be in the range of `CLUSTER_INDEXES`.
 
-- Deploy all clusters which contain the default cluster and all clusters whose index is in the range of `CLUSTER_INDEXES`.
+- Deploy all of the clusters, including the default cluster and all of the clusters whose index is in the range of `CLUSTER_INDEXES`.
 
 ```shell
 $ python deploy.py --type deploy --cluster all
@@ -47,12 +64,12 @@ $ python deploy.py --type deploy --cluster all
 
 > Note:
 >
-> Destroy all clusters will not delete vpc, rds, and monitor node. So if user doesn't want to hold the env, please set the `ALWAYS_DESTROY_VPC_RDS_MONITOR` to be `'true'`.
+> By default, the `destroy` command does not delete the vpc, rds, and monitor node. So if the user doesn't want to keep the env, please use the `destroy-all` command.
 
-- Destroy a default cluster
+- Destroy the default cluster
 
 ```shell
-$ python deploy.py --type destroy [--cluster default]
+$ python deploy.py --type destroy
 ```
 
 - Destroy a cluster with a specific cluster index.
@@ -63,7 +80,7 @@ $ python deploy.py --type destroy --cluster ${cluster ID}
 
 > Note: the `${cluster ID}` must be in the range of `CLUSTER_INDEXES`.
 
-- Destroy all clusters which contain the default cluster and all clusters whose index is in the range of `CLUSTER_INDEXES`.
+- Destroy all of the clusters, including the default cluster and all of the clusters whose index is in the range of `CLUSTER_INDEXES`.
 
 ```shell
 $ python deploy.py --type destroy --cluster all
@@ -71,7 +88,7 @@ $ python deploy.py --type destroy --cluster all
 
 ### Command for list
 
-- List nodes that are with **stack name**, **instance id**, **private IP,** and **public IP** in **available stacks**.
+- List nodes with **node name**, **instance id**, **private IP**, and **public IP** in **available stacks**.
 
 ```shell
 $ python deploy.py --type list
@@ -83,7 +100,7 @@ $ python deploy.py --type list
 >
 > 1. Scale command must be used with `--scale-type` and `--node-type`.
 > 2. If the scale command does not specify a `cluster ID`, then the scaled node(Kylin or spark worker) will be added to the `default` cluster.
-> 3. Scale command **not support** to **scale** node (kylin or spark worker) to **all clusters** at **one time**. It means that `python ./deploy.py --type scale --scale-type up[|down] --node-type kylin[|spark_worker] --cluster all` is invalid commad.
+> 3. The scale command does **not support** **scaling** a node (kylin or spark worker) to **all of the clusters** at **one time**. It means that `python ./deploy.py --type scale --scale-type up[|down] --node-type kylin[|spark_worker] --cluster all` is an invalid command.
 > 4. Scale params which are `KYLIN_SCALE_UP_NODES`, `KYLIN_SCALE_DOWN_NODES`, `SPARK_WORKER_SCALE_UP_NODES` and `SPARK_WORKER_SCALE_DOWN_NODES` effect on all cluster. So if user wants to scale a node for a specific cluster, then modify the scale params before **every run time.**
 > 5. **(Important!!!)** The current cluster is created with default `3` spark workers and `1` Kylin node. The `3` spark workers can not be scaled down. The `1` Kylin node also can not be scaled down.
 > 6. **(Important!!!)** The current cluster can only scale up or down the range of nodes which is in `KYLIN_SCALE_UP_NODES`, `KYLIN_SCALE_DOWN_NODES`, `SPARK_WORKER_SCALE_UP_NODES,` and `SPARK_WORKER_SCALE_DOWN_NODES`. Not the default `3` spark workers and `1` kylin node in a cluster.
diff --git a/readme/configs.md b/readme/configuration.md
similarity index 87%
rename from readme/configs.md
rename to readme/configuration.md
index b3281f7..52e0559 100644
--- a/readme/configs.md
+++ b/readme/configuration.md
@@ -1,18 +1,24 @@
-## Configs
+## Configuration
 
 #### I. Configure the `kylin_configs.yaml`
 
 **Required parameters**:
 
-- `AWS_REGION`: Current region for EC2 instances.
+- `AWS_REGION`: Current region for EC2 instances. Default is `cn-northwest-1`.
 - `IAMRole`: IAM role which has the access to aws authority. This parameter will be set to the created **name** of the IAM role.
 - `S3_URI`: the prefix path of storing `jars/scripts/tar`. For example, this parameter will be set to `s3://.../kylin4-aws-test`.
 - `KeyName`: Security key name is a set of security credentials that you use to prove your identity when connecting to an instance. This parameter will be set to the created **name** of key pair`.
 - `CIDR_IP`: An inbound rule permits instances to receive traffic from the specified IPv4 or IPv6 CIDR address range, or the instances associated with the specified security group.
+
+**Optional parameters**:
+
 - `DB_IDENTIFIER`: this param should be only one in the `RDS -> Databases`. And it will be the name of created RDS database.
-- `DB_PORT`: this param will be the port of created RDS database, default is `3306`.
-- `DB_USER`: this param will be a login ID for the master user of your DB instance, the default is `root`.
-- `DB_PASSWORD`: this param will be the password of `DB_USER` to access the DB instance. default is `123456test`, it's strongly suggested you change it.
+- `DB_PORT`: this param will be the port of the created RDS database. The default value is `3306`.
+- `DB_USER`: this param will be a login ID for the master user of your DB instance. The default value is `root`.
+- `DB_PASSWORD`: this param will be the password of `DB_USER` to access the DB instance. The default value is `123456test`; it's strongly suggested that you change it.
+
+- `ENABLE_MDX`: Whether to start the `MDX for Kylin` service when starting the cluster. The default value is `false`.
+- `SUPPORT_GLUE`: Whether to use AWS Glue as the metastore service of the hive data source. The default value is `true`; it is effective only when deploying a kylin node in `job` mode.
 
 #### II. Configure the `kylin.properties` in `backup/properties` directories.<a name="cluster"></a>
diff --git a/readme/quick_start.md b/readme/quick_start.md
index 7311c2b..f86f026 100644
--- a/readme/quick_start.md
+++ b/readme/quick_start.md
@@ -15,26 +15,30 @@
 git clone https://github.com/apache/kylin.git && cd kylin && git checkout kylin4_on_cloud
 ```
 
-3. Modify the `kylin_config.yml`.
+3. Configure the `kylin_configs.yaml`.
 
-   1. Set the `AWS_REGION`, such as us-east-1.
+   1. Set the `AWS_REGION`, such as `us-east-1`.
 
-   2. Set the `IAMRole`, please check [Create an IAM role](./prerequisites.md#IAM).
+   2. Set the `IAMRole`, please check [create an IAM role](./prerequisites.md#IAM).
 
-   3. Set the `S3_URI`, please check [Create a S3 direcotry](./prerequisites.md#S3).
+   3. Set the `S3_URI`, please check [create an S3 directory](./prerequisites.md#S3).
 
-   4. Set the `KeyName`, please check [Create a keypair](./prerequisites.md#keypair).
+   4. Set the `KeyName`, please check [create a keypair](./prerequisites.md#keypair).
 
    5. Set the `CIDR_IP`, make sure that the `CIDR_IP` match the pattern `xxx.xxx.xxx.xxx/16[|24|32]`.
 
-      > Note: 
+      > Note:
       >
       > 1. this `CIDR_IP` is the specified IPv4 or IPv6 CIDR address range which an inbound rule can permit instances to receive traffic from.
       >
       > 2. In one word, it will let your mac which IP is in the `CIDR_IP` to access instances.
 
+   6. Set the `ENABLE_MDX`. If you want to use `MDX for Kylin`, set this parameter to `true`. For `MDX for Kylin`, please refer to [The manual of MDX for Kylin](https://kyligence.github.io/mdx-kylin/).
+
 4. Init python env.
 
+> Note: You need to ensure that the local machine has Python 3.6.6 or later installed.
+
 ```shell
 $ bin/init.sh
 $ source venv/bin/activate
@@ -46,8 +50,6 @@ Check the python version:
 $ python --version
 ```
 
-> Note: If Python is already installed locally, you need to ensure that the python version is 3.6.6 or later.
-
 5. Execute commands to deploy a cluster quickly.
 
 ```shell
@@ -58,8 +60,9 @@ After this cluster is ready, you will see the message `Kylin Cluster already sta
 
 > Note:
 >
-> 1. For more details about the properties of kylin4 in a cluster, please check [configure kylin.properties](./configs.md#cluster).
-> 2. For more details about the index of the clusters, please check [Indexes of clusters](./configs.md#indexofcluster).
+> 1. By default, the mode of the kylin node in the deployed cluster is `all`, which supports both `job` and `query`. If you want to deploy a read-write separated cluster, you can use the command `python deploy.py --type deploy --mode job` to deploy a `job` cluster, and the command `python deploy.py --type deploy --mode query` to deploy a `query` cluster. AWS Glue is supported by default in a `job` cluster.
+> 2. For more details about the properties of kylin4 in a cluster, please check [configure kylin.properties](./configuration.md#cluster).
+> 3. For more details about the index of the clusters, please check [Indexes of clusters](./configuration.md#indexofcluster).
 
 6. Execute commands to list nodes of the cluster.
 
@@ -73,6 +76,8 @@ You can access `Kylin` web by `http://{kylin public ip}:7070/kylin`.
 
+If you set `ENABLE_MDX` to `true`, you can access `MDX for Kylin` by `http://{kylin public ip}:7080/kylin`.
+
 7. Destroy the cluster quickly.
 
 ```shell
@@ -82,5 +87,5 @@ $ python deploy.py --type destroy
 
 > Note:
 >
 > 1. If you want to check about a quick start for multiple clusters, please refer to the [quick start for multiple clusters](./quick_start_for_multiple_clusters.md).
-> 2. **Current destroy operation will remain some stack which contains `RDS` and so on**. So if user want to destroy clearly, please modify the `ALWAYS_DESTROY_VPC_RDS_MONITOR` in `kylin_configs.yml` to be `true` and re-execute `destroy` command.
+> 2. **The current destroy operation will retain some stacks, such as the `RDS` stack**. So if the user wants to destroy everything, please use `python deploy.py --type destroy-all`.
diff --git a/readme/quick_start_for_multiple_clusters.md b/readme/quick_start_for_multiple_clusters.md
index fd18836..e025941 100644
--- a/readme/quick_start_for_multiple_clusters.md
+++ b/readme/quick_start_for_multiple_clusters.md
@@ -8,9 +8,9 @@
    >
    > 1. `CLUSTER_INDEXES` means that cluster index is in the range of `CLUSTER_INDEXES`.
    > 2. Configs for multiple clusters are also from `kylin_configs.yaml`.
-   > 3. For more details about the index of the clusters, please check [Indexes of clusters](./configs.md#indexofcluster).
+   > 3. For more details about the index of the clusters, please check [Indexes of clusters](./configuration.md#indexofcluster).
 
-2. Copy `kylin.properties.template` for expecting clusters to deploy, please check the [details](./configs.md#cluster).
+2. Copy `kylin.properties.template` for the clusters you expect to deploy, please check the [details](./configuration.md#cluster).
 
 3. Execute commands to deploy all of the clusters.
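For quick reference, the commands documented in this patch compose into an end-to-end session roughly like the sketch below. It assumes `kylin_configs.yaml` has already been filled in as described in `readme/configuration.md`, and the scale step is only an illustration (it requires the scale params to be configured first):

```shell
# Prepare the local python environment (Python 3.6.6 or later), per readme/quick_start.md.
bin/init.sh
source venv/bin/activate

# Deploy the default cluster; without --mode the kylin node runs in `all` mode.
python deploy.py --type deploy

# List alive nodes with node name, instance id, private IP and public IP.
python deploy.py --type list

# Scale one kylin node up; without --cluster it is added to the default cluster.
python deploy.py --type scale --scale-type up --node-type kylin

# Tear everything down, including the rds, monitor and vpc nodes.
python deploy.py --type destroy-all
```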