This is an automated email from the ASF dual-hosted git repository.

zjffdu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/zeppelin.git


The following commit(s) were added to refs/heads/master by this push:
     new 1b31947  [ZEPPELIN-4874]. Add document for interpreter yarn launch mode
1b31947 is described below

commit 1b319475d1625149de69253aceecd25fdb1a1179
Author: Jeff Zhang <zjf...@apache.org>
AuthorDate: Wed Jun 17 23:15:42 2020 +0800

    [ZEPPELIN-4874]. Add document for interpreter yarn launch mode
    
    ### What is this PR for?
    
    Document about yarn launch mode.
    * Add one section in `Run Mode`
    * Add one section about how to integrate with hadoop
    
    ### What type of PR is it?
    [ Documentation ]
    
    ### Todos
    * [ ] - Task
    
    ### What is the Jira issue?
    * https://issues.apache.org/jira/browse/ZEPPELIN-4874
    
    ### How should this be tested?
    No test needed
    
    ### Screenshots (if appropriate)
    
    ### Questions:
    * Does the licenses files need update? No
    * Is there breaking changes for older versions? No
    * Does this needs documentation? No
    
    Author: Jeff Zhang <zjf...@apache.org>
    
    Closes #3835 from zjffdu/ZEPPELIN-4874 and squashes the following commits:
    
    a01912b4f [Jeff Zhang] [ZEPPELIN-4874]. Add document for interpreter yarn launch mode
---
 docs/_includes/themes/zeppelin/_navigation.html |  2 +
 docs/quickstart/yarn.md                         | 75 +++++++++++++++++++++++++
 docs/setup/basics/hadoop_integration.md         | 39 +++++++++++++
 3 files changed, 116 insertions(+)

diff --git a/docs/_includes/themes/zeppelin/_navigation.html b/docs/_includes/themes/zeppelin/_navigation.html
index 0940863..5f0eac4 100644
--- a/docs/_includes/themes/zeppelin/_navigation.html
+++ b/docs/_includes/themes/zeppelin/_navigation.html
@@ -31,6 +31,7 @@
                 <li class="title"><span>Run Mode</span></li>
                 <li><a href="{{BASE_PATH}}/quickstart/kubernetes.html">Kubernetes</a></li>
                 <li><a href="{{BASE_PATH}}/quickstart/docker.html">Docker</a></li>
+                <li><a href="{{BASE_PATH}}/quickstart/yarn.html">Yarn</a></li>
                 <li role="separator" class="divider"></li>
                 <li><a href="{{BASE_PATH}}/quickstart/spark_with_zeppelin.html">Spark with Zeppelin</a></li>
                 <li><a href="{{BASE_PATH}}/quickstart/sql_with_zeppelin.html">SQL with Zeppelin</a></li>
@@ -85,6 +86,7 @@
               <ul class="dropdown-menu scrollable-menu">
                 <li class="title"><span>Basics</span></li>
                 <li><a href="{{BASE_PATH}}/setup/basics/how_to_build.html">How to Build Zeppelin</a></li>
+                <li><a href="{{BASE_PATH}}/setup/basics/hadoop_integration.html">Hadoop Integration</a></li>
                 <li><a href="{{BASE_PATH}}/setup/basics/multi_user_support.html">Multi-user Support</a></li>
                 <li role="separator" class="divider"></li>
                 <li class="title"><span>Deployment</span></li>
diff --git a/docs/quickstart/yarn.md b/docs/quickstart/yarn.md
new file mode 100644
index 0000000..c283a2a
--- /dev/null
+++ b/docs/quickstart/yarn.md
@@ -0,0 +1,75 @@
+---
+layout: page
+title: "Zeppelin on Yarn"
+description: "Apache Zeppelin supports running interpreter processes in yarn containers"
+group: usage/interpreter 
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+{% include JB/setup %}
+
+# Zeppelin on Yarn
+
+<div id="toc"></div>
+
+Zeppelin on yarn means running the interpreter process in a yarn container. The key benefit is scalability: you won't run out of memory
+on the Zeppelin server host when you run a large number of interpreter processes.
+
+## Prerequisites
+The following are required for yarn interpreter mode:
+
+* Hadoop client (both 2.x and 3.x are supported) is installed.
+* `$HADOOP_HOME/bin` is in `PATH`, because Zeppelin internally runs the command `hadoop classpath` to get all the Hadoop jars and put them in its classpath.
+* `USE_HADOOP` is set to `true` in `zeppelin-env.sh`.
+
+## Configuration
+
+Yarn interpreter mode must be enabled per interpreter: set `zeppelin.interpreter.launcher` to `yarn` to run that interpreter in yarn mode.
+You can also set the other properties listed in the following table.
+
+<table class="table-configuration">
+  <tr>
+    <th>Name</th>
+    <th>Default Value</th>
+    <th>Description</th>
+  </tr>
+  <tr>
+    <td>zeppelin.interpreter.yarn.resource.memory</td>
+    <td>1024</td>
+    <td>memory for interpreter process, unit: mb</td>
+  </tr>
+  <tr>
+    <td>zeppelin.interpreter.yarn.resource.memoryOverhead</td>
+    <td>Amount of non-heap memory to be allocated per interpreter process in yarn interpreter mode, in MiB unless otherwise specified. This is memory that accounts for things like VM overheads, interned strings, other native overheads, etc.</td>
+  </tr>
+  <tr>
+    <td>zeppelin.interpreter.yarn.resource.cores</td>
+    <td>1</td>
+    <td>cpu cores for interpreter process</td>
+  </tr>
+  <tr>
+    <td>zeppelin.interpreter.yarn.queue</td>
+    <td>default</td>
+    <td>yarn queue name</td>
+  </tr>
+</table>
+
+## Differences with non-yarn interpreter mode (local mode)
+
+There are several differences between yarn interpreter mode and non-yarn interpreter mode (local mode):
+
+* A new yarn app is allocated for the interpreter process.
+* Local path settings that are valid only on the Zeppelin server host won't work in a yarn interpreter process. E.g. if you run the python interpreter in yarn interpreter mode, you need to make sure the python executable set in `zeppelin.python` exists on all nodes of the yarn cluster,
+because the python interpreter may be launched on any node.
+* Don't use it for the spark interpreter. Instead, use Spark's built-in yarn-client or yarn-cluster mode, which is more suitable for the spark interpreter.
\ No newline at end of file
diff --git a/docs/setup/basics/hadoop_integration.md b/docs/setup/basics/hadoop_integration.md
new file mode 100644
index 0000000..9417ede
--- /dev/null
+++ b/docs/setup/basics/hadoop_integration.md
@@ -0,0 +1,39 @@
+---
+layout: page
+title: "How to integrate with hadoop"
+description: "How to integrate with hadoop"
+group: setup/basics
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+{% include JB/setup %}
+
+# Integrate with hadoop
+
+<div id="toc"></div>
+
+Hadoop is an optional component of Zeppelin; you only need it for the following features:
+
+* Use hdfs to store notes. 
+* Use hdfs to store interpreter configuration
+* Use hdfs to store recovery data
+* Launch interpreter in yarn mode
+
+## Requirements
+
+Zeppelin 0.9 doesn't ship with Hadoop dependencies, so you need to include the Hadoop jars yourself via the following steps:
+
+* Hadoop client (both 2.x and 3.x are supported) is installed.
+* `$HADOOP_HOME/bin` is in `PATH`, because Zeppelin internally runs the command `hadoop classpath` to get all the Hadoop jars and put them in its classpath.
+* `USE_HADOOP` is set to `true` in `zeppelin-env.sh`.
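
Note (not part of the commit): the requirements listed in both new pages could be checked with a small script before switching an interpreter to yarn mode. The following is only a sketch under stated assumptions. The helper names `use_hadoop_enabled` and `missing_prerequisites` are hypothetical and are not part of Zeppelin. The sketch assumes `zeppelin-env.sh` sets the variable with a plain `USE_HADOOP=true` or `export USE_HADOOP=true` line, and it does not model how Zeppelin's own scripts interpret the variable.

```python
import re
import shutil

# Matches `USE_HADOOP=value` or `export USE_HADOOP=value`, optionally quoted,
# optionally followed by a trailing comment. Commented-out lines do not match.
_USE_HADOOP_LINE = re.compile(
    r'^\s*(?:export\s+)?USE_HADOOP=["\']?(\w+)["\']?\s*(?:#.*)?$'
)


def use_hadoop_enabled(env_sh_text):
    """Return True if the given zeppelin-env.sh text sets USE_HADOOP to true.

    Hypothetical helper: if the variable is assigned more than once,
    the last assignment wins, as it would in a shell script.
    """
    enabled = False
    for line in env_sh_text.splitlines():
        match = _USE_HADOOP_LINE.match(line)
        if match:
            enabled = match.group(1) == "true"
    return enabled


def missing_prerequisites(env_sh_text, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed.

    `which` is injectable so the check can be exercised without a Hadoop install.
    """
    problems = []
    # Zeppelin runs `hadoop classpath`, so the `hadoop` command must be on PATH.
    if which("hadoop") is None:
        problems.append("`hadoop` not found on PATH (add $HADOOP_HOME/bin)")
    if not use_hadoop_enabled(env_sh_text):
        problems.append("USE_HADOOP is not set to true in zeppelin-env.sh")
    return problems
```

A caller would read the contents of `zeppelin-env.sh` into a string and pass it to `missing_prerequisites`; the location of that file depends on the installation and is deliberately left out here.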
