This is an automated email from the ASF dual-hosted git repository.
ksumit pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-livy.git
The following commit(s) were added to refs/heads/master by this push:
new 314f2def [LIVY-969] Create docker based integration environment for
local debugging (#407)
314f2def is described below
commit 314f2def423954a77eb75f72e9bf7f4e51a6f7f4
Author: Sumit Kumar <[email protected]>
AuthorDate: Fri Jun 16 17:03:57 2023 -0700
[LIVY-969] Create docker based integration environment for local debugging
(#407)
* [LIVY-969] Create docker based integration environment for local
debugging
* Add missing newlines, move customization steps towards the end in
Dockerfile so we can reuse previous layers and avoid recreating it for every
image build
* Revert changes in module list, still update the scala-maven-plugin
version to avoid issues compiling against 2.12
* Upgrade spark3 version to spark 3.2.3
* Fix Dockerfile path in ci image build pipeline,
* Update instructions to use livy-ci container image name
---------
Co-authored-by: Sumit Kumar <[email protected]>
---
.github/workflows/build-ci-image.yaml | 2 +-
README.md | 8 +-
dev/docker/README.md | 116 +++++++++++++++++++++
dev/docker/build-images.sh | 68 ++++++++++++
dev/docker/{ => livy-dev-base}/Dockerfile | 9 +-
dev/docker/livy-dev-cluster/conf/livy/livy-env.sh | 18 ++++
dev/docker/livy-dev-cluster/conf/livy/livy.conf | 20 ++++
.../livy-dev-cluster/conf/livy/log4j.properties | 33 ++++++
.../livy-dev-cluster/conf/master/log4j.properties | 35 +++++++
.../conf/master/spark-default.conf | 26 +++++
.../livy-dev-cluster/conf/master/spark-env.sh | 18 ++++
.../livy-dev-cluster/conf/worker/log4j.properties | 35 +++++++
.../conf/worker/spark-default.conf | 24 +++++
.../livy-dev-cluster/conf/worker/spark-env.sh | 18 ++++
dev/docker/livy-dev-cluster/docker-compose.yml | 99 ++++++++++++++++++
dev/docker/livy-dev-server/Dockerfile | 40 +++++++
dev/docker/livy-dev-spark/Dockerfile | 58 +++++++++++
pom.xml | 12 +--
18 files changed, 625 insertions(+), 14 deletions(-)
diff --git a/.github/workflows/build-ci-image.yaml
b/.github/workflows/build-ci-image.yaml
index 1dc0b60a..17437676 100644
--- a/.github/workflows/build-ci-image.yaml
+++ b/.github/workflows/build-ci-image.yaml
@@ -19,7 +19,7 @@ on:
push:
branches: ["main"]
paths:
- - 'dev/docker/Dockerfile'
+ - 'dev/docker/livy-dev-base/Dockerfile'
jobs:
docker-build:
runs-on: ubuntu-latest
diff --git a/README.md b/README.md
index f3ddbad8..b653569f 100644
--- a/README.md
+++ b/README.md
@@ -71,13 +71,13 @@ cd incubator-livy
mvn package
```
-You can also use the provided [Dockerfile](./Dockerfile):
+You can also use the provided
[Dockerfile](./dev/docker/livy-dev-base/Dockerfile):
```
git clone https://github.com/apache/incubator-livy.git
-cd incubator-livy/dev/docker
-docker build -t livy .
-docker run --rm -it -v $(pwd)/../../:/workspace -v $HOME/.m2:/root/.m2 livy
mvn package
+cd incubator-livy
+docker build -t livy-ci dev/docker/livy-dev-base/
+docker run --rm -it -v $(pwd):/workspace -v $HOME/.m2:/root/.m2 livy-ci mvn
package
```
> **Note**: The `docker run` command maps the maven repository to your host
> machine's maven cache so subsequent runs will not need to download
> dependencies.
diff --git a/dev/docker/README.md b/dev/docker/README.md
new file mode 100644
index 00000000..fbf3ed16
--- /dev/null
+++ b/dev/docker/README.md
@@ -0,0 +1,116 @@
+# Livy with standalone Spark Cluster
+## Prerequisites
+The following steps use Ubuntu as the development environment, but most of the instructions can be adapted to other operating systems.
+* Install WSL if on Windows; instructions are available [here](https://ubuntu.com/tutorials/install-ubuntu-on-wsl2-on-windows-11-with-gui-support)
+* Install the Docker engine; instructions are available [here](https://docs.docker.com/engine/install/ubuntu/)
+* Install docker-compose; instructions are available [here](https://docs.docker.com/compose/install/)
+
+## Standalone cluster using docker-compose
+### Build the current Livy branch and copy it to the appropriate folder
+```
+$ mvn clean package -Pscala-2.12 -Pspark3 -DskipITs -DskipTests
+```
+This generates a zip file for Livy similar to `assembly/target/apache-livy-0.8.0-incubating-SNAPSHOT_2.12-bin.zip`. Using the `clean` target avoids mixing in previously built dependencies/versions.
+
+### Build container images locally
+* Build the livy-dev-base, livy-dev-spark, and livy-dev-server container images using the provided script `build-images.sh`
+```
+livy/dev/docker$ ls
+README.md build-images.sh livy-dev-base livy-dev-cluster livy-dev-server
livy-dev-spark
+livy/dev/docker$ ./build-images.sh
+```
+#### Customizing container images
+`build-images.sh` downloads released artifacts from Apache's repository; however, private builds can be copied into the respective container directories to build container images with private artifacts instead.
+```
+livy-dev-spark uses hadoop and spark tarballs
+livy-dev-server uses livy zip file
+```
+
+For quicker iteration, copy the modified jars into the specific container directories and update the corresponding `Dockerfile` to replace those jars as additional steps inside the image. The provided `Dockerfile`s contain example lines that can be uncommented/modified to achieve this.
+
+The `livy-dev-cluster` folder contains a `conf` folder with customizable configurations (environment, `.conf`, and `log4j.properties` files) that can be updated to suit specific needs. Restart the cluster after making changes (no image rebuild is required).
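For example, the following sketch bumps Livy's root log level to DEBUG and restarts only the livy service (`set_root_level` is a hypothetical helper, not part of the repository; it assumes GNU `sed` and that you are in `dev/docker/livy-dev-cluster`):

```shell
# set_root_level FILE LEVEL -- rewrite the log4j root category in place
set_root_level() {
  sed -i "s/^log4j.rootCategory=[A-Z]*/log4j.rootCategory=$2/" "$1"
}

# Usage against the cluster's Livy config, then restart just that service:
# set_root_level conf/livy/log4j.properties DEBUG
# docker-compose restart livy
```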
+### Launching the cluster
+```
+livy/dev/docker/livy-dev-cluster$ docker-compose up
+Starting spark-master ... done
+Starting spark-worker-1 ... done
+Starting livy ... done
+Attaching to spark-worker-1, spark-master, livy
+```
+### UIs
+* Livy UI at http://localhost:8998/
+* Spark Master at spark://master:7077 (web UI at http://localhost:8080/)
+* Spark Worker at spark://spark-worker-1:8881 (web UI at http://localhost:8081/)
+
+### Run spark shell
+* Log in to the spark-master, spark-worker, or livy container using the Docker CLI
+```
+$ docker exec -it spark-master /bin/bash
+root@master:/opt/spark-3.2.3-bin-without-hadoop# spark-shell
+Setting default log level to "WARN".
+To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
setLogLevel(newLevel).
+2023-01-27 19:32:37,469 WARN util.NativeCodeLoader: Unable to load
native-hadoop library for your platform... using builtin-java classes where
applicable
+Spark context Web UI available at http://localhost:4040
+Spark context available as 'sc' (master = spark://master:7077, app id =
app-20230127193238-0002).
+Spark session available as 'spark'.
+Welcome to
+ ____ __
+ / __/__ ___ _____/ /__
+ _\ \/ _ \/ _ `/ __/ '_/
+ /___/ .__/\_,_/_/ /_/\_\ version 3.2.3
+ /_/
+
+Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 1.8.0_292)
+Type in expressions to have them evaluated.
+Type :help for more information.
+
+scala> println("Hello world!")
+Hello world!
+```
+### Submit requests to Livy using the REST API
+Log in to the livy container directly and submit requests to the REST endpoint (port 8998 is also forwarded to the host)
+```
+# Create a new session
+curl -s -X POST -d '{"kind":
"spark","driverMemory":"512M","executorMemory":"512M"}' -H "Content-Type:
application/json" http://localhost:8998/sessions/ | jq
+
+# Check session state
+curl -s -X GET -H "Content-Type: application/json"
http://localhost:8998/sessions/ | jq -r '.sessions[] | [ .id, .state ] | @tsv'
+
+# Submit the simplest `1+1` statement
+curl -s -X POST -d '{"code": "1 + 1"}' -H "Content-Type: application/json"
http://localhost:8998/sessions/0/statements | jq
+
+# Check for statement status
+curl -s -X GET -H "Content-Type: application/json"
http://localhost:8998/sessions/0/statements | jq -r '.statements[] | [
.id,.state,.progress,.output.status,.code ] | @tsv'
+
+# Submit simple spark code
+curl -s -X POST -d '{"code": "val data = Array(1,2,3);
sc.parallelize(data).count"}' -H "Content-Type: application/json"
http://localhost:8998/sessions/0/statements | jq
+
+# Check for statement status
+curl -s -X GET -H "Content-Type: application/json"
http://localhost:8998/sessions/0/statements | jq -r '.statements[] | [
.id,.state,.progress,.output.status,.code ] | @tsv'
+
+# Submit simple SQL code (this setup does not yet have a Hive metastore configured)
+curl -s -X POST -d '{"kind": "sql", "code": "show databases"}' -H "Content-Type: application/json" http://localhost:8998/sessions/0/statements | jq
+
+# Check for statement status
+curl -s -X GET -H "Content-Type: application/json"
http://localhost:8998/sessions/0/statements | jq -r '.statements[] | [
.id,.state,.progress,.output.status,.code ] | @tsv'
+```
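A small helper can wrap the status-polling calls above (a sketch, not part of the repository; assumes `curl`, `jq`, and a running cluster on port 8998):

```shell
# statement_url SESSION_ID -- endpoint for a session's statements
statement_url() {
  echo "http://localhost:8998/sessions/$1/statements"
}

# wait_for_statement SESSION_ID STATEMENT_ID -- poll until the statement
# leaves the waiting/running states, then print its final state
wait_for_statement() {
  local url state
  url="$(statement_url "$1")/$2"
  while :; do
    state=$(curl -s "$url" | jq -r '.state')
    case "$state" in
      waiting|running) sleep 1 ;;
      *) echo "$state"; return ;;
    esac
  done
}
```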
+### Debugging Livy/Spark/Hadoop
+`livy-dev-cluster` has a `conf` directory each for spark-master, spark-worker, and livy. Configuration files in those directories can be modified before launching the cluster, for example:
+1. `Setting log level` - the `log4j.properties` files in the `livy-dev-cluster` folder can be modified to change the log level for the root logger as well as for specific packages
+2. `Testing private changes` - copy private jars into the respective container folder, update the corresponding Dockerfile to copy/replace those jars at the respective paths on the container image, and rebuild all the images (Note: livy-dev-server builds on top of livy-dev-spark, which builds on top of livy-dev-base)
+3. `Remote debugging` - livy-env.sh already configures the Livy server to listen for a remote debugger on port 9010. Follow your IDE's guidance on attaching a remote debugger to the daemon's JDWP port. Instructions for IntelliJ are available [here](https://www.jetbrains.com/help/idea/tutorial-remote-debug.html) and for Eclipse, [here](https://help.eclipse.org/latest/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-remotejava_launch_config.htm)
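The JDWP setup can be sketched as below (`suspend=n` lets the server start without waiting for a debugger; docker-compose forwards port 9010 to the host). The commented `jdb` invocation is one way to verify the port is reachable; IDE setup uses the same host/port:

```shell
# JDWP agent options as configured in livy-env.sh for the Livy server
JDWP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,address=9010,suspend=n"
export LIVY_SERVER_JAVA_OPTS="$JDWP_OPTS"

# From the host, attach the JDK's command-line debugger to the forwarded port:
# jdb -attach localhost:9010
```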
+### Terminate the cluster
+Press `CTRL-C` to terminate
+```
+spark-worker-1 | 2023-01-27 19:16:47,921 INFO
shuffle.ExternalShuffleBlockResolver: Application app-20230127191546-0000
removed, cleanupLocalDirs = true
+^CGracefully stopping... (press Ctrl+C again to force)
+Stopping spark-worker-1 ... done
+Stopping spark-master ... done
+```
+
+## Common Gotchas
+1. Use `docker-compose down` to clean up all the resources created for the
cluster
+2. Launch a shell in a container created from an image to inspect its state
+```
+docker run -it [imageId | imageName] /bin/bash
+```
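Putting the sections of this README together, one iteration of the full build-and-relaunch loop might look like the sketch below (an illustration only; it assumes the commands run from the repository root and the image names used by `build-images.sh`):

```shell
# Directory holding the docker-compose definition for the dev cluster
CLUSTER_DIR=dev/docker/livy-dev-cluster

# Rebuild Livy and the images, then restart the cluster detached
rebuild_and_relaunch() {
  mvn clean package -Pscala-2.12 -Pspark3 -DskipITs -DskipTests
  ./dev/docker/build-images.sh
  (cd "$CLUSTER_DIR" && docker-compose down && docker-compose up -d)
}
```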
diff --git a/dev/docker/build-images.sh b/dev/docker/build-images.sh
new file mode 100644
index 00000000..c8795872
--- /dev/null
+++ b/dev/docker/build-images.sh
@@ -0,0 +1,68 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Fail fast on errors and echo each command
+set -ex
+
+SCRIPT_DIR=$(realpath "$(dirname ${0})")
+echo "Running from ${SCRIPT_DIR}/${0}"
+
+APACHE_ARCHIVE_ROOT=http://archive.apache.org/dist
+HADOOP_VERSION=3.3.1
+HADOOP_PACKAGE="hadoop-${HADOOP_VERSION}.tar.gz"
+HIVE_VERSION=2.3.9
+HIVE_PACKAGE="apache-hive-${HIVE_VERSION}-bin.tar.gz"
+SPARK_VERSION=3.2.3
+SPARK_PACKAGE="spark-${SPARK_VERSION}-bin-without-hadoop.tgz"
+SCALA_VERSION=2.12
+LIVY_VERSION="0.8.0-incubating-SNAPSHOT_${SCALA_VERSION}"
+LIVY_PACKAGE="apache-livy-${LIVY_VERSION}-bin.zip"
+LOCALLY_BUILT_LIVY_PACKAGE="${SCRIPT_DIR}/../../assembly/target/${LIVY_PACKAGE}"
+
+# Download hadoop if needed
+if [ ! -f "${SCRIPT_DIR}/livy-dev-spark/${HADOOP_PACKAGE}" ]; then
+ curl --fail -L --retry 3 -o
"${SCRIPT_DIR}/livy-dev-spark/${HADOOP_PACKAGE}" \
+
"${APACHE_ARCHIVE_ROOT}/hadoop/common/hadoop-${HADOOP_VERSION}/${HADOOP_PACKAGE}"
+fi
+
+# Download hive if needed
+if [ ! -f "${SCRIPT_DIR}/livy-dev-spark/${HIVE_PACKAGE}" ]; then
+ curl --fail -L --retry 3 -o "${SCRIPT_DIR}/livy-dev-spark/${HIVE_PACKAGE}"
\
+ "${APACHE_ARCHIVE_ROOT}/hive/hive-${HIVE_VERSION}/${HIVE_PACKAGE}"
+fi
+
+# Download spark if needed
+if [ ! -f "${SCRIPT_DIR}/livy-dev-spark/${SPARK_PACKAGE}" ]; then
+ curl --fail -L --retry 3 -o
"${SCRIPT_DIR}/livy-dev-spark/${SPARK_PACKAGE}" \
+ "${APACHE_ARCHIVE_ROOT}/spark/spark-${SPARK_VERSION}/${SPARK_PACKAGE}"
+fi
+
+# Check if livy build exists locally
+if [[ -f "${LOCALLY_BUILT_LIVY_PACKAGE}" && ! -f
"${SCRIPT_DIR}/livy-dev-server/${LIVY_PACKAGE}" ]]; then
+ cp ${LOCALLY_BUILT_LIVY_PACKAGE}
"${SCRIPT_DIR}/livy-dev-server/${LIVY_PACKAGE}"
+fi
+
+# Download livy if needed
+if [ ! -f "${SCRIPT_DIR}/livy-dev-server/${LIVY_PACKAGE}" ]; then
+ curl --fail -L --retry 3 -o
"${SCRIPT_DIR}/livy-dev-server/${LIVY_PACKAGE}" \
+ "${APACHE_ARCHIVE_ROOT}/incubator/livy/${LIVY_VERSION}/${LIVY_PACKAGE}"
+fi
+
+
+docker build -t livy-dev-base "${SCRIPT_DIR}/livy-dev-base/"
+docker build -t livy-dev-spark "${SCRIPT_DIR}/livy-dev-spark/" --build-arg
HADOOP_VERSION=${HADOOP_VERSION} --build-arg SPARK_VERSION=${SPARK_VERSION}
+docker build -t livy-dev-server "${SCRIPT_DIR}/livy-dev-server/" --build-arg
LIVY_VERSION=${LIVY_VERSION}
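The download-if-missing guards above all follow one pattern, which could be factored into a helper like this (a hypothetical refactoring, not part of the script):

```shell
# fetch_if_missing DEST URL -- download URL to DEST only when DEST is absent,
# using the same curl flags as build-images.sh
fetch_if_missing() {
  local dest="$1" url="$2"
  if [ ! -f "$dest" ]; then
    curl --fail -L --retry 3 -o "$dest" "$url"
  fi
}
```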
diff --git a/dev/docker/Dockerfile b/dev/docker/livy-dev-base/Dockerfile
similarity index 94%
rename from dev/docker/Dockerfile
rename to dev/docker/livy-dev-base/Dockerfile
index 3f8fd844..8d4f101e 100644
--- a/dev/docker/Dockerfile
+++ b/dev/docker/livy-dev-base/Dockerfile
@@ -25,7 +25,8 @@ ENV LANG="en_US.UTF-8" \
LANGUAGE="en_US.UTF-8" \
LC_ALL="en_US.UTF-8"
-# Install necessary dependencies for build/test
+# Install necessary dependencies for build/test/debug
+# Use `lsof -i -P -n` to find open ports
RUN apt-get install -qq \
apt-transport-https \
curl \
@@ -38,7 +39,9 @@ RUN apt-get install -qq \
python3-pip \
software-properties-common \
vim \
- wget
+ wget \
+ telnet \
+ lsof
# R 3.x install - ensure to add the signing key per
https://cran.r-project.org/bin/linux/ubuntu/olderreleasesREADME.html
RUN add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu
xenial-cran35/' && \
@@ -70,4 +73,4 @@ RUN python -m pip install -U "pip < 21.0" && \
RUN python3 -m pip install -U pip
WORKDIR /workspace
-#
https://archive.apache.org/dist/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz
\ No newline at end of file
+
diff --git a/dev/docker/livy-dev-cluster/conf/livy/livy-env.sh
b/dev/docker/livy-dev-cluster/conf/livy/livy-env.sh
new file mode 100644
index 00000000..02f4312d
--- /dev/null
+++ b/dev/docker/livy-dev-cluster/conf/livy/livy-env.sh
@@ -0,0 +1,18 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+export
LIVY_SERVER_JAVA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,address=9010,suspend=n"
diff --git a/dev/docker/livy-dev-cluster/conf/livy/livy.conf
b/dev/docker/livy-dev-cluster/conf/livy/livy.conf
new file mode 100644
index 00000000..592a63c7
--- /dev/null
+++ b/dev/docker/livy-dev-cluster/conf/livy/livy.conf
@@ -0,0 +1,20 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+livy.spark.master = spark://master:7077
+livy.file.local-dir-whitelist = /
+livy.spark.driver.memory=1g
diff --git a/dev/docker/livy-dev-cluster/conf/livy/log4j.properties
b/dev/docker/livy-dev-cluster/conf/livy/log4j.properties
new file mode 100644
index 00000000..c3a2d09b
--- /dev/null
+++ b/dev/docker/livy-dev-cluster/conf/livy/log4j.properties
@@ -0,0 +1,33 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+log4j.rootCategory=INFO, console, DRFA
+
+log4j.appender.console=org.apache.log4j.ConsoleAppender
+log4j.appender.console.target=System.err
+log4j.appender.console.layout=org.apache.log4j.PatternLayout
+log4j.appender.console.layout.ConversionPattern=%d %p %c{1} [%t]: %m%n
+
+log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
+log4j.appender.DRFA.File=/logs/livy-server.log
+# Rollover at midnight
+log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
+log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
+# Pattern format: Date LogLevel LoggerName LogMessage
+log4j.appender.DRFA.layout.ConversionPattern=%d %p %c{1} [%t]: %m%n
+
+log4j.logger.org.eclipse.jetty=WARN
diff --git a/dev/docker/livy-dev-cluster/conf/master/log4j.properties
b/dev/docker/livy-dev-cluster/conf/master/log4j.properties
new file mode 100644
index 00000000..644032d2
--- /dev/null
+++ b/dev/docker/livy-dev-cluster/conf/master/log4j.properties
@@ -0,0 +1,35 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+log4j.rootCategory=INFO, console, DRFA
+log4jspark.log.dir=/logs
+log4jspark.log.file=spark-master.log
+
+log4j.appender.console=org.apache.log4j.ConsoleAppender
+log4j.appender.console.target=System.err
+log4j.appender.console.layout=org.apache.log4j.PatternLayout
+log4j.appender.console.layout.ConversionPattern=%d %p %c{1} [%t]: %m%n
+
+log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
+log4j.appender.DRFA.File=${log4jspark.log.dir}/${log4jspark.log.file}
+# Rollover at midnight
+log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
+log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
+# Pattern format: Date LogLevel LoggerName LogMessage
+log4j.appender.DRFA.layout.ConversionPattern=%d %p %c{1} [%t]: %m%n
+
+log4j.logger.org.eclipse.jetty=WARN
diff --git a/dev/docker/livy-dev-cluster/conf/master/spark-default.conf
b/dev/docker/livy-dev-cluster/conf/master/spark-default.conf
new file mode 100644
index 00000000..338d57df
--- /dev/null
+++ b/dev/docker/livy-dev-cluster/conf/master/spark-default.conf
@@ -0,0 +1,26 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+spark.driver.port 7001
+spark.fileserver.port 7002
+spark.broadcast.port 7003
+spark.replClassServer.port 7004
+spark.blockManager.port 7005
+spark.driver.memory 1024m
+
+spark.broadcast.factory org.apache.spark.broadcast.HttpBroadcastFactory
+spark.port.maxRetries 4
diff --git a/dev/docker/livy-dev-cluster/conf/master/spark-env.sh
b/dev/docker/livy-dev-cluster/conf/master/spark-env.sh
new file mode 100644
index 00000000..c0c8ab52
--- /dev/null
+++ b/dev/docker/livy-dev-cluster/conf/master/spark-env.sh
@@ -0,0 +1,18 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Nothing here yet
diff --git a/dev/docker/livy-dev-cluster/conf/worker/log4j.properties
b/dev/docker/livy-dev-cluster/conf/worker/log4j.properties
new file mode 100644
index 00000000..23c5ee5d
--- /dev/null
+++ b/dev/docker/livy-dev-cluster/conf/worker/log4j.properties
@@ -0,0 +1,35 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+log4j.rootCategory=INFO, console, DRFA
+log4jspark.log.dir=/logs
+log4jspark.log.file=spark-worker.log
+
+log4j.appender.console=org.apache.log4j.ConsoleAppender
+log4j.appender.console.target=System.err
+log4j.appender.console.layout=org.apache.log4j.PatternLayout
+log4j.appender.console.layout.ConversionPattern=%d %p %c{1} [%t]: %m%n
+
+log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
+log4j.appender.DRFA.File=${log4jspark.log.dir}/${log4jspark.log.file}
+# Rollover at midnight
+log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
+log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
+# Pattern format: Date LogLevel LoggerName LogMessage
+log4j.appender.DRFA.layout.ConversionPattern=%d %p %c{1} [%t]: %m%n
+
+log4j.logger.org.eclipse.jetty=WARN
diff --git a/dev/docker/livy-dev-cluster/conf/worker/spark-default.conf
b/dev/docker/livy-dev-cluster/conf/worker/spark-default.conf
new file mode 100644
index 00000000..f984e455
--- /dev/null
+++ b/dev/docker/livy-dev-cluster/conf/worker/spark-default.conf
@@ -0,0 +1,24 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+spark.fileserver.port 7012
+spark.broadcast.port 7013
+spark.replClassServer.port 7014
+spark.blockManager.port 7015
+
+spark.broadcast.factory org.apache.spark.broadcast.HttpBroadcastFactory
+spark.port.maxRetries 4
diff --git a/dev/docker/livy-dev-cluster/conf/worker/spark-env.sh
b/dev/docker/livy-dev-cluster/conf/worker/spark-env.sh
new file mode 100644
index 00000000..c0c8ab52
--- /dev/null
+++ b/dev/docker/livy-dev-cluster/conf/worker/spark-env.sh
@@ -0,0 +1,18 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Nothing here yet
diff --git a/dev/docker/livy-dev-cluster/docker-compose.yml
b/dev/docker/livy-dev-cluster/docker-compose.yml
new file mode 100644
index 00000000..c69134d7
--- /dev/null
+++ b/dev/docker/livy-dev-cluster/docker-compose.yml
@@ -0,0 +1,99 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+version: '2'
+
+services:
+ spark-master:
+ image: livy-dev-spark:latest
+ command: bin/spark-class org.apache.spark.deploy.master.Master -h master
+ hostname: master
+ container_name: spark-master
+ environment:
+ MASTER: spark://master:7077
+ SPARK_CONF_DIR: /conf
+ SPARK_PUBLIC_DNS: localhost
+ expose:
+ - 7001
+ - 7002
+ - 7003
+ - 7004
+ - 7005
+ - 7077
+ - 6066
+ - 4040
+ ports:
+ - 4040:4040
+ - 6066:6066
+ - 7077:7077
+ - 8080:8080
+ volumes:
+ - ./conf/master:/conf
+ - ./data:/tmp/data
+ - ./logs/master:/logs
+
+ spark-worker-1:
+ image: livy-dev-spark:latest
+ command: bin/spark-class org.apache.spark.deploy.worker.Worker
spark://master:7077
+ container_name: spark-worker-1
+ environment:
+ SPARK_CONF_DIR: /conf
+ SPARK_WORKER_CORES: 1
+ SPARK_WORKER_MEMORY: 1g
+ SPARK_WORKER_PORT: 8881
+ SPARK_WORKER_WEBUI_PORT: 8081
+ SPARK_PUBLIC_DNS: localhost
+ expose:
+ - 7012
+ - 7013
+ - 7014
+ - 7015
+ - 8881
+ ports:
+ - 8081:8081
+ volumes:
+ - ./conf/worker:/conf
+ - ./data:/tmp/data
+ - ./logs/worker:/logs
+ depends_on:
+ - "spark-master"
+
+ livy:
+ image: livy-dev-server:latest
+ command: bin/livy-server
+ container_name: livy
+ environment:
+ SPARK_CONF_DIR: /conf
+      SPARK_DRIVER_CORES: 1
+ SPARK_DRIVER_MEMORY: 1g
+ SPARK_MASTER_ENDPOINT: master
+ SPARK_MASTER_PORT: 7077
+ LIVY_CONF_DIR: /conf
+ LIVY_LOG_DIR: /logs
+ LIVY_FILE_LOCAL_DIR_WHITELIST: /opt/jars
+ expose:
+ # remote debug port
+ - 9010
+ ports:
+ - 8998:8998
+ - 9010:9010
+ volumes:
+ - ./conf/livy:/conf
+ - ./logs/livy:/logs
+ depends_on:
+ - "spark-master"
+ - "spark-worker-1"
diff --git a/dev/docker/livy-dev-server/Dockerfile
b/dev/docker/livy-dev-server/Dockerfile
new file mode 100644
index 00000000..689645aa
--- /dev/null
+++ b/dev/docker/livy-dev-server/Dockerfile
@@ -0,0 +1,40 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+FROM livy-dev-spark:latest
+
+ARG LIVY_VERSION=0.8.0-incubating-SNAPSHOT
+ARG ROOT_PATH=/opt
+
+RUN apt-get update \
+ && apt-get install -y unzip
+
+ENV LIVY_HOME=${ROOT_PATH}/livy
+ENV LIVY_PACKAGE=apache-livy-${LIVY_VERSION}-bin
+
+COPY ${LIVY_PACKAGE}.zip ${LIVY_PACKAGE}.zip
+
+RUN unzip ${LIVY_PACKAGE}.zip 1>/dev/null \
+ && mv ${LIVY_PACKAGE} ${ROOT_PATH}/ \
+ && ln -s ${ROOT_PATH}/${LIVY_PACKAGE} ${LIVY_HOME} \
+ && chown -R root:root ${LIVY_HOME} \
+ && rm ${LIVY_PACKAGE}.zip
+
+# Uncomment following line or add more such lines to replace the default jars
with private builds.
+# COPY livy-core_2.12-0.8.0-incubating-SNAPSHOT.jar
${SPARK_HOME}/repl_2.12-jars/livy-core_2.12-0.8.0-incubating-SNAPSHOT.jar
+
+WORKDIR ${LIVY_HOME}
diff --git a/dev/docker/livy-dev-spark/Dockerfile
b/dev/docker/livy-dev-spark/Dockerfile
new file mode 100644
index 00000000..18927988
--- /dev/null
+++ b/dev/docker/livy-dev-spark/Dockerfile
@@ -0,0 +1,58 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+FROM livy-dev-base:latest
+
+ARG HADOOP_VERSION=3.3.1
+ARG SPARK_VERSION=3.2.3
+ARG ROOT_PATH=/opt
+
+RUN mkdir -p ${ROOT_PATH}
+
+ENV HADOOP_HOME=${ROOT_PATH}/hadoop
+ENV HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
+ENV PATH=${PATH}:${HADOOP_HOME}/bin
+ENV HADOOP_PACKAGE=hadoop-${HADOOP_VERSION}
+
+COPY ${HADOOP_PACKAGE}.tar.gz ${HADOOP_PACKAGE}.tar.gz
+
+RUN gunzip ${HADOOP_PACKAGE}.tar.gz \
+ && tar -xf ${HADOOP_PACKAGE}.tar -C ${ROOT_PATH}/ \
+ && ln -s ${ROOT_PATH}/${HADOOP_PACKAGE} ${HADOOP_HOME} \
+ && rm -rf ${HADOOP_HOME}/share/doc \
+ && chown -R root:root ${HADOOP_HOME} \
+ && rm ${HADOOP_PACKAGE}.tar
+
+ENV SPARK_HOME=${ROOT_PATH}/spark
+ENV
SPARK_DIST_CLASSPATH="${HADOOP_HOME}/etc/hadoop/*:${HADOOP_HOME}/share/hadoop/common/lib/*:${HADOOP_HOME}/share/hadoop/common/*:${HADOOP_HOME}/share/hadoop/hdfs/*:${HADOOP_HOME}/share/hadoop/hdfs/lib/*:${HADOOP_HOME}/share/hadoop/hdfs/*:${HADOOP_HOME}/share/hadoop/yarn/lib/*:${HADOOP_HOME}/share/hadoop/yarn/*:${HADOOP_HOME}/share/hadoop/mapreduce/lib/*:${HADOOP_HOME}/share/hadoop/mapreduce/*:${HADOOP_HOME}/share/hadoop/tools/lib/*"
+ENV PATH=${PATH}:${SPARK_HOME}/bin
+ENV SPARK_PACKAGE=spark-${SPARK_VERSION}-bin-without-hadoop
+
+COPY ${SPARK_PACKAGE}.tgz ${SPARK_PACKAGE}.tgz
+
+RUN gunzip ${SPARK_PACKAGE}.tgz \
+ && tar -xf ${SPARK_PACKAGE}.tar -C ${ROOT_PATH}/ \
+ && ln -s ${ROOT_PATH}/${SPARK_PACKAGE} ${SPARK_HOME} \
+ && chown -R root:root ${SPARK_HOME} \
+ && rm ${SPARK_PACKAGE}.tar
+
+# Uncomment following line or add more such lines to replace the default jars
with private builds.
+# COPY hadoop-streaming-3.3.1.jar
${HADOOP_HOME}/share/hadoop/tools/lib/hadoop-streaming-3.3.1.jar
+
+# Uncomment following line or add more such lines to replace the default jars
with private builds.
+# COPY spark-repl_2.12-3.2.3.jar ${SPARK_HOME}/jars/spark-repl_2.12-3.2.3.jar
+
+WORKDIR ${SPARK_HOME}
diff --git a/pom.xml b/pom.xml
index 52c64c4e..717750a0 100644
--- a/pom.xml
+++ b/pom.xml
@@ -100,7 +100,6 @@
<mockito.version>1.10.19</mockito.version>
<netty.version>4.1.86.Final</netty.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
- <py4j.version>0.10.7</py4j.version>
<scalatest.version>3.0.8</scalatest.version>
<scalatra.version>2.6.5</scalatra.version>
<java.version>1.8</java.version>
@@ -728,7 +727,7 @@
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
- <version>4.2.0</version>
+ <version>4.3.0</version>
<executions>
<execution>
<goals>
@@ -1151,7 +1150,7 @@
<id>scala-2.12</id>
<properties>
<scala.binary.version>2.12</scala.binary.version>
- <scala.version>2.12.10</scala.version>
+ <scala.version>2.12.15</scala.version>
</properties>
</profile>
@@ -1180,10 +1179,11 @@
<profile>
<id>spark3</id>
<properties>
- <spark.version>3.0.0</spark.version>
+ <spark.version>3.2.3</spark.version>
<java.version>1.8</java.version>
- <py4j.version>0.10.9</py4j.version>
- <json4s.version>3.6.6</json4s.version>
+ <py4j.version>0.10.9.7</py4j.version>
+ <json4s.version>3.7.0-M11</json4s.version>
+ <netty.version>4.1.92.Final</netty.version>
<spark.bin.name>spark-${spark.version}-bin-hadoop${hadoop.major-minor.version}</spark.bin.name>
<spark.bin.download.url>
https://archive.apache.org/dist/spark/spark-${spark.version}/${spark.bin.name}.tgz