This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/spark-kubernetes-operator.git
The following commit(s) were added to refs/heads/main by this push:
new 056c05b [SPARK-56131] Use `jlink` custom JRE runtime to minimize Docker image size
056c05b is described below
commit 056c05bf34675893a81a4684ce0f10452ed3e876
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Sat Mar 21 15:43:17 2026 -0700
[SPARK-56131] Use `jlink` custom JRE runtime to minimize Docker image size
### What changes were proposed in this pull request?
This PR reduces the Docker image from 220MB to 184MB (approximately 16.3%
reduction) by using `jlink` to create a custom minimal JRE instead of using the
pre-built `zulu-openjdk-alpine:26-jre` image.
```diff
 $ docker images
 IMAGE                                             ID             DISK USAGE   CONTENT SIZE   EXTRA
-apache/spark-kubernetes-operator:0.9.0-SNAPSHOT   3713f0e80686   636MB        220MB          U
+apache/spark-kubernetes-operator:0.9.0-SNAPSHOT   a877fb1be597   418MB        184MB
```
The Dockerfile now has a three-stage build:
1. **builder**: Builds the application JAR with Gradle
2. **jlink**: Uses `jdeps` to analyze module dependencies and `jlink` to
create a stripped-down custom JRE
3. **final**: Uses `alpine:3.23` as the base image with the custom JRE
copied in
The `jlink` stage includes the following optimizations:
- `--strip-debug`: Removes debug information
- `--compress zip-6`: Compresses resources at ZIP level 6
- `--no-header-files` / `--no-man-pages`: Removes unnecessary files
Additional modules (`jdk.security.auth`, `jdk.httpserver`,
`jdk.unsupported`, `jdk.crypto.ec`, `java.management`) are explicitly included
to ensure runtime compatibility.
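The module-list step described above can be sketched in plain shell. This is only an illustration under stated assumptions, not the code from the commit: `merge_modules` is a hypothetical helper, and the `DETECTED` value is an invented stand-in for whatever `jdeps --print-module-deps` prints for the real JAR.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): merge two comma-separated
# module lists, dropping empty entries and duplicates.
merge_modules() {
  printf '%s,%s\n' "$1" "$2" | tr ',' '\n' | sed '/^$/d' | LC_ALL=C sort -u | paste -sd, -
}

# Assumed sample of `jdeps --print-module-deps` output; the real value
# depends on the application JAR.
DETECTED="java.base,java.logging,java.management"

# Extra modules named in the commit message.
EXTRA="jdk.httpserver,jdk.unsupported,jdk.crypto.ec,jdk.security.auth,java.management"

MODULES=$(merge_modules "$DETECTED" "$EXTRA")
echo "$MODULES"
```

The resulting string is the kind of value that would be passed to `jlink --add-modules`. The commit itself simply appends the extra names to the `jdeps` output; the de-duplication here is an optional tidy-up.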
### Why are the changes needed?
A custom JRE built with `jlink` contains only the modules required by the
application, resulting in a significantly smaller Docker image compared to
using the full JRE base image. This reduces container pull times, storage
costs, and attack surface.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Manually tested by running `gradle buildDockerImage`.
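As a possible complement to the manual build, one could check that the custom JRE really contains the explicitly added modules. The sketch below is an assumption, not something this PR ships: `missing_modules` is a hypothetical helper that reads `java --list-modules` style output (one `name@version` entry per line) on stdin and prints any required module that is absent.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): print each required module
# (given as arguments) that does not appear in the `name@version` list read
# from stdin. Returns non-zero if at least one module is missing.
missing_modules() {
  present=$(sed 's/@.*//')   # strip the "@version" suffix from every line
  rc=0
  for m in "$@"; do
    if ! printf '%s\n' "$present" | grep -qxF "$m"; then
      echo "$m"
      rc=1
    fi
  done
  return $rc
}

# Assumed usage against the built image (requires Docker and assumes `java`
# is reachable on the image's PATH):
#   docker run --rm --entrypoint java \
#     apache/spark-kubernetes-operator:0.9.0-SNAPSHOT --list-modules \
#     | missing_modules jdk.security.auth jdk.httpserver jdk.unsupported \
#         jdk.crypto.ec java.management
```

An empty output with exit status 0 would indicate that every listed module is present in the runtime image.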
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.6)
Closes #571 from dongjoon-hyun/SPARK-56131.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
build-tools/docker/Dockerfile | 30 ++++++++++++++++++++++++++++--
1 file changed, 28 insertions(+), 2 deletions(-)
diff --git a/build-tools/docker/Dockerfile b/build-tools/docker/Dockerfile
index f6f67d3..e0726ac 100644
--- a/build-tools/docker/Dockerfile
+++ b/build-tools/docker/Dockerfile
@@ -22,7 +22,28 @@ COPY . .
 RUN --mount=type=cache,target=/home/gradle/.gradle/caches gradle --no-daemon clean build -x check
-FROM azul/zulu-openjdk-alpine:26-jre
+FROM azul/zulu-openjdk-alpine:26 AS jlink
+
+ARG APP_VERSION=0.9.0-SNAPSHOT
+
+COPY --from=builder /app/spark-operator/build/libs/spark-kubernetes-operator-${APP_VERSION}-all.jar /tmp/app.jar
+
+RUN apk add --no-cache binutils && \
+ MODULES=$(jdeps \
+ --ignore-missing-deps \
+ --print-module-deps \
+ --multi-release 17 \
+ /tmp/app.jar) && \
+ MODULES="${MODULES},jdk.httpserver,jdk.unsupported,jdk.crypto.ec,jdk.security.auth,java.management" && \
+ jlink \
+ --add-modules ${MODULES} \
+ --strip-debug \
+ --compress zip-6 \
+ --no-header-files \
+ --no-man-pages \
+ --output /opt/custom-jre
+
+FROM alpine:3.23
ARG APP_VERSION=0.9.0-SNAPSHOT
ARG SPARK_UID=185
@@ -32,13 +53,18 @@ LABEL org.opencontainers.image.licenses="Apache-2.0"
LABEL org.opencontainers.image.ref.name="Apache Spark Kubernetes Operator"
LABEL org.opencontainers.image.version="${APP_VERSION}"
+ENV JAVA_HOME=/opt/custom-jre
+ENV PATH="${JAVA_HOME}/bin:${PATH}"
ENV SPARK_OPERATOR_HOME=/opt/spark-operator
ENV SPARK_OPERATOR_WORK_DIR=/opt/spark-operator/operator
ENV SPARK_OPERATOR_JAR=spark-kubernetes-operator.jar
WORKDIR $SPARK_OPERATOR_WORK_DIR
-RUN addgroup -S -g $SPARK_UID spark && \
+COPY --from=jlink /opt/custom-jre $JAVA_HOME
+
+RUN apk add --no-cache libstdc++ && \
+ addgroup -S -g $SPARK_UID spark && \
adduser -S -h $SPARK_OPERATOR_HOME -u $SPARK_UID -G spark spark && \
chown -R spark:spark $SPARK_OPERATOR_HOME