This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/spark-kubernetes-operator.git
The following commit(s) were added to refs/heads/main by this push:
new 21db3b2 [SPARK-52467] Add `dfs-read-write` and `localstack` examples
21db3b2 is described below
commit 21db3b28f1e422371fe0356edb3bfb37fce0bace
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Fri Jun 13 06:09:42 2025 -0700
[SPARK-52467] Add `dfs-read-write` and `localstack` examples
### What changes were proposed in this pull request?
This PR aims to add `dfs-read-write` and `localstack` examples.
### Why are the changes needed?
To provide examples of the following:
1. How to add additional packages
```yaml
spark.jars.packages: "org.apache.hadoop:hadoop-aws:3.4.1"
spark.jars.ivy: "/tmp/.ivy2.5.2"
```
2. How to use S3
```yaml
spark.hadoop.fs.defaultFS: "..."
spark.hadoop.fs.s3a.endpoint: "..."
spark.hadoop.fs.s3a.path.style.access: "..."
spark.hadoop.fs.s3a.access.key: "..."
spark.hadoop.fs.s3a.secret.key: "..."
```
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Manually tested.
```bash
$ kubectl apply -f examples/localstack.yml
$ kubectl apply -f examples/dfs-read-write.yaml
$ kubectl logs -f dfs-read-write-0-driver
...
Success! Local Word Count 18 and DFS Word Count 18 agree.
...
```
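As an optional extra check, the bucket contents can be listed from inside the LocalStack pod. This is a minimal sketch, not part of the patch: `list_bucket` is a hypothetical helper name, and the default pod name and bucket are assumed from `examples/localstack.yml`.
```bash
# Hypothetical helper (not part of this patch).
# Defaults assume the pod name and bucket used in examples/localstack.yml.
list_bucket() {
  pod="${1:-localstack}"
  bucket="${2:-s3://data/}"
  # awslocal ships in the localstack image; the postStart hook uses it as well.
  kubectl exec "$pod" -- awslocal s3 ls "$bucket"
}
```
Calling `list_bucket` once the pod is running should list at least the `Makefile` that the `postStart` hook copies into the bucket, assuming the hook completed successfully.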
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #242 from dongjoon-hyun/SPARK-52467.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
examples/dfs-read-write.yaml | 44 +++++++++++++++++++++++++++++++++++
examples/localstack.yml | 55 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 99 insertions(+)
diff --git a/examples/dfs-read-write.yaml b/examples/dfs-read-write.yaml
new file mode 100644
index 0000000..56acdc2
--- /dev/null
+++ b/examples/dfs-read-write.yaml
@@ -0,0 +1,44 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Since this requires a remote storage, prepare it via `localstack.yml`.
+#
+apiVersion: spark.apache.org/v1beta1
+kind: SparkApplication
+metadata:
+  name: dfs-read-write
+spec:
+  mainClass: "org.apache.spark.examples.DFSReadWriteTest"
+  jars: "local:///opt/spark/examples/jars/spark-examples.jar"
+  driverArgs: [ "/opt/spark/RELEASE", "s3a://data/" ]
+  sparkConf:
+    spark.logConf: "true"
+    spark.jars.packages: "org.apache.hadoop:hadoop-aws:3.4.1"
+    spark.jars.ivy: "/tmp/.ivy2.5.2"
+    spark.driver.memory: "2g"
+    spark.dynamicAllocation.enabled: "true"
+    spark.dynamicAllocation.shuffleTracking.enabled: "true"
+    spark.dynamicAllocation.maxExecutors: "3"
+    spark.kubernetes.authenticate.driver.serviceAccountName: "spark"
+    spark.kubernetes.container.image: "apache/spark:4.0.0-java21-scala"
+    spark.hadoop.fs.defaultFS: "s3a://data"
+    spark.hadoop.fs.s3a.endpoint: "http://localstack:4566"
+    spark.hadoop.fs.s3a.path.style.access: "true"
+    spark.hadoop.fs.s3a.access.key: "test"
+    spark.hadoop.fs.s3a.secret.key: "test"
+  applicationTolerations:
+    resourceRetainPolicy: OnFailure
+  runtimeVersions:
+    sparkVersion: "4.0.0"
diff --git a/examples/localstack.yml b/examples/localstack.yml
new file mode 100644
index 0000000..3b64806
--- /dev/null
+++ b/examples/localstack.yml
@@ -0,0 +1,55 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+apiVersion: v1
+kind: Pod
+metadata:
+  name: localstack
+  labels:
+    role: s3
+spec:
+  containers:
+  - name: localstack
+    image: localstack/localstack:4
+    resources:
+      limits:
+        cpu: "1"
+        memory: 1Gi
+      requests:
+        cpu: "1"
+        memory: 1Gi
+    ports:
+    - containerPort: 4566
+    lifecycle:
+      postStart:
+        exec:
+          command:
+          - /bin/sh
+          - -c
+          - >
+            awslocal s3 mb s3://data;
+            awslocal s3 cp /opt/code/localstack/Makefile s3://data/
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: localstack
+spec:
+  type: ClusterIP
+  ports:
+  - port: 4566
+    protocol: TCP
+    targetPort: 4566
+  selector:
+    role: s3