This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/spark-kubernetes-operator.git


The following commit(s) were added to refs/heads/main by this push:
     new 21db3b2  [SPARK-52467] Add `dfs-read-write` and `localstack` examples
21db3b2 is described below

commit 21db3b28f1e422371fe0356edb3bfb37fce0bace
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Fri Jun 13 06:09:42 2025 -0700

    [SPARK-52467] Add `dfs-read-write` and `localstack` examples
    
    ### What changes were proposed in this pull request?
    
    This PR aims to add `dfs-read-write` and `localstack` examples.
    
    ### Why are the changes needed?
    
    To provide the following examples.
    
    1. How to add additional packages
    
    ```yaml
    spark.jars.packages: "org.apache.hadoop:hadoop-aws:3.4.1"
    spark.jars.ivy: "/tmp/.ivy2.5.2"
    ```
    
    2. How to use S3
    
    ```yaml
    spark.hadoop.fs.defaultFS: "..."
    spark.hadoop.fs.s3a.endpoint: "..."
    spark.hadoop.fs.s3a.path.style.access: "..."
    spark.hadoop.fs.s3a.access.key: "..."
    spark.hadoop.fs.s3a.secret.key: "..."
    ```
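    
    As a concrete illustration, `examples/dfs-read-write.yaml` in this commit fills in these keys for the LocalStack service defined in `examples/localstack.yml`. The `test`/`test` credentials are the placeholder values used by that example, not real keys.
    
    ```yaml
    # Values copied from examples/dfs-read-write.yaml in this commit.
    spark.hadoop.fs.defaultFS: "s3a://data"
    spark.hadoop.fs.s3a.endpoint: "http://localstack:4566"
    spark.hadoop.fs.s3a.path.style.access: "true"
    spark.hadoop.fs.s3a.access.key: "test"
    spark.hadoop.fs.s3a.secret.key: "test"
    ```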
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Tested manually.
    
    ```bash
    $ kubectl apply -f examples/localstack.yml
    $ kubectl apply -f examples/dfs-read-write.yaml
    $ kubectl logs -f dfs-read-write-0-driver
    ...
    Success! Local Word Count 18 and DFS Word Count 18 agree.
    ...
    ```
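    
    The manual check above could be scripted along these lines. This is only a sketch: `driver_succeeded` is a hypothetical helper name, and the pattern assumes the driver keeps printing the success line in the format shown above.
    
    ```bash
    # Hypothetical helper: reads log text on stdin and succeeds only if it
    # contains the success line printed by DFSReadWriteTest.
    driver_succeeded() {
      grep -q 'Success! Local Word Count [0-9][0-9]* and DFS Word Count [0-9][0-9]* agree\.'
    }
    
    # Usage (assumes the driver pod name from the commands above):
    #   kubectl logs dfs-read-write-0-driver | driver_succeeded && echo "OK"
    ```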
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #242 from dongjoon-hyun/SPARK-52467.
    
    Authored-by: Dongjoon Hyun <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 examples/dfs-read-write.yaml | 44 +++++++++++++++++++++++++++++++++++
 examples/localstack.yml      | 55 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 99 insertions(+)

diff --git a/examples/dfs-read-write.yaml b/examples/dfs-read-write.yaml
new file mode 100644
index 0000000..56acdc2
--- /dev/null
+++ b/examples/dfs-read-write.yaml
@@ -0,0 +1,44 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Since this requires a remote storage, prepare it via `localstack.yml`.
+#
+apiVersion: spark.apache.org/v1beta1
+kind: SparkApplication
+metadata:
+  name: dfs-read-write
+spec:
+  mainClass: "org.apache.spark.examples.DFSReadWriteTest"
+  jars: "local:///opt/spark/examples/jars/spark-examples.jar"
+  driverArgs: [ "/opt/spark/RELEASE", "s3a://data/" ]
+  sparkConf:
+    spark.logConf: "true"
+    spark.jars.packages: "org.apache.hadoop:hadoop-aws:3.4.1"
+    spark.jars.ivy: "/tmp/.ivy2.5.2"
+    spark.driver.memory: "2g"
+    spark.dynamicAllocation.enabled: "true"
+    spark.dynamicAllocation.shuffleTracking.enabled: "true"
+    spark.dynamicAllocation.maxExecutors: "3"
+    spark.kubernetes.authenticate.driver.serviceAccountName: "spark"
+    spark.kubernetes.container.image: "apache/spark:4.0.0-java21-scala"
+    spark.hadoop.fs.defaultFS: "s3a://data"
+    spark.hadoop.fs.s3a.endpoint: "http://localstack:4566"
+    spark.hadoop.fs.s3a.path.style.access: "true"
+    spark.hadoop.fs.s3a.access.key: "test"
+    spark.hadoop.fs.s3a.secret.key: "test"
+  applicationTolerations:
+    resourceRetainPolicy: OnFailure
+  runtimeVersions:
+    sparkVersion: "4.0.0"
diff --git a/examples/localstack.yml b/examples/localstack.yml
new file mode 100644
index 0000000..3b64806
--- /dev/null
+++ b/examples/localstack.yml
@@ -0,0 +1,55 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+apiVersion: v1
+kind: Pod
+metadata:
+  name: localstack
+  labels:
+    role: s3
+spec:
+  containers:
+  - name: localstack
+    image: localstack/localstack:4
+    resources:
+      limits:
+        cpu: "1"
+        memory: 1Gi
+      requests:
+        cpu: "1"
+        memory: 1Gi
+    ports:
+    - containerPort: 4566
+    lifecycle:
+      postStart:
+        exec:
+          command:
+          - /bin/sh
+          - -c
+          - >
+            awslocal s3 mb s3://data;
+            awslocal s3 cp /opt/code/localstack/Makefile s3://data/
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: localstack
+spec:
+  type: ClusterIP
+  ports:
+  - port: 4566
+    protocol: TCP
+    targetPort: 4566
+  selector:
+    role: s3

