UFMurphy commented on issue #7570: URL: https://github.com/apache/iceberg/issues/7570#issuecomment-2088572698
Hi @dramaticlly,

In your {SPARK_HOME}/conf directory, you should see a file called spark-defaults.conf. At the bottom of that file, add your configuration properties, one per line, in the form: NAME VALUE. Here is a copy of the one I use locally:

spark.master spark://master:7077
spark.sql.caseSensitive true
spark.executor.memory 5g
spark.executor.cores 2
spark.driver.memory 2g
spark.ui.port 4042
spark.executor.heartbeatInterval 10000000
spark.network.timeout 10000000
spark.local.dir /tmp/spark
spark.hadoop.fs.s3a.access.key xxxxxxxxxxxxxx
spark.hadoop.fs.s3a.aws.credentials.provider org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider
spark.hadoop.fs.s3a.block.size 512M
spark.hadoop.fs.s3a.committer.name directory
spark.hadoop.fs.s3a.committer.magic.enabled false
spark.hadoop.fs.s3a.committer.staging.abort.pending.uploads true
spark.hadoop.fs.s3a.committer.staging.conflict-mode append
spark.hadoop.fs.s3a.committer.staging.tmp.path /tmp/staging
spark.hadoop.fs.s3a.committer.staging.unique-filenames true
spark.hadoop.fs.s3a.committer.threads 2048
spark.hadoop.fs.s3a.connection.establish.timeout 5000
spark.hadoop.fs.s3a.connection.maximum 8192
spark.hadoop.fs.s3a.connection.ssl.enabled false
spark.hadoop.fs.s3a.connection.timeout 200000
spark.hadoop.fs.s3a.endpoint http://localminio:9000/
spark.hadoop.fs.s3a.fast.upload true
spark.hadoop.fs.s3a.fast.upload.active.blocks 2048
spark.hadoop.fs.s3a.fast.upload.buffer disk
spark.hadoop.fs.s3a.impl org.apache.hadoop.fs.s3a.S3AFileSystem
spark.hadoop.fs.s3a.max.total.tasks 2048
spark.hadoop.fs.s3a.multipart.size 512M
spark.hadoop.fs.s3a.multipart.threshold 512M
spark.hadoop.fs.s3a.path.style.access true
spark.hadoop.fs.s3a.secret.key xxxxxxxxxxxx
spark.hadoop.fs.s3a.socket.recv.buffer 65536
spark.hadoop.fs.s3a.socket.send.buffer 65536
spark.hadoop.fs.s3a.threads.max 2048
spark.worker.cleanup.enabled true
spark.jars.packages org.postgresql:postgresql:42.5.1,org.apache.hadoop:hadoop-aws:3.3.1,com.amazonaws:aws-java-sdk-bom:1.12.50

Then restart the cluster and it *should* work.

On Sun, Apr 28, 2024 at 4:10 PM YannickLecroart21 ***@***.***> wrote:

> @dramaticlly <https://github.com/dramaticlly>
>
> Thank you for your suggestion. I wasn't able to properly set the
> environment variables using that specific approach. But it got me thinking
> about other ways. I got it working by setting the aws credentials as
> environment variables in the conf/spark-env.sh file. These Jupyter
> notebooks throw a wrench into everything ;-). Thanks again!
>
> How did you do that please?
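For reference on the spark-env.sh route mentioned in the quoted reply: a minimal sketch of what {SPARK_HOME}/conf/spark-env.sh can look like. The values are placeholders, not anything from this thread; AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the standard names the AWS SDK's environment-variable credentials provider looks for.

# {SPARK_HOME}/conf/spark-env.sh (sourced by Spark's launch scripts at startup)
# Placeholder credentials; replace with your own, or inject them from a
# secrets manager rather than committing them to the file.
export AWS_ACCESS_KEY_ID=xxxxxxxxxxxxxx
export AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxx

S3A's default credential chain includes an environment-variable provider, so with these exported on each node you should be able to drop the fs.s3a.access.key and fs.s3a.secret.key lines from spark-defaults.conf.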
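And since the quoted reply mentions Jupyter notebooks throwing a wrench into things: if the notebook builds its own SparkSession, spark-defaults.conf may not be what that session ends up using, so a common fallback is to set the same S3A properties programmatically. A minimal PySpark sketch, assuming hadoop-aws is already on the classpath; the bucket name and credentials are placeholders, and the endpoint is the MinIO one from the config above:

from pyspark.sql import SparkSession

# Minimal sketch: mirror the S3A settings from spark-defaults.conf in code.
# Credentials below are placeholders, not values from this issue.
spark = (
    SparkSession.builder
    .appName("s3a-minio-example")
    .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
    .config("spark.hadoop.fs.s3a.endpoint", "http://localminio:9000/")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .config("spark.hadoop.fs.s3a.connection.ssl.enabled", "false")
    .config("spark.hadoop.fs.s3a.aws.credentials.provider",
            "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider")
    .config("spark.hadoop.fs.s3a.access.key", "YOUR_ACCESS_KEY")
    .config("spark.hadoop.fs.s3a.secret.key", "YOUR_SECRET_KEY")
    .getOrCreate()
)

# Quick check that the session can reach the object store
# (the bucket and path are hypothetical):
spark.read.text("s3a://my-bucket/some/path").show()

One caveat: classpath-level settings such as spark.jars.packages generally have to be in place before the JVM starts, so those still belong in spark-defaults.conf or the kernel's launch arguments rather than in builder.config().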