Re: [PR] AWS: Support multiple storage credential prefixes [iceberg]

via GitHub Tue, 27 May 2025 08:15:14 -0700


and124578963 commented on PR #12799:
URL: https://github.com/apache/iceberg/pull/12799#issuecomment-2912973308


   Hey. I ran into a problem with this commit. As far as I can tell, my paths start with 
`s3a://`, not `s3://`. How should I configure this?
   After reverting the commit, everything works again.
   
   ```
   25/05/27 13:13:38 ERROR org.apache.spark.executor.Executor: Exception in task 2.0 in stage 2.0 (TID 10)
   java.lang.IllegalStateException: [BUG] S3 client for storage path not available: s3a://spark-nt/load_test_delete_warehouse/gen_transaction_icb_year/data/TRANSACTION_DATE_year=2003/00115-938-a969331b-32bf-4df2-a897-8c9dbc4a7c1a-0-00001.parquet
        at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkState(Preconditions.java:603) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.aws.s3.S3FileIO.clientForStoragePath(S3FileIO.java:427) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.aws.s3.S3FileIO.newInputFile(S3FileIO.java:184) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.encryption.EncryptingFileIO.wrap(EncryptingFileIO.java:150) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.relocated.com.google.common.collect.Iterators$6.transform(Iterators.java:828) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.relocated.com.google.common.collect.TransformedIterator.next(TransformedIterator.java:51) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.relocated.com.google.common.collect.TransformedIterator.next(TransformedIterator.java:51) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.encryption.EncryptingFileIO.bulkDecrypt(EncryptingFileIO.java:63) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.spark.source.BaseReader.inputFiles(BaseReader.java:177) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.spark.source.BaseReader.getInputFile(BaseReader.java:170) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.spark.source.BatchDataReader.open(BatchDataReader.java:100) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.spark.source.BatchDataReader.open(BatchDataReader.java:43) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.iceberg.spark.source.BaseReader.next(BaseReader.java:134) ~[iceberg-spark-runtime-3.5_2.12-iceberg-comet-trunk.jar:?]
        at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:120) ~[spark-sql_2.12-3.5.4.jar:3.5.4]
        at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:158) ~[spark-sql_2.12-3.5.4.jar:3.5.4]
        at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1(DataSourceRDD.scala:63) ~[spark-sql_2.12-3.5.4.jar:3.5.4]
        at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1$adapted(DataSourceRDD.scala:63) ~[spark-sql_2.12-3.5.4.jar:3.5.4]
        at scala.Option.exists(Option.scala:376) ~[scala-library-2.12.18.jar:?]
        at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63) ~[spark-sql_2.12-3.5.4.jar:3.5.4]
        at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.advanceToNextIter(DataSourceRDD.scala:97) ~[spark-sql_2.12-3.5.4.jar:3.5.4]
        at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63) ~[spark-sql_2.12-3.5.4.jar:3.5.4]
        at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) ~[spark-core_2.12-3.5.4.jar:3.5.4]
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) ~[scala-library-2.12.18.jar:?]
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source) ~[?:?]
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) ~[?:?]
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) ~[spark-sql_2.12-3.5.4.jar:3.5.4]
        at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43) ~[spark-sql_2.12-3.5.4.jar:3.5.4]
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) ~[scala-library-2.12.18.jar:?]
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) ~[scala-library-2.12.18.jar:?]
        at org.apache.spark.util.random.SamplingUtils$.reservoirSampleAndCount(SamplingUtils.scala:41) ~[spark-core_2.12-3.5.4.jar:3.5.4]
        at org.apache.spark.RangePartitioner$.$anonfun$sketch$1(Partitioner.scala:322) ~[spark-core_2.12-3.5.4.jar:3.5.4]
        at org.apache.spark.RangePartitioner$.$anonfun$sketch$1$adapted(Partitioner.scala:320) ~[spark-core_2.12-3.5.4.jar:3.5.4]
        at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:910) ~[spark-core_2.12-3.5.4.jar:3.5.4]
        at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:910) ~[spark-core_2.12-3.5.4.jar:3.5.4]
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

