DongSeungLee opened a new issue, #11558: URL: https://github.com/apache/iceberg/issues/11558
### Query engine

Spark 3.5.3

### Question

For study purposes, I run a standalone Spark cluster on my local machine, and I have developed my own IcebergRestCatalog. My IcebergRestCatalog follows the Iceberg 1.6.1 spec. I run `add_files` as shown below.

```
CALL iceberg.system.add_files(
  table => 'yearly_month_clicks',
  source_table => '`parquet`.`s3a://dataquery-warehouse/iceberg/data`'
);
```

The following error occurs.

```
Caused by: org.apache.iceberg.exceptions.RuntimeIOException: Failed to get file system for path: s3://dataquery-warehouse/iceberg/dataquery/yearly_month_clicks/metadata/stage-31-task-1619-manifest-855c8009-c073-48b0-9fd7-e12c1daf8930.avro
    at org.apache.iceberg.hadoop.Util.getFs(Util.java:58)
    at org.apache.iceberg.hadoop.HadoopOutputFile.fromPath(HadoopOutputFile.java:53)
    at org.apache.iceberg.hadoop.HadoopFileIO.newOutputFile(HadoopFileIO.java:97)
    at org.apache.iceberg.spark.SparkTableUtil.buildManifest(SparkTableUtil.java:368)
    at org.apache.iceberg.spark.SparkTableUtil.lambda$importSparkPartitions$1e94a719$1(SparkTableUtil.java:796)
    at org.apache.spark.sql.Dataset.$anonfun$mapPartitions$1(Dataset.scala:3414)
    at org.apache.spark.sql.execution.MapPartitionsExec.$anonfun$doExecute$3(objects.scala:198)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
    at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
    at org.apache.spark.scheduler.Task.run(Task.scala:141)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
    at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
    at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3"
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3443)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3466)
    at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
    at org.apache.iceberg.hadoop.Util.getFs(Util.java:56)
```

From my point of view, Spark tries to create the staging metadata files under the location recorded in the Iceberg table metadata. Here, the Iceberg metadata location starts with `s3`, so the scheme is fixed as `s3`. Spark accesses the file system through Hadoop's S3AFileSystem, so the `s3` scheme does not seem to be supported.

How can I overcome this issue?

Thanks, sincerely
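The scheme mismatch described in the question can be sketched in a few lines of Python. Hadoop selects a `FileSystem` implementation from the URI scheme, and one of the places it looks is the configuration key `fs.<scheme>.impl`. The sketch below only derives that key for the two paths in the report and builds a Spark setting that maps the `s3` scheme to `S3AFileSystem`. That mapping is an assumption and a possible workaround, not something confirmed in this report; the helper `hadoop_impl_key` and the `spark_conf` dictionary are hypothetical names used for illustration, and the setting is untested against this cluster.

```python
from urllib.parse import urlparse

# Paths copied from the question: the source files use s3a://, while the
# manifest path derived from the table's metadata location uses s3://.
source_path = "s3a://dataquery-warehouse/iceberg/data"
manifest_path = (
    "s3://dataquery-warehouse/iceberg/dataquery/yearly_month_clicks/metadata/"
    "stage-31-task-1619-manifest-855c8009-c073-48b0-9fd7-e12c1daf8930.avro"
)


def hadoop_impl_key(path: str) -> str:
    """Return the Hadoop configuration key associated with the path's scheme."""
    return f"fs.{urlparse(path).scheme}.impl"


# Assumption: hadoop-aws is on the classpath, so this class can be loaded.
S3A_IMPL = "org.apache.hadoop.fs.s3a.S3AFileSystem"

# Hypothetical Spark setting (not part of the original report): the
# "spark.hadoop." prefix forwards the key to the Hadoop configuration, so
# paths with the "s3" scheme would be served by the S3A implementation.
spark_conf = {f"spark.hadoop.{hadoop_impl_key(manifest_path)}": S3A_IMPL}

for path in (source_path, manifest_path):
    print(urlparse(path).scheme, "->", hadoop_impl_key(path))
print(spark_conf)
```

If this assumption holds for the setup, the resulting key/value pair could be passed with `--conf` to `spark-submit` or placed in `spark-defaults.conf`. The other direction, having the catalog report table locations with the `s3a` scheme, would avoid the mismatch without touching the Hadoop configuration.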
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org