matthijseikelenboom opened a new issue, #9148: URL: https://github.com/apache/iceberg/issues/9148
### Query engine

Spark

### Question

Hi, I have a question about the combined use of Apache Spark and Iceberg. I'm trying to write to Iceberg concurrently, but I get `Unclosed input stream` warnings while testing; see below:

```
2023-11-23T16:22:03.407+0100 WARN [Thread="Finalizer"] [o.a.i.h.HadoopStreams :138] Unclosed input stream created by:
	org.apache.iceberg.hadoop.HadoopStreams$HadoopSeekableInputStream.<init>(HadoopStreams.java:91)
	org.apache.iceberg.hadoop.HadoopStreams.wrap(HadoopStreams.java:55)
	org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:183)
	org.apache.iceberg.avro.AvroIterable.newFileReader(AvroIterable.java:100)
	org.apache.iceberg.avro.AvroIterable.iterator(AvroIterable.java:76)
	org.apache.iceberg.io.CloseableIterable$7$1.<init>(CloseableIterable.java:188)
	org.apache.iceberg.io.CloseableIterable$7.iterator(CloseableIterable.java:187)
	org.apache.iceberg.io.CloseableIterable.lambda$filter$1(CloseableIterable.java:136)
	org.apache.iceberg.io.CloseableIterable$2.iterator(CloseableIterable.java:72)
	org.apache.iceberg.io.CloseableIterable.lambda$filter$1(CloseableIterable.java:136)
	org.apache.iceberg.io.CloseableIterable$2.iterator(CloseableIterable.java:72)
	org.apache.iceberg.io.CloseableIterable.lambda$filter$1(CloseableIterable.java:136)
	org.apache.iceberg.io.CloseableIterable$2.iterator(CloseableIterable.java:72)
	org.apache.iceberg.io.CloseableIterable$7$1.<init>(CloseableIterable.java:188)
	org.apache.iceberg.io.CloseableIterable$7.iterator(CloseableIterable.java:187)
	org.apache.iceberg.ManifestGroup$1.iterator(ManifestGroup.java:333)
	org.apache.iceberg.ManifestGroup$1.iterator(ManifestGroup.java:291)
	org.apache.iceberg.util.ParallelIterable$ParallelIterator.lambda$new$1(ParallelIterable.java:69)
	java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	java.base/java.lang.Thread.run(Thread.java:829)
```

I first saw these warnings when using Iceberg with a Hadoop catalog. So I tried it with a Hive catalog, hoping that would fix them, but that does not seem to be the case. What am I missing here? Do I need to change some configuration to enable proper concurrent writing?

Here are the versions of the tools I'm using, in case they come in handy:

- Spark version: 3.2.1
- Iceberg version: 1.4.2
- Hive version: 4.0.0-alpha-2 (the only version we got to work)
- Hadoop version: 3.2.2

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
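For context on what "proper concurrent writing" means here: Iceberg coordinates concurrent writers through optimistic concurrency — each writer commits against the table state it last read, and a commit that loses the race fails and is retried against the refreshed state (tunable via table properties such as `commit.retry.num-retries`). A minimal pure-Python sketch of that retry pattern, where the class and method names are hypothetical and not Iceberg's actual API:

```python
# Simplified sketch of optimistic-concurrency commits, loosely modeled on how
# Iceberg retries a table commit when another writer committed first.
# All names here are hypothetical -- this is not Iceberg's actual API.

class Table:
    def __init__(self):
        self.version = 0        # current table version (one per snapshot)
        self.snapshots = []     # committed snapshots, in commit order

    def commit(self, base_version, data):
        """Apply `data` only if the table hasn't moved past base_version."""
        if self.version != base_version:
            return False        # conflict: another writer committed first
        self.snapshots.append(data)
        self.version += 1
        return True

def append_with_retries(table, data, retries=4):
    """Retry the commit against a refreshed base, cf. commit.retry.num-retries."""
    for _ in range(retries + 1):
        base = table.version    # (re)read the current table state
        if table.commit(base, data):
            return True
    return False

table = Table()
base_a = table.version
base_b = table.version                   # writer B reads the same base as A
assert table.commit(base_a, "a")         # A wins the race
assert not table.commit(base_b, "b")     # B conflicts and must retry
assert append_with_retries(table, "b")   # the retry succeeds
print(table.version, table.snapshots)    # 2 ['a', 'b']
```

Under this model the warnings above would not by themselves indicate broken concurrency — conflicting commits are expected and retried — but whether that explains this case is exactly the question.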