Fokko commented on issue #12845: URL: https://github.com/apache/iceberg/issues/12845#issuecomment-2823677466
Hey @littleDrew, as @wzx140 already indicated, Spark uses distributed planning. The problem is likely that you have a lot of metadata and data, which makes query planning slow. I see that you refer to the Iceberg 0.13 code. If you're still on that version, I would highly recommend updating to a more recent release: distributed planning was added in 1.4.0 for Spark 3.4 and 3.5. Another thing to consider is table maintenance. Rewriting [data](https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_data_files) and [metadata](https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_manifests) can speed up queries quite a bit.
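
For reference, a minimal sketch of what the two maintenance calls linked above could look like from PySpark. The catalog name `my_catalog` and the table `db.events` are placeholders rather than names from this issue, and the helpers `maintenance_statements` and `run_maintenance` are hypothetical names used only for illustration. The sketch assumes a SparkSession that already has an Iceberg catalog and the Iceberg SQL extensions configured.

```python
def maintenance_statements(catalog: str, table: str) -> list[str]:
    """Build the Spark SQL CALL statements for basic Iceberg table maintenance.

    `catalog` and `table` are placeholders supplied by the caller.
    """
    return [
        # Compact data files (for example, many small files) into fewer, larger ones.
        f"CALL {catalog}.system.rewrite_data_files(table => '{table}')",
        # Rewrite manifests so that scan planning has less metadata to read.
        f"CALL {catalog}.system.rewrite_manifests(table => '{table}')",
    ]


def run_maintenance(spark, catalog: str, table: str) -> None:
    """Run each statement on an existing SparkSession (assumed to be Iceberg-enabled)."""
    for statement in maintenance_statements(catalog, table):
        # Each procedure returns a small result set summarizing what was rewritten.
        spark.sql(statement).show(truncate=False)


if __name__ == "__main__":
    # Print the statements only; running them needs a live Spark session.
    for statement in maintenance_statements("my_catalog", "db.events"):
        print(statement)
```

Both procedures accept further options (such as a filter or a rewrite strategy for `rewrite_data_files`), so it is worth checking the linked procedure docs before running them on a large table.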