mirageyjd opened a new issue, #8932: URL: https://github.com/apache/iceberg/issues/8932
### Apache Iceberg version

0.13.1

### Query engine

Spark

### Please describe the bug 🐞

We ran the `BaseRewriteManifestsSparkAction` action on a large table with 7k+ manifests in Spark, and it unexpectedly took more than an hour. The most time-consuming step is validating that each manifest entry in the added manifests has a snapshot id, and this step is not executed in a distributed manner. Without the validation, the entire action takes less than 2 minutes.

I wonder whether it is necessary to validate the snapshot id of each manifest entry in manifests written by `BaseRewriteManifestsSparkAction`. It would be better if this validation were optional and could be skipped in `BaseRewriteManifestsSparkAction`.
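To illustrate the shape of the problem described above, here is a minimal toy model in plain Java. It is not Iceberg code: `Entry`, `Manifest`, `validateManifest`, and the other names are hypothetical stand-ins, and the in-memory lists replace what would be reads of manifest files from storage. It only sketches the difference between one thread checking every entry of every manifest in turn and the same check fanned out over several workers.

```java
import java.util.ArrayList;
import java.util.List;

public class ManifestValidationSketch {

    /** Hypothetical stand-in for a manifest entry; snapshotId is null when unassigned. */
    static final class Entry {
        final Long snapshotId;

        Entry(Long snapshotId) {
            this.snapshotId = snapshotId;
        }
    }

    /** Hypothetical stand-in for a manifest file: a path plus its entries. */
    static final class Manifest {
        final String path;
        final List<Entry> entries;

        Manifest(String path, List<Entry> entries) {
            this.path = path;
            this.entries = entries;
        }
    }

    /** Checks every entry of one manifest and returns the number of entries checked. */
    static long validateManifest(Manifest manifest) {
        long checked = 0;
        for (Entry entry : manifest.entries) {
            if (entry.snapshotId == null) {
                throw new IllegalStateException("Entry without snapshot id in " + manifest.path);
            }
            checked++;
        }
        return checked;
    }

    /** One thread visits every manifest in turn (the non-distributed case). */
    static long validateSequentially(List<Manifest> manifests) {
        return manifests.stream()
                .mapToLong(ManifestValidationSketch::validateManifest)
                .sum();
    }

    /** The same check spread over the common fork-join pool. */
    static long validateInParallel(List<Manifest> manifests) {
        return manifests.parallelStream()
                .mapToLong(ManifestValidationSketch::validateManifest)
                .sum();
    }

    /** Builds in-memory sample data in which every entry has a snapshot id. */
    static List<Manifest> sampleManifests(int manifestCount, int entriesPerManifest) {
        List<Manifest> manifests = new ArrayList<>();
        for (int m = 0; m < manifestCount; m++) {
            List<Entry> entries = new ArrayList<>();
            for (int e = 0; e < entriesPerManifest; e++) {
                entries.add(new Entry(1L));
            }
            manifests.add(new Manifest("manifest-" + m + ".avro", entries));
        }
        return manifests;
    }

    public static void main(String[] args) {
        List<Manifest> manifests = sampleManifests(7000, 100);
        System.out.println("sequential: " + validateSequentially(manifests));
        System.out.println("parallel:   " + validateInParallel(manifests));
    }
}
```

With in-memory data both variants finish almost instantly; the gap reported in this issue would come from each real manifest having to be opened and read, which this sketch does not model.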