RussellSpitzer opened a new pull request, #6588: URL: https://github.com/apache/iceberg/pull/6588
An issue we've run into frequently is that several Spark actions perform deletes on the driver with a default parallelism of 1. This is quite slow on S3 and painfully slow for very large tables. To fix this, we change the default behavior so that deletes are always multithreaded. The default for all Spark-related actions can then be changed with a SQL conf parameter, and each command can override it with its own parallelism parameter.
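To illustrate the idea of multithreaded deletes, here is a minimal sketch of fanning file deletes out over a fixed-size thread pool instead of deleting one path at a time. It is not the code from this pull request: the class `ParallelDeleteSketch`, the method `deleteAll`, and the in-memory "store" in `main` are hypothetical names chosen for illustration, and the delete function stands in for whatever call actually removes a file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical helper, for illustration only; not part of Iceberg.
final class ParallelDeleteSketch {

  private ParallelDeleteSketch() {}

  // Runs deleteFunc for every path on a pool of `parallelism` threads and
  // returns how many deletes completed without throwing.
  static int deleteAll(List<String> paths, Consumer<String> deleteFunc, int parallelism) {
    if (parallelism < 1) {
      throw new IllegalArgumentException("parallelism must be at least 1");
    }

    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    AtomicInteger deleted = new AtomicInteger();
    try {
      List<Callable<Void>> tasks = new ArrayList<>();
      for (String path : paths) {
        tasks.add(() -> {
          try {
            deleteFunc.accept(path);
            deleted.incrementAndGet();
          } catch (RuntimeException e) {
            // Best effort: a failed delete is skipped and not counted.
          }
          return null;
        });
      }
      // invokeAll blocks until every task has finished.
      pool.invokeAll(tasks);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while deleting files", e);
    } finally {
      pool.shutdown();
    }
    return deleted.get();
  }

  public static void main(String[] args) {
    // An in-memory set stands in for object storage in this demo.
    Set<String> store = ConcurrentHashMap.newKeySet();
    List<String> paths = new ArrayList<>();
    for (int i = 0; i < 100; i++) {
      String path = "s3://bucket/data/file-" + i + ".parquet";
      store.add(path);
      paths.add(path);
    }

    int deleted = deleteAll(paths, store::remove, 8);
    System.out.println("deleted " + deleted + ", remaining " + store.size());
  }
}
```

With `parallelism` set to 1 this behaves like the old single-threaded path. Higher values mainly help when each delete is dominated by network latency, as it is on S3.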