aokolnychyi opened a new pull request, #8755: URL: https://github.com/apache/iceberg/pull/8755
This PR parallelizes reading of deletes and enables caching them on executors. I also have a follow-up change that assigns tasks for one partition to the same executor, similar to `KafkaRDD`. There is no way to express task affinity, so we can only rely on task locality. The approach in `KafkaRDD` is simple to implement but won't work well if dynamic allocation is enabled, so it should be hidden behind a flag. More thoughts are in [this doc](https://docs.google.com/document/d/1M4L6o-qnGRwGhbhkW8BnravoTwvCrJV8VvzVQDRJO5I).
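For readers unfamiliar with the `KafkaRDD` approach mentioned above, here is a minimal sketch of the general idea: map a partition to one executor by hashing it against a sorted list of the currently known executors, and report that executor as the task's preferred location. This is an illustration, not the code in this PR or the follow-up. The class name `PartitionLocality`, the method `preferredLocations`, the string partition key, and the executor strings are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Iceberg or Spark.
final class PartitionLocality {

  private PartitionLocality() {}

  // Returns at most one preferred location for the given partition key.
  // All tasks that share a partition key get the same location as long as
  // the set of executors does not change.
  static List<String> preferredLocations(String partitionKey, List<String> executors) {
    if (executors.isEmpty()) {
      // No known executors: express no preference and let the scheduler decide.
      return Collections.emptyList();
    }

    // Sort a copy so the choice does not depend on the order in which
    // executors were listed.
    List<String> sorted = new ArrayList<>(executors);
    Collections.sort(sorted);

    // floorMod keeps the index non-negative even for negative hash codes.
    int index = Math.floorMod(partitionKey.hashCode(), sorted.size());
    return Collections.singletonList(sorted.get(index));
  }
}
```

Because the index depends on the size and contents of the executor list, adding or removing executors can move a partition to a different executor. That is presumably one reason this scheme fits poorly with dynamic allocation. The result is also only a locality hint: the scheduler may still run the task elsewhere.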