gaborkaszab commented on PR #11837: URL: https://github.com/apache/iceberg/pull/11837#issuecomment-2690864221
Hi @amogh-jahagirdar, @pvary, I managed to simplify the original PR and uploaded a new version. I took a deeper look at how [orphan file deletion does the same thing](https://github.com/apache/iceberg/blob/e230f5d79d82a50439029db5c73f8b59497b2e9f/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java#L253) and I basically follow the same approach here too.

In a nutshell, I think I overcomplicated my first approach because I also wanted to implement bulk deletion for the plugged-in deleteFunc. For orphan files we don't do that: bulk deletion is used only when there is no plugged-in deleteFunc and the FileIO supports bulk deletion. Additionally, there were comments about retries and batching the files. I believe these are not needed here, which is also in line with orphan file cleanup.

Let me know what you think!
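To make the dispatch rule described above concrete, here is a minimal, self-contained sketch. It is not the code from the PR or from `DeleteOrphanFilesSparkAction`: the `FileIO` and `BulkFileIO` interfaces below are simplified stand-ins I made up for illustration (in Iceberg the corresponding types are `FileIO` and `SupportsBulkOperations`), and the class name, method name, and returned mode string are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class DeleteDispatchSketch {

  /** Hypothetical stand-in for a FileIO that can only delete one file at a time. */
  public interface FileIO {
    void deleteFile(String path);
  }

  /** Hypothetical stand-in for a FileIO that additionally supports bulk deletion. */
  public interface BulkFileIO extends FileIO {
    void deleteFiles(Iterable<String> paths);
  }

  /**
   * Deletes the given paths and returns which mode was used ("bulk" or "per-file").
   * Bulk deletion is chosen only when no custom deleteFunc was plugged in AND the
   * FileIO supports bulk deletion. No retries or batching are attempted.
   */
  public static String deleteFiles(FileIO io, Consumer<String> deleteFunc, List<String> paths) {
    if (deleteFunc == null && io instanceof BulkFileIO) {
      ((BulkFileIO) io).deleteFiles(paths);
      return "bulk";
    }

    // A plugged-in deleteFunc always wins; otherwise fall back to single-file deletes.
    Consumer<String> effective = deleteFunc != null ? deleteFunc : io::deleteFile;
    for (String path : paths) {
      effective.accept(path);
    }
    return "per-file";
  }

  public static void main(String[] args) {
    List<String> calls = new ArrayList<>();
    BulkFileIO io = new BulkFileIO() {
      @Override
      public void deleteFile(String path) {
        calls.add("single:" + path);
      }

      @Override
      public void deleteFiles(Iterable<String> paths) {
        calls.add("bulk");
      }
    };

    List<String> paths = Arrays.asList("a.parquet", "b.parquet");
    System.out.println(deleteFiles(io, null, paths));
    System.out.println(deleteFiles(io, p -> calls.add("custom:" + p), paths));
    System.out.println(calls);
  }
}
```

The point of the sketch is only the branch condition: a user-supplied deleteFunc is never wrapped in bulk logic, which is what keeps the simplified version small.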