amogh-jahagirdar commented on PR #8397: URL: https://github.com/apache/iceberg/pull/8397#issuecomment-1707058898
Thanks for the review @rdblue. I've updated the PR so that the create/replace cases perform the strict metadata cleanup check.

One case I wanted to double-check is the deletion of files in the finally block for create/replace: https://github.com/apache/iceberg/pull/8397#discussion_r1316210793. If I understand correctly, we would still want to preserve this cleanup regardless of the commit outcome or strict cleanup, because that set consists of files from previous commit attempts, which we can safely clean up.

> @amogh-jahagirdar, in addition to fixing this, I think we also need to look into file cleanup for transactions. I was reviewing the current handling and I noticed that the deletedFiles set is never cleared between retries for a SIMPLE transaction. As a result, there is a chance that a file that was deleted by one operation's commit but reused when that operation is recommitted will remain in the delete set and be removed. I think all we need to do is to clear the delete file set before each commit attempt. Could you look into this in a follow-up PR?

Good catch, sure, I will look into this!

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
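To illustrate the retry issue raised in the quoted comment, here is a minimal sketch of why a delete set that is never cleared between commit attempts can end up holding a file that a later attempt reuses. This is not Iceberg's transaction code: the class `DeleteSetRetrySketch`, the method `deleteSetAfterRetries`, and the manifest file names are all hypothetical, and each commit attempt is modeled simply as the list of files that attempt marks for deletion.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration only; not Iceberg's actual transaction classes.
class DeleteSetRetrySketch {

  /**
   * Replays a sequence of commit attempts and returns the delete set as it
   * stands after the last attempt. Each inner list holds the files that one
   * attempt marked as deleted.
   *
   * @param deletedPerAttempt files marked deleted by each attempt, in order
   * @param clearBeforeAttempt whether to reset the set before every attempt
   */
  static Set<String> deleteSetAfterRetries(
      List<List<String>> deletedPerAttempt, boolean clearBeforeAttempt) {
    Set<String> deletedFiles = new LinkedHashSet<>();
    for (List<String> deletedThisAttempt : deletedPerAttempt) {
      if (clearBeforeAttempt) {
        // Drop entries recorded by earlier attempts that did not commit.
        deletedFiles.clear();
      }
      deletedFiles.addAll(deletedThisAttempt);
    }
    return deletedFiles;
  }
}
```

Under these assumptions, suppose the first attempt marks `manifest-a.avro` as deleted and then fails, and the retry reuses `manifest-a.avro` and marks only `manifest-b.avro`. Calling `deleteSetAfterRetries(List.of(List.of("manifest-a.avro"), List.of("manifest-b.avro")), false)` leaves both files in the set, so the reused file would be removed during cleanup; passing `true` leaves only `manifest-b.avro`.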
