amogh-jahagirdar commented on PR #8397:
URL: https://github.com/apache/iceberg/pull/8397#issuecomment-1707058898

   Thanks for the review @rdblue. I've updated the PR so that the 
create/replace cases also perform the strict metadata cleanup check. One case I 
wanted to double-check is the file deletion that happens in the finally block 
for create/replace: 
https://github.com/apache/iceberg/pull/8397#discussion_r1316210793 . If I 
understand correctly, we would still want to preserve this cleanup regardless 
of the commit outcome or the strict cleanup check, because that set of files 
comes from previous commit attempts, which we can safely clean up.
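   
   To make the question concrete, here is a rough sketch of the two cleanup 
paths as I understand them. It is not the code in this PR: every name in it 
(CreateReplaceCleanupSketch, KnownFailure, RejectedCommit, UnknownOutcome) is 
hypothetical, and "removing" a file is modeled by appending its path to a list. 
The assumption is that new metadata is only removed when the failure is known 
to be safe to clean up, while files left over from earlier attempts are removed 
in the finally block no matter how the commit ended.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names are Iceberg's actual API.
final class CreateReplaceCleanupSketch {
  // Marker for failures where the commit is known NOT to have succeeded.
  interface KnownFailure {}

  static final class RejectedCommit extends RuntimeException implements KnownFailure {}

  // The commit may or may not have gone through, so its metadata must be kept.
  static final class UnknownOutcome extends RuntimeException {}

  // Records which paths would have been deleted, in order.
  final List<String> removed = new ArrayList<>();

  void commit(
      Runnable commitAttempt,
      Set<String> newMetadataFiles,
      Set<String> filesFromEarlierAttempts) {
    try {
      commitAttempt.run();
    } catch (RuntimeException e) {
      // Strict check: only clean up new metadata on a known failure.
      if (e instanceof KnownFailure) {
        removed.addAll(newMetadataFiles);
      }
      throw e;
    } finally {
      // Always runs: leftovers from previous attempts are never referenced.
      removed.addAll(filesFromEarlierAttempts);
    }
  }
}
```

   With an UnknownOutcome failure, only the leftovers from earlier attempts end 
up in removed; with a RejectedCommit, the new metadata files are removed as 
well.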
   
   
   > @amogh-jahagirdar, in addition to fixing this, I think we also need to 
look into file cleanup for transactions. I was reviewing the current handling 
and I noticed that the deletedFiles set is never cleared between retries for a 
SIMPLE transaction. As a result, there is a chance that a file that was deleted 
by one operation's commit but reused when that operation is recommitted will 
remain in the delete set and be removed.
   
   > I think all we need to do is to clear the delete file set before each 
commit attempt. Could you look into this in a follow-up PR?
   
   Good catch, sure, I will look into this!
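   
   For the follow-up, a minimal sketch of the retry problem described in the 
quoted comment, assuming the delete set is a plain set of paths that each 
commit attempt adds to. The class and method names (RetryDeleteSetSketch, 
filesToRemove) and the file names in the usage note are made up for 
illustration and are not the transaction code itself.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not Iceberg's transaction implementation.
final class RetryDeleteSetSketch {
  // Each inner list holds the files that one commit attempt marks as deleted.
  // Returns the set of files that would be removed after the last attempt.
  static Set<String> filesToRemove(
      List<List<String>> deletedPerAttempt, boolean clearBetweenAttempts) {
    Set<String> deletedFiles = new LinkedHashSet<>();
    for (List<String> attempt : deletedPerAttempt) {
      if (clearBetweenAttempts) {
        // Proposed fix: forget deletions recorded by earlier attempts.
        deletedFiles.clear();
      }
      deletedFiles.addAll(attempt);
    }
    return deletedFiles;
  }
}
```

   If the first attempt marks m1.avro as deleted and the retry reuses m1.avro 
and only marks m2.avro, then without clearing the result still contains 
m1.avro, so a file that is in use would be removed. With clearing, only 
m2.avro remains in the set.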


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

