pvary commented on PR #10526:
URL: https://github.com/apache/iceberg/pull/10526#issuecomment-2238165277

   > @zhongqishang @pvary I have an uber question.
   > 
   > Let's say checkpoint N was cancelled or timed out and checkpoint N+1 
   > completed successfully. In this case, do we know that all the writer 
   > subtasks have flushed data files for checkpoint N and that all write 
   > results have been received by the committer?
   
   I think we only need to be sure that every writer which has successfully 
   closed on a given checkpoint is included in the Iceberg commit for that 
   checkpoint.
   If some of the writers have not closed, they will keep collecting 
   consistent data, and the results of the next checkpoint will be consistent.
   
   What we should be aware of is that the Iceberg commit for the unsuccessful 
   checkpoint might not contain everything received up to the checkpoint 
   time. I think this is OK, since users can check whether the given 
   checkpoint was successful or not.
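
   To make the reasoning above concrete, here is a minimal sketch of the 
   bookkeeping I have in mind. It is not the actual committer code: the class 
   `CheckpointCommitSketch`, its methods, and the file names are all 
   hypothetical, and it assumes one Iceberg commit per checkpoint ID, with 
   data files represented as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration only; not the real Flink/Iceberg committer.
class CheckpointCommitSketch {
    // Files reported by writers that closed successfully, keyed by checkpoint ID.
    private final NavigableMap<Long, List<String>> pending = new TreeMap<>();

    // A writer subtask closed for this checkpoint and reported a data file.
    void onWriteResult(long checkpointId, String dataFile) {
        pending.computeIfAbsent(checkpointId, id -> new ArrayList<>()).add(dataFile);
    }

    // A checkpoint completed: produce one commit per pending checkpoint ID,
    // in order, up to and including the completed one (assumed behaviour).
    NavigableMap<Long, List<String>> commitUpTo(long completedCheckpointId) {
        NavigableMap<Long, List<String>> ready = pending.headMap(completedCheckpointId, true);
        NavigableMap<Long, List<String>> commits = new TreeMap<>(ready);
        ready.clear(); // headMap is a view, so this also removes the entries from pending
        return commits;
    }
}
```

   For example, if writer 0 closed on checkpoint 1 but writer 1 did not, and 
   both closed on checkpoint 2, then `commitUpTo(2)` yields a commit for 
   checkpoint 1 holding only writer 0's file and a commit for checkpoint 2 
   holding both writers' files. Writer 1's checkpoint 2 file carries the data 
   it kept collecting, so the table is consistent after the second commit even 
   though the first one was incomplete.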


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

