pvary commented on issue #11894:
URL: https://github.com/apache/iceberg/issues/11894#issuecomment-2568809136

   > @pvary I did restart from a savepoint. I mentioned that in the description
   > 
   > > I restart the job from the latest savepoint, which is committed at the Kafka source
   
   Sorry, I missed this... too many things to catch up on after the holidays.
   
   > The problem is that Iceberg file committer did not finish writing data to my object store (S3) for the older Kafka offset (say `x`) that was committed to the savepoint.
   
   Even if the committer has not yet committed the data to the Iceberg table, the savepoint only succeeds once the temp files have been written out.
   
   > Now say offset `x + 1` is written to the next savepoint before Iceberg file committer finishes writing the data for the previous offset `x`, upon restart the data that was read from `x` offset will never be written out to the Iceberg table and hence lost.
   
   The temp files are committed to the Iceberg table upon state restore. The code also makes sure that the commits are added to the table in the correct order.
   Could you please dig into your case a little more to help us understand what is causing the issue?
   Thanks, Peter 
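
To make the restore behaviour described above concrete, here is a minimal, self-contained sketch. It is not the Iceberg Flink sink code: `Table`, `Committer`, `snapshot_state`, `commit_up_to` and `restore` are hypothetical names invented for illustration, and the table is modelled as a plain in-memory list. The sketch only assumes what the comment states: temp files are recorded in state before a savepoint succeeds, and on restore any uncommitted checkpoints are committed in ascending order.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Stand-in for an Iceberg table: an ordered log of committed checkpoints."""
    commits: list = field(default_factory=list)  # [(checkpoint_id, [files])]

    def max_committed_checkpoint_id(self) -> int:
        return max((cid for cid, _ in self.commits), default=-1)


@dataclass
class Committer:
    """Hypothetical committer that tracks temp files staged per checkpoint."""
    table: Table
    pending: dict = field(default_factory=dict)  # checkpoint_id -> [temp files]

    def snapshot_state(self, checkpoint_id, temp_files):
        # Assumption from the comment: the savepoint only succeeds once the
        # temp files exist, so their paths can be stored in the state.
        self.pending[checkpoint_id] = list(temp_files)
        return {cid: list(files) for cid, files in self.pending.items()}

    def commit_up_to(self, checkpoint_id):
        # Commit pending checkpoints in ascending id order, skipping any
        # that the table already contains.
        already = self.table.max_committed_checkpoint_id()
        for cid in sorted(self.pending):
            if cid > checkpoint_id:
                break
            files = self.pending.pop(cid)
            if cid > already:
                self.table.commits.append((cid, files))

    @classmethod
    def restore(cls, table, state, restored_checkpoint_id):
        # On restore, replay everything recorded in state up to the
        # savepoint that the job was restarted from.
        committer = cls(table, {cid: list(f) for cid, f in state.items()})
        committer.commit_up_to(restored_checkpoint_id)
        return committer


table = Table()
committer = Committer(table)
committer.snapshot_state(1, ["tmp/x.parquet"])                 # offset x
state = committer.snapshot_state(2, ["tmp/x_plus_1.parquet"])  # offset x + 1

# The job stops before either checkpoint reaches the table, then restarts
# from the savepoint taken at checkpoint 2.
restored = Committer.restore(table, state, restored_checkpoint_id=2)
print(table.commits)
```

In this model the files for offset `x` reach the table before those for `x + 1`, and restoring the same state a second time adds nothing, because checkpoints at or below the table's highest committed id are skipped.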


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


