pvary commented on issue #11894: URL: https://github.com/apache/iceberg/issues/11894#issuecomment-2568809136
> @pvary I did restart from a savepoint. I mentioned that in the description
>
> > I restart the job from the latest savepoint, which is committed at the Kafka source

Sorry, I missed this... too many things to catch up after the holidays.

> The problem is that Iceberg file committer did not finish writing data to my object store (S3) for the older Kafka offset (say `x`) that was committed to the savepoint.

Even if the committer doesn't write the data to the Iceberg table, the savepoint is only successful if the temp files are already written out.

> Now say offset `x + 1` is written to the next savepoint before Iceberg file committer finishes writing the data for the previous offset `x`, upon restart the data that was read from `x` offset will never be written out to the Iceberg table and hence lost.

The temp files are committed to the Iceberg table upon state restore. Also the code makes sure that the commits are added to the table in the correct order.

Could you please dig into your case a little bit more to help us understand what is causing the issue?

Thanks,
Peter
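
To make the expected behaviour concrete, here is a toy model of the sequence described above. It is only a sketch under stated assumptions: `Table`, `Savepoint`, `Committer`, and the file names are hypothetical and are not the actual Iceberg or Flink classes. It shows temp files being staged before the savepoint is taken, the job stopping before any commit, and the restore committing the pending files in checkpoint order.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical stand-in for an Iceberg table."""
    committed: list = field(default_factory=list)
    max_committed_checkpoint: int = 0


@dataclass
class Savepoint:
    """Hypothetical savepoint: source offset plus staged, uncommitted temp files."""
    source_offset: int
    pending: dict  # checkpoint id -> list of temp file names


class Committer:
    """Toy committer; an assumption for illustration, not the real implementation."""

    def __init__(self, table):
        self.table = table
        self.pending = {}

    def stage(self, checkpoint_id, temp_files):
        # Temp files are written out before the savepoint can succeed.
        self.pending[checkpoint_id] = list(temp_files)

    def snapshot(self, source_offset):
        # Uncommitted temp files travel with the savepoint state.
        return Savepoint(source_offset, {k: list(v) for k, v in self.pending.items()})

    def commit_up_to(self, checkpoint_id):
        # Commit in ascending checkpoint order, skipping already committed ids.
        for cid in sorted(self.pending):
            if cid > checkpoint_id:
                break
            if cid > self.table.max_committed_checkpoint:
                self.table.committed.extend(self.pending[cid])
                self.table.max_committed_checkpoint = cid
            del self.pending[cid]

    @classmethod
    def restore(cls, table, savepoint):
        # On state restore, everything still pending in the savepoint is committed.
        committer = cls(table)
        committer.pending = {k: list(v) for k, v in savepoint.pending.items()}
        committer.commit_up_to(max(committer.pending, default=0))
        return committer


def run_scenario():
    table = Table()
    committer = Committer(table)
    committer.stage(1, ["data-offset-x.parquet"])         # data read at offset x
    committer.stage(2, ["data-offset-x-plus-1.parquet"])  # data read at offset x + 1
    savepoint = committer.snapshot(source_offset=2)
    # The job stops here, before either checkpoint was committed to the table.
    restored = Committer.restore(table, savepoint)
    return table, restored


if __name__ == "__main__":
    table, _ = run_scenario()
    print(table.committed)
```

In this model the data for offset `x` is not lost, because its temp file is part of the savepoint state and is committed first on restore. If the real job behaves differently, the details of that difference are what would help narrow down the issue.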