lirui-apache commented on issue #5846:
URL: https://github.com/apache/iceberg/issues/5846#issuecomment-1651312676

   We had a similar issue here. `IcebergFilesCommitter` commits data files when 
a checkpoint completes, stores the checkpoint ID in the Iceberg snapshot 
summary, and then removes the committed manifest. When restoring from state, it 
tries to restore the [max committed checkpoint 
id](https://github.com/apache/iceberg/blob/apache-iceberg-0.13.2/flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitter.java#L147)
 from the table history. But this seems to be best effort only, because the 
snapshot might have been expired. When that happens, `IcebergFilesCommitter` 
considers all files in the state as "uncommitted" and tries to [commit them 
again](https://github.com/apache/iceberg/blob/apache-iceberg-0.13.2/flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitter.java#L154),
 which can fail because the committed manifest has already been removed. 
@openinx is this a known limitation, meaning we should be more careful with 
snapshot expiration?
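
To make the failure mode concrete, here is a minimal, self-contained sketch of the restore logic described above. It is not the actual `IcebergFilesCommitter` code: the class `RestoreSketch`, its method names, and the manifest file names are hypothetical, the summary key `flink.max-committed-checkpoint-id` is assumed, and the table history is modeled as a plain list of snapshot summaries rather than the real snapshot ancestry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Simplified model of the restore path; all names here are illustrative assumptions.
final class RestoreSketch {
    // Fallback used when no surviving snapshot records a checkpoint ID.
    static final long INITIAL_CHECKPOINT_ID = -1L;
    // Assumed summary key under which the committer stores the checkpoint ID.
    static final String MAX_COMMITTED_KEY = "flink.max-committed-checkpoint-id";

    // Scan the summaries of the snapshots that still exist and return the
    // largest recorded checkpoint ID, or the fallback if none is found.
    static long maxCommittedCheckpointId(List<Map<String, String>> snapshotSummaries) {
        long max = INITIAL_CHECKPOINT_ID;
        for (Map<String, String> summary : snapshotSummaries) {
            String value = summary.get(MAX_COMMITTED_KEY);
            if (value != null) {
                max = Math.max(max, Long.parseLong(value));
            }
        }
        return max;
    }

    // Every manifest in the restored state whose checkpoint ID is greater
    // than the recovered maximum is treated as uncommitted.
    static List<String> manifestsToRecommit(NavigableMap<Long, String> restoredState,
                                            long maxCommitted) {
        return new ArrayList<>(restoredState.tailMap(maxCommitted, false).values());
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> restoredState = new TreeMap<>();
        restoredState.put(5L, "manifest-5.avro"); // already committed, manifest deleted
        restoredState.put(6L, "manifest-6.avro"); // genuinely uncommitted

        // Snapshot for checkpoint 5 still exists: only checkpoint 6 is re-committed.
        List<Map<String, String>> history = List.of(Map.of(MAX_COMMITTED_KEY, "5"));
        System.out.println(manifestsToRecommit(restoredState, maxCommittedCheckpointId(history)));

        // Snapshot expired: the maximum falls back to -1, so checkpoint 5 is
        // re-committed too, even though its manifest file no longer exists.
        List<Map<String, String>> expired = List.of();
        System.out.println(manifestsToRecommit(restoredState, maxCommittedCheckpointId(expired)));
    }
}
```

In the second case the sketch selects `manifest-5.avro` for re-commit, which corresponds to the point where the real committer can fail because that manifest was removed after the original commit.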


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

