FranMorilloAWS commented on issue #9089: URL: https://github.com/apache/iceberg/issues/9089#issuecomment-1862314942

Yes, I am sure; I could see it in the CloudWatch Logs. What is strange is that it was not all tables: some did have the correct checkpointID matching the one the Flink job was currently at (so they could commit without issues), while the others would log:

> Assuming Flink Job is now in Checkpoint 850. Start to flush snapshot state to state backend, table: <tablename>, checkpointID: 900. Skipping committing checkpoint 900. 900 is already committed.

It would continue doing this until it caught up, and then commit. But my concern is that we don't know why these tables have a checkpointID advanced beyond the one created by the job. If this is a race condition, then whenever the trigger happens again, that table will stop committing again until it catches up to the new, advanced checkpointID. Does the Spark maintenance job update the checkpointID, so that when Flink recovers/restarts from state and loads the table metadata, it sees a different checkpointID?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
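For context on the "Skipping committing checkpoint" behavior: as I understand it, the Flink committer records the last committed checkpoint id in the table's snapshot summary (under `flink.max-committed-checkpoint-id`, keyed to the job via `flink.job-id`) and, after a restore, skips any checkpoint at or below that recorded id. A minimal Python sketch of that decision, with hypothetical helper names and a toy snapshot model (this is an illustration, not the actual Iceberg committer code):

```python
# Sketch of the commit-skip decision described in the log message above.
# Assumption: the committer reads the last committed checkpoint id for its
# job from the newest matching snapshot summary, then only commits
# checkpoints strictly greater than that id.

MAX_COMMITTED_CHECKPOINT_ID = "flink.max-committed-checkpoint-id"
FLINK_JOB_ID = "flink.job-id"


def max_committed_checkpoint_id(snapshots, flink_job_id):
    """Walk snapshots newest-first (newest last in this toy list) and return
    the checkpoint id recorded by this Flink job, or -1 if none is found."""
    for snapshot in reversed(snapshots):
        summary = snapshot.get("summary", {})
        if (summary.get(FLINK_JOB_ID) == flink_job_id
                and MAX_COMMITTED_CHECKPOINT_ID in summary):
            return int(summary[MAX_COMMITTED_CHECKPOINT_ID])
    return -1


def should_commit(checkpoint_id, max_committed):
    # Checkpoints at or below the recorded id are treated as already
    # committed and skipped -- the behavior seen in the CloudWatch logs.
    return checkpoint_id > max_committed
```

With a recorded id of 900, a job restored at checkpoint 850 would skip checkpoints 851..900 and only resume committing at 901, matching the "catches up and then commits" pattern in the logs. This is why anything else (e.g. a maintenance job) rewriting that summary value would stall commits until the job catches up.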
Yes I am sure, I could see it in the CloudWatch Logs. What is weird is that it was not for all tables, some did have the correct checkpointID that matched the one that the Flink Job was currently at (so it could commit without issues), the others, would show Assuming Flink Job is now in Checkpoint 850 Start to flush snapshot state to state backend, table: <tablename>, checkpointID: 900 Skipping commiting checkpoint 900. 900 is already commited. It would continue doing this until it catches up and the commits. But my concern is as we dont know why this tables have the checkpointID is advanced beyoned the one created by the job. If this is a race condition so whenever the trigger happens again, that table will stop commiting again until it catches up the new advanced checkpointID. Does Spark Maintenance Job update the checkpointID, so when Flink recovers/restarts after update from state and loads the metadata of the table, it has a different checkpointID? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org