FranMorilloAWS commented on issue #9089:
URL: https://github.com/apache/iceberg/issues/9089#issuecomment-1862314942

   Yes, I am sure; I could see it in the CloudWatch Logs. What is weird is that it was not for all tables: some did have the correct checkpointID matching the one the Flink job was currently at (so they could commit without issues), while the others would show the following (assuming the Flink job is currently at checkpoint 850):
   
   Start to flush snapshot state to state backend, table: <tablename>, checkpointID: 900
   Skipping committing checkpoint 900. 900 is already committed.
   
   
   It would continue doing this until it catches up, and then the commits succeed. But my concern is that we don't know why these tables have a checkpointID that is advanced beyond the one created by the job. If this is a race condition, then whenever the trigger happens again, that table will stop committing again until it catches up with the new, advanced checkpointID. Does the Spark maintenance job update the checkpointID, so that when Flink recovers/restarts after an update from state and loads the table metadata, it sees a different checkpointID?
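   For context on why the commits are skipped, here is a minimal, hypothetical sketch (not Iceberg's actual implementation) of the skip logic those log lines suggest, assuming the Flink sink records the highest committed checkpoint ID in the table's snapshot summary under `flink.max-committed-checkpoint-id` and refuses to re-commit anything at or below it:

   ```python
   # Hypothetical sketch of the commit-skip behavior described in the logs
   # above. Assumption: the committer reads the most recent snapshot summary
   # entry "flink.max-committed-checkpoint-id" and only commits checkpoints
   # strictly greater than that value.

   def get_max_committed_checkpoint_id(snapshot_summaries):
       """Walk snapshot summaries newest-first; return the first recorded ID."""
       for summary in reversed(snapshot_summaries):
           value = summary.get("flink.max-committed-checkpoint-id")
           if value is not None:
               return int(value)
       return -1  # nothing committed by this Flink job yet

   def should_commit(checkpoint_id, snapshot_summaries):
       """Commit only if this checkpoint is newer than the last committed one."""
       return checkpoint_id > get_max_committed_checkpoint_id(snapshot_summaries)

   # If table metadata says 900 was already committed, a job currently at
   # checkpoint 850 will skip every commit until its checkpoints pass 900:
   summaries = [{"flink.max-committed-checkpoint-id": "900"}]
   print(should_commit(850, summaries))  # False: skipped, 850 <= 900
   print(should_commit(901, summaries))  # True: committed, 901 > 900
   ```

   Under that assumption, anything that rewrites the stored checkpoint ID ahead of the job's actual checkpoint counter would produce exactly the "skipping" behavior until the job's own checkpoints catch up.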


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
