aiborodin commented on PR #14092:
URL: https://github.com/apache/iceberg/pull/14092#issuecomment-3336965874

   >  I'm suggesting to combine the append-only WriteResults in each Flink 
checkpoint.
   
   @mxm Combining only the append-only WriteResults in each Flink checkpoint 
would not resolve the issue for WriteResults that contain delete files (see my 
comment in https://github.com/apache/iceberg/pull/14182#issuecomment-3336891582). 
We should combine both appends and deletes within the same checkpoint, as 
implemented in this change. This is valid under equality delete semantics 
because records with the same equality fields are always routed to the same 
writer, so that writer stores all unique equality delete keys in the same 
delete file.
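   
   To illustrate what "combining within the same checkpoint" means, here is a 
minimal sketch. It is not the code in this PR: `SimpleWriteResult` and 
`combinePerCheckpoint` are hypothetical names, and the stand-in class tracks 
only file names, whereas Iceberg's real `WriteResult` carries `DataFile` and 
`DeleteFile` objects. The point is only that results are merged per 
checkpoint ID, data files and delete files together, and never across 
checkpoints.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CheckpointCombiner {

  // Hypothetical, simplified stand-in for a write result (file names only).
  static final class SimpleWriteResult {
    final long checkpointId;
    final List<String> dataFiles;
    final List<String> deleteFiles;

    SimpleWriteResult(long checkpointId, List<String> dataFiles, List<String> deleteFiles) {
      this.checkpointId = checkpointId;
      this.dataFiles = dataFiles;
      this.deleteFiles = deleteFiles;
    }
  }

  // Merges all results that share a checkpoint ID into one result,
  // keeping both data files and delete files. Checkpoints stay separate
  // and are returned in ascending checkpoint order.
  static Map<Long, SimpleWriteResult> combinePerCheckpoint(List<SimpleWriteResult> results) {
    Map<Long, SimpleWriteResult> combined = new TreeMap<>();
    for (SimpleWriteResult result : results) {
      combined.merge(
          result.checkpointId,
          result,
          (left, right) -> {
            List<String> data = new ArrayList<>(left.dataFiles);
            data.addAll(right.dataFiles);
            List<String> deletes = new ArrayList<>(left.deleteFiles);
            deletes.addAll(right.deleteFiles);
            return new SimpleWriteResult(left.checkpointId, data, deletes);
          });
    }
    return combined;
  }
}
```

   With this shape, a checkpoint that received an append-only result from one 
writer and a result with equality deletes from another yields a single 
combined entry, while the next checkpoint still yields its own entry.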
   
   > While we could combine multiple Flink checkpoints during recovery, I don't 
think there is much benefit from doing that. Apart from recovery, every 
checkpoint would normally be processed independently. We wouldn't gain much 
from optimizing the snapshots by combining commit request from multiple 
checkpoints.
   
   My suggestion is to combine both appends and deletes for the same 
checkpoint, which I implemented in this change.


