aiborodin commented on PR #14092: URL: https://github.com/apache/iceberg/pull/14092#issuecomment-3336965874
> I'm suggesting to combine the append-only WriteResults in each Flink checkpoint.

@mxm Combining only the append-only WriteResults in each Flink checkpoint would not resolve the issue for WriteResults with delete files (see my comment in https://github.com/apache/iceberg/pull/14182#issuecomment-3336891582). We should combine both appends and deletes within the same checkpoint (implemented in this change). This is valid for equality delete semantics because records with the same equality fields are always routed to the same writer, so all unique equality delete keys end up in the same delete file.

> While we could combine multiple Flink checkpoints during recovery, I don't think there is much benefit from doing that. Apart from recovery, every checkpoint would normally be processed independently. We wouldn't gain much from optimizing the snapshots by combining commit requests from multiple checkpoints.

My suggestion is to combine both appends and deletes for the same checkpoint, which I implemented in this change.
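
To make the per-checkpoint combining concrete, here is a minimal, self-contained sketch. It is not the code in this PR and does not use Iceberg's classes: `CheckpointCombiner`, `Result`, and `combinePerCheckpoint` are hypothetical names, and the file names are placeholders. It only shows the grouping idea, assuming each writer result carries its checkpoint id: data files and delete files from the same checkpoint are merged together, and results from different checkpoints are kept separate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CheckpointCombiner {

  // Hypothetical stand-in for a per-writer result; NOT Iceberg's WriteResult.
  public static class Result {
    public final long checkpointId;
    public final List<String> dataFiles;
    public final List<String> deleteFiles;

    public Result(long checkpointId, List<String> dataFiles, List<String> deleteFiles) {
      this.checkpointId = checkpointId;
      this.dataFiles = dataFiles;
      this.deleteFiles = deleteFiles;
    }
  }

  // Merge all results that belong to the same checkpoint into one result,
  // keeping appends (data files) and deletes (delete files) together.
  // Results from different checkpoints are never merged with each other.
  public static Map<Long, Result> combinePerCheckpoint(List<Result> results) {
    Map<Long, Result> combined = new TreeMap<>(); // ordered by checkpoint id
    for (Result r : results) {
      Result acc = combined.computeIfAbsent(
          r.checkpointId,
          id -> new Result(id, new ArrayList<>(), new ArrayList<>()));
      acc.dataFiles.addAll(r.dataFiles);
      acc.deleteFiles.addAll(r.deleteFiles);
    }
    return combined;
  }

  public static void main(String[] args) {
    List<Result> results = Arrays.asList(
        new Result(1L, Arrays.asList("data-a"), Arrays.asList("eq-delete-a")),
        new Result(1L, Arrays.asList("data-b"), new ArrayList<>()),
        new Result(2L, Arrays.asList("data-c"), Arrays.asList("eq-delete-c")));

    // One combined entry per checkpoint, each holding both file kinds.
    for (Result r : combinePerCheckpoint(results).values()) {
      System.out.println(r.checkpointId + " " + r.dataFiles + " " + r.deleteFiles);
    }
  }
}
```

In this sketch the two results for checkpoint 1 collapse into a single entry, while checkpoint 2 stays a separate entry, mirroring the point that each checkpoint is processed independently.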
