hrishisd opened a new issue, #9718: URL: https://github.com/apache/iceberg/issues/9718
### Feature Request / Improvement

## Request

The `SnapshotProducer` API provides the capability to validate that the snapshots in the latest table metadata don't introduce changes that conflict with the new snapshot being committed. `BaseOverwriteFiles` implements custom validation that checks only the snapshots committed after a supplied `startingSnapshotId`. When validation fails and the commit is retried, `BaseOverwriteFiles::validate` validates the snapshots that newly landed in the base metadata and then re-validates the snapshots it already validated on the previous attempt.

I'm requesting that we checkpoint the validation in `BaseOverwriteFiles` so that, on commit retry, we avoid re-validating snapshots that were validated on a previous commit attempt. The change itself amounts to setting `startingSnapshotId` to the latest snapshot of the base table metadata after a successful validation pass [[code](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/BaseOverwriteFiles.java#L114-L157)] (see the sketch at the bottom of this issue).

## Motivation

We understand that Iceberg is not optimized for a high frequency of commits. However, we have observed that workloads where users spin up backfill jobs with ~10-100 workers, each committing in parallel, frequently fail. If the total number of writes across all workers reaches 1k+, it's possible for conflicts to cause commit retries. Even with an aggressive retry policy, writes can fail because each retry must re-validate all of the snapshots validated on the previous attempt, plus the new snapshots that landed during the backoff period. The growing duration of the validation step makes it more likely that other writers land commits while validation is in progress, so the work increases with each retry. This creates a bimodal distribution of commit latencies: commits either succeed relatively quickly or get stuck in an increasingly expensive validation cycle that eventually exhausts all available retries.

Most of these users are not sensitive to the latency of their backfill jobs and just don't want the jobs to fail (i.e., the latency of effectively serializing all of their commits does not matter). We've created a patched `BaseOverwriteFiles` implementation that performs checkpointing and have seen a marked improvement in the failure rate of parallel backfill-style workloads. We eventually want to migrate some of these users to a compute engine that stages writes and performs a single commit.

If this change looks reasonable, I'm happy to file a PR that implements checkpointing for `BaseOverwriteFiles` and the other `SnapshotProducer` subclasses that implement validation from a `startingSnapshotId`.

### Query engine

None
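
---

To make the proposal concrete, here is a minimal, self-contained sketch of the checkpointing idea. The types below (`Snapshot`, `TableMetadata`, `checkNoConflicts`) are simplified stand-ins rather than the real Iceberg classes; the actual change would live in `BaseOverwriteFiles::validate` and use the existing per-snapshot conflict checks.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the Iceberg types involved; not the real API.
record Snapshot(long snapshotId) {}

class TableMetadata {
  final List<Snapshot> snapshots = new ArrayList<>(); // oldest first
  Snapshot currentSnapshot() {
    return snapshots.isEmpty() ? null : snapshots.get(snapshots.size() - 1);
  }
}

// validate() is invoked once per commit attempt. After a successful pass it
// advances startingSnapshotId to the current snapshot of the base metadata,
// so a retry only validates snapshots that landed after this attempt.
class CheckpointingValidator {
  private Long startingSnapshotId = null; // null: validate the full history

  void validate(TableMetadata base) {
    // Collect the snapshots committed after the checkpoint, newest to oldest.
    List<Snapshot> toValidate = new ArrayList<>();
    for (int i = base.snapshots.size() - 1; i >= 0; i--) {
      Snapshot s = base.snapshots.get(i);
      if (startingSnapshotId != null && s.snapshotId() == startingSnapshotId) {
        break; // everything at or before the checkpoint already passed
      }
      toValidate.add(s);
    }

    for (Snapshot s : toValidate) {
      checkNoConflicts(s); // throws on conflict (placeholder)
    }

    // Checkpoint: all snapshots in `base` have now been validated. If the
    // commit loses the race and is retried, validation resumes from here
    // instead of re-walking the snapshots validated above.
    Snapshot current = base.currentSnapshot();
    if (current != null) {
      startingSnapshotId = current.snapshotId();
    }
  }

  private void checkNoConflicts(Snapshot snapshot) {
    // Placeholder for the real per-snapshot conflict checks (added data
    // files, deleted data files, new delete files, etc.).
  }
}
```

Note that the checkpoint only advances after a pass in which every conflict check succeeds, so a validation failure still surfaces as a `ValidationException` on the attempt that detects it; checkpointing only removes the redundant re-validation work on subsequent retries.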