hrishisd opened a new issue, #9718:
URL: https://github.com/apache/iceberg/issues/9718

   ### Feature Request / Improvement
   
   ## Request
   The `SnapshotProducer` API provides the capability to validate that the snapshots in the latest table metadata don't introduce changes that conflict with the new snapshot being committed. `BaseOverwriteFiles` implements custom validation that only considers snapshots committed after a supplied `startingSnapshotId`. When a commit attempt fails and is retried, `BaseOverwriteFiles::validate` validates the snapshots that newly landed in the base metadata, but it also re-validates every snapshot it already checked on the previous attempt.
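
   To make the retry cost concrete, here is a minimal self-contained sketch of that pattern (plain Java with hypothetical names such as `Snapshot` and `checkConflicts`; monotonically increasing snapshot ids are assumed for illustration, whereas Iceberg actually walks the snapshot lineage):

```java
import java.util.List;

class RevalidationSketch {
  record Snapshot(long id) {}

  // Stand-in for BaseOverwriteFiles::validate: because startingSnapshotId
  // never advances, every retry re-checks all snapshots validated on the
  // previous attempt, plus whatever landed during the backoff period.
  static void validate(List<Snapshot> history, long startingSnapshotId) {
    for (Snapshot snapshot : history) {
      if (snapshot.id() > startingSnapshotId) {
        checkConflicts(snapshot);
      }
    }
  }

  static void checkConflicts(Snapshot snapshot) {
    // placeholder: scan the snapshot's manifests for conflicting
    // data/delete files relative to the pending commit
  }
}
```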
   
   I'm requesting that we checkpoint validation in `BaseOverwriteFiles` so that a commit retry can skip the snapshots already validated on the previous attempt. The change amounts to setting `startingSnapshotId` to the latest snapshot of the base table metadata after a successful validation pass [[code](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/BaseOverwriteFiles.java#L114-L157)].
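
   Continuing the simplified sketch above (same hypothetical names, not the actual patch), the checkpoint amounts to remembering the newest snapshot that passed validation and using it as the `startingSnapshotId` for the next attempt:

```java
  // Added to RevalidationSketch: after a successful pass, return the id of
  // the newest validated snapshot so that a retry only needs to validate
  // snapshots that landed during the backoff window.
  static long validateWithCheckpoint(List<Snapshot> history, long startingSnapshotId) {
    long checkpoint = startingSnapshotId;
    for (Snapshot snapshot : history) {
      if (snapshot.id() > startingSnapshotId) {
        checkConflicts(snapshot);
        checkpoint = Math.max(checkpoint, snapshot.id());
      }
    }
    return checkpoint; // the caller stores this as startingSnapshotId for the retry
  }
```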
 
   
   
   ## Motivation
   We understand that Iceberg is not optimized for a high frequency of commits. However, we have observed that workloads where users spin up backfill jobs with ~10-100 workers, each worker committing in parallel, frequently fail. Once the total number of writes across all workers reaches 1k+, conflicts are likely to trigger commit retries. Even with an aggressive retry policy, writes can fail because each retry must re-validate all of the snapshots validated on the previous attempt in addition to the new snapshots that landed during the backoff period. Since the validation step grows longer with each retry, other writers become more and more likely to land their commits first.
   
   This creates a bimodal distribution of commit latencies where commits either 
succeed relatively quickly or get stuck in an increasingly expensive validation 
cycle that eventually exhausts all available retries. 
   
   Most of these users are not sensitive to the latency of their backfill jobs and just don't want the jobs to fail (i.e., the latency of effectively serializing all of their commits does not matter). We've created a patched `BaseOverwriteFiles` implementation that performs checkpointing and have seen a marked reduction in the failure rate of parallel backfill-style workloads. We eventually want to migrate some of these users to a compute engine that stages writes and performs a single commit.
   
   ---
   
   If this change looks reasonable, I'm happy to file a PR that implements checkpointing for `BaseOverwriteFiles` and the other `SnapshotProducer` subclasses that implement validation from a `startingSnapshotId`.
   
   ### Query engine
   
   None

