hrishisd opened a new issue, #9718:
URL: https://github.com/apache/iceberg/issues/9718

   ### Feature Request / Improvement
   
   ## Request
   The `SnapshotProducer` API provides the capability to validate that the snapshots in the latest table metadata don't introduce changes that conflict with the new snapshot being committed. `BaseOverwriteFiles` implements custom validation that only considers snapshots committed after a supplied `startingSnapshotId`. When a commit attempt fails and is retried, `BaseOverwriteFiles::validate` validates the snapshots that newly landed in the base metadata, but it also re-validates every snapshot it already checked on the previous attempt.
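
   To make the retry cost concrete, here is a minimal self-contained sketch of that pattern (plain Java with hypothetical names such as `Snapshot` and `checkConflicts`; monotonically increasing snapshot ids are assumed for illustration, whereas Iceberg actually walks the snapshot lineage):

```java
import java.util.List;

class RevalidationSketch {
  record Snapshot(long id) {}

  // Stand-in for BaseOverwriteFiles::validate: because startingSnapshotId
  // never advances, every retry re-checks all snapshots validated on the
  // previous attempt, plus whatever landed during the backoff period.
  static void validate(List<Snapshot> history, long startingSnapshotId) {
    for (Snapshot snapshot : history) {
      if (snapshot.id() > startingSnapshotId) {
        checkConflicts(snapshot);
      }
    }
  }

  static void checkConflicts(Snapshot snapshot) {
    // placeholder: scan the snapshot's manifests for conflicting
    // data/delete files relative to the pending commit
  }
}
```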
   
   I'm requesting that we checkpoint validation in `BaseOverwriteFiles` so that a commit retry can skip the snapshots already validated on the previous attempt. The change amounts to setting `startingSnapshotId` to the latest snapshot of the base table metadata after a successful validation pass [[code](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/BaseOverwriteFiles.java#L114-L157)].
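
   Continuing the simplified sketch above (same hypothetical names, not the actual patch), the checkpoint amounts to remembering the newest snapshot that passed validation and using it as the `startingSnapshotId` for the next attempt:

```java
  // Added to RevalidationSketch: after a successful pass, return the id of
  // the newest validated snapshot so that a retry only needs to validate
  // snapshots that landed during the backoff window.
  static long validateWithCheckpoint(List<Snapshot> history, long startingSnapshotId) {
    long checkpoint = startingSnapshotId;
    for (Snapshot snapshot : history) {
      if (snapshot.id() > startingSnapshotId) {
        checkConflicts(snapshot);
        checkpoint = Math.max(checkpoint, snapshot.id());
      }
    }
    return checkpoint; // the caller stores this as startingSnapshotId for the retry
  }
```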
 
   
   
   ## Motivation
   We understand that Iceberg is not optimized for a high frequency of commits. However, we have observed that workloads where users spin up backfill jobs with ~10-100 workers, each worker committing in parallel, frequently fail. Once the total number of writes across all workers reaches 1k+, conflicts are likely to trigger commit retries. Even with an aggressive retry policy, writes can fail because each retry must re-validate all of the snapshots validated on the previous attempt in addition to the new snapshots that landed during the backoff period. Since the validation step grows longer with each retry, other writers become more and more likely to land their commits first.
   
   This creates a bimodal distribution of commit latencies where commits either 
succeed relatively quickly or get stuck in an increasingly expensive validation 
cycle that eventually exhausts all available retries. 
   
   Most of these users are not sensitive to the latency of their backfill jobs and just don't want the jobs to fail (i.e., the latency of effectively serializing all of their commits does not matter). We've created a patched `BaseOverwriteFiles` implementation that performs checkpointing and have seen a marked reduction in the failure rate of parallel backfill-style workloads. We eventually want to migrate some of these users to a compute engine that stages writes and performs a single commit.
   
   ---
   
   If this change looks reasonable, I'm happy to file a PR that implements checkpointing for `BaseOverwriteFiles` and the other `SnapshotProducer` subclasses that implement validation from a `startingSnapshotId`.
   
   ### Query engine
   
   None

