kevinjqliu opened a new pull request, #16356:
URL: https://github.com/apache/iceberg/pull/16356

   
   ## Why
   
   The shared Gradle caches managed by `gradle/actions/setup-gradle` are 
written by every job that doesn't explicitly opt out. With ~10 setup-gradle 
invocations across 12 workflows, parallel jobs race to save overlapping caches 
on every push to `main`, producing duplicate entries and accelerating LRU 
eviction pressure against GitHub's 10 GB per-repo cap.
   
   This causes **cache thrashing**: each commit produces ~3–4 GB of fresh 
per-job `gradle-home-...-<sha>` entries that immediately evict older entries 
(including hot dependency caches) under LRU. The next build then misses on 
entries that should have been warm, re-downloads dependencies, writes new 
entries, and evicts again. The result is a self-perpetuating churn loop that 
wastes minutes per build and keeps the cache permanently cold even though it 
is at capacity.
   
   ## What
   
   Restrict cache writes to a single canonical job and make every other job 
read-only:
   
   - **Sole writer:** `java-ci.yml` → `build-checks (17)` on `refs/heads/main` 
only. This job runs `./gradlew -DallModules build` and therefore resolves the 
union dependency closure that all other workflows need.
   - **Read-only everywhere else:** `cache-read-only: true` added to 
setup-gradle in `java-ci.yml` (3 other jobs), `spark-ci.yml`, `flink-ci.yml`, 
`hive-ci.yml`, `kafka-connect-ci.yml`, `delta-conversion-ci.yml` (×2), 
`cve-scan.yml`, `api-binary-compatibility.yml`, `publish-snapshot.yml`, 
`publish-iceberg-rest-fixture-docker.yml`, `recurring-jmh-benchmarks.yml`.
   - **Cache-disabled:** `jmh-benchmarks.yml` sets `cache-disabled: true`. It 
runs via `workflow_dispatch` against an arbitrary repo/ref, so disabling the 
cache avoids cache poisoning.
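   
   The three modes above can be sketched as the following workflow fragment. 
This is a minimal illustration, not the exact diff in this PR: the job ids, the 
`matrix.jvm` key, the runner, and the `@v4` action version are assumptions, and 
the `cache-read-only` expression is one possible way to limit writes to the 
JDK 17 leg on `main`. Only the `cache-read-only` and `cache-disabled` inputs 
and the `./gradlew -DallModules build` command come from the description.
   
   ```yaml
   # Hypothetical excerpt: job ids, matrix key, and action versions are assumed.
   jobs:
     build-checks:
       runs-on: ubuntu-latest
       strategy:
         matrix:
           jvm: [17, 21]
       steps:
         - uses: actions/checkout@v4
         - uses: gradle/actions/setup-gradle@v4
           with:
             # Sole writer: writable only for the JDK 17 leg on main,
             # read-only for every other ref and matrix leg.
             cache-read-only: ${{ !(github.ref == 'refs/heads/main' && matrix.jvm == 17) }}
         - run: ./gradlew -DallModules build

     other-checks:
       runs-on: ubuntu-latest
       steps:
         - uses: actions/checkout@v4
         - uses: gradle/actions/setup-gradle@v4
           with:
             # Read-only everywhere else: restore from the cache, never save.
             cache-read-only: true

     jmh-benchmarks:
       runs-on: ubuntu-latest
       steps:
         - uses: actions/checkout@v4
         - uses: gradle/actions/setup-gradle@v4
           with:
             # Untrusted repo/ref: neither restore nor save.
             cache-disabled: true
   ```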
   
   Read-only jobs still benefit from cache **restores** (including 
setup-gradle's `restore-keys` prefix walk that lets matrix variants pull a 
sibling job's `gradle-home` entry).
   
   ## Validation
   
   Validated on a fork (`kevinjqliu/iceberg`) across 4 rounds:
   
   | Round | Trigger | Outcome |
   |---|---|---|
   | R1 | Initial main push (cold) | New caches saved by `build-checks (17)` only |
   | R2 | PR run (`refs/pull/20/*`) | **Zero** new cache entries created (PRs are read-only) |
   | R3 | 2nd main push (warm) | `build-checks (17)` updated `gradle-home` and `gradle-dependencies`; no other writers |
   | R4 | 3rd main push | All restores hit; only the single per-commit `gradle-home` entry created |
   
   Job logs confirm `Cache is read-only: will not save state for use in 
subsequent builds.` for every non-writer job, and `Saved cache entry with key 
gradle-home-v1\|...build-checks[...]-<sha>` only from `build-checks (17)`.
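   
   Besides reading job logs, the set of saved entries can be inspected 
directly. The workflow below is a hypothetical audit helper and is not part of 
this PR; its name, trigger, and job id are assumptions. It relies on the GitHub 
CLI's `gh cache list` command to show entries whose key starts with 
`gradle-home`, newest first, which should make an unexpected writer visible.
   
   ```yaml
   # Hypothetical audit workflow: file name, workflow name, and job id are assumed.
   name: list-gradle-caches
   on: workflow_dispatch
   permissions:
     actions: read
   jobs:
     list:
       runs-on: ubuntu-latest
       steps:
         - name: List gradle-home cache entries, newest first
           env:
             GH_TOKEN: ${{ github.token }}
             GH_REPO: ${{ github.repository }}
           run: gh cache list --key gradle-home --sort created_at --order desc --limit 100
   ```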
   
   ## Impact
   
   - Eliminates inter-job cache write races and duplicate entries across the 
matrix
   - Single deterministic write point makes cache contents predictable and 
debuggable
   - No build-time regression: read-only jobs still get full restore behavior
   - Reduces steady-state cache footprint and slows accumulation against the 10 
GB cap
   
   ## Files changed
   
   12 workflows under `.github/workflows/`, +58 lines (comments + 1–2 new keys 
per file). No code changes, no test changes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

