This is an automated email from the ASF dual-hosted git repository.

zhengruifeng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new b96b63350c3c [SPARK-57069][INFRA] Share SBT precompile artifact with docker/k8s integration test CI jobs
b96b63350c3c is described below

commit b96b63350c3c153b10452108dbd892069f7be0f4
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Thu May 28 16:14:46 2026 +0800

    [SPARK-57069][INFRA] Share SBT precompile artifact with docker/k8s integration test CI jobs
    
    ### What changes were proposed in this pull request?
    
    This PR extends the SBT precompile-sharing pattern (parent: 
[SPARK-56830](https://issues.apache.org/jira/browse/SPARK-56830); prior 
sub-tasks: [SPARK-56768](https://issues.apache.org/jira/browse/SPARK-56768) 
pyspark, [SPARK-56831](https://issues.apache.org/jira/browse/SPARK-56831) 
sparkr, [SPARK-56943](https://issues.apache.org/jira/browse/SPARK-56943) JVM 
build) to the two remaining SBT-compiling jobs in 
`.github/workflows/build_and_test.yml` that still run their own full Spark 
compile:
    
    - `docker-integration-tests`
    - `k8s-integration-tests`
    
    Concretely:
    
    - The existing `precompile` job's `if:` gate is extended to also fire when 
`docker-integration-tests == 'true'` or `k8s-integration-tests == 'true'` in 
the precondition output, so the artifact is available whenever either job needs 
it.
    - The precompile SBT invocation adds `-Pkubernetes-integration-tests`, so 
the integration-tests submodule's `target/` ends up in the shared artifact and 
the k8s job doesn't have to recompile it.
    - `docker-integration-tests`:
      - `needs: precondition` -> `needs: [precondition, precompile]`
      - `if:` extended with `(!cancelled()) &&` so the job still runs when `precompile` fails or is skipped; it is still skipped when the workflow run itself is cancelled.
      - Adds "Download precompiled artifact" + "Extract precompiled artifact" 
steps between Java setup and `Run tests`, with graceful fallback 
(`continue-on-error: true`).
      - `Run tests` exports `SKIP_SCALA_BUILD=true` when extraction succeeded; 
`dev/run-tests.py` already honors this flag and skips `build_apache_spark` + 
`build_spark_assembly_sbt`.
    - `k8s-integration-tests`:
      - Same `needs:` and `if:` change.
      - Adds the same Download/Extract steps after Java setup.
      - The actual test runs via a direct `build/sbt ... 
"kubernetes-integration-tests/test"` call rather than `dev/run-tests.py`, so no 
`SKIP_SCALA_BUILD` is set. SBT sees the extracted `target/` and skips 
compilation of the pre-built modules (Spark Core, SQL, kubernetes, 
integration-tests, ...); only the small SparkR Scala bindings still compile 
(the precompile doesn't include `-Psparkr` because that profile activates 
`core/buildRPackage`, which shells out to R, and the precompile runne [...]
    
    ### Optional: graceful fallback if precompile fails
    
    Same pattern as the prior sub-tasks:
    - `precompile` keeps `continue-on-error: true`.
    - Both consumers' "Download precompiled artifact" step is gated on 
`needs.precompile.result == 'success'` and has `continue-on-error: true`.
    - "Extract precompiled artifact" is gated on the download succeeding and 
has `continue-on-error: true`.
    - For docker, `SKIP_SCALA_BUILD=true` is exported only when 
`steps.extract-precompiled.outcome == 'success'`; otherwise `dev/run-tests.py` 
runs the original local SBT build.
    - For k8s, if extraction fails, SBT compiles from scratch as before.
    
    In the worst case, both jobs degrade to the pre-PR behavior (a full local SBT build) rather than failing the workflow.
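    
    As a rough illustration of the fallback chain above, the decision each consumer effectively makes can be sketched in shell. The function name `choose_build_path` and its two output strings are hypothetical and do not appear in the workflow; only the three conditions mirror the gates listed above.
    
    ```shell
    # Hypothetical helper, not part of build_and_test.yml: prints which path a
    # consumer job ends up on, given the three outcomes the workflow gates on.
    choose_build_path() {
      precompile_result="$1"  # stands in for needs.precompile.result
      download_outcome="$2"   # stands in for steps.download-precompiled.outcome
      extract_outcome="$3"    # stands in for steps.extract-precompiled.outcome
      if [ "$precompile_result" = "success" ] &&
         [ "$download_outcome" = "success" ] &&
         [ "$extract_outcome" = "success" ]; then
        echo "reuse-artifact"
      else
        echo "local-sbt-build"
      fi
    }
    ```
    
    Any failed or skipped link in the chain lands on the local build path, which is the pre-PR behavior.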
    
    ### Profile coverage
    
    The precompile job runs:
    ```
    ./build/sbt -Phadoop-3 -Pyarn -Pspark-ganglia-lgpl -Phadoop-cloud -Phive \
      -Pkubernetes -Pjvm-profiler -Pkinesis-asl -Phive-thriftserver \
      -Pdocker-integration-tests -Pkubernetes-integration-tests -Pvolcano \
      Test/package streaming-kinesis-asl-assembly/assembly connect/assembly 
assembly/package
    ```
    
    - `docker-integration-tests`: profile is in the precompile invocation; the 
module's `target/` is pre-built, so `dev/run-tests --modules 
docker-integration-tests` only runs the test phase.
    - `k8s-integration-tests`: `-Pkubernetes` and 
`-Pkubernetes-integration-tests` are both in the precompile, so the 
integration-tests submodule is pre-built. The job's direct SBT call adds 
`-Psparkr`, which triggers compile of the small SparkR Scala bindings on top of 
the reused `target/`. Net work in this job drops from "compile all of Spark + 
integration tests + sparkr" to "compile only the sparkr module".
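    
    One way to sanity-check these coverage claims after extraction might be to confirm that a module directory already contains a `target/` directory. This is only a sketch: the helper `has_prebuilt_target` is hypothetical, and the module path in the usage comment is an assumption about the repository layout rather than something this PR states.
    
    ```shell
    # Hypothetical helper: succeeds if the given module directory contains a
    # target/ directory, i.e. some build output is already present there.
    has_prebuilt_target() {
      [ -d "$1/target" ]
    }
    
    # Example use (the module path is an assumption; adjust to the real layout):
    #   has_prebuilt_target resource-managers/kubernetes/integration-tests \
    #     && echo "k8s integration-tests module has pre-built output"
    ```
    
    The check only shows that `target/` exists; it does not prove that SBT will treat its contents as up to date.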
    
    ### Why are the changes needed?
    
    Today, every scheduled or dispatched run of `build_and_test.yml` that requires `docker-integration-tests` or `k8s-integration-tests` repeats the SBT compile that `precompile` already performs for `pyspark` / `sparkr` / `build`. Wiring these two consumers to the existing artifact removes that duplicate work at no extra cost, because `precompile` is already running for the other consumers.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No. CI infrastructure change only.
    
    ### How was this patch tested?
    
    The change is exercised by the CI run of this PR itself. The 
Download/Extract steps log artifact size; the Run tests step prints `Reusing 
precompiled artifact, skipping local SBT build.` for the docker job when the 
fast path is taken. If the precompile job is forced to fail (or its artifact is 
missing), both consumers fall back to the original local SBT build.
    
    Measured CI timings before vs after are posted as a comment on this PR.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Generated-by: Claude Code (Opus 4.7)
    
    Closes #56110 from zhengruifeng/share-precompile-integration-tests-dev5.
    
    Authored-by: Ruifeng Zheng <[email protected]>
    Signed-off-by: Ruifeng Zheng <[email protected]>
---
 .github/workflows/build_and_test.yml | 46 +++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 6 deletions(-)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 6c2606f62683..3d5e94ef275f 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -573,7 +573,9 @@ jobs:
         fromJson(needs.precondition.outputs.required).pyspark == 'true' ||
         fromJson(needs.precondition.outputs.required).pyspark-pandas == 'true' ||
         fromJson(needs.precondition.outputs.required).pyspark-install == 'true' ||
-        fromJson(needs.precondition.outputs.required).sparkr == 'true')
+        fromJson(needs.precondition.outputs.required).sparkr == 'true' ||
+        fromJson(needs.precondition.outputs.required).docker-integration-tests == 'true' ||
+        fromJson(needs.precondition.outputs.required).k8s-integration-tests == 'true')
     name: "Precompile Spark"
     runs-on: ubuntu-latest
     timeout-minutes: 60
@@ -624,7 +626,7 @@ jobs:
       run: |
         ./build/sbt -Phadoop-3 -Pyarn -Pspark-ganglia-lgpl -Phadoop-cloud -Phive \
           -Pkubernetes -Pjvm-profiler -Pkinesis-asl -Phive-thriftserver \
-          -Pdocker-integration-tests -Pvolcano \
+          -Pdocker-integration-tests -Pkubernetes-integration-tests -Pvolcano \
           Test/package streaming-kinesis-asl-assembly/assembly connect/assembly assembly/package
     - name: Package compile output
       run: |
@@ -1510,8 +1512,8 @@ jobs:
         path: "**/target/unit-tests.log"
 
   docker-integration-tests:
-    needs: precondition
-    if: fromJson(needs.precondition.outputs.required).docker-integration-tests == 'true'
+    needs: [precondition, precompile]
+    if: (!cancelled()) && fromJson(needs.precondition.outputs.required).docker-integration-tests == 'true'
     name: Run Docker integration tests
     runs-on: ubuntu-latest
     timeout-minutes: 120
@@ -1559,9 +1561,27 @@ jobs:
       with:
         distribution: zulu
         java-version: ${{ inputs.java }}
+    - name: Download precompiled artifact
+      id: download-precompiled
+      if: needs.precompile.result == 'success'
+      continue-on-error: true
+      uses: actions/download-artifact@v6
+      with:
+        name: spark-compile-${{ inputs.branch }}-${{ github.run_id }}
+    - name: Extract precompiled artifact
+      id: extract-precompiled
+      if: steps.download-precompiled.outcome == 'success'
+      continue-on-error: true
+      run: |
+        tar -xzf compile-artifact.tar.gz
+        rm compile-artifact.tar.gz
     - name: Run tests
       env: ${{ fromJSON(inputs.envs) }}
       run: |
+        if [ "${{ steps.extract-precompiled.outcome }}" = "success" ]; then
+          export SKIP_SCALA_BUILD=true
+          echo "Reusing precompiled artifact, skipping local SBT build."
+        fi
         ./dev/run-tests --parallelism 1 --modules docker-integration-tests --included-tags org.apache.spark.tags.DockerTest
     - name: Upload test results to report
       if: always()
@@ -1586,8 +1606,8 @@ jobs:
         path: "**/target/unit-tests.log"
 
   k8s-integration-tests:
-    needs: precondition
-    if: fromJson(needs.precondition.outputs.required).k8s-integration-tests == 'true'
+    needs: [precondition, precompile]
+    if: (!cancelled()) && fromJson(needs.precondition.outputs.required).k8s-integration-tests == 'true'
     name: Run Spark on Kubernetes Integration test
     runs-on: ubuntu-latest
     timeout-minutes: 120
@@ -1632,6 +1652,20 @@ jobs:
         with:
           distribution: zulu
           java-version: ${{ inputs.java }}
+      - name: Download precompiled artifact
+        id: download-precompiled
+        if: needs.precompile.result == 'success'
+        continue-on-error: true
+        uses: actions/download-artifact@v6
+        with:
+          name: spark-compile-${{ inputs.branch }}-${{ github.run_id }}
+      - name: Extract precompiled artifact
+        id: extract-precompiled
+        if: steps.download-precompiled.outcome == 'success'
+        continue-on-error: true
+        run: |
+          tar -xzf compile-artifact.tar.gz
+          rm compile-artifact.tar.gz
       - name: Install R
         run: |
           sudo apt update

