This is an automated email from the ASF dual-hosted git repository.

jeffreyvo pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-rs.git


The following commit(s) were added to refs/heads/main by this push:
     new 8ed2b5246d fix: integration / Archery test With other arrows container ran out of space (#9043)
8ed2b5246d is described below

commit 8ed2b5246d5de68909695f5953aa2811f3f8ea0d
Author: Lanqing Yang <[email protected]>
AuthorDate: Sat Dec 27 09:53:54 2025 +0900

    fix: integration / Archery test With other arrows container ran out of space (#9043)
    
    # Which issue does this PR close?
    
    
    - Closes #9024.
    
    # Rationale for this change
    
    The CI container starts with 63 GB of 72 GB used. The remaining 9.5 GB
    of disk space is barely enough for a cross build in 7 languages, which
    leads to CI getting stuck.
    
    This is what a debug step run after container initialization shows:
    === CONTAINER DISK USAGE ===
    Filesystem      Size  Used Avail Use% Mounted on
    overlay          72G   63G  9.5G  87% /
    
    
    # What changes are included in this PR?
    
    - Add resource monitoring to the build process.
    - Add a cleanup step to remove unnecessary preinstalled software (frees
    about 6 GB of space):
    === Cleaning up host disk space ===
    Disk space before cleanup:
    Filesystem      Size  Used Avail Use% Mounted on
    overlay          72G   63G  9.5G  87% /
    
    Disk space after cleanup:
    Filesystem      Size  Used Avail Use% Mounted on
    overlay          72G   57G   16G  79% /
    - Add a small optimization: shallow-clone the GitHub repos (fetch only
    the most recent commit, not the full history).
    
    With these optimizations, 6.1 GB is left after the build:
    
    === After Build ===
    Filesystem      Size  Used Avail Use% Mounted on
    overlay          72G   66G  6.1G  92% /
    
    
    # Are these changes tested?
    
    Tested by GitHub CI.
    
    # Are there any user-facing changes?
    
    No.
    
    ---------
    
    Signed-off-by: lyang24 <[email protected]>
---
 .github/workflows/integration.yml | 66 +++++++++++++++++++++++++++++++++++----
 1 file changed, 60 insertions(+), 6 deletions(-)

diff --git a/.github/workflows/integration.yml b/.github/workflows/integration.yml
index 32c5e78d4f..cc74650812 100644
--- a/.github/workflows/integration.yml
+++ b/.github/workflows/integration.yml
@@ -78,58 +78,112 @@ jobs:
       run:
         shell: bash
     steps:
+      - name: Monitor disk usage - Initial
+        run: |
+          echo "=== Initial Disk Usage ==="
+          df -h /
+          echo ""
+
+      - name: Remove unnecessary preinstalled software
+        run: |
+          echo "=== Cleaning up host disk space ==="
+          echo "Disk space before cleanup:"
+          df -h /
+
+          # Clean apt cache
+          apt-get clean || true
+
+          # Remove GitHub Actions tool cache
+          rm -rf /__t/* || true
+
+          # Remove large packages from host filesystem (mounted at /host/)
+          rm -rf /host/usr/share/dotnet || true
+          rm -rf /host/usr/local/lib/android || true
+          rm -rf /host/usr/local/.ghcup || true
+          rm -rf /host/opt/hostedtoolcache/CodeQL || true
+
+          echo ""
+          echo "Disk space after cleanup:"
+          df -h /
+          echo ""
+
       # This is necessary so that actions/checkout can find git
       - name: Export conda path
         run: echo "/opt/conda/envs/arrow/bin" >> $GITHUB_PATH
       # This is necessary so that Rust can find cargo
       - name: Export cargo path
         run: echo "/root/.cargo/bin" >> $GITHUB_PATH
-      - name: Check rustup
-        run: which rustup
-      - name: Check cmake
-        run: which cmake
+
+      # Checkout repos (using shallow clones with fetch-depth: 1)
       - name: Checkout Arrow
         uses: actions/checkout@v6
         with:
           repository: apache/arrow
           submodules: true
-          fetch-depth: 0
+          fetch-depth: 1
       - name: Checkout Arrow Rust
         uses: actions/checkout@v6
         with:
           path: rust
           submodules: true
-          fetch-depth: 0
+          fetch-depth: 1
       - name: Checkout Arrow .NET
         uses: actions/checkout@v6
         with:
           repository: apache/arrow-dotnet
           path: dotnet
+          fetch-depth: 1
       - name: Checkout Arrow Go
         uses: actions/checkout@v6
         with:
           repository: apache/arrow-go
           path: go
+          fetch-depth: 1
       - name: Checkout Arrow Java
         uses: actions/checkout@v6
         with:
           repository: apache/arrow-java
           path: java
+          fetch-depth: 1
       - name: Checkout Arrow JavaScript
         uses: actions/checkout@v6
         with:
           repository: apache/arrow-js
           path: js
+          fetch-depth: 1
       - name: Checkout Arrow nanoarrow
         uses: actions/checkout@v6
         with:
           repository: apache/arrow-nanoarrow
           path: nanoarrow
+          fetch-depth: 1
+
+      - name: Monitor disk usage - After checkouts
+        run: |
+          echo "=== After Checkouts ==="
+          df -h /
+          echo ""
+
       - name: Build
         run: conda run --no-capture-output ci/scripts/integration_arrow_build.sh $PWD /build
+
+      - name: Monitor disk usage - After build
+        if: always()
+        run: |
+          echo "=== After Build ==="
+          df -h /
+          echo ""
+
       - name: Run
         run: conda run --no-capture-output ci/scripts/integration_arrow.sh $PWD /build
 
+      - name: Monitor disk usage - After tests
+        if: always()
+        run: |
+          echo "=== After Tests ==="
+          df -h /
+          echo ""
+
   # test FFI against the C-Data interface exposed by pyarrow
   pyarrow-integration-test:
     name: Pyarrow C Data Interface

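The workflow above repeats the same `echo` / `df -h /` pattern at each stage. As a rough sketch only (not part of this commit), the monitoring steps could be folded into one helper that also fails early when free space drops below a limit. The function names `avail_kb` and `check_disk` and the idea of a minimum-free threshold are assumptions made here for illustration; the commit itself only prints `df -h /`.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of commit 8ed2b5246d.
set -euo pipefail

# Print the space available on the filesystem holding $1, in KiB.
# `df -P` forces the one-line-per-filesystem POSIX format and `-k`
# reports 1024-byte blocks, so column 4 of line 2 is "Available".
avail_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Print a labelled disk report for path $2, then return non-zero
# if fewer than $3 GiB are available on it.
check_disk() {
  local label=$1 path=$2 min_gib=$3
  local avail
  avail=$(avail_kb "$path")
  echo "=== ${label} ==="
  df -h "$path"
  echo ""
  if [ "$avail" -lt $((min_gib * 1024 * 1024)) ]; then
    echo "ERROR: less than ${min_gib} GiB free on ${path}" >&2
    return 1
  fi
}
```

A step could then call, for example, `check_disk "After Checkouts" / 5`, so the job stops with a clear message instead of hanging when the disk fills up. The 5 GiB figure is an arbitrary placeholder, not a value taken from the commit.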