[ 
https://issues.apache.org/jira/browse/HADOOP-19893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18086189#comment-18086189
 ] 

ASF GitHub Bot commented on HADOOP-19893:
-----------------------------------------

Copilot commented on code in PR #8530:
URL: https://github.com/apache/hadoop/pull/8530#discussion_r3358356455


##########
.github/workflows/cloud_aws.yml:
##########
@@ -20,15 +20,100 @@
 name: "Cloud-AWS"
 
 on:
+  # pull requests will automatically trigger S3A tests for non-forked PRs
   pull_request:
     paths:
       - 'hadoop-tools/hadoop-aws/**'
       - '.github/workflows/*cloud_aws.yml'
       - '.github/actions/build_image**'
       - '.github/gha-tests/hadoop-aws*excludes.txt'
 
+  # For fork PRs, S3A integration tests must be manually triggered.
+  # Although our auth. key for localstack is not a very sensitive secret,
+  # we don't want to leak it to PRs that we haven't reviewed yet.
+  # A maintainer should confirm that affected fork PRs don't
+  # include questionable changes to .github/workflows/  (i.e. that may leak
+  # secrets) before approving the workflow run.

Review Comment:
   The manual-trigger security note only mentions reviewing changes under 
`.github/workflows/`, but this workflow run checks out and executes code from 
the fork. That means changes under `.github/actions/` (and other scripts 
invoked by the workflow) are also part of the trusted computing base and should 
be explicitly called out here.



##########
.github/workflows/cloud_aws.yml:
##########
@@ -20,15 +20,100 @@
 name: "Cloud-AWS"
 
 on:
+  # pull requests will automatically trigger S3A tests for non-forked PRs
   pull_request:
     paths:
       - 'hadoop-tools/hadoop-aws/**'
       - '.github/workflows/*cloud_aws.yml'
       - '.github/actions/build_image**'
       - '.github/gha-tests/hadoop-aws*excludes.txt'
 
+  # For fork PRs, S3A integration tests must be manually triggered.
+  # Although our auth. key for localstack is not a very sensitive secret,
+  # we don't want to leak it to PRs that we haven't reviewed yet.
+  # A maintainer should confirm that affected fork PRs don't
+  # include questionable changes to .github/workflows/  (i.e. that may leak
+  # secrets) before approving the workflow run.
+  workflow_dispatch:
+    inputs:
+      pr_number:
+        description: Pull request number from a fork repository
+        required: true
+        type: string
+      java:
+        description: Java version for the Cloud-AWS run
+        required: false
+        type: string
+        default: '25'
+      os:
+        description: OS image key for the test container
+        required: false
+        type: string
+        default: ubuntu_24
+      runner_os:
+        description: Runner label used to execute jobs
+        required: false
+        type: string
+        default: ubuntu-24.04
+
 jobs:
+  # Stash the owner, repo, and PR number, so we can properly add a test summary
+  # in both the forked-repo, and non-forked case.
+  resolve-fork-pr:
+    if: github.event_name == 'workflow_dispatch'
+    runs-on: ubuntu-slim
+    permissions:
+      actions: write
+      pull-requests: read

Review Comment:
   `resolve-fork-pr` grants `actions: write`, but this job only reads PR 
metadata and uploads an artifact. Dropping this to least privilege reduces the 
blast radius of the manual workflow.



##########
.github/workflows/tmpl_cloud_aws.yml:
##########
@@ -63,8 +74,8 @@ jobs:
     steps:
       - uses: actions/checkout@v6
         with:
-          # Full fetch so build image URL can be computed for any branch
-          fetch-depth: 0
+          repository: ${{ inputs.checkout_repository || github.repository }}
+          ref: ${{ inputs.checkout_ref || github.ref }}

Review Comment:
   When running the manual fork PR workflow, `actions/checkout` will (by 
default) persist the `GITHUB_TOKEN` credential into the git config of the 
checked-out fork. Since subsequent steps execute untrusted code, set 
`persist-credentials: false` to reduce the risk of token exfiltration/misuse.



##########
.github/workflows/cloud_aws.yml:
##########
@@ -20,15 +20,100 @@
 name: "Cloud-AWS"
 
 on:
+  # pull requests will automatically trigger S3A tests for non-forked PRs
   pull_request:
     paths:
       - 'hadoop-tools/hadoop-aws/**'
       - '.github/workflows/*cloud_aws.yml'
       - '.github/actions/build_image**'
       - '.github/gha-tests/hadoop-aws*excludes.txt'
 
+  # For fork PRs, S3A integration tests must be manually triggered.
+  # Although our auth. key for localstack is not a very sensitive secret,
+  # we don't want to leak it to PRs that we haven't reviewed yet.
+  # A maintainer should confirm that affected fork PRs don't
+  # include questionable changes to .github/workflows/  (i.e. that may leak
+  # secrets) before approving the workflow run.
+  workflow_dispatch:
+    inputs:
+      pr_number:
+        description: Pull request number from a fork repository
+        required: true
+        type: string
+      java:
+        description: Java version for the Cloud-AWS run
+        required: false
+        type: string
+        default: '25'
+      os:
+        description: OS image key for the test container
+        required: false
+        type: string
+        default: ubuntu_24
+      runner_os:
+        description: Runner label used to execute jobs
+        required: false
+        type: string
+        default: ubuntu-24.04
+
 jobs:
+  # Stash the owner, repo, and PR number, so we can properly add a test summary
+  # in both the forked-repo, and non-forked case.

Review Comment:
   The comment says this job stashes "owner, repo, and PR number", but the job 
actually resolves and outputs the fork head repo/SHA (and writes those to an 
artifact). Updating this avoids confusion when maintaining the workflow.



##########
.github/workflows/tmpl_cloud_aws.yml:
##########
@@ -87,6 +98,9 @@ jobs:
         run: |
           echo "Build image URL: ${{ 
needs.precondition.outputs.build_image_url }}"
       - uses: actions/checkout@v6
+        with:
+          repository: ${{ inputs.checkout_repository || github.repository }}
+          ref: ${{ inputs.checkout_ref || github.ref }}

Review Comment:
   Same token-persistence concern as the other checkout steps: this job checks 
out the fork repo and then runs local composite actions and other steps. 
Disabling credential persistence prevents leaving a usable `GITHUB_TOKEN` in 
`.git/config` for untrusted code to read.



##########
.github/workflows/notify_cloud_aws.yml:
##########
@@ -0,0 +1,120 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+# Add a sticky comment to hadoop-aws PRs from forked repos with a hint that
+# integration tests must be manually triggered by a maintainer.
+#
+name: "Cloud-AWS PR Update"
+
+# Security: This privileged workflow uses pull_request_target but does not
+# check out or execute untrusted code. It only creates a check run and a PR
+# comment in the base repository.
+on:
+  pull_request_target:
+    types: [opened, reopened, synchronize]
+    paths:
+      - 'hadoop-tools/hadoop-aws/**'
+      - '.github/workflows/*cloud_aws.yml'
+      - '.github/actions/build_image**'
+      - '.github/gha-tests/hadoop-aws*excludes.txt'
+
+jobs:
+  notify:
+    if: github.event.pull_request.head.repo.full_name != github.repository
+    name: "Notify Cloud-AWS"
+    runs-on: ubuntu-slim
+    permissions:
+      checks: write
+      pull-requests: write
+    steps:
+      - name: Post approval-required check and sticky comment
+        uses: actions/github-script@v9
+        with:
+          github-token: ${{ secrets.GITHUB_TOKEN }}
+          script: |
+            const marker = '<!

> ci: s3a integration tests fail for fork PRs
> -------------------------------------------
>
>                 Key: HADOOP-19893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19893
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: ci, fs/s3
>    Affects Versions: 3.5.0
>            Reporter: Aaron Fabbri
>            Assignee: Aaron Fabbri
>            Priority: Major
>              Labels: pull-request-available
>
> `.github/workflows/cloud_aws.yml` fails to execute when a PR branch is pushed 
> to a fork repository. It works fine when pushing a branch to upstream 
> (apache/hadoop). The problem is that the determination of the container image 
> URL (which happens in `.github/actions/build_image_url/action.yml`) uses 
> `apache` for `github.repository.owner` instead of `fork-owner`, due to use of 
> `pull_request` trigger.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to