Re: [PR] fix: EksPodOperator 401 with cross-account AssumeRole via aws_conn_id [airflow]

via GitHub Fri, 10 Apr 2026 13:03:13 -0700


Copilot commented on code in PR #64749:
URL: https://github.com/apache/airflow/pull/64749#discussion_r3066481518



##########
providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py:
##########
@@ -82,20 +82,24 @@ class NodegroupStates(Enum):
             # Load credentials from secure file using (POSIX-compliant dot 
operator)
             . {credentials_file}
 
+            # Redirect stderr to /dev/null to prevent Python warnings, 
deprecation
+            # notices, or other log output from contaminating stdout. The token
+            # output must be the ONLY thing on stdout for bash parsing to work.
             output=$({python_executable} -m 
airflow.providers.amazon.aws.utils.eks_get_token \
-                --cluster-name {eks_cluster_name} --sts-url '{sts_url}' {args} 
2>&1)
+                --cluster-name {eks_cluster_name} --sts-url '{sts_url}' {args} 
2>/dev/null)
 

Review Comment:
   Redirecting `eks_get_token` stderr to `/dev/null` discards useful error 
output (stack traces, botocore messages) and makes kubeconfig exec failures 
hard to debug. Since the original parsing issue was caused by merging stderr 
into stdout (`2>&1`), consider removing the redirection entirely (let stderr 
pass through) or capture stderr separately and only surface it on non-zero 
exit, while keeping stdout clean for parsing.



##########
providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py:
##########
@@ -104,6 +108,16 @@ class NodegroupStates(Enum):
 
             token=${{last_line##*, token: }}  # text after ", token: "
 
+            # Validate that token was extracted — empty token means parsing 
failed
+            # or eks_get_token produced unexpected output. Without this check, 
a
+            # malformed ExecCredential is sent to the API server, resulting in 
a
+            # 401 with an empty user identity in the audit logs.
+            if [ -z "$token" ]; then
+                printf 'Failed to extract token from eks_get_token output. ' 
>&2
+                printf 'Output was: %s' "$output" >&2

Review Comment:
   The empty-token branch prints the full `eks_get_token` output to stderr. 
That output includes the EKS bearer token (see `eks_get_token.py` printing 
`token: {access_token}`), so this will leak credentials into task logs when 
parsing fails. Please redact the token before logging (or omit the output 
entirely) to avoid exposing bearer tokens.
   ```suggestion
                   printf 'Failed to extract token from eks_get_token output.' 
>&2
   ```



##########
providers/amazon/tests/unit/amazon/aws/hooks/test_eks.py:
##########
@@ -1273,6 +1273,37 @@ def test_generate_config_file(self, mock_conn, 
aws_conn_id, region_name, expecte
             if expected_region_args:
                 assert expected_region_args in command_arg
 
+    def test_command_template_redirects_stderr(self):
+        """Verify COMMAND template redirects stderr to /dev/null to prevent
+        Python warnings/log output from contaminating stdout and breaking
+        bash token parsing. This is critical for cross-account AssumeRole
+        scenarios where the kubeconfig exec plugin must produce a clean 
token."""
+        from airflow.providers.amazon.aws.hooks.eks import COMMAND
+
+        # Verify stderr is redirected to /dev/null, not merged with stdout
+        assert "2>/dev/null" in COMMAND, (
+            "COMMAND must redirect stderr to /dev/null to prevent output 
contamination"
+        )
+        assert "2>&1" not in COMMAND, (
+            "COMMAND must not use 2>&1 — merging stderr with stdout breaks 
bash token parsing"
+        )
+
+    def test_command_template_validates_token(self):
+        """Verify COMMAND template validates that the token was successfully
+        extracted before producing the ExecCredential JSON. Without this check,
+        a malformed ExecCredential with an empty token is sent to the API 
server,
+        resulting in 401 Unauthorized with an empty user identity in audit 
logs."""
+        from airflow.providers.amazon.aws.hooks.eks import COMMAND
+
+        # Verify token validation check exists
+        assert 'if [ -z "$token" ]' in COMMAND, (
+            "COMMAND must validate that token is non-empty before producing 
ExecCredential JSON"
+        )
+        # Verify it exits with error on empty token
+        assert "exit 1" in COMMAND or 'exit "$' in COMMAND, (
+            "COMMAND must exit with error when token extraction fails"

Review Comment:
   `assert "exit 1" in COMMAND or 'exit "$' in COMMAND` can pass even if the 
empty-token validation never exits, because `'exit "$'` matches the existing 
`exit "$status"` earlier in the script. Tighten this assertion to specifically 
verify that the empty-token block exits (e.g., by checking ordering relative to 
`if [ -z "$token" ]` or matching the `exit 1` inside that block).
   ```suggestion
           # Verify the empty-token validation block exits with an error
           assert 'if [ -z "$token" ]; then\n  exit 1\nfi' in COMMAND, (
               "COMMAND must exit with error in the empty-token validation 
block when token extraction fails"
   ```



##########
providers/amazon/tests/unit/amazon/aws/hooks/test_eks.py:
##########
@@ -1273,6 +1273,37 @@ def test_generate_config_file(self, mock_conn, 
aws_conn_id, region_name, expecte
             if expected_region_args:
                 assert expected_region_args in command_arg
 
+    def test_command_template_redirects_stderr(self):
+        """Verify COMMAND template redirects stderr to /dev/null to prevent
+        Python warnings/log output from contaminating stdout and breaking
+        bash token parsing. This is critical for cross-account AssumeRole
+        scenarios where the kubeconfig exec plugin must produce a clean 
token."""
+        from airflow.providers.amazon.aws.hooks.eks import COMMAND
+
+        # Verify stderr is redirected to /dev/null, not merged with stdout
+        assert "2>/dev/null" in COMMAND, (
+            "COMMAND must redirect stderr to /dev/null to prevent output 
contamination"
+        )

Review Comment:
   This test hard-codes that stderr must be redirected to `/dev/null`. The 
actual requirement for correctness is that stderr must not be merged into 
stdout (i.e., avoid `2>&1`) so stdout remains parseable; discarding stderr is 
an implementation choice and reduces debuggability. Consider relaxing the 
assertion to only require that `2>&1` is not present, so future changes can 
keep stderr visible while still fixing the parsing issue.
   ```suggestion
           """Verify COMMAND template keeps stderr separate from stdout so
           Python warnings/log output cannot contaminate stdout and break
           bash token parsing. This is critical for cross-account AssumeRole
           scenarios where the kubeconfig exec plugin must produce a clean 
token."""
           from airflow.providers.amazon.aws.hooks.eks import COMMAND
   
           # Verify stderr is not merged with stdout. Whether stderr is 
discarded
           # or left visible is an implementation choice.
   ```



##########
providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py:
##########
@@ -82,20 +82,24 @@ class NodegroupStates(Enum):
             # Load credentials from secure file using (POSIX-compliant dot 
operator)
             . {credentials_file}
 
+            # Redirect stderr to /dev/null to prevent Python warnings, 
deprecation
+            # notices, or other log output from contaminating stdout. The token
+            # output must be the ONLY thing on stdout for bash parsing to work.
             output=$({python_executable} -m 
airflow.providers.amazon.aws.utils.eks_get_token \
-                --cluster-name {eks_cluster_name} --sts-url '{sts_url}' {args} 
2>&1)
+                --cluster-name {eks_cluster_name} --sts-url '{sts_url}' {args} 
2>/dev/null)
 
             status=$?
 
             # Clear environment variables after use (defense in depth)
             unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN
 
             if [ "$status" -ne 0 ]; then
-                printf '%s' "$output" >&2
+                printf 'eks_get_token failed with exit code %s' "$status" >&2

Review Comment:
   On non-zero exit you now only print the exit code, but not the captured 
stdout from `eks_get_token`. This is a regression in diagnostics compared to 
printing the output, and it will make credential/STS issues much harder to 
troubleshoot. Consider also emitting `$output` (and/or captured stderr if you 
keep it) when `status != 0`.
   ```suggestion
                   printf 'eks_get_token failed with exit code %s' "$status" >&2
                   if [ -n "$output" ]; then
                       printf '. Output was: %s' "$output" >&2
                   fi
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] fix: EksPodOperator 401 with cross-account AssumeRole via aws_conn_id [airflow]

Reply via email to