Copilot commented on code in PR #64749:
URL: https://github.com/apache/airflow/pull/64749#discussion_r3066481518
##########
providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py:
##########
@@ -82,20 +82,24 @@ class NodegroupStates(Enum):
# Load credentials from secure file using (POSIX-compliant dot
operator)
. {credentials_file}
+ # Redirect stderr to /dev/null to prevent Python warnings,
deprecation
+ # notices, or other log output from contaminating stdout. The token
+ # output must be the ONLY thing on stdout for bash parsing to work.
output=$({python_executable} -m
airflow.providers.amazon.aws.utils.eks_get_token \
- --cluster-name {eks_cluster_name} --sts-url '{sts_url}' {args}
2>&1)
+ --cluster-name {eks_cluster_name} --sts-url '{sts_url}' {args}
2>/dev/null)
Review Comment:
Redirecting `eks_get_token` stderr to `/dev/null` discards useful error
output (stack traces, botocore messages) and makes kubeconfig exec failures
hard to debug. Since the original parsing issue was caused by merging stderr
into stdout (`2>&1`), consider removing the redirection entirely (let stderr
pass through) or capture stderr separately and only surface it on non-zero
exit, while keeping stdout clean for parsing.
##########
providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py:
##########
@@ -104,6 +108,16 @@ class NodegroupStates(Enum):
token=${{last_line##*, token: }} # text after ", token: "
+ # Validate that token was extracted — empty token means parsing
failed
+ # or eks_get_token produced unexpected output. Without this check,
a
+ # malformed ExecCredential is sent to the API server, resulting in
a
+ # 401 with an empty user identity in the audit logs.
+ if [ -z "$token" ]; then
+ printf 'Failed to extract token from eks_get_token output. '
>&2
+ printf 'Output was: %s' "$output" >&2
Review Comment:
The empty-token branch prints the full `eks_get_token` output to stderr.
That output includes the EKS bearer token (see `eks_get_token.py` printing
`token: {access_token}`), so this will leak credentials into task logs when
parsing fails. Please redact the token before logging (or omit the output
entirely) to avoid exposing bearer tokens.
```suggestion
printf 'Failed to extract token from eks_get_token output.'
>&2
```
##########
providers/amazon/tests/unit/amazon/aws/hooks/test_eks.py:
##########
@@ -1273,6 +1273,37 @@ def test_generate_config_file(self, mock_conn,
aws_conn_id, region_name, expecte
if expected_region_args:
assert expected_region_args in command_arg
+ def test_command_template_redirects_stderr(self):
+ """Verify COMMAND template redirects stderr to /dev/null to prevent
+ Python warnings/log output from contaminating stdout and breaking
+ bash token parsing. This is critical for cross-account AssumeRole
+ scenarios where the kubeconfig exec plugin must produce a clean
token."""
+ from airflow.providers.amazon.aws.hooks.eks import COMMAND
+
+ # Verify stderr is redirected to /dev/null, not merged with stdout
+ assert "2>/dev/null" in COMMAND, (
+ "COMMAND must redirect stderr to /dev/null to prevent output
contamination"
+ )
+ assert "2>&1" not in COMMAND, (
+ "COMMAND must not use 2>&1 — merging stderr with stdout breaks
bash token parsing"
+ )
+
+ def test_command_template_validates_token(self):
+ """Verify COMMAND template validates that the token was successfully
+ extracted before producing the ExecCredential JSON. Without this check,
+ a malformed ExecCredential with an empty token is sent to the API
server,
+ resulting in 401 Unauthorized with an empty user identity in audit
logs."""
+ from airflow.providers.amazon.aws.hooks.eks import COMMAND
+
+ # Verify token validation check exists
+ assert 'if [ -z "$token" ]' in COMMAND, (
+ "COMMAND must validate that token is non-empty before producing
ExecCredential JSON"
+ )
+ # Verify it exits with error on empty token
+ assert "exit 1" in COMMAND or 'exit "$' in COMMAND, (
+ "COMMAND must exit with error when token extraction fails"
Review Comment:
`assert "exit 1" in COMMAND or 'exit "$' in COMMAND` can pass even if the
empty-token validation never exits, because `'exit "$'` matches the existing
`exit "$status"` earlier in the script. Tighten this assertion to specifically
verify that the empty-token block exits (e.g., by checking ordering relative to
`if [ -z "$token" ]` or matching the `exit 1` inside that block).
```suggestion
# Verify the empty-token validation block exits with an error
assert 'if [ -z "$token" ]; then\n exit 1\nfi' in COMMAND, (
"COMMAND must exit with error in the empty-token validation
block when token extraction fails"
```
##########
providers/amazon/tests/unit/amazon/aws/hooks/test_eks.py:
##########
@@ -1273,6 +1273,37 @@ def test_generate_config_file(self, mock_conn,
aws_conn_id, region_name, expecte
if expected_region_args:
assert expected_region_args in command_arg
+ def test_command_template_redirects_stderr(self):
+ """Verify COMMAND template redirects stderr to /dev/null to prevent
+ Python warnings/log output from contaminating stdout and breaking
+ bash token parsing. This is critical for cross-account AssumeRole
+ scenarios where the kubeconfig exec plugin must produce a clean
token."""
+ from airflow.providers.amazon.aws.hooks.eks import COMMAND
+
+ # Verify stderr is redirected to /dev/null, not merged with stdout
+ assert "2>/dev/null" in COMMAND, (
+ "COMMAND must redirect stderr to /dev/null to prevent output
contamination"
+ )
Review Comment:
This test hard-codes that stderr must be redirected to `/dev/null`. The
actual requirement for correctness is that stderr must not be merged into
stdout (i.e., avoid `2>&1`) so stdout remains parseable; discarding stderr is
an implementation choice and reduces debuggability. Consider relaxing the
assertion to only require that `2>&1` is not present, so future changes can
keep stderr visible while still fixing the parsing issue.
```suggestion
"""Verify COMMAND template keeps stderr separate from stdout so
Python warnings/log output cannot contaminate stdout and break
bash token parsing. This is critical for cross-account AssumeRole
scenarios where the kubeconfig exec plugin must produce a clean
token."""
from airflow.providers.amazon.aws.hooks.eks import COMMAND
# Verify stderr is not merged with stdout. Whether stderr is
discarded
# or left visible is an implementation choice.
```
##########
providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py:
##########
@@ -82,20 +82,24 @@ class NodegroupStates(Enum):
# Load credentials from secure file using (POSIX-compliant dot
operator)
. {credentials_file}
+ # Redirect stderr to /dev/null to prevent Python warnings,
deprecation
+ # notices, or other log output from contaminating stdout. The token
+ # output must be the ONLY thing on stdout for bash parsing to work.
output=$({python_executable} -m
airflow.providers.amazon.aws.utils.eks_get_token \
- --cluster-name {eks_cluster_name} --sts-url '{sts_url}' {args}
2>&1)
+ --cluster-name {eks_cluster_name} --sts-url '{sts_url}' {args}
2>/dev/null)
status=$?
# Clear environment variables after use (defense in depth)
unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN
if [ "$status" -ne 0 ]; then
- printf '%s' "$output" >&2
+ printf 'eks_get_token failed with exit code %s' "$status" >&2
Review Comment:
On non-zero exit you now only print the exit code, but not the captured
stdout from `eks_get_token`. This is a regression in diagnostics compared to
printing the output, and it will make credential/STS issues much harder to
troubleshoot. Consider also emitting `$output` (and/or captured stderr if you
keep it) when `status != 0`.
```suggestion
printf 'eks_get_token failed with exit code %s' "$status" >&2
if [ -n "$output" ]; then
printf '. Output was: %s' "$output" >&2
fi
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]