amoghrajesh commented on PR #65212: URL: https://github.com/apache/airflow/pull/65212#issuecomment-4251065704
Thanks for the pointer! These are actually distinct issues:

- **#60943 / #63610** is a *race condition*: concurrent tasks fight over the AWS CLI cache directory, causing `FileExistsError`. That's resolved by pre-creating the cache dirs (or by botocore ≥ 1.40.2 handling it internally).
- **This PR** is a *token expiry* issue specific to `AsyncKubernetesHook`. On the first `_load_config()` call, the exec plugin runs and obtains a short-lived token (15 min for EKS). Because `_config_loaded = True` is set, all future calls return early, so the exec plugin is never re-invoked. After 15 minutes, the expired token is reused and auth fails.

Bumping botocore fixes the race condition but not this: the stale-token problem exists regardless of botocore version, as it is Airflow's hook-level caching that suppresses the reload. Deferrable tasks are especially affected since they can be deferred far longer than the token lifetime.
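To make the difference concrete, here is a minimal, self-contained sketch of the two behaviours described in this comment: a load-once flag that keeps returning the first token, and an expiry-aware loader that re-runs the fetch before the token lapses. This is not the code in this PR or in `AsyncKubernetesHook`; `StickyLoader`, `ExpiryAwareLoader`, `fetch_token`, and `refresh_margin` are hypothetical names used only for illustration, and the 15-minute TTL is the EKS figure quoted in the comment.

```python
import time
from typing import Callable, Optional

# Token lifetime quoted in the comment for EKS (assumption for this sketch).
TOKEN_TTL_SECONDS = 15 * 60


class StickyLoader:
    """Hypothetical stand-in for the reported behaviour: load once, never reload."""

    def __init__(self, fetch_token: Callable[[], str]) -> None:
        self._fetch_token = fetch_token
        self._config_loaded = False
        self.token: Optional[str] = None

    def load_config(self) -> Optional[str]:
        if self._config_loaded:
            # Early return: the token source is never invoked again,
            # so an expired token keeps being reused.
            return self.token
        self.token = self._fetch_token()
        self._config_loaded = True
        return self.token


class ExpiryAwareLoader:
    """Hypothetical alternative: reload once the cached token is close to expiry."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        ttl: float = TOKEN_TTL_SECONDS,
        refresh_margin: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._ttl = ttl
        self._margin = refresh_margin
        self._clock = clock
        self._expires_at = 0.0
        self.token: Optional[str] = None

    def load_config(self) -> str:
        now = self._clock()
        # Refresh on first use, or when fewer than `refresh_margin`
        # seconds of validity remain.
        if self.token is None or now >= self._expires_at - self._margin:
            self.token = self._fetch_token()
            self._expires_at = now + self._ttl
        return self.token
```

With an injected clock, a call made an hour after the first load still gets the original token from `StickyLoader`, while `ExpiryAwareLoader` fetches a fresh one. A task that stays deferred longer than the TTL would hit exactly that first case.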
