amoghrajesh commented on PR #65212:
URL: https://github.com/apache/airflow/pull/65212#issuecomment-4251065704

   Thanks for the pointer! These are actually distinct issues:
   
   - **#60943 / #63610** is a *race condition*: concurrent tasks contend for 
the AWS CLI cache directory, which raises `FileExistsError`. It is resolved by 
pre-creating the cache dirs (or by botocore ≥ 1.40.2, which handles it internally).
   
   - **This PR** is a *token expiry* issue specific to `AsyncKubernetesHook`. 
On the first `_load_config()` call, the exec plugin runs and obtains a 
short-lived token (15 min for EKS). Because `_config_loaded = True` is then set, all 
later calls return early, so the exec plugin is never re-invoked. Once the 
token expires after 15 minutes, it is still reused and auth fails.
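
To make the second item concrete, here is a toy model of the sticky-flag pattern. It is only a sketch: `StickyFlagHook`, `FakeExecPlugin`, and the injected clock are hypothetical stand-ins, not the real `AsyncKubernetesHook` code. Only the `_load_config` / `_config_loaded` names are taken from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

TOKEN_TTL_SECONDS = 15 * 60  # EKS token lifetime mentioned above


@dataclass
class Token:
    value: str
    expires_at: float


class FakeExecPlugin:
    """Hypothetical stand-in for the kubeconfig exec plugin."""

    def __init__(self, clock: Callable[[], float]) -> None:
        self.clock = clock
        self.calls = 0

    def __call__(self) -> Token:
        self.calls += 1
        return Token(f"token-{self.calls}", self.clock() + TOKEN_TTL_SECONDS)


class StickyFlagHook:
    """Toy model of the caching pattern; not the real hook."""

    def __init__(self, exec_plugin: Callable[[], Token], clock: Callable[[], float]) -> None:
        self._exec_plugin = exec_plugin
        self._clock = clock
        self._config_loaded = False
        self._token: Optional[Token] = None

    def _load_config(self) -> Token:
        if self._config_loaded:
            # Early return: the exec plugin is never invoked again.
            return self._token
        self._token = self._exec_plugin()
        self._config_loaded = True
        return self._token

    def token_is_valid(self) -> bool:
        return self._load_config().expires_at > self._clock()


now = [0.0]  # fake clock, in seconds
clock = lambda: now[0]
plugin = FakeExecPlugin(clock)
hook = StickyFlagHook(plugin, clock)

valid_at_start = hook.token_is_valid()
now[0] = 16 * 60  # simulate waiting longer than the token lifetime
valid_after_16_min = hook.token_is_valid()
print(valid_at_start, valid_after_16_min, plugin.calls)
```

In this model the plugin runs exactly once, and the second check reuses a token that has already expired.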
   
   Bumping botocore fixes the race condition but not this one: the stale-token 
problem exists regardless of the botocore version, because it is Airflow's hook-level 
caching that suppresses the reload. Deferrable tasks are especially affected, 
since they can stay deferred far longer than the token lifetime.
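
One way to avoid reusing a stale token is to track its expiry instead of a one-shot boolean. The sketch below illustrates that general idea only; it is not necessarily what this PR implements. `ExpiryAwareLoader`, `fetch_token`, and `REFRESH_MARGIN_SECONDS` are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Optional, Tuple

REFRESH_MARGIN_SECONDS = 60  # hypothetical safety margin before expiry


class ExpiryAwareLoader:
    """Re-runs the token fetch once the cached token is close to expiring."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token  # returns (token, expires_at)
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def load(self) -> str:
        near_expiry = self._clock() >= self._expires_at - REFRESH_MARGIN_SECONDS
        if self._token is None or near_expiry:
            self._token, self._expires_at = self._fetch_token()
        return self._token


# Demo with a fake clock and a fake 15-minute token source.
fake_now = [0.0]
fetch_count = [0]


def fake_fetch() -> Tuple[str, float]:
    fetch_count[0] += 1
    return f"token-{fetch_count[0]}", fake_now[0] + 15 * 60


loader = ExpiryAwareLoader(fake_fetch, clock=lambda: fake_now[0])

first = loader.load()        # initial fetch
fake_now[0] = 10 * 60
still_cached = loader.load() # well within the lifetime, cache is reused
fake_now[0] = 16 * 60
refreshed = loader.load()    # past expiry, token is fetched again
print(first, still_cached, refreshed)
```

Injecting the clock keeps the demo deterministic; a real hook would also need to guard the refresh against concurrent callers.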

