aurangzaib048 opened a new pull request, #64770:
URL: https://github.com/apache/airflow/pull/64770

   When `EmrContainerOperator` runs in deferrable mode and the trigger
   times out or the task is killed, the EMR job keeps running on the
   cluster. This leads to orphaned jobs consuming resources and duplicate
   executions on retry.
   
   This PR adds cancel-on-kill support to `EmrContainerTrigger` following
   the same proven pattern as `EmrServerlessStartJobTrigger` (PR #51883):
   
   - Override `run()` in `EmrContainerTrigger` to catch
     `asyncio.CancelledError` and cancel the EMR job via
     `hook.stop_query()` when safe to do so
   - Add `safe_to_cancel()` check to distinguish user-initiated kills
     from triggerer restarts (avoids cancelling jobs during triggerer
     restart)
   - Add `cancel_on_kill` parameter (default `True`) for opt-out
   - Update `EmrContainerOperator.execute_complete()` to cancel the job
     when the trigger reports a failure/timeout event
   - All cancellation paths are wrapped in try/except to ensure proper
     error propagation (`asyncio.CancelledError` is always re-raised, and
     the original `AirflowException` is preserved)
   
   closes: #60517
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [ ] No

