amarhacks commented on issue #65011: URL: https://github.com/apache/airflow/issues/65011#issuecomment-4224142705
Thanks for the detailed report. We’re observing the same behavior in Airflow 3.1.8 with `GlueJobOperator(deferrable=True)`.

From our investigation, this appears to be a systemic issue with how XComs are handled across the deferrable lifecycle (`execute → defer → execute_complete → retry`):

* XCom rows created during `execute()` are **not cleared** when the task resumes.
* On resume, `execute_complete()` attempts to **write the same keys again**, leading to duplicate key violations.
* Retries further amplify the issue since the same XCom keys (`return_value`, `glue_job_run_details`) are re-inserted.

A few specific observations:

1. `return_value` is auto-pushed by the task runner in both `execute()` and `execute_complete()`, which guarantees a collision on resume/retry.
2. `glue_job_run_details` is written before deferral, and the write is attempted again on subsequent runs. Some of these writes are suppressed, but the resulting behavior is still inconsistent.
3. Airflow explicitly avoids clearing XComs for deferred tasks, which makes the current insert-only semantics unsafe for deferrable operators.

This suggests that deferrable operators are currently **not XCom-idempotent**, which can lead to failures even when the operator logic itself is correct.

**Expected behavior:**

* XCom writes should be idempotent across deferral/resume boundaries, OR
* Existing keys should be updated/replaced instead of causing failures, OR
* The framework should avoid auto-pushing duplicate `return_value` entries for resumed executions.

As a temporary workaround, we’ve had to suppress `return_value` pushes or avoid deferrable mode, but this is not ideal.

It would be great to get guidance on whether:

* This is a known limitation of the current deferrable execution model, or
* There are plans to make XCom handling idempotent (e.g., upsert semantics or a scoped lifecycle per phase).

Happy to help with a minimal reproducible example if needed.
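To make the collision described above concrete, here is a minimal sketch in plain Python using only the standard-library `sqlite3` module. It does not touch Airflow at all: the `toy_xcom` table, its columns, and the `push_insert_only` / `push_idempotent` helpers are hypothetical stand-ins, and the only assumption is that an XCom row is unique per run, task, map index, and key. It is meant to illustrate the difference between insert-only and replace-on-conflict writes, not to describe how Airflow actually stores XComs.

```python
import sqlite3

# Toy stand-in for an XCom table. The schema and all names here are
# illustrative assumptions, not Airflow's actual model.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE toy_xcom (
        run_id    TEXT    NOT NULL,
        task_id   TEXT    NOT NULL,
        map_index INTEGER NOT NULL,
        xcom_key  TEXT    NOT NULL,
        value     TEXT,
        PRIMARY KEY (run_id, task_id, map_index, xcom_key)
    )
    """
)

# The same logical XCom slot is written in both phases of the task.
ROW_ID = ("run_1", "glue_task", -1, "return_value")


def push_insert_only(value):
    """Plain INSERT: raises IntegrityError if the key already exists."""
    conn.execute(
        "INSERT INTO toy_xcom VALUES (?, ?, ?, ?, ?)", (*ROW_ID, value)
    )


def push_idempotent(value):
    """Replace-on-conflict: a second write overwrites the first."""
    conn.execute(
        "INSERT OR REPLACE INTO toy_xcom VALUES (?, ?, ?, ?, ?)",
        (*ROW_ID, value),
    )


# Phase 1: simulated write during execute(), before deferral.
push_insert_only("value-from-execute")

# Phase 2: simulated write during execute_complete(), after resume.
# The row from phase 1 was never cleared, so a plain INSERT collides.
try:
    push_insert_only("value-from-execute-complete")
    collided = False
except sqlite3.IntegrityError:
    collided = True

# The same second write succeeds with replace-on-conflict semantics.
push_idempotent("value-from-execute-complete")
rows = conn.execute("SELECT value FROM toy_xcom").fetchall()

print("insert-only collided:", collided)
print("rows after idempotent write:", rows)
```

Replace-on-conflict is only one of the options listed under expected behavior. Clearing the earlier rows on resume, or skipping the second auto-push of `return_value`, would avoid the error in this toy model as well.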
