hkc-8010 commented on issue #65011:
URL: https://github.com/apache/airflow/issues/65011#issuecomment-4236664458

   Thanks, and good question. As far as I can tell, no, this is not the 
customer explicitly writing XComs through the Core API, and I don’t see 
evidence of a custom XCom backend either.
   
   What we have is an Airflow 3.1.8 managed runtime, so the normal task 
execution path goes through the execution API internally. The stack trace for 
the `return_value` failure is from the standard task runner path, not from 
customer code calling the API directly:
   
   ```text
   airflow/sdk/execution_time/task_runner.py ... _push_xcom_if_needed
   airflow/sdk/execution_time/task_runner.py ... _xcom_push
   airflow/sdk/bases/xcom.py ... set
   airflow/sdk/execution_time/comms.py ... send
   ```
   
   On the customer side, the operator is a thin subclass of `GlueJobOperator`. 
It does not override `execute()` or `execute_complete()`, and it does not add 
custom XCom writes for `return_value`. In the deferrable case the only change 
was setting `deferrable=True`; in the non-deferrable test they switched to a 
non-deferrable subclass and still hit the same issue.
   
   The two keys we’re seeing are:
   - `glue_job_run_details`, which looks like the stock provider link path 
(`BaseAwsLink.persist()` -> `ti.xcom_push(...)`)
   - `return_value`, which looks like the stock task-runner auto-push path 
after the operator returns successfully
   
   So my current understanding is:
   - not a direct customer call to the Core API
   - not a custom operator manually posting `return_value`
   - no evidence so far of a custom XCom backend
   - the `409` is coming from Airflow’s internal execution API path that the 
task runner uses in 3.x
   
   The part that still looks suspicious to me is retry cleanup. In the 
non-deferrable repro, at try-2 startup we observed deletion of 
`_link_GlueJobRunDetailsLink`, but we did not observe deletion of 
`glue_job_run_details` or `return_value`, and the next writes to those keys 
then hit `409`.
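
   To make the suspected sequence concrete, here is a minimal toy model of 
what I think is happening at the retry boundary. It is not Airflow code: 
`ToyXComStore`, `Conflict`, and `run_try` are hypothetical names, and the 
sketch assumes that a write to a key that already exists is rejected with a 
`409` and that each try writes all three keys. The only part taken from the 
repro is which key was observed being deleted at try-2 startup.

```python
# Toy model of the suspected retry-boundary behaviour. NOT Airflow code:
# every class and function name here is hypothetical.


class Conflict(Exception):
    """Stands in for an HTTP 409 response from the execution API."""


class ToyXComStore:
    """In-memory store that rejects a write to a key that already exists (assumption)."""

    def __init__(self):
        self._rows = {}

    def set(self, key, value):
        if key in self._rows:
            raise Conflict(f"409: XCom {key!r} already exists")
        self._rows[key] = value

    def delete(self, key):
        # Deleting a missing key is a no-op.
        self._rows.pop(key, None)


def run_try(store, cleared_at_startup, written):
    """Clear the given keys, then attempt every write; return the keys that conflicted."""
    for key in cleared_at_startup:
        store.delete(key)
    conflicts = []
    for key in written:
        try:
            store.set(key, "value")
        except Conflict:
            conflicts.append(key)
    return conflicts


# Assumption: each try writes all three keys.
WRITTEN = ["_link_GlueJobRunDetailsLink", "glue_job_run_details", "return_value"]

store = ToyXComStore()
try_1 = run_try(store, cleared_at_startup=[], written=WRITTEN)
# Try 2 clears only the key whose deletion was observed in the repro.
try_2 = run_try(store, cleared_at_startup=["_link_GlueJobRunDetailsLink"], written=WRITTEN)

# For comparison: a retry that clears every key before writing.
other_store = ToyXComStore()
run_try(other_store, cleared_at_startup=[], written=WRITTEN)
full_cleanup = run_try(other_store, cleared_at_startup=WRITTEN, written=WRITTEN)

print(try_1, try_2, full_cleanup)
```

   In this model, try 2 conflicts on exactly the two keys that were not 
cleared, and the comparison run with full cleanup has no conflicts. That 
matches the pattern in the repro, but it does not show where in Airflow the 
cleanup is being skipped.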
   
   If helpful, I can add a follow-up comment with the sanitized request 
sequence around that retry boundary as well.
   

