Re: [PR] Fix: Preserve TaskInstance history during Kubernetes API rate limiting errors - CNCF Fix [airflow]

via GitHub Sat, 28 Mar 2026 23:27:37 -0700


Nataneljpwd commented on code in PR #57152:
URL: https://github.com/apache/airflow/pull/57152#discussion_r3005785250



##########
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py:
##########
@@ -380,11 +380,12 @@ def sync(self) -> None:
                         body = {"message": e.body}
 
                     retries = self.task_publish_retries[key]
-                    # In case of exceeded quota or conflict errors, requeue the task as per the task_publish_max_retries
+                    # In case of exceeded quota, conflict errors, or rate limiting, requeue the task as per the task_publish_max_retries
                     message = body.get("message", "")
                     if (
                         (str(e.status) == "403" and "exceeded quota" in message)
                         or (str(e.status) == "409" and "object has been modified" in message)
+                        or str(e.status) == "429"  # Add support for rate limiting errors

Review Comment:
   I agree. There should be some mechanism that holds off further calls until
the time given in the Retry-After header. One way to do this is to save the
retry-after time and, whenever the method runs before that time, log a
warning and skip creating new pods for that scheduler loop iteration.
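
A minimal sketch of that idea, not the PR's implementation. `RetryAfterGate`, its method names, and the one-second default delay are hypothetical. It assumes Retry-After arrives as a number of seconds and falls back to the default when the value is missing or in the HTTP-date form.

```python
import logging
import time
from typing import Callable, Optional

log = logging.getLogger(__name__)


class RetryAfterGate:
    """Hypothetical helper: tracks when pod creation may resume after a 429."""

    def __init__(
        self,
        clock: Callable[[], float] = time.monotonic,
        default_delay: float = 1.0,  # assumed fallback, not taken from the PR
    ) -> None:
        self._clock = clock
        self._default_delay = default_delay
        self._blocked_until = 0.0

    def record_rate_limit(self, retry_after: Optional[str]) -> None:
        """Save the deadline implied by a Retry-After value given in seconds."""
        try:
            delay = float(retry_after) if retry_after is not None else self._default_delay
        except ValueError:
            # Retry-After can also be an HTTP date; this sketch does not parse it.
            delay = self._default_delay
        deadline = self._clock() + max(delay, 0.0)
        # Never shorten a back-off window that is already in effect.
        self._blocked_until = max(self._blocked_until, deadline)

    def should_skip(self) -> bool:
        """Return True (and log a warning) while the back-off window is open."""
        remaining = self._blocked_until - self._clock()
        if remaining > 0:
            log.warning(
                "Kubernetes API rate limited; skipping pod creation for %.1fs more",
                remaining,
            )
            return True
        return False
```

In `sync()`, the 429 branch could call `record_rate_limit()` with the Retry-After value from the response, assuming the caught exception exposes the response headers, and the loop could check `should_skip()` before it starts publishing tasks for that iteration.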



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

