rahulgoyal2987 opened a new issue, #53491:
URL: https://github.com/apache/airflow/issues/53491

   ### Apache Airflow version
   
   3.0.3
   
   ### If "Other Airflow 2 version" selected, which one?
   
   2.10.3
   
   ### What happened?
   
   Airflow slows down for 30 minutes to 1 hour because of database locks when 
running the scheduler in high availability mode (we run 2 scheduler replicas in 
prod), and it recovers only after a restart.
   Our metrics show that the locks (`FOR UPDATE` queries) are acquired at the 
task instance level. These task instance `FOR UPDATE` queries take a long time 
and increase DBLoad in Aurora PostgreSQL.
   Most places in Airflow fetch rows with the `skip_locked` function argument, 
but the task instance fetch does not use it. I found the same kind of code on 
Airflow `main` 
(https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/models/taskinstance.py#L1183)
   Can changing task instance state without `skip_locked` cause locking or 
slowness issues?
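
To make the question concrete, here is a minimal sketch of the difference I mean. It is not Airflow code and does not touch a database: the row ids, `ROW_LOCKS`, `claim_rows` and `demo` are hypothetical names, and a per-row `threading.Lock` only loosely stands in for a PostgreSQL row lock. The plain `FOR UPDATE` case is modelled with a bounded wait, whereas a real query would wait until the other transaction ends.

```python
import threading
import time

# Hypothetical stand-ins for task_instance rows. Each row gets its own lock,
# loosely mimicking a row-level lock taken by SELECT ... FOR UPDATE.
ROW_LOCKS = {ti_id: threading.Lock() for ti_id in ("ti_1", "ti_2", "ti_3")}


def claim_rows(skip_locked, wait_seconds=0.2):
    """Try to lock every row and report which ones were claimed.

    skip_locked=True  -> never wait, skip rows that are already held
                         (the FOR UPDATE SKIP LOCKED behaviour).
    skip_locked=False -> wait up to wait_seconds per held row
                         (a bounded stand-in for plain FOR UPDATE).
    Returns (claimed_ids, missed_ids, elapsed_seconds).
    """
    claimed, missed = [], []
    started = time.monotonic()
    for ti_id, lock in ROW_LOCKS.items():
        if skip_locked:
            got_it = lock.acquire(blocking=False)
        else:
            got_it = lock.acquire(timeout=wait_seconds)
        (claimed if got_it else missed).append(ti_id)
    elapsed = time.monotonic() - started
    # Release what we took, like ending the transaction.
    for ti_id in claimed:
        ROW_LOCKS[ti_id].release()
    return claimed, missed, elapsed


def demo():
    """Simulate another scheduler holding ti_2 while this one scans rows."""
    with ROW_LOCKS["ti_2"]:
        blocking = claim_rows(skip_locked=False)
        skipping = claim_rows(skip_locked=True)
    return blocking, skipping


if __name__ == "__main__":
    labels = ("FOR UPDATE", "FOR UPDATE SKIP LOCKED")
    for label, (claimed, missed, elapsed) in zip(labels, demo()):
        print(f"{label}: claimed={claimed} missed={missed} waited={elapsed:.2f}s")
```

In both modes the held row is not claimed, but only the blocking mode spends time waiting on it. In SQLAlchemy the non-waiting variant corresponds to `with_for_update(skip_locked=True)`; whether that is safe for the task instance query linked above is exactly what I am unsure about.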
   
   ### What you think should happen instead?
   
   The scheduler slowness and the restarts cause SLA misses in some DAGs.
   
   ### How to reproduce
   
   Create high load on Airflow and increase the number of scheduler replicas.
   
   ### Operating System
   
   Kubernetes helm
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

