rahulgoyal2987 opened a new issue, #53491: URL: https://github.com/apache/airflow/issues/53491
### Apache Airflow version

3.0.3

### If "Other Airflow 2 version" selected, which one?

2.10.3

### What happened?

Airflow slows down for 30 minutes to 1 hour because of database locks when the scheduler runs in high-availability mode (we run 2 scheduler replicas in production), and it recovers only after a restart.

Our metrics show that the locks (`FOR UPDATE` queries) are acquired at the task instance level. This task instance `FOR UPDATE` query takes a long time and increases the DBLoad in Aurora PostgreSQL.

In most places Airflow fetches rows with the `skip_locked` argument, but `skip_locked` is not used when fetching a task instance. I found this kind of code in Airflow (https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/models/taskinstance.py#L1183).

Will there be locking or slowness issues if the task instance state is changed without `skip_locked`?

### What you think should happen instead?

Scheduler slowness and restarts cause SLA misses in some DAGs.

### How to reproduce

Create a high load on Airflow and increase the number of scheduler replicas.

### Operating System

Kubernetes helm

### Versions of Apache Airflow Providers

_No response_

### Deployment

Official Apache Airflow Helm Chart

### Deployment details

_No response_

### Anything else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
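For readers following the `skip_locked` question under "What happened?", here is a minimal sketch of the difference between a plain `FOR UPDATE` and `FOR UPDATE SKIP LOCKED`. It is a toy in-memory model built on `threading.Lock`, not PostgreSQL and not Airflow's code. The class `RowLockTable`, its methods, and the `timeout` parameter (standing in for a database lock timeout) are hypothetical names introduced only for this illustration.

```python
import threading


class RowLockTable:
    """Toy stand-in for a table with row-level locks (hypothetical, not Airflow)."""

    def __init__(self, row_ids):
        # One lock per row, mimicking row-level locks taken by FOR UPDATE.
        self._locks = {row_id: threading.Lock() for row_id in row_ids}

    def select_for_update(self, row_ids, skip_locked=False, timeout=0.2):
        """Try to lock the requested rows.

        Returns (locked, timed_out):
          locked    - rows this caller now holds
          timed_out - rows it waited on and gave up (only without skip_locked)
        """
        locked, timed_out = [], []
        for row_id in row_ids:
            lock = self._locks[row_id]
            if skip_locked:
                # SKIP LOCKED: never wait; a locked row is simply not returned.
                if lock.acquire(blocking=False):
                    locked.append(row_id)
            else:
                # Plain FOR UPDATE: wait for the holder (here, up to `timeout`).
                if lock.acquire(timeout=timeout):
                    locked.append(row_id)
                else:
                    timed_out.append(row_id)
        return locked, timed_out

    def commit(self, row_ids):
        # Ending the transaction releases its row locks.
        for row_id in row_ids:
            self._locks[row_id].release()


if __name__ == "__main__":
    table = RowLockTable([1, 2, 3])

    # "Scheduler A" locks rows 1 and 2 and keeps its transaction open.
    held_by_a, _ = table.select_for_update([1, 2])

    # "Scheduler B" with skip_locked returns at once with the free row only.
    print(table.select_for_update([1, 2, 3], skip_locked=True))
```

In this model the second caller with `skip_locked=True` returns immediately with whatever rows are free, while the same call without it spends time waiting on each locked row. Note the trade-off the sketch makes visible: with `SKIP LOCKED` a locked row is not returned at all, so code that must update one specific row has to handle the case where that row is missing from the result.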
