tirkarthi commented on PR #64882:
URL: https://github.com/apache/airflow/pull/64882#issuecomment-4235936202

   Yes, I am also not sure whether the RuntimeError and the triggerer not 
running the existing triggers allotted to it are related, though both occur at 
around the same time.
   
   The issue is that since the triggerer keeps sending heartbeats, the liveness 
probe passes and the pod is considered alive. Since our triggerer runs in a 
Kubernetes-based environment, one workaround we thought of is to call 
`sys.exit(1)` when the RuntimeError is raised due to an invalid frame.id, so 
that the triggerer restarts with a hard crash. The other triggerers can then 
run the triggers assigned to the unhealthy triggerer, and since the error goes 
away on restart, it can pick up new ones until the issue occurs again. But this 
will not be applicable to all deployments, since a restart is not guaranteed 
outside environments like Kubernetes.
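
   A rough sketch of the exit-on-error workaround, only to illustrate the 
idea. `run_with_hard_exit`, `step` and `is_fatal` are hypothetical names and 
not part of Airflow; how the invalid frame.id error would actually be 
recognised is an assumption left to the `is_fatal` predicate.

```python
import sys


def run_with_hard_exit(step, is_fatal=lambda exc: True):
    """Run `step`; on a fatal RuntimeError, exit with status 1.

    `step` stands in for whatever unit of triggerer work raises the error
    (hypothetical). `is_fatal` decides whether a given RuntimeError should
    take the process down; by default every RuntimeError does.
    """
    try:
        return step()
    except RuntimeError as exc:
        if is_fatal(exc):
            # A non-zero exit code lets a supervisor (for example the
            # Kubernetes restart policy) start a fresh process.
            print(f"fatal RuntimeError, exiting: {exc}", file=sys.stderr)
            sys.exit(1)
        raise
```

   Note that `sys.exit` works by raising `SystemExit`, so it only ends the 
process when called from the main thread and when nothing further up the stack 
catches it.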

