Re: [ClusterLabs] IPaddr2 resource times out and cant be killed

Reid Wahl Fri, 29 Jul 2022 13:03:22 -0700

On Fri, Jul 29, 2022 at 12:52 PM Ross Sponholtz <[email protected]> wrote:
>
> I’m running a RHEL pacemaker cluster on Azure, and I’ve gotten a failure & 
> fencing where I get these messages in the log file:
>
>
> warning: vip_ABC_30_monitor_10000 process (PID 1779737) timed out
> crit: vip_ABC_30_monitor_10000 process (PID 1779737) will not die!
>
>
>
> This resource uses the IPaddr2 resource agent.  I’ve looked at the agent
> code, and I can’t pinpoint any reason it would hang up, and since the node 
> gets fenced, I can’t tell why this happens – any ideas on what kinds of 
> failures could cause this problem?
>
>
>
> Thanks,
>
> Ross
>


Are you able to reproduce this? I suggest adding `trace_ra=1` to the
resource configuration in order to determine where it's hanging.

# pcs resource update vip_ABC trace_ra=1

This will produce a shell trace of each operation in
/var/lib/heartbeat/trace_ra/IPaddr2. This is naturally quite a lot of
logging, so remove the option when you've gotten what you need.

# pcs resource update vip_ABC trace_ra=
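
Once a hung monitor has been captured, the end of the newest trace file
should show the last command the agent started before it stalled. A rough
sketch for pulling that out follows. The helper name latest_trace is made up
for illustration, and the "<resource>.<operation>.<timestamp>" file naming is
an assumption, so check it against what is actually in the directory on your
nodes.

```shell
#!/bin/bash
# Print the tail of the newest trace file for one resource and operation.
# Assumes trace files are named "<resource>.<operation>.<timestamp>"
# (an assumption; verify against the files on your node).
latest_trace() {
    local dir=$1 rsc=$2 op=$3 lines=${4:-20}
    local newest= f

    # Pick the most recently modified file that matches the prefix.
    for f in "$dir/$rsc.$op."*; do
        [ -e "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done

    if [ -z "$newest" ]; then
        echo "no trace files for $rsc $op in $dir" >&2
        return 1
    fi

    echo "== $newest"
    tail -n "$lines" "$newest"
}

# Example, run as root on the node that ran the monitor:
# latest_trace /var/lib/heartbeat/trace_ra/IPaddr2 vip_ABC monitor 30
```

Since the node gets fenced, the trace file has to survive the reboot, so it
may be worth copying the directory off the node before the next failure or
checking it right after the node comes back.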

Also discussed in this article (you should have access if you're on RHEL):
- How can I determine exactly what is happening with every operation
on a resource in Pacemaker?
(https://access.redhat.com/solutions/3182931)

> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/



-- 
Regards,

Reid Wahl (He/Him)
Senior Software Engineer, Red Hat
RHEL High Availability - Pacemaker

