On 08/09/2017 21:20, Valentin Vidic wrote:
> On Fri, Sep 08, 2017 at 12:57:12PM +0000, Mark Syms wrote:
>> As we discussed regarding the handling of watchdog in XenServer, both
>> guest and host, I've had a discussion with our subject matter expert
>> (Andrew, cc'd) on this topic. The guest watchdogs are handled by a
>> hardware timer in the hypervisor but if the timers themselves are not
>> serviced within 5 seconds the host watchdog will fire and pull the
>> host down.
>
> I presume the host watchdog is the NMI watchdog described in the
> Xen Hypervisor Command Line Options?
>
> watchdog = force | <boolean> (Default: false)
> Run an NMI watchdog on each processor. If a processor is stuck for
> longer than the watchdog_timeout, a panic occurs. When force is
> specified, in addition to running an NMI watchdog on each processor,
> unknown NMIs will still be processed.
>
> watchdog_timeout = <integer> (Default: 5)
> Set the NMI watchdog timeout in seconds. Specifying 0 will turn off the
> watchdog.
>
Yes.

The internal mechanism of the host watchdog is to use one performance
counter to count retired instructions and generate an NMI roughly once
every half second (give or take C and P states).

Separately, there is a one-second timer (using the same framework as all
other timers in Xen, including the guest watchdog). It triggers a softirq
(lower priority, runs on the return-to-guest path), which increments a
local variable.

If the NMI handler doesn't observe this local variable incrementing
within the timeout period, Xen crashes the entire system.

~Andrew

_______________________________________________
Users mailing list: [email protected]
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
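
For anyone who wants to play with the timing, the mechanism described above can be modelled roughly as follows. This is only a toy simulation in Python, not Xen code: the names (Cpu, simulate, stuck_from) are made up for illustration, the NMI is assumed to arrive exactly every half second, and a "stuck" processor is modelled simply as one that stops servicing softirqs.

```python
from dataclasses import dataclass

NMI_PERIOD = 0.5    # assumed: NMI arrives exactly every half second
TIMER_PERIOD = 1.0  # the one-second timer that raises the softirq


@dataclass
class Cpu:
    heartbeat: int = 0        # the variable incremented by the softirq
    last_seen: int = 0        # value the NMI handler observed last time
    stalled_for: float = 0.0  # seconds since the NMI handler saw a change


def simulate(duration, timeout=5.0, stuck_from=None):
    """Return the time (s) at which the watchdog fires, or None.

    stuck_from is a hypothetical knob: from that time on, softirqs are
    no longer serviced, so the heartbeat stops moving.
    """
    if timeout == 0:
        return None  # a timeout of 0 turns the watchdog off

    cpu = Cpu()
    nmis_per_timer = int(TIMER_PERIOD / NMI_PERIOD)
    for i in range(1, int(duration / NMI_PERIOD) + 1):
        now = i * NMI_PERIOD

        # Timer -> softirq -> increment, unless the processor is stuck.
        timer_due = (i % nmis_per_timer == 0)
        if timer_due and (stuck_from is None or now < stuck_from):
            cpu.heartbeat += 1

        # NMI handler: has the heartbeat moved since the last NMI?
        if cpu.heartbeat != cpu.last_seen:
            cpu.last_seen = cpu.heartbeat
            cpu.stalled_for = 0.0
        else:
            cpu.stalled_for += NMI_PERIOD
            if cpu.stalled_for >= timeout:
                return now  # this is where the host would be crashed
    return None


if __name__ == "__main__":
    print(simulate(20))                  # healthy processor
    print(simulate(20, stuck_from=3.0))  # softirqs stop at t = 3 s
```

In this model the last heartbeat before the stall lands at t = 2 s, so with the default 5 second timeout the watchdog fires at t = 7 s. The point of the split is that the NMI still arrives when normal timers and softirqs are not being serviced, which is what lets it notice the stall.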
