Re: [systemd-devel] [PATCH 0/4] systemd and watchdog

Albert Strasheim Wed, 28 Sep 2011 10:17:07 -0700

Hello

On Wed, Sep 28, 2011 at 6:59 PM, Michael Olbrich
<[email protected]> wrote:
> How to implement this in systemd:
> systemd already has the concept of a state for each service and a very
> simple method (sd_notify) for the service to provide status information to
> systemd.
> This is implemented in the first patch. A service can send keep-alive
> messages with sd_notify, and the timestamp of the latest message is exposed
> as a service property.


Very cool. I've been wondering how we could restart services that hang
(e.g., deadlock or go into infinite loops) but don't crash.
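
For illustration, a keep-alive sent from the service side might look roughly like the sketch below. It assumes the usual sd_notify transport (a datagram sent to the AF_UNIX socket named in $NOTIFY_SOCKET). The "WATCHDOG=1" string is what released systemd uses for keep-alives; whether this patch series uses that exact string is an assumption, and the function names are hypothetical.

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a notification string to the service manager.

    Returns False when NOTIFY_SOCKET is unset, i.e. when the process
    was not started by a manager that listens for notifications.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading '@' denotes a Linux abstract-namespace socket.
    if path.startswith("@"):
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), path)
    return True


def send_keepalive() -> bool:
    # Assumption: "WATCHDOG=1" is the keep-alive string, as in released
    # systemd; the patch series discussed here may use a different one.
    return sd_notify("WATCHDOG=1")
```

A service would call send_keepalive() from its main loop, so that a deadlock or infinite loop stops the messages and the manager notices the stale timestamp.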

> The second patch implements service restart / reboot when no keep-alive
> message was received for a certain amount of time.
> Note: This only triggers if at least one keep-alive was received. I don't
> think anything can be done if a service fails to start. This should be
> handled outside of systemd.

A question at this point: are ExecStartPost commands executed if a
service fails? If they are, and if they can obtain the main process's
exit status (if that's a well-defined concept), they could take
further action.
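
The timeout rule quoted above (act only once at least one keep-alive has been seen, and only after a configured interval of silence) can be sketched as a small predicate. This is not the code from the patch; the function and parameter names are hypothetical, and timestamps are assumed to be seconds from a monotonic clock.

```python
from typing import Optional


def watchdog_expired(last_keepalive: Optional[float],
                     now: float,
                     timeout: float) -> bool:
    """Return True when the service should be restarted or the system rebooted.

    last_keepalive is None until the first keep-alive arrives, so a
    service that never sends one never trips the watchdog.
    """
    if last_keepalive is None:
        return False
    # Expired only when the silence is strictly longer than the timeout.
    return now - last_keepalive > timeout
```

The None case is what leaves a service that fails to start outside this mechanism, which matches the note in the quoted text.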

> I think, the watchdog hardware should be handled in a separate service, for
> several reasons:

Agreed. We've had good results with an IPMI watchdog and Fedora's
watchdog package. I think it might even include a .service file, or
maybe I wrote a simple one.

Regards

Albert