Hello!

tl;dr: When a daemon exits "normally" (for example due to signal 15) although it should not exit (because I did not call "systemctl stop"), systemd does not consider it a failure.
I have a Debian system running jessie. I have the following running:

* spamd (system wide, via the official systemd unit file)
* fetchmail (as user, started via a custom unit file in ~/.config/systemd/user)

For reasons that are not entirely clear to me, these daemons sometimes stop (crash?) without me noticing. This is obviously undesirable: if spamd crashes, all mail goes through as ham; if fetchmail crashes, I receive no more mail.

A simple solution using systemd would work pretty well for me. I would like systemd to try to restart the service a few times on failure and to notify me every time one of those services fails (so I know something happened regardless of whether the restart worked).

My first try is the following:

/etc/systemd/system/spamassassin.service.d/override.conf:

    [Service]
    Restart=on-failure
    RestartSec=5

    [Unit]
    # The OnFailure= line below works; please ask
    # if you need more info here.
    OnFailure=fail-notify@%n

(fetchmail uses a similar setup; both services are Type=forking)

I added the OnFailure= setting today and used kill -9 on the spamd processes to simulate a crash. It seems to work.

The point is that, in the past, spamd disappeared permanently even though "Restart=on-failure" was set. In my testing, using kill -15 on the daemon did not trigger a failure: systemd simply considers this a normal exit and happily ignores that a critical daemon is no longer running. I guess something similar may have happened silently before. I want this to go from a silent failure to a loud one!

How do I tell systemd that it is a failure if the daemon is not running, unless I explicitly stopped it via "systemctl stop" or similar?

Either I can't find the right Google query or this is not the right way to go about this. I would be thankful for any hints.

Tobias
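
P.S. To make the difference between my two test signals concrete, here is a small sketch that has nothing to do with systemd itself. It uses a plain "sleep" as a stand-in daemon (my assumption, purely for illustration) and shows that a parent process can tell which signal ended its child. It does not show how systemd decides what counts as a failure, only that the two cases are distinguishable.

```shell
#!/bin/sh
# Illustration only: "sleep 30" stands in for a daemon; this is not spamd
# and not systemd, just a parent shell looking at how its child ended.

# Start a long-running child and end it with signal 15 (SIGTERM).
sleep 30 &
pid=$!
kill -15 "$pid"
term_status=0
wait "$pid" || term_status=$?

# Same again, this time with signal 9 (SIGKILL).
sleep 30 &
pid=$!
kill -9 "$pid"
kill_status=0
wait "$pid" || kill_status=$?

# The shell reports "killed by signal N" as 128 + N,
# so these are 143 (SIGTERM) and 137 (SIGKILL).
echo "after kill -15: status $term_status"
echo "after kill -9:  status $kill_status"
```

So the information about the terminating signal is there; what I am missing is how to make systemd treat the signal 15 case as a failure too.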