On Tue, Oct 09, 2018 at 11:04:21AM +0200, to...@tuxteam.de wrote:
> Systemd's stance is that it wants to supervise the processes it creates. And
> the most effective way to do that is to keep them as children; that's what
> old BSD init did, and what other process supervision programs (e.g. runit)
> do. They can watch closely when one of their children die and take action.

You're actually thinking of inittab here.  This is what System V's
inittab was intended to do.  It's a monolithic file that controls the
starting and restarting of all services on the system.  You (root) edit
the file using your favorite text editor to add or remove a service, and
then run "init q" (or "telinit q" in later Linux implementations) to
have the running init process re-read the file.

For some reason, this model never caught on.  People didn't like having
a monolithic file that had to be human-edited and which could break the
entire boot process if munged.

So, they built another layer on top of inittab.

> The whole PID thing came up with SysV init and is a crutch: many things
> can happen: process dies, leaving behind the PID file, so you have to
> check explicitly whether process is still running (in itself a race
> condition), PIDs "wrap around", so if process died a while ago, the
> PID may be active again for a totally unrelated process, etc. etc.

This is the additional layer that was added later.  It's sometimes
called "sysv-rc".  In this model, inittab tells init to run a single
script, which then runs a whole bunch of other scripts.  Each of these
other scripts is of the form /etc/init.d/foobar and takes a single
argument, like "start" or "stop".

The objective of the /etc/init.d/foobar script is to start its
designated service as some kind of background program, capture its PID,
and store that PID in a file so that the process can be terminated
later.

The storing of PIDs in files is a catastrophic hack, for the reasons
you've already explained.

The only sane way to make sure you're controlling the actual service
that you started is to stay running, as the parent of the service.  As
the parent process, you have unique authority over the child-service.
You can be informed immediately when it terminates, you can receive its
exit status, and perhaps most importantly of all, the child is left in
the process table as a zombie until you, the parent, have asked for its
exit status.  Until then, its PID cannot be handed out to anything else.

Under the sysv-rc model, the child-service processes are orphaned.
There's no responsible parent hanging around waiting for that phone call
from the kernel telling it that Little Johnny has died.  The sysv-rc
services are wards of the state.  They become children of init (PID 1),
and init does not care about their exit status, or about notifying the
next of kin, or any of that.

When an orphaned process dies, the state (init) reaps it and discards
its exit status, and the zombie is gone.  The PID is free to be reused
by some other process.  The sysv-rc shell scripts have no way of knowing
whether this has occurred.

So, people tried to layer more and more and more hacks (Debian's
start-stop-daemon is one of these) on top of the problem.  But going
from 99% working to 99.9% working is still not a good solution.

The only 100% working solution is to keep the parent alive.

MANY people wrote service management systems to do this.  I won't
attempt to list them all here.  Systemd is just one of them.

For a more detailed analysis of the history of sysv-rc and its failings,
see <http://jdebp.eu./FGA/system-5-rc-problems.html> among many other
pages.
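
To make the "keep the parent alive" idea concrete, here is a minimal
sketch in Python.  It is not how systemd, runit, or any other real
supervisor is implemented; the function name supervise() and the
max_runs parameter are made up for illustration, and the "service" is
just a one-liner that exits with status 3.  The point is only that the
supervisor starts the service as its own child and then waits on it, so
it gets the exit status straight from the kernel and never needs a PID
file.

```python
import subprocess
import sys


def supervise(argv, max_runs=3):
    """Run argv as a direct child, restarting it until it exits with
    status 0 or max_runs is reached.  Returns the exit statuses seen.

    Hypothetical helper for illustration only.
    """
    statuses = []
    for _ in range(max_runs):
        # We are the parent of this process; no PID file is involved.
        child = subprocess.Popen(argv)
        # Blocks until the child terminates, then collects its status.
        # Until we do this, the dead child stays around as a zombie and
        # its PID cannot be reused.
        status = child.wait()
        statuses.append(status)
        if status == 0:
            break
    return statuses


if __name__ == "__main__":
    # Stand-in "service" (an assumption): always fails with status 3.
    failing_service = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(failing_service, max_runs=2))
```

A real supervisor would also handle signals, logging, restart
back-off and so on, but the parent/child relationship shown here is the
part that the PID-file approach cannot give you.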