Hi,

[....]

>> I wonder if the people developing systemd are paying attention to a
>> development in the Windows environment, where the latest thing is that the
>> service can report back that it is indeed still trying to stop, and is not
>> just hung or not reporting back. Windows will now kill a service after a
>> certain time when shutting down; in some cases it is killing a database that
>> took A LONG TIME to shut down, causing the database to become inconsistent.
>> The new development is to make sure that does not happen.
>> If systemd is trying to become smart about stopping services, it might be a
>> good idea to have this built in. Also, do not just have the service report
>> back "I am still busy", but also include a progress indicator which NEEDS to
>> increase at each report, so systemd can detect whether the service is indeed
>> progressing towards a stopped state or hung on the way there.

>> From the past I have seen things go wrong in communication when the only 
>> thing reported back is "I am busy" while there was no progress being made 
>> toward the finish.
>>
>> Is this something the systemd team has already put on the todo list or am I 
>> the first to suggest it?
>
> Possibly:
> 
> http://thread.gmane.org/gmane.comp.sysutils.systemd.devel/21419
> 
> http://article.gmane.org/gmane.comp.sysutils.systemd.devel/21997
> 
> "Introducing sd_notify() messages that can notify PID 1 about daemons 
> reloading or shutting down, has been on the TODO list for a while"

Ok, it looks like the problem has been seen before. The reason I think it is a 
good idea to also have a progress indicator is to make sure a daemon cannot 
keep a system from shutting down when there is no real progress towards a 
stopped state. 

An example from my "communication days".
A file needs to be transported and the job is handed over to the file transfer 
protocol job, whichever protocol that might be in this case. The file transfer 
starts and after a while there are some errors on the line, so some blocks are 
resent. Before the job is finished the line conditions become so bad that there 
is still some data transfer going on, but each block sent has errors and needs 
to be resent. So the protocol job is still busy sending the file but no 
progress is being made.
In the good old days when we were still using phone lines and modems I have 
seen an international file transfer that should have lasted a few minutes keep 
a line open for several hours, until the operator noticed the line was busy 
when it should have been free and canceled the job. A later protocol included 
a progress indicator so it could keep track of whether any progress was being 
made towards the end. If that counter did not increase each time, a watchdog 
part of the protocol would kill the transfer.
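
To make the watchdog idea concrete, here is a minimal sketch in Python. It is 
not taken from any real protocol: the names ProgressWatchdog and 
simulate_transfer and the max_stalled_reports tolerance are made up for 
illustration. With the default of 1 the job is killed on the first report that 
shows no increase, as described above.

```python
class ProgressWatchdog:
    """Hypothetical watchdog: gives up on a job whose counter stops increasing."""

    def __init__(self, max_stalled_reports=1):
        # Assumption: how many non-increasing reports are tolerated before
        # the job is killed. 1 means "kill on the first stalled report".
        self.max_stalled_reports = max_stalled_reports
        self.last_progress = None
        self.stalled_reports = 0

    def report(self, progress):
        """Record one progress report; return True if the job may continue."""
        if self.last_progress is None or progress > self.last_progress:
            self.last_progress = progress
            self.stalled_reports = 0
            return True
        self.stalled_reports += 1
        return self.stalled_reports < self.max_stalled_reports


def simulate_transfer(acked_blocks_per_report, max_stalled_reports=1):
    """Feed a series of 'blocks acknowledged so far' values to the watchdog."""
    watchdog = ProgressWatchdog(max_stalled_reports)
    for report_number, acked in enumerate(acked_blocks_per_report, start=1):
        if not watchdog.report(acked):
            return f"killed at report {report_number}"
    return "completed"
```

The counter here is the number of blocks acknowledged without error, so 
endless resends of the same block do not count as progress.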

Something like that could happen during daemon start / stop, mainly during stop 
I think, where a job wants to complete some steps and tells systemd "hold on, I 
am still busy". At the same time, for whatever reason, it can no longer complete 
those steps and at some point needs to be killed in order for the system to 
continue shutting down or do whatever.
The "hard" part for the daemon will be deciding what to use as a "progress 
indicator", rather than simply incrementing an i++ counter each time systemd 
asks it whether it has almost finished shutting down. You want something that 
indicates real progress but at the same time is fine-grained enough that it can 
increase each time systemd asks the daemon. On the other hand, that might not 
be too hard if systemd does not ask too often. ;-)
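
A small sketch of the difference between the two kinds of counter, again in 
Python and purely hypothetical: StoppingDaemon, its "dirty pages" workload and 
is_stalled are invented for this example and are not part of systemd or any 
real daemon.

```python
class StoppingDaemon:
    """Hypothetical daemon that must flush dirty pages before it can stop."""

    def __init__(self, dirty_pages):
        self.total_pages = dirty_pages
        self.flushed_pages = 0
        self.polls_answered = 0

    def flush_some(self, pages):
        # Work done between two polls; 0 models a daemon that is stuck.
        self.flushed_pages = min(self.total_pages, self.flushed_pages + pages)

    def naive_progress(self):
        # The i++ approach: increases on every poll, even when nothing
        # was flushed, so a hung daemon still looks busy.
        self.polls_answered += 1
        return self.polls_answered

    def real_progress(self):
        # Tied to actual work: only increases when pages were flushed.
        return self.flushed_pages


def is_stalled(previous, current):
    """The manager's check: progress must strictly increase between polls."""
    return current <= previous
```

If the daemon hangs after flushing part of its data, real_progress() returns 
the same value on the next poll and is_stalled() reports it, while 
naive_progress() keeps growing and hides the problem.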

Most daemons will not need this feature and systemd can rely on a timeout 
killing the job if it does not stop within x seconds. But it would be good if 
the start / stop protocol allows for it when a notify part is developed.

After all, if the system is running on the UPS batteries after a power failure 
and the low battery indicator was used to start a system shutdown, you want the 
system to shut down eventually, before the battery runs out. ;-) 

Bonno Bloksma

_______________________________________________
systemd-devel mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
