Re: [systemd-devel] Later activation of the HW watchdog

Jan Kundrát Thu, 14 Jun 2018 10:19:04 -0700

On Tuesday, 24 October 2017 17:10:39 CEST, Jan Kundrát wrote:

Hi,
Is it possible to change systemd's global setting for RuntimeWatchdogSec at runtime? I would like to have the early boot "guarded" by the HW watchdog started by my platform code, and for systemd to take over only after a certain target has been reached. I was thinking about an extra unit which simply writes an appropriate config file, but the docs for `systemctl daemon-reload` and `daemon-reexec` do not talk about these top-level settings. How do I tell systemd to notice a new value?
Context: I'm using systemd on an embedded ARM box with reliable network connectivity. The system has two fully separate rootfs/kernel/devicetree instances, A and B. The bootloader starts a HW watchdog timer, and it also keeps a counter tracking how many times a particular A/B "boot slot" has attempted to boot. The kernel ignores the watchdog, and once systemd gets launched and checks its system.conf file, it proceeds to restart the WD timer periodically. Finally, a unit which is pulled in by my default target updates the bootloader's environment, resetting the boot counter.
My goal is to be able to boot a possibly broken image (but not a malicious one, of course) without fearing that it's going to lock me out of my device. If the new image "fails" for some reason, I expect the HW watchdog to reset the system, the boot attempt counter to eventually reach zero, and the whole system to roll back to the previous image. In my scenario, it's preferable to make the decision to reboot rather than wait for human interaction to solve the actual problem. The once-failed slot can be re-flashed very cheaply, and an updated version can be retried during the next update attempt.
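As a rough illustration of the fallback logic described above, here is a minimal shell sketch. The function names `choose_slot` and `next_counter` are hypothetical and not taken from any particular bootloader; the sketch assumes the counter counts down and that zero means the slot has used up its attempts.

```shell
#!/bin/sh
# Hypothetical model of the bootloader's A/B decision (names are illustrative only).

# choose_slot ACTIVE REMAINING FALLBACK
# Prints the slot to boot: the active slot while attempts remain,
# otherwise the fallback slot.
choose_slot() {
    active=$1
    remaining=$2
    fallback=$3
    if [ "$remaining" -gt 0 ]; then
        echo "$active"
    else
        echo "$fallback"
    fi
}

# next_counter REMAINING
# Each boot attempt consumes one try; the counter never drops below zero.
next_counter() {
    if [ "$1" -gt 0 ]; then
        echo $(( $1 - 1 ))
    else
        echo 0
    fi
}
```

With these definitions, `choose_slot B 0 A` prints `A`: once the watchdog has reset the board often enough for slot B's counter to hit zero, the previous image in slot A is selected again. A successful boot would instead write the counter back to its starting value before it gets that far.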
During my testing, I was able to unplug the system's SD card at a "wrong" moment, which resulted in systemd trying to boot into emergency.target and ultimately failing due to a missing rootfs. I ended up with an unusable system which did not reboot automatically because systemd was periodically pinging the HW watchdog timer. [1]
I got a suggestion to adjust the important units so that they specify a FailureAction. I do not like that solution because it is additional work (identifying which units might fail, coming up with various possible failure scenarios, being hard to test and get "right" in the face of future systemd updates, etc.). It also feels like I am attacking the wrong problem. I already *have* a watchdog which will shoot the system in the head if something goes wrong. Wouldn't it make more sense to rely on this piece of infrastructure and start telling the watchdog "hey, I'm OK" only after the system has fully booted and my ultimate target has been *reached*?
Suggestions which offer additional possibilities are welcome. I like systemd's feature set, and I won't pretend that I know all of it :).
With kind regards,
Jan

[1] https://github.com/systemd/systemd/issues/7063

I more or less solved this by *not* configuring systemd to start pinging the watchdog on its own. Then I added another unit, depending on and wanted by multi-user.target, which checks whether everything is OK so far:


 [Unit]
 Description=Pinging the HW watchdog
 Requires=multi-user.target
 After=multi-user.target

 [Service]
 Type=oneshot
 ExecStartPre=/bin/sh -c '[ "$(/bin/systemctl list-units --failed --all --no-legend --no-pager)" = "" ]'
 ExecStart=/bin/busctl set-property org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager RuntimeWatchdogUSec t 30000000
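
To confirm that the new timeout took effect, the property can be read back with `busctl get-property`. The following is only a sketch: the helper names `watchdog_usec_to_sec` and `current_watchdog_sec` are made up for illustration, and it assumes the reply has the form `t <microseconds>`, matching the type and value written by the unit above.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the watchdog timeout held by PID 1.

# watchdog_usec_to_sec REPLY
# Converts a reply such as "t 30000000" (D-Bus type, microseconds)
# into whole seconds.
watchdog_usec_to_sec() {
    # Split the reply on whitespace: $1 becomes the type, $2 the value.
    set -- $1
    echo $(( $2 / 1000000 ))
}

# current_watchdog_sec
# Reads RuntimeWatchdogUSec from the systemd manager object.
# Only meaningful on a host that is actually running systemd.
current_watchdog_sec() {
    reply=$(busctl get-property org.freedesktop.systemd1 \
        /org/freedesktop/systemd1 \
        org.freedesktop.systemd1.Manager RuntimeWatchdogUSec) || return 1
    watchdog_usec_to_sec "$reply"
}
```

For instance, `watchdog_usec_to_sec "t 30000000"` prints `30`, which corresponds to the 30-second timeout set by the ExecStart line.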

For more details, see the original bug report at https://github.com/systemd/systemd/issues/7063 .


Cheers,
Jan
