On Tuesday, 24 October 2017 17:10:39 CEST, Jan Kundrát wrote:
Hi,
is it possible to change systemd's global settings for
RuntimeWatchdogSec at runtime? I would like to have the early
boot "guarded" by the HW watchdog started by my platform code,
and for systemd to take over only after a certain target has
been reached. I was thinking about an extra unit which simply
writes an appropriate config file, but the docs for `systemctl
daemon-reload` or `daemon-reexec` do not talk about these
top-level settings. How do I tell systemd to notice a new value?
Context: I'm using systemd on an embedded ARM box with reliable
network connectivity. The system has two fully separate
rootfs/kernel/devicetree instances, A and B. The bootloader
starts a HW watchdog timer and keeps a counter tracking how
many times a particular A/B "boot slot" has attempted to boot.
The kernel ignores the watchdog, and once systemd gets launched
and checks its system.conf file, it
proceeds to re-start the WD timer periodically. Finally, a unit
which is pulled in by my default target updates the bootloader's
environment, resetting the boot counter.
My goal is to be able to boot a possibly broken image (but not
a malicious one, of course) without fearing that it's going to
lock me out of my device. If the new image "fails" for some
reason, I expect the HW watchdog to reset the system, the boot
attempt counter to eventually reach zero, and the whole system
to roll back to the previous image. In my scenario,
it's preferred to make the decision to reboot rather than
waiting for human interaction for solving the actual problem.
The once-failed slot can be re-flashed very cheaply, and an
updated version can be re-tried during the next update attempt.
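A rough shell sketch of that fallback decision, under the assumption
that the counter holds the remaining boot attempts for the active slot.
The function names pick_slot and next_tries are hypothetical and do not
come from any real bootloader:

```shell
#!/bin/sh
# Hypothetical model of the bootloader's A/B decision (illustration only).

# pick_slot ACTIVE FALLBACK TRIES_LEFT
# Prints the slot to boot: the active one while attempts remain,
# otherwise the fallback (previous) slot.
pick_slot() {
    active=$1
    fallback=$2
    tries_left=$3
    if [ "$tries_left" -gt 0 ]; then
        printf '%s\n' "$active"
    else
        printf '%s\n' "$fallback"
    fi
}

# next_tries TRIES_LEFT
# Prints the counter value to store before booting: one attempt is
# spent per boot, and the counter never goes below zero.
next_tries() {
    tries_left=$1
    if [ "$tries_left" -gt 0 ]; then
        printf '%s\n' "$((tries_left - 1))"
    else
        printf '0\n'
    fi
}
```

Each watchdog-triggered reset spends one attempt; only a boot that is
explicitly confirmed from userspace resets the counter, so a slot that
never confirms is eventually abandoned in favour of the other one.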
During my testing, I was able to unplug the system's SD card at
a "wrong" moment which resulted in systemd trying to boot into
emergency.target and ultimately failing due to a missing rootfs.
I ended up with an unusable system which did not reboot
automatically because systemd was periodically pinging the HW
watchdog timer. [1]
I got a suggestion to adjust the important units so that they
specify a FailureAction. I do not like that solution because it
is additional work (identifying which units might fail, coming
up with various possible failure scenarios) and it is hard to test
and get "right" in the face of future systemd updates. It
also feels like I am attacking a wrong problem. I already *have*
a watchdog which will shoot the system in the head if
something goes wrong. Wouldn't it make more sense to rely on
this piece of infrastructure and start telling the watchdog
"hey, I'm OK" only after the system has fully booted and my
ultimate target has been *reached*?
Suggestions which offer additional possibilities are welcome. I
like systemd's feature set, and I won't pretend that I know all
of its features :).
With kind regards,
Jan
[1] https://github.com/systemd/systemd/issues/7063
I more or less solved this by *not* configuring systemd to start pinging
the watchdog on its own. Then I added another unit, which requires and is
wanted by multi-user.target, that checks whether any unit has failed so far
and only then tells systemd to start pinging the watchdog:
[Unit]
Description=Pinging the HW watchdog
Requires=multi-user.target
After=multi-user.target
[Service]
Type=oneshot
# Fail (and leave the HW watchdog un-pinged) if any unit is in the failed state.
ExecStartPre=/bin/sh -c '[ "$(/bin/systemctl list-units --failed --all --no-legend --no-pager)" = "" ]'
# Tell the running systemd to start pinging the watchdog (30000000 us = 30 s).
ExecStart=/bin/busctl set-property org.freedesktop.systemd1 \
    /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager \
    RuntimeWatchdogUSec t 30000000
For more details, see the original bug report at
https://github.com/systemd/systemd/issues/7063 .
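The same check could also live in a small script, which makes the
"has anything failed?" decision easy to test in isolation. A minimal
sketch, assuming a POSIX shell; boot_is_healthy and confirm_boot are
hypothetical names, while the systemctl and busctl invocations are the
ones used in the unit above:

```shell
#!/bin/sh
# Hypothetical helper script (illustration only, not part of systemd).

# boot_is_healthy FAILED_UNITS
# Succeeds only when the given list of failed units is empty.
boot_is_healthy() {
    [ -z "$1" ]
}

# confirm_boot
# Arms systemd's runtime watchdog only if no unit has failed so far.
# On any failure nothing pings the HW watchdog, so it resets the box.
confirm_boot() {
    failed=$(systemctl list-units --failed --all --no-legend --no-pager) || return 1
    boot_is_healthy "$failed" || return 1
    busctl set-property org.freedesktop.systemd1 \
        /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager \
        RuntimeWatchdogUSec t 30000000
}
```

Keeping the emptiness check in its own function means it can be
exercised with canned input, without a running systemd.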
Cheers,
Jan