> and/or failing. Imho, one shouldn't be killing journald, when it is otherwise
> obviously operating fine (aka waiting to be run).

@xnox: How do you tell that there is no livelock and that journald is
operating fine when it has just timed out on a 3-minute watchdog timer?

> I'm concerned as to why there is a watchdog on journald now. It should be
> rocksolid, and either work or crash, there is no need to crash it on a fixed
> schedule just because.

If we drop the watchdog, we won't get any new journal entries once journald
enters, or is tricked into, an infinite loop. I don't think that would be wise.
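
For readers unfamiliar with the mechanism, here is a minimal sketch of the
watchdog handshake under discussion. It is written in Python for brevity and is
not journald's actual code. NOTIFY_SOCKET, WATCHDOG_USEC and the READY=1 /
WATCHDOG=1 messages are part of systemd's notification protocol; the function
names sd_notify, keepalive_interval and serve are made up for this example.

```python
import os
import socket


def sd_notify(message: str) -> bool:
    """Send one notification datagram to the service manager.

    Returns False when NOTIFY_SOCKET is unset, i.e. the process was not
    started by systemd with notifications enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket.
    addr = "\0" + path[1:] if path.startswith("@") else path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), addr)
    return True


def keepalive_interval(default: float = 60.0) -> float:
    """Seconds between pings: half the watchdog period systemd exports."""
    usec = os.environ.get("WATCHDOG_USEC")
    return int(usec) / 2_000_000 if usec else default


def serve(handle_one_event, rounds: int) -> None:
    """Hypothetical main loop: ping only after real work has completed."""
    sd_notify("READY=1")
    for _ in range(rounds):
        handle_one_event()       # if this never returns, pings stop
        sd_notify("WATCHDOG=1")  # proves the loop is still making progress
```

Because the ping is sent from the main loop, a daemon stuck in an infinite loop
stops pinging and gets killed and restarted once the watchdog period expires. A
daemon that is merely starved of CPU time looks exactly the same from systemd's
side, which is the case this bug report is about.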

There are upstream bugs with too little information for similar issues:
https://github.com/systemd/systemd/issues/2899
https://github.com/systemd/systemd/issues/2924

@xnox: Do you have links to reports with enough information for debugging?


** Bug watch added: github.com/systemd/systemd/issues #2899
   https://github.com/systemd/systemd/issues/2899

** Bug watch added: github.com/systemd/systemd/issues #2924
   https://github.com/systemd/systemd/issues/2924

https://bugs.launchpad.net/bugs/1696970

Title:
  softlockup DoS causes systemd-journald.service to abort with SIGABORT

Status in systemd package in Ubuntu:
  Confirmed
Status in systemd source package in Artful:
  Opinion
Status in systemd source package in Bionic:
  Confirmed

Bug description:
  I was running the new stress-ng softlockup stressor and observed that
  systemd-journald gets killed with an abort and this corrupts the
  systemd journal.

  How to reproduce:

  git clone git://kernel.ubuntu.com/cking/stress-ng
  cd stress-ng
  make clean; make

  sudo ./stress-ng --softlockup 0 -t 360 -v

  ...and wait for 360 seconds. dmesg shows the following, 100%
  reproducible:

  
  [  875.310331] systemd[1]: systemd-timesyncd.service: Watchdog timeout (limit 3min)!
  [  875.310740] systemd[1]: systemd-timesyncd.service: Killing process 574 (systemd-timesyn) with signal SIGABRT.
  [  875.327289] systemd[1]: systemd-timesyncd.service: Main process exited, code=killed, status=6/ABRT
  [  875.327666] systemd[1]: systemd-timesyncd.service: Unit entered failed state.
  [  875.327686] systemd[1]: systemd-timesyncd.service: Failed with result 'watchdog'.
  [  875.327917] systemd[1]: systemd-timesyncd.service: Service has no hold-off time, scheduling restart.
  [  875.327954] systemd[1]: Stopped Network Time Synchronization.
  [  875.328845] systemd[1]: Starting Network Time Synchronization...
  [  875.525071] systemd[1]: Started Network Time Synchronization.
  [  875.539619] systemd[1]: systemd-journald.service: Main process exited, code=dumped, status=6/ABRT
  [  875.544257] systemd-journald[5214]: File /run/log/journal/440e485e550040e3b93b66b2faae8525/system.journal corrupted or uncleanly shut down, renaming and replacing.

