On Wed, 22.10.14 13:10, Damien Robert ([email protected]) 
wrote:

> On one boot I got a watchdog timeout on systemd-journald:
> 
> Oct 21 20:08:21 feanor systemd-journal[213]: Permanent journal is using 68.7M 
> (m
> Oct 21 20:08:21 feanor systemd-journal[213]: Time spent on flushing to /var 
> is 2
> Oct 21 20:08:25 feanor kernel: IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not 
> rea
> Oct 21 20:08:25 feanor kernel: wlan0: authenticate with f4:ca:e5:cc:c7:40
> Oct 21 20:08:25 feanor kernel: wlan0: send auth to f4:ca:e5:cc:c7:40 (try 1/3)
> Oct 21 20:08:25 feanor kernel: wlan0: authenticated
> Oct 21 20:08:25 feanor kernel: wlan0: associate with f4:ca:e5:cc:c7:40 (try 
> 1/3)
> Oct 21 20:08:25 feanor kernel: wlan0: RX AssocResp from f4:ca:e5:cc:c7:40 
> (capab
> Oct 21 20:08:25 feanor kernel: wlan0: associated
> Oct 21 20:08:25 feanor kernel: IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link 
> become
> Oct 21 20:08:20 feanor systemd[1]: Started Trigger Flushing of Journal to 
> Persis
> Oct 21 20:08:20 feanor systemd[1]: Started Create Volatile Files and 
> Directories
> Oct 21 20:08:20 feanor systemd[1]: Starting Network Time Synchronization...
> Oct 21 20:08:20 feanor systemd[1]: Starting Update UTMP about System 
> Boot/Shutdo
> Oct 21 20:08:20 feanor systemd[1]: Started Update UTMP about System 
> Boot/Shutdow
> Oct 21 20:08:20 feanor systemd[1]: Started Network Time Synchronization.
> Oct 21 20:08:20 feanor systemd[1]: Starting System Initialization.
> Oct 21 20:08:21 feanor systemd[1]: Reached target System Initialization.
> Oct 21 20:08:21 feanor systemd[1]: Starting sshd.socket.
> Oct 21 20:08:21 feanor systemd[1]: Listening on sshd.socket.
> Oct 21 20:08:21 feanor systemd[1]: Starting D-Bus System Message Bus Socket.
> Oct 21 20:08:25 feanor systemd-logind[516]: New seat seat0.
> Oct 21 20:08:25 feanor systemd-networkd[518]: lo              : gained carrier
> Oct 21 20:08:25 feanor systemd-logind[516]: Watching system buttons on 
> /dev/inpu
> Oct 21 20:08:25 feanor systemd-logind[516]: Watching system buttons on 
> /dev/inpu
> Oct 21 20:08:25 feanor systemd-logind[516]: Watching system buttons on 
> /dev/inpu
> Oct 21 20:08:25 feanor systemd[1]: Started Network Service.
> Oct 21 20:08:25 feanor systemd[1]: Starting Network.
> Oct 21 20:08:25 feanor systemd[1]: Reached target Network.
> Oct 21 20:08:25 feanor systemd[1]: Starting Network Name Resolution...
> Oct 21 20:11:20 feanor systemd-journal[574]: Permanent journal is using 75.1M 
> (m
> Oct 21 20:11:20 feanor kernel: e1000e 0000:00:19.0: irq 55 for MSI/MSI-X
> Oct 21 20:11:20 feanor kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not 
> read
> Oct 21 20:11:20 feanor systemd[1]: systemd-journald.service watchdog timeout 
> (li
> Oct 21 20:11:20 feanor systemd[1]: Starting Journal Service...
> Oct 21 20:11:20 feanor systemd[1]: Started Getty on tty2.
> Oct 21 20:11:20 feanor systemd[1]: lightdm.service start operation timed out. 
> Te
> Oct 21 20:11:20 feanor systemd[1]: systemd-timesyncd.service watchdog timeout 
> (l
> Oct 21 20:11:20 feanor systemd[1]: systemd-journald.service stop-sigterm 
> timed o
> Oct 21 20:11:20 feanor systemd[1]: Starting Journal Service...
> Oct 21 20:11:20 feanor systemd[1]: systemd-journald.service: main process 
> exited
> Oct 21 20:11:20 feanor systemd[1]: Unit systemd-journald.service entered 
> failed 
> Oct 21 20:11:20 feanor systemd[1]: systemd-journald.service has no holdoff 
> time,
> Oct 21 20:11:20 feanor systemd[1]: Stopping Journal Service...
> Oct 21 20:11:20 feanor systemd[1]: Starting Journal Service...
> Oct 21 20:11:20 feanor systemd[1]: Started Journal Service.
> 
> As you can see at 20:08:20 things appears to look fine, but everything
> get stuck during 3 minutes until there is a watchdog timeout for the
> journal and it gets restarted. Then the boot finished without any further
> problem. The next time everything worked fine, so I am wondering what could
> have happened?

That's difficult to say just from these logs. Can you reliably
reproduce this? If so, can you attach strace to journald before this
happens and see what it is doing?

What I find interesting is that after the 3min pause suddenly the
network driver allocates irq 55. This kinda makes me wonder if this
might be a kernel/driver problem of some kind, where for some reason
userspace (and hence journald) don't get scheduled?

What distro/version/arch is this?

Lennart

-- 
Lennart Poettering, Red Hat
_______________________________________________
systemd-devel mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/systemd-devel

Reply via email to