On 11-09-18 15:40, Christian Brauner wrote:
>> Kees Bakker <[email protected]> wrote on 11 September 2018 at 15:13:
>>
>>
>> Hey,
>>
>> Every now and then we have one or more containers in state ERROR.
>> Is there a clever method to recover from that, other than
>> rebooting the LXD server?
>>
>> Killing the monitor and the forkstart does help. And also a kworker
>> process (kworker/u16:0) is eating up one of the CPUs with 100% load.
>> lxc info gives "error: Monitor is hung"
> If I'm not mistaken this is usually caused by a hanging lxc-monitord
> process which older LXC versions still use and which is removed in
> newer LXC versions.
> Can you check whether you see a lxc-monitord process when such a hang
> happens? If so, kill it. Afterwards things should work fine again.
Killing lxc-monitord did not help. I had to kill a "[lxc monitor]" process
as well. Then the container got back to state "STOPPED". But after trying
to start the container again, the state went back to "ERROR".
Meanwhile the kworker/u16:0 process continued at 100% load.

>> I'm running Ubuntu 16.04 with BTRFS. The kernel is 4.15.0-33-generic
>
> Cc stgraber since I don't have in mind what LXC version is used
> and if it is one that has already gotten rid of lxc-monitord.

ii  lxc-common  2.0.8-0ubuntu1~16.04.2   amd64  Linux Containers userspace tools (common tools)
ii  lxcfs       2.0.8-0ubuntu1~16.04.2   amd64  FUSE based filesystem for LXC
ii  lxd         2.0.11-0ubuntu1~16.04.4  amd64  Container hypervisor based on LXC - daemon
ii  lxd-client  2.0.11-0ubuntu1~16.04.4  amd64  Container hypervisor based on LXC - client

-- 
Kees Bakker
_______________________________________________
lxc-users mailing list
[email protected]
http://lists.linuxcontainers.org/listinfo/lxc-users
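
For anyone who wants to repeat the check described in this thread, here is a minimal Python sketch that lists the processes mentioned above (lxc-monitord, "[lxc monitor]", forkstart, kworker) together with their CPU usage. It is only an illustration: the helper names and the pattern list are assumptions made for this example, not part of LXC or LXD, and it assumes a Linux `ps` that accepts `-eo pid,pcpu,args`. It only reports candidates and does not kill anything.

```python
import subprocess

# Name fragments taken from the thread; this list is an assumption, adjust as needed.
SUSPECT_PATTERNS = ("lxc-monitord", "[lxc monitor]", "forkstart", "kworker/")


def parse_ps_output(text):
    """Parse `ps -eo pid,pcpu,args` output into (pid, cpu_percent, command) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        # Skip the header line and anything that does not start with a numeric PID.
        if len(parts) < 3 or not parts[0].isdigit():
            continue
        try:
            cpu = float(parts[1])
        except ValueError:
            continue
        rows.append((int(parts[0]), cpu, parts[2]))
    return rows


def find_suspects(rows, patterns=SUSPECT_PATTERNS):
    """Keep only the rows whose command line contains one of the patterns."""
    return [row for row in rows if any(p in row[2] for p in patterns)]


def list_suspects():
    """Run ps on the local machine and return the matching processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_suspects(parse_ps_output(result.stdout))
```

Calling `list_suspects()` on the LXD host returns the matching (pid, cpu, command) tuples, which can then be inspected by hand before deciding which process to kill.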
