Hi,

I think we can forget my former suggestion for isolation, at least for now. Thanks for closing 848319, by the way; that explanation gave me the confidence to continue debugging this case.

I have now found that I can run into this issue on x86 as well, so it is not arch dependent at all. Still, it is weird: in a dep8 environment I seem to run into this 100% of the time, while I never do when I redo the same steps on my own system.

I logged into the dep8 KVM guest and checked how reproducible it is. It turns out that after the FIRST restart the lxc guest is killed and then listed as shut down. The log lists an error like this:

  Dec 19 14:40:19 autopkgtest libvirtd[4500]: internal error: No valid cgroup for machine sl
  Dec 19 14:40:19 autopkgtest libvirtd[4500]: End of file while reading data: Input/output error

That is somewhat familiar if you know
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=774237
but we are clearly above that systemd level here.

What makes this even more interesting is that AFTER the issue has happened, things seem to be OK. So afterwards doing

  export ...
  virsh start sl
  # now I can restart libvirt without affecting guest "sl"

The following cleanup and re-define gets me back to the situation where the next restart will destroy the guest and write the error listed above to the log:

  virsh destroy sl
  virsh undefine sl
  rm -rf /etc/libvirt
  apt-get remove --purge libvirt-daemon-system libvirt-clients libxml2-utils
  apt-get install libvirt-daemon-system libvirt-clients libxml2-utils
  virsh define smoke-lxc.xml
  virsh start sl
  virsh list --all
  # now trigger the fail with
  /etc/init.d/libvirtd restart

After this I have libvirt running again, but not really:

  systemctl status libvirtd
  ● libvirtd.service - Virtualization daemon
     Loaded: loaded (/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2016-12-19 14:52:51 CET; 14s ago
       Docs: man:libvirtd(8)
             http://libvirt.org
   Main PID: 7608 (libvirtd)
      Tasks: 18
     CGroup: /system.slice/libvirtd.service
             ├─2035 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leases
             ├─2036 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leases
             └─7608 /usr/sbin/libvirtd

  Dec 19 14:52:51 autopkgtest systemd[1]: Starting Virtualization daemon...
  Dec 19 14:52:51 autopkgtest systemd[1]: Started Virtualization daemon.
  Dec 19 14:52:52 autopkgtest dnsmasq[2035]: read /etc/hosts - 8 addresses
  Dec 19 14:52:52 autopkgtest dnsmasq[2035]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
  Dec 19 14:52:52 autopkgtest dnsmasq-dhcp[2035]: read /var/lib/libvirt/dnsmasq/default.hostsfile
  Dec 19 14:52:52 autopkgtest libvirtd[7608]: libvirt version: 2.5.0, package: 1ubuntu1~ppa3 (Christian Ehrhardt <christian.ehrha...@canonical.com>
  Dec 19 14:52:52 autopkgtest libvirtd[7608]: hostname: autopkgtest.localdomain
  Dec 19 14:52:52 autopkgtest libvirtd[7608]: internal error: No valid cgroup for machine sl
  Dec 19 14:52:52 autopkgtest libvirtd[7608]: End of file while reading data: Input/output error

I say "not really", even though systemd reports the service as active, because virsh list now looks like this:

  $ virsh list --all
   Id    Name                           State
  ----------------------------------------------------

That is it; the guest is completely gone.

Restarting the service again lets it start normally and the guest returns, although in stopped state:

  $ /etc/init.d/libvirtd restart
  $ virsh list --all
   Id    Name                           State
  ----------------------------------------------------
   -     sl                             shut off

I can start the guest again now, and from this point on it is resilient against restarts.

  $ virsh start sl
  $ /etc/init.d/libvirtd restart
  $ virsh list --all
   Id    Name                           State
  ----------------------------------------------------
   9020  sl                             running

So much for now; I am trying to gather some extra debug data next ...
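
In case someone wants to script the "did the guest survive the restart" check instead of eyeballing virsh output, here is a minimal sketch. The helper name guest_state is made up for illustration and is not part of libvirt; it only parses the table that virsh list --all prints, and assumes the three-column layout shown in this mail.

```shell
#!/bin/sh
# Hypothetical helper (not part of libvirt): read `virsh list --all`
# output on stdin and print the State column for the named domain,
# or "missing" if the domain is not listed at all.
guest_state() {
    awk -v name="$1" '
        $2 == name {
            # Drop the Id and Name columns, keep the (possibly
            # two-word) state such as "shut off".
            $1 = ""; $2 = ""
            sub(/^ +/, "")
            print
            found = 1
        }
        END { if (!found) print "missing" }
    '
}

# Usage on a live system (guest name "sl" as in this report):
#   /etc/init.d/libvirtd restart
#   virsh list --all | guest_state sl
```

With the three situations described above this should print "missing" right after the failing restart, "shut off" after the second restart, and "running" once the guest has become resilient.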