Verification done on focal-proposed, following comments 23, 24, 25, 26. Including in this comment a few key snippets from each test/comment.
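The check on the domain status XML that recurs in the snippets below (grep for the monitor path in /run/libvirt/qemu/test-vm.xml) could be scripted roughly as follows. This is only a sketch: the function name check_status_xml is hypothetical and not part of the test steps, and it assumes nothing beyond "a healthy status XML contains a <monitor path=...> element", which is what the "no monitor path" error in the bug title refers to.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original test steps): report whether
# a libvirt domain status XML file still contains a monitor path element.
check_status_xml() {
    xml="$1"

    # The status XML is mode 0600 and owned by root, so an unreadable file
    # may simply mean the helper was not run as root.
    if [ ! -r "$xml" ]; then
        echo "missing or unreadable: $xml"
        return 2
    fi

    # Same pattern family as the manual grep in the snippets, narrowed to
    # the element whose absence triggers "no monitor path".
    if grep -q "<monitor path=" "$xml"; then
        echo "ok: $xml"
        return 0
    fi

    echo "no monitor path: $xml"
    return 1
}
```

Run as root, for example `check_status_xml /run/libvirt/qemu/test-vm.xml`; a return code of 0 corresponds to the grep output shown in each scenario.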

--- Environment ---

LXD virtual machine

lxc launch --vm ubuntu:focal lp2059272-focal
lxc exec lp2059272-focal -- su - ubuntu

Enable proposed & debug symbols

cat <<EOF | sudo tee /etc/apt/sources.list.d/proposed.list
deb http://archive.ubuntu.com/ubuntu focal-proposed main universe
deb http://ddebs.ubuntu.com focal-proposed main universe
EOF

cat <<EOF | sudo tee /etc/apt/preferences.d/proposed
Package: *
Pin: release a=focal-proposed
Pin-Priority: 400
EOF

sudo apt install --yes --no-install-recommends gdb qemu-system-x86 ubuntu-dbgsym-keyring
sudo apt update
sudo apt install --yes --no-install-recommends -t focal-proposed libvirt{0,-daemon{,-driver-qemu,-system}}{,-dbgsym} libvirt-clients

$ apt-cache policy libvirt-daemon-driver-qemu
libvirt-daemon-driver-qemu:
  Installed: 6.0.0-0ubuntu8.20
  Candidate: 6.0.0-0ubuntu8.20
  Version table:
 *** 6.0.0-0ubuntu8.20 400
        400 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     6.0.0-0ubuntu8.19 500
        500 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages
     6.0.0-0ubuntu8 500
        500 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages

newgrp libvirt # or logout/login

Libvirtd debug logging

cat <<-EOF | sudo tee -a /etc/libvirt/libvirtd.conf
log_filters="1:qemu 1:libvirt"
log_outputs="3:syslog:libvirtd 1:file:/var/log/libvirt/libvirtd-debug.log"
EOF

--- Steps with test packages on Focal (normal restarts) ---

<...>

for SLEEP in $(seq 0.1 0.1 2.0); do
<...>

All VMs are still managed by libvirt:

$ virsh list
 Id   Name         State
----------------------------
 1    test-vm-1    running
 2    test-vm-2    running
 3    test-vm-3    running
 4    test-vm-4    running
 5    test-vm-5    running
 6    test-vm-6    running
 7    test-vm-7    running
 8    test-vm-8    running
 9    test-vm-9    running
 10   test-vm-10   running

--- Steps with test packages on Focal (shutdown-on-init) ---

Scenario 1) Shutdown wins race against XML update (ie, shutdown happens first)

<...>

Now, let the
qemuProcessReconnect thread continue, it will not update the XML file, because 'quit' is set (ie, shutdown in progress)

(gdb) t 20
(gdb) p ((virNetDaemonPtr)anyobj)->quit
$2 = true

$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml

(gdb) c &

$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml

<...>

$ sudo grep 'Leaving the update of .* domain status XML' /var/log/libvirt/libvirtd-debug.log
2024-04-24 12:08:40.054+0000: 3770: info : qemuProcessReconnect:8157 : Leaving the update of 'test-vm' domain status XML for the next initialization (shutdown detected on this initialization).

<...>

$ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
<domstatus state='running' reason='booted' pid='3726'>
  <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
  <domain type='qemu' id='1'>

Scenario 2) Shutdown loses race against XML update (ie, update happens first)

<...>

Instead, let the qemuProcessReconnect thread take the lock, and update the XML file, but not unlock yet

<...>

$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml

(gdb) b virObjectUnlock thread 20 if anyobj == $ptr
(gdb) c

$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10189 Apr 24 12:14 /run/libvirt/qemu/test-vm.xml

<...>

$ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
<domstatus state='running' reason='booted' pid='3726'>
  <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
  <domain type='qemu' id='1'>

Scenario 3) Shutdown happens along QEMU monitor calls (ie, calls don't finish)

<...>

The XML was not updated, as expected:

$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10189 Apr 24 12:14 /run/libvirt/qemu/test-vm.xml

$ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' \
/run/libvirt/qemu/test-vm.xml
<domstatus state='running' reason='booted' pid='3726'>
  <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
  <domain type='qemu' id='1'>

<...>

Now, the next time libvirtd starts, it correctly parses that XML:

$ sudo systemctl start libvirtd.service

$ journalctl -b -u libvirtd.service | grep -A1 error
$

And libvirt is aware of the domain, and can manage it:

$ virsh list
 Id   Name      State
-------------------------
 1    test-vm   running

$ virsh destroy test-vm
Domain test-vm destroyed

$ virsh undefine test-vm
Domain test-vm has been undefined

--- Steps with test packages on Focal (shutdown-on-runtime) ---

<...>

Check the formatter/options again; it is *STILL* referenced, not 0x0 anymore:

(gdb) t 20
(gdb) p xmlopt.privateData.format
$3 = (virDomainXMLPrivateDataFormatFunc) 0x7fd08c3437c0 <qemuDomainObjPrivateXMLFormat>

(gdb) p/x xmlopt.parent
$4 = {u = {dummy_align1 = 0x1cafe0026, dummy_align2 = 0x1cafe0026, s = {magic = 0xcafe0026, refs = 0x1}}, klass = 0x7fd080043170}

Let the save function continue, and libvirt finishes shutting down:

<...>

Check the VM status XML *after*:

$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10251 Apr 24 12:28 /run/libvirt/qemu/test-vm.xml

$ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
<domstatus state='running' reason='booted' pid='4055'>
  <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
  <domain type='qemu' id='1'>

Now, the next time libvirtd starts, it correctly parses that XML:

$ sudo systemctl start libvirtd.service

$ journalctl -b -u libvirtd.service | grep -A1 error
$

And libvirt is aware of the domain, and can manage it:

$ virsh list
 Id   Name      State
-------------------------
 1    test-vm   running

$ virsh destroy test-vm
Domain test-vm destroyed

$ virsh undefine test-vm
Domain test-vm has been undefined

** Tags removed: verification-needed verification-needed-focal
** Tags added: verification-done
verification-done-focal

https://bugs.launchpad.net/bugs/2059272

Title:
  libvirt domain is not listed/managed after libvirt restart with messages "internal error: no monitor path" and "Failed to load config for domain"