Verification done on focal-proposed, following comments 23, 24, 25, 26.

This comment includes a few key snippets from each test/comment.

---
Environment
---

LXD virtual machine

 lxc launch --vm ubuntu:focal lp2059272-focal
 lxc exec lp2059272-focal -- su - ubuntu

Enable proposed & debug symbols

        cat <<EOF | sudo tee /etc/apt/sources.list.d/proposed.list
        deb http://archive.ubuntu.com/ubuntu focal-proposed main universe
        deb http://ddebs.ubuntu.com focal-proposed main universe
        EOF

        cat <<EOF | sudo tee /etc/apt/preferences.d/proposed
        Package: *
        Pin: release a=focal-proposed
        Pin-Priority: 400
        EOF

        sudo apt install --yes --no-install-recommends gdb qemu-system-x86 ubuntu-dbgsym-keyring
        sudo apt update
        sudo apt install --yes --no-install-recommends -t focal-proposed libvirt{0,-daemon{,-driver-qemu,-system}}{,-dbgsym} libvirt-clients

        $ apt-cache policy libvirt-daemon-driver-qemu
        libvirt-daemon-driver-qemu:
          Installed: 6.0.0-0ubuntu8.20
          Candidate: 6.0.0-0ubuntu8.20
          Version table:
         *** 6.0.0-0ubuntu8.20 400
                400 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 Packages
                100 /var/lib/dpkg/status
             6.0.0-0ubuntu8.19 500
                500 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
                500 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages
             6.0.0-0ubuntu8 500
                500 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages
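
The installed-version check above could be scripted along these lines. This is only a sketch: `installed_version` is a hypothetical helper (not part of apt), and it assumes the `apt-cache policy` output layout shown above.

```shell
#!/bin/sh
# installed_version (hypothetical helper): read `apt-cache policy <pkg>`
# output on stdin and print the value of its "Installed:" field.
installed_version() {
    awk '$1 == "Installed:" { print $2; exit }'
}

# Sample input copied from the output above.
sample='libvirt-daemon-driver-qemu:
  Installed: 6.0.0-0ubuntu8.20
  Candidate: 6.0.0-0ubuntu8.20'

printf '%s\n' "$sample" | installed_version
```

On a real system the input would come from `apt-cache policy libvirt-daemon-driver-qemu | installed_version`.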

        newgrp libvirt # or logout/login

Libvirtd debug logging

        cat <<-EOF | sudo tee -a /etc/libvirt/libvirtd.conf
        log_filters="1:qemu 1:libvirt"
        log_outputs="3:syslog:libvirtd 1:file:/var/log/libvirt/libvirtd-debug.log"
        EOF

---
Steps with test packages on Focal (normal restarts)
---

        <...>
        for SLEEP in $(seq 0.1 0.1 2.0); do
        <...>
        
All VMs are still managed by libvirt:

        $ virsh list
         Id   Name         State
        ----------------------------
         1    test-vm-1    running
         2    test-vm-2    running
         3    test-vm-3    running
         4    test-vm-4    running
         5    test-vm-5    running
         6    test-vm-6    running
         7    test-vm-7    running
         8    test-vm-8    running
         9    test-vm-9    running
         10   test-vm-10   running
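
A check like the one above ("all VMs are still managed") could be automated by counting the running rows. This is a rough sketch: `count_running` is a hypothetical helper, it assumes the two-line header that `virsh list` prints above, and the `paused` row in the sample is made up purely to show the filtering.

```shell
#!/bin/sh
# count_running (hypothetical helper): read `virsh list` output on stdin
# and print how many rows have "running" as their last field.
# The first two lines (column header and dashes) are skipped.
count_running() {
    awk 'NR > 2 && $NF == "running" { n++ } END { print n + 0 }'
}

# Sample input: two rows from the listing above plus one illustrative
# non-running row.
count_running <<'EOF'
 Id   Name         State
----------------------------
 1    test-vm-1    running
 2    test-vm-2    running
 3    test-vm-3    paused
EOF
```

In the test above the expected count would be 10, e.g. `[ "$(virsh list | count_running)" -eq 10 ]`.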


---
Steps with test packages on Focal (shutdown-on-init)
---

Scenario 1) Shutdown wins race against XML update (i.e., shutdown happens first)

<...>

Now let the qemuProcessReconnect thread continue. It will not update the XML
file, because 'quit' is set (i.e., shutdown is in progress):

        (gdb) t 20
        (gdb) p ((virNetDaemonPtr)anyobj)->quit
        $2 = true

        $ ls -l /run/libvirt/qemu/test-vm.xml
        -rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml

        (gdb) c &

        $ ls -l /run/libvirt/qemu/test-vm.xml
        -rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml

        <...>
        
        $ sudo grep 'Leaving the update of .* domain status XML' /var/log/libvirt/libvirtd-debug.log
        2024-04-24 12:08:40.054+0000: 3770: info : qemuProcessReconnect:8157 : Leaving the update of 'test-vm' domain status XML for the next initialization (shutdown detected on this initialization).

        <...>
        
        $ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
        <domstatus state='running' reason='booted' pid='3726'>
          <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
          <domain type='qemu' id='1'>
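
The before/after `ls -l` comparison above only has minute granularity. A finer-grained check could compare the modification time in seconds, as in the sketch below. `mtime_of` and `unchanged_since` are hypothetical helper names, and GNU `stat`/`touch` (as shipped on Ubuntu) are assumed.

```shell
#!/bin/sh
# mtime_of FILE (hypothetical helper): print FILE's modification time in
# seconds since the epoch (GNU stat).
mtime_of() {
    stat -c %Y -- "$1"
}

# unchanged_since FILE MTIME (hypothetical helper): succeed only if
# FILE's current mtime still equals the previously recorded MTIME.
unchanged_since() {
    [ "$(mtime_of "$1")" = "$2" ]
}

# Demonstration on a temporary file rather than the real status XML.
tmp=$(mktemp)
before=$(mtime_of "$tmp")
unchanged_since "$tmp" "$before" && echo "unchanged"
touch -d '2001-01-01 00:00:00' "$tmp"    # force a different mtime
unchanged_since "$tmp" "$before" || echo "changed"
rm -f "$tmp"
```

Against the real file, which is readable only by root, the calls would need `sudo`, recording the value before `(gdb) c &` and checking it afterwards.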

Scenario 2) Shutdown loses race against XML update (i.e., update happens first)

<...>

Instead, let the qemuProcessReconnect thread take the lock and update
the XML file, but not unlock yet:

        <...>
        
        $ ls -l /run/libvirt/qemu/test-vm.xml
        -rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml

        (gdb) b virObjectUnlock thread 20 if anyobj == $ptr
        (gdb) c
        
        $ ls -l /run/libvirt/qemu/test-vm.xml
        -rw------- 1 root root 10189 Apr 24 12:14 /run/libvirt/qemu/test-vm.xml

        <...>

        $ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
        <domstatus state='running' reason='booted' pid='3726'>
          <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
          <domain type='qemu' id='1'>
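
Since the bug's symptom is "internal error: no monitor path", the grep above could be reduced to a pass/fail check on the monitor element. A minimal sketch follows, assuming the status XML is formatted as shown above (single-quoted attributes); `has_monitor_path` is a hypothetical helper name.

```shell
#!/bin/sh
# has_monitor_path FILE (hypothetical helper): succeed if FILE contains a
# <monitor> element with a non-empty, single-quoted path attribute.
has_monitor_path() {
    grep -q "<monitor path='[^'][^']*'" "$1"
}

# Demonstration on a temporary copy of the lines shown above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<domstatus state='running' reason='booted' pid='3726'>
  <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
  <domain type='qemu' id='1'>
EOF
if has_monitor_path "$tmp"; then
    echo "monitor path present"
else
    echo "monitor path MISSING"
fi
rm -f "$tmp"
```

The real file under /run/libvirt/qemu/ is readable only by root, so the check would have to run under `sudo` there.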

Scenario 3) Shutdown happens during QEMU monitor calls (i.e., calls don't
finish)

<...>

The XML was not updated, as expected:

        $ ls -l /run/libvirt/qemu/test-vm.xml
        -rw------- 1 root root 10189 Apr 24 12:14 /run/libvirt/qemu/test-vm.xml

        $ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
        <domstatus state='running' reason='booted' pid='3726'>
          <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
          <domain type='qemu' id='1'>
<...>

Now, the next time libvirtd starts, it correctly parses that XML:

         $ sudo systemctl start libvirtd.service

         $ journalctl -b -u libvirtd.service | grep -A1 error
         $
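
The journal check above greps for any "error". A narrower variant could look only for the two messages named in the bug title, as sketched below. `has_load_errors` is a hypothetical helper name, and the sample lines are illustrative input, not real journal output.

```shell
#!/bin/sh
# has_load_errors (hypothetical helper): read log text on stdin and
# succeed if it contains either message from the bug title.
has_load_errors() {
    grep -q -e 'internal error: no monitor path' \
            -e 'Failed to load config for domain'
}

# Demonstration with illustrative input lines.
printf '%s\n' 'internal error: no monitor path' | has_load_errors \
    && echo "load error found"
printf '%s\n' 'some unrelated line' | has_load_errors \
    || echo "no load errors"
```

With the fixed packages, `journalctl -b -u libvirtd.service | has_load_errors` would be expected to fail (no match), consistent with the empty grep output above.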
         
And libvirt is aware of the domain, and can manage it:

        $ virsh list
         Id   Name      State
        -------------------------
         1    test-vm   running

        $ virsh destroy test-vm
        Domain test-vm destroyed

        $ virsh undefine test-vm
        Domain test-vm has been undefined

---
Steps with test packages on Focal (shutdown-on-runtime)
---

<...>
Check the formatter/options again; it is *STILL* referenced, not 0x0 anymore:

        (gdb) t 20
        (gdb) p xmlopt.privateData.format
        $3 = (virDomainXMLPrivateDataFormatFunc) 0x7fd08c3437c0 <qemuDomainObjPrivateXMLFormat>
        (gdb) p/x xmlopt.parent
        $4 = {u = {dummy_align1 = 0x1cafe0026, dummy_align2 = 0x1cafe0026, s = {magic = 0xcafe0026, refs = 0x1}}, klass = 0x7fd080043170}

Let the save function continue, and libvirt finishes shutting down:
<...>
Check the VM status XML *after*:

        $ ls -l /run/libvirt/qemu/test-vm.xml
        -rw------- 1 root root 10251 Apr 24 12:28 /run/libvirt/qemu/test-vm.xml

        $ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
        <domstatus state='running' reason='booted' pid='4055'>
          <monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
          <domain type='qemu' id='1'>

Now, the next time libvirtd starts, it correctly parses that XML:

        $ sudo systemctl start libvirtd.service

        $ journalctl -b -u libvirtd.service | grep -A1 error
        $

And libvirt is aware of the domain, and can manage it:

        $ virsh list
         Id   Name      State
        -------------------------
         1    test-vm   running

        $ virsh destroy test-vm
        Domain test-vm destroyed

        $ virsh undefine test-vm
        Domain test-vm has been undefined

** Tags removed: verification-needed verification-needed-focal
** Tags added: verification-done verification-done-focal

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2059272

Title:
  libvirt domain is not listed/managed after libvirt restart with
  messages "internal error: no monitor path" and "Failed to load config
  for domain"


