Hi

Stefan Hajnoczi <stefa...@gmail.com> writes:

> [ Unknown signature status ]
> On Thu, Aug 11, 2016 at 09:18:12AM +0200, Gaudenz Steinlin wrote:
>>
>> [ Please CC me on replies as I'm not subscribed to this list. ]
>>
>> Hi
>>
>> The Fix for CVE-2016-5403 (virtio: error out if guest exceeds virtqueue
>> size)[1] causes qemu to exit(1) after migration or restart from a saved
>> state if memory statistics are enabled in libvirt. Qemu exits after
>> printing "qemu-system-x86_64: Virtqueue size exceeded".
>>
>> I experienced this problem with the latest security update in Ubuntu
>> Trusty (14.04) which cherry-picked this fix. If you think that the
>> latest upstream version is not affected I can try this too. I only
>> tested with VM started through libvirt. If someone tells me how to
>> enable memory statistics with plain qemu without libvirt I can test this
>> too. My guess would be that this does not make a difference.
>>
>> I discovered this bug because OpenStack Nova enables memory statistics
>> by default since the Juno release. After the QEMU upgrade to the latest
>> version in Ubuntu VMs were suddenly shutoff after migration.
>>
>> Steps to reproduce:
>> 1. Create a VM with libvirt which contains a memory balloon device
>>    defined like this:
>>    <memballoon model='virtio'>
>>      <address type='pci' domain='0x0000' bus='0x00' slot='0x02'
>>      function='0x0'/>
>>      <stats period='10'/>
>>    </memballoon>
>>
>> 2. Start the VM and let the Linux kernel boot (bug does not appear if
>>    the kernel is not yet booted, eg. while in the PXE boot phase)
>> 3. Issue a managedsave
>> 4. Start the VM again
>> 5. The VM is restored and "crashes" right after it starts running again.
>> 6. You can find the qemu output "qemu-system-x86_64: Virtqueue size
>>    exceeded" in the log at /var/log/libvirt/vmname.log
>
> I couldn't reproduce this with qemu.git/master (28b874429ba) and a RHEL
> 7.2 guest.
>
> Which guest distro and kernel version are you using?

I just retested and ran into the bug with the following guest OSes:

- Ubuntu 16.04 (Linux ubuntu-1604 4.4.0-24-generic #43-Ubuntu SMP Wed Jun
  8 19:27:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux)
- Ubuntu 14.04 (Linux ubuntu-1404 3.13.0-88-generic #135-Ubuntu SMP Wed
  Jun 8 21:10:42 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux)
- Debian 8.5 (Linux debian 3.16.0-4-amd64 #1 SMP Debian
  3.16.7-ckt25-2+deb8u3 (2016-07-02) x86_64 GNU/Linux)
- CentOS 7 (Linux centos 3.10.0-327.18.2.el7.x86_64 #1 SMP Thu May 12
  11:03:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux)
- Arch 16.07 (Linux arch 4.6.4-1-ARCH #1 SMP PREEMPT Mon Jul 11 19:12:32
  CEST 2016 x86_64 GNU/Linux)
- CoreOS 1010.5.0 (Linux coreos.openstacklocal 4.5.0-coreos-r1 #2 SMP Thu
  May 26 22:21:06 UTC 2016 x86_64 Intel Xeon E312xx (Sandy Bridge)
  GenuineIntel GNU/Linux)

So for me it's reproducible with a wide range of Linux OSes and kernel
versions. I used the Ubuntu packaged qemu version 2.0.0+dfsg-2ubuntu1.26.
Version 2.0.0+dfsg-2ubuntu1.26 with the fix for CVE-2016-5403 reverted
does not have the bug. So it seems quite clear that at least backporting
this fix to 2.0.0 is not safe. If I can get the latest master to compile
I will try this too.

> Are you doing anything that might cause virtio-balloon activity?

How can I check that? I do nothing out of the ordinary, and the problem
is present as soon as the guest OS is fully booted, while it is otherwise
completely idle.

Gaudenz
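
For anyone who wants to rerun this, steps 3 to 6 of the reproduction
steps quoted above could be scripted roughly as follows. This is only a
sketch under assumptions: `virsh` is on the PATH, the domain name passed
in is a placeholder, the log path follows the one given in the report,
and the 15 second wait is an arbitrary guess at how long the restored
guest needs before qemu exits.

```python
import subprocess
import time
from pathlib import Path

# Error message quoted in the report.
ERROR_TEXT = "Virtqueue size exceeded"


def repro_commands(domain):
    """Return the virsh commands for steps 3 and 4 of the report."""
    return [
        ["virsh", "managedsave", domain],  # step 3: save state, stop the VM
        ["virsh", "start", domain],        # step 4: restore from the save
    ]


def log_shows_bug(log_text):
    """True if the qemu log text contains the error from the report."""
    return ERROR_TEXT in log_text


def reproduce(domain, log_dir="/var/log/libvirt", wait_seconds=15):
    """Run steps 3-6; log_dir and wait_seconds are assumptions."""
    for cmd in repro_commands(domain):
        subprocess.run(cmd, check=True)
    # Give the restored guest a moment to run (and qemu to exit, if it will).
    time.sleep(wait_seconds)
    log_path = Path(log_dir) / f"{domain}.log"
    return log_shows_bug(log_path.read_text(errors="replace"))
```

Calling `reproduce("vmname")` on the host returns True when the log
contains the error. An old occurrence of the message in the same log
file would also match, so the log should be checked or rotated first.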