Thanks Murilo, no test with the basic kernel is needed yet. But later, when we really SRU this, it will be good to do both an old-kernel and a HWE-kernel check. Let me add that to the verification steps (a sketch of it is below) ...
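A rough sketch of what those dual-kernel verification steps could look like (the guest name, nvme-disk.xml and the fio job are the ones from the reproduction steps further below; treating "old vs HWE" as GA 4.15 vs HWE 4.18+ is an assumption):

  # repeat the same run twice: once on the GA kernel, once on the HWE kernel
  host$ uname -r
  host$ virsh attach-device <domain> nvme-disk.xml --live
  guest$ fio --direct=1 --rw=randrw --refill_buffers --norandommap \
         --randrepeat=0 --ioengine=libaio --bs=4k --rwmixread=100 \
         --iodepth=16 --runtime=60 --name=job1 --filename=/dev/nvme0n1 \
         --numjobs=4
  # expectation: both kernels work; only the HWE kernel (which carries the
  # fix [1]) should show the large READ bandwidth improvement, the GA
  # kernel must merely not regress.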
** Description changed:

[Impact]

 * In the past qemu has generally not allowed MSI-X BAR mapping on VFIO.
   But there can be platforms (like ppc64 spapr) that can and want to do
   exactly that.

 * Backport two patches from upstream (in since qemu 2.12 / Disco).

 * Due to that there is a tremendous speedup, especially useful with page
   sizes bigger than 4k. This avoids the access being split into chunks
   and makes direct MMIO access possible for the guest.

[Test Case]

 * On ppc64 pass through an NVMe device to the guest and run I/O
   benchmarks, see below for details on how to set that up.
   Note: this needs the HWE kernel or another kernel fixup for [1].
   Note: the test should also be done with the non-HWE kernel; the
   expectation there is that it will not show the performance benefits,
   but still work fine.

[Regression Potential]

 * Changes:
   a) If the host driver allows mapping of MSI-X data, the entire BAR is
      mapped. This is only done if the kernel reports that capability [1],
      which ensures that qemu exposes the new behavior only on kernels
      able to support it (safe against regression in that regard).
   b) On ppc64 MSI-X emulation is disabled for VFIO devices; this is
      local to just this HW and will not affect other HW.

 Generally the regressions that come to mind are slight changes in
 behavior (real HW vs. the former emulation) that could cause trouble on
 some weird/old guests. But that is limited to PPC only, where only a
 small set of certified HW is really allowed.

 The mapping that might be added even on other platforms should not
 consume too much extra memory as long as it isn't used. Further, since
 it depends on the kernel capability, it isn't randomly issued on
 kernels where we expect it to fail.

 So while it is quite a change, it seems safe to me.

[Other Info]

 * I know, one could as well call that a "feature", but it really is a
   performance bug fix more than anything else. Also the SRU policy
   allows exploitation/toleration of new HW, especially for LTS
   releases. Therefore I think this is fine as SRU.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a32295c612c57990d17fb0f41e7134394b2f35f6
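A minimal sketch for checking, on a given host, whether the new mapping path can be expected to be active (this assumes the kernel capability from [1] first shipped in v4.16; the exact version boundary is an assumption):

  host$ kver=$(uname -r | cut -d- -f1)
  host$ dpkg --compare-versions "$kver" ge 4.16 \
          && echo "kernel should expose the MSI-X mappable capability [1] -> expect the speedup" \
          || echo "pre-4.16 kernel: qemu keeps the old chunked/emulated MSI-X path"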
== Comment: #0 - Murilo Opsfelder Araujo - 2019-10-11 14:16:14 ==

---Problem Description---
Back-port the following patches to Bionic QEMU to improve NVMe guest performance by more than 200%:

"vfio-pci: Allow mmap of MSIX BAR"
https://git.qemu.org/?p=qemu.git;a=commit;h=ae0215b2bb56a9d5321a185dde133bfdd306a4c0

"ppc/spapr, vfio: Turn off MSIX emulation for VFIO devices"
https://git.qemu.org/?p=qemu.git;a=commit;h=fcad0d2121976df4b422b4007a5eb7fcaac01134

---uname output---
na

---Additional Hardware Info---
0030:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller 172Xa/172Xb (rev 01)

Machine Type = AC922

---Debugger---
A debugger is not configured

---Steps to Reproduce---
Install or set up a guest image and boot it.

Once the guest is running, pass through the NVMe disk to the guest using the XML:

host$ cat nvme-disk.xml
<hostdev mode='subsystem' type='pci' managed='no'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0030' bus='0x01' slot='0x00' function='0x0'/>
  </source>
</hostdev>

host$ virsh attach-device <domain> nvme-disk.xml --live

On the guest, run fio benchmarks:

guest$ fio --direct=1 --rw=randrw --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=4k --rwmixread=100 --iodepth=16 --runtime=60 --name=job1 --filename=/dev/nvme0n1 --numjobs=4

Results are similar with numjobs=4 and numjobs=64, respectively:

   READ: bw=385MiB/s (404MB/s), 78.0MiB/s-115MiB/s (81.8MB/s-120MB/s), io=11.3GiB (12.1GB), run=30001-30001msec
   READ: bw=382MiB/s (400MB/s), 2684KiB/s-12.6MiB/s (2749kB/s-13.2MB/s), io=11.2GiB (12.0GB), run=30001-30009msec

With the two patches applied, performance improved significantly for the numjobs=4 and numjobs=64 cases, respectively:

   READ: bw=1191MiB/s (1249MB/s), 285MiB/s-309MiB/s (299MB/s-324MB/s), io=34.9GiB (37.5GB), run=30001-30001msec
   READ: bw=4273MiB/s (4481MB/s), 49.7MiB/s-113MiB/s (52.1MB/s-119MB/s), io=125GiB (134GB), run=30001-30005msec

Userspace tool common name: qemu
Userspace rpm: qemu

The userspace tool has the following bit modes: 64-bit

Userspace tool obtained from project website: na

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1847948

Title:
  Improve NVMe guest performance on Bionic QEMU

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1847948/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs