I built QEMU head from git:

$ export CFLAGS="-O0 -g"
$ ./configure --disable-user --disable-linux-user --disable-docs \
    --disable-guest-agent --disable-sdl --disable-gtk --disable-vnc --disable-xen \
    --disable-brlapi --enable-fdt --disable-bluez --disable-vde --disable-rbd \
    --disable-libiscsi --disable-libnfs --disable-libusb --disable-usb-redir \
    --disable-seccomp --disable-glusterfs --disable-tpm --disable-numa \
    --disable-slirp --disable-blobs --target-list=ppc64-softmmu
$ make -j

$ virsh nodedev-detach pci_0005_01_00_0 --driver vfio
$ virsh nodedev-detach pci_0005_01_00_1 --driver vfio
$ virsh nodedev-detach pci_0005_01_00_2 --driver vfio
$ virsh nodedev-detach pci_0005_01_00_3 --driver vfio
$ virsh nodedev-detach pci_0005_01_00_4 --driver vfio
$ virsh nodedev-detach pci_0005_01_00_5 --driver vfio

$ sudo ppc64-softmmu/qemu-system-ppc64 \
    -machine pseries-4.1,accel=kvm,usb=off,dump-guest-core=off,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken \
    -name guest=test-vfio-slowness -m 131072 -smp 1 -no-user-config \
    -device spapr-pci-host-bridge,index=1,id=pci.1 \
    -drive file=/var/lib/uvtool/libvirt/images/test-huge-mem-init.qcow,format=qcow2,if=none,id=drive-virtio-disk0 \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -device vfio-pci,host=0005:01:00.0,id=hostdev0,bus=pci.1.0,addr=0x1 \
    -device vfio-pci,host=0005:01:00.1,id=hostdev1,bus=pci.1.0,addr=0x2 \
    -device vfio-pci,host=0005:01:00.2,id=hostdev2,bus=pci.1.0,addr=0x3 \
    -device vfio-pci,host=0005:01:00.3,id=hostdev3,bus=pci.1.0,addr=0x4 \
    -device vfio-pci,host=0005:01:00.4,id=hostdev4,bus=pci.1.0,addr=0x5 \
    -device vfio-pci,host=0005:01:00.5,id=hostdev5,bus=pci.1.0,addr=0x6 \
    -msg timestamp=on -display curses


Tracing the syscalls, I found that the stall is in VFIO_IOMMU_SPAPR_REGISTER_MEMORY:
96783      0.000088 readlink("/sys/bus/pci/devices/0005:01:00.0/iommu_group", "../../../../kernel/iommu_groups/"..., 4096) = 33 <0.000022>
96783      0.000066 openat(AT_FDCWD, "/dev/vfio/8", O_RDWR|O_CLOEXEC) = 16 <0.000025>
96783      0.000059 ioctl(16, VFIO_GROUP_GET_STATUS, 0x7fffe3fd6e20) = 0 <0.000018>
96783      0.000050 openat(AT_FDCWD, "/dev/vfio/vfio", O_RDWR|O_CLOEXEC) = 17 <0.000014>
96783      0.000049 ioctl(17, VFIO_GET_API_VERSION, 0) = 0 <0.000008>
96783      0.000039 ioctl(17, VFIO_CHECK_EXTENSION, 0x3) = 0 <0.000011>
96783      0.000040 ioctl(17, VFIO_CHECK_EXTENSION, 0x1) = 0 <0.000008>
96783      0.000037 ioctl(17, VFIO_CHECK_EXTENSION, 0x7) = 1 <0.000008>
96783      0.000037 ioctl(16, VFIO_GROUP_SET_CONTAINER, 0x65690e1bb48) = 0 <0.000008>
96783      0.000037 ioctl(17, VFIO_SET_IOMMU, 0x7) = 0 <0.000039>
96783      0.000070 ioctl(17, VFIO_IOMMU_SPAPR_REGISTER_MEMORY <unfinished ...>
96785     10.019032 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) <10.022751>
96785      0.053520 madvise(0x7fabd1f60000, 8257536, MADV_DONTNEED) = 0 <0.000020>
96785      0.007283 exit(0)             = ?
96785      0.000072 +++ exited with 0 +++
96783    276.894553 <... ioctl resumed> , 0x7fffe3fd6b70) = 0 <286.974436>
96783      0.000107 --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL} ---
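
To pick the slow call out of a long trace like this without reading it by eye, a small script can sort the calls by their trailing <duration> field. This is only a sketch: it assumes each trace record sits on one line in the layout shown above (PID, relative timestamp, call, <duration>), and the name slowest_calls is made up for illustration.

```python
import re

# One trace record per line: PID, relative timestamp, call text, <duration>.
# Lines without a trailing numeric <duration> (e.g. "<unfinished ...>") are skipped.
LINE_RE = re.compile(r"^(\d+)\s+([\d.]+)\s+(.*?)\s*<(\d+\.\d+)>\s*$")

def slowest_calls(trace_text, top=3):
    """Return the `top` slowest calls as (duration_seconds, pid, call_text)."""
    calls = []
    for line in trace_text.splitlines():
        m = LINE_RE.match(line)
        if m:
            pid, _rel, call, duration = m.groups()
            calls.append((float(duration), int(pid), call))
    return sorted(calls, reverse=True)[:top]

# A few lines copied from the trace above.
sample = """\
96783      0.000037 ioctl(17, VFIO_SET_IOMMU, 0x7) = 0 <0.000039>
96783      0.000070 ioctl(17, VFIO_IOMMU_SPAPR_REGISTER_MEMORY <unfinished ...>
96785     10.019032 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) <10.022751>
96783    276.894553 <... ioctl resumed> , 0x7fffe3fd6b70) = 0 <286.974436>
"""

for duration, pid, call in slowest_calls(sample):
    print(f"{duration:12.6f}s  pid {pid}  {call}")
```

On the sample lines the resumed ioctl with 286.974436 seconds comes out on top, ahead of the futex timeout in the other thread.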


This is repeatable and explains why I haven't seen exactly the same thing here
as on x86: VFIO_IOMMU_SPAPR_REGISTER_MEMORY is ppc-specific (but shows the same
long-hang behavior).

At least for ppc, the kernel documentation already says:
    - VFIO_IOMMU_SPAPR_REGISTER_MEMORY/VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY ioctls
      receive a user space address and size of the block to be pinned.
      Bisecting is not supported and VFIO_IOMMU_UNREGISTER_MEMORY is expected to
      be called with the exact address and size used for registering
      the memory block. The userspace is not expected to call these often.
      The ranges are stored in a linked list in a VFIO container.
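
To make the "exact address and size" rule concrete, here is a toy model of such a list of registered ranges. It is not kernel code: ToyContainer and its methods are hypothetical names, the address is an arbitrary example value, and how the real kernel reports a mismatch is not modeled. It only illustrates that a range registered as one block cannot be released piecewise.

```python
class ToyContainer:
    """Toy stand-in for a container's list of preregistered memory ranges."""

    def __init__(self):
        # (vaddr, size) tuples, kept in registration order.
        self.ranges = []

    def register(self, vaddr, size):
        self.ranges.append((vaddr, size))

    def unregister(self, vaddr, size):
        # Only an exact (vaddr, size) match is accepted; a sub-range of a
        # registered block is refused ("bisecting is not supported").
        if (vaddr, size) not in self.ranges:
            return False
        self.ranges.remove((vaddr, size))
        return True

GiB = 1 << 30
container = ToyContainer()
container.register(0x7000_0000_0000, 128 * GiB)          # example address
print(container.unregister(0x7000_0000_0000, 64 * GiB))   # half of the block
print(container.unregister(0x7000_0000_0000, 128 * GiB))  # the exact block
```

This prints False for the half-sized request and True for the exact one.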

The size seems to cover all of memory, as I see in the code:
    reg.vaddr = (uintptr_t) vfio_prereg_gpa_to_vaddr(section, gpa);
    reg.size = end - gpa;

And GDB confirms that it is ALL of the guest's memory (which explains the
scaling with memory size):
78          ret = ioctl(container->fd, VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
(gdb) p reg.size/1024/1024
$3 = 131072
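
Putting the gdb value next to the ioctl duration from the trace gives a rough cost per GiB. The sketch below only does that arithmetic; the per-page figure additionally assumes 64 KiB host pages, which the report does not state, and the helper names are made up for illustration.

```python
MIB_PER_GIB = 1024

size_mib = 131072            # reg.size/1024/1024 as printed by gdb
ioctl_seconds = 286.974436   # duration of the REGISTER_MEMORY ioctl in the trace

def seconds_per_gib(size_mib, seconds):
    """Average time spent per GiB of registered memory."""
    return seconds / (size_mib / MIB_PER_GIB)

def pages_in(size_mib, page_kib):
    """Number of pages of page_kib KiB contained in size_mib MiB."""
    return size_mib * 1024 // page_kib

rate = seconds_per_gib(size_mib, ioctl_seconds)
print(f"{size_mib // MIB_PER_GIB} GiB registered, about {rate:.2f} s per GiB")

# ASSUMPTION: 64 KiB host pages; the report does not state the page size.
pages = pages_in(size_mib, page_kib=64)
print(f"{pages} pages, about {ioctl_seconds / pages * 1e6:.0f} us per page")
```

So the single call registers 128 GiB at a little over two seconds per GiB on this machine.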

The "bisecting is not supported" restriction is the problem here.
The work might be splittable inside the kernel, but QEMU can't do much about
it, as it has to register a single range.

Now that we have this confirmed, let's look for the same thing on x86.

https://bugs.launchpad.net/bugs/1838575

Title:
  passthrough devices cause >17min boot delay
