On Mon, 27 Feb 2017 15:09:30 +0100 Christian Borntraeger <[email protected]> wrote:
> Paolo,
>
> commit 97cd965c070152bc626c7507df9fb356bbe1cd81
> "virtio: use VRingMemoryRegionCaches for avail and used rings"
> does cause a segfault on my s390 system when I use num-queues.
>
> gdb --args qemu-system-s390x -nographic -enable-kvm -m 1G -drive
> file=/var/lib/libvirt/qemu/image.zhyp137,if=none,id=d1 -device
> virtio-blk-ccw,drive=d1,iothread=io1,num-queues=2 -object iothread,id=io1
> (...)
> (gdb) bt
> #0  0x0000000001024a26 in address_space_translate_cached (cache=0x38, addr=2,
>     xlat=0x3ffe587bff8, plen=0x3ffe587bff0, is_write=false)
>     at /home/cborntra/REPOS/qemu/exec.c:3187
> #1  0x0000000001025596 in address_space_lduw_internal_cached (cache=0x38,
>     addr=2, attrs=..., result=0x0, endian=DEVICE_BIG_ENDIAN)
>     at /home/cborntra/REPOS/qemu/memory_ldst.inc.c:264
> #2  0x0000000001025846 in address_space_lduw_be_cached (cache=0x38, addr=2,
>     attrs=..., result=0x0) at /home/cborntra/REPOS/qemu/memory_ldst.inc.c:322
> #3  0x000000000102597e in lduw_be_phys_cached (cache=0x38, addr=2)
>     at /home/cborntra/REPOS/qemu/memory_ldst.inc.c:340
> #4  0x0000000001114856 in virtio_lduw_phys_cached (vdev=0x1c57cd0, cache=0x38,
>     pa=2) at /home/cborntra/REPOS/qemu/include/hw/virtio/virtio-access.h:164
> #5  0x000000000111523c in vring_avail_idx (vq=0x3fffde1e090)
>     at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:201
> #6  0x0000000001115bba in virtio_queue_empty (vq=0x3fffde1e090)
>     at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:332
> #7  0x000000000111c312 in virtio_queue_host_notifier_aio_poll
>     (opaque=0x3fffde1e0f8) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:2294
> #8  0x000000000147a036 in run_poll_handlers_once (ctx=0x1bb8bb0)
>     at /home/cborntra/REPOS/qemu/util/aio-posix.c:490
> #9  0x000000000147a2fe in try_poll_mode (ctx=0x1bb8bb0, blocking=true)
>     at /home/cborntra/REPOS/qemu/util/aio-posix.c:566
> #10 0x000000000147a3ca in aio_poll (ctx=0x1bb8bb0, blocking=true)
>     at /home/cborntra/REPOS/qemu/util/aio-posix.c:595
> #11 0x00000000011a0176 in iothread_run (opaque=0x1bb86c0)
>     at /home/cborntra/REPOS/qemu/iothread.c:59
> #12 0x000003ffe9087bc4 in start_thread () at /lib64/libpthread.so.0
> #13 0x000003ffe8f8a9f2 in thread_start () at /lib64/libc.so.6
>
> It seems to make a difference if it's the boot disk or not. Maybe the reset
> of the devices that the bootloader does before handing over control to Linux
> creates some trouble here.

I can reproduce this (the root cause seems to be that the bootloader only
sets up the first queue, but the dataplane code wants to handle both queues);
this particular problem is fixed by https://patchwork.ozlabs.org/patch/731445/
but then I hit a similar problem later:

0x0000000010019b46 in address_space_translate_cached (cache=0x60, addr=0,
    xlat=0x3fffcb7e420, plen=0x3fffcb7e418, is_write=false)
    at /root/git/qemu/exec.c:3187
3187        assert(addr < cache->len && *plen <= cache->len - addr);
(...)
(gdb) bt
#0  0x0000000010019b46 in address_space_translate_cached (cache=0x60, addr=0,
    xlat=0x3fffcb7e420, plen=0x3fffcb7e418, is_write=false)
    at /root/git/qemu/exec.c:3187
#1  0x000000001001a5fe in address_space_lduw_internal_cached (cache=0x60,
    addr=0, attrs=..., result=0x0, endian=DEVICE_BIG_ENDIAN)
    at /root/git/qemu/memory_ldst.inc.c:264
#2  0x000000001001a88e in address_space_lduw_be_cached (cache=0x60, addr=0,
    attrs=..., result=0x0) at /root/git/qemu/memory_ldst.inc.c:322
#3  0x000000001001a9c6 in lduw_be_phys_cached (cache=0x60, addr=0)
    at /root/git/qemu/memory_ldst.inc.c:340
#4  0x00000000100fa876 in virtio_lduw_phys_cached (vdev=0x10bc2ce0, cache=0x60,
    pa=0) at /root/git/qemu/include/hw/virtio/virtio-access.h:164
#5  0x00000000100fb536 in vring_used_flags_set_bit (vq=0x3fffdebc090, mask=1)
    at /root/git/qemu/hw/virtio/virtio.c:255
#6  0x00000000100fb7fa in virtio_queue_set_notification (vq=0x3fffdebc090,
    enable=0) at /root/git/qemu/hw/virtio/virtio.c:297
#7  0x0000000010101d22 in virtio_queue_host_notifier_aio_poll_begin
    (n=0x3fffdebc0f8) at /root/git/qemu/hw/virtio/virtio.c:2285
#8  0x00000000103f4164 in poll_set_started (ctx=0x10ae8230, started=true)
    at /root/git/qemu/util/aio-posix.c:338
#9  0x00000000103f4d5a in try_poll_mode (ctx=0x10ae8230, blocking=true)
    at /root/git/qemu/util/aio-posix.c:553
#10 0x00000000103f4e56 in aio_poll (ctx=0x10ae8230, blocking=true)
    at /root/git/qemu/util/aio-posix.c:595
#11 0x000000001017ea36 in iothread_run (opaque=0x10ae7d40)
    at /root/git/qemu/iothread.c:59
#12 0x000003fffd6084c6 in start_thread () from /lib64/libpthread.so.0
#13 0x000003fffd502ec2 in thread_start () from /lib64/libc.so.6

I think we may be missing guards for not-yet-setup queues in other places;
maybe we can centralize this instead of playing whack-a-mole?
