> -----Original Message-----
> From: Qemu-devel <qemu-devel-bounces+thanos.makatos=nutanix....@nongnu.org> On Behalf Of Thanos Makatos
> Sent: 19 July 2021 19:02
> To: Peter Xu <pet...@redhat.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>; John Levon <john.le...@nutanix.com>; John G Johnson <john.g.john...@oracle.com>; Markus Armbruster <arm...@redhat.com>; QEMU Devel Mailing List <qemu-devel@nongnu.org>
> Subject: Re: Question on memory commit during MR finalize()
>
> Omg I don't know how I missed that, of course I'll ignore SIGUSR1 and retest!
>
> ________________________________________
> From: Peter Xu <mailto:pet...@redhat.com>
> Sent: Monday, 19 July 2021, 16:58
> To: Thanos Makatos
> Cc: Paolo Bonzini; Markus Armbruster; QEMU Devel Mailing List; John Levon; John G Johnson
> Subject: Re: Question on memory commit during MR finalize()
>
> Hi, Thanos,
>
> On Mon, Jul 19, 2021 at 02:38:52PM +0000, Thanos Makatos wrote:
> > I can trivially trigger an assertion with a build where I merged the recent vfio-user patches (https://urldefense.proofpoint.com/v2/url?u=https-3A__patchew.org_QEMU_cover.1626675354.git.elena.ufimtseva-40oracle.com_&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=XTpYsh5Ps2zJvtw6ogtti46atk736SI4vgsJiUKIyDE&m=LvALaULnrxZWlgXFcaxGAl95UIwq3a6LI8OnG_5r4XY&s=moFPVchYp27xozQcvvxG4nb4nC2QmMnqQ1Wmt4Z3dNE&e= ) to master and then merging the result into your xzpeter/memory-sanity branch, I've pushed the branch here: https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_tmakatos_qemu_tree_memory-2Dsanity&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=XTpYsh5Ps2zJvtw6ogtti46atk736SI4vgsJiUKIyDE&m=LvALaULnrxZWlgXFcaxGAl95UIwq3a6LI8OnG_5r4XY&s=veyjdkkFkGSYNDZOuksB-kbHmdQaw9RYxyZp8Qo7nW4&e= . I explain the repro steps below in case you want to take a look:
> >
> > Build as follows:
> >
> > ./configure --prefix=/opt/qemu-xzpeter --target-list=x86_64-softmmu --enable-kvm --enable-debug --enable-multiprocess && make -j `nproc` && make install
> >
> > Then build and run the GPIO sample from libvfio-user (https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_nutanix_libvfio-2Duser&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=XTpYsh5Ps2zJvtw6ogtti46atk736SI4vgsJiUKIyDE&m=LvALaULnrxZWlgXFcaxGAl95UIwq3a6LI8OnG_5r4XY&s=HYP5NmDMGuS13pdyV83x3HzyhGbE-oP1T8NLtu0d1U8&e= ):
> >
> > libvfio-user/build/dbg/samples/gpio-pci-idio-16 -v /var/run/vfio-user.sock
> >
> > And then run QEMU as follows:
> >
> > gdb --args /opt/qemu-xzpeter/bin/qemu-system-x86_64 -cpu host -enable-kvm -smp 4 -m 2G -object memory-backend-file,id=mem0,size=2G,mem-path=/dev/hugepages,share=on,prealloc=yes -numa node,memdev=mem0 -kernel bionic-server-cloudimg-amd64-vmlinuz-generic -initrd bionic-server-cloudimg-amd64-initrd-generic -append 'console=ttyS0 root=/dev/sda1 single' -hda bionic-server-cloudimg-amd64-0.raw -nic user,model=virtio-net-pci -machine pc-q35-3.1 -device vfio-user-pci,socket=/var/run/vfio-user.sock -nographic
> >
> > I immediately get the following stack trace:
> >
> > Thread 5 "qemu-system-x86" received signal SIGUSR1, User defined signal 1.
>
> This is SIGUSR1. QEMU uses it for general vcpu ipis.
>
> > [Switching to Thread 0x7fffe6e82700 (LWP 151973)]
> > __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
> > 103 ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
> > (gdb) bt
> > #0 0x00007ffff655d29c in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
> > #1 0x00007ffff6558642 in __pthread_mutex_cond_lock (mutex=mutex@entry=0x5555568bb280 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80
> > #2 0x00007ffff6559ef8 in __pthread_cond_wait_common (abstime=0x0, mutex=0x5555568bb280 <qemu_global_mutex>, cond=0x555556cecc30) at pthread_cond_wait.c:645
> > #3 0x00007ffff6559ef8 in __pthread_cond_wait (cond=0x555556cecc30, mutex=0x5555568bb280 <qemu_global_mutex>) at pthread_cond_wait.c:655
> > #4 0x000055555604f717 in qemu_cond_wait_impl (cond=0x555556cecc30, mutex=0x5555568bb280 <qemu_global_mutex>, file=0x5555561ca869 "../softmmu/cpus.c", line=514) at ../util/qemu-thread-posix.c:194
> > #5 0x0000555555d28a4a in qemu_cond_wait_iothread (cond=0x555556cecc30) at ../softmmu/cpus.c:514
> > #6 0x0000555555d28781 in qemu_wait_io_event (cpu=0x555556ce02c0) at ../softmmu/cpus.c:425
> > #7 0x0000555555e5da75 in kvm_vcpu_thread_fn (arg=0x555556ce02c0) at ../accel/kvm/kvm-accel-ops.c:54
> > #8 0x000055555604feed in qemu_thread_start (args=0x555556cecc70) at ../util/qemu-thread-posix.c:541
> > #9 0x00007ffff6553fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
> > #10 0x00007ffff64824cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Would you please add below to your ~/.gdbinit script?
>
> handle SIGUSR1 nostop noprint
>
> Or just run without gdb and wait it to crash with SIGABRT.
>
> Thanks,
>
> --
> Peter Xu
Sorry for the bad brain day, here's your stack trace:

qemu-system-x86_64: ../softmmu/cpus.c:72: qemu_mutex_unlock_iothread_prepare: Assertion `!memory_region_has_pending_transaction()' failed.

Thread 1 "qemu-system-x86" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007ffff63c07bb in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff63ab535 in __GI_abort () at abort.c:79
#2 0x00007ffff63ab40f in __assert_fail_base (fmt=0x7ffff650dee0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5555561ca880 "!memory_region_has_pending_transaction()", file=0x5555561ca869 "../softmmu/cpus.c", line=72, function=<optimized out>) at assert.c:92
#3 0x00007ffff63b9102 in __GI___assert_fail (assertion=0x5555561ca880 "!memory_region_has_pending_transaction()", file=0x5555561ca869 "../softmmu/cpus.c", line=72, function=0x5555561caa60 <__PRETTY_FUNCTION__.37393> "qemu_mutex_unlock_iothread_prepare") at assert.c:101
#4 0x0000555555d27c20 in qemu_mutex_unlock_iothread_prepare () at ../softmmu/cpus.c:72
#5 0x0000555555d289f6 in qemu_mutex_unlock_iothread () at ../softmmu/cpus.c:507
#6 0x0000555555dcb3d6 in vfio_user_send_recv (proxy=0x555557ac5560, msg=0x555557933d50, fds=0x0, rsize=40) at ../hw/vfio/user.c:88
#7 0x0000555555dcd30a in vfio_user_dma_unmap (proxy=0x555557ac5560, unmap=0x7fffffffd8d0, bitmap=0x0) at ../hw/vfio/user.c:796
#8 0x0000555555dabf5f in vfio_dma_unmap (container=0x555557a06fb0, iova=786432, size=2146697216, iotlb=0x0) at ../hw/vfio/common.c:501
#9 0x0000555555dae12c in vfio_listener_region_del (listener=0x555557a06fc0, section=0x7fffffffd9f0) at ../hw/vfio/common.c:1249
#10 0x0000555555d3d06d in address_space_update_topology_pass (as=0x5555568bbc80 <address_space_memory>, old_view=0x555556d6cfb0, new_view=0x555556d6c8b0, adding=false) at ../softmmu/memory.c:960
#11 0x0000555555d3d62c in address_space_set_flatview (as=0x5555568bbc80 <address_space_memory>) at ../softmmu/memory.c:1062
#12 0x0000555555d3d800 in memory_region_transaction_commit () at ../softmmu/memory.c:1124
#13 0x0000555555b75a3e in mch_update_pam (mch=0x555556e80a40) at ../hw/pci-host/q35.c:344
#14 0x0000555555b76068 in mch_update (mch=0x555556e80a40) at ../hw/pci-host/q35.c:504
#15 0x0000555555b761d7 in mch_reset (qdev=0x555556e80a40) at ../hw/pci-host/q35.c:561
#16 0x0000555555e93a95 in device_transitional_reset (obj=0x555556e80a40) at ../hw/core/qdev.c:1028
#17 0x0000555555e956f8 in resettable_phase_hold (obj=0x555556e80a40, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:182
#18 0x0000555555e8e07c in bus_reset_child_foreach (obj=0x555556ebce80, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/bus.c:97
#19 0x0000555555e953ff in resettable_child_foreach (rc=0x555556a07ab0, obj=0x555556ebce80, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:96
#20 0x0000555555e9567e in resettable_phase_hold (obj=0x555556ebce80, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:173
#21 0x0000555555e920e0 in device_reset_child_foreach (obj=0x555556e802d0, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/qdev.c:366
#22 0x0000555555e953ff in resettable_child_foreach (rc=0x555556ad2830, obj=0x555556e802d0, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:96
#23 0x0000555555e9567e in resettable_phase_hold (obj=0x555556e802d0, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:173
#24 0x0000555555e8e07c in bus_reset_child_foreach (obj=0x555556beaac0, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/bus.c:97
#25 0x0000555555e953ff in resettable_child_foreach (rc=0x555556b1ca70, obj=0x555556beaac0, cb=0x555555e955ca <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:96
#26 0x0000555555e9567e in resettable_phase_hold (obj=0x555556beaac0, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:173
#27 0x0000555555e952b4 in resettable_assert_reset (obj=0x555556beaac0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:60
#28 0x0000555555e951f8 in resettable_reset (obj=0x555556beaac0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:45
#29 0x0000555555e95a37 in resettable_cold_reset_fn (opaque=0x555556beaac0) at ../hw/core/resettable.c:269
#30 0x0000555555e93f40 in qemu_devices_reset () at ../hw/core/reset.c:69
#31 0x0000555555c9eb04 in pc_machine_reset (machine=0x555556a4d9e0) at ../hw/i386/pc.c:1654
#32 0x0000555555d381fb in qemu_system_reset (reason=SHUTDOWN_CAUSE_NONE) at ../softmmu/runstate.c:443
#33 0x0000555555a787f2 in qdev_machine_creation_done () at ../hw/core/machine.c:1330
#34 0x0000555555d4e09c in qemu_machine_creation_done () at ../softmmu/vl.c:2650
#35 0x0000555555d4e16b in qmp_x_exit_preconfig (errp=0x5555568db1a0 <error_fatal>) at ../softmmu/vl.c:2673
#36 0x0000555555d506be in qemu_init (argc=31, argv=0x7fffffffe268, envp=0x7fffffffe368) at ../softmmu/vl.c:3692
#37 0x0000555555945cad in main (argc=31, argv=0x7fffffffe268, envp=0x7fffffffe368) at ../softmmu/main.c:49

This is where the vfio-user client in QEMU tells the vfio-user server (the GPIO device) that this particular memory region is not available for DMA. There are 3 vfio_dma_map() operations before this happens and this seems to be the very first vfio_dma_unmap() operation.
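
To make the failure mode concrete, here is a rough standalone C model of what the trace seems to show: a region_del listener that needs to drop the lock runs while the commit still counts as a pending transaction. This is only a sketch and not QEMU code. Every name in it (transaction_depth, unlock_allowed, blocking_region_del and so on) is hypothetical, and the ordering inside transaction_commit() is an assumption inferred from the backtrace rather than taken from the memory-sanity branch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the memory core's transaction nesting counter. */
static unsigned transaction_depth;

/* Set when a listener tried to drop the lock at an unsafe point. */
static bool unlock_rejected;

typedef void (*region_del_fn)(void);

static void transaction_begin(void)
{
    transaction_depth++;
}

static bool has_pending_transaction(void)
{
    return transaction_depth > 0;
}

/* Models the check in front of the unlock: true means unlocking is safe. */
static bool unlock_allowed(void)
{
    return !has_pending_transaction();
}

/* Models a region_del callback that must drop the lock to wait for a reply. */
static void blocking_region_del(void)
{
    if (!unlock_allowed()) {
        unlock_rejected = true;   /* the real build aborts here instead */
    }
}

/* Assumption: listeners run before the transaction stops counting as pending. */
static void transaction_commit(region_del_fn region_del)
{
    assert(transaction_depth > 0);   /* commit without begin is a caller bug */
    region_del();
    transaction_depth--;
}
```

Calling transaction_begin() and then transaction_commit(blocking_region_del) leaves unlock_rejected set, which would correspond to frames #4 to #12 in the trace above. Once the commit has returned, unlock_allowed() reports true again.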