[Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime

Alex Bennée Wed, 05 Jul 2017 09:01:46 -0700

Hi,

An interesting bug was reported on #qemu today. It was bisected to
8d04fb55 (drop global lock for TCG) and only occurred when QEMU was run
with taskset -c 0. Originally the fingers were pointed at mttcg, but it
occurs in both single and multi-threaded modes.


I think the problem is that qemu_system_reset_request() is racy when
resetting a running CPU. AFAICT:

  - Guest resets board, writing to some hw address (e.g.
    arm_sysctl_write)
  - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
  - We exit iowrite and drop the BQL
  - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
  - We start writing new values to CPU env while still in TCG code
  - CHAOS!

The general solution for this is to ensure this sort of task is done
as safe work in the CPU's context, when we know nothing else is running.
It seems this is probably best done by modifying
qemu_system_reset_request() to queue work up on current_cpu and execute
it as safe work. I don't think the vl.c thread should ever be messing
about with calling cpu_reset directly.
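
To illustrate the idea, here is a toy, single-threaded model of the
difference between resetting directly and deferring the reset to a safe
point. Nothing in it is QEMU code: ToyCPU, toy_cpu_reset,
toy_reset_direct, toy_reset_request and toy_cpu_exec_one are all made-up
names, and the "queue" is a single slot. In QEMU itself I would expect
the real mechanism to be something along the lines of
async_safe_run_on_cpu(), but that is an assumption about the eventual
patch, not a description of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are not QEMU's real types or functions. */
typedef struct ToyCPU ToyCPU;
typedef void (*toy_work_fn)(ToyCPU *cpu);

struct ToyCPU {
    unsigned long pc;     /* stands in for the CPU env state */
    bool in_tcg;          /* true while "translated code" is running */
    bool torn_write;      /* set if env was written while in_tcg */
    toy_work_fn pending;  /* single-slot queue of deferred safe work */
};

/* Reset handler: rewrites env and records whether that was unsafe. */
static void toy_cpu_reset(ToyCPU *cpu)
{
    if (cpu->in_tcg) {
        cpu->torn_write = true;
    }
    cpu->pc = 0;
}

/* Current behaviour (modelled): another context resets the CPU directly. */
static void toy_reset_direct(ToyCPU *cpu)
{
    toy_cpu_reset(cpu);
}

/* Proposed behaviour (modelled): only queue the reset for later. */
static void toy_reset_request(ToyCPU *cpu)
{
    cpu->pending = toy_cpu_reset;
}

/* One iteration of the vCPU loop: run guest code, then drain safe work. */
static void toy_cpu_exec_one(ToyCPU *cpu, bool guest_requests_reset)
{
    cpu->in_tcg = true;
    cpu->pc += 4;                    /* "execute" one guest instruction */
    if (guest_requests_reset) {
        toy_reset_request(cpu);      /* e.g. a write to a reset register */
    }
    cpu->in_tcg = false;

    /* Safe point: no translated code is running on this CPU. */
    if (cpu->pending) {
        toy_work_fn fn = cpu->pending;
        cpu->pending = NULL;
        assert(!cpu->in_tcg);
        fn(cpu);
    }
}
```

With the deferred path the reset only ever runs after the CPU has left
its execution loop, so torn_write stays false. Calling toy_reset_direct()
on a CPU whose in_tcg flag is set models the race described above.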

Looking at the callers, most of them are device code, but I see KVM
also calls it. I just wanted to check this was a reasonable approach and
wouldn't upset anything else.

Any thoughts?

--
Alex Bennée
