Hi,

An interesting bug was reported on #qemu today. It was bisected to 8d04fb55 (drop global lock for TCG) and only occurred when QEMU was run with taskset -c 0. Originally the fingers were pointed at mttcg, but it occurs in both single- and multi-threaded modes.

I think the problem is that qemu_system_reset_request() is racy when resetting a running CPU. AFAICT:

  - Guest resets board, writing to some hw address (e.g. arm_sysctl_write)
  - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
  - We exit iowrite and drop the BQL
  - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
  - We start writing new values to CPU env while still in TCG code
  - CHAOS!

The general solution for this is to ensure these sorts of tasks are done as safe work in the CPU's context, when we know nothing else is running. It seems this is probably best done by modifying qemu_system_reset_request to queue work up on current_cpu and execute it as safe work. I don't think the vl.c thread should ever be messing about with calling cpu_reset directly.

Looking at the calls, most of these are made by device code, but I see KVM also does it. I just wanted to check this was a reasonable approach and wouldn't upset anything else.

Any thoughts?

--
Alex Bennée
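
To make the proposal above concrete, here is a rough, single-threaded toy model of "queue the reset as work on the current CPU and run it at a safe point". This is not QEMU code: ToyCPU and every toy_* name are hypothetical stand-ins, the work queue is a single slot, and the "safe point" is simply the end of one execution slice. It only illustrates the ordering, not the cross-thread exclusion that real safe work would have to provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT QEMU's real types. */
typedef struct ToyCPU ToyCPU;
typedef void (*toy_work_fn)(ToyCPU *cpu);

struct ToyCPU {
    int pc;               /* stand-in for the CPU env state */
    bool in_guest_code;   /* true while the toy "TCG" slice is executing */
    toy_work_fn pending;  /* single-slot queue of deferred safe work */
};

/* The deferred work item: only legal once the CPU has left guest code. */
static void toy_do_reset(ToyCPU *cpu)
{
    assert(!cpu->in_guest_code);
    cpu->pc = 0;
}

/* Called from "device code": queue the reset instead of performing it now. */
static void toy_reset_request(ToyCPU *cpu)
{
    cpu->pending = toy_do_reset;
}

/* One iteration of the toy vCPU loop: run a slice, then drain safe work. */
static void toy_cpu_exec_once(ToyCPU *cpu, bool guest_writes_reset_reg)
{
    cpu->in_guest_code = true;
    cpu->pc += 4;
    if (guest_writes_reset_reg) {
        /* The guest's I/O write only *requests* the reset. */
        toy_reset_request(cpu);
    }
    /* The env is still in use after the I/O write returns. */
    cpu->pc += 4;
    cpu->in_guest_code = false;

    /* Safe point: nothing is executing, so the env may be rewritten. */
    if (cpu->pending) {
        toy_work_fn fn = cpu->pending;
        cpu->pending = NULL;
        fn(cpu);
    }
}
```

Calling toy_cpu_exec_once(&cpu, true) lets the slice finish before toy_do_reset runs, so the env is never rewritten while the toy "TCG" code is still using it. If toy_reset_request called toy_do_reset directly, the assert would fire, which is the toy analogue of the race described above.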