> On Fri, Feb 2, 2018 at 1:49 PM, Alistair Francis
> <alistair.fran...@xilinx.com> wrote:
>> On Fri, Feb 2, 2018 at 12:37 PM, Alex Bennée <alex.ben...@linaro.org> wrote:
>>>
>>> Alistair Francis <alistair.fran...@xilinx.com> writes:
>>>
>>>> On Thu, Feb 1, 2018 at 9:13 AM, Alistair Francis
>>>> <alistair.fran...@xilinx.com> wrote:
>>>>> On Thu, Feb 1, 2018 at 4:01 AM, Alex Bennée <alex.ben...@linaro.org>
>>>>> wrote:
>>>>>>
>>>>>> Alistair Francis <alistair.fran...@xilinx.com> writes:
>>>>>>
>>>>>>> On Wed, Jan 31, 2018 at 12:32 PM, Alex Bennée <alex.ben...@linaro.org>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Alistair Francis <alistair.fran...@xilinx.com> writes:
>>>>>>>>
>>>>>>>>> On Tue, Jan 30, 2018 at 8:26 PM, Paolo Bonzini <pbonz...@redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>> On 30/01/2018 18:56, Alistair Francis wrote:
>>>>>>>>>>>
>>>>>>>>>>> I don't have a good solution though, as setting CPU_INTERRUPT_RESET
>>>>>>>>>>> doesn't help (that isn't handled while we are halted) and
>>>>>>>>>>> async_run_on_cpu()/run_on_cpu() doesn't reliably reset the CPU when
>>>>>>>>>>> we want.
>>>>>>>>>>>
>>>>>>>>>>> I've even tried pausing all CPUs before resetting the CPU and then
>>>>>>>>>>> resuming them all, but that doesn't seem to work either.
>>>>>>>>>>
>>>>>>>>>> async_safe_run_on_cpu would be like async_run_on_cpu, except that it
>>>>>>>>>> takes care of stopping all other CPUs while the function runs.
>>>>>>>>>>
>>>>>>>>>>> Is there
>>>>>>>>>>> anything I'm missing? Is there no reliable way to reset a CPU?
>>>>>>>>>>
>>>>>>>>>> What do you mean by reliable? Executing no instruction after the one
>>>>>>>>>> you were at?
>>>>>>>>>
>>>>>>>>> The reset is called by a GPIO line, so I need the reset to be called
>>>>>>>>> basically as quickly as the GPIO line changes. The async_ and
>>>>>>>>> async_safe_ functions seem to not run quickly enough, even if I run a
>>>>>>>>> process_work_queue() function afterwards.
>>>>>>>>>
>>>>>>>>> Is there a way to kick the CPU to act on the async_*?
>>>>>>>>
>>>>>>>> Define quickly enough? The async_(safe) functions kick the vCPUs so
>>>>>>>> they will all exit the run loop as they enter the next TB (even if
>>>>>>>> they loop to themselves).
>>>>>>>
>>>>>>> We have a special power controller CPU that wakes all the CPUs up and
>>>>>>> at boot the async_* functions don't wake the CPUs up. If I just use
>>>>>>> the cpu_reset() function directly everything starts fine (but then I
>>>>>>> hit issues later).
>>>>>>>
>>>>>>> If I forcefully run process_queued_cpu_work() then I can get the CPUs
>>>>>>> up, but I don't think that is the right solution.
>>>>>>>
>>>>>>>>
>>>>>>>> From an external vCPU's point of view those extra instructions have
>>>>>>>> already executed. If the resetting vCPU needs them to have reset by the
>>>>>>>> time it executes its next instruction it should either cpu_loop_exit
>>>>>>>> at that point or ensure it is the last instruction in its TB (which is
>>>>>>>> what we do for the MMU flush cases in ARM, they all end the TB at that
>>>>>>>> point).
>>>>>>>
>>>>>>> cpu_loop_exit() sounds like it would help, but as I'm not in the CPU
>>>>>>> context it just seg faults.
>>>>>>
>>>>>> What context are you in? gdb-stub does have to do something like this.
>>>>>
>>>>> gdb-stub just seems to use vm_stop() and vm_start().
>>>>>
>>>>> That fixes all hangs/asserts, but now Linux only brings up 1 CPU (instead
>>>>> of 4).
>>>>
>>>> Hmmm... Interesting if I do this on reset events:
>>>>
>>>> pause_all_vcpus();
>>>> cpu_reset(cpu);
>>>> resume_all_vcpus();
>>>>
>>>> it hangs, while if I do this
>>>>
>>>> if (runstate_is_running()) {
>>>>     vm_stop(RUN_STATE_PAUSED);
>>>> }
>>>> cpu_reset(cpu);
>>>> if (!runstate_needs_reset()) {
>>>>     vm_start();
>>>> }
>>>>
>>>> it doesn't hang but CPU bringup doesn't work.
>>>
>>> Hmm I'm still confused what context you are in.
>>> Is this an externally triggered reset via the (qemu) prompt or something?
>>
>> This gets called from a variety of places. But most likely it's called
>> from a second QEMU process that is triggering an interrupt through a
>> device.
>
> Something like this:
>
> #0  0x0000555555807350 in cpu_reset_gpio (opaque=0x555557272100,
>     irq=0, level=0) at /scratch/alistai/master-qemu/exec.c:3853
> #1  0x0000555555a20336 in dep_register_refresh_gpios
>     (reg=reg@entry=0x555556fa5ad0, old_value=old_value@entry=2147496974)
>     at hw/core/register-dep.c:246
> #2  0x0000555555a2067b in dep_register_write (reg=0x555556fa5ad0,
>     val=<optimized out>, we=<optimized out>)
>     at hw/core/register-dep.c:142
> #3  0x0000555555841ae8 in memory_region_write_accessor
>     (mr=0x555556fa5b80, addr=0, value=<optimized out>, size=4,
>     shift=<optimized out>, mask=<optimized out>, attrs=...)
>     at /scratch/alistai/master-qemu/memory.c:617
> #4  0x000055555583e57d in access_with_adjusted_size
>     (addr=addr@entry=0, value=value@entry=0x7fffffffd218,
>     size=size@entry=4, access_size_min=<optimized out>,
>     access_size_max=<optimized out>,
>     access_fn=0x555555841a70 <memory_region_write_accessor>,
>     mr=0x555556fa5b80, attrs=...)
>     at /scratch/alistai/master-qemu/memory.c:684
> #5  0x0000555555843cda in memory_region_dispatch_write
>     (mr=0x555556fa5b80, addr=0, data=<optimized out>, size=4, attrs=...)
>     at /scratch/alistai/master-qemu/memory.c:1789
> #6  0x00005555557fbcb1 in flatview_write_continue (mr=0x555556fa5b80,
>     l=<optimized out>, addr1=<optimized out>, len=4,
>     buf=0x7fff900047c0 "\f4", attrs=..., addr=4246339844,
>     fv=0x5555574cdc10) at /scratch/alistai/master-qemu/exec.c:3076
> #7  0x00005555557fbcb1 in flatview_write (fv=0x5555574cdc10,
>     addr=<optimized out>, attrs=..., buf=<optimized out>,
>     len=<optimized out>) at /scratch/alistai/master-qemu/exec.c:3145
> #8  0x000055555586eb1b in dma_memory_rw_relaxed_attr (attr=...,
>     dir=DMA_DIRECTION_FROM_DEVICE, len=<optimized out>,
>     buf=0x7fff900047c0, addr=<optimized out>, as=<optimized out>)
>     at /scratch/alistai/master-qemu/include/sysemu/dma.h:96
> #9  0x000055555586eb1b in dma_memory_rw_attr (attr=...,
>     dir=DMA_DIRECTION_FROM_DEVICE, len=<optimized out>,
>     buf=0x7fff900047c0, addr=<optimized out>, as=<optimized out>)
>     at /scratch/alistai/master-qemu/include/sysemu/dma.h:120

Cc'ing Stefan for this part:

> #10 0x000055555586eb1b in rp_cmd_rw (s=0x555556d0bb90,
>     pkt=0x7fff90004770, dir=DMA_DIRECTION_FROM_DEVICE)
>     at /scratch/alistai/master-qemu/hw/core/remote-port-memory-slave.c:93
> #11 0x000055555586db53 in rp_process (s=<optimized out>)
>     at /scratch/alistai/master-qemu/hw/core/remote-port.c:424
> #12 0x000055555586db53 in rp_event_read (opaque=<optimized out>)
>     at /scratch/alistai/master-qemu/hw/core/remote-port.c:460
> #13 0x0000555555c5de14 in aio_dispatch_handlers
>     (ctx=ctx@entry=0x555556cf7750) at util/aio-posix.c:406
> #14 0x0000555555c5e6e8 in aio_dispatch (ctx=0x555556cf7750)
>     at util/aio-posix.c:437
> #15 0x0000555555c5b6ae in aio_ctx_dispatch (source=<optimized out>,
>     callback=<optimized out>, user_data=<optimized out>)
>     at util/async.c:261
> #16 0x00007ffff27a4fb7 in g_main_context_dispatch ()
>     at /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #17 0x0000555555c5d937 in glib_pollfds_poll () at util/main-loop.c:215
> #18 0x0000555555c5d937 in os_host_main_loop_wait
>     (timeout=<optimized out>) at util/main-loop.c:262
> #19 0x0000555555c5d937 in main_loop_wait (nonblocking=<optimized out>)
>     at util/main-loop.c:516
> #20 0x00005555557f4c76 in main_loop () at vl.c:2002
> #21 0x00005555557f4c76 in main (argc=<optimized out>,
>     argv=<optimized out>, envp=<optimized out>) at vl.c:4949