The comment around documenting the cpu_mutex fields and critical sections got us thinking and revisiting our locking assumptions in cpu_handle_interrupt.
Initially we were thinking that removing the BQL from cpu_handle_interrupt meant that we needed to replace it with the cpu mutex to protect the per cpu data that is accessed like interrupt_request. We are reconsidering this and now thinking that the cpu mutex might not be needed here. BQL is clearly needed to protect the critical section around the call to ->cpu_exec_interrupt. What else is the BQL protecting in cpu_handle_interrupt that we need to consider? Are we missing anything here? It's also worth mentioning that we did give this approach a try. We tried out changes to cpu_handle_interrupt(), dropping the BQL from all but around ->cpu_exec_interrupt, and not using the cpu_mutex at all. It seemed to be functional, passing all the tests that we tried (so far). :) Thanks, -Rob On Thu, 6 Aug 2020 at 12:11, Robert Foley <robert.fo...@linaro.org> wrote: > > On Thu, 6 Aug 2020 at 05:22, Paolo Bonzini <pbonz...@redhat.com> wrote: > > > > On 05/08/20 21:18, Richard Henderson wrote: > > > On 8/5/20 11:12 AM, Robert Foley wrote: > > >> This change removes the implied BQL from the cpu_handle_interrupt, > > >> and cpu_handle_exception paths. This BQL acquire is being pushed > > >> down into the per arch implementation. > > >> > > >> Signed-off-by: Robert Foley <robert.fo...@linaro.org> > > >> --- > > >> accel/tcg/cpu-exec.c | 19 +++++++++++-------- > > >> 1 file changed, 11 insertions(+), 8 deletions(-) > > >> > > >> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c > > >> index 80d0e649b2..8e2bfd97a1 100644 > > >> --- a/accel/tcg/cpu-exec.c > > >> +++ b/accel/tcg/cpu-exec.c > > >> @@ -517,9 +517,7 @@ static inline bool cpu_handle_exception(CPUState > > >> *cpu, int *ret) > > >> #else > > >> if (replay_exception()) { > > >> CPUClass *cc = CPU_GET_CLASS(cpu); > > >> - qemu_mutex_lock_iothread(); > > >> cc->do_interrupt(cpu); > > >> - qemu_mutex_unlock_iothread(); > > >> cpu->exception_index = -1; > > >> > > > > > > This patch is not bisectable. The removal of the lock here needs to > > > happen at > > > the end, or something. > > > > Indeed the series should be structured like this: > > > > 1) rename all *_do_interrupt functions to *_do_interrupt_locked > > > > 2) add back *_do_interrupt that takes the BQL and calls > > *_do_interrupt_locked, point ->do_interrupt to it, remove the BQL from > > cpu-exec.c > > > > 3) modify the cpu_mutex and BQL critical sections around > > ->cpu_exec_interrupt, so that the BQL critical section covers just the > > call to ->cpu_exec_interrupt. Document which fields are now covered by > > cpu_mutex. > > > > 4/5) same as 1/2 for ->cpu_exec_interrupt > > > > Patches 1/2 would be pretty large, but they're trivial to review just by > > grepping for "->do_interrupt\s*=", and likewise for 4/5. > > > > Thanks for the details ! > > It seems like we will have 3 separate patches for this series, 1/2, 3, and > 4/5. > > We will go in this direction. > > Thanks, > -Rob > > > Thanks, > > > > Paolo > >