"Denis V. Lunev" <[email protected]> wrote:
> Release qemu global mutex before calling synchronize_rcu().
> synchronize_rcu() waits for all readers to finish their critical
> sections. There is at least one critical section in which we try
> to get QGM (the critical section is in address_space_rw(), and
> prepare_mmio_access() is trying to acquire QGM).
>
> Both functions (migration_end() and migration_bitmap_extend())
> are called from main thread which is holding QGM.
>
> Thus there is a race condition that ends in a deadlock:
> main thread     working thread
> Lock QGM                |
> |             Call KVM_EXIT_IO handler
> |                       |
> |        Open rcu reader's critical section
> Migration cleanup bh    |
> |                       |
> synchronize_rcu() is    |
> waiting for readers     |
> |            prepare_mmio_access() is waiting for QGM
>   \                   /
>          deadlock
>
> The patch changes bitmap freeing from direct g_free after synchronize_rcu
> to free inside call_rcu.
>
> Signed-off-by: Denis V. Lunev <[email protected]>
> Reported-by: Igor Redko <[email protected]>
> Tested-by: Igor Redko <[email protected]>
> CC: Anna Melekhova <[email protected]>
> CC: Juan Quintela <[email protected]>
> CC: Amit Shah <[email protected]>
> CC: Paolo Bonzini <[email protected]>
> CC: Wen Congyang <[email protected]>
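
For readers following the thread, the idea in the quoted description
(queue the free with call_rcu instead of blocking in synchronize_rcu
while holding QGM) can be sketched roughly as below. This is a toy,
single-threaded model and not QEMU code: every toy_* name is
hypothetical, the grace period is reduced to a plain reader counter,
and the real call_rcu machinery works differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an RCU callback head; NOT QEMU's API. */
struct toy_rcu_head {
    struct toy_rcu_head *next;
    void (*func)(struct toy_rcu_head *head);
};

static struct toy_rcu_head *pending; /* callbacks waiting for readers to leave */
static int active_readers;           /* readers inside a critical section */
static int bitmaps_freed;            /* counter used only for illustration */

static void toy_rcu_read_lock(void)
{
    active_readers++;
}

static void toy_rcu_read_unlock(void)
{
    assert(active_readers > 0);
    active_readers--;
}

/* Queue the callback and return at once: the caller never blocks, so it
 * cannot deadlock against a reader that is waiting for a lock it holds. */
static void toy_call_rcu(struct toy_rcu_head *head,
                         void (*func)(struct toy_rcu_head *head))
{
    head->func = func;
    head->next = pending;
    pending = head;
}

/* Meant to run outside the caller's lock (assumption: a separate
 * reclaimer). Runs queued callbacks only when no reader is active and
 * returns how many callbacks were run. */
static int toy_rcu_reclaim(void)
{
    int ran = 0;

    if (active_readers > 0) {
        return 0;
    }
    while (pending) {
        struct toy_rcu_head *head = pending;
        pending = head->next;
        head->func(head);
        ran++;
    }
    return ran;
}

struct toy_bitmap {
    struct toy_rcu_head rcu; /* first member, so the cast below is valid */
    unsigned long *bits;
};

static struct toy_bitmap *toy_bitmap_new(size_t nlongs)
{
    struct toy_bitmap *bmap = calloc(1, sizeof(*bmap));

    if (!bmap) {
        return NULL;
    }
    bmap->bits = calloc(nlongs, sizeof(unsigned long));
    if (!bmap->bits) {
        free(bmap);
        return NULL;
    }
    return bmap;
}

/* Deferred destructor: frees the bitmap once readers are done with it. */
static void toy_bitmap_free(struct toy_rcu_head *head)
{
    struct toy_bitmap *bmap = (struct toy_bitmap *)head;

    free(bmap->bits);
    free(bmap);
    bitmaps_freed++;
}
```

In this model the cleanup path would call
toy_call_rcu(&bmap->rcu, toy_bitmap_free) and carry on; the bitmap is
only freed by a later toy_rcu_reclaim() once the reader has called
toy_rcu_read_unlock().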

Reviewed-by: Juan Quintela <[email protected]>

Applied to my tree.

PS: no, I still don't understand how we got so many RCU corner cases wrong.
