On Wed, Jul 21, 2021 at 03:34:06PM -0400, Peter Xu wrote:
> Accessing from_dst_file is potentially racy in current code base like below:
>
> if (s->from_dst_file)
> do_something(s->from_dst_file);
>
> Because from_dst_file can be reset right after the check in another
> thread (rp_thread). One example is migrate_fd_cancel().
>
> Use the same qemu_file_lock to protect it too, just like to_dst_file.
>
> When it's safe to access without lock, comment it.
>
> There's one special reference in migration_thread() that can be replaced by
> the newly introduced rp_thread_created flag.
>
> Reported-by: Dr. David Alan Gilbert <[email protected]>
> Signed-off-by: Peter Xu <[email protected]>
> ---
> migration/migration.c | 32 +++++++++++++++++++++++++-------
> migration/migration.h | 8 +++++---
> migration/ram.c | 1 +
> 3 files changed, 31 insertions(+), 10 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 21b94f75a3..fa70400f98 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1879,10 +1879,12 @@ static void migrate_fd_cancel(MigrationState *s)
> QEMUFile *f = migrate_get_current()->to_dst_file;
> trace_migrate_fd_cancel();
>
> + qemu_mutex_lock(&s->qemu_file_lock);
> if (s->rp_state.from_dst_file) {
> /* shutdown the rp socket, so causing the rp thread to shutdown */
> qemu_file_shutdown(s->rp_state.from_dst_file);
> }
> + qemu_mutex_unlock(&s->qemu_file_lock);
Worth using WITH_QEMU_LOCK_GUARD?
> @@ -2827,11 +2845,13 @@ out:
> * Maybe there is something we can do: it looks like a
> * network down issue, and we pause for a recovery.
> */
> - qemu_fclose(rp);
> - ms->rp_state.from_dst_file = NULL;
> + migration_release_from_dst_file(ms);
> rp = NULL;
> if (postcopy_pause_return_path_thread(ms)) {
> - /* Reload rp, reset the rest */
> + /*
> + * Reload rp, reset the rest. Referencing it is save since
s/save/safe/
> + * it's reset only by us above, or when migration completes
> + */
> rp = ms->rp_state.from_dst_file;
> ms->rp_state.error = false;
> goto retry;
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org