24.11.2020 13:04, Tuguoyi wrote:
> The following steps will cause a qemu assertion failure:
>  - pause the vm
>  - save a memory snapshot into a local file through fd migration
>  - do the above operation again, which will cause the qemu assertion failure
Hi!

Why do you need such a scenario? Isn't it more correct and safer to just
error out on savevm if the disks are already inactive? Inactive disks are a
sign that the vm may have been migrated to another host and may already be
running there, so creating any kind of snapshot of this old state may be a
bad idea.

I mean, you are trying to allow a doubtful feature in order to avoid an
assertion. If you don't have strong reasons for the feature, it's better to
turn a crash into a clean error-out.

As far as I remember, bdrv_inactivate_all() is the only source of
inactivation, so it's really more of an "inactive" state of the whole vm,
not just some disks being inactive. And you change the code in a way that
suggests some disks may be inactive and some not, which would actually be
unexpected behavior.
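To make the two options concrete, below is a minimal, self-contained sketch
of the difference between silently skipping an already-inactive node (what
the patch does) and failing the whole operation with a clean error (the
error-out suggested above). This is only a toy model, not QEMU code: the
struct, the TOY_O_INACTIVE flag and the function names are invented here and
merely mimic the shape of block.c.

#include <errno.h>
#include <stdio.h>

#define TOY_O_INACTIVE 0x1              /* stands in for BDRV_O_INACTIVE */

struct toy_bs {
    int open_flags;
};

/* Patch's policy: quietly skip nodes that are already inactive. */
static int inactivate_all_skip(struct toy_bs *nodes, int n)
{
    for (int i = 0; i < n; i++) {
        if (nodes[i].open_flags & TOY_O_INACTIVE) {
            continue;                   /* already inactive, ignore it */
        }
        nodes[i].open_flags |= TOY_O_INACTIVE;
    }
    return 0;
}

/* Suggested policy: a second inactivation means the vm state has already
 * been handed over, so refuse with a clean error instead of asserting. */
static int inactivate_all_error(struct toy_bs *nodes, int n)
{
    for (int i = 0; i < n; i++) {
        if (nodes[i].open_flags & TOY_O_INACTIVE) {
            return -EPERM;
        }
        nodes[i].open_flags |= TOY_O_INACTIVE;
    }
    return 0;
}

int main(void)
{
    struct toy_bs nodes[2] = { { 0 }, { 0 } };

    printf("first run:   %d\n", inactivate_all_error(nodes, 2));   /* 0 */
    printf("second run:  %d\n", inactivate_all_error(nodes, 2));   /* -EPERM */
    printf("skip policy: %d\n", inactivate_all_skip(nodes, 2));    /* 0, silently */
    return 0;
}

With the error-out policy, the failure would be reported back through the
savevm/migration path to the monitor, so the second snapshot attempt fails
visibly instead of either crashing or quietly snapshotting a vm whose disks
have already been handed over.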
> The backtrace looks like:
> #0  0x00007fbf958c5c37 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x00007fbf958c9028 in abort () from /lib/x86_64-linux-gnu/libc.so.6
> #2  0x00007fbf958bebf6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #3  0x00007fbf958beca2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
> #4  0x000055ca8decd39d in bdrv_inactivate_recurse (bs=0x55ca90c80400) at /build/qemu-5.0/block.c:5724
> #5  0x000055ca8dece967 in bdrv_inactivate_all () at /build//qemu-5.0/block.c:5792
> #6  0x000055ca8de5539d in qemu_savevm_state_complete_precopy_non_iterable (inactivate_disks=true, in_postcopy=false, f=0x55ca907044b0) at /build/qemu-5.0/migration/savevm.c:1401
> #7  qemu_savevm_state_complete_precopy (f=0x55ca907044b0, iterable_only=iterable_only@entry=false, inactivate_disks=inactivate_disks@entry=true) at /build/qemu-5.0/migration/savevm.c:1453
> #8  0x000055ca8de4f581 in migration_completion (s=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:2941
> #9  migration_iteration_run (s=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:3295
> #10 migration_thread (opaque=opaque@entry=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:3459
> #11 0x000055ca8dfc6716 in qemu_thread_start (args=<optimized out>) at /build/qemu-5.0/util/qemu-thread-posix.c:519
> #12 0x00007fbf95c5f184 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
> #13 0x00007fbf9598cbed in clone () from /lib/x86_64-linux-gnu/libc.so.6
>
> When the first migration completes, bdrv_inactivate_recurse() sets the
> BDRV_O_INACTIVE flag in bs->open_flags. During the second migration,
> bdrv_inactivate_recurse() asserts that BDRV_O_INACTIVE is not already
> set, which causes the crash.
>
> This patch just makes bdrv_inactivate_all() not inactivate a bs that is
> already inactive.
>
> Signed-off-by: Tuguoyi <[email protected]>
> ---
>  block.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/block.c b/block.c
> index f1cedac..02361e1 100644
> --- a/block.c
> +++ b/block.c
> @@ -5938,6 +5938,11 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs)
>      return 0;
>  }
>
> +static bool bdrv_is_inactive(BlockDriverState *bs)
> +{
> +    return bs->open_flags & BDRV_O_INACTIVE;
> +}
> +
>  int bdrv_inactivate_all(void)
>  {
>      BlockDriverState *bs = NULL;
> @@ -5958,7 +5963,7 @@ int bdrv_inactivate_all(void)
>          /* Nodes with BDS parents are covered by recursion from the last
>           * parent that gets inactivated. Don't inactivate them a second
>           * time if that has already happened. */
> -        if (bdrv_has_bds_parent(bs, false)) {
> +        if (bdrv_has_bds_parent(bs, false) || bdrv_is_inactive(bs)) {
>              continue;
>          }
>          ret = bdrv_inactivate_recurse(bs);
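For completeness, the assertion in the backtrace above (block.c:5724 in this
qemu-5.0 build) is the check in bdrv_inactivate_recurse() which, per the
commit message, fires because BDRV_O_INACTIVE is already set. Below is a
minimal standalone model of that failure mode; again a toy, not the real
bdrv_inactivate_recurse(), with invented names.

#include <assert.h>

#define TOY_O_INACTIVE 0x1              /* stands in for BDRV_O_INACTIVE */

struct toy_bs {
    int open_flags;
};

/* Mimics the shape of bdrv_inactivate_recurse(): insist that the node
 * is still active, then mark it inactive. */
static void toy_inactivate_recurse(struct toy_bs *bs)
{
    assert(!(bs->open_flags & TOY_O_INACTIVE));
    bs->open_flags |= TOY_O_INACTIVE;
}

int main(void)
{
    struct toy_bs bs = { 0 };

    toy_inactivate_recurse(&bs);   /* first snapshot: ok, flag gets set  */
    toy_inactivate_recurse(&bs);   /* second snapshot: assertion failure */
    return 0;
}

The patch avoids the second call by filtering already-inactive nodes out in
bdrv_inactivate_all(); the question raised above is whether that silent skip,
rather than a clean error, is the right policy.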
--
Best regards,
Vladimir
