On 2025/09/01 1:38, Arun Menon wrote:
Hi,
On Mon, Sep 01, 2025 at 01:04:40AM +0900, Akihiko Odaki wrote:
On 2025/09/01 0:45, Arun Menon wrote:
Hi Akihiko,
Thanks for the review.
On Sat, Aug 30, 2025 at 02:58:05PM +0900, Akihiko Odaki wrote:
On 2025/08/30 5:01, Arun Menon wrote:
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that qemu_loadvm_state() must report an error
in errp, in case of failure.
When postcopy live migration runs, the device states are loaded by
both the qemu coroutine process_incoming_migration_co() and the
postcopy_ram_listen_thread(). Therefore, it is important that the
coroutine also reports the error in case of failure, with
error_report_err(). Otherwise, the source qemu will not display
any errors before going into the postcopy pause state.
Reviewed-by: Marc-André Lureau <[email protected]>
Reviewed-by: Fabiano Rosas <[email protected]>
Signed-off-by: Arun Menon <[email protected]>
---
migration/migration.c | 9 +++++----
migration/savevm.c | 30 ++++++++++++++++++------------
migration/savevm.h | 2 +-
3 files changed, 24 insertions(+), 17 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index
10c216d25dec01f206eacad2edd24d21f00e614c..c6768d88f45c870c7fad9b9957300766ff69effc
100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -881,7 +881,7 @@ process_incoming_migration_co(void *opaque)
MIGRATION_STATUS_ACTIVE);
mis->loadvm_co = qemu_coroutine_self();
- ret = qemu_loadvm_state(mis->from_src_file);
+ ret = qemu_loadvm_state(mis->from_src_file, &local_err);
mis->loadvm_co = NULL;
trace_vmstate_downtime_checkpoint("dst-precopy-loadvm-completed");
@@ -908,7 +908,8 @@ process_incoming_migration_co(void *opaque)
}
if (ret < 0) {
- error_setg(&local_err, "load of migration failed: %s", strerror(-ret));
+ error_prepend(&local_err, "load of migration failed: %s: ",
+ strerror(-ret));
goto fail;
}
@@ -924,13 +925,13 @@ fail:
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
migrate_set_error(s, local_err);
- error_free(local_err);
+ error_report_err(local_err);
This is problematic because it results in duplicate error reports when
!mis->exit_on_error; in that case the query-migrate QMP command reports the
error and this error reporting is redundant.
If I comment this change, then all of the errors propagated up to now, using
error_setg() will not be reported. This is the place where it is finally
reported,
when qemu_loadvm_state() fails. In other words, all the error_reports() we
removed
from all the files, replacing them with error_setg(), will finally be reported
here
using error_report_err().
My understanding of the code without these two changes is:
- If the migrate-incoming QMP command is used with false as
exit-on-error, this function will not report the error but
the query-migrate QMP command will report the error.
- Otherwise, this function reports the error.
With my limited experience in testing, I have a question,
So there are 2 scenarios,
1. running the virsh migrate command on the source host. Something like the
following,
virsh -c 'qemu:///system' migrate --live --verbose --domain guest-vm
--desturi qemu+ssh://10.6.120.20/system
OR for postcopy-ram,
virsh migrate guest-vm --live qemu+ssh://10.6.120.20/system --verbose
--postcopy --timeout 10 --timeout-postcopy
2. Using QMP commands, performing a migration from source to destination.
Running something like the following on the destination:
{
"execute": "migrate-incoming",
"arguments": {
"uri": "tcp:127.0.0.1:7777",
"exit-on-error": false
}
}
{
"execute": "migrate-incoming",
"arguments": {
"uri": "tcp:127.0.0.1:7777",
"exit-on-error": false
}
}
and the somthing like the following on source:
{
"execute": "migrate",
"arguments": {
"uri": "tcp:127.0.0.1:7777"
}
}
{"execute" : "query-migrate"}
In 1, previously, the user used to get an error message on migration failure.
This was because there were error_report() calls in all of the files.
Now that they are replaced with error_setg() and the error is stored in errp,
we need to display that using error_report_err(). Hence I introduced an
error_report_err()
call in the fail section.
In 2, we have 2 QMP sessions, one for the source and another for the
destination.
The QMP command migrate will be issued on the source, and the errp will be set.
I did not understand the part where the message will be displayed because of the
error_report_err() call. I did not see such a message on failure scenario on
both
the sessions.
If the user wants to check for errors, then the destination qemu will not exit
(exit-on-error = false ) and we can retrieve it using {"execute" :
"query-migrate"}
Aren't the 2 scenarios different by nature?
In 1, doesn't libvirt query the error with query-migrate and print it?
In any case, it would be nice if you describe how libvirt interacts with
QEMU in 1.
Regards,
Akihiko Odaki