Hi Peng Ju,

On 11/24/25 6:34 PM, Zhou, Peng Ju wrote:
> Hi Dongli,
> Thanks for your reply.
>
> As you said in
> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/[email protected]/__;!!ACWV5N9M2RV99hQ!I2xVB3T2iW4xtnnYdW0e-tnS2cRMe_EuIFx2ALBZ7Ys1lZGS1fyeZ3eYxf21kSU_VVhNkkl0FHIpHFe-U5yIxg$
> timeout occurred in guest after live migration due to the monotonic time jump
> ahead.
>
> Hi Qemu team
> Could you help to check the patch again?
> (I think Dongli's patch is better than mine.)
>
> Timeout can be occurred in the following sequence:
> 1. Send a job to HW and start a timer
> 2. HW response an interrupt (which means HW finishes the work) and VM
> suspended without process the interrupt
> 3. Resume the VM after live migration with a long downtime (may be 20s).
> 4. Timer timeout
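
The sequence above can be modelled with a small sketch. It is only an
illustration: struct job and the job_*() helpers are hypothetical names, not
code from any real driver, and the "tolerant" variant is just one possible way
a driver might cope with the jump (consume a pending completion before
declaring a timeout).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical job state; not taken from any real driver. */
struct job {
    uint64_t deadline_ns;  /* guest monotonic time at which the timer fires */
    bool irq_pending;      /* HW raised the completion interrupt (step 2) */
    bool done;             /* guest has processed the completion */
};

/* Step 1: submit the job and arm a timer against the monotonic clock. */
static void job_submit(struct job *j, uint64_t now_ns, uint64_t timeout_ns)
{
    j->deadline_ns = now_ns + timeout_ns;
    j->irq_pending = false;
    j->done = false;
}

/* Naive timer callback: only compares the clock against the deadline. */
static bool job_timed_out_naive(const struct job *j, uint64_t now_ns)
{
    return !j->done && now_ns >= j->deadline_ns;
}

/*
 * Tolerant timer callback (assumption, for illustration only): process a
 * completion that is already pending before deciding the job timed out.
 */
static bool job_timed_out_tolerant(struct job *j, uint64_t now_ns)
{
    if (j->irq_pending) {
        j->irq_pending = false;
        j->done = true;
    }
    return !j->done && now_ns >= j->deadline_ns;
}
```

With a 5s timeout, an interrupt raised just before the VM is suspended, and a
clock that is 20s ahead on resume, the naive check reports a timeout while the
tolerant check sees the completed job.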

Regarding this scenario:

The Linux kernel generally uses 'PVCLOCK_GUEST_STOPPED' to notify the guest VM
that the clock may be unreliable for a short period of time. The guest kernel
then calls pvclock_touch_watchdogs() to avoid spurious generic watchdog
timeouts.

For a specific driver, it depends on the implementation. Usually an I/O
timeout can recover once the request finally completes.

For some very specific scenarios, we may need to resolve it case by case,
e.g.:

[PATCH RESEND 1/1] x86/smpboot: check cpu_initialized_mask first after returning from schedule()
https://lore.kernel.org/all/[email protected]/

Thank you very much!

Dongli Zhang

>
> Thanks in advance.
>
> ----------------------------------------------------------------------
> BW
> Pengju Zhou
>
>> -----Original Message-----
>> From: Dongli Zhang <[email protected]>
>> Sent: Monday, November 24, 2025 3:14 PM
>> To: Zhou, Peng Ju <[email protected]>; [email protected]
>> Cc: Chang, HaiJun <[email protected]>; Ma, Qing (Mark)
>> <[email protected]>; [email protected];
>> [email protected]; [email protected]; [email protected];
>> [email protected]
>> Subject: Re: [PATCH] hw/i386/kvm: Prevent guest monotonic clock jump after
>> live migration
>>
>> Hi Peng Ju,
>>
>> On 11/20/25 12:44 AM, Peng Ju Zhou wrote:
>>> Problem
>>> After live migration, the guest monotonic clock may jump forward on the
>>> target.
>>>
>>> Cause
>>> kvmclock (the guest’s time base) is derived from host wall time and
>>> keeps advancing while the VM is paused. During STOP_COPY, QEMU reads
>>> kvmclock twice:
>>> 1) immediately after the VM is paused, and
>>> 2) when final CPU state is collected.
>>> Only the second (later) value is migrated. The gap between the two
>>> reads is roughly the downtime, so the target restores from a later
>>> time and the guest monotonic clock jumps ahead.
>>
>> According to prior discussion, it is expected to account live migration
>> downtime.
>>
>> https://urldefense.com/v3/__https://lore.kernel.org/qemu-__;!!ACWV5N9M2RV99hQ!I2xVB3T2iW4xtnnYdW0e-tnS2cRMe_EuIFx2ALBZ7Ys1lZGS1fyeZ3eYxf21kSU_VVhNkkl0FHIpHFePWcT4Fg$
>> devel/[email protected]/
>>
>> That is, the jump forward is expected during live migration.
>>
>> I used to send a QEMU patch to account live migration downtime.
>>
>> [PATCH 1/1] target/i386/kvm: account blackout downtime for kvm-clock and
>> guest TSC
>> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/20251009095831.46297-1-__;!!ACWV5N9M2RV99hQ!I2xVB3T2iW4xtnnYdW0e-tnS2cRMe_EuIFx2ALBZ7Ys1lZGS1fyeZ3eYxf21kSU_VVhNkkl0FHIpHFckXGguCA$
>> [email protected]/
>>
>> Thank you very much!
>>
>> Dongli Zhang
>>
>>>
>>> Fix
>>> Migrate the kvmclock value captured at pause time (the first read) so
>>> the target restores from the actual pause point.
>>>
>>> Signed-off-by: Peng Ju Zhou <[email protected]>
>>> ---
>>> hw/i386/kvm/clock.c | 8 +++++++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
>>> index 40aa9a32c3..cd6f7e1315 100644
>>> --- a/hw/i386/kvm/clock.c
>>> +++ b/hw/i386/kvm/clock.c
>>> @@ -43,6 +43,7 @@ struct KVMClockState {
>>>
>>>      /* whether the 'clock' value was obtained in the 'paused' state */
>>>      bool runstate_paused;
>>> +    RunState state;
>>>
>>>      /* whether machine type supports reliable KVM_GET_CLOCK */
>>>      bool mach_use_reliable_get_clock;
>>> @@ -108,7 +109,10 @@ static void kvm_update_clock(KVMClockState *s)
>>>          fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(-ret));
>>>          abort();
>>>      }
>>> -    s->clock = data.clock;
>>> +
>>> +    if (s->state != RUN_STATE_FINISH_MIGRATE) {
>>> +        s->clock = data.clock;
>>> +    }
>>>
>>>      /* If kvm_has_adjust_clock_stable() is false, KVM_GET_CLOCK returns
>>>       * essentially CLOCK_MONOTONIC plus a guest-specific adjustment.
>>> This
>>> @@ -217,6 +221,8 @@ static void kvmclock_vm_state_change(void *opaque, bool running,
>>>           */
>>>          s->clock_valid = true;
>>>      }
>>> +
>>> +    s->state = state;
>>>  }
>>>
>>>  static void kvmclock_realize(DeviceState *dev, Error **errp)
>
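
As a footnote to the quoted commit message, the two kvmclock reads can be put
into numbers with a toy model. It assumes the clock advances 1:1 with host
time while the VM is paused, as the commit message describes. The struct and
function names are hypothetical and exist only for this illustration; this is
not QEMU code.

```c
#include <stdint.h>

/* Hypothetical model of the two reads during STOP_COPY. */
struct migration_model {
    uint64_t clock_at_pause_ns;  /* first read, right after the VM is paused */
    uint64_t downtime_ns;        /* gap between the first and second read */
};

/* Second read: taken when the final CPU state is collected. */
static uint64_t second_read_ns(const struct migration_model *m)
{
    return m->clock_at_pause_ns + m->downtime_ns;
}

/* Jump seen by the guest if 'migrated_ns' is restored on the target. */
static uint64_t guest_jump_ns(const struct migration_model *m,
                              uint64_t migrated_ns)
{
    return migrated_ns - m->clock_at_pause_ns;
}
```

For a pause at 100s and a 20s gap, migrating the second read gives a 20s jump
(the downtime is accounted, which is the expected behaviour per the prior
discussion), while migrating the first read gives no jump (the downtime is
hidden from the guest).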
