Sergio, thanks for the updates to the bug description/SRU template.

I've extended the Test Plan to cover not only boots but also reboots,
which is the path the patch changes/exercises (I missed this previously),
and the Regression Potential, to point to the modified area (CPU reset).
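
For verification, the check in step 1 of the Test Plan could be scripted
roughly as below. This is only a sketch: tsc_decreased is a hypothetical
helper name, and it assumes the patched x86/tsc.flat writes one decimal
rdtsc value per line to the serial file, which depends on what tsc.c.patch
actually prints. Adjust the pattern if the output looks different.

```shell
# Hypothetical helper (not part of kvm-unit-tests or qemu).
# Assumption: the serial log holds one decimal rdtsc value per line.
# Returns 0 if some value is smaller than the one before it (the TSC
# went backwards, i.e. it was reset), and 1 if it never decreased.
# Note: awk compares as floating point, so very large values lose a
# little precision; a reset to a small value is still detected.
tsc_decreased() {
    awk '
        { sub(/\r$/, "") }              # tolerate CRLF line endings
        /^[0-9]+$/ {
            if (seen && $1 + 0 < prev) { found = 1; exit }
            prev = $1 + 0
            seen = 1
        }
        END { exit !found }
    ' "$1"
}
```

After the qemu run, something like
`tsc_decreased /tmp/bogus-output && echo "TSC went backwards"` should print
nothing with the unfixed package and print the message with the fixed one.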

** Description changed:

  [ Impact ]
  
  Some versions of Windows hang on reboot if their TSC value is greater
  than 2^54.  The calibration of the Hyper-V reference time overflows
  and fails; as a result the processors' clock sources are out of sync.
  
  [ Test Plan ]
  
  As suggested by Mauricio, testing will be done in stages.
  
  1) unit test, with an rdtsc/print loop (and confirm the tsc value
  decreases after system_reset).
  
  This can be done by using x86/tsc.flat from the following repository:
  
  https://gitlab.com/kvm-unit-tests/kvm-unit-tests.git
  
  Follow the steps below:
  
  Inside a Jammy system (privileged container/VM, bare metal, etc.):
  
  # apt update && apt install gcc make -y
  # git clone https://gitlab.com/kvm-unit-tests/kvm-unit-tests.git
  # cd kvm-unit-tests
  # wget https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/2064914/+attachment/5784045/+files/tsc.c.patch -O- | patch -p1
  # ./configure && make
  
  Make sure x86/tsc.flat exists.  Now you can install qemu and perform
  the test:
  
  # apt install -y qemu-system-x86
  # qemu-system-x86_64 -serial file:/tmp/bogus-output -accel kvm -kernel x86/tsc.flat -monitor stdio -nographic
  
  Wait a couple of seconds and issue a "system_reset" command.  Then, wait
  a couple more seconds and issue a "quit" command.
  
  You can now open /tmp/bogus-output and check the values of rdtsc.  You
  will notice that the value keeps increasing across the "system_reset"
  instead of starting over, which is exactly what we don't want.
  
  Afterwards, you can update qemu and test the fix by doing the same steps
  (make sure you adjust the "file:/tmp/..." path).
  
- 2) regression test, booting Ubuntu kernel/initrd pairs (installer's
- should be enough) from supported releases, and checking they boot/reach
- a prompt.
+ 2) regression test, booting Ubuntu kernel/initrd pairs (installer's should be enough) from supported releases, and checking they boot/reach a prompt.
+ 2.1) now, it is important to _reboot_, and check they boot/reach a prompt too.
  
  [ Where problems could occur ]
  
  This is a change impacting normal x86 code, so although the patch is
  small and well contained, in the unlikely case that we find a regression
  it will impact more users.  As such, and under Mauricio's advice, the
  test plan is being extended to really guarantee that the common
  virtualization scenarios are not impacted.  If we find a problem with
  this update, there is the possibility of reverting it temporarily until
  we can devise a proper fix.
  
  [ Original Description ]
  
  Description:
  Some versions of Windows hang on reboot if their TSC value is greater
  than 2^54.  The calibration of the Hyper-V reference time overflows
  and fails; as a result the processors' clock sources are out of sync.
  
  The issue is that the TSC _should_ be reset to 0 on CPU reset and
  QEMU tries to do that.  However, KVM special cases writing 0 to the
  TSC and thinks that QEMU is trying to hot-plug a CPU, which is
  correct the first time through but not later.  Thwart this valiant
  effort and reset the TSC to 1 instead, but only if the CPU has been
  run once.
  
  For this to work, env->tsc has to be moved to the part of CPUArchState
  that is not zeroed at the beginning of x86_cpu_reset.
  
  Solution: [PATCH] target/i386: properly reset TSC on reset
  
  I created and tested a ppa ubuntu package already. The patch fixes this issue.
  Link to ppa: https://launchpad.net/~bhinz83/+archive/ubuntu/openstack-rds/+packages
  
  It affects only jammy 22.04 package. The newest version is:
  qemu-1:6.2+dfsg-2ubuntu6.19

** Description changed:

  [ Impact ]
  
  Some versions of Windows hang on reboot if their TSC value is greater
  than 2^54.  The calibration of the Hyper-V reference time overflows
  and fails; as a result the processors' clock sources are out of sync.
  
  [ Test Plan ]
  
  As suggested by Mauricio, testing will be done in stages.
  
  1) unit test, with an rdtsc/print loop (and confirm the tsc value
  decreases after system_reset).
  
  This can be done by using x86/tsc.flat from the following repository:
  
  https://gitlab.com/kvm-unit-tests/kvm-unit-tests.git
  
  Follow the steps below:
  
  Inside a Jammy system (privileged container/VM, bare metal, etc.):
  
  # apt update && apt install gcc make -y
  # git clone https://gitlab.com/kvm-unit-tests/kvm-unit-tests.git
  # cd kvm-unit-tests
  # wget https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/2064914/+attachment/5784045/+files/tsc.c.patch -O- | patch -p1
  # ./configure && make
  
  Make sure x86/tsc.flat exists.  Now you can install qemu and perform
  the test:
  
  # apt install -y qemu-system-x86
  # qemu-system-x86_64 -serial file:/tmp/bogus-output -accel kvm -kernel x86/tsc.flat -monitor stdio -nographic
  
  Wait a couple of seconds and issue a "system_reset" command.  Then, wait
  a couple more seconds and issue a "quit" command.
  
  You can now open /tmp/bogus-output and check the values of rdtsc.  You
  will notice that the value keeps increasing across the "system_reset"
  instead of starting over, which is exactly what we don't want.
  
  Afterwards, you can update qemu and test the fix by doing the same steps
  (make sure you adjust the "file:/tmp/..." path).
  
  2) regression test, booting Ubuntu kernel/initrd pairs (installer's should be enough) from supported releases, and checking they boot/reach a prompt.
  2.1) now, it is important to _reboot_, and check they boot/reach a prompt too.
  
  [ Where problems could occur ]
  
  This is a change impacting normal x86 code, so although the patch is
  small and well contained, in the unlikely case that we find a regression
  it will impact more users.  As such, and under Mauricio's advice, the
  test plan is being extended to really guarantee that the common
  virtualization scenarios are not impacted.  If we find a problem with
  this update, there is the possibility of reverting it temporarily until
  we can devise a proper fix.
  
+ Regressions would most likely occur in the initialization / (re)boot path,
+ which should be easy to identify early in testing, except for corner cases.
+ 
  [ Original Description ]
  
  Description:
  Some versions of Windows hang on reboot if their TSC value is greater
  than 2^54.  The calibration of the Hyper-V reference time overflows
  and fails; as a result the processors' clock sources are out of sync.
  
  The issue is that the TSC _should_ be reset to 0 on CPU reset and
  QEMU tries to do that.  However, KVM special cases writing 0 to the
  TSC and thinks that QEMU is trying to hot-plug a CPU, which is
  correct the first time through but not later.  Thwart this valiant
  effort and reset the TSC to 1 instead, but only if the CPU has been
  run once.
  
  For this to work, env->tsc has to be moved to the part of CPUArchState
  that is not zeroed at the beginning of x86_cpu_reset.
  
  Solution: [PATCH] target/i386: properly reset TSC on reset
  
  I created and tested a ppa ubuntu package already. The patch fixes this issue.
  Link to ppa: https://launchpad.net/~bhinz83/+archive/ubuntu/openstack-rds/+packages
  
  It affects only jammy 22.04 package. The newest version is:
  qemu-1:6.2+dfsg-2ubuntu6.19

** Changed in: qemu (Ubuntu Jammy)
       Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-jammy

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2064914

Title:
  Windows guest hangs after reboot from the guest OS
