And Anushree, would you also agree that bug LP: 2067383 / Bugzilla: 206641
(https://bugs.launchpad.net/bugs/2067383)
can be considered a duplicate of this one (LP: 2076587 / Bugzilla: 208538)?
They seem suspiciously similar ...


** Description changed:

+ SRU Justification:
+ 
+ [ Impact ]
+ 
+  * While running a (nested) KVM guest on Power10 (with PowerVM)
+    and performing a CPU hotplug, trying to increase to 68 vCPUs,
+    the KVM guest crashes.
+ 
+  * In the failure case the KVM guest has maxvcpus 128,
+    and it starts fine with an initial value of 4 vCPUs,
+    but fails after a larger increase (here to 68 vCPUs).
+ 
+  * The error reported is:
+    [ 662.102542] KVM: Create Guest vcpu hcall failed, rc=-44
+    error: Unable to read from monitor: Connection reset by peer
+ 
+  * This especially seems to happen on memory-constrained systems.
+ 
+  * This can be avoided by pre-creating and parking the vCPUs
+    (or returning an error if that fails), which turns a vCPU hotplug
+    failure into a graceful error while the guest keeps running
+    (see the sketch after the [ Fix ] list below).
+ 
+ [ Fix ]
+ 
+  * 08c3286822 ("accel/kvm: Extract common KVM vCPU {creation,parking}
+    code") [pre-req]
+ 
+  * c6a3d7bc9e ("accel/kvm: Introduce kvm_create_and_park_vcpu() helper")
+ 
+  * 18530e7c57 ("cpu-common.c: export cpu_get_free_index to be reused
+    later")
+ 
+  * cfb52d07f5 ("target/ppc: handle vcpu hotplug failure gracefully")
+ 
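+  * For illustration only, here is a minimal, self-contained sketch of
+    the create-and-park idea against the raw /dev/kvm API. This is NOT
+    the QEMU code itself: the helper names, the fixed-size parking list
+    and the standalone main() are assumptions made for this sketch;
+    QEMU's real helpers are the ones named in the commits above.
+ 
+    /* Pre-create a vCPU fd and park it, so that a failure surfaces as a
+     * graceful error; hand the parked fd out when the hotplug happens. */
+    #include <fcntl.h>
+    #include <linux/kvm.h>
+    #include <stdio.h>
+    #include <sys/ioctl.h>
+    #include <unistd.h>
+ 
+    #define MAX_PARKED 16
+ 
+    static int parked_fd[MAX_PARKED];           /* parked vCPU fds */
+    static unsigned long parked_id[MAX_PARKED]; /* their vCPU ids  */
+    static int nr_parked;
+ 
+    /* Create a vCPU and park it; 0 on success, -1 on graceful failure. */
+    static int create_and_park_vcpu(int vm_fd, unsigned long id)
+    {
+        int fd = ioctl(vm_fd, KVM_CREATE_VCPU, id);
+        if (fd < 0) {
+            perror("KVM_CREATE_VCPU");          /* report, do not crash */
+            return -1;
+        }
+        if (nr_parked >= MAX_PARKED) {          /* sketch-only limit */
+            close(fd);
+            return -1;
+        }
+        parked_fd[nr_parked] = fd;
+        parked_id[nr_parked] = id;
+        nr_parked++;
+        return 0;
+    }
+ 
+    /* At hotplug time, fetch the pre-created fd instead of creating late. */
+    static int unpark_vcpu(unsigned long id)
+    {
+        for (int i = 0; i < nr_parked; i++) {
+            if (parked_id[i] == id) {
+                int fd = parked_fd[i];
+                nr_parked--;
+                parked_fd[i] = parked_fd[nr_parked];
+                parked_id[i] = parked_id[nr_parked];
+                return fd;
+            }
+        }
+        return -1;
+    }
+ 
+    int main(void)
+    {
+        int kvm = open("/dev/kvm", O_RDWR);
+        if (kvm < 0) { perror("/dev/kvm"); return 1; }
+        int vm = ioctl(kvm, KVM_CREATE_VM, 0);
+        if (vm < 0) { perror("KVM_CREATE_VM"); return 1; }
+ 
+        if (create_and_park_vcpu(vm, 0) == 0) {
+            int fd = unpark_vcpu(0);
+            printf("vCPU 0 pre-created, parked and unparked: fd=%d\n", fd);
+            close(fd);
+        }
+        close(vm);
+        close(kvm);
+        return 0;
+    }
+ 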
+ [ Test Plan ]
+ 
+  * Set up an IBM Power10 system (with firmware FW1060 or newer,
+    which comes with nested KVM support), running Ubuntu Server 24.04.
+ 
+  * Install and configure KVM on this system, and define a guest with a
+    (higher) maxvcpus value of 128 but a (smaller) initial value of
+    4 vCPUs (see the example element below):
+    $ virsh define ubu2404.xml
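+    The vCPU element in such a domain XML would look like this (an
+    assumed example in standard libvirt syntax; the rest of ubu2404.xml
+    is not shown here):
+      <vcpu placement='static' current='4'>128</vcpu>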
+ 
+  * Now after successful definition, start the VM:
+    $ virsh start ubu2404 --console
+ 
+  * Once the VM is up and running, increase the vCPUs to a larger
+    value, here 68:
+    $ virsh setvcpus ubu2404 68
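+ 
+  * Optionally verify the resulting count (standard virsh usage):
+    $ virsh vcpucount ubu2404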
+ 
+  * With an unpatched qemu the guest will crash, showing:
+    [ 662.102542] KVM: Create Guest vcpu hcall failed, rc=-44
+    error: Unable to read from monitor: Connection reset by peer
+ 
+  * A patched environment will:
+    - either just successfully hotplug the new number (68) of vCPUs
+      without further messages
+    - or (in case the system is very memory-constrained) print a
+      (graceful) error message that the hotplug could not be performed,
+      while the guest stays up and running:
+      error: internal error: unable to execute QEMU command 'device_add': \
+      kvmppc_cpu_realize: vcpu hotplug failed with -12
+ 
+  * Since specific firmware is required, IBM is doing the test and
+    validation (and has already verified this successfully based on the
+    PPA test builds).
+ 
+ [ Where problems could occur ]
+ 
+  * All modifications were done in target/ppc/kvm.c,
+    are therefore limited to the IBM Power platform,
+    and will not affect other architectures.
+ 
+  * The pre-creation of vCPUs (init / cpu_target_realize) may lead to
+    early failures in situations where a user does not yet expect that
+    many vCPUs to be instantiated.
+ 
+  * And the pre-creation, and especially the parking of vCPUs
+    (kvm_create_and_park_vcpu), will probably consume more resources
+    than before.
+ 
+  * Hence a patched system might run with a reduced maximum number of
+    vCPUs, but in return it will not crash hard and will instead fail
+    gracefully when resources are lacking.
+ 
+  * This case and the patch(es) are also discussed in more detail here:
+    https://lore.kernel.org/qemu-devel/20240516053211.145504-1-hars...@linux.ibm.com/T/#t
+    and here:
+    https://bugzilla.redhat.com/show_bug.cgi?id=2304078
+ 
+ [ Other Info ]
+ 
+  * The code was accepted upstream with qemu v9.1.0(-rc0),
+    the upload to oracular was done,
+    and now only noble is affected.
+ 
+  * Ubuntu releases older than noble are not affected,
+    since (nested) KVM virtualization on P10
+    was introduced starting with noble.
+ __________
+ 
  == Comment: #0 - SEETEENA THOUFEEK <sthou...@in.ibm.com> - 2024-08-12 
03:47:06 ==
  +++ This bug was initially created as a clone of Bug #205620 +++
  
  ---Problem Description---
  cpu hotplug crashes the guest!
-  
+ 
  ---Steps to Reproduce---
-  I have been trying for the CPU hotplugging to the guest with maxvcpus as 128 
and current value I am giving as 4! but when I try to hotplug 68 vcpus to the 
guest, it crahses and we get error message as: 
+  I have been trying CPU hotplug on the guest with maxvcpus as 128 and a 
current value of 4, but when I try to hotplug 68 vcpus to the guest, it 
crashes and we get this error message:
  [  303.808494] KVM: Create Guest vcpu hcall failed, rc=-44
  error: Unable to read from monitor: Connection reset by peer
-  
  
  Steps to reproduce:
  
  1) virsh define bug.xml
  
  2) virsh start Fedora39 --console
  
  3) virsh setvcpus Fedora39 68
  
- Output : 
+ Output :
  [  662.102542] KVM: Create Guest vcpu hcall failed, rc=-44
  error: Unable to read from monitor: Connection reset by peer
  
- 
- If resources are less, in my thinking it should fail gracefully! 
+ If resources are insufficient, in my opinion it should fail gracefully!
  Attaching the XML file that I have used; I will post the observations from 
the MDC system, where I saw the same failure at a higher vCPU count.
  
  Fixed with upstream commit:
  
  https://github.com/qemu/qemu/commit/cfb52d07f53aa916003d43f69c945c2b42bc6374
-  
- Machine Type = na 
-  
+ 
+ Machine Type = na
+ 
  ---Debugger---
  A debugger is not configured
-  
- Contact Information = sthou...@in.ibm.com 
-  
+ 
+ Contact Information = sthou...@in.ibm.com
+ 
  ---uname output---
  NA

** Changed in: ubuntu-power-systems
       Status: Triaged => In Progress
