On 11/30/20 7:07 PM, Cédric Le Goater wrote:
> On 11/30/20 5:52 PM, Greg Kurz wrote:
>> The sPAPR XIVE device has an internal ENDT table the size of
>> which is configurable by the machine. This table is supposed
>> to contain END structures for all possible vCPUs that may
>> enter the guest. The machine must also claim IPIs for all
>> possible vCPUs since this is expected by the guest.
>>
>> spapr_irq_init() takes care of that under the assumption that
>> spapr_max_vcpu_ids() returns the number of possible vCPUs.
>> This happens to be the case when the VSMT mode is set to match
>> the number of threads per core in the guest (default behavior).
>> With non-default VSMT settings, this limit is greater than the
>> number of vCPUs. In the worst case, we can end up allocating an
>> 8 times bigger ENDT and claiming 8 times more IPIs than needed.
>> But more importantly, this creates confusion between the number
>> of vCPUs and vCPU ids, which can lead to subtle bugs like [1].
>>
>> Use smp.max_cpus instead of spapr_max_vcpu_ids() in
>> spapr_irq_init() for the latest machine type. Older machine
>> types continue to use spapr_max_vcpu_ids() since the size of
>> the ENDT is migration visible.
>>
>> [1] https://bugs.launchpad.net/qemu/+bug/1900241
>>
>> Signed-off-by: Greg Kurz <gr...@kaod.org>
>
>
> Reviewed-by: Cédric Le Goater <c...@kaod.org>
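For readers skimming the thread, the shape of the change under
discussion is roughly the sketch below. This is not the actual patch:
the "legacy_xive_sizing" compat flag name is invented for
illustration, and the spapr_max_vcpu_ids() signature is assumed from
patch 1 of the series.

#include "qemu/osdep.h"
#include "hw/boards.h"
#include "hw/ppc/spapr.h"

/*
 * Sketch only. "legacy_xive_sizing" is an invented compat flag name;
 * the idea is to key the old behaviour off the machine class so that
 * the migration-visible ENDT size of older machine types does not
 * change.
 */
static uint32_t spapr_xive_nr_servers(SpaprMachineState *spapr)
{
    SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
    MachineState *ms = MACHINE(spapr);

    if (smc->legacy_xive_sizing) {
        /* Older machine types: keep sizing on the highest vCPU id. */
        return spapr_max_vcpu_ids(spapr);
    }

    /* Latest machine type: size on the number of possible vCPUs. */
    return ms->smp.max_cpus;
}

The result would then feed the ENDT sizing (a fixed number of ENDs per
server, one per priority) and the IPI claiming loop in
spapr_irq_init(), instead of spapr_max_vcpu_ids() being used directly.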
I gave patches 2 and 3 a little more thought. I don't think we need
much more than patch 1, which clarifies the nature of the values being
manipulated: quantities vs. numbering. The last two patches add
complexity to optimize the XIVE VP space for a scenario which is not
very common (non-default vSMT). Maybe it's not worth it.

Today, we can start 4K (-2) KVM guests with 16 vCPUs each on a
witherspoon (2-socket P9) and we are far from reaching the limits of
the VP space; available RAM is more of a problem. The VP space is even
bigger on P10, where the width was increased to 24 bits per chip.

C.
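For scale, a rough back-of-the-envelope from the numbers above: 4094
guests x 16 vCPUs comes to about 65.5K VPs, while a 24-bit VP width
corresponds to 2^24, i.e. roughly 16.7M entries per chip.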