Confirmed that the Disco kernel is only missing 2b57ecd0208f ("KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char()") from the patchset referenced in bug 1822870.
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1832622

Title:
  QEMU - count cache flush Spectre v2 mitigation (CVE) (required for
  POWER9 DD2.3)

Status in The Ubuntu-power-systems project:
  Confirmed
Status in linux package in Ubuntu:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Won't Fix
Status in linux source package in Bionic:
  Fix Released
Status in qemu source package in Bionic:
  Fix Committed
Status in qemu source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Confirmed
Status in qemu source package in Disco:
  Fix Committed
Status in qemu source package in Eoan:
  Fix Released

Bug description:
  [Impact]

   * This belongs to the overall context of Spectre mitigations and, even
     more, to the attempt to minimize the related performance impact. On
     ppc64el there is a new chip revision (DD 2.3) which provides a
     facility that helps to better mitigate some of this.

   * Backport the patches so that the feature (if supported by the HW) is
     passed to the guest as a capability, allowing guests that support
     the improved mitigation to use it.

  [Test Case]

   * Start guests with and without this capability
   * Check if the capability is guest visible as intended
   * Check if there are any issues on pre DD2.3 HW
   * Test migrations (IBM outlined the intended paths that will work
     below)
   * The problem with the above (and also the reason I didn't add a list
     of commands this time) is that it needs special HW (the mentioned
     DD2.3 revision of the chips) which isn't available to us right now.
     Because of that, testing / verification of this on all releases is
     on IBM.

  [Regression Potential]

   * Adding new capabilities usually works fine; there are three common
     pitfalls, which are the regression potential here.
     - (severe) the code would announce a capability that isn't really
       available.
       The guest tries to use it and crashes
     - (medium) several migration paths, especially from systems with
       the new cap to older (un-updated) systems, will fail. But that
       applies to any "from machine with feature to machine without that
       feature" migration and isn't really a new regression. As outlined
       by IBM below, they even tried to make it somewhat compatible (by
       being a new value in an existing cap)
     - (low) the guest will see new caps and/or facilities. A really odd
       guest could stumble due to that (would actually be a guest bug
       then)

     Overall, all of the above was considered by IBM when developing
     this and should be ok.
     For archive wide SRU considerations, this has NO effect on non
     ppc64el.

  [Other Info]

   * n/a

  ---

  Power9 DD 2.3 CPUs running updated firmware will use a new Spectre v2
  mitigation. The new mitigation improves performance of branch heavy
  workloads, but also requires kernel support in order to be fully
  secure.

  Without the kernel support there is a risk of a Spectre v2 attack
  across a process context switch, though it has not been demonstrated
  in practice.

  QEMU portion - the platform definition needs to account for this new
  mitigation action, so an attribute for it needs to be added.

  In terms of support for virtualisation there are 2 sides, KVM and
  QEMU support. Patch list for each:

  KVM:
  2b57ecd0208f KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char()
  This is part of LP1822870 already.

  QEMU:
  8ff43ee404 target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST
  399b2896d4 target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS

  The KVM side is upstream as of v5.1-rc1.
  The QEMU side is upstream as of v4.0.0-rc0.

  In terms of migration the state is as follows.

  In order to tell the guest to use the count cache flush workaround we
  use the spapr-cap cap-ibs (indirect branch speculation) with the value
  workaround. Previously the only valid values were broken, fixed-ibs
  (indirect branch serialisation) and fixed-ccd (count cache disabled).
  We also add a new cap, cap-ccf-assist (count cache flush assist), to
  specify the availability of the hardware assisted flush variant.

  Note that the way spapr caps work, you can migrate to a host that
  supports a higher value, but not to one which doesn't support the
  current value (i.e. only supports lower values). For cap-ibs these
  are defined as:
  0 - Broken
  1 - Workaround
  2 - fixed-ibs
  3 - fixed-ccd

  So the following migrations would be valid, for example:
  broken -> fixed-ccd, broken -> workaround, workaround -> fixed-ccd

  While the following would be invalid:
  fixed-ccd -> workaround, workaround -> broken, fixed-ccd -> broken

  This is done to maintain at least the level of protection specified on
  the command line on migration.

  Since the workaround must be communicated to the guest kernel at boot,
  we cannot migrate a guest from a host with fixed-ccd to one with
  workaround, since the guest wouldn't know to do the flush and so would
  be wholly unprotected. This means that migrating a guest from 2.2 and
  before to 2.3 requires the guest either to have been booted with
  broken previously, or to be rebooted with workaround specified on the
  command line, which would allow the migration to a 2.3 to succeed.

  == MICHAEL D. ROTH ==

  I've tested a backport of count-cache-flush support consisting of the
  following patches applied (cleanly) on top of bionic's QEMU
  2.11+dfsg-1ubuntu7.14 source:

    target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST
    ppc/spapr-caps: Change migration macro to take full spapr-cap name
    target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
    target/ppc: Factor out the parsing in kvmppc_get_cpu_characteristics()

  The following tests were done using a DD 2.3 Witherspoon machine and
  the results seem to align with what's expected in the original
  summary:

  == enablement tests (using 4.15.0-51-generic in both host and guests) ==

  with cap-ibs=workaround,cap-ccf-assist=on:

    mdroth@ubuntu:~$ dmesg | grep cache-flush
    [ 0.000000] count-cache-flush: hardware assisted flush sequence enabled

  with cap-ibs=workaround,cap-ccf-assist=off:

    mdroth@ubuntu:~$ dmesg | grep cache-flush
    [ 0.000000] count-cache-flush: full software flush sequence enabled.

  with cap-ibs=broken:

    mdroth@ubuntu:~$ dmesg | grep cache-flush
    [ 0.000000] count-cache-flush: software flush disabled.

  == migration tests (using 4.15.0-51-generic in both host and guests) ==

  Note that pseries-2.11-sxxm/bionic-sxxm defaults to:

    smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_WORKAROUND;
    smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_WORKAROUND;
    smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_FIXED_CCD;

  but SPAPR_CAP_FIXED_CCD is not available on the DD 2.3 system I tested
  on (no fw-count-cache-disabled/enabled in host fw-features device
  tree), so I used pseries-2.11-sxxm,cap-ibs=broken as the base-level

  cross-migration: qemu 2.11+dfsg-1ubuntu7.14 -> 2.11+dfsg-1ubuntu7.14+ccf-backport

    source: -M bionic-sxxm,cap-ibs=broken
      target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=off
        expected: warning
        actual: warning "cap-ibs lower level (0) in incoming stream than on destination (1))"
        software ccf enabled after reboot? yes
      target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=on
        expected: warning
        actual: warning "cap-ccf-assist lower level (0) in incoming stream than on destination (1))"
        hardware ccf enabled after reboot? yes
      target: -M bionic-sxxm,cap-ibs=broken
        expected: success
        actual: success

  migration: 2.11+dfsg-1ubuntu7.14+ccf-backport -> 2.11+dfsg-1ubuntu7.14+ccf-backport

    source: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=off
      target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=off
        expected: success
        actual: success
      target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=on
        expected: warning
        actual: warning "cap-ccf-assist lower level (0) in incoming stream than on destination (1)"
        hardware ccf enabled after reboot? yes
      target: -M bionic-sxxm,cap-ibs=broken
        expected: fail
        actual: fail "cap-ibs higher level (1) in incoming stream than on destination (0)"

    source: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=on
      target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=on
        expected: success
        actual: success
      target: -M bionic-sxxm,cap-ibs=workaround,cap-ccf-assist=off
        expected: fail
        actual: fail "cap-ccf-assist higher level (1) in incoming stream than on destination (0)"
      target: cap-ibs=broken
        expected: fail
        actual: fail "cap-ibs higher level (1) in incoming stream than on destination (0)"
                     "cap-ccf-assist higher level (1) in incoming stream than on destination (0)"

  Sorry, I forgot that I needed some fix-ups for the 4th/last patch,
  "target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST". I've gone ahead and
  posted my git tree, which is based on top of the
  qemu_2.11+dfsg-1ubuntu7.14 source, so the 4 patches there should apply
  cleanly. The commit notes describe what changes were needed for
  patch 4.
  https://github.com/mdroth/qemu/commits/spectre-ccf-ubuntu-bionic-1ubuntu7.14
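
  For reference, the spapr-cap ordering rule from the bug description (a
  migration is refused when the incoming stream carries a higher cap
  level than the destination supports) can be sketched roughly as below.
  This is only an illustrative Python model, not QEMU code: the names
  CAP_IBS_LEVELS, check_cap_migration and check_cap_ibs are hypothetical,
  the level numbers are the cap-ibs values listed in the description, and
  the success/warning/fail labels simply mirror the outcomes in the test
  notes above.

```python
# Illustrative model of the spapr-cap migration rule; NOT QEMU code.
# Level numbers for cap-ibs are the ones listed in the bug description.
CAP_IBS_LEVELS = {
    "broken": 0,
    "workaround": 1,
    "fixed-ibs": 2,
    "fixed-ccd": 3,
}


def check_cap_migration(src_level: int, dst_level: int) -> str:
    """Classify a migration by comparing source and destination cap levels.

    - source higher than destination: "fail"
    - source lower than destination:  "warning" (allowed, but the guest
      keeps the behaviour it booted with until it is rebooted)
    - equal levels:                   "success"
    """
    if src_level > dst_level:
        return "fail"
    if src_level < dst_level:
        return "warning"
    return "success"


def check_cap_ibs(src: str, dst: str) -> str:
    """Hypothetical helper: apply the rule to named cap-ibs values."""
    return check_cap_migration(CAP_IBS_LEVELS[src], CAP_IBS_LEVELS[dst])


if __name__ == "__main__":
    pairs = [
        ("broken", "workaround"),
        ("workaround", "workaround"),
        ("workaround", "broken"),
    ]
    for src, dst in pairs:
        print(f"cap-ibs {src} -> {dst}: {check_cap_ibs(src, dst)}")
```

  The same comparison applies to cap-ccf-assist with off = 0 and on = 1,
  which matches the levels shown in the migration messages quoted above.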