On 04.01.2023 12:11, Andrew Cooper wrote:
> Refer to Intel SDM Rev 70 (Dec 2022), Vol3 17.2.3 "Supervisor Shadow Stack
> Token".
> 
> Architecturally, an event delivery which starts in CPL<3 and switches shadow
> stack will first validate the Supervisor Shadow Stack Token (setting the busy
> bit), then pushes CS/LIP/SSP.  One example of this is an NMI interrupting Xen.
> 
> Some CPUs suffer from an issue called fracturing, whereby a fault/vmexit/etc
> between setting the busy bit and completing the event injection renders the
> action non-restartable, because when it comes time to restart, the busy bit is
> found to be already set.
> 
> This is far more easily encountered under virt, yet it is not the fault of the
> hypervisor, nor the fault of the guest kernel.  The fault lies somewhere
> between the architectural specification, and the uarch behaviour.
> 
> Intel have allocated CPUID.7[1].ecx[18] CET_SSS to enumerate that supervisor
> shadow stacks are safe to use.  Because of how Xen lays out its shadow stacks,
> fracturing is not expected to be a problem on native.

IOW that's the "contained in an aligned 32-byte region" constraint which we
meet, aiui.

> Detect this case on boot and default to not using shstk if virtualised.
> Specifying `cet=shstk` on the command line will override this heuristic and
> enable shadow stacks irrespective.
> 
> Signed-off-by: Andrew Cooper <[email protected]>

Reviewed-by: Jan Beulich <[email protected]>
with one nit (below).

> This ideally wants backporting to Xen 4.14.  I have no idea how likely it is
> to need to backport the prerequisite patch for new feature words, but we've
> already had to do that once for security patches.  OTOH, I have no idea how
> easy it is to trigger in non-synthetic cases.

Plus: How likely is it that Xen actually is used virtualized in production?
For the moment I don't see any reason to backport to branches in security-
only maintenance mode. I'm not even sure it needs backporting at all.

> @@ -1099,11 +1095,45 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>      early_cpu_init();
>  
>      /* Choose shadow stack early, to set infrastructure up appropriately. */
> -    if ( opt_xen_shstk && boot_cpu_has(X86_FEATURE_CET_SS) )
> +    if ( !boot_cpu_has(X86_FEATURE_CET_SS) )
> +        opt_xen_shstk = 0;
> +
> +    if ( opt_xen_shstk )
>      {
> -        printk("Enabling Supervisor Shadow Stacks\n");
> +        /*
> +         * Some CPUs suffer from Shadow Stack Fracturing, an issue whereby a
> +         * fault/VMExit/etc between setting a Supervisor Busy bit and the
> +         * event delivery completing renders the operation non-restartable.
> +         * On restart, event delivery will find the Busy bit already set.
> +         *
> +         * This is a problem on bare metal, but outside of synthetic cases or
> +         * a very badly timed #MC, it's not believed to problem.  It is a 
> much

Nit: "... to be a problem."

Jan

Reply via email to