On 03/01/2020 13:52, Jan Beulich wrote:
> On 03.01.2020 14:44, Andrew Cooper wrote:
>> On 03/01/2020 13:36, Jan Beulich wrote:
>>> On 02.01.2020 15:59, Andrew Cooper wrote:
>>>> @@ -111,26 +109,6 @@ trampoline_protmode_entry:
>>>>  start64:
>>>>          /* Jump to high mappings. */
>>>>          movabs  $__high_start, %rdi
>>>> -
>>>> -#ifdef CONFIG_INDIRECT_THUNK
>>>> -        /*
>>>> -         * If booting virtualised, or hot-onlining a CPU, sibling threads can
>>>> -         * attempt Branch Target Injection against this jmp.
>>>> -         *
>>>> -         * We've got no usable stack so can't use a RETPOLINE thunk, and are
>>>> -         * further than disp32 from the high mappings so couldn't use
>>>> -         * JUMP_THUNK even if it was a non-RETPOLINE thunk.  Furthermore, an
>>>> -         * LFENCE isn't necessarily safe to use at this point.
>>>> -         *
>>>> -         * As this isn't a hotpath, use a fully serialising event to reduce
>>>> -         * the speculation window as much as possible.  %ebx needs preserving
>>>> -         * for __high_start.
>>>> -         */
>>>> -        mov     %ebx, %esi
>>>> -        cpuid
>>>> -        mov     %esi, %ebx
>>>> -#endif
>>>> -
>>>>          jmpq    *%rdi
>>> I can see this being unneeded when running virtualized, as you said
>>> in reply to Wei.  However, for hot-onlining (when other CPUs may run
>>> random vCPU-s) I don't see how this can safely be dropped.  There's
>>> no similar concern for S3 resume, as thaw_domains() happens only
>>> after enable_nonboot_cpus().
>> I covered that in the same reply.  Any guest which can use branch target
>> injection against this jmp can also poison the regular branch predictor
>> and get at data that way.
> Aren't you implying then that retpolines could also be dropped?
No.  It is a simple risk vs complexity tradeoff.

Guests running on a sibling *can already* attack this branch with BTI,
because CPUID isn't a fix to bad BTB speculation, and the leakage gadget
need only be a single instruction.  Such a guest can also attack Xen in
general with Spectre v1.

As I said - this was introduced because of paranoia, back while the few
people who knew about the issues (only several hundred at the time) were
attempting to figure out what exactly a speculative attack looked like,
and were applying duct tape to everything suspicious, because we had
zero time to rewrite several core pieces of system handling.

>> Once again, we get to CPU Hotplug being an unused feature in practice,
>> which is completely evident now with Intel MCE behaviour.
> What does Intel's MCE behavior have to do with whether CPU hotplug
> (or hot-onlining) is (un)used in practice?

It is the logical consequence of hotplug breaking MCEs.  If hotplug had
been used in practice, the MCE behaviour would have come to light much
sooner, because MCEs would visibly not have worked on those systems.
Given that MCEs really did work in practice even before the L1TF days,
hotplug wasn't in common-enough use for anyone to notice the MCE
behaviour.

>> A guest can't control/guess when a hotplug event might occur, or where
>> exactly this branch is in memory (after all - it is variable based on
>> the position of the trampoline), and core scheduling mitigates the risk
>> entirely.
> "... will mitigate ..." - it's experimental up to now, isn't it?

Core scheduling ought to prevent the problem entirely.  Even the current
code is not safe in the absence of core scheduling.

~Andrew

_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xenproject.org/mailman/listinfo/xen-devel
