On 02/03/2026 3:34 pm, Jan Beulich wrote:
> On 02.03.2026 16:20, Andrew Cooper wrote:
>> On 27/02/2026 11:16 pm, Andrew Cooper wrote:
>>> Lets just say this took an unreasoanble amount of time and effort to track
>>> down, when trying to move traps_init() earlier during boot.
>>>
>>> When the SYSCALL linkage MSRs are not configured ahead of _svm_cpu_up() on 
>>> the
>>> BSP, the first context switch into PV uses svm_load_segs() and clobbers the
>>> later-set-up linkage with the 0's cached here, causing hypercalls issues by
>>> the PV guest to enter at 0 in supervisor mode on the user stack.
>>>
>>> Signed-off-by: Andrew Cooper <[email protected]>
>>> ---
>>> CC: Jan Beulich <[email protected]>
>>> CC: Roger Pau Monné <[email protected]>
>>>
>>> v4:
>>>  * New
>>>
>>> It occurs to me that it's not actually 0's we cache here.  It's whatever
>>> context was left from prior to Xen.  We still don't reliably clean unused
>>> MSRs.
> Actually, with this, ...
>
>>> --- a/xen/arch/x86/hvm/svm/svm.c
>>> +++ b/xen/arch/x86/hvm/svm/svm.c
>>> @@ -35,6 +35,7 @@
>>>  #include <asm/p2m.h>
>>>  #include <asm/paging.h>
>>>  #include <asm/processor.h>
>>> +#include <asm/traps.h>
>>>  #include <asm/vm_event.h>
>>>  #include <asm/x86_emulate.h>
>>>  
>>> @@ -1581,6 +1582,21 @@ static int _svm_cpu_up(bool bsp)
>>>      /* Initialize OSVW bits to be used by guests */
>>>      svm_host_osvw_init();
>>>  
>>> +    /*
>>> +     * VMSAVE writes out the current full FS, GS, LDTR and TR segments, and
>>> +     * the GS_SHADOW, SYSENTER and SYSCALL linkage MSRs.
>>> +     *
>>> +     * The segment data gets modified by the svm_load_segs() optimisation 
>>> for
>>> +     * PV context switches, but all values get reloaded at that point, as 
>>> well
>>> +     * as during context switch from SVM.
>>> +     *
>>> +     * If PV guests are available (and FRED is not in use), it is critical
>>> +     * that the SYSCALL linkage MSRs been configured at this juncture.
>>> +     */
>>> +    ASSERT(opt_fred >= 0); /* Confirm that FRED-ness has been resolved */
>>> +    if ( IS_ENABLED(CONFIG_PV) && !opt_fred )
>>> +        ASSERT(rdmsr(MSR_LSTAR));
>> It has occurred to me that this is subtly wrong.  While FRED doesn't use
>> LSTAR/SFMASK, it does reuse STAR.
>>
>> So this needs to be:
>>
>>     if ( IS_ENABLED(CONFIG_PV) )
>>         ASSERT(rdmsr(MSR_STAR));
>>
>> with the include dropped, as the final sentence adjusted to say "even
>> with FRED".
> ... if we inherit a non-zero value, is the assertion of much use this way?

The inherited case is normally when we're KEXEC'd into.  That doesn't
happen very much with Xen, and is more of a concern with Linux.

But yes, IMO the assertion is still useful.  CI boots from clean, so the
ASSERT() will catch accidental code movement which violates the dependency.

~Andrew

Reply via email to