On 28.02.2026 00:16, Andrew Cooper wrote:
> When FRED is active, hardware automatically swaps GS when changing privilege,
> and the SWAPGS instruction is disallowed.
> 
> For native OSes using GS as the thread local pointer this is a massive
> improvement on the pre-FRED architecture, but under Xen it makes handling PV
> guests more complicated.  Specifically, it means that GS_BASE and GS_SHADOW
> are the opposite way around in FRED mode, as opposed to IDT mode.
> 
> This leads to the following changes:
> 
>   * In load_segments(), we have to load both GSes.  Account for this in the
>     SWAP() condition and avoid the path with SWAGS.
> 
>   * In save_segments(), we need to read GS_SHADOW rather than GS_BASE.
> 
>   * In toggle_guest_mode(), we need to emulate SWAPGS.
> 
>   * In {read,write}_msr() which access the live registers, GS_SHADOW and
>     GS_BASE need swapping.
> 
>   * In do_set_segment_base(), merge the SEGBASE_GS_{USER,KERNEL} cases and
>     take FRED into account when choosing which base to update.
> 
>     SEGBASE_GS_USER_SEL was already an LKGS invocation (decades before FRED)
>     so under FRED needs to be just a MOV %gs.  Simply skip the SWAPGSes.
> 
> Signed-off-by: Andrew Cooper <[email protected]>
> Reviewed-by: Jan Beulich <[email protected]>
> ---
> CC: Jan Beulich <[email protected]>
> CC: Roger Pau MonnĂ© <[email protected]>
> 
> v4:
>  * Adjust GS accesses for emulated {RD,WR}MSR too.

I think this ...

> @@ -926,7 +927,7 @@ static int cf_check read_msr(
>      case MSR_GS_BASE:
>          if ( !cp->extd.lm )
>              break;
> -        *val = read_gs_base();
> +        *val = opt_fred ? rdmsr(MSR_SHADOW_GS_BASE) : read_gs_base();
>          return X86EMUL_OKAY;

... calls for a comment both here and ...

> @@ -1066,17 +1067,22 @@ static int cf_check write_msr(
>          if ( !cp->extd.lm || !is_canonical_address(val) )
>              break;
>  
> -        if ( reg == MSR_FS_BASE )
> -            write_fs_base(val);
> -        else if ( reg == MSR_GS_BASE )
> -            write_gs_base(val);
> -        else if ( reg == MSR_SHADOW_GS_BASE )
> +        switch ( reg )
>          {
> -            write_gs_shadow(val);
> +        case MSR_FS_BASE:
> +            write_fs_base(val);
> +            break;
> +
> +        case MSR_SHADOW_GS_BASE:
>              curr->arch.pv.gs_base_user = val;
> +            fallthrough;
> +        case MSR_GS_BASE:
> +            if ( (reg == MSR_GS_BASE) ^ opt_fred )
> +                write_gs_base(val);
> +            else
> +                write_gs_shadow(val);
> +            break;

... here, much like you do about everywhere else.

> @@ -192,11 +193,12 @@ long do_set_segment_base(unsigned int which, unsigned 
> long base)
>  
>          case SEGBASE_GS_USER:
>              v->arch.pv.gs_base_user = base;
> -            write_gs_shadow(base);
> -            break;
> -
> +            fallthrough;
>          case SEGBASE_GS_KERNEL:
> -            write_gs_base(base);
> +            if ( (which == SEGBASE_GS_KERNEL) ^ opt_fred )
> +                write_gs_base(base);
> +            else
> +                write_gs_shadow(base);
>              break;
>          }
>          break;

Same perhaps here, and ...

> @@ -209,7 +211,8 @@ long do_set_segment_base(unsigned int which, unsigned 
> long base)
>           * We wish to update the user %gs from the GDT/LDT.  Currently, the
>           * guest kernel's GS_BASE is in context.
>           */
> -        asm volatile ( "swapgs" );
> +        if ( !opt_fred )
> +            asm volatile ( "swapgs" );

... the comment in context could also do with inserting "unless using FRED"
or some such.

Jan

Reply via email to