On 28.08.2025 17:04, Andrew Cooper wrote:
> With the shadow stack and exception handling adjustements in place, we can now
> activate FRED when appropriate.  Note that opt_fred is still disabled by
> default.
> 
> Introduce init_fred() to set up all the MSRs relevant for FRED.  FRED uses
> MSR_STAR (entries from Ring3 only), and MSR_FRED_SSP_SL0 aliases MSR_PL0_SSP
> when CET-SS is active.  Otherwise, they're all new MSRs.
> 
> With init_fred() existing, load_system_tables() and legacy_syscall_init()
> should only be used when setting up IDT delivery.  Insert ASSERT()s to this
> effect, and adjust the various *_init() functions to make this property true.
> 
> Per the documentation, ap_early_traps_init() is responsible for switching off
> the boot GDT, which needs doing even in FRED mode.
> 
> Finally, set CR4.FRED in {bsp,ap}_early_traps_init().

Nit: That's {bsp,percpu}_traps_init() now, aiui.

> --- a/xen/arch/x86/include/asm/traps.h
> +++ b/xen/arch/x86/include/asm/traps.h
> @@ -16,6 +16,8 @@ void traps_init(void);
>  void bsp_traps_reinit(void);
>  void percpu_traps_init(void);
>  
> +void nocall entry_FRED_R3(void);

Can't we constrain this decl to the sole C file needing it?

> @@ -268,6 +272,52 @@ static void __init init_ler(void)
>      setup_force_cpu_cap(X86_FEATURE_XEN_LBR);
>  }
>  
> +/*
> + * Set up all MSRs relevant for FRED event delivery.
> + *
> + * Xen does not use any of the optional config in MSR_FRED_CONFIG, so all 
> that
> + * is needed is the entrypoint.
> + *
> + * Because FRED always provides a good stack, NMI and #DB do not need any
> + * special treatment.  Only #DF needs another stack level, and #MC for the
> + * offchance that Xen's main stack suffers an uncorrectable error.
> + *
> + * This makes Stack Level 1 unused, but we use #DB's stacks, and with the
> + * regular and shadow stacks reversed as posion to guarantee that any use
> + * escalates to #DF.
> + *
> + * FRED reuses MSR_STAR to provide the segment selector values to load on
> + * entry from Ring3.  Entry from Ring0 leave %cs and %ss unmodified.
> + */
> +static void init_fred(void)
> +{
> +    unsigned long stack_top = get_stack_bottom() & ~(STACK_SIZE - 1);
> +
> +    ASSERT(opt_fred == 1);
> +
> +    wrmsrns(MSR_STAR, XEN_MSR_STAR);
> +    wrmsrns(MSR_FRED_CONFIG, (unsigned long)entry_FRED_R3);
> +
> +    /*
> +     * MSR_FRED_RSP_* all come with an 64-byte alignment check, avoiding the
> +     * need for an explicit BUG_ON().
> +     */
> +    wrmsrns(MSR_FRED_RSP_SL0, (unsigned long)(&get_cpu_info()->_fred + 1));
> +    wrmsrns(MSR_FRED_RSP_SL1, stack_top + (IST_DB * IST_SHSTK_SIZE)); /* 
> Poison */

So the use of IST_SHSTK_SIZE here (and PAGE_SIZE below) is to have the SL1
stacks "the wrong way round". I first thought this was a mistake, not the
least also due to the typo below. I think this wants commenting upon.

> +    wrmsrns(MSR_FRED_RSP_SL2, stack_top + (1 + IST_MCE)  * PAGE_SIZE);
> +    wrmsrns(MSR_FRED_RSP_SL3, stack_top + (1 + IST_DF)   * PAGE_SIZE);
> +    wrmsrns(MSR_FRED_STK_LVLS, ((2UL << (X86_EXC_MC * 2)) |
> +                                (3UL << (X86_EXC_DF * 2))));
> +
> +    if ( cpu_has_xen_shstk )
> +    {
> +        wrmsrns(MSR_FRED_SSP_SL0, stack_top + (PRIMARY_SHSTK_SLOT + 1) * 
> PAGE_SIZE);
> +        wrmsrns(MSR_FRED_RSP_SL1, stack_top + (1 + IST_DF)  * PAGE_SIZE); /* 
> Poison */

MSR_FRED_SSP_SL1 and presumably IST_DB?

Also (nit) both of these lines are too long; the double blank ahead of * on
the latter one probably also wants dropping.

> +        wrmsrns(MSR_FRED_SSP_SL2, stack_top + (IST_MCE * IST_SHSTK_SIZE));
> +        wrmsrns(MSR_FRED_SSP_SL3, stack_top + (IST_DF  * IST_SHSTK_SIZE));
> +    }
> +}

Because of the intentional asymmetry for SL1, maybe the writing of
MSR_FRED_STK_LVLS would better move below here?

> @@ -356,8 +410,13 @@ void __init traps_init(void)
>   */
>  void __init bsp_traps_reinit(void)
>  {
> -    load_system_tables();
> -    percpu_traps_init();
> +    if ( opt_fred )
> +        init_fred();
> +    else
> +    {
> +        load_system_tables();
> +        percpu_traps_init();

Doesn't this need to stay outside of the if/else, ...

> +    }
>  }
>  
>  /*
> @@ -366,7 +425,8 @@ void __init bsp_traps_reinit(void)
>   */
>  void percpu_traps_init(void)
>  {
> -    legacy_syscall_init();
> +    if ( !opt_fred )
> +        legacy_syscall_init();
>  
>      if ( cpu_has_xen_lbr )
>          wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);

... for this to still be done?

Jan

Reply via email to