On 16/05/2022 15:31, Roger Pau Monne wrote:
> Booting with Shadow Stacks leads to the following assert on a debug
> hypervisor:
>
> (XEN) [   11.625166] Assertion 'local_irq_is_enabled()' failed at arch/x86/smp.c:265
> (XEN) [   11.629410] ----[ Xen-4.17.0-10.24-d  x86_64  debug=y  Not tainted ]----
> (XEN) [   11.633679] CPU:    0
> (XEN) [   11.637834] RIP:    e008:[<ffff82d040345300>] flush_area_mask+0x40/0x13e
> [...]
> (XEN) [   11.806158] Xen call trace:
> (XEN) [   11.811255]    [<ffff82d040345300>] R flush_area_mask+0x40/0x13e
> (XEN) [   11.816459]    [<ffff82d040338a40>] F modify_xen_mappings+0xc5/0x958
> (XEN) [   11.821689]    [<ffff82d0404474f9>] F arch/x86/alternative.c#_alternative_instructions+0xb7/0xb9
> (XEN) [   11.827053]    [<ffff82d0404476cc>] F alternative_branches+0xf/0x12
> (XEN) [   11.832416]    [<ffff82d04044e37d>] F __start_xen+0x1ef4/0x2776
> (XEN) [   11.837809]    [<ffff82d040203344>] F __high_start+0x94/0xa0
>
>
> This is due to SYS_STATE_smp_boot being set before calling
> alternative_branches(), and the flush in modify_xen_mappings() then
> using flush_area_all() with interrupts disabled.  Note that
> alternative_branches() is called before APs are started, so the flush
> must be a local one (and indeed the cpumask passed to
> flush_area_mask() just contains one CPU).
>
> Take the opportunity to simplify the logic a bit and make flush_area()
> an alias for flush_area_mask(&cpu_online_map...), taking into account
> that cpu_online_map just contains the BSP before APs are started.
> This requires widening the assert in flush_area_mask() to allow
> being called with interrupts disabled as long as it's strictly a
> local-only flush.
>
> The overall result is that a conditional can be removed from
> flush_area().
>
> Fixes: 78e072bc37 ('x86/mm: avoid inadvertently degrading a TLB flush to local only')
> Suggested-by: Andrew Cooper <[email protected]>
> Signed-off-by: Roger Pau Monné <[email protected]>

Tentatively Acked-by: Andrew Cooper <[email protected]>

This seems like the least bad option of a lot of bad options.  I'd say
it's more than just removing a conditional from flush_area(); it's
removing a runtime special case for init-time code.

~Andrew
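
[Illustration, not part of the patch.] The widened assertion described
in the commit message can be modelled as below. This is only a sketch
under stated assumptions: the model_* names, the 64-bit bitmask standing
in for a CPU mask, and the boolean interrupt flag are all hypothetical
and are not Xen's actual code. It shows the rule that a flush is
acceptable with interrupts enabled, or with interrupts disabled when the
mask names no CPU other than the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU mask: bit N set means CPU N is in the mask (N < 64). */
typedef uint64_t model_cpumask_t;

/* Mask containing only the given CPU. */
static model_cpumask_t model_mask_of(unsigned int cpu)
{
    return (model_cpumask_t)1 << cpu;
}

/* True if every CPU in 'mask' is also present in 'other'. */
static bool model_mask_subset(model_cpumask_t mask, model_cpumask_t other)
{
    return (mask & ~other) == 0;
}

/*
 * Widened precondition: allowed with interrupts enabled, or with
 * interrupts disabled if the mask targets at most the calling CPU.
 */
static bool model_flush_allowed(bool irqs_enabled, model_cpumask_t mask,
                                unsigned int this_cpu)
{
    return irqs_enabled || model_mask_subset(mask, model_mask_of(this_cpu));
}

/*
 * Models flush_area() as a mask flush over the online map, with no
 * special case for early boot: the check is simply forwarded.
 */
static bool model_flush_area_allowed(bool irqs_enabled,
                                     model_cpumask_t online_map,
                                     unsigned int this_cpu)
{
    return model_flush_allowed(irqs_enabled, online_map, this_cpu);
}

/* Boot-time case from the report: only the BSP is online, IRQs are off. */
static void model_example(void)
{
    assert(model_flush_area_allowed(false, model_mask_of(0), 0));
}
```

In this model the boot-time call passes because the online map holds
only CPU 0, while the same call with a second CPU online and interrupts
disabled would be rejected.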
