On Thu, Oct 26, 2023 at 09:59:42AM +0200, Jan Beulich wrote:
> On 24.10.2023 16:53, Roger Pau Monne wrote:
> > Sporadically we have seen the following during AP bringup on AMD platforms
> > only:
> > 
> > microcode: CPU59 updated from revision 0x830107a to 0x830107a, date = 
> > 2023-05-17
> > microcode: CPU60 updated from revision 0x830104d to 0x830107a, date = 
> > 2023-05-17
> > CPU60: No irq handler for vector 27 (IRQ -2147483648)
> > microcode: CPU61 updated from revision 0x830107a to 0x830107a, date = 
> > 2023-05-17
> > 
> > This is similar to the issue raised on Linux commit 36e9e1eab777e, where 
> > they
> > observed i8259 (active) vectors getting delivered to CPUs different than 0.
> > 
> > On AMD or Hygon platforms adjust the target CPU mask of i8259 interrupt
> > descriptors to contain all possible CPUs, so that APs will reserve the 
> > vector
> > at startup if any legacy IRQ is still delivered through the i8259.  Note 
> > that
> > if the IO-APIC takes over those interrupt descriptors the CPU mask will be
> > reset.
> > 
> > Spurious i8259 interrupt vectors however (IRQ7 and IRQ15) can be injected 
> > even
> > when all i8259 pins are masked, and hence would need to be handled on all 
> > CPUs.
> > 
> > Continue to reserve PIC vectors on CPU0 only, but do check for such spurious
> > interrupts on all CPUs if the vendor is AMD or Hygon.  Note that once the
> > vectors get used by devices detecting PIC spurious interrupts will no 
> > longer be
> > possible, however the device driver should be able to cope with spurious
> > interrupts.  Such PIC spurious interrupts occurring when the vector is in 
> > use
> > by a local APIC routed source will lead to an extra EOI, which might
> > unintentionally clear a different vector from ISR.  Note this is already the
> > current behavior, so assume it's infrequent enough to not cause real issues.
> > 
> > Finally, adjust the printed message to display the CPU where the spurious
> > interrupt has been received, so it looks like:
> > 
> > microcode: CPU1 updated from revision 0x830107a to 0x830107a, date = 
> > 2023-05-17
> > cpu1: spurious 8259A interrupt: IRQ7
> > microcode: CPU2 updated from revision 0x830104d to 0x830107a, date = 
> > 2023-05-17
> > 
> > Amends: 3fba06ba9f8b ('x86/IRQ: re-use legacy vector ranges on APs')
> > Signed-off-by: Roger Pau Monné <[email protected]>
> 
> Reviewed-by: Jan Beulich <[email protected]>
> with one nit (which I think can be taken care of when committing):
> 
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -1920,7 +1920,16 @@ void do_IRQ(struct cpu_user_regs *regs)
> >                  kind = "";
> >              if ( !(vector >= FIRST_LEGACY_VECTOR &&
> >                     vector <= LAST_LEGACY_VECTOR &&
> > -                   !smp_processor_id() &&
> > +                   (!smp_processor_id() ||
> > +                    /*
> > +                     * For AMD/Hygon do spurious PIC interrupt
> > +                     * detection on all CPUs, as it has been observed
> > +                     * that during unknown circumstances spurious PIC
> > +                     * interrupts have been delivered to CPUs
> > +                     * different than the BSP.
> > +                     */
> > +                   (boot_cpu_data.x86_vendor & (X86_VENDOR_AMD |
> > +                                                X86_VENDOR_HYGON))) &&
> 
> Afaict these two lines need indenting by one more blank, to account
> for the parentheses enclosing the || operands.

Indeed, please adjust at commit if you don't mind.

Thanks, Roger.

Reply via email to