On Mon, Jan 30, 2023 at 08:03:52PM +0000, Mike Larkin wrote:
> On Mon, Jan 30, 2023 at 12:32:22PM -0500, Dave Voutila wrote:
> > vmd's serial console has a race condition and can generate interrupt
> > storms as the ns8250 device constantly asserts and deasserts its irq.
> > 
> > Easiest way to see this is on older, slower hardware OR running nested
> > such as OpenBSD guest inside OpenBSD vmm atop Linux KVM. I don't know
> > enough about how consoles work to understand why, but if you use a shell
> > with history that lets you hit the up-arrow it can trigger this fairly
> > consistently.
> > 
> > When triggered, you will experience the serial console "lock up" and the
> > vmd process for the vm spike > 100% (the cpu thread is constantly
> > exiting due to interrupt and the event thread is constantly firing
> > ioctls). Anywhere from 10s to a few minutes later, it may stop and
> > control in the console resumes.
> > 
> > btrace shows the vm exit rate spike with the exit reason being external
> > interrupt.
> > 
> > AFAICT what happens is the kevent for data available on the com1 fd
> > fires and, if it was !EV_WRITE, we assert/deassert the irq even if we
> > weren't ready to actually receive data off the fd. This keeps kicking
> > the poor vcpu for no reason making it slower to get the ns8250 into a
> > state where it is slow to get back into a LSR_RXRDY state.
> > 
> > OK?
> > 
> 
> This needs to be widely tested. IIRC there were old linux kernels that
> behaved differently and were sensitive to where/when irqs were injected
> for the serial ports, leading to various issues like needing to press a
> key for the console to advance, and/or seeing "too much work" messages
> from the kernel.
> 
> I'd recommend testing on a bunch of platforms.
> 

I'm happy with it, I tested on a few old linux and don't see any
problems.

> -ml
> 
> > 
> > diff refs/heads/master refs/heads/vmd-ns8250
> > commit - ef14a9e8cae106563ff9ce15d913365f9ad3fa0e
> > commit + 5c76914a8c33243ec5ccc82689d0dadaa7cae666
> > blob - dbb6568714c192447a99017fcf92bcc4bcc90ba6
> > blob + 96e1f2533691205bdeb8e8b6dafee92603494c44
> > --- usr.sbin/vmd/ns8250.c
> > +++ usr.sbin/vmd/ns8250.c
> > @@ -153,14 +153,15 @@ com_rcv_event(int fd, short kind, void *arg)
> >             return;
> >     }
> > 
> > -   if ((com1_dev.regs.lsr & LSR_RXRDY) == 0)
> > +   if ((com1_dev.regs.lsr & LSR_RXRDY) == 0) {
> >             com_rcv(&com1_dev, (uintptr_t)arg, 0);
> > 
> > -   /* If pending interrupt, inject */
> > -   if ((com1_dev.regs.iir & IIR_NOPEND) == 0) {
> > -           /* XXX: vcpu_id */
> > -           vcpu_assert_pic_irq((uintptr_t)arg, 0, com1_dev.irq);
> > -           vcpu_deassert_pic_irq((uintptr_t)arg, 0, com1_dev.irq);
> > +           /* If pending interrupt, inject */
> > +           if ((com1_dev.regs.iir & IIR_NOPEND) == 0) {
> > +                   /* XXX: vcpu_id */
> > +                   vcpu_assert_pic_irq((uintptr_t)arg, 0, com1_dev.irq);
> > +                   vcpu_deassert_pic_irq((uintptr_t)arg, 0, com1_dev.irq);
> > +           }
> >     }
> > 
> >     mutex_unlock(&com1_dev.mutex);

Reply via email to