On Wed, Nov 18, 2020 at 11:00:25AM +0100, Roger Pau Monné wrote:
> On Wed, Nov 18, 2020 at 10:24:25AM +0100, Manuel Bouyer wrote:
> > On Wed, Nov 18, 2020 at 09:57:38AM +0100, Roger Pau Monné wrote:
> > > On Tue, Nov 17, 2020 at 05:40:33PM +0100, Manuel Bouyer wrote:
> > > > On Tue, Nov 17, 2020 at 04:58:07PM +0100, Roger Pau Monné wrote:
> > > > > [...]
> > > > >
> > > > > I have attached a patch below that will dump the vIO-APIC info as part
> > > > > of the 'i' debug key output, can you paste the whole output of the 'i'
> > > > > debug key when the system stalls?
> > > >
> > > > see attached file. Note that the kernel did unstall while 'i' output was
> > > > being printed, so it is mixed with some NetBSD kernel output.
> > > > The idt entry of the 'ioapic2 pin2' interrupt is 103 on CPU 0.
> > > >
> > > > I also put the whole sequence at
> > > > http://www-soc.lip6.fr/~bouyer/xen-log3.txt
> > >
> > > On one of the instances the pin shows up as masked, but I'm not sure
> > > if that's relevant since later it shows up as unmasked. Might just be
> > > part of how NetBSD handles such interrupts.
> >
> > Yes, NetBSD can mask an interrupt source if the interrupts needs to be
> > delayed.
> > It will be unmasked once the interrupt has been handled.
>
> Yes, I think that's roughly the same model that FreeBSD uses for
> level IO-APIC interrupts: mask it until the handlers have been run.
>
> > Would it be possible that Xen misses an unmask write, or fails to
> > call the vector if the interrupt is again pending at the time of the
> > unmask ?
>
> Well, it should work properly, but we cannot discard anything.
I did some more instrumentation from the NetBSD kernel, including dumping
the ioapic2 pin2 register.
At the time of the command timeout, the register value is 0x0000a067,
which, if I understand it properly, means that there's no interrupt
pending (bit IOAPIC_REDLO_RIRR, 0x00004000, is not set).
From the NetBSD ddb, I can dump this register multiple times, waiting
several seconds, etc.; it doesn't change.
Now if I call ioapic_dump_raw() from the debugger, which triggers some
Xen printfs:
db{0}> call ioapic_dump_raw^M
Register dump of ioapic0^M
[ 203.5489060] 00 08000000 00170011 08000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 3
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 4
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 5
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 6
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 7
00000000^M
[ 203.5489060] 08(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 8
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 9
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel a
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel b
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel c
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel d
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel e
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel f
00000000^M
[ 203.5489060] 10 00010000 00000000 00010000 00000000 00010000 00000000
00010000 00000000^M
[...]
[ 203.5489060] Register dump of ioapic2^M
[ 203.5489060] 00 0a000000 00070011 0a000000(XEN) vioapic.c:124:d0v0
apic_mem_readl:undefined ioregsel 3
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 4
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 5
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 6
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 7
00000000^M
[ 203.5489060] 08(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 8
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 9
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel a
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel b
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel c
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel d
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel e
00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel f
00000000^M
[ 203.5489060] 10 00010000 00000000 00010000 00000000 0000e067 00000000
00010000 00000000^M
then the register switches to 0000e067, with the IOAPIC_REDLO_RIRR bit set.
From here, if I continue from ddb, the dom0 boots.
I can get the same effect by just doing ^A^A^A, so my guess is that it's
not the access to the ioapic's registers that changes the IOAPIC_REDLO_RIRR
bit, but the Xen printf. Also, from NetBSD, using a dump function which
doesn't access undefined registers - and so doesn't trigger Xen printfs -
doesn't change the IOAPIC_REDLO_RIRR bit either.
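
For reference, here is a minimal sketch of how the two values seen above
(0x0000a067 and 0x0000e067) decode. It is a standalone illustration, not
code from the NetBSD or Xen trees. Only the Remote IRR value (0x00004000,
IOAPIC_REDLO_RIRR) comes from this message; the other bit positions are
assumed from the standard Intel 82093AA redirection entry layout, and the
REDLO_* names and redlo_decode()/redlo_print() helpers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Low 32 bits of an IO-APIC redirection entry (assumed 82093AA layout). */
#define REDLO_VECTOR_MASK 0x000000ffu /* bits 0-7: interrupt vector */
#define REDLO_ACTLO       0x00002000u /* bit 13: active-low polarity */
#define REDLO_RIRR        0x00004000u /* bit 14: Remote IRR (IOAPIC_REDLO_RIRR) */
#define REDLO_LEVEL       0x00008000u /* bit 15: level-triggered */
#define REDLO_MASK        0x00010000u /* bit 16: pin masked */

/* Hypothetical helper type: the fields of interest for this discussion. */
struct redlo_fields {
    unsigned vector;
    bool active_low;
    bool remote_irr;
    bool level;
    bool masked;
};

/* Split a raw redirection-entry low word into its fields. */
struct redlo_fields redlo_decode(uint32_t redlo)
{
    struct redlo_fields f;

    f.vector     = redlo & REDLO_VECTOR_MASK;
    f.active_low = (redlo & REDLO_ACTLO) != 0;
    f.remote_irr = (redlo & REDLO_RIRR) != 0;
    f.level      = (redlo & REDLO_LEVEL) != 0;
    f.masked     = (redlo & REDLO_MASK) != 0;
    return f;
}

/* Print one decoded entry on a single line. */
void redlo_print(const char *tag, uint32_t redlo)
{
    struct redlo_fields f = redlo_decode(redlo);

    printf("%s: 0x%08x vector=%u level=%d actlo=%d rirr=%d masked=%d\n",
           tag, (unsigned)redlo, f.vector, f.level, f.active_low,
           f.remote_irr, f.masked);
}
```

Under these assumptions, both values decode to vector 0x67 (103, matching
the idt entry mentioned earlier in the thread), level-triggered, active-low
and unmasked; the only difference is the Remote IRR bit. Calling
redlo_print("stalled", 0x0000a067) and redlo_print("after dump", 0x0000e067)
shows the two side by side.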
--
Manuel Bouyer <[email protected]>
NetBSD: 26 years of experience will always make the difference
--