On 27.11.2020 11:59, Roger Pau Monné wrote:
> On Thu, Nov 26, 2020 at 06:20:34PM +0100, Manuel Bouyer wrote:
>> On Thu, Nov 26, 2020 at 04:09:37PM +0100, Roger Pau Monné wrote:
>>>>
>>>> Oh, that's actually very useful. The interrupt is being constantly
>>>> injected from the hardware and received by Xen, it's just not then
>>>> injected into dom0 - that's the bit we are missing. Let me look into
>>>> adding some more debug to that path, hopefully it will tell us where
>>>> things are getting blocked.
>>>
>>> So I have yet one more patch for you to try, this one has more
>>> debugging and a slight change in the emulated IO-APIC behavior.
>>> Depending on the result I might have to find a way to mask the
>>> interrupt so it doesn't spam the whole buffer in order for us to see
>>> exactly what triggered this scenario you are in.
>>
>> OK, here it is:
>> http://www-soc.lip6.fr/~bouyer/xen-log9.txt
>>
>> I had to restart from a clean source tree to apply this patch, so to make
>> sure we're in sync I attached the diff from my sources
>
> I'm quite confused about why your trace doesn't even get into
> hvm_do_IRQ_dpci, so I've added some more debug info.

Are you sure it doesn't? I'm somewhat worried we may ...

> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -828,6 +828,9 @@ int hvm_do_IRQ_dpci(struct domain *d, struct pirq *pirq)
> !pirq_dpci || !(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
> return 0;
>
> + if ( pirq->pirq == TRACK_IRQ )
> + debugtrace_printk("hvm_do_IRQ_dpci irq %u\n", pirq->pirq);

... take the early exit path just above here. I still wouldn't be
able to say why that is: when I looked yesterday, I think I found
that every failure path which leaves HVM_IRQ_DPCI_MAPPED clear has
a log message associated with it, while Manuel said there were no
other log messages.

In this context I also started wondering whether it is right to
start the EOI timer when the subsequent call to send_guest_pirq()
doesn't actually send any event either. In that case the guest is
effectively guaranteed not to handle the interrupt. When the
interrupt isn't shared, I think we ought to ->end() it right away,
but without unmasking it, to unblock interrupts of the same or
lower priority. What to do in the shared case is less obvious to
me ...
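
To make the suggestion concrete, here is a minimal standalone sketch of
the decision I have in mind. It is a toy model only, not Xen code: the
enum, after_inject() and its two boolean parameters are made-up names
for illustration, and the shared case is deliberately left undecided.

```c
#include <stdbool.h>

/* Hypothetical outcomes; none of these names exist in Xen. */
enum irq_action {
    ARM_EOI_TIMER,     /* event was sent: wait for the guest's EOI */
    END_KEEP_MASKED,   /* nothing sent, not shared: ->end() now, stay masked */
    UNDECIDED_SHARED,  /* nothing sent, shared: open question */
};

/*
 * event_sent: assumed to say whether injection actually delivered an
 *             event to the guest.
 * shared:     assumed to say whether the line is shared with another
 *             device or guest.
 */
static enum irq_action after_inject(bool event_sent, bool shared)
{
    if ( event_sent )
        return ARM_EOI_TIMER;

    /* The guest will never handle this one, so a timer only delays things. */
    if ( !shared )
        return END_KEEP_MASKED;

    return UNDECIDED_SHARED;
}
```

With this model, after_inject(false, false) yields END_KEEP_MASKED,
which is the non-shared, nothing-delivered case described above.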

Jan