Re: Design session "MSI-X support with Linux stubdomain" notes

Roger Pau Monné Thu, 29 Sep 2022 04:52:50 -0700

On Thu, Sep 29, 2022 at 01:44:28PM +0200, Jan Beulich wrote:
> On 29.09.2022 12:57, Marek Marczykowski-Górecki wrote:
> > On Mon, Sep 26, 2022 at 02:47:55PM +0200, Jan Beulich wrote:
> >> On 26.09.2022 14:43, Marek Marczykowski-Górecki wrote:
> >>> On Thu, Sep 22, 2022 at 08:00:00PM +0200, Jan Beulich wrote:
> >>>> On 22.09.2022 18:05, Anthony PERARD wrote:
> >>>>> WARNING: Notes missing at the beginning of the meeting.
> >>>>>
> >>>>> session description:
> >>>>>> Currently a HVM with PCI passthrough and Qemu Linux stubdomain doesn’t
> >>>>>> support MSI-X. For the device to (partially) work, Qemu needs a patch
> >>>>>> masking MSI-X from the PCI config space. Some drivers are not happy
> >>>>>> about that, which is understandable (device natively supports MSI-X,
> >>>>>> so fallback paths are rarely tested).
> >>>>>>
> >>>>>> This is mostly (?) about qemu accessing /dev/mem directly (here:
> >>>>>> https://github.com/qemu/qemu/blob/master/hw/xen/xen_pt_msi.c#L579) -
> >>>>>> let's discuss alternative interface that stubdomain could use.
> >>>>>
> >>>>>
> >>>>>
> >>>>> when qemu forward interrupt,
> >>>>>     for correct mask bit, it read physical mask bit.
> >>>>>     an hypercall would make sense.
> >>>>>     -> benefit, mask bit in hardware will be what hypervisor desire,
> >>>>>        and device model desire.
> >>>>>     from guest point of view, interrupt should be unmask.
> >>>>>
> >>>>> interrupt request are first forwarded to qemu, so xen have to do some 
> >>>>> post processing once request comes back from qemu.
> >>>>>     it's weird..
> >>>>>
> >>>>> someone should have a look, and rationalize this weird path.
> >>>>>
> >>>>> Xen tries to not forward everything to qemu.
> >>>>>
> >>>>> why don't we do that in xen.
> >>>>>     there's already code in xen for that.
> >>>>
> >>>> So what I didn't pay enough attention to when talking was that the
> >>>> completion logic in Xen is for writes only. Maybe something similar
> >>>> can be had for reads as well, but that's to be checked ...
> >>>
> >>> I spent some time trying to follow that part of qemu, and I think it
> >>> reads vector control only on the write path, to keep some bits
> >>> unchanged, and also detect whether Xen masked it behind qemu's back.
> >>> My understanding is, since 484d7c852e "x86/MSI-X: track host and guest
> >>> mask-all requests separately" it is unnecessary, because Xen will
> >>> remember guest's intention, so qemu can simply use its own internal
> >>> state and act on that (guest writes will go through qemu, so it should
> >>> have up to date view from guest's point of view).
> >>>
> >>> As for PBA access, it is read by qemu only to pass it to the guest. I'm
> >>> not sure whether qemu should use hypercall to retrieve it, or maybe
> >>> Xen should fixup value itself on the read path.
> >>
> >> Forwarding the access to qemu just for qemu to use a hypercall to obtain
> >> the value needed seems backwards to me. If we need new code in Xen, we
> >> can as well handle the read directly I think, without involving qemu.
> > 
> > I'm not sure if I fully follow what qemu does here, but I think the
> > reason for such handling is that PBA can (and often does) live on the same
> > page as the actual MSI-X table. I'm trying to adjust qemu to not
> > intercept this read, but at this point I'm not yet sure if that's even
> > possible on sub-page granularity.
> > 
> > But, to go forward with PoC/debugging, I hardwired PBA read to
> > 0xFFFFFFFF, and it seems it doesn't work. My observation is that the
> > handler in the Linux driver isn't called. There are several moving
> > parts (it could very well be a bug in the driver, or some other part in the
> > VM). Is there some place in Xen I can see if an interrupt gets delivered
> > to the guest (some function I can add debug print to), or is it
> > delivered directly to the guest?
> 
> I guess "iommu=no-intpost" would suppress "direct" delivery (if hardware
> is capable of that in the first place). And wait - this option actually
> defaults to off.
> 
> As to software delivery - I guess you would want to start from
> do_IRQ_guest() and then see where things get lost. (Adding logging to
> such a path of course has a fair risk of ending up overly chatty.)


Having dealt with interrupt issues before, try to limit logging to
only the IRQ you are interested in. Using xentrace might be a better
option depending on what you need to debug, albeit it's kind of a pain
to add new trace points, as you also need to modify xenalyze to print
them.
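
As a rough illustration of limiting the logging to one IRQ, here is a
minimal sketch in plain C. All names in it (dbg_irq_arm,
dbg_irq_should_log, dbg_irq_trace) are hypothetical helpers made up for
this example, not existing Xen symbols, and the message budget is an
extra assumption added to keep the output bounded. In the hypervisor
the fprintf() would presumably be a printk() placed on the path starting
at do_IRQ_guest().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical debug state: which IRQ to watch and how many messages
 * may still be emitted. Neither is an existing Xen variable. */
static int dbg_irq = -1;
static unsigned int dbg_budget;

/* Select the IRQ of interest and cap the number of log lines. */
static void dbg_irq_arm(int irq, unsigned int budget)
{
    dbg_irq = irq;
    dbg_budget = budget;
}

/* Return true only for the watched IRQ, and only while budget remains.
 * Each true result consumes one unit of budget. */
static bool dbg_irq_should_log(int irq)
{
    if (irq != dbg_irq || dbg_budget == 0)
        return false;
    dbg_budget--;
    return true;
}

/* Emit one line for the watched IRQ; silent for every other IRQ. */
static void dbg_irq_trace(int irq, const char *stage)
{
    if (dbg_irq_should_log(irq))
        fprintf(stderr, "irq %d: %s\n", irq, stage);
}
```

With dbg_irq_arm(42, 16), calls such as dbg_irq_trace(irq, "guest
delivery") print at most 16 lines, all for IRQ 42, so the interrupt
path stays quiet for everything else.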

Roger.
