On Thu, Aug 01, 2019 at 11:25:04AM -0700, Roman Shaposhnik wrote:
> On Thu, Aug 1, 2019 at 1:16 AM Roger Pau Monné <[email protected]> wrote:
> >
> > On Wed, Jul 31, 2019 at 02:03:24PM -0700, Roman Shaposhnik wrote:
> > > On Wed, Jul 31, 2019 at 12:46 PM Andrew Cooper
> > > <[email protected]> wrote:
> > > >
> > > > On 31/07/2019 20:35, Roman Shaposhnik wrote:
> > > > > On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné
> > > > > <[email protected]> wrote:
> > > > >> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote:
> > > > >>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
> > > > >>>> Sorry -- got a bit distracted yesterday. Attached is the log with
> > > > >>>> only your latest patch applied. Interestingly enough the box booted
> > > > >>>> fine without screen artifacts. So I guess we're getting closer...
> > > > >>>>
> > > > >>>> Thanks for all the help!
> > > > >>> That's quite weird, there are no functional changes between the
> > > > >>> previous patches and this one, the only difference is that this
> > > > >>> patch has more verbose output.
> > > > >>>
> > > > >>> Are you sure you didn't have any local patches on top of Xen that
> > > > >>> could explain this difference in behaviour?
> > > > >> FWIW, can you please try the plain patch again:
> > > > >>
> > > > >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html
> > > > >>
> > > > >> And report back?
> > > > >>
> > > > >> I would like to get this committed ASAP if it does fix your issue.
> > > > > I'd like to say that it did -- but I tried it again just now and it
> > > > > still produced a garbled screen and tons of:
> > > > >
> > > > > (XEN) printk: 26665 messages suppressed.
> > > > > (XEN) [VT-D]DMAR:[DMA Read] Request device [0000:00:02.0] fault addr 8e14c000, iommu reg = ffff82c0008de000
> > > > >
> > > > > I'm very much confused by what's going on, but it seems that's the
> > > > > case -- adding those debug print statements makes the issue go away.
> > > > >
> > > > > Here are the patches that are being applied:
> > > > > NOT WORKING:
> > > > > https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch
> > > > >
> > > > > WORKING:
> > > > > https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch
> > > > >
> > > > > At this point I'm really not sure what's going on.
> > > >
> > > > Ok, seeing as you've double-checked this, the mystery deepens.
> > > >
> > > > My bet is on the intel_iommu_lookup_page() call having side effects[1].
> > > > If you take out the debugging in the middle of the loop in
> > > > rmrr_identity_mapping(), does the problem reproduce again?
> > > >
> > > > ~Andrew
> > > >
> > > > [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100%
> > > > more memory allocation and higher-level PTE construction than seems wise
> > > > for what is supposed to be a getter.
> > >
> > > Yup. That's what it is -- intel_iommu_lookup_page() seems to be the
> > > culprit.
> > >
> > > I did the experiment in the other direction -- adding a dummy call:
> > >
> > > https://github.com/rvs/eve/blob/36aeeaa7c0a53474fb1ecef2ff587a86637df858/pkg/xen/01-iommu-mappings.patch#L23
> > > on top of Roger's original patch makes the system boot NORMALLY.
> >
> > I'm again quite lost, and I don't really understand why mappings added
> > by arch_iommu_hwdom_init seem to work fine while mappings added by
> > rmrr_identity_mapping don't.
> >
> > I have yet another patch for you to try, which attempts to mimic
> > exactly what arch_iommu_hwdom_init does into rmrr_identity_mapping,
> > can you please give it a try?
> >
> > This has the added bonus of limiting the use of
> > {set/clear}_identity_p2m_entry to translated domains only, since
> > rmrr_identity_mapping was the only caller against PV domains.
> >
> > Thanks, Roger.
> > ---8<---
> > diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> > index fef97c82f6..d36a58b1a6 100644
> > --- a/xen/arch/x86/mm/p2m.c
> > +++ b/xen/arch/x86/mm/p2m.c
> > @@ -1341,10 +1341,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
> >
> > if ( !paging_mode_translate(p2m->domain) )
> > {
> > - if ( !need_iommu_pt_sync(d) )
> > - return 0;
> > - return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
> > - IOMMUF_readable | IOMMUF_writable);
> > + ASSERT_UNREACHABLE();
> > + return -ENXIO;
> > }
> >
> > gfn_lock(p2m, gfn, 0);
> > @@ -1432,9 +1430,8 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
> >
> > if ( !paging_mode_translate(d) )
> > {
> > - if ( !need_iommu_pt_sync(d) )
> > - return 0;
> > - return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
> > + ASSERT_UNREACHABLE();
> > + return -ENXIO;
> > }
> >
> > gfn_lock(p2m, gfn, 0);
> > diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
> > index 5d72270c5b..62df5ca5aa 100644
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -1969,6 +1969,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
> >      unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >> PAGE_SHIFT_4K;
> > struct mapped_rmrr *mrmrr;
> > struct domain_iommu *hd = dom_iommu(d);
> > + unsigned int flush_flags = 0;
> >
> > ASSERT(pcidevs_locked());
> > ASSERT(rmrr->base_address < rmrr->end_address);
> > @@ -1982,7 +1983,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
> > if ( mrmrr->base == rmrr->base_address &&
> > mrmrr->end == rmrr->end_address )
> > {
> > - int ret = 0;
> > + int ret = 0, err;
> >
> > if ( map )
> > {
> > @@ -1995,13 +1996,20 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
> >
> > while ( base_pfn < end_pfn )
> > {
> > - if ( clear_identity_p2m_entry(d, base_pfn) )
> > - ret = -ENXIO;
> > + if ( paging_mode_translate(d) )
> > + ret = clear_identity_p2m_entry(d, base_pfn);
> > + else
> > + ret = iommu_unmap(d, _dfn(base_pfn), PAGE_ORDER_4K,
> > + &flush_flags);
> > base_pfn++;
> > }
> >
> > list_del(&mrmrr->list);
> > xfree(mrmrr);
> > + /* Keep the previous error code if there's one. */
> > + err = iommu_iotlb_flush_all(d, flush_flags);
> > + if ( !ret )
> > + ret = err;
> > return ret;
> > }
> > }
> > @@ -2011,8 +2019,13 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
> >
> > while ( base_pfn < end_pfn )
> > {
> > - int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
> > + int err;
> >
> > + if ( paging_mode_translate(d) )
> > + err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
> > + else
> > +            err = iommu_map(d, _dfn(base_pfn), _mfn(base_pfn), PAGE_ORDER_4K,
> > +                           IOMMUF_readable | IOMMUF_writable, &flush_flags);
> > if ( err )
> > return err;
> > base_pfn++;
> > @@ -2026,7 +2039,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
> > mrmrr->count = 1;
> > list_add_tail(&mrmrr->list, &hd->arch.mapped_rmrrs);
> >
> > - return 0;
> > + return iommu_iotlb_flush_all(d, flush_flags);
> > }
> >
> > static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev)
>
> This patch completely fixes the problem for me!
>
> Thanks Roger! I'd love to see this in Xen 4.13
Thanks for testing!

It's still not clear to me why the previous approach didn't work, but
I think this patch is better because it removes the usage of
{set/clear}_identity_p2m_entry from PV domains. I will submit this
formally now.
Roger.
_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xenproject.org/mailman/listinfo/xen-devel