> -----Original Message-----
> From: Xen-devel [mailto:[email protected]] On Behalf
> Of Jan Beulich
> Sent: 03 December 2018 15:03
> To: Paul Durrant <[email protected]>
> Cc: Kevin Tian <[email protected]>; Stefano Stabellini
> <[email protected]>; Wei Liu <[email protected]>; Andrew Cooper
> <[email protected]>; Julien Grall <[email protected]>; Suravee
> Suthikulpanit <[email protected]>; xen-devel <xen-
> [email protected]>; Brian Woods <[email protected]>; Roger Pau
> Monne <[email protected]>
> Subject: Re: [Xen-devel] [PATCH 2/2] iommu: elide flushing for higher
> order map/unmap operations
>
> >>> On 30.11.18 at 11:45, <[email protected]> wrote:
> > This patch removes any implicit flushing that occurs in the
> > implementation of map and unmap operations and, instead, adds
> > explicit flushing at the end of the loops in the iommu_map/unmap()
> > wrapper functions.
> >
> > Because VT-d currently performs two different types of flush,
> > dependent upon whether a PTE is being modified versus merely added
> > (i.e. replacing a non-present PTE), an 'iommu_flush_type'
> > enumeration is defined by this patch and the iommu_ops map method is
> > modified to pass back the type of flush necessary for the PTE that
> > has been populated. When a higher order mapping operation is done,
> > the wrapper code performs the 'highest' level of flush required by
> > the individual iommu_ops method calls, where a 'modified PTE' flush
> > is deemed to be higher than an 'added PTE' flush.
>
> I'm afraid such ordering properties may not generally exist. That is,
> what you pass to the flush handlers needs to be an OR of "added new
> entries" and "modified existing entries". That's because, at least in
> the abstract case, it may be that distinct flushes need to be issued
> for both cases (i.e. potentially two of them).
Ok, I can set things up that way instead.
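For the record, a minimal sketch of what such an OR-able representation might look like; the names and the accumulating helper below are illustrative, not taken from the eventual patch:

```c
#include <assert.h>

/* Illustrative only: flush requirements expressed as OR-able flags, so
 * that "added" and "modified" can both be reported from a single walk. */
enum iommu_flush_flags {
    IOMMU_FLUSHF_none     = 0,
    IOMMU_FLUSHF_added    = 1u << 0, /* new (previously non-present) PTEs */
    IOMMU_FLUSHF_modified = 1u << 1, /* existing PTEs were changed */
};

/* Hypothetical wrapper-loop accumulation of per-PTE results. */
unsigned int accumulate_flush(const unsigned int *per_pte, unsigned int n)
{
    unsigned int flush = IOMMU_FLUSHF_none, i;

    for ( i = 0; i < n; i++ )
        flush |= per_pte[i]; /* OR, not MAX: both kinds may be needed */

    return flush;
}
```

The flush handler can then test each bit independently and issue two distinct flushes where the hardware requires it.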
>
> > -static bool set_iommu_pte_present(unsigned long pt_mfn, unsigned long dfn,
> > -                                  unsigned long next_mfn, int pde_level,
> > -                                  bool iw, bool ir)
> > +static enum iommu_flush_type set_iommu_pte_present(
> > +    unsigned long pt_mfn, unsigned long dfn, unsigned long next_mfn,
> > +    int pde_level, bool iw, bool ir)
> >  {
> >      uint64_t *table;
> >      uint32_t *pde;
> > -    bool need_flush;
> > +    enum iommu_flush_type flush_type;
> >
> >      table = map_domain_page(_mfn(pt_mfn));
> >
> >      pde = (uint32_t *)(table + pfn_to_pde_idx(dfn, pde_level));
> >
> > -    need_flush = set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
> > +    flush_type = set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
> >      unmap_domain_page(table);
> > -    return need_flush;
> > +    return flush_type;
> >  }
>
> Please take the opportunity and add the missing blank line.
>
Ok.
> > @@ -629,8 +629,6 @@ int amd_iommu_unmap_page(struct domain *d, dfn_t dfn)
> >      clear_iommu_pte_present(pt_mfn[1], dfn_x(dfn));
> >
> >      spin_unlock(&hd->arch.mapping_lock);
> > -
> > -    amd_iommu_flush_pages(d, dfn_x(dfn), 0);
> >      return 0;
> >  }
>
> Please retain the blank line.
Ok.
>
> > @@ -700,12 +705,23 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
> >      for ( i = 0; i < npages; i++ )
> >      {
> >          unsigned long frame = gfn + i;
> > +        enum iommu_flush_type this_flush_type;
> >
> > -        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags);
> > +        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags,
> > +                                &this_flush_type);
> >          if ( rt != 0 )
> > -            return rt;
> > +            break;
> > +
> > +        flush_type = MAX(flush_type, this_flush_type);
> >      }
> > -    return 0;
> > +
> > +    /*
> > +     * The underlying implementation is void so the return value is
> > +     * meaningless and can hence be ignored.
> > +     */
> > +    ignored = amd_iommu_flush_iotlb_pages(domain, _dfn(gfn), npages,
> > +                                          flush_type);
>
> I'm afraid such an assignment without subsequent use can be
> (legitimately) warned about by compilers. Hence the approach
> I had asked you to restore in one of your earlier patches. The
> exact same one won't fit here, but while ( ... ) break; should.
> Same then elsewhere.
>
Ok.
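As a small illustration of the idiom being suggested, using a hypothetical stand-in function rather than the actual Xen hook:

```c
#include <assert.h>

static int flushes;

/* Hypothetical stand-in for a __must_check flush hook whose failure
 * cannot usefully be acted upon at this call site. */
static int dummy_flush(void)
{
    ++flushes;
    return 0; /* success */
}

void consume_ignored_result(void)
{
    /* A plain "int ignored = dummy_flush();" can legitimately draw a
     * set-but-unused warning; looping and immediately breaking consumes
     * the __must_check value without introducing a dead variable. */
    while ( dummy_flush() )
        break;
}
```

The loop body is never reached on success, and a non-zero return simply falls out via the break, so the call still happens exactly once.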
> > @@ -319,11 +324,18 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
> >
> >      for ( i = 0; i < (1ul << page_order); i++ )
> >      {
> > +        enum iommu_flush_type this_flush_type;
> > +        int ignore;
> > +
> >          rc = hd->platform_ops->map_page(d, dfn_add(dfn, i),
> > -                                        mfn_add(mfn, i), flags);
> > +                                        mfn_add(mfn, i), flags,
> > +                                        &this_flush_type);
> >
> >          if ( likely(!rc) )
> > +        {
> > +            flush_type = MAX(flush_type, this_flush_type);
>
> With the comment above this is unlikely to stay anyway, but if it
> does please use max() instead. At least I can't see why you
> couldn't use the typesafe variant here.
>
As you say, it will go away.
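For context, the classic hazard with a non-typesafe MAX() macro is double evaluation, which a typesafe inline max() avoids; a minimal illustration (illustrative names, not Xen's actual definitions):

```c
#include <assert.h>

/* The naive macro form evaluates whichever argument 'wins' twice. */
#define NAIVE_MAX(a, b) ((a) > (b) ? (a) : (b))

/* A single-evaluation alternative; Xen's max() additionally enforces
 * matching argument types via compiler extensions. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

int demo_double_eval(void)
{
    int i = 0;
    /* i++ is evaluated twice here: once in the comparison, and again
     * to produce the result, leaving i == 2 rather than 1. */
    int r = NAIVE_MAX(i++, -1);

    return r + i; /* 1 + 2 */
}
```

With side-effect-free arguments of identical type, as in the loop above, either form behaves the same, which is why the typesafe variant is preferred when available.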
> > @@ -336,12 +348,19 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
> >              if ( hd->platform_ops->unmap_page(d, dfn_add(dfn, i)) )
> >                  continue;
> >
> > +            /* Something went wrong so attempt a full flush */
> > +            ignore = hd->platform_ops->iotlb_flush_all(d);
> > +
> >              if ( !is_hardware_domain(d) )
> >                  domain_crash(d);
> >
> >              break;
> >          }
> >
> > +    if ( hd->platform_ops->iotlb_flush && !this_cpu(iommu_dont_flush_iotlb) )
> > +        rc = hd->platform_ops->iotlb_flush(d, dfn, (1ul << page_order),
>
> 1u only please, since the function parameter is unsigned int.
>
Ok.
> > @@ -378,6 +397,10 @@ int iommu_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
> >          }
> >      }
> >
> > +    if ( hd->platform_ops->iotlb_flush )
> > +        rc = hd->platform_ops->iotlb_flush(d, dfn, (1ul << page_order),
>
> Same here.
>
Ok.
> > @@ -417,7 +440,9 @@ int iommu_iotlb_flush(struct domain *d, dfn_t dfn, unsigned int page_count)
> >      if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
> >          return 0;
> >
> > -    rc = hd->platform_ops->iotlb_flush(d, dfn, page_count);
> > +    /* Assume a 'modified' flush is required */
> > +    rc = hd->platform_ops->iotlb_flush(d, dfn, page_count,
> > +                                       IOMMU_FLUSH_modified);
>
> As per above this would then become the OR of both flush modes.
Yes.
>
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -631,11 +631,14 @@ static int __must_check iommu_flush_iotlb(struct domain *d, dfn_t dfn,
> >      return rc;
> >  }
> >
> > -static int __must_check iommu_flush_iotlb_pages(struct domain *d,
> > -                                                dfn_t dfn,
> > -                                                unsigned int page_count)
> > +static int __must_check iommu_flush_iotlb_pages(
> > +    struct domain *d, dfn_t dfn, unsigned int page_count,
> > +    enum iommu_flush_type flush_type)
>
> Is the re-flowing really needed?
>
Yes. The enum is long and won't fit within 80 chars otherwise.
> >  {
> > -    return iommu_flush_iotlb(d, dfn, 1, page_count);
> > +    return (flush_type == IOMMU_FLUSH_none) ?
> > +           0 :
> > +           iommu_flush_iotlb(d, dfn, (flush_type == IOMMU_FLUSH_modified),
>
> Unnecessary parentheses.
Ok.
>
> > @@ -674,9 +677,6 @@ static int __must_check dma_pte_clear_one(struct domain *domain, u64 addr)
> >      spin_unlock(&hd->arch.mapping_lock);
> >      iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
> >
> > -    if ( !this_cpu(iommu_dont_flush_iotlb) )
> > -        rc = iommu_flush_iotlb_pages(domain, daddr_to_dfn(addr), 1);
>
> This code not getting replaced by another addition right in this
> source file, and this function's only caller being
> intel_iommu_unmap_page() makes me wonder why you don't
> have the unmap functions similarly hand back a flush indicator.
>
Well, the assumption is that an unmap always modifies an existing entry. Is
that assumption wrong?
Paul
> Jan
>
_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xenproject.org/mailman/listinfo/xen-devel