On Wed, Dec 01, 2021 at 10:27:21AM +0100, Jan Beulich wrote:
> On 01.12.2021 10:09, Roger Pau Monné wrote:
> > On Fri, Sep 24, 2021 at 11:46:57AM +0200, Jan Beulich wrote:
> >> @@ -267,44 +267,60 @@ static bool __hwdom_init hwdom_iommu_map
> >>       * that fall in unusable ranges for PV Dom0.
> >>       */
> >>      if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) )
> >> -        return false;
> >> +        return 0;
> >>  
> >>      switch ( type = page_get_ram_type(mfn) )
> >>      {
> >>      case RAM_TYPE_UNUSABLE:
> >> -        return false;
> >> +        return 0;
> >>  
> >>      case RAM_TYPE_CONVENTIONAL:
> >>          if ( iommu_hwdom_strict )
> >> -            return false;
> >> +            return 0;
> >>          break;
> >>  
> >>      default:
> >>          if ( type & RAM_TYPE_RESERVED )
> >>          {
> >>              if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved )
> >> -                return false;
> >> +                perms = 0;
> >>          }
> >> -        else if ( is_hvm_domain(d) || !iommu_hwdom_inclusive || pfn > 
> >> max_pfn )
> >> -            return false;
> >> +        else if ( is_hvm_domain(d) )
> >> +            return 0;
> >> +        else if ( !iommu_hwdom_inclusive || pfn > max_pfn )
> >> +            perms = 0;
> > 
> > I'm confused about the reason to set perms = 0 instead of just
> > returning here. AFAICT perms won't be set to any other value below,
> > so you might as well just return 0.
> 
> This is so that ...
> 
> >>      }
> >>  
> >>      /* Check that it doesn't overlap with the Interrupt Address Range. */
> >>      if ( pfn >= 0xfee00 && pfn <= 0xfeeff )
> >> -        return false;
> >> +        return 0;
> >>      /* ... or the IO-APIC */
> >> -    for ( i = 0; has_vioapic(d) && i < d->arch.hvm.nr_vioapics; i++ )
> >> -        if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
> >> -            return false;
> >> +    if ( has_vioapic(d) )
> >> +    {
> >> +        for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
> >> +            if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
> >> +                return 0;
> >> +    }
> >> +    else if ( is_pv_domain(d) )
> >> +    {
> >> +        /*
> >> +         * Be consistent with CPU mappings: Dom0 is permitted to 
> >> establish r/o
> >> +         * ones there, so it should also have such established for IOMMUs.
> >> +         */
> >> +        for ( i = 0; i < nr_ioapics; i++ )
> >> +            if ( pfn == PFN_DOWN(mp_ioapics[i].mpc_apicaddr) )
> >> +                return rangeset_contains_singleton(mmio_ro_ranges, pfn)
> >> +                       ? IOMMUF_readable : 0;
> >> +    }
> 
> ... this return, as per the comment, takes precedence over returning
> zero.

I see. This is because you want to map those in the IOMMU page tables
even if the IO-APIC ranges are outside of a reserved region.

I have to admit this is kind of weird, because the purpose of this
function is to add mappings for all memory below 4G, and/or for all
reserved regions.

I also wonder whether we should kind of generalize the handling of RO
regions in the IOMMU for PV dom0 by using mmio_ro_ranges instead? Ie:
we could loop around the RO ranges in PV dom0 build and map them.

FWIW MSI-X tables are also RO, but adding and removing those to the
IOMMU might be quite complex as we have to track the memory decoding
and MSI-X enable bits.

And we are likely missing a check for iomem_access_permitted in
hwdom_iommu_map?

> >> @@ -346,15 +362,19 @@ void __hwdom_init arch_iommu_hwdom_init(
> >>      for ( ; i < top; i++ )
> >>      {
> >>          unsigned long pfn = pdx_to_pfn(i);
> >> +        unsigned int perms = hwdom_iommu_map(d, pfn, max_pfn);
> >>          int rc;
> >>  
> >> -        if ( !hwdom_iommu_map(d, pfn, max_pfn) )
> >> +        if ( !perms )
> >>              rc = 0;
> >>          else if ( paging_mode_translate(d) )
> >> -            rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
> >> +            rc = set_identity_p2m_entry(d, pfn,
> >> +                                        perms & IOMMUF_writable ? 
> >> p2m_access_rw
> >> +                                                                : 
> >> p2m_access_r,
> >> +                                        0);
> >>          else
> >>              rc = iommu_map(d, _dfn(pfn), _mfn(pfn), 1ul << PAGE_ORDER_4K,
> >> -                           IOMMUF_readable | IOMMUF_writable, 
> >> &flush_flags);
> >> +                           perms, &flush_flags);
> > 
> > You could just call set_identity_p2m_entry uniformly here. It will
> > DTRT for non-translated guests also, and then hwdom_iommu_map could
> > perhaps return a p2m_access_t?
> 
> That's an orthogonal change imo, i.e. could be done as a prereq change,
> but I'd prefer to leave it as is for now. Furthermore see "x86/mm: split
> set_identity_p2m_entry() into PV and HVM parts": In v2 I'm now also
> adjusting the code here 

I would rather adjust the code here to just call
set_identity_p2m_entry instead of differentiating between PV and
HVM.

> (and vpci_make_msix_hole()) to call the
> translated-only function.

This one does make sense, as vpci is strictly HVM only.

Thanks, Roger.

Reply via email to