On Mon, Oct 17, 2016 at 10:47:02PM -0600, Alex Williamson wrote:
> On Tue, 18 Oct 2016 15:06:55 +1100
> David Gibson <[email protected]> wrote:
>
> > On Mon, Oct 17, 2016 at 10:07:36AM -0600, Alex Williamson wrote:
> > > On Mon, 17 Oct 2016 18:44:21 +0300
> > > "Aviv B.D" <[email protected]> wrote:
> > >
> > > > From: "Aviv Ben-David" <[email protected]>
> > > >
> > > > * Advertize Cache Mode capability in iommu cap register.
> > > > This capability is controlled by "cache-mode" property of intel-iommu
> > > > device.
> > > > To enable this option call QEMU with "-device
> > > > intel-iommu,cache-mode=true".
> > > >
> > > > * On page cache invalidation in intel vIOMMU, check if the domain
> > > > belong to
> > > > registered notifier, and notify accordingly.
> > > >
> > > > Currently this patch still doesn't enabling VFIO devices support with
> > > > vIOMMU
> > > > present. Current problems:
> > > > * vfio_iommu_map_notify is not aware about memory range belong to
> > > > specific
> > > > VFIOGuestIOMMU.
> > >
> > > Could you elaborate on why this is an issue?
> > >
> > > > * memory_region_iommu_replay hangs QEMU on start up while it iterate
> > > > over
> > > > 64bit address space. Commenting out the call to this function enables
> > > > workable VFIO device while vIOMMU present.
> > >
> > > This has been discussed previously, it would be incorrect for vfio not
> > > to call the replay function. The solution is to add an iommu driver
> > > callback to efficiently walk the mappings within a MemoryRegion.
> >
> > Right, replay is a bit of a hack. There are a couple of other
> > approaches that might be adequate without a new callback:
> > - Make the VFIOGuestIOMMU aware of the guest address range mapped
> > by the vIOMMU. Intel currently advertises that as a full 64-bit
> > address space, but I bet that's not actually true in practice.
> > - Have the IOMMU MR advertise a (minimum) page size for vIOMMU
> > mappings. That may let you step through the range with greater
> > strides
>
> Hmm, VT-d supports at least a 39-bit address width and always supports
> a minimum 4k page size, so yes that does reduce us from 2^52 steps down
> to 2^27,

Right, which is probably doable, if not ideal

> but it's still absurd to walk through the raw address space.

Well.. it depends on the internal structure of the IOMMU. For Power,
it's traditionally just a 1-level page table, so we can't actually do
any better than stepping through each IOMMU page.
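
For reference, the step counts quoted above come from dividing the address space by the page size. A rough sketch of that arithmetic follows; replay_steps is a made-up helper for illustration, not a QEMU function:

```python
# Illustrative arithmetic only; replay_steps is a hypothetical helper,
# not part of QEMU.
def replay_steps(addr_width_bits: int, page_shift: int) -> int:
    """Number of page-sized steps needed to walk a whole address space."""
    return 1 << (addr_width_bits - page_shift)


full_64bit = replay_steps(64, 12)  # 4k pages across a 64-bit space
vtd_39bit = replay_steps(39, 12)   # 4k pages across a 39-bit space

print(f"64-bit space: 2^{full_64bit.bit_length() - 1} steps")
print(f"39-bit space: 2^{vtd_39bit.bit_length() - 1} steps")
```

That still leaves roughly 134 million lookups for the 39-bit case, which is why it is "doable, if not ideal".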

> It does however seem correct to create the MemoryRegion with a width
> that actually matches the IOMMU capability, but I don't think that's a
> sufficient fix by itself. Thanks,

I suspect it would actually make it workable in the short term.

But I don't disagree that a "traverse" or "replay" callback of some
sort in the iommu_ops is a better idea long term. Having a fallback
to the current replay implementation if the callback isn't supplied
seems pretty reasonable though.
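
To make the "callback with fallback" shape concrete, here is a toy model in Python rather than QEMU's C. Everything in it (ToyIOMMU, replay_cb, walk_table, replay) is a hypothetical name invented for this sketch. It assumes a flat table of mappings and is not the actual memory API:

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Mapping = Tuple[int, int]  # (iova, target address), both page aligned


class ToyIOMMU:
    """Hypothetical stand-in for an IOMMU region; not QEMU's MemoryRegion."""

    def __init__(self, addr_bits: int, page_shift: int,
                 table: Dict[int, int],
                 replay_cb: Optional[
                     Callable[["ToyIOMMU"], Iterator[Mapping]]] = None):
        self.addr_bits = addr_bits
        self.page_shift = page_shift
        self.table = table          # page frame number -> target address
        self.replay_cb = replay_cb  # optional efficient walker

    def translate(self, iova: int) -> Optional[int]:
        return self.table.get(iova >> self.page_shift)


def walk_table(mmu: ToyIOMMU) -> Iterator[Mapping]:
    """Efficient walker: visits only the entries that actually exist."""
    for pfn in sorted(mmu.table):
        yield pfn << mmu.page_shift, mmu.table[pfn]


def replay(mmu: ToyIOMMU) -> List[Mapping]:
    """Use the callback when supplied, else step through every page."""
    if mmu.replay_cb is not None:
        return list(mmu.replay_cb(mmu))
    found = []
    for iova in range(0, 1 << mmu.addr_bits, 1 << mmu.page_shift):
        target = mmu.translate(iova)
        if target is not None:
            found.append((iova, target))
    return found


table = {0x1: 0xA000, 0x7: 0xB000}
slow = replay(ToyIOMMU(16, 12, table))              # fallback: 16 lookups
fast = replay(ToyIOMMU(16, 12, table, walk_table))  # callback: 2 entries
assert slow == fast
```

Both paths report the same mappings. The difference is that the fallback cost grows with the size of the address space, while the callback cost grows with the number of live mappings.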
--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
