On 2017-03-31 10:56, Peter Xu wrote:
It just came to mind that there may be a corner case here.
Intel VT-d actually has a "pt" (pass-through) mode, which allows a device to
use physical addresses even when VT-d is enabled. In the kernel, there is an
iommu_identity_mapping. If a device is in this map, it uses "pt" mode, so the
IOMMU driver does not build a second-level page table for it.
Yes, but QEMU does not support ECAP_PT now, so the guest will still have a
page table in this case.
Back to the virtual IOVA implementation: if an assigned device is in the
iommu_identity_mapping (e.g. a VGA controller), it uses GPA directly to do
DMA, so it needs a GPA->HPA mapping in the host. However, iommu->ops.replay
is not able to build it when the guest SL page table is empty.
So I think building an entire guest PA->HPA mapping before the guest kernel
boots would be recommended. Any thoughts?
We plan to add PT in 2.10. A possible rough idea is to disable the IOMMU DMAR
region and use another region without iommu_ops. Then
vfio_listener_region_add() will just do the correct mappings.
This should work even without any new region. With patch 16/17
("intel_iommu: allow dynamic switch of IOMMU region"), we can just turn the
IOMMU region on/off, following the device's PT bit, maybe using the new
vtd_switch_address_space() interface. That should be enough.
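
To make the on/off idea a bit more concrete, here is a minimal sketch in plain
C. It is not QEMU code: ToyRegion, ToyAddressSpace and
toy_switch_address_space() are hypothetical names made up for illustration,
and the real vtd_switch_address_space() may look quite different. The only
assumption is that each device address space holds two overlapping regions, one
backed by the IOMMU and one plain alias of system memory, and that exactly one
of them is enabled at a time depending on the device's PT bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory region: it only tracks on/off state. */
typedef struct {
    const char *name;
    bool enabled;
} ToyRegion;

/* Hypothetical per-device address space with two overlapping regions. */
typedef struct {
    ToyRegion iommu; /* translated region, would carry the IOMMU ops */
    ToyRegion sys;   /* plain alias of system memory, no IOMMU ops */
} ToyAddressSpace;

/*
 * Enable exactly one of the two regions, following the device's PT bit.
 * With PT set, the IOMMU region is turned off and the system-memory alias
 * is turned on, so a listener would see ordinary RAM and could map it
 * directly. Returns true when the IOMMU region is the active one.
 */
static bool toy_switch_address_space(ToyAddressSpace *as, bool pt)
{
    assert(as != NULL);

    bool use_iommu = !pt;

    as->iommu.enabled = use_iommu;
    as->sys.enabled = !use_iommu;
    return use_iommu;
}
```

In this toy model, calling toy_switch_address_space(&as, true) for a device
with PT set leaves only the system-memory alias enabled, which is the case
where vfio_listener_region_add() would be expected to set up the GPA->HPA
mappings on its own.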
Right. For vhost it will probably need more work, e.g. setting up static
mappings during region_add().
Again, we just need to wait until the current series is merged.
(Oh, now I see why I had an extra "on/off" parameter in
vtd_switch_address_space() in previous versions, though it was removed.)
Good to know this.
Thanks
Thanks,
-- peterx