On Fri,  7 Nov 2025 12:49:32 -0400
Jason Gunthorpe <[email protected]> wrote:

> This series is the start of adding full DMABUF support to
> iommufd. Currently it is limited to working only with VFIO's DMABUF exporter.
> It sits on top of Leon's series to add a DMABUF exporter to VFIO:
> 
>   https://lore.kernel.org/r/[email protected]
> 
> The existing IOMMU_IOAS_MAP_FILE is enhanced to detect DMABUF fds, but
> otherwise works the same as it does today for a memfd. The user can
> select a slice of the FD to map into the ioas and, if the underlying
> alignment requirements are met, it will be placed in the iommu_domain.
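>
> For illustration, a minimal userspace sketch of such a mapping,
> assuming the current struct iommu_ioas_map_file layout from
> include/uapi/linux/iommufd.h; iommufd, ioas_id, dmabuf_fd, bar_size
> and iova are placeholder values:
>
>   #include <err.h>
>   #include <sys/ioctl.h>
>   #include <linux/iommufd.h>
>
>   /* Map a slice of the DMABUF into the IOAS. Per the above, the fd
>    * type is detected automatically, so the call is identical to the
>    * memfd case.
>    */
>   struct iommu_ioas_map_file map = {
>           .size = sizeof(map),
>           .flags = IOMMU_IOAS_MAP_FIXED_IOVA | IOMMU_IOAS_MAP_READABLE |
>                    IOMMU_IOAS_MAP_WRITEABLE,
>           .ioas_id = ioas_id,
>           .fd = dmabuf_fd,     /* fd from VFIO's DMABUF exporter */
>           .start = 0,          /* byte offset (slice) within the FD */
>           .length = bar_size,  /* must meet alignment requirements */
>           .iova = iova,        /* fixed IOVA chosen by the VMM */
>   };
>
>   if (ioctl(iommufd, IOMMU_IOAS_MAP_FILE, &map))
>           err(1, "IOMMU_IOAS_MAP_FILE");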
> 
> Though limited, it is enough to allow a VMM like QEMU to connect MMIO
> BAR memory from VFIO to an iommu_domain controlled by iommufd. This is
> used for PCI Peer to Peer support in VMs, and is the last feature of
> the VFIO type 1 container that iommufd could not provide.
> 
> The VFIO type1 version extracts raw PFNs from VMAs, which has no lifetime
> control and is a use-after-free security problem.
> 
> Instead iommufd relies on revocable DMABUFs. Whenever VFIO thinks there
> should be no access to the MMIO it can shoot down the mapping in
> iommufd, which will unmap it from the iommu_domain. There is no
> automatic remap; this is a safety protocol so the kernel doesn't get
> stuck. Userspace is expected to know when it is doing something that
> will revoke the dmabuf and to map/unmap around the activity. E.g. when
> QEMU goes to issue an FLR it should unmap from iommufd beforehand and
> re-map afterwards.
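>
> Concretely, the expected userspace sequence around an operation that
> revokes the DMABUF might look like this sketch (placeholder values;
> IOMMU_IOAS_UNMAP and VFIO_DEVICE_RESET are the existing uAPIs, and
> "map" is the iommu_ioas_map_file argument used for the original
> mapping):
>
>   /* Drop the P2P mapping before the activity that revokes the
>    * DMABUF, then re-establish it afterwards; the kernel never
>    * remaps on its own.
>    */
>   struct iommu_ioas_unmap unmap = {
>           .size = sizeof(unmap),
>           .ioas_id = ioas_id,
>           .iova = iova,
>           .length = bar_size,
>   };
>
>   ioctl(iommufd, IOMMU_IOAS_UNMAP, &unmap);  /* unmap first */
>   ioctl(vfio_device_fd, VFIO_DEVICE_RESET);  /* FLR revokes the DMABUF */
>   ioctl(iommufd, IOMMU_IOAS_MAP_FILE, &map); /* re-map afterwards */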
> 
> Since DMABUF is missing some key general features for this use case,
> it relies on a "private interconnect" between VFIO and iommufd via the
> vfio_pci_dma_buf_iommufd_map() call.
> 
> The call confirms the DMABUF has revoke semantics and delivers a phys_addr
> for the memory suitable for use with iommu_map().
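>
> A rough sketch of the shape of that call, inferred from the
> description above (the actual prototype is in the patches):
>
>   /* Hypothetical prototype: validates that the attachment's dma_buf
>    * comes from VFIO's revocable exporter and returns the physical
>    * range to hand to iommu_map().
>    */
>   int vfio_pci_dma_buf_iommufd_map(struct dma_buf_attachment *attach,
>                                    phys_addr_t *phys);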
> 
> Medium term there is a desire to expand the supported DMABUFs to
> include GPU drivers, to support DPDK/SPDK-type use cases. Future series
> will therefore add a general concept of revoke and a general
> negotiation of interconnect so that vfio_pci_dma_buf_iommufd_map() can
> be removed.
> 
> I also plan another series to modify iommufd's vfio_compat to
> transparently pull a dmabuf out of a VFIO VMA to emulate more of the uAPI
> of type1.
> 
> The latest series for interconnect negotiation to exchange a phys_addr is:
>  https://lore.kernel.org/r/[email protected]

If this is in development, why are we pursuing a vfio-specific,
temporary "private interconnect" here rather than building on that
work?  What are the gaps/barriers/timeline?

I don't see any uAPI changes here; is there any visibility to userspace
as to whether IOMMUFD supports this feature, or is it simply a
try-and-fail approach?  The latter makes it difficult for management
tools to choose a VM configuration based on IOMMUFD or legacy vfio when
p2p DMA is a requirement.  Thanks,

Alex
