On Fri, 7 Nov 2025 12:49:32 -0400 Jason Gunthorpe <[email protected]> wrote:
> This series is the start of adding full DMABUF support to
> iommufd. Currently it is limited to only work with VFIO's DMABUF exporter.
> It sits on top of Leon's series to add a DMABUF exporter to VFIO:
>
> https://lore.kernel.org/r/[email protected]
>
> The existing IOMMU_IOAS_MAP_FILE is enhanced to detect DMABUF fds, but
> otherwise works the same as it does today for a memfd. The user can select
> a slice of the FD to map into the ioas and, if the underlying alignment
> requirements are met, it will be placed in the iommu_domain.
>
> Though limited, it is enough to allow a VMM like QEMU to connect MMIO BAR
> memory from VFIO to an iommu_domain controlled by iommufd. This is used
> for PCI Peer to Peer support in VMs, and is the last feature that the VFIO
> type1 container has that iommufd couldn't do.
>
> The VFIO type1 version extracts raw PFNs from VMAs, which has no lifetime
> control and is a use-after-free security problem.
>
> Instead iommufd relies on revocable DMABUFs. Whenever VFIO thinks there
> should be no access to the MMIO it can shoot down the mapping in iommufd,
> which will unmap it from the iommu_domain. There is no automatic remap;
> this is a safety protocol so the kernel doesn't get stuck. Userspace is
> expected to know it is doing something that will revoke the dmabuf and
> map/unmap it around the activity. E.g. when QEMU goes to issue FLR it
> should do the map/unmap to iommufd.
>
> Since DMABUF is missing some key general features for this use case it
> relies on a "private interconnect" between VFIO and iommufd via the
> vfio_pci_dma_buf_iommufd_map() call.
>
> The call confirms the DMABUF has revoke semantics and delivers a phys_addr
> for the memory suitable for use with iommu_map().
>
> Medium term there is a desire to expand the supported DMABUFs to include
> GPU drivers to support DPDK/SPDK type use cases, so future series will
> work to add a general concept of revoke and a general negotiation of
> interconnect to remove vfio_pci_dma_buf_iommufd_map().
>
> I also plan another series to modify iommufd's vfio_compat to
> transparently pull a dmabuf out of a VFIO VMA to emulate more of the uAPI
> of type1.
>
> The latest series for interconnect negotiation to exchange a phys_addr is:
> https://lore.kernel.org/r/[email protected]

If that work is in development, why are we pursuing a vfio-specific,
temporary "private interconnect" here rather than building on it? What
are the gaps/barriers/timeline?

I don't see any uAPI changes here; is there any visibility to userspace
whether IOMMUFD supports this feature, or is it simply a try-and-fail
approach? The latter makes it difficult for management tools to decide
whether to base a VM configuration on IOMMUFD or legacy vfio when p2p DMA
is a requirement. (Rough sketches of the flows I'm referring to follow
below my signature.)

Thanks,
Alex
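
As I read the cover letter, the userspace side of the enhanced
IOMMU_IOAS_MAP_FILE flow would look roughly like the below. Only a
sketch: iommufd, ioas_id, dmabuf_fd, bar_iova, and bar_size are
placeholders, and the dmabuf fd is assumed to come from VFIO's exporter
per Leon's series:

  #include <err.h>
  #include <sys/ioctl.h>
  #include <linux/iommufd.h>

  /* Map a slice of the dmabuf into the IOAS. The fd is handled like
   * a memfd: 'start' is the offset of the slice into the dmabuf,
   * 'length' its size, and it lands in the iommu_domain only if the
   * alignment requirements mentioned above are met.
   */
  struct iommu_ioas_map_file map = {
          .size = sizeof(map),
          .flags = IOMMU_IOAS_MAP_FIXED_IOVA | IOMMU_IOAS_MAP_READABLE |
                   IOMMU_IOAS_MAP_WRITEABLE,
          .ioas_id = ioas_id,     /* placeholder IOAS */
          .fd = dmabuf_fd,        /* dmabuf exported by VFIO */
          .start = 0,             /* offset into the dmabuf */
          .length = bar_size,     /* placeholder */
          .iova = bar_iova,       /* placeholder */
  };

  if (ioctl(iommufd, IOMMU_IOAS_MAP_FILE, &map))
          err(1, "IOMMU_IOAS_MAP_FILE");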
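
The no-automatic-remap protocol around a revoking operation would then,
as I understand it, put the burden on the VMM like so (same placeholders
as above; device_fd is the VFIO device fd from <linux/vfio.h>, and
VFIO_DEVICE_RESET stands in for whatever operation triggers the revoke,
e.g. FLR):

  /* Drop the MMIO mapping before the operation that revokes the
   * dmabuf... */
  struct iommu_ioas_unmap unmap = {
          .size = sizeof(unmap),
          .ioas_id = ioas_id,
          .iova = bar_iova,
          .length = bar_size,
  };
  if (ioctl(iommufd, IOMMU_IOAS_UNMAP, &unmap))
          err(1, "IOMMU_IOAS_UNMAP");

  /* ...perform the revoking operation... */
  if (ioctl(device_fd, VFIO_DEVICE_RESET))
          err(1, "VFIO_DEVICE_RESET");

  /* ...and re-map explicitly, since iommufd will not remap on its
   * own after a revoke. */
  if (ioctl(iommufd, IOMMU_IOAS_MAP_FILE, &map))
          err(1, "IOMMU_IOAS_MAP_FILE");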
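
And this is the try-and-fail detection I worry a management stack would
be reduced to. Which errno actually distinguishes "kernel without dmabuf
support" from any other mapping failure is an assumption on my part,
which is rather the point:

  #include <errno.h>
  #include <stdbool.h>

  /* Probe for dmabuf mapping support by attempting the map and
   * guessing at the failure mode. The error codes below are assumed;
   * nothing in the uAPI tells userspace which one means "unsupported".
   */
  bool have_dmabuf_map = true;

  if (ioctl(iommufd, IOMMU_IOAS_MAP_FILE, &map)) {
          if (errno == EOPNOTSUPP || errno == EINVAL)
                  have_dmabuf_map = false;  /* fall back to legacy vfio */
          else
                  err(1, "IOMMU_IOAS_MAP_FILE");
  }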
