On Wed, Jun 10, 2026 at 04:43:22PM +0100, Matt Evans wrote:
> Expand the VFIO DMABUF revocation state to three states:
> Not revoked, temporarily revoked, and permanently revoked.
>
> The first two are for existing transient revocation, e.g. across a
> function reset, and the DMABUF is put into the last in response to a
> new VFIO feature VFIO_DEVICE_FEATURE_DMA_BUF.
>
> VFIO_DEVICE_FEATURE_DMA_BUF passes a DMABUF by fd and requests that
> the DMABUF is permanently revoked. On success, it's guaranteed that
> the buffer can never be imported/attached/mmap()ed in future, that
> dynamic imports have been cleanly detached, and that all mappings have
> been made inaccessible/PTEs zapped.
>
> This is useful for lifecycle management, to reclaim VFIO PCI BAR
> ranges previously delegated to a subordinate client process: The
> driver process can ensure that the loaned resources are revoked when
> the client is deemed "done", and exported ranges can be safely re-used
> elsewhere.
>
> Refactor the revocation code out of vfio_pci_dma_buf_move() to a
> function common to move and the new feature request path.
>
> Signed-off-by: Matt Evans <[email protected]>
> ---
> drivers/vfio/pci/vfio_pci_core.c | 6 +-
> drivers/vfio/pci/vfio_pci_dmabuf.c | 169 ++++++++++++++++++++++-------
> drivers/vfio/pci/vfio_pci_priv.h | 19 +++-
> include/uapi/linux/vfio.h | 20 ++++
> 4 files changed, 173 insertions(+), 41 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci_core.c
> b/drivers/vfio/pci/vfio_pci_core.c
> index 508a5eca910a..064906b25467 100644
[...]
>
> +/* Set the DMABUF's revocation status (OK or temporarily/permanently
> revoked) */
> +static void vfio_pci_dma_buf_set_status(struct vfio_pci_dma_buf *priv,
> + enum vfio_pci_dma_buf_status new_status)
> +{
> + bool was_revoked;
> +
> + lockdep_assert_held_write(&priv->vdev->memory_lock);
> +
> + if (priv->status == VFIO_PCI_DMABUF_PERM_REVOKED ||
> + priv->status == new_status) {
> + return;
> + }
> +
> + dma_resv_lock(priv->dmabuf->resv, NULL);
> + was_revoked = (priv->status == VFIO_PCI_DMABUF_TEMP_REVOKED);
> +
> + if (new_status != VFIO_PCI_DMABUF_OK) {
> + priv->status = new_status; /* Temp or permanently revoked */
> +
> + if (was_revoked) {
> + /*
> + * TEMP_REVOKED is being upgraded to
> + * PERM_REVOKED. The buffer is already gone,
> + * don't wait on it again.
> + */
> + dma_resv_unlock(priv->dmabuf->resv);
> + return;
> + }
> + }
> +
> + dma_buf_invalidate_mappings(priv->dmabuf);
Nit: We seem to be calling this even if new_status == OK, while it works
as importers (like IOMMUFD and RDMA core) are immune to a double
invalidate / revoke. I'm wondering if we could move this within the
if (new_status != VFIO_PCI_DMABUF_OK) block?
Since this is only needed to be called when we TEMP/PERM _REVOKE?
I'm just worried that this may overload the dma_buf_invalidate_mappings
to be a state-change notification instead of a revoke / invalidate
notification.
> + dma_resv_wait_timeout(priv->dmabuf->resv,
> + DMA_RESV_USAGE_BOOKKEEP, false,
> + MAX_SCHEDULE_TIMEOUT);
> + dma_resv_unlock(priv->dmabuf->resv);
> + if (new_status != VFIO_PCI_DMABUF_OK) {
> + kref_put(&priv->kref, vfio_pci_dma_buf_done);
> + wait_for_completion(&priv->comp);
> + unmap_mapping_range(priv->dmabuf->file->f_mapping,
> + 0, priv->size, 1);
> + /*
> + * Re-arm the registered kref reference and the
> + * completion so the post-revoke state matches the
> + * post-creation state. An un-revoke followed by a
> + * new mapping needs the kref to be non-zero before
> + * kref_get(), and vfio_pci_dma_buf_cleanup()
> + * delegates its drain back through this revoke
> + * path on a possibly-already-revoked dma-buf.
> + */
> + kref_init(&priv->kref);
> + reinit_completion(&priv->comp);
> + } else {
> + dma_resv_lock(priv->dmabuf->resv, NULL);
> + priv->status = VFIO_PCI_DMABUF_OK;
> + dma_resv_unlock(priv->dmabuf->resv);
> + }
> +}
> +
Otherwise,
Reviewed-by: Pranjal Shrivastava <[email protected]>
Thanks,
Praan