This series adds TLP Processing Hints (TPH) support to the VFIO dma-buf export path, allowing importing drivers (e.g. mlx5) to use the exporter's steering tag when performing peer-to-peer DMA into a VFIO-owned device.
There is no separate in-tree vendor kernel driver for the target device:
vfio-pci is the in-tree driver, and the device is managed from userspace
via VFIO passthrough. That is why the ST has to flow through a uAPI:
userspace owns the device and its ST table, so it is the only entity
that can publish a meaningful value for a given dma-buf. The
kernel-visible participants are still in-tree: vfio-pci exports the
dma-buf and mlx5 imports it.

On the effect: the endpoint's PCIe ingress block uses the 8-bit ST as an
in-band instruction for the incoming P2P TLP. It selects a target cache
partition and, on writes, an in-flight operation applied to the data
before it lands. The dma-buf callback keeps this opaque to the
framework: only the producer (the userspace owner of the VFIO device)
and the consumer (the endpoint block) need to interpret the value. The
dma-buf get_tph callback itself is optional, but for workloads that
depend on the endpoint's in-flight operation, the fallback taken when it
is absent does not produce the same result.

The dma-buf hook is intentionally generic and discoverable rather than a
private side channel. The exporter owns the completing address space for
the dma-buf and decides whether it can provide a meaningful ST/PH tuple
for that completer. The dma-buf core keeps the tuple opaque, and
importers merely request the namespace they support and place the
returned value on the TLPs they generate. Exporters that cannot derive a
meaningful tuple simply return -EOPNOTSUPP.

Patch 1 fixes a pre-existing bug and is split out from the rest of the
series: mlx5_st_dealloc_index() removed the xarray entry but never freed
the backing struct, so repeated alloc/dealloc cycles leaked memory.

Patch 2 adds small PCI/TPH type helpers so drivers can query the enabled
TPH requester mode and the device's TPH Completer Supported field
without reaching into pci_dev internals (and so callers in
CONFIG_PCIE_TPH=n builds get a clean fallback).
Patch 3 adds the optional dma_buf_ops::get_tph callback plus the
dma_buf_get_tph() importer wrapper so importers can fetch TPH metadata
from an exporter under dmabuf->resv.

Patch 4 implements get_tph in vfio-pci and adds the new uAPI
(VFIO_DEVICE_FEATURE_DMA_BUF_TPH) for userspace to attach the metadata.

Patch 5 wires up the mlx5 RDMA driver as a consumer.

Build-tested with both CONFIG_PCIE_TPH=y and CONFIG_PCIE_TPH=n.

Functional validation on the target topology: PCIe analyzer captures on
the P2P TLPs confirm the ST emitted by mlx5 matches the value published
through VFIO_DEVICE_FEATURE_DMA_BUF_TPH, and the end-to-end P2P workload
only produces results consistent with the endpoint's ST-selected
in-flight operation.

For example, with userspace publishing 8-bit ST=0xf0 and PH=2, an
analyzer capture of a peer-to-peer MWr64 shows
"STP MWr64 TC=0 OHC=2 ..." followed by "OHC-B ST=F0h PH=2 HV=1":

(TLP Captures)
  08000260 -> STP MWr64 TC=0 OHC=2 TS=0 Attr=0 L=8
  F0000004 -> RID=4h:0h.0h EP- Tag=F0h
  E0200000 -> AddrH=000020E0h
  00080006 -> AddrL=06000800h
  90F00000 -> OHC-B ST=F0h PH=2 HV=1 AMA=0 AV-

Previous link:
v6: https://lore.kernel.org/dri-devel/[email protected]/
v5: https://lore.kernel.org/dri-devel/[email protected]/
v4: https://lore.kernel.org/linux-pci/[email protected]/
v3: https://lore.kernel.org/linux-pci/[email protected]/
v2: https://lore.kernel.org/linux-pci/[email protected]/

Zhiping Zhang (5):
  net/mlx5: free mlx5_st_idx_data on final dealloc
  PCI/TPH: Add requester/completer type helpers
  dma-buf: add optional get_tph() callback
  vfio/pci: implement get_tph and DMA_BUF_TPH feature
  RDMA/mlx5: get tph for p2p access when registering dma-buf mr

 drivers/dma-buf/dma-buf.c                     |  25 ++++
 drivers/infiniband/core/frmr_pools.c          |  20 +++-
 drivers/infiniband/hw/mlx5/mr.c               | 111 +++++++++++++++++-
 .../net/ethernet/mellanox/mlx5/core/lib/st.c  |  50 ++++++--
 drivers/pci/tph.c                             |  43 +++++++
 drivers/vfio/pci/vfio_pci_core.c              |   3 +
 drivers/vfio/pci/vfio_pci_dmabuf.c            |  94 ++++++++++++++-
 drivers/vfio/pci/vfio_pci_priv.h              |  12 ++
 include/linux/dma-buf.h                       |  21 ++++
 include/linux/mlx5/driver.h                   |  12 ++
 include/linux/pci-tph.h                       |   8 ++
 include/rdma/frmr_pools.h                     |   5 +-
 include/uapi/linux/vfio.h                     |  37 ++++++
 13 files changed, 421 insertions(+), 20 deletions(-)

-- 
2.53.0-Meta
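
For readers who want the exporter/importer contract in one place, the
following is a minimal userspace sketch of the optional-callback pattern
described in this cover letter. It is not the code from the patches:
every mock_* name, the struct layouts, and the callback signature are
hypothetical stand-ins chosen for illustration. The namespace argument
that importers pass and the dmabuf->resv locking are deliberately left
out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the kernel's real types or signatures. */
struct mock_tph {
	uint16_t st;	/* steering tag, opaque to the framework */
	uint8_t ph;	/* processing hint */
};

struct mock_dmabuf;

struct mock_dmabuf_ops {
	/* Optional: NULL when the exporter has no TPH support at all. */
	int (*get_tph)(struct mock_dmabuf *buf, struct mock_tph *out);
};

struct mock_dmabuf {
	const struct mock_dmabuf_ops *ops;
	void *priv;	/* exporter-private state */
};

/* Importer-facing wrapper: a missing callback reads as "not supported". */
int mock_dmabuf_get_tph(struct mock_dmabuf *buf, struct mock_tph *out)
{
	assert(buf != NULL && out != NULL);
	if (!buf->ops || !buf->ops->get_tph)
		return -EOPNOTSUPP;
	return buf->ops->get_tph(buf, out);
}

/* Exporter side: userspace may or may not have published a tuple. */
struct mock_exporter_state {
	int published;
	struct mock_tph tph;
};

static int mock_exporter_get_tph(struct mock_dmabuf *buf,
				 struct mock_tph *out)
{
	const struct mock_exporter_state *s = buf->priv;

	if (!s || !s->published)
		return -EOPNOTSUPP;	/* no meaningful tuple to offer */
	*out = s->tph;
	return 0;
}

const struct mock_dmabuf_ops mock_exporter_ops = {
	.get_tph = mock_exporter_get_tph,
};

/*
 * Importer side: returns 1 if *tph should be placed on generated TLPs,
 * 0 if the importer should proceed without TPH, or a negative errno
 * for any other failure.
 */
int mock_importer_setup(struct mock_dmabuf *buf, struct mock_tph *tph)
{
	int ret = mock_dmabuf_get_tph(buf, tph);

	if (ret == -EOPNOTSUPP)
		return 0;
	if (ret < 0)
		return ret;
	return 1;
}
```

In this sketch the importer treats -EOPNOTSUPP as "carry on without
TPH", which mirrors the optional nature of the callback. Whether that
fallback is acceptable depends on the workload, as noted above.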
