On Mon, Sep 22, 2025 at 11:53:06PM -0600, Kasireddy, Vivek wrote:
> Hi Jason,
> 
> > Subject: Re: [PATCH v4 1/5] PCI/P2PDMA: Don't enforce ACS check for device
> > functions of Intel GPUs
> > 
> > On Mon, Sep 22, 2025 at 01:22:49PM +0200, Christian König wrote:
> > 
> > > Well what exactly is happening here? You have a PF assigned to the
> > > host and a VF passed through to a guest, correct?
> > >
> > > And now the PF (from the host side) wants to access a BAR of the VF?
> > 
> > Not quite.
> > 
> > It is a GPU so it has a pool of VRAM. The PF can access all VRAM and
> > the VF can access some VRAM.
> > 
> > They want to get a DMABUF handle for a bit of the VF's reachable VRAM
> > that the PF can import and use through its own function.
> > 
> > The use of the VF's BAR in this series is an ugly hack.
> IIUC, it is a common practice among GPU drivers including Xe and Amdgpu
> to never expose VRAM Addresses and instead have BAR addresses as DMA
> addresses when exporting dmabufs to other devices. Here is the relevant code
> snippet in Xe:
>                 phys_addr_t phys = cursor.start +
>                                    xe_vram_region_io_start(tile->mem.vram);
>                 size_t size = min_t(u64, cursor.size, SZ_2G);
>                 dma_addr_t addr;
>
>                 addr = dma_map_resource(dev, phys, size, dir,
>                                         DMA_ATTR_SKIP_CPU_SYNC);
> 
> And, here is the one in amdgpu:
>         for_each_sgtable_sg((*sgt), sg, i) {
>                 phys_addr_t phys = cursor.start + adev->gmc.aper_base;
>                 unsigned long size = min(cursor.size, AMDGPU_MAX_SG_SEGMENT_SIZE);
>                 dma_addr_t addr;
> 
>                 addr = dma_map_resource(dev, phys, size, dir,
>                                         DMA_ATTR_SKIP_CPU_SYNC);
> 

I've read through this thread—Jason, correct me if I'm wrong—but I
believe what you're suggesting is that instead of using PCIe P2P
(dma_map_resource) to communicate the VF's VRAM offset to the PF, we
should teach dma-buf to natively understand a VF's VRAM offset. I don't
think this is currently built into dma-buf, but it probably should be,
as it could benefit other use cases as well (e.g., UALink, NVLink,
etc.).
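
To make that concrete, below is a very rough sketch of what I have in
mind. None of this exists in dma-buf today and all of the names are
made up, so treat it purely as an illustration: the exporter (vfio-pci
or a VF variant driver) would describe the buffer as offsets inside
the VF's provisioned VRAM rather than as P2P DMA addresses, and an
importer that understands the exporting device (here, the PF driver)
would resolve those offsets itself.

/* Hypothetical sketch only, not an existing dma-buf interface. */
struct dma_buf_vram_region {
	u64 offset;	/* offset into the VF's provisioned VRAM */
	u64 length;	/* length of this contiguous chunk */
};

struct dma_buf_vram_desc {
	struct pci_dev *vf_pdev;	/* device the offsets are relative to */
	unsigned int nr_regions;
	struct dma_buf_vram_region regions[];
};

/*
 * Hypothetical importer-side hook: only an importer that can make
 * sense of VF-relative offsets (e.g. the PF driver holding the
 * provisioning data) would use this; every other importer keeps
 * using the normal dma_buf_map_attachment() path.
 */
struct dma_buf_vram_desc *
dma_buf_get_vram_desc(struct dma_buf_attachment *attach);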

In both examples above, the PCIe P2P fabric is genuinely used for
communication, whereas in the VF→PF case the PCIe P2P address is only
used to extract the VF's VRAM offset rather than serving as a
communication path. I believe that's Jason's objection. Again, Jason,
correct me if I'm misunderstanding here.
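
Put differently, the part that looks wrong is an address round-trip
along these lines (heavily simplified sketch with placeholder names,
not the actual code from the series):

/* Exporter side: describe the chunk via its VF BAR (P2P) address. */
phys_addr_t bar_phys = pci_resource_start(vf_pdev, 0) + vram_offset;
dma_addr_t dma_addr = dma_map_resource(pf_dev, bar_phys, size,
				       DMA_BIDIRECTIONAL,
				       DMA_ATTR_SKIP_CPU_SYNC);

/*
 * Importer (PF) side: the PF never DMAs to dma_addr.  It only walks
 * the mapping back to a CPU physical address and then to an offset
 * inside the VF's BAR, which it resolves against its own view of the
 * VRAM pool.  The P2P address is just a container for the offset.
 */
phys_addr_t phys = iommu_iova_to_phys(domain, dma_addr);
u64 offset_in_vf_vram = phys - pci_resource_start(vf_pdev, 0);

If dma-buf could carry the offset natively, both the
dma_map_resource() call and the reverse lookup go away.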

Assuming I'm understanding Jason's comments correctly, I tend to agree
with him.

> And, AFAICS, most of these drivers don't use the BAR addresses directly
> if they import a dmabuf that they exported earlier and instead do this:
> 
>         if (dma_buf->ops == &xe_dmabuf_ops) {
>                 obj = dma_buf->priv;
>                 if (obj->dev == dev &&
>                     !XE_TEST_ONLY(test && test->force_different_devices)) {
>                         /*
>                          * Importing dmabuf exported from out own gem increases
>                          * refcount on gem itself instead of f_count of dmabuf.
>                          */
>                         drm_gem_object_get(obj);
>                         return obj;
>                 }
>         }

This code won't be triggered on the VF→PF path, as obj->dev == dev will
fail.

> 
> >The PF never actually uses the VF BAR
> That's because the PF can't use it directly, most likely due to hardware 
> limitations.
> 
> >it just hackily converts the dma_addr_t back
> > to CPU physical and figures out where it is in the VRAM pool and then
> > uses a PF centric address for it.
> > 
> > All they want is either the actual VRAM address or the CPU physical.
> The problem here is that the CPU physical (aka BAR Address) is only
> usable by the CPU. Since the GPU PF only understands VRAM addresses,
> the current exporter (vfio-pci) or any VF/VFIO variant driver cannot provide
> the VRAM addresses that the GPU PF can use directly because they do not
> have access to the provisioning data.
>

Right, we need to provide the offset within the VRAM provisioning, which
the PF can resolve to a physical address based on the provisioning data.
The series already does this—the problem is how the VF provides
this offset. It shouldn't be a P2P address, but rather a native
dma-buf-provided offset that everyone involved in the attachment
understands.
 
> However, it is possible that if vfio-pci or a VF/VFIO variant driver had 
> access
> to the VF's provisioning data, then it might be able to create a dmabuf with
> VRAM addresses that the PF can use directly. But I am not sure if exposing
> provisioning data to VFIO drivers is ok from a security standpoint or not.
> 

I'd prefer to leave the provisioning data to the PF if possible. I
haven't fully wrapped my head around the flow yet, but it should be
feasible for the VF → VFIO → PF path to pass along the initial VF
scatter-gather (SG) list in the dma-buf, which includes VF-specific
PFNs. The PF can then use this, along with its provisioning information,
to resolve the physical address.
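
As a rough sketch of that last step (the helper and the provisioning
structure below are made up, just to illustrate the idea): the PF
knows where each VF's VRAM slice lives in the global pool, so turning
a VF-relative offset into something the PF can use through its own
function is simple arithmetic plus a range check:

/* Hypothetical provisioning record the PF keeps for each VF. */
struct vf_provision {
	u64 vram_base;	/* start of the VF's slice in the PF's VRAM pool */
	u64 vram_size;	/* size of the slice */
};

/* Hypothetical helper: VF-relative offset -> PF view of VRAM. */
static int pf_resolve_vf_offset(const struct vf_provision *prov,
				u64 vf_offset, u64 len, u64 *pf_addr)
{
	/* Reject anything outside (or overflowing) the VF's slice. */
	if (vf_offset + len < vf_offset ||
	    vf_offset + len > prov->vram_size)
		return -EINVAL;

	*pf_addr = prov->vram_base + vf_offset;
	return 0;
}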

Matt

> Thanks,
> Vivek
> 
> > 
> > Jason
