umem: Support importing dma-buf as user memory region

Xiong, Jianxin Mon, 05 Oct 2020 09:21:17 -0700

> -----Original Message-----
> From: Jason Gunthorpe <[email protected]>
> Sent: Monday, October 05, 2020 6:13 AM
> To: Xiong, Jianxin <[email protected]>
> Cc: [email protected]; [email protected]; Doug Ledford 
> <[email protected]>; Leon Romanovsky
> <[email protected]>; Sumit Semwal <[email protected]>; Christian Koenig 
> <[email protected]>; Vetter, Daniel
> <[email protected]>
> Subject: Re: [RFC PATCH v3 1/4] RDMA/umem: Support importing dma-buf as user 
> memory region
> 
> On Sun, Oct 04, 2020 at 12:12:28PM -0700, Jianxin Xiong wrote:
> > Dma-buf is a standard cross-driver buffer sharing mechanism that can
> > be used to support peer-to-peer access from RDMA devices.
> >
> > Device memory exported via dma-buf is associated with a file descriptor.
> > This is passed to the user space as a property associated with the
> > buffer allocation. When the buffer is registered as a memory region,
> > the file descriptor is passed to the RDMA driver along with other
> > parameters.
> >
> > Implement the common code for importing dma-buf object and mapping
> > dma-buf pages.
> >
> > Signed-off-by: Jianxin Xiong <[email protected]>
> > Reviewed-by: Sean Hefty <[email protected]>
> > Acked-by: Michael J. Ruhl <[email protected]>
> > ---
> >  drivers/infiniband/core/Makefile      |   2 +-
> >  drivers/infiniband/core/umem.c        |   4 +
> >  drivers/infiniband/core/umem_dmabuf.c | 291 ++++++++++++++++++++++++++++++++++
> >  drivers/infiniband/core/umem_dmabuf.h |  14 ++
> >  drivers/infiniband/core/umem_odp.c    |  12 ++
> >  include/rdma/ib_umem.h                |  19 ++-
> >  6 files changed, 340 insertions(+), 2 deletions(-)
> >  create mode 100644 drivers/infiniband/core/umem_dmabuf.c
> >  create mode 100644 drivers/infiniband/core/umem_dmabuf.h
> 
> I think this is using ODP too literally, dmabuf isn't going to need fine 
> grained page faults, and I'm not sure this locking scheme is OK - ODP is
> horrifically complicated.
>


> If this is the approach then I think we should make dmabuf its own stand 
> alone API, reg_user_mr_dmabuf()

That was the approach in the first version. We can go back to it.

> 
> The implementation in mlx5 will be much more understandable, it would just do 
> dma_buf_dynamic_attach() and program the XLT exactly
> the same as a normal umem.
> 
> The move_notify() simply zap's the XLT and triggers a work to reload it after 
> the move. Locking is provided by the dma_resv_lock. Only a
> small disruption to the page fault handler is needed.
> 

We considered such a scheme but did not pursue it because there is no 
notification when the move is done, so the work would not know when it can 
reload.

Thinking about it again, we could probably signal the reload from the page 
fault handler.

> > +   dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
> > +   sgt = dma_buf_map_attachment(umem_dmabuf->attach,
> > +                                DMA_BIDIRECTIONAL);
> > +   dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
> 
> This doesn't look right, this lock has to be held up until the HW is 
> programmed

The mapping remains valid until it is invalidated again. A sequence number is 
checked before the HW is programmed.
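
To make the sequence check concrete, here is a minimal userspace sketch. It is
not the patch code: struct mapping_model and the mapping_*() helpers are
hypothetical names made up for illustration, and the model is single-threaded.
In a driver, the check and the HW programming would have to run under the same
lock that the invalidation path takes; otherwise an invalidation could slip in
between the two steps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a mapping guarded by an invalidation sequence. */
struct mapping_model {
	unsigned long notifier_seq;	/* bumped on every invalidation */
	bool hw_programmed;		/* true while the "HW" holds the mapping */
};

/* Record the sequence number seen when the mapping is created. */
unsigned long mapping_begin(const struct mapping_model *m)
{
	assert(m);
	return m->notifier_seq;
}

/* Invalidation path: bump the sequence and zap the "HW" state. */
void mapping_invalidate(struct mapping_model *m)
{
	assert(m);
	m->notifier_seq++;
	m->hw_programmed = false;
}

/*
 * Program the "HW" only if no invalidation happened since mapping_begin().
 * Returns false when the mapping is stale; the caller must remap and retry.
 * Assumption: the caller holds the lock that also serializes
 * mapping_invalidate().
 */
bool mapping_commit(struct mapping_model *m, unsigned long seq)
{
	assert(m);
	if (m->notifier_seq != seq)
		return false;
	m->hw_programmed = true;
	return true;
}
```

A caller would take seq = mapping_begin(), map the attachment, and then call
mapping_commit(seq). If that returns false, the mapping was invalidated in the
meantime and the whole sequence is repeated.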

> 
> The use of atomic looks probably wrong as well.

Do you mean umem_dmabuf->notifier_seq? Could you elaborate on the concern?

> 
> > +   k = 0;
> > +   total_pages = ib_umem_odp_num_pages(umem_odp);
> > +   for_each_sg(umem->sg_head.sgl, sg, umem->sg_head.nents, j) {
> > +           addr = sg_dma_address(sg);
> > +           pages = sg_dma_len(sg) >> page_shift;
> > +           while (pages > 0 && k < total_pages) {
> > +                   umem_odp->dma_list[k++] = addr | access_mask;
> > +                   umem_odp->npages++;
> > +                   addr += page_size;
> > +                   pages--;
> 
> This isn't fragmenting the sg into a page list properly, won't work for 
> unaligned things

I thought the addresses were aligned, but I will add explicit alignment here.
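
As a rough illustration of what explicit alignment could look like, here is a
userspace sketch, not the patch code. struct dma_seg and fill_page_list() are
hypothetical stand-ins for the scatterlist entries and the dma_list fill loop,
and the access_mask bits are left out. It rounds each segment start down to a
page boundary and emits one entry for every page the segment touches, so a
partial page at either end is not dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct dma_seg {
	uint64_t addr;	/* DMA address of the segment, may be unaligned */
	uint64_t len;	/* length in bytes */
};

/*
 * Fill list[] with one page-aligned DMA address per page touched by the
 * segments. Returns the number of entries written, at most max_entries.
 */
size_t fill_page_list(const struct dma_seg *segs, size_t nsegs,
		      unsigned int page_shift,
		      uint64_t *list, size_t max_entries)
{
	const uint64_t page_size = 1ULL << page_shift;
	const uint64_t page_mask = ~(page_size - 1);
	size_t k = 0;

	assert(page_shift < 64);

	for (size_t i = 0; i < nsegs; i++) {
		uint64_t cur, end;

		if (segs[i].len == 0)
			continue;

		cur = segs[i].addr & page_mask;		/* round start down */
		end = segs[i].addr + segs[i].len;	/* exclusive end */

		for (; cur < end && k < max_entries; cur += page_size)
			list[k++] = cur;
	}

	return k;
}
```

For example, with 4 KiB pages a segment at 0x1800 of length 0x1000 yields the
two entries 0x1000 and 0x2000, where shifting the length alone would count
only one page. Whether an unaligned interior segment should be accepted at all
or rejected is a separate question that this sketch does not settle.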

> 
> And really we don't need the dma_list for this case, with a fixed whole 
> mapping DMA SGL a normal umem sgl is OK and the normal umem
> XLT programming in mlx5 is fine.

The dma_list is used by both "populate_mtt()" and "mlx5_ib_invalidate_range()", 
which are used for XLT programming and invalidating (zapping), respectively.

> 
> Jason
