On Sat, Aug 30, 2025 at 4:58 PM Barry Song <[email protected]> wrote:
>
> From: Barry Song <[email protected]>
>
> We can allocate high-order pages, but mapping them one by
> one is inefficient. This patch changes the code to map
> as large a chunk as possible. The code looks somewhat
> complicated mainly because supporting mmap with a
> non-zero offset is a bit tricky.
>
> Using the micro-benchmark below, we see that mmap becomes
> 3.5X faster:
...

It's been a while since I've done mm things, so take this with a pinch of
salt, but it seems reasonable to me.

Though, one thought below...

> diff --git a/drivers/dma-buf/heaps/system_heap.c 
> b/drivers/dma-buf/heaps/system_heap.c
> index bbe7881f1360..4c782fe33fd4 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -186,20 +186,35 @@ static int system_heap_mmap(struct dma_buf *dmabuf, 
> struct vm_area_struct *vma)
>         struct system_heap_buffer *buffer = dmabuf->priv;
>         struct sg_table *table = &buffer->sg_table;
>         unsigned long addr = vma->vm_start;
> -       struct sg_page_iter piter;
> -       int ret;
> +       unsigned long pgoff = vma->vm_pgoff;
> +       struct scatterlist *sg;
> +       int i, ret;
> +
> +       for_each_sgtable_sg(table, sg, i) {
> +               unsigned long n = sg->length >> PAGE_SHIFT;
>
> -       for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
> -               struct page *page = sg_page_iter_page(&piter);
> +               if (pgoff < n)
> +                       break;
> +               pgoff -= n;
> +       }
> +
> +       for (; sg && addr < vma->vm_end; sg = sg_next(sg)) {
> +               unsigned long n = (sg->length >> PAGE_SHIFT) - pgoff;
> +               struct page *page = sg_page(sg) + pgoff;
> +               unsigned long size = n << PAGE_SHIFT;
> +
> +               if (addr + size > vma->vm_end)
> +                       size = vma->vm_end - addr;
>
> -               ret = remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
> -                                     vma->vm_page_prot);
> +               ret = remap_pfn_range(vma, addr, page_to_pfn(page),
> +                               size, vma->vm_page_prot);

It feels like this sort of mapping loop for higher-order pages wouldn't
be a pattern unique to this code. Would it be better worked into a
helper so it's more generally usable?

Otherwise,
Acked-by: John Stultz <[email protected]>

thanks
-john