Thank you for your contribution! Sashiko AI review found 2 potential issue(s) 
to consider:

New issues:
- [Critical] Arithmetic overflow in `calc_sg_nents` on 32-bit systems causes an 
under-allocation of scatterlist entries, leading to a NULL pointer dereference 
in `fill_sg_entry`.

Pre-existing issues:
- [High] Incorrect DMA unmapping of split scatterlist chunks and unpopulated 
entries violates the DMA API and corrupts IOMMU state.
--

commit f3f35bb73743f61d1b46175517c947331e0902f1
Author: David Hu <[email protected]>

dma-buf: Split sgl into page-aligned 2G chunks

Currently, fill_sg_entry() splits the scatterlist using UINT_MAX. This patch
changes the split to 2G chunks to satisfy page alignment and natural hardware
boundaries for DMA engines.

Link: https://lore.kernel.org/all/[email protected]/

> diff --git a/drivers/dma-buf/dma-buf-mapping.c 
> b/drivers/dma-buf/dma-buf-mapping.c
> index 794acff2546a3..2d88e08c5ebf9 100644
> --- a/drivers/dma-buf/dma-buf-mapping.c
> +++ b/drivers/dma-buf/dma-buf-mapping.c
> @@ -5,16 +5,17 @@
>   */
>  #include <linux/dma-buf-mapping.h>
>  #include <linux/dma-resv.h>
> +#include <linux/sizes.h>
> +
> +#define MAX_SG_ENT_SZ ((size_t)SZ_2G)
>  
>  static struct scatterlist *fill_sg_entry(struct scatterlist *sgl, size_t 
> length,
>                                        dma_addr_t addr)
>  {
> -     unsigned int len, nents;
> -     int i;
> +     size_t len;
>  
> -     nents = DIV_ROUND_UP(length, UINT_MAX);
> -     for (i = 0; i < nents; i++) {
> -             len = min_t(size_t, length, UINT_MAX);
> +     while (length) {
> +             len = min(length, MAX_SG_ENT_SZ);
>               length -= len;
>               /*
>                * DMABUF abuses scatterlist to create a scatterlist
> @@ -24,11 +25,12 @@ static struct scatterlist *fill_sg_entry(struct 
> scatterlist *sgl, size_t length,
>                * does not require the CPU list for mapping or unmapping.
>                */
>               sg_set_page(sgl, NULL, 0, 0);

[Severity: Critical]
Will this lead to a NULL pointer dereference on 32-bit systems?

If phys_vec[i].len is greater than 2GB (e.g. 0x80000001), the DIV_ROUND_UP
macro in calc_sg_nents() will overflow because size_t is 32-bit and
MAX_SG_ENT_SZ is 0x80000000:

    (0x80000001 + 0x80000000 - 1) / 0x80000000 == 0

This means no SGL entries are calculated or allocated for this region. However,
the while loop in fill_sg_entry() will still iterate to process the remaining
length. It will eventually advance past the end of the undersized scatterlist
table via:

    sgl = sg_next(sgl);

which becomes NULL, and then dereference it on the next iteration via
sg_set_page(sgl, ...).

> -             sg_dma_address(sgl) = addr + (dma_addr_t)i * UINT_MAX;
> +             sg_dma_address(sgl) = addr;
>               sg_dma_len(sgl) = len;
> +             addr += len;
> +             /* Unconditionally advance. On last segment, this becomes NULL 
> */
>               sgl = sg_next(sgl);
>       }
> -
>       return sgl;
>  }
>  
> @@ -41,14 +43,14 @@ static unsigned int calc_sg_nents(struct dma_iova_state 
> *state,
>  
>       if (!state || !dma_use_iova(state)) {
>               for (i = 0; i < nr_ranges; i++)
> -                     nents += DIV_ROUND_UP(phys_vec[i].len, UINT_MAX);
> +                     nents += DIV_ROUND_UP(phys_vec[i].len, MAX_SG_ENT_SZ);
>       } else {
>               /*
>                * In IOVA case, there is only one SG entry which spans
>                * for whole IOVA address space, but we need to make sure
>                * that it fits sg->length, maybe we need more.
>                */
> -             nents = DIV_ROUND_UP(size, UINT_MAX);
> +             nents = DIV_ROUND_UP(size, MAX_SG_ENT_SZ);
>       }
>  
>       return nents;

[Severity: High]
This isn't a bug introduced by this patch, but does iterating over the SGL to
unmap chunked entries violate the DMA API?

In dma_buf_phys_vec_to_sgt(), dma_map_phys() maps the full, contiguous original
length of phys_vec[i].len. However, fill_sg_entry() splits this mapped region
into smaller chunks.

When unmapping, both the error path in dma_buf_phys_vec_to_sgt() and the
teardown in dma_buf_free_sgt() iterate over the split SGL entries:

drivers/dma-buf/dma-buf-mapping.c:dma_buf_free_sgt() {
    ...
                for_each_sgtable_dma_sg(sgt, sgl, i)
                        dma_unmap_phys(attach->dev, sg_dma_address(sgl),
                                       sg_dma_len(sgl), dir, DMA_ATTR_MMIO);
    ...
}

Does the DMA API permit unmapping partial chunks instead of the exact length
and address originally mapped?

Additionally, in the error path of dma_buf_phys_vec_to_sgt():

drivers/dma-buf/dma-buf-mapping.c:dma_buf_phys_vec_to_sgt() {
    ...
err_unmap_dma:
        if (!i || !dma->state) {
                ; /* Do nothing */
        ...
        } else {
                for_each_sgtable_dma_sg(&dma->sgt, sgl, i)
                        dma_unmap_phys(attach->dev, sg_dma_address(sgl),
                                       sg_dma_len(sgl), dir, DMA_ATTR_MMIO);
        }
    ...
}

Is it safe to reuse the outer loop variable i here? Since i was tracking the
successfully mapped count, using it in for_each_sgtable_dma_sg() clobbers it
and forces iteration up to the sgt->nents capacity, which could unmap
unpopulated zero-initialized entries or page 0.

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to