Thank you for your contribution! Sashiko AI review found 6 potential issue(s) 
to consider:

Pre-existing issues:
- [High] DIV_ROUND_UP(length, UINT_MAX) silently evaluates to 0 on 32-bit 
architectures, causing allocation failures.
- [High] Error recovery path unmaps uninitialized scatterlist entries and 
improperly unmaps split entries.
- [High] dma_iova_link() is consistently called with a hardcoded offset of 0, 
causing overlapping IOVA mappings.
- [High] IOVA space allocated by dma_iova_try_alloc() is permanently leaked on 
multiple error paths.
- [High] Mismatch between caller-provided size and sum of phys_vec[i].len 
causes out-of-bounds write and NULL pointer dereference.
- [High] Splitting physical memory ranges with UINT_MAX creates unaligned 
scatterlist entries, violating IOMMU page alignment rules.
--

commit 451e93086fcd7b734bfb9aa07420089c45fd9663
Author: David Hu <[email protected]>

dma-buf: Fix silent overflow for phys vec to sgt

This patch changes mapped_len to size_t to fix a silent wrap-around
when accumulating lengths. It also adds check_add_overflow() in
calc_sg_nents and uses unsigned int for the loop iterator in
fill_sg_entry.

Cc: [email protected]
Cc: [email protected]

> diff --git a/drivers/dma-buf/dma-buf-mapping.c 
> b/drivers/dma-buf/dma-buf-mapping.c
> index 794acff2546a3..80f6ab2f4809d 100644
> --- a/drivers/dma-buf/dma-buf-mapping.c
> +++ b/drivers/dma-buf/dma-buf-mapping.c
> @@ -5,12 +5,13 @@
>   */
>  #include <linux/dma-buf-mapping.h>
>  #include <linux/dma-resv.h>
> +#include <linux/overflow.h>
>  
>  static struct scatterlist *fill_sg_entry(struct scatterlist *sgl, size_t 
> length,
>                                        dma_addr_t addr)
>  {
>       unsigned int len, nents;
> -     int i;
> +     unsigned int i;
>  
>       nents = DIV_ROUND_UP(length, UINT_MAX);

[Severity: High]
This is a pre-existing issue, but splitting physical memory ranges using
UINT_MAX can create unaligned scatterlist entries:

drivers/dma-buf/dma-buf-mapping.c:fill_sg_entry() {
        ...
        for (i = 0; i < nents; i++) {
                len = min_t(size_t, length, UINT_MAX);
                ...
                sg_dma_address(sgl) = addr + (dma_addr_t)i * UINT_MAX;
                ...
        }
        ...
}

Because UINT_MAX (0xFFFFFFFF) is not page-aligned, if a physical range exceeds
UINT_MAX, the first segment has a non-page-aligned length, and the next segment
will start at an unaligned DMA address.

Since IOMMU mapping and DMA engine APIs require middle segments to be aligned
to hardware page boundaries, could passing unaligned addresses and sizes cause
silent DMA corruption or explicit mapping failures? Would it be safer to align
the max length chunk to the page mask (e.g., ALIGN_DOWN(UINT_MAX, PAGE_SIZE))?

[ ... ]

> @@ -40,8 +41,12 @@ static unsigned int calc_sg_nents(struct dma_iova_state 
> *state,
>       size_t i;
>  
>       if (!state || !dma_use_iova(state)) {
> -             for (i = 0; i < nr_ranges; i++)
> -                     nents += DIV_ROUND_UP(phys_vec[i].len, UINT_MAX);
> +             for (i = 0; i < nr_ranges; i++) {
> +                     unsigned int added = DIV_ROUND_UP(phys_vec[i].len, 
> UINT_MAX);

[Severity: High]
This isn't a bug introduced by this patch, but on 32-bit architectures,
size_t and unsigned int are 32 bits. The macro DIV_ROUND_UP(n, d) evaluates
as ((n) + (d) - 1) / (d).

When phys_vec[i].len <= UINT_MAX (e.g., 4096), the expression
4096 + 0xFFFFFFFF - 1 wraps around in 32-bit unsigned arithmetic to 4094.
Consequently, 4094 / 0xFFFFFFFF results in 0.

Does this cause calc_sg_nents() and fill_sg_entry() to compute 0 entries,
forcing sg_alloc_table() to fail with -EINVAL and permanently breaking DMA
mapping on 32-bit platforms?

> +                     if (check_add_overflow(nents, added, &nents))
> +                             return 0;
> +             }
>       } else {
>               /*
>                * In IOVA case, there is only one SG entry which spans

[ ... ]

> @@ -133,6 +139,8 @@ struct sg_table *dma_buf_phys_vec_to_sgt(struct 
> dma_buf_attachment *attach,
>       }
>  
>       nents = calc_sg_nents(dma->state, phys_vec, nr_ranges, size);
> +
> +     /* sg_alloc_table will cleanly fail and return -EINVAL if nents == 0 */
>       ret = sg_alloc_table(&dma->sgt, nents, GFP_KERNEL | __GFP_ZERO);
>       if (ret)
>               goto err_free_state;

[Severity: High]
This isn't a bug introduced by this patch, but if there's a mismatch between
the caller-provided size and the sum of phys_vec[i].len, could it cause an
out-of-bounds write?

drivers/dma-buf/dma-buf-mapping.c:dma_buf_phys_vec_to_sgt() {
        ...
        if (dma->state && dma_use_iova(dma->state)) {
                WARN_ON_ONCE(mapped_len != size);
                ret = dma_iova_sync(attach->dev, dma->state, 0, mapped_len);
                if (ret)
                        goto err_unmap_dma;

                sgl = fill_sg_entry(sgl, mapped_len, dma->state->addr);
        }
        ...
}

If size is smaller than mapped_len, the SG table is allocated with nents
derived from size, but fill_sg_entry() is later called with mapped_len.
The loop inside fill_sg_entry() would then iterate more times than the
allocated size of the scatterlist.

Could this cause sg_next() to return NULL, leading to a NULL pointer
dereference and out-of-bounds write on the sgl object since execution proceeds
past the WARN_ON_ONCE?

[Severity: High]
This is a pre-existing issue, but could the IOVA space allocated by
dma_iova_try_alloc() be permanently leaked on multiple error paths?

drivers/dma-buf/dma-buf-mapping.c:dma_buf_phys_vec_to_sgt() {
        ...
err_unmap_dma:
        if (!i || !dma->state) {
                ; /* Do nothing */
        } else if (dma_use_iova(dma->state)) {
                dma_iova_destroy(attach->dev, dma->state, mapped_len, dir,
                                 DMA_ATTR_MMIO);
        }
        ...
err_free_state:
        kfree(dma->state);
err_free_dma:
        kfree(dma);
        return ERR_PTR(ret);
}

If sg_alloc_table() fails, execution jumps to err_free_state, which frees
the state memory but doesn't call dma_iova_free() or dma_iova_destroy() to
release the IOMMU address space.

Additionally, if dma_iova_link() fails on the first iteration (i == 0),
it jumps to err_unmap_dma where (!i) is true, skipping dma_iova_destroy()
and proceeding to err_free_state, again leaking the address space. Does
this logic need adjusting to properly release the IOVA space?

[Severity: High]
This isn't a bug introduced by this patch, but dma_iova_link() appears to be
consistently called with a hardcoded offset of 0:

drivers/dma-buf/dma-buf-mapping.c:dma_buf_phys_vec_to_sgt() {
        ...
                } else if (dma_use_iova(dma->state)) {
                        ret = dma_iova_link(attach->dev, dma->state,
                                            phys_vec[i].paddr, 0,
                                            phys_vec[i].len, dir,
                                            DMA_ATTR_MMIO);
        ...
}

If there are multiple physical ranges (nr_ranges > 1), wouldn't every single
range be mapped to the exact same start address within the allocated IOVA
space? Could this overwrite previous mappings and corrupt the IOMMU page
tables? Should the offset parameter be the accumulated mapped_len instead
of 0?

[Severity: High]
This isn't a bug introduced by this patch, but could the error recovery path
improperly unmap scatterlist entries?

drivers/dma-buf/dma-buf-mapping.c:dma_buf_phys_vec_to_sgt() {
        ...
err_unmap_dma:
        ...
        } else {
                for_each_sgtable_dma_sg(&dma->sgt, sgl, i)
                        dma_unmap_phys(attach->dev, sg_dma_address(sgl),
                                       sg_dma_len(sgl), dir, DMA_ATTR_MMIO);
        }
        ...
}

If dma_map_phys() fails mid-loop, for_each_sgtable_dma_sg() iterates over
all dma->sgt.nents allocated entries. Unpopulated trailing entries will have
addr=0 and len=0, leading to invalid dma_unmap_phys() calls.

Furthermore, if a range exceeded UINT_MAX, it was mapped via dma_map_phys()
as a single contiguous chunk but split into multiple SG entries by
fill_sg_entry(). Doesn't unmapping these sub-chunks piece-wise in the SG
iteration loop violate the DMA API, which requires the exact contiguous
dma_addr_t and size originally mapped?

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to