On Sun, 21 Jun 2026 22:21:30 +0000 David Hu <[email protected]> wrote:
> Currently, `fill_sg_entry()` splits the scatterlist using `UINT_MAX`.
> This creates a non-page-aligned DMA length (`0xFFFFFFFF`) for the
> first entry, resulting in non-page-aligned DMA addresses for all
> subsequent entries.

How did you find this?
It requires a single buffer over 4GB - seems highly unlikely.

>
> While the underlying IOMMU mapping may be contiguous, hardware
> DMA engines often require explicit address alignment (e.g., page,
> cacheline, or storage sector boundaries). Passing unaligned
> addresses and lengths can cause explicit failures in DMA descriptor
> creation or silent data corruption if lower unaligned bits are
> truncated.
>
> Fix this by splitting the scatterlist by the largest possible page
> aligned chunk within `UINT_MAX` (`ALIGN_DOWN(UINT_MAX, PAGE_SIZE)`).
> This ensures all scatterlist DMA addresses and lengths remain page
> aligned and satisfy hardware constraints.

It would almost certainly be better to split into 2G chunks.
That removes any need for any divisions.

> Page-aligned entries allow the system to cleanly chunk payloads into
> PCIe MaxPayloadSize (MPS) (e.g., 128 bytes, 256 bytes, 512 bytes).
> As a result, this may help reduce TLP fragmentation in P2P transfers
> and alleviate potential congestion within a logical PCIe switch
> partition, especially when Relaxed Ordering is not possible due to
> hardware constraints.
>
> Reported-by: sashiko-bot <[email protected]>
> Closes:
> https://lore.kernel.org/all/[email protected]/
> Fixes: 3aa31a8bb11e ("dma-buf: provide phys_vec to scatter-gather mapping
> routine")
> Cc: [email protected]
> Signed-off-by: David Hu <[email protected]>
> ---
>  drivers/dma-buf/dma-buf-mapping.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-buf-mapping.c
> b/drivers/dma-buf/dma-buf-mapping.c
> index 794acff2546a..f2bde38fdb1f 100644
> --- a/drivers/dma-buf/dma-buf-mapping.c
> +++ b/drivers/dma-buf/dma-buf-mapping.c
> @@ -5,6 +5,9 @@
>   */
>  #include <linux/dma-buf-mapping.h>
>  #include <linux/dma-resv.h>
> +#include <linux/align.h>
> +
> +#define MAX_ENT_SZ ALIGN_DOWN(UINT_MAX, PAGE_SIZE)
>
>  static struct scatterlist *fill_sg_entry(struct scatterlist *sgl, size_t
> length,
>  					 dma_addr_t addr)
> @@ -12,9 +15,9 @@ static struct scatterlist *fill_sg_entry(struct scatterlist
> *sgl, size_t length,
>  	unsigned int len, nents;
>  	int i;
>
> -	nents = DIV_ROUND_UP(length, UINT_MAX);
> +	nents = DIV_ROUND_UP(length, MAX_ENT_SZ);
>  	for (i = 0; i < nents; i++) {

Why not change that to 'while (length) {' to avoid the division above.

> -		len = min_t(size_t, length, UINT_MAX);
> +		len = min_t(size_t, length, MAX_ENT_SZ);

I bet that doesn't need to be min_t()

>  		length -= len;
>  		/*
>  		 * DMABUF abuses scatterlist to create a scatterlist
> @@ -24,7 +27,7 @@ static struct scatterlist *fill_sg_entry(struct scatterlist
> *sgl, size_t length,
>  		 * does not require the CPU list for mapping or unmapping.
>  		 */
>  		sg_set_page(sgl, NULL, 0, 0);
> -		sg_dma_address(sgl) = addr + (dma_addr_t)i * UINT_MAX;
> +		sg_dma_address(sgl) = addr + (dma_addr_t)i * MAX_ENT_SZ;
>  		sg_dma_len(sgl) = len;

Replace the multiply with 'addr += len'.

-- David

>  		sgl = sg_next(sgl);
>  	}
> @@ -41,14 +44,14 @@ static unsigned int calc_sg_nents(struct dma_iova_state
> *state,
>
>  	if (!state || !dma_use_iova(state)) {
>  		for (i = 0; i < nr_ranges; i++)
> -			nents += DIV_ROUND_UP(phys_vec[i].len, UINT_MAX);
> +			nents += DIV_ROUND_UP(phys_vec[i].len, MAX_ENT_SZ);
>  	} else {
>  		/*
>  		 * In IOVA case, there is only one SG entry which spans
>  		 * for whole IOVA address space, but we need to make sure
>  		 * that it fits sg->length, maybe we need more.
>  		 */
> -		nents = DIV_ROUND_UP(size, UINT_MAX);
> +		nents = DIV_ROUND_UP(size, MAX_ENT_SZ);
>  	}
>
>  	return nents;
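
For reference, the loop shape suggested in the review comments above
('while (length)', a 2G chunk, 'addr += len') could look roughly like the
following. This is an untested userspace sketch, not a patch:
'struct sketch_ent', 'sketch_fill()', SKETCH_CHUNK and SKETCH_PAGE_SIZE are
hypothetical names used only for illustration, and a 4 KiB page size is
assumed. The real fill_sg_entry() would keep its sg_set_page(),
sg_dma_address(), sg_dma_len() and sg_next() calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this userspace sketch, not kernel definitions. */
#define SKETCH_PAGE_SIZE 4096u       /* assumed 4 KiB pages */
#define SKETCH_CHUNK     (1u << 31)  /* 2 GiB: page aligned, fits 'unsigned int' */

/* Hypothetical stand-in for the two scatterlist fields the loop fills. */
struct sketch_ent {
	uint64_t dma_address;
	unsigned int dma_len;
};

/*
 * Split [addr, addr + length) into entries of at most SKETCH_CHUNK bytes.
 * The loop needs no division, no multiply and no index variable.
 * Returns the number of entries required; at most max_ents are stored.
 */
static size_t sketch_fill(struct sketch_ent *ents, size_t max_ents,
			  uint64_t addr, uint64_t length)
{
	size_t n = 0;

	while (length) {
		/* Plain comparison; both operands are unsigned. */
		unsigned int len = length < SKETCH_CHUNK ?
				   (unsigned int)length : SKETCH_CHUNK;

		if (n < max_ents) {
			ents[n].dma_address = addr;
			ents[n].dma_len = len;
		}
		n++;
		addr += len;	/* replaces addr + i * chunk */
		length -= len;
	}
	return n;
}
```

With a page-aligned start address, every entry except possibly the last has
a page-aligned address and length, because 2G is a multiple of any
power-of-two page size up to 2G. calc_sg_nents() would have to count with
the same chunk size; its DIV_ROUND_UP() by a power-of-two constant should
reduce to an add and a shift.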
