On Wed, 2025-09-10 at 10:34 +0200, Christian König wrote:
> On 09.09.25 18:56, Timur Kristóf wrote:
> > > Even when we apply it I think we should drop that, the value the
> > > kernel uses is correct.
> > 
> > Hi Christian,
> > 
> > The kernel and Mesa disagree on the limits for almost all SDMA
> > versions, so it would be nice to actually understand what the
> > limits of
> > the SDMA HW are and use the same limit in the kernel and Mesa, or
> > if
> > that isn't viable, let's document why the different limits make
> > sense.
> > 
> > I'm adding Marek to CC, he wrote the comment that I referenced
> > here.
> > As far as I understand from my conversation with Marek, the kernel
> > is
> > actually wrong.
> > 
> > If the limits depend on alignment, then we should either set a
> > limit
> > that is always safe, or make sure SDMA copies in the kernel are
> > always
> > aligned and add assertions about it.
> 
> That's already done. See the size restrictions applied to BOs and the
> callers of amdgpu_copy_buffer().

Based on the code comments I cited, as far as I understand, the issue
is with copying the last 256 bytes of 2^22-1. Do I understand your
response correctly: are you saying that the kernel isn't affected by
this issue because it always copies things that are 256-byte aligned?

I checked the callers of amdgpu_copy_buffer and can't find what you are
referring to. However, assuming that all callers use amdgpu_copy_buffer
on 256 byte aligned addresses, there is still an issue with large BOs:

When the kernel copies a BO that is larger than 0x3fffe0 bytes, it
needs to emit multiple SDMA copy packets. Since 0x3fffe0 is not a
multiple of 256, the copy done by the second packet (and by subsequent
packets) won't start at a 256-byte aligned offset.
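
A rough sketch of the splitting described above, written as a Python
model rather than actual kernel code. Only the 0x3fffe0 and 0x3fff00
values come from this thread; split_copy() and its (offset, size)
tuples are hypothetical names made up for illustration, and the model
assumes each packet simply continues where the previous one ended.

```python
# Hypothetical model of splitting one large copy into SDMA-sized packets.
# This is NOT the kernel's amdgpu_copy_buffer(); it only mirrors the
# "advance by the max size per packet" behaviour discussed in the thread.
KERNEL_MAX_BYTES = 0x3FFFE0   # limit discussed above for the kernel
MESA_MAX_BYTES = 0x3FFF00     # limit used by PAL and Mesa
ALIGNMENT = 256


def split_copy(total_bytes, max_bytes):
    """Return a list of (offset, size) pairs, one per emitted packet."""
    packets = []
    offset = 0
    while offset < total_bytes:
        size = min(max_bytes, total_bytes - offset)
        packets.append((offset, size))
        offset += size
    return packets


def misaligned_offsets(packets, alignment=ALIGNMENT):
    """Return the packet start offsets that are not alignment-byte aligned."""
    return [offset for offset, _ in packets if offset % alignment != 0]


bo_size = 8 * 1024 * 1024  # an 8 MiB BO, chosen arbitrarily
for limit in (KERNEL_MAX_BYTES, MESA_MAX_BYTES):
    packets = split_copy(bo_size, limit)
    bad = [hex(o) for o in misaligned_offsets(packets)]
    print(hex(limit), "packets:", len(packets), "misaligned starts:", bad)
```

With the 0x3fffe0 limit the second packet of the 8 MiB copy starts at
offset 0x3fffe0, which is 0xe0 bytes past a 256-byte boundary. With
0x3fff00 every packet in this model starts on a 256-byte boundary,
provided the copy itself starts aligned.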

> 
> We could add another warning to amdgpu_copy_buffer(), but that is
> just the backend function.
> 
> > Looking at the implementation of
> > amdgpu_copy_buffer in the kernel, I see that it relies on
> > copy_max_bytes and doesn't take alignment into account, so with the
> > current limit it could issue subsequent copies that aren't 256 byte
> > aligned.
> 
> "Due to HW limitation, the maximum count may not be 2^n-1, can only
> be 2^n - 1 - start_addr[4:2]"
> 
> Well according to this comments the size restriction is 32 bytes (256
> bits!) and not 256 bytes.
> 
> Were do you actually get the 256 bytes restriction from?

The comments I cited above are from the following sources:

PAL uses (1 << 22) - 256 = 0x3fff00 here:
https://github.com/GPUOpen-Drivers/pal/blob/bcec463efe5260776d486a5e3da0c549bc0a75d2/src/core/hw/ossip/oss2/oss2DmaCmdBuffer.cpp#L308

Mesa also uses 0x3fff00 here:
https://gitlab.freedesktop.org/mesa/mesa/-/blob/47f5d25f93fd36224c112ee2d48e511ae078f8c2/src/amd/common/sid.h#L390
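
For reference, the two limits compared in this thread can be checked
with a few lines of arithmetic. This is only a sanity check of the
numbers quoted here; the constant names are illustrative and do not
correspond to identifiers in the kernel, PAL, or Mesa.

```python
# Values quoted in this thread; the names are made up for illustration.
TWO_POW_22 = 1 << 22
KERNEL_LIMIT = 0x3FFFE0             # value discussed for the kernel
PAL_MESA_LIMIT = (1 << 22) - 256    # shift first, then subtract

for name, limit in (("kernel", KERNEL_LIMIT), ("PAL/Mesa", PAL_MESA_LIMIT)):
    margin = TWO_POW_22 - limit     # how far below 2^22 the limit sits
    aligned = limit % 256 == 0      # is the limit a multiple of 256?
    print(name, hex(limit), "margin:", margin, "multiple of 256:", aligned)
```

The kernel value sits 32 bytes below 2^22 and is not a multiple of 256,
while the PAL/Mesa value sits 256 bytes below 2^22 and is.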

The limit in Mesa was added by this commit:
https://gitlab.freedesktop.org/mesa/mesa/-/commit/2c1f249f2b61be50222411bc0d41c095004232ed
According to the commit message, Dave added this limit when hitting an
issue trying to use SDMA with buffers that are larger than this.

For SDMA v5.2 and newer, a larger limit was added by Marek later:
https://gitlab.freedesktop.org/mesa/mesa/-/commit/a54bcb9429666fcbe38c04660cc4b3f8abbde259
That commit confirms the same issue with copying the last 256 bytes on
these versions, although in this case the kernel isn't technically
wrong because it uses a smaller overall maximum.



