Applied.  Thanks!

On Wed, Jun 10, 2026 at 5:24 AM Christian König
<[email protected]> wrote:
>
> On 6/10/26 10:32, Pavel Ondračka wrote:
> >
> > r100_copy_blit() copies BOs as 1024-pixel-wide ARGB8888 blits, so one
> > GPU page becomes one blit row. Large copies are split into chunks of at
> > most 8191 rows.
> >
> > The kernel register header names the packet coordinate dwords SRC_Y_X
> > and DST_Y_X. In the BITBLT_MULTI description in the
> > R5xx_Acceleration_v1.5.pdf documentation, these correspond to [SRC_X1 | SRC_Y1]
> > and [DST_X1 | DST_Y1], which are signed 13-bit coordinates in the
> > -8192..8191 range. The old code kept SRC/DST_PITCH_OFFSET at the BO base
> > and used SRC_Y_X/DST_Y_X as the chunk address, so large BO moves could
> > exceed that coordinate range.
> >
> > Compute per-chunk SRC/DST_PITCH_OFFSET bases and emit zero source and
> > destination coordinates. r100_copy_blit() already packs
> > SRC/DST_PITCH_OFFSET as pitch plus base offset, so large chunk addresses
> > belong there rather than in the coordinate fields.
> >
> > This fixes Prison Architect corruption with 4096x4096 mipped textures
> > after they are evicted to GTT under memory pressure on RV530.
>
> Wow, impressive piece of work.
>
> > Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/6716
> > Cc: [email protected]
> > Signed-off-by: Pavel Ondračka <[email protected]>
>
> Acked-by: Christian König <[email protected]>
>
> Thanks a lot for digging into this,
> Christian.
>
> > ---
> >  drivers/gpu/drm/radeon/r100.c | 13 +++++++++----
> >  1 file changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
> > index 3ac1a79b6f13..533215d6e9cb 100644
> > --- a/drivers/gpu/drm/radeon/r100.c
> > +++ b/drivers/gpu/drm/radeon/r100.c
> > @@ -906,6 +906,7 @@ struct radeon_fence *r100_copy_blit(struct radeon_device *rdev,
> >  {
> >         struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
> >         struct radeon_fence *fence;
> > +       uint64_t cur_src_offset, cur_dst_offset;
> >         uint32_t cur_pages;
> >         uint32_t stride_bytes = RADEON_GPU_PAGE_SIZE;
> >         uint32_t pitch;
> > @@ -934,6 +935,10 @@ struct radeon_fence *r100_copy_blit(struct radeon_device *rdev,
> >                         cur_pages = 8191;
> >                 }
> >                 num_gpu_pages -= cur_pages;
> > +               cur_src_offset = src_offset +
> > +                       (uint64_t)num_gpu_pages * RADEON_GPU_PAGE_SIZE;
> > +               cur_dst_offset = dst_offset +
> > +                       (uint64_t)num_gpu_pages * RADEON_GPU_PAGE_SIZE;
> >
> >                 /* pages are in Y direction - height
> >                    page width in X direction - width */
> > @@ -950,13 +955,13 @@ struct radeon_fence *r100_copy_blit(struct radeon_device *rdev,
> >                                   RADEON_DP_SRC_SOURCE_MEMORY |
> >                                   RADEON_GMC_CLR_CMP_CNTL_DIS |
> >                                   RADEON_GMC_WR_MSK_DIS);
> > -               radeon_ring_write(ring, (pitch << 22) | (src_offset >> 10));
> > -               radeon_ring_write(ring, (pitch << 22) | (dst_offset >> 10));
> > +               radeon_ring_write(ring, (pitch << 22) | (cur_src_offset >> 10));
> > +               radeon_ring_write(ring, (pitch << 22) | (cur_dst_offset >> 10));
> >                 radeon_ring_write(ring, (0x1fff) | (0x1fff << 16));
> >                 radeon_ring_write(ring, 0);
> >                 radeon_ring_write(ring, (0x1fff) | (0x1fff << 16));
> > -               radeon_ring_write(ring, num_gpu_pages);
> > -               radeon_ring_write(ring, num_gpu_pages);
> > +               radeon_ring_write(ring, 0);
> > +               radeon_ring_write(ring, 0);
> >                 radeon_ring_write(ring, cur_pages | (stride_pixels << 16));
> >         }
> >         radeon_ring_write(ring, PACKET0(RADEON_DSTCACHE_CTLSTAT, 0));
> > --
> > 2.52.0
> >
>
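
For readers following the arithmetic in the commit message, here is a minimal
user-space sketch of the chunking, not the driver code itself. It assumes a
4096-byte GPU page (believed to match RADEON_GPU_PAGE_SIZE); struct blit_chunk,
next_chunk() and fits_signed_13bit() are hypothetical names introduced only for
illustration. pack_pitch_offset() mirrors the expression in the diff.

```c
#include <stdint.h>

#define GPU_PAGE_SIZE 4096u /* assumption: matches RADEON_GPU_PAGE_SIZE */
#define MAX_ROWS      8191u /* per-blit row limit from the commit message */

/* Hypothetical description of one blit chunk. */
struct blit_chunk {
	uint64_t src_base; /* goes into SRC_PITCH_OFFSET */
	uint64_t dst_base; /* goes into DST_PITCH_OFFSET */
	uint32_t rows;     /* blit height, one GPU page per row */
};

/*
 * Take the next chunk from the tail of the remaining range, as the patched
 * loop does: clamp to MAX_ROWS, subtract, then derive the chunk base from
 * the number of pages still left in front of it.
 */
static inline struct blit_chunk next_chunk(uint64_t src_offset,
					   uint64_t dst_offset,
					   uint32_t *pages_left)
{
	struct blit_chunk c;

	c.rows = *pages_left > MAX_ROWS ? MAX_ROWS : *pages_left;
	*pages_left -= c.rows;
	c.src_base = src_offset + (uint64_t)*pages_left * GPU_PAGE_SIZE;
	c.dst_base = dst_offset + (uint64_t)*pages_left * GPU_PAGE_SIZE;
	return c;
}

/* Range of the signed 13-bit coordinate fields quoted from the docs. */
static inline int fits_signed_13bit(int64_t v)
{
	return v >= -8192 && v <= 8191;
}

/* Same packing as in the diff: pitch in the top bits, base in 1 KiB units. */
static inline uint32_t pack_pitch_offset(uint32_t pitch, uint64_t base)
{
	return (pitch << 22) | (uint32_t)(base >> 10);
}
```

With a 16384-page copy (64 MiB, the size of one 4096x4096 ARGB8888 level), the
first chunk leaves 8193 pages in front of it. The old code wrote that value as
the Y coordinate, which is outside -8192..8191; with the patch it is folded into
the base offset and the coordinate stays 0.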
