Applied. Thanks!
On Wed, Jun 10, 2026 at 5:24 AM Christian König <[email protected]> wrote: > > On 6/10/26 10:32, Pavel Ondračka wrote: > > > > r100_copy_blit() copies BOs as 1024-pixel-wide ARGB8888 blits, so one > > GPU page becomes one blit row. Large copies are split into chunks of at > > most 8191 rows. > > > > The kernel register header names the packet coordinate dwords SRC_Y_X > > and DST_Y_X. In the BITBLT_MULTI description in > > R5xx_Acceleration_v1.5.pdf docs, these correspond to [SRC_X1 | SRC_Y1] > > and [DST_X1 | DST_Y1], which are signed 13-bit coordinates in the > > -8192..8191 range. The old code kept SRC/DST_PITCH_OFFSET at the BO base > > and used SRC_Y_X/DST_Y_X as the chunk address, so large BO moves could > > exceed that coordinate range. > > > > Compute per-chunk SRC/DST_PITCH_OFFSET bases and emit zero source and > > destination coordinates. r100_copy_blit() already packs > > SRC/DST_PITCH_OFFSET as pitch plus base offset, so large chunk addresses > > belong there rather than in the coordinate fields. > > > > This fixes Prison Architect corruption with 4096x4096 mipped textures > > after they are evicted to GTT under memory pressure on RV530. > > Wow, impressive piece of work. > > > Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/6716 > > Cc: [email protected] > > Signed-off-by: Pavel Ondračka <[email protected]> > > Acked-by: Christian König <[email protected]> > > Thanks a lot for digging into this, > Christian. > > > --- > > drivers/gpu/drm/radeon/r100.c | 13 +++++++++---- > > 1 file changed, 9 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c > > index 3ac1a79b6f13..533215d6e9cb 100644 > > --- a/drivers/gpu/drm/radeon/r100.c > > +++ b/drivers/gpu/drm/radeon/r100.c > > @@ -906,6 +906,7 @@ struct radeon_fence *r100_copy_blit(struct > > radeon_device *rdev, > > { > > struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; > > struct radeon_fence *fence; > > + uint64_t cur_src_offset, cur_dst_offset; > > uint32_t cur_pages; > > uint32_t stride_bytes = RADEON_GPU_PAGE_SIZE; > > uint32_t pitch; > > @@ -934,6 +935,10 @@ struct radeon_fence *r100_copy_blit(struct > > radeon_device *rdev, > > cur_pages = 8191; > > } > > num_gpu_pages -= cur_pages; > > + cur_src_offset = src_offset + > > + (uint64_t)num_gpu_pages * RADEON_GPU_PAGE_SIZE; > > + cur_dst_offset = dst_offset + > > + (uint64_t)num_gpu_pages * RADEON_GPU_PAGE_SIZE; > > > > /* pages are in Y direction - height > > page width in X direction - width */ > > @@ -950,13 +955,13 @@ struct radeon_fence *r100_copy_blit(struct > > radeon_device *rdev, > > RADEON_DP_SRC_SOURCE_MEMORY | > > RADEON_GMC_CLR_CMP_CNTL_DIS | > > RADEON_GMC_WR_MSK_DIS); > > - radeon_ring_write(ring, (pitch << 22) | (src_offset >> 10)); > > - radeon_ring_write(ring, (pitch << 22) | (dst_offset >> 10)); > > + radeon_ring_write(ring, (pitch << 22) | (cur_src_offset >> > > 10)); > > + radeon_ring_write(ring, (pitch << 22) | (cur_dst_offset >> > > 10)); > > radeon_ring_write(ring, (0x1fff) | (0x1fff << 16)); > > radeon_ring_write(ring, 0); > > radeon_ring_write(ring, (0x1fff) | (0x1fff << 16)); > > - radeon_ring_write(ring, num_gpu_pages); > > - radeon_ring_write(ring, num_gpu_pages); > > + radeon_ring_write(ring, 0); > > + radeon_ring_write(ring, 0); > > radeon_ring_write(ring, cur_pages | (stride_pixels << 16)); > > } > > radeon_ring_write(ring, PACKET0(RADEON_DSTCACHE_CTLSTAT, 0)); > > -- > > 2.52.0 > > >
