On Thu, Nov 5, 2015 at 7:58 PM, Nicolai Hähnle <[email protected]> wrote:
> On 04.11.2015 00:47, Marek Olšák wrote:
>>
>> From: Marek Olšák <[email protected]>
>>
>> ---
>>   src/gallium/drivers/radeonsi/si_blit.c | 55
>> ++++++++++++++++++++++++++++++++++
>>   1 file changed, 55 insertions(+)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_blit.c
>> b/src/gallium/drivers/radeonsi/si_blit.c
>> index fce014a..e934146 100644
>> --- a/src/gallium/drivers/radeonsi/si_blit.c
>> +++ b/src/gallium/drivers/radeonsi/si_blit.c
>> @@ -731,9 +731,64 @@ static void si_flush_resource(struct pipe_context
>> *ctx,
>>         }
>>   }
>>
>> +static void si_pipe_clear_buffer(struct pipe_context *ctx,
>> +                                 struct pipe_resource *dst,
>> +                                 unsigned offset, unsigned size,
>> +                                 const void *clear_value,
>> +                                 int clear_value_size)
>> +{
>> +       struct si_context *sctx = (struct si_context*)ctx;
>> +       const uint32_t *u32 = clear_value;
>> +       unsigned i;
>> +       bool clear_value_fits_dword = true;
>> +       uint8_t *map;
>> +
>> +       if (clear_value_size > 4)
>> +               for (i = 1; i < clear_value_size / 4; i++)
>> +                       if (u32[0] != u32[i]) {
>> +                               clear_value_fits_dword = false;
>> +                               break;
>> +                       }
>> +
>> +       /* Use CP DMA for the simple case. */
>> +       if (offset % 4 == 0 && size % 4 == 0 && clear_value_fits_dword) {
>> +               uint32_t value = u32[0];
>> +
>> +               switch (clear_value_size) {
>> +               case 1:
>> +                       value &= 0xff;
>> +                       value |= (value << 8) | (value << 16) | (value <<
>> 24);
>> +                       break;
>> +               case 2:
>> +                       value &= 0xffff;
>> +                       value |= value << 16;
>> +                       break;
>> +               }
>
>
> To reduce the chance of complaints by valgrind et al:
>
> switch (clear_value_size) {
> case 1:
>         value = *(uint8_t *)u32;
>         value |= (value << 8) | (value << 16) | (value << 24);
>         break;
> case 2:
>         value = *(uint16_t *)u32;
>         value |= value << 16;
>         break;
> default:
>         value = *u32;
>         break;
> }

Thanks.

The preliminary plan is to use transform feedback for fills >= 64 bits
(already implemented by u_blitter), and CP DMA should be used for
32-bit fills and any fills that can be lowered to 32-bit.

Unaligned 8-bit and 16-bit fills should fill the largest aligned
subrange using CP DMA. Then, the unaligned beginning and ending dwords
will be filled separately using:
  COPY_DATA from mem to reg
  REG_RMW to fill the requested bytes
  COPY_DATA from reg to mem

Marek
_______________________________________________
mesa-dev mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
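
For illustration only, here is a rough CPU-side sketch of the plan described above. It is not driver code: the names split_fill, replicate_to_dword and merge_partial_dword are hypothetical, the RMW step is modelled as plain mask arithmetic rather than actual COPY_DATA/REG_RMW packets, a little-endian byte order within the dword is assumed, and the fill offset is assumed to be a multiple of the element size. The clear value is read with memcpy so that only clear_value_size bytes are ever touched.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: a byte range split at dword boundaries. */
struct fill_split {
    unsigned head_offset, head_size;     /* bytes before the first full dword */
    unsigned middle_offset, middle_size; /* largest dword-aligned subrange */
    unsigned tail_offset, tail_size;     /* bytes after the last full dword */
};

static struct fill_split split_fill(unsigned offset, unsigned size)
{
    struct fill_split s = {0};
    unsigned end = offset + size;
    unsigned aligned_start = (offset + 3) & ~3u; /* round up to a dword */
    unsigned aligned_end = end & ~3u;            /* round down to a dword */

    if (aligned_start > aligned_end) {
        /* The range lies strictly inside one dword: a single partial fill. */
        s.head_offset = offset;
        s.head_size = size;
        return s;
    }
    s.head_offset = offset;
    s.head_size = aligned_start - offset;
    s.middle_offset = aligned_start;
    s.middle_size = aligned_end - aligned_start;
    s.tail_offset = aligned_end;
    s.tail_size = end - aligned_end;
    return s;
}

/* Expand an 8-, 16- or 32-bit clear value to a dword pattern, reading
 * exactly elem_size bytes from clear_value. */
static uint32_t replicate_to_dword(const void *clear_value, unsigned elem_size)
{
    uint32_t value = 0;
    uint8_t b;
    uint16_t h;

    assert(elem_size == 1 || elem_size == 2 || elem_size == 4);
    switch (elem_size) {
    case 1:
        memcpy(&b, clear_value, 1);
        value = b;
        value |= (value << 8) | (value << 16) | (value << 24);
        break;
    case 2:
        memcpy(&h, clear_value, 2);
        value = h;
        value |= value << 16;
        break;
    default:
        memcpy(&value, clear_value, 4);
        break;
    }
    return value;
}

/* Software model of the read-modify-write step: replace num_bytes bytes,
 * starting at byte first_byte of the dword, with the fill pattern. */
static uint32_t merge_partial_dword(uint32_t old_value, uint32_t fill,
                                    unsigned first_byte, unsigned num_bytes)
{
    uint32_t mask;

    assert(num_bytes >= 1 && first_byte + num_bytes <= 4);
    mask = num_bytes == 4 ? 0xffffffffu
                          : ((1u << (8 * num_bytes)) - 1) << (8 * first_byte);
    return (old_value & ~mask) | (fill & mask);
}
```

With offset = 2 and size = 9, split_fill gives a 2-byte head at offset 2, a 4-byte aligned middle at offset 4 (the part CP DMA would handle), and a 3-byte tail at offset 8. The head and tail would each go through one read-modify-write of the dword that contains them.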
