On Thursday, June 9, 2016 10:50:53 AM PDT Kenneth Graunke wrote:
> On Thursday, June 9, 2016 10:00:40 AM PDT Ilia Mirkin wrote:
> > On Jun 9, 2016 4:10 AM, "Kenneth Graunke" <[email protected]> wrote:
> > >
> > > Skylake changes the representation of shared local memory size:
> > >
> > > Size   | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |
> > > -------------------------------------------------------------------
> > > Gen7-8 |    0 | none | none |    1 |    2 |     3 |     4 |     5 |
> > > -------------------------------------------------------------------
> > > Gen9+  |    0 |    1 |    2 |    3 |    4 |     5 |     6 |     7 |
> > >
> > > The old formula would substantially underallocate the amount of space.
> > > This fixes GPU hangs on Skylake when running with full thread counts.
> > >
> > > Cc: "12.0" <[email protected]>
> > > Signed-off-by: Kenneth Graunke <[email protected]>
> > > ---
> > >  src/mesa/drivers/dri/i965/gen7_cs_state.c | 15 ++++++++++-----
> > >  1 file changed, 10 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> > > index 750aa2c..aff1f4e 100644
> > > --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> > > +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> > > @@ -150,11 +150,16 @@ brw_upload_cs_state(struct brw_context *brw)
> > >     assert(prog_data->total_shared <= 64 * 1024);
> > >     uint32_t slm_size = 0;
> > >     if (prog_data->total_shared > 0) {
> > > -      /* slm_size is in 4k increments, but must be a power of 2. */
> > > -      slm_size = 4 * 1024;
> > > -      while (slm_size < prog_data->total_shared)
> > > -         slm_size <<= 1;
> > > -      slm_size /= 4 * 1024;
> > > +      /* Shared Local Memory Size is specified as powers of two. */
> > > +      slm_size = util_next_power_of_two(prog_data->total_shared);
> > > +
> > > +      if (brw->gen >= 9) {
> > > +         /* Use a minimum of 1kB; turn an exponent of 10 (1024 kB) into 1. */
> > > +         slm_size = ffs(MAX2(slm_size, 1024)) - 10;
> > > +      } else {
> > > +         /* Use a minimum of 4kB; convert to the pre-Gen9 representation. */
> > > +         slm_size = MAX2(slm_size, 4096) / 4096;
> >
> > According to your chart, 16k should end up with 3, but this logic will
> > produce 4. The old comment said it was in increments of 4k, so I'm guessing
> > just the chart needs to be adjusted.
>
> Yikes, sorry! A wrong chart is better than no chart at all. I meant:

I meant "worse", not "better". :( Wow.

> Size   | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |
> -------------------------------------------------------------------
> Gen7-8 |    0 | none | none |    1 |    2 |     4 |     8 |    16 |
> -------------------------------------------------------------------
> Gen9+  |    0 |    1 |    2 |    3 |    4 |     5 |     6 |     7 |
>
> I should probably move this code to a helper function and put the
> (correct) table in a comment...
>
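
For what it's worth, here is a rough sketch of what such a helper might look like. It is only an illustration under stated assumptions, not the actual follow-up patch: the names encode_slm_size() and next_power_of_two() are hypothetical, the latter is a local stand-in for Mesa's util_next_power_of_two(), and the generation is passed as a plain integer instead of coming from brw->gen. The arithmetic is the same as in the patch above.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>   /* ffs() (POSIX) */

/* Local stand-in for Mesa's util_next_power_of_two(); hypothetical name. */
static uint32_t
next_power_of_two(uint32_t x)
{
   uint32_t p = 1;
   while (p < x)
      p <<= 1;
   return p;
}

/* Hypothetical helper: encode a shared local memory size (in bytes) into
 * the hardware representation.
 *
 * Size   | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |
 * -------------------------------------------------------------------
 * Gen7-8 |    0 | none | none |    1 |    2 |     4 |     8 |    16 |
 * -------------------------------------------------------------------
 * Gen9+  |    0 |    1 |    2 |    3 |    4 |     5 |     6 |     7 |
 */
static uint32_t
encode_slm_size(unsigned gen, uint32_t total_shared)
{
   assert(total_shared <= 64 * 1024);

   if (total_shared == 0)
      return 0;

   /* Shared Local Memory Size is specified as powers of two. */
   uint32_t slm_size = next_power_of_two(total_shared);

   if (gen >= 9) {
      /* Use a minimum of 1 kB. ffs() is 1-based, so ffs(1024) == 11,
       * and subtracting 10 encodes 1 kB as 1, 2 kB as 2, and so on.
       */
      if (slm_size < 1024)
         slm_size = 1024;
      return (uint32_t) ffs((int) slm_size) - 10;
   } else {
      /* Use a minimum of 4 kB; the pre-Gen9 value counts 4 kB units. */
      if (slm_size < 4096)
         slm_size = 4096;
      return slm_size / 4096;
   }
}
```

With this sketch, encode_slm_size(8, 16 * 1024) gives 4 and encode_slm_size(9, 16 * 1024) gives 5, which matches the corrected chart.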
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
