Re: [PATCH] [amdgcn] Scale number of threads/workers with VGPR usage

2020-01-31 Thread Andrew Stubbs
On 31/01/2020 13:56, Kwok Cheung Yeung wrote:
> The GCN architecture has 4 SIMD units per compute unit, with 256 VGPRs per SIMD unit. OpenMP threads or OpenACC workers must be distributed across the SIMD units, with each thread/worker fitting entirely within a single SIMD unit. VGPRs are shared b

[PATCH] [amdgcn] Scale number of threads/workers with VGPR usage

2020-01-31 Thread Kwok Cheung Yeung
The GCN architecture has 4 SIMD units per compute unit, with 256 VGPRs per SIMD unit. OpenMP threads or OpenACC workers must be distributed across the SIMD units, with each thread/worker fitting entirely within a single SIMD unit. VGPRs are shared by the kernels running in a SIMD unit, so we can
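
The arithmetic implied by the figures above (4 SIMD units per compute unit, 256 VGPRs per SIMD unit, each thread/worker confined to one SIMD unit) can be sketched roughly as follows. This is only an illustration, not the code from the patch: the helper names max_threads_per_simd and max_threads_per_cu are hypothetical, and the sketch assumes that VGPR usage is the only limiting factor, ignoring any register allocation granularity or other hardware limits.

```c
#include <assert.h>

/* Figures quoted in the message: 4 SIMD units per compute unit,
   256 VGPRs per SIMD unit.  */
#define SIMD_UNITS_PER_CU 4
#define VGPRS_PER_SIMD 256

/* Hypothetical helper (not from the patch): how many threads/workers
   fit in one SIMD unit if each one needs VGPRS_PER_THREAD registers.
   A thread must fit entirely within a single SIMD unit, so the result
   is rounded down.  Assumes VGPR usage is the only constraint.  */
int
max_threads_per_simd (int vgprs_per_thread)
{
  assert (vgprs_per_thread > 0);
  return VGPRS_PER_SIMD / vgprs_per_thread;
}

/* Hypothetical helper: the same limit across all SIMD units of one
   compute unit.  */
int
max_threads_per_cu (int vgprs_per_thread)
{
  return SIMD_UNITS_PER_CU * max_threads_per_simd (vgprs_per_thread);
}
```

Under these assumptions, a kernel using 64 VGPRs would allow 4 threads per SIMD unit and 16 per compute unit, while one using 100 VGPRs would allow only 2 per SIMD unit and 8 per compute unit, since a third thread would not fit in the remaining 56 registers.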