On Wed, Nov 06, 2024 at 03:27:19PM +0000, Andrew Stubbs wrote:
> If requested, return the vectorization factor appropriate for the offload
> device, if any.
> 
> This change gives a significant speedup in the BabelStream "dot" benchmark on
> amdgcn.
> 
> The omp_adjust_chunk_size use case is set to "false" for now, but I intend to
> change that in a follow-up patch.
> 
> Note that NVPTX SIMT offload does not use this code-path.
> 
> gcc/ChangeLog:
> 
>       * gimple-loop-versioning.cc (loop_versioning::loop_versioning): Set
>       omp_max_vf to offload == false.
>       * omp-expand.cc (omp_adjust_chunk_size): Likewise.
>       * omp-general.cc (omp_max_vf): Add "offload" parameter, and detect
>       amdgcn offload devices.
>       * omp-general.h (omp_max_vf): Likewise.
>       * omp-low.cc (lower_rec_simd_input_clauses): Pass offload state to
>       omp_max_vf.

Ok.

        Jakub
