On Wed, Nov 06, 2024 at 03:27:19PM +0000, Andrew Stubbs wrote:
> If requested, return the vectorization factor appropriate for the offload
> device, if any.
>
> This change gives a significant speedup in the BabelStream "dot" benchmark on
> amdgcn.
>
> The omp_adjust_chunk_size use case is set to "false" for now, but I intend to
> change that in a follow-up patch.
>
> Note that NVPTX SIMT offload does not use this code-path.
>
> gcc/ChangeLog:
>
> * gimple-loop-versioning.cc (loop_versioning::loop_versioning): Pass
> offload == false to omp_max_vf.
> * omp-expand.cc (omp_adjust_chunk_size): Likewise.
> * omp-general.cc (omp_max_vf): Add "offload" parameter, and detect
> amdgcn offload devices.
> * omp-general.h (omp_max_vf): Likewise.
> * omp-low.cc (lower_rec_simd_input_clauses): Pass offload state to
> omp_max_vf.
Ok.
Jakub