On Mon, Nov 04, 2024 at 10:21:58AM +0000, Andrew Stubbs wrote:
> @@ -999,6 +1000,18 @@ omp_max_vf (void)
> && OPTION_SET_P (flag_tree_loop_vectorize)))
> return 1;
>
> + if (ENABLE_OFFLOADING && offload)
> + {
> + for (const char *c = getenv ("OFFLOAD_TARGET_NAMES"); c;)
> + {
> + if (startswith (c, "amdgcn"))
> + return 64;
> + else if ((c = strchr (c, ':')))
> + c++;
> + }
> + /* Otherwise, fall through to host VF. */
> + }
This assumes that the host can't have max_vf more than 64.
Because the offload code isn't compiled just for a single offload target,
but perhaps multiple and the host as well.
I think it should be max (64, omp_max_vf (false)) in that case.
Though SVE/RISC-V can complicate that by omp_max_vf returning a poly_uint64.
Maybe upper_bound is the function to call?
> --- a/gcc/omp-expand.cc
> +++ b/gcc/omp-expand.cc
> @@ -229,7 +229,15 @@ omp_adjust_chunk_size (tree chunk_size, bool
> simd_schedule, bool offload)
> if (!simd_schedule || integer_zerop (chunk_size))
> return chunk_size;
>
> - poly_uint64 vf = omp_max_vf (offload);
> + if (offload)
> + {
> + cfun->curr_properties &= ~PROP_gimple_lomp_dev;
> + tree vf = build_call_expr_internal_loc (UNKNOWN_LOCATION,
> IFN_GOMP_MAX_VF,
> + unsigned_type_node, 0);
> + return fold_convert (TREE_TYPE (chunk_size), vf);
This is incorrect.
The code below that doesn't return the vf returned by omp_max_vf, but
instead it returns (chunk_size + vf - 1) & -vf.
So, question is if we can rely omp_max_vf to always return a power of
two, clearly we rely on it now already in the existing code.
And, either it should build that
(tmp = IFN_GOMP_MAX_VF (), (chunk_size + tmp - 1) & -tmp)
expression as trees, or IFN_GOMP_MAX_VF should take an argument,
the chunk_size, and be folded to (chunk_size + vf - 1) & -vf
later.
Jakub