On Fri, Oct 18, 2024 at 11:52:25AM +0530, Tejas Belagod wrote:
> The target clause in OpenMP is used to offload loop kernels to accelarator
> peripeherals.  target's 'map' clause is used to move data from and to the
> accelarator.  When the data is SVE type, it may not be suitable because of
> various reasons i.e. the two SVE targets may not agree on vector size or
> some targets don't support variable vector size.  This makes SVE unsuitable
> for use in OMP's 'map' clause.  This patch diagnoses all such cases and issues
> an error where SVE types are not suitable.
> 
> Co-authored-by: Andrea Corallo <andrea.cora...@arm.com>
> 
> gcc/ChangeLog:
> 
>       * target.h (type_context_kind): Add new context kinds for target 
> clauses.
>       * config/aarch64/aarch64-sve-builtins.cc (verify_type_context): Diagnose
>       SVE types for a given OpenMP context.
>       (omp_type_context): New.
>       * gimplify.cc (omp_notice_variable):  Diagnose implicitly-mapped SVE

s/  / / above

>       objects in OpenMP regions.
>       (gimplify_scan_omp_clauses): Diagnose SVE types for various target
>       clauses.
> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
> @@ -4956,12 +4956,35 @@ handle_arm_sve_vector_bits_attribute (tree *node, 
> tree, tree args, int,
>    return NULL_TREE;
>  }
>  
> +
> +/* Return true if OpenMP context types.  */
> +
> +static bool
> +omp_type_context (type_context_kind context)
> +{
> +  switch (context)
> +    {
> +    case TCTX_OMP_MAP:
> +    case TCTX_OMP_MAP_IMP_REF:
> +    case TCTX_OMP_PRIVATE:
> +    case TCTX_OMP_FIRSTPRIVATE:
> +    case TCTX_OMP_DEVICE_ADDR:
> +      return true;
> +    default:
> +      return false;;
> +    }
> +}
> +
>  /* Implement TARGET_VERIFY_TYPE_CONTEXT for SVE types.  */
>  bool
>  verify_type_context (location_t loc, type_context_kind context,
>                    const_tree type, bool silent_p)

I know nothing about this verify_type_context stuff, will certainly
defer review of it to Richard S.
Just am wondering how can this work at all, is this in some anonymous
or aarch64 specific namespace?
Because tree.cc has verify_type_context definition with the same
types.

> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -8430,11 +8430,13 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, 
> tree decl, bool in_code)
>                         | GOVD_MAP_ALLOC_ONLY)) == flags)
>           {
>             tree type = TREE_TYPE (decl);
> +           location_t dummy = UNKNOWN_LOCATION;
>  
>             if (gimplify_omp_ctxp->target_firstprivatize_array_bases
>                 && omp_privatize_by_reference (decl))
>               type = TREE_TYPE (type);
> -           if (!omp_mappable_type (type))
> +           if (!omp_mappable_type (type)
> +               || !verify_type_context (dummy, TCTX_OMP_MAP_IMP_REF, type))
>               {
>                 error ("%qD referenced in target region does not have "
>                        "a mappable type", decl);
> @@ -12165,6 +12167,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
> *pre_p,
>        unsigned int flags;
>        tree decl;
>        auto_vec<omp_addr_token *, 10> addr_tokens;
> +      tree op = NULL_TREE;
> +      location_t loc = OMP_CLAUSE_LOCATION (c);
>  
>        if (grp_end && c == OMP_CLAUSE_CHAIN (grp_end))
>       {

Ditto for review here.

> @@ -12172,6 +12176,34 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
> *pre_p,
>         grp_end = NULL_TREE;
>       }
>  
> +      if (code == OMP_TARGET || code == OMP_TARGET_DATA
> +       || code == OMP_TARGET_ENTER_DATA || code == OMP_TARGET_EXIT_DATA)

Just general formatting rule, if condition doesn't fit on one line,
split on every || (so each || goes on a separate line).

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/omp/offload-parallel-loop.c
> @@ -0,0 +1,442 @@
> +/* { dg-do compile } */
> +/* { dg-options "-msve-vector-bits=256 -std=gnu99 -fopenmp -O2" } */
> +
> +#include <arm_sve.h>
> +
> +#define N __ARM_FEATURE_SVE_BITS
> +
> +svint32_t
> +omp_target_vla ()
> +{
> +  int a[N], b[N], c[N];
> +  svint32_t va, vb, vc;
> +  int i;
> +
> +#pragma omp parallel for
> +  for (i = 0; i < N; i++)
> +    {
> +      b[i] = i;
> +      c[i] = i + 1;
> +    }
> +
> +#pragma omp target parallel loop
> +  for (i = 0; i < 8; i++)
> +    {
> +      vb = svld1_s32 (svptrue_b32 (), b); /* { dg-error {'vb' referenced in 
> target region does not have a mappable type} } */
> +      vc = svld1_s32 (svptrue_b32 (), c); /* { dg-error {'vc' referenced in 
> target region does not have a mappable type} } */
> +      va = svadd_s32_z (svptrue_b32 (), vb, vc); /* { dg-error {'va' 
> referenced in target region does not have a mappable type} } */

I think better would be to use the non-mappable types rather than having
something racy (all threads writing the same shared vars and in
the last case also using them).
> +    }
> +
> +  return va;
> +}
> +
> +svint32_t
> +omp_target_data_map_1_vla ()
> +{
> +  int a[N], b[N], c[N];
> +  svint32_t va, vb, vc;
> +  int i;
> +
> +#pragma omp parallel for
> +  for (i = 0; i < N; i++)
> +    {
> +      b[i] = i;
> +      c[i] = i + 1;
> +    }
> +
> +#pragma omp target parallel loop map(to: b, c) map(from: va) /* { dg-error 
> {SVE type 'svint32_t' not allowed in map clause} } */
> +  for (i = 0; i < 8; i++)
> +    {
> +      vb = svld1_s32 (svptrue_b32 (), b); /* { dg-error {'vb' referenced in 
> target region does not have a mappable type} } */
> +      vc = svld1_s32 (svptrue_b32 (), c); /* { dg-error {'vc' referenced in 
> target region does not have a mappable type} } */
> +      va = svadd_s32_z (svptrue_b32 (), vb, vc);

Again, this is racy.
You could have say limited it to if (i == 0) or if (i == 7) or whatever
other iteration, but then what is the point of having the loop construct
there.

        Jakub

Reply via email to