[PATCH, OpenMP, Fortran] requires unified_shared_memory 1/2: adjust libgfortran memory allocators

2022-08-15 Thread Chung-Lin Tang

Hi, this patch is to fix the case where 'requires unified_shared_memory' doesn't
work due to memory allocator mismatch. Currently this is only for OG12 
(devel/omp/gcc-12),
but will apply to mainline as well once those requires patches get in.

Basically, under 'requires unified_shared_memory' enables the usm_transform 
pass,
which transforms some of the expanded Fortran intrinsic code that uses 
__builtin_free()
into 'omp_free (..., ompx_unified_shared_mem_alloc)'.

The intention is to make all dynamic memory allocation use the OpenMP 
unified_shared_memory
allocator, but there is a big gap in this, namely libgfortran. What happens in 
some tests
are that libgfortran allocates stuff using normal malloc(), and the 
usm_transform generates
code that frees the stuff using omp_free(), and chaos ensues.

So the proper fix we believe is: to make it possible to move the entire 
libgfortran on to
unified_shared_memory.

This first patch is a mostly mechanical patch to change all references of 
malloc/free/calloc/realloc
in libgfortran into xmalloc/xfree/xcalloc/xrealloc in 
libgfortran/runtime/memory.c,
as well as strdup uses into a new internal xstrdup.

All of libgfortran is adjusted this way, except libgfortran/caf, which is an 
independent library
outside of libgfortran.so.

The second patch of this series will present a way to switch the references of 
allocators
in libgfortran/runtime/memory.c from the normal glibc malloc/free/etc. to 
omp_alloc/omp_free/etc.
when 'requires unified_shared_memory' is detected.

Tested on devel/omp/gcc-12. Plans is to commit there soon, but also seeking 
approval for mainline
once the requires stuff goes in.

Thanks,
Chung-Lin

2022-08-15  Chung-Lin Tang  

libgfortran/ChangeLog:

* m4/matmul_internal.m4: Adjust malloc/free to xmalloc/xfree.
* generated/matmul_c10.c: Regenerate.
* generated/matmul_c16.c: Likewise.
* generated/matmul_c17.c: Likewise.
* generated/matmul_c4.c: Likewise.
* generated/matmul_c8.c: Likewise.
* generated/matmul_i1.c: Likewise.
* generated/matmul_i16.c: Likewise.
* generated/matmul_i2.c: Likewise.
* generated/matmul_i4.c: Likewise.
* generated/matmul_i8.c: Likewise.
* generated/matmul_r10.c: Likewise.
* generated/matmul_r16.c: Likewise.
* generated/matmul_r17.c: Likewise.
* generated/matmul_r4.c: Likewise.
* generated/matmul_r8.c: Likewise.
* generated/matmulavx128_c10.c: Likewise.
* generated/matmulavx128_c16.c: Likewise.
* generated/matmulavx128_c17.c: Likewise.
* generated/matmulavx128_c4.c: Likewise.
* generated/matmulavx128_c8.c: Likewise.
* generated/matmulavx128_i1.c: Likewise.
* generated/matmulavx128_i16.c: Likewise.
* generated/matmulavx128_i2.c: Likewise.
* generated/matmulavx128_i4.c: Likewise.
* generated/matmulavx128_i8.c: Likewise.
* generated/matmulavx128_r10.c: Likewise.
* generated/matmulavx128_r16.c: Likewise.
* generated/matmulavx128_r17.c: Likewise.
* generated/matmulavx128_r4.c: Likewise.
* generated/matmulavx128_r8.c: Likewise.
* intrinsics/access.c (access_func): Adjust free to xfree.
* intrinsics/chdir.c (chdir_i4_sub): Likewise.
(chdir_i8_sub): Likewise.
* intrinsics/chmod.c (chmod_func): Likewise.
* intrinsics/date_and_time.c (secnds): Likewise.
* intrinsics/env.c (PREFIX(getenv)): Likewise.
(get_environment_variable_i4): Likewise.
* intrinsics/execute_command_line.c (execute_command_line): Likewise.
* intrinsics/getcwd.c (getcwd_i4_sub): Likewise.
* intrinsics/getlog.c (PREFIX(getlog)): Likewise.
* intrinsics/link.c (link_internal): Likewise.
* intrinsics/move_alloc.c (move_alloc): Likewise.
* intrinsics/perror.c (perror_sub): Likewise.
* intrinsics/random.c (constructor_random): Likewise.
* intrinsics/rename.c (rename_internal): Likewise.
* intrinsics/stat.c (stat_i4_sub_0): Likewise.
(stat_i8_sub_0): Likewise.
* intrinsics/symlnk.c (symlnk_internal): Likewise.
* intrinsics/system.c (system_sub): Likewise.
* intrinsics/unlink.c (unlink_i4_sub): Likewise.
* io/async.c (update_pdt): Likewise.
(async_io): Likewise.
(free_async_unit): Likewise.
(init_async_unit): Adjust calloc to xcalloc.
(enqueue_done_id): Likewise.
(enqueue_done): Likewise.
(enqueue_close): Likewise.
* io/async.h (MUTEX_DEBUG_ADD): Adjust malloc/free to xmalloc/xfree.
* io/close.c (st_close): Adjust strdup/free to xstrdup/xfree.
* io/fbuf.c (fbuf_destroy): Adjust free to xfree.
* io/format.c (free_format_hash_table): Likewise.
(save_parsed_format): Likewise.
(free_format): Likewise.
(free_format_data)

[PATCH, OpenMP, Fortran] requires unified_shared_memory 2/2: insert USM allocators into libgfortran

2022-08-15 Thread Chung-Lin Tang

After the first libgfortran memory allocator preparation patch, this is the
actual patch that organizes unified_shared_memory allocation into libgfortran.

In the current OpenMP requires implementation, the requires_mask is collected
through offload LTO processing, and presented to libgomp when registering
offload images through GOMP_offload_register_ver() (called by the mkoffload 
generated
constructor linked into the program binary)

This means that the only reliable place to access omp_requires_mask is in
GOMP_offload_register_ver, however since it is called through an ELF constructor
in the *main program*, this runs later than libgfortran/runtime/main.c:init() 
constructor,
and because some libgfortran init actions there start allocating memory, this 
can cause
more deallocation errors later.

Another issue is that CUDA appears to be registering some cleanup actions using 
atexit(),
which forces libgomp to register gomp_target_fini() using atexit as well (to 
properly run
before the underlying CUDA stuff disappears). This happens to us here as well.

So to summarize we need to: (1) order libgfortran init actions after 
omp_requires_mask
processing is done, and (2) order libgfortran cleanup actions before 
gomp_target_fini,
to properly deallocate stuff without crashing.

The above explanation is for why there's a little new set of definitions, as 
well as
callback registering functions exported from libgomp to libgfortran, basically 
to register
libgfortran init/fini actions into libgomp to run.

Inside GOMP_offload_register_ver, after omp_requires_mask processing is done, 
we call into
libgfortran through a new _gfortran_mem_allocators_init function to insert the 
omp_free/alloc/etc.
based allocators into the Fortran runtime, when 
GOMP_REQUIRES_UNIFIED_SHARED_MEMORY is set.

All symbol references between libgfortran/libgomp are defined with weak 
symbols. Test of the
weak symbols are also used to determine if the other library exists in this 
program.

A final issue is: the case where we have an OpenMP program that does NOT have 
offloading.
We cannot passively determine in libgomp/libgfortran whether offloading exists 
or not, only the
main program itself can, by seeing if the hidden __OFFLOAD_TABLE__ exists.

When we do init/fini libgomp callback registering for OpenMP programs, those 
with no offloading
will not have those callback properly run (because of no offload image loading)
Therefore the solution here is a constructor added into the crtoffloadend.o 
fragment that does
a "null" call of GOMP_offload_register_ver, solely for triggering the 
post-offload_register callbacks
when __OFFLOAD_TABLE__ is NULL. (and because of this, the crtoffloadend.o 
Makefile rule is adjusted
to compile with PIC)

I know this is a big pile of yarn wrt how the main program/libgomp/libgfortran 
interacts, but it's
finally working. Again tested without regressions. Preparing to commit to 
devel/omp/gcc-12, and seeking
approval for mainline when the requires patches are in.

Thanks,
Chung-Lin

2022-08-15  Chung-Lin Tang  

libgcc/
* Makefile.in (crtoffloadend$(objext)): Add $(PICFLAG) to compile rule.
* offloadstuff.c (GOMP_offload_register_ver): Add declaration of weak
symbol.
(__OFFLOAD_TABLE__): Likewise.
(init_non_offload): New function.

libgfortran/

* gfortran.map (GFORTRAN_13): New namespace.
(_gfortran_mem_allocators_init): New name inside GFORTRAN_13.
* libgfortran.h (mem_allocators_init): New exported declaration.
* runtime/main.c (do_init): Rename from init, add run-once guard code.
(cleanup): Add run-once guard code.
(GOMP_post_offload_register_callback): Declare weak symbol.
(GOMP_pre_gomp_target_fini_callback): Likewise.
(init): New constructor to register offload callbacks, or call do_init
when not OpenMP.
* runtime/memory.c (gfortran_malloc): New pointer variable.
(gfortran_calloc): Likewise.
(gfortran_realloc): Likewise.
(gfortran_free): Likewise.
(mem_allocators_init): New function.
(xmalloc): Use gfortran_malloc.
(xmallocarray): Use gfortran_malloc.
(xcalloc): Use gfortran_calloc.
(xrealloc): Use gfortran_realloc.
(xfree): Use gfortran_free.

libgomp/

* libgomp.map (GOMP_5.1.2): New version namespace.
(GOMP_post_offload_register_callback): New name inside GOMP_5.1.2.
(GOMP_pre_gomp_target_fini_callback): Likewise.
(GOMP_DEFINE_CALLBACK_SET): Macro to define callback set.
(post_offload_register): Define callback set for after offload image
register.
(pre_gomp_target_fini): Define callback set for before gomp_target_fini
is called.
(libgfortran_malloc_usm): New function.
(libgfortran_calloc_usm): Likewise
(libgfortran_realloc_usm): Likewise
(libgfortran_free_usm): Likewise.
(_gfortran

Re: [PATCH, OpenMP, Fortran] requires unified_shared_memory 2/2: insert USM allocators into libgfortran

2022-08-15 Thread Chung-Lin Tang

On 2022/8/15 7:06 PM, Chung-Lin Tang wrote:


I know this is a big pile of yarn wrt how the main program/libgomp/libgfortran 
interacts, but it's
finally working. Again tested without regressions. Preparing to commit to 
devel/omp/gcc-12, and seeking
approval for mainline when the requires patches are in.


Just realized that I don't have the new testcases added in this patch.
Will supplement them later :P

Thanks,
Chung-Lin


Re: [PATCH, OpenMP, Fortran] requires unified_shared_memory 2/2: insert USM allocators into libgfortran

2022-09-05 Thread Chung-Lin Tang



On 2022/8/15 7:15 PM, Chung-Lin Tang wrote:

On 2022/8/15 7:06 PM, Chung-Lin Tang wrote:


I know this is a big pile of yarn wrt how the main program/libgomp/libgfortran 
interacts, but it's
finally working. Again tested without regressions. Preparing to commit to 
devel/omp/gcc-12, and seeking
approval for mainline when the requires patches are in.


Just realized that I don't have the new testcases added in this patch.
Will supplement them later :P


Here's the USM allocator/libgfortran patch, with a libgomp.fortran testcase 
added.

Thanks,
Chung-Lin

2022-09-05  Chung-Lin Tang  

libgcc/
* Makefile.in (crtoffloadend$(objext)): Add $(PICFLAG) to compile rule.
* offloadstuff.c (GOMP_offload_register_ver): Add declaration of weak
symbol.
(__OFFLOAD_TABLE__): Likewise.
(init_non_offload): New function.

libgfortran/

* gfortran.map (GFORTRAN_13): New namespace.
(_gfortran_mem_allocators_init): New name inside GFORTRAN_13.
* libgfortran.h (mem_allocators_init): New exported declaration.
* runtime/main.c (do_init): Rename from init, add run-once guard code.
(cleanup): Add run-once guard code.
(GOMP_post_offload_register_callback): Declare weak symbol.
(GOMP_pre_gomp_target_fini_callback): Likewise.
(init): New constructor to register offload callbacks, or call do_init
when not OpenMP.
* runtime/memory.c (gfortran_malloc): New pointer variable.
(gfortran_calloc): Likewise.
(gfortran_realloc): Likewise.
(gfortran_free): Likewise.
(mem_allocators_init): New function.
(xmalloc): Use gfortran_malloc.
(xmallocarray): Use gfortran_malloc.
(xcalloc): Use gfortran_calloc.
(xrealloc): Use gfortran_realloc.
(xfree): Use gfortran_free.

libgomp/

* libgomp.map (GOMP_5.1.2): New version namespace.
(GOMP_post_offload_register_callback): New name inside GOMP_5.1.2.
(GOMP_pre_gomp_target_fini_callback): Likewise.
(GOMP_DEFINE_CALLBACK_SET): Macro to define callback set.
(post_offload_register): Define callback set for after offload image
register.
(pre_gomp_target_fini): Define callback set for before gomp_target_fini
is called.
(libgfortran_malloc_usm): New function.
(libgfortran_calloc_usm): Likewise
(libgfortran_realloc_usm): Likewise
(libgfortran_free_usm): Likewise.
(_gfortran_mem_allocators_init): Declare weak symbol.
(gomp_libgfortran_omp_allocators_init): New function.
(GOMP_offload_register_ver): Add handling of host_table == NULL, calling
into libgfortran to set unified_shared_memory allocators, and execution
of post_offload_register callbacks.
(gomp_target_init): Register all pre_gomp_target_fini callbacks to run
at end of main using atexit().

* testsuite/libgomp.fortran/target-unified_shared_memory-1.f90: New test.







diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index 09b3ec8bc2e..70720cc910c 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -1045,8 +1045,9 @@ crtbeginT$(objext): $(srcdir)/crtstuff.c
 crtoffloadbegin$(objext): $(srcdir)/offloadstuff.c
$(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_BEGIN
 
+# crtoffloadend contains a constructor with calls to libgomp, so build as PIC.
 crtoffloadend$(objext): $(srcdir)/offloadstuff.c
-   $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END
+   $(crt_compile) $(CRTSTUFF_T_CFLAGS) $(PICFLAG) -c $< -DCRT_END
 
 crtoffloadtable$(objext): $(srcdir)/offloadstuff.c
$(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_TABLE
diff --git a/libgcc/offloadstuff.c b/libgcc/offloadstuff.c
index 10e1fe19c8e..2edb6810021 100644
--- a/libgcc/offloadstuff.c
+++ b/libgcc/offloadstuff.c
@@ -63,6 +63,19 @@ const void *const __offload_vars_end[0]
   __attribute__ ((__used__, visibility ("hidden"),
  section (OFFLOAD_VAR_TABLE_SECTION_NAME))) = { };
 
+extern void GOMP_offload_register_ver (unsigned, const void *, int,
+  const void *);
+extern const void *const __OFFLOAD_TABLE__[0] __attribute__ ((weak));
+static void __attribute__((constructor))
+init_non_offload (void)
+{
+  /* If an OpenMP program has no offloading, post-offload_register callbacks
+ that need to run will require a call to GOMP_offload_register_ver, in
+ order to properly trigger those callbacks during init.  */
+  if (__OFFLOAD_TABLE__ == NULL)
+GOMP_offload_register_ver (0, NULL, 0, NULL);
+}
+
 #elif defined CRT_TABLE
 
 extern const void *const __offload_func_table[];
diff --git a/libgfortran/gfortran.map b/libgfortran/gfortran.map
index e0e795c3d48..55d2a529acd 100644
--- a/libgfortran/gfortran.map
+++ b/libgfortran/gfortran.map
@@ -1759,3 +1759,8 @@ GFORTRAN_12 {
   _gfortran_transfer_real128_write;
 #endif
 } GFORTRAN_10.2;
+
+GFORTRAN_13 {
+  global:
+  _gfortran_mem_allocators_init;
+} GFORTRAN_12;
diff --git a/libgfortran/libgfortran.h b/libgfortran/libgfortran.h
index 0b893a51851.

[PATCH, OpenACC 2.7] struct/array reductions for Fortran

2024-02-08 Thread Chung-Lin Tang
Hi Tobias, Thomas,
this patch adds support for Fortran to use arrays and struct(record) types in 
OpenACC reductions.

There is still some shortcomings in the current state, mainly that only 
explicit-shaped arrays can be used (like its C counterpart). Anything else is 
currently a bit more complicated in the middle-end, since the existing 
reduction code creates an "init-op" (literal of initial values) which can't be 
done when say TYPE_MAX_VALUE (TYPE_DOMAIN (array_type)) is not a tree constant. 
I think we'll be on the hook to solve this later, but I think the current state 
is okay to submit.

Tested without regressions on mainline (on top of first struct/array reduction 
patch[1])

Thanks,
Chung-Lin

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641669.html

2024-02-08  Chung-Lin Tang  

gcc/fortran/ChangeLog:
* openmp.cc (oacc_reduction_defined_type_p): New function.
(resolve_omp_clauses): Adjust OpenACC array reduction error case. Use
oacc_reduction_defined_type_p for OpenACC.
* trans-openmp.cc (gfc_trans_omp_array_reduction_or_udr):
Add 'bool openacc' parameter, adjust part of function to be !openacc
only.
(gfc_trans_omp_reduction_list): Add 'bool openacc' parameter, pass to
calls to gfc_trans_omp_array_reduction_or_udr.
(gfc_trans_omp_clauses): Add 'openacc' argument to calls to
gfc_trans_omp_reduction_list.
(gfc_trans_omp_do): Pass 'op == EXEC_OACC_LOOP' as 'bool openacc'
parameter in call to gfc_trans_omp_clauses.

gcc/ChangeLog:
* omp-low.cc (omp_reduction_init_op): Add checking if reduced array
has constant bounds.
(lower_oacc_reductions): Add handling of error_mark_node.

gcc/testsuite/ChangeLog:
* gfortran.dg/goacc/array-reduction.f90: Adjust testcase.
* gfortran.dg/goacc/reduction.f95: Likewise.

libgomp/ChangeLog:
* libgomp/testsuite/libgomp.oacc-fortran/reduction-9.f90: New testcase.
* libgomp/testsuite/libgomp.oacc-fortran/reduction-10.f90: Likewise.
* libgomp/testsuite/libgomp.oacc-fortran/reduction-11.f90: Likewise.
* libgomp/testsuite/libgomp.oacc-fortran/reduction-12.f90: Likewise.
* libgomp/testsuite/libgomp.oacc-fortran/reduction-13.f90: Likewise.
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 0af80d54fad..4bba9e666d6 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -7047,6 +7047,72 @@ oacc_is_loop (gfc_code *code)
 || code->op == EXEC_OACC_LOOP;
 }
 
+static bool
+oacc_reduction_defined_type_p (enum gfc_omp_reduction_op rop, gfc_typespec *ts)
+{
+  if (rop == OMP_REDUCTION_USER || rop == OMP_REDUCTION_NONE)
+return false;
+
+  if (ts->type == BT_INTEGER)
+switch (rop)
+  {
+  case OMP_REDUCTION_AND:
+  case OMP_REDUCTION_OR:
+  case OMP_REDUCTION_EQV:
+  case OMP_REDUCTION_NEQV:
+   return false;
+  default:
+   return true;
+  }
+
+  if (ts->type == BT_LOGICAL)
+switch (rop)
+  {
+  case OMP_REDUCTION_AND:
+  case OMP_REDUCTION_OR:
+  case OMP_REDUCTION_EQV:
+  case OMP_REDUCTION_NEQV:
+   return true;
+  default:
+   return false;
+  }
+
+  if (ts->type == BT_REAL || ts->type == BT_COMPLEX)
+switch (rop)
+  {
+  case OMP_REDUCTION_PLUS:
+  case OMP_REDUCTION_TIMES:
+  case OMP_REDUCTION_MINUS:
+   return true;
+
+  case OMP_REDUCTION_AND:
+  case OMP_REDUCTION_OR:
+  case OMP_REDUCTION_EQV:
+  case OMP_REDUCTION_NEQV:
+   return false;
+
+  case OMP_REDUCTION_MAX:
+  case OMP_REDUCTION_MIN:
+   return ts->type != BT_COMPLEX;
+  case OMP_REDUCTION_IAND:
+  case OMP_REDUCTION_IOR:
+  case OMP_REDUCTION_IEOR:
+   return false;
+  default:
+   gcc_unreachable ();
+  }
+
+  if (ts->type == BT_DERIVED)
+{
+  for (gfc_component *p = ts->u.derived->components; p; p = p->next)
+   if (!oacc_reduction_defined_type_p (rop, &p->ts))
+ return false;
+  return true;
+}
+
+  return false;
+}
+
 static void
 resolve_scalar_int_expr (gfc_expr *expr, const char *clause)
 {
@@ -8137,13 +8203,15 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses 
*omp_clauses,
  else
n->sym->mark = 1;
 
- /* OpenACC does not support reductions on arrays.  */
- if (n->sym->as)
+ /* OpenACC current only supports array reductions on explicit-shape
+arrays.  */
+ if ((n->sym->as && n->sym->as->type != AS_EXPLICIT)
+ || n->sym->attr.codimension)
gfc_error ("Array %qs is not permitted in reduction at %L",
   n->sym->name, &n->where);
}
 }
-  
+
   for (n = omp_clauses->lists[OMP_LIST_TO]; n; n = n->next)

Re: [PATCH, OpenACC 2.7, v2] readonly modifier support in front-ends

2024-03-07 Thread Chung-Lin Tang
Hi Thomas, Tobias,

On 2023/10/26 6:43 PM, Thomas Schwinge wrote:
> +++ b/gcc/tree.h
> @@ -1813,6 +1813,14 @@ class auto_suppress_location_wrappers
>   #define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
> (OMP_CLAUSE_SUBCODE_CHECK (NODE, 
> OMP_CLAUSE_MAP)->base.addressable_flag)
>
> +/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'.  */
> +#define OMP_CLAUSE_MAP_READONLY(NODE) \
> +  TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> +
> +/* Same as above, for use in OpenACC cache directives.  */
> +#define OMP_CLAUSE__CACHE__READONLY(NODE) \
> +  TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
 I'm not sure if these special accessor functions are actually useful, or
 we should just directly use 'TREE_READONLY' instead?  We're only using
 them in contexts where it's clear that the 'OMP_CLAUSE_SUBCODE_CHECK' is
 satisfied, for example.
>>> I find directly using TREE_READONLY confusing.
>>
>> FWIW, I've changed to use TREE_NOTHROW instead, if it can give a better 
>> sense of safety :P
> 
> I don't understand that, why not use 'TREE_READONLY'?
> 
>> I think there's a misunderstanding here anyways: we are not relying on a 
>> DECL marked
>> TREE_READONLY here. We merely need the OMP_CLAUSE_MAP to be marked as 
>> OMP_CLAUSE_MAP_READONLY == 1.
> 
> Yes, I understand that.  My question was why we don't just use
> 'TREE_READONLY (c)', where 'c' is the
> 'OMP_CLAUSE_MAP'/'OMP_CLAUSE__CACHE_' clause (not its decl), and avoid
> the indirection through
> '#define OMP_CLAUSE_MAP_READONLY'/'#define OMP_CLAUSE__CACHE__READONLY',
> given that we're only using them in contexts where it's clear that the
> 'OMP_CLAUSE_SUBCODE_CHECK' is satisfied.  I don't have a strong
> preference, though.

After further re-testing using TREE_NOTHROW, I have reverted to using 
TREE_READONLY, because TREE_NOTHROW clashes
with OMP_CLAUSE_RELEASE_DESCRIPTOR (which doesn't use the OMP_CLAUSE_MAP_* 
naming convention and is
not documented in gcc/tree-core.h either, hmmm...)

I have added the comment adjustments in gcc/tree-core.h for the new uses of 
TREE_READONLY/readonly_flag.

We basically all use OMP_CLAUSE_SUBCODE_CHECK macros for OpenMP clause 
expressions exclusively,
so I don't see a reason to diverge from that style (even when context is clear).

> Either way, you still need to document this:
> 
> | Also, for the new use for OMP clauses, update 'gcc/tree.h:TREE_READONLY',
> | and in 'gcc/tree-core.h' for 'readonly_flag' the
> | "table lists the uses of each of the above flags".

Okay, done as mentioned above.

> In addition to a few individual comments above and below, you've also not
> yet responded to my requests re test cases.

I have greatly expanded the test scan patterns to include 
parallel/kernels/serial/data/enter data,
as well as non-readonly copyin clause together with readonly.

Also added simple 'declare' tests, but there is not anything to scan in the 
'tree-original' dump though.

>> +  tree nl = list;
>> +  bool readonly = false;
>> +  matching_parens parens;
>> +  if (parens.require_open (parser))
>> +{
>> +  /* Turn on readonly modifier parsing for copyin clause.  */
>> +  if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
>> + {
>> +   c_token *token = c_parser_peek_token (parser);
>> +   if (token->type == CPP_NAME
>> +   && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
>> +   && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
>> + {
>> +   c_parser_consume_token (parser);
>> +   c_parser_consume_token (parser);
>> +   readonly = true;
>> + }
>> + }
>> +  location_t loc = c_parser_peek_token (parser)->location;
> 
> I suppose 'loc' here now points to after the opening '(' or after the
> 'readonly :'?  This is different from what 'c_parser_omp_var_list_parens'
> does, and indeed, 'c_parser_omp_variable_list' states that "CLAUSE_LOC is
> the location of the clause", not the location of the variable-list?  As
> this, I suppose, may change diagnostics, please restore the original
> behavior.  (This appears to be different in the C++ front end, huh.)

Thanks for catching this! Fixed.

>> --- a/gcc/fortran/openmp.cc
>> +++ b/gcc/fortran/openmp.cc
>> @@ -1197,7 +1197,7 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : 
>> omp_mask (m)
>>
>>  static bool
>>  gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
>> -   bool allow_common, bool allow_derived)
>> +   bool allow_common, bool allow_derived, bool readonly 
>> = false)
>>  {
>>gfc_omp_namelist **head = NULL;
>>if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, 
>> true,
>> @@ -1206,7 +1206,10 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, 
>> gfc_omp_map_op map_op,
>>  {
>>gfc_omp_namelist *n;
>>for (n = *head; n; n = n->next)
>> -

Re: [PATCH 5/5] Mapping of components of references to pointers to structs for OpenMP/OpenACC

2021-05-17 Thread Chung-Lin Tang

Hi Julian,

On 2021/5/15 5:27 AM, Julian Brown wrote:

GCC currently raises a parse error for indirect accesses to struct
members, where the base of the access is a reference to a pointer.
This patch fixes that case.



gcc/cp/
* semantics.c (finish_omp_clauses): Handle components of references to
pointers to structs.

libgomp/
* testsuite/libgomp.oacc-c++/deep-copy-17.C: Update test.



--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7670,7 +7670,12 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type 
ort)
  if ((ort == C_ORT_ACC || ort == C_ORT_OMP)
  && TREE_CODE (t) == COMPONENT_REF
  && TREE_CODE (TREE_OPERAND (t, 0)) == INDIRECT_REF)
-   t = TREE_OPERAND (TREE_OPERAND (t, 0), 0);
+   {
+ t = TREE_OPERAND (TREE_OPERAND (t, 0), 0);
+ /* References to pointers have a double indirection here.  */
+ if (TREE_CODE (t) == INDIRECT_REF)
+   t = TREE_OPERAND (t, 0);
+   }
  if (TREE_CODE (t) == COMPONENT_REF
  && ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP
  || ort == C_ORT_ACC)


There is already a large plethora of such modifications in this patch:
"[PATCH, OG10, OpenMP 5.0, committed] Remove array section base-pointer mapping 
semantics, and other front-end adjustments."
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570075.html

I am in the process of taking that patch to mainline, so are you sure this is 
not already handled there?


diff --git a/libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C 
b/libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C
index dacbb520f3d..e038e9e3802 100644
--- a/libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C
+++ b/libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C
@@ -83,7 +83,7 @@ void strrp (void)
a[0] = 8;
c[0] = 10;
e[0] = 12;
-  #pragma acc parallel copy(n->a[0:10], n->c[0:10], n->e[0:10])
+  #pragma acc parallel copy(n->a[0:10], n->b, n->c[0:10], n->d, n->e[0:10])
{
  n->a[0] = n->c[0] + n->e[0];
}


This testcase can be added.

Chung-Lin






Re: [PATCH 7/7] [og10] WIP GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION changes

2021-05-17 Thread Chung-Lin Tang

On 2021/5/11 4:57 PM, Julian Brown wrote:

This work-in-progress patch tries to get
GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION to behave more like
GOMP_MAP_ATTACH_DETACH -- in that the mapping is made to form groups
to be processed by build_struct_group/build_struct_comp_map.  I think
that's important to integrate with how groups of mappings for array
sections are handled in other cases.

This patch isn't sufficient by itself to fix a couple of broken test cases
at present (libgomp.c++/target-lambda-1.C, libgomp.c++/target-this-4.C),
though.


No, GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION is supposed to be just a slightly
different behavior version of GOMP_MAP_ATTACH; it tolerates an unmapped
pointer-target and assigns NULL on the device, instead of just gomp_fatal().
(see its handling in libgomp/target.c)

In case OpenACC can have the same such zero-length array section behavior,
we can just share one GOMP_MAP_ATTACH map. For now it is treated as separate
cases.

Chung-Lin


2021-05-11  Julian Brown  

gcc/
* gimplify.c (build_struct_comp_nodes): Add
GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION handling.
(build_struct_group): Process GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION
as part of pointer group.
(gimplify_scan_omp_clauses): Update prev_list_p such that
GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION will form part of pointer
group.
---
  gcc/gimplify.c | 16 
  1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 6d204908c82..c5cb486aa23 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -8298,7 +8298,9 @@ build_struct_comp_nodes (enum tree_code code, tree 
grp_start, tree grp_end,
if (grp_mid
&& OMP_CLAUSE_CODE (grp_mid) == OMP_CLAUSE_MAP
&& (OMP_CLAUSE_MAP_KIND (grp_mid) == GOMP_MAP_ALWAYS_POINTER
- || OMP_CLAUSE_MAP_KIND (grp_mid) == GOMP_MAP_ATTACH_DETACH))
+ || OMP_CLAUSE_MAP_KIND (grp_mid) == GOMP_MAP_ATTACH_DETACH
+ || (OMP_CLAUSE_MAP_KIND (grp_mid)
+ == GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION)))
  {
tree c3
= build_omp_clause (OMP_CLAUSE_LOCATION (grp_end), OMP_CLAUSE_MAP);
@@ -8774,12 +8776,14 @@ build_struct_group (struct gimplify_omp_ctx *ctx,
 ? splay_tree_lookup (ctx->variables, (splay_tree_key) decl)
 : NULL);
bool ptr = (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ALWAYS_POINTER);
-  bool attach_detach = (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH);
+  bool attach_detach = (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH
+   || (OMP_CLAUSE_MAP_KIND (c)
+   == GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION));
bool attach = (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
 || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH);
bool has_attachments = false;
/* For OpenACC, pointers in structs should trigger an attach action.  */
-  if (attach_detach
+  if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH
&& ((region_type & (ORT_ACC | ORT_TARGET | ORT_TARGET_DATA))
  || code == OMP_TARGET_ENTER_DATA
  || code == OMP_TARGET_EXIT_DATA))
@@ -9784,6 +9788,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
  if (!remove
  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_ALWAYS_POINTER
  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_ATTACH_DETACH
+ && (OMP_CLAUSE_MAP_KIND (c)
+ != GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION)
  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET
  && OMP_CLAUSE_CHAIN (c)
  && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (c)) == OMP_CLAUSE_MAP
@@ -9792,7 +9798,9 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
  || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
  == GOMP_MAP_ATTACH_DETACH)
  || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
- == GOMP_MAP_TO_PSET)))
+ == GOMP_MAP_TO_PSET)
+ || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
+ == GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION)))
prev_list_p = list_p;
  
  	  break;




Re: [PATCH 7/7] [og10] WIP GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION changes

2021-05-18 Thread Chung-Lin Tang

On 2021/5/17 10:26 PM, Julian Brown wrote:

OK, understood. But, I'm a bit concerned that we're ignoring some
"hidden rules" with regards to OMP pointer clause ordering/grouping that
certain code (at least the bit that creates GOMP_MAP_STRUCT node
groups, and parts of omp-low.c) relies on. I believe those rules are as
follows:

  - an array slice is mapped using two or three pointers -- two for a
normal (non-reference) base pointer, and three if we have a
reference to a pointer (i.e. in C++) or an array descriptor (i.e. in
Fortran). So we can have e.g.

GOMP_MAP_TO
GOMP_MAP_ALWAYS_POINTER

GOMP_MAP_TO
GOMP_MAP_.*_POINTER
GOMP_MAP_ALWAYS_POINTER

GOMP_MAP_TO
GOMP_MAP_TO_PSET
GOMP_MAP_ALWAYS_POINTER

  - for OpenACC, we extend this to allow (up to and including
gimplify.c) the GOMP_MAP_ATTACH_DETACH mapping. So we can have (for
component refs):

GOMP_MAP_TO
GOMP_MAP_ATTACH_DETACH

GOMP_MAP_TO
GOMP_MAP_TO_PSET
GOMP_MAP_ATTACH_DETACH

GOMP_MAP_TO
GOMP_MAP_.*_POINTER
GOMP_MAP_ATTACH_DETACH

For the scanning in insert_struct_comp_map (as it is at present) to
work right, these groups must stay intact.  I think the current
behaviour of omp_target_reorder_clauses on the og10 branch can break
those groups apart though!


Originally this sorting was intended to enforce OpenMP 5.0 map ordering
rules, although I did add some ATTACH_DETACH ordering code in the latest
round of patching. May not be the best practice.


(The "prev_list_p" stuff in the loop in question in gimplify.c just
keeps track of the first node in these groups.)


Such a brittle way of doing this; even the variable name is not that
obvious in what it intends to do.


For OpenACC, the GOMP_MAP_ATTACH_DETACH code does*not*  depend on the
previous clause when lowering in omp-low.c. But GOMP_MAP_ALWAYS_POINTER
does! And in one case ("update" directive), GOMP_MAP_ATTACH_DETACH is
rewritten to GOMP_MAP_ALWAYS_POINTER, so for that case at least, the
dependency on the preceding mapping node must stay intact.


Yes, I think there are some weird conventions here, stemming from the 
front-ends.
I would think that _ALWAYS_POINTER should exist at a similar level like 
_ATTACH_DETACH,
both a pointer operation, just different details in runtime behavior, though its
intended purpose for C++ references seem to skew some things here and there.


OpenACC also allows "bare" GOMP_MAP_ATTACH and GOMP_MAP_DETACH nodes
(corresponding to the "attach" and "detach" clauses). Those are handled
a bit differently to GOMP_MAP_ATTACH_DETACH in gimplify.c -- but
GOMP_MAP_ATTACH_Z_L_A_S doesn't quite behave like that either, I don't
think?


IIRC, GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION was handled that way (just a 
single
line in gimplify.c) due to idiosyncrasies with the surrounding generated
maps from the C++ front-end (which ATM is the only user of this map-kind).
So yeah, inside the compiler, its not entirely the same as GOMP_MAP_ATTACH,
but it is intended to live through for the runtime to see.


Anyway: I've not entirely understood what omp_target_reorder_clauses is
doing, but I think it may need to try harder to keep the groups
mentioned above together.  What do you think?


As you know, attach operations don't really need to be glued to the prior
operations, it just has to be ordered after mapping of the pointer and the 
pointed.

There's already some book-keeping to move clauses together, but as you say,
it might need more.

Overall, I think this re-organizing of the struct-group creation is a good 
thing,
but actually as you probably also observed, this insistence of "in-flight"
tree chain manipulation is just hard to work with and modify.

Maybe instead of directly working on clause expression chains at this point, we
should be stashing all this information into a single clause tree node,
e.g. starting from the front-end, we can set
'OMP_CLAUSE_MAP_POINTER_KIND(c) = ALWAYS/ATTACH_DETACH/FIRSTPRIVATE/etc.',
(instead of actually creating new, must-follow-in-order maps that's causing all
these conventions).

For struct-groups, during the start of gimplify_scan_omp_clauses(), we could 
work
with map clause tree nodes with OMP_CLAUSE_MAP_STRUCT_LIST(c), which contains 
the
entire TREE_LIST or VEC of elements. Then later, after scanning is complete,
expand the list into the current form. Ordering is only created at this stage.

Just an idea, not sure if it will help understandability in general, but it
should definitely help to simplify when we're reordering due to other rules.

Chung-Lin


[PATCH, OpenMP 5.0] Remove array section base-pointer mapping semantics, and other front-end adjustments (mainline trunk)

2021-05-25 Thread Chung-Lin Tang

Hi Jakub,
this is a version of this patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570075.html
for mainline trunk.

This patch largely implements three pieces of functionality:

(1) Per discussion and clarification on the omp-lang mailing list,
standards conforming behavior for mapping array sections should *NOT* also map 
the base-pointer,
i.e for this code:

struct S { int *ptr; ... };
struct S s;
#pragma omp target enter data map(to: s.ptr[:100])

Currently we generate after gimplify:
#pragma omp target enter data map(struct:s [len: 1]) map(alloc:s.ptr [len: 8]) \
   map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 
0])

which is deemed incorrect. After this patch, the gimplify results are now 
adjusted to:
#pragma omp target enter data map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0])
(the attach operation is still generated, and if s.ptr is already mapped prior, 
attachment will happen)

The correct way of achieving the base-pointer-also-mapped behavior would be to 
use:
#pragma omp target enter data map(to: s.ptr, s.ptr[:100])

This adjustment in behavior required a number of small adjustments here and 
there in gimplify, including
to accomodate map sequences for C++ references.

There is also a small Fortran front-end patch involved (hence CCing Tobias and 
fortran@).
The new gimplify processing changed behavior in handling 
GOMP_MAP_ALWAYS_POINTER maps such that
the libgomp.fortran/struct-elem-map-1.f90 regressed. It appeared that the 
Fortran FE was generating
a GOMP_MAP_ALWAYS_POINTER for array types, which didn't seem quite correct, and 
the pre-patch behavior
was removing this map anyways. I have a small change in 
trans-openmp.c:gfc_trans_omp_array_section
to not generate the map in this case, and so far no bad test results.

(2) The second part (though kind of related to the first above) are fixes in 
libgomp/target.c
to not overwrite attached pointers when handling device<->host copies, mainly for the 
"always" case.
This behavior is also noted in the 5.0 spec, but not yet properly coded before.

(3) The third is a set of changes to the C/C++ front-ends to extend the allowed 
component access syntax
in map clauses. This is actually mainly an effort to allow SPEC HPC to compile, 
so despite in the long
term the entire map clause syntax parsing is probably going to be revamped, 
we're still adding this in
for now. These changes are enabled for both OpenACC and OpenMP.

Tested on x86_64-linux with nvptx offloading with no regressions. This patch 
was merged and tested atop
of the prior submitted patches:
 (a) https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570886.html
 "[PATCH, OpenMP 5.0] Improve OpenMP target support for C++ (includes PR92120 
v3)"
 (b) https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570365.html
 "[PATCH, OpenMP 5.0] Implement relaxation of implicit map vs. existing device 
mappings (for mainline trunk)"
so you might queued this one later than those for review.

Thanks,
Chung-Lin

2021-05-25  Chung-Lin Tang  

gcc/c/ChangeLog:

* c-parser.c (struct omp_dim): New struct type for use inside
c_parser_omp_variable_list.
(c_parser_omp_variable_list): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(c_parser_omp_clause_to): Set 'allow_deref' to true in call to
c_parser_omp_var_list_parens.
(c_parser_omp_clause_from): Likewise.
* c-typeck.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(c_finish_omp_clauses): Extend allowed ranged of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.

gcc/cp/ChangeLog:

* parser.c (struct omp_dim): New struct type for use inside
cp_parser_omp_var_list_no_open.
(cp_parser_omp_var_list_no_open): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to
cp_parser_omp_var_list for to/from clauses.
* semantics.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(handle_omp_array_sections): Adjust pointer map generation of
references.
(finish_omp_clauses): Extend allowed ranged of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.

gcc/fortran/ChangeLog:

* trans-openmp.c (gfc_trans_omp_array_section): Do not generate
GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type.

gcc/ChangeLog:

* gimplify.c (extract_base_bit_offset): Add 'tree *offsetp' parameter,
accomodate case where 'offset' return of get_inner_r

[PATCH, OpenMP, Fortran] Support in_reduction for Fortran

2021-09-17 Thread Chung-Lin Tang

Hi Jakub, and Fortran folks,
this patch does the required adjustments to let 'in_reduction' work for Fortran.
Not just for the target directive actually, task directive is also working after
this patch.

There is a little bit of adjustment in omp-low.c:scan_sharing_clauses:
RTL expand of the copy of the OMP_CLAUSE_IN_REDUCTION decl was failing
for Fortran by-reference arguments, which seems to work after placing them
under the outer ctx (when it exists). This also now needs checking the field_map
for existence of the field before inserting.

Tested without regressions on mainline trunk, is this okay?

(testing for devel/omp/gcc-11 is in progress)

Thanks,
Chung-Lin

2021-09-17  Chung-Lin Tang  

gcc/fortran/ChangeLog:

* openmp.c (gfc_match_omp_clause_reduction): Add 'openmp_target' default
false parameter. Add 'always,tofrom' map for OMP_LIST_IN_REDUCTION case.
(gfc_match_omp_clauses): Add 'openmp_target' default false parameter,
adjust call to gfc_match_omp_clause_reduction.
(match_omp): Adjust call to gfc_match_omp_clauses
* trans-openmp.c (gfc_trans_omp_taskgroup): Add call to
gfc_match_omp_clause, create and return block.

gcc/ChangeLog:

* omp-low.c (scan_sharing_clauses): Place in_reduction copy of variable
in outer ctx if if exists. Check if non-existent in field_map before
installing OMP_CLAUSE_IN_REDUCTION decl.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/reduction4.f90: Adjust omp target in_reduction' scan
pattern.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/target-in-reduction-1.f90: New test.
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index a64b7f5aa10..8179b5aa8bc 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1138,7 +1138,7 @@ failed:
 
 static match
 gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses *c, bool openacc,
-   bool allow_derived)
+   bool allow_derived, bool openmp_target = false)
 {
   if (pc == 'r' && gfc_match ("reduction ( ") != MATCH_YES)
 return MATCH_NO;
@@ -1285,6 +1285,19 @@ gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses 
*c, bool openacc,
n->u2.udr = gfc_get_omp_namelist_udr ();
n->u2.udr->udr = udr;
  }
+   if (openmp_target && list_idx == OMP_LIST_IN_REDUCTION)
+ {
+   gfc_omp_namelist *p = gfc_get_omp_namelist (), **tl;
+   p->sym = n->sym;
+   p->where = p->where;
+   p->u.map_op = OMP_MAP_ALWAYS_TOFROM;
+
+   tl = &c->lists[OMP_LIST_MAP];
+   while (*tl)
+ tl = &((*tl)->next);
+   *tl = p;
+   p->next = NULL;
+ }
  }
   return MATCH_YES;
 }
@@ -1353,7 +1366,7 @@ gfc_match_dupl_atomic (bool not_dupl, const char *name)
 static match
 gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
   bool first = true, bool needs_space = true,
-  bool openacc = false)
+  bool openacc = false, bool openmp_target = false)
 {
   bool error = false;
   gfc_omp_clauses *c = gfc_get_omp_clauses ();
@@ -2057,8 +2070,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
  goto error;
}
  if ((mask & OMP_CLAUSE_IN_REDUCTION)
- && gfc_match_omp_clause_reduction (pc, c, openacc,
-allow_derived) == MATCH_YES)
+ && gfc_match_omp_clause_reduction (pc, c, openacc, allow_derived,
+openmp_target) == MATCH_YES)
continue;
  if ((mask & OMP_CLAUSE_INBRANCH)
  && (m = gfc_match_dupl_check (!c->inbranch && !c->notinbranch,
@@ -3496,7 +3509,8 @@ static match
 match_omp (gfc_exec_op op, const omp_mask mask)
 {
   gfc_omp_clauses *c;
-  if (gfc_match_omp_clauses (&c, mask) != MATCH_YES)
+  if (gfc_match_omp_clauses (&c, mask, true, true, false,
+(op == EXEC_OMP_TARGET)) != MATCH_YES)
 return MATCH_ERROR;
   new_st.op = op;
   new_st.ext.omp_clauses = c;
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index e55e0c81868..08483951066 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -6391,12 +6391,17 @@ gfc_trans_omp_task (gfc_code *code)
 static tree
 gfc_trans_omp_taskgroup (gfc_code *code)
 {
+  stmtblock_t block;
+  gfc_start_block (&block);
   tree body = gfc_trans_code (code->block->next);
   tree stmt = make_node (OMP_TASKGROUP);
   TREE_TYPE (stmt) = void_type_node;
   OMP_TASKGROUP_BODY (stmt) = body;
-  OMP_TASKGROUP_CLAUSES (stmt) = NULL_TREE;
-  return stmt;
+  OMP_TASKGROUP_CLAUSES (stmt) = gfc_trans_omp_clauses (&

[PATCH, OpenMP 5.1, Fortran] Strictly-structured block support for OpenMP directives

2021-10-07 Thread Chung-Lin Tang

Hi all,
this patch add support for "strictly-structured blocks" introduced in OpenMP 
5.1,
basically allowing BLOCK constructs to serve as the body for directives:

!$omp target
block
  ...
end block
[!$omp end target]  !! end directive is optional

!$omp parallel
block
  ...
end block
...
!$omp end parallel  !! error, considered as not match to above parallel 
directive

The parsing loop in parse_omp_structured_block() has been modified to allow
a BLOCK construct after the first statement has been detected to be ST_BLOCK.
This is done by a hard modification of the state into (the new) 
COMP_OMP_STRICTLY_STRUCTURED_BLOCK
after the statement is known (I'm not sure if there's a way to 'peek' the next
statement/token in the Fortran FE, open to suggestions on how to better write 
this)

Tested with no regressions on trunk, is this okay to commit?

Thanks,
Chung-Lin

2021-10-07  Chung-Lin Tang  

gcc/fortran/ChangeLog:

* decl.c (gfc_match_end): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK case
together with COMP_BLOCK.
* parse.c (parse_omp_structured_block): Adjust declaration, add
'bool strictly_structured_block' default true parameter, add handling
for strictly-structured block case, adjust recursive calls to
parse_omp_structured_block.
(parse_executable): Adjust calls to parse_omp_structured_block.
* parse.h (enum gfc_compile_state): Add
COMP_OMP_STRICTLY_STRUCTURED_BLOCK.
* trans-openmp.c (gfc_trans_omp_workshare): Add EXEC_BLOCK case
handling.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/strictly-structured-block-1.f90: New test.
diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index b3c65b7175b..ff66d1f9475 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -8445,6 +8445,7 @@ gfc_match_end (gfc_statement *st)
   break;
 
 case COMP_BLOCK:
+case COMP_OMP_STRICTLY_STRUCTURED_BLOCK:
   *st = ST_END_BLOCK;
   target = " block";
   eos_ok = 0;
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 7d765a0866d..d78bf9b8fa5 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -5451,8 +5451,9 @@ parse_oacc_loop (gfc_statement acc_st)
 
 /* Parse the statements of an OpenMP structured block.  */
 
-static void
-parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only)
+static gfc_statement
+parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only,
+   bool strictly_structured_block = true)
 {
   gfc_statement st, omp_end_st;
   gfc_code *cp, *np;
@@ -5538,6 +5539,32 @@ parse_omp_structured_block (gfc_statement omp_st, bool 
workshare_stmts_only)
   gcc_unreachable ();
 }
 
+  bool block_construct = false;
+  gfc_namespace* my_ns = NULL;
+  gfc_namespace* my_parent = NULL;
+
+  st = next_statement ();
+
+  if (strictly_structured_block && st == ST_BLOCK)
+{
+  /* Adjust state to a strictly-structured block, now that we found that
+the body starts with a BLOCK construct.  */
+  s.state = COMP_OMP_STRICTLY_STRUCTURED_BLOCK;
+
+  block_construct = true;
+  gfc_notify_std (GFC_STD_F2008, "BLOCK construct at %C");
+
+  my_ns = gfc_build_block_ns (gfc_current_ns);
+  gfc_current_ns = my_ns;
+  my_parent = my_ns->parent;
+
+  new_st.op = EXEC_BLOCK;
+  new_st.ext.block.ns = my_ns;
+  new_st.ext.block.assoc = NULL;
+  accept_statement (ST_BLOCK);
+  st = parse_spec (ST_NONE);
+}
+
   do
 {
   if (workshare_stmts_only)
@@ -5554,7 +5581,6 @@ parse_omp_structured_block (gfc_statement omp_st, bool 
workshare_stmts_only)
 restrictions apply recursively.  */
  bool cycle = true;
 
- st = next_statement ();
  for (;;)
{
  switch (st)
@@ -5576,17 +5602,20 @@ parse_omp_structured_block (gfc_statement omp_st, bool 
workshare_stmts_only)
  parse_forall_block ();
  break;
 
+   case ST_OMP_PARALLEL_SECTIONS:
+ st = parse_omp_structured_block (st, false, false);
+ continue;
+
case ST_OMP_PARALLEL:
case ST_OMP_PARALLEL_MASKED:
case ST_OMP_PARALLEL_MASTER:
-   case ST_OMP_PARALLEL_SECTIONS:
- parse_omp_structured_block (st, false);
- break;
+ st = parse_omp_structured_block (st, false);
+ continue;
 
case ST_OMP_PARALLEL_WORKSHARE:
case ST_OMP_CRITICAL:
- parse_omp_structured_block (st, true);
- break;
+ st = parse_omp_structured_block (st, true);
+ continue;
 
case ST_OMP_PARALLEL_DO:
case ST_OMP_PARALLEL_DO_SIMD:
@@ -5609,7 +5638,7 @@ parse_omp_structured_block (gfc_statement omp_st, boo

Re: [PATCH, OpenMP 5.1, Fortran] Strictly-structured block support for OpenMP directives

2021-10-15 Thread Chung-Lin Tang

On 2021/10/14 7:19 PM, Jakub Jelinek wrote:

On Thu, Oct 14, 2021 at 12:20:51PM +0200, Jakub Jelinek via Gcc-patches wrote:

Thinking more about the Fortran case for !$omp sections, there is an
ambiguity.
!$omp sections
block
   !$omp section
end block
is clear and !$omp end sections is optional, but
!$omp sections
block
end block
is ambiguous during parsing, it could be either followed by !$omp section
and then the BLOCK would be first section, or by !$omp end sections and then
it would be clearly the whole sections, with first section being empty
inside of the block, or if it is followed by something else, it is
ambiguous whether the block ... end block is part of the first section,
followed by something and then we should be looking later for either
!$omp section or !$omp end section to prove that, or if
!$omp sections
block
end block
was the whole sections construct and we shouldn't await anything further.
I'm afraid back to the drawing board.


And I have to correct myself, there is no ambiguity in 5.2 here,
the important fact is hidden in sections/parallel sections being
block-associated constructs.  That means the body of the whole construct
has to be a structured-block, and by the 5.1+ definition of Fortran
structured block, it is either block ... end block or something that
doesn't start with block.
So,
!$omp sections
block
end block
a = 1
is only ambiguous in whether it is actually
!$omp sections
block
   !$omp section
end block
a = 1
or
!$omp sections
!$omp section
block
end block
!$omp end sections
a = 1
but both actually do the same thing, work roughly as !$omp single.
If one wants block statement as first in structured-block-sequence
of the first section, followed by either some further statements
or by other sections, then one needs to write
!$omp sections
!$omp section
block
end block
a = 1
...
!$omp end sections
or
!$omp sections
block
   block
   end block
   a = 1
...
end block

Your patch probably already handles it that way, but we again need
testsuite coverage to prove it is handled the way it should in all these
cases (and that we diagnose what is invalid).


The patch currently does not allow strictly-structured BLOCK for 
sections/parallel sections,
since I was referencing the 5.1 spec while writing it, although that is 
trivially fixable.
(was sensing a bit odd why those two constructs had to be specially treated in 
5.1 anyways)

The bigger issue is that under the current way the patch is written, the 
statements inside
a [parallel] sections construct are parsed automatically by parse_executable(), 
so to enforce
the specified meaning of "structured-block-sequence" (i.e. BLOCK or non-BLOCK 
starting sequence of stmts)
will probably be more a bit harder to implement:

!$omp sections
block
   !$omp section
   block
 x=0
   end block
   x=1   !! This is allowed now, though should be wrong spec-wise
   !$omp section
   x=2
end block

Currently "$!omp section" acts essentially as a top-level separator within a 
sections-construct,
rather than a structured directive. Though I would kind of argue this is 
actually better to use for the
user (why prohibit what looks like very apparent meaning of the program?)

So Jakub, my question for this is, is this current state okay? Or must we 
implement the spec pedantically?

As for the other issues:
(1) BLOCK/END BLOCK is not generally handled in parse_omp_structured_block, so 
for workshare,
it is only handled for the top-level construct, not within workshare. I 
think this is what you meant
in the last mail.

(2) As for the dangling-!$omp_end issue Tobias raised, because we are basically 
using 1-statement lookahead,
any "!$omp end <*>" is naturally bound with the adjacent BLOCK/END BLOCK, 
so we should be okay there.

Thanks,
Chung-Lin


[PATCH, v2, OpenMP, Fortran] Support in_reduction for Fortran

2021-10-19 Thread Chung-Lin Tang
t have any more evidence this is needed, so
removed now.


--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-in-reduction-1.f90
@@ -0,0 +1,33 @@
+! { dg-do run }
+
+subroutine foo (x, y)

...

+  if (x .ne. 11) stop 1
+  if (y .ne. 21) stop 2
+
+end program main


Again, something that can be dealt incrementally, but the
testsuite coverage of
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573600.html
was larger than this.  Would be nice e.g. to cover both scalar vars
and array sections/arrays, parameters passed by reference as in the
above testcase, but also something that isn't a reference (either a local
variable or dummy parameter with VALUE, etc.

Jakub


I have expanded target-in-reduction-1.f90 to cover local variables and
VALUE passed parameters. Array sections in reductions appear to be still
not supported by the Fortran FE in general (Tobias plans to work on that later).

I also added another target-in-reduction-2.f90 testcase that tests the 
"orphaned"
case in Fortran, where the task/target-in_reduction is in another separate 
subroutine.

Tested without regressions on trunk, is this okay to commit?

Thanks,
Chung-Lin

2021-10-19  Chung-Lin Tang  

gcc/fortran/ChangeLog:

* openmp.c (gfc_match_omp_clause_reduction): Add 'openmp_target' default
false parameter. Add 'always,tofrom' map for OMP_LIST_IN_REDUCTION case.
(gfc_match_omp_clauses): Add 'openmp_target' default false parameter,
adjust call to gfc_match_omp_clause_reduction.
(match_omp): Adjust call to gfc_match_omp_clauses
* trans-openmp.c (gfc_trans_omp_taskgroup): Add call to
gfc_match_omp_clause, create and return block.

gcc/ChangeLog:

* omp-low.c (omp_copy_decl_2): For !ctx, use record_vars to add new copy
as local variable.
(scan_sharing_clauses): Place copy of OMP_CLAUSE_IN_REDUCTION decl in
ctx->outer instead of ctx.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/reduction4.f90: Adjust omp target in_reduction' scan
pattern.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/target-in-reduction-1.f90: New test.
* testsuite/libgomp.fortran/target-in-reduction-2.f90: New test.diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 6a4ca2868f8..210fb06dbec 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1138,7 +1138,7 @@ failed:
 
 static match
 gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses *c, bool openacc,
-   bool allow_derived)
+   bool allow_derived, bool openmp_target = false)
 {
   if (pc == 'r' && gfc_match ("reduction ( ") != MATCH_YES)
 return MATCH_NO;
@@ -1285,6 +1285,19 @@ gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses 
*c, bool openacc,
n->u2.udr = gfc_get_omp_namelist_udr ();
n->u2.udr->udr = udr;
  }
+   if (openmp_target && list_idx == OMP_LIST_IN_REDUCTION)
+ {
+   gfc_omp_namelist *p = gfc_get_omp_namelist (), **tl;
+   p->sym = n->sym;
+   p->where = p->where;
+   p->u.map_op = OMP_MAP_ALWAYS_TOFROM;
+
+   tl = &c->lists[OMP_LIST_MAP];
+   while (*tl)
+ tl = &((*tl)->next);
+   *tl = p;
+   p->next = NULL;
+ }
  }
   return MATCH_YES;
 }
@@ -1353,7 +1366,7 @@ gfc_match_dupl_atomic (bool not_dupl, const char *name)
 static match
 gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
   bool first = true, bool needs_space = true,
-  bool openacc = false)
+  bool openacc = false, bool openmp_target = false)
 {
   bool error = false;
   gfc_omp_clauses *c = gfc_get_omp_clauses ();
@@ -2057,8 +2070,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
  goto error;
}
  if ((mask & OMP_CLAUSE_IN_REDUCTION)
- && gfc_match_omp_clause_reduction (pc, c, openacc,
-allow_derived) == MATCH_YES)
+ && gfc_match_omp_clause_reduction (pc, c, openacc, allow_derived,
+openmp_target) == MATCH_YES)
continue;
  if ((mask & OMP_CLAUSE_INBRANCH)
  && (m = gfc_match_dupl_check (!c->inbranch && !c->notinbranch,
@@ -3512,7 +3525,8 @@ static match
 match_omp (gfc_exec_op op, const omp_mask mask)
 {
   gfc_omp_clauses *c;
-  if (gfc_match_omp_clauses (&c, mask) != MATCH_YES)
+  if (gfc_match_omp_clauses (&c, mask, true, true, false,
+op == EXEC_OMP_TARGET) != MATCH_YES)
 return MATCH_ERROR;
   new_st.op = op;
   new_st.ext.omp_clauses = c;
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-ope

[PATCH, v2, OpenMP 5.2, Fortran] Strictly-structured block support for OpenMP directives

2021-10-20 Thread Chung-Lin Tang

Hi Jakub,
this version adjusts the patch to let sections/parallel sections also use
strictly-structured blocks, making it more towards 5.2.

Because of this change, some of the testcases using the sections-construct need
a bit of adjustment too, since "block; end block" at the start of the construct
now means something different than before.

There are now three new testcases, with the non-dg-error/dg-error cases 
separated,
and a third testcase containing a few cases listed in prior emails. I hope this 
is
enough.

The implementation status entry in libgomp/libgomp.texi for strictly-structured 
blocks
has also been changed to "Y" in this patch.

Tested without regressions, is this now okay for trunk?

Thanks,
Chung-Lin

2021-10-20  Chung-Lin Tang  

gcc/fortran/ChangeLog:

* decl.c (gfc_match_end): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK case
together with COMP_BLOCK.
* parse.c (parse_omp_structured_block): Change return type to
'gfc_statement', add handling for strictly-structured block case, adjust
recursive calls to parse_omp_structured_block.
(parse_executable): Adjust calls to parse_omp_structured_block.
* parse.h (enum gfc_compile_state): Add
COMP_OMP_STRICTLY_STRUCTURED_BLOCK.
* trans-openmp.c (gfc_trans_omp_workshare): Add EXEC_BLOCK case
handling.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/cancel-1.f90: Adjust testcase.
* gfortran.dg/gomp/nesting-3.f90: Adjust testcase.
* gfortran.dg/gomp/strictly-structured-block-1.f90: New test.
* gfortran.dg/gomp/strictly-structured-block-2.f90: New test.
* gfortran.dg/gomp/strictly-structured-block-3.f90: New test.

libgomp/ChangeLog:

* libgomp.texi (Support of strictly structured blocks in Fortran):
Adjust to 'Y'.
* testsuite/libgomp.fortran/task-reduction-16.f90: Adjust testcase.
diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index d6a22d13451..66489da12be 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -8449,6 +8449,7 @@ gfc_match_end (gfc_statement *st)
   break;
 
 case COMP_BLOCK:
+case COMP_OMP_STRICTLY_STRUCTURED_BLOCK:
   *st = ST_END_BLOCK;
   target = " block";
   eos_ok = 0;
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 7d765a0866d..2fb98844356 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -5451,7 +5451,7 @@ parse_oacc_loop (gfc_statement acc_st)
 
 /* Parse the statements of an OpenMP structured block.  */
 
-static void
+static gfc_statement
 parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only)
 {
   gfc_statement st, omp_end_st;
@@ -5538,6 +5538,32 @@ parse_omp_structured_block (gfc_statement omp_st, bool 
workshare_stmts_only)
   gcc_unreachable ();
 }
 
+  bool block_construct = false;
+  gfc_namespace *my_ns = NULL;
+  gfc_namespace *my_parent = NULL;
+
+  st = next_statement ();
+
+  if (st == ST_BLOCK)
+{
+  /* Adjust state to a strictly-structured block, now that we found that
+the body starts with a BLOCK construct.  */
+  s.state = COMP_OMP_STRICTLY_STRUCTURED_BLOCK;
+
+  block_construct = true;
+  gfc_notify_std (GFC_STD_F2008, "BLOCK construct at %C");
+
+  my_ns = gfc_build_block_ns (gfc_current_ns);
+  gfc_current_ns = my_ns;
+  my_parent = my_ns->parent;
+
+  new_st.op = EXEC_BLOCK;
+  new_st.ext.block.ns = my_ns;
+  new_st.ext.block.assoc = NULL;
+  accept_statement (ST_BLOCK);
+  st = parse_spec (ST_NONE);
+}
+
   do
 {
   if (workshare_stmts_only)
@@ -5554,7 +5580,6 @@ parse_omp_structured_block (gfc_statement omp_st, bool 
workshare_stmts_only)
 restrictions apply recursively.  */
  bool cycle = true;
 
- st = next_statement ();
  for (;;)
{
  switch (st)
@@ -5580,13 +5605,13 @@ parse_omp_structured_block (gfc_statement omp_st, bool 
workshare_stmts_only)
case ST_OMP_PARALLEL_MASKED:
case ST_OMP_PARALLEL_MASTER:
case ST_OMP_PARALLEL_SECTIONS:
- parse_omp_structured_block (st, false);
- break;
+ st = parse_omp_structured_block (st, false);
+ continue;
 
case ST_OMP_PARALLEL_WORKSHARE:
case ST_OMP_CRITICAL:
- parse_omp_structured_block (st, true);
- break;
+ st = parse_omp_structured_block (st, true);
+ continue;
 
case ST_OMP_PARALLEL_DO:
case ST_OMP_PARALLEL_DO_SIMD:
@@ -5609,7 +5634,7 @@ parse_omp_structured_block (gfc_statement omp_st, bool 
workshare_stmts_only)
}
}
   else
-   st = parse_executable (ST_NONE);
+   st = parse_executable (st);
   if (st == ST_NONE)
unexpected_eof ();
   else if (st == S

Re: [PATCH, v2, OpenMP 5.2, Fortran] Strictly-structured block support for OpenMP directives

2021-10-21 Thread Chung-Lin Tang



On 2021/10/21 12:15 AM, Jakub Jelinek wrote:

+program main
+  integer :: x, i, n
+
+  !$omp parallel
+  block
+x = x + 1
+  end block

I'd prefer not to use those x = j or x = x + 1 etc.
as statements that do random work here whenever possible.
While those are dg-do compile testcases, especially if
it is without dg-errors I think it is preferrable not to show
bad coding examples.
E.g. the x = x + 1 above is wrong for 2 reasons, x is uninitialized
before the parallel, and there is a data race, the threads, teams etc.
can write to x concurrently.
I think better would be to use something like
 call do_work
which doesn't have to be defined anywhere and will just stand there
as a black box for unspecified work.


+  !$omp workshare
+  block
+x = x + 1
+  end block

There are exceptions though, e.g. workshare is such a case, because
e.g. call do_work is not valid in workshare.
So, it is ok to keep using x = x + 1 here if you initialize it
first at the start of the program.


+  !$omp workshare
+  block
+x = 1
+!$omp critical
+block
+  x = 3
+end block
+  end block

And then there are cases like the above, please
just use different variables there (all initialized) or
say an array and access different elements in the different spots.

Jakub



Thanks, attached is what I finally committed.

Chung-Lin



From 2e4659199e814b7ee0f6bd925fd2c0a7610da856 Mon Sep 17 00:00:00 2001
From: Chung-Lin Tang 
Date: Thu, 21 Oct 2021 14:56:20 +0800
Subject: [PATCH] openmp: Fortran strictly-structured blocks support

This implements strictly-structured blocks support for Fortran, as specified in
OpenMP 5.2. This now allows using a Fortran BLOCK construct as the body of most
OpenMP constructs, with a "!$omp end ..." ending directive optional for that
form.

gcc/fortran/ChangeLog:

* decl.c (gfc_match_end): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK case
together with COMP_BLOCK.
* parse.c (parse_omp_structured_block): Change return type to
'gfc_statement', add handling for strictly-structured block case, adjust
recursive calls to parse_omp_structured_block.
(parse_executable): Adjust calls to parse_omp_structured_block.
* parse.h (enum gfc_compile_state): Add
COMP_OMP_STRICTLY_STRUCTURED_BLOCK.
* trans-openmp.c (gfc_trans_omp_workshare): Add EXEC_BLOCK case
handling.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/cancel-1.f90: Adjust testcase.
* gfortran.dg/gomp/nesting-3.f90: Adjust testcase.
* gfortran.dg/gomp/strictly-structured-block-1.f90: New test.
* gfortran.dg/gomp/strictly-structured-block-2.f90: New test.
* gfortran.dg/gomp/strictly-structured-block-3.f90: New test.

libgomp/ChangeLog:

* libgomp.texi (Support of strictly structured blocks in Fortran):
Adjust to 'Y'.
* testsuite/libgomp.fortran/task-reduction-16.f90: Adjust testcase.
---
 gcc/fortran/decl.c|   1 +
 gcc/fortran/parse.c   |  69 +-
 gcc/fortran/parse.h   |   2 +-
 gcc/fortran/trans-openmp.c|   6 +-
 gcc/testsuite/gfortran.dg/gomp/cancel-1.f90   |   3 +
 gcc/testsuite/gfortran.dg/gomp/nesting-3.f90  |  20 +-
 .../gomp/strictly-structured-block-1.f90  | 214 ++
 .../gomp/strictly-structured-block-2.f90  | 139 
 .../gomp/strictly-structured-block-3.f90  |  52 +
 libgomp/libgomp.texi  |   2 +-
 .../libgomp.fortran/task-reduction-16.f90 |   1 +
 11 files changed, 484 insertions(+), 25 deletions(-)
 create mode 100644 
gcc/testsuite/gfortran.dg/gomp/strictly-structured-block-1.f90
 create mode 100644 
gcc/testsuite/gfortran.dg/gomp/strictly-structured-block-2.f90
 create mode 100644 
gcc/testsuite/gfortran.dg/gomp/strictly-structured-block-3.f90

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 6784b07ae9e..6043e100fbb 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -8429,6 +8429,7 @@ gfc_match_end (gfc_statement *st)
   break;
 
 case COMP_BLOCK:
+case COMP_OMP_STRICTLY_STRUCTURED_BLOCK:
   *st = ST_END_BLOCK;
   target = " block";
   eos_ok = 0;
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 2a454be79b0..b1e73ee6801 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -5459,7 +5459,7 @@ parse_oacc_loop (gfc_statement acc_st)
 
 /* Parse the statements of an OpenMP structured block.  */
 
-static void
+static gfc_statement
 parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only)
 {
   gfc_statement st, omp_end_st;
@@ -5546,6 +5546,32 @@ parse_omp_structured_block (gfc_statement omp_st, bool 
workshare_stmts_only)
   gcc_unreachable ();
 }
 
+  bool block_construct = false;
+  gfc_namespace *my_ns = NULL;
+  gfc_namespace *my_parent = NULL;
+
+  st = next_statement ()

[PATCH, PR90030] Fortran OpenMP/OpenACC array mapping alignment fix

2021-11-04 Thread Chung-Lin Tang

Hi Jakub,
As Thomas reported and submitted a patch a while ago:
https://gcc.gnu.org/pipermail/gcc-patches/2019-April/519932.html
https://gcc.gnu.org/pipermail/gcc-patches/2019-May/522738.html

There's an issue with the Fortran front-end when mapping arrays: when
creating the data MEM_REF for the map clause, there's a convention of
casting the referencing pointer to 'c_char *' by
fold_convert (build_pointer_type (char_type_node), ptr).

This causes the alignment passed to the libgomp runtime for array data
hardwared to '1', and causes alignment errors on the offload target
(not always showing up, but can trigger due to slight change of clause
ordering)

This patch is not exactly Thomas' patch from 2019, but does the same
thing. The new libgomp tests are directly reused though. A lot of
scan test adjustment is also included in this patch.

Patch has been tested for no regressions for gfortran and libgomp, is
this okay for trunk?

Thanks,
Chung-Lin

Fortran: fix array alignment for OpenMP/OpenACC target mapping clauses [PR90030]

The Fortran front-end is creating maps of array data with a type of pointer to
char_type_node, which when eventually passed to libgomp during runtime, marks
the passed array with an alignment of 1, which can cause mapping alignment
errors on the offload target.

This patch removes the related fold_convert(build_pointer_type (char_type_node))
calls in fortran/trans-openmp.c, and adds gcc_asserts to ensure pointer type.

2021-11-04  Chung-Lin Tang  
Thomas Schwinge 

PR fortran/90030

gcc/fortran/ChangeLog:

* trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer
to char_type_node, add gcc_assert of POINTER_TYPE_P.
(gfc_trans_omp_array_section): Likewise.
(gfc_trans_omp_clauses): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/goacc/finalize-1.f: Adjust scan test.
* gfortran.dg/gomp/affinity-clause-1.f90: Likewise.
* gfortran.dg/gomp/affinity-clause-5.f90: Likewise.
* gfortran.dg/gomp/defaultmap-4.f90: Likewise.
* gfortran.dg/gomp/defaultmap-5.f90: Likewise.
* gfortran.dg/gomp/defaultmap-6.f90: Likewise.
* gfortran.dg/gomp/map-3.f90: Likewise.
* gfortran.dg/gomp/pr78260-2.f90: Likewise.
* gfortran.dg/gomp/pr78260-3.f90: Likewise.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-fortran/pr90030.f90: New test.
* testsuite/libgomp.fortran/pr90030.f90: New test.diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index e81c558..0ff90b7 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -1564,7 +1564,7 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p, bool 
openacc)
   if (present)
ptr = gfc_build_cond_assign_expr (&block, present, ptr,
  null_pointer_node);
-  ptr = fold_convert (build_pointer_type (char_type_node), ptr);
+  gcc_assert (POINTER_TYPE_P (TREE_TYPE (ptr)));
   ptr = build_fold_indirect_ref (ptr);
   OMP_CLAUSE_DECL (c) = ptr;
   c2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
@@ -2381,7 +2381,7 @@ gfc_trans_omp_array_section (stmtblock_t *block, 
gfc_omp_namelist *n,
OMP_CLAUSE_SIZE (node), elemsz);
 }
   gcc_assert (se.post.head == NULL_TREE);
-  ptr = fold_convert (build_pointer_type (char_type_node), ptr);
+  gcc_assert (POINTER_TYPE_P (TREE_TYPE (ptr)));
   OMP_CLAUSE_DECL (node) = build_fold_indirect_ref (ptr);
   ptr = fold_convert (ptrdiff_type_node, ptr);
 
@@ -2849,8 +2849,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, 
gfc_omp_clauses *clauses,
  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))
{
  decl = gfc_conv_descriptor_data_get (decl);
- decl = fold_convert (build_pointer_type (char_type_node),
-  decl);
+ gcc_assert (POINTER_TYPE_P (TREE_TYPE (decl)));
  decl = build_fold_indirect_ref (decl);
}
  else if (DECL_P (decl))
@@ -2873,8 +2872,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, 
gfc_omp_clauses *clauses,
}
  gfc_add_block_to_block (&iter_block, &se.pre);
  gfc_add_block_to_block (&iter_block, &se.post);
- ptr = fold_convert (build_pointer_type (char_type_node),
- ptr);
+ gcc_assert (POINTER_TYPE_P (TREE_TYPE (ptr)));
  OMP_CLAUSE_DECL (node) = build_fold_indirect_ref (ptr);
}
  if (list == OMP_LIST_DEPEND)
@@ -3117,8 +3115,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, 
gfc_omp_clauses *clauses,
  if (present)
  

[PATCH, v2, OpenMP 5.0] Remove array section base-pointer mapping semantics, and other front-end adjustments (mainline trunk)

2021-11-19 Thread Chung-Lin Tang

Hi Jakub,
attached is a rebased version of this "OpenMP fixes/adjustments" patch.

This version removes some of the (ort == C_ORT_OMP || ort == C_ORT_ACC) stuff 
that's not needed
in handle_omp_array_sections_1 and [c_]finish_omp_clauses.

Note that this is meant to be patched atop of the recent also posted C++ 
PR92120 v5 patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584602.html

Again, tested without regressions (together with the PR92120 patch), awaiting 
review.

Thanks,
Chung-Lin

(ChangeLog updated below)

On 2021/5/25 9:36 PM, Chung-Lin Tang wrote:


This patch largely implements three pieces of functionality:

(1) Per discussion and clarification on the omp-lang mailing list,
standards conforming behavior for mapping array sections should *NOT* also map 
the base-pointer,
i.e for this code:

 struct S { int *ptr; ... };
 struct S s;
 #pragma omp target enter data map(to: s.ptr[:100])

Currently we generate after gimplify:
#pragma omp target enter data map(struct:s [len: 1]) map(alloc:s.ptr [len: 8]) \
    map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 
0])

which is deemed incorrect. After this patch, the gimplify results are now 
adjusted to:
#pragma omp target enter data map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0])
(the attach operation is still generated, and if s.ptr is already mapped prior, 
attachment will happen)

The correct way of achieving the base-pointer-also-mapped behavior would be to 
use:
#pragma omp target enter data map(to: s.ptr, s.ptr[:100])

This adjustment in behavior required a number of small adjustments here and 
there in gimplify, including
to accomodate map sequences for C++ references.

There is also a small Fortran front-end patch involved (hence CCing Tobias and 
fortran@).
The new gimplify processing changed behavior in handling 
GOMP_MAP_ALWAYS_POINTER maps such that
the libgomp.fortran/struct-elem-map-1.f90 regressed. It appeared that the 
Fortran FE was generating
a GOMP_MAP_ALWAYS_POINTER for array types, which didn't seem quite correct, and 
the pre-patch behavior
was removing this map anyways. I have a small change in 
trans-openmp.c:gfc_trans_omp_array_section
to not generate the map in this case, and so far no bad test results.

(2) The second part (though kind of related to the first above) are fixes in 
libgomp/target.c
to not overwrite attached pointers when handling device<->host copies, mainly for the 
"always" case.
This behavior is also noted in the 5.0 spec, but not yet properly coded before.

(3) The third is a set of changes to the C/C++ front-ends to extend the allowed 
component access syntax
in map clauses. This is actually mainly an effort to allow SPEC HPC to compile, 
so despite in the long
term the entire map clause syntax parsing is probably going to be revamped, 
we're still adding this in
for now. These changes are enabled for both OpenACC and OpenMP.



2021-11-19  Chung-Lin Tang  

gcc/c/ChangeLog:

* c-parser.c (struct omp_dim): New struct type for use inside
c_parser_omp_variable_list.
(c_parser_omp_variable_list): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(c_parser_omp_clause_to): Set 'allow_deref' to true in call to
c_parser_omp_var_list_parens.
(c_parser_omp_clause_from): Likewise.
* c-typeck.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(c_finish_omp_clauses): Extend allowed ranged of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.

gcc/cp/ChangeLog:

* parser.c (struct omp_dim): New struct type for use inside
cp_parser_omp_var_list_no_open.
(cp_parser_omp_var_list_no_open): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to
cp_parser_omp_var_list for to/from clauses.
* semantics.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(handle_omp_array_sections): Adjust pointer map generation of
references.
(finish_omp_clauses): Extend allowed ranged of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.

gcc/fortran/ChangeLog:

* trans-openmp.c (gfc_trans_omp_array_section): Do not generate
GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type.

gcc/ChangeLog:

* gimplify.c (extract_base_bit_offset): Add 'tree *offsetp' parameter,
accomodate case where 'offset' return of get_inner_reference is
non-NULL.
(is_or_contains_p): Further robustify conditions.
  

Re: [PATCH, PR90030] Fortran OpenMP/OpenACC array mapping alignment fix

2021-11-19 Thread Chung-Lin Tang

Ping.

On 2021/11/4 4:23 PM, Chung-Lin Tang wrote:

Hi Jakub,
As Thomas reported and submitted a patch a while ago:
https://gcc.gnu.org/pipermail/gcc-patches/2019-April/519932.html
https://gcc.gnu.org/pipermail/gcc-patches/2019-May/522738.html

There's an issue with the Fortran front-end when mapping arrays: when
creating the data MEM_REF for the map clause, there's a convention of
casting the referencing pointer to 'c_char *' by
fold_convert (build_pointer_type (char_type_node), ptr).

This causes the alignment passed to the libgomp runtime for array data
hardwared to '1', and causes alignment errors on the offload target
(not always showing up, but can trigger due to slight change of clause
ordering)

This patch is not exactly Thomas' patch from 2019, but does the same
thing. The new libgomp tests are directly reused though. A lot of
scan test adjustment is also included in this patch.

Patch has been tested for no regressions for gfortran and libgomp, is
this okay for trunk?

Thanks,
Chung-Lin

Fortran: fix array alignment for OpenMP/OpenACC target mapping clauses [PR90030]

The Fortran front-end is creating maps of array data with a type of pointer to
char_type_node, which when eventually passed to libgomp during runtime, marks
the passed array with an alignment of 1, which can cause mapping alignment
errors on the offload target.

This patch removes the related fold_convert(build_pointer_type (char_type_node))
calls in fortran/trans-openmp.c, and adds gcc_asserts to ensure pointer type.

2021-11-04  Chung-Lin Tang  
     Thomas Schwinge 

 PR fortran/90030

gcc/fortran/ChangeLog:

 * trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer
 to char_type_node, add gcc_assert of POINTER_TYPE_P.
 (gfc_trans_omp_array_section): Likewise.
 (gfc_trans_omp_clauses): Likewise.

gcc/testsuite/ChangeLog:

 * gfortran.dg/goacc/finalize-1.f: Adjust scan test.
 * gfortran.dg/gomp/affinity-clause-1.f90: Likewise.
 * gfortran.dg/gomp/affinity-clause-5.f90: Likewise.
 * gfortran.dg/gomp/defaultmap-4.f90: Likewise.
 * gfortran.dg/gomp/defaultmap-5.f90: Likewise.
 * gfortran.dg/gomp/defaultmap-6.f90: Likewise.
 * gfortran.dg/gomp/map-3.f90: Likewise.
 * gfortran.dg/gomp/pr78260-2.f90: Likewise.
 * gfortran.dg/gomp/pr78260-3.f90: Likewise.

libgomp/ChangeLog:

 * testsuite/libgomp.oacc-fortran/pr90030.f90: New test.
 * testsuite/libgomp.fortran/pr90030.f90: New test.


[PATCH, Fortran] Fix setting of array lower bound for named arrays

2021-11-29 Thread Chung-Lin Tang

This patch by Tobias, fixes a case of setting array low-bounds, found
for particular uses of SOURCE=/MOLD=.

For example:
program A_M
  implicit none
  real, dimension (:), allocatable :: A, B
  allocate (A(0:5))
  call Init (A)
contains
  subroutine Init ( A )
real, dimension ( 0 : ), intent ( in ) :: A
integer, dimension ( 1 ) :: lb_B

allocate (B, mold = A)
...
lb_B = lbound (B, dim=1)   ! Error: lb_B assigned 1, instead of 0 like 
lower-bound of A.

Referencing the Fortran standard:

"16.9.109 LBOUND (ARRAY [, DIM, KIND])"
states:
"If DIM is present, ARRAY is a whole array, and either ARRAY is
 an assumed-size array of rank DIM or dimension DIM of ARRAY has
 nonzero extent, the result has a value equal to the lower bound
 for subscript DIM of ARRAY. Otherwise, if DIM is present, the
 result value is 1."

And on what is a "whole array":

"9.5.2 Whole arrays"
"A whole array is a named array or a structure component ..."

The attached patch adjusts the relevant part in gfc_trans_allocate() to only set
e3_has_nodescriptor only for non-named arrays.

Tobias has tested this once, and I've tested this patch as well on our complete 
set of
testsuites (which usually serves for OpenMP related stuff). Everything appears 
well with no regressions.

Is this okay for trunk?

Thanks,
Chung-Lin

2021-11-29  Tobias Burnus  

gcc/fortran/ChangeLog:

* trans-stmt.c (gfc_trans_allocate): Set e3_has_nodescriptor to true
only for non-named arrays.

gcc/testsuite/ChangeLog:

* gfortran.dg/allocate_with_source_26.f90: Adjust testcase.
* gfortran.dg/allocate_with_mold_4.f90: New testcase.diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index bdf7957..982e1e0 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -6660,16 +6660,13 @@ gfc_trans_allocate (gfc_code * code)
   else
e3rhs = gfc_copy_expr (code->expr3);
 
-  // We need to propagate the bounds of the expr3 for source=/mold=;
-  // however, for nondescriptor arrays, we use internally a lower bound
-  // of zero instead of one, which needs to be corrected for the allocate 
obj
-  if (e3_is == E3_DESC)
-   {
- symbol_attribute attr = gfc_expr_attr (code->expr3);
- if (code->expr3->expr_type == EXPR_ARRAY ||
- (!attr.allocatable && !attr.pointer))
-   e3_has_nodescriptor = true;
-   }
+  // We need to propagate the bounds of the expr3 for source=/mold=.
+  // However, for non-named arrays, the lbound has to be 1 and neither the
+  // bound used inside the called function even when returning an
+  // allocatable/pointer nor the zero used internally.
+  if (e3_is == E3_DESC
+ && code->expr3->expr_type != EXPR_VARIABLE)
+   e3_has_nodescriptor = true;
 }
 
   /* Loop over all objects to allocate.  */
diff --git a/gcc/testsuite/gfortran.dg/allocate_with_mold_4.f90 
b/gcc/testsuite/gfortran.dg/allocate_with_mold_4.f90
new file mode 100644
index 000..d545fe1
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/allocate_with_mold_4.f90
@@ -0,0 +1,24 @@
+program A_M
+  implicit none
+  real, parameter :: C(5:10) = 5.0
+  real, dimension (:), allocatable :: A, B
+  allocate (A(6))
+  call Init (A)
+contains
+  subroutine Init ( A )
+real, dimension ( -1 : ), intent ( in ) :: A
+integer, dimension ( 1 ) :: lb_B
+
+allocate (B, mold = A)
+if (any (lbound (B) /= lbound (A))) stop 1
+if (any (ubound (B) /= ubound (A))) stop 2
+if (any (shape (B) /= shape (A))) stop 3
+if (size (B) /= size (A)) stop 4
+deallocate (B)
+allocate (B, mold = C)
+if (any (lbound (B) /= lbound (C))) stop 5
+if (any (ubound (B) /= ubound (C))) stop 6
+if (any (shape (B) /= shape (C))) stop 7
+if (size (B) /= size (C)) stop 8
+end
+end 
diff --git a/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90 
b/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90
index 28f24fc..323c8a3 100644
--- a/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90
+++ b/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90
@@ -34,23 +34,23 @@ program p
  if (lbound(p1, 1) /= 3 .or. ubound(p1, 1) /= 4 &
  .or. lbound(p2, 1) /= 3 .or. ubound(p2, 1) /= 4 &
  .or. lbound(p3, 1) /= 1 .or. ubound(p3, 1) /= 2 &
- .or. lbound(p4, 1) /= 7 .or. ubound(p4, 1) /= 8 &
+ .or. lbound(p4, 1) /= 1 .or. ubound(p4, 1) /= 2 &
  .or. p1(3)%i /= 43 .or. p1(4)%i /= 56 &
  .or. p2(3)%i /= 43 .or. p2(4)%i /= 56 &
  .or. p3(1)%i /= 43 .or. p3(2)%i /= 56 &
- .or. p4(7)%i /= 11 .or. p4(8)%i /= 12) then
+ .or. p4(1)%i /= 11 .or. p4(2)%i /= 12) then
call abort()
  endif
 
  !write(*,*) lbound(a,1), ubound(a,1) ! prints 1 3
  !write(*,*) lbound(b,1), ubound(b,1) ! prints 1 3
- !write(*,*) lbound(c,1), ubound(c,1) ! prints 3 5
+ !write(*,*) lbound(c,1), ubound(c,1) ! prints 1 3
  !write(*,*) lbound(d,1), ubound(d,1) ! prints 1 5
  !write(*,*) lbound(e,1), ubound(e,1) ! prints 1 6