[PATCH, OpenMP, Fortran] requires unified_shared_memory 1/2: adjust libgfortran memory allocators
Hi, this patch is to fix the case where 'requires unified_shared_memory' doesn't work due to memory allocator mismatch. Currently this is only for OG12 (devel/omp/gcc-12), but will apply to mainline as well once those requires patches get in. Basically, under 'requires unified_shared_memory' enables the usm_transform pass, which transforms some of the expanded Fortran intrinsic code that uses __builtin_free() into 'omp_free (..., ompx_unified_shared_mem_alloc)'. The intention is to make all dynamic memory allocation use the OpenMP unified_shared_memory allocator, but there is a big gap in this, namely libgfortran. What happens in some tests are that libgfortran allocates stuff using normal malloc(), and the usm_transform generates code that frees the stuff using omp_free(), and chaos ensues. So the proper fix we believe is: to make it possible to move the entire libgfortran on to unified_shared_memory. This first patch is a mostly mechanical patch to change all references of malloc/free/calloc/realloc in libgfortran into xmalloc/xfree/xcalloc/xrealloc in libgfortran/runtime/memory.c, as well as strdup uses into a new internal xstrdup. All of libgfortran is adjusted this way, except libgfortran/caf, which is an independent library outside of libgfortran.so. The second patch of this series will present a way to switch the references of allocators in libgfortran/runtime/memory.c from the normal glibc malloc/free/etc. to omp_alloc/omp_free/etc. when 'requires unified_shared_memory' is detected. Tested on devel/omp/gcc-12. Plans is to commit there soon, but also seeking approval for mainline once the requires stuff goes in. Thanks, Chung-Lin 2022-08-15 Chung-Lin Tang libgfortran/ChangeLog: * m4/matmul_internal.m4: Adjust malloc/free to xmalloc/xfree. * generated/matmul_c10.c: Regenerate. * generated/matmul_c16.c: Likewise. * generated/matmul_c17.c: Likewise. * generated/matmul_c4.c: Likewise. * generated/matmul_c8.c: Likewise. * generated/matmul_i1.c: Likewise. * generated/matmul_i16.c: Likewise. * generated/matmul_i2.c: Likewise. * generated/matmul_i4.c: Likewise. * generated/matmul_i8.c: Likewise. * generated/matmul_r10.c: Likewise. * generated/matmul_r16.c: Likewise. * generated/matmul_r17.c: Likewise. * generated/matmul_r4.c: Likewise. * generated/matmul_r8.c: Likewise. * generated/matmulavx128_c10.c: Likewise. * generated/matmulavx128_c16.c: Likewise. * generated/matmulavx128_c17.c: Likewise. * generated/matmulavx128_c4.c: Likewise. * generated/matmulavx128_c8.c: Likewise. * generated/matmulavx128_i1.c: Likewise. * generated/matmulavx128_i16.c: Likewise. * generated/matmulavx128_i2.c: Likewise. * generated/matmulavx128_i4.c: Likewise. * generated/matmulavx128_i8.c: Likewise. * generated/matmulavx128_r10.c: Likewise. * generated/matmulavx128_r16.c: Likewise. * generated/matmulavx128_r17.c: Likewise. * generated/matmulavx128_r4.c: Likewise. * generated/matmulavx128_r8.c: Likewise. * intrinsics/access.c (access_func): Adjust free to xfree. * intrinsics/chdir.c (chdir_i4_sub): Likewise. (chdir_i8_sub): Likewise. * intrinsics/chmod.c (chmod_func): Likewise. * intrinsics/date_and_time.c (secnds): Likewise. * intrinsics/env.c (PREFIX(getenv)): Likewise. (get_environment_variable_i4): Likewise. * intrinsics/execute_command_line.c (execute_command_line): Likewise. * intrinsics/getcwd.c (getcwd_i4_sub): Likewise. * intrinsics/getlog.c (PREFIX(getlog)): Likewise. * intrinsics/link.c (link_internal): Likewise. * intrinsics/move_alloc.c (move_alloc): Likewise. * intrinsics/perror.c (perror_sub): Likewise. * intrinsics/random.c (constructor_random): Likewise. * intrinsics/rename.c (rename_internal): Likewise. * intrinsics/stat.c (stat_i4_sub_0): Likewise. (stat_i8_sub_0): Likewise. * intrinsics/symlnk.c (symlnk_internal): Likewise. * intrinsics/system.c (system_sub): Likewise. * intrinsics/unlink.c (unlink_i4_sub): Likewise. * io/async.c (update_pdt): Likewise. (async_io): Likewise. (free_async_unit): Likewise. (init_async_unit): Adjust calloc to xcalloc. (enqueue_done_id): Likewise. (enqueue_done): Likewise. (enqueue_close): Likewise. * io/async.h (MUTEX_DEBUG_ADD): Adjust malloc/free to xmalloc/xfree. * io/close.c (st_close): Adjust strdup/free to xstrdup/xfree. * io/fbuf.c (fbuf_destroy): Adjust free to xfree. * io/format.c (free_format_hash_table): Likewise. (save_parsed_format): Likewise. (free_format): Likewise. (free_format_data)
[PATCH, OpenMP, Fortran] requires unified_shared_memory 2/2: insert USM allocators into libgfortran
After the first libgfortran memory allocator preparation patch, this is the actual patch that organizes unified_shared_memory allocation into libgfortran. In the current OpenMP requires implementation, the requires_mask is collected through offload LTO processing, and presented to libgomp when registering offload images through GOMP_offload_register_ver() (called by the mkoffload generated constructor linked into the program binary) This means that the only reliable place to access omp_requires_mask is in GOMP_offload_register_ver, however since it is called through an ELF constructor in the *main program*, this runs later than libgfortran/runtime/main.c:init() constructor, and because some libgfortran init actions there start allocating memory, this can cause more deallocation errors later. Another issue is that CUDA appears to be registering some cleanup actions using atexit(), which forces libgomp to register gomp_target_fini() using atexit as well (to properly run before the underlying CUDA stuff disappears). This happens to us here as well. So to summarize we need to: (1) order libgfortran init actions after omp_requires_mask processing is done, and (2) order libgfortran cleanup actions before gomp_target_fini, to properly deallocate stuff without crashing. The above explanation is for why there's a little new set of definitions, as well as callback registering functions exported from libgomp to libgfortran, basically to register libgfortran init/fini actions into libgomp to run. Inside GOMP_offload_register_ver, after omp_requires_mask processing is done, we call into libgfortran through a new _gfortran_mem_allocators_init function to insert the omp_free/alloc/etc. based allocators into the Fortran runtime, when GOMP_REQUIRES_UNIFIED_SHARED_MEMORY is set. All symbol references between libgfortran/libgomp are defined with weak symbols. Test of the weak symbols are also used to determine if the other library exists in this program. A final issue is: the case where we have an OpenMP program that does NOT have offloading. We cannot passively determine in libgomp/libgfortran whether offloading exists or not, only the main program itself can, by seeing if the hidden __OFFLOAD_TABLE__ exists. When we do init/fini libgomp callback registering for OpenMP programs, those with no offloading will not have those callback properly run (because of no offload image loading) Therefore the solution here is a constructor added into the crtoffloadend.o fragment that does a "null" call of GOMP_offload_register_ver, solely for triggering the post-offload_register callbacks when __OFFLOAD_TABLE__ is NULL. (and because of this, the crtoffloadend.o Makefile rule is adjusted to compile with PIC) I know this is a big pile of yarn wrt how the main program/libgomp/libgfortran interacts, but it's finally working. Again tested without regressions. Preparing to commit to devel/omp/gcc-12, and seeking approval for mainline when the requires patches are in. Thanks, Chung-Lin 2022-08-15 Chung-Lin Tang libgcc/ * Makefile.in (crtoffloadend$(objext)): Add $(PICFLAG) to compile rule. * offloadstuff.c (GOMP_offload_register_ver): Add declaration of weak symbol. (__OFFLOAD_TABLE__): Likewise. (init_non_offload): New function. libgfortran/ * gfortran.map (GFORTRAN_13): New namespace. (_gfortran_mem_allocators_init): New name inside GFORTRAN_13. * libgfortran.h (mem_allocators_init): New exported declaration. * runtime/main.c (do_init): Rename from init, add run-once guard code. (cleanup): Add run-once guard code. (GOMP_post_offload_register_callback): Declare weak symbol. (GOMP_pre_gomp_target_fini_callback): Likewise. (init): New constructor to register offload callbacks, or call do_init when not OpenMP. * runtime/memory.c (gfortran_malloc): New pointer variable. (gfortran_calloc): Likewise. (gfortran_realloc): Likewise. (gfortran_free): Likewise. (mem_allocators_init): New function. (xmalloc): Use gfortran_malloc. (xmallocarray): Use gfortran_malloc. (xcalloc): Use gfortran_calloc. (xrealloc): Use gfortran_realloc. (xfree): Use gfortran_free. libgomp/ * libgomp.map (GOMP_5.1.2): New version namespace. (GOMP_post_offload_register_callback): New name inside GOMP_5.1.2. (GOMP_pre_gomp_target_fini_callback): Likewise. (GOMP_DEFINE_CALLBACK_SET): Macro to define callback set. (post_offload_register): Define callback set for after offload image register. (pre_gomp_target_fini): Define callback set for before gomp_target_fini is called. (libgfortran_malloc_usm): New function. (libgfortran_calloc_usm): Likewise (libgfortran_realloc_usm): Likewise (libgfortran_free_usm): Likewise. (_gfortran
Re: [PATCH, OpenMP, Fortran] requires unified_shared_memory 2/2: insert USM allocators into libgfortran
On 2022/8/15 7:06 PM, Chung-Lin Tang wrote: I know this is a big pile of yarn wrt how the main program/libgomp/libgfortran interacts, but it's finally working. Again tested without regressions. Preparing to commit to devel/omp/gcc-12, and seeking approval for mainline when the requires patches are in. Just realized that I don't have the new testcases added in this patch. Will supplement them later :P Thanks, Chung-Lin
Re: [PATCH, OpenMP, Fortran] requires unified_shared_memory 2/2: insert USM allocators into libgfortran
On 2022/8/15 7:15 PM, Chung-Lin Tang wrote: On 2022/8/15 7:06 PM, Chung-Lin Tang wrote: I know this is a big pile of yarn wrt how the main program/libgomp/libgfortran interacts, but it's finally working. Again tested without regressions. Preparing to commit to devel/omp/gcc-12, and seeking approval for mainline when the requires patches are in. Just realized that I don't have the new testcases added in this patch. Will supplement them later :P Here's the USM allocator/libgfortran patch, with a libgomp.fortran testcase added. Thanks, Chung-Lin 2022-09-05 Chung-Lin Tang libgcc/ * Makefile.in (crtoffloadend$(objext)): Add $(PICFLAG) to compile rule. * offloadstuff.c (GOMP_offload_register_ver): Add declaration of weak symbol. (__OFFLOAD_TABLE__): Likewise. (init_non_offload): New function. libgfortran/ * gfortran.map (GFORTRAN_13): New namespace. (_gfortran_mem_allocators_init): New name inside GFORTRAN_13. * libgfortran.h (mem_allocators_init): New exported declaration. * runtime/main.c (do_init): Rename from init, add run-once guard code. (cleanup): Add run-once guard code. (GOMP_post_offload_register_callback): Declare weak symbol. (GOMP_pre_gomp_target_fini_callback): Likewise. (init): New constructor to register offload callbacks, or call do_init when not OpenMP. * runtime/memory.c (gfortran_malloc): New pointer variable. (gfortran_calloc): Likewise. (gfortran_realloc): Likewise. (gfortran_free): Likewise. (mem_allocators_init): New function. (xmalloc): Use gfortran_malloc. (xmallocarray): Use gfortran_malloc. (xcalloc): Use gfortran_calloc. (xrealloc): Use gfortran_realloc. (xfree): Use gfortran_free. libgomp/ * libgomp.map (GOMP_5.1.2): New version namespace. (GOMP_post_offload_register_callback): New name inside GOMP_5.1.2. (GOMP_pre_gomp_target_fini_callback): Likewise. (GOMP_DEFINE_CALLBACK_SET): Macro to define callback set. (post_offload_register): Define callback set for after offload image register. (pre_gomp_target_fini): Define callback set for before gomp_target_fini is called. (libgfortran_malloc_usm): New function. (libgfortran_calloc_usm): Likewise (libgfortran_realloc_usm): Likewise (libgfortran_free_usm): Likewise. (_gfortran_mem_allocators_init): Declare weak symbol. (gomp_libgfortran_omp_allocators_init): New function. (GOMP_offload_register_ver): Add handling of host_table == NULL, calling into libgfortran to set unified_shared_memory allocators, and execution of post_offload_register callbacks. (gomp_target_init): Register all pre_gomp_target_fini callbacks to run at end of main using atexit(). * testsuite/libgomp.fortran/target-unified_shared_memory-1.f90: New test. diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in index 09b3ec8bc2e..70720cc910c 100644 --- a/libgcc/Makefile.in +++ b/libgcc/Makefile.in @@ -1045,8 +1045,9 @@ crtbeginT$(objext): $(srcdir)/crtstuff.c crtoffloadbegin$(objext): $(srcdir)/offloadstuff.c $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_BEGIN +# crtoffloadend contains a constructor with calls to libgomp, so build as PIC. crtoffloadend$(objext): $(srcdir)/offloadstuff.c - $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END + $(crt_compile) $(CRTSTUFF_T_CFLAGS) $(PICFLAG) -c $< -DCRT_END crtoffloadtable$(objext): $(srcdir)/offloadstuff.c $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_TABLE diff --git a/libgcc/offloadstuff.c b/libgcc/offloadstuff.c index 10e1fe19c8e..2edb6810021 100644 --- a/libgcc/offloadstuff.c +++ b/libgcc/offloadstuff.c @@ -63,6 +63,19 @@ const void *const __offload_vars_end[0] __attribute__ ((__used__, visibility ("hidden"), section (OFFLOAD_VAR_TABLE_SECTION_NAME))) = { }; +extern void GOMP_offload_register_ver (unsigned, const void *, int, + const void *); +extern const void *const __OFFLOAD_TABLE__[0] __attribute__ ((weak)); +static void __attribute__((constructor)) +init_non_offload (void) +{ + /* If an OpenMP program has no offloading, post-offload_register callbacks + that need to run will require a call to GOMP_offload_register_ver, in + order to properly trigger those callbacks during init. */ + if (__OFFLOAD_TABLE__ == NULL) +GOMP_offload_register_ver (0, NULL, 0, NULL); +} + #elif defined CRT_TABLE extern const void *const __offload_func_table[]; diff --git a/libgfortran/gfortran.map b/libgfortran/gfortran.map index e0e795c3d48..55d2a529acd 100644 --- a/libgfortran/gfortran.map +++ b/libgfortran/gfortran.map @@ -1759,3 +1759,8 @@ GFORTRAN_12 { _gfortran_transfer_real128_write; #endif } GFORTRAN_10.2; + +GFORTRAN_13 { + global: + _gfortran_mem_allocators_init; +} GFORTRAN_12; diff --git a/libgfortran/libgfortran.h b/libgfortran/libgfortran.h index 0b893a51851.
[PATCH, OpenACC 2.7] struct/array reductions for Fortran
Hi Tobias, Thomas, this patch adds support for Fortran to use arrays and struct(record) types in OpenACC reductions. There is still some shortcomings in the current state, mainly that only explicit-shaped arrays can be used (like its C counterpart). Anything else is currently a bit more complicated in the middle-end, since the existing reduction code creates an "init-op" (literal of initial values) which can't be done when say TYPE_MAX_VALUE (TYPE_DOMAIN (array_type)) is not a tree constant. I think we'll be on the hook to solve this later, but I think the current state is okay to submit. Tested without regressions on mainline (on top of first struct/array reduction patch[1]) Thanks, Chung-Lin [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641669.html 2024-02-08 Chung-Lin Tang gcc/fortran/ChangeLog: * openmp.cc (oacc_reduction_defined_type_p): New function. (resolve_omp_clauses): Adjust OpenACC array reduction error case. Use oacc_reduction_defined_type_p for OpenACC. * trans-openmp.cc (gfc_trans_omp_array_reduction_or_udr): Add 'bool openacc' parameter, adjust part of function to be !openacc only. (gfc_trans_omp_reduction_list): Add 'bool openacc' parameter, pass to calls to gfc_trans_omp_array_reduction_or_udr. (gfc_trans_omp_clauses): Add 'openacc' argument to calls to gfc_trans_omp_reduction_list. (gfc_trans_omp_do): Pass 'op == EXEC_OACC_LOOP' as 'bool openacc' parameter in call to gfc_trans_omp_clauses. gcc/ChangeLog: * omp-low.cc (omp_reduction_init_op): Add checking if reduced array has constant bounds. (lower_oacc_reductions): Add handling of error_mark_node. gcc/testsuite/ChangeLog: * gfortran.dg/goacc/array-reduction.f90: Adjust testcase. * gfortran.dg/goacc/reduction.f95: Likewise. libgomp/ChangeLog: * libgomp/testsuite/libgomp.oacc-fortran/reduction-9.f90: New testcase. * libgomp/testsuite/libgomp.oacc-fortran/reduction-10.f90: Likewise. * libgomp/testsuite/libgomp.oacc-fortran/reduction-11.f90: Likewise. * libgomp/testsuite/libgomp.oacc-fortran/reduction-12.f90: Likewise. * libgomp/testsuite/libgomp.oacc-fortran/reduction-13.f90: Likewise. diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index 0af80d54fad..4bba9e666d6 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -7047,6 +7047,72 @@ oacc_is_loop (gfc_code *code) || code->op == EXEC_OACC_LOOP; } +static bool +oacc_reduction_defined_type_p (enum gfc_omp_reduction_op rop, gfc_typespec *ts) +{ + if (rop == OMP_REDUCTION_USER || rop == OMP_REDUCTION_NONE) +return false; + + if (ts->type == BT_INTEGER) +switch (rop) + { + case OMP_REDUCTION_AND: + case OMP_REDUCTION_OR: + case OMP_REDUCTION_EQV: + case OMP_REDUCTION_NEQV: + return false; + default: + return true; + } + + if (ts->type == BT_LOGICAL) +switch (rop) + { + case OMP_REDUCTION_AND: + case OMP_REDUCTION_OR: + case OMP_REDUCTION_EQV: + case OMP_REDUCTION_NEQV: + return true; + default: + return false; + } + + if (ts->type == BT_REAL || ts->type == BT_COMPLEX) +switch (rop) + { + case OMP_REDUCTION_PLUS: + case OMP_REDUCTION_TIMES: + case OMP_REDUCTION_MINUS: + return true; + + case OMP_REDUCTION_AND: + case OMP_REDUCTION_OR: + case OMP_REDUCTION_EQV: + case OMP_REDUCTION_NEQV: + return false; + + case OMP_REDUCTION_MAX: + case OMP_REDUCTION_MIN: + return ts->type != BT_COMPLEX; + case OMP_REDUCTION_IAND: + case OMP_REDUCTION_IOR: + case OMP_REDUCTION_IEOR: + return false; + default: + gcc_unreachable (); + } + + if (ts->type == BT_DERIVED) +{ + for (gfc_component *p = ts->u.derived->components; p; p = p->next) + if (!oacc_reduction_defined_type_p (rop, &p->ts)) + return false; + return true; +} + + return false; +} + static void resolve_scalar_int_expr (gfc_expr *expr, const char *clause) { @@ -8137,13 +8203,15 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, else n->sym->mark = 1; - /* OpenACC does not support reductions on arrays. */ - if (n->sym->as) + /* OpenACC current only supports array reductions on explicit-shape +arrays. */ + if ((n->sym->as && n->sym->as->type != AS_EXPLICIT) + || n->sym->attr.codimension) gfc_error ("Array %qs is not permitted in reduction at %L", n->sym->name, &n->where); } } - + for (n = omp_clauses->lists[OMP_LIST_TO]; n; n = n->next)
Re: [PATCH, OpenACC 2.7, v2] readonly modifier support in front-ends
Hi Thomas, Tobias, On 2023/10/26 6:43 PM, Thomas Schwinge wrote: > +++ b/gcc/tree.h > @@ -1813,6 +1813,14 @@ class auto_suppress_location_wrappers > #define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \ > (OMP_CLAUSE_SUBCODE_CHECK (NODE, > OMP_CLAUSE_MAP)->base.addressable_flag) > > +/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */ > +#define OMP_CLAUSE_MAP_READONLY(NODE) \ > + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)) > + > +/* Same as above, for use in OpenACC cache directives. */ > +#define OMP_CLAUSE__CACHE__READONLY(NODE) \ > + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_)) I'm not sure if these special accessor functions are actually useful, or we should just directly use 'TREE_READONLY' instead? We're only using them in contexts where it's clear that the 'OMP_CLAUSE_SUBCODE_CHECK' is satisfied, for example. >>> I find directly using TREE_READONLY confusing. >> >> FWIW, I've changed to use TREE_NOTHROW instead, if it can give a better >> sense of safety :P > > I don't understand that, why not use 'TREE_READONLY'? > >> I think there's a misunderstanding here anyways: we are not relying on a >> DECL marked >> TREE_READONLY here. We merely need the OMP_CLAUSE_MAP to be marked as >> OMP_CLAUSE_MAP_READONLY == 1. > > Yes, I understand that. My question was why we don't just use > 'TREE_READONLY (c)', where 'c' is the > 'OMP_CLAUSE_MAP'/'OMP_CLAUSE__CACHE_' clause (not its decl), and avoid > the indirection through > '#define OMP_CLAUSE_MAP_READONLY'/'#define OMP_CLAUSE__CACHE__READONLY', > given that we're only using them in contexts where it's clear that the > 'OMP_CLAUSE_SUBCODE_CHECK' is satisfied. I don't have a strong > preference, though. After further re-testing using TREE_NOTHROW, I have reverted to using TREE_READONLY, because TREE_NOTHROW clashes with OMP_CLAUSE_RELEASE_DESCRIPTOR (which doesn't use the OMP_CLAUSE_MAP_* naming convention and is not documented in gcc/tree-core.h either, hmmm...) I have added the comment adjustments in gcc/tree-core.h for the new uses of TREE_READONLY/readonly_flag. We basically all use OMP_CLAUSE_SUBCODE_CHECK macros for OpenMP clause expressions exclusively, so I don't see a reason to diverge from that style (even when context is clear). > Either way, you still need to document this: > > | Also, for the new use for OMP clauses, update 'gcc/tree.h:TREE_READONLY', > | and in 'gcc/tree-core.h' for 'readonly_flag' the > | "table lists the uses of each of the above flags". Okay, done as mentioned above. > In addition to a few individual comments above and below, you've also not > yet responded to my requests re test cases. I have greatly expanded the test scan patterns to include parallel/kernels/serial/data/enter data, as well as non-readonly copyin clause together with readonly. Also added simple 'declare' tests, but there is not anything to scan in the 'tree-original' dump though. >> + tree nl = list; >> + bool readonly = false; >> + matching_parens parens; >> + if (parens.require_open (parser)) >> +{ >> + /* Turn on readonly modifier parsing for copyin clause. */ >> + if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN) >> + { >> + c_token *token = c_parser_peek_token (parser); >> + if (token->type == CPP_NAME >> + && !strcmp (IDENTIFIER_POINTER (token->value), "readonly") >> + && c_parser_peek_2nd_token (parser)->type == CPP_COLON) >> + { >> + c_parser_consume_token (parser); >> + c_parser_consume_token (parser); >> + readonly = true; >> + } >> + } >> + location_t loc = c_parser_peek_token (parser)->location; > > I suppose 'loc' here now points to after the opening '(' or after the > 'readonly :'? This is different from what 'c_parser_omp_var_list_parens' > does, and indeed, 'c_parser_omp_variable_list' states that "CLAUSE_LOC is > the location of the clause", not the location of the variable-list? As > this, I suppose, may change diagnostics, please restore the original > behavior. (This appears to be different in the C++ front end, huh.) Thanks for catching this! Fixed. >> --- a/gcc/fortran/openmp.cc >> +++ b/gcc/fortran/openmp.cc >> @@ -1197,7 +1197,7 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : >> omp_mask (m) >> >> static bool >> gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op, >> - bool allow_common, bool allow_derived) >> + bool allow_common, bool allow_derived, bool readonly >> = false) >> { >>gfc_omp_namelist **head = NULL; >>if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, >> true, >> @@ -1206,7 +1206,10 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, >> gfc_omp_map_op map_op, >> { >>gfc_omp_namelist *n; >>for (n = *head; n; n = n->next) >> -
Re: [PATCH 5/5] Mapping of components of references to pointers to structs for OpenMP/OpenACC
Hi Julian, On 2021/5/15 5:27 AM, Julian Brown wrote: GCC currently raises a parse error for indirect accesses to struct members, where the base of the access is a reference to a pointer. This patch fixes that case. gcc/cp/ * semantics.c (finish_omp_clauses): Handle components of references to pointers to structs. libgomp/ * testsuite/libgomp.oacc-c++/deep-copy-17.C: Update test. --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -7670,7 +7670,12 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) if ((ort == C_ORT_ACC || ort == C_ORT_OMP) && TREE_CODE (t) == COMPONENT_REF && TREE_CODE (TREE_OPERAND (t, 0)) == INDIRECT_REF) - t = TREE_OPERAND (TREE_OPERAND (t, 0), 0); + { + t = TREE_OPERAND (TREE_OPERAND (t, 0), 0); + /* References to pointers have a double indirection here. */ + if (TREE_CODE (t) == INDIRECT_REF) + t = TREE_OPERAND (t, 0); + } if (TREE_CODE (t) == COMPONENT_REF && ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP || ort == C_ORT_ACC) There is already a large plethora of such modifications in this patch: "[PATCH, OG10, OpenMP 5.0, committed] Remove array section base-pointer mapping semantics, and other front-end adjustments." https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570075.html I am in the process of taking that patch to mainline, so are you sure this is not already handled there? diff --git a/libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C b/libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C index dacbb520f3d..e038e9e3802 100644 --- a/libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C +++ b/libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C @@ -83,7 +83,7 @@ void strrp (void) a[0] = 8; c[0] = 10; e[0] = 12; - #pragma acc parallel copy(n->a[0:10], n->c[0:10], n->e[0:10]) + #pragma acc parallel copy(n->a[0:10], n->b, n->c[0:10], n->d, n->e[0:10]) { n->a[0] = n->c[0] + n->e[0]; } This testcase can be added. Chung-Lin
Re: [PATCH 7/7] [og10] WIP GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION changes
On 2021/5/11 4:57 PM, Julian Brown wrote: This work-in-progress patch tries to get GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION to behave more like GOMP_MAP_ATTACH_DETACH -- in that the mapping is made to form groups to be processed by build_struct_group/build_struct_comp_map. I think that's important to integrate with how groups of mappings for array sections are handled in other cases. This patch isn't sufficient by itself to fix a couple of broken test cases at present (libgomp.c++/target-lambda-1.C, libgomp.c++/target-this-4.C), though. No, GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION is supposed to be just a slightly different behavior version of GOMP_MAP_ATTACH; it tolerates an unmapped pointer-target and assigns NULL on the device, instead of just gomp_fatal(). (see its handling in libgomp/target.c) In case OpenACC can have the same such zero-length array section behavior, we can just share one GOMP_MAP_ATTACH map. For now it is treated as separate cases. Chung-Lin 2021-05-11 Julian Brown gcc/ * gimplify.c (build_struct_comp_nodes): Add GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION handling. (build_struct_group): Process GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION as part of pointer group. (gimplify_scan_omp_clauses): Update prev_list_p such that GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION will form part of pointer group. --- gcc/gimplify.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 6d204908c82..c5cb486aa23 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -8298,7 +8298,9 @@ build_struct_comp_nodes (enum tree_code code, tree grp_start, tree grp_end, if (grp_mid && OMP_CLAUSE_CODE (grp_mid) == OMP_CLAUSE_MAP && (OMP_CLAUSE_MAP_KIND (grp_mid) == GOMP_MAP_ALWAYS_POINTER - || OMP_CLAUSE_MAP_KIND (grp_mid) == GOMP_MAP_ATTACH_DETACH)) + || OMP_CLAUSE_MAP_KIND (grp_mid) == GOMP_MAP_ATTACH_DETACH + || (OMP_CLAUSE_MAP_KIND (grp_mid) + == GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION))) { tree c3 = build_omp_clause (OMP_CLAUSE_LOCATION (grp_end), OMP_CLAUSE_MAP); @@ -8774,12 +8776,14 @@ build_struct_group (struct gimplify_omp_ctx *ctx, ? splay_tree_lookup (ctx->variables, (splay_tree_key) decl) : NULL); bool ptr = (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ALWAYS_POINTER); - bool attach_detach = (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH); + bool attach_detach = (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH + || (OMP_CLAUSE_MAP_KIND (c) + == GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION)); bool attach = (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH); bool has_attachments = false; /* For OpenACC, pointers in structs should trigger an attach action. */ - if (attach_detach + if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH && ((region_type & (ORT_ACC | ORT_TARGET | ORT_TARGET_DATA)) || code == OMP_TARGET_ENTER_DATA || code == OMP_TARGET_EXIT_DATA)) @@ -9784,6 +9788,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, if (!remove && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_ALWAYS_POINTER && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_ATTACH_DETACH + && (OMP_CLAUSE_MAP_KIND (c) + != GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION) && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET && OMP_CLAUSE_CHAIN (c) && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (c)) == OMP_CLAUSE_MAP @@ -9792,7 +9798,9 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c)) == GOMP_MAP_ATTACH_DETACH) || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c)) - == GOMP_MAP_TO_PSET))) + == GOMP_MAP_TO_PSET) + || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c)) + == GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION))) prev_list_p = list_p; break;
Re: [PATCH 7/7] [og10] WIP GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION changes
On 2021/5/17 10:26 PM, Julian Brown wrote: OK, understood. But, I'm a bit concerned that we're ignoring some "hidden rules" with regards to OMP pointer clause ordering/grouping that certain code (at least the bit that creates GOMP_MAP_STRUCT node groups, and parts of omp-low.c) relies on. I believe those rules are as follows: - an array slice is mapped using two or three pointers -- two for a normal (non-reference) base pointer, and three if we have a reference to a pointer (i.e. in C++) or an array descriptor (i.e. in Fortran). So we can have e.g. GOMP_MAP_TO GOMP_MAP_ALWAYS_POINTER GOMP_MAP_TO GOMP_MAP_.*_POINTER GOMP_MAP_ALWAYS_POINTER GOMP_MAP_TO GOMP_MAP_TO_PSET GOMP_MAP_ALWAYS_POINTER - for OpenACC, we extend this to allow (up to and including gimplify.c) the GOMP_MAP_ATTACH_DETACH mapping. So we can have (for component refs): GOMP_MAP_TO GOMP_MAP_ATTACH_DETACH GOMP_MAP_TO GOMP_MAP_TO_PSET GOMP_MAP_ATTACH_DETACH GOMP_MAP_TO GOMP_MAP_.*_POINTER GOMP_MAP_ATTACH_DETACH For the scanning in insert_struct_comp_map (as it is at present) to work right, these groups must stay intact. I think the current behaviour of omp_target_reorder_clauses on the og10 branch can break those groups apart though! Originally this sorting was intended to enforce OpenMP 5.0 map ordering rules, although I did add some ATTACH_DETACH ordering code in the latest round of patching. May not be the best practice. (The "prev_list_p" stuff in the loop in question in gimplify.c just keeps track of the first node in these groups.) Such a brittle way of doing this; even the variable name is not that obvious in what it intends to do. For OpenACC, the GOMP_MAP_ATTACH_DETACH code does*not* depend on the previous clause when lowering in omp-low.c. But GOMP_MAP_ALWAYS_POINTER does! And in one case ("update" directive), GOMP_MAP_ATTACH_DETACH is rewritten to GOMP_MAP_ALWAYS_POINTER, so for that case at least, the dependency on the preceding mapping node must stay intact. Yes, I think there are some weird conventions here, stemming from the front-ends. I would think that _ALWAYS_POINTER should exist at a similar level like _ATTACH_DETACH, both a pointer operation, just different details in runtime behavior, though its intended purpose for C++ references seem to skew some things here and there. OpenACC also allows "bare" GOMP_MAP_ATTACH and GOMP_MAP_DETACH nodes (corresponding to the "attach" and "detach" clauses). Those are handled a bit differently to GOMP_MAP_ATTACH_DETACH in gimplify.c -- but GOMP_MAP_ATTACH_Z_L_A_S doesn't quite behave like that either, I don't think? IIRC, GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION was handled that way (just a single line in gimplify.c) due to idiosyncrasies with the surrounding generated maps from the C++ front-end (which ATM is the only user of this map-kind). So yeah, inside the compiler, its not entirely the same as GOMP_MAP_ATTACH, but it is intended to live through for the runtime to see. Anyway: I've not entirely understood what omp_target_reorder_clauses is doing, but I think it may need to try harder to keep the groups mentioned above together. What do you think? As you know, attach operations don't really need to be glued to the prior operations, it just has to be ordered after mapping of the pointer and the pointed. There's already some book-keeping to move clauses together, but as you say, it might need more. Overall, I think this re-organizing of the struct-group creation is a good thing, but actually as you probably also observed, this insistence of "in-flight" tree chain manipulation is just hard to work with and modify. Maybe instead of directly working on clause expression chains at this point, we should be stashing all this information into a single clause tree node, e.g. starting from the front-end, we can set 'OMP_CLAUSE_MAP_POINTER_KIND(c) = ALWAYS/ATTACH_DETACH/FIRSTPRIVATE/etc.', (instead of actually creating new, must-follow-in-order maps that's causing all these conventions). For struct-groups, during the start of gimplify_scan_omp_clauses(), we could work with map clause tree nodes with OMP_CLAUSE_MAP_STRUCT_LIST(c), which contains the entire TREE_LIST or VEC of elements. Then later, after scanning is complete, expand the list into the current form. Ordering is only created at this stage. Just an idea, not sure if it will help understandability in general, but it should definitely help to simplify when we're reordering due to other rules. Chung-Lin
[PATCH, OpenMP 5.0] Remove array section base-pointer mapping semantics, and other front-end adjustments (mainline trunk)
Hi Jakub, this is a version of this patch: https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570075.html for mainline trunk. This patch largely implements three pieces of functionality: (1) Per discussion and clarification on the omp-lang mailing list, standards conforming behavior for mapping array sections should *NOT* also map the base-pointer, i.e for this code: struct S { int *ptr; ... }; struct S s; #pragma omp target enter data map(to: s.ptr[:100]) Currently we generate after gimplify: #pragma omp target enter data map(struct:s [len: 1]) map(alloc:s.ptr [len: 8]) \ map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0]) which is deemed incorrect. After this patch, the gimplify results are now adjusted to: #pragma omp target enter data map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0]) (the attach operation is still generated, and if s.ptr is already mapped prior, attachment will happen) The correct way of achieving the base-pointer-also-mapped behavior would be to use: #pragma omp target enter data map(to: s.ptr, s.ptr[:100]) This adjustment in behavior required a number of small adjustments here and there in gimplify, including to accomodate map sequences for C++ references. There is also a small Fortran front-end patch involved (hence CCing Tobias and fortran@). The new gimplify processing changed behavior in handling GOMP_MAP_ALWAYS_POINTER maps such that the libgomp.fortran/struct-elem-map-1.f90 regressed. It appeared that the Fortran FE was generating a GOMP_MAP_ALWAYS_POINTER for array types, which didn't seem quite correct, and the pre-patch behavior was removing this map anyways. I have a small change in trans-openmp.c:gfc_trans_omp_array_section to not generate the map in this case, and so far no bad test results. (2) The second part (though kind of related to the first above) are fixes in libgomp/target.c to not overwrite attached pointers when handling device<->host copies, mainly for the "always" case. This behavior is also noted in the 5.0 spec, but not yet properly coded before. (3) The third is a set of changes to the C/C++ front-ends to extend the allowed component access syntax in map clauses. This is actually mainly an effort to allow SPEC HPC to compile, so despite in the long term the entire map clause syntax parsing is probably going to be revamped, we're still adding this in for now. These changes are enabled for both OpenACC and OpenMP. Tested on x86_64-linux with nvptx offloading with no regressions. This patch was merged and tested atop of the prior submitted patches: (a) https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570886.html "[PATCH, OpenMP 5.0] Improve OpenMP target support for C++ (includes PR92120 v3)" (b) https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570365.html "[PATCH, OpenMP 5.0] Implement relaxation of implicit map vs. existing device mappings (for mainline trunk)" so you might queued this one later than those for review. Thanks, Chung-Lin 2021-05-25 Chung-Lin Tang gcc/c/ChangeLog: * c-parser.c (struct omp_dim): New struct type for use inside c_parser_omp_variable_list. (c_parser_omp_variable_list): Allow multiple levels of array and component accesses in array section base-pointer expression. (c_parser_omp_clause_to): Set 'allow_deref' to true in call to c_parser_omp_var_list_parens. (c_parser_omp_clause_from): Likewise. * c-typeck.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (c_finish_omp_clauses): Extend allowed ranged of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/cp/ChangeLog: * parser.c (struct omp_dim): New struct type for use inside cp_parser_omp_var_list_no_open. (cp_parser_omp_var_list_no_open): Allow multiple levels of array and component accesses in array section base-pointer expression. (cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to cp_parser_omp_var_list for to/from clauses. * semantics.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (handle_omp_array_sections): Adjust pointer map generation of references. (finish_omp_clauses): Extend allowed ranged of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/fortran/ChangeLog: * trans-openmp.c (gfc_trans_omp_array_section): Do not generate GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type. gcc/ChangeLog: * gimplify.c (extract_base_bit_offset): Add 'tree *offsetp' parameter, accomodate case where 'offset' return of get_inner_r
[PATCH, OpenMP, Fortran] Support in_reduction for Fortran
Hi Jakub, and Fortran folks, this patch does the required adjustments to let 'in_reduction' work for Fortran. Not just for the target directive actually, task directive is also working after this patch. There is a little bit of adjustment in omp-low.c:scan_sharing_clauses: RTL expand of the copy of the OMP_CLAUSE_IN_REDUCTION decl was failing for Fortran by-reference arguments, which seems to work after placing them under the outer ctx (when it exists). This also now needs checking the field_map for existence of the field before inserting. Tested without regressions on mainline trunk, is this okay? (testing for devel/omp/gcc-11 is in progress) Thanks, Chung-Lin 2021-09-17 Chung-Lin Tang gcc/fortran/ChangeLog: * openmp.c (gfc_match_omp_clause_reduction): Add 'openmp_target' default false parameter. Add 'always,tofrom' map for OMP_LIST_IN_REDUCTION case. (gfc_match_omp_clauses): Add 'openmp_target' default false parameter, adjust call to gfc_match_omp_clause_reduction. (match_omp): Adjust call to gfc_match_omp_clauses * trans-openmp.c (gfc_trans_omp_taskgroup): Add call to gfc_match_omp_clause, create and return block. gcc/ChangeLog: * omp-low.c (scan_sharing_clauses): Place in_reduction copy of variable in outer ctx if if exists. Check if non-existent in field_map before installing OMP_CLAUSE_IN_REDUCTION decl. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/reduction4.f90: Adjust omp target in_reduction' scan pattern. libgomp/ChangeLog: * testsuite/libgomp.fortran/target-in-reduction-1.f90: New test. diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c index a64b7f5aa10..8179b5aa8bc 100644 --- a/gcc/fortran/openmp.c +++ b/gcc/fortran/openmp.c @@ -1138,7 +1138,7 @@ failed: static match gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses *c, bool openacc, - bool allow_derived) + bool allow_derived, bool openmp_target = false) { if (pc == 'r' && gfc_match ("reduction ( ") != MATCH_YES) return MATCH_NO; @@ -1285,6 +1285,19 @@ gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses *c, bool openacc, n->u2.udr = gfc_get_omp_namelist_udr (); n->u2.udr->udr = udr; } + if (openmp_target && list_idx == OMP_LIST_IN_REDUCTION) + { + gfc_omp_namelist *p = gfc_get_omp_namelist (), **tl; + p->sym = n->sym; + p->where = p->where; + p->u.map_op = OMP_MAP_ALWAYS_TOFROM; + + tl = &c->lists[OMP_LIST_MAP]; + while (*tl) + tl = &((*tl)->next); + *tl = p; + p->next = NULL; + } } return MATCH_YES; } @@ -1353,7 +1366,7 @@ gfc_match_dupl_atomic (bool not_dupl, const char *name) static match gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, bool first = true, bool needs_space = true, - bool openacc = false) + bool openacc = false, bool openmp_target = false) { bool error = false; gfc_omp_clauses *c = gfc_get_omp_clauses (); @@ -2057,8 +2070,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, goto error; } if ((mask & OMP_CLAUSE_IN_REDUCTION) - && gfc_match_omp_clause_reduction (pc, c, openacc, -allow_derived) == MATCH_YES) + && gfc_match_omp_clause_reduction (pc, c, openacc, allow_derived, +openmp_target) == MATCH_YES) continue; if ((mask & OMP_CLAUSE_INBRANCH) && (m = gfc_match_dupl_check (!c->inbranch && !c->notinbranch, @@ -3496,7 +3509,8 @@ static match match_omp (gfc_exec_op op, const omp_mask mask) { gfc_omp_clauses *c; - if (gfc_match_omp_clauses (&c, mask) != MATCH_YES) + if (gfc_match_omp_clauses (&c, mask, true, true, false, +(op == EXEC_OMP_TARGET)) != MATCH_YES) return MATCH_ERROR; new_st.op = op; new_st.ext.omp_clauses = c; diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c index e55e0c81868..08483951066 100644 --- a/gcc/fortran/trans-openmp.c +++ b/gcc/fortran/trans-openmp.c @@ -6391,12 +6391,17 @@ gfc_trans_omp_task (gfc_code *code) static tree gfc_trans_omp_taskgroup (gfc_code *code) { + stmtblock_t block; + gfc_start_block (&block); tree body = gfc_trans_code (code->block->next); tree stmt = make_node (OMP_TASKGROUP); TREE_TYPE (stmt) = void_type_node; OMP_TASKGROUP_BODY (stmt) = body; - OMP_TASKGROUP_CLAUSES (stmt) = NULL_TREE; - return stmt; + OMP_TASKGROUP_CLAUSES (stmt) = gfc_trans_omp_clauses (&
[PATCH, OpenMP 5.1, Fortran] Strictly-structured block support for OpenMP directives
Hi all, this patch add support for "strictly-structured blocks" introduced in OpenMP 5.1, basically allowing BLOCK constructs to serve as the body for directives: !$omp target block ... end block [!$omp end target] !! end directive is optional !$omp parallel block ... end block ... !$omp end parallel !! error, considered as not match to above parallel directive The parsing loop in parse_omp_structured_block() has been modified to allow a BLOCK construct after the first statement has been detected to be ST_BLOCK. This is done by a hard modification of the state into (the new) COMP_OMP_STRICTLY_STRUCTURED_BLOCK after the statement is known (I'm not sure if there's a way to 'peek' the next statement/token in the Fortran FE, open to suggestions on how to better write this) Tested with no regressions on trunk, is this okay to commit? Thanks, Chung-Lin 2021-10-07 Chung-Lin Tang gcc/fortran/ChangeLog: * decl.c (gfc_match_end): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK case together with COMP_BLOCK. * parse.c (parse_omp_structured_block): Adjust declaration, add 'bool strictly_structured_block' default true parameter, add handling for strictly-structured block case, adjust recursive calls to parse_omp_structured_block. (parse_executable): Adjust calls to parse_omp_structured_block. * parse.h (enum gfc_compile_state): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK. * trans-openmp.c (gfc_trans_omp_workshare): Add EXEC_BLOCK case handling. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/strictly-structured-block-1.f90: New test. diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c index b3c65b7175b..ff66d1f9475 100644 --- a/gcc/fortran/decl.c +++ b/gcc/fortran/decl.c @@ -8445,6 +8445,7 @@ gfc_match_end (gfc_statement *st) break; case COMP_BLOCK: +case COMP_OMP_STRICTLY_STRUCTURED_BLOCK: *st = ST_END_BLOCK; target = " block"; eos_ok = 0; diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c index 7d765a0866d..d78bf9b8fa5 100644 --- a/gcc/fortran/parse.c +++ b/gcc/fortran/parse.c @@ -5451,8 +5451,9 @@ parse_oacc_loop (gfc_statement acc_st) /* Parse the statements of an OpenMP structured block. */ -static void -parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only) +static gfc_statement +parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only, + bool strictly_structured_block = true) { gfc_statement st, omp_end_st; gfc_code *cp, *np; @@ -5538,6 +5539,32 @@ parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only) gcc_unreachable (); } + bool block_construct = false; + gfc_namespace* my_ns = NULL; + gfc_namespace* my_parent = NULL; + + st = next_statement (); + + if (strictly_structured_block && st == ST_BLOCK) +{ + /* Adjust state to a strictly-structured block, now that we found that +the body starts with a BLOCK construct. */ + s.state = COMP_OMP_STRICTLY_STRUCTURED_BLOCK; + + block_construct = true; + gfc_notify_std (GFC_STD_F2008, "BLOCK construct at %C"); + + my_ns = gfc_build_block_ns (gfc_current_ns); + gfc_current_ns = my_ns; + my_parent = my_ns->parent; + + new_st.op = EXEC_BLOCK; + new_st.ext.block.ns = my_ns; + new_st.ext.block.assoc = NULL; + accept_statement (ST_BLOCK); + st = parse_spec (ST_NONE); +} + do { if (workshare_stmts_only) @@ -5554,7 +5581,6 @@ parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only) restrictions apply recursively. */ bool cycle = true; - st = next_statement (); for (;;) { switch (st) @@ -5576,17 +5602,20 @@ parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only) parse_forall_block (); break; + case ST_OMP_PARALLEL_SECTIONS: + st = parse_omp_structured_block (st, false, false); + continue; + case ST_OMP_PARALLEL: case ST_OMP_PARALLEL_MASKED: case ST_OMP_PARALLEL_MASTER: - case ST_OMP_PARALLEL_SECTIONS: - parse_omp_structured_block (st, false); - break; + st = parse_omp_structured_block (st, false); + continue; case ST_OMP_PARALLEL_WORKSHARE: case ST_OMP_CRITICAL: - parse_omp_structured_block (st, true); - break; + st = parse_omp_structured_block (st, true); + continue; case ST_OMP_PARALLEL_DO: case ST_OMP_PARALLEL_DO_SIMD: @@ -5609,7 +5638,7 @@ parse_omp_structured_block (gfc_statement omp_st, boo
Re: [PATCH, OpenMP 5.1, Fortran] Strictly-structured block support for OpenMP directives
On 2021/10/14 7:19 PM, Jakub Jelinek wrote: On Thu, Oct 14, 2021 at 12:20:51PM +0200, Jakub Jelinek via Gcc-patches wrote: Thinking more about the Fortran case for !$omp sections, there is an ambiguity. !$omp sections block !$omp section end block is clear and !$omp end sections is optional, but !$omp sections block end block is ambiguous during parsing, it could be either followed by !$omp section and then the BLOCK would be first section, or by !$omp end sections and then it would be clearly the whole sections, with first section being empty inside of the block, or if it is followed by something else, it is ambiguous whether the block ... end block is part of the first section, followed by something and then we should be looking later for either !$omp section or !$omp end section to prove that, or if !$omp sections block end block was the whole sections construct and we shouldn't await anything further. I'm afraid back to the drawing board. And I have to correct myself, there is no ambiguity in 5.2 here, the important fact is hidden in sections/parallel sections being block-associated constructs. That means the body of the whole construct has to be a structured-block, and by the 5.1+ definition of Fortran structured block, it is either block ... end block or something that doesn't start with block. So, !$omp sections block end block a = 1 is only ambiguous in whether it is actually !$omp sections block !$omp section end block a = 1 or !$omp sections !$omp section block end block !$omp end sections a = 1 but both actually do the same thing, work roughly as !$omp single. If one wants block statement as first in structured-block-sequence of the first section, followed by either some further statements or by other sections, then one needs to write !$omp sections !$omp section block end block a = 1 ... !$omp end sections or !$omp sections block block end block a = 1 ... end block Your patch probably already handles it that way, but we again need testsuite coverage to prove it is handled the way it should in all these cases (and that we diagnose what is invalid). The patch currently does not allow strictly-structured BLOCK for sections/parallel sections, since I was referencing the 5.1 spec while writing it, although that is trivially fixable. (was sensing a bit odd why those two constructs had to be specially treated in 5.1 anyways) The bigger issue is that under the current way the patch is written, the statements inside a [parallel] sections construct are parsed automatically by parse_executable(), so to enforce the specified meaning of "structured-block-sequence" (i.e. BLOCK or non-BLOCK starting sequence of stmts) will probably be more a bit harder to implement: !$omp sections block !$omp section block x=0 end block x=1 !! This is allowed now, though should be wrong spec-wise !$omp section x=2 end block Currently "$!omp section" acts essentially as a top-level separator within a sections-construct, rather than a structured directive. Though I would kind of argue this is actually better to use for the user (why prohibit what looks like very apparent meaning of the program?) So Jakub, my question for this is, is this current state okay? Or must we implement the spec pedantically? As for the other issues: (1) BLOCK/END BLOCK is not generally handled in parse_omp_structured_block, so for workshare, it is only handled for the top-level construct, not within workshare. I think this is what you meant in the last mail. (2) As for the dangling-!$omp_end issue Tobias raised, because we are basically using 1-statement lookahead, any "!$omp end <*>" is naturally bound with the adjacent BLOCK/END BLOCK, so we should be okay there. Thanks, Chung-Lin
[PATCH, v2, OpenMP, Fortran] Support in_reduction for Fortran
t have any more evidence this is needed, so removed now. --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/target-in-reduction-1.f90 @@ -0,0 +1,33 @@ +! { dg-do run } + +subroutine foo (x, y) ... + if (x .ne. 11) stop 1 + if (y .ne. 21) stop 2 + +end program main Again, something that can be dealt incrementally, but the testsuite coverage of https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573600.html was larger than this. Would be nice e.g. to cover both scalar vars and array sections/arrays, parameters passed by reference as in the above testcase, but also something that isn't a reference (either a local variable or dummy parameter with VALUE, etc. Jakub I have expanded target-in-reduction-1.f90 to cover local variables and VALUE passed parameters. Array sections in reductions appear to be still not supported by the Fortran FE in general (Tobias plans to work on that later). I also added another target-in-reduction-2.f90 testcase that tests the "orphaned" case in Fortran, where the task/target-in_reduction is in another separate subroutine. Tested without regressions on trunk, is this okay to commit? Thanks, Chung-Lin 2021-10-19 Chung-Lin Tang gcc/fortran/ChangeLog: * openmp.c (gfc_match_omp_clause_reduction): Add 'openmp_target' default false parameter. Add 'always,tofrom' map for OMP_LIST_IN_REDUCTION case. (gfc_match_omp_clauses): Add 'openmp_target' default false parameter, adjust call to gfc_match_omp_clause_reduction. (match_omp): Adjust call to gfc_match_omp_clauses * trans-openmp.c (gfc_trans_omp_taskgroup): Add call to gfc_match_omp_clause, create and return block. gcc/ChangeLog: * omp-low.c (omp_copy_decl_2): For !ctx, use record_vars to add new copy as local variable. (scan_sharing_clauses): Place copy of OMP_CLAUSE_IN_REDUCTION decl in ctx->outer instead of ctx. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/reduction4.f90: Adjust omp target in_reduction' scan pattern. libgomp/ChangeLog: * testsuite/libgomp.fortran/target-in-reduction-1.f90: New test. * testsuite/libgomp.fortran/target-in-reduction-2.f90: New test.diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c index 6a4ca2868f8..210fb06dbec 100644 --- a/gcc/fortran/openmp.c +++ b/gcc/fortran/openmp.c @@ -1138,7 +1138,7 @@ failed: static match gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses *c, bool openacc, - bool allow_derived) + bool allow_derived, bool openmp_target = false) { if (pc == 'r' && gfc_match ("reduction ( ") != MATCH_YES) return MATCH_NO; @@ -1285,6 +1285,19 @@ gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses *c, bool openacc, n->u2.udr = gfc_get_omp_namelist_udr (); n->u2.udr->udr = udr; } + if (openmp_target && list_idx == OMP_LIST_IN_REDUCTION) + { + gfc_omp_namelist *p = gfc_get_omp_namelist (), **tl; + p->sym = n->sym; + p->where = p->where; + p->u.map_op = OMP_MAP_ALWAYS_TOFROM; + + tl = &c->lists[OMP_LIST_MAP]; + while (*tl) + tl = &((*tl)->next); + *tl = p; + p->next = NULL; + } } return MATCH_YES; } @@ -1353,7 +1366,7 @@ gfc_match_dupl_atomic (bool not_dupl, const char *name) static match gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, bool first = true, bool needs_space = true, - bool openacc = false) + bool openacc = false, bool openmp_target = false) { bool error = false; gfc_omp_clauses *c = gfc_get_omp_clauses (); @@ -2057,8 +2070,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, goto error; } if ((mask & OMP_CLAUSE_IN_REDUCTION) - && gfc_match_omp_clause_reduction (pc, c, openacc, -allow_derived) == MATCH_YES) + && gfc_match_omp_clause_reduction (pc, c, openacc, allow_derived, +openmp_target) == MATCH_YES) continue; if ((mask & OMP_CLAUSE_INBRANCH) && (m = gfc_match_dupl_check (!c->inbranch && !c->notinbranch, @@ -3512,7 +3525,8 @@ static match match_omp (gfc_exec_op op, const omp_mask mask) { gfc_omp_clauses *c; - if (gfc_match_omp_clauses (&c, mask) != MATCH_YES) + if (gfc_match_omp_clauses (&c, mask, true, true, false, +op == EXEC_OMP_TARGET) != MATCH_YES) return MATCH_ERROR; new_st.op = op; new_st.ext.omp_clauses = c; diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-ope
[PATCH, v2, OpenMP 5.2, Fortran] Strictly-structured block support for OpenMP directives
Hi Jakub, this version adjusts the patch to let sections/parallel sections also use strictly-structured blocks, making it more towards 5.2. Because of this change, some of the testcases using the sections-construct need a bit of adjustment too, since "block; end block" at the start of the construct now means something different than before. There are now three new testcases, with the non-dg-error/dg-error cases separated, and a third testcase containing a few cases listed in prior emails. I hope this is enough. The implementation status entry in libgomp/libgomp.texi for strictly-structured blocks has also been changed to "Y" in this patch. Tested without regressions, is this now okay for trunk? Thanks, Chung-Lin 2021-10-20 Chung-Lin Tang gcc/fortran/ChangeLog: * decl.c (gfc_match_end): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK case together with COMP_BLOCK. * parse.c (parse_omp_structured_block): Change return type to 'gfc_statement', add handling for strictly-structured block case, adjust recursive calls to parse_omp_structured_block. (parse_executable): Adjust calls to parse_omp_structured_block. * parse.h (enum gfc_compile_state): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK. * trans-openmp.c (gfc_trans_omp_workshare): Add EXEC_BLOCK case handling. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/cancel-1.f90: Adjust testcase. * gfortran.dg/gomp/nesting-3.f90: Adjust testcase. * gfortran.dg/gomp/strictly-structured-block-1.f90: New test. * gfortran.dg/gomp/strictly-structured-block-2.f90: New test. * gfortran.dg/gomp/strictly-structured-block-3.f90: New test. libgomp/ChangeLog: * libgomp.texi (Support of strictly structured blocks in Fortran): Adjust to 'Y'. * testsuite/libgomp.fortran/task-reduction-16.f90: Adjust testcase. diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c index d6a22d13451..66489da12be 100644 --- a/gcc/fortran/decl.c +++ b/gcc/fortran/decl.c @@ -8449,6 +8449,7 @@ gfc_match_end (gfc_statement *st) break; case COMP_BLOCK: +case COMP_OMP_STRICTLY_STRUCTURED_BLOCK: *st = ST_END_BLOCK; target = " block"; eos_ok = 0; diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c index 7d765a0866d..2fb98844356 100644 --- a/gcc/fortran/parse.c +++ b/gcc/fortran/parse.c @@ -5451,7 +5451,7 @@ parse_oacc_loop (gfc_statement acc_st) /* Parse the statements of an OpenMP structured block. */ -static void +static gfc_statement parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only) { gfc_statement st, omp_end_st; @@ -5538,6 +5538,32 @@ parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only) gcc_unreachable (); } + bool block_construct = false; + gfc_namespace *my_ns = NULL; + gfc_namespace *my_parent = NULL; + + st = next_statement (); + + if (st == ST_BLOCK) +{ + /* Adjust state to a strictly-structured block, now that we found that +the body starts with a BLOCK construct. */ + s.state = COMP_OMP_STRICTLY_STRUCTURED_BLOCK; + + block_construct = true; + gfc_notify_std (GFC_STD_F2008, "BLOCK construct at %C"); + + my_ns = gfc_build_block_ns (gfc_current_ns); + gfc_current_ns = my_ns; + my_parent = my_ns->parent; + + new_st.op = EXEC_BLOCK; + new_st.ext.block.ns = my_ns; + new_st.ext.block.assoc = NULL; + accept_statement (ST_BLOCK); + st = parse_spec (ST_NONE); +} + do { if (workshare_stmts_only) @@ -5554,7 +5580,6 @@ parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only) restrictions apply recursively. */ bool cycle = true; - st = next_statement (); for (;;) { switch (st) @@ -5580,13 +5605,13 @@ parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only) case ST_OMP_PARALLEL_MASKED: case ST_OMP_PARALLEL_MASTER: case ST_OMP_PARALLEL_SECTIONS: - parse_omp_structured_block (st, false); - break; + st = parse_omp_structured_block (st, false); + continue; case ST_OMP_PARALLEL_WORKSHARE: case ST_OMP_CRITICAL: - parse_omp_structured_block (st, true); - break; + st = parse_omp_structured_block (st, true); + continue; case ST_OMP_PARALLEL_DO: case ST_OMP_PARALLEL_DO_SIMD: @@ -5609,7 +5634,7 @@ parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only) } } else - st = parse_executable (ST_NONE); + st = parse_executable (st); if (st == ST_NONE) unexpected_eof (); else if (st == S
Re: [PATCH, v2, OpenMP 5.2, Fortran] Strictly-structured block support for OpenMP directives
On 2021/10/21 12:15 AM, Jakub Jelinek wrote: +program main + integer :: x, i, n + + !$omp parallel + block +x = x + 1 + end block I'd prefer not to use those x = j or x = x + 1 etc. as statements that do random work here whenever possible. While those are dg-do compile testcases, especially if it is without dg-errors I think it is preferrable not to show bad coding examples. E.g. the x = x + 1 above is wrong for 2 reasons, x is uninitialized before the parallel, and there is a data race, the threads, teams etc. can write to x concurrently. I think better would be to use something like call do_work which doesn't have to be defined anywhere and will just stand there as a black box for unspecified work. + !$omp workshare + block +x = x + 1 + end block There are exceptions though, e.g. workshare is such a case, because e.g. call do_work is not valid in workshare. So, it is ok to keep using x = x + 1 here if you initialize it first at the start of the program. + !$omp workshare + block +x = 1 +!$omp critical +block + x = 3 +end block + end block And then there are cases like the above, please just use different variables there (all initialized) or say an array and access different elements in the different spots. Jakub Thanks, attached is what I finally committed. Chung-Lin From 2e4659199e814b7ee0f6bd925fd2c0a7610da856 Mon Sep 17 00:00:00 2001 From: Chung-Lin Tang Date: Thu, 21 Oct 2021 14:56:20 +0800 Subject: [PATCH] openmp: Fortran strictly-structured blocks support This implements strictly-structured blocks support for Fortran, as specified in OpenMP 5.2. This now allows using a Fortran BLOCK construct as the body of most OpenMP constructs, with a "!$omp end ..." ending directive optional for that form. gcc/fortran/ChangeLog: * decl.c (gfc_match_end): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK case together with COMP_BLOCK. * parse.c (parse_omp_structured_block): Change return type to 'gfc_statement', add handling for strictly-structured block case, adjust recursive calls to parse_omp_structured_block. (parse_executable): Adjust calls to parse_omp_structured_block. * parse.h (enum gfc_compile_state): Add COMP_OMP_STRICTLY_STRUCTURED_BLOCK. * trans-openmp.c (gfc_trans_omp_workshare): Add EXEC_BLOCK case handling. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/cancel-1.f90: Adjust testcase. * gfortran.dg/gomp/nesting-3.f90: Adjust testcase. * gfortran.dg/gomp/strictly-structured-block-1.f90: New test. * gfortran.dg/gomp/strictly-structured-block-2.f90: New test. * gfortran.dg/gomp/strictly-structured-block-3.f90: New test. libgomp/ChangeLog: * libgomp.texi (Support of strictly structured blocks in Fortran): Adjust to 'Y'. * testsuite/libgomp.fortran/task-reduction-16.f90: Adjust testcase. --- gcc/fortran/decl.c| 1 + gcc/fortran/parse.c | 69 +- gcc/fortran/parse.h | 2 +- gcc/fortran/trans-openmp.c| 6 +- gcc/testsuite/gfortran.dg/gomp/cancel-1.f90 | 3 + gcc/testsuite/gfortran.dg/gomp/nesting-3.f90 | 20 +- .../gomp/strictly-structured-block-1.f90 | 214 ++ .../gomp/strictly-structured-block-2.f90 | 139 .../gomp/strictly-structured-block-3.f90 | 52 + libgomp/libgomp.texi | 2 +- .../libgomp.fortran/task-reduction-16.f90 | 1 + 11 files changed, 484 insertions(+), 25 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/gomp/strictly-structured-block-1.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/strictly-structured-block-2.f90 create mode 100644 gcc/testsuite/gfortran.dg/gomp/strictly-structured-block-3.f90 diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c index 6784b07ae9e..6043e100fbb 100644 --- a/gcc/fortran/decl.c +++ b/gcc/fortran/decl.c @@ -8429,6 +8429,7 @@ gfc_match_end (gfc_statement *st) break; case COMP_BLOCK: +case COMP_OMP_STRICTLY_STRUCTURED_BLOCK: *st = ST_END_BLOCK; target = " block"; eos_ok = 0; diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c index 2a454be79b0..b1e73ee6801 100644 --- a/gcc/fortran/parse.c +++ b/gcc/fortran/parse.c @@ -5459,7 +5459,7 @@ parse_oacc_loop (gfc_statement acc_st) /* Parse the statements of an OpenMP structured block. */ -static void +static gfc_statement parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only) { gfc_statement st, omp_end_st; @@ -5546,6 +5546,32 @@ parse_omp_structured_block (gfc_statement omp_st, bool workshare_stmts_only) gcc_unreachable (); } + bool block_construct = false; + gfc_namespace *my_ns = NULL; + gfc_namespace *my_parent = NULL; + + st = next_statement ()
[PATCH, PR90030] Fortran OpenMP/OpenACC array mapping alignment fix
Hi Jakub, As Thomas reported and submitted a patch a while ago: https://gcc.gnu.org/pipermail/gcc-patches/2019-April/519932.html https://gcc.gnu.org/pipermail/gcc-patches/2019-May/522738.html There's an issue with the Fortran front-end when mapping arrays: when creating the data MEM_REF for the map clause, there's a convention of casting the referencing pointer to 'c_char *' by fold_convert (build_pointer_type (char_type_node), ptr). This causes the alignment passed to the libgomp runtime for array data hardwared to '1', and causes alignment errors on the offload target (not always showing up, but can trigger due to slight change of clause ordering) This patch is not exactly Thomas' patch from 2019, but does the same thing. The new libgomp tests are directly reused though. A lot of scan test adjustment is also included in this patch. Patch has been tested for no regressions for gfortran and libgomp, is this okay for trunk? Thanks, Chung-Lin Fortran: fix array alignment for OpenMP/OpenACC target mapping clauses [PR90030] The Fortran front-end is creating maps of array data with a type of pointer to char_type_node, which when eventually passed to libgomp during runtime, marks the passed array with an alignment of 1, which can cause mapping alignment errors on the offload target. This patch removes the related fold_convert(build_pointer_type (char_type_node)) calls in fortran/trans-openmp.c, and adds gcc_asserts to ensure pointer type. 2021-11-04 Chung-Lin Tang Thomas Schwinge PR fortran/90030 gcc/fortran/ChangeLog: * trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer to char_type_node, add gcc_assert of POINTER_TYPE_P. (gfc_trans_omp_array_section): Likewise. (gfc_trans_omp_clauses): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/goacc/finalize-1.f: Adjust scan test. * gfortran.dg/gomp/affinity-clause-1.f90: Likewise. * gfortran.dg/gomp/affinity-clause-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-4.f90: Likewise. * gfortran.dg/gomp/defaultmap-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-6.f90: Likewise. * gfortran.dg/gomp/map-3.f90: Likewise. * gfortran.dg/gomp/pr78260-2.f90: Likewise. * gfortran.dg/gomp/pr78260-3.f90: Likewise. libgomp/ChangeLog: * testsuite/libgomp.oacc-fortran/pr90030.f90: New test. * testsuite/libgomp.fortran/pr90030.f90: New test.diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c index e81c558..0ff90b7 100644 --- a/gcc/fortran/trans-openmp.c +++ b/gcc/fortran/trans-openmp.c @@ -1564,7 +1564,7 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p, bool openacc) if (present) ptr = gfc_build_cond_assign_expr (&block, present, ptr, null_pointer_node); - ptr = fold_convert (build_pointer_type (char_type_node), ptr); + gcc_assert (POINTER_TYPE_P (TREE_TYPE (ptr))); ptr = build_fold_indirect_ref (ptr); OMP_CLAUSE_DECL (c) = ptr; c2 = build_omp_clause (input_location, OMP_CLAUSE_MAP); @@ -2381,7 +2381,7 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_omp_namelist *n, OMP_CLAUSE_SIZE (node), elemsz); } gcc_assert (se.post.head == NULL_TREE); - ptr = fold_convert (build_pointer_type (char_type_node), ptr); + gcc_assert (POINTER_TYPE_P (TREE_TYPE (ptr))); OMP_CLAUSE_DECL (node) = build_fold_indirect_ref (ptr); ptr = fold_convert (ptrdiff_type_node, ptr); @@ -2849,8 +2849,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl))) { decl = gfc_conv_descriptor_data_get (decl); - decl = fold_convert (build_pointer_type (char_type_node), - decl); + gcc_assert (POINTER_TYPE_P (TREE_TYPE (decl))); decl = build_fold_indirect_ref (decl); } else if (DECL_P (decl)) @@ -2873,8 +2872,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, } gfc_add_block_to_block (&iter_block, &se.pre); gfc_add_block_to_block (&iter_block, &se.post); - ptr = fold_convert (build_pointer_type (char_type_node), - ptr); + gcc_assert (POINTER_TYPE_P (TREE_TYPE (ptr))); OMP_CLAUSE_DECL (node) = build_fold_indirect_ref (ptr); } if (list == OMP_LIST_DEPEND) @@ -3117,8 +3115,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, if (present)
[PATCH, v2, OpenMP 5.0] Remove array section base-pointer mapping semantics, and other front-end adjustments (mainline trunk)
Hi Jakub, attached is a rebased version of this "OpenMP fixes/adjustments" patch. This version removes some of the (ort == C_ORT_OMP || ort == C_ORT_ACC) stuff that's not needed in handle_omp_array_sections_1 and [c_]finish_omp_clauses. Note that this is meant to be patched atop of the recent also posted C++ PR92120 v5 patch: https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584602.html Again, tested without regressions (together with the PR92120 patch), awaiting review. Thanks, Chung-Lin (ChangeLog updated below) On 2021/5/25 9:36 PM, Chung-Lin Tang wrote: This patch largely implements three pieces of functionality: (1) Per discussion and clarification on the omp-lang mailing list, standards conforming behavior for mapping array sections should *NOT* also map the base-pointer, i.e for this code: struct S { int *ptr; ... }; struct S s; #pragma omp target enter data map(to: s.ptr[:100]) Currently we generate after gimplify: #pragma omp target enter data map(struct:s [len: 1]) map(alloc:s.ptr [len: 8]) \ map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0]) which is deemed incorrect. After this patch, the gimplify results are now adjusted to: #pragma omp target enter data map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0]) (the attach operation is still generated, and if s.ptr is already mapped prior, attachment will happen) The correct way of achieving the base-pointer-also-mapped behavior would be to use: #pragma omp target enter data map(to: s.ptr, s.ptr[:100]) This adjustment in behavior required a number of small adjustments here and there in gimplify, including to accomodate map sequences for C++ references. There is also a small Fortran front-end patch involved (hence CCing Tobias and fortran@). The new gimplify processing changed behavior in handling GOMP_MAP_ALWAYS_POINTER maps such that the libgomp.fortran/struct-elem-map-1.f90 regressed. It appeared that the Fortran FE was generating a GOMP_MAP_ALWAYS_POINTER for array types, which didn't seem quite correct, and the pre-patch behavior was removing this map anyways. I have a small change in trans-openmp.c:gfc_trans_omp_array_section to not generate the map in this case, and so far no bad test results. (2) The second part (though kind of related to the first above) are fixes in libgomp/target.c to not overwrite attached pointers when handling device<->host copies, mainly for the "always" case. This behavior is also noted in the 5.0 spec, but not yet properly coded before. (3) The third is a set of changes to the C/C++ front-ends to extend the allowed component access syntax in map clauses. This is actually mainly an effort to allow SPEC HPC to compile, so despite in the long term the entire map clause syntax parsing is probably going to be revamped, we're still adding this in for now. These changes are enabled for both OpenACC and OpenMP. 2021-11-19 Chung-Lin Tang gcc/c/ChangeLog: * c-parser.c (struct omp_dim): New struct type for use inside c_parser_omp_variable_list. (c_parser_omp_variable_list): Allow multiple levels of array and component accesses in array section base-pointer expression. (c_parser_omp_clause_to): Set 'allow_deref' to true in call to c_parser_omp_var_list_parens. (c_parser_omp_clause_from): Likewise. * c-typeck.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (c_finish_omp_clauses): Extend allowed ranged of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/cp/ChangeLog: * parser.c (struct omp_dim): New struct type for use inside cp_parser_omp_var_list_no_open. (cp_parser_omp_var_list_no_open): Allow multiple levels of array and component accesses in array section base-pointer expression. (cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to cp_parser_omp_var_list for to/from clauses. * semantics.c (handle_omp_array_sections_1): Extend allowed range of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. (handle_omp_array_sections): Adjust pointer map generation of references. (finish_omp_clauses): Extend allowed ranged of expressions involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR. gcc/fortran/ChangeLog: * trans-openmp.c (gfc_trans_omp_array_section): Do not generate GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type. gcc/ChangeLog: * gimplify.c (extract_base_bit_offset): Add 'tree *offsetp' parameter, accomodate case where 'offset' return of get_inner_reference is non-NULL. (is_or_contains_p): Further robustify conditions.
Re: [PATCH, PR90030] Fortran OpenMP/OpenACC array mapping alignment fix
Ping. On 2021/11/4 4:23 PM, Chung-Lin Tang wrote: Hi Jakub, As Thomas reported and submitted a patch a while ago: https://gcc.gnu.org/pipermail/gcc-patches/2019-April/519932.html https://gcc.gnu.org/pipermail/gcc-patches/2019-May/522738.html There's an issue with the Fortran front-end when mapping arrays: when creating the data MEM_REF for the map clause, there's a convention of casting the referencing pointer to 'c_char *' by fold_convert (build_pointer_type (char_type_node), ptr). This causes the alignment passed to the libgomp runtime for array data hardwared to '1', and causes alignment errors on the offload target (not always showing up, but can trigger due to slight change of clause ordering) This patch is not exactly Thomas' patch from 2019, but does the same thing. The new libgomp tests are directly reused though. A lot of scan test adjustment is also included in this patch. Patch has been tested for no regressions for gfortran and libgomp, is this okay for trunk? Thanks, Chung-Lin Fortran: fix array alignment for OpenMP/OpenACC target mapping clauses [PR90030] The Fortran front-end is creating maps of array data with a type of pointer to char_type_node, which when eventually passed to libgomp during runtime, marks the passed array with an alignment of 1, which can cause mapping alignment errors on the offload target. This patch removes the related fold_convert(build_pointer_type (char_type_node)) calls in fortran/trans-openmp.c, and adds gcc_asserts to ensure pointer type. 2021-11-04 Chung-Lin Tang Thomas Schwinge PR fortran/90030 gcc/fortran/ChangeLog: * trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer to char_type_node, add gcc_assert of POINTER_TYPE_P. (gfc_trans_omp_array_section): Likewise. (gfc_trans_omp_clauses): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/goacc/finalize-1.f: Adjust scan test. * gfortran.dg/gomp/affinity-clause-1.f90: Likewise. * gfortran.dg/gomp/affinity-clause-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-4.f90: Likewise. * gfortran.dg/gomp/defaultmap-5.f90: Likewise. * gfortran.dg/gomp/defaultmap-6.f90: Likewise. * gfortran.dg/gomp/map-3.f90: Likewise. * gfortran.dg/gomp/pr78260-2.f90: Likewise. * gfortran.dg/gomp/pr78260-3.f90: Likewise. libgomp/ChangeLog: * testsuite/libgomp.oacc-fortran/pr90030.f90: New test. * testsuite/libgomp.fortran/pr90030.f90: New test.
[PATCH, Fortran] Fix setting of array lower bound for named arrays
This patch by Tobias, fixes a case of setting array low-bounds, found for particular uses of SOURCE=/MOLD=. For example: program A_M implicit none real, dimension (:), allocatable :: A, B allocate (A(0:5)) call Init (A) contains subroutine Init ( A ) real, dimension ( 0 : ), intent ( in ) :: A integer, dimension ( 1 ) :: lb_B allocate (B, mold = A) ... lb_B = lbound (B, dim=1) ! Error: lb_B assigned 1, instead of 0 like lower-bound of A. Referencing the Fortran standard: "16.9.109 LBOUND (ARRAY [, DIM, KIND])" states: "If DIM is present, ARRAY is a whole array, and either ARRAY is an assumed-size array of rank DIM or dimension DIM of ARRAY has nonzero extent, the result has a value equal to the lower bound for subscript DIM of ARRAY. Otherwise, if DIM is present, the result value is 1." And on what is a "whole array": "9.5.2 Whole arrays" "A whole array is a named array or a structure component ..." The attached patch adjusts the relevant part in gfc_trans_allocate() to only set e3_has_nodescriptor only for non-named arrays. Tobias has tested this once, and I've tested this patch as well on our complete set of testsuites (which usually serves for OpenMP related stuff). Everything appears well with no regressions. Is this okay for trunk? Thanks, Chung-Lin 2021-11-29 Tobias Burnus gcc/fortran/ChangeLog: * trans-stmt.c (gfc_trans_allocate): Set e3_has_nodescriptor to true only for non-named arrays. gcc/testsuite/ChangeLog: * gfortran.dg/allocate_with_source_26.f90: Adjust testcase. * gfortran.dg/allocate_with_mold_4.f90: New testcase.diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c index bdf7957..982e1e0 100644 --- a/gcc/fortran/trans-stmt.c +++ b/gcc/fortran/trans-stmt.c @@ -6660,16 +6660,13 @@ gfc_trans_allocate (gfc_code * code) else e3rhs = gfc_copy_expr (code->expr3); - // We need to propagate the bounds of the expr3 for source=/mold=; - // however, for nondescriptor arrays, we use internally a lower bound - // of zero instead of one, which needs to be corrected for the allocate obj - if (e3_is == E3_DESC) - { - symbol_attribute attr = gfc_expr_attr (code->expr3); - if (code->expr3->expr_type == EXPR_ARRAY || - (!attr.allocatable && !attr.pointer)) - e3_has_nodescriptor = true; - } + // We need to propagate the bounds of the expr3 for source=/mold=. + // However, for non-named arrays, the lbound has to be 1 and neither the + // bound used inside the called function even when returning an + // allocatable/pointer nor the zero used internally. + if (e3_is == E3_DESC + && code->expr3->expr_type != EXPR_VARIABLE) + e3_has_nodescriptor = true; } /* Loop over all objects to allocate. */ diff --git a/gcc/testsuite/gfortran.dg/allocate_with_mold_4.f90 b/gcc/testsuite/gfortran.dg/allocate_with_mold_4.f90 new file mode 100644 index 000..d545fe1 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/allocate_with_mold_4.f90 @@ -0,0 +1,24 @@ +program A_M + implicit none + real, parameter :: C(5:10) = 5.0 + real, dimension (:), allocatable :: A, B + allocate (A(6)) + call Init (A) +contains + subroutine Init ( A ) +real, dimension ( -1 : ), intent ( in ) :: A +integer, dimension ( 1 ) :: lb_B + +allocate (B, mold = A) +if (any (lbound (B) /= lbound (A))) stop 1 +if (any (ubound (B) /= ubound (A))) stop 2 +if (any (shape (B) /= shape (A))) stop 3 +if (size (B) /= size (A)) stop 4 +deallocate (B) +allocate (B, mold = C) +if (any (lbound (B) /= lbound (C))) stop 5 +if (any (ubound (B) /= ubound (C))) stop 6 +if (any (shape (B) /= shape (C))) stop 7 +if (size (B) /= size (C)) stop 8 +end +end diff --git a/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90 b/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90 index 28f24fc..323c8a3 100644 --- a/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90 +++ b/gcc/testsuite/gfortran.dg/allocate_with_source_26.f90 @@ -34,23 +34,23 @@ program p if (lbound(p1, 1) /= 3 .or. ubound(p1, 1) /= 4 & .or. lbound(p2, 1) /= 3 .or. ubound(p2, 1) /= 4 & .or. lbound(p3, 1) /= 1 .or. ubound(p3, 1) /= 2 & - .or. lbound(p4, 1) /= 7 .or. ubound(p4, 1) /= 8 & + .or. lbound(p4, 1) /= 1 .or. ubound(p4, 1) /= 2 & .or. p1(3)%i /= 43 .or. p1(4)%i /= 56 & .or. p2(3)%i /= 43 .or. p2(4)%i /= 56 & .or. p3(1)%i /= 43 .or. p3(2)%i /= 56 & - .or. p4(7)%i /= 11 .or. p4(8)%i /= 12) then + .or. p4(1)%i /= 11 .or. p4(2)%i /= 12) then call abort() endif !write(*,*) lbound(a,1), ubound(a,1) ! prints 1 3 !write(*,*) lbound(b,1), ubound(b,1) ! prints 1 3 - !write(*,*) lbound(c,1), ubound(c,1) ! prints 3 5 + !write(*,*) lbound(c,1), ubound(c,1) ! prints 1 3 !write(*,*) lbound(d,1), ubound(d,1) ! prints 1 5 !write(*,*) lbound(e,1), ubound(e,1) ! prints 1 6