[Patch] OpenMP: Fix use_device_{addr,ptr} with in-data-sharing arg
For omp parallel shared(array_desc_var) the shared-variable is passed to the generated function as argument - and replaced by a DECL_VALUE_EXPR inside the parallel region. If inside the parallel region, a omp target data has_device_addr(array_descr_var) is used, the latter generates a omp_arr->array_descr_var = &array_descr_var.data; ... tmp_desc = array_descr_var tmp_desc.data = omp_o->array_descr_var that is: 'tmp_desc' gets assigned the original descriptor and only the data components is updated. However, if that's inside the parallel region, not 'array_descr_var' has to be used – but the value expression ('omp_i->array_descr_var'). Fixed by searching the variable used in use_device_{addr,ptr} in the outer OpenMP context – and then checking for a DECL_VALUE_EXPR. OK? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 OpenMP: Fix use_device_{addr,ptr} with in-data-sharing arg For array-descriptor vars, the descriptor is assigned to a temporary. However, this failed when the clause's argument was in turn in a data-sharing clause as the outer context's VALUE_EXPR wasn't used. gcc/ChangeLog: * omp-low.cc (lower_omp_target): Fix use_device_{addr,ptr} with list item that is in an outer data-sharing clause. libgomp/ChangeLog: * testsuite/libgomp.fortran/use_device_addr-5.f90: New test. gcc/omp-low.cc | 22 ++-- .../libgomp.fortran/use_device_addr-5.f90 | 143 + 2 files changed, 156 insertions(+), 9 deletions(-) diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index bf5779b6543..6e387fd9a61 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -13656,26 +13656,30 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) new_var = lookup_decl (var, ctx); new_var = DECL_VALUE_EXPR (new_var); tree v = new_var; + tree v2 = var; + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_PTR + || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_ADDR) + { + v2 = maybe_lookup_decl_in_outer_ctx (var, ctx); + if (DECL_HAS_VALUE_EXPR_P (v2)) + v2 = DECL_VALUE_EXPR (v2); + } if (is_ref) { - var = build_fold_indirect_ref (var); - gimplify_expr (&var, &assign_body, NULL, is_gimple_val, - fb_rvalue); - v = create_tmp_var_raw (TREE_TYPE (var), get_name (var)); + v2 = build_fold_indirect_ref (v2); + v = create_tmp_var_raw (TREE_TYPE (v2), get_name (var)); gimple_add_tmp_var (v); TREE_ADDRESSABLE (v) = 1; - gimple_seq_add_stmt (&assign_body, - gimple_build_assign (v, var)); + gimplify_assign (v, v2, &assign_body); tree rhs = build_fold_addr_expr (v); gimple_seq_add_stmt (&assign_body, gimple_build_assign (new_var, rhs)); } else - gimple_seq_add_stmt (&assign_body, - gimple_build_assign (new_var, var)); + gimplify_assign (new_var, v2, &assign_body); - tree v2 = lang_hooks.decls.omp_array_data (unshare_expr (v), false); + v2 = lang_hooks.decls.omp_array_data (unshare_expr (v), false); gcc_assert (v2); gimplify_expr (&x, &assign_body, NULL, is_gimple_val, fb_rvalue); gimple_seq_add_stmt (&assign_body, diff --git a/libgomp/testsuite/libgomp.fortran/use_device_addr-5.f90 b/libgomp/testsuite/libgomp.fortran/use_device_addr-5.f90 new file mode 100644 index 000..1def70a1bc0 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/use_device_addr-5.f90 @@ -0,0 +1,143 @@ +program main + use omp_lib + implicit none + integer, allocatable :: aaa(:,:,:) + integer :: i + + allocate (aaa(-4:10,-3:8,2)) + aaa(:,:,:) = reshape ([(i, i = 1, size(aaa))], shape(aaa)) + + do i = 0, omp_get_num_devices() +!$omp target data map(to: aaa) + call test_addr (aaa, i) + call test_ptr (aaa, i) +!$omp end target data + end do + deallocate (aaa) + +contains + + subroutine test_addr (, dev) +use iso_c_binding +integer, target, allocatable :: (:,:,:), (:,:,:) +integer, value :: dev +integer :: i +type(c_ptr) :: ptr +logical :: is_shared + +is_shared = .false. +!$omp target device(dev) map(to: is_shared) + is_shared = .true. +!$omp end target + +allocate ((-4:10,-3:8,2)) +(:,:,:) = reshape ([(-i, i = 1, size())], shape()) +!$omp target enter data map(to: ) device(dev) +if (any (lbound () /= [-4, -3, 1])) error stop 1 +if (any (shape () /= [15, 12, 2])) error stop 2 +if (any (lbound () /= [-4, -3, 1])) error stop 3 +if (any (shape () /= [15, 12, 2])) error stop 4 +if (any ( /= -)) error stop 5 +if (any ( /= reshape ([(i, i = 1, size())], shape( & + error stop 6 + +!$omp parallel do shared(, ) +do i
[PATCH] openmp: Handle unified address memory.
This patch adds enough support for "requires unified_address" to make the sollve_vv testcases pass. It implements unified_address as a synonym of unified_shared_memory, which is both valid and the only way I know of to unify addresses with Cuda (could be wrong). This patch should be applied on to of the previous patch set for USM. OK for stage 1? I'll apply it to OG11 shortly. Andrewopenmp: unified_address support This makes "requires unified_address" work by making it eqivalent to "requires unified_shared_memory". This is more than is strictly necessary, but should be standard compliant. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_requires): Check requires unified_address for conflict with -foffload-memory=shared. gcc/cp/ChangeLog: * parser.c (cp_parser_omp_requires): Check requires unified_address for conflict with -foffload-memory=shared. gcc/fortran/ChangeLog: * openmp.c (gfc_match_omp_requires): Check requires unified_address for conflict with -foffload-memory=shared. gcc/ChangeLog: * omp-low.c: Do USM transformations for "unified_address". gcc/testsuite/ChangeLog: * c-c++-common/gomp/usm-4.c: New test. * gfortran.dg/gomp/usm-4.f90: New test. diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 12408770193..9a3d0cb8cea 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -22531,18 +22531,27 @@ c_parser_omp_requires (c_parser *parser) enum omp_requires this_req = (enum omp_requires) 0; if (!strcmp (p, "unified_address")) - this_req = OMP_REQUIRES_UNIFIED_ADDRESS; + { + this_req = OMP_REQUIRES_UNIFIED_ADDRESS; + + if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED + && flag_offload_memory != OFFLOAD_MEMORY_NONE) + error_at (cloc, + "unified_address is incompatible with the " + "selected -foffload-memory option"); + flag_offload_memory = OFFLOAD_MEMORY_UNIFIED; + } else if (!strcmp (p, "unified_shared_memory")) - { - this_req = OMP_REQUIRES_UNIFIED_SHARED_MEMORY; - - if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED - && flag_offload_memory != OFFLOAD_MEMORY_NONE) - error_at (cloc, - "unified_shared_memory is incompatible with the " - "selected -foffload-memory option"); - flag_offload_memory = OFFLOAD_MEMORY_UNIFIED; - } + { + this_req = OMP_REQUIRES_UNIFIED_SHARED_MEMORY; + + if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED + && flag_offload_memory != OFFLOAD_MEMORY_NONE) + error_at (cloc, + "unified_shared_memory is incompatible with the " + "selected -foffload-memory option"); + flag_offload_memory = OFFLOAD_MEMORY_UNIFIED; + } else if (!strcmp (p, "dynamic_allocators")) this_req = OMP_REQUIRES_DYNAMIC_ALLOCATORS; else if (!strcmp (p, "reverse_offload")) diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index fd9f62f4543..3a9ea272f10 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -46406,18 +46406,27 @@ cp_parser_omp_requires (cp_parser *parser, cp_token *pragma_tok) enum omp_requires this_req = (enum omp_requires) 0; if (!strcmp (p, "unified_address")) - this_req = OMP_REQUIRES_UNIFIED_ADDRESS; + { + this_req = OMP_REQUIRES_UNIFIED_ADDRESS; + + if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED + && flag_offload_memory != OFFLOAD_MEMORY_NONE) + error_at (cloc, + "unified_address is incompatible with the " + "selected -foffload-memory option"); + flag_offload_memory = OFFLOAD_MEMORY_UNIFIED; + } else if (!strcmp (p, "unified_shared_memory")) - { - this_req = OMP_REQUIRES_UNIFIED_SHARED_MEMORY; - - if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED - && flag_offload_memory != OFFLOAD_MEMORY_NONE) - error_at (cloc, - "unified_shared_memory is incompatible with the " - "selected -foffload-memory option"); - flag_offload_memory = OFFLOAD_MEMORY_UNIFIED; - } + { + this_req = OMP_REQUIRES_UNIFIED_SHARED_MEMORY; + + if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED + && flag_offload_memory != OFFLOAD_MEMORY_NONE) + error_at (cloc, + "unified_shared_memory is incompatible with the " + "selected -foffload-memory option"); + flag_offload_memory = OFFLOAD_MEMORY_UNIFIED; + } else if (!strcmp (p, "dynamic_allocators")) this_r
[PATCH] PR fortran/105310 - ICE when UNION is after the 8th field in a DEC STRUCTURE with -finit-derived -finit-local-zero
See the bug report at gcc dot gnu dot org/bugzilla/show_bug.cgi?id=105310 . This code was originally authored by me and the fix is trivial, so I intend to commit the attached patch in the next few days if there is no dissent. The bug is caused by gfc_conv_union_initializer in gcc/fortran/trans-expr.cc, which accepts a pointer to a vector of constructor trees (vec*) as an argument, then appends one or two field constructors to the vector. The problem is the use of CONSTRUCTOR_APPEND_ELT(v, ...) within gfc_conv_union_initializer, which modifies the vector pointer v when a reallocation of the vector occurs, but the pointer is passed by value. Therefore, when a vector reallocation occurs, the caller's (gfc_conv_structure) vector pointer is not updated and subsequently points to freed memory. Chaos ensues. The bug only occurs when gfc_conv_union_initializer itself triggers the reallocation, which is whenever the vector is "full" (v->m_vecpfx.m_alloc == v->m_vecpfx.m_num). Since the vector defaults to allocating 8 elements and doubles in size for every reallocation, the bug only occurs when there are 8, 16, 32, etc... fields with initializers prior to the union, causing the vector of constructors to be resized when entering gfc_conv_union_initializer. The -finit-derived and -finit-local-zero options together ensure each field has an initializer, triggering the bug. The patch fixes the bug by passing the vector pointer to gfc_conv_union_initializer by reference, matching the signature of vec_safe_push from within the CONSTRUCTOR_APPEND_ELT macro. -- Fritz Reese diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index 06713f24f95..8677a3b0b20 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -9195,7 +9195,7 @@ gfc_trans_structure_assign (tree dest, gfc_expr * expr, bool init, bool coarray) } void -gfc_conv_union_initializer (vec *v, +gfc_conv_union_initializer (vec *&v, gfc_component *un, gfc_expr *init) { gfc_constructor *ctor;
Re: [PATCH] PR fortran/105310 - ICE when UNION is after the 8th field in a DEC STRUCTURE with -finit-derived -finit-local-zero
Hi Fritz, Am 20.04.22 um 20:03 schrieb Fritz Reese via Fortran: See the bug report at gcc dot gnu dot org/bugzilla/show_bug.cgi?id=105310 . This code was originally authored by me and the fix is trivial, so I intend to commit the attached patch in the next few days if there is no dissent. OK if you add a/the testcase. The bug is caused by gfc_conv_union_initializer in gcc/fortran/trans-expr.cc, which accepts a pointer to a vector of constructor trees (vec*) as an argument, then appends one or two field constructors to the vector. The problem is the use of CONSTRUCTOR_APPEND_ELT(v, ...) within gfc_conv_union_initializer, which modifies the vector pointer v when a reallocation of the vector occurs, but the pointer is passed by value. Therefore, when a vector reallocation occurs, the caller's (gfc_conv_structure) vector pointer is not updated and subsequently points to freed memory. Chaos ensues. The bug only occurs when gfc_conv_union_initializer itself triggers the reallocation, which is whenever the vector is "full" (v->m_vecpfx.m_alloc == v->m_vecpfx.m_num). Since the vector defaults to allocating 8 elements and doubles in size for every reallocation, the bug only occurs when there are 8, 16, 32, etc... fields with initializers prior to the union, causing the vector of constructors to be resized when entering gfc_conv_union_initializer. The -finit-derived and -finit-local-zero options together ensure each field has an initializer, triggering the bug. The patch fixes the bug by passing the vector pointer to gfc_conv_union_initializer by reference, matching the signature of vec_safe_push from within the CONSTRUCTOR_APPEND_ELT macro. -- Fritz Reese As this affects all branches, you may backport the patch as far as you feel reasonable. (No, I do not use DEC extensions personally.) Thanks for the patch! Harald