[PATCH] asan: Mark instrumented vars addressable [PR102656]
Hi! We ICE on the following testcase, because the asan1 pass decides to instrument .x = 0; and does that by _13 = &.x; .ASAN_CHECK (7, _13, 4, 4); .x = 0; and later sanopt pass turns that into: _39 = (unsigned long) &.x; _40 = _39 >> 3; _41 = _40 + 2147450880; _42 = (signed char *) _41; _43 = *_42; _44 = _43 != 0; _45 = _39 & 7; _46 = (signed char) _45; _47 = _46 + 3; _48 = _47 >= _43; _49 = _44 & _48; if (_49 != 0) goto ; [0.05%] else goto ; [99.95%] [local count: 536864]: __builtin___asan_report_store4 (_39); [local count: 1073741824]: .x = 0; The problem is during expansion, isn't marked TREE_ADDRESSABLE, even when we take its address in (unsigned long) &.x. Now, instrument_derefs has code to avoid the instrumentation altogether if we can prove the access is within bounds of an automatic variable in the current function and the var isn't TREE_ADDRESSABLE (or we don't instrument use after scope), but we do it solely for VAR_DECLs. I think we should treat RESULT_DECLs exactly like that too, which is what the following patch does. I must say I'm unsure about PARM_DECLs, those can have different cases, either they are fully or partially passed in registers, then if we take parameter's address, they are in a local copy inside of a function and so work like those automatic vars. But if they are fully passed in memory, we typically just take address of the slot and in that case they live in the caller's frame. It is true we don't (can't) put any asan padding in between the arguments, so all asan could detect in that case is if caller passes fewer on stack arguments or smaller arguments than callee accepts. Anyway, as I'm unsure, I haven't added PARM_DECLs to that case. And another thing is, when we actually build_fold_addr_expr, we need to mark_addressable the inner if it isn't addressable already. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2022-02-18 Jakub Jelinek PR sanitizer/102656 * asan.cc (instrument_derefs): If inner is a RESULT_DECL and access is known to be within bounds, treat it like automatic variables. If instrumenting access and inner is {VAR,PARM,RESULT}_DECL from current function and !TREE_STATIC which is not TREE_ADDRESSABLE, mark it addressable. * g++.dg/asan/pr102656.C: New test. --- gcc/asan.cc.jj 2022-02-12 19:14:45.915168286 +0100 +++ gcc/asan.cc 2022-02-17 17:13:09.346987047 +0100 @@ -2688,13 +2688,13 @@ instrument_derefs (gimple_stmt_iterator return; poly_int64 decl_size; - if (VAR_P (inner) + if ((VAR_P (inner) || TREE_CODE (inner) == RESULT_DECL) && offset == NULL_TREE && DECL_SIZE (inner) && poly_int_tree_p (DECL_SIZE (inner), &decl_size) && known_subrange_p (bitpos, bitsize, 0, decl_size)) { - if (DECL_THREAD_LOCAL_P (inner)) + if (VAR_P (inner) && DECL_THREAD_LOCAL_P (inner)) return; /* If we're not sanitizing globals and we can tell statically that this access is inside a global variable, then there's no point adding @@ -2724,6 +2726,14 @@ instrument_derefs (gimple_stmt_iterator } } + if ((VAR_P (inner) + || TREE_CODE (inner) == PARM_DECL + || TREE_CODE (inner) == RESULT_DECL) + && !TREE_STATIC (inner) + && decl_function_context (inner) == current_function_decl + && !TREE_ADDRESSABLE (inner)) +mark_addressable (inner); + base = build_fold_addr_expr (t); if (!has_mem_ref_been_instrumented (base, size_in_bytes)) { --- gcc/testsuite/g++.dg/asan/pr102656.C.jj 2022-02-17 17:09:42.758866731 +0100 +++ gcc/testsuite/g++.dg/asan/pr102656.C2022-02-17 17:10:46.212981870 +0100 @@ -0,0 +1,27 @@ +// PR sanitizer/102656 +// { dg-do compile } +// { dg-options "-std=c++20 -fsanitize=address" } + +#include + +class promise; + +struct future { + using promise_type = promise; + future() = default; + int x = 0; +}; + +struct promise { + future get_return_object() noexcept { return {}; } + auto initial_suspend() noexcept { return std::suspend_never{}; } + auto final_suspend() noexcept { return std::suspend_never{}; } + void return_void() noexcept {} + void unhandled_exception() {} +}; + +future +func () +{ + co_return; +} Jakub
Re: [PATCH] asan: Mark instrumented vars addressable [PR102656]
On Fri, 18 Feb 2022, Jakub Jelinek wrote: > Hi! > > We ICE on the following testcase, because the asan1 pass decides to > instrument > .x = 0; > and does that by > _13 = &.x; > .ASAN_CHECK (7, _13, 4, 4); > .x = 0; > and later sanopt pass turns that into: > _39 = (unsigned long) &.x; > _40 = _39 >> 3; > _41 = _40 + 2147450880; > _42 = (signed char *) _41; > _43 = *_42; > _44 = _43 != 0; > _45 = _39 & 7; > _46 = (signed char) _45; > _47 = _46 + 3; > _48 = _47 >= _43; > _49 = _44 & _48; > if (_49 != 0) > goto ; [0.05%] > else > goto ; [99.95%] > >[local count: 536864]: > __builtin___asan_report_store4 (_39); > >[local count: 1073741824]: > .x = 0; > The problem is during expansion, isn't marked TREE_ADDRESSABLE, > even when we take its address in (unsigned long) &.x. > > Now, instrument_derefs has code to avoid the instrumentation altogether > if we can prove the access is within bounds of an automatic variable in the > current function and the var isn't TREE_ADDRESSABLE (or we don't instrument > use after scope), but we do it solely for VAR_DECLs. > > I think we should treat RESULT_DECLs exactly like that too, which is what > the following patch does. I must say I'm unsure about PARM_DECLs, those can > have different cases, either they are fully or partially passed in > registers, then if we take parameter's address, they are in a local copy > inside of a function and so work like those automatic vars. But if they > are fully passed in memory, we typically just take address of the slot > and in that case they live in the caller's frame. It is true we don't > (can't) put any asan padding in between the arguments, so all asan could > detect in that case is if caller passes fewer on stack arguments or smaller > arguments than callee accepts. Anyway, as I'm unsure, I haven't added > PARM_DECLs to that case. > > And another thing is, when we actually build_fold_addr_expr, we need to > mark_addressable the inner if it isn't addressable already. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2022-02-18 Jakub Jelinek > > PR sanitizer/102656 > * asan.cc (instrument_derefs): If inner is a RESULT_DECL and access is > known to be within bounds, treat it like automatic variables. > If instrumenting access and inner is {VAR,PARM,RESULT}_DECL from > current function and !TREE_STATIC which is not TREE_ADDRESSABLE, mark > it addressable. > > * g++.dg/asan/pr102656.C: New test. > > --- gcc/asan.cc.jj2022-02-12 19:14:45.915168286 +0100 > +++ gcc/asan.cc 2022-02-17 17:13:09.346987047 +0100 > @@ -2688,13 +2688,13 @@ instrument_derefs (gimple_stmt_iterator > return; > >poly_int64 decl_size; > - if (VAR_P (inner) > + if ((VAR_P (inner) || TREE_CODE (inner) == RESULT_DECL) >&& offset == NULL_TREE >&& DECL_SIZE (inner) >&& poly_int_tree_p (DECL_SIZE (inner), &decl_size) >&& known_subrange_p (bitpos, bitsize, 0, decl_size)) > { > - if (DECL_THREAD_LOCAL_P (inner)) > + if (VAR_P (inner) && DECL_THREAD_LOCAL_P (inner)) > return; >/* If we're not sanitizing globals and we can tell statically that this >access is inside a global variable, then there's no point adding > @@ -2724,6 +2726,14 @@ instrument_derefs (gimple_stmt_iterator > } > } > > + if ((VAR_P (inner) > + || TREE_CODE (inner) == PARM_DECL > + || TREE_CODE (inner) == RESULT_DECL) > + && !TREE_STATIC (inner) > + && decl_function_context (inner) == current_function_decl > + && !TREE_ADDRESSABLE (inner)) There's no good reason to exclute TREE_STATIC or global vars. If we take the address of something we should mark it addressable. I think you can just use DECL_P (inner) as well here for simplicity. OK with that change. Thanks, Richard. > +mark_addressable (inner); > + >base = build_fold_addr_expr (t); >if (!has_mem_ref_been_instrumented (base, size_in_bytes)) > { > --- gcc/testsuite/g++.dg/asan/pr102656.C.jj 2022-02-17 17:09:42.758866731 > +0100 > +++ gcc/testsuite/g++.dg/asan/pr102656.C 2022-02-17 17:10:46.212981870 > +0100 > @@ -0,0 +1,27 @@ > +// PR sanitizer/102656 > +// { dg-do compile } > +// { dg-options "-std=c++20 -fsanitize=address" } > + > +#include > + > +class promise; > + > +struct future { > + using promise_type = promise; > + future() = default; > + int x = 0; > +}; > + > +struct promise { > + future get_return_object() noexcept { return {}; } > + auto initial_suspend() noexcept { return std::suspend_never{}; } > + auto final_suspend() noexcept { return std::suspend_never{}; } > + void return_void() noexcept {} > + void unhandled_exception() {} > +}; > + > +future > +func () > +{ > + co_return; > +} > > Jakub > > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809
[PATCH 1/3] tree-optimization/104582 - Simplify vectorizer cost API and fixes
This simplifies the vectorizer cost API by providing overloads to add_stmt_cost and record_stmt_cost suitable for scalar stmt and branch stmt costing which do not need information like a vector type or alignment. It also fixes two mistakes where costs for versioning tests were recorded as vector stmt rather than scalar stmt. This is a first patch to simplify the actual fix for PR104582. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK? 2022-02-18 Richard Biener PR tree-optimization/104582 * tree-vectorizer.h (add_stmt_cost): New overload. (record_stmt_cost): Likewise. * tree-vect-loop.cc (vect_compute_single_scalar_iteration_cost): Use add_stmt_costs. (vect_get_known_peeling_cost): Use new overloads. (vect_estimate_min_profitable_iters): Likewise. Consistently use scalar_stmt for costing versioning checks. * tree-vect-stmts.cc (record_stmt_cost): New overload. --- gcc/tree-vect-loop.cc | 39 +++ gcc/tree-vect-stmts.cc | 10 ++ gcc/tree-vectorizer.h | 12 3 files changed, 37 insertions(+), 24 deletions(-) diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index 896218f23ea..5c7b163f01c 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -1323,13 +1323,8 @@ vect_compute_single_scalar_iteration_cost (loop_vec_info loop_vinfo) /* Now accumulate cost. */ loop_vinfo->scalar_costs = init_cost (loop_vinfo, true); - stmt_info_for_cost *si; - int j; - FOR_EACH_VEC_ELT (LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo), - j, si) -(void) add_stmt_cost (loop_vinfo->scalar_costs, si->count, - si->kind, si->stmt_info, si->vectype, - si->misalign, si->where); + add_stmt_costs (loop_vinfo->scalar_costs, + &LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo)); loop_vinfo->scalar_costs->finish_cost (nullptr); } @@ -3873,10 +3868,10 @@ vect_get_known_peeling_cost (loop_vec_info loop_vinfo, int peel_iters_prologue, iterations are unknown, count a taken branch per peeled loop. */ if (peel_iters_prologue > 0) retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken, - NULL, NULL_TREE, 0, vect_prologue); + vect_prologue); if (*peel_iters_epilogue > 0) retval += record_stmt_cost (epilogue_cost_vec, 1, cond_branch_taken, - NULL, NULL_TREE, 0, vect_epilogue); + vect_epilogue); } stmt_info_for_cost *si; @@ -3946,8 +3941,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo, { /* FIXME: Make cost depend on complexity of individual check. */ unsigned len = LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo).length (); - (void) add_stmt_cost (target_cost_data, len, vector_stmt, - NULL, NULL_TREE, 0, vect_prologue); + (void) add_stmt_cost (target_cost_data, len, scalar_stmt, vect_prologue); if (dump_enabled_p ()) dump_printf (MSG_NOTE, "cost model: Adding cost of checks for loop " @@ -3959,13 +3953,12 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo, { /* FIXME: Make cost depend on complexity of individual check. */ unsigned len = LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).length (); - (void) add_stmt_cost (target_cost_data, len, vector_stmt, - NULL, NULL_TREE, 0, vect_prologue); + (void) add_stmt_cost (target_cost_data, len, scalar_stmt, vect_prologue); len = LOOP_VINFO_CHECK_UNEQUAL_ADDRS (loop_vinfo).length (); if (len) /* Count LEN - 1 ANDs and LEN comparisons. */ (void) add_stmt_cost (target_cost_data, len * 2 - 1, - scalar_stmt, NULL, NULL_TREE, 0, vect_prologue); + scalar_stmt, vect_prologue); len = LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).length (); if (len) { @@ -3976,7 +3969,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo, if (!LOOP_VINFO_LOWER_BOUNDS (loop_vinfo)[i].unsigned_p) nstmts += 1; (void) add_stmt_cost (target_cost_data, nstmts, - scalar_stmt, NULL, NULL_TREE, 0, vect_prologue); + scalar_stmt, vect_prologue); } if (dump_enabled_p ()) dump_printf (MSG_NOTE, @@ -3998,7 +3991,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo, if (LOOP_REQUIRES_VERSIONING (loop_vinfo)) (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken, - NULL, NULL_TREE, 0, vect_prologue); + vect_prologue); /* Count statements in scalar loop. Using this as scalar cost for a single iteration for now. @@
[PATCH 2/3] tree-optimization/104582 - make SLP node available in vector cost hook
This adjusts the vectorizer costing API to allow passing down the SLP node the vector stmt is created from. Bootstrapped and tested on x86_64-unknown-linux-gnu, I've built aarch64 and rs6000 cc1 crosses. OK? Thanks, Richard. 2022-02-18 Richard Biener PR tree-optimization/104582 * tree-vectorizer.h (stmt_info_for_cost::node): New field. (vector_costs::add_stmt_cost): Add SLP node parameter. (dump_stmt_cost): Likewise. (add_stmt_cost): Likewise, new overload and adjust. (add_stmt_costs): Adjust. (record_stmt_cost): New overload. * tree-vectorizer.cc (dump_stmt_cost): Dump the SLP node. (vector_costs::add_stmt_cost): Adjust. * tree-vect-loop.cc (vect_estimate_min_profitable_iters): Adjust. * tree-vect-slp.cc (vect_prologue_cost_for_slp): Record the SLP node for costing. (vectorizable_slp_permutation): Likewise. * tree-vect-stmts.cc (record_stmt_cost): Adjust and add new overloads. * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Adjust. * config/aarch64/aarch64.cc (aarch64_vector_costs::add_stmt_cost): Adjust. * config/rs6000/rs6000.cc (rs6000_vector_costs::add_stmt_cost): Adjust. (rs6000_cost_data::adjust_vect_cost_per_loop): Likewise. --- gcc/config/aarch64/aarch64.cc | 6 +++--- gcc/config/i386/i386.cc | 9 + gcc/config/rs6000/rs6000.cc | 10 ++ gcc/tree-vect-loop.cc | 16 +--- gcc/tree-vect-slp.cc | 4 ++-- gcc/tree-vect-stmts.cc| 30 ++ gcc/tree-vectorizer.cc| 10 +++--- gcc/tree-vectorizer.h | 28 +--- 8 files changed, 75 insertions(+), 38 deletions(-) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 37ed22bcc94..fb40b7e9c78 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -15008,7 +15008,7 @@ public: aarch64_vector_costs (vec_info *, bool); unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind, - stmt_vec_info stmt_info, tree vectype, + stmt_vec_info stmt_info, slp_tree, tree vectype, int misalign, vect_cost_model_location where) override; void finish_cost (const vector_costs *) override; @@ -15953,8 +15953,8 @@ aarch64_stp_sequence_cost (unsigned int count, vect_cost_for_stmt kind, unsigned aarch64_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind, -stmt_vec_info stmt_info, tree vectype, -int misalign, +stmt_vec_info stmt_info, slp_tree, +tree vectype, int misalign, vect_cost_model_location where) { fractional_cost stmt_cost diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index e4b42fbba6f..0830dbd7dca 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -22982,8 +22982,8 @@ class ix86_vector_costs : public vector_costs using vector_costs::vector_costs; unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind, - stmt_vec_info stmt_info, tree vectype, - int misalign, + stmt_vec_info stmt_info, slp_tree node, + tree vectype, int misalign, vect_cost_model_location where) override; }; @@ -22997,8 +22997,9 @@ ix86_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar) unsigned ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind, - stmt_vec_info stmt_info, tree vectype, - int misalign, vect_cost_model_location where) + stmt_vec_info stmt_info, slp_tree, + tree vectype, int misalign, + vect_cost_model_location where) { unsigned retval = 0; bool scalar_p diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 32a13cdaf1f..91e5efb6288 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -5236,7 +5236,7 @@ public: using vector_costs::vector_costs; unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind, - stmt_vec_info stmt_info, tree vectype, + stmt_vec_info stmt_info, slp_tree, tree vectype, int misalign, vect_cost_model_location where) override; void finish_cost (const vector_costs *) override; @@ -5452,8 +5452,9 @@ rs6000_cost_data::update_target_cost_per_stmt (vect_cost_for_stmt kind, unsigned rs6000_cost_data::
[PATCH 3/3] target/99881 - x86 vector cost of CTOR from integer regs
This uses the now passed SLP node to the vectorizer costing hook to adjust vector construction costs for the cost of moving an integer component from a GPR to a vector register when that's required for building a vector from components. A cruical difference here is whether the component is loaded from memory or extracted from a vector register as in those cases no intermediate GPR is involved. The pr99881.c testcase can be Un-XFAILed with this patch, the pr91446.c testcase now produces scalar code which looks superior to me so I've adjusted it as well. I'm currently re-bootstrapping and testing on x86_64-unknown-linux-gnu after adding the BIT_FIELD_REF vector extracting special casing. I suppose we can let autotesters look for SPEC performance fallout. OK if testing succeeds? Thanks, Richard. 2022-02-18 Richard Biener PR tree-optimization/104582 PR target/99881 * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Cost GPR to vector register moves for integer vector construction. * gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-1.c: New. * gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-2.c: Likewise. * gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-3.c: Likewise. * gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-4.c: Likewise. * gcc.target/i386/pr99881.c: Un-XFAIL. * gcc.target/i386/pr91446.c: Adjust to not expect vectorization. --- gcc/config/i386/i386.cc | 45 ++- .../costmodel/x86_64/costmodel-pr104582-1.c | 15 +++ .../costmodel/x86_64/costmodel-pr104582-2.c | 13 ++ .../costmodel/x86_64/costmodel-pr104582-3.c | 13 ++ .../costmodel/x86_64/costmodel-pr104582-4.c | 15 +++ gcc/testsuite/gcc.target/i386/pr91446.c | 2 +- gcc/testsuite/gcc.target/i386/pr99881.c | 2 +- 7 files changed, 102 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-1.c create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-2.c create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-3.c create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-4.c diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 0830dbd7dca..b2bf90576d5 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -22997,7 +22997,7 @@ ix86_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar) unsigned ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind, - stmt_vec_info stmt_info, slp_tree, + stmt_vec_info stmt_info, slp_tree node, tree vectype, int misalign, vect_cost_model_location where) { @@ -23160,6 +23160,49 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind, stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign); stmt_cost *= (TYPE_VECTOR_SUBPARTS (vectype) + 1); } + else if (kind == vec_construct + && node + && SLP_TREE_DEF_TYPE (node) == vect_external_def + && INTEGRAL_TYPE_P (TREE_TYPE (vectype))) +{ + stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign); + unsigned i; + tree op; + FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op) + if (TREE_CODE (op) == SSA_NAME) + TREE_VISITED (op) = 0; + FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op) + { + if (TREE_CODE (op) != SSA_NAME + || TREE_VISITED (op)) + continue; + TREE_VISITED (op) = 1; + gimple *def = SSA_NAME_DEF_STMT (op); + tree tem; + if (is_gimple_assign (def) + && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)) + && ((tem = gimple_assign_rhs1 (def)), true) + && TREE_CODE (tem) == SSA_NAME + /* A sign-change expands to nothing. */ + && tree_nop_conversion_p (TREE_TYPE (gimple_assign_lhs (def)), + TREE_TYPE (tem))) + def = SSA_NAME_DEF_STMT (tem); + /* When the component is loaded from memory we can directly +move it to a vector register, otherwise we have to go +via a GPR or via vpinsr which involves similar cost. +Likewise with a BIT_FIELD_REF extracting from a vector +register we can hope to avoid using a GPR. */ + if (!is_gimple_assign (def) + || (!gimple_assign_load_p (def) + && (gimple_assign_rhs_code (def) != BIT_FIELD_REF + || !VECTOR_TYPE_P (TREE_TYPE + (TREE_OPERAND (gimple_assign_rhs1 (def), 0)) + stmt_cost += ix86_cost->sse_to_integer; + } + FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op) +
[committed] rs6000: Fix up posix_memalign call in _mm_malloc [PR104598]
On Wed, Feb 16, 2022 at 09:09:27PM -0600, Paul A. Clarke wrote: > I see a couple of issues in my commit message, which I'll fix up before > merging... The uglification changes went in one spot too far and uglified also the anem of function, posix_memalign should be called like that and not a non-existent function instead of it. Committed to trunk as obvious. 2022-02-18 Jakub Jelinek PR target/104257 PR target/104598 * config/rs6000/mm_malloc.h (_mm_malloc): Call posix_memalign rather than __posix_memalign. --- gcc/config/rs6000/mm_malloc.h.jj2022-02-18 12:38:05.729397896 +0100 +++ gcc/config/rs6000/mm_malloc.h 2022-02-18 17:16:52.251239933 +0100 @@ -47,7 +47,7 @@ _mm_malloc (size_t __size, size_t __alig return malloc (__size); if (__alignment < __vec_align) __alignment = __vec_align; - if (__posix_memalign (&__ptr, __alignment, __size) == 0) + if (posix_memalign (&__ptr, __alignment, __size) == 0) return __ptr; else return NULL; Jakub
Re: [committed] d: Merge upstream dmd 52844d4b1, druntime dbd0c874, phobos 896b1d0e1.
Excerpts from Rainer Orth's message of Februar 16, 2022 11:45 pm: > Hi Iain, > >> This patch merges the D front-end implementation with upstream dmd >> 52844d4b1, as well as the D runtime libraries with druntime dbd0c874, >> and phobos 896b1d0e1, including the latest features and bug-fixes ahead of >> the 2.099.0-beta1 release. > > this patch broke Solaris bootstrap: > > /vol/gcc/src/hg/master/local/libphobos/libdruntime/core/sys/posix/sys/ipc.d:193:5: > error: static assert: "Unsupported platform" > 193 | static assert(false, "Unsupported platform"); > | ^ > > The attached patch fixes this. Tested on i386-pc-solaris2.11 and > sparc-sun-solaris2.11. > Thanks Rainer! I've committed it upstream, and will merge it down in the next sync-up. Iain.
[PATCH] PR middle-end/65855: Scalar evolution for quadratic chrecs
This patch improves GCC's scalar evolution and final value replacement optimizations by supporting triangular/quadratic/trapezoid chrecs which resolves both PR middle-end/65855 and PR c/80852, but alas not (yet) PR tree-optimization/46186. I've listed Richard Biener as co-author, as this solution is based heavily on his proposed patch in comment #4 of PR 65855 from 2015, but with several significant changes. The most important change is a correction to formula used. For the scalar evolution {a, +, {b, +, c}}, there was an off-by-one error, so chrec_apply should not return a + b*x + c*x*(x+1)/2, but a + b*x + c*x*(x-1)/2, which explains why the original patch didn't produce the expected results. Another significant set of changes involves handling the various type mismatches and overflows. In chrec_apply the evolution is evaluated after x iterations (which is an unsigned integer type, called orig_type in this patch) but a, b and c may be any type including floating point (see PR tree-opt/23391) and including types that trap on overflow with -ftrapv (see PR tree-opt/79721), and in the case of pointer arithmetic, b and c may have a different type (sizetype) to a! Additionally, the division by two in "c*x*(x-1)/2" requires the correct top bit in modulo arithmetic, which means that the multiplication typically needs to be performed with more precision (in a wider mode) than orig_type [unless c is an even integer constant, or x (the number of iterations) is a known constant at compile-time]. Finally, final value replacement is only an effective optimization if the expression used to compute the result of the loop is cheaper than the loop itself, hence chrec_apply needs to return an optimized folded tree with the minimal number of operators. Hence if b == c, this patch calculates "a + c*x*(x+1)/2", when c is even we can perform the division at compile-time, and when x is a non-trivial expression, we wrap it in a SAVE_EXPR so that the lowering to gimple can reuse the common subexpression. Correctly handling all of the corner cases results in a patch significantly larger than the original "proof-of-concept". There's one remaining refinement, marked as TODO in the patch, which is to also support unsigned 64-bit to 128-bit widening multiplications (umulditi3) on targets that support it, such as x86_64, as used by LLVM, but this patch provides the core target-independent functionality. [This last piece would handle/resolve the testcase in PR tree-opt/46186 which requires 128-bit TImode operations, not suitable for all backend targets]. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check with no new failures. Ok for mainline? 2022-02-18 Roger Sayle Richard Biener gcc/ChangeLog PR target/65855 PR c/80852 * tree-chrec.cc (chrec_fold_divide_by_2): New function to divide a chrec by two, honoring the type of the chrec. (chrec_save_expr): New function to wrap a chrec in a SAVE_EXPR, but without TREE_SIDE_EFFECTS. (chrec_apply_triangular): New helper function of chrec apply to evaluate a quadratic/triangular chrec. (chrec_apply): Expand/clarify comment before function. Handle application of any chrec after zero iterations, i.e. A. Recursively handle cases where the iteration count is conditional. Handle quadratic/triangular chrecs by calling the new chrec_apply_triangular function. (chrec_convert_rhs): Handle conversion of integer constants to scalar floating point types here (moved from chrec_apply). * tree-scalar-evolution.cc (interpret_expr): Handle SAVE_EXPR by (tail call) recursion. (expression_expensive_p): Calculate the cost of a SAVE_EXPR as half the cost of its operand, i.e. assume it is reused. gcc/testsuite/ChangeLog PR target/65855 PR c/80852 * gcc.dg/tree-ssa/pr65855.c: New test case. * gcc.dg/tree-ssa/pr80852.c: New test case. * gcc.dg/vect/vect-iv-11.c: Update to reflect that the loop is no longer vectorized, but calculated by final value replacement. Roger -- diff --git a/gcc/tree-chrec.cc b/gcc/tree-chrec.cc index c44cea7..63d7522 100644 --- a/gcc/tree-chrec.cc +++ b/gcc/tree-chrec.cc @@ -573,6 +573,246 @@ chrec_evaluate (unsigned var, tree chrec, tree n, unsigned int k) chrec_convert (ctype, chrec, NULL), binomial_n_k); } + +/* Fold the (exact) division of chrec by two. */ + +static tree +chrec_fold_divide_by_2 (tree type, tree op) +{ + if (SCALAR_FLOAT_TYPE_P (type)) +return fold_build2 (MULT_EXPR, type, op, build_real (type, dconsthalf)); + return fold_build2 (RSHIFT_EXPR, type, op, build_int_cst (type, 1)); +} + + +/* Indicate that a chrec subexpression is used multiple times. */ + +static tree +chrec_save_expr (tree t) +{ + if (is_gimple_min_invariant (t)) +return t; + t = save_expr (t); + if (TRE
[r12-7293 Regression] FAIL: gcc.target/i386/pieces-memset-21.c scan-assembler-not vzeroupper on Linux/x86_64
On Linux/x86_64, fe79d652c96b53384ddfa43e312cb0010251391b is the first bad commit commit fe79d652c96b53384ddfa43e312cb0010251391b Author: Richard Biener Date: Thu Feb 17 14:40:16 2022 +0100 target/104581 - compile-time regression in mode-switching caused FAIL: gcc.target/i386/pieces-memset-21.c scan-assembler-not vzeroupper with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-7293/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/pieces-memset-21.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/pieces-memset-21.c --target_board='unix{-m32\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at skpgkp2 at gmail dot com)
[PATCH] pieces-memset-21.c: Expect vzeroupper for ia32
On Thu, Feb 17, 2022 at 6:32 PM Hongtao Liu via Gcc-patches wrote: > > On Thu, Feb 17, 2022 at 9:47 PM Richard Biener via Gcc-patches > wrote: > > > > The x86 backend piggy-backs on mode-switching for insertion of > > vzeroupper. A recent improvement there was implemented in a way > > to walk possibly the whole basic-block for all DF reg def definitions > > in its mode_needed hook which is called for each instruction in > > a basic-block during mode-switching local analysis. > > > > The following mostly reverts this improvement. It needs to be > > re-done in a way more consistent with a local dataflow which > > probably means making targets aware of the state of the local > > dataflow analysis. > > > > This improves compile-time of some 538.imagick_r TU from > > 362s to 16s with -Ofast -mavx2 -fprofile-generate. > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > LGTM, have talked to H.J, he also agrees. > > > > Thanks, > > Richard. > > > > 2022-02-17 Richard Biener > > > > PR target/104581 > > * config/i386/i386.cc (ix86_avx_u128_mode_source): Remove. > > (ix86_avx_u128_mode_needed): Return AVX_U128_DIRTY instead > > of calling ix86_avx_u128_mode_source which would eventually > > have returned AVX_U128_ANY in some very special case. > > > > * gcc.target/i386/pr101456-1.c: XFAIL. > > --- > > gcc/config/i386/i386.cc| 78 +- > > gcc/testsuite/gcc.target/i386/pr101456-1.c | 3 +- > > 2 files changed, 5 insertions(+), 76 deletions(-) > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > > index cf246e74e57..e4b42fbba6f 100644 > > --- a/gcc/config/i386/i386.cc > > +++ b/gcc/config/i386/i386.cc > > @@ -14377,80 +14377,12 @@ ix86_check_avx_upper_register (const_rtx exp) > > > > static void > > ix86_check_avx_upper_stores (rtx dest, const_rtx, void *data) > > - { > > - if (ix86_check_avx_upper_register (dest)) > > +{ > > + if (ix86_check_avx_upper_register (dest)) > > { > >bool *used = (bool *) data; > >*used = true; > > } > > - } > > - > > -/* For YMM/ZMM store or YMM/ZMM extract. Return mode for the source > > - operand of SRC DEFs in the same basic block before INSN. */ > > - > > -static int > > -ix86_avx_u128_mode_source (rtx_insn *insn, const_rtx src) > > -{ > > - basic_block bb = BLOCK_FOR_INSN (insn); > > - rtx_insn *end = BB_END (bb); > > - > > - /* Return AVX_U128_DIRTY if there is no DEF in the same basic > > - block. */ > > - int status = AVX_U128_DIRTY; > > - > > - for (df_ref def = DF_REG_DEF_CHAIN (REGNO (src)); > > - def; def = DF_REF_NEXT_REG (def)) > > -if (DF_REF_BB (def) == bb) > > - { > > - /* Ignore DEF from different basic blocks. */ > > - rtx_insn *def_insn = DF_REF_INSN (def); > > - > > - /* Check if DEF_INSN is before INSN. */ > > - rtx_insn *next; > > - for (next = NEXT_INSN (def_insn); > > -next != nullptr && next != end && next != insn; > > -next = NEXT_INSN (next)) > > - ; > > - > > - /* Skip if DEF_INSN isn't before INSN. */ > > - if (next != insn) > > - continue; > > - > > - /* Return AVX_U128_DIRTY if the source operand of DEF_INSN > > - isn't constant zero. */ > > - > > - if (CALL_P (def_insn)) > > - { > > - bool avx_upper_reg_found = false; > > - note_stores (def_insn, > > -ix86_check_avx_upper_stores, > > -&avx_upper_reg_found); > > - > > - /* Return AVX_U128_DIRTY if call returns AVX. */ > > - if (avx_upper_reg_found) > > - return AVX_U128_DIRTY; > > - > > - continue; > > - } > > - > > - rtx set = single_set (def_insn); > > - if (!set) > > - return AVX_U128_DIRTY; > > - > > - rtx dest = SET_DEST (set); > > - > > - /* Skip if DEF_INSN is not an AVX load. Return AVX_U128_DIRTY > > - if the source operand isn't constant zero. */ > > - if (ix86_check_avx_upper_register (dest) > > - && standard_sse_constant_p (SET_SRC (set), > > - GET_MODE (dest)) != 1) > > - return AVX_U128_DIRTY; > > - > > - /* We get here only if all AVX loads are from constant zero. */ > > - status = AVX_U128_ANY; > > - } > > - > > - return status; > > } > > > > /* Return needed mode for entity in optimize_mode_switching pass. */ > > @@ -14520,11 +14452,7 @@ ix86_avx_u128_mode_needed (rtx_insn *insn) > > { > > FOR_EACH_SUBRTX (iter, array, src, NONCONST) > > if (ix86_check_avx_upper_register (*iter)) > > - { > > - int status = ix86_avx_u128_mode_source (insn, *iter); > > - if (status == AVX_U128_DIRTY) > > - return status; > > - } > > + return AVX_U128_DIRTY; > >
[PATCH 0/8] OpenMP 5.0: C++ "declare mapper" support (plus struct rework, etc.)
Hi, This patch contains rebased/slightly bug-fixed versions of the patches previously posted in the series: https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585439.html plus a new implementation of "declare mapper" support for C++. This can't be committed now, but posting now so others (mostly Jakub?) have a chance to look at it and comment on the general approach, etc.. Further commentary on individual patches. Thanks, Julian Julian Brown (8): OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer) Remove omp_target_reorder_clauses OpenMP/OpenACC struct sibling list gimplification extension and rework OpenMP: Add inspector class to unify mapped address analysis OpenMP: lvalue parsing for map clauses (C++) OpenMP: lvalue parsing for map clauses (C) Use OMP_ARRAY_SECTION instead of TREE_LIST in C++ FE OpenMP 5.0 "declare mapper" support for C++ gcc/c-family/c-common.h | 45 + gcc/c-family/c-omp.cc | 227 ++ gcc/c/c-parser.cc | 150 +- gcc/c/c-tree.h|1 + gcc/c/c-typeck.cc | 251 +- gcc/cp/cp-gimplify.cc |6 + gcc/cp/cp-objcp-common.h |2 + gcc/cp/cp-tree.h | 10 + gcc/cp/decl.cc| 18 +- gcc/cp/error.cc |9 + gcc/cp/mangle.cc |5 +- gcc/cp/name-lookup.cc |3 +- gcc/cp/parser.cc | 520 ++- gcc/cp/parser.h |3 + gcc/cp/pt.cc | 144 +- gcc/cp/semantics.cc | 863 +++-- gcc/fortran/parse.cc |3 + gcc/fortran/trans-openmp.cc | 20 +- gcc/gimplify.cc | 2822 - gcc/langhooks-def.h |3 + gcc/langhooks.cc |9 + gcc/langhooks.h |4 + gcc/omp-general.h | 52 + gcc/omp-low.cc| 23 +- gcc/testsuite/c-c++-common/gomp/map-1.c |3 +- gcc/testsuite/c-c++-common/gomp/map-6.c | 12 +- gcc/testsuite/g++.dg/goacc/member-array-acc.C | 13 + gcc/testsuite/g++.dg/gomp/declare-mapper-1.C | 58 + gcc/testsuite/g++.dg/gomp/declare-mapper-2.C | 30 + gcc/testsuite/g++.dg/gomp/declare-mapper-3.C | 27 + gcc/testsuite/g++.dg/gomp/declare-mapper-4.C | 74 + gcc/testsuite/g++.dg/gomp/ind-base-3.C| 38 + gcc/testsuite/g++.dg/gomp/map-assignment-1.C | 12 + gcc/testsuite/g++.dg/gomp/map-inc-1.C | 10 + gcc/testsuite/g++.dg/gomp/map-lvalue-ref-1.C | 19 + gcc/testsuite/g++.dg/gomp/map-ptrmem-1.C | 36 + gcc/testsuite/g++.dg/gomp/map-ptrmem-2.C | 39 + .../g++.dg/gomp/map-static-cast-lvalue-1.C| 17 + gcc/testsuite/g++.dg/gomp/map-ternary-1.C | 20 + gcc/testsuite/g++.dg/gomp/member-array-2.C| 86 + gcc/testsuite/g++.dg/gomp/member-array-omp.C | 13 + gcc/testsuite/g++.dg/gomp/pr67522.C |2 +- gcc/testsuite/g++.dg/gomp/target-3.C |4 +- gcc/testsuite/g++.dg/gomp/target-lambda-1.C |6 +- gcc/testsuite/g++.dg/gomp/target-this-2.C |2 +- gcc/testsuite/g++.dg/gomp/target-this-3.C |4 +- gcc/testsuite/g++.dg/gomp/target-this-4.C |4 +- .../g++.dg/gomp/unmappable-component-1.C | 21 + gcc/tree-core.h |4 + gcc/tree-pretty-print.cc | 56 + gcc/tree.cc |2 + gcc/tree.def | 10 + gcc/tree.h| 21 + include/gomp-constants.h |8 +- libgomp/testsuite/libgomp.c++/baseptrs-3.C| 275 ++ libgomp/testsuite/libgomp.c++/class-array-1.C | 59 + .../testsuite/libgomp.c++/declare-mapper-1.C | 87 + .../testsuite/libgomp.c++/declare-mapper-2.C | 55 + .../testsuite/libgomp.c++/declare-mapper-3.C | 63 + .../testsuite/libgomp.c++/declare-mapper-4.C | 63 + .../testsuite/libgomp.c++/declare-mapper-5.C | 52 + .../testsuite/libgomp.c++/declare-mapper-6.C | 37 + .../testsuite/libgomp.c++/declare-mapper-7.C | 48 + .../testsuite/libgomp.c++/declare-mapper-8.C | 61 + libgomp/testsuite/libgomp.c++/ind-base-1.C| 162 + libgomp/testsuite/libgomp.c++/ind-base-2.C| 49 + libgomp/testsuite/libgomp.c++/map-comma-1.C | 15 + .../testsuite/libgomp.c++/map-rvalue-ref-1.C | 22 + .../testsuite/libgomp.c++/member-array-1.C| 89 + libgomp/testsuite/libgomp.c++/struct-ref-1.C | 97 + .../libgomp.c-c++-common/baseptrs-1.c | 50 + .../libgomp.c-c++-common/baseptrs-2.c | 70 + .../libgo
[PATCH 1/8] OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer)
This patch reimplements the omp_target_reorder_clauses function in anticipation of supporting "deeper" struct mappings (that is, with several structure dereference operators, or similar). The idea is that in place of the (possibly quadratic) algorithm in omp_target_reorder_clauses that greedily moves clauses containing addresses that are subexpressions of other addresses before those other addresses, we employ a topological sort algorithm to calculate a proper order for map clauses. This should run in linear time, and hopefully handles degenerate cases where multiple "levels" of indirect accesses are present on a given directive. The new method also takes care to keep clause groups together, addressing the concerns raised in: https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570501.html To figure out if some given clause depends on a base pointer in another clause, we strip off the outer layers of the address expression, and check (via a tree_operand_hash hash table we have built) if the result is a "base pointer" as defined in OpenMP 5.0 (1.2.6 Data Terminology). There are some subtleties involved, however: - We must treat MEM_REF with zero offset the same as INDIRECT_REF. This should probably be fixed in the front ends instead so we always use a canonical form (probably INDIRECT_REF). The following patch shows one instance of the problem, but there may be others: https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571382.html - Mapping a whole struct implies mapping each of that struct's elements, which may be base pointers. Because those base pointers aren't necessarily explicitly referenced in the directive in question, we treat the whole-struct mapping as a dependency instead. This version of the patch has been moved to the front of the patch queue, thus isn't dependent on any of the following struct-rework patches. Tested with offloading to NVPTX and bootstrapped. OK (for stage 1)? Thanks, Julian 2022-02-18 Julian Brown gcc/ * gimplify.c (is_or_contains_p, omp_target_reorder_clauses): Delete functions. (omp_tsort_mark): Add enum. (omp_mapping_group): Add struct. (debug_mapping_group, omp_get_base_pointer, omp_get_attachment, omp_group_last, omp_gather_mapping_groups, omp_group_base, omp_index_mapping_groups, omp_containing_struct, omp_tsort_mapping_groups_1, omp_tsort_mapping_groups, omp_segregate_mapping_groups, omp_reorder_mapping_groups): New functions. (gimplify_scan_omp_clauses): Call above functions instead of omp_target_reorder_clauses, unless we've seen an error. * omp-low.c (scan_sharing_clauses): Avoid strict test if we haven't sorted mapping groups. gcc/testsuite/ * g++.dg/gomp/target-lambda-1.C: Adjust expected output. * g++.dg/gomp/target-this-3.C: Likewise. * g++.dg/gomp/target-this-4.C: Likewise. --- gcc/gimplify.cc | 785 +++- gcc/omp-low.cc | 7 +- gcc/testsuite/g++.dg/gomp/target-lambda-1.C | 7 +- gcc/testsuite/g++.dg/gomp/target-this-3.C | 4 +- gcc/testsuite/g++.dg/gomp/target-this-4.C | 4 +- 5 files changed, 793 insertions(+), 14 deletions(-) diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 875b115d02d..968cbd263f5 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -8738,6 +8738,7 @@ extract_base_bit_offset (tree base, tree *base_ref, poly_int64 *bitposp, return base; } +#if 0 /* Returns true if EXPR is or contains (as a sub-component) BASE_PTR. */ static bool @@ -8761,6 +8762,7 @@ is_or_contains_p (tree expr, tree base_ptr) return operand_equal_p (expr, base_ptr); } + /* Implement OpenMP 5.x map ordering rules for target directives. There are several rules, and with some level of ambiguity, hopefully we can at least collect the complexity here in one place. */ @@ -8940,6 +8942,761 @@ omp_target_reorder_clauses (tree *list_p) } } } +#endif + + +enum omp_tsort_mark { + UNVISITED, + TEMPORARY, + PERMANENT +}; + +struct omp_mapping_group { + tree *grp_start; + tree grp_end; + omp_tsort_mark mark; + struct omp_mapping_group *sibling; + struct omp_mapping_group *next; +}; + +__attribute__((used)) static void +debug_mapping_group (omp_mapping_group *grp) +{ + tree tmp = OMP_CLAUSE_CHAIN (grp->grp_end); + OMP_CLAUSE_CHAIN (grp->grp_end) = NULL; + debug_generic_expr (*grp->grp_start); + OMP_CLAUSE_CHAIN (grp->grp_end) = tmp; +} + +/* Return the OpenMP "base pointer" of an expression EXPR, or NULL if there + isn't one. This needs improvement. */ + +static tree +omp_get_base_pointer (tree expr) +{ + while (TREE_CODE (expr) == ARRAY_REF) +expr = TREE_OPERAND (expr, 0); + + while (TREE_CODE (expr) == COMPONENT_REF +&& (DECL_P (TREE_OPERAND (expr, 0)) +|| (TREE_CODE (TREE_OPERAND (expr, 0)) == COMPONENT_REF) +|| TREE_CODE (
[PATCH 2/8] Remove omp_target_reorder_clauses
This patch has been split out from the previous one to avoid a confusingly-interleaved diff. The two patches should probably be committed squashed together. 2021-10-01 Julian Brown gcc/ * gimplify.c (is_or_contains_p, omp_target_reorder_clauses): Delete. --- gcc/gimplify.cc | 207 1 file changed, 207 deletions(-) diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 968cbd263f5..b667012a118 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -8738,213 +8738,6 @@ extract_base_bit_offset (tree base, tree *base_ref, poly_int64 *bitposp, return base; } -#if 0 -/* Returns true if EXPR is or contains (as a sub-component) BASE_PTR. */ - -static bool -is_or_contains_p (tree expr, tree base_ptr) -{ - if ((TREE_CODE (expr) == INDIRECT_REF && TREE_CODE (base_ptr) == MEM_REF) - || (TREE_CODE (expr) == MEM_REF && TREE_CODE (base_ptr) == INDIRECT_REF)) -return operand_equal_p (TREE_OPERAND (expr, 0), - TREE_OPERAND (base_ptr, 0)); - while (!operand_equal_p (expr, base_ptr)) -{ - if (TREE_CODE (base_ptr) == COMPOUND_EXPR) - base_ptr = TREE_OPERAND (base_ptr, 1); - if (TREE_CODE (base_ptr) == COMPONENT_REF - || TREE_CODE (base_ptr) == POINTER_PLUS_EXPR - || TREE_CODE (base_ptr) == SAVE_EXPR) - base_ptr = TREE_OPERAND (base_ptr, 0); - else - break; -} - return operand_equal_p (expr, base_ptr); -} - - -/* Implement OpenMP 5.x map ordering rules for target directives. There are - several rules, and with some level of ambiguity, hopefully we can at least - collect the complexity here in one place. */ - -static void -omp_target_reorder_clauses (tree *list_p) -{ - /* Collect refs to alloc/release/delete maps. */ - auto_vec ard; - tree *cp = list_p; - while (*cp != NULL_TREE) -if (OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP - && (OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ALLOC - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_RELEASE - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_DELETE)) - { - /* Unlink cp and push to ard. */ - tree c = *cp; - tree nc = OMP_CLAUSE_CHAIN (c); - *cp = nc; - ard.safe_push (c); - - /* Any associated pointer type maps should also move along. */ - while (*cp != NULL_TREE - && OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP - && (OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_FIRSTPRIVATE_REFERENCE - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_FIRSTPRIVATE_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ATTACH_DETACH - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ALWAYS_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_TO_PSET)) - { - c = *cp; - nc = OMP_CLAUSE_CHAIN (c); - *cp = nc; - ard.safe_push (c); - } - } -else - cp = &OMP_CLAUSE_CHAIN (*cp); - - /* Link alloc/release/delete maps to the end of list. */ - for (unsigned int i = 0; i < ard.length (); i++) -{ - *cp = ard[i]; - cp = &OMP_CLAUSE_CHAIN (ard[i]); -} - *cp = NULL_TREE; - - /* OpenMP 5.0 requires that pointer variables are mapped before - its use as a base-pointer. */ - auto_vec atf; - for (tree *cp = list_p; *cp; cp = &OMP_CLAUSE_CHAIN (*cp)) -if (OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP) - { - /* Collect alloc, to, from, to/from clause tree pointers. */ - gomp_map_kind k = OMP_CLAUSE_MAP_KIND (*cp); - if (k == GOMP_MAP_ALLOC - || k == GOMP_MAP_TO - || k == GOMP_MAP_FROM - || k == GOMP_MAP_TOFROM - || k == GOMP_MAP_ALWAYS_TO - || k == GOMP_MAP_ALWAYS_FROM - || k == GOMP_MAP_ALWAYS_TOFROM) - atf.safe_push (cp); - } - - for (unsigned int i = 0; i < atf.length (); i++) -if (atf[i]) - { - tree *cp = atf[i]; - tree decl = OMP_CLAUSE_DECL (*cp); - if (TREE_CODE (decl) == INDIRECT_REF || TREE_CODE (decl) == MEM_REF) - { - tree base_ptr = TREE_OPERAND (decl, 0); - STRIP_TYPE_NOPS (base_ptr); - for (unsigned int j = i + 1; j < atf.length (); j++) - if (atf[j]) - { - tree *cp2 = atf[j]; - tree decl2 = OMP_CLAUSE_DECL (*cp2); - - decl2 = OMP_CLAUSE_DECL (*cp2); - if (is_or_contains_p (decl2, base_ptr)) - { - /* Move *cp2 to before *cp. */ - tree c = *cp2; - *cp2 = OMP_CLAUSE_CHAIN (c); - OMP_CLAUSE_CHAIN (c) = *cp; - *cp = c; - - if (*cp2 != NULL_TREE - && OMP_CLAUSE_CODE (*cp2) == OMP_CLAUSE_MAP - && OMP_CLAUSE_MAP_KIND (*cp2) == GOMP_MAP_
[PATCH 3/8] OpenMP/OpenACC struct sibling list gimplification extension and rework
This patch is a combination of several previously-posted patches, rebased and squashed together, and with a couple of additional bugfixes: "Rewrite GOMP_MAP_ATTACH_DETACH mappings unconditionally" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585440.html "OpenMP/OpenACC: Move array_ref/indirect_ref handling code out of extract_base_bit_offset" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585441.html "OpenACC/OpenMP: Refactor struct lowering in gimplify.c" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585442.html "OpenACC: Rework indirect struct handling in gimplify.c" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585443.html "Remove base_ind/base_ref handling from extract_base_bit_offset" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585445.html "OpenMP/OpenACC: Hoist struct sibling list handling in gimplification" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585448.html "OpenMP: Allow array ref components for C & C++" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585449.html "OpenMP: Fix non-zero attach/detach bias for struct dereferences" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585451.html "OpenMP: Handle reference-typed struct members" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585453.html "OpenACC: Make deep-copy-arrayofstruct.c a libgomp/runtime test" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585452.html This resulting patch thus contains all the struct gimplification changes (and associated fixes) previously posted: the hope was that this would reduce the amount of churn to be reviewed, although maybe this patch does too much at once. The changes as a whole are all relating to increasing the variety of supported expressions in OpenMP and OpenACC map clauses, though. 2022-02-18 Julian Brown gcc/c-family/ * c-common.h (c_omp_decompose_attachable_address): Add prototype. * c-omp.cc (c_omp_decompose_attachable_address): New function. gcc/c/ * c-typeck.cc (handle_omp_array_sections): Handle attach/detach for struct dereferences with non-zero bias. (c_finish_omp_clauses): Allow ARRAY_REF components. gcc/cp/ * semantics.cc (handle_omp_array_section): Handle attach/detach for struct dereferences with non-zero bias. (finish_omp_clauses): Allow ARRAY_REF components. Handle reference-typed members. gcc/fortran/ * trans-openmp.cc (gfc_trans_omp_clauses): Don't create GOMP_MAP_TO_PSET mappings for class metadata, nor GOMP_MAP_POINTER mappings for POINTER_TYPE_P decls. gcc/ * gimplify.c (gimplify_omp_var_data): Remove GOVD_MAP_HAS_ATTACHMENTS. (insert_struct_comp_map): Refactor function into... (build_struct_comp_nodes): This new function. Remove list handling and improve self-documentation. (extract_base_bit_offset): Remove BASE_REF, OFFSETP parameters. Move code to strip outer parts of address out of function, but strip no-op conversions. (omp_mapping_group): Add DELETED field for use during reindexing. (strip_components_and_deref, strip_indirections): New functions. (omp_group_last, omp_group_base): Add GOMP_MAP_STRUCT handling. (omp_gather_mapping_groups): Initialise DELETED field for new groups. (omp_index_mapping_groups): Notice DELETED groups when (re)indexing. (insert_node_after, move_node_after, move_nodes_after, move_concat_nodes_after): New helper functions. (accumulate_sibling_list): New function to build up GOMP_MAP_STRUCT node groups for sibling lists. Outlined from gimplify_scan_omp_clauses. (omp_build_struct_sibling_lists): New function. (gimplify_scan_omp_clauses): Remove struct_map_to_clause, struct_seen_clause, struct_deref_set. Call omp_build_struct_sibling_lists as pre-pass instead of handling sibling lists in the function's main processing loop. (gimplify_adjust_omp_clauses_1): Remove GOVD_MAP_HAS_ATTACHMENTS handling, unused now. * omp-low.cc (scan_sharing_clauses): Handle pointer-type indirect struct references, and references to pointers to structs also. gcc/testsuite/ * g++.dg/goacc/member-array-acc.C: New test. * g++.dg/gomp/member-array-omp.C: New test. * g++.dg/gomp/target-3.C: Update expected output. * g++.dg/gomp/target-lambda-1.C: Likewise. * g++.dg/gomp/target-this-2.C: Likewise. * c-c++-common/goacc/deep-copy-arrayofstruct.c: Move test from here. libgomp/ * testsuite/libgomp.c-c++-common/baseptrs-1.c: Add test. * testsuite/libgomp.c-c++-common/baseptrs-2.c: Add test. * testsuite/libgomp.c++/baseptrs-3.C: Add test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test. * testsuite/libgom
[PATCH 4/8] OpenMP: Add inspector class to unify mapped address analysis
Several places in the C and C++ front-ends dig through OpenMP addresses from "map" clauses (etc.) in order to determine whether they are component accesses that need "attach" operations, check duplicate mapping clauses, and so on. When we're extending support for more kinds of lvalues in map clauses, it seems helpful to bring these all into one place in order to keep all the analyses in sync, and to make it easier to reason about which kinds of expressions are supported. This patch introduces an "address inspector" class for that purpose, and adjusts the C and C++ front-ends to use it. 2021-11-15 Julian Brown gcc/c-family/ * c-common.h (c_omp_address_inspector): New class. * c-omp.c (c_omp_address_inspector::init, c_omp_address_inspector::analyze_components, c_omp_address_inspector::map_supported_p, c_omp_address_inspector::mappable_type): New methods. gcc/c/ * c-typeck.c (handle_omp_array_sections_1, c_finish_omp_clauses): Use c_omp_address_inspector class. gcc/cp/ * semantics.c (cp_omp_address_inspector): New class, derived from c_omp_address_inspector. (handle_omp_array_sections_1): Use cp_omp_address_inspector class to analyze OpenMP map clause expressions. Support POINTER_PLUS_EXPR. (finish_omp_clauses): Likewise. Support some additional kinds of lvalues in map clauses. gcc/testsuite/ * g++.dg/gomp/unmappable-component-1.C: New test. Make map_supported_p strip nops lvalue thing, generic part --- gcc/c-family/c-common.h | 44 +++ gcc/c-family/c-omp.cc | 150 +++ gcc/c/c-typeck.cc | 199 +++--- gcc/cp/semantics.cc | 253 ++ .../g++.dg/gomp/unmappable-component-1.C | 21 ++ libgomp/testsuite/libgomp.c++/class-array-1.C | 59 6 files changed, 400 insertions(+), 326 deletions(-) create mode 100644 gcc/testsuite/g++.dg/gomp/unmappable-component-1.C create mode 100644 libgomp/testsuite/libgomp.c++/class-array-1.C diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index d00838a57f0..af8dc6c4e4f 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -1254,6 +1254,50 @@ extern const char *c_omp_map_clause_name (tree, bool); extern void c_omp_adjust_map_clauses (tree, bool); extern tree c_omp_decompose_attachable_address (tree t, tree *virtbase); +class c_omp_address_inspector +{ + tree clause; + tree orig; + tree deref_toplevel; + tree outer_virtual_base; + tree root_term; + bool component_access; + bool indirections; + int map_supported; + +protected: + virtual bool reference_ref_p (tree) { return false; } + virtual bool processing_template_decl_p () { return false; } + virtual bool mappable_type (tree t); + virtual void emit_unmappable_type_notes (tree) { } + +public: + c_omp_address_inspector (tree c, tree t) +: clause (c), orig (t), deref_toplevel (NULL_TREE), + outer_virtual_base (NULL_TREE), root_term (NULL_TREE), + component_access (false), indirections (false), map_supported (-1) + { } + + ~c_omp_address_inspector () {} + + virtual void init (); + + tree analyze_components (bool); + + tree get_deref_toplevel () { return deref_toplevel; } + tree get_outer_virtual_base () { return outer_virtual_base; } + tree get_root_term () { gcc_assert (root_term); return root_term; } + bool component_access_p () { return component_access; } + + bool indir_component_ref_p () +{ + gcc_assert (!component_access || root_term != NULL_TREE); + return component_access && indirections; +} + + bool map_supported_p (); +}; + enum c_omp_directive_kind { C_OMP_DIR_STANDALONE, C_OMP_DIR_CONSTRUCT, diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc index 0e76eb4392e..e3270f781b0 100644 --- a/gcc/c-family/c-omp.cc +++ b/gcc/c-family/c-omp.cc @@ -3113,6 +3113,156 @@ c_omp_adjust_map_clauses (tree clauses, bool is_target) } } +/* This could just be done in the constructor, but we need to call the + subclass's version of reference_ref_p, etc. */ + +void +c_omp_address_inspector::init () +{ + tree t = orig; + + gcc_assert (TREE_CODE (t) != ARRAY_REF); + + /* We may have a reference-typed component access at the outermost level + that has had convert_from_reference called on it. Look through that + access. */ + if (reference_ref_p (t) + && TREE_CODE (TREE_OPERAND (t, 0)) == COMPONENT_REF) +{ + t = TREE_OPERAND (t, 0); + deref_toplevel = t; +} + else +deref_toplevel = t; + + /* Strip off expression nodes that may enclose a COMPONENT_REF. Look through + references, but not indirections through pointers. */ + while (1) +{ + if (TREE_CODE (t) == COMPOUND_EXPR) + { + t = TREE_OPERAND (t, 1); + STRIP_NOPS (t); + } + else if (TREE_CODE (t) == POINTER_PLUS_EXPR +
[PATCH 5/8] OpenMP: lvalue parsing for map clauses (C++)
This patch changes parsing for OpenMP map clauses in C++ to use the generic expression parser, hence adds support for parsing general lvalues (as required by OpenMP 5.0+). So far only a few new types of expression are actually supported throughout compilation (including everything in the testsuite of course, and newly-added tests), and we attempt to reject unsupported expressions in order to avoid surprises for the user. This version of the patch relaxes the check for supported map expressions to include LVALUE_EXPR, which makes pointer-to-member mappings stop "accidentally" failing the check. However such mappings don't actually work yet, so they have been explicitly disallowed elsewhere. Supporting pointer-to-member mappings can probably be done most easily by mapping the whole of the containing struct, and mapping the offsets as firstprivate or similar. That's a little fiddly to get working though, and isn't unambiguously the right thing to do, so that might need more thought. 2022-02-18 Julian Brown gcc/c-family/ * c-omp.cc (c_omp_decompose_attachable_address): Handle more types of expressions. gcc/cp/ * error.cc (dump_expr): Handle OMP_ARRAY_SECTION. * parser.cc (cp_parser_new): Initialize parser->omp_array_section_p. (cp_parser_postfix_open_square_expression): Support OMP_ARRAY_SECTION parsing. (cp_parser_omp_var_list_no_open): Remove ALLOW_DEREF parameter, add MAP_LVALUE in its place. Supported generalised lvalue parsing for map clauses. (cp_parser_omp_var_list): Remove ALLOW_DEREF parameter, add MAP_LVALUE. Pass to cp_parser_omp_var_list_no_open. (cp_parser_oacc_data_clause, cp_parser_omp_all_clauses): Update calls to cp_parser_omp_var_list. * parser.h (cp_parser): Add omp_array_section_p field. * semantics.cc (handle_omp_array_sections_1): Handle more types of map expression. (handle_omp_array_section): Handle non-DECL_P attachment points. (finish_omp_clauses): Check for supported types of expression. gcc/ * gimplify.cc (accumulate_sibling_list): Handle reference-typed component accesses. Fix support for non-DECL_P struct bases. (omp_ptrmem_p): New function. (omp_build_struct_sibling_lists): Support length-two group for synthesized inner struct mapping. Explicitly disallow pointers to members. * tree-pretty-print.c (dump_generic_node): Support OMP_ARRAY_SECTION. * tree.def (OMP_ARRAY_SECTION): New tree code. gcc/testsuite/ * c-c++-common/gomp/map-6.c: Update expected output. * g++.dg/gomp/pr67522.C: Likewise. * g++.dg/gomp/ind-base-3.C: New test. * g++.dg/gomp/map-assignment-1.C: New test. * g++.dg/gomp/map-inc-1.C: New test. * g++.dg/gomp/map-lvalue-ref-1.C: New test. * g++.dg/gomp/map-ptrmem-1.C: New test. * g++.dg/gomp/map-ptrmem-2.C: New test. * g++.dg/gomp/map-static-cast-lvalue-1.C: New test. * g++.dg/gomp/map-ternary-1.C: New test. * g++.dg/gomp/member-array-2.C: New test. libgomp/ * testsuite/libgomp.c++/ind-base-1.C: New test. * testsuite/libgomp.c++/ind-base-2.C: New test. * testsuite/libgomp.c++/map-comma-1.C: New test. * testsuite/libgomp.c++/map-rvalue-ref-1.C: New test. * testsuite/libgomp.c++/member-array-1.C: New test. * testsuite/libgomp.c++/struct-ref-1.C: New test. --- gcc/c-family/c-omp.cc | 25 ++- gcc/cp/error.cc | 9 + gcc/cp/parser.cc | 141 +-- gcc/cp/parser.h | 3 + gcc/cp/semantics.cc | 55 +- gcc/gimplify.cc | 71 +++- gcc/testsuite/c-c++-common/gomp/map-6.c | 4 +- gcc/testsuite/g++.dg/gomp/ind-base-3.C| 38 gcc/testsuite/g++.dg/gomp/map-assignment-1.C | 12 ++ gcc/testsuite/g++.dg/gomp/map-inc-1.C | 10 ++ gcc/testsuite/g++.dg/gomp/map-lvalue-ref-1.C | 19 ++ gcc/testsuite/g++.dg/gomp/map-ptrmem-1.C | 36 gcc/testsuite/g++.dg/gomp/map-ptrmem-2.C | 39 + .../g++.dg/gomp/map-static-cast-lvalue-1.C| 17 ++ gcc/testsuite/g++.dg/gomp/map-ternary-1.C | 20 +++ gcc/testsuite/g++.dg/gomp/member-array-2.C| 86 ++ gcc/testsuite/g++.dg/gomp/pr67522.C | 2 +- gcc/tree-pretty-print.cc | 14 ++ gcc/tree.def | 3 + libgomp/testsuite/libgomp.c++/ind-base-1.C| 162 ++ libgomp/testsuite/libgomp.c++/ind-base-2.C| 49 ++ libgomp/testsuite/libgomp.c++/map-comma-1.C | 15 ++ .../testsuite/libgomp.c++/map-rvalue-ref-1.C | 22 +++ .../testsuite/libgomp.c++/member-array-1.C| 89 ++ libgomp/testsuite/libgomp.c++/struct-ref-1.C | 97 +++ 25 files
[PATCH 7/8] Use OMP_ARRAY_SECTION instead of TREE_LIST in C++ FE
This patch changes the representation of OMP array sections in the C++ front end to use the new OMP_ARRAY_SECTION tree code instead of a TREE_LIST. This is important for "declare mapper" support, because the array section representation may stick around longer (in "declare mapper" definitions), and special-case handling TREE_LIST becomes necessary in more places, which starts to become unwieldy. 2022-02-18 Julian Brown gcc/c-family/ * c-omp.cc (c_omp_split_clauses): Support OMP_ARRAY_SECTION. gcc/cp/ * parser.cc (cp_parser_omp_var_list_no_open): Use OMP_ARRAY_SECTION code instead of TREE_LIST to represent OpenMP array sections. * pt.cc (tsubst_copy, tsubst_omp_clause_decl, tsubst_copy_and_build): Add OMP_ARRAY_SECTION support. * semantics.cc (handle_omp_array_sections_1, handle_omp_array_sections, cp_oacc_check_attachments, finish_omp_clauses): Use OMP_ARRAY_SECTION instead of TREE_LIST where appropriate. * gimplify.cc (gimplify_expr): Ensure OMP_ARRAY_SECTION has been processed out before gimplification. --- gcc/c-family/c-omp.cc | 14 +++ gcc/cp/parser.cc | 15 gcc/cp/pt.cc | 52 gcc/cp/semantics.cc | 56 ++- gcc/gimplify.cc | 3 +++ 5 files changed, 109 insertions(+), 31 deletions(-) diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc index ad34f817029..3f5f9637213 100644 --- a/gcc/c-family/c-omp.cc +++ b/gcc/c-family/c-omp.cc @@ -2662,6 +2662,9 @@ c_omp_split_clauses (location_t loc, enum tree_code code, } else if (TREE_CODE (OMP_CLAUSE_DECL (c)) == TREE_LIST) { + /* TODO: This can go away once we transition all uses of +TREE_LIST for representing OMP array sections to +OMP_ARRAY_SECTION. */ tree t; for (t = OMP_CLAUSE_DECL (c); TREE_CODE (t) == TREE_LIST; t = TREE_CHAIN (t)) @@ -2670,6 +2673,17 @@ c_omp_split_clauses (location_t loc, enum tree_code code, bitmap_clear_bit (&allocate_head, DECL_UID (t)); break; } + else if (TREE_CODE (OMP_CLAUSE_DECL (c)) == OMP_ARRAY_SECTION) + { + tree t; + for (t = OMP_CLAUSE_DECL (c); + TREE_CODE (t) == OMP_ARRAY_SECTION; + t = TREE_OPERAND (t, 0)) + ; + if (DECL_P (t)) + bitmap_clear_bit (&allocate_head, DECL_UID (t)); + break; + } /* FALLTHRU */ case OMP_CLAUSE_PRIVATE: case OMP_CLAUSE_FIRSTPRIVATE: diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index efb65543e11..8d5ae9c44d0 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -36639,11 +36639,14 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, decl = TREE_OPERAND (decl, 0); for (int i = dims.length () - 1; i >= 0; i--) - decl = tree_cons (dims[i].low_bound, dims[i].length, decl); + decl = build3_loc (input_location, OMP_ARRAY_SECTION, + TREE_TYPE (decl), decl, dims[i].low_bound, + dims[i].length); } else if (TREE_CODE (decl) == INDIRECT_REF) { bool ref_p = REFERENCE_REF_P (decl); + tree type = TREE_TYPE (decl); /* Turn *foo into the representation previously used for foo[0]. */ @@ -36653,7 +36656,8 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, /* ...but don't add the [0:1] representation for references (because they have special handling elsewhere). */ if (!ref_p) - decl = tree_cons (integer_zero_node, integer_one_node, decl); + decl = build3_loc (input_location, OMP_ARRAY_SECTION, type, + decl, integer_zero_node, integer_one_node); } else if (TREE_CODE (decl) == ARRAY_REF) { @@ -36662,7 +3,8 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, decl = TREE_OPERAND (decl, 0); STRIP_NOPS (decl); - decl = tree_cons (idx, integer_one_node, decl); + decl = build3_loc (input_location, OMP_ARRAY_SECTION, +TREE_TYPE (decl), decl, idx, integer_one_node); } else if (TREE_CODE (decl) == NON_LVALUE_EXPR || CONVERT_EXPR_P (decl)) @@ -36837,7 +36842,9 @@ cp_parser_omp_var_list_no_open (cp_parser
[PATCH 6/8] OpenMP: lvalue parsing for map clauses (C)
This patch adds support for parsing general lvalues for OpenMP "map" clauses to the C front-end, similar to the previous patch for C++. This version of the patch fixes several omissions regarding non-DECL_P root terms in map clauses (i.e. "*foo" in "(*foo)->ptr->arr[:N]") -- similar to the cp/semantics.c changes in the previous patch -- and adds a couple of new tests. 2021-11-24 Julian Brown gcc/c/ * c-parser.c (c_parser_postfix_expression_after_primary): Add support for OpenMP array section parsing. (c_parser_omp_variable_list): Change ALLOW_DEREF parameter to MAP_LVALUE. Support parsing of general lvalues in "map" clauses. (c_parser_omp_var_list_parens): Change ALLOW_DEREF parameter to MAP_LVALUE. Update call to c_parser_omp_variable_list. (c_parser_oacc_data_clause, c_parser_omp_clause_to, c_parser_omp_clause_from): Update calls to c_parser_omp_var_list_parens. * c-tree.h (c_omp_array_section_p): Add extern declaration. * c-typeck.c (c_omp_array_section_p): Add flag. (mark_exp_read): Support OMP_ARRAY_SECTION. (handle_omp_array_sections_1): Handle more kinds of expressions. (handle_omp_array_sections): Handle non-DECL_P attachment points. (c_finish_omp_clauses): Check for supported expression types. Support non-DECL_P root term for map clauses. gcc/testsuite/ * c-c++-common/gomp/map-1.c: Adjust expected output. * c-c++-common/gomp/map-6.c: Likewise. libgomp/ * testsuite/libgomp.c-c++-common/ind-base-4.c: New test. * testsuite/libgomp.c-c++-common/unary-ptr-1.c: New test. --- gcc/c/c-parser.cc | 150 +++--- gcc/c/c-tree.h| 1 + gcc/c/c-typeck.cc | 45 +- gcc/testsuite/c-c++-common/gomp/map-1.c | 3 +- gcc/testsuite/c-c++-common/gomp/map-6.c | 2 + .../libgomp.c-c++-common/ind-base-4.c | 50 ++ .../libgomp.c-c++-common/unary-ptr-1.c| 16 ++ 7 files changed, 243 insertions(+), 24 deletions(-) create mode 100644 libgomp/testsuite/libgomp.c-c++-common/ind-base-4.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/unary-ptr-1.c diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index e9086c58524..cc590e56e75 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -10486,7 +10486,7 @@ c_parser_postfix_expression_after_primary (c_parser *parser, struct c_expr expr) { struct c_expr orig_expr; - tree ident, idx; + tree ident, idx, len; location_t sizeof_arg_loc[3], comp_loc; tree sizeof_arg[3]; unsigned int literal_zero_mask; @@ -10505,15 +10505,44 @@ c_parser_postfix_expression_after_primary (c_parser *parser, case CPP_OPEN_SQUARE: /* Array reference. */ c_parser_consume_token (parser); - idx = c_parser_expression (parser).value; - c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE, -"expected %<]%>"); - start = expr.get_start (); - finish = parser->tokens_buf[0].location; - expr.value = build_array_ref (op_loc, expr.value, idx); - set_c_expr_source_range (&expr, start, finish); - expr.original_code = ERROR_MARK; - expr.original_type = NULL; + idx = len = NULL_TREE; + if (!c_omp_array_section_p + || c_parser_next_token_is_not (parser, CPP_COLON)) + idx = c_parser_expression (parser).value; + + if (c_omp_array_section_p + && c_parser_next_token_is (parser, CPP_COLON)) + { + c_parser_consume_token (parser); + if (c_parser_next_token_is_not (parser, CPP_CLOSE_SQUARE)) + len = c_parser_expression (parser).value; + + c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE, +"expected %<]%>"); + +/* NOTE: We are reusing using the type of the whole array as the + type of the array section here, which isn't necessarily + entirely correct. Might need revisiting. */ + start = expr.get_start (); + finish = parser->tokens_buf[0].location; + expr.value = build3_loc (op_loc, OMP_ARRAY_SECTION, + TREE_TYPE (expr.value), expr.value, + idx, len); + set_c_expr_source_range (&expr, start, finish); + expr.original_code = ERROR_MARK; + expr.original_type = NULL; + } + else + { + c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE, +"expected %<]%>"); + start = expr.get_start (); + finish = parser->tokens_buf[0].location; + expr.value = build_array_ref (op_loc, expr.value, i
[PATCH 8/8] OpenMP 5.0 "declare mapper" support for C++
This patch implements OpenMP 5.0 "declare mapper" support for C++ -- except for arrays of structs with mappers, which are TBD. I've taken cues from the existing "declare reduction" support where appropriate, though obviously the details of implementation differ somewhat (in particular, "declare mappers" must survive longer, until gimplification time). Both named and unnamed (default) mappers are supported, and both explicitly-mapped structures and implicitly-mapped struct-typed variables used within an offload region are supported. For the latter, we need a way to communicate to the middle end which mapper (visible, in-scope) is the right one to use -- for that, we scan the target body in the front end for uses of structure (etc.) types, and create artificial "mapper binding" clauses to associate types with visible mappers. (It doesn't matter too much if we create these mapper bindings a bit over-eagerly, since they will only be used if needed later during gimplification.) Another difficulty concerns the order in which an OpenMP offload region body's clauses are processed relative to its body: in order to add clauses for instantiated mappers, we need to have processed the body already in order to find which variables have been used, but at present the region's clauses are processed strictly before the body. So, this patch adds a second clause-processing step after gimplification of the body in order to add clauses resulting from instantiated mappers. (C and Fortran FE support are TBD.) Tested (alongside previous patches) with offloading to NVPTX, and bootstrapped. Any comments on this general approach? Thoughts on handling arrays of custom-mapped structs? OK (for stage 1)? Thanks, Julian 2022-02-18 Julian Brown gcc/cp/ * cp-gimplify.cc (cxx_omp_finish_mapper_clauses): New function. * cp-objcp-common.h (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES): Add new langhook. * cp-tree.h (lang_decl_fn): Add omp_declare_mapper_p field. (DECL_OMP_DECLARE_MAPPER_P): New function. (omp_mapper_id, cp_check_omp_declare_mapper, omp_instantiate_mappers, cxx_omp_finish_mapper_clauses): Add prototypes. * decl.cc (duplicate_decls): Support DECL_OMP_DECLARE_MAPPER_P functions. (finish_function): Likewise. * mangle.cc (decl_mangling_context): Likewise. * name-lookup.cc (set_decl_context_in_fn): Likewise. * parser.cc (cp_parser_class_specifier_1): Likewise. (cp_parser_omp_declare_mapper, cp_parser_omp_declare_mapper_maplist): New functions. (cp_parser_late_parsing_for_member): Support DECL_OMP_DECLARE_MAPPER_P functions. (cp_parser_omp_clause_map): Add KIND parameter. Support "mapper" modifier. (cp_parser_omp_all_clauses): Add KIND argument to cp_parser_omp_clause_map call. (cp_parser_omp_target): Call omp_instantiate_mappers before finish_omp_clauses. (cp_parser_omp_declare): Add "declare mapper" support. * pt.cc (instantiate_class_template_1, tsubst_function_decl): Support DECL_OMP_DECLARE_MAPPER_P functions. (tsubst_omp_clauses): Call omp_instantiate_mappers before finish_omp_clauses, for target regions. (tsubst_expr): Support DECL_OMP_DECLARE_MAPPER_P functions. (tsubst_omp_udm): New function. (instantiate_body): Support DECL_OMP_DECLARE_MAPPER_P functions. * semantics.cc (gimplify.h): Include. (expand_or_defer_fn_1): Support DECL_OMP_DECLARE_MAPPER_P functions. (omp_mapper_id, omp_mapper_lookup, omp_extract_mapper_directive, cp_check_omp_declare_mapper): New functions. (remap_mapper_decl_info): New struct. (remap_mapper_decl_1, omp_instantiate_mapper, omp_instantiate_mappers): New functions. (finish_omp_clauses): Delete GOMP_MAP_PUSH_MAPPER_NAME and GOMP_MAP_POP_MAPPER_NAME artificial clauses. (mapper_list): New struct. (find_nested_mappers): New function. (omp_target_walk_data): Add MAPPERS field. (finish_omp_target_clauses_r): Scan for uses of struct/union/class type variables. (finish_omp_target_clauses): Create artificial mapper binding clauses for used structs/unions/classes in offload region. gcc/fortran/ * parse.cc (tree.h, fold-const.h, tree-hash-traits.h): Add includes (for additions to omp-general.h). gcc/ * gimplify.cc (gimplify_omp_ctx): Add IMPLICIT_MAPPERS field. (new_omp_context): Initialise IMPLICIT_MAPPERS hash map. (delete_omp_context): Delete IMPLICIT_MAPPERS hash map. (instantiate_mapper_info, remap_mapper_decl_info): New structs. (remap_mapper_decl_1, omp_instantiate_mapper, handled_struct_p, omp_instantiate_implicit_mappers, new_omp_context_for_scan): New functions. (gimplify_scan_omp_clauses): Add optional USE_MAPPERS parameter. Instantia
Re: [PATCH] rs6000: Workaround for new ifcvt behavior [PR104335]
On 2/17/22 1:04 PM, Robin Dapp via Gcc-patches wrote: >> Please send patches as plain text, not as base64. > > It seems like Thunderbird does not support this anymore since later > versions, grml. Probably need to look for another mail client. I use Thunderbird with no problems. I use the Quicktext addon/extension and it gives me a toolbar option "Other" which has a "Insert file as text" which allows adding a patch to your email inline with no whitespace munging. Peter
[PATCH] openmp: Improve handling of nested OpenMP metadirectives in C and C++ (was: Re: [PATCH 1/7] openmp: Add C support for parsing metadirectives)
This patch (to be applied on top of the metadirective patch series) addresses issues found in the C/C++ parsers when nested metadirectives are used. analyze_metadirective_body when encountering code like: #pragma omp metadirective when {set={...}: A) #pragma omp metadirective when (set={...}: B) would stop just before ': B' before it naively assumes that the '}' marks the end of the body associated with the first metadirective, when it needs to include the whole of the second metadirective plus its associated body. This is fixed by checking that the nesting level of parentheses is zero as well before stopping the gathering of tokens. The assert on the remaining tokens after parsing a clause can fail (resulting in an ICE) if there is a parse error in the directive or the body, since in that case not all tokens may be processed before parsing aborts early. The assert is therefore not enforced if any parse errors occur in the clause. I have also moved the handling of the metadirective pragma from c_parser_omp_construct to c_parser_pragma (and their C++ equivalents), since c_parser_omp_construct has some checks that do not apply to metadirectives. KwokFrom a9e4936b8476b97f11bb81b416ef3d28fa60cd37 Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Fri, 18 Feb 2022 19:00:57 + Subject: [PATCH] openmp: Improve handling of nested OpenMP metadirectives in C and C++ This patch fixes a misparsing issue when encountering code like: #pragma omp metadirective when {={...}: A) #pragma omp metadirective when (={...}: B) When called for the first metadirective, analyze_metadirective_body would stop just before the colon in the second metadirective because it naively assumes that the '}' marks the end of a code block. The assertion for clauses to end parsing at the same point is now disabled if a parse error has occurred during the parsing of the clause, since some tokens may not be consumed if a parse error cuts parsing short. 2022-02-18 Kwok Cheung Yeung gcc/c/ * c-parser.cc (c_parser_omp_construct): Move handling of PRAGMA_OMP_METADIRECTIVE from here... (c_parser_pragma): ...to here. (analyze_metadirective_body): Check that the bracket nesting level is also zero before stopping the adding of tokens on encountering a close brace. (c_parser_omp_metadirective): Modify function signature and update. Do not assert on remaining tokens if there has been a parse error. gcc/cp/ * parser.cc (cp_parser_omp_construct): Move handling of PRAGMA_OMP_METADIRECTIVE from here... (cp_parser_pragma): ...to here. (analyze_metadirective_body): Check that the bracket nesting level is also zero before stopping the adding of tokens on encountering a close brace. (cp_parser_omp_metadirective): Modify function signature and update. Do not assert on remaining tokens if there has been a parse error. gcc/testsuite/ * c-c++-common/gomp/metadirective-1.c (f): Add test for improperly nested metadirectives. --- gcc/c/c-parser.cc | 47 +-- gcc/cp/parser.cc | 33 ++--- .../c-c++-common/gomp/metadirective-1.c | 13 + 3 files changed, 51 insertions(+), 42 deletions(-) diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index 58fcbb398ee..6a134e0fb50 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -1592,6 +1592,7 @@ static void c_parser_omp_taskwait (c_parser *); static void c_parser_omp_taskyield (c_parser *); static void c_parser_omp_cancel (c_parser *); static void c_parser_omp_nothing (c_parser *); +static void c_parser_omp_metadirective (c_parser *, bool *); enum pragma_context { pragma_external, pragma_struct, pragma_param, pragma_stmt, pragma_compound }; @@ -1600,8 +1601,6 @@ static bool c_parser_omp_cancellation_point (c_parser *, enum pragma_context); static bool c_parser_omp_target (c_parser *, enum pragma_context, bool *); static void c_parser_omp_end_declare_target (c_parser *); static bool c_parser_omp_declare (c_parser *, enum pragma_context); -static tree c_parser_omp_metadirective (location_t, c_parser *, char *, - omp_clause_mask, tree *, bool *); static void c_parser_omp_requires (c_parser *); static bool c_parser_omp_error (c_parser *, enum pragma_context); static bool c_parser_omp_ordered (c_parser *, enum pragma_context, bool *); @@ -12551,6 +12550,10 @@ c_parser_pragma (c_parser *parser, enum pragma_context context, bool *if_p) c_parser_omp_nothing (parser); return false; +case PRAGMA_OMP_METADIRECTIVE: + c_parser_omp_metadirective (parser, if_p); + return true; + case PRAGMA_OMP_ERROR: return c_parser_omp_error (parser, context); @@ -23020,7 +23023,7 @@ analyze_metadirective_body (c_parser *parser,
libgo patch committed: Update to Go1.18rc1 release
This patch updates libgo to the Go1.18rc1 release. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian patch.txt.bz2 Description: application/bzip
[og11][committed] openmp: Improve handling of nested OpenMP metadirectives in C and C++
This patch has been committed to the devel/omp/gcc-11 development branch: 249df772b70f7b9f50f68030d4ea9c25624cc578 openmp: Improve handling of nested OpenMP metadirectives in C and C++ Kwok
[PATCH] c: [PR104506] Fix ICE after error due to change of type to error_mark_node
From: Andrew Pinski The problem here is we end up with an error_mark_node when calling useless_type_conversion_p and that ICEs. STRIP_NOPS/tree_nop_conversion has had a check for the inner type being an error_mark_node since g9a6bb3f78c96 (2000). This just adds the check also to tree_ssa_useless_type_conversion. STRIP_USELESS_TYPE_CONVERSION is mostly used inside the gimplifier and the places where it is used outside of the gimplifier would not be adding too much overhead. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. Thanks, Andrew Pinski PR c/104506 gcc/ChangeLog: * tree-ssa.cc (tree_ssa_useless_type_conversion): Check the inner type before calling useless_type_conversion_p. gcc/testsuite/ChangeLog: * gcc.dg/pr104506-1.c: New test. * gcc.dg/pr104506-2.c: New test. * gcc.dg/pr104506-3.c: New test. --- gcc/testsuite/gcc.dg/pr104506-1.c | 12 gcc/testsuite/gcc.dg/pr104506-2.c | 11 +++ gcc/testsuite/gcc.dg/pr104506-3.c | 11 +++ gcc/tree-ssa.cc | 20 +--- 4 files changed, 47 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/pr104506-1.c create mode 100644 gcc/testsuite/gcc.dg/pr104506-2.c create mode 100644 gcc/testsuite/gcc.dg/pr104506-3.c diff --git a/gcc/testsuite/gcc.dg/pr104506-1.c b/gcc/testsuite/gcc.dg/pr104506-1.c new file mode 100644 index 000..5eb71911b71 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr104506-1.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-std=gnu11" } */ +/* PR c/104506: we used to ICE after the error of + changing the type. */ + +void +foo (double x) +/* { dg-message "note: previous definition" "previous definition" { target *-*-* } .-1 } */ +{ + (void)x; + int x; /* { dg-error "redeclared as different kind of symbol" } */ +} diff --git a/gcc/testsuite/gcc.dg/pr104506-2.c b/gcc/testsuite/gcc.dg/pr104506-2.c new file mode 100644 index 000..3c3c4f8 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr104506-2.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-std=gnu11" } */ +/* PR c/104506: we used to ICE after the error of + changing the type. */ +void +foo (double x) +/* { dg-message "note: previous definition" "previous definition" { target *-*-* } .-1 } */ +{ + x; + int x; /* { dg-error "redeclared as different kind of symbol" } */ +} diff --git a/gcc/testsuite/gcc.dg/pr104506-3.c b/gcc/testsuite/gcc.dg/pr104506-3.c new file mode 100644 index 000..b14deb5cf25 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr104506-3.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* PR c/104506: we used to ICE after the error of + changing the type. */ +double x; +/* { dg-message "note: previous declaration" "previous declaration" { target *-*-* } .-1 } */ +void +foo (void) +{ + x; +} +int x; /* { dg-error "conflicting types" } */ diff --git a/gcc/tree-ssa.cc b/gcc/tree-ssa.cc index 430875ae37a..423dd871d9e 100644 --- a/gcc/tree-ssa.cc +++ b/gcc/tree-ssa.cc @@ -1256,18 +1256,24 @@ delete_tree_ssa (struct function *fn) bool tree_ssa_useless_type_conversion (tree expr) { + tree outer_type, inner_type; + /* If we have an assignment that merely uses a NOP_EXPR to change the top of the RHS to the type of the LHS and the type conversion is "safe", then strip away the type conversion so that we can enter LHS = RHS into the const_and_copies table. */ - if (CONVERT_EXPR_P (expr) - || TREE_CODE (expr) == VIEW_CONVERT_EXPR - || TREE_CODE (expr) == NON_LVALUE_EXPR) -return useless_type_conversion_p - (TREE_TYPE (expr), - TREE_TYPE (TREE_OPERAND (expr, 0))); + if (!CONVERT_EXPR_P (expr) + && TREE_CODE (expr) != VIEW_CONVERT_EXPR + && TREE_CODE (expr) != NON_LVALUE_EXPR) +return false; - return false; + outer_type = TREE_TYPE (expr); + inner_type = TREE_TYPE (TREE_OPERAND (expr, 0)); + + if (inner_type == error_mark_node) +return false; + + return useless_type_conversion_p (outer_type, inner_type); } /* Strip conversions from EXP according to -- 2.17.1
Re: [PATCH] PR c/104506: Tolerate error_mark_node in useless_type_conversion_p.
On Mon, Feb 14, 2022 at 11:39 PM Andrew Pinski wrote: > > On Mon, Feb 14, 2022 at 11:33 PM Richard Biener > wrote: > > > > On Tue, Feb 15, 2022 at 12:58 AM Andrew Pinski via Gcc-patches > > wrote: > > > > > > On Mon, Feb 14, 2022 at 4:54 AM Roger Sayle > > > wrote: > > > > > > > > > > > > > > > > This simple fix to the middle-end, resolves PR c/104506, by adding an > > > > > > > > explicit check for error_mark_node to useless_type_conversion_p. I > > > > first > > > > > > > > trying fixing this in the C front-end, but the type is valid at the > > > > point > > > > > > > > that the NOP_EXPR is created, so the poisoned type leaks to the > > > > middle-end. > > > > > > > > Returning either true or false from useless_type_conversion_p avoids the > > > > > > > > ICE-after-error. Apologies to Andrew Pinski, I hadn't noticed that he'd > > > > > > > > assigned this PR to himself until after my regression testing had > > > > finished. > > > > > > > > > > > > > > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > > > > and > > > > > > > > make -k check with no new failures. Ok for mainline? > > > > > > > > > > > > > > > > > > > > > > > > 2022-02-14 Roger Sayle > > > > > > > > > > > > > > > > gcc/ChangeLog > > > > > > > > PR c/104506 > > > > > > > > * gimple-expr.cc (useless_type_conversion_p): Add a check for > > > > > > > > error_mark_node. > > > > > > I came up with a different patch (attached) which just changes > > > tree_ssa_useless_type_conversion rather than useless_type_conversion_p > > > which I was going to submit but had an issue with my build machine. > > > I did it this way as it was similar to how > > > STRIP_NOPS/tree_nop_conversion was done already. > > > > > > Also from my description of the patch > > > STRIP_USELESS_TYPE_CONVERSION is mostly used inside the gimplifier > > > and the places where it is used outside of the gimplifier would not > > > be adding too much overhead. > > > > > > Though I think Richard Biener's patch is better really. It would be > > > interesting to see how the C++ front-end handles this case, I remember > > > it using integer_type_node in some locations after an error for a > > > type. > > > > If the fix to the C frontend doesn't work out I'd indeed prefer your > > variant. > > Nit: > > > > + outer_type = TREE_TYPE (expr); > > + inner_type = TREE_TYPE (TREE_OPERAND (expr, 0)); > > + > > + if (!inner_type || inner_type == error_mark_node) > > +return false; > > > > unless we get to a case where inner_type == NULL I would not bother > > checking that. > > Understood, I will remove it, I was just copying exactly what was done > in tree_nop_conversion really. The history on the null check seems to > date to 2000 without any explanation really. Finally submitted it as https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590595.html Thanks, Andrew > > Thanks, > Andrew Pinski > > > > > As said, that TREE_TYPE (error_mark_node) is not a type is IMHO bad > > for error recovery. Maybe we really need ERROR_TYPE here. > > > > > Thanks, > > > Andrew Pinski > > > > > > > > > > > > > > > > > > > gcc/testsuite/ChangeLog > > > > > > > > PR c/104506 > > > > > > > > * gcc.dg/pr104506.c: New test case. > > > > > > > > > > > > > > > > > > > > > > > > Roger > > > > > > > > -- > > > > > > > >
[PATCH] Add a restriction on allocate clause (OpenMP 5.0)
An allocate clause in target region must specify an allocator unless the compilation unit has requires construct with dynamic_allocators clause. Current implementation of the allocate clause did not check for this restriction. This patch fills that gap. gcc/ChangeLog: * omp-low.cc (omp_maybe_offloaded_ctx): New prototype. (scan_sharing_clauses): Check a restriction on allocate clause. gcc/testsuite/ChangeLog: * c-c++-common/gomp/allocate-2.c: Add tests. * c-c++-common/gomp/allocate-8.c: New test. * gfortran.dg/gomp/allocate-3.f90: Add tests. * gcc.dg/gomp/pr104517.c: Update. --- gcc/omp-low.cc| 10 ++ gcc/testsuite/c-c++-common/gomp/allocate-2.c | 15 +++ gcc/testsuite/c-c++-common/gomp/allocate-8.c | 18 ++ gcc/testsuite/gcc.dg/gomp/pr104517.c | 18 ++ gcc/testsuite/gfortran.dg/gomp/allocate-3.f90 | 14 ++ 5 files changed, 67 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/gomp/allocate-8.c diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index 77176efe715..658cb3de7d6 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -195,6 +195,7 @@ static vec task_cpyfns; static void scan_omp (gimple_seq *, omp_context *); static tree scan_omp_1_op (tree *, int *, void *); +static bool omp_maybe_offloaded_ctx (omp_context *ctx); #define WALK_SUBSTMTS \ case GIMPLE_BIND: \ @@ -1169,6 +1170,15 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) || !integer_onep (OMP_CLAUSE_ALLOCATE_ALLOCATOR (c)) || OMP_CLAUSE_ALLOCATE_ALIGN (c) != NULL_TREE)) { + /* The allocate clauses that appear on a target construct or on + constructs in a target region must specify an allocator expression + unless a requires directive with the dynamic_allocators clause + is present in the same compilation unit. */ + if (OMP_CLAUSE_ALLOCATE_ALLOCATOR (c) == NULL_TREE + && ((omp_requires_mask & OMP_REQUIRES_DYNAMIC_ALLOCATORS) == 0) + && omp_maybe_offloaded_ctx (ctx)) + error_at (OMP_CLAUSE_LOCATION (c), "% clause must" + " specify an allocator here"); if (ctx->allocate_map == NULL) ctx->allocate_map = new hash_map; tree val = integer_zero_node; diff --git a/gcc/testsuite/c-c++-common/gomp/allocate-2.c b/gcc/testsuite/c-c++-common/gomp/allocate-2.c index cc77efc8ffe..6bb4a8af2e7 100644 --- a/gcc/testsuite/c-c++-common/gomp/allocate-2.c +++ b/gcc/testsuite/c-c++-common/gomp/allocate-2.c @@ -43,3 +43,18 @@ foo (int x, int z) #pragma omp parallel private (x) allocate (0 : x)/* { dg-error "'allocate' clause allocator expression has type 'int' rather than 'omp_allocator_handle_t'" } */ bar (x, &x, 0); } + +void +foo1 () +{ + int a = 10; +#pragma omp target + { +#pragma omp parallel private (a) allocate(a) // { dg-error "'allocate' clause must specify an allocator here" } +a = 20; + } +#pragma omp target private(a) allocate(a) // { dg-error "'allocate' clause must specify an allocator here" } + { +a = 30; + } +} diff --git a/gcc/testsuite/c-c++-common/gomp/allocate-8.c b/gcc/testsuite/c-c++-common/gomp/allocate-8.c new file mode 100644 index 000..bacefafc6fc --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/allocate-8.c @@ -0,0 +1,18 @@ +#pragma omp requires dynamic_allocators +void +foo () +{ + int a = 10; +#pragma omp parallel private (a) allocate(a) + a = 20; +#pragma omp target + { +#pragma omp parallel private (a) allocate(a) +a = 30; + } +#pragma omp target private(a) allocate(a) + { +a = 40; + } +} + diff --git a/gcc/testsuite/gcc.dg/gomp/pr104517.c b/gcc/testsuite/gcc.dg/gomp/pr104517.c index efb3175beb3..7e3bd1a1d1e 100644 --- a/gcc/testsuite/gcc.dg/gomp/pr104517.c +++ b/gcc/testsuite/gcc.dg/gomp/pr104517.c @@ -2,11 +2,13 @@ /* { dg-do compile } */ /* { dg-options "-O1 -fcompare-debug -fopenmp -fno-tree-ter -save-temps" } */ -enum { - omp_default_mem_alloc, - omp_large_cap_mem_alloc, - omp_const_mem_alloc, - omp_high_bw_mem_alloc +typedef enum omp_allocator_handle_t +{ + omp_null_allocator = 0, + omp_default_mem_alloc = 1, + omp_large_cap_mem_alloc = 2, + omp_const_mem_alloc = 3, + omp_high_bw_mem_alloc = 4, } omp_allocator_handle_t; int t, bar_nte, bar_tl, bar_i3, bar_dd; @@ -23,7 +25,7 @@ bar (int *idp, int s, int nth, int g, int nta, int fi, int pp, int *q, int p = 0, i2 = 0, i1 = 0, m = 0, d = 0; #pragma omp target parallel for \ - device(p) firstprivate (f) allocate (f) in_reduction(+:r2) + device(p) firstprivate (f) allocate (omp_default_mem_alloc:f) in_reduction(+:r2) for (int i = 0; i < 4; i++) ll++; @@ -31,8 +33,8 @@ bar (int *idp, int s, int nth, int g, int nta, int fi, int pp, int *q, device(d) map (m) \ if
Re: [PR103302] skip multi-word pre-move clobber during lra
On Dec 15, 2021, Jeff Law wrote: >> * expr.c (emit_move_complex_parts): Skip clobbers during lra. > OK for the next cycle. Thanks, but having looked into PR 104121, I withdraw this patch and also the already-installed patch for PR 103302. As I found out, LRA does worse without the clobbers for multi-word moves, not only because the clobbers shorten live ranges and enable earlier and better allocations, but also because find_reload_regno_insns implicitly, possibly unknowingly, relied on the clobbers to avoid the risk of an infinite loop. As noted in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104121#c11 with the clobber, a multi-word reload, and the insn the reload applies to, we get 4 insns, so find_reload_regno_insns avoids the loop. Without the clobber, a multi-word reload for either input or output makes for n==3, so we enter the loop and don't ever exit it: we'll find first_insn (input) or second_insn (output), but then we'll loop forever because we won't iterate again on {prev,next}_insn, respectively, and the other iterator won't find the other word reload. We advance the other till the end, but that's not enough for us to terminate the loop. With the proposed patch reversal, we no longer hit the problem with the v850 testcase in 104121, but I'm concerned we might still get an infinite loop on ports whose input or output reloads might emit a pair of insns without a clobber. I see two ways to robustify it. One is to find a complete reload sequence: diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc index c1d40ea2a14bd..ff1688917cbce 100644 --- a/gcc/lra-assigns.cc +++ b/gcc/lra-assigns.cc @@ -1716,9 +1716,18 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish) start_insn = lra_insn_recog_data[uid]->insn; n++; } - /* For reload pseudo we should have at most 3 insns referring for - it: input/output reload insns and the original insn. */ - if (n > 3) + /* For reload pseudo we should have at most 3 (sequences of) insns + referring for it: input/output reload insn sequences and the + original insn. Each reload insn sequence may amount to multiple + insns, but we expect to find each of them contiguous, one before + start_insn, one after it. We know start_insn is between the + sequences because it's the lowest-numbered insn, thus the first + we'll have found above. The reload insns, emitted later, will + have been assigned higher insn uids. If this assumption doesn't + hold, and there happen to be intervening reload insns for other + pseudos, we may end up returning false after searching to the end + in the wrong direction. */ + if (n > 1 + 2 * CEIL (lra_reg_info[regno].biggest_mode, UNITS_PER_WORD)) return false; if (n > 1) { @@ -1726,26 +1735,52 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish) next_insn = NEXT_INSN (start_insn); n != 1 && (prev_insn != NULL || next_insn != NULL); ) { - if (prev_insn != NULL && first_insn == NULL) + if (prev_insn != NULL) { if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, INSN_UID (prev_insn))) prev_insn = PREV_INSN (prev_insn); else { - first_insn = prev_insn; - n--; + /* A reload sequence may have multiple insns, but +they must be contiguous. */ + do + { + first_insn = prev_insn; + n--; + prev_insn = PREV_INSN (prev_insn); + } + while (n > 1 && prev_insn +&& bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, + INSN_UID (prev_insn))); + /* After finding first_insn, we don't want to search +backward any more, so set prev_insn to NULL so as +to not loop indefinitely. */ + prev_insn = NULL; } } - if (next_insn != NULL && second_insn == NULL) + else if (next_insn != NULL) { if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, INSN_UID (next_insn))) next_insn = NEXT_INSN (next_insn); else { - second_insn = next_insn; - n--; + /* A reload sequence may have multiple insns, but +they must be contiguous. */ + do + { + second_insn = next_insn; + n--; + next_insn = NEXT_INSN (next_insn); + } + while (n > 1 && next_insn +&& bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, +
libgo patch committed: Update hurd support
This libgo patch based on patches by Svante Signell updates the hurd support in libgo. This is for GCC PR 104290. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian fa59861178df32a1f1271be6f763b71d2bb5ecab diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE index 073fb4b50a8..3c0380e8285 100644 --- a/gcc/go/gofrontend/MERGE +++ b/gcc/go/gofrontend/MERGE @@ -1,4 +1,4 @@ -fade776395ffe5497d8aae5c0e6bd6d15e09e04a +20e74f9ef8206fb02fd28ce3d6e0f01f6fb95dc9 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. diff --git a/libgo/go/internal/testenv/testenv_unix.go b/libgo/go/internal/testenv/testenv_unix.go index 3dc5daf45e7..19112076a3c 100644 --- a/libgo/go/internal/testenv/testenv_unix.go +++ b/libgo/go/internal/testenv/testenv_unix.go @@ -2,7 +2,7 @@ // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. -//go:build aix || darwin || dragonfly || freebsd || linux || netbsd || openbsd || solaris +//go:build aix || darwin || dragonfly || freebsd || hurd || linux || netbsd || openbsd || solaris package testenv diff --git a/libgo/go/os/exec/internal/fdtest/exists_unix.go b/libgo/go/os/exec/internal/fdtest/exists_unix.go index 49f264cebda..fca328fc65b 100644 --- a/libgo/go/os/exec/internal/fdtest/exists_unix.go +++ b/libgo/go/os/exec/internal/fdtest/exists_unix.go @@ -2,7 +2,7 @@ // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. -//go:build aix || darwin || dragonfly || freebsd || linux || netbsd || openbsd || solaris +//go:build aix || darwin || dragonfly || freebsd || hurd || linux || netbsd || openbsd || solaris // Package fdtest provides test helpers for working with file descriptors across exec. package fdtest diff --git a/libgo/go/os/user/cgo_listgroups_unix.go b/libgo/go/os/user/cgo_listgroups_unix.go index 5621c1a63aa..09ec1cc5d2a 100644 --- a/libgo/go/os/user/cgo_listgroups_unix.go +++ b/libgo/go/os/user/cgo_listgroups_unix.go @@ -2,7 +2,7 @@ // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. -//go:build (dragonfly || darwin || freebsd || (!android && linux) || netbsd || openbsd || (solaris && !illumos)) && cgo && !osusergo +//go:build (dragonfly || darwin || freebsd || hurd || (!android && linux) || netbsd || openbsd || (solaris && !illumos)) && cgo && !osusergo package user diff --git a/libgo/go/os/user/getgrouplist_unix.go b/libgo/go/os/user/getgrouplist_unix.go index 527d941308e..e97e0bc64f6 100644 --- a/libgo/go/os/user/getgrouplist_unix.go +++ b/libgo/go/os/user/getgrouplist_unix.go @@ -2,7 +2,7 @@ // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. -//go:build (dragonfly || freebsd || (!android && linux) || netbsd || openbsd || (solaris && !illumos)) && cgo && !osusergo +//go:build (dragonfly || freebsd || hurd || (!android && linux) || netbsd || openbsd || (solaris && !illumos)) && cgo && !osusergo package user diff --git a/libgo/go/runtime/netpoll_hurd.go b/libgo/go/runtime/netpoll_hurd.go index 3d3fa4b76d3..96b013845fa 100644 --- a/libgo/go/runtime/netpoll_hurd.go +++ b/libgo/go/runtime/netpoll_hurd.go @@ -238,10 +238,7 @@ retry: pfd.events &= ^_POLLOUT } if mode != 0 { - pds[i].everr = false - if pfd.revents == _POLLERR { - pds[i].everr = true - } + pds[i].setEventErr(pfd.revents == _POLLERR) netpollready(&toRun, pds[i], mode) n-- } diff --git a/libgo/go/runtime/os_hurd.go b/libgo/go/runtime/os_hurd.go index 8bde23edb81..9750a4822cf 100644 --- a/libgo/go/runtime/os_hurd.go +++ b/libgo/go/runtime/os_hurd.go @@ -129,3 +129,16 @@ func osinit() { physPageSize = uintptr(getPageSize()) } } + +func setProcessCPUProfiler(hz int32) { + setProcessCPUProfilerTimer(hz) +} + +func setThreadCPUProfiler(hz int32) { + setThreadCPUProfilerHz(hz) +} + +//go:nosplit +func validSIGPROF(mp *m, c *sigctxt) bool { + return true +} diff --git a/libgo/go/syscall/exec_bsd.go b/libgo/go/syscall/exec_bsd.go index ff88bc45366..86e513efdea 100644 --- a/libgo/go/syscall/exec_bsd.go +++ b/libgo/go/syscall/exec_bsd.go @@ -2,7 +2,7 @@ // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. -//go:build aix || darwin || dragonfly || hurd || netbsd || openbsd || solaris +//go:build aix || darwin || dragonfly || netbsd || openbsd || solaris package syscall diff --git a/libgo/go/syscall/exec_hurd.go b/libgo/go/syscall/exec_hurd.go new file mode 100644 index 000..06df513c55c --- /dev/null +++ b/libgo/go/syscall/exec_hurd.go @@ -0,0 +
Contents of PO file 'cpplib-12.1-b20220213.ru.po'
cpplib-12.1-b20220213.ru.po.gz Description: Binary data The Translation Project robot, in the name of your translation coordinator.
New Russian PO file for 'cpplib' (version 12.1-b20220213)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'cpplib' has been submitted by the Russian team of translators. The file is available at: https://translationproject.org/latest/cpplib/ru.po (This file, 'cpplib-12.1-b20220213.ru.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: https://translationproject.org/latest/cpplib/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: https://translationproject.org/domain/cpplib.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.