[gcc r15-7793] combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]
https://gcc.gnu.org/g:a92dc3fe31c95d56019b2fb95a58414bca06241f commit r15-7793-ga92dc3fe31c95d56019b2fb95a58414bca06241f Author: Uros Bizjak Date: Wed Feb 12 11:19:57 2025 +0100 combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739] The combine pass is trying to combine: Trying 16, 22, 21 -> 23: 16: r104:QI=flags:CCNO>0 22: {r120:QI=r104:QI^0x1;clobber flags:CC;} REG_UNUSED flags:CC 21: r119:QI=flags:CCNO<=0 REG_DEAD flags:CCNO 23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;} REG_DEAD r120:QI REG_DEAD r119:QI REG_UNUSED flags:CC and creates the following two-insn sequence: modifying insn i2 22: r104:QI=flags:CCNO>0 REG_DEAD flags:CC deferring rescan insn with uid = 22. modifying insn i3 23: r110:QI=flags:CCNO<=0 REG_DEAD flags:CC deferring rescan insn with uid = 23. where the REG_DEAD note in i2 is not correct, because the flags register is still referenced in i3. In the try_combine() megafunction, we have this part: --cut here-- /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3. */ if (i3notes) distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, elim_i0); if (i2notes) distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, elim_i0); if (i1notes) distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL, elim_i2, local_elim_i1, local_elim_i0); if (i0notes) distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, local_elim_i0); if (midnotes) distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL, elim_i2, elim_i1, elim_i0); --cut here-- where the compiler distributes REG_UNUSED note from i2: 22: {r120:QI=r104:QI^0x1;clobber flags:CC;} REG_UNUSED flags:CC via distribute_notes() using the following: --cut here-- /* Otherwise, if this register is used by I3, then this register now dies here, so we must put a REG_DEAD note here unless there is one already. */ else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)) && ! (REG_P (XEXP (note, 0)) ?
find_regno_note (i3, REG_DEAD, REGNO (XEXP (note, 0))) : find_reg_note (i3, REG_DEAD, XEXP (note, 0)))) { PUT_REG_NOTE_KIND (note, REG_DEAD); place = i3; } --cut here-- Flags register is used in I3, but there already is a REG_DEAD note in I3. The above condition doesn't trigger and continues in the "else" part where REG_DEAD note is put to I2. The proposed solution corrects the above logic to trigger every time the register is referenced in I3, avoiding the "else" part. PR rtl-optimization/118739 gcc/ChangeLog: * combine.cc (distribute_notes) : Correct the logic when the register is used by I3. gcc/testsuite/ChangeLog: * gcc.target/i386/pr118739.c: New test. Diff: --- gcc/combine.cc | 15 +- gcc/testsuite/gcc.target/i386/pr118739.c | 50 2 files changed, 58 insertions(+), 7 deletions(-) diff --git a/gcc/combine.cc b/gcc/combine.cc index 3beeb514b817..1b2bd34748ec 100644 --- a/gcc/combine.cc +++ b/gcc/combine.cc @@ -14523,14 +14523,15 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2, /* Otherwise, if this register is used by I3, then this register now dies here, so we must put a REG_DEAD note here unless there is one already. */ - else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)) - && ! (REG_P (XEXP (note, 0)) -? find_regno_note (i3, REG_DEAD, - REGNO (XEXP (note, 0))) -: find_reg_note (i3, REG_DEAD, XEXP (note, 0)))) + else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))) { - PUT_REG_NOTE_KIND (note, REG_DEAD); - place = i3; + if (! (REG_P (XEXP (note, 0)) +? find_regno_note (i3, REG_DEAD, REGNO (XEXP (note, 0))) +: find_reg_note (i3, REG_DEAD, XEXP (note, 0)))) + { + PUT_REG_NOTE_KIND (note, REG_DEAD); + place = i3; + } } /* A SET or CLOBBER of the REG_UNUSED reg has been removed, diff --git
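The control-flow change in r15-7793 can be sketched outside of GCC. The following is a minimal toy model, not the real distribute_notes code: NoteContext, Place, place_before_fix and place_after_fix are hypothetical names, and the whole chain of later branches is collapsed into "put the note on I2", which is an assumption made only for illustration.

```cpp
#include <cassert>

// Where a REG_UNUSED note from I2 ends up in this toy model.
enum class Place { None, I2, I3 };

// Hypothetical, simplified stand-ins for the two tests in the patch;
// these are not GCC data structures.
struct NoteContext {
  bool referenced_in_i3;  // models reg_referenced_p (reg, PATTERN (i3))
  bool i3_has_reg_dead;   // models find_regno_note (i3, REG_DEAD, regno)
};

// Old shape: both tests share one condition, so a register that is
// referenced in I3 but already has a REG_DEAD note there falls through
// to the later branch (modeled here as placing the note on I2).
Place place_before_fix(const NoteContext &c) {
  if (c.referenced_in_i3 && !c.i3_has_reg_dead)
    return Place::I3;
  return Place::I2;
}

// New shape: the reference test alone selects the branch; the REG_DEAD
// test only decides whether a note is added to I3 or nothing is placed.
Place place_after_fix(const NoteContext &c) {
  if (c.referenced_in_i3) {
    if (!c.i3_has_reg_dead)
      return Place::I3;
    return Place::None;  // note is discarded
  }
  return Place::I2;
}
```

For the case from the log (flags referenced in I3, REG_DEAD already present on I3), place_before_fix yields Place::I2 and place_after_fix yields Place::None; the other three input combinations give the same result in both versions.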
[gcc r14-11373] gimple: sccopy: Prune removed statements from SCCs [PR117919]
https://gcc.gnu.org/g:6ffbc711afbda9446df51fd2b542ecd61853283d commit r14-11373-g6ffbc711afbda9446df51fd2b542ecd61853283d Author: Filip Kastl Date: Sun Mar 2 06:39:17 2025 +0100 gimple: sccopy: Prune removed statements from SCCs [PR117919] While writing the sccopy pass I didn't realize that 'replace_uses_by ()' can remove portions of the CFG. This happens when replacing arguments of some statement results in the removal of an EH edge. Because of this sccopy can then work with GIMPLE statements that aren't part of the IR anymore. In PR117919 this triggered an assertion within the pass which assumes that statements the pass works with are reachable. This patch tells the pass to notice when a statement isn't in the IR anymore and remove it from its worklist. PR tree-optimization/117919 gcc/ChangeLog: * gimple-ssa-sccopy.cc (scc_copy_prop::propagate): Prune statements that 'replace_uses_by ()' removed. gcc/testsuite/ChangeLog: * g++.dg/pr117919.C: New test. Signed-off-by: Filip Kastl (cherry picked from commit 5349aa2accdf34a7bf9cabd1447878aaadfc0e87) Diff: --- gcc/gimple-ssa-sccopy.cc| 13 +++ gcc/testsuite/g++.dg/pr117919.C | 52 + 2 files changed, 65 insertions(+) diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc index 138ee9a0ac48..d4d06f3b13e7 100644 --- a/gcc/gimple-ssa-sccopy.cc +++ b/gcc/gimple-ssa-sccopy.cc @@ -550,6 +550,19 @@ sccopy_propagate () { vec scc = worklist.pop (); + /* When we do 'replace_scc_by_value' it may happen that some EH edges +get removed. That means parts of CFG get removed. Those may +contain copy statements. For that reason we prune SCCs here.
*/ + unsigned i; + for (i = 0; i < scc.length (); i++) + if (gimple_bb (scc[i]) == NULL) + scc.unordered_remove (i); + if (scc.is_empty ()) + { + scc.release (); + continue; + } + auto_vec inner; hash_set outer_ops; tree last_outer_op = NULL_TREE; diff --git a/gcc/testsuite/g++.dg/pr117919.C b/gcc/testsuite/g++.dg/pr117919.C new file mode 100644 index ..fa2d9c9cd1e5 --- /dev/null +++ b/gcc/testsuite/g++.dg/pr117919.C @@ -0,0 +1,52 @@ +/* PR tree-optimization/117919 */ +/* { dg-do compile } */ +/* { dg-options "-O1 -fno-tree-forwprop -fnon-call-exceptions --param=early-inlining-insns=192 -std=c++20" } */ + +char _M_p, _M_construct___beg; +struct _Alloc_hider { + _Alloc_hider(char); +}; +long _M_string_length; +void _M_destroy(); +void _S_copy_chars(char *, char *, char *) noexcept; +char _M_local_data(); +struct Trans_NS___cxx11_basic_string { + _Alloc_hider _M_dataplus; + bool _M_is_local() { +if (_M_local_data()) + if (_M_string_length) +return true; +return false; + } + void _M_dispose() { +if (!_M_is_local()) + _M_destroy(); + } + char *_M_construct___end; + Trans_NS___cxx11_basic_string(Trans_NS___cxx11_basic_string &) + : _M_dataplus(0) { +struct _Guard { + ~_Guard() { _M_guarded->_M_dispose(); } + Trans_NS___cxx11_basic_string *_M_guarded; +} __guard0; +_S_copy_chars(&_M_p, &_M_construct___beg, _M_construct___end); + } +}; +namespace filesystem { +struct path { + path(); + Trans_NS___cxx11_basic_string _M_pathname; +}; +} // namespace filesystem +struct FileWriter { + filesystem::path path; + FileWriter() : path(path) {} +}; +struct LanguageFileWriter : FileWriter { + LanguageFileWriter(filesystem::path) {} +}; +int +main() { + filesystem::path output_file; + LanguageFileWriter writer(output_file); +}
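The pruning idea in r14-11373 (drop worklist entries whose statement has left the IR) can be sketched with standard containers. This is a hedged illustration only: Stmt, its in_ir field, unordered_remove and prune_removed are hypothetical stand-ins, with in_ir == false modeling gimple_bb (stmt) == NULL. The sketch re-checks a slot after each swap-removal; that index handling is a choice made here, not something taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GIMPLE statement. in_ir == false models a
// statement whose basic block is NULL, i.e. it was removed from the IR.
struct Stmt {
  int id;
  bool in_ir;
};

// Swap-and-pop removal: the last element is moved into slot i, so the
// order of the remaining elements is not preserved.
void unordered_remove(std::vector<Stmt> &v, std::size_t i) {
  v[i] = v.back();
  v.pop_back();
}

// Drop every statement that is no longer in the IR. After a removal,
// slot i holds a different element, so it is examined again before the
// index advances.
void prune_removed(std::vector<Stmt> &scc) {
  std::size_t i = 0;
  while (i < scc.size()) {
    if (!scc[i].in_ir)
      unordered_remove(scc, i);
    else
      ++i;
  }
}
```

A caller would then skip the SCC entirely when prune_removed leaves it empty, which corresponds to the is_empty () check followed by continue in the patch.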
[gcc r15-7794] combine: Reverse negative logic in ternary operator
https://gcc.gnu.org/g:f1c30c6213fb228f1e8b5973d10c868b834a4acd commit r15-7794-gf1c30c6213fb228f1e8b5973d10c868b834a4acd Author: Uros Bizjak Date: Mon Mar 3 17:04:54 2025 +0100 combine: Reverse negative logic in ternary operator Reverse negative logic in !a ? b : c to become a ? c : b. No functional changes. gcc/ChangeLog: * combine.cc (distribute_notes): Reverse negative logic in ternary operators. Diff: --- gcc/combine.cc | 22 +++--- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/gcc/combine.cc b/gcc/combine.cc index 1b2bd34748ec..892d37641e99 100644 --- a/gcc/combine.cc +++ b/gcc/combine.cc @@ -14515,9 +14515,9 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2, if (from_insn != i3) break; - if (! (REG_P (XEXP (note, 0)) -? find_regno_note (i3, REG_UNUSED, REGNO (XEXP (note, 0))) -: find_reg_note (i3, REG_UNUSED, XEXP (note, 0 + if (REG_P (XEXP (note, 0)) + ? find_reg_note (i3, REG_UNUSED, XEXP (note, 0)) + : find_regno_note (i3, REG_UNUSED, REGNO (XEXP (note, 0 place = i3; } /* Otherwise, if this register is used by I3, then this register @@ -14525,9 +14525,9 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2, is one already. */ else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))) { - if (! (REG_P (XEXP (note, 0)) -? find_regno_note (i3, REG_DEAD, REGNO (XEXP (note, 0))) -: find_reg_note (i3, REG_DEAD, XEXP (note, 0 + if (REG_P (XEXP (note, 0)) + ? find_reg_note (i3, REG_DEAD, XEXP (note, 0)) + : find_regno_note (i3, REG_DEAD, REGNO (XEXP (note, 0 { PUT_REG_NOTE_KIND (note, REG_DEAD); place = i3; @@ -14564,11 +14564,11 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2, { if (!reg_set_p (XEXP (note, 0), PATTERN (i2))) PUT_REG_NOTE_KIND (note, REG_DEAD); - if (! (REG_P (XEXP (note, 0)) -? find_regno_note (i2, REG_NOTE_KIND (note), - REGNO (XEXP (note, 0))) -: find_reg_note (i2, REG_NOTE_KIND (note), - XEXP (note, 0 + if (REG_P (XEXP (note, 0)) + ? 
find_reg_note (i2, REG_NOTE_KIND (note), + XEXP (note, 0)) + : find_regno_note (i2, REG_NOTE_KIND (note), +REGNO (XEXP (note, 0 place = i2; } }
[gcc r15-7795] Revert "combine: Reverse negative logic in ternary operator"
https://gcc.gnu.org/g:ebc6c54e1f84170d591aa44c4a589c0164436a02 commit r15-7795-gebc6c54e1f84170d591aa44c4a589c0164436a02 Author: Uros Bizjak Date: Mon Mar 3 17:52:04 2025 +0100 Revert "combine: Reverse negative logic in ternary operator" This reverts commit f1c30c6213fb228f1e8b5973d10c868b834a4acd. Diff: --- gcc/combine.cc | 22 +++--- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/gcc/combine.cc b/gcc/combine.cc index 892d37641e99..1b2bd34748ec 100644 --- a/gcc/combine.cc +++ b/gcc/combine.cc @@ -14515,9 +14515,9 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2, if (from_insn != i3) break; - if (REG_P (XEXP (note, 0)) - ? find_reg_note (i3, REG_UNUSED, XEXP (note, 0)) - : find_regno_note (i3, REG_UNUSED, REGNO (XEXP (note, 0 + if (! (REG_P (XEXP (note, 0)) +? find_regno_note (i3, REG_UNUSED, REGNO (XEXP (note, 0))) +: find_reg_note (i3, REG_UNUSED, XEXP (note, 0 place = i3; } /* Otherwise, if this register is used by I3, then this register @@ -14525,9 +14525,9 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2, is one already. */ else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))) { - if (REG_P (XEXP (note, 0)) - ? find_reg_note (i3, REG_DEAD, XEXP (note, 0)) - : find_regno_note (i3, REG_DEAD, REGNO (XEXP (note, 0 + if (! (REG_P (XEXP (note, 0)) +? find_regno_note (i3, REG_DEAD, REGNO (XEXP (note, 0))) +: find_reg_note (i3, REG_DEAD, XEXP (note, 0 { PUT_REG_NOTE_KIND (note, REG_DEAD); place = i3; @@ -14564,11 +14564,11 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2, { if (!reg_set_p (XEXP (note, 0), PATTERN (i2))) PUT_REG_NOTE_KIND (note, REG_DEAD); - if (REG_P (XEXP (note, 0)) - ? find_reg_note (i2, REG_NOTE_KIND (note), - XEXP (note, 0)) - : find_regno_note (i2, REG_NOTE_KIND (note), -REGNO (XEXP (note, 0 + if (! (REG_P (XEXP (note, 0)) +? 
find_regno_note (i2, REG_NOTE_KIND (note), + REGNO (XEXP (note, 0))) +: find_reg_note (i2, REG_NOTE_KIND (note), + XEXP (note, 0 place = i2; } }
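The log of r15-7795 gives no reason for the revert, so the following is only a general observation about the two expression shapes that appear in the diffs. The log of r15-7794 describes rewriting !a ? b : c as a ? c : b, while the code in distribute_notes has the shape ! (a ? b : c). A small sketch with hypothetical helper names shows that these shapes behave differently under an arm swap:

```cpp
#include <cassert>

// Shape found in distribute_notes: the negation applies to the result of
// the whole conditional expression.
constexpr bool negated_whole(bool a, bool b, bool c) { return !(a ? b : c); }

// Shape described in the r15-7794 log: the negation applies only to the
// selector, so swapping the arms is an exact rewrite.
constexpr bool negated_selector(bool a, bool b, bool c) { return !a ? b : c; }
constexpr bool swapped_arms(bool a, bool b, bool c) { return a ? c : b; }

// Count the inputs (out of 8) on which two three-input predicates differ.
int mismatches(bool (*f)(bool, bool, bool), bool (*g)(bool, bool, bool)) {
  int n = 0;
  for (int bits = 0; bits < 8; ++bits) {
    bool a = bits & 4, b = bits & 2, c = bits & 1;
    if (f(a, b, c) != g(a, b, c))
      ++n;
  }
  return n;
}
```

Here mismatches(negated_selector, swapped_arms) is 0, whereas mismatches(negated_whole, swapped_arms) is 4: the two forms disagree whenever b and c are equal.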
[gcc r15-7789] Fortran: Fix regression on double free on elemental function [PR118747]
https://gcc.gnu.org/g:43c11931acc50f3a44efb485b03e6a8d44df97e0 commit r15-7789-g43c11931acc50f3a44efb485b03e6a8d44df97e0 Author: Andre Vehreschild Date: Wed Feb 26 14:30:13 2025 +0100 Fortran: Fix regression on double free on elemental function [PR118747] Fix a regression where adding a temporary variable inserted a copy of the argument to the elemental function. That copy was then later used to free allocated memory, but the freeing was not tracked in the source array correctly. PR fortran/118747 gcc/fortran/ChangeLog: * trans-array.cc (gfc_trans_array_ctor_element): Remove copy to temporary variable. * trans-expr.cc (gfc_conv_procedure_call): Use references to array members instead of copies when freeing after use. Formatting fix. gcc/testsuite/ChangeLog: * gfortran.dg/alloc_comp_auto_array_4.f90: New test. Diff: --- gcc/fortran/trans-array.cc | 11 - gcc/fortran/trans-expr.cc | 13 --- .../gfortran.dg/alloc_comp_auto_array_4.f90| 27 ++ 3 files changed, 41 insertions(+), 10 deletions(-) diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 8f76870b286a..6a00d26cb2f3 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -2002,13 +2002,10 @@ gfc_trans_array_ctor_element (stmtblock_t * pblock, tree desc, if (expr->expr_type == EXPR_FUNCTION && expr->ts.type == BT_DERIVED && expr->ts.u.derived->attr.alloc_comp) -{ - if (!VAR_P (se->expr)) - se->expr = gfc_evaluate_now (se->expr, &se->pre); - gfc_add_expr_to_block (&se->finalblock, -gfc_deallocate_alloc_comp_no_caf ( - expr->ts.u.derived, se->expr, expr->rank, true)); -} +gfc_add_expr_to_block (&se->finalblock, + gfc_deallocate_alloc_comp_no_caf (expr->ts.u.derived, +tmp, expr->rank, +true)); if (expr->ts.type == BT_CHARACTER) { diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index ab55940638e2..e619013f261e 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -6999,6 +6999,12 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym, if
((fsym && fsym->attr.value) || (ulim_copy && (argc == 2 || argc == 3))) gfc_conv_expr (&parmse, e); + else if (e->expr_type == EXPR_ARRAY) + { + gfc_conv_expr (&parmse, e); + if (e->ts.type != BT_CHARACTER) + parmse.expr = gfc_build_addr_expr (NULL_TREE, parmse.expr); + } else gfc_conv_expr_reference (&parmse, e); @@ -7930,11 +7936,11 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym, /* It is known the e returns a structure type with at least one allocatable component. When e is a function, ensure that the function is called once only by using a temporary variable. */ - if (!DECL_P (parmse.expr)) + if (!DECL_P (parmse.expr) && e->expr_type == EXPR_FUNCTION) parmse.expr = gfc_evaluate_now_loc (input_location, parmse.expr, &se->pre); - if (fsym && fsym->attr.value) + if ((fsym && fsym->attr.value) || e->expr_type == EXPR_ARRAY) tmp = parmse.expr; else tmp = build_fold_indirect_ref_loc (input_location, @@ -7993,7 +7999,8 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym, /* Scalars passed to an assumed rank argument are converted to a descriptor. Obtain the data field before deallocating any allocatable components. */ - if (parm_rank == 0 && GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp))) + if (parm_rank == 0 && e->expr_type != EXPR_ARRAY + && GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp))) tmp = gfc_conv_descriptor_data_get (tmp); if (scalar_res_outside_loop) diff --git a/gcc/testsuite/gfortran.dg/alloc_comp_auto_array_4.f90 b/gcc/testsuite/gfortran.dg/alloc_comp_auto_array_4.f90 new file mode 100644 index ..06bd8b50b967 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/alloc_comp_auto_array_4.f90 @@ -0,0 +1,27 @@ +!{ dg-do run } + +! Check freeing derived typed result's allocatable components is not done twice. +! 
Contributed by Damian Rouson + +program pr118747 + implicit none + + type string_t +character(len=:), allocatable :: string_ + end type + + call check_allocation([foo(), foo()]) + +contains + + type(string_t) function foo() +foo%string_ = "foo" + end function + + elemental subroutine check_allocation(string) +
[gcc r15-7790] ipa/119067 - bogus TYPE_PRECISION check on VECTOR_TYPE
https://gcc.gnu.org/g:f22e89167b3abfbf6d67f42fc4d689d8ffdc1810 commit r15-7790-gf22e89167b3abfbf6d67f42fc4d689d8ffdc1810 Author: Richard Biener Date: Mon Mar 3 09:54:15 2025 +0100 ipa/119067 - bogus TYPE_PRECISION check on VECTOR_TYPE odr_types_equivalent_p can end up using TYPE_PRECISION on vector types which is a no-go. The following instead uses TYPE_VECTOR_SUBPARTS for vector types so we also end up comparing the number of vector elements. PR ipa/119067 * ipa-devirt.cc (odr_types_equivalent_p): Check TYPE_VECTOR_SUBPARTS for vectors. * g++.dg/lto/pr119067_0.C: New testcase. * g++.dg/lto/pr119067_1.C: Likewise. Diff: --- gcc/ipa-devirt.cc | 10 +- gcc/testsuite/g++.dg/lto/pr119067_0.C | 22 ++ gcc/testsuite/g++.dg/lto/pr119067_1.C | 10 ++ 3 files changed, 41 insertions(+), 1 deletion(-) diff --git a/gcc/ipa-devirt.cc b/gcc/ipa-devirt.cc index c31658f57ef2..532e25e87c60 100644 --- a/gcc/ipa-devirt.cc +++ b/gcc/ipa-devirt.cc @@ -1259,13 +1259,21 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, bool *warned, || TREE_CODE (t1) == OFFSET_TYPE || POINTER_TYPE_P (t1)) { - if (TYPE_PRECISION (t1) != TYPE_PRECISION (t2)) + if (!VECTOR_TYPE_P (t1) && TYPE_PRECISION (t1) != TYPE_PRECISION (t2)) { warn_odr (t1, t2, NULL, NULL, warn, warned, G_("a type with different precision is defined " "in another translation unit")); return false; } + if (VECTOR_TYPE_P (t1) + && maybe_ne (TYPE_VECTOR_SUBPARTS (t1), TYPE_VECTOR_SUBPARTS (t2))) + { + warn_odr (t1, t2, NULL, NULL, warn, warned, + G_("a vector type with different number of elements " + "is defined in another translation unit")); + return false; + } if (TYPE_UNSIGNED (t1) != TYPE_UNSIGNED (t2)) { warn_odr (t1, t2, NULL, NULL, warn, warned, diff --git a/gcc/testsuite/g++.dg/lto/pr119067_0.C b/gcc/testsuite/g++.dg/lto/pr119067_0.C new file mode 100644 index ..e0f813ceffed --- /dev/null +++ b/gcc/testsuite/g++.dg/lto/pr119067_0.C @@ -0,0 +1,22 @@ +/* { dg-lto-do link } */ +/* { dg-skip-if "" { ! 
{ x86_64-*-* i?86-*-* } } } */ +/* { dg-require-effective-target avx2 } */ +/* { dg-require-effective-target shared } */ +/* { dg-lto-options { { -O2 -fPIC -flto } } } */ +/* { dg-extra-ld-options { -shared } } */ + +#pragma GCC push_options +#pragma GCC target("avx2") +typedef char __v32qi __attribute__ ((__vector_size__ (32))); +struct ff +{ + __v32qi t; +}; +__v32qi g(struct ff a); + +__v32qi h(__v32qi a) +{ + struct ff t = {a}; + return g(t); +} +#pragma GCC pop_options diff --git a/gcc/testsuite/g++.dg/lto/pr119067_1.C b/gcc/testsuite/g++.dg/lto/pr119067_1.C new file mode 100644 index ..d8e2935fa24d --- /dev/null +++ b/gcc/testsuite/g++.dg/lto/pr119067_1.C @@ -0,0 +1,10 @@ +/* { dg-options "-mavx2" } */ + +typedef char __v32qi __attribute__ ((__vector_size__ (32))); +struct ff +{ + __v32qi t; +}; +__v32qi g(struct ff a) { + return a.t; +}
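The reshaped check from r15-7790 can be modeled in isolation. This is a rough sketch under stated assumptions: TypeInfo, OdrResult and compare_scalar_like are hypothetical names, element counts are plain integers (the real code compares poly-int counts with maybe_ne), and only the three checks visible in the hunk are modeled.

```cpp
#include <cassert>

// Hypothetical, flattened description of a type; not GCC's tree type.
struct TypeInfo {
  bool is_vector;      // models VECTOR_TYPE_P (t)
  unsigned precision;  // models TYPE_PRECISION (t); only read for non-vectors
  unsigned subparts;   // models TYPE_VECTOR_SUBPARTS (t); only read for vectors
  bool is_unsigned;    // models TYPE_UNSIGNED (t)
};

enum class OdrResult {
  Equivalent,
  PrecisionMismatch,
  ElementCountMismatch,
  SignMismatch
};

// Mirrors the order of checks in the patched hunk: precision is compared
// only for non-vector types, element count only for vector types.
OdrResult compare_scalar_like(const TypeInfo &t1, const TypeInfo &t2) {
  if (!t1.is_vector && t1.precision != t2.precision)
    return OdrResult::PrecisionMismatch;
  if (t1.is_vector && t1.subparts != t2.subparts)
    return OdrResult::ElementCountMismatch;
  if (t1.is_unsigned != t2.is_unsigned)
    return OdrResult::SignMismatch;
  return OdrResult::Equivalent;
}
```

With this shape, two 32-element vector types compare as equivalent without the precision field ever being read, and a 32-element versus 16-element pair is reported as an element-count mismatch.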
[gcc r15-7791] tree-optimization/119057 - bogus double reduction detection
https://gcc.gnu.org/g:758de6263dfc7ba8701965fa468691ac23cb7eb5 commit r15-7791-g758de6263dfc7ba8701965fa468691ac23cb7eb5 Author: Richard Biener Date: Mon Mar 3 13:21:53 2025 +0100 tree-optimization/119057 - bogus double reduction detection We are detecting a cycle as double reduction where the inner loop cycle has extra out-of-loop uses. This clashes at least with assumptions from the SLP discovery code which says the cycle isn't reachable from another SLP instance. It also was not intended to support this case, in fact with GCC 14 we seem to generate wrong code here. PR tree-optimization/119057 * tree-vect-loop.cc (check_reduction_path): Add argument specifying whether we're analyzing the inner loop of a double reduction. Do not allow extra uses outside of the double reduction cycle in this case. (vect_is_simple_reduction): Adjust. * gcc.dg/vect/pr119057.c: New testcase. Diff: --- gcc/testsuite/gcc.dg/vect/pr119057.c | 19 +++ gcc/tree-vect-loop.cc| 12 +++- 2 files changed, 26 insertions(+), 5 deletions(-) diff --git a/gcc/testsuite/gcc.dg/vect/pr119057.c b/gcc/testsuite/gcc.dg/vect/pr119057.c new file mode 100644 index ..582bb8ff86c3 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/pr119057.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-fno-tree-vrp -fno-tree-forwprop" } */ + +int a, b, c, d; +unsigned e; +static void f(void) +{ + unsigned h; + for (d = 0; d < 2; d++) +b |= e; + h = b; + c |= h; +} +int main() +{ + for (; a; a++) +f(); + return 0; +} diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index b279ebe2793b..dc15b955aadf 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -4044,7 +4044,8 @@ needs_fold_left_reduction_p (tree type, code_helper code) static bool check_reduction_path (dump_user_location_t loc, loop_p loop, gphi *phi, tree loop_arg, code_helper *code, - vec > &path) + vec > &path, + bool inner_loop_of_double_reduc) { auto_bitmap visited; tree lookfor = PHI_RESULT (phi); @@ -4181,7 +4182,8 @@ 
pop: break; } /* Check there's only a single stmt the op is used on. For the -not value-changing tail and the last stmt allow out-of-loop uses. +not value-changing tail and the last stmt allow out-of-loop uses, +but not when this is the inner loop of a double reduction. ??? We could relax this and handle arbitrary live stmts by forcing a scalar epilogue for example. */ imm_use_iterator imm_iter; @@ -4216,7 +4218,7 @@ pop: } } else if (!is_gimple_debug (op_use_stmt) - && (*code != ERROR_MARK + && ((*code != ERROR_MARK || inner_loop_of_double_reduc) || flow_bb_inside_loop_p (loop, gimple_bb (op_use_stmt FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter) @@ -4238,7 +4240,7 @@ check_reduction_path (dump_user_location_t loc, loop_p loop, gphi *phi, { auto_vec > path; code_helper code_; - return (check_reduction_path (loc, loop, phi, loop_arg, &code_, path) + return (check_reduction_path (loc, loop, phi, loop_arg, &code_, path, false) && code_ == code); } @@ -4449,7 +4451,7 @@ vect_is_simple_reduction (loop_vec_info loop_info, stmt_vec_info phi_info, auto_vec > path; code_helper code; if (check_reduction_path (vect_location, loop, phi, latch_def, &code, - path)) + path, inner_loop_of_double_reduc)) { STMT_VINFO_REDUC_CODE (phi_info) = code; if (code == COND_EXPR && !nested_in_vect_loop)
[gcc r15-7792] ipa-vr: Handle non-conversion unary ops separately from conversions (PR 118785)
https://gcc.gnu.org/g:d05b64bdd048ffb7f72d97553888934a9bcd13fa commit r15-7792-gd05b64bdd048ffb7f72d97553888934a9bcd13fa Author: Martin Jambor Date: Mon Mar 3 14:53:03 2025 +0100 ipa-vr: Handle non-conversion unary ops separately from conversions (PR 118785) Since we construct arithmetic jump functions even when there is a type conversion in between the operation encoded in the jump function and when it is passed in a call argument, the IPA propagation phase must also perform the operation and conversion in two steps. IPA-VR had actually been doing it even before for binary operations but, as PR 118756 exposes, not in the case of unary operations. This patch adds the necessary step to rectify that. Like in the scalar constant case, we depend on expr_type_first_operand_type_p to determine the type of the result of the arithmetic operation. On top of this, the patch special-cases ABSU_EXPR because it looks useful and so that the PR testcase exercises the added code-path. This seems most appropriate for stage 4; long term we should probably stream the types, probably after also encoding them with a string of expr_eval_op rather than what we have today. A check for expr_type_first_operand_type_p was also missing in the handling of binary ops and the intermediate value_range was initialized with a wrong type, so I also fixed this. gcc/ChangeLog: 2025-02-24 Martin Jambor PR ipa/118785 * ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Handle non-conversion unary operations separately before doing any conversions. Check expr_type_first_operand_type_p for non-unary operations too. Fix type of op_res. gcc/testsuite/ChangeLog: 2025-02-24 Martin Jambor PR ipa/118785 * g++.dg/lto/pr118785_0.C: New test.
Diff: --- gcc/ipa-cp.cc | 45 --- gcc/testsuite/g++.dg/lto/pr118785_0.C | 14 +++ 2 files changed, 56 insertions(+), 3 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 68959f2677ba..3c994f24f540 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1720,8 +1720,45 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr, enum tree_code operation = ipa_get_jf_pass_through_operation (jfunc); if (TREE_CODE_CLASS (operation) == tcc_unary) { + value_range op_res; + const value_range *inter_vr; + if (operation != NOP_EXPR) + { + /* Since we construct arithmetic jump functions even when there is a + type conversion in between the operation encoded in the jump + function and when it is passed in a call argument, the IPA + propagation phase must also perform the operation and conversion + in two separate steps. + +TODO: In order to remove the use of expr_type_first_operand_type_p +predicate we would need to stream the operation type, ideally +encoding the whole jump function as a series of expr_eval_op +structures. 
*/ + + tree operation_type; + if (expr_type_first_operand_type_p (operation)) + operation_type = src_type; + else if (operation == ABSU_EXPR) + operation_type = unsigned_type_for (src_type); + else + return; + op_res.set_varying (operation_type); + if (!ipa_vr_operation_and_type_effects (op_res, src_vr, operation, + operation_type, src_type)) + return; + if (src_type == dst_type) + { + vr.intersect (op_res); + return; + } + inter_vr = &op_res; + src_type = operation_type; + } + else + inter_vr = &src_vr; + value_range tmp_res (dst_type); - if (ipa_vr_operation_and_type_effects (tmp_res, src_vr, operation, + if (ipa_vr_operation_and_type_effects (tmp_res, *inter_vr, NOP_EXPR, dst_type, src_type)) vr.intersect (tmp_res); return; @@ -1737,10 +1774,12 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr, tree operation_type; if (TREE_CODE_CLASS (operation) == tcc_comparison) operation_type = boolean_type_node; - else + else if (expr_type_first_operand_type_p (operation)) operation_type = src_type; + else +return; - value_range op_res (dst_type); + value_range op_res (operation_type); if (!ipa_vr_supported_type_p (operation_type) || !handler.operand_check_p (operation_type, src_type, op_vr.type ()) || !handler.fold_range (op_res, operation_type, src_vr, op_vr)) diff --git a/gcc/testsuite/g++.dg/lto/pr118785_0.C b/gcc/testsuite/g++.dg/lto/pr118785_0.C new file mode 100644 index ..cdcc1dd947d3 --- /dev/null +++ b/gcc/testsuite/g++.dg/lto/pr118785
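The two-step scheme described for r15-7792 (first the unary operation in its own result type, then the conversion to the destination type) can be illustrated with plain intervals. This sketch is an assumption-laden toy, not GCC's value_range machinery: Range, absu_int8 and u8_to_s8 are hypothetical names, the source type is fixed to a signed 8-bit integer, and the destination type is chosen as signed 8-bit purely to make the effect visible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Closed interval [lo, hi] over mathematical integers.
struct Range {
  std::int64_t lo;
  std::int64_t hi;
};

// Step 1: ABSU on a signed 8-bit range. The result lives in the unsigned
// 8-bit type, so |-128| = 128 is representable.
Range absu_int8(Range r) {
  if (r.lo >= 0)
    return r;
  if (r.hi <= 0)
    return Range{-r.hi, -r.lo};
  return Range{0, std::max(-r.lo, r.hi)};
}

// Step 2: convert an unsigned 8-bit range to signed 8-bit. A range that
// straddles 127/128 wraps, so this sketch gives up and returns the full
// signed range.
Range u8_to_s8(Range r) {
  if (r.hi <= 127)
    return r;
  if (r.lo >= 128)
    return Range{r.lo - 256, r.hi - 256};
  return Range{-128, 127};
}
```

For a source range of [-128, -100], step 1 yields [100, 128] in the unsigned type, and step 2 then yields [-128, 127] because 128 wraps to -128. Folding both steps into a single operation typed as signed 8-bit could not even represent the intermediate value 128, which is the kind of mismatch that keeping the operation type separate avoids.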
[gcc r15-7797] aarch64: Ignore target pragmas while defining intrinsics
https://gcc.gnu.org/g:71355700432b15590123dc13833304c75ad8a0b6 commit r15-7797-g71355700432b15590123dc13833304c75ad8a0b6 Author: Andrew Carlotti Date: Fri Feb 7 17:13:36 2025 + aarch64: Ignore target pragmas while defining intrinsics Refactor the switcher classes into two separate classes: - sve_alignment_switcher takes the alignment switching functionality, and is used only for ABI correctness when defining sve structure types. - aarch64_target_switcher takes the rest of the functionality of aarch64_simd_switcher and sve_switcher, and gates simd/sve specific parts upon the specified feature flags. Additionally, aarch64_target_switcher now adds dependencies of the specified flags (which adds +fcma and +bf16 to some intrinsic declarations), and unsets current_target_pragma. This last change fixes an internal bug where we would sometimes add a user specified target pragma (stored in current_target_pragma) on top of an internally specified target architecture while initialising intrinsics with `#pragma GCC aarch64 "arm_*.h"`. As far as I can tell, this has no visible impact at the moment. However, the unintended target feature combinations lead to unwanted behaviour in an under-development patch. This also fixes a missing Makefile dependency, which was due to aarch64-sve-builtins.o incorrectly depending on the undefined $(REG_H). The correct $(REGS_H) dependency is added to the switcher's new source location. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (struct aarch64_extension_info): Add field. (aarch64_get_required_features): New. * config/aarch64/aarch64-builtins.cc (aarch64_simd_switcher::aarch64_simd_switcher): Rename to... (aarch64_target_switcher::aarch64_target_switcher): ...this, and extend to handle sve, nosimd and target pragmas. (aarch64_simd_switcher::~aarch64_simd_switcher): Rename to... (aarch64_target_switcher::~aarch64_target_switcher): ...this, and extend to handle sve, nosimd and target pragmas. 
(handle_arm_acle_h): Use aarch64_target_switcher. (handle_arm_neon_h): Rename switcher and pass explicit flags. (aarch64_general_init_builtins): Ditto. * config/aarch64/aarch64-protos.h (class aarch64_simd_switcher): Rename to... (class aarch64_target_switcher): ...this, and add new members. (aarch64_get_required_features): New prototype. * config/aarch64/aarch64-sve-builtins.cc (sve_switcher::sve_switcher): Delete (sve_switcher::~sve_switcher): Delete (sve_alignment_switcher::sve_alignment_switcher): New (sve_alignment_switcher::~sve_alignment_switcher): New (register_builtin_types): Use alignment switcher (init_builtins): Rename switcher. (handle_arm_neon_sve_bridge_h): Ditto. (handle_arm_sme_h): Ditto. (handle_arm_sve_h): Ditto, and use alignment switcher. * config/aarch64/aarch64-sve-builtins.h (class sve_switcher): Delete. (class sme_switcher): Delete. (class sve_alignment_switcher): New. * config/aarch64/t-aarch64 (aarch64-builtins.o): Add $(REGS_H). (aarch64-sve-builtins.o): Remove $(REG_H). Diff: --- gcc/common/config/aarch64/aarch64-common.cc | 19 +++-- gcc/config/aarch64/aarch64-builtins.cc | 44 + gcc/config/aarch64/aarch64-protos.h | 9 -- gcc/config/aarch64/aarch64-sve-builtins.cc | 28 +++--- gcc/config/aarch64/aarch64-sve-builtins.h | 19 - gcc/config/aarch64/t-aarch64| 4 +-- 6 files changed, 74 insertions(+), 49 deletions(-) diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc index ef4458fb6930..500bf784983d 100644 --- a/gcc/common/config/aarch64/aarch64-common.cc +++ b/gcc/common/config/aarch64/aarch64-common.cc @@ -157,6 +157,8 @@ struct aarch64_extension_info aarch64_feature_flags flags_on; /* If this feature is turned off, these bits also need to be turned off. */ aarch64_feature_flags flags_off; + /* If this feature remains enabled, these bits must also remain enabled. */ + aarch64_feature_flags flags_required; }; /* ISA extensions in AArch64. 
*/ @@ -164,9 +166,10 @@ static constexpr aarch64_extension_info all_extensions[] = { #define AARCH64_OPT_EXTENSION(NAME, IDENT, C, D, E, FEATURE_STRING) \ {NAME, AARCH64_FL_##IDENT, feature_deps::IDENT ().explicit_on, \ - feature_deps::get_flags_off (feature_deps::root_off_##IDENT)}, + feature_deps::get_flags_off (feature_deps::root_off_##IDENT), \ + feature_deps::IDENT ().enable}, #include "config/aarch64/aarch64
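The switcher classes discussed in r15-7797 follow the usual scoped save-and-restore pattern. The following is a generic sketch of that pattern only: TargetState, g_state and ScopedTargetSwitch are hypothetical names, and OR-ing the requested flags into the current ones is an assumption of this sketch rather than a description of what aarch64_target_switcher does.

```cpp
#include <cassert>

// Hypothetical global target state; stands in for the compiler's current
// feature flags and the user's pending target pragma.
struct TargetState {
  unsigned long flags;
  const char *pragma;
};

static TargetState g_state{0, nullptr};

// Scoped switch: on entry, enable extra flags and clear the pending
// pragma so that it cannot leak into declarations made inside the scope;
// on exit, restore exactly what was there before.
class ScopedTargetSwitch {
public:
  explicit ScopedTargetSwitch(unsigned long extra_flags) : saved_(g_state) {
    g_state.flags |= extra_flags;
    g_state.pragma = nullptr;
  }
  ~ScopedTargetSwitch() { g_state = saved_; }

  ScopedTargetSwitch(const ScopedTargetSwitch &) = delete;
  ScopedTargetSwitch &operator=(const ScopedTargetSwitch &) = delete;

private:
  TargetState saved_;
};
```

Clearing the pragma inside the scope is the part that corresponds to the fix described in the log, where a user-specified target pragma was sometimes applied on top of the internally specified architecture while intrinsics were being defined.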
[gcc r15-7801] RISC-V: Fix the test case bug-3.c failure
https://gcc.gnu.org/g:bfb9276f344cbc6794379d61d0279dfc3a7441b3 commit r15-7801-gbfb9276f344cbc6794379d61d0279dfc3a7441b3 Author: Pan Li Date: Mon Mar 3 14:51:21 2025 +0800 RISC-V: Fix the test case bug-3.c failure The bug-3.c test checks for slli a[0-9]+, a[0-9]+, 33 in the big poly int handling. But the underlying insn may change to slli 1 + slli 32 under some optimizations. Thus, update the asm check to a function-body check for the above slli 1 + slli 32 series. The test suites below pass with this patch. * The rv64gcv full regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/bug-3.c: Update asm check to function body check. Signed-off-by: Pan Li Diff: --- gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c index 05ac2e54cbed..2d5f4c2e0de0 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c @@ -1,5 +1,6 @@ /* { dg-do compile } */ /* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d -mrvv-max-lmul=m8 -mrvv-vector-bits=scalable -fno-vect-cost-model -O2 -ffast-math" } */ +/* { dg-final { check-function-bodies "**" "" } } */ #define N 16 @@ -25,7 +26,15 @@ _Complex float res[N] = -740.0F + 2488.0iF, -760.0F + 2638.0iF, -780.0F + 2792.0iF, -800.0F + 2950.0iF }; - +/* +** foo: +** ... +** csrr\s+[atx][0-9]+,\s*vlenb +** slli\s+[atx][0-9]+,\s*[atx][0-9],\s*1 +** ... +** slli\s+[atx][0-9]+,\s*[atx][0-9],\s*32 +** ... +*/ void foo (void) { @@ -36,4 +45,3 @@ foo (void) } /* { dg-final { scan-assembler-not {li\s+[a-x0-9]+,\s*0} } } */ -/* { dg-final { scan-assembler-times {slli\s+[a-x0-9]+,\s*[a-x0-9]+,\s*33} 1 } } */
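The reason the old and new assembly are interchangeable is simple shift arithmetic. Assuming both shifts in the new sequence apply to the same 64-bit value (the test pattern allows other instructions in between), a left shift by 1 followed by a left shift by 32 equals a single left shift by 33. The helper names below are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Single shift, as the old scan-assembler pattern expected (slli ..., 33).
constexpr std::uint64_t shift_once(std::uint64_t x) { return x << 33; }

// Two shifts, as in the new function-body pattern (slli ..., 1 followed
// later by slli ..., 32). Unsigned shifts discard overflowing bits, so the
// two forms agree on every 64-bit input.
constexpr std::uint64_t shift_twice(std::uint64_t x) { return (x << 1) << 32; }

static_assert(shift_once(3) == shift_twice(3), "shift by 33 == shift by 1 then 32");
```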
[gcc r15-7796] arm: remove some redundant zero_extend ops on thumb1
https://gcc.gnu.org/g:2a502f9e4c5c6a8e908ef1b0b5c03fb2e4bd4390

commit r15-7796-g2a502f9e4c5c6a8e908ef1b0b5c03fb2e4bd4390
Author: Richard Earnshaw
Date:   Mon Mar 3 15:30:58 2025 +0000

    arm: remove some redundant zero_extend ops on thumb1

    The code in gcc.target/unsigned-extend-1.c really should not need
    any unsigned extension operations when the optimizers are used.  For
    Arm and thumb2 that is indeed the case, but for thumb1 code it gets
    more complicated as there are too many instructions for combine to
    look at.  For thumb1 we end up with two redundant zero_extend patterns
    which are not removed: the first after the subtract instruction and
    the second of the final boolean result.

    We can partially fix this (for the second case above) by adding a new
    split pattern for LEU and GEU patterns.  This works because the two
    instructions for the [LG]EU pattern plus the redundant extension
    instruction are combined into a single insn, which we can then split
    using the 3->2 method back into the two insns of the [LG]EU sequence.

    Because we're missing the optimization for all thumb1 cases (not just
    those architectures with UXTB), I've adjusted the testcase to detect
    all the idioms that we might use for zero-extending a value, namely:

           UXTB
           AND ...#255 (in thumb1 this would require a register to hold 255)
           LSL ... #24; LSR ... #24

    but I've also marked this test as XFAIL for thumb1 because we can't yet
    eliminate the first of the two extend instructions.

    gcc/
            * config/arm/thumb1.md (split patterns for GEU and LEU): New.

    gcc/testsuite:
            * gcc.target/arm/unsigned-extend-1.c: Expand check for any
            insn suggesting a zero-extend.  XFAIL for thumb1 code.

Diff:
---
 gcc/config/arm/thumb1.md                         | 28
 gcc/testsuite/gcc.target/arm/unsigned-extend-1.c |  4 ++--
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 548c36979f12..f9e89e991d9b 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1810,6 +1810,34 @@
    (set_attr "type" "multiple")]
 )
 
+;; Re-split an LEU/GEU sequence if combine tries to oversimplify a 3-plus
+;; insn sequence.  Beware of the early-clobber of operand0
+(define_split
+  [(set (match_operand:SI 0 "s_register_operand")
+        (leu:SI (match_operand:SI 1 "s_register_operand")
+                (match_operand:SI 2 "s_register_operand")))]
+  "TARGET_THUMB1
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])"
+  [(set (match_dup 0) (const_int 0))
+   (set (match_dup 0) (plus:SI (plus:SI (match_dup 0) (match_dup 0))
+                              (geu:SI (match_dup 2) (match_dup 1))))]
+  {}
+)
+
+(define_split
+  [(set (match_operand:SI 0 "s_register_operand")
+        (geu:SI (match_operand:SI 1 "s_register_operand")
+                (match_operand:SI 2 "thumb1_cmp_operand")))]
+  "TARGET_THUMB1
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])"
+  [(set (match_dup 0) (const_int 0))
+   (set (match_dup 0) (plus:SI (plus:SI (match_dup 0) (match_dup 0))
+                              (geu:SI (match_dup 1) (match_dup 2))))]
+  {}
+)
+
 (define_insn "*thumb_jump"
   [(set (pc)
diff --git a/gcc/testsuite/gcc.target/arm/unsigned-extend-1.c b/gcc/testsuite/gcc.target/arm/unsigned-extend-1.c
index 3b4ab048fb09..fa3d34400bfa 100644
--- a/gcc/testsuite/gcc.target/arm/unsigned-extend-1.c
+++ b/gcc/testsuite/gcc.target/arm/unsigned-extend-1.c
@@ -5,5 +5,5 @@ unsigned char foo (unsigned char c)
 {
   return (c >= '0') && (c <= '9');
 }
-
-/* { dg-final { scan-assembler-not "uxtb" } } */
+/* We shouldn't need any zero-extension idioms here.  */
+/* { dg-final { scan-assembler-not "\t(uxtb|and|lsr|lsl)" { xfail arm_thumb1 } } } */
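To make the three zero-extension idioms named in the commit message concrete, here is a small C sketch, not taken from the patch, with hypothetical function names. It models UXTB, AND with 255, and the LSL #24 / LSR #24 pair on a 32-bit value, and also shows the test's range check next to a subtract-and-unsigned-compare form. Treating the latter as what the compiler derives is an assumption for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Models UXTB: keep the low 8 bits via a narrowing conversion.  */
uint32_t
zext_uxtb (uint32_t x)
{
  return (uint8_t) x;
}

/* Models AND ..., #255.  */
uint32_t
zext_and (uint32_t x)
{
  return x & 255u;
}

/* Models LSL ..., #24 followed by LSR ..., #24.  */
uint32_t
zext_shifts (uint32_t x)
{
  return ((uint32_t) (x << 24)) >> 24;
}

/* The range check from unsigned-extend-1.c.  */
unsigned char
in_digit_range (unsigned char c)
{
  return (c >= '0') && (c <= '9');
}

/* Assumed equivalent form: one subtract plus one unsigned compare.
   The result is already 0 or 1, so extending it again is redundant.  */
unsigned char
in_digit_range_sub (unsigned char c)
{
  return (unsigned int) (c - '0') <= 9u;
}
```

Since the comparison result is already a clean 0 or 1, none of the three idioms should be needed on it, which is the redundancy the new split patterns are aimed at.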
[gcc r15-7798] Fortran: reject empty derived type with bind(C) attribute [PR101577]
https://gcc.gnu.org/g:f9f16b9f74b767ca799a82f25be66a5fed25756d

commit r15-7798-gf9f16b9f74b767ca799a82f25be66a5fed25756d
Author: Harald Anlauf
Date:   Sun Mar 2 22:20:28 2025 +0100

    Fortran: reject empty derived type with bind(C) attribute [PR101577]

            PR fortran/101577

    gcc/fortran/ChangeLog:

            * symbol.cc (verify_bind_c_derived_type): Generate error message
            for derived type with no components in standard conformance mode,
            indicating that this is a GNU extension.

    gcc/testsuite/ChangeLog:

            * gfortran.dg/empty_derived_type.f90: Adjust dg-options.
            * gfortran.dg/empty_derived_type_2.f90: New test.

Diff:
---
 gcc/fortran/symbol.cc                              | 23 +++---
 gcc/testsuite/gfortran.dg/empty_derived_type.f90   |  1 +
 gcc/testsuite/gfortran.dg/empty_derived_type_2.f90 | 11 +++
 3 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc
index c6894810bce7..81aa81df2eec 100644
--- a/gcc/fortran/symbol.cc
+++ b/gcc/fortran/symbol.cc
@@ -4624,12 +4624,29 @@ verify_bind_c_derived_type (gfc_symbol *derived_sym)
      entity may be defined by means of C and the Fortran entity is said
      to be interoperable with the C entity.  There does not have to be such
      an interoperating C entity."
+
+     However, later discussion on the J3 mailing list
+     (https://mailman.j3-fortran.org/pipermail/j3/2021-July/013190.html)
+     found this to be a defect, and Fortran 2018 added in section 18.3.4
+     the following constraint:
+     "C1805: A derived type with the BIND attribute shall have at least one
+     component."
+
+     We thus allow empty derived types only as GNU extension while giving a
+     warning by default, or reject empty types in standard conformance mode.
   */
   if (curr_comp == NULL)
     {
-      gfc_warning (0, "Derived type %qs with BIND(C) attribute at %L is empty, "
-                   "and may be inaccessible by the C companion processor",
-                   derived_sym->name, &(derived_sym->declared_at));
+      if (!gfc_notify_std (GFC_STD_GNU, "Derived type %qs with BIND(C) "
+                           "attribute at %L has no components",
+                           derived_sym->name, &(derived_sym->declared_at)))
+        return false;
+      else if (!pedantic)
+        /* Generally emit warning, but not twice if -pedantic is given.  */
+        gfc_warning (0, "Derived type %qs with BIND(C) attribute at %L "
+                     "is empty, and may be inaccessible by the C "
+                     "companion processor",
+                     derived_sym->name, &(derived_sym->declared_at));
       derived_sym->ts.is_c_interop = 1;
       derived_sym->attr.is_bind_c = 1;
       return true;
diff --git a/gcc/testsuite/gfortran.dg/empty_derived_type.f90 b/gcc/testsuite/gfortran.dg/empty_derived_type.f90
index 6bf616c2c6ab..496262de2cdf 100644
--- a/gcc/testsuite/gfortran.dg/empty_derived_type.f90
+++ b/gcc/testsuite/gfortran.dg/empty_derived_type.f90
@@ -1,4 +1,5 @@
 ! { dg-do compile }
+! { dg-options "" }
 module stuff
   implicit none
   type, bind(C) :: junk ! { dg-warning "may be inaccessible by the C companion" }
diff --git a/gcc/testsuite/gfortran.dg/empty_derived_type_2.f90 b/gcc/testsuite/gfortran.dg/empty_derived_type_2.f90
new file mode 100644
index 000000000000..1ef56da4c25c
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/empty_derived_type_2.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! { dg-additional-options "-std=f2018" }
+!
+! PR fortran/101577
+!
+! Contributed by Tobias Burnus
+
+type, bind(C) :: t ! { dg-error "has no components" }
+  ! Empty!
+end type t
+end
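The control flow added to verify_bind_c_derived_type can be summarised as a three-way decision. The sketch below is a simplified, standalone C model of that decision and is not gfortran code: the enum, the struct and the function name are all hypothetical. It also assumes, based on the "not twice if -pedantic is given" comment in the diff, that gfc_notify_std itself emits the diagnostic when -pedantic is active.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for an empty BIND(C) derived type.  */
enum diagnostic
{
  DIAG_ERROR,             /* Rejected: "has no components".  */
  DIAG_EXTENSION_WARNING, /* Accepted; the extension notice warns.  */
  DIAG_INTEROP_WARNING    /* Accepted; "may be inaccessible" warning.  */
};

/* Hypothetical stand-in for the relevant compiler options.  */
struct std_mode
{
  bool gnu_extensions_allowed; /* False for e.g. -std=f2018.  */
  bool pedantic;               /* True when -pedantic is given.  */
};

enum diagnostic
classify_empty_bind_c_type (struct std_mode mode)
{
  /* Models gfc_notify_std returning false: the type is rejected.  */
  if (!mode.gnu_extensions_allowed)
    return DIAG_ERROR;

  /* Assumed: under -pedantic the extension notice already warned,
     so the second warning is skipped.  */
  if (mode.pedantic)
    return DIAG_EXTENSION_WARNING;

  /* Default mode: only the interoperability warning is emitted.  */
  return DIAG_INTEROP_WARNING;
}
```

In this model the new test with -std=f2018 corresponds to the first branch, while empty_derived_type.f90 with empty dg-options corresponds to the last one.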