[gcc(refs/users/meissner/heads/work195-test)] Add ChangeLog.test and update REVISION.
https://gcc.gnu.org/g:7cc89dc54e90170779b1629eec4a5095428e13a8

commit 7cc89dc54e90170779b1629eec4a5095428e13a8
Author: Michael Meissner
Date:   Sat Mar 8 20:45:09 2025 -0500

    Add ChangeLog.test and update REVISION.

    2025-03-08  Michael Meissner

    gcc/
            * ChangeLog.test: New file for branch.
            * REVISION: Update.

Diff:
---
 gcc/ChangeLog.test | 5 +
 gcc/REVISION       | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
new file mode 100644
index ..43bc5cfb6516
--- /dev/null
+++ b/gcc/ChangeLog.test
@@ -0,0 +1,5 @@
+ Branch work195-test, baseline
+
+2025-03-08 Michael Meissner
+
+ Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 2a0d2a2c2a0b..8de326feeaf1 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work195 branch
+work195-test branch
[gcc r15-7912] testsuite: Require effective target float16 for test [PR119133]
https://gcc.gnu.org/g:1b21da6abf22f164adb5d03cc91ef09472d035db commit r15-7912-g1b21da6abf22f164adb5d03cc91ef09472d035db Author: Dimitar Dimitrov Date: Sun Mar 9 15:57:25 2025 +0200 testsuite: Require effective target float16 for test [PR119133] The test spuriously failed on pru-unknown-elf due to missing support for _Float16 type. PR target/119133 gcc/testsuite/ChangeLog: * gcc.dg/torture/pr119133.c: Require effective target float16. Signed-off-by: Dimitar Dimitrov Diff: --- gcc/testsuite/gcc.dg/torture/pr119133.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/testsuite/gcc.dg/torture/pr119133.c b/gcc/testsuite/gcc.dg/torture/pr119133.c index 78cdda69afed..5369becd350d 100644 --- a/gcc/testsuite/gcc.dg/torture/pr119133.c +++ b/gcc/testsuite/gcc.dg/torture/pr119133.c @@ -1,4 +1,5 @@ /* { dg-additional-options "-fno-tree-ter" } */ +/* { dg-require-effective-target float16 } */ int foo(_Float16 f, int i)
[gcc r15-7913] phiopt: Fix value_replacement for middle bb having phi nodes [PR118922]
https://gcc.gnu.org/g:7232c005afb5002cdfd0a2dbd0e8b8f2d80250ce

commit r15-7913-g7232c005afb5002cdfd0a2dbd0e8b8f2d80250ce
Author: Andrew Pinski
Date:   Sat Mar 8 22:43:54 2025 -0800

    phiopt: Fix value_replacement for middle bb having phi nodes [PR118922]

    After r12-5300-gf98f373dd822b3, value_replacement is able to look at
    the following CFG structure:

    ```
      [local count: 1014686024]:
      if (h_6 != 0)
        goto ; [94.50%]
      else
        goto ; [5.50%]

      [local count: 114863530]:
      # h_6 = PHI <0(4), 1(5)>

      [local count: 1073741824]:
      # f_8 = PHI <0(5), h_6(6)>
      _9 = f_8 ^ 1;
      a.0_10 = a;
      _11 = _9 + a.0_10;
      if (_11 != -117)
        goto ; [94.50%]
      else
        goto ; [5.50%]
    ```

    value_replacement would incorrectly think the middle bb (6) was empty,
    so it decided to remove the condition in bb5 and replace it with 0,
    as the function thought it was `h_6 ? 0 : h_6`. But since there is an
    incoming phi node in bb6 defining h_6, that is incorrect.

    The fix is to check whether there are phi nodes in the middle bb and,
    if so, set empty_or_with_defined_p to false.
    This was not needed before r12-5300-gf98f373dd822b3 because the phi
    would have been dead otherwise due to other checks.

    Bootstrapped and tested on x86_64-linux-gnu.

            PR tree-optimization/118922

    gcc/ChangeLog:

            * tree-ssa-phiopt.cc (value_replacement): Set
            empty_or_with_defined_p to false when there are phi nodes
            for the middle bb.

    gcc/testsuite/ChangeLog:

            * gcc.dg/torture/pr118922-1.c: New test.

Signed-off-by: Andrew Pinski Diff: --- gcc/testsuite/gcc.dg/torture/pr118922-1.c | 57 +++ gcc/tree-ssa-phiopt.cc| 4 +++ 2 files changed, 61 insertions(+) diff --git a/gcc/testsuite/gcc.dg/torture/pr118922-1.c b/gcc/testsuite/gcc.dg/torture/pr118922-1.c new file mode 100644 index ..27e8c78c0e4e --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr118922-1.c @@ -0,0 +1,57 @@ +/* { dg-do run } */ +/* PR tree-optimization/118922 */ + +/* Phi-opt would convert: + [local count: 1014686024]: + if (h_6 != 0) +goto ; [94.50%] + else +goto ; [5.50%] + + [local count: 114863530]: + # h_6 = PHI <0(4), 1(5)> + + [local count: 1073741824]: + # f_8 = PHI <0(5), h_6(6)> + _9 = f_8 ^ 1; + a.0_10 = a; + _11 = _9 + a.0_10; + if (_11 != -117) +goto ; [94.50%] + else +goto ; [5.50%] + +into: + + [local count: 59055799]: + c = d_3; + + [local count: 1073741824]: + # f_8 = PHI <0(5), 0(4)> + _9 = f_8 ^ 1; + a.0_10 = a; + _11 = _9 + a.0_10; + if (_11 != -117) +goto ; [94.50%] + else +goto ; [5.50%] + +as it thought the middle bb was empty as there was only a phi node there. */ + + +int a = -117, b, c, e; +void g(int h) { + int f = 0; + while (!f + a - -117) { +f = h == 0; +if (h == 0) + h = 1; + } +} +int main() { + int d = 8; + for (; e;) +d = 0; + c = d; + g(0); +} diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc index f67f52d2d69a..7d2d1696ee70 100644 --- a/gcc/tree-ssa-phiopt.cc +++ b/gcc/tree-ssa-phiopt.cc @@ -1331,6 +1331,10 @@ value_replacement (basic_block cond_bb, basic_block middle_bb, empty_or_with_defined_p = false; } + /* The middle bb is not empty if there are any phi nodes. */ + if (phi_nodes (middle_bb)) +empty_or_with_defined_p = false; + /* We need to know which is the true edge and which is the false edge so that we know if have abs or negative abs. */ extract_true_false_edges_from_block (cond_bb, &true_edge, &false_edge);
[gcc r14-11395] Fortran: Fix ICE in resolve.cc with -pedantic
https://gcc.gnu.org/g:e4f886c463fffd0bd2b1c98fa668c20aab5b37d2 commit r14-11395-ge4f886c463fffd0bd2b1c98fa668c20aab5b37d2 Author: Jerry DeLisle Date: Fri Mar 7 18:33:29 2025 -0800 Fortran: Fix ICE in resolve.cc with -pedantic Fixes an ICE in gfc_resolve_code when passing an optional array to an elemental procedure with `-pedantic` enabled. PR95446 added the original check, this patch fixes the case where the other actual argument is an array literal (or something else other than a variable). PR fortran/119054 gcc/fortran/ChangeLog: * resolve.cc (resolve_elemental_actual): When checking other actual arguments to elemental procedures, don't check attributes of literals and function calls. gcc/testsuite/ChangeLog: * gfortran.dg/pr95446.f90: Expand test case to literals and function calls. Signed-off-by: Peter Hill (cherry picked from commit 3014f8787196d7c0d15d24195c8f07167968ff55) Diff: --- gcc/fortran/resolve.cc| 4 +++- gcc/testsuite/gfortran.dg/pr95446.f90 | 14 ++ 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index a0ed0e516da3..0261b4e6b246 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -2409,7 +2409,9 @@ resolve_elemental_actual (gfc_expr *expr, gfc_code *c) for (a = arg0; a; a = a->next) if (a != arg && a->expr->rank == arg->expr->rank - && !a->expr->symtree->n.sym->attr.optional) + && (a->expr->expr_type != EXPR_VARIABLE + || (a->expr->expr_type == EXPR_VARIABLE + && !a->expr->symtree->n.sym->attr.optional))) { t = true; break; diff --git a/gcc/testsuite/gfortran.dg/pr95446.f90 b/gcc/testsuite/gfortran.dg/pr95446.f90 index 86e1019d7af9..0787658813aa 100644 --- a/gcc/testsuite/gfortran.dg/pr95446.f90 +++ b/gcc/testsuite/gfortran.dg/pr95446.f90 @@ -22,6 +22,20 @@ program elemental_optional end function outer + function outer_literal(o) result(l) +integer, intent(in), optional :: o(5) +integer :: l(5) + +l = inner(o, [1,2,3,4,5]) + end function outer_literal + + function 
outer_func(o) result(l) +integer, intent(in), optional :: o(5) +integer :: l(5) + +l = inner(o, outer()) + end function outer_func + elemental function inner(a,b) result(x) integer, intent(in), optional :: a integer, intent(in) :: b
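
The patched condition in resolve_elemental_actual reads as "another actual argument of the same rank counts, unless it is an optional variable". The following is a minimal sketch of that boolean logic only, assuming a hypothetical stand-in struct `Arg` in place of gfortran's real `gfc_expr`; it checks that the condition as written in the patch agrees with the shorter form `!is_variable || !optional` on every input.

```cpp
#include <cassert>

// Hypothetical stand-in (not gfortran's gfc_expr) for the two properties of
// an actual argument that the patched condition inspects.
struct Arg {
  bool is_variable;  // models a->expr->expr_type == EXPR_VARIABLE
  bool optional;     // models a->expr->symtree->n.sym->attr.optional
};

// The condition as written in the patch. The short-circuit matters in the
// real code: the symtree is only inspected for variables.
bool patched_condition(const Arg &a) {
  return !a.is_variable || (a.is_variable && !a.optional);
}

// A logically equivalent shorter form with the same short-circuit order.
bool simplified_condition(const Arg &a) {
  return !a.is_variable || !a.optional;
}

// Compare both forms over all four input combinations.
void self_check() {
  for (int v = 0; v < 2; ++v)
    for (int o = 0; o < 2; ++o) {
      Arg a{v != 0, o != 0};
      assert(patched_condition(a) == simplified_condition(a));
    }
}
```

In this model, a literal or function result (`is_variable == false`) always satisfies the condition, which corresponds to the two new test functions `outer_literal` and `outer_func` in the diff above.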
[gcc r14-11394] Fortran: Fix gimplification error on assignment to pointer [PR103391]
https://gcc.gnu.org/g:c6b2a359d348e2255cbf5b548540ecd8a5fa5a59 commit r14-11394-gc6b2a359d348e2255cbf5b548540ecd8a5fa5a59 Author: Andre Vehreschild Date: Tue Mar 4 12:56:20 2025 +0100 Fortran: Fix gimplification error on assignment to pointer [PR103391] PR fortran/103391 gcc/fortran/ChangeLog: * trans-expr.cc (gfc_trans_assignment_1): Do not use poly assign for pointer arrays on lhs (as it is done for allocatables already). gcc/testsuite/ChangeLog: * gfortran.dg/assign_12.f90: New test. (cherry picked from commit 04909c7ecc023874c3444b85f88c60b7b7cc7778) Diff: --- gcc/fortran/trans-expr.cc | 16 gcc/testsuite/gfortran.dg/assign_12.f90 | 28 2 files changed, 36 insertions(+), 8 deletions(-) diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index 68b627cc4651..cf6511852132 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -12350,14 +12350,14 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag, needed. */ lhs_attr = gfc_expr_attr (expr1); - is_poly_assign = (use_vptr_copy || lhs_attr.pointer - || (lhs_attr.allocatable && !lhs_attr.dimension)) - && (expr1->ts.type == BT_CLASS - || gfc_is_class_array_ref (expr1, NULL) - || gfc_is_class_scalar_expr (expr1) - || gfc_is_class_array_ref (expr2, NULL) - || gfc_is_class_scalar_expr (expr2)) - && lhs_attr.flavor != FL_PROCEDURE; + is_poly_assign += (use_vptr_copy + || ((lhs_attr.pointer || lhs_attr.allocatable) && !lhs_attr.dimension)) + && (expr1->ts.type == BT_CLASS || gfc_is_class_array_ref (expr1, NULL) + || gfc_is_class_scalar_expr (expr1) + || gfc_is_class_array_ref (expr2, NULL) + || gfc_is_class_scalar_expr (expr2)) + && lhs_attr.flavor != FL_PROCEDURE; realloc_flag = flag_realloc_lhs && gfc_is_reallocatable_lhs (expr1) diff --git a/gcc/testsuite/gfortran.dg/assign_12.f90 b/gcc/testsuite/gfortran.dg/assign_12.f90 new file mode 100644 index ..be31021f24c6 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/assign_12.f90 @@ -0,0 +1,28 @@ +!{ dg-do run } +! +! 
Check assignment works for derived types to memory referenced by pointer +! Contributed by G. Steinmetz + +program pr103391 + type t + character(1) :: c + end type + type t2 + type(t), pointer :: a(:) + end type + + type(t), target :: arr(2) + type(t2) :: r + + arr = [t('a'), t('b')] + + r = f([arr]) + if (any(r%a(:)%c /= ['a', 'b'])) stop 1 +contains + function f(x) + class(t), intent(in), target :: x(:) + type(t2) :: f + allocate(f%a(size(x,1))) + f%a = x + end +end
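
The change to `is_poly_assign` in trans-expr.cc moves `lhs_attr.pointer` inside the `!lhs_attr.dimension` guard. As a rough sketch, assuming a hypothetical `LhsAttr` struct that models only the three attribute bits involved (the class-type checks and the `FL_PROCEDURE` test from the real expression are left out), the old and new gates can be compared exhaustively:

```cpp
#include <cassert>

// Hypothetical model of the lhs attribute bits used by the first clause of
// is_poly_assign; this is not gfortran's symbol_attribute.
struct LhsAttr {
  bool pointer;
  bool allocatable;
  bool dimension;  // true when the lhs is an array
};

// First clause before the patch.
bool old_gate(bool use_vptr_copy, const LhsAttr &l) {
  return use_vptr_copy || l.pointer || (l.allocatable && !l.dimension);
}

// First clause after the patch.
bool new_gate(bool use_vptr_copy, const LhsAttr &l) {
  return use_vptr_copy || ((l.pointer || l.allocatable) && !l.dimension);
}

// Count the input combinations on which the two gates disagree, checking
// that each one is a pointer array without use_vptr_copy.
int count_differences() {
  int count = 0;
  for (int bits = 0; bits < 16; ++bits) {
    bool use_vptr_copy = (bits & 1) != 0;
    LhsAttr l{(bits & 2) != 0, (bits & 4) != 0, (bits & 8) != 0};
    if (old_gate(use_vptr_copy, l) != new_gate(use_vptr_copy, l)) {
      assert(!use_vptr_copy && l.pointer && l.dimension);
      ++count;
    }
  }
  return count;
}
```

Under these assumptions the gates differ only for pointer arrays on the lhs, which is the case the ChangeLog entry describes and the new test assign_12.f90 exercises with `f%a = x`.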
[gcc(refs/users/meissner/heads/work195-libs)] Merge commit 'refs/users/meissner/heads/work195-libs' of git+ssh://gcc.gnu.org/git/gcc into me/work1
https://gcc.gnu.org/g:3574c4424621c46d0215821d499779764f96e41c

commit 3574c4424621c46d0215821d499779764f96e41c
Merge: e0707d9bf04b a5fa0d4ba5af
Author: Michael Meissner
Date:   Sat Mar 8 21:14:47 2025 -0500

    Merge commit 'refs/users/meissner/heads/work195-libs' of git+ssh://gcc.gnu.org/git/gcc into me/work195-libs

Diff:
[gcc/meissner/heads/work195-test] (14 commits) Merge commit 'refs/users/meissner/heads/work195-test' of gi
The branch 'meissner/heads/work195-test' was updated to point to:

 35806655a415... Merge commit 'refs/users/meissner/heads/work195-test' of gi

It previously pointed to:

 7cc89dc54e90... Add ChangeLog.test and update REVISION.

Diff:

Summary of changes (added commits):
---

  3580665... Merge commit 'refs/users/meissner/heads/work195-test' of gi
  72b13cd... Add ChangeLog.test and update REVISION.
  3ced480... xUse architecture flags for defining _ARCH_PWR macros. (*)
  7aa1750... Add rs6000 architecture masks. (*)
  94cad41... Do not allow -mvsx to boost processor to power7. (*)
  9d6a493... Use vector pair load/store for memcpy with -mcpu=future (*)
  f049955... Add -mcpu=future tests. (*)
  883cc86... Add -mcpu=future tuning support. (*)
  7997a87... Add support for -mcpu=future (*)
  5115a81... Change TARGET_MODULO to TARGET_POWER9. (*)
  14ffdf1... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  558802e... Change TARGET_CMPB to TARGET_POWER6. (*)
  1f54dda... Change TARGET_FPRND to TARGET_POWER5X. (*)
  0a7acdc... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
    Because the reference `refs/users/meissner/heads/work195-test'
    matches your hooks.email-new-commits-only configuration,
    no separate email is sent for this commit.
[gcc r15-7915] [rtl-optimization/117467] Avoid unnecessarily marking things live in ext-dce
https://gcc.gnu.org/g:4ed07a11ee2845c2085a3cd5cff043209a452441 commit r15-7915-g4ed07a11ee2845c2085a3cd5cff043209a452441 Author: Jeff Law Date: Sun Mar 9 13:28:10 2025 -0600 [rtl-optimization/117467] Avoid unnecessarily marking things live in ext-dce This is the first of what I expect to be a few patches to improve memory consumption and performance of ext-dce. While I haven't been able to reproduce the insane memory usage that Richi saw, I can certainly see how we might get there. I instrumented ext-dce to dump the size of liveness sets, removed the memory allocation limiter, then compiled the appropriate file from specfp on rv64. In my test I saw the liveness sets growing to absurd sizes as we worked from the last block back to the first. Think 125k entries by the time we got back to the entry block which would mean ~30k live registers. Simply no way that's correct. The use handling is the primary source of problems and the code that I most want to rewrite for gcc-16. It's just a fugly mess. I'm not terribly inclined to do that rewrite for gcc-15 though. So these will be spot adjustments. The most important thing to know about use processing is it sets up an iterator and walks that. When a SET is encountered we actually manually dive into the SRC/DEST and ideally terminate the iterator. If during that SET processing we encounter something unexpected we let the iterator continue normally, which causes iteration down into the SET_DEST object. That's safe behavior, though it can lead to too many objects as being marked live. We can refine that behavior by trivially realizing that we need not process the SET_DEST if it is a naked REG (and probably for other cases too, but they're not expected to be terribly important). So once we see the SET with a simple REG destination, we can bump the iterator to avoid having it dive into the SET_DEST if something unexpected is seen on the SET_SRC side. 
Fixing this alone takes us from 125k live objects to 10k live objects at the entry block. Time in ext-dce for rv64 on the testcase goes from 10.81s to 2.14s. Given this reduces the things considered live, this could easily result in finding more cases for ext-dce to improve. In fact a missed optimization issue for rv64 I've been poking at needs this patch as a prerequisite. Bootstrapped and regression tested on x86_64. Pushing to the trunk. PR rtl-optimization/117467 gcc * ext-dce.cc (ext_dce_process_uses): When trivially possible advance the iterator over the destination of a SET. Diff: --- gcc/ext-dce.cc | 12 1 file changed, 12 insertions(+) diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc index 626c431f601e..c53dd5b46161 100644 --- a/gcc/ext-dce.cc +++ b/gcc/ext-dce.cc @@ -643,6 +643,18 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, /* The code of the RHS of a SET. */ enum rtx_code code = GET_CODE (src); + /* If we break the main loop below, then we will continue processing +sub-components of this RTX, including the SET_DEST. + +That is not necessary if the SET_DEST is a REG. We can just bump the +iterator to the next element to skip handling the SET_DEST. + +We can probably do this for ZERO_EXTRACT, STRICT_LOW_PART and SUBREG +destinations as well. But I want to rewrite all this code and keep +this fix conservative given we're deep into the gcc-15 release cycle. */ + if (REG_P (dst)) + iter.next (); + /* ?!? How much of this should mirror SET handling, potentially being shared? */ if (SUBREG_P (dst) && SUBREG_BYTE (dst).is_constant ())
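
The effect of bumping the iterator past a REG destination can be illustrated with a toy model. This is only a sketch under loud assumptions: the `Node` type, the flattened pre-order `walk` vector, and `live_regs` are all hypothetical and are not the subrtx iterator or the data structures in ext-dce.cc. The model covers the fallback case from the commit message, where the SET source is "unexpected" and iteration continues into the sub-components of the SET.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical node kinds for a flattened pre-order walk of one insn.
enum class Kind { Set, Reg, Other };

struct Node {
  Kind kind;
  int regno;  // only meaningful for Kind::Reg
};

// Walk the list and mark every visited register as live. In this model each
// Set is followed first by its destination, then by its source nodes.
// With skip_reg_dest set, a Reg destination is stepped over, which is the
// analogue of the `if (REG_P (dst)) iter.next ();` added by the patch.
std::set<int> live_regs(const std::vector<Node> &walk, bool skip_reg_dest) {
  std::set<int> live;
  for (std::size_t i = 0; i < walk.size(); ++i) {
    const Node &n = walk[i];
    if (n.kind == Kind::Set && skip_reg_dest && i + 1 < walk.size()
        && walk[i + 1].kind == Kind::Reg) {
      ++i;  // step over the destination so it is never marked live
      continue;
    }
    if (n.kind == Kind::Reg)
      live.insert(n.regno);
  }
  return live;
}

// Example: r1 = <unexpected expression using r2 and r3>.
std::vector<Node> example_walk() {
  return {{Kind::Set, -1}, {Kind::Reg, 1}, {Kind::Other, -1},
          {Kind::Reg, 2}, {Kind::Reg, 3}};
}
```

In this model the destination register 1 is marked live only when the skip is disabled; the source registers 2 and 3 are marked live either way, so the change stays conservative on the source side.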
[gcc(refs/users/omachota/heads/rtl-ssa-dce)] rtl-ssa: dce fix bad marked insn map passing
https://gcc.gnu.org/g:5759cb1c0b15d7c83f3aa5701d883508004a1b83 commit 5759cb1c0b15d7c83f3aa5701d883508004a1b83 Author: Ondřej Machota Date: Sun Mar 9 20:59:45 2025 +0100 rtl-ssa: dce fix bad marked insn map passing Diff: --- gcc/dce.cc | 146 - 1 file changed, 116 insertions(+), 30 deletions(-) diff --git a/gcc/dce.cc b/gcc/dce.cc index a04c5702d8a3..fa3d7721a6d0 100644 --- a/gcc/dce.cc +++ b/gcc/dce.cc @@ -1313,6 +1313,7 @@ bool sets_global_register(const_rtx rtx) { } // We should mark stack registers +// use HARD_FRAME_POINTER_REGNUM, REGNO_PTR_FRAME_P bool sets_global_register(rtx_insn* insn) { rtx set = single_set(insn); if (!set) @@ -1367,21 +1368,55 @@ bool is_unary_mem_modification(rtx_code code) { } } -bool is_rtx_insn_prelive(const_rtx rtx) { - if (rtx == nullptr) { -return false; +bool is_rtx_insn_prelive(rtx_insn *insn) { + gcc_assert(insn != nullptr); + + // TODO : handle calls correctly + if (CALL_P (insn) + /* We cannot delete pure or const sibling calls because it is +hard to see the result. */ + && (!SIBLING_CALL_P (insn)) + /* We can delete dead const or pure calls as long as they do not + infinite loop. */ + && (RTL_CONST_OR_PURE_CALL_P (insn) && !RTL_LOOPING_CONST_OR_PURE_CALL_P (insn)) + /* Don't delete calls that may throw if we cannot do so. */ + && can_delete_call (insn)) +return true; +// return !find_call_stack_args (as_a (insn), false, fast, arg_stores); + + // Jumps, notes, barriers should not be deleted +// According to the docs, rtl ssa does not contain noteS and barrierS + if (!NONJUMP_INSN_P (insn)) + { +std::cerr << "found jump instruction\n"; +debug(insn); +return true; } - auto code = GET_CODE(rtx); - if (is_control_flow(code)) + // Only rtx_insn should be handled here + auto code = GET_CODE(insn); + gcc_assert(code == INSN); + + /* Don't delete insns that may throw if we cannot do so. */ + if (!(cfun->can_delete_dead_exceptions && can_alter_cfg) && !insn_nothrow_p (insn)) return true; + /* TODO : What about call argumets? 
Accoring to the docs, for function prologue the RTX_FRAME_RELATED_P + should return true. + */ + /* Callee-save restores are needed. */ + if (RTX_FRAME_RELATED_P (insn) && crtl->shrink_wrapped_separate && find_reg_note (insn, REG_CFA_RESTORE, NULL)) +return true; + + // if (is_control_flow(code)) + // return true; + // Mark set of a global register - if (sets_global_register(rtx)) // check rtx_class with GET_RTX_CLASS if RTX_ISNS and convert if needed + if (sets_global_register(insn)) // check rtx_class with GET_RTX_CLASS if RTX_ISNS and convert if needed return true; - // Call is inside side_effects_p - how to mark parameter registers? - if (volatile_refs_p(rtx) || can_throw_internal(rtx) || BARRIER_P(rtx) || code == PREFETCH) + rtx body = PATTERN(insn); + if (side_effects_p(body) || can_throw_internal(body)) return true; if (is_unary_mem_modification(code)) @@ -1391,21 +1426,34 @@ bool is_rtx_insn_prelive(const_rtx rtx) { // Parallel is handled by volatile_refs_p switch (code) { - - } - - const char *const fmt = GET_RTX_FORMAT (code); - for (size_t i = 0; i < GET_RTX_LENGTH(code); ++i) { -if (fmt[i] == 'e' && is_rtx_insn_prelive(XEXP(rtx, i))) { +case MEM: +case ASM_INPUT: +case ASM_OPERANDS: return true; -} else if (fmt[i] == 'E') { - for (size_t j = 0; j < XVECLEN(rtx, i); ++j) { -if (is_rtx_insn_prelive(XVECEXP(rtx, i, j))) - return true; - } -} + +case PARALLEL: + for (int i = XVECLEN (body, 0) - 1; i >= 0; i--) + if (!deletable_insn_p_1 (XVECEXP (body, 0, i))) + return true; + return false; + break; + +default: + break; } + // const char *const fmt = GET_RTX_FORMAT (code); + // for (size_t i = 0; i < GET_RTX_LENGTH(code); ++i) { + // if (fmt[i] == 'e' && is_rtx_insn_prelive(XEXP(rtx, i))) { + // return true; + // } else if (fmt[i] == 'E') { + // for (size_t j = 0; j < XVECLEN(rtx, i); ++j) { + // if (is_rtx_insn_prelive(XVECEXP(rtx, i, j))) + // return true; + // } + // } + // } + return false; } @@ -1440,7 +1488,7 @@ bool is_prelive(insn_info *insn) if 
(!INSN_P(rtl)) // This might be useless return false; - rtx pat = PATTERN(rtl); // if we use this instead of rtl, then rtl notes wont be checked + return is_rtx_insn_prelive(rtl); // TODO : join if statements // We need to describe all possible prelive instructions, a list of all the instructions is inside `rtl.def` @@ -1458,7 +1506,7 @@ bool is_prelive(insn_info *insn) std::cerr << "Prelive: " << GET_RTX_NAME(code) << '\n'; // debug(insn); debug(rtl); - if (volatile_refs_p(rtl) || can_throw_internal(rtl) || BARRIER_P(rtl) || code == PREFETCH)
[gcc r15-7852] ira: Add new hooks for callee-save vs spills [PR117477]
https://gcc.gnu.org/g:e836d80374aa03a5ea5bd6cca00d826020c461da commit r15-7852-ge836d80374aa03a5ea5bd6cca00d826020c461da Author: Richard Sandiford Date: Thu Mar 6 11:06:25 2025 + ira: Add new hooks for callee-save vs spills [PR117477] Following on from the discussion in: https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675256.html this patch removes TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE and replaces it with two hooks: one that controls the cost of using an extra callee-saved register and one that controls the cost of allocating a frame for the first spill. (The patch does not attempt to address the shrink-wrapping part of the thread above.) On AArch64, this is enough to fix PR117477, as verified by the new tests. The patch does not change the SPEC2017 scores significantly. (I saw a slight improvement in fotonik3d and roms, but I'm not convinced that the improvements are real.) The patch makes IRA use caller saves for gcc.target/aarch64/pr103350-1.c, which is a scan-dump correctness test that relies on not using caller saves. The decision to use caller saves looks appropriate, and saves an instruction, so I've just added -fno-caller-saves to the test options. The x86 parts were written by Honza. gcc/ PR rtl-optimization/117477 * config/aarch64/aarch64.cc (aarch64_count_saves): New function. (aarch64_count_above_hard_fp_saves, aarch64_callee_save_cost) (aarch64_frame_allocation_cost): Likewise. (TARGET_CALLEE_SAVE_COST): Define. (TARGET_FRAME_ALLOCATION_COST): Likewise. * config/i386/i386.cc (ix86_ira_callee_saved_register_cost_scale): Replace with... (ix86_callee_save_cost): ...this new hook. (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete. (TARGET_CALLEE_SAVE_COST): Define. * target.h (spill_cost_type, frame_cost_type): New enums. * target.def (callee_save_cost, frame_allocation_cost): New hooks. (ira_callee_saved_register_cost_scale): Delete. * doc/tm.texi.in (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete. 
(TARGET_CALLEE_SAVE_COST, TARGET_FRAME_ALLOCATION_COST): New hooks. * doc/tm.texi: Regenerate. * hard-reg-set.h (hard_reg_set_popcount): New function. * ira-color.cc (allocated_memory_p): New variable. (allocated_callee_save_regs): Likewise. (record_allocation): New function. (assign_hard_reg): Use targetm.frame_allocation_cost to model the cost of the first spill or first caller save. Use targetm.callee_save_cost to model the cost of using new callee-saved registers. Apply the exit rather than entry frequency to the cost of restoring a register or deallocating the frame. Update the new variables above. (improve_allocation): Use record_allocation. (color): Initialize allocated_callee_save_regs. (ira_color): Initialize allocated_memory_p. * targhooks.h (default_callee_save_cost): Declare. (default_frame_allocation_cost): Likewise. * targhooks.cc (default_callee_save_cost): New function. (default_frame_allocation_cost): Likewise. gcc/testsuite/ PR rtl-optimization/117477 * gcc.target/aarch64/callee_save_1.c: New test. * gcc.target/aarch64/callee_save_2.c: Likewise. * gcc.target/aarch64/callee_save_3.c: Likewise. * gcc.target/aarch64/pr103350-1.c: Add -fno-caller-saves. 
Co-authored-by: Jan Hubicka Diff: --- gcc/config/aarch64/aarch64.cc| 118 +++ gcc/config/i386/i386.cc | 28 -- gcc/doc/tm.texi | 77 +-- gcc/doc/tm.texi.in | 6 +- gcc/hard-reg-set.h | 15 +++ gcc/ira-color.cc | 83 ++-- gcc/target.def | 87 ++--- gcc/target.h | 12 +++ gcc/targhooks.cc | 27 ++ gcc/targhooks.h | 5 + gcc/testsuite/gcc.target/aarch64/callee_save_1.c | 12 +++ gcc/testsuite/gcc.target/aarch64/callee_save_2.c | 14 +++ gcc/testsuite/gcc.target/aarch64/callee_save_3.c | 12 +++ gcc/testsuite/gcc.target/aarch64/pr103350-1.c| 2 +- 14 files changed, 459 insertions(+), 39 deletions(-) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 9196b8d906c8..9bea8ce88f9f 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -15873,6 +15873,118 @@ aarch64_memory
[gcc r14-11396] Fortran: Fix segmentation fault in defined assignment [PR109066]
https://gcc.gnu.org/g:f6f90e015f642424ba0e94871d9389facaca5395

commit r14-11396-gf6f90e015f642424ba0e94871d9389facaca5395
Author: Paul Thomas
Date:   Sat Nov 16 15:56:10 2024 +

    Fortran: Fix segmentation fault in defined assignment [PR109066]

    2024-11-16  Paul Thomas

    gcc/fortran
            PR fortran/109066
            * resolve.cc (generate_component_assignments): If the temporary
            for 'var' is a pointer and 'expr' is neither a constant nor
            a variable, change its attribute from pointer to allocatable.
            This avoids assignment to a temporary pointer that has been
            neither allocated nor associated.

    gcc/testsuite/
            PR fortran/109066
            * gfortran.dg/defined_assignment_12.f90: New test.

    (cherry picked from commit 27ff8049bbdb0a001ba46835cd6a334c4ac76573)

Diff:
---
 gcc/fortran/resolve.cc | 5 ++
 .../gfortran.dg/defined_assignment_12.f90 | 61 ++
 2 files changed, 66 insertions(+)

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 0261b4e6b246..71cb3d758016 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -12157,6 +12157,11 @@ generate_component_assignments (gfc_code **code, gfc_namespace *ns)
 	{
 	  /* Assign the rhs to the temporary. */
 	  tmp_expr = get_temp_from_expr ((*code)->expr1, ns);
+	  if (tmp_expr->symtree->n.sym->attr.pointer)
+	    {
+	      tmp_expr->symtree->n.sym->attr.pointer = 0;
+	      tmp_expr->symtree->n.sym->attr.allocatable = 1;
+	    }
 	  this_code = build_assignment (EXEC_ASSIGN,
 					tmp_expr, (*code)->expr2,
 					NULL, NULL, (*code)->loc);
diff --git a/gcc/testsuite/gfortran.dg/defined_assignment_12.f90 b/gcc/testsuite/gfortran.dg/defined_assignment_12.f90
new file mode 100644
index ..57445abe25c4
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/defined_assignment_12.f90
@@ -0,0 +1,61 @@
+! { dg-do run }
+!
+! Test fix of PR109066, which caused segfaults as below
+!
+! Contributed by Andrew Benson
+!
+module bugMod + + type :: rm + integer :: c=0 + contains + procedure :: rma + generic :: assignment(=) => rma + end type rm + + type :: lc + type(rm) :: lm + end type lc + +contains + + impure elemental subroutine rma(to,from) +implicit none +class(rm), intent(out) :: to +class(rm), intent(in) :: from +to%c = -from%c +return + end subroutine rma + +end module bugMod + +program bug + use bugMod + implicit none + type(lc), pointer :: i, j(:) + + allocate (i) + i = lc (rm (1)) ! Segmentation fault + if (i%lm%c .ne. -1) stop 1 + i = i_ptr () ! Segmentation fault + if (i%lm%c .ne. 1) stop 2 + + allocate (j(2)) + j = [lc (rm (2)), lc (rm (3))] ! Segmentation fault + if (any (j%lm%c .ne. [-2,-3])) stop 3 + j = j_ptr () ! Worked! + if (any (j%lm%c .ne. [2,3])) stop 4 + +contains + + function i_ptr () result(res) +type(lc), pointer :: res +res => i + end function + + function j_ptr () result(res) +type(lc), pointer :: res (:) +res => j + end function + +end program bug
[gcc(refs/users/omachota/heads/rtl-ssa-dce)] rtl-ssa: dce pass simple testcase
https://gcc.gnu.org/g:a6de52696da5e080e9694b0a85f4f463801a4f52 commit a6de52696da5e080e9694b0a85f4f463801a4f52 Author: Ondřej Machota Date: Sun Mar 9 23:02:03 2025 +0100 rtl-ssa: dce pass simple testcase Diff: --- gcc/cse.cc | 2 +- gcc/dce.cc | 233 + gcc/passes.def | 3 +- 3 files changed, 104 insertions(+), 134 deletions(-) diff --git a/gcc/cse.cc b/gcc/cse.cc index c53deecbe547..4dc1d3d57849 100644 --- a/gcc/cse.cc +++ b/gcc/cse.cc @@ -7622,7 +7622,7 @@ rest_of_handle_cse2 (void) bypassed safely. */ cse_condition_code_reg (); - delete_trivially_dead_insns (get_insns (), max_reg_num ()); + // delete_trivially_dead_insns (get_insns (), max_reg_num ()); if (tem == 2) { diff --git a/gcc/dce.cc b/gcc/dce.cc index fa3d7721a6d0..3e6e47f19208 100644 --- a/gcc/dce.cc +++ b/gcc/dce.cc @@ -1322,12 +1322,12 @@ bool sets_global_register(rtx_insn* insn) { rtx dest = SET_DEST(set); // TODO : rewrite to simple return - std::cerr << "first pseudo: " << FIRST_PSEUDO_REGISTER << '\n'; - std::cerr << "register: " << REGNO(dest) << "\n"; - debug(insn); + //std::cerr << "first pseudo: " << FIRST_PSEUDO_REGISTER << '\n'; + //std::cerr << "register: " << REGNO(dest) << "\n"; + //debug(insn); // If I understand correctly, global_regs[i] is 1 iff reg i is used if (REG_P(dest) && HARD_REGISTER_NUM_P(REGNO(dest))) { // && global_regs[REGNO(dest)] -std::cerr << "sets_global_register: true\n"; +//std::cerr << "sets_global_register: true\n"; return true; } @@ -1353,24 +1353,88 @@ bool is_control_flow(rtx_code code) { } } -bool is_unary_mem_modification(rtx_code code) { - switch (code) { -case PRE_DEC: +bool side_effects_with_mem (const_rtx x) +{ + const RTX_CODE code = GET_CODE (x); + switch (code) +{ +case LABEL_REF: +case SYMBOL_REF: +case CONST: +CASE_CONST_ANY: +case PC: +case REG: +case SCRATCH: +case ADDR_VEC: +case ADDR_DIFF_VEC: +case VAR_LOCATION: + return false; + +case CLOBBER: + /* Reject CLOBBER with a non-VOID mode. 
These are made by combine.cc +when some combination can't be done. If we see one, don't think +that we can simplify the expression. */ + return (GET_MODE (x) != VOIDmode); + case PRE_INC: -case POST_DEC: +case PRE_DEC: case POST_INC: +case POST_DEC: case PRE_MODIFY: case POST_MODIFY: +case CALL: +case UNSPEC_VOLATILE: + return true; + +case MEM: // We might want tu return true iff volatile or mem is a destination +case ASM_INPUT: +case ASM_OPERANDS: + return true; + +case USE: return true; default: - return false; + break; +} + + /* Recursively scan the operands of this expression. */ + + { +const char *fmt = GET_RTX_FORMAT (code); +int i; + +for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + { + if (side_effects_with_mem (XEXP (x, i))) + return true; + } + else if (fmt[i] == 'E') + { + int j; + for (j = 0; j < XVECLEN (x, i); j++) + if (side_effects_with_mem (XVECEXP (x, i, j))) + return true; + } + } } + return false; } bool is_rtx_insn_prelive(rtx_insn *insn) { gcc_assert(insn != nullptr); + // Jumps, notes, barriers should not be deleted + // According to the docs, rtl ssa does not contain noteS and barrierS + if (!NONJUMP_INSN_P (insn)) + { +std::cerr << "found jump instruction\n"; +debug(insn); +return true; + } + // TODO : handle calls correctly if (CALL_P (insn) /* We cannot delete pure or const sibling calls because it is @@ -1384,15 +1448,6 @@ bool is_rtx_insn_prelive(rtx_insn *insn) { return true; // return !find_call_stack_args (as_a (insn), false, fast, arg_stores); - // Jumps, notes, barriers should not be deleted -// According to the docs, rtl ssa does not contain noteS and barrierS - if (!NONJUMP_INSN_P (insn)) - { -std::cerr << "found jump instruction\n"; -debug(insn); -return true; - } - // Only rtx_insn should be handled here auto code = GET_CODE(insn); gcc_assert(code == INSN); @@ -1408,51 +1463,17 @@ bool is_rtx_insn_prelive(rtx_insn *insn) { if (RTX_FRAME_RELATED_P (insn) && crtl->shrink_wrapped_separate && 
find_reg_note (insn, REG_CFA_RESTORE, NULL)) return true; - // if (is_control_flow(code)) - // return true; - // Mark set of a global register if (sets_global_register(insn)) // check rtx_class with GET_RTX_CLASS if RTX_ISNS and convert if needed return true; rtx body = PATTERN(insn); - if (side_effects_p(body) || can_throw_internal(body)) -return true; - - if (is_unary_mem_modification(code)) + if (side_effects_with_mem(body) || can_throw_internal(body)) return
[gcc r15-7914] Use gfc_commit_symbol() to remove UNDO status instead of new function.
https://gcc.gnu.org/g:9f5b508bc5c16ae11ea385f6031487a518f62c8f

commit r15-7914-g9f5b508bc5c16ae11ea385f6031487a518f62c8f
Author: Thomas Koenig
Date:   Sun Mar 9 19:35:06 2025 +0100

    Use gfc_commit_symbol() to remove UNDO status instead of new function.

    This is a cleaner version, removing an unneeded function and
    making sure that no memory leaks can occur if callers change.

    gcc/fortran/ChangeLog:

            PR fortran/119157
            * gfortran.h (gfc_pop_undo_symbol): Remove prototype.
            * interface.cc (gfc_get_formal_from_actual_arglist): Use
            gfc_commit_symbol() instead of gfc_pop_undo_symbol().
            * symbol.cc (gfc_pop_undo_symbol): Remove.

Diff:
---
 gcc/fortran/gfortran.h   | 1 -
 gcc/fortran/interface.cc | 4 ++--
 gcc/fortran/symbol.cc    | 5 -
 3 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index f81be1d984c5..cf48d025768a 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3736,7 +3736,6 @@ void gfc_traverse_user_op (gfc_namespace *, void (*)(gfc_user_op *));
 void gfc_save_all (gfc_namespace *);
 void gfc_enforce_clean_symbol_state (void);
-void gfc_pop_undo_symbol (void);
 gfc_gsymbol *gfc_get_gsymbol (const char *, bool bind_c);
 gfc_gsymbol *gfc_find_gsymbol (gfc_gsymbol *, const char *);
diff --git a/gcc/fortran/interface.cc b/gcc/fortran/interface.cc
index e3bc22f25e58..c59ed1f5306c 100644
--- a/gcc/fortran/interface.cc
+++ b/gcc/fortran/interface.cc
@@ -5836,8 +5836,6 @@ gfc_get_formal_from_actual_arglist (gfc_symbol *sym,
 	{
 	  snprintf (name, GFC_MAX_SYMBOL_LEN, "_formal_%d", var_num ++);
 	  gfc_get_symbol (name, gfc_current_ns, &s);
-	  /* We do not need this in an undo table.  */
-	  gfc_pop_undo_symbol();
 	  if (a->expr->ts.type == BT_PROCEDURE)
 	    {
 	      gfc_symbol *asym = a->expr->symtree->n.sym;
@@ -5878,12 +5876,14 @@ gfc_get_formal_from_actual_arglist (gfc_symbol *sym,
 	  s->declared_at = a->expr->where;
 	  s->attr.intent = INTENT_UNKNOWN;
 	  (*f)->sym = s;
+	  gfc_commit_symbol (s);
 	}
       else  /* If a->expr is NULL, this is an alternate rerturn.  */
 	(*f)->sym = NULL;
       f = &((*f)->next);
     }
+
 }
diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc
index 92cba4187842..81aa81df2eec 100644
--- a/gcc/fortran/symbol.cc
+++ b/gcc/fortran/symbol.cc
@@ -3898,11 +3898,6 @@ enforce_single_undo_checkpoint (void)
   gcc_checking_assert (single_undo_checkpoint_p ());
 }
-void
-gfc_pop_undo_symbol ()
-{
-  latest_undo_chgset->syms.pop();
-}
 /* Undoes all the changes made to symbols in the current statement.  */
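The difference between the removed call and its replacement can be pictured with a toy model. The sketch below is not gfortran code: `UndoChangeSet`, `register`, `pop_last` and `commit` are hypothetical names, and the real `gfc_commit_symbol()` likely does more than remove a list entry. It only illustrates the one point visible in the diff: the old helper dropped whatever entry was last in the undo list, while the new call names the symbol it wants to release.

```python
class UndoChangeSet:
    """Hypothetical stand-in for a list of symbols awaiting undo."""

    def __init__(self):
        self.syms = []

    def register(self, name):
        # A newly created symbol is recorded so it could be undone later.
        self.syms.append(name)

    def pop_last(self):
        # Positional removal: drops the last entry, whichever symbol it is.
        return self.syms.pop()

    def commit(self, name):
        # Removal by identity: drops the named symbol wherever it sits.
        self.syms.remove(name)


def pending_after_pop(names):
    """Register all names, then remove one entry positionally."""
    cs = UndoChangeSet()
    for n in names:
        cs.register(n)
    cs.pop_last()
    return cs.syms


def pending_after_commit(names, target):
    """Register all names, then remove the target entry by name."""
    cs = UndoChangeSet()
    for n in names:
        cs.register(n)
    cs.commit(target)
    return cs.syms
```

In this model, if some later change registered a second symbol after `_formal_0`, the positional pop would release the wrong entry, whereas the commit-by-name variant is unaffected by ordering.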
[gcc/meissner/heads/work195-vpair] (14 commits) Merge commit 'refs/users/meissner/heads/work195-vpair' of g
The branch 'meissner/heads/work195-vpair' was updated to point to:

 ef2a8c1bb4fe... Merge commit 'refs/users/meissner/heads/work195-vpair' of g

It previously pointed to:

 fd09cd889cab... Add ChangeLog.vpair and update REVISION.

Diff:

Summary of changes (added commits):
---

  ef2a8c1... Merge commit 'refs/users/meissner/heads/work195-vpair' of g
  07fc419... Add ChangeLog.vpair and update REVISION.
  3ced480... xUse architecture flags for defining _ARCH_PWR macros. (*)
  7aa1750... Add rs6000 architecture masks. (*)
  94cad41... Do not allow -mvsx to boost processor to power7. (*)
  9d6a493... Use vector pair load/store for memcpy with -mcpu=future (*)
  f049955... Add -mcpu=future tests. (*)
  883cc86... Add -mcpu=future tuning support. (*)
  7997a87... Add support for -mcpu=future (*)
  5115a81... Change TARGET_MODULO to TARGET_POWER9. (*)
  14ffdf1... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  558802e... Change TARGET_CMPB to TARGET_POWER6. (*)
  1f54dda... Change TARGET_FPRND to TARGET_POWER5X. (*)
  0a7acdc... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
    Because the reference `refs/users/meissner/heads/work195-vpair' matches
    your hooks.email-new-commits-only configuration,
    no separate email is sent for this commit.
[gcc r15-7920] LoongArch: testsuite: Fix gcc.dg/vect/slp-26.c.
https://gcc.gnu.org/g:62a6a53766ba46ada1112472b71d4ea21411ea39

commit r15-7920-g62a6a53766ba46ada1112472b71d4ea21411ea39
Author: Lulu Cheng
Date:   Mon Mar 3 17:09:10 2025 +0800

    LoongArch: testsuite: Fix gcc.dg/vect/slp-26.c.

    After d34cda720988674bcf8a24267c9e1ec61335d6de, what was originally
    not vectorizable can now be vectorized. So adjust gcc.dg/vect/slp-26.c.

    gcc/testsuite/ChangeLog:

            * gcc.dg/vect/slp-26.c: Adjust.

Diff:
---
 gcc/testsuite/gcc.dg/vect/slp-26.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/slp-26.c b/gcc/testsuite/gcc.dg/vect/slp-26.c
index 23917474ddc1..b916bb3ff9ce 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-26.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-26.c
@@ -50,5 +50,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! { mips_msa || { amdgcn-*-* || { riscv_v || loongarch_sx } } } } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { mips_msa || { amdgcn-*-* || { riscv_v || loongarch_sx } } } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { ! { mips_msa || { amdgcn-*-* || { riscv_v || loongarch_sx } } } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { mips_msa || loongarch_sx } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target { riscv_v || amdgcn-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { mips_msa } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target { loongarch_sx || { riscv_v || amdgcn-*-* } } } } } */
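The effect of moving `loongarch_sx` from one selector to the other can be checked with a small model. This is a sketch only: the real testsuite decides effective-target keywords by probing the compiler, whereas here a target is simply a set of strings, and `"amdgcn"` is a hypothetical shorthand for the `amdgcn-*-*` triplet pattern. The helper names are made up for illustration.

```python
def any_of(*keys):
    """Selector that holds when the target has at least one of the keys."""
    return lambda props: any(k in props for k in keys)


def none_of(*keys):
    """Selector that holds when the target has none of the keys."""
    return lambda props: not any(k in props for k in keys)


VECT_TARGETS = ("mips_msa", "amdgcn", "riscv_v", "loongarch_sx")

# (expected count of "vectorizing stmts using SLP", selector) before the commit.
BEFORE = [
    (0, none_of(*VECT_TARGETS)),
    (1, any_of("mips_msa", "loongarch_sx")),
    (2, any_of("riscv_v", "amdgcn")),
]

# The same directives after the commit: loongarch_sx now expects two matches.
AFTER = [
    (0, none_of(*VECT_TARGETS)),
    (1, any_of("mips_msa")),
    (2, any_of("loongarch_sx", "riscv_v", "amdgcn")),
]


def expected_counts(directives, props):
    """Return the expected counts of every directive that applies to props."""
    return [count for count, selector in directives if selector(props)]
```

For a target modelled as `{"loongarch_sx"}`, `expected_counts(BEFORE, ...)` selects the count-1 directive and `expected_counts(AFTER, ...)` selects the count-2 directive; the other targets are unaffected.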
[gcc r15-7918] LoongArch: testsuite: Fix pr112325.c and pr117888-1.c.
https://gcc.gnu.org/g:671702b29f252b417810b7a1dc7506f096339577

commit r15-7918-g671702b29f252b417810b7a1dc7506f096339577
Author: Lulu Cheng
Date:   Mon Mar 3 16:52:43 2025 +0800

    LoongArch: testsuite: Fix pr112325.c and pr117888-1.c.

    By default, vectorization is not enabled on LoongArch,
    resulting in the failure of these two test cases.

    gcc/testsuite/ChangeLog:

            * gcc.dg/vect/pr112325.c: Add the vector compilation
            option '-mlsx' for LoongArch.
            * gcc.dg/vect/pr117888-1.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.dg/vect/pr112325.c   | 1 +
 gcc/testsuite/gcc.dg/vect/pr117888-1.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/vect/pr112325.c b/gcc/testsuite/gcc.dg/vect/pr112325.c
index 143903beab20..8689fbfe092d 100644
--- a/gcc/testsuite/gcc.dg/vect/pr112325.c
+++ b/gcc/testsuite/gcc.dg/vect/pr112325.c
@@ -4,6 +4,7 @@
 /* { dg-require-effective-target vect_shift } */
 /* { dg-additional-options "-mavx2" { target x86_64-*-* i?86-*-* } } */
 /* { dg-additional-options "--param max-completely-peeled-insns=200" { target powerpc64*-*-* } } */
+/* { dg-additional-options "-mlsx" { target loongarch64-*-* } } */
 typedef unsigned short ggml_fp16_t;
 static float table_f32_f16[1 << 16];
diff --git a/gcc/testsuite/gcc.dg/vect/pr117888-1.c b/gcc/testsuite/gcc.dg/vect/pr117888-1.c
index 4796a7c83c16..0b31fcdc423b 100644
--- a/gcc/testsuite/gcc.dg/vect/pr117888-1.c
+++ b/gcc/testsuite/gcc.dg/vect/pr117888-1.c
@@ -4,6 +4,7 @@
 /* { dg-require-effective-target vect_shift } */
 /* { dg-additional-options "-mavx2" { target x86_64-*-* i?86-*-* } } */
 /* { dg-additional-options "--param max-completely-peeled-insns=200" { target powerpc64*-*-* } } */
+/* { dg-additional-options "-mlsx" { target loongarch64-*-* } } */
 typedef unsigned short ggml_fp16_t;
 static float table_f32_f16[1 << 16];
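The three `dg-additional-options` directives in these tests each attach an option to a set of target triplet patterns. The sketch below approximates that selection with Python's `fnmatchcase`. It is an illustration under assumptions: the testsuite's own glob matching is not guaranteed to agree with `fnmatch` in every corner case, and `OPTION_RULES` and `additional_options` are hypothetical names, not testsuite APIs.

```python
from fnmatch import fnmatchcase

# (option string, triplet patterns) as written in the test headers after the commit.
OPTION_RULES = [
    ("-mavx2", ["x86_64-*-*", "i?86-*-*"]),
    ("--param max-completely-peeled-insns=200", ["powerpc64*-*-*"]),
    ("-mlsx", ["loongarch64-*-*"]),
]


def additional_options(triplet):
    """Return the extra options whose target patterns match the triplet."""
    return [
        option
        for option, patterns in OPTION_RULES
        if any(fnmatchcase(triplet, pattern) for pattern in patterns)
    ]
```

Under this model a triplet such as `loongarch64-unknown-linux-gnu` picks up only `-mlsx`, which is the option the commit adds so that vectorization is enabled for these two tests on LoongArch.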
[gcc r15-7919] LoongArch: testsuite: Fix gcc.dg/vect/bb-slp-77.c.
https://gcc.gnu.org/g:546567367a9d5c4ff3f1f416b55cf168153d03c7

commit r15-7919-g546567367a9d5c4ff3f1f416b55cf168153d03c7
Author: Lulu Cheng
Date:   Mon Mar 3 16:58:28 2025 +0800

    LoongArch: testsuite: Fix gcc.dg/vect/bb-slp-77.c.

    The issue is the same as 12383255fe4e82c31f5e42c72a8fbcb1b5dea35d.
    Neither is .REDUC_PLUS set for V2SImode on LoongArch, so add it to
    the list of targets not expecting BB vectorization.

    gcc/testsuite/ChangeLog:

            * gcc.dg/vect/bb-slp-77.c: Add loongarch*-*-* to the list
            of expected failing targets.

Diff:
---
 gcc/testsuite/gcc.dg/vect/bb-slp-77.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-77.c b/gcc/testsuite/gcc.dg/vect/bb-slp-77.c
index bc74f6a4db31..2057f038f2f3 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-77.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-77.c
@@ -71,4 +71,4 @@ void test(const int n, float * restrict s, const void * restrict vx, const void
   *s = sumf;
 }
-/* { dg-final { scan-tree-dump-times "optimized: basic block" 1 "slp1" { target { { vect_int_mult && vect_element_align } && { ! { powerpc*-*-* x86_64-*-* i?86-*-* } } } } } } */
+/* { dg-final { scan-tree-dump-times "optimized: basic block" 1 "slp1" { target { { vect_int_mult && vect_element_align } && { ! { powerpc*-*-* x86_64-*-* i?86-*-* loongarch*-*-* } } } } } } */