[gcc r15-346] sra: Do not leave work for DSE (that it can sometimes not perform)
https://gcc.gnu.org/g:f6743695b4d2bd4da96e56a19157372f93b800bd

commit r15-346-gf6743695b4d2bd4da96e56a19157372f93b800bd
Author: Martin Jambor
Date: Thu May 9 16:39:44 2024 +0200

sra: Do not leave work for DSE (that it can sometimes not perform)

When looking again at the g++.dg/tree-ssa/pr109849.C testcase we discovered that it generates terrible store-to-load forwarding stalls because SRA was leaving behind aggregate loads but all the stores were by scalar parts and DSE failed to remove the useless load. SRA has all the knowledge to remove the statement even now, so this small patch makes it do so.

With this patch, the g++.dg/tree-ssa/pr109849.C micro-benchmark runs 9 times faster (on an AMD EPYC 75F3 machine).

gcc/ChangeLog:

2024-04-18 Martin Jambor

* tree-sra.cc (sra_modify_assign): Remove the original statement also when dealing with a store to a fully covered aggregate from a non-candidate.

gcc/testsuite/ChangeLog:

2024-04-23 Martin Jambor

* g++.dg/tree-ssa/pr109849.C: Also check that the aggregate store to cur disappears.
* gcc.dg/tree-ssa/ssa-dse-26.c: Instead of relying on DSE, check that the unwanted stores were removed at early SRA time.
Diff: --- gcc/testsuite/g++.dg/tree-ssa/pr109849.C | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c | 6 +++--- gcc/tree-sra.cc| 14 -- 3 files changed, 17 insertions(+), 6 deletions(-) diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C index cd348c0f5906..d06dbb104829 100644 --- a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C +++ b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-sra" } */ +/* { dg-options "-O2 -fdump-tree-sra -fdump-tree-optimized" } */ #include typedef unsigned int uint32_t; @@ -29,3 +29,4 @@ main() } /* { dg-final { scan-tree-dump "Created a replacement for stack offset" "sra"} } */ +/* { dg-final { scan-tree-dump-not "cur = MEM" "optimized"} } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c index 43152de56163..1d01392c5957 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-dse1-details -fno-short-enums -fno-tree-fre" } */ +/* { dg-options "-O2 -fdump-tree-esra -fno-short-enums -fno-tree-fre" } */ /* { dg-skip-if "we want a BIT_FIELD_REF from fold_truth_andor" { ! 
lp64 } } */ /* { dg-skip-if "temporary variable names are not x and y" { mmix-knuth-mmixware } } */ @@ -31,5 +31,5 @@ constraint_equal (struct constraint a, struct constraint b) && constraint_expr_equal (a.rhs, b.rhs); } -/* { dg-final { scan-tree-dump-times "Deleted dead store: x = " 2 "dse1" } } */ -/* { dg-final { scan-tree-dump-times "Deleted dead store: y = " 2 "dse1" } } */ +/* { dg-final { scan-tree-dump-not "x = " "esra" } } */ +/* { dg-final { scan-tree-dump-not "y = " "esra" } } */ diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 32fa28911f2d..8040b0c56451 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -4854,8 +4854,18 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi) But use the RHS aggregate to load from to expose more optimization opportunities. */ if (access_has_children_p (lacc)) - generate_subtree_copies (lacc->first_child, rhs, lacc->offset, -0, 0, gsi, true, true, loc); + { + generate_subtree_copies (lacc->first_child, rhs, lacc->offset, + 0, 0, gsi, true, true, loc); + if (lacc->grp_covered) + { + unlink_stmt_vdef (stmt); + gsi_remove (& orig_gsi, true); + release_defs (stmt); + sra_stats.deleted++; + return SRA_AM_REMOVED; + } + } } return SRA_AM_NONE;
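For readers unfamiliar with the pattern, the following is a minimal sketch in plain C of the kind of code the message above talks about. It is not taken from the testcase; `struct pair` and `sum_via_copy` are hypothetical names chosen for illustration. The local aggregate `cur` is written by one aggregate assignment and then only read field by field, so once each field has a scalar replacement the aggregate store has no remaining reader.

```c
#include <assert.h>

/* Hypothetical two-field aggregate; both fields together cover the whole
   object on common ABIs, so there is no padding to preserve.  */
struct pair
{
  int lo;
  int hi;
};

/* CUR is stored as a whole aggregate and afterwards only its scalar
   fields are read.  If a compiler replaces cur.lo and cur.hi with
   scalars loaded directly from *SRC, nothing reads CUR any more and the
   aggregate store becomes dead.  */
int
sum_via_copy (const struct pair *src)
{
  struct pair cur = *src;
  return cur.lo + cur.hi;
}
```

Whether a given GCC version actually removes the store depends on the options and on the surrounding code; the sketch only shows the source-level shape.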
[gcc r13-8773] ICF&SRA: Make ICF and SRA agree on padding
https://gcc.gnu.org/g:10bf53a80eefa46500bffb442719777e2640e7d7

commit r13-8773-g10bf53a80eefa46500bffb442719777e2640e7d7
Author: Martin Jambor
Date: Mon Apr 8 18:53:23 2024 +0200

ICF&SRA: Make ICF and SRA agree on padding

PR 113359 shows that (at least with -fno-strict-aliasing) ICF can unify two functions which copy an aggregate type of the same size but then SRA, through its total scalarization, can copy the aggregate by pieces, skipping padding, but the padding was not the same in the two original functions that ICF unified.

This patch enhances SRA with the ability to collect padding information which then can be compared from within ICF.

Unfortunately SRA uses OPTION_SET_P when determining its limits, so ICF needs to switch cfuns at least once to figure it out too.

gcc/ChangeLog:

2024-03-27 Martin Jambor

PR ipa/113359
* ipa-icf-gimple.h (func_checker): New members safe_for_total_scalarization_p, m_total_scalarization_limit_known_p and m_total_scalarization_limit.
(func_checker::func_checker): Initialize new member variables.
* ipa-icf-gimple.cc: Include tree-sra.h.
(func_checker::func_checker): Initialize new member variables.
(func_checker::safe_for_total_scalarization_p): New function.
(func_checker::compare_operand): Use the new function.
* tree-sra.h (sra_get_max_scalarization_size): Declare.
(sra_total_scalarization_would_copy_same_data_p): Likewise.
* tree-sra.cc (prepare_iteration_over_array_elts): New function.
(class sra_padding_collecting): New.
(sra_padding_collecting::record_padding): Likewise.
(scalarizable_type_p): Rename to totally_scalarizable_type_p. Add ability to record padding when requested.
(totally_scalarize_subtree): Split out gathering information necessary to iterate over array elements to prepare_iteration_over_array_elts. Fix erroneous early exit.
(analyze_all_variable_accesses): Adjust the call to totally_scalarizable_type_p. Move determining of total scalarization size limit...
(sra_get_max_scalarization_size): ...here.
(check_ts_and_push_padding_to_vec): New function. (sra_total_scalarization_would_copy_same_data_p): Likewise. gcc/testsuite/ChangeLog: 2024-03-27 Martin Jambor PR ipa/113359 * gcc.dg/lto/pr113359-1_0.c: New. * gcc.dg/lto/pr113359-1_1.c: Likewise. * gcc.dg/lto/pr113359-2_0.c: Likewise. * gcc.dg/lto/pr113359-2_1.c: Likewise. * gcc.dg/lto/pr113359-3_0.c: Likewise. * gcc.dg/lto/pr113359-3_1.c: Likewise. * gcc.dg/lto/pr113359-4_0.c: Likewise. * gcc.dg/lto/pr113359-4_1.c: Likewise. * gcc.dg/lto/pr113359-5_0.c: Likewise. * gcc.dg/lto/pr113359-5_1.c: Likewise. (cherry picked from commit 1e3312a25a7b34d6e3f549273e1674c7114e4408) Diff: --- gcc/ipa-icf-gimple.cc | 41 +- gcc/ipa-icf-gimple.h| 15 +- gcc/testsuite/gcc.dg/lto/pr113359-1_0.c | 86 +++ gcc/testsuite/gcc.dg/lto/pr113359-1_1.c | 38 + gcc/testsuite/gcc.dg/lto/pr113359-2_0.c | 87 +++ gcc/testsuite/gcc.dg/lto/pr113359-2_1.c | 38 + gcc/testsuite/gcc.dg/lto/pr113359-3_0.c | 114 +++ gcc/testsuite/gcc.dg/lto/pr113359-3_1.c | 49 +++ gcc/testsuite/gcc.dg/lto/pr113359-4_0.c | 114 +++ gcc/testsuite/gcc.dg/lto/pr113359-4_1.c | 49 +++ gcc/testsuite/gcc.dg/lto/pr113359-5_0.c | 118 +++ gcc/testsuite/gcc.dg/lto/pr113359-5_1.c | 50 +++ gcc/tree-sra.cc | 252 +--- gcc/tree-sra.h | 3 + 14 files changed, 999 insertions(+), 55 deletions(-) diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc index f4180c0fa813..49302ad56c65 100644 --- a/gcc/ipa-icf-gimple.cc +++ b/gcc/ipa-icf-gimple.cc @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. 
If not see #include "cfgloop.h" #include "attribs.h" #include "gimple-walk.h" +#include "tree-sra.h" #include "tree-ssa-alias-compare.h" #include "ipa-icf-gimple.h" @@ -59,7 +60,8 @@ func_checker::func_checker (tree source_func_decl, tree target_func_decl, : m_source_func_decl (source_func_decl), m_target_func_decl (target_func_decl), m_ignored_source_nodes (ignored_source_nodes), m_ignored_target_nodes (ignored_target_nodes), -m_ignore_labels (ignore_labels), m_tbaa (tbaa) +m_ignore_labels (ignore_labels), m_tbaa (tbaa), +m_total_scalarization_limit_known_p (false) { function *source_func = DECL_STRUCT_FUNCTION (source_func_decl); function *target_func = DECL_STRUCT_FUNCTION (target_func_de
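The padding mismatch described in this commit message can be pictured with a small C sketch. The two struct types and the helper names below are hypothetical and are not from the PR 113359 testcases. Both types hold the same three members in a different order, so on typical ABIs they have the same size but their padding bytes sit at different offsets; a field-by-field copy of one type therefore skips bytes that hold data in the other.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types: same members, different order.  */
struct a { char c; int i; short s; };
struct b { short s; int i; char c; };

/* Return 1 if byte OFF lies inside [START, START + LEN).  */
static int
in_range (size_t off, size_t start, size_t len)
{
  return off >= start && off - start < len;
}

/* Return 1 if byte OFF of struct a belongs to a member, 0 if it is
   padding or lies outside the object.  */
int
byte_is_data_a (size_t off)
{
  return in_range (off, offsetof (struct a, c), sizeof (char))
	 || in_range (off, offsetof (struct a, i), sizeof (int))
	 || in_range (off, offsetof (struct a, s), sizeof (short));
}

/* Same question for struct b.  */
int
byte_is_data_b (size_t off)
{
  return in_range (off, offsetof (struct b, s), sizeof (short))
	 || in_range (off, offsetof (struct b, i), sizeof (int))
	 || in_range (off, offsetof (struct b, c), sizeof (char));
}
```

On an ABI with 2-byte short and 4-byte aligned int, byte 1 would be padding in `struct a` but data in `struct b`; the exact layout is ABI-dependent, which is why the helpers query `offsetof` instead of hard-coding offsets.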
[gcc r13-8774] ipa: Compare jump functions in ICF (PR 113907)
https://gcc.gnu.org/g:1db45e83021a8a87f41e22053910fcce6e8e2c2c

commit r13-8774-g1db45e83021a8a87f41e22053910fcce6e8e2c2c
Author: Martin Jambor
Date: Tue May 14 17:01:21 2024 +0200

ipa: Compare jump functions in ICF (PR 113907)

This is a manual backport of r14-9840-g1162861439fd3c from master. Manual because the bits and value range representation in jump functions have changed during the gcc 14 development cycle.

In PR 113907 comment #58, Honza found a case where ICF thinks bodies of functions are equivalent but because of a difference in aliases in a memory access, different aggregate jump functions are associated with supposedly equivalent call statements. This patch adds a way to compare jump functions and plugs it into ICF to avoid the issue.

gcc/ChangeLog:

2024-05-14 Martin Jambor

PR ipa/113907
* ipa-prop.h (ipa_jump_functions_equivalent_p): Declare.
(values_equal_for_ipcp_p): Likewise.
* ipa-prop.cc (ipa_agg_pass_through_jf_equivalent_p): New function.
(ipa_agg_jump_functions_equivalent_p): Likewise.
(ipa_jump_functions_equivalent_p): Likewise.
* ipa-cp.cc (values_equal_for_ipcp_p): Make function public.
* ipa-icf-gimple.cc: Include alloc-pool.h, symbol-summary.h, sreal.h, ipa-cp.h and ipa-prop.h.
(func_checker::compare_gimple_call): Compare jump functions.

gcc/testsuite/ChangeLog:

2024-05-10 Martin Jambor

PR ipa/113907
* gcc.dg/lto/pr113907_0.c: New.
* gcc.dg/lto/pr113907_1.c: Likewise.
* gcc.dg/lto/pr113907_2.c: Likewise.

Diff:
---
 gcc/ipa-cp.cc | 2 +-
 gcc/ipa-icf-gimple.cc | 29 +++
 gcc/ipa-prop.cc | 157 ++
 gcc/ipa-prop.h | 3 +
 gcc/testsuite/gcc.dg/lto/pr113907_0.c | 18
 gcc/testsuite/gcc.dg/lto/pr113907_1.c | 35
 gcc/testsuite/gcc.dg/lto/pr113907_2.c | 11 +++
 7 files changed, 254 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index b3e0f62e4003..8f36608cf33b 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -458,7 +458,7 @@ ipcp_lattice::is_single_const ()
 /* Return true iff X and Y should be considered equal values by IPA-CP.
*/ -static bool +bool values_equal_for_ipcp_p (tree x, tree y) { gcc_checking_assert (x != NULL_TREE && y != NULL_TREE); diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc index 49302ad56c65..054a557bd588 100644 --- a/gcc/ipa-icf-gimple.cc +++ b/gcc/ipa-icf-gimple.cc @@ -42,7 +42,11 @@ along with GCC; see the file COPYING3. If not see #include "tree-sra.h" #include "tree-ssa-alias-compare.h" +#include "alloc-pool.h" +#include "symbol-summary.h" #include "ipa-icf-gimple.h" +#include "sreal.h" +#include "ipa-prop.h" namespace ipa_icf_gimple { @@ -751,6 +755,31 @@ func_checker::compare_gimple_call (gcall *s1, gcall *s2) && !compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2))) return return_false_with_msg ("GIMPLE internal call LHS type mismatch"); + if (!gimple_call_internal_p (s1)) +{ + cgraph_edge *e1 = cgraph_node::get (m_source_func_decl)->get_edge (s1); + cgraph_edge *e2 = cgraph_node::get (m_target_func_decl)->get_edge (s2); + class ipa_edge_args *args1 = ipa_edge_args_sum->get (e1); + class ipa_edge_args *args2 = ipa_edge_args_sum->get (e2); + if ((args1 != nullptr) != (args2 != nullptr)) + return return_false_with_msg ("ipa_edge_args mismatch"); + if (args1) + { + int n1 = ipa_get_cs_argument_count (args1); + int n2 = ipa_get_cs_argument_count (args2); + if (n1 != n2) + return return_false_with_msg ("ipa_edge_args nargs mismatch"); + for (int i = 0; i < n1; i++) + { + struct ipa_jump_func *jf1 = ipa_get_ith_jump_func (args1, i); + struct ipa_jump_func *jf2 = ipa_get_ith_jump_func (args2, i); + if (((jf1 != nullptr) != (jf2 != nullptr)) + || (jf1 && !ipa_jump_functions_equivalent_p (jf1, jf2))) + return return_false_with_msg ("jump function mismatch"); + } + } +} + return compare_operand (t1, t2, get_operand_access_type (&map, t1)); } diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index 0d8167495341..11ba2521b2c9 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -6022,5 +6022,162 @@ ipcp_transform_function (struct cgraph_node *node) return 
modified_mem_access ? TODO_update_ssa_only_virtuals : 0; } +/* Return true if the two pass_through components of two jump functions are + known to be equivalent. AGG_JF denotes whether they are part of aggregate + functions or not. The function can be used before the IPA phase of IPA-CP + or inlining because it cannot cope with refdesc changes these
[gcc r12-10442] ipa: Force args obtained through pass-through maps to the expected type (PR 114247)
https://gcc.gnu.org/g:44191982c6bd41db1c9d126ea2f15febec3c1f81

commit r12-10442-g44191982c6bd41db1c9d126ea2f15febec3c1f81
Author: Martin Jambor
Date: Tue May 14 14:13:36 2024 +0200

ipa: Force args obtained through pass-through maps to the expected type (PR 114247)

Interactions of IPA-CP and IPA-SRA on the same data are a rather big source of issues, I'm afraid. PR 113964 is a situation where IPA-CP propagates an unsigned short in a union parameter into a function which itself calls a different function which has the same union parameter and both these union parameters are split with IPA-SRA. The leaf function however uses a signed short member of the union.

In the calling function, we get the unsigned constant as the replacement for the union and it is then passed in the call without any type compatibility checks. Apparently on riscv64 it matters whether the parameter is signed or unsigned short and so the leaf function can see different values.

Fixed by using useless_type_conversion_p at the appropriate place and if it fails, use force_value_to_type as elsewhere in similar situations.

gcc/ChangeLog:

2024-04-04 Martin Jambor

PR ipa/114247
* ipa-param-manipulation.cc (ipa_param_adjustments::modify_call): Force values obtained through pass-through maps to the expected split type.

gcc/testsuite/ChangeLog:

2024-04-04 Patrick O'Neill
Martin Jambor

PR ipa/114247
* gcc.dg/ipa/pr114247.c: New test.
(cherry picked from commit 8cd0d29270d4ed86c69b80c08de66dcb6c1e22fe) Diff: --- gcc/ipa-param-manipulation.cc | 6 ++ gcc/testsuite/gcc.dg/ipa/pr114247.c | 31 +++ 2 files changed, 37 insertions(+) diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc index 38328c3e8d0a..3472ef13bc2f 100644 --- a/gcc/ipa-param-manipulation.cc +++ b/gcc/ipa-param-manipulation.cc @@ -719,6 +719,12 @@ ipa_param_adjustments::modify_call (cgraph_edge *cs, } if (repl) { + if (!useless_type_conversion_p(apm->type, repl->typed.type)) + { + repl = force_value_to_type (apm->type, repl); + repl = force_gimple_operand_gsi (&gsi, repl, + true, NULL, true, GSI_SAME_STMT); + } vargs.quick_push (repl); continue; } diff --git a/gcc/testsuite/gcc.dg/ipa/pr114247.c b/gcc/testsuite/gcc.dg/ipa/pr114247.c new file mode 100644 index ..60aa2bc0122f --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr114247.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fsigned-char -fno-strict-aliasing -fwrapv" } */ + +union a { + unsigned short b; + int c; + signed short d; +}; +int e, f = 1, g; +long h; +const int **i; +void j(union a k, int l, unsigned m) { + const int *a[100]; + i = &a[0]; + h = k.d; +} +static int o(union a k) { + k.d = -1; + while (1) +if (f) + break; + j(k, g, e); + return 0; +} +int main() { + union a n = {1}; + o(n); + if (h != -1) +__builtin_abort(); + return 0; +}
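Why the signedness of the short matters can be shown without any IPA machinery. The sketch below is only an illustration and assumes a 16-bit two's-complement short; the helper names are hypothetical, while `union a` mirrors the one in the testcase. The same 16 bits denote 65535 when widened as an unsigned short and -1 when widened as a signed short, so passing a value of one type where the callee expects the other can change what the callee observes.

```c
#include <assert.h>

/* Same shape as the union in the testcase above.  */
union a
{
  unsigned short b;
  int c;
  signed short d;
};

/* Widen a 16-bit payload as an unsigned value (zero extension).  */
long
widen_as_unsigned (unsigned short v)
{
  return v;
}

/* Widen a 16-bit payload as a signed value (sign extension).  */
long
widen_as_signed (signed short v)
{
  return v;
}

/* Store through the unsigned member and read back through the signed
   one.  Assumes 16-bit two's-complement shorts.  */
long
read_d_after_writing_b (unsigned short v)
{
  union a u;
  u.c = 0;
  u.b = v;
  return u.d;
}
```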
[gcc r12-10443] ipa: Self-DCE of uses of removed call LHSs (PR 108007)
https://gcc.gnu.org/g:2183e5b5aa3a080624cb95a06993e34dedd09cb2

commit r12-10443-g2183e5b5aa3a080624cb95a06993e34dedd09cb2
Author: Martin Jambor
Date: Mon Apr 8 17:34:33 2024 +0200

ipa: Self-DCE of uses of removed call LHSs (PR 108007)

PR 108007 is another manifestation where we rely on DCE to clean up after IPA-SRA and if the user explicitly switches DCE off, IPA-SRA can leave behind statements which are fed uninitialized values and trap, even though their results are themselves never used.

I have already fixed this for unused parameters in callees, this bug shows that almost the same thing can happen for removed returns, on the side of callers. This means that the issue has to be fixed elsewhere, in call redirection. This patch adds a function which looks for (and through, using a work-list) uses of operations fed specific SSA names and removes them all.

That would have been easy if it wasn't for debug statements during tree-inline (from which call redirection is also invoked). Debug statements are decoupled from the rest at this point and iterating over uses of SSAs does not bring them up. During tree-inline they are handled specially at the end, I assume in order to make sure that relative ordering of UIDs is the same with and without debug info.

This means that during tree-inline we need to make a hash of killed SSAs, that we already have in copy_body_data, available to the function making the purging. So the patch duly does also that, making the interface slightly ugly. Moreover, all newly unused SSA names need to be freed and as PR 112616 showed, it must be done in a defined order, which is what newly added ipa_release_ssas_in_hash does.

This backport to gcc-13 also contains 54e505d0446f86b7ad383acbb8e5501f20872b64 in order not to reintroduce PR 113757.

gcc/ChangeLog:

2024-04-05 Martin Jambor

PR ipa/108007
PR ipa/112616
* cgraph.h (cgraph_edge): Add a parameter to redirect_call_stmt_to_callee.
* ipa-param-manipulation.h (ipa_param_adjustments): Add a parameter to modify_call. (ipa_release_ssas_in_hash): Declare. * cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee): New parameter killed_ssas, pass it to padjs->modify_call. * ipa-param-manipulation.cc (purge_all_uses): New function. (ipa_param_adjustments::modify_call): New parameter killed_ssas. Instead of substituting uses, invoke purge_all_uses. If hash of killed SSAs has not been provided, create a temporary one and release SSAs that have been added to it. (compare_ssa_versions): New function. (ipa_release_ssas_in_hash): Likewise. * tree-inline.cc (redirect_all_calls): Create id->killed_new_ssa_names earlier, pass it to edge redirection, adjust a comment. (copy_body): Release SSAs in id->killed_new_ssa_names. gcc/testsuite/ChangeLog: 2024-01-15 Martin Jambor PR ipa/108007 PR ipa/112616 * gcc.dg/ipa/pr108007.c: New test. * gcc.dg/ipa/pr112616.c: Likewise. (cherry picked from commit 40ddc0b05a47f999b24f20c1becb79004995731b) Diff: --- gcc/cgraph.cc | 10 +++- gcc/cgraph.h| 9 ++- gcc/ipa-param-manipulation.cc | 112 +--- gcc/ipa-param-manipulation.h| 5 +- gcc/testsuite/g++.dg/ipa/pr113757.C | 14 + gcc/testsuite/gcc.dg/ipa/pr108007.c | 32 +++ gcc/testsuite/gcc.dg/ipa/pr112616.c | 28 + gcc/tree-inline.cc | 27 - 8 files changed, 193 insertions(+), 44 deletions(-) diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc index 3734c85db637..b5cfa3b36c57 100644 --- a/gcc/cgraph.cc +++ b/gcc/cgraph.cc @@ -1403,11 +1403,17 @@ cgraph_edge::redirect_callee (cgraph_node *n) speculative indirect call, remove "speculative" of the indirect call and also redirect stmt to it's final direct target. + When called from within tree-inline, KILLED_SSAs has to contain the pointer + to killed_new_ssa_names within the copy_body_data structure and SSAs + discovered to be useless (if LHS is removed) will be added to it, otherwise + it needs to be NULL. + It is up to caller to iteratively transform each "speculative" direct call as appropriate. 
*/ gimple * -cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e) +cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e, + hash_set *killed_ssas) { tree decl = gimple_call_fndecl (e->call_stmt); gcall *new_stmt; @@ -1528,7 +1534,7 @@ cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e) remove_stmt_from_eh_lp (e-
[gcc r13-8982] Compare loop bounds in ipa-icf
https://gcc.gnu.org/g:e469654e5e7bdd823c5aa996075e903c6b4d47e2

commit r13-8982-ge469654e5e7bdd823c5aa996075e903c6b4d47e2
Author: Jan Hubicka
Date: Mon Aug 19 17:10:25 2024 +0200

Compare loop bounds in ipa-icf

Hi,
this testcase shows another problem with missing comparators for metadata in ICF. With value ranges available to loop optimizations during early opts we can estimate number of iterations based on guarding condition that can be split away by the fnsplit pass. This patch disables ICF when number of iterations does not match.

Bootstrapped/regtested x86_64-linux, will commit it shortly

gcc/ChangeLog:

PR ipa/115277
* ipa-icf-gimple.cc (func_checker::compare_loops): Compare loop bounds.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr115277.c: New test.

(cherry picked from commit 0d19fbc7b0760ce665fa6a88cd40cfa0311358d7)

Diff:
---
 gcc/ipa-icf-gimple.cc | 4
 gcc/testsuite/gcc.c-torture/compile/pr115277.c | 28 ++
 2 files changed, 32 insertions(+)

diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc
index 054a557bd58..a844e74792a 100644
--- a/gcc/ipa-icf-gimple.cc
+++ b/gcc/ipa-icf-gimple.cc
@@ -542,6 +542,10 @@ func_checker::compare_loops (basic_block bb1, basic_block bb2)
     return return_false_with_msg ("unroll");
   if (!compare_variable_decl (l1->simduid, l2->simduid))
     return return_false_with_msg ("simduid");
+  if ((l1->any_upper_bound != l2->any_upper_bound)
+      || (l1->any_upper_bound
+          && (l1->nb_iterations_upper_bound != l2->nb_iterations_upper_bound)))
+    return return_false_with_msg ("nb_iterations_upper_bound");

   return true;
 }
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr115277.c b/gcc/testsuite/gcc.c-torture/compile/pr115277.c
new file mode 100644
index 000..27449eb254f
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr115277.c
@@ -0,0 +1,28 @@
+int array[1000];
+void
+test (int a)
+{
+if (__builtin_expect (a > 3, 1))
+return;
+for (int i = 0; i < a; i++)
+array[i]=i;
+}
+void
+test2 (int a)
+{
+if (__builtin_expect (a > 10, 1))
+return;
+for (int i = 0; i < a; i++) +array[i]=i; +} +int +main() +{ +test(1); +test(2); +test(3); +test2(10); +if (array[9] != 9) +__builtin_abort (); +return 0; +}
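The comparison added to compare_loops can be read in isolation as a small predicate. The sketch below restates it in standalone C under stated assumptions: `struct loop_bound_info` and `loop_bounds_match` are hypothetical stand-ins, and the real nb_iterations_upper_bound in GCC is a wide integer type, not an unsigned long. Two loops agree when both lack an upper bound, or when both have one and the bounds are equal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two loop fields the patch compares.
   In GCC the bound is a wide integer; unsigned long is used here only
   to keep the sketch self-contained.  */
struct loop_bound_info
{
  bool any_upper_bound;
  unsigned long nb_iterations_upper_bound;
};

/* Return true if L1 and L2 carry equivalent upper-bound information.
   The stored bound is ignored when any_upper_bound is false, since it
   is meaningless in that case.  */
bool
loop_bounds_match (const struct loop_bound_info *l1,
		   const struct loop_bound_info *l2)
{
  if (l1->any_upper_bound != l2->any_upper_bound)
    return false;
  if (l1->any_upper_bound
      && l1->nb_iterations_upper_bound != l2->nb_iterations_upper_bound)
    return false;
  return true;
}
```

In the testcase above, the guards `a > 3` and `a > 10` would presumably lead to different estimated bounds for the two otherwise identical loops, which is the situation the predicate rejects.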
[gcc r15-3070] sra: Avoid risking x87 mangling binary representation of a replacement (PR 58416)
https://gcc.gnu.org/g:f577959f420ae404f99f630dadc1c0370734d0da

commit r15-3070-gf577959f420ae404f99f630dadc1c0370734d0da
Author: Martin Jambor
Date: Wed Aug 21 14:49:11 2024 +0200

sra: Avoid risking x87 mangling binary representation of a replacement (PR 58416)

PR 58416 shows that storing non-floating point data to floating point scalar registers can lead to miscompilations when the data is normalized or otherwise processed upon loading to a register.

To avoid that risk, this patch detects situations where we have multiple types and we decide to represent the data in a type with a mode that is known to not be able to transfer actual bits reliably using the new TARGET_MODE_CAN_TRANSFER_BITS hook.

gcc/ChangeLog:

2024-08-19 Martin Jambor

PR target/58416
* tree-sra.cc (types_risk_mangled_binary_repr_p): New function.
(sort_and_splice_var_accesses): Use it.
(propagate_subaccesses_from_rhs): Likewise.

gcc/testsuite/ChangeLog:

2024-08-19 Martin Jambor

PR target/58416
* gcc.dg/torture/pr58416.c: New test.
Diff: --- gcc/testsuite/gcc.dg/torture/pr58416.c | 32 gcc/tree-sra.cc| 28 +++- 2 files changed, 59 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/torture/pr58416.c b/gcc/testsuite/gcc.dg/torture/pr58416.c new file mode 100644 index ..0922b0e70890 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr58416.c @@ -0,0 +1,32 @@ +/* { dg-do run } */ + +struct s { + char s[sizeof(long double)]; +}; + +union u { + long double d; + struct s s; +}; + +int main() +{ + union u x = {0}; +#if __SIZEOF_LONG_DOUBLE__ == 16 + x.s = (struct s){""}; +#elif __SIZEOF_LONG_DOUBLE__ == 12 + x.s = (struct s){""}; +#elif __SIZEOF_LONG_DOUBLE__ == 8 + x.s = (struct s){""}; +#elif __SIZEOF_LONG_DOUBLE__ == 4 + x.s = (struct s){""}; +#endif + + union u y = x; + + for (unsigned char *p = (unsigned char *)&y + sizeof y; + p-- > (unsigned char *)&y;) +if (*p != (unsigned char)'x') + __builtin_abort (); + return 0; +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 8040b0c56451..64e2f007d680 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -2335,6 +2335,19 @@ same_access_path_p (tree exp1, tree exp2) return true; } +/* Return true when either T1 is a type that, when loaded into a register and + stored back to memory will yield the same bits or when both T1 and T2 are + compatible. */ + +static bool +types_risk_mangled_binary_repr_p (tree t1, tree t2) +{ + if (mode_can_transfer_bits (TYPE_MODE (t1))) +return false; + + return !types_compatible_p (t1, t2); +} + /* Sort all accesses for the given variable, check for partial overlaps and return NULL if there are any. If there are none, pick a representative for each combination of offset and size and create a linked list out of them. 
@@ -2461,6 +2474,17 @@ sort_and_splice_var_accesses (tree var) } unscalarizable_region = true; } + else if (types_risk_mangled_binary_repr_p (access->type, ac2->type)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "Cannot scalarize the following access " + "because data would be held in a mode which is not " + "guaranteed to preserve all bits.\n "); + dump_access (dump_file, access, false); + } + unscalarizable_region = true; + } if (grp_same_access_path && !same_access_path_p (access->expr, ac2->expr)) @@ -3127,7 +3151,9 @@ propagate_subaccesses_from_rhs (struct access *lacc, struct access *racc) ret = true; subtree_mark_written_and_rhs_enqueue (lacc); } - if (!lacc->first_child && !racc->first_child) + if (!lacc->first_child + && !racc->first_child + && !types_risk_mangled_binary_repr_p (racc->type, lacc->type)) { /* We are about to change the access type from aggregate to scalar, so we need to put the reverse flag onto the access, if any. */
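As a point of contrast with the risky case, the following sketch shows a copy that cannot alter the bits because it never treats them as a floating-point value. It is only an illustration; `union u` here is a simplified, hypothetical variant of the one in the testcase, and `copy_raw` and `all_bytes_equal` are made-up helper names. A copy routed through a long double scalar on a target whose mode may normalize values on load is exactly what the patch tries to keep SRA from introducing.

```c
#include <assert.h>
#include <string.h>

/* Simplified variant of the testcase union: a long double overlaid
   with a byte array of the same size.  */
union u
{
  long double d;
  char bytes[sizeof (long double)];
};

/* Copy SRC to DST as raw bytes, never going through a long double
   value, so every bit is transferred unchanged.  */
void
copy_raw (union u *dst, const union u *src)
{
  memcpy (dst, src, sizeof *dst);
}

/* Return 1 if every byte of *U equals C, 0 otherwise.  */
int
all_bytes_equal (const union u *u, unsigned char c)
{
  const unsigned char *p = (const unsigned char *) u;
  for (size_t i = 0; i < sizeof *u; i++)
    if (p[i] != c)
      return 0;
  return 1;
}
```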
[gcc r15-3515] ipa: Treat static constructors and destructors as non-local (PR 115815)
https://gcc.gnu.org/g:e98ad6a049c96c21cf641954584c2f5b7df0ce93 commit r15-3515-ge98ad6a049c96c21cf641954584c2f5b7df0ce93 Author: Martin Jambor Date: Fri Sep 6 14:12:53 2024 +0200 ipa: Treat static constructors and destructors as non-local (PR 115815) In PR 115815, IPA-SRA thought it had control over all invocations of a (recursive) static destructor but it did not see the implied invocation which led to the original being left behind and the clean-up code encountering uses of SSAs that definitely should have been dead. Fixed by teaching cgraph_node::can_be_local_p about static constructors and destructors. Similar test is missing in cgraph_node::local_p so I added the check there as well. gcc/ChangeLog: 2024-07-25 Martin Jambor PR ipa/115815 * cgraph.cc (cgraph_node_cannot_be_local_p_1): Also check DECL_STATIC_CONSTRUCTOR and DECL_STATIC_DESTRUCTOR. * ipa-visibility.cc (non_local_p): Likewise. (cgraph_node::local_p): Delete extraneous line of tabs. gcc/testsuite/ChangeLog: 2024-07-25 Martin Jambor PR ipa/115815 * gcc.dg/lto/pr115815_0.c: New test. Diff: --- gcc/cgraph.cc | 4 +++- gcc/ipa-visibility.cc | 5 +++-- gcc/testsuite/gcc.dg/lto/pr115815_0.c | 18 ++ 3 files changed, 24 insertions(+), 3 deletions(-) diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc index 473d8410bc9..39a3adbc7c3 100644 --- a/gcc/cgraph.cc +++ b/gcc/cgraph.cc @@ -2434,7 +2434,9 @@ cgraph_node_cannot_be_local_p_1 (cgraph_node *node, void *) && !node->forced_by_abi && !node->used_from_object_file_p () && !node->same_comdat_group) - || !node->externally_visible)); + || !node->externally_visible) + && !DECL_STATIC_CONSTRUCTOR (node->decl) + && !DECL_STATIC_DESTRUCTOR (node->decl)); } /* Return true if cgraph_node can be made local for API change. 
diff --git a/gcc/ipa-visibility.cc b/gcc/ipa-visibility.cc index 501d3c304aa..21f0c47f388 100644 --- a/gcc/ipa-visibility.cc +++ b/gcc/ipa-visibility.cc @@ -102,7 +102,9 @@ non_local_p (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED) && !node->externally_visible && !node->used_from_other_partition && !node->in_other_partition - && node->get_availability () >= AVAIL_AVAILABLE); + && node->get_availability () >= AVAIL_AVAILABLE + && !DECL_STATIC_CONSTRUCTOR (node->decl) + && !DECL_STATIC_DESTRUCTOR (node->decl)); } /* Return true when function can be marked local. */ @@ -116,7 +118,6 @@ cgraph_node::local_p (void) return n->callees->callee->local_p (); return !n->call_for_symbol_thunks_and_aliases (non_local_p, NULL, true); - } /* A helper for comdat_can_be_unshared_p. */ diff --git a/gcc/testsuite/gcc.dg/lto/pr115815_0.c b/gcc/testsuite/gcc.dg/lto/pr115815_0.c new file mode 100644 index 000..d938ae4c802 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr115815_0.c @@ -0,0 +1,18 @@ +int a; +volatile int v; +volatile int w; + +int __attribute__((destructor)) +b() { + if (v) +return a + b(); + v = 5; + return 0; +} + +int +main (int argc, char **argv) +{ + w = 1; + return 0; +}
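The "implied invocation" mentioned in the commit message is easy to observe from user code. The sketch below is an illustration using the GCC/Clang `constructor` attribute; `ctor_ran` and `mark_ctor_ran` are hypothetical names. No statement in the translation unit calls the function, yet it runs before main, which is why an analysis that only looks at explicit call sites cannot assume it sees every invocation of such a function.

```c
#include <assert.h>

/* Set by the constructor below; zero-initialized otherwise.  */
int ctor_ran;

/* Nothing in this file calls mark_ctor_ran explicitly.  The startup
   code invokes it before main because of the constructor attribute
   (a GCC/Clang extension).  */
static void __attribute__ ((constructor))
mark_ctor_ran (void)
{
  ctor_ran = 1;
}
```

Checking `ctor_ran` at the start of main would show the value 1 even though the call graph of the file contains no caller of `mark_ctor_ran`.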
[gcc r15-3516] ipa: Move pass_ipa_cdtor_merge before pass_ipa_cp and pass_ipa_sra
https://gcc.gnu.org/g:db0fa0b35b922449d703c040383abf7acb349d9d

commit r15-3516-gdb0fa0b35b922449d703c040383abf7acb349d9d
Author: Martin Jambor
Date: Fri Sep 6 14:12:54 2024 +0200

ipa: Move pass_ipa_cdtor_merge before pass_ipa_cp and pass_ipa_sra

When looking at PR 115815 we realized that it would make sense to make calls to functions originally declared static constructors and destructors created by pass_ipa_cdtor_merge visible to IPA-SRA. This patch does that.

gcc/ChangeLog:

2024-07-25 Martin Jambor

* passes.def: Move pass_ipa_cdtor_merge before pass_ipa_cp and pass_ipa_sra.

Diff:
---
 gcc/passes.def | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/passes.def b/gcc/passes.def
index 6d98c3c9282..40162ac20a0 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -157,9 +157,9 @@ along with GCC; see the file COPYING3. If not see
   NEXT_PASS (pass_ipa_profile);
   NEXT_PASS (pass_ipa_icf);
   NEXT_PASS (pass_ipa_devirt);
+  NEXT_PASS (pass_ipa_cdtor_merge);
   NEXT_PASS (pass_ipa_cp);
   NEXT_PASS (pass_ipa_sra);
-  NEXT_PASS (pass_ipa_cdtor_merge);
   NEXT_PASS (pass_ipa_fn_summary);
   NEXT_PASS (pass_ipa_inline);
   NEXT_PASS (pass_ipa_pure_const);
[gcc r15-3589] ipa: Rename ipa_supports_p to ipa_vr_supported_type_p
https://gcc.gnu.org/g:323291c29c77e3214f4850129bb8a3d0d8da6a45 commit r15-3589-g323291c29c77e3214f4850129bb8a3d0d8da6a45 Author: Martin Jambor Date: Wed Sep 11 23:53:21 2024 +0200 ipa: Rename ipa_supports_p to ipa_vr_supported_type_p ipa_supports_p is not a name that captures well what the predicate determines. Therefore, this patch renames it to ipa_vr_supported_type_p. gcc/ChangeLog: 2024-09-06 Martin Jambor * ipa-cp.h (ipa_supports_p): Rename to ipa_vr_supported_type_p. * ipa-cp.cc (ipa_vr_operation_and_type_effects): Adjust called function name. (propagate_vr_across_jump_function): Likewise. * ipa-prop.cc (ipa_compute_jump_functions_for_edge): Likewise. (ipcp_get_parm_bits): Likewise. Diff: --- gcc/ipa-cp.cc | 5 +++-- gcc/ipa-cp.h| 2 +- gcc/ipa-prop.cc | 6 +++--- 3 files changed, 7 insertions(+), 6 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 56468dc40ee4..a1033b81aefc 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1649,7 +1649,8 @@ ipa_vr_operation_and_type_effects (vrange &dst_vr, enum tree_code operation, tree dst_type, tree src_type) { - if (!ipa_supports_p (dst_type) || !ipa_supports_p (src_type)) + if (!ipa_vr_supported_type_p (dst_type) + || !ipa_vr_supported_type_p (src_type)) return false; range_op_handler handler (operation); @@ -2553,7 +2554,7 @@ propagate_vr_across_jump_function (cgraph_edge *cs, ipa_jump_func *jfunc, ipa_range_set_and_normalize (op_vr, op); if (!handler - || !ipa_supports_p (operand_type) + || !ipa_vr_supported_type_p (operand_type) /* Sometimes we try to fold comparison operators using a pointer type to hold the result instead of a boolean type. Avoid trapping in the sanity check in diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h index 4616c61625ab..ba2ebfede63f 100644 --- a/gcc/ipa-cp.h +++ b/gcc/ipa-cp.h @@ -294,7 +294,7 @@ bool values_equal_for_ipcp_p (tree x, tree y); /* Return TRUE if IPA supports ranges of TYPE. 
*/ static inline bool -ipa_supports_p (tree type) +ipa_vr_supported_type_p (tree type) { return irange::supports_p (type) || prange::supports_p (type); } diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index 99ebd6229ec4..78d1fb7086d5 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -2392,8 +2392,8 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi, else { if (param_type - && ipa_supports_p (TREE_TYPE (arg)) - && ipa_supports_p (param_type) + && ipa_vr_supported_type_p (TREE_TYPE (arg)) + && ipa_vr_supported_type_p (param_type) && get_range_query (cfun)->range_of_expr (vr, arg, cs->call_stmt) && !vr.undefined_p ()) { @@ -5761,7 +5761,7 @@ ipcp_get_parm_bits (tree parm, tree *value, widest_int *mask) ipcp_transformation *ts = ipcp_get_transformation_summary (cnode); if (!ts || vec_safe_length (ts->m_vr) == 0 - || !ipa_supports_p (TREE_TYPE (parm))) + || !ipa_vr_supported_type_p (TREE_TYPE (parm))) return false; int i = ts->get_param_index (current_function_decl, parm);
[gcc r15-3590] ipa-cp: One more use of ipa_vr_supported_type_p
https://gcc.gnu.org/g:f910b02919036647a3f096265cda19358dded628 commit r15-3590-gf910b02919036647a3f096265cda19358dded628 Author: Martin Jambor Date: Wed Sep 11 23:53:21 2024 +0200 ipa-cp: One more use of ipa_vr_supported_type_p Since we have the predicate, this patch converts one more check for essentially the same thing into its use. 2024-09-11 Martin Jambor * ipa-cp.cc (propagate_vr_across_jump_function): Use ipa_vr_supported_type_p instead of explicit check for integral and pointer types. Diff: --- gcc/ipa-cp.cc | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index a1033b81aefc..fa7bd6a15da7 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -2519,8 +2519,7 @@ propagate_vr_across_jump_function (cgraph_edge *cs, ipa_jump_func *jfunc, return false; if (!param_type - || (!INTEGRAL_TYPE_P (param_type) - && !POINTER_TYPE_P (param_type))) + || !ipa_vr_supported_type_p (param_type)) return dest_lat->set_to_bottom (); if (jfunc->type == IPA_JF_PASS_THROUGH)
[gcc r14-9403] ipa: Avoid excessive removing of SSAs (PR 113757)
https://gcc.gnu.org/g:54e505d0446f86b7ad383acbb8e5501f20872b64 commit r14-9403-g54e505d0446f86b7ad383acbb8e5501f20872b64 Author: Martin Jambor Date: Sat Mar 9 00:47:22 2024 +0100 ipa: Avoid excessive removing of SSAs (PR 113757) PR 113757 shows that the code which was meant to debug-reset and remove SSAs defined by LHSs of calls redirected to __builtin_unreachable can trigger also when speculative devirtualization creates a call to a noreturn function (and since it is noreturn, it does not bother dealing with its return value). What is more, it seems that the code handling this case is not really necessary. I feel slightly idiotic about this because I have a feeling that I added it because of a failing test-case but I can neither find the testcase nor a reason why the code in cgraph_edge::redirect_call_stmt_to_callee would not be sufficient (it turns the SSA name into a default-def, a bit like IPA-SRA, but any code dominated by a call to a noreturn is not dangerous when it comes to its side-effects). So this patch just removes the handling. gcc/ChangeLog: 2024-02-07 Martin Jambor PR ipa/113757 * tree-inline.cc (redirect_all_calls): Remove code adding SSAs to id->killed_new_ssa_names. gcc/testsuite/ChangeLog: 2024-02-07 Martin Jambor PR ipa/113757 * g++.dg/ipa/pr113757.C: New test. 
Diff: --- gcc/testsuite/g++.dg/ipa/pr113757.C | 14 ++ gcc/tree-inline.cc | 14 ++ 2 files changed, 16 insertions(+), 12 deletions(-) diff --git a/gcc/testsuite/g++.dg/ipa/pr113757.C b/gcc/testsuite/g++.dg/ipa/pr113757.C new file mode 100644 index 000..885d4010a10 --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr113757.C @@ -0,0 +1,14 @@ +// { dg-do compile } +// { dg-options "-O2 -fPIC" } +// { dg-require-effective-target fpic } + +long size(); +struct ll { virtual int hh(); }; +ll *slice_owner; +int ll::hh() { __builtin_exit(0); } +int nn() { + if (size()) +return 0; + return slice_owner->hh(); +} +int (*a)() = nn; diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc index f0a067f5812..eebcea8a029 100644 --- a/gcc/tree-inline.cc +++ b/gcc/tree-inline.cc @@ -2984,23 +2984,13 @@ redirect_all_calls (copy_body_data * id, basic_block bb) gimple *stmt = gsi_stmt (si); if (is_gimple_call (stmt)) { - tree old_lhs = gimple_call_lhs (stmt); struct cgraph_edge *edge = id->dst_node->get_edge (stmt); if (edge) { if (!id->killed_new_ssa_names) id->killed_new_ssa_names = new hash_set (16); - gimple *new_stmt - = cgraph_edge::redirect_call_stmt_to_callee (edge, - id->killed_new_ssa_names); - if (old_lhs - && TREE_CODE (old_lhs) == SSA_NAME - && !gimple_call_lhs (new_stmt)) - /* In case of IPA-SRA removing the LHS, the name should have - been already added to the hash. But in case of redirecting - to builtin_unreachable it was not and the name still should - be pruned from debug statements. */ - id->killed_new_ssa_names->add (old_lhs); + cgraph_edge::redirect_call_stmt_to_callee (edge, + id->killed_new_ssa_names); if (stmt == last && id->call_stmt && maybe_clean_eh_stmt (stmt)) gimple_purge_dead_eh_edges (bb);
[gcc r14-9559] ipa: Fix C++ member ptr indirect inlining (PR 114254, PR 108802)
https://gcc.gnu.org/g:bf838884fac573b4902a21bb82d9b6f777e32cb9 commit r14-9559-gbf838884fac573b4902a21bb82d9b6f777e32cb9 Author: Martin Jambor Date: Tue Mar 19 22:33:27 2024 +0100 ipa: Fix C++ member ptr indirect inlining (PR 114254, PR 108802) Even though we have had code to handle creation of indirect call graph edges (so that these calls can then be made direct as part of IPA-CP and inlining and eventually also inlined) for C++ member pointers for many years, it turns out that it does not work for lambdas and that it has been severely broken since GCC 10 when the base class has virtual functions. Lambdas don't work because the code cannot work with structures representing member function pointers because they are passed by reference instead of by value and the code was not ready for that. The presence of virtual methods broke things because at some point C++ FE got clever and stopped emitting the check for virtual methods when the base class does not have any and that in turn made our existing testcases not test the necessary pattern matching code. The pattern matcher had a small bug which did not matter before r10-917-g3b47da42de621c but did afterwards. This patch changes the pattern matcher to match both of these cases. gcc/ChangeLog: 2024-03-06 Martin Jambor PR ipa/108802 PR ipa/114254 * ipa-prop.cc (ipa_get_stmt_member_ptr_load_param): Fix case looking at COMPONENT_REFs directly from a PARM_DECL, also recognize loads from a pointer parameter. (ipa_analyze_indirect_call_uses): Also recognize loads from a pointer parameter, also recognize the case when pfn pointer is loaded in its own BB. gcc/testsuite/ChangeLog: 2024-03-06 Martin Jambor PR ipa/108802 PR ipa/114254 * g++.dg/ipa/iinline-4.C: New test. * g++.dg/ipa/pr108802.C: Likewise. 
Diff: --- gcc/ipa-prop.cc | 110 +-- gcc/testsuite/g++.dg/ipa/iinline-4.C | 61 +++ gcc/testsuite/g++.dg/ipa/pr108802.C | 14 + 3 files changed, 154 insertions(+), 31 deletions(-) diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index e22c4f78405..e8e4918d5a8 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -2500,7 +2500,9 @@ static tree ipa_get_stmt_member_ptr_load_param (gimple *stmt, bool use_delta, HOST_WIDE_INT *offset_p) { - tree rhs, rec, ref_field, ref_offset, fld, ptr_field, delta_field; + tree rhs, fld, ptr_field, delta_field; + tree ref_field = NULL_TREE; + tree ref_offset = NULL_TREE; if (!gimple_assign_single_p (stmt)) return NULL_TREE; @@ -2511,35 +2513,53 @@ ipa_get_stmt_member_ptr_load_param (gimple *stmt, bool use_delta, ref_field = TREE_OPERAND (rhs, 1); rhs = TREE_OPERAND (rhs, 0); } - else -ref_field = NULL_TREE; - if (TREE_CODE (rhs) != MEM_REF) -return NULL_TREE; - rec = TREE_OPERAND (rhs, 0); - if (TREE_CODE (rec) != ADDR_EXPR) -return NULL_TREE; - rec = TREE_OPERAND (rec, 0); - if (TREE_CODE (rec) != PARM_DECL - || !type_like_member_ptr_p (TREE_TYPE (rec), &ptr_field, &delta_field)) + + if (TREE_CODE (rhs) == MEM_REF) +{ + ref_offset = TREE_OPERAND (rhs, 1); + if (ref_field && integer_nonzerop (ref_offset)) + return NULL_TREE; +} + else if (!ref_field) return NULL_TREE; - ref_offset = TREE_OPERAND (rhs, 1); + + if (TREE_CODE (rhs) == MEM_REF + && TREE_CODE (TREE_OPERAND (rhs, 0)) == SSA_NAME + && SSA_NAME_IS_DEFAULT_DEF (TREE_OPERAND (rhs, 0))) +{ + rhs = TREE_OPERAND (rhs, 0); + if (TREE_CODE (SSA_NAME_VAR (rhs)) != PARM_DECL + || !type_like_member_ptr_p (TREE_TYPE (TREE_TYPE (rhs)), &ptr_field, + &delta_field)) + return NULL_TREE; +} + else +{ + if (TREE_CODE (rhs) == MEM_REF + && TREE_CODE (TREE_OPERAND (rhs, 0)) == ADDR_EXPR) + rhs = TREE_OPERAND (TREE_OPERAND (rhs, 0), 0); + if (TREE_CODE (rhs) != PARM_DECL + || !type_like_member_ptr_p (TREE_TYPE (rhs), &ptr_field, + &delta_field)) + return NULL_TREE; +} if (use_delta) fld = 
delta_field; else fld = ptr_field; - if (offset_p) -*offset_p = int_bit_position (fld); if (ref_field) { - if (integer_nonzerop (ref_offset)) + if (ref_field != fld) return NULL_TREE; - return ref_field == fld ? rec : NULL_TREE; } - else -return tree_int_cst_equal (byte_position (fld), ref_offset) ? rec - : NULL_TREE; + else if (!tree_int_cst_equal (byte_position (fld), ref_offset)) +return NULL_TREE; + + if (offset_p) +*offset_p = int_bit_position (fld); + return rhs; } /* Returns true iff
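For readers unfamiliar with the source-level construct behind r14-9559, the following is a minimal sketch of two ways a C++ member function pointer can reach a call site: passed by value and passed by reference. The names Counter, Method, call_by_value and call_by_reference are hypothetical and do not come from the patch or its testcases, and the sketch makes no claim about the exact GIMPLE that GCC's pattern matcher sees for either form.

```cpp
// Hypothetical example type; not taken from iinline-4.C or pr108802.C.
struct Counter {
  int base;
  int add(int x) { return base + x; }
};

// Alias for a pointer to a member function of Counter taking and returning int.
using Method = int (Counter::*)(int);

// The member pointer is a by-value parameter; the indirect call goes through it.
int call_by_value(Counter &c, Method m, int x) {
  return (c.*m)(x);
}

// The member pointer arrives by reference, so it must be loaded through a
// pointer-like parameter before the indirect call.
int call_by_reference(Counter &c, const Method &m, int x) {
  return (c.*m)(x);
}
```

Both helpers compute the same result; the difference is only in how the member pointer is handed over, which is the distinction the commit message draws between the long-supported case and the lambda case.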
[gcc r14-9794] ipa: Avoid duplicate replacements in IPA-SRA transformation phase
https://gcc.gnu.org/g:ca56b43105fc09021ec445f1978a17cd85ae5e0c commit r14-9794-gca56b43105fc09021ec445f1978a17cd85ae5e0c Author: Martin Jambor Date: Thu Apr 4 22:46:16 2024 +0200 ipa: Avoid duplicate replacements in IPA-SRA transformation phase When the analysis part of IPA-SRA figures out that it would split out a scalar part of an aggregate which is known by IPA-CP to contain a known constant, it skips it knowing that the transformation part looks at IPA-CP aggregate results too and does the right thing (which can include doing the propagation in GIMPLE because that is the last moment the parameter exists). However, when IPA-SRA wants to split out a smaller aggregate out of an aggregate, which happens to be of the same size as a known scalar constant at the same offset, the transformation bit fails to recognize the situation, tries to do both splitting and constant propagation and in PR 111571 testcase creates a nonsensical call statement on which the call redirection then ICEs. Fixed by making sure we don't try to do two replacements of the same part of the same parameter. The look-up among replacements requires these are sorted and this patch just sorts them if they are not already sorted before each new look-up. The worst number of sortings that can happen is number of parameters which are both split and have aggregate constants times param_ipa_max_agg_items (default 16). I don't think complicating the source code to optimize for this unlikely case is worth it but if need be, it can of course be done. gcc/ChangeLog: 2024-03-15 Martin Jambor PR ipa/111571 * ipa-param-manipulation.cc (ipa_param_body_adjustments::common_initialization): Avoid creating duplicate replacement entries. gcc/testsuite/ChangeLog: 2024-03-15 Martin Jambor PR ipa/111571 * gcc.dg/ipa/pr111571.c: New test. 
Diff: --- gcc/ipa-param-manipulation.cc | 16 gcc/testsuite/gcc.dg/ipa/pr111571.c | 29 + 2 files changed, 45 insertions(+) diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc index 3e0df6a6f77..f4b5e850c2b 100644 --- a/gcc/ipa-param-manipulation.cc +++ b/gcc/ipa-param-manipulation.cc @@ -1525,6 +1525,22 @@ ipa_param_body_adjustments::common_initialization (tree old_fndecl, replacement with a constant (for split aggregates passed by value). */ + if (split[parm_num]) + { + /* We must be careful not to add a duplicate +replacement. */ + sort_replacements (); + ipa_param_body_replacement *pbr + = lookup_replacement_1 (m_oparms[parm_num], + av.unit_offset); + if (pbr) + { + /* Otherwise IPA-SRA should have bailed out. */ + gcc_assert (AGGREGATE_TYPE_P (TREE_TYPE (pbr->repl))); + continue; + } + } + tree repl; if (av.by_ref) repl = av.value; diff --git a/gcc/testsuite/gcc.dg/ipa/pr111571.c b/gcc/testsuite/gcc.dg/ipa/pr111571.c new file mode 100644 index 000..2a4adc608db --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr111571.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +struct a { + int b; +}; +struct c { + long d; + struct a e; + long f; +}; +int g, h, i; +int j() {return 0;} +static void k(struct a l, int p) { + if (h) +g = 0; + for (; g; g = j()) +if (l.b) + break; +} +static void m(struct c l) { + k(l.e, l.f); + for (;; --i) +; +} +int main() { + struct c n = {10, 9}; + m(n); +}
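The sort-then-look-up approach described in the r14-9794 commit message can be pictured with the following minimal sketch. It is not the GCC implementation: Replacement, ReplacementTable and add_constant_unless_duplicate are hypothetical stand-ins for ipa_param_body_replacement, the replacement vector and the new check in common_initialization, and the entries are reduced to a parameter index and a unit offset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one replacement of a piece of a parameter.
struct Replacement {
  int parm_index;        // which original parameter is being replaced
  unsigned unit_offset;  // byte offset of the replaced piece
  bool is_aggregate;     // true if the replacement itself is an aggregate
};

class ReplacementTable {
public:
  void add(const Replacement &r) {
    m_items.push_back(r);
    m_sorted = false;  // a new entry may break the ordering
  }

  // Binary search; sorts first if entries were added since the last look-up.
  const Replacement *lookup(int parm_index, unsigned unit_offset) {
    sort_if_needed();
    Replacement key{parm_index, unit_offset, false};
    auto it = std::lower_bound(m_items.begin(), m_items.end(), key,
                               by_parm_then_offset);
    if (it != m_items.end() && it->parm_index == parm_index
        && it->unit_offset == unit_offset)
      return &*it;
    return nullptr;
  }

  // Add a constant replacement unless the same piece is already replaced.
  bool add_constant_unless_duplicate(int parm_index, unsigned unit_offset) {
    if (const Replacement *existing = lookup(parm_index, unit_offset)) {
      // Mirrors the assertion in the patch: a clash is only expected with
      // a split-out aggregate.
      assert(existing->is_aggregate);
      return false;
    }
    add({parm_index, unit_offset, false});
    return true;
  }

  std::size_t size() const { return m_items.size(); }

private:
  static bool by_parm_then_offset(const Replacement &a, const Replacement &b) {
    if (a.parm_index != b.parm_index)
      return a.parm_index < b.parm_index;
    return a.unit_offset < b.unit_offset;
  }

  void sort_if_needed() {
    if (!m_sorted) {
      std::sort(m_items.begin(), m_items.end(), by_parm_then_offset);
      m_sorted = true;
    }
  }

  std::vector<Replacement> m_items;
  bool m_sorted = true;
};
```

Re-sorting before each look-up is the simple choice the commit message argues for: the number of sorts is bounded by the number of candidate constants, so a more elaborate incremental structure is probably not worth the extra code.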
[gcc r14-9813] ipa: Force args obtained through pass-through maps to the expected type (PR 113964)
https://gcc.gnu.org/g:8cd0d29270d4ed86c69b80c08de66dcb6c1e22fe commit r14-9813-g8cd0d29270d4ed86c69b80c08de66dcb6c1e22fe Author: Martin Jambor Date: Fri Apr 5 18:18:39 2024 +0200 ipa: Force args obtained through pass-through maps to the expected type (PR 113964) Interactions of IPA-CP and IPA-SRA on the same data are a rather big source of issues, I'm afraid. PR 113964 is a situation where IPA-CP propagates an unsigned short in a union parameter into a function which itself calls a different function which has the same union parameter and both these union parameters are split with IPA-SRA. The leaf function however uses a signed short member of the union. In the calling function, we get the unsigned constant as the replacement for the union and it is then passed in the call without any type compatibility checks. Apparently on riscv64 it matters whether the parameter is signed or unsigned short and so the leaf function can see different values. Fixed by using useless_type_conversion_p at the appropriate place and if it fails, use force_value_to_type as elsewhere in similar situations. gcc/ChangeLog: 2024-04-04 Martin Jambor PR ipa/113964 * ipa-param-manipulation.cc (ipa_param_adjustments::modify_call): Force values obtained through pass-through maps to the expected split type. gcc/testsuite/ChangeLog: 2024-04-04 Patrick O'Neill Martin Jambor PR ipa/113964 * gcc.dg/ipa/pr114247.c: New test. 
Diff: --- gcc/ipa-param-manipulation.cc | 6 ++ gcc/testsuite/gcc.dg/ipa/pr114247.c | 31 +++ 2 files changed, 37 insertions(+) diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc index f4b5e850c2b..ad36b8389c0 100644 --- a/gcc/ipa-param-manipulation.cc +++ b/gcc/ipa-param-manipulation.cc @@ -740,6 +740,12 @@ ipa_param_adjustments::modify_call (cgraph_edge *cs, } if (repl) { + if (!useless_type_conversion_p(apm->type, repl->typed.type)) + { + repl = force_value_to_type (apm->type, repl); + repl = force_gimple_operand_gsi (&gsi, repl, + true, NULL, true, GSI_SAME_STMT); + } vargs.quick_push (repl); continue; } diff --git a/gcc/testsuite/gcc.dg/ipa/pr114247.c b/gcc/testsuite/gcc.dg/ipa/pr114247.c new file mode 100644 index 000..60aa2bc0122 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr114247.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fsigned-char -fno-strict-aliasing -fwrapv" } */ + +union a { + unsigned short b; + int c; + signed short d; +}; +int e, f = 1, g; +long h; +const int **i; +void j(union a k, int l, unsigned m) { + const int *a[100]; + i = &a[0]; + h = k.d; +} +static int o(union a k) { + k.d = -1; + while (1) +if (f) + break; + j(k, g, e); + return 0; +} +int main() { + union a n = {1}; + o(n); + if (h != -1) +__builtin_abort(); + return 0; +}
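Why the signedness of the passed value matters in PR 113964 can be illustrated outside the compiler with a small sketch. The two helper names below are hypothetical and this is not GCC code; it only shows that the same 16 bits widen to different values depending on whether they are read as unsigned short or as signed short, which is the mismatch the added conversion guards against.

```cpp
#include <cstdint>

// Widen 16 bits to long, treating them as an unsigned short.
long read_as_unsigned_short(std::uint16_t bits) {
  return bits;
}

// Widen the same 16 bits to long, treating them as a signed short.
// The conversion to int16_t is modular on the targets GCC supports
// (and guaranteed to be so since C++20).
long read_as_signed_short(std::uint16_t bits) {
  return static_cast<std::int16_t>(bits);
}
```

With the all-ones pattern 0xFFFF, the first helper yields 65535 and the second yields -1, matching the testcase above in which the leaf function expects to read -1 through the signed short member.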
[gcc r13-8594] ipa: Self-DCE of uses of removed call LHSs (PR 108007)
https://gcc.gnu.org/g:40ddc0b05a47f999b24f20c1becb79004995731b commit r13-8594-g40ddc0b05a47f999b24f20c1becb79004995731b Author: Martin Jambor Date: Mon Apr 8 17:34:33 2024 +0200 ipa: Self-DCE of uses of removed call LHSs (PR 108007) PR 108007 is another manifestation where we rely on DCE to clean up after IPA-SRA and if the user explicitly switches DCE off, IPA-SRA can leave behind statements which are fed uninitialized values and trap, even though their results are themselves never used. I have already fixed this for unused parameters in callees, this bug shows that almost the same thing can happen for removed returns, on the side of callers. This means that the issue has to be fixed elsewhere, in call redirection. This patch adds a function which looks for (and through, using a work-list) uses of operations fed specific SSA names and removes them all. That would have been easy if it wasn't for debug statements during tree-inline (from which call redirection is also invoked). Debug statements are decoupled from the rest at this point and iterating over uses of SSAs does not bring them up. During tree-inline they are handled specially at the end, I assume in order to make sure that relative ordering of UIDs is the same with and without debug info. This means that during tree-inline we need to make a hash of killed SSAs, that we already have in copy_body_data, available to the function making the purging. So the patch duly does also that, making the interface slightly ugly. Moreover, all newly unused SSA names need to be freed and as PR 112616 showed, it must be done in a defined order, which is what newly added ipa_release_ssas_in_hash does. This backport to gcc-13 also contains 54e505d0446f86b7ad383acbb8e5501f20872b64 in order not to reintroduce PR 113757. gcc/ChangeLog: 2024-04-05 Martin Jambor PR ipa/108007 PR ipa/112616 * cgraph.h (cgraph_edge): Add a parameter to redirect_call_stmt_to_callee. 
* ipa-param-manipulation.h (ipa_param_adjustments): Add a parameter to modify_call. (ipa_release_ssas_in_hash): Declare. * cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee): New parameter killed_ssas, pass it to padjs->modify_call. * ipa-param-manipulation.cc (purge_all_uses): New function. (ipa_param_adjustments::modify_call): New parameter killed_ssas. Instead of substituting uses, invoke purge_all_uses. If hash of killed SSAs has not been provided, create a temporary one and release SSAs that have been added to it. (compare_ssa_versions): New function. (ipa_release_ssas_in_hash): Likewise. * tree-inline.cc (redirect_all_calls): Create id->killed_new_ssa_names earlier, pass it to edge redirection, adjust a comment. (copy_body): Release SSAs in id->killed_new_ssa_names. gcc/testsuite/ChangeLog: 2024-01-15 Martin Jambor PR ipa/108007 PR ipa/112616 * gcc.dg/ipa/pr108007.c: New test. * gcc.dg/ipa/pr112616.c: Likewise. (cherry picked from commit a9a8426e534760b8d3a250e9bd3cff4db131a2be) Diff: --- gcc/cgraph.cc | 10 +++- gcc/cgraph.h| 9 ++- gcc/ipa-param-manipulation.cc | 112 +--- gcc/ipa-param-manipulation.h| 5 +- gcc/testsuite/g++.dg/ipa/pr113757.C | 14 + gcc/testsuite/gcc.dg/ipa/pr108007.c | 32 +++ gcc/testsuite/gcc.dg/ipa/pr112616.c | 28 + gcc/tree-inline.cc | 27 - 8 files changed, 193 insertions(+), 44 deletions(-) diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc index ec663d23385..7a14c00b60a 100644 --- a/gcc/cgraph.cc +++ b/gcc/cgraph.cc @@ -1403,11 +1403,17 @@ cgraph_edge::redirect_callee (cgraph_node *n) speculative indirect call, remove "speculative" of the indirect call and also redirect stmt to it's final direct target. + When called from within tree-inline, KILLED_SSAs has to contain the pointer + to killed_new_ssa_names within the copy_body_data structure and SSAs + discovered to be useless (if LHS is removed) will be added to it, otherwise + it needs to be NULL. + It is up to caller to iteratively transform each "speculative" direct call as appropriate. 
*/ gimple * -cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e) +cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e, + hash_set *killed_ssas) { tree decl = gimple_call_fndecl (e->call_stmt); gcall *new_stmt; @@ -1527,7 +1533,7 @@ cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e) remove_stmt_from_eh_lp (e->ca
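The work-list idea behind purge_all_uses, as described in the r13-8594 commit message, can be sketched on a toy intermediate representation. Everything below is hypothetical: Stmt and purge_all_uses_sketch are illustrative names, statement number i is assumed to define value i, and real GIMPLE concerns such as debug statements, PHI nodes and virtual operands are left out entirely.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Toy IR, not GIMPLE: statement number i defines value i and reads the
// values listed in `operands`.
struct Stmt {
  std::vector<int> operands;
  bool removed = false;
};

// Remove every statement that directly or transitively consumes dead_value.
// Returns the set of killed values; std::set iterates in ascending order,
// which gives a defined order in which the values could later be released.
std::set<int> purge_all_uses_sketch(std::vector<Stmt> &stmts, int dead_value) {
  std::set<int> killed{dead_value};
  std::vector<int> worklist{dead_value};
  while (!worklist.empty()) {
    int value = worklist.back();
    worklist.pop_back();
    for (std::size_t i = 0; i < stmts.size(); ++i) {
      Stmt &s = stmts[i];
      if (s.removed)
        continue;
      bool uses_value = std::find(s.operands.begin(), s.operands.end(), value)
                        != s.operands.end();
      if (!uses_value)
        continue;
      // The statement is fed a dead value, so it goes away and its own
      // result becomes dead as well.
      s.removed = true;
      int defined = static_cast<int>(i);
      if (killed.insert(defined).second)
        worklist.push_back(defined);
    }
  }
  return killed;
}
```

The statement that originally defined dead_value (the call whose LHS was dropped) is deliberately left in place; only its consumers are removed. Returning the killed values in a sorted container echoes the commit's point that releasing the names must happen in a defined order.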
[gcc r14-9840] ipa: Compare jump functions in ICF (PR 113907)
https://gcc.gnu.org/g:1162861439fd3c4b30fc3ccd49462e47e876f04a commit r14-9840-g1162861439fd3c4b30fc3ccd49462e47e876f04a Author: Martin Jambor Date: Mon Apr 8 18:53:23 2024 +0200 ipa: Compare jump functions in ICF (PR 113907) In PR 113907 comment #58, Honza found a case where ICF thinks bodies of functions are equivalent but because of a difference in aliases in a memory access, different aggregate jump functions are associated with supposedly equivalent call statements. This patch adds a way to compare jump functions and plugs it into ICF to avoid the issue. gcc/ChangeLog: 2024-03-20 Martin Jambor PR ipa/113907 * ipa-prop.h (class ipa_vr): Declare new overload of a member function equal_p. (ipa_jump_functions_equivalent_p): Declare. * ipa-prop.cc (ipa_vr::equal_p): New function. (ipa_agg_pass_through_jf_equivalent_p): Likewise. (ipa_agg_jump_functions_equivalent_p): Likewise. (ipa_jump_functions_equivalent_p): Likewise. * ipa-cp.h (values_equal_for_ipcp_p): Declare. * ipa-cp.cc (values_equal_for_ipcp_p): Make function public. * ipa-icf-gimple.cc: Include alloc-pool.h, symbol-summary.h, sreal.h, ipa-cp.h and ipa-prop.h. (func_checker::compare_gimple_call): Compare jump functions. gcc/testsuite/ChangeLog: 2024-03-20 Martin Jambor PR ipa/113907 * gcc.dg/lto/pr113907_0.c: New. * gcc.dg/lto/pr113907_1.c: Likewise. * gcc.dg/lto/pr113907_2.c: Likewise. Diff: --- gcc/ipa-cp.cc | 2 +- gcc/ipa-cp.h | 2 + gcc/ipa-icf-gimple.cc | 30 ++ gcc/ipa-prop.cc | 167 ++ gcc/ipa-prop.h| 3 + gcc/testsuite/gcc.dg/lto/pr113907_0.c | 18 gcc/testsuite/gcc.dg/lto/pr113907_1.c | 35 +++ gcc/testsuite/gcc.dg/lto/pr113907_2.c | 11 +++ 8 files changed, 267 insertions(+), 1 deletion(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 2a1da631e9c..b7add455bd5 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -201,7 +201,7 @@ ipcp_lattice::is_single_const () /* Return true iff X and Y should be considered equal values by IPA-CP. 
*/ -static bool +bool values_equal_for_ipcp_p (tree x, tree y) { gcc_checking_assert (x != NULL_TREE && y != NULL_TREE); diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h index 0b3cfe4b526..7ff74fb5c98 100644 --- a/gcc/ipa-cp.h +++ b/gcc/ipa-cp.h @@ -289,4 +289,6 @@ public: bool virt_call = false; }; +bool values_equal_for_ipcp_p (tree x, tree y); + #endif /* IPA_CP_H */ diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc index 8c2df7a354e..17f62bec068 100644 --- a/gcc/ipa-icf-gimple.cc +++ b/gcc/ipa-icf-gimple.cc @@ -41,7 +41,12 @@ along with GCC; see the file COPYING3. If not see #include "gimple-walk.h" #include "tree-ssa-alias-compare.h" +#include "alloc-pool.h" +#include "symbol-summary.h" #include "ipa-icf-gimple.h" +#include "sreal.h" +#include "ipa-cp.h" +#include "ipa-prop.h" namespace ipa_icf_gimple { @@ -714,6 +719,31 @@ func_checker::compare_gimple_call (gcall *s1, gcall *s2) && !compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2))) return return_false_with_msg ("GIMPLE internal call LHS type mismatch"); + if (!gimple_call_internal_p (s1)) +{ + cgraph_edge *e1 = cgraph_node::get (m_source_func_decl)->get_edge (s1); + cgraph_edge *e2 = cgraph_node::get (m_target_func_decl)->get_edge (s2); + class ipa_edge_args *args1 = ipa_edge_args_sum->get (e1); + class ipa_edge_args *args2 = ipa_edge_args_sum->get (e2); + if ((args1 != nullptr) != (args2 != nullptr)) + return return_false_with_msg ("ipa_edge_args mismatch"); + if (args1) + { + int n1 = ipa_get_cs_argument_count (args1); + int n2 = ipa_get_cs_argument_count (args2); + if (n1 != n2) + return return_false_with_msg ("ipa_edge_args nargs mismatch"); + for (int i = 0; i < n1; i++) + { + struct ipa_jump_func *jf1 = ipa_get_ith_jump_func (args1, i); + struct ipa_jump_func *jf2 = ipa_get_ith_jump_func (args2, i); + if (((jf1 != nullptr) != (jf2 != nullptr)) + || (jf1 && !ipa_jump_functions_equivalent_p (jf1, jf2))) + return return_false_with_msg ("jump function mismatch"); + } + } +} + return compare_operand (t1, 
t2, get_operand_access_type (&map, t1)); } diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index e8e4918d5a8..374e998aa64 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -156,6 +156,20 @@ ipa_vr::equal_p (const vrange &r) const return (types_compatible_p (m_type, r.type ()) && m_storage->equal_p (r)); } +bool +ipa_vr::equal_p (const ipa_vr &o) const +{ + i
[gcc r14-9841] ICF&SRA: Make ICF and SRA agree on padding
https://gcc.gnu.org/g:1e3312a25a7b34d6e3f549273e1674c7114e4408 commit r14-9841-g1e3312a25a7b34d6e3f549273e1674c7114e4408 Author: Martin Jambor Date: Mon Apr 8 18:53:23 2024 +0200 ICF&SRA: Make ICF and SRA agree on padding PR 113359 shows that (at least with -fno-strict-aliasing) ICF can unify two functions which copy an aggregate type of the same size but then SRA, through its total scalarization, can copy the aggregate by pieces, skipping padding, but the padding was not the same in the two original functions that ICF unified. This patch enhances SRA with the ability to collect padding information which then can be compared from within ICF. Unfortunately SRA uses OPTION_SET_P when determining its limits, so ICF needs to switch cfuns at least once to figure it out too. gcc/ChangeLog: 2024-03-27 Martin Jambor PR ipa/113359 * ipa-icf-gimple.h (func_checker): New members safe_for_total_scalarization_p, m_total_scalarization_limit_known_p and m_total_scalarization_limit. (func_checker::func_checker): Initialize new member variables. * ipa-icf-gimple.cc: Include tree-sra.h. (func_checker::func_checker): Initialize new member variables. (func_checker::safe_for_total_scalarization_p): New function. (func_checker::compare_operand): Use the new function. * tree-sra.h (sra_get_max_scalarization_size): Declare. (sra_total_scalarization_would_copy_same_data_p): Likewise. * tree-sra.cc (prepare_iteration_over_array_elts): New function. (class sra_padding_collecting): New. (sra_padding_collecting::record_padding): Likewise. (scalarizable_type_p): Rename to totally_scalarizable_type_p. Add ability to record padding when requested. (totally_scalarize_subtree): Split out gathering information necessary to iterate over array elements to prepare_iteration_over_array_elts. Fix erroneous early exit. (analyze_all_variable_accesses): Adjust the call to totally_scalarizable_type_p. Move determining of total scalarization size limit... (sra_get_max_scalarization_size): ...here. 
(check_ts_and_push_padding_to_vec): New function. (sra_total_scalarization_would_copy_same_data_p): Likewise. gcc/testsuite/ChangeLog: 2024-03-27 Martin Jambor PR ipa/113359 * gcc.dg/lto/pr113359-1_0.c: New. * gcc.dg/lto/pr113359-1_1.c: Likewise. * gcc.dg/lto/pr113359-2_0.c: Likewise. * gcc.dg/lto/pr113359-2_1.c: Likewise. * gcc.dg/lto/pr113359-3_0.c: Likewise. * gcc.dg/lto/pr113359-3_1.c: Likewise. * gcc.dg/lto/pr113359-4_0.c: Likewise. * gcc.dg/lto/pr113359-4_1.c: Likewise. * gcc.dg/lto/pr113359-5_0.c: Likewise. * gcc.dg/lto/pr113359-5_1.c: Likewise. Diff: --- gcc/ipa-icf-gimple.cc | 41 +- gcc/ipa-icf-gimple.h| 15 +- gcc/testsuite/gcc.dg/lto/pr113359-1_0.c | 86 +++ gcc/testsuite/gcc.dg/lto/pr113359-1_1.c | 38 + gcc/testsuite/gcc.dg/lto/pr113359-2_0.c | 87 +++ gcc/testsuite/gcc.dg/lto/pr113359-2_1.c | 38 + gcc/testsuite/gcc.dg/lto/pr113359-3_0.c | 114 +++ gcc/testsuite/gcc.dg/lto/pr113359-3_1.c | 49 +++ gcc/testsuite/gcc.dg/lto/pr113359-4_0.c | 114 +++ gcc/testsuite/gcc.dg/lto/pr113359-4_1.c | 49 +++ gcc/testsuite/gcc.dg/lto/pr113359-5_0.c | 118 +++ gcc/testsuite/gcc.dg/lto/pr113359-5_1.c | 50 +++ gcc/tree-sra.cc | 252 +--- gcc/tree-sra.h | 3 + 14 files changed, 999 insertions(+), 55 deletions(-) diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc index 17f62bec068..c25eb24710f 100644 --- a/gcc/ipa-icf-gimple.cc +++ b/gcc/ipa-icf-gimple.cc @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. 
If not see #include "cfgloop.h" #include "attribs.h" #include "gimple-walk.h" +#include "tree-sra.h" #include "tree-ssa-alias-compare.h" #include "alloc-pool.h" @@ -64,7 +65,8 @@ func_checker::func_checker (tree source_func_decl, tree target_func_decl, : m_source_func_decl (source_func_decl), m_target_func_decl (target_func_decl), m_ignored_source_nodes (ignored_source_nodes), m_ignored_target_nodes (ignored_target_nodes), -m_ignore_labels (ignore_labels), m_tbaa (tbaa) +m_ignore_labels (ignore_labels), m_tbaa (tbaa), +m_total_scalarization_limit_known_p (false) { function *source_func = DECL_STRUCT_FUNCTION (source_func_decl); function *target_func = DECL_STRUCT_FUNCTION (target_func_decl); @@ -361,6 +363,36 @@ func_checker::operand_equal_p (const_tree t1, const_tree
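The situation PR 113359 describes, two aggregates of equal size whose padding sits in different places, can be made concrete with a small sketch. This is not how sra_padding_collecting is implemented: Field, Range and padding_ranges are hypothetical names, and the layouts are given as explicit offsets and sizes rather than derived from real types.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical description of one scalar field inside an aggregate.
struct Field {
  std::size_t offset;  // byte offset from the start of the aggregate
  std::size_t size;    // size of the field in bytes
};

// A padding range as (offset, length).
using Range = std::pair<std::size_t, std::size_t>;

// Return the byte ranges of an aggregate of total_size bytes that are not
// covered by any field, assuming the fields do not overlap.
std::vector<Range> padding_ranges(std::vector<Field> fields,
                                  std::size_t total_size) {
  std::sort(fields.begin(), fields.end(),
            [](const Field &a, const Field &b) { return a.offset < b.offset; });
  std::vector<Range> gaps;
  std::size_t pos = 0;
  for (const Field &f : fields) {
    if (f.offset > pos)
      gaps.push_back({pos, f.offset - pos});  // hole before this field
    pos = std::max(pos, f.offset + f.size);
  }
  if (pos < total_size)
    gaps.push_back({pos, total_size - pos});  // tail padding
  return gaps;
}
```

For an 8-byte aggregate holding a 1-byte field at offset 0 and a 4-byte field at offset 4, the padding is bytes 1 to 3; with a 4-byte field at offset 0 and a 1-byte field at offset 4 it is bytes 5 to 7. A field-by-field copy therefore moves different bytes in the two cases even though a whole-aggregate copy would look identical, which is why comparing sizes alone is not enough.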
[gcc r14-9926] contrib/check-params-in-docs.py: Ignore gcn-preferred-vectorization-factor
https://gcc.gnu.org/g:33f83d3cd84f9876180a2e2a9d1ea082debdaa37 commit r14-9926-g33f83d3cd84f9876180a2e2a9d1ea082debdaa37 Author: Martin Jambor Date: Thu Apr 11 19:37:45 2024 +0200 contrib/check-params-in-docs.py: Ignore gcn-preferred-vectorization-factor contrib/check-params-in-docs.py is a script that checks that all options reported with ./gcc/xgcc -Bgcc --help=param are in gcc/doc/invoke.texi and vice versa. gcn-preferred-vectorization-factor is in the manual but normally not reported by --help, probably because I do not have gcn offload configured. This patch makes the script silent about this particular fact. contrib/ChangeLog: 2024-04-11 Martin Jambor * check-params-in-docs.py (ignored): Add gcn-preferred-vectorization-factor. Diff: --- contrib/check-params-in-docs.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/check-params-in-docs.py b/contrib/check-params-in-docs.py index 623c82284e2..f7879dd8e08 100755 --- a/contrib/check-params-in-docs.py +++ b/contrib/check-params-in-docs.py @@ -45,7 +45,7 @@ parser.add_argument('params_output') args = parser.parse_args() -ignored = {'logical-op-non-short-circuit'} +ignored = {'logical-op-non-short-circuit', 'gcn-preferred-vectorization-factor'} params = {} for line in open(args.params_output).readlines():
[gcc r13-8619] ipa: Avoid duplicate replacements in IPA-SRA transformation phase
https://gcc.gnu.org/g:8a3784adf5cd873ca295a5a011d8623338ff3976 commit r13-8619-g8a3784adf5cd873ca295a5a011d8623338ff3976 Author: Martin Jambor Date: Fri Apr 19 16:48:12 2024 +0200 ipa: Avoid duplicate replacements in IPA-SRA transformation phase When the analysis part of IPA-SRA figures out that it would split out a scalar part of an aggregate which is known by IPA-CP to contain a known constant, it skips it knowing that the transformation part looks at IPA-CP aggregate results too and does the right thing (which can include doing the propagation in GIMPLE because that is the last moment the parameter exists). However, when IPA-SRA wants to split out a smaller aggregate out of an aggregate, which happens to be of the same size as a known scalar constant at the same offset, the transformation bit fails to recognize the situation, tries to do both splitting and constant propagation and in PR 111571 testcase creates a nonsensical call statement on which the call redirection then ICEs. Fixed by making sure we don't try to do two replacements of the same part of the same parameter. The look-up among replacements requires these are sorted and this patch just sorts them if they are not already sorted before each new look-up. The worst number of sortings that can happen is number of parameters which are both split and have aggregate constants times param_ipa_max_agg_items (default 16). I don't think complicating the source code to optimize for this unlikely case is worth it but if need be, it can of course be done. gcc/ChangeLog: 2024-03-15 Martin Jambor PR ipa/111571 * ipa-param-manipulation.cc (ipa_param_body_adjustments::common_initialization): Avoid creating duplicate replacement entries. gcc/testsuite/ChangeLog: 2024-03-15 Martin Jambor PR ipa/111571 * gcc.dg/ipa/pr111571.c: New test. 
(cherry picked from commit ca56b43105fc09021ec445f1978a17cd85ae5e0c) Diff: --- gcc/ipa-param-manipulation.cc | 16 gcc/testsuite/gcc.dg/ipa/pr111571.c | 29 + 2 files changed, 45 insertions(+) diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc index 182f0c6741e..e4f626ae95e 100644 --- a/gcc/ipa-param-manipulation.cc +++ b/gcc/ipa-param-manipulation.cc @@ -1484,6 +1484,22 @@ ipa_param_body_adjustments::common_initialization (tree old_fndecl, replacement with a constant (for split aggregates passed by value). */ + if (split[parm_num]) + { + /* We must be careful not to add a duplicate +replacement. */ + sort_replacements (); + ipa_param_body_replacement *pbr + = lookup_replacement_1 (m_oparms[parm_num], + av.unit_offset); + if (pbr) + { + /* Otherwise IPA-SRA should have bailed out. */ + gcc_assert (AGGREGATE_TYPE_P (TREE_TYPE (pbr->repl))); + continue; + } + } + tree repl; if (av.by_ref) repl = av.value; diff --git a/gcc/testsuite/gcc.dg/ipa/pr111571.c b/gcc/testsuite/gcc.dg/ipa/pr111571.c new file mode 100644 index 000..2a4adc608db --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr111571.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +struct a { + int b; +}; +struct c { + long d; + struct a e; + long f; +}; +int g, h, i; +int j() {return 0;} +static void k(struct a l, int p) { + if (h) +g = 0; + for (; g; g = j()) +if (l.b) + break; +} +static void m(struct c l) { + k(l.e, l.f); + for (;; --i) +; +} +int main() { + struct c n = {10, 9}; + m(n); +}
[gcc r13-8620] ipa: Force args obtained through pass-through maps to the expected type (PR 113964)
https://gcc.gnu.org/g:5c3238b0d55ec13a2430aa606e2bfed9432e97ac

commit r13-8620-g5c3238b0d55ec13a2430aa606e2bfed9432e97ac
Author: Martin Jambor
Date: Fri Apr 19 16:48:12 2024 +0200

ipa: Force args obtained through pass-through maps to the expected type (PR 113964)

Interactions of IPA-CP and IPA-SRA on the same data are a rather big source of issues, I'm afraid. PR 113964 is a situation where IPA-CP propagates an unsigned short in a union parameter into a function which itself calls a different function which has the same union parameter, and both these union parameters are split with IPA-SRA. The leaf function however uses a signed short member of the union.

In the calling function, we get the unsigned constant as the replacement for the union and it is then passed in the call without any type compatibility checks. Apparently on riscv64 it matters whether the parameter is signed or unsigned short and so the leaf function can see different values.

Fixed by using useless_type_conversion_p at the appropriate place and, if it fails, using force_value_to_type as elsewhere in similar situations.

gcc/ChangeLog:

2024-04-04  Martin Jambor

    PR ipa/113964
    * ipa-param-manipulation.cc (ipa_param_adjustments::modify_call):
    Force values obtained through pass-through maps to the expected
    split type.

gcc/testsuite/ChangeLog:

2024-04-04  Patrick O'Neill
            Martin Jambor

    PR ipa/113964
    * gcc.dg/ipa/pr114247.c: New test.

(cherry picked from commit 8cd0d29270d4ed86c69b80c08de66dcb6c1e22fe) Diff: --- gcc/ipa-param-manipulation.cc | 6 ++ gcc/testsuite/gcc.dg/ipa/pr114247.c | 31 +++ 2 files changed, 37 insertions(+) diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc index e4f626ae95e..729d5e8e688 100644 --- a/gcc/ipa-param-manipulation.cc +++ b/gcc/ipa-param-manipulation.cc @@ -738,6 +738,12 @@ ipa_param_adjustments::modify_call (cgraph_edge *cs, } if (repl) { + if (!useless_type_conversion_p(apm->type, repl->typed.type)) + { + repl = force_value_to_type (apm->type, repl); + repl = force_gimple_operand_gsi (&gsi, repl, + true, NULL, true, GSI_SAME_STMT); + } vargs.quick_push (repl); continue; } diff --git a/gcc/testsuite/gcc.dg/ipa/pr114247.c b/gcc/testsuite/gcc.dg/ipa/pr114247.c new file mode 100644 index 000..60aa2bc0122 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr114247.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fsigned-char -fno-strict-aliasing -fwrapv" } */ + +union a { + unsigned short b; + int c; + signed short d; +}; +int e, f = 1, g; +long h; +const int **i; +void j(union a k, int l, unsigned m) { + const int *a[100]; + i = &a[0]; + h = k.d; +} +static int o(union a k) { + k.d = -1; + while (1) +if (f) + break; + j(k, g, e); + return 0; +} +int main() { + union a n = {1}; + o(n); + if (h != -1) +__builtin_abort(); + return 0; +}
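Why the signedness of the member matters can be shown without any compiler internals. The sketch below is not the PR testcase; `union halfword` and the helper names are made up for illustration, and it assumes a two's complement `short`. The same stored bits widen to different `long` values depending on which member is read, which is the kind of discrepancy a missing conversion can expose.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical union with an unsigned and a signed view of the same
   storage, loosely modelled on the union in the testcase.  */
union halfword {
  unsigned short u;
  signed short s;
};

/* Store -1 through the signed member.  */
static union halfword make_minus_one(void) {
  union halfword h;
  h.s = -1;
  return h;
}

/* Widening through the signed member sign-extends.  */
static long widen_as_signed(union halfword h) { return h.s; }

/* Widening through the unsigned member zero-extends the same bits.  */
static long widen_as_unsigned(union halfword h) { return h.u; }
```

With these definitions, `widen_as_signed(make_minus_one())` yields -1 while `widen_as_unsigned(make_minus_one())` yields `USHRT_MAX`, so a value handed over with the wrong one of the two types is not interchangeable once it is extended to a wider register or variable.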
[gcc r13-8785] testsuite: Adjust pr113359-2_*.c with unsigned long long [PR114662]
https://gcc.gnu.org/g:c827f46d8652d7a089e614302a4cffb6b192284d commit r13-8785-gc827f46d8652d7a089e614302a4cffb6b192284d Author: Kewen Lin Date: Wed Apr 10 02:59:43 2024 -0500 testsuite: Adjust pr113359-2_*.c with unsigned long long [PR114662] pr113359-2_*.c define a struct having unsigned long type members ay and az which have 4 bytes size at -m32, while the related constants CL1 and CL2 used for equality check are always 8 bytes, it makes compiler consider the below 69 if (a.ay != CL1) 70 __builtin_abort (); always to abort and optimize away the following call to getb, which leads to the expected wpa dumping on "Semantic equality" missing. This patch is to modify the types with unsigned long long accordingly. PR testsuite/114662 gcc/testsuite/ChangeLog: * gcc.dg/lto/pr113359-2_0.c: Use unsigned long long instead of unsigned long. * gcc.dg/lto/pr113359-2_1.c: Likewise. (cherry picked from commit 4923ed49b93352bcf9e43cafac38345e4a54c3f8) Diff: --- gcc/testsuite/gcc.dg/lto/pr113359-2_0.c | 8 gcc/testsuite/gcc.dg/lto/pr113359-2_1.c | 8 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c b/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c index 8b2d5bdfab2..8495667599d 100644 --- a/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c +++ b/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c @@ -8,15 +8,15 @@ struct SA { unsigned int ax; - unsigned long ay; - unsigned long az; + unsigned long long ay; + unsigned long long az; }; struct SB { unsigned int bx; - unsigned long by; - unsigned long bz; + unsigned long long by; + unsigned long long bz; }; struct ZA diff --git a/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c b/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c index 61bc0547981..8320f347efe 100644 --- a/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c +++ b/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c @@ -5,15 +5,15 @@ struct SA { unsigned int ax; - unsigned long ay; - unsigned long az; + unsigned long long ay; + unsigned long long az; }; struct SB { unsigned int bx; - 
unsigned long by; - unsigned long bz; + unsigned long long by; + unsigned long long bz; }; struct ZA
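The width problem described in this commit can be reproduced with fixed-width types, independent of `-m32`. This is a sketch under assumptions: the constant value below is hypothetical (the real `CL1` is not shown in this message), `uint32_t` stands in for a 4-byte `unsigned long`, and `uint64_t` stands in for `unsigned long long`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-byte constant; only its width matters here.  */
#define CL1_EXAMPLE 0x0123456789abcdefULL

struct narrow { uint32_t ay; };  /* models unsigned long at -m32 */
struct wide   { uint64_t ay; };  /* models unsigned long long */

/* The store truncates to 32 bits, so the comparison against the full
   64-bit constant can never succeed.  */
static int narrow_matches(void) {
  struct narrow a;
  a.ay = (uint32_t)CL1_EXAMPLE;
  return a.ay == CL1_EXAMPLE;
}

/* With an 8-byte member the value survives and the comparison holds.  */
static int wide_matches(void) {
  struct wide a;
  a.ay = CL1_EXAMPLE;
  return a.ay == CL1_EXAMPLE;
}
```

Because the narrow comparison is false for every possible member value, a compiler is entitled to treat the abort branch as always taken and drop the code after it, which is consistent with the missing "Semantic equality" dump the message describes.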
[gcc r14-10237] sra: Do not leave work for DSE (that it can sometimes not perform)
https://gcc.gnu.org/g:1a6c1c85b7ab1ad4bdf9573fcdc04dcce894ba82

commit r14-10237-g1a6c1c85b7ab1ad4bdf9573fcdc04dcce894ba82
Author: Martin Jambor
Date: Thu May 9 16:39:44 2024 +0200

sra: Do not leave work for DSE (that it can sometimes not perform)

When looking again at the g++.dg/tree-ssa/pr109849.C testcase we discovered that it generates terrible store-to-load forwarding stalls because SRA was leaving behind aggregate loads but all the stores were by scalar parts and DSE failed to remove the useless load. SRA has all the knowledge to remove the statement even now, so this small patch makes it do so.

With this patch, the g++.dg/tree-ssa/pr109849.C micro-benchmark runs 9 times faster (on an AMD EPYC 75F3 machine).

gcc/ChangeLog:

2024-04-18  Martin Jambor

    * tree-sra.cc (sra_modify_assign): Remove the original statement
    also when dealing with a store to a fully covered aggregate from a
    non-candidate.

gcc/testsuite/ChangeLog:

2024-04-23  Martin Jambor

    * g++.dg/tree-ssa/pr109849.C: Also check that the aggregate store
    to cur disappears.
    * gcc.dg/tree-ssa/ssa-dse-26.c: Instead of relying on DSE, check
    that the unwanted stores were removed at early SRA time.

(cherry picked from commit f6743695b4d2bd4da96e56a19157372f93b800bd) Diff: --- gcc/testsuite/g++.dg/tree-ssa/pr109849.C | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c | 6 +++--- gcc/tree-sra.cc| 14 -- 3 files changed, 17 insertions(+), 6 deletions(-) diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C index cd348c0f590..d06dbb10482 100644 --- a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C +++ b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-sra" } */ +/* { dg-options "-O2 -fdump-tree-sra -fdump-tree-optimized" } */ #include typedef unsigned int uint32_t; @@ -29,3 +29,4 @@ main() } /* { dg-final { scan-tree-dump "Created a replacement for stack offset" "sra"} } */ +/* { dg-final { scan-tree-dump-not "cur = MEM" "optimized"} } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c index 43152de5616..1d01392c595 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-dse1-details -fno-short-enums -fno-tree-fre" } */ +/* { dg-options "-O2 -fdump-tree-esra -fno-short-enums -fno-tree-fre" } */ /* { dg-skip-if "we want a BIT_FIELD_REF from fold_truth_andor" { ! 
lp64 } } */ /* { dg-skip-if "temporary variable names are not x and y" { mmix-knuth-mmixware } } */ @@ -31,5 +31,5 @@ constraint_equal (struct constraint a, struct constraint b) && constraint_expr_equal (a.rhs, b.rhs); } -/* { dg-final { scan-tree-dump-times "Deleted dead store: x = " 2 "dse1" } } */ -/* { dg-final { scan-tree-dump-times "Deleted dead store: y = " 2 "dse1" } } */ +/* { dg-final { scan-tree-dump-not "x = " "esra" } } */ +/* { dg-final { scan-tree-dump-not "y = " "esra" } } */ diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 32fa28911f2..8040b0c5645 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -4854,8 +4854,18 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi) But use the RHS aggregate to load from to expose more optimization opportunities. */ if (access_has_children_p (lacc)) - generate_subtree_copies (lacc->first_child, rhs, lacc->offset, -0, 0, gsi, true, true, loc); + { + generate_subtree_copies (lacc->first_child, rhs, lacc->offset, + 0, 0, gsi, true, true, loc); + if (lacc->grp_covered) + { + unlink_stmt_vdef (stmt); + gsi_remove (& orig_gsi, true); + release_defs (stmt); + sra_stats.deleted++; + return SRA_AM_REMOVED; + } + } } return SRA_AM_NONE;
[gcc r12-10475] ipa: Compare jump functions in ICF (PR 113907)
https://gcc.gnu.org/g:72f6b7ec3915f0b5b3517dffa19e3b34c8af687d

commit r12-10475-g72f6b7ec3915f0b5b3517dffa19e3b34c8af687d
Author: Martin Jambor
Date: Tue May 28 13:33:02 2024 +0200

ipa: Compare jump functions in ICF (PR 113907)

This is a manual backport of r14-9840-g1162861439fd3c from master. Manual because the bits and value range representation in jump functions have changed during the gcc 14 development cycle.

In PR 113907 comment #58, Honza found a case where ICF thinks bodies of functions are equivalent but, because of a difference in aliases in a memory access, different aggregate jump functions are associated with supposedly equivalent call statements. This patch adds a way to compare jump functions and plugs it into ICF to avoid the issue.

gcc/ChangeLog:

2024-05-14  Martin Jambor

    PR ipa/113907
    * ipa-prop.h (ipa_jump_functions_equivalent_p): Declare.
    (values_equal_for_ipcp_p): Likewise.
    * ipa-prop.cc (ipa_agg_pass_through_jf_equivalent_p): New function.
    (ipa_agg_jump_functions_equivalent_p): Likewise.
    (ipa_jump_functions_equivalent_p): Likewise.
    * ipa-cp.cc (values_equal_for_ipcp_p): Make function public.
    * ipa-icf-gimple.cc: Include alloc-pool.h, symbol-summary.h,
    sreal.h, ipa-cp.h and ipa-prop.h.
    (func_checker::compare_gimple_call): Compare jump functions.

gcc/testsuite/ChangeLog:

2024-05-10  Martin Jambor

    PR ipa/113907
    * gcc.dg/lto/pr113907_0.c: New.
    * gcc.dg/lto/pr113907_1.c: Likewise.
    * gcc.dg/lto/pr113907_2.c: Likewise.

(cherry picked from commit 1db45e83021a8a87f41e22053910fcce6e8e2c2c) Diff: --- gcc/ipa-cp.cc | 2 +- gcc/ipa-icf-gimple.cc | 29 +++ gcc/ipa-prop.cc | 157 ++ gcc/ipa-prop.h| 3 + gcc/testsuite/gcc.dg/lto/pr113907_0.c | 18 gcc/testsuite/gcc.dg/lto/pr113907_1.c | 35 gcc/testsuite/gcc.dg/lto/pr113907_2.c | 11 +++ 7 files changed, 254 insertions(+), 1 deletion(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index fbb31f6dff2..909464f4ac4 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1402,7 +1402,7 @@ ipacp_value_safe_for_type (tree param_type, tree value) /* Return true iff X and Y should be considered equal values by IPA-CP. */ -static bool +bool values_equal_for_ipcp_p (tree x, tree y) { gcc_checking_assert (x != NULL_TREE && y != NULL_TREE); diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc index ab398ca051c..e81409c16f9 100644 --- a/gcc/ipa-icf-gimple.cc +++ b/gcc/ipa-icf-gimple.cc @@ -41,7 +41,11 @@ along with GCC; see the file COPYING3. If not see #include "gimple-walk.h" #include "tree-ssa-alias-compare.h" +#include "alloc-pool.h" +#include "symbol-summary.h" #include "ipa-icf-gimple.h" +#include "sreal.h" +#include "ipa-prop.h" namespace ipa_icf_gimple { @@ -714,6 +718,31 @@ func_checker::compare_gimple_call (gcall *s1, gcall *s2) && !compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2))) return return_false_with_msg ("GIMPLE internal call LHS type mismatch"); + if (!gimple_call_internal_p (s1)) +{ + cgraph_edge *e1 = cgraph_node::get (m_source_func_decl)->get_edge (s1); + cgraph_edge *e2 = cgraph_node::get (m_target_func_decl)->get_edge (s2); + class ipa_edge_args *args1 = ipa_edge_args_sum->get (e1); + class ipa_edge_args *args2 = ipa_edge_args_sum->get (e2); + if ((args1 != nullptr) != (args2 != nullptr)) + return return_false_with_msg ("ipa_edge_args mismatch"); + if (args1) + { + int n1 = ipa_get_cs_argument_count (args1); + int n2 = ipa_get_cs_argument_count (args2); + if (n1 != n2) + return return_false_with_msg ("ipa_edge_args nargs mismatch"); + 
for (int i = 0; i < n1; i++) + { + struct ipa_jump_func *jf1 = ipa_get_ith_jump_func (args1, i); + struct ipa_jump_func *jf2 = ipa_get_ith_jump_func (args2, i); + if (((jf1 != nullptr) != (jf2 != nullptr)) + || (jf1 && !ipa_jump_functions_equivalent_p (jf1, jf2))) + return return_false_with_msg ("jump function mismatch"); + } + } +} + return compare_operand (t1, t2, get_operand_access_type (&map, t1)); } diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index 0197ac6108d..e2e83b5f3f5 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -6096,6 +6096,163 @@ ipcp_transform_function (struct cgraph_node *node) return modified_mem_access ? TODO_update_ssa_only_virtuals : 0; } +/* Return true if the two pass_through components of two jump functions are + known to be equivalent. AGG_JF denotes whether they are part of aggregate + functions or not. The function can be
[gcc r14-10803] ipa: Treat static constructors and destructors as non-local (PR 115815)
https://gcc.gnu.org/g:f057e958732cd2627b6db127fa6d4d882b61dd5f commit r14-10803-gf057e958732cd2627b6db127fa6d4d882b61dd5f Author: Martin Jambor Date: Fri Oct 18 21:32:16 2024 +0200 ipa: Treat static constructors and destructors as non-local (PR 115815) In PR 115815, IPA-SRA thought it had control over all invocations of a (recursive) static destructor but it did not see the implied invocation which led to the original being left behind and the clean-up code encountering uses of SSAs that definitely should have been dead. Fixed by teaching cgraph_node::can_be_local_p about static constructors and destructors. Similar test is missing in cgraph_node::local_p so I added the check there as well. In addition to the commit with the fix, this backport also contains squashed commit 1a458bdeb223ffa501bac8e76182115681967094 which fixes dejagnu directives in the testcase. gcc/ChangeLog: 2024-07-25 Martin Jambor PR ipa/115815 * cgraph.cc (cgraph_node_cannot_be_local_p_1): Also check DECL_STATIC_CONSTRUCTOR and DECL_STATIC_DESTRUCTOR. * ipa-visibility.cc (non_local_p): Likewise. (cgraph_node::local_p): Delete extraneous line of tabs. gcc/testsuite/ChangeLog: 2024-07-25 Martin Jambor PR ipa/115815 * gcc.dg/lto/pr115815_0.c: New test. (cherry picked from commit e98ad6a049c96c21cf641954584c2f5b7df0ce93) Diff: --- gcc/cgraph.cc | 4 +++- gcc/ipa-visibility.cc | 5 +++-- gcc/testsuite/gcc.dg/lto/pr115815_0.c | 22 ++ 3 files changed, 28 insertions(+), 3 deletions(-) diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc index 473d8410bc97..39a3adbc7c35 100644 --- a/gcc/cgraph.cc +++ b/gcc/cgraph.cc @@ -2434,7 +2434,9 @@ cgraph_node_cannot_be_local_p_1 (cgraph_node *node, void *) && !node->forced_by_abi && !node->used_from_object_file_p () && !node->same_comdat_group) - || !node->externally_visible)); + || !node->externally_visible) + && !DECL_STATIC_CONSTRUCTOR (node->decl) + && !DECL_STATIC_DESTRUCTOR (node->decl)); } /* Return true if cgraph_node can be made local for API change. 
diff --git a/gcc/ipa-visibility.cc b/gcc/ipa-visibility.cc index 501d3c304aa3..21f0c47f388e 100644 --- a/gcc/ipa-visibility.cc +++ b/gcc/ipa-visibility.cc @@ -102,7 +102,9 @@ non_local_p (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED) && !node->externally_visible && !node->used_from_other_partition && !node->in_other_partition - && node->get_availability () >= AVAIL_AVAILABLE); + && node->get_availability () >= AVAIL_AVAILABLE + && !DECL_STATIC_CONSTRUCTOR (node->decl) + && !DECL_STATIC_DESTRUCTOR (node->decl)); } /* Return true when function can be marked local. */ @@ -116,7 +118,6 @@ cgraph_node::local_p (void) return n->callees->callee->local_p (); return !n->call_for_symbol_thunks_and_aliases (non_local_p, NULL, true); - } /* A helper for comdat_can_be_unshared_p. */ diff --git a/gcc/testsuite/gcc.dg/lto/pr115815_0.c b/gcc/testsuite/gcc.dg/lto/pr115815_0.c new file mode 100644 index ..ade91def55b0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr115815_0.c @@ -0,0 +1,22 @@ +/* { dg-lto-options {{-O2 -flto}} } */ +/* { dg-lto-do link } */ +/* { dg-require-effective-target global_constructor } */ + +int a; +volatile int v; +volatile int w; + +int __attribute__((destructor)) +b() { + if (v) +return a + b(); + v = 5; + return 0; +} + +int +main (int argc, char **argv) +{ + w = 1; + return 0; +}
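The "implied invocation" that the analysis missed is easy to observe at the source level. The sketch below uses a constructor rather than the destructor from the testcase, only because its effect can be checked while the program is still running; it relies on the GCC/Clang `constructor` attribute, and the function and variable names are invented for illustration.

```c
#include <assert.h>

/* Counts how often the function below has run.  */
static int implicit_calls;

/* GCC/Clang extension: the runtime calls this before main () even
   though no call to it appears anywhere in the program text.  */
__attribute__((constructor))
static void run_before_main(void) {
  implicit_calls++;
}

/* Report how many implicit invocations have happened so far.  */
static int calls_seen(void) {
  return implicit_calls;
}
```

By the time `main` starts, `calls_seen()` already reports one call that has no visible call site. An analysis that only counts explicit call graph edges would conclude the function has no callers, which is why such functions have to be treated as non-local.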
[gcc r15-4464] testsuite: Add necessary dejagnu directives to pr115815_0.c
https://gcc.gnu.org/g:1a458bdeb223ffa501bac8e76182115681967094

commit r15-4464-g1a458bdeb223ffa501bac8e76182115681967094
Author: Martin Jambor
Date: Fri Oct 18 12:00:12 2024 +0200

testsuite: Add necessary dejagnu directives to pr115815_0.c

I have received an email from the Linaro infrastructure that the test gcc.dg/lto/pr115815_0.c which I added is failing on arm-eabi and I realized that not only is it missing dg-require-effective-target global_constructor but actually any dejagnu directives at all, which means it is unnecessarily running both at -O0 and -O2 and there is an unnecessary run test too. All fixed by this patch.

I have not actually verified that the failure goes away on arm-eabi but have very high hopes it will. I have verified that the test still checks for the bug and also that it passes by running:

make -k check-gcc RUNTESTFLAGS="lto.exp=*pr115815*"

gcc/testsuite/ChangeLog:

2024-10-14  Martin Jambor

    * gcc.dg/lto/pr115815_0.c: Add dejagnu directives.

Diff:
---
 gcc/testsuite/gcc.dg/lto/pr115815_0.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/lto/pr115815_0.c b/gcc/testsuite/gcc.dg/lto/pr115815_0.c
index d938ae4c8025..ade91def55b0 100644
--- a/gcc/testsuite/gcc.dg/lto/pr115815_0.c
+++ b/gcc/testsuite/gcc.dg/lto/pr115815_0.c
@@ -1,3 +1,7 @@
+/* { dg-lto-options {{-O2 -flto}} } */
+/* { dg-lto-do link } */
+/* { dg-require-effective-target global_constructor } */
+
 int a;
 volatile int v;
 volatile int w;
[gcc r15-4564] tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)
https://gcc.gnu.org/g:29d8f1f0b7ad3c69b3bdb130325300d5f73aa784

commit r15-4564-g29d8f1f0b7ad3c69b3bdb130325300d5f73aa784
Author: Martin Jambor
Date: Wed Oct 23 11:30:32 2024 +0200

tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

PR 117142 shows that the current SRA probably never worked reliably with arguments passed to a function returning twice, because it then creates statements before the call which however needs to be at the beginning of a basic block.

While it should be possible to make at least the case of passing arguments by value work with SRA (the statements would need to be put just on the non-abnormal edges leading to the BB), this would mean large surgery of function sra_modify_expr and I guess the time would better be spent re-organizing the whole pass.

gcc/ChangeLog:

2024-10-21  Martin Jambor

    PR tree-optimization/117142
    * tree-sra.cc (build_access_from_call_arg): Disqualify any
    candidate passed to a function returning twice.

gcc/testsuite/ChangeLog:

2024-10-21  Martin Jambor

    PR tree-optimization/117142
    * gcc.dg/tree-ssa/pr117142.c: New test.

Diff: --- gcc/testsuite/gcc.dg/tree-ssa/pr117142.c | 14 ++ gcc/tree-sra.cc | 9 + 2 files changed, 23 insertions(+) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c new file mode 100644 index ..fc62c1e58f2e --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O1" } */ + +struct a { + int b; +}; +void c(int, int); +void __attribute__((returns_twice)) +bar1(struct a); +void bar(struct a) { + struct a d; + bar1(d); + c(d.b, d.b); +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 64e2f007d680..c0915dce5c4a 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -1397,6 +1397,15 @@ static bool build_access_from_call_arg (tree expr, gimple *stmt, bool can_be_returned, enum out_edge_check *oe_check) { + if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) +{ + tree base = expr; + if (TREE_CODE (expr) == ADDR_EXPR) + base = get_base_address (TREE_OPERAND (expr, 0)); + disqualify_base_of_expr (base, "Passed to a returns_twice call."); + return false; +} + if (TREE_CODE (expr) == ADDR_EXPR) { tree base = get_base_address (TREE_OPERAND (expr, 0));
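For readers unfamiliar with calls that return twice, the standard `setjmp`/`longjmp` pair is the usual example. The sketch below is not related to the PR testcase and its function name is made up; it only shows control arriving at the same call site a second time, which is the behavior that constrains where such a call may sit in a basic block.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

/* Count how many times control comes back from the setjmp call.  */
static int count_setjmp_returns(void) {
  /* volatile so the value is well defined after longjmp.  */
  volatile int returns = 0;

  if (setjmp(env) == 0) {
    /* First return: the direct call.  */
    returns++;
    /* Jump back; setjmp now returns a second time, with value 1.  */
    longjmp(env, 1);
  }

  /* Reached only after the second return.  */
  returns++;
  return returns;
}
```

The function reports two returns from a single textual call. Any statement a pass places immediately before such a call is skipped on the second arrival, which gives an intuition for why the patch simply disqualifies the affected candidates.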
[gcc r13-9143] ipa: Treat static constructors and destructors as non-local (PR 115815)
https://gcc.gnu.org/g:005ce1c1826777f33d5011723827d17f1fcd55c1 commit r13-9143-g005ce1c1826777f33d5011723827d17f1fcd55c1 Author: Martin Jambor Date: Fri Oct 18 21:32:16 2024 +0200 ipa: Treat static constructors and destructors as non-local (PR 115815) In PR 115815, IPA-SRA thought it had control over all invocations of a (recursive) static destructor but it did not see the implied invocation which led to the original being left behind and the clean-up code encountering uses of SSAs that definitely should have been dead. Fixed by teaching cgraph_node::can_be_local_p about static constructors and destructors. Similar test is missing in cgraph_node::local_p so I added the check there as well. In addition to the commit with the fix, this backport also contains squashed commit 1a458bdeb223ffa501bac8e76182115681967094 which fixes dejagnu directives in the testcase. gcc/ChangeLog: 2024-07-25 Martin Jambor PR ipa/115815 * cgraph.cc (cgraph_node_cannot_be_local_p_1): Also check DECL_STATIC_CONSTRUCTOR and DECL_STATIC_DESTRUCTOR. * ipa-visibility.cc (non_local_p): Likewise. (cgraph_node::local_p): Delete extraneous line of tabs. gcc/testsuite/ChangeLog: 2024-07-25 Martin Jambor PR ipa/115815 * gcc.dg/lto/pr115815_0.c: New test. (cherry picked from commit e98ad6a049c96c21cf641954584c2f5b7df0ce93) Diff: --- gcc/cgraph.cc | 4 +++- gcc/ipa-visibility.cc | 5 +++-- gcc/testsuite/gcc.dg/lto/pr115815_0.c | 22 ++ 3 files changed, 28 insertions(+), 3 deletions(-) diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc index 7a14c00b60a0..ad71cf82823c 100644 --- a/gcc/cgraph.cc +++ b/gcc/cgraph.cc @@ -2433,7 +2433,9 @@ cgraph_node_cannot_be_local_p_1 (cgraph_node *node, void *) && !node->forced_by_abi && !node->used_from_object_file_p () && !node->same_comdat_group) - || !node->externally_visible)); + || !node->externally_visible) + && !DECL_STATIC_CONSTRUCTOR (node->decl) + && !DECL_STATIC_DESTRUCTOR (node->decl)); } /* Return true if cgraph_node can be made local for API change. 
diff --git a/gcc/ipa-visibility.cc b/gcc/ipa-visibility.cc index 8ec82bb333e2..9ca0e39df950 100644 --- a/gcc/ipa-visibility.cc +++ b/gcc/ipa-visibility.cc @@ -102,7 +102,9 @@ non_local_p (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED) && !node->externally_visible && !node->used_from_other_partition && !node->in_other_partition - && node->get_availability () >= AVAIL_AVAILABLE); + && node->get_availability () >= AVAIL_AVAILABLE + && !DECL_STATIC_CONSTRUCTOR (node->decl) + && !DECL_STATIC_DESTRUCTOR (node->decl)); } /* Return true when function can be marked local. */ @@ -116,7 +118,6 @@ cgraph_node::local_p (void) return n->callees->callee->local_p (); return !n->call_for_symbol_thunks_and_aliases (non_local_p, NULL, true); - } /* A helper for comdat_can_be_unshared_p. */ diff --git a/gcc/testsuite/gcc.dg/lto/pr115815_0.c b/gcc/testsuite/gcc.dg/lto/pr115815_0.c new file mode 100644 index ..ade91def55b0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr115815_0.c @@ -0,0 +1,22 @@ +/* { dg-lto-options {{-O2 -flto}} } */ +/* { dg-lto-do link } */ +/* { dg-require-effective-target global_constructor } */ + +int a; +volatile int v; +volatile int w; + +int __attribute__((destructor)) +b() { + if (v) +return a + b(); + v = 5; + return 0; +} + +int +main (int argc, char **argv) +{ + w = 1; + return 0; +}
[gcc r15-5637] ipa: Move individual jump function copying to a separate function
https://gcc.gnu.org/g:cc5779fcaf76aeee005f986eb1dc15205c696544 commit r15-5637-gcc5779fcaf76aeee005f986eb1dc15205c696544 Author: Martin Jambor Date: Sun Nov 24 23:03:43 2024 +0100 ipa: Move individual jump function copying to a separate function When reviewing various IPA bits and pieces I have falsely assumed that jump function duplication misses copying important bits because it relies on vec_safe_copy-ing all data in the vector of jump functions and then just fixes up the few fields it needs to. Perhaps more importantly, we do want a function to copy one individual jump function to form jump functions for planned call-graph edges that model transfer of control to OpenMP outlined regions through calls to gomp functions. Therefore, this patch introduces such function and makes ipa_edge_args_sum_t::duplicate just allocate the new vectors and then uses the new function to copy the data. gcc/ChangeLog: 2024-11-01 Martin Jambor * ipa-prop.cc (ipa_duplicate_jump_function): New function. (ipa_edge_args_sum_t::duplicate): Move individual jump function copying to ipa_duplicate_jump_function. Diff: --- gcc/ipa-prop.cc | 188 +--- 1 file changed, 111 insertions(+), 77 deletions(-) diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index cbc825670fe0..9070a45f6835 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -4503,99 +4503,96 @@ ipa_edge_args_sum_t::remove (cgraph_edge *cs, ipa_edge_args *args) } } -/* Method invoked when an edge is duplicated. Copy ipa_edge_args and adjust - reference count data strucutres accordingly. */ +/* Copy information from SRC_JF to DST_JF which correstpond to call graph edges + SRC and DST. 
*/ -void -ipa_edge_args_sum_t::duplicate (cgraph_edge *src, cgraph_edge *dst, - ipa_edge_args *old_args, ipa_edge_args *new_args) +static void +ipa_duplicate_jump_function (cgraph_edge *src, cgraph_edge *dst, +ipa_jump_func *src_jf, ipa_jump_func *dst_jf) { - unsigned int i; + dst_jf->agg.items = vec_safe_copy (src_jf->agg.items); + dst_jf->agg.by_ref = src_jf->agg.by_ref; - new_args->jump_functions = vec_safe_copy (old_args->jump_functions); - if (old_args->polymorphic_call_contexts) -new_args->polymorphic_call_contexts - = vec_safe_copy (old_args->polymorphic_call_contexts); + /* We can avoid calling ipa_set_jfunc_vr since it would only look up the + place in the hash_table where the source m_vr resides. */ + dst_jf->m_vr = src_jf->m_vr; - for (i = 0; i < vec_safe_length (old_args->jump_functions); i++) + if (src_jf->type == IPA_JF_CONST) { - struct ipa_jump_func *src_jf = ipa_get_ith_jump_func (old_args, i); - struct ipa_jump_func *dst_jf = ipa_get_ith_jump_func (new_args, i); - - dst_jf->agg.items = vec_safe_copy (dst_jf->agg.items); + ipa_set_jf_cst_copy (dst_jf, src_jf); + struct ipa_cst_ref_desc *src_rdesc = jfunc_rdesc_usable (src_jf); - if (src_jf->type == IPA_JF_CONST) + if (!src_rdesc) + dst_jf->value.constant.rdesc = NULL; + else if (src->caller == dst->caller) { - struct ipa_cst_ref_desc *src_rdesc = jfunc_rdesc_usable (src_jf); - - if (!src_rdesc) - dst_jf->value.constant.rdesc = NULL; - else if (src->caller == dst->caller) - { - /* Creation of a speculative edge. If the source edge is the one -grabbing a reference, we must create a new (duplicate) -reference description. Otherwise they refer to the same -description corresponding to a reference taken in a function -src->caller is inlined to. In that case we just must -increment the refcount. 
*/ - if (src_rdesc->cs == src) - { - symtab_node *n = symtab_node_for_jfunc (src_jf); - gcc_checking_assert (n); - ipa_ref *ref -= src->caller->find_reference (n, src->call_stmt, - src->lto_stmt_uid, - IPA_REF_ADDR); - gcc_checking_assert (ref); - dst->caller->clone_reference (ref, ref->stmt); - - ipa_cst_ref_desc *dst_rdesc = ipa_refdesc_pool.allocate (); - dst_rdesc->cs = dst; - dst_rdesc->refcount = src_rdesc->refcount; - dst_rdesc->next_duplicate = NULL; - dst_jf->value.constant.rdesc = dst_rdesc; - } - else - { - src_rdesc->refcount++; - dst_jf->value.constant.rdesc = src_rdesc; - } - } - else if (src_rdesc->cs == src) + /* Creation of a sp
[gcc r14-10997] tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)
https://gcc.gnu.org/g:8fd9461976b325efd134f9004a7958ebd008148f

commit r14-10997-g8fd9461976b325efd134f9004a7958ebd008148f
Author: Martin Jambor
Date: Wed Oct 23 11:30:32 2024 +0200

tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

PR 117142 shows that the current SRA probably never worked reliably with arguments passed to a function returning twice, because it then creates statements before the call which however needs to be at the beginning of a basic block.

While it should be possible to make at least the case of passing arguments by value work with SRA (the statements would need to be put just on the non-abnormal edges leading to the BB), this would mean large surgery of function sra_modify_expr and I guess the time would better be spent re-organizing the whole pass.

gcc/ChangeLog:

2024-10-21  Martin Jambor

    PR tree-optimization/117142
    * tree-sra.cc (build_access_from_call_arg): Disqualify any
    candidate passed to a function returning twice.

gcc/testsuite/ChangeLog:

2024-10-21  Martin Jambor

    PR tree-optimization/117142
    * gcc.dg/tree-ssa/pr117142.c: New test.

(cherry picked from commit 29d8f1f0b7ad3c69b3bdb130325300d5f73aa784) Diff: --- gcc/testsuite/gcc.dg/tree-ssa/pr117142.c | 14 ++ gcc/tree-sra.cc | 9 + 2 files changed, 23 insertions(+) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c new file mode 100644 index ..fc62c1e58f2e --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O1" } */ + +struct a { + int b; +}; +void c(int, int); +void __attribute__((returns_twice)) +bar1(struct a); +void bar(struct a) { + struct a d; + bar1(d); + c(d.b, d.b); +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 8040b0c56451..c91e40ef7e71 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -1397,6 +1397,15 @@ static bool build_access_from_call_arg (tree expr, gimple *stmt, bool can_be_returned, enum out_edge_check *oe_check) { + if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) +{ + tree base = expr; + if (TREE_CODE (expr) == ADDR_EXPR) + base = get_base_address (TREE_OPERAND (expr, 0)); + disqualify_base_of_expr (base, "Passed to a returns_twice call."); + return false; +} + if (TREE_CODE (expr) == ADDR_EXPR) { tree base = get_base_address (TREE_OPERAND (expr, 0));
[gcc r12-10836] tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)
https://gcc.gnu.org/g:dc0e962ea18667bc3cdabcafef85b241a4f2c678

commit r12-10836-gdc0e962ea18667bc3cdabcafef85b241a4f2c678
Author: Martin Jambor
Date: Fri Nov 15 14:37:06 2024 +0100

tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

This is a manual backport of commit 29d8f1f0b7ad3c69b3bdb130325300d5f73aa784 which must be done slightly elsewhere for gcc 13 and 12 because function build_access_from_call_arg was added only in gcc 14. But the gist of the patch is the same. The commit message of the original fix says:

PR 117142 shows that the current SRA probably never worked reliably with arguments passed to a function returning twice, because it then creates statements before the call which however needs to be at the beginning of a basic block.

While it should be possible to make at least the case of passing arguments by value work with SRA (the statements would need to be put just on the non-abnormal edges leading to the BB), this would mean large surgery of function sra_modify_expr and I guess the time would better be spent re-organizing the whole pass.

gcc/ChangeLog:

2024-11-14  Martin Jambor

    PR tree-optimization/117142
    * tree-sra.cc (scan_function): Disqualify any candidate passed to a
    function returning twice.

gcc/testsuite/ChangeLog:

2024-11-14  Martin Jambor

    * gcc.dg/tree-ssa/pr117142.c: New test.

(cherry picked from commit 6244de432a5ba9807c6f0065e70a8025af7b1bd6) Diff: --- gcc/testsuite/gcc.dg/tree-ssa/pr117142.c | 14 ++ gcc/tree-sra.cc | 13 ++--- 2 files changed, 24 insertions(+), 3 deletions(-) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c new file mode 100644 index ..fc62c1e58f2e --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O1" } */ + +struct a { + int b; +}; +void c(int, int); +void __attribute__((returns_twice)) +bar1(struct a); +void bar(struct a) { + struct a d; + bar1(d); + c(d.b, d.b); +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 47eee5add126..5a9eaf31b6e9 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -1392,9 +1392,16 @@ scan_function (void) break; case GIMPLE_CALL: - for (i = 0; i < gimple_call_num_args (stmt); i++) - ret |= build_access_from_expr (gimple_call_arg (stmt, i), - stmt, false); + if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) + { + for (i = 0; i < gimple_call_num_args (stmt); i++) + disqualify_base_of_expr (gimple_call_arg (stmt, i), +"Passed to a returns_twice call."); + } + else + for (i = 0; i < gimple_call_num_args (stmt); i++) + ret |= build_access_from_expr (gimple_call_arg (stmt, i), +stmt, false); t = gimple_call_lhs (stmt); if (t && !disqualify_if_bad_bb_terminating_stmt (stmt, t, NULL))
[gcc r13-9193] tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)
https://gcc.gnu.org/g:6244de432a5ba9807c6f0065e70a8025af7b1bd6 commit r13-9193-g6244de432a5ba9807c6f0065e70a8025af7b1bd6 Author: Martin Jambor Date: Fri Nov 15 14:37:06 2024 +0100 tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142) This is a manual backport of commit 29d8f1f0b7ad3c69b3bdb130325300d5f73aa784 which must be done slightly elsewhere for gcc 13 and 12 because function build_access_from_call_arg was added only in gcc 14. But the gist of the patch is the same. The commit message of the original fix says: PR 117142 shows that the current SRA probably never worked reliably with arguments passed to a function returning twice, because it then creates statements before the call which however needs to be at the beginning of a basic block. While it should be possible to make at least the case of passing arguments by value work with SRA (the statements would need to be put just on the non-abnormal edges leading to the BB), this would mean large surgery of function sra_modify_expr and I guess the time would better be spent re-organizing the whole pass. gcc/ChangeLog: 2024-11-14 Martin Jambor PR tree-optimization/117142 * tree-sra.cc (scan_function): Disqualify any candidate passed to a function returning twice. gcc/testsuite/ChangeLog: 2024-11-14 Martin Jambor * gcc.dg/tree-ssa/pr117142.c: New test. 
Diff: --- gcc/testsuite/gcc.dg/tree-ssa/pr117142.c | 14 ++ gcc/tree-sra.cc | 13 ++--- 2 files changed, 24 insertions(+), 3 deletions(-) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c new file mode 100644 index ..fc62c1e58f2e --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O1" } */ + +struct a { + int b; +}; +void c(int, int); +void __attribute__((returns_twice)) +bar1(struct a); +void bar(struct a) { + struct a d; + bar1(d); + c(d.b, d.b); +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 77508894772d..8a9cbeec4908 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -1504,9 +1504,16 @@ scan_function (void) break; case GIMPLE_CALL: - for (i = 0; i < gimple_call_num_args (stmt); i++) - ret |= build_access_from_expr (gimple_call_arg (stmt, i), - stmt, false); + if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE) + { + for (i = 0; i < gimple_call_num_args (stmt); i++) + disqualify_base_of_expr (gimple_call_arg (stmt, i), +"Passed to a returns_twice call."); + } + else + for (i = 0; i < gimple_call_num_args (stmt); i++) + ret |= build_access_from_expr (gimple_call_arg (stmt, i), +stmt, false); t = gimple_call_lhs (stmt); if (t && !disqualify_if_bad_bb_terminating_stmt (stmt, t, NULL))
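The scan_function change above can be pictured with a much simplified model. The following is only a sketch under stated assumptions: the `Call` struct, the string-named candidates and the helper `disqualify_returns_twice_args` are hypothetical stand-ins for illustration and are not part of tree-sra.cc. It shows the idea that any aggregate passed to a returns_twice call stops being an SRA candidate, while arguments of ordinary calls are left alone.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a GIMPLE call statement.
struct Call {
  bool returns_twice;                  // models ECF_RETURNS_TWICE being set
  std::vector<std::string> arg_bases;  // base variables of the call arguments
};

// Remove from CANDIDATES every aggregate that is passed to a returns_twice
// call.  Returns the number of candidates that were disqualified.
inline int disqualify_returns_twice_args(const std::vector<Call> &calls,
                                         std::set<std::string> &candidates) {
  int removed = 0;
  for (const Call &c : calls) {
    if (!c.returns_twice)
      continue;  // ordinary calls keep their arguments as candidates
    for (const std::string &base : c.arg_bases)
      removed += static_cast<int>(candidates.erase(base));
  }
  return removed;
}
```

In the new test case, `d` is passed to `bar1`, which is declared returns_twice, so in this model `d` would be dropped from the candidate set.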
[gcc r15-5291] ipa: Rationalize IPA-VR computations across pass-through jump functions
https://gcc.gnu.org/g:012f5a22bac26a898ab66655965b07ac23201fdd commit r15-5291-g012f5a22bac26a898ab66655965b07ac23201fdd Author: Martin Jambor Date: Thu Nov 14 20:55:06 2024 +0100 ipa: Rationalize IPA-VR computations across pass-through jump functions Currently ipa_value_range_from_jfunc and propagate_vr_across_jump_function contain similar but not same code for dealing with pass-through jump functions. This patch puts these common bits into one function which can also handle comparison operations. gcc/ChangeLog: 2024-11-01 Martin Jambor PR ipa/114985 * ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): New function. (ipa_value_range_from_jfunc): Move the common functionality to the above new function, adjust the rest so that it works with it well. (propagate_vr_across_jump_function): Likewise. Diff: --- gcc/ipa-cp.cc | 181 ++ 1 file changed, 67 insertions(+), 114 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index fb65ec0c6a62..25741cf47bb0 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1692,6 +1692,55 @@ ipa_vr_operation_and_type_effects (vrange &dst_vr, dst_type, src_type); } +/* Given a PASS_THROUGH jump function JFUNC that takes as its source SRC_VR of + SRC_TYPE and the result needs to be DST_TYPE, if any value range information + can be deduced at all, intersect VR with it. 
*/ + +static void +ipa_vr_intersect_with_arith_jfunc (vrange &vr, + ipa_jump_func *jfunc, + const value_range &src_vr, + tree src_type, + tree dst_type) +{ + if (src_vr.undefined_p () || src_vr.varying_p ()) +return; + + enum tree_code operation = ipa_get_jf_pass_through_operation (jfunc); + if (TREE_CODE_CLASS (operation) == tcc_unary) +{ + value_range tmp_res (dst_type); + if (ipa_vr_operation_and_type_effects (tmp_res, src_vr, operation, +dst_type, src_type)) + vr.intersect (tmp_res); + return; +} + + tree operand = ipa_get_jf_pass_through_operand (jfunc); + range_op_handler handler (operation); + if (!handler) +return; + value_range op_vr (TREE_TYPE (operand)); + ipa_range_set_and_normalize (op_vr, operand); + + tree operation_type; + if (TREE_CODE_CLASS (operation) == tcc_comparison) +operation_type = boolean_type_node; + else +operation_type = src_type; + + value_range op_res (dst_type); + if (!ipa_vr_supported_type_p (operation_type) + || !handler.operand_check_p (operation_type, src_type, op_vr.type ()) + || !handler.fold_range (op_res, operation_type, src_vr, op_vr)) +return; + + value_range tmp_res (dst_type); + if (ipa_vr_operation_and_type_effects (tmp_res, op_res, NOP_EXPR, dst_type, +operation_type)) + vr.intersect (tmp_res); +} + /* Determine range of JFUNC given that INFO describes the caller node or the one it is inlined to, CS is the call graph edge corresponding to JFUNC and PARM_TYPE of the parameter. */ @@ -1701,18 +1750,18 @@ ipa_value_range_from_jfunc (vrange &vr, ipa_node_params *info, cgraph_edge *cs, ipa_jump_func *jfunc, tree parm_type) { - vr.set_undefined (); + vr.set_varying (parm_type); - if (jfunc->m_vr) + if (jfunc->m_vr && jfunc->m_vr->known_p ()) ipa_vr_operation_and_type_effects (vr, *jfunc->m_vr, NOP_EXPR, parm_type, jfunc->m_vr->type ()); if (vr.singleton_p ()) return; + if (jfunc->type == IPA_JF_PASS_THROUGH) { - int idx; ipcp_transformation *sum = ipcp_get_transformation_summary (cs->caller->inlined_to ? 
cs->caller->inlined_to @@ -1720,54 +1769,15 @@ ipa_value_range_from_jfunc (vrange &vr, if (!sum || !sum->m_vr) return; - idx = ipa_get_jf_pass_through_formal_id (jfunc); + int idx = ipa_get_jf_pass_through_formal_id (jfunc); if (!(*sum->m_vr)[idx].known_p ()) return; - tree vr_type = ipa_get_type (info, idx); + tree src_type = ipa_get_type (info, idx); value_range srcvr; (*sum->m_vr)[idx].get_vrange (srcvr); - enum tree_code operation = ipa_get_jf_pass_through_operation (jfunc); - - if (TREE_CODE_CLASS (operation) == tcc_unary) - { - value_range res (parm_type); - - if (ipa_vr_operation_and_type_effects (res, -srcvr, -operation, parm_type, -vr_type)) -
[gcc r15-5240] ipa: Introduce a one jump function dumping function
https://gcc.gnu.org/g:f927264935972145bb71f1cdb26263a5446671e1 commit r15-5240-gf927264935972145bb71f1cdb26263a5446671e1 Author: Martin Jambor Date: Thu Nov 14 14:42:27 2024 +0100 ipa: Introduce a one jump function dumping function I plan to introduce a verifier that prints a single jump function when it fails with the function introduced in this one. Because it is a verifier, the risk that it would need to be reverted is non-zero and because the function can be useful on its own, this is a special patch to introduce it. gcc/ChangeLog: 2024-11-01 Martin Jambor * ipa-prop.h (ipa_dump_jump_function): Declare. * ipa-prop.cc (ipa_dump_jump_function): New function. (ipa_print_node_jump_functions_for_edge): Move printing of individual jump functions to the new function. Diff: --- gcc/ipa-prop.cc | 209 +--- gcc/ipa-prop.h | 2 + 2 files changed, 110 insertions(+), 101 deletions(-) diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index fd18f847e460..2a0d4503f525 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -429,126 +429,133 @@ ipa_print_constant_value (FILE *f, tree val) } } -/* Print the jump functions associated with call graph edge CS to file F. */ +/* Print contents of JFUNC to F. If CTX is non-NULL, dump it too. 
*/ -static void -ipa_print_node_jump_functions_for_edge (FILE *f, struct cgraph_edge *cs) +DEBUG_FUNCTION void +ipa_dump_jump_function (FILE *f, ipa_jump_func *jump_func, + class ipa_polymorphic_call_context *ctx) { - ipa_edge_args *args = ipa_edge_args_sum->get (cs); - int count = ipa_get_cs_argument_count (args); + enum jump_func_type type = jump_func->type; - for (int i = 0; i < count; i++) + if (type == IPA_JF_UNKNOWN) +fprintf (f, "UNKNOWN\n"); + else if (type == IPA_JF_CONST) { - struct ipa_jump_func *jump_func; - enum jump_func_type type; - - jump_func = ipa_get_ith_jump_func (args, i); - type = jump_func->type; - - fprintf (f, " param %d: ", i); - if (type == IPA_JF_UNKNOWN) - fprintf (f, "UNKNOWN\n"); - else if (type == IPA_JF_CONST) + fprintf (f, "CONST: "); + ipa_print_constant_value (f, jump_func->value.constant.value); + fprintf (f, "\n"); +} + else if (type == IPA_JF_PASS_THROUGH) +{ + fprintf (f, "PASS THROUGH: "); + fprintf (f, "%d, op %s", + jump_func->value.pass_through.formal_id, + get_tree_code_name(jump_func->value.pass_through.operation)); + if (jump_func->value.pass_through.operation != NOP_EXPR) { - fprintf (f, "CONST: "); - ipa_print_constant_value (f, jump_func->value.constant.value); - fprintf (f, "\n"); + fprintf (f, " "); + print_generic_expr (f, jump_func->value.pass_through.operand); } - else if (type == IPA_JF_PASS_THROUGH) + if (jump_func->value.pass_through.agg_preserved) + fprintf (f, ", agg_preserved"); + if (jump_func->value.pass_through.refdesc_decremented) + fprintf (f, ", refdesc_decremented"); + fprintf (f, "\n"); +} + else if (type == IPA_JF_ANCESTOR) +{ + fprintf (f, "ANCESTOR: "); + fprintf (f, "%d, offset " HOST_WIDE_INT_PRINT_DEC, + jump_func->value.ancestor.formal_id, + jump_func->value.ancestor.offset); + if (jump_func->value.ancestor.agg_preserved) + fprintf (f, ", agg_preserved"); + if (jump_func->value.ancestor.keep_null) + fprintf (f, ", keep_null"); + fprintf (f, "\n"); +} + + if (jump_func->agg.items) +{ + 
struct ipa_agg_jf_item *item; + int j; + + fprintf (f, " Aggregate passed by %s:\n", + jump_func->agg.by_ref ? "reference" : "value"); + FOR_EACH_VEC_ELT (*jump_func->agg.items, j, item) { - fprintf (f, "PASS THROUGH: "); - fprintf (f, "%d, op %s", - jump_func->value.pass_through.formal_id, - get_tree_code_name(jump_func->value.pass_through.operation)); - if (jump_func->value.pass_through.operation != NOP_EXPR) + fprintf (f, " offset: " HOST_WIDE_INT_PRINT_DEC ", ", + item->offset); + fprintf (f, "type: "); + print_generic_expr (f, item->type); + fprintf (f, ", "); + if (item->jftype == IPA_JF_PASS_THROUGH) + fprintf (f, "PASS THROUGH: %d,", +item->value.pass_through.formal_id); + else if (item->jftype == IPA_JF_LOAD_AGG) { - fprintf (f, " "); - print_generic_expr (f, jump_func->value.pass_through.operand); + fprintf (f, "LOAD AGG: %d", + item->value.pass_through.formal_id); + fprintf (f, " [offset: " HOST_WIDE_INT_PRINT_DEC ", by %s],", +
[gcc r15-5239] ipa-cp: Fix constant dumping
https://gcc.gnu.org/g:da29560711b2a66b26738caf46dbf67d3f7cff85 commit r15-5239-gda29560711b2a66b26738caf46dbf67d3f7cff85 Author: Martin Jambor Date: Thu Nov 14 14:42:27 2024 +0100 ipa-cp: Fix constant dumping Commit gcc-14-5368-ge0787da2633 removed an overloaded variant of function print_ipcp_constant_value for tree constants. That did not break build because the other overloaded variant for polymorphic contexts has a parameter which is constructible from a tree, but it prints polymorphic contexts, not tree constants, so in dumps we got things like: param [0]: VARIABLE ctxs: VARIABLE Bits: value = 0x0, mask = 0xfffc [prange] struct S * [1, +INF] MASK 0xfffc VALUE 0x0 ref offset 0: nothing known [scc: 1, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0] ref offset 32: nothing known [scc: 2, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0] ref offset 64: nothing known [scc: 3, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0] instead of: param [0]: VARIABLE ctxs: VARIABLE Bits: value = 0x0, mask = 0xfffc [prange] struct S * [1, +INF] MASK 0xfffc VALUE 0x0 ref offset 0: 1 [scc: 1, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0] ref offset 32: 64 [scc: 2, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0] ref offset 64: 32 [scc: 3, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0] This commit re-adds the needed overloaded variant though it uses the printing function added in the aforementioned commit instead of printing it itself. gcc/ChangeLog: 2024-11-13 Martin Jambor * ipa-prop.h (ipa_print_constant_value): Declare. * ipa-prop.cc (ipa_print_constant_value): Make public. * ipa-cp.cc (print_ipcp_constant_value): Re-add this overloaded function for printing tree constants. gcc/testsuite/ChangeLog: 2024-11-14 Martin Jambor * gcc.dg/ipa/ipcp-agg-1.c: Add a scan dump for a constant value in the lattice dump. 
Diff: --- gcc/ipa-cp.cc | 12 +++- gcc/ipa-prop.cc | 2 +- gcc/ipa-prop.h| 1 + gcc/testsuite/gcc.dg/ipa/ipcp-agg-1.c | 1 + 4 files changed, 14 insertions(+), 2 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 212d9ccbbfe0..fb65ec0c6a62 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -225,7 +225,17 @@ values_equal_for_ipcp_p (tree x, tree y) return operand_equal_p (x, y, 0); } -/* Print V which is extracted from a value in a lattice to F. */ +/* Print V which is extracted from a value in a lattice to F. This overloaded + function is used to print tree constants. */ + +static void +print_ipcp_constant_value (FILE * f, tree v) +{ + ipa_print_constant_value (f, v); +} + +/* Print V which is extracted from a value in a lattice to F. This overloaded + function is used to print constant polymorphic call contexts. */ static void print_ipcp_constant_value (FILE * f, ipa_polymorphic_call_context v) diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index 599181d0a943..fd18f847e460 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -413,7 +413,7 @@ ipa_initialize_node_params (struct cgraph_node *node) /* Print VAL which is extracted from a jump function to F. */ -static void +void ipa_print_constant_value (FILE *f, tree val) { print_generic_expr (f, val); diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h index 7a05c169c421..a9ef3fe3aa60 100644 --- a/gcc/ipa-prop.h +++ b/gcc/ipa-prop.h @@ -1179,6 +1179,7 @@ ipcp_get_transformation_summary (cgraph_node *node) /* Function formal parameters related computations. 
*/ void ipa_initialize_node_params (struct cgraph_node *node); +void ipa_print_constant_value (FILE *f, tree val); bool ipa_propagate_indirect_call_infos (struct cgraph_edge *cs, vec *new_edges); diff --git a/gcc/testsuite/gcc.dg/ipa/ipcp-agg-1.c b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-1.c index 8cfc18799fae..15f6286e54bc 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipcp-agg-1.c +++ b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-1.c @@ -30,6 +30,7 @@ entry (void) foo (&s); } +/* { dg-final { scan-ipa-dump "ref offset\[^\n\r\]*: 64\[^\n\r\]*scc:" "cp" } } */ /* { dg-final { scan-ipa-dump "Creating a specialized node of foo.*for all known contexts" "cp" } } */ /* { dg-final { scan-ipa-dump-times "Aggregate replacements:" 2 "cp" } } */ /* { dg-final { scan-tree-dump-not "->c;" "optimized" } } */
[gcc r15-6599] ipa-cp: Make dumping of bit masks representing -1 nicer
https://gcc.gnu.org/g:72b273152f75a8622ea13d0fe95d6d2461615ba4 commit r15-6599-g72b273152f75a8622ea13d0fe95d6d2461615ba4 Author: Martin Jambor Date: Mon Jan 6 11:58:29 2025 +0100 ipa-cp: Make dumping of bit masks representing -1 nicer Dumps of the lattices representing bit-values and of propagation results of bit-values can print a really long hexadecimal value when the bit-value represents -1 (all bits set). This patch simply detects that situation and prints the string "-1" in that case, making the dumps somewhat nicer. gcc/ChangeLog: 2025-01-03 Martin Jambor * ipa-cp.cc (ipcp_print_widest_int): New function. (ipcp_store_vr_results): Use it. (ipcp_bits_lattice::print): Likewise. Fix formatting. Diff: --- gcc/ipa-cp.cc | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 7423731d7250..294389fba4c7 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -307,6 +307,18 @@ ipcp_lattice::print (FILE * f, bool dump_sources, bool dump_benefits) fprintf (f, "\n"); } +/* If VALUE has all bits set to one, print "-1" to F, otherwise simply print it + hexadecimally to F. 
*/ + +static void +ipcp_print_widest_int (FILE *f, const widest_int &value) +{ + if (wi::eq_p (wi::bit_not (value), 0)) +fprintf (f, "-1"); + else +print_hex (value, f); +} + void ipcp_bits_lattice::print (FILE *f) { @@ -316,8 +328,10 @@ ipcp_bits_lattice::print (FILE *f) fprintf (f, " Bits unusable (BOTTOM)\n"); else { - fprintf (f, " Bits: value = "); print_hex (get_value (), f); - fprintf (f, ", mask = "); print_hex (get_mask (), f); + fprintf (f, " Bits: value = "); + ipcp_print_widest_int (f, get_value ()); + fprintf (f, ", mask = "); + print_hex (get_mask (), f); fprintf (f, "\n"); } } @@ -6375,7 +6389,7 @@ ipcp_store_vr_results (void) dumped_sth = true; } fprintf (dump_file, " param %i: value = ", i); - print_hex (bits->get_value (), dump_file); + ipcp_print_widest_int (dump_file, bits->get_value ()); fprintf (dump_file, ", mask = "); print_hex (bits->get_mask (), dump_file); fprintf (dump_file, "\n");
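The check in ipcp_print_widest_int can be illustrated outside of GCC. The following is a minimal sketch, assuming a fixed 64-bit width in place of `widest_int`; the function name `format_bits` is hypothetical and exists only for this illustration. It returns "-1" when the bitwise complement of the value is zero, that is, when every bit is set, and otherwise formats the value in hexadecimal.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format VALUE as "-1" when all 64 bits are set, otherwise as hexadecimal.
// A fixed 64-bit width is an assumption of this sketch; the patch itself
// operates on GCC's widest_int.
inline std::string format_bits(std::uint64_t value) {
  if (~value == 0)  // complement is zero only if every bit of VALUE is one
    return "-1";
  char buf[2 + 16 + 1];  // "0x" + up to 16 hex digits + terminating NUL
  std::snprintf(buf, sizeof buf, "0x%llx",
                static_cast<unsigned long long>(value));
  return buf;
}
```

Note that in the diff above only the value is routed through the new helper; the mask is still printed with print_hex.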
[gcc r15-7456] ipa-cp: Perform operations in the appropriate types (PR 118097)
https://gcc.gnu.org/g:6d07e3de7e8d39ac144ba1d83bba08d48bacae13 commit r15-7456-g6d07e3de7e8d39ac144ba1d83bba08d48bacae13 Author: Martin Jambor Date: Mon Feb 10 16:49:59 2025 +0100 ipa-cp: Perform operations in the appropriate types (PR 118097) One of the testcases from PR 118097 and the one from PR 118535 show that the fix to PR 118138 was incomplete. We must not only make sure that (intermediate) results of operations performed by IPA-CP are fold_converted to the type of the destination formal parameter but we also must decouple these types from the ones in which operations are performed. This patch does that, even though we do not store or stream the operation types, instead we simply limit ourselves to tcc_comparisons and operations for which the first operand and the result are of the same type as determined by expr_type_first_operand_type_p. If we wanted to go beyond these, we would indeed need to store/stream the respective operation type. ipa_value_from_jfunc needs an additional check that res_type is not NULL because it is not called just from within IPA-CP (where we know we have a destination lattice slot belonging to a defined parameter) but also from inlining, ipa-fnsummary and ipa-modref where it is used to examine a call to a function with variadic arguments and we do not have types for the unknown parameters. But we cannot really work with those or estimate any benefits when it comes to them, so ignoring them should be OK. Even after this patch, ipa_get_jf_arith_result has a parameter called res_type in which it performs operations for aggregate jump functions, where we do not allow type conversions when constructing the jump functions and the type is the type of the stored data. In GCC 16, we could relax this and allow conversions like for scalars. gcc/ChangeLog: 2025-01-20 Martin Jambor PR ipa/118097 * ipa-cp.cc (ipa_get_jf_arith_result): Adjust comment. (ipa_get_jf_pass_through_result): Removed. 
(ipa_value_from_jfunc): Use directly ipa_get_jf_arith_result, do not specify operation type but make sure we check and possibly convert the result. (get_val_across_arith_op): Remove the last parameter, always pass NULL_TREE to ipa_get_jf_arith_result in its last argument. (propagate_vals_across_arith_jfunc): Do not pass res_type to get_val_across_arith_op. (propagate_vals_across_pass_through): Add checking assert that parm_type is not NULL. gcc/testsuite/ChangeLog: 2025-01-24 Martin Jambor PR ipa/118097 * gcc.dg/ipa/pr118097.c: New test. * gcc.dg/ipa/pr118535.c: Likewise. * gcc.dg/ipa/ipa-notypes-1.c: Likewise. Diff: --- gcc/ipa-cp.cc| 46 +--- gcc/testsuite/gcc.dg/ipa/ipa-notypes-1.c | 17 gcc/testsuite/gcc.dg/ipa/pr118097.c | 23 gcc/testsuite/gcc.dg/ipa/pr118535.c | 17 4 files changed, 75 insertions(+), 28 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index d89324a00775..68959f2677ba 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1467,11 +1467,10 @@ ipacp_value_safe_for_type (tree param_type, tree value) return NULL_TREE; } -/* Return the result of a (possibly arithmetic) operation on the constant - value INPUT. OPERAND is 2nd operand for binary operation. RES_TYPE is - the type of the parameter to which the result is passed. Return - NULL_TREE if that cannot be determined or be considered an - interprocedural invariant. */ +/* Return the result of a (possibly arithmetic) operation on the constant value + INPUT. OPERAND is 2nd operand for binary operation. RES_TYPE is the type + in which any operation is to be performed. Return NULL_TREE if that cannot + be determined or be considered an interprocedural invariant. */ static tree ipa_get_jf_arith_result (enum tree_code opcode, tree input, tree operand, @@ -1513,21 +1512,6 @@ ipa_get_jf_arith_result (enum tree_code opcode, tree input, tree operand, return res; } -/* Return the result of a (possibly arithmetic) pass through jump function - JFUNC on the constant value INPUT. 
RES_TYPE is the type of the parameter - to which the result is passed. Return NULL_TREE if that cannot be - determined or be considered an interprocedural invariant. */ - -static tree -ipa_get_jf_pass_through_result (struct ipa_jump_func *jfunc, tree input, - tree res_type) -{ - return ipa_get_jf_arith_result (ipa_get_jf_pass_through_operation (jfunc), - input, - ipa_get_jf_pass_through_operand (jfunc), - re
[gcc r15-7476] lto: Add an entry for cold attribute to lto_gnu_attributes
https://gcc.gnu.org/g:4abac2ffdb071ca9337e4f31fa79cd38df1ac7c3 commit r15-7476-g4abac2ffdb071ca9337e4f31fa79cd38df1ac7c3 Author: Martin Jambor Date: Tue Feb 11 16:39:56 2025 +0100 lto: Add an entry for cold attribute to lto_gnu_attributes PR 118125 is a performance regression stemming from the fact that we lose the cold attribute of our __builtin_unreachable. The attribute is simply and silently dropped on the floor by decl_attributes (in attribs.cc) in the process of building decls for builtins because it cannot look it up in the gnu attribute name space by lookup_scoped_attribute_spec. For that not to happen it must be in lto_gnu_attributes and this patch adds it there. In comment 13 of the bug Andrew identified other attributes which are in builtin-attrs.def but missing in lto_gnu_attributes but apart from cold it seems that they are either not used in builtins.def or are used in DEF_LIB_BUILTIN which I guess might be less critical? Eventually I decided to go for the most simple of patches and only add things if they are requested. For the same reason I also did not add any checking to the attribute "handle" callback or any exclusion check. They seem to be mostly relevant before LTO FE kicks in to me, but again, I'm happy to add any if they seem to be useful. Since Ian fixed PR 118746, the same issue has also been fixed in the Go front-end and so I have added a simple checking assert to the redirect_to_unreachable function to make sure it has the intended effect. gcc/ChangeLog: 2025-02-03 Martin Jambor PR lto/118125 * ipa-fnsummary.cc (redirect_to_unreachable): Add checking assert that the builtin_unreachable decl has attribute cold. gcc/lto/ChangeLog: 2025-02-03 Martin Jambor PR lto/118125 * lto-lang.cc (lto_gnu_attributes): Add an entry for cold attribute. (handle_cold_attribute): New function. 
Diff: --- gcc/ipa-fnsummary.cc | 3 +++ gcc/lto/lto-lang.cc | 13 + 2 files changed, 16 insertions(+) diff --git a/gcc/ipa-fnsummary.cc b/gcc/ipa-fnsummary.cc index 33f19365ec36..4c062fe8a0e2 100644 --- a/gcc/ipa-fnsummary.cc +++ b/gcc/ipa-fnsummary.cc @@ -255,6 +255,9 @@ redirect_to_unreachable (struct cgraph_edge *e) struct cgraph_node *target = cgraph_node::get_create (builtin_decl_unreachable ()); + gcc_checking_assert (lookup_attribute ("cold", +DECL_ATTRIBUTES (target->decl))); + if (e->speculative) e = cgraph_edge::resolve_speculation (e, target->decl); else if (!e->callee) diff --git a/gcc/lto/lto-lang.cc b/gcc/lto/lto-lang.cc index 652d7fc5e30d..e41b548b3983 100644 --- a/gcc/lto/lto-lang.cc +++ b/gcc/lto/lto-lang.cc @@ -60,6 +60,7 @@ static tree ignore_attribute (tree *, tree, tree, int, bool *); static tree handle_format_attribute (tree *, tree, tree, int, bool *); static tree handle_fnspec_attribute (tree *, tree, tree, int, bool *); static tree handle_format_arg_attribute (tree *, tree, tree, int, bool *); +static tree handle_cold_attribute (tree *, tree, tree, int, bool *); /* Helper to define attribute exclusions. */ #define ATTR_EXCL(name, function, type, variable) \ @@ -128,6 +129,8 @@ static const attribute_spec lto_gnu_attributes[] = handle_sentinel_attribute, NULL }, { "type generic", 0, 0, false, true, true, false, handle_type_generic_attribute, NULL }, + { "cold", 0, 0, false, false, false, false, + handle_cold_attribute, NULL }, { "fn spec", 1, 1, false, true, true, false, handle_fnspec_attribute, NULL }, { "transaction_pure", 0, 0, false, true, true, false, @@ -598,6 +601,16 @@ handle_fnspec_attribute (tree *node ATTRIBUTE_UNUSED, tree ARG_UNUSED (name), return NULL_TREE; } +/* Handle a "cold" attribute; arguments as in + struct attribute_spec.handler. */ + +static tree +handle_cold_attribute (tree *, tree, tree, int, bool *) +{ + /* Nothing to be done here. */ + return NULL_TREE; +} + /* Cribbed from c-common.cc. */ static void
[gcc r15-7269] tree-ssa-dce: Avoid creating invalid BBs with no outgoing edge (PR117892)
https://gcc.gnu.org/g:3d07e7bf13d4aec794dd25b5090c139b4d78283d commit r15-7269-g3d07e7bf13d4aec794dd25b5090c139b4d78283d Author: Martin Jambor Date: Wed Jan 29 10:51:08 2025 +0100 tree-ssa-dce: Avoid creating invalid BBs with no outgoing edge (PR117892) Zhendong Su and Michal Jireš found out that our gimple DSE pass can, under fairly specific conditions, remove a noreturn call which then leaves behind a "normal" BB with no successor edges which following passes do not expect. This patch simply tells the pass to leave such calls alone even when they otherwise appear to be dead. Interestingly, our CFG verifier does not report this. I'll put on my todo list to add a test for it in the next stage 1. gcc/ChangeLog: 2025-01-28 Martin Jambor PR tree-optimization/117892 * tree-ssa-dse.cc (dse_optimize_call): Leave control-altering noreturn calls alone. gcc/testsuite/ChangeLog: 2025-01-27 Martin Jambor PR tree-optimization/117892 * gcc.dg/tree-ssa/pr117892.c: New test. * gcc.dg/tree-ssa/pr118517.c: Likewise. 
co-authored-by: Michal Jireš Diff: --- gcc/testsuite/gcc.dg/tree-ssa/pr117892.c | 17 + gcc/testsuite/gcc.dg/tree-ssa/pr118517.c | 11 +++ gcc/tree-ssa-dse.cc | 6 -- 3 files changed, 32 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117892.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117892.c new file mode 100644 index ..d9b9c15095fc --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117892.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O1" } */ + + +volatile int a; +void b(int *c) { + int *d = 0; + *c = 0; + *d = 0; + __builtin_abort(); +} +int main() { + int f; + if (a) +b(&f); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr118517.c b/gcc/testsuite/gcc.dg/tree-ssa/pr118517.c new file mode 100644 index ..3a34f6788a9c --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr118517.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O1 -fno-ipa-pure-const" } */ + +void __attribute__((noreturn)) bar(void) { + __builtin_unreachable (); +} + +int p; +void foo() { + if (p) bar(); +} diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc index 753d7ef148ba..bc632e384841 100644 --- a/gcc/tree-ssa-dse.cc +++ b/gcc/tree-ssa-dse.cc @@ -1396,8 +1396,10 @@ dse_optimize_call (gimple_stmt_iterator *gsi, sbitmap live_bytes) if (!node) return false; - if (stmt_could_throw_p (cfun, stmt) - && !cfun->can_delete_dead_exceptions) + if ((stmt_could_throw_p (cfun, stmt) + && !cfun->can_delete_dead_exceptions) + || ((gimple_call_flags (stmt) & ECF_NORETURN) + && gimple_call_ctrl_altering_p (stmt))) return false; /* If return value is used the call is not dead. */
[gcc r15-6110] ipa: Update value range jump functions during inlining
https://gcc.gnu.org/g:92e0e0f8177530b8c6fcafe1d61ba03b00dff6a6 commit r15-6110-g92e0e0f8177530b8c6fcafe1d61ba03b00dff6a6 Author: Martin Jambor Date: Wed Dec 11 14:55:27 2024 +0100 ipa: Update value range jump functions during inlining When inlining (during the analysis phase) a call graph edge, we update all pass-through jump functions corresponding to edges going out of the newly inlined function to be relative to the function into which we are inlining or to expose the information originally captured for the edge that is being inlined. Similarly, we can combine the value range information in pass-through jump functions corresponding to both edges, which is what this patch adds - at least for the case when the inlined pass-through is a simple, non-arithmetic one, which is the case that we also handle for constant and aggregate jump function parts. gcc/ChangeLog: 2024-11-01 Martin Jambor * ipa-cp.h: Forward declare class ipa_vr. (ipa_vr_operation_and_type_effects) Declare. * ipa-cp.cc (ipa_vr_operation_and_type_effects): Make public. * ipa-prop.cc (update_jump_functions_after_inlining): Also update value range jump functions. Diff: --- gcc/ipa-cp.cc | 4 ++-- gcc/ipa-cp.h| 13 + gcc/ipa-prop.cc | 18 ++ 3 files changed, 33 insertions(+), 2 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index e6d707c286db..a664bc03f62a 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1653,7 +1653,7 @@ ipa_context_from_jfunc (ipa_node_params *info, cgraph_edge *cs, int csidx, DST_TYPE on value range in SRC_VR and store it to DST_VR. Return true if the result is a range that is not VARYING nor UNDEFINED. */ -static bool +bool ipa_vr_operation_and_type_effects (vrange &dst_vr, const vrange &src_vr, enum tree_code operation, @@ -1679,7 +1679,7 @@ ipa_vr_operation_and_type_effects (vrange &dst_vr, /* Same as above, but the SRC_VR argument is an IPA_VR which must first be extracted onto a vrange. 
*/ -static bool +bool ipa_vr_operation_and_type_effects (vrange &dst_vr, const ipa_vr &src_vr, enum tree_code operation, diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h index ba2ebfede63f..4f569c1ee838 100644 --- a/gcc/ipa-cp.h +++ b/gcc/ipa-cp.h @@ -299,4 +299,17 @@ ipa_vr_supported_type_p (tree type) return irange::supports_p (type) || prange::supports_p (type); } +class ipa_vr; + +bool ipa_vr_operation_and_type_effects (vrange &dst_vr, + const vrange &src_vr, + enum tree_code operation, + tree dst_type, tree src_type); +bool ipa_vr_operation_and_type_effects (vrange &dst_vr, + const ipa_vr &src_vr, + enum tree_code operation, + tree dst_type, tree src_type); + + + #endif /* IPA_CP_H */ diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index 9070a45f6835..3d72794e37c4 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -3471,6 +3471,24 @@ update_jump_functions_after_inlining (struct cgraph_edge *cs, gcc_unreachable (); } + if (src->m_vr && src->m_vr->known_p ()) + { + value_range svr (src->m_vr->type ()); + if (!dst->m_vr || !dst->m_vr->known_p ()) + ipa_set_jfunc_vr (dst, *src->m_vr); + else if (ipa_vr_operation_and_type_effects (svr, *src->m_vr, + NOP_EXPR, + dst->m_vr->type (), + src->m_vr->type ())) + { + value_range dvr; + dst->m_vr->get_vrange (dvr); + dvr.intersect (svr); + if (!dvr.undefined_p ()) + ipa_set_jfunc_vr (dst, dvr); + } + } + if (src->agg.items && (dst_agg_p || !src->agg.by_ref)) {
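The combination step added to update_jump_functions_after_inlining can be sketched with plain closed intervals. This is only an illustration under stated assumptions: the `Range` struct and `combine_after_inlining` are hypothetical, integer intervals stand in for GCC's `value_range`, and the type conversion performed by ipa_vr_operation_and_type_effects is left out. What it mirrors from the diff is the decision structure: adopt the source range when the destination has none, otherwise intersect, and keep the destination range unchanged if the intersection turns out to be empty.

```cpp
#include <algorithm>

// Hypothetical closed integer interval; KNOWN is false when no range
// information is available (models a missing or unknown m_vr).
struct Range {
  bool known;
  long lo, hi;
};

// Combine the range recorded on the outgoing edge (DST) with the range
// recorded on the edge being inlined (SRC).
inline Range combine_after_inlining(Range dst, Range src) {
  if (!src.known)
    return dst;  // nothing new to learn from the inlined edge
  if (!dst.known)
    return src;  // adopt the source range as it is
  long lo = std::max(dst.lo, src.lo);
  long hi = std::min(dst.hi, src.hi);
  if (lo > hi)
    return dst;  // empty intersection: leave DST untouched
  return Range{true, lo, hi};
}
```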
[gcc r15-6295] ipa: Better value ranges for pointer integer constants
https://gcc.gnu.org/g:1eb41aeb49a491f5b18d160074e651a76afc655a commit r15-6295-g1eb41aeb49a491f5b18d160074e651a76afc655a Author: Martin Jambor Date: Tue Dec 17 11:17:14 2024 +0100 ipa: Better value ranges for pointer integer constants When looking into cases where we know an actual argument of a call is a constant but we don't generate a singleton value-range for the jump function, I found out that the special handling of pointer constants does not work well for constant zero pointer values. In fact the code only attempts to see if it can figure out that an argument is not zero and if it can figure out any alignment information. With this patch, we try to use the value_range that ranger can give us in the jump function if we can and we query ranger for all kinds of arguments, not just SSA_NAMES (and so also pointer integer constants). If we cannot figure out a useful range we fall back again on figuring out non-NULLness with tree_single_nonzero_warnv_p. With this patch, we generate [prange] struct S * [0, 0] MASK 0x0 VALUE 0x0 instead of for example: [prange] struct S * [0, +INF] MASK 0xfff0 VALUE 0x0 for a zero constant passed in a call. If you are wondering why we check whether the value range obtained from range_of_expr can be undefined, even when the function returns true, that is because that can apparently happen for default-definition SSA_NAMEs. gcc/ChangeLog: 2024-11-15 Martin Jambor * ipa-prop.cc (ipa_compute_jump_functions_for_edge): Try harder to use the value range obtained from ranger for pointer values. 
Diff: --- gcc/ipa-prop.cc | 35 --- 1 file changed, 16 insertions(+), 19 deletions(-) diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index ae309ec78a2d..f0b915ba2be1 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -2396,28 +2396,27 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi, value_range vr (TREE_TYPE (arg)); if (POINTER_TYPE_P (TREE_TYPE (arg))) { - bool addr_nonzero = false; - bool strict_overflow = false; - - if (TREE_CODE (arg) == SSA_NAME - && param_type - && get_range_query (cfun)->range_of_expr (vr, arg, cs->call_stmt) - && vr.nonzero_p ()) - addr_nonzero = true; - else if (tree_single_nonzero_warnv_p (arg, &strict_overflow)) - addr_nonzero = true; - - if (addr_nonzero) - vr.set_nonzero (TREE_TYPE (arg)); - + if (!get_range_query (cfun)->range_of_expr (vr, arg, cs->call_stmt) + || vr.varying_p () + || vr.undefined_p ()) + { + bool strict_overflow = false; + if (tree_single_nonzero_warnv_p (arg, &strict_overflow)) + vr.set_nonzero (TREE_TYPE (arg)); + else + vr.set_varying (TREE_TYPE (arg)); + } + gcc_assert (!vr.undefined_p ()); unsigned HOST_WIDE_INT bitpos; - unsigned align, prec = TYPE_PRECISION (TREE_TYPE (arg)); + unsigned align = BITS_PER_UNIT; - get_pointer_alignment_1 (arg, &align, &bitpos); + if (!vr.singleton_p ()) + get_pointer_alignment_1 (arg, &align, &bitpos); if (align > BITS_PER_UNIT && opt_for_fn (cs->caller->decl, flag_ipa_bit_cp)) { + unsigned prec = TYPE_PRECISION (TREE_TYPE (arg)); wide_int mask = wi::bit_and_not (wi::mask (prec, false, prec), wide_int::from (align / BITS_PER_UNIT - 1, @@ -2425,12 +2424,10 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi, wide_int value = wide_int::from (bitpos / BITS_PER_UNIT, prec, UNSIGNED); irange_bitmask bm (value, mask); - if (!addr_nonzero) - vr.set_varying (TREE_TYPE (arg)); vr.update_bitmask (bm); ipa_set_jfunc_vr (jfunc, vr); } - else if (addr_nonzero) + else if (!vr.varying_p ()) ipa_set_jfunc_vr (jfunc, vr); else gcc_assert (!jfunc->m_vr);
[gcc r15-6296] ipa: Improve how we derive value ranges from IPA invariants
https://gcc.gnu.org/g:5d740f56a162702a33379789a4d6134d9733aa71 commit r15-6296-g5d740f56a162702a33379789a4d6134d9733aa71 Author: Martin Jambor Date: Tue Dec 17 11:17:14 2024 +0100 ipa: Improve how we derive value ranges from IPA invariants I believe that the current function ipa_range_set_and_normalize lacks a test whether the base of an ADDR_EXPR really cannot be NULL, so this patch adds it. Moreover, I never liked the name as I do not think it makes the value of ranges any more normal but rather just special-cases non-zero ip_invariant pointers. Therefore, I have given it a different name and moved it to a .cc file; our LTO bootstrap should inline (and/or split) it if necessary anyway. Because, as Honza correctly pointed out, deriving non-NULLness from a pointer depends on flag_delete_null_pointer_checks which is an optimization flag and thus depends on a given function, in this version of the patch ipa_get_range_from_ip_invariant gets a context_node parameter for that purpose. This then needs to be used within symtab_node::nonzero_address which gets a special overload in which the value of the flag can be provided as a parameter. gcc/ChangeLog: 2024-12-11 Martin Jambor * cgraph.h (symtab_node): Add a new overload of nonzero_address. * symtab.cc (symtab_node::nonzero_address): Add a new overload with a parameter for delete_null_pointer_checks. Make the original overload call the new one which retains the actual implementation. * ipa-prop.h (ipa_get_range_from_ip_invariant): Declare. (ipa_range_set_and_normalize): Remove. * ipa-prop.cc (ipa_get_range_from_ip_invariant): New function. (ipa_range_set_and_normalize): Remove. * ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Add a new parameter context_node. Use ipa_get_range_from_ip_invariant instead of ipa_range_set_and_normalize and pass to it the new parameter. (ipa_value_range_from_jfunc): Pass cs->caller as the context_node to ipa_vr_intersect_with_arith_jfunc. 
(propagate_vr_across_jump_function): Likewise. (ipa_get_range_from_ip_invariant): New function. * ipa-fnsummary.cc (evaluate_conditions_for_known_args): Use ipa_get_range_from_ip_invariant instead of ipa_range_set_and_normalize Diff: --- gcc/cgraph.h | 4 gcc/ipa-cp.cc| 12 gcc/ipa-fnsummary.cc | 4 ++-- gcc/ipa-prop.cc | 37 + gcc/ipa-prop.h | 15 +-- gcc/symtab.cc| 21 +++-- 6 files changed, 67 insertions(+), 26 deletions(-) diff --git a/gcc/cgraph.h b/gcc/cgraph.h index 50bae96de4cf..9b4cb6383afc 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -431,6 +431,10 @@ public: /* Return true if ONE and TWO are part of the same COMDAT group. */ inline bool in_same_comdat_group_p (symtab_node *target); + /* Return true if symbol is known to be nonzero, assume that + flag_delete_null_pointer_checks is equal to delete_null_pointer_checks. */ + bool nonzero_address (bool delete_null_pointer_checks); + /* Return true if symbol is known to be nonzero. */ bool nonzero_address (); diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index a664bc03f62a..5d7b3d25df5d 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1693,11 +1693,14 @@ ipa_vr_operation_and_type_effects (vrange &dst_vr, /* Given a PASS_THROUGH jump function JFUNC that takes as its source SRC_VR of SRC_TYPE and the result needs to be DST_TYPE, if any value range information - can be deduced at all, intersect VR with it. */ + can be deduced at all, intersect VR with it. CONTEXT_NODE is the call graph + node representing the function for which optimization flags should be + evaluated. 
*/ static void ipa_vr_intersect_with_arith_jfunc (vrange &vr, ipa_jump_func *jfunc, + cgraph_node *context_node, const value_range &src_vr, tree src_type, tree dst_type) @@ -1720,7 +1723,7 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr, if (!handler) return; value_range op_vr (TREE_TYPE (operand)); - ipa_range_set_and_normalize (op_vr, operand); + ipa_get_range_from_ip_invariant (op_vr, operand, context_node); tree operation_type; if (TREE_CODE_CLASS (operation) == tcc_comparison) @@ -1776,7 +1779,8 @@ ipa_value_range_from_jfunc (vrange &vr, value_range srcvr; (*sum->m_vr)[idx].get_vrange (srcvr); - ipa_vr_intersect_with_arith_jfunc (vr, jfunc, srcvr, src_type, parm_type); + ipa_vr_intersect_with_arith_jfunc (vr, jfunc, cs->caller, srcvr, src_type, +
[gcc r15-6294] ipa: Skip widening type conversions in jump function constructions
https://gcc.gnu.org/g:96fb71883d438bdb241fdf9c7d12f945c5ba0c7f commit r15-6294-g96fb71883d438bdb241fdf9c7d12f945c5ba0c7f Author: Martin Jambor Date: Tue Dec 17 11:17:14 2024 +0100 ipa: Skip widening type conversions in jump function constructions Originally, we did not stream any formal parameter types into WPA and were generally very conservative when it came to type mismatches in IPA-CP. Over time, mismatches that happen in code and blew up in WPA made us much more resilient and also made us stream the types of the parameters, which we now use commonly. With that information, we can safely skip conversions when looking at the IL from which we build jump functions and then simply fold convert the constants and ranges to the resulting type, as long as we are careful that performing the corresponding folding of constants gives the corresponding results. In order to do that, we must ensure that the old value can be represented in the new one without any loss. With this change, we can nicely propagate non-NULLness in IPA-VR as demonstrated with the new test case. I have gone through all other uses of (all components of) jump functions which could be affected by this and verified they do indeed check types and can handle mismatches. gcc/ChangeLog: 2024-12-11 Martin Jambor * ipa-prop.cc: Include vr-values.h. (skip_a_safe_conversion_op): New function. (ipa_compute_jump_functions_for_edge): Use it. gcc/testsuite/ChangeLog: 2024-11-01 Martin Jambor * gcc.dg/ipa/vrp9.c: New test. Diff: --- gcc/ipa-prop.cc | 40 ++ gcc/testsuite/gcc.dg/ipa/vrp9.c | 48 + 2 files changed, 88 insertions(+) diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index 3d72794e37c4..ae309ec78a2d 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -59,6 +59,7 @@ along with GCC; see the file COPYING3. If not see #include "attr-fnspec.h" #include "gimple-range.h" #include "value-range-storage.h" +#include "vr-values.h" /* Function summary where the parameter infos are actually stored. 
*/ ipa_node_params_t *ipa_node_params_sum = NULL; @@ -2311,6 +2312,44 @@ ipa_set_jfunc_vr (ipa_jump_func *jf, const ipa_vr &vr) ipa_set_jfunc_vr (jf, tmp); } + +/* If T is an SSA_NAME that is the result of a simple type conversion statement + from an integer type to another integer type which is known to be able to + represent the values the operand of the conversion can hold, return the + operand of that conversion, otherwise return T. */ + +static tree +skip_a_safe_conversion_op (tree t) +{ + if (TREE_CODE (t) != SSA_NAME + || SSA_NAME_IS_DEFAULT_DEF (t)) +return t; + + gimple *def = SSA_NAME_DEF_STMT (t); + if (!is_gimple_assign (def) + || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)) + || !INTEGRAL_TYPE_P (TREE_TYPE (t)) + || !INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (def)))) +return t; + + tree rhs1 = gimple_assign_rhs1 (def); + if (TYPE_PRECISION (TREE_TYPE (t)) + >= TYPE_PRECISION (TREE_TYPE (rhs1))) +return gimple_assign_rhs1 (def); + + value_range vr (TREE_TYPE (rhs1)); + if (!get_range_query (cfun)->range_of_expr (vr, rhs1, def) + || vr.undefined_p ()) +return t; + + irange &ir = as_a <irange> (vr); + if (range_fits_type_p (&ir, TYPE_PRECISION (TREE_TYPE (t)), +TYPE_SIGN (TREE_TYPE (t)))) + return gimple_assign_rhs1 (def); + + return t; +} + /* Compute jump function for all arguments of callsite CS and insert the information in the jump_functions array in the ipa_edge_args corresponding to this callsite. 
*/ @@ -2415,6 +2454,7 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi, gcc_assert (!jfunc->m_vr); } + arg = skip_a_safe_conversion_op (arg); if (is_gimple_ip_invariant (arg) || (VAR_P (arg) && is_global_var (arg) diff --git a/gcc/testsuite/gcc.dg/ipa/vrp9.c b/gcc/testsuite/gcc.dg/ipa/vrp9.c new file mode 100644 index ..461a2e757d2c --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/vrp9.c @@ -0,0 +1,48 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +int some_f1 (int); +int some_f2 (int); +int some_f3 (int); + +void remove_this_call (); + +int g; + +static int __attribute__((noinline)) +bar (int p) +{ + if (p) +remove_this_call (); + return g++; +} + +static int __attribute__((noinline)) +foo (int (*f)(int)) +{ + return bar (f == (void *)0); +} + +int +baz1 (void) +{ + int (*f)(int); + if (g) +f = some_f1; + else +f = some_f2; + return foo (f); +} + +int +baz2 (void) +{ + int (*f)(int); + if (g) +f = some_f2; + else +f = some_f3; + return foo (f); +} + +/* { dg-final { scan-tree-dump-not "remove_this_call" "
[gcc r15-6769] ipa-cp: Fold-convert values when necessary (PR 118138)
https://gcc.gnu.org/g:d019ab4f115caab48316c185c007765719e93052 commit r15-6769-gd019ab4f115caab48316c185c007765719e93052 Author: Martin Jambor Date: Sat Jan 4 20:40:07 2025 +0100 ipa-cp: Fold-convert values when necessary (PR 118138) PR 118138 and quite a few duplicates that it has acquired in a short time show that even though we are careful to make sure we do not lose any bits when newly allowing type conversions in jump-functions, we still need to perform the fold conversions during IPA constant propagation and not just at the end in order to properly perform sign-extensions or zero-extensions as appropriate. This patch does just that, changing a safety predicate we already use at the appropriate places to return the necessary type. gcc/ChangeLog: 2025-01-03 Martin Jambor PR ipa/118138 * ipa-cp.cc (ipacp_value_safe_for_type): Return the appropriate type instead of a bool, accept NULL_TREE VALUEs. (propagate_vals_across_arith_jfunc): Use the new returned value of ipacp_value_safe_for_type. (propagate_vals_across_ancestor): Likewise. (propagate_scalar_across_jump_function): Likewise. gcc/testsuite/ChangeLog: 2025-01-03 Martin Jambor PR ipa/118138 * gcc.dg/ipa/pr118138.c: New test. Diff: --- gcc/ipa-cp.cc | 33 +++-- gcc/testsuite/gcc.dg/ipa/pr118138.c | 30 ++ 2 files changed, 49 insertions(+), 14 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 294389fba4c7..d89324a00775 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1448,19 +1448,23 @@ initialize_node_lattices (struct cgraph_node *node) } } -/* Return true if VALUE can be safely IPA-CP propagated to a parameter of type - PARAM_TYPE. */ +/* Return VALUE if it is NULL_TREE or if it can be directly safely IPA-CP + propagated to a parameter of type PARAM_TYPE, or return a fold-converted + VALUE to PARAM_TYPE if that is possible. Return NULL_TREE otherwise. 
*/ -static bool +static tree ipacp_value_safe_for_type (tree param_type, tree value) { + if (!value) +return NULL_TREE; tree val_type = TREE_TYPE (value); if (param_type == val_type - || useless_type_conversion_p (param_type, val_type) - || fold_convertible_p (param_type, value)) -return true; + || useless_type_conversion_p (param_type, val_type)) +return value; + if (fold_convertible_p (param_type, value)) +return fold_convert (param_type, value); else -return false; +return NULL_TREE; } /* Return the result of a (possibly arithmetic) operation on the constant @@ -2210,8 +2214,8 @@ propagate_vals_across_arith_jfunc (cgraph_edge *cs, { tree cstval = get_val_across_arith_op (opcode, opnd1_type, opnd2, src_val, res_type); - if (!cstval - || !ipacp_value_safe_for_type (res_type, cstval)) + cstval = ipacp_value_safe_for_type (res_type, cstval); + if (!cstval) break; ret |= dest_lat->add_value (cstval, cs, src_val, src_idx, @@ -2235,8 +2239,8 @@ propagate_vals_across_arith_jfunc (cgraph_edge *cs, tree cstval = get_val_across_arith_op (opcode, opnd1_type, opnd2, src_val, res_type); - if (cstval - && ipacp_value_safe_for_type (res_type, cstval)) + cstval = ipacp_value_safe_for_type (res_type, cstval); + if (cstval) ret |= dest_lat->add_value (cstval, cs, src_val, src_idx, src_offset); else @@ -2284,8 +2288,8 @@ propagate_vals_across_ancestor (struct cgraph_edge *cs, for (src_val = src_lat->values; src_val; src_val = src_val->next) { tree t = ipa_get_jf_ancestor_result (jfunc, src_val->value); - - if (t && ipacp_value_safe_for_type (param_type, t)) + t = ipacp_value_safe_for_type (param_type, t); + if (t) ret |= dest_lat->add_value (t, cs, src_val, src_idx); else ret |= dest_lat->set_contains_variable (); @@ -2310,7 +2314,8 @@ propagate_scalar_across_jump_function (struct cgraph_edge *cs, if (jfunc->type == IPA_JF_CONST) { tree val = ipa_get_jf_constant (jfunc); - if (ipacp_value_safe_for_type (param_type, val)) + val = ipacp_value_safe_for_type (param_type, val); + if 
(val) return dest_lat->add_value (val, cs, NULL, 0); else return dest_lat->set_contains_variable (); diff --git a/gcc/testsuite/gcc.dg/ipa/pr118138.c b/gcc/testsuite/gcc.dg/ipa/pr118138.c new file mode 100644 index ..5c94253f58b2 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr118138.c @@ -0,0 +1,30 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -fno-inli
[gcc r15-6864] MAINTAINERS: Make contrib/check-MAINTAINERS.py happy
https://gcc.gnu.org/g:539fc490690d825ab2d299a0f577c5e9d3fa33d0 commit r15-6864-g539fc490690d825ab2d299a0f577c5e9d3fa33d0 Author: Martin Jambor Date: Mon Jan 13 13:47:27 2025 +0100 MAINTAINERS: Make contrib/check-MAINTAINERS.py happy This commit makes the contrib/check-MAINTAINERS.py script happy about our MAINTAINERS file. I hope that it knows best how things ought to be and so am committing this as obvious. ChangeLog: 2025-01-13 Martin Jambor * MAINTAINERS: Fix the name order of the Write After Approval section. Diff: --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 0c571bde8bce..256a03957d59 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -327,12 +327,12 @@ from other maintainers or reviewers. NameBZ account Email +Soumya AR soumyaa Mark G. Adams mgadams Ajit Kumar Agarwal aagarwa Pedro Alves palves John David Anglin danglin Harald Anlauf anlauf -Soumya AR soumyaa Paul-Antoine Arras parras Arsen Arsenović arsen Raksit Ashokraksit
[gcc r15-8061] ipa: Do not modify cgraph edges from thunk clones during inlining (PR116572)
https://gcc.gnu.org/g:075ec330307c5b1fe5ed166a633c718c06b01437 commit r15-8061-g075ec330307c5b1fe5ed166a633c718c06b01437 Author: Martin Jambor Date: Fri Mar 14 16:07:01 2025 +0100 ipa: Do not modify cgraph edges from thunk clones during inlining (PR116572) In PR 116572 we hit an assert that a thunk which does not have a body looks like it has one. It does not, but the call_stmt of its outgoing edge points to a statement, which it should not. In fact it has several outgoing call graph edges, which cannot be. The problem is that the code updating the edges to reflect inlining into the master clone (an ex-thunk, unlike the clone, which is still an unexpanded thunk) is also being applied to the clone during inlining into the master clone. This patch simply makes the code skip unexpanded thunk clones. gcc/ChangeLog: 2025-03-13 Martin Jambor PR ipa/116572 * cgraph.cc (cgraph_update_edges_for_call_stmt): Do not update edges of clones that are unexpanded thunks. Assert that the node passed as the parameter is not an unexpanded thunk. gcc/testsuite/ChangeLog: 2025-03-13 Martin Jambor PR ipa/116572 * g++.dg/ipa/pr116572.C: New test. Diff: --- gcc/cgraph.cc | 7 +-- gcc/testsuite/g++.dg/ipa/pr116572.C | 37 + 2 files changed, 42 insertions(+), 2 deletions(-) diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc index d0b19ad850e0..6ae6a97f6f56 100644 --- a/gcc/cgraph.cc +++ b/gcc/cgraph.cc @@ -1708,12 +1708,15 @@ cgraph_update_edges_for_call_stmt (gimple *old_stmt, tree old_decl, cgraph_node *node; gcc_checking_assert (orig); + gcc_assert (!orig->thunk); cgraph_update_edges_for_call_stmt_node (orig, old_stmt, old_decl, new_stmt); if (orig->clones) for (node = orig->clones; node != orig;) { - cgraph_update_edges_for_call_stmt_node (node, old_stmt, old_decl, - new_stmt); + /* Do not attempt to adjust bodies of yet unexpanded thunks. 
*/ + if (!node->thunk) + cgraph_update_edges_for_call_stmt_node (node, old_stmt, old_decl, + new_stmt); if (node->clones) node = node->clones; else if (node->next_sibling_clone) diff --git a/gcc/testsuite/g++.dg/ipa/pr116572.C b/gcc/testsuite/g++.dg/ipa/pr116572.C new file mode 100644 index ..909568e1c72c --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr116572.C @@ -0,0 +1,37 @@ +/* { dg-do compile } */ +/* { dg-options "-std=c++20 -O3 -fsanitize=undefined" } */ + +long v; +template struct A; +template , typename = C> +class B; +template <> +struct A +{ + static int foo(char *s, const char *t, long n) { return __builtin_memcmp(s, t, n); } +}; +template +struct B { + long b; + B(const C *); + C *bar() const; + constexpr unsigned long baz(const C *, unsigned long, unsigned long) const noexcept; + void baz() { C c; baz(&c, 0, v); } +}; +template +constexpr unsigned long +B::baz(const C *s, unsigned long, unsigned long n) const noexcept +{ + C *x = bar(); if (!x) return b; D::foo(x, s, n); return 0; +} +namespace { +struct F { virtual ~F() {} }; +struct F2 { virtual void foo(B) const; }; +struct F3 : F, F2 { void foo(B s) const { s.baz(); } } f; +} +int +main() +{ + F *p; + dynamic_cast(p)->foo(""); +}
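The walk over the clone tree that this patch touches can be sketched in isolation. The following is a minimal illustration, not the GCC implementation: `Node`, `add_clone` and `for_each_non_thunk` are hypothetical names, only the field names (`thunk`, `clones`, `next_sibling_clone`, `clone_of`) are taken from the diff, and the final "climb back up" branch of the loop is an assumption about the part of the function the hunk does not show. The point is that a thunk clone is skipped for the update while its own clones are still visited.

```cpp
#include <cassert>

// Hypothetical, pared-down stand-in for a call graph node.
struct Node {
  bool thunk = false;                  // still an unexpanded thunk?
  Node* clone_of = nullptr;            // parent in the clone tree
  Node* clones = nullptr;              // first child clone
  Node* next_sibling_clone = nullptr;  // next clone of the same parent
};

// Link CHILD into PARENT's list of clones (prepends to the list).
void add_clone(Node& parent, Node& child) {
  child.clone_of = &parent;
  child.next_sibling_clone = parent.clones;
  parent.clones = &child;
}

// Call VISIT on ORIG and on every clone below it, skipping nodes that are
// unexpanded thunks but still descending into their clones.
template <typename F>
void for_each_non_thunk(Node* orig, F visit) {
  assert(!orig->thunk);  // mirrors the new assert on the root node
  visit(orig);
  if (!orig->clones)
    return;
  for (Node* node = orig->clones; node != orig;) {
    if (!node->thunk)
      visit(node);
    if (node->clones)
      node = node->clones;
    else if (node->next_sibling_clone)
      node = node->next_sibling_clone;
    else {
      // Assumed continuation: climb until a sibling exists or we hit ORIG.
      while (node != orig && !node->next_sibling_clone)
        node = node->clone_of;
      if (node != orig)
        node = node->next_sibling_clone;
    }
  }
}
```

With a root that has two clones, one of which is a thunk with a clone of its own, the visitor runs for the root, the non-thunk clone and the thunk's clone, but not for the thunk itself.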
[gcc r13-9497] ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318)
https://gcc.gnu.org/g:659e222b82c41ae0730a0bb93d891864b6ae5e16 commit r13-9497-g659e222b82c41ae0730a0bb93d891864b6ae5e16 Author: Martin Jambor Date: Fri Mar 7 17:17:24 2025 +0100 ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318) PR 118318 reported an ICE during a PGO build of Firefox in IPA-CP, in the final stages of update_counts_for_self_gen_clones, where it attempts to guess how to distribute profile count among clones created for recursive edges and the various edges that are created in the process. If one such edge has profile count of kind GUESSED_GLOBAL0, the compatibility check in the operator+ will lead to an ICE. After discussing the situation with Honza, we concluded that there is little more we can do other than check for this situation before touching the edge count, so this is what this patch does. gcc/ChangeLog: 2025-02-28 Martin Jambor PR ipa/118318 * ipa-cp.cc (adjust_clone_incoming_counts): Add a compatible_p check. (cherry picked from commit 7deb498425799aceb7659ea25614175a49533184) Diff: --- gcc/ipa-cp.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 8f36608cf33b..08fca00e5f65 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -4808,7 +4808,8 @@ adjust_clone_incoming_counts (cgraph_node *node, cs->count = cs->count.combine_with_ipa_count (sum); } else if (!desc->processed_edges->contains (cs) -&& cs->caller->clone_of == desc->orig) +&& cs->caller->clone_of == desc->orig +&& cs->count.compatible_p (desc->count)) { cs->count += desc->count; if (dump_file)
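The guard added here follows a common pattern: test compatibility before calling an operator that asserts it. The sketch below is a simplified model and makes no claim about GCC's profile_count; `Count`, its `global` flag and `add_if_compatible` are hypothetical, and the rule "two counts are compatible when their flags match" is an assumption chosen only to keep the example small.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of a profile count; not GCC's profile_count.
struct Count {
  std::uint64_t value;
  bool global;  // stands in for the "kind" of the count

  // Assumed rule for this sketch: counts are compatible when kinds match.
  bool compatible_p(const Count& other) const {
    return global == other.global;
  }

  // Like the operator described in the commit message, this aborts when
  // the operands are incompatible.
  Count& operator+=(const Count& other) {
    assert(compatible_p(other));
    value += other.value;
    return *this;
  }
};

// Add DELTA to EDGE only when that is safe; report whether it happened.
bool add_if_compatible(Count& edge, const Count& delta) {
  if (!edge.compatible_p(delta))
    return false;  // leave the edge count untouched instead of aborting
  edge += delta;
  return true;
}
```

The design choice mirrors the commit message: when the counts cannot be combined, the edge count is simply left alone rather than guessed.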
[gcc r14-11497] Fix a pasto in ao_compare::compare_ao_refs
https://gcc.gnu.org/g:28c10781fd26324e8fd6077e743944f1a32e commit r14-11497-g28c10781fd26324e8fd6077e743944f1a32e Author: Martin Jambor Date: Tue Mar 11 14:52:44 2025 +0100 Fix a pasto in ao_compare::compare_ao_refs When reading the function ao_compare::compare_ao_refs I came across what I believe to be a copy-and-paste error which this patch fixes. gcc/ChangeLog: 2025-03-10 Martin Jambor * tree-ssa-alias.cc (ao_compare::compare_ao_refs): Fix a copy-and-paste error. (cherry picked from commit dc47161c1f32c3f27d1157ba0de9d98ea1b7fc82) Diff: --- gcc/tree-ssa-alias.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc index 72af21c02131..beab6249ae43 100644 --- a/gcc/tree-ssa-alias.cc +++ b/gcc/tree-ssa-alias.cc @@ -4302,12 +4302,13 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref *ref2, c1 = p1, nskipped1 = i; i++; } + i = 0; for (tree p2 = ref2->ref; handled_component_p (p2); p2 = TREE_OPERAND (p2, 0)) { if (component_ref_to_zero_sized_trailing_array_p (p2)) end_struct_ref2 = p2; if (ends_tbaa_access_path_p (p2)) - c2 = p2, nskipped1 = i; + c2 = p2, nskipped2 = i; i++; }
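The effect of this kind of copy-and-paste error is easy to reproduce outside the compiler. The following is a minimal sketch, not the GCC code: `Result`, `scan_buggy` and `scan_fixed` are hypothetical names, and each chain is reduced to a vector of flags saying whether the element at that position ends the path. As in the diff, each loop records the index of the last flagged element.

```cpp
#include <cassert>
#include <vector>

// Index of the last flagged element in each chain, or -1 if none.
struct Result {
  int nskipped1 = -1;
  int nskipped2 = -1;
};

// Mirrors the code before the fix: the counter is not reset between the
// loops and the second loop writes to the first chain's variable.
Result scan_buggy(const std::vector<bool>& a, const std::vector<bool>& b) {
  Result r;
  int i = 0;
  for (bool ends : a) {
    if (ends)
      r.nskipped1 = i;
    i++;
  }
  for (bool ends : b) {
    if (ends)
      r.nskipped1 = i;  // pasto: should be nskipped2
    i++;
  }
  return r;
}

// Mirrors the code after the fix.
Result scan_fixed(const std::vector<bool>& a, const std::vector<bool>& b) {
  Result r;
  int i = 0;
  for (bool ends : a) {
    if (ends)
      r.nskipped1 = i;
    i++;
  }
  i = 0;  // restart counting for the second chain
  for (bool ends : b) {
    if (ends)
      r.nskipped2 = i;
    i++;
  }
  return r;
}
```

For chains `{false, true, false}` and `{true, false}`, the buggy version clobbers `nskipped1` with 3 and never sets `nskipped2`, whereas the fixed version yields 1 and 0.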
[gcc r14-11495] ipa: Do not modify cgraph edges from thunk clones during inlining (PR116572)
https://gcc.gnu.org/g:5312a8f62a6bcae36f6aa40f88c8b58dfae7db21 commit r14-11495-g5312a8f62a6bcae36f6aa40f88c8b58dfae7db21 Author: Martin Jambor Date: Fri Mar 14 16:07:01 2025 +0100 ipa: Do not modify cgraph edges from thunk clones during inlining (PR116572) In PR 116572 we hit an assert that a thunk which does not have a body looks like it has one. It does not, but the call_stmt of its outgoing edge points to a statement, which it should not. In fact it has several outgoing call graph edges, which cannot be. The problem is that the code updating the edges to reflect inlining into the master clone (an ex-thunk, unlike the clone, which is still an unexpanded thunk) is also being applied to the clone during inlining into the master clone. This patch simply makes the code skip unexpanded thunk clones. gcc/ChangeLog: 2025-03-13 Martin Jambor PR ipa/116572 * cgraph.cc (cgraph_update_edges_for_call_stmt): Do not update edges of clones that are unexpanded thunks. Assert that the node passed as the parameter is not an unexpanded thunk. gcc/testsuite/ChangeLog: 2025-03-13 Martin Jambor PR ipa/116572 * g++.dg/ipa/pr116572.C: New test. (cherry picked from commit 075ec330307c5b1fe5ed166a633c718c06b01437) Diff: --- gcc/cgraph.cc | 7 +-- gcc/testsuite/g++.dg/ipa/pr116572.C | 37 + 2 files changed, 42 insertions(+), 2 deletions(-) diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc index 8226c7d96e05..dc18a16f8917 100644 --- a/gcc/cgraph.cc +++ b/gcc/cgraph.cc @@ -1710,12 +1710,15 @@ cgraph_update_edges_for_call_stmt (gimple *old_stmt, tree old_decl, cgraph_node *node; gcc_checking_assert (orig); + gcc_assert (!orig->thunk); cgraph_update_edges_for_call_stmt_node (orig, old_stmt, old_decl, new_stmt); if (orig->clones) for (node = orig->clones; node != orig;) { - cgraph_update_edges_for_call_stmt_node (node, old_stmt, old_decl, - new_stmt); + /* Do not attempt to adjust bodies of yet unexpanded thunks. 
*/ + if (!node->thunk) + cgraph_update_edges_for_call_stmt_node (node, old_stmt, old_decl, + new_stmt); if (node->clones) node = node->clones; else if (node->next_sibling_clone) diff --git a/gcc/testsuite/g++.dg/ipa/pr116572.C b/gcc/testsuite/g++.dg/ipa/pr116572.C new file mode 100644 index ..909568e1c72c --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr116572.C @@ -0,0 +1,37 @@ +/* { dg-do compile } */ +/* { dg-options "-std=c++20 -O3 -fsanitize=undefined" } */ + +long v; +template struct A; +template , typename = C> +class B; +template <> +struct A +{ + static int foo(char *s, const char *t, long n) { return __builtin_memcmp(s, t, n); } +}; +template +struct B { + long b; + B(const C *); + C *bar() const; + constexpr unsigned long baz(const C *, unsigned long, unsigned long) const noexcept; + void baz() { C c; baz(&c, 0, v); } +}; +template +constexpr unsigned long +B::baz(const C *s, unsigned long, unsigned long n) const noexcept +{ + C *x = bar(); if (!x) return b; D::foo(x, s, n); return 0; +} +namespace { +struct F { virtual ~F() {} }; +struct F2 { virtual void foo(B) const; }; +struct F3 : F, F2 { void foo(B s) const { s.baz(); } } f; +} +int +main() +{ + F *p; + dynamic_cast(p)->foo(""); +}
[gcc r15-9250] sra: Avoid creating TBAA hazards (PR118924)
https://gcc.gnu.org/g:07d243670020b339380194f6125cde87ada56148 commit r15-9250-g07d243670020b339380194f6125cde87ada56148 Author: Martin Jambor Date: Mon Apr 7 13:32:09 2025 +0200 sra: Avoid creating TBAA hazards (PR118924) The testcase in PR 118924, when compiled on Aarch64, contains a gimple aggregate assignment statement between different types which are types_compatible_p but behave differently for the purposes of alias analysis. SRA replaces the statement with a series of scalar assignments which however have LHS access chains modeled on the RHS type and so do not alias with subsequent reads and so are DSEd. SRA clearly gets its "same_access_path" logic subtly wrong. One issue is that the same_access_path_p function probably should be implemented more along the lines of (parts of ao_compare::compare_ao_refs) instead of internally relying on operand_equal_p. That is however not the problem in the PR and so I will deal with it only later. The issue here is that even when the access path is the same, it must not be bolted on an aggregate type that does not match. This patch does that, taking just one simple function from the ao_compare::compare_ao_refs machinery and using it to detect the situation. The rest is just merging the information in between accesses of the same access group. I looked at how many times we come across such assignments during "make stage2-bubble" of GCC (configured with only c and C++ and without multilib and libsanitizers) and on an x86_64 there were 87924 such assignments (though now I realize not all of them had to be aggregate), so they do happen. The patch leads to about a 5% increase of cases where we don't use an "access path" but resort to a MEM_REF (from 90209 to 95204). On an Aarch64, there were 92268 such assignments and the increase of falling back to MEM_REFs was by 4% (but from a bigger base 132983 to 107991). 
gcc/ChangeLog: 2025-04-04 Martin Jambor PR tree-optimization/118924 * tree-ssa-alias-compare.h (types_equal_for_same_type_for_tbaa_p): Declare. * tree-ssa-alias.cc: Include ipa-utils.h. (types_equal_for_same_type_for_tbaa_p): New public overloaded variant. * tree-sra.cc: Include tree-ssa-alias-compare.h. (create_access): Initialzie grp_same_access_path to true. (build_accesses_from_assign): Detect tbaa hazards and clear grp_same_access_path fields of involved accesses when they occur. (sort_and_splice_var_accesses): Take previous values of grp_same_access_path into account. gcc/testsuite/ChangeLog: 2025-03-25 Martin Jambor PR tree-optimization/118924 * g++.dg/tree-ssa/pr118924.C: New test. Diff: --- gcc/testsuite/g++.dg/tree-ssa/pr118924.C | 29 + gcc/tree-sra.cc | 17 ++--- gcc/tree-ssa-alias-compare.h | 2 ++ gcc/tree-ssa-alias.cc| 13 - 4 files changed, 57 insertions(+), 4 deletions(-) diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr118924.C b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C new file mode 100644 index ..c95eacafc9ce --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C @@ -0,0 +1,29 @@ +/* { dg-do run } */ +/* { dg-options "-std=c++17 -O2" } */ + +template struct Vector { + int m_data[Size]; + Vector(int, int, int) {} +}; +enum class E { POINTS, LINES, TRIANGLES }; + +__attribute__((noipa)) +void getName(E type) { + static E check = E::POINTS; + if (type == check) +check = (E)((int)check + 1); + else +__builtin_abort (); +} + +int main() { + int arr[]{0, 1, 2}; + for (auto dim : arr) { +Vector<3> localInvs(1, 1, 1); +localInvs.m_data[dim] = 8; + } + E types[] = {E::POINTS, E::LINES, E::TRIANGLES}; + for (auto primType : types) +getName(primType); + return 0; +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index c26559edc666..ae7cd57a5f23 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -100,6 +100,7 @@ along with GCC; see the file COPYING3. 
If not see #include "builtins.h" #include "tree-sra.h" #include "opts.h" +#include "tree-ssa-alias-compare.h" /* Enumeration of all aggregate reductions we can do. */ enum sra_mode { SRA_MODE_EARLY_IPA, /* early call regularization */ @@ -979,6 +980,7 @@ create_access (tree expr, gimple *stmt, bool write) access->type = TREE_TYPE (expr); access->write = write; access->grp_unscalarizable_region = unscalarizable_region; + access->grp_same_access_path = true; access->stmt = stmt; access->reverse = reverse; @@ -1522,6 +1524,9 @@ build_accesses_from_assign (gimple *stmt) racc = build_access_from_expr_1 (rhs, stmt, false); lacc = build_access_from_expr_1 (lhs, stmt,
[gcc r15-9251] sra: Clear grp_same_access_path of accesses created by total scalarization (PR118924)
https://gcc.gnu.org/g:40445711b8af113ef423d8bcac1a7ce1c47f62d7 commit r15-9251-g40445711b8af113ef423d8bcac1a7ce1c47f62d7 Author: Martin Jambor Date: Mon Apr 7 13:32:10 2025 +0200 sra: Clear grp_same_access_path of accesses created by total scalarization (PR118924) During analysis of PR 118924 it was discussed that total scalarization invents access paths (strings of COMPONENT_REFs and possibly even ARRAY_REFs) which did not exist in the program before, which can have unintended effects on subsequent AA queries. Although not doing that does not mean that SRA cannot create such situations (see the bug for more info), it has been agreed that not doing this is generally better. This patch therefore makes SRA fall back on creating simple MEM_REFs when accessing components of an aggregate corresponding to what a SRA variable now represents. gcc/ChangeLog: 2025-03-26 Martin Jambor PR tree-optimization/118924 * tree-sra.cc (create_total_scalarization_access): Set grp_same_access_path flag to zero. Diff: --- gcc/tree-sra.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index ae7cd57a5f23..302b73e83b8f 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -3462,7 +3462,7 @@ create_total_scalarization_access (struct access *parent, HOST_WIDE_INT pos, access->grp_write = parent->grp_write; access->grp_total_scalarization = 1; access->grp_hint = 1; - access->grp_same_access_path = path_comparable_for_same_access (expr); + access->grp_same_access_path = 0; access->reverse = reverse_storage_order_for_component_p (expr); access->next_sibling = next_sibling;
[gcc r15-9427] ipa-cp: Make propagation of bits in IPA-CP aware of type conversions (PR119318)
https://gcc.gnu.org/g:de1c734a8ae034c92f485e7f58b7fcb1c921ecd2 commit r15-9427-gde1c734a8ae034c92f485e7f58b7fcb1c921ecd2 Author: Martin Jambor Date: Mon Apr 14 14:21:15 2025 +0200 ipa-cp: Make propagation of bits in IPA-CP aware of type conversions (PR119318) After the propagation of constants and value ranges, it turns out that the propagation of known bits also needs to be made aware of any intermediate types in which any arithmetic operations are made and must limit its precision there. This implements just that, using the newly collected and streamed types of the operations involved. This version removed the extra check that the type of a formal parameter is known, pointed out by Honza in his review, because I agree it is currently always known. I have also added the testcase of PR 119530 which is a duplicate of this bug. gcc/ChangeLog: 2025-04-11 Martin Jambor PR ipa/119318 * ipa-cp.cc (ipcp_bits_lattice::meet_with_1): Set all mask bits not covered by precision to one. (ipcp_bits_lattice::meet_with): Likewise. (propagate_bits_across_jump_function): Use the stored operation type to perform meet with other lattices. gcc/testsuite/ChangeLog: 2025-04-11 Martin Jambor PR ipa/119318 * gcc.dg/ipa/pr119318.c: New test. * gcc.dg/ipa/pr119530.c: Likewise. 
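The ChangeLog entry "Set all mask bits not covered by precision to one" can be illustrated with plain 64-bit integers. This is only a sketch under stated assumptions: it uses `std::uint64_t` instead of GCC's `widest_int`, the helper names `cap_mask`, `limit_mask_to_precision` and `all_unknown` are hypothetical, a set mask bit is taken to mean "value of this bit is unknown", and the precision is assumed to be strictly between 0 and 64.

```cpp
#include <cassert>
#include <cstdint>

// Mask with every bit at position >= PRECISION set to one and all lower
// bits zero.  Assumes 0 < precision < 64.
std::uint64_t cap_mask(unsigned precision) {
  return ~((std::uint64_t{1} << precision) - 1);
}

// Mark all bits not covered by PRECISION as unknown, keeping whatever is
// known about the low PRECISION bits.
std::uint64_t limit_mask_to_precision(std::uint64_t mask, unsigned precision) {
  return mask | cap_mask(precision);
}

// True when every bit within PRECISION is unknown, i.e. when nothing
// useful is known about the value in that type.
bool all_unknown(std::uint64_t mask, unsigned precision) {
  std::uint64_t low = (std::uint64_t{1} << precision) - 1;
  return (mask & low) == low;
}
```

For an 8-bit operation type, a mask of `0x0F` becomes `0xFFFFFFFFFFFFFF0F`: the four high bits of the byte stay known, while everything above bit 7 is treated as unknown.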
Diff: --- gcc/ipa-cp.cc | 21 +++- gcc/testsuite/gcc.dg/ipa/pr119318.c | 38 + gcc/testsuite/gcc.dg/ipa/pr119530.c | 21 3 files changed, 75 insertions(+), 5 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 264568989a96..fd2c4cca1365 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -918,6 +918,8 @@ ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask, m_mask |= m_value; m_value &= ~m_mask; + widest_int cap_mask = wi::bit_not (wi::sub (wi::lshift (1, precision), 1)); + m_mask |= cap_mask; if (wi::sext (m_mask, precision) == -1) return set_to_bottom (); @@ -996,6 +998,8 @@ ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, unsigned precision, adjusted_mask |= adjusted_value; adjusted_value &= ~adjusted_mask; } + widest_int cap_mask = wi::bit_not (wi::sub (wi::lshift (1, precision), 1)); + adjusted_mask |= cap_mask; if (wi::sext (adjusted_mask, precision) == -1) return set_to_bottom (); return set_to_constant (adjusted_value, adjusted_mask); @@ -2507,14 +2511,12 @@ propagate_bits_across_jump_function (cgraph_edge *cs, int idx, return dest_lattice->set_to_bottom (); } - unsigned precision = TYPE_PRECISION (parm_type); - signop sgn = TYPE_SIGN (parm_type); - if (jfunc->type == IPA_JF_PASS_THROUGH || jfunc->type == IPA_JF_ANCESTOR) { ipa_node_params *caller_info = ipa_node_params_sum->get (cs->caller); tree operand = NULL_TREE; + tree op_type = NULL_TREE; enum tree_code code; unsigned src_idx; bool keep_null = false; @@ -2524,7 +2526,10 @@ propagate_bits_across_jump_function (cgraph_edge *cs, int idx, code = ipa_get_jf_pass_through_operation (jfunc); src_idx = ipa_get_jf_pass_through_formal_id (jfunc); if (code != NOP_EXPR) - operand = ipa_get_jf_pass_through_operand (jfunc); + { + operand = ipa_get_jf_pass_through_operand (jfunc); + op_type = ipa_get_jf_pass_through_op_type (jfunc); + } } else { @@ -2551,6 +2556,11 @@ propagate_bits_across_jump_function (cgraph_edge *cs, int idx, if (!src_lats->bits_lattice.bottom_p ()) { + if (!op_type) + 
op_type = ipa_get_type (caller_info, src_idx); + + unsigned precision = TYPE_PRECISION (op_type); + signop sgn = TYPE_SIGN (op_type); bool drop_all_ones = keep_null && !src_lats->bits_lattice.known_nonzero_p (); @@ -2570,7 +2580,8 @@ propagate_bits_across_jump_function (cgraph_edge *cs, int idx, = widest_int::from (bm.mask (), TYPE_SIGN (parm_type)); widest_int value = widest_int::from (bm.value (), TYPE_SIGN (parm_type)); - return dest_lattice->meet_with (value, mask, precision); + return dest_lattice->meet_with (value, mask, + TYPE_PRECISION (parm_type)); } } return dest_lattice->set_to_bottom (); diff --git a/gcc/testsuite/gcc.dg/ipa/pr119318.c b/gcc/testsuite/gcc.dg/ipa/pr119318.c new file mode 100644 index ..8e62ec5e3503 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr119318.c @@ -0,0 +1,38 @@ +/* { dg-do run } */ +/* { dg-require-effective-target int128 } */ +/* { dg-additional-options "-Wno-psabi -w" } */ +/* { dg-options "-Wno-psabi
[gcc r15-9486] ipa-bit-cp: Fix adjusting value according to mask (PR119803)
https://gcc.gnu.org/g:b4cf69503bcb32491dbd7ab63fe7f0f9fcdcca38 commit r15-9486-gb4cf69503bcb32491dbd7ab63fe7f0f9fcdcca38 Author: Martin Jambor Date: Tue Apr 15 15:55:34 2025 +0200 ipa-bit-cp: Fix adjusting value according to mask (PR119803) In my fix for PR 119318 I put mask calculation in ipcp_bits_lattice::meet_with_1 above a final fix to value so that all the bits in the value which are meaningless according to mask have value zero, which has tripped a validator in PR 119803. This patch fixes that by moving the adjustment down. Even though the fix for PR 119318 did a similar thing in ipcp_bits_lattice::meet_with, the same is not necessary because that code path then feeds the new value and mask to ipcp_bits_lattice::set_to_constant which does the final adjustment correctly. In both places, however, Jakub proposed a better way of calculating cap_mask and so I have changed it accordingly. gcc/ChangeLog: 2025-04-15 Martin Jambor PR ipa/119803 * ipa-cp.cc (ipcp_bits_lattice::meet_with_1): Move m_value adjustment according to m_mask below the adjustment of the latter according to cap_mask. Optimize the calculation of cap_mask a bit. (ipcp_bits_lattice::meet_with): Optimize the calculation of cap_mask a bit. gcc/testsuite/ChangeLog: 2025-04-15 Martin Jambor PR ipa/119803 * gcc.dg/ipa/pr119803.c: New test. 
Co-authored-by: Jakub Jelinek Diff: --- gcc/ipa-cp.cc | 6 +++--- gcc/testsuite/gcc.dg/ipa/pr119803.c | 16 2 files changed, 19 insertions(+), 3 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 379fbc5dd637..806c2bdc97f2 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -933,13 +933,13 @@ ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask, m_mask = (m_mask | mask) | (m_value ^ value); if (drop_all_ones) m_mask |= m_value; - m_value &= ~m_mask; - widest_int cap_mask = wi::bit_not (wi::sub (wi::lshift (1, precision), 1)); + widest_int cap_mask = wi::shifted_mask (0, precision, true); m_mask |= cap_mask; if (wi::sext (m_mask, precision) == -1) return set_to_bottom (); + m_value &= ~m_mask; return m_mask != old_mask; } @@ -1015,7 +1015,7 @@ ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, unsigned precision, adjusted_mask |= adjusted_value; adjusted_value &= ~adjusted_mask; } - widest_int cap_mask = wi::bit_not (wi::sub (wi::lshift (1, precision), 1)); + widest_int cap_mask = wi::shifted_mask (0, precision, true); adjusted_mask |= cap_mask; if (wi::sext (adjusted_mask, precision) == -1) return set_to_bottom (); diff --git a/gcc/testsuite/gcc.dg/ipa/pr119803.c b/gcc/testsuite/gcc.dg/ipa/pr119803.c new file mode 100644 index ..1a7bfd25018a --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr119803.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +extern void f(int p); +int a, b; +char c; +static int d(int e) { return !e || a == 1 ? 0 : a / e; } +static void h(short e) { + int g = d(e); + f(g); +} +void i() { + c = 128; + h(c); + b = d(65536); +}
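The ordering that this fix establishes in ipcp_bits_lattice::meet_with_1 can be illustrated with a small stand-alone sketch. It is not the GCC code: a plain uint64_t stands in for widest_int, the names KnownBits, cap_mask and meet are hypothetical, drop_all_ones is left out, and the precision is assumed to be greater than 0 and smaller than 64.

```cpp
#include <cstdint>

// Hypothetical stand-in for the bits lattice; uint64_t plays the role of
// widest_int.  A set bit in MASK means "this bit is unknown".
struct KnownBits {
  uint64_t value;  // known bit values, expected to be zero where mask is set
  uint64_t mask;   // 1 = unknown bit
};

// All bits at or above PRECISION set, the low PRECISION bits clear.
// Assumes 0 < precision < 64.
inline uint64_t cap_mask(unsigned precision) {
  return ~((uint64_t{1} << precision) - 1);
}

// Meet LAT with (VALUE, MASK) in the given PRECISION.  Returns false when
// nothing is known about the low PRECISION bits any more (the "bottom" case).
inline bool meet(KnownBits &lat, uint64_t value, uint64_t mask,
                 unsigned precision) {
  // Bits unknown on either side, or known but different, become unknown.
  lat.mask = (lat.mask | mask) | (lat.value ^ value);
  // Everything outside the precision of the operation type is unknown.
  lat.mask |= cap_mask(precision);
  const uint64_t low = (uint64_t{1} << precision) - 1;
  if ((lat.mask & low) == low)
    return false;
  // Only now clear the value bits that the final mask marks as unknown.
  lat.value &= ~lat.mask;
  return true;
}
```

Clearing the value after the mask has been capped is what keeps the invariant that no value bit is set where the mask says "unknown"; doing it before the capping, as in the PR 119318 version, can leave stale value bits above the precision.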
[gcc r14-11682] sra: Avoid creating TBAA hazards (PR118924)
https://gcc.gnu.org/g:19dd791b3a7166df0766dfd0b5e6918f8e3d1bba commit r14-11682-g19dd791b3a7166df0766dfd0b5e6918f8e3d1bba Author: Martin Jambor Date: Mon Apr 7 13:32:09 2025 +0200 sra: Avoid creating TBAA hazards (PR118924) The testcase in PR 118924, when compiled on Aarch64, contains a gimple aggregate assignment statement between different types which are types_compatible_p but behave differently for the purposes of alias analysis. SRA replaces the statement with a series of scalar assignments which, however, have LHS access chains modeled on the RHS type and so do not alias with subsequent reads and so are DSEd. SRA clearly gets its "same_access_path" logic subtly wrong. One issue is that the same_access_path_p function probably should be implemented more along the lines of (parts of ao_compare::compare_ao_refs) instead of internally relying on operand_equal_p. That is however not the problem in the PR and so I will deal with it only later. The issue here is that even when the access path is the same, it must not be bolted on an aggregate type that does not match. This patch does that, taking just one simple function from the ao_compare::compare_ao_refs machinery and using it to detect the situation. The rest is just merging the information in between accesses of the same access group. I looked at how many times we come across such assignment during "make stage2-bubble" of GCC (configured with only c and C++ and without multilib and libsanitizers) and on x86_64 there were 87924 such assignments (though now I realize not all of them had to be aggregate), so they do happen. The patch leads to about 5% increase of cases where we don't use an "access path" but resort to a MEM_REF (from 90209 to 95204). On Aarch64, there were 92268 such assignments and the increase of falling back to MEM_REFs was by 4% (but from a bigger base 132983 to 107991). 
gcc/ChangeLog: 2025-04-04 Martin Jambor PR tree-optimization/118924 * tree-ssa-alias-compare.h (types_equal_for_same_type_for_tbaa_p): Declare. * tree-ssa-alias.cc: Include ipa-utils.h. (types_equal_for_same_type_for_tbaa_p): New public overloaded variant. * tree-sra.cc: Include tree-ssa-alias-compare.h. (create_access): Initialize grp_same_access_path to true. (build_accesses_from_assign): Detect tbaa hazards and clear grp_same_access_path fields of involved accesses when they occur. (sort_and_splice_var_accesses): Take previous values of grp_same_access_path into account. gcc/testsuite/ChangeLog: 2025-03-25 Martin Jambor PR tree-optimization/118924 * g++.dg/tree-ssa/pr118924.C: New test. (cherry picked from commit 07d243670020b339380194f6125cde87ada56148) Diff: --- gcc/testsuite/g++.dg/tree-ssa/pr118924.C | 29 + gcc/tree-sra.cc | 17 ++--- gcc/tree-ssa-alias-compare.h | 2 ++ gcc/tree-ssa-alias.cc| 13 - 4 files changed, 57 insertions(+), 4 deletions(-) diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr118924.C b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C new file mode 100644 index ..c95eacafc9ce --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C @@ -0,0 +1,29 @@ +/* { dg-do run } */ +/* { dg-options "-std=c++17 -O2" } */ + +template struct Vector { + int m_data[Size]; + Vector(int, int, int) {} +}; +enum class E { POINTS, LINES, TRIANGLES }; + +__attribute__((noipa)) +void getName(E type) { + static E check = E::POINTS; + if (type == check) +check = (E)((int)check + 1); + else +__builtin_abort (); +} + +int main() { + int arr[]{0, 1, 2}; + for (auto dim : arr) { +Vector<3> localInvs(1, 1, 1); +localInvs.m_data[dim] = 8; + } + E types[] = {E::POINTS, E::LINES, E::TRIANGLES}; + for (auto primType : types) +getName(primType); + return 0; +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index c91e40ef7e71..e1243dd0441d 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -100,6 +100,7 @@ along with GCC; see the file COPYING3. 
If not see #include "builtins.h" #include "tree-sra.h" #include "opts.h" +#include "tree-ssa-alias-compare.h" /* Enumeration of all aggregate reductions we can do. */ enum sra_mode { SRA_MODE_EARLY_IPA, /* early call regularization */ @@ -979,6 +980,7 @@ create_access (tree expr, gimple *stmt, bool write) access->type = TREE_TYPE (expr); access->write = write; access->grp_unscalarizable_region = unscalarizable_region; + access->grp_same_access_path = true; access->stmt = stmt; access->reverse = reverse; @@ -1522,6 +1524,9 @@ build_accesses_from_assign (gimple *stmt) racc = build_access
[gcc r14-11683] sra: Clear grp_same_access_path of acesses created by total scalarization (PR118924)
https://gcc.gnu.org/g:cd7c5d9729851940ab6bb7a8522a548c62e8dade commit r14-11683-gcd7c5d9729851940ab6bb7a8522a548c62e8dade Author: Martin Jambor Date: Mon Apr 7 13:32:10 2025 +0200 sra: Clear grp_same_access_path of acesses created by total scalarization (PR118924) During analysis of PR 118924 it was discussed that total scalarization invents access paths (strings of COMPONENT_REFs and possibly even ARRAY_REFs) which did not exist in the program before and which can have unintended effects on subsequent AA queries. Although not doing that does not mean that SRA cannot create such situations (see the bug for more info), it has been agreed that not doing this is generally better. This patch therefore makes SRA fall back on creating simple MEM_REFs when accessing components of an aggregate corresponding to what a SRA variable now represents. gcc/ChangeLog: 2025-03-26 Martin Jambor PR tree-optimization/118924 * tree-sra.cc (create_total_scalarization_access): Set grp_same_access_path flag to zero. (cherry picked from commit 40445711b8af113ef423d8bcac1a7ce1c47f62d7) Diff: --- gcc/tree-sra.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index e1243dd0441d..46ddd41fdcb9 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -3436,7 +3436,7 @@ create_total_scalarization_access (struct access *parent, HOST_WIDE_INT pos, access->grp_write = parent->grp_write; access->grp_total_scalarization = 1; access->grp_hint = 1; - access->grp_same_access_path = path_comparable_for_same_access (expr); + access->grp_same_access_path = 0; access->reverse = reverse_storage_order_for_component_p (expr); access->next_sibling = next_sibling;
[gcc r12-11079] sra: Clear grp_same_access_path of acesses created by total scalarization (PR118924)
https://gcc.gnu.org/g:d4d12a548f210371609e85f6d2f4f3ee0e2b04f2 commit r12-11079-gd4d12a548f210371609e85f6d2f4f3ee0e2b04f2 Author: Martin Jambor Date: Mon Apr 7 13:32:10 2025 +0200 sra: Clear grp_same_access_path of acesses created by total scalarization (PR118924) During analysis of PR 118924 it was discussed that total scalarization invents access paths (strings of COMPONENT_REFs and possibly even ARRAY_REFs) which did not exist in the program before and which can have unintended effects on subsequent AA queries. Although not doing that does not mean that SRA cannot create such situations (see the bug for more info), it has been agreed that not doing this is generally better. This patch therefore makes SRA fall back on creating simple MEM_REFs when accessing components of an aggregate corresponding to what a SRA variable now represents. gcc/ChangeLog: 2025-03-26 Martin Jambor PR tree-optimization/118924 * tree-sra.cc (create_total_scalarization_access): Set grp_same_access_path flag to zero. (cherry picked from commit 40445711b8af113ef423d8bcac1a7ce1c47f62d7) Diff: --- gcc/tree-sra.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 5a9eaf31b6e9..91af2aef8b4c 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -3130,7 +3130,7 @@ create_total_scalarization_access (struct access *parent, HOST_WIDE_INT pos, access->grp_write = parent->grp_write; access->grp_total_scalarization = 1; access->grp_hint = 1; - access->grp_same_access_path = path_comparable_for_same_access (expr); + access->grp_same_access_path = 0; access->reverse = reverse_storage_order_for_component_p (expr); access->next_sibling = next_sibling;
[gcc r15-9429] ipa-cp: Use the stored and streamed pass-through types in ipa-vr (PR118785)
https://gcc.gnu.org/g:4f19487f2606d25516d31f0279101deea9772da4 commit r15-9429-g4f19487f2606d25516d31f0279101deea9772da4 Author: Martin Jambor Date: Mon Apr 14 14:21:15 2025 +0200 ipa-cp: Use the stored and streamed pass-through types in ipa-vr (PR118785) This patch revisits the fix for PR 118785 and instead of deducing the necessary operation type it just uses the value collected and streamed by an earlier patch. The main advantage is that we do not rely on expr_type_first_operand_type_p enumerating all operations. gcc/ChangeLog: 2025-03-20 Martin Jambor PR ipa/118785 * ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Use the stored and streamed type of arithmetic pass-through functions. Diff: --- gcc/ipa-cp.cc | 28 ++-- 1 file changed, 2 insertions(+), 26 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 637bc49f0482..21033c666bf4 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1735,24 +1735,7 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr, const value_range *inter_vr; if (operation != NOP_EXPR) { - /* Since we construct arithmetic jump functions even when there is a - type conversion in between the operation encoded in the jump - function and when it is passed in a call argument, the IPA - propagation phase must also perform the operation and conversion - in two separate steps. - -TODO: In order to remove the use of expr_type_first_operand_type_p -predicate we would need to stream the operation type, ideally -encoding the whole jump function as a series of expr_eval_op -structures. 
*/ - - tree operation_type; - if (expr_type_first_operand_type_p (operation)) - operation_type = src_type; - else if (operation == ABSU_EXPR) - operation_type = unsigned_type_for (src_type); - else - return; + tree operation_type = ipa_get_jf_pass_through_op_type (jfunc); op_res.set_varying (operation_type); if (!ipa_vr_operation_and_type_effects (op_res, src_vr, operation, operation_type, src_type)) @@ -1782,14 +1765,7 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr, value_range op_vr (TREE_TYPE (operand)); ipa_get_range_from_ip_invariant (op_vr, operand, context_node); - tree operation_type; - if (TREE_CODE_CLASS (operation) == tcc_comparison) -operation_type = boolean_type_node; - else if (expr_type_first_operand_type_p (operation)) -operation_type = src_type; - else -return; - + tree operation_type = ipa_get_jf_pass_through_op_type (jfunc); value_range op_res (operation_type); if (!ipa_vr_supported_type_p (operation_type) || !handler.operand_check_p (operation_type, src_type, op_vr.type ())
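Why the operation and the conversion have to be simulated as two separate steps, each in its own type, can be seen on a tiny stand-alone example. This is only an illustration and does not use any GCC internals; the helper add_one_in_type is hypothetical and simply enumerates the results of "x + 1" when the addition is carried out in a given operation type and the result is converted to int afterwards.

```cpp
#include <set>

// Evaluate "x + 1" for every x in [lo, hi].  The addition is performed in
// OpType (the "operation type"); only its result is converted to int, which
// plays the role of the parameter type of the callee.
template <typename OpType>
std::set<int> add_one_in_type(int lo, int hi) {
  std::set<int> results;
  for (int x = lo; x <= hi; ++x) {
    // The explicit cast back to OpType models arithmetic in that type.
    OpType r = static_cast<OpType>(static_cast<OpType>(x) + 1);
    results.insert(static_cast<int>(r));
  }
  return results;
}
```

For the input range [254, 255], performing the addition in unsigned char wraps around and yields {0, 255}, whereas performing it in int yields {255, 256}. A range derived with the wrong operation type would therefore describe values the callee can never receive, or miss values it can.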
[gcc r15-9426] ipa: Record and stream result types of arithemetic jump functions
https://gcc.gnu.org/g:f33d2e6b532304d487193667e6b5d8f8d7df2bf4 commit r15-9426-gf33d2e6b532304d487193667e6b5d8f8d7df2bf4 Author: Martin Jambor Date: Mon Apr 14 14:21:14 2025 +0200 ipa: Record and stream result types of arithemetic jump functions In order to replace the use of somewhat unwieldy expr_type_first_operand_type_p we need to record and stream the types of results of operations recorded in arithmetic jump functions. This is necessary so that we can then simulate them at the IPA stage with the corresponding precision and signedness. This patch does the recording and streaming; the following one adds the use of the data. Per Honza's request this version also checks that we do not put VLA types into the global LTO stream, even though I was not able to actually craft a test-case that would do that without them. gcc/ChangeLog: 2025-04-11 Martin Jambor PR ipa/118097 PR ipa/118785 PR ipa/119318 * lto-streamer.h (lto_variably_modified_type_p): Declare. * ipa-prop.h (ipa_pass_through_data): New field op_type. (ipa_get_jf_pass_through_op_type): New function. * ipa-prop.cc: Include lto-streamer.h. (ipa_dump_jump_function): Dump also pass-through operation types, if any. Dump pass-through operands only if not NULL. (ipa_set_jf_simple_pass_through): Set op_type accordingly. (compute_complex_assign_jump_func): Set op_type of arithmetic pass-through jump_functions. (analyze_agg_content_value): Update lhs when walking assignment copies. Set op_type of aggregate arithmetic pass-through jump_functions. (update_jump_functions_after_inlining): Also transfer the operation type from the source arithmetic pass-through jump function to the destination jump function. (ipa_write_jump_function): Stream also the op_type when necessary. (ipa_read_jump_function): Likewise. (ipa_agg_pass_through_jf_equivalent_p): Also compare operation types. * lto-streamer-out.cc (lto_variably_modified_type_p): Make public. 
Diff: --- gcc/ipa-prop.cc | 76 - gcc/ipa-prop.h | 15 ++ gcc/lto-streamer-out.cc | 2 +- gcc/lto-streamer.h | 1 + 4 files changed, 80 insertions(+), 14 deletions(-) diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index a120f942dc25..49d68ab044b7 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -60,6 +60,7 @@ along with GCC; see the file COPYING3. If not see #include "gimple-range.h" #include "value-range-storage.h" #include "vr-values.h" +#include "lto-streamer.h" /* Function summary where the parameter infos are actually stored. */ ipa_node_params_t *ipa_node_params_sum = NULL; @@ -454,7 +455,11 @@ ipa_dump_jump_function (FILE *f, ipa_jump_func *jump_func, if (jump_func->value.pass_through.operation != NOP_EXPR) { fprintf (f, " "); - print_generic_expr (f, jump_func->value.pass_through.operand); + if (jump_func->value.pass_through.operand) + print_generic_expr (f, jump_func->value.pass_through.operand); + fprintf (f, " (in type "); + print_generic_expr (f, jump_func->value.pass_through.op_type); + fprintf (f, ")"); } if (jump_func->value.pass_through.agg_preserved) fprintf (f, ", agg_preserved"); @@ -510,7 +515,11 @@ ipa_dump_jump_function (FILE *f, ipa_jump_func *jump_func, if (item->value.pass_through.operation != NOP_EXPR) { fprintf (f, " "); - print_generic_expr (f, item->value.pass_through.operand); + if (item->value.pass_through.operand) + print_generic_expr (f, item->value.pass_through.operand); + fprintf (f, " (in type "); + print_generic_expr (f, jump_func->value.pass_through.op_type); + fprintf (f, ")"); } } else if (item->jftype == IPA_JF_CONST) @@ -682,6 +691,7 @@ ipa_set_jf_simple_pass_through (struct ipa_jump_func *jfunc, int formal_id, { jfunc->type = IPA_JF_PASS_THROUGH; jfunc->value.pass_through.operand = NULL_TREE; + jfunc->value.pass_through.op_type = NULL_TREE; jfunc->value.pass_through.formal_id = formal_id; jfunc->value.pass_through.operation = NOP_EXPR; jfunc->value.pass_through.agg_preserved = agg_preserved; @@ -692,10 +702,11 @@ 
ipa_set_jf_simple_pass_through (struct ipa_jump_func *jfunc, int formal_id, static void ipa_set_jf_unary_pass_through (struct ipa_jump_func *jfunc, int formal_id, - enum tree_code operation) + enum tree_code operation, tree op_type) { jfunc->type = IPA_JF_PASS_THROUGH; j
[gcc r15-9428] ipa-cp: Make dumping of widest_ints even more sane
https://gcc.gnu.org/g:044d0d1ee1a61c21670068485d4a250edfbb695a commit r15-9428-g044d0d1ee1a61c21670068485d4a250edfbb695a Author: Martin Jambor Date: Mon Apr 14 14:21:15 2025 +0200 ipa-cp: Make dumping of widest_ints even more sane This patch just introduces a form of dumping of widest ints that only have zeros in the lowest 128 bits so that instead of printing thousands of f's the output looks like: Bits: value = 0x, mask = all ones folled by 0x and then makes sure we use the function not only to print bits but also to print masks where values like these can also occur. gcc/ChangeLog: 2025-03-21 Martin Jambor * ipa-cp.cc (ipcp_print_widest_int): Also add a truncated form of dumping of widest ints which only have zeros in the lowest 128 bits. Update the comment. (ipcp_bits_lattice::print): Also dump the mask using ipcp_print_widest_int. (ipcp_store_vr_results): Likewise. Diff: --- gcc/ipa-cp.cc | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index fd2c4cca1365..637bc49f0482 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -307,14 +307,21 @@ ipcp_lattice::print (FILE * f, bool dump_sources, bool dump_benefits) fprintf (f, "\n"); } -/* If VALUE has all bits set to one, print "-1" to F, otherwise simply print it - hexadecimally to F. */ +/* Print VALUE to F in a form which in usual cases does not take thousands of + characters. 
*/ static void ipcp_print_widest_int (FILE *f, const widest_int &value) { if (wi::eq_p (wi::bit_not (value), 0)) fprintf (f, "-1"); + else if (wi::eq_p (wi::bit_not (wi::bit_or (value, + wi::sub (wi::lshift (1, 128), + 1))), 0)) +{ + fprintf (f, "all ones folled by "); + print_hex (wi::bit_and (value, wi::sub (wi::lshift (1, 128), 1)), f); +} else print_hex (value, f); } @@ -331,7 +338,7 @@ ipcp_bits_lattice::print (FILE *f) fprintf (f, " Bits: value = "); ipcp_print_widest_int (f, get_value ()); fprintf (f, ", mask = "); - print_hex (get_mask (), f); + ipcp_print_widest_int (f, get_mask ()); fprintf (f, "\n"); } } @@ -6437,7 +6444,7 @@ ipcp_store_vr_results (void) fprintf (dump_file, " param %i: value = ", i); ipcp_print_widest_int (dump_file, bits->get_value ()); fprintf (dump_file, ", mask = "); - print_hex (bits->get_mask (), dump_file); + ipcp_print_widest_int (dump_file, bits->get_mask ()); fprintf (dump_file, "\n"); } }
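The three output forms chosen by ipcp_print_widest_int can be mimicked outside of GCC with a short sketch. The assumptions are loud ones: a 256-bit value stored as four 64-bit limbs replaces widest_int, the names Wide, hex_of and format_wide are hypothetical, and the hexadecimal formatting only approximates what print_hex produces. The "all ones folled by " text is spelled as in the patch.

```cpp
#include <array>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical 256-bit stand-in for widest_int, least significant limb first.
using Wide = std::array<uint64_t, 4>;

// Hexadecimal form of limbs [0, top_limb] without leading zero limbs.
static std::string hex_of(const Wide &v, int top_limb) {
  std::ostringstream os;
  os << "0x" << std::hex;
  bool started = false;
  for (int i = top_limb; i >= 0; --i) {
    if (!started) {
      if (v[i] == 0 && i != 0)
        continue;  // skip leading zero limbs
      os << v[i];
      started = true;
    } else {
      // Lower limbs are zero-padded to their full 16 hex digits.
      os << std::setfill('0') << std::setw(16) << v[i];
    }
  }
  return os.str();
}

// Pick one of the three forms: "-1", the truncated form, or plain hex.
std::string format_wide(const Wide &v) {
  const uint64_t ones = ~uint64_t{0};
  if (v[0] == ones && v[1] == ones && v[2] == ones && v[3] == ones)
    return "-1";
  if (v[2] == ones && v[3] == ones)
    // Every bit above the lowest 128 is one, so print only the low 128 bits.
    return "all ones folled by " + hex_of(v, 1);
  return hex_of(v, 3);
}
```

The truncated form matters mostly for masks, where all bits above the precision of a type are set and a plain hexadecimal dump would consist almost entirely of f's.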
[gcc r15-9430] ipa-cp: Use the collected pass-through types to propgate constants (PR118097)
https://gcc.gnu.org/g:6b6611f81476b6375c90859d85331c2981a2ce51 commit r15-9430-g6b6611f81476b6375c90859d85331c2981a2ce51 Author: Martin Jambor Date: Mon Apr 14 14:21:15 2025 +0200 ipa-cp: Use the collected pass-through types to propgate constants (PR118097) This patch revisits the fix for PR 118097 and instead of deducing the necessary operation type it just uses the value collected and streamed by an earlier patch. It is bigger than the ones for propagating value ranges and known bits because we track constants both in parameters themselves and also in memory they point to or within aggregates, we clone functions for them and we do fancy things for some types of recursive calls. In the case of constants in aggregates or passed by reference, the situation should not change because the code creating jump functions for them does not allow type-casts, unlike for the plain ones. However, this patch changes how we handle them for the sake of consistency and also so that we can try and eliminate this limitation in the next stage 1. gcc/ChangeLog: 2025-03-20 Martin Jambor PR ipa/118097 * ipa-cp.cc (ipa_get_jf_arith_result): Require res_operand for anything except NOP_EXPR or ADDR_EXPR, document it and remove the code trying to deduce it. (ipa_value_from_jfunc): Use the stored and streamed type of arithmetic pass-through functions. (ipa_agg_value_from_jfunc): Use the stored and streamed type of arithmetic pass-through functions, convert to the type used to store the value if necessary. (get_val_across_arith_op): New parameter op_type, pass it to ipa_get_jf_arith_result. (propagate_vals_across_arith_jfunc): New parameter op_type, pass it to get_val_across_arith_op. (propagate_vals_across_pass_through): Use the stored and streamed type of arithmetic pass-through functions. (propagate_aggregate_lattice): Likewise. 
(push_agg_values_for_index_from_edge): Use the stored and streamed type of arithmetic pass-through functions, convert to the type used to store the value if necessary. Diff: --- gcc/ipa-cp.cc | 94 +-- 1 file changed, 52 insertions(+), 42 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 21033c666bf4..26b1496f29bb 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1478,10 +1478,12 @@ ipacp_value_safe_for_type (tree param_type, tree value) return NULL_TREE; } -/* Return the result of a (possibly arithmetic) operation on the constant value - INPUT. OPERAND is 2nd operand for binary operation. RES_TYPE is the type - in which any operation is to be performed. Return NULL_TREE if that cannot - be determined or be considered an interprocedural invariant. */ +/* Return the result of a (possibly arithmetic) operation determined by OPCODE + on the constant value INPUT. OPERAND is 2nd operand for binary operation + and is required for binary operations. RES_TYPE, required when opcode is + not NOP_EXPR, is the type in which any operation is to be performed. Return + NULL_TREE if that cannot be determined or be considered an interprocedural + invariant. 
*/ static tree ipa_get_jf_arith_result (enum tree_code opcode, tree input, tree operand, @@ -1502,16 +1504,6 @@ ipa_get_jf_arith_result (enum tree_code opcode, tree input, tree operand, return NULL_TREE; } - if (!res_type) -{ - if (TREE_CODE_CLASS (opcode) == tcc_comparison) - res_type = boolean_type_node; - else if (expr_type_first_operand_type_p (opcode)) - res_type = TREE_TYPE (input); - else - return NULL_TREE; -} - if (TREE_CODE_CLASS (opcode) == tcc_unary) res = fold_unary (opcode, res_type, input); else @@ -1595,7 +1587,10 @@ ipa_value_from_jfunc (class ipa_node_params *info, struct ipa_jump_func *jfunc, return NULL_TREE; enum tree_code opcode = ipa_get_jf_pass_through_operation (jfunc); tree op2 = ipa_get_jf_pass_through_operand (jfunc); - tree cstval = ipa_get_jf_arith_result (opcode, input, op2, NULL_TREE); + tree op_type + = (opcode == NOP_EXPR) ? NULL_TREE + : ipa_get_jf_pass_through_op_type (jfunc); + tree cstval = ipa_get_jf_arith_result (opcode, input, op2, op_type); return ipacp_value_safe_for_type (parm_type, cstval); } else @@ -1905,10 +1900,11 @@ ipa_agg_value_from_jfunc (ipa_node_params *info, cgraph_node *node, return NULL_TREE; } - return ipa_get_jf_arith_result (item->value.pass_through.operation, - value, - item->value.pass_through.operand, -
[gcc r12-11080] Add test-case for PR118924
https://gcc.gnu.org/g:81b30ef214690b6521753293bf2fcb2339055b54 commit r12-11080-g81b30ef214690b6521753293bf2fcb2339055b54 Author: Martin Jambor Date: Tue Apr 29 18:24:29 2025 +0200 Add test-case for PR118924 Because the testcase for the issue in master is in a commit I do not plan to backport to GCC 12 but the issue is avoided by my previous one nevertheless, I am backporting the testcase in this one. gcc/testsuite/ChangeLog: 2025-04-29 Martin Jambor PR tree-optimization/118924 * g++.dg/tree-ssa/pr118924.C: New test. Diff: --- gcc/testsuite/g++.dg/tree-ssa/pr118924.C | 29 + 1 file changed, 29 insertions(+) diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr118924.C b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C new file mode 100644 index ..c95eacafc9ce --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C @@ -0,0 +1,29 @@ +/* { dg-do run } */ +/* { dg-options "-std=c++17 -O2" } */ + +template struct Vector { + int m_data[Size]; + Vector(int, int, int) {} +}; +enum class E { POINTS, LINES, TRIANGLES }; + +__attribute__((noipa)) +void getName(E type) { + static E check = E::POINTS; + if (type == check) +check = (E)((int)check + 1); + else +__builtin_abort (); +} + +int main() { + int arr[]{0, 1, 2}; + for (auto dim : arr) { +Vector<3> localInvs(1, 1, 1); +localInvs.m_data[dim] = 8; + } + E types[] = {E::POINTS, E::LINES, E::TRIANGLES}; + for (auto primType : types) +getName(primType); + return 0; +}
[gcc r16-420] ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)
https://gcc.gnu.org/g:fb5829a01651d427a63a12c44ecc8baa47dbfc83 commit r16-420-gfb5829a01651d427a63a12c44ecc8baa47dbfc83 Author: Martin Jambor Date: Tue May 6 17:28:43 2025 +0200 ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852) As described in PR 119852, the output of -fdump-ipa-clones can contain "(null)" as the suffix/reason for cloning when we need to create a clone to hold the original function during recursive inlining. Such a clone is never output and so should not be part of the dump output either. gcc/ChangeLog: 2025-04-23 Martin Jambor PR ipa/119852 * cgraphclones.cc (dump_callgraph_transformation): Document the function. Do not dump if suffix is NULL. gcc/testsuite/ChangeLog: 2025-04-23 Martin Jambor PR ipa/119852 * gcc.dg/ipa/pr119852.c: New test. Diff: --- gcc/cgraphclones.cc | 10 +++- gcc/testsuite/gcc.dg/ipa/pr119852.c | 50 + 2 files changed, 59 insertions(+), 1 deletion(-) diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc index e6223fa1f5cc..bf5bc41cde9c 100644 --- a/gcc/cgraphclones.cc +++ b/gcc/cgraphclones.cc @@ -307,12 +307,20 @@ cgraph_node::expand_all_artificial_thunks () e = e->next_caller; } +/* Dump information about creation of a call graph node clone to the dump file + created by the -fdump-ipa-clones option. ORIGINAL is the function being + cloned, CLONE is the new clone. SUFFIX is a string that helps identify the + reason for cloning, often it is the suffix used by a particular IPA pass to + create unique function names. SUFFIX can be NULL and in that case the + dumping will not take place, which must be the case only for helper clones + which will never be emitted to the output. 
*/ + void dump_callgraph_transformation (const cgraph_node *original, const cgraph_node *clone, const char *suffix) { - if (symtab->ipa_clones_dump_file) + if (suffix && symtab->ipa_clones_dump_file) { fprintf (symtab->ipa_clones_dump_file, "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n", diff --git a/gcc/testsuite/gcc.dg/ipa/pr119852.c b/gcc/testsuite/gcc.dg/ipa/pr119852.c new file mode 100644 index ..eab8d21293cc --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr119852.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-ipa-clones" } */ + +typedef struct rtx_def *rtx; +enum rtx_code { + LAST_AND_UNUSED_RTX_CODE}; +extern const char * const rtx_format[((int) LAST_AND_UNUSED_RTX_CODE)]; +struct rtx_def { + enum rtx_code code; +}; +typedef int (*rtx_function) (rtx *, void *); +extern int for_each_rtx (rtx *, rtx_function, void *); +int +replace_label (rtx *x, void *data) +{ + rtx l = *x; + if (l == (rtx) 0) +{ + { + rtx new_c, new_l; + for_each_rtx (&new_c, replace_label, data); + } +} +} +static int +for_each_rtx_1 (rtx exp, int n, rtx_function f, void *data) +{ + int result, i, j; + const char *format = (rtx_format[(int) (((enum rtx_code) (exp)->code))]); + rtx *x; + for (; format[n] != '\0'; n++) +{ + switch (format[n]) + { + case 'e': + result = (*f) (x, data); + { + result = for_each_rtx_1 (*x, i, f, data); + } + } +} +} +int +for_each_rtx (rtx *x, rtx_function f, void *data) +{ + int i; + return for_each_rtx_1 (*x, i, f, data); +} + +/* { dg-final { scan-ipa-dump-not "(null)" "ipa-clones" } } */
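For consumers of the dump, the format string in dump_callgraph_transformation implies twelve semicolon-separated fields per "Callgraph clone" line. A minimal parsing sketch follows; the names split_fields, CloneRecord and parse_clone_line are hypothetical, only a subset of the fields is kept, and the sketch assumes that neither function names nor file names contain a semicolon.

```cpp
#include <string>
#include <vector>

// Split one dump line into its semicolon-separated fields.
std::vector<std::string> split_fields(const std::string &line) {
  std::vector<std::string> fields;
  std::string::size_type start = 0;
  while (true) {
    const std::string::size_type pos = line.find(';', start);
    if (pos == std::string::npos) {
      fields.push_back(line.substr(start));
      break;
    }
    fields.push_back(line.substr(start, pos - start));
    start = pos + 1;
  }
  return fields;
}

// Hypothetical record holding the fields a live-patching tool might need.
struct CloneRecord {
  std::string original;  // assembler name of the function being cloned
  int original_order;    // its number in the call graph
  std::string clone;     // assembler name of the new clone
  int clone_order;       // number of the clone in the call graph
  std::string reason;    // suffix / reason for cloning
};

// Parse a "Callgraph clone" line; returns false for any other kind of line.
// std::stoi throws if a numeric field is malformed; a real tool would
// probably want to handle that.
bool parse_clone_line(const std::string &line, CloneRecord &out) {
  const std::vector<std::string> f = split_fields(line);
  if (f.size() != 12 || f[0] != "Callgraph clone")
    return false;
  out.original = f[1];
  out.original_order = std::stoi(f[2]);
  out.clone = f[6];
  out.clone_order = std::stoi(f[7]);
  out.reason = f[11];
  return true;
}
```

With a made-up line such as "Callgraph clone;foo;12;t.c;3;1;foo.constprop.0;45;t.c;3;1;constprop", the record would hold foo as the original, foo.constprop.0 as the clone and constprop as the reason. After this patch a NULL suffix produces no line at all, so the last field should no longer read "(null)".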
[gcc r16-419] Document option -fdump-ipa-clones
https://gcc.gnu.org/g:6ecc2fee06bdd60da0e9b3fe6660b553dbdca3ca commit r16-419-g6ecc2fee06bdd60da0e9b3fe6660b553dbdca3ca Author: Martin Jambor Date: Tue May 6 17:28:42 2025 +0200 Document option -fdump-ipa-clones I have noticed that the option -fdump-ipa-clones is not documented although there are users who depend on it. This patch adds the missing documentation along with the description of the information it dumps and the format it uses. I am never quite sure which of the texinfo mark-ups is the most appropriate in which situation, I'll of course incorporate any feedback on this as well as the general wording of the text. After we settle on a version, I'd like to backport the documentation also at least to GCC 15, 14 and 13. Is it perhaps OK for master and the branches or what would better be changed? Thanks, Martin gcc/ChangeLog: 2025-04-23 Martin Jambor * doc/invoke.texi (Developer Options): Document -fdump-ipa-clones. Diff: --- gcc/doc/invoke.texi | 87 + 1 file changed, 87 insertions(+) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 32bc45725de9..90cbb516bc46 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -20774,6 +20774,93 @@ By default, the dump will contain messages about successful optimizations (equivalent to @option{-optimized}) together with low-level details about the analysis. +@opindex fdump-ipa-clones +@item -fdump-ipa-clones + +Create a dump file containing information about creation of call graph +node clones and removals of call graph nodes during inter-procedural +optimizations and transformations. Its main intended use is that tools +that create live-patches can determine the set of functions that need to +be live-patched to completely replace a particular function (see +@option{-flive-patching}). The file name is generated by appending +suffix @code{ipa-clones} to the source file name, and the file is +created in the same directory as the output file. 
Each entry in the +file is on a separate line containing semicolon separated fields. + +In the case of call graph clone creation, the individual fields are: + +@enumerate +@item +String @code{Callgraph clone}. + +@item +Name of the function being cloned as it is presented to the assembler. + +@item +A number that uniquely represents the function being cloned in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. + +@item +The file name of the source file where the function is defined. + +@item +The line on which the function definition is located. + +@item +The column where the function definition is located. + +@item +Name of the new function clone as it is presented to the assembler. + +@item +A number that uniquely represents the new function clone in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. + +@item +The file name of the source file where the source code location of the +new clone points to. + +@item +The line to which the source code location of the new clone points to. + +@item +The column to which the source code location of the new clone points to. + +@item +A string that determines the reason for cloning. + +@end enumerate + +In the case of call graph clone removal, the individual fields are: + +@enumerate +@item +String @code{Callgraph removal}. + +@item +Name of the function being removed as it would be presented to the assembler. + +@item +A number that uniquely represents the function being cloned in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. + +@item +The file name of the source file where the function is defined. + +@item +The line on which the function definition is located. 
+ +@item +The column where the function definition is located. + +@end enumerate + @opindex fdump-lang @item -fdump-lang Dump language-specific information. The file name is made by appending
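A consumer of the dump format documented above might parse clone records roughly as follows. This is a sketch under assumptions: the `CloneRecord` type, its member names, and `parse_clone_record` are hypothetical and not part of GCC, the code assumes that names and file names contain no semicolons, and it needs C++17 for `std::optional`. The field order follows the twelve items listed in the documentation text.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; member names are ours, the order of the
// members mirrors the documented field order.
struct CloneRecord {
  std::string orig_name;  int orig_id;
  std::string orig_file;  int orig_line;  int orig_column;
  std::string clone_name; int clone_id;
  std::string clone_file; int clone_line; int clone_column;
  std::string reason;
};

// Split LINE on semicolons, keeping an empty trailing field if present.
static std::vector<std::string> split_fields(const std::string &line) {
  std::vector<std::string> fields;
  std::string field;
  std::istringstream in(line);
  while (std::getline(in, field, ';'))
    fields.push_back(field);
  if (!line.empty() && line.back() == ';')
    fields.push_back("");
  return fields;
}

// Parse one "Callgraph clone" line; return nothing for any other line.
std::optional<CloneRecord> parse_clone_record(const std::string &line) {
  std::vector<std::string> f = split_fields(line);
  if (f.size() != 12 || f[0] != "Callgraph clone")
    return std::nullopt;
  try {
    // std::stoi throws if a field does not start with a number.
    return CloneRecord{f[1], std::stoi(f[2]), f[3], std::stoi(f[4]),
                       std::stoi(f[5]), f[6], std::stoi(f[7]), f[8],
                       std::stoi(f[9]), std::stoi(f[10]), f[11]};
  } catch (const std::exception &) {
    return std::nullopt;
  }
}
```

Removal records have six fields and start with `Callgraph removal`; this sketch simply rejects them, and a real tool would handle them with a second record type.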
[gcc r16-422] ipa: Drop the default value of suffix parameter of create_clone (PR119852)
https://gcc.gnu.org/g:76c882e341cb330a4e9f677a8c3541d573820255 commit r16-422-g76c882e341cb330a4e9f677a8c3541d573820255 Author: Martin Jambor Date: Tue May 6 17:28:44 2025 +0200 ipa: Drop the default value of suffix parameter of create_clone (PR119852) In PR 119852 we agreed that since the NULL-ness of the suffix parameter should prevent creation of a record in the ipa-clones dump (which is implemented by a previous patch), it should not default to NULL. gcc/ChangeLog: 2025-04-25 Martin Jambor PR ipa/119852 * cgraph.h (cgraph_node::create_clone): Remove the default value of argument suffix. Update function comment. * cgraphclones.cc (cgraph_node::create_clone): Update function comment. * ipa-inline-transform.cc (clone_inlined_nodes): Pass NULL to suffix of create_clone explicitely. * ipa-inline.cc (recursive_inlining): Likewise. * lto-cgraph.cc (input_node): Likewise. Diff: --- gcc/cgraph.h| 10 +++--- gcc/cgraphclones.cc | 7 ++- gcc/ipa-inline-transform.cc | 2 +- gcc/ipa-inline.cc | 2 +- gcc/lto-cgraph.cc | 2 +- 5 files changed, 16 insertions(+), 7 deletions(-) diff --git a/gcc/cgraph.h b/gcc/cgraph.h index 1a59bf609b51..f4ee29e998c3 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -965,15 +965,19 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node If the new node is being inlined into another one, NEW_INLINED_TO should be the outline function the new one is (even indirectly) inlined to. All hooks will see this in node's inlined_to, when invoked. - Can be NULL if the node is not inlined. SUFFIX is string that is appended - to the original name. */ + Should be NULL if the node is not inlined. + + SUFFIX is string that is appended to the original name, it should only be + NULL if NEW_INLINED_TO is not NULL or if the clone being created is + temporary and a record about it should not be added into the ipa-clones + dump file. 
*/ cgraph_node *create_clone (tree decl, profile_count count, bool update_original, vec redirect_callers, bool call_duplication_hook, cgraph_node *new_inlined_to, ipa_param_adjustments *param_adjustments, -const char *suffix = NULL); +const char *suffix); /* Create callgraph node clone with new declaration. The actual body will be copied later at compilation stage. The name of the new clone will be diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc index cb457e5f457f..b45ac4977331 100644 --- a/gcc/cgraphclones.cc +++ b/gcc/cgraphclones.cc @@ -366,9 +366,14 @@ localize_profile (cgraph_node *n) If the new node is being inlined into another one, NEW_INLINED_TO should be the outline function the new one is (even indirectly) inlined to. All hooks - will see this in node's inlined_to, when invoked. Can be NULL if the + will see this in node's inlined_to, when invoked. Should be NULL if the node is not inlined. + SUFFIX is string that is appended to the original name, it should only be + NULL if NEW_INLINED_TO is not NULL or if the clone being created is + temporary and a record about it should not be added into the ipa-clones dump + file. + If PARAM_ADJUSTMENTS is non-NULL, the parameter manipulation information will be overwritten by the new structure. Otherwise the new node will share parameter manipulation information with the original node. 
*/ diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc index e00887be481b..46b8e5bb6790 100644 --- a/gcc/ipa-inline-transform.cc +++ b/gcc/ipa-inline-transform.cc @@ -225,7 +225,7 @@ clone_inlined_nodes (struct cgraph_edge *e, bool duplicate, e->count, update_original, vNULL, true, inlining_into, - NULL); + NULL, NULL); n->used_as_abstract_origin = e->callee->used_as_abstract_origin; e->redirect_callee (n); } diff --git a/gcc/ipa-inline.cc b/gcc/ipa-inline.cc index 38fdbfde1b3b..35e5496d8463 100644 --- a/gcc/ipa-inline.cc +++ b/gcc/ipa-inline.cc @@ -1865,7 +1865,7 @@ recursive_inlining (struct cgraph_edge *edge, { /* We need original clone to copy around. */ master_clone = node->create_clone (node->decl, node->count, - false, vNULL, true, NULL, NULL); + false, vNULL, true, NULL, NULL, NULL); for (e = master_clone->callees; e; e = e->next_callee) if (!e->inline_failed) clone_inlined_nodes (e,
[gcc r16-421] ipa: Fix create_version_clone_with_body declaration and comment
https://gcc.gnu.org/g:1eaee43dc0c6292ce865b460d52474ca14ea1d71 commit r16-421-g1eaee43dc0c6292ce865b460d52474ca14ea1d71 Author: Martin Jambor Date: Tue May 6 17:28:43 2025 +0200 ipa: Fix create_version_clone_with_body declaration and comment I noticed that the name of the fifth parameter of cgraph_node::create_version_clone_with_body is different in the class definition in cgraph.h and in the actual member function definition in cgraphclones.cc. The former (clone_name) is misleading and so this patch changes it to the latter (suffix) which is also used in related functions. The patch also updates the function comment in both places because it clearly became out of date. gcc/ChangeLog: 2025-04-25 Martin Jambor * cgraph.h (cgraph_node::create_version_clone_with_body): Fix function comment. Change the name of clone_name to suffix, in line with the function definition. * cgraphclones.cc (cgraph_node::create_version_clone_with_body): Fix function comment. Diff: --- gcc/cgraph.h| 9 + gcc/cgraphclones.cc | 7 --- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/gcc/cgraph.h b/gcc/cgraph.h index f7b67ed0a6c5..1a59bf609b51 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -1020,11 +1020,12 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node TREE_MAP is a mapping of tree nodes we want to replace with new ones (according to results of prior analysis). - If non-NULL ARGS_TO_SKIP determine function parameters to remove - from new version. - If SKIP_RETURN is true, the new version will return void. + If non-NULL PARAM_ADJUSTMENTS determine how function formal parameters + should be modified in the new version and if it should return void. If non-NULL BLOCK_TO_COPY determine what basic blocks to copy. If non_NULL NEW_ENTRY determine new entry BB of the clone. + SUFFIX is a string that will be used to create a new name for the new + function. If TARGET_ATTRIBUTES is non-null, when creating a new declaration, add the attributes to DECL_ATTRIBUTES. 
And call valid_attribute_p @@ -1039,7 +1040,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node (vec redirect_callers, vec *tree_map, ipa_param_adjustments *param_adjustments, - bitmap bbs_to_copy, basic_block new_entry_block, const char *clone_name, + bitmap bbs_to_copy, basic_block new_entry_block, const char *suffix, tree target_attributes = NULL_TREE, bool version_decl = true); /* Insert a new cgraph_function_version_info node into cgraph_fnver_htab diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc index bf5bc41cde9c..cb457e5f457f 100644 --- a/gcc/cgraphclones.cc +++ b/gcc/cgraphclones.cc @@ -1002,11 +1002,12 @@ cgraph_node::create_version_clone (tree new_decl, TREE_MAP is a mapping of tree nodes we want to replace with new ones (according to results of prior analysis). - If non-NULL ARGS_TO_SKIP determine function parameters to remove - from new version. - If SKIP_RETURN is true, the new version will return void. + If non-NULL PARAM_ADJUSTMENTS determine how function formal parameters + should be modified in the new version and if it should return void. If non-NULL BLOCK_TO_COPY determine what basic blocks to copy. If non_NULL NEW_ENTRY determine new entry BB of the clone. + SUFFIX is a string that will be used to create a new name for the new + function. If TARGET_ATTRIBUTES is non-null, when creating a new declaration, add the attributes to DECL_ATTRIBUTES. And call valid_attribute_p
[gcc r13-9612] Fix a pasto in ao_compare::compare_ao_refs
https://gcc.gnu.org/g:7495787e31c4e5ee6a04c8f05d227a4f0eb7a345 commit r13-9612-g7495787e31c4e5ee6a04c8f05d227a4f0eb7a345 Author: Martin Jambor Date: Tue Mar 11 14:52:44 2025 +0100 Fix a pasto in ao_compare::compare_ao_refs When reading the function ao_compare::compare_ao_refs I came across what I believe to be a copy-and-paste error which this patch fixes. gcc/ChangeLog: 2025-03-10 Martin Jambor * tree-ssa-alias.cc (ao_compare::compare_ao_refs): Fix a copy-and-paste error. (cherry picked from commit dc47161c1f32c3f27d1157ba0de9d98ea1b7fc82) Diff: --- gcc/tree-ssa-alias.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc index dcb65648ec04..a784aebd6537 100644 --- a/gcc/tree-ssa-alias.cc +++ b/gcc/tree-ssa-alias.cc @@ -4292,12 +4292,13 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref *ref2, c1 = p1, nskipped1 = i; i++; } + i = 0; for (tree p2 = ref2->ref; handled_component_p (p2); p2 = TREE_OPERAND (p2, 0)) { if (component_ref_to_zero_sized_trailing_array_p (p2)) end_struct_ref2 = p2; if (ends_tbaa_access_path_p (p2)) - c2 = p2, nskipped1 = i; + c2 = p2, nskipped2 = i; i++; }
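The shape of the bug fixed above is easy to reproduce in miniature. The sketch below is a toy and does not use any GCC data structures: `last_marker_indices` is a hypothetical function that walks two sequences with one shared counter and records, for each sequence, the index of the last element equal to a marker (or -1 if there is none). The two commented lines correspond to the two changes in the patch: resetting the counter before the second loop and writing to the second result variable rather than the first.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy illustration of the corrected pattern, not the GCC code.
std::pair<int, int> last_marker_indices(const std::vector<int> &a,
                                        const std::vector<int> &b,
                                        int marker) {
  int n1 = -1, n2 = -1;
  int i = 0;
  for (int v : a) {
    if (v == marker)
      n1 = i;
    i++;
  }
  i = 0;  // reset the shared counter before the second walk
  for (int v : b) {
    if (v == marker)
      n2 = i;  // record into the second result, not the first
    i++;
  }
  return {n1, n2};
}
```

Without the two corrected lines, the second loop would overwrite the first result with an index that also includes the length of the first sequence, and the second result would never be set.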
[gcc r13-9622] sra: Avoid creating TBAA hazards (PR118924)
https://gcc.gnu.org/g:087d91f9d4e97de66955caa94b42e91180d02d78 commit r13-9622-g087d91f9d4e97de66955caa94b42e91180d02d78 Author: Martin Jambor Date: Mon Apr 7 13:32:09 2025 +0200 sra: Avoid creating TBAA hazards (PR118924) The testcase in PR 118924, when compiled on Aarch64, contains a gimple aggregate assignment statement between different types which are types_compatible_p but behave differently for the purposes of alias analysis. SRA replaces the statement with a series of scalar assignments which however have LHS access chains modeled on the RHS type and so do not alias with subsequent reads and so are DSEd. SRA clearly gets its "same_access_path" logic subtly wrong. One issue is that the same_access_path_p function probably should be implemented more along the lines of (parts of ao_compare::compare_ao_refs) instead of internally relying on operand_equal_p. That is however not the problem in the PR and so I will deal with it only later. The issue here is that even when the access path is the same, it must not be bolted on an aggregate type that does not match. This patch does that, taking just one simple function from the ao_compare::compare_ao_refs machinery and using it to detect the situation. The rest is just merging the information in between accesses of the same access group. I looked at how many times we come across such assignment during "make stage2-bubble" of GCC (configured with only c and C++ and without multilib and libsanitizers) and on an x86_64 there were 87924 such assignments (though now I realize not all of them had to be aggregate), so they do happen. The patch leads to about 5% increase of cases where we don't use an "access path" but resort to a MEM_REF (from 90209 to 95204). On an Aarch64, there were 92268 such assignments and the increase of falling back to MEM_REFs was by 4% (but from a bigger base 132983 to 107991). 
gcc/ChangeLog: 2025-04-04 Martin Jambor PR tree-optimization/118924 * tree-ssa-alias-compare.h (types_equal_for_same_type_for_tbaa_p): Declare. * tree-ssa-alias.cc: Include ipa-utils.h. (types_equal_for_same_type_for_tbaa_p): New public overloaded variant. * tree-sra.cc: Include tree-ssa-alias-compare.h. (create_access): Initialzie grp_same_access_path to true. (build_accesses_from_assign): Detect tbaa hazards and clear grp_same_access_path fields of involved accesses when they occur. (sort_and_splice_var_accesses): Take previous values of grp_same_access_path into account. gcc/testsuite/ChangeLog: 2025-03-25 Martin Jambor PR tree-optimization/118924 * g++.dg/tree-ssa/pr118924.C: New test. (cherry picked from commit 07d243670020b339380194f6125cde87ada56148) Diff: --- gcc/testsuite/g++.dg/tree-ssa/pr118924.C | 29 + gcc/tree-sra.cc | 17 ++--- gcc/tree-ssa-alias-compare.h | 2 ++ gcc/tree-ssa-alias.cc| 13 - 4 files changed, 57 insertions(+), 4 deletions(-) diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr118924.C b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C new file mode 100644 index ..c95eacafc9ce --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C @@ -0,0 +1,29 @@ +/* { dg-do run } */ +/* { dg-options "-std=c++17 -O2" } */ + +template struct Vector { + int m_data[Size]; + Vector(int, int, int) {} +}; +enum class E { POINTS, LINES, TRIANGLES }; + +__attribute__((noipa)) +void getName(E type) { + static E check = E::POINTS; + if (type == check) +check = (E)((int)check + 1); + else +__builtin_abort (); +} + +int main() { + int arr[]{0, 1, 2}; + for (auto dim : arr) { +Vector<3> localInvs(1, 1, 1); +localInvs.m_data[dim] = 8; + } + E types[] = {E::POINTS, E::LINES, E::TRIANGLES}; + for (auto primType : types) +getName(primType); + return 0; +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 8a9cbeec4908..11dca3c026b2 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -100,6 +100,7 @@ along with GCC; see the file COPYING3. 
If not see #include "builtins.h" #include "tree-sra.h" #include "opts.h" +#include "tree-ssa-alias-compare.h" /* Enumeration of all aggregate reductions we can do. */ enum sra_mode { SRA_MODE_EARLY_IPA, /* early call regularization */ @@ -968,6 +969,7 @@ create_access (tree expr, gimple *stmt, bool write) access->type = TREE_TYPE (expr); access->write = write; access->grp_unscalarizable_region = unscalarizable_region; + access->grp_same_access_path = true; access->stmt = stmt; access->reverse = reverse; @@ -1394,6 +1396,9 @@ build_accesses_from_assign (gimple *stmt) racc = build_access_
[gcc r13-9623] sra: Clear grp_same_access_path of accesses created by total scalarization (PR118924)
https://gcc.gnu.org/g:85792c6234ba8436422b3119bf3aae50d7951b27 commit r13-9623-g85792c6234ba8436422b3119bf3aae50d7951b27 Author: Martin Jambor Date: Mon Apr 7 13:32:10 2025 +0200 sra: Clear grp_same_access_path of accesses created by total scalarization (PR118924) During analysis of PR 118924 it was discussed that total scalarization invents access paths (strings of COMPONENT_REFs and possibly even ARRAY_REFs) which did not exist in the program before, which can have unintended effects on subsequent AA queries. Although not doing that does not mean that SRA cannot create such situations (see the bug for more info), it has been agreed that not doing this is generally better. This patch therefore makes SRA fall back on creating simple MEM_REFs when accessing components of an aggregate corresponding to what a SRA variable now represents. gcc/ChangeLog: 2025-03-26 Martin Jambor PR tree-optimization/118924 * tree-sra.cc (create_total_scalarization_access): Set grp_same_access_path flag to zero. (cherry picked from commit 40445711b8af113ef423d8bcac1a7ce1c47f62d7) Diff: --- gcc/tree-sra.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 11dca3c026b2..ec499fdd5109 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -3253,7 +3253,7 @@ create_total_scalarization_access (struct access *parent, HOST_WIDE_INT pos, access->grp_write = parent->grp_write; access->grp_total_scalarization = 1; access->grp_hint = 1; - access->grp_same_access_path = path_comparable_for_same_access (expr); + access->grp_same_access_path = 0; access->reverse = reverse_storage_order_for_component_p (expr); access->next_sibling = next_sibling;
[gcc r15-9633] ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)
https://gcc.gnu.org/g:77780c31485eeb71e9fabf8ea9d4b1af0c3be595 commit r15-9633-g77780c31485eeb71e9fabf8ea9d4b1af0c3be595 Author: Martin Jambor Date: Tue May 6 17:28:43 2025 +0200 ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852) As described in PR 119852, the output of -fdump-ipa-clones can contain "(null)" as the suffix/reason for cloning when we need to create a clone to hold the original function during recursive inlining. Such clone is never output and so should not be part of the dump output either. gcc/ChangeLog: 2025-04-23 Martin Jambor PR ipa/119852 * cgraphclones.cc (dump_callgraph_transformation): Document the function. Do not dump if suffix is NULL. gcc/testsuite/ChangeLog: 2025-04-23 Martin Jambor PR ipa/119852 * gcc.dg/ipa/pr119852.c: New test. (cherry picked from commit fb5829a01651d427a63a12c44ecc8baa47dbfc83) Diff: --- gcc/cgraphclones.cc | 10 +++- gcc/testsuite/gcc.dg/ipa/pr119852.c | 50 + 2 files changed, 59 insertions(+), 1 deletion(-) diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc index e6223fa1f5cc..bf5bc41cde9c 100644 --- a/gcc/cgraphclones.cc +++ b/gcc/cgraphclones.cc @@ -307,12 +307,20 @@ cgraph_node::expand_all_artificial_thunks () e = e->next_caller; } +/* Dump information about creation of a call graph node clone to the dump file + created by the -fdump-ipa-clones option. ORIGINAL is the function being + cloned, CLONE is the new clone. SUFFIX is a string that helps identify the + reason for cloning, often it is the suffix used by a particular IPA pass to + create unique function names. SUFFIX can be NULL and in that case the + dumping will not take place, which must be the case only for helper clones + which will never be emitted to the output. 
*/ + void dump_callgraph_transformation (const cgraph_node *original, const cgraph_node *clone, const char *suffix) { - if (symtab->ipa_clones_dump_file) + if (suffix && symtab->ipa_clones_dump_file) { fprintf (symtab->ipa_clones_dump_file, "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n", diff --git a/gcc/testsuite/gcc.dg/ipa/pr119852.c b/gcc/testsuite/gcc.dg/ipa/pr119852.c new file mode 100644 index ..eab8d21293cc --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr119852.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-ipa-clones" } */ + +typedef struct rtx_def *rtx; +enum rtx_code { + LAST_AND_UNUSED_RTX_CODE}; +extern const char * const rtx_format[((int) LAST_AND_UNUSED_RTX_CODE)]; +struct rtx_def { + enum rtx_code code; +}; +typedef int (*rtx_function) (rtx *, void *); +extern int for_each_rtx (rtx *, rtx_function, void *); +int +replace_label (rtx *x, void *data) +{ + rtx l = *x; + if (l == (rtx) 0) +{ + { + rtx new_c, new_l; + for_each_rtx (&new_c, replace_label, data); + } +} +} +static int +for_each_rtx_1 (rtx exp, int n, rtx_function f, void *data) +{ + int result, i, j; + const char *format = (rtx_format[(int) (((enum rtx_code) (exp)->code))]); + rtx *x; + for (; format[n] != '\0'; n++) +{ + switch (format[n]) + { + case 'e': + result = (*f) (x, data); + { + result = for_each_rtx_1 (*x, i, f, data); + } + } +} +} +int +for_each_rtx (rtx *x, rtx_function f, void *data) +{ + int i; + return for_each_rtx_1 (*x, i, f, data); +} + +/* { dg-final { scan-ipa-dump-not "(null)" "ipa-clones" } } */
[gcc r15-9632] Document option -fdump-ipa-clones
https://gcc.gnu.org/g:99e2f1138c61e851cfa08712aa73e2689d314fd1 commit r15-9632-g99e2f1138c61e851cfa08712aa73e2689d314fd1 Author: Martin Jambor Date: Tue May 6 17:28:42 2025 +0200 Document option -fdump-ipa-clones I have noticed that the option -fdump-ipa-clones is not documented although there are users who depend on it. This patch adds the missing documentation along with the description of the information it dumps and the format it uses. I am never quite sure which of the texinfo mark-ups is the most appropriate in which situation, I'll of course incorporate any feedback on this as well as the general wording of the text. After we settle on a version, I'd like to backport the documentation also at least to GCC 15, 14 and 13. Is it perhaps OK for master and the branches or what would better be changed? Thanks, Martin gcc/ChangeLog: 2025-04-23 Martin Jambor * doc/invoke.texi (Developer Options): Document -fdump-ipa-clones. (cherry picked from commit 6ecc2fee06bdd60da0e9b3fe6660b553dbdca3ca) Diff: --- gcc/doc/invoke.texi | 87 + 1 file changed, 87 insertions(+) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index c2e1bf8031b8..617a3d8ae182 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -20748,6 +20748,93 @@ By default, the dump will contain messages about successful optimizations (equivalent to @option{-optimized}) together with low-level details about the analysis. +@opindex fdump-ipa-clones +@item -fdump-ipa-clones + +Create a dump file containing information about creation of call graph +node clones and removals of call graph nodes during inter-procedural +optimizations and transformations. Its main intended use is that tools +that create live-patches can determine the set of functions that need to +be live-patched to completely replace a particular function (see +@option{-flive-patching}). 
The file name is generated by appending +suffix @code{ipa-clones} to the source file name, and the file is +created in the same directory as the output file. Each entry in the +file is on a separate line containing semicolon separated fields. + +In the case of call graph clone creation, the individual fields are: + +@enumerate +@item +String @code{Callgraph clone}. + +@item +Name of the function being cloned as it is presented to the assembler. + +@item +A number that uniquely represents the function being cloned in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. + +@item +The file name of the source file where the function is defined. + +@item +The line on which the function definition is located. + +@item +The column where the function definition is located. + +@item +Name of the new function clone as it is presented to the assembler. + +@item +A number that uniquely represents the new function clone in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. + +@item +The file name of the source file where the source code location of the +new clone points to. + +@item +The line to which the source code location of the new clone points to. + +@item +The column to which the source code location of the new clone points to. + +@item +A string that determines the reason for cloning. + +@end enumerate + +In the case of call graph clone removal, the individual fields are: + +@enumerate +@item +String @code{Callgraph removal}. + +@item +Name of the function being removed as it would be presented to the assembler. + +@item +A number that uniquely represents the function being cloned in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. 
+ +@item +The file name of the source file where the function is defined. + +@item +The line on which the function definition is located. + +@item +The column where the function definition is located. + +@end enumerate + @opindex fdump-lang @item -fdump-lang Dump language-specific information. The file name is made by appending
[gcc r15-7792] ipa-vr: Handle non-conversion unary ops separately from conversions (PR 118785)
https://gcc.gnu.org/g:d05b64bdd048ffb7f72d97553888934a9bcd13fa commit r15-7792-gd05b64bdd048ffb7f72d97553888934a9bcd13fa Author: Martin Jambor Date: Mon Mar 3 14:53:03 2025 +0100 ipa-vr: Handle non-conversion unary ops separately from conversions (PR 118785) Since we construct arithmetic jump functions even when there is a type conversion in between the operation encoded in the jump function and when it is passed in a call argument, the IPA propagation phase must also perform the operation and conversion in two steps. IPA-VR had actually been doing it even before for binary operations but, as PR 118756 exposes, not in the case of unary operations. This patch adds the necessary step to rectify that. Like in the scalar constant case, we depend on expr_type_first_operand_type_p to determine the type of the result of the arithmetic operation. On top of this, the patch special-cases ABSU_EXPR because it looks useful and so that the PR testcase exercises the added code-path. This seems most appropriate for stage 4, long term we should probably stream the types, probably after also encoding them with a string of expr_eval_op rather than what we have today. A check for expr_type_first_operand_type_p was also missing in the handling of binary ops and the intermediate value_range was initialized with a wrong type, so I also fixed this. gcc/ChangeLog: 2025-02-24 Martin Jambor PR ipa/118785 * ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Handle non-conversion unary operations separately before doing any conversions. Check expr_type_first_operand_type_p for non-unary operations too. Fix type of op_res. gcc/testsuite/ChangeLog: 2025-02-24 Martin Jambor PR ipa/118785 * g++.dg/lto/pr118785_0.C: New test. 
Diff: --- gcc/ipa-cp.cc | 45 --- gcc/testsuite/g++.dg/lto/pr118785_0.C | 14 +++ 2 files changed, 56 insertions(+), 3 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 68959f2677ba..3c994f24f540 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1720,8 +1720,45 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr, enum tree_code operation = ipa_get_jf_pass_through_operation (jfunc); if (TREE_CODE_CLASS (operation) == tcc_unary) { + value_range op_res; + const value_range *inter_vr; + if (operation != NOP_EXPR) + { + /* Since we construct arithmetic jump functions even when there is a + type conversion in between the operation encoded in the jump + function and when it is passed in a call argument, the IPA + propagation phase must also perform the operation and conversion + in two separate steps. + +TODO: In order to remove the use of expr_type_first_operand_type_p +predicate we would need to stream the operation type, ideally +encoding the whole jump function as a series of expr_eval_op +structures. 
*/ + + tree operation_type; + if (expr_type_first_operand_type_p (operation)) + operation_type = src_type; + else if (operation == ABSU_EXPR) + operation_type = unsigned_type_for (src_type); + else + return; + op_res.set_varying (operation_type); + if (!ipa_vr_operation_and_type_effects (op_res, src_vr, operation, + operation_type, src_type)) + return; + if (src_type == dst_type) + { + vr.intersect (op_res); + return; + } + inter_vr = &op_res; + src_type = operation_type; + } + else + inter_vr = &src_vr; + value_range tmp_res (dst_type); - if (ipa_vr_operation_and_type_effects (tmp_res, src_vr, operation, + if (ipa_vr_operation_and_type_effects (tmp_res, *inter_vr, NOP_EXPR, dst_type, src_type)) vr.intersect (tmp_res); return; @@ -1737,10 +1774,12 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr, tree operation_type; if (TREE_CODE_CLASS (operation) == tcc_comparison) operation_type = boolean_type_node; - else + else if (expr_type_first_operand_type_p (operation)) operation_type = src_type; + else +return; - value_range op_res (dst_type); + value_range op_res (operation_type); if (!ipa_vr_supported_type_p (operation_type) || !handler.operand_check_p (operation_type, src_type, op_vr.type ()) || !handler.fold_range (op_res, operation_type, src_vr, op_vr)) diff --git a/gcc/testsuite/g++.dg/lto/pr118785_0.C b/gcc/testsuite/g++.dg/lto/pr118785_0.C new file mode 100644 index ..cdcc1dd947d3 --- /dev/null +++ b/gcc/testsuite/g++.dg/lto/pr118785
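The two-step scheme described in the commit message (first the arithmetic operation in its own type, then the conversion to the destination type) can be pictured with a toy interval model. This is a sketch only: `Range`, `abs_range`, `convert_to_unsigned` and `propagate_abs_then_convert` are hypothetical helpers that have nothing to do with GCC's `value_range` API, and the ranges are assumed to be small enough that negation cannot overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy closed interval [lo, hi]; not GCC's value_range.
struct Range {
  std::int64_t lo, hi;
};

// Step 1: an ABSU-like operation.  The result is treated as unsigned, so
// the absolute value of the most negative source value is representable.
Range abs_range(Range r) {
  if (r.lo >= 0)
    return r;
  if (r.hi <= 0)
    return {-r.hi, -r.lo};
  return {0, std::max(-r.lo, r.hi)};
}

// Step 2: conversion to an unsigned type BITS wide (BITS < 63).  If the
// range does not fit, fall back to the full range of the type.
Range convert_to_unsigned(Range r, int bits) {
  const std::int64_t max = (std::int64_t{1} << bits) - 1;
  if (r.lo >= 0 && r.hi <= max)
    return r;
  return {0, max};
}

// The operation and the conversion are applied as two separate steps.
Range propagate_abs_then_convert(Range src, int dst_bits) {
  Range op_res = abs_range(src);
  return convert_to_unsigned(op_res, dst_bits);
}
```

For a signed 8-bit source range of [-128, 100], the operation step gives [0, 128], which only fits in the unsigned counterpart of the source type; the conversion step then carries that range into the wider destination type unchanged.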
[gcc r15-7891] ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318)
https://gcc.gnu.org/g:7deb498425799aceb7659ea25614175a49533184 commit r15-7891-g7deb498425799aceb7659ea25614175a49533184 Author: Martin Jambor Date: Fri Mar 7 17:17:24 2025 +0100 ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318) PR 118318 reported an ICE during PGO build of Firefox when IPA-CP, in the final stages of update_counts_for_self_gen_clones where it attempts to guess how to distribute profile count among clones created for recursive edges and the various edges that are created in the process. If one such edge has profile count of kind GUESSED_GLOBAL0, the compatibility check in the operator+ will lead to an ICE. After discussing the situation with Honza, we concluded that there is little more we can do other than check for this situation before touching the edge count, so this is what this patch does. gcc/ChangeLog: 2025-02-28 Martin Jambor PR ipa/118318 * ipa-cp.cc (adjust_clone_incoming_counts): Add a compatible_p check. Diff: --- gcc/ipa-cp.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index 3c994f24f540..264568989a96 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -4638,7 +4638,8 @@ adjust_clone_incoming_counts (cgraph_node *node, cs->count = cs->count.combine_with_ipa_count (sum); } else if (!desc->processed_edges->contains (cs) -&& cs->caller->clone_of == desc->orig) +&& cs->caller->clone_of == desc->orig +&& cs->count.compatible_p (desc->count)) { cs->count += desc->count; if (dump_file)
[gcc r14-11375] ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)
https://gcc.gnu.org/g:455ea90d6e5ed2938fb7cc7008bf738dcbbc72d4

commit r14-11375-g455ea90d6e5ed2938fb7cc7008bf738dcbbc72d4
Author: Martin Jambor
Date:   Tue Mar 4 14:53:41 2025 +0100

    ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)

    Among other things, IPA-SRA checks whether splitting out a bit of an
    aggregate or something passed by reference would lead to a clash
    with an already known IPA-CP constant in a way which would cause
    problems later on.  Unfortunately the test is done only in
    adjust_parameter_descriptions and is missing when accesses are
    propagated from callees to callers, which leads to the miscompilation
    reported as PR 118243 (where the callee is a function created by
    ipa-split).

    The matter is then further complicated by the fact that we consider
    complex numbers as scalars even though they can be modified piecemeal
    (IPA-CP can detect and propagate the pieces separately too), which
    then confuses the parameter manipulation machinery further.

    This patch simply adds the missing check to avoid the IPA-SRA
    transform in these cases too, which should be suitable for
    backporting to all affected release branches.  It is a bit of a shame
    as in the PR testcase we do propagate both components of the complex
    number in question and the transformation phase could recover.  I
    have some prototype patches in this direction but that is something
    for (a) stage 1.

    gcc/ChangeLog:

    2025-02-10  Martin Jambor

            PR ipa/118243
            * ipa-sra.cc (pull_accesses_from_callee): New parameters
            caller_ipcp_ts and param_idx.  Check that scalar pulled accesses
            would not clash with a known IPA-CP aggregate constant.
            (param_splitting_across_edge): Pass IPA-CP transformation summary
            and caller parameter index to pull_accesses_from_callee.

    gcc/testsuite/ChangeLog:

    2025-02-10  Martin Jambor

            PR ipa/118243
            * g++.dg/ipa/pr118243.C: New test.

(cherry picked from commit 0bffcd469e68d68ba9c724f515651deff8494b82) Diff: --- gcc/ipa-sra.cc | 38 +-- gcc/testsuite/g++.dg/ipa/pr118243.C | 40 + 2 files changed, 68 insertions(+), 10 deletions(-) diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc index 6d6da4089251..25fbccd03480 100644 --- a/gcc/ipa-sra.cc +++ b/gcc/ipa-sra.cc @@ -3640,15 +3640,19 @@ enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, ACC_PROP_CERTAIN}; /* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC, (which belongs to CALLER) if they would not violate some constraint there. - If successful, return NULL, otherwise return the string reason for failure - (which can be written to the dump file). DELTA_OFFSET is the known offset - of the actual argument withing the formal parameter (so of ARG_DESCS within - PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not - known. In case of success, set *CHANGE_P to true if propagation actually - changed anything. */ + CALLER_IPCP_TS describes the caller, PARAM_IDX is the index of the parameter + described by PARAM_DESC. If successful, return NULL, otherwise return the + string reason for failure (which can be written to the dump file). + DELTA_OFFSET is the known offset of the actual argument withing the formal + parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the size of the + actual argument or zero, if not known. In case of success, set *CHANGE_P to + true if propagation actually changed anything. 
*/ static const char * -pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc, +pull_accesses_from_callee (cgraph_node *caller, + ipcp_transformation *caller_ipcp_ts, + int param_idx, + isra_param_desc *param_desc, isra_param_desc *arg_desc, unsigned delta_offset, unsigned arg_size, bool *change_p) @@ -3673,6 +3677,17 @@ pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc, continue; unsigned offset = argacc->unit_offset + delta_offset; + + if (caller_ipcp_ts && !AGGREGATE_TYPE_P (argacc->type)) + { + ipa_argagg_value_list avl (caller_ipcp_ts); + tree value = avl.get_value (param_idx, offset); + if (value && ((tree_to_uhwi (TYPE_SIZE (TREE_TYPE (value))) +/ BITS_PER_UNIT) + != argacc->unit_size)) + return " propagated access would conflict with an IPA-CP constant"; + } + /* Given that accesses are initially stored according to increasing offset and decreasing size in case of equal offsets, the following searches could
[gcc r13-9422] ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)
https://gcc.gnu.org/g:ceb689d5b697886c2255a43ee61b7352242c9683

commit r13-9422-gceb689d5b697886c2255a43ee61b7352242c9683
Author: Martin Jambor
Date:   Tue Mar 11 16:49:40 2025 +0100

    ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)

    Among other things, IPA-SRA checks whether splitting out a bit of an
    aggregate or something passed by reference would lead to a clash
    with an already known IPA-CP constant in a way which would cause
    problems later on.  Unfortunately the test is done only in
    adjust_parameter_descriptions and is missing when accesses are
    propagated from callees to callers, which leads to the miscompilation
    reported as PR 118243 (where the callee is a function created by
    ipa-split).

    The matter is then further complicated by the fact that we consider
    complex numbers as scalars even though they can be modified piecemeal
    (IPA-CP can detect and propagate the pieces separately too), which
    then confuses the parameter manipulation machinery further.

    This patch simply adds the missing check to avoid the IPA-SRA
    transform in these cases too, which should be suitable for
    backporting to all affected release branches.  It is a bit of a shame
    as in the PR testcase we do propagate both components of the complex
    number in question and the transformation phase could recover.  I
    have some prototype patches in this direction but that is something
    for (a) stage 1.

    gcc/ChangeLog:

    2025-02-10  Martin Jambor

            PR ipa/118243
            * ipa-sra.cc (pull_accesses_from_callee): New parameters
            caller_ipcp_ts and param_idx.  Check that scalar pulled accesses
            would not clash with a known IPA-CP aggregate constant.
            (param_splitting_across_edge): Pass IPA-CP transformation summary
            and caller parameter index to pull_accesses_from_callee.

    gcc/testsuite/ChangeLog:

    2025-02-10  Martin Jambor

            PR ipa/118243
            * g++.dg/ipa/pr118243.C: New test.

(cherry picked from commit 0bffcd469e68d68ba9c724f515651deff8494b82) Diff: --- gcc/ipa-sra.cc | 38 +-- gcc/testsuite/g++.dg/ipa/pr118243.C | 40 + 2 files changed, 68 insertions(+), 10 deletions(-) diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc index a3db19b7a6fb..288b61e4fb4f 100644 --- a/gcc/ipa-sra.cc +++ b/gcc/ipa-sra.cc @@ -3579,15 +3579,19 @@ enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, ACC_PROP_CERTAIN}; /* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC, (which belongs to CALLER) if they would not violate some constraint there. - If successful, return NULL, otherwise return the string reason for failure - (which can be written to the dump file). DELTA_OFFSET is the known offset - of the actual argument withing the formal parameter (so of ARG_DESCS within - PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not - known. In case of success, set *CHANGE_P to true if propagation actually - changed anything. */ + CALLER_IPCP_TS describes the caller, PARAM_IDX is the index of the parameter + described by PARAM_DESC. If successful, return NULL, otherwise return the + string reason for failure (which can be written to the dump file). + DELTA_OFFSET is the known offset of the actual argument withing the formal + parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the size of the + actual argument or zero, if not known. In case of success, set *CHANGE_P to + true if propagation actually changed anything. 
*/ static const char * -pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc, +pull_accesses_from_callee (cgraph_node *caller, + ipcp_transformation *caller_ipcp_ts, + int param_idx, + isra_param_desc *param_desc, isra_param_desc *arg_desc, unsigned delta_offset, unsigned arg_size, bool *change_p) @@ -3612,6 +3616,17 @@ pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc, continue; unsigned offset = argacc->unit_offset + delta_offset; + + if (caller_ipcp_ts && !AGGREGATE_TYPE_P (argacc->type)) + { + ipa_argagg_value_list avl (caller_ipcp_ts); + tree value = avl.get_value (param_idx, offset); + if (value && ((tree_to_uhwi (TYPE_SIZE (TREE_TYPE (value))) +/ BITS_PER_UNIT) + != argacc->unit_size)) + return " propagated access would conflict with an IPA-CP constant"; + } + /* Given that accesses are initially stored according to increasing offset and decreasing size in case of equal offsets, the following searches could
[gcc r15-7961] Fix a pasto in ao_compare::compare_ao_refs
https://gcc.gnu.org/g:dc47161c1f32c3f27d1157ba0de9d98ea1b7fc82

commit r15-7961-gdc47161c1f32c3f27d1157ba0de9d98ea1b7fc82
Author: Martin Jambor
Date:   Tue Mar 11 14:52:44 2025 +0100

    Fix a pasto in ao_compare::compare_ao_refs

    When reading the function ao_compare::compare_ao_refs I came across
    what I believe to be a copy-and-paste error, which this patch fixes.

    gcc/ChangeLog:

    2025-03-10  Martin Jambor

            * tree-ssa-alias.cc (ao_compare::compare_ao_refs): Fix a
            copy-and-paste error.

Diff:
---
 gcc/tree-ssa-alias.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
index 2489aa6b8087..e93d5187d509 100644
--- a/gcc/tree-ssa-alias.cc
+++ b/gcc/tree-ssa-alias.cc
@@ -4355,12 +4355,13 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref *ref2,
 	c1 = p1, nskipped1 = i;
       i++;
     }
+  i = 0;
   for (tree p2 = ref2->ref; handled_component_p (p2); p2 = TREE_OPERAND (p2, 0))
     {
       if (component_ref_to_zero_sized_trailing_array_p (p2))
 	end_struct_ref2 = p2;
       if (ends_tbaa_access_path_p (p2))
-	c2 = p2, nskipped1 = i;
+	c2 = p2, nskipped2 = i;
       i++;
     }
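
The class of mistake fixed here is easy to reproduce outside of GCC. The sketch below is not the GCC function: last_flagged and its boolean "paths" are hypothetical simplifications, and the -1 initial value is an assumption of the sketch. It walks two sequences with a shared counter, as the patched code does, and marks the two places the patch touches: the counter reset between the loops and the variable written in the second loop.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper, not GCC code: for each of two paths, return the
// index of the last element flagged true, or -1 when none is flagged.
std::pair<int, int>
last_flagged (const std::vector<bool> &path1, const std::vector<bool> &path2)
{
  int skipped1 = -1, skipped2 = -1;
  int i = 0;
  for (bool flagged : path1)
    {
      if (flagged)
        skipped1 = i;
      i++;
    }
  i = 0;  // Reset the shared counter before the second walk.
  for (bool flagged : path2)
    {
      if (flagged)
        skipped2 = i;  // The pasto would have written to skipped1 here.
      i++;
    }
  return {skipped1, skipped2};
}

// Small self-check of the fixed behaviour.
void
check_last_flagged ()
{
  std::pair<int, int> r = last_flagged ({false, true, false},
                                        {true, false, false, true});
  assert (r.first == 1);
  assert (r.second == 3);
}
```

Without the reset the second index would continue counting from the length of the first path, and with the wrong variable the first result would be overwritten while the second stayed at its initial value.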
[gcc r15-7760] ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)
https://gcc.gnu.org/g:0bffcd469e68d68ba9c724f515651deff8494b82

commit r15-7760-g0bffcd469e68d68ba9c724f515651deff8494b82
Author: Martin Jambor
Date:   Fri Feb 28 17:34:10 2025 +0100

    ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)

    Among other things, IPA-SRA checks whether splitting out a bit of an
    aggregate or something passed by reference would lead to a clash
    with an already known IPA-CP constant in a way which would cause
    problems later on.  Unfortunately the test is done only in
    adjust_parameter_descriptions and is missing when accesses are
    propagated from callees to callers, which leads to the miscompilation
    reported as PR 118243 (where the callee is a function created by
    ipa-split).

    The matter is then further complicated by the fact that we consider
    complex numbers as scalars even though they can be modified piecemeal
    (IPA-CP can detect and propagate the pieces separately too), which
    then confuses the parameter manipulation machinery further.

    This patch simply adds the missing check to avoid the IPA-SRA
    transform in these cases too, which should be suitable for
    backporting to all affected release branches.  It is a bit of a shame
    as in the PR testcase we do propagate both components of the complex
    number in question and the transformation phase could recover.  I
    have some prototype patches in this direction but that is something
    for (a) stage 1.

    gcc/ChangeLog:

    2025-02-10  Martin Jambor

            PR ipa/118243
            * ipa-sra.cc (pull_accesses_from_callee): New parameters
            caller_ipcp_ts and param_idx.  Check that scalar pulled accesses
            would not clash with a known IPA-CP aggregate constant.
            (param_splitting_across_edge): Pass IPA-CP transformation summary
            and caller parameter index to pull_accesses_from_callee.

    gcc/testsuite/ChangeLog:

    2025-02-10  Martin Jambor

            PR ipa/118243
            * g++.dg/ipa/pr118243.C: New test.

Diff: --- gcc/ipa-sra.cc | 38 +-- gcc/testsuite/g++.dg/ipa/pr118243.C | 40 + 2 files changed, 68 insertions(+), 10 deletions(-) diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc index ad80d22f8ced..5d1703ed394f 100644 --- a/gcc/ipa-sra.cc +++ b/gcc/ipa-sra.cc @@ -3640,15 +3640,19 @@ enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, ACC_PROP_CERTAIN}; /* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC, (which belongs to CALLER) if they would not violate some constraint there. - If successful, return NULL, otherwise return the string reason for failure - (which can be written to the dump file). DELTA_OFFSET is the known offset - of the actual argument withing the formal parameter (so of ARG_DESCS within - PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not - known. In case of success, set *CHANGE_P to true if propagation actually - changed anything. */ + CALLER_IPCP_TS describes the caller, PARAM_IDX is the index of the parameter + described by PARAM_DESC. If successful, return NULL, otherwise return the + string reason for failure (which can be written to the dump file). + DELTA_OFFSET is the known offset of the actual argument withing the formal + parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the size of the + actual argument or zero, if not known. In case of success, set *CHANGE_P to + true if propagation actually changed anything. 
*/ static const char * -pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc, +pull_accesses_from_callee (cgraph_node *caller, + ipcp_transformation *caller_ipcp_ts, + int param_idx, + isra_param_desc *param_desc, isra_param_desc *arg_desc, unsigned delta_offset, unsigned arg_size, bool *change_p) @@ -3673,6 +3677,17 @@ pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc, continue; unsigned offset = argacc->unit_offset + delta_offset; + + if (caller_ipcp_ts && !AGGREGATE_TYPE_P (argacc->type)) + { + ipa_argagg_value_list avl (caller_ipcp_ts); + tree value = avl.get_value (param_idx, offset); + if (value && ((tree_to_uhwi (TYPE_SIZE (TREE_TYPE (value))) +/ BITS_PER_UNIT) + != argacc->unit_size)) + return " propagated access would conflict with an IPA-CP constant"; + } + /* Given that accesses are initially stored according to increasing offset and decreasing size in case of equal offsets, the following searches could be written more efficiently if we kept the ordering @@ -3781,6 +3796,8 @@ para
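
The size comparison added to pull_accesses_from_callee can be modelled with ordinary containers. The sketch below is an illustration only and uses none of GCC's data structures: KnownConstants, check_access and kBitsPerUnit are hypothetical names, and 8-bit units are an assumption. Like the patched function, it returns a null pointer on success and a reason string when a scalar access starts at the offset of a known constant of a different size.

```cpp
#include <cassert>
#include <map>
#include <utility>

constexpr unsigned kBitsPerUnit = 8;  // Assumption: 8-bit addressable units.

// Hypothetical model of known constants: (parameter index, unit offset)
// mapped to the size of the constant in bits.
using KnownConstants = std::map<std::pair<int, unsigned>, unsigned>;

// Return a reason string if a scalar access of UNIT_SIZE units at OFFSET in
// parameter PARAM_IDX clashes with a known constant, nullptr otherwise.
const char *
check_access (const KnownConstants *known, int param_idx, unsigned offset,
              unsigned unit_size)
{
  if (!known)
    return nullptr;  // Nothing is known about the caller, so no clash.
  auto it = known->find ({param_idx, offset});
  if (it != known->end () && it->second / kBitsPerUnit != unit_size)
    return "propagated access would conflict with a known constant";
  return nullptr;
}

// Self-check modelled on the complex-number case from the commit message:
// a 16-unit scalar access meets a 64-bit constant for its first half.
void
check_access_example ()
{
  KnownConstants known;
  known[{0, 0u}] = 64;
  assert (check_access (&known, 0, 0, 16) != nullptr);
  assert (check_access (&known, 0, 0, 8) == nullptr);
}
```

As in the patch, only a constant starting at exactly the same offset is considered; an access that merely overlaps a constant at a different offset is not detected by this particular check.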
[gcc r14-11447] ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318)
https://gcc.gnu.org/g:82bd83122a483275787fcd18131bf6cd91fbdbd4

commit r14-11447-g82bd83122a483275787fcd18131bf6cd91fbdbd4
Author: Martin Jambor
Date:   Fri Mar 7 17:17:24 2025 +0100

    ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318)

    PR 118318 reported an ICE during a PGO build of Firefox.  It happens
    in IPA-CP, in the final stages of update_counts_for_self_gen_clones,
    where it attempts to guess how to distribute the profile count among
    clones created for recursive edges and the various edges that are
    created in the process.  If one such edge has a profile count of kind
    GUESSED_GLOBAL0, the compatibility check in the operator+ will lead
    to an ICE.  After discussing the situation with Honza, we concluded
    that there is little more we can do other than check for this
    situation before touching the edge count, so this is what this patch
    does.

    gcc/ChangeLog:

    2025-02-28  Martin Jambor

            PR ipa/118318
            * ipa-cp.cc (adjust_clone_incoming_counts): Add a compatible_p
            check.

    (cherry picked from commit 7deb498425799aceb7659ea25614175a49533184)

Diff:
---
 gcc/ipa-cp.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index b7add455bd5d..6b772fae88ff 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -4608,7 +4608,8 @@ adjust_clone_incoming_counts (cgraph_node *node,
 	  cs->count = cs->count.combine_with_ipa_count (sum);
 	}
       else if (!desc->processed_edges->contains (cs)
-	       && cs->caller->clone_of == desc->orig)
+	       && cs->caller->clone_of == desc->orig
+	       && cs->count.compatible_p (desc->count))
 	{
 	  cs->count += desc->count;
 	  if (dump_file)
[gcc r13-9654] ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)
https://gcc.gnu.org/g:168ce6032dd582e39f9ddadcc195fc73f364c4dd commit r13-9654-g168ce6032dd582e39f9ddadcc195fc73f364c4dd Author: Martin Jambor Date: Tue May 6 17:28:43 2025 +0200 ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852) As described in PR 119852, the output of -fdump-ipa-clones can contain "(null)" as the suffix/reason for cloning when we need to create a clone to hold the original function during recursive inlining. Such clone is never output and so should not be part of the dump output either. gcc/ChangeLog: 2025-04-23 Martin Jambor PR ipa/119852 * cgraphclones.cc (dump_callgraph_transformation): Document the function. Do not dump if suffix is NULL. gcc/testsuite/ChangeLog: 2025-04-23 Martin Jambor PR ipa/119852 * gcc.dg/ipa/pr119852.c: New test. (cherry picked from commit fb5829a01651d427a63a12c44ecc8baa47dbfc83) Diff: --- gcc/cgraphclones.cc | 10 +++- gcc/testsuite/gcc.dg/ipa/pr119852.c | 50 + 2 files changed, 59 insertions(+), 1 deletion(-) diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc index 7c5d3b2842c9..b5435537b1d9 100644 --- a/gcc/cgraphclones.cc +++ b/gcc/cgraphclones.cc @@ -304,12 +304,20 @@ cgraph_node::expand_all_artificial_thunks () e = e->next_caller; } +/* Dump information about creation of a call graph node clone to the dump file + created by the -fdump-ipa-clones option. ORIGINAL is the function being + cloned, CLONE is the new clone. SUFFIX is a string that helps identify the + reason for cloning, often it is the suffix used by a particular IPA pass to + create unique function names. SUFFIX can be NULL and in that case the + dumping will not take place, which must be the case only for helper clones + which will never be emitted to the output. 
*/ + void dump_callgraph_transformation (const cgraph_node *original, const cgraph_node *clone, const char *suffix) { - if (symtab->ipa_clones_dump_file) + if (suffix && symtab->ipa_clones_dump_file) { fprintf (symtab->ipa_clones_dump_file, "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n", diff --git a/gcc/testsuite/gcc.dg/ipa/pr119852.c b/gcc/testsuite/gcc.dg/ipa/pr119852.c new file mode 100644 index ..eab8d21293cc --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr119852.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-ipa-clones" } */ + +typedef struct rtx_def *rtx; +enum rtx_code { + LAST_AND_UNUSED_RTX_CODE}; +extern const char * const rtx_format[((int) LAST_AND_UNUSED_RTX_CODE)]; +struct rtx_def { + enum rtx_code code; +}; +typedef int (*rtx_function) (rtx *, void *); +extern int for_each_rtx (rtx *, rtx_function, void *); +int +replace_label (rtx *x, void *data) +{ + rtx l = *x; + if (l == (rtx) 0) +{ + { + rtx new_c, new_l; + for_each_rtx (&new_c, replace_label, data); + } +} +} +static int +for_each_rtx_1 (rtx exp, int n, rtx_function f, void *data) +{ + int result, i, j; + const char *format = (rtx_format[(int) (((enum rtx_code) (exp)->code))]); + rtx *x; + for (; format[n] != '\0'; n++) +{ + switch (format[n]) + { + case 'e': + result = (*f) (x, data); + { + result = for_each_rtx_1 (*x, i, f, data); + } + } +} +} +int +for_each_rtx (rtx *x, rtx_function f, void *data) +{ + int i; + return for_each_rtx_1 (*x, i, f, data); +} + +/* { dg-final { scan-ipa-dump-not "(null)" "ipa-clones" } } */
[gcc r14-11763] Document option -fdump-ipa-clones
https://gcc.gnu.org/g:c817f833cf13bc81380bc9745da2622e4e3b7cb5 commit r14-11763-gc817f833cf13bc81380bc9745da2622e4e3b7cb5 Author: Martin Jambor Date: Tue May 6 17:28:42 2025 +0200 Document option -fdump-ipa-clones I have noticed that the option -fdump-ipa-clones is not documented although there are users who depend on it. This patch adds the missing documentation along with the description of the information it dumps and the format it uses. I am never quite sure which of the texinfo mark-ups is the most appropriate in which situation, I'll of course incorporate any feedback on this as well as the general wording of the text. After we settle on a version, I'd like to backport the documentation also at least to GCC 15, 14 and 13. Is it perhaps OK for master and the branches or what would better be changed? Thanks, Martin gcc/ChangeLog: 2025-04-23 Martin Jambor * doc/invoke.texi (Developer Options): Document -fdump-ipa-clones. (cherry picked from commit 6ecc2fee06bdd60da0e9b3fe6660b553dbdca3ca) Diff: --- gcc/doc/invoke.texi | 87 + 1 file changed, 87 insertions(+) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index c85cac24f3ce..64728fead512 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -20180,6 +20180,93 @@ By default, the dump will contain messages about successful optimizations (equivalent to @option{-optimized}) together with low-level details about the analysis. +@opindex fdump-ipa-clones +@item -fdump-ipa-clones + +Create a dump file containing information about creation of call graph +node clones and removals of call graph nodes during inter-procedural +optimizations and transformations. Its main intended use is that tools +that create live-patches can determine the set of functions that need to +be live-patched to completely replace a particular function (see +@option{-flive-patching}). 
The file name is generated by appending +suffix @code{ipa-clones} to the source file name, and the file is +created in the same directory as the output file. Each entry in the +file is on a separate line containing semicolon separated fields. + +In the case of call graph clone creation, the individual fields are: + +@enumerate +@item +String @code{Callgraph clone}. + +@item +Name of the function being cloned as it is presented to the assembler. + +@item +A number that uniquely represents the function being cloned in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. + +@item +The file name of the source file where the function is defined. + +@item +The line on which the function definition is located. + +@item +The column where the function definition is located. + +@item +Name of the new function clone as it is presented to the assembler. + +@item +A number that uniquely represents the new function clone in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. + +@item +The file name of the source file where the source code location of the +new clone points to. + +@item +The line to which the source code location of the new clone points to. + +@item +The column to which the source code location of the new clone points to. + +@item +A string that determines the reason for cloning. + +@end enumerate + +In the case of call graph clone removal, the individual fields are: + +@enumerate +@item +String @code{Callgraph removal}. + +@item +Name of the function being removed as it would be presented to the assembler. + +@item +A number that uniquely represents the function being cloned in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. 
+ +@item +The file name of the source file where the function is defined. + +@item +The line on which the function definition is located. + +@item +The column where the function definition is located. + +@end enumerate + @opindex fdump-lang @item -fdump-lang Dump language-specific information. The file name is made by appending
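
For tools that consume this dump, the twelve-field "Callgraph clone" entry documented above maps naturally onto a small record. The following is a hedged sketch of such a consumer and not part of GCC: CloneRecord, split_fields and parse_clone_line are hypothetical names, the sample line in the self-check is made up, and the sketch assumes that file names contain no semicolons and that the trailing newline has already been stripped.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// One "Callgraph clone" entry; member names are invented for this sketch.
struct CloneRecord
{
  std::string orig_name;   // Assembler name of the function being cloned.
  int orig_id;             // Its number in the call graph.
  std::string orig_file;
  int orig_line;
  int orig_column;
  std::string clone_name;  // Assembler name of the new clone.
  int clone_id;
  std::string clone_file;
  int clone_line;
  int clone_column;
  std::string reason;      // String giving the reason for cloning.
};

// Split LINE at semicolons, keeping empty fields.
std::vector<std::string>
split_fields (const std::string &line)
{
  std::vector<std::string> fields;
  std::size_t start = 0;
  while (true)
    {
      std::size_t end = line.find (';', start);
      if (end == std::string::npos)
        {
          fields.push_back (line.substr (start));
          break;
        }
      fields.push_back (line.substr (start, end - start));
      start = end + 1;
    }
  return fields;
}

// Parse LINE into REC.  Return false if LINE is not a clone entry with
// twelve fields and numeric values where numbers are expected.
bool
parse_clone_line (const std::string &line, CloneRecord &rec)
{
  std::vector<std::string> f = split_fields (line);
  if (f.size () != 12 || f[0] != "Callgraph clone")
    return false;
  try
    {
      rec.orig_name = f[1];
      rec.orig_id = std::stoi (f[2]);
      rec.orig_file = f[3];
      rec.orig_line = std::stoi (f[4]);
      rec.orig_column = std::stoi (f[5]);
      rec.clone_name = f[6];
      rec.clone_id = std::stoi (f[7]);
      rec.clone_file = f[8];
      rec.clone_line = std::stoi (f[9]);
      rec.clone_column = std::stoi (f[10]);
      rec.reason = f[11];
    }
  catch (const std::exception &)
    {
      return false;  // A numeric field did not start with a number.
    }
  return true;
}

// Self-check on a made-up entry (the names and numbers are not real output).
void
check_parse_clone_line ()
{
  CloneRecord rec;
  assert (parse_clone_line ("Callgraph clone;foo;12;a.c;3;1;"
                            "foo.constprop.0;45;a.c;3;1;constprop", rec));
  assert (rec.orig_id == 12 && rec.clone_id == 45);
  assert (rec.reason == "constprop");
}
```

A "Callgraph removal" entry has six fields and could be handled the same way with a second, shorter record; the sketch rejects it rather than guessing.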
[gcc r14-11764] ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)
https://gcc.gnu.org/g:51ffec744b513a71fe84373fb87a3c0125b7fffd commit r14-11764-g51ffec744b513a71fe84373fb87a3c0125b7fffd Author: Martin Jambor Date: Tue May 6 17:28:43 2025 +0200 ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852) As described in PR 119852, the output of -fdump-ipa-clones can contain "(null)" as the suffix/reason for cloning when we need to create a clone to hold the original function during recursive inlining. Such clone is never output and so should not be part of the dump output either. gcc/ChangeLog: 2025-04-23 Martin Jambor PR ipa/119852 * cgraphclones.cc (dump_callgraph_transformation): Document the function. Do not dump if suffix is NULL. gcc/testsuite/ChangeLog: 2025-04-23 Martin Jambor PR ipa/119852 * gcc.dg/ipa/pr119852.c: New test. (cherry picked from commit fb5829a01651d427a63a12c44ecc8baa47dbfc83) Diff: --- gcc/cgraphclones.cc | 10 +++- gcc/testsuite/gcc.dg/ipa/pr119852.c | 50 + 2 files changed, 59 insertions(+), 1 deletion(-) diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc index 4fff6873a369..913c0a0a082f 100644 --- a/gcc/cgraphclones.cc +++ b/gcc/cgraphclones.cc @@ -307,12 +307,20 @@ cgraph_node::expand_all_artificial_thunks () e = e->next_caller; } +/* Dump information about creation of a call graph node clone to the dump file + created by the -fdump-ipa-clones option. ORIGINAL is the function being + cloned, CLONE is the new clone. SUFFIX is a string that helps identify the + reason for cloning, often it is the suffix used by a particular IPA pass to + create unique function names. SUFFIX can be NULL and in that case the + dumping will not take place, which must be the case only for helper clones + which will never be emitted to the output. 
*/ + void dump_callgraph_transformation (const cgraph_node *original, const cgraph_node *clone, const char *suffix) { - if (symtab->ipa_clones_dump_file) + if (suffix && symtab->ipa_clones_dump_file) { fprintf (symtab->ipa_clones_dump_file, "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n", diff --git a/gcc/testsuite/gcc.dg/ipa/pr119852.c b/gcc/testsuite/gcc.dg/ipa/pr119852.c new file mode 100644 index ..eab8d21293cc --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr119852.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-ipa-clones" } */ + +typedef struct rtx_def *rtx; +enum rtx_code { + LAST_AND_UNUSED_RTX_CODE}; +extern const char * const rtx_format[((int) LAST_AND_UNUSED_RTX_CODE)]; +struct rtx_def { + enum rtx_code code; +}; +typedef int (*rtx_function) (rtx *, void *); +extern int for_each_rtx (rtx *, rtx_function, void *); +int +replace_label (rtx *x, void *data) +{ + rtx l = *x; + if (l == (rtx) 0) +{ + { + rtx new_c, new_l; + for_each_rtx (&new_c, replace_label, data); + } +} +} +static int +for_each_rtx_1 (rtx exp, int n, rtx_function f, void *data) +{ + int result, i, j; + const char *format = (rtx_format[(int) (((enum rtx_code) (exp)->code))]); + rtx *x; + for (; format[n] != '\0'; n++) +{ + switch (format[n]) + { + case 'e': + result = (*f) (x, data); + { + result = for_each_rtx_1 (*x, i, f, data); + } + } +} +} +int +for_each_rtx (rtx *x, rtx_function f, void *data) +{ + int i; + return for_each_rtx_1 (*x, i, f, data); +} + +/* { dg-final { scan-ipa-dump-not "(null)" "ipa-clones" } } */
[gcc r13-9653] Document option -fdump-ipa-clones
https://gcc.gnu.org/g:70d3dec42e8c7aec6604f920f56529c796cd398a commit r13-9653-g70d3dec42e8c7aec6604f920f56529c796cd398a Author: Martin Jambor Date: Tue May 6 17:28:42 2025 +0200 Document option -fdump-ipa-clones I have noticed that the option -fdump-ipa-clones is not documented although there are users who depend on it. This patch adds the missing documentation along with the description of the information it dumps and the format it uses. I am never quite sure which of the texinfo mark-ups is the most appropriate in which situation, I'll of course incorporate any feedback on this as well as the general wording of the text. After we settle on a version, I'd like to backport the documentation also at least to GCC 15, 14 and 13. Is it perhaps OK for master and the branches or what would better be changed? Thanks, Martin gcc/ChangeLog: 2025-04-23 Martin Jambor * doc/invoke.texi (Developer Options): Document -fdump-ipa-clones. (cherry picked from commit 6ecc2fee06bdd60da0e9b3fe6660b553dbdca3ca) Diff: --- gcc/doc/invoke.texi | 87 + 1 file changed, 87 insertions(+) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 4f539b59d17e..b80966e13539 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -19191,6 +19191,93 @@ By default, the dump will contain messages about successful optimizations (equivalent to @option{-optimized}) together with low-level details about the analysis. +@opindex fdump-ipa-clones +@item -fdump-ipa-clones + +Create a dump file containing information about creation of call graph +node clones and removals of call graph nodes during inter-procedural +optimizations and transformations. Its main intended use is that tools +that create live-patches can determine the set of functions that need to +be live-patched to completely replace a particular function (see +@option{-flive-patching}). 
The file name is generated by appending +suffix @code{ipa-clones} to the source file name, and the file is +created in the same directory as the output file. Each entry in the +file is on a separate line containing semicolon separated fields. + +In the case of call graph clone creation, the individual fields are: + +@enumerate +@item +String @code{Callgraph clone}. + +@item +Name of the function being cloned as it is presented to the assembler. + +@item +A number that uniquely represents the function being cloned in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. + +@item +The file name of the source file where the function is defined. + +@item +The line on which the function definition is located. + +@item +The column where the function definition is located. + +@item +Name of the new function clone as it is presented to the assembler. + +@item +A number that uniquely represents the new function clone in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. + +@item +The file name of the source file where the source code location of the +new clone points to. + +@item +The line to which the source code location of the new clone points to. + +@item +The column to which the source code location of the new clone points to. + +@item +A string that determines the reason for cloning. + +@end enumerate + +In the case of call graph clone removal, the individual fields are: + +@enumerate +@item +String @code{Callgraph removal}. + +@item +Name of the function being removed as it would be presented to the assembler. + +@item +A number that uniquely represents the function being cloned in the call +graph. Note that the number is unique only within a compilation unit or +within whole-program analysis but is likely to be different in the two +phases. 
+ +@item +The file name of the source file where the function is defined. + +@item +The line on which the function definition is located. + +@item +The column where the function definition is located. + +@end enumerate + @opindex fdump-lang @item -fdump-lang Dump language-specific information. The file name is made by appending
[gcc r16-696] ipa: Dump cgraph_node UID instead of order into ipa-clones dump file
https://gcc.gnu.org/g:9fa534f0831892393885e64596a0d6ca8c4078b6

commit r16-696-g9fa534f0831892393885e64596a0d6ca8c4078b6
Author: Martin Jambor
Date:   Fri May 16 17:13:51 2025 +0200

    ipa: Dump cgraph_node UID instead of order into ipa-clones dump file

    Since starting from GCC 15 the order is not unique for any
    symtab_nodes but m_uid is, I believe we ought to dump the latter in
    the ipa-clones dump, if only so that people can reliably match
    entries about new clones to those about removed nodes (if any).

    This patch also contains fixes to a few other places where we have
    so far dumped order to our ordinary dumps and which have been
    identified by Michal Jires.

    gcc/ChangeLog:

    2025-05-16  Martin Jambor

            * cgraph.h (symtab_node): Make member function get_uid const.
            * cgraphclones.cc (dump_callgraph_transformation): Dump m_uid of
            the call graph nodes instead of order.
            * cgraph.cc (cgraph_node::remove): Likewise.
            * ipa-cp.cc (ipcp_lattice::print): Likewise.
            * ipa-sra.cc (ipa_sra_summarize_function): Likewise.
            * symtab.cc (symtab_node::dump_base): Likewise.

Co-Authored-By: Michal Jires Diff: --- gcc/cgraph.cc | 2 +- gcc/cgraph.h| 2 +- gcc/cgraphclones.cc | 4 ++-- gcc/ipa-cp.cc | 2 +- gcc/ipa-sra.cc | 2 +- gcc/symtab.cc | 4 ++-- 6 files changed, 8 insertions(+), 8 deletions(-) diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc index 1a2ec38374ab..ac0f2519361b 100644 --- a/gcc/cgraph.cc +++ b/gcc/cgraph.cc @@ -1879,7 +1879,7 @@ cgraph_node::remove (void) clone_info *info, saved_info; if (symtab->ipa_clones_dump_file && symtab->cloned_nodes.contains (this)) fprintf (symtab->ipa_clones_dump_file, -"Callgraph removal;%s;%d;%s;%d;%d\n", asm_name (), order, +"Callgraph removal;%s;%d;%s;%d;%d\n", asm_name (), get_uid (), DECL_SOURCE_FILE (decl), DECL_SOURCE_LINE (decl), DECL_SOURCE_COLUMN (decl)); diff --git a/gcc/cgraph.h b/gcc/cgraph.h index f4ee29e998c3..8dbe36eac09d 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -493,7 +493,7 @@ public: static inline void checking_verify_symtab_nodes (void); /* Get unique identifier of the node. */ - inline int get_uid () + inline int get_uid () const { return m_uid; } diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc index b45ac4977331..c160e8b6985b 100644 --- a/gcc/cgraphclones.cc +++ b/gcc/cgraphclones.cc @@ -324,11 +324,11 @@ dump_callgraph_transformation (const cgraph_node *original, { fprintf (symtab->ipa_clones_dump_file, "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n", - original->asm_name (), original->order, + original->asm_name (), original->get_uid (), DECL_SOURCE_FILE (original->decl), DECL_SOURCE_LINE (original->decl), DECL_SOURCE_COLUMN (original->decl), clone->asm_name (), - clone->order, DECL_SOURCE_FILE (clone->decl), + clone->get_uid (), DECL_SOURCE_FILE (clone->decl), DECL_SOURCE_LINE (clone->decl), DECL_SOURCE_COLUMN (clone->decl), suffix); diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index b41148c74de3..f06ac46dfffb 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -288,7 +288,7 @@ ipcp_lattice::print (FILE * f, bool dump_sources, bool dump_benefits) else fprintf 
(f, " [scc: %i, from:", val->scc_no); for (s = val->sources; s; s = s->next) - fprintf (f, " %i(%f)", s->cs->caller->order, + fprintf (f, " %i(%f)", s->cs->caller->get_uid (), s->cs->sreal_frequency ().to_double ()); fprintf (f, "]"); } diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc index 1331ba49b507..88bfae9502c7 100644 --- a/gcc/ipa-sra.cc +++ b/gcc/ipa-sra.cc @@ -4644,7 +4644,7 @@ ipa_sra_summarize_function (cgraph_node *node) { if (dump_file) fprintf (dump_file, "Creating summary for %s/%i:\n", node->name (), -node->order); +node->get_uid ()); gcc_obstack_init (&gensum_obstack); loaded_decls = new hash_set; diff --git a/gcc/symtab.cc b/gcc/symtab.cc index fe9c031247f9..fc1155f46964 100644 --- a/gcc/symtab.cc +++ b/gcc/symtab.cc @@ -989,10 +989,10 @@ symtab_node::dump_base (FILE *f) same_comdat_group->dump_asm_name ()); if (next_sharing_asm_name) fprintf (f, " next sharing asm name: %i\n", -next_sharing_asm_name->order); +next_sharing_asm_name->get_uid ()); if (previous_sharing_asm_name) fprintf (f, " previous sharing asm name: %i\n", -previous_sharing_asm_name->order); +previous_sharing_asm_name->get_uid ()); if (address_taken) fprintf (f, " Address is taken.\n");
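
For readers who post-process the ipa-clones dump, the two record formats printed by the `fprintf` calls in the patch above can be read back with `sscanf`. The following is a minimal sketch and not GCC code: the struct, the function names and the buffer sizes are assumptions made for illustration, and only the field order is taken from the format strings in the diff. It also assumes that names and file paths contain no semicolons.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One "Callgraph clone" record.  The field order mirrors the format string
   "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n" from the patch.
   The struct itself and its buffer sizes are hypothetical.  */
struct clone_record
{
  char orig_name[128];
  int orig_uid;
  char orig_file[256];
  int orig_line;
  int orig_col;
  char clone_name[128];
  int clone_uid;
  char clone_file[256];
  int clone_line;
  int clone_col;
  char reason[64];
};

/* Parse LINE into REC.  Return 1 if all eleven fields were read,
   0 otherwise (for example when LINE is a removal record).  */
int
parse_clone_record (const char *line, struct clone_record *rec)
{
  assert (line != NULL && rec != NULL);
  int n = sscanf (line,
                  "Callgraph clone;%127[^;];%d;%255[^;];%d;%d;"
                  "%127[^;];%d;%255[^;];%d;%d;%63[^;\n]",
                  rec->orig_name, &rec->orig_uid, rec->orig_file,
                  &rec->orig_line, &rec->orig_col,
                  rec->clone_name, &rec->clone_uid, rec->clone_file,
                  &rec->clone_line, &rec->clone_col, rec->reason);
  return n == 11;
}

/* Extract only the UID from a "Callgraph removal" record so that it can
   be matched against clone_uid of an earlier clone record.
   Return 1 on success, 0 otherwise.  */
int
parse_removal_uid (const char *line, int *uid)
{
  char name[128];
  assert (line != NULL && uid != NULL);
  return sscanf (line, "Callgraph removal;%127[^;];%d", name, uid) == 2;
}
```

With the UID being dumped instead of the order, a consumer of the file can pair a removal record with the clone record whose `clone_uid` is equal, which is the use case the commit message mentions.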
[gcc r16-614] tree-sra: Do not create stores into const aggregates (PR111873)
https://gcc.gnu.org/g:9d039eff453f777c58642ff16178c1ce2a4be6ab

commit r16-614-g9d039eff453f777c58642ff16178c1ce2a4be6ab
Author: Martin Jambor
Date:   Wed May 14 12:08:24 2025 +0200

    tree-sra: Do not create stores into const aggregates (PR111873)

    This patch fixes (hopefully the) one remaining place where gimple
    SRA was still creating a store into const aggregates.  It occurs
    when there is a replacement for a load but that replacement is not
    type compatible - typically because it is a single field structure.

    I have used testcases from duplicates because the original
    test-case no longer reproduces for me.

    gcc/ChangeLog:

    2025-05-13  Martin Jambor

            PR tree-optimization/111873
            * tree-sra.cc (sra_modify_expr): When processing a load which has
            a type-incompatible replacement, do not store the contents of the
            replacement into the original aggregate when that aggregate is
            const.

    gcc/testsuite/ChangeLog:

    2025-05-13  Martin Jambor

            * gcc.dg/ipa/pr120044-1.c: New test.
            * gcc.dg/ipa/pr120044-2.c: Likewise.
            * gcc.dg/tree-ssa/pr114864.c: Likewise.

Diff: --- gcc/testsuite/gcc.dg/ipa/pr120044-1.c| 17 + gcc/testsuite/gcc.dg/ipa/pr120044-2.c| 17 + gcc/testsuite/gcc.dg/tree-ssa/pr114864.c | 15 +++ gcc/tree-sra.cc | 4 +++- 4 files changed, 52 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-1.c b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c new file mode 100644 index ..f9fee3e85afb --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c @@ -0,0 +1,17 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-inline" } */ + +struct a { + int b; +} const c; +void d(char p, struct a e) { + while (e.b) +; +} +static unsigned short f(const struct a g) { + d(g.b, g); + return g.b; +} +int main() { + return f(c); +} diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-2.c b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c new file mode 100644 index ..5130791f5444 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c @@ -0,0 +1,17 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-ipa-cp" } */ + +struct a { + int b; +} const c; +void d(char p, struct a e) { + while (e.b) +; +} +static unsigned short f(const struct a g) { + d(g.b, g); + return g.b; +} +int main() { + return f(c); +} diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c new file mode 100644 index ..cd9b94c094fc --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c @@ -0,0 +1,15 @@ +/* { dg-do run } */ +/* { dg-options "-O1 -fno-tree-dce -fno-tree-fre" } */ + +struct a { + int b; +} const c; +void d(const struct a f) {} +void e(const struct a f) { + f.b == 0 ? 
1 : f.b; + d(f); +} +int main() { + e(c); + return 0; +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 302b73e83b8f..4b6daf772841 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -4205,8 +4205,10 @@ sra_modify_expr (tree *expr, bool write, gimple_stmt_iterator *stmt_gsi, } else { - gassign *stmt; + if (TREE_READONLY (access->base)) + return false; + gassign *stmt; if (access->grp_partial_lhs) repl = force_gimple_operand_gsi (stmt_gsi, repl, true, NULL_TREE, true,
[gcc r15-9716] ipa: Dump cgraph_node UID instead of order into ipa-clones dump file
https://gcc.gnu.org/g:76d16fbd802a10faabf63945dd34f351aea087dc

commit r15-9716-g76d16fbd802a10faabf63945dd34f351aea087dc
Author: Martin Jambor
Date:   Fri May 16 17:13:51 2025 +0200

    ipa: Dump cgraph_node UID instead of order into ipa-clones dump file

    Since, starting with GCC 15, the order is no longer unique among
    symtab_nodes but m_uid is, I believe we ought to dump the latter in
    the ipa-clones dump, if only so that people can reliably match
    entries about new clones to those about removed nodes (if any).

    This patch also contains fixes to a few other places where we have
    so far dumped order to our ordinary dumps and which have been
    identified by Michal Jires.

    gcc/ChangeLog:

    2025-05-16  Martin Jambor

            * cgraph.h (symtab_node): Make member function get_uid const.
            * cgraphclones.cc (dump_callgraph_transformation): Dump m_uid of
            the call graph nodes instead of order.
            * cgraph.cc (cgraph_node::remove): Likewise.
            * ipa-cp.cc (ipcp_lattice::print): Likewise.
            * ipa-sra.cc (ipa_sra_summarize_function): Likewise.
            * symtab.cc (symtab_node::dump_base): Likewise.

Co-Authored-By: Michal Jires (cherry picked from commit 9fa534f0831892393885e64596a0d6ca8c4078b6) Diff: --- gcc/cgraph.cc | 2 +- gcc/cgraph.h| 2 +- gcc/cgraphclones.cc | 4 ++-- gcc/ipa-cp.cc | 2 +- gcc/ipa-sra.cc | 2 +- gcc/symtab.cc | 4 ++-- 6 files changed, 8 insertions(+), 8 deletions(-) diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc index 6ae6a97f6f56..48646de6aa32 100644 --- a/gcc/cgraph.cc +++ b/gcc/cgraph.cc @@ -1879,7 +1879,7 @@ cgraph_node::remove (void) clone_info *info, saved_info; if (symtab->ipa_clones_dump_file && symtab->cloned_nodes.contains (this)) fprintf (symtab->ipa_clones_dump_file, -"Callgraph removal;%s;%d;%s;%d;%d\n", asm_name (), order, +"Callgraph removal;%s;%d;%s;%d;%d\n", asm_name (), get_uid (), DECL_SOURCE_FILE (decl), DECL_SOURCE_LINE (decl), DECL_SOURCE_COLUMN (decl)); diff --git a/gcc/cgraph.h b/gcc/cgraph.h index abde770ba2b3..45119e3dce9e 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -493,7 +493,7 @@ public: static inline void checking_verify_symtab_nodes (void); /* Get unique identifier of the node. 
*/ - inline int get_uid () + inline int get_uid () const { return m_uid; } diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc index bf5bc41cde9c..3c9c642bdec4 100644 --- a/gcc/cgraphclones.cc +++ b/gcc/cgraphclones.cc @@ -324,11 +324,11 @@ dump_callgraph_transformation (const cgraph_node *original, { fprintf (symtab->ipa_clones_dump_file, "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n", - original->asm_name (), original->order, + original->asm_name (), original->get_uid (), DECL_SOURCE_FILE (original->decl), DECL_SOURCE_LINE (original->decl), DECL_SOURCE_COLUMN (original->decl), clone->asm_name (), - clone->order, DECL_SOURCE_FILE (clone->decl), + clone->get_uid (), DECL_SOURCE_FILE (clone->decl), DECL_SOURCE_LINE (clone->decl), DECL_SOURCE_COLUMN (clone->decl), suffix); diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index a8ff3c870731..7ce9ba776961 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -292,7 +292,7 @@ ipcp_lattice::print (FILE * f, bool dump_sources, bool dump_benefits) else fprintf (f, " [scc: %i, from:", val->scc_no); for (s = val->sources; s; s = s->next) - fprintf (f, " %i(%f)", s->cs->caller->order, + fprintf (f, " %i(%f)", s->cs->caller->get_uid (), s->cs->sreal_frequency ().to_double ()); fprintf (f, "]"); } diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc index 1331ba49b507..88bfae9502c7 100644 --- a/gcc/ipa-sra.cc +++ b/gcc/ipa-sra.cc @@ -4644,7 +4644,7 @@ ipa_sra_summarize_function (cgraph_node *node) { if (dump_file) fprintf (dump_file, "Creating summary for %s/%i:\n", node->name (), -node->order); +node->get_uid ()); gcc_obstack_init (&gensum_obstack); loaded_decls = new hash_set; diff --git a/gcc/symtab.cc b/gcc/symtab.cc index fe9c031247f9..fc1155f46964 100644 --- a/gcc/symtab.cc +++ b/gcc/symtab.cc @@ -989,10 +989,10 @@ symtab_node::dump_base (FILE *f) same_comdat_group->dump_asm_name ()); if (next_sharing_asm_name) fprintf (f, " next sharing asm name: %i\n", -next_sharing_asm_name->order); +next_sharing_asm_name->get_uid ()); if 
(previous_sharing_asm_name) fprintf (f, " previous sharing asm name: %i\n", -previous_sharing_asm_name->order); +previous_sharing_asm_name->get_uid ()); if (address_taken) fprintf (f, " Address is taken.\n");
[gcc r15-9717] tree-sra: Do not create stores into const aggregates (PR111873)
https://gcc.gnu.org/g:c1db46f7e51d4a546ca536f7f10e548f02e5cc12

commit r15-9717-gc1db46f7e51d4a546ca536f7f10e548f02e5cc12
Author: Martin Jambor
Date:   Wed May 14 12:08:24 2025 +0200

    tree-sra: Do not create stores into const aggregates (PR111873)

    This patch fixes (hopefully the) one remaining place where gimple
    SRA was still creating a store into const aggregates.  It occurs
    when there is a replacement for a load but that replacement is not
    type compatible - typically because it is a single field structure.

    I have used testcases from duplicates because the original
    test-case no longer reproduces for me.

    gcc/ChangeLog:

    2025-05-13  Martin Jambor

            PR tree-optimization/111873
            * tree-sra.cc (sra_modify_expr): When processing a load which has
            a type-incompatible replacement, do not store the contents of the
            replacement into the original aggregate when that aggregate is
            const.

    gcc/testsuite/ChangeLog:

    2025-05-13  Martin Jambor

            * gcc.dg/ipa/pr120044-1.c: New test.
            * gcc.dg/ipa/pr120044-2.c: Likewise.
            * gcc.dg/tree-ssa/pr114864.c: Likewise.

(cherry picked from commit 9d039eff453f777c58642ff16178c1ce2a4be6ab) Diff: --- gcc/testsuite/gcc.dg/ipa/pr120044-1.c| 17 + gcc/testsuite/gcc.dg/ipa/pr120044-2.c| 17 + gcc/testsuite/gcc.dg/tree-ssa/pr114864.c | 15 +++ gcc/tree-sra.cc | 4 +++- 4 files changed, 52 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-1.c b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c new file mode 100644 index ..f9fee3e85afb --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c @@ -0,0 +1,17 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-inline" } */ + +struct a { + int b; +} const c; +void d(char p, struct a e) { + while (e.b) +; +} +static unsigned short f(const struct a g) { + d(g.b, g); + return g.b; +} +int main() { + return f(c); +} diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-2.c b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c new file mode 100644 index ..5130791f5444 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c @@ -0,0 +1,17 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-ipa-cp" } */ + +struct a { + int b; +} const c; +void d(char p, struct a e) { + while (e.b) +; +} +static unsigned short f(const struct a g) { + d(g.b, g); + return g.b; +} +int main() { + return f(c); +} diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c new file mode 100644 index ..cd9b94c094fc --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c @@ -0,0 +1,15 @@ +/* { dg-do run } */ +/* { dg-options "-O1 -fno-tree-dce -fno-tree-fre" } */ + +struct a { + int b; +} const c; +void d(const struct a f) {} +void e(const struct a f) { + f.b == 0 ? 
1 : f.b; + d(f); +} +int main() { + e(c); + return 0; +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 302b73e83b8f..4b6daf772841 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -4205,8 +4205,10 @@ sra_modify_expr (tree *expr, bool write, gimple_stmt_iterator *stmt_gsi, } else { - gassign *stmt; + if (TREE_READONLY (access->base)) + return false; + gassign *stmt; if (access->grp_partial_lhs) repl = force_gimple_operand_gsi (stmt_gsi, repl, true, NULL_TREE, true,
[gcc r14-11797] tree-sra: Do not create stores into const aggregates (PR111873)
https://gcc.gnu.org/g:92d8b9970ea2ed59010a5f1a394cb98adffa63e8

commit r14-11797-g92d8b9970ea2ed59010a5f1a394cb98adffa63e8
Author: Martin Jambor
Date:   Wed May 14 12:08:24 2025 +0200

    tree-sra: Do not create stores into const aggregates (PR111873)

    This patch fixes (hopefully the) one remaining place where gimple
    SRA was still creating a store into const aggregates.  It occurs
    when there is a replacement for a load but that replacement is not
    type compatible - typically because it is a single field structure.

    I have used testcases from duplicates because the original
    test-case no longer reproduces for me.

    gcc/ChangeLog:

    2025-05-13  Martin Jambor

            PR tree-optimization/111873
            * tree-sra.cc (sra_modify_expr): When processing a load which has
            a type-incompatible replacement, do not store the contents of the
            replacement into the original aggregate when that aggregate is
            const.

    gcc/testsuite/ChangeLog:

    2025-05-13  Martin Jambor

            * gcc.dg/ipa/pr120044-1.c: New test.
            * gcc.dg/ipa/pr120044-2.c: Likewise.
            * gcc.dg/tree-ssa/pr114864.c: Likewise.

(cherry picked from commit 9d039eff453f777c58642ff16178c1ce2a4be6ab) Diff: --- gcc/testsuite/gcc.dg/ipa/pr120044-1.c| 17 + gcc/testsuite/gcc.dg/ipa/pr120044-2.c| 17 + gcc/testsuite/gcc.dg/tree-ssa/pr114864.c | 15 +++ gcc/tree-sra.cc | 4 +++- 4 files changed, 52 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-1.c b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c new file mode 100644 index ..f9fee3e85afb --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c @@ -0,0 +1,17 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-inline" } */ + +struct a { + int b; +} const c; +void d(char p, struct a e) { + while (e.b) +; +} +static unsigned short f(const struct a g) { + d(g.b, g); + return g.b; +} +int main() { + return f(c); +} diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-2.c b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c new file mode 100644 index ..5130791f5444 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c @@ -0,0 +1,17 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-ipa-cp" } */ + +struct a { + int b; +} const c; +void d(char p, struct a e) { + while (e.b) +; +} +static unsigned short f(const struct a g) { + d(g.b, g); + return g.b; +} +int main() { + return f(c); +} diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c new file mode 100644 index ..cd9b94c094fc --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c @@ -0,0 +1,15 @@ +/* { dg-do run } */ +/* { dg-options "-O1 -fno-tree-dce -fno-tree-fre" } */ + +struct a { + int b; +} const c; +void d(const struct a f) {} +void e(const struct a f) { + f.b == 0 ? 
1 : f.b; + d(f); +} +int main() { + e(c); + return 0; +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 46ddd41fdcb9..6e09476418cd 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -4179,8 +4179,10 @@ sra_modify_expr (tree *expr, bool write, gimple_stmt_iterator *stmt_gsi, } else { - gassign *stmt; + if (TREE_READONLY (access->base)) + return false; + gassign *stmt; if (access->grp_partial_lhs) repl = force_gimple_operand_gsi (stmt_gsi, repl, true, NULL_TREE, true,
[gcc r13-9724] tree-sra: Do not create stores into const aggregates (PR111873)
https://gcc.gnu.org/g:a067a18d42e338aea990347bb4d16d6a852c4480

commit r13-9724-ga067a18d42e338aea990347bb4d16d6a852c4480
Author: Martin Jambor
Date:   Wed May 14 12:08:24 2025 +0200

    tree-sra: Do not create stores into const aggregates (PR111873)

    This patch fixes (hopefully the) one remaining place where gimple
    SRA was still creating a store into const aggregates.  It occurs
    when there is a replacement for a load but that replacement is not
    type compatible - typically because it is a single field structure.

    I have used testcases from duplicates because the original
    test-case no longer reproduces for me.

    gcc/ChangeLog:

    2025-05-13  Martin Jambor

            PR tree-optimization/111873
            * tree-sra.cc (sra_modify_expr): When processing a load which has
            a type-incompatible replacement, do not store the contents of the
            replacement into the original aggregate when that aggregate is
            const.

    gcc/testsuite/ChangeLog:

    2025-05-13  Martin Jambor

            * gcc.dg/ipa/pr120044-1.c: New test.
            * gcc.dg/ipa/pr120044-2.c: Likewise.
            * gcc.dg/tree-ssa/pr114864.c: Likewise.

(cherry picked from commit 9d039eff453f777c58642ff16178c1ce2a4be6ab) Diff: --- gcc/testsuite/gcc.dg/ipa/pr120044-1.c| 17 + gcc/testsuite/gcc.dg/ipa/pr120044-2.c| 17 + gcc/testsuite/gcc.dg/tree-ssa/pr114864.c | 15 +++ gcc/tree-sra.cc | 4 +++- 4 files changed, 52 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-1.c b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c new file mode 100644 index ..f9fee3e85afb --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c @@ -0,0 +1,17 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-inline" } */ + +struct a { + int b; +} const c; +void d(char p, struct a e) { + while (e.b) +; +} +static unsigned short f(const struct a g) { + d(g.b, g); + return g.b; +} +int main() { + return f(c); +} diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-2.c b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c new file mode 100644 index ..5130791f5444 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c @@ -0,0 +1,17 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-ipa-cp" } */ + +struct a { + int b; +} const c; +void d(char p, struct a e) { + while (e.b) +; +} +static unsigned short f(const struct a g) { + d(g.b, g); + return g.b; +} +int main() { + return f(c); +} diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c new file mode 100644 index ..cd9b94c094fc --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c @@ -0,0 +1,15 @@ +/* { dg-do run } */ +/* { dg-options "-O1 -fno-tree-dce -fno-tree-fre" } */ + +struct a { + int b; +} const c; +void d(const struct a f) {} +void e(const struct a f) { + f.b == 0 ? 
1 : f.b; + d(f); +} +int main() { + e(c); + return 0; +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index ec499fdd5109..c3c0a70338d2 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -3988,8 +3988,10 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write) } else { - gassign *stmt; + if (TREE_READONLY (access->base)) + return false; + gassign *stmt; if (access->grp_partial_lhs) repl = force_gimple_operand_gsi (gsi, repl, true, NULL_TREE, true, GSI_SAME_STMT);
[gcc r16-960] ipa: When inlining, don't combine PT JFs changing signedness (PR120295)
https://gcc.gnu.org/g:0b004c92f5ea239936a403a2a757e12ca82ce6d8

commit r16-960-g0b004c92f5ea239936a403a2a757e12ca82ce6d8
Author: Martin Jambor
Date:   Thu May 29 16:32:04 2025 +0200

    ipa: When inlining, don't combine PT JFs changing signedness (PR120295)

    In GCC 15 we allowed jump-function generation code to skip over a
    type-cast converting one integer to another as long as the latter
    can hold all the values of the former or has at least the same
    precision.  This works well for IPA-CP where we do then evaluate
    each jump function as we propagate values and value-ranges.

    However, the test-case in PR 120295 shows a problem with inlining,
    where we combine pass-through jump-functions so that they are
    always relative to the function which is the root of the inline
    tree.  Unfortunately, we are happy to combine also those with
    type-casts to a different signedness, which makes us use zero
    extension for the expected value ranges where we should have used
    sign extension.  When the value-range which then leads to wrong
    insertion of a call to builtin_unreachable is being computed, the
    information about the existence of an intermediate signed type has
    already been lost during previous inlining.

    This patch simply blocks combining such jump-functions so that it
    is back-portable to GCC 15.  Once we switch pass-through jump
    functions to use a vector of operations rather than having room
    for just one, we will be able to address this situation by adding
    an extra conversion instead.

    gcc/ChangeLog:

    2025-05-19  Martin Jambor

            PR ipa/120295
            * ipa-prop.cc (update_jump_functions_after_inlining): Do not
            combine pass-through jump functions with type-casts changing
            signedness.

    gcc/testsuite/ChangeLog:

    2025-05-19  Martin Jambor

            PR ipa/120295
            * gcc.dg/ipa/pr120295.c: New test.

Diff: --- gcc/ipa-prop.cc | 28 gcc/testsuite/gcc.dg/ipa/pr120295.c | 66 + 2 files changed, 94 insertions(+) diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index 24a538034e31..84d4fb5db674 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -3330,6 +3330,10 @@ update_jump_functions_after_inlining (struct cgraph_edge *cs, ipa_edge_args *args = ipa_edge_args_sum->get (e); if (!args) return; + ipa_node_params *old_inline_root_info = ipa_node_params_sum->get (cs->callee); + ipa_node_params *new_inline_root_info += ipa_node_params_sum->get (cs->caller->inlined_to + ? cs->caller->inlined_to : cs->caller); int count = ipa_get_cs_argument_count (args); int i; @@ -3541,6 +3545,30 @@ update_jump_functions_after_inlining (struct cgraph_edge *cs, enum tree_code operation; operation = ipa_get_jf_pass_through_operation (src); + tree old_ir_ptype = ipa_get_type (old_inline_root_info, + dst_fid); + tree new_ir_ptype = ipa_get_type (new_inline_root_info, + formal_id); + if (!useless_type_conversion_p (old_ir_ptype, new_ir_ptype)) + { + /* Jump-function construction now permits type-casts + from an integer to another if the latter can hold + all values or has at least the same precision. + However, as we're combining multiple pass-through + functions together, we are losing information about + signedness and thus if conversions should sign or + zero extend. Therefore we must prevent combining + such jump-function if signednesses do not match. 
*/ + if (!INTEGRAL_TYPE_P (old_ir_ptype) + || !INTEGRAL_TYPE_P (new_ir_ptype) + || (TYPE_UNSIGNED (new_ir_ptype) + != TYPE_UNSIGNED (old_ir_ptype))) + { + ipa_set_jf_unknown (dst); + continue; + } + } + if (operation == NOP_EXPR) { bool agg_p; diff --git a/gcc/testsuite/gcc.dg/ipa/pr120295.c b/gcc/testsuite/gcc.dg/ipa/pr120295.c new file mode 100644 index ..2033ee9493d2 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/pr120295.c @@ -0,0 +1,66 @@ +/* { dg-do run } */ +/* { dg-options "-O3" } */ + +struct { + signed a; +} b; +int a, f, j, l; +char c, k, g, e; +short d[2] = {0}; +int *i = &j; + +volatile int glob; +void __attribute__((noipa)) sth (const char *, int a) +{ + glob = a; + return; +
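
The difference between the two extensions that the commit message refers to can be seen in isolation. The following is a minimal sketch and is unrelated to the actual pr120295.c testcase: the function names are hypothetical and the 8-bit to 32-bit widths are chosen only to keep the numbers small.

```c
#include <assert.h>
#include <stdint.h>

/* Widen an 8-bit value to 32 bits directly from its signed type,
   which sign-extends.  Hypothetical helper for illustration only.  */
int32_t
widen_sign_extend (int8_t v)
{
  return (int32_t) v;
}

/* Widen the same bits after reinterpreting them as unsigned,
   which zero-extends.  The int8_t to uint8_t conversion is
   well-defined (modulo 256).  */
int32_t
widen_zero_extend (int8_t v)
{
  return (int32_t) (uint8_t) v;
}

/* Show that the two paths agree for non-negative inputs and disagree
   for negative ones.  */
void
check_widening (void)
{
  assert (widen_sign_extend (100) == 100);
  assert (widen_zero_extend (100) == 100);
  assert (widen_sign_extend (-1) == -1);
  assert (widen_zero_extend (-1) == 255);
}
```

A value range derived under the zero-extension assumption would be [0, 255] and would exclude the -1 that the sign-extending path actually produces. That is the kind of mismatch which, per the commit message, led to the wrong insertion of an unreachable call, and which the patch avoids by refusing to combine jump functions whose signedness differs.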
[gcc r16-959] ipa: Fix whitespace when dumping VR in jump_functions
https://gcc.gnu.org/g:71e6b7b26a5169d217a62f34acbbc43c592b24bd

commit r16-959-g71e6b7b26a5169d217a62f34acbbc43c592b24bd
Author: Martin Jambor
Date:   Thu May 29 16:32:04 2025 +0200

    ipa: Fix whitespace when dumping VR in jump_functions

    Lack of whitespace breaks the tree-visualisation structure and
    makes the dump unnecessarily difficult to read.

    gcc/ChangeLog:

    2025-05-19  Martin Jambor

            * ipa-prop.cc (ipa_dump_jump_function): Fix whitespace when
            dumping IPA VRs.

Diff: --- gcc/ipa-prop.cc | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index 0398d69962f8..24a538034e31 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -542,6 +542,7 @@ ipa_dump_jump_function (FILE *f, ipa_jump_func *jump_func, if (jump_func->m_vr) { + fprintf (f, " "); jump_func->m_vr->dump (f); fprintf (f, "\n"); }