[gcc r15-346] sra: Do not leave work for DSE (that it can sometimes not perform)

2024-05-09 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:f6743695b4d2bd4da96e56a19157372f93b800bd

commit r15-346-gf6743695b4d2bd4da96e56a19157372f93b800bd
Author: Martin Jambor 
Date:   Thu May 9 16:39:44 2024 +0200

sra: Do not leave work for DSE (that it can sometimes not perform)

When looking again at the g++.dg/tree-ssa/pr109849.C testcase we
discovered that it generates terrible store-to-load forwarding stalls
because SRA was leaving behind aggregate loads but all the stores were
by scalar parts and DSE failed to remove the useless load.  SRA has
all the knowledge to remove the statement even now, so this small
patch makes it do so.
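
As a rough illustration of the pattern (a hypothetical reduction, not the
pr109849.C testcase; struct pair and sum_via_copy are invented names),
consider an aggregate that is copied from memory and afterwards only read
through its scalar parts.  Once SRA has created scalar replacements that are
loaded directly from *p, the aggregate store to cur has no reader left, and
that is the kind of statement the patch lets SRA delete on its own:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type and function names, for illustration only.  */
struct pair { int a; int b; };

int
sum_via_copy (const struct pair *p)
{
  assert (p != NULL);
  struct pair cur = *p;   /* Aggregate store to cur from a non-candidate.  */
  return cur.a + cur.b;   /* cur is only ever read by scalar parts.  */
}
```

Whether a given compiler version actually drops the copy can be checked with
-fdump-tree-optimized, as the adjusted testcase does.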

With this patch, the g++.dg/tree-ssa/pr109849.C micro-benchmark runs 9
times faster (on an AMD EPYC 75F3 machine).

gcc/ChangeLog:

2024-04-18  Martin Jambor  

* tree-sra.cc (sra_modify_assign): Remove the original statement
also when dealing with a store to a fully covered aggregate from a
non-candidate.

gcc/testsuite/ChangeLog:

2024-04-23  Martin Jambor  

* g++.dg/tree-ssa/pr109849.C: Also check that the aggregate store
to cur disappears.
* gcc.dg/tree-ssa/ssa-dse-26.c: Instead of relying on DSE,
check that the unwanted stores were removed at early SRA time.

Diff:
---
 gcc/testsuite/g++.dg/tree-ssa/pr109849.C   |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c |  6 +++---
 gcc/tree-sra.cc| 14 --
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
index cd348c0f5906..d06dbb104829 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-sra" } */
+/* { dg-options "-O2 -fdump-tree-sra -fdump-tree-optimized" } */
 
 #include 
 typedef unsigned int uint32_t;
@@ -29,3 +29,4 @@ main()
 }
 
 /* { dg-final { scan-tree-dump "Created a replacement for stack offset" "sra"} } */
+/* { dg-final { scan-tree-dump-not "cur = MEM" "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
index 43152de56163..1d01392c5957 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dse1-details -fno-short-enums -fno-tree-fre" } */
+/* { dg-options "-O2 -fdump-tree-esra -fno-short-enums -fno-tree-fre" } */
 /* { dg-skip-if "we want a BIT_FIELD_REF from fold_truth_andor" { ! lp64 } } */
 /* { dg-skip-if "temporary variable names are not x and y" { mmix-knuth-mmixware } } */
 
@@ -31,5 +31,5 @@ constraint_equal (struct constraint a, struct constraint b)
 && constraint_expr_equal (a.rhs, b.rhs);
 }
 
-/* { dg-final { scan-tree-dump-times "Deleted dead store: x = " 2 "dse1" } } */
-/* { dg-final { scan-tree-dump-times "Deleted dead store: y = " 2 "dse1" } } */
+/* { dg-final { scan-tree-dump-not "x = " "esra" } } */
+/* { dg-final { scan-tree-dump-not "y = " "esra" } } */
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 32fa28911f2d..8040b0c56451 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -4854,8 +4854,18 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)
 But use the RHS aggregate to load from to expose more
 optimization opportunities.  */
  if (access_has_children_p (lacc))
-   generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
-0, 0, gsi, true, true, loc);
+   {
+ generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
+  0, 0, gsi, true, true, loc);
+ if (lacc->grp_covered)
+   {
+ unlink_stmt_vdef (stmt);
+ gsi_remove (& orig_gsi, true);
+ release_defs (stmt);
+ sra_stats.deleted++;
+ return SRA_AM_REMOVED;
+   }
+   }
}
 
   return SRA_AM_NONE;


[gcc r13-8773] ICF&SRA: Make ICF and SRA agree on padding

2024-05-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:10bf53a80eefa46500bffb442719777e2640e7d7

commit r13-8773-g10bf53a80eefa46500bffb442719777e2640e7d7
Author: Martin Jambor 
Date:   Mon Apr 8 18:53:23 2024 +0200

ICF&SRA: Make ICF and SRA agree on padding

PR 113359 shows that (at least with -fno-strict-aliasing) ICF can
unify two functions which copy an aggregate type of the same size but
then SRA, through its total scalarization, can copy the aggregate by
pieces, skipping padding, but the padding was not the same in the two
original functions that ICF unified.

This patch enhances SRA with the ability to collect padding
information which then can be compared from within ICF.  Unfortunately
SRA uses OPTION_SET_P when determining its limits, so ICF needs to
switch cfuns at least once to figure it out too.
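
To make the mismatch concrete, here is a minimal sketch with two
hypothetical types (pad_front and pad_back are invented for illustration
and are not taken from the PR 113359 testcases).  On common ABIs both have
the same size, so a whole-aggregate copy looks identical for either, but a
field-by-field copy would skip different bytes:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, for illustration only.  */
struct pad_front { char c; int i; };   /* Padding, if any, sits between c and i.  */
struct pad_back  { int i; char c; };   /* Padding, if any, sits after c.  */

/* Offset of the first byte after member c, where padding would start.  */
size_t
first_byte_after_c_front (void)
{
  return offsetof (struct pad_front, c) + sizeof (char);
}

size_t
first_byte_after_c_back (void)
{
  return offsetof (struct pad_back, c) + sizeof (char);
}

/* Same size (assuming a typical ABI), different padding position.  */
void
check_layouts (void)
{
  assert (sizeof (struct pad_front) == sizeof (struct pad_back));
  assert (first_byte_after_c_front () != first_byte_after_c_back ());
}
```

If ICF unified two functions copying such types, total scalarization of the
merged body would preserve the bytes of only one of the two layouts.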

gcc/ChangeLog:

2024-03-27  Martin Jambor  

PR ipa/113359
* ipa-icf-gimple.h (func_checker): New members
safe_for_total_scalarization_p, m_total_scalarization_limit_known_p
and m_total_scalarization_limit.
(func_checker::func_checker): Initialize new member variables.
* ipa-icf-gimple.cc: Include tree-sra.h.
(func_checker::func_checker): Initialize new member variables.
(func_checker::safe_for_total_scalarization_p): New function.
(func_checker::compare_operand): Use the new function.
* tree-sra.h (sra_get_max_scalarization_size): Declare.
(sra_total_scalarization_would_copy_same_data_p): Likewise.
* tree-sra.cc (prepare_iteration_over_array_elts): New function.
(class sra_padding_collecting): New.
(sra_padding_collecting::record_padding): Likewise.
(scalarizable_type_p): Rename to totally_scalarizable_type_p.  Add
ability to record padding when requested.
(totally_scalarize_subtree): Split out gathering information necessary
to iterate over array elements to prepare_iteration_over_array_elts.
Fix erroneous early exit.
(analyze_all_variable_accesses): Adjust the call to
totally_scalarizable_type_p.  Move determining of total scalarization
size limit...
(sra_get_max_scalarization_size): ...here.
(check_ts_and_push_padding_to_vec): New function.
(sra_total_scalarization_would_copy_same_data_p): Likewise.

gcc/testsuite/ChangeLog:

2024-03-27  Martin Jambor  

PR ipa/113359
* gcc.dg/lto/pr113359-1_0.c: New.
* gcc.dg/lto/pr113359-1_1.c: Likewise.
* gcc.dg/lto/pr113359-2_0.c: Likewise.
* gcc.dg/lto/pr113359-2_1.c: Likewise.
* gcc.dg/lto/pr113359-3_0.c: Likewise.
* gcc.dg/lto/pr113359-3_1.c: Likewise.
* gcc.dg/lto/pr113359-4_0.c: Likewise.
* gcc.dg/lto/pr113359-4_1.c: Likewise.
* gcc.dg/lto/pr113359-5_0.c: Likewise.
* gcc.dg/lto/pr113359-5_1.c: Likewise.

(cherry picked from commit 1e3312a25a7b34d6e3f549273e1674c7114e4408)

Diff:
---
 gcc/ipa-icf-gimple.cc   |  41 +-
 gcc/ipa-icf-gimple.h|  15 +-
 gcc/testsuite/gcc.dg/lto/pr113359-1_0.c |  86 +++
 gcc/testsuite/gcc.dg/lto/pr113359-1_1.c |  38 +
 gcc/testsuite/gcc.dg/lto/pr113359-2_0.c |  87 +++
 gcc/testsuite/gcc.dg/lto/pr113359-2_1.c |  38 +
 gcc/testsuite/gcc.dg/lto/pr113359-3_0.c | 114 +++
 gcc/testsuite/gcc.dg/lto/pr113359-3_1.c |  49 +++
 gcc/testsuite/gcc.dg/lto/pr113359-4_0.c | 114 +++
 gcc/testsuite/gcc.dg/lto/pr113359-4_1.c |  49 +++
 gcc/testsuite/gcc.dg/lto/pr113359-5_0.c | 118 +++
 gcc/testsuite/gcc.dg/lto/pr113359-5_1.c |  50 +++
 gcc/tree-sra.cc | 252 +---
 gcc/tree-sra.h  |   3 +
 14 files changed, 999 insertions(+), 55 deletions(-)

diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc
index f4180c0fa813..49302ad56c65 100644
--- a/gcc/ipa-icf-gimple.cc
+++ b/gcc/ipa-icf-gimple.cc
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "attribs.h"
 #include "gimple-walk.h"
+#include "tree-sra.h"
 
 #include "tree-ssa-alias-compare.h"
 #include "ipa-icf-gimple.h"
@@ -59,7 +60,8 @@ func_checker::func_checker (tree source_func_decl, tree target_func_decl,
   : m_source_func_decl (source_func_decl), m_target_func_decl (target_func_decl),
 m_ignored_source_nodes (ignored_source_nodes),
 m_ignored_target_nodes (ignored_target_nodes),
-m_ignore_labels (ignore_labels), m_tbaa (tbaa)
+m_ignore_labels (ignore_labels), m_tbaa (tbaa),
+m_total_scalarization_limit_known_p (false)
 {
   function *source_func = DECL_STRUCT_FUNCTION (source_func_decl);
   function *target_func = DECL_STRUCT_FUNCTION (target_func_de

[gcc r13-8774] ipa: Compare jump functions in ICF (PR 113907)

2024-05-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:1db45e83021a8a87f41e22053910fcce6e8e2c2c

commit r13-8774-g1db45e83021a8a87f41e22053910fcce6e8e2c2c
Author: Martin Jambor 
Date:   Tue May 14 17:01:21 2024 +0200

ipa: Compare jump functions in ICF (PR 113907)

This is a manual backport of r14-9840-g1162861439fd3c from master.
Manual because the bits and value range representation in jump
functions have changed during the gcc 14 development cycle.

In PR 113907 comment #58, Honza found a case where ICF thinks bodies
of functions are equivalent but because of a difference in aliases in a
memory access, different aggregate jump functions are associated with
supposedly equivalent call statements.  This patch adds a way to
compare jump functions and plugs it into ICF to avoid the issue.

gcc/ChangeLog:

2024-05-14  Martin Jambor  

PR ipa/113907
* ipa-prop.h (ipa_jump_functions_equivalent_p): Declare.
(values_equal_for_ipcp_p): Likewise.
* ipa-prop.cc (ipa_agg_pass_through_jf_equivalent_p): New function.
(ipa_agg_jump_functions_equivalent_p): Likewise.
(ipa_jump_functions_equivalent_p): Likewise.
* ipa-cp.cc (values_equal_for_ipcp_p): Make function public.
* ipa-icf-gimple.cc: Include alloc-pool.h, symbol-summary.h, sreal.h,
ipa-cp.h and ipa-prop.h.
(func_checker::compare_gimple_call): Compare jump functions.

gcc/testsuite/ChangeLog:

2024-05-10  Martin Jambor  

PR ipa/113907
* gcc.dg/lto/pr113907_0.c: New.
* gcc.dg/lto/pr113907_1.c: Likewise.
* gcc.dg/lto/pr113907_2.c: Likewise.

Diff:
---
 gcc/ipa-cp.cc |   2 +-
 gcc/ipa-icf-gimple.cc |  29 +++
 gcc/ipa-prop.cc   | 157 ++
 gcc/ipa-prop.h|   3 +
 gcc/testsuite/gcc.dg/lto/pr113907_0.c |  18 
 gcc/testsuite/gcc.dg/lto/pr113907_1.c |  35 
 gcc/testsuite/gcc.dg/lto/pr113907_2.c |  11 +++
 7 files changed, 254 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index b3e0f62e4003..8f36608cf33b 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -458,7 +458,7 @@ ipcp_lattice::is_single_const ()
 
 /* Return true iff X and Y should be considered equal values by IPA-CP.  */
 
-static bool
+bool
 values_equal_for_ipcp_p (tree x, tree y)
 {
   gcc_checking_assert (x != NULL_TREE && y != NULL_TREE);
diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc
index 49302ad56c65..054a557bd588 100644
--- a/gcc/ipa-icf-gimple.cc
+++ b/gcc/ipa-icf-gimple.cc
@@ -42,7 +42,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-sra.h"
 
 #include "tree-ssa-alias-compare.h"
+#include "alloc-pool.h"
+#include "symbol-summary.h"
 #include "ipa-icf-gimple.h"
+#include "sreal.h"
+#include "ipa-prop.h"
 
 namespace ipa_icf_gimple {
 
@@ -751,6 +755,31 @@ func_checker::compare_gimple_call (gcall *s1, gcall *s2)
   && !compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2)))
 return return_false_with_msg ("GIMPLE internal call LHS type mismatch");
 
+  if (!gimple_call_internal_p (s1))
+{
+  cgraph_edge *e1 = cgraph_node::get (m_source_func_decl)->get_edge (s1);
+  cgraph_edge *e2 = cgraph_node::get (m_target_func_decl)->get_edge (s2);
+  class ipa_edge_args *args1 = ipa_edge_args_sum->get (e1);
+  class ipa_edge_args *args2 = ipa_edge_args_sum->get (e2);
+  if ((args1 != nullptr) != (args2 != nullptr))
+   return return_false_with_msg ("ipa_edge_args mismatch");
+  if (args1)
+   {
+ int n1 = ipa_get_cs_argument_count (args1);
+ int n2 = ipa_get_cs_argument_count (args2);
+ if (n1 != n2)
+   return return_false_with_msg ("ipa_edge_args nargs mismatch");
+ for (int i = 0; i < n1; i++)
+   {
+ struct ipa_jump_func *jf1 = ipa_get_ith_jump_func (args1, i);
+ struct ipa_jump_func *jf2 = ipa_get_ith_jump_func (args2, i);
+ if (((jf1 != nullptr) != (jf2 != nullptr))
+ || (jf1 && !ipa_jump_functions_equivalent_p (jf1, jf2)))
+   return return_false_with_msg ("jump function mismatch");
+   }
+   }
+}
+
   return compare_operand (t1, t2, get_operand_access_type (&map, t1));
 }
 
diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 0d8167495341..11ba2521b2c9 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -6022,5 +6022,162 @@ ipcp_transform_function (struct cgraph_node *node)
   return modified_mem_access ? TODO_update_ssa_only_virtuals : 0;
 }
 
+/* Return true if the two pass_through components of two jump functions are
+   known to be equivalent.  AGG_JF denotes whether they are part of aggregate
+   functions or not.  The function can be used before the IPA phase of IPA-CP
+   or inlining because it cannot cope with refdesc changes these

[gcc r12-10442] ipa: Force args obtained through pass-through maps to the expected type (PR 114247)

2024-05-15 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:44191982c6bd41db1c9d126ea2f15febec3c1f81

commit r12-10442-g44191982c6bd41db1c9d126ea2f15febec3c1f81
Author: Martin Jambor 
Date:   Tue May 14 14:13:36 2024 +0200

ipa: Force args obtained through pass-through maps to the expected type (PR 114247)

Interactions of IPA-CP and IPA-SRA on the same data are a rather big
source of issues, I'm afraid.  PR 113964 is a situation where IPA-CP
propagates an unsigned short in a union parameter into a function
which itself calls a different function which has the same union
parameter and both these union parameters are split with IPA-SRA.  The
leaf function however uses a signed short member of the union.

In the calling function, we get the unsigned constant as the
replacement for the union and it is then passed in the call without
any type compatibility checks.  Apparently on riscv64 it matters
whether the parameter is signed or unsigned short and so the leaf
function can see different values.

Fixed by using useless_type_conversion_p at the appropriate place and
if it fails, use force_value_to_type as elsewhere in similar
situations.
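
The effect of the missing conversion can be seen in plain C.  The following
is only a sketch with invented names (union u and the two reader functions
are hypothetical, modelled loosely on the new testcase): the same bits give
different values depending on whether they are read through the unsigned or
the signed member, which is why the replacement has to be converted to the
type the callee expects.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical union, for illustration only.  */
union u { unsigned short b; signed short d; };

/* Widen the unsigned member.  */
long
read_unsigned_member (union u k)
{
  return k.b;
}

/* Widen the signed member, which sign-extends.  */
long
read_signed_member (union u k)
{
  return k.d;
}

/* The two readers disagree on the same bit pattern (assuming the usual
   two's complement representation).  */
void
check_members_disagree (void)
{
  union u k;
  k.d = -1;
  assert (read_signed_member (k) == -1);
  assert (read_unsigned_member (k) == USHRT_MAX);
}
```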

gcc/ChangeLog:

2024-04-04  Martin Jambor  

PR ipa/114247
* ipa-param-manipulation.cc (ipa_param_adjustments::modify_call):
Force values obtained through pass-through maps to the expected
split type.

gcc/testsuite/ChangeLog:

2024-04-04  Patrick O'Neill  
Martin Jambor  

PR ipa/114247
* gcc.dg/ipa/pr114247.c: New test.

(cherry picked from commit 8cd0d29270d4ed86c69b80c08de66dcb6c1e22fe)

Diff:
---
 gcc/ipa-param-manipulation.cc   |  6 ++
 gcc/testsuite/gcc.dg/ipa/pr114247.c | 31 +++
 2 files changed, 37 insertions(+)

diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc
index 38328c3e8d0a..3472ef13bc2f 100644
--- a/gcc/ipa-param-manipulation.cc
+++ b/gcc/ipa-param-manipulation.cc
@@ -719,6 +719,12 @@ ipa_param_adjustments::modify_call (cgraph_edge *cs,
  }
   if (repl)
{
+ if (!useless_type_conversion_p(apm->type, repl->typed.type))
+   {
+ repl = force_value_to_type (apm->type, repl);
+ repl = force_gimple_operand_gsi (&gsi, repl,
+  true, NULL, true, GSI_SAME_STMT);
+   }
  vargs.quick_push (repl);
  continue;
}
diff --git a/gcc/testsuite/gcc.dg/ipa/pr114247.c b/gcc/testsuite/gcc.dg/ipa/pr114247.c
new file mode 100644
index ..60aa2bc0122f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr114247.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fsigned-char -fno-strict-aliasing -fwrapv" } */
+
+union a {
+  unsigned short b;
+  int c;
+  signed short d;
+};
+int e, f = 1, g;
+long h;
+const int **i;
+void j(union a k, int l, unsigned m) {
+  const int *a[100];
+  i = &a[0];
+  h = k.d;
+}
+static int o(union a k) {
+  k.d = -1;
+  while (1)
+if (f)
+  break;
+  j(k, g, e);
+  return 0;
+}
+int main() {
+  union a n = {1};
+  o(n);
+  if (h != -1)
+__builtin_abort();
+  return 0;
+}


[gcc r12-10443] ipa: Self-DCE of uses of removed call LHSs (PR 108007)

2024-05-15 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:2183e5b5aa3a080624cb95a06993e34dedd09cb2

commit r12-10443-g2183e5b5aa3a080624cb95a06993e34dedd09cb2
Author: Martin Jambor 
Date:   Mon Apr 8 17:34:33 2024 +0200

ipa: Self-DCE of uses of removed call LHSs (PR 108007)

PR 108007 is another manifestation where we rely on DCE to clean up
after IPA-SRA and if the user explicitly switches DCE off, IPA-SRA
can leave behind statements which are fed uninitialized values and
trap, even though their results are themselves never used.

I have already fixed this for unused parameters in callees, this bug
shows that almost the same thing can happen for removed returns, on
the side of callers.  This means that the issue has to be fixed
elsewhere, in call redirection.  This patch adds a function which
looks for (and through, using a work-list) uses of operations fed
specific SSA names and removes them all.
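
The work-list idea can be sketched on a toy def-use graph.  Everything below
is a simplified assumption for illustration: the matrix, N_STMTS and
purge_all_uses_sketch are invented, and the real function operates on GIMPLE
statements and SSA names rather than on array indices.

```c
#include <assert.h>
#include <stdbool.h>

#define N_STMTS 6

/* Toy IR: uses[i][j] is true when statement j consumes the value defined
   by statement i.  Here 0 feeds 1, 1 feeds 2 and 3, and 4 feeds 5.  */
static const bool uses[N_STMTS][N_STMTS] = {
  [0] = { [1] = true },
  [1] = { [2] = true, [3] = true },
  [4] = { [5] = true },
};

/* Mark every statement transitively fed by START as killed and return how
   many statements were marked.  KILLED must be all false on entry.  */
int
purge_all_uses_sketch (int start, bool killed[N_STMTS])
{
  assert (start >= 0 && start < N_STMTS);
  int worklist[N_STMTS];
  int top = 0, count = 0;

  worklist[top++] = start;
  while (top > 0)
    {
      int def = worklist[--top];
      for (int user = 0; user < N_STMTS; user++)
        if (uses[def][user] && !killed[user])
          {
            /* Each statement is marked, and therefore pushed, at most once.  */
            killed[user] = true;
            count++;
            worklist[top++] = user;
          }
    }
  return count;
}
```

Starting from statement 0, the sketch marks statements 1, 2 and 3 and leaves
the unrelated chain from 4 to 5 alone.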

That would have been easy if it wasn't for debug statements during
tree-inline (from which call redirection is also invoked).  Debug
statements are decoupled from the rest at this point and iterating
over uses of SSAs does not bring them up.  During tree-inline they are
handled specially at the end, I assume in order to make sure that
the relative ordering of UIDs is the same with and without debug info.

This means that during tree-inline we need to make a hash of killed
SSAs, that we already have in copy_body_data, available to the
function making the purging.  So the patch duly does also that, making
the interface slightly ugly.  Moreover, all newly unused SSA names
need to be freed and as PR 112616 showed, it must be done in a defined
order, which is what newly added ipa_release_ssas_in_hash does.

This backport to gcc-13 also contains
54e505d0446f86b7ad383acbb8e5501f20872b64 in order not to reintroduce
PR 113757.

gcc/ChangeLog:

2024-04-05  Martin Jambor  

PR ipa/108007
PR ipa/112616
* cgraph.h (cgraph_edge): Add a parameter to
redirect_call_stmt_to_callee.
* ipa-param-manipulation.h (ipa_param_adjustments): Add a
parameter to modify_call.
(ipa_release_ssas_in_hash): Declare.
* cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee): New
parameter killed_ssas, pass it to padjs->modify_call.
* ipa-param-manipulation.cc (purge_all_uses): New function.
(ipa_param_adjustments::modify_call): New parameter killed_ssas.
Instead of substituting uses, invoke purge_all_uses.  If
hash of killed SSAs has not been provided, create a temporary one
and release SSAs that have been added to it.
(compare_ssa_versions): New function.
(ipa_release_ssas_in_hash): Likewise.
* tree-inline.cc (redirect_all_calls): Create
id->killed_new_ssa_names earlier, pass it to edge redirection,
adjust a comment.
(copy_body): Release SSAs in id->killed_new_ssa_names.

gcc/testsuite/ChangeLog:

2024-01-15  Martin Jambor  

PR ipa/108007
PR ipa/112616
* gcc.dg/ipa/pr108007.c: New test.
* gcc.dg/ipa/pr112616.c: Likewise.

(cherry picked from commit 40ddc0b05a47f999b24f20c1becb79004995731b)

Diff:
---
 gcc/cgraph.cc   |  10 +++-
 gcc/cgraph.h|   9 ++-
 gcc/ipa-param-manipulation.cc   | 112 +---
 gcc/ipa-param-manipulation.h|   5 +-
 gcc/testsuite/g++.dg/ipa/pr113757.C |  14 +
 gcc/testsuite/gcc.dg/ipa/pr108007.c |  32 +++
 gcc/testsuite/gcc.dg/ipa/pr112616.c |  28 +
 gcc/tree-inline.cc  |  27 -
 8 files changed, 193 insertions(+), 44 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 3734c85db637..b5cfa3b36c57 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -1403,11 +1403,17 @@ cgraph_edge::redirect_callee (cgraph_node *n)
speculative indirect call, remove "speculative" of the indirect call and
also redirect stmt to it's final direct target.
 
+   When called from within tree-inline, KILLED_SSAs has to contain the pointer
+   to killed_new_ssa_names within the copy_body_data structure and SSAs
+   discovered to be useless (if LHS is removed) will be added to it, otherwise
+   it needs to be NULL.
+
It is up to caller to iteratively transform each "speculative"
direct call as appropriate.  */
 
 gimple *
-cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e)
+cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e,
+  hash_set  *killed_ssas)
 {
   tree decl = gimple_call_fndecl (e->call_stmt);
   gcall *new_stmt;
@@ -1528,7 +1534,7 @@ cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e)
remove_stmt_from_eh_lp (e-

[gcc r13-8982] Compare loop bounds in ipa-icf

2024-08-19 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:e469654e5e7bdd823c5aa996075e903c6b4d47e2

commit r13-8982-ge469654e5e7bdd823c5aa996075e903c6b4d47e2
Author: Jan Hubicka 
Date:   Mon Aug 19 17:10:25 2024 +0200

Compare loop bounds in ipa-icf

Hi,
this testcase shows another problem with missing comparators for metadata
in ICF. With value ranges available to loop optimizations during early
opts we can estimate the number of iterations based on a guarding condition that
can be split away by the fnsplit pass. This patch disables ICF when
the number of iterations does not match.

Bootstrapped/regtested x86_64-linux, will commit it shortly

gcc/ChangeLog:

PR ipa/115277
* ipa-icf-gimple.cc (func_checker::compare_loops): compare loop
bounds.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr115277.c: New test.

(cherry picked from commit 0d19fbc7b0760ce665fa6a88cd40cfa0311358d7)

Diff:
---
 gcc/ipa-icf-gimple.cc  |  4 
 gcc/testsuite/gcc.c-torture/compile/pr115277.c | 28 ++
 2 files changed, 32 insertions(+)

diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc
index 054a557bd58..a844e74792a 100644
--- a/gcc/ipa-icf-gimple.cc
+++ b/gcc/ipa-icf-gimple.cc
@@ -542,6 +542,10 @@ func_checker::compare_loops (basic_block bb1, basic_block bb2)
 return return_false_with_msg ("unroll");
   if (!compare_variable_decl (l1->simduid, l2->simduid))
 return return_false_with_msg ("simduid");
+  if ((l1->any_upper_bound != l2->any_upper_bound)
+  || (l1->any_upper_bound
+ && (l1->nb_iterations_upper_bound != l2->nb_iterations_upper_bound)))
+return return_false_with_msg ("nb_iterations_upper_bound");
 
   return true;
 }
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr115277.c b/gcc/testsuite/gcc.c-torture/compile/pr115277.c
new file mode 100644
index 000..27449eb254f
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr115277.c
@@ -0,0 +1,28 @@
+int array[1000];
+void
+test (int a)
+{
+if (__builtin_expect (a > 3, 1))
+return;
+for (int i = 0; i < a; i++)
+array[i]=i;
+}
+void
+test2 (int a)
+{
+if (__builtin_expect (a > 10, 1))
+return;
+for (int i = 0; i < a; i++)
+array[i]=i;
+}
+int
+main()
+{
+test(1);
+test(2);
+test(3);
+test2(10);
+if (array[9] != 9)
+__builtin_abort ();
+return 0;
+}


[gcc r15-3070] sra: Avoid risking x87 mangling binary representation of a replacement (PR 58416)

2024-08-21 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:f577959f420ae404f99f630dadc1c0370734d0da

commit r15-3070-gf577959f420ae404f99f630dadc1c0370734d0da
Author: Martin Jambor 
Date:   Wed Aug 21 14:49:11 2024 +0200

sra: Avoid risking x87 mangling binary representation of a replacement (PR 58416)

PR 58416 shows that storing non-floating point data to floating point
scalar registers can lead to miscompilations when the data is
normalized or otherwise processed upon loading to a register.  To
avoid that risk, this patch detects situations where we have multiple
types and we decide to represent the data in a type with a mode that
is known to not be able to transfer actual bits reliably using the new
TARGET_MODE_CAN_TRANSFER_BITS hook.
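
The decision itself is small.  The sketch below restates it with plain
booleans standing in for the two queries the new predicate makes (whether
the mode of the first type can transfer bits, and whether the two types are
compatible); risks_mangled_repr and its parameters are invented names, not
GCC API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the new predicate: scalarizing is risky only
   when the chosen mode may alter bits and the two types differ.  */
bool
risks_mangled_repr (bool mode_transfers_bits, bool types_compatible)
{
  if (mode_transfers_bits)
    return false;
  return !types_compatible;
}

/* The three interesting combinations.  */
void
check_predicate_examples (void)
{
  assert (!risks_mangled_repr (true, false));   /* Bits survive: safe.  */
  assert (!risks_mangled_repr (false, true));   /* Same type: safe.  */
  assert (risks_mangled_repr (false, false));   /* Mixed types, lossy mode.  */
}
```

Only the last combination makes SRA mark the region as unscalarizable.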

gcc/ChangeLog:

2024-08-19  Martin Jambor  

PR target/58416
* tree-sra.cc (types_risk_mangled_binary_repr_p): New function.
(sort_and_splice_var_accesses): Use it.
(propagate_subaccesses_from_rhs): Likewise.

gcc/testsuite/ChangeLog:

2024-08-19  Martin Jambor  

PR target/58416
* gcc.dg/torture/pr58416.c: New test.

Diff:
---
 gcc/testsuite/gcc.dg/torture/pr58416.c | 32 
 gcc/tree-sra.cc| 28 +++-
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/torture/pr58416.c b/gcc/testsuite/gcc.dg/torture/pr58416.c
new file mode 100644
index ..0922b0e70890
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr58416.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+
+struct s {
+  char s[sizeof(long double)];
+};
+
+union u {
+  long double d;
+  struct s s;
+};
+
+int main()
+{
+  union u x = {0};
+#if __SIZEOF_LONG_DOUBLE__ == 16
+  x.s = (struct s){""};
+#elif __SIZEOF_LONG_DOUBLE__ == 12
+  x.s = (struct s){""};
+#elif __SIZEOF_LONG_DOUBLE__ == 8
+  x.s = (struct s){""};
+#elif __SIZEOF_LONG_DOUBLE__ == 4
+  x.s = (struct s){""};
+#endif
+
+  union u y = x;
+
+  for (unsigned char *p = (unsigned char *)&y + sizeof y;
+   p-- > (unsigned char *)&y;)
+if (*p != (unsigned char)'x')
+  __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 8040b0c56451..64e2f007d680 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -2335,6 +2335,19 @@ same_access_path_p (tree exp1, tree exp2)
   return true;
 }
 
+/* Return true when either T1 is a type that, when loaded into a register and
+   stored back to memory will yield the same bits or when both T1 and T2 are
+   compatible.  */
+
+static bool
+types_risk_mangled_binary_repr_p (tree t1, tree t2)
+{
+  if (mode_can_transfer_bits (TYPE_MODE (t1)))
+return false;
+
+  return !types_compatible_p (t1, t2);
+}
+
 /* Sort all accesses for the given variable, check for partial overlaps and
return NULL if there are any.  If there are none, pick a representative for
each combination of offset and size and create a linked list out of them.
@@ -2461,6 +2474,17 @@ sort_and_splice_var_accesses (tree var)
}
  unscalarizable_region = true;
}
+ else if (types_risk_mangled_binary_repr_p (access->type, ac2->type))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "Cannot scalarize the following access "
+  "because data would be held in a mode which is not "
+  "guaranteed to preserve all bits.\n  ");
+ dump_access (dump_file, access, false);
+   }
+ unscalarizable_region = true;
+   }
 
  if (grp_same_access_path
  && !same_access_path_p (access->expr, ac2->expr))
@@ -3127,7 +3151,9 @@ propagate_subaccesses_from_rhs (struct access *lacc, struct access *racc)
  ret = true;
  subtree_mark_written_and_rhs_enqueue (lacc);
}
-  if (!lacc->first_child && !racc->first_child)
+  if (!lacc->first_child
+ && !racc->first_child
+ && !types_risk_mangled_binary_repr_p (racc->type, lacc->type))
{
  /* We are about to change the access type from aggregate to scalar,
 so we need to put the reverse flag onto the access, if any.  */


[gcc r15-3515] ipa: Treat static constructors and destructors as non-local (PR 115815)

2024-09-06 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:e98ad6a049c96c21cf641954584c2f5b7df0ce93

commit r15-3515-ge98ad6a049c96c21cf641954584c2f5b7df0ce93
Author: Martin Jambor 
Date:   Fri Sep 6 14:12:53 2024 +0200

ipa: Treat static constructors and destructors as non-local (PR 115815)

In PR 115815, IPA-SRA thought it had control over all invocations of a
(recursive) static destructor but it did not see the implied
invocation which led to the original being left behind and the
clean-up code encountering uses of SSAs that definitely should have
been dead.

Fixed by teaching cgraph_node::can_be_local_p about static
constructors and destructors.  A similar test was missing in
cgraph_node::local_p so I added the check there as well.
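
The implied invocation is easy to observe from user code.  This sketch uses
a constructor rather than a destructor because its effect can be checked
from within the running program; the function and variable names are
invented, and the attribute is a GCC/Clang extension.

```c
#include <assert.h>

static int startup_calls;

/* The runtime calls this before main, so no call statement for it appears
   anywhere in the program.  */
__attribute__ ((constructor))
static void
count_startup (void)
{
  startup_calls++;
}

int
get_startup_calls (void)
{
  return startup_calls;
}

/* Meant to be called from main or later, once constructors have run.  */
void
check_implicit_call (void)
{
  assert (get_startup_calls () == 1);
}
```

A pass that only counts explicit call statements would conclude that
count_startup is never called, which is the kind of wrong assumption the
patch rules out for such functions.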

gcc/ChangeLog:

2024-07-25  Martin Jambor  

PR ipa/115815
* cgraph.cc (cgraph_node_cannot_be_local_p_1): Also check
DECL_STATIC_CONSTRUCTOR and DECL_STATIC_DESTRUCTOR.
* ipa-visibility.cc (non_local_p): Likewise.
(cgraph_node::local_p): Delete extraneous line of tabs.

gcc/testsuite/ChangeLog:

2024-07-25  Martin Jambor  

PR ipa/115815
* gcc.dg/lto/pr115815_0.c: New test.

Diff:
---
 gcc/cgraph.cc |  4 +++-
 gcc/ipa-visibility.cc |  5 +++--
 gcc/testsuite/gcc.dg/lto/pr115815_0.c | 18 ++
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 473d8410bc9..39a3adbc7c3 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -2434,7 +2434,9 @@ cgraph_node_cannot_be_local_p_1 (cgraph_node *node, void *)
&& !node->forced_by_abi
&& !node->used_from_object_file_p ()
&& !node->same_comdat_group)
-  || !node->externally_visible));
+  || !node->externally_visible)
+  && !DECL_STATIC_CONSTRUCTOR (node->decl)
+  && !DECL_STATIC_DESTRUCTOR (node->decl));
 }
 
 /* Return true if cgraph_node can be made local for API change.
diff --git a/gcc/ipa-visibility.cc b/gcc/ipa-visibility.cc
index 501d3c304aa..21f0c47f388 100644
--- a/gcc/ipa-visibility.cc
+++ b/gcc/ipa-visibility.cc
@@ -102,7 +102,9 @@ non_local_p (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED)
   && !node->externally_visible
   && !node->used_from_other_partition
   && !node->in_other_partition
-  && node->get_availability () >= AVAIL_AVAILABLE);
+  && node->get_availability () >= AVAIL_AVAILABLE
+  && !DECL_STATIC_CONSTRUCTOR (node->decl)
+  && !DECL_STATIC_DESTRUCTOR (node->decl));
 }
 
 /* Return true when function can be marked local.  */
@@ -116,7 +118,6 @@ cgraph_node::local_p (void)
  return n->callees->callee->local_p ();
return !n->call_for_symbol_thunks_and_aliases (non_local_p,
  NULL, true);
-   
 }
 
 /* A helper for comdat_can_be_unshared_p.  */
diff --git a/gcc/testsuite/gcc.dg/lto/pr115815_0.c b/gcc/testsuite/gcc.dg/lto/pr115815_0.c
new file mode 100644
index 000..d938ae4c802
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr115815_0.c
@@ -0,0 +1,18 @@
+int a;
+volatile int v;
+volatile int w;
+
+int __attribute__((destructor))
+b() {
+  if (v)
+return a + b();
+  v = 5;
+  return 0;
+}
+
+int
+main (int argc, char **argv)
+{
+  w = 1;
+  return 0;
+}


[gcc r15-3516] ipa: Move pass_ipa_cdtor_merge before pass_ipa_cp and pass_ipa_sra

2024-09-06 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:db0fa0b35b922449d703c040383abf7acb349d9d

commit r15-3516-gdb0fa0b35b922449d703c040383abf7acb349d9d
Author: Martin Jambor 
Date:   Fri Sep 6 14:12:54 2024 +0200

ipa: Move pass_ipa_cdtor_merge before pass_ipa_cp and pass_ipa_sra

When looking at PR 115815 we realized that it would make sense to make
calls to functions originally declared static constructors and
destructors created by pass_ipa_cdtor_merge visible to IPA-SRA.  This
patch does that.

gcc/ChangeLog:

2024-07-25  Martin Jambor  

* passes.def: Move pass_ipa_cdtor_merge before pass_ipa_cp and
pass_ipa_sra.

Diff:
---
 gcc/passes.def | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/passes.def b/gcc/passes.def
index 6d98c3c9282..40162ac20a0 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -157,9 +157,9 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_ipa_profile);
   NEXT_PASS (pass_ipa_icf);
   NEXT_PASS (pass_ipa_devirt);
+  NEXT_PASS (pass_ipa_cdtor_merge);
   NEXT_PASS (pass_ipa_cp);
   NEXT_PASS (pass_ipa_sra);
-  NEXT_PASS (pass_ipa_cdtor_merge);
   NEXT_PASS (pass_ipa_fn_summary);
   NEXT_PASS (pass_ipa_inline);
   NEXT_PASS (pass_ipa_pure_const);


[gcc r15-3589] ipa: Rename ipa_supports_p to ipa_vr_supported_type_p

2024-09-11 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:323291c29c77e3214f4850129bb8a3d0d8da6a45

commit r15-3589-g323291c29c77e3214f4850129bb8a3d0d8da6a45
Author: Martin Jambor 
Date:   Wed Sep 11 23:53:21 2024 +0200

ipa: Rename ipa_supports_p to ipa_vr_supported_type_p

ipa_supports_p is not a name that captures well what the predicate
determines.  Therefore, this patch renames it to ipa_vr_supported_type_p.

gcc/ChangeLog:

2024-09-06  Martin Jambor  

* ipa-cp.h (ipa_supports_p): Rename to ipa_vr_supported_type_p.
* ipa-cp.cc (ipa_vr_operation_and_type_effects): Adjust called
function name.
(propagate_vr_across_jump_function): Likewise.
* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Likewise.
(ipcp_get_parm_bits): Likewise.

Diff:
---
 gcc/ipa-cp.cc   | 5 +++--
 gcc/ipa-cp.h| 2 +-
 gcc/ipa-prop.cc | 6 +++---
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 56468dc40ee4..a1033b81aefc 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1649,7 +1649,8 @@ ipa_vr_operation_and_type_effects (vrange &dst_vr,
   enum tree_code operation,
   tree dst_type, tree src_type)
 {
-  if (!ipa_supports_p (dst_type) || !ipa_supports_p (src_type))
+  if (!ipa_vr_supported_type_p (dst_type)
+  || !ipa_vr_supported_type_p (src_type))
 return false;
 
   range_op_handler handler (operation);
@@ -2553,7 +2554,7 @@ propagate_vr_across_jump_function (cgraph_edge *cs, 
ipa_jump_func *jfunc,
  ipa_range_set_and_normalize (op_vr, op);
 
  if (!handler
- || !ipa_supports_p (operand_type)
+ || !ipa_vr_supported_type_p (operand_type)
  /* Sometimes we try to fold comparison operators using a
 pointer type to hold the result instead of a boolean
 type.  Avoid trapping in the sanity check in
diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h
index 4616c61625ab..ba2ebfede63f 100644
--- a/gcc/ipa-cp.h
+++ b/gcc/ipa-cp.h
@@ -294,7 +294,7 @@ bool values_equal_for_ipcp_p (tree x, tree y);
 /* Return TRUE if IPA supports ranges of TYPE.  */
 
 static inline bool
-ipa_supports_p (tree type)
+ipa_vr_supported_type_p (tree type)
 {
   return irange::supports_p (type) || prange::supports_p (type);
 }
diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 99ebd6229ec4..78d1fb7086d5 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -2392,8 +2392,8 @@ ipa_compute_jump_functions_for_edge (struct 
ipa_func_body_info *fbi,
   else
{
  if (param_type
- && ipa_supports_p (TREE_TYPE (arg))
- && ipa_supports_p (param_type)
+ && ipa_vr_supported_type_p (TREE_TYPE (arg))
+ && ipa_vr_supported_type_p (param_type)
  && get_range_query (cfun)->range_of_expr (vr, arg, cs->call_stmt)
  && !vr.undefined_p ())
{
@@ -5761,7 +5761,7 @@ ipcp_get_parm_bits (tree parm, tree *value, widest_int 
*mask)
   ipcp_transformation *ts = ipcp_get_transformation_summary (cnode);
   if (!ts
   || vec_safe_length (ts->m_vr) == 0
-  || !ipa_supports_p (TREE_TYPE (parm)))
+  || !ipa_vr_supported_type_p (TREE_TYPE (parm)))
 return false;
 
   int i = ts->get_param_index (current_function_decl, parm);


[gcc r15-3590] ipa-cp: One more use of ipa_vr_supported_type_p

2024-09-11 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:f910b02919036647a3f096265cda19358dded628

commit r15-3590-gf910b02919036647a3f096265cda19358dded628
Author: Martin Jambor 
Date:   Wed Sep 11 23:53:21 2024 +0200

ipa-cp: One more use of ipa_vr_supported_type_p

Since we have the predicate, this patch converts one more check for
essentially the same thing into its use.

2024-09-11  Martin Jambor  

* ipa-cp.cc (propagate_vr_across_jump_function): Use
ipa_vr_supported_type_p instead of explicit check for integral and
pointer types.

Diff:
---
 gcc/ipa-cp.cc | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index a1033b81aefc..fa7bd6a15da7 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -2519,8 +2519,7 @@ propagate_vr_across_jump_function (cgraph_edge *cs, 
ipa_jump_func *jfunc,
 return false;
 
   if (!param_type
-  || (!INTEGRAL_TYPE_P (param_type)
- && !POINTER_TYPE_P (param_type)))
+  || !ipa_vr_supported_type_p (param_type))
 return dest_lat->set_to_bottom ();
 
   if (jfunc->type == IPA_JF_PASS_THROUGH)


[gcc r14-9403] ipa: Avoid excessive removing of SSAs (PR 113757)

2024-03-08 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:54e505d0446f86b7ad383acbb8e5501f20872b64

commit r14-9403-g54e505d0446f86b7ad383acbb8e5501f20872b64
Author: Martin Jambor 
Date:   Sat Mar 9 00:47:22 2024 +0100

ipa: Avoid excessive removing of SSAs (PR 113757)

PR 113757 shows that the code which was meant to debug-reset and
remove SSAs defined by LHSs of calls redirected to
__builtin_unreachable can trigger also when speculative
devirtualization creates a call to a noreturn function (and since it
is noreturn, it does not bother dealing with its return value).

What is more, it seems that the code handling this case is not really
necessary.  I feel slightly idiotic about this because I have a
feeling that I added it because of a failing test-case but I can
neither find the testcase nor a reason why the code in
cgraph_edge::redirect_call_stmt_to_callee would not be sufficient (it
turns the SSA name into a default-def, a bit like IPA-SRA, but any
code dominated by a call to a noreturn is not dangerous when it comes
to its side-effects).  So this patch just removes the handling.

gcc/ChangeLog:

2024-02-07  Martin Jambor  

PR ipa/113757
* tree-inline.cc (redirect_all_calls): Remove code adding SSAs to
id->killed_new_ssa_names.

gcc/testsuite/ChangeLog:

2024-02-07  Martin Jambor  

PR ipa/113757
* g++.dg/ipa/pr113757.C: New test.

Diff:
---
 gcc/testsuite/g++.dg/ipa/pr113757.C | 14 ++
 gcc/tree-inline.cc  | 14 ++
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/gcc/testsuite/g++.dg/ipa/pr113757.C 
b/gcc/testsuite/g++.dg/ipa/pr113757.C
new file mode 100644
index 000..885d4010a10
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr113757.C
@@ -0,0 +1,14 @@
+// { dg-do compile }
+// { dg-options "-O2 -fPIC" }
+// { dg-require-effective-target fpic }
+
+long size();
+struct ll {  virtual int hh();  };
+ll  *slice_owner;
+int ll::hh() { __builtin_exit(0); }
+int nn() {
+  if (size())
+return 0;
+  return slice_owner->hh();
+}
+int (*a)() = nn;
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index f0a067f5812..eebcea8a029 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -2984,23 +2984,13 @@ redirect_all_calls (copy_body_data * id, basic_block bb)
   gimple *stmt = gsi_stmt (si);
   if (is_gimple_call (stmt))
{
- tree old_lhs = gimple_call_lhs (stmt);
  struct cgraph_edge *edge = id->dst_node->get_edge (stmt);
  if (edge)
{
  if (!id->killed_new_ssa_names)
id->killed_new_ssa_names = new hash_set (16);
- gimple *new_stmt
-   = cgraph_edge::redirect_call_stmt_to_callee (edge,
-   id->killed_new_ssa_names);
- if (old_lhs
- && TREE_CODE (old_lhs) == SSA_NAME
- && !gimple_call_lhs (new_stmt))
-   /* In case of IPA-SRA removing the LHS, the name should have
-  been already added to the hash.  But in case of redirecting
-  to builtin_unreachable it was not and the name still should
-  be pruned from debug statements.  */
-   id->killed_new_ssa_names->add (old_lhs);
+ cgraph_edge::redirect_call_stmt_to_callee (edge,
+   id->killed_new_ssa_names);
 
  if (stmt == last && id->call_stmt && maybe_clean_eh_stmt (stmt))
gimple_purge_dead_eh_edges (bb);


[gcc r14-9559] ipa: Fix C++ member ptr indirect inlining (PR 114254, PR 108802)

2024-03-19 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:bf838884fac573b4902a21bb82d9b6f777e32cb9

commit r14-9559-gbf838884fac573b4902a21bb82d9b6f777e32cb9
Author: Martin Jambor 
Date:   Tue Mar 19 22:33:27 2024 +0100

ipa: Fix C++ member ptr indirect inlining (PR 114254, PR 108802)

Even though we have had code to handle creation of indirect call graph
edges (so that these calls can then be made direct as part of IPA-CP
and inlining and eventually also inlined) for C++ member pointers for
many years, it turns out that it does not work for lambdas and that it
has been severely broken since GCC 10 when the base class has virtual
functions.

Lambdas don't work because the structures representing member function
pointers are passed by reference instead of by value and the code was
not ready for that.

The presence of virtual methods broke things because at some point the
C++ FE got clever and stopped emitting the check for virtual methods when
the base class does not have any and that in turn made our existing
testcases not test the necessary pattern matching code.  The pattern
matcher had a small bug which did not matter before
r10-917-g3b47da42de621c but did afterwards.

This patch changes the pattern matcher to match both of these cases.
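
For illustration only, the two ways a member function pointer can reach
the function that calls through it look roughly like this at the source
level.  This is a minimal sketch, not the testcase from the patch; the
names S, getter, call_by_value and call_by_reference are made up here,
and nothing below shows what GIMPLE the compiler actually produces.

```cpp
#include <cassert>

// Hypothetical class with two ordinary member functions.
struct S
{
  int v;
  int get () const { return v; }
  int twice () const { return 2 * v; }
};

// Pointer to a const member function of S.
typedef int (S::*getter) () const;

// The member pointer structure arrives as a by-value parameter.
static int
call_by_value (const S &s, getter g)
{
  return (s.*g) ();
}

// The member pointer structure arrives behind a reference, so its
// fields have to be loaded through a pointer parameter.
static int
call_by_reference (const S &s, const getter &g)
{
  return (s.*g) ();
}
```

Both helpers compute the same result; the difference that matters for
the pattern matcher is only whether the pfn and delta fields are read
from a parameter directly or through a pointer parameter.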

gcc/ChangeLog:

2024-03-06  Martin Jambor  

PR ipa/108802
PR ipa/114254
* ipa-prop.cc (ipa_get_stmt_member_ptr_load_param): Fix case looking
at COMPONENT_REFs directly from a PARM_DECL, also recognize loads
from a pointer parameter.
(ipa_analyze_indirect_call_uses): Also recognize loads from a
pointer parameter, also recognize the case when pfn pointer is
loaded in its own BB.

gcc/testsuite/ChangeLog:

2024-03-06  Martin Jambor  

PR ipa/108802
PR ipa/114254
* g++.dg/ipa/iinline-4.C: New test.
* g++.dg/ipa/pr108802.C: Likewise.

Diff:
---
 gcc/ipa-prop.cc  | 110 +--
 gcc/testsuite/g++.dg/ipa/iinline-4.C |  61 +++
 gcc/testsuite/g++.dg/ipa/pr108802.C  |  14 +
 3 files changed, 154 insertions(+), 31 deletions(-)

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index e22c4f78405..e8e4918d5a8 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -2500,7 +2500,9 @@ static tree
 ipa_get_stmt_member_ptr_load_param (gimple *stmt, bool use_delta,
HOST_WIDE_INT *offset_p)
 {
-  tree rhs, rec, ref_field, ref_offset, fld, ptr_field, delta_field;
+  tree rhs, fld, ptr_field, delta_field;
+  tree ref_field = NULL_TREE;
+  tree ref_offset = NULL_TREE;
 
   if (!gimple_assign_single_p (stmt))
 return NULL_TREE;
@@ -2511,35 +2513,53 @@ ipa_get_stmt_member_ptr_load_param (gimple *stmt, bool 
use_delta,
   ref_field = TREE_OPERAND (rhs, 1);
   rhs = TREE_OPERAND (rhs, 0);
 }
-  else
-ref_field = NULL_TREE;
-  if (TREE_CODE (rhs) != MEM_REF)
-return NULL_TREE;
-  rec = TREE_OPERAND (rhs, 0);
-  if (TREE_CODE (rec) != ADDR_EXPR)
-return NULL_TREE;
-  rec = TREE_OPERAND (rec, 0);
-  if (TREE_CODE (rec) != PARM_DECL
-  || !type_like_member_ptr_p (TREE_TYPE (rec), &ptr_field, &delta_field))
+
+  if (TREE_CODE (rhs) == MEM_REF)
+{
+  ref_offset = TREE_OPERAND (rhs, 1);
+  if (ref_field && integer_nonzerop (ref_offset))
+   return NULL_TREE;
+}
+  else if (!ref_field)
 return NULL_TREE;
-  ref_offset = TREE_OPERAND (rhs, 1);
+
+  if (TREE_CODE (rhs) == MEM_REF
+  && TREE_CODE (TREE_OPERAND (rhs, 0)) == SSA_NAME
+  && SSA_NAME_IS_DEFAULT_DEF (TREE_OPERAND (rhs, 0)))
+{
+  rhs = TREE_OPERAND (rhs, 0);
+  if (TREE_CODE (SSA_NAME_VAR (rhs)) != PARM_DECL
+ || !type_like_member_ptr_p (TREE_TYPE (TREE_TYPE (rhs)), &ptr_field,
+ &delta_field))
+   return NULL_TREE;
+}
+  else
+{
+  if (TREE_CODE (rhs) == MEM_REF
+ && TREE_CODE (TREE_OPERAND (rhs, 0)) == ADDR_EXPR)
+   rhs = TREE_OPERAND (TREE_OPERAND (rhs, 0), 0);
+  if (TREE_CODE (rhs) != PARM_DECL
+ || !type_like_member_ptr_p (TREE_TYPE (rhs), &ptr_field,
+ &delta_field))
+   return NULL_TREE;
+}
 
   if (use_delta)
 fld = delta_field;
   else
 fld = ptr_field;
-  if (offset_p)
-*offset_p = int_bit_position (fld);
 
   if (ref_field)
 {
-  if (integer_nonzerop (ref_offset))
+  if (ref_field != fld)
return NULL_TREE;
-  return ref_field == fld ? rec : NULL_TREE;
 }
-  else
-return tree_int_cst_equal (byte_position (fld), ref_offset) ? rec
-  : NULL_TREE;
+  else if (!tree_int_cst_equal (byte_position (fld), ref_offset))
+return NULL_TREE;
+
+  if (offset_p)
+*offset_p = int_bit_position (fld);
+  return rhs;
 }
 
 /* Returns true iff

[gcc r14-9794] ipa: Avoid duplicate replacements in IPA-SRA transformation phase

2024-04-04 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:ca56b43105fc09021ec445f1978a17cd85ae5e0c

commit r14-9794-gca56b43105fc09021ec445f1978a17cd85ae5e0c
Author: Martin Jambor 
Date:   Thu Apr 4 22:46:16 2024 +0200

ipa: Avoid duplicate replacements in IPA-SRA transformation phase

When the analysis part of IPA-SRA figures out that it would split out
a scalar part of an aggregate which is known by IPA-CP to contain a
known constant, it skips it knowing that the transformation part looks
at IPA-CP aggregate results too and does the right thing (which can
include doing the propagation in GIMPLE because that is the last
moment the parameter exists).

However, when IPA-SRA wants to split out a smaller aggregate out
of an aggregate, which happens to be of the same size as a known
scalar constant at the same offset, the transformation bit fails to
recognize the situation, tries to do both splitting and constant
propagation and in PR 111571 testcase creates a nonsensical call
statement on which the call redirection then ICEs.

Fixed by making sure we don't try to do two replacements of the same
part of the same parameter.

The look-up among replacements requires that they be sorted and this
patch just sorts them, if they are not already sorted, before each new
look-up.  In the worst case, the number of sortings is the number of
parameters which are both split and have aggregate constants times
param_ipa_max_agg_items (default 16).  I don't think complicating the
source code to optimize for this unlikely case is worth it but if need
be, it can of course be done.
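
The sort-before-look-up idea can be sketched outside of GCC as follows.
This is a toy model under stated assumptions: replacement,
replacement_table, add_unique and sort_count are hypothetical names
invented for the sketch and are not the data structures in
ipa-param-manipulation.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical replacement entry: parameter index plus unit offset.
struct replacement
{
  int parm;
  unsigned offset;
};

static bool
repl_less (const replacement &a, const replacement &b)
{
  if (a.parm != b.parm)
    return a.parm < b.parm;
  return a.offset < b.offset;
}

class replacement_table
{
public:
  // Add an entry unless (parm, offset) is already present.
  // Returns true if the entry was added.
  bool add_unique (int parm, unsigned offset)
  {
    if (lookup (parm, offset))
      return false;
    m_entries.push_back ({parm, offset});
    m_sorted = false;
    return true;
  }

  // Binary search; sorts first only if something was added since the
  // last sort.  The returned pointer is invalidated by add_unique.
  const replacement *lookup (int parm, unsigned offset)
  {
    if (!m_sorted)
      {
        std::sort (m_entries.begin (), m_entries.end (), repl_less);
        m_sorted = true;
        m_sort_count++;
      }
    replacement key = {parm, offset};
    auto it = std::lower_bound (m_entries.begin (), m_entries.end (),
                                key, repl_less);
    if (it != m_entries.end () && it->parm == parm && it->offset == offset)
      return &*it;
    return nullptr;
  }

  unsigned sort_count () const { return m_sort_count; }
  std::size_t size () const { return m_entries.size (); }

private:
  std::vector<replacement> m_entries;
  bool m_sorted = true;
  unsigned m_sort_count = 0;
};
```

In this sketch a sort happens at most once per look-up that follows an
insertion, which mirrors the worst-case bound given above.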

gcc/ChangeLog:

2024-03-15  Martin Jambor  

PR ipa/111571
* ipa-param-manipulation.cc
(ipa_param_body_adjustments::common_initialization): Avoid creating
duplicate replacement entries.

gcc/testsuite/ChangeLog:

2024-03-15  Martin Jambor  

PR ipa/111571
* gcc.dg/ipa/pr111571.c: New test.

Diff:
---
 gcc/ipa-param-manipulation.cc   | 16 
 gcc/testsuite/gcc.dg/ipa/pr111571.c | 29 +
 2 files changed, 45 insertions(+)

diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc
index 3e0df6a6f77..f4b5e850c2b 100644
--- a/gcc/ipa-param-manipulation.cc
+++ b/gcc/ipa-param-manipulation.cc
@@ -1525,6 +1525,22 @@ ipa_param_body_adjustments::common_initialization (tree 
old_fndecl,
 replacement with a constant (for split aggregates passed
 by value).  */
 
+ if (split[parm_num])
+   {
+ /* We must be careful not to add a duplicate
+replacement. */
+ sort_replacements ();
+ ipa_param_body_replacement *pbr
+   = lookup_replacement_1 (m_oparms[parm_num],
+   av.unit_offset);
+ if (pbr)
+   {
+ /* Otherwise IPA-SRA should have bailed out.  */
+ gcc_assert (AGGREGATE_TYPE_P (TREE_TYPE (pbr->repl)));
+ continue;
+   }
+   }
+
  tree repl;
  if (av.by_ref)
repl = av.value;
diff --git a/gcc/testsuite/gcc.dg/ipa/pr111571.c 
b/gcc/testsuite/gcc.dg/ipa/pr111571.c
new file mode 100644
index 000..2a4adc608db
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr111571.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2"  } */
+
+struct a {
+  int b;
+};
+struct c {
+  long d;
+  struct a e;
+  long f;
+};
+int g, h, i;
+int j() {return 0;}
+static void k(struct a l, int p) {
+  if (h)
+g = 0;
+  for (; g; g = j())
+if (l.b)
+  break;
+}
+static void m(struct c l) {
+  k(l.e, l.f);
+  for (;; --i)
+;
+}
+int main() {
+  struct c n = {10, 9};
+  m(n);
+}


[gcc r14-9813] ipa: Force args obtained through pass-through maps to the expected type (PR 113964)

2024-04-05 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:8cd0d29270d4ed86c69b80c08de66dcb6c1e22fe

commit r14-9813-g8cd0d29270d4ed86c69b80c08de66dcb6c1e22fe
Author: Martin Jambor 
Date:   Fri Apr 5 18:18:39 2024 +0200

ipa: Force args obtained through pass-through maps to the expected type (PR 113964)

Interactions of IPA-CP and IPA-SRA on the same data are a rather big
source of issues, I'm afraid.  PR 113964 is a situation where IPA-CP
propagates an unsigned short in a union parameter into a function
which itself calls a different function which has the same union
parameter and both these union parameters are split with IPA-SRA.  The
leaf function however uses a signed short member of the union.

In the calling function, we get the unsigned constant as the
replacement for the union and it is then passed in the call without
any type compatibility checks.  Apparently on riscv64 it matters
whether the parameter is signed or unsigned short and so the leaf
function can see different values.

Fixed by using useless_type_conversion_p at the appropriate place and
if it fails, use force_value_to_type as elsewhere in similar
situations.
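
Why the signedness matters can be shown with a small stand-alone
sketch.  It assumes a two's complement target with 16-bit short, and
the helper names as_signed_short, widen_unsigned and widen_signed are
made up for this illustration; it does not reproduce the riscv64
calling convention issue itself.

```cpp
#include <cassert>
#include <cstring>

// Reinterpret the bits of an unsigned short as a signed short, the way
// reading a different union member of the same size would.
static short
as_signed_short (unsigned short u)
{
  short s;
  static_assert (sizeof s == sizeof u, "both views occupy the same bytes");
  std::memcpy (&s, &u, sizeof s);
  return s;
}

// Widen the unsigned view to long (zero extension).
static long
widen_unsigned (unsigned short u)
{
  return u;
}

// Widen the signed view of the same bits to long (sign extension).
static long
widen_signed (unsigned short u)
{
  return as_signed_short (u);
}
```

With all sixteen bits set, the unsigned view widens to 65535 and the
signed view to -1, so passing a value of one type where the callee
expects the other can change what the callee observes.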

gcc/ChangeLog:

2024-04-04  Martin Jambor  

PR ipa/113964
* ipa-param-manipulation.cc (ipa_param_adjustments::modify_call):
Force values obtained through pass-through maps to the expected
split type.

gcc/testsuite/ChangeLog:

2024-04-04  Patrick O'Neill  
Martin Jambor  

PR ipa/113964
* gcc.dg/ipa/pr114247.c: New test.

Diff:
---
 gcc/ipa-param-manipulation.cc   |  6 ++
 gcc/testsuite/gcc.dg/ipa/pr114247.c | 31 +++
 2 files changed, 37 insertions(+)

diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc
index f4b5e850c2b..ad36b8389c0 100644
--- a/gcc/ipa-param-manipulation.cc
+++ b/gcc/ipa-param-manipulation.cc
@@ -740,6 +740,12 @@ ipa_param_adjustments::modify_call (cgraph_edge *cs,
  }
   if (repl)
{
+ if (!useless_type_conversion_p(apm->type, repl->typed.type))
+   {
+ repl = force_value_to_type (apm->type, repl);
+ repl = force_gimple_operand_gsi (&gsi, repl,
+  true, NULL, true, GSI_SAME_STMT);
+   }
  vargs.quick_push (repl);
  continue;
}
diff --git a/gcc/testsuite/gcc.dg/ipa/pr114247.c 
b/gcc/testsuite/gcc.dg/ipa/pr114247.c
new file mode 100644
index 000..60aa2bc0122
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr114247.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fsigned-char -fno-strict-aliasing -fwrapv" } */
+
+union a {
+  unsigned short b;
+  int c;
+  signed short d;
+};
+int e, f = 1, g;
+long h;
+const int **i;
+void j(union a k, int l, unsigned m) {
+  const int *a[100];
+  i = &a[0];
+  h = k.d;
+}
+static int o(union a k) {
+  k.d = -1;
+  while (1)
+if (f)
+  break;
+  j(k, g, e);
+  return 0;
+}
+int main() {
+  union a n = {1};
+  o(n);
+  if (h != -1)
+__builtin_abort();
+  return 0;
+}


[gcc r13-8594] ipa: Self-DCE of uses of removed call LHSs (PR 108007)

2024-04-08 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:40ddc0b05a47f999b24f20c1becb79004995731b

commit r13-8594-g40ddc0b05a47f999b24f20c1becb79004995731b
Author: Martin Jambor 
Date:   Mon Apr 8 17:34:33 2024 +0200

ipa: Self-DCE of uses of removed call LHSs (PR 108007)

PR 108007 is another manifestation where we rely on DCE to clean up
after IPA-SRA and if the user explicitly switches DCE off, IPA-SRA
can leave behind statements which are fed uninitialized values and
trap, even though their results are themselves never used.

I have already fixed this for unused parameters in callees, this bug
shows that almost the same thing can happen for removed returns, on
the side of callers.  This means that the issue has to be fixed
elsewhere, in call redirection.  This patch adds a function which
looks for (and through, using a work-list) uses of operations fed
specific SSA names and removes them all.
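
The work-list walk described above can be sketched on a toy
representation.  Everything here is an assumption made for the
illustration: stmt is a hypothetical statement record with integer
names, and this purge_all_uses is a simplified model, not the function
added to ipa-param-manipulation.cc (it ignores debug statements and
side effects entirely).

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Toy statement: defines one name and reads some others.
struct stmt
{
  int def;                // name defined by the statement
  std::vector<int> uses;  // names the statement reads
  bool removed;           // set once the statement has been purged
};

// Remove every statement that reads NAME, directly or through the
// results of other removed statements, and record all names that
// became dead in KILLED.
static void
purge_all_uses (std::vector<stmt> &body, int name, std::set<int> &killed)
{
  std::vector<int> worklist {name};
  killed.insert (name);
  while (!worklist.empty ())
    {
      int cur = worklist.back ();
      worklist.pop_back ();
      for (stmt &s : body)
        {
          if (s.removed
              || std::find (s.uses.begin (), s.uses.end (), cur)
                 == s.uses.end ())
            continue;
          s.removed = true;
          // Queue the defined name only the first time it is killed.
          if (killed.insert (s.def).second)
            worklist.push_back (s.def);
        }
    }
}
```

The set of killed names plays the role of the hash of killed SSAs: it
both prevents a name from being queued twice and tells the caller which
names have to be released afterwards.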

That would have been easy if it wasn't for debug statements during
tree-inline (from which call redirection is also invoked).  Debug
statements are decoupled from the rest at this point and iterating
over uses of SSAs does not bring them up.  During tree-inline they are
handled specially at the end, I assume in order to make sure that
the relative ordering of UIDs is the same with and without debug info.

This means that during tree-inline we need to make a hash of killed
SSAs, that we already have in copy_body_data, available to the
function making the purging.  So the patch duly does also that, making
the interface slightly ugly.  Moreover, all newly unused SSA names
need to be freed and as PR 112616 showed, it must be done in a defined
order, which is what newly added ipa_release_ssas_in_hash does.

This backport to gcc-13 also contains
54e505d0446f86b7ad383acbb8e5501f20872b64 in order not to reintroduce
PR 113757.

gcc/ChangeLog:

2024-04-05  Martin Jambor  

PR ipa/108007
PR ipa/112616
* cgraph.h (cgraph_edge): Add a parameter to
redirect_call_stmt_to_callee.
* ipa-param-manipulation.h (ipa_param_adjustments): Add a
parameter to modify_call.
(ipa_release_ssas_in_hash): Declare.
* cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee): New
parameter killed_ssas, pass it to padjs->modify_call.
* ipa-param-manipulation.cc (purge_all_uses): New function.
(ipa_param_adjustments::modify_call): New parameter killed_ssas.
Instead of substituting uses, invoke purge_all_uses.  If
hash of killed SSAs has not been provided, create a temporary one
and release SSAs that have been added to it.
(compare_ssa_versions): New function.
(ipa_release_ssas_in_hash): Likewise.
* tree-inline.cc (redirect_all_calls): Create
id->killed_new_ssa_names earlier, pass it to edge redirection,
adjust a comment.
(copy_body): Release SSAs in id->killed_new_ssa_names.

gcc/testsuite/ChangeLog:

2024-01-15  Martin Jambor  

PR ipa/108007
PR ipa/112616
* gcc.dg/ipa/pr108007.c: New test.
* gcc.dg/ipa/pr112616.c: Likewise.

(cherry picked from commit a9a8426e534760b8d3a250e9bd3cff4db131a2be)

Diff:
---
 gcc/cgraph.cc   |  10 +++-
 gcc/cgraph.h|   9 ++-
 gcc/ipa-param-manipulation.cc   | 112 +---
 gcc/ipa-param-manipulation.h|   5 +-
 gcc/testsuite/g++.dg/ipa/pr113757.C |  14 +
 gcc/testsuite/gcc.dg/ipa/pr108007.c |  32 +++
 gcc/testsuite/gcc.dg/ipa/pr112616.c |  28 +
 gcc/tree-inline.cc  |  27 -
 8 files changed, 193 insertions(+), 44 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index ec663d23385..7a14c00b60a 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -1403,11 +1403,17 @@ cgraph_edge::redirect_callee (cgraph_node *n)
speculative indirect call, remove "speculative" of the indirect call and
also redirect stmt to it's final direct target.
 
+   When called from within tree-inline, KILLED_SSAs has to contain the pointer
+   to killed_new_ssa_names within the copy_body_data structure and SSAs
+   discovered to be useless (if LHS is removed) will be added to it, otherwise
+   it needs to be NULL.
+
It is up to caller to iteratively transform each "speculative"
direct call as appropriate.  */
 
 gimple *
-cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e)
+cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e,
+  hash_set  *killed_ssas)
 {
   tree decl = gimple_call_fndecl (e->call_stmt);
   gcall *new_stmt;
@@ -1527,7 +1533,7 @@ cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e)
remove_stmt_from_eh_lp (e->ca

[gcc r14-9840] ipa: Compare jump functions in ICF (PR 113907)

2024-04-08 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:1162861439fd3c4b30fc3ccd49462e47e876f04a

commit r14-9840-g1162861439fd3c4b30fc3ccd49462e47e876f04a
Author: Martin Jambor 
Date:   Mon Apr 8 18:53:23 2024 +0200

ipa: Compare jump functions in ICF (PR 113907)

In PR 113907 comment #58, Honza found a case where ICF thinks bodies
of functions are equivalent but because of a difference in aliases in a
memory access, different aggregate jump functions are associated with
supposedly equivalent call statements.  This patch adds a way to
compare jump functions and plugs it into ICF to avoid the issue.

gcc/ChangeLog:

2024-03-20  Martin Jambor  

PR ipa/113907
* ipa-prop.h (class ipa_vr): Declare new overload of a member
function equal_p.
(ipa_jump_functions_equivalent_p): Declare.
* ipa-prop.cc (ipa_vr::equal_p): New function.
(ipa_agg_pass_through_jf_equivalent_p): Likewise.
(ipa_agg_jump_functions_equivalent_p): Likewise.
(ipa_jump_functions_equivalent_p): Likewise.
* ipa-cp.h (values_equal_for_ipcp_p): Declare.
* ipa-cp.cc (values_equal_for_ipcp_p): Make function public.
* ipa-icf-gimple.cc: Include alloc-pool.h, symbol-summary.h,
sreal.h, ipa-cp.h and ipa-prop.h.
(func_checker::compare_gimple_call): Compare jump functions.

gcc/testsuite/ChangeLog:

2024-03-20  Martin Jambor  

PR ipa/113907
* gcc.dg/lto/pr113907_0.c: New.
* gcc.dg/lto/pr113907_1.c: Likewise.
* gcc.dg/lto/pr113907_2.c: Likewise.

Diff:
---
 gcc/ipa-cp.cc |   2 +-
 gcc/ipa-cp.h  |   2 +
 gcc/ipa-icf-gimple.cc |  30 ++
 gcc/ipa-prop.cc   | 167 ++
 gcc/ipa-prop.h|   3 +
 gcc/testsuite/gcc.dg/lto/pr113907_0.c |  18 
 gcc/testsuite/gcc.dg/lto/pr113907_1.c |  35 +++
 gcc/testsuite/gcc.dg/lto/pr113907_2.c |  11 +++
 8 files changed, 267 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 2a1da631e9c..b7add455bd5 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -201,7 +201,7 @@ ipcp_lattice::is_single_const ()
 
 /* Return true iff X and Y should be considered equal values by IPA-CP.  */
 
-static bool
+bool
 values_equal_for_ipcp_p (tree x, tree y)
 {
   gcc_checking_assert (x != NULL_TREE && y != NULL_TREE);
diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h
index 0b3cfe4b526..7ff74fb5c98 100644
--- a/gcc/ipa-cp.h
+++ b/gcc/ipa-cp.h
@@ -289,4 +289,6 @@ public:
   bool virt_call = false;
 };
 
+bool values_equal_for_ipcp_p (tree x, tree y);
+
 #endif /* IPA_CP_H */
diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc
index 8c2df7a354e..17f62bec068 100644
--- a/gcc/ipa-icf-gimple.cc
+++ b/gcc/ipa-icf-gimple.cc
@@ -41,7 +41,12 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-walk.h"
 
 #include "tree-ssa-alias-compare.h"
+#include "alloc-pool.h"
+#include "symbol-summary.h"
 #include "ipa-icf-gimple.h"
+#include "sreal.h"
+#include "ipa-cp.h"
+#include "ipa-prop.h"
 
 namespace ipa_icf_gimple {
 
@@ -714,6 +719,31 @@ func_checker::compare_gimple_call (gcall *s1, gcall *s2)
   && !compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2)))
 return return_false_with_msg ("GIMPLE internal call LHS type mismatch");
 
+  if (!gimple_call_internal_p (s1))
+{
+  cgraph_edge *e1 = cgraph_node::get (m_source_func_decl)->get_edge (s1);
+  cgraph_edge *e2 = cgraph_node::get (m_target_func_decl)->get_edge (s2);
+  class ipa_edge_args *args1 = ipa_edge_args_sum->get (e1);
+  class ipa_edge_args *args2 = ipa_edge_args_sum->get (e2);
+  if ((args1 != nullptr) != (args2 != nullptr))
+   return return_false_with_msg ("ipa_edge_args mismatch");
+  if (args1)
+   {
+ int n1 = ipa_get_cs_argument_count (args1);
+ int n2 = ipa_get_cs_argument_count (args2);
+ if (n1 != n2)
+   return return_false_with_msg ("ipa_edge_args nargs mismatch");
+ for (int i = 0; i < n1; i++)
+   {
+ struct ipa_jump_func *jf1 = ipa_get_ith_jump_func (args1, i);
+ struct ipa_jump_func *jf2 = ipa_get_ith_jump_func (args2, i);
+ if (((jf1 != nullptr) != (jf2 != nullptr))
+ || (jf1 && !ipa_jump_functions_equivalent_p (jf1, jf2)))
+   return return_false_with_msg ("jump function mismatch");
+   }
+   }
+}
+
   return compare_operand (t1, t2, get_operand_access_type (&map, t1));
 }
 
diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index e8e4918d5a8..374e998aa64 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -156,6 +156,20 @@ ipa_vr::equal_p (const vrange &r) const
   return (types_compatible_p (m_type, r.type ()) && m_storage->equal_p (r));
 }
 
+bool
+ipa_vr::equal_p (const ipa_vr &o) const
+{
+  i

[gcc r14-9841] ICF&SRA: Make ICF and SRA agree on padding

2024-04-08 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:1e3312a25a7b34d6e3f549273e1674c7114e4408

commit r14-9841-g1e3312a25a7b34d6e3f549273e1674c7114e4408
Author: Martin Jambor 
Date:   Mon Apr 8 18:53:23 2024 +0200

ICF&SRA: Make ICF and SRA agree on padding

PR 113359 shows that (at least with -fno-strict-aliasing) ICF can
unify two functions which copy an aggregate type of the same size but
then SRA, through its total scalarization, can copy the aggregate by
pieces, skipping padding, but the padding was not the same in the two
original functions that ICF unified.
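
As an illustration of "same size, different padding", consider the two
hypothetical structures below.  The sketch assumes a typical ABI with a
4-byte, 4-byte-aligned int; head_padded, tail_padded, field_range and
byte_is_padding are names invented here and have nothing to do with the
sra_padding_collecting class added by the patch.

```cpp
#include <cassert>
#include <cstddef>

// Padding sits between the two fields.
struct head_padded { char c; int i; };
// Padding sits after the last field.
struct tail_padded { int i; char c; };

// Byte range occupied by one field.
struct field_range
{
  std::size_t offset;
  std::size_t size;
};

static const field_range head_fields[] = {
  { offsetof (head_padded, c), sizeof (char) },
  { offsetof (head_padded, i), sizeof (int) },
};

static const field_range tail_fields[] = {
  { offsetof (tail_padded, i), sizeof (int) },
  { offsetof (tail_padded, c), sizeof (char) },
};

// Return true if BYTE is not covered by any of the FIELDS.
template <std::size_t N>
static bool
byte_is_padding (const field_range (&fields)[N], std::size_t byte)
{
  for (const field_range &f : fields)
    if (byte >= f.offset && byte < f.offset + f.size)
      return false;
  return true;
}
```

A whole-aggregate copy of either type moves the same number of bytes,
but a field-by-field copy skips byte 1 in one type and the last byte in
the other, which is the kind of difference ICF has to take into account
before unifying two functions.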

This patch enhances SRA with the ability to collect padding
information which then can be compared from within ICF.  Unfortunately
SRA uses OPTION_SET_P when determining its limits, so ICF needs to
switch cfuns at least once to figure it out too.

gcc/ChangeLog:

2024-03-27  Martin Jambor  

PR ipa/113359
* ipa-icf-gimple.h (func_checker): New members
safe_for_total_scalarization_p, m_total_scalarization_limit_known_p
and m_total_scalarization_limit.
(func_checker::func_checker): Initialize new member variables.
* ipa-icf-gimple.cc: Include tree-sra.h.
(func_checker::func_checker): Initialize new member variables.
(func_checker::safe_for_total_scalarization_p): New function.
(func_checker::compare_operand): Use the new function.
* tree-sra.h (sra_get_max_scalarization_size): Declare.
(sra_total_scalarization_would_copy_same_data_p): Likewise.
* tree-sra.cc (prepare_iteration_over_array_elts): New function.
(class sra_padding_collecting): New.
(sra_padding_collecting::record_padding): Likewise.
(scalarizable_type_p): Rename to totally_scalarizable_type_p.  Add
ability to record padding when requested.
(totally_scalarize_subtree): Split out gathering information
necessary to iterate over array elements to
prepare_iteration_over_array_elts.  Fix erroneous early exit.
(analyze_all_variable_accesses): Adjust the call to
totally_scalarizable_type_p.  Move determining of total scalarization
size limit...
(sra_get_max_scalarization_size): ...here.
(check_ts_and_push_padding_to_vec): New function.
(sra_total_scalarization_would_copy_same_data_p): Likewise.

gcc/testsuite/ChangeLog:

2024-03-27  Martin Jambor  

PR ipa/113359
* gcc.dg/lto/pr113359-1_0.c: New.
* gcc.dg/lto/pr113359-1_1.c: Likewise.
* gcc.dg/lto/pr113359-2_0.c: Likewise.
* gcc.dg/lto/pr113359-2_1.c: Likewise.
* gcc.dg/lto/pr113359-3_0.c: Likewise.
* gcc.dg/lto/pr113359-3_1.c: Likewise.
* gcc.dg/lto/pr113359-4_0.c: Likewise.
* gcc.dg/lto/pr113359-4_1.c: Likewise.
* gcc.dg/lto/pr113359-5_0.c: Likewise.
* gcc.dg/lto/pr113359-5_1.c: Likewise.

Diff:
---
 gcc/ipa-icf-gimple.cc   |  41 +-
 gcc/ipa-icf-gimple.h|  15 +-
 gcc/testsuite/gcc.dg/lto/pr113359-1_0.c |  86 +++
 gcc/testsuite/gcc.dg/lto/pr113359-1_1.c |  38 +
 gcc/testsuite/gcc.dg/lto/pr113359-2_0.c |  87 +++
 gcc/testsuite/gcc.dg/lto/pr113359-2_1.c |  38 +
 gcc/testsuite/gcc.dg/lto/pr113359-3_0.c | 114 +++
 gcc/testsuite/gcc.dg/lto/pr113359-3_1.c |  49 +++
 gcc/testsuite/gcc.dg/lto/pr113359-4_0.c | 114 +++
 gcc/testsuite/gcc.dg/lto/pr113359-4_1.c |  49 +++
 gcc/testsuite/gcc.dg/lto/pr113359-5_0.c | 118 +++
 gcc/testsuite/gcc.dg/lto/pr113359-5_1.c |  50 +++
 gcc/tree-sra.cc | 252 +---
 gcc/tree-sra.h  |   3 +
 14 files changed, 999 insertions(+), 55 deletions(-)

diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc
index 17f62bec068..c25eb24710f 100644
--- a/gcc/ipa-icf-gimple.cc
+++ b/gcc/ipa-icf-gimple.cc
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "attribs.h"
 #include "gimple-walk.h"
+#include "tree-sra.h"
 
 #include "tree-ssa-alias-compare.h"
 #include "alloc-pool.h"
@@ -64,7 +65,8 @@ func_checker::func_checker (tree source_func_decl, tree 
target_func_decl,
   : m_source_func_decl (source_func_decl), m_target_func_decl 
(target_func_decl),
 m_ignored_source_nodes (ignored_source_nodes),
 m_ignored_target_nodes (ignored_target_nodes),
-m_ignore_labels (ignore_labels), m_tbaa (tbaa)
+m_ignore_labels (ignore_labels), m_tbaa (tbaa),
+m_total_scalarization_limit_known_p (false)
 {
   function *source_func = DECL_STRUCT_FUNCTION (source_func_decl);
   function *target_func = DECL_STRUCT_FUNCTION (target_func_decl);
@@ -361,6 +363,36 @@ func_checker::operand_equal_p (const_tree t1, const_tree 

[gcc r14-9926] contrib/check-params-in-docs.py: Ignore gcn-preferred-vectorization-factor

2024-04-11 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:33f83d3cd84f9876180a2e2a9d1ea082debdaa37

commit r14-9926-g33f83d3cd84f9876180a2e2a9d1ea082debdaa37
Author: Martin Jambor 
Date:   Thu Apr 11 19:37:45 2024 +0200

contrib/check-params-in-docs.py: Ignore gcn-preferred-vectorization-factor

contrib/check-params-in-docs.py is a script that checks that all
options reported with ./gcc/xgcc -Bgcc --help=param are in
gcc/doc/invoke.texi and vice versa.
gcn-preferred-vectorization-factor is in the manual but normally not
reported by --help, probably because I do not have gcn offload
configured.  This patch makes the script silent about this particular
fact.

contrib/ChangeLog:

2024-04-11  Martin Jambor  

* check-params-in-docs.py (ignored): Add
gcn-preferred-vectorization-factor.

Diff:
---
 contrib/check-params-in-docs.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/check-params-in-docs.py b/contrib/check-params-in-docs.py
index 623c82284e2..f7879dd8e08 100755
--- a/contrib/check-params-in-docs.py
+++ b/contrib/check-params-in-docs.py
@@ -45,7 +45,7 @@ parser.add_argument('params_output')
 
 args = parser.parse_args()
 
-ignored = {'logical-op-non-short-circuit'}
+ignored = {'logical-op-non-short-circuit', 'gcn-preferred-vectorization-factor'}
 params = {}
 
 for line in open(args.params_output).readlines():


[gcc r13-8619] ipa: Avoid duplicate replacements in IPA-SRA transformation phase

2024-04-19 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:8a3784adf5cd873ca295a5a011d8623338ff3976

commit r13-8619-g8a3784adf5cd873ca295a5a011d8623338ff3976
Author: Martin Jambor 
Date:   Fri Apr 19 16:48:12 2024 +0200

ipa: Avoid duplicate replacements in IPA-SRA transformation phase

When the analysis part of IPA-SRA figures out that it would split out
a scalar part of an aggregate which is known by IPA-CP to contain a
known constant, it skips it knowing that the transformation part looks
at IPA-CP aggregate results too and does the right thing (which can
include doing the propagation in GIMPLE because that is the last
moment the parameter exists).

However, when IPA-SRA wants to split out a smaller aggregate out
of an aggregate, which happens to be of the same size as a known
scalar constant at the same offset, the transformation bit fails to
recognize the situation, tries to do both splitting and constant
propagation and, in the PR 111571 testcase, creates a nonsensical call
statement on which the call redirection then ICEs.

Fixed by making sure we don't try to do two replacements of the same
part of the same parameter.

The look-up among replacements requires that they be sorted, so this
patch simply sorts them, if they are not already sorted, before each
new look-up.  The worst-case number of sorts is the number of
parameters which are both split and have aggregate constants times
param_ipa_max_agg_items (default 16).  I don't think complicating the
source code to optimize for this unlikely case is worth it but if need
be, it can of course be done.
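
A minimal C sketch of the "sort if needed, then look up, then skip
duplicates" idea described above.  This is not the GCC implementation:
struct replacement, struct repl_table and all function names here are
hypothetical stand-ins, and the fixed capacity of 16 only echoes the
default of param_ipa_max_agg_items mentioned in the message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one replacement: which parameter is
   replaced and at which byte offset within it.  */
struct replacement { int parm; unsigned offset; };

struct repl_table {
  struct replacement items[16];
  size_t count;
  bool sorted;            /* False after an append until the next sort.  */
};

/* Order by parameter index first, then by offset.  */
static int cmp_repl(const void *a, const void *b)
{
  const struct replacement *x = a, *y = b;
  if (x->parm != y->parm)
    return x->parm < y->parm ? -1 : 1;
  if (x->offset != y->offset)
    return x->offset < y->offset ? -1 : 1;
  return 0;
}

/* Binary search needs sorted data, so sort lazily before each look-up.  */
struct replacement *lookup(struct repl_table *t, int parm, unsigned offset)
{
  struct replacement key = { parm, offset };
  if (!t->sorted) {
    qsort(t->items, t->count, sizeof t->items[0], cmp_repl);
    t->sorted = true;
  }
  return bsearch(&key, t->items, t->count, sizeof t->items[0], cmp_repl);
}

/* Append an entry unless one for the same part of the same parameter
   already exists.  Returns true if the entry was added.  */
bool add_unique(struct repl_table *t, int parm, unsigned offset)
{
  if (t->count == 16 || lookup(t, parm, offset) != NULL)
    return false;
  t->items[t->count].parm = parm;
  t->items[t->count].offset = offset;
  t->count++;
  t->sorted = false;      /* The append may have broken the ordering.  */
  return true;
}
```

Sorting lazily keeps appends cheap and pays for a sort only when a
look-up follows an append, which mirrors the trade-off the message
argues is acceptable.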

gcc/ChangeLog:

2024-03-15  Martin Jambor  

PR ipa/111571
* ipa-param-manipulation.cc
(ipa_param_body_adjustments::common_initialization): Avoid creating
duplicate replacement entries.

gcc/testsuite/ChangeLog:

2024-03-15  Martin Jambor  

PR ipa/111571
* gcc.dg/ipa/pr111571.c: New test.

(cherry picked from commit ca56b43105fc09021ec445f1978a17cd85ae5e0c)

Diff:
---
 gcc/ipa-param-manipulation.cc   | 16 
 gcc/testsuite/gcc.dg/ipa/pr111571.c | 29 +
 2 files changed, 45 insertions(+)

diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc
index 182f0c6741e..e4f626ae95e 100644
--- a/gcc/ipa-param-manipulation.cc
+++ b/gcc/ipa-param-manipulation.cc
@@ -1484,6 +1484,22 @@ ipa_param_body_adjustments::common_initialization (tree old_fndecl,
 replacement with a constant (for split aggregates passed
 by value).  */
 
+ if (split[parm_num])
+   {
+ /* We must be careful not to add a duplicate
+replacement. */
+ sort_replacements ();
+ ipa_param_body_replacement *pbr
+   = lookup_replacement_1 (m_oparms[parm_num],
+   av.unit_offset);
+ if (pbr)
+   {
+ /* Otherwise IPA-SRA should have bailed out.  */
+ gcc_assert (AGGREGATE_TYPE_P (TREE_TYPE (pbr->repl)));
+ continue;
+   }
+   }
+
  tree repl;
  if (av.by_ref)
repl = av.value;
diff --git a/gcc/testsuite/gcc.dg/ipa/pr111571.c b/gcc/testsuite/gcc.dg/ipa/pr111571.c
new file mode 100644
index 000..2a4adc608db
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr111571.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2"  } */
+
+struct a {
+  int b;
+};
+struct c {
+  long d;
+  struct a e;
+  long f;
+};
+int g, h, i;
+int j() {return 0;}
+static void k(struct a l, int p) {
+  if (h)
+g = 0;
+  for (; g; g = j())
+if (l.b)
+  break;
+}
+static void m(struct c l) {
+  k(l.e, l.f);
+  for (;; --i)
+;
+}
+int main() {
+  struct c n = {10, 9};
+  m(n);
+}


[gcc r13-8620] ipa: Force args obtained through pass-through maps to the expected type (PR 113964)

2024-04-19 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:5c3238b0d55ec13a2430aa606e2bfed9432e97ac

commit r13-8620-g5c3238b0d55ec13a2430aa606e2bfed9432e97ac
Author: Martin Jambor 
Date:   Fri Apr 19 16:48:12 2024 +0200

ipa: Force args obtained through pass-through maps to the expected type (PR 113964)

Interactions of IPA-CP and IPA-SRA on the same data are a rather big
source of issues, I'm afraid.  PR 113964 is a situation where IPA-CP
propagates an unsigned short in a union parameter into a function
which itself calls a different function which has the same union
parameter, and both these union parameters are split by IPA-SRA.  The
leaf function, however, uses a signed short member of the union.

In the calling function, we get the unsigned constant as the
replacement for the union and it is then passed in the call without
any type compatibility checks.  Apparently on riscv64 it matters
whether the parameter is signed or unsigned short and so the leaf
function can see different values.

Fixed by using useless_type_conversion_p at the appropriate place and,
if it fails, using force_value_to_type as elsewhere in similar
situations.
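
To illustrate why the signedness matters, here is a small C sketch
(not taken from the patch or the testcase; union halfword and the
function names are made up for this example).  The same 16 bits
widen to different values depending on whether they are treated as
unsigned or signed, which is the difference a missing conversion can
expose.

```c
#include <stdint.h>

/* Hypothetical union with an unsigned and a signed 16-bit view of the
   same storage, loosely modelled on the union in the testcase.  */
union halfword {
  uint16_t u;
  int16_t s;
};

/* What a callee reading the signed member sees.  */
long read_signed(union halfword h)
{
  return h.s;
}

/* Passing the unsigned value along unconverted widens it as unsigned.  */
long pass_unconverted(uint16_t v)
{
  return v;
}

/* Reinterpreting the bits as the type the callee expects first gives
   the value the callee would have read from the union.  */
long pass_converted(uint16_t v)
{
  union halfword h;
  h.u = v;
  return h.s;
}
```

With all 16 bits set, pass_unconverted yields 65535 while
read_signed and pass_converted yield -1.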

gcc/ChangeLog:

2024-04-04  Martin Jambor  

PR ipa/113964
* ipa-param-manipulation.cc (ipa_param_adjustments::modify_call):
Force values obtained through pass-through maps to the expected
split type.

gcc/testsuite/ChangeLog:

2024-04-04  Patrick O'Neill  
Martin Jambor  

PR ipa/113964
* gcc.dg/ipa/pr114247.c: New test.

(cherry picked from commit 8cd0d29270d4ed86c69b80c08de66dcb6c1e22fe)

Diff:
---
 gcc/ipa-param-manipulation.cc   |  6 ++
 gcc/testsuite/gcc.dg/ipa/pr114247.c | 31 +++
 2 files changed, 37 insertions(+)

diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc
index e4f626ae95e..729d5e8e688 100644
--- a/gcc/ipa-param-manipulation.cc
+++ b/gcc/ipa-param-manipulation.cc
@@ -738,6 +738,12 @@ ipa_param_adjustments::modify_call (cgraph_edge *cs,
  }
   if (repl)
{
+ if (!useless_type_conversion_p(apm->type, repl->typed.type))
+   {
+ repl = force_value_to_type (apm->type, repl);
+ repl = force_gimple_operand_gsi (&gsi, repl,
+  true, NULL, true, GSI_SAME_STMT);
+   }
  vargs.quick_push (repl);
  continue;
}
diff --git a/gcc/testsuite/gcc.dg/ipa/pr114247.c b/gcc/testsuite/gcc.dg/ipa/pr114247.c
new file mode 100644
index 000..60aa2bc0122
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr114247.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fsigned-char -fno-strict-aliasing -fwrapv" } */
+
+union a {
+  unsigned short b;
+  int c;
+  signed short d;
+};
+int e, f = 1, g;
+long h;
+const int **i;
+void j(union a k, int l, unsigned m) {
+  const int *a[100];
+  i = &a[0];
+  h = k.d;
+}
+static int o(union a k) {
+  k.d = -1;
+  while (1)
+if (f)
+  break;
+  j(k, g, e);
+  return 0;
+}
+int main() {
+  union a n = {1};
+  o(n);
+  if (h != -1)
+__builtin_abort();
+  return 0;
+}


[gcc r13-8785] testsuite: Adjust pr113359-2_*.c with unsigned long long [PR114662]

2024-05-21 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:c827f46d8652d7a089e614302a4cffb6b192284d

commit r13-8785-gc827f46d8652d7a089e614302a4cffb6b192284d
Author: Kewen Lin 
Date:   Wed Apr 10 02:59:43 2024 -0500

testsuite: Adjust pr113359-2_*.c with unsigned long long [PR114662]

pr113359-2_*.c define a struct with unsigned long members ay and az,
which are 4 bytes in size at -m32, while the related constants CL1 and
CL2 used for the equality checks are always 8 bytes.  This makes the
compiler consider the check below

  if (a.ay != CL1)
    __builtin_abort ();

to always abort, and it optimizes away the following call to getb, so
the expected "Semantic equality" message is missing from the wpa dump.

This patch changes the types to unsigned long long accordingly.
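
A small C sketch of the truncation described above.  The value of CL1
here is a made-up 8-byte constant (the real one is not shown in this
message), and uint32_t/uint64_t stand in for unsigned long at -m32 and
unsigned long long so that the example behaves the same on any host.

```c
#include <stdint.h>

/* Hypothetical 8-byte constant; the testcase's actual CL1 may differ.  */
#define CL1 0x0123456789abcdefULL

/* A 4-byte member cannot hold CL1, so the stored value is truncated
   and the comparison against the full constant can never succeed.  */
int matches_with_4_byte_member(void)
{
  uint32_t ay = (uint32_t) CL1;
  return ay == CL1;
}

/* An 8-byte member round-trips the constant unchanged.  */
int matches_with_8_byte_member(void)
{
  uint64_t ay = CL1;
  return ay == CL1;
}
```

Because the first comparison is false for every run, a compiler is
free to treat the abort as unconditional and drop the code after it.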

PR testsuite/114662

gcc/testsuite/ChangeLog:

* gcc.dg/lto/pr113359-2_0.c: Use unsigned long long instead of
unsigned long.
* gcc.dg/lto/pr113359-2_1.c: Likewise.

(cherry picked from commit 4923ed49b93352bcf9e43cafac38345e4a54c3f8)

Diff:
---
 gcc/testsuite/gcc.dg/lto/pr113359-2_0.c | 8 
 gcc/testsuite/gcc.dg/lto/pr113359-2_1.c | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c b/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c
index 8b2d5bdfab2..8495667599d 100644
--- a/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c
+++ b/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c
@@ -8,15 +8,15 @@
 struct SA
 {
   unsigned int ax;
-  unsigned long ay;
-  unsigned long az;
+  unsigned long long ay;
+  unsigned long long az;
 };
 
 struct SB
 {
   unsigned int bx;
-  unsigned long by;
-  unsigned long bz;
+  unsigned long long by;
+  unsigned long long bz;
 };
 
 struct ZA
diff --git a/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c b/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c
index 61bc0547981..8320f347efe 100644
--- a/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c
+++ b/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c
@@ -5,15 +5,15 @@
 struct SA
 {
   unsigned int ax;
-  unsigned long ay;
-  unsigned long az;
+  unsigned long long ay;
+  unsigned long long az;
 };
 
 struct SB
 {
   unsigned int bx;
-  unsigned long by;
-  unsigned long bz;
+  unsigned long long by;
+  unsigned long long bz;
 };
 
 struct ZA


[gcc r14-10237] sra: Do not leave work for DSE (that it can sometimes not perform)

2024-05-23 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:1a6c1c85b7ab1ad4bdf9573fcdc04dcce894ba82

commit r14-10237-g1a6c1c85b7ab1ad4bdf9573fcdc04dcce894ba82
Author: Martin Jambor 
Date:   Thu May 9 16:39:44 2024 +0200

sra: Do not leave work for DSE (that it can sometimes not perform)

When looking again at the g++.dg/tree-ssa/pr109849.C testcase we
discovered that it generates terrible store-to-load forwarding stalls
because SRA was leaving behind aggregate loads but all the stores were
by scalar parts and DSE failed to remove the useless load.  SRA has
all the knowledge to remove the statement even now, so this small
patch makes it do so.

With this patch, the g++.dg/tree-ssa/pr109849.C micro-benchmark runs 9
times faster (on an AMD EPYC 75F3 machine).
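
For illustration only, a minimal C sketch of the shape of code
involved.  It is not the pr109849.C testcase, struct pair and
sum_copy are made-up names, and whether GCC treats this exact
function the same way is an assumption.  The local aggregate is
written by one aggregate copy from memory that is not itself an SRA
candidate, and is then only read field by field.

```c
/* Hypothetical two-field aggregate.  */
struct pair {
  int a;
  int b;
};

/* 'cur' is filled by an aggregate copy from *p and then only read
   through its scalar fields.  Once each field has a scalar
   replacement loaded from *p, the aggregate store to 'cur' no longer
   has any reader and can be removed.  */
int sum_copy(const struct pair *p)
{
  struct pair cur = *p;   /* Aggregate store to a local candidate.  */
  return cur.a + cur.b;   /* Uses go through the scalar parts only.  */
}
```

The result is the same either way; the difference is only whether the
redundant aggregate copy survives into the generated code.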

gcc/ChangeLog:

2024-04-18  Martin Jambor  

* tree-sra.cc (sra_modify_assign): Remove the original statement
also when dealing with a store to a fully covered aggregate from a
non-candidate.

gcc/testsuite/ChangeLog:

2024-04-23  Martin Jambor  

* g++.dg/tree-ssa/pr109849.C: Also check that the aggregate store
to cur disappears.
* gcc.dg/tree-ssa/ssa-dse-26.c: Instead of relying on DSE,
check that the unwanted stores were removed at early SRA time.

(cherry picked from commit f6743695b4d2bd4da96e56a19157372f93b800bd)

Diff:
---
 gcc/testsuite/g++.dg/tree-ssa/pr109849.C   |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c |  6 +++---
 gcc/tree-sra.cc| 14 --
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
index cd348c0f590..d06dbb10482 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-sra" } */
+/* { dg-options "-O2 -fdump-tree-sra -fdump-tree-optimized" } */
 
 #include 
 typedef unsigned int uint32_t;
@@ -29,3 +29,4 @@ main()
 }
 
 /* { dg-final { scan-tree-dump "Created a replacement for stack offset" "sra"} } */
+/* { dg-final { scan-tree-dump-not "cur = MEM" "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
index 43152de5616..1d01392c595 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dse1-details -fno-short-enums -fno-tree-fre" } */
+/* { dg-options "-O2 -fdump-tree-esra -fno-short-enums -fno-tree-fre" } */
 /* { dg-skip-if "we want a BIT_FIELD_REF from fold_truth_andor" { ! lp64 } } */
 /* { dg-skip-if "temporary variable names are not x and y" { mmix-knuth-mmixware } } */
 
@@ -31,5 +31,5 @@ constraint_equal (struct constraint a, struct constraint b)
 && constraint_expr_equal (a.rhs, b.rhs);
 }
 
-/* { dg-final { scan-tree-dump-times "Deleted dead store: x = " 2 "dse1" } } */
-/* { dg-final { scan-tree-dump-times "Deleted dead store: y = " 2 "dse1" } } */
+/* { dg-final { scan-tree-dump-not "x = " "esra" } } */
+/* { dg-final { scan-tree-dump-not "y = " "esra" } } */
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 32fa28911f2..8040b0c5645 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -4854,8 +4854,18 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)
 But use the RHS aggregate to load from to expose more
 optimization opportunities.  */
  if (access_has_children_p (lacc))
-   generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
-0, 0, gsi, true, true, loc);
+   {
+ generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
+  0, 0, gsi, true, true, loc);
+ if (lacc->grp_covered)
+   {
+ unlink_stmt_vdef (stmt);
+ gsi_remove (& orig_gsi, true);
+ release_defs (stmt);
+ sra_stats.deleted++;
+ return SRA_AM_REMOVED;
+   }
+   }
}
 
   return SRA_AM_NONE;


[gcc r12-10475] ipa: Compare jump functions in ICF (PR 113907)

2024-05-28 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:72f6b7ec3915f0b5b3517dffa19e3b34c8af687d

commit r12-10475-g72f6b7ec3915f0b5b3517dffa19e3b34c8af687d
Author: Martin Jambor 
Date:   Tue May 28 13:33:02 2024 +0200

ipa: Compare jump functions in ICF (PR 113907)

This is a manual backport of r14-9840-g1162861439fd3c from master.
Manual because the bits and value range representation in jump
functions have changed during the gcc 14 development cycle.

In PR 113907 comment #58, Honza found a case where ICF thinks bodies
of functions are equivalent but because of a difference in aliases in a
memory access, different aggregate jump functions are associated with
supposedly equivalent call statements.  This patch adds a way to
compare jump functions and plugs it into ICF to avoid the issue.
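
The shape of the added check can be sketched in plain C as follows.
This is a simplified model, not the GCC code: struct jump_func and
jump_funcs_equivalent are hypothetical stand-ins for the real jump
function representation and for ipa_jump_functions_equivalent_p.  Two
calls only match if they have the same number of arguments and, for
each argument, either both lack a jump function or both have
equivalent ones.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified jump function.  */
struct jump_func {
  int kind;
  long value;
};

/* Stand-in for the real equivalence test.  */
static bool jump_funcs_equivalent(const struct jump_func *a,
                                  const struct jump_func *b)
{
  return a->kind == b->kind && a->value == b->value;
}

/* Compare the per-argument jump functions of two call statements.
   A NULL entry means no jump function is known for that argument.  */
bool call_args_match(const struct jump_func *const *a1, size_t n1,
                     const struct jump_func *const *a2, size_t n2)
{
  if (n1 != n2)
    return false;                       /* Argument count mismatch.  */
  for (size_t i = 0; i < n1; i++) {
    const struct jump_func *jf1 = a1[i];
    const struct jump_func *jf2 = a2[i];
    if ((jf1 != NULL) != (jf2 != NULL))
      return false;                     /* Only one side has one.  */
    if (jf1 && !jump_funcs_equivalent(jf1, jf2))
      return false;                     /* Both present but differ.  */
  }
  return true;
}
```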

gcc/ChangeLog:

2024-05-14  Martin Jambor  

PR ipa/113907
* ipa-prop.h (ipa_jump_functions_equivalent_p): Declare.
(values_equal_for_ipcp_p): Likewise.
* ipa-prop.cc (ipa_agg_pass_through_jf_equivalent_p): New function.
(ipa_agg_jump_functions_equivalent_p): Likewise.
(ipa_jump_functions_equivalent_p): Likewise.
* ipa-cp.cc (values_equal_for_ipcp_p): Make function public.
* ipa-icf-gimple.cc: Include alloc-pool.h, symbol-summary.h, sreal.h,
ipa-cp.h and ipa-prop.h.
(func_checker::compare_gimple_call): Compare jump functions.

gcc/testsuite/ChangeLog:

2024-05-10  Martin Jambor  

PR ipa/113907
* gcc.dg/lto/pr113907_0.c: New.
* gcc.dg/lto/pr113907_1.c: Likewise.
* gcc.dg/lto/pr113907_2.c: Likewise.

(cherry picked from commit 1db45e83021a8a87f41e22053910fcce6e8e2c2c)

Diff:
---
 gcc/ipa-cp.cc |   2 +-
 gcc/ipa-icf-gimple.cc |  29 +++
 gcc/ipa-prop.cc   | 157 ++
 gcc/ipa-prop.h|   3 +
 gcc/testsuite/gcc.dg/lto/pr113907_0.c |  18 
 gcc/testsuite/gcc.dg/lto/pr113907_1.c |  35 
 gcc/testsuite/gcc.dg/lto/pr113907_2.c |  11 +++
 7 files changed, 254 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index fbb31f6dff2..909464f4ac4 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1402,7 +1402,7 @@ ipacp_value_safe_for_type (tree param_type, tree value)
 
 /* Return true iff X and Y should be considered equal values by IPA-CP.  */
 
-static bool
+bool
 values_equal_for_ipcp_p (tree x, tree y)
 {
   gcc_checking_assert (x != NULL_TREE && y != NULL_TREE);
diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc
index ab398ca051c..e81409c16f9 100644
--- a/gcc/ipa-icf-gimple.cc
+++ b/gcc/ipa-icf-gimple.cc
@@ -41,7 +41,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-walk.h"
 
 #include "tree-ssa-alias-compare.h"
+#include "alloc-pool.h"
+#include "symbol-summary.h"
 #include "ipa-icf-gimple.h"
+#include "sreal.h"
+#include "ipa-prop.h"
 
 namespace ipa_icf_gimple {
 
@@ -714,6 +718,31 @@ func_checker::compare_gimple_call (gcall *s1, gcall *s2)
   && !compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2)))
 return return_false_with_msg ("GIMPLE internal call LHS type mismatch");
 
+  if (!gimple_call_internal_p (s1))
+{
+  cgraph_edge *e1 = cgraph_node::get (m_source_func_decl)->get_edge (s1);
+  cgraph_edge *e2 = cgraph_node::get (m_target_func_decl)->get_edge (s2);
+  class ipa_edge_args *args1 = ipa_edge_args_sum->get (e1);
+  class ipa_edge_args *args2 = ipa_edge_args_sum->get (e2);
+  if ((args1 != nullptr) != (args2 != nullptr))
+   return return_false_with_msg ("ipa_edge_args mismatch");
+  if (args1)
+   {
+ int n1 = ipa_get_cs_argument_count (args1);
+ int n2 = ipa_get_cs_argument_count (args2);
+ if (n1 != n2)
+   return return_false_with_msg ("ipa_edge_args nargs mismatch");
+ for (int i = 0; i < n1; i++)
+   {
+ struct ipa_jump_func *jf1 = ipa_get_ith_jump_func (args1, i);
+ struct ipa_jump_func *jf2 = ipa_get_ith_jump_func (args2, i);
+ if (((jf1 != nullptr) != (jf2 != nullptr))
+ || (jf1 && !ipa_jump_functions_equivalent_p (jf1, jf2)))
+   return return_false_with_msg ("jump function mismatch");
+   }
+   }
+}
+
   return compare_operand (t1, t2, get_operand_access_type (&map, t1));
 }
 
diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 0197ac6108d..e2e83b5f3f5 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -6096,6 +6096,163 @@ ipcp_transform_function (struct cgraph_node *node)
   return modified_mem_access ? TODO_update_ssa_only_virtuals : 0;
 }
 
+/* Return true if the two pass_through components of two jump functions are
+   known to be equivalent.  AGG_JF denotes whether they are part of aggregate
+   functions or not.  The function can be 

[gcc r14-10803] ipa: Treat static constructors and destructors as non-local (PR 115815)

2024-10-18 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:f057e958732cd2627b6db127fa6d4d882b61dd5f

commit r14-10803-gf057e958732cd2627b6db127fa6d4d882b61dd5f
Author: Martin Jambor 
Date:   Fri Oct 18 21:32:16 2024 +0200

ipa: Treat static constructors and destructors as non-local (PR 115815)

In PR 115815, IPA-SRA thought it had control over all invocations of a
(recursive) static destructor but it did not see the implied
invocation which led to the original being left behind and the
clean-up code encountering uses of SSAs that definitely should have
been dead.

Fixed by teaching cgraph_node::can_be_local_p about static
constructors and destructors.  A similar test is missing in
cgraph_node::local_p so I added the check there as well.

In addition to the commit with the fix, this backport also contains
squashed commit 1a458bdeb223ffa501bac8e76182115681967094 which fixes
dejagnu directives in the testcase.
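
As a reminder of what an "implied invocation" means here, a short C
sketch using the GCC/Clang constructor attribute (the names are made
up, and this is not the testcase).  Nothing in the translation unit
calls mark_initialized, yet it runs, because the startup code invokes
it before main.  Destructors are invoked in the same implicit way at
exit.

```c
/* Set by the constructor below; no code in this file calls it.  */
static int initialized;

/* GCC/Clang extension: registered to run before main() starts.  The
   only caller is the runtime's startup code, which the call graph of
   this translation unit does not show.  */
static void __attribute__((constructor))
mark_initialized(void)
{
  initialized = 1;
}

/* Reports whether the constructor has already run.  */
int was_initialized(void)
{
  return initialized;
}
```

An analysis that only counts the visible callers would conclude that
mark_initialized is never called, which is the kind of mistake the
message describes.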

gcc/ChangeLog:

2024-07-25  Martin Jambor  

PR ipa/115815
* cgraph.cc (cgraph_node_cannot_be_local_p_1): Also check
DECL_STATIC_CONSTRUCTOR and DECL_STATIC_DESTRUCTOR.
* ipa-visibility.cc (non_local_p): Likewise.
(cgraph_node::local_p): Delete extraneous line of tabs.

gcc/testsuite/ChangeLog:

2024-07-25  Martin Jambor  

PR ipa/115815
* gcc.dg/lto/pr115815_0.c: New test.

(cherry picked from commit e98ad6a049c96c21cf641954584c2f5b7df0ce93)

Diff:
---
 gcc/cgraph.cc |  4 +++-
 gcc/ipa-visibility.cc |  5 +++--
 gcc/testsuite/gcc.dg/lto/pr115815_0.c | 22 ++
 3 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 473d8410bc97..39a3adbc7c35 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -2434,7 +2434,9 @@ cgraph_node_cannot_be_local_p_1 (cgraph_node *node, void *)
&& !node->forced_by_abi
&& !node->used_from_object_file_p ()
&& !node->same_comdat_group)
-  || !node->externally_visible));
+  || !node->externally_visible)
+  && !DECL_STATIC_CONSTRUCTOR (node->decl)
+  && !DECL_STATIC_DESTRUCTOR (node->decl));
 }
 
 /* Return true if cgraph_node can be made local for API change.
diff --git a/gcc/ipa-visibility.cc b/gcc/ipa-visibility.cc
index 501d3c304aa3..21f0c47f388e 100644
--- a/gcc/ipa-visibility.cc
+++ b/gcc/ipa-visibility.cc
@@ -102,7 +102,9 @@ non_local_p (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED)
   && !node->externally_visible
   && !node->used_from_other_partition
   && !node->in_other_partition
-  && node->get_availability () >= AVAIL_AVAILABLE);
+  && node->get_availability () >= AVAIL_AVAILABLE
+  && !DECL_STATIC_CONSTRUCTOR (node->decl)
+  && !DECL_STATIC_DESTRUCTOR (node->decl));
 }
 
 /* Return true when function can be marked local.  */
@@ -116,7 +118,6 @@ cgraph_node::local_p (void)
  return n->callees->callee->local_p ();
return !n->call_for_symbol_thunks_and_aliases (non_local_p,
  NULL, true);
-   
 }
 
 /* A helper for comdat_can_be_unshared_p.  */
diff --git a/gcc/testsuite/gcc.dg/lto/pr115815_0.c b/gcc/testsuite/gcc.dg/lto/pr115815_0.c
new file mode 100644
index ..ade91def55b0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr115815_0.c
@@ -0,0 +1,22 @@
+/* { dg-lto-options {{-O2 -flto}} }  */
+/* { dg-lto-do link } */
+/* { dg-require-effective-target global_constructor } */
+
+int a;
+volatile int v;
+volatile int w;
+
+int __attribute__((destructor))
+b() {
+  if (v)
+return a + b();
+  v = 5;
+  return 0;
+}
+
+int
+main (int argc, char **argv)
+{
+  w = 1;
+  return 0;
+}


[gcc r15-4464] testsuite: Add necessary dejagnu directives to pr115815_0.c

2024-10-18 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:1a458bdeb223ffa501bac8e76182115681967094

commit r15-4464-g1a458bdeb223ffa501bac8e76182115681967094
Author: Martin Jambor 
Date:   Fri Oct 18 12:00:12 2024 +0200

testsuite: Add necessary dejagnu directives to pr115815_0.c

I have received an email from the Linaro infrastructure that the test
gcc.dg/lto/pr115815_0.c which I added is failing on arm-eabi and I
realized that not only is it missing dg-require-effective-target
global_constructor but actually any dejagnu directives at all, which
means it is unnecessarily running both at -O0 and -O2 and there is an
unnecessary run test too.  All fixed by this patch.

I have not actually verified that the failure goes away on arm-eabi
but have very high hopes it will.  I have verified that the test still
checks for the bug and also that it passes by running:

  make -k check-gcc RUNTESTFLAGS="lto.exp=*pr115815*"

gcc/testsuite/ChangeLog:

2024-10-14  Martin Jambor  

* gcc.dg/lto/pr115815_0.c: Add dejagnu directives.

Diff:
---
 gcc/testsuite/gcc.dg/lto/pr115815_0.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/lto/pr115815_0.c b/gcc/testsuite/gcc.dg/lto/pr115815_0.c
index d938ae4c8025..ade91def55b0 100644
--- a/gcc/testsuite/gcc.dg/lto/pr115815_0.c
+++ b/gcc/testsuite/gcc.dg/lto/pr115815_0.c
@@ -1,3 +1,7 @@
+/* { dg-lto-options {{-O2 -flto}} }  */
+/* { dg-lto-do link } */
+/* { dg-require-effective-target global_constructor } */
+
 int a;
 volatile int v;
 volatile int w;


[gcc r15-4564] tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

2024-10-23 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:29d8f1f0b7ad3c69b3bdb130325300d5f73aa784

commit r15-4564-g29d8f1f0b7ad3c69b3bdb130325300d5f73aa784
Author: Martin Jambor 
Date:   Wed Oct 23 11:30:32 2024 +0200

tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

PR 117142 shows that the current SRA probably never worked reliably
with arguments passed to a function returning twice, because it then
creates statements before the call which however needs to be at the
beginning of a basic block.

While it should be possible to make at least the case of passing
arguments by value work with SRA (the statements would need to be put
just on the non-abnormal edges leading to the BB), this would mean
large surgery of function sra_modify_expr and I guess the time would
better be spent re-organizing the whole pass.
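
For readers unfamiliar with functions that return twice, the classic
example is setjmp.  The following C sketch (count_returns is a
made-up name, and the example is unrelated to the testcase) shows one
call site being returned from two times, which is why such a call has
an extra incoming control-flow path that code inserted just before it
would not be on.

```c
#include <setjmp.h>

static jmp_buf env;

/* Counts how many times the single setjmp call below returns.  The
   counter is volatile so its value is well defined after longjmp.  */
int count_returns(void)
{
  volatile int returns = 0;

  if (setjmp(env) == 0) {
    returns++;            /* First return: the direct call.  */
    longjmp(env, 1);      /* Jump back into the setjmp call.  */
  } else {
    returns++;            /* Second return: via longjmp.  */
  }
  return returns;
}
```

Here count_returns yields 2, one increment per return from the same
call.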

gcc/ChangeLog:

2024-10-21  Martin Jambor  

PR tree-optimization/117142
* tree-sra.cc (build_access_from_call_arg): Disqualify any
candidate passed to a function returning twice.

gcc/testsuite/ChangeLog:

2024-10-21  Martin Jambor  

PR tree-optimization/117142
* gcc.dg/tree-ssa/pr117142.c: New test.

Diff:
---
 gcc/testsuite/gcc.dg/tree-ssa/pr117142.c | 14 ++
 gcc/tree-sra.cc  |  9 +
 2 files changed, 23 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c
new file mode 100644
index ..fc62c1e58f2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+struct a {
+  int b;
+};
+void c(int, int);
+void __attribute__((returns_twice))
+bar1(struct a);
+void bar(struct a) {
+  struct a d;
+  bar1(d);
+  c(d.b, d.b);
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 64e2f007d680..c0915dce5c4a 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -1397,6 +1397,15 @@ static bool
 build_access_from_call_arg (tree expr, gimple *stmt, bool can_be_returned,
enum out_edge_check *oe_check)
 {
+  if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE)
+{
+  tree base = expr;
+  if (TREE_CODE (expr) == ADDR_EXPR)
+   base = get_base_address (TREE_OPERAND (expr, 0));
+  disqualify_base_of_expr (base, "Passed to a returns_twice call.");
+  return false;
+}
+
   if (TREE_CODE (expr) == ADDR_EXPR)
 {
   tree base = get_base_address (TREE_OPERAND (expr, 0));


[gcc r13-9143] ipa: Treat static constructors and destructors as non-local (PR 115815)

2024-10-22 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:005ce1c1826777f33d5011723827d17f1fcd55c1

commit r13-9143-g005ce1c1826777f33d5011723827d17f1fcd55c1
Author: Martin Jambor 
Date:   Fri Oct 18 21:32:16 2024 +0200

ipa: Treat static constructors and destructors as non-local (PR 115815)

In PR 115815, IPA-SRA thought it had control over all invocations of a
(recursive) static destructor but it did not see the implied
invocation which led to the original being left behind and the
clean-up code encountering uses of SSAs that definitely should have
been dead.

Fixed by teaching cgraph_node::can_be_local_p about static
constructors and destructors.  A similar test is missing in
cgraph_node::local_p so I added the check there as well.

In addition to the commit with the fix, this backport also contains
squashed commit 1a458bdeb223ffa501bac8e76182115681967094 which fixes
dejagnu directives in the testcase.

gcc/ChangeLog:

2024-07-25  Martin Jambor  

PR ipa/115815
* cgraph.cc (cgraph_node_cannot_be_local_p_1): Also check
DECL_STATIC_CONSTRUCTOR and DECL_STATIC_DESTRUCTOR.
* ipa-visibility.cc (non_local_p): Likewise.
(cgraph_node::local_p): Delete extraneous line of tabs.

gcc/testsuite/ChangeLog:

2024-07-25  Martin Jambor  

PR ipa/115815
* gcc.dg/lto/pr115815_0.c: New test.

(cherry picked from commit e98ad6a049c96c21cf641954584c2f5b7df0ce93)

Diff:
---
 gcc/cgraph.cc |  4 +++-
 gcc/ipa-visibility.cc |  5 +++--
 gcc/testsuite/gcc.dg/lto/pr115815_0.c | 22 ++
 3 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 7a14c00b60a0..ad71cf82823c 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -2433,7 +2433,9 @@ cgraph_node_cannot_be_local_p_1 (cgraph_node *node, void *)
&& !node->forced_by_abi
&& !node->used_from_object_file_p ()
&& !node->same_comdat_group)
-  || !node->externally_visible));
+  || !node->externally_visible)
+  && !DECL_STATIC_CONSTRUCTOR (node->decl)
+  && !DECL_STATIC_DESTRUCTOR (node->decl));
 }
 
 /* Return true if cgraph_node can be made local for API change.
diff --git a/gcc/ipa-visibility.cc b/gcc/ipa-visibility.cc
index 8ec82bb333e2..9ca0e39df950 100644
--- a/gcc/ipa-visibility.cc
+++ b/gcc/ipa-visibility.cc
@@ -102,7 +102,9 @@ non_local_p (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED)
   && !node->externally_visible
   && !node->used_from_other_partition
   && !node->in_other_partition
-  && node->get_availability () >= AVAIL_AVAILABLE);
+  && node->get_availability () >= AVAIL_AVAILABLE
+  && !DECL_STATIC_CONSTRUCTOR (node->decl)
+  && !DECL_STATIC_DESTRUCTOR (node->decl));
 }
 
 /* Return true when function can be marked local.  */
@@ -116,7 +118,6 @@ cgraph_node::local_p (void)
  return n->callees->callee->local_p ();
return !n->call_for_symbol_thunks_and_aliases (non_local_p,
  NULL, true);
-   
 }
 
 /* A helper for comdat_can_be_unshared_p.  */
diff --git a/gcc/testsuite/gcc.dg/lto/pr115815_0.c b/gcc/testsuite/gcc.dg/lto/pr115815_0.c
new file mode 100644
index ..ade91def55b0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr115815_0.c
@@ -0,0 +1,22 @@
+/* { dg-lto-options {{-O2 -flto}} }  */
+/* { dg-lto-do link } */
+/* { dg-require-effective-target global_constructor } */
+
+int a;
+volatile int v;
+volatile int w;
+
+int __attribute__((destructor))
+b() {
+  if (v)
+return a + b();
+  v = 5;
+  return 0;
+}
+
+int
+main (int argc, char **argv)
+{
+  w = 1;
+  return 0;
+}


[gcc r15-5637] ipa: Move individual jump function copying to a separate function

2024-11-24 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:cc5779fcaf76aeee005f986eb1dc15205c696544

commit r15-5637-gcc5779fcaf76aeee005f986eb1dc15205c696544
Author: Martin Jambor 
Date:   Sun Nov 24 23:03:43 2024 +0100

ipa: Move individual jump function copying to a separate function

When reviewing various IPA bits and pieces I have falsely assumed
that jump function duplication misses copying important bits because
it relies on vec_safe_copy-ing all data in the vector of jump
functions and then just fixes up the few fields it needs to.

Perhaps more importantly, we do want a function to copy one individual
jump function to form jump functions for planned call-graph edges that
model transfer of control to OpenMP outlined regions through calls to
gomp functions.

Therefore, this patch introduces such a function and makes
ipa_edge_args_sum_t::duplicate just allocate the new vectors and then
use the new function to copy the data.
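
The refactoring can be pictured with the following simplified C
sketch.  It is only a model of the idea: struct jump_func,
copy_jump_func and duplicate_args are hypothetical names, and the
real jump functions carry much more state than one owned array.  One
function copies a single element, including the data it owns, and the
duplication routine just loops over it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified jump function that owns an item array.  */
struct jump_func {
  int type;
  size_t n_items;
  int *items;
};

/* Copy one jump function, deep-copying the array it owns.  Returns 0
   on success and -1 if memory could not be allocated.  */
int copy_jump_func(const struct jump_func *src, struct jump_func *dst)
{
  dst->type = src->type;
  dst->n_items = src->n_items;
  dst->items = NULL;
  if (src->n_items == 0)
    return 0;
  dst->items = malloc(src->n_items * sizeof *dst->items);
  if (dst->items == NULL)
    return -1;
  memcpy(dst->items, src->items, src->n_items * sizeof *dst->items);
  return 0;
}

/* Duplicate a whole argument vector by copying element by element.
   On failure the caller is responsible for freeing what was copied.  */
int duplicate_args(const struct jump_func *old_jfs,
                   struct jump_func *new_jfs, size_t count)
{
  for (size_t i = 0; i < count; i++)
    if (copy_jump_func(&old_jfs[i], &new_jfs[i]) != 0)
      return -1;
  return 0;
}
```

Keeping the per-element copy separate also makes it usable when only
one jump function needs to be created for a new edge.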

gcc/ChangeLog:

2024-11-01  Martin Jambor  

* ipa-prop.cc (ipa_duplicate_jump_function): New function.
(ipa_edge_args_sum_t::duplicate): Move individual jump function
copying to ipa_duplicate_jump_function.

Diff:
---
 gcc/ipa-prop.cc | 188 +---
 1 file changed, 111 insertions(+), 77 deletions(-)

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index cbc825670fe0..9070a45f6835 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -4503,99 +4503,96 @@ ipa_edge_args_sum_t::remove (cgraph_edge *cs, ipa_edge_args *args)
 }
 }
 
-/* Method invoked when an edge is duplicated.  Copy ipa_edge_args and adjust
-   reference count data strucutres accordingly.  */
+/* Copy information from SRC_JF to DST_JF which correstpond to call graph edges
+   SRC and DST.  */
 
-void
-ipa_edge_args_sum_t::duplicate (cgraph_edge *src, cgraph_edge *dst,
-   ipa_edge_args *old_args, ipa_edge_args *new_args)
+static void
+ipa_duplicate_jump_function (cgraph_edge *src, cgraph_edge *dst,
+ipa_jump_func *src_jf, ipa_jump_func *dst_jf)
 {
-  unsigned int i;
+  dst_jf->agg.items = vec_safe_copy (src_jf->agg.items);
+  dst_jf->agg.by_ref = src_jf->agg.by_ref;
 
-  new_args->jump_functions = vec_safe_copy (old_args->jump_functions);
-  if (old_args->polymorphic_call_contexts)
-new_args->polymorphic_call_contexts
-  = vec_safe_copy (old_args->polymorphic_call_contexts);
+  /* We can avoid calling ipa_set_jfunc_vr since it would only look up the
+ place in the hash_table where the source m_vr resides.  */
+  dst_jf->m_vr = src_jf->m_vr;
 
-  for (i = 0; i < vec_safe_length (old_args->jump_functions); i++)
+  if (src_jf->type == IPA_JF_CONST)
 {
-  struct ipa_jump_func *src_jf = ipa_get_ith_jump_func (old_args, i);
-  struct ipa_jump_func *dst_jf = ipa_get_ith_jump_func (new_args, i);
-
-  dst_jf->agg.items = vec_safe_copy (dst_jf->agg.items);
+  ipa_set_jf_cst_copy (dst_jf, src_jf);
+  struct ipa_cst_ref_desc *src_rdesc = jfunc_rdesc_usable (src_jf);
 
-  if (src_jf->type == IPA_JF_CONST)
+  if (!src_rdesc)
+   dst_jf->value.constant.rdesc = NULL;
+  else if (src->caller == dst->caller)
{
- struct ipa_cst_ref_desc *src_rdesc = jfunc_rdesc_usable (src_jf);
-
- if (!src_rdesc)
-   dst_jf->value.constant.rdesc = NULL;
- else if (src->caller == dst->caller)
-   {
- /* Creation of a speculative edge.  If the source edge is the one
-grabbing a reference, we must create a new (duplicate)
-reference description.  Otherwise they refer to the same
-description corresponding to a reference taken in a function
-src->caller is inlined to.  In that case we just must
-increment the refcount.  */
- if (src_rdesc->cs == src)
-   {
-  symtab_node *n = symtab_node_for_jfunc (src_jf);
-  gcc_checking_assert (n);
-  ipa_ref *ref
-= src->caller->find_reference (n, src->call_stmt,
-   src->lto_stmt_uid,
-   IPA_REF_ADDR);
-  gcc_checking_assert (ref);
-  dst->caller->clone_reference (ref, ref->stmt);
-
-  ipa_cst_ref_desc *dst_rdesc = ipa_refdesc_pool.allocate ();
-  dst_rdesc->cs = dst;
-  dst_rdesc->refcount = src_rdesc->refcount;
-  dst_rdesc->next_duplicate = NULL;
-  dst_jf->value.constant.rdesc = dst_rdesc;
-   }
- else
-   {
- src_rdesc->refcount++;
- dst_jf->value.constant.rdesc = src_rdesc;
-   }
-   }
- else if (src_rdesc->cs == src)
+ /* Creation of a sp

[gcc r14-10997] tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

2024-11-28 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:8fd9461976b325efd134f9004a7958ebd008148f

commit r14-10997-g8fd9461976b325efd134f9004a7958ebd008148f
Author: Martin Jambor 
Date:   Wed Oct 23 11:30:32 2024 +0200

tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

PR 117142 shows that the current SRA probably never worked reliably
with arguments passed to a function returning twice, because it then
creates statements before the call, even though such a call needs to
be at the beginning of a basic block.

While it should be possible to make at least the case of passing
arguments by value work with SRA (the statements would need to be put
just on the non-abnormal edges leading to the BB), this would require
major surgery on function sra_modify_expr, and I guess the time would
be better spent re-organizing the whole pass.
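
For readers unfamiliar with the term, a function "returning twice" is
one like setjmp, where control can arrive at the code after the call a
second time.  The following is only a minimal C++ sketch of that
behaviour; it does not model SRA, and the name count_setjmp_returns is
made up for the illustration.

```cpp
#include <cassert>
#include <csetjmp>

/* Hypothetical helper: report how many times one setjmp call site
   returned.  */
int
count_setjmp_returns ()
{
  std::jmp_buf env;
  /* Volatile because it is modified between setjmp and longjmp and read
     after the second return.  */
  volatile int returns = 0;

  if (setjmp (env) == 0)
    {
      /* First, ordinary return of setjmp.  */
      returns = returns + 1;
      std::longjmp (env, 1);
    }

  /* Reached only after longjmp made setjmp return a second time.  */
  assert (returns == 1);
  returns = returns + 1;
  return returns;
}
```

The second arrival is the one that comes in over an abnormal edge, which
is why statements cannot simply be placed in front of such a call.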

gcc/ChangeLog:

2024-10-21  Martin Jambor  

PR tree-optimization/117142
* tree-sra.cc (build_access_from_call_arg): Disqualify any
candidate passed to a function returning twice.

gcc/testsuite/ChangeLog:

2024-10-21  Martin Jambor  

PR tree-optimization/117142
* gcc.dg/tree-ssa/pr117142.c: New test.

(cherry picked from commit 29d8f1f0b7ad3c69b3bdb130325300d5f73aa784)

Diff:
---
 gcc/testsuite/gcc.dg/tree-ssa/pr117142.c | 14 ++
 gcc/tree-sra.cc  |  9 +
 2 files changed, 23 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c
new file mode 100644
index ..fc62c1e58f2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+struct a {
+  int b;
+};
+void c(int, int);
+void __attribute__((returns_twice))
+bar1(struct a);
+void bar(struct a) {
+  struct a d;
+  bar1(d);
+  c(d.b, d.b);
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 8040b0c56451..c91e40ef7e71 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -1397,6 +1397,15 @@ static bool
 build_access_from_call_arg (tree expr, gimple *stmt, bool can_be_returned,
enum out_edge_check *oe_check)
 {
+  if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE)
+{
+  tree base = expr;
+  if (TREE_CODE (expr) == ADDR_EXPR)
+   base = get_base_address (TREE_OPERAND (expr, 0));
+  disqualify_base_of_expr (base, "Passed to a returns_twice call.");
+  return false;
+}
+
   if (TREE_CODE (expr) == ADDR_EXPR)
 {
   tree base = get_base_address (TREE_OPERAND (expr, 0));


[gcc r12-10836] tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

2024-11-27 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:dc0e962ea18667bc3cdabcafef85b241a4f2c678

commit r12-10836-gdc0e962ea18667bc3cdabcafef85b241a4f2c678
Author: Martin Jambor 
Date:   Fri Nov 15 14:37:06 2024 +0100

tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

This is a manual backport of commit
29d8f1f0b7ad3c69b3bdb130325300d5f73aa784 which must be done in a
slightly different place for gcc 13 and 12 because function
build_access_from_call_arg was added only in gcc 14.

But the gist of the patch is the same.  The commit message of the
original fix says:

PR 117142 shows that the current SRA probably never worked reliably
with arguments passed to a function returning twice, because it then
creates statements before the call, even though such a call needs to
be at the beginning of a basic block.

While it should be possible to make at least the case of passing
arguments by value work with SRA (the statements would need to be put
just on the non-abnormal edges leading to the BB), this would require
major surgery on function sra_modify_expr, and I guess the time would
be better spent re-organizing the whole pass.

gcc/ChangeLog:

2024-11-14  Martin Jambor  

PR tree-optimization/117142
* tree-sra.cc (scan_function): Disqualify any candidate passed to
a function returning twice.

gcc/testsuite/ChangeLog:

2024-11-14  Martin Jambor  

* gcc.dg/tree-ssa/pr117142.c: New test.

(cherry picked from commit 6244de432a5ba9807c6f0065e70a8025af7b1bd6)

Diff:
---
 gcc/testsuite/gcc.dg/tree-ssa/pr117142.c | 14 ++
 gcc/tree-sra.cc  | 13 ++---
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c
new file mode 100644
index ..fc62c1e58f2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+struct a {
+  int b;
+};
+void c(int, int);
+void __attribute__((returns_twice))
+bar1(struct a);
+void bar(struct a) {
+  struct a d;
+  bar1(d);
+  c(d.b, d.b);
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 47eee5add126..5a9eaf31b6e9 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -1392,9 +1392,16 @@ scan_function (void)
  break;
 
case GIMPLE_CALL:
- for (i = 0; i < gimple_call_num_args (stmt); i++)
-   ret |= build_access_from_expr (gimple_call_arg (stmt, i),
-  stmt, false);
+ if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE)
+   {
+ for (i = 0; i < gimple_call_num_args (stmt); i++)
+   disqualify_base_of_expr (gimple_call_arg (stmt, i),
+"Passed to a returns_twice call.");
+   }
+ else
+   for (i = 0; i < gimple_call_num_args (stmt); i++)
+ ret |= build_access_from_expr (gimple_call_arg (stmt, i),
+stmt, false);
 
  t = gimple_call_lhs (stmt);
  if (t && !disqualify_if_bad_bb_terminating_stmt (stmt, t, NULL))


[gcc r13-9193] tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

2024-11-15 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:6244de432a5ba9807c6f0065e70a8025af7b1bd6

commit r13-9193-g6244de432a5ba9807c6f0065e70a8025af7b1bd6
Author: Martin Jambor 
Date:   Fri Nov 15 14:37:06 2024 +0100

tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

This is a manual backport of commit
29d8f1f0b7ad3c69b3bdb130325300d5f73aa784 which must be done in a
slightly different place for gcc 13 and 12 because function
build_access_from_call_arg was added only in gcc 14.

But the gist of the patch is the same.  The commit message of the
original fix says:

PR 117142 shows that the current SRA probably never worked reliably
with arguments passed to a function returning twice, because it then
creates statements before the call, even though such a call needs to
be at the beginning of a basic block.

While it should be possible to make at least the case of passing
arguments by value work with SRA (the statements would need to be put
just on the non-abnormal edges leading to the BB), this would require
major surgery on function sra_modify_expr, and I guess the time would
be better spent re-organizing the whole pass.

gcc/ChangeLog:

2024-11-14  Martin Jambor  

PR tree-optimization/117142
* tree-sra.cc (scan_function): Disqualify any candidate passed to
a function returning twice.

gcc/testsuite/ChangeLog:

2024-11-14  Martin Jambor  

* gcc.dg/tree-ssa/pr117142.c: New test.

Diff:
---
 gcc/testsuite/gcc.dg/tree-ssa/pr117142.c | 14 ++
 gcc/tree-sra.cc  | 13 ++---
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c
new file mode 100644
index ..fc62c1e58f2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+struct a {
+  int b;
+};
+void c(int, int);
+void __attribute__((returns_twice))
+bar1(struct a);
+void bar(struct a) {
+  struct a d;
+  bar1(d);
+  c(d.b, d.b);
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 77508894772d..8a9cbeec4908 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -1504,9 +1504,16 @@ scan_function (void)
  break;
 
case GIMPLE_CALL:
- for (i = 0; i < gimple_call_num_args (stmt); i++)
-   ret |= build_access_from_expr (gimple_call_arg (stmt, i),
-  stmt, false);
+ if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE)
+   {
+ for (i = 0; i < gimple_call_num_args (stmt); i++)
+   disqualify_base_of_expr (gimple_call_arg (stmt, i),
+"Passed to a returns_twice call.");
+   }
+ else
+   for (i = 0; i < gimple_call_num_args (stmt); i++)
+ ret |= build_access_from_expr (gimple_call_arg (stmt, i),
+stmt, false);
 
  t = gimple_call_lhs (stmt);
  if (t && !disqualify_if_bad_bb_terminating_stmt (stmt, t, NULL))


[gcc r15-5291] ipa: Rationalize IPA-VR computations across pass-through jump functions

2024-11-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:012f5a22bac26a898ab66655965b07ac23201fdd

commit r15-5291-g012f5a22bac26a898ab66655965b07ac23201fdd
Author: Martin Jambor 
Date:   Thu Nov 14 20:55:06 2024 +0100

ipa: Rationalize IPA-VR computations across pass-through jump functions

Currently ipa_value_range_from_jfunc and
propagate_vr_across_jump_function contain similar but not identical
code for dealing with pass-through jump functions.  This patch moves
these common bits into one function which can also handle comparison
operations.
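
As a rough illustration of the idea (not of the GCC implementation), the
shared step can be thought of as: fold the source range through the
operation of the jump function, then intersect the result with what is
already known.  Everything named toy_* below is hypothetical, ranges are
plain closed intervals of long, and overflow is ignored.

```cpp
#include <algorithm>
#include <cassert>

/* Toy stand-in for a value range: the closed interval [lo, hi].  */
struct toy_range
{
  long lo;
  long hi;
};

/* Toy stand-in for the operation of a pass-through jump function.  */
enum class toy_op { nop, plus, negate };

/* Range of "SRC OP OPERAND"; OPERAND is ignored for unary operations.  */
toy_range
toy_fold (toy_op op, toy_range src, long operand)
{
  switch (op)
    {
    case toy_op::plus:
      return { src.lo + operand, src.hi + operand };
    case toy_op::negate:
      return { -src.hi, -src.lo };
    default:
      return src;
    }
}

/* Narrow VR by what can be deduced from SRC flowing through OP.  */
void
toy_intersect_with_arith (toy_range &vr, toy_op op, toy_range src,
                          long operand)
{
  assert (src.lo <= src.hi);
  toy_range res = toy_fold (op, src, operand);
  vr.lo = std::max (vr.lo, res.lo);
  vr.hi = std::min (vr.hi, res.hi);
}
```

Starting from a known range [0, 10] and a source range [5, 8] passed
through "+ 3", the sketch narrows the result to [8, 10].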

gcc/ChangeLog:

2024-11-01  Martin Jambor  

PR ipa/114985
* ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): New function.
(ipa_value_range_from_jfunc): Move the common functionality to the
above new function, adjust the rest so that it works with it well.
(propagate_vr_across_jump_function): Likewise.

Diff:
---
 gcc/ipa-cp.cc | 181 ++
 1 file changed, 67 insertions(+), 114 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index fb65ec0c6a62..25741cf47bb0 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1692,6 +1692,55 @@ ipa_vr_operation_and_type_effects (vrange &dst_vr,
dst_type, src_type);
 }
 
+/* Given a PASS_THROUGH jump function JFUNC that takes as its source SRC_VR of
+   SRC_TYPE and the result needs to be DST_TYPE, if any value range information
+   can be deduced at all, intersect VR with it.  */
+
+static void
+ipa_vr_intersect_with_arith_jfunc (vrange &vr,
+  ipa_jump_func *jfunc,
+  const value_range &src_vr,
+  tree src_type,
+  tree dst_type)
+{
+  if (src_vr.undefined_p () || src_vr.varying_p ())
+return;
+
+  enum tree_code operation = ipa_get_jf_pass_through_operation (jfunc);
+  if (TREE_CODE_CLASS (operation) == tcc_unary)
+{
+  value_range tmp_res (dst_type);
+  if (ipa_vr_operation_and_type_effects (tmp_res, src_vr, operation,
+dst_type, src_type))
+   vr.intersect (tmp_res);
+  return;
+}
+
+  tree operand = ipa_get_jf_pass_through_operand (jfunc);
+  range_op_handler handler (operation);
+  if (!handler)
+return;
+  value_range op_vr (TREE_TYPE (operand));
+  ipa_range_set_and_normalize (op_vr, operand);
+
+  tree operation_type;
+  if (TREE_CODE_CLASS (operation) == tcc_comparison)
+operation_type = boolean_type_node;
+  else
+operation_type = src_type;
+
+  value_range op_res (dst_type);
+  if (!ipa_vr_supported_type_p (operation_type)
+  || !handler.operand_check_p (operation_type, src_type, op_vr.type ())
+  || !handler.fold_range (op_res, operation_type, src_vr, op_vr))
+return;
+
+  value_range tmp_res (dst_type);
+  if (ipa_vr_operation_and_type_effects (tmp_res, op_res, NOP_EXPR, dst_type,
+operation_type))
+  vr.intersect (tmp_res);
+}
+
 /* Determine range of JFUNC given that INFO describes the caller node or
the one it is inlined to, CS is the call graph edge corresponding to JFUNC
and PARM_TYPE of the parameter.  */
@@ -1701,18 +1750,18 @@ ipa_value_range_from_jfunc (vrange &vr,
ipa_node_params *info, cgraph_edge *cs,
ipa_jump_func *jfunc, tree parm_type)
 {
-  vr.set_undefined ();
+  vr.set_varying (parm_type);
 
-  if (jfunc->m_vr)
+  if (jfunc->m_vr && jfunc->m_vr->known_p ())
 ipa_vr_operation_and_type_effects (vr,
   *jfunc->m_vr,
   NOP_EXPR, parm_type,
   jfunc->m_vr->type ());
   if (vr.singleton_p ())
 return;
+
   if (jfunc->type == IPA_JF_PASS_THROUGH)
 {
-  int idx;
   ipcp_transformation *sum
= ipcp_get_transformation_summary (cs->caller->inlined_to
   ? cs->caller->inlined_to
@@ -1720,54 +1769,15 @@ ipa_value_range_from_jfunc (vrange &vr,
   if (!sum || !sum->m_vr)
return;
 
-  idx = ipa_get_jf_pass_through_formal_id (jfunc);
+  int idx = ipa_get_jf_pass_through_formal_id (jfunc);
 
   if (!(*sum->m_vr)[idx].known_p ())
return;
-  tree vr_type = ipa_get_type (info, idx);
+  tree src_type = ipa_get_type (info, idx);
   value_range srcvr;
   (*sum->m_vr)[idx].get_vrange (srcvr);
 
-  enum tree_code operation = ipa_get_jf_pass_through_operation (jfunc);
-
-  if (TREE_CODE_CLASS (operation) == tcc_unary)
-   {
- value_range res (parm_type);
-
- if (ipa_vr_operation_and_type_effects (res,
-srcvr,
-operation, parm_type,
-vr_type))
- 

[gcc r15-5240] ipa: Introduce a one jump function dumping function

2024-11-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:f927264935972145bb71f1cdb26263a5446671e1

commit r15-5240-gf927264935972145bb71f1cdb26263a5446671e1
Author: Martin Jambor 
Date:   Thu Nov 14 14:42:27 2024 +0100

ipa: Introduce a one jump function dumping function

I plan to introduce a verifier that, when it fails, prints a single
jump function using the function introduced in this patch.  Because it
is a verifier, the risk that it will need to be reverted is non-zero,
and because the function can be useful on its own, this is a separate
patch to introduce it.

gcc/ChangeLog:

2024-11-01  Martin Jambor  

* ipa-prop.h (ipa_dump_jump_function): Declare.
* ipa-prop.cc (ipa_dump_jump_function): New function.
(ipa_print_node_jump_functions_for_edge): Move printing of
individual jump functions to the new function.

Diff:
---
 gcc/ipa-prop.cc | 209 +---
 gcc/ipa-prop.h  |   2 +
 2 files changed, 110 insertions(+), 101 deletions(-)

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index fd18f847e460..2a0d4503f525 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -429,126 +429,133 @@ ipa_print_constant_value (FILE *f, tree val)
 }
 }
 
-/* Print the jump functions associated with call graph edge CS to file F.  */
+/* Print contents of JFUNC to F.  If CTX is non-NULL, dump it too.  */
 
-static void
-ipa_print_node_jump_functions_for_edge (FILE *f, struct cgraph_edge *cs)
+DEBUG_FUNCTION void
+ipa_dump_jump_function (FILE *f, ipa_jump_func *jump_func,
+   class ipa_polymorphic_call_context *ctx)
 {
-  ipa_edge_args *args = ipa_edge_args_sum->get (cs);
-  int count = ipa_get_cs_argument_count (args);
+  enum jump_func_type type = jump_func->type;
 
-  for (int i = 0; i < count; i++)
+  if (type == IPA_JF_UNKNOWN)
+fprintf (f, "UNKNOWN\n");
+  else if (type == IPA_JF_CONST)
 {
-  struct ipa_jump_func *jump_func;
-  enum jump_func_type type;
-
-  jump_func = ipa_get_ith_jump_func (args, i);
-  type = jump_func->type;
-
-  fprintf (f, "   param %d: ", i);
-  if (type == IPA_JF_UNKNOWN)
-   fprintf (f, "UNKNOWN\n");
-  else if (type == IPA_JF_CONST)
+  fprintf (f, "CONST: ");
+  ipa_print_constant_value (f, jump_func->value.constant.value);
+  fprintf (f, "\n");
+}
+  else if (type == IPA_JF_PASS_THROUGH)
+{
+  fprintf (f, "PASS THROUGH: ");
+  fprintf (f, "%d, op %s",
+  jump_func->value.pass_through.formal_id,
+  get_tree_code_name(jump_func->value.pass_through.operation));
+  if (jump_func->value.pass_through.operation != NOP_EXPR)
{
- fprintf (f, "CONST: ");
- ipa_print_constant_value (f, jump_func->value.constant.value);
- fprintf (f, "\n");
+ fprintf (f, " ");
+ print_generic_expr (f, jump_func->value.pass_through.operand);
}
-  else if (type == IPA_JF_PASS_THROUGH)
+  if (jump_func->value.pass_through.agg_preserved)
+   fprintf (f, ", agg_preserved");
+  if (jump_func->value.pass_through.refdesc_decremented)
+   fprintf (f, ", refdesc_decremented");
+  fprintf (f, "\n");
+}
+  else if (type == IPA_JF_ANCESTOR)
+{
+  fprintf (f, "ANCESTOR: ");
+  fprintf (f, "%d, offset " HOST_WIDE_INT_PRINT_DEC,
+  jump_func->value.ancestor.formal_id,
+  jump_func->value.ancestor.offset);
+  if (jump_func->value.ancestor.agg_preserved)
+   fprintf (f, ", agg_preserved");
+  if (jump_func->value.ancestor.keep_null)
+   fprintf (f, ", keep_null");
+  fprintf (f, "\n");
+}
+
+  if (jump_func->agg.items)
+{
+  struct ipa_agg_jf_item *item;
+  int j;
+
+  fprintf (f, " Aggregate passed by %s:\n",
+  jump_func->agg.by_ref ? "reference" : "value");
+  FOR_EACH_VEC_ELT (*jump_func->agg.items, j, item)
{
- fprintf (f, "PASS THROUGH: ");
- fprintf (f, "%d, op %s",
-  jump_func->value.pass_through.formal_id,
-  get_tree_code_name(jump_func->value.pass_through.operation));
- if (jump_func->value.pass_through.operation != NOP_EXPR)
+ fprintf (f, "   offset: " HOST_WIDE_INT_PRINT_DEC ", ",
+  item->offset);
+ fprintf (f, "type: ");
+ print_generic_expr (f, item->type);
+ fprintf (f, ", ");
+ if (item->jftype == IPA_JF_PASS_THROUGH)
+   fprintf (f, "PASS THROUGH: %d,",
+item->value.pass_through.formal_id);
+ else if (item->jftype == IPA_JF_LOAD_AGG)
{
- fprintf (f, " ");
- print_generic_expr (f, jump_func->value.pass_through.operand);
+ fprintf (f, "LOAD AGG: %d",
+  item->value.pass_through.formal_id);
+ fprintf (f, " [offset: " HOST_WIDE_INT_PRINT_DEC ", by %s],",
+ 

[gcc r15-5239] ipa-cp: Fix constant dumping

2024-11-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:da29560711b2a66b26738caf46dbf67d3f7cff85

commit r15-5239-gda29560711b2a66b26738caf46dbf67d3f7cff85
Author: Martin Jambor 
Date:   Thu Nov 14 14:42:27 2024 +0100

ipa-cp: Fix constant dumping

Commit gcc-14-5368-ge0787da2633 removed an overloaded variant of
function print_ipcp_constant_value for tree constants.  That did not
break the build because the other overloaded variant, for polymorphic
contexts, has a parameter which is constructible from a tree, but it
prints polymorphic contexts, not tree constants, so in dumps we got
things like:

  param [0]: VARIABLE
   ctxs: VARIABLE
   Bits: value = 0x0, mask = 0xfffc
   [prange] struct S * [1, +INF] MASK 0xfffc VALUE 0x0
  ref offset 0: nothing known [scc: 1, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0]
  ref offset 32: nothing known [scc: 2, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0]
  ref offset 64: nothing known [scc: 3, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0]

instead of:

  param [0]: VARIABLE
   ctxs: VARIABLE
   Bits: value = 0x0, mask = 0xfffc
   [prange] struct S * [1, +INF] MASK 0xfffc VALUE 0x0
  ref offset 0: 1 [scc: 1, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0]
  ref offset 32: 64 [scc: 2, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0]
  ref offset 64: 32 [scc: 3, from: 1(1.00)] [loc_time: 0, loc_size: 0, prop_time: 0, prop_size: 0]

This commit re-adds the needed overloaded variant though it uses the
printing function added in the aforementioned commit instead of
printing it itself.
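
The way a removed overload can go unnoticed can be reproduced in a few
lines of C++.  This is a toy sketch only: toy_tree, toy_context and
print_value are invented stand-ins, and the two namespaces merely
contrast the situation without and with the overload for constants.

```cpp
#include <cassert>
#include <string>

/* Toy stand-ins; neither is GCC's real type.  */
struct toy_tree
{
  int value;
};

struct toy_context
{
  /* Non-explicit constructor: a toy_tree * converts implicitly.  */
  toy_context (const toy_tree *) {}
};

namespace before_fix
{
  /* Only the context overload exists, so a toy_tree * argument is
     silently converted to a toy_context.  */
  std::string
  print_value (toy_context)
  {
    return "nothing known";
  }
}

namespace after_fix
{
  std::string
  print_value (toy_context)
  {
    return "nothing known";
  }

  /* The re-added overload is an exact match for a toy_tree *.  */
  std::string
  print_value (const toy_tree *t)
  {
    assert (t);
    return std::to_string (t->value);
  }
}
```

Both variants compile without any diagnostic for a call with a
toy_tree * argument; only the second one prints the constant.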

gcc/ChangeLog:

2024-11-13  Martin Jambor  

* ipa-prop.h (ipa_print_constant_value): Declare.
* ipa-prop.cc (ipa_print_constant_value): Make public.
* ipa-cp.cc (print_ipcp_constant_value): Re-add this overloaded
function for printing tree constants.

gcc/testsuite/ChangeLog:

2024-11-14  Martin Jambor  

* gcc.dg/ipa/ipcp-agg-1.c: Add a scan dump for a constant value in
the lattice dump.

Diff:
---
 gcc/ipa-cp.cc | 12 +++-
 gcc/ipa-prop.cc   |  2 +-
 gcc/ipa-prop.h|  1 +
 gcc/testsuite/gcc.dg/ipa/ipcp-agg-1.c |  1 +
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 212d9ccbbfe0..fb65ec0c6a62 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -225,7 +225,17 @@ values_equal_for_ipcp_p (tree x, tree y)
 return operand_equal_p (x, y, 0);
 }
 
-/* Print V which is extracted from a value in a lattice to F.  */
+/* Print V which is extracted from a value in a lattice to F.  This overloaded
+   function is used to print tree constants.  */
+
+static void
+print_ipcp_constant_value (FILE * f, tree v)
+{
+  ipa_print_constant_value (f, v);
+}
+
+/* Print V which is extracted from a value in a lattice to F.  This overloaded
+   function is used to print constant polymorphic call contexts.  */
 
 static void
 print_ipcp_constant_value (FILE * f, ipa_polymorphic_call_context v)
diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 599181d0a943..fd18f847e460 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -413,7 +413,7 @@ ipa_initialize_node_params (struct cgraph_node *node)
 
 /* Print VAL which is extracted from a jump function to F.  */
 
-static void
+void
 ipa_print_constant_value (FILE *f, tree val)
 {
   print_generic_expr (f, val);
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index 7a05c169c421..a9ef3fe3aa60 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -1179,6 +1179,7 @@ ipcp_get_transformation_summary (cgraph_node *node)
 
 /* Function formal parameters related computations.  */
 void ipa_initialize_node_params (struct cgraph_node *node);
+void ipa_print_constant_value (FILE *f, tree val);
 bool ipa_propagate_indirect_call_infos (struct cgraph_edge *cs,
vec *new_edges);
 
diff --git a/gcc/testsuite/gcc.dg/ipa/ipcp-agg-1.c b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-1.c
index 8cfc18799fae..15f6286e54bc 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipcp-agg-1.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-1.c
@@ -30,6 +30,7 @@ entry (void)
   foo (&s);
 }
 
+/* { dg-final { scan-ipa-dump "ref offset\[^\n\r\]*: 64\[^\n\r\]*scc:" "cp" } } */
 /* { dg-final { scan-ipa-dump "Creating a specialized node of foo.*for all known contexts" "cp" } } */
 /* { dg-final { scan-ipa-dump-times "Aggregate replacements:" 2 "cp" } } */
 /* { dg-final { scan-tree-dump-not "->c;" "optimized" } } */


[gcc r15-6599] ipa-cp: Make dumping of bit masks representing -1 nicer

2025-01-06 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:72b273152f75a8622ea13d0fe95d6d2461615ba4

commit r15-6599-g72b273152f75a8622ea13d0fe95d6d2461615ba4
Author: Martin Jambor 
Date:   Mon Jan 6 11:58:29 2025 +0100

ipa-cp: Make dumping of bit masks representing -1 nicer

Dumps of the lattices representing bit-values and of propagation
results of bit-values can print a really long hexadecimal value when
the bit-value represents -1 (all bits set).  This patch simply detects
that situation and prints the string "-1" in that case, making the
dumps somewhat nicer.
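
The check itself is small.  A hedged sketch with a 64-bit integer
standing in for widest_int, and with the invented name toy_format_bits,
could look like this:

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

/* Return "-1" if every bit of VALUE is set, otherwise VALUE as a
   hexadecimal string.  A uint64_t stands in for widest_int here.  */
std::string
toy_format_bits (std::uint64_t value)
{
  if (~value == 0)
    return "-1";

  char buf[2 + 16 + 1];
  int written = std::snprintf (buf, sizeof buf, "0x%" PRIx64, value);
  assert (written > 0 && written < (int) sizeof buf);
  return buf;
}
```

Testing the complement against zero mirrors the
wi::eq_p (wi::bit_not (value), 0) test in the patch.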

gcc/ChangeLog:

2025-01-03  Martin Jambor  

* ipa-cp.cc (ipcp_print_widest_int): New function.
(ipcp_store_vr_results): Use it.
(ipcp_bits_lattice::print): Likewise.  Fix formatting.

Diff:
---
 gcc/ipa-cp.cc | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 7423731d7250..294389fba4c7 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -307,6 +307,18 @@ ipcp_lattice::print (FILE * f, bool dump_sources, bool dump_benefits)
 fprintf (f, "\n");
 }
 
+/* If VALUE has all bits set to one, print "-1" to F, otherwise simply print it
+   hexadecimally to F. */
+
+static void
+ipcp_print_widest_int (FILE *f, const widest_int &value)
+{
+  if (wi::eq_p (wi::bit_not (value), 0))
+fprintf (f, "-1");
+  else
+print_hex (value, f);
+}
+
 void
 ipcp_bits_lattice::print (FILE *f)
 {
@@ -316,8 +328,10 @@ ipcp_bits_lattice::print (FILE *f)
 fprintf (f, " Bits unusable (BOTTOM)\n");
   else
 {
-  fprintf (f, " Bits: value = "); print_hex (get_value (), f);
-  fprintf (f, ", mask = "); print_hex (get_mask (), f);
+  fprintf (f, " Bits: value = ");
+  ipcp_print_widest_int (f, get_value ());
+  fprintf (f, ", mask = ");
+  print_hex (get_mask (), f);
   fprintf (f, "\n");
 }
 }
@@ -6375,7 +6389,7 @@ ipcp_store_vr_results (void)
  dumped_sth = true;
}
  fprintf (dump_file, " param %i: value = ", i);
- print_hex (bits->get_value (), dump_file);
+ ipcp_print_widest_int (dump_file, bits->get_value ());
  fprintf (dump_file, ", mask = ");
  print_hex (bits->get_mask (), dump_file);
  fprintf (dump_file, "\n");


[gcc r15-7456] ipa-cp: Perform operations in the appropriate types (PR 118097)

2025-02-10 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:6d07e3de7e8d39ac144ba1d83bba08d48bacae13

commit r15-7456-g6d07e3de7e8d39ac144ba1d83bba08d48bacae13
Author: Martin Jambor 
Date:   Mon Feb 10 16:49:59 2025 +0100

ipa-cp: Perform operations in the appropriate types (PR 118097)

One of the testcases from PR 118097 and the one from PR 118535 show
that the fix to PR 118138 was incomplete.  We must not only make sure
that (intermediate) results of operations performed by IPA-CP are
fold_converted to the type of the destination formal parameter but we
also must decouple these types from the ones in which operations
are performed.

This patch does that, even though we do not store or stream the
operation types, instead we simply limit ourselves to tcc_comparisons
and operations for which the first operand and the result are of the
same type as determined by expr_type_first_operand_type_p.  If we
wanted to go beyond these, we would indeed need to store/stream the
respective operation type.
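
Why the type of the operation matters can be seen with plain integers.
The sketch below is not taken from the PRs; it only assumes a signed
32-bit argument divided by a constant and passed to an unsigned 32-bit
parameter, and the toy_* names are invented.

```cpp
#include <cassert>
#include <cstdint>

/* Divide in the type of the first operand, then convert the result to
   the (unsigned) parameter type.  */
std::uint32_t
toy_divide_then_convert (std::int32_t input, std::int32_t operand)
{
  assert (operand > 0);
  std::int32_t res = input / operand;
  return static_cast<std::uint32_t> (res);
}

/* Convert both operands to the parameter type first and divide there.  */
std::uint32_t
toy_convert_then_divide (std::int32_t input, std::int32_t operand)
{
  assert (operand > 0);
  return static_cast<std::uint32_t> (input)
         / static_cast<std::uint32_t> (operand);
}
```

For an input of -1 and an operand of 2 the first function yields 0 and
the second 2147483647, so performing the operation in the destination
type changes the propagated constant.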

ipa_value_from_jfunc needs an additional check that res_type is not
NULL because it is not called just from within IPA-CP (where we know
we have a destination lattice slot belonging to a defined parameter)
but also from inlining, ipa-fnsummary and ipa-modref where it is used
to examine a call to a function with variadic arguments and we do not
have types for the unknown parameters.  But we cannot really work with
those or estimate any benefits when it comes to them, so ignoring them
should be OK.

Even after this patch, ipa_get_jf_arith_result has a parameter called
res_type in which it performs operations for aggregate jump functions,
where we do not allow type conversions when constructing the jump
functions and the type is the type of the stored data.  In GCC 16, we
could relax this and allow conversions like for scalars.

gcc/ChangeLog:

2025-01-20  Martin Jambor  

PR ipa/118097
* ipa-cp.cc (ipa_get_jf_arith_result): Adjust comment.
(ipa_get_jf_pass_through_result): Removed.
(ipa_value_from_jfunc): Use directly ipa_get_jf_arith_result, do
not specify operation type but make sure we check and possibly
convert the result.
(get_val_across_arith_op): Remove the last parameter, always pass
NULL_TREE to ipa_get_jf_arith_result in its last argument.
(propagate_vals_across_arith_jfunc): Do not pass res_type to
get_val_across_arith_op.
(propagate_vals_across_pass_through): Add checking assert that
parm_type is not NULL.

gcc/testsuite/ChangeLog:

2025-01-24  Martin Jambor  

PR ipa/118097
* gcc.dg/ipa/pr118097.c: New test.
* gcc.dg/ipa/pr118535.c: Likewise.
* gcc.dg/ipa/ipa-notypes-1.c: Likewise.

Diff:
---
 gcc/ipa-cp.cc| 46 +---
 gcc/testsuite/gcc.dg/ipa/ipa-notypes-1.c | 17 
 gcc/testsuite/gcc.dg/ipa/pr118097.c  | 23 
 gcc/testsuite/gcc.dg/ipa/pr118535.c  | 17 
 4 files changed, 75 insertions(+), 28 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index d89324a00775..68959f2677ba 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1467,11 +1467,10 @@ ipacp_value_safe_for_type (tree param_type, tree value)
 return NULL_TREE;
 }
 
-/* Return the result of a (possibly arithmetic) operation on the constant
-   value INPUT.  OPERAND is 2nd operand for binary operation.  RES_TYPE is
-   the type of the parameter to which the result is passed.  Return
-   NULL_TREE if that cannot be determined or be considered an
-   interprocedural invariant.  */
+/* Return the result of a (possibly arithmetic) operation on the constant value
+   INPUT.  OPERAND is 2nd operand for binary operation.  RES_TYPE is the type
+   in which any operation is to be performed.  Return NULL_TREE if that cannot
+   be determined or be considered an interprocedural invariant.  */
 
 static tree
 ipa_get_jf_arith_result (enum tree_code opcode, tree input, tree operand,
@@ -1513,21 +1512,6 @@ ipa_get_jf_arith_result (enum tree_code opcode, tree input, tree operand,
   return res;
 }
 
-/* Return the result of a (possibly arithmetic) pass through jump function
-   JFUNC on the constant value INPUT.  RES_TYPE is the type of the parameter
-   to which the result is passed.  Return NULL_TREE if that cannot be
-   determined or be considered an interprocedural invariant.  */
-
-static tree
-ipa_get_jf_pass_through_result (struct ipa_jump_func *jfunc, tree input,
-   tree res_type)
-{
-  return ipa_get_jf_arith_result (ipa_get_jf_pass_through_operation (jfunc),
- input,
- ipa_get_jf_pass_through_operand (jfunc),
- re

[gcc r15-7476] lto: Add an entry for cold attribute to lto_gnu_attributes

2025-02-11 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:4abac2ffdb071ca9337e4f31fa79cd38df1ac7c3

commit r15-7476-g4abac2ffdb071ca9337e4f31fa79cd38df1ac7c3
Author: Martin Jambor 
Date:   Tue Feb 11 16:39:56 2025 +0100

lto: Add an entry for cold attribute to lto_gnu_attributes

PR 118125 is a performance regression stemming from the fact that we
lose the cold attribute of our __builtin_unreachable.  The attribute
is simply and silently dropped on the floor by decl_attributes (in
attribs.cc) in the process of building decls for builtins because it
cannot look it up in the gnu attribute name space by
lookup_scoped_attribute_spec.  For that not to happen it must be in
lto_gnu_attributes and this patch adds it there.

In comment 13 of the bug, Andrew identified other attributes which are
in builtin-attrs.def but missing from lto_gnu_attributes.  Apart from
cold, however, they seem to be either unused in builtins.def or used
only in DEF_LIB_BUILTIN, which I guess might be less critical.
Eventually I decided to go for the simplest of patches and only add
things if they are requested.  For the same reason I also did not add
any checking to the attribute "handle" callback or any exclusion check.
To me they seem to be mostly relevant before the LTO FE kicks in, but
again, I'm happy to add any if they seem to be useful.

Since Ian fixed PR 118746, the same issue has also been fixed in the
Go front-end and so I have added a simple checking assert to the
redirect_to_unreachable function to make sure it has the intended
effect.
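
The failure mode described above, an attribute vanishing because the
front end's table has no entry for it, can be modelled in a few lines.
This is a toy model with invented names and attribute lists; it does not
reproduce decl_attributes or lookup_scoped_attribute_spec.

```cpp
#include <cassert>
#include <string>
#include <vector>

/* Toy model: keep only the REQUESTED attribute names that have an entry
   in KNOWN; anything else is dropped without a diagnostic.  */
std::vector<std::string>
toy_apply_attributes (const std::vector<std::string> &known,
                      const std::vector<std::string> &requested)
{
  std::vector<std::string> applied;
  for (const std::string &name : requested)
    for (const std::string &spec : known)
      if (spec == name)
        {
          applied.push_back (name);
          break;
        }
  return applied;
}

/* Return true if NAME is among ATTRS.  */
bool
toy_has_attribute (const std::vector<std::string> &attrs,
                   const std::string &name)
{
  assert (!name.empty ());
  for (const std::string &attr : attrs)
    if (attr == name)
      return true;
  return false;
}
```

With a table lacking "cold" the attribute is lost on the way to the
decl; once the entry is present it survives, which is the condition the
new checking assert tests.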

gcc/ChangeLog:

2025-02-03  Martin Jambor  

PR lto/118125
* ipa-fnsummary.cc (redirect_to_unreachable): Add checking assert
that the builtin_unreachable decl has attribute cold.

gcc/lto/ChangeLog:

2025-02-03  Martin Jambor  

PR lto/118125
* lto-lang.cc (lto_gnu_attributes): Add an entry for cold attribute.
(handle_cold_attribute): New function.

Diff:
---
 gcc/ipa-fnsummary.cc |  3 +++
 gcc/lto/lto-lang.cc  | 13 +
 2 files changed, 16 insertions(+)

diff --git a/gcc/ipa-fnsummary.cc b/gcc/ipa-fnsummary.cc
index 33f19365ec36..4c062fe8a0e2 100644
--- a/gcc/ipa-fnsummary.cc
+++ b/gcc/ipa-fnsummary.cc
@@ -255,6 +255,9 @@ redirect_to_unreachable (struct cgraph_edge *e)
   struct cgraph_node *target
 = cgraph_node::get_create (builtin_decl_unreachable ());
 
+  gcc_checking_assert (lookup_attribute ("cold",
+DECL_ATTRIBUTES (target->decl)));
+
   if (e->speculative)
 e = cgraph_edge::resolve_speculation (e, target->decl);
   else if (!e->callee)
diff --git a/gcc/lto/lto-lang.cc b/gcc/lto/lto-lang.cc
index 652d7fc5e30d..e41b548b3983 100644
--- a/gcc/lto/lto-lang.cc
+++ b/gcc/lto/lto-lang.cc
@@ -60,6 +60,7 @@ static tree ignore_attribute (tree *, tree, tree, int, bool *);
 static tree handle_format_attribute (tree *, tree, tree, int, bool *);
 static tree handle_fnspec_attribute (tree *, tree, tree, int, bool *);
 static tree handle_format_arg_attribute (tree *, tree, tree, int, bool *);
+static tree handle_cold_attribute (tree *, tree, tree, int, bool *);
 
 /* Helper to define attribute exclusions.  */
 #define ATTR_EXCL(name, function, type, variable)  \
@@ -128,6 +129,8 @@ static const attribute_spec lto_gnu_attributes[] =
  handle_sentinel_attribute, NULL },
   { "type generic",   0, 0, false, true, true, false,
  handle_type_generic_attribute, NULL },
+  { "cold",  0, 0, false,  false, false, false,
+ handle_cold_attribute, NULL },
   { "fn spec",   1, 1, false, true, true, false,
  handle_fnspec_attribute, NULL },
   { "transaction_pure",  0, 0, false, true, true, false,
@@ -598,6 +601,16 @@ handle_fnspec_attribute (tree *node ATTRIBUTE_UNUSED, tree ARG_UNUSED (name),
   return NULL_TREE;
 }
 
+/* Handle a "cold" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_cold_attribute (tree *, tree, tree, int, bool *)
+{
+  /* Nothing to be done here.  */
+  return NULL_TREE;
+}
+
 /* Cribbed from c-common.cc.  */
 
 static void


[gcc r15-7269] tree-ssa-dce: Avoid creating invalid BBs with no outgoing edge (PR117892)

2025-01-29 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:3d07e7bf13d4aec794dd25b5090c139b4d78283d

commit r15-7269-g3d07e7bf13d4aec794dd25b5090c139b4d78283d
Author: Martin Jambor 
Date:   Wed Jan 29 10:51:08 2025 +0100

tree-ssa-dce: Avoid creating invalid BBs with no outgoing edge (PR117892)

Zhendong Su and Michal Jireš found out that our gimple DSE pass can,
under fairly specific conditions, remove a noreturn call which then
leaves behind a "normal" BB with no successor edges which following
passes do not expect.  This patch simply tells the pass to leave such
calls alone even when they otherwise appear to be dead.
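
The condition the patch adds can be modelled in isolation.  The
following is a minimal sketch and not GCC code: the call_info struct
and dse_may_remove_call are hypothetical stand-ins for the real gimple
statement and for the early-exit test in dse_optimize_call.

```cpp
#include <cassert>

// Hypothetical stand-in for the properties of a gimple call that the
// early-exit test in dse_optimize_call inspects.
struct call_info
{
  bool could_throw;    // models stmt_could_throw_p (cfun, stmt)
  bool noreturn;       // models gimple_call_flags (stmt) & ECF_NORETURN
  bool ctrl_altering;  // models gimple_call_ctrl_altering_p (stmt)
};

// Return true if this sketch would even consider removing the call.
// CAN_DELETE_DEAD_EXCEPTIONS models cfun->can_delete_dead_exceptions.
bool
dse_may_remove_call (const call_info &c, bool can_delete_dead_exceptions)
{
  if ((c.could_throw && !can_delete_dead_exceptions)
      || (c.noreturn && c.ctrl_altering))
    return false;
  return true;
}
```

In this model a noreturn call that ends its basic block is never a
removal candidate, so the block cannot be left without a successor
edge.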

Interestingly, our CFG verifier does not report this.  I'll put adding
a test for it on my todo list for the next stage 1.

gcc/ChangeLog:

2025-01-28  Martin Jambor  

PR tree-optimization/117892
* tree-ssa-dse.cc (dse_optimize_call): Leave control-altering
noreturn calls alone.

gcc/testsuite/ChangeLog:

2025-01-27  Martin Jambor  

PR tree-optimization/117892
* gcc.dg/tree-ssa/pr117892.c: New test.
* gcc.dg/tree-ssa/pr118517.c: Likewise.

co-authored-by: Michal Jireš 

Diff:
---
 gcc/testsuite/gcc.dg/tree-ssa/pr117892.c | 17 +
 gcc/testsuite/gcc.dg/tree-ssa/pr118517.c | 11 +++
 gcc/tree-ssa-dse.cc  |  6 --
 3 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117892.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117892.c
new file mode 100644
index ..d9b9c15095fc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117892.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+
+volatile int a;
+void b(int *c) {
+  int *d = 0;
+  *c = 0;
+  *d = 0;
+  __builtin_abort();
+}
+int main() {
+  int f;
+  if (a)
+b(&f);
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr118517.c b/gcc/testsuite/gcc.dg/tree-ssa/pr118517.c
new file mode 100644
index ..3a34f6788a9c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr118517.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fno-ipa-pure-const" } */
+
+void __attribute__((noreturn)) bar(void) {
+  __builtin_unreachable ();
+}
+
+int p;
+void foo() {
+  if (p) bar();
+}
diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc
index 753d7ef148ba..bc632e384841 100644
--- a/gcc/tree-ssa-dse.cc
+++ b/gcc/tree-ssa-dse.cc
@@ -1396,8 +1396,10 @@ dse_optimize_call (gimple_stmt_iterator *gsi, sbitmap live_bytes)
   if (!node)
 return false;
 
-  if (stmt_could_throw_p (cfun, stmt)
-  && !cfun->can_delete_dead_exceptions)
+  if ((stmt_could_throw_p (cfun, stmt)
+   && !cfun->can_delete_dead_exceptions)
+  || ((gimple_call_flags (stmt) & ECF_NORETURN)
+ && gimple_call_ctrl_altering_p (stmt)))
 return false;
 
   /* If return value is used the call is not dead.  */


[gcc r15-6110] ipa: Update value range jump functions during inlining

2024-12-11 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:92e0e0f8177530b8c6fcafe1d61ba03b00dff6a6

commit r15-6110-g92e0e0f8177530b8c6fcafe1d61ba03b00dff6a6
Author: Martin Jambor 
Date:   Wed Dec 11 14:55:27 2024 +0100

ipa: Update value range jump functions during inlining

When inlining (during the analysis phase) a call graph edge, we update
all pass-through jump functions corresponding to edges going out of
the newly inlined function to be relative to the function into which
we are inlining or to expose the information originally captured for
the edge that is being inlined.

Similarly, we can combine the value range information in pass-through
jump functions corresponding to both edges, which is what this patch
adds - at least for the case when the inlined pass-through is a
simple, non-arithmetic one, which is the case that we also handle for
constant and aggregate jump function parts.
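
The combination step can be sketched with plain closed intervals.
This is only an illustration under simplifying assumptions: interval,
intersect and combine_after_inlining are hypothetical names, the type
conversion performed by ipa_vr_operation_and_type_effects is left out,
and an unknown range is modelled as an empty std::optional.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>

struct interval { long lo; long hi; };  // closed interval [lo, hi]

// Intersect A and B; an empty result is reported as std::nullopt.
std::optional<interval>
intersect (const interval &a, const interval &b)
{
  interval r { std::max (a.lo, b.lo), std::min (a.hi, b.hi) };
  if (r.lo > r.hi)
    return std::nullopt;
  return r;
}

// DST models the range stored in the jump function of an edge going
// out of the inlined function, SRC the range of the edge being inlined.
void
combine_after_inlining (std::optional<interval> &dst,
                        const std::optional<interval> &src)
{
  if (!src)
    return;                 // nothing known on the inlined edge
  if (!dst)
    {
      dst = src;            // only the inlined edge knows something
      return;
    }
  // Both are known: keep the intersection unless it is empty.
  if (auto both = intersect (*dst, *src))
    dst = both;
}
```

As in the patch, an empty intersection leaves the destination range
untouched rather than recording an undefined range.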

gcc/ChangeLog:

2024-11-01  Martin Jambor  

* ipa-cp.h: Forward declare class ipa_vr.
(ipa_vr_operation_and_type_effects): Declare.
* ipa-cp.cc (ipa_vr_operation_and_type_effects): Make public.
* ipa-prop.cc (update_jump_functions_after_inlining): Also update
value range jump functions.

Diff:
---
 gcc/ipa-cp.cc   |  4 ++--
 gcc/ipa-cp.h| 13 +
 gcc/ipa-prop.cc | 18 ++
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index e6d707c286db..a664bc03f62a 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1653,7 +1653,7 @@ ipa_context_from_jfunc (ipa_node_params *info, cgraph_edge *cs, int csidx,
DST_TYPE on value range in SRC_VR and store it to DST_VR.  Return true if
the result is a range that is not VARYING nor UNDEFINED.  */
 
-static bool
+bool
 ipa_vr_operation_and_type_effects (vrange &dst_vr,
   const vrange &src_vr,
   enum tree_code operation,
@@ -1679,7 +1679,7 @@ ipa_vr_operation_and_type_effects (vrange &dst_vr,
 /* Same as above, but the SRC_VR argument is an IPA_VR which must
first be extracted onto a vrange.  */
 
-static bool
+bool
 ipa_vr_operation_and_type_effects (vrange &dst_vr,
   const ipa_vr &src_vr,
   enum tree_code operation,
diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h
index ba2ebfede63f..4f569c1ee838 100644
--- a/gcc/ipa-cp.h
+++ b/gcc/ipa-cp.h
@@ -299,4 +299,17 @@ ipa_vr_supported_type_p (tree type)
   return irange::supports_p (type) || prange::supports_p (type);
 }
 
+class ipa_vr;
+
+bool ipa_vr_operation_and_type_effects (vrange &dst_vr,
+   const vrange &src_vr,
+   enum tree_code operation,
+   tree dst_type, tree src_type);
+bool ipa_vr_operation_and_type_effects (vrange &dst_vr,
+   const ipa_vr &src_vr,
+   enum tree_code operation,
+   tree dst_type, tree src_type);
+
+
+
 #endif /* IPA_CP_H */
diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 9070a45f6835..3d72794e37c4 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -3471,6 +3471,24 @@ update_jump_functions_after_inlining (struct cgraph_edge *cs,
  gcc_unreachable ();
}
 
+ if (src->m_vr && src->m_vr->known_p ())
+   {
+ value_range svr (src->m_vr->type ());
+ if (!dst->m_vr || !dst->m_vr->known_p ())
+   ipa_set_jfunc_vr (dst, *src->m_vr);
+ else if (ipa_vr_operation_and_type_effects (svr, *src->m_vr,
+  NOP_EXPR,
+  dst->m_vr->type (),
+  src->m_vr->type ()))
+   {
+ value_range dvr;
+ dst->m_vr->get_vrange (dvr);
+ dvr.intersect (svr);
+ if (!dvr.undefined_p ())
+   ipa_set_jfunc_vr (dst, dvr);
+   }
+   }
+
  if (src->agg.items
  && (dst_agg_p || !src->agg.by_ref))
{


[gcc r15-6295] ipa: Better value ranges for pointer integer constants

2024-12-17 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:1eb41aeb49a491f5b18d160074e651a76afc655a

commit r15-6295-g1eb41aeb49a491f5b18d160074e651a76afc655a
Author: Martin Jambor 
Date:   Tue Dec 17 11:17:14 2024 +0100

ipa: Better value ranges for pointer integer constants

When looking into cases where we know an actual argument of a call is
a constant but we don't generate a singleton value-range for the jump
function, I found out that the special handling of pointer constants
does not work well for constant zero pointer values.  In fact the code
only attempts to see if it can figure out that an argument is not zero
and if it can figure out any alignment information.

With this patch, we try to use the value_range that ranger can give us
in the jump function if we can and we query ranger for all kinds of
arguments, not just SSA_NAMES (and so also pointer integer constants).
If we cannot figure out a useful range we fall back again on figuring
out non-NULLness with tree_single_nonzero_warnv_p.

With this patch, we generate

  [prange] struct S * [0, 0] MASK 0x0 VALUE 0x0

instead of for example:

  [prange] struct S * [0, +INF] MASK 0xfff0 VALUE 0x0

for a zero constant passed in a call.

If you are wondering why we check whether the value range obtained
from range_of_expr can be undefined, even when the function returns
true, that is because that can apparently happen for default-definition
SSA_NAMEs.

gcc/ChangeLog:

2024-11-15  Martin Jambor  

* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Try harder to
use the value range obtained from ranger for pointer values.

Diff:
---
 gcc/ipa-prop.cc | 35 ---
 1 file changed, 16 insertions(+), 19 deletions(-)

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index ae309ec78a2d..f0b915ba2be1 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -2396,28 +2396,27 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi,
   value_range vr (TREE_TYPE (arg));
   if (POINTER_TYPE_P (TREE_TYPE (arg)))
{
- bool addr_nonzero = false;
- bool strict_overflow = false;
-
- if (TREE_CODE (arg) == SSA_NAME
- && param_type
- && get_range_query (cfun)->range_of_expr (vr, arg, cs->call_stmt)
- && vr.nonzero_p ())
-   addr_nonzero = true;
- else if (tree_single_nonzero_warnv_p (arg, &strict_overflow))
-   addr_nonzero = true;
-
- if (addr_nonzero)
-   vr.set_nonzero (TREE_TYPE (arg));
-
+ if (!get_range_query (cfun)->range_of_expr (vr, arg, cs->call_stmt)
+ || vr.varying_p ()
+ || vr.undefined_p ())
+   {
+ bool strict_overflow = false;
+ if (tree_single_nonzero_warnv_p (arg, &strict_overflow))
+   vr.set_nonzero (TREE_TYPE (arg));
+ else
+   vr.set_varying (TREE_TYPE (arg));
+   }
+ gcc_assert (!vr.undefined_p ());
  unsigned HOST_WIDE_INT bitpos;
- unsigned align, prec = TYPE_PRECISION (TREE_TYPE (arg));
+ unsigned align = BITS_PER_UNIT;
 
- get_pointer_alignment_1 (arg, &align, &bitpos);
+ if (!vr.singleton_p ())
+   get_pointer_alignment_1 (arg, &align, &bitpos);
 
  if (align > BITS_PER_UNIT
  && opt_for_fn (cs->caller->decl, flag_ipa_bit_cp))
{
+ unsigned prec = TYPE_PRECISION (TREE_TYPE (arg));
  wide_int mask
= wi::bit_and_not (wi::mask (prec, false, prec),
   wide_int::from (align / BITS_PER_UNIT - 1,
@@ -2425,12 +2424,10 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi,
  wide_int value = wide_int::from (bitpos / BITS_PER_UNIT, prec,
   UNSIGNED);
  irange_bitmask bm (value, mask);
- if (!addr_nonzero)
-   vr.set_varying (TREE_TYPE (arg));
  vr.update_bitmask (bm);
  ipa_set_jfunc_vr (jfunc, vr);
}
- else if (addr_nonzero)
+ else if (!vr.varying_p ())
ipa_set_jfunc_vr (jfunc, vr);
  else
gcc_assert (!jfunc->m_vr);


[gcc r15-6296] ipa: Improve how we derive value ranges from IPA invariants

2024-12-17 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:5d740f56a162702a33379789a4d6134d9733aa71

commit r15-6296-g5d740f56a162702a33379789a4d6134d9733aa71
Author: Martin Jambor 
Date:   Tue Dec 17 11:17:14 2024 +0100

ipa: Improve how we derive value ranges from IPA invariants

I believe that the current function ipa_range_set_and_normalize lacks
a check whether the base of an ADDR_EXPR really cannot be NULL, so this
patch adds it.  Moreover, I never liked the name as I do not think it
makes the value of ranges any more normal but rather just special-cases
non-zero ip_invariant pointers.  Therefore, I have given it a different
name and moved it to a .cc file; our LTO bootstrap should inline
(and/or split) it if necessary anyway.

Because, as Honza correctly pointed out, deriving non-NULLness from a
pointer depends on flag_delete_null_pointer_checks which is an
optimization flag and thus depends on a given function, in this
version of the patch ipa_get_range_from_ip_invariant gets a
context_node parameter for that purpose.  This then needs to be used
within symtab_node::nonzero_address which gets a special overload in
which the value of the flag can be provided as a parameter.

gcc/ChangeLog:

2024-12-11  Martin Jambor  

* cgraph.h (symtab_node): Add a new overload of nonzero_address.
* symtab.cc (symtab_node::nonzero_address): Add a new overload with a
parameter for delete_null_pointer_checks.  Make the original overload
call the new one, which retains the actual implementation.
* ipa-prop.h (ipa_get_range_from_ip_invariant): Declare.
(ipa_range_set_and_normalize): Remove.
* ipa-prop.cc (ipa_get_range_from_ip_invariant): New function.
(ipa_range_set_and_normalize): Remove.
* ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Add a new parameter
context_node. Use ipa_get_range_from_ip_invariant instead of
ipa_range_set_and_normalize and pass to it the new parameter.
(ipa_value_range_from_jfunc): Pass cs->caller as the context_node to
ipa_vr_intersect_with_arith_jfunc.
(propagate_vr_across_jump_function): Likewise.
(ipa_get_range_from_ip_invariant): New function.
* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Use
ipa_get_range_from_ip_invariant instead of ipa_range_set_and_normalize.

Diff:
---
 gcc/cgraph.h |  4 
 gcc/ipa-cp.cc| 12 
 gcc/ipa-fnsummary.cc |  4 ++--
 gcc/ipa-prop.cc  | 37 +
 gcc/ipa-prop.h   | 15 +--
 gcc/symtab.cc| 21 +++--
 6 files changed, 67 insertions(+), 26 deletions(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 50bae96de4cf..9b4cb6383afc 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -431,6 +431,10 @@ public:
   /* Return true if ONE and TWO are part of the same COMDAT group.  */
   inline bool in_same_comdat_group_p (symtab_node *target);
 
+  /* Return true if symbol is known to be nonzero, assume that
+ flag_delete_null_pointer_checks is equal to delete_null_pointer_checks.  */
+  bool nonzero_address (bool delete_null_pointer_checks);
+
   /* Return true if symbol is known to be nonzero.  */
   bool nonzero_address ();
 
diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index a664bc03f62a..5d7b3d25df5d 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1693,11 +1693,14 @@ ipa_vr_operation_and_type_effects (vrange &dst_vr,
 
 /* Given a PASS_THROUGH jump function JFUNC that takes as its source SRC_VR of
SRC_TYPE and the result needs to be DST_TYPE, if any value range information
-   can be deduced at all, intersect VR with it.  */
+   can be deduced at all, intersect VR with it.  CONTEXT_NODE is the call graph
+   node representing the function for which optimization flags should be
+   evaluated.  */
 
 static void
 ipa_vr_intersect_with_arith_jfunc (vrange &vr,
   ipa_jump_func *jfunc,
+  cgraph_node *context_node,
   const value_range &src_vr,
   tree src_type,
   tree dst_type)
@@ -1720,7 +1723,7 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr,
   if (!handler)
 return;
   value_range op_vr (TREE_TYPE (operand));
-  ipa_range_set_and_normalize (op_vr, operand);
+  ipa_get_range_from_ip_invariant (op_vr, operand, context_node);
 
   tree operation_type;
   if (TREE_CODE_CLASS (operation) == tcc_comparison)
@@ -1776,7 +1779,8 @@ ipa_value_range_from_jfunc (vrange &vr,
   value_range srcvr;
   (*sum->m_vr)[idx].get_vrange (srcvr);
 
-  ipa_vr_intersect_with_arith_jfunc (vr, jfunc, srcvr, src_type, 
parm_type);
+  ipa_vr_intersect_with_arith_jfunc (vr, jfunc, cs->caller, srcvr, 
src_type,
+   

[gcc r15-6294] ipa: Skip widening type conversions in jump function constructions

2024-12-17 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:96fb71883d438bdb241fdf9c7d12f945c5ba0c7f

commit r15-6294-g96fb71883d438bdb241fdf9c7d12f945c5ba0c7f
Author: Martin Jambor 
Date:   Tue Dec 17 11:17:14 2024 +0100

ipa: Skip widening type conversions in jump function constructions

Originally, we did not stream any formal parameter types into WPA and
were generally very conservative when it came to type mismatches in
IPA-CP.  Over time, mismatches that occur in real code and blew up in
WPA made us much more resilient and also made us stream the types of
the parameters, which we now commonly use.

With that information, we can safely skip conversions when looking at
the IL from which we build jump functions and then simply fold convert
the constants and ranges to the resulting type, as long as we are
careful that performing the corresponding folding of constants gives
the corresponding results.  In order to do that, we must ensure that
the old value can be represented in the new one without any loss.
With this change, we can nicely propagate non-NULLness in IPA-VR as
demonstrated with the new test case.
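
The "can be represented without any loss" requirement boils down to a
range check against the limits of the target type.  The following is
a rough sketch of such a check on plain 64-bit integers; range_fits is
a hypothetical helper and not the range_fits_type_p function that the
patch calls, and it assumes precisions of at most 62 bits so that all
bounds fit in int64_t.

```cpp
#include <cassert>
#include <cstdint>

// Return true if every value in [lo, hi] is representable in an
// integer type with PRECISION bits and the given signedness.
bool
range_fits (int64_t lo, int64_t hi, unsigned precision, bool is_signed)
{
  assert (precision >= 1 && precision <= 62 && lo <= hi);
  int64_t type_min = is_signed ? -(int64_t (1) << (precision - 1)) : 0;
  int64_t type_max = is_signed ? (int64_t (1) << (precision - 1)) - 1
                               : (int64_t (1) << precision) - 1;
  return lo >= type_min && hi <= type_max;
}
```

For instance, a value known to lie in [0, 255] fits an unsigned 8-bit
type, so a conversion to that type could be looked through, whereas
[-1, 255] does not fit and the conversion would have to stay.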

I have gone through all other uses of (all components of) jump
functions which could be affected by this and verified they do indeed
check types and can handle mismatches.

gcc/ChangeLog:

2024-12-11  Martin Jambor  

* ipa-prop.cc: Include vr-values.h.
(skip_a_safe_conversion_op): New function.
(ipa_compute_jump_functions_for_edge): Use it.

gcc/testsuite/ChangeLog:

2024-11-01  Martin Jambor  

* gcc.dg/ipa/vrp9.c: New test.

Diff:
---
 gcc/ipa-prop.cc | 40 ++
 gcc/testsuite/gcc.dg/ipa/vrp9.c | 48 +
 2 files changed, 88 insertions(+)

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 3d72794e37c4..ae309ec78a2d 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -59,6 +59,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attr-fnspec.h"
 #include "gimple-range.h"
 #include "value-range-storage.h"
+#include "vr-values.h"
 
 /* Function summary where the parameter infos are actually stored. */
 ipa_node_params_t *ipa_node_params_sum = NULL;
@@ -2311,6 +2312,44 @@ ipa_set_jfunc_vr (ipa_jump_func *jf, const ipa_vr &vr)
   ipa_set_jfunc_vr (jf, tmp);
 }
 
+
+/* If T is an SSA_NAME that is the result of a simple type conversion statement
+   from an integer type to another integer type which is known to be able to
+   represent the values the operand of the conversion can hold, return the
+   operand of that conversion, otherwise return T.  */
+
+static tree
+skip_a_safe_conversion_op (tree t)
+{
+  if (TREE_CODE (t) != SSA_NAME
+  || SSA_NAME_IS_DEFAULT_DEF (t))
+return t;
+
+  gimple *def = SSA_NAME_DEF_STMT (t);
+  if (!is_gimple_assign (def)
+  || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def))
+  || !INTEGRAL_TYPE_P (TREE_TYPE (t))
+  || !INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (def))))
+return t;
+
+  tree rhs1 = gimple_assign_rhs1 (def);
+  if (TYPE_PRECISION (TREE_TYPE (t))
+  >= TYPE_PRECISION (TREE_TYPE (rhs1)))
+return gimple_assign_rhs1 (def);
+
+  value_range vr (TREE_TYPE (rhs1));
+  if (!get_range_query (cfun)->range_of_expr (vr, rhs1, def)
+  || vr.undefined_p ())
+return t;
+
+  irange &ir = as_a <irange> (vr);
+  if (range_fits_type_p (&ir, TYPE_PRECISION (TREE_TYPE (t)),
+TYPE_SIGN (TREE_TYPE (t))))
+  return gimple_assign_rhs1 (def);
+
+  return t;
+}
+
 /* Compute jump function for all arguments of callsite CS and insert the
information in the jump_functions array in the ipa_edge_args corresponding
to this callsite.  */
@@ -2415,6 +2454,7 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi,
gcc_assert (!jfunc->m_vr);
}
 
+  arg = skip_a_safe_conversion_op (arg);
   if (is_gimple_ip_invariant (arg)
  || (VAR_P (arg)
  && is_global_var (arg)
diff --git a/gcc/testsuite/gcc.dg/ipa/vrp9.c b/gcc/testsuite/gcc.dg/ipa/vrp9.c
new file mode 100644
index ..461a2e757d2c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/vrp9.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized"  }  */
+
+int some_f1 (int);
+int some_f2 (int);
+int some_f3 (int);
+
+void remove_this_call ();
+
+int g;
+
+static int __attribute__((noinline))
+bar (int p)
+{
+  if (p)
+remove_this_call ();
+  return g++;
+}
+
+static int __attribute__((noinline))
+foo (int (*f)(int))
+{
+  return bar (f == (void *)0);
+}
+
+int
+baz1 (void)
+{
+  int (*f)(int);
+  if (g)
+f = some_f1;
+  else
+f = some_f2;
+  return foo (f);
+}
+
+int
+baz2 (void)
+{
+  int (*f)(int);
+  if (g)
+f = some_f2;
+  else
+f = some_f3;
+  return foo (f);
+}
+
+/* { dg-final { scan-tree-dump-not "remove_this_call"  "

[gcc r15-6769] ipa-cp: Fold-convert values when necessary (PR 118138)

2025-01-10 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:d019ab4f115caab48316c185c007765719e93052

commit r15-6769-gd019ab4f115caab48316c185c007765719e93052
Author: Martin Jambor 
Date:   Sat Jan 4 20:40:07 2025 +0100

ipa-cp: Fold-convert values when necessary (PR 118138)

PR 118138 and quite a few duplicates that it has acquired in a short
time show that even though we are careful to make sure we do not lose
any bits when newly allowing type conversions in jump-functions, we
still need to perform the fold conversions during IPA constant
propagation and not just at the end in order to properly perform
sign-extensions or zero-extensions as appropriate.

This patch does just that, changing a safety predicate we already use
at the appropriate places to return the necessary type.
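
A small illustration of why the conversion matters: the same 8-bit
pattern widens to different 32-bit values depending on the signedness
of the type it comes from, so a constant cannot simply be reused in a
wider type.  The two helper names below are made up for this example
and have no counterpart in GCC.

```cpp
#include <cassert>
#include <cstdint>

// Widen an 8-bit signed constant to a 32-bit unsigned parameter type;
// the value is sign-extended first.
uint32_t
widen_from_signed (int8_t v)
{
  return static_cast<uint32_t> (static_cast<int32_t> (v));
}

// Widen an 8-bit unsigned constant to the same parameter type; the
// value is zero-extended.
uint32_t
widen_from_unsigned (uint8_t v)
{
  return v;
}
```

Here widen_from_signed (-1) yields 0xffffffff while
widen_from_unsigned (0xff) yields 0xff, even though both inputs have
the bit pattern 0xff.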

gcc/ChangeLog:

2025-01-03  Martin Jambor  

PR ipa/118138
* ipa-cp.cc (ipacp_value_safe_for_type): Return the appropriate
type instead of a bool, accept NULL_TREE VALUEs.
(propagate_vals_across_arith_jfunc): Use the new returned value of
ipacp_value_safe_for_type.
(propagate_vals_across_ancestor): Likewise.
(propagate_scalar_across_jump_function): Likewise.

gcc/testsuite/ChangeLog:

2025-01-03  Martin Jambor  

PR ipa/118138
* gcc.dg/ipa/pr118138.c: New test.

Diff:
---
 gcc/ipa-cp.cc   | 33 +++--
 gcc/testsuite/gcc.dg/ipa/pr118138.c | 30 ++
 2 files changed, 49 insertions(+), 14 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 294389fba4c7..d89324a00775 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1448,19 +1448,23 @@ initialize_node_lattices (struct cgraph_node *node)
   }
 }
 
-/* Return true if VALUE can be safely IPA-CP propagated to a parameter of type
-   PARAM_TYPE.  */
+/* Return VALUE if it is NULL_TREE or if it can be directly safely IPA-CP
+   propagated to a parameter of type PARAM_TYPE, or return a fold-converted
+   VALUE to PARAM_TYPE if that is possible.  Return NULL_TREE otherwise.  */
 
-static bool
+static tree
 ipacp_value_safe_for_type (tree param_type, tree value)
 {
+  if (!value)
+return NULL_TREE;
   tree val_type = TREE_TYPE (value);
   if (param_type == val_type
-  || useless_type_conversion_p (param_type, val_type)
-  || fold_convertible_p (param_type, value))
-return true;
+  || useless_type_conversion_p (param_type, val_type))
+return value;
+  if (fold_convertible_p (param_type, value))
+return fold_convert (param_type, value);
   else
-return false;
+return NULL_TREE;
 }
 
 /* Return the result of a (possibly arithmetic) operation on the constant
@@ -2210,8 +2214,8 @@ propagate_vals_across_arith_jfunc (cgraph_edge *cs,
{
  tree cstval = get_val_across_arith_op (opcode, opnd1_type, opnd2,
 src_val, res_type);
- if (!cstval
- || !ipacp_value_safe_for_type (res_type, cstval))
+ cstval = ipacp_value_safe_for_type (res_type, cstval);
+ if (!cstval)
break;
 
  ret |= dest_lat->add_value (cstval, cs, src_val, src_idx,
@@ -2235,8 +2239,8 @@ propagate_vals_across_arith_jfunc (cgraph_edge *cs,
 
tree cstval = get_val_across_arith_op (opcode, opnd1_type, opnd2,
   src_val, res_type);
-   if (cstval
-   && ipacp_value_safe_for_type (res_type, cstval))
+   cstval = ipacp_value_safe_for_type (res_type, cstval);
+   if (cstval)
  ret |= dest_lat->add_value (cstval, cs, src_val, src_idx,
  src_offset);
else
@@ -2284,8 +2288,8 @@ propagate_vals_across_ancestor (struct cgraph_edge *cs,
   for (src_val = src_lat->values; src_val; src_val = src_val->next)
 {
   tree t = ipa_get_jf_ancestor_result (jfunc, src_val->value);
-
-  if (t && ipacp_value_safe_for_type (param_type, t))
+  t = ipacp_value_safe_for_type (param_type, t);
+  if (t)
ret |= dest_lat->add_value (t, cs, src_val, src_idx);
   else
ret |= dest_lat->set_contains_variable ();
@@ -2310,7 +2314,8 @@ propagate_scalar_across_jump_function (struct cgraph_edge *cs,
   if (jfunc->type == IPA_JF_CONST)
 {
   tree val = ipa_get_jf_constant (jfunc);
-  if (ipacp_value_safe_for_type (param_type, val))
+  val = ipacp_value_safe_for_type (param_type, val);
+  if (val)
return dest_lat->add_value (val, cs, NULL, 0);
   else
return dest_lat->set_contains_variable ();
diff --git a/gcc/testsuite/gcc.dg/ipa/pr118138.c b/gcc/testsuite/gcc.dg/ipa/pr118138.c
new file mode 100644
index ..5c94253f58b2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr118138.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-inli

[gcc r15-6864] MAINTAINERS: Make contrib/check-MAINTAINERS.py happy

2025-01-13 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:539fc490690d825ab2d299a0f577c5e9d3fa33d0

commit r15-6864-g539fc490690d825ab2d299a0f577c5e9d3fa33d0
Author: Martin Jambor 
Date:   Mon Jan 13 13:47:27 2025 +0100

MAINTAINERS: Make contrib/check-MAINTAINERS.py happy

This commit makes the contrib/check-MAINTAINERS.py script happy about
our MAINTAINERS file.  I hope that it knows best how things ought to
be and so am committing this as obvious.

ChangeLog:

2025-01-13  Martin Jambor  

* MAINTAINERS: Fix the name order of the Write After Approval section.

Diff:
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0c571bde8bce..256a03957d59 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -327,12 +327,12 @@ from other maintainers or reviewers.
 
 NameBZ account  Email
 
+Soumya AR   soumyaa 
 Mark G. Adams   mgadams 
 Ajit Kumar Agarwal  aagarwa 
 Pedro Alves palves  
 John David Anglin   danglin 
 Harald Anlauf   anlauf  
-Soumya AR   soumyaa 
 Paul-Antoine Arras  parras  
 Arsen Arsenović arsen   
 Raksit Ashokraksit  


[gcc r15-8061] ipa: Do not modify cgraph edges from thunk clones during inlining (PR116572)

2025-03-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:075ec330307c5b1fe5ed166a633c718c06b01437

commit r15-8061-g075ec330307c5b1fe5ed166a633c718c06b01437
Author: Martin Jambor 
Date:   Fri Mar 14 16:07:01 2025 +0100

ipa: Do not modify cgraph edges from thunk clones during inlining (PR116572)

In PR 116572 we hit an assert that a thunk which does not have a body
looks like it has one.  It does not, but the call_stmt of its outgoing
edge points to a statement, which it should not.  In fact it has several
outgoing call graph edges, which cannot be.  The problem is that the
code updating the edges to reflect inlining into the master clone (an
ex-thunk, unlike the clone, which is still an unexpanded thunk) also
updates the edges of the clone during inlining into the master clone.
This patch simply makes the code skip unexpanded thunk clones.
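
The walk over the clone tree and the new skip can be sketched as
follows.  This is a toy model only: the node struct, add_clone and
updated_nodes are hypothetical, and recording an id stands in for the
call to cgraph_update_edges_for_call_stmt_node.

```cpp
#include <cassert>
#include <vector>

// Toy model of the clone-tree links of a call graph node.
struct node
{
  int id;
  bool thunk = false;
  node *clone_of = nullptr;
  node *clones = nullptr;              // first clone
  node *next_sibling_clone = nullptr;  // next clone of the same parent
};

// Make CHILD the first clone of PARENT.
void
add_clone (node &parent, node &child)
{
  child.clone_of = &parent;
  child.next_sibling_clone = parent.clones;
  parent.clones = &child;
}

// Return the ids of ORIG and of all its transitive clones whose edges
// would be updated; unexpanded thunks are skipped.
std::vector<int>
updated_nodes (node &orig)
{
  assert (!orig.thunk);
  std::vector<int> out { orig.id };
  for (node *n = orig.clones; n && n != &orig;)
    {
      if (!n->thunk)
        out.push_back (n->id);
      if (n->clones)
        n = n->clones;
      else if (n->next_sibling_clone)
        n = n->next_sibling_clone;
      else
        {
          // Climb up until a node with an unvisited sibling is found.
          while (n != &orig && !n->next_sibling_clone)
            n = n->clone_of;
          if (n != &orig)
            n = n->next_sibling_clone;
        }
    }
  return out;
}
```

Note that skipping a thunk only suppresses the update of that node;
the walk still descends into its clones.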

gcc/ChangeLog:

2025-03-13  Martin Jambor  

PR ipa/116572
* cgraph.cc (cgraph_update_edges_for_call_stmt): Do not update
edges of clones that are unexpanded thunks.  Assert that the node
passed as the parameter is not an unexpanded thunk.

gcc/testsuite/ChangeLog:

2025-03-13  Martin Jambor  

PR ipa/116572
* g++.dg/ipa/pr116572.C: New test.

Diff:
---
 gcc/cgraph.cc   |  7 +--
 gcc/testsuite/g++.dg/ipa/pr116572.C | 37 +
 2 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index d0b19ad850e0..6ae6a97f6f56 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -1708,12 +1708,15 @@ cgraph_update_edges_for_call_stmt (gimple *old_stmt, tree old_decl,
   cgraph_node *node;
 
   gcc_checking_assert (orig);
+  gcc_assert (!orig->thunk);
   cgraph_update_edges_for_call_stmt_node (orig, old_stmt, old_decl, new_stmt);
   if (orig->clones)
 for (node = orig->clones; node != orig;)
   {
-   cgraph_update_edges_for_call_stmt_node (node, old_stmt, old_decl,
-   new_stmt);
+   /* Do not attempt to adjust bodies of yet unexpanded thunks.  */
+   if (!node->thunk)
+ cgraph_update_edges_for_call_stmt_node (node, old_stmt, old_decl,
+ new_stmt);
if (node->clones)
  node = node->clones;
else if (node->next_sibling_clone)
diff --git a/gcc/testsuite/g++.dg/ipa/pr116572.C b/gcc/testsuite/g++.dg/ipa/pr116572.C
new file mode 100644
index ..909568e1c72c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr116572.C
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c++20 -O3 -fsanitize=undefined" } */
+
+long v;
+template  struct A;
+template , typename = C>
+class B;
+template <>
+struct A
+{
+  static int foo(char *s, const char *t, long n) { return __builtin_memcmp(s, t, n); }
+};
+template 
+struct B {
+  long b;
+  B(const C *);
+  C *bar() const;
+  constexpr unsigned long baz(const C *, unsigned long, unsigned long) const noexcept;
+  void baz() { C c; baz(&c, 0, v); }
+};
+template 
+constexpr unsigned long
+B::baz(const C *s, unsigned long, unsigned long n) const noexcept
+{
+  C *x = bar(); if (!x) return b; D::foo(x, s, n); return 0;
+}
+namespace {
+struct F { virtual ~F() {} };
+struct F2 { virtual void foo(B) const; };
+struct F3 : F, F2 { void foo(B s) const { s.baz(); } } f;
+}
+int
+main()
+{
+  F *p;
+  dynamic_cast(p)->foo("");
+}


[gcc r13-9497] ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318)

2025-04-08 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:659e222b82c41ae0730a0bb93d891864b6ae5e16

commit r13-9497-g659e222b82c41ae0730a0bb93d891864b6ae5e16
Author: Martin Jambor 
Date:   Fri Mar 7 17:17:24 2025 +0100

ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318)

PR 118318 reported an ICE during a PGO build of Firefox.  It happens in
IPA-CP, in the final stages of update_counts_for_self_gen_clones, where
the pass attempts to guess how to distribute profile count among clones
created for recursive edges and the various edges that are created in
the process.  If one such edge has profile count of kind GUESSED_GLOBAL0,
the compatibility check in the operator+ will lead to an ICE.  After
discussing the situation with Honza, we concluded that there is little
more we can do other than check for this situation before touching the
edge count, so this is what this patch does.
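
The guard pattern can be shown on a toy count type.  Everything below
is hypothetical: toy_count only mimics the idea of a count carrying a
quality kind, and its compatible_p uses the simplifying assumption
that only counts of the same kind are compatible, which is stricter
than the real profile_count::compatible_p.

```cpp
#include <cassert>

enum class count_kind { guessed_global0, precise };

// Toy stand-in for a profile count that carries a quality kind.
struct toy_count
{
  long value;
  count_kind kind;

  // Simplifying assumption: only counts of the same kind mix.
  bool compatible_p (const toy_count &other) const
  { return kind == other.kind; }

  toy_count &operator+= (const toy_count &other)
  {
    // Adding incompatible counts is what triggers the "ICE" here.
    assert (compatible_p (other));
    value += other.value;
    return *this;
  }
};

// Add EXTRA to EDGE_COUNT only when that is safe; return true if the
// addition took place.
bool
add_if_compatible (toy_count &edge_count, const toy_count &extra)
{
  if (!edge_count.compatible_p (extra))
    return false;
  edge_count += extra;
  return true;
}
```

Checking before adding means an incompatible edge count is simply
left as it is instead of tripping the assertion in the addition.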

gcc/ChangeLog:

2025-02-28  Martin Jambor  

PR ipa/118318
* ipa-cp.cc (adjust_clone_incoming_counts): Add a compatible_p check.

(cherry picked from commit 7deb498425799aceb7659ea25614175a49533184)

Diff:
---
 gcc/ipa-cp.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 8f36608cf33b..08fca00e5f65 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -4808,7 +4808,8 @@ adjust_clone_incoming_counts (cgraph_node *node,
cs->count = cs->count.combine_with_ipa_count (sum);
   }
 else if (!desc->processed_edges->contains (cs)
-&& cs->caller->clone_of == desc->orig)
+&& cs->caller->clone_of == desc->orig
+&& cs->count.compatible_p (desc->count))
   {
cs->count += desc->count;
if (dump_file)


[gcc r14-11497] Fix a pasto in ao_compare::compare_ao_refs

2025-04-01 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:28c10781fd26324e8fd6077e743944f1a32e

commit r14-11497-g28c10781fd26324e8fd6077e743944f1a32e
Author: Martin Jambor 
Date:   Tue Mar 11 14:52:44 2025 +0100

Fix a pasto in ao_compare::compare_ao_refs

When reading the function ao_compare::compare_ao_refs I came across
what I believe to be a copy-and-paste error which this patch fixes.
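
The shape of the mistake is easy to reproduce outside of GCC.  In the
sketch below, which uses a made-up function and plain vectors instead
of reference trees, two consecutive walks share one counter and each
must record its position into its own variable.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// For each of the two sequences, return the index of the last element
// equal to MARKER, or -1 if there is none.
std::pair<int, int>
last_marker_positions (const std::vector<int> &a,
                       const std::vector<int> &b, int marker)
{
  int pos1 = -1, pos2 = -1;
  int i = 0;
  for (int x : a)
    {
      if (x == marker)
        pos1 = i;
      i++;
    }
  // Without this reset the second walk would keep counting from
  // a.size ().
  i = 0;
  for (int x : b)
    {
      if (x == marker)
        pos2 = i;  // a pasted "pos1 = i" here would clobber pos1
      i++;
    }
  return { pos1, pos2 };
}
```

With the copy-and-paste error in place, the second loop would
overwrite the first result and leave the second one at its initial
value.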

gcc/ChangeLog:

2025-03-10  Martin Jambor  

* tree-ssa-alias.cc (ao_compare::compare_ao_refs): Fix a
copy-and-paste error.

(cherry picked from commit dc47161c1f32c3f27d1157ba0de9d98ea1b7fc82)

Diff:
---
 gcc/tree-ssa-alias.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
index 72af21c02131..beab6249ae43 100644
--- a/gcc/tree-ssa-alias.cc
+++ b/gcc/tree-ssa-alias.cc
@@ -4302,12 +4302,13 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref *ref2,
c1 = p1, nskipped1 = i;
   i++;
 }
+  i = 0;
   for (tree p2 = ref2->ref; handled_component_p (p2); p2 = TREE_OPERAND (p2, 0))
 {
   if (component_ref_to_zero_sized_trailing_array_p (p2))
end_struct_ref2 = p2;
   if (ends_tbaa_access_path_p (p2))
-   c2 = p2, nskipped1 = i;
+   c2 = p2, nskipped2 = i;
   i++;
 }


[gcc r14-11495] ipa: Do not modify cgraph edges from thunk clones during inlining (PR116572)

2025-04-01 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:5312a8f62a6bcae36f6aa40f88c8b58dfae7db21

commit r14-11495-g5312a8f62a6bcae36f6aa40f88c8b58dfae7db21
Author: Martin Jambor 
Date:   Fri Mar 14 16:07:01 2025 +0100

ipa: Do not modify cgraph edges from thunk clones during inlining (PR116572)

In PR 116572 we hit an assert that a thunk which does not have a body
looks like it has one.  It does not, but the call_stmt of its outgoing
edge points to a statement, which it should not.  In fact it has several
outgoing call graph edges, which cannot be.  The problem is that the
clone's edges are modified by the code that updates edges to reflect
inlining into the master clone (an ex-thunk, unlike the clone, which is
still an unexpanded thunk).  This patch simply makes that code skip
unexpanded thunk clones.
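
As a rough sketch of the idea, the following stand-alone code walks a
tree of clones and leaves thunk nodes alone.  NodeSketch, add_clone and
visit_non_thunk_clones are hypothetical names used only for this
illustration; the real code operates on cgraph_node, and the part of the
walk that climbs back up via clone_of is an assumption, since it is not
shown in the diff below.

```cpp
#include <string>
#include <vector>

// Hypothetical, much simplified stand-in for a call graph node.
struct NodeSketch
{
  std::string name;
  bool thunk = false;
  NodeSketch *clone_of = nullptr;
  NodeSketch *clones = nullptr;             // first clone of this node
  NodeSketch *next_sibling_clone = nullptr;
};

// Register CHILD as the first clone of PARENT.
void
add_clone (NodeSketch &parent, NodeSketch &child)
{
  child.clone_of = &parent;
  child.next_sibling_clone = parent.clones;
  parent.clones = &child;
}

// Walk all (transitive) clones of ORIG and collect the names of those
// that are not unexpanded thunks; thunks are walked past, not processed.
std::vector<std::string>
visit_non_thunk_clones (NodeSketch *orig)
{
  std::vector<std::string> visited;
  for (NodeSketch *node = orig->clones; node && node != orig;)
    {
      if (!node->thunk)
        visited.push_back (node->name);
      if (node->clones)
        node = node->clones;
      else if (node->next_sibling_clone)
        node = node->next_sibling_clone;
      else
        {
          // Climb until a sibling exists or we are back at ORIG.
          while (node != orig && !node->next_sibling_clone)
            node = node->clone_of;
          if (node != orig)
            node = node->next_sibling_clone;
        }
    }
  return visited;
}
```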

gcc/ChangeLog:

2025-03-13  Martin Jambor  

PR ipa/116572
* cgraph.cc (cgraph_update_edges_for_call_stmt): Do not update
edges of clones that are unexpanded thunks.  Assert that the node
passed as the parameter is not an unexpanded thunk.

gcc/testsuite/ChangeLog:

2025-03-13  Martin Jambor  

PR ipa/116572
* g++.dg/ipa/pr116572.C: New test.

(cherry picked from commit 075ec330307c5b1fe5ed166a633c718c06b01437)

Diff:
---
 gcc/cgraph.cc   |  7 +--
 gcc/testsuite/g++.dg/ipa/pr116572.C | 37 +
 2 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 8226c7d96e05..dc18a16f8917 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -1710,12 +1710,15 @@ cgraph_update_edges_for_call_stmt (gimple *old_stmt, 
tree old_decl,
   cgraph_node *node;
 
   gcc_checking_assert (orig);
+  gcc_assert (!orig->thunk);
   cgraph_update_edges_for_call_stmt_node (orig, old_stmt, old_decl, new_stmt);
   if (orig->clones)
 for (node = orig->clones; node != orig;)
   {
-   cgraph_update_edges_for_call_stmt_node (node, old_stmt, old_decl,
-   new_stmt);
+   /* Do not attempt to adjust bodies of yet unexpanded thunks.  */
+   if (!node->thunk)
+ cgraph_update_edges_for_call_stmt_node (node, old_stmt, old_decl,
+ new_stmt);
if (node->clones)
  node = node->clones;
else if (node->next_sibling_clone)
diff --git a/gcc/testsuite/g++.dg/ipa/pr116572.C 
b/gcc/testsuite/g++.dg/ipa/pr116572.C
new file mode 100644
index ..909568e1c72c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr116572.C
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c++20 -O3 -fsanitize=undefined" } */
+
+long v;
+template  struct A;
+template , typename = C>
+class B;
+template <>
+struct A
+{
+  static int foo(char *s, const char *t, long n) { return __builtin_memcmp(s, 
t, n); }
+};
+template 
+struct B {
+  long b;
+  B(const C *);
+  C *bar() const;
+  constexpr unsigned long baz(const C *, unsigned long, unsigned long) const 
noexcept;
+  void baz() { C c; baz(&c, 0, v); }
+};
+template 
+constexpr unsigned long
+B::baz(const C *s, unsigned long, unsigned long n) const noexcept
+{
+  C *x = bar(); if (!x) return b; D::foo(x, s, n); return 0;
+}
+namespace {
+struct F { virtual ~F() {} };
+struct F2 { virtual void foo(B) const; };
+struct F3 : F, F2 { void foo(B s) const { s.baz(); } } f;
+}
+int
+main()
+{
+  F *p;
+  dynamic_cast(p)->foo("");
+}


[gcc r15-9250] sra: Avoid creating TBAA hazards (PR118924)

2025-04-07 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:07d243670020b339380194f6125cde87ada56148

commit r15-9250-g07d243670020b339380194f6125cde87ada56148
Author: Martin Jambor 
Date:   Mon Apr 7 13:32:09 2025 +0200

sra: Avoid creating TBAA hazards (PR118924)

The testcase in PR 118924, when compiled on Aarch64, contains a
gimple aggregate assignment statement between different types which
are types_compatible_p but behave differently for the purposes of
alias analysis.

SRA replaces the statement with a series of scalar assignments whose
LHS access chains are, however, modeled on the RHS type and so do not
alias with subsequent reads and so are DSEd.

SRA clearly gets its "same_access_path" logic subtly wrong.  One issue
is that the same_access_path_p function probably should be implemented
more along the lines of (parts of ao_compare::compare_ao_refs) instead
of internally relying on operand_equal_p.  That is however not the
problem in the PR and so I will deal with it only later.

The issue here is that even when the access path is the same, it must
not be bolted on an aggregate type that does not match.  This patch
does that, taking just one simple function from the
ao_compare::compare_ao_refs machinery and using it to detect the
situation.  The rest is just merging the information in between
accesses of the same access group.
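
The flag handling described in the ChangeLog below can be pictured
roughly as follows.  This is only a sketch under assumptions:
AccessSketch, note_assignment and merged_same_access_path are
hypothetical names, and the two boolean parameters stand in for the
results of the real type and path comparisons, which are not reproduced
here.

```cpp
#include <vector>

// Hypothetical stand-in for an SRA access; the flag starts out true.
struct AccessSketch
{
  bool same_access_path = true;
};

// When an aggregate assignment connects two types that are not equal for
// TBAA purposes, clear the flag on both involved accesses.
void
note_assignment (AccessSketch &lhs, AccessSketch &rhs,
                 bool types_equal_for_tbaa)
{
  if (!types_equal_for_tbaa)
    {
      lhs.same_access_path = false;
      rhs.same_access_path = false;
    }
}

// When accesses are merged into one group, the group may use an access
// path only if the paths compare equal and no member had its flag cleared.
bool
merged_same_access_path (const std::vector<AccessSketch> &group,
                         bool paths_comparable)
{
  bool result = paths_comparable;
  for (const AccessSketch &a : group)
    result = result && a.same_access_path;
  return result;
}
```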

I looked at how many times we come across such assignment during
"make stage2-bubble" of GCC (configured with only c and C++ and
without multilib and libsanitizers) and on an x86_64 there were 87924
such assignments (though now I realize not all of them had to be
aggregate), so they do happen.  The patch leads to about 5% increase
of cases where we don't use an "access path" but resort to a
MEM_REF (from 90209 to 95204).  On an Aarch64, there were 92268 such
assignments and the increase of falling back to MEM_REFs was by
4% (but from a bigger base 132983 to 107991).

gcc/ChangeLog:

2025-04-04  Martin Jambor  

PR tree-optimization/118924
* tree-ssa-alias-compare.h (types_equal_for_same_type_for_tbaa_p):
Declare.
* tree-ssa-alias.cc: Include ipa-utils.h.
(types_equal_for_same_type_for_tbaa_p): New public overloaded 
variant.
* tree-sra.cc: Include tree-ssa-alias-compare.h.
(create_access): Initialize grp_same_access_path to true.
(build_accesses_from_assign): Detect tbaa hazards and clear
grp_same_access_path fields of involved accesses when they occur.
(sort_and_splice_var_accesses): Take previous values of
grp_same_access_path into account.

gcc/testsuite/ChangeLog:

2025-03-25  Martin Jambor  

PR tree-optimization/118924
* g++.dg/tree-ssa/pr118924.C: New test.

Diff:
---
 gcc/testsuite/g++.dg/tree-ssa/pr118924.C | 29 +
 gcc/tree-sra.cc  | 17 ++---
 gcc/tree-ssa-alias-compare.h |  2 ++
 gcc/tree-ssa-alias.cc| 13 -
 4 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr118924.C 
b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C
new file mode 100644
index ..c95eacafc9ce
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-std=c++17 -O2" } */
+
+template  struct Vector {
+  int m_data[Size];
+  Vector(int, int, int) {}
+};
+enum class E { POINTS, LINES, TRIANGLES };
+
+__attribute__((noipa))
+void getName(E type) {
+  static E check = E::POINTS;
+  if (type == check)
+check = (E)((int)check + 1);
+  else
+__builtin_abort ();
+}
+
+int main() {
+  int arr[]{0, 1, 2};
+  for (auto dim : arr) {
+Vector<3> localInvs(1, 1, 1);
+localInvs.m_data[dim] = 8;
+  }
+  E types[] = {E::POINTS, E::LINES, E::TRIANGLES};
+  for (auto primType : types)
+getName(primType);
+  return 0;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index c26559edc666..ae7cd57a5f23 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -100,6 +100,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "tree-sra.h"
 #include "opts.h"
+#include "tree-ssa-alias-compare.h"
 
 /* Enumeration of all aggregate reductions we can do.  */
 enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
@@ -979,6 +980,7 @@ create_access (tree expr, gimple *stmt, bool write)
   access->type = TREE_TYPE (expr);
   access->write = write;
   access->grp_unscalarizable_region = unscalarizable_region;
+  access->grp_same_access_path = true;
   access->stmt = stmt;
   access->reverse = reverse;
 
@@ -1522,6 +1524,9 @@ build_accesses_from_assign (gimple *stmt)
   racc = build_access_from_expr_1 (rhs, stmt, false);
   lacc = build_access_from_expr_1 (lhs, stmt,

[gcc r15-9251] sra: Clear grp_same_access_path of acesses created by total scalarization (PR118924)

2025-04-07 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:40445711b8af113ef423d8bcac1a7ce1c47f62d7

commit r15-9251-g40445711b8af113ef423d8bcac1a7ce1c47f62d7
Author: Martin Jambor 
Date:   Mon Apr 7 13:32:10 2025 +0200

sra: Clear grp_same_access_path of acesses created by total scalarization 
(PR118924)

During analysis of PR 118924 it was discussed that total scalarization
invents access paths (strings of COMPONENT_REFs and possibly even
ARRAY_REFs) that did not exist in the program before, which can have
unintended effects on subsequent AA queries.  Although not doing that
does not mean that SRA cannot create such situations (see the bug for
more info), it has been agreed that not doing this is generally better.
This patch therefore makes SRA fall back on creating simple MEM_REFs when
accessing components of an aggregate corresponding to what an SRA
variable now represents.

gcc/ChangeLog:

2025-03-26  Martin Jambor  

PR tree-optimization/118924
* tree-sra.cc (create_total_scalarization_access): Set
grp_same_access_path flag to zero.

Diff:
---
 gcc/tree-sra.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index ae7cd57a5f23..302b73e83b8f 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -3462,7 +3462,7 @@ create_total_scalarization_access (struct access *parent, 
HOST_WIDE_INT pos,
   access->grp_write = parent->grp_write;
   access->grp_total_scalarization = 1;
   access->grp_hint = 1;
-  access->grp_same_access_path = path_comparable_for_same_access (expr);
+  access->grp_same_access_path = 0;
   access->reverse = reverse_storage_order_for_component_p (expr);
 
   access->next_sibling = next_sibling;


[gcc r15-9427] ipa-cp: Make propagation of bits in IPA-CP aware of type conversions (PR119318)

2025-04-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:de1c734a8ae034c92f485e7f58b7fcb1c921ecd2

commit r15-9427-gde1c734a8ae034c92f485e7f58b7fcb1c921ecd2
Author: Martin Jambor 
Date:   Mon Apr 14 14:21:15 2025 +0200

ipa-cp: Make propagation of bits in IPA-CP aware of type conversions 
(PR119318)

After the propagation of constants and value ranges, it turns out
that the propagation of known bits also needs to be made aware of any
intermediate types in which any arithmetic operations are made and
must limit its precision there.  This implements just that, using the
newly collected and streamed types of the operations involved.

This version removed the extra check that the type of a formal
parameter is known, pointed out by Honza in his review, because I agree
it is currently always known.  I have also added the testcase of PR
119530 which is a duplicate of this bug.
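
To make the precision limit concrete, here is a simplified sketch that
uses a plain 64-bit integer in place of widest_int.  KnownBitsSketch and
cap_to_precision are hypothetical names; as in the lattice, a set mask
bit means the corresponding value bit is unknown.  Every bit at or above
the precision of the operation type is marked unknown, and if no bit
within the precision remains known the result drops to bottom.

```cpp
#include <cstdint>

// Hypothetical stand-in for the bits lattice: a mask bit of 1 means the
// corresponding bit of value is unknown.
struct KnownBitsSketch
{
  uint64_t value;
  uint64_t mask;
  bool bottom;
};

// Mark every bit at or above PRECISION as unknown and report bottom when
// nothing within PRECISION is known any more.
KnownBitsSketch
cap_to_precision (uint64_t value, uint64_t mask, unsigned precision)
{
  const uint64_t low
    = precision >= 64 ? ~uint64_t (0) : (uint64_t (1) << precision) - 1;
  mask |= ~low;
  if ((mask & low) == low)
    return {0, ~uint64_t (0), true};
  // Keep the invariant that unknown bits are zero in value.
  return {value & ~mask, mask, false};
}
```

For example, a fully known 0x100 seen through an 8-bit operation type
keeps only its low eight bits as known zeros; bit 8 is no longer claimed
to be set.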

gcc/ChangeLog:

2025-04-11  Martin Jambor  

PR ipa/119318
* ipa-cp.cc (ipcp_bits_lattice::meet_with_1): Set all mask bits
not covered by precision to one.
(ipcp_bits_lattice::meet_with): Likewise.
(propagate_bits_across_jump_function): Use the stored operation
type to perform meet with other lattices.

gcc/testsuite/ChangeLog:

2025-04-11  Martin Jambor  

PR ipa/119318
* gcc.dg/ipa/pr119318.c: New test.
* gcc.dg/ipa/pr119530.c: Likewise.

Diff:
---
 gcc/ipa-cp.cc   | 21 +++-
 gcc/testsuite/gcc.dg/ipa/pr119318.c | 38 +
 gcc/testsuite/gcc.dg/ipa/pr119530.c | 21 
 3 files changed, 75 insertions(+), 5 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 264568989a96..fd2c4cca1365 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -918,6 +918,8 @@ ipcp_bits_lattice::meet_with_1 (widest_int value, 
widest_int mask,
 m_mask |= m_value;
   m_value &= ~m_mask;
 
+  widest_int cap_mask = wi::bit_not (wi::sub (wi::lshift (1, precision), 1));
+  m_mask |= cap_mask;
   if (wi::sext (m_mask, precision) == -1)
 return set_to_bottom ();
 
@@ -996,6 +998,8 @@ ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, 
unsigned precision,
  adjusted_mask |= adjusted_value;
  adjusted_value &= ~adjusted_mask;
}
+  widest_int cap_mask = wi::bit_not (wi::sub (wi::lshift (1, precision), 
1));
+  adjusted_mask |= cap_mask;
   if (wi::sext (adjusted_mask, precision) == -1)
return set_to_bottom ();
   return set_to_constant (adjusted_value, adjusted_mask);
@@ -2507,14 +2511,12 @@ propagate_bits_across_jump_function (cgraph_edge *cs, 
int idx,
   return dest_lattice->set_to_bottom ();
 }
 
-  unsigned precision = TYPE_PRECISION (parm_type);
-  signop sgn = TYPE_SIGN (parm_type);
-
   if (jfunc->type == IPA_JF_PASS_THROUGH
   || jfunc->type == IPA_JF_ANCESTOR)
 {
   ipa_node_params *caller_info = ipa_node_params_sum->get (cs->caller);
   tree operand = NULL_TREE;
+  tree op_type = NULL_TREE;
   enum tree_code code;
   unsigned src_idx;
   bool keep_null = false;
@@ -2524,7 +2526,10 @@ propagate_bits_across_jump_function (cgraph_edge *cs, 
int idx,
  code = ipa_get_jf_pass_through_operation (jfunc);
  src_idx = ipa_get_jf_pass_through_formal_id (jfunc);
  if (code != NOP_EXPR)
-   operand = ipa_get_jf_pass_through_operand (jfunc);
+   {
+ operand = ipa_get_jf_pass_through_operand (jfunc);
+ op_type = ipa_get_jf_pass_through_op_type (jfunc);
+   }
}
   else
{
@@ -2551,6 +2556,11 @@ propagate_bits_across_jump_function (cgraph_edge *cs, 
int idx,
 
   if (!src_lats->bits_lattice.bottom_p ())
{
+ if (!op_type)
+   op_type = ipa_get_type (caller_info, src_idx);
+
+ unsigned precision = TYPE_PRECISION (op_type);
+ signop sgn = TYPE_SIGN (op_type);
  bool drop_all_ones
= keep_null && !src_lats->bits_lattice.known_nonzero_p ();
 
@@ -2570,7 +2580,8 @@ propagate_bits_across_jump_function (cgraph_edge *cs, int 
idx,
= widest_int::from (bm.mask (), TYPE_SIGN (parm_type));
  widest_int value
= widest_int::from (bm.value (), TYPE_SIGN (parm_type));
- return dest_lattice->meet_with (value, mask, precision);
+ return dest_lattice->meet_with (value, mask,
+ TYPE_PRECISION (parm_type));
}
 }
   return dest_lattice->set_to_bottom ();
diff --git a/gcc/testsuite/gcc.dg/ipa/pr119318.c 
b/gcc/testsuite/gcc.dg/ipa/pr119318.c
new file mode 100644
index ..8e62ec5e3503
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr119318.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-require-effective-target int128 } */
+/* { dg-additional-options "-Wno-psabi -w" } */
+/* { dg-options "-Wno-psabi

[gcc r15-9486] ipa-bit-cp: Fix adjusting value according to mask (PR119803)

2025-04-15 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:b4cf69503bcb32491dbd7ab63fe7f0f9fcdcca38

commit r15-9486-gb4cf69503bcb32491dbd7ab63fe7f0f9fcdcca38
Author: Martin Jambor 
Date:   Tue Apr 15 15:55:34 2025 +0200

ipa-bit-cp: Fix adjusting value according to mask (PR119803)

In my fix for PR 119318 I put mask calculation in
ipcp_bits_lattice::meet_with_1 above a final fix to value so that all
the bits in the value which are meaningless according to mask have
value zero, which has tripped a validator in PR 119803.  This patch
fixes that by moving the adjustment down.

Even though the fix for PR 119318 did a similar thing in
ipcp_bits_lattice::meet_with, the same is not necessary because that
code path then feeds the new value and mask to
ipcp_bits_lattice::set_to_constant which does the final adjustment
correctly.

In both places, however, Jakub proposed a better way of calculating
cap_mask and so I have changed it accordingly.
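
Both points can be checked on plain 64-bit integers.  The sketch below
is an illustration under assumptions, not the GCC code: BitsSketch and
the four functions are hypothetical names, and the shift-based formula
merely stands in for the wide-int call used in the patch.  It shows that
the two ways of computing the cap mask agree for precisions below 64,
and that clearing value before the mask is capped can leave value bits
set under the mask.

```cpp
#include <cstdint>

// Old formula: negate a mask of PRECISION low bits (PRECISION < 64).
uint64_t
cap_mask_subtract (unsigned precision)
{
  return ~((uint64_t (1) << precision) - 1);
}

// Shorter equivalent: all ones shifted left by PRECISION (PRECISION < 64).
uint64_t
cap_mask_shift (unsigned precision)
{
  return ~uint64_t (0) << precision;
}

struct BitsSketch
{
  uint64_t value;
  uint64_t mask;  // a set bit means "unknown"
};

// Problematic order: value is cleaned before the mask grows.
BitsSketch
clear_then_cap (BitsSketch b, unsigned precision)
{
  b.value &= ~b.mask;
  b.mask |= cap_mask_shift (precision);
  return b;
}

// Fixed order: grow the mask first, then clean value against it.
BitsSketch
cap_then_clear (BitsSketch b, unsigned precision)
{
  b.mask |= cap_mask_shift (precision);
  b.value &= ~b.mask;
  return b;
}
```

With a fully known input of 0x1ff and a precision of 8, the first order
leaves bit 8 set in value although the mask declares it unknown; the
second order yields 0xff with no overlap.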

gcc/ChangeLog:

2025-04-15  Martin Jambor  

PR ipa/119803
* ipa-cp.cc (ipcp_bits_lattice::meet_with_1): Move m_value adjustment
according to m_mask below the adjustment of the latter according to
cap_mask.  Optimize the calculation of cap_mask a bit.
(ipcp_bits_lattice::meet_with): Optimize the calculation of 
cap_mask a
bit.

gcc/testsuite/ChangeLog:

2025-04-15  Martin Jambor  

PR ipa/119803
* gcc.dg/ipa/pr119803.c: New test.

Co-authored-by: Jakub Jelinek 

Diff:
---
 gcc/ipa-cp.cc   |  6 +++---
 gcc/testsuite/gcc.dg/ipa/pr119803.c | 16 
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 379fbc5dd637..806c2bdc97f2 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -933,13 +933,13 @@ ipcp_bits_lattice::meet_with_1 (widest_int value, 
widest_int mask,
   m_mask = (m_mask | mask) | (m_value ^ value);
   if (drop_all_ones)
 m_mask |= m_value;
-  m_value &= ~m_mask;
 
-  widest_int cap_mask = wi::bit_not (wi::sub (wi::lshift (1, precision), 1));
+  widest_int cap_mask = wi::shifted_mask  (0, precision, true);
   m_mask |= cap_mask;
   if (wi::sext (m_mask, precision) == -1)
 return set_to_bottom ();
 
+  m_value &= ~m_mask;
   return m_mask != old_mask;
 }
 
@@ -1015,7 +1015,7 @@ ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, 
unsigned precision,
  adjusted_mask |= adjusted_value;
  adjusted_value &= ~adjusted_mask;
}
-  widest_int cap_mask = wi::bit_not (wi::sub (wi::lshift (1, precision), 
1));
+  widest_int cap_mask = wi::shifted_mask  (0, precision, true);
   adjusted_mask |= cap_mask;
   if (wi::sext (adjusted_mask, precision) == -1)
return set_to_bottom ();
diff --git a/gcc/testsuite/gcc.dg/ipa/pr119803.c 
b/gcc/testsuite/gcc.dg/ipa/pr119803.c
new file mode 100644
index ..1a7bfd25018a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr119803.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+extern void f(int p);
+int a, b;
+char c;
+static int d(int e) { return !e || a == 1 ? 0 : a / e; }
+static void h(short e) {
+  int g = d(e);
+  f(g);
+}
+void i() {
+  c = 128;
+  h(c);
+  b = d(65536);
+}


[gcc r14-11682] sra: Avoid creating TBAA hazards (PR118924)

2025-04-24 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:19dd791b3a7166df0766dfd0b5e6918f8e3d1bba

commit r14-11682-g19dd791b3a7166df0766dfd0b5e6918f8e3d1bba
Author: Martin Jambor 
Date:   Mon Apr 7 13:32:09 2025 +0200

sra: Avoid creating TBAA hazards (PR118924)

The testcase in PR 118924, when compiled on Aarch64, contains a
gimple aggregate assignment statement between different types which
are types_compatible_p but behave differently for the purposes of
alias analysis.

SRA replaces the statement with a series of scalar assignments whose
LHS access chains are, however, modeled on the RHS type and so do not
alias with subsequent reads and so are DSEd.

SRA clearly gets its "same_access_path" logic subtly wrong.  One issue
is that the same_access_path_p function probably should be implemented
more along the lines of (parts of ao_compare::compare_ao_refs) instead
of internally relying on operand_equal_p.  That is however not the
problem in the PR and so I will deal with it only later.

The issue here is that even when the access path is the same, it must
not be bolted on an aggregate type that does not match.  This patch
does that, taking just one simple function from the
ao_compare::compare_ao_refs machinery and using it to detect the
situation.  The rest is just merging the information in between
accesses of the same access group.

I looked at how many times we come across such assignment during
"make stage2-bubble" of GCC (configured with only c and C++ and
without multilib and libsanitizers) and on an x86_64 there were 87924
such assignments (though now I realize not all of them had to be
aggregate), so they do happen.  The patch leads to about 5% increase
of cases where we don't use an "access path" but resort to a
MEM_REF (from 90209 to 95204).  On an Aarch64, there were 92268 such
assignments and the increase of falling back to MEM_REFs was by
4% (but from a bigger base 132983 to 107991).

gcc/ChangeLog:

2025-04-04  Martin Jambor  

PR tree-optimization/118924
* tree-ssa-alias-compare.h (types_equal_for_same_type_for_tbaa_p):
Declare.
* tree-ssa-alias.cc: Include ipa-utils.h.
(types_equal_for_same_type_for_tbaa_p): New public overloaded 
variant.
* tree-sra.cc: Include tree-ssa-alias-compare.h.
(create_access): Initialize grp_same_access_path to true.
(build_accesses_from_assign): Detect tbaa hazards and clear
grp_same_access_path fields of involved accesses when they occur.
(sort_and_splice_var_accesses): Take previous values of
grp_same_access_path into account.

gcc/testsuite/ChangeLog:

2025-03-25  Martin Jambor  

PR tree-optimization/118924
* g++.dg/tree-ssa/pr118924.C: New test.

(cherry picked from commit 07d243670020b339380194f6125cde87ada56148)

Diff:
---
 gcc/testsuite/g++.dg/tree-ssa/pr118924.C | 29 +
 gcc/tree-sra.cc  | 17 ++---
 gcc/tree-ssa-alias-compare.h |  2 ++
 gcc/tree-ssa-alias.cc| 13 -
 4 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr118924.C 
b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C
new file mode 100644
index ..c95eacafc9ce
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-std=c++17 -O2" } */
+
+template  struct Vector {
+  int m_data[Size];
+  Vector(int, int, int) {}
+};
+enum class E { POINTS, LINES, TRIANGLES };
+
+__attribute__((noipa))
+void getName(E type) {
+  static E check = E::POINTS;
+  if (type == check)
+check = (E)((int)check + 1);
+  else
+__builtin_abort ();
+}
+
+int main() {
+  int arr[]{0, 1, 2};
+  for (auto dim : arr) {
+Vector<3> localInvs(1, 1, 1);
+localInvs.m_data[dim] = 8;
+  }
+  E types[] = {E::POINTS, E::LINES, E::TRIANGLES};
+  for (auto primType : types)
+getName(primType);
+  return 0;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index c91e40ef7e71..e1243dd0441d 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -100,6 +100,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "tree-sra.h"
 #include "opts.h"
+#include "tree-ssa-alias-compare.h"
 
 /* Enumeration of all aggregate reductions we can do.  */
 enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
@@ -979,6 +980,7 @@ create_access (tree expr, gimple *stmt, bool write)
   access->type = TREE_TYPE (expr);
   access->write = write;
   access->grp_unscalarizable_region = unscalarizable_region;
+  access->grp_same_access_path = true;
   access->stmt = stmt;
   access->reverse = reverse;
 
@@ -1522,6 +1524,9 @@ build_accesses_from_assign (gimple *stmt)
   racc = build_access

[gcc r14-11683] sra: Clear grp_same_access_path of acesses created by total scalarization (PR118924)

2025-04-24 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:cd7c5d9729851940ab6bb7a8522a548c62e8dade

commit r14-11683-gcd7c5d9729851940ab6bb7a8522a548c62e8dade
Author: Martin Jambor 
Date:   Mon Apr 7 13:32:10 2025 +0200

sra: Clear grp_same_access_path of acesses created by total scalarization 
(PR118924)

During analysis of PR 118924 it was discussed that total scalarization
invents access paths (strings of COMPONENT_REFs and possibly even
ARRAY_REFs) that did not exist in the program before, which can have
unintended effects on subsequent AA queries.  Although not doing that
does not mean that SRA cannot create such situations (see the bug for
more info), it has been agreed that not doing this is generally better.
This patch therefore makes SRA fall back on creating simple MEM_REFs when
accessing components of an aggregate corresponding to what an SRA
variable now represents.

gcc/ChangeLog:

2025-03-26  Martin Jambor  

PR tree-optimization/118924
* tree-sra.cc (create_total_scalarization_access): Set
grp_same_access_path flag to zero.

(cherry picked from commit 40445711b8af113ef423d8bcac1a7ce1c47f62d7)

Diff:
---
 gcc/tree-sra.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index e1243dd0441d..46ddd41fdcb9 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -3436,7 +3436,7 @@ create_total_scalarization_access (struct access *parent, 
HOST_WIDE_INT pos,
   access->grp_write = parent->grp_write;
   access->grp_total_scalarization = 1;
   access->grp_hint = 1;
-  access->grp_same_access_path = path_comparable_for_same_access (expr);
+  access->grp_same_access_path = 0;
   access->reverse = reverse_storage_order_for_component_p (expr);
 
   access->next_sibling = next_sibling;


[gcc r12-11079] sra: Clear grp_same_access_path of acesses created by total scalarization (PR118924)

2025-04-30 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:d4d12a548f210371609e85f6d2f4f3ee0e2b04f2

commit r12-11079-gd4d12a548f210371609e85f6d2f4f3ee0e2b04f2
Author: Martin Jambor 
Date:   Mon Apr 7 13:32:10 2025 +0200

sra: Clear grp_same_access_path of acesses created by total scalarization 
(PR118924)

During analysis of PR 118924 it was discussed that total scalarization
invents access paths (strings of COMPONENT_REFs and possibly even
ARRAY_REFs) that did not exist in the program before, which can have
unintended effects on subsequent AA queries.  Although not doing that
does not mean that SRA cannot create such situations (see the bug for
more info), it has been agreed that not doing this is generally better.
This patch therefore makes SRA fall back on creating simple MEM_REFs when
accessing components of an aggregate corresponding to what an SRA
variable now represents.

gcc/ChangeLog:

2025-03-26  Martin Jambor  

PR tree-optimization/118924
* tree-sra.cc (create_total_scalarization_access): Set
grp_same_access_path flag to zero.

(cherry picked from commit 40445711b8af113ef423d8bcac1a7ce1c47f62d7)

Diff:
---
 gcc/tree-sra.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 5a9eaf31b6e9..91af2aef8b4c 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -3130,7 +3130,7 @@ create_total_scalarization_access (struct access *parent, 
HOST_WIDE_INT pos,
   access->grp_write = parent->grp_write;
   access->grp_total_scalarization = 1;
   access->grp_hint = 1;
-  access->grp_same_access_path = path_comparable_for_same_access (expr);
+  access->grp_same_access_path = 0;
   access->reverse = reverse_storage_order_for_component_p (expr);
 
   access->next_sibling = next_sibling;


[gcc r15-9429] ipa-cp: Use the stored and streamed pass-through types in ipa-vr (PR118785)

2025-04-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:4f19487f2606d25516d31f0279101deea9772da4

commit r15-9429-g4f19487f2606d25516d31f0279101deea9772da4
Author: Martin Jambor 
Date:   Mon Apr 14 14:21:15 2025 +0200

ipa-cp: Use the stored and streamed pass-through types in ipa-vr (PR118785)

This patch revisits the fix for PR 118785 and, instead of deducing the
necessary operation type, it just uses the value collected and streamed
by an earlier patch.  The main advantage is that we do not rely on
expr_type_first_operand_type_p enumerating all operations.

gcc/ChangeLog:

2025-03-20  Martin Jambor  

PR ipa/118785
* ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Use the stored
and streamed type of arithmetic pass-through functions.

Diff:
---
 gcc/ipa-cp.cc | 28 ++--
 1 file changed, 2 insertions(+), 26 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 637bc49f0482..21033c666bf4 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1735,24 +1735,7 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr,
   const value_range *inter_vr;
   if (operation != NOP_EXPR)
{
- /* Since we construct arithmetic jump functions even when there is a
- type conversion in between the operation encoded in the jump
- function and when it is passed in a call argument, the IPA
- propagation phase must also perform the operation and conversion
- in two separate steps.
-
-TODO: In order to remove the use of expr_type_first_operand_type_p
-predicate we would need to stream the operation type, ideally
-encoding the whole jump function as a series of expr_eval_op
-structures.  */
-
- tree operation_type;
- if (expr_type_first_operand_type_p (operation))
-   operation_type = src_type;
- else if (operation == ABSU_EXPR)
-   operation_type = unsigned_type_for (src_type);
- else
-   return;
+ tree operation_type = ipa_get_jf_pass_through_op_type (jfunc);
  op_res.set_varying (operation_type);
  if (!ipa_vr_operation_and_type_effects (op_res, src_vr, operation,
  operation_type, src_type))
@@ -1782,14 +1765,7 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr,
   value_range op_vr (TREE_TYPE (operand));
   ipa_get_range_from_ip_invariant (op_vr, operand, context_node);
 
-  tree operation_type;
-  if (TREE_CODE_CLASS (operation) == tcc_comparison)
-operation_type = boolean_type_node;
-  else if (expr_type_first_operand_type_p (operation))
-operation_type = src_type;
-  else
-return;
-
+  tree operation_type = ipa_get_jf_pass_through_op_type (jfunc);
   value_range op_res (operation_type);
   if (!ipa_vr_supported_type_p (operation_type)
   || !handler.operand_check_p (operation_type, src_type, op_vr.type ())


[gcc r15-9426] ipa: Record and stream result types of arithemetic jump functions

2025-04-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:f33d2e6b532304d487193667e6b5d8f8d7df2bf4

commit r15-9426-gf33d2e6b532304d487193667e6b5d8f8d7df2bf4
Author: Martin Jambor 
Date:   Mon Apr 14 14:21:14 2025 +0200

ipa: Record and stream result types of arithemetic jump functions

In order to replace the use of the somewhat unwieldy
expr_type_first_operand_type_p we need to record and stream the types
of results of operations recorded in arithmetic jump functions.  This
is necessary so that we can then simulate them at the IPA stage with
the corresponding precision and signedness.  This patch does the
recording and streaming, the following one adds the use of the data.

Per Honza's request this version also checks that we do not put VLA
types into the global LTO stream, even though I was not able to
actually craft a test-case that would do that without them.

gcc/ChangeLog:

2025-04-11  Martin Jambor  

PR ipa/118097
PR ipa/118785
PR ipa/119318
* lto-streamer.h (lto_variably_modified_type_p): Declare.
* ipa-prop.h (ipa_pass_through_data): New field op_type.
(ipa_get_jf_pass_through_op_type): New function.
* ipa-prop.cc: Include lto-streamer.h.
(ipa_dump_jump_function): Dump also pass-through
operation types, if any.  Dump pass-through operands only if not 
NULL.
(ipa_set_jf_simple_pass_through): Set op_type accordingly.
(compute_complex_assign_jump_func): Set op_type of arithmetic
pass-through jump_functions.
(analyze_agg_content_value): Update lhs when walking assignment
copies.  Set op_type of aggregate arithmetic pass-through
jump_functions.
(update_jump_functions_after_inlining): Also transfer the operation
type from the source arithmetic pass-through jump function to the
destination jump function.
(ipa_write_jump_function): Stream also the op_type when necessary.
(ipa_read_jump_function): Likewise.
(ipa_agg_pass_through_jf_equivalent_p): Also compare operation 
types.
* lto-streamer-out.cc (lto_variably_modified_type_p): Make public.

Diff:
---
 gcc/ipa-prop.cc | 76 -
 gcc/ipa-prop.h  | 15 ++
 gcc/lto-streamer-out.cc |  2 +-
 gcc/lto-streamer.h  |  1 +
 4 files changed, 80 insertions(+), 14 deletions(-)

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index a120f942dc25..49d68ab044b7 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-range.h"
 #include "value-range-storage.h"
 #include "vr-values.h"
+#include "lto-streamer.h"
 
 /* Function summary where the parameter infos are actually stored. */
 ipa_node_params_t *ipa_node_params_sum = NULL;
@@ -454,7 +455,11 @@ ipa_dump_jump_function (FILE *f, ipa_jump_func *jump_func,
   if (jump_func->value.pass_through.operation != NOP_EXPR)
{
  fprintf (f, " ");
- print_generic_expr (f, jump_func->value.pass_through.operand);
+ if (jump_func->value.pass_through.operand)
+   print_generic_expr (f, jump_func->value.pass_through.operand);
+ fprintf (f, " (in type ");
+ print_generic_expr (f, jump_func->value.pass_through.op_type);
+ fprintf (f, ")");
}
   if (jump_func->value.pass_through.agg_preserved)
fprintf (f, ", agg_preserved");
@@ -510,7 +515,11 @@ ipa_dump_jump_function (FILE *f, ipa_jump_func *jump_func,
  if (item->value.pass_through.operation != NOP_EXPR)
{
  fprintf (f, " ");
- print_generic_expr (f, item->value.pass_through.operand);
+ if (item->value.pass_through.operand)
+   print_generic_expr (f, item->value.pass_through.operand);
+ fprintf (f, " (in type ");
+ print_generic_expr (f, jump_func->value.pass_through.op_type);
+ fprintf (f, ")");
}
}
  else if (item->jftype == IPA_JF_CONST)
@@ -682,6 +691,7 @@ ipa_set_jf_simple_pass_through (struct ipa_jump_func *jfunc, int formal_id,
 {
   jfunc->type = IPA_JF_PASS_THROUGH;
   jfunc->value.pass_through.operand = NULL_TREE;
+  jfunc->value.pass_through.op_type = NULL_TREE;
   jfunc->value.pass_through.formal_id = formal_id;
   jfunc->value.pass_through.operation = NOP_EXPR;
   jfunc->value.pass_through.agg_preserved = agg_preserved;
@@ -692,10 +702,11 @@ ipa_set_jf_simple_pass_through (struct ipa_jump_func *jfunc, int formal_id,
 
 static void
 ipa_set_jf_unary_pass_through (struct ipa_jump_func *jfunc, int formal_id,
-  enum tree_code operation)
+  enum tree_code operation, tree op_type)
 {
   jfunc->type = IPA_JF_PASS_THROUGH;
   j

[gcc r15-9428] ipa-cp: Make dumping of widest_ints even more sane

2025-04-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:044d0d1ee1a61c21670068485d4a250edfbb695a

commit r15-9428-g044d0d1ee1a61c21670068485d4a250edfbb695a
Author: Martin Jambor 
Date:   Mon Apr 14 14:21:15 2025 +0200

ipa-cp: Make dumping of widest_ints even more sane

This patch just introduces a form of dumping of widest ints whose only
zero bits are in the lowest 128 bits (that is, all higher bits are
ones) so that instead of printing thousands of f's the output looks like:

   Bits: value = 0x, mask = all ones folled by 0x

and then makes sure we use the function not only to print bits but
also to print masks where values like these can also occur.
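
The three-way choice described above can be sketched outside of GCC as
follows.  This is only an illustration under stated assumptions: a
256-bit value is modelled as four 64-bit words instead of widest_int,
and the names Wide, hex_words and format_wide are made up for the
sketch (the real code uses wi:: helpers and print_hex).  The sketch
also spells the label out in full.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical 256-bit value, least significant 64-bit word first.
using Wide = std::array<std::uint64_t, 4>;

constexpr std::uint64_t kAllOnes = ~std::uint64_t{0};

// Print the lowest NWORDS words of V in hex without leading zeros.
static std::string hex_words(const Wide &v, int nwords)
{
  std::string out = "0x";
  bool started = false;
  for (int i = nwords - 1; i >= 0; --i)
    {
      char buf[17];
      if (started)
        std::snprintf(buf, sizeof buf, "%016llx", (unsigned long long) v[i]);
      else if (v[i] != 0 || i == 0)
        {
          std::snprintf(buf, sizeof buf, "%llx", (unsigned long long) v[i]);
          started = true;
        }
      else
        continue;  // Skip leading all-zero words.
      out += buf;
    }
  return out;
}

// Pick the shortest of the three forms for V.
std::string format_wide(const Wide &v)
{
  if (v[0] == kAllOnes && v[1] == kAllOnes
      && v[2] == kAllOnes && v[3] == kAllOnes)
    return "-1";
  // All bits above the lowest 128 are ones: print only the low part.
  if (v[2] == kAllOnes && v[3] == kAllOnes)
    return "all ones followed by " + hex_words(v, 2);
  return hex_words(v, 4);
}
```

For instance, format_wide applied to a value whose upper 128 bits are
all ones and whose low part is 0xff00 yields the short form rather
than a long run of f's.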

gcc/ChangeLog:

2025-03-21  Martin Jambor  

* ipa-cp.cc (ipcp_print_widest_int): Also add a truncated form of
dumping of widest ints which only have zeros in the lowest 128 bits.
Update the comment.
(ipcp_bits_lattice::print): Also dump the mask using
ipcp_print_widest_int.
(ipcp_store_vr_results): Likewise.

Diff:
---
 gcc/ipa-cp.cc | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index fd2c4cca1365..637bc49f0482 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -307,14 +307,21 @@ ipcp_lattice::print (FILE * f, bool dump_sources, bool dump_benefits)
 fprintf (f, "\n");
 }
 
-/* If VALUE has all bits set to one, print "-1" to F, otherwise simply print it
-   hexadecimally to F. */
+/* Print VALUE to F in a form which in usual cases does not take thousands of
+   characters. */
 
 static void
 ipcp_print_widest_int (FILE *f, const widest_int &value)
 {
   if (wi::eq_p (wi::bit_not (value), 0))
 fprintf (f, "-1");
+  else if (wi::eq_p (wi::bit_not (wi::bit_or (value,
+ wi::sub (wi::lshift (1, 128),
+  1))), 0))
+{
+  fprintf (f, "all ones folled by ");
+  print_hex (wi::bit_and (value, wi::sub (wi::lshift (1, 128), 1)), f);
+}
   else
 print_hex (value, f);
 }
@@ -331,7 +338,7 @@ ipcp_bits_lattice::print (FILE *f)
   fprintf (f, " Bits: value = ");
   ipcp_print_widest_int (f, get_value ());
   fprintf (f, ", mask = ");
-  print_hex (get_mask (), f);
+  ipcp_print_widest_int (f, get_mask ());
   fprintf (f, "\n");
 }
 }
@@ -6437,7 +6444,7 @@ ipcp_store_vr_results (void)
  fprintf (dump_file, " param %i: value = ", i);
  ipcp_print_widest_int (dump_file, bits->get_value ());
  fprintf (dump_file, ", mask = ");
- print_hex (bits->get_mask (), dump_file);
+ ipcp_print_widest_int (dump_file, bits->get_mask ());
  fprintf (dump_file, "\n");
}
 }


[gcc r15-9430] ipa-cp: Use the collected pass-through types to propagate constants (PR118097)

2025-04-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:6b6611f81476b6375c90859d85331c2981a2ce51

commit r15-9430-g6b6611f81476b6375c90859d85331c2981a2ce51
Author: Martin Jambor 
Date:   Mon Apr 14 14:21:15 2025 +0200

ipa-cp: Use the collected pass-through types to propagate constants (PR118097)

This patch revisits the fix for PR 118097 and instead of deducing the
necessary operation type it just uses the value collected and streamed
by an earlier patch.

It is bigger than the ones for propagating value ranges and known bits
because we track constants both in parameters themselves and also in
memory they point to or within aggregates, we clone functions for them
and we do fancy things for some types of recursive calls.

In the case of constants in aggregates or passed by reference, the
situation should not change because the code creating jump functions
for them does not allow type-casts, unlike for the plain ones.
However, this patch changes how we handle them for the sake of
consistency and also so that we can try and eliminate this limitation
in the next stage 1.
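
Why the operation type has to be known rather than guessed can be seen
in a toy model, which is only a sketch and not how GCC folds
constants: the same addition of the same constants gives different
results depending on the type in which it is carried out.  The
function add_in_type is a made-up name for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Perform INPUT + OPERAND in OpType and return the result widened back.
// This stands in for an arithmetic pass-through jump function whose
// operation type is OpType.
template <typename OpType>
long long add_in_type(long long input, long long operand)
{
  OpType a = static_cast<OpType>(input);
  OpType b = static_cast<OpType>(operand);
  // The sum is truncated to OpType before being widened again.
  return static_cast<long long>(static_cast<OpType>(a + b));
}
```

With an 8-bit unsigned operation type, add_in_type<std::uint8_t>(255, 1)
wraps around to 0, whereas add_in_type<std::int32_t>(255, 1) is 256, so
propagating a constant through the operation needs the recorded type.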

gcc/ChangeLog:

2025-03-20  Martin Jambor  

PR ipa/118097
* ipa-cp.cc (ipa_get_jf_arith_result): Require res_operand for
anything except NOP_EXPR or ADDR_EXPR, document it and remove the code
trying to deduce it.
(ipa_value_from_jfunc): Use the stored and streamed type of arithmetic
pass-through functions.
(ipa_agg_value_from_jfunc): Use the stored and streamed type of
arithmetic pass-through functions, convert to the type used to store
the value if necessary.
(get_val_across_arith_op): New parameter op_type, pass it to
ipa_get_jf_arith_result.
(propagate_vals_across_arith_jfunc): New parameter op_type, pass it to
get_val_across_arith_op.
(propagate_vals_across_pass_through): Use the stored and streamed type
of arithmetic pass-through functions.
(propagate_aggregate_lattice): Likewise.
(push_agg_values_for_index_from_edge): Use the stored and streamed
type of arithmetic pass-through functions, convert to the type used to
store the value if necessary.

Diff:
---
 gcc/ipa-cp.cc | 94 +--
 1 file changed, 52 insertions(+), 42 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 21033c666bf4..26b1496f29bb 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1478,10 +1478,12 @@ ipacp_value_safe_for_type (tree param_type, tree value)
 return NULL_TREE;
 }
 
-/* Return the result of a (possibly arithmetic) operation on the constant value
-   INPUT.  OPERAND is 2nd operand for binary operation.  RES_TYPE is the type
-   in which any operation is to be performed.  Return NULL_TREE if that cannot
-   be determined or be considered an interprocedural invariant.  */
+/* Return the result of a (possibly arithmetic) operation determined by OPCODE
+   on the constant value INPUT.  OPERAND is 2nd operand for binary operation
+   and is required for binary operations.  RES_TYPE, required when opcode is
+   not NOP_EXPR, is the type in which any operation is to be performed.  Return
+   NULL_TREE if that cannot be determined or be considered an interprocedural
+   invariant.  */
 
 static tree
 ipa_get_jf_arith_result (enum tree_code opcode, tree input, tree operand,
@@ -1502,16 +1504,6 @@ ipa_get_jf_arith_result (enum tree_code opcode, tree input, tree operand,
return NULL_TREE;
 }
 
-  if (!res_type)
-{
-  if (TREE_CODE_CLASS (opcode) == tcc_comparison)
-   res_type = boolean_type_node;
-  else if (expr_type_first_operand_type_p (opcode))
-   res_type = TREE_TYPE (input);
-  else
-   return NULL_TREE;
-}
-
   if (TREE_CODE_CLASS (opcode) == tcc_unary)
 res = fold_unary (opcode, res_type, input);
   else
@@ -1595,7 +1587,10 @@ ipa_value_from_jfunc (class ipa_node_params *info, struct ipa_jump_func *jfunc,
return NULL_TREE;
  enum tree_code opcode = ipa_get_jf_pass_through_operation (jfunc);
  tree op2 = ipa_get_jf_pass_through_operand (jfunc);
- tree cstval = ipa_get_jf_arith_result (opcode, input, op2, NULL_TREE);
+ tree op_type
+   = (opcode == NOP_EXPR) ? NULL_TREE
+   : ipa_get_jf_pass_through_op_type (jfunc);
+ tree cstval = ipa_get_jf_arith_result (opcode, input, op2, op_type);
  return ipacp_value_safe_for_type (parm_type, cstval);
}
   else
@@ -1905,10 +1900,11 @@ ipa_agg_value_from_jfunc (ipa_node_params *info, cgraph_node *node,
return NULL_TREE;
 }
 
-  return ipa_get_jf_arith_result (item->value.pass_through.operation,
- value,
- item->value.pass_through.operand,
-   

[gcc r12-11080] Add test-case for PR118924

2025-04-30 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:81b30ef214690b6521753293bf2fcb2339055b54

commit r12-11080-g81b30ef214690b6521753293bf2fcb2339055b54
Author: Martin Jambor 
Date:   Tue Apr 29 18:24:29 2025 +0200

Add test-case for PR118924

On master, the testcase for this issue is part of a commit that I do
not plan to backport to GCC 12.  Because my previous commit avoids the
issue nevertheless, this commit backports just the testcase.

gcc/testsuite/ChangeLog:

2025-04-29  Martin Jambor  

PR tree-optimization/118924
* g++.dg/tree-ssa/pr118924.C: New test.

Diff:
---
 gcc/testsuite/g++.dg/tree-ssa/pr118924.C | 29 +
 1 file changed, 29 insertions(+)

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr118924.C b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C
new file mode 100644
index ..c95eacafc9ce
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-std=c++17 -O2" } */
+
+template <int Size> struct Vector {
+  int m_data[Size];
+  Vector(int, int, int) {}
+};
+enum class E { POINTS, LINES, TRIANGLES };
+
+__attribute__((noipa))
+void getName(E type) {
+  static E check = E::POINTS;
+  if (type == check)
+check = (E)((int)check + 1);
+  else
+__builtin_abort ();
+}
+
+int main() {
+  int arr[]{0, 1, 2};
+  for (auto dim : arr) {
+Vector<3> localInvs(1, 1, 1);
+localInvs.m_data[dim] = 8;
+  }
+  E types[] = {E::POINTS, E::LINES, E::TRIANGLES};
+  for (auto primType : types)
+getName(primType);
+  return 0;
+}


[gcc r16-420] ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)

2025-05-06 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:fb5829a01651d427a63a12c44ecc8baa47dbfc83

commit r16-420-gfb5829a01651d427a63a12c44ecc8baa47dbfc83
Author: Martin Jambor 
Date:   Tue May 6 17:28:43 2025 +0200

ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)

As described in PR 119852, the output of -fdump-ipa-clones can contain
"(null)" as the suffix/reason for cloning when we need to create a
clone to hold the original function during recursive inlining.  Such a
clone is never output and so should not be part of the dump output
either.
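
The behavior after the fix can be sketched in isolation as follows.
This is a simplified stand-in, not the GCC function: Node,
put_node and dump_clone_record are hypothetical names, and only the
field order of the "Callgraph clone" record is taken from the format
string in the diff below.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the data the dump takes from a cgraph_node.
struct Node
{
  std::string asm_name;
  int order;
  std::string file;
  int line;
  int column;
};

// Write the five per-node fields separated by semicolons.
static void put_node(std::ostream &os, const Node &n)
{
  os << n.asm_name << ';' << n.order << ';' << n.file << ';'
     << n.line << ';' << n.column;
}

// Emit one clone record, or nothing when SUFFIX is null (a temporary
// helper clone) or when no dump stream is open.  Returns true if a
// record was written.
bool dump_clone_record(std::ostream *dump, const Node &original,
                       const Node &clone, const char *suffix)
{
  if (!suffix || !dump)
    return false;
  *dump << "Callgraph clone;";
  put_node(*dump, original);
  *dump << ';';
  put_node(*dump, clone);
  *dump << ';' << suffix << '\n';
  return true;
}
```

Passing a null suffix leaves the stream untouched, so no "(null)"
reason can ever appear in the output.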

gcc/ChangeLog:

2025-04-23  Martin Jambor  

PR ipa/119852
* cgraphclones.cc (dump_callgraph_transformation): Document the
function.  Do not dump if suffix is NULL.

gcc/testsuite/ChangeLog:

2025-04-23  Martin Jambor  

PR ipa/119852
* gcc.dg/ipa/pr119852.c: New test.

Diff:
---
 gcc/cgraphclones.cc | 10 +++-
 gcc/testsuite/gcc.dg/ipa/pr119852.c | 50 +
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index e6223fa1f5cc..bf5bc41cde9c 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -307,12 +307,20 @@ cgraph_node::expand_all_artificial_thunks ()
   e = e->next_caller;
 }
 
+/* Dump information about creation of a call graph node clone to the dump file
+   created by the -fdump-ipa-clones option.  ORIGINAL is the function being
+   cloned, CLONE is the new clone.  SUFFIX is a string that helps identify the
+   reason for cloning, often it is the suffix used by a particular IPA pass to
+   create unique function names.  SUFFIX can be NULL and in that case the
+   dumping will not take place, which must be the case only for helper clones
+   which will never be emitted to the output.  */
+
 void
 dump_callgraph_transformation (const cgraph_node *original,
   const cgraph_node *clone,
   const char *suffix)
 {
-  if (symtab->ipa_clones_dump_file)
+  if (suffix && symtab->ipa_clones_dump_file)
 {
   fprintf (symtab->ipa_clones_dump_file,
   "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n",
diff --git a/gcc/testsuite/gcc.dg/ipa/pr119852.c b/gcc/testsuite/gcc.dg/ipa/pr119852.c
new file mode 100644
index ..eab8d21293cc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr119852.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-ipa-clones"  } */
+
+typedef struct rtx_def *rtx;
+enum rtx_code {
+  LAST_AND_UNUSED_RTX_CODE};
+extern const char * const rtx_format[((int) LAST_AND_UNUSED_RTX_CODE)];
+struct rtx_def {
+  enum rtx_code code;
+};
+typedef int (*rtx_function) (rtx *, void *);
+extern int for_each_rtx (rtx *, rtx_function, void *);
+int
+replace_label (rtx *x, void *data)
+{
+  rtx l = *x;
+  if (l == (rtx) 0)
+{
+ {
+   rtx new_c, new_l;
+   for_each_rtx (&new_c, replace_label, data);
+ }
+}
+}
+static int
+for_each_rtx_1 (rtx exp, int n, rtx_function f, void *data)
+{
+  int result, i, j;
+  const char *format = (rtx_format[(int) (((enum rtx_code) (exp)->code))]);
+  rtx *x;
+  for (; format[n] != '\0'; n++)
+{
+  switch (format[n])
+ {
+ case 'e':
+   result = (*f) (x, data);
+ {
+   result = for_each_rtx_1 (*x, i, f, data);
+ }
+ }
+}
+}
+int
+for_each_rtx (rtx *x, rtx_function f, void *data)
+{
+  int i;
+  return for_each_rtx_1 (*x, i, f, data);
+}
+
+/* { dg-final { scan-ipa-dump-not "(null)"  "ipa-clones"  } } */


[gcc r16-419] Document option -fdump-ipa-clones

2025-05-06 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:6ecc2fee06bdd60da0e9b3fe6660b553dbdca3ca

commit r16-419-g6ecc2fee06bdd60da0e9b3fe6660b553dbdca3ca
Author: Martin Jambor 
Date:   Tue May 6 17:28:42 2025 +0200

Document option -fdump-ipa-clones

I have noticed that the option -fdump-ipa-clones is not documented
although there are users who depend on it.  This patch adds the
missing documentation along with the description of the information it
dumps and the format it uses.

I am never quite sure which of the texinfo mark-ups is the most
appropriate in which situation.  I will of course incorporate any
feedback on this as well as on the general wording of the text.

After we settle on a version, I'd like to backport the documentation
also at least to GCC 15, 14 and 13.

Is this OK for master and the branches, or what should be changed?

Thanks,

Martin

gcc/ChangeLog:

2025-04-23  Martin Jambor  

* doc/invoke.texi (Developer Options): Document -fdump-ipa-clones.

Diff:
---
 gcc/doc/invoke.texi | 87 +
 1 file changed, 87 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 32bc45725de9..90cbb516bc46 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -20774,6 +20774,93 @@ By default, the dump will contain messages about successful
 optimizations (equivalent to @option{-optimized}) together with
 low-level details about the analysis.
 
+@opindex fdump-ipa-clones
+@item -fdump-ipa-clones
+
+Create a dump file containing information about creation of call graph
+node clones and removals of call graph nodes during inter-procedural
+optimizations and transformations.  Its main intended use is that tools
+that create live-patches can determine the set of functions that need to
+be live-patched to completely replace a particular function (see
+@option{-flive-patching}).  The file name is generated by appending
+suffix @code{ipa-clones} to the source file name, and the file is
+created in the same directory as the output file.  Each entry in the
+file is on a separate line containing semicolon separated fields.
+
+In the case of call graph clone creation, the individual fields are:
+
+@enumerate
+@item
+String @code{Callgraph clone}.
+
+@item
+Name of the function being cloned as it is presented to the assembler.
+
+@item
+A number that uniquely represents the function being cloned in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the function is defined.
+
+@item
+The line on which the function definition is located.
+
+@item
+The column where the function definition is located.
+
+@item
+Name of the new function clone as it is presented to the assembler.
+
+@item
+A number that uniquely represents the new function clone in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the source code location of the
+new clone points to.
+
+@item
+The line to which the source code location of the new clone points to.
+
+@item
+The column to which the source code location of the new clone points to.
+
+@item
+A string that determines the reason for cloning.
+
+@end enumerate
+
+In the case of call graph clone removal, the individual fields are:
+
+@enumerate
+@item
+String @code{Callgraph removal}.
+
+@item
+Name of the function being removed as it would be presented to the assembler.
+
+@item
+A number that uniquely represents the function being cloned in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the function is defined.
+
+@item
+The line on which the function definition is located.
+
+@item
+The column where the function definition is located.
+
+@end enumerate
+
 @opindex fdump-lang
 @item -fdump-lang
 Dump language-specific information.  The file name is made by appending
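
A tool consuming the file documented by the patch above might split
each line on semicolons and tell the two record kinds apart by the
first field and the field count (twelve for a clone, six for a
removal).  The following is a minimal sketch under that reading of the
documentation; split_fields, RecordKind and classify are made-up
names, and the sketch assumes that file names contain no semicolons.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split LINE into its semicolon-separated fields.
std::vector<std::string> split_fields(const std::string &line)
{
  std::vector<std::string> fields;
  std::string::size_type start = 0;
  for (;;)
    {
      std::string::size_type end = line.find(';', start);
      if (end == std::string::npos)
        {
          fields.push_back(line.substr(start));
          return fields;
        }
      fields.push_back(line.substr(start, end - start));
      start = end + 1;
    }
}

enum class RecordKind { Clone, Removal, Unknown };

// Classify a record by its leading string and number of fields.
RecordKind classify(const std::vector<std::string> &f)
{
  if (f.size() == 12 && f[0] == "Callgraph clone")
    return RecordKind::Clone;
  if (f.size() == 6 && f[0] == "Callgraph removal")
    return RecordKind::Removal;
  return RecordKind::Unknown;
}
```

For a clone record, the last field (index 11) is then the string giving
the reason for cloning.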


[gcc r16-422] ipa: Drop the default value of suffix parameter of create_clone (PR119852)

2025-05-06 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:76c882e341cb330a4e9f677a8c3541d573820255

commit r16-422-g76c882e341cb330a4e9f677a8c3541d573820255
Author: Martin Jambor 
Date:   Tue May 6 17:28:44 2025 +0200

ipa: Drop the default value of suffix parameter of create_clone (PR119852)

In PR 119852 we agreed that since the NULL-ness of the suffix
parameter should prevent creation of a record in the ipa-clones
dump (which is implemented by a previous patch), it should not default
to NULL.

gcc/ChangeLog:

2025-04-25  Martin Jambor  

PR ipa/119852
* cgraph.h (cgraph_node::create_clone): Remove the default value of
argument suffix.  Update function comment.
* cgraphclones.cc (cgraph_node::create_clone): Update function comment.
* ipa-inline-transform.cc (clone_inlined_nodes): Pass NULL to suffix
of create_clone explicitly.
* ipa-inline.cc (recursive_inlining): Likewise.
* lto-cgraph.cc (input_node): Likewise.

Diff:
---
 gcc/cgraph.h| 10 +++---
 gcc/cgraphclones.cc |  7 ++-
 gcc/ipa-inline-transform.cc |  2 +-
 gcc/ipa-inline.cc   |  2 +-
 gcc/lto-cgraph.cc   |  2 +-
 5 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 1a59bf609b51..f4ee29e998c3 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -965,15 +965,19 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
  If the new node is being inlined into another one, NEW_INLINED_TO should be
  the outline function the new one is (even indirectly) inlined to.
  All hooks will see this in node's inlined_to, when invoked.
- Can be NULL if the node is not inlined.  SUFFIX is string that is appended
- to the original name.  */
+ Should be NULL if the node is not inlined.
+
+ SUFFIX is string that is appended to the original name, it should only be
+ NULL if NEW_INLINED_TO is not NULL or if the clone being created is
+ temporary and a record about it should not be added into the ipa-clones
+ dump file.  */
   cgraph_node *create_clone (tree decl, profile_count count,
 bool update_original,
 vec redirect_callers,
 bool call_duplication_hook,
 cgraph_node *new_inlined_to,
 ipa_param_adjustments *param_adjustments,
-const char *suffix = NULL);
+const char *suffix);
 
   /* Create callgraph node clone with new declaration.  The actual body will be
  copied later at compilation stage.  The name of the new clone will be
diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index cb457e5f457f..b45ac4977331 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -366,9 +366,14 @@ localize_profile (cgraph_node *n)
 
If the new node is being inlined into another one, NEW_INLINED_TO should be
the outline function the new one is (even indirectly) inlined to.  All hooks
-   will see this in node's inlined_to, when invoked.  Can be NULL if the
+   will see this in node's inlined_to, when invoked.  Should be NULL if the
node is not inlined.
 
+   SUFFIX is string that is appended to the original name, it should only be
+   NULL if NEW_INLINED_TO is not NULL or if the clone being created is
+   temporary and a record about it should not be added into the ipa-clones dump
+   file.
+
If PARAM_ADJUSTMENTS is non-NULL, the parameter manipulation information
will be overwritten by the new structure.  Otherwise the new node will
share parameter manipulation information with the original node.  */
diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc
index e00887be481b..46b8e5bb6790 100644
--- a/gcc/ipa-inline-transform.cc
+++ b/gcc/ipa-inline-transform.cc
@@ -225,7 +225,7 @@ clone_inlined_nodes (struct cgraph_edge *e, bool duplicate,
   e->count,
   update_original, vNULL, true,
   inlining_into,
-  NULL);
+  NULL, NULL);
  n->used_as_abstract_origin = e->callee->used_as_abstract_origin;
  e->redirect_callee (n);
}
diff --git a/gcc/ipa-inline.cc b/gcc/ipa-inline.cc
index 38fdbfde1b3b..35e5496d8463 100644
--- a/gcc/ipa-inline.cc
+++ b/gcc/ipa-inline.cc
@@ -1865,7 +1865,7 @@ recursive_inlining (struct cgraph_edge *edge,
{
  /* We need original clone to copy around.  */
  master_clone = node->create_clone (node->decl, node->count,
-   false, vNULL, true, NULL, NULL);
+   false, vNULL, true, NULL, NULL, NULL);
  for (e = master_clone->callees; e; e = e->next_callee)
if (!e->inline_failed)
  clone_inlined_nodes (e, 

[gcc r16-421] ipa: Fix create_version_clone_with_body declaration and comment

2025-05-06 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:1eaee43dc0c6292ce865b460d52474ca14ea1d71

commit r16-421-g1eaee43dc0c6292ce865b460d52474ca14ea1d71
Author: Martin Jambor 
Date:   Tue May 6 17:28:43 2025 +0200

ipa: Fix create_version_clone_with_body declaration and comment

I noticed that the name of the fifth parameter of
cgraph_node::create_version_clone_with_body is different in the class
definition in cgraph.h and in the actual member function definition in
cgraphclones.cc.  The former (clone_name) is misleading and so this
patch changes it to the latter (suffix) which is also used in related
functions.

The patch also updates the function comment in both places because it
clearly became out of date.

gcc/ChangeLog:

2025-04-25  Martin Jambor  

* cgraph.h (cgraph_node::create_version_clone_with_body): Fix function
comment.  Change the name of clone_name to suffix, in line with the
function definition.
* cgraphclones.cc (cgraph_node::create_version_clone_with_body): Fix
function comment.

Diff:
---
 gcc/cgraph.h| 9 +
 gcc/cgraphclones.cc | 7 ---
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index f7b67ed0a6c5..1a59bf609b51 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1020,11 +1020,12 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
  TREE_MAP is a mapping of tree nodes we want to replace with
  new ones (according to results of prior analysis).
 
- If non-NULL ARGS_TO_SKIP determine function parameters to remove
- from new version.
- If SKIP_RETURN is true, the new version will return void.
+ If non-NULL PARAM_ADJUSTMENTS determine how function formal parameters
+ should be modified in the new version and if it should return void.
  If non-NULL BLOCK_TO_COPY determine what basic blocks to copy.
  If non_NULL NEW_ENTRY determine new entry BB of the clone.
+ SUFFIX is a string that will be used to create a new name for the new
+ function.
 
  If TARGET_ATTRIBUTES is non-null, when creating a new declaration,
  add the attributes to DECL_ATTRIBUTES.  And call valid_attribute_p
@@ -1039,7 +1040,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
 (vec redirect_callers,
  vec *tree_map,
  ipa_param_adjustments *param_adjustments,
- bitmap bbs_to_copy, basic_block new_entry_block, const char *clone_name,
+ bitmap bbs_to_copy, basic_block new_entry_block, const char *suffix,
  tree target_attributes = NULL_TREE, bool version_decl = true);
 
   /* Insert a new cgraph_function_version_info node into cgraph_fnver_htab
diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index bf5bc41cde9c..cb457e5f457f 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -1002,11 +1002,12 @@ cgraph_node::create_version_clone (tree new_decl,
TREE_MAP is a mapping of tree nodes we want to replace with
new ones (according to results of prior analysis).
 
-   If non-NULL ARGS_TO_SKIP determine function parameters to remove
-   from new version.
-   If SKIP_RETURN is true, the new version will return void.
+   If non-NULL PARAM_ADJUSTMENTS determine how function formal parameters
+   should be modified in the new version and if it should return void.
If non-NULL BLOCK_TO_COPY determine what basic blocks to copy.
If non_NULL NEW_ENTRY determine new entry BB of the clone.
+   SUFFIX is a string that will be used to create a new name for the new
+   function.
 
If TARGET_ATTRIBUTES is non-null, when creating a new declaration,
add the attributes to DECL_ATTRIBUTES.  And call valid_attribute_p


[gcc r13-9612] Fix a pasto in ao_compare::compare_ao_refs

2025-04-23 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:7495787e31c4e5ee6a04c8f05d227a4f0eb7a345

commit r13-9612-g7495787e31c4e5ee6a04c8f05d227a4f0eb7a345
Author: Martin Jambor 
Date:   Tue Mar 11 14:52:44 2025 +0100

Fix a pasto in ao_compare::compare_ao_refs

When reading the function ao_compare::compare_ao_refs I came across
what I believe to be a copy-and-paste error which this patch fixes.
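
The shape of the corrected code can be illustrated with a small
stand-alone sketch.  It is not the GCC function: the paths are reduced
to vectors of flags, and Skipped and last_marked are hypothetical
names.  The point is that the second loop restarts the shared counter
and stores into its own result variable.

```cpp
#include <cassert>
#include <vector>

struct Skipped
{
  int first;   // Index of the last marked element of the first path.
  int second;  // Index of the last marked element of the second path.
};

// Walk both paths and record, for each, the index of its last marked
// element, or -1 if there is none.
Skipped last_marked(const std::vector<bool> &p1, const std::vector<bool> &p2)
{
  int n1 = -1, n2 = -1;
  int i = 0;
  for (bool marked : p1)
    {
      if (marked)
        n1 = i;
      i++;
    }
  i = 0;  // Restart the counter for the second walk.
  for (bool marked : p2)
    {
      if (marked)
        n2 = i;  // Store into the second result, not the first.
      i++;
    }
  return {n1, n2};
}
```

Without the reset, or with the second loop writing to n1, the result
for the first path would be clobbered and the second would stay unset.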

gcc/ChangeLog:

2025-03-10  Martin Jambor  

* tree-ssa-alias.cc (ao_compare::compare_ao_refs): Fix a
copy-and-paste error.

(cherry picked from commit dc47161c1f32c3f27d1157ba0de9d98ea1b7fc82)

Diff:
---
 gcc/tree-ssa-alias.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
index dcb65648ec04..a784aebd6537 100644
--- a/gcc/tree-ssa-alias.cc
+++ b/gcc/tree-ssa-alias.cc
@@ -4292,12 +4292,13 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref *ref2,
c1 = p1, nskipped1 = i;
   i++;
 }
+  i = 0;
   for (tree p2 = ref2->ref; handled_component_p (p2); p2 = TREE_OPERAND (p2, 0))
 {
   if (component_ref_to_zero_sized_trailing_array_p (p2))
end_struct_ref2 = p2;
   if (ends_tbaa_access_path_p (p2))
-   c2 = p2, nskipped1 = i;
+   c2 = p2, nskipped2 = i;
   i++;
 }


[gcc r13-9622] sra: Avoid creating TBAA hazards (PR118924)

2025-04-29 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:087d91f9d4e97de66955caa94b42e91180d02d78

commit r13-9622-g087d91f9d4e97de66955caa94b42e91180d02d78
Author: Martin Jambor 
Date:   Mon Apr 7 13:32:09 2025 +0200

sra: Avoid creating TBAA hazards (PR118924)

The testcase in PR 118924, when compiled on Aarch64, contains a
GIMPLE aggregate assignment statement between different types which
are types_compatible_p but behave differently for the purposes of
alias analysis.

SRA replaces the statement with a series of scalar assignments whose
LHS access chains are, however, modeled on the RHS type, so they do
not alias with subsequent reads and are therefore removed by DSE.

SRA clearly gets its "same_access_path" logic subtly wrong.  One issue
is that the same_access_path_p function probably should be implemented
more along the lines of (parts of ao_compare::compare_ao_refs) instead
of internally relying on operand_equal_p.  That is however not the
problem in the PR and so I will deal with it only later.

The issue here is that even when the access path is the same, it must
not be bolted on an aggregate type that does not match.  This patch
does that, taking just one simple function from the
ao_compare::compare_ao_refs machinery and using it to detect the
situation.  The rest is just merging the information in between
accesses of the same access group.

I looked at how many times we come across such an assignment during
"make stage2-bubble" of GCC (configured with only C and C++ and
without multilib and libsanitizers) and on x86_64 there were 87924
such assignments (though now I realize not all of them had to be
aggregate), so they do happen.  The patch leads to about a 5% increase
of cases where we don't use an "access path" but resort to a
MEM_REF (from 90209 to 95204).  On Aarch64, there were 92268 such
assignments and the increase of falling back to MEM_REFs was by
4% (but from a bigger base 132983 to 107991).

gcc/ChangeLog:

2025-04-04  Martin Jambor  

PR tree-optimization/118924
* tree-ssa-alias-compare.h (types_equal_for_same_type_for_tbaa_p):
Declare.
* tree-ssa-alias.cc: Include ipa-utils.h.
(types_equal_for_same_type_for_tbaa_p): New public overloaded variant.
* tree-sra.cc: Include tree-ssa-alias-compare.h.
(create_access): Initialize grp_same_access_path to true.
(build_accesses_from_assign): Detect tbaa hazards and clear
grp_same_access_path fields of involved accesses when they occur.
(sort_and_splice_var_accesses): Take previous values of
grp_same_access_path into account.

gcc/testsuite/ChangeLog:

2025-03-25  Martin Jambor  

PR tree-optimization/118924
* g++.dg/tree-ssa/pr118924.C: New test.

(cherry picked from commit 07d243670020b339380194f6125cde87ada56148)

Diff:
---
 gcc/testsuite/g++.dg/tree-ssa/pr118924.C | 29 +
 gcc/tree-sra.cc  | 17 ++---
 gcc/tree-ssa-alias-compare.h |  2 ++
 gcc/tree-ssa-alias.cc| 13 -
 4 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr118924.C b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C
new file mode 100644
index ..c95eacafc9ce
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-std=c++17 -O2" } */
+
+template <int Size> struct Vector {
+  int m_data[Size];
+  Vector(int, int, int) {}
+};
+enum class E { POINTS, LINES, TRIANGLES };
+
+__attribute__((noipa))
+void getName(E type) {
+  static E check = E::POINTS;
+  if (type == check)
+check = (E)((int)check + 1);
+  else
+__builtin_abort ();
+}
+
+int main() {
+  int arr[]{0, 1, 2};
+  for (auto dim : arr) {
+Vector<3> localInvs(1, 1, 1);
+localInvs.m_data[dim] = 8;
+  }
+  E types[] = {E::POINTS, E::LINES, E::TRIANGLES};
+  for (auto primType : types)
+getName(primType);
+  return 0;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 8a9cbeec4908..11dca3c026b2 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -100,6 +100,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "tree-sra.h"
 #include "opts.h"
+#include "tree-ssa-alias-compare.h"
 
 /* Enumeration of all aggregate reductions we can do.  */
 enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
@@ -968,6 +969,7 @@ create_access (tree expr, gimple *stmt, bool write)
   access->type = TREE_TYPE (expr);
   access->write = write;
   access->grp_unscalarizable_region = unscalarizable_region;
+  access->grp_same_access_path = true;
   access->stmt = stmt;
   access->reverse = reverse;
 
@@ -1394,6 +1396,9 @@ build_accesses_from_assign (gimple *stmt)
   racc = build_access_

[gcc r13-9623] sra: Clear grp_same_access_path of accesses created by total scalarization (PR118924)

2025-04-29 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:85792c6234ba8436422b3119bf3aae50d7951b27

commit r13-9623-g85792c6234ba8436422b3119bf3aae50d7951b27
Author: Martin Jambor 
Date:   Mon Apr 7 13:32:10 2025 +0200

sra: Clear grp_same_access_path of accesses created by total scalarization (PR118924)

During analysis of PR 118924 it was discussed that total scalarization
invents access paths (strings of COMPONENT_REFs and possibly even
ARRAY_REFs) which did not exist in the program before and which can
have unintended effects on subsequent AA queries.  Although not doing
that does not mean that SRA cannot create such situations (see the bug
for more info), it has been agreed that not doing this is generally
better.  This patch therefore makes SRA fall back on creating simple
MEM_REFs when accessing components of an aggregate corresponding to
what a SRA variable now represents.

gcc/ChangeLog:

2025-03-26  Martin Jambor  

PR tree-optimization/118924
* tree-sra.cc (create_total_scalarization_access): Set
grp_same_access_path flag to zero.

(cherry picked from commit 40445711b8af113ef423d8bcac1a7ce1c47f62d7)

Diff:
---
 gcc/tree-sra.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 11dca3c026b2..ec499fdd5109 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -3253,7 +3253,7 @@ create_total_scalarization_access (struct access *parent, HOST_WIDE_INT pos,
   access->grp_write = parent->grp_write;
   access->grp_total_scalarization = 1;
   access->grp_hint = 1;
-  access->grp_same_access_path = path_comparable_for_same_access (expr);
+  access->grp_same_access_path = 0;
   access->reverse = reverse_storage_order_for_component_p (expr);
 
   access->next_sibling = next_sibling;


[gcc r15-9633] ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)

2025-05-07 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:77780c31485eeb71e9fabf8ea9d4b1af0c3be595

commit r15-9633-g77780c31485eeb71e9fabf8ea9d4b1af0c3be595
Author: Martin Jambor 
Date:   Tue May 6 17:28:43 2025 +0200

ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)

As described in PR 119852, the output of -fdump-ipa-clones can contain
"(null)" as the suffix/reason for cloning when we need to create a
clone to hold the original function during recursive inlining.  Such a
clone is never output and so should not be part of the dump output
either.
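
The "(null)" text typically appears when a null pointer reaches a %s
conversion.  A minimal sketch of the guard this patch adds, assuming a
simplified record layout; format_clone_record and its shortened field list
are hypothetical and not GCC code:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the dump routine: returns the record that would
// be written, or std::nullopt when SUFFIX is null (a helper clone that is
// never emitted and so should leave no trace in the dump).
std::optional<std::string>
format_clone_record (const std::string &original, const std::string &clone,
                     const char *suffix)
{
  // Assumption of this sketch: assembler names are never empty.
  assert (!original.empty () && !clone.empty ());
  if (!suffix)
    return std::nullopt;
  // Shortened record; the real dump line carries more fields.
  return "Callgraph clone;" + original + ";" + clone + ";" + suffix;
}
```

Returning an empty optional rather than printing a placeholder mirrors the
patch's choice to skip the record entirely.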

gcc/ChangeLog:

2025-04-23  Martin Jambor  

PR ipa/119852
* cgraphclones.cc (dump_callgraph_transformation): Document the
function.  Do not dump if suffix is NULL.

gcc/testsuite/ChangeLog:

2025-04-23  Martin Jambor  

PR ipa/119852
* gcc.dg/ipa/pr119852.c: New test.

(cherry picked from commit fb5829a01651d427a63a12c44ecc8baa47dbfc83)

Diff:
---
 gcc/cgraphclones.cc | 10 +++-
 gcc/testsuite/gcc.dg/ipa/pr119852.c | 50 +
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index e6223fa1f5cc..bf5bc41cde9c 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -307,12 +307,20 @@ cgraph_node::expand_all_artificial_thunks ()
   e = e->next_caller;
 }
 
+/* Dump information about creation of a call graph node clone to the dump file
+   created by the -fdump-ipa-clones option.  ORIGINAL is the function being
+   cloned, CLONE is the new clone.  SUFFIX is a string that helps identify the
+   reason for cloning, often it is the suffix used by a particular IPA pass to
+   create unique function names.  SUFFIX can be NULL and in that case the
+   dumping will not take place, which must be the case only for helper clones
+   which will never be emitted to the output.  */
+
 void
 dump_callgraph_transformation (const cgraph_node *original,
   const cgraph_node *clone,
   const char *suffix)
 {
-  if (symtab->ipa_clones_dump_file)
+  if (suffix && symtab->ipa_clones_dump_file)
 {
   fprintf (symtab->ipa_clones_dump_file,
   "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n",
diff --git a/gcc/testsuite/gcc.dg/ipa/pr119852.c b/gcc/testsuite/gcc.dg/ipa/pr119852.c
new file mode 100644
index ..eab8d21293cc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr119852.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-ipa-clones"  } */
+
+typedef struct rtx_def *rtx;
+enum rtx_code {
+  LAST_AND_UNUSED_RTX_CODE};
+extern const char * const rtx_format[((int) LAST_AND_UNUSED_RTX_CODE)];
+struct rtx_def {
+  enum rtx_code code;
+};
+typedef int (*rtx_function) (rtx *, void *);
+extern int for_each_rtx (rtx *, rtx_function, void *);
+int
+replace_label (rtx *x, void *data)
+{
+  rtx l = *x;
+  if (l == (rtx) 0)
+{
+ {
+   rtx new_c, new_l;
+   for_each_rtx (&new_c, replace_label, data);
+ }
+}
+}
+static int
+for_each_rtx_1 (rtx exp, int n, rtx_function f, void *data)
+{
+  int result, i, j;
+  const char *format = (rtx_format[(int) (((enum rtx_code) (exp)->code))]);
+  rtx *x;
+  for (; format[n] != '\0'; n++)
+{
+  switch (format[n])
+ {
+ case 'e':
+   result = (*f) (x, data);
+ {
+   result = for_each_rtx_1 (*x, i, f, data);
+ }
+ }
+}
+}
+int
+for_each_rtx (rtx *x, rtx_function f, void *data)
+{
+  int i;
+  return for_each_rtx_1 (*x, i, f, data);
+}
+
+/* { dg-final { scan-ipa-dump-not "(null)"  "ipa-clones"  } } */


[gcc r15-9632] Document option -fdump-ipa-clones

2025-05-07 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:99e2f1138c61e851cfa08712aa73e2689d314fd1

commit r15-9632-g99e2f1138c61e851cfa08712aa73e2689d314fd1
Author: Martin Jambor 
Date:   Tue May 6 17:28:42 2025 +0200

Document option -fdump-ipa-clones

I have noticed that the option -fdump-ipa-clones is not documented
although there are users who depend on it.  This patch adds the
missing documentation along with the description of the information it
dumps and the format it uses.

I am never quite sure which of the texinfo mark-ups is the most
appropriate in which situation, so I'll of course incorporate any
feedback on this as well as on the general wording of the text.

After we settle on a version, I'd like to backport the documentation
also at least to GCC 15, 14 and 13.

Is it perhaps OK for master and the branches or what would better be
changed?

Thanks,

Martin

gcc/ChangeLog:

2025-04-23  Martin Jambor  

* doc/invoke.texi (Developer Options): Document -fdump-ipa-clones.

(cherry picked from commit 6ecc2fee06bdd60da0e9b3fe6660b553dbdca3ca)

Diff:
---
 gcc/doc/invoke.texi | 87 +
 1 file changed, 87 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c2e1bf8031b8..617a3d8ae182 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -20748,6 +20748,93 @@ By default, the dump will contain messages about successful
 optimizations (equivalent to @option{-optimized}) together with
 low-level details about the analysis.
 
+@opindex fdump-ipa-clones
+@item -fdump-ipa-clones
+
+Create a dump file containing information about creation of call graph
+node clones and removals of call graph nodes during inter-procedural
+optimizations and transformations.  Its main intended use is that tools
+that create live-patches can determine the set of functions that need to
+be live-patched to completely replace a particular function (see
+@option{-flive-patching}).  The file name is generated by appending
+suffix @code{ipa-clones} to the source file name, and the file is
+created in the same directory as the output file.  Each entry in the
+file is on a separate line containing semicolon separated fields.
+
+In the case of call graph clone creation, the individual fields are:
+
+@enumerate
+@item
+String @code{Callgraph clone}.
+
+@item
+Name of the function being cloned as it is presented to the assembler.
+
+@item
+A number that uniquely represents the function being cloned in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the function is defined.
+
+@item
+The line on which the function definition is located.
+
+@item
+The column where the function definition is located.
+
+@item
+Name of the new function clone as it is presented to the assembler.
+
+@item
+A number that uniquely represents the new function clone in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the source code location of the
+new clone points to.
+
+@item
+The line to which the source code location of the new clone points to.
+
+@item
+The column to which the source code location of the new clone points to.
+
+@item
+A string that determines the reason for cloning.
+
+@end enumerate
+
+In the case of call graph clone removal, the individual fields are:
+
+@enumerate
+@item
+String @code{Callgraph removal}.
+
+@item
+Name of the function being removed as it would be presented to the assembler.
+
+@item
+A number that uniquely represents the function being cloned in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the function is defined.
+
+@item
+The line on which the function definition is located.
+
+@item
+The column where the function definition is located.
+
+@end enumerate
+
 @opindex fdump-lang
 @item -fdump-lang
 Dump language-specific information.  The file name is made by appending
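
A tool consuming the dump could split each line on semicolons and pick out
the fields documented above.  The following is only a sketch under that
assumption: clone_record, split_fields and parse_clone_line are hypothetical
names, only five of the twelve clone-creation fields are kept, and file names
containing a semicolon are not handled.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Split one dump line (without its trailing newline) on ';'.
static std::vector<std::string>
split_fields (const std::string &line)
{
  std::vector<std::string> fields;
  std::stringstream ss (line);
  std::string field;
  while (std::getline (ss, field, ';'))
    fields.push_back (field);
  return fields;
}

// Hypothetical record type, not a GCC type.
struct clone_record
{
  std::string original_name;  // field 2
  int original_id;            // field 3
  std::string clone_name;     // field 7
  int clone_id;               // field 8
  std::string reason;         // field 12
};

// Parse a "Callgraph clone" line; any other line yields std::nullopt.
// std::stoi throws on malformed numbers, which this sketch does not catch.
std::optional<clone_record>
parse_clone_line (const std::string &line)
{
  std::vector<std::string> f = split_fields (line);
  if (f.size () != 12 || f[0] != "Callgraph clone")
    return std::nullopt;
  clone_record r;
  r.original_name = f[1];
  r.original_id = std::stoi (f[2]);
  r.clone_name = f[6];
  r.clone_id = std::stoi (f[7]);
  r.reason = f[11];
  return r;
}
```

Given the made-up line
"Callgraph clone;foo;12;t.c;3;5;foo.constprop.0;34;t.c;3;5;constprop", the
parser yields original_name "foo" and reason "constprop"; a six-field
"Callgraph removal" line is rejected.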


[gcc r15-7792] ipa-vr: Handle non-conversion unary ops separately from conversions (PR 118785)

2025-03-03 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:d05b64bdd048ffb7f72d97553888934a9bcd13fa

commit r15-7792-gd05b64bdd048ffb7f72d97553888934a9bcd13fa
Author: Martin Jambor 
Date:   Mon Mar 3 14:53:03 2025 +0100

ipa-vr: Handle non-conversion unary ops separately from conversions (PR 118785)

Since we construct arithmetic jump functions even when there is a
type conversion between the operation encoded in the jump function
and the point where the result is passed in a call argument, the IPA
propagation phase must also perform the operation and conversion in
two steps.  IPA-VR had actually been doing it even before for binary
operations but, as PR 118756 exposes, not in the case of unary
operations.  This patch adds the necessary step to rectify that.

Like in the scalar constant case, we depend on
expr_type_first_operand_type_p to determine the type of the result of
the arithmetic operation.  On top of this, the patch special-cases
ABSU_EXPR because it looks useful and so that the PR testcase exercises
the added code-path.  This seems most appropriate for stage 4; in the
long term we should probably stream the types, probably after also
encoding them with a string of expr_eval_op rather than what we have
today.

A check for expr_type_first_operand_type_p was also missing in the
handling of binary ops and the intermediate value_range was
initialized with a wrong type, so I also fixed this.
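
The two-step idea can be illustrated on plain intervals.  This is a sketch
only, assuming 8-bit types and a hypothetical interval struct; it does not
use GCC's value_range or range-op machinery.  The operation is evaluated in
its own result type first, and the conversion to the destination type is a
separate, conservative step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical inclusive interval, wide enough to hold any 8-bit value.
struct interval { std::int64_t lo, hi; };

// Step 1: ABSU of a signed 8-bit range, computed in the unsigned 8-bit
// type, where |-128| = 128 is representable.
interval absu_i8 (interval src)
{
  assert (-128 <= src.lo && src.lo <= src.hi && src.hi <= 127);
  std::int64_t a = src.lo < 0 ? -src.lo : src.lo;
  std::int64_t b = src.hi < 0 ? -src.hi : src.hi;
  // If the range contains zero, the smallest absolute value is zero.
  std::int64_t lo = (src.lo <= 0 && src.hi >= 0) ? 0 : std::min (a, b);
  return { lo, std::max (a, b) };
}

// Step 2: convert an unsigned 8-bit range to signed 8-bit.  When some value
// would wrap, conservatively give up and return the full signed range.
interval u8_to_i8 (interval r)
{
  if (r.hi > 127)
    return { -128, 127 };
  return r;
}
```

For the source range [-128, 5], step 1 gives [0, 128]; folding the
conversion into the same step would have to reason about 128 in a type that
cannot hold it, which is why the sketch keeps the steps apart.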

gcc/ChangeLog:

2025-02-24  Martin Jambor  

PR ipa/118785

* ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Handle non-conversion
unary operations separately before doing any conversions.  Check
expr_type_first_operand_type_p for non-unary operations too.  Fix type
of op_res.

gcc/testsuite/ChangeLog:

2025-02-24  Martin Jambor  

PR ipa/118785
* g++.dg/lto/pr118785_0.C: New test.

Diff:
---
 gcc/ipa-cp.cc | 45 ---
 gcc/testsuite/g++.dg/lto/pr118785_0.C | 14 +++
 2 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 68959f2677ba..3c994f24f540 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1720,8 +1720,45 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr,
   enum tree_code operation = ipa_get_jf_pass_through_operation (jfunc);
   if (TREE_CODE_CLASS (operation) == tcc_unary)
 {
+  value_range op_res;
+  const value_range *inter_vr;
+  if (operation != NOP_EXPR)
+   {
+ /* Since we construct arithmetic jump functions even when there is a
+ type conversion in between the operation encoded in the jump
+ function and when it is passed in a call argument, the IPA
+ propagation phase must also perform the operation and conversion
+ in two separate steps.
+
+TODO: In order to remove the use of expr_type_first_operand_type_p
+predicate we would need to stream the operation type, ideally
+encoding the whole jump function as a series of expr_eval_op
+structures.  */
+
+ tree operation_type;
+ if (expr_type_first_operand_type_p (operation))
+   operation_type = src_type;
+ else if (operation == ABSU_EXPR)
+   operation_type = unsigned_type_for (src_type);
+ else
+   return;
+ op_res.set_varying (operation_type);
+ if (!ipa_vr_operation_and_type_effects (op_res, src_vr, operation,
+ operation_type, src_type))
+   return;
+ if (src_type == dst_type)
+   {
+ vr.intersect (op_res);
+ return;
+   }
+ inter_vr = &op_res;
+ src_type = operation_type;
+   }
+  else
+   inter_vr = &src_vr;
+
   value_range tmp_res (dst_type);
-  if (ipa_vr_operation_and_type_effects (tmp_res, src_vr, operation,
+  if (ipa_vr_operation_and_type_effects (tmp_res, *inter_vr, NOP_EXPR,
 dst_type, src_type))
vr.intersect (tmp_res);
   return;
@@ -1737,10 +1774,12 @@ ipa_vr_intersect_with_arith_jfunc (vrange &vr,
   tree operation_type;
   if (TREE_CODE_CLASS (operation) == tcc_comparison)
 operation_type = boolean_type_node;
-  else
+  else if (expr_type_first_operand_type_p (operation))
 operation_type = src_type;
+  else
+return;
 
-  value_range op_res (dst_type);
+  value_range op_res (operation_type);
   if (!ipa_vr_supported_type_p (operation_type)
   || !handler.operand_check_p (operation_type, src_type, op_vr.type ())
   || !handler.fold_range (op_res, operation_type, src_vr, op_vr))
diff --git a/gcc/testsuite/g++.dg/lto/pr118785_0.C b/gcc/testsuite/g++.dg/lto/pr118785_0.C
new file mode 100644
index ..cdcc1dd947d3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lto/pr118785

[gcc r15-7891] ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318)

2025-03-07 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:7deb498425799aceb7659ea25614175a49533184

commit r15-7891-g7deb498425799aceb7659ea25614175a49533184
Author: Martin Jambor 
Date:   Fri Mar 7 17:17:24 2025 +0100

ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318)

PR 118318 reported an ICE during a PGO build of Firefox in IPA-CP, in
the final stages of update_counts_for_self_gen_clones, where it
attempts to guess how to distribute the profile count among clones
created for recursive edges and the various edges that are created in
the process.  If one such edge has a profile count of kind
GUESSED_GLOBAL0, the compatibility check in operator+ will lead to an
ICE.  After discussing the situation with Honza, we concluded that
there is little more we can do other than check for this situation
before touching the edge count, so this is what this patch does.
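
The pattern of checking compatibility before an addition that asserts it
can be sketched as follows.  The types are hypothetical and heavily
simplified; in particular, treating two counts as compatible only when their
quality tags are equal is an assumption of this sketch, not the rule that
GCC's profile_count implements.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical quality tag; GCC's real set of qualities is larger.
enum class quality { guessed_global0, precise };

// Hypothetical stand-in for a profile count.
struct sketch_count
{
  std::uint64_t value;
  quality q;

  bool compatible_p (const sketch_count &other) const
  { return q == other.q; }

  sketch_count &operator+= (const sketch_count &other)
  {
    assert (compatible_p (other));  // models the check that led to the ICE
    value += other.value;
    return *this;
  }
};

// Add DELTA to EDGE only when that is safe; return whether it happened.
bool add_if_compatible (sketch_count &edge, const sketch_count &delta)
{
  if (!edge.compatible_p (delta))
    return false;
  edge += delta;
  return true;
}
```

An incompatible edge is simply left untouched, matching the patch's decision
to skip the update rather than try to reconcile the two counts.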

gcc/ChangeLog:

2025-02-28  Martin Jambor  

PR ipa/118318
* ipa-cp.cc (adjust_clone_incoming_counts): Add a compatible_p check.

Diff:
---
 gcc/ipa-cp.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 3c994f24f540..264568989a96 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -4638,7 +4638,8 @@ adjust_clone_incoming_counts (cgraph_node *node,
cs->count = cs->count.combine_with_ipa_count (sum);
   }
 else if (!desc->processed_edges->contains (cs)
-&& cs->caller->clone_of == desc->orig)
+&& cs->caller->clone_of == desc->orig
+&& cs->count.compatible_p (desc->count))
   {
cs->count += desc->count;
if (dump_file)


[gcc r14-11375] ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)

2025-03-04 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:455ea90d6e5ed2938fb7cc7008bf738dcbbc72d4

commit r14-11375-g455ea90d6e5ed2938fb7cc7008bf738dcbbc72d4
Author: Martin Jambor 
Date:   Tue Mar 4 14:53:41 2025 +0100

ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)

Among other things, IPA-SRA checks whether splitting out a bit of an
aggregate or something passed by reference would lead to a clash
with an already known IPA-CP constant in a way which would cause
problems later on.  Unfortunately the test is done only in
adjust_parameter_descriptions and is missing when accesses are
propagated from callees to callers, which leads to the miscompilation
reported as PR 118243 (where the callee is a function created by
ipa-split).

The matter is then further complicated by the fact that we consider
complex numbers as scalars even though they can be modified piecemeal
(IPA-CP can detect and propagate the pieces separately too) which then
confuses the parameter manipulation machinery further.

This patch simply adds the missing check to avoid the IPA-SRA
transform in these cases too, which should be suitable for backporting
to all affected release branches.  It is a bit of a shame as in the PR
testcase we do propagate both components of the complex number in
question and the transformation phase could recover.  I have some
prototype patches in this direction but that is something for (a)
stage 1.
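
The added check boils down to refusing a scalar access whose size differs
from that of a constant already known at the same parameter and offset.  A
minimal sketch of that comparison, with a hypothetical map standing in for
the IPA-CP transformation summary (known_constants and check_pulled_access
are not GCC names):

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical model: for each (parameter index, byte offset) where a
// constant is known, record the size in bytes of that constant.
using known_constants = std::map<std::pair<int, unsigned>, unsigned>;

// Return a reason string when pulling a scalar access of ACCESS_SIZE bytes
// at OFFSET of parameter PARAM_IDX must be refused, or nullptr when no
// conflict is known.
const char *
check_pulled_access (const known_constants &consts, int param_idx,
                     unsigned offset, unsigned access_size)
{
  assert (access_size > 0);
  auto it = consts.find ({param_idx, offset});
  if (it != consts.end () && it->second != access_size)
    return "propagated access would conflict with a known constant";
  return nullptr;
}
```

With 8-byte constants known at offsets 0 and 8 of parameter 0 (the two
halves of a 16-byte complex value, say), a 16-byte scalar access at offset 0
is rejected while an 8-byte one is accepted.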

gcc/ChangeLog:

2025-02-10  Martin Jambor  

PR ipa/118243
* ipa-sra.cc (pull_accesses_from_callee): New parameters
caller_ipcp_ts and param_idx.  Check that scalar pulled accesses would
not clash with a known IPA-CP aggregate constant.
(param_splitting_across_edge): Pass IPA-CP transformation summary and
caller parameter index to pull_accesses_from_callee.

gcc/testsuite/ChangeLog:

2025-02-10  Martin Jambor  

PR ipa/118243
* g++.dg/ipa/pr118243.C: New test.

(cherry picked from commit 0bffcd469e68d68ba9c724f515651deff8494b82)

Diff:
---
 gcc/ipa-sra.cc  | 38 +--
 gcc/testsuite/g++.dg/ipa/pr118243.C | 40 +
 2 files changed, 68 insertions(+), 10 deletions(-)

diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index 6d6da4089251..25fbccd03480 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -3640,15 +3640,19 @@ enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, ACC_PROP_CERTAIN};
 
 /* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC,
(which belongs to CALLER) if they would not violate some constraint there.
-   If successful, return NULL, otherwise return the string reason for failure
-   (which can be written to the dump file).  DELTA_OFFSET is the known offset
-   of the actual argument withing the formal parameter (so of ARG_DESCS within
-   PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not
-   known. In case of success, set *CHANGE_P to true if propagation actually
-   changed anything.  */
+   CALLER_IPCP_TS describes the caller, PARAM_IDX is the index of the parameter
+   described by PARAM_DESC.  If successful, return NULL, otherwise return the
+   string reason for failure (which can be written to the dump file).
+   DELTA_OFFSET is the known offset of the actual argument withing the formal
+   parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the size of the
+   actual argument or zero, if not known. In case of success, set *CHANGE_P to
+   true if propagation actually changed anything.  */
 
 static const char *
-pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc,
+pull_accesses_from_callee (cgraph_node *caller,
+  ipcp_transformation *caller_ipcp_ts,
+  int param_idx,
+  isra_param_desc *param_desc,
   isra_param_desc *arg_desc,
   unsigned delta_offset, unsigned arg_size,
   bool *change_p)
@@ -3673,6 +3677,17 @@ pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc,
continue;
 
   unsigned offset = argacc->unit_offset + delta_offset;
+
+  if (caller_ipcp_ts && !AGGREGATE_TYPE_P (argacc->type))
+   {
+ ipa_argagg_value_list avl (caller_ipcp_ts);
+ tree value = avl.get_value (param_idx, offset);
+ if (value && ((tree_to_uhwi (TYPE_SIZE (TREE_TYPE (value)))
+/ BITS_PER_UNIT)
+   != argacc->unit_size))
+   return " propagated access would conflict with an IPA-CP constant";
+   }
+
   /* Given that accesses are initially stored according to increasing
 offset and decreasing size in case of equal offsets, the following
 searches could 

[gcc r13-9422] ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)

2025-03-11 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:ceb689d5b697886c2255a43ee61b7352242c9683

commit r13-9422-gceb689d5b697886c2255a43ee61b7352242c9683
Author: Martin Jambor 
Date:   Tue Mar 11 16:49:40 2025 +0100

ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)

Among other things, IPA-SRA checks whether splitting out a bit of an
aggregate or something passed by reference would lead to a clash
with an already known IPA-CP constant in a way which would cause
problems later on.  Unfortunately the test is done only in
adjust_parameter_descriptions and is missing when accesses are
propagated from callees to callers, which leads to the miscompilation
reported as PR 118243 (where the callee is a function created by
ipa-split).

The matter is then further complicated by the fact that we consider
complex numbers as scalars even though they can be modified piecemeal
(IPA-CP can detect and propagate the pieces separately too) which then
confuses the parameter manipulation machinery further.

This patch simply adds the missing check to avoid the IPA-SRA
transform in these cases too, which should be suitable for backporting
to all affected release branches.  It is a bit of a shame as in the PR
testcase we do propagate both components of the complex number in
question and the transformation phase could recover.  I have some
prototype patches in this direction but that is something for (a)
stage 1.

gcc/ChangeLog:

2025-02-10  Martin Jambor  

PR ipa/118243
* ipa-sra.cc (pull_accesses_from_callee): New parameters
caller_ipcp_ts and param_idx.  Check that scalar pulled accesses would
not clash with a known IPA-CP aggregate constant.
(param_splitting_across_edge): Pass IPA-CP transformation summary and
caller parameter index to pull_accesses_from_callee.

gcc/testsuite/ChangeLog:

2025-02-10  Martin Jambor  

PR ipa/118243
* g++.dg/ipa/pr118243.C: New test.

(cherry picked from commit 0bffcd469e68d68ba9c724f515651deff8494b82)

Diff:
---
 gcc/ipa-sra.cc  | 38 +--
 gcc/testsuite/g++.dg/ipa/pr118243.C | 40 +
 2 files changed, 68 insertions(+), 10 deletions(-)

diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index a3db19b7a6fb..288b61e4fb4f 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -3579,15 +3579,19 @@ enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, ACC_PROP_CERTAIN};
 
 /* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC,
(which belongs to CALLER) if they would not violate some constraint there.
-   If successful, return NULL, otherwise return the string reason for failure
-   (which can be written to the dump file).  DELTA_OFFSET is the known offset
-   of the actual argument withing the formal parameter (so of ARG_DESCS within
-   PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not
-   known. In case of success, set *CHANGE_P to true if propagation actually
-   changed anything.  */
+   CALLER_IPCP_TS describes the caller, PARAM_IDX is the index of the parameter
+   described by PARAM_DESC.  If successful, return NULL, otherwise return the
+   string reason for failure (which can be written to the dump file).
+   DELTA_OFFSET is the known offset of the actual argument withing the formal
+   parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the size of the
+   actual argument or zero, if not known. In case of success, set *CHANGE_P to
+   true if propagation actually changed anything.  */
 
 static const char *
-pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc,
+pull_accesses_from_callee (cgraph_node *caller,
+  ipcp_transformation *caller_ipcp_ts,
+  int param_idx,
+  isra_param_desc *param_desc,
   isra_param_desc *arg_desc,
   unsigned delta_offset, unsigned arg_size,
   bool *change_p)
@@ -3612,6 +3616,17 @@ pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc,
continue;
 
   unsigned offset = argacc->unit_offset + delta_offset;
+
+  if (caller_ipcp_ts && !AGGREGATE_TYPE_P (argacc->type))
+   {
+ ipa_argagg_value_list avl (caller_ipcp_ts);
+ tree value = avl.get_value (param_idx, offset);
+ if (value && ((tree_to_uhwi (TYPE_SIZE (TREE_TYPE (value)))
+/ BITS_PER_UNIT)
+   != argacc->unit_size))
+   return " propagated access would conflict with an IPA-CP constant";
+   }
+
   /* Given that accesses are initially stored according to increasing
 offset and decreasing size in case of equal offsets, the following
 searches could 

[gcc r15-7961] Fix a pasto in ao_compare::compare_ao_refs

2025-03-11 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:dc47161c1f32c3f27d1157ba0de9d98ea1b7fc82

commit r15-7961-gdc47161c1f32c3f27d1157ba0de9d98ea1b7fc82
Author: Martin Jambor 
Date:   Tue Mar 11 14:52:44 2025 +0100

Fix a pasto in ao_compare::compare_ao_refs

When reading the function ao_compare::compare_ao_refs I came across
what I believe to be a copy-and-paste error, which this patch fixes.
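
The shape of the slip is easier to see on a toy version: two loops share one
counter and each should record into its own result.  The sketch below shows
the corrected form; last_marker_positions and the skipped struct are
hypothetical and only mimic the structure of the real function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct skipped { std::size_t first, second; };

// For each path, record the index of the last element equal to MARKER
// (0 when there is none).
skipped
last_marker_positions (const std::vector<int> &path1,
                       const std::vector<int> &path2, int marker)
{
  std::size_t nskipped1 = 0, nskipped2 = 0;
  std::size_t i = 0;
  for (int v : path1)
    {
      if (v == marker)
        nskipped1 = i;
      i++;
    }
  i = 0;  // without this reset the second index would continue from path1
  for (int v : path2)
    {
      if (v == marker)
        nskipped2 = i;  // a pasted loop would keep writing nskipped1 here
      i++;
    }
  return { nskipped1, nskipped2 };
}
```

In the unfixed shape, the second loop both clobbers the first result and
leaves the second one at its initial value.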

gcc/ChangeLog:

2025-03-10  Martin Jambor  

* tree-ssa-alias.cc (ao_compare::compare_ao_refs): Fix a
copy-and-paste error.

Diff:
---
 gcc/tree-ssa-alias.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
index 2489aa6b8087..e93d5187d509 100644
--- a/gcc/tree-ssa-alias.cc
+++ b/gcc/tree-ssa-alias.cc
@@ -4355,12 +4355,13 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref *ref2,
c1 = p1, nskipped1 = i;
   i++;
 }
+  i = 0;
   for (tree p2 = ref2->ref; handled_component_p (p2); p2 = TREE_OPERAND (p2, 0))
 {
   if (component_ref_to_zero_sized_trailing_array_p (p2))
end_struct_ref2 = p2;
   if (ends_tbaa_access_path_p (p2))
-   c2 = p2, nskipped1 = i;
+   c2 = p2, nskipped2 = i;
   i++;
 }


[gcc r15-7760] ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)

2025-02-28 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:0bffcd469e68d68ba9c724f515651deff8494b82

commit r15-7760-g0bffcd469e68d68ba9c724f515651deff8494b82
Author: Martin Jambor 
Date:   Fri Feb 28 17:34:10 2025 +0100

ipa-sra: Avoid clashes with ipa-cp when pulling accesses across calls (PR 118243)

Among other things, IPA-SRA checks whether splitting out a bit of an
aggregate or something passed by reference would lead to a clash
with an already known IPA-CP constant in a way which would cause
problems later on.  Unfortunately the test is done only in
adjust_parameter_descriptions and is missing when accesses are
propagated from callees to callers, which leads to the miscompilation
reported as PR 118243 (where the callee is a function created by
ipa-split).

The matter is then further complicated by the fact that we consider
complex numbers as scalars even though they can be modified piecemeal
(IPA-CP can detect and propagate the pieces separately too) which then
confuses the parameter manipulation machinery further.

This patch simply adds the missing check to avoid the IPA-SRA
transform in these cases too, which should be suitable for backporting
to all affected release branches.  It is a bit of a shame as in the PR
testcase we do propagate both components of the complex number in
question and the transformation phase could recover.  I have some
prototype patches in this direction but that is something for (a)
stage 1.

gcc/ChangeLog:

2025-02-10  Martin Jambor  

PR ipa/118243
* ipa-sra.cc (pull_accesses_from_callee): New parameters
caller_ipcp_ts and param_idx.  Check that scalar pulled accesses would
not clash with a known IPA-CP aggregate constant.
(param_splitting_across_edge): Pass IPA-CP transformation summary and
caller parameter index to pull_accesses_from_callee.

gcc/testsuite/ChangeLog:

2025-02-10  Martin Jambor  

PR ipa/118243
* g++.dg/ipa/pr118243.C: New test.

Diff:
---
 gcc/ipa-sra.cc  | 38 +--
 gcc/testsuite/g++.dg/ipa/pr118243.C | 40 +
 2 files changed, 68 insertions(+), 10 deletions(-)

diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index ad80d22f8ced..5d1703ed394f 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -3640,15 +3640,19 @@ enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, ACC_PROP_CERTAIN};
 
 /* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC,
(which belongs to CALLER) if they would not violate some constraint there.
-   If successful, return NULL, otherwise return the string reason for failure
-   (which can be written to the dump file).  DELTA_OFFSET is the known offset
-   of the actual argument withing the formal parameter (so of ARG_DESCS within
-   PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not
-   known. In case of success, set *CHANGE_P to true if propagation actually
-   changed anything.  */
+   CALLER_IPCP_TS describes the caller, PARAM_IDX is the index of the parameter
+   described by PARAM_DESC.  If successful, return NULL, otherwise return the
+   string reason for failure (which can be written to the dump file).
+   DELTA_OFFSET is the known offset of the actual argument withing the formal
+   parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the size of the
+   actual argument or zero, if not known. In case of success, set *CHANGE_P to
+   true if propagation actually changed anything.  */
 
 static const char *
-pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc,
+pull_accesses_from_callee (cgraph_node *caller,
+  ipcp_transformation *caller_ipcp_ts,
+  int param_idx,
+  isra_param_desc *param_desc,
   isra_param_desc *arg_desc,
   unsigned delta_offset, unsigned arg_size,
   bool *change_p)
@@ -3673,6 +3677,17 @@ pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc,
continue;
 
   unsigned offset = argacc->unit_offset + delta_offset;
+
+  if (caller_ipcp_ts && !AGGREGATE_TYPE_P (argacc->type))
+   {
+ ipa_argagg_value_list avl (caller_ipcp_ts);
+ tree value = avl.get_value (param_idx, offset);
+ if (value && ((tree_to_uhwi (TYPE_SIZE (TREE_TYPE (value)))
+/ BITS_PER_UNIT)
+   != argacc->unit_size))
+   return " propagated access would conflict with an IPA-CP constant";
+   }
+
   /* Given that accesses are initially stored according to increasing
 offset and decreasing size in case of equal offsets, the following
 searches could be written more efficiently if we kept the ordering
@@ -3781,6 +3796,8 @@ para

[gcc r14-11447] ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318)

2025-03-25 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:82bd83122a483275787fcd18131bf6cd91fbdbd4

commit r14-11447-g82bd83122a483275787fcd18131bf6cd91fbdbd4
Author: Martin Jambor 
Date:   Fri Mar 7 17:17:24 2025 +0100

ipa-cp: Avoid ICE when redistributing nodes among edges to recursive clones (PR 118318)

PR 118318 reported an ICE during a PGO build of Firefox in IPA-CP, in
the final stages of update_counts_for_self_gen_clones, where it
attempts to guess how to distribute the profile count among clones
created for recursive edges and the various edges that are created in
the process.  If one such edge has a profile count of kind
GUESSED_GLOBAL0, the compatibility check in operator+ will lead to an
ICE.  After discussing the situation with Honza, we concluded that
there is little more we can do other than check for this situation
before touching the edge count, so this is what this patch does.

gcc/ChangeLog:

2025-02-28  Martin Jambor  

PR ipa/118318
* ipa-cp.cc (adjust_clone_incoming_counts): Add a compatible_p check.

(cherry picked from commit 7deb498425799aceb7659ea25614175a49533184)

Diff:
---
 gcc/ipa-cp.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index b7add455bd5d..6b772fae88ff 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -4608,7 +4608,8 @@ adjust_clone_incoming_counts (cgraph_node *node,
cs->count = cs->count.combine_with_ipa_count (sum);
   }
 else if (!desc->processed_edges->contains (cs)
-&& cs->caller->clone_of == desc->orig)
+&& cs->caller->clone_of == desc->orig
+&& cs->count.compatible_p (desc->count))
   {
cs->count += desc->count;
if (dump_file)


[gcc r13-9654] ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)

2025-05-13 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:168ce6032dd582e39f9ddadcc195fc73f364c4dd

commit r13-9654-g168ce6032dd582e39f9ddadcc195fc73f364c4dd
Author: Martin Jambor 
Date:   Tue May 6 17:28:43 2025 +0200

ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)

As described in PR 119852, the output of -fdump-ipa-clones can contain
"(null)" as the suffix/reason for cloning when we need to create a
clone to hold the original function during recursive inlining.  Such a
clone is never output and so should not be part of the dump output
either.
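
The guard can be sketched in plain C as follows.  This is only an
illustration: toy_dump_clone is a hypothetical helper and its record
format is abbreviated to three fields, so it is not the real
dump_callgraph_transformation.

```c
#include <stdio.h>

/* Hypothetical stand-in for the dump routine: write one record to OUT
   unless OUT or SUFFIX is NULL.  Return 1 if a line was written and 0
   otherwise.  */
static int
toy_dump_clone (FILE *out, const char *orig_name, const char *clone_name,
		const char *suffix)
{
  /* Passing NULL for %s is undefined in ISO C.  Some C libraries print
     "(null)" in that case, which is how such a string can end up in a
     dump file.  */
  if (!out || !suffix)
    return 0;
  fprintf (out, "Callgraph clone;%s;%s;%s\n", orig_name, clone_name, suffix);
  return 1;
}
```

Calling toy_dump_clone with a NULL suffix writes nothing, so a consumer
of the dump never sees a record for a helper clone.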

gcc/ChangeLog:

2025-04-23  Martin Jambor  

PR ipa/119852
* cgraphclones.cc (dump_callgraph_transformation): Document the
function.  Do not dump if suffix is NULL.

gcc/testsuite/ChangeLog:

2025-04-23  Martin Jambor  

PR ipa/119852
* gcc.dg/ipa/pr119852.c: New test.

(cherry picked from commit fb5829a01651d427a63a12c44ecc8baa47dbfc83)

Diff:
---
 gcc/cgraphclones.cc | 10 +++-
 gcc/testsuite/gcc.dg/ipa/pr119852.c | 50 +
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index 7c5d3b2842c9..b5435537b1d9 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -304,12 +304,20 @@ cgraph_node::expand_all_artificial_thunks ()
   e = e->next_caller;
 }
 
+/* Dump information about creation of a call graph node clone to the dump file
+   created by the -fdump-ipa-clones option.  ORIGINAL is the function being
+   cloned, CLONE is the new clone.  SUFFIX is a string that helps identify the
+   reason for cloning, often it is the suffix used by a particular IPA pass to
+   create unique function names.  SUFFIX can be NULL and in that case the
+   dumping will not take place, which must be the case only for helper clones
+   which will never be emitted to the output.  */
+
 void
 dump_callgraph_transformation (const cgraph_node *original,
   const cgraph_node *clone,
   const char *suffix)
 {
-  if (symtab->ipa_clones_dump_file)
+  if (suffix && symtab->ipa_clones_dump_file)
 {
   fprintf (symtab->ipa_clones_dump_file,
   "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n",
diff --git a/gcc/testsuite/gcc.dg/ipa/pr119852.c b/gcc/testsuite/gcc.dg/ipa/pr119852.c
new file mode 100644
index ..eab8d21293cc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr119852.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-ipa-clones"  } */
+
+typedef struct rtx_def *rtx;
+enum rtx_code {
+  LAST_AND_UNUSED_RTX_CODE};
+extern const char * const rtx_format[((int) LAST_AND_UNUSED_RTX_CODE)];
+struct rtx_def {
+  enum rtx_code code;
+};
+typedef int (*rtx_function) (rtx *, void *);
+extern int for_each_rtx (rtx *, rtx_function, void *);
+int
+replace_label (rtx *x, void *data)
+{
+  rtx l = *x;
+  if (l == (rtx) 0)
+{
+ {
+   rtx new_c, new_l;
+   for_each_rtx (&new_c, replace_label, data);
+ }
+}
+}
+static int
+for_each_rtx_1 (rtx exp, int n, rtx_function f, void *data)
+{
+  int result, i, j;
+  const char *format = (rtx_format[(int) (((enum rtx_code) (exp)->code))]);
+  rtx *x;
+  for (; format[n] != '\0'; n++)
+{
+  switch (format[n])
+ {
+ case 'e':
+   result = (*f) (x, data);
+ {
+   result = for_each_rtx_1 (*x, i, f, data);
+ }
+ }
+}
+}
+int
+for_each_rtx (rtx *x, rtx_function f, void *data)
+{
+  int i;
+  return for_each_rtx_1 (*x, i, f, data);
+}
+
+/* { dg-final { scan-ipa-dump-not "(null)"  "ipa-clones"  } } */


[gcc r14-11763] Document option -fdump-ipa-clones

2025-05-12 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:c817f833cf13bc81380bc9745da2622e4e3b7cb5

commit r14-11763-gc817f833cf13bc81380bc9745da2622e4e3b7cb5
Author: Martin Jambor 
Date:   Tue May 6 17:28:42 2025 +0200

Document option -fdump-ipa-clones

I have noticed that the option -fdump-ipa-clones is not documented
although there are users who depend on it.  This patch adds the
missing documentation along with the description of the information it
dumps and the format it uses.

I am never quite sure which of the texinfo mark-ups is the most
appropriate in which situation; I'll of course incorporate any
feedback on this as well as on the general wording of the text.

After we settle on a version, I'd like to backport the documentation
also at least to GCC 15, 14 and 13.

Is this perhaps OK for master and the branches, or what should be
changed?

Thanks,

Martin

gcc/ChangeLog:

2025-04-23  Martin Jambor  

* doc/invoke.texi (Developer Options): Document -fdump-ipa-clones.

(cherry picked from commit 6ecc2fee06bdd60da0e9b3fe6660b553dbdca3ca)

Diff:
---
 gcc/doc/invoke.texi | 87 +
 1 file changed, 87 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c85cac24f3ce..64728fead512 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -20180,6 +20180,93 @@ By default, the dump will contain messages about successful
 optimizations (equivalent to @option{-optimized}) together with
 low-level details about the analysis.
 
+@opindex fdump-ipa-clones
+@item -fdump-ipa-clones
+
+Create a dump file containing information about creation of call graph
+node clones and removals of call graph nodes during inter-procedural
+optimizations and transformations.  Its main intended use is that tools
+that create live-patches can determine the set of functions that need to
+be live-patched to completely replace a particular function (see
+@option{-flive-patching}).  The file name is generated by appending
+suffix @code{ipa-clones} to the source file name, and the file is
+created in the same directory as the output file.  Each entry in the
+file is on a separate line containing semicolon separated fields.
+
+In the case of call graph clone creation, the individual fields are:
+
+@enumerate
+@item
+String @code{Callgraph clone}.
+
+@item
+Name of the function being cloned as it is presented to the assembler.
+
+@item
+A number that uniquely represents the function being cloned in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the function is defined.
+
+@item
+The line on which the function definition is located.
+
+@item
+The column where the function definition is located.
+
+@item
+Name of the new function clone as it is presented to the assembler.
+
+@item
+A number that uniquely represents the new function clone in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the source code location of the
+new clone points to.
+
+@item
+The line to which the source code location of the new clone points to.
+
+@item
+The column to which the source code location of the new clone points to.
+
+@item
+A string that determines the reason for cloning.
+
+@end enumerate
+
+In the case of call graph clone removal, the individual fields are:
+
+@enumerate
+@item
+String @code{Callgraph removal}.
+
+@item
+Name of the function being removed as it would be presented to the assembler.
+
+@item
+A number that uniquely represents the function being cloned in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the function is defined.
+
+@item
+The line on which the function definition is located.
+
+@item
+The column where the function definition is located.
+
+@end enumerate
+
 @opindex fdump-lang
 @item -fdump-lang
 Dump language-specific information.  The file name is made by appending
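
The clone-creation record documented in the patch above has twelve
semicolon-separated fields, matching the format string
"Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n" used by
dump_callgraph_transformation.  A consumer could read such a line
roughly as sketched below.  The struct clone_record, the function
parse_clone_line and the buffer sizes are hypothetical, and the sketch
assumes that no field is empty and that names and paths contain no
semicolons.

```c
#include <stdio.h>

/* Hypothetical in-memory form of one "Callgraph clone" record.  */
struct clone_record
{
  char orig_name[256];	/* Assembler name of the function being cloned.  */
  int orig_id;		/* Its number in the call graph.  */
  char orig_file[256];
  int orig_line, orig_col;
  char clone_name[256];	/* Assembler name of the new clone.  */
  int clone_id;		/* Its number in the call graph.  */
  char clone_file[256];
  int clone_line, clone_col;
  char reason[64];	/* String giving the reason for cloning.  */
};

/* Parse LINE into *R.  Return 1 when all eleven variable fields after
   the leading "Callgraph clone" string were read, 0 otherwise.  */
static int
parse_clone_line (const char *line, struct clone_record *r)
{
  int n = sscanf (line,
		  "Callgraph clone;%255[^;];%d;%255[^;];%d;%d;"
		  "%255[^;];%d;%255[^;];%d;%d;%63[^;\n]",
		  r->orig_name, &r->orig_id, r->orig_file,
		  &r->orig_line, &r->orig_col,
		  r->clone_name, &r->clone_id, r->clone_file,
		  &r->clone_line, &r->clone_col, r->reason);
  return n == 11;
}
```

A "Callgraph removal" line fails the leading literal match, so
parse_clone_line returns 0 for it and the caller can try a separate
parser for removal records.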


[gcc r14-11764] ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)

2025-05-12 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:51ffec744b513a71fe84373fb87a3c0125b7fffd

commit r14-11764-g51ffec744b513a71fe84373fb87a3c0125b7fffd
Author: Martin Jambor 
Date:   Tue May 6 17:28:43 2025 +0200

ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)

As described in PR 119852, the output of -fdump-ipa-clones can contain
"(null)" as the suffix/reason for cloning when we need to create a
clone to hold the original function during recursive inlining.  Such a
clone is never output and so should not be part of the dump output
either.

gcc/ChangeLog:

2025-04-23  Martin Jambor  

PR ipa/119852
* cgraphclones.cc (dump_callgraph_transformation): Document the
function.  Do not dump if suffix is NULL.

gcc/testsuite/ChangeLog:

2025-04-23  Martin Jambor  

PR ipa/119852
* gcc.dg/ipa/pr119852.c: New test.

(cherry picked from commit fb5829a01651d427a63a12c44ecc8baa47dbfc83)

Diff:
---
 gcc/cgraphclones.cc | 10 +++-
 gcc/testsuite/gcc.dg/ipa/pr119852.c | 50 +
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index 4fff6873a369..913c0a0a082f 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -307,12 +307,20 @@ cgraph_node::expand_all_artificial_thunks ()
   e = e->next_caller;
 }
 
+/* Dump information about creation of a call graph node clone to the dump file
+   created by the -fdump-ipa-clones option.  ORIGINAL is the function being
+   cloned, CLONE is the new clone.  SUFFIX is a string that helps identify the
+   reason for cloning, often it is the suffix used by a particular IPA pass to
+   create unique function names.  SUFFIX can be NULL and in that case the
+   dumping will not take place, which must be the case only for helper clones
+   which will never be emitted to the output.  */
+
 void
 dump_callgraph_transformation (const cgraph_node *original,
   const cgraph_node *clone,
   const char *suffix)
 {
-  if (symtab->ipa_clones_dump_file)
+  if (suffix && symtab->ipa_clones_dump_file)
 {
   fprintf (symtab->ipa_clones_dump_file,
   "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n",
diff --git a/gcc/testsuite/gcc.dg/ipa/pr119852.c b/gcc/testsuite/gcc.dg/ipa/pr119852.c
new file mode 100644
index ..eab8d21293cc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr119852.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-ipa-clones"  } */
+
+typedef struct rtx_def *rtx;
+enum rtx_code {
+  LAST_AND_UNUSED_RTX_CODE};
+extern const char * const rtx_format[((int) LAST_AND_UNUSED_RTX_CODE)];
+struct rtx_def {
+  enum rtx_code code;
+};
+typedef int (*rtx_function) (rtx *, void *);
+extern int for_each_rtx (rtx *, rtx_function, void *);
+int
+replace_label (rtx *x, void *data)
+{
+  rtx l = *x;
+  if (l == (rtx) 0)
+{
+ {
+   rtx new_c, new_l;
+   for_each_rtx (&new_c, replace_label, data);
+ }
+}
+}
+static int
+for_each_rtx_1 (rtx exp, int n, rtx_function f, void *data)
+{
+  int result, i, j;
+  const char *format = (rtx_format[(int) (((enum rtx_code) (exp)->code))]);
+  rtx *x;
+  for (; format[n] != '\0'; n++)
+{
+  switch (format[n])
+ {
+ case 'e':
+   result = (*f) (x, data);
+ {
+   result = for_each_rtx_1 (*x, i, f, data);
+ }
+ }
+}
+}
+int
+for_each_rtx (rtx *x, rtx_function f, void *data)
+{
+  int i;
+  return for_each_rtx_1 (*x, i, f, data);
+}
+
+/* { dg-final { scan-ipa-dump-not "(null)"  "ipa-clones"  } } */


[gcc r13-9653] Document option -fdump-ipa-clones

2025-05-13 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:70d3dec42e8c7aec6604f920f56529c796cd398a

commit r13-9653-g70d3dec42e8c7aec6604f920f56529c796cd398a
Author: Martin Jambor 
Date:   Tue May 6 17:28:42 2025 +0200

Document option -fdump-ipa-clones

I have noticed that the option -fdump-ipa-clones is not documented
although there are users who depend on it.  This patch adds the
missing documentation along with the description of the information it
dumps and the format it uses.

I am never quite sure which of the texinfo mark-ups is the most
appropriate in which situation; I'll of course incorporate any
feedback on this as well as on the general wording of the text.

After we settle on a version, I'd like to backport the documentation
also at least to GCC 15, 14 and 13.

Is this perhaps OK for master and the branches, or what should be
changed?

Thanks,

Martin

gcc/ChangeLog:

2025-04-23  Martin Jambor  

* doc/invoke.texi (Developer Options): Document -fdump-ipa-clones.

(cherry picked from commit 6ecc2fee06bdd60da0e9b3fe6660b553dbdca3ca)

Diff:
---
 gcc/doc/invoke.texi | 87 +
 1 file changed, 87 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4f539b59d17e..b80966e13539 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -19191,6 +19191,93 @@ By default, the dump will contain messages about successful
 optimizations (equivalent to @option{-optimized}) together with
 low-level details about the analysis.
 
+@opindex fdump-ipa-clones
+@item -fdump-ipa-clones
+
+Create a dump file containing information about creation of call graph
+node clones and removals of call graph nodes during inter-procedural
+optimizations and transformations.  Its main intended use is that tools
+that create live-patches can determine the set of functions that need to
+be live-patched to completely replace a particular function (see
+@option{-flive-patching}).  The file name is generated by appending
+suffix @code{ipa-clones} to the source file name, and the file is
+created in the same directory as the output file.  Each entry in the
+file is on a separate line containing semicolon separated fields.
+
+In the case of call graph clone creation, the individual fields are:
+
+@enumerate
+@item
+String @code{Callgraph clone}.
+
+@item
+Name of the function being cloned as it is presented to the assembler.
+
+@item
+A number that uniquely represents the function being cloned in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the function is defined.
+
+@item
+The line on which the function definition is located.
+
+@item
+The column where the function definition is located.
+
+@item
+Name of the new function clone as it is presented to the assembler.
+
+@item
+A number that uniquely represents the new function clone in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the source code location of the
+new clone points to.
+
+@item
+The line to which the source code location of the new clone points to.
+
+@item
+The column to which the source code location of the new clone points to.
+
+@item
+A string that determines the reason for cloning.
+
+@end enumerate
+
+In the case of call graph clone removal, the individual fields are:
+
+@enumerate
+@item
+String @code{Callgraph removal}.
+
+@item
+Name of the function being removed as it would be presented to the assembler.
+
+@item
+A number that uniquely represents the function being cloned in the call
+graph.  Note that the number is unique only within a compilation unit or
+within whole-program analysis but is likely to be different in the two
+phases.
+
+@item
+The file name of the source file where the function is defined.
+
+@item
+The line on which the function definition is located.
+
+@item
+The column where the function definition is located.
+
+@end enumerate
+
 @opindex fdump-lang
 @item -fdump-lang
 Dump language-specific information.  The file name is made by appending


[gcc r16-696] ipa: Dump cgraph_node UID instead of order into ipa-clones dump file

2025-05-16 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:9fa534f0831892393885e64596a0d6ca8c4078b6

commit r16-696-g9fa534f0831892393885e64596a0d6ca8c4078b6
Author: Martin Jambor 
Date:   Fri May 16 17:13:51 2025 +0200

ipa: Dump cgraph_node UID instead of order into ipa-clones dump file

Since, starting with GCC 15, order is no longer unique among
symtab_nodes but m_uid is, I believe we ought to dump the latter in
the ipa-clones dump, if only so that people can reliably match entries
about new clones to those about removed nodes (if any).
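
Matching a removal entry to an earlier clone entry by that number could
look roughly like the C sketch below.  The helper names clone_uid,
removal_uid and removal_matches_clone are hypothetical; the field
positions follow the "Callgraph clone" and "Callgraph removal" format
strings in the diff, and the sketch assumes identifiers are never
negative so that -1 can signal a parse failure.

```c
#include <stdio.h>

/* Return the identifier of the new clone (eighth field) from a
   "Callgraph clone" line, or -1 if LINE does not match.  */
static int
clone_uid (const char *line)
{
  int uid;
  if (sscanf (line,
	      "Callgraph clone;%*[^;];%*d;%*[^;];%*d;%*d;%*[^;];%d;",
	      &uid) != 1)
    return -1;
  return uid;
}

/* Return the identifier (third field) from a "Callgraph removal" line,
   or -1 if LINE does not match.  */
static int
removal_uid (const char *line)
{
  int uid;
  if (sscanf (line, "Callgraph removal;%*[^;];%d;", &uid) != 1)
    return -1;
  return uid;
}

/* Return nonzero when REMOVAL_LINE removes the node that CLONE_LINE
   reported as newly created.  */
static int
removal_matches_clone (const char *clone_line, const char *removal_line)
{
  int c = clone_uid (clone_line);
  return c >= 0 && c == removal_uid (removal_line);
}
```

The comparison is only meaningful if the dumped number is unique per
node, which is the motivation given above for dumping m_uid.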

This patch also contains fixes to a few other places where we have
so far dumped order to our ordinary dumps; these were identified by
Michal Jires.

gcc/ChangeLog:

2025-05-16  Martin Jambor  

* cgraph.h (symtab_node): Make member function get_uid const.
* cgraphclones.cc (dump_callgraph_transformation): Dump m_uid of the
call graph nodes instead of order.
* cgraph.cc (cgraph_node::remove): Likewise.
* ipa-cp.cc (ipcp_lattice::print): Likewise.
* ipa-sra.cc (ipa_sra_summarize_function): Likewise.
* symtab.cc (symtab_node::dump_base): Likewise.

Co-Authored-By: Michal Jires 

Diff:
---
 gcc/cgraph.cc   | 2 +-
 gcc/cgraph.h| 2 +-
 gcc/cgraphclones.cc | 4 ++--
 gcc/ipa-cp.cc   | 2 +-
 gcc/ipa-sra.cc  | 2 +-
 gcc/symtab.cc   | 4 ++--
 6 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 1a2ec38374ab..ac0f2519361b 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -1879,7 +1879,7 @@ cgraph_node::remove (void)
   clone_info *info, saved_info;
   if (symtab->ipa_clones_dump_file && symtab->cloned_nodes.contains (this))
 fprintf (symtab->ipa_clones_dump_file,
-"Callgraph removal;%s;%d;%s;%d;%d\n", asm_name (), order,
+"Callgraph removal;%s;%d;%s;%d;%d\n", asm_name (), get_uid (),
 DECL_SOURCE_FILE (decl), DECL_SOURCE_LINE (decl),
 DECL_SOURCE_COLUMN (decl));
 
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index f4ee29e998c3..8dbe36eac09d 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -493,7 +493,7 @@ public:
   static inline void checking_verify_symtab_nodes (void);
 
   /* Get unique identifier of the node.  */
-  inline int get_uid ()
+  inline int get_uid () const
   {
 return m_uid;
   }
diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index b45ac4977331..c160e8b6985b 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -324,11 +324,11 @@ dump_callgraph_transformation (const cgraph_node *original,
 {
   fprintf (symtab->ipa_clones_dump_file,
   "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n",
-  original->asm_name (), original->order,
+  original->asm_name (), original->get_uid (),
   DECL_SOURCE_FILE (original->decl),
   DECL_SOURCE_LINE (original->decl),
   DECL_SOURCE_COLUMN (original->decl), clone->asm_name (),
-  clone->order, DECL_SOURCE_FILE (clone->decl),
+  clone->get_uid (), DECL_SOURCE_FILE (clone->decl),
   DECL_SOURCE_LINE (clone->decl), DECL_SOURCE_COLUMN (clone->decl),
   suffix);
 
diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index b41148c74de3..f06ac46dfffb 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -288,7 +288,7 @@ ipcp_lattice::print (FILE * f, bool dump_sources, bool dump_benefits)
  else
fprintf (f, " [scc: %i, from:", val->scc_no);
  for (s = val->sources; s; s = s->next)
-   fprintf (f, " %i(%f)", s->cs->caller->order,
+   fprintf (f, " %i(%f)", s->cs->caller->get_uid (),
 s->cs->sreal_frequency ().to_double ());
  fprintf (f, "]");
}
diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index 1331ba49b507..88bfae9502c7 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -4644,7 +4644,7 @@ ipa_sra_summarize_function (cgraph_node *node)
 {
   if (dump_file)
 fprintf (dump_file, "Creating summary for %s/%i:\n", node->name (),
-node->order);
+node->get_uid ());
   gcc_obstack_init (&gensum_obstack);
   loaded_decls = new hash_set;
 
diff --git a/gcc/symtab.cc b/gcc/symtab.cc
index fe9c031247f9..fc1155f46964 100644
--- a/gcc/symtab.cc
+++ b/gcc/symtab.cc
@@ -989,10 +989,10 @@ symtab_node::dump_base (FILE *f)
 same_comdat_group->dump_asm_name ());
   if (next_sharing_asm_name)
 fprintf (f, "  next sharing asm name: %i\n",
-next_sharing_asm_name->order);
+next_sharing_asm_name->get_uid ());
   if (previous_sharing_asm_name)
 fprintf (f, "  previous sharing asm name: %i\n",
-previous_sharing_asm_name->order);
+previous_sharing_asm_name->get_uid ());
 
   if (address_taken)
 fprintf (f, "  Address is taken.\n");


[gcc r16-614] tree-sra: Do not create stores into const aggregates (PR111873)

2025-05-14 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:9d039eff453f777c58642ff16178c1ce2a4be6ab

commit r16-614-g9d039eff453f777c58642ff16178c1ce2a4be6ab
Author: Martin Jambor 
Date:   Wed May 14 12:08:24 2025 +0200

tree-sra: Do not create stores into const aggregates (PR111873)

This patch fixes (hopefully) the one remaining place where gimple SRA
was still creating a store into a const aggregate.  It occurs when
there is a replacement for a load but that replacement is not type
compatible, typically because the aggregate is a single-field structure.

I have used testcases from duplicates because the original test-case
no longer reproduces for me.

gcc/ChangeLog:

2025-05-13  Martin Jambor  

PR tree-optimization/111873
* tree-sra.cc (sra_modify_expr): When processing a load which has
a type-incompatible replacement, do not store the contents of the
replacement into the original aggregate when that aggregate is
const.

gcc/testsuite/ChangeLog:

2025-05-13  Martin Jambor  

* gcc.dg/ipa/pr120044-1.c: New test.
* gcc.dg/ipa/pr120044-2.c: Likewise.
* gcc.dg/tree-ssa/pr114864.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.dg/ipa/pr120044-1.c| 17 +
 gcc/testsuite/gcc.dg/ipa/pr120044-2.c| 17 +
 gcc/testsuite/gcc.dg/tree-ssa/pr114864.c | 15 +++
 gcc/tree-sra.cc  |  4 +++-
 4 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-1.c b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c
new file mode 100644
index ..f9fee3e85afb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-inline" } */
+
+struct a {
+  int b;
+} const c;
+void d(char p, struct a e) {
+  while (e.b)
+;
+}
+static unsigned short f(const struct a g) {
+  d(g.b, g);
+  return g.b;
+}
+int main() {
+  return f(c);
+}
diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-2.c b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c
new file mode 100644
index ..5130791f5444
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-ipa-cp" } */
+
+struct a {
+  int b;
+} const c;
+void d(char p, struct a e) {
+  while (e.b)
+;
+}
+static unsigned short f(const struct a g) {
+  d(g.b, g);
+  return g.b;
+}
+int main() {
+  return f(c);
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c
new file mode 100644
index ..cd9b94c094fc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-options "-O1 -fno-tree-dce -fno-tree-fre" } */
+
+struct a {
+  int b;
+} const c;
+void d(const struct a f) {}
+void e(const struct a f) {
+  f.b == 0 ? 1 : f.b;
+  d(f);
+}
+int main() {
+  e(c);
+  return 0;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 302b73e83b8f..4b6daf772841 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -4205,8 +4205,10 @@ sra_modify_expr (tree *expr, bool write, gimple_stmt_iterator *stmt_gsi,
}
  else
{
- gassign *stmt;
+ if (TREE_READONLY (access->base))
+   return false;
 
+ gassign *stmt;
  if (access->grp_partial_lhs)
repl = force_gimple_operand_gsi (stmt_gsi, repl, true,
 NULL_TREE, true,


[gcc r15-9716] ipa: Dump cgraph_node UID instead of order into ipa-clones dump file

2025-05-20 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:76d16fbd802a10faabf63945dd34f351aea087dc

commit r15-9716-g76d16fbd802a10faabf63945dd34f351aea087dc
Author: Martin Jambor 
Date:   Fri May 16 17:13:51 2025 +0200

ipa: Dump cgraph_node UID instead of order into ipa-clones dump file

Since, starting with GCC 15, order is no longer unique among
symtab_nodes but m_uid is, I believe we ought to dump the latter in
the ipa-clones dump, if only so that people can reliably match entries
about new clones to those about removed nodes (if any).

This patch also contains fixes to a few other places where we have
so far dumped order to our ordinary dumps; these were identified by
Michal Jires.

gcc/ChangeLog:

2025-05-16  Martin Jambor  

* cgraph.h (symtab_node): Make member function get_uid const.
* cgraphclones.cc (dump_callgraph_transformation): Dump m_uid of the
call graph nodes instead of order.
* cgraph.cc (cgraph_node::remove): Likewise.
* ipa-cp.cc (ipcp_lattice::print): Likewise.
* ipa-sra.cc (ipa_sra_summarize_function): Likewise.
* symtab.cc (symtab_node::dump_base): Likewise.

Co-Authored-By: Michal Jires 
(cherry picked from commit 9fa534f0831892393885e64596a0d6ca8c4078b6)

Diff:
---
 gcc/cgraph.cc   | 2 +-
 gcc/cgraph.h| 2 +-
 gcc/cgraphclones.cc | 4 ++--
 gcc/ipa-cp.cc   | 2 +-
 gcc/ipa-sra.cc  | 2 +-
 gcc/symtab.cc   | 4 ++--
 6 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 6ae6a97f6f56..48646de6aa32 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -1879,7 +1879,7 @@ cgraph_node::remove (void)
   clone_info *info, saved_info;
   if (symtab->ipa_clones_dump_file && symtab->cloned_nodes.contains (this))
 fprintf (symtab->ipa_clones_dump_file,
-"Callgraph removal;%s;%d;%s;%d;%d\n", asm_name (), order,
+"Callgraph removal;%s;%d;%s;%d;%d\n", asm_name (), get_uid (),
 DECL_SOURCE_FILE (decl), DECL_SOURCE_LINE (decl),
 DECL_SOURCE_COLUMN (decl));
 
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index abde770ba2b3..45119e3dce9e 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -493,7 +493,7 @@ public:
   static inline void checking_verify_symtab_nodes (void);
 
   /* Get unique identifier of the node.  */
-  inline int get_uid ()
+  inline int get_uid () const
   {
 return m_uid;
   }
diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index bf5bc41cde9c..3c9c642bdec4 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -324,11 +324,11 @@ dump_callgraph_transformation (const cgraph_node *original,
 {
   fprintf (symtab->ipa_clones_dump_file,
   "Callgraph clone;%s;%d;%s;%d;%d;%s;%d;%s;%d;%d;%s\n",
-  original->asm_name (), original->order,
+  original->asm_name (), original->get_uid (),
   DECL_SOURCE_FILE (original->decl),
   DECL_SOURCE_LINE (original->decl),
   DECL_SOURCE_COLUMN (original->decl), clone->asm_name (),
-  clone->order, DECL_SOURCE_FILE (clone->decl),
+  clone->get_uid (), DECL_SOURCE_FILE (clone->decl),
   DECL_SOURCE_LINE (clone->decl), DECL_SOURCE_COLUMN (clone->decl),
   suffix);
 
diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index a8ff3c870731..7ce9ba776961 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -292,7 +292,7 @@ ipcp_lattice::print (FILE * f, bool dump_sources, bool dump_benefits)
  else
fprintf (f, " [scc: %i, from:", val->scc_no);
  for (s = val->sources; s; s = s->next)
-   fprintf (f, " %i(%f)", s->cs->caller->order,
+   fprintf (f, " %i(%f)", s->cs->caller->get_uid (),
 s->cs->sreal_frequency ().to_double ());
  fprintf (f, "]");
}
diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index 1331ba49b507..88bfae9502c7 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -4644,7 +4644,7 @@ ipa_sra_summarize_function (cgraph_node *node)
 {
   if (dump_file)
 fprintf (dump_file, "Creating summary for %s/%i:\n", node->name (),
-node->order);
+node->get_uid ());
   gcc_obstack_init (&gensum_obstack);
   loaded_decls = new hash_set;
 
diff --git a/gcc/symtab.cc b/gcc/symtab.cc
index fe9c031247f9..fc1155f46964 100644
--- a/gcc/symtab.cc
+++ b/gcc/symtab.cc
@@ -989,10 +989,10 @@ symtab_node::dump_base (FILE *f)
 same_comdat_group->dump_asm_name ());
   if (next_sharing_asm_name)
 fprintf (f, "  next sharing asm name: %i\n",
-next_sharing_asm_name->order);
+next_sharing_asm_name->get_uid ());
   if (previous_sharing_asm_name)
 fprintf (f, "  previous sharing asm name: %i\n",
-previous_sharing_asm_name->order);
+previous_sharing_asm_name->get_uid ());
 
   if (address_taken)
 fprintf (f, "  Address is taken.\n");


[gcc r15-9717] tree-sra: Do not create stores into const aggregates (PR111873)

2025-05-20 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:c1db46f7e51d4a546ca536f7f10e548f02e5cc12

commit r15-9717-gc1db46f7e51d4a546ca536f7f10e548f02e5cc12
Author: Martin Jambor 
Date:   Wed May 14 12:08:24 2025 +0200

tree-sra: Do not create stores into const aggregates (PR111873)

This patch fixes (hopefully) the one remaining place where gimple SRA
was still creating a store into a const aggregate.  It occurs when
there is a replacement for a load but that replacement is not type
compatible, typically because the aggregate is a single-field structure.

I have used testcases from duplicates because the original test-case
no longer reproduces for me.

gcc/ChangeLog:

2025-05-13  Martin Jambor  

PR tree-optimization/111873
* tree-sra.cc (sra_modify_expr): When processing a load which has
a type-incompatible replacement, do not store the contents of the
replacement into the original aggregate when that aggregate is
const.

gcc/testsuite/ChangeLog:

2025-05-13  Martin Jambor  

* gcc.dg/ipa/pr120044-1.c: New test.
* gcc.dg/ipa/pr120044-2.c: Likewise.
* gcc.dg/tree-ssa/pr114864.c: Likewise.

(cherry picked from commit 9d039eff453f777c58642ff16178c1ce2a4be6ab)

Diff:
---
 gcc/testsuite/gcc.dg/ipa/pr120044-1.c| 17 +
 gcc/testsuite/gcc.dg/ipa/pr120044-2.c| 17 +
 gcc/testsuite/gcc.dg/tree-ssa/pr114864.c | 15 +++
 gcc/tree-sra.cc  |  4 +++-
 4 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-1.c b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c
new file mode 100644
index ..f9fee3e85afb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-inline" } */
+
+struct a {
+  int b;
+} const c;
+void d(char p, struct a e) {
+  while (e.b)
+;
+}
+static unsigned short f(const struct a g) {
+  d(g.b, g);
+  return g.b;
+}
+int main() {
+  return f(c);
+}
diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-2.c b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c
new file mode 100644
index ..5130791f5444
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-ipa-cp" } */
+
+struct a {
+  int b;
+} const c;
+void d(char p, struct a e) {
+  while (e.b)
+;
+}
+static unsigned short f(const struct a g) {
+  d(g.b, g);
+  return g.b;
+}
+int main() {
+  return f(c);
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c
new file mode 100644
index ..cd9b94c094fc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-options "-O1 -fno-tree-dce -fno-tree-fre" } */
+
+struct a {
+  int b;
+} const c;
+void d(const struct a f) {}
+void e(const struct a f) {
+  f.b == 0 ? 1 : f.b;
+  d(f);
+}
+int main() {
+  e(c);
+  return 0;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 302b73e83b8f..4b6daf772841 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -4205,8 +4205,10 @@ sra_modify_expr (tree *expr, bool write, gimple_stmt_iterator *stmt_gsi,
}
  else
{
- gassign *stmt;
+ if (TREE_READONLY (access->base))
+   return false;
 
+ gassign *stmt;
  if (access->grp_partial_lhs)
repl = force_gimple_operand_gsi (stmt_gsi, repl, true,
 NULL_TREE, true,


[gcc r14-11797] tree-sra: Do not create stores into const aggregates (PR111873)

2025-05-22 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:92d8b9970ea2ed59010a5f1a394cb98adffa63e8

commit r14-11797-g92d8b9970ea2ed59010a5f1a394cb98adffa63e8
Author: Martin Jambor 
Date:   Wed May 14 12:08:24 2025 +0200

tree-sra: Do not create stores into const aggregates (PR111873)

This patch fixes (hopefully) the one remaining place where gimple SRA
was still creating a store into a const aggregate.  It occurs when
there is a replacement for a load but that replacement is not type
compatible, typically because the aggregate is a single-field structure.

I have used testcases from duplicates because the original test-case
no longer reproduces for me.

gcc/ChangeLog:

2025-05-13  Martin Jambor  

PR tree-optimization/111873
* tree-sra.cc (sra_modify_expr): When processing a load which has
a type-incompatible replacement, do not store the contents of the
replacement into the original aggregate when that aggregate is
const.

gcc/testsuite/ChangeLog:

2025-05-13  Martin Jambor  

* gcc.dg/ipa/pr120044-1.c: New test.
* gcc.dg/ipa/pr120044-2.c: Likewise.
* gcc.dg/tree-ssa/pr114864.c: Likewise.

(cherry picked from commit 9d039eff453f777c58642ff16178c1ce2a4be6ab)

Diff:
---
 gcc/testsuite/gcc.dg/ipa/pr120044-1.c    | 17 +
 gcc/testsuite/gcc.dg/ipa/pr120044-2.c    | 17 +
 gcc/testsuite/gcc.dg/tree-ssa/pr114864.c | 15 +++
 gcc/tree-sra.cc                          |  4 +++-
 4 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-1.c b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c
new file mode 100644
index ..f9fee3e85afb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-inline" } */
+
+struct a {
+  int b;
+} const c;
+void d(char p, struct a e) {
+  while (e.b)
+    ;
+}
+static unsigned short f(const struct a g) {
+  d(g.b, g);
+  return g.b;
+}
+int main() {
+  return f(c);
+}
diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-2.c b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c
new file mode 100644
index ..5130791f5444
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-ipa-cp" } */
+
+struct a {
+  int b;
+} const c;
+void d(char p, struct a e) {
+  while (e.b)
+    ;
+}
+static unsigned short f(const struct a g) {
+  d(g.b, g);
+  return g.b;
+}
+int main() {
+  return f(c);
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c
new file mode 100644
index ..cd9b94c094fc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-options "-O1 -fno-tree-dce -fno-tree-fre" } */
+
+struct a {
+  int b;
+} const c;
+void d(const struct a f) {}
+void e(const struct a f) {
+  f.b == 0 ? 1 : f.b;
+  d(f);
+}
+int main() {
+  e(c);
+  return 0;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 46ddd41fdcb9..6e09476418cd 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -4179,8 +4179,10 @@ sra_modify_expr (tree *expr, bool write, gimple_stmt_iterator *stmt_gsi,
}
  else
{
- gassign *stmt;
+ if (TREE_READONLY (access->base))
+   return false;
 
+ gassign *stmt;
  if (access->grp_partial_lhs)
repl = force_gimple_operand_gsi (stmt_gsi, repl, true,
 NULL_TREE, true,


[gcc r13-9724] tree-sra: Do not create stores into const aggregates (PR111873)

2025-05-28 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:a067a18d42e338aea990347bb4d16d6a852c4480

commit r13-9724-ga067a18d42e338aea990347bb4d16d6a852c4480
Author: Martin Jambor 
Date:   Wed May 14 12:08:24 2025 +0200

tree-sra: Do not create stores into const aggregates (PR111873)

This patch fixes (hopefully the) one remaining place where gimple SRA
was still creating a store into const aggregates.  It occurs when there
is a replacement for a load but that replacement is not type
compatible - typically because the aggregate is a single-field structure.

I have used testcases from duplicates because the original test-case
no longer reproduces for me.

gcc/ChangeLog:

2025-05-13  Martin Jambor  

PR tree-optimization/111873
* tree-sra.cc (sra_modify_expr): When processing a load which has
a type-incompatible replacement, do not store the contents of the
replacement into the original aggregate when that aggregate is
const.

gcc/testsuite/ChangeLog:

2025-05-13  Martin Jambor  

* gcc.dg/ipa/pr120044-1.c: New test.
* gcc.dg/ipa/pr120044-2.c: Likewise.
* gcc.dg/tree-ssa/pr114864.c: Likewise.

(cherry picked from commit 9d039eff453f777c58642ff16178c1ce2a4be6ab)

Diff:
---
 gcc/testsuite/gcc.dg/ipa/pr120044-1.c    | 17 +
 gcc/testsuite/gcc.dg/ipa/pr120044-2.c    | 17 +
 gcc/testsuite/gcc.dg/tree-ssa/pr114864.c | 15 +++
 gcc/tree-sra.cc                          |  4 +++-
 4 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-1.c b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c
new file mode 100644
index ..f9fee3e85afb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120044-1.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-inline" } */
+
+struct a {
+  int b;
+} const c;
+void d(char p, struct a e) {
+  while (e.b)
+    ;
+}
+static unsigned short f(const struct a g) {
+  d(g.b, g);
+  return g.b;
+}
+int main() {
+  return f(c);
+}
diff --git a/gcc/testsuite/gcc.dg/ipa/pr120044-2.c b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c
new file mode 100644
index ..5130791f5444
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120044-2.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-early-inlining -fno-tree-fre -fno-tree-pre -fno-code-hoisting -fno-ipa-cp" } */
+
+struct a {
+  int b;
+} const c;
+void d(char p, struct a e) {
+  while (e.b)
+    ;
+}
+static unsigned short f(const struct a g) {
+  d(g.b, g);
+  return g.b;
+}
+int main() {
+  return f(c);
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c
new file mode 100644
index ..cd9b94c094fc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114864.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-options "-O1 -fno-tree-dce -fno-tree-fre" } */
+
+struct a {
+  int b;
+} const c;
+void d(const struct a f) {}
+void e(const struct a f) {
+  f.b == 0 ? 1 : f.b;
+  d(f);
+}
+int main() {
+  e(c);
+  return 0;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index ec499fdd5109..c3c0a70338d2 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -3988,8 +3988,10 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
}
  else
{
- gassign *stmt;
+ if (TREE_READONLY (access->base))
+   return false;
 
+ gassign *stmt;
  if (access->grp_partial_lhs)
repl = force_gimple_operand_gsi (gsi, repl, true, NULL_TREE,
 true, GSI_SAME_STMT);


[gcc r16-960] ipa: When inlining, don't combine PT JFs changing signedness (PR120295)

2025-05-29 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:0b004c92f5ea239936a403a2a757e12ca82ce6d8

commit r16-960-g0b004c92f5ea239936a403a2a757e12ca82ce6d8
Author: Martin Jambor 
Date:   Thu May 29 16:32:04 2025 +0200

ipa: When inlining, don't combine PT JFs changing signedness (PR120295)

In GCC 15 we allowed jump-function generation code to skip over a
type-cast converting one integer to another as long as the latter can
hold all the values of the former or has at least the same precision.
This works well for IPA-CP, where we then evaluate each jump
function as we propagate values and value-ranges.  However, the
test-case in PR 120295 shows a problem with inlining, where we combine
pass-through jump-functions so that they are always relative to the
function which is the root of the inline tree.  Unfortunately, we are
happy to also combine those with type-casts to a different signedness,
which makes us use zero extension for the expected value ranges
where we should have used sign extension.  When the value-range which
then leads to the wrong insertion of a call to builtin_unreachable is
being computed, the information about the existence of an intermediary
signed type has already been lost during previous inlining.
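
To make the sign versus zero extension difference concrete, here is a
small stand-alone C sketch.  It is not taken from the PR 120295 test-case;
the function names are made up for illustration.  It only shows that
widening the same 8-bit value gives different results depending on
whether the intermediate signed type is remembered.

```c
#include <assert.h>
#include <stdint.h>

/* Widen through the signed intermediate type: the value is
   sign-extended before being reinterpreted as unsigned.  */
uint32_t
widen_keeping_signed_step (int8_t v)
{
  return (uint32_t) (int32_t) v;
}

/* Widen as if the intermediate type had been unsigned: the low
   8 bits are zero-extended.  */
uint32_t
widen_forgetting_signed_step (int8_t v)
{
  return (uint32_t) (uint8_t) v;
}

/* Self-check documenting the behaviour for a negative and a
   non-negative input.  */
void
check_widening (void)
{
  assert (widen_keeping_signed_step (-1) == 0xFFFFFFFFu);
  assert (widen_forgetting_signed_step (-1) == 255u);
  assert (widen_keeping_signed_step (5) == widen_forgetting_signed_step (5));
}
```

The two functions agree for non-negative inputs and disagree for
negative ones, which is why a value range computed under the wrong
assumption can exclude values that do occur at run time.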

This patch simply blocks combining such jump-functions so that it is
back-portable to GCC 15.  Once we switch pass-through jump functions
to use a vector of operations rather than having room for just one, we
will be able to address this situation by adding an extra conversion
instead.
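
The blocking rule can be sketched with a toy predicate in C.  Everything
here is a simplification for illustration: struct toy_type,
toy_trivially_convertible and toy_may_combine are hypothetical names,
and the "trivially convertible" test is merely assumed to mean "all
fields equal", which is far cruder than GCC's useless_type_conversion_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-simplified description of a parameter type.  */
struct toy_type
{
  bool integral;      /* models INTEGRAL_TYPE_P */
  bool is_unsigned;   /* models TYPE_UNSIGNED */
  unsigned precision; /* width in bits */
};

/* Assumption: two toy types convert trivially only if they are
   identical in every field.  */
static bool
toy_trivially_convertible (const struct toy_type *a,
			   const struct toy_type *b)
{
  return a->integral == b->integral
	 && a->is_unsigned == b->is_unsigned
	 && a->precision == b->precision;
}

/* Return true if pass-through jump functions relating parameters of
   these two types may be combined in this toy model.  */
bool
toy_may_combine (const struct toy_type *old_root_param,
		 const struct toy_type *new_root_param)
{
  assert (old_root_param != NULL && new_root_param != NULL);
  if (toy_trivially_convertible (old_root_param, new_root_param))
    return true;
  /* Refuse when either type is not integral or signedness differs.  */
  if (!old_root_param->integral
      || !new_root_param->integral
      || old_root_param->is_unsigned != new_root_param->is_unsigned)
    return false;
  return true;
}
```

A refusal in this model corresponds to the patch setting the resulting
jump function to unknown instead of combining.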

gcc/ChangeLog:

2025-05-19  Martin Jambor  

PR ipa/120295
* ipa-prop.cc (update_jump_functions_after_inlining): Do not
combine pass-through jump functions with type-casts changing
signedness.

gcc/testsuite/ChangeLog:

2025-05-19  Martin Jambor  

PR ipa/120295
* gcc.dg/ipa/pr120295.c: New test.

Diff:
---
 gcc/ipa-prop.cc                     | 28 
 gcc/testsuite/gcc.dg/ipa/pr120295.c | 66 +
 2 files changed, 94 insertions(+)

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 24a538034e31..84d4fb5db674 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -3330,6 +3330,10 @@ update_jump_functions_after_inlining (struct cgraph_edge *cs,
   ipa_edge_args *args = ipa_edge_args_sum->get (e);
   if (!args)
 return;
+  ipa_node_params *old_inline_root_info = ipa_node_params_sum->get (cs->callee);
+  ipa_node_params *new_inline_root_info
+= ipa_node_params_sum->get (cs->caller->inlined_to
+   ? cs->caller->inlined_to : cs->caller);
   int count = ipa_get_cs_argument_count (args);
   int i;
 
@@ -3541,6 +3545,30 @@ update_jump_functions_after_inlining (struct cgraph_edge *cs,
enum tree_code operation;
operation = ipa_get_jf_pass_through_operation (src);
 
+   tree old_ir_ptype = ipa_get_type (old_inline_root_info,
+ dst_fid);
+   tree new_ir_ptype = ipa_get_type (new_inline_root_info,
+ formal_id);
+   if (!useless_type_conversion_p (old_ir_ptype, new_ir_ptype))
+ {
+   /* Jump-function construction now permits type-casts
+  from an integer to another if the latter can hold
+  all values or has at least the same precision.
+  However, as we're combining multiple pass-through
+  functions together, we are losing information about
+  signedness and thus if conversions should sign or
+  zero extend.  Therefore we must prevent combining
+  such jump-function if signednesses do not match.  */
+   if (!INTEGRAL_TYPE_P (old_ir_ptype)
+   || !INTEGRAL_TYPE_P (new_ir_ptype)
+   || (TYPE_UNSIGNED (new_ir_ptype)
+   != TYPE_UNSIGNED (old_ir_ptype)))
+ {
+   ipa_set_jf_unknown (dst);
+   continue;
+ }
+ }
+
if (operation == NOP_EXPR)
  {
bool agg_p;
diff --git a/gcc/testsuite/gcc.dg/ipa/pr120295.c b/gcc/testsuite/gcc.dg/ipa/pr120295.c
new file mode 100644
index ..2033ee9493d2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120295.c
@@ -0,0 +1,66 @@
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+struct {
+  signed a;
+} b;
+int a, f, j, l;
+char c, k, g, e;
+short d[2] = {0};
+int *i = &j;
+
+volatile int glob;
+void __attribute__((noipa)) sth (const char *, int a)
+{
+  glob = a;
+  return;
+

[gcc r16-959] ipa: Fix whitespace when dumping VR in jump_functions

2025-05-29 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:71e6b7b26a5169d217a62f34acbbc43c592b24bd

commit r16-959-g71e6b7b26a5169d217a62f34acbbc43c592b24bd
Author: Martin Jambor 
Date:   Thu May 29 16:32:04 2025 +0200

ipa: Fix whitespace when dumping VR in jump_functions

Lack of whitespace breaks the tree-visualisation structure and makes
the dump unnecessarily difficult to read.

gcc/ChangeLog:

2025-05-19  Martin Jambor  

* ipa-prop.cc (ipa_dump_jump_function): Fix whitespace when
dumping IPA VRs.

Diff:
---
 gcc/ipa-prop.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 0398d69962f8..24a538034e31 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -542,6 +542,7 @@ ipa_dump_jump_function (FILE *f, ipa_jump_func *jump_func,
 
   if (jump_func->m_vr)
 {
+  fprintf (f, " ");
   jump_func->m_vr->dump (f);
   fprintf (f, "\n");
 }