[gcc(refs/users/aoliva/heads/testme)] fold fold_truth_andor field merging into ifcombine

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:ff06b3bd1d35c04904a3ddd3c38080b71f490294

commit ff06b3bd1d35c04904a3ddd3c38080b71f490294
Author: Alexandre Oliva 
Date:   Thu Nov 28 18:44:28 2024 -0300

fold fold_truth_andor field merging into ifcombine

This patch introduces various improvements to the logic that merges
field compares, while moving it into ifcombine.

Before the patch, we could merge:

  (a.x1 EQNE b.x1)  ANDOR  (a.y1 EQNE b.y1)

into something like:

  (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

if both of A's fields live within the same alignment boundaries, and
so do B's, at the same relative positions.  Constants may be used
instead of the object B.
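
As an illustration only (not taken from the patch or its testsuite),
the following C sketch shows the kind of equivalence this merge relies
on.  The struct layout, the function names, and the use of memcpy to
read the containing word are assumptions made for the example; GCC
performs the transformation on its internal representation, not on
source code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: x1 and y1 share one aligned 32-bit word.
   z lives in the same word but takes no part in the compare.  */
struct s { uint8_t x1; uint8_t y1; uint16_t z; };

/* Source-level form: two separate field compares.  */
int eq_fields (const struct s *a, const struct s *b)
{
  return a->x1 == b->x1 && a->y1 == b->y1;
}

/* Roughly what the merged form computes: a single masked word compare.  */
int eq_merged (const struct s *a, const struct s *b)
{
  /* All-ones in the compared fields only.  Building the mask from the
     struct itself keeps the sketch independent of endianness.  */
  struct s m = { 0xff, 0xff, 0 };
  uint32_t wa, wb, mask;

  assert (sizeof (struct s) == sizeof (uint32_t));
  memcpy (&wa, a, sizeof wa);
  memcpy (&wb, b, sizeof wb);
  memcpy (&mask, &m, sizeof mask);
  return (wa & mask) == (wb & mask);
}
```

Both functions agree for any pair of objects, because the mask removes
the bits of z before the words are compared.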

The initial goal of this patch was to enable such combinations when a
field crossed alignment boundaries, e.g. for packed types.  We can't
generally access such fields with a single memory access, so when we
come across such a compare, we will attempt to combine each access
separately.
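
A hedged sketch of the split case follows.  The packed struct, its
field names, and the helper functions are hypothetical; the attribute
syntax is the GNU one.  Here val occupies bytes 1 to 4, so it straddles
the boundary between two aligned 32-bit words, and the compare is
carried out as one masked compare per word.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packed layout: val sits at byte offset 1 and therefore
   crosses the 4-byte boundary between word 0 and word 1.  */
struct __attribute__ ((packed)) p
{
  uint8_t tag;
  uint32_t val;
  uint8_t end;
  uint16_t rest;
};

/* Source-level form: one compare of the straddling field.  */
int val_eq (const struct p *a, const struct p *b)
{
  return a->val == b->val;
}

/* Split form: each aligned word is loaded and compared separately,
   masking out the bits that do not belong to val.  */
int val_eq_split (const struct p *a, const struct p *b)
{
  struct p m = { 0, 0xffffffffu, 0, 0 };  /* all-ones in val only */
  uint32_t wa[2], wb[2], mask[2];

  assert (sizeof (struct p) == sizeof wa);
  memcpy (wa, a, sizeof wa);
  memcpy (wb, b, sizeof wb);
  memcpy (mask, &m, sizeof mask);
  return (wa[0] & mask[0]) == (wb[0] & mask[0])
	 && (wa[1] & mask[1]) == (wb[1] & mask[1]);
}
```

Each of the two word compares in val_eq_split is then a candidate for
merging with other compares that touch the same word.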

Some merging opportunities were missed because of right-shifts,
compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
narrowing conversions, especially after earlier merges.  This patch
introduces handlers for several cases involving these.
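
The forms mentioned above are all equivalent to a plain masked
compare, which is why recognizing them exposes more merging
opportunities.  The sketch below spells out the equivalences in C; the
function names and the 8-bit shift are made up for the example and do
not correspond to anything in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Compare expressed with xor: ((a ^ b) & MASK) == 0.  */
int eq_xor (uint32_t ax, uint32_t bx, uint32_t mask)
{
  return ((ax ^ bx) & mask) == 0;
}

/* The same compare expressed directly on masked operands.  */
int eq_masked (uint32_t ax, uint32_t bx, uint32_t mask)
{
  return (ax & mask) == (bx & mask);
}

/* A compare hidden behind right-shifts; it is equivalent to
   eq_masked (ax, bx, 0xff00).  */
int eq_shifted (uint32_t ax, uint32_t bx)
{
  return ((ax >> 8) & 0xff) == ((bx >> 8) & 0xff);
}

/* Check that the xor and masked forms agree for every pair of 8-bit
   operands under the given mask.  */
void check_forms_agree (uint32_t mask)
{
  for (uint32_t a = 0; a < 256; a++)
    for (uint32_t b = 0; b < 256; b++)
      assert (eq_xor (a, b, mask) == eq_masked (a, b, mask));
}
```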

The merging of multiple field accesses into wider bitfield-like
accesses is undesirable to do too early in compilation, so we move it
from folding to ifcombine.

When it is the second of a noncontiguous pair of compares that first
accesses a word, we may merge the first compare with part of the
second compare that refers to the same word, keeping the compare of
the remaining bits at the spot where the second compare used to be.

Handling compares with non-constant fields was somewhat generalized
from what fold used to do, now handling non-adjacent fields, even if a
field of one object crosses an alignment boundary but the other
doesn't.


for  gcc/ChangeLog

* fold-const.cc (make_bit_field_ref): Export.
(unextend, all_ones_mask_p): Drop.
(decode_field_reference, fold_truth_andor_1): Move
field compare merging logic...
* gimple-fold.cc (fold_truth_andor_for_ifcombine): ... here.
(compute_split_boundary_from_align): New.
(make_bit_field_load, build_split_load): New.
(reuse_split_load): New.
* fold-const.h (make_bit_field_ref): Declare.
(fold_truth_andor_for_ifcombine): Declare.
* match.pd (any_convert, bit_and_cst, rshift_cst): New.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Try
fold_truth_andor_for_ifcombine.

for  gcc/testsuite/ChangeLog

* gcc.dg/field-merge-1.c: New.
* gcc.dg/field-merge-2.c: New.
* gcc.dg/field-merge-3.c: New.
* gcc.dg/field-merge-4.c: New.
* gcc.dg/field-merge-5.c: New.
* gcc.dg/field-merge-6.c: New.
* gcc.dg/field-merge-7.c: New.
* gcc.dg/field-merge-8.c: New.
* gcc.dg/field-merge-9.c: New.
* gcc.dg/field-merge-10.c: New.
* gcc.dg/field-merge-11.c: New.

Diff:
---
 gcc/fold-const.cc |  512 +--
 gcc/fold-const.h  |   10 +
 gcc/gimple-fold.cc| 1107 +
 gcc/match.pd  |   11 +
 gcc/testsuite/gcc.dg/field-merge-1.c  |   64 ++
 gcc/testsuite/gcc.dg/field-merge-10.c |   36 ++
 gcc/testsuite/gcc.dg/field-merge-11.c |   32 +
 gcc/testsuite/gcc.dg/field-merge-2.c  |   31 +
 gcc/testsuite/gcc.dg/field-merge-3.c  |   36 ++
 gcc/testsuite/gcc.dg/field-merge-4.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-5.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-6.c  |   26 +
 gcc/testsuite/gcc.dg/field-merge-7.c  |   23 +
 gcc/testsuite/gcc.dg/field-merge-8.c  |   25 +
 gcc/testsuite/gcc.dg/field-merge-9.c  |   36 ++
 gcc/tree-ssa-ifcombine.cc |   14 +-
 16 files changed, 1534 insertions(+), 509 deletions(-)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 1e8ae1ab493b..644966459864 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -137,7 +137,6 @@ static tree range_successor (tree);
 static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
 static tree fold_cond_expr_with_comparison (location_t, tree, enum tree_code,
tree, tree, tree, tree);
-static tree unextend (tree, int, int, tree);
 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *);
 static tree fold_binary_op_with_conditional_arg (location_t,
@@ -4711,7 +4710,7 @@ invert_truthvalue_loc (location_t loc, tree arg)
is the original memory reference used to preserve the alias set of
the

[gcc(refs/users/aoliva/heads/testme)] ifcombine: avoid unsound forwarder-enabled combinations [PR117723]

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:1682b6669ba3cc8562a8fcc3c7eda00a313c9c43

commit 1682b6669ba3cc8562a8fcc3c7eda00a313c9c43
Author: Alexandre Oliva 
Date:   Thu Nov 28 21:56:34 2024 -0300

ifcombine: avoid unsound forwarder-enabled combinations [PR117723]

When ifcombining contiguous blocks, we can follow forwarder blocks and
reverse conditions to enable combinations, but when there are
intervening blocks, we have to constrain ourselves to paths to the
exit that share the PHI args with all intervening blocks.

Simply not considering forwarders when intervening blocks are present
would match the preexisting test, but we can do better: record whether
a forwarded path corresponds to the outer block's exit path, and
refuse to combine through any path other than the one that was
verified as corresponding.  The latter is what this patch implements.

While at it, I've fixed some typos and introduced early tests before
computing the exit path, so that we skip the computation when it would
be wasteful, or when skipping it can enable other sound combinations.
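
To make the hazard concrete, here is a hedged C sketch.  f0_ref has
the same shape as f0 in the new testcase; f0_unsound is a hypothetical
hand-written miscombination, not necessarily the code GCC used to
emit.  It merges the outer and inner tests even though the intervening
block (the test of b) reaches its exit with a different return value.

```c
#include <assert.h>

/* Same shape as f0 in gcc.dg/torture/ifcmb-1.c.  */
int f0_ref (int a, int b)
{
  if (a & 1)
    return 0;
  if (b)
    return 1;
  if (!(a & 2))
    return 0;
  else
    return 1;
}

/* Hypothetical unsound result: the first and third conditions are
   combined and tested ahead of the intervening test of b.  */
int f0_unsound (int a, int b)
{
  if ((a & 1) || !(a & 2))
    return 0;
  if (b)
    return 1;
  return 1;
}

/* Count the inputs, over the low two bits of a and both truth values
   of b, on which the two versions disagree.  */
int count_mismatches (void)
{
  int n = 0;
  for (int a = 0; a < 4; a++)
    for (int b = 0; b < 2; b++)
      if (f0_ref (a, b) != f0_unsound (a, b))
	n++;
  return n;
}

/* The check the testcase performs for f0.  */
void check_f0 (void)
{
  assert (f0_ref (0, 1) == 1);
}
```

The two versions differ for a == 0, b == 1, which is the input the
testcase uses for f0.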


for  gcc/ChangeLog

PR tree-optimization/117723
* tree-ssa-ifcombine.cc (tree_ssa_ifcombine_bb): Record
forwarder blocks in path to exit, and stick to them.  Avoid
computing the exit if obviously not needed, and if that
enables additional optimizations.
(tree_ssa_ifcombine_bb_1): Fix typos.

for  gcc/testsuite/ChangeLog

PR tree-optimization/117723
* gcc.dg/torture/ifcmb-1.c: New.

Diff:
---
 gcc/testsuite/gcc.dg/torture/ifcmb-1.c |  63 ++
 gcc/tree-ssa-ifcombine.cc  | 117 -
 2 files changed, 162 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/torture/ifcmb-1.c b/gcc/testsuite/gcc.dg/torture/ifcmb-1.c
new file mode 100644
index ..2431a548598f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/ifcmb-1.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+
+/* Test that we do NOT perform unsound transformations for any of these cases.
+   Forwarding blocks to the exit block used to enable some of them.  */
+
+[[gnu::noinline]]
+int f0 (int a, int b) {
+  if ((a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if (!(a & 2))
+    return 0;
+  else
+    return 1;
+}
+
+[[gnu::noinline]]
+int f1 (int a, int b) {
+  if (!(a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if ((a & 2))
+    return 1;
+  else
+    return 0;
+}
+
+[[gnu::noinline]]
+int f2 (int a, int b) {
+  if ((a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if (!(a & 2))
+    return 0;
+  else
+    return 1;
+}
+
+[[gnu::noinline]]
+int f3 (int a, int b) {
+  if (!(a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if ((a & 2))
+    return 1;
+  else
+    return 0;
+}
+
+int main() {
+  if (f0 (0, 1) != 1)
+    __builtin_abort();
+  if (f1 (1, 1) != 1)
+    __builtin_abort();
+  if (f2 (2, 1) != 1)
+    __builtin_abort();
+  if (f3 (3, 1) != 1)
+    __builtin_abort();
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index e389b12aa37d..6f6c0f342746 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -1077,7 +1077,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
 }
 
   /* The || form is characterized by a common then_bb with the
- two edges leading to it mergable.  The latter is guaranteed
+ two edges leading to it mergeable.  The latter is guaranteed
  by matching PHI arguments in the then_bb and the inner cond_bb
  having no side-effects.  */
   if (phi_pred_bb != then_bb
@@ -1088,7 +1088,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto then_bb; else goto inner_cond_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1104,7 +1104,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto inner_cond_bb; else goto then_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1139,19 +1139,25 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
  Look for an OUTER_COND_BBs to combine with INNER_COND_BB.  They need not
  be contiguous, as long as inner and intervening blocks have no side
  effects, and are either single-entry-single-exit or conditionals choosing
- between the same EXIT_BB with the same PHI args, and the path leading to
- INNER_COND_BB.  ??? We could potentially handle multi-block
- single-entry-single-exit regions, but the loop below only deals with
- single-entry-single-exit individual intervening blocks.  Larger regions
- without side effects are presumably rare, so 

[gcc/aoliva/heads/testme] (3 commits) ifcombine: don't try xor on right-hand op

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testme' was updated to point to:

 93e2a2990abb... ifcombine: don't try xor on right-hand op

It previously pointed to:

 be442be963ba... ifcombine: don't try xor on right-hand op

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  be442be... ifcombine: don't try xor on right-hand op
  1df512b... fold fold_truth_andor field merging into ifcombine
  75e065e... ifcombine: avoid unsound forwarder-enabled combinations [PR


Summary of changes (added commits):
---

  93e2a29... ifcombine: don't try xor on right-hand op
  ff06b3b... fold fold_truth_andor field merging into ifcombine
  1682b66... ifcombine: avoid unsound forwarder-enabled combinations [PR


[gcc(refs/users/aoliva/heads/testme)] ifcombine: don't try xor on right-hand op

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:93e2a2990abb2b2085207c7008f5070e69746d87

commit 93e2a2990abb2b2085207c7008f5070e69746d87
Author: Alexandre Oliva 
Date:   Thu Nov 28 18:44:35 2024 -0300

ifcombine: don't try xor on right-hand op

Diff:
---
 gcc/gimple-fold.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 6ac7f1593ad3..cb560ba456ba 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -7485,6 +7485,10 @@ decode_field_reference (tree *pexp, HOST_WIDE_INT *pbitsize,
  exp = res_ops[1];
  gcc_checking_assert (!xor_cmp_op);
}
+  else if (!xor_cmp_op)
+   /* Not much we can do when xor appears in the right-hand compare
+  operand.  */
+   return NULL_TREE;
   else
{
  *xor_p = true;


[gcc/aoliva/heads/testme] ifcombine: avoid unsound forwarder-enabled combinations [PR

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testme' was updated to point to:

 5f282e81f4c3... ifcombine: avoid unsound forwarder-enabled combinations [PR

It previously pointed to:

 2a07a07a42a9... ifcombine: don't try xor on right-hand op

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  2a07a07... ifcombine: don't try xor on right-hand op
  696dcb8... fold fold_truth_andor field merging into ifcombine


[gcc(refs/users/aoliva/heads/testme)] ifcombine: avoid unsound forwarder-enabled combinations [PR117723]

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:5f282e81f4c3cb625d7b39e3fa312dd63fccbff3

commit 5f282e81f4c3cb625d7b39e3fa312dd63fccbff3
Author: Alexandre Oliva 
Date:   Thu Nov 28 21:56:34 2024 -0300

ifcombine: avoid unsound forwarder-enabled combinations [PR117723]

When ifcombining contiguous blocks, we can follow forwarder blocks and
reverse conditions to enable combinations, but when there are
intervening blocks, we have to constrain ourselves to paths to the
exit that share the PHI args with all intervening blocks.

Simply not considering forwarders when intervening blocks are present
would match the preexisting test, but we can do better: record whether
a forwarded path corresponds to the outer block's exit path, and
refuse to combine through any path other than the one that was
verified as corresponding.  The latter is what this patch implements.

While at it, I've fixed some typos and introduced early tests before
computing the exit path, so that we skip the computation when it would
be wasteful, or when skipping it can enable other sound combinations.


for  gcc/ChangeLog

PR tree-optimization/117723
* tree-ssa-ifcombine.cc (tree_ssa_ifcombine_bb): Record
forwarder blocks in path to exit, and stick to them.  Avoid
computing the exit if obviously not needed, and if that
enables additional optimizations.
(tree_ssa_ifcombine_bb_1): Fix typos.

for  gcc/testsuite/ChangeLog

PR tree-optimization/117723
* gcc.dg/torture/ifcmb-1.c: New.

Diff:
---
 gcc/testsuite/gcc.dg/torture/ifcmb-1.c |  63 ++
 gcc/tree-ssa-ifcombine.cc  | 116 -
 2 files changed, 161 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/torture/ifcmb-1.c b/gcc/testsuite/gcc.dg/torture/ifcmb-1.c
new file mode 100644
index ..2431a548598f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/ifcmb-1.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+
+/* Test that we do NOT perform unsound transformations for any of these cases.
+   Forwarding blocks to the exit block used to enable some of them.  */
+
+[[gnu::noinline]]
+int f0 (int a, int b) {
+  if ((a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if (!(a & 2))
+    return 0;
+  else
+    return 1;
+}
+
+[[gnu::noinline]]
+int f1 (int a, int b) {
+  if (!(a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if ((a & 2))
+    return 1;
+  else
+    return 0;
+}
+
+[[gnu::noinline]]
+int f2 (int a, int b) {
+  if ((a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if (!(a & 2))
+    return 0;
+  else
+    return 1;
+}
+
+[[gnu::noinline]]
+int f3 (int a, int b) {
+  if (!(a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if ((a & 2))
+    return 1;
+  else
+    return 0;
+}
+
+int main() {
+  if (f0 (0, 1) != 1)
+    __builtin_abort();
+  if (f1 (1, 1) != 1)
+    __builtin_abort();
+  if (f2 (2, 1) != 1)
+    __builtin_abort();
+  if (f3 (3, 1) != 1)
+    __builtin_abort();
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index e389b12aa37d..a87bf1210776 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -1077,7 +1077,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
 }
 
   /* The || form is characterized by a common then_bb with the
- two edges leading to it mergable.  The latter is guaranteed
+ two edges leading to it mergeable.  The latter is guaranteed
  by matching PHI arguments in the then_bb and the inner cond_bb
  having no side-effects.  */
   if (phi_pred_bb != then_bb
@@ -1088,7 +1088,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto then_bb; else goto inner_cond_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1104,7 +1104,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto inner_cond_bb; else goto then_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1139,13 +1139,18 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
  Look for an OUTER_COND_BBs to combine with INNER_COND_BB.  They need not
  be contiguous, as long as inner and intervening blocks have no side
  effects, and are either single-entry-single-exit or conditionals choosing
- between the same EXIT_BB with the same PHI args, and the path leading to
- INNER_COND_BB.  ??? We could potentially handle multi-block
- single-entry-single-exit regions, but the loop below only deals with
- single-entry-single-exit individual intervening blocks.  Larger regions
- without side effects are presumably rare, so 

[gcc(refs/users/aoliva/heads/testme)] fold fold_truth_andor field merging into ifcombine

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:696dcb8933dc84d6a32cd0e6ec502ba865a012fa

commit 696dcb8933dc84d6a32cd0e6ec502ba865a012fa
Author: Alexandre Oliva 
Date:   Thu Nov 28 18:44:28 2024 -0300

fold fold_truth_andor field merging into ifcombine

This patch introduces various improvements to the logic that merges
field compares, while moving it into ifcombine.

Before the patch, we could merge:

  (a.x1 EQNE b.x1)  ANDOR  (a.y1 EQNE b.y1)

into something like:

  (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

if both of A's fields live within the same alignment boundaries, and
so do B's, at the same relative positions.  Constants may be used
instead of the object B.

The initial goal of this patch was to enable such combinations when a
field crossed alignment boundaries, e.g. for packed types.  We can't
generally access such fields with a single memory access, so when we
come across such a compare, we will attempt to combine each access
separately.

Some merging opportunities were missed because of right-shifts,
compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
narrowing conversions, especially after earlier merges.  This patch
introduces handlers for several cases involving these.

The merging of multiple field accesses into wider bitfield-like
accesses is undesirable to do too early in compilation, so we move it
from folding to ifcombine.

When it is the second of a noncontiguous pair of compares that first
accesses a word, we may merge the first compare with part of the
second compare that refers to the same word, keeping the compare of
the remaining bits at the spot where the second compare used to be.

Handling compares with non-constant fields was somewhat generalized
from what fold used to do, now handling non-adjacent fields, even if a
field of one object crosses an alignment boundary but the other
doesn't.


for  gcc/ChangeLog

* fold-const.cc (make_bit_field_ref): Export.
(unextend, all_ones_mask_p): Drop.
(decode_field_reference, fold_truth_andor_1): Move
field compare merging logic...
* gimple-fold.cc (fold_truth_andor_for_ifcombine): ... here.
(compute_split_boundary_from_align): New.
(make_bit_field_load, build_split_load): New.
(reuse_split_load): New.
* fold-const.h (make_bit_field_ref): Declare.
(fold_truth_andor_for_ifcombine): Declare.
* match.pd (any_convert, bit_and_cst, rshift_cst): New.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Try
fold_truth_andor_for_ifcombine.

for  gcc/testsuite/ChangeLog

* gcc.dg/field-merge-1.c: New.
* gcc.dg/field-merge-2.c: New.
* gcc.dg/field-merge-3.c: New.
* gcc.dg/field-merge-4.c: New.
* gcc.dg/field-merge-5.c: New.
* gcc.dg/field-merge-6.c: New.
* gcc.dg/field-merge-7.c: New.
* gcc.dg/field-merge-8.c: New.
* gcc.dg/field-merge-9.c: New.
* gcc.dg/field-merge-10.c: New.
* gcc.dg/field-merge-11.c: New.

Diff:
---
 gcc/fold-const.cc |  512 +--
 gcc/fold-const.h  |   10 +
 gcc/gimple-fold.cc| 1107 +
 gcc/match.pd  |   11 +
 gcc/testsuite/gcc.dg/field-merge-1.c  |   64 ++
 gcc/testsuite/gcc.dg/field-merge-10.c |   36 ++
 gcc/testsuite/gcc.dg/field-merge-11.c |   32 +
 gcc/testsuite/gcc.dg/field-merge-2.c  |   31 +
 gcc/testsuite/gcc.dg/field-merge-3.c  |   36 ++
 gcc/testsuite/gcc.dg/field-merge-4.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-5.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-6.c  |   26 +
 gcc/testsuite/gcc.dg/field-merge-7.c  |   23 +
 gcc/testsuite/gcc.dg/field-merge-8.c  |   25 +
 gcc/testsuite/gcc.dg/field-merge-9.c  |   36 ++
 gcc/tree-ssa-ifcombine.cc |   14 +-
 16 files changed, 1534 insertions(+), 509 deletions(-)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 1e8ae1ab493b..644966459864 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -137,7 +137,6 @@ static tree range_successor (tree);
 static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
 static tree fold_cond_expr_with_comparison (location_t, tree, enum tree_code,
tree, tree, tree, tree);
-static tree unextend (tree, int, int, tree);
 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *);
 static tree fold_binary_op_with_conditional_arg (location_t,
@@ -4711,7 +4710,7 @@ invert_truthvalue_loc (location_t loc, tree arg)
is the original memory reference used to preserve the alias set of
the

[gcc(refs/users/aoliva/heads/testme)] ifcombine: don't try xor on right-hand op

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:2a07a07a42a9f194fd07c34e89bdafcb8176a1ed

commit 2a07a07a42a9f194fd07c34e89bdafcb8176a1ed
Author: Alexandre Oliva 
Date:   Thu Nov 28 18:44:35 2024 -0300

ifcombine: don't try xor on right-hand op

Diff:
---
 gcc/gimple-fold.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 6ac7f1593ad3..cb560ba456ba 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -7485,6 +7485,10 @@ decode_field_reference (tree *pexp, HOST_WIDE_INT *pbitsize,
  exp = res_ops[1];
  gcc_checking_assert (!xor_cmp_op);
}
+  else if (!xor_cmp_op)
+   /* Not much we can do when xor appears in the right-hand compare
+  operand.  */
+   return NULL_TREE;
   else
{
  *xor_p = true;


[gcc/aoliva/heads/testme] (3 commits) ifcombine: don't try xor on right-hand op

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testme' was updated to point to:

 2a07a07a42a9... ifcombine: don't try xor on right-hand op

It previously pointed to:

 93e2a2990abb... ifcombine: don't try xor on right-hand op

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  93e2a29... ifcombine: don't try xor on right-hand op
  ff06b3b... fold fold_truth_andor field merging into ifcombine
  1682b66... ifcombine: avoid unsound forwarder-enabled combinations [PR


Summary of changes (added commits):
---

  2a07a07... ifcombine: don't try xor on right-hand op
  696dcb8... fold fold_truth_andor field merging into ifcombine
  5f282e8... ifcombine: avoid unsound forwarder-enabled combinations [PR


[gcc r15-5743] gimple-fold: Avoid ICEs with bogus declarations like const attribute no snprintf [PR117358]

2024-11-28 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:29032dfa57629d1713a97b17a785273823993a91

commit r15-5743-g29032dfa57629d1713a97b17a785273823993a91
Author: Jakub Jelinek 
Date:   Thu Nov 28 10:51:16 2024 +0100

gimple-fold: Avoid ICEs with bogus declarations like const attribute no snprintf [PR117358]

When one puts incorrect const or pure attributes on declarations of various
C APIs which have corresponding builtins (vs. what they actually do), we can
get tons of ICEs in gimple-fold.cc.

The following patch fixes it by giving up gimple_fold_builtin_* folding
if the functions don't have gimple_vdef (or for pure functions like
bcmp/strchr/strstr gimple_vuse) when in SSA form (during gimplification
they will surely have both of those NULL even when declared correctly,
yet it is highly desirable to fold them).

Or shall I replace
!gimple_vdef (stmt) && gimple_in_ssa_p (cfun)
tests with
(gimple_call_flags (stmt) & (ECF_CONST | ECF_PURE | ECF_NOVOPS)) != 0
and
!gimple_vuse (stmt) && gimple_in_ssa_p (cfun)
with
(gimple_call_flags (stmt) & (ECF_CONST | ECF_NOVOPS)) != 0
?

2024-11-28  Jakub Jelinek  

PR tree-optimization/117358
* gimple-fold.cc (gimple_fold_builtin_memory_op): Punt if stmt has no
vdef in ssa form.
(gimple_fold_builtin_bcmp): Punt if stmt has no vuse in ssa form.
(gimple_fold_builtin_bcopy): Punt if stmt has no vdef in ssa form.
(gimple_fold_builtin_bzero): Likewise.
(gimple_fold_builtin_memset): Likewise.  Use return false instead of
return NULL_TREE.
(gimple_fold_builtin_strcpy): Punt if stmt has no vdef in ssa form.
(gimple_fold_builtin_strncpy): Likewise.
(gimple_fold_builtin_strchr): Punt if stmt has no vuse in ssa form.
(gimple_fold_builtin_strstr): Likewise.
(gimple_fold_builtin_strcat): Punt if stmt has no vdef in ssa form.
(gimple_fold_builtin_strcat_chk): Likewise.
(gimple_fold_builtin_strncat): Likewise.
(gimple_fold_builtin_strncat_chk): Likewise.
(gimple_fold_builtin_string_compare): Likewise.
(gimple_fold_builtin_fputs): Likewise.
(gimple_fold_builtin_memory_chk): Likewise.
(gimple_fold_builtin_stxcpy_chk): Likewise.
(gimple_fold_builtin_stxncpy_chk): Likewise.
(gimple_fold_builtin_stpcpy): Likewise.
(gimple_fold_builtin_snprintf_chk): Likewise.
(gimple_fold_builtin_sprintf_chk): Likewise.
(gimple_fold_builtin_sprintf): Likewise.
(gimple_fold_builtin_snprintf): Likewise.
(gimple_fold_builtin_fprintf): Likewise.
(gimple_fold_builtin_printf): Likewise.
(gimple_fold_builtin_realloc): Likewise.

* gcc.c-torture/compile/pr117358.c: New test.

Diff:
---
 gcc/gimple-fold.cc | 76 --
 gcc/testsuite/gcc.c-torture/compile/pr117358.c | 17 ++
 2 files changed, 77 insertions(+), 16 deletions(-)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 5eedad54ced0..39112379c592 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -1061,6 +1061,8 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
}
   goto done;
 }
+  else if (!gimple_vdef (stmt) && gimple_in_ssa_p (cfun))
+return false;
   else
 {
   /* We cannot (easily) change the type of the copy if it is a storage
@@ -1511,6 +1513,8 @@ gimple_fold_builtin_bcmp (gimple_stmt_iterator *gsi)
   /* Transform bcmp (a, b, len) into memcmp (a, b, len).  */
 
   gimple *stmt = gsi_stmt (*gsi);
+  if (!gimple_vuse (stmt) && gimple_in_ssa_p (cfun))
+return false;
   tree a = gimple_call_arg (stmt, 0);
   tree b = gimple_call_arg (stmt, 1);
   tree len = gimple_call_arg (stmt, 2);
@@ -1537,6 +1541,8 @@ gimple_fold_builtin_bcopy (gimple_stmt_iterator *gsi)
  len) into memmove (dest, src, len).  */
 
   gimple *stmt = gsi_stmt (*gsi);
+  if (!gimple_vdef (stmt) && gimple_in_ssa_p (cfun))
+return false;
   tree src = gimple_call_arg (stmt, 0);
   tree dest = gimple_call_arg (stmt, 1);
   tree len = gimple_call_arg (stmt, 2);
@@ -1562,6 +1568,8 @@ gimple_fold_builtin_bzero (gimple_stmt_iterator *gsi)
   /* Transform bzero (dest, len) into memset (dest, 0, len).  */
 
   gimple *stmt = gsi_stmt (*gsi);
+  if (!gimple_vdef (stmt) && gimple_in_ssa_p (cfun))
+return false;
   tree dest = gimple_call_arg (stmt, 0);
   tree len = gimple_call_arg (stmt, 1);
 
@@ -1591,6 +1599,9 @@ gimple_fold_builtin_memset (gimple_stmt_iterator *gsi, tree c, tree len)
   return true;
 }
 
+  if (!gimple_vdef (stmt) && gimple_in_ssa_p (cfun))
+return false;
+
   if (! tree_fits_uhwi_p (len))
 return false;
 
@@ -1613,20 +1624,20 @@ gimple_fold_builtin_memset (gimple_stmt_iterator *gsi, tree c, tree len)
   if ((!INTEGRA

[gcc r15-5746] expr, c: Don't clear whole unions [PR116416]

2024-11-28 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:0547dbb725b6d8e878a79e28a2e171eafcfbc1aa

commit r15-5746-g0547dbb725b6d8e878a79e28a2e171eafcfbc1aa
Author: Jakub Jelinek 
Date:   Thu Nov 28 11:18:07 2024 +0100

expr, c: Don't clear whole unions [PR116416]

As discussed earlier, we currently clear padding bits even when we
don't have to and that causes pessimization of emitted code,
e.g. for
union U { int a; long b[64]; };
void bar (union U *);
void
foo (void)
{
  union U u = { 0 };
  bar (&u);
}
we need to clear just u.a, not the whole union.  On the other hand,
in cases where the standard requires padding bits to be zeroed, like for
C23 {} initializers of aggregates with padding bits, or for C++11 zero
initialization, we don't do that.

This patch
a) moves some of the stuff into complete_ctor_at_level_p (but not
   all the *p_complete = 0; case, for that it would need to change
   so that it passes around the ctor rather than just its type) and
   changes the handling of unions
b) introduces a new option, so that users can either get the new
   behavior (only what is guaranteed by the standards, the default),
   or previous behavior (union padding zero initialization, no such
   guarantees in structures) or also a guarantee in structures
c) introduces a new CONSTRUCTOR flag which says that the padding bits
   (if any) should be zero initialized (and sets it for now in the C
   FE for C23 {} initializers).
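
For readers unfamiliar with the padding in question, the following C
sketch (an illustration only, with made-up helper names) measures it
for types shaped like the ones used in this message.  It says nothing
about which bytes a given compiler clears; that is what the new
-fzero-init-padding-bits= option and the CONSTRUCTOR flag control.

```c
#include <assert.h>
#include <stddef.h>

/* Types with the same shape as the examples in this message.  */
struct S { char a; long long b; };
union U { int a; long b[64]; };

/* Bytes of struct S that belong to no member: the padding between
   a and b.  */
size_t struct_padding_bytes (void)
{
  return sizeof (struct S) - sizeof (char) - sizeof (long long);
}

/* Bytes of union U that lie beyond its first member; an initializer
   such as { 0 } names only u.a and does not mention these.  */
size_t union_bytes_beyond_first_member (void)
{
  return sizeof (union U) - sizeof (int);
}

/* Sanity check of the layout assumptions used above.  */
void check_layout (void)
{
  assert (offsetof (struct S, b) > sizeof (char));
  assert (sizeof (union U) >= sizeof (long) * 64);
}
```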

I'm not sure the CONSTRUCTOR_ZERO_PADDING_BITS flag is really needed
for C23; if there is just an empty initializer, I think we already mark
it as incomplete if there are any missing initializers.  Maybe with
some designated initializer games, say
void foo () {
  struct S { char a; long long b; };
  struct T { struct S c; } t = { .c = {}, .c.a = 1, .c.b = 2 };
...
}
Is this supposed to initialize padding bits in C23 and then the .c.a = 1
and .c.b = 2 stores preserve those padding bits, so is that supposed
to be different from struct T t2 = { .c = { 1, 2 } };
?  What about just struct T t3 = { .c.a = 1, .c.b = 2 }; ?

And I haven't touched the C++ FE for the flag, because I'm afraid I'm lost
on where exactly is zero-initialization done (vs. other types of
initialization) and where is e.g. zero-initialization of a temporary then
(member-wise) copied.
Say
struct S { char a; long long b; };
struct T { constexpr T (int a, int b) : c () { c.a = a; c.b = b; } S c; };
void bar (T *);

void
foo ()
{
  T t (1, 2);
  bar (&t);
}
Is the c () value-initialization of t.c followed by c.a and c.b updates
which preserve the zero initialized padding bits?  Or is there some
copy construction involved which does member-wise copying and makes the
padding bits undefined?
Looking at (older) clang++ with -O2, it initializes also the padding bits
when c () is used and doesn't with c {}.
For GCC, note that there is that optimization from Alex to zero padding bits
for optimization purposes for small aggregates, so either one needs to look
at -O0 -fdump-tree-gimple dumps, or use larger structures which aren't
optimized that way.

2024-11-28  Jakub Jelinek  

PR c++/116416
gcc/
* flag-types.h (enum zero_init_padding_bits_kind): New type.
* tree.h (CONSTRUCTOR_ZERO_PADDING_BITS): Define.
* common.opt (fzero-init-padding-bits=): New option.
* expr.cc (categorize_ctor_elements_1): Handle
CONSTRUCTOR_ZERO_PADDING_BITS or
flag_zero_init_padding_bits == ZERO_INIT_PADDING_BITS_ALL.  Fix
up *p_complete = -1; setting for unions.
(complete_ctor_at_level_p): Handle unions differently for
flag_zero_init_padding_bits == ZERO_INIT_PADDING_BITS_STANDARD.
* gimple-fold.cc (type_has_padding_at_level_p): Fix up UNION_TYPE
handling, return also true for UNION_TYPE with no FIELD_DECLs
and non-zero size, handle QUAL_UNION_TYPE like UNION_TYPE.
* doc/invoke.texi (-fzero-init-padding-bits=@var{value}): Document.
gcc/c/
* c-parser.cc (c_parser_braced_init): Set
CONSTRUCTOR_ZERO_PADDING_BITS for flag_isoc23 empty initializers.
* c-typeck.cc (constructor_zero_padding_bits): New variable.
(struct constructor_stack): Add zero_padding_bits member.
(really_start_incremental_init): Save and clear
constructor_zero_padding_bits.
(push_init_level): Save constructor_zero_padding_bits.  Or into it
CONSTRUCTOR_ZERO_PADDING_BITS from previous value if implicit.
(pop_init_level): Set CONSTRUCTOR_ZERO_PADDING_BITS if
constructor_zero_padding_bits and restore
constructor_zero_padding_bits.
gcc/testsuite/
  

[gcc r15-5747] c++: Small initial fixes for zeroing of padding bits [PR117256]

2024-11-28 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:fd62fdc5e1b3c4baf5218eedbc3c6d29861f027b

commit r15-5747-gfd62fdc5e1b3c4baf5218eedbc3c6d29861f027b
Author: Jakub Jelinek 
Date:   Thu Nov 28 11:30:32 2024 +0100

c++: Small initial fixes for zeroing of padding bits [PR117256]

https://eel.is/c++draft/dcl.init#general-6
says that even padding bits are supposed to be zeroed during
zero-initialization.
The following patch on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665565.html
patch attempts to implement that, though only for the easy
cases so far, in particular marks the CONSTRUCTOR created during
zero-initialization (or zero-initialization done during the
value-initialization) as having padding bits cleared and for
constexpr evaluation attempts to preserve that bit on a new CONSTRUCTOR
created for CONSTRUCTOR_ZERO_PADDING_BITS lhs.

I think we need far more than that, but am not sure where exactly
to implement that.
In particular, I think __builtin_bitcast should take it into account
during constant evaluation, if the padding bits in something are guaranteed
to be zero, then I'd think std::bitcast out of it and testing those
bits in there should be well defined.
But if we do that, the flag needs to be maintained precisely, not just
conservatively, so e.g. any place where some object is copied into another
one (except bitcast?) which would be element-wise copy, the bit should
be cleared (or preserved from the earlier state?  I'd hope
element-wise copying invalidates even the padding bits, but then what
about just stores into some members, do those invalidate the padding bits
in the rest of the object?).  But if it is an elided copy, it shouldn't.
And am not really sure what happens e.g. with non-automatic constexpr
variables.  If it is constructed by something that doesn't guarantee
the zeroing of the padding bits (so similarly constructed constexpr automatic
variable would have undefined state of the padding bits), are those padding
bits well defined because it isn't automatic variable?

Anyway, I hope the following patch is at least a small step in the right
direction.
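
To make "padding bits" concrete, here is a minimal C sketch (not part of the
patch).  The type `struct padded` and all function names are hypothetical;
the sketch only shows that an object can contain bytes belonging to no
member, and that clearing the whole object representation (here with memset)
yields the all-zero state that the cited wording asks zero-initialization to
produce.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type for illustration: on most ABIs there are padding
   bytes between `c` and `i`.  */
struct padded
{
  char c;
  int i;
};

/* Number of bytes in struct padded that belong to no member.  */
size_t
padding_bytes (void)
{
  return sizeof (struct padded) - sizeof (char) - sizeof (int);
}

/* Return 1 if every byte of the object representation is zero.  */
int
all_bytes_zero (const void *p, size_t n)
{
  const unsigned char *b = p;
  for (size_t k = 0; k < n; k++)
    if (b[k] != 0)
      return 0;
  return 1;
}

/* memset clears members and padding alike; no member is stored to
   afterwards, so the padding bytes stay zero.  */
int
zeroed_including_padding (void)
{
  struct padded s;
  memset (&s, 0, sizeof s);
  return all_bytes_zero (&s, sizeof s);
}
```

Whether a plain member-wise initialization leaves the padding zeroed is
exactly the question the CONSTRUCTOR_ZERO_PADDING_BITS flag is meant to track.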

2024-11-28  Jakub Jelinek  

PR c++/78620
PR c++/117256
* init.cc (build_zero_init_1): Set CONSTRUCTOR_ZERO_PADDING_BITS.
(build_value_init_noctor): Likewise.
* constexpr.cc (cxx_eval_store_expression): Propagate
CONSTRUCTOR_ZERO_PADDING_BITS flag.

Diff:
---
 gcc/cp/constexpr.cc | 10 -
 gcc/cp/init.cc  |  5 ++-
 gcc/testsuite/g++.dg/cpp0x/zero-init1.C | 70 +
 3 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 55e44fcbafba..bc3dc3b2559e 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -6409,6 +6409,7 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree 
t,
 
   type = TREE_TYPE (object);
   bool no_zero_init = true;
+  bool zero_padding_bits = false;
 
   auto_vec ctors;
   releasing_vec indexes;
@@ -6421,6 +6422,7 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree 
t,
{
  *valp = build_constructor (type, NULL);
  CONSTRUCTOR_NO_CLEARING (*valp) = no_zero_init;
+ CONSTRUCTOR_ZERO_PADDING_BITS (*valp) = zero_padding_bits;
}
   else if (STRIP_ANY_LOCATION_WRAPPER (*valp),
   TREE_CODE (*valp) == STRING_CST)
@@ -6480,8 +6482,10 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, 
tree t,
}
 
   /* If the value of object is already zero-initialized, any new ctors for
-subobjects will also be zero-initialized.  */
+subobjects will also be zero-initialized.  Similarly with zeroing of
+padding bits.  */
   no_zero_init = CONSTRUCTOR_NO_CLEARING (*valp);
+  zero_padding_bits = CONSTRUCTOR_ZERO_PADDING_BITS (*valp);
 
   if (code == RECORD_TYPE && is_empty_field (index))
/* Don't build a sub-CONSTRUCTOR for an empty base or field, as they
@@ -,6 +6670,7 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree 
t,
{
  *valp = build_constructor (type, NULL);
  CONSTRUCTOR_NO_CLEARING (*valp) = no_zero_init;
+ CONSTRUCTOR_ZERO_PADDING_BITS (*valp) = zero_padding_bits;
}
   new_ctx.ctor = empty_base ? NULL_TREE : *valp;
   new_ctx.object = target;
@@ -6707,6 +6712,7 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree 
t,
  /* But do make sure we have something in *valp.  */
  *valp = build_constructor (type, nullptr);
  CONSTRUCTOR_NO_CLEARING (*valp) = no_zero_init;
+ CONSTRUCTOR_ZERO_PADDING_BITS (*valp) = zero_padding_bits;
}
 }
   else if (*valp && TREE_CODE (*valp) == CONSTRUCTOR
@@ -6719,6 +6725,8 @@ cxx_eval_store_expression (const constex

[gcc r15-5749] rs6000: Add PowerPC inline asm redzone clobber support

2024-11-28 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:654afa46c90f7552af52fed30bc1a3fa21163f40

commit r15-5749-g654afa46c90f7552af52fed30bc1a3fa21163f40
Author: Jakub Jelinek 
Date:   Thu Nov 28 11:45:00 2024 +0100

rs6000: Add PowerPC inline asm redzone clobber support

The following patch on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667949.html
patch adds the rs6000 part of the support (the only other target I'm aware of
which clearly has a red zone as well).

2024-11-28  Jakub Jelinek  

* config/rs6000/rs6000.h (struct machine_function): Add
asm_redzone_clobber_seen member.
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Force
info->push_p if cfun->machine->asm_redzone_clobber_seen.
* config/rs6000/rs6000.cc (TARGET_REDZONE_CLOBBER): Redefine.
(rs6000_redzone_clobber): New function.

* gcc.target/powerpc/asm-redzone-1.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000-logue.cc|  2 +-
 gcc/config/rs6000/rs6000.cc  | 21 +++
 gcc/config/rs6000/rs6000.h   |  1 +
 gcc/testsuite/gcc.target/powerpc/asm-redzone-1.c | 71 
 4 files changed, 94 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index c87058b435e5..2efb8b1cda86 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -918,7 +918,7 @@ rs6000_stack_info (void)
   else if (DEFAULT_ABI == ABI_V4)
 info->push_p = non_fixed_size != 0;
 
-  else if (frame_pointer_needed)
+  else if (frame_pointer_needed || cfun->machine->asm_redzone_clobber_seen)
 info->push_p = 1;
 
   else
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 3c25315877f0..02a2f1152dbe 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1751,6 +1751,9 @@ static const scoped_attribute_specs *const 
rs6000_attribute_table[] =
 #undef TARGET_CAN_CHANGE_MODE_CLASS
 #define TARGET_CAN_CHANGE_MODE_CLASS rs6000_can_change_mode_class
 
+#undef TARGET_REDZONE_CLOBBER
+#define TARGET_REDZONE_CLOBBER rs6000_redzone_clobber
+
 #undef TARGET_CONSTANT_ALIGNMENT
 #define TARGET_CONSTANT_ALIGNMENT rs6000_constant_alignment
 
@@ -13725,6 +13728,24 @@ rs6000_can_change_mode_class (machine_mode from,
   return true;
 }
 
+/* Implement TARGET_REDZONE_CLOBBER.  */
+
+static rtx
+rs6000_redzone_clobber ()
+{
+  cfun->machine->asm_redzone_clobber_seen = true;
+  if (DEFAULT_ABI != ABI_V4)
+{
+  int red_zone_size = TARGET_32BIT ? 220 : 288;
+  rtx base = plus_constant (Pmode, stack_pointer_rtx,
+   GEN_INT (-red_zone_size));
+  rtx mem = gen_rtx_MEM (BLKmode, base);
+  set_mem_size (mem, red_zone_size);
+  return mem;
+}
+  return NULL_RTX;
+}
+
 /* Debug version of rs6000_can_change_mode_class.  */
 static bool
 rs6000_debug_can_change_mode_class (machine_mode from,
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index e0c41e1dfd26..926b6b2180ec 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -2421,6 +2421,7 @@ typedef struct GTY(()) machine_function
  global entry.  It helps to control the patchable area before and after
  local entry.  */
   bool global_entry_emitted;
+  bool asm_redzone_clobber_seen;
 } machine_function;
 #endif
 
diff --git a/gcc/testsuite/gcc.target/powerpc/asm-redzone-1.c 
b/gcc/testsuite/gcc.target/powerpc/asm-redzone-1.c
new file mode 100644
index ..87b91809a141
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/asm-redzone-1.c
@@ -0,0 +1,71 @@
+/* { dg-do run { target lp64 } } */
+/* { dg-options "-O2" } */
+
+__attribute__((noipa)) int
+foo (void)
+{
+  int a = 1;
+  int b = 2;
+  int c = 3;
+  int d = 4;
+  int e = 5;
+  int f = 6;
+  int g = 7;
+  int h = 8;
+  int i = 9;
+  int j = 10;
+  int k = 11;
+  int l = 12;
+  int m = 13;
+  int n = 14;
+  int o = 15;
+  int p = 16;
+  int q = 17;
+  int r = 18;
+  int s = 19;
+  int t = 20;
+  int u = 21;
+  int v = 22;
+  int w = 23;
+  int x = 24;
+  int y = 25;
+  int z = 26;
+  asm volatile ("" : "+g" (a), "+g" (b), "+g" (c), "+g" (d), "+g" (e));
+  asm volatile ("" : "+g" (f), "+g" (g), "+g" (h), "+g" (i), "+g" (j));
+  asm volatile ("" : "+g" (k), "+g" (l), "+g" (m), "+g" (n), "+g" (o));
+  asm volatile ("" : "+g" (k), "+g" (l), "+g" (m), "+g" (n), "+g" (o));
+  asm volatile ("" : "+g" (p), "+g" (q), "+g" (s), "+g" (t), "+g" (u));
+  asm volatile ("" : "+g" (v), "+g" (w), "+g" (y), "+g" (z));
+#ifdef __PPC64__
+  asm volatile ("std 1,-8(1); std 1,-16(1); std 1,-24(1); std 1,-32(1)"
+   : : : "18", "19", "20", "redzone");
+#elif defined(_AIX)
+  asm volatile ("stw 1,-4(1); stw 1,-8(1); stw 1,-12(1); stw 1,-16(1)"
+   : : : "18", "19", "20", "redzone");
+#endif
+  asm volatile ("" : "+g" (a), "+g" (b), "+g" (c), "+g" (d), "+g" (e));
+  a

[gcc r15-5748] inline-asm, i386: Add "redzone" clobber support

2024-11-28 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:37c98fdeac7ae2f9649d49e0cfa2631c84a329da

commit r15-5748-g37c98fdeac7ae2f9649d49e0cfa2631c84a329da
Author: Jakub Jelinek 
Date:   Thu Nov 28 11:42:11 2024 +0100

inline-asm, i386: Add "redzone" clobber support

The following patch adds a "redzone" clobber (recognized everywhere,
even on targets which don't do anything with it),
with which one can mark the rare case where inline asm pushes
something on the stack or uses a call instruction without taking
the red zone into account (i.e. addq $-128, %rsp; and addq $128, %rsp
around that).
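
A minimal sketch of how the clobber might be used from C follows.  It is not
taken from the patch's testcases: the function name `add_with_push`, the
macro `HAVE_REDZONE_CLOBBER` and the version guard (GCC 15 or later on
x86-64, not Clang) are assumptions made for illustration.  On other
compilers the asm is simply omitted.

```c
/* Sketch only: the guard below is an assumption about which compilers
   accept the "redzone" clobber.  */
#if defined (__x86_64__) && defined (__GNUC__) && !defined (__clang__) \
    && __GNUC__ >= 15
# define HAVE_REDZONE_CLOBBER 1
#else
# define HAVE_REDZONE_CLOBBER 0
#endif

int
add_with_push (int a, int b)
{
  int sum = a + b;
#if HAVE_REDZONE_CLOBBER
  /* The push writes below the stack pointer, i.e. into the red zone.
     The clobber tells the compiler not to keep live data there.  */
  __asm__ volatile ("pushq %%rax\n\tpopq %%rax" : : : "redzone");
#endif
  return sum;
}
```

Without the clobber, a leaf function like this one could keep `sum` in the
red zone, and the push would then overwrite it.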

2024-11-28  Jakub Jelinek  

gcc/
* target.def (redzone_clobber): New target hook.
* varasm.cc (decode_reg_name_and_count): Return -5 for
"redzone".
* cfgexpand.cc (expand_asm_stmt): Handle redzone clobber.
* config/i386/i386.h (struct machine_function): Add
asm_redzone_clobber_seen member.
* config/i386/i386.cc (ix86_compute_frame_layout): Don't
use red zone if cfun->machine->asm_redzone_clobber_seen.
(ix86_redzone_clobber): New function.
(TARGET_REDZONE_CLOBBER): Redefine.
* doc/extend.texi (Clobbers and Scratch Registers): Document
the "redzone" clobber.
* doc/tm.texi.in: Add @hook TARGET_REDZONE_CLOBBER.
* doc/tm.texi: Regenerate.
gcc/testsuite/
* gcc.dg/asm-redzone-1.c: New test.
* gcc.target/i386/asm-redzone-1.c: New test.

Diff:
---
 gcc/cfgexpand.cc  |  6 +
 gcc/config/i386/i386.cc   | 20 ++
 gcc/config/i386/i386.h|  3 +++
 gcc/doc/extend.texi   | 14 +-
 gcc/doc/tm.texi   |  8 ++
 gcc/doc/tm.texi.in|  2 ++
 gcc/target.def| 10 +++
 gcc/testsuite/gcc.dg/asm-redzone-1.c  |  8 ++
 gcc/testsuite/gcc.target/i386/asm-redzone-1.c | 38 +++
 gcc/varasm.cc |  9 +--
 10 files changed, 115 insertions(+), 3 deletions(-)

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 2a984758bc7b..58d68ec1caa5 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -3209,6 +3209,12 @@ expand_asm_stmt (gasm *stmt)
  rtx x = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (VOIDmode));
  clobber_rvec.safe_push (x);
}
+ else if (j == -5)
+   {
+ if (targetm.redzone_clobber)
+   if (rtx x = targetm.redzone_clobber ())
+ clobber_rvec.safe_push (x);
+   }
  else
{
  /* Otherwise we should have -1 == empty string
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index fda2112e4d60..0beeb514cf95 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -7170,6 +7170,7 @@ ix86_compute_frame_layout (void)
   if (ix86_using_red_zone ()
   && crtl->sp_is_unchanging
   && crtl->is_leaf
+  && !cfun->machine->asm_redzone_clobber_seen
   && !ix86_pc_thunk_call_expanded
   && !ix86_current_function_calls_tls_descriptor)
 {
@@ -26419,6 +26420,22 @@ ix86_mode_can_transfer_bits (machine_mode mode)
   return true;
 }
 
+/* Implement TARGET_REDZONE_CLOBBER.  */
+static rtx
+ix86_redzone_clobber ()
+{
+  cfun->machine->asm_redzone_clobber_seen = true;
+  if (ix86_using_red_zone ())
+{
+  rtx base = plus_constant (Pmode, stack_pointer_rtx,
+   GEN_INT (-RED_ZONE_SIZE));
+  rtx mem = gen_rtx_MEM (BLKmode, base);
+  set_mem_size (mem, RED_ZONE_SIZE);
+  return mem;
+}
+  return NULL_RTX;
+}
+
 /* Target-specific selftests.  */
 
 #if CHECKING_P
@@ -27272,6 +27289,9 @@ ix86_libgcc_floating_mode_supported_p
 #undef TARGET_MODE_CAN_TRANSFER_BITS
 #define TARGET_MODE_CAN_TRANSFER_BITS ix86_mode_can_transfer_bits
 
+#undef TARGET_REDZONE_CLOBBER
+#define TARGET_REDZONE_CLOBBER ix86_redzone_clobber
+
 static bool
 ix86_libc_has_fast_function (int fcode ATTRIBUTE_UNUSED)
 {
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 65227e7fd413..85c54c35c5c9 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2895,6 +2895,9 @@ struct GTY(()) machine_function {
   /* True if red zone is used.  */
   BOOL_BITFIELD red_zone_used : 1;
 
+  /* True if inline asm with redzone clobber has been seen.  */
+  BOOL_BITFIELD asm_redzone_clobber_seen : 1;
+
   /* The largest alignment, in bytes, of stack slot actually used.  */
   unsigned int max_used_stack_alignment;
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index df34cd1da309..2fc513efdb58 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -11800,7 +11800,7 @@ asm volatile ("movc3 %0, %1, %2"
: "r0", "r1",

[gcc r15-5750] Add support for nonnull_if_nonzero attribute [PR117023]

2024-11-28 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:19fe55c4801de50deee03b333e94d007aae222e3

commit r15-5750-g19fe55c4801de50deee03b333e94d007aae222e3
Author: Jakub Jelinek 
Date:   Thu Nov 28 11:48:33 2024 +0100

Add support for nonnull_if_nonzero attribute [PR117023]

As mentioned in an earlier thread, C2Y voted in a change which made
various library APIs callable with NULL arguments in certain cases,
e.g.
memcpy (NULL, NULL, 0);
is now valid, although
memcpy (NULL, NULL, 1);
remains invalid.  This affects various APIs, including several of
GCC builtins; plus on the C library side those APIs are often declared
with nonnull attribute(s) as well.

Florian suggested using the access attribute for this, but our docs
explicitly say that access attribute doesn't imply nonnull and it doesn't
cover e.g. the qsort case where the comparison function pointer may be
also NULL if nmemb is 0, but must be non-NULL otherwise.
As this case affects 21 APIs in C standard and I think is going to affect
various wrappers around those in various packages as well, I think it
is a common thing that should have its own attribute, because we should
still warn when people use
qsort (NULL, 1, 1, NULL);
etc., and similarly want to have -fsanitize=null instrumentation for those.

So, the following patch introduces nonnull_if_nonzero attribute (or would
you prefer cond_nonnull or some other name?), which has always 2 arguments,
argument index of a pointer argument (like one argument nonnull) and
argument index of an associated integral argument.  If that argument is
non-zero, it is UB to pass NULL to the pointer argument, if that argument
is zero, it is valid.  And changes various spots which already handled the
nonnull attribute to handle this one as well, with sometimes using the
ranger (or for -fsanitize=nonnull explicitly checking the associated
argument value, so instead of if (!ptr) __ubsan_... (...); it will
now do if (!ptr && sz) __ubsan_... (...);).
I've so far omitted changing gimple_infer_range (am not 100% sure how I can
use the ranger inside of the ranger) and changing the analyzer to handle it.
And I haven't changed builtins.def etc. to make use of that attribute
instead of nonnull where appropriate.

I'd then follow with the builtins.def changes (and eventually glibc
etc. would need to be adjusted too).
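
As an illustration of the intended use, here is a minimal sketch of a
memcpy-like wrapper annotated with the new attribute.  The names
`copy_bytes`, `would_report` and the macro `NONNULL_IF_NONZERO` are
hypothetical; the attribute is only applied when the compiler reports
support for it, which is an assumption about how user code might stay
portable.

```c
#include <stddef.h>

/* Use the attribute only where the compiler says it knows it.  */
#if defined (__has_attribute)
# if __has_attribute (nonnull_if_nonzero)
#  define NONNULL_IF_NONZERO(ptr_idx, cnt_idx) \
     __attribute__ ((nonnull_if_nonzero (ptr_idx, cnt_idx)))
# endif
#endif
#ifndef NONNULL_IF_NONZERO
# define NONNULL_IF_NONZERO(ptr_idx, cnt_idx)
#endif

/* dst (argument 1) and src (argument 2) may be NULL only if
   n (argument 3) is zero.  */
NONNULL_IF_NONZERO (1, 3) NONNULL_IF_NONZERO (2, 3)
void *
copy_bytes (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  for (size_t k = 0; k < n; k++)
    d[k] = s[k];
  return dst;
}

/* Model of the sanitizer condition described above: report only when
   the pointer is null and the associated count is non-zero.  */
int
would_report (const void *ptr, size_t n)
{
  return !ptr && n;
}
```

With this declaration, `copy_bytes (NULL, NULL, 0)` is valid, while
`copy_bytes (NULL, NULL, 1)` is the case the warning and the sanitizer are
meant to catch.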

2024-11-28  Jakub Jelinek  

PR c/117023
gcc/
* gimple.h (infer_nonnull_range_by_attribute): Add a tree *
argument defaulted to NULL.
* gimple.cc (infer_nonnull_range_by_attribute): Add op2 argument.
Handle also nonnull_if_nonzero attributes.
* tree.cc (get_nonnull_args): Fix comment typo.
* builtins.cc (validate_arglist): Handle nonnull_if_nonzero attribute.
* tree-ssa-ccp.cc (pass_post_ipa_warn::execute): Handle
nonnull_if_nonzero attributes.
* ubsan.cc (instrument_nonnull_arg): Adjust
infer_nonnull_range_by_attribute caller.  If it returned true and
filled in non-NULL arg2, check that arg2 is non-zero as another
condition next to checking that arg is zero.
* doc/extend.texi (nonnull_if_nonzero): Document new attribute.
gcc/c-family/
* c-attribs.cc (handle_nonnull_if_nonzero_attribute): New
function.
(c_common_gnu_attributes): Add nonnull_if_nonzero attribute.
(handle_nonnull_attribute): Fix comment typo.
* c-common.cc (struct nonnull_arg_ctx): Add other member.
(check_function_nonnull): Also check nonnull_if_nonzero attributes.
(check_nonnull_arg): Use different warning wording if pctx->other
is non-zero.
(check_function_arguments): Initialize ctx.other.
gcc/testsuite/
* gcc.dg/nonnull-8.c: New test.
* gcc.dg/nonnull-9.c: New test.
* gcc.dg/nonnull-10.c: New test.
* c-c++-common/ubsan/nonnull-6.c: New test.
* c-c++-common/ubsan/nonnull-7.c: New test.

Diff:
---
 gcc/builtins.cc  |  18 +++
 gcc/c-family/c-attribs.cc|  29 -
 gcc/c-family/c-common.cc |  54 +++--
 gcc/doc/extend.texi  |  35 +-
 gcc/gimple.cc|  47 +++-
 gcc/gimple.h |   2 +-
 gcc/testsuite/c-c++-common/ubsan/nonnull-6.c |  28 +
 gcc/testsuite/c-c++-common/ubsan/nonnull-7.c |  39 +++
 gcc/testsuite/gcc.dg/nonnull-10.c| 162 +++
 gcc/testsuite/gcc.dg/nonnull-8.c |  57 ++
 gcc/testsuite/gcc.dg/nonnull-9.c |  40 +++
 gcc/tree-ssa-ccp.cc  |  68 ++-
 gcc/tree.cc  |   2 +-

[gcc r15-5751] ranger: Handle nonnull_if_nonzero attribute [PR117023]

2024-11-28 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:912d5cfb8cf3c2568a544a4260bac4f6f932767a

commit r15-5751-g912d5cfb8cf3c2568a544a4260bac4f6f932767a
Author: Jakub Jelinek 
Date:   Thu Nov 28 11:50:49 2024 +0100

ranger: Handle nonnull_if_nonzero attribute [PR117023]

On top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668554.html
patch which introduces the nonnull_if_nonzero attribute (because
C2Y is allowing NULL arguments on various calls like memcpy, memset,
strncpy etc. as long as the count is 0) the following patch adds just
limited handling of the attribute in the ranger, in particular infers
nonnull for the pointer argument referenced in first argument of the
attribute if the second argument is a non-zero INTEGER_CST
(integer_nonzerop).

Ideally (as the FIXME says) I'd like to query arg2 range and check if
it doesn't contain zero, but am not sure such queries are possible from
gimple_infer_range (and if it is possible whether one can just query
the currently recorded range for it or if one can call something that
will try to compute the range by walking the def stmts etc.).

Could you handle as a follow-up the range querying if it is possible?

As for a useful testcase, with the patch I'm going to post next, e.g.
gcc.dg/tree-ssa/pr78154.c would be one if the calls use d as destination
(not dn) and a count that has a range which doesn't include 0 and isn't
constant.
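
The difference between the constant and the non-constant count can be
sketched from the user's side as follows.  This is an illustration only:
`fill_zero` and both callers are hypothetical names, the attribute is
applied only where the compiler reports support for it, and whether a given
compiler actually folds the comparison is not guaranteed.

```c
#include <stddef.h>

#if defined (__has_attribute)
# if __has_attribute (nonnull_if_nonzero)
#  define NONNULL_IF_NONZERO(ptr_idx, cnt_idx) \
     __attribute__ ((nonnull_if_nonzero (ptr_idx, cnt_idx)))
# endif
#endif
#ifndef NONNULL_IF_NONZERO
# define NONNULL_IF_NONZERO(ptr_idx, cnt_idx)
#endif

/* p may be NULL only when n is zero.  */
NONNULL_IF_NONZERO (1, 2)
void
fill_zero (char *p, size_t n)
{
  for (size_t k = 0; k < n; k++)
    p[k] = 0;
}

/* The count is the non-zero constant 4, so after the call the compiler
   may infer that p is non-null and fold the comparison to 1.  */
int
nonnull_after_constant_count (char *p)
{
  fill_zero (p, 4);
  return p != NULL;
}

/* The count is not a constant; this patch infers nothing here, and a
   NULL p with n == 0 is a valid call.  */
int
nonnull_after_variable_count (char *p, size_t n)
{
  fill_zero (p, n);
  return p != NULL;
}
```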

2024-11-28  Jakub Jelinek  

PR c/117023
* gimple-range-infer.cc (gimple_infer_range::gimple_infer_range):
Handle also nonnull_if_nonzero attributes.

Diff:
---
 gcc/gimple-range-infer.cc | 24 
 1 file changed, 24 insertions(+)

diff --git a/gcc/gimple-range-infer.cc b/gcc/gimple-range-infer.cc
index 98642e2438fc..b5621f0b22aa 100644
--- a/gcc/gimple-range-infer.cc
+++ b/gcc/gimple-range-infer.cc
@@ -183,6 +183,30 @@ gimple_infer_range::gimple_infer_range (gimple *s, bool 
use_rangeops)
}
  BITMAP_FREE (nonnullargs);
}
+  if (fntype)
+   for (tree attrs = TYPE_ATTRIBUTES (fntype);
+(attrs = lookup_attribute ("nonnull_if_nonzero", attrs));
+attrs = TREE_CHAIN (attrs))
+ {
+   tree args = TREE_VALUE (attrs);
+   unsigned int idx = TREE_INT_CST_LOW (TREE_VALUE (args)) - 1;
+   unsigned int idx2
+ = TREE_INT_CST_LOW (TREE_VALUE (TREE_CHAIN (args))) - 1;
+   if (idx < gimple_call_num_args (s)
+   && idx2 < gimple_call_num_args (s))
+ {
+   tree arg = gimple_call_arg (s, idx);
+   tree arg2 = gimple_call_arg (s, idx2);
+   if (!POINTER_TYPE_P (TREE_TYPE (arg))
+   || !INTEGRAL_TYPE_P (TREE_TYPE (arg2))
+   || integer_zerop (arg2))
+ continue;
+   if (integer_nonzerop (arg2))
+ add_nonzero (arg);
+   // FIXME: Can one query here whether arg2 has
+   // nonzero range if it is a SSA_NAME?
+ }
+ }
   // Fallthru and walk load/store ops now.
 }


[gcc r15-5742] builtins: Handle BITINT_TYPE in __builtin_iseqsig folding [PR117802]

2024-11-28 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:88aeea14c23a5d066a635ffb4f1d2943fddcf0bd

commit r15-5742-g88aeea14c23a5d066a635ffb4f1d2943fddcf0bd
Author: Jakub Jelinek 
Date:   Thu Nov 28 10:23:47 2024 +0100

builtins: Handle BITINT_TYPE in __builtin_iseqsig folding [PR117802]

In check_builtin_function_arguments in the _BitInt patchset I've changed
INTEGER_TYPE tests to INTEGER_TYPE or BITINT_TYPE, but haven't done the
same in fold_builtin_iseqsig, which now ICEs because of that.

The following patch fixes that.

BTW, that TYPE_PRECISION (type0) >= TYPE_PRECISION (type1) test
for REAL_TYPE vs. REAL_TYPE looks pretty random and dangerous, I think
it would be useful to handle this builtin also in the C and C++ FEs,
if both arguments have REAL_TYPE, use the FE specific routine to decide
which types to use and error if a comparison between types would be
erroneous (e.g. complain about _Decimal* vs. float/double/long
double/_Float*, pick up the preferred type, complain about
__ibm128 vs. _Float128 in C++, etc.).
But the FEs can just promote one argument to the other in that case
and keep fold_builtin_iseqsig as is for say Fortran and other FEs.
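
For reference, a minimal sketch of the mixed real/integer case that
fold_builtin_iseqsig handles by comparing in the real type.  The function
name `float_equals_int` and the macro `HAVE_ISEQSIG` are hypothetical, and
the fallback branch (a quiet `==`) is only there so the sketch compiles with
compilers that lack the builtin; it is not equivalent with respect to
floating-point exceptions.

```c
#if defined (__has_builtin)
# if __has_builtin (__builtin_iseqsig)
#  define HAVE_ISEQSIG 1
# endif
#endif
#ifndef HAVE_ISEQSIG
# define HAVE_ISEQSIG 0
#endif

/* Compare a float with an int; the integer operand is converted to the
   real type before the signaling equality comparison.  */
int
float_equals_int (float x, int y)
{
#if HAVE_ISEQSIG
  return __builtin_iseqsig (x, y);
#else
  return x == (float) y;	/* Quiet comparison, fallback only.  */
#endif
}
```

The new bitint-118.c test exercises the same shape with `_BitInt` operands
in place of `int`.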

2024-11-28  Jakub Jelinek  

PR c/117802
* builtins.cc (fold_builtin_iseqsig): Handle BITINT_TYPE like
INTEGER_TYPE.

* gcc.dg/builtin-iseqsig-1.c: New test.
* gcc.dg/bitint-118.c: New test.

Diff:
---
 gcc/builtins.cc  |  6 --
 gcc/testsuite/gcc.dg/bitint-118.c| 21 +
 gcc/testsuite/gcc.dg/builtin-iseqsig-1.c | 20 
 3 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index d338abe95baa..d925074c5475 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -9946,9 +9946,11 @@ fold_builtin_iseqsig (location_t loc, tree arg0, tree 
arg1)
 /* Choose the wider of two real types.  */
 cmp_type = TYPE_PRECISION (type0) >= TYPE_PRECISION (type1)
   ? type0 : type1;
-  else if (code0 == REAL_TYPE && code1 == INTEGER_TYPE)
+  else if (code0 == REAL_TYPE
+  && (code1 == INTEGER_TYPE || code1 == BITINT_TYPE))
 cmp_type = type0;
-  else if (code0 == INTEGER_TYPE && code1 == REAL_TYPE)
+  else if ((code0 == INTEGER_TYPE || code0 == BITINT_TYPE)
+  && code1 == REAL_TYPE)
 cmp_type = type1;
 
   arg0 = builtin_save_expr (fold_convert_loc (loc, cmp_type, arg0));
diff --git a/gcc/testsuite/gcc.dg/bitint-118.c 
b/gcc/testsuite/gcc.dg/bitint-118.c
new file mode 100644
index ..d330cc502a94
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bitint-118.c
@@ -0,0 +1,21 @@
+/* PR c/117802 */
+/* { dg-do compile { target bitint575 } } */
+/* { dg-options "-std=c23" } */
+
+int
+foo (float x, _BitInt(8) y)
+{
+  return __builtin_iseqsig (x, y) * 2 + __builtin_iseqsig (y, x);
+}
+
+int
+bar (double x, unsigned _BitInt(162) y)
+{
+  return __builtin_iseqsig (x, y) * 2 + __builtin_iseqsig (y, x);
+}
+
+int
+baz (long double x, _BitInt(574) y)
+{
+  return __builtin_iseqsig (x, y) * 2 + __builtin_iseqsig (y, x);
+}
diff --git a/gcc/testsuite/gcc.dg/builtin-iseqsig-1.c 
b/gcc/testsuite/gcc.dg/builtin-iseqsig-1.c
new file mode 100644
index ..e9fe87b7f708
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-iseqsig-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+int
+foo (float x, int y)
+{
+  return __builtin_iseqsig (x, y) * 2 + __builtin_iseqsig (y, x);
+}
+
+int
+bar (double x, unsigned long y)
+{
+  return __builtin_iseqsig (x, y) * 2 + __builtin_iseqsig (y, x);
+}
+
+int
+baz (long double x, long long y)
+{
+  return __builtin_iseqsig (x, y) * 2 + __builtin_iseqsig (y, x);
+}


[gcc r15-5745] middle-end: rework vectorizable_store to iterate over single index [PR117557]

2024-11-28 Thread Tamar Christina via Gcc-cvs
https://gcc.gnu.org/g:1b3bff737b2d5a7dc0d5977b77200c785fc53f01

commit r15-5745-g1b3bff737b2d5a7dc0d5977b77200c785fc53f01
Author: Tamar Christina 
Date:   Thu Nov 28 10:23:14 2024 +

middle-end: rework vectorizable_store to iterate over single index [PR117557]

The testcase

#include <stdint.h>
#include <string.h>

#define N 8
#define L 8

void f(const uint8_t * restrict seq1,
   const uint8_t *idx, uint8_t *seq_out) {
  for (int i = 0; i < L; ++i) {
uint8_t h = idx[i];
memcpy((void *)&seq_out[i * N], (const void *)&seq1[h * N / 2], N / 2);
  }
}

compiled at -O3 -mcpu=neoverse-n1+sve

miscompiles to:

ld1w    z31.s, p3/z, [x23, z29.s, sxtw]
ld1w    z29.s, p7/z, [x23, z30.s, sxtw]
st1w    z29.s, p7, [x24, z12.s, sxtw]
st1w    z31.s, p7, [x24, z12.s, sxtw]

rather than

ld1w    z31.s, p3/z, [x23, z29.s, sxtw]
ld1w    z29.s, p7/z, [x23, z30.s, sxtw]
st1w    z29.s, p7, [x24, z12.s, sxtw]
addvl   x3, x24, #2
st1w    z31.s, p3, [x3, z12.s, sxtw]

Two things go wrong here: the wrong mask is used, and the address pointers
for the stores are wrong.

This issue is happening because the codegen loop in vectorizable_store is a
nested loop where in the outer loop we iterate over ncopies and in the inner
loop we loop over vec_num.

For SLP ncopies == 1 and vec_num == SLP_NUM_STMS, but the loop mask is
determined by only the outer loop index and the pointer address is only
updated in the outer loop.

As such, for SLP we always use the same predicate and the same memory
location.  This patch flattens the two loops, iterating instead over
ncopies * vec_num, and simplifies the indexing.

This does not fully fix the gcc_r miscompile error in SPECCPU 2017, as the
error moves somewhere else.  I will look at that next, but this patch fixes
some other libraries that also started failing.
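
The indexing problem can be modelled in a few lines of plain C.  This is a
toy model, not the vectorizer code: the function names and the notion of a
"slot" (standing in for the mask and store address chosen per generated
statement) are assumptions for illustration.  It counts how many distinct
slots each loop shape touches.

```c
#define MAX_STORES 16

/* Nested shape: the slot is derived from the outer index j only, so
   every inner iteration reuses the same mask and address.  */
int
nested_distinct_slots (int ncopies, int vec_num)
{
  int seen[MAX_STORES] = { 0 };
  int distinct = 0;
  if (ncopies < 0 || ncopies > MAX_STORES)
    return -1;
  for (int j = 0; j < ncopies; j++)
    for (int i = 0; i < vec_num; i++)
      {
	int slot = j;		/* Ignores the inner index i.  */
	if (!seen[slot])
	  {
	    seen[slot] = 1;
	    distinct++;
	  }
      }
  return distinct;
}

/* Flattened shape: one index runs over ncopies * vec_num, so each
   generated store gets its own slot.  */
int
flattened_distinct_slots (int ncopies, int vec_num)
{
  int seen[MAX_STORES] = { 0 };
  int distinct = 0;
  int num_stmts = ncopies * vec_num;
  if (num_stmts < 0 || num_stmts > MAX_STORES)
    return -1;
  for (int j = 0; j < num_stmts; j++)
    {
      int slot = j;
      if (!seen[slot])
	{
	  seen[slot] = 1;
	  distinct++;
	}
    }
  return distinct;
}
```

For the SLP case in the testcase (ncopies == 1, two vector statements) the
nested shape touches one slot while the flattened shape touches two, which
matches the two stores that must use different predicates and addresses.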

gcc/ChangeLog:

PR tree-optimization/117557
* tree-vect-stmts.cc (vectorizable_store): Flatten the ncopies and
vec_num loops.

gcc/testsuite/ChangeLog:

PR tree-optimization/117557
* gcc.target/aarch64/pr117557.c: New test.

Diff:
---
 gcc/testsuite/gcc.target/aarch64/pr117557.c |  29 ++
 gcc/tree-vect-stmts.cc  | 504 ++--
 2 files changed, 281 insertions(+), 252 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/pr117557.c 
b/gcc/testsuite/gcc.target/aarch64/pr117557.c
new file mode 100644
index ..80b3fde41109
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr117557.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mcpu=neoverse-n1+sve -fdump-tree-vect" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include 
+#include 
+
+#define N 8
+#define L 8
+
+/*
+**f:
+** ...
+** ld1wz[0-9]+.s, p([0-9]+)/z, \[x[0-9]+, z[0-9]+.s, sxtw\]
+** ld1wz[0-9]+.s, p([0-9]+)/z, \[x[0-9]+, z[0-9]+.s, sxtw\]
+** st1wz[0-9]+.s, p\1, \[x[0-9]+, z[0-9]+.s, sxtw\]
+** incbx([0-9]+), all, mul #2
+** st1wz[0-9]+.s, p\2, \[x\3, z[0-9]+.s, sxtw\]
+** ret
+** ...
+*/
+void f(const uint8_t * restrict seq1,
+   const uint8_t *idx, uint8_t *seq_out) {
+  for (int i = 0; i < L; ++i) {
+uint8_t h = idx[i];
+memcpy((void *)&seq_out[i * N], (const void *)&seq1[h * N / 2], N / 2);
+  }
+}
+
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index c2d5818b2786..4759c274f3cc 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -9228,7 +9228,8 @@ vectorizable_store (vec_info *vinfo,
   gcc_assert (!grouped_store);
   auto_vec vec_offsets;
   unsigned int inside_cost = 0, prologue_cost = 0;
-  for (j = 0; j < ncopies; j++)
+  int num_stmts = ncopies * vec_num;
+  for (j = 0; j < num_stmts; j++)
{
  gimple *new_stmt;
  if (j == 0)
@@ -9246,14 +9247,14 @@ vectorizable_store (vec_info *vinfo,
vect_get_slp_defs (op_node, gvec_oprnds[0]);
  else
vect_get_vec_defs_for_operand (vinfo, first_stmt_info,
-  ncopies, op, gvec_oprnds[0]);
+  num_stmts, op, 
gvec_oprnds[0]);
  if (mask)
{
  if (slp_node)
vect_get_slp_defs (mask_node, &vec_masks);
  else
vect_get_vec_defs_for_operand (vinfo, stmt_info,
-  ncopies,
+  num_stmts,
   mask, &vec_masks,
   mask_vectype);
  

[gcc r15-5744] libstdc++: Include in os_defines.h for FreeBSD [PR117210]

2024-11-28 Thread Jonathan Wakely via Libstdc++-cvs
https://gcc.gnu.org/g:aa9f12e58ddc2298a253cc6e1343ad7e2eb9bcad

commit r15-5744-gaa9f12e58ddc2298a253cc6e1343ad7e2eb9bcad
Author: Jonathan Wakely 
Date:   Wed Nov 27 14:10:34 2024 +

libstdc++: Include  in os_defines.h for FreeBSD [PR117210]

This is needed so that __LONG_LONG_SUPPORTED is defined before we depend
on it.

libstdc++-v3/ChangeLog:

PR libstdc++/117210
* config/os/bsd/dragonfly/os_defines.h: Include .
* config/os/bsd/freebsd/os_defines.h: Likewise.

Diff:
---
 libstdc++-v3/config/os/bsd/dragonfly/os_defines.h | 2 ++
 libstdc++-v3/config/os/bsd/freebsd/os_defines.h   | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/libstdc++-v3/config/os/bsd/dragonfly/os_defines.h 
b/libstdc++-v3/config/os/bsd/dragonfly/os_defines.h
index e030fa3dc872..9c5aaabc90f3 100644
--- a/libstdc++-v3/config/os/bsd/dragonfly/os_defines.h
+++ b/libstdc++-v3/config/os/bsd/dragonfly/os_defines.h
@@ -29,6 +29,8 @@
 // System-specific #define, typedefs, corrections, etc, go here.  This
 // file will come before all others.
 
+#include  // For __LONG_LONG_SUPPORTED
+
 #define _GLIBCXX_USE_C99 1
 #define _GLIBCXX_USE_C99_STDIO 1
 #define _GLIBCXX_USE_C99_STDLIB 1
diff --git a/libstdc++-v3/config/os/bsd/freebsd/os_defines.h 
b/libstdc++-v3/config/os/bsd/freebsd/os_defines.h
index 0d63ae6cec4c..125dfdc18885 100644
--- a/libstdc++-v3/config/os/bsd/freebsd/os_defines.h
+++ b/libstdc++-v3/config/os/bsd/freebsd/os_defines.h
@@ -29,6 +29,8 @@
 // System-specific #define, typedefs, corrections, etc, go here.  This
 // file will come before all others.
 
+#include  // For __LONG_LONG_SUPPORTED
+
 #define _GLIBCXX_USE_C99_STDIO 1
 #define _GLIBCXX_USE_C99_STDLIB 1
 #define _GLIBCXX_USE_C99_WCHAR 1


[gcc r15-5752] docs: Fix up __sync_* documentation [PR117642]

2024-11-28 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:0dcc09a8b5eb275ce939daad2bdfc7076ae1863c

commit r15-5752-g0dcc09a8b5eb275ce939daad2bdfc7076ae1863c
Author: Jakub Jelinek 
Date:   Thu Nov 28 14:31:44 2024 +0100

docs: Fix up __sync_* documentation [PR117642]

The PR14311 commit which added support for __sync_* builtins documented that
there is a warning if a particular operation cannot be implemented.
But neither that commit nor anything later implemented such a warning; the
compiler has always silently generated the mentioned calls (which can in most
cases result in linker errors of course, because those functions aren't
implemented anywhere, in libatomic or elsewhere in code shipped in gcc).

So, the following patch just adjusts the documentation to match the
implementation.
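
For context, a minimal sketch of one of the affected builtins.  The names
`counter`, `bump` and `current_count` are hypothetical.  On a target that
cannot implement the operation inline, the documented behaviour is a call to
an external function named after the builtin with a size suffix; assuming a
4-byte int, that would be `__sync_fetch_and_add_4`.

```c
/* Shared counter updated with a legacy __sync builtin.  */
int counter;

/* Atomically add 1 and return the value the counter held before.  */
int
bump (void)
{
  return __sync_fetch_and_add (&counter, 1);
}

/* Read the counter with full-barrier semantics by adding zero.  */
int
current_count (void)
{
  return __sync_fetch_and_add (&counter, 0);
}
```

New code would normally use the `__atomic` builtins instead, as the
surrounding documentation already recommends.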

2024-11-28  Jakub Jelinek  

PR target/117642
* doc/extend.texi: Remove documentation of warning for unimplemented
__sync_* operations, such warning has never been implemented.

Diff:
---
 gcc/doc/extend.texi | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 8497aa8603f2..106fa57addfd 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -13562,16 +13562,11 @@ builtins (@pxref{__atomic Builtins}).  They should 
not be used for new
 code which should use the @samp{__atomic} builtins instead.
 
 Not all operations are supported by all target processors.  If a particular
-operation cannot be implemented on the target processor, a warning is
-generated and a call to an external function is generated.  The external
-function carries the same name as the built-in version,
-with an additional suffix
+operation cannot be implemented on the target processor, a call to an
+external function is generated.  The external function carries the same name
+as the built-in version, with an additional suffix
 @samp{_@var{n}} where @var{n} is the size of the data type.
 
-@c ??? Should we have a mechanism to suppress this warning?  This is almost
-@c useful for implementing the operation under the control of an external
-@c mutex.
-
 In most cases, these built-in functions are considered a @dfn{full barrier}.
 That is,
 no memory operand is moved across the operation, either forward or


[gcc r15-5753] Address UNRESOLVED for 'g++.dg/tree-ssa/empty-loop.C'

2024-11-28 Thread Thomas Schwinge via Gcc-cvs
https://gcc.gnu.org/g:3e8d3079c31567d3e9f43cc2cb100ddef25f48a2

commit r15-5753-g3e8d3079c31567d3e9f43cc2cb100ddef25f48a2
Author: Thomas Schwinge 
Date:   Thu Nov 28 14:31:17 2024 +0100

Address UNRESOLVED for 'g++.dg/tree-ssa/empty-loop.C'

As of commit 1046c32de4956c3d706a2ff8683582fd21b8f360 "optimize basic_string",
we've got:

PASS: g++.dg/tree-ssa/empty-loop.C  -std=gnu++17 (test for excess errors)
[-PASS:-]{+XFAIL:+} g++.dg/tree-ssa/empty-loop.C  -std=gnu++17  scan-tree-dump-not cddce2 "if"
{+UNRESOLVED: g++.dg/tree-ssa/empty-loop.C  -std=gnu++17  scan-tree-dump-not cddce3 "if"+}
[Etc.]

gcc/testsuite/
* g++.dg/tree-ssa/empty-loop.C: Address UNRESOLVED.

Diff:
---
 gcc/testsuite/g++.dg/tree-ssa/empty-loop.C | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
index b7e7e27cc042..adb6ab582dbc 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loops -Wno-unused-result" } */
+/* { dg-options "-O2 -ffinite-loops -Wno-unused-result" } */
+/* { dg-additional-options "-fdump-tree-cddce2 -fdump-tree-cddce3" } */
 /* { dg-skip-if "requires hosted libstdc++ for string" { ! hostedlib } } */
 
 #include 


[gcc r15-5767] Fortran: Check for impure subroutine.

2024-11-28 Thread Jerry DeLisle via Gcc-cvs
https://gcc.gnu.org/g:30078cb0cc5e19d3de55d218ae500d59a21e7537

commit r15-5767-g30078cb0cc5e19d3de55d218ae500d59a21e7537
Author: Steven G. Kargl 
Date:   Thu Nov 28 13:37:02 2024 -0800

Fortran: Check for impure subroutine.

PR fortran/117765

gcc/fortran/ChangeLog:

* resolve.cc (pure_subroutine): Check for an impure subroutine
call in a BLOCK construct nested within a DO CONCURRENT block.

gcc/testsuite/ChangeLog:

* gfortran.dg/impure_fcn_do_concurrent.f90: Update test to catch
calls to an impure subroutine.
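
The check added to pure_subroutine can be modelled outside the compiler. The following is a toy sketch, not gfortran code: the `Op` enum and the function name are stand-ins invented here, and the vector is assumed to list enclosing constructs from innermost to outermost, which is how I read the walk from cs_base through stack->prev.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-ins for gfortran's statement codes; only the two that
// matter for this check are modelled.
enum class Op { Other, Block, DoConcurrent };

// FRAMES lists the enclosing constructs from innermost to outermost.
// Returns true when a BLOCK is found and a DO CONCURRENT encloses it.
bool call_is_in_block_within_do_concurrent(const std::vector<Op> &frames)
{
  bool saw_block = false;
  for (Op op : frames)
    {
      if (op == Op::Block)
        saw_block = true;
      if (saw_block && op == Op::DoConcurrent)
        return true;  // the real code reports an error at this point
    }
  return false;
}

// Self-check for the two shapes exercised by the testcase.
void block_check_self_check()
{
  assert(call_is_in_block_within_do_concurrent({Op::Block, Op::DoConcurrent}));
  assert(!call_is_in_block_within_do_concurrent({Op::DoConcurrent}));
}
```

A call directly inside DO CONCURRENT (no BLOCK) is deliberately not flagged by this walk; in the patch that case is left to the existing checks further down in the function.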

Diff:
---
 gcc/fortran/resolve.cc | 18 ++
 gcc/testsuite/gfortran.dg/impure_fcn_do_concurrent.f90 |  9 -
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 304bf208d1a9..f892d809d209 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -3603,9 +3603,27 @@ resolve_function (gfc_expr *expr)
 static bool
 pure_subroutine (gfc_symbol *sym, const char *name, locus *loc)
 {
+  code_stack *stack;
+  bool saw_block = false;
+
   if (gfc_pure (sym))
 return true;
 
+  /* A BLOCK construct within a DO CONCURRENT construct leads to
+ gfc_do_concurrent_flag = 0 when the check for an impure subroutine
+ occurs.  Check the stack to see if the source code has a nested
+ BLOCK construct.  */
+  for (stack = cs_base; stack; stack = stack->prev)
+{
+  if (stack->current->op == EXEC_BLOCK) saw_block = true;
+  if (saw_block && stack->current->op == EXEC_DO_CONCURRENT)
+   {
+ gfc_error ("Subroutine call at %L in a DO CONCURRENT block "
+"is not PURE", loc);
+ return false;
+   }
+}
+
   if (forall_flag)
 {
   gfc_error ("Subroutine call to %qs in FORALL block at %L is not PURE",
diff --git a/gcc/testsuite/gfortran.dg/impure_fcn_do_concurrent.f90 b/gcc/testsuite/gfortran.dg/impure_fcn_do_concurrent.f90
index af524ae83f3c..5846f8c68aab 100644
--- a/gcc/testsuite/gfortran.dg/impure_fcn_do_concurrent.f90
+++ b/gcc/testsuite/gfortran.dg/impure_fcn_do_concurrent.f90
@@ -10,12 +10,14 @@ program foo
 
do concurrent(i=1:4)
   y(i) = bar(i)! { dg-error "Reference to impure function" }
+  call bla(i)  ! { dg-error "Subroutine call to" }
end do
 
do concurrent(i=1:4)
   block
  y(i) = bar(i) ! { dg-error "Reference to impure function" }
-  end block
+ call bla(i)   ! { dg-error "Subroutine call at" }
+   end block
end do
 
contains
@@ -27,4 +29,9 @@ program foo
  bar = j
   end function bar
 
+  impure subroutine bla (i)
+ integer, intent(in) :: i
+ j = j + i
+  end subroutine bla
+
 end program foo


[gcc r14-10997] tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

2024-11-28 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:8fd9461976b325efd134f9004a7958ebd008148f

commit r14-10997-g8fd9461976b325efd134f9004a7958ebd008148f
Author: Martin Jambor 
Date:   Wed Oct 23 11:30:32 2024 +0200

tree-sra: Avoid SRAing arguments to a function returning_twice (PR 117142)

PR 117142 shows that the current SRA probably never worked reliably
with arguments passed to a function returning twice, because SRA then
creates statements before the call, which, however, needs to be at the
beginning of a basic block.

While it should be possible to make at least the case of passing
arguments by value work with SRA (the statements would need to be put
just on the non-abnormal edges leading to the BB), this would mean
large surgery of function sra_modify_expr and I guess the time would
better be spent re-organizing the whole pass.
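
For readers unfamiliar with the term, setjmp is the classic example of a function that returns twice. The sketch below only illustrates that control flow in C++ and is unrelated to the SRA code itself; the function and variable names are made up for the example.

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf resume_point;

// Counts how many times control comes back out of the single setjmp call.
int count_returns_from_setjmp()
{
  // volatile so the value is well defined after longjmp.
  volatile int returns = 0;
  if (setjmp(resume_point) == 0)
    {
      returns = returns + 1;          // first, direct return
      std::longjmp(resume_point, 1);  // jump back into the setjmp call
    }
  else
    returns = returns + 1;            // second return, via longjmp
  return returns;
}

void returns_twice_self_check()
{
  assert(count_returns_from_setjmp() == 2);
}
```

The second return arrives over an abnormal edge, which is why statements cannot simply be placed in front of such a call.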

gcc/ChangeLog:

2024-10-21  Martin Jambor  

PR tree-optimization/117142
* tree-sra.cc (build_access_from_call_arg): Disqualify any
candidate passed to a function returning twice.

gcc/testsuite/ChangeLog:

2024-10-21  Martin Jambor  

PR tree-optimization/117142
* gcc.dg/tree-ssa/pr117142.c: New test.

(cherry picked from commit 29d8f1f0b7ad3c69b3bdb130325300d5f73aa784)

Diff:
---
 gcc/testsuite/gcc.dg/tree-ssa/pr117142.c | 14 ++
 gcc/tree-sra.cc  |  9 +
 2 files changed, 23 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c
new file mode 100644
index ..fc62c1e58f2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117142.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+struct a {
+  int b;
+};
+void c(int, int);
+void __attribute__((returns_twice))
+bar1(struct a);
+void bar(struct a) {
+  struct a d;
+  bar1(d);
+  c(d.b, d.b);
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 8040b0c56451..c91e40ef7e71 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -1397,6 +1397,15 @@ static bool
 build_access_from_call_arg (tree expr, gimple *stmt, bool can_be_returned,
enum out_edge_check *oe_check)
 {
+  if (gimple_call_flags (stmt) & ECF_RETURNS_TWICE)
+{
+  tree base = expr;
+  if (TREE_CODE (expr) == ADDR_EXPR)
+   base = get_base_address (TREE_OPERAND (expr, 0));
+  disqualify_base_of_expr (base, "Passed to a returns_twice call.");
+  return false;
+}
+
   if (TREE_CODE (expr) == ADDR_EXPR)
 {
   tree base = get_base_address (TREE_OPERAND (expr, 0));


[gcc r15-5761] libstdc++: Fix allocator-extended move ctor for std::basic_stacktrace [PR117822]

2024-11-28 Thread Jonathan Wakely via Libstdc++-cvs
https://gcc.gnu.org/g:fe04901737112abb6b1a71fe645f727384dc986a

commit r15-5761-gfe04901737112abb6b1a71fe645f727384dc986a
Author: Jonathan Wakely 
Date:   Thu Nov 28 10:24:00 2024 +

libstdc++: Fix allocator-extended move ctor for std::basic_stacktrace [PR117822]

libstdc++-v3/ChangeLog:

PR libstdc++/117822
* include/std/stacktrace (stacktrace(stacktrace&&, const A&)):
Fix typo in qualified-id for is_always_equal trait.
* testsuite/19_diagnostics/stacktrace/stacktrace.cc: Test
allocator-extended constructors and allocator propagation.
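
Why the qualified-id matters can be shown with a small sketch: an allocator type is not required to declare is_always_equal itself, while std::allocator_traits always provides it (defaulting to whether the allocator is an empty class). The two allocator templates below are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Stateless allocator that does NOT declare is_always_equal.
template <typename T>
struct MinimalAlloc
{
  using value_type = T;
  MinimalAlloc() = default;
  template <typename U>
  MinimalAlloc(const MinimalAlloc<U> &) noexcept {}
  T *allocate(std::size_t n)
  { return static_cast<T *>(::operator new(n * sizeof(T))); }
  void deallocate(T *p, std::size_t) noexcept { ::operator delete(p); }
};

// Allocator carrying state, so two instances may compare unequal.
template <typename T>
struct TaggedAlloc
{
  using value_type = T;
  int tag = 0;
  T *allocate(std::size_t n)
  { return static_cast<T *>(::operator new(n * sizeof(T))); }
  void deallocate(T *p, std::size_t) noexcept { ::operator delete(p); }
};

// Going through the traits class works for both; naming
// MinimalAlloc<int>::is_always_equal directly would not compile.
template <typename Alloc>
constexpr bool always_equal_v
  = std::allocator_traits<Alloc>::is_always_equal::value;

static_assert(always_equal_v<MinimalAlloc<int>>, "empty allocator");
static_assert(!always_equal_v<TaggedAlloc<int>>, "stateful allocator");
```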

Diff:
---
 libstdc++-v3/include/std/stacktrace|   2 +-
 .../19_diagnostics/stacktrace/stacktrace.cc| 207 -
 2 files changed, 204 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/include/std/stacktrace b/libstdc++-v3/include/std/stacktrace
index 58d0c2a0fc22..2c0f6ba10a91 100644
--- a/libstdc++-v3/include/std/stacktrace
+++ b/libstdc++-v3/include/std/stacktrace
@@ -295,7 +295,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   const allocator_type& __alloc) noexcept
   : _M_alloc(__alloc)
   {
-   if constexpr (_Allocator::is_always_equal::value)
+   if constexpr (_AllocTraits::is_always_equal::value)
  _M_impl = std::__exchange(__other._M_impl, {});
else if (_M_alloc == __other._M_alloc)
  _M_impl = std::__exchange(__other._M_impl, {});
diff --git a/libstdc++-v3/testsuite/19_diagnostics/stacktrace/stacktrace.cc b/libstdc++-v3/testsuite/19_diagnostics/stacktrace/stacktrace.cc
index 6bb22eacd92c..ee1a6d221e3a 100644
--- a/libstdc++-v3/testsuite/19_diagnostics/stacktrace/stacktrace.cc
+++ b/libstdc++-v3/testsuite/19_diagnostics/stacktrace/stacktrace.cc
@@ -106,12 +106,164 @@ test_cons()
 VERIFY( s5 != s0 );
 VERIFY( s3 == s0 );
 
-// TODO test allocator-extended copy/move
+Stacktrace s6(s5, Alloc{6});
+VERIFY( ! s6.empty() );
+VERIFY( s6.size() != 0 );
+VERIFY( s6.begin() != s6.end() );
+VERIFY( s6 == s5 );
+VERIFY( s5 != s0 );
+VERIFY( s6.get_allocator().get_personality() == 6 );
 
-// TODO test allocator propagation
+Stacktrace s7(std::move(s6), Alloc{7});
+VERIFY( ! s7.empty() );
+VERIFY( s7.size() != 0 );
+VERIFY( s7.begin() != s7.end() );
+VERIFY( s7 == s5 );
+VERIFY( s5 != s0 );
+VERIFY( s7.get_allocator().get_personality() == 7 );
   }
-}
 
+  {
+using Alloc = __gnu_test::SimpleAllocator;
+using Stacktrace = std::basic_stacktrace;
+
+Stacktrace s0;
+VERIFY( s0.empty() );
+VERIFY( s0.size() == 0 );
+VERIFY( s0.begin() == s0.end() );
+
+Stacktrace s1(Alloc{});
+VERIFY( s1.empty() );
+VERIFY( s1.size() == 0 );
+VERIFY( s1.begin() == s1.end() );
+
+VERIFY( s0 == s1 );
+
+Stacktrace s2(s0);
+VERIFY( s2 == s0 );
+
+const Stacktrace curr = Stacktrace::current();
+
+Stacktrace s3(curr);
+VERIFY( ! s3.empty() );
+VERIFY( s3.size() != 0 );
+VERIFY( s3.begin() != s3.end() );
+VERIFY( s3 != s0 );
+
+Stacktrace s4(s3);
+VERIFY( ! s4.empty() );
+VERIFY( s4.size() != 0 );
+VERIFY( s4.begin() != s4.end() );
+VERIFY( s4 == s3 );
+VERIFY( s4 != s0 );
+
+Stacktrace s5(std::move(s3));
+VERIFY( ! s5.empty() );
+VERIFY( s5.size() != 0 );
+VERIFY( s5.begin() != s5.end() );
+VERIFY( s5 == s4 );
+VERIFY( s5 != s0 );
+VERIFY( s3 == s0 );
+
+Stacktrace s6(s5, Alloc{});
+VERIFY( ! s6.empty() );
+VERIFY( s6.size() != 0 );
+VERIFY( s6.begin() != s6.end() );
+VERIFY( s6 == s5 );
+VERIFY( s5 != s0 );
+
+Stacktrace s7(std::move(s6), Alloc{});
+VERIFY( ! s7.empty() );
+VERIFY( s7.size() != 0 );
+VERIFY( s7.begin() != s7.end() );
+VERIFY( s7 == s5 );
+VERIFY( s5 != s0 );
+  }
+
+{
+using Stacktrace = std::pmr::stacktrace;
+using Alloc = Stacktrace::allocator_type;
+
+Stacktrace s0;
+VERIFY( s0.empty() );
+VERIFY( s0.size() == 0 );
+VERIFY( s0.begin() == s0.end() );
+
+Stacktrace s1(Alloc{});
+VERIFY( s1.empty() );
+VERIFY( s1.size() == 0 );
+VERIFY( s1.begin() == s1.end() );
+
+VERIFY( s0 == s1 );
+
+Stacktrace s2(s0);
+VERIFY( s2 == s0 );
+
+const Stacktrace curr = Stacktrace::current();
+
+Stacktrace s3(curr);
+VERIFY( ! s3.empty() );
+VERIFY( s3.size() != 0 );
+VERIFY( s3.begin() != s3.end() );
+VERIFY( s3 != s0 );
+
+Stacktrace s4(s3);
+VERIFY( ! s4.empty() );
+VERIFY( s4.size() != 0 );
+VERIFY( s4.begin() != s4.end() );
+VERIFY( s4 == s3 );
+VERIFY( s4 != s0 );
+
+Stacktrace s5(std::move(s3));
+VERIFY( ! s5.empty() );
+VERIFY( s5.size() != 0 );
+VERIFY( s5.begin() != s5.end() );
+VERIFY( s5 == s4 );
+VERIFY( s5 != s0 );
+VERIFY( s3 == s0 );
+
+__gnu_test::memory_resource mr;
+Stacktrace s6(s5, &mr);
+VERIFY( ! s6.empty() );
+VERIFY( s

[gcc r15-5760] libstdc++: Deprecate std::rel_ops namespace for C++20

2024-11-28 Thread Jonathan Wakely via Libstdc++-cvs
https://gcc.gnu.org/g:bb551f497e72b2c86733144568002ef8a7317ca3

commit r15-5760-gbb551f497e72b2c86733144568002ef8a7317ca3
Author: Jonathan Wakely 
Date:   Wed Nov 27 23:36:03 2024 +

libstdc++: Deprecate std::rel_ops namespace for C++20

This is deprecated in the C++20 standard and will be removed at some
point.

libstdc++-v3/ChangeLog:

* include/bits/stl_relops.h (rel_ops): Add deprecated attribute.
* testsuite/20_util/headers/utility/using_namespace_std_rel_ops.cc:
Add dg-warning for -Wdeprecated warnings.
* testsuite/20_util/rel_ops.cc: Likewise.
* testsuite/util/testsuite_containers.h: Disable -Wdeprecated
warnings when using rel_ops.
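
A small sketch of what the namespace provides and how the suppression pattern from this patch looks in user code. The `Version` type and `newer_than` are hypothetical; only operator== and operator< are written by hand, and rel_ops derives operator> from operator<. In C++20 code the suggested replacement is to define operator<=> instead.

```cpp
#include <cassert>
#include <utility>

struct Version
{
  int major_no;
  int minor_no;
};

bool operator==(const Version &a, const Version &b)
{ return a.major_no == b.major_no && a.minor_no == b.minor_no; }

bool operator<(const Version &a, const Version &b)
{
  return a.major_no < b.major_no
         || (a.major_no == b.major_no && a.minor_no < b.minor_no);
}

bool newer_than(const Version &a, const Version &b)
{
  // Same suppression pattern as in testsuite_containers.h: silence the
  // deprecation warning only around the using-directive.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
  using namespace std::rel_ops;
#pragma GCC diagnostic pop
  return a > b;  // std::rel_ops::operator> evaluates b < a
}

void rel_ops_self_check()
{
  assert(newer_than(Version{2, 0}, Version{1, 9}));
  assert(!newer_than(Version{1, 0}, Version{1, 0}));
}
```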

Diff:
---
 libstdc++-v3/include/bits/stl_relops.h   | 2 +-
 .../20_util/headers/utility/using_namespace_std_rel_ops.cc   | 2 +-
 libstdc++-v3/testsuite/20_util/rel_ops.cc| 2 +-
 libstdc++-v3/testsuite/util/testsuite_containers.h   | 9 +
 4 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_relops.h b/libstdc++-v3/include/bits/stl_relops.h
index 06c85ca8da91..29e7af3c2504 100644
--- a/libstdc++-v3/include/bits/stl_relops.h
+++ b/libstdc++-v3/include/bits/stl_relops.h
@@ -63,7 +63,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
-  namespace rel_ops
+  namespace rel_ops _GLIBCXX20_DEPRECATED_SUGGEST("<=>")
   {
 /** @namespace std::rel_ops
  *  @brief  The generated relational operators are sequestered here.
diff --git a/libstdc++-v3/testsuite/20_util/headers/utility/using_namespace_std_rel_ops.cc b/libstdc++-v3/testsuite/20_util/headers/utility/using_namespace_std_rel_ops.cc
index 330bde88d63e..b583eaa4713c 100644
--- a/libstdc++-v3/testsuite/20_util/headers/utility/using_namespace_std_rel_ops.cc
+++ b/libstdc++-v3/testsuite/20_util/headers/utility/using_namespace_std_rel_ops.cc
@@ -21,5 +21,5 @@
 
 namespace gnu
 {
-  using namespace std::rel_ops;
+  using namespace std::rel_ops; // { dg-warning "deprecated" "" { target c++20 } }
 }
diff --git a/libstdc++-v3/testsuite/20_util/rel_ops.cc b/libstdc++-v3/testsuite/20_util/rel_ops.cc
index 711822966d3a..f84503293e1b 100644
--- a/libstdc++-v3/testsuite/20_util/rel_ops.cc
+++ b/libstdc++-v3/testsuite/20_util/rel_ops.cc
@@ -24,7 +24,7 @@
 #include 
 #include 
 
-using namespace std::rel_ops;
+using namespace std::rel_ops; // { dg-warning "deprecated" "" { target c++20 } }
 
 // libstdc++/3628
 void test01()
diff --git a/libstdc++-v3/testsuite/util/testsuite_containers.h b/libstdc++-v3/testsuite/util/testsuite_containers.h
index 4dd78d4ec9d7..f48bb54f140a 100644
--- a/libstdc++-v3/testsuite/util/testsuite_containers.h
+++ b/libstdc++-v3/testsuite/util/testsuite_containers.h
@@ -183,9 +183,12 @@ namespace __gnu_test
 {
   forward_members_unordered(const typename _Tp::value_type& v)
   {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
// Make sure that even if rel_ops is injected there is no ambiguity
// when comparing iterators.
using namespace std::rel_ops;
+#pragma GCC diagnostic pop
 
typedef _Tp test_type;
test_type container;
@@ -283,9 +286,12 @@ namespace __gnu_test
 {
   forward_members(_Tp& container)
   {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
// Make sure that even if rel_ops is injected there is no ambiguity
// when comparing iterators.
using namespace std::rel_ops;
+#pragma GCC diagnostic pop
 
typedef traits<_Tp> traits_type;
iterator_concept_checks(container)
   {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
// Make sure that even if rel_ops is injected there is no ambiguity
// when comparing iterators.
using namespace std::rel_ops;
+#pragma GCC diagnostic pop
 
assert( !(container.begin() < container.begin()) );
assert( !(container.cbegin() < container.cbegin()) );


[gcc r15-5757] [PATCH v6 02/12] Add built-ins and tests for bit-forward and bit-reversed CRCs.

2024-11-28 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:c5126f0a004c27b180ac48f9e874e3744c088a09

commit r15-5757-gc5126f0a004c27b180ac48f9e874e3744c088a09
Author: Mariam Arutunian 
Date:   Mon Nov 11 12:51:18 2024 -0700

[PATCH v6 02/12] Add built-ins and tests for bit-forward and bit-reversed CRCs.

This patch introduces new built-in functions to GCC for computing
bit-forward and bit-reversed CRCs.  These builtins aim to provide
efficient CRC calculation capabilities.  When the target architecture
supports CRC operations (as indicated by the presence of a CRC optab),
the builtins will utilize the expander to generate CRC code.  In the
absence of hardware support, the builtins default to generating code
for a table-based CRC calculation.

The built-ins are defined as follows:
__builtin_rev_crc16_data8,
__builtin_rev_crc32_data8, __builtin_rev_crc32_data16,
__builtin_rev_crc32_data32,
__builtin_rev_crc64_data8, __builtin_rev_crc64_data16,
__builtin_rev_crc64_data32, __builtin_rev_crc64_data64,
__builtin_crc8_data8,
__builtin_crc16_data16, __builtin_crc16_data8,
__builtin_crc32_data8, __builtin_crc32_data16, __builtin_crc32_data32,
__builtin_crc64_data8, __builtin_crc64_data16, __builtin_crc64_data32,
__builtin_crc64_data64

Each built-in takes three parameters:
crc: The initial CRC value.
data: The data to be processed.
polynomial: The CRC polynomial without the leading 1.

To validate the correctness of these built-ins, this patch also includes
additions to the GCC testsuite.  This enhancement allows GCC to offer
developers high-performance CRC computation options that automatically
adapt to the capabilities of the target hardware.
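
To make the three parameters concrete, here is a plain software model of a conventional bit-forward CRC-8 step in C++. It is a sketch for comparison, not the builtin's implementation, and it assumes the builtin follows the usual convention of XORing the data into the CRC and then shifting out eight bits; the function names are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

// One CRC-8 step over a single data byte.  Argument order follows the
// (crc, data, polynomial) description above; POLYNOMIAL omits the leading 1.
constexpr std::uint8_t crc8_step(std::uint8_t crc, std::uint8_t data,
                                 std::uint8_t polynomial)
{
  std::uint8_t r = static_cast<std::uint8_t>(crc ^ data);
  for (int bit = 0; bit < 8; ++bit)
    r = static_cast<std::uint8_t>((r & 0x80) ? ((r << 1) ^ polynomial)
                                             : (r << 1));
  return r;
}

// Folds crc8_step over N bytes, starting from an initial CRC of zero.
constexpr std::uint8_t crc8_of(const char *bytes, std::size_t n,
                               std::uint8_t polynomial)
{
  std::uint8_t crc = 0;
  for (std::size_t i = 0; i < n; ++i)
    crc = crc8_step(crc, static_cast<std::uint8_t>(bytes[i]), polynomial);
  return crc;
}

// Polynomial 0x07 with zero initial value gives 0xF4 for "123456789".
static_assert(crc8_step(0, 0x01, 0x07) == 0x07, "single set bit");
static_assert(crc8_of("123456789", 9, 0x07) == 0xF4, "CRC-8 check value");
```

A model like this can serve as a reference when checking results from a compiler that provides the new built-ins.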

gcc/

* builtin-types.def (BT_FN_UINT8_UINT8_UINT8_CONST_SIZE): Define.
(BT_FN_UINT16_UINT16_UINT8_CONST_SIZE): Likewise.
(BT_FN_UINT16_UINT16_UINT16_CONST_SIZE): Likewise.
(BT_FN_UINT32_UINT32_UINT8_CONST_SIZE): Likewise.
(BT_FN_UINT32_UINT32_UINT16_CONST_SIZE): Likewise.
(BT_FN_UINT32_UINT32_UINT32_CONST_SIZE): Likewise.
(BT_FN_UINT64_UINT64_UINT8_CONST_SIZE): Likewise.
(BT_FN_UINT64_UINT64_UINT16_CONST_SIZE): Likewise.
(BT_FN_UINT64_UINT64_UINT32_CONST_SIZE): Likewise.
(BT_FN_UINT64_UINT64_UINT64_CONST_SIZE): Likewise.
* builtins.cc (associated_internal_fn): Handle CRC related builtins.
(expand_builtin_crc_table_based): New function.
(expand_builtin): Handle CRC related builtins.
* builtins.def (BUILT_IN_CRC8_DATA8): New builtin.
(BUILT_IN_CRC16_DATA8): Likewise.
(BUILT_IN_CRC16_DATA16): Likewise.
(BUILT_IN_CRC32_DATA8): Likewise.
(BUILT_IN_CRC32_DATA16): Likewise.
(BUILT_IN_CRC32_DATA32): Likewise.
(BUILT_IN_CRC64_DATA8): Likewise.
(BUILT_IN_CRC64_DATA16): Likewise.
(BUILT_IN_CRC64_DATA32): Likewise.
(BUILT_IN_CRC64_DATA64): Likewise.
(BUILT_IN_REV_CRC8_DATA8): New builtin.
(BUILT_IN_REV_CRC16_DATA8): Likewise.
(BUILT_IN_REV_CRC16_DATA16): Likewise.
(BUILT_IN_REV_CRC32_DATA8): Likewise.
(BUILT_IN_REV_CRC32_DATA16): Likewise.
(BUILT_IN_REV_CRC32_DATA32): Likewise.
(BUILT_IN_REV_CRC64_DATA8): Likewise.
(BUILT_IN_REV_CRC64_DATA16): Likewise.
(BUILT_IN_REV_CRC64_DATA32): Likewise.
(BUILT_IN_REV_CRC64_DATA64): Likewise.
* builtins.h (expand_builtin_crc_table_based): New function
declaration.
* doc/extend.texi: Add documentation for new CRC builtins.

gcc/testsuite/

* gcc.dg/crc-builtin-rev-target32.c: New test.
* gcc.dg/crc-builtin-rev-target64.c: New test.
* gcc.dg/crc-builtin-target32.c: New test.
* gcc.dg/crc-builtin-target64.c: New test.

Signed-off-by: Mariam Arutunian 
Co-authored-by: Joern Rennecke 
Co-authored-by: Jeff Law 

Diff:
---
 gcc/builtin-types.def   |  20 +
 gcc/builtins.cc | 112 +++-
 gcc/builtins.def|  21 -
 gcc/builtins.h  |   3 +
 gcc/doc/extend.texi | 109 +++
 gcc/testsuite/gcc.dg/crc-builtin-rev-target32.c |  38 
 gcc/testsuite/gcc.dg/crc-builtin-rev-target64.c |  62 +
 gcc/testsuite/gcc.dg/crc-builtin-target32.c |  38 
 gcc/testsuite/gcc.dg/crc-builtin-target64.c |  61 +
 9 files changed, 462 insertions(+), 2 deletions(-)

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 427af741c6b0..fa988d350645 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -840,6 

[gcc r15-5756] [PATCH v6 01/12] Implement internal functions for efficient CRC computation.

2024-11-28 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:bb46d05ad64e4e0acb3307e76bab340aa8587d3e

commit r15-5756-gbb46d05ad64e4e0acb3307e76bab340aa8587d3e
Author: Mariam Arutunian 
Date:   Mon Nov 11 12:48:34 2024 -0700

[PATCH v6 01/12] Implement internal functions for efficient CRC computation.

Add two new internal functions (IFN_CRC, IFN_CRC_REV), to provide faster
CRC generation.
One performs bit-forward and the other bit-reversed CRC computation.
If CRC optabs are supported, they are used for the CRC computation.
Otherwise, table-based CRC is generated.
The supported data and CRC sizes are 8, 16, 32, and 64 bits.
The polynomial is without the leading 1.
A table with 256 elements is used to store precomputed CRCs.
For the reflection of inputs and the output, a simple algorithm involving
SHIFT, AND, and OR operations is used.
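
The two building blocks mentioned here can be sketched in standalone C++. `crc_table_entry` mirrors the shape of the calculate_crc helper that appears in the diff below, using std::uint64_t in place of HOST_WIDE_INT; `reflect8` is only one possible SHIFT/AND/OR bit reversal and is my own illustration, not a copy of reflect_8_bit_value.

```cpp
#include <cstdint>

// Table element for byte INDEX: shift the byte to the top of a CRC_BITS-wide
// register, run eight shift/xor steps, then mask to CRC_BITS.
constexpr std::uint64_t crc_table_entry(std::uint64_t index,
                                        std::uint64_t polynomial,
                                        unsigned crc_bits)
{
  const std::uint64_t msb = std::uint64_t{1} << (crc_bits - 1);
  std::uint64_t crc = index << (crc_bits - 8);
  for (int i = 0; i < 8; ++i)
    crc = (crc & msb) ? ((crc << 1) ^ polynomial) : (crc << 1);
  if (crc_bits < 64)
    crc &= (std::uint64_t{1} << crc_bits) - 1;
  return crc;
}

// Reverses the bit order of a byte by swapping nibbles, then bit pairs,
// then neighbouring bits.
constexpr std::uint8_t reflect8(std::uint8_t v)
{
  v = static_cast<std::uint8_t>(((v & 0xF0) >> 4) | ((v & 0x0F) << 4));
  v = static_cast<std::uint8_t>(((v & 0xCC) >> 2) | ((v & 0x33) << 2));
  v = static_cast<std::uint8_t>(((v & 0xAA) >> 1) | ((v & 0x55) << 1));
  return v;
}

static_assert(crc_table_entry(0x01, 0x07, 8) == 0x07, "CRC-8 entry 1");
static_assert(crc_table_entry(0x01, 0x1021, 16) == 0x1021, "CRC-16 entry 1");
static_assert(reflect8(0x01) == 0x80, "bit 0 moves to bit 7");
```

Filling a 256-element array with `crc_table_entry(i, polynomial, crc_bits)` for i from 0 to 255 yields a lookup table of the kind the commit message describes.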

gcc/

* doc/md.texi (crc@var{m}@var{n}4, crc_rev@var{m}@var{n}4): Document.
* expr.cc (calculate_crc): New function.
(assemble_crc_table): Likewise.
(generate_crc_table): Likewise.
(calculate_table_based_CRC): Likewise.
(expand_crc_table_based): Likewise.
(gen_common_operation_to_reflect): Likewise.
(reflect_64_bit_value): Likewise.
(reflect_32_bit_value): Likewise.
(reflect_16_bit_value): Likewise.
(reflect_8_bit_value): Likewise.
(generate_reflecting_code_standard): Likewise.
(expand_reversed_crc_table_based): Likewise.
* expr.h (generate_reflecting_code_standard): New function declaration.
(expand_crc_table_based): Likewise.
(expand_reversed_crc_table_based): Likewise.
* internal-fn.cc (crc_direct): Define.
(direct_crc_optab_supported_p): Likewise.
(expand_crc_optab_fn): New function.
* internal-fn.def (CRC, CRC_REV): New internal functions.
* optabs.def (crc_optab, crc_rev_optab): New optabs.

Signed-off-by: Mariam Arutunian 
Co-authored-by: Joern Rennecke 
Co-authored-by: Jeff Law 

Diff:
---
 gcc/doc/md.texi |  14 +++
 gcc/expr.cc | 347 
 gcc/expr.h  |   6 +
 gcc/internal-fn.cc  |  75 
 gcc/internal-fn.def |   2 +
 gcc/optabs.def  |   2 +
 6 files changed, 446 insertions(+)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index c4c37053833e..69605bf75c0f 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -8578,6 +8578,20 @@ Return 1 if operand 1 is a normal floating point number and 0
 otherwise.  @var{m} is a scalar floating point mode.  Operand 0
 has mode @code{SImode}, and operand 1 has mode @var{m}.
 
+@cindex @code{crc@var{m}@var{n}4} instruction pattern
+@item @samp{crc@var{m}@var{n}4}
+Calculate a bit-forward CRC using operands 1, 2 and 3,
+then store the result in operand 0.
+Operands 1 is the initial CRC, operands 2 is the data and operands 3 is the
+polynomial without leading 1.
+Operands 0, 1 and 3 have mode @var{n} and operand 2 has mode @var{m}, where
+both modes are integers.  The size of CRC to be calculated is determined by the
+mode; for example, if @var{n} is @code{HImode}, a CRC16 is calculated.
+
+@cindex @code{crc_rev@var{m}@var{n}4} instruction pattern
+@item @samp{crc_rev@var{m}@var{n}4}
+Similar to @samp{crc@var{m}@var{n}4}, but calculates a bit-reversed CRC.
+
 @end table
 
 @end ifset
diff --git a/gcc/expr.cc b/gcc/expr.cc
index f4939140bb51..de25437660e0 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -14177,3 +14177,350 @@ int_expr_size (const_tree exp)
 
   return tree_to_shwi (size);
 }
+
+/* Calculate CRC for the initial CRC and given POLYNOMIAL.
+   CRC_BITS is CRC size.  */
+
+static unsigned HOST_WIDE_INT
+calculate_crc (unsigned HOST_WIDE_INT crc,
+  unsigned HOST_WIDE_INT polynomial,
+  unsigned short crc_bits)
+{
+  unsigned HOST_WIDE_INT msb = HOST_WIDE_INT_1U << (crc_bits - 1);
+  crc = crc << (crc_bits - 8);
+  for (short i = 8; i > 0; --i)
+{
+  if (crc & msb)
+   crc = (crc << 1) ^ polynomial;
+  else
+   crc <<= 1;
+}
+  /* Zero out bits in crc beyond the specified number of crc_bits.  */
+  if (crc_bits < sizeof (crc) * CHAR_BIT)
+crc &= (HOST_WIDE_INT_1U << crc_bits) - 1;
+  return crc;
+}
+
+/* Assemble CRC table with 256 elements for the given POLYNOM and CRC_BITS with
+   given ID.
+   ID is the identifier of the table, the name of the table is unique,
+   contains CRC size and the polynomial.
+   POLYNOM is the polynomial used to calculate the CRC table's elements.
+   CRC_BITS is the size of CRC, may be 8, 16, ... . */
+
+rtx
+assemble_crc_table (tree id, unsigned HOST_WIDE_INT polynom,
+   unsigned short crc_bits)
+{
+  unsigned table_el_n = 0x100;
+  tree ar = build_array_type (make_unsigned_type (crc_bits),
+ build_index_typ

[gcc r15-5758] c++: Implement P2662R3, Pack Indexing [PR113798]

2024-11-28 Thread Marek Polacek via Libstdc++-cvs
https://gcc.gnu.org/g:c43527216622a119feb0d2ca4c0d8dd37463fe0a

commit r15-5758-gc43527216622a119feb0d2ca4c0d8dd37463fe0a
Author: Marek Polacek 
Date:   Wed Oct 9 11:54:37 2024 -0400

c++: Implement P2662R3, Pack Indexing [PR113798]

This patch implements C++26 Pack Indexing, as described in P2662R3.

The issue discussing how to mangle pack indexes has not been resolved
yet, and I've made no attempt to address it so far.

Unlike v1, which used augmented TYPE/EXPR_PACK_EXPANSION codes, this
version introduces two new codes: PACK_INDEX_EXPR and PACK_INDEX_TYPE.
Both carry two operands: the pack expansion and the index.  They are
handled in tsubst_pack_index: substitute the index and the pack and
then extract the element from the vector (if possible).

To handle pack indexing in a decltype or with decltype(auto), there is
also the new PACK_INDEX_PARENTHESIZED_P flag.

With this feature, it's valid to write something like

  using U = tmpl;

where we first expand the template argument into

  Ts...[Is#0], Ts...[Is#1], ...

and then substitute each individual pack index.
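
For comparison, the element selection that C++26 spells `Ts...[I]` (for types) or `args...[I]` (for values) can be approximated in C++17 through std::tuple. The sketch below is that approximation only; it does not use pack indexing, so it compiles without this patch, and the names `nth_type` and `nth_value` are hypothetical.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// C++17 stand-in for the type form Ts...[I].
template <std::size_t I, typename... Ts>
using nth_type = std::tuple_element_t<I, std::tuple<Ts...>>;

// C++17 stand-in for the expression form args...[I]; copies the arguments
// into a tuple, which real pack indexing would not need to do.
template <std::size_t I, typename... Ts>
constexpr auto nth_value(Ts... args)
{
  return std::get<I>(std::make_tuple(args...));
}

static_assert(std::is_same<nth_type<1, int, char, double>, char>::value,
              "second type in the pack");
static_assert(nth_value<2>(10, 20, 30) == 30, "third value in the pack");
```

The tuple detour instantiates library templates for every lookup, which is part of the motivation for having the indexing done by the language itself.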

PR c++/113798

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1) :
New case.
* cp-objcp-common.cc (cp_common_init_ts): Mark PACK_INDEX_TYPE and
PACK_INDEX_EXPR.
* cp-tree.def (PACK_INDEX_TYPE): New.
(PACK_INDEX_EXPR): New.
* cp-tree.h (WILDCARD_TYPE_P): Also check PACK_INDEX_TYPE.
(PACK_INDEX_CHECK): Define.
(PACK_INDEX_P): Define.
(PACK_INDEX_PACK): Define.
(PACK_INDEX_INDEX): Define.
(PACK_INDEX_PARENTHESIZED_P): Define.
(make_pack_index): Declare.
(pack_index_element): Declare.
* cxx-pretty-print.cc (cxx_pretty_printer::expression) : New case.
(cxx_pretty_printer::type_id) : New case.
* error.cc (dump_type) : New case.
(dump_type_prefix): Handle PACK_INDEX_TYPE.
(dump_type_suffix): Likewise.
(dump_expr) : New case.
* mangle.cc (write_type) : New case.
* module.cc (trees_out::type_node) : New case.
(trees_in::tree_node) : New case.
* parser.cc (cp_parser_next_tokens_are_pack_index_p): New.
(cp_parser_pack_index): New.
(cp_parser_primary_expression): Handle a C++26 pack-index-expression.
(cp_parser_unqualified_id): Handle a C++26 pack-index-specifier.
(cp_parser_nested_name_specifier_opt): See if a pack-index-specifier
follows.
(cp_parser_qualifying_entity): Handle a C++26 pack-index-specifier.
(cp_parser_decltype_expr): Set id_expression_or_member_access_p for
pack indexing.
(cp_parser_mem_initializer_id): Handle a C++26 pack-index-specifier.
(cp_parser_simple_type_specifier): Likewise.
(cp_parser_base_specifier): Likewise.
* pt.cc (iterative_hash_template_arg) : New case.
(find_parameter_packs_r) : 
New
case.
(make_pack_index): New.
(tsubst_pack_index): New.
(tsubst): Avoid tsubst on PACK_INDEX_TYPE.
: Add a call to error.
: New case.
(tsubst_expr) : New case.
(dependent_type_p_r): Return true for PACK_INDEX_TYPE.
(type_dependent_expression_p): Recurse on PACK_INDEX_PACK for
PACK_INDEX_EXPR.
* ptree.cc (cxx_print_type) : New case.
* semantics.cc (finish_parenthesized_expr): Set
PACK_INDEX_PARENTHESIZED_P for PACK_INDEX_EXPR.
(finish_type_pack_element): Adjust error messages.
(pack_index_element): New.
* tree.cc (cp_tree_equal) : New case.
(cp_walk_subtrees) : New 
case.
* typeck.cc (structural_comptypes) : New case.

libstdc++-v3/ChangeLog:

* testsuite/20_util/tuple/element_access/get_neg.cc: Adjust
dg-prune-output.

gcc/testsuite/ChangeLog:

* g++.dg/cpp26/pack-indexing1.C: New test.
* g++.dg/cpp26/pack-indexing2.C: New test.
* g++.dg/cpp26/pack-indexing3.C: New test.
* g++.dg/cpp26/pack-indexing4.C: New test.
* g++.dg/cpp26/pack-indexing5.C: New test.
* g++.dg/cpp26/pack-indexing6.C: New test.
* g++.dg/cpp26/pack-indexing7.C: New test.
* g++.dg/cpp26/pack-indexing8.C: New test.
* g++.dg/cpp26/pack-indexing9.C: New test.
* g++.dg/cpp26/pack-indexing10.C: New test.
* g++.dg/cpp26/pack-indexing11.C: New test.
* g++.dg/modules/pack-index-1_a.C: New test.
* g++.dg/modul

[gcc r15-5755] Fix 'libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_get_property-gcn.c' for C23 default

2024-11-28 Thread Thomas Schwinge via Gcc-cvs
https://gcc.gnu.org/g:bcb764ec7c063326a17eb6213313cc9c0fd348b3

commit r15-5755-gbcb764ec7c063326a17eb6213313cc9c0fd348b3
Author: Thomas Schwinge 
Date:   Thu Nov 28 15:14:20 2024 +0100

Fix 'libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_get_property-gcn.c' for C23 default

With commit 55e3bd376b2214e200fa76d12b67ff259b06c212 "c: Default to -std=gnu23"
we've got:

[-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_get_property-gcn.c -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O0  (test for excess errors)
[-PASS:-]{+UNRESOLVED:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_get_property-gcn.c -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O0  [-execution test-]{+compilation failed to produce executable+}
[Etc.]

..., due to:


[...]/libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_get_property-gcn.c:16:13: error: two or more data types in declaration specifiers
[...]/libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_get_property-gcn.c:16:1: warning: useless type name in empty declaration

libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c
[!__cplusplus]: Don't 'typedef int bool;'.

Diff:
---
 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c
index 4b1fb5e0e761..ab8fc6c276be 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c
@@ -12,9 +12,6 @@
 #include 
 #include 
 
-#ifndef __cplusplus
-typedef int bool;
-#endif
 #include 


[gcc r15-5754] testsuite: Fix up pr116675.c test [PR116675]

2024-11-28 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:32a3f46ca543726196371a6f2a5d06feb31aa92d

commit r15-5754-g32a3f46ca543726196371a6f2a5d06feb31aa92d
Author: Jakub Jelinek 
Date:   Thu Nov 28 14:54:42 2024 +0100

testsuite: Fix up pr116675.c test [PR116675]

The test uses dg-do run and scan-assembler* at the same time; that
obviously doesn't work when pr116675.s isn't created at all, so one gets
PASS: gcc.target/i386/pr116675.c execution test
gcc.target/i386/pr116675.c: output file does not exist
UNRESOLVED: gcc.target/i386/pr116675.c scan-assembler-times pand 4
gcc.target/i386/pr116675.c: output file does not exist
UNRESOLVED: gcc.target/i386/pr116675.c scan-assembler-times pandn 4
gcc.target/i386/pr116675.c: output file does not exist
UNRESOLVED: gcc.target/i386/pr116675.c scan-assembler-times por 4
The usual way to handle that is adding -save-temps option.

The test FAILs after that change though, for a simple reason: the pand
regex matches not just pand instructions, but also the pandn ones.

I've added \t there to make sure it matches only pand.
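
The over-matching is easy to reproduce outside DejaGnu. The sketch below uses std::regex as a stand-in for the Tcl regexps that scan-assembler-times really uses, on a made-up three-instruction assembly snippet; `count_matches` and `sample_asm` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Counts non-overlapping matches of PATTERN in TEXT.
std::size_t count_matches(const std::string &text, const std::string &pattern)
{
  const std::regex re(pattern);
  const auto first = std::sregex_iterator(text.begin(), text.end(), re);
  return static_cast<std::size_t>(
    std::distance(first, std::sregex_iterator()));
}

// One pand, one pandn and one por, each followed by a tab as in GCC output.
const std::string sample_asm = "\tpand\t%xmm1, %xmm0\n"
                               "\tpandn\t%xmm2, %xmm1\n"
                               "\tpor\t%xmm1, %xmm0\n";

void pand_regex_self_check()
{
  assert(count_matches(sample_asm, "pand") == 2);    // also hits pandn
  assert(count_matches(sample_asm, "pand\t") == 1);  // pand only
}
```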

Though, I wonder whether it wouldn't be safer to split the test into two:
one with just the 4 functions (why noinline, noclone rather than
noipa, btw?), which would be dg-do compile and have the scan-assembler*
directives, and another which includes the first one, is
dg-do run, and contains the runtime checking of those.

In any case, I've committed this as obvious.

2024-11-28  Jakub Jelinek  

PR target/116675
* gcc.target/i386/pr116675.c: Add -save-temps to dg-options.
Scan for pand\t rather than pand.

Diff:
---
 gcc/testsuite/gcc.target/i386/pr116675.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr116675.c b/gcc/testsuite/gcc.target/i386/pr116675.c
index e463dd8415f5..f108b680f6d0 100644
--- a/gcc/testsuite/gcc.target/i386/pr116675.c
+++ b/gcc/testsuite/gcc.target/i386/pr116675.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -msse2 -mno-ssse3" } */
-/* { dg-final { scan-assembler-times "pand" 4 } } */
+/* { dg-options "-O2 -msse2 -mno-ssse3 -save-temps" } */
+/* { dg-final { scan-assembler-times "pand\t" 4 } } */
 /* { dg-final { scan-assembler-times "pandn" 4 } } */
 /* { dg-final { scan-assembler-times "por" 4 } } */


[gcc r15-5763] i386: Macroize compound shift patterns some more

2024-11-28 Thread Uros Bizjak via Gcc-cvs
https://gcc.gnu.org/g:ab2cce593ef6085a5f517cdca2520c5c44acbfad

commit r15-5763-gab2cce593ef6085a5f517cdca2520c5c44acbfad
Author: Uros Bizjak 
Date:   Thu Nov 28 17:44:03 2024 +0100

i386: Macroize compound shift patterns some more

Merge ashl<mode> and <insn><mode> compound define_insn_and_split
patterns to form <insn><mode> macroized pattern.

No functional changes.

gcc/ChangeLog:

* config/i386/i386.md (*<insn><mode>3_mask): Macroize
pattern from *ashl<mode>3_mask and *<insn><mode>3_mask
using any_shift code iterator.
(*<insn><mode>3_mask_1): Macroize pattern
from *ashl<mode>3_mask_1 and *<insn><mode>3_mask_1
using any_shift code iterator.
(*<insn><mode>3_add): Macroize pattern
from *ashl<mode>3_add and *<insn><mode>3_add
using any_shift code iterator.
(*<insn><mode>3_add_1): Macroize pattern
from *ashl<mode>3_add_1 and *<insn><mode>3_add_1
using any_shift code iterator.
(*<insn><mode>3_sub): Macroize pattern
from *ashl<mode>3_sub and *<insn><mode>3_sub
using any_shift code iterator.
(*<insn><mode>3_sub_1): Macroize pattern
from *ashl<mode>3_sub_1 and *<insn><mode>3_sub_1
using any_shift code iterator.

Diff:
---
 gcc/config/i386/i386.md | 453 
 1 file changed, 151 insertions(+), 302 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 2fc48006bca7..8eb9cb682b11 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -15847,157 +15847,6 @@
   DONE;
 })
 
-;; Avoid useless masking of count operand.
-(define_insn_and_split "*ashl<mode>3_mask"
-  [(set (match_operand:SWI48 0 "nonimmediate_operand")
-   (ashift:SWI48
- (match_operand:SWI48 1 "nonimmediate_operand")
- (subreg:QI
-   (and
- (match_operand 2 "int248_register_operand" "c,r")
- (match_operand 3 "const_int_operand")) 0)))
-   (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (ASHIFT, <MODE>mode, operands)
-   && (INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1))
-  == GET_MODE_BITSIZE (<MODE>mode)-1
-   && ix86_pre_reload_split ()"
-  "#"
-  "&& 1"
-  [(parallel
- [(set (match_dup 0)
-  (ashift:SWI48 (match_dup 1)
-(match_dup 2)))
-  (clobber (reg:CC FLAGS_REG))])]
-{
-  operands[2] = force_reg (GET_MODE (operands[2]), operands[2]);
-  operands[2] = gen_lowpart (QImode, operands[2]);
-}
-  [(set_attr "isa" "*,bmi2")])
-
-(define_insn_and_split "*ashl<mode>3_mask_1"
-  [(set (match_operand:SWI48 0 "nonimmediate_operand")
-   (ashift:SWI48
- (match_operand:SWI48 1 "nonimmediate_operand")
- (and:QI
-   (match_operand:QI 2 "register_operand" "c,r")
-   (match_operand:QI 3 "const_int_operand"))))
-   (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (ASHIFT, <MODE>mode, operands)
-   && (INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1))
-  == GET_MODE_BITSIZE (<MODE>mode)-1
-   && ix86_pre_reload_split ()"
-  "#"
-  "&& 1"
-  [(parallel
- [(set (match_dup 0)
-  (ashift:SWI48 (match_dup 1)
-(match_dup 2)))
-  (clobber (reg:CC FLAGS_REG))])]
-  ""
-  [(set_attr "isa" "*,bmi2")])
-
-(define_insn_and_split "*ashl<mode>3_add"
-  [(set (match_operand:SWI48 0 "nonimmediate_operand")
-   (ashift:SWI48
- (match_operand:SWI48 1 "nonimmediate_operand")
- (subreg:QI
-   (plus
- (match_operand 2 "int248_register_operand" "c,r")
- (match_operand 3 "const_int_operand")) 0)))
-   (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (ASHIFT, <MODE>mode, operands)
-   && (INTVAL (operands[3]) & (<MODE_SIZE> * BITS_PER_UNIT - 1)) == 0
-   && ix86_pre_reload_split ()"
-  "#"
-  "&& 1"
-  [(parallel
- [(set (match_dup 0)
-  (ashift:SWI48 (match_dup 1)
-(match_dup 2)))
-  (clobber (reg:CC FLAGS_REG))])]
-{
-  operands[2] = force_reg (GET_MODE (operands[2]), operands[2]);
-  operands[2] = gen_lowpart (QImode, operands[2]);
-}
-  [(set_attr "isa" "*,bmi2")])
-
-(define_insn_and_split "*ashl<mode>3_add_1"
-  [(set (match_operand:SWI48 0 "nonimmediate_operand")
-   (ashift:SWI48
- (match_operand:SWI48 1 "nonimmediate_operand")
- (plus:QI
-   (match_operand:QI 2 "register_operand" "c,r")
-   (match_operand:QI 3 "const_int_operand"))))
-   (clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (ASHIFT, <MODE>mode, operands)
-   && (INTVAL (operands[3]) & (<MODE_SIZE> * BITS_PER_UNIT - 1)) == 0
-   && ix86_pre_reload_split ()"
-  "#"
-  "&& 1"
-  [(parallel
- [(set (match_dup 0)
-  (ashift:SWI48 (match_dup 1)
-(match_dup 2)))
-  (clobber (reg:CC FLAGS_REG))])]
-  ""
-  [(set_attr "isa" "*,bmi2")])
-
-(define_insn_and_split "*ashl<mode>3_sub"
-  [(set (match_operand:SWI48 0 "nonimmediate_operand")
-   (ashift:SWI48
- (match_operand:SWI48 1 "nonimmediate_operand")
- (subreg:QI
-   (minus
- (match_operand 3 "const_int_operand")
-   

[gcc/aoliva/heads/testbase] (187 commits) Fortran: fix crash with bounds check writing array section

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testbase' was updated to point to:

 2261a15c0715... Fortran: fix crash with bounds check writing array section 

It previously pointed to:

 8500a8c32b8c... Daily bump.

Diff:

Summary of changes (added commits):
---

  2261a15... Fortran: fix crash with bounds check writing array section  (*)
  1e2b03e... libstdc++: Use std::_Destroy in std::stacktrace (*)
  a495413... c++: define __cpp_pack_indexing [PR113798] (*)
  ab2cce5... i386: Macroize compound shift patterns some more (*)
  6bba4ca... libstdc++: Reorder printer registrations in printers.py (*)
  fe04901... libstdc++: Fix allocator-extended move ctor for std::basic_ (*)
  bb551f4... libstdc++: Deprecate std::rel_ops namespace for C++20 (*)
  3f3966b... libstdc++: Reduce duplication in Doxygen comments for std:: (*)
  c435272... c++: Implement P2662R3, Pack Indexing [PR113798] (*)
  c5126f0... [PATCH v6 02/12] Add built-ins and tests for bit-forward an (*)
  bb46d05... [PATCH v6 01/12] Implement internal functions for efficient (*)
  bcb764e... Fix 'libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_get_pr (*)
  32a3f46... testsuite: Fix up pr116675.c test [PR116675] (*)
  3e8d307... Address UNRESOLVED for 'g++.dg/tree-ssa/empty-loop.C' (*)
  0dcc09a... docs: Fix up __sync_* documentation [PR117642] (*)
  912d5cf... ranger: Handle nonnull_if_nonzero attribute [PR117023] (*)
  19fe55c... Add support for nonnull_if_nonzero attribute [PR117023] (*)
  654afa4... rs6000: Add PowerPC inline asm redzone clobber support (*)
  37c98fd... inline-asm, i386: Add "redzone" clobber support (*)
  fd62fdc... c++: Small initial fixes for zeroing of padding bits [PR117 (*)
  0547dbb... expr, c: Don't clear whole unions [PR116416] (*)
  1b3bff7... middle-end: rework vectorizable_store to iterate over singl (*)
  aa9f12e... libstdc++: Include  in os_defines.h for FreeBS (*)
  29032df... gimple-fold: Avoid ICEs with bogus declarations like const  (*)
  88aeea1... builtins: Handle BITINT_TYPE in __builtin_iseqsig folding [ (*)
  24dac1e... c: Fix gimplification ICE for shifts with invalid redeclara (*)
  066f309... analyzer,timevar: avoid naked "new" in JSON-handling (*)
  5341eb6... c-family: offer suggestions for missing command-line option (*)
  5336b63... C/C++: add fix-it hints for missing '&' and '*' (v5) [PR878 (*)
  9f06b91... diagnostics: replace %<%s%> with %qs [PR104896] (*)
  7a656d7... Daily bump. (*)
  1046c32... optimize basic_string (*)
  87492fb... c: Fix ICE using function name in parameter type in old-sty (*)
  73e5d2f... libstdc++: Remove __builtin_expect from consteval assertion (*)
  17db574... libstdc++: Add cold attribute to assertion failure function (*)
  093584a... i386: x86 can use x >> y for x >> 32+y [PR36503] (*)
  bca515f... match: Improve handling of double convert [PR117776] (*)
  56029c9... libstdc++/ranges: make _RangeAdaptorClosure befriend operat (*)
  958f002... c: Fix sizeof error recovery [PR117745] (*)
  4a86859... I386: Add more testcases for unsigned SAT_ADD vector patter (*)
  83f200f... Match: Refactor the unsigned SAT_ADD match ADD_OVERFLOW pat (*)
  eaa675a... c: Do not remove _Atomic from array element type for typeof (*)
  96ccb20... builtins: Emit __sync_lock_release_{8,16} call as last reso (*)
  a2370cc... match.pd: Avoid introducing UB in the ((X /[ex] C1) +- C2)  (*)
  21954a5... libstdc++: module std fixes (*)
  e7aa614... libstdc++: Add debug assertions to std::list and std::forwa (*)
  751f91b... libstdc++: Simplify std::forward_list assignment using 'if  (*)
  281ac0e... libstdc++: Simplify std::list assignment using 'if constexp (*)
  498f9ae... libstdc++: Fix unsigned wraparound in codecvt::do_length [P (*)
  32f6485... ifcombine: skip fallback conjunction on noncontiguous block (*)
  fed871f... Fortran: Fix non_overridable typebound proc problems [PR846 (*)
  631cd92... c: Introduce -Wfree-labels (*)
  2fd9aef... c++: modules and local static (*)
  d89d033... c++: enable -Warray-compare by default (*)
  134dc93... libcpp: modules and -include again (*)
  5e718a7... PR117350: Keep assembler name for abstract decls for autofd (*)
  74cee43... Daily bump. (*)
  44e71c8... libstdc++: Add -fno-assume-sane-operators-new-delete to tes (*)
  94f98f6... Fortran: fix minor front-end memleaks (*)
  4a23528... aarch64: Update error message check for __builtin_launder c (*)
  1a0d480... aarch64: Fix fp8_scalar_1.c's stacktest1 (*)
  746629e... selftest: invoke "diff" when ASSERT_STREQ fails (*)
  b4d4e22... testsuite: rename plugins from .c to .cc (*)
  e2db825... csky: use quotes when referring to cpus and archs [PR90160] (*)
  3e2a1b2... [PATCH] testsuite:RISC-V:Modify the char string. (*)
  eff7e72... Fortran: passing inquiry ref of complex array to assumed ra (*)
  5134bad... c: avoid double-negative in warning message [PR94370] (*)
  67458ea... loop-prefetch: fix wording of warning [PR80760] (*)
  08bb92d... plugin: add missing colon in error message [PR93746] (*)

[gcc(refs/users/aoliva/heads/testme)] ifcombine: avoid forwarders with intervening blocks

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:93d41e3a7fcc2c74f41fdd62edc76bc1068e1a7a

commit 93d41e3a7fcc2c74f41fdd62edc76bc1068e1a7a
Author: Alexandre Oliva 
Date:   Thu Nov 28 21:56:34 2024 -0300

ifcombine: avoid forwarders with intervening blocks

Diff:
---
 gcc/testsuite/gcc.dg/ifcmb-1.c | 60 ++
 gcc/tree-ssa-ifcombine.cc  | 83 +-
 2 files changed, 125 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/ifcmb-1.c b/gcc/testsuite/gcc.dg/ifcmb-1.c
new file mode 100644
index 000000000000..9aaba4de5328
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ifcmb-1.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+
+[[gnu::noinline]]
+int f0 (int a, int b) {
+  if ((a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if (!(a & 2))
+    return 0;
+  else
+    return 1;
+}
+
+[[gnu::noinline]]
+int f1 (int a, int b) {
+  if (!(a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if ((a & 2))
+    return 1;
+  else
+    return 0;
+}
+
+[[gnu::noinline]]
+int f2 (int a, int b) {
+  if ((a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if (!(a & 2))
+    return 0;
+  else
+    return 1;
+}
+
+[[gnu::noinline]]
+int f3 (int a, int b) {
+  if (!(a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if ((a & 2))
+    return 1;
+  else
+    return 0;
+}
+
+int main() {
+  if (f0 (0, 1) != 1)
+    __builtin_abort();
+  if (f1 (1, 1) != 1)
+    __builtin_abort();
+  if (f2 (2, 1) != 1)
+    __builtin_abort();
+  if (f3 (3, 1) != 1)
+    __builtin_abort();
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index b58f63f4707a..170d2189d25f 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -1089,7 +1089,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
 }
 
   /* The || form is characterized by a common then_bb with the
- two edges leading to it mergable.  The latter is guaranteed
+ two edges leading to it mergeable.  The latter is guaranteed
  by matching PHI arguments in the then_bb and the inner cond_bb
  having no side-effects.  */
   if (phi_pred_bb != then_bb
@@ -1100,7 +1100,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto then_bb; else goto inner_cond_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1116,7 +1116,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto inner_cond_bb; else goto then_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1151,13 +1151,15 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
  Look for an OUTER_COND_BBs to combine with INNER_COND_BB.  They need not
  be contiguous, as long as inner and intervening blocks have no side
  effects, and are either single-entry-single-exit or conditionals choosing
- between the same EXIT_BB with the same PHI args, and the path leading to
- INNER_COND_BB.  ??? We could potentially handle multi-block
- single-entry-single-exit regions, but the loop below only deals with
- single-entry-single-exit individual intervening blocks.  Larger regions
- without side effects are presumably rare, so it's probably not worth the
- effort.  */
-  for (basic_block bb = inner_cond_bb, outer_cond_bb, exit_bb = NULL;
+ between the same EXIT_BB with the same PHI args, possibly through an
+ EXIT_PRED, and the path leading to INNER_COND_BB.  EXIT_PRED will be set
+ just before (along with a successful combination) or just after setting
+ EXIT_BB, to either THEN_BB, ELSE_BB, or INNER_COND_BB.  ??? We could
+ potentially handle multi-block single-entry-single-exit regions, but the
+ loop below only deals with single-entry-single-exit individual intervening
+ blocks.  Larger regions without side effects are presumably rare, so it's
+ probably not worth the effort.  */
+  for (basic_block bb = inner_cond_bb, outer_cond_bb, exit_bb = NULL, exit_pred;
single_pred_p (bb) && bb_no_side_effects_p (bb);
bb = outer_cond_bb)
 {
@@ -1198,10 +1200,13 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
 checking of same phi args.  */
   if (known_succ_p (outer_cond_bb))
changed = false;
-  else if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb,
-   then_bb, else_bb, inner_cond_bb, bb))
-   changed = true;
-  else if (forwarder_block_to (else_bb, then_bb))
+  else if ((!exit_bb || exit_pred == inner_cond_bb))
+  && tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb,
+  then_bb, else_bb, inner_cond_bb, bb))
+   changed = true, exit_pred = inner_cond_bb;
+  else if (exit_

[gcc/aoliva/heads/testme] ifcombine: avoid forwarders with intervening blocks

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testme' was updated to point to:

 93d41e3a7fcc... ifcombine: avoid forwarders with intervening blocks

It previously pointed to:

 9b671c0fdb80... ifcombine: avoid forwarders with intervening blocks

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  9b671c0... ifcombine: avoid forwarders with intervening blocks


Summary of changes (added commits):
---

  93d41e3... ifcombine: avoid forwarders with intervening blocks


[gcc/aoliva/heads/testme] ifcombine: avoid forwarders with intervening blocks

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testme' was updated to point to:

 e87623a1100e... ifcombine: avoid forwarders with intervening blocks

It previously pointed to:

 93d41e3a7fcc... ifcombine: avoid forwarders with intervening blocks

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  93d41e3... ifcombine: avoid forwarders with intervening blocks


Summary of changes (added commits):
---

  e87623a... ifcombine: avoid forwarders with intervening blocks


[gcc(refs/users/aoliva/heads/testme)] fold fold_truth_andor field merging into ifcombine

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:1df27fc2ee2fc19a52731e35d1c2f1ce885eeff7

commit 1df27fc2ee2fc19a52731e35d1c2f1ce885eeff7
Author: Alexandre Oliva 
Date:   Thu Nov 28 18:44:28 2024 -0300

fold fold_truth_andor field merging into ifcombine

This patch introduces various improvements to the logic that merges
field compares, while moving it into ifcombine.

Before the patch, we could merge:

  (a.x1 EQNE b.x1)  ANDOR  (a.y1 EQNE b.y1)

into something like:

  (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

if both of A's fields live within the same alignment boundaries, and
so do B's, at the same relative positions.  Constants may be used
instead of the object B.

The initial goal of this patch was to enable such combinations when a
field crossed alignment boundaries, e.g. for packed types.  We can't
generally access such fields with a single memory access, so when we
come across such a compare, we will attempt to combine each access
separately.

Some merging opportunities were missed because of right-shifts,
compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
narrowing conversions, especially after earlier merges.  This patch
introduces handlers for several cases involving these.

Merging multiple field accesses into wider bitfield-like accesses
too early in compilation is undesirable, so we move it from folding
to ifcombine.

When the second of a noncontiguous pair of compares is the first to
access a word, we may merge the first compare with the part of the
second compare that refers to the same word, keeping the compare of
the remaining bits at the spot where the second compare used to be.

Handling compares with non-constant fields was somewhat generalized
from what fold used to do, now handling non-adjacent fields, even if a
field of one object crosses an alignment boundary but the other
doesn't.


for  gcc/ChangeLog

* fold-const.cc (make_bit_field_ref): Export.
(unextend, all_ones_mask_p): Drop.
(decode_field_reference, fold_truth_andor_1): Move
field compare merging logic...
* gimple-fold.cc (fold_truth_andor_for_ifcombine): ... here.
(compute_split_boundary_from_align): New.
(make_bit_field_load, build_split_load): New.
(reuse_split_load): New.
* fold-const.h (make_bit_field_ref): Declare.
(fold_truth_andor_for_ifcombine): Declare.
* match.pd (any_convert, bit_and_cst, rshift_cst): New.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Try
fold_truth_andor_for_ifcombine.

for  gcc/testsuite/ChangeLog

* gcc.dg/field-merge-1.c: New.
* gcc.dg/field-merge-2.c: New.
* gcc.dg/field-merge-3.c: New.
* gcc.dg/field-merge-4.c: New.
* gcc.dg/field-merge-5.c: New.
* gcc.dg/field-merge-6.c: New.
* gcc.dg/field-merge-7.c: New.
* gcc.dg/field-merge-8.c: New.
* gcc.dg/field-merge-9.c: New.
* gcc.dg/field-merge-10.c: New.
* gcc.dg/field-merge-11.c: New.

Diff:
---
 gcc/fold-const.cc |  512 +--
 gcc/fold-const.h  |   10 +
 gcc/gimple-fold.cc| 1107 +
 gcc/match.pd  |   11 +
 gcc/testsuite/gcc.dg/field-merge-1.c  |   64 ++
 gcc/testsuite/gcc.dg/field-merge-10.c |   36 ++
 gcc/testsuite/gcc.dg/field-merge-11.c |   32 +
 gcc/testsuite/gcc.dg/field-merge-2.c  |   31 +
 gcc/testsuite/gcc.dg/field-merge-3.c  |   36 ++
 gcc/testsuite/gcc.dg/field-merge-4.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-5.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-6.c  |   26 +
 gcc/testsuite/gcc.dg/field-merge-7.c  |   23 +
 gcc/testsuite/gcc.dg/field-merge-8.c  |   25 +
 gcc/testsuite/gcc.dg/field-merge-9.c  |   36 ++
 gcc/tree-ssa-ifcombine.cc |   14 +-
 16 files changed, 1534 insertions(+), 509 deletions(-)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 1e8ae1ab493b..644966459864 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -137,7 +137,6 @@ static tree range_successor (tree);
 static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
 static tree fold_cond_expr_with_comparison (location_t, tree, enum tree_code,
tree, tree, tree, tree);
-static tree unextend (tree, int, int, tree);
 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *);
 static tree fold_binary_op_with_conditional_arg (location_t,
@@ -4711,7 +4710,7 @@ invert_truthvalue_loc (location_t loc, tree arg)
is the original memory reference used to preserve the alias set of
the

[gcc(refs/users/aoliva/heads/testme)] ifcombine: avoid forwarders with intervening blocks

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:9b671c0fdb80d9fd8a60030e7cc9550e903c4fa3

commit 9b671c0fdb80d9fd8a60030e7cc9550e903c4fa3
Author: Alexandre Oliva 
Date:   Thu Nov 28 21:56:34 2024 -0300

ifcombine: avoid forwarders with intervening blocks

Diff:
---
 gcc/testsuite/gcc.dg/ifcmb-1.c | 60 
 gcc/tree-ssa-ifcombine.cc  | 69 --
 2 files changed, 119 insertions(+), 10 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/ifcmb-1.c b/gcc/testsuite/gcc.dg/ifcmb-1.c
new file mode 100644
index 000000000000..9aaba4de5328
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ifcmb-1.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+
+[[gnu::noinline]]
+int f0 (int a, int b) {
+  if ((a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if (!(a & 2))
+    return 0;
+  else
+    return 1;
+}
+
+[[gnu::noinline]]
+int f1 (int a, int b) {
+  if (!(a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if ((a & 2))
+    return 1;
+  else
+    return 0;
+}
+
+[[gnu::noinline]]
+int f2 (int a, int b) {
+  if ((a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if (!(a & 2))
+    return 0;
+  else
+    return 1;
+}
+
+[[gnu::noinline]]
+int f3 (int a, int b) {
+  if (!(a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if ((a & 2))
+    return 1;
+  else
+    return 0;
+}
+
+int main() {
+  if (f0 (0, 1) != 1)
+    __builtin_abort();
+  if (f1 (1, 1) != 1)
+    __builtin_abort();
+  if (f2 (2, 1) != 1)
+    __builtin_abort();
+  if (f3 (3, 1) != 1)
+    __builtin_abort();
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index b58f63f4707a..fd7f3c245254 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -1089,7 +1089,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
 }
 
   /* The || form is characterized by a common then_bb with the
- two edges leading to it mergable.  The latter is guaranteed
+ two edges leading to it mergeable.  The latter is guaranteed
  by matching PHI arguments in the then_bb and the inner cond_bb
  having no side-effects.  */
   if (phi_pred_bb != then_bb
@@ -1100,7 +1100,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto then_bb; else goto inner_cond_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1116,7 +1116,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto inner_cond_bb; else goto then_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1139,6 +1139,10 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
   if (!recognize_if_then_else (inner_cond_bb, &then_bb, &else_bb))
 return ret;
 
+  /* FWD_WHICH will be set along with EXIT_BB to take note of whether THEN_BB
+ (1), ELSE_BB (-1) or neither (0) is a forwarder block to EXIT_BB.  */
+  int fwd_which;
+
   /* Recognize && and || of two conditions with a common
  then/else block which entry edges we can merge.  That is:
if (a || b)
@@ -1198,10 +1202,14 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
 checking of same phi args.  */
   if (known_succ_p (outer_cond_bb))
changed = false;
-  else if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb,
-   then_bb, else_bb, inner_cond_bb, bb))
-   changed = true;
-  else if (forwarder_block_to (else_bb, then_bb))
+  else if (exit_bb
+  ? fwd_which == 0
+  : tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb,
+ then_bb, else_bb, inner_cond_bb, bb))
+   changed = true, fwd_which = 0;
+  else if (exit_bb
+  ? fwd_which < 0
+  : forwarder_block_to (else_bb, then_bb))
{
  /* Other possibilities for the && form, if else_bb is
 empty forwarder block to then_bb.  Compared to the above simpler
@@ -1211,9 +1219,11 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
 edge from outer_cond_bb and the forwarder block.  */
  if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb, else_bb,
   then_bb, else_bb, bb))
-   changed = true;
+   changed = true, fwd_which = -1;
}
-  else if (forwarder_block_to (then_bb, else_bb))
+  else if (exit_bb
+  ? fwd_which > 0
+  : forwarder_block_to (then_bb, else_bb))
{
  /* Other possibilities for the || form, if then_bb is
 empty forwarder block to else_bb.  Compared to the above simpler
@@ -1223,7 +1233,7 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
 edge from outer_cond_bb and the forwarder block.  */
  

[gcc/aoliva/heads/testme] (190 commits) ifcombine: avoid forwarders with intervening blocks

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testme' was updated to point to:

 9b671c0fdb80... ifcombine: avoid forwarders with intervening blocks

It previously pointed to:

 dab845bbb29b... ifcombine: don't try xor on right-hand op

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  dab845b... ifcombine: don't try xor on right-hand op
  4f022d9... fold fold_truth_andor field merging into ifcombine
  887c27b... ifcombine: skip fallback conjunction on noncontiguous block


Summary of changes (added commits):
---

  9b671c0... ifcombine: avoid forwarders with intervening blocks
  8841e79... ifcombine: don't try xor on right-hand op
  1df27fc... fold fold_truth_andor field merging into ifcombine
  2261a15... Fortran: fix crash with bounds check writing array section  (*)
  1e2b03e... libstdc++: Use std::_Destroy in std::stacktrace (*)
  a495413... c++: define __cpp_pack_indexing [PR113798] (*)
  ab2cce5... i386: Macroize compound shift patterns some more (*)
  6bba4ca... libstdc++: Reorder printer registrations in printers.py (*)
  fe04901... libstdc++: Fix allocator-extended move ctor for std::basic_ (*)
  bb551f4... libstdc++: Deprecate std::rel_ops namespace for C++20 (*)
  3f3966b... libstdc++: Reduce duplication in Doxygen comments for std:: (*)
  c435272... c++: Implement P2662R3, Pack Indexing [PR113798] (*)
  c5126f0... [PATCH v6 02/12] Add built-ins and tests for bit-forward an (*)
  bb46d05... [PATCH v6 01/12] Implement internal functions for efficient (*)
  bcb764e... Fix 'libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_get_pr (*)
  32a3f46... testsuite: Fix up pr116675.c test [PR116675] (*)
  3e8d307... Address UNRESOLVED for 'g++.dg/tree-ssa/empty-loop.C' (*)
  0dcc09a... docs: Fix up __sync_* documentation [PR117642] (*)
  912d5cf... ranger: Handle nonnull_if_nonzero attribute [PR117023] (*)
  19fe55c... Add support for nonnull_if_nonzero attribute [PR117023] (*)
  654afa4... rs6000: Add PowerPC inline asm redzone clobber support (*)
  37c98fd... inline-asm, i386: Add "redzone" clobber support (*)
  fd62fdc... c++: Small initial fixes for zeroing of padding bits [PR117 (*)
  0547dbb... expr, c: Don't clear whole unions [PR116416] (*)
  1b3bff7... middle-end: rework vectorizable_store to iterate over singl (*)
  aa9f12e... libstdc++: Include  in os_defines.h for FreeBS (*)
  29032df... gimple-fold: Avoid ICEs with bogus declarations like const  (*)
  88aeea1... builtins: Handle BITINT_TYPE in __builtin_iseqsig folding [ (*)
  24dac1e... c: Fix gimplification ICE for shifts with invalid redeclara (*)
  066f309... analyzer,timevar: avoid naked "new" in JSON-handling (*)
  5341eb6... c-family: offer suggestions for missing command-line option (*)
  5336b63... C/C++: add fix-it hints for missing '&' and '*' (v5) [PR878 (*)
  9f06b91... diagnostics: replace %<%s%> with %qs [PR104896] (*)
  7a656d7... Daily bump. (*)
  1046c32... optimize basic_string (*)
  87492fb... c: Fix ICE using function name in parameter type in old-sty (*)
  73e5d2f... libstdc++: Remove __builtin_expect from consteval assertion (*)
  17db574... libstdc++: Add cold attribute to assertion failure function (*)
  093584a... i386: x86 can use x >> y for x >> 32+y [PR36503] (*)
  bca515f... match: Improve handling of double convert [PR117776] (*)
  56029c9... libstdc++/ranges: make _RangeAdaptorClosure befriend operat (*)
  958f002... c: Fix sizeof error recovery [PR117745] (*)
  4a86859... I386: Add more testcases for unsigned SAT_ADD vector patter (*)
  83f200f... Match: Refactor the unsigned SAT_ADD match ADD_OVERFLOW pat (*)
  eaa675a... c: Do not remove _Atomic from array element type for typeof (*)
  96ccb20... builtins: Emit __sync_lock_release_{8,16} call as last reso (*)
  a2370cc... match.pd: Avoid introducing UB in the ((X /[ex] C1) +- C2)  (*)
  21954a5... libstdc++: module std fixes (*)
  e7aa614... libstdc++: Add debug assertions to std::list and std::forwa (*)
  751f91b... libstdc++: Simplify std::forward_list assignment using 'if  (*)
  281ac0e... libstdc++: Simplify std::list assignment using 'if constexp (*)
  498f9ae... libstdc++: Fix unsigned wraparound in codecvt::do_length [P (*)
  32f6485... ifcombine: skip fallback conjunction on noncontiguous block (*)
  fed871f... Fortran: Fix non_overridable typebound proc problems [PR846 (*)
  631cd92... c: Introduce -Wfree-labels (*)
  2fd9aef... c++: modules and local static (*)
  d89d033... c++: enable -Warray-compare by default (*)
  134dc93... libcpp: modules and -include again (*)
  5e718a7... PR117350: Keep assembler name for abstract decls for autofd (*)
  74cee43... Daily bump. (*)
  44e71c8... libstdc++: Add -fno-assume-sane-operators-new-delete to tes (*)
  94f98f6... Fortran: fix minor front-end memleaks (*)
  4a23528... aarch64: Update error message check for __builtin_launder c (*)
  1a0d480... aarch64: Fix fp8_scalar_1.c's stacktest1 (*)
  746629e... 

[gcc(refs/users/aoliva/heads/testme)] ifcombine: don't try xor on right-hand op

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:8841e794667a39446e0ec4bfe9e3e66e5cd5bf80

commit 8841e794667a39446e0ec4bfe9e3e66e5cd5bf80
Author: Alexandre Oliva 
Date:   Thu Nov 28 18:44:35 2024 -0300

ifcombine: don't try xor on right-hand op

Diff:
---
 gcc/gimple-fold.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 69723ec97f78..988e552180c9 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -7539,6 +7539,10 @@ decode_field_reference (tree *pexp, HOST_WIDE_INT *pbitsize,
  exp = res_ops[1];
  gcc_checking_assert (!xor_cmp_op);
}
+  else if (!xor_cmp_op)
+   /* Not much we can do when xor appears in the right-hand compare
+  operand.  */
+   return NULL_TREE;
   else
{
  *xor_p = true;


[gcc(refs/users/aoliva/heads/testme)] ifcombine: avoid forwarders with intervening blocks

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:e87623a1100e66c7073c720b5c697a4590aa7e73

commit e87623a1100e66c7073c720b5c697a4590aa7e73
Author: Alexandre Oliva 
Date:   Thu Nov 28 21:56:34 2024 -0300

ifcombine: avoid forwarders with intervening blocks

Diff:
---
 gcc/testsuite/gcc.dg/ifcmb-1.c | 60 ++
 gcc/tree-ssa-ifcombine.cc  | 83 +-
 2 files changed, 125 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/ifcmb-1.c b/gcc/testsuite/gcc.dg/ifcmb-1.c
new file mode 100644
index 000000000000..9aaba4de5328
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ifcmb-1.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+
+[[gnu::noinline]]
+int f0 (int a, int b) {
+  if ((a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if (!(a & 2))
+    return 0;
+  else
+    return 1;
+}
+
+[[gnu::noinline]]
+int f1 (int a, int b) {
+  if (!(a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if ((a & 2))
+    return 1;
+  else
+    return 0;
+}
+
+[[gnu::noinline]]
+int f2 (int a, int b) {
+  if ((a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if (!(a & 2))
+    return 0;
+  else
+    return 1;
+}
+
+[[gnu::noinline]]
+int f3 (int a, int b) {
+  if (!(a & 1))
+    return 0;
+  if (b)
+    return 1;
+  if ((a & 2))
+    return 1;
+  else
+    return 0;
+}
+
+int main() {
+  if (f0 (0, 1) != 1)
+    __builtin_abort();
+  if (f1 (1, 1) != 1)
+    __builtin_abort();
+  if (f2 (2, 1) != 1)
+    __builtin_abort();
+  if (f3 (3, 1) != 1)
+    __builtin_abort();
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index b58f63f4707a..e7c700d448e5 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -1089,7 +1089,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
 }
 
   /* The || form is characterized by a common then_bb with the
- two edges leading to it mergable.  The latter is guaranteed
+ two edges leading to it mergeable.  The latter is guaranteed
  by matching PHI arguments in the then_bb and the inner cond_bb
  having no side-effects.  */
   if (phi_pred_bb != then_bb
@@ -1100,7 +1100,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto then_bb; else goto inner_cond_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1116,7 +1116,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto inner_cond_bb; else goto then_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1151,13 +1151,15 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
  Look for an OUTER_COND_BBs to combine with INNER_COND_BB.  They need not
  be contiguous, as long as inner and intervening blocks have no side
  effects, and are either single-entry-single-exit or conditionals choosing
- between the same EXIT_BB with the same PHI args, and the path leading to
- INNER_COND_BB.  ??? We could potentially handle multi-block
- single-entry-single-exit regions, but the loop below only deals with
- single-entry-single-exit individual intervening blocks.  Larger regions
- without side effects are presumably rare, so it's probably not worth the
- effort.  */
-  for (basic_block bb = inner_cond_bb, outer_cond_bb, exit_bb = NULL;
+ between the same EXIT_BB with the same PHI args, possibly through an
+ EXIT_PRED, and the path leading to INNER_COND_BB.  EXIT_PRED will be set
+ just before (along with a successful combination) or just after setting
+ EXIT_BB, to either THEN_BB, ELSE_BB, or INNER_COND_BB.  ??? We could
+ potentially handle multi-block single-entry-single-exit regions, but the
+ loop below only deals with single-entry-single-exit individual intervening
+ blocks.  Larger regions without side effects are presumably rare, so it's
+ probably not worth the effort.  */
+  for (basic_block bb = inner_cond_bb, outer_cond_bb, exit_bb = NULL, exit_pred;
single_pred_p (bb) && bb_no_side_effects_p (bb);
bb = outer_cond_bb)
 {
@@ -1198,10 +1200,13 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
 checking of same phi args.  */
   if (known_succ_p (outer_cond_bb))
changed = false;
-  else if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb,
-   then_bb, else_bb, inner_cond_bb, bb))
-   changed = true;
-  else if (forwarder_block_to (else_bb, then_bb))
+  else if ((!exit_bb || exit_pred == inner_cond_bb)
+  && tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb,
+  then_bb, else_bb, inner_cond_bb, bb))
+   changed = true, exit_pred = inner_cond_bb;
+  else if (exit_b

[gcc(refs/users/aoliva/heads/testme)] ifcombine: avoid forwarders with intervening blocks

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:936a10c5c290f0cc84c6b005bdf66553f9807188

commit 936a10c5c290f0cc84c6b005bdf66553f9807188
Author: Alexandre Oliva 
Date:   Thu Nov 28 21:56:34 2024 -0300

ifcombine: avoid forwarders with intervening blocks

Diff:
---
 gcc/testsuite/gcc.dg/torture/ifcmb-1.c | 63 +++
 gcc/tree-ssa-ifcombine.cc  | 94 +++---
 2 files changed, 139 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/torture/ifcmb-1.c b/gcc/testsuite/gcc.dg/torture/ifcmb-1.c
new file mode 100644
index ..2431a548598f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/ifcmb-1.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+
+/* Test that we do NOT perform unsound transformations for any of these cases.
+   Forwarding blocks to the exit block used to enable some of them.  */
+
+[[gnu::noinline]]
+int f0 (int a, int b) {
+  if ((a & 1))
+return 0;
+  if (b)
+return 1;
+  if (!(a & 2))
+return 0;
+  else
+return 1;
+}
+
+[[gnu::noinline]]
+int f1 (int a, int b) {
+  if (!(a & 1))
+return 0;
+  if (b)
+return 1;
+  if ((a & 2))
+return 1;
+  else
+return 0;
+}
+
+[[gnu::noinline]]
+int f2 (int a, int b) {
+  if ((a & 1))
+return 0;
+  if (b)
+return 1;
+  if (!(a & 2))
+return 0;
+  else
+return 1;
+}
+
+[[gnu::noinline]]
+int f3 (int a, int b) {
+  if (!(a & 1))
+return 0;
+  if (b)
+return 1;
+  if ((a & 2))
+return 1;
+  else
+return 0;
+}
+
+int main() {
+  if (f0 (0, 1) != 1)
+__builtin_abort();
+  if (f1 (1, 1) != 1)
+__builtin_abort();
+  if (f2 (2, 1) != 1)
+__builtin_abort();
+  if (f3 (3, 1) != 1)
+__builtin_abort();
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index b58f63f4707a..e03597f1bbd8 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -1089,7 +1089,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
 }
 
   /* The || form is characterized by a common then_bb with the
- two edges leading to it mergable.  The latter is guaranteed
+ two edges leading to it mergeable.  The latter is guaranteed
  by matching PHI arguments in the then_bb and the inner cond_bb
  having no side-effects.  */
   if (phi_pred_bb != then_bb
@@ -1100,7 +1100,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto then_bb; else goto inner_cond_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1116,7 +1116,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto inner_cond_bb; else goto then_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1139,6 +1139,9 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
   if (!recognize_if_then_else (inner_cond_bb, &then_bb, &else_bb))
 return ret;
 
+  if (!single_pred_p (inner_cond_bb))
+return ret;
+
   /* Recognize && and || of two conditions with a common
  then/else block which entry edges we can merge.  That is:
if (a || b)
@@ -1151,13 +1154,15 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
  Look for an OUTER_COND_BBs to combine with INNER_COND_BB.  They need not
  be contiguous, as long as inner and intervening blocks have no side
  effects, and are either single-entry-single-exit or conditionals choosing
- between the same EXIT_BB with the same PHI args, and the path leading to
- INNER_COND_BB.  ??? We could potentially handle multi-block
- single-entry-single-exit regions, but the loop below only deals with
- single-entry-single-exit individual intervening blocks.  Larger regions
- without side effects are presumably rare, so it's probably not worth the
- effort.  */
-  for (basic_block bb = inner_cond_bb, outer_cond_bb, exit_bb = NULL;
+ between the same EXIT_BB with the same PHI args, possibly through an
+ EXIT_PRED, and the path leading to INNER_COND_BB.  EXIT_PRED will be set
+ just before (along with a successful combination) or just after setting
+ EXIT_BB, to either THEN_BB, ELSE_BB, or INNER_COND_BB.  ??? We could
+ potentially handle multi-block single-entry-single-exit regions, but the
+ loop below only deals with single-entry-single-exit individual intervening
+ blocks.  Larger regions without side effects are presumably rare, so it's
+ probably not worth the effort.  */
+  for (basic_block bb = inner_cond_bb, outer_cond_bb, exit_bb = NULL, exit_pred;
single_pred_p (bb) && bb_no_side_effects_p (bb);
bb = outer_cond_bb)
 {
@@ -1198,10 +1203,13 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
 checking of same phi args.  */
   if (known_succ_p (outer_cond_bb

[gcc/aoliva/heads/testme] ifcombine: avoid forwarders with intervening blocks

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testme' was updated to point to:

 936a10c5c290... ifcombine: avoid forwarders with intervening blocks

It previously pointed to:

 4c36c32ff46b... ifcombine: avoid forwarders with intervening blocks

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  4c36c32... ifcombine: avoid forwarders with intervening blocks


Summary of changes (added commits):
---

  936a10c... ifcombine: avoid forwarders with intervening blocks


[gcc(refs/users/aoliva/heads/testme)] ifcombine: avoid forwarders with intervening blocks

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:4c36c32ff46bf20897be07ceec2633ac9d1bf005

commit 4c36c32ff46bf20897be07ceec2633ac9d1bf005
Author: Alexandre Oliva 
Date:   Thu Nov 28 21:56:34 2024 -0300

ifcombine: avoid forwarders with intervening blocks

Diff:
---
 gcc/testsuite/gcc.dg/ifcmb-1.c | 60 +++
 gcc/tree-ssa-ifcombine.cc  | 94 ++
 2 files changed, 136 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/ifcmb-1.c b/gcc/testsuite/gcc.dg/ifcmb-1.c
new file mode 100644
index ..9aaba4de5328
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ifcmb-1.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+
+[[gnu::noinline]]
+int f0 (int a, int b) {
+  if ((a & 1))
+return 0;
+  if (b)
+return 1;
+  if (!(a & 2))
+return 0;
+  else
+return 1;
+}
+
+[[gnu::noinline]]
+int f1 (int a, int b) {
+  if (!(a & 1))
+return 0;
+  if (b)
+return 1;
+  if ((a & 2))
+return 1;
+  else
+return 0;
+}
+
+[[gnu::noinline]]
+int f2 (int a, int b) {
+  if ((a & 1))
+return 0;
+  if (b)
+return 1;
+  if (!(a & 2))
+return 0;
+  else
+return 1;
+}
+
+[[gnu::noinline]]
+int f3 (int a, int b) {
+  if (!(a & 1))
+return 0;
+  if (b)
+return 1;
+  if ((a & 2))
+return 1;
+  else
+return 0;
+}
+
+int main() {
+  if (f0 (0, 1) != 1)
+__builtin_abort();
+  if (f1 (1, 1) != 1)
+__builtin_abort();
+  if (f2 (2, 1) != 1)
+__builtin_abort();
+  if (f3 (3, 1) != 1)
+__builtin_abort();
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index b58f63f4707a..e03597f1bbd8 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -1089,7 +1089,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
 }
 
   /* The || form is characterized by a common then_bb with the
- two edges leading to it mergable.  The latter is guaranteed
+ two edges leading to it mergeable.  The latter is guaranteed
  by matching PHI arguments in the then_bb and the inner cond_bb
  having no side-effects.  */
   if (phi_pred_bb != then_bb
@@ -1100,7 +1100,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto then_bb; else goto inner_cond_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1116,7 +1116,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto inner_cond_bb; else goto then_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1139,6 +1139,9 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
   if (!recognize_if_then_else (inner_cond_bb, &then_bb, &else_bb))
 return ret;
 
+  if (!single_pred_p (inner_cond_bb))
+return ret;
+
   /* Recognize && and || of two conditions with a common
  then/else block which entry edges we can merge.  That is:
if (a || b)
@@ -1151,13 +1154,15 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
  Look for an OUTER_COND_BBs to combine with INNER_COND_BB.  They need not
  be contiguous, as long as inner and intervening blocks have no side
  effects, and are either single-entry-single-exit or conditionals choosing
- between the same EXIT_BB with the same PHI args, and the path leading to
- INNER_COND_BB.  ??? We could potentially handle multi-block
- single-entry-single-exit regions, but the loop below only deals with
- single-entry-single-exit individual intervening blocks.  Larger regions
- without side effects are presumably rare, so it's probably not worth the
- effort.  */
-  for (basic_block bb = inner_cond_bb, outer_cond_bb, exit_bb = NULL;
+ between the same EXIT_BB with the same PHI args, possibly through an
+ EXIT_PRED, and the path leading to INNER_COND_BB.  EXIT_PRED will be set
+ just before (along with a successful combination) or just after setting
+ EXIT_BB, to either THEN_BB, ELSE_BB, or INNER_COND_BB.  ??? We could
+ potentially handle multi-block single-entry-single-exit regions, but the
+ loop below only deals with single-entry-single-exit individual intervening
+ blocks.  Larger regions without side effects are presumably rare, so it's
+ probably not worth the effort.  */
+  for (basic_block bb = inner_cond_bb, outer_cond_bb, exit_bb = NULL, exit_pred;
single_pred_p (bb) && bb_no_side_effects_p (bb);
bb = outer_cond_bb)
 {
@@ -1198,10 +1203,13 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
 checking of same phi args.  */
   if (known_succ_p (outer_cond_bb))
changed = false;
-  else if (tree_ssa_ifcombine_bb_1 (inner_cond_bb, outer_cond_bb,
-   then_bb, else_bb, inner_cond_bb, bb))
-  

[gcc/aoliva/heads/testme] ifcombine: avoid forwarders with intervening blocks

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testme' was updated to point to:

 4c36c32ff46b... ifcombine: avoid forwarders with intervening blocks

It previously pointed to:

 e87623a1100e... ifcombine: avoid forwarders with intervening blocks

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  e87623a... ifcombine: avoid forwarders with intervening blocks


Summary of changes (added commits):
---

  4c36c32... ifcombine: avoid forwarders with intervening blocks


[gcc(refs/users/aoliva/heads/testme)] fold fold_truth_andor field merging into ifcombine

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:1df512bfd42688024bc33f607704529b18f1e481

commit 1df512bfd42688024bc33f607704529b18f1e481
Author: Alexandre Oliva 
Date:   Thu Nov 28 18:44:28 2024 -0300

fold fold_truth_andor field merging into ifcombine

This patch introduces various improvements to the logic that merges
field compares, while moving it into ifcombine.

Before the patch, we could merge:

  (a.x1 EQNE b.x1)  ANDOR  (a.y1 EQNE b.y1)

into something like:

  (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

if both of A's fields live within the same alignment boundaries, and
so do B's, at the same relative positions.  Constants may be used
instead of the object B.

The initial goal of this patch was to enable such combinations when a
field crossed alignment boundaries, e.g. for packed types.  We can't
generally access such fields with a single memory access, so when we
come across such a compare, we will attempt to combine each access
separately.

Some merging opportunities were missed because of right-shifts,
compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
narrowing conversions, especially after earlier merges.  This patch
introduces handlers for several cases involving these.

The merging of multiple field accesses into wider bitfield-like
accesses is undesirable to do too early in compilation, so we move it
from folding to ifcombine.

When it is the second of a noncontiguous pair of compares that first
accesses a word, we may merge the first compare with part of the
second compare that refers to the same word, keeping the compare of
the remaining bits at the spot where the second compare used to be.

Handling compares with non-constant fields was somewhat generalized
from what fold used to do, now handling non-adjacent fields, even if a
field of one object crosses an alignment boundary but the other
doesn't.
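
As a rough illustration of the basic merge described above, here is a
hand-written sketch, not compiler output.  The type struct quad and the
functions eq_fields, eq_merged and check_field_merge are made-up names,
and the layout (four byte-sized fields sharing one 32-bit word) is an
assumption chosen to keep the example portable.  The mask is built from
bytes so that it selects x1 and y1 regardless of endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type: x1 and y1 are non-adjacent but live in the same
   32-bit word, together with two fields we do not want to compare.  */
struct quad {
  uint8_t x1;
  uint8_t skip;
  uint8_t y1;
  uint8_t other;
};

_Static_assert (sizeof (struct quad) == sizeof (uint32_t),
		"the sketch assumes the four fields fill one 32-bit word");

/* Source-level form: two separate field compares joined by &&.  */
int eq_fields (const struct quad *a, const struct quad *b)
{
  return a->x1 == b->x1 && a->y1 == b->y1;
}

/* Hand-merged form: one wide load per object, then a single masked
   compare, written in the ((a ^ b) & MASK) == 0 shape.  memcpy keeps
   the wide accesses well-defined in C.  */
int eq_merged (const struct quad *a, const struct quad *b)
{
  /* 0xff at the byte offsets of x1 (0) and y1 (2).  */
  static const uint8_t mask_bytes[4] = { 0xff, 0x00, 0xff, 0x00 };
  uint32_t wa, wb, mask;

  memcpy (&wa, a, sizeof wa);
  memcpy (&wb, b, sizeof wb);
  memcpy (&mask, mask_bytes, sizeof mask);
  return ((wa ^ wb) & mask) == 0;
}

/* Both forms must agree, whether or not the ignored fields differ.  */
void check_field_merge (void)
{
  struct quad a = { 1, 2, 3, 4 };
  struct quad b = { 1, 9, 3, 7 };	/* Differs only in ignored fields.  */
  struct quad c = { 1, 2, 4, 4 };	/* Differs in y1.  */

  assert (eq_fields (&a, &b) && eq_merged (&a, &b));
  assert (!eq_fields (&a, &c) && !eq_merged (&a, &c));
}
```

Calling check_field_merge () from a test driver exercises both forms.
A field that crosses an alignment boundary cannot be covered by a single
load like this one, which is the case the split loads are for.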


for  gcc/ChangeLog

* fold-const.cc (make_bit_field_ref): Export.
(unextend, all_ones_mask_p): Drop.
(decode_field_reference, fold_truth_andor_1): Move
field compare merging logic...
* gimple-fold.cc (fold_truth_andor_for_ifcombine): ... here.
(compute_split_boundary_from_align): New.
(make_bit_field_load, build_split_load): New.
(reuse_split_load): New.
* fold-const.h (make_bit_field_ref): Declare.
(fold_truth_andor_for_ifcombine): Declare.
* match.pd (any_convert, bit_and_cst, rshift_cst): New.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Try
fold_truth_andor_for_ifcombine.

for  gcc/testsuite/ChangeLog

* gcc.dg/field-merge-1.c: New.
* gcc.dg/field-merge-2.c: New.
* gcc.dg/field-merge-3.c: New.
* gcc.dg/field-merge-4.c: New.
* gcc.dg/field-merge-5.c: New.
* gcc.dg/field-merge-6.c: New.
* gcc.dg/field-merge-7.c: New.
* gcc.dg/field-merge-8.c: New.
* gcc.dg/field-merge-9.c: New.
* gcc.dg/field-merge-10.c: New.
* gcc.dg/field-merge-11.c: New.

Diff:
---
 gcc/fold-const.cc |  512 +--
 gcc/fold-const.h  |   10 +
 gcc/gimple-fold.cc| 1107 +
 gcc/match.pd  |   11 +
 gcc/testsuite/gcc.dg/field-merge-1.c  |   64 ++
 gcc/testsuite/gcc.dg/field-merge-10.c |   36 ++
 gcc/testsuite/gcc.dg/field-merge-11.c |   32 +
 gcc/testsuite/gcc.dg/field-merge-2.c  |   31 +
 gcc/testsuite/gcc.dg/field-merge-3.c  |   36 ++
 gcc/testsuite/gcc.dg/field-merge-4.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-5.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-6.c  |   26 +
 gcc/testsuite/gcc.dg/field-merge-7.c  |   23 +
 gcc/testsuite/gcc.dg/field-merge-8.c  |   25 +
 gcc/testsuite/gcc.dg/field-merge-9.c  |   36 ++
 gcc/tree-ssa-ifcombine.cc |   14 +-
 16 files changed, 1534 insertions(+), 509 deletions(-)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 1e8ae1ab493b..644966459864 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -137,7 +137,6 @@ static tree range_successor (tree);
 static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
 static tree fold_cond_expr_with_comparison (location_t, tree, enum tree_code,
tree, tree, tree, tree);
-static tree unextend (tree, int, int, tree);
 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *);
 static tree fold_binary_op_with_conditional_arg (location_t,
@@ -4711,7 +4710,7 @@ invert_truthvalue_loc (location_t loc, tree arg)
is the original memory reference used to preserve the alias set of
the

[gcc(refs/users/aoliva/heads/testme)] ifcombine: avoid unsound forwarder-enabled combinations [PR117723]

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:75e065ed787a6686700acbc94955f25c1a92e7e2

commit 75e065ed787a6686700acbc94955f25c1a92e7e2
Author: Alexandre Oliva 
Date:   Thu Nov 28 21:56:34 2024 -0300

ifcombine: avoid unsound forwarder-enabled combinations [PR117723]

When ifcombining contiguous blocks, we can follow forwarder blocks and
reverse conditions to enable combinations, but when there are
intervening blocks, we have to constrain ourselves to paths to the
exit that share the PHI args with all intervening blocks.

Not considering forwarders at all when intervening blocks are present
would match the preexisting test, but we can do better: record whether
a forwarded path corresponds to the outer block's exit path, and refuse
to combine through any path other than the one that was verified as
corresponding.  The latter is what this patch implements.

While at it, I've fixed some typos and introduced early testing before
computing the exit path, to skip that computation when it would be
wasteful, or when skipping it can enable other sound combinations.
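
To make the constraint concrete, here is a small sketch built on f0 from
the new test.  f0_miscombined is a hypothetical, hand-written result of
merging the two tests on a across the intervening test on b; it is not
taken from any compiler dump, and the function names are made up for
illustration.  The intervening block exits with the value 1, while the
combined condition exits with 0, so the exits do not share PHI args and
the merge changes the result.

```c
#include <assert.h>

/* Same shape as f0 in the test: the test on b sits between the two
   tests on a, and it exits with 1 where they exit with 0.  */
int f0_reference (int a, int b)
{
  if (a & 1)
    return 0;
  if (b)
    return 1;
  if (!(a & 2))
    return 0;
  return 1;
}

/* Hypothetical unsound combination: the inner test on a has been
   merged into the outer one, so it now runs before the test on b.  */
int f0_miscombined (int a, int b)
{
  if ((a & 1) || !(a & 2))
    return 0;
  if (b)
    return 1;
  return 1;
}

/* The call used by the test's main distinguishes the two versions.  */
void check_f0 (void)
{
  assert (f0_reference (0, 1) == 1);
  assert (f0_miscombined (0, 1) == 0);
  /* With b == 0 the intervening block is not taken, and they agree.  */
  assert (f0_reference (2, 0) == f0_miscombined (2, 0));
}
```

The two versions differ exactly when the intervening block would have
taken its exit, which is why the path through it has to be verified
rather than assumed.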


for  gcc/ChangeLog

PR tree-optimization/117723
* tree-ssa-ifcombine.cc (tree_ssa_ifcombine_bb): Record
forwarder blocks in path to exit, and stick to them.  Avoid
computing the exit if obviously not needed, and if that
enables additional optimizations.
(tree_ssa_ifcombine_bb_1): Fix typos.

for  gcc/testsuite/ChangeLog

PR tree-optimization/117723
* gcc.dg/torture/ifcmb-1.c: New.

Diff:
---
 gcc/testsuite/gcc.dg/torture/ifcmb-1.c |  63 ++
 gcc/tree-ssa-ifcombine.cc  | 113 +++--
 2 files changed, 158 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/torture/ifcmb-1.c b/gcc/testsuite/gcc.dg/torture/ifcmb-1.c
new file mode 100644
index ..2431a548598f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/ifcmb-1.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+
+/* Test that we do NOT perform unsound transformations for any of these cases.
+   Forwarding blocks to the exit block used to enable some of them.  */
+
+[[gnu::noinline]]
+int f0 (int a, int b) {
+  if ((a & 1))
+return 0;
+  if (b)
+return 1;
+  if (!(a & 2))
+return 0;
+  else
+return 1;
+}
+
+[[gnu::noinline]]
+int f1 (int a, int b) {
+  if (!(a & 1))
+return 0;
+  if (b)
+return 1;
+  if ((a & 2))
+return 1;
+  else
+return 0;
+}
+
+[[gnu::noinline]]
+int f2 (int a, int b) {
+  if ((a & 1))
+return 0;
+  if (b)
+return 1;
+  if (!(a & 2))
+return 0;
+  else
+return 1;
+}
+
+[[gnu::noinline]]
+int f3 (int a, int b) {
+  if (!(a & 1))
+return 0;
+  if (b)
+return 1;
+  if ((a & 2))
+return 1;
+  else
+return 0;
+}
+
+int main() {
+  if (f0 (0, 1) != 1)
+__builtin_abort();
+  if (f1 (1, 1) != 1)
+__builtin_abort();
+  if (f2 (2, 1) != 1)
+__builtin_abort();
+  if (f3 (3, 1) != 1)
+__builtin_abort();
+}
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index e389b12aa37d..19cfd727c6ad 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -1077,7 +1077,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
 }
 
   /* The || form is characterized by a common then_bb with the
- two edges leading to it mergable.  The latter is guaranteed
+ two edges leading to it mergeable.  The latter is guaranteed
  by matching PHI arguments in the then_bb and the inner cond_bb
  having no side-effects.  */
   if (phi_pred_bb != then_bb
@@ -1088,7 +1088,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto then_bb; else goto inner_cond_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1104,7 +1104,7 @@ tree_ssa_ifcombine_bb_1 (basic_block inner_cond_bb, basic_block outer_cond_bb,
   
 if (q) goto inner_cond_bb; else goto then_bb;
   
-if (q) goto then_bb; else goto ...;
+if (p) goto then_bb; else goto ...;
   
 ...
*/
@@ -1139,13 +1139,15 @@ tree_ssa_ifcombine_bb (basic_block inner_cond_bb)
  Look for an OUTER_COND_BBs to combine with INNER_COND_BB.  They need not
  be contiguous, as long as inner and intervening blocks have no side
  effects, and are either single-entry-single-exit or conditionals choosing
- between the same EXIT_BB with the same PHI args, and the path leading to
- INNER_COND_BB.  ??? We could potentially handle multi-block
- single-entry-single-exit regions, but the loop below only deals with
- single-entry-single-exit individual intervening blocks.  Larger regions
- without side effects are presumably rare, so 

[gcc(refs/users/aoliva/heads/testme)] ifcombine: don't try xor on right-hand op

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:be442be963ba54e8deba7613a215a77daa7eb006

commit be442be963ba54e8deba7613a215a77daa7eb006
Author: Alexandre Oliva 
Date:   Thu Nov 28 18:44:35 2024 -0300

ifcombine: don't try xor on right-hand op

Diff:
---
 gcc/gimple-fold.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 6ac7f1593ad3..cb560ba456ba 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -7485,6 +7485,10 @@ decode_field_reference (tree *pexp, HOST_WIDE_INT *pbitsize,
  exp = res_ops[1];
  gcc_checking_assert (!xor_cmp_op);
}
+  else if (!xor_cmp_op)
+   /* Not much we can do when xor appears in the right-hand compare
+  operand.  */
+   return NULL_TREE;
   else
{
  *xor_p = true;


[gcc/aoliva/heads/testme] (3 commits) ifcombine: don't try xor on right-hand op

2024-11-28 Thread Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testme' was updated to point to:

 be442be963ba... ifcombine: don't try xor on right-hand op

It previously pointed to:

 936a10c5c290... ifcombine: avoid forwarders with intervening blocks

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  936a10c... ifcombine: avoid forwarders with intervening blocks
  8841e79... ifcombine: don't try xor on right-hand op
  1df27fc... fold fold_truth_andor field merging into ifcombine


Summary of changes (added commits):
---

  be442be... ifcombine: don't try xor on right-hand op
  1df512b... fold fold_truth_andor field merging into ifcombine
  75e065e... ifcombine: avoid unsound forwarder-enabled combinations [PR


[gcc r15-5762] libstdc++: Reorder printer registrations in printers.py

2024-11-28 Thread Jonathan Wakely via Gcc-cvs
https://gcc.gnu.org/g:6bba4ca26c9919c0d5b590d648bd0ae9adc678ac

commit r15-5762-g6bba4ca26c9919c0d5b590d648bd0ae9adc678ac
Author: Jonathan Wakely 
Date:   Thu Nov 28 15:23:25 2024 +

libstdc++: Reorder printer registrations in printers.py

Register StdIntegralConstantPrinter with the other C++11 printers, and
register StdTextEncodingPrinter after C++20 printers.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py: Reorder registrations.

Diff:
---
 libstdc++-v3/python/libstdcxx/v6/printers.py | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index d05b79762fdd..37ca51b26286 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -2830,10 +2830,6 @@ def build_libstdcxx_dictionary():
 # vector
 libstdcxx_printer.add_version('std::', 'locale', StdLocalePrinter)
 
-libstdcxx_printer.add_version('std::', 'integral_constant',
-  StdIntegralConstantPrinter)
-libstdcxx_printer.add_version('std::', 'text_encoding',
-  StdTextEncodingPrinter)
 
 if hasattr(gdb.Value, 'dynamic_type'):
 libstdcxx_printer.add_version('std::', 'error_code',
@@ -2896,6 +2892,8 @@ def build_libstdcxx_dictionary():
   StdChronoDurationPrinter)
 libstdcxx_printer.add_version('std::chrono::', 'time_point',
   StdChronoTimePointPrinter)
+libstdcxx_printer.add_version('std::', 'integral_constant',
+  StdIntegralConstantPrinter)
 
 # std::regex components
 libstdcxx_printer.add_version('std::__detail::', '_State',
@@ -2971,6 +2969,9 @@ def build_libstdcxx_dictionary():
 # libstdcxx_printer.add_version('std::chrono::(anonymous namespace)', 'Rule',
 #  StdChronoTimeZoneRulePrinter)
 
+# C++26 components
+libstdcxx_printer.add_version('std::', 'text_encoding',
+  StdTextEncodingPrinter)
 # Extensions.
 libstdcxx_printer.add_version('__gnu_cxx::', 'slist', StdSlistPrinter)


[gcc r14-10995] [PR114942][LRA]: Don't reuse input reload reg of inout early clobber operand

2024-11-28 Thread Uros Bizjak via Gcc-cvs
https://gcc.gnu.org/g:196ab7853ef5dc225833a914491add0a3adeaf9d

commit r14-10995-g196ab7853ef5dc225833a914491add0a3adeaf9d
Author: Vladimir N. Makarov 
Date:   Fri May 10 09:15:50 2024 -0400

[PR114942][LRA]: Don't reuse input reload reg of inout early clobber operand

  The insn in question uses the same reg in an inout operand and in an
input operand, and the inout operand is early clobber.  LRA reused the
input reload reg of the inout operand for the input operand, which is
wrong.  It would have been a good decision if the inout operand were
not an early clobber one.  The patch rejects the reuse for the PR test
case.
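
A portable C model of the problem, offered only as an illustration:
early_clobber_op is a hypothetical stand-in for an insn whose inout
operand is written before the input operand is read, and the two
wrappers model sharing versus not sharing a reload reg.  None of these
names exist in LRA, and the arithmetic is arbitrary.

```c
#include <assert.h>

/* Model of an insn with an early clobber inout operand: INOUT is
   written before IN has been read.  Intended result: 2 * v + 1 when
   both operands start out holding v.  */
void early_clobber_op (int *inout, const int *in)
{
  *inout += 1;		/* Early write to the inout operand.  */
  *inout += *in;	/* Late read of the input operand.  */
}

/* Both operands reuse one reload reg: the early write corrupts the
   input before it is read.  */
int with_shared_reload (int v)
{
  int reg = v;
  early_clobber_op (&reg, &reg);
  return reg;
}

/* Each operand gets its own reload reg, as the patch requires for
   early clobber inout operands.  */
int with_separate_reloads (int v)
{
  int r_inout = v;
  int r_in = v;
  early_clobber_op (&r_inout, &r_in);
  return r_inout;
}

/* For v == 5 the intended result is 11; sharing yields 12 instead.  */
void check_reload_model (void)
{
  assert (with_separate_reloads (5) == 11);
  assert (with_shared_reload (5) == 12);
}
```

Without the early write, sharing the register would be harmless, which
matches the remark that reuse is fine for operands that are not early
clobber.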

gcc/ChangeLog:

PR target/114942
* lra-constraints.cc (struct input_reload): Add new member 
early_clobber_p.
(get_reload_reg): Add new arg early_clobber_p, don't reuse input
reload with true early_clobber_p member value, use the arg for new
element of curr_insn_input_reloads.
(match_reload): Assign false to early_clobber_p member.
(process_addr_reg, simplify_operand_subreg, curr_insn_transform):
Adjust get_reload_reg calls.

gcc/testsuite/ChangeLog:

PR target/114942
* gcc.target/i386/pr114942.c: New.

(cherry picked from commit 9585317f0715699197b1313bbf939c6ea3c1ace6)

Diff:
---
 gcc/lra-constraints.cc   | 27 +++
 gcc/testsuite/gcc.target/i386/pr114942.c | 24 
 2 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 10e3d4e40977..1f50e3005ea9 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -599,6 +599,8 @@ struct input_reload
 {
   /* True for input reload of matched operands.  */
   bool match_p;
+  /* True for input reload of inout earlyclobber operand.  */
+  bool early_clobber_p;
   /* Reloaded value.  */
   rtx input;
   /* Reload pseudo used.  */
@@ -649,13 +651,15 @@ canonicalize_reload_addr (rtx addr)
 /* Create a new pseudo using MODE, RCLASS, EXCLUDE_START_HARD_REGS, ORIGINAL or
reuse an existing reload pseudo.  Don't reuse an existing reload pseudo if
IN_SUBREG_P is true and the reused pseudo should be wrapped up in a SUBREG.
+   EARLY_CLOBBER_P is true for input reload of inout early clobber operand.
The result pseudo is returned through RESULT_REG.  Return TRUE if we created
a new pseudo, FALSE if we reused an existing reload pseudo.  Use TITLE to
describe new registers for debug purposes.  */
 static bool
 get_reload_reg (enum op_type type, machine_mode mode, rtx original,
enum reg_class rclass, HARD_REG_SET *exclude_start_hard_regs,
-   bool in_subreg_p, const char *title, rtx *result_reg)
+   bool in_subreg_p, bool early_clobber_p,
+   const char *title, rtx *result_reg)
 {
   int i, regno;
   enum reg_class new_class;
@@ -703,6 +707,7 @@ get_reload_reg (enum op_type type, machine_mode mode, rtx original,
 for (i = 0; i < curr_insn_input_reloads_num; i++)
   {
if (! curr_insn_input_reloads[i].match_p
+   && ! curr_insn_input_reloads[i].early_clobber_p
&& rtx_equal_p (curr_insn_input_reloads[i].input, original)
&& in_class_p (curr_insn_input_reloads[i].reg, rclass, &new_class))
  {
@@ -750,6 +755,8 @@ get_reload_reg (enum op_type type, machine_mode mode, rtx original,
   lra_assert (curr_insn_input_reloads_num < LRA_MAX_INSN_RELOADS);
   curr_insn_input_reloads[curr_insn_input_reloads_num].input = original;
   curr_insn_input_reloads[curr_insn_input_reloads_num].match_p = false;
+  curr_insn_input_reloads[curr_insn_input_reloads_num].early_clobber_p
+= early_clobber_p;
   curr_insn_input_reloads[curr_insn_input_reloads_num++].reg = *result_reg;
   return true;
 }
@@ -1189,6 +1196,7 @@ match_reload (signed char out, signed char *ins, signed char *outs,
   lra_assert (curr_insn_input_reloads_num < LRA_MAX_INSN_RELOADS);
   curr_insn_input_reloads[curr_insn_input_reloads_num].input = in_rtx;
   curr_insn_input_reloads[curr_insn_input_reloads_num].match_p = true;
+  curr_insn_input_reloads[curr_insn_input_reloads_num].early_clobber_p = false;
   curr_insn_input_reloads[curr_insn_input_reloads_num++].reg = new_in_reg;
   for (i = 0; (in = ins[i]) >= 0; i++)
 if (GET_MODE (*curr_id->operand_loc[in]) == VOIDmode
@@ -1577,7 +1585,7 @@ process_addr_reg (rtx *loc, bool check_only_p, rtx_insn **before, rtx_insn **aft
  reg = *loc;
  if (get_reload_reg (after == NULL ? OP_IN : OP_INOUT,
  mode, reg, cl, NULL,
- subreg_p, "address", &new_reg))
+ subreg_p, false, "address", &new_reg))
before_p = true;
}
   else if (new_class != NO_REGS && rclass != new_class)
@@ -1733,7 +1741,7 @@ simplify_operand_subreg (int nop, machine_mode reg_mode)
   

[gcc r14-10996] [PR117105][LRA]: Use unique value reload pseudo for early clobber operand

2024-11-28 Thread Uros Bizjak via Gcc-cvs
https://gcc.gnu.org/g:ea36e9d17971210762580489b71b05e7bf7faa2e

commit r14-10996-gea36e9d17971210762580489b71b05e7bf7faa2e
Author: Vladimir N. Makarov 
Date:   Mon Nov 25 16:09:00 2024 -0500

[PR117105][LRA]: Use unique value reload pseudo for early clobber operand

LRA did not generate an insn satisfying the insn constraints on the PR
test.  The reason is that LRA assigned the same hard reg to two
conflicting reload pseudos.  The two reload pseudos originated from
the same pseudo, and LRA, as an optimization to minimize reload insns,
assigned them the same value.  The two reload pseudos conflict because
one of them is an early clobber insn operand.  The patch solves this
problem by assigning a unique value if the operand is an early clobber
one.

gcc/ChangeLog:

PR target/117105
* lra-constraints.cc (get_reload_reg): Create unique value reload
pseudos for early clobbered operands.

gcc/testsuite/ChangeLog:

PR target/117105
* gcc.target/i386/pr117105.c: New test.

(cherry picked from commit 4b09e2c67ef593db171b0755b46378964421782b)

Diff:
---
 gcc/lra-constraints.cc   |  3 ++-
 gcc/testsuite/gcc.target/i386/pr117105.c | 15 +++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 1f50e3005ea9..dbc5129ae0a2 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -663,7 +663,6 @@ get_reload_reg (enum op_type type, machine_mode mode, rtx original,
 {
   int i, regno;
   enum reg_class new_class;
-  bool unique_p = false;
 
   if (type == OP_OUT)
 {
@@ -701,6 +700,8 @@ get_reload_reg (enum op_type type, machine_mode mode, rtx original,
exclude_start_hard_regs, title);
   return true;
 }
+
+  bool unique_p = early_clobber_p;
   /* Prevent reuse value of expression with side effects,
  e.g. volatile memory.  */
   if (! side_effects_p (original))
diff --git a/gcc/testsuite/gcc.target/i386/pr117105.c b/gcc/testsuite/gcc.target/i386/pr117105.c
new file mode 100644
index ..252bb138c9c7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr117105.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-code-hoisting -fno-tree-fre -fno-tree-dominator-opts -fno-tree-pre -fno-tree-sra" } */
+int a;
+struct b {
+  char c;
+  char d;
+};
+int main() {
+  struct b e;
+  int f;
+  while (a)
+if (f == e.d)
+  f = e.c = e.d & 1 >> e.d;
+  return 0;
+}


[gcc r15-5759] libstdc++: Reduce duplication in Doxygen comments for std::list

2024-11-28 Thread Jonathan Wakely via Libstdc++-cvs
https://gcc.gnu.org/g:3f3966b5a3103ab198257b62bd7563996f2f6f65

commit r15-5759-g3f3966b5a3103ab198257b62bd7563996f2f6f65
Author: Jonathan Wakely 
Date:   Mon Nov 25 23:01:28 2024 +

libstdc++: Reduce duplication in Doxygen comments for std::list

We have a number of comments which are duplicated for C++98 and C++11
overloads, where the signatures are slightly different. Instead of
duplicating the comments that are 90% identical, just use a single
comment that can apply to both. In some cases this means saying "An
iterator" instead of "A const_iterator" but that's fine, a
std::list::const_iterator is still an iterator (and a non-const iterator
is a valid argument to those functions because it will implicitly
convert to const_iterator).

In two cases the @return description just needs to say that it returns
void for C++98 and an iterator otherwise.
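
The implicit conversion mentioned above can be seen in a small example.
This is a minimal sketch assuming C++11 or later; insert_at is a
hypothetical helper written for illustration and is not part of
libstdc++.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Insert `value` before the element at index `pos`.  The position is
// computed as a non-const iterator and passed to the C++11 overload
// insert(const_iterator, const value_type&).
inline std::list<int>::iterator
insert_at(std::list<int>& l, std::size_t pos, int value)
{
    assert(pos <= l.size());
    std::list<int>::iterator it =
        std::next(l.begin(), static_cast<std::ptrdiff_t>(pos));
    return l.insert(it, value);  // iterator converts to const_iterator
}
```

The call compiles without a cast, which is why a shared comment that
says "An iterator" describes both the C++98 and the C++11 signatures
accurately.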

libstdc++-v3/ChangeLog:

* include/bits/stl_list.h: Reduce duplication in doxygen
comments.

Diff:
---
 libstdc++-v3/include/bits/stl_list.h | 142 +--
 1 file changed, 35 insertions(+), 107 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_list.h b/libstdc++-v3/include/bits/stl_list.h
index df7f388ede5a..cf3d05fcae95 100644
--- a/libstdc++-v3/include/bits/stl_list.h
+++ b/libstdc++-v3/include/bits/stl_list.h
@@ -1477,10 +1477,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   template<typename... _Args>
iterator
emplace(const_iterator __position, _Args&&... __args);
+#endif
 
   /**
*  @brief  Inserts given value into %list before specified iterator.
-   *  @param  __position  A const_iterator into the %list.
+   *  @param  __position  An iterator into the %list.
*  @param  __x  Data to be inserted.
*  @return  An iterator that points to the inserted data.
*
@@ -1488,52 +1489,34 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
*  the specified location.  Due to the nature of a %list this
*  operation can be done in constant time, and does not
*  invalidate iterators and references.
+   *
+   *  @{
*/
+#if __cplusplus >= 201103L
   iterator
   insert(const_iterator __position, const value_type& __x);
+
+  iterator
+  insert(const_iterator __position, value_type&& __x)
+  { return emplace(__position, std::move(__x)); }
 #else
-  /**
-   *  @brief  Inserts given value into %list before specified iterator.
-   *  @param  __position  An iterator into the %list.
-   *  @param  __x  Data to be inserted.
-   *  @return  An iterator that points to the inserted data.
-   *
-   *  This function will insert a copy of the given value before
-   *  the specified location.  Due to the nature of a %list this
-   *  operation can be done in constant time, and does not
-   *  invalidate iterators and references.
-   */
   iterator
   insert(iterator __position, const value_type& __x);
 #endif
+  /// @}
 
 #if __cplusplus >= 201103L
-  /**
-   *  @brief  Inserts given rvalue into %list before specified iterator.
-   *  @param  __position  A const_iterator into the %list.
-   *  @param  __x  Data to be inserted.
-   *  @return  An iterator that points to the inserted data.
-   *
-   *  This function will insert a copy of the given rvalue before
-   *  the specified location.  Due to the nature of a %list this
-   *  operation can be done in constant time, and does not
-   *  invalidate iterators and references.
-   */
-  iterator
-  insert(const_iterator __position, value_type&& __x)
-  { return emplace(__position, std::move(__x)); }
-
   /**
*  @brief  Inserts the contents of an initializer_list into %list
*  before specified const_iterator.
*  @param  __p  A const_iterator into the %list.
*  @param  __l  An initializer_list of value_type.
*  @return  An iterator pointing to the first element inserted
-   *   (or __position).
+   *   (or `__p`).
*
*  This function will insert copies of the data in the
-   *  initializer_list @a l into the %list before the location
-   *  specified by @a p.
+   *  initializer_list `__l` into the %list before the location
+   *  specified by `__p`.
*
*  This operation is linear in the number of elements inserted and
*  does not invalidate iterators and references.
@@ -1543,36 +1526,24 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   { return this->insert(__p, __l.begin(), __l.end()); }
 #endif
 
-#if __cplusplus >= 201103L
   /**
*  @brief  Inserts a number of copies of given data into the %list.
-   *  @param  __position  A const_iterator into the %list.
+   *  @param  __position  An iterator into the %list.
*  @param  __n  Number of elements to be inserted.
*  @param  __x  

[gcc r15-5764] c++: define __cpp_pack_indexing [PR113798]

2024-11-28 Thread Marek Polacek via Gcc-cvs
https://gcc.gnu.org/g:a4954130d43d478a23ec8b65f5d861167935d77a

commit r15-5764-ga4954130d43d478a23ec8b65f5d861167935d77a
Author: Marek Polacek 
Date:   Thu Nov 28 12:07:00 2024 -0500

c++: define __cpp_pack_indexing [PR113798]

Forgot to do this in my original patch.

PR c++/113798

gcc/c-family/ChangeLog:

* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__cpp_pack_indexing=202311L for C++26.

gcc/testsuite/ChangeLog:

* g++.dg/cpp26/feat-cxx26.C (__cpp_pack_indexing): Add test.

Reviewed-by: Jakub Jelinek 
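
User code can test the new macro before relying on pack indexing.  The
following is a hedged sketch, not taken from the patch: first_of is a
hypothetical function, and the std::tuple fallback is just one way to
get the same result when the macro is not defined (it assumes C++14 or
later).

```cpp
#include <tuple>

#if defined(__cpp_pack_indexing) && __cpp_pack_indexing >= 202311L
// Pack indexing is available: pick the first element directly.
template<typename... Ts>
constexpr auto first_of(Ts... args) { return args...[0]; }
#else
// Fallback for compilers or dialects without pack indexing.
template<typename... Ts>
constexpr auto first_of(Ts... args)
{ return std::get<0>(std::make_tuple(args...)); }
#endif

static_assert(first_of(1, 2, 3) == 1, "first_of returns the first argument");
```

Both branches yield the same value, so the macro only selects which
implementation is compiled.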

Diff:
---
 gcc/c-family/c-cppbuiltin.cc| 1 +
 gcc/testsuite/g++.dg/cpp26/feat-cxx26.C | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/gcc/c-family/c-cppbuiltin.cc b/gcc/c-family/c-cppbuiltin.cc
index c354c794b55e..195f8ae5e40b 100644
--- a/gcc/c-family/c-cppbuiltin.cc
+++ b/gcc/c-family/c-cppbuiltin.cc
@@ -1092,6 +1092,7 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_structured_bindings=202403L");
  cpp_define (pfile, "__cpp_deleted_function=202403L");
  cpp_define (pfile, "__cpp_variadic_friend=202403L");
+ cpp_define (pfile, "__cpp_pack_indexing=202311L");
}
   if (flag_concepts && cxx_dialect > cxx14)
cpp_define (pfile, "__cpp_concepts=202002L");
diff --git a/gcc/testsuite/g++.dg/cpp26/feat-cxx26.C b/gcc/testsuite/g++.dg/cpp26/feat-cxx26.C
index c387a7dfe609..d74ff0e427bd 100644
--- a/gcc/testsuite/g++.dg/cpp26/feat-cxx26.C
+++ b/gcc/testsuite/g++.dg/cpp26/feat-cxx26.C
@@ -622,3 +622,9 @@
 #elif __cpp_variadic_friend != 202403
 #  error "__cpp_variadic_friend != 202403"
 #endif
+
+#ifndef __cpp_pack_indexing
+# error "__cpp_pack_indexing"
+#elif __cpp_pack_indexing != 202311
+#  error "__cpp_pack_indexing != 202311"
+#endif


[gcc r15-5765] libstdc++: Use std::_Destroy in std::stacktrace

2024-11-28 Thread Jonathan Wakely via Libstdc++-cvs
https://gcc.gnu.org/g:1e2b03e4d66d894c2e42d209502b6957b2dabbf9

commit r15-5765-g1e2b03e4d66d894c2e42d209502b6957b2dabbf9
Author: Jonathan Wakely 
Date:   Thu Nov 28 12:38:22 2024 +

libstdc++: Use std::_Destroy in std::stacktrace

This benefits from the optimizations in std::_Destroy which avoid doing
any work when using std::allocator.
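
The kind of shortcut meant here can be sketched in portable C++.  This
is an illustration under stated assumptions, not the libstdc++
implementation of std::_Destroy: destroy_range, shrink and Counted are
hypothetical names, and the overload for std::allocator relies on the
fact that destroying through std::allocator just runs the destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Generic path: every element is destroyed through the allocator.
template<typename T, typename Alloc>
void destroy_range(T* first, T* last, Alloc& alloc)
{
    for (; first != last; ++first)
        std::allocator_traits<Alloc>::destroy(alloc, first);
}

// std::allocator path: call destructors directly, and do nothing at all
// when the element type is trivially destructible.
template<typename T>
void destroy_range(T* first, T* last, std::allocator<T>&)
{
    if (!std::is_trivially_destructible<T>::value)
        for (; first != last; ++first)
            first->~T();
}

// Same shape as _M_resize in the patch: drop elements [n, size) and
// return the new size.
template<typename T, typename Alloc>
std::size_t shrink(T* frames, std::size_t size, std::size_t n, Alloc& alloc)
{
    assert(n <= size);
    destroy_range(frames + n, frames + size, alloc);
    return n;
}

// Helper type for observing destructor calls in the example.
struct Counted {
    static int destroyed;
    ~Counted() { ++destroyed; }
};
int Counted::destroyed = 0;
```

With std::allocator and a trivially destructible element type, shrink
reduces to returning n, which is the saving the commit message refers
to.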

libstdc++-v3/ChangeLog:

* include/std/stacktrace (basic_stacktrace::_M_impl::_M_resize):
Use std::_Destroy to destroy removed elements.

Diff:
---
 libstdc++-v3/include/std/stacktrace | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/stacktrace b/libstdc++-v3/include/std/stacktrace
index 2c0f6ba10a91..f94a424e4cff 100644
--- a/libstdc++-v3/include/std/stacktrace
+++ b/libstdc++-v3/include/std/stacktrace
@@ -601,8 +601,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
void
_M_resize(size_type __n, allocator_type& __alloc) noexcept
{
- for (size_type __i = __n; __i < _M_size; ++__i)
-   _AllocTraits::destroy(__alloc, &_M_frames[__i]);
+ std::_Destroy(_M_frames + __n, _M_frames + _M_size, __alloc);
  _M_size = __n;
}


[gcc r15-5766] Fortran: fix crash with bounds check writing array section [PR117791]

2024-11-28 Thread Harald Anlauf via Gcc-cvs
https://gcc.gnu.org/g:2261a15c0715cbf5c129b66ee44fc1d3a9e36972

commit r15-5766-g2261a15c0715cbf5c129b66ee44fc1d3a9e36972
Author: Harald Anlauf 
Date:   Wed Nov 27 21:11:16 2024 +0100

Fortran: fix crash with bounds check writing array section [PR117791]

PR fortran/117791

gcc/fortran/ChangeLog:

* trans-io.cc (gfc_trans_transfer): When an array index depends on
a function evaluation or an expression, do not use optimized array
I/O of an array section and fall back to normal scalarization.

gcc/testsuite/ChangeLog:

* gfortran.dg/bounds_check_array_io.f90: New test.

Diff:
---
 gcc/fortran/trans-io.cc| 20 ++
 .../gfortran.dg/bounds_check_array_io.f90  | 31 ++
 2 files changed, 51 insertions(+)

diff --git a/gcc/fortran/trans-io.cc b/gcc/fortran/trans-io.cc
index 961a711c5301..906dd7c6eb61 100644
--- a/gcc/fortran/trans-io.cc
+++ b/gcc/fortran/trans-io.cc
@@ -2648,6 +2648,26 @@ gfc_trans_transfer (gfc_code * code)
 || gfc_expr_attr (expr).pointer))
goto scalarize;
 
+  /* With array-bounds checking enabled, force scalarization in some
+situations, e.g., when an array index depends on a function
+evaluation or an expression and possibly has side-effects.  */
+  if ((gfc_option.rtcheck & GFC_RTCHECK_BOUNDS)
+ && ref
+ && ref->u.ar.type == AR_SECTION)
+   {
+ for (n = 0; n < ref->u.ar.dimen; n++)
+   if (ref->u.ar.dimen_type[n] == DIMEN_ELEMENT
+   && ref->u.ar.start[n])
+ {
+   switch (ref->u.ar.start[n]->expr_type)
+ {
+ case EXPR_FUNCTION:
+ case EXPR_OP:
+   goto scalarize;
+ }
+ }
+   }
+
   if (!(gfc_bt_struct (expr->ts.type)
  || expr->ts.type == BT_CLASS)
&& ref && ref->next == NULL
diff --git a/gcc/testsuite/gfortran.dg/bounds_check_array_io.f90 b/gcc/testsuite/gfortran.dg/bounds_check_array_io.f90
new file mode 100644
index ..0cfc11742834
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/bounds_check_array_io.f90
@@ -0,0 +1,31 @@
+! { dg-do run }
+! { dg-additional-options "-fcheck=bounds -fdump-tree-original" }
+!
+! PR fortran/117791 - crash with bounds check writing array section
+! Contributed by Andreas van Hameren (hameren at ifj dot edu dot pl)
+
+program testprogram
+  implicit none
+  integer, parameter :: array(4,2)=reshape ([11,12,13,14 ,15,16,17,18], [4,2])
+  integer:: i(3) = [45,51,0]
+
+  write(*,*) 'line 1:',array(:,  sort_2(i(1:2)) )
+  write(*,*) 'line 2:',array(:,  3 - sort_2(i(1:2)) )
+  write(*,*) 'line 3:',array(:, int (3 - sort_2(i(1:2))))
+
+contains
+
+  function sort_2(i) result(rslt)
+integer,intent(in) :: i(2)
+integer:: rslt
+if (i(1) <= i(2)) then
+   rslt = 1
+else
+   rslt = 2
+endif
+  end function
+
+end program 
+
+! { dg-final { scan-tree-dump-times "sort_2" 5 "original" } }
+! { dg-final { scan-tree-dump-not "_gfortran_transfer_array_write" "original" } }