date:20241122

[gcc r15-5586] testsuite: Fix up vector-{8,9,10}.c tests

2024-11-22 Thread Jakub Jelinek via Gcc-cvs

https://gcc.gnu.org/g:77f4b1097e6aec50053577a8a1a65487ed58cbb0

commit r15-5586-g77f4b1097e6aec50053577a8a1a65487ed58cbb0
Author: Jakub Jelinek 
Date:   Fri Nov 22 10:02:59 2024 +0100

testsuite: Fix up vector-{8,9,10}.c tests

On Thu, Nov 21, 2024 at 01:30:39PM +0100, Christoph Müllner wrote:
> > >   * gcc.dg/tree-ssa/satd-hadamard.c: New test.
> > >   * gcc.dg/tree-ssa/vector-10.c: New test.
> > >   * gcc.dg/tree-ssa/vector-8.c: New test.
> > >   * gcc.dg/tree-ssa/vector-9.c: New test.

I see FAILs on i686-linux or on x86_64-linux (in the latter
with -m32 testing).

One problem is that vector-10.c doesn't use the -Wno-psabi option
even though it uses a function which returns a vector and takes a
vector as its first parameter.  The other problems are that the 3
other tests don't arrange for at least basic vector ISA support and
non-standardly test only on x86_64-*-*.  Normally one would allow
both i?86-*-* and x86_64-*-* and, if the test is e.g. specific to
64-bit, also check for lp64 or int128 or whatever else is needed.
E.g. Solaris I think has the i?86-*-* triplet even for 64-bit code.

The following patch fixes these.

2024-11-22  Jakub Jelinek  

* gcc.dg/tree-ssa/satd-hadamard.c: Add -msse2 as
dg-additional-options on x86.  Also scan-tree-dump on i?86-*-*.
* gcc.dg/tree-ssa/vector-8.c: Likewise.
* gcc.dg/tree-ssa/vector-9.c: Likewise.
* gcc.dg/tree-ssa/vector-10.c: Add -Wno-psabi to 
dg-additional-options.
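
As a sketch of the header the adjusted tests end up with (an illustrative file, not one of the tests touched by this patch; the function name add_lanes is made up), a test that passes vectors by value and needs basic vector ISA support on x86 could start like this:

```c
/* { dg-do compile } */
/* { dg-additional-options "-O3 -Wno-psabi" } */
/* { dg-additional-options "-msse2" { target i?86-*-* x86_64-*-* } } */

typedef int vec __attribute__((vector_size (4 * sizeof (int))));

/* Hypothetical example function.  It takes vectors by value and returns
   one, which on 32-bit x86 without SSE can trigger a -Wpsabi
   diagnostic; hence -Wno-psabi in the options above.  */
vec
add_lanes (vec a, vec b)
{
  return a + b;
}
```

The target selector lists both i?86-*-* and x86_64-*-*, so the extra option is also applied when a 32-bit multilib is tested.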

Diff:
---
 gcc/testsuite/gcc.dg/tree-ssa/satd-hadamard.c | 3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/vector-10.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vector-8.c  | 5 +++--
 gcc/testsuite/gcc.dg/tree-ssa/vector-9.c  | 5 +++--
 4 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/satd-hadamard.c b/gcc/testsuite/gcc.dg/tree-ssa/satd-hadamard.c
index 7a22772f2e63..6042378f1650 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/satd-hadamard.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/satd-hadamard.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-O3 -fdump-tree-forwprop4-details" } */
+/* { dg-additional-options "-msse2" { target i?86-*-* x86_64-*-* } } */
 
 #include <stdint.h>
 
@@ -40,4 +41,4 @@ x264_pixel_satd_8x4_simplified (uint8_t *pix1, int i_pix1, uint8_t *pix2, int i_
   return (((uint16_t)sum) + ((uint32_t)sum>>16)) >> 1;
 }
 
-/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 2, 3, 6, 7 }" "forwprop4" { target { aarch64*-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 2, 3, 6, 7 }" "forwprop4" { target { aarch64*-*-* i?86-*-* x86_64-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vector-10.c b/gcc/testsuite/gcc.dg/tree-ssa/vector-10.c
index d5caebdf1742..bb1ed92dc90a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vector-10.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vector-10.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-O3 -fdump-tree-forwprop1-details" } */
+/* { dg-additional-options "-O3 -fdump-tree-forwprop1-details -Wno-psabi" } */
 
 typedef int vec __attribute__((vector_size (4 * sizeof (int))));
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vector-8.c b/gcc/testsuite/gcc.dg/tree-ssa/vector-8.c
index 3a7b62b640d6..ba9a0187c106 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vector-8.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vector-8.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-O3 -fdump-tree-forwprop1-details" } */
+/* { dg-additional-options "-msse2" { target i?86-*-* x86_64-*-* } } */
 
 typedef int vec __attribute__((vector_size (4 * sizeof (int))));
 
@@ -30,5 +31,5 @@ void f (vec *p_v_in_1, vec *p_v_in_2, vec *p_v_out_1, vec *p_v_out_2)
   *p_v_out_2 = v_out_2;
 }
 
-/* { dg-final { scan-tree-dump "Vec perm simplify sequences have been blended" "forwprop1" { target { aarch64*-*-* x86_64-*-* } } } } */
-/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 2, 3, 6, 7 }" "forwprop1" { target { aarch64*-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-tree-dump "Vec perm simplify sequences have been blended" "forwprop1" { target { aarch64*-*-* i?86-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 2, 3, 6, 7 }" "forwprop1" { target { aarch64*-*-* i?86-*-* x86_64-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vector-9.c b/gcc/testsuite/gcc.dg/tree-ssa/vector-9.c
index ba34fb163d67..1aa2ef99c3c2 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vector-9.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vector-9.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-O3 -fdump-tree-forwprop1-details" } */
+/* { dg-additional-options "-msse2" { target i?86-*-* x86_64-*-* } } */
 
 typedef int vec __attribute__((vector_size (4 * sizeof (int))));
 
@@ -30,5 +31,5 @@ void f (vec *p_v_in_1, vec *p_v_in_2, vec *p_v_out_1, vec *p_v_out_2)
   *p_v_out_2 = v_out_2;
 }
 
-/* { dg-final {

[gcc r14-10963] [PATCH] modula2: Fix typos, grammar, and a link

2024-11-22 Thread Gaius Mulley via Gcc-cvs

https://gcc.gnu.org/g:53b0e42ac425efc6aafc73108adb030060a145ff

commit r14-10963-g53b0e42ac425efc6aafc73108adb030060a145ff
Author: Gaius Mulley 
Date:   Fri Nov 22 11:39:26 2024 +0000

[PATCH] modula2: Fix typos, grammar, and a link

gcc:
* doc/gm2.texi (Documentation): Fix typos, grammar, and a link.
(cherry picked from commit ec0865623fc555086f96bdf52ec59f60b213be36)

Signed-off-by: Gaius Mulley 

Diff:
---
 gcc/doc/gm2.texi | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/gm2.texi b/gcc/doc/gm2.texi
index 72b659bdde01..08b09a7e6f03 100644
--- a/gcc/doc/gm2.texi
+++ b/gcc/doc/gm2.texi
@@ -2906,9 +2906,9 @@ you wish to see something different please email
 @node Documentation, Regression tests, Release map, Using
 @section Documentation
 
-The GNU Modula-2 documentation is available on line
-@url{https://gcc.gnu.org/onlinedocs}
-or in the pdf, info, html file format.
+The GNU Modula-2 documentation is available online at
+@url{https://gcc.gnu.org/onlinedocs/}
+in the PDF, info, and HTML file formats.
 
 @node Regression tests, Limitations, Documentation, Using
 @section Regression tests for gm2 in the repository

[gcc r15-5585] middle-end: For multiplication try swapping operands when matching complex multiply [PR116463]

2024-11-22 Thread Tamar Christina via Gcc-cvs

https://gcc.gnu.org/g:a9473f9c6f2d755d2eb79dbd30877e64b4bc6fc8

commit r15-5585-ga9473f9c6f2d755d2eb79dbd30877e64b4bc6fc8
Author: Tamar Christina 
Date:   Thu Nov 21 15:10:24 2024 +0000

middle-end: For multiplication try swapping operands when matching complex multiply [PR116463]

This commit fixes the failures of complex.exp=fast-math-complex-mls-*.c on the
GCC 14 branch and some of the ones on master.

The current matching just looks for one order for multiplication and was relying
on canonicalization to always give the right order because of the TWO_OPERANDS.

However, when it comes to multiplication, trying only one order is a bit
fragile as the operands can be flipped.

The failing tests on the branch are:

void fms180snd(_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
               _Complex TYPE c[restrict N]) {
  for (int i = 0; i < N; i++)
    c[i] -= a[i] * (b[i] * I * I);
}

void fms180fst(_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
               _Complex TYPE c[restrict N]) {
  for (int i = 0; i < N; i++)
    c[i] -= (a[i] * I * I) * b[i];
}

The issue is just a small difference in commutative operations.
We look for {R,R} * {R,I} but found {R,I} * {R,R}.
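
To make the two failing forms concrete, here is a small scalar sketch, assuming TYPE is double and looking at a single loop iteration (the helper names are made up for illustration).  Multiplying by I * I is a rotation by 180 degrees, so both forms compute c + a * b and differ only in which multiplication operand carries the rotation:

```c
#include <complex.h>

/* One iteration of the fms180snd form: the rotation is applied to the
   second multiplication operand.  */
double complex
fms180snd_elem (double complex a, double complex b, double complex c)
{
  c -= a * (b * I * I);
  return c;
}

/* One iteration of the fms180fst form: the rotation is applied to the
   first multiplication operand.  */
double complex
fms180fst_elem (double complex a, double complex b, double complex c)
{
  c -= (a * I * I) * b;
  return c;
}
```

For a = 1 + 2i, b = 3 + 4i and c = 0 both helpers yield -5 + 10i, which is a * b.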

Since the DF analysis is cached, we should be able to swap operands and retry
for multiply cheaply.

There is a constraint being checked by vect_validate_multiplication for the data
flow of the operands feeding the multiplications.  So e.g.

between the nodes:

note:   node 0x4d1d210 (max_nunits=2, refcnt=3) vector(2) double
note:   op template: _27 = _10 * _25;
note:  stmt 0 _27 = _10 * _25;
note:  stmt 1 _29 = _11 * _25;
note:   node 0x4d1d060 (max_nunits=2, refcnt=2) vector(2) double
note:   op template: _26 = _11 * _24;
note:  stmt 0 _26 = _11 * _24;
note:  stmt 1 _28 = _10 * _24;

we require the lanes to come from the same source which
vect_validate_multiplication checks.  As such it doesn't make sense to flip them
individually because that would invalidate the earlier linear_loads_p checks
which have validated that the arguments all come from the same datarefs.

This patch thus flips the operands in unison to still maintain this invariant,
but also honor the commutative nature of multiplication.
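
The retry logic can be pictured with a toy model.  This is only a sketch under assumptions: the struct, the lane strings and validate () are invented stand-ins, not GCC's SLP data structures or vect_validate_multiplication itself.  Each multiply node has two operands, and the validator accepts exactly one operand order:

```c
#include <stdbool.h>
#include <string.h>

/* Toy multiply node: each operand is described by the parts
   ("R"eal or "I"maginary) that its two lanes load.  */
struct mul_node
{
  const char *op[2];
};

/* Hypothetical stand-in for the validation step: it accepts only one
   operand order for both nodes.  */
static bool
validate (const struct mul_node *left, const struct mul_node *right)
{
  return strcmp (left->op[0], "RR") == 0 && strcmp (left->op[1], "RI") == 0
         && strcmp (right->op[0], "II") == 0
         && strcmp (right->op[1], "IR") == 0;
}

static void
swap_operands (struct mul_node *n)
{
  const char *tmp = n->op[0];
  n->op[0] = n->op[1];
  n->op[1] = tmp;
}

/* Try the given order first; on failure flip BOTH nodes in unison and
   retry once.  Flipping only one node is never attempted.  */
bool
match_with_retry (struct mul_node *left, struct mul_node *right)
{
  if (validate (left, right))
    return true;
  swap_operands (left);
  swap_operands (right);
  return validate (left, right);
}
```

A pair in which both nodes are flipped matches on the retry, while a pair in which only one node is flipped is still rejected, which mirrors the invariant described above.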

gcc/ChangeLog:

PR tree-optimization/116463
* tree-vect-slp-patterns.cc (complex_mul_pattern::matches,
complex_fms_pattern::matches): Try swapping operands on multiply.

Diff:
---
 gcc/tree-vect-slp-patterns.cc | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-vect-slp-patterns.cc b/gcc/tree-vect-slp-patterns.cc
index d62682be43c9..2535d46db3e8 100644
--- a/gcc/tree-vect-slp-patterns.cc
+++ b/gcc/tree-vect-slp-patterns.cc
@@ -1076,7 +1076,15 @@ complex_mul_pattern::matches (complex_operation_t op,
   enum _conj_status status;
   if (!vect_validate_multiplication (perm_cache, compat_cache, left_op,
 right_op, false, &status))
-return IFN_LAST;
+{
+  /* Try swapping the order and re-trying since multiplication is
+commutative.  */
+  std::swap (left_op[0], left_op[1]);
+  std::swap (right_op[0], right_op[1]);
+  if (!vect_validate_multiplication (perm_cache, compat_cache, left_op,
+right_op, false, &status))
+   return IFN_LAST;
+}
 
   if (status == CONJ_NONE)
 {
@@ -1293,7 +1301,15 @@ complex_fms_pattern::matches (complex_operation_t op,
   enum _conj_status status;
   if (!vect_validate_multiplication (perm_cache, compat_cache, right_op,
 left_op, true, &status))
-return IFN_LAST;
+{
+  /* Try swapping the order and re-trying since multiplication is
+commutative.  */
+  std::swap (left_op[0], left_op[1]);
+  std::swap (right_op[0], right_op[1]);
+  if (!vect_validate_multiplication (perm_cache, compat_cache, right_op,
+left_op, true, &status))
+   return IFN_LAST;
+}
 
   if (status == CONJ_NONE)
 ifn = IFN_COMPLEX_FMS;

[gcc r15-5587] i386: Make __builtin_ia32_f{nstenv, ldenv, nstsw, fnclex} builtins internal [PR117165]

2024-11-22 Thread Jakub Jelinek via Gcc-cvs

https://gcc.gnu.org/g:d6d1fdcf953a79d1e3ef2d28c99c1933d1e07d80

commit r15-5587-gd6d1fdcf953a79d1e3ef2d28c99c1933d1e07d80
Author: Jakub Jelinek 
Date:   Fri Nov 22 11:33:34 2024 +0100

i386: Make __builtin_ia32_f{nstenv,ldenv,nstsw,fnclex} builtins internal [PR117165]

As the comment says, these builtins are meant to be internal for the atomic
support and cause various ICEs when used directly under various
conditions.
So the following patch makes them internal.
We also have internal-fn.*, but those target-specific builtins would
need to be there in generic code, so I've just added a space to their names,
which is the old way to hide builtins/attributes etc.
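
Why a trailing space hides a builtin can be shown with a toy lookup.  This is a sketch, not GCC's actual identifier lookup (which does not use a linear strcmp scan); the table and lookup_builtin () are made up.  The point is that an identifier token in user code can never contain a space, so it can never equal the registered name:

```c
#include <stddef.h>
#include <string.h>

/* Toy builtin table; the second entry carries the trailing space.  */
static const char *const builtin_names[] = {
  "__builtin_ia32_pause",
  "__builtin_ia32_fnclex ",
};

/* Return the table index of IDENTIFIER, or -1 if it is not registered
   under exactly that spelling.  */
int
lookup_builtin (const char *identifier)
{
  size_t n = sizeof builtin_names / sizeof builtin_names[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (builtin_names[i], identifier) == 0)
      return (int) i;
  return -1;
}
```

Looking up "__builtin_ia32_fnclex" as a user would spell it fails, while the compiler can still refer to the entry through its enumerator.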

2024-11-22  Jakub Jelinek  

PR target/117165
* config/i386/i386-builtin.def (IX86_BUILTIN_FNSTENV,
IX86_BUILTIN_FLDENV, IX86_BUILTIN_FNSTSW, IX86_BUILTIN_FNCLEX): Add
space to the end of the builtin name to make it really internal.

* gcc.target/i386/pr117165.c: New test.

Diff:
---
 gcc/config/i386/i386-builtin.def |  8 
 gcc/testsuite/gcc.target/i386/pr117165.c | 27 +++
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 26c23780b1c6..d4fa87cb4766 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -94,10 +94,10 @@ BDESC (0, 0, CODE_FOR_nothing, "__builtin_ia32_rdpmc", IX86_BUILTIN_RDPMC, UNKNO
 BDESC (0, 0, CODE_FOR_pause, "__builtin_ia32_pause", IX86_BUILTIN_PAUSE, UNKNOWN, (int) VOID_FTYPE_VOID)
 
 /* 80387 (for use internally for atomic compound assignment).  */
-BDESC (0, 0, CODE_FOR_fnstenv, "__builtin_ia32_fnstenv", IX86_BUILTIN_FNSTENV, UNKNOWN, (int) VOID_FTYPE_PVOID)
-BDESC (0, 0, CODE_FOR_fldenv, "__builtin_ia32_fldenv", IX86_BUILTIN_FLDENV, UNKNOWN, (int) VOID_FTYPE_PCVOID)
-BDESC (0, 0, CODE_FOR_fnstsw, "__builtin_ia32_fnstsw", IX86_BUILTIN_FNSTSW, UNKNOWN, (int) USHORT_FTYPE_VOID)
-BDESC (0, 0, CODE_FOR_fnclex, "__builtin_ia32_fnclex", IX86_BUILTIN_FNCLEX, UNKNOWN, (int) VOID_FTYPE_VOID)
+BDESC (0, 0, CODE_FOR_fnstenv, "__builtin_ia32_fnstenv ", IX86_BUILTIN_FNSTENV, UNKNOWN, (int) VOID_FTYPE_PVOID)
+BDESC (0, 0, CODE_FOR_fldenv, "__builtin_ia32_fldenv ", IX86_BUILTIN_FLDENV, UNKNOWN, (int) VOID_FTYPE_PCVOID)
+BDESC (0, 0, CODE_FOR_fnstsw, "__builtin_ia32_fnstsw ", IX86_BUILTIN_FNSTSW, UNKNOWN, (int) USHORT_FTYPE_VOID)
+BDESC (0, 0, CODE_FOR_fnclex, "__builtin_ia32_fnclex ", IX86_BUILTIN_FNCLEX, UNKNOWN, (int) VOID_FTYPE_VOID)
 
 /* MMX */
 BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_emms, "__builtin_ia32_emms", IX86_BUILTIN_EMMS, UNKNOWN, (int) VOID_FTYPE_VOID)
diff --git a/gcc/testsuite/gcc.target/i386/pr117165.c b/gcc/testsuite/gcc.target/i386/pr117165.c
new file mode 100644
index 000000000000..d1f9663eb333
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr117165.c
@@ -0,0 +1,27 @@
+/* PR target/117165 */
+/* { dg-do compile } */
+/* { dg-options "-msoft-float" } */
+
+void
+foo ()
+{
+  __builtin_ia32_fnstsw ();	/* { dg-error "implicit declaration of function" } */
+}
+
+void
+bar ()
+{
+  __builtin_ia32_fnclex ();	/* { dg-error "implicit declaration of function" } */
+}
+
+void
+baz ()
+{
+  __builtin_ia32_fnstenv (0);	/* { dg-error "implicit declaration of function" } */
+}
+
+void
+qux ()
+{
+  __builtin_ia32_fldenv (0);	/* { dg-error "implicit declaration of function" } */
+}

[gcc r14-10962] [PATCH] modula2: Simplify REAL/LONGREAL/SHORTREAL node creation.

2024-11-22 Thread Gaius Mulley via Gcc-cvs

https://gcc.gnu.org/g:6bae5e6fef530f6af6e305b0e7d41e0800074ff8

commit r14-10962-g6bae5e6fef530f6af6e305b0e7d41e0800074ff8
Author: Gaius Mulley 
Date:   Fri Nov 22 10:32:05 2024 +0000

[PATCH] modula2: Simplify REAL/LONGREAL/SHORTREAL node creation.

This patch simplifies the real type build functions by using
the default float_type_node and double_type_node rather than creating
new nodes.  It also uses the default GCC long_double_type_node
or float128_type_node for longreal.

gcc/m2/ChangeLog:

* gm2-gcc/m2type.cc (build_m2_short_real_node): Rewrite
to use the default float_type_node.
(build_m2_real_node): Rewrite to use the default
double_type_node.
(build_m2_long_real_node): Rewrite to use the default
long_double_type_node or float128_type_node.

Co-Authored-By: Kewen.Lin  
(cherry picked from commit 30ce9dfcc665b6088e5898cfa766b57556ebb90e)

Signed-off-by: Gaius Mulley 

Diff:
---
 gcc/m2/gm2-gcc/m2type.cc | 30 +++---
 1 file changed, 7 insertions(+), 23 deletions(-)

diff --git a/gcc/m2/gm2-gcc/m2type.cc b/gcc/m2/gm2-gcc/m2type.cc
index 571923c08ef6..5773a5cbd190 100644
--- a/gcc/m2/gm2-gcc/m2type.cc
+++ b/gcc/m2/gm2-gcc/m2type.cc
@@ -1415,45 +1415,29 @@ build_m2_char_node (void)
 static tree
 build_m2_short_real_node (void)
 {
-  tree c;
-
-  /* Define `REAL'.  */
-
-  c = make_node (REAL_TYPE);
-  TYPE_PRECISION (c) = FLOAT_TYPE_SIZE;
-  layout_type (c);
-  return c;
+  /* Define `SHORTREAL'.  */
+  ASSERT_CONDITION (TYPE_PRECISION (float_type_node) == FLOAT_TYPE_SIZE);
+  return float_type_node;
 }
 
 static tree
 build_m2_real_node (void)
 {
-  tree c;
-
   /* Define `REAL'.  */
-
-  c = make_node (REAL_TYPE);
-  TYPE_PRECISION (c) = DOUBLE_TYPE_SIZE;
-  layout_type (c);
-  return c;
+  ASSERT_CONDITION (TYPE_PRECISION (double_type_node) == DOUBLE_TYPE_SIZE);  
+  return double_type_node;
 }
 
 static tree
 build_m2_long_real_node (void)
 {
   tree longreal;
-
+  
   /* Define `LONGREAL'.  */
-  if (M2Options_GetIBMLongDouble ())
-{
-  longreal = make_node (REAL_TYPE);
-  TYPE_PRECISION (longreal) = LONG_DOUBLE_TYPE_SIZE;
-}
-  else if (M2Options_GetIEEELongDouble ())
+  if (M2Options_GetIEEELongDouble ())
 longreal = float128_type_node;
   else
 longreal = long_double_type_node;
-  layout_type (longreal);
   return longreal;
 }

[gcc r15-5588] MAINTAINERS: Add myself to write after approval

2024-11-22 Thread Evgeny Karpov via Gcc-cvs

https://gcc.gnu.org/g:8d7f2d53c81970c50a4b9bc592ce360563ae192b

commit r15-5588-g8d7f2d53c81970c50a4b9bc592ce360563ae192b
Author: Evgeny Karpov 
Date:   Fri Nov 22 13:28:21 2024 +0100

MAINTAINERS: Add myself to write after approval

ChangeLog:

* MAINTAINERS: Add myself to write after approval.

Diff:
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 7da332323dcd..e432b2a4da9c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -570,6 +570,7 @@ Kean Johnston   -   

 Phillip Jordan  pmj 
 Tim Josling timjosling  
 Victor Kaplanskyvictork 
+Evgeny Karpov   -   
 Filip Kastl pheeck  
 Geoffrey Keatinggeoffk  
 Brendan Kehoe   -

[gcc r15-5590] OpenMP: Add 'interop' clause to 'dispatch' for C/C++

2024-11-22 Thread Tobias Burnus via Gcc-cvs

https://gcc.gnu.org/g:f34422e06c38eb1f69c301ad5d8e2114c46a2796

commit r15-5590-gf34422e06c38eb1f69c301ad5d8e2114c46a2796
Author: Tobias Burnus 
Date:   Fri Nov 22 16:15:17 2024 +0100

OpenMP: Add 'interop' clause to 'dispatch' for C/C++

Using the clause currently fails with an error because no suitable
'append_args' has been specified, which cannot be avoided as long as
'append_args' is not yet implemented.

gcc/c-family/ChangeLog:

* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_INTEROP.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_clause_interop): New.
(c_parser_omp_clause_name, c_parser_omp_all_clauses,
c_parser_omp_dispatch_body): Handle 'interop' clause.
* c-typeck.cc (c_finish_omp_clauses): Likewise.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_clause_name, cp_parser_omp_all_clauses,
cp_parser_omp_dispatch_body): Handle 'interop' clause.
* pt.cc (tsubst_omp_clauses): Likewise.
* semantics.cc (finish_omp_clauses): Likewise.

gcc/ChangeLog:

* gimplify.cc (gimplify_call_expr): Add initial support for
dispatch's 'interop' clause.
(gimplify_scan_omp_clauses): Handle interop clause.
* tree-pretty-print.cc (dump_omp_clause): Likewise.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_INTEROP.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add interop.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/dispatch-11.c: New test.
* c-c++-common/gomp/dispatch-12.c: New test.

Diff:
---
 gcc/c-family/c-pragma.h   |  1 +
 gcc/c/c-parser.cc | 17 ++
 gcc/c/c-typeck.cc | 23 ++--
 gcc/cp/parser.cc  | 10 
 gcc/cp/pt.cc  |  1 +
 gcc/cp/semantics.cc   | 29 ++---
 gcc/gimplify.cc   | 15 +
 gcc/testsuite/c-c++-common/gomp/dispatch-11.c | 84 +++
 gcc/testsuite/c-c++-common/gomp/dispatch-12.c | 53 +
 gcc/tree-core.h   |  3 +
 gcc/tree-pretty-print.cc  |  6 +-
 gcc/tree.cc   |  2 +
 12 files changed, 228 insertions(+), 16 deletions(-)

diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index c95a602a4756..df5625d5f4f0 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -134,6 +134,7 @@ enum pragma_omp_clause {
   PRAGMA_OMP_CLAUSE_INDIRECT,
   PRAGMA_OMP_CLAUSE_INIT,
   PRAGMA_OMP_CLAUSE_IS_DEVICE_PTR,
+  PRAGMA_OMP_CLAUSE_INTEROP,
   PRAGMA_OMP_CLAUSE_LASTPRIVATE,
   PRAGMA_OMP_CLAUSE_LINEAR,
   PRAGMA_OMP_CLAUSE_LINK,
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index ca4a6b39b276..f3ed61047477 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -15786,6 +15786,8 @@ c_parser_omp_clause_name (c_parser *parser)
result = PRAGMA_OMP_CLAUSE_INIT;
  else if (!strcmp ("is_device_ptr", p))
result = PRAGMA_OMP_CLAUSE_IS_DEVICE_PTR;
+ else if (!strcmp ("interop", p))
+   result = PRAGMA_OMP_CLAUSE_INTEROP;
  break;
case 'l':
  if (!strcmp ("lastprivate", p))
@@ -20569,6 +20571,16 @@ c_parser_omp_clause_use (c_parser *parser, tree list)
   return c_parser_omp_var_list_parens (parser, OMP_CLAUSE_USE, list);
 }
 
+/* OpenMP 6.0:
+   interop ( variable-list ) */
+
+static tree
+c_parser_omp_clause_interop (c_parser *parser, tree list)
+{
+  check_no_duplicate_clause (list, OMP_CLAUSE_INTEROP, "interop");
+  return c_parser_omp_var_list_parens (parser, OMP_CLAUSE_INTEROP, list);
+}
+
 /* Parse all OpenACC clauses.  The set clauses allowed by the directive
is a bitmask in MASK.  Return the list of clauses found.  */
 
@@ -21076,6 +21088,10 @@ c_parser_omp_all_clauses (c_parser *parser, omp_clause_mask mask,
  clauses = c_parser_omp_clause_use (parser, clauses);
  c_name = "use";
  break;
+   case PRAGMA_OMP_CLAUSE_INTEROP:
+ clauses = c_parser_omp_clause_interop (parser, clauses);
+ c_name = "interop";
+ break;
case PRAGMA_OMP_CLAUSE_MAP:
  clauses = c_parser_omp_clause_map (parser, clauses);
  c_name = "map";
@@ -25078,6 +25094,7 @@ c_parser_omp_dispatch_body (c_parser *parser)
    | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEPEND) \
    | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_NOVARIANTS) \
    | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_NOCONTEXT) \
+   | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_INTEROP) \
    | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IS_DEVICE_PTR) \
    | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_NOWAIT))
 
diff --git a/gcc/c/c-typeck.cc b

[gcc r14-10965] Fortran: fix passing of NULL() actual argument to character dummy [PR104819]

2024-11-22 Thread Harald Anlauf via Gcc-cvs

https://gcc.gnu.org/g:9fd8cd1990e0f5bac416bafc8a588d05c735f000

commit r14-10965-g9fd8cd1990e0f5bac416bafc8a588d05c735f000
Author: Harald Anlauf 
Date:   Thu Nov 14 21:38:04 2024 +0100

Fortran: fix passing of NULL() actual argument to character dummy [PR104819]

Ensure that character length is set and passed by the call to a procedure
when its dummy argument is NULL() with MOLD argument present, or set length
to either 0 or the callee's expected character length.  For assumed-rank
dummies, use the rank of the MOLD argument.  Generate temporaries for
passed arguments when needed.

PR fortran/104819

gcc/fortran/ChangeLog:

* trans-expr.cc (conv_null_actual): Helper function to handle
passing of NULL() to non-optional dummy arguments of non-bind(c)
procedures.
(gfc_conv_procedure_call): Use it for character dummies.

gcc/testsuite/ChangeLog:

* gfortran.dg/null_actual_6.f90: New test.

(cherry picked from commit f70c1d517e09c4dde421774a8cec591ca3c479a0)

Diff:
---
 gcc/fortran/trans-expr.cc   |  79 ++
 gcc/testsuite/gfortran.dg/null_actual_6.f90 | 221 
 2 files changed, 300 insertions(+)

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index f182ea2ee1cd..80399fe3c4f7 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6226,6 +6226,76 @@ conv_dummy_value (gfc_se * parmse, gfc_expr * e, gfc_symbol * fsym,
 }
 
 
+/* Helper function for the handling of NULL() actual arguments associated with
+   non-optional dummy variables.  Argument parmse should already be set up.  */
+static void
+conv_null_actual (gfc_se * parmse, gfc_expr * e, gfc_symbol * fsym)
+{
+  gcc_assert (fsym && !fsym->attr.optional);
+
+  /* Obtain the character length for a NULL() actual with a character
+ MOLD argument.  Otherwise substitute a suitable dummy length.
+ Here we handle only non-optional dummies of non-bind(c) procedures.  */
+  if (fsym->ts.type == BT_CHARACTER)
+{
+  if (e->ts.type == BT_CHARACTER
+ && e->symtree->n.sym->ts.type == BT_CHARACTER)
+   {
+ /* MOLD is present.  Substitute a temporary character NULL pointer.
+For an assumed-rank dummy we need a descriptor that passes the
+correct rank.  */
+ if (fsym->as && fsym->as->type == AS_ASSUMED_RANK)
+   {
+ tree rank;
+ tree tmp = parmse->expr;
+ tmp = gfc_conv_scalar_to_descriptor (parmse, tmp, fsym->attr);
+ rank = gfc_conv_descriptor_rank (tmp);
+ gfc_add_modify (&parmse->pre, rank,
+ build_int_cst (TREE_TYPE (rank), e->rank));
+ parmse->expr = gfc_build_addr_expr (NULL_TREE, tmp);
+   }
+ else
+   {
+ tree tmp = gfc_create_var (TREE_TYPE (parmse->expr), "null");
+ gfc_add_modify (&parmse->pre, tmp,
+ build_zero_cst (TREE_TYPE (tmp)));
+ parmse->expr = gfc_build_addr_expr (NULL_TREE, tmp);
+   }
+
+ /* Ensure that a usable length is available.  */
+ if (parmse->string_length == NULL_TREE)
+   {
+ gfc_typespec *ts = &e->symtree->n.sym->ts;
+
+ if (ts->u.cl->length != NULL
+ && ts->u.cl->length->expr_type == EXPR_CONSTANT)
+   gfc_conv_const_charlen (ts->u.cl);
+
+ if (ts->u.cl->backend_decl)
+   parmse->string_length = ts->u.cl->backend_decl;
+   }
+   }
+  else if (e->ts.type == BT_UNKNOWN && parmse->string_length == NULL_TREE)
+   {
+ /* MOLD is not present.  Pass length of associated dummy character
+argument if constant, or zero.  */
+ if (fsym->ts.u.cl->length != NULL
+ && fsym->ts.u.cl->length->expr_type == EXPR_CONSTANT)
+   {
+ gfc_conv_const_charlen (fsym->ts.u.cl);
+ parmse->string_length = fsym->ts.u.cl->backend_decl;
+   }
+ else
+   {
+ parmse->string_length = gfc_create_var (gfc_charlen_type_node,
+ "slen");
+ gfc_add_modify (&parmse->pre, parmse->string_length,
+ build_zero_cst (gfc_charlen_type_node));
+   }
+   }
+}
+}
+
 
 /* Generate code for a procedure call.  Note can return se->post != NULL.
If se->direct_byref is set then se->expr contains the return parameter.
@@ -7375,6 +7445,15 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  gfc_conv_const_charlen (e->symtree->n.sym->ts.u.cl);
  parmse.string_length = e->symtree->n.sym->ts.u.cl->backend_decl;
}
+
+ /* Obtain the character length for a NULL() actual with a character
+MOLD argument.  Otherwise substitut

[gcc r15-5593] c-family: Yet another fix for _BitInt & __sync_* builtins [PR117641]

2024-11-22 Thread Jakub Jelinek via Gcc-cvs

https://gcc.gnu.org/g:44984f7f7523f136085ba60fd107ba8309d4115b

commit r15-5593-g44984f7f7523f136085ba60fd107ba8309d4115b
Author: Jakub Jelinek 
Date:   Fri Nov 22 19:47:52 2024 +0100

c-family: Yet another fix for _BitInt & __sync_* builtins [PR117641]

Sorry, the last patch only partially fixed the __sync_* ICEs with
_BitInt(128) on ia32.
Even for !fetch we need to error out and return 0.  I was afraid of
APIs like __atomic_exchange/__atomic_compare_exchange, those obviously
need to be supported even on _BitInt(128) on ia32, but they actually never
reach sync_resolve_size, they are handled by adding the size argument and using
the library version much earlier.
For fetch && !orig_format (i.e. __atomic_fetch_* etc.) we need to return -1
so that we handle it with a manual __atomic_load +
__atomic_compare_exchange loop in the caller, all other cases should
be rejected.
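
For readers unfamiliar with the fallback mentioned above, this is roughly what a load plus compare-exchange loop looks like when written by hand.  It is a sketch using the generic __atomic_load and __atomic_compare_exchange builtins on a plain unsigned int, not the code the compiler generates for _BitInt(128); the function name is made up:

```c
/* Emulate an atomic fetch-and-add with a load followed by a
   compare-exchange loop.  Returns the value seen before the add.  */
unsigned int
fetch_add_via_cas (unsigned int *p, unsigned int v)
{
  unsigned int old_val, new_val;

  __atomic_load (p, &old_val, __ATOMIC_RELAXED);
  do
    /* On failure the builtin stores the current value of *p into
       old_val, so the new value is recomputed from fresh data.  */
    new_val = old_val + v;
  while (!__atomic_compare_exchange (p, &old_val, &new_val, 0,
                                     __ATOMIC_SEQ_CST, __ATOMIC_RELAXED));
  return old_val;
}
```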

2024-11-22  Jakub Jelinek  

PR c/117641
* c-common.cc (sync_resolve_size): For size 16 with _BitInt
on targets where TImode isn't supported, use goto incompatible if
!fetch.

* gcc.dg/bitint-117.c: New test.
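
The resulting decision for a 16-byte _BitInt on a target without TImode can be summarized as below.  This is a simplified model of that one branch only; the function name is invented and the real sync_resolve_size has more inputs and cases:

```c
#include <stdbool.h>

/* Model of the branch taken for a 16-byte _BitInt when TImode is not
   supported.  Returns -1 when the caller should fall back to a manual
   load + compare-exchange loop, and 0 for the "incompatible" error
   path.  */
int
resolve_bitint16_without_timode (bool fetch, bool orig_format)
{
  if (fetch && !orig_format)
    return -1;
  return 0;
}
```

Before the patch the !fetch case skipped this branch entirely; with the patch it reaches the error path.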

Diff:
---
 gcc/c-family/c-common.cc  |  3 +--
 gcc/testsuite/gcc.dg/bitint-117.c | 13 +
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 367a9b07c872..9254fb6c010a 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -7457,11 +7457,10 @@ sync_resolve_size (tree function, vec *params, bool fetch,
 
   size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
   if (size == 16
-  && fetch
   && TREE_CODE (type) == BITINT_TYPE
   && !targetm.scalar_mode_supported_p (TImode))
 {
-  if (!orig_format)
+  if (fetch && !orig_format)
return -1;
   goto incompatible;
 }
diff --git a/gcc/testsuite/gcc.dg/bitint-117.c b/gcc/testsuite/gcc.dg/bitint-117.c
new file mode 100644
index 000000000000..16a76165d1e8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bitint-117.c
@@ -0,0 +1,13 @@
+/* PR c/117641 */
+/* { dg-do compile { target bitint575 } } */
+/* { dg-options "-std=c23" } */
+
+void
+foo (_BitInt(128) *b)
+{
+  __sync_add_and_fetch (b, 1); /* { dg-error "incompatible" "" { target { ! int128 } } } */
+  __sync_val_compare_and_swap (b, 0, 1);   /* { dg-error "incompatible" "" { target { ! int128 } } } */
+  __sync_bool_compare_and_swap (b, 0, 1);  /* { dg-error "incompatible" "" { target { ! int128 } } } */
+  __sync_lock_test_and_set (b, 1); /* { dg-error "incompatible" "" { target { ! int128 } } } */
+  __sync_lock_release (b); /* { dg-error "incompatible" "" { target { ! int128 } } } */
+}

[gcc r15-5594] match.pd: Fix up the new simplifiers using with_possible_nonzero_bits2 [PR117420]

2024-11-22 Thread Jakub Jelinek via Gcc-cvs

https://gcc.gnu.org/g:c25c172959e7fb424455ee6acc60571c68b72443

commit r15-5594-gc25c172959e7fb424455ee6acc60571c68b72443
Author: Jakub Jelinek 
Date:   Fri Nov 22 19:50:22 2024 +0100

match.pd: Fix up the new simplifiers using with_possible_nonzero_bits2 [PR117420]

The following testcase shows wrong-code caused by incorrect use
of with_possible_nonzero_bits2.
That matcher is defined as
/* Slightly extended version, do not make it recursive to keep it cheap.  */
(match (with_possible_nonzero_bits2 @0)
 with_possible_nonzero_bits@0)
(match (with_possible_nonzero_bits2 @0)
 (bit_and:c with_possible_nonzero_bits@0 @2))
and because with_possible_nonzero_bits includes the SSA_NAME case with
integral/pointer argument, both forms can actually match when a SSA_NAME
with integral/pointer type has a def stmt which is BIT_AND_EXPR
assignment with say SSA_NAME with integral/pointer type as one of its
operands (or INTEGER_CST, another with_possible_nonzero_bits case).
And in match.pd the latter actually wins if both match and so when using
(with_possible_nonzero_bits2 @0) the @0 will actually be one of the
BIT_AND_EXPR operands if that form is matched.

Now, with_possible_nonzero_bits2 and with_certain_nonzero_bits2 were added
for the
/* X == C (or X & Z == Y | C) is impossible if ~nonzero(X) & C != 0.  */
(for cmp (eq ne)
 (simplify
  (cmp:c (with_possible_nonzero_bits2 @0) (with_certain_nonzero_bits2 @1))
  (if (wi::bit_and_not (wi::to_wide (@1), get_nonzero_bits (@0)) != 0)
   { constant_boolean_node (cmp == NE_EXPR, type); })))
simplifier, but even for that one I think they do not do a good job: they
might actually pessimize stuff rather than optimize it, but at least that does
not result in wrong-code, because the operands are solely tested with
wi::to_wide or get_nonzero_bits, but not actually used in the
simplification.  The reason why it can pessimize stuff is say if we have
  # RANGE [irange] int ... MASK 0xb VALUE 0x0
  x_1 = ...;
  # RANGE [irange] int ... MASK 0x8 VALUE 0x0
  _2 = x_1 & 0xc;
  _3 = _2 == 2;
then if it used just with_possible_nonzero_bits@0, @0 would have
get_nonzero_bits (@0) 0x8 and (2 & ~8) != 0, so we can fold it into
  _3 = 0;
But as it uses (with_possible_nonzero_bits2 @0), @0 is x_1 rather
than _2 and get_nonzero_bits (@0) is unnecessarily conservative,
0xb rather than 0x8 and (2 & ~0xb) == 0, so we don't optimize.
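
The arithmetic in this example can be checked with a few lines of C.  This is a sketch of the test the simplifier performs, with plain unsigned values standing in for wide ints; the function name is made up:

```c
#include <stdbool.h>

/* X == C is impossible if C has a bit set that X can never have,
   i.e. if C & ~nonzero(X) != 0.  */
bool
eq_is_impossible (unsigned int c, unsigned int possible_nonzero_bits)
{
  return (c & ~possible_nonzero_bits) != 0;
}
```

With the nonzero bits of _2 (0x8) the comparison against 2 is proven impossible; with the more conservative nonzero bits of x_1 (0xb) it is not, so nothing is folded.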
Now, with_possible_nonzero_bits2 can actually improve stuff as well in that
pattern, if say value ranges aren't fully computed yet or the BIT_AND_EXPR
assignment has been added later and the lhs doesn't have range computed yet,
get_nonzero_bits on the BIT_AND_EXPR lhs will be all bits set, while
on the BIT_AND_EXPR operand might actually succeed.

I believe better would be to either modify get_nonzero_bits so that it
special cases the SSA_NAME with BIT_AND_EXPR def_stmt (but one level
deep only like with_possible_nonzero_bits2, no recursion), in that case
return bitwise and of get_nonzero_bits (non-recursive) for the lhs and
both operands, and possibly handle BIT_AND_EXPR itself, e.g. during GENERIC
matching, by returning bitwise and of both operands.
Then with_possible_nonzero_bits2 could be needed for the GENERIC case,
perhaps have the second match #if GENERIC, but changed so that the @N
operand always is the whole thing rather than its operand which is
error-prone.  Or add get_nonzero_bits wrapper with a different name
which would do that.

with_certain_nonzero_bits2 could be changed similarly, these days
we can test known non-zero bits rather than possible non-zero bits on
SSA_NAMEs too, we record both mask and value, so possible nonzero bits
(aka. get_nonzero_bits) is mask () | value (), while known nonzero bits
is value () & ~mask (), with a new function (get_known_nonzero_bits
or get_certain_nonzero_bits etc.) which handles that.
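
The two formulas can be written down directly.  This is a sketch with an invented struct and invented function names; in GCC the mask and value come from the recorded bit information of an SSA_NAME.  A set mask bit means the bit is unknown, and value holds the known bits:

```c
/* Toy stand-in for the recorded mask/value pair.  */
struct bit_info
{
  unsigned int mask;   /* Bits whose value is unknown.  */
  unsigned int value;  /* Values of the known bits.  */
};

/* Bits that may be nonzero: unknown bits plus bits known to be 1.  */
unsigned int
possible_nonzero_bits (struct bit_info b)
{
  return b.mask | b.value;
}

/* Bits that are certainly nonzero: bits known to be 1.  */
unsigned int
known_nonzero_bits (struct bit_info b)
{
  return b.value & ~b.mask;
}
```

For MASK 0xb VALUE 0x0 as in the dump earlier in this message, the possible nonzero bits are 0xb and no bit is known to be nonzero.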

Anyway, the following patch doesn't do what I wrote above just yet,
for that single pattern it is just a missed optimization.
But the with_possible_nonzero_bits2 uses in the 3 new simplifiers are
just completely incorrect, because they don't just use the @0 operand
in get_nonzero_bits (pessimizing stuff if value ranges are fully computed),
but also use it in the replacement, then they act as if the BIT_AND_EXPR
wasn't there at all.
While we could use (with_possible_nonzero_bits2@3 @0) and use
get_nonzero_bits (@0) and use @3 in the replacement, that would still
often be a pessimization, so I've just used with_possible_nonzero_bits@0.

2024-11-22  Jakub Jelinek  

PR tree-optimization/117420
* match.pd ((X >> C1) << (C1 + C2) -> X << C2,
(X >> C1) * (C2 << C1) -> X * C2, X / (1 << C) -> X /[ex] (1

[gcc r14-10967] [PATCH] PR modula2/115540 gcc/m2/mc-boot-ch/Gtermios.cc error return-statement with a value

2024-11-22 Thread Gaius Mulley via Gcc-cvs

https://gcc.gnu.org/g:8701cdbf8df3f746df85882878beb8e8f897b014

commit r14-10967-g8701cdbf8df3f746df85882878beb8e8f897b014
Author: Gaius Mulley 
Date:   Fri Nov 22 19:46:44 2024 +

[PATCH] PR modula2/115540 gcc/m2/mc-boot-ch/Gtermios.cc error return-statement with a value

This patch fixes three occurrences of cfmakeraw use in the hand built
m2 support libraries which incorrectly attempt to return a void
result.

gcc/m2/ChangeLog:

PR modula2/115540
* gm2-libs-ch/termios.c (cfmakeraw): Remove return.
* mc-boot-ch/Gtermios.cc (cfmakeraw): Remove return.
* pge-boot/Gtermios.cc (cfmakeraw): Remove return.

(cherry picked from commit d16355c72c7f7b54ecf06371d14d7ad309ea4c34)

Signed-off-by: Gaius Mulley 

Diff:
---
 gcc/m2/gm2-libs-ch/termios.c  | 2 +-
 gcc/m2/mc-boot-ch/Gtermios.cc | 2 +-
 gcc/m2/pge-boot/Gtermios.cc   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/m2/gm2-libs-ch/termios.c b/gcc/m2/gm2-libs-ch/termios.c
index 472a4c022e80..fe7403b3dee3 100644
--- a/gcc/m2/gm2-libs-ch/termios.c
+++ b/gcc/m2/gm2-libs-ch/termios.c
@@ -281,7 +281,7 @@ int EXPORT (tcsetattr) (int fd, int option, struct termios *t)
 void EXPORT (cfmakeraw) (struct termios *t)
 {
 #if defined(HAVE_CFMAKERAW)
-  return cfmakeraw (t);
+  cfmakeraw (t);
 #endif
 }
 
diff --git a/gcc/m2/mc-boot-ch/Gtermios.cc b/gcc/m2/mc-boot-ch/Gtermios.cc
index a11065a67257..0ef5c8ba8039 100644
--- a/gcc/m2/mc-boot-ch/Gtermios.cc
+++ b/gcc/m2/mc-boot-ch/Gtermios.cc
@@ -289,7 +289,7 @@ void
 EXPORT (cfmakeraw) (struct termios *t)
 {
 #if defined(HAVE_CFMAKERAW)
-  return cfmakeraw (t);
+  cfmakeraw (t);
 #endif
 }
 
diff --git a/gcc/m2/pge-boot/Gtermios.cc b/gcc/m2/pge-boot/Gtermios.cc
index 4f3557619db1..5f966403b197 100644
--- a/gcc/m2/pge-boot/Gtermios.cc
+++ b/gcc/m2/pge-boot/Gtermios.cc
@@ -289,7 +289,7 @@ void
 EXPORT (cfmakeraw) (struct termios *t)
 {
 #if defined(HAVE_CFMAKERAW)
-  return cfmakeraw (t);
+  cfmakeraw (t);
 #endif
 }

[gcc(refs/users/aoliva/heads/testme)] ifcombine: skip fallback conjunction on noncontiguous blocks

2024-11-22 Thread Alexandre Oliva via Gcc-cvs

https://gcc.gnu.org/g:887c27b6da2fbb04cbe20a85378366819aabbd26

commit 887c27b6da2fbb04cbe20a85378366819aabbd26
Author: Alexandre Oliva 
Date:   Thu Nov 21 22:40:45 2024 -0300

ifcombine: skip fallback conjunction on noncontiguous blocks

When everything else fails, if enabled by the target or by a
parameter, and when other requirements are satisfied, ifcombine
generates an AND of both conditions.

That may be good for contiguous conditions, but it's unlikely to be an
optimization when the blocks are separate.

Add contiguity to the set of requirements for this fallback
transformation.
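
The contiguity requirement can be pictured with a toy CFG.  This is a
simplified model for illustration only; struct toy_bb and
toy_blocks_are_neighbors are hypothetical and do not correspond to GCC's
basic_block or single_pred:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified basic block: predecessor edges only.  */
struct toy_bb
{
  const struct toy_bb *preds[4];
  size_t n_preds;
};

/* True when OUTER is the one and only predecessor of INNER, i.e. the two
   condition blocks are neighbors.  */
bool
toy_blocks_are_neighbors (const struct toy_bb *inner,
			  const struct toy_bb *outer)
{
  return inner->n_preds == 1 && inner->preds[0] == outer;
}
```

In this model the fallback AND would only be attempted when
toy_blocks_are_neighbors holds for the inner and outer condition blocks.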


for  gcc/ChangeLog

* tree-ssa-ifcombine.cc (ifcombine_ifandif): Avoid fallback
conjunction of noncontiguous conditions.

Diff:
---
 gcc/tree-ssa-ifcombine.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 9b9dc10cd220..51f37f15a9ef 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -974,6 +974,10 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv,
gimple_cond_rhs (outer_cond),
gimple_bb (outer_cond
{
+ /* Only combine conditions in this fallback case if the blocks are
+neighbors.  */
+ if (single_pred (inner_cond_bb) != outer_cond_bb)
+   return false;
  tree t1, t2;
  bool logical_op_non_short_circuit = LOGICAL_OP_NON_SHORT_CIRCUIT;
  if (param_logical_op_non_short_circuit != -1)

[gcc/aoliva/heads/testme] (2 commits) fold fold_truth_andor field merging into ifcombine

2024-11-22 Thread Alexandre Oliva via Gcc-cvs

The branch 'aoliva/heads/testme' was updated to point to:

 9e160528951b... fold fold_truth_andor field merging into ifcombine

It previously pointed to:

 1a3d4dae5b44... rework locations in fold_truth_andof_for_ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  1a3d4da... rework locations in fold_truth_andof_for_ifcombine
  4e59fe1... drop decode_field_reference subroutines
  b920228... switch to wide_int for masks and constants
  3b4493a... pass NULL separatep in adjacent blocks
  2beb301... drop expensive mergeable tests in favor of gimple_vuse comp
  f98f69a... do not assume andor code
  f1f0eb6... fold_truth_andor: test narrowing conversions
  c3312f3... fold_truth_andor: drop known-result warnings
  3f8b088... fold_truth_andor: use pattern matching
  d69145a... fold fold_truth_andor field merging into ifcombine
  3016f93... skip fallback disjunction on noncontiguous ifcombine


Summary of changes (added commits):
---

  9e16052... fold fold_truth_andor field merging into ifcombine
  887c27b... ifcombine: skip fallback conjunction on noncontiguous block

[gcc(refs/users/aoliva/heads/testme)] fold fold_truth_andor field merging into ifcombine

2024-11-22 Thread Alexandre Oliva via Gcc-cvs

https://gcc.gnu.org/g:9e160528951b2ad3ef7e6b3d803d74d6caca74a0

commit 9e160528951b2ad3ef7e6b3d803d74d6caca74a0
Author: Alexandre Oliva 
Date:   Thu Nov 21 22:36:34 2024 -0300

fold fold_truth_andor field merging into ifcombine

This patch introduces various improvements to the logic that merges
field compares, moving it into ifcombine.

Before the patch, we could merge:

  (a.x1 EQNE b.x1)  ANDOR  (a.y1 EQNE b.y1)

into something like:

  (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

if both of A's fields live within the same alignment boundaries, and
so do B's, at the same relative positions.  Constants may be used
instead of the object B.
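
As an illustration of the transformation described above, here is a
hand-written sketch of what the merged compare computes.  The struct layout
and all names are made up for the example; the compiler works on its internal
representation, not on source like this:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: x1 and y1 live in the same 32-bit word.  */
struct toy_fields
{
  uint8_t x1;
  uint8_t other;
  uint8_t y1;
  uint8_t tail;
};

_Static_assert (sizeof (struct toy_fields) == sizeof (uint32_t),
		"the example assumes the fields fill exactly one word");

/* The original form: two separate field compares.  */
bool
toy_equal_separately (const struct toy_fields *a, const struct toy_fields *b)
{
  return a->x1 == b->x1 && a->y1 == b->y1;
}

/* Load the whole object as one word.  */
static uint32_t
toy_load_word (const struct toy_fields *p)
{
  uint32_t w;
  memcpy (&w, p, sizeof w);
  return w;
}

/* The merged form: one masked word compare.  The mask is derived from an
   object with all bits set in x1 and y1, so it does not depend on
   endianness.  */
bool
toy_equal_merged (const struct toy_fields *a, const struct toy_fields *b)
{
  const struct toy_fields ones = { 0xff, 0, 0xff, 0 };
  uint32_t mask = toy_load_word (&ones);
  return (toy_load_word (a) & mask) == (toy_load_word (b) & mask);
}
```

Both functions agree for all inputs; the second one needs a single load per
object instead of two.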

The initial goal of this patch was to enable such combinations when a
field crossed alignment boundaries, e.g. for packed types.  We can't
generally access such fields with a single memory access, so when we
come across such a compare, we will attempt to combine each access
separately.

Some merging opportunities were missed because of right-shifts,
compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
narrowing conversions, especially after earlier merges.  This patch
introduces handlers for several cases involving these.

The merging of multiple field accesses into wider bitfield-like
accesses is undesirable to do too early in compilation, so we move it
from folding to ifcombine, and extend ifcombine to merge noncontiguous
compares, absent intervening side effects.  VUSEs used to prevent
ifcombine; that seemed excessively conservative, since relevant side
effects were already tested, including the possibility of trapping
loads, so that's removed.

Unlike earlier ifcombine, when merging noncontiguous compares the
merged compare must replace the earliest compare, which may require
moving up the DEFs that contributed to the latter compare.

When it is the second of a noncontiguous pair of compares that first
accesses a word, we may merge the first compare with part of the
second compare that refers to the same word, keeping the compare of
the remaining bits at the spot where the second compare used to be.

Handling compares with non-constant fields was somewhat generalized
from what fold used to do, now handling non-adjacent fields, even if a
field of one object crosses an alignment boundary but the other
doesn't.


The -Wno-error for toplev.o on rs6000 is because of toplev.c's:

  if ((flag_sanitize & SANITIZE_ADDRESS)
  && !FRAME_GROWS_DOWNWARD)

and rs6000.h's:

#define FRAME_GROWS_DOWNWARD (flag_stack_protect != 0   \
  || (flag_sanitize & SANITIZE_ADDRESS) != 0)

The mutually exclusive conditions involving flag_sanitize are now
noticed and reported by ifcombine's warning on mutually exclusive
compares.  i386's needs -Wno-error for insn-attrtab.o for similar
reasons.
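
To see why the two conditions are mutually exclusive, the quoted snippets can
be reduced to the sketch below.  TOY_SANITIZE_ADDRESS and the function names
are placeholders for this illustration, and the flags are passed as
parameters instead of being globals:

```c
#include <stdbool.h>

#define TOY_SANITIZE_ADDRESS 0x1u	/* Placeholder bit value.  */

/* Mirrors the shape of the FRAME_GROWS_DOWNWARD definition quoted above.  */
bool
toy_frame_grows_downward (unsigned flag_sanitize, int flag_stack_protect)
{
  return flag_stack_protect != 0
	 || (flag_sanitize & TOY_SANITIZE_ADDRESS) != 0;
}

/* Mirrors the shape of the toplev test quoted above.  */
bool
toy_toplev_condition (unsigned flag_sanitize, int flag_stack_protect)
{
  return (flag_sanitize & TOY_SANITIZE_ADDRESS)
	 && !toy_frame_grows_downward (flag_sanitize, flag_stack_protect);
}
```

Whenever the address bit is set, toy_frame_grows_downward is true, so
toy_toplev_condition can never be true.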


for  gcc/ChangeLog

* fold-const.cc (make_bit_field): Export.
(unextend, all_ones_mask_p): Drop.
(decode_field_reference, fold_truth_andor_1): Move
field compare merging logic...
* gimple-fold.cc: (fold_truth_andor_for_ifcombine) ... here.
(compute_split_boundary_from_align): New.
(make_bit_field_load, build_split_load): New.
(reuse_split_load): New.
* fold-const.h: (make_bit_field_ref): Declare
(fold_truth_andor_maybe_separate): Declare.
* match.pd (any_convert, bit_and_cst, rshift_cst): New.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Try
fold_truth_andor_for_ifcombine.

for  gcc/testsuite/ChangeLog

* gcc.dg/field-merge-1.c: New.
* gcc.dg/field-merge-2.c: New.
* gcc.dg/field-merge-3.c: New.
* gcc.dg/field-merge-4.c: New.
* gcc.dg/field-merge-5.c: New.
* gcc.dg/field-merge-6.c: New.
* gcc.dg/field-merge-7.c: New.
* gcc.dg/field-merge-8.c: New.
* gcc.dg/field-merge-9.c: New.
* gcc.dg/field-merge-10.c: New.
* gcc.dg/field-merge-11.c: New.

Diff:
---
 gcc/fold-const.cc |  512 +--
 gcc/fold-const.h  |   10 +
 gcc/gimple-fold.cc| 1107 +
 gcc/match.pd  |   11 +
 gcc/testsuite/gcc.dg/field-merge-1.c  |   64 ++
 gcc/testsuite/gcc.dg/field-merge-10.c |   36 ++
 gcc/testsuite/gcc.dg/field-merge-11.c |   32 +
 gcc/testsuite/gcc.dg/field-merge-2.c  |   31 +
 gcc/testsuite/gcc.dg/field-merge-3.c  |   36 ++
 gcc/testsuite/gcc.dg/field-merge-4.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-5.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-6.c  |   26 +
 gc

[gcc r15-5596] AVR: Use Var(avropt_xxx) for option variables in avr.opt.

2024-11-22 Thread Georg-Johann Lay via Gcc-cvs

https://gcc.gnu.org/g:5f95136e5efba13d9caf7e4fa3a57e1aaa136aa4

commit r15-5596-g5f95136e5efba13d9caf7e4fa3a57e1aaa136aa4
Author: Georg-Johann Lay 
Date:   Thu Nov 21 17:41:17 2024 +0100

AVR: Use Var(avropt_xxx) for option variables in avr.opt.

This is a no-op refactoring that uses a prefix of avropt_
(formerly: avr_) for variables defined via Var() directives
in avr.opt.  This makes it easier to spot values that come directly
from avr.opt in the rest of the backend.

gcc/
* config/avr/avr.opt (avr_bits_e, avr_lra_p, avr_mmcu)
(avr_gasisr_prologues, avr_n_flash, avr_log_details)
(avr_branch_cost, avr_split_bit_shift, avr_strict_X)
(avr_flmap, avr_rodata_in_ram, avr_sp8, avr_fuse_add)
(avr_warn_addr_space_convert, avr_warn_misspelled_isr)
(avr_fuse_move, avr_double, avr_long_double): Rename
to respectively: avropt_bits_e, avropt_lra_p, avropt_mmcu,
avropt_gasisr_prologues, avropt_n_flash, avropt_log_details,
avropt_branch_cost, avropt_split_bit_shift, avropt_strict_X,
avropt_flmap, avropt_rodata_in_ram, avropt_sp8, avropt_fuse_add,
avropt_warn_addr_space_convert, avropt_warn_misspelled_isr,
avropt_fuse_move, avropt_double, avropt_long_double.
* config/avr/avr.h: Same.
* config/avr/avr.cc: Same.
* config/avr/avr.md: Same.
* config/avr/avr-passes.cc: Same.
* config/avr/avr-log.cc: Same.
* common/config/avr/avr-common.cc: Same.

Diff:
---
 gcc/common/config/avr/avr-common.cc |  8 +++---
 gcc/config/avr/avr-log.cc   |  8 +++---
 gcc/config/avr/avr-passes.cc| 30 +++---
 gcc/config/avr/avr.cc   | 50 ++---
 gcc/config/avr/avr.h|  6 ++---
 gcc/config/avr/avr.md   | 10 
 gcc/config/avr/avr.opt  | 40 ++---
 7 files changed, 76 insertions(+), 76 deletions(-)

diff --git a/gcc/common/config/avr/avr-common.cc b/gcc/common/config/avr/avr-common.cc
index 54c99bd0b4af..c8f40fc2367a 100644
--- a/gcc/common/config/avr/avr-common.cc
+++ b/gcc/common/config/avr/avr-common.cc
@@ -91,7 +91,7 @@ avr_handle_option (struct gcc_options *opts, struct gcc_options*,
   error_at (loc, "option %<-mdouble=64%> is only available if "
 "configured %<--with-double={64|64,32|32,64}%>");
 #endif
-  opts->x_avr_long_double = 64;
+  opts->x_avropt_long_double = 64;
 }
   else if (value == 32)
 {
@@ -104,7 +104,7 @@ avr_handle_option (struct gcc_options *opts, struct gcc_options*,
 gcc_unreachable();
 
 #if defined (HAVE_LONG_DOUBLE_IS_DOUBLE)
-  opts->x_avr_long_double = value;
+  opts->x_avropt_long_double = value;
 #endif
   break; // -mdouble=
 
@@ -126,13 +126,13 @@ avr_handle_option (struct gcc_options *opts, struct gcc_options*,
 "or %<--with-long-double=double%> together with "
 "%<--with-double={32|32,64|64,32}%>");
 #endif
-  opts->x_avr_double = 32;
+  opts->x_avropt_double = 32;
 }
   else
 gcc_unreachable();
 
 #if defined (HAVE_LONG_DOUBLE_IS_DOUBLE)
-  opts->x_avr_double = value;
+  opts->x_avropt_double = value;
 #endif
   break; // -mlong-double=
 }
diff --git a/gcc/config/avr/avr-log.cc b/gcc/config/avr/avr-log.cc
index 6f567d845e1d..c48b0fbb6a62 100644
--- a/gcc/config/avr/avr-log.cc
+++ b/gcc/config/avr/avr-log.cc
@@ -342,17 +342,17 @@ avr_log_set_avr_log (void)
   bool all = TARGET_ALL_DEBUG != 0;
 
   if (all)
-avr_log_details = "all";
+avropt_log_details = "all";
 
-  if (all || avr_log_details)
+  if (all || avropt_log_details)
 {
   /* Adding , at beginning and end of string makes searching easier.  */
 
-  char *str = (char*) alloca (3 + strlen (avr_log_details));
+  char *str = (char*) alloca (3 + strlen (avropt_log_details));
   bool info;
 
   str[0] = ',';
-  strcat (stpcpy (str+1, avr_log_details), ",");
+  strcat (stpcpy (str+1, avropt_log_details), ",");
 
   all |= strstr (str, ",all,") != NULL;
   info = strstr (str, ",?,") != NULL;
diff --git a/gcc/config/avr/avr-passes.cc b/gcc/config/avr/avr-passes.cc
index b854f186a7ac..57c3fed1e410 100644
--- a/gcc/config/avr/avr-passes.cc
+++ b/gcc/config/avr/avr-passes.cc
@@ -1227,7 +1227,7 @@ public:
 
   unsigned int execute (function *func) final override
   {
-if (optimize > 0 && avr_fuse_move > 0)
+if (optimize > 0 && avropt_fuse_move > 0)
   {
df_note_add_problem ();
df_analyze ();
@@ -3181,13 +3181,13 @@ bbinfo_t::optimize_one_function (function *func)
   // use arith 1 1 1 1  1 1 1 1  3
 
   // Which optimization(s) to perform.
-  bbinfo_t::try_fuse_p = avr_fuse_move & 0x1;  // Digit 0

[gcc r15-5600] AVR: Tabify avr-common.cc according to coding rules.

2024-11-22 Thread Georg-Johann Lay via Gcc-cvs

https://gcc.gnu.org/g:982d10b74b50f28fd5dbd63876b685f484a6fec2

commit r15-5600-g982d10b74b50f28fd5dbd63876b685f484a6fec2
Author: Georg-Johann Lay 
Date:   Fri Nov 22 21:51:10 2024 +0100

AVR: Tabify avr-common.cc according to coding rules.

gcc/
* common/config/avr/avr-common.cc: Tabify.

Diff:
---
 gcc/common/config/avr/avr-common.cc | 52 ++---
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/gcc/common/config/avr/avr-common.cc b/gcc/common/config/avr/avr-common.cc
index c8f40fc2367a..2ff8cbf2cbc8 100644
--- a/gcc/common/config/avr/avr-common.cc
+++ b/gcc/common/config/avr/avr-common.cc
@@ -77,8 +77,8 @@ static const struct default_options avr_option_optimization_table[] =
 
 static bool
 avr_handle_option (struct gcc_options *opts, struct gcc_options*,
-   const struct cl_decoded_option *decoded,
-   location_t loc ATTRIBUTE_UNUSED)
+  const struct cl_decoded_option *decoded,
+  location_t loc ATTRIBUTE_UNUSED)
 {
   int value = decoded->value;
 
@@ -86,22 +86,22 @@ avr_handle_option (struct gcc_options *opts, struct gcc_options*,
 {
 case OPT_mdouble_:
   if (value == 64)
-{
+   {
 #if !defined (HAVE_DOUBLE64)
-  error_at (loc, "option %<-mdouble=64%> is only available if "
-"configured %<--with-double={64|64,32|32,64}%>");
+ error_at (loc, "option %<-mdouble=64%> is only available if "
+   "configured %<--with-double={64|64,32|32,64}%>");
 #endif
-  opts->x_avropt_long_double = 64;
-}
+ opts->x_avropt_long_double = 64;
+   }
   else if (value == 32)
-{
+   {
 #if !defined (HAVE_DOUBLE32)
-  error_at (loc, "option %<-mdouble=32%> is only available if "
-"configured %<--with-double={32|32,64|64,32}%>");
+ error_at (loc, "option %<-mdouble=32%> is only available if "
+   "configured %<--with-double={32|32,64|64,32}%>");
 #endif
-}
+   }
   else
-gcc_unreachable();
+   gcc_unreachable();
 
 #if defined (HAVE_LONG_DOUBLE_IS_DOUBLE)
   opts->x_avropt_long_double = value;
@@ -110,26 +110,26 @@ avr_handle_option (struct gcc_options *opts, struct gcc_options*,
 
 case OPT_mlong_double_:
   if (value == 64)
-{
+   {
 #if !defined (HAVE_LONG_DOUBLE64)
-  error_at (loc, "option %<-mlong-double=64%> is only available if "
-"configured %<--with-long-double={64|64,32|32,64}%>, "
-"or %<--with-long-double=double%> together with "
-"%<--with-double={64|64,32|32,64}%>");
+ error_at (loc, "option %<-mlong-double=64%> is only available if "
+   "configured %<--with-long-double={64|64,32|32,64}%>, "
+   "or %<--with-long-double=double%> together with "
+   "%<--with-double={64|64,32|32,64}%>");
 #endif
-}
+   }
   else if (value == 32)
-{
+   {
 #if !defined (HAVE_LONG_DOUBLE32)
-  error_at (loc, "option %<-mlong-double=32%> is only available if "
-"configured %<--with-long-double={32|32,64|64,32}%>, "
-"or %<--with-long-double=double%> together with "
-"%<--with-double={32|32,64|64,32}%>");
+ error_at (loc, "option %<-mlong-double=32%> is only available if "
+   "configured %<--with-long-double={32|32,64|64,32}%>, "
+   "or %<--with-long-double=double%> together with "
+   "%<--with-double={32|32,64|64,32}%>");
 #endif
-  opts->x_avropt_double = 32;
-}
+ opts->x_avropt_double = 32;
+   }
   else
-gcc_unreachable();
+   gcc_unreachable();
 
 #if defined (HAVE_LONG_DOUBLE_IS_DOUBLE)
   opts->x_avropt_double = value;

[gcc r15-5601] test-art: Fix comment in types.h

2024-11-22 Thread Andrew Pinski via Gcc-cvs

https://gcc.gnu.org/g:76c202329458aad027ececc59d666e4995e3644e

commit r15-5601-g76c202329458aad027ececc59d666e4995e3644e
Author: Andrew Pinski 
Date:   Fri Nov 22 09:25:41 2024 -0800

test-art: Fix comment in types.h

The comment references INCLUDE_MEMORY but the code actually
checks INCLUDE_VECTOR. So fix up the comment to mention
INCLUDE_VECTOR.

Pushed as obvious.

gcc/ChangeLog:

* text-art/types.h: Fix comment.

Signed-off-by: Andrew Pinski 

Diff:
---
 gcc/text-art/types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/text-art/types.h b/gcc/text-art/types.h
index 2b9f8b387c71..c741f77c25fa 100644
--- a/gcc/text-art/types.h
+++ b/gcc/text-art/types.h
@@ -23,7 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 
 /* This header uses std::vector, but <vector> can't be directly
included due to issues with macros.  Hence it must be included from
-   system.h by defining INCLUDE_MEMORY in any source file using it.  */
+   system.h by defining INCLUDE_VECTOR in any source file using it.  */
 
 #ifndef INCLUDE_VECTOR
 # error "You must define INCLUDE_VECTOR before including system.h to use text-art/types.h"

[gcc r14-10968] [PATCH] modula2: tidyup remove unused procedures and unused parameters

2024-11-22 Thread Gaius Mulley via Gcc-cvs

https://gcc.gnu.org/g:ed9129c27d5b7c0ba78725b85f1a0a79d9c21a22

commit r14-10968-ged9129c27d5b7c0ba78725b85f1a0a79d9c21a22
Author: Gaius Mulley 
Date:   Fri Nov 22 21:09:10 2024 +

[PATCH] modula2: tidyup remove unused procedures and unused parameters

This patch removes M2GenGCC.mod:QuadCondition and
M2Quads.mod:GenQuadOTypeUniquetok.  It also removes unused parameter
WalkAction for all FoldIf procedures.

gcc/m2/ChangeLog:

* gm2-compiler/M2GenGCC.mod (QuadCondition): Remove.
(FoldIfEqu): Remove WalkAction parameter.
(FoldIfNotEqu): Ditto.
(FoldIfGreEqu): Ditto.
(FoldIfLessEqu): Ditto.
(FoldIfGre): Ditto.
(FoldIfLess): Ditto.
(FoldIfIn): Ditto.
(FoldIfNotIn): Ditto.
* gm2-compiler/M2Quads.mod (GenQuadOTypeUniquetok): Remove.

(cherry picked from commit 038144534a683d4248c9b98a7110a59b305f124a)

Signed-off-by: Gaius Mulley 

Diff:
---
 gcc/m2/gm2-compiler/M2GenGCC.mod | 77 +---
 gcc/m2/gm2-compiler/M2Quads.mod  | 52 ---
 2 files changed, 16 insertions(+), 113 deletions(-)

diff --git a/gcc/m2/gm2-compiler/M2GenGCC.mod b/gcc/m2/gm2-compiler/M2GenGCC.mod
index c59695a37795..fc3fa204ac0a 100644
--- a/gcc/m2/gm2-compiler/M2GenGCC.mod
+++ b/gcc/m2/gm2-compiler/M2GenGCC.mod
@@ -637,14 +637,14 @@ BEGIN
  CastOp : FoldCast (tokenno, p, quad, op1, op2, op3) |
  InclOp : FoldIncl (tokenno, p, quad, op1, op3) |
  ExclOp : FoldExcl (tokenno, p, quad, op1, op3) |
- IfEquOp: FoldIfEqu (tokenno, p, quad, op1, op2, op3) |
- IfNotEquOp : FoldIfNotEqu (tokenno, p, quad, op1, op2, op3) |
- IfLessOp   : FoldIfLess (tokenno, p, quad, op1, op2, op3) |
- IfLessEquOp: FoldIfLessEqu (tokenno, p, quad, op1, op2, op3) |
- IfGreOp: FoldIfGre (tokenno, p, quad, op1, op2, op3) |
- IfGreEquOp : FoldIfGreEqu (tokenno, p, quad, op1, op2, op3) |
- IfInOp : FoldIfIn (tokenno, p, quad, op1, op2, op3) |
- IfNotInOp  : FoldIfNotIn (tokenno, p, quad, op1, op2, op3) |
+ IfEquOp: FoldIfEqu (tokenno, quad, op1, op2, op3) |
+ IfNotEquOp : FoldIfNotEqu (tokenno, quad, op1, op2, op3) |
+ IfLessOp   : FoldIfLess (tokenno, quad, op1, op2, op3) |
+ IfLessEquOp: FoldIfLessEqu (tokenno, quad, op1, op2, op3) |
+ IfGreOp: FoldIfGre (tokenno, quad, op1, op2, op3) |
+ IfGreEquOp : FoldIfGreEqu (tokenno, quad, op1, op2, op3) |
+ IfInOp : FoldIfIn (tokenno, quad, op1, op2, op3) |
+ IfNotInOp  : FoldIfNotIn (tokenno, quad, op1, op2, op3) |
  LogicalShiftOp : FoldSetShift(tokenno, p, quad, op1, op2, op3) |
  LogicalRotateOp: FoldSetRotate (tokenno, p, quad, op1, op2, op3) |
 ParamOp: FoldBuiltinFunction (tokenno, p, quad, op1, op2, op3) |
@@ -5332,7 +5332,7 @@ END FoldIncl ;
 if op1 < op2 then goto op3.
 *)
 
-PROCEDURE FoldIfLess (tokenno: CARDINAL; p: WalkAction;
+PROCEDURE FoldIfLess (tokenno: CARDINAL;
   quad: CARDINAL; left, right, destQuad: CARDINAL) ;
 BEGIN
(* Firstly ensure that constant literals are declared.  *)
@@ -5357,57 +5357,12 @@ BEGIN
 END FoldIfLess ;
 
 
-(*
-   QuadCondition - Pre-condition:  left, right operands are constants
-   which have been resolved.
-   Post-condition: return TRUE if the condition at
-   quad is TRUE.
-*)
-
-PROCEDURE QuadCondition (quad: CARDINAL) : BOOLEAN ;
-VAR
-   left, right, dest, combined,
-   leftpos, rightpos, destpos : CARDINAL ;
-   constExpr, overflow: BOOLEAN ;
-   op : QuadOperator ;
-BEGIN
-   GetQuadOtok (quad, combined, op,
-left, right, dest, overflow,
-constExpr,
-leftpos, rightpos, destpos) ;
-   CASE op OF
-
-   IfInOp :  PushValue (right) ;
- RETURN SetIn (left, combined) |
-   IfNotInOp  :  PushValue (right) ;
- RETURN NOT SetIn (left, combined)
-
-   ELSE
-   END ;
-   PushValue (left) ;
-   PushValue (right) ;
-   CASE op OF
-
-   IfGreOp:  RETURN Gre (combined) |
-   IfLessOp   :  RETURN Less (combined) |
-   IfLessEquOp:  RETURN LessEqu (combined) |
-   IfGreEquOp :  RETURN GreEqu (combined) |
-   IfEquOp:  RETURN GreEqu (combined) |
-   IfNotEquOp :  RETURN NotEqu (combined)
-
-   ELSE
-  InternalError ('unrecognized comparison operator')
-   END ;
-   RETURN FALSE
-END QuadCondition ;
-
-
 (*
FoldIfGre - check to see if it is possible to evaluate
if op1 > op2 then goto op3.
 *)
 
-PROCEDURE FoldIfGre (tokenno: CARDINAL; p: WalkAction;
+PROCEDURE Fold

[gcc r14-10966] [PATCH] PR modula2/115536 Expression is evaluated incorrectly when encountering relops and indirecti

2024-11-22 Thread Gaius Mulley via Gcc-cvs

https://gcc.gnu.org/g:3adceba04eddf8c6e21fda3f7d0f8015e17bf5d8

commit r14-10966-g3adceba04eddf8c6e21fda3f7d0f8015e17bf5d8
Author: Gaius Mulley 
Date:   Fri Nov 22 18:38:51 2024 +

[PATCH] PR modula2/115536 Expression is evaluated incorrectly when encountering relops and indirection

This fix ensures that we only call BuildRelOpFromBoolean if we are
inside a constant expression (where no indirection can be used).
The fix creates a temporary variable when a boolean is created from
a relop in other cases.
The previous pattern implementation would not work if the operands required
dereferencing in non-constant expressions.  Comparisons of relop results
in a constant expression are resolved by constant propagation, basic
block analysis and dead code removal.  After the quadruples have been
optimized only one assignment to the boolean variable will remain for
const expressions.  All quadruple pattern checking for boolean
expressions is removed by the patch.  Thus the implementation becomes
more generic.

gcc/m2/ChangeLog:

PR modula2/115536
* gm2-compiler/M2BasicBlock.def (GetBasicBlockScope): New procedure.
(GetBasicBlockStart): Ditto.
(GetBasicBlockEnd): Ditto.
(IsBasicBlockFirst): New procedure function.
* gm2-compiler/M2BasicBlock.mod (ConvertQuads2BasicBlock): Allow
conditional boolean quads to be removed.
(GetBasicBlockScope): Implement new procedure.
(GetBasicBlockStart): Ditto.
(GetBasicBlockEnd): Ditto.
(IsBasicBlockFirst): Implement new procedure function.
* gm2-compiler/M2GCCDeclare.def (FoldConstants): New parameter
declaration.
* gm2-compiler/M2GCCDeclare.mod (FoldConstants): New parameter
declaration.
(DeclareTypesConstantsProceduresInRange): Recreate basic blocks
after resolving constant expressions.
(CodeBecomes): Guard IsVariableSSA with IsVar.
* gm2-compiler/M2GenGCC.def (ResolveConstantExpressions): New
parameter declaration.
* gm2-compiler/M2GenGCC.mod (FoldIfLess): Remove relop pattern
detection.
(FoldIfGre): Ditto.
(FoldIfLessEqu): Ditto.
(FoldIfGreEqu): Ditto.
(FoldIfIn): Ditto.
(FoldIfNotIn): Ditto.
(FoldIfEqu): Ditto.
(FoldIfNotEqu): Ditto.
(FoldBecomes): Add BasicBlock parameter and allow conditional
boolean becomes to be folded in the first basic block.
(ResolveConstantExpressions): Reimplement.
* gm2-compiler/M2Quads.def (IsConstQuad): New procedure function.
(IsConditionalBooleanQuad): Ditto.
* gm2-compiler/M2Quads.mod (IsConstQuad): Implement new procedure function.
(IsConditionalBooleanQuad): Ditto.
(MoveWithMode): Use GenQuadOTypetok.
(IsInitialisingConst): Rewrite using OpUsesOp1.
(OpUsesOp1): New procedure function.
(doBuildAssignment): Mark des as a VarConditional.
(ConvertBooleanToVariable): Call PutVarConditional.
(DumpQuadSummary): New procedure.
(BuildRelOpFromBoolean): Updated debugging and improved comments.
(BuildRelOp): Only call BuildRelOpFromBoolean if we are in a const
expression and both operands are boolean relops.
(GenQuadOTypeUniquetok): New procedure.
(BackPatch): Correct comment.
* gm2-compiler/SymbolTable.def (PutVarConditional): New procedure.
(IsVarConditional): New procedure function.
* gm2-compiler/SymbolTable.mod (PutVarConditional): Implement new
procedure.
(IsVarConditional): Implement new procedure function.
(SymConstVar): New field IsConditional.
(SymVar): New field IsConditional.
(MakeVar): Initialize IsConditional field.
(MakeConstVar): Initialize IsConditional field.
* gm2-compiler/M2Swig.mod (DoBasicBlock): Change parameters to
use BasicBlock.
* gm2-compiler/M2Code.mod (SecondDeclareAndOptimize): Use iterator
to FoldConstants over basic block list.
* gm2-compiler/M2SymInit.mod (AppendEntry): Replace parameters
with BasicBlock.
* gm2-compiler/P3Build.bnf (Relation): Call RecordOp for #, <> and =.

gcc/testsuite/ChangeLog:

PR modula2/115536
* gm2/iso/const/pass/constbool4.mod: New test.
* gm2/iso/const/pass/constbool5.mod: New test.
* gm2/iso/run/pass/condtest2.mod: New test.
* gm2/iso/run/pass/condtest3.mod: New test.
* gm2/iso/run/pass/condtest4.mod: New test.
* gm2/iso/run/pass/condtest5.mod: New test.
* gm2/iso/run/pass/co

[gcc r15-5595] Add -f{, no-}assume-sane-operators-new-delete options [PR110137]

2024-11-22 Thread Jakub Jelinek via Gcc-cvs

https://gcc.gnu.org/g:27778979c9a1e32a6ca74e5b5f377385225449b1

commit r15-5595-g27778979c9a1e32a6ca74e5b5f377385225449b1
Author: Jakub Jelinek 
Date:   Fri Nov 22 19:52:35 2024 +0100

Add -f{,no-}assume-sane-operators-new-delete options [PR110137]

The following patch adds a new option for optimizations related to
replaceable global operators new/delete.
The option isn't called -fassume-sane-operator-new (which clang++
implements), because
1) clang++ option means something different; initially it was an
   option to add malloc attribute to those declarations (but we have
   malloc attribute on all <new> calls already unconditionally);
   later it was changed to add noalias attribute rather than malloc,
   whatever it means, but it is certainly about the return value
   from the operator new (whether it can alias with other pointers);
   we already assume malloc-ish behavior that it doesn't alias any
   other pointers
2) the option only affects operator new, while we want it to affect
   operator delete as well
The option basically allows choosing between pre-PR101480 behavior
(now the default, more optimistic) and post-PR101480 behavior (safer
but penalizing most of the code in the wild for rare needs).

I've tried to explain stuff in the documentation too.

2024-11-22  Jakub Jelinek  

PR c++/110137
PR middle-end/101480
gcc/
* doc/invoke.texi (-fassume-sane-operators-new-delete,
-fno-assume-sane-operators-new-delete): Document.
* gimple.cc (gimple_call_fnspec): Handle
-f{,no-}assume-sane-operators-new-delete.
* ipa-inline-transform.cc (inline_call): Also clear
flag_assume_sane_operators_new_delete on caller when inlining
-fno-assume-sane-operators-new-delete callee into
-fassume-sane-operators-new-delete caller.
gcc/c-family/
* c.opt (fassume-sane-operators-new-delete): New option.
gcc/testsuite/
* g++.dg/tree-ssa/pr110137-1.C: New test.
* g++.dg/tree-ssa/pr110137-2.C: New test.
* g++.dg/tree-ssa/pr110137-3.C: New test.
* g++.dg/tree-ssa/pr110137-4.C: New test.
* g++.dg/torture/pr10148.C: Add -fno-assume-sane-operators-new-delete
as dg-additional-options.
* g++.dg/warn/Warray-bounds-16.C: Revert 2021-11-10 changes.

Diff:
---
 gcc/c-family/c.opt   |   4 +
 gcc/doc/invoke.texi  |  33 ++-
 gcc/gimple.cc|  14 ++-
 gcc/ipa-inline-transform.cc  |  28 --
 gcc/testsuite/g++.dg/torture/pr10148.C   |   1 +
 gcc/testsuite/g++.dg/tree-ssa/pr110137-1.C   |  74 
 gcc/testsuite/g++.dg/tree-ssa/pr110137-2.C   |  74 
 gcc/testsuite/g++.dg/tree-ssa/pr110137-3.C   |  76 
 gcc/testsuite/g++.dg/tree-ssa/pr110137-4.C   | 124 +++
 gcc/testsuite/g++.dg/warn/Warray-bounds-16.C |   6 +-
 10 files changed, 422 insertions(+), 12 deletions(-)

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 65cb8e5c6969..268725471329 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1690,6 +1690,10 @@ fasm
 C ObjC C++ ObjC++ Var(flag_no_asm, 0)
 Recognize the \"asm\" keyword.
 
+fassume-sane-operators-new-delete
+C++ ObjC++ Optimization Var(flag_assume_sane_operators_new_delete) Init(1)
+Assume C++ replaceable global operators new, new[], delete, delete[] don't read or write visible global state.
+
 ; Define extra predefined macros for use in libgcc.
 fbuilding-libgcc
 C ObjC C++ ObjC++ Undocumented Var(flag_building_libgcc)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0951901f50af..a8662efb5cb2 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -213,7 +213,9 @@ in the following sections.
 @item C++ Language Options
 @xref{C++ Dialect Options,,Options Controlling C++ Dialect}.
 @gccoptlist{-fabi-version=@var{n}  -fno-access-control
--faligned-new=@var{n}  -fargs-in-order=@var{n}  -fchar8_t  -fcheck-new
+-faligned-new=@var{n}  -fargs-in-order=@var{n}
+-fno-assume-sane-operators-new-delete
+-fchar8_t  -fcheck-new
 -fconcepts  -fconstexpr-depth=@var{n}  -fconstexpr-cache-depth=@var{n}
 -fconstexpr-loop-limit=@var{n}  -fconstexpr-ops-limit=@var{n}
 -fno-elide-constructors
@@ -3163,6 +3165,35 @@ but few users will need to override the default of
 
 This flag is enabled by default for @option{-std=c++17}.
 
+@opindex fno-assume-sane-operators-new-delete
+@opindex fassume-sane-operators-new-delete
+@item -fno-assume-sane-operators-new
+The C++ standard allows replacing the global @code{new}, @code{new[]},
+@code{delete} and @code{delete[]} operators, though a lot of C++ programs
+don't replace them and just use the implementation provided version.
+Furthermore, the C++ standard allows omitting those calls if they are
+made fro

[gcc r15-5597] tree-optimization/117355: object size for PHI nodes with negative offsets

2024-11-22 Thread Siddhesh Poyarekar via Gcc-cvs

https://gcc.gnu.org/g:684595188dea02d246edb66106d82bb7a9a22d79

commit r15-5597-g684595188dea02d246edb66106d82bb7a9a22d79
Author: Siddhesh Poyarekar 
Date:   Tue Nov 19 22:51:31 2024 -0500

tree-optimization/117355: object size for PHI nodes with negative offsets

When the object size estimate is returned for a PHI node, it is the
maximum possible value, which is fine in isolation.  When combined with
negative offsets however, it may sometimes end up in zero size because
the resultant size was larger than the wholesize, leading
size_for_offset to conclude that there's a potential underflow.  Fix
this by allowing a non-strict mode to size_for_offset, which
conservatively returns the size (or wholesize) in case of a negative
offset.
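
The strict and non-strict behaviour described above can be sketched with plain size_t arithmetic.  This is only a simplified model under stated assumptions, not the code in tree-object-size.cc: the name model_size_for_offset and the constant MODEL_OFFSET_LIMIT are hypothetical, negative offsets are represented as wrapped size_t values, and the handling of non-constant trees is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit: offsets above this are treated as negative or too large.  */
#define MODEL_OFFSET_LIMIT (SIZE_MAX / 2)

/* Simplified model of the size computation.  SZ is the remaining size,
   WHOLESIZE the size of the whole object, OFFSET a possibly wrapped
   (negative) byte offset.  */
size_t
model_size_for_offset (size_t sz, size_t offset, size_t wholesize, bool strict)
{
  if (wholesize != sz)
    {
      /* Rebase the offset onto the start of the whole object.  The
	 arithmetic wraps modulo SIZE_MAX + 1, as sizetype arithmetic does.  */
      size_t whole = wholesize > sz ? wholesize : sz;
      offset = whole + offset - sz;
      sz = whole;
    }

  if (offset == 0)
    return sz;

  /* Still negative or too large after the adjustment.  Strict callers get
     zero, permissive callers get the (possibly overestimated) size.  */
  if (offset > MODEL_OFFSET_LIMIT)
    return strict ? 0 : sz;

  /* max (sz, offset) - offset never underflows.  */
  return (sz > offset ? sz : offset) - offset;
}
```

With an overestimated size of 10 and an offset of -2, the strict call yields 0 while the non-strict call yields 10, which is the conservative answer the commit message describes.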

gcc/ChangeLog:

PR tree-optimization/117355
* tree-object-size.cc (size_for_offset): New argument STRICT,
return SZ if it is set to false.
(plus_stmt_object_size): Adjust call to SIZE_FOR_OFFSET.

gcc/testsuite/ChangeLog:

PR tree-optimization/117355
* g++.dg/ext/builtin-object-size2.C (test9): New test.
(main): Call it.
* gcc.dg/builtin-object-size-3.c (test10): Adjust expected size.

Signed-off-by: Siddhesh Poyarekar 

Diff:
---
 gcc/testsuite/g++.dg/ext/builtin-object-size2.C | 27 
 gcc/testsuite/gcc.dg/builtin-object-size-3.c|  2 +-
 gcc/tree-object-size.cc | 28 +++--
 3 files changed, 50 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/g++.dg/ext/builtin-object-size2.C b/gcc/testsuite/g++.dg/ext/builtin-object-size2.C
index 7a8f4e627332..45401b5a9c13 100644
--- a/gcc/testsuite/g++.dg/ext/builtin-object-size2.C
+++ b/gcc/testsuite/g++.dg/ext/builtin-object-size2.C
@@ -406,6 +406,32 @@ test8 (union F *f)
 FAIL ();
 }
 
+// PR117355
+#define STR "bbb"
+
+void
+__attribute__ ((noinline))
+test9 (void)
+{
+  char line[256];
+  const char *p = STR;
+  const char *q = p + sizeof (STR) - 1;
+
+  char *q1 = line;
+  for (const char *p1 = p; p1 < q;)
+{
+  *q1++ = *p1++;
+
+  if (p1 < q && (*q1++ = *p1++) != '\0')
+   {
+ if (__builtin_object_size (q1 - 2, 0) == 0)
+   __builtin_abort ();
+ if (__builtin_object_size (q1 - 2, 1) == 0)
+   __builtin_abort ();
+   }
+}
+}
+
 int
 main (void)
 {
@@ -430,5 +456,6 @@ main (void)
   union F f, *fp = &f;
   __asm ("" : "+r" (fp));
   test8 (fp);
+  test9 ();
   DONE ();
 }
diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-3.c b/gcc/testsuite/gcc.dg/builtin-object-size-3.c
index ec2c62c96401..e0c967e003f6 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-3.c
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-3.c
@@ -619,7 +619,7 @@ test10 (void)
  if (__builtin_object_size (p - 3, 2) != sizeof (buf) - i + 3)
FAIL ();
 #else
- if (__builtin_object_size (p - 3, 2) != 0)
+ if (__builtin_object_size (p - 3, 2) != 3)
FAIL ();
 #endif
  break;
diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index 09aad88498ea..6413ebcca37c 100644
--- a/gcc/tree-object-size.cc
+++ b/gcc/tree-object-size.cc
@@ -344,7 +344,8 @@ init_offset_limit (void)
be positive and hence, be within OFFSET_LIMIT for valid offsets.  */
 
 static tree
-size_for_offset (tree sz, tree offset, tree wholesize = NULL_TREE)
+size_for_offset (tree sz, tree offset, tree wholesize = NULL_TREE,
+bool strict = true)
 {
   gcc_checking_assert (types_compatible_p (TREE_TYPE (sz), sizetype));
 
@@ -377,9 +378,17 @@ size_for_offset (tree sz, tree offset, tree wholesize = NULL_TREE)
return sz;
 
   /* Negative or too large offset even after adjustment, cannot be within
-bounds of an object.  */
+bounds of an object.  The exception here is when the base object size
+has been overestimated (e.g. through PHI nodes or a COND_EXPR) and the
+adjusted offset remains negative.  If the caller wants to be
+permissive, return the base size.  */
   if (compare_tree_int (offset, offset_limit) > 0)
-   return size_zero_node;
+   {
+ if (strict)
+   return size_zero_node;
+ else
+   return sz;
+   }
 }
 
   return size_binop (MINUS_EXPR, size_binop (MAX_EXPR, sz, offset), offset);
@@ -1521,16 +1530,23 @@ plus_stmt_object_size (struct object_size_info *osi, tree var, gimple *stmt)
  addr_object_size (osi, op0, object_size_type, &bytes, &wholesize);
}
 
+  bool pos_offset = (size_valid_p (op1, 0)
+&& compare_tree_int (op1, offset_limit) <= 0);
+
   /* size_for_offset doesn't make sense for -1 size, but it does for size 0
 since the wholesize could be non-zero and a negative offset could give
 a non-zero size.  */
   if (size

[gcc r15-5598] c: Fix typeof_unqual handling of qualified array types [PR112841]

2024-11-22 Thread Joseph Myers via Gcc-cvs

https://gcc.gnu.org/g:84a335eb4f9641a471184d86900609dd97215218

commit r15-5598-g84a335eb4f9641a471184d86900609dd97215218
Author: Joseph Myers 
Date:   Fri Nov 22 20:33:10 2024 +

c: Fix typeof_unqual handling of qualified array types [PR112841]

As reported in bug 112841, typeof_unqual fails to remove qualifiers
from qualified array types.  In C23 (unlike in previous standard
versions), array types are considered to have the qualifiers of the
element type, so typeof_unqual should remove such qualifiers (and an
example in the standard shows that is as intended).  Fix this by
calling strip_array_types when checking for the presence of
qualifiers.  (The reason we check for qualifiers rather than just
using TYPE_MAIN_VARIANT unconditionally is to avoid, as a quality of
implementation matter, unnecessarily losing typedef information in the
case where the type is already unqualified.)
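
The effect of looking through array types before testing for qualifiers can be illustrated with a toy type representation.  This is a sketch under assumptions, not the front end's data structures: struct model_type and the model_* names are hypothetical, and the model assumes (as the fix implies) that an array node carries no qualifiers of its own while its element type does.

```c
#include <stddef.h>

enum { MODEL_UNQUALIFIED = 0, MODEL_QUAL_CONST = 1 };

/* Toy type node: qualifiers on this node, plus the element type when the
   node is an array (NULL otherwise).  */
struct model_type
{
  unsigned quals;
  const struct model_type *elem;
};

/* Toy fixtures: int, const int, int[3] and const int[3].  */
const struct model_type model_int = { MODEL_UNQUALIFIED, NULL };
const struct model_type model_const_int = { MODEL_QUAL_CONST, NULL };
const struct model_type model_int_array3 = { MODEL_UNQUALIFIED, &model_int };
const struct model_type model_const_int_array3
  = { MODEL_UNQUALIFIED, &model_const_int };

/* Walk down to the innermost element type of nested arrays.  */
const struct model_type *
model_strip_array_types (const struct model_type *t)
{
  while (t->elem != NULL)
    t = t->elem;
  return t;
}

/* Check before the fix: only the outermost node is inspected.  */
int
model_is_qualified_old (const struct model_type *t)
{
  return t->quals != MODEL_UNQUALIFIED;
}

/* Check after the fix: inspect the element type of arrays.  */
int
model_is_qualified_new (const struct model_type *t)
{
  return model_strip_array_types (t)->quals != MODEL_UNQUALIFIED;
}
```

In this model the old check reports const int[3] as unqualified, so the qualifier would survive typeof_unqual, whereas the new check detects it.  An unqualified int[3] is left alone by both checks, which corresponds to the case where typedef information is preserved.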

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

PR c/112841

gcc/c/
* c-parser.cc (c_parser_typeof_specifier): Call strip_array_types
when checking for type qualifiers for typeof_unqual.

gcc/testsuite/
* gcc.dg/c23-typeof-4.c: New test.

Diff:
---
 gcc/c/c-parser.cc   |  3 ++-
 gcc/testsuite/gcc.dg/c23-typeof-4.c | 10 ++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index f3ed61047477..44d344fcd32a 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -,7 +,8 @@ c_parser_typeof_specifier (c_parser *parser)
   parens.skip_until_found_close (parser);
   if (ret.spec != error_mark_node)
 {
-  if (is_unqual && TYPE_QUALS (ret.spec) != TYPE_UNQUALIFIED)
+  if (is_unqual
+ && TYPE_QUALS (strip_array_types (ret.spec)) != TYPE_UNQUALIFIED)
ret.spec = TYPE_MAIN_VARIANT (ret.spec);
   if (is_std)
{
diff --git a/gcc/testsuite/gcc.dg/c23-typeof-4.c b/gcc/testsuite/gcc.dg/c23-typeof-4.c
new file mode 100644
index ..471d08293414
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-typeof-4.c
@@ -0,0 +1,10 @@
+/* Test C23 typeof and typeof_unqual on qualified arrays (bug 112841).  */
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -pedantic-errors" } */
+
+const int a[] = { 1, 2, 3 };
+int b[3];
+extern typeof (a) a;
+extern typeof (const int [3]) a;
+extern typeof_unqual (a) b;
+extern typeof_unqual (const int [3]) b;

[gcc r15-5599] AVR: target/117726 - Tweak ashiftrt:SI and lshiftrt:SI insns.

2024-11-22 Thread Georg-Johann Lay via Gcc-cvs

https://gcc.gnu.org/g:939362411d0903542647dae0eff82db10a3ad78a

commit r15-5599-g939362411d0903542647dae0eff82db10a3ad78a
Author: Georg-Johann Lay 
Date:   Thu Nov 21 22:59:14 2024 +0100

AVR: target/117726 - Tweak ashiftrt:SI and lshiftrt:SI insns.

This patch is similar to r15-5569 (tweak ashift:SI) but for
ashiftrt and lshiftrt codes.  It splits constant shift offsets > 16
into a 3-operand byte shift and a 2-operand residual bit shift.
   Moreover, some of the constraint alternatives have been promoted
to 3-operand alternatives regardless of options.  For example,
ashift:HI and lshiftrt:HI can support 3 operands for offsets 9...12
without any overhead.
   Apart from that, it's a bit of code clean up for 2-byte and 4-byte
shift insns:  Use one RTL peephole with any_shift code iterator
instead of 3 individual peepholes.  It also removes some useless
split insns; presumably introduced during the cc0 -> CCmode work.
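
The decomposition into a byte shift and a residual bit shift can be sketched in portable C.  This is only an arithmetic illustration with hypothetical function names, not the AVR backend code; it covers the logical right shift and the left shift on unsigned values, and assumes a shift count in the range 0..31.  The arithmetic right shift follows the same decomposition.

```c
#include <stdint.h>

/* Logical right shift of a 32-bit value by N (0..31) in two steps:
   first by a whole number of bytes, then by the remaining 0..7 bits.  */
uint32_t
model_lshr32_split (uint32_t x, unsigned n)
{
  unsigned byte_part = n & ~7u;	/* 0, 8, 16 or 24: the byte shift.  */
  unsigned bit_part = n & 7u;	/* 0..7: the residual bit shift.  */
  uint32_t t = x >> byte_part;
  return t >> bit_part;
}

/* Left shift of a 32-bit value by N (0..31), split the same way.  */
uint32_t
model_ashl32_split (uint32_t x, unsigned n)
{
  unsigned byte_part = n & ~7u;
  unsigned bit_part = n & 7u;
  uint32_t t = x << byte_part;
  return t << bit_part;
}
```

For example, a shift by 19 becomes a shift by 16 (moving two bytes) followed by a shift by 3.  On an 8-bit target the byte part amounts to register moves, so only the residual bits need actual shift instructions.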

PR target/117726
gcc/
* config/avr/avr-passes.cc (avr_split_shift): Also handle
ASHIFTRT and LSHIFTRT codes for 4-byte shifts.
(constr_split_shift4): New code_attr.
(avr_emit_shift): Adjust to new shift capabilities.
* config/avr/predicates.md (scratch_or_d_register_operand):
rename to scratch_or_dreg_operand.
* config/avr/avr.md: Same.
(define_peephole2): Write the RTL scratch peephole for 2-byte and
4-byte shifts that generates *sh*3_const insns using code
iterator any_shift.
(*ashlhi3_const_split, *ashrhi3_const_split, *ashrhi3_const_split)
(*lshrsi3_const_split, *lshrhi3_const_split): Remove useless
split insns.
(define_split) [avropt_split_bit_shift]: Add splitters
for 4-byte ASHIFTRT and LSHIFTRT insns using avr_split_shift().
(ashrsi3, *ashrsi3, *ashrsi3_const): Add "r,0,C4a" and "r,r,C4a"
constraint alternatives depending on 2op, 3op.
(lshrsi3, *lshrsi3, *lshrsi3_const): Add "r,0,C4r" and "r,r,C4r"
constraint alternatives depending on 2op, 3op. Add "r,r,C15".
(lshrhi3, *lshrhi3, *lshrhi3_const, ashlhi3, *ashlhi3)
(*ashlhi3_const): Add "r,r,C7c" alternative.
(ashrpsi, *ashrpsi3): Add "r,r,C22" alternative.
(ashlqi, *ashlqi): Turn C06 alternative into "r,r,C06".
* config/avr/constraints.md (C14, C22, C30, C7c): New constraints.
* config/avr/avr.cc (ashlhi3_out, lshrhi3_out)
[case 7, 9, 10, 11, 12]: Support as 3-operand insn.
(lshrsi3_out) [case 15]: Same.
(ashrsi3_out) [case 30]: Same.
(ashrhi3_out) [case 14]: Same.
(ashrqi3_out) [case 6]: Same.
(avr_out_ashrpsi3) [case 22]: Same.
* config/avr/avr.h: Fix comment typo.
* doc/invoke.texi (AVR Options) <-msplit-bit-shift>: Document.

Diff:
---
 gcc/config/avr/avr-passes.cc  |  95 +--
 gcc/config/avr/avr.cc | 174 
 gcc/config/avr/avr.h  |   7 +-
 gcc/config/avr/avr.md | 371 --
 gcc/config/avr/constraints.md |  20 +++
 gcc/config/avr/predicates.md  |   2 +-
 gcc/doc/invoke.texi   |  11 +-
 7 files changed, 395 insertions(+), 285 deletions(-)

diff --git a/gcc/config/avr/avr-passes.cc b/gcc/config/avr/avr-passes.cc
index 57c3fed1e410..bd249b70e8d6 100644
--- a/gcc/config/avr/avr-passes.cc
+++ b/gcc/config/avr/avr-passes.cc
@@ -43,6 +43,7 @@
 #include "context.h"
 #include "tree-pass.h"
 #include "insn-attr.h"
+#include "tm-constrs.h"
 
 
 #define CONST_INT_OR_FIXED_P(X) (CONST_INT_P (X) || CONST_FIXED_P (X))
@@ -2412,6 +2413,7 @@ bbinfo_t::find_plies (int len, const insninfo_t &ii, const memento_t &memo0)
 
   bool profitable = (cost < SCALE * fpd->max_ply_cost
 || (bbinfo_t::try_split_any_p
+&& fpd->solution.n_plies == 0
 && cost / SCALE <= fpd->max_ply_cost
 && cost / SCALE == fpd->movmode_cost));
   if (! profitable)
@@ -4840,37 +4842,54 @@ avr_shift_is_3op ()
LSHIFTRT, ASHIFT } into a byte shift and a residual bit shift.  */
 
 bool
-avr_split_shift_p (int n_bytes, int offset, rtx_code)
+avr_split_shift_p (int n_bytes, int offset, rtx_code code)
 {
   gcc_assert (n_bytes == 4);
 
-  return (avr_shift_is_3op ()
- && offset % 8 != 0 && IN_RANGE (offset, 17, 30));
+  if (avr_shift_is_3op ()
+  && offset % 8 != 0)
+return select<bool>()
+  : code == ASHIFT ? IN_RANGE (offset, 17, 30)
+  : code == ASHIFTRT ? IN_RANGE (offset, 9, 29)
+  : code == LSHIFTRT ? IN_RANGE (offset, 9, 30) && offset != 15
+  : bad_case<bool> ();
+
+  return false;
 }
 
 
+/* Emit a DEST = SRC <code> OFF shift of QImode, HImode or PSImode.
+   SCRATCH is a QImode d-register, scratch

[gcc(refs/users/meissner/heads/work187)] Add ChangeLog.meissner and REVISION.

2024-11-22 Thread Michael Meissner via Libstdc++-cvs

https://gcc.gnu.org/g:a603d5e3c38948cbe97a97bea524a62e0aed8392

commit a603d5e3c38948cbe97a97bea524a62e0aed8392
Author: Michael Meissner 
Date:   Fri Nov 22 17:28:11 2024 -0500

Add ChangeLog.meissner and REVISION.

2024-11-22  Michael Meissner  

gcc/

* REVISION: New file for branch.
* ChangeLog.meissner: New file.

gcc/c-family/

* ChangeLog.meissner: New file.

gcc/c/

* ChangeLog.meissner: New file.

gcc/cp/

* ChangeLog.meissner: New file.

gcc/fortran/

* ChangeLog.meissner: New file.

gcc/testsuite/

* ChangeLog.meissner: New file.

libgcc/

* ChangeLog.meissner: New file.

Diff:
---
 gcc/ChangeLog.meissner   | 5 +
 gcc/REVISION | 1 +
 gcc/c-family/ChangeLog.meissner  | 5 +
 gcc/c/ChangeLog.meissner | 5 +
 gcc/cp/ChangeLog.meissner| 5 +
 gcc/fortran/ChangeLog.meissner   | 5 +
 gcc/testsuite/ChangeLog.meissner | 5 +
 libgcc/ChangeLog.meissner| 5 +
 libstdc++-v3/ChangeLog.meissner  | 5 +
 9 files changed, 41 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
new file mode 100644
index ..0cbeebfbc31e
--- /dev/null
+++ b/gcc/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work187, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
new file mode 100644
index ..4bb9f2daed50
--- /dev/null
+++ b/gcc/REVISION
@@ -0,0 +1 @@
+work187 branch
diff --git a/gcc/c-family/ChangeLog.meissner b/gcc/c-family/ChangeLog.meissner
new file mode 100644
index ..0cbeebfbc31e
--- /dev/null
+++ b/gcc/c-family/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work187, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/c/ChangeLog.meissner b/gcc/c/ChangeLog.meissner
new file mode 100644
index ..0cbeebfbc31e
--- /dev/null
+++ b/gcc/c/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work187, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/cp/ChangeLog.meissner b/gcc/cp/ChangeLog.meissner
new file mode 100644
index ..0cbeebfbc31e
--- /dev/null
+++ b/gcc/cp/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work187, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/fortran/ChangeLog.meissner b/gcc/fortran/ChangeLog.meissner
new file mode 100644
index ..0cbeebfbc31e
--- /dev/null
+++ b/gcc/fortran/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work187, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/testsuite/ChangeLog.meissner b/gcc/testsuite/ChangeLog.meissner
new file mode 100644
index ..0cbeebfbc31e
--- /dev/null
+++ b/gcc/testsuite/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work187, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/libgcc/ChangeLog.meissner b/libgcc/ChangeLog.meissner
new file mode 100644
index ..0cbeebfbc31e
--- /dev/null
+++ b/libgcc/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work187, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/libstdc++-v3/ChangeLog.meissner b/libstdc++-v3/ChangeLog.meissner
new file mode 100644
index ..0cbeebfbc31e
--- /dev/null
+++ b/libstdc++-v3/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work187, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch

[gcc] Created branch 'meissner/heads/work187-dmf' in namespace 'refs/users'

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-dmf' was created in namespace 'refs/users' 
pointing to:

 a603d5e3c389... Add ChangeLog.meissner and REVISION.

[gcc] Created branch 'meissner/heads/work187' in namespace 'refs/users'

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187' was created in namespace 'refs/users' 
pointing to:

 76c202329458... test-art: Fix comment in types.h

[gcc(refs/users/meissner/heads/work187-dmf)] Add ChangeLog.dmf and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:8eada357ce164605748860dcaf0438b664dfc15f

commit 8eada357ce164605748860dcaf0438b664dfc15f
Author: Michael Meissner 
Date:   Fri Nov 22 17:29:03 2024 -0500

Add ChangeLog.dmf and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.dmf: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.dmf | 5 +
 gcc/REVISION  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
new file mode 100644
index ..587a023f88f5
--- /dev/null
+++ b/gcc/ChangeLog.dmf
@@ -0,0 +1,5 @@
+ Branch work187-dmf, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..ac085fee6e80 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-dmf branch

[gcc] Created branch 'meissner/heads/work187-vpair' in namespace 'refs/users'

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-vpair' was created in namespace 'refs/users' 
pointing to:

 a603d5e3c389... Add ChangeLog.meissner and REVISION.

[gcc(refs/users/meissner/heads/work187-vpair)] Add ChangeLog.vpair and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:f0fa789ddba838921ed101ee44f53d08fb1bcab3

commit f0fa789ddba838921ed101ee44f53d08fb1bcab3
Author: Michael Meissner 
Date:   Fri Nov 22 17:29:56 2024 -0500

Add ChangeLog.vpair and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.vpair: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.vpair | 5 +
 gcc/REVISION| 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.vpair b/gcc/ChangeLog.vpair
new file mode 100644
index ..e16bcc29d564
--- /dev/null
+++ b/gcc/ChangeLog.vpair
@@ -0,0 +1,5 @@
+ Branch work187-vpair, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..76e807270b55 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-vpair branch

[gcc] Created branch 'meissner/heads/work187-bugs' in namespace 'refs/users'

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-bugs' was created in namespace 'refs/users' 
pointing to:

 a603d5e3c389... Add ChangeLog.meissner and REVISION.

[gcc(refs/users/meissner/heads/work187-bugs)] Add ChangeLog.bugs and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:0baa951aedfbe31b17252fa6851669a1d89d313f

commit 0baa951aedfbe31b17252fa6851669a1d89d313f
Author: Michael Meissner 
Date:   Fri Nov 22 17:30:50 2024 -0500

Add ChangeLog.bugs and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.bugs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.bugs | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
new file mode 100644
index ..833633875c55
--- /dev/null
+++ b/gcc/ChangeLog.bugs
@@ -0,0 +1,5 @@
+ Branch work187-bugs, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..1758d81c751c 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-bugs branch

[gcc] Created branch 'meissner/heads/work187-libs' in namespace 'refs/users'

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-libs' was created in namespace 'refs/users' 
pointing to:

 a603d5e3c389... Add ChangeLog.meissner and REVISION.

[gcc] Created branch 'meissner/heads/work187-sha' in namespace 'refs/users'

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-sha' was created in namespace 'refs/users' 
pointing to:

 a603d5e3c389... Add ChangeLog.meissner and REVISION.

[gcc(refs/users/meissner/heads/work187-libs)] Add ChangeLog.libs and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:c4985fe24c9a176d3e822d6e176026992e7c3384

commit c4985fe24c9a176d3e822d6e176026992e7c3384
Author: Michael Meissner 
Date:   Fri Nov 22 17:31:41 2024 -0500

Add ChangeLog.libs and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.libs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.libs | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.libs b/gcc/ChangeLog.libs
new file mode 100644
index ..220098bf5ca8
--- /dev/null
+++ b/gcc/ChangeLog.libs
@@ -0,0 +1,5 @@
+ Branch work187-libs, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..5b267ef1d74f 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-libs branch

[gcc(refs/users/meissner/heads/work187-sha)] Add ChangeLog.sha and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:35b42ee4ace96f9af241f81a8b8ca01df4a3aa38

commit 35b42ee4ace96f9af241f81a8b8ca01df4a3aa38
Author: Michael Meissner 
Date:   Fri Nov 22 17:32:35 2024 -0500

Add ChangeLog.sha and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.sha: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.sha | 5 +
 gcc/REVISION  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.sha b/gcc/ChangeLog.sha
new file mode 100644
index ..90fefd775f8b
--- /dev/null
+++ b/gcc/ChangeLog.sha
@@ -0,0 +1,5 @@
+ Branch work187-sha, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..bfa2656bef94 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-sha branch

[gcc] Created branch 'meissner/heads/work187-test' in namespace 'refs/users'

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-test' was created in namespace 'refs/users' 
pointing to:

 a603d5e3c389... Add ChangeLog.meissner and REVISION.

[gcc(refs/users/meissner/heads/work187-test)] Add ChangeLog.test and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:01087ae644a32b1ee439d17dbde5d6a5ce333b20

commit 01087ae644a32b1ee439d17dbde5d6a5ce333b20
Author: Michael Meissner 
Date:   Fri Nov 22 17:33:30 2024 -0500

Add ChangeLog.test and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.test: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.test | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
new file mode 100644
index ..88a8ed1c0e0f
--- /dev/null
+++ b/gcc/ChangeLog.test
@@ -0,0 +1,5 @@
+ Branch work187-test, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..666c8ae3062f 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-test branch

[gcc] Created branch 'meissner/heads/work187-orig' in namespace 'refs/users'

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-orig' was created in namespace 'refs/users' 
pointing to:

 76c202329458... test-art: Fix comment in types.h

[gcc(refs/users/meissner/heads/work187-orig)] Add REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:d4e9c771228110efdf6015d172f7c9a09de0c6ee

commit d4e9c771228110efdf6015d172f7c9a09de0c6ee
Author: Michael Meissner 
Date:   Fri Nov 22 17:34:30 2024 -0500

Add REVISION.

2024-11-22  Michael Meissner  

gcc/

* REVISION: New file for branch.

Diff:
---
 gcc/REVISION | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/REVISION b/gcc/REVISION
new file mode 100644
index ..40d69a0c9858
--- /dev/null
+++ b/gcc/REVISION
@@ -0,0 +1 @@
+work187-orig branch

[gcc(refs/users/meissner/heads/work187)] Change TARGET_MODULO to TARGET_POWER9.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:35fec696fde23eaaca9c2ccca26b2110c8d8b42b

commit 35fec696fde23eaaca9c2ccca26b2110c8d8b42b
Author: Michael Meissner 
Date:   Fri Nov 22 17:42:24 2024 -0500

Change TARGET_MODULO to TARGET_POWER9.

This patch changes TARGET_MODULO to TARGET_POWER9.  The -mmodulo switch is not
being changed, just the name of the macros used to determine if the PowerPC
processor supports ISA 3.0 (Power9).
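
The renaming pattern is a plain macro alias: the processor-level name expands to the existing ISA-bit test, so behaviour is unchanged.  The following is a minimal sketch under assumptions, not the rs6000.h definitions: the MODEL_* names and mask values are hypothetical, and the macros take the flag word as an argument here, whereas the real macros are object-like and read the target's flag word directly.

```c
#include <stdbool.h>

/* Hypothetical ISA bits in a flag word.  */
#define MODEL_MASK_MODULO	0x1u
#define MODEL_MASK_POWERPC64	0x2u

/* Feature tests named after individual ISA bits.  */
#define MODEL_TARGET_MODULO(flags)	(((flags) & MODEL_MASK_MODULO) != 0)
#define MODEL_TARGET_POWERPC64(flags)	(((flags) & MODEL_MASK_POWERPC64) != 0)

/* Processor-level alias: same test, clearer name.  */
#define MODEL_TARGET_POWER9(flags)	MODEL_TARGET_MODULO (flags)

/* Derived feature written in terms of the processor-level name.  */
#define MODEL_TARGET_EXTSWSLI(flags) \
  (MODEL_TARGET_POWER9 (flags) && MODEL_TARGET_POWERPC64 (flags))

/* Function wrapper so the derived test can be called like ordinary code.  */
bool
model_have_extswsli (unsigned flags)
{
  return MODEL_TARGET_EXTSWSLI (flags);
}
```

Because the alias expands to the same test, code using the new name behaves identically; only the spelling at the use sites changes.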

2024-11-22  Michael Meissner  

gcc/

* gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported):
Change TARGET_MODULO to TARGET_POWER9.
* gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal):
Likewise.
* gcc/config/rs6000/rs6000.h (TARGET_CTZ): Likewise.
(TARGET_EXTSWSLI): Likewise.
(TARGET_MADDLD): Likewise.
(TARGET_POWER9): New macro.
* gcc/config/rs6000/rs6000.md (enabled attribute): Change TARGET_MODULO
to TARGET_POWER9.
(mod<mode>3): Likewise.
(umod<mode>3): Likewise.
(divide/modulo peephole2): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc |  4 ++--
 gcc/config/rs6000/rs6000.cc |  4 ++--
 gcc/config/rs6000/rs6000.h  |  7 ---
 gcc/config/rs6000/rs6000.md | 14 +++---
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index dae43b672ea7..b6093b3cb64c 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -169,9 +169,9 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins fncode)
 case ENB_P8V:
   return TARGET_P8_VECTOR;
 case ENB_P9:
-  return TARGET_MODULO;
+  return TARGET_POWER9;
 case ENB_P9_64:
-  return TARGET_MODULO && TARGET_POWERPC64;
+  return TARGET_POWER9 && TARGET_POWERPC64;
 case ENB_P9V:
   return TARGET_P9_VECTOR;
 case ENB_P10:
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 4a7880ee9e1b..d41a3340d28b 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3886,7 +3886,7 @@ rs6000_option_override_internal (bool global_init_p)
 
   /* For the newer switches (vsx, dfp, etc.) set some of the older options,
  unless the user explicitly used the -mno- to disable the code.  */
-  if (TARGET_P9_VECTOR || TARGET_MODULO || TARGET_P9_MISC)
+  if (TARGET_P9_VECTOR || TARGET_POWER9 || TARGET_P9_MISC)
 rs6000_isa_flags |= (ISA_3_0_MASKS_SERVER & ~ignore_masks);
   else if (TARGET_P9_MINMAX)
 {
@@ -22333,7 +22333,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
*total = rs6000_cost->divsi;
}
   /* Add in shift and subtract for MOD unless we have a mod instruction. */
-  if ((!TARGET_MODULO
+  if ((!TARGET_POWER9
   || (RS6000_DISABLE_SCALAR_MODULO && SCALAR_INT_MODE_P (mode)))
 && (code == MOD || code == UMOD))
*total += COSTS_N_INSNS (2);
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 4c03ddb57c79..398f52a76ff5 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -463,9 +463,9 @@ extern int rs6000_vector_align[];
 #define TARGET_FCTIWUZ TARGET_POWER7
 /* Only powerpc64 and powerpc476 support fctid.  */
 #define TARGET_FCTID   (TARGET_POWERPC64 || rs6000_cpu == PROCESSOR_PPC476)
-#define TARGET_CTZ TARGET_MODULO
-#define TARGET_EXTSWSLI (TARGET_MODULO && TARGET_POWERPC64)
-#define TARGET_MADDLD  TARGET_MODULO
+#define TARGET_CTZ TARGET_POWER9
+#define TARGET_EXTSWSLI (TARGET_POWER9 && TARGET_POWERPC64)
+#define TARGET_MADDLD  TARGET_POWER9
 
 /* TARGET_DIRECT_MOVE is redundant to TARGET_P8_VECTOR, so alias it to that.  */
 #define TARGET_DIRECT_MOVE TARGET_P8_VECTOR
@@ -504,6 +504,7 @@ extern int rs6000_vector_align[];
 #define TARGET_POWER5X TARGET_FPRND
 #define TARGET_POWER6  TARGET_CMPB
 #define TARGET_POWER7  TARGET_POPCNTD
+#define TARGET_POWER9  TARGET_MODULO
 
 /* In switching from using target_flags to using rs6000_isa_flags, the options
machinery creates OPTION_MASK_ instead of MASK_.  The MASK_
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 49f6109fc800..294798747708 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -403,7 +403,7 @@
  (const_int 1)
 
  (and (eq_attr "isa" "p9")
- (match_test "TARGET_MODULO"))
+ (match_test "TARGET_POWER9"))
  (const_int 1)
 
  (and (eq_attr "isa" "p9v")
@@ -3457,7 +3457,7 @@
   || INTVAL (operands[2]) <= 0
   || (i = exact_log2 (INTVAL (operands[2]))) < 0)
 {
-  if (!TARGET_MODULO)
+  if (!TARGET_POWER9)
FAIL;
 
   operands[2] = force_reg (mode, operands[2]);
@@ -3491,7 +3491,7 @@
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r,r")
 (mod:GPR (match_oper

[gcc(refs/users/meissner/heads/work187)] Change TARGET_POPCNTB to TARGET_POWER5.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:38fdb83fd9735347321a3e2def32106d7613c11e

commit 38fdb83fd9735347321a3e2def32106d7613c11e
Author: Michael Meissner 
Date:   Fri Nov 22 17:37:35 2024 -0500

Change TARGET_POPCNTB to TARGET_POWER5.

This patch changes TARGET_POPCNTB to TARGET_POWER5.  The -mpopcntb switch is not
being changed in this patch, just the name of the macros used to determine if
the PowerPC processor supports ISA 2.2 (Power5).

2024-11-22  Michael Meissner  

gcc/

* gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported):
Change TARGET_POPCNTB to TARGET_POWER5.
* gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal):
Likewise.
* gcc/config/rs6000/rs6000.h (TARGET_FCFID): Likewise.
(TARGET_POWER5): New macro.
(TARGET_EXTRA_BUILTINS): Change TARGET_POPCNTB to TARGET_POWER5.
(TARGET_FRE): Likewise.
(TARGET_FRSQRTES): Likewise.
* gcc/config/rs6000/rs6000.md (enabled attribute): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc |  2 +-
 gcc/config/rs6000/rs6000.cc |  2 +-
 gcc/config/rs6000/rs6000.h  | 11 +++
 gcc/config/rs6000/rs6000.md |  2 +-
 4 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 9bdbae1ecf94..98a0545030cd 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -155,7 +155,7 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins fncode)
 case ENB_ALWAYS:
   return true;
 case ENB_P5:
-  return TARGET_POPCNTB;
+  return TARGET_POWER5;
 case ENB_P6:
   return TARGET_CMPB;
 case ENB_P6_64:
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index fa0e2ce8eea5..e1399edadcfa 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3924,7 +3924,7 @@ rs6000_option_override_internal (bool global_init_p)
 rs6000_isa_flags |= (ISA_2_5_MASKS_EMBEDDED & ~ignore_masks);
   else if (TARGET_FPRND)
 rs6000_isa_flags |= (ISA_2_4_MASKS & ~ignore_masks);
-  else if (TARGET_POPCNTB)
+  else if (TARGET_POWER5)
 rs6000_isa_flags |= (ISA_2_2_MASKS & ~ignore_masks);
   else if (TARGET_ALTIVEC)
 rs6000_isa_flags |= (OPTION_MASK_PPC_GFXOPT & ~ignore_masks);
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index e0c41e1dfd26..96de925cc0a0 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -448,7 +448,7 @@ extern int rs6000_vector_align[];
Enable 32-bit fcfid's on any of the switches for newer ISA machines.  */
 #define TARGET_FCFID   (TARGET_POWERPC64   \
 || TARGET_PPC_GPOPT/* 970/power4 */\
-|| TARGET_POPCNTB  /* ISA 2.02 */  \
+|| TARGET_POWER5   /* ISA 2.02 */  \
 || TARGET_CMPB /* ISA 2.05 */  \
 || TARGET_POPCNTD) /* ISA 2.06 */
 
@@ -499,6 +499,9 @@ extern int rs6000_vector_align[];
 #define TARGET_MINMAX  (TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT \
 && (TARGET_P9_MINMAX || !flag_trapping_math))
 
+/* Convert ISA bits like POPCNTB to PowerPC processors like POWER5.  */
+#define TARGET_POWER5  TARGET_POPCNTB
+
 /* In switching from using target_flags to using rs6000_isa_flags, the options
machinery creates OPTION_MASK_ instead of MASK_.  The MASK_
options that have not yet been replaced by their OPTION_MASK_
@@ -525,7 +528,7 @@ extern int rs6000_vector_align[];
 
 #define TARGET_EXTRA_BUILTINS  (TARGET_POWERPC64\
 || TARGET_PPC_GPOPT /* 970/power4 */\
-|| TARGET_POPCNTB   /* ISA 2.02 */  \
+|| TARGET_POWER5/* ISA 2.02 */  \
 || TARGET_CMPB  /* ISA 2.05 */  \
 || TARGET_POPCNTD   /* ISA 2.06 */  \
 || TARGET_ALTIVEC   \
@@ -541,9 +544,9 @@ extern int rs6000_vector_align[];
 #define TARGET_FRES (TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT)
 
 #define TARGET_FRE (TARGET_HARD_FLOAT \
-&& (TARGET_POPCNTB || VECTOR_UNIT_VSX_P (DFmode)))
+&& (TARGET_POWER5 || VECTOR_UNIT_VSX_P (DFmode)))
 
-#define TARGET_FRSQRTES(TARGET_HARD_FLOAT && TARGET_POPCNTB \
+#define TARGET_FRSQRTES(TARGET_HARD_FLOAT && TARGET_POWER5 \
 && TARGET_PPC_GFXOPT)
 
 #define TARGET_FRSQRTE (TARGET_HARD_FLOAT \
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 95be36d5a726..ec968f2a4def 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs

[gcc(refs/users/meissner/heads/work187)] Change TARGET_FPRND to TARGET_POWER5X.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:3fddd6e8001c0af22291768710d0ee85d22dc281

commit 3fddd6e8001c0af22291768710d0ee85d22dc281
Author: Michael Meissner 
Date:   Fri Nov 22 17:39:06 2024 -0500

Change TARGET_FPRND to TARGET_POWER5X.

This patch changes TARGET_FPRND to TARGET_POWER5X.  The -mfprnd switch is not
being changed, just the name of the macros used to determine if the PowerPC
processor supports ISA 2.4 (Power5x).
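
The renaming pattern can be sketched in isolation.  The following is only an
illustration, not GCC code: every DEMO_ name, the mask values, and the
demo_isa_flags variable are hypothetical stand-ins for the real option masks
and rs6000_isa_flags.  It shows a processor-named macro being a plain alias
for a feature-bit test, so call sites can be renamed without touching the
underlying switch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature bits; these are not GCC's real OPTION_MASK_ values.  */
#define DEMO_MASK_POPCNTB (1u << 0)
#define DEMO_MASK_FPRND   (1u << 1)

/* Hypothetical stand-in for rs6000_isa_flags.  */
static unsigned int demo_isa_flags;

/* Feature-bit tests, in the style of option-generated TARGET_ macros.  */
#define DEMO_TARGET_POPCNTB ((demo_isa_flags & DEMO_MASK_POPCNTB) != 0)
#define DEMO_TARGET_FPRND   ((demo_isa_flags & DEMO_MASK_FPRND) != 0)

/* Processor-named aliases: same test, more descriptive name.  */
#define DEMO_TARGET_POWER5  DEMO_TARGET_POPCNTB
#define DEMO_TARGET_POWER5X DEMO_TARGET_FPRND

/* Return true if the rounding insns guarded by the alias would be enabled
   for the given flag set.  */
bool
demo_have_round_insns (unsigned int flags)
{
  /* Only the two demo bits are meaningful here.  */
  assert ((flags & ~(DEMO_MASK_POPCNTB | DEMO_MASK_FPRND)) == 0);
  demo_isa_flags = flags;
  return DEMO_TARGET_POWER5X;
}
```

Because the alias expands to the original test, a condition written with the
processor name behaves exactly as it did with the feature-bit name.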

2024-11-22  Michael Meissner  

gcc/

* gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal):
Change TARGET_FPRND to TARGET_POWER5X.
* gcc/config/rs6000/rs6000.h (TARGET_POWER5X): New macro.
* gcc/config/rs6000/rs6000.md (fmod3): Change TARGET_FPRND to
TARGET_POWER5X.
(remainder3): Likewise.
(fctiwuz_): Likewise.
(ceil2): Likewise.
(floor2): Likewise.
(round2): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000.cc |  4 ++--
 gcc/config/rs6000/rs6000.h  |  1 +
 gcc/config/rs6000/rs6000.md | 14 +++---
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index e1399edadcfa..efab762d1d1d 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3922,7 +3922,7 @@ rs6000_option_override_internal (bool global_init_p)
 rs6000_isa_flags |= (ISA_2_5_MASKS_SERVER & ~ignore_masks);
   else if (TARGET_CMPB)
 rs6000_isa_flags |= (ISA_2_5_MASKS_EMBEDDED & ~ignore_masks);
-  else if (TARGET_FPRND)
+  else if (TARGET_POWER5X)
 rs6000_isa_flags |= (ISA_2_4_MASKS & ~ignore_masks);
   else if (TARGET_POWER5)
 rs6000_isa_flags |= (ISA_2_2_MASKS & ~ignore_masks);
@@ -3949,7 +3949,7 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_CRYPTO;
 }
 
-  if (!TARGET_FPRND && TARGET_VSX)
+  if (!TARGET_POWER5X && TARGET_VSX)
 {
   if (rs6000_isa_flags_explicit & OPTION_MASK_FPRND)
/* TARGET_VSX = 1 implies Power 7 and newer */
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 96de925cc0a0..9ef4a73d2739 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -501,6 +501,7 @@ extern int rs6000_vector_align[];
 
 /* Convert ISA bits like POPCNTB to PowerPC processors like POWER5.  */
 #define TARGET_POWER5  TARGET_POPCNTB
+#define TARGET_POWER5X TARGET_FPRND
 
 /* In switching from using target_flags to using rs6000_isa_flags, the options
machinery creates OPTION_MASK_ instead of MASK_.  The MASK_
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ec968f2a4def..e986e0e0a354 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5171,7 +5171,7 @@
(use (match_operand:SFDF 1 "gpc_reg_operand"))
(use (match_operand:SFDF 2 "gpc_reg_operand"))]
   "TARGET_HARD_FLOAT
-   && TARGET_FPRND
+   && TARGET_POWER5X
&& flag_unsafe_math_optimizations"
 {
   rtx div = gen_reg_rtx (mode);
@@ -5189,7 +5189,7 @@
(use (match_operand:SFDF 1 "gpc_reg_operand"))
(use (match_operand:SFDF 2 "gpc_reg_operand"))]
   "TARGET_HARD_FLOAT
-   && TARGET_FPRND
+   && TARGET_POWER5X
&& flag_unsafe_math_optimizations"
 {
   rtx div = gen_reg_rtx (mode);
@@ -6689,7 +6689,7 @@
 (define_insn "*friz"
   [(set (match_operand:DF 0 "gpc_reg_operand" "=d,wa")
(float:DF (fix:DI (match_operand:DF 1 "gpc_reg_operand" "d,wa"]
-  "TARGET_HARD_FLOAT && TARGET_FPRND
+  "TARGET_HARD_FLOAT && TARGET_POWER5X
&& flag_unsafe_math_optimizations && !flag_trapping_math && TARGET_FRIZ"
   "@
friz %0,%1
@@ -6817,7 +6817,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "d,wa")]
 UNSPEC_FRIZ))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "@
friz %0,%1
xsrdpiz %x0,%x1"
@@ -6827,7 +6827,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "d,wa")]
 UNSPEC_FRIP))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "@
frip %0,%1
xsrdpip %x0,%x1"
@@ -6837,7 +6837,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "d,wa")]
 UNSPEC_FRIM))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "@
frim %0,%1
xsrdpim %x0,%x1"
@@ -6848,7 +6848,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "")]
 UNSPEC_FRIN))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "frin %0,%1"
   [(set_attr "type" "fp")])

[gcc(refs/users/meissner/heads/work187)] Revert changes

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:1c9f81cb14a30838b60d4602fe15f68925d49b20

commit 1c9f81cb14a30838b60d4602fe15f68925d49b20
Author: Michael Meissner 
Date:   Fri Nov 22 18:06:53 2024 -0500

Revert changes

Diff:
---
 gcc/config.gcc  |   4 +-
 gcc/config/rs6000/aix71.h   |   1 -
 gcc/config/rs6000/aix72.h   |   1 -
 gcc/config/rs6000/aix73.h   |   1 -
 gcc/config/rs6000/driver-rs6000.cc  |   2 -
 gcc/config/rs6000/power10.md| 144 ++--
 gcc/config/rs6000/rs6000-c.cc   |   2 -
 gcc/config/rs6000/rs6000-cpus.def   |   5 --
 gcc/config/rs6000/rs6000-opts.h |   1 -
 gcc/config/rs6000/rs6000-tables.opt |  11 +--
 gcc/config/rs6000/rs6000.cc |  30 ++--
 gcc/config/rs6000/rs6000.h  |   1 -
 gcc/config/rs6000/rs6000.md |   2 +-
 gcc/config/rs6000/rs6000.opt|   6 --
 14 files changed, 87 insertions(+), 124 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index ea939bdef14b..c20817487457 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -541,7 +541,7 @@ powerpc*-*-*)
extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h si2vmx.h"
extra_headers="${extra_headers} amo.h"
case x$with_cpu in
-   xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower1[01]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500|xfuture)
+   xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower1[01]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500)
cpu_is_64bit=yes
;;
esac
@@ -5650,7 +5650,7 @@ case "${target}" in
tm_defines="${tm_defines} CONFIG_PPC405CR"
eval "with_$which=405"
;;
-   "" | common | native | future \
+   "" | common | native \
| power[3456789] | power1[01] | power5+ | power6x \
| powerpc | powerpc64 | powerpc64le \
| rs64 \
diff --git a/gcc/config/rs6000/aix71.h b/gcc/config/rs6000/aix71.h
index 505986b33d63..4350dcd89524 100644
--- a/gcc/config/rs6000/aix71.h
+++ b/gcc/config/rs6000/aix71.h
@@ -79,7 +79,6 @@ do { \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
-  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/aix72.h b/gcc/config/rs6000/aix72.h
index 242ca94bd065..fe59f8319b48 100644
--- a/gcc/config/rs6000/aix72.h
+++ b/gcc/config/rs6000/aix72.h
@@ -79,7 +79,6 @@ do { \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
-  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/aix73.h b/gcc/config/rs6000/aix73.h
index 2bd6b4bb3c4f..1318b0b3662d 100644
--- a/gcc/config/rs6000/aix73.h
+++ b/gcc/config/rs6000/aix73.h
@@ -79,7 +79,6 @@ do { \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
-  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/driver-rs6000.cc b/gcc/config/rs6000/driver-rs6000.cc
index 57199b3172f1..a054827b2e03 100644
--- a/gcc/config/rs6000/driver-rs6000.cc
+++ b/gcc/config/rs6000/driver-rs6000.cc
@@ -453,7 +453,6 @@ static const struct asm_name asm_names[] = {
   { "power9",  "-mpwr9" },
   { "power10", "-mpwr10" },
   { "power11", "-mpwr11" },
-  { "future",  "-mfuture" },
   { "powerpc", "-mppc" },
   { "rs64","-mppc" },
   { "603", "-m603" },
@@ -483,7 +482,6 @@ static const struct asm_name asm_names[] = {
   { "power9",  "-mpower9" },
   { "power10", "-mpower10" },
   { "power11", "-mpower11" },
-  { "future",  "-mfuture" },
   { "a2",  "-ma2" },
   { "powerpc", "-mppc" },
   { "powerpc64", "-mppc64" },
diff --git a/gcc/config/rs6000/power10.md b/gcc/config/rs6000/power10.md
index e42b057dc45b..2310c4603457 100644
--- a/gcc/config/rs6000/power10.md
+++ b/gcc/config/rs6000/power10.md
@@ -1,4 +1,4 @@
-;; Scheduling description for the IBM Power10, Power11, and Future processors.
+;; Scheduling description for the IBM Power10 and Power11 processors.
 ;; Copyright (C) 2020-2024 Free Software Foundation, Inc.
 ;;
 ;; Contributed by Pat Haugen (pthau...@us.ibm.com).
@@ -97,12 +97,12 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11,future"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-fused-load" 4
   (and (eq_attr "type" "fused_load_cmpi,fused_addi

[gcc/meissner/heads/work187-bugs] (30 commits) Merge commit 'refs/users/meissner/heads/work187-bugs' of gi

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-bugs' was updated to point to:

 53c441108184... Merge commit 'refs/users/meissner/heads/work187-bugs' of gi

It previously pointed to:

 0baa951aedfb... Add ChangeLog.bugs and update REVISION.

Diff:

Summary of changes (added commits):
---

  53c4411... Merge commit 'refs/users/meissner/heads/work187-bugs' of gi
  bf2895e... Add ChangeLog.bugs and update REVISION.
  ead1c9f... Update ChangeLog.* (*)
  992b9d6... Use architecture flags for defining _ARCH_PWR macros. (*)
  04169d3... Add rs6000 architecture masks. (*)
  e92d512... Do not allow -mvsx to boost processor to power7. (*)
  5e0614e... Use vector pair load/store for memcpy with -mcpu=future (*)
  0f14185... Add -mcpu=future tests. (*)
  33fa445... Add -mcpu=future tuning support. (*)
  7f0e7dc... Add support for -mcpu=future (*)
  1c9f81c... Revert changes (*)
  80c4fc9... Delete files (*)
  c004776... Add -mcpu=future tuning support. (*)
  590481f... Add support for -mcpu=future (*)
  2d9d590... Revert changes (*)
  f850d8d... Add -mcpu=future tuning support. (*)
  1f083f5... Add support for -mcpu=future (*)
  f9f717f... Revert changes (*)
  33234a6... Add -mcpu=future tuning support. (*)
  d59547a... Add support for -mcpu=future (*)
  b757938... Revert changes (*)
  e98175b... Use vector pair load/store for memcpy with -mcpu=future (*)
  6e5fb4c... Add -mcpu=future tests. (*)
  121e4ad... Add -mcpu=future tuning support. (*)
  90092f4... Add support for -mcpu=future (*)
  35fec69... Change TARGET_MODULO to TARGET_POWER9. (*)
  e1f8abc... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  8c9b979... Change TARGET_CMPB to TARGET_POWER6. (*)
  3fddd6e... Change TARGET_FPRND to TARGET_POWER5X. (*)
  38fdb83... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work187-bugs' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.

[gcc(refs/users/meissner/heads/work187-bugs)] Add ChangeLog.bugs and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:bf2895e6bb430e844af20c0e3a7939b53470577e

commit bf2895e6bb430e844af20c0e3a7939b53470577e
Author: Michael Meissner 
Date:   Fri Nov 22 17:30:50 2024 -0500

Add ChangeLog.bugs and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.bugs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.bugs | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
new file mode 100644
index ..833633875c55
--- /dev/null
+++ b/gcc/ChangeLog.bugs
@@ -0,0 +1,5 @@
+ Branch work187-bugs, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..1758d81c751c 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-bugs branch

[gcc(refs/users/meissner/heads/work187-dmf)] Update ChangeLog.*

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:bcf6b53cc356bf7e4e305225ab09da2eb45ea3fe

commit bcf6b53cc356bf7e4e305225ab09da2eb45ea3fe
Author: Michael Meissner 
Date:   Fri Nov 22 19:29:31 2024 -0500

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.dmf | 389 ++
 1 file changed, 389 insertions(+)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
index 587a023f88f5..b4e9986f5c5f 100644
--- a/gcc/ChangeLog.dmf
+++ b/gcc/ChangeLog.dmf
@@ -1,5 +1,394 @@
+ Branch work187-dmf, patch #121 was reverted 

+ Branch work187-dmf, patch #120 was reverted 

+
+ Branch work187-dmf, patch #111 
+
+RFC2655-Add saturating subtract built-ins.
+
+This patch adds support for a saturating subtract built-in function that may be
+added to a future PowerPC processor.  Note, if it is added, the name of the
+built-in function may change before GCC 13 is released.  If the name changes,
+we will submit a patch changing the name.
+
+I also added support for providing dense math built-in functions, even though
+at present, we have not added any new built-in functions for dense math.  It is
+likely we will want to add new dense math built-in functions as the dense math
+support is fleshed out.
+
+The patches have been tested on both little and big endian systems.  Can I check
+it into the master branch?
+
+2024-11-22   Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-builtin.cc (rs6000_invalid_builtin): Add support
+   for flagging invalid use of future built-in functions.
+   (rs6000_builtin_is_supported): Add support for future built-in
+   functions.
+   * config/rs6000/rs6000-builtins.def (__builtin_saturate_subtract32): New
+   built-in function for -mcpu=future.
+   (__builtin_saturate_subtract64): Likewise.
+   * config/rs6000/rs6000-gen-builtins.cc (enum bif_stanza): Add stanzas
+   for -mcpu=future built-ins.
+   (stanza_map): Likewise.
+   (enable_string): Likewise.
+   (struct attrinfo): Likewise.
+   (parse_bif_attrs): Likewise.
+   (write_decls): Likewise.
+   * config/rs6000/rs6000.md (sat_sub3): Add saturating subtract
+   built-in insn declarations.
+   (sat_sub3_dot): Likewise.
+   (sat_sub3_dot2): Likewise.
+   * doc/extend.texi (Future PowerPC built-ins): New section.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/subfus-1.c: New test.
+   * gcc.target/powerpc/subfus-2.c: Likewise.
+
+ Branch work187-dmf, patch #110 
+
+RFC2656-Support load/store vector with right length.
+
+This patch adds support for new instructions that may be added to the PowerPC
+architecture in the future to enhance the load and store vector with length
+instructions.
+
+The current instructions (lxvl, lxvll, stxvl, and stxvll) are inconvenient to use
+since the count for the number of bytes must be in the top 8 bits of the GPR
+register, instead of the bottom 8 bits.  This meant that code generating these
+instructions typically had to do a shift left by 56 bits to get the count into
+the right position.  In a future version of the PowerPC architecture, new
+variants of these instructions might be added that expect the count to be in
+the bottom 8 bits of the GPR register.  These patches add this support to GCC
+if the user uses the -mcpu=future option.
+
+I discovered that the code in rs6000-string.cc to generate ISA 3.1 lxvl/stxvl
+future lxvll/stxvll instructions would generate these instructions on 32-bit.
+However, the patterns for these instructions are only defined on 64-bit systems.  So
+I added a check for 64-bit support before generating the instructions.
+
+The patches have been tested on both little and big endian systems.  Can I check
+it into the master branch?
+
+2024-11-22   Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-string.cc (expand_block_move): Do not generate
+   lxvl and stxvl on 32-bit.
+   * config/rs6000/vsx.md (lxvl): If -mcpu=future, generate the lxvl with
+   the shift count automatically used in the insn.
+   (lxvrl): New insn for -mcpu=future.
+   (lxvrll): Likewise.
+   (stxvl): If -mcpu=future, generate the stxvl with the shift count
+   automatically used in the insn.
+   (stxvrl): New insn for -mcpu=future.
+   (stxvrll): Likewise.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/lxvrl.c: New test.
+   * lib/target-supports.exp (check_effective_target_powerpc_future_ok):
+   New effective target.
+
+ Branch work187-dmf, patch #104 
+
+RFC2653-PowerPC: Add support for 1,024 bit DMR registers.
+
+This patch is a preliminary patch to add the full 1,024 bit dense math registers
+(DMRs) for -mcpu=future.  The MMA 512-bit accumulators map onto the top of the
+DMR register.
+
+This patch only adds the new 1,024 bit register support.  It does not add
+su

[gcc(refs/users/meissner/heads/work187-vpair)] Vector pair support.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:b26e2e62f8cd7539e269e31516b169b531d17ca5

commit b26e2e62f8cd7539e269e31516b169b531d17ca5
Author: Michael Meissner 
Date:   Fri Nov 22 19:40:21 2024 -0500

Vector pair support.

This patch adds a new include file (vector-pair.h) that adds support so that
users writing high performance libraries can change their code to allow the
generation of the vector pair load and store instructions on power10.

The intention is that library authors who write special loops over arrays
can modify their code to use the functions provided here, so that those loops
take advantage of the higher bandwidth of the load vector pair and store
vector pair instructions.

This particular patch just adds a new include file (vector-pair.h) that
provides a bunch of functions that on a power10 system would use the vector
pair load operation, 2 floating point operations, and a vector pair store.
It does not add any new types, modes, or built-in functions.

I have additional patches that can add built-in functions that the functions
in vector-pair.h could utilize so that the compiler can optimize and combine
operations.  I may submit those patches in the future, but I would like to
provide this patch to allow the library writer to optimize their code.

I've measured the performance of these new functions on a power10.  For
default unrolling, the percentages for the 3 methods relative to the normal
vector loop method are:

116%    Vector-pair.h function, default unroll
 93%    Vector pair split built-in & 2 vector stores, default unroll
 86%    Vector pair split & combine built-ins, default unroll

Using explicit 2 way unrolling the numbers are:

114%    Vector-pair.h function, unroll 2
106%    Vector pair split built-in & 2 vector stores, unroll 2
 98%    Vector pair split & combine built-ins, unroll 2

These new functions provided in vector-pair.h use the vector pair load/store
instructions, and don't generate extra vector moves.  Using the existing
vector pair disassemble and assemble built-ins generates extra vector moves,
which can hinder performance.

If I compile the loop code for power9, there is a minor speed up for default
unrolling and more of an improvement using the framework provided in the
vector-pair.h for explicit unrolling by 2:

101%    Vector-pair.h function, default unroll for power9
107%    Vector-pair.h function, unroll 2 for power9

Of course this is a synthetic benchmark run on a quiet power10 system.
Results would vary for real code on real systems.  However, I feel adding
these functions can allow the writers of high performance libraries to better
optimize their code.

As an example, if the library wants to code a simple fused multiply-add loop,
they might write the code as follows:

#include <altivec.h>
#include <math.h>
#include <stddef.h>

void
fma_vector (double * __restrict__ r,
const double * __restrict__ a,
const double * __restrict__ b,
size_t n)
{
  vector double * __restrict__ vr = (vector double * __restrict__)r;
  const vector double * __restrict__ va = (const vector double * __restrict__)a;
  const vector double * __restrict__ vb = (const vector double * __restrict__)b;
  size_t num_elements = sizeof (vector double) / sizeof (double);
  size_t nv = n / num_elements;
  size_t i;

  for (i = 0; i < nv; i++)
vr[i] = __builtin_vsx_xvmadddp (va[i], vb[i], vr[i]);

  for (i = nv * num_elements; i < n; i++)
r[i] = fma (a[i], b[i], r[i]);
}

The inner loop would look like:

.L3:
lxvx 0,3,9
lxvx 12,4,9
addi 10,9,16
addi 2,2,-2
lxvx 11,5,9
xvmaddadp 0,12,11
lxvx 12,4,10
lxvx 11,5,10
stxvx 0,3,9
lxvx 0,3,10
addi 9,9,32
xvmaddadp 0,12,11
stxvx 0,3,10
bdnz .L3

Now if you code the loop to use __builtin_vsx_disassemble_pair to do a vector
pair load, but then do 2 vector stores:

#include 
#include 
#include 

void
fma_mma_ld (double * __restrict__ r,
const double * __restrict__ a,
const double * __restrict__ b,
size_t n)
{
  __vector_pair * __restrict__ vp_r

[gcc(refs/users/meissner/heads/work187-sha)] Add potential p-future XVRLD and XVRLDI instructions.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:974556b8e98f54108332000ae37a9250fa3c45c6

commit 974556b8e98f54108332000ae37a9250fa3c45c6
Author: Michael Meissner 
Date:   Fri Nov 22 19:47:25 2024 -0500

Add potential p-future XVRLD and XVRLDI instructions.

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/altivec.md (altivec_vrl): Add support for a
possible XVRLD instruction in the future.
(altivec_vrl_immediate): New insns.
* config/rs6000/predicates.md (vector_shift_immediate): New predicate.
* config/rs6000/rs6000.h (TARGET_XVRLW): New macro.
* config/rs6000/rs6000.md (isa attribute): Add xvrlw.
(enabled attribute): Add support for xvrlw.

gcc/testsuite/

* lib/target-supports.exp (check_effective_target_powerpc_future_ok):
New target.
(check_effective_target_powerpc_dense_math_ok): Likewise.
* gcc.target/powerpc/vector-rotate-left.c: New test.

Diff:
---
 gcc/config/rs6000/altivec.md   | 35 +++---
 gcc/config/rs6000/predicates.md| 26 
 gcc/config/rs6000/rs6000.h |  3 ++
 gcc/config/rs6000/rs6000.md|  6 +++-
 .../gcc.target/powerpc/vector-rotate-left.c| 34 +
 gcc/testsuite/lib/target-supports.exp  | 35 ++
 6 files changed, 134 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index b6a778ef6179..abe6130a94e3 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1982,12 +1982,39 @@
 }
   [(set_attr "type" "vecperm")])
 
+;; -mcpu=future adds a vector rotate left word variant.  There is no vector
+;; byte/half-word/double-word/quad-word rotate left.  This insn occurs before
+;; altivec_vrl and will match for -mcpu=future, while other cpus will
+;; match the generic insn.
+;; However for testing, allow other xvrl variants.  In particular, XVRLD for
+;; the sha3 tests for multibuf/singlebuf.
 (define_insn "altivec_vrl"
-  [(set (match_operand:VI2 0 "register_operand" "=v")
-(rotate:VI2 (match_operand:VI2 1 "register_operand" "v")
-   (match_operand:VI2 2 "register_operand" "v")))]
+  [(set (match_operand:VI2 0 "register_operand" "=v,wa")
+(rotate:VI2 (match_operand:VI2 1 "register_operand" "v,wa")
+   (match_operand:VI2 2 "register_operand" "v,wa")))]
   ""
-  "vrl %0,%1,%2"
+  "@
+   vrl %0,%1,%2
+   xvrl %x0,%x1,%x2"
+  [(set_attr "type" "vecsimple")
+   (set_attr "isa" "*,xvrlw")])
+
+(define_insn "*altivec_vrl_immediate"
+  [(set (match_operand:VI2 0 "register_operand" "=wa,wa,wa,wa")
+   (rotate:VI2 (match_operand:VI2 1 "register_operand" "wa,wa,wa,wa")
+   (match_operand:VI2 2 "vector_shift_immediate" "j,wM,wE,wS")))]
+  "TARGET_XVRLW && "
+{
+  rtx op2 = operands[2];
+  int value = 256;
+  int num_insns = -1;
+
+  if (!xxspltib_constant_p (op2, mode, &num_insns, &value))
+gcc_unreachable ();
+
+  operands[3] = GEN_INT (value & 0xff);
+  return "xvrli %x0,%x1,%3";
+}
   [(set_attr "type" "vecsimple")])
 
 (define_insn "altivec_vrlq"
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 1d95e34557e5..fccfbd7e4904 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -728,6 +728,32 @@
   return num_insns == 1;
 })
 
+;; Return 1 if the operand is a CONST_VECTOR whose elements are all the
+;; same and the elements can be an immediate shift or rotate factor
+(define_predicate "vector_shift_immediate"
+  (match_code "const_vector,vec_duplicate,const_int")
+{
+  int value = 256;
+  int num_insns = -1;
+
+  if (zero_constant (op, mode) || all_ones_constant (op, mode))
+return true;
+
+  if (!xxspltib_constant_p (op, mode, &num_insns, &value))
+return false;
+
+  switch (mode)
+{
+case V16QImode: return IN_RANGE (value, 0, 7);
+case V8HImode:  return IN_RANGE (value, 0, 15);
+case V4SImode:  return IN_RANGE (value, 0, 31);
+case V2DImode:  return IN_RANGE (value, 0, 63);
+default:break;
+}
+
+  return false;
+})
+  
 ;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
 ;; vector register without using memory.
 (define_predicate "easy_vector_constant"
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 15cfd3bf7b2d..bf76776f5de1 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -575,6 +575,9 @@ extern int rs6000_vector_align[];
below.  */
 #define RS6000_FN_TARGET_INFO_HTM 1
 
+/* Whether we have XVRLW support.  */
+#define TARGET_XVRLW   TARGET_FUTURE
+
 /* Whether the various reciprocal divide/square root estimate instructions
exist, and whether we should automatically generate code for the instruction
by default.  */
diff --git a/gcc/

[gcc(refs/users/meissner/heads/work187)] Add -mcpu=future tuning support.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:33fa445af265e1ca375073185206fdf4dd805794

commit 33fa445af265e1ca375073185206fdf4dd805794
Author: Michael Meissner 
Date:   Fri Nov 22 18:09:19 2024 -0500

Add -mcpu=future tuning support.

This patch makes -mtune=future use the same tuning decisions as -mtune=power11.

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/power10.md (all reservations): Add future as an
alternative to power10 and power11.

Diff:
---
 gcc/config/rs6000/power10.md | 144 +--
 1 file changed, 72 insertions(+), 72 deletions(-)

diff --git a/gcc/config/rs6000/power10.md b/gcc/config/rs6000/power10.md
index 2310c4603457..e42b057dc45b 100644
--- a/gcc/config/rs6000/power10.md
+++ b/gcc/config/rs6000/power10.md
@@ -1,4 +1,4 @@
-;; Scheduling description for the IBM Power10 and Power11 processors.
+;; Scheduling description for the IBM Power10, Power11, and Future processors.
 ;; Copyright (C) 2020-2024 Free Software Foundation, Inc.
 ;;
 ;; Contributed by Pat Haugen (pthau...@us.ibm.com).
@@ -97,12 +97,12 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-fused-load" 4
   (and (eq_attr "type" "fused_load_cmpi,fused_addis_load,fused_load_load")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-load" 4
@@ -110,13 +110,13 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-load-update" 4
   (and (eq_attr "type" "load")
(eq_attr "update" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-fpload-double" 4
@@ -124,7 +124,7 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-fpload-double" 4
@@ -132,14 +132,14 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-double" 4
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "64")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; SFmode loads are cracked and have additional 3 cycles over DFmode
@@ -148,27 +148,27 @@
   (and (eq_attr "type" "fpload")
(eq_attr "update" "no")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-single" 7
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-vecload" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 ; lxvp
 (define_insn_reservation "power10-vecload-pair" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; Store Unit
@@ -178,12 +178,12 @@
(eq_attr "prefixed" "no")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,STU_power10")
 
 (define_insn_reservation "power10-fused-store" 0
   (and (eq_attr "type" "fused_store_store")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 (define_insn_reservation "power10-prefixed-store" 0
@@ -191,52 +191,52 @@
(eq_attr "prefixed" "yes")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 ; Update forms have 2 cycle latency for updat

[gcc(refs/users/meissner/heads/work187)] Add support for -mcpu=future

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:7f0e7dc8d8e9e34a37401e88a88377bffa193ad7

commit 7f0e7dc8d8e9e34a37401e88a88377bffa193ad7
Author: Michael Meissner 
Date:   Fri Nov 22 18:08:01 2024 -0500

Add support for -mcpu=future

This patch adds the support that can be used in developing GCC support for
future PowerPC processors.

2024-11-22  Michael Meissner  

* config.gcc (powerpc*-*-*): Add support for --with-cpu=future.
* config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=future.
* config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/driver-rs6000.cc (asm_names): Likewise.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): If
-mcpu=future, define _ARCH_FUTURE.
* config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): New macro.
(POWERPC_MASKS): Add OPTION_MASK_FUTURE.
(future cpu): Define.
* config/rs6000/rs6000-opts.h (enum processor_type): Add
PROCESSOR_FUTURE.
* config/rs6000/rs6000-tables.opt: Regenerate.
* config/rs6000/rs6000.cc (power10_cost): Update comment.
(get_arch_flags): Add support for future processor.
(rs6000_option_override_internal): Likewise.
(rs6000_machine_from_flags): Likewise.
(rs6000_reassociation_width): Likewise.
(rs6000_adjust_cost): Likewise.
(rs6000_issue_rate): Likewise.
(rs6000_sched_reorder): Likewise.
(rs6000_sched_reorder2): Likewise.
(rs6000_register_move_cost): Likewise.
(rs6000_opt_masks): Add -mfuture.
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/rs6000.md (cpu attribute): Likewise.
* config/rs6000/rs6000.opt (-mfuture): New internal option.

Diff:
---
 gcc/config.gcc  |  4 ++--
 gcc/config/rs6000/aix71.h   |  1 +
 gcc/config/rs6000/aix72.h   |  1 +
 gcc/config/rs6000/aix73.h   |  1 +
 gcc/config/rs6000/driver-rs6000.cc  |  2 ++
 gcc/config/rs6000/rs6000-c.cc   |  2 ++
 gcc/config/rs6000/rs6000-cpus.def   |  5 +
 gcc/config/rs6000/rs6000-opts.h |  1 +
 gcc/config/rs6000/rs6000-tables.opt | 11 +++
 gcc/config/rs6000/rs6000.cc | 30 ++
 gcc/config/rs6000/rs6000.h  |  1 +
 gcc/config/rs6000/rs6000.md |  2 +-
 gcc/config/rs6000/rs6000.opt|  6 ++
 13 files changed, 52 insertions(+), 15 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index c20817487457..ea939bdef14b 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -541,7 +541,7 @@ powerpc*-*-*)
extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h si2vmx.h"
extra_headers="${extra_headers} amo.h"
case x$with_cpu in
-   xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower1[01]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500)
+   xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower1[01]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500|xfuture)
cpu_is_64bit=yes
;;
esac
@@ -5650,7 +5650,7 @@ case "${target}" in
tm_defines="${tm_defines} CONFIG_PPC405CR"
eval "with_$which=405"
;;
-   "" | common | native \
+   "" | common | native | future \
| power[3456789] | power1[01] | power5+ | power6x \
| powerpc | powerpc64 | powerpc64le \
| rs64 \
diff --git a/gcc/config/rs6000/aix71.h b/gcc/config/rs6000/aix71.h
index 4350dcd89524..505986b33d63 100644
--- a/gcc/config/rs6000/aix71.h
+++ b/gcc/config/rs6000/aix71.h
@@ -79,6 +79,7 @@ do {  \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/aix72.h b/gcc/config/rs6000/aix72.h
index fe59f8319b48..242ca94bd065 100644
--- a/gcc/config/rs6000/aix72.h
+++ b/gcc/config/rs6000/aix72.h
@@ -79,6 +79,7 @@ do {  \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/aix73.h b/gcc/config/rs6000/aix73.h
index 1318b0b3662d..2bd6b4bb3c4f 100644
--- a/gcc/config/rs6000/aix73.h
+++ b/gcc/config/rs6000/aix73.h
@@ -79,6 +79,7 @@ do {  \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=nati

[gcc r15-5602] [RISC-V][PR target/109279] Improve RISC-V constant synthesis

2024-11-22 Thread Jeff Law via Gcc-cvs

https://gcc.gnu.org/g:df2e832c90fe0915c0ab89e5c115bd0c6536c833

commit r15-5602-gdf2e832c90fe0915c0ab89e5c115bd0c6536c833
Author: Jeff Law 
Date:   Fri Nov 22 16:11:03 2024 -0700

[RISC-V][PR target/109279] Improve RISC-V constant synthesis

This is a small improvement to the constant synthesis code to capture a case
appended to PR 109279.

The case in question has the property that the high 32 bits have the value one
less than the low 32 bits and the highest bit in the low 32 bits is on.  The
example used in BZ is 0xcccccccccccccccd which comes up computing N/10.

When we construct a constant with bit 31 on, it gets implicitly sign extended.
So something like 0xcccccccd when constructed would generate
0xffffffffcccccccd.  The low bits are precisely what we want and the high bits
are a "-1".  Both properties are useful.

We left shift that value by 32 positions into a temporary and add that
temporary to the original value.  Concretely:

  0xffffffffcccccccd
+ 0xcccccccd00000000
  ------------------
  0xcccccccccccccccd
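
The arithmetic behind this synthesis can be checked with a small standalone
C sketch.  It is not the code in riscv_build_integer; the helper names
(sext32, matches_pattern, synthesize) are made up for illustration, and the
three steps only model what a sign-extending 32-bit constant load, a left
shift by 32 and an add would compute on rv64.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of V to 64 bits, modelling how a 32-bit
   constant with bit 31 set ends up in a 64-bit register.  Done in
   unsigned arithmetic so the wraparound is well defined.  */
static uint64_t
sext32 (uint64_t v)
{
  uint64_t lo = v & 0xffffffffULL;
  return (lo ^ 0x80000000ULL) - 0x80000000ULL;
}

/* Illustrative predicate: bit 31 is on and the high half is one less
   than the low half.  */
static int
matches_pattern (uint64_t value)
{
  uint32_t lo = (uint32_t) value;
  uint32_t hi = (uint32_t) (value >> 32);
  return (lo & 0x80000000u) != 0 && hi == lo - 1u;
}

/* Rebuild VALUE from its low half only: load, shift, add.  */
static uint64_t
synthesize (uint64_t value)
{
  assert (matches_pattern (value));
  uint64_t low = sext32 (value);  /* high half is all ones, i.e. -1 */
  uint64_t tmp = low << 32;       /* low half moved into the high half */
  return low + tmp;               /* high half becomes lo + (-1) = lo - 1 */
}
```

Calling synthesize on 0xcccccccccccccccd returns the same value, because
adding the all-ones high half to the shifted low half yields "low half minus
one" in the upper 32 bits, with the carry out of bit 63 discarded.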

Tested in my tester on rv32 and rv64, waiting on the pre-commit tester to do
its thing.

PR target/109279
gcc/
* config/riscv/riscv.cc (riscv_build_integer): Handle another 64-bit
synthesis where high half is one less than the low half and the 32-bit
sign bit is on.

gcc/testsuite/

* gcc.target/riscv/synthesis-16.c: New test.

Diff:
---
 gcc/config/riscv/riscv.cc | 28 +++
 gcc/testsuite/gcc.target/riscv/synthesis-16.c | 17 
 2 files changed, 45 insertions(+)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 93702f71ec9f..a25fdf89e445 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1315,6 +1315,34 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value,
  cost = alt_cost;
}
 
+  /* If bit31 is on and the upper constant is one less than the lower
+constant, then we can exploit sign extending nature of the lower
+half to trivially generate the upper half with an ADD.
+
+Not appropriate for ZBKB since that won't use "add"
+at codegen time.  */
+  if (!TARGET_ZBKB
+ && cost > 4
+ && bit31
+ && hival == loval - 1)
+   {
+ alt_cost = 2 + riscv_build_integer_1 (alt_codes,
+   sext_hwi (loval, 32), mode);
+ alt_codes[alt_cost - 3].save_temporary = true;
+ alt_codes[alt_cost - 2].code = ASHIFT;
+ alt_codes[alt_cost - 2].value = 32;
+ alt_codes[alt_cost - 2].use_uw = false;
+ alt_codes[alt_cost - 2].save_temporary = false;
+ /* This will turn into an ADD.  */
+ alt_codes[alt_cost - 1].code = CONCAT;
+ alt_codes[alt_cost - 1].value = 32;
+ alt_codes[alt_cost - 1].use_uw = false;
+ alt_codes[alt_cost - 1].save_temporary = false;
+
+ memcpy (codes, alt_codes, sizeof (alt_codes));
+ cost = alt_cost;
+   }
+
   if (cost > 4 && !bit31 && TARGET_ZBA)
{
  int value = 0;
diff --git a/gcc/testsuite/gcc.target/riscv/synthesis-16.c b/gcc/testsuite/gcc.target/riscv/synthesis-16.c
new file mode 100644
index ..352c48ec0374
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/synthesis-16.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* We aggressively skip as we really just need to test the basic synthesis
+   which shouldn't vary based on the optimization level.  -O1 seems to work
+   and eliminates the usual sources of extraneous dead code that would throw
+   off the counts.  */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-O2" "-O3" "-Os" "-Oz" "-flto" } } */
+/* { dg-options "-march=rv64gc" } */
+
+/* Rather than test for a specific synthesis of all these constants or
+   having thousands of tests each testing one variant, we just test the
+   total number of instructions.
+
+   This isn't expected to change much and any change is worthy of a look.  */
+/* { dg-final { scan-assembler-times "\\t(add|addi|bseti|li|pack|ret|sh1add|sh2add|sh3add|slli|srli|xori|or)" 5 } } */
+
+unsigned long foo_0xcccccccccccccccd(void) { return 0xcccccccccccccccdUL; }

[gcc(refs/users/meissner/heads/work187)] Add -mcpu=future tests.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:0f141859d18dcaf5e3260ad7f69b0127bd99b6d4

commit 0f141859d18dcaf5e3260ad7f69b0127bd99b6d4
Author: Michael Meissner 
Date:   Fri Nov 22 18:10:59 2024 -0500

Add -mcpu=future tests.

This patch adds simple tests for -mcpu=future.

2024-11-22  Michael Meissner  

gcc/testsuite/

* gcc.target/powerpc/future-1.c: New test.
* gcc.target/powerpc/future-2.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/future-1.c | 13 +
 gcc/testsuite/gcc.target/powerpc/future-2.c | 24 
 2 files changed, 37 insertions(+)

diff --git a/gcc/testsuite/gcc.target/powerpc/future-1.c b/gcc/testsuite/gcc.target/powerpc/future-1.c
new file mode 100644
index ..f1b940d7bebf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/future-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Basic check to see if the compiler supports -mcpu=future and if it defines
+   _ARCH_FUTURE.  */
+
+#ifndef _ARCH_FUTURE
+#error "-mcpu=future is not supported"
+#endif
+
+void foo (void)
+{
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/future-2.c b/gcc/testsuite/gcc.target/powerpc/future-2.c
new file mode 100644
index ..5552cefa3c2e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/future-2.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check if we can set the future target via a target attribute.  */
+
+__attribute__((__target__("cpu=power9")))
+void foo_p9 (void)
+{
+}
+
+__attribute__((__target__("cpu=power10")))
+void foo_p10 (void)
+{
+}
+
+__attribute__((__target__("cpu=power11")))
+void foo_p11 (void)
+{
+}
+
+__attribute__((__target__("cpu=future")))
+void foo_future (void)
+{
+}

[gcc(refs/users/meissner/heads/work187)] Use vector pair load/store for memcpy with -mcpu=future

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:5e0614e9e0d6feb683fa0e21ecdb2f9fbbc5907a

commit 5e0614e9e0d6feb683fa0e21ecdb2f9fbbc5907a
Author: Michael Meissner 
Date:   Fri Nov 22 18:12:10 2024 -0500

Use vector pair load/store for memcpy with -mcpu=future

In the development for the power10 processor, GCC did not enable using the
load vector pair and store vector pair instructions when optimizing things like
memory copy.  This patch enables using those instructions if -mcpu=future is
used.

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): Enable using
load vector pair and store vector pair instructions for memory copy
operations.
(POWERPC_MASKS): Make the bit for enabling using load vector pair and
store vector pair operations set and reset when the PowerPC processor is
changed.
* config/rs6000/rs6000.cc (rs6000_machine_from_flags): Disable
-mblock-ops-vector-pair from influencing .machine selection.

gcc/testsuite/

* gcc.target/powerpc/future-3.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000-cpus.def   |  4 +++-
 gcc/config/rs6000/rs6000.cc |  2 +-
 gcc/testsuite/gcc.target/powerpc/future-3.c | 22 ++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 354c1d8de4f0..2f189dd416ca 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -84,7 +84,8 @@
  | OPTION_MASK_POWER11)
 
 #define FUTURE_MASKS_SERVER(POWER11_MASKS_SERVER   \
-| OPTION_MASK_FUTURE)
+| OPTION_MASK_FUTURE   \
+| OPTION_MASK_BLOCK_OPS_VECTOR_PAIR)
 
 /* Flags that need to be turned off if -mno-vsx.  */
 #define OTHER_VSX_VECTOR_MASKS (OPTION_MASK_EFFICIENT_UNALIGNED_VSX\
@@ -114,6 +115,7 @@
 
 /* Mask of all options to set the default isa flags based on -mcpu=.  */
 #define POWERPC_MASKS  (OPTION_MASK_ALTIVEC\
+| OPTION_MASK_BLOCK_OPS_VECTOR_PAIR\
 | OPTION_MASK_CMPB \
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_DFP  \
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index ab26192b687d..35e42a80ac5c 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -5906,7 +5906,7 @@ rs6000_machine_from_flags (void)
 
   /* Disable the flags that should never influence the .machine selection.  */
   flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | OPTION_MASK_ISEL
-| OPTION_MASK_ALTIVEC);
+| OPTION_MASK_ALTIVEC | OPTION_MASK_BLOCK_OPS_VECTOR_PAIR);
 
   if ((flags & (FUTURE_MASKS_SERVER & ~ISA_3_1_MASKS_SERVER)) != 0)
 return "future";
diff --git a/gcc/testsuite/gcc.target/powerpc/future-3.c b/gcc/testsuite/gcc.target/powerpc/future-3.c
new file mode 100644
index ..afa8b96d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/future-3.c
@@ -0,0 +1,22 @@
+/* 32-bit doesn't generate vector pair instructions.  */
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Test to see that memcpy will use load/store vector pair with
+   -mcpu=future.  */
+
+#ifndef SIZE
+#define SIZE 4
+#endif
+
+extern vector double to[SIZE], from[SIZE];
+
+void
+copy (void)
+{
+  __builtin_memcpy (to, from, sizeof (to));
+  return;
+}
+
+/* { dg-final { scan-assembler {\mlxvpx?\M}  } } */
+/* { dg-final { scan-assembler {\mstxvpx?\M} } } */

[gcc(refs/users/meissner/heads/work187)] Add rs6000 architecture masks.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:04169d353394ef9cab74e43a43d4fed3a3f7d9b3

commit 04169d353394ef9cab74e43a43d4fed3a3f7d9b3
Author: Michael Meissner 
Date:   Fri Nov 22 18:19:09 2024 -0500

Add rs6000 architecture masks.

This patch begins the journey to move architecture bits that are not user ISA
options from rs6000_isa_flags to a new target variable rs6000_arch_flags.  The
intention is to remove switches that are currently ISA options but that the
user should not be using directly.  For example, we want users to use
-mcpu=power10 and not just -mpower10.

This patch also changes the target_clones support to use an architecture mask
instead of ISA bits.

This patch also switches the handling of .machine to use architecture masks if
they exist (power4 through power11).  All of the other PowerPCs will continue
to use the existing code for setting the .machine option.
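
As a rough illustration of the architecture-mask idea, the sketch below builds
one bit per architecture from a single list, in the spirit of a .def file that
is expanded several times.  Everything in it is hypothetical: the list
contents, the ARCH_ENUM_* and ARCH_MASK_* names, and the subset test in
can_inline_p are assumptions made for the example and are not copied from
rs6000-arch.def or rs6000.cc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical architecture list standing in for a .def file.  */
#define ARCH_LIST(X) \
  X (POWER8)         \
  X (POWER9)         \
  X (POWER10)        \
  X (POWER11)

/* First expansion: one bit position per architecture.  */
#define X(NAME) ARCH_ENUM_##NAME,
enum arch_bit { ARCH_LIST (X) ARCH_ENUM_LAST };
#undef X

static_assert (ARCH_ENUM_LAST <= 64, "masks must fit in 64 bits");

/* Second expansion: one mask constant per architecture.  */
#define X(NAME) \
  static const uint64_t ARCH_MASK_##NAME = UINT64_C (1) << ARCH_ENUM_##NAME;
ARCH_LIST (X)
#undef X

/* Third expansion: a table of names and masks, e.g. for debug output.  */
struct arch_mask { const char *name; uint64_t mask; };
#define X(NAME) { #NAME, UINT64_C (1) << ARCH_ENUM_##NAME },
static const struct arch_mask arch_masks[] = { ARCH_LIST (X) };
#undef X

/* Assumed inlining rule: every architecture bit the callee needs must
   also be present in the caller.  */
static bool
can_inline_p (uint64_t caller_arch, uint64_t callee_arch)
{
  return (callee_arch & ~caller_arch) == 0;
}
```

With these definitions a caller whose flags include the POWER10 bit may inline
a callee that only needs POWER9, but not the other way around.  Keeping the
list in one place means the bit positions, masks and name table cannot drift
apart.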

I have built both big endian and little endian bootstrap compilers and there
were no regressions.

In addition, I constructed a test case that used every architecture define
(like _ARCH_PWR4, etc.) and I also looked at the .machine directive generated.
I ran this test for all supported combinations of -mcpu, big/little endian,
and 32/64 bit support.  Every single instance generated exactly the same code
with the patches installed compared to the compiler before installing the
patches.

The only difference in this patch compared to the first version posted on
November 6th is that I now use the correct attribution and copyright year
(i.e. that I created rs6000-arch.def in 2024).

Can I install this patch on the GCC 15 trunk?

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/default64.h (TARGET_CPU_DEFAULT): Set default cpu name.
* config/rs6000/rs6000-arch.def: New file.
* config/rs6000/rs6000.cc (struct clone_map): Switch to using
architecture masks instead of ISA masks.
(rs6000_clone_map): Likewise.
(rs6000_print_isa_options): Add an architecture flags argument, change
all callers.
(get_arch_flag): New function.
(rs6000_debug_reg_global): Update rs6000_print_isa_options calls.
(rs6000_option_override_internal): Likewise.
(rs6000_machine_from_flags): Switch to using architecture masks instead
of ISA masks.
(struct rs6000_arch_mask): New structure.
(rs6000_arch_masks): New table of architecture masks and names.
(rs6000_function_specific_save): Save architecture flags.
(rs6000_function_specific_restore): Restore architecture flags.
(rs6000_function_specific_print): Update rs6000_print_isa_options calls.
(rs6000_print_options_internal): Add architecture flags options.
(rs6000_clone_priority): Switch to using architecture masks instead of
ISA masks.
(rs6000_can_inline_p): Don't allow inlining if the callee requires a
newer architecture than the caller.
* config/rs6000/rs6000.h: Use rs6000-arch.def to create the architecture
masks.
* config/rs6000/rs6000.opt (rs6000_arch_flags): New target variable.
(x_rs6000_arch_flags): New save/restore field for rs6000_arch_flags.

Diff:
---
 gcc/config/rs6000/default64.h |  11 ++
 gcc/config/rs6000/rs6000-arch.def |  49 +
 gcc/config/rs6000/rs6000.cc   | 222 +++---
 gcc/config/rs6000/rs6000.h|  24 +
 gcc/config/rs6000/rs6000.opt  |   8 ++
 5 files changed, 277 insertions(+), 37 deletions(-)

diff --git a/gcc/config/rs6000/default64.h b/gcc/config/rs6000/default64.h
index 10e3dec78aca..afa6542e040c 100644
--- a/gcc/config/rs6000/default64.h
+++ b/gcc/config/rs6000/default64.h
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 #define RS6000_CPU(NAME, CPU, FLAGS)
 #include "rs6000-cpus.def"
 #undef RS6000_CPU
+#undef TARGET_CPU_DEFAULT
 
 #if (TARGET_DEFAULT & MASK_LITTLE_ENDIAN)
 #undef TARGET_DEFAULT
@@ -28,10 +29,20 @@ along with GCC; see the file COPYING3.  If not see
| MASK_LITTLE_ENDIAN)
 #undef ASM_DEFAULT_SPEC
 #define ASM_DEFAULT_SPEC "-mpower8"
+#define TARGET_CPU_DEFAULT "power8"
+
 #else
 #undef TARGET_DEFAULT
 #define TARGET_DEFAULT (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT \
| OPTION_MASK_MFCRF | MASK_POWERPC64 | MASK_64BIT)
 #undef ASM_DEFAULT_SPEC
 #define ASM_DEFAULT_SPEC "-mpower4"
+
+#if (TARGET_DEFAULT & MASK_POWERPC64)
+#define TARGET_CPU_DEFAULT "powerpc64"
+
+#else
+#define TARGET_CPU_DEFAULT "powerpc"
+#endif
+
 #endif
diff --git a/gcc/config/rs6000/rs6000-arch.def b/gcc/config/rs6000/rs6000-arch.def
new file mode 100644
index ..c0dbc5834333
--- /dev/null
+++ b/gcc/config/rs6000/rs6000-arch.def
@

[gcc(refs/users/meissner/heads/work187-libs)] Add ChangeLog.libs and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:f2975fae61a15c72e612328396c6d60463886632

commit f2975fae61a15c72e612328396c6d60463886632
Author: Michael Meissner 
Date:   Fri Nov 22 17:31:41 2024 -0500

Add ChangeLog.libs and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.libs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.libs | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.libs b/gcc/ChangeLog.libs
new file mode 100644
index ..220098bf5ca8
--- /dev/null
+++ b/gcc/ChangeLog.libs
@@ -0,0 +1,5 @@
+ Branch work187-libs, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..5b267ef1d74f 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-libs branch

[gcc(refs/users/meissner/heads/work187-libs)] Merge commit 'refs/users/meissner/heads/work187-libs' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:6a065a2305e3a43e2fa8eb3c22d6e3cbbdc9fe4b

commit 6a065a2305e3a43e2fa8eb3c22d6e3cbbdc9fe4b
Merge: f2975fae61a1 c4985fe24c9a
Author: Michael Meissner 
Date:   Fri Nov 22 18:29:20 2024 -0500

Merge commit 'refs/users/meissner/heads/work187-libs' of 
git+ssh://gcc.gnu.org/git/gcc into me/work187-libs

Diff:

[gcc/meissner/heads/work187-libs] (30 commits) Merge commit 'refs/users/meissner/heads/work187-libs' of gi

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-libs' was updated to point to:

 6a065a2305e3... Merge commit 'refs/users/meissner/heads/work187-libs' of gi

It previously pointed to:

 c4985fe24c9a... Add ChangeLog.libs and update REVISION.

Diff:

Summary of changes (added commits):
---

  6a065a2... Merge commit 'refs/users/meissner/heads/work187-libs' of gi
  f2975fa... Add ChangeLog.libs and update REVISION.
  ead1c9f... Update ChangeLog.* (*)
  992b9d6... Use architecture flags for defining _ARCH_PWR macros. (*)
  04169d3... Add rs6000 architecture masks. (*)
  e92d512... Do not allow -mvsx to boost processor to power7. (*)
  5e0614e... Use vector pair load/store for memcpy with -mcpu=future (*)
  0f14185... Add -mcpu=future tests. (*)
  33fa445... Add -mcpu=future tuning support. (*)
  7f0e7dc... Add support for -mcpu=future (*)
  1c9f81c... Revert changes (*)
  80c4fc9... Delete files (*)
  c004776... Add -mcpu=future tuning support. (*)
  590481f... Add support for -mcpu=future (*)
  2d9d590... Revert changes (*)
  f850d8d... Add -mcpu=future tuning support. (*)
  1f083f5... Add support for -mcpu=future (*)
  f9f717f... Revert changes (*)
  33234a6... Add -mcpu=future tuning support. (*)
  d59547a... Add support for -mcpu=future (*)
  b757938... Revert changes (*)
  e98175b... Use vector pair load/store for memcpy with -mcpu=future (*)
  6e5fb4c... Add -mcpu=future tests. (*)
  121e4ad... Add -mcpu=future tuning support. (*)
  90092f4... Add support for -mcpu=future (*)
  35fec69... Change TARGET_MODULO to TARGET_POWER9. (*)
  e1f8abc... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  8c9b979... Change TARGET_CMPB to TARGET_POWER6. (*)
  3fddd6e... Change TARGET_FPRND to TARGET_POWER5X. (*)
  38fdb83... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work187-libs' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.

[gcc(refs/users/meissner/heads/work187-dmf)] Add ChangeLog.dmf and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:11e8e6de1f36c6bdd1d23cc1645d69aa3afa7e01

commit 11e8e6de1f36c6bdd1d23cc1645d69aa3afa7e01
Author: Michael Meissner 
Date:   Fri Nov 22 17:29:03 2024 -0500

Add ChangeLog.dmf and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.dmf: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.dmf | 5 +
 gcc/REVISION  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
new file mode 100644
index ..587a023f88f5
--- /dev/null
+++ b/gcc/ChangeLog.dmf
@@ -0,0 +1,5 @@
+ Branch work187-dmf, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..ac085fee6e80 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-dmf branch

[gcc(refs/users/meissner/heads/work187-dmf)] Merge commit 'refs/users/meissner/heads/work187-dmf' of git+ssh://gcc.gnu.org/git/gcc into me/work18

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:0acc082469e3bfc73d259ce1ccf273fa7f6b7e45

commit 0acc082469e3bfc73d259ce1ccf273fa7f6b7e45
Merge: 11e8e6de1f36 8eada357ce16
Author: Michael Meissner 
Date:   Fri Nov 22 18:28:14 2024 -0500

Merge commit 'refs/users/meissner/heads/work187-dmf' of 
git+ssh://gcc.gnu.org/git/gcc into me/work187-dmf

Diff:

[gcc/meissner/heads/work187-dmf] (30 commits) Merge commit 'refs/users/meissner/heads/work187-dmf' of git

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-dmf' was updated to point to:

 0acc082469e3... Merge commit 'refs/users/meissner/heads/work187-dmf' of git

It previously pointed to:

 8eada357ce16... Add ChangeLog.dmf and update REVISION.

Diff:

Summary of changes (added commits):
---

  0acc082... Merge commit 'refs/users/meissner/heads/work187-dmf' of git
  11e8e6d... Add ChangeLog.dmf and update REVISION.
  ead1c9f... Update ChangeLog.* (*)
  992b9d6... Use architecture flags for defining _ARCH_PWR macros. (*)
  04169d3... Add rs6000 architecture masks. (*)
  e92d512... Do not allow -mvsx to boost processor to power7. (*)
  5e0614e... Use vector pair load/store for memcpy with -mcpu=future (*)
  0f14185... Add -mcpu=future tests. (*)
  33fa445... Add -mcpu=future tuning support. (*)
  7f0e7dc... Add support for -mcpu=future (*)
  1c9f81c... Revert changes (*)
  80c4fc9... Delete files (*)
  c004776... Add -mcpu=future tuning support. (*)
  590481f... Add support for -mcpu=future (*)
  2d9d590... Revert changes (*)
  f850d8d... Add -mcpu=future tuning support. (*)
  1f083f5... Add support for -mcpu=future (*)
  f9f717f... Revert changes (*)
  33234a6... Add -mcpu=future tuning support. (*)
  d59547a... Add support for -mcpu=future (*)
  b757938... Revert changes (*)
  e98175b... Use vector pair load/store for memcpy with -mcpu=future (*)
  6e5fb4c... Add -mcpu=future tests. (*)
  121e4ad... Add -mcpu=future tuning support. (*)
  90092f4... Add support for -mcpu=future (*)
  35fec69... Change TARGET_MODULO to TARGET_POWER9. (*)
  e1f8abc... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  8c9b979... Change TARGET_CMPB to TARGET_POWER6. (*)
  3fddd6e... Change TARGET_FPRND to TARGET_POWER5X. (*)
  38fdb83... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work187-dmf' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.

[gcc(refs/users/meissner/heads/work187-bugs)] Merge commit 'refs/users/meissner/heads/work187-bugs' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:53c4411081843b37ee13d4ab8562b7be40b60782

commit 53c4411081843b37ee13d4ab8562b7be40b60782
Merge: bf2895e6bb43 0baa951aedfb
Author: Michael Meissner 
Date:   Fri Nov 22 18:27:02 2024 -0500

Merge commit 'refs/users/meissner/heads/work187-bugs' of 
git+ssh://gcc.gnu.org/git/gcc into me/work187-bugs

Diff:

[gcc/meissner/heads/work187-sha] (30 commits) Merge commit 'refs/users/meissner/heads/work187-sha' of git

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-sha' was updated to point to:

 feb964f1cc66... Merge commit 'refs/users/meissner/heads/work187-sha' of git

It previously pointed to:

 35b42ee4ace9... Add ChangeLog.sha and update REVISION.

Diff:

Summary of changes (added commits):
---

  feb964f... Merge commit 'refs/users/meissner/heads/work187-sha' of git
  a7b21e3... Add ChangeLog.sha and update REVISION.
  ead1c9f... Update ChangeLog.* (*)
  992b9d6... Use architecture flags for defining _ARCH_PWR macros. (*)
  04169d3... Add rs6000 architecture masks. (*)
  e92d512... Do not allow -mvsx to boost processor to power7. (*)
  5e0614e... Use vector pair load/store for memcpy with -mcpu=future (*)
  0f14185... Add -mcpu=future tests. (*)
  33fa445... Add -mcpu=future tuning support. (*)
  7f0e7dc... Add support for -mcpu=future (*)
  1c9f81c... Revert changes (*)
  80c4fc9... Delete files (*)
  c004776... Add -mcpu=future tuning support. (*)
  590481f... Add support for -mcpu=future (*)
  2d9d590... Revert changes (*)
  f850d8d... Add -mcpu=future tuning support. (*)
  1f083f5... Add support for -mcpu=future (*)
  f9f717f... Revert changes (*)
  33234a6... Add -mcpu=future tuning support. (*)
  d59547a... Add support for -mcpu=future (*)
  b757938... Revert changes (*)
  e98175b... Use vector pair load/store for memcpy with -mcpu=future (*)
  6e5fb4c... Add -mcpu=future tests. (*)
  121e4ad... Add -mcpu=future tuning support. (*)
  90092f4... Add support for -mcpu=future (*)
  35fec69... Change TARGET_MODULO to TARGET_POWER9. (*)
  e1f8abc... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  8c9b979... Change TARGET_CMPB to TARGET_POWER6. (*)
  3fddd6e... Change TARGET_FPRND to TARGET_POWER5X. (*)
  38fdb83... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work187-sha' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.

[gcc(refs/users/meissner/heads/work187-sha)] Add ChangeLog.sha and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:a7b21e3ba67ad26d4091b15e36ffbc1cdbb38da2

commit a7b21e3ba67ad26d4091b15e36ffbc1cdbb38da2
Author: Michael Meissner 
Date:   Fri Nov 22 17:32:35 2024 -0500

Add ChangeLog.sha and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.sha: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.sha | 5 +
 gcc/REVISION  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.sha b/gcc/ChangeLog.sha
new file mode 100644
index ..90fefd775f8b
--- /dev/null
+++ b/gcc/ChangeLog.sha
@@ -0,0 +1,5 @@
+ Branch work187-sha, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..bfa2656bef94 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-sha branch

[gcc(refs/users/meissner/heads/work187-sha)] Merge commit 'refs/users/meissner/heads/work187-sha' of git+ssh://gcc.gnu.org/git/gcc into me/work18

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:feb964f1cc666f3a490ac06b642efb5d9a8523da

commit feb964f1cc666f3a490ac06b642efb5d9a8523da
Merge: a7b21e3ba67a 35b42ee4ace9
Author: Michael Meissner 
Date:   Fri Nov 22 18:30:28 2024 -0500

Merge commit 'refs/users/meissner/heads/work187-sha' of 
git+ssh://gcc.gnu.org/git/gcc into me/work187-sha

Diff:

[gcc(refs/users/meissner/heads/work187-test)] Add ChangeLog.test and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:558cb254aa5c55eabc4f85d03f58f7f8e80b91e0

commit 558cb254aa5c55eabc4f85d03f58f7f8e80b91e0
Author: Michael Meissner 
Date:   Fri Nov 22 17:33:30 2024 -0500

Add ChangeLog.test and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.test: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.test | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
new file mode 100644
index ..88a8ed1c0e0f
--- /dev/null
+++ b/gcc/ChangeLog.test
@@ -0,0 +1,5 @@
+ Branch work187-test, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..666c8ae3062f 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-test branch

[gcc(refs/users/meissner/heads/work187-test)] Merge commit 'refs/users/meissner/heads/work187-test' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:8aa4e532441e28f2494d59cfb6830ce6ef101413

commit 8aa4e532441e28f2494d59cfb6830ce6ef101413
Merge: 558cb254aa5c 01087ae644a3
Author: Michael Meissner 
Date:   Fri Nov 22 18:31:59 2024 -0500

Merge commit 'refs/users/meissner/heads/work187-test' of 
git+ssh://gcc.gnu.org/git/gcc into me/work187-test

Diff:

[gcc/meissner/heads/work187-test] (30 commits) Merge commit 'refs/users/meissner/heads/work187-test' of gi

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-test' was updated to point to:

 8aa4e532441e... Merge commit 'refs/users/meissner/heads/work187-test' of gi

It previously pointed to:

 01087ae644a3... Add ChangeLog.test and update REVISION.

Diff:

Summary of changes (added commits):
---

  8aa4e53... Merge commit 'refs/users/meissner/heads/work187-test' of gi
  558cb25... Add ChangeLog.test and update REVISION.
  ead1c9f... Update ChangeLog.* (*)
  992b9d6... Use architecture flags for defining _ARCH_PWR macros. (*)
  04169d3... Add rs6000 architecture masks. (*)
  e92d512... Do not allow -mvsx to boost processor to power7. (*)
  5e0614e... Use vector pair load/store for memcpy with -mcpu=future (*)
  0f14185... Add -mcpu=future tests. (*)
  33fa445... Add -mcpu=future tuning support. (*)
  7f0e7dc... Add support for -mcpu=future (*)
  1c9f81c... Revert changes (*)
  80c4fc9... Delete files (*)
  c004776... Add -mcpu=future tuning support. (*)
  590481f... Add support for -mcpu=future (*)
  2d9d590... Revert changes (*)
  f850d8d... Add -mcpu=future tuning support. (*)
  1f083f5... Add support for -mcpu=future (*)
  f9f717f... Revert changes (*)
  33234a6... Add -mcpu=future tuning support. (*)
  d59547a... Add support for -mcpu=future (*)
  b757938... Revert changes (*)
  e98175b... Use vector pair load/store for memcpy with -mcpu=future (*)
  6e5fb4c... Add -mcpu=future tests. (*)
  121e4ad... Add -mcpu=future tuning support. (*)
  90092f4... Add support for -mcpu=future (*)
  35fec69... Change TARGET_MODULO to TARGET_POWER9. (*)
  e1f8abc... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  8c9b979... Change TARGET_CMPB to TARGET_POWER6. (*)
  3fddd6e... Change TARGET_FPRND to TARGET_POWER5X. (*)
  38fdb83... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work187-test' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.

[gcc(refs/users/meissner/heads/work187-vpair)] Add ChangeLog.vpair and update REVISION.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:76a808a46caf1e0d1becb24c5c456481bc03ba3a

commit 76a808a46caf1e0d1becb24c5c456481bc03ba3a
Author: Michael Meissner 
Date:   Fri Nov 22 17:29:56 2024 -0500

Add ChangeLog.vpair and update REVISION.

2024-11-22  Michael Meissner  

gcc/

* ChangeLog.vpair: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.vpair | 5 +
 gcc/REVISION| 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.vpair b/gcc/ChangeLog.vpair
new file mode 100644
index ..e16bcc29d564
--- /dev/null
+++ b/gcc/ChangeLog.vpair
@@ -0,0 +1,5 @@
+ Branch work187-vpair, baseline 
+
+2024-11-22   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 4bb9f2daed50..76e807270b55 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work187 branch
+work187-vpair branch

[gcc(refs/users/meissner/heads/work187-vpair)] Merge commit 'refs/users/meissner/heads/work187-vpair' of git+ssh://gcc.gnu.org/git/gcc into me/work

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:ad3ce883d0176db5bda2c759553eff467108b792

commit ad3ce883d0176db5bda2c759553eff467108b792
Merge: 76a808a46caf f0fa789ddba8
Author: Michael Meissner 
Date:   Fri Nov 22 18:33:13 2024 -0500

Merge commit 'refs/users/meissner/heads/work187-vpair' of git+ssh://gcc.gnu.org/git/gcc into me/work187-vpair

Diff:

[gcc(refs/users/meissner/heads/work187-dmf)] RFC2686-Add paddis support.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:77528b14894484e0f1b9c59d812672494b10a31a

commit 77528b14894484e0f1b9c59d812672494b10a31a
Author: Michael Meissner 
Date:   Fri Nov 22 18:49:57 2024 -0500

RFC2686-Add paddis support.

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/constraints.md (eU): New constraint.
(eV): Likewise.
* config/rs6000/predicates.md (paddis_operand): New predicate.
(paddis_paddi_operand): Likewise.
(add_operand): Add paddis support.
* config/rs6000/rs6000.cc (num_insns_constant_gpr): Add paddis support.
(num_insns_constant_multi): Likewise.
(print_operand): Add %B for paddis support.
* config/rs6000/rs6000.h (TARGET_PADDIS): New macro.
(SIGNED_INTEGER_32BIT_P): Likewise.
* config/rs6000/rs6000.md (isa attribute): Add paddis support.
(enabled attribute): Likewise.
(add3): Likewise.
(adddi3 splitter): New splitter for paddis.
(movdi_internal64): Add paddis support.
(movdi splitter): New splitter for paddis.

gcc/testsuite/

* gcc.target/powerpc/prefixed-addis.c: New test.
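
The two predicates in the diff below sort 64-bit integer constants into those that a single paddis can handle and those that need paddis plus a second add.  A minimal C sketch of that classification follows.  It assumes, based on the comments in the predicates, that constants fitting in a signed 34-bit field are left to addi/addis/paddi and that paddis alone only covers values whose low 32 bits are zero.  The enum and function names are hypothetical and are not part of GCC, and the TARGET_PADDIS and TARGET_POWERPC64 checks are left out.

```c
#include <stdint.h>

/* Hypothetical classification names; GCC itself expresses this through
   the eU and eV constraints.  */
enum add_const_kind
{
  CONST_FITS_34BIT,    /* addi, addis, or paddi can handle it.  */
  CONST_PADDIS,        /* paddis alone (eU in the patch).  */
  CONST_PADDIS_PADDI   /* paddis plus a second add (eV in the patch).  */
};

/* Return nonzero if VALUE fits in a signed 34-bit field.  */
static int
fits_signed_34bit (int64_t value)
{
  const int64_t limit = (int64_t) 1 << 33;
  return value >= -limit && value < limit;
}

/* Classify VALUE the way the sketch above describes.  */
static enum add_const_kind
classify_add_constant (int64_t value)
{
  if (fits_signed_34bit (value))
    return CONST_FITS_34BIT;

  /* Low 32 bits all zero: one paddis is enough.  */
  if (((uint64_t) value & UINT64_C (0xffffffff)) == 0)
    return CONST_PADDIS;

  return CONST_PADDIS_PADDI;
}
```

The sketch uses range comparisons rather than shifts so that it does not depend on how the host compiler shifts negative values.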

Diff:
---
 gcc/config/rs6000/constraints.md  | 10 +++
 gcc/config/rs6000/predicates.md   | 52 +++-
 gcc/config/rs6000/rs6000.cc   | 25 ++
 gcc/config/rs6000/rs6000.h|  4 +
 gcc/config/rs6000/rs6000.md   | 96 ---
 gcc/testsuite/gcc.target/powerpc/prefixed-addis.c | 24 ++
 6 files changed, 197 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 277a30a82458..4d8d21fd6bbb 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -222,6 +222,16 @@
   "An IEEE 128-bit constant that can be loaded into VSX registers."
   (match_operand 0 "easy_vector_constant_ieee128"))
 
+(define_constraint "eU"
+  "@internal integer constant that can be loaded with paddis"
+  (and (match_code "const_int")
+   (match_operand 0 "paddis_operand")))
+
+(define_constraint "eV"
+  "@internal integer constant that can be loaded with paddis + paddi"
+  (and (match_code "const_int")
+   (match_operand 0 "paddis_paddi_operand")))
+
 ;; Floating-point constraints.  These two are defined so that insn
 ;; length attributes can be calculated exactly.
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 2797c3cf619b..f8e7df5e7f5b 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -369,6 +369,53 @@
   return SIGNED_INTEGER_34BIT_P (INTVAL (op));
 })
 
+;; Return 1 if op is a 64-bit constant that uses the paddis instruction
+(define_predicate "paddis_operand"
+  (match_code "const_int")
+{
+  if (!TARGET_PADDIS && TARGET_POWERPC64)
+return 0;
+
+  /* If addi, addis, or paddi can handle the number, don't return true.  */
+  HOST_WIDE_INT value = INTVAL (op);
+  if (SIGNED_INTEGER_34BIT_P (value))
+return false;
+
+  /* If the number is too large for padds, return false.  */
+  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
+return false;
+
+  /* If the bottom 32-bits are non-zero, paddis can't handle it.  */
+  if ((value & HOST_WIDE_INT_C(0x)) != 0)
+return false;
+
+  return true;
+})
+
+;; Return 1 if op is a 64-bit constant that needs the paddis instruction and an
+;; addi/addis/paddi instruction combination.
+(define_predicate "paddis_paddi_operand"
+  (match_code "const_int")
+{
+  if (!TARGET_PADDIS && TARGET_POWERPC64)
+return 0;
+
+  /* If addi, addis, or paddi can handle the number, don't return true.  */
+  HOST_WIDE_INT value = INTVAL (op);
+  if (SIGNED_INTEGER_34BIT_P (value))
+return false;
+
+  /* If the number is too large for padds, return false.  */
+  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
+return false;
+
+  /* If the bottom 32-bits are zero, we can use paddis alone to handle it.  */
+  if ((value & HOST_WIDE_INT_C(0x)) == 0)
+return false;
+
+  return true;
+})
+
 ;; Return 1 if op is a register that is not special.
 ;; Disallow (SUBREG:SF (REG:SI)) and (SUBREG:SI (REG:SF)) on VSX systems where
 ;; you need to be careful in moving a SFmode to SImode and vice versa due to
@@ -1113,7 +1160,10 @@
   (if_then_else (match_code "const_int")
 (match_test "satisfies_constraint_I (op)
 || satisfies_constraint_L (op)
-|| satisfies_constraint_eI (op)")
+|| satisfies_constraint_eI (op)
+|| satisfies_constraint_eU (op)
+|| satisfies_constraint_eV (op)")
+
 (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 909b675709a4..a11695c4f181 100644
--- a/gcc/con

[gcc(refs/users/meissner/heads/work187-dmf)] RFC2677-Add xvrlw support.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:520e6a6440bc6001babd2f9e35581d5a0637ecea

commit 520e6a6440bc6001babd2f9e35581d5a0637ecea
Author: Michael Meissner 
Date:   Fri Nov 22 18:51:07 2024 -0500

RFC2677-Add xvrlw support.

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/altivec.md (xvrlw): New insn.
* config/rs6000/rs6000.h (TARGET_XVRLW): New macro.

gcc/testsuite/

* gcc.target/powerpc/vector-rotate-left.c: New test.

Diff:
---
 gcc/config/rs6000/altivec.md   | 14 +
 gcc/config/rs6000/rs6000.h |  3 ++
 .../gcc.target/powerpc/vector-rotate-left.c| 34 ++
 3 files changed, 51 insertions(+)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index b6a778ef6179..c76b1eeefe35 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1982,6 +1982,20 @@
 }
   [(set_attr "type" "vecperm")])
 
+;; -mcpu=future adds a vector rotate left word variant.  There is no vector
+;; byte/half-word/double-word/quad-word rotate left.  This insn occurs before
+;; altivec_vrl and will match for -mcpu=future, while other cpus will
+;; match the generic insn.
+(define_insn "*xvrlw"
+  [(set (match_operand:V4SI 0 "register_operand" "=v,wa")
+   (rotate:V4SI (match_operand:V4SI 1 "register_operand" "v,wa")
+(match_operand:V4SI 2 "register_operand" "v,wa")))]
+  "TARGET_XVRLW"
+  "@
+   vrlw %0,%1,%2
+   xvrlw %x0,%x1,%x2"
+  [(set_attr "type" "vecsimple")])
+
 (define_insn "altivec_vrl"
   [(set (match_operand:VI2 0 "register_operand" "=v")
 (rotate:VI2 (match_operand:VI2 1 "register_operand" "v")
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 1480c0373f23..4379340aa873 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -584,6 +584,9 @@ extern int rs6000_vector_align[];
 /* Whether we have PADDIS support.  */
 #define TARGET_PADDIS  TARGET_FUTURE
 
+/* Whether we have XVRLW support.  */
+#define TARGET_XVRLW   TARGET_FUTURE
+
 /* Whether the various reciprocal divide/square root estimate instructions
exist, and whether we should automatically generate code for the instruction
by default.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-rotate-left.c b/gcc/testsuite/gcc.target/powerpc/vector-rotate-left.c
new file mode 100644
index ..5a5f37755077
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-rotate-left.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Test whether the xvrl (vector word rotate left using VSX registers insead of
+   Altivec registers is generated.  */
+
+#include 
+
+typedef vector unsigned int  v4si_t;
+
+v4si_t
+rotl_v4si_scalar (v4si_t x, unsigned long n)
+{
+  __asm__ (" # %x0" : "+f" (x));
+  return (x << n) | (x >> (32 - n));   /* xvrlw.  */
+}
+
+v4si_t
+rotr_v4si_scalar (v4si_t x, unsigned long n)
+{
+  __asm__ (" # %x0" : "+f" (x));
+  return (x >> n) | (x << (32 - n));   /* xvrlw.  */
+}
+
+v4si_t
+rotl_v4si_vector (v4si_t x, v4si_t y)
+{
+  __asm__ (" # %x0" : "+f" (x));   /* xvrlw.  */
+  return vec_rl (x, y);
+}
+
+/* { dg-final { scan-assembler-times {\mxvrlw\M} 3  } } */

[gcc(refs/users/meissner/heads/work187-dmf)] Revert changes

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:85d2ebbd0fe292fdeb47ccbbea60c339c2c0526b

commit 85d2ebbd0fe292fdeb47ccbbea60c339c2c0526b
Author: Michael Meissner 
Date:   Fri Nov 22 19:25:25 2024 -0500

Revert changes

Diff:
---
 gcc/config/rs6000/altivec.md   | 14 
 gcc/config/rs6000/constraints.md   | 10 ---
 gcc/config/rs6000/predicates.md| 52 +---
 gcc/config/rs6000/rs6000.cc| 25 --
 gcc/config/rs6000/rs6000.h |  7 --
 gcc/config/rs6000/rs6000.md| 96 +++---
 gcc/testsuite/gcc.target/powerpc/prefixed-addis.c  | 24 --
 .../gcc.target/powerpc/vector-rotate-left.c| 34 
 8 files changed, 14 insertions(+), 248 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index c76b1eeefe35..b6a778ef6179 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1982,20 +1982,6 @@
 }
   [(set_attr "type" "vecperm")])
 
-;; -mcpu=future adds a vector rotate left word variant.  There is no vector
-;; byte/half-word/double-word/quad-word rotate left.  This insn occurs before
-;; altivec_vrl and will match for -mcpu=future, while other cpus will
-;; match the generic insn.
-(define_insn "*xvrlw"
-  [(set (match_operand:V4SI 0 "register_operand" "=v,wa")
-   (rotate:V4SI (match_operand:V4SI 1 "register_operand" "v,wa")
-(match_operand:V4SI 2 "register_operand" "v,wa")))]
-  "TARGET_XVRLW"
-  "@
-   vrlw %0,%1,%2
-   xvrlw %x0,%x1,%x2"
-  [(set_attr "type" "vecsimple")])
-
 (define_insn "altivec_vrl"
   [(set (match_operand:VI2 0 "register_operand" "=v")
 (rotate:VI2 (match_operand:VI2 1 "register_operand" "v")
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 4d8d21fd6bbb..277a30a82458 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -222,16 +222,6 @@
   "An IEEE 128-bit constant that can be loaded into VSX registers."
   (match_operand 0 "easy_vector_constant_ieee128"))
 
-(define_constraint "eU"
-  "@internal integer constant that can be loaded with paddis"
-  (and (match_code "const_int")
-   (match_operand 0 "paddis_operand")))
-
-(define_constraint "eV"
-  "@internal integer constant that can be loaded with paddis + paddi"
-  (and (match_code "const_int")
-   (match_operand 0 "paddis_paddi_operand")))
-
 ;; Floating-point constraints.  These two are defined so that insn
 ;; length attributes can be calculated exactly.
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index f8e7df5e7f5b..2797c3cf619b 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -369,53 +369,6 @@
   return SIGNED_INTEGER_34BIT_P (INTVAL (op));
 })
 
-;; Return 1 if op is a 64-bit constant that uses the paddis instruction
-(define_predicate "paddis_operand"
-  (match_code "const_int")
-{
-  if (!TARGET_PADDIS && TARGET_POWERPC64)
-return 0;
-
-  /* If addi, addis, or paddi can handle the number, don't return true.  */
-  HOST_WIDE_INT value = INTVAL (op);
-  if (SIGNED_INTEGER_34BIT_P (value))
-return false;
-
-  /* If the number is too large for padds, return false.  */
-  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
-return false;
-
-  /* If the bottom 32-bits are non-zero, paddis can't handle it.  */
-  if ((value & HOST_WIDE_INT_C(0x)) != 0)
-return false;
-
-  return true;
-})
-
-;; Return 1 if op is a 64-bit constant that needs the paddis instruction and an
-;; addi/addis/paddi instruction combination.
-(define_predicate "paddis_paddi_operand"
-  (match_code "const_int")
-{
-  if (!TARGET_PADDIS && TARGET_POWERPC64)
-return 0;
-
-  /* If addi, addis, or paddi can handle the number, don't return true.  */
-  HOST_WIDE_INT value = INTVAL (op);
-  if (SIGNED_INTEGER_34BIT_P (value))
-return false;
-
-  /* If the number is too large for padds, return false.  */
-  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
-return false;
-
-  /* If the bottom 32-bits are zero, we can use paddis alone to handle it.  */
-  if ((value & HOST_WIDE_INT_C(0x)) == 0)
-return false;
-
-  return true;
-})
-
 ;; Return 1 if op is a register that is not special.
 ;; Disallow (SUBREG:SF (REG:SI)) and (SUBREG:SI (REG:SF)) on VSX systems where
 ;; you need to be careful in moving a SFmode to SImode and vice versa due to
@@ -1160,10 +1113,7 @@
   (if_then_else (match_code "const_int")
 (match_test "satisfies_constraint_I (op)
 || satisfies_constraint_L (op)
-|| satisfies_constraint_eI (op)
-|| satisfies_constraint_eU (op)
-|| satisfies_constraint_eV (op)")
-
+|| satisfies_constraint_eI (op)")
 (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/conf

[gcc/meissner/heads/work187-vpair] (30 commits) Merge commit 'refs/users/meissner/heads/work187-vpair' of g

2024-11-22 Thread Michael Meissner via Gcc-cvs

The branch 'meissner/heads/work187-vpair' was updated to point to:

 ad3ce883d017... Merge commit 'refs/users/meissner/heads/work187-vpair' of g

It previously pointed to:

 f0fa789ddba8... Add ChangeLog.vpair and update REVISION.

Diff:

Summary of changes (added commits):
---

  ad3ce88... Merge commit 'refs/users/meissner/heads/work187-vpair' of g
  76a808a... Add ChangeLog.vpair and update REVISION.
  ead1c9f... Update ChangeLog.* (*)
  992b9d6... Use architecture flags for defining _ARCH_PWR macros. (*)
  04169d3... Add rs6000 architecture masks. (*)
  e92d512... Do not allow -mvsx to boost processor to power7. (*)
  5e0614e... Use vector pair load/store for memcpy with -mcpu=future (*)
  0f14185... Add -mcpu=future tests. (*)
  33fa445... Add -mcpu=future tuning support. (*)
  7f0e7dc... Add support for -mcpu=future (*)
  1c9f81c... Revert changes (*)
  80c4fc9... Delete files (*)
  c004776... Add -mcpu=future tuning support. (*)
  590481f... Add support for -mcpu=future (*)
  2d9d590... Revert changes (*)
  f850d8d... Add -mcpu=future tuning support. (*)
  1f083f5... Add support for -mcpu=future (*)
  f9f717f... Revert changes (*)
  33234a6... Add -mcpu=future tuning support. (*)
  d59547a... Add support for -mcpu=future (*)
  b757938... Revert changes (*)
  e98175b... Use vector pair load/store for memcpy with -mcpu=future (*)
  6e5fb4c... Add -mcpu=future tests. (*)
  121e4ad... Add -mcpu=future tuning support. (*)
  90092f4... Add support for -mcpu=future (*)
  35fec69... Change TARGET_MODULO to TARGET_POWER9. (*)
  e1f8abc... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  8c9b979... Change TARGET_CMPB to TARGET_POWER6. (*)
  3fddd6e... Change TARGET_FPRND to TARGET_POWER5X. (*)
  38fdb83... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work187-vpair' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.

[gcc(refs/users/meissner/heads/work187)] Use architecture flags for defining _ARCH_PWR macros.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:992b9d6024dbd8b912930998180f23c88aa9ee2c

commit 992b9d6024dbd8b912930998180f23c88aa9ee2c
Author: Michael Meissner 
Date:   Fri Nov 22 18:21:17 2024 -0500

Use architecture flags for defining _ARCH_PWR macros.

For the newer architectures, this patch changes GCC to define the _ARCH_PWR
macros using the new architecture flags instead of relying on ISA options like
-mpower10.

The -mpower8-internal, -mpower10, -mpower11, and -mfuture options were removed.
The -mpower11 and -mfuture options were removed completely, since they were just
added in GCC 15. The other two options were marked as WarnRemoved, and the
various ISA bits were removed.

TARGET_POWER8, TARGET_POWER10, TARGET_POWER11, and TARGET_FUTURE were re-defined
to use the architecture bits instead of the ISA bits.

There are other internal ISA bits that aren't removed with this patch because
the built-in function support uses those bits.
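
As a rough illustration of defining the _ARCH_PWR macros from a separate set of architecture bits, here is a minimal C sketch.  The mask names, their bit positions, and the helper function are hypothetical; the real masks live in the rs6000 back end and may be named and numbered differently.  Only the macro names themselves are taken from GCC.

```c
#include <stddef.h>

/* Hypothetical architecture bits, for illustration only.  */
#define ARCH_MASK_POWER8   (1u << 0)
#define ARCH_MASK_POWER9   (1u << 1)
#define ARCH_MASK_POWER10  (1u << 2)
#define ARCH_MASK_POWER11  (1u << 3)

struct arch_macro
{
  unsigned mask;       /* Architecture bit that selects the macro.  */
  const char *name;    /* Preprocessor macro to define.  */
};

static const struct arch_macro arch_macros[] =
{
  { ARCH_MASK_POWER8,  "_ARCH_PWR8"  },
  { ARCH_MASK_POWER9,  "_ARCH_PWR9"  },
  { ARCH_MASK_POWER10, "_ARCH_PWR10" },
  { ARCH_MASK_POWER11, "_ARCH_PWR11" },
};

/* Store in OUT the macro names selected by ARCH_FLAGS and return how
   many were stored.  OUT must have room for four entries.  */
static size_t
collect_arch_macros (unsigned arch_flags, const char **out)
{
  size_t count = 0;
  for (size_t i = 0; i < sizeof arch_macros / sizeof arch_macros[0]; i++)
    if ((arch_flags & arch_macros[i].mask) != 0)
      out[count++] = arch_macros[i].name;
  return count;
}
```

The point of the sketch is only that the macro decision is driven by the architecture word, not by user-visible ISA option bits such as -mpower10.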

I have built both big endian and little endian bootstrap compilers and there
were no regressions.

Can I install this patch on the GCC 15 trunk?

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Add support to
use architecture flags instead of ISA flags for setting most of the
_ARCH_PWR* macros.
(rs6000_cpu_cpp_builtins): Update rs6000_target_modify_macros call.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Remove
OPTION_MASK_POWER8.
(ISA_3_1_MASKS_SERVER): Remove OPTION_MASK_POWER10.
(POWER11_MASKS_SERVER): Remove OPTION_MASK_POWER11.
(FUTURE_MASKS_SERVER): Remove OPTION_MASK_FUTURE.
(POWERPC_MASKS): Remove OPTION_MASK_POWER8, OPTION_MASK_POWER10,
OPTION_MASK_POWER11, and OPTION_MASK_FUTURE.
* config/rs6000/rs6000-protos.h (rs6000_target_modify_macros): Update
declaration.
(rs6000_target_modify_macros_ptr): Likewise.
* config/rs6000/rs6000.cc (rs6000_target_modify_macros_ptr): Likewise.
(rs6000_option_override_internal): Use architecture flags instead of ISA
flags.
(rs6000_opt_masks): Remove -mpower10, -mpower11, and -mfuture which are
no longer in the ISA flags.
(rs6000_pragma_target_parse): Use architecture flags as well as ISA
flags.
* config/rs6000/rs6000.h (TARGET_POWER5): Redefine to use architecture
flags.
(TARGET_POWER5X): Likewise.
(TARGET_POWER6): Likewise.
(TARGET_POWER7): Likewise.
(TARGET_POWER8): Likewise.
(TARGET_POWER9): Likewise.
(TARGET_POWER10): New macro.
(TARGET_POWER11): Likewise.
(TARGET_FUTURE): Likewise.
* config/rs6000/rs6000.opt (-mpower8-internal): Remove ISA flag bits.
(-mpower10): Likewise.
(-mpower11): Likewise.
(-mfuture): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-c.cc | 29 -
 gcc/config/rs6000/rs6000-cpus.def | 10 +-
 gcc/config/rs6000/rs6000-protos.h |  5 +++--
 gcc/config/rs6000/rs6000.cc   | 20 +++-
 gcc/config/rs6000/rs6000.h| 19 +--
 gcc/config/rs6000/rs6000.opt  | 17 ++---
 6 files changed, 46 insertions(+), 54 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index b0b5701c7fa9..30ea1d64326e 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -339,7 +339,8 @@ rs6000_define_or_undefine_macro (bool define_p, const char *name)
#pragma GCC target, we need to adjust the macros dynamically.  */
 
 void
-rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags)
+rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags,
+HOST_WIDE_INT arch_flags)
 {
   if (TARGET_DEBUG_BUILTIN || TARGET_DEBUG_TARGET)
 fprintf (stderr,
@@ -412,7 +413,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags)
summary of the flags associated with particular cpu
definitions.  */
 
-  /* rs6000_isa_flags based options.  */
+  /* rs6000_isa_flags and rs6000_arch_flags based options.  */
   rs6000_define_or_undefine_macro (define_p, "_ARCH_PPC");
   if ((flags & OPTION_MASK_PPC_GPOPT) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PPCSQ");
@@ -420,25 +421,27 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PPCGR");
   if ((flags & OPTION_MASK_POWERPC64) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PPC64");
-  if ((flags & OPTION_MASK_MFCRF) != 0)
+  if ((flags & OPTION_MASK_POWERPC64) != 0)
+rs6000_define_or_undefine_macro (define_p, "_ARCH_PPC64");

[gcc(refs/users/meissner/heads/work187)] Do not allow -mvsx to boost processor to power7.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:e92d5129aaf8356475e23f75082f78f242fc86be

commit e92d5129aaf8356475e23f75082f78f242fc86be
Author: Michael Meissner 
Date:   Fri Nov 22 18:14:24 2024 -0500

Do not allow -mvsx to boost processor to power7.

This patch restructures the code so that -mvsx, for example, will not silently
convert the processor to power7.  The user must now use -mcpu=power7 or higher.
This means that if the user specifies -mvsx and the default processor does not
have VSX support, it will be an error.
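
The new behavior can be modeled with a small C sketch.  This is not the GCC code: the enum, the function, and the numeric cpu level (6 for power6, 7 for power7, and so on) are hypothetical stand-ins for the option-override logic, based on the description above and on the hunk in the diff below.

```c
#include <stdbool.h>

/* Hypothetical outcome of resolving a VSX request.  */
enum vsx_result
{
  VSX_ENABLED,    /* VSX stays on.  */
  VSX_DISABLED,   /* VSX is off, without a diagnostic.  */
  VSX_ERROR       /* Explicit -mvsx on a cpu older than power7.  */
};

/* CPU_LEVEL is a hypothetical numeric processor level.  VSX_REQUESTED
   says whether VSX is on after option processing, and VSX_EXPLICIT says
   whether the user asked for it directly.  */
static enum vsx_result
resolve_vsx (int cpu_level, bool vsx_requested, bool vsx_explicit)
{
  if (!vsx_requested)
    return VSX_DISABLED;

  if (cpu_level >= 7)
    return VSX_ENABLED;

  /* The cpu is no longer raised to power7 behind the user's back.  */
  return vsx_explicit ? VSX_ERROR : VSX_DISABLED;
}
```

With this model, an explicit VSX request on a power5 or power6 default is rejected, while the same request together with a power7 or newer cpu is accepted.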

I have built both big endian and little endian bootstrap compilers and there
were no regressions.

I updated the 2 tests that used -mvsx to raise the cpu to power7, and the test
case that checks if -mno-vsx produces the expected warning.

Note, Peter had some questions about one of the tests in the previous version of
the patch.  The test is still the same in this patch.  But the code for
preventing -mvsx is different from the previous patch, and I wanted to get that
patch for review before stage1 closes.

Can I install this patch on the GCC 15 trunk?

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/rs6000.cc (rs6000_option_override_internal): Check if
the user asked for VSX instructions whether the cpu was at least power7.

gcc/testsuite/

* gcc.target/powerpc/ppc-target-4.c: Rewrite the test to add cpu=power7
when we need to add VSX support.  Add test for adding cpu=power7 no-vsx
to generate only Altivec instructions.
* gcc.target/powerpc/pr115688.c: Add cpu=power7 in target __attribute__
when requesting VSX instructions.
* gcc.target/powerpc/pr87496-1.c: Update options to use
-mdejagnu-cpu=power6 to get the appropriate error message.

Diff:
---
 gcc/config/rs6000/rs6000.cc |  7 +
 gcc/testsuite/gcc.target/powerpc/ppc-target-4.c | 38 +++--
 gcc/testsuite/gcc.target/powerpc/pr115688.c |  3 +-
 gcc/testsuite/gcc.target/powerpc/pr87496-1.c|  2 +-
 4 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 35e42a80ac5c..e5cdbb96c75f 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3860,6 +3860,13 @@ rs6000_option_override_internal (bool global_init_p)
  rs6000_isa_flags &= ~OPTION_MASK_VSX;
  rs6000_isa_flags_explicit |= OPTION_MASK_VSX;
}
+  else if (!TARGET_POWER7)
+   {
+ if (explicit_vsx_p)
+   error ("%<-mvsx%> requires at least %<-mcpu=power%>");
+ rs6000_isa_flags &= ~OPTION_MASK_VSX;
+ rs6000_isa_flags_explicit |= OPTION_MASK_VSX;
+   }
 }
 
   /* If hard-float/altivec/vsx were explicitly turned off then don't allow
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c b/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c
index feef76db4618..5e2ecf34f249 100644
--- a/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c
@@ -2,7 +2,7 @@
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_fprs } */
 /* { dg-options "-O2 -ffast-math -mdejagnu-cpu=power5 -mno-altivec -mabi=altivec -fno-unroll-loops" } */
-/* { dg-final { scan-assembler-times "vaddfp" 1 } } */
+/* { dg-final { scan-assembler-times "vaddfp" 2 } } */
 /* { dg-final { scan-assembler-times "xvaddsp" 1 } } */
 /* { dg-final { scan-assembler-times "fadds" 1 } } */
 
@@ -18,10 +18,6 @@
 #error "__VSX__ should not be defined."
 #endif
 
-#pragma GCC target("altivec,vsx")
-#include 
-#pragma GCC reset_options
-
 #pragma GCC push_options
 #pragma GCC target("altivec,no-vsx")
 
@@ -33,6 +29,7 @@
 #error "__VSX__ should not be defined."
 #endif
 
+/* Altivec build, generate vaddfp.  */
 void
 av_add (vector float *a, vector float *b, vector float *c)
 {
@@ -40,10 +37,11 @@ av_add (vector float *a, vector float *b, vector float *c)
   unsigned long n = SIZE / 4;
 
   for (i = 0; i < n; i++)
-a[i] = vec_add (b[i], c[i]);
+a[i] = b[i] + c[i];
 }
 
-#pragma GCC target("vsx")
+/* cpu=power7 must be used to enable VSX.  */
+#pragma GCC target("cpu=power7,vsx")
 
 #ifndef __ALTIVEC__
 #error "__ALTIVEC__ should be defined."
@@ -53,6 +51,7 @@ av_add (vector float *a, vector float *b, vector float *c)
 #error "__VSX__ should be defined."
 #endif
 
+/* VSX build on power7, generate xsaddsp.  */
 void
 vsx_add (vector float *a, vector float *b, vector float *c)
 {
@@ -60,11 +59,31 @@ vsx_add (vector float *a, vector float *b, vector float *c)
   unsigned long n = SIZE / 4;
 
   for (i = 0; i < n; i++)
-a[i] = vec_add (b[i], c[i]);
+a[i] = b[i] + c[i];
+}
+
+#pragma GCC target("cpu=power7,no-vsx")
+
+#ifndef __ALTIVEC__
+#error "__ALTIVEC__ should be defined."
+#endif
+
+#ifdef __VSX__
+#error "__VSX__ should not be defined."

[gcc(refs/users/meissner/heads/work187-dmf)] RFC2653-Add support for dense math registers.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:83086e8856feeef40f4fe161708bafeafa2f170f

commit 83086e8856feeef40f4fe161708bafeafa2f170f
Author: Michael Meissner 
Date:   Fri Nov 22 18:37:36 2024 -0500

RFC2653-Add support for dense math registers.

The MMA subsystem added the notion of accumulator registers as an optional
feature of ISA 3.1 (power10).  In ISA 3.1, these accumulators overlapped with
the VSX registers 0..31, but logically the accumulator registers were separate
from the FPR registers.  In ISA 3.1, it was anticipated that in future systems,
the accumulator registers may not overlap with the FPR registers.  This patch
adds the support for dense math registers as separate registers.

This particular patch does not change the MMA support to use the accumulators
within the dense math registers.  This patch just adds the basic support for
having separate DMRs.  The next patch will switch the MMA support to use the
accumulators if -mcpu=future is used.

For testing purposes, I added an undocumented option '-mdense-math' to enable
or disable the dense math support.

This patch updates the wD constraint added in the previous patch.  If MMA is
selected but dense math is not selected (i.e. -mcpu=power10), the wD constraint
will allow access to accumulators that overlap with VSX registers 0..31.  If
both MMA and dense math are selected (i.e. -mcpu=future), the wD constraint
will only allow dense math registers.

This patch modifies the existing %A output modifier.  If MMA is selected but
dense math is not selected, then the %A output modifier converts the VSX register
number to the accumulator number, by dividing it by 4.  If both MMA and dense
math are selected, then %A will map the separate DMR registers into 0..7.
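
The two cases of the %A mapping can be sketched in C as follows.  The function and its parameters are hypothetical: VSX registers are assumed to be numbered 0..31 here, and first_dmr_regno stands in for whatever hard register number the first dense math register has in the compiler.

```c
#include <stdbool.h>

/* Return the accumulator number that would be printed for REGNO.
   Without dense math, REGNO is a VSX register number in 0..31.  With
   dense math, REGNO is a DMR number starting at FIRST_DMR_REGNO
   (a hypothetical value supplied by the caller).  */
static int
accumulator_number (int regno, bool dense_math, int first_dmr_regno)
{
  if (dense_math)
    return regno - first_dmr_regno;   /* DMRs map directly onto 0..7.  */

  return regno / 4;                   /* Four VSX registers per accumulator.  */
}
```

For example, without dense math VSX register 28 corresponds to accumulator 7, since each accumulator overlaps four adjacent VSX registers.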

The intention is that user code using extended asm can be modified to run on
both MMA without dense math and MMA with dense math:

1)  If possible, don't use extended asm, but instead use the MMA built-in
functions;

2)  If you do need to write extended asm, change the d constraints
targeting accumulators to use wD;

3)  Only use the built-in zero, assemble and disassemble functions to
move data between vector quad types and dense math accumulators.
I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz instructions directly
in the extended asm code.  The reason is these instructions assume there
is a 1-to-1 correspondence between 4 adjacent FPR registers and an
accumulator that overlaps with those registers.  With accumulators
now being separate registers, there no longer is a 1-to-1
correspondence.

It is possible that the mangling for DMRs and the GDB register numbers may
produce other changes in the future.

gcc/

2024-11-22   Michael Meissner  

* config/rs6000/mma.md (UNSPEC_MMA_DMSETDMRZ): New unspec.
(movxo): Add comments about dense math registers.
(movxo_nodm): Rename from movxo and restrict the usage to machines
without dense math registers.
(movxo_dm): New insn for movxo support for machines with dense math
registers.
(mma_): Restrict usage to machines without dense math registers.
(mma_xxsetaccz): Add a define_expand wrapper, and add support for dense
math registers.
(mma_dmsetaccz): New insn.
* config/rs6000/predicates.md (dmr_operand): New predicate.
(accumulator_operand): Add support for dense math registers.
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): Do
not issue a de-prime instruction when disassembling a vector quad on a
system with dense math registers.
* config/rs6000/rs6000-c.cc (rs6000_define_or_undefine_macro): Define
__DENSE_MATH__ if we have dense math registers.
* config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMR_REG_TYPE.
(enum rs6000_reload_reg_type): Add RELOAD_REG_DMR.
(LAST_RELOAD_REG_CLASS): Add support for DMR registers and the wD
constraint.
(reload_reg_map): Likewise.
(rs6000_reg_names): Likewise.
(alt_reg_names): Likewise.
(rs6000_hard_regno_nregs_internal): Likewise.
(rs6000_hard_regno_mode_ok_uncached): Likewise.
(rs6000_debug_reg_global): Likewise.
(rs6000_setup_reg_addr_masks): Likewise.
(rs6000_init_hard_regno_mode_ok): Likewise.
(rs6000_secondary_reload_memory): Add support for DMR registers.
(rs6000_secondary_reload_simple_move): Likewise.
(rs6000_preferred_reload_class): Likewise.
(rs6000_secondary_reload_class): Likewise.
(print_operan

[gcc(refs/users/meissner/heads/work187-dmf)] RFC2653-PowerPC: Switch to dense math names for all MMA operations.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:fa6075e378f5cb7a8d76125691949c927c571b65

commit fa6075e378f5cb7a8d76125691949c927c571b65
Author: Michael Meissner 
Date:   Fri Nov 22 18:38:30 2024 -0500

RFC2653-PowerPC: Switch to dense math names for all MMA operations.

This patch changes the assembler instruction names for MMA instructions from
the original name used in power10 to the new name when used with the dense math
system.  I.e. xvf64gerpp becomes dmxvf64gerpp.  The assembler will emit the
same bits for either spelling.

For the non-prefixed MMA instructions, we add a 'dm' prefix in front of the
instruction.  However, the prefixed instructions have a 'pm' prefix, and we add
the 'dm' prefix afterwards.  To prevent having two sets of parallel int
attributes, we remove the "pm" prefix from the instruction string in the
attributes, and add it later, both in the insn name and in the output template.
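
The naming rule can be sketched in C as below.  The helper is hypothetical and only shows how the pieces of the mnemonic are put together from a base name that carries neither prefix; the patch itself does this in the machine description, not in C.

```c
#include <stdbool.h>
#include <stdio.h>

/* Write into BUF (of SIZE bytes) the mnemonic for BASE, an MMA
   instruction name without any "pm" or "dm" prefix.  PREFIXED selects
   the prefixed ("pm") form and DENSE_MATH selects the dense math
   spelling.  */
static void
mma_mnemonic (char *buf, size_t size, const char *base,
              bool prefixed, bool dense_math)
{
  snprintf (buf, size, "%s%s%s",
            prefixed ? "pm" : "",
            dense_math ? "dm" : "",
            base);
}
```

Keeping the base name free of prefixes means one table of names can serve both the original MMA spelling and the dense math spelling.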

2024-11-22   Michael Meissner  

gcc/

* config/rs6000/mma.md (vvi4i4i8): Change the instruction to not have a
"pm" prefix.
(avvi4i4i8): Likewise.
(vvi4i4i2): Likewise.
(avvi4i4i2): Likewise.
(vvi4i4): Likewise.
(avvi4i4): Likewise.
(pvi4i2): Likewise.
(apvi4i2): Likewise.
(vvi4i4i4): Likewise.
(avvi4i4i4): Likewise.
(mma_): Add support for running on DMF systems, generating the dense
math instruction and using the dense math accumulators.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_pm): Add support for running on DMF systems, generating
the dense math instruction and using the dense math accumulators.
Rename the insn with a 'pm' prefix and add either 'pm' or 'pmdm'
prefixes based on whether we have the original MMA specification or if
we have dense math support.
(mma_pm): Likewise.
(mma_pm): Likewise.
(mma_pm): Likewise.
(mma_pm): Likewise.
(mma_pm): Likewise.
(mma_pm): Likewise.
(mma_pm): Likewise.

Diff:
---
 gcc/config/rs6000/mma.md | 157 +++
 1 file changed, 104 insertions(+), 53 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index ae6e7e9695be..2e04eb653fa6 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -225,44 +225,47 @@
 (UNSPEC_MMA_XVF64GERNP "xvf64gernp")
 (UNSPEC_MMA_XVF64GERNN "xvf64gernn")])
 
-(define_int_attr vvi4i4i8  [(UNSPEC_MMA_PMXVI4GER8 "pmxvi4ger8")])
+;; The "pm" prefix is not in these expansions, so that we can generate
+;; pmdmxvi4ger8 on systems with dense math registers and xvi4ger8 on systems
+;; without dense math registers.
+(define_int_attr vvi4i4i8  [(UNSPEC_MMA_PMXVI4GER8 "xvi4ger8")])
 
-(define_int_attr avvi4i4i8 [(UNSPEC_MMA_PMXVI4GER8PP   "pmxvi4ger8pp")])
+(define_int_attr avvi4i4i8 [(UNSPEC_MMA_PMXVI4GER8PP   "xvi4ger8pp")])
 
-(define_int_attr vvi4i4i2  [(UNSPEC_MMA_PMXVI16GER2"pmxvi16ger2")
-(UNSPEC_MMA_PMXVI16GER2S   "pmxvi16ger2s")
-(UNSPEC_MMA_PMXVF16GER2"pmxvf16ger2")
-(UNSPEC_MMA_PMXVBF16GER2   "pmxvbf16ger2")])
+(define_int_attr vvi4i4i2  [(UNSPEC_MMA_PMXVI16GER2"xvi16ger2")
+(UNSPEC_MMA_PMXVI16GER2S   "xvi16ger2s")
+(UNSPEC_MMA_PMXVF16GER2"xvf16ger2")
+(UNSPEC_MMA_PMXVBF16GER2   "xvbf16ger2")])
 
-(define_int_attr avvi4i4i2 [(UNSPEC_MMA_PMXVI16GER2PP  "pmxvi16ger2pp")
-(UNSPEC_MMA_PMXVI16GER2SPP "pmxvi16ger2spp")
-(UNSPEC_MMA_PMXVF16GER2PP  "pmxvf16ger2pp")
-(UNSPEC_MMA_PMXVF16GER2PN  "pmxvf16ger2pn")
-(UNSPEC_MMA_PMXVF16GER2NP  "pmxvf16ger2np")
-(UNSPEC_MMA_PMXVF16GER2NN  "pmxvf16ger2nn")
-(UNSPEC_MMA_PMXVBF16GER2PP "pmxvbf16ger2pp")
-(UNSPEC_MMA_PMXVBF16GER2PN "pmxvbf16ger2pn")
-(UNSPEC_MMA_PMXVBF16GER2NP "pmxvbf16ger2np")
-(UNSPEC_MMA_PMXVBF16GER2NN "pmxvbf16ger2nn")])
+(define_int_attr avvi4i4i2 [(UNSPEC_MMA_PMXVI16GER2PP  "xvi16ger2pp")
+(UNSPEC_MMA_PMXVI16GER2SPP "xvi16ger2spp")
+(UNSPEC_MMA_PMXVF16GER2PP  "xvf16ger2pp")
+(UNSPEC_MMA_PMXVF16GER2PN  "xvf

[gcc(refs/users/meissner/heads/work187-dmf)] RFC2653-Add wD constraint.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:60f42288430e483bd3f18b9a4ed46ade90aa8164

commit 60f42288430e483bd3f18b9a4ed46ade90aa8164
Author: Michael Meissner 
Date:   Fri Nov 22 18:36:24 2024 -0500

RFC2653-Add wD constraint.

This patch adds a new constraint ('wD') that matches the accumulator registers
that overlap with VSX registers 0..31 on power10.  Future patches will add the
support for a separate accumulator register class that will be used when the
support for dense math registers is added.

2024-11-22   Michael Meissner  

* config/rs6000/constraints.md (wD): New constraint.
* config/rs6000/mma.md (mma_): Prepare for alternate accumulator
registers.  Use wD constraint instead of 'd' constraint.  Use
accumulator_operand instead of fpr_reg_operand.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0")]
MMA_ACC))]
   "TARGET_MMA"
   " %A0"
@@ -523,7 +523,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
MMA_VV))]
@@ -532,8 +532,8 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
MMA_AVV))]
@@ -542,7 +542,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:OO 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
MMA_PV))]
@@ -551,8 +551,8 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:OO 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
MMA_APV))]
@@ -561,7 +561,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:SI 3 "const_0_to_15_operand" "n,n")
@@ -574,8 +574,8 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")
(match_operand:SI 4 "const_0_to_15_operand" "n,n")
@@ -588,7 +588,7 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:SI 3 "const_0_to_15_operand" "n,n")
@@ -601,8 +601,8 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")

[gcc/aoliva/heads/testme] (2 commits) ifcombine: don't try xor on right-hand op

2024-11-22 Thread Alexandre Oliva via Gcc-cvs

The branch 'aoliva/heads/testme' was updated to point to:

 dab845bbb29b... ifcombine: don't try xor on right-hand op

It previously pointed to:

 9e160528951b... fold fold_truth_andor field merging into ifcombine

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  9e16052... fold fold_truth_andor field merging into ifcombine


Summary of changes (added commits):
---

  dab845b... ifcombine: don't try xor on right-hand op
  4f022d9... fold fold_truth_andor field merging into ifcombine

[gcc(refs/users/aoliva/heads/testme)] fold fold_truth_andor field merging into ifcombine

2024-11-22 Thread Alexandre Oliva via Gcc-cvs

https://gcc.gnu.org/g:4f022d9943090a374cc3eea3295048a110539d92

commit 4f022d9943090a374cc3eea3295048a110539d92
Author: Alexandre Oliva 
Date:   Thu Nov 21 22:36:34 2024 -0300

fold fold_truth_andor field merging into ifcombine

This patch introduces various improvements to the logic that merges
field compares, while moving it into ifcombine.

Before the patch, we could merge:

  (a.x1 EQNE b.x1)  ANDOR  (a.y1 EQNE b.y1)

into something like:

  (((type *)&a)[Na] & MASK) EQNE (((type *)&b)[Nb] & MASK)

if both of A's fields live within the same alignment boundaries, and
so do B's, at the same relative positions.  Constants may be used
instead of the object B.
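
To make the shape of that transformation concrete, here is a hand-written
C sketch of the two equivalent forms.  It is only an illustration: the
struct pair layout, the helper names, and the use of memcpy to load the
containing word are assumptions made for this example, whereas the compiler
performs the merge on its own internal representation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: x1 and y1 share one aligned 32-bit word.  */
struct pair { uint8_t x1; uint8_t y1; uint16_t other; };

_Static_assert (sizeof (struct pair) == sizeof (uint32_t),
                "example assumes the struct is exactly one 32-bit word");

/* Original form: (a.x1 == b.x1) && (a.y1 == b.y1).  */
static bool
fields_equal (const struct pair *a, const struct pair *b)
{
  return a->x1 == b->x1 && a->y1 == b->y1;
}

/* Build MASK without assuming endianness: set all bits of the two
   compared fields, clear the rest, and read the word back.  */
static uint32_t
field_mask (void)
{
  struct pair ones = { 0xff, 0xff, 0 };
  uint32_t mask;
  memcpy (&mask, &ones, sizeof mask);
  return mask;
}

/* Merged form: one word load per object, one mask, one compare.  */
static bool
fields_equal_merged (const struct pair *a, const struct pair *b)
{
  uint32_t wa, wb, mask = field_mask ();
  memcpy (&wa, a, sizeof wa);
  memcpy (&wb, b, sizeof wb);
  return (wa & mask) == (wb & mask);
}
```

Both functions return the same result for any two objects; the merged form
simply ignores the bits of the unrelated field through the mask.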

The initial goal of this patch was to enable such combinations when a
field crossed alignment boundaries, e.g. for packed types.  We can't
generally access such fields with a single memory access, so when we
come across such a compare, we will attempt to combine each access
separately.

Some merging opportunities were missed because of right-shifts,
compares expressed as e.g. ((a.x1 ^ b.x1) & MASK) EQNE 0, and
narrowing conversions, especially after earlier merges.  This patch
introduces handlers for several cases involving these.
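
The xor form mentioned above is the same test as comparing the masked
values directly, which is why recognizing it exposes further merging
opportunities.  A small C sketch of the identity follows; the function
names are made up for this example and do not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Direct form: compare the selected bits of both values.  */
static bool
eq_masked (uint32_t a, uint32_t b, uint32_t mask)
{
  return (a & mask) == (b & mask);
}

/* Xor form: the selected bits are equal exactly when their xor,
   restricted to the mask, is zero.  */
static bool
eq_xor (uint32_t a, uint32_t b, uint32_t mask)
{
  return ((a ^ b) & mask) == 0;
}
```

The two functions agree for every input, since (a ^ b) & mask equals
(a & mask) ^ (b & mask), and an xor is zero only when both sides are equal.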

The merging of multiple field accesses into wider bitfield-like
accesses is undesirable to do too early in compilation, so we move it
from folding to ifcombine.

When it is the second of a noncontiguous pair of compares that first
accesses a word, we may merge the first compare with part of the
second compare that refers to the same word, keeping the compare of
the remaining bits at the spot where the second compare used to be.

Handling compares with non-constant fields was somewhat generalized
from what fold used to do, now handling non-adjacent fields, even if a
field of one object crosses an alignment boundary but the other
doesn't.


for  gcc/ChangeLog

* fold-const.cc (make_bit_field): Export.
(unextend, all_ones_mask_p): Drop.
(decode_field_reference, fold_truth_andor_1): Move
field compare merging logic...
* gimple-fold.cc: (fold_truth_andor_for_ifcombine) ... here.
(compute_split_boundary_from_align): New.
(make_bit_field_load, build_split_load): New.
(reuse_split_load): New.
* fold-const.h: (make_bit_field_ref): Declare
(fold_truth_andor_for_ifcombine): Declare.
* match.pd (any_convert, bit_and_cst, rshift_cst): New.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Try
fold_truth_andor_for_ifcombine.

for  gcc/testsuite/ChangeLog

* gcc.dg/field-merge-1.c: New.
* gcc.dg/field-merge-2.c: New.
* gcc.dg/field-merge-3.c: New.
* gcc.dg/field-merge-4.c: New.
* gcc.dg/field-merge-5.c: New.
* gcc.dg/field-merge-6.c: New.
* gcc.dg/field-merge-7.c: New.
* gcc.dg/field-merge-8.c: New.
* gcc.dg/field-merge-9.c: New.
* gcc.dg/field-merge-10.c: New.
* gcc.dg/field-merge-11.c: New.

Diff:
---
 gcc/fold-const.cc |  512 +--
 gcc/fold-const.h  |   10 +
 gcc/gimple-fold.cc| 1107 +
 gcc/match.pd  |   11 +
 gcc/testsuite/gcc.dg/field-merge-1.c  |   64 ++
 gcc/testsuite/gcc.dg/field-merge-10.c |   36 ++
 gcc/testsuite/gcc.dg/field-merge-11.c |   32 +
 gcc/testsuite/gcc.dg/field-merge-2.c  |   31 +
 gcc/testsuite/gcc.dg/field-merge-3.c  |   36 ++
 gcc/testsuite/gcc.dg/field-merge-4.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-5.c  |   40 ++
 gcc/testsuite/gcc.dg/field-merge-6.c  |   26 +
 gcc/testsuite/gcc.dg/field-merge-7.c  |   23 +
 gcc/testsuite/gcc.dg/field-merge-8.c  |   25 +
 gcc/testsuite/gcc.dg/field-merge-9.c  |   36 ++
 gcc/tree-ssa-ifcombine.cc |   14 +-
 16 files changed, 1534 insertions(+), 509 deletions(-)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 1e8ae1ab493b..644966459864 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -137,7 +137,6 @@ static tree range_successor (tree);
 static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
 static tree fold_cond_expr_with_comparison (location_t, tree, enum tree_code,
tree, tree, tree, tree);
-static tree unextend (tree, int, int, tree);
 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
 static tree extract_muldiv_1 (tree, tree, enum tree_code, tree, bool *);
 static tree fold_binary_op_with_conditional_arg (location_t,
@@ -4711,7 +4710,7 @@ invert_truthvalue_loc (location_t loc, tree arg)
is the original memory reference used to preserve the alias set of
the

[gcc(refs/users/aoliva/heads/testme)] ifcombine: don't try xor on right-hand op

2024-11-22 Thread Alexandre Oliva via Gcc-cvs

https://gcc.gnu.org/g:dab845bbb29b68bf7e1b6127896fe834fea1b3a4

commit dab845bbb29b68bf7e1b6127896fe834fea1b3a4
Author: Alexandre Oliva 
Date:   Fri Nov 22 19:16:58 2024 -0300

ifcombine: don't try xor on right-hand op

Diff:
---
 gcc/gimple-fold.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 731f6ccd5597..d0caabd8a4b4 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -7486,6 +7486,10 @@ decode_field_reference (tree *pexp, HOST_WIDE_INT *pbitsize,
  exp = res_ops[1];
  gcc_checking_assert (!xor_cmp_op);
}
+  else if (!xor_cmp_op)
+   /* Not much we can do when xor appears in the right-hand compare
+  operand.  */
+   return NULL_TREE;
   else
{
  *xor_p = true;

[gcc r15-5606] md-files: Add a note about escaped quotes in braced strings in md files

2024-11-22 Thread Andrew Pinski via Gcc-cvs

https://gcc.gnu.org/g:4aa4162e365023896ebf6ed56bf0a00994c50639

commit r15-5606-g4aa4162e365023896ebf6ed56bf0a00994c50639
Author: Andrew Pinski 
Date:   Thu Oct 31 16:00:18 2024 -0700

md-files: Add a note about escaped quotes in braced strings in md files

While looking into PR 33532, it was noted that \" would still be
treated as " for braced strings in the md file. I think that is still
the correct thing to do. So let's just add a note to the documentation
on this behavior and NOT change read-md.cc (read_braced_string).
This behavior has been there for the last 23 years and only
one person has run into it. It also helps with the conversion
from quoted strings to braced strings: you just need
to remove the quotes around the braces rather than change all of the
code.

Built the documentation to make sure it looks correct.

gcc/ChangeLog:

* doc/rtl.texi: Add a note about quotes in braced strings.

Signed-off-by: Andrew Pinski 

Diff:
---
 gcc/doc/rtl.texi | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 5debd6245f0c..41dfc27c899e 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -85,7 +85,10 @@ appear, it is also valid to write a C-style brace block.  The entire
 brace block, including the outermost pair of braces, is considered to be
 the string constant.  Double quote characters inside the braces are not
 special.  Therefore, if you write string constants in the C code, you
-need not escape each quote character with a backslash.
+need not escape each quote character with a backslash. Note escaped quotes
+are treated the same as a plain quote character and if you need a escaped
+quote in a C string, you need an extra backslash to escape the backslash
+like @code{"a=\\"c\\";"}.
 
 A vector contains an arbitrary number of pointers to expressions.  The
 number of elements in the vector is explicitly present in the vector.

[gcc(refs/users/meissner/heads/work187)] Use vector pair load/store for memcpy with -mcpu=future

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:e98175b635a332f28ba1dd8e5da67bcec92efcc5

commit e98175b635a332f28ba1dd8e5da67bcec92efcc5
Author: Michael Meissner 
Date:   Fri Nov 22 17:49:32 2024 -0500

Use vector pair load/store for memcpy with -mcpu=future

In the development for the power10 processor, GCC did not enable using the load
vector pair and store vector pair instructions when optimizing things like
memory copy.  This patch enables using those instructions if -mcpu=future is
used.

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Enable using
load vector pair and store vector pair instructions for memory copy
operations.
(POWERPC_MASKS): Make the bit for enabling using load vector pair and
store vector pair operations set and reset when the PowerPC processor is
changed.
* gcc/config/rs6000/rs6000.cc (rs6000_machine_from_flags): Disable
-mblock-ops-vector-pair from influencing .machine selection.

gcc/testsuite/

* gcc.target/powerpc/future-3.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000-cpus.def   |  4 +++-
 gcc/config/rs6000/rs6000.cc |  2 +-
 gcc/testsuite/gcc.target/powerpc/future-3.c | 21 +
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 354c1d8de4f0..2f189dd416ca 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -84,7 +84,8 @@
  | OPTION_MASK_POWER11)
 
 #define FUTURE_MASKS_SERVER(POWER11_MASKS_SERVER   \
-| OPTION_MASK_FUTURE)
+| OPTION_MASK_FUTURE   \
+| OPTION_MASK_BLOCK_OPS_VECTOR_PAIR)
 
 /* Flags that need to be turned off if -mno-vsx.  */
 #define OTHER_VSX_VECTOR_MASKS (OPTION_MASK_EFFICIENT_UNALIGNED_VSX\
@@ -114,6 +115,7 @@
 
 /* Mask of all options to set the default isa flags based on -mcpu=.  */
 #define POWERPC_MASKS  (OPTION_MASK_ALTIVEC\
+| OPTION_MASK_BLOCK_OPS_VECTOR_PAIR\
 | OPTION_MASK_CMPB \
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_DFP  \
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index ab26192b687d..35e42a80ac5c 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -5906,7 +5906,7 @@ rs6000_machine_from_flags (void)
 
   /* Disable the flags that should never influence the .machine selection.  */
   flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | OPTION_MASK_ISEL
-| OPTION_MASK_ALTIVEC);
+| OPTION_MASK_ALTIVEC | OPTION_MASK_BLOCK_OPS_VECTOR_PAIR);
 
   if ((flags & (FUTURE_MASKS_SERVER & ~ISA_3_1_MASKS_SERVER)) != 0)
 return "future";
diff --git a/gcc/testsuite/gcc.target/powerpc/future-3.c b/gcc/testsuite/gcc.target/powerpc/future-3.c
new file mode 100644
index ..1cbe9170f121
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/future-3.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Test to see that memcpy will use load/store vector pair with
+   -mcpu=future.  */
+
+#ifndef SIZE
+#define SIZE 4
+#endif
+
+extern vector double to[SIZE], from[SIZE];
+
+void
+copy (void)
+{
+  __builtin_memcpy (to, from, sizeof (to));
+  return;
+}
+
+/* { dg-final { scan-assembler {\mlxvpx?\M}  } } */
+/* { dg-final { scan-assembler {\mstxvpx?\M} } } */

[gcc(refs/users/meissner/heads/work187)] Add -mcpu=future tuning support.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:33234a6ecbffdaf90398618620d1e8a282bbccbc

commit 33234a6ecbffdaf90398618620d1e8a282bbccbc
Author: Michael Meissner 
Date:   Fri Nov 22 17:53:04 2024 -0500

Add -mcpu=future tuning support.

This patch makes -mtune=future use the same tuning decision as -mtune=power11.

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/power10.md (all reservations): Add future as an
alternative to power10 and power11.

Diff:
---
 gcc/config/rs6000/power10.md | 144 +--
 1 file changed, 72 insertions(+), 72 deletions(-)

diff --git a/gcc/config/rs6000/power10.md b/gcc/config/rs6000/power10.md
index 2310c4603457..e42b057dc45b 100644
--- a/gcc/config/rs6000/power10.md
+++ b/gcc/config/rs6000/power10.md
@@ -1,4 +1,4 @@
-;; Scheduling description for the IBM Power10 and Power11 processors.
+;; Scheduling description for the IBM Power10, Power11, and Future processors.
 ;; Copyright (C) 2020-2024 Free Software Foundation, Inc.
 ;;
 ;; Contributed by Pat Haugen (pthau...@us.ibm.com).
@@ -97,12 +97,12 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-fused-load" 4
   (and (eq_attr "type" "fused_load_cmpi,fused_addis_load,fused_load_load")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-load" 4
@@ -110,13 +110,13 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-load-update" 4
   (and (eq_attr "type" "load")
(eq_attr "update" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-fpload-double" 4
@@ -124,7 +124,7 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-fpload-double" 4
@@ -132,14 +132,14 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-double" 4
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "64")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; SFmode loads are cracked and have additional 3 cycles over DFmode
@@ -148,27 +148,27 @@
   (and (eq_attr "type" "fpload")
(eq_attr "update" "no")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-single" 7
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-vecload" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 ; lxvp
 (define_insn_reservation "power10-vecload-pair" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; Store Unit
@@ -178,12 +178,12 @@
(eq_attr "prefixed" "no")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,STU_power10")
 
 (define_insn_reservation "power10-fused-store" 0
   (and (eq_attr "type" "fused_store_store")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 (define_insn_reservation "power10-prefixed-store" 0
@@ -191,52 +191,52 @@
(eq_attr "prefixed" "yes")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 ; Update forms have 2 cycle latency for updat

[gcc(refs/users/meissner/heads/work187)] Change TARGET_CMPB to TARGET_POWER6.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:8c9b9796ca497d81e5e695ce977c803424338bd4

commit 8c9b9796ca497d81e5e695ce977c803424338bd4
Author: Michael Meissner 
Date:   Fri Nov 22 17:40:21 2024 -0500

Change TARGET_CMPB to TARGET_POWER6.

This patch changes TARGET_CMPB to TARGET_POWER6.  The -mcmpb switch is not being
changed, just the name of the macros used to determine if the PowerPC processor
supports ISA 2.05 (Power6).
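
As a sketch of what such a rename amounts to, the C fragment below aliases a
processor-named macro to the existing ISA-bit macro.  Everything here is a
stand-in: the flag value, the isa_flags variable, and the alias itself are
assumptions for illustration (the real TARGET_POWER6 definition is cut off
in this message), modelled on the existing TARGET_POWER5 TARGET_POPCNTB
line in rs6000.h.

```c
/* Hypothetical stand-ins for GCC's option bits; the real values and
   the real flags variable differ.  */
#define OPTION_MASK_CMPB (1u << 0)
static unsigned isa_flags;

/* The ISA-bit test that the -mcmpb switch controls.  */
#define TARGET_CMPB ((isa_flags & OPTION_MASK_CMPB) != 0)

/* Assumed alias: code tests the processor name, while the option
   still sets the same bit.  */
#define TARGET_POWER6 TARGET_CMPB

/* Mirrors the PARITY cost choice in the diff: 2 insns when the
   Power6 instructions are available, otherwise 6.  */
static int
parity_cost_insns (void)
{
  return TARGET_POWER6 ? 2 : 6;
}
```

With an alias like this the behavior is unchanged; only the spelling used at
each test site moves from the ISA bit to the processor name.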

2024-11-22  Michael Meissner  

gcc/

* gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported):
Change TARGET_CMPB to TARGET_POWER6.
* gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal):
Likewise.
(rs6000_rtx_costs): Likewise.
(rs6000_emit_parity): Likewise.
* gcc/config/rs6000/rs6000.h (TARGET_FCFID): Likewise.
(TARGET_LFIWAX): Likewise.
(TARGET_POWER6): New macro.
(TARGET_EXTRA_BUILTINS): Change TARGET_CMPB to TARGET_POWER6.
* gcc/config/rs6000/rs6000.md (enabled attribute): Likewise.
(parity2_cmp): Likewise.
(cmpb3): Likewise.
(copysign3): Likewise.
(copysign3_fcpsgn): Likewise.
(cmpstrnsi): Likewise.
(cmpstrsi): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc |  4 ++--
 gcc/config/rs6000/rs6000.cc |  8 
 gcc/config/rs6000/rs6000.h  |  7 ---
 gcc/config/rs6000/rs6000.md | 16 
 4 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 98a0545030cd..76421bd1de0b 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -157,9 +157,9 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins fncode)
 case ENB_P5:
   return TARGET_POWER5;
 case ENB_P6:
-  return TARGET_CMPB;
+  return TARGET_POWER6;
 case ENB_P6_64:
-  return TARGET_CMPB && TARGET_POWERPC64;
+  return TARGET_POWER6 && TARGET_POWERPC64;
 case ENB_P7:
   return TARGET_POPCNTD;
 case ENB_P7_64:
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index efab762d1d1d..3a1e41d69747 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3920,7 +3920,7 @@ rs6000_option_override_internal (bool global_init_p)
 rs6000_isa_flags |= (ISA_2_6_MASKS_EMBEDDED & ~ignore_masks);
   else if (TARGET_DFP)
 rs6000_isa_flags |= (ISA_2_5_MASKS_SERVER & ~ignore_masks);
-  else if (TARGET_CMPB)
+  else if (TARGET_POWER6)
 rs6000_isa_flags |= (ISA_2_5_MASKS_EMBEDDED & ~ignore_masks);
   else if (TARGET_POWER5X)
 rs6000_isa_flags |= (ISA_2_4_MASKS & ~ignore_masks);
@@ -4795,7 +4795,7 @@ rs6000_option_override_internal (bool global_init_p)
  DERAT mispredict penalty.  However the LVE and STVE altivec instructions
  need indexed accesses and the type used is the scalar type of the element
  being loaded or stored.  */
-TARGET_AVOID_XFORM = (rs6000_tune == PROCESSOR_POWER6 && TARGET_CMPB
+TARGET_AVOID_XFORM = (rs6000_tune == PROCESSOR_POWER6 && TARGET_POWER6
  && !TARGET_ALTIVEC);
 
   /* Set the -mrecip options.  */
@@ -22352,7 +22352,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
   return false;
 
 case PARITY:
-  *total = COSTS_N_INSNS (TARGET_CMPB ? 2 : 6);
+  *total = COSTS_N_INSNS (TARGET_POWER6 ? 2 : 6);
   return false;
 
 case NOT:
@@ -23179,7 +23179,7 @@ rs6000_emit_parity (rtx dst, rtx src)
   tmp = gen_reg_rtx (mode);
 
   /* Use the PPC ISA 2.05 prtyw/prtyd instruction if we can.  */
-  if (TARGET_CMPB)
+  if (TARGET_POWER6)
 {
   if (mode == SImode)
{
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 9ef4a73d2739..2bd7b8bcbdcf 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -449,12 +449,12 @@ extern int rs6000_vector_align[];
 #define TARGET_FCFID   (TARGET_POWERPC64   \
 || TARGET_PPC_GPOPT/* 970/power4 */\
 || TARGET_POWER5   /* ISA 2.02 */  \
-|| TARGET_CMPB /* ISA 2.05 */  \
+|| TARGET_POWER6   /* ISA 2.05 */  \
 || TARGET_POPCNTD) /* ISA 2.06 */
 
 #define TARGET_FCTIDZ  TARGET_FCFID
 #define TARGET_STFIWX  TARGET_PPC_GFXOPT
-#define TARGET_LFIWAX  TARGET_CMPB
+#define TARGET_LFIWAX  TARGET_POWER6
 #define TARGET_LFIWZX  TARGET_POPCNTD
 #define TARGET_FCFIDS  TARGET_POPCNTD
 #define TARGET_FCFIDU  TARGET_POPCNTD
@@ -502,6 +502,7 @@ extern int rs6000_vector_align[];
 /* Convert ISA bits like POPCNTB to PowerPC processors like POWER5.  */
 #define TARGET_POWER5  TARGET_POPCNTB
 #define TARGET_POWER5X TARGET_FPRND
+#define TARGET_POWER6

[gcc(refs/users/meissner/heads/work187)] Update ChangeLog.*

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:ead1c9fcc5bdf39fc4100d00761f8537f66c52d5

commit ead1c9fcc5bdf39fc4100d00761f8537f66c52d5
Author: Michael Meissner 
Date:   Fri Nov 22 18:26:01 2024 -0500

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.meissner | 435 +
 1 file changed, 435 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 0cbeebfbc31e..cd1ab357023e 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,5 +1,440 @@
+ Branch work187, patch #31 
+
+Use architecture flags for defining _ARCH_PWR macros.
+
+For the newer architectures, this patch changes GCC to define the _ARCH_PWR
+macros using the new architecture flags instead of relying on isa options like
+-mpower10.
+
+The -mpower8-internal, -mpower10, -mpower11, and -mfuture options were removed.
+The -mpower11 and -mfuture options were removed completely, since they were just
+added in GCC 15. The other two options were marked as WarnRemoved, and the
+various ISA bits were removed.
+
+TARGET_POWER8, TARGET_POWER10, TARGET_POWER11, and TARGET_FUTURE were re-defined
+to use the architeture bits instead of the ISA bits.
+
+There are other internal isa bits that aren't removed with this patch because
+the built-in function support uses those bits.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+Can I install this patch on the GCC 15 trunk?
+
+2024-11-22  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros) Add support to
+   use architecture flags instead of ISA flags for setting most of the
+   _ARCH_PWR* macros.
+   (rs6000_cpu_cpp_builtins): Update rs6000_target_modify_macros call.
+   * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Remove
+   OPTION_MASK_POWER8.
+   (ISA_3_1_MASKS_SERVER): Remove OPTION_MASK_POWER10.
+   (POWER11_MASKS_SERVER): Remove OPTION_MASK_POWER11.
+   (FUTURE_MASKS_SERVER): Remove OPTION_MASK_FUTURE.
+   (POWERPC_MASKS): Remove OPTION_MASK_POWER8, OPTION_MASK_POWER10,
+   OPTION_MASK_POWER11, and OPTION_MASK_FUTURE.
+   * config/rs6000/rs6000-protos.h (rs6000_target_modify_macros): Update
+   declaration.
+   (rs6000_target_modify_macros_ptr): Likewise.
+   * config/rs6000/rs6000.cc (rs6000_target_modify_macros_ptr): Likewise.
+   (rs6000_option_override_internal): Use architecture flags instead of ISA
+   flags.
+   (rs6000_opt_masks): Remove -mpower10, -mpower11, and -mfuture which are
+   no longer in the ISA flags.
+   (rs6000_pragma_target_parse): Use architecture flags as well as ISA
+   flags.
+   * config/rs6000/rs6000.h (TARGET_POWER5): Redefine to use architecture
+   flags.
+   (TARGET_POWER5X): Likewise.
+   (TARGET_POWER6): Likewise.
+   (TARGET_POWER7): Likewise.
+   (TARGET_POWER8): Likewise.
+   (TARGET_POWER9): Likewise.
+   (TARGET_POWER10): New macro.
+   (TARGET_POWER11): Likewise.
+   (TARGET_FUTURE): Likewise.
+   * config/rs6000/rs6000.opt (-mpower8-internal): Remove ISA flag bits.
+   (-mpower10): Likewise.
+   (-mpower11): Likewise.
+   (-mfuture): Likewise.
+
+ Branch work187, patch #30 
+
+Add rs6000 architecture masks.
+
+This patch begins the journey to move architecture bits that are not user ISA
+options from rs6000_isa_flags to a new targt variable rs6000_arch_flags.  The
+intention is to remove switches that are currently isa options, but the user
+should not be using this particular option. For example, we want users to use
+-mcpu=power10 and not just -mpower10.
+
+This patch also changes the target_clones support to use an architecture mask
+instead of isa bits.
+
+This patch also switches the handling of .machine to use architecture masks if
+they exist (power4 through power11).  All of the other PowerPCs will continue to
+use the existing code for setting the .machine option.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every archiecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+The only difference in this patch compared to the first version posted on
+November 6th is that I the correct attribution and copyright year (i.e. that I
+created rs6000-arch.def in 2024).
+
+Can I install this patch on the GCC 15 trunk?
+
+2024-11-22  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/default64.h (TARGET_CPU_DEFAULT): Set default cpu name.
+   * config/rs6000/rs6000-arch.def: New file.
+   * conf

[gcc(refs/users/meissner/heads/work187)] Change TARGET_POPCNTD to TARGET_POWER7.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:e1f8abc41e7addce1ba62f5d79ee8dfef6dcc85c

commit e1f8abc41e7addce1ba62f5d79ee8dfef6dcc85c
Author: Michael Meissner 
Date:   Fri Nov 22 17:41:22 2024 -0500

Change TARGET_POPCNTD to TARGET_POWER7.

This patch changes TARGET_POPCNTD to TARGET_POWER7.  The -mpopcntd switch is not
being changed, just the name of the macros used to determine if the PowerPC
processor supports ISA 2.06 (Power7).

2024-11-22  Michael Meissner  

gcc/

* gcc/config/rs6000/dfp.md (cmp_internal1): Change TARGET_POPCNTD
to TARGET_POWER7.
* gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported):
Likewise.
* gcc/config/rs6000/rs6000-string.cc (expand_block_compare): Likewise.
* gcc/config/rs6000/rs6000.cc (rs6000_hard_regno_mode_ok_uncached):
Likewise.
(rs6000_option_override_internal): Likewise.
(rs6000_rtx_costs): Likewise.
* gcc/config/rs6000/rs6000.h (TARGET_LDBRX): Likewise.
(TARGET_FCFID): Likewise.
(TARGET_LFIWZX): Likewise.
(TARGET_FCFIDS): Likewise.
(TARGET_FCFIDU): Likewise.
(TARGET_FCFIDUS): Likewise.
(TARGET_FCTIDUZ): Likewise.
(TARGET_FCTIWUZ): Likewise.
(TARGET_FCTIDUZ): Likewise.
(TARGET_POWER7): New macro.
(TARGET_EXTRA_BUILTINS): Change TARGET_POPCNTD to TARGET_POWER7.
(CTZ_DEFINED_VALUE_AT_ZERO): Likewise.
* gcc/config/rs6000/rs6000.md (enabled attribute): Likewise.
(lrintsi2): Likewise.
(lrintsi): Likewise.
(lrintsi_di): Likewise.
(cmpmemsi): Likewise.
(bpermd_): Likewise.
(addg6s): Likewise.
(cdtbcd): Likewise.
(cbcdtd): Likewise.
(div_): Likewise.

Diff:
---
 gcc/config/rs6000/dfp.md|  2 +-
 gcc/config/rs6000/rs6000-builtin.cc |  4 ++--
 gcc/config/rs6000/rs6000-string.cc  |  2 +-
 gcc/config/rs6000/rs6000.cc |  8 
 gcc/config/rs6000/rs6000.h  | 21 +++--
 gcc/config/rs6000/rs6000.md | 20 ++--
 6 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md
index fa9d7dd45dd3..b8189390d410 100644
--- a/gcc/config/rs6000/dfp.md
+++ b/gcc/config/rs6000/dfp.md
@@ -214,7 +214,7 @@
 (define_insn "floatdidd2"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
(float:DD (match_operand:DI 1 "gpc_reg_operand" "d")))]
-  "TARGET_DFP && TARGET_POPCNTD"
+  "TARGET_DFP && TARGET_POWER7"
   "dcffix %0,%1"
   [(set_attr "type" "dfp")])
 
diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 76421bd1de0b..dae43b672ea7 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -161,9 +161,9 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins fncode)
 case ENB_P6_64:
   return TARGET_POWER6 && TARGET_POWERPC64;
 case ENB_P7:
-  return TARGET_POPCNTD;
+  return TARGET_POWER7;
 case ENB_P7_64:
-  return TARGET_POPCNTD && TARGET_POWERPC64;
+  return TARGET_POWER7 && TARGET_POWERPC64;
 case ENB_P8:
   return TARGET_POWER8;
 case ENB_P8V:
diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc
index de618da9b5dc..b633d80110d0 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -1949,7 +1949,7 @@ bool
 expand_block_compare (rtx operands[])
 {
   /* TARGET_POPCNTD is already guarded at expand cmpmemsi.  */
-  gcc_assert (TARGET_POPCNTD);
+  gcc_assert (TARGET_POWER7);
 
   /* For P8, this case is complicated to handle because the subtract
  with carry instructions do not generate the 64-bit carry and so
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 3a1e41d69747..4a7880ee9e1b 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1922,7 +1922,7 @@ rs6000_hard_regno_mode_ok_uncached (int regno, machine_mode mode)
  if(GET_MODE_SIZE (mode) == UNITS_PER_FP_WORD)
return 1;
 
- if (TARGET_POPCNTD && mode == SImode)
+ if (TARGET_POWER7 && mode == SImode)
return 1;
 
  if (TARGET_P9_VECTOR && (mode == QImode || mode == HImode))
@@ -3916,7 +3916,7 @@ rs6000_option_override_internal (bool global_init_p)
 rs6000_isa_flags |= (ISA_2_7_MASKS_SERVER & ~ignore_masks);
   else if (TARGET_VSX)
 rs6000_isa_flags |= (ISA_2_6_MASKS_SERVER & ~ignore_masks);
-  else if (TARGET_POPCNTD)
+  else if (TARGET_POWER7)
 rs6000_isa_flags |= (ISA_2_6_MASKS_EMBEDDED & ~ignore_masks);
   else if (TARGET_DFP)
 rs6000_isa_flags |= (ISA_2_5_MASKS_SERVER & ~ignore_masks);
@@ -4129,7 +4129,7 @@ rs6000_option_override_internal (bool global_init_p)
   else if (TARGET_LONG_DOUBLE_128)

[gcc(refs/users/meissner/heads/work187)] Add -mcpu=future tuning support.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:121e4ad87729cf095cec726d79dd7ca4bc708a75

commit 121e4ad87729cf095cec726d79dd7ca4bc708a75
Author: Michael Meissner 
Date:   Fri Nov 22 17:47:22 2024 -0500

Add -mcpu=future tuning support.

This patch makes -mtune=future use the same tuning decision as -mtune=power11.

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/power10.md (all reservations): Add future as an
alternative to power10 and power11.

Diff:
---
 gcc/config/rs6000/power10.md | 144 +--
 1 file changed, 72 insertions(+), 72 deletions(-)

diff --git a/gcc/config/rs6000/power10.md b/gcc/config/rs6000/power10.md
index 2310c4603457..e42b057dc45b 100644
--- a/gcc/config/rs6000/power10.md
+++ b/gcc/config/rs6000/power10.md
@@ -1,4 +1,4 @@
-;; Scheduling description for the IBM Power10 and Power11 processors.
+;; Scheduling description for the IBM Power10, Power11, and Future processors.
 ;; Copyright (C) 2020-2024 Free Software Foundation, Inc.
 ;;
 ;; Contributed by Pat Haugen (pthau...@us.ibm.com).
@@ -97,12 +97,12 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-fused-load" 4
   (and (eq_attr "type" "fused_load_cmpi,fused_addis_load,fused_load_load")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-load" 4
@@ -110,13 +110,13 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-load-update" 4
   (and (eq_attr "type" "load")
(eq_attr "update" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-fpload-double" 4
@@ -124,7 +124,7 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-fpload-double" 4
@@ -132,14 +132,14 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-double" 4
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "64")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; SFmode loads are cracked and have additional 3 cycles over DFmode
@@ -148,27 +148,27 @@
   (and (eq_attr "type" "fpload")
(eq_attr "update" "no")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-single" 7
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-vecload" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 ; lxvp
 (define_insn_reservation "power10-vecload-pair" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; Store Unit
@@ -178,12 +178,12 @@
(eq_attr "prefixed" "no")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,STU_power10")
 
 (define_insn_reservation "power10-fused-store" 0
   (and (eq_attr "type" "fused_store_store")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 (define_insn_reservation "power10-prefixed-store" 0
@@ -191,52 +191,52 @@
(eq_attr "prefixed" "yes")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 ; Update forms have 2 cycle latency for updat

[gcc(refs/users/meissner/heads/work187)] Add support for -mcpu=future

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:90092f48048b5565a5314b3e620d2ef1485788a3

commit 90092f48048b5565a5314b3e620d2ef1485788a3
Author: Michael Meissner 
Date:   Fri Nov 22 17:46:14 2024 -0500

Add support for -mcpu=future

This patch adds the support that can be used in developing GCC support for
future PowerPC processors.

2024-11-22  Michael Meissner  

* config.gcc (powerpc*-*-*): Add support for --with-cpu=future.
* config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=future.
* config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/driver-rs6000.cc (asm_names): Likewise.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): If
-mcpu=future, define _ARCH_FUTURE.
* config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): New macro.
(POWERPC_MASKS): Add OPTION_MASK_FUTURE.
(future cpu): Define.
* config/rs6000/rs6000-opts.h (enum processor_type): Add
PROCESSOR_FUTURE.
* config/rs6000/rs6000-tables.opt: Regenerate.
* config/rs6000/rs6000.cc (power10_cost): Update comment.
(get_arch_flags): Add support for future processor.
(rs6000_option_override_internal): Likewise.
(rs6000_machine_from_flags): Likewise.
(rs6000_reassociation_width): Likewise.
(rs6000_adjust_cost): Likewise.
(rs6000_issue_rate): Likewise.
(rs6000_sched_reorder): Likewise.
(rs6000_sched_reorder2): Likewise.
(rs6000_register_move_cost): Likewise.
(rs6000_opt_masks): Add -mfuture.
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/rs6000.md (cpu attribute): Likewise.
* config/rs6000/rs6000.opt (-mfuture): New internal option.
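
As a rough illustration of how user code might consume the `_ARCH_FUTURE` macro named in the ChangeLog above, the sketch below checks it alongside the existing `_ARCH_PWR10` and `_ARCH_PWR9` macros.  The function name `powerpc_arch_name` and the returned strings are hypothetical and not part of the patch; the checks run from newest to oldest on the assumption that the `_ARCH_*` macros are cumulative.

```c
/* Hypothetical helper, not part of the patch: report the newest PowerPC
   architecture level the compiler says it is targeting.  The checks run
   from newest to oldest because the _ARCH_* macros are assumed to be
   cumulative (a newer -mcpu also defines the older macros).  */
static const char *
powerpc_arch_name (void)
{
#if defined (_ARCH_FUTURE)
  return "future";
#elif defined (_ARCH_PWR10)
  return "power10 or newer";
#elif defined (_ARCH_PWR9)
  return "power9";
#elif defined (__powerpc__) || defined (__PPC__)
  return "older powerpc";
#else
  return "not powerpc";
#endif
}
```

Compiled with -mcpu=future on a compiler carrying this patch, the first branch would presumably be taken; on any other target the function simply falls through to one of the later branches.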

Diff:
---
 gcc/config.gcc  |  4 ++--
 gcc/config/rs6000/aix71.h   |  1 +
 gcc/config/rs6000/aix72.h   |  1 +
 gcc/config/rs6000/aix73.h   |  1 +
 gcc/config/rs6000/driver-rs6000.cc  |  2 ++
 gcc/config/rs6000/rs6000-c.cc   |  2 ++
 gcc/config/rs6000/rs6000-cpus.def   |  5 +
 gcc/config/rs6000/rs6000-opts.h |  1 +
 gcc/config/rs6000/rs6000-tables.opt | 11 +++
 gcc/config/rs6000/rs6000.cc | 30 ++
 gcc/config/rs6000/rs6000.h  |  1 +
 gcc/config/rs6000/rs6000.md |  2 +-
 gcc/config/rs6000/rs6000.opt|  6 ++
 13 files changed, 52 insertions(+), 15 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index c20817487457..ea939bdef14b 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -541,7 +541,7 @@ powerpc*-*-*)
extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h si2vmx.h"
extra_headers="${extra_headers} amo.h"
case x$with_cpu in
-   xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower1[01]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500)
+   xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower1[01]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500|xfuture)
cpu_is_64bit=yes
;;
esac
@@ -5650,7 +5650,7 @@ case "${target}" in
tm_defines="${tm_defines} CONFIG_PPC405CR"
eval "with_$which=405"
;;
-   "" | common | native \
+   "" | common | native | future \
| power[3456789] | power1[01] | power5+ | power6x \
| powerpc | powerpc64 | powerpc64le \
| rs64 \
diff --git a/gcc/config/rs6000/aix71.h b/gcc/config/rs6000/aix71.h
index 4350dcd89524..505986b33d63 100644
--- a/gcc/config/rs6000/aix71.h
+++ b/gcc/config/rs6000/aix71.h
@@ -79,6 +79,7 @@ do {  \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/aix72.h b/gcc/config/rs6000/aix72.h
index fe59f8319b48..242ca94bd065 100644
--- a/gcc/config/rs6000/aix72.h
+++ b/gcc/config/rs6000/aix72.h
@@ -79,6 +79,7 @@ do {  \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/aix73.h b/gcc/config/rs6000/aix73.h
index 1318b0b3662d..2bd6b4bb3c4f 100644
--- a/gcc/config/rs6000/aix73.h
+++ b/gcc/config/rs6000/aix73.h
@@ -79,6 +79,7 @@ do {  \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=nati

[gcc(refs/users/meissner/heads/work187)] Add support for -mcpu=future

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:1f083f59766d9472e6a41e7c7084e26f4a3091b4

commit 1f083f59766d9472e6a41e7c7084e26f4a3091b4
Author: Michael Meissner 
Date:   Fri Nov 22 17:55:45 2024 -0500

Add support for -mcpu=future

This patch adds the support that can be used in developing GCC support for
future PowerPC processors.

2024-11-22  Michael Meissner  

* config.gcc (powerpc*-*-*): Add support for --with-cpu=future.
* config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=future.
* config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/driver-rs6000.cc (asm_names): Likewise.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): If
-mcpu=future, define _ARCH_FUTURE.
* config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): New macro.
(POWERPC_MASKS): Add OPTION_MASK_FUTURE.
(future cpu): Define.
* config/rs6000/rs6000-opts.h (enum processor_type): Add
PROCESSOR_FUTURE.
* config/rs6000/rs6000-tables.opt: Regenerate.
* config/rs6000/rs6000.cc (power10_cost): Update comment.
(get_arch_flags): Add support for future processor.
(rs6000_option_override_internal): Likewise.
(rs6000_machine_from_flags): Likewise.
(rs6000_reassociation_width): Likewise.
(rs6000_adjust_cost): Likewise.
(rs6000_issue_rate): Likewise.
(rs6000_sched_reorder): Likewise.
(rs6000_sched_reorder2): Likewise.
(rs6000_register_move_cost): Likewise.
(rs6000_opt_masks): Add -mfuture.
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/rs6000.md (cpu attribute): Likewise.
* config/rs6000/rs6000.opt (-mfuture): New internal option.

Diff:
---
 gcc/config.gcc  |  4 ++--
 gcc/config/rs6000/aix71.h   |  1 +
 gcc/config/rs6000/aix72.h   |  1 +
 gcc/config/rs6000/aix73.h   |  1 +
 gcc/config/rs6000/driver-rs6000.cc  |  2 ++
 gcc/config/rs6000/rs6000-c.cc   |  2 ++
 gcc/config/rs6000/rs6000-cpus.def   |  5 +
 gcc/config/rs6000/rs6000-opts.h |  1 +
 gcc/config/rs6000/rs6000-tables.opt | 11 +++
 gcc/config/rs6000/rs6000.cc | 30 ++
 gcc/config/rs6000/rs6000.h  |  1 +
 gcc/config/rs6000/rs6000.md |  2 +-
 gcc/config/rs6000/rs6000.opt|  6 ++
 13 files changed, 52 insertions(+), 15 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index c20817487457..ea939bdef14b 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -541,7 +541,7 @@ powerpc*-*-*)
extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h si2vmx.h"
extra_headers="${extra_headers} amo.h"
case x$with_cpu in
-   xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower1[01]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500)
+   xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower1[01]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500|xfuture)
cpu_is_64bit=yes
;;
esac
@@ -5650,7 +5650,7 @@ case "${target}" in
tm_defines="${tm_defines} CONFIG_PPC405CR"
eval "with_$which=405"
;;
-   "" | common | native \
+   "" | common | native | future \
| power[3456789] | power1[01] | power5+ | power6x \
| powerpc | powerpc64 | powerpc64le \
| rs64 \
diff --git a/gcc/config/rs6000/aix71.h b/gcc/config/rs6000/aix71.h
index 4350dcd89524..505986b33d63 100644
--- a/gcc/config/rs6000/aix71.h
+++ b/gcc/config/rs6000/aix71.h
@@ -79,6 +79,7 @@ do {  \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/aix72.h b/gcc/config/rs6000/aix72.h
index fe59f8319b48..242ca94bd065 100644
--- a/gcc/config/rs6000/aix72.h
+++ b/gcc/config/rs6000/aix72.h
@@ -79,6 +79,7 @@ do {  \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/aix73.h b/gcc/config/rs6000/aix73.h
index 1318b0b3662d..2bd6b4bb3c4f 100644
--- a/gcc/config/rs6000/aix73.h
+++ b/gcc/config/rs6000/aix73.h
@@ -79,6 +79,7 @@ do {  \
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=nati

[gcc(refs/users/meissner/heads/work187)] Add -mcpu=future tuning support.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:f850d8d4d3b53454a9b25c848d1c25b046df2300

commit f850d8d4d3b53454a9b25c848d1c25b046df2300
Author: Michael Meissner 
Date:   Fri Nov 22 17:56:48 2024 -0500

Add -mcpu=future tuning support.

This patch makes -mtune=future use the same tuning decision as -mtune=power11.

2024-11-22  Michael Meissner  

gcc/

* config/rs6000/power10.md (all reservations): Add future as an
alternative to power10 and power11.

Diff:
---
 gcc/config/rs6000/power10.md | 144 +--
 1 file changed, 72 insertions(+), 72 deletions(-)

diff --git a/gcc/config/rs6000/power10.md b/gcc/config/rs6000/power10.md
index 2310c4603457..e42b057dc45b 100644
--- a/gcc/config/rs6000/power10.md
+++ b/gcc/config/rs6000/power10.md
@@ -1,4 +1,4 @@
-;; Scheduling description for the IBM Power10 and Power11 processors.
+;; Scheduling description for the IBM Power10, Power11, and Future processors.
 ;; Copyright (C) 2020-2024 Free Software Foundation, Inc.
 ;;
 ;; Contributed by Pat Haugen (pthau...@us.ibm.com).
@@ -97,12 +97,12 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-fused-load" 4
   (and (eq_attr "type" "fused_load_cmpi,fused_addis_load,fused_load_load")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-load" 4
@@ -110,13 +110,13 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-load-update" 4
   (and (eq_attr "type" "load")
(eq_attr "update" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-fpload-double" 4
@@ -124,7 +124,7 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-fpload-double" 4
@@ -132,14 +132,14 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-double" 4
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "64")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; SFmode loads are cracked and have additional 3 cycles over DFmode
@@ -148,27 +148,27 @@
   (and (eq_attr "type" "fpload")
(eq_attr "update" "no")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-single" 7
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-vecload" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 ; lxvp
 (define_insn_reservation "power10-vecload-pair" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; Store Unit
@@ -178,12 +178,12 @@
(eq_attr "prefixed" "no")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,STU_power10")
 
 (define_insn_reservation "power10-fused-store" 0
   (and (eq_attr "type" "fused_store_store")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 (define_insn_reservation "power10-prefixed-store" 0
@@ -191,52 +191,52 @@
(eq_attr "prefixed" "yes")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 ; Update forms have 2 cycle latency for updat

[gcc(refs/users/meissner/heads/work187-dmf)] RFC2653-Add dense math test for new instruction names.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:4567b563956ce11dddb166dcc60ee9b923105bc0

commit 4567b563956ce11dddb166dcc60ee9b923105bc0
Author: Michael Meissner 
Date:   Fri Nov 22 18:39:34 2024 -0500

RFC2653-Add dense math test for new instruction names.

2024-11-22   Michael Meissner  

gcc/testsuite/

* gcc.target/powerpc/dm-double-test.c: New test.
* lib/target-supports.exp (check_effective_target_ppc_dmr_ok): New
target test.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/dm-double-test.c | 194 ++
 gcc/testsuite/lib/target-supports.exp |  23 +++
 2 files changed, 217 insertions(+)

diff --git a/gcc/testsuite/gcc.target/powerpc/dm-double-test.c b/gcc/testsuite/gcc.target/powerpc/dm-double-test.c
new file mode 100644
index ..66c197795856
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/dm-double-test.c
@@ -0,0 +1,194 @@
+/* Test derived from mma-double-1.c, modified for dense math.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_dense_math_ok } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+#include 
+#include 
+#include 
+
+typedef unsigned char vec_t __attribute__ ((vector_size (16)));
+typedef double v4sf_t __attribute__ ((vector_size (16)));
+#define SAVE_ACC(ACC, ldc, J)  \
+ __builtin_mma_disassemble_acc (result, ACC); \
+ rowC = (v4sf_t *) &CO[0*ldc+J]; \
+  rowC[0] += result[0]; \
+  rowC = (v4sf_t *) &CO[1*ldc+J]; \
+  rowC[0] += result[1]; \
+  rowC = (v4sf_t *) &CO[2*ldc+J]; \
+  rowC[0] += result[2]; \
+  rowC = (v4sf_t *) &CO[3*ldc+J]; \
+ rowC[0] += result[3];
+
+void
+DM (int m, int n, int k, double *A, double *B, double *C)
+{
+  __vector_quad acc0, acc1, acc2, acc3, acc4, acc5, acc6, acc7;
+  v4sf_t result[4];
+  v4sf_t *rowC;
+  for (int l = 0; l < n; l += 4)
+{
+  double *CO;
+  double *AO;
+  AO = A;
+  CO = C;
+  C += m * 4;
+  for (int j = 0; j < m; j += 16)
+   {
+ double *BO = B;
+ __builtin_mma_xxsetaccz (&acc0);
+ __builtin_mma_xxsetaccz (&acc1);
+ __builtin_mma_xxsetaccz (&acc2);
+ __builtin_mma_xxsetaccz (&acc3);
+ __builtin_mma_xxsetaccz (&acc4);
+ __builtin_mma_xxsetaccz (&acc5);
+ __builtin_mma_xxsetaccz (&acc6);
+ __builtin_mma_xxsetaccz (&acc7);
+ unsigned long i;
+
+ for (i = 0; i < k; i++)
+   {
+ vec_t *rowA = (vec_t *) & AO[i * 16];
+ __vector_pair rowB;
+ vec_t *rb = (vec_t *) & BO[i * 4];
+ __builtin_mma_assemble_pair (&rowB, rb[1], rb[0]);
+ __builtin_mma_xvf64gerpp (&acc0, rowB, rowA[0]);
+ __builtin_mma_xvf64gerpp (&acc1, rowB, rowA[1]);
+ __builtin_mma_xvf64gerpp (&acc2, rowB, rowA[2]);
+ __builtin_mma_xvf64gerpp (&acc3, rowB, rowA[3]);
+ __builtin_mma_xvf64gerpp (&acc4, rowB, rowA[4]);
+ __builtin_mma_xvf64gerpp (&acc5, rowB, rowA[5]);
+ __builtin_mma_xvf64gerpp (&acc6, rowB, rowA[6]);
+ __builtin_mma_xvf64gerpp (&acc7, rowB, rowA[7]);
+   }
+ SAVE_ACC (&acc0, m, 0);
+ SAVE_ACC (&acc2, m, 4);
+ SAVE_ACC (&acc1, m, 2);
+ SAVE_ACC (&acc3, m, 6);
+ SAVE_ACC (&acc4, m, 8);
+ SAVE_ACC (&acc6, m, 12);
+ SAVE_ACC (&acc5, m, 10);
+ SAVE_ACC (&acc7, m, 14);
+ AO += k * 16;
+ BO += k * 4;
+ CO += 16;
+   }
+  B += k * 4;
+}
+}
+
+void
+init (double *matrix, int row, int column)
+{
+  for (int j = 0; j < column; j++)
+{
+  for (int i = 0; i < row; i++)
+   {
+ matrix[j * row + i] = (i * 16 + 2 + j) / 0.123;
+   }
+}
+}
+
+void
+init0 (double *matrix, double *matrix1, int row, int column)
+{
+  for (int j = 0; j < column; j++)
+for (int i = 0; i < row; i++)
+  matrix[j * row + i] = matrix1[j * row + i] = 0;
+}
+
+
+void
+print (const char *name, const double *matrix, int row, int column)
+{
+  printf ("Matrix %s has %d rows and %d columns:\n", name, row, column);
+  for (int i = 0; i < row; i++)
+{
+  for (int j = 0; j < column; j++)
+   {
+ printf ("%f ", matrix[j * row + i]);
+   }
+  printf ("\n");
+}
+  printf ("\n");
+}
+
+int
+main (int argc, char *argv[])
+{
+  int rowsA, colsB, common;
+  int i, j, k;
+  int ret = 0;
+
+  for (int t = 16; t <= 128; t += 16)
+{
+  for (int t1 = 4; t1 <= 16; t1 += 4)
+   {
+ rowsA = t;
+ colsB = t1;
+ common = 1;
+ /* printf ("Running test for rows = %d,cols = %d\n", t, t1); */
+ double A[rowsA * common];
+ double B[common * colsB];
+ double C[rowsA * colsB];
+ double D[rowsA * colsB];
+
+
+ init (A, rowsA, common);
+ init (B, common, colsB);
+ init0 (C, D, rowsA, colsB);
+ DM (rowsA, colsB, common, A, B

[gcc(refs/users/meissner/heads/work187-dmf)] RFC2656-Support load/store vector with right length.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:07bf9011e5fecd2aba4cbc4d949bfbb5f6ceff90

commit 07bf9011e5fecd2aba4cbc4d949bfbb5f6ceff90
Author: Michael Meissner 
Date:   Fri Nov 22 18:44:07 2024 -0500

RFC2656-Support load/store vector with right length.

This patch adds support for new instructions that may be added to the PowerPC
architecture in the future to enhance the load and store vector with length
instructions.

The current instructions (lxvl, lxvll, stxvl, and stxvll) are inconvenient to use
since the count for the number of bytes must be in the top 8 bits of the GPR
register, instead of the bottom 8 bits.  This meant that code generating these
instructions typically had to do a shift left by 56 bits to get the count into
the right position.  In a future version of the PowerPC architecture, new
variants of these instructions might be added that expect the count to be in
the bottom 8 bits of the GPR register.  These patches add this support to GCC
if the user uses the -mcpu=future option.
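
The difference in where the byte count lives can be modelled in plain C.  The sketch below is only an illustration: the helper names `count_seen_by_lxvl` and `count_seen_by_lxvrl` are hypothetical, and the code models which bits of the register each instruction form is described as reading, not the instructions themselves.

```c
#include <stdint.h>

/* Hypothetical model: the existing lxvl/stxvl forms take the byte count
   from the top 8 bits (bits 56..63) of the 64-bit length register.  */
static inline unsigned
count_seen_by_lxvl (uint64_t reg)
{
  return (unsigned) (reg >> 56);
}

/* Hypothetical model: the possible future "right length" forms would take
   the byte count from the bottom 8 bits of the same register.  */
static inline unsigned
count_seen_by_lxvrl (uint64_t reg)
{
  return (unsigned) (reg & 0xff);
}

/* The extra step current code needs: move a plain byte count into the
   position the existing instructions expect.  */
static inline uint64_t
shift_count_for_lxvl (uint64_t nbytes)
{
  return nbytes << 56;
}
```

With a plain count of 16 in the register, the top-byte model reads 0 and the bottom-byte model reads 16; only after the shift left by 56 does the top-byte model read 16, which is the shift the new forms would make unnecessary.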

I discovered that the code in rs6000-string.cc to generate ISA 3.1 lxvl/stxvl
future lxvll/stxvll instructions would generate these instructions on 32-bit.
However, the patterns for these instructions are only defined on 64-bit systems,
so I added a check for 64-bit support before generating the instructions.

The patches have been tested on both little and big endian systems.  Can I check
it into the master branch?

2024-11-22   Michael Meissner  

gcc/

* config/rs6000/rs6000-string.cc (expand_block_move): Do not generate
lxvl and stxvl on 32-bit.
* config/rs6000/vsx.md (lxvl): If -mcpu=future, generate the lxvl with
the shift count automatically used in the insn.
(lxvrl): New insn for -mcpu=future.
(lxvrll): Likewise.
(stxvl): If -mcpu=future, generate the stxvl with the shift count
automatically used in the insn.
(stxvrl): New insn for -mcpu=future.
(stxvrll): Likewise.

gcc/testsuite/

* gcc.target/powerpc/lxvrl.c: New test.
* lib/target-supports.exp (check_effective_target_powerpc_future_ok):
New effective target.

Diff:
---
 gcc/config/rs6000/rs6000-string.cc   |   1 +
 gcc/config/rs6000/vsx.md | 122 +--
 gcc/testsuite/gcc.target/powerpc/lxvrl.c |  32 
 gcc/testsuite/lib/target-supports.exp|  12 +++
 4 files changed, 146 insertions(+), 21 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc
index b633d80110d0..afcbe4fef657 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -2786,6 +2786,7 @@ expand_block_move (rtx operands[], bool might_overlap)
 
   if (TARGET_MMA && TARGET_BLOCK_OPS_UNALIGNED_VSX
  && TARGET_BLOCK_OPS_VECTOR_PAIR
+ && TARGET_POWERPC64
  && bytes >= 32
  && (align >= 256 || !STRICT_ALIGNMENT))
{
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f4f7113f5fe8..43c10a1b0970 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5710,20 +5710,32 @@
   DONE;
 })
 
-;; Load VSX Vector with Length
+;; Load VSX Vector with Length.  If we have lxvrl, we don't have to do an
+;; explicit shift left into a pseudo.
 (define_expand "lxvl"
-  [(set (match_dup 3)
-(ashift:DI (match_operand:DI 2 "register_operand")
-   (const_int 56)))
-   (set (match_operand:V16QI 0 "vsx_register_operand")
-   (unspec:V16QI
-[(match_operand:DI 1 "gpc_reg_operand")
-  (mem:V16QI (match_dup 1))
- (match_dup 3)]
-UNSPEC_LXVL))]
+  [(use (match_operand:V16QI 0 "vsx_register_operand"))
+   (use (match_operand:DI 1 "gpc_reg_operand"))
+   (use (match_operand:DI 2 "gpc_reg_operand"))]
   "TARGET_P9_VECTOR && TARGET_64BIT"
 {
-  operands[3] = gen_reg_rtx (DImode);
+  rtx shift_len = gen_rtx_ASHIFT (DImode, operands[2], GEN_INT (56));
+  rtx len;
+
+  if (TARGET_FUTURE)
+len = shift_len;
+  else
+{
+  len = gen_reg_rtx (DImode);
+  emit_insn (gen_rtx_SET (len, shift_len));
+}
+
+  rtx dest = operands[0];
+  rtx addr = operands[1];
+  rtx mem = gen_rtx_MEM (V16QImode, addr);
+  rtvec rv = gen_rtvec (3, addr, mem, len);
+  rtx lxvl = gen_rtx_UNSPEC (V16QImode, rv, UNSPEC_LXVL);
+  emit_insn (gen_rtx_SET (dest, lxvl));
+  DONE;
 })
 
 (define_insn "*lxvl"
@@ -5747,6 +5759,34 @@
   "lxvll %x0,%1,%2"
   [(set_attr "type" "vecload")])
 
+;; For lxvrl and lxvrll, use the combiner to eliminate the shift.  The
+;; define_expand for lxvl will already incorporate the shift in generating the
+;; insn.  The lxvll built-in function required the user to have already done
+;; the shift.  Defining lxvrll this way, will optimize cases where the user has
+;; done the shift immediately before

[gcc(refs/users/meissner/heads/work187-dmf)] RFC2655-Add saturating subtract built-ins.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:d62f05990b8d2c94b1018d7064e7a40b84473dc0

commit d62f05990b8d2c94b1018d7064e7a40b84473dc0
Author: Michael Meissner 
Date:   Fri Nov 22 18:45:14 2024 -0500

RFC2655-Add saturating subtract built-ins.

This patch adds support for a saturating subtract built-in function that may be
added to a future PowerPC processor.  Note, if it is added, the name of the
built-in function may change before GCC 13 is released.  If the name changes,
we will submit a patch changing the name.
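
The commit message does not spell out the semantics of the new built-ins, so the following is only a sketch of saturating subtraction in general, assuming the common unsigned definition in which a result that would go below zero is clamped to zero.  The names `sat_sub_u32` and `sat_sub_u64` are hypothetical and are not the built-ins added by this patch.

```c
#include <stdint.h>

/* Hypothetical reference model (assumed unsigned semantics): return a - b,
   clamped to zero when b is larger than a.  */
static inline uint32_t
sat_sub_u32 (uint32_t a, uint32_t b)
{
  return a > b ? a - b : 0;
}

/* Same assumed semantics for 64-bit operands.  */
static inline uint64_t
sat_sub_u64 (uint64_t a, uint64_t b)
{
  return a > b ? a - b : 0;
}
```

A portable model like this could serve as a cross-check for whatever the built-ins finally compute, once their names and semantics are settled.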

I also added support for providing dense math built-in functions, even though
at present, we have not added any new built-in functions for dense math.  It is
likely we will want to add new dense math built-in functions as the dense math
support is fleshed out.

The patches have been tested on both little and big endian systems.  Can I check
it into the master branch?

2024-11-22   Michael Meissner  

gcc/

* config/rs6000/rs6000-builtin.cc (rs6000_invalid_builtin): Add support
for flagging invalid use of future built-in functions.
(rs6000_builtin_is_supported): Add support for future built-in
functions.
* config/rs6000/rs6000-builtins.def (__builtin_saturate_subtract32): New
built-in function for -mcpu=future.
(__builtin_saturate_subtract64): Likewise.
* config/rs6000/rs6000-gen-builtins.cc (enum bif_stanza): Add stanzas
for -mcpu=future built-ins.
(stanza_map): Likewise.
(enable_string): Likewise.
(struct attrinfo): Likewise.
(parse_bif_attrs): Likewise.
(write_decls): Likewise.
* config/rs6000/rs6000.md (sat_sub3): Add saturating subtract
built-in insn declarations.
(sat_sub3_dot): Likewise.
(sat_sub3_dot2): Likewise.
* doc/extend.texi (Future PowerPC built-ins): New section.

gcc/testsuite/

* gcc.target/powerpc/subfus-1.c: New test.
* gcc.target/powerpc/subfus-2.c: Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc | 17 
 gcc/config/rs6000/rs6000-builtins.def   | 10 +
 gcc/config/rs6000/rs6000-gen-builtins.cc| 35 ++---
 gcc/config/rs6000/rs6000.md | 60 +
 gcc/doc/extend.texi | 24 
 gcc/testsuite/gcc.target/powerpc/subfus-1.c | 32 +++
 gcc/testsuite/gcc.target/powerpc/subfus-2.c | 32 +++
 7 files changed, 205 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 8e4335e9b44f..a5f33eb9da18 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -139,6 +139,17 @@ rs6000_invalid_builtin (enum rs6000_gen_builtins fncode)
 case ENB_MMA:
   error ("%qs requires the %qs option", name, "-mmma");
   break;
+case ENB_FUTURE:
+  error ("%qs requires the %qs option", name, "-mcpu=future");
+  break;
+case ENB_FUTURE_64:
+  error ("%qs requires the %qs option and either the %qs or %qs option",
+name, "-mcpu=future", "-m64", "-mpowerpc64");
+  break;
+case ENB_DM:
+  error ("%qs requires the %qs or %qs options", name, "-mcpu=future",
+"-mdense-math");
+  break;
 default:
 case ENB_ALWAYS:
   gcc_unreachable ();
@@ -194,6 +205,12 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins fncode)
   return TARGET_HTM;
 case ENB_MMA:
   return TARGET_MMA;
+case ENB_FUTURE:
+  return TARGET_FUTURE;
+case ENB_FUTURE_64:
+  return TARGET_FUTURE && TARGET_POWERPC64;
+case ENB_DM:
+  return TARGET_DENSE_MATH;
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 69046fd22442..84de393bc597 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -137,6 +137,8 @@
 ;   endian   Needs special handling for endianness
 ;   ibmldRestrict usage to the case when TFmode is IBM-128
 ;   ibm128   Restrict usage to the case where __ibm128 is supported or if ibmld
+;   future   Restrict usage to future instructions
+;   dm   Restrict usage to dense math
 ;
 ; Each attribute corresponds to extra processing required when
 ; the built-in is expanded.  All such special processing should
@@ -3933,3 +3935,11 @@
 
   void __builtin_vsx_stxvp (v256, unsigned long, const v256 *);
 STXVP nothing {mma,pair}
+
+[future]
+  const signed int __builtin_saturate_subtract32 (signed int, signed int);
+  SAT_SUBSI sat_subsi3 {}
+
+[future-64]
+  const signed long __builtin_saturate_subtract64 (signed long,  signed long);
+  SAT_SUBDI sat_subdi3 {}
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.cc 
b/gcc/conf

[gcc(refs/users/meissner/heads/work187-dmf)] RFC2653-PowerPC: Add support for 1,024 bit DMR registers.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:19f8ef7ca2271699641af4e239778b8f719799e5

commit 19f8ef7ca2271699641af4e239778b8f719799e5
Author: Michael Meissner 
Date:   Fri Nov 22 18:40:56 2024 -0500

RFC2653-PowerPC: Add support for 1,024 bit DMR registers.

This patch is a preliminary patch to add the full 1,024 bit dense math registers
(DMRs) for -mcpu=future.  The MMA 512-bit accumulators map onto the top of the
DMR register.

This patch only adds the new 1,024 bit register support.  It does not add
support for any instructions that need 1,024 bit registers instead of 512 bit
registers.

I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit
registers.  The 'wD' constraint added in previous patches is used for these
registers.  I added support to do load and store of DMRs via the VSX registers,
since there are no load/store dense math instructions.  I added the new keyword
'__dmr' to create 1,024 bit types that can be loaded into DMRs.  At present, I
don't have aliases for __dmr512 and __dmr1024 that we've discussed internally.

The patches have been tested on both little and big endian systems.  Can I check
it into the master branch?

2024-11-22   Michael Meissner  

gcc/

* config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec.
(UNSPEC_DM_INSERT512_LOWER): Likewise.
(UNSPEC_DM_EXTRACT512): Likewise.
(UNSPEC_DMR_RELOAD_FROM_MEMORY): Likewise.
(UNSPEC_DMR_RELOAD_TO_MEMORY): Likewise.
(movtdo): New define_expand and define_insn_and_split to implement 1,024
bit DMR registers.
(movtdo_insert512_upper): New insn.
(movtdo_insert512_lower): Likewise.
(movtdo_extract512): Likewise.
(reload_dmr_from_memory): Likewise.
(reload_dmr_to_memory): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMR
support.
(rs6000_init_builtins): Add support for __dmr keyword.
* config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add support
for TDOmode.
(rs6000_function_arg): Likewise.
* config/rs6000/rs6000-modes.def (TDOmode): New mode.
* config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add
support for TDOmode.
(rs6000_hard_regno_mode_ok_uncached): Likewise.
(rs6000_hard_regno_mode_ok): Likewise.
(rs6000_modes_tieable_p): Likewise.
(rs6000_debug_reg_global): Likewise.
(rs6000_setup_reg_addr_masks): Likewise.
(rs6000_init_hard_regno_mode_ok): Add support for TDOmode.  Setup reload
hooks for DMR mode.
(reg_offset_addressing_ok_p): Add support for TDOmode.
(rs6000_emit_move): Likewise.
(rs6000_secondary_reload_simple_move): Likewise.
(rs6000_preferred_reload_class): Likewise.
(rs6000_secondary_reload_class): Likewise.
(rs6000_mangle_type): Add mangling for __dmr type.
(rs6000_dmr_register_move_cost): Add support for TDOmode.
(rs6000_split_multireg_move): Likewise.
(rs6000_invalid_conversion): Likewise.
* config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode.
(enum rs6000_builtin_type_index): Add DMR type nodes.
(dmr_type_node): Likewise.
(ptr_dmr_type_node): Likewise.

gcc/testsuite/

* gcc.target/powerpc/dm-1024bit.c: New test.

Diff:
---
 gcc/config/rs6000/mma.md  | 154 ++
 gcc/config/rs6000/rs6000-builtin.cc   |  17 +++
 gcc/config/rs6000/rs6000-call.cc  |  10 +-
 gcc/config/rs6000/rs6000-modes.def|   4 +
 gcc/config/rs6000/rs6000.cc   | 101 -
 gcc/config/rs6000/rs6000.h|   6 +-
 gcc/testsuite/gcc.target/powerpc/dm-1024bit.c |  63 +++
 7 files changed, 321 insertions(+), 34 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 2e04eb653fa6..8461499e1c3d 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -92,6 +92,11 @@
UNSPEC_MMA_XXMFACC
UNSPEC_MMA_XXMTACC
UNSPEC_MMA_DMSETDMRZ
+   UNSPEC_DM_INSERT512_UPPER
+   UNSPEC_DM_INSERT512_LOWER
+   UNSPEC_DM_EXTRACT512
+   UNSPEC_DMR_RELOAD_FROM_MEMORY
+   UNSPEC_DMR_RELOAD_TO_MEMORY
   ])
 
 (define_c_enum "unspecv"
@@ -793,3 +798,152 @@
 }
   [(set_attr "type" "mma")
(set_attr "prefixed" "yes")])
+
+;; TDOmode (__dmr keyword for 1,024 bit registers).
+(define_expand "movtdo"
+  [(set (match_operand:TDO 0 "nonimmediate_operand")
+   (match_operand:TDO 1 "input_operand"))]
+  "TARGET_MMA_DENSE_MATH"
+{
+  rs6000_emit_move (operands[0], operands[1], TDOmode);
+  DONE;
+})
+
+(define_insn_and_split "*movtdo"
+  [(set (match_operand:TDO

[gcc r15-5604] Sync top-level configure with binutils

2024-11-22 Thread Sam James via Gcc-cvs

https://gcc.gnu.org/g:7ff5900399c889ce1984092552040dfb7e73a4b2

commit r15-5604-g7ff5900399c889ce1984092552040dfb7e73a4b2
Author: Sam James 
Date:   Fri Nov 22 19:09:36 2024 +

Sync top-level configure with binutils

This syncs us with binutils/gdb's toplevel configure as of
987db70acefd0b223a8df2240d4e5ca544cc0a91.

There's not much notable here, just gprofng (which is in binutils) being
disabled for musl and a new target which got added on that side too.

The only part which may look interesting is the baseargs->bbaseargs
change which goes back to Arsen's gettext work and a fixup which
landed for that on the binutils side in
9c0aa4c53104b1c4333d55aeaf11b41053307929.

* configure: Regenerate.
* configure.ac: Sync with Binutils.

Diff:
---
 configure| 21 ++---
 configure.ac | 22 ++
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/configure b/configure
index 10c1589d473c..c6796040fd8a 100755
--- a/configure
+++ b/configure
@@ -1551,7 +1551,7 @@ Optional Features:
 
   --enable-gold[=ARG] build gold [ARG={default,yes,no}]
   --enable-ld[=ARG]   build ld [ARG={default,yes,no}]
-  --enable-gprofng[=ARG]  build gprofng [ARG={yes,no}]
+  --disable-gprofng   do not build gprofng
   --enable-compressed-debug-sections={all,gas,gold,ld,none}
   Enable compressed debug sections for gas, gold or ld
   by default
@@ -3151,7 +3151,9 @@ fi
 
 if test "$enable_gprofng" = "yes"; then
   case "${target}" in
-x86_64-*-linux* | i?86-*-linux* | aarch64-*-linux*)
+*-musl*)
+  ;;
+x86_64-*-linux* | i?86-*-linux* | aarch64-*-linux* | riscv64-*-linux*)
 configdirs="$configdirs gprofng"
 ;;
   esac
@@ -3670,6 +3672,15 @@ case "${target}" in
   cris-*-* | crisv32-*-*)
 libgloss_dir=cris
 ;;
+  kvx-*-elf)
+libgloss_dir=kvx-elf
+;;
+  kvx-*-mbr)
+libgloss_dir=kvx-mbr
+;;
+  kvx-*-cos)
+libgloss_dir=kvx-cos
+;;
   hppa*-*-*)
 libgloss_dir=pa
 ;;
@@ -3971,6 +3982,9 @@ case "${target}" in
   i[3456789]86-*-rdos*)
 noconfigdirs="$noconfigdirs gdb"
 ;;
+  kvx-*-*)
+noconfigdirs="$noconfigdirs gdb gdbserver sim"
+;;
   mmix-*-*)
 noconfigdirs="$noconfigdirs gdb"
 ;;
@@ -11314,7 +11328,8 @@ hbaseargs="$hbaseargs --disable-option-checking"
 tbaseargs="$tbaseargs --disable-option-checking"
 
 if test "$enable_year2038" = no; then
-  baseargs="$baseargs --disable-year2038"
+  bbaseargs="$bbaseargs --disable-year2038"
+  hbaseargs="$hbaseargs --disable-year2038"
   tbaseargs="$tbaseargs --disable-year2038"
 fi
 
diff --git a/configure.ac b/configure.ac
index fb61550dba7b..a8d13b31ee2e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -407,13 +407,14 @@ case "${ENABLE_LD}" in
 esac
 
 AC_ARG_ENABLE(gprofng,
-[AS_HELP_STRING([[--enable-gprofng[=ARG]]],
-   [build gprofng @<:@ARG={yes,no}@:>@])],
+[AS_HELP_STRING([[--disable-gprofng]], [do not build gprofng])],
 enable_gprofng=$enableval,
 enable_gprofng=yes)
 if test "$enable_gprofng" = "yes"; then
   case "${target}" in
-x86_64-*-linux* | i?86-*-linux* | aarch64-*-linux*)
+*-musl*)
+  ;;
+x86_64-*-linux* | i?86-*-linux* | aarch64-*-linux* | riscv64-*-linux*)
 configdirs="$configdirs gprofng"
 ;;
   esac
@@ -892,6 +893,15 @@ case "${target}" in
   cris-*-* | crisv32-*-*)
 libgloss_dir=cris
 ;;
+  kvx-*-elf)
+libgloss_dir=kvx-elf
+;;
+  kvx-*-mbr)
+libgloss_dir=kvx-mbr
+;;
+  kvx-*-cos)
+libgloss_dir=kvx-cos
+;;
   hppa*-*-*)
 libgloss_dir=pa
 ;;
@@ -1193,6 +1203,9 @@ case "${target}" in
   i[[3456789]]86-*-rdos*)
 noconfigdirs="$noconfigdirs gdb"
 ;;
+  kvx-*-*)
+noconfigdirs="$noconfigdirs gdb gdbserver sim"
+;;
   mmix-*-*)
 noconfigdirs="$noconfigdirs gdb"
 ;;
@@ -3543,7 +3556,8 @@ hbaseargs="$hbaseargs --disable-option-checking"
 tbaseargs="$tbaseargs --disable-option-checking"
 
 if test "$enable_year2038" = no; then
-  baseargs="$baseargs --disable-year2038"
+  bbaseargs="$bbaseargs --disable-year2038"
+  hbaseargs="$hbaseargs --disable-year2038"
   tbaseargs="$tbaseargs --disable-year2038"
 fi

[gcc(refs/users/meissner/heads/work187-bugs)] Add power9 and power10 float to logical optimizations.

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:f1f93a4ebc69dd87cfe0b71e6ba2cdc659fce621

commit f1f93a4ebc69dd87cfe0b71e6ba2cdc659fce621
Author: Michael Meissner 
Date:   Fri Nov 22 19:34:42 2024 -0500

Add power9 and power10 float to logical optimizations.

I was answering an email from a co-worker and I pointed him to work I had done
for the Power8 era that optimizes the 32-bit float math library in Glibc.  In
doing so, I discovered that on Power9 and later computers, this optimization
is no longer taking place.

The glibc 32-bit floating point math functions have code that looks like:

union u {
  float f;
  uint32_t u32;
};

float
math_foo (float x, unsigned int mask)
{
  union u arg;
  float x2;

  arg.f = x;
  arg.u32 &= mask;

  x2 = arg.f;
  /* ... */
}

On power8 with the optimization it generates:

xscvdpspn 0,1
sldi 9,4,32
mtvsrd 32,9
xxland 1,0,32
xscvspdpn 1,1

I.e., it converts the SFmode to the memory format (instead of the DFmode that
is used within the register), converts the mask so that it is in the vector
register in the upper 32-bits, and does a XXLAND (i.e. there is only one direct
move from GPR to vector register).  Then after doing this, it converts the
upper 32-bits back to DFmode.

If the XSCVSPDPN instruction took the value in the normal 32-bit scalar in a
vector register, we wouldn't have needed the SLDI of the mask.

On power9/power10/power11 it currently generates:

xscvdpspn 0,1
mfvsrwz 2,0
and 2,2,4
mtvsrws 1,2
xscvspdpn 1,1
blr

I.e., convert to SFmode representation, move the value to a GPR, do an AND
operation, move the 32-bit value with a splat, and then convert it back to
DFmode format.

With this patch, it now generates:

xscvdpspn 0,1
mtvsrwz 32,2
xxland 32,0,32
xxspltw 1,32,1
xscvspdpn 1,1
blr

I.e., convert to SFmode representation, move the mask to the vector register, do
the operation using XXLAND.  Splat the value to get the value in the correct
location, and then convert back to DFmode.
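
As a rough illustration of the source pattern being optimized, here is a
self-contained sketch of the float/mask operation.  The function name
mask_float_bits is hypothetical (it is not part of glibc or of this patch), and
memcpy is used for the type pun instead of the union shown above; both forms
describe the same bit-level AND on an SFmode value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The pun below only makes sense if float is a 32-bit type.  */
static_assert (sizeof (float) == sizeof (uint32_t), "float must be 32 bits");

/* Hypothetical helper: AND the bit pattern of X with MASK and return the
   result reinterpreted as a float.  */
float
mask_float_bits (float x, uint32_t mask)
{
  uint32_t bits;

  memcpy (&bits, &x, sizeof bits);	/* float -> 32-bit integer image.  */
  bits &= mask;				/* the logical operation.  */
  memcpy (&x, &bits, sizeof x);		/* integer image -> float.  */
  return x;
}
```

For example, a mask of 0x7fffffff clears the sign bit, so the helper behaves
like fabsf for ordinary values.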

I have built GCC with the patches in this patch set applied on both little and
big endian PowerPC systems and there were no regressions.  Can I apply this
patch to GCC 15?

2024-11-22  Michael Meissner  

gcc/

PR target/117487
* config/rs6000/vsx.md (SFmode logical peephole): Update comments in
the original code that supports power8.  Add a new define_peephole2 to
do the optimization on power9/power10.

Diff:
---
 gcc/config/rs6000/vsx.md | 142 +--
 1 file changed, 137 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index af9846391db2..ee3d85525e7e 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -6280,7 +6280,7 @@
(SFBOOL_MFVSR_A  3) ;; move to gpr src
(SFBOOL_BOOL_D   4) ;; and/ior/xor dest
(SFBOOL_BOOL_A1  5) ;; and/ior/xor arg1
-   (SFBOOL_BOOL_A2  6) ;; and/ior/xor arg1
+   (SFBOOL_BOOL_A2  6) ;; and/ior/xor arg2
(SFBOOL_SHL_D7) ;; shift left dest
(SFBOOL_SHL_A8) ;; shift left arg
(SFBOOL_MTVSR_D  9) ;; move to vecter dest
@@ -6320,18 +6320,18 @@
 ;; GPR, and instead move the integer mask value to the vector register after a
 ;; shift and do the VSX logical operation.
 
-;; The insns for dealing with SFmode in GPR registers looks like:
+;; The insns for dealing with SFmode in GPR registers looks like on power8:
 ;; (set (reg:V4SF reg2) (unspec:V4SF [(reg:SF reg1)] UNSPEC_VSX_CVDPSPN))
 ;;
-;; (set (reg:DI reg3) (unspec:DI [(reg:V4SF reg2)] UNSPEC_P8V_RELOAD_FROM_VSX))
+;; (set (reg:DI reg3) (zero_extend:DI (reg:SI reg2)))
 ;;
-;; (set (reg:DI reg4) (and:DI (reg:DI reg3) (reg:DI reg3)))
+;; (set (reg:DI reg4) (and:SI (reg:SI reg3) (reg:SI mask)))
 ;;
 ;; (set (reg:DI reg5) (ashift:DI (reg:DI reg4) (const_int 32)))
 ;;
 ;; (set (reg:SF reg6) (unspec:SF [(reg:DI reg5)] UNSPEC_P8V_MTVSRD))
 ;;
-;; (set (reg:SF reg6) (unspec:SF [(reg:SF reg6)] UNSPEC_VSX_CVSPDPN))
+;; (set (reg:SF reg7) (unspec:SF [(reg:SF reg6)] UNSPEC_VSX_CVSPDPN))
 
 (define_peephole2
   [(match_scratch:DI SFBOOL_TMP_GPR "r")
@@ -6412,6 +6412,138 @@
   operands[SFBOOL_MTVSR_D_V4SF] = gen_rtx_REG (V4SFmode, regno_mtvsr_d);
 })
 
+;; Constants for SFbool optimization on power9/power10
+(define_const

[gcc(refs/users/meissner/heads/work187-bugs)] PR target/108958 -- use mtvsrdd to zero extend GPR DImode to VSX TImode

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:14b524a68818d3360a1857d0e35ab7e40328b9b9

commit 14b524a68818d3360a1857d0e35ab7e40328b9b9
Author: Michael Meissner 
Date:   Fri Nov 22 19:35:35 2024 -0500

PR target/108958 -- use mtvsrdd to zero extend GPR DImode to VSX TImode

Previously GCC would zero extend a DImode GPR value to TImode by first zero
extending the DImode value into a GPR TImode value, and then do a MTVSRDD to
move this value to a VSX register.

This patch does the move directly, since if the middle argument to MTVSRDD is 0,
it does the zero extend.

If the DImode value is already in a vector register, it does a XXSPLTIB and
XXPERMDI to get the value into the bottom 64-bits of the register.
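
For reference, the operation being optimized can be sketched at the C level as
follows.  The helper names are hypothetical, and unsigned __int128 is a
GCC/Clang extension that is assumed to be available (it normally is on 64-bit
targets); the sketch only shows the value semantics, not the instruction
selection.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the compiler provides unsigned __int128.  */
typedef unsigned __int128 u128;

static_assert (sizeof (u128) == 16, "expected a 128-bit integer type");

/* Zero extend a 64-bit (DImode) value to 128 bits (TImode): the low
   half holds X and the high half is zero.  */
u128
zero_extend_di_to_ti (uint64_t x)
{
  return (u128) x;
}

/* Return the upper 64 bits of V.  */
uint64_t
high_half (u128 v)
{
  return (uint64_t) (v >> 64);
}

/* Return the lower 64 bits of V.  */
uint64_t
low_half (u128 v)
{
  return (uint64_t) v;
}
```

Whatever register class the value ends up in, the high half must read back as
zero and the low half as the original value.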

I have built GCC with the patches in this patch set applied on both little and
big endian PowerPC systems and there were no regressions.  Can I apply this
patch to GCC 15?

2024-11-22  Michael Meissner  

gcc/

PR target/108958
* config/rs6000/rs6000.md (zero_extendditi2): New insn.

gcc/testsuite/

PR target/108958
* gcc.target/powerpc/pr108958.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000.md | 46 +
 gcc/testsuite/gcc.target/powerpc/pr108958.c | 27 +
 2 files changed, 73 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c6d1857f9d43..b67b7e3fc1f8 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -1026,6 +1026,52 @@
(set_attr "dot" "yes")
(set_attr "length" "4,8")])
 
+(define_insn_and_split "zero_extendditi2"
+  [(set (match_operand:TI 0 "gpc_reg_operand" "=r,wa,&wa")
+   (zero_extend:TI
+(match_operand:DI 1 "gpc_reg_operand" "rwa,r,wa")))]
+  "TARGET_P9_VECTOR && TARGET_POWERPC64"
+  "@
+  #
+  mtvsrdd %x0,0,%1
+  #"
+  "&& reload_completed
+   && (int_reg_operand (operands[0], TImode)
+   || vsx_register_operand (operands[1], DImode))"
+  [(set (match_dup 2)
+   (match_dup 3))
+   (set (match_dup 4)
+   (match_dup 5))]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  int r = reg_or_subregno (op0);
+
+  if (int_reg_operand (op0, TImode))
+{
+  int lo = BYTES_BIG_ENDIAN ? 1 : 0;
+  int hi = 1 - lo;
+
+  operands[2] = gen_rtx_REG (DImode, r + lo);
+  operands[3] = op1;
+  operands[4] = gen_rtx_REG (DImode, r + hi);
+  operands[5] = const0_rtx;
+}
+  else
+{
+  rtx op0_di = gen_rtx_REG (DImode, r);
+  rtx op0_v2di = gen_rtx_REG (V2DImode, r);
+  rtx lo = WORDS_BIG_ENDIAN ? op1 : op0_di;
+  rtx hi = WORDS_BIG_ENDIAN ? op0_di : op1;
+
+  operands[2] = op0_v2di;
+  operands[3] = CONST0_RTX (V2DImode);
+  operands[4] = op0_v2di;
+  operands[5] = gen_rtx_VEC_CONCAT (V2DImode, hi, lo);
+}
+}
+  [(set_attr "type" "*,mtvsr,vecperm")
+   (set_attr "length" "8,*,8")])
 
 (define_insn "extendqi2"
   [(set (match_operand:EXTQI 0 "gpc_reg_operand" "=r,?*v")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr108958.c b/gcc/testsuite/gcc.target/powerpc/pr108958.c
new file mode 100644
index ..03eb58d069e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr108958.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int128 } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
+
+/* PR target/108958, use mtvsrdd to zero extend gpr to vsx register.  */
+
+void
+gpr_to_vsx (unsigned long long x, __uint128_t *p)
+{
+  /* mtvsrdd vsx,0,gpr.  */
+  __uint128_t y = x;
+  __asm__ (" # %x0" : "+wa" (y));
+  *p = y;
+}
+
+void
+gpr_to_gpr (unsigned long long x, __uint128_t *p)
+{
+  /* mr and li.  */
+  __uint128_t y = x;
+  __asm__ (" # %0" : "+r" (y));
+  *p = y;
+}
+
+/* { dg-final { scan-assembler-times {\mli\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrdd .*,0,.*\M} 1 } } */

[gcc(refs/users/meissner/heads/work187-bugs)] PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:eba2c7317aa04071cabaf814624850e9b22b74a6

commit eba2c7317aa04071cabaf814624850e9b22b74a6
Author: Michael Meissner 
Date:   Fri Nov 22 19:33:20 2024 -0500

PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

We had optimizations for splat of a vector extract for the other vector
types, but we missed having one for V2DI and V2DF.  This patch adds a
combiner insn to do this optimization.

In looking at the source, we had similar optimizations for V4SI and V4SF
extract and splats, but we missed doing V2DI/V2DF.

Without the patch for the code:

vector long long splat_dup_l_0 (vector long long v)
{
  return __builtin_vec_splats (__builtin_vec_extract (v, 0));
}

the compiler generates (on a little endian power9):

splat_dup_l_0:
mfvsrld 9,34
mtvsrdd 34,9,9
blr

Now it generates:

splat_dup_l_0:
xxpermdi 34,34,34,3
blr
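
The choice of the XXPERMDI immediate can be sketched with a small C model.
Both function names are hypothetical.  The model assumes the usual XXPERMDI
behavior where the high bit of the immediate selects the doubleword taken from
the first source and the low bit selects the doubleword taken from the second
source, with doublewords numbered in big-endian order; the immediate
computation mirrors the endian adjustment done in the new insn.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "xxpermdi dst,src,src,dm".  SRC[0] is doubleword 0
   in big-endian register numbering.  */
void
xxpermdi_same_src (const uint64_t src[2], int dm, uint64_t dst[2])
{
  assert (dm >= 0 && dm <= 3);

  /* Read both inputs first so that DST may alias SRC.  */
  uint64_t first = src[(dm >> 1) & 1];
  uint64_t second = src[dm & 1];

  dst[0] = first;
  dst[1] = second;
}

/* Immediate used to splat vector element ELEMENT (0 or 1).  On little
   endian the element number is flipped before picking 0 or 3.  */
int
splat_extract_imm (int element, int big_endian)
{
  assert (element == 0 || element == 1);

  int which_word = big_endian ? element : 1 - element;
  return which_word ? 3 : 0;
}
```

With this model, element 0 on a little endian system yields the immediate 3,
which matches the "xxpermdi 34,34,34,3" shown above.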

2024-11-22  Michael Meissner  

gcc/

PR target/99293
* config/rs6000/vsx.md (vsx_splat_extract_): New insn.

gcc/testsuite/

PR target/99293
* gcc.target/powerpc/builtins-1.c: Adjust insn count.
* gcc.target/powerpc/pr99293.c: New test.

Diff:
---
 gcc/config/rs6000/vsx.md  | 18 ++
 gcc/testsuite/gcc.target/powerpc/builtins-1.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr99293.c| 22 ++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f4f7113f5fe8..af9846391db2 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4796,6 +4796,24 @@
   "lxvdsx %x0,%y1"
   [(set_attr "type" "vecload")])
 
+;; Optimize SPLAT of an extract from a V2DF/V2DI vector with a constant element
+(define_insn "*vsx_splat_extract_"
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")
+   (vec_duplicate:VSX_D
+(vec_select:
+ (match_operand:VSX_D 1 "vsx_register_operand" "wa")
+ (parallel [(match_operand 2 "const_0_to_1_operand" "n")]]
+  "VECTOR_MEM_VSX_P (mode)"
+{
+  int which_word = INTVAL (operands[2]);
+  if (!BYTES_BIG_ENDIAN)
+which_word = 1 - which_word;
+
+  operands[3] = GEN_INT (which_word ? 3 : 0);
+  return "xxpermdi %x0,%x1,%x1,%3";
+}
+  [(set_attr "type" "vecperm")])
+
 ;; V4SI splat support
 (define_insn "vsx_splat_v4si"
   [(set (match_operand:V4SI 0 "vsx_register_operand" "=wa,wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-1.c b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
index 8410a5fd4319..4e7e5384675f 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
@@ -1035,4 +1035,4 @@ foo156 (vector unsigned short usa)
 /* { dg-final { scan-assembler-times {\mvmrglb\M} 3 } } */
 /* { dg-final { scan-assembler-times {\mvmrgew\M} 4 } } */
 /* { dg-final { scan-assembler-times {\mvsplth|xxsplth\M} 4 } } */
-/* { dg-final { scan-assembler-times {\mxxpermdi\M} 44 } } */
+/* { dg-final { scan-assembler-times {\mxxpermdi\M} 42 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr99293.c b/gcc/testsuite/gcc.target/powerpc/pr99293.c
new file mode 100644
index ..20adc1f27f65
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr99293.c
@@ -0,0 +1,22 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+/* Test for PR 99263, which wants to do:
+   __builtin_vec_splats (__builtin_vec_extract (v, n))
+
+   where v is a V2DF or V2DI vector and n is either 0 or 1.  Previously the
+   compiler would do a direct move to the GPR registers to select the item and a
+   direct move from the GPR registers to do the splat.  */
+
+vector long long splat_dup_l_0 (vector long long v)
+{
+  return __builtin_vec_splats (__builtin_vec_extract (v, 0));
+}
+
+vector long long splat_dup_l_1 (vector long long v)
+{
+  return __builtin_vec_splats (__builtin_vec_extract (v, 1));
+}
+
+/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */

[gcc(refs/users/meissner/heads/work187-bugs)] Update ChangeLog.*

2024-11-22 Thread Michael Meissner via Gcc-cvs

https://gcc.gnu.org/g:d832f27c8a74c705fc326fee5611e886e57c3a3a

commit d832f27c8a74c705fc326fee5611e886e57c3a3a
Author: Michael Meissner 
Date:   Fri Nov 22 19:38:05 2024 -0500

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.bugs | 168 +
 1 file changed, 168 insertions(+)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
index 833633875c55..f3df661dbd57 100644
--- a/gcc/ChangeLog.bugs
+++ b/gcc/ChangeLog.bugs
@@ -1,5 +1,173 @@
+ Branch work187-bugs, patch #202 
+
+PR target/108958 -- use mtvsrdd to zero extend GPR DImode to VSX TImode
+
+Previously GCC would zero externd a DImode GPR value to TImode by first zero
+extending the DImode value into a GPR TImode value, and then do a MTVSRDD to
+move this value to a VSX register.
+
+This patch does the move directly, since if the middle argument to MTVSRDD is 0,
+it does the zero extend.
+
+If the DImode value is already in a vector register, it does a XXSPLTIB and
+XXPERMDI to get the value into the bottom 64-bits of the register.
+
+I have built GCC with the patches in this patch set applied on both little and
+big endian PowerPC systems and there were no regressions.  Can I apply this
+patch to GCC 15?
+
+2024-11-22  Michael Meissner  
+
+gcc/
+
+   PR target/108598
+   * gcc/config/rs6000/rs6000.md (zero_extendditi2): New insn.
+
+gcc/testsuite/
+
+   PR target/108598
+   * gcc.target/powerpc/pr108958.c: New test.
+
+ Branch work187-bugs, patch #201 
+
+Add power9 and power10 float to logical optimizations.
+
+I was answering an email from a co-worker and I pointed him to work I had done
+for the Power8 era that optimizes the 32-bit float math library in Glibc.  In
+doing so, I discovered with the Power9 and later computers, this optimization
+is no longer taking place.
+
+The glibc 32-bit floating point math functions have code that looks like:
+
+   union u {
+ float f;
+ uint32_t u32;
+   };
+
+   float
+   math_foo (float x, unsigned int mask)
+   {
+ union u arg;
+ float x2;
+
+ arg.f = x;
+ arg.u32 &= mask;
+
+ x2 = arg.f;
+ /* ... */
+   }
+
+On power8 with the optimization it generates:
+
+xscvdpspn 0,1
+sldi 9,4,32
+mtvsrd 32,9
+xxland 1,0,32
+xscvspdpn 1,1
+
+I.e., it converts the SFmode to the memory format (instead of the DFmode that
+is used within the register), converts the mask so that it is in the vector
+register in the upper 32-bits, and does a XXLAND (i.e. there is only one direct
+move from GPR to vector register).  Then after doing this, it converts the
+upper 32-bits back to DFmode.
+
+If the XSCVSPDN instruction took the value in the normal 32-bit scalar in a
+vector register, we wouldn't have needed the SLDI of the mask.
+
+On power9/power10/power11 it currently generates:
+
+xscvdpspn 0,1
+mfvsrwz 2,0
+and 2,2,4
+mtvsrws 1,2
+xscvspdpn 1,1
+blr
+
+I.e convert to SFmode representation, move the value to a GPR, do an AND
+operation, move the 32-bit value with a splat, and then convert it back to
+DFmode format.
+
+With this patch, it now generates:
+
+xscvdpspn 0,1
+mtvsrwz 32,2
+xxland 32,0,32
+xxspltw 1,32,1
+xscvspdpn 1,1
+blr
+
+I.e. convert to SFmode representation, move the mask to the vector register, do
+the operation using XXLAND.  Splat the value to get the value in the correct
+location, and then convert back to DFmode.
+
+I have built GCC with the patches in this patch set applied on both little and
+big endian PowerPC systems and there were no regressions.  Can I apply this
+patch to GCC 15?
+
+2024-11-22  Michael Meissner  
+
+gcc/
+
+   PR target/117487
+   * config/rs6000/vsx.md (SFmode logical peephoole): Update comments in
+   the original code that supports power8.  Add a new define_peephole2 to
+   do the optimization on power9/power10.
+
+ Branch work187-bugs, patch #200 
+
+PR 99293: Optimize splat of a V2DF/V2DI extract with constant element
+
+We had optimizations for splat of a vector extract for the other vector
+types, but we missed having one for V2DI and V2DF.  This patch adds a
+combiner insn to do this optimization.
+
+In looking at the source, we had similar optimizations for V4SI and V4SF
+extract and splats, but we missed doing V2DI/V2DF.
+
+Without the patch for the code:
+
+   vector long long splat_dup_l_0 (vector long long v)
+   {
+ return __builtin_vec_splats (__builtin_vec_extract (v, 0));
+   }
+
+the compiler generates (on a little endian power9):
+
+   splat_dup_l_0:
+   mfvsrld 9,34
+   mtvsrdd 34,9,9
+   blr
+
+Now it generates:
+
+   splat_dup_l_0:
+   xxpermdi 34,34,34,3
+
