[gcc r15-7662] Fortran: Fix build on solaris [PR107635]

2025-02-21 Thread Andre Vehreschild via Gcc-cvs
https://gcc.gnu.org/g:08bdc2ac98ae05ef694f4e55c296835fc01a3673

commit r15-7662-g08bdc2ac98ae05ef694f4e55c296835fc01a3673
Author: Andre Vehreschild 
Date:   Fri Feb 21 08:18:40 2025 +0100

Fortran: Fix build on solaris [PR107635]

libgfortran/ChangeLog:

PR fortran/107635
* caf/single.c: Replace alloca with __builtin_alloca.

Diff:
---
 libgfortran/caf/single.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libgfortran/caf/single.c b/libgfortran/caf/single.c
index d4e081be4dd7..9c1c0c1bc8ca 100644
--- a/libgfortran/caf/single.c
+++ b/libgfortran/caf/single.c
@@ -672,12 +672,12 @@ _gfortran_caf_transfer_between_remotes (
   if (!scalar_transfer)
 {
   const size_t desc_size = sizeof (*transfer_desc);
-  transfer_desc = alloca (desc_size);
+  transfer_desc = __builtin_alloca (desc_size);
   memset (transfer_desc, 0, desc_size);
   transfer_ptr = transfer_desc;
 }
   else if (opt_dst_charlen)
-transfer_ptr = alloca (*opt_dst_charlen * src_size);
+transfer_ptr = __builtin_alloca (*opt_dst_charlen * src_size);
   else
 {
   buffer = NULL;


[gcc r15-7664] Improve g++.dg/torture/pr118521.C

2025-02-21 Thread Richard Biener via Gcc-cvs
https://gcc.gnu.org/g:d2720051c419a781a2f04daadcbefc99fa77b051

commit r15-7664-gd2720051c419a781a2f04daadcbefc99fa77b051
Author: Richard Biener 
Date:   Fri Feb 21 10:05:19 2025 +0100

Improve g++.dg/torture/pr118521.C

Alexander pointed out the way to do a dg-bogus in an included header.

PR tree-optimization/118521
* g++.dg/torture/pr118521.C: Use dg-bogus properly.

Diff:
---
 gcc/testsuite/g++.dg/torture/pr118521.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/torture/pr118521.C 
b/gcc/testsuite/g++.dg/torture/pr118521.C
index 1a66aca0e8d2..08836bd5f0e9 100644
--- a/gcc/testsuite/g++.dg/torture/pr118521.C
+++ b/gcc/testsuite/g++.dg/torture/pr118521.C
@@ -1,7 +1,7 @@
 // { dg-do compile }
 // { dg-additional-options "-Wall" }
 
-#include  // dg-bogus new_allocator.h:191 warning: writing 1 byte into 
a region of size 0
+#include  // { dg-bogus "writing 1 byte into a region of size 0" "" { 
target *-*-* } 0 }
 
 void bar(std::vector);


[gcc r15-7666] Append a newline in debug_edge

2025-02-21 Thread H.J. Lu via Gcc-cvs
https://gcc.gnu.org/g:700f049b66fa1a71001a5ab383580605e1bbd4d6

commit r15-7666-g700f049b66fa1a71001a5ab383580605e1bbd4d6
Author: H.J. Lu 
Date:   Fri Feb 21 10:31:04 2025 +0800

Append a newline in debug_edge

Append a newline in debug_edge so that we get

(gdb) call debug_edge (e)
edge (bb_9, bb_1)
(gdb)

instead of

(gdb) call debug_edge (e)
edge (bb_9, bb_1)(gdb)

* sese.cc (debug_edge): Append a newline.

Signed-off-by: H.J. Lu 

Diff:
---
 gcc/sese.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/sese.cc b/gcc/sese.cc
index 02cfcf914ec9..6076b03f3596 100644
--- a/gcc/sese.cc
+++ b/gcc/sese.cc
@@ -489,6 +489,7 @@ DEBUG_FUNCTION void
 debug_edge (const_edge e)
 {
   print_edge (stderr, e);
+  fprintf (stderr, "\n");
 }
 
 /* Pretty print sese S to STDERR.  */


[gcc(refs/users/meissner/heads/work194-dmf)] Add ChangeLog.dmf and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:160015ed4cb056b94ef8317268935f405153f401

commit 160015ed4cb056b94ef8317268935f405153f401
Author: Michael Meissner 
Date:   Fri Feb 21 16:04:15 2025 -0500

Add ChangeLog.dmf and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.dmf: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.dmf | 5 +
 gcc/REVISION  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
new file mode 100644
index ..e593ea8e769d
--- /dev/null
+++ b/gcc/ChangeLog.dmf
@@ -0,0 +1,5 @@
+ Branch work194-dmf, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..6b364388815e 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-dmf branch


[gcc(refs/users/meissner/heads/work194-bugs)] Add ChangeLog.bugs and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:5766a9d4505fd2ef3f1126d4b00acf03f211f6f3

commit 5766a9d4505fd2ef3f1126d4b00acf03f211f6f3
Author: Michael Meissner 
Date:   Fri Feb 21 16:05:57 2025 -0500

Add ChangeLog.bugs and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.bugs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.bugs | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
new file mode 100644
index ..fcc1dd569fc0
--- /dev/null
+++ b/gcc/ChangeLog.bugs
@@ -0,0 +1,5 @@
+ Branch work194-bugs, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..07afa32f8801 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-bugs branch


[gcc(refs/users/meissner/heads/work194-sha)] Add ChangeLog.sha and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:33235ddf7a38738f834650c1e7663f3c2f2b13cf

commit 33235ddf7a38738f834650c1e7663f3c2f2b13cf
Author: Michael Meissner 
Date:   Fri Feb 21 16:07:42 2025 -0500

Add ChangeLog.sha and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.sha: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.sha | 5 +
 gcc/REVISION  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.sha b/gcc/ChangeLog.sha
new file mode 100644
index ..2b5f83e3304e
--- /dev/null
+++ b/gcc/ChangeLog.sha
@@ -0,0 +1,5 @@
+ Branch work194-sha, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..c8298bf5c35a 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-sha branch


[gcc] Created branch 'meissner/heads/work194-sha' in namespace 'refs/users'

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-sha' was created in namespace 'refs/users' 
pointing to:

 dc6774491642... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work194-libs)] Add ChangeLog.libs and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:860a556970bb172ef9cfc1f04ec15d3485676788

commit 860a556970bb172ef9cfc1f04ec15d3485676788
Author: Michael Meissner 
Date:   Fri Feb 21 16:06:47 2025 -0500

Add ChangeLog.libs and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.libs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.libs | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.libs b/gcc/ChangeLog.libs
new file mode 100644
index ..380b93a7ee46
--- /dev/null
+++ b/gcc/ChangeLog.libs
@@ -0,0 +1,5 @@
+ Branch work194-libs, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..696d56ba45dc 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-libs branch


[gcc(refs/users/meissner/heads/work194)] Add ChangeLog.meissner and REVISION.

2025-02-21 Thread Michael Meissner via Libstdc++-cvs
https://gcc.gnu.org/g:dc6774491642bb762e1f8db72c401e090ac3dc05

commit dc6774491642bb762e1f8db72c401e090ac3dc05
Author: Michael Meissner 
Date:   Fri Feb 21 16:03:19 2025 -0500

Add ChangeLog.meissner and REVISION.

2025-02-21  Michael Meissner  

gcc/

* REVISION: New file for branch.
* ChangeLog.meissner: New file.

gcc/c-family/

* ChangeLog.meissner: New file.

gcc/c/

* ChangeLog.meissner: New file.

gcc/cp/

* ChangeLog.meissner: New file.

gcc/fortran/

* ChangeLog.meissner: New file.

gcc/testsuite/

* ChangeLog.meissner: New file.

libgcc/

* ChangeLog.meissner: New file.

Diff:
---
 gcc/ChangeLog.meissner   | 5 +
 gcc/REVISION | 1 +
 gcc/c-family/ChangeLog.meissner  | 5 +
 gcc/c/ChangeLog.meissner | 5 +
 gcc/cp/ChangeLog.meissner| 5 +
 gcc/fortran/ChangeLog.meissner   | 5 +
 gcc/testsuite/ChangeLog.meissner | 5 +
 libgcc/ChangeLog.meissner| 5 +
 libstdc++-v3/ChangeLog.meissner  | 5 +
 9 files changed, 41 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
new file mode 100644
index ..3a3d82994248
--- /dev/null
+++ b/gcc/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work194, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
new file mode 100644
index ..05cf4553f33f
--- /dev/null
+++ b/gcc/REVISION
@@ -0,0 +1 @@
+work194 branch
diff --git a/gcc/c-family/ChangeLog.meissner b/gcc/c-family/ChangeLog.meissner
new file mode 100644
index ..3a3d82994248
--- /dev/null
+++ b/gcc/c-family/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work194, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/c/ChangeLog.meissner b/gcc/c/ChangeLog.meissner
new file mode 100644
index ..3a3d82994248
--- /dev/null
+++ b/gcc/c/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work194, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/cp/ChangeLog.meissner b/gcc/cp/ChangeLog.meissner
new file mode 100644
index ..3a3d82994248
--- /dev/null
+++ b/gcc/cp/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work194, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/fortran/ChangeLog.meissner b/gcc/fortran/ChangeLog.meissner
new file mode 100644
index ..3a3d82994248
--- /dev/null
+++ b/gcc/fortran/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work194, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/testsuite/ChangeLog.meissner b/gcc/testsuite/ChangeLog.meissner
new file mode 100644
index ..3a3d82994248
--- /dev/null
+++ b/gcc/testsuite/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work194, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/libgcc/ChangeLog.meissner b/libgcc/ChangeLog.meissner
new file mode 100644
index ..3a3d82994248
--- /dev/null
+++ b/libgcc/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work194, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/libstdc++-v3/ChangeLog.meissner b/libstdc++-v3/ChangeLog.meissner
new file mode 100644
index ..3a3d82994248
--- /dev/null
+++ b/libstdc++-v3/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work194, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch


[gcc] Created branch 'meissner/heads/work194-bugs' in namespace 'refs/users'

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-bugs' was created in namespace 'refs/users' 
pointing to:

 dc6774491642... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work194-test)] Add ChangeLog.test and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:8f61c9712e31feeb436e521e3f12e524714bce04

commit 8f61c9712e31feeb436e521e3f12e524714bce04
Author: Michael Meissner 
Date:   Fri Feb 21 16:08:32 2025 -0500

Add ChangeLog.test and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.test: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.test | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
new file mode 100644
index ..a30ca769019a
--- /dev/null
+++ b/gcc/ChangeLog.test
@@ -0,0 +1,5 @@
+ Branch work194-test, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..999de8ff4f47 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-test branch


[gcc] Created branch 'meissner/heads/work194-math' in namespace 'refs/users'

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-math' was created in namespace 'refs/users' 
pointing to:

 dc6774491642... Add ChangeLog.meissner and REVISION.


[gcc] Created branch 'meissner/heads/work194-test' in namespace 'refs/users'

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-test' was created in namespace 'refs/users' 
pointing to:

 dc6774491642... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work194-vpair)] Add ChangeLog.vpair and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:6c254f74ffc9fc6a663d2f249a25037a51e5d45a

commit 6c254f74ffc9fc6a663d2f249a25037a51e5d45a
Author: Michael Meissner 
Date:   Fri Feb 21 16:05:04 2025 -0500

Add ChangeLog.vpair and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.vpair: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.vpair | 5 +
 gcc/REVISION| 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.vpair b/gcc/ChangeLog.vpair
new file mode 100644
index ..7516c64dcdee
--- /dev/null
+++ b/gcc/ChangeLog.vpair
@@ -0,0 +1,5 @@
+ Branch work194-vpair, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..ee6baa0a4202 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-vpair branch


[gcc] Created branch 'meissner/heads/work194-vpair' in namespace 'refs/users'

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-vpair' was created in namespace 'refs/users' 
pointing to:

 dc6774491642... Add ChangeLog.meissner and REVISION.


[gcc] Created branch 'meissner/heads/work194' in namespace 'refs/users'

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194' was created in namespace 'refs/users' 
pointing to:

 700f049b66fa... Append a newline in debug_edge


[gcc] Created branch 'meissner/heads/work194-libs' in namespace 'refs/users'

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-libs' was created in namespace 'refs/users' 
pointing to:

 dc6774491642... Add ChangeLog.meissner and REVISION.


[gcc] Created branch 'meissner/heads/work194-orig' in namespace 'refs/users'

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-orig' was created in namespace 'refs/users' 
pointing to:

 700f049b66fa... Append a newline in debug_edge


[gcc(refs/users/meissner/heads/work194-math)] Add ChangeLog.math and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:3fb21ca3f9e34fbd2b86cf692958de7fa8ede1d1

commit 3fb21ca3f9e34fbd2b86cf692958de7fa8ede1d1
Author: Michael Meissner 
Date:   Fri Feb 21 16:09:19 2025 -0500

Add ChangeLog.math and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.math: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.math | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.math b/gcc/ChangeLog.math
new file mode 100644
index ..e63f7fd373bf
--- /dev/null
+++ b/gcc/ChangeLog.math
@@ -0,0 +1,5 @@
+ Branch work194-math, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..324f02973041 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-math branch


[gcc(refs/users/omachota/heads/rtl-ssa-dce)] rtl-ssa: dce improve prelive conditions

2025-02-21 Thread Ondrej Machota via Gcc-cvs
https://gcc.gnu.org/g:b948b7907c28284ac9e14f5316b15d5deec4382d

commit b948b7907c28284ac9e14f5316b15d5deec4382d
Author: Ondřej Machota 
Date:   Fri Feb 21 23:06:19 2025 +0100

rtl-ssa: dce improve prelive conditions

Diff:
---
 gcc/dce.cc | 38 +++---
 1 file changed, 31 insertions(+), 7 deletions(-)

diff --git a/gcc/dce.cc b/gcc/dce.cc
index 6b95f89c5b2c..7fb24d1d975b 100644
--- a/gcc/dce.cc
+++ b/gcc/dce.cc
@@ -17,7 +17,6 @@ You should have received a copy of the GNU General Public 
License
 along with GCC; see the file COPYING3.  If not see
 .  */
 
-#include 
 #define INCLUDE_ALGORITHM
 #define INCLUDE_FUNCTIONAL
 #define INCLUDE_ARRAY
@@ -1316,6 +1315,28 @@ bool sets_global_register(rtx_insn* insn) {
   return false;
 }
 
+bool is_control_flow(rtx_code code) {
+  // What about BARRIERs?
+  switch (code) {
+case JUMP_INSN:
+case JUMP_TABLE_DATA: // Be careful with Table jump addresses - ADDR_VEC, 
ADDR_DIFF_VEC, PREFETCH 
+case TRAP_IF:
+case IF_THEN_ELSE: // Also COMPARE?
+case COND_EXEC: // We might want to check the operation that is under this?
+case RETURN:
+case SIMPLE_RETURN:
+case EH_RETURN:
+  return true;
+
+default:
+  return false;
+  }
+}
+
+bool handle_rtl_previle(rtx_insn *insn) {
+  // TODO : handle everything except parallel
+}
+
 bool is_prelive(insn_info *insn)
 {
   if (insn->is_artificial()) // phis are never prelive
@@ -1341,7 +1362,7 @@ bool is_prelive(insn_info *insn)
   */
 
   // Now, we only have to handle rtx insns
-  assert(insn->is_real());
+  gcc_assert (insn->is_real());
   auto rtl = insn->rtl();
 
   if (!INSN_P(rtl)) // This might be useless
@@ -1350,20 +1371,23 @@ bool is_prelive(insn_info *insn)
   rtx pat = PATTERN(rtl); // if we use this instead of rtl, then rtl notes 
wont be checked
   
   // TODO : join if statements
+  // We need to describe all possible prelive instructions, a list of all the 
instructions is inside `rtl.def`
 
-  if (JUMP_P(rtl))
+  // Control flow
+  auto rtl_code = GET_CODE(rtl);
+  if (is_control_flow(rtl_code))
 return true;
 
-  // We need to describe all possible prelive instructions, a list of all the 
instructions is inside `rtl.def`
-
   // Mark set of a global register
   if (sets_global_register(rtl))
 return true;
 
   // Call is inside side_effects_p
-  if (side_effects_p(rtl) || volatile_refs_p(rtl) || can_throw_internal(rtl))
+  if (volatile_refs_p(rtl) || can_throw_internal(rtl) || BARRIER_P(rtl) || 
rtl_code == PREFETCH)
 return true;
 
+  // TODO : handle parallel, {pre,post}_{int,dec}, {pre,post}_modify
+
   return false;
 }
 
@@ -1521,8 +1545,8 @@ rtl_ssa_dce_sweep(std::unordered_set marked)
 changes[i] = &to_delete[i];
   }
 
-  // if (verify_insn_changes())
   crtl->ssa->change_insns(changes);
+  // if (verify_insn_changes())
 }
 
 static unsigned int


[gcc/omachota/heads/rtl-ssa-dce] rtl-ssa: dce some prelive conditions

2025-02-21 Thread Ondrej Machota via Gcc-cvs
The branch 'omachota/heads/rtl-ssa-dce' was updated to point to:

 7b02cedb2aa8... rtl-ssa: dce some prelive conditions

It previously pointed to:

 175c246015a7... rtl-ssa: dce add prelive conditions

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  175c246... rtl-ssa: dce add prelive conditions
  d0095cf... rtl-ssa: dce fix working with sbitmap
  cde5332... rtl-ssa: dce fix uid
  96c7ea3... rtl-ssa: dce fix mark compile and sweep sketch


Summary of changes (added commits):
---

  7b02ced... rtl-ssa: dce some prelive conditions


[gcc(refs/users/omachota/heads/rtl-ssa-dce)] rtl-ssa: dce some prelive conditions

2025-02-21 Thread Ondrej Machota via Gcc-cvs
https://gcc.gnu.org/g:7b02cedb2aa83d5d5cbca668661bf96e58376092

commit 7b02cedb2aa83d5d5cbca668661bf96e58376092
Author: Ondřej Machota 
Date:   Fri Feb 21 14:11:36 2025 +0100

rtl-ssa: dce some prelive conditions

Diff:
---
 gcc/dce.cc | 321 +
 1 file changed, 239 insertions(+), 82 deletions(-)

diff --git a/gcc/dce.cc b/gcc/dce.cc
index f59bc6c6ffa2..6b95f89c5b2c 100644
--- a/gcc/dce.cc
+++ b/gcc/dce.cc
@@ -17,8 +17,10 @@ You should have received a copy of the GNU General Public 
License
 along with GCC; see the file COPYING3.  If not see
 .  */
 
+#include 
 #define INCLUDE_ALGORITHM
 #define INCLUDE_FUNCTIONAL
+#define INCLUDE_ARRAY
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -30,7 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl-ssa.h"
 #include "memmodel.h"
 #include "tm_p.h"
-#include "emit-rtl.h"  /* FIXME: Can go away once crtl is moved to rtl.h.  */
+#include "emit-rtl.h" /* FIXME: Can go away once crtl is moved to rtl.h.  */
 #include "cfgrtl.h"
 #include "cfgbuild.h"
 #include "cfgcleanup.h"
@@ -39,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "dbgcnt.h"
 #include "rtl-iter.h"
+#include 
 
 using namespace rtl_ssa;
 
@@ -1300,128 +1303,282 @@ public:
 } // namespace
 
 
-bool is_inherently_live(insn_info *insn) {
+bool sets_global_register(rtx_insn* insn) {
+  rtx set = single_set(insn);
+  if (!set)
+return false;
 
+  rtx dest = SET_DEST(set);
+  if (REG_P(dest) && HARD_REGISTER_NUM_P(REGNO(dest)) && 
global_regs[REGNO(dest)]) {
+return true;
+  }
+
+  return false;
 }
 
-static void rti_ssa_dce_() {
+bool is_prelive(insn_info *insn)
+{
+  if (insn->is_artificial()) // phis are never prelive
+return false;
+
+  /*
+   * There are a few functions we can use to detect if an instruction is
+   * inherently live:
+   * rtlanal.cc:
+   *  bool side_effects_p (const_rtx x);
+   *  bool volatile_insn_p (const_rtx x);
+   *
+   * rtlanal.h
+   *  bool has_side_effects (); inside class rtx_properties
+   *
+   * dce.cc:
+   *  static bool deletable_insn_p_1(rtx body); uses bool volatile_insn_p 
(const_rtx x);
+   *  static bool deletable_insn_p(rtx_insn *insn, bool fast, bitmap 
arg_stores);
+   * 
+   * Possibly the most accurate way would be to rewrite `static bool
+   * deletable_insn_p(rtx_insn *insn, bool fast, bitmap arg_stores);`
+   * 
+  */
+
+  // Now, we only have to handle rtx insns
+  assert(insn->is_real());
+  auto rtl = insn->rtl();
+
+  if (!INSN_P(rtl)) // This might be useless
+return false;
+
+  rtx pat = PATTERN(rtl); // if we use this instead of rtl, then rtl notes 
wont be checked
   
+  // TODO : join if statements
+
+  if (JUMP_P(rtl))
+return true;
+
+  // We need to describe all possible prelive instructions, a list of all the 
instructions is inside `rtl.def`
+
+  // Mark set of a global register
+  if (sets_global_register(rtl))
+return true;
+
+  // Call is inside side_effects_p
+  if (side_effects_p(rtl) || volatile_refs_p(rtl) || can_throw_internal(rtl))
+return true;
+
+  return false;
 }
 
 static void
-rtl_ssa_dce_init ()
+rtl_ssa_dce_init()
 {
-calculate_dominance_info (CDI_DOMINATORS);
-crtl->ssa = new rtl_ssa::function_info (cfun);
+  calculate_dominance_info(CDI_DOMINATORS);
+  // here we create ssa form for function
+  crtl->ssa = new rtl_ssa::function_info(cfun);
 }
 
 static void
-rtl_ssa_dce_done ()
+rtl_ssa_dce_done()
 {
-free_dominance_info (CDI_DOMINATORS);
-if (crtl->ssa->perform_pending_updates ())
-  cleanup_cfg (0);
+  free_dominance_info(CDI_DOMINATORS);
+  if (crtl->ssa->perform_pending_updates())
+cleanup_cfg(0);
 
-delete crtl->ssa;
-crtl->ssa = nullptr;
+  delete crtl->ssa;
+  crtl->ssa = nullptr;
 
-if (dump_file)
-  fprintf (dump_file, "\nFinished running rtl_ssa_dce\n\n");
+  if (dump_file)
+fprintf(dump_file, "\nFinished running rtl_ssa_dce\n\n");
 }
 
-static unsigned int
-rtl_ssa_dce ()
+static void
+rtl_ssa_dce_mark_live(insn_info *info, vec &worklist, 
std::unordered_set marked)
 {
-rtl_ssa_dce_init ();
+  int info_uid = info->uid();
+  if (dump_file)
+  {
+fprintf(dump_file, "  Adding insn %d to worklist\n", info_uid);
+  }
 
-insn_info *next;
-sbitmap marked;
-auto_vec worklist;
-for (insn_info *insn = crtl->ssa->first_insn (); insn; insn = next)
+  marked.emplace(info);
+  worklist.safe_push(info);
+}
+
+static auto_vec
+rtl_ssa_dce_prelive(std::unordered_set marked)
+{
+  insn_info *next;
+  auto_vec worklist;
+  for (insn_info *insn = crtl->ssa->first_insn(); insn; insn = next)
+  {
+if (dump_file)
 {
-  next = insn->next_any_insn ();
-  auto *rtl = insn->rtl();
-  /*
-  I would like to mark visited instruction with something like plf (Pass 
local flags) as in gimple
-
-  This file contains some useful functions: e.g. mark

[gcc r15-7665] tree-optimization/118954 - avoid UB on ref created by predcom

2025-02-21 Thread Richard Biener via Gcc-cvs
https://gcc.gnu.org/g:ee30e2586a3142e63daaf301a561984f1d22d38d

commit r15-7665-gee30e2586a3142e63daaf301a561984f1d22d38d
Author: Richard Biener 
Date:   Fri Feb 21 09:58:04 2025 +0100

tree-optimization/118954 - avoid UB on ref created by predcom

When predicitive commoning moves an invariant ref it makes sure to
not build a MEM_REF with a base that is negatively offsetted from
an object.  But in trying to preserve some transforms it does not
consider association of a constant offset with the address computation
in DR_BASE_ADDRESS leading to exactly this problem again.  This is
arguably a problem in data-ref analysis producing such an out-of-bound
DR_BASE_ADDRESS, but this looks quite involved to fix, so the
following avoids the association in one more case.  This fixes the
testcase while preserving the desired transform in
gcc.dg/tree-ssa/predcom-1.c.

PR tree-optimization/118954
* tree-predcom.cc (ref_at_iteration): Make sure to not
associate the constant offset with DR_BASE_ADDRESS when
that is an offsetted pointer.

* gcc.dg/torture/pr118954.c: New testcase.

Diff:
---
 gcc/testsuite/gcc.dg/torture/pr118954.c | 22 ++
 gcc/tree-predcom.cc |  3 ++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/torture/pr118954.c 
b/gcc/testsuite/gcc.dg/torture/pr118954.c
new file mode 100644
index ..6b95e0b8c46f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr118954.c
@@ -0,0 +1,22 @@
+/* { dg-do run } */
+/* { dg-additional-options "-fpredictive-commoning" } */
+
+int a, b, c, d;
+void e(int f)
+{
+  int g[] = {5, 8};
+  int *h = g;
+  while (c < f) {
+d = 0;
+for (; d < f; d++)
+  c = h[d];
+  }
+}
+int main()
+{
+  int i = (0 == a + 1) - 1;
+  e(i + 3);
+  if (b != 0)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-predcom.cc b/gcc/tree-predcom.cc
index ade8fbf0b5c3..d45aa3857b9f 100644
--- a/gcc/tree-predcom.cc
+++ b/gcc/tree-predcom.cc
@@ -1807,7 +1807,8 @@ ref_at_iteration (data_reference_p dr, int iter,
  then.  But for some cases we can retain that to allow tree_could_trap_p
  to return false - see gcc.dg/tree-ssa/predcom-1.c  */
   tree addr, alias_ptr;
-  if (integer_zerop  (off))
+  if (integer_zerop  (off)
+  && TREE_CODE (DR_BASE_ADDRESS (dr)) != POINTER_PLUS_EXPR)
 {
   alias_ptr = fold_convert (reference_alias_ptr_type (ref), coff);
   addr = DR_BASE_ADDRESS (dr);


[gcc r15-7663] Fortran: initialize non-saved pointers with -fcheck=pointer [PR48958]

2025-02-21 Thread Harald Anlauf via Gcc-cvs
https://gcc.gnu.org/g:7d383a7343af052798a52d575a0f0205c4a82c9c

commit r15-7663-g7d383a7343af052798a52d575a0f0205c4a82c9c
Author: Harald Anlauf 
Date:   Thu Feb 20 21:22:56 2025 +0100

Fortran: initialize non-saved pointers with -fcheck=pointer [PR48958]

PR fortran/48958

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_trans_deferred_array): Initialize the data
component of non-saved pointers when -fcheck=pointer is set.

gcc/testsuite/ChangeLog:

* gfortran.dg/pointer_init_13.f90: New test.

Diff:
---
 gcc/fortran/trans-array.cc|  7 +--
 gcc/testsuite/gfortran.dg/pointer_init_13.f90 | 24 
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 92e933add8a8..8f76870b286a 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -12123,8 +12123,11 @@ gfc_trans_deferred_array (gfc_symbol * sym, 
gfc_wrapped_block * block)
   type = TREE_TYPE (descriptor);
 }
 
-  /* NULLIFY the data pointer, for non-saved allocatables.  */
-  if (GFC_DESCRIPTOR_TYPE_P (type) && !sym->attr.save && sym->attr.allocatable)
+  /* NULLIFY the data pointer for non-saved allocatables, or for non-saved
+ pointers when -fcheck=pointer is specified.  */
+  if (GFC_DESCRIPTOR_TYPE_P (type) && !sym->attr.save
+  && (sym->attr.allocatable
+ || (sym->attr.pointer && (gfc_option.rtcheck & GFC_RTCHECK_POINTER
 {
   gfc_conv_descriptor_data_set (&init, descriptor, null_pointer_node);
   if (flag_coarray == GFC_FCOARRAY_LIB && sym->attr.codimension)
diff --git a/gcc/testsuite/gfortran.dg/pointer_init_13.f90 
b/gcc/testsuite/gfortran.dg/pointer_init_13.f90
new file mode 100644
index ..ece423a9db6d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pointer_init_13.f90
@@ -0,0 +1,24 @@
+! { dg-do compile }
+! { dg-options "-fcheck=pointer -fdump-tree-optimized -O -fno-inline" }
+!
+! PR fortran/48958
+!
+! Initialize non-saved pointers with -fcheck=pointer to support runtime checks
+! of uses of possibly undefined pointers
+
+program p
+  implicit none
+  call s
+contains
+  subroutine s
+integer, pointer :: a(:)
+integer, pointer :: b(:) => NULL()
+if (size (a) /= 0) stop 1
+if (size (b) /= 0) stop 2
+print *, size (a)
+print *, size (b)
+  end
+end
+
+! { dg-final { scan-tree-dump-times "_gfortran_runtime_error_at" 1 "optimized" 
} }
+! { dg-final { scan-tree-dump-not "_gfortran_stop_numeric" "optimized" } }


[gcc(refs/users/meissner/heads/work194-orig)] Add REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:26535448baf5d1aa0a748c60bbe0b95e049cd142

commit 26535448baf5d1aa0a748c60bbe0b95e049cd142
Author: Michael Meissner 
Date:   Fri Feb 21 16:10:16 2025 -0500

Add REVISION.

2025-02-21  Michael Meissner  

gcc/

* REVISION: New file for branch.

Diff:
---
 gcc/REVISION | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/REVISION b/gcc/REVISION
new file mode 100644
index ..f44753c5f5ff
--- /dev/null
+++ b/gcc/REVISION
@@ -0,0 +1 @@
+work194-orig branch


[gcc(refs/users/meissner/heads/work194)] Change TARGET_POPCNTB to TARGET_POWER5.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:034a3c06c3115d173a9e0e367c4d3e8b7645e214

commit 034a3c06c3115d173a9e0e367c4d3e8b7645e214
Author: Michael Meissner 
Date:   Fri Feb 21 18:19:54 2025 -0500

Change TARGET_POPCNTB to TARGET_POWER5.

This patch changes TARGET_POPCNTB to TARGET_POWER5.  The -mpopcntb switch 
is not
being changed in this patch, just the name of the macros used to determine 
if
the PowerPC processor supports ISA 2.2 (Power5).

2025-02-21  Michael Meissner  

gcc/

* gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported):
Change TARGET_POPCNTB to TARGET_POWER5.
* gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal):
Likewise.
* gcc/config/rs6000/rs6000.h (TARGET_FCFID): Likewise.
(TARGET_POWER5): New macro.
(TARGET_EXTRA_BUILTINS): Change TARGET_POPCNTB to TARGET_POWER5.
(TARGET_FRE): Likewise.
(TARGET_FRSQRTES): Likewise.
* gcc/config/rs6000/rs6000.md (enabled attribute): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc |  2 +-
 gcc/config/rs6000/rs6000.cc |  2 +-
 gcc/config/rs6000/rs6000.h  | 11 +++
 gcc/config/rs6000/rs6000.md |  2 +-
 4 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 111802381acb..4ed2bc1ca89e 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -155,7 +155,7 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins 
fncode)
 case ENB_ALWAYS:
   return true;
 case ENB_P5:
-  return TARGET_POPCNTB;
+  return TARGET_POWER5;
 case ENB_P6:
   return TARGET_CMPB;
 case ENB_P6_64:
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 675b039c2b65..011ba7c899ec 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3926,7 +3926,7 @@ rs6000_option_override_internal (bool global_init_p)
 rs6000_isa_flags |= (ISA_2_5_MASKS_EMBEDDED & ~ignore_masks);
   else if (TARGET_FPRND)
 rs6000_isa_flags |= (ISA_2_4_MASKS & ~ignore_masks);
-  else if (TARGET_POPCNTB)
+  else if (TARGET_POWER5)
 rs6000_isa_flags |= (ISA_2_2_MASKS & ~ignore_masks);
   else if (TARGET_ALTIVEC)
 rs6000_isa_flags |= (OPTION_MASK_PPC_GFXOPT & ~ignore_masks);
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index a60d7a53cfaf..5ff801c8801d 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -448,7 +448,7 @@ extern int rs6000_vector_align[];
Enable 32-bit fcfid's on any of the switches for newer ISA machines.  */
 #define TARGET_FCFID   (TARGET_POWERPC64   \
 || TARGET_PPC_GPOPT/* 970/power4 */\
-|| TARGET_POPCNTB  /* ISA 2.02 */  \
+|| TARGET_POWER5   /* ISA 2.02 */  \
 || TARGET_CMPB /* ISA 2.05 */  \
 || TARGET_POPCNTD) /* ISA 2.06 */
 
@@ -499,6 +499,9 @@ extern int rs6000_vector_align[];
 #define TARGET_MINMAX  (TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT \
 && (TARGET_P9_MINMAX || !flag_trapping_math))
 
+/* Convert ISA bits like POPCNTB to PowerPC processors like POWER5.  */
+#define TARGET_POWER5  TARGET_POPCNTB
+
 /* In switching from using target_flags to using rs6000_isa_flags, the options
machinery creates OPTION_MASK_ instead of MASK_.  The MASK_
options that have not yet been replaced by their OPTION_MASK_
@@ -525,7 +528,7 @@ extern int rs6000_vector_align[];
 
 #define TARGET_EXTRA_BUILTINS  (TARGET_POWERPC64\
 || TARGET_PPC_GPOPT /* 970/power4 */\
-|| TARGET_POPCNTB   /* ISA 2.02 */  \
+|| TARGET_POWER5/* ISA 2.02 */  \
 || TARGET_CMPB  /* ISA 2.05 */  \
 || TARGET_POPCNTD   /* ISA 2.06 */  \
 || TARGET_ALTIVEC   \
@@ -541,9 +544,9 @@ extern int rs6000_vector_align[];
 #define TARGET_FRES(TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT)
 
 #define TARGET_FRE (TARGET_HARD_FLOAT \
-&& (TARGET_POPCNTB || VECTOR_UNIT_VSX_P (DFmode)))
+&& (TARGET_POWER5 || VECTOR_UNIT_VSX_P (DFmode)))
 
-#define TARGET_FRSQRTES(TARGET_HARD_FLOAT && TARGET_POPCNTB \
+#define TARGET_FRSQRTES(TARGET_HARD_FLOAT && TARGET_POWER5 \
 && TARGET_PPC_GFXOPT)
 
 #define TARGET_FRSQRTE (TARGET_HARD_FLOAT \
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 9c718ca2a226..c5bd273be8b3 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs

[gcc(refs/users/meissner/heads/work194)] Change TARGET_FPRND to TARGET_POWER5X.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:432624d0810c17d927aa0379d6f1c3c48707e13f

commit 432624d0810c17d927aa0379d6f1c3c48707e13f
Author: Michael Meissner 
Date:   Fri Feb 21 18:21:41 2025 -0500

Change TARGET_FPRND to TARGET_POWER5X.

This patch changes TARGET_POWER5X to TARGET_POWER5.  The -mfprnd switch is 
not
being changed, just the name of the macros used to determine if the PowerPC
processor supports ISA 2.4 (Power5x).

2025-02-21  Michael Meissner  

gcc/

* gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal):
Change TARGET_FPRND to TARGET_POWER5X.
* gcc/config/rs6000/rs6000.h (TARGET_POWERP5X): New macro.
* gcc/config/rs6000/rs6000.md (fmod3): Change TARGET_FPRND to
TARGET_POWER5X.
(remainder3): Likewise.
(fctiwuz_): Likewise.
(ceil2): Likewise.
(floor2): Likewise.
(round2): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000.cc |  4 ++--
 gcc/config/rs6000/rs6000.h  |  1 +
 gcc/config/rs6000/rs6000.md | 14 +++---
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 011ba7c899ec..07f5e58532a5 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3924,7 +3924,7 @@ rs6000_option_override_internal (bool global_init_p)
 rs6000_isa_flags |= (ISA_2_5_MASKS_SERVER & ~ignore_masks);
   else if (TARGET_CMPB)
 rs6000_isa_flags |= (ISA_2_5_MASKS_EMBEDDED & ~ignore_masks);
-  else if (TARGET_FPRND)
+  else if (TARGET_POWER5X)
 rs6000_isa_flags |= (ISA_2_4_MASKS & ~ignore_masks);
   else if (TARGET_POWER5)
 rs6000_isa_flags |= (ISA_2_2_MASKS & ~ignore_masks);
@@ -3951,7 +3951,7 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_CRYPTO;
 }
 
-  if (!TARGET_FPRND && TARGET_VSX)
+  if (!TARGET_POWER5X && TARGET_VSX)
 {
   if (rs6000_isa_flags_explicit & OPTION_MASK_FPRND)
/* TARGET_VSX = 1 implies Power 7 and newer */
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 5ff801c8801d..882a3864ca66 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -501,6 +501,7 @@ extern int rs6000_vector_align[];
 
 /* Convert ISA bits like POPCNTB to PowerPC processors like POWER5.  */
 #define TARGET_POWER5  TARGET_POPCNTB
+#define TARGET_POWER5X TARGET_FPRND
 
 /* In switching from using target_flags to using rs6000_isa_flags, the options
machinery creates OPTION_MASK_ instead of MASK_.  The MASK_
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c5bd273be8b3..045ce22a03c8 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5171,7 +5171,7 @@
(use (match_operand:SFDF 1 "gpc_reg_operand"))
(use (match_operand:SFDF 2 "gpc_reg_operand"))]
   "TARGET_HARD_FLOAT
-   && TARGET_FPRND
+   && TARGET_POWER5X
&& flag_unsafe_math_optimizations"
 {
   rtx div = gen_reg_rtx (mode);
@@ -5189,7 +5189,7 @@
(use (match_operand:SFDF 1 "gpc_reg_operand"))
(use (match_operand:SFDF 2 "gpc_reg_operand"))]
   "TARGET_HARD_FLOAT
-   && TARGET_FPRND
+   && TARGET_POWER5X
&& flag_unsafe_math_optimizations"
 {
   rtx div = gen_reg_rtx (mode);
@@ -6689,7 +6689,7 @@
 (define_insn "*friz"
   [(set (match_operand:DF 0 "gpc_reg_operand" "=d,wa")
(float:DF (fix:DI (match_operand:DF 1 "gpc_reg_operand" "d,wa"]
-  "TARGET_HARD_FLOAT && TARGET_FPRND
+  "TARGET_HARD_FLOAT && TARGET_POWER5X
&& flag_unsafe_math_optimizations && !flag_trapping_math && TARGET_FRIZ"
   "@
friz %0,%1
@@ -6817,7 +6817,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "d,wa")]
 UNSPEC_FRIZ))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "@
friz %0,%1
xsrdpiz %x0,%x1"
@@ -6827,7 +6827,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "d,wa")]
 UNSPEC_FRIP))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "@
frip %0,%1
xsrdpip %x0,%x1"
@@ -6837,7 +6837,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "d,wa")]
 UNSPEC_FRIM))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "@
frim %0,%1
xsrdpim %x0,%x1"
@@ -6848,7 +6848,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "")]
 UNSPEC_FRIN))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "frin %0,%1"
   [(set_attr "type" "fp")])


[gcc(refs/users/meissner/heads/work194)] Change TARGET_CMPB to TARGET_POWER6.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:1ab6162addeec9e9ee65d6e60795f4b764a237ff

commit 1ab6162addeec9e9ee65d6e60795f4b764a237ff
Author: Michael Meissner 
Date:   Fri Feb 21 18:23:06 2025 -0500

Change TARGET_CMPB to TARGET_POWER6.

This patch changes TARGET_CMPB to TARGET_POWER6.  The -mcmpb switch is not 
being
changed, just the name of the macros used to determine if the PowerPC 
processor
supports ISA 2.5 (Power6).

2025-02-21  Michael Meissner  

gcc/

* gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported):
Change TARGET_CMPB to TARGET_POWER6.
* gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal):
Likewise.
(rs6000_rtx_costs): Likewise.
(rs6000_emit_parity): Likewise.
* gcc/config/rs6000/rs6000.h (TARGET_FCFID): Likewise.
(TARGET_LFIWAX): Likewise.
(TARGET_POWER6): New macro.
(TARGET_EXTRA_BUILTINS): Change TARGET_CMPB to TARGET_POWER6.
* gcc/config/rs6000/rs6000.md (enabled attribute): Likewise.
(parity2_cmp): Likewise.
(cmpb3): Likewise.
(copysign3): Likewise.
(copysign3_fcpsgn): Likewise.
(cmpstrnsi): Likewise.
(cmpstrsi): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc |  4 ++--
 gcc/config/rs6000/rs6000.cc |  8 
 gcc/config/rs6000/rs6000.h  |  7 ---
 gcc/config/rs6000/rs6000.md | 16 
 4 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 4ed2bc1ca89e..dbb8520ab039 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -157,9 +157,9 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins 
fncode)
 case ENB_P5:
   return TARGET_POWER5;
 case ENB_P6:
-  return TARGET_CMPB;
+  return TARGET_POWER6;
 case ENB_P6_64:
-  return TARGET_CMPB && TARGET_POWERPC64;
+  return TARGET_POWER6 && TARGET_POWERPC64;
 case ENB_P7:
   return TARGET_POPCNTD;
 case ENB_P7_64:
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 07f5e58532a5..d2814364b3d0 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3922,7 +3922,7 @@ rs6000_option_override_internal (bool global_init_p)
 rs6000_isa_flags |= (ISA_2_6_MASKS_EMBEDDED & ~ignore_masks);
   else if (TARGET_DFP)
 rs6000_isa_flags |= (ISA_2_5_MASKS_SERVER & ~ignore_masks);
-  else if (TARGET_CMPB)
+  else if (TARGET_POWER6)
 rs6000_isa_flags |= (ISA_2_5_MASKS_EMBEDDED & ~ignore_masks);
   else if (TARGET_POWER5X)
 rs6000_isa_flags |= (ISA_2_4_MASKS & ~ignore_masks);
@@ -4797,7 +4797,7 @@ rs6000_option_override_internal (bool global_init_p)
  DERAT mispredict penalty.  However the LVE and STVE altivec instructions
  need indexed accesses and the type used is the scalar type of the element
  being loaded or stored.  */
-TARGET_AVOID_XFORM = (rs6000_tune == PROCESSOR_POWER6 && TARGET_CMPB
+TARGET_AVOID_XFORM = (rs6000_tune == PROCESSOR_POWER6 && TARGET_POWER6
  && !TARGET_ALTIVEC);
 
   /* Set the -mrecip options.  */
@@ -22377,7 +22377,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
outer_code,
   return false;
 
 case PARITY:
-  *total = COSTS_N_INSNS (TARGET_CMPB ? 2 : 6);
+  *total = COSTS_N_INSNS (TARGET_POWER6 ? 2 : 6);
   return false;
 
 case NOT:
@@ -23204,7 +23204,7 @@ rs6000_emit_parity (rtx dst, rtx src)
   tmp = gen_reg_rtx (mode);
 
   /* Use the PPC ISA 2.05 prtyw/prtyd instruction if we can.  */
-  if (TARGET_CMPB)
+  if (TARGET_POWER6)
 {
   if (mode == SImode)
{
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 882a3864ca66..62e1662d078a 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -449,12 +449,12 @@ extern int rs6000_vector_align[];
 #define TARGET_FCFID   (TARGET_POWERPC64   \
 || TARGET_PPC_GPOPT/* 970/power4 */\
 || TARGET_POWER5   /* ISA 2.02 */  \
-|| TARGET_CMPB /* ISA 2.05 */  \
+|| TARGET_POWER6   /* ISA 2.05 */  \
 || TARGET_POPCNTD) /* ISA 2.06 */
 
 #define TARGET_FCTIDZ  TARGET_FCFID
 #define TARGET_STFIWX  TARGET_PPC_GFXOPT
-#define TARGET_LFIWAX  TARGET_CMPB
+#define TARGET_LFIWAX  TARGET_POWER6
 #define TARGET_LFIWZX  TARGET_POPCNTD
 #define TARGET_FCFIDS  TARGET_POPCNTD
 #define TARGET_FCFIDU  TARGET_POPCNTD
@@ -502,6 +502,7 @@ extern int rs6000_vector_align[];
 /* Convert ISA bits like POPCNTB to PowerPC processors like POWER5.  */
 #define TARGET_POWER5  TARGET_POPCNTB
 #define TARGET_POWER5X TARGET_FPRND
+#define TARGET_POWER6

[gcc(refs/users/meissner/heads/work194)] Change TARGET_MODULO to TARGET_POWER9.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a4d8a02a4dd29361205f24ced4037731ff5723b3

commit a4d8a02a4dd29361205f24ced4037731ff5723b3
Author: Michael Meissner 
Date:   Fri Feb 21 18:25:24 2025 -0500

Change TARGET_MODULO to TARGET_POWER9.

This patch changes TARGET_MODULO to TARGET_POWER9.  The -mmodulo switch is 
not
being changed, just the name of the macros used to determine if the PowerPC
processor supports ISA 3.0 (Power9).

2025-02-21  Michael Meissner  

gcc/

* gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported):
Change TARGET_MODULO to TARGET_POWER9.
* gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal):
Likewise.
* gcc/config/rs6000/rs6000.h (TARGET_CTZ): Likewise.
(TARGET_EXTSWSLI): Likewise.
(TARGET_MADDLD): Likewise.
(TARGET_POWER9): New macro.
* gcc/config/rs6000/rs6000.md (enabled attribute): Change 
TARGET_MODULO
to TARGET_POWER9.
(mod3): Likewise.
(umod3): Likewise.
(divide/modulo peephole2): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc |  4 ++--
 gcc/config/rs6000/rs6000.cc |  4 ++--
 gcc/config/rs6000/rs6000.h  |  7 ---
 gcc/config/rs6000/rs6000.md | 14 +++---
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 2366b2aee00a..d8ff7cf32dfd 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -169,9 +169,9 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins 
fncode)
 case ENB_P8V:
   return TARGET_P8_VECTOR;
 case ENB_P9:
-  return TARGET_MODULO;
+  return TARGET_POWER9;
 case ENB_P9_64:
-  return TARGET_MODULO && TARGET_POWERPC64;
+  return TARGET_POWER9 && TARGET_POWERPC64;
 case ENB_P9V:
   return TARGET_P9_VECTOR;
 case ENB_P10:
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 1bba77244c25..06d1bac5aa83 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3888,7 +3888,7 @@ rs6000_option_override_internal (bool global_init_p)
 
   /* For the newer switches (vsx, dfp, etc.) set some of the older options,
  unless the user explicitly used the -mno- to disable the code.  */
-  if (TARGET_P9_VECTOR || TARGET_MODULO || TARGET_P9_MISC)
+  if (TARGET_P9_VECTOR || TARGET_POWER9 || TARGET_P9_MISC)
 rs6000_isa_flags |= (ISA_3_0_MASKS_SERVER & ~ignore_masks);
   else if (TARGET_P9_MINMAX)
 {
@@ -22358,7 +22358,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
outer_code,
*total = rs6000_cost->divsi;
}
   /* Add in shift and subtract for MOD unless we have a mod instruction. */
-  if ((!TARGET_MODULO
+  if ((!TARGET_POWER9
   || (RS6000_DISABLE_SCALAR_MODULO && SCALAR_INT_MODE_P (mode)))
 && (code == MOD || code == UMOD))
*total += COSTS_N_INSNS (2);
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 856d268d2d27..caf8cddf905e 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -463,9 +463,9 @@ extern int rs6000_vector_align[];
 #define TARGET_FCTIWUZ TARGET_POWER7
 /* Only powerpc64 and powerpc476 support fctid.  */
 #define TARGET_FCTID   (TARGET_POWERPC64 || rs6000_cpu == PROCESSOR_PPC476)
-#define TARGET_CTZ TARGET_MODULO
-#define TARGET_EXTSWSLI(TARGET_MODULO && TARGET_POWERPC64)
-#define TARGET_MADDLD  TARGET_MODULO
+#define TARGET_CTZ TARGET_POWER9
+#define TARGET_EXTSWSLI(TARGET_POWER9 && TARGET_POWERPC64)
+#define TARGET_MADDLD  TARGET_POWER9
 
 /* TARGET_DIRECT_MOVE is redundant to TARGET_P8_VECTOR, so alias it to that.  
*/
 #define TARGET_DIRECT_MOVE TARGET_P8_VECTOR
@@ -504,6 +504,7 @@ extern int rs6000_vector_align[];
 #define TARGET_POWER5X TARGET_FPRND
 #define TARGET_POWER6  TARGET_CMPB
 #define TARGET_POWER7  TARGET_POPCNTD
+#define TARGET_POWER9  TARGET_MODULO
 
 /* In switching from using target_flags to using rs6000_isa_flags, the options
machinery creates OPTION_MASK_ instead of MASK_.  The MASK_
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 87ec37a9f8e4..db1b6c2d1164 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -403,7 +403,7 @@
  (const_int 1)
 
  (and (eq_attr "isa" "p9")
- (match_test "TARGET_MODULO"))
+ (match_test "TARGET_POWER9"))
  (const_int 1)
 
  (and (eq_attr "isa" "p9v")
@@ -3457,7 +3457,7 @@
   || INTVAL (operands[2]) <= 0
   || (i = exact_log2 (INTVAL (operands[2]))) < 0)
 {
-  if (!TARGET_MODULO)
+  if (!TARGET_POWER9)
FAIL;
 
   operands[2] = force_reg (mode, operands[2]);
@@ -3491,7 +3491,7 @@
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r,r")
 (mod:GPR (match_oper

[gcc(refs/users/meissner/heads/work194)] Add -mcpu=future tests.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:e4b02d13db9449b37905c7490c82e39ae0107cf1

commit e4b02d13db9449b37905c7490c82e39ae0107cf1
Author: Michael Meissner 
Date:   Fri Feb 21 18:30:16 2025 -0500

Add -mcpu=future tests.

This patch adds simple tests for -mcpu=future.

2025-02-21  Michael Meissner  

gcc/testsuite/

* gcc.target/powerpc/future-1.c: New test.
* gcc.target/powerpc/future-2.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/future-1.c | 13 +
 gcc/testsuite/gcc.target/powerpc/future-2.c | 24 
 2 files changed, 37 insertions(+)

diff --git a/gcc/testsuite/gcc.target/powerpc/future-1.c 
b/gcc/testsuite/gcc.target/powerpc/future-1.c
new file mode 100644
index ..f1b940d7bebf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/future-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Basic check to see if the compiler supports -mcpu=future and if it defines
+   _ARCH_PWR11.  */
+
+#ifndef _ARCH_FUTURE
+#error "-mcpu=future is not supported"
+#endif
+
+void foo (void)
+{
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/future-2.c 
b/gcc/testsuite/gcc.target/powerpc/future-2.c
new file mode 100644
index ..5552cefa3c2e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/future-2.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check if we can set the future target via a target attribute.  */
+
+__attribute__((__target__("cpu=power9")))
+void foo_p9 (void)
+{
+}
+
+__attribute__((__target__("cpu=power10")))
+void foo_p10 (void)
+{
+}
+
+__attribute__((__target__("cpu=power11")))
+void foo_p11 (void)
+{
+}
+
+__attribute__((__target__("cpu=future")))
+void foo_future (void)
+{
+}


[gcc(refs/users/meissner/heads/work194)] Add support for -mcpu=future

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:2640deb14939f7a19fe8ac728b30e412a3132e37

commit 2640deb14939f7a19fe8ac728b30e412a3132e37
Author: Michael Meissner 
Date:   Fri Feb 21 18:28:21 2025 -0500

Add support for -mcpu=future

This patch adds the support that can be used in developing GCC support for
future PowerPC processors.

2025-02-21  Michael Meissner  

* config.gcc (powerpc*-*-*): Add support for --with-cpu=future.
* config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for 
-mcpu=future.
* config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/driver-rs6000.cc (asm_names): Likewise.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): If
-mcpu=future, define _ARCH_FUTURE.
* config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): New macro.
(POWERPC_MASKS): Add OPTION_MASK_FUTURE.
(future cpu): Define.
* config/rs6000/rs6000-opts.h (enum processor_type): Add
PROCESSOR_FUTURE.
* config/rs6000/rs6000-tables.opt: Regenerate.
* config/rs6000/rs6000.cc (power10_cost): Update comment.
(get_arch_flags): Add support for future processor.
(rs6000_option_override_internal): Likewise.
(rs6000_machine_from_flags): Likewise.
(rs6000_reassociation_width): Likewise.
(rs6000_adjust_cost): Likewise.
(rs6000_issue_rate): Likewise.
(rs6000_sched_reorder): Likewise.
(rs6000_sched_reorder2): Likewise.
(rs6000_register_move_cost): Likewise.
(rs6000_opt_masks): Add -mfuture.
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/rs6000.md (cpu attribute): Likewise.
* config/rs6000/rs6000.opt (-mfuture): New internal option.

Diff:
---
 gcc/config.gcc  |  4 ++--
 gcc/config/rs6000/aix71.h   |  1 +
 gcc/config/rs6000/aix72.h   |  1 +
 gcc/config/rs6000/aix73.h   |  1 +
 gcc/config/rs6000/driver-rs6000.cc  |  2 ++
 gcc/config/rs6000/rs6000-c.cc   |  2 ++
 gcc/config/rs6000/rs6000-cpus.def   |  5 +
 gcc/config/rs6000/rs6000-opts.h |  1 +
 gcc/config/rs6000/rs6000-tables.opt | 11 +++
 gcc/config/rs6000/rs6000.cc | 30 ++
 gcc/config/rs6000/rs6000.h  |  1 +
 gcc/config/rs6000/rs6000.md |  2 +-
 gcc/config/rs6000/rs6000.opt|  6 ++
 13 files changed, 52 insertions(+), 15 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index a518e976b82e..74b66797447c 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -536,7 +536,7 @@ powerpc*-*-*)
extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h 
si2vmx.h"
extra_headers="${extra_headers} amo.h"
case x$with_cpu in
-   
xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower1[01]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500)
+   
xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower1[01]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500|xfuture)
cpu_is_64bit=yes
;;
esac
@@ -5683,7 +5683,7 @@ case "${target}" in
tm_defines="${tm_defines} CONFIG_PPC405CR"
eval "with_$which=405"
;;
-   "" | common | native \
+   "" | common | native | future \
| power[3456789] | power1[01] | power5+ | power6x \
| powerpc | powerpc64 | powerpc64le \
| rs64 \
diff --git a/gcc/config/rs6000/aix71.h b/gcc/config/rs6000/aix71.h
index 2b21dd7cd1e0..77651f5ea309 100644
--- a/gcc/config/rs6000/aix71.h
+++ b/gcc/config/rs6000/aix71.h
@@ -79,6 +79,7 @@ do {  
\
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/aix72.h b/gcc/config/rs6000/aix72.h
index 53c0bde5ad4a..652f60c7f494 100644
--- a/gcc/config/rs6000/aix72.h
+++ b/gcc/config/rs6000/aix72.h
@@ -79,6 +79,7 @@ do {  
\
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/aix73.h b/gcc/config/rs6000/aix73.h
index c7639368a264..3c66ac1d9171 100644
--- a/gcc/config/rs6000/aix73.h
+++ b/gcc/config/rs6000/aix73.h
@@ -79,6 +79,7 @@ do {  
\
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=nati

[gcc(refs/users/meissner/heads/work194-dmf)] RFC2653-Add support for dense math registers.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:864d778bd0cd973532250a1fb147cee08705d621

commit 864d778bd0cd973532250a1fb147cee08705d621
Author: Michael Meissner 
Date:   Fri Feb 21 19:36:14 2025 -0500

RFC2653-Add support for dense math registers.

The MMA subsystem added the notion of accumulator registers as an optional
feature of ISA 3.1 (power10).  In ISA 3.1, these accumulators overlapped 
with
the VSX registers 0..31, but logically the accumulator registers were 
separate
from the FPR registers.  In ISA 3.1, it was anticipated that in future 
systems,
the accumulator registers may no overlap with the FPR registers.  This patch
adds the support for dense math registers as separate registers.

This particular patch does not change the MMA support to use the 
accumulators
within the dense math registers.  This patch just adds the basic support for
having separate DMRs.  The next patch will switch the MMA support to use the
accumulators if -mcpu=future is used.

For testing purposes, I added an undocumented option '-mdense-math' to 
enable
or disable the dense math support.

This patch updates the wD constraint added in the previous patch.  If MMA is
selected but dense math is not selected (i.e. -mcpu=power10), the wD 
constraint
will allow access to accumulators that overlap with VSX registers 0..31.  If
both MMA and dense math are selected (i.e. -mcpu=future), the wD constraint
will only allow dense math registers.

This patch modifies the existing %A output modifier.  If MMA is selected but
dense math is not selected, then %A output modifier converts the VSX 
register
number to the accumulator number, by dividing it by 4.  If both MMA and 
dense
math are selected, then %A will map the separate DMR registers into 0..7.

The intention is that user code using extended asm can be modified to run on
both MMA without dense math and MMA with dense math:

1)  If possible, don't use extended asm, but instead use the MMA 
built-in
functions;

2)  If you do need to write extended asm, change the d constraints
targetting accumulators should now use wD;

3)  Only use the built-in zero, assemble and disassemble functions 
create
move data between vector quad types and dense math accumulators.
I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz directly in the
extended asm code.  The reason is these instructions assume there 
is a
1-to-1 correspondence between 4 adjacent FPR registers and an
accumulator that overlaps with those instructions.  With 
accumulators
now being separate registers, there no longer is a 1-to-1
correspondence.

It is possible that the mangling for DMRs and the GDB register numbers may
produce other changes in the future.

gcc/

2025-02-21   Michael Meissner  

* config/rs6000/mma.md (UNSPEC_MMA_DMSETDMRZ): New unspec.
(movxo): Add comments about dense math registers.
(movxo_nodm): Rename from movxo and restrict the usage to machines
without dense math registers.
(movxo_dm): New insn for movxo support for machines with dense math
registers.
(mma_): Restrict usage to machines without dense math 
registers.
(mma_xxsetaccz): Add a define_expand wrapper, and add support for 
dense
math registers.
(mma_dmsetaccz): New insn.
* config/rs6000/predicates.md (dmr_operand): New predicate.
(accumulator_operand): Add support for dense math registers.
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): 
Do
not issue a de-prime instruction when disassembling a vector quad 
on a
system with dense math registers.
* config/rs6000/rs6000-c.cc (rs6000_define_or_undefine_macro): 
Define
__DENSE_MATH__ if we have dense math registers.
* config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMR_REG_TYPE.
(enum rs6000_reload_reg_type): Add RELOAD_REG_DMR.
(LAST_RELOAD_REG_CLASS): Add support for DMR registers and the wD
constraint.
(reload_reg_map): Likewise.
(rs6000_reg_names): Likewise.
(alt_reg_names): Likewise.
(rs6000_hard_regno_nregs_internal): Likewise.
(rs6000_hard_regno_mode_ok_uncached): Likewise.
(rs6000_debug_reg_global): Likewise.
(rs6000_setup_reg_addr_masks): Likewise.
(rs6000_init_hard_regno_mode_ok): Likewise.
(rs6000_secondary_reload_memory): Add support for DMR registers.
(rs6000_secondary_reload_simple_move): Likewise.
(rs6000_preferred_reload_class): Likewise.
(rs6000_secondary_reload_class): Likewise.
(print_operan

[gcc(refs/users/meissner/heads/work194-dmf)] RFC2656-Support load/store vector with right length.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:13cb3141804c4bea80a065356a8ae36fc6741b42

commit 13cb3141804c4bea80a065356a8ae36fc6741b42
Author: Michael Meissner 
Date:   Fri Feb 21 19:40:17 2025 -0500

RFC2656-Support load/store vector with right length.

This patch adds support for new instructions that may be added to the 
PowerPC
architecture in the future to enhance the load and store vector with length
instructions.

The current instructions (lxvl, lxvll, stxvl, and stxvll) are inconvient to 
use
since the count for the number of bytes must be in the top 8 bits of the GPR
register, instead of the bottom 8 bits.  This meant that code generating 
these
instructions typically had to do a shift left by 56 bits to get the count 
into
the right position.  In a future version of the PowerPC architecture, new
variants of these instructions might be added that expect the count to be in
the bottom 8 bits of the GPR register.  These patches add this support to 
GCC
if the user uses the -mcpu=future option.

I discovered that the code in rs6000-string.cc to generate ISA 3.1 
lxvl/stxvl
future lxvll/stxvll instructions would generate these instructions on 
32-bit.
However the patterns for these instructions is only done on 64-bit systems. 
 So
I added a check for 64-bit support before generating the instructions.

The patches have been tested on both little and big endian systems.  Can I 
check
it into the master branch?

2025-02-10   Michael Meissner  

gcc/

* config/rs6000/rs6000-string.cc (expand_block_move): Do not 
generate
lxvl and stxvl on 32-bit.
* config/rs6000/vsx.md (lxvl): If -mcpu=future, generate the lxvl 
with
the shift count automaticaly used in the insn.
(lxvrl): New insn for -mcpu=future.
(lxvrll): Likewise.
(stxvl): If -mcpu=future, generate the stxvl with the shift count
automaticaly used in the insn.
(stxvrl): New insn for -mcpu=future.
(stxvrll): Likewise.

gcc/testsuite/

* gcc.target/powerpc/lxvrl.c: New test.
* lib/target-supports.exp 
(check_effective_target_powerpc_future_ok):
New effective target.

Diff:
---
 gcc/config/rs6000/rs6000-string.cc   |   1 +
 gcc/config/rs6000/vsx.md | 122 +--
 gcc/testsuite/gcc.target/powerpc/lxvrl.c |  32 
 3 files changed, 134 insertions(+), 21 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-string.cc 
b/gcc/config/rs6000/rs6000-string.cc
index 703f77fa0bf1..814328140553 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -2786,6 +2786,7 @@ expand_block_move (rtx operands[], bool might_overlap)
 
   if (TARGET_MMA && TARGET_BLOCK_OPS_UNALIGNED_VSX
  && TARGET_BLOCK_OPS_VECTOR_PAIR
+ && TARGET_POWERPC64
  && bytes >= 32
  && (align >= 256 || !STRICT_ALIGNMENT))
{
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index dd3573b80868..89523cf4a0e5 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5712,20 +5712,32 @@
   DONE;
 })
 
-;; Load VSX Vector with Length
+;; Load VSX Vector with Length.  If we have lxvrl, we don't have to do an
+;; explicit shift left into a pseudo.
 (define_expand "lxvl"
-  [(set (match_dup 3)
-(ashift:DI (match_operand:DI 2 "register_operand")
-   (const_int 56)))
-   (set (match_operand:V16QI 0 "vsx_register_operand")
-   (unspec:V16QI
-[(match_operand:DI 1 "gpc_reg_operand")
-  (mem:V16QI (match_dup 1))
- (match_dup 3)]
-UNSPEC_LXVL))]
+  [(use (match_operand:V16QI 0 "vsx_register_operand"))
+   (use (match_operand:DI 1 "gpc_reg_operand"))
+   (use (match_operand:DI 2 "gpc_reg_operand"))]
   "TARGET_P9_VECTOR && TARGET_64BIT"
 {
-  operands[3] = gen_reg_rtx (DImode);
+  rtx shift_len = gen_rtx_ASHIFT (DImode, operands[2], GEN_INT (56));
+  rtx len;
+
+  if (TARGET_FUTURE)
+len = shift_len;
+  else
+{
+  len = gen_reg_rtx (DImode);
+  emit_insn (gen_rtx_SET (len, shift_len));
+}
+
+  rtx dest = operands[0];
+  rtx addr = operands[1];
+  rtx mem = gen_rtx_MEM (V16QImode, addr);
+  rtvec rv = gen_rtvec (3, addr, mem, len);
+  rtx lxvl = gen_rtx_UNSPEC (V16QImode, rv, UNSPEC_LXVL);
+  emit_insn (gen_rtx_SET (dest, lxvl));
+  DONE;
 })
 
 (define_insn "*lxvl"
@@ -5749,6 +5761,34 @@
   "lxvll %x0,%1,%2"
   [(set_attr "type" "vecload")])
 
+;; For lxvrl and lxvrll, use the combiner to eliminate the shift.  The
+;; define_expand for lxvl will already incorporate the shift in generating the
+;; insn.  The lxvll buitl-in function required the user to have already done
+;; the shift.  Defining lxvrll this way, will optimize cases where the user has
+;; done the shift immediately before the built-in.
+(define_insn "*lxvrl"
+  [(set (matc

[gcc(refs/users/meissner/heads/work194-dmf)] RFC2655-Add saturating subtract built-ins.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:3684072bf0640be03f143dec188fe6cd9ab31cb3

commit 3684072bf0640be03f143dec188fe6cd9ab31cb3
Author: Michael Meissner 
Date:   Fri Feb 21 19:41:21 2025 -0500

RFC2655-Add saturating subtract built-ins.

This patch adds support for a saturating subtract built-in function that 
may be
added to a future PowerPC processor.  Note, if it is added, the name of the
built-in function may change before GCC 13 is released.  If the name 
changes,
we will submit a patch changing the name.

I also added support for providing dense math built-in functions, even 
though
at present, we have not added any new built-in functions for dense math.  
It is
likely we will want to add new dense math built-in functions as the dense 
math
support is fleshed out.

The patches have been tested on both little and big endian systems.  Can I 
check
it into the master branch?

2025-02-10   Michael Meissner  

gcc/

* config/rs6000/rs6000-builtin.cc (rs6000_invalid_builtin): Add 
support
for flagging invalid use of future built-in functions.
(rs6000_builtin_is_supported): Add support for future built-in
functions.
* config/rs6000/rs6000-builtins.def 
(__builtin_saturate_subtract32): New
built-in function for -mcpu=future.
(__builtin_saturate_subtract64): Likewise.
* config/rs6000/rs6000-gen-builtins.cc (enum bif_stanza): Add 
stanzas
for -mcpu=future built-ins.
(stanza_map): Likewise.
(enable_string): Likewise.
(struct attrinfo): Likewise.
(parse_bif_attrs): Likewise.
(write_decls): Likewise.
* config/rs6000/rs6000.md (sat_sub3): Add saturating subtract
built-in insn declarations.
(sat_sub3_dot): Likewise.
(sat_sub3_dot2): Likewise.
* doc/extend.texi (Future PowerPC built-ins): New section.

gcc/testsuite/

* gcc.target/powerpc/subfus-1.c: New test.
* gcc.target/powerpc/subfus-2.c: Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc | 17 
 gcc/config/rs6000/rs6000-builtins.def   | 10 +
 gcc/config/rs6000/rs6000-gen-builtins.cc| 35 ++---
 gcc/config/rs6000/rs6000.md | 60 +
 gcc/doc/extend.texi | 24 
 gcc/testsuite/gcc.target/powerpc/subfus-1.c | 32 +++
 gcc/testsuite/gcc.target/powerpc/subfus-2.c | 32 +++
 7 files changed, 205 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index ea8755b3ef8a..1885b1f636f3 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -139,6 +139,17 @@ rs6000_invalid_builtin (enum rs6000_gen_builtins fncode)
 case ENB_MMA:
   error ("%qs requires the %qs option", name, "-mmma");
   break;
+case ENB_FUTURE:
+  error ("%qs requires the %qs option", name, "-mcpu=future");
+  break;
+case ENB_FUTURE_64:
+  error ("%qs requires the %qs option and either the %qs or %qs option",
+name, "-mcpu=future", "-m64", "-mpowerpc64");
+  break;
+case ENB_DM:
+  error ("%qs requires the %qs or %qs options", name, "-mcpu=future",
+"-mdense-math");
+  break;
 default:
 case ENB_ALWAYS:
   gcc_unreachable ();
@@ -194,6 +205,12 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins 
fncode)
   return TARGET_HTM;
 case ENB_MMA:
   return TARGET_MMA;
+case ENB_FUTURE:
+  return TARGET_FUTURE;
+case ENB_FUTURE_64:
+  return TARGET_FUTURE && TARGET_POWERPC64;
+case ENB_DM:
+  return TARGET_DENSE_MATH;
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 555d7d589506..eef5f41f7615 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -137,6 +137,8 @@
 ;   endian   Needs special handling for endianness
 ;   ibmldRestrict usage to the case when TFmode is IBM-128
 ;   ibm128   Restrict usage to the case where __ibm128 is supported or if ibmld
+;   future   Restrict usage to future instructions
+;   dm   Restrict usage to dense math
 ;
 ; Each attribute corresponds to extra processing required when
 ; the built-in is expanded.  All such special processing should
@@ -3924,3 +3926,11 @@
 
   void __builtin_vsx_stxvp (v256, unsigned long, const v256 *);
 STXVP nothing {mma,pair}
+
+[future]
+  const signed int __builtin_saturate_subtract32 (signed int, signed int);
+  SAT_SUBSI sat_subsi3 {}
+
+[future-64]
+  const signed long __builtin_saturate_subtract64 (signed long,  signed long);
+  SAT_SUBDI sat_subdi3 {}
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.cc 
b/gcc/conf

[gcc(refs/users/meissner/heads/work194-dmf)] Revert changes

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:cce871e0ad2f21542a0459b8c4d8315d00f25f8e

commit cce871e0ad2f21542a0459b8c4d8315d00f25f8e
Author: Michael Meissner 
Date:   Fri Feb 21 19:46:58 2025 -0500

Revert changes

Diff:
---
 gcc/config/rs6000/altivec.md   | 14 
 gcc/config/rs6000/constraints.md   | 10 ---
 gcc/config/rs6000/predicates.md| 52 +---
 gcc/config/rs6000/rs6000.cc| 25 --
 gcc/config/rs6000/rs6000.h |  7 --
 gcc/config/rs6000/rs6000.md| 96 +++---
 gcc/testsuite/gcc.target/powerpc/prefixed-addis.c  | 24 --
 .../gcc.target/powerpc/vector-rotate-left.c| 34 
 8 files changed, 14 insertions(+), 248 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index d158cf479d60..7edc288a6565 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1982,20 +1982,6 @@
 }
   [(set_attr "type" "vecperm")])
 
-;; -mcpu=future adds a vector rotate left word variant.  There is no vector
-;; byte/half-word/double-word/quad-word rotate left.  This insn occurs before
-;; altivec_vrl and will match for -mcpu=future, while other cpus will
-;; match the generic insn.
-(define_insn "*xvrlw"
-  [(set (match_operand:V4SI 0 "register_operand" "=v,wa")
-   (rotate:V4SI (match_operand:V4SI 1 "register_operand" "v,wa")
-(match_operand:V4SI 2 "register_operand" "v,wa")))]
-  "TARGET_XVRLW"
-  "@
-   vrlw %0,%1,%2
-   xvrlw %x0,%x1,%x2"
-  [(set_attr "type" "vecsimple")])
-
 (define_insn "altivec_vrl"
   [(set (match_operand:VI2 0 "register_operand" "=v")
 (rotate:VI2 (match_operand:VI2 1 "register_operand" "v")
diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 5440becb6e6c..3da9ed086810 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -222,16 +222,6 @@
   "An IEEE 128-bit constant that can be loaded into VSX registers."
   (match_operand 0 "easy_vector_constant_ieee128"))
 
-(define_constraint "eU"
-  "@internal integer constant that can be loaded with paddis"
-  (and (match_code "const_int")
-   (match_operand 0 "paddis_operand")))
-
-(define_constraint "eV"
-  "@internal integer constant that can be loaded with paddis + paddi"
-  (and (match_code "const_int")
-   (match_operand 0 "paddis_paddi_operand")))
-
 ;; Floating-point constraints.  These two are defined so that insn
 ;; length attributes can be calculated exactly.
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index c206860e4927..c95b4336f062 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -369,53 +369,6 @@
   return SIGNED_INTEGER_34BIT_P (INTVAL (op));
 })
 
-;; Return 1 if op is a 64-bit constant that uses the paddis instruction
-(define_predicate "paddis_operand"
-  (match_code "const_int")
-{
-  if (!TARGET_PADDIS && TARGET_POWERPC64)
-return 0;
-
-  /* If addi, addis, or paddi can handle the number, don't return true.  */
-  HOST_WIDE_INT value = INTVAL (op);
-  if (SIGNED_INTEGER_34BIT_P (value))
-return false;
-
-  /* If the number is too large for padds, return false.  */
-  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
-return false;
-
-  /* If the bottom 32-bits are non-zero, paddis can't handle it.  */
-  if ((value & HOST_WIDE_INT_C(0x)) != 0)
-return false;
-
-  return true;
-})
-
-;; Return 1 if op is a 64-bit constant that needs the paddis instruction and an
-;; addi/addis/paddi instruction combination.
-(define_predicate "paddis_paddi_operand"
-  (match_code "const_int")
-{
-  if (!TARGET_PADDIS && TARGET_POWERPC64)
-return 0;
-
-  /* If addi, addis, or paddi can handle the number, don't return true.  */
-  HOST_WIDE_INT value = INTVAL (op);
-  if (SIGNED_INTEGER_34BIT_P (value))
-return false;
-
-  /* If the number is too large for padds, return false.  */
-  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
-return false;
-
-  /* If the bottom 32-bits are zero, we can use paddis alone to handle it.  */
-  if ((value & HOST_WIDE_INT_C(0x)) == 0)
-return false;
-
-  return true;
-})
-
 ;; Return 1 if op is a register that is not special.
 ;; Disallow (SUBREG:SF (REG:SI)) and (SUBREG:SI (REG:SF)) on VSX systems where
 ;; you need to be careful in moving a SFmode to SImode and vice versa due to
@@ -1160,10 +1113,7 @@
   (if_then_else (match_code "const_int")
 (match_test "satisfies_constraint_I (op)
 || satisfies_constraint_L (op)
-|| satisfies_constraint_eI (op)
-|| satisfies_constraint_eU (op)
-|| satisfies_constraint_eV (op)")
-
+|| satisfies_constraint_eI (op)")
 (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/conf

[gcc(refs/users/meissner/heads/work194-dmf)] RFC2677-Add xvrlw support.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:da13838d7eef36b008c8e586150112b38fafff45

commit da13838d7eef36b008c8e586150112b38fafff45
Author: Michael Meissner 
Date:   Fri Feb 21 19:45:37 2025 -0500

RFC2677-Add xvrlw support.

2025-02-21  Michael Meissner  

gcc/

* config/rs6000/altivec.md (xvrlw): New insn.
* config/rs6000/rs6000.h (TARGET_XVRLW): New macro.

gcc/testsuite/

* gcc.target/powerpc/vector-rotate-left.c: New test.

Diff:
---
 gcc/config/rs6000/altivec.md   | 14 +
 gcc/config/rs6000/rs6000.h |  3 ++
 .../gcc.target/powerpc/vector-rotate-left.c| 34 ++
 3 files changed, 51 insertions(+)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 7edc288a6565..d158cf479d60 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1982,6 +1982,20 @@
 }
   [(set_attr "type" "vecperm")])
 
+;; -mcpu=future adds a vector rotate left word variant.  There is no vector
+;; byte/half-word/double-word/quad-word rotate left.  This insn occurs before
+;; altivec_vrl and will match for -mcpu=future, while other cpus will
+;; match the generic insn.
+(define_insn "*xvrlw"
+  [(set (match_operand:V4SI 0 "register_operand" "=v,wa")
+   (rotate:V4SI (match_operand:V4SI 1 "register_operand" "v,wa")
+(match_operand:V4SI 2 "register_operand" "v,wa")))]
+  "TARGET_XVRLW"
+  "@
+   vrlw %0,%1,%2
+   xvrlw %x0,%x1,%x2"
+  [(set_attr "type" "vecsimple")])
+
 (define_insn "altivec_vrl"
   [(set (match_operand:VI2 0 "register_operand" "=v")
 (rotate:VI2 (match_operand:VI2 1 "register_operand" "v")
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 330548797815..8b9c23cebc44 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -584,6 +584,9 @@ extern int rs6000_vector_align[];
 /* Whether we have PADDIS support.  */
 #define TARGET_PADDIS  TARGET_FUTURE
 
+/* Whether we have XVRLW support.  */
+#define TARGET_XVRLW   TARGET_FUTURE
+
 /* Whether the various reciprocal divide/square root estimate instructions
exist, and whether we should automatically generate code for the instruction
by default.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-rotate-left.c 
b/gcc/testsuite/gcc.target/powerpc/vector-rotate-left.c
new file mode 100644
index ..5a5f37755077
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-rotate-left.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Test whether the xvrl (vector word rotate left using VSX registers insead of
+   Altivec registers is generated.  */
+
+#include 
+
+typedef vector unsigned int  v4si_t;
+
+v4si_t
+rotl_v4si_scalar (v4si_t x, unsigned long n)
+{
+  __asm__ (" # %x0" : "+f" (x));
+  return (x << n) | (x >> (32 - n));   /* xvrlw.  */
+}
+
+v4si_t
+rotr_v4si_scalar (v4si_t x, unsigned long n)
+{
+  __asm__ (" # %x0" : "+f" (x));
+  return (x >> n) | (x << (32 - n));   /* xvrlw.  */
+}
+
+v4si_t
+rotl_v4si_vector (v4si_t x, v4si_t y)
+{
+  __asm__ (" # %x0" : "+f" (x));   /* xvrlw.  */
+  return vec_rl (x, y);
+}
+
+/* { dg-final { scan-assembler-times {\mxvrlw\M} 3  } } */


[gcc(refs/users/meissner/heads/work194-dmf)] RFC2653-Add wD constraint.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:18240549700410fe3e7d9c4a672ddbced5cc4af0

commit 18240549700410fe3e7d9c4a672ddbced5cc4af0
Author: Michael Meissner 
Date:   Fri Feb 21 19:34:58 2025 -0500

RFC2653-Add wD constraint.

This patch adds a new constraint ('wD') that matches the accumulator 
registers
that overlap with VSX registers 0..31 on power10.  Future patches will add 
the
support for a separate accumulator register class that will be used when the
support for dense math registes is added.

2025-02-21   Michael Meissner  

* config/rs6000/constraints.md (wD): New constraint.
* config/rs6000/mma.md (mma_): Prepare for alternate 
accumulator
registers.  Use wD constraint instead of 'd' constraint.  Use
accumulator_operand instead of fpr_reg_operand.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0")]
MMA_ACC))]
   "TARGET_MMA"
   " %A0"
@@ -523,7 +523,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
MMA_VV))]
@@ -532,8 +532,8 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
MMA_AVV))]
@@ -542,7 +542,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:OO 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
MMA_PV))]
@@ -551,8 +551,8 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:OO 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
MMA_APV))]
@@ -561,7 +561,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:SI 3 "const_0_to_15_operand" "n,n")
@@ -574,8 +574,8 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")
(match_operand:SI 4 "const_0_to_15_operand" "n,n")
@@ -588,7 +588,7 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:SI 3 "const_0_to_15_operand" "n,n")
@@ -601,8 +601,8 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
   

[gcc(refs/users/meissner/heads/work194-dmf)] Update ChangeLog.*

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:90afe88734ee002d0564e759dd3e20dd8c18d3b0

commit 90afe88734ee002d0564e759dd3e20dd8c18d3b0
Author: Michael Meissner 
Date:   Fri Feb 21 19:49:03 2025 -0500

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.dmf | 329 ++
 1 file changed, 329 insertions(+)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
index e593ea8e769d..fa5f4972f7ed 100644
--- a/gcc/ChangeLog.dmf
+++ b/gcc/ChangeLog.dmf
@@ -1,5 +1,334 @@
+ Branch work194-dmf, patch #121 was reverted 

+ Branch work194-dmf, patch #120 was reverted 

+
+ Branch work194-dmf, patch #111 
+
+RFC2655-Add saturating subtract built-ins.
+
+This patch adds support for a saturating subtract built-in function that may be
+added to a future PowerPC processor.  Note, if it is added, the name of the
+built-in function may change before GCC 13 is released.  If the name changes,
+we will submit a patch changing the name.
+
+I also added support for providing dense math built-in functions, even though
+at present, we have not added any new built-in functions for dense math.  It is
+likely we will want to add new dense math built-in functions as the dense math
+support is fleshed out.
+
+The patches have been tested on both little and big endian systems.  Can I 
check
+it into the master branch?
+
+2025-02-10   Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-builtin.cc (rs6000_invalid_builtin): Add support
+   for flagging invalid use of future built-in functions.
+   (rs6000_builtin_is_supported): Add support for future built-in
+   functions.
+   * config/rs6000/rs6000-builtins.def (__builtin_saturate_subtract32): New
+   built-in function for -mcpu=future.
+   (__builtin_saturate_subtract64): Likewise.
+   * config/rs6000/rs6000-gen-builtins.cc (enum bif_stanza): Add stanzas
+   for -mcpu=future built-ins.
+   (stanza_map): Likewise.
+   (enable_string): Likewise.
+   (struct attrinfo): Likewise.
+   (parse_bif_attrs): Likewise.
+   (write_decls): Likewise.
+   * config/rs6000/rs6000.md (sat_sub3): Add saturating subtract
+   built-in insn declarations.
+   (sat_sub3_dot): Likewise.
+   (sat_sub3_dot2): Likewise.
+   * doc/extend.texi (Future PowerPC built-ins): New section.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/subfus-1.c: New test.
+   * gcc.target/powerpc/subfus-2.c: Likewise.
+
+ Branch work194-dmf, patch #110 
+
+RFC2656-Support load/store vector with right length.
+
+This patch adds support for new instructions that may be added to the PowerPC
+architecture in the future to enhance the load and store vector with length
+instructions.
+
+The current instructions (lxvl, lxvll, stxvl, and stxvll) are inconvient to use
+since the count for the number of bytes must be in the top 8 bits of the GPR
+register, instead of the bottom 8 bits.  This meant that code generating these
+instructions typically had to do a shift left by 56 bits to get the count into
+the right position.  In a future version of the PowerPC architecture, new
+variants of these instructions might be added that expect the count to be in
+the bottom 8 bits of the GPR register.  These patches add this support to GCC
+if the user uses the -mcpu=future option.
+
+I discovered that the code in rs6000-string.cc to generate ISA 3.1 lxvl/stxvl
+future lxvll/stxvll instructions would generate these instructions on 32-bit.
+However the patterns for these instructions is only done on 64-bit systems.  So
+I added a check for 64-bit support before generating the instructions.
+
+The patches have been tested on both little and big endian systems.  Can I 
check
+it into the master branch?
+
+2025-02-10   Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-string.cc (expand_block_move): Do not generate
+   lxvl and stxvl on 32-bit.
+   * config/rs6000/vsx.md (lxvl): If -mcpu=future, generate the lxvl with
+   the shift count automaticaly used in the insn.
+   (lxvrl): New insn for -mcpu=future.
+   (lxvrll): Likewise.
+   (stxvl): If -mcpu=future, generate the stxvl with the shift count
+   automaticaly used in the insn.
+   (stxvrl): New insn for -mcpu=future.
+   (stxvrll): Likewise.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/lxvrl.c: New test.
+   * lib/target-supports.exp (check_effective_target_powerpc_future_ok):
+   New effective target.
+
+ Branch work194-dmf, patch #102 
+
+RFC2653-PowerPC: Add support for 1,024 bit DMR registers.
+
+This patch is a prelimianry patch to add the full 1,024 bit dense math register
+(DMRs) for -mcpu=future.  The MMA 512-bit accumulators map onto the top of the
+DMR register.
+
+This patch only adds the new 1,024 bit register support.  It does not add
+su

[gcc(refs/users/meissner/heads/work194-bugs)] PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:dc0b2f9b0772ebd89b2b16caeef5d762e5bbd582

commit dc0b2f9b0772ebd89b2b16caeef5d762e5bbd582
Author: Michael Meissner 
Date:   Fri Feb 21 19:50:43 2025 -0500

PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

We had optimizations for splat of a vector extract for the other vector
types, but we missed having one for V2DI and V2DF.  This patch adds a
combiner insn to do this optimization.

In looking at the source, we had similar optimizations for V4SI and V4SF
extract and splats, but we missed doing V2DI/V2DF.

Without the patch for the code:

vector long long splat_dup_l_0 (vector long long v)
{
  return __builtin_vec_splats (__builtin_vec_extract (v, 0));
}

the compiler generates (on a little endian power9):

splat_dup_l_0:
mfvsrld 9,34
mtvsrdd 34,9,9
blr

Now it generates:

splat_dup_l_0:
xxpermdi 34,34,34,3
blr

2025-02-21  Michael Meissner  

gcc/

PR target/99293
* config/rs6000/vsx.md (vsx_splat_extract_): New insn.

gcc/testsuite/

PR target/99293
* gcc.target/powerpc/builtins-1.c: Adjust insn count.
* gcc.target/powerpc/pr99293.c: New test.

Diff:
---
 gcc/config/rs6000/vsx.md  | 18 ++
 gcc/testsuite/gcc.target/powerpc/builtins-1.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr99293.c| 22 ++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index dd3573b80868..d84a2a357a31 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4798,6 +4798,24 @@
   "lxvdsx %x0,%y1"
   [(set_attr "type" "vecload")])
 
+;; Optimize SPLAT of an extract from a V2DF/V2DI vector with a constant element
+(define_insn "*vsx_splat_extract_"
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")
+   (vec_duplicate:VSX_D
+(vec_select:
+ (match_operand:VSX_D 1 "vsx_register_operand" "wa")
+ (parallel [(match_operand 2 "const_0_to_1_operand" "n")]]
+  "VECTOR_MEM_VSX_P (mode)"
+{
+  int which_word = INTVAL (operands[2]);
+  if (!BYTES_BIG_ENDIAN)
+which_word = 1 - which_word;
+
+  operands[3] = GEN_INT (which_word ? 3 : 0);
+  return "xxpermdi %x0,%x1,%x1,%3";
+}
+  [(set_attr "type" "vecperm")])
+
 ;; V4SI splat support
 (define_insn "vsx_splat_v4si"
   [(set (match_operand:V4SI 0 "vsx_register_operand" "=wa,wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-1.c 
b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
index 8410a5fd4319..4e7e5384675f 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
@@ -1035,4 +1035,4 @@ foo156 (vector unsigned short usa)
 /* { dg-final { scan-assembler-times {\mvmrglb\M} 3 } } */
 /* { dg-final { scan-assembler-times {\mvmrgew\M} 4 } } */
 /* { dg-final { scan-assembler-times {\mvsplth|xxsplth\M} 4 } } */
-/* { dg-final { scan-assembler-times {\mxxpermdi\M} 44 } } */
+/* { dg-final { scan-assembler-times {\mxxpermdi\M} 42 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr99293.c 
b/gcc/testsuite/gcc.target/powerpc/pr99293.c
new file mode 100644
index ..20adc1f27f65
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr99293.c
@@ -0,0 +1,22 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+/* Test for PR 99263, which wants to do:
+   __builtin_vec_splats (__builtin_vec_extract (v, n))
+
+   where v is a V2DF or V2DI vector and n is either 0 or 1.  Previously the
+   compiler would do a direct move to the GPR registers to select the item and 
a
+   direct move from the GPR registers to do the splat.  */
+
+vector long long splat_dup_l_0 (vector long long v)
+{
+  return __builtin_vec_splats (__builtin_vec_extract (v, 0));
+}
+
+vector long long splat_dup_l_1 (vector long long v)
+{
+  return __builtin_vec_splats (__builtin_vec_extract (v, 1));
+}
+
+/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */


[gcc(refs/users/meissner/heads/work194-bugs)] Revert changes

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:88609f13226dc0d1b6ccf56b81af275526cd86cc

commit 88609f13226dc0d1b6ccf56b81af275526cd86cc
Author: Michael Meissner 
Date:   Fri Feb 21 19:54:24 2025 -0500

Revert changes

Diff:
---
 gcc/config/rs6000/vsx.md| 142 +---
 gcc/testsuite/gcc.target/powerpc/pr108958.c |   0
 2 files changed, 5 insertions(+), 137 deletions(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f47c4e2f7766..d84a2a357a31 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -6282,7 +6282,7 @@
(SFBOOL_MFVSR_A  3) ;; move to gpr src
(SFBOOL_BOOL_D   4) ;; and/ior/xor dest
(SFBOOL_BOOL_A1  5) ;; and/ior/xor arg1
-   (SFBOOL_BOOL_A2  6) ;; and/ior/xor arg2
+   (SFBOOL_BOOL_A2  6) ;; and/ior/xor arg1
(SFBOOL_SHL_D7) ;; shift left dest
(SFBOOL_SHL_A8) ;; shift left arg
(SFBOOL_MTVSR_D  9) ;; move to vecter dest
@@ -6322,18 +6322,18 @@
 ;; GPR, and instead move the integer mask value to the vector register after a
 ;; shift and do the VSX logical operation.
 
-;; The insns for dealing with SFmode in GPR registers looks like on power8:
+;; The insns for dealing with SFmode in GPR registers looks like:
 ;; (set (reg:V4SF reg2) (unspec:V4SF [(reg:SF reg1)] UNSPEC_VSX_CVDPSPN))
 ;;
-;; (set (reg:DI reg3) (zero_extend:DI (reg:SI reg2)))
+;; (set (reg:DI reg3) (unspec:DI [(reg:V4SF reg2)] UNSPEC_P8V_RELOAD_FROM_VSX))
 ;;
-;; (set (reg:DI reg4) (and:SI (reg:SI reg3) (reg:SI mask)))
+;; (set (reg:DI reg4) (and:DI (reg:DI reg3) (reg:DI reg3)))
 ;;
 ;; (set (reg:DI reg5) (ashift:DI (reg:DI reg4) (const_int 32)))
 ;;
 ;; (set (reg:SF reg6) (unspec:SF [(reg:DI reg5)] UNSPEC_P8V_MTVSRD))
 ;;
-;; (set (reg:SF reg7) (unspec:SF [(reg:SF reg6)] UNSPEC_VSX_CVSPDPN))
+;; (set (reg:SF reg6) (unspec:SF [(reg:SF reg6)] UNSPEC_VSX_CVSPDPN))
 
 (define_peephole2
   [(match_scratch:DI SFBOOL_TMP_GPR "r")
@@ -6414,138 +6414,6 @@
   operands[SFBOOL_MTVSR_D_V4SF] = gen_rtx_REG (V4SFmode, regno_mtvsr_d);
 })
 
-;; Constants for SFbool optimization on power9/power10
-(define_constants
-  [(SFBOOL2_TMP_VSX_V4SI0) ;; vector temporary (V4SI)
-   (SFBOOL2_TMP_GPR_SI  1) ;; GPR temporary (SI)
-   (SFBOOL2_MFVSR_D 2) ;; move to gpr dest (DI)
-   (SFBOOL2_MFVSR_A 3) ;; move to gpr src (SI)
-   (SFBOOL2_BOOL_D  4) ;; and/ior/xor dest (SI)
-   (SFBOOL2_BOOL_A1 5) ;; and/ior/xor arg1 (SI)
-   (SFBOOL2_BOOL_A2 6) ;; and/ior/xor arg2 (SI)
-   (SFBOOL2_SPLAT_D 7) ;; splat dest (V4SI)
-   (SFBOOL2_MTVSR_D 8) ;; move/splat to VSX dest.
-   (SFBOOL2_MTVSR_A 9) ;; move/splat to VSX arg.
-   (SFBOOL2_MFVSR_A_V4SI   10) ;; MFVSR_A as V4SI
-   (SFBOOL2_MTVSR_D_V4SI   11) ;; MTVSR_D as V4SI
-   (SFBOOL2_XXSPLTW12)])   ;; 1 or 3 for XXSPLTW
-
-;; On power9/power10, the code is different because we have a splat 32-bit
-;; operation that does a direct move to the FPR/vector registers (MTVSRWS).
-;;
-;; The insns for dealing with SFmode in GPR registers looks like on
-;; power9/power10:
-;;
-;; (set (reg:V4SF reg2) (unspec:V4SF [(reg:SF reg1)] UNSPEC_VSX_CVDPSPN))
-;;
-;; (set (reg:DI reg3) (zero_extend:DI (reg:SI reg2)))
-;;
-;; (set (reg:SI reg4) (and:SI (reg:SI reg3) (reg:SI mask)))
-;;
-;; (set (reg:V4SI reg5) (vec_duplicate:V4SI (reg:SI reg4)))
-;;
-;; (set (reg:SF reg6) (unspec:SF [(reg:SF reg5)] UNSPEC_VSX_CVSPDPN))
-
-;; The VSX temporary needs to be an Altivec register in case we are trying to
-;; do and/ior/xor of -16..15 and we want to use VSPLTISW to load the constant.
-;;
-;; The GPR temporary is only used if we are trying to do a logical operation
-;; with a constant outside of the -16..15 range on a power9.  Otherwise, we can
-;; load the constant directly into the VSX temporary register.
-
-(define_peephole2
-  [(match_scratch:V4SI SFBOOL2_TMP_VSX_V4SI "v")
-   (match_scratch:SI SFBOOL2_TMP_GPR_SI "r")
-
-   ;; Zero_extend and direct move
-   (set (match_operand:DI SFBOOL2_MFVSR_D "int_reg_operand")
-   (zero_extend:DI
-(match_operand:SI SFBOOL2_MFVSR_A "vsx_register_operand")))
-
-   ;; AND/IOR/XOR operation on int
-   (set (match_operand:SI SFBOOL2_BOOL_D "int_reg_operand")
-   (and_ior_xor:SI
-(match_operand:SI SFBOOL2_BOOL_A1 "int_reg_operand")
-(match_operand:SI SFBOOL2_BOOL_A2 "reg_or_cint_operand")))
-
-   ;; Splat sfbool result to vector register
-   (set (match_operand:V4SI SFBOOL2_SPLAT_D "vsx_register_operand")
-   (vec_duplicate:V4SI
-(match_dup SFBOOL2_BOOL_D)))]
-
-  "TARGET_POWERPC64 && TARGET_P9_VECTOR
-   && REG

[gcc(refs/users/meissner/heads/work194-dmf)] RFC2653-PowerPC: Add support for 1, 024 bit DMR registers.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:50a7022e5b54801be5b72d253cd7118a51b90494

commit 50a7022e5b54801be5b72d253cd7118a51b90494
Author: Michael Meissner 
Date:   Fri Feb 21 19:37:53 2025 -0500

RFC2653-PowerPC: Add support for 1,024 bit DMR registers.

This patch is a prelimianry patch to add the full 1,024 bit dense math 
register
(DMRs) for -mcpu=future.  The MMA 512-bit accumulators map onto the top of 
the
DMR register.

This patch only adds the new 1,024 bit register support.  It does not add
support for any instructions that need 1,024 bit registers instead of 512 
bit
registers.

I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit
registers.  The 'wD' constraint added in previous patches is used for these
registers.  I added support to do load and store of DMRs via the VSX 
registers,
since there are no load/store dense math instructions.  I added the new 
keyword
'__dmr' to create 1,024 bit types that can be loaded into DMRs.  At 
present, I
don't have aliases for __dmr512 and __dmr1024 that we've discussed 
internally.

The patches have been tested on both little and big endian systems.  Can I 
check
it into the master branch?

2025-02-21   Michael Meissner  

gcc/

* config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec.
(UNSPEC_DM_INSERT512_LOWER): Likewise.
(UNSPEC_DM_EXTRACT512): Likewise.
(UNSPEC_DMR_RELOAD_FROM_MEMORY): Likewise.
(UNSPEC_DMR_RELOAD_TO_MEMORY): Likewise.
(movtdo): New define_expand and define_insn_and_split to implement 
1,024
bit DMR registers.
(movtdo_insert512_upper): New insn.
(movtdo_insert512_lower): Likewise.
(movtdo_extract512): Likewise.
(reload_dmr_from_memory): Likewise.
(reload_dmr_to_memory): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMR
support.
(rs6000_init_builtins): Add support for __dmr keyword.
* config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add 
support
for TDOmode.
(rs6000_function_arg): Likewise.
* config/rs6000/rs6000-modes.def (TDOmode): New mode.
* config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add
support for TDOmode.
(rs6000_hard_regno_mode_ok_uncached): Likewise.
(rs6000_hard_regno_mode_ok): Likewise.
(rs6000_modes_tieable_p): Likewise.
(rs6000_debug_reg_global): Likewise.
(rs6000_setup_reg_addr_masks): Likewise.
(rs6000_init_hard_regno_mode_ok): Add support for TDOmode.  Setup 
reload
hooks for DMR mode.
(reg_offset_addressing_ok_p): Add support for TDOmode.
(rs6000_emit_move): Likewise.
(rs6000_secondary_reload_simple_move): Likewise.
(rs6000_preferred_reload_class): Likewise.
(rs6000_secondary_reload_class): Likewise.
(rs6000_mangle_type): Add mangling for __dmr type.
(rs6000_dmr_register_move_cost): Add support for TDOmode.
(rs6000_split_multireg_move): Likewise.
(rs6000_invalid_conversion): Likewise.
* config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode.
(enum rs6000_builtin_type_index): Add DMR type nodes.
(dmr_type_node): Likewise.
(ptr_dmr_type_node): Likewise.

gcc/testsuite/

* gcc.target/powerpc/dm-1024bit.c: New test.
* lib/target-supports.exp (check_effective_target_ppc_dmr_ok): New
target test.

Diff:
---
 gcc/config/rs6000/mma.md  | 154 ++
 gcc/config/rs6000/rs6000-builtin.cc   |  17 +++
 gcc/config/rs6000/rs6000-call.cc  |  10 +-
 gcc/config/rs6000/rs6000-modes.def|   4 +
 gcc/config/rs6000/rs6000.cc   | 101 -
 gcc/config/rs6000/rs6000.h|   6 +-
 gcc/testsuite/gcc.target/powerpc/dm-1024bit.c |  63 +++
 gcc/testsuite/lib/target-supports.exp |  35 ++
 8 files changed, 356 insertions(+), 34 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 683d2398ef90..1420fadd4355 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -92,6 +92,11 @@
UNSPEC_MMA_XXMFACC
UNSPEC_MMA_XXMTACC
UNSPEC_MMA_DMSETDMRZ
+   UNSPEC_DM_INSERT512_UPPER
+   UNSPEC_DM_INSERT512_LOWER
+   UNSPEC_DM_EXTRACT512
+   UNSPEC_DMR_RELOAD_FROM_MEMORY
+   UNSPEC_DMR_RELOAD_TO_MEMORY
   ])
 
 (define_c_enum "unspecv"
@@ -742,3 +747,152 @@
   " %A0,%x2,%x3,%4,%5,%6"
   [(set_attr "type" "mma")
(set_attr "prefixed" "yes")])
+
+;; TDOmode (__dmr keyword for 1,024 bit registers).
+(define_expand "movtdo"
+  [(set (match_operand:TDO 0 "nonimmediate_operand")
+   (match_operand:

[gcc(refs/users/meissner/heads/work194-bugs)] Add power9 and power10 float to logical optimizations.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:d584470f889a68b6d745e1815b0afe46b4c73b07

commit d584470f889a68b6d745e1815b0afe46b4c73b07
Author: Michael Meissner 
Date:   Fri Feb 21 19:55:40 2025 -0500

Add power9 and power10 float to logical optimizations.

I was answering an email from a co-worker and I pointed him to work I had 
done
for the Power8 era that optimizes the 32-bit float math library in Glibc.  
In
doing so, I discovered with the Power9 and later computers, this 
optimization
is no longer taking place.

The glibc 32-bit floating point math functions have code that looks like:

union u {
  float f;
  uint32_t u32;
};

float
math_foo (float x, unsigned int mask)
{
  union u arg;
  float x2;

  arg.f = x;
  arg.u32 &= mask;

  x2 = arg.f;
  /* ... */
}

On power8 with the optimization it generates:

xscvdpspn 0,1
sldi 9,4,32
mtvsrd 32,9
xxland 1,0,32
xscvspdpn 1,1

I.e., it converts the SFmode to the memory format (instead of the DFmode 
that
is used within the register), converts the mask so that it is in the vector
register in the upper 32-bits, and does a XXLAND (i.e. there is only one 
direct
move from GPR to vector register).  Then after doing this, it converts the
upper 32-bits back to DFmode.

If the XSCVSPDN instruction took the value in the normal 32-bit scalar in a
vector register, we wouldn't have needed the SLDI of the mask.

On power9/power10/power11 it currently generates:

xscvdpspn 0,1
mfvsrwz 2,0
and 2,2,4
mtvsrws 1,2
xscvspdpn 1,1
blr

I.e convert to SFmode representation, move the value to a GPR, do an AND
operation, move the 32-bit value with a splat, and then convert it back to
DFmode format.

With this patch, it now generates:

xscvdpspn 0,1
mtvsrwz 32,2
xxland 32,0,32
xxspltw 1,32,1
xscvspdpn 1,1
blr

I.e. convert to SFmode representation, move the mask to the vector 
register, do
the operation using XXLAND.  Splat the value to get the value in the correct
location, and then convert back to DFmode.

I have built GCC with the patches in this patch set applied on both little 
and
big endian PowerPC systems and there were no regressions.  Can I apply this
patch to GCC 15?

2025-02-10  Michael Meissner  

gcc/

PR target/117487
* config/rs6000/vsx.md (SFmode logical peephoole): Update comments 
in
the original code that supports power8.  Add a new define_peephole2 
to
do the optimization on power9/power10.

Diff:
---
 gcc/config/rs6000/vsx.md| 142 +++-
 gcc/testsuite/gcc.target/powerpc/pr108958.c |   0
 2 files changed, 137 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index d84a2a357a31..f47c4e2f7766 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -6282,7 +6282,7 @@
(SFBOOL_MFVSR_A  3) ;; move to gpr src
(SFBOOL_BOOL_D   4) ;; and/ior/xor dest
(SFBOOL_BOOL_A1  5) ;; and/ior/xor arg1
-   (SFBOOL_BOOL_A2  6) ;; and/ior/xor arg1
+   (SFBOOL_BOOL_A2  6) ;; and/ior/xor arg2
(SFBOOL_SHL_D7) ;; shift left dest
(SFBOOL_SHL_A8) ;; shift left arg
(SFBOOL_MTVSR_D  9) ;; move to vecter dest
@@ -6322,18 +6322,18 @@
 ;; GPR, and instead move the integer mask value to the vector register after a
 ;; shift and do the VSX logical operation.
 
-;; The insns for dealing with SFmode in GPR registers looks like:
+;; The insns for dealing with SFmode in GPR registers looks like on power8:
 ;; (set (reg:V4SF reg2) (unspec:V4SF [(reg:SF reg1)] UNSPEC_VSX_CVDPSPN))
 ;;
-;; (set (reg:DI reg3) (unspec:DI [(reg:V4SF reg2)] UNSPEC_P8V_RELOAD_FROM_VSX))
+;; (set (reg:DI reg3) (zero_extend:DI (reg:SI reg2)))
 ;;
-;; (set (reg:DI reg4) (and:DI (reg:DI reg3) (reg:DI reg3)))
+;; (set (reg:DI reg4) (and:SI (reg:SI reg3) (reg:SI mask)))
 ;;
 ;; (set (reg:DI reg5) (ashift:DI (reg:DI reg4) (const_int 32)))
 ;;
 ;; (set (reg:SF reg6) (unspec:SF [(reg:DI reg5)] UNSPEC_P8V_MTVSRD))
 ;;
-;; (set (reg:SF reg6) (unspec:SF [(reg:SF reg6)] UNSPEC_VSX_CVSPDPN))
+;; (set (reg:SF reg7) (unspec:SF [(reg:SF reg6)] UNSPEC_VSX_CVSPDPN))
 
 (define_peephole2
   [(match_scratch:DI SFBOOL_TMP_GPR "r")
@@ -6414,6 +6414,138 @@
   operands[SFBOOL_MTVSR_D_V4SF] = gen_rtx_REG (V4SFmode, regno_mtvsr_d);
 })
 
+;; Constants for 

[gcc(refs/users/meissner/heads/work194-bugs)] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:dffbbf1142060c745764997275b402f8b5499499

commit dffbbf1142060c745764997275b402f8b5499499
Author: Michael Meissner 
Date:   Fri Feb 21 19:58:32 2025 -0500

Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

This is version 3 of the patch.

In version 3, I made the following changes:

1:  The new argument to rs6000_reverse_condition that says whether we 
should
allow ordered floating point compares to be reversed is now an
enumeration instead of a boolean.

2:  I tried to make the code in rs6000_reverse_condition clearer.

3:  I added checks in invert_fpmask_comparison_operator to prevent 
ordered
floating point compares from being reversed unless -ffast-math.

4:  I split the test cases into 4 separate tests (ordered vs. unordered
compare and -O2 vs. -Ofast).

In bug PR target/118541 on power9, power10, and power11 systems, for the
function:

extern double __ieee754_acos (double);

double
__acospi (double x)
{
  double ret = __ieee754_acos (x) / 3.14;
  return __builtin_isgreater (ret, 1.0) ? 1.0 : ret;
}

GCC currently generates the following code:

Power9  Power10 and Power11
==  ===
bl __ieee754_acos   bl __ieee754_acos@notoc
nop plfd 0,.LC0@pcrel
addis 9,2,.LC2@toc@ha   xxspltidp 12,1065353216
addi 1,1,32 addi 1,1,32
lfd 0,.LC2@toc@l(9) ld 0,16(1)
addis 9,2,.LC0@toc@ha   fdiv 0,1,0
ld 0,16(1)  mtlr 0
lfd 12,.LC0@toc@l(9)xscmpgtdp 1,0,12
fdiv 0,1,0  xxsel 1,0,12,1
mtlr 0  blr
xscmpgtdp 1,0,12
xxsel 1,0,12,1
blr

This is because ifcvt.c optimizes the conditional floating point move to 
use the
XSCMPGTDP instruction.

However, the XSCMPGTDP instruction will generate an interrupt if one of the
arguments is a signalling NaN and signalling NaNs can generate an interrupt.
The IEEE comparison functions (isgreater, etc.) require that the comparison 
not
raise an interrupt.

The following patch changes the PowerPC back end so that ifcvt.c will not 
change
the if/then test and move into a conditional move if the comparison is one 
of
the comparisons that do not raise an error with signalling NaNs and -Ofast 
is
not used.  If a normal comparison is used or -Ofast is used, GCC will 
continue
to generate XSCMPGTDP and XXSEL.

For the following code:

double
ordered_compare (double a, double b, double c, double d)
{
  return __builtin_isgreater (a, b) ? c : d;
}

/* Verify normal > does generate xscmpgtdp.  */

double
normal_compare (double a, double b, double c, double d)
{
  return a > b ? c : d;
}

with the following patch, GCC generates the following for power9, power10, 
and
power11:

ordered_compare:
fcmpu 0,1,2
fmr 1,4
bnglr 0
fmr 1,3
blr

normal_compare:
xscmpgtdp 1,1,2
xxsel 1,4,3,1
blr

I have built bootstrap compilers on big endian power9 systems and little 
endian
power9/power10 systems and there were no regressions.  Can I check this 
patch
into the GCC trunk, and after a waiting period, can I check this into the 
active
older branches?

2025-02-21  Michael Meissner  

gcc/

PR target/118541
* config/rs6000/predicates.md (invert_fpmask_comparison_operator): 
Do
not allow UNLT and UNLE unless -ffast-math.
* config/rs6000/rs6000-protos.h (enum reverse_cond_t): New 
enumeration.
(rs6000_reverse_condition): Add argument.
* config/rs6000/rs6000.cc (rs6000_reverse_condition): Do not allow
ordered comparisons to be reversed for floating point conditional 
moves,
but allow ordered comparisons to be reversed on jumps.
(rs6000_emit_sCOND): Adjust rs6000_reverse_condition call.
* config/rs6000/rs6000.h (REVERSE_CONDITION): Likewise.
* config/rs6000/rs6000.md (reverse_branch_comparison): Name insn.
Adjust rs6000_reverse_condition calls.

gcc/testsuite/

PR target/118541
* gcc.target/powerpc/pr118541-1.c: New test.
 

[gcc(refs/users/meissner/heads/work194-bugs)] PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:251cf23e0e1b6a604324194aa0ac63c2585066c6

commit 251cf23e0e1b6a604324194aa0ac63c2585066c6
Author: Michael Meissner 
Date:   Fri Feb 21 19:52:05 2025 -0500

PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

We had optimizations for splat of a vector extract for the other vector
types, but we missed having one for V2DI and V2DF.  This patch adds a
combiner insn to do this optimization.

In looking at the source, we had similar optimizations for V4SI and V4SF
extract and splats, but we missed doing V2DI/V2DF.

Without the patch for the code:

vector long long splat_dup_l_0 (vector long long v)
{
  return __builtin_vec_splats (__builtin_vec_extract (v, 0));
}

the compiler generates (on a little endian power9):

splat_dup_l_0:
mfvsrld 9,34
mtvsrdd 34,9,9
blr

Now it generates:

splat_dup_l_0:
xxpermdi 34,34,34,3
blr

2025-02-21  Michael Meissner  

gcc/

PR target/99293
* config/rs6000/vsx.md (vsx_splat_extract_): New insn.

gcc/testsuite/

PR target/99293
* gcc.target/powerpc/builtins-1.c: Adjust insn count.
* gcc.target/powerpc/pr99293.c: New test.

Diff:
---
 gcc/config/rs6000/vsx.md| 142 +++-
 gcc/testsuite/gcc.target/powerpc/pr108958.c |   0
 2 files changed, 137 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index d84a2a357a31..f47c4e2f7766 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -6282,7 +6282,7 @@
(SFBOOL_MFVSR_A  3) ;; move to gpr src
(SFBOOL_BOOL_D   4) ;; and/ior/xor dest
(SFBOOL_BOOL_A1  5) ;; and/ior/xor arg1
-   (SFBOOL_BOOL_A2  6) ;; and/ior/xor arg1
+   (SFBOOL_BOOL_A2  6) ;; and/ior/xor arg2
(SFBOOL_SHL_D7) ;; shift left dest
(SFBOOL_SHL_A8) ;; shift left arg
(SFBOOL_MTVSR_D  9) ;; move to vecter dest
@@ -6322,18 +6322,18 @@
 ;; GPR, and instead move the integer mask value to the vector register after a
 ;; shift and do the VSX logical operation.
 
-;; The insns for dealing with SFmode in GPR registers looks like:
+;; The insns for dealing with SFmode in GPR registers looks like on power8:
 ;; (set (reg:V4SF reg2) (unspec:V4SF [(reg:SF reg1)] UNSPEC_VSX_CVDPSPN))
 ;;
-;; (set (reg:DI reg3) (unspec:DI [(reg:V4SF reg2)] UNSPEC_P8V_RELOAD_FROM_VSX))
+;; (set (reg:DI reg3) (zero_extend:DI (reg:SI reg2)))
 ;;
-;; (set (reg:DI reg4) (and:DI (reg:DI reg3) (reg:DI reg3)))
+;; (set (reg:DI reg4) (and:SI (reg:SI reg3) (reg:SI mask)))
 ;;
 ;; (set (reg:DI reg5) (ashift:DI (reg:DI reg4) (const_int 32)))
 ;;
 ;; (set (reg:SF reg6) (unspec:SF [(reg:DI reg5)] UNSPEC_P8V_MTVSRD))
 ;;
-;; (set (reg:SF reg6) (unspec:SF [(reg:SF reg6)] UNSPEC_VSX_CVSPDPN))
+;; (set (reg:SF reg7) (unspec:SF [(reg:SF reg6)] UNSPEC_VSX_CVSPDPN))
 
 (define_peephole2
   [(match_scratch:DI SFBOOL_TMP_GPR "r")
@@ -6414,6 +6414,138 @@
   operands[SFBOOL_MTVSR_D_V4SF] = gen_rtx_REG (V4SFmode, regno_mtvsr_d);
 })
 
+;; Constants for SFbool optimization on power9/power10
+(define_constants
+  [(SFBOOL2_TMP_VSX_V4SI0) ;; vector temporary (V4SI)
+   (SFBOOL2_TMP_GPR_SI  1) ;; GPR temporary (SI)
+   (SFBOOL2_MFVSR_D 2) ;; move to gpr dest (DI)
+   (SFBOOL2_MFVSR_A 3) ;; move to gpr src (SI)
+   (SFBOOL2_BOOL_D  4) ;; and/ior/xor dest (SI)
+   (SFBOOL2_BOOL_A1 5) ;; and/ior/xor arg1 (SI)
+   (SFBOOL2_BOOL_A2 6) ;; and/ior/xor arg2 (SI)
+   (SFBOOL2_SPLAT_D 7) ;; splat dest (V4SI)
+   (SFBOOL2_MTVSR_D 8) ;; move/splat to VSX dest.
+   (SFBOOL2_MTVSR_A 9) ;; move/splat to VSX arg.
+   (SFBOOL2_MFVSR_A_V4SI   10) ;; MFVSR_A as V4SI
+   (SFBOOL2_MTVSR_D_V4SI   11) ;; MTVSR_D as V4SI
+   (SFBOOL2_XXSPLTW12)])   ;; 1 or 3 for XXSPLTW
+
+;; On power9/power10, the code is different because we have a splat 32-bit
+;; operation that does a direct move to the FPR/vector registers (MTVSRWS).
+;;
+;; The insns for dealing with SFmode in GPR registers looks like on
+;; power9/power10:
+;;
+;; (set (reg:V4SF reg2) (unspec:V4SF [(reg:SF reg1)] UNSPEC_VSX_CVDPSPN))
+;;
+;; (set (reg:DI reg3) (zero_extend:DI (reg:SI reg2)))
+;;
+;; (set (reg:SI reg4) (and:SI (reg:SI reg3) (reg:SI mask)))
+;;
+;; (set (reg:V4SI reg5) (vec_duplicate:V4SI (reg:SI reg4)))
+;;
+;; (set (reg:SF 

[gcc(refs/users/meissner/heads/work194-dmf)] RFC2686-Add paddis support.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:7ef4852265df84ad884ab9ba1e4c235b22a227fe

commit 7ef4852265df84ad884ab9ba1e4c235b22a227fe
Author: Michael Meissner 
Date:   Fri Feb 21 19:44:37 2025 -0500

RFC2686-Add paddis support.

2025-02-21  Michael Meissner  

gcc/

* config/rs6000/constraints.md (eU): New constraint.
(eV): Likewise.
* config/rs6000/predicates.md (paddis_operand): New predicate.
(paddis_paddi_operand): Likewise.
(add_operand): Add paddis support.
* config/rs6000/rs6000.cc (num_insns_constant_gpr): Add paddis 
support.
(num_insns_constant_multi): Likewise.
(print_operand): Add %B for paddis support.
* config/rs6000/rs6000.h (TARGET_PADDIS): New macro.
(SIGNED_INTEGER_32BIT_P): Likewise.
* config/rs6000/rs6000.md (isa attribute): Add paddis support.
(enabled attribute); Likewise.
(add3): Likewise.
(adddi3 splitter): New splitter for paddis.
(movdi_internal64): Add paddis support.
(movdi splitter): New splitter for paddis.

gcc/testsuite/

* gcc.target/powerpc/prefixed-addis.c: New test.

Diff:
---
 gcc/config/rs6000/constraints.md  | 10 +++
 gcc/config/rs6000/predicates.md   | 52 +++-
 gcc/config/rs6000/rs6000.cc   | 25 ++
 gcc/config/rs6000/rs6000.h|  4 +
 gcc/config/rs6000/rs6000.md   | 96 ---
 gcc/testsuite/gcc.target/powerpc/prefixed-addis.c | 24 ++
 6 files changed, 197 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 3da9ed086810..5440becb6e6c 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -222,6 +222,16 @@
   "An IEEE 128-bit constant that can be loaded into VSX registers."
   (match_operand 0 "easy_vector_constant_ieee128"))
 
+(define_constraint "eU"
+  "@internal integer constant that can be loaded with paddis"
+  (and (match_code "const_int")
+   (match_operand 0 "paddis_operand")))
+
+(define_constraint "eV"
+  "@internal integer constant that can be loaded with paddis + paddi"
+  (and (match_code "const_int")
+   (match_operand 0 "paddis_paddi_operand")))
+
 ;; Floating-point constraints.  These two are defined so that insn
 ;; length attributes can be calculated exactly.
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index c95b4336f062..c206860e4927 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -369,6 +369,53 @@
   return SIGNED_INTEGER_34BIT_P (INTVAL (op));
 })
 
+;; Return 1 if op is a 64-bit constant that uses the paddis instruction
+(define_predicate "paddis_operand"
+  (match_code "const_int")
+{
+  if (!TARGET_PADDIS && TARGET_POWERPC64)
+return 0;
+
+  /* If addi, addis, or paddi can handle the number, don't return true.  */
+  HOST_WIDE_INT value = INTVAL (op);
+  if (SIGNED_INTEGER_34BIT_P (value))
+return false;
+
+  /* If the number is too large for padds, return false.  */
+  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
+return false;
+
+  /* If the bottom 32-bits are non-zero, paddis can't handle it.  */
+  if ((value & HOST_WIDE_INT_C(0x)) != 0)
+return false;
+
+  return true;
+})
+
+;; Return 1 if op is a 64-bit constant that needs the paddis instruction and an
+;; addi/addis/paddi instruction combination.
+(define_predicate "paddis_paddi_operand"
+  (match_code "const_int")
+{
+  if (!TARGET_PADDIS && TARGET_POWERPC64)
+return 0;
+
+  /* If addi, addis, or paddi can handle the number, don't return true.  */
+  HOST_WIDE_INT value = INTVAL (op);
+  if (SIGNED_INTEGER_34BIT_P (value))
+return false;
+
+  /* If the number is too large for padds, return false.  */
+  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
+return false;
+
+  /* If the bottom 32-bits are zero, we can use paddis alone to handle it.  */
+  if ((value & HOST_WIDE_INT_C(0x)) == 0)
+return false;
+
+  return true;
+})
+
 ;; Return 1 if op is a register that is not special.
 ;; Disallow (SUBREG:SF (REG:SI)) and (SUBREG:SI (REG:SF)) on VSX systems where
 ;; you need to be careful in moving a SFmode to SImode and vice versa due to
@@ -1113,7 +1160,10 @@
   (if_then_else (match_code "const_int")
 (match_test "satisfies_constraint_I (op)
 || satisfies_constraint_L (op)
-|| satisfies_constraint_eI (op)")
+|| satisfies_constraint_eI (op)
+|| satisfies_constraint_eU (op)
+|| satisfies_constraint_eV (op)")
+
 (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 80c59cfc4bc3..7595998163b3 100644
--- a/gcc/con

[gcc(refs/users/meissner/heads/work194-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:35cc68768badc8babe1b8065b1534e9350c92e00

commit 35cc68768badc8babe1b8065b1534e9350c92e00
Author: Michael Meissner 
Date:   Fri Feb 21 20:03:27 2025 -0500

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

The multibuff.c benchmark attached to the PR target/117251 compiled for 
Power10
PowerPC that implement SHA3 has a slowdown in the current trunk and GCC 14
compared to GCC 11 - GCC 13, due to excessive amounts of spilling.

The main function for the multibuf.c file has 3,747 lines, all of which are
using vector unsigned long long.  There are 696 vector rotates (all rotates 
are
constant), 1,824 vector xor's and 600 vector andc's.

In looking at it, the main thing that steps out is the reason for either
spilling or moving variables is the support in fusion.md (generated by
genfusion.pl) that tries to fuse the vec_andc feeding into vec_xor, and 
other
vec_xor's feeding into vec_xor.

On the powerpc for power10, there is a special fusion mode that happens if 
the
machine has a VANDC or VXOR instruction that is adjacent to a VXOR 
instruction
and the VANDC/VXOR feeds into the 2nd VXOR instruction.

While the Power10 has 64 vector registers (which uses the XXL prefix to do
logical operations), the fusion only works with the older Altivec 
instruction
set (which uses the V prefix).  The Altivec instruction only has 32 vector
registers (which are overlaid over the VSX vector registers 32-63).

By having the combiner patterns fuse_vandc_vxor and fuse_vxor_vxor to do 
this
fusion, it means that the register allocator has more register pressure for 
the
traditional Altivec registers instead of the VSX registers.

In addition, since there are vector rotates, these rotates only work on the
traditional Altivec registers, which adds to the Altivec register pressure.

Finally in addition to doing the explicit xor, andc, and rotates using the
Altivec registers, we have to also load vector constants for the rotate 
amount
and these registers also are allocated as Altivec registers.

Current trunk and GCC 12-14 have more vector spills than GCC 11, but GCC 11 
has
many more vector moves that the later compilers.  Thus even though it has 
way
less spills, the vector moves are why GCC 11 have the slowest results.

There is an instruction that was added in power10 (XXEVAL) that does provide
fusion between VSX vectors that includes ANDC->XOR and XOR->XOR fusion.

The latency of XXEVAL is slightly more than the fused VANDC/VXOR or 
VXOR/VXOR,
so I have written the patch to prefer doing the Altivec instructions if they
don't need a temporary register.

Here are the results for adding support for XXEVAL for the multibuff.c
benchmark attached to the PR.  Note that we essentially recover the speed 
with
this patch that were lost with GCC 14 and the current trunk:

  XXEVALTrunk   GCC14   GCC13   GCC12
GCC11
  ---   -   -   -
-
Benchmark time in seconds   5.53 6.156.265.575.61 
9.56

Fuse VANDC -> VXOR   209 600  600 600 600  
600
Fuse VXOR -> VXOR  0 240  240 120 120  
120
XXEVAL to fuse ANDC -> XOR   391   00   0   0   
 0
XXEVAL to fuse XOR -> XOR240   00   0   0   
 0

Spill vector to stack 78 364  364 172 184  
110
Load spilled vector from stack   431 962  962 713 723  
166
Vector moves  10 100  100  70  72
3,055

Vector rotate right  696 696  696 696 696  
696
XXLANDC or VANDC 209 600  600 600 600  
600
XXLXOR or VXOR   953   1,8241,824   1,824   1,824
1,825
XXEVAL   631   00   0   0   
 0

Load vector rotate constants  24  24   24  24  24   
24

Here are the results for adding support for XXEVAL for the singlebuff.c
benchmark attached to the PR.  Note that adding XXEVAL greatly speeds up 
this
particular benchmark:

  XXEVALTrunk   GCC14   GCC13   GCC12
GCC11
  ---   -   -   -
-
Benchmark time in seconds   4.46 5.405.405.355.36 
7.54

Fuse VANDC -> VXOR   210  600 600 600 600  
600
Fuse VXOR -> VXOR  0  240 240 120 120  
120
XXEVAL to fuse ANDC -> XOR   3900   0 

[gcc(refs/users/meissner/heads/work194-sha)] Add potential p-future XVRLD and XVRLDI instructions.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:846694223bd14fa2227399092216bc45cdf11391

commit 846694223bd14fa2227399092216bc45cdf11391
Author: Michael Meissner 
Date:   Fri Feb 21 20:04:07 2025 -0500

Add potential p-future XVRLD and XVRLDI instructions.

2025-02-21  Michael Meissner  

gcc/

* config/rs6000/altivec.md (altivec_vrl): Add support for a
possible XVRLD instruction in the future.
(altivec_vrl_immediate): New insns.
* config/rs6000/predicates.md (vector_shift_immediate): New 
predicate.
* config/rs6000/rs6000.h (TARGET_XVRLW): New macro.
* config/rs6000/rs6000.md (isa attribute): Add xvrlw.
(enabled attribute): Add support for xvrlw.

gcc/testsuite/

* gcc.target/powerpc/vector-rotate-left.c: New test.
* lib/target-supports.exp 
(check_effective_target_powerpc_future_ok):
Add support to test -mcpu=future.

Diff:
---
 gcc/config/rs6000/altivec.md   | 35 +++---
 gcc/config/rs6000/predicates.md| 26 
 gcc/config/rs6000/rs6000.h |  3 ++
 gcc/config/rs6000/rs6000.md|  6 +++-
 .../gcc.target/powerpc/vector-rotate-left.c| 34 +
 gcc/testsuite/lib/target-supports.exp  | 12 
 6 files changed, 111 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 7edc288a6565..013960438b04 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1982,12 +1982,39 @@
 }
   [(set_attr "type" "vecperm")])
 
+;; -mcpu=future adds a vector rotate left word variant.  There is no vector
+;; byte/half-word/double-word/quad-word rotate left.  This insn occurs before
+;; altivec_vrl and will match for -mcpu=future, while other cpus will
+;; match the generic insn.
+;; However for testing, allow other xvrl variants.  In particular, XVRLD for
+;; the sha3 tests for multibuf/singlebuf.
 (define_insn "altivec_vrl"
-  [(set (match_operand:VI2 0 "register_operand" "=v")
-(rotate:VI2 (match_operand:VI2 1 "register_operand" "v")
-   (match_operand:VI2 2 "register_operand" "v")))]
+  [(set (match_operand:VI2 0 "register_operand" "=v,wa")
+(rotate:VI2 (match_operand:VI2 1 "register_operand" "v,wa")
+   (match_operand:VI2 2 "register_operand" "v,wa")))]
   ""
-  "vrl %0,%1,%2"
+  "@
+   vrl %0,%1,%2
+   xvrl %x0,%x1,%x2"
+  [(set_attr "type" "vecsimple")
+   (set_attr "isa" "*,xvrlw")])
+
+(define_insn "*altivec_vrl_immediate"
+  [(set (match_operand:VI2 0 "register_operand" "=wa,wa,wa,wa")
+   (rotate:VI2 (match_operand:VI2 1 "register_operand" "wa,wa,wa,wa")
+   (match_operand:VI2 2 "vector_shift_immediate" 
"j,wM,wE,wS")))]
+  "TARGET_XVRLW && "
+{
+  rtx op2 = operands[2];
+  int value = 256;
+  int num_insns = -1;
+
+  if (!xxspltib_constant_p (op2, mode, &num_insns, &value))
+gcc_unreachable ();
+
+  operands[3] = GEN_INT (value & 0xff);
+  return "xvrli %x0,%x1,%3";
+}
   [(set_attr "type" "vecsimple")])
 
 (define_insn "altivec_vrlq"
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 6485ee3eeecc..276812573977 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -728,6 +728,32 @@
   return num_insns == 1;
 })
 
+;; Return 1 if the operand is a CONST_VECTOR whose elements are all the
+;; same and the elements can be an immediate shift or rotate factor
+(define_predicate "vector_shift_immediate"
+  (match_code "const_vector,vec_duplicate,const_int")
+{
+  int value = 256;
+  int num_insns = -1;
+
+  if (zero_constant (op, mode) || all_ones_constant (op, mode))
+return true;
+
+  if (!xxspltib_constant_p (op, mode, &num_insns, &value))
+return false;
+
+  switch (mode)
+{
+case V16QImode: return IN_RANGE (value, 0, 7);
+case V8HImode:  return IN_RANGE (value, 0, 15);
+case V4SImode:  return IN_RANGE (value, 0, 31);
+case V2DImode:  return IN_RANGE (value, 0, 63);
+default:break;
+}
+
+  return false;
+})
+  
 ;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
 ;; vector register without using memory.
 (define_predicate "easy_vector_constant"
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index ec08c96d0f67..00f6ff2be636 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -575,6 +575,9 @@ extern int rs6000_vector_align[];
below.  */
 #define RS6000_FN_TARGET_INFO_HTM 1
 
+/* Whether we have XVRLW support.  */
+#define TARGET_XVRLW   TARGET_FUTURE
+
 /* Whether the various reciprocal divide/square root estimate instructions
exist, and whether we should automatically generate code for the instruction
by default.  */
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 3b

[gcc(refs/users/meissner/heads/work194-bugs)] PR target/108958 -- use mtvsrdd to zero extend GPR DImode to VSX TImode

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:8c8610ad034100ba58063caaee4b77d1e24fd533

commit 8c8610ad034100ba58063caaee4b77d1e24fd533
Author: Michael Meissner 
Date:   Fri Feb 21 19:56:41 2025 -0500

PR target/108958 -- use mtvsrdd to zero extend GPR DImode to VSX TImode

Previously GCC would zero externd a DImode GPR value to TImode by first zero
extending the DImode value into a GPR TImode value, and then do a MTVSRDD to
move this value to a VSX register.

This patch does the move directly, since if the middle argument to MTVSRDD 
is 0,
it does the zero extend.

If the DImode value is already in a vector register, it does a XXSPLTIB and
XXPERMDI to get the value into the bottom 64-bits of the register.

I have built GCC with the patches in this patch set applied on both little 
and
big endian PowerPC systems and there were no regressions.  Can I apply this
patch to GCC 15?

2025-02-10  Michael Meissner  

gcc/

PR target/108598
* gcc/config/rs6000/rs6000.md (zero_extendditi2): New insn.

gcc/testsuite/

PR target/108598
* gcc.target/powerpc/pr108958.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000.md | 46 +
 gcc/testsuite/gcc.target/powerpc/pr108958.c | 27 +
 2 files changed, 73 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 4c2bc81caf56..65da0c653304 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -1026,6 +1026,52 @@
(set_attr "dot" "yes")
(set_attr "length" "4,8")])
 
+(define_insn_and_split "zero_extendditi2"
+  [(set (match_operand:TI 0 "gpc_reg_operand" "=r,wa,&wa")
+   (zero_extend:TI
+(match_operand:DI 1 "gpc_reg_operand" "rwa,r,wa")))]
+  "TARGET_P9_VECTOR && TARGET_POWERPC64"
+  "@
+  #
+  mtvsrdd %x0,0,%1
+  #"
+  "&& reload_completed
+   && (int_reg_operand (operands[0], TImode)
+   || vsx_register_operand (operands[1], DImode))"
+  [(set (match_dup 2)
+   (match_dup 3))
+   (set (match_dup 4)
+   (match_dup 5))]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  int r = reg_or_subregno (op0);
+
+  if (int_reg_operand (op0, TImode))
+{
+  int lo = BYTES_BIG_ENDIAN ? 1 : 0;
+  int hi = 1 - lo;
+
+  operands[2] = gen_rtx_REG (DImode, r + lo);
+  operands[3] = op1;
+  operands[4] = gen_rtx_REG (DImode, r + hi);
+  operands[5] = const0_rtx;
+}
+  else
+{
+  rtx op0_di = gen_rtx_REG (DImode, r);
+  rtx op0_v2di = gen_rtx_REG (V2DImode, r);
+  rtx lo = WORDS_BIG_ENDIAN ? op1 : op0_di;
+  rtx hi = WORDS_BIG_ENDIAN ? op0_di : op1;
+
+  operands[2] = op0_v2di;
+  operands[3] = CONST0_RTX (V2DImode);
+  operands[4] = op0_v2di;
+  operands[5] = gen_rtx_VEC_CONCAT (V2DImode, hi, lo);
+}
+}
+  [(set_attr "type" "*,mtvsr,vecperm")
+   (set_attr "length" "8,*,8")])
 
 (define_insn "extendqi2"
   [(set (match_operand:EXTQI 0 "gpc_reg_operand" "=r,?*v")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr108958.c 
b/gcc/testsuite/gcc.target/powerpc/pr108958.c
index e69de29bb2d1..03eb58d069e7 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr108958.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr108958.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int128 } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
+
+/* PR target/108958, use mtvsrdd to zero extend gpr to vsx register.  */
+
+void
+gpr_to_vsx (unsigned long long x, __uint128_t *p)
+{
+  /* mtvsrdd vsx,0,gpr.  */
+  __uint128_t y = x;
+  __asm__ (" # %x0" : "+wa" (y));
+  *p = y;
+}
+
+void
+gpr_to_gpr (unsigned long long x, __uint128_t *p)
+{
+  /* mr and li.  */
+  __uint128_t y = x;
+  __asm__ (" # %0" : "+r" (y));
+  *p = y;
+}
+
+/* { dg-final { scan-assembler-times {\mli\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrdd .*,0,.*\M} 1 } } */


[gcc(refs/users/meissner/heads/work194-sha)] Update ChangeLog.*

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:4529023d70650df8282c682590b6b70a606cd66b

commit 4529023d70650df8282c682590b6b70a606cd66b
Author: Michael Meissner 
Date:   Fri Feb 21 20:05:54 2025 -0500

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.sha | 168 ++
 1 file changed, 168 insertions(+)

diff --git a/gcc/ChangeLog.sha b/gcc/ChangeLog.sha
index 2b5f83e3304e..31c3b660caa5 100644
--- a/gcc/ChangeLog.sha
+++ b/gcc/ChangeLog.sha
@@ -1,5 +1,173 @@
+ Branch work194-sha, patch #401 
+
+Add potential p-future XVRLD and XVRLDI instructions.
+
+2025-02-10  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/altivec.md (altivec_vrl): Add support for a
+   possible XVRLD instruction in the future.
+   (altivec_vrl_immediate): New insns.
+   * config/rs6000/predicates.md (vector_shift_immediate): New predicate.
+   * config/rs6000/rs6000.h (TARGET_XVRLW): New macro.
+   * config/rs6000/rs6000.md (isa attribute): Add xvrlw.
+   (enabled attribute): Add support for xvrlw.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/vector-rotate-left.c: New test.
+   * lib/target-supports.exp (check_effective_target_powerpc_future_ok):
+   Add support to test -mcpu=future.
+
+ Branch work194-sha, patch #400 
+
+PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations
+
+The multibuff.c benchmark attached to the PR target/117251 compiled for Power10
+PowerPC that implement SHA3 has a slowdown in the current trunk and GCC 14
+compared to GCC 11 - GCC 13, due to excessive amounts of spilling.
+
+The main function for the multibuf.c file has 3,747 lines, all of which are
+using vector unsigned long long.  There are 696 vector rotates (all rotates are
+constant), 1,824 vector xor's and 600 vector andc's.
+
+In looking at it, the main thing that steps out is the reason for either
+spilling or moving variables is the support in fusion.md (generated by
+genfusion.pl) that tries to fuse the vec_andc feeding into vec_xor, and other
+vec_xor's feeding into vec_xor.
+
+On the powerpc for power10, there is a special fusion mode that happens if the
+machine has a VANDC or VXOR instruction that is adjacent to a VXOR instruction
+and the VANDC/VXOR feeds into the 2nd VXOR instruction.
+
+While the Power10 has 64 vector registers (which uses the XXL prefix to do
+logical operations), the fusion only works with the older Altivec instruction
+set (which uses the V prefix).  The Altivec instruction only has 32 vector
+registers (which are overlaid over the VSX vector registers 32-63).
+
+By having the combiner patterns fuse_vandc_vxor and fuse_vxor_vxor to do this
+fusion, it means that the register allocator has more register pressure for the
+traditional Altivec registers instead of the VSX registers.
+
+In addition, since there are vector rotates, these rotates only work on the
+traditional Altivec registers, which adds to the Altivec register pressure.
+
+Finally in addition to doing the explicit xor, andc, and rotates using the
+Altivec registers, we have to also load vector constants for the rotate amount
+and these registers also are allocated as Altivec registers.
+
+Current trunk and GCC 12-14 have more vector spills than GCC 11, but GCC 11 has
+many more vector moves that the later compilers.  Thus even though it has way
+less spills, the vector moves are why GCC 11 have the slowest results.
+
+There is an instruction that was added in power10 (XXEVAL) that does provide
+fusion between VSX vectors that includes ANDC->XOR and XOR->XOR fusion.
+
+The latency of XXEVAL is slightly more than the fused VANDC/VXOR or VXOR/VXOR,
+so I have written the patch to prefer doing the Altivec instructions if they
+don't need a temporary register.
+
+Here are the results for adding support for XXEVAL for the multibuff.c
+benchmark attached to the PR.  Note that we essentially recover the speed with
+this patch that were lost with GCC 14 and the current trunk:
+
+  XXEVALTrunk   GCC14   GCC13   GCC12GCC11
+  ---   -   -   --
+Benchmark time in seconds   5.53 6.156.265.575.61 9.56
+
+Fuse VANDC -> VXOR   209 600  600 600 600  600
+Fuse VXOR -> VXOR  0 240  240 120 120  120
+XXEVAL to fuse ANDC -> XOR   391   00   0   00
+XXEVAL to fuse XOR -> XOR240   00   0   00
+
+Spill vector to stack 78 364  364 172 184  110
+Load spilled vector from stack   431 962  962 713 723  166
+Vector moves  10 100  100  70  723,055
+
+Vector rotate right  696 696  696 696 696  696
+XXLANDC or VANDC 209 600  600 600 

[gcc(refs/users/meissner/heads/work194-vpair)] Vector pair support.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a2fe8462aa32c4f8e683141c2c111ab25b183ea0

commit a2fe8462aa32c4f8e683141c2c111ab25b183ea0
Author: Michael Meissner 
Date:   Fri Feb 21 20:07:22 2025 -0500

Vector pair support.

This patch adds a new include file (vector-pair.h) that adds support so that
users writing high performance libraries can change their code to allow the
generation of the vector pair load and store instructions on power10.

The intention is that if the library authors need to write special loops 
that
go over arrays that they could modify their code to use the functions 
provided
to change loops that can take advantage of the higher bandwidth for load 
vector
pair and store instructions.

This particular patch just adds a new include file (vector-pair.h) that
provides a bunch of functions that on a power10 system would use the vector
pair load operation, 2 floating point operations, and a vector pair store.  
It
does not add any new types, modes, or built-in function.

I have additional patches that can add built-in functions that the 
functions in
vector-pair.h could utilize so that the compiler can optimize and combine
operations.  I may submit those patches in the future, but I would like to
provide this patch to allow the library writer to optimize their code.

I've measured the performance of these new functions on a power10.  For 
default
unrolling, the percentage of change for the 3 methods over the normal vector
loop method:

116%Vector-pair.h function, default unroll
 93%Vector pair split built-in & 2 vector stores, default unroll
 86%Vector pair split & combine built-ins, default unroll

Using explicit 2 way unrolling the numbers are:

114%Vector-pair.h function, unroll 2
106%Vector pair split built-in & 2 vector stores, unroll 2
 98%Vector pair split & combine built-ins, unroll 2

These new functions provided in vector-pair.h use the vector pair load/store
instructions, and don't generate extra vector moves.  Using the existing
vector pair disassemble and assemble built-ins generate extra vector moves
which can hinder performance.

If I compile the loop code for power9, there is a minor speed up for default
unrolling and more of an improvement using the framework provided in the
vector-pair.h for explicit unrolling by 2:

101%Vector-pair.h function, default unroll for power9
107%Vector-pair.h function, unroll 2 for power9

Of course this is a synthetic benchmark run on a quiet power10 system.  
Results
would vary for real code on real systems.  However, I feel adding these
functions can allow the writers of high performance libraries to better
optimize their code.

As an example, if the library wants to code a simple fused multiply-add 
loop,
they might write the code as follows:

#include 
#include 
#include 

void
fma_vector (double * __restrict__ r,
const double * __restrict__ a,
const double * __restrict__ b,
size_t n)
{
  vector double * __restrict__ vr = (vector double * __restrict__)r;
  const vector double * __restrict__ va = (const vector double * 
__restrict__)a;
  const vector double * __restrict__ vb = (const vector double * 
__restrict__)b;
  size_t num_elements = sizeof (vector double) / sizeof (double);
  size_t nv = n / num_elements;
  size_t i;

  for (i = 0; i < nv; i++)
vr[i] = __builtin_vsx_xvmadddp (va[i], vb[i], vr[i]);

  for (i = nv * num_elements; i < n; i++)
r[i] = fma (a[i], b[i], r[i]);
}

The inner loop would look like:

.L3:
lxvx 0,3,9
lxvx 12,4,9
addi 10,9,16
addi 2,2,-2
lxvx 11,5,9
xvmaddadp 0,12,11
lxvx 12,4,10
lxvx 11,5,10
stxvx 0,3,9
lxvx 0,3,10
addi 9,9,32
xvmaddadp 0,12,11
stxvx 0,3,10
bdnz .L3

Now if you code the loop to use __builtin_vsx_disassemble_pair to do a 
vector
pair load, but then do 2 vector stores:

#include 
#include 
#include 

void
fma_mma_ld (double * __restrict__ r,
const double * __restrict__ a,
const double * __restrict__ b,
size_t n)
{
  __vector_pair * __restrict__ vp_r

[gcc(refs/users/meissner/heads/work194-bugs)] Update ChangeLog.*

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:54cfb2c2d0d2b715854167697379b86683402332

commit 54cfb2c2d0d2b715854167697379b86683402332
Author: Michael Meissner 
Date:   Fri Feb 21 20:01:37 2025 -0500

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.bugs | 293 +
 1 file changed, 293 insertions(+)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
index fcc1dd569fc0..0413e5bf560c 100644
--- a/gcc/ChangeLog.bugs
+++ b/gcc/ChangeLog.bugs
@@ -1,5 +1,298 @@
+ Branch work194-bugs, patch #210 
+
+Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.
+
+This is version 3 of the patch.
+
+In version 3, I made the following changes:
+
+1: The new argument to rs6000_reverse_condition that says whether we should
+   allow ordered floating point compares to be reversed is now an
+   enumeration instead of a boolean.
+
+2: I tried to make the code in rs6000_reverse_condition clearer.
+
+3: I added checks in invert_fpmask_comparison_operator to prevent ordered
+   floating point compares from being reversed unless -ffast-math.
+
+4: I split the test cases into 4 separate tests (ordered vs. unordered
+   compare and -O2 vs. -Ofast).
+
+In bug PR target/118541 on power9, power10, and power11 systems, for the
+function:
+
+extern double __ieee754_acos (double);
+
+double
+__acospi (double x)
+{
+  double ret = __ieee754_acos (x) / 3.14;
+  return __builtin_isgreater (ret, 1.0) ? 1.0 : ret;
+}
+
+GCC currently generates the following code:
+
+Power9  Power10 and Power11
+==  ===
+bl __ieee754_acos   bl __ieee754_acos@notoc
+nop plfd 0,.LC0@pcrel
+addis 9,2,.LC2@toc@ha   xxspltidp 12,1065353216
+addi 1,1,32 addi 1,1,32
+lfd 0,.LC2@toc@l(9) ld 0,16(1)
+addis 9,2,.LC0@toc@ha   fdiv 0,1,0
+ld 0,16(1)  mtlr 0
+lfd 12,.LC0@toc@l(9)xscmpgtdp 1,0,12
+fdiv 0,1,0  xxsel 1,0,12,1
+mtlr 0  blr
+xscmpgtdp 1,0,12
+xxsel 1,0,12,1
+blr
+
+This is because ifcvt.c optimizes the conditional floating point move to use 
the
+XSCMPGTDP instruction.
+
+However, the XSCMPGTDP instruction will generate an interrupt if one of the
+arguments is a signalling NaN and signalling NaNs can generate an interrupt.
+The IEEE comparison functions (isgreater, etc.) require that the comparison not
+raise an interrupt.
+
+The following patch changes the PowerPC back end so that ifcvt.c will not 
change
+the if/then test and move into a conditional move if the comparison is one of
+the comparisons that do not raise an error with signalling NaNs and -Ofast is
+not used.  If a normal comparison is used or -Ofast is used, GCC will continue
+to generate XSCMPGTDP and XXSEL.
+
+For the following code:
+
+double
+ordered_compare (double a, double b, double c, double d)
+{
+  return __builtin_isgreater (a, b) ? c : d;
+}
+
+/* Verify normal > does generate xscmpgtdp.  */
+
+double
+normal_compare (double a, double b, double c, double d)
+{
+  return a > b ? c : d;
+}
+
+with the following patch, GCC generates the following for power9, power10, and
+power11:
+
+ordered_compare:
+fcmpu 0,1,2
+fmr 1,4
+bnglr 0
+fmr 1,3
+blr
+
+normal_compare:
+xscmpgtdp 1,1,2
+xxsel 1,4,3,1
+blr
+
+I have built bootstrap compilers on big endian power9 systems and little endian
+power9/power10 systems and there were no regressions.  Can I check this patch
+into the GCC trunk, and after a waiting period, can I check this into the 
active
+older branches?
+
+2025-02-21  Michael Meissner  
+
+gcc/
+
+   PR target/118541
+   * config/rs6000/predicates.md (invert_fpmask_comparison_operator): Do
+   not allow UNLT and UNLE unless -ffast-math.
+   * config/rs6000/rs6000-protos.h (enum reverse_cond_t): New enumeration.
+   (rs6000_reverse_condition): Add argument.
+   * config/rs6000/rs6000.cc (rs6000_reverse_condition): Do not allow
+   ordered comparisons to be reversed for floating point conditional moves,
+   but allow ordered comparisons to be reversed on jumps.
+   (rs6000_emit_sCOND): Adjust rs6000_reverse_condition call.
+   * config/rs6000/rs6000.h (REVERSE_CONDITION): Likewise.
+   * config/rs6000/rs6000.md (reverse_branch_comparison): Name insn.
+   Adjust rs6000_reverse_condition calls.
+
+gcc/testsuite/
+
+   PR target/118541
+   * gcc.target/powerpc/pr118541-1.c: New test.
+   * gc

[gcc(refs/users/meissner/heads/work194-vpair)] Update ChangeLog.*

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:8a586a67787b015a0ae2c5c6684a96cc60d69063

commit 8a586a67787b015a0ae2c5c6684a96cc60d69063
Author: Michael Meissner 
Date:   Fri Feb 21 20:09:32 2025 -0500

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.vpair | 420 
 1 file changed, 420 insertions(+)

diff --git a/gcc/ChangeLog.vpair b/gcc/ChangeLog.vpair
index 7516c64dcdee..50ca915389c6 100644
--- a/gcc/ChangeLog.vpair
+++ b/gcc/ChangeLog.vpair
@@ -1,5 +1,425 @@
+ Branch work194-vpair, patch #300 
+
+Vector pair support.
+
+This patch adds a new include file (vector-pair.h) that adds support so that
+users writing high performance libraries can change their code to allow the
+generation of the vector pair load and store instructions on power10.
+
+The intention is that if the library authors need to write special loops that
+go over arrays that they could modify their code to use the functions provided
+to change loops that can take advantage of the higher bandwidth for load vector
+pair and store instructions.
+
+This particular patch just adds a new include file (vector-pair.h) that
+provides a bunch of functions that on a power10 system would use the vector
+pair load operation, 2 floating point operations, and a vector pair store.  It
+does not add any new types, modes, or built-in function.
+
+I have additional patches that can add built-in functions that the functions in
+vector-pair.h could utilize so that the compiler can optimize and combine
+operations.  I may submit those patches in the future, but I would like to
+provide this patch to allow the library writer to optimize their code.
+
+I've measured the performance of these new functions on a power10.  For default
+unrolling, the percentage of change for the 3 methods over the normal vector
+loop method:
+
+   116%Vector-pair.h function, default unroll
+93%Vector pair split built-in & 2 vector stores, default unroll
+86%Vector pair split & combine built-ins, default unroll
+
+Using explicit 2 way unrolling the numbers are:
+
+   114%Vector-pair.h function, unroll 2
+   106%Vector pair split built-in & 2 vector stores, unroll 2
+98%Vector pair split & combine built-ins, unroll 2
+
+These new functions provided in vector-pair.h use the vector pair load/store
+instructions, and don't generate extra vector moves.  Using the existing
+vector pair disassemble and assemble built-ins generate extra vector moves
+which can hinder performance.
+
+If I compile the loop code for power9, there is a minor speed up for default
+unrolling and more of an improvement using the framework provided in the
+vector-pair.h for explicit unrolling by 2:
+
+   101%Vector-pair.h function, default unroll for power9
+   107%Vector-pair.h function, unroll 2 for power9
+
+Of course this is a synthetic benchmark run on a quiet power10 system.  Results
+would vary for real code on real systems.  However, I feel adding these
+functions can allow the writers of high performance libraries to better
+optimize their code.
+
+As an example, if the library wants to code a simple fused multiply-add loop,
+they might write the code as follows:
+
+   #include 
+   #include 
+   #include 
+
+   void
+   fma_vector (double * __restrict__ r,
+   const double * __restrict__ a,
+   const double * __restrict__ b,
+   size_t n)
+   {
+ vector double * __restrict__ vr = (vector double * __restrict__)r;
+ const vector double * __restrict__ va = (const vector double * 
__restrict__)a;
+ const vector double * __restrict__ vb = (const vector double * 
__restrict__)b;
+ size_t num_elements = sizeof (vector double) / sizeof (double);
+ size_t nv = n / num_elements;
+ size_t i;
+
+ for (i = 0; i < nv; i++)
+   vr[i] = __builtin_vsx_xvmadddp (va[i], vb[i], vr[i]);
+
+ for (i = nv * num_elements; i < n; i++)
+   r[i] = fma (a[i], b[i], r[i]);
+   }
+
+The inner loop would look like:
+
+   .L3:
+   lxvx 0,3,9
+   lxvx 12,4,9
+   addi 10,9,16
+   addi 2,2,-2
+   lxvx 11,5,9
+   xvmaddadp 0,12,11
+   lxvx 12,4,10
+   lxvx 11,5,10
+   stxvx 0,3,9
+   lxvx 0,3,10
+   addi 9,9,32
+   xvmaddadp 0,12,11
+   stxvx 0,3,10
+   bdnz .L3
+
+Now if you code the loop to use __builtin_vsx_disassemble_pair to do a vector
+pair load, but then do 2 vector stores:
+
+
+   #include 
+   #include 
+   #include 
+
+   void
+   fma_mma_ld (double * __restrict__ r,
+   const double * __restrict__ a,
+   const double * __restrict__ b,
+   size_t n)
+   {
+ __vector_pair * __restrict__ vp_r

[gcc(refs/users/meissner/heads/work194)] Use vector pair load/store for memcpy with -mcpu=future

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:0f635f63990b3ead2f39fc5c5e9b7a91f2c04be6

commit 0f635f63990b3ead2f39fc5c5e9b7a91f2c04be6
Author: Michael Meissner 
Date:   Fri Feb 21 18:31:28 2025 -0500

Use vector pair load/store for memcpy with -mcpu=future

In the development for the power10 processor, GCC did not enable using the 
load
vector pair and store vector pair instructions when optimizing things like
memory copy.  This patch enables using those instructions if -mcpu=future is
used.

2025-02-21  Michael Meissner  

gcc/

* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Enable 
using
load vector pair and store vector pair instructions for memory copy
operations.
(POWERPC_MASKS): Make the bit for enabling using load vector pair 
and
store vector pair operations set and reset when the PowerPC 
processor is
changed.
* gcc/config/rs6000/rs6000.cc (rs6000_machine_from_flags): Disable
-mblock-ops-vector-pair from influcing .machine selection.

gcc/testsuite/

* gcc.target/powerpc/future-3.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000-cpus.def   |  4 +++-
 gcc/config/rs6000/rs6000.cc |  2 +-
 gcc/testsuite/gcc.target/powerpc/future-3.c | 22 ++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 228d0b5e7b54..063591f5c094 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -84,7 +84,8 @@
  | OPTION_MASK_POWER11)
 
 #define FUTURE_MASKS_SERVER(POWER11_MASKS_SERVER   \
-| OPTION_MASK_FUTURE)
+| OPTION_MASK_FUTURE   \
+| OPTION_MASK_BLOCK_OPS_VECTOR_PAIR)
 
 /* Flags that need to be turned off if -mno-vsx.  */
 #define OTHER_VSX_VECTOR_MASKS (OPTION_MASK_EFFICIENT_UNALIGNED_VSX\
@@ -114,6 +115,7 @@
 
 /* Mask of all options to set the default isa flags based on -mcpu=.  */
 #define POWERPC_MASKS  (OPTION_MASK_ALTIVEC\
+| OPTION_MASK_BLOCK_OPS_VECTOR_PAIR\
 | OPTION_MASK_CMPB \
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_DFP  \
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index f3586f2f1e17..3c0c6e1a74d0 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -5908,7 +5908,7 @@ rs6000_machine_from_flags (void)
 
   /* Disable the flags that should never influence the .machine selection.  */
   flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | OPTION_MASK_ISEL
-| OPTION_MASK_ALTIVEC);
+| OPTION_MASK_ALTIVEC | OPTION_MASK_BLOCK_OPS_VECTOR_PAIR);
 
   if ((flags & (FUTURE_MASKS_SERVER & ~ISA_3_1_MASKS_SERVER)) != 0)
 return "future";
diff --git a/gcc/testsuite/gcc.target/powerpc/future-3.c 
b/gcc/testsuite/gcc.target/powerpc/future-3.c
new file mode 100644
index ..afa8b96d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/future-3.c
@@ -0,0 +1,22 @@
+/* 32-bit doesn't generate vector pair instructions.  */
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Test to see that memcpy will use load/store vector pair with
+   -mcpu=future.  */
+
+#ifndef SIZE
+#define SIZE 4
+#endif
+
+extern vector double to[SIZE], from[SIZE];
+
+void
+copy (void)
+{
+  __builtin_memcpy (to, from, sizeof (to));
+  return;
+}
+
+/* { dg-final { scan-assembler {\mlxvpx?\M}  } } */
+/* { dg-final { scan-assembler {\mstxvpx?\M} } } */


[gcc(refs/users/meissner/heads/work194)] Add -mcpu=future tuning support.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c0f1e682db32a155fae1d6767b83c7acd80b7469

commit c0f1e682db32a155fae1d6767b83c7acd80b7469
Author: Michael Meissner 
Date:   Fri Feb 21 18:29:08 2025 -0500

Add -mcpu=future tuning support.

This patch makes -mtune=future use the same tuning decision as 
-mtune=power11.

2025-02-21  Michael Meissner  

gcc/

* config/rs6000/power10.md (all reservations): Add future as an
alterntive to power10 and power11.

Diff:
---
 gcc/config/rs6000/power10.md | 145 ++-
 1 file changed, 73 insertions(+), 72 deletions(-)

diff --git a/gcc/config/rs6000/power10.md b/gcc/config/rs6000/power10.md
index fd31b16b3314..bdd7e58145ba 100644
--- a/gcc/config/rs6000/power10.md
+++ b/gcc/config/rs6000/power10.md
@@ -1,4 +1,5 @@
-;; Scheduling description for the IBM Power10 and Power11 processors.
+;; Scheduling description for the IBM Power10, Power11, and
+;; potential future processors.
 ;; Copyright (C) 2020-2025 Free Software Foundation, Inc.
 ;;
 ;; Contributed by Pat Haugen (pthau...@us.ibm.com).
@@ -97,12 +98,12 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-fused-load" 4
   (and (eq_attr "type" "fused_load_cmpi,fused_addis_load,fused_load_load")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-load" 4
@@ -110,13 +111,13 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-load-update" 4
   (and (eq_attr "type" "load")
(eq_attr "update" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-fpload-double" 4
@@ -124,7 +125,7 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-fpload-double" 4
@@ -132,14 +133,14 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-double" 4
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "64")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; SFmode loads are cracked and have additional 3 cycles over DFmode
@@ -148,27 +149,27 @@
   (and (eq_attr "type" "fpload")
(eq_attr "update" "no")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-single" 7
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-vecload" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 ; lxvp
 (define_insn_reservation "power10-vecload-pair" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; Store Unit
@@ -178,12 +179,12 @@
(eq_attr "prefixed" "no")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,STU_power10")
 
 (define_insn_reservation "power10-fused-store" 0
   (and (eq_attr "type" "fused_store_store")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 (define_insn_reservation "power10-prefixed-store" 0
@@ -191,52 +192,52 @@
(eq_attr "prefixed" "yes")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 ; Update forms have 2 cycle lat

[gcc] Created branch 'meissner/heads/work194-dmf' in namespace 'refs/users'

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-dmf' was created in namespace 'refs/users' 
pointing to:

 dc6774491642... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work194)] Do not allow -mvsx to boost processor to power7.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:4c5efbe412b4b6a6d3c3a3fd0e8e4d86e99ef9ab

commit 4c5efbe412b4b6a6d3c3a3fd0e8e4d86e99ef9ab
Author: Michael Meissner 
Date:   Fri Feb 21 18:33:35 2025 -0500

Do not allow -mvsx to boost processor to power7.

This patch restructures the code so that -mvsx for example will not silently
convert the processor to power7.  The user must now use -mcpu=power7 or 
higher.
This means if the user does -mvsx and the default processor does not have 
VSX
support, it will be an error.

I have built both big endian and little endian bootstrap compilers and there
were no regressions.

I updated the 2 tests that used -mvsx to raise the cpu to power7, and the 
test
case that checks if -mno-vsx produces the expected warning.

Note, Peter had some questions about one of the tests in the previous 
version of
the patch.  The test is still the same in this patch.  But the code for
preventing -mvsx is different from the previous patch, and I wanted to get 
that
patch for review before stage1 closes.

Can I install this patch on the GCC 15 trunk?

2025-02-21  Michael Meissner  

gcc/

* config/rs6000/rs6000.cc (rs6000_option_override_internal): Check 
if
the user asked for VSX instructions whether the cpu was at least 
power7.

gcc/testsuite/

* gcc.target/powerpc/ppc-target-4.c: Rewrite the test to add 
cpu=power7
when we need to add VSX support.  Add test for adding cpu=power7 
no-vsx
to generate only Altivec instructions.
* gcc.target/powerpc/pr115688.c: Add cpu=power7 in target 
__attribute__
when requesting VSX instructions.
* gcc.target/powerpc/pr87496-1.c: Update options to use
-mdejagnu-cpu=power6 to get the appropriate error message.

Diff:
---
 gcc/config/rs6000/rs6000.cc |  7 +
 gcc/testsuite/gcc.target/powerpc/ppc-target-4.c | 38 +++--
 gcc/testsuite/gcc.target/powerpc/pr115688.c |  3 +-
 gcc/testsuite/gcc.target/powerpc/pr87496-1.c|  2 +-
 4 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 3c0c6e1a74d0..e2f0b8b7de57 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3862,6 +3862,13 @@ rs6000_option_override_internal (bool global_init_p)
  rs6000_isa_flags &= ~OPTION_MASK_VSX;
  rs6000_isa_flags_explicit |= OPTION_MASK_VSX;
}
+  else if (!TARGET_POWER7)
+   {
+ if (explicit_vsx_p)
+   error ("%<-mvsx%> requires at least %<-mcpu=power%>");
+ rs6000_isa_flags &= ~OPTION_MASK_VSX;
+ rs6000_isa_flags_explicit |= OPTION_MASK_VSX;
+   }
 }
 
   /* If hard-float/altivec/vsx were explicitly turned off then don't allow
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c 
b/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c
index feef76db4618..5e2ecf34f249 100644
--- a/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c
@@ -2,7 +2,7 @@
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_fprs } */
 /* { dg-options "-O2 -ffast-math -mdejagnu-cpu=power5 -mno-altivec 
-mabi=altivec -fno-unroll-loops" } */
-/* { dg-final { scan-assembler-times "vaddfp" 1 } } */
+/* { dg-final { scan-assembler-times "vaddfp" 2 } } */
 /* { dg-final { scan-assembler-times "xvaddsp" 1 } } */
 /* { dg-final { scan-assembler-times "fadds" 1 } } */
 
@@ -18,10 +18,6 @@
 #error "__VSX__ should not be defined."
 #endif
 
-#pragma GCC target("altivec,vsx")
-#include 
-#pragma GCC reset_options
-
 #pragma GCC push_options
 #pragma GCC target("altivec,no-vsx")
 
@@ -33,6 +29,7 @@
 #error "__VSX__ should not be defined."
 #endif
 
+/* Altivec build, generate vaddfp.  */
 void
 av_add (vector float *a, vector float *b, vector float *c)
 {
@@ -40,10 +37,11 @@ av_add (vector float *a, vector float *b, vector float *c)
   unsigned long n = SIZE / 4;
 
   for (i = 0; i < n; i++)
-a[i] = vec_add (b[i], c[i]);
+a[i] = b[i] + c[i];
 }
 
-#pragma GCC target("vsx")
+/* cpu=power7 must be used to enable VSX.  */
+#pragma GCC target("cpu=power7,vsx")
 
 #ifndef __ALTIVEC__
 #error "__ALTIVEC__ should be defined."
@@ -53,6 +51,7 @@ av_add (vector float *a, vector float *b, vector float *c)
 #error "__VSX__ should be defined."
 #endif
 
+/* VSX build on power7, generate xsaddsp.  */
 void
 vsx_add (vector float *a, vector float *b, vector float *c)
 {
@@ -60,11 +59,31 @@ vsx_add (vector float *a, vector float *b, vector float *c)
   unsigned long n = SIZE / 4;
 
   for (i = 0; i < n; i++)
-a[i] = vec_add (b[i], c[i]);
+a[i] = b[i] + c[i];
+}
+
+#pragma GCC target("cpu=power7,no-vsx")
+
+#ifndef __ALTIVEC__
+#error "__ALTIVEC__ should be defined."
+#endif
+
+#ifdef __VSX__
+#error "__VSX__ should not be defined."

[gcc(refs/users/meissner/heads/work194-math)] Add ChangeLog.math and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:df6c0918e4efaf9ba8600d3ef132a349baaa49d0

commit df6c0918e4efaf9ba8600d3ef132a349baaa49d0
Author: Michael Meissner 
Date:   Fri Feb 21 16:09:19 2025 -0500

Add ChangeLog.math and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.math: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.math | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.math b/gcc/ChangeLog.math
new file mode 100644
index ..e63f7fd373bf
--- /dev/null
+++ b/gcc/ChangeLog.math
@@ -0,0 +1,5 @@
+ Branch work194-math, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..324f02973041 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-math branch


[gcc/meissner/heads/work194-vpair] (15 commits) Merge commit 'refs/users/meissner/heads/work194-vpair' of g

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-vpair' was updated to point to:

 5f8425294cdc... Merge commit 'refs/users/meissner/heads/work194-vpair' of g

It previously pointed to:

 6c254f74ffc9... Add ChangeLog.vpair and update REVISION.

Diff:

Summary of changes (added commits):
---

  5f84252... Merge commit 'refs/users/meissner/heads/work194-vpair' of g
  d21b37f... Add ChangeLog.vpair and update REVISION.
  5cf3c92... Update ChangeLog.* (*)
  b3201bd... Use architecture flags for defining _ARCH_PWR macros. (*)
  0c22347... Add rs6000 architecture masks. (*)
  4c5efbe... Do not allow -mvsx to boost processor to power7. (*)
  0f635f6... Use vector pair load/store for memcpy with -mcpu=future (*)
  e4b02d1... Add -mcpu=future tests. (*)
  c0f1e68... Add -mcpu=future tuning support. (*)
  2640deb... Add support for -mcpu=future (*)
  a4d8a02... Change TARGET_MODULO to TARGET_POWER9. (*)
  9868814... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  1ab6162... Change TARGET_CMPB to TARGET_POWER6. (*)
  432624d... Change TARGET_FPRND to TARGET_POWER5X. (*)
  034a3c0... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work194-vpair' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work194-vpair)] Add ChangeLog.vpair and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:d21b37f16a864220831ffee13269ddc424f7d030

commit d21b37f16a864220831ffee13269ddc424f7d030
Author: Michael Meissner 
Date:   Fri Feb 21 16:05:04 2025 -0500

Add ChangeLog.vpair and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.vpair: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.vpair | 5 +
 gcc/REVISION| 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.vpair b/gcc/ChangeLog.vpair
new file mode 100644
index ..7516c64dcdee
--- /dev/null
+++ b/gcc/ChangeLog.vpair
@@ -0,0 +1,5 @@
+ Branch work194-vpair, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..ee6baa0a4202 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-vpair branch


[gcc(refs/users/meissner/heads/work194-vpair)] Merge commit 'refs/users/meissner/heads/work194-vpair' of git+ssh://gcc.gnu.org/git/gcc into me/work

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:5f8425294cdcb04ef1b30aee00603979f1d8b666

commit 5f8425294cdcb04ef1b30aee00603979f1d8b666
Merge: d21b37f16a86 6c254f74ffc9
Author: Michael Meissner 
Date:   Fri Feb 21 18:49:44 2025 -0500

Merge commit 'refs/users/meissner/heads/work194-vpair' of 
git+ssh://gcc.gnu.org/git/gcc into me/work194-vpair

Diff:


[gcc(refs/users/meissner/heads/work194-test)] Merge commit 'refs/users/meissner/heads/work194-test' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:5142962814970560c9f2880d752c74ef58bd0672

commit 5142962814970560c9f2880d752c74ef58bd0672
Merge: d01a5419e35a 8f61c9712e31
Author: Michael Meissner 
Date:   Fri Feb 21 18:47:55 2025 -0500

Merge commit 'refs/users/meissner/heads/work194-test' of 
git+ssh://gcc.gnu.org/git/gcc into me/work194-test

Diff:


[gcc(refs/users/meissner/heads/work194-dmf)] Merge commit 'refs/users/meissner/heads/work194-dmf' of git+ssh://gcc.gnu.org/git/gcc into me/work19

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c8d8a14b3f12901cfe0a3856b74e88f174cdf9fb

commit c8d8a14b3f12901cfe0a3856b74e88f174cdf9fb
Merge: 52b161b9249f 160015ed4cb0
Author: Michael Meissner 
Date:   Fri Feb 21 18:42:56 2025 -0500

Merge commit 'refs/users/meissner/heads/work194-dmf' of 
git+ssh://gcc.gnu.org/git/gcc into me/work194-dmf

Diff:


[gcc(refs/users/meissner/heads/work194)] Change TARGET_POPCNTD to TARGET_POWER7.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:986881451859ff45b8e585833b0ce5c6ff5a6591

commit 986881451859ff45b8e585833b0ce5c6ff5a6591
Author: Michael Meissner 
Date:   Fri Feb 21 18:24:19 2025 -0500

Change TARGET_POPCNTD to TARGET_POWER7.

This patch changes TARGET_POPCNTD to TARGET_POWER7.  The -mpopcntd switch 
is not
being changed, just the name of the macros used to determine if the PowerPC
processor supports ISA 2.6 (Power7).

2025-02-21  Michael Meissner  

gcc/

* gcc/config/rs6000/dfp.md (cmp_internal1): Change 
TARGET_POPCNTD
to TARGET_POWER7.
* gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported):
Likewise.
* gcc/config/rs6000/rs6000-string.cc (expand_block_compare): 
Likewise.
* gcc/config/rs6000/rs6000.cc (rs6000_hard_regno_mode_ok_uncached):
Likewise.
(rs6000_option_override_internal): Likewise.
(rs6000_rtx_costs): Likewise.
* gcc/config/rs6000/rs6000.h (TARGET_LDBRX): Likewise.
(TARGET_FCFID): Likewise.
(TARGET_LFIWZX): Likewise.
(TARGET_FCFIDS): Likewise.
(TARGET_FCFIDU): Likewise.
(TARGET_FCFIDUS): Likewise.
(TARGET_FCTIDUZ): Likewise.
(TARGET_FCTIWUZ): Likewise.
(TARGET_FCTIDUZ): Likewise.
(TARGET_POWER7): New macro.
(TARGET_EXTRA_BUILTINS): Change TARGET_POPCNTD to TARGET_POWER7.
(CTZ_DEFINED_VALUE_AT_ZERO): Likewise.
* gcc/config/rs6000/rs6000.md (enabled attribute): Likewise.
(lrintsi2): Likewise.
(lrintsi): Likewise.
(lrintsi_di): Likewise.
(cmpmemsi): Likewise.
(bpermd_): Likewise.
(addg6s): Likewise.
(cdtbcd): Likewise.
(cbcdtd): Likewise.
(div_): Likewise.

Diff:
---
 gcc/config/rs6000/dfp.md|  2 +-
 gcc/config/rs6000/rs6000-builtin.cc |  4 ++--
 gcc/config/rs6000/rs6000-string.cc  |  2 +-
 gcc/config/rs6000/rs6000.cc |  8 
 gcc/config/rs6000/rs6000.h  | 21 +++--
 gcc/config/rs6000/rs6000.md | 20 ++--
 6 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md
index 59fa66ae15c8..5919149682b2 100644
--- a/gcc/config/rs6000/dfp.md
+++ b/gcc/config/rs6000/dfp.md
@@ -214,7 +214,7 @@
 (define_insn "floatdidd2"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
(float:DD (match_operand:DI 1 "gpc_reg_operand" "d")))]
-  "TARGET_DFP && TARGET_POPCNTD"
+  "TARGET_DFP && TARGET_POWER7"
   "dcffix %0,%1"
   [(set_attr "type" "dfp")])
 
diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index dbb8520ab039..2366b2aee00a 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -161,9 +161,9 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins 
fncode)
 case ENB_P6_64:
   return TARGET_POWER6 && TARGET_POWERPC64;
 case ENB_P7:
-  return TARGET_POPCNTD;
+  return TARGET_POWER7;
 case ENB_P7_64:
-  return TARGET_POPCNTD && TARGET_POWERPC64;
+  return TARGET_POWER7 && TARGET_POWERPC64;
 case ENB_P8:
   return TARGET_POWER8;
 case ENB_P8V:
diff --git a/gcc/config/rs6000/rs6000-string.cc 
b/gcc/config/rs6000/rs6000-string.cc
index 3d2911ca08a0..703f77fa0bf1 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -1949,7 +1949,7 @@ bool
 expand_block_compare (rtx operands[])
 {
   /* TARGET_POPCNTD is already guarded at expand cmpmemsi.  */
-  gcc_assert (TARGET_POPCNTD);
+  gcc_assert (TARGET_POWER7);
 
   /* For P8, this case is complicated to handle because the subtract
  with carry instructions do not generate the 64-bit carry and so
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index d2814364b3d0..1bba77244c25 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1924,7 +1924,7 @@ rs6000_hard_regno_mode_ok_uncached (int regno, 
machine_mode mode)
  if(GET_MODE_SIZE (mode) == UNITS_PER_FP_WORD)
return 1;
 
- if (TARGET_POPCNTD && mode == SImode)
+ if (TARGET_POWER7 && mode == SImode)
return 1;
 
  if (TARGET_P9_VECTOR && (mode == QImode || mode == HImode))
@@ -3918,7 +3918,7 @@ rs6000_option_override_internal (bool global_init_p)
 rs6000_isa_flags |= (ISA_2_7_MASKS_SERVER & ~ignore_masks);
   else if (TARGET_VSX)
 rs6000_isa_flags |= (ISA_2_6_MASKS_SERVER & ~ignore_masks);
-  else if (TARGET_POPCNTD)
+  else if (TARGET_POWER7)
 rs6000_isa_flags |= (ISA_2_6_MASKS_EMBEDDED & ~ignore_masks);
   else if (TARGET_DFP)
 rs6000_isa_flags |= (ISA_2_5_MASKS_SERVER & ~ignore_masks);
@@ -4131,7 +4131,7 @@ rs6000_option_override_internal (bool global_init_p)
   else if (TARGET_LONG_DOUBLE_128)
 

[gcc(refs/users/meissner/heads/work194)] Add rs6000 architecture masks.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:0c22347a214c5a04cc760a4550cdb966b137486a

commit 0c22347a214c5a04cc760a4550cdb966b137486a
Author: Michael Meissner 
Date:   Fri Feb 21 18:35:25 2025 -0500

Add rs6000 architecture masks.

This patch begins the journey to move architecture bits that are not user 
ISA
options from rs6000_isa_flags to a new targt variable rs6000_arch_flags.  
The
intention is to remove switches that are currently isa options, but the user
should not be using this particular option. For example, we want users to 
use
-mcpu=power10 and not just -mpower10.

This patch also changes the target_clones support to use an architecture 
mask
instead of isa bits.

This patch also switches the handling of .machine to use architecture masks 
if
they exist (power4 through power11).  All of the other PowerPCs will 
continue to
use the existing code for setting the .machine option.

I have built both big endian and little endian bootstrap compilers and there
were no regressions.

In addition, I constructed a test case that used every archiecture define 
(like
_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I 
ran
this test for all supported combinations of -mcpu, big/little endian, and 
32/64
bit support.  Every single instance generated exactly the same code with the
patches installed compared to the compiler before installing the patches.

The only difference in this patch compared to the first version posted on
November 6th is that I the correct attribution and copyright year (i.e. 
that I
created rs6000-arch.def in 2024).

Can I install this patch on the GCC 15 trunk?

2025-02-21  Michael Meissner  

gcc/

* config/rs6000/default64.h (TARGET_CPU_DEFAULT): Set default cpu 
name.
* config/rs6000/rs6000-arch.def: New file.
* config/rs6000/rs6000.cc (struct clone_map): Switch to using
architecture masks instead of ISA masks.
(rs6000_clone_map): Likewise.
(rs6000_print_isa_options): Add an architecture flags argument, 
change
all callers.
(get_arch_flag): New function.
(rs6000_debug_reg_global): Update rs6000_print_isa_options calls.
(rs6000_option_override_internal): Likewise.
(rs6000_machine_from_flags): Switch to using architecture masks 
instead
of ISA masks.
(struct rs6000_arch_mask): New structure.
(rs6000_arch_masks): New table of architecutre masks and names.
(rs6000_function_specific_save): Save architecture flags.
(rs6000_function_specific_restore): Restore architecture flags.
(rs6000_function_specific_print): Update rs6000_print_isa_options 
calls.
(rs6000_print_options_internal): Add architecture flags options.
(rs6000_clone_priority): Switch to using architecture masks instead 
of
ISA masks.
(rs6000_can_inline_p): Don't allow inling if the callee requires a 
newer
architecture than the caller.
* config/rs6000/rs6000.h: Use rs6000-arch.def to create the 
architecture
masks.
* config/rs6000/rs6000.opt (rs6000_arch_flags): New target variable.
(x_rs6000_arch_flags): New save/restore field for rs6000_arch_flags.

Diff:
---
 gcc/config/rs6000/default64.h |  11 ++
 gcc/config/rs6000/rs6000-arch.def |  49 +
 gcc/config/rs6000/rs6000.cc   | 222 +++---
 gcc/config/rs6000/rs6000.h|  24 +
 gcc/config/rs6000/rs6000.opt  |   8 ++
 5 files changed, 277 insertions(+), 37 deletions(-)

diff --git a/gcc/config/rs6000/default64.h b/gcc/config/rs6000/default64.h
index 7f6001ded852..188f5c1d1378 100644
--- a/gcc/config/rs6000/default64.h
+++ b/gcc/config/rs6000/default64.h
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 #define RS6000_CPU(NAME, CPU, FLAGS)
 #include "rs6000-cpus.def"
 #undef RS6000_CPU
+#undef TARGET_CPU_DEFAULT
 
 #if (TARGET_DEFAULT & MASK_LITTLE_ENDIAN)
 #undef TARGET_DEFAULT
@@ -28,10 +29,20 @@ along with GCC; see the file COPYING3.  If not see
| MASK_LITTLE_ENDIAN)
 #undef ASM_DEFAULT_SPEC
 #define ASM_DEFAULT_SPEC "-mpower8"
+#define TARGET_CPU_DEFAULT "power8"
+
 #else
 #undef TARGET_DEFAULT
 #define TARGET_DEFAULT (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT \
| OPTION_MASK_MFCRF | MASK_POWERPC64 | MASK_64BIT)
 #undef ASM_DEFAULT_SPEC
 #define ASM_DEFAULT_SPEC "-mpower4"
+
+#if (TARGET_DEFAULT & MASK_POWERPC64)
+#define TARGET_CPU_DEFAULT "powerpc64"
+
+#else
+#define TARGET_CPU_DEFAULT "powerpc"
+#endif
+
 #endif
diff --git a/gcc/config/rs6000/rs6000-arch.def 
b/gcc/config/rs6000/rs6000-arch.def
new file mode 100644
index ..c0dbc5834333
--- /dev/null
+++ b/gcc/config/rs6000/rs6000-arch.def
@

[gcc(refs/users/meissner/heads/work194)] Use architecture flags for defining _ARCH_PWR macros.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:b3201bdf76312f46352b2164ade30e79b28b6bbf

commit b3201bdf76312f46352b2164ade30e79b28b6bbf
Author: Michael Meissner 
Date:   Fri Feb 21 18:36:29 2025 -0500

Use architecture flags for defining _ARCH_PWR macros.

For the newer architectures, this patch changes GCC to define the 
_ARCH_PWR
macros using the new architecture flags instead of relying on isa options 
like
-mpower10.

The -mpower8-internal, -mpower10, -mpower11, and -mfuture options were 
removed.
The -mpower11 and -mfuture options were removed completely, since they were 
just
added in GCC 15. The other two options were marked as WarnRemoved, and the
various ISA bits were removed.

TARGET_POWER8, TARGET_POWER10, TARGET_POWER11, and TARGET_FUTURE were 
re-defined
to use the architeture bits instead of the ISA bits.

There are other internal isa bits that aren't removed with this patch 
because
the built-in function support uses those bits.

I have built both big endian and little endian bootstrap compilers and there
were no regressions.

Can I install this patch on the GCC 15 trunk?

2025-02-21  Michael Meissner  

gcc/

* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros) Add 
support to
use architecture flags instead of ISA flags for setting most of the
_ARCH_PWR* macros.
(rs6000_cpu_cpp_builtins): Update rs6000_target_modify_macros call.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Remove
OPTION_MASK_POWER8.
(ISA_3_1_MASKS_SERVER): Remove OPTION_MASK_POWER10.
(POWER11_MASKS_SERVER): Remove OPTION_MASK_POWER11.
(FUTURE_MASKS_SERVER): Remove OPTION_MASK_FUTURE.
(POWERPC_MASKS): Remove OPTION_MASK_POWER8, OPTION_MASK_POWER10,
OPTION_MASK_POWER11, and OPTION_MASK_FUTURE.
* config/rs6000/rs6000-protos.h (rs6000_target_modify_macros): 
Update
declaration.
(rs6000_target_modify_macros_ptr): Likewise.
* config/rs6000/rs6000.cc (rs6000_target_modify_macros_ptr): 
Likewise.
(rs6000_option_override_internal): Use architecture flags instead 
of ISA
flags.
(rs6000_opt_masks): Remove -mpower10, -mpower11, and -mfuture which 
are
no longer in the ISA flags.
(rs6000_pragma_target_parse): Use architecture flags as well as ISA
flags.
* config/rs6000/rs6000.h (TARGET_POWER5): Redefine to use 
architecture
flags.
(TARGET_POWER5X): Likewise.
(TARGET_POWER6): Likewise.
(TARGET_POWER7): Likewise.
(TARGET_POWER8): Likewise.
(TARGET_POWER9): Likewise.
(TARGET_POWER10): New macro.
(TARGET_POWER11): Likewise.
(TARGET_FUTURE): Likewise.
* config/rs6000/rs6000.opt (-mpower8-internal): Remove ISA flag 
bits.
(-mpower10): Likewise.
(-mpower11): Likewise.
(-mfuture): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-c.cc | 29 -
 gcc/config/rs6000/rs6000-cpus.def | 10 +-
 gcc/config/rs6000/rs6000-protos.h |  5 +++--
 gcc/config/rs6000/rs6000.cc   | 20 +++-
 gcc/config/rs6000/rs6000.h| 19 +--
 gcc/config/rs6000/rs6000.opt  | 17 ++---
 6 files changed, 46 insertions(+), 54 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 6757a2477ad1..6d6838735b33 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -338,7 +338,8 @@ rs6000_define_or_undefine_macro (bool define_p, const char 
*name)
#pragma GCC target, we need to adjust the macros dynamically.  */
 
 void
-rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags)
+rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags,
+HOST_WIDE_INT arch_flags)
 {
   if (TARGET_DEBUG_BUILTIN || TARGET_DEBUG_TARGET)
 fprintf (stderr,
@@ -411,7 +412,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
summary of the flags associated with particular cpu
definitions.  */
 
-  /* rs6000_isa_flags based options.  */
+  /* rs6000_isa_flags and rs6000_arch_flags based options.  */
   rs6000_define_or_undefine_macro (define_p, "_ARCH_PPC");
   if ((flags & OPTION_MASK_PPC_GPOPT) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PPCSQ");
@@ -419,25 +420,27 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PPCGR");
   if ((flags & OPTION_MASK_POWERPC64) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PPC64");
-  if ((flags & OPTION_MASK_MFCRF) != 0)
+  if ((flags & OPTION_MASK_POWERPC64) != 0)
+rs6000_define_or_undefine_macro (define_p, "_ARCH_PPC64");

[gcc(refs/users/meissner/heads/work194)] Update ChangeLog.*

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:5cf3c92b4e62f6458c9cccadcb82d002346b6157

commit 5cf3c92b4e62f6458c9cccadcb82d002346b6157
Author: Michael Meissner 
Date:   Fri Feb 21 18:41:10 2025 -0500

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.meissner | 435 +
 1 file changed, 435 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 3a3d82994248..c8dead86ffc8 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,5 +1,440 @@
+ Branch work194, patch #31 
+
+Use architecture flags for defining _ARCH_PWR macros.
+
+For the newer architectures, this patch changes GCC to define the _ARCH_PWR
+macros using the new architecture flags instead of relying on isa options like
+-mpower10.
+
+The -mpower8-internal, -mpower10, -mpower11, and -mfuture options were removed.
+The -mpower11 and -mfuture options were removed completely, since they were 
just
+added in GCC 15. The other two options were marked as WarnRemoved, and the
+various ISA bits were removed.
+
+TARGET_POWER8, TARGET_POWER10, TARGET_POWER11, and TARGET_FUTURE were 
re-defined
+to use the architeture bits instead of the ISA bits.
+
+There are other internal isa bits that aren't removed with this patch because
+the built-in function support uses those bits.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+Can I install this patch on the GCC 15 trunk?
+
+2025-02-21  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros) Add support to
+   use architecture flags instead of ISA flags for setting most of the
+   _ARCH_PWR* macros.
+   (rs6000_cpu_cpp_builtins): Update rs6000_target_modify_macros call.
+   * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Remove
+   OPTION_MASK_POWER8.
+   (ISA_3_1_MASKS_SERVER): Remove OPTION_MASK_POWER10.
+   (POWER11_MASKS_SERVER): Remove OPTION_MASK_POWER11.
+   (FUTURE_MASKS_SERVER): Remove OPTION_MASK_FUTURE.
+   (POWERPC_MASKS): Remove OPTION_MASK_POWER8, OPTION_MASK_POWER10,
+   OPTION_MASK_POWER11, and OPTION_MASK_FUTURE.
+   * config/rs6000/rs6000-protos.h (rs6000_target_modify_macros): Update
+   declaration.
+   (rs6000_target_modify_macros_ptr): Likewise.
+   * config/rs6000/rs6000.cc (rs6000_target_modify_macros_ptr): Likewise.
+   (rs6000_option_override_internal): Use architecture flags instead of ISA
+   flags.
+   (rs6000_opt_masks): Remove -mpower10, -mpower11, and -mfuture which are
+   no longer in the ISA flags.
+   (rs6000_pragma_target_parse): Use architecture flags as well as ISA
+   flags.
+   * config/rs6000/rs6000.h (TARGET_POWER5): Redefine to use architecture
+   flags.
+   (TARGET_POWER5X): Likewise.
+   (TARGET_POWER6): Likewise.
+   (TARGET_POWER7): Likewise.
+   (TARGET_POWER8): Likewise.
+   (TARGET_POWER9): Likewise.
+   (TARGET_POWER10): New macro.
+   (TARGET_POWER11): Likewise.
+   (TARGET_FUTURE): Likewise.
+   * config/rs6000/rs6000.opt (-mpower8-internal): Remove ISA flag bits.
+   (-mpower10): Likewise.
+   (-mpower11): Likewise.
+   (-mfuture): Likewise.
+
+ Branch work194, patch #30 
+
+Add rs6000 architecture masks.
+
+This patch begins the journey to move architecture bits that are not user ISA
+options from rs6000_isa_flags to a new targt variable rs6000_arch_flags.  The
+intention is to remove switches that are currently isa options, but the user
+should not be using this particular option. For example, we want users to use
+-mcpu=power10 and not just -mpower10.
+
+This patch also changes the target_clones support to use an architecture mask
+instead of isa bits.
+
+This patch also switches the handling of .machine to use architecture masks if
+they exist (power4 through power11).  All of the other PowerPCs will continue 
to
+use the existing code for setting the .machine option.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every archiecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+The only difference in this patch compared to the first version posted on
+November 6th is that I the correct attribution and copyright year (i.e. that I
+created rs6000-arch.def in 2024).
+
+Can I install this patch on the GCC 15 trunk?
+
+2025-02-21  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/default64.h (TARGET_CPU_DEFAULT): Set default cpu name.
+   * config/rs6000/rs6000-arch.def: New file.
+   * conf

[gcc(refs/users/meissner/heads/work194-bugs)] Merge commit 'refs/users/meissner/heads/work194-bugs' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:8d6f222877326a5bb921d1ebcf73afcfe1b70a2c

commit 8d6f222877326a5bb921d1ebcf73afcfe1b70a2c
Merge: 837ca677cf78 5766a9d4505f
Author: Michael Meissner 
Date:   Fri Feb 21 18:41:47 2025 -0500

Merge commit 'refs/users/meissner/heads/work194-bugs' of 
git+ssh://gcc.gnu.org/git/gcc into me/work194-bugs

Diff:


[gcc/meissner/heads/work194-bugs] (15 commits) Merge commit 'refs/users/meissner/heads/work194-bugs' of gi

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-bugs' was updated to point to:

 8d6f22287732... Merge commit 'refs/users/meissner/heads/work194-bugs' of gi

It previously pointed to:

 5766a9d4505f... Add ChangeLog.bugs and update REVISION.

Diff:

Summary of changes (added commits):
---

  8d6f222... Merge commit 'refs/users/meissner/heads/work194-bugs' of gi
  837ca67... Add ChangeLog.bugs and update REVISION.
  5cf3c92... Update ChangeLog.* (*)
  b3201bd... Use architecture flags for defining _ARCH_PWR macros. (*)
  0c22347... Add rs6000 architecture masks. (*)
  4c5efbe... Do not allow -mvsx to boost processor to power7. (*)
  0f635f6... Use vector pair load/store for memcpy with -mcpu=future (*)
  e4b02d1... Add -mcpu=future tests. (*)
  c0f1e68... Add -mcpu=future tuning support. (*)
  2640deb... Add support for -mcpu=future (*)
  a4d8a02... Change TARGET_MODULO to TARGET_POWER9. (*)
  9868814... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  1ab6162... Change TARGET_CMPB to TARGET_POWER6. (*)
  432624d... Change TARGET_FPRND to TARGET_POWER5X. (*)
  034a3c0... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work194-bugs' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work194-bugs)] Add ChangeLog.bugs and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:837ca677cf789bcd08b17efaf720602641e6875e

commit 837ca677cf789bcd08b17efaf720602641e6875e
Author: Michael Meissner 
Date:   Fri Feb 21 16:05:57 2025 -0500

Add ChangeLog.bugs and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.bugs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.bugs | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
new file mode 100644
index ..fcc1dd569fc0
--- /dev/null
+++ b/gcc/ChangeLog.bugs
@@ -0,0 +1,5 @@
+ Branch work194-bugs, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..07afa32f8801 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-bugs branch


[gcc(refs/users/meissner/heads/work194-dmf)] Add ChangeLog.dmf and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:52b161b9249fb44a732103c43d24e2ee1af1ddc3

commit 52b161b9249fb44a732103c43d24e2ee1af1ddc3
Author: Michael Meissner 
Date:   Fri Feb 21 16:04:15 2025 -0500

Add ChangeLog.dmf and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.dmf: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.dmf | 5 +
 gcc/REVISION  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
new file mode 100644
index ..e593ea8e769d
--- /dev/null
+++ b/gcc/ChangeLog.dmf
@@ -0,0 +1,5 @@
+ Branch work194-dmf, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..6b364388815e 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-dmf branch


[gcc(refs/users/meissner/heads/work194-libs)] Add ChangeLog.libs and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:22e9167f687fcd473be6dbbdad8edf12a241e1f6

commit 22e9167f687fcd473be6dbbdad8edf12a241e1f6
Author: Michael Meissner 
Date:   Fri Feb 21 16:06:47 2025 -0500

Add ChangeLog.libs and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.libs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.libs | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.libs b/gcc/ChangeLog.libs
new file mode 100644
index ..380b93a7ee46
--- /dev/null
+++ b/gcc/ChangeLog.libs
@@ -0,0 +1,5 @@
+ Branch work194-libs, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..696d56ba45dc 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-libs branch


[gcc(refs/users/meissner/heads/work194-libs)] Merge commit 'refs/users/meissner/heads/work194-libs' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:9b75068c90ad8418615691f03d49ac76302bf10f

commit 9b75068c90ad8418615691f03d49ac76302bf10f
Merge: 22e9167f687f 860a556970bb
Author: Michael Meissner 
Date:   Fri Feb 21 18:44:04 2025 -0500

Merge commit 'refs/users/meissner/heads/work194-libs' of 
git+ssh://gcc.gnu.org/git/gcc into me/work194-libs

Diff:


[gcc/meissner/heads/work194-dmf] (15 commits) Merge commit 'refs/users/meissner/heads/work194-dmf' of git

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-dmf' was updated to point to:

 c8d8a14b3f12... Merge commit 'refs/users/meissner/heads/work194-dmf' of git

It previously pointed to:

 160015ed4cb0... Add ChangeLog.dmf and update REVISION.

Diff:

Summary of changes (added commits):
---

  c8d8a14... Merge commit 'refs/users/meissner/heads/work194-dmf' of git
  52b161b... Add ChangeLog.dmf and update REVISION.
  5cf3c92... Update ChangeLog.* (*)
  b3201bd... Use architecture flags for defining _ARCH_PWR macros. (*)
  0c22347... Add rs6000 architecture masks. (*)
  4c5efbe... Do not allow -mvsx to boost processor to power7. (*)
  0f635f6... Use vector pair load/store for memcpy with -mcpu=future (*)
  e4b02d1... Add -mcpu=future tests. (*)
  c0f1e68... Add -mcpu=future tuning support. (*)
  2640deb... Add support for -mcpu=future (*)
  a4d8a02... Change TARGET_MODULO to TARGET_POWER9. (*)
  9868814... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  1ab6162... Change TARGET_CMPB to TARGET_POWER6. (*)
  432624d... Change TARGET_FPRND to TARGET_POWER5X. (*)
  034a3c0... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work194-dmf' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc/meissner/heads/work194-libs] (15 commits) Merge commit 'refs/users/meissner/heads/work194-libs' of gi

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-libs' was updated to point to:

 9b75068c90ad... Merge commit 'refs/users/meissner/heads/work194-libs' of gi

It previously pointed to:

 860a556970bb... Add ChangeLog.libs and update REVISION.

Diff:

Summary of changes (added commits):
---

  9b75068... Merge commit 'refs/users/meissner/heads/work194-libs' of gi
  22e9167... Add ChangeLog.libs and update REVISION.
  5cf3c92... Update ChangeLog.* (*)
  b3201bd... Use architecture flags for defining _ARCH_PWR macros. (*)
  0c22347... Add rs6000 architecture masks. (*)
  4c5efbe... Do not allow -mvsx to boost processor to power7. (*)
  0f635f6... Use vector pair load/store for memcpy with -mcpu=future (*)
  e4b02d1... Add -mcpu=future tests. (*)
  c0f1e68... Add -mcpu=future tuning support. (*)
  2640deb... Add support for -mcpu=future (*)
  a4d8a02... Change TARGET_MODULO to TARGET_POWER9. (*)
  9868814... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  1ab6162... Change TARGET_CMPB to TARGET_POWER6. (*)
  432624d... Change TARGET_FPRND to TARGET_POWER5X. (*)
  034a3c0... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work194-libs' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc/meissner/heads/work194-math] (15 commits) Merge commit 'refs/users/meissner/heads/work194-math' of gi

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-math' was updated to point to:

 f86c641ece03... Merge commit 'refs/users/meissner/heads/work194-math' of gi

It previously pointed to:

 3fb21ca3f9e3... Add ChangeLog.math and update REVISION.

Diff:

Summary of changes (added commits):
---

  f86c641... Merge commit 'refs/users/meissner/heads/work194-math' of gi
  df6c091... Add ChangeLog.math and update REVISION.
  5cf3c92... Update ChangeLog.* (*)
  b3201bd... Use architecture flags for defining _ARCH_PWR macros. (*)
  0c22347... Add rs6000 architecture masks. (*)
  4c5efbe... Do not allow -mvsx to boost processor to power7. (*)
  0f635f6... Use vector pair load/store for memcpy with -mcpu=future (*)
  e4b02d1... Add -mcpu=future tests. (*)
  c0f1e68... Add -mcpu=future tuning support. (*)
  2640deb... Add support for -mcpu=future (*)
  a4d8a02... Change TARGET_MODULO to TARGET_POWER9. (*)
  9868814... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  1ab6162... Change TARGET_CMPB to TARGET_POWER6. (*)
  432624d... Change TARGET_FPRND to TARGET_POWER5X. (*)
  034a3c0... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work194-math' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work194-math)] Merge commit 'refs/users/meissner/heads/work194-math' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:f86c641ece03ef60c30dc67da6b3750a01362ab7

commit f86c641ece03ef60c30dc67da6b3750a01362ab7
Merge: df6c0918e4ef 3fb21ca3f9e3
Author: Michael Meissner 
Date:   Fri Feb 21 18:45:08 2025 -0500

Merge commit 'refs/users/meissner/heads/work194-math' of 
git+ssh://gcc.gnu.org/git/gcc into me/work194-math

Diff:


[gcc(refs/users/meissner/heads/work194-sha)] Add ChangeLog.sha and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a2a1e0fe9afe73fd7d1b70f1aabf4f59ee4592e5

commit a2a1e0fe9afe73fd7d1b70f1aabf4f59ee4592e5
Author: Michael Meissner 
Date:   Fri Feb 21 16:07:42 2025 -0500

Add ChangeLog.sha and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.sha: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.sha | 5 +
 gcc/REVISION  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.sha b/gcc/ChangeLog.sha
new file mode 100644
index ..2b5f83e3304e
--- /dev/null
+++ b/gcc/ChangeLog.sha
@@ -0,0 +1,5 @@
+ Branch work194-sha, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..c8298bf5c35a 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-sha branch


[gcc/meissner/heads/work194-sha] (15 commits) Merge commit 'refs/users/meissner/heads/work194-sha' of git

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-sha' was updated to point to:

 636f49bdc3c0... Merge commit 'refs/users/meissner/heads/work194-sha' of git

It previously pointed to:

 33235ddf7a38... Add ChangeLog.sha and update REVISION.

Diff:

Summary of changes (added commits):
---

  636f49b... Merge commit 'refs/users/meissner/heads/work194-sha' of git
  a2a1e0f... Add ChangeLog.sha and update REVISION.
  5cf3c92... Update ChangeLog.* (*)
  b3201bd... Use architecture flags for defining _ARCH_PWR macros. (*)
  0c22347... Add rs6000 architecture masks. (*)
  4c5efbe... Do not allow -mvsx to boost processor to power7. (*)
  0f635f6... Use vector pair load/store for memcpy with -mcpu=future (*)
  e4b02d1... Add -mcpu=future tests. (*)
  c0f1e68... Add -mcpu=future tuning support. (*)
  2640deb... Add support for -mcpu=future (*)
  a4d8a02... Change TARGET_MODULO to TARGET_POWER9. (*)
  9868814... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  1ab6162... Change TARGET_CMPB to TARGET_POWER6. (*)
  432624d... Change TARGET_FPRND to TARGET_POWER5X. (*)
  034a3c0... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work194-sha' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work194-sha)] Merge commit 'refs/users/meissner/heads/work194-sha' of git+ssh://gcc.gnu.org/git/gcc into me/work19

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:636f49bdc3c07336b3b08a1cd02446a030b0a321

commit 636f49bdc3c07336b3b08a1cd02446a030b0a321
Merge: a2a1e0fe9afe 33235ddf7a38
Author: Michael Meissner 
Date:   Fri Feb 21 18:46:26 2025 -0500

Merge commit 'refs/users/meissner/heads/work194-sha' of 
git+ssh://gcc.gnu.org/git/gcc into me/work194-sha

Diff:


[gcc/meissner/heads/work194-test] (15 commits) Merge commit 'refs/users/meissner/heads/work194-test' of gi

2025-02-21 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work194-test' was updated to point to:

 514296281497... Merge commit 'refs/users/meissner/heads/work194-test' of gi

It previously pointed to:

 8f61c9712e31... Add ChangeLog.test and update REVISION.

Diff:

Summary of changes (added commits):
---

  5142962... Merge commit 'refs/users/meissner/heads/work194-test' of gi
  d01a541... Add ChangeLog.test and update REVISION.
  5cf3c92... Update ChangeLog.* (*)
  b3201bd... Use architecture flags for defining _ARCH_PWR macros. (*)
  0c22347... Add rs6000 architecture masks. (*)
  4c5efbe... Do not allow -mvsx to boost processor to power7. (*)
  0f635f6... Use vector pair load/store for memcpy with -mcpu=future (*)
  e4b02d1... Add -mcpu=future tests. (*)
  c0f1e68... Add -mcpu=future tuning support. (*)
  2640deb... Add support for -mcpu=future (*)
  a4d8a02... Change TARGET_MODULO to TARGET_POWER9. (*)
  9868814... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  1ab6162... Change TARGET_CMPB to TARGET_POWER6. (*)
  432624d... Change TARGET_FPRND to TARGET_POWER5X. (*)
  034a3c0... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work194-test' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work194-test)] Add ChangeLog.test and update REVISION.

2025-02-21 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:d01a5419e35ac775a640e076668414888e1cc83e

commit d01a5419e35ac775a640e076668414888e1cc83e
Author: Michael Meissner 
Date:   Fri Feb 21 16:08:32 2025 -0500

Add ChangeLog.test and update REVISION.

2025-02-21  Michael Meissner  

gcc/

* ChangeLog.test: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.test | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
new file mode 100644
index ..a30ca769019a
--- /dev/null
+++ b/gcc/ChangeLog.test
@@ -0,0 +1,5 @@
+ Branch work194-test, baseline 
+
+2025-02-21   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 05cf4553f33f..999de8ff4f47 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work194 branch
+work194-test branch