[gcc r15-9175] testsuite: Remove guality xfails for aarch64*-*-*

2025-04-03 Thread Christophe Lyon via Gcc-cvs
https://gcc.gnu.org/g:b1b786580b9bddadcb179c84265655e2d2405b55

commit r15-9175-gb1b786580b9bddadcb179c84265655e2d2405b55
Author: Christophe Lyon 
Date:   Wed Apr 2 17:00:17 2025 +

testsuite: Remove guality xfails for aarch64*-*-*

Since r15-7878-ge1c49f413c8, these tests appear as XPASS on aarch64,
so we can remove the xfails introduced by r12-102-gf31ddad8ac8f11.

gcc/testsuite/ChangeLog:

* gcc.dg/guality/pr90074.c: Remove xfail for aarch64.
* gcc.dg/guality/pr90716.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.dg/guality/pr90074.c | 4 ++--
 gcc/testsuite/gcc.dg/guality/pr90716.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/guality/pr90074.c b/gcc/testsuite/gcc.dg/guality/pr90074.c
index 2fd884209f26..129492825162 100644
--- a/gcc/testsuite/gcc.dg/guality/pr90074.c
+++ b/gcc/testsuite/gcc.dg/guality/pr90074.c
@@ -25,7 +25,7 @@ int main()
  debug stmt for the final value of the loop during loop distribution
  which would fix the UNSUPPORTED cases.
  c is optimized out at -Og for no obvious reason.  */
-  optimize_me_not(); /* { dg-final { gdb-test . "i + 1" "8" { xfail { aarch64*-*-* && { any-opts "-fno-fat-lto-objects" } } } } } */
-/* { dg-final { gdb-test .-1 "c + 1" "2" { xfail { aarch64*-*-* && { any-opts "-fno-fat-lto-objects" } } } } } */
+  optimize_me_not(); /* { dg-final { gdb-test . "i + 1" "8" } } */
+/* { dg-final { gdb-test .-1 "c + 1" "2" } } */
   return 0;
 }
diff --git a/gcc/testsuite/gcc.dg/guality/pr90716.c b/gcc/testsuite/gcc.dg/guality/pr90716.c
index fe7e5567625c..b2f5c9d146e0 100644
--- a/gcc/testsuite/gcc.dg/guality/pr90716.c
+++ b/gcc/testsuite/gcc.dg/guality/pr90716.c
@@ -20,6 +20,6 @@ int main()
  Instead test j + 1 which will make the test UNSUPPORTED if i
  is optimized out.  Since the test previously had wrong debug
  with j == 0 this is acceptable.  */
-  optimize_me_not(); /* { dg-final { gdb-test . "j + 1" "9" { xfail { aarch64*-*-* && { any-opts "-fno-fat-lto-objects" } } } } } */
+  optimize_me_not(); /* { dg-final { gdb-test . "j + 1" "9" } } */
   return 0;
 }


[gcc r15-9174] testsuite: i386: Fix gcc.target/i386/pr82142?.c etc. on Solaris/x86

2025-04-03 Thread Rainer Orth via Gcc-cvs
https://gcc.gnu.org/g:3f0d34f33d081680c2dca1c8792a9cd57a19148a

commit r15-9174-g3f0d34f33d081680c2dca1c8792a9cd57a19148a
Author: Rainer Orth 
Date:   Thu Apr 3 08:52:47 2025 +0200

testsuite: i386: Fix gcc.target/i386/pr82142?.c etc. on Solaris/x86

Three tests FAIL on Solaris/x86 in similar ways:

FAIL: gcc.target/i386/pr111673.c check-function-bodies advance
FAIL: gcc.target/i386/pr82142a.c check-function-bodies assignzero
FAIL: gcc.target/i386/pr82142b.c check-function-bodies assignzero

All tests FAIL as is because they lack either or both of the .LFB0 label
and the .cfi_startproc directive:

* The 32-bit pr82142b.c test lacks both, whether as or gas is in use: as
  lacks full support for the cfi directives and the .LFB0 label is only
  emitted with -fasynchronous-unwind-tables.

* The 64-bit tests pr111673.c and pr82142a.c already work with gas, but
  with as the cfi directives are again missing.

In addition, the 32-bit test (pr82142b.c) still FAILs because 32-bit
Solaris/x86 defaults to -mstackrealign.

To fix all this, this patch adds -fasynchronous-unwind-tables
-fdwarf2-cfi-asm to all tests to force the generation of both the .LFB0
label and .cfi_startproc (which is ok since they are compile tests).  In
addition, pr82142b.c is compiled with -mno-stackrealign to avoid
platform differences.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

2025-03-25  Rainer Orth  

gcc/testsuite:
* gcc.target/i386/pr111673.c (dg-options): Add
-fasynchronous-unwind-tables -fdwarf2-cfi-asm.
* gcc.target/i386/pr82142a.c: Likewise.
* gcc.target/i386/pr82142b.c (dg-options): Add -mno-stackrealign
-fasynchronous-unwind-tables -fdwarf2-cfi-asm.

Diff:
---
 gcc/testsuite/gcc.target/i386/pr111673.c | 2 +-
 gcc/testsuite/gcc.target/i386/pr82142a.c | 2 +-
 gcc/testsuite/gcc.target/i386/pr82142b.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr111673.c b/gcc/testsuite/gcc.target/i386/pr111673.c
index b9ceacf76512..0f08ba89ebda 100644
--- a/gcc/testsuite/gcc.target/i386/pr111673.c
+++ b/gcc/testsuite/gcc.target/i386/pr111673.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O2 -fdump-rtl-pro_and_epilogue" } */
+/* { dg-options "-O2 -fdump-rtl-pro_and_epilogue -fasynchronous-unwind-tables -fdwarf2-cfi-asm" } */
 /* Keep labels and directives ('.cfi_startproc', '.cfi_endproc').  */
 /* { dg-final { check-function-bodies "**" "" "" { target "*-*-*" } {^\t?\.}  
} } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr82142a.c b/gcc/testsuite/gcc.target/i386/pr82142a.c
index a40c038452cf..a536150267c5 100644
--- a/gcc/testsuite/gcc.target/i386/pr82142a.c
+++ b/gcc/testsuite/gcc.target/i386/pr82142a.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O2 -mno-avx -msse2" } */
+/* { dg-options "-O2 -mno-avx -msse2 -fasynchronous-unwind-tables -fdwarf2-cfi-asm" } */
 /* Keep labels and directives ('.cfi_startproc', '.cfi_endproc').  */
 /* { dg-final { check-function-bodies "**" "" "" { target "*-*-*" } {^\t?\.}  
} } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr82142b.c b/gcc/testsuite/gcc.target/i386/pr82142b.c
index b1bf12d9a5b2..d18b7c466513 100644
--- a/gcc/testsuite/gcc.target/i386/pr82142b.c
+++ b/gcc/testsuite/gcc.target/i386/pr82142b.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target ia32 } } */
-/* { dg-options "-O2 -mno-avx -msse2" } */
+/* { dg-options "-O2 -mno-avx -msse2 -mno-stackrealign -fasynchronous-unwind-tables -fdwarf2-cfi-asm" } */
 /* Keep labels and directives ('.cfi_startproc', '.cfi_endproc').  */
 /* { dg-final { check-function-bodies "**" "" "" { target "*-*-*" } {^\t?\.}  
} } */


[gcc(refs/users/meissner/heads/work199)] Change TARGET_CMPB to TARGET_POWER6.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:b9b3a52aeb991d80b7789da61643c1616af56af0

commit b9b3a52aeb991d80b7789da61643c1616af56af0
Author: Michael Meissner 
Date:   Thu Apr 3 16:00:08 2025 -0400

Change TARGET_CMPB to TARGET_POWER6.

This patch changes TARGET_CMPB to TARGET_POWER6.  The -mcmpb switch is not
being changed, just the name of the macros used to determine if the PowerPC
processor supports ISA 2.5 (Power6).
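
As a rough illustration of the rename (a sketch, not the GCC sources): the
processor-named macro is only an alias for the existing feature test, so call
sites read in terms of the processor while -mcmpb keeps controlling the same
bit.  The flag word, the bit values, and the helper function names below are
hypothetical stand-ins for rs6000_isa_flags and the real OPTION_MASK_* bits.

```c
/* Hypothetical stand-ins for rs6000_isa_flags and its OPTION_MASK_* bits.  */
#define MASK_CMPB      (1u << 0)   /* feature bit toggled by -mcmpb */
#define MASK_POWERPC64 (1u << 1)   /* 64-bit instruction support */

static unsigned int isa_flags;

/* The feature tests keep their meaning.  */
#define TARGET_CMPB      ((isa_flags & MASK_CMPB) != 0)
#define TARGET_POWERPC64 ((isa_flags & MASK_POWERPC64) != 0)

/* Processor-named alias: same test, clearer name at the call sites.  */
#define TARGET_POWER6    TARGET_CMPB

/* Hypothetical helpers mirroring the ENB_P6 / ENB_P6_64 cases.  */
void set_isa_flags (unsigned int flags) { isa_flags = flags; }
int p6_builtin_supported (void) { return TARGET_POWER6; }
int p6_64_builtin_supported (void) { return TARGET_POWER6 && TARGET_POWERPC64; }
```

Because the alias expands to the old test, behaviour should be unchanged by
the rename; only the spelling at the call sites differs.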

2025-04-03  Michael Meissner  

gcc/

* gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported):
Change TARGET_CMPB to TARGET_POWER6.
* gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal):
Likewise.
(rs6000_rtx_costs): Likewise.
(rs6000_emit_parity): Likewise.
* gcc/config/rs6000/rs6000.h (TARGET_FCFID): Likewise.
(TARGET_LFIWAX): Likewise.
(TARGET_POWER6): New macro.
(TARGET_EXTRA_BUILTINS): Change TARGET_CMPB to TARGET_POWER6.
* gcc/config/rs6000/rs6000.md (enabled attribute): Likewise.
(parity2_cmp): Likewise.
(cmpb3): Likewise.
(copysign3): Likewise.
(copysign3_fcpsgn): Likewise.
(cmpstrnsi): Likewise.
(cmpstrsi): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc |  4 ++--
 gcc/config/rs6000/rs6000.cc |  8 
 gcc/config/rs6000/rs6000.h  |  7 ---
 gcc/config/rs6000/rs6000.md | 16 
 4 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 4ed2bc1ca89e..dbb8520ab039 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -157,9 +157,9 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins fncode)
 case ENB_P5:
   return TARGET_POWER5;
 case ENB_P6:
-  return TARGET_CMPB;
+  return TARGET_POWER6;
 case ENB_P6_64:
-  return TARGET_CMPB && TARGET_POWERPC64;
+  return TARGET_POWER6 && TARGET_POWERPC64;
 case ENB_P7:
   return TARGET_POPCNTD;
 case ENB_P7_64:
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index b2811d963fcf..c01af37200ac 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3922,7 +3922,7 @@ rs6000_option_override_internal (bool global_init_p)
 rs6000_isa_flags |= (ISA_2_6_MASKS_EMBEDDED & ~ignore_masks);
   else if (TARGET_DFP)
 rs6000_isa_flags |= (ISA_2_5_MASKS_SERVER & ~ignore_masks);
-  else if (TARGET_CMPB)
+  else if (TARGET_POWER6)
 rs6000_isa_flags |= (ISA_2_5_MASKS_EMBEDDED & ~ignore_masks);
   else if (TARGET_POWER5X)
 rs6000_isa_flags |= (ISA_2_4_MASKS & ~ignore_masks);
@@ -4797,7 +4797,7 @@ rs6000_option_override_internal (bool global_init_p)
  DERAT mispredict penalty.  However the LVE and STVE altivec instructions
  need indexed accesses and the type used is the scalar type of the element
  being loaded or stored.  */
-TARGET_AVOID_XFORM = (rs6000_tune == PROCESSOR_POWER6 && TARGET_CMPB
+TARGET_AVOID_XFORM = (rs6000_tune == PROCESSOR_POWER6 && TARGET_POWER6
  && !TARGET_ALTIVEC);
 
   /* Set the -mrecip options.  */
@@ -22396,7 +22396,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
   return false;
 
 case PARITY:
-  *total = COSTS_N_INSNS (TARGET_CMPB ? 2 : 6);
+  *total = COSTS_N_INSNS (TARGET_POWER6 ? 2 : 6);
   return false;
 
 case NOT:
@@ -23223,7 +23223,7 @@ rs6000_emit_parity (rtx dst, rtx src)
   tmp = gen_reg_rtx (mode);
 
   /* Use the PPC ISA 2.05 prtyw/prtyd instruction if we can.  */
-  if (TARGET_CMPB)
+  if (TARGET_POWER6)
 {
   if (mode == SImode)
{
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 3794e3c0658d..5b8cf054f98a 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -449,12 +449,12 @@ extern int rs6000_vector_align[];
 #define TARGET_FCFID   (TARGET_POWERPC64   \
 || TARGET_PPC_GPOPT/* 970/power4 */\
 || TARGET_POWER5   /* ISA 2.02 */  \
-|| TARGET_CMPB /* ISA 2.05 */  \
+|| TARGET_POWER6   /* ISA 2.05 */  \
 || TARGET_POPCNTD) /* ISA 2.06 */
 
 #define TARGET_FCTIDZ  TARGET_FCFID
 #define TARGET_STFIWX  TARGET_PPC_GFXOPT
-#define TARGET_LFIWAX  TARGET_CMPB
+#define TARGET_LFIWAX  TARGET_POWER6
 #define TARGET_LFIWZX  TARGET_POPCNTD
 #define TARGET_FCFIDS  TARGET_POPCNTD
 #define TARGET_FCFIDU  TARGET_POPCNTD
@@ -502,6 +502,7 @@ extern int rs6000_vector_align[];
 /* Convert ISA bits like POPCNTB to PowerPC processors like POWER5.  */
 #define TARGET_POWER5  TARGET_POPCNTB
 #define TARGET_POWER5X TARGET_FPRND
+#define TARGET_POWER6 

[gcc(refs/users/meissner/heads/work199)] Change TARGET_MODULO to TARGET_POWER9.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:04725bf3e0ba584a1df67bba4f49f281168ab3cf

commit 04725bf3e0ba584a1df67bba4f49f281168ab3cf
Author: Michael Meissner 
Date:   Thu Apr 3 16:04:04 2025 -0400

Change TARGET_MODULO to TARGET_POWER9.

This patch changes TARGET_MODULO to TARGET_POWER9.  The -mmodulo switch is not
being changed, just the name of the macros used to determine if the PowerPC
processor supports ISA 3.0 (Power9).
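
For context on why the mod patterns and rs6000_rtx_costs test this macro: when
no hardware modulo instruction is available, a remainder is typically derived
from a division followed by a multiply and a subtract, which is where the
extra instruction cost comes from.  The following is a minimal sketch of that
fallback in plain C, not GCC's actual expansion code; the function names are
hypothetical, and the divisor is assumed to be nonzero (and not -1 with the
most negative dividend).

```c
/* Signed remainder without a modulo instruction:
   one divide, one multiply, one subtract.  */
long
mod_via_div (long a, long b)
{
  long q = a / b;          /* truncating division, as C's '/' does */
  return a - q * b;        /* same result as a % b */
}

/* Unsigned variant of the same identity.  */
unsigned long
umod_via_div (unsigned long a, unsigned long b)
{
  unsigned long q = a / b;
  return a - q * b;
}
```

With a modulo instruction the whole sequence collapses to a single
instruction, so the extra cost is not charged.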

2025-04-03  Michael Meissner  

gcc/

* gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported):
Change TARGET_MODULO to TARGET_POWER9.
* gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal):
Likewise.
* gcc/config/rs6000/rs6000.h (TARGET_CTZ): Likewise.
(TARGET_EXTSWSLI): Likewise.
(TARGET_MADDLD): Likewise.
(TARGET_POWER9): New macro.
* gcc/config/rs6000/rs6000.md (enabled attribute): Change TARGET_MODULO
to TARGET_POWER9.
(mod3): Likewise.
(umod3): Likewise.
(divide/modulo peephole2): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc |  4 ++--
 gcc/config/rs6000/rs6000.cc |  4 ++--
 gcc/config/rs6000/rs6000.h  |  7 ---
 gcc/config/rs6000/rs6000.md | 14 +++---
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 2366b2aee00a..d8ff7cf32dfd 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -169,9 +169,9 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins fncode)
 case ENB_P8V:
   return TARGET_P8_VECTOR;
 case ENB_P9:
-  return TARGET_MODULO;
+  return TARGET_POWER9;
 case ENB_P9_64:
-  return TARGET_MODULO && TARGET_POWERPC64;
+  return TARGET_POWER9 && TARGET_POWERPC64;
 case ENB_P9V:
   return TARGET_P9_VECTOR;
 case ENB_P10:
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 503b07339647..8d97b265ac91 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3888,7 +3888,7 @@ rs6000_option_override_internal (bool global_init_p)
 
   /* For the newer switches (vsx, dfp, etc.) set some of the older options,
  unless the user explicitly used the -mno- to disable the code.  */
-  if (TARGET_P9_VECTOR || TARGET_MODULO || TARGET_P9_MISC)
+  if (TARGET_P9_VECTOR || TARGET_POWER9 || TARGET_P9_MISC)
 rs6000_isa_flags |= (ISA_3_0_MASKS_SERVER & ~ignore_masks);
   else if (TARGET_P9_MINMAX)
 {
@@ -22377,7 +22377,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
*total = rs6000_cost->divsi;
}
   /* Add in shift and subtract for MOD unless we have a mod instruction. */
-  if ((!TARGET_MODULO
+  if ((!TARGET_POWER9
   || (RS6000_DISABLE_SCALAR_MODULO && SCALAR_INT_MODE_P (mode)))
 && (code == MOD || code == UMOD))
*total += COSTS_N_INSNS (2);
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index f1da5d31441a..c2f1910b0ea2 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -463,9 +463,9 @@ extern int rs6000_vector_align[];
 #define TARGET_FCTIWUZ TARGET_POWER7
 /* Only powerpc64 and powerpc476 support fctid.  */
 #define TARGET_FCTID   (TARGET_POWERPC64 || rs6000_cpu == PROCESSOR_PPC476)
-#define TARGET_CTZ TARGET_MODULO
-#define TARGET_EXTSWSLI(TARGET_MODULO && TARGET_POWERPC64)
-#define TARGET_MADDLD  TARGET_MODULO
+#define TARGET_CTZ TARGET_POWER9
+#define TARGET_EXTSWSLI(TARGET_POWER9 && TARGET_POWERPC64)
+#define TARGET_MADDLD  TARGET_POWER9
 
 /* TARGET_DIRECT_MOVE is redundant to TARGET_P8_VECTOR, so alias it to that.  
*/
 #define TARGET_DIRECT_MOVE TARGET_P8_VECTOR
@@ -504,6 +504,7 @@ extern int rs6000_vector_align[];
 #define TARGET_POWER5X TARGET_FPRND
 #define TARGET_POWER6  TARGET_CMPB
 #define TARGET_POWER7  TARGET_POPCNTD
+#define TARGET_POWER9  TARGET_MODULO
 
 /* In switching from using target_flags to using rs6000_isa_flags, the options
machinery creates OPTION_MASK_ instead of MASK_.  The MASK_
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 87ec37a9f8e4..db1b6c2d1164 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -403,7 +403,7 @@
  (const_int 1)
 
  (and (eq_attr "isa" "p9")
- (match_test "TARGET_MODULO"))
+ (match_test "TARGET_POWER9"))
  (const_int 1)
 
  (and (eq_attr "isa" "p9v")
@@ -3457,7 +3457,7 @@
   || INTVAL (operands[2]) <= 0
   || (i = exact_log2 (INTVAL (operands[2]))) < 0)
 {
-  if (!TARGET_MODULO)
+  if (!TARGET_POWER9)
FAIL;
 
   operands[2] = force_reg (mode, operands[2]);
@@ -3491,7 +3491,7 @@
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r,r")
 (mod:GPR (match_opera

[gcc(refs/users/meissner/heads/work199)] Use vector pair load/store for memcpy with -mcpu=future

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:9dacc6815c46e12a7dba514d9b730b2ec701cb76

commit 9dacc6815c46e12a7dba514d9b730b2ec701cb76
Author: Michael Meissner 
Date:   Thu Apr 3 16:11:04 2025 -0400

Use vector pair load/store for memcpy with -mcpu=future

In the development for the power10 processor, GCC did not enable using the load
vector pair and store vector pair instructions when optimizing things like
memory copy.  This patch enables using those instructions if -mcpu=future is
used.

2025-04-03  Michael Meissner  

gcc/

* config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): Enable using
load vector pair and store vector pair instructions for memory copy
operations.
(POWERPC_MASKS): Make the bit for enabling using load vector pair and
store vector pair operations set and reset when the PowerPC processor is
changed.
* gcc/config/rs6000/rs6000.cc (rs6000_machine_from_flags): Disable
-mblock-ops-vector-pair from influencing .machine selection.

gcc/testsuite/

* gcc.target/powerpc/future-3.c: New test.
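
To see why rs6000_machine_from_flags has to mask out the new bit, consider a
simplified model (a sketch under stated assumptions, not the GCC code): once
the block-ops bit is part of the future mask set, a power11 compile that
merely passes -mblock-ops-vector-pair would otherwise look like "future" when
the .machine name is chosen.  The bit values, the reduced mask sets, and the
function name below are all hypothetical.

```c
/* Hypothetical bit values; the real ones are OPTION_MASK_* in GCC.  */
#define OPT_POWER11                (1u << 0)
#define OPT_FUTURE                 (1u << 1)
#define OPT_BLOCK_OPS_VECTOR_PAIR  (1u << 2)

/* Reduced mask sets modelled on the patch.  */
#define POWER11_MASKS  (OPT_POWER11)
#define FUTURE_MASKS   (POWER11_MASKS | OPT_FUTURE | OPT_BLOCK_OPS_VECTOR_PAIR)

/* Pick the assembler .machine name from the enabled option bits.  */
const char *
machine_from_flags (unsigned int flags)
{
  /* Tuning-style bits must never influence the .machine selection.  */
  flags &= ~OPT_BLOCK_OPS_VECTOR_PAIR;

  /* Any remaining bit that only the future CPU sets selects "future".  */
  if ((flags & (FUTURE_MASKS & ~POWER11_MASKS)) != 0)
    return "future";
  return "power11";
}
```

In this model, dropping the masking line would make the flag combination
OPT_POWER11 | OPT_BLOCK_OPS_VECTOR_PAIR select "future".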

Diff:
---
 gcc/config/rs6000/rs6000-cpus.def   |  4 +++-
 gcc/config/rs6000/rs6000.cc |  2 +-
 gcc/testsuite/gcc.target/powerpc/future-3.c | 22 ++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 228d0b5e7b54..063591f5c094 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -84,7 +84,8 @@
  | OPTION_MASK_POWER11)
 
 #define FUTURE_MASKS_SERVER(POWER11_MASKS_SERVER   \
-| OPTION_MASK_FUTURE)
+| OPTION_MASK_FUTURE   \
+| OPTION_MASK_BLOCK_OPS_VECTOR_PAIR)
 
 /* Flags that need to be turned off if -mno-vsx.  */
 #define OTHER_VSX_VECTOR_MASKS (OPTION_MASK_EFFICIENT_UNALIGNED_VSX\
@@ -114,6 +115,7 @@
 
 /* Mask of all options to set the default isa flags based on -mcpu=.  */
 #define POWERPC_MASKS  (OPTION_MASK_ALTIVEC\
+| OPTION_MASK_BLOCK_OPS_VECTOR_PAIR\
 | OPTION_MASK_CMPB \
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_DFP  \
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 4cea1775f110..011f67d290e9 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -5908,7 +5908,7 @@ rs6000_machine_from_flags (void)
 
   /* Disable the flags that should never influence the .machine selection.  */
   flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | OPTION_MASK_ISEL
-| OPTION_MASK_ALTIVEC);
+| OPTION_MASK_ALTIVEC | OPTION_MASK_BLOCK_OPS_VECTOR_PAIR);
 
   if ((flags & (FUTURE_MASKS_SERVER & ~ISA_3_1_MASKS_SERVER)) != 0)
 return "future";
diff --git a/gcc/testsuite/gcc.target/powerpc/future-3.c b/gcc/testsuite/gcc.target/powerpc/future-3.c
new file mode 100644
index ..afa8b96d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/future-3.c
@@ -0,0 +1,22 @@
+/* 32-bit doesn't generate vector pair instructions.  */
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Test to see that memcpy will use load/store vector pair with
+   -mcpu=future.  */
+
+#ifndef SIZE
+#define SIZE 4
+#endif
+
+extern vector double to[SIZE], from[SIZE];
+
+void
+copy (void)
+{
+  __builtin_memcpy (to, from, sizeof (to));
+  return;
+}
+
+/* { dg-final { scan-assembler {\mlxvpx?\M}  } } */
+/* { dg-final { scan-assembler {\mstxvpx?\M} } } */


[gcc/meissner/heads/work199-bugs] (14 commits) Merge commit 'refs/users/meissner/heads/work199-bugs' of gi

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-bugs' was updated to point to:

 ca47ea36c5da... Merge commit 'refs/users/meissner/heads/work199-bugs' of gi

It previously pointed to:

 364e0f5e28f3... Add ChangeLog.bugs and update REVISION.

Diff:

Summary of changes (added commits):
---

  ca47ea3... Merge commit 'refs/users/meissner/heads/work199-bugs' of gi
  c774b02... Add ChangeLog.bugs and update REVISION.
  c62780a... Update ChangeLog.* (*)
  76c081b... Use architecture flags for defining _ARCH_PWR macros. (*)
  aa1860c... Add rs6000 architecture masks. (*)
  9dacc68... Use vector pair load/store for memcpy with -mcpu=future (*)
  c590949... Add -mcpu=future tests. (*)
  230fbe1... Add -mcpu=future tuning support. (*)
  c099046... Add support for -mcpu=future (*)
  04725bf... Change TARGET_MODULO to TARGET_POWER9. (*)
  08e5c71... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  b9b3a52... Change TARGET_CMPB to TARGET_POWER6. (*)
  d0dbdbf... Change TARGET_FPRND to TARGET_POWER5X. (*)
  31d7966... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work199-bugs' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work199)] Add ChangeLog.meissner and REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:34cf8c8901edfad0fe5e1c8282698fa8ba6628a7

commit 34cf8c8901edfad0fe5e1c8282698fa8ba6628a7
Author: Michael Meissner 
Date:   Thu Apr 3 15:22:00 2025 -0400

Add ChangeLog.meissner and REVISION.

2025-04-03  Michael Meissner  

gcc/

* REVISION: New file for branch.
* ChangeLog.meissner: New file.

gcc/c-family/

* ChangeLog.meissner: New file.

gcc/c/

* ChangeLog.meissner: New file.

gcc/cp/

* ChangeLog.meissner: New file.

gcc/fortran/

* ChangeLog.meissner: New file.

gcc/testsuite/

* ChangeLog.meissner: New file.

libgcc/

* ChangeLog.meissner: New file.

Diff:
---
 gcc/ChangeLog.meissner   | 5 +
 gcc/REVISION | 1 +
 gcc/c-family/ChangeLog.meissner  | 5 +
 gcc/c/ChangeLog.meissner | 5 +
 gcc/cp/ChangeLog.meissner| 5 +
 gcc/fortran/ChangeLog.meissner   | 5 +
 gcc/testsuite/ChangeLog.meissner | 5 +
 libgcc/ChangeLog.meissner| 5 +
 libstdc++-v3/ChangeLog.meissner  | 5 +
 9 files changed, 41 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
new file mode 100644
index ..95d48eacd3cd
--- /dev/null
+++ b/gcc/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work199, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
new file mode 100644
index ..113e419bda0d
--- /dev/null
+++ b/gcc/REVISION
@@ -0,0 +1 @@
+work199 branch
diff --git a/gcc/c-family/ChangeLog.meissner b/gcc/c-family/ChangeLog.meissner
new file mode 100644
index ..95d48eacd3cd
--- /dev/null
+++ b/gcc/c-family/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work199, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/c/ChangeLog.meissner b/gcc/c/ChangeLog.meissner
new file mode 100644
index ..95d48eacd3cd
--- /dev/null
+++ b/gcc/c/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work199, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/cp/ChangeLog.meissner b/gcc/cp/ChangeLog.meissner
new file mode 100644
index ..95d48eacd3cd
--- /dev/null
+++ b/gcc/cp/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work199, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/fortran/ChangeLog.meissner b/gcc/fortran/ChangeLog.meissner
new file mode 100644
index ..95d48eacd3cd
--- /dev/null
+++ b/gcc/fortran/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work199, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/testsuite/ChangeLog.meissner b/gcc/testsuite/ChangeLog.meissner
new file mode 100644
index ..95d48eacd3cd
--- /dev/null
+++ b/gcc/testsuite/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work199, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/libgcc/ChangeLog.meissner b/libgcc/ChangeLog.meissner
new file mode 100644
index ..95d48eacd3cd
--- /dev/null
+++ b/libgcc/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work199, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/libstdc++-v3/ChangeLog.meissner b/libstdc++-v3/ChangeLog.meissner
new file mode 100644
index ..95d48eacd3cd
--- /dev/null
+++ b/libstdc++-v3/ChangeLog.meissner
@@ -0,0 +1,5 @@
+ Branch work199, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch


[gcc] Created branch 'meissner/heads/work199-dmf' in namespace 'refs/users'

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-dmf' was created in namespace 'refs/users' pointing to:

 34cf8c8901ed... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work199-dmf)] Add ChangeLog.dmf and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:840d7578343b9f4fd4215c4f02ef9b5dbcc64c36

commit 840d7578343b9f4fd4215c4f02ef9b5dbcc64c36
Author: Michael Meissner 
Date:   Thu Apr 3 15:22:57 2025 -0400

Add ChangeLog.dmf and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.dmf: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.dmf | 5 +
 gcc/REVISION  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
new file mode 100644
index ..3a7dad677be5
--- /dev/null
+++ b/gcc/ChangeLog.dmf
@@ -0,0 +1,5 @@
+ Branch work199-dmf, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..da7d8a6c744e 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-dmf branch


[gcc] Created branch 'meissner/heads/work199-vpair' in namespace 'refs/users'

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-vpair' was created in namespace 'refs/users' pointing to:

 34cf8c8901ed... Add ChangeLog.meissner and REVISION.


[gcc] Created branch 'meissner/heads/work199' in namespace 'refs/users'

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199' was created in namespace 'refs/users' pointing to:

 c669ab0a8666... rs6000: Add Cobol support to traceback table [PR119308]


[gcc] Created branch 'meissner/heads/work199-bugs' in namespace 'refs/users'

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-bugs' was created in namespace 'refs/users' pointing to:

 34cf8c8901ed... Add ChangeLog.meissner and REVISION.


[gcc] Created branch 'meissner/heads/work199-test' in namespace 'refs/users'

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-test' was created in namespace 'refs/users' pointing to:

 34cf8c8901ed... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work199-libs)] Add ChangeLog.libs and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:f46e8c6a02f275240f238456682aff4d355ce2ff

commit f46e8c6a02f275240f238456682aff4d355ce2ff
Author: Michael Meissner 
Date:   Thu Apr 3 15:25:35 2025 -0400

Add ChangeLog.libs and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.libs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.libs | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.libs b/gcc/ChangeLog.libs
new file mode 100644
index ..9f21f063f041
--- /dev/null
+++ b/gcc/ChangeLog.libs
@@ -0,0 +1,5 @@
+ Branch work199-libs, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..b0294b6f088f 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-libs branch


[gcc] Created branch 'meissner/heads/work199-math' in namespace 'refs/users'

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-math' was created in namespace 'refs/users' pointing to:

 34cf8c8901ed... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work199)] Add -mcpu=future tests.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c590949e57e61822195b42e057b543900f404e12

commit c590949e57e61822195b42e057b543900f404e12
Author: Michael Meissner 
Date:   Thu Apr 3 16:09:58 2025 -0400

Add -mcpu=future tests.

This patch adds simple tests for -mcpu=future.

2025-04-03  Michael Meissner  

gcc/testsuite/

* gcc.target/powerpc/future-1.c: New test.
* gcc.target/powerpc/future-2.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/future-1.c | 13 +
 gcc/testsuite/gcc.target/powerpc/future-2.c | 24 
 2 files changed, 37 insertions(+)

diff --git a/gcc/testsuite/gcc.target/powerpc/future-1.c b/gcc/testsuite/gcc.target/powerpc/future-1.c
new file mode 100644
index ..f1b940d7bebf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/future-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Basic check to see if the compiler supports -mcpu=future and if it defines
+   _ARCH_PWR11.  */
+
+#ifndef _ARCH_FUTURE
+#error "-mcpu=future is not supported"
+#endif
+
+void foo (void)
+{
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/future-2.c b/gcc/testsuite/gcc.target/powerpc/future-2.c
new file mode 100644
index ..5552cefa3c2e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/future-2.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check if we can set the future target via a target attribute.  */
+
+__attribute__((__target__("cpu=power9")))
+void foo_p9 (void)
+{
+}
+
+__attribute__((__target__("cpu=power10")))
+void foo_p10 (void)
+{
+}
+
+__attribute__((__target__("cpu=power11")))
+void foo_p11 (void)
+{
+}
+
+__attribute__((__target__("cpu=future")))
+void foo_future (void)
+{
+}


[gcc(refs/users/meissner/heads/work199)] Change TARGET_FPRND to TARGET_POWER5X.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:d0dbdbfaafc667bb6c20bd2aebad709cba97fbb1

commit d0dbdbfaafc667bb6c20bd2aebad709cba97fbb1
Author: Michael Meissner 
Date:   Thu Apr 3 15:58:18 2025 -0400

Change TARGET_FPRND to TARGET_POWER5X.

This patch changes TARGET_FPRND to TARGET_POWER5X.  The -mfprnd switch is not
being changed, just the name of the macros used to determine if the PowerPC
processor supports ISA 2.4 (Power5x).

2025-04-03  Michael Meissner  

gcc/

* gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal):
Change TARGET_FPRND to TARGET_POWER5X.
* gcc/config/rs6000/rs6000.h (TARGET_POWER5X): New macro.
* gcc/config/rs6000/rs6000.md (fmod3): Change TARGET_FPRND to
TARGET_POWER5X.
(remainder3): Likewise.
(fctiwuz_): Likewise.
(ceil2): Likewise.
(floor2): Likewise.
(round2): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000.cc |  4 ++--
 gcc/config/rs6000/rs6000.h  |  1 +
 gcc/config/rs6000/rs6000.md | 14 +++---
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index a5ed93702494..b2811d963fcf 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3924,7 +3924,7 @@ rs6000_option_override_internal (bool global_init_p)
 rs6000_isa_flags |= (ISA_2_5_MASKS_SERVER & ~ignore_masks);
   else if (TARGET_CMPB)
 rs6000_isa_flags |= (ISA_2_5_MASKS_EMBEDDED & ~ignore_masks);
-  else if (TARGET_FPRND)
+  else if (TARGET_POWER5X)
 rs6000_isa_flags |= (ISA_2_4_MASKS & ~ignore_masks);
   else if (TARGET_POWER5)
 rs6000_isa_flags |= (ISA_2_2_MASKS & ~ignore_masks);
@@ -3951,7 +3951,7 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_CRYPTO;
 }
 
-  if (!TARGET_FPRND && TARGET_VSX)
+  if (!TARGET_POWER5X && TARGET_VSX)
 {
   if (rs6000_isa_flags_explicit & OPTION_MASK_FPRND)
/* TARGET_VSX = 1 implies Power 7 and newer */
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index d9a0ffe9f5b2..3794e3c0658d 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -501,6 +501,7 @@ extern int rs6000_vector_align[];
 
 /* Convert ISA bits like POPCNTB to PowerPC processors like POWER5.  */
 #define TARGET_POWER5  TARGET_POPCNTB
+#define TARGET_POWER5X TARGET_FPRND
 
 /* In switching from using target_flags to using rs6000_isa_flags, the options
machinery creates OPTION_MASK_ instead of MASK_.  The MASK_
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c5bd273be8b3..045ce22a03c8 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5171,7 +5171,7 @@
(use (match_operand:SFDF 1 "gpc_reg_operand"))
(use (match_operand:SFDF 2 "gpc_reg_operand"))]
   "TARGET_HARD_FLOAT
-   && TARGET_FPRND
+   && TARGET_POWER5X
&& flag_unsafe_math_optimizations"
 {
   rtx div = gen_reg_rtx (mode);
@@ -5189,7 +5189,7 @@
(use (match_operand:SFDF 1 "gpc_reg_operand"))
(use (match_operand:SFDF 2 "gpc_reg_operand"))]
   "TARGET_HARD_FLOAT
-   && TARGET_FPRND
+   && TARGET_POWER5X
&& flag_unsafe_math_optimizations"
 {
   rtx div = gen_reg_rtx (mode);
@@ -6689,7 +6689,7 @@
 (define_insn "*friz"
   [(set (match_operand:DF 0 "gpc_reg_operand" "=d,wa")
(float:DF (fix:DI (match_operand:DF 1 "gpc_reg_operand" "d,wa"]
-  "TARGET_HARD_FLOAT && TARGET_FPRND
+  "TARGET_HARD_FLOAT && TARGET_POWER5X
&& flag_unsafe_math_optimizations && !flag_trapping_math && TARGET_FRIZ"
   "@
friz %0,%1
@@ -6817,7 +6817,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "d,wa")]
 UNSPEC_FRIZ))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "@
friz %0,%1
xsrdpiz %x0,%x1"
@@ -6827,7 +6827,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "d,wa")]
 UNSPEC_FRIP))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "@
frip %0,%1
xsrdpip %x0,%x1"
@@ -6837,7 +6837,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "d,wa")]
 UNSPEC_FRIM))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "@
frim %0,%1
xsrdpim %x0,%x1"
@@ -6848,7 +6848,7 @@
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "")]
 UNSPEC_FRIN))]
-  "TARGET_HARD_FLOAT && TARGET_FPRND"
+  "TARGET_HARD_FLOAT && TARGET_POWER5X"
   "frin %0,%1"
   [(set_attr "type" "fp")])


[gcc r15-9189] c++: P2280R4 and speculative constexpr folding [PR119387]

2025-04-03 Thread Patrick Palka via Gcc-cvs
https://gcc.gnu.org/g:a926345f22b500a2620adb83e6821e01fb8cc8fd

commit r15-9189-ga926345f22b500a2620adb83e6821e01fb8cc8fd
Author: Patrick Palka 
Date:   Thu Apr 3 16:33:46 2025 -0400

c++: P2280R4 and speculative constexpr folding [PR119387]

Compiling the testcase in this PR uses 2.5x more memory and 6x more
time ever since r14-5979 which implements P2280R4.  This is because
our speculative constexpr folding now does a lot more work trying to
fold ultimately non-constant calls to constexpr functions, and in turn
produces a lot of garbage.  We do sometimes successfully fold more
thanks to P2280R4, but it seems to be trivial stuff like calls to
std::array::size or std::addressof.  The benefit of P2280 therefore
doesn't seem worth the cost during speculative constexpr folding, so
this patch restricts the paper to only manifestly-constant evaluation.
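
The gating rule described above can be sketched in isolation.  This is only
an illustrative model, not the code in constexpr.cc: the names
ManifestlyConstEval, EvalContext and p2280_relaxations_active, and the
encoding of the dialect as a year, are hypothetical stand-ins.

```cpp
#include <cassert>

// Hypothetical stand-in for the compiler's "is this evaluation manifestly
// constant?" tri-state; not the real GCC declaration.
enum class ManifestlyConstEval { Unknown, False, True };

// Hypothetical stand-in for the evaluation context.
struct EvalContext {
  ManifestlyConstEval manifestly_const_eval;
};

// Assumption of this sketch: the language dialect is encoded as the year of
// the standard (2011 for C++11, 2017 for C++17, ...).
constexpr bool p2280_relaxations_active(const EvalContext &ctx,
                                        int dialect_year) {
  // Speculative folding (anything not known to be manifestly constant)
  // never gets the relaxations, which avoids the extra work and garbage.
  return ctx.manifestly_const_eval == ManifestlyConstEval::True
         && dialect_year >= 2011;
}

// A few spot checks of the rule.
inline void check_p2280_gate() {
  assert(p2280_relaxations_active({ManifestlyConstEval::True}, 2017));
  assert(!p2280_relaxations_active({ManifestlyConstEval::Unknown}, 2017));
  assert(!p2280_relaxations_active({ManifestlyConstEval::False}, 2020));
}
```

In this model only a context marked True, i.e. one known to require a
constant, keeps the relaxed treatment of unknown references.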

PR c++/119387

gcc/cp/ChangeLog:

* constexpr.cc (p2280_active_p): New.
(cxx_eval_constant_expression) : Use it to
restrict P2280 relaxations.
: Likewise.

Reviewed-by: Jason Merrill 

Diff:
---
 gcc/cp/constexpr.cc | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 4820bcc84aa3..9a57f4865e0f 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -1294,6 +1294,22 @@ struct constexpr_ctx {
   mce_value manifestly_const_eval;
 };
 
+/* True if the constexpr relaxations afforded by P2280R4 for unknown
+   references and objects are in effect.  */
+
+static bool
+p2280_active_p (const constexpr_ctx *ctx)
+{
+  if (ctx->manifestly_const_eval != mce_true)
+/* Disable these relaxations during speculative constexpr folding,
+   as it can significantly increase compile time/memory use
+   (PR119387).  */
+return false;
+
+  /* P2280R4 was accepted as a DR against C++11.  */
+  return cxx_dialect >= cxx11;
+}
+
 /* Remove T from the global values map, checking for attempts to destroy
a value that has already finished its lifetime.  */
 
@@ -7792,7 +7808,7 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
r = TARGET_EXPR_INITIAL (r);
   if (DECL_P (r)
  /* P2280 allows references to unknown.  */
- && !(VAR_P (t) && TYPE_REF_P (TREE_TYPE (t))))
+ && !(p2280_active_p (ctx) && VAR_P (t) && TYPE_REF_P (TREE_TYPE (t))))
{
  if (!ctx->quiet)
non_const_var_error (loc, r, /*fundef_p*/false);
@@ -7844,9 +7860,9 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
  r = build_constructor (TREE_TYPE (t), NULL);
  TREE_CONSTANT (r) = true;
}
-  else if (TYPE_REF_P (TREE_TYPE (t)))
+  else if (p2280_active_p (ctx) && TYPE_REF_P (TREE_TYPE (t)))
/* P2280 allows references to unknown...  */;
-  else if (is_this_parameter (t))
+  else if (p2280_active_p (ctx) && is_this_parameter (t))
/* ...as well as the this pointer.  */;
   else
{


[gcc(refs/users/meissner/heads/work199-paddis)] RFC2656-Support load/store vector with right length.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:964b3657f38627ca9f4e40a0992292fdda8bd788

commit 964b3657f38627ca9f4e40a0992292fdda8bd788
Author: Michael Meissner 
Date:   Thu Apr 3 16:51:10 2025 -0400

RFC2656-Support load/store vector with right length.

This patch adds support for new instructions that may be added to the PowerPC
architecture in the future to enhance the load and store vector with length
instructions.

The current instructions (lxvl, lxvll, stxvl, and stxvll) are inconvenient to use
since the count for the number of bytes must be in the top 8 bits of the GPR
register, instead of the bottom 8 bits.  This meant that code generating these
instructions typically had to do a shift left by 56 bits to get the count into
the right position.  In a future version of the PowerPC architecture, new
variants of these instructions might be added that expect the count to be in
the bottom 8 bits of the GPR register.  These patches add this support to GCC
if the user uses the -mcpu=future option.
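
As a rough illustration of the two length conventions described above, the
following sketch models only how the byte count is placed in a 64-bit
register value.  The function names are made up for this example and it does
not emit or emulate the instructions themselves.

```cpp
#include <cassert>
#include <cstdint>

// Existing lxvl/stxvl convention: the byte count sits in the top 8 bits of
// the GPR, so the caller has to shift it left by 56 bits first.
constexpr std::uint64_t length_in_top_byte(std::uint64_t nbytes) {
  return (nbytes & 0xff) << 56;
}

// Possible future convention: the count is read from the bottom 8 bits, so
// the value can be used as-is.
constexpr std::uint64_t length_in_bottom_byte(std::uint64_t nbytes) {
  return nbytes & 0xff;
}

// Recover the count from a register that uses the top-byte convention.
constexpr unsigned decode_top_byte(std::uint64_t gpr) {
  return static_cast<unsigned>(gpr >> 56);
}

inline void check_length_encoding() {
  // 16 bytes (one full vector) encoded both ways.
  assert(length_in_top_byte(16) == UINT64_C(0x1000000000000000));
  assert(length_in_bottom_byte(16) == 16);
  assert(decode_top_byte(length_in_top_byte(16)) == 16);
}
```

The extra shift in the first function is the step that the new variants
would make unnecessary.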

I discovered that the code in rs6000-string.cc to generate ISA 3.1 lxvl/stxvl
future lxvll/stxvll instructions would generate these instructions on 32-bit.
However, the patterns for these instructions are only available on 64-bit
systems.  So I added a check for 64-bit support before generating the
instructions.

The patches have been tested on both little and big endian systems.  Can I check
it into the master branch?

2025-04-03   Michael Meissner  

gcc/

* config/rs6000/rs6000-string.cc (expand_block_move): Do not generate
lxvl and stxvl on 32-bit.
* config/rs6000/vsx.md (lxvl): If -mcpu=future, generate the lxvl with
the shift count automatically used in the insn.
(lxvrl): New insn for -mcpu=future.
(lxvrll): Likewise.
(stxvl): If -mcpu=future, generate the stxvl with the shift count
automatically used in the insn.
(stxvrl): New insn for -mcpu=future.
(stxvrll): Likewise.

gcc/testsuite/

* gcc.target/powerpc/lxvrl.c: New test.
* lib/target-supports.exp (check_effective_target_powerpc_future_ok):
New effective target.

Diff:
---
 gcc/config/rs6000/rs6000-string.cc   |   1 +
 gcc/config/rs6000/vsx.md | 122 +--
 gcc/testsuite/gcc.target/powerpc/lxvrl.c |  32 
 3 files changed, 134 insertions(+), 21 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc
index 703f77fa0bf1..814328140553 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -2786,6 +2786,7 @@ expand_block_move (rtx operands[], bool might_overlap)
 
   if (TARGET_MMA && TARGET_BLOCK_OPS_UNALIGNED_VSX
  && TARGET_BLOCK_OPS_VECTOR_PAIR
+ && TARGET_POWERPC64
  && bytes >= 32
  && (align >= 256 || !STRICT_ALIGNMENT))
{
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index dd3573b80868..89523cf4a0e5 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5712,20 +5712,32 @@
   DONE;
 })
 
-;; Load VSX Vector with Length
+;; Load VSX Vector with Length.  If we have lxvrl, we don't have to do an
+;; explicit shift left into a pseudo.
 (define_expand "lxvl"
-  [(set (match_dup 3)
-(ashift:DI (match_operand:DI 2 "register_operand")
-   (const_int 56)))
-   (set (match_operand:V16QI 0 "vsx_register_operand")
-   (unspec:V16QI
-[(match_operand:DI 1 "gpc_reg_operand")
-  (mem:V16QI (match_dup 1))
- (match_dup 3)]
-UNSPEC_LXVL))]
+  [(use (match_operand:V16QI 0 "vsx_register_operand"))
+   (use (match_operand:DI 1 "gpc_reg_operand"))
+   (use (match_operand:DI 2 "gpc_reg_operand"))]
   "TARGET_P9_VECTOR && TARGET_64BIT"
 {
-  operands[3] = gen_reg_rtx (DImode);
+  rtx shift_len = gen_rtx_ASHIFT (DImode, operands[2], GEN_INT (56));
+  rtx len;
+
+  if (TARGET_FUTURE)
+len = shift_len;
+  else
+{
+  len = gen_reg_rtx (DImode);
+  emit_insn (gen_rtx_SET (len, shift_len));
+}
+
+  rtx dest = operands[0];
+  rtx addr = operands[1];
+  rtx mem = gen_rtx_MEM (V16QImode, addr);
+  rtvec rv = gen_rtvec (3, addr, mem, len);
+  rtx lxvl = gen_rtx_UNSPEC (V16QImode, rv, UNSPEC_LXVL);
+  emit_insn (gen_rtx_SET (dest, lxvl));
+  DONE;
 })
 
 (define_insn "*lxvl"
@@ -5749,6 +5761,34 @@
   "lxvll %x0,%1,%2"
   [(set_attr "type" "vecload")])
 
+;; For lxvrl and lxvrll, use the combiner to eliminate the shift.  The
+;; define_expand for lxvl will already incorporate the shift in generating the
+;; insn.  The lxvll buitl-in function required the user to have already done
+;; the shift.  Defining lxvrll this way, will optimize cases where the user has
+;; done the shift immediately before the built-in.
+(define_insn "*lxvrl"
+  [(set (match

[gcc(refs/users/meissner/heads/work199-paddis)] RFC2686-Add paddis support.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a1a57cdbaae10ac92e1cafed7e3ffd744fe498ba

commit a1a57cdbaae10ac92e1cafed7e3ffd744fe498ba
Author: Michael Meissner 
Date:   Thu Apr 3 16:55:32 2025 -0400

RFC2686-Add paddis support.

2025-04-03  Michael Meissner  

gcc/

* config/rs6000/constraints.md (eU): New constraint.
(eV): Likewise.
* config/rs6000/predicates.md (paddis_operand): New predicate.
(paddis_paddi_operand): Likewise.
(add_operand): Add paddis support.
* config/rs6000/rs6000.cc (num_insns_constant_gpr): Add paddis support.
(num_insns_constant_multi): Likewise.
(print_operand): Add %B for paddis support.
* config/rs6000/rs6000.h (TARGET_PADDIS): New macro.
(SIGNED_INTEGER_32BIT_P): Likewise.
* config/rs6000/rs6000.md (isa attribute): Add paddis support.
(enabled attribute); Likewise.
(add3): Likewise.
(adddi3 splitter): New splitter for paddis.
(movdi_internal64): Add paddis support.
(movdi splitter): New splitter for paddis.

gcc/testsuite/

* gcc.target/powerpc/prefixed-addis.c: New test.

Diff:
---
 gcc/config/rs6000/constraints.md  | 10 +++
 gcc/config/rs6000/predicates.md   | 52 +++-
 gcc/config/rs6000/rs6000.cc   | 25 ++
 gcc/config/rs6000/rs6000.h|  4 +
 gcc/config/rs6000/rs6000.md   | 96 ---
 gcc/testsuite/gcc.target/powerpc/prefixed-addis.c | 24 ++
 6 files changed, 197 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 4875895e4f5d..56331f184a10 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -219,6 +219,16 @@
   "An IEEE 128-bit constant that can be loaded into VSX registers."
   (match_operand 0 "easy_vector_constant_ieee128"))
 
+(define_constraint "eU"
+  "@internal integer constant that can be loaded with paddis"
+  (and (match_code "const_int")
+   (match_operand 0 "paddis_operand")))
+
+(define_constraint "eV"
+  "@internal integer constant that can be loaded with paddis + paddi"
+  (and (match_code "const_int")
+   (match_operand 0 "paddis_paddi_operand")))
+
 ;; Floating-point constraints.  These two are defined so that insn
 ;; length attributes can be calculated exactly.
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 647e89afb6a7..cc2d1bfe678e 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -337,6 +337,53 @@
   return SIGNED_INTEGER_34BIT_P (INTVAL (op));
 })
 
+;; Return 1 if op is a 64-bit constant that uses the paddis instruction
+(define_predicate "paddis_operand"
+  (match_code "const_int")
+{
+  if (!TARGET_PADDIS && TARGET_POWERPC64)
+return 0;
+
+  /* If addi, addis, or paddi can handle the number, don't return true.  */
+  HOST_WIDE_INT value = INTVAL (op);
+  if (SIGNED_INTEGER_34BIT_P (value))
+return false;
+
+  /* If the number is too large for padds, return false.  */
+  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
+return false;
+
+  /* If the bottom 32-bits are non-zero, paddis can't handle it.  */
+  if ((value & HOST_WIDE_INT_C(0x)) != 0)
+return false;
+
+  return true;
+})
+
+;; Return 1 if op is a 64-bit constant that needs the paddis instruction and an
+;; addi/addis/paddi instruction combination.
+(define_predicate "paddis_paddi_operand"
+  (match_code "const_int")
+{
+  if (!TARGET_PADDIS && TARGET_POWERPC64)
+return 0;
+
+  /* If addi, addis, or paddi can handle the number, don't return true.  */
+  HOST_WIDE_INT value = INTVAL (op);
+  if (SIGNED_INTEGER_34BIT_P (value))
+return false;
+
+  /* If the number is too large for padds, return false.  */
+  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
+return false;
+
+  /* If the bottom 32-bits are zero, we can use paddis alone to handle it.  */
+  if ((value & HOST_WIDE_INT_C(0x)) == 0)
+return false;
+
+  return true;
+})
+
 ;; Return 1 if op is a register that is not special.
 ;; Disallow (SUBREG:SF (REG:SI)) and (SUBREG:SI (REG:SF)) on VSX systems where
 ;; you need to be careful in moving a SFmode to SImode and vice versa due to
@@ -1081,7 +1128,10 @@
   (if_then_else (match_code "const_int")
 (match_test "satisfies_constraint_I (op)
 || satisfies_constraint_L (op)
-|| satisfies_constraint_eI (op)")
+|| satisfies_constraint_eI (op)
+|| satisfies_constraint_eU (op)
+|| satisfies_constraint_eV (op)")
+
 (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index cc57930f026a..e6f72bd4dbba 100644
--- a/gcc/conf
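
The comments in the paddis_operand and paddis_paddi_operand predicates above
describe a three-way split of 64-bit constants.  The following is a minimal
sketch of that split under stated assumptions: the low-half mask is taken to
be 0xffffffff (the mask in the diff as archived is incomplete), the target
checks and the range check on the upper 32 bits are left out, and all names
are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

enum class AddConstKind {
  FitsPaddi,        // addi, addis, or paddi can already handle the value
  PaddisOnly,       // low 32 bits are zero: paddis alone is enough
  PaddisPlusPaddi   // low 32 bits are non-zero: paddis plus another insn
};

// True if the value fits in a signed 34-bit immediate.
constexpr bool fits_signed_34(std::int64_t v) {
  return v >= -(INT64_C(1) << 33) && v < (INT64_C(1) << 33);
}

// ASSUMPTION: "bottom 32 bits" means masking with 0xffffffff.
constexpr bool low_32_bits_zero(std::int64_t v) {
  return (static_cast<std::uint64_t>(v) & UINT64_C(0xffffffff)) == 0;
}

constexpr AddConstKind classify_add_constant(std::int64_t v) {
  return fits_signed_34(v)     ? AddConstKind::FitsPaddi
         : low_32_bits_zero(v) ? AddConstKind::PaddisOnly
                               : AddConstKind::PaddisPlusPaddi;
}

inline void check_classification() {
  assert(classify_add_constant(0x12345) == AddConstKind::FitsPaddi);
  assert(classify_add_constant(INT64_C(1) << 40) == AddConstKind::PaddisOnly);
  assert(classify_add_constant((INT64_C(1) << 40) + 5)
         == AddConstKind::PaddisPlusPaddi);
}
```

In this model the two predicates would correspond to the PaddisOnly and
PaddisPlusPaddi cases respectively.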

[gcc(refs/users/meissner/heads/work199-paddis)] RFC2677-Add xvrlw support.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:d9484a52803d16d2f8dbff69a95c34e88f454fa5

commit d9484a52803d16d2f8dbff69a95c34e88f454fa5
Author: Michael Meissner 
Date:   Thu Apr 3 16:57:21 2025 -0400

RFC2677-Add xvrlw support.

2025-03-27  Michael Meissner  

gcc/

* config/rs6000/altivec.md (xvrlw): New insn.
* config/rs6000/rs6000.h (TARGET_XVRLW): New macro.

gcc/testsuite/

* gcc.target/powerpc/vector-rotate-left.c: New test.

Diff:
---
 gcc/config/rs6000/altivec.md   | 14 +
 gcc/config/rs6000/rs6000.h |  3 ++
 .../gcc.target/powerpc/vector-rotate-left.c| 34 ++
 3 files changed, 51 insertions(+)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 7edc288a6565..d158cf479d60 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1982,6 +1982,20 @@
 }
   [(set_attr "type" "vecperm")])
 
+;; -mcpu=future adds a vector rotate left word variant.  There is no vector
+;; byte/half-word/double-word/quad-word rotate left.  This insn occurs before
+;; altivec_vrl and will match for -mcpu=future, while other cpus will
+;; match the generic insn.
+(define_insn "*xvrlw"
+  [(set (match_operand:V4SI 0 "register_operand" "=v,wa")
+   (rotate:V4SI (match_operand:V4SI 1 "register_operand" "v,wa")
+(match_operand:V4SI 2 "register_operand" "v,wa")))]
+  "TARGET_XVRLW"
+  "@
+   vrlw %0,%1,%2
+   xvrlw %x0,%x1,%x2"
+  [(set_attr "type" "vecsimple")])
+
 (define_insn "altivec_vrl"
   [(set (match_operand:VI2 0 "register_operand" "=v")
 (rotate:VI2 (match_operand:VI2 1 "register_operand" "v")
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 696cea77e36d..01625a339b8e 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -578,6 +578,9 @@ extern int rs6000_vector_align[];
 /* Whether we have PADDIS support.  */
 #define TARGET_PADDIS  TARGET_FUTURE
 
+/* Whether we have XVRLW support.  */
+#define TARGET_XVRLW   TARGET_FUTURE
+
 /* Whether the various reciprocal divide/square root estimate instructions
exist, and whether we should automatically generate code for the instruction
by default.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/vector-rotate-left.c b/gcc/testsuite/gcc.target/powerpc/vector-rotate-left.c
new file mode 100644
index ..5a5f37755077
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vector-rotate-left.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Test whether the xvrl (vector word rotate left using VSX registers insead of
+   Altivec registers is generated.  */
+
+#include 
+
+typedef vector unsigned int  v4si_t;
+
+v4si_t
+rotl_v4si_scalar (v4si_t x, unsigned long n)
+{
+  __asm__ (" # %x0" : "+f" (x));
+  return (x << n) | (x >> (32 - n));   /* xvrlw.  */
+}
+
+v4si_t
+rotr_v4si_scalar (v4si_t x, unsigned long n)
+{
+  __asm__ (" # %x0" : "+f" (x));
+  return (x >> n) | (x << (32 - n));   /* xvrlw.  */
+}
+
+v4si_t
+rotl_v4si_vector (v4si_t x, v4si_t y)
+{
+  __asm__ (" # %x0" : "+f" (x));   /* xvrlw.  */
+  return vec_rl (x, y);
+}
+
+/* { dg-final { scan-assembler-times {\mxvrlw\M} 3  } } */
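
For reference, the per-element operation that the test above exercises can
be sketched on a single 32-bit word.  This is a portable scalar model with
hypothetical helper names, not the vector code from the test.  It masks the
shift counts so that a rotate by 0 or 32 stays well defined.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit word left by n bits (n is taken modulo 32).
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned n) {
  return (x << (n & 31)) | (x >> ((32 - (n & 31)) & 31));
}

// A rotate right by n is a rotate left by 32 - n, which is why a single
// rotate-left instruction can serve both forms.
constexpr std::uint32_t rotr32(std::uint32_t x, unsigned n) {
  return rotl32(x, 32 - (n & 31));
}

inline void check_rotates() {
  assert(rotl32(UINT32_C(0x80000001), 1) == UINT32_C(0x00000003));
  assert(rotr32(UINT32_C(0x00000003), 1) == UINT32_C(0x80000001));
  assert(rotl32(UINT32_C(0x12345678), 0) == UINT32_C(0x12345678));
}
```

Applied to each of the four words of a vector, rotl32 is the operation that
the rotate:V4SI pattern describes.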


[gcc(refs/users/meissner/heads/work199-paddis)] Update ChangeLog.*

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:dce8782607a399ce969b31fa1e4bea426d93359f

commit dce8782607a399ce969b31fa1e4bea426d93359f
Author: Michael Meissner 
Date:   Thu Apr 3 17:00:13 2025 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.paddis | 145 +++
 1 file changed, 145 insertions(+)

diff --git a/gcc/ChangeLog.paddis b/gcc/ChangeLog.paddis
index ef499f386bcd..b25d549f9637 100644
--- a/gcc/ChangeLog.paddis
+++ b/gcc/ChangeLog.paddis
@@ -1,5 +1,150 @@
+ Branch work199-paddis, patch #311 
+
+RFC2677-Add xvrlw support.
+
+2025-03-27  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/altivec.md (xvrlw): New insn.
+   * config/rs6000/rs6000.h (TARGET_XVRLW): New macro.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/vector-rotate-left.c: New test.
+
+ Branch work199-paddis, patch #310 
+
+RFC2686-Add paddis support.
+
+2025-04-03  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/constraints.md (eU): New constraint.
+   (eV): Likewise.
+   * config/rs6000/predicates.md (paddis_operand): New predicate.
+   (paddis_paddi_operand): Likewise.
+   (add_operand): Add paddis support.
+   * config/rs6000/rs6000.cc (num_insns_constant_gpr): Add paddis support.
+   (num_insns_constant_multi): Likewise.
+   (print_operand): Add %B for paddis support.
+   * config/rs6000/rs6000.h (TARGET_PADDIS): New macro.
+   (SIGNED_INTEGER_32BIT_P): Likewise.
+   * config/rs6000/rs6000.md (isa attribute): Add paddis support.
+   (enabled attribute); Likewise.
+   (add3): Likewise.
+   (adddi3 splitter): New splitter for paddis.
+   (movdi_internal64): Add paddis support.
+   (movdi splitter): New splitter for paddis.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/prefixed-addis.c: New test.
+
+ Branch work199-paddis, patch #301 
+
+RFC2655-Add saturating subtract built-ins.
+
+This patch adds support for a saturating subtract built-in function that may be
+added to a future PowerPC processor.  Note, if it is added, the name of the
+built-in function may change before GCC 13 is released.  If the name changes,
+we will submit a patch changing the name.
+
+I also added support for providing dense math built-in functions, even though
+at present, we have not added any new built-in functions for dense math.  It is
+likely we will want to add new dense math built-in functions as the dense math
+support is fleshed out.
+
+The patches have been tested on both little and big endian systems.  Can I check
+it into the master branch?
+
+2025-03-27   Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-builtin.cc (rs6000_invalid_builtin): Add support
+   for flagging invalid use of future built-in functions.
+   (rs6000_builtin_is_supported): Add support for future built-in
+   functions.
+   * config/rs6000/rs6000-builtins.def (__builtin_saturate_subtract32): New
+   built-in function for -mcpu=future.
+   (__builtin_saturate_subtract64): Likewise.
+   * config/rs6000/rs6000-gen-builtins.cc (enum bif_stanza): Add stanzas
+   for -mcpu=future built-ins.
+   (stanza_map): Likewise.
+   (enable_string): Likewise.
+   (struct attrinfo): Likewise.
+   (parse_bif_attrs): Likewise.
+   (write_decls): Likewise.
+   * config/rs6000/rs6000.md (sat_sub3): Add saturating subtract
+   built-in insn declarations.
+   (sat_sub3_dot): Likewise.
+   (sat_sub3_dot2): Likewise.
+   * doc/extend.texi (Future PowerPC built-ins): New section.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/subfus-1.c: New test.
+   * gcc.target/powerpc/subfus-2.c: Likewise.
+
+ Branch work199-paddis, patch #300 
+
+RFC2656-Support load/store vector with right length.
+
+This patch adds support for new instructions that may be added to the PowerPC
+architecture in the future to enhance the load and store vector with length
+instructions.
+
+The current instructions (lxvl, lxvll, stxvl, and stxvll) are inconvient to use
+since the count for the number of bytes must be in the top 8 bits of the GPR
+register, instead of the bottom 8 bits.  This meant that code generating these
+instructions typically had to do a shift left by 56 bits to get the count into
+the right position.  In a future version of the PowerPC architecture, new
+variants of these instructions might be added that expect the count to be in
+the bottom 8 bits of the GPR register.  These patches add this support to GCC
+if the user uses the -mcpu=future option.
+
+I discovered that the code in rs6000-string.cc to generate ISA 3.1 lxvl/stxvl
+future lxvll/stxvll instructions would generate these instructions on 32-bit.
+However the patterns for these instructions is only done on 64-bit systems.  So
+I added a check for 64-bit support before generating the instructions.
+
+Th

[gcc(refs/users/meissner/heads/work199-sha)] Add ChangeLog.sha and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:1a7b01d4f58322438db40d4319fef30b9c59329c

commit 1a7b01d4f58322438db40d4319fef30b9c59329c
Author: Michael Meissner 
Date:   Thu Apr 3 15:26:36 2025 -0400

Add ChangeLog.sha and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.sha: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.sha | 5 +
 gcc/REVISION  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.sha b/gcc/ChangeLog.sha
new file mode 100644
index ..6829dd35cdc4
--- /dev/null
+++ b/gcc/ChangeLog.sha
@@ -0,0 +1,5 @@
+ Branch work199-sha, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..9f6440bbc70e 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-sha branch


[gcc(refs/users/meissner/heads/work199-test)] Merge commit 'refs/users/meissner/heads/work199-test' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:eadedcdfa58e265938ea4d0106d04bc0aed96f07

commit eadedcdfa58e265938ea4d0106d04bc0aed96f07
Merge: f9073bbbcf78 c09fa69f394a
Author: Michael Meissner 
Date:   Thu Apr 3 16:34:06 2025 -0400

Merge commit 'refs/users/meissner/heads/work199-test' of git+ssh://gcc.gnu.org/git/gcc into me/work199-test

Diff:


[gcc(refs/users/meissner/heads/work199-test)] Add ChangeLog.test and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:f9073bbbcf78248f812a5b050ae8884b9f503aab

commit f9073bbbcf78248f812a5b050ae8884b9f503aab
Author: Michael Meissner 
Date:   Thu Apr 3 15:27:30 2025 -0400

Add ChangeLog.test and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.test: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.test | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
new file mode 100644
index ..6b729265d631
--- /dev/null
+++ b/gcc/ChangeLog.test
@@ -0,0 +1,5 @@
+ Branch work199-test, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..5f364831c648 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-test branch


[gcc/meissner/heads/work199-vpair] (14 commits) Merge commit 'refs/users/meissner/heads/work199-vpair' of g

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-vpair' was updated to point to:

 5b9a36c7eabc... Merge commit 'refs/users/meissner/heads/work199-vpair' of g

It previously pointed to:

 67139bc62ba5... Add ChangeLog.vpair and update REVISION.

Diff:

Summary of changes (added commits):
---

  5b9a36c... Merge commit 'refs/users/meissner/heads/work199-vpair' of g
  99242af... Add ChangeLog.vpair and update REVISION.
  c62780a... Update ChangeLog.* (*)
  76c081b... Use architecture flags for defining _ARCH_PWR macros. (*)
  aa1860c... Add rs6000 architecture masks. (*)
  9dacc68... Use vector pair load/store for memcpy with -mcpu=future (*)
  c590949... Add -mcpu=future tests. (*)
  230fbe1... Add -mcpu=future tuning support. (*)
  c099046... Add support for -mcpu=future (*)
  04725bf... Change TARGET_MODULO to TARGET_POWER9. (*)
  08e5c71... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  b9b3a52... Change TARGET_CMPB to TARGET_POWER6. (*)
  d0dbdbf... Change TARGET_FPRND to TARGET_POWER5X. (*)
  31d7966... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work199-vpair' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc/meissner/heads/work199-dmf] (14 commits) Merge commit 'refs/users/meissner/heads/work199-dmf' of git

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-dmf' was updated to point to:

 3e4b995c8dc1... Merge commit 'refs/users/meissner/heads/work199-dmf' of git

It previously pointed to:

 840d7578343b... Add ChangeLog.dmf and update REVISION.

Diff:

Summary of changes (added commits):
---

  3e4b995... Merge commit 'refs/users/meissner/heads/work199-dmf' of git
  149b47a... Add ChangeLog.dmf and update REVISION.
  c62780a... Update ChangeLog.* (*)
  76c081b... Use architecture flags for defining _ARCH_PWR macros. (*)
  aa1860c... Add rs6000 architecture masks. (*)
  9dacc68... Use vector pair load/store for memcpy with -mcpu=future (*)
  c590949... Add -mcpu=future tests. (*)
  230fbe1... Add -mcpu=future tuning support. (*)
  c099046... Add support for -mcpu=future (*)
  04725bf... Change TARGET_MODULO to TARGET_POWER9. (*)
  08e5c71... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  b9b3a52... Change TARGET_CMPB to TARGET_POWER6. (*)
  d0dbdbf... Change TARGET_FPRND to TARGET_POWER5X. (*)
  31d7966... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work199-dmf' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work199-bugs)] Add ChangeLog.bugs and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c774b0264d49d73a6efc6bb462fea81ed82b3c39

commit c774b0264d49d73a6efc6bb462fea81ed82b3c39
Author: Michael Meissner 
Date:   Thu Apr 3 15:24:39 2025 -0400

Add ChangeLog.bugs and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.bugs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.bugs | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
new file mode 100644
index ..8f82235b2084
--- /dev/null
+++ b/gcc/ChangeLog.bugs
@@ -0,0 +1,5 @@
+ Branch work199-bugs, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..1c17811d60f2 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-bugs branch


[gcc(refs/users/meissner/heads/work199-math)] Merge commit 'refs/users/meissner/heads/work199-math' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:edb52f8e058aa984d1e9e4a81d531e629c844d91

commit edb52f8e058aa984d1e9e4a81d531e629c844d91
Merge: b3076e6ac219 7c9347563cb4
Author: Michael Meissner 
Date:   Thu Apr 3 16:25:39 2025 -0400

Merge commit 'refs/users/meissner/heads/work199-math' of git+ssh://gcc.gnu.org/git/gcc into me/work199-math

Diff:


[gcc(refs/users/meissner/heads/work199-paddis)] Add ChangeLog.paddis and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c53b3461e27e5d95f092a236b05d96b82cfa2e2a

commit c53b3461e27e5d95f092a236b05d96b82cfa2e2a
Author: Michael Meissner 
Date:   Thu Apr 3 15:30:04 2025 -0400

Add ChangeLog.paddis and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.paddis: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.paddis | 5 +
 gcc/REVISION | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.paddis b/gcc/ChangeLog.paddis
new file mode 100644
index ..ef499f386bcd
--- /dev/null
+++ b/gcc/ChangeLog.paddis
@@ -0,0 +1,5 @@
+ Branch work199-paddis, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..60be995efd8d 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-paddis branch


[gcc/meissner/heads/work199-libs] (14 commits) Merge commit 'refs/users/meissner/heads/work199-libs' of gi

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-libs' was updated to point to:

 360e55041f43... Merge commit 'refs/users/meissner/heads/work199-libs' of gi

It previously pointed to:

 f46e8c6a02f2... Add ChangeLog.libs and update REVISION.

Diff:

Summary of changes (added commits):
---

  360e550... Merge commit 'refs/users/meissner/heads/work199-libs' of gi
  0696a01... Add ChangeLog.libs and update REVISION.
  c62780a... Update ChangeLog.* (*)
  76c081b... Use architecture flags for defining _ARCH_PWR macros. (*)
  aa1860c... Add rs6000 architecture masks. (*)
  9dacc68... Use vector pair load/store for memcpy with -mcpu=future (*)
  c590949... Add -mcpu=future tests. (*)
  230fbe1... Add -mcpu=future tuning support. (*)
  c099046... Add support for -mcpu=future (*)
  04725bf... Change TARGET_MODULO to TARGET_POWER9. (*)
  08e5c71... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  b9b3a52... Change TARGET_CMPB to TARGET_POWER6. (*)
  d0dbdbf... Change TARGET_FPRND to TARGET_POWER5X. (*)
  31d7966... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work199-libs' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work199-dmf)] Merge commit 'refs/users/meissner/heads/work199-dmf' of git+ssh://gcc.gnu.org/git/gcc into me/work19

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:3e4b995c8dc147cb0604ac4bf4f9013dbadd351c

commit 3e4b995c8dc147cb0604ac4bf4f9013dbadd351c
Merge: 149b47aae331 840d7578343b
Author: Michael Meissner 
Date:   Thu Apr 3 16:22:00 2025 -0400

Merge commit 'refs/users/meissner/heads/work199-dmf' of git+ssh://gcc.gnu.org/git/gcc into me/work199-dmf

Diff:


[gcc(refs/users/meissner/heads/work199-dmf)] RFC2653-Add support for dense math registers.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:b5c7f27e9c8ede76977534fc29e135a0cd96ea55

commit b5c7f27e9c8ede76977534fc29e135a0cd96ea55
Author: Michael Meissner 
Date:   Thu Apr 3 16:42:48 2025 -0400

RFC2653-Add support for dense math registers.

The MMA subsystem added the notion of accumulator registers as an optional
feature of ISA 3.1 (power10).  In ISA 3.1, these accumulators overlapped with
the VSX registers 0..31, but logically the accumulator registers were separate
from the FPR registers.  In ISA 3.1, it was anticipated that in future systems,
the accumulator registers may not overlap with the FPR registers.  This patch
adds the support for dense math registers as separate registers.

This particular patch does not change the MMA support to use the accumulators
within the dense math registers.  This patch just adds the basic support for
having separate DMRs.  The next patch will switch the MMA support to use the
accumulators if -mcpu=future is used.

For testing purposes, I added an undocumented option '-mdense-math' to enable
or disable the dense math support.

This patch updates the wD constraint added in the previous patch.  If MMA is
selected but dense math is not selected (i.e. -mcpu=power10), the wD constraint
will allow access to accumulators that overlap with VSX registers 0..31.  If
both MMA and dense math are selected (i.e. -mcpu=future), the wD constraint
will only allow dense math registers.

This patch modifies the existing %A output modifier.  If MMA is selected but
dense math is not selected, then the %A output modifier converts the VSX
register number to the accumulator number, by dividing it by 4.  If both MMA
and dense math are selected, then %A will map the separate DMR registers into
0..7.
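
As a rough illustration of the %A mapping described above, the following C
sketch models both cases.  The register numbering (FIRST_VSX, FIRST_DMR), the
NUM_ACC constant and the function name are hypothetical and chosen only for
this example; they are not GCC's real hard register numbers, and this is not
the actual print_operand code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register numbering, for illustration only.  */
enum { FIRST_VSX = 0, FIRST_DMR = 100, NUM_ACC = 8 };

/* Return the accumulator number that %A would print for REGNO.

   Without dense math, an accumulator overlaps 4 adjacent VSX registers
   (0..31), so the accumulator number is the VSX register number divided
   by 4.  With dense math, the separate DMR registers map directly onto
   0..7.  */
static int
accumulator_number (int regno, bool dense_math)
{
  if (dense_math)
    {
      assert (regno >= FIRST_DMR && regno < FIRST_DMR + NUM_ACC);
      return regno - FIRST_DMR;
    }

  assert (regno >= FIRST_VSX && regno < FIRST_VSX + 4 * NUM_ACC);
  /* An accumulator must start on a multiple-of-4 VSX register.  */
  assert ((regno - FIRST_VSX) % 4 == 0);
  return (regno - FIRST_VSX) / 4;
}
```

With this numbering, accumulator_number (28, false) gives 7 and
accumulator_number (103, true) gives 3.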

The intention is that user code using extended asm can be modified to run on
both MMA without dense math and MMA with dense math:

1)  If possible, don't use extended asm, but instead use the MMA built-in
functions;

2)  If you do need to write extended asm, change the d constraints
targeting accumulators to use wD;

3)  Only use the built-in zero, assemble and disassemble functions to
move data between vector quad types and dense math accumulators.
I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz instructions directly
in the extended asm code.  The reason is these instructions assume there
is a 1-to-1 correspondence between 4 adjacent FPR registers and an
accumulator that overlaps with those registers.  With accumulators
now being separate registers, there no longer is a 1-to-1
correspondence.

It is possible that the mangling for DMRs and the GDB register numbers may
produce other changes in the future.

gcc/

2025-04-03   Michael Meissner  

* config/rs6000/mma.md (UNSPEC_MMA_DMSETDMRZ): New unspec.
(movxo): Add comments about dense math registers.
(movxo_nodm): Rename from movxo and restrict the usage to machines
without dense math registers.
(movxo_dm): New insn for movxo support for machines with dense math
registers.
(mma_): Restrict usage to machines without dense math registers.
(mma_xxsetaccz): Add a define_expand wrapper, and add support for dense
math registers.
(mma_dmsetaccz): New insn.
* config/rs6000/predicates.md (dmr_operand): New predicate.
(accumulator_operand): Add support for dense math registers.
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): Do
not issue a de-prime instruction when disassembling a vector quad on a
system with dense math registers.
* config/rs6000/rs6000-c.cc (rs6000_define_or_undefine_macro): Define
__DENSE_MATH__ if we have dense math registers.
* config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMR_REG_TYPE.
(enum rs6000_reload_reg_type): Add RELOAD_REG_DMR.
(LAST_RELOAD_REG_CLASS): Add support for DMR registers and the wD
constraint.
(reload_reg_map): Likewise.
(rs6000_reg_names): Likewise.
(alt_reg_names): Likewise.
(rs6000_hard_regno_nregs_internal): Likewise.
(rs6000_hard_regno_mode_ok_uncached): Likewise.
(rs6000_debug_reg_global): Likewise.
(rs6000_setup_reg_addr_masks): Likewise.
(rs6000_init_hard_regno_mode_ok): Likewise.
(rs6000_secondary_reload_memory): Add support for DMR registers.
(rs6000_secondary_reload_simple_move): Likewise.
(rs6000_preferred_reload_class): Likewise.
(rs6000_secondary_reload_class): Likewise.
(print_operand

[gcc(refs/users/meissner/heads/work199-dmf)] RFC2653-Add wD constraint.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:4f9a1d704f83808a191e152d6dd1b84ad790

commit 4f9a1d704f83808a191e152d6dd1b84ad790
Author: Michael Meissner 
Date:   Thu Apr 3 16:40:28 2025 -0400

RFC2653-Add wD constraint.

This patch adds a new constraint ('wD') that matches the accumulator registers
that overlap with VSX registers 0..31 on power10.  Future patches will add the
support for a separate accumulator register class that will be used when the
support for dense math registers is added.

2025-04-03   Michael Meissner  

* config/rs6000/constraints.md (wD): New constraint.
* config/rs6000/mma.md (mma_): Prepare for alternate accumulator
registers.  Use wD constraint instead of 'd' constraint.  Use
accumulator_operand instead of fpr_reg_operand.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0")]
MMA_ACC))]
   "TARGET_MMA"
   " %A0"
@@ -523,7 +523,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
MMA_VV))]
@@ -532,8 +532,8 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
MMA_AVV))]
@@ -542,7 +542,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:OO 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
MMA_PV))]
@@ -551,8 +551,8 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:OO 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
MMA_APV))]
@@ -561,7 +561,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:SI 3 "const_0_to_15_operand" "n,n")
@@ -574,8 +574,8 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")
(match_operand:SI 4 "const_0_to_15_operand" "n,n")
@@ -588,7 +588,7 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:SI 3 "const_0_to_15_operand" "n,n")
@@ -601,8 +601,8 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")

[gcc] Created branch 'meissner/heads/work199-submit' in namespace 'refs/users'

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-submit' was created in namespace 'refs/users' pointing to:

 34cf8c8901ed... Add ChangeLog.meissner and REVISION.


[gcc r15-9190] nvptx: Don't use PTX '.const', constant state space [PR119573]

2025-04-03 Thread Thomas Schwinge via Libstdc++-cvs
https://gcc.gnu.org/g:5deeae29dab2af64e3342daf7a3e424c64ea

commit r15-9190-g5deeae29dab2af64e3342daf7a3e424c64ea
Author: Thomas Schwinge 
Date:   Wed Apr 2 10:25:17 2025 +0200

nvptx: Don't use PTX '.const', constant state space [PR119573]

This avoids cases where a "File uses too much global constant data" error is
raised (for the final executable, or a single object file), and avoids cases of
wrong code generation: "error : State space incorrect for instruction 'st'"
('st.const'), or another case where an "illegal instruction was encountered",
or a lot of cases where for two compilation units (such as a library linked
with user code) we ran into "error : Memory space doesn't match" due to
differences in '.const' usage between definition and use of a variable.

We progress:

ptxas error   : File uses too much global constant data (0x1f01a bytes, 0x1 max)
nvptx-run: cuLinkAddData failed: a PTX JIT compilation failed (CUDA_ERROR_INVALID_PTX, 218)

... into:

PASS: 20_util/to_chars/103955.cc  -std=gnu++17 (test for excess errors)
[-FAIL:-]{+PASS:+} 20_util/to_chars/103955.cc  -std=gnu++17 execution test

We progress:

ptxas error   : File uses too much global constant data (0x36c65 bytes, 0x1 max)
nvptx-as: ptxas returned 255 exit status

... into:

[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O0  {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O1  {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O2  {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O3 -g  {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -Os  {+(test for excess errors)+}

[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O0  (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O1  (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O2  (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O3 -g  (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -Os  (test for excess errors)

[-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-1.f90   -O0  (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-1.f90   -O0  [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-4.f90   -O0  (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-4.f90   -O0  [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-5.f90   -O0  (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-5.f90   -O0  [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} 20_util/to_chars/double.cc  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} 20_util/to_chars/double.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} 20_util/to_chars/float.cc  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} 20_util/to_chars/float.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} special_functions/13_ellint_3/check_value.cc  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} special_functions/13_ellint_3/check_value.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/14_ellint_3/check_value.cc  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/14_ellint_3/check_value.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

..., and progress likewise, but fail later with an unrelated error:

[-FAIL:-]{+PASS:+} ext/special_functions/hyperg/check_value.cc  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+FAIL:+} ext/special_functions/hyperg/check_value.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}


[...]/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_value.cc:12317: void test(const testcase_hyperg (&)[Num], Ret) [with Ret = double; unsigned int Num = 19]: Assertion 'max_abs_frac < toler' failed.

..., and:

[-FAIL:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+FAIL:+} tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc  -std=gnu+

[gcc r15-9191] libstdc++, nvptx: Remove machinery to inject per-file flags

2025-04-03 Thread Thomas Schwinge via Gcc-cvs
https://gcc.gnu.org/g:287f360b3e75a19c48ee14c71f51b6e7968474ef

commit r15-9191-g287f360b3e75a19c48ee14c71f51b6e7968474ef
Author: Thomas Schwinge 
Date:   Wed Apr 2 11:05:08 2025 +0200

libstdc++, nvptx: Remove machinery to inject per-file flags

Not used anymore.

libstdc++-v3/
* config/cpu/nvptx/t-nvptx: Remove.
* configure.host [nvptx]: Adjust.

Diff:
---
 libstdc++-v3/config/cpu/nvptx/t-nvptx | 1 -
 libstdc++-v3/configure.host   | 5 -
 2 files changed, 6 deletions(-)

diff --git a/libstdc++-v3/config/cpu/nvptx/t-nvptx 
b/libstdc++-v3/config/cpu/nvptx/t-nvptx
deleted file mode 100644
index eacc5468d627..
--- a/libstdc++-v3/config/cpu/nvptx/t-nvptx
+++ /dev/null
@@ -1 +0,0 @@
-# Per-file flags, see '../../../configure.host', "inject per-file flags".
diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
index 0bed9dfff957..8375764bf4dc 100644
--- a/libstdc++-v3/configure.host
+++ b/libstdc++-v3/configure.host
@@ -374,11 +374,6 @@ case "${host}" in
  
port_specific_symbol_files="\$(srcdir)/../config/os/gnu-linux/arm-eabi-extra.ver"
  ;;
   nvptx-*-none)
-# For 'make all-target-libstdc++-v3', we need to inject per-file flags:
-OPTIMIZE_CXXFLAGS="${OPTIMIZE_CXXFLAGS} \$(CXXFLAGS-\$(subdir)/\$@)"
-# ..., see:
-tmake_file="$tmake_file cpu/nvptx/t-nvptx"
-
 # For 'make all-target-libstdc++-v3', re 'alloca'/VLA usage:
 EXTRA_CFLAGS="${EXTRA_CFLAGS} -mfake-ptx-alloca"
 OPTIMIZE_CXXFLAGS="${OPTIMIZE_CXXFLAGS} -mfake-ptx-alloca"


[gcc(refs/users/meissner/heads/work199)] Update ChangeLog.*

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c62780a419279bc417ec0a1fdb2f0228446c7787

commit c62780a419279bc417ec0a1fdb2f0228446c7787
Author: Michael Meissner 
Date:   Thu Apr 3 16:18:42 2025 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.meissner | 396 +
 1 file changed, 396 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index 95d48eacd3cd..dfe73a0ea32f 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,5 +1,401 @@
+ Branch work199, patch #21 
+
+Use architecture flags for defining _ARCH_PWR macros.
+
+For the newer architectures, this patch changes GCC to define the _ARCH_PWR
+macros using the new architecture flags instead of relying on isa options like
+-mpower10.
+
+The -mpower8-internal, -mpower10, -mpower11, and -mfuture options were removed.
+The -mpower11 and -mfuture options were removed completely, since they were just
+added in GCC 15. The other two options were marked as WarnRemoved, and the
+various ISA bits were removed.
+
+TARGET_POWER8, TARGET_POWER10, TARGET_POWER11, and TARGET_FUTURE were re-defined
+to use the architecture bits instead of the ISA bits.
+
+There are other internal isa bits that aren't removed with this patch because
+the built-in function support uses those bits.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+Can I install this patch on the GCC 15 trunk?
+
+2025-04-03  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Add support to
+   use architecture flags instead of ISA flags for setting most of the
+   _ARCH_PWR* macros.
+   (rs6000_cpu_cpp_builtins): Update rs6000_target_modify_macros call.
+   * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Remove
+   OPTION_MASK_POWER8.
+   (ISA_3_1_MASKS_SERVER): Remove OPTION_MASK_POWER10.
+   (POWER11_MASKS_SERVER): Remove OPTION_MASK_POWER11.
+   (FUTURE_MASKS_SERVER): Remove OPTION_MASK_FUTURE.
+   (POWERPC_MASKS): Remove OPTION_MASK_POWER8, OPTION_MASK_POWER10,
+   OPTION_MASK_POWER11, and OPTION_MASK_FUTURE.
+   * config/rs6000/rs6000-protos.h (rs6000_target_modify_macros): Update
+   declaration.
+   (rs6000_target_modify_macros_ptr): Likewise.
+   * config/rs6000/rs6000.cc (rs6000_target_modify_macros_ptr): Likewise.
+   (rs6000_option_override_internal): Use architecture flags instead of ISA
+   flags.
+   (rs6000_opt_masks): Remove -mpower10, -mpower11, and -mfuture which are
+   no longer in the ISA flags.
+   (rs6000_pragma_target_parse): Use architecture flags as well as ISA
+   flags.
+   * config/rs6000/rs6000.h (TARGET_POWER5): Redefine to use architecture
+   flags.
+   (TARGET_POWER5X): Likewise.
+   (TARGET_POWER6): Likewise.
+   (TARGET_POWER7): Likewise.
+   (TARGET_POWER8): Likewise.
+   (TARGET_POWER9): Likewise.
+   (TARGET_POWER10): New macro.
+   (TARGET_POWER11): Likewise.
+   (TARGET_FUTURE): Likewise.
+   * config/rs6000/rs6000.opt (-mpower8-internal): Remove ISA flag bits.
+   (-mpower10): Likewise.
+   (-mpower11): Likewise.
+   (-mfuture): Likewise.
+
+ Branch work199, patch #20 
+
+Add rs6000 architecture masks.
+
+This patch begins the journey to move architecture bits that are not user ISA
+options from rs6000_isa_flags to a new target variable rs6000_arch_flags.  The
+intention is to remove switches that are currently isa options, but the user
+should not be using this particular option. For example, we want users to use
+-mcpu=power10 and not just -mpower10.
+
+This patch also changes the target_clones support to use an architecture mask
+instead of isa bits.
+
+This patch also switches the handling of .machine to use architecture masks if
+they exist (power4 through power11).  All of the other PowerPCs will continue to
+use the existing code for setting the .machine option.
+
+I have built both big endian and little endian bootstrap compilers and there
+were no regressions.
+
+In addition, I constructed a test case that used every architecture define (like
+_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
+this test for all supported combinations of -mcpu, big/little endian, and 32/64
+bit support.  Every single instance generated exactly the same code with the
+patches installed compared to the compiler before installing the patches.
+
+The only difference in this patch compared to the first version posted on
+November 6th is that I corrected the attribution and copyright year (i.e. that I
+created rs6000-arch.def in 2024).
+
+Can I install this patch on the GCC 15 trunk?
+
+2025-04-03  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/default64.h (TARGET_CPU_DEFAULT): Set default cpu name.
+   * config/rs6000/rs6000-arch.def: New file.
+   * confi

[gcc(refs/users/meissner/heads/work199-math)] Add ChangeLog.math and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:b3076e6ac21908e99b8d9945ba3ba2b130541937

commit b3076e6ac21908e99b8d9945ba3ba2b130541937
Author: Michael Meissner 
Date:   Thu Apr 3 15:28:23 2025 -0400

Add ChangeLog.math and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.math: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.math | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.math b/gcc/ChangeLog.math
new file mode 100644
index ..126300a17a75
--- /dev/null
+++ b/gcc/ChangeLog.math
@@ -0,0 +1,5 @@
+ Branch work199-math, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..365a2f9e2106 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-math branch


[gcc/meissner/heads/work199-paddis] (14 commits) Merge commit 'refs/users/meissner/heads/work199-paddis' of

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-paddis' was updated to point to:

 b130aa485b39... Merge commit 'refs/users/meissner/heads/work199-paddis' of 

It previously pointed to:

 3d09f8d126ca... Add ChangeLog.paddis and update REVISION.

Diff:

Summary of changes (added commits):
---

  b130aa4... Merge commit 'refs/users/meissner/heads/work199-paddis' of 
  c53b346... Add ChangeLog.paddis and update REVISION.
  c62780a... Update ChangeLog.* (*)
  76c081b... Use architecture flags for defining _ARCH_PWR macros. (*)
  aa1860c... Add rs6000 architecture masks. (*)
  9dacc68... Use vector pair load/store for memcpy with -mcpu=future (*)
  c590949... Add -mcpu=future tests. (*)
  230fbe1... Add -mcpu=future tuning support. (*)
  c099046... Add support for -mcpu=future (*)
  04725bf... Change TARGET_MODULO to TARGET_POWER9. (*)
  08e5c71... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  b9b3a52... Change TARGET_CMPB to TARGET_POWER6. (*)
  d0dbdbf... Change TARGET_FPRND to TARGET_POWER5X. (*)
  31d7966... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work199-paddis' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc r15-9180] cobol: New testcases for INSPECT statement.

2025-04-03 Thread Robert Dubner via Gcc-cvs
https://gcc.gnu.org/g:9b9b0ccffaf6185f5f44734755ebb7ae085ed745

commit r15-9180-g9b9b0ccffaf6185f5f44734755ebb7ae085ed745
Author: Bob Dubner 
Date:   Wed Apr 2 18:01:08 2025 -0400

cobol: New testcases for INSPECT statement.

gcc/testsuite

* cobol.dg/group2/INSPECT_BACKWARD_REPLACING_LEADING.cob: New testcase.
* cobol.dg/group2/INSPECT_BACKWARD_REPLACING_TRAILING.cob: Likewise.
* cobol.dg/group2/INSPECT_BACKWARD_simple_CONVERTING.cob: Likewise.
* cobol.dg/group2/INSPECT_BACKWARD_simple_REPLACING.cob: Likewise.
* cobol.dg/group2/INSPECT_BACKWARD_simple_TALLYING.cob: Likewise.
* cobol.dg/group2/INSPECT_CONVERTING_NULL.cob: Likewise.
* cobol.dg/group2/INSPECT_CONVERTING_TO_figurative_constant.cob: Likewise.
* cobol.dg/group2/INSPECT_CONVERTING_TO_figurative_constants.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_1.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_2.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_3.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_4.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_5.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_5-f.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_5-r.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_6.cob: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_7.cob: Likewise.
* cobol.dg/group2/INSPECT_No_repeat_conversion_check.cob: Likewise.
* cobol.dg/group2/INSPECT_REPLACING_figurative_constant.cob: Likewise.
* cobol.dg/group2/INSPECT_REPLACING_LEADING_ZEROS_BY_SPACES.cob: Likewise.
* cobol.dg/group2/INSPECT_TALLYING_AFTER.cob: Likewise.
* cobol.dg/group2/INSPECT_TALLYING_BEFORE.cob: Likewise.
* cobol.dg/group2/INSPECT_TALLYING_REPLACING_ISO_Example.cob: Likewise.
* cobol.dg/group2/INSPECT_TRAILING.cob: Likewise.
* cobol.dg/group2/INSPECT_BACKWARD_REPLACING_LEADING.out: New known-good result.
* cobol.dg/group2/INSPECT_BACKWARD_REPLACING_TRAILING.out: Likewise.
* cobol.dg/group2/INSPECT_BACKWARD_simple_CONVERTING.out: Likewise.
* cobol.dg/group2/INSPECT_BACKWARD_simple_REPLACING.out: Likewise.
* cobol.dg/group2/INSPECT_BACKWARD_simple_TALLYING.out: Likewise.
* cobol.dg/group2/INSPECT_CONVERTING_TO_figurative_constants.out: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_1.out: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_2.out: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_3.out: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_4.out: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_5-f.out: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_5.out: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_5-r.out: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_6.out: Likewise.
* cobol.dg/group2/INSPECT_ISO_Example_7.out: Likewise.
* cobol.dg/group2/INSPECT_TALLYING_REPLACING_ISO_Example.out: Likewise.
* cobol.dg/group2/INSPECT_TRAILING.out: Likewise.

Diff:
---
 .../group2/INSPECT_BACKWARD_REPLACING_LEADING.cob  |  43 +++
 .../group2/INSPECT_BACKWARD_REPLACING_LEADING.out  |  10 ++
 .../group2/INSPECT_BACKWARD_REPLACING_TRAILING.cob |  44 +++
 .../group2/INSPECT_BACKWARD_REPLACING_TRAILING.out |  10 ++
 .../group2/INSPECT_BACKWARD_simple_CONVERTING.cob  | 105 +++
 .../group2/INSPECT_BACKWARD_simple_CONVERTING.out  |  15 +++
 .../group2/INSPECT_BACKWARD_simple_REPLACING.cob   |  29 +
 .../group2/INSPECT_BACKWARD_simple_REPLACING.out   |   7 +
 .../group2/INSPECT_BACKWARD_simple_TALLYING.cob|  78 +++
 .../group2/INSPECT_BACKWARD_simple_TALLYING.out|  14 ++
 .../cobol.dg/group2/INSPECT_CONVERTING_NULL.cob|  15 +++
 .../INSPECT_CONVERTING_TO_figurative_constant.cob  |  15 +++
 .../INSPECT_CONVERTING_TO_figurative_constants.cob |  27 
 .../INSPECT_CONVERTING_TO_figurative_constants.out |   6 +
 .../cobol.dg/group2/INSPECT_ISO_Example_1.cob  |  83 
 .../cobol.dg/group2/INSPECT_ISO_Example_1.out  |   9 ++
 .../cobol.dg/group2/INSPECT_ISO_Example_2.cob  |  75 +++
 .../cobol.dg/group2/INSPECT_ISO_Example_2.out  |   7 +
 .../cobol.dg/group2/INSPECT_ISO_Example_3.cob  |  68 ++
 .../cobol.dg/group2/INSPECT_ISO_Example_3.out  |  13 ++
 .../cobol.dg/group2/INSPECT_ISO_Example_4.cob  |  71 +++
 .../cobol.dg/group2/INSPECT_ISO_Example_4.out  |   5 +
 .../cobol.dg/group2/INSPECT_ISO_Example_5-f.cob|  81 
 .../cobol.dg/group2/INSPECT_ISO_Example_5-f.out|   9 ++
 .../cobol.dg/group2/INSPECT_ISO_Example_5-r.cob|  77 +++
 .../cobol.dg/group2/INSPECT_ISO_Example_5-r.out|   9 ++
 .../co

[gcc] Created branch 'meissner/heads/work199-libs' in namespace 'refs/users'

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-libs' was created in namespace 'refs/users' pointing to:

 34cf8c8901ed... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work199-sha)] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:5373d6ec8a3034ad5e0156718d67cde58b476cd5

commit 5373d6ec8a3034ad5e0156718d67cde58b476cd5
Author: Michael Meissner 
Date:   Thu Apr 3 19:08:45 2025 -0400

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

The multibuff.c benchmark attached to PR target/117251, which implements SHA3,
has a slowdown when compiled for Power10 with the current trunk and GCC 14
compared to GCC 11 - GCC 13, due to excessive amounts of spilling.

The main function for the multibuff.c file has 3,747 lines, all of which are
using vector unsigned long long.  There are 696 vector rotates (all rotates are
constant), 1,824 vector xor's and 600 vector andc's.

In looking at it, the main thing that stands out is that the reason for either
spilling or moving variables is the support in fusion.md (generated by
genfusion.pl) that tries to fuse the vec_andc feeding into vec_xor, and other
vec_xor's feeding into vec_xor.

On the powerpc for power10, there is a special fusion mode that happens if the
machine has a VANDC or VXOR instruction that is adjacent to a VXOR instruction
and the VANDC/VXOR feeds into the 2nd VXOR instruction.

While the Power10 has 64 vector registers (which use the XXL prefix to do
logical operations), the fusion only works with the older Altivec instruction
set (which uses the V prefix).  The Altivec instruction set only has 32 vector
registers (which are overlaid over the VSX vector registers 32-63).

Having the combiner patterns fuse_vandc_vxor and fuse_vxor_vxor do this fusion
means that the register allocator has more register pressure for the
traditional Altivec registers instead of the VSX registers.

In addition, since there are vector rotates, these rotates only work on the
traditional Altivec registers, which adds to the Altivec register pressure.

Finally, in addition to doing the explicit xor, andc, and rotates using the
Altivec registers, we also have to load vector constants for the rotate amount
and these registers also are allocated as Altivec registers.

Current trunk and GCC 12-14 have more vector spills than GCC 11, but GCC 11 has
many more vector moves than the later compilers.  Thus even though it has far
fewer spills, the vector moves are why GCC 11 has the slowest results.

There is an instruction that was added in power10 (XXEVAL) that does provide
fusion between VSX vectors that includes ANDC->XOR and XOR->XOR fusion.
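
To make the ANDC->XOR and XOR->XOR cases concrete, here is a small C model of
a three-input truth-table operation in the style of XXEVAL.  It assumes my
understanding of the encoding, namely that the 8-bit immediate is indexed by
the three input bits with the most significant immediate bit selected by
a=0, b=0, c=0.  The function names and the operand order of the two logic
helpers are illustrative and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t (*logic3_fn) (uint64_t, uint64_t, uint64_t);

/* The two fused operations discussed above, written as plain C.  */
static uint64_t
andc_xor (uint64_t a, uint64_t b, uint64_t c)
{
  return a ^ (b & ~c);
}

static uint64_t
xor_xor (uint64_t a, uint64_t b, uint64_t c)
{
  return a ^ b ^ c;
}

/* Build the 8-bit truth table for FN.  Index 4*a + 2*b + c selects one
   immediate bit, counted from the most significant bit (assumed encoding).  */
static uint8_t
truth_table_imm (logic3_fn fn)
{
  uint8_t imm = 0;

  for (unsigned idx = 0; idx < 8; idx++)
    {
      uint64_t a = (idx >> 2) & 1, b = (idx >> 1) & 1, c = idx & 1;
      if (fn (a, b, c) & 1)
	imm |= (uint8_t) (0x80u >> idx);
    }
  return imm;
}

/* Apply the truth table IMM to every bit position of A, B and C.  */
static uint64_t
xxeval_emulate (uint64_t a, uint64_t b, uint64_t c, uint8_t imm)
{
  uint64_t result = 0;

  assert (sizeof (uint64_t) * 8 == 64);
  for (unsigned bit = 0; bit < 64; bit++)
    {
      unsigned idx = (unsigned) ((((a >> bit) & 1) << 2)
				 | (((b >> bit) & 1) << 1)
				 | ((c >> bit) & 1));
      if (imm & (0x80u >> idx))
	result |= (uint64_t) 1 << bit;
    }
  return result;
}
```

Under that assumed encoding, truth_table_imm (andc_xor) is 0x2D (45) and
truth_table_imm (xor_xor) is 0x69 (105), so a single three-input operation
can stand in for either two-instruction sequence.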

The latency of XXEVAL is slightly more than the fused VANDC/VXOR or VXOR/VXOR,
so I have written the patch to prefer doing the Altivec instructions if they
don't need a temporary register.

Here are the results for adding support for XXEVAL for the multibuff.c
benchmark attached to the PR.  Note that we essentially recover the speed with
this patch that was lost with GCC 14 and the current trunk:

                                XXEVAL   Trunk   GCC14   GCC13   GCC12   GCC11
                                ------   -----   -----   -----   -----   -----
Benchmark time in seconds         5.53    6.15    6.26    5.57    5.61    9.56

Fuse VANDC -> VXOR                 209     600     600     600     600     600
Fuse VXOR -> VXOR                    0     240     240     120     120     120
XXEVAL to fuse ANDC -> XOR         391       0       0       0       0       0
XXEVAL to fuse XOR -> XOR          240       0       0       0       0       0

Spill vector to stack               78     364     364     172     184     110
Load spilled vector from stack     431     962     962     713     723     166
Vector moves                        10     100     100      70      72   3,055

Vector rotate right                696     696     696     696     696     696
XXLANDC or VANDC                   209     600     600     600     600     600
XXLXOR or VXOR                     953   1,824   1,824   1,824   1,824   1,825
XXEVAL                             631       0       0       0       0       0

Load vector rotate constants        24      24      24      24      24      24
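
The instruction counts in the multibuff.c table can be cross-checked with a
little arithmetic.  The sketch below assumes (my reading of the table, not
something stated in the message) that each XXEVAL fusing ANDC -> XOR replaces
one ANDC and one XOR, and each XXEVAL fusing XOR -> XOR replaces two XORs.
The struct and function names are made up for the example.

```c
#include <assert.h>

/* Per-compiler instruction counts taken from the table.  */
struct op_counts
{
  int vandc;		/* XXLANDC or VANDC */
  int vxor;		/* XXLXOR or VXOR */
  int xxeval_andc_xor;	/* XXEVAL to fuse ANDC -> XOR */
  int xxeval_xor_xor;	/* XXEVAL to fuse XOR -> XOR */
};

/* Logical ANDC operations, counting those folded into XXEVAL.  */
static int
andc_ops (struct op_counts k)
{
  return k.vandc + k.xxeval_andc_xor;
}

/* Logical XOR operations, counting those folded into XXEVAL.  */
static int
xor_ops (struct op_counts k)
{
  return k.vxor + k.xxeval_andc_xor + 2 * k.xxeval_xor_xor;
}

/* Percentage by which the run time dropped going from BEFORE to AFTER.  */
static double
percent_reduction (double before, double after)
{
  assert (before > 0.0);
  return 100.0 * (before - after) / before;
}
```

With the XXEVAL column (209, 953, 391, 240) both totals match the trunk
column (600 ANDCs and 1,824 XORs), and percent_reduction (6.15, 5.53) is
roughly 10%.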

Here are the results for adding support for XXEVAL for the singlebuff.c
benchmark attached to the PR.  Note that adding XXEVAL greatly speeds up this
particular benchmark:

                                XXEVAL   Trunk   GCC14   GCC13   GCC12   GCC11
                                ------   -----   -----   -----   -----   -----
Benchmark time in seconds         4.46    5.40    5.40    5.35    5.36    7.54

Fuse VANDC -> VXOR                 210     600     600     600     600     600
Fuse VXOR -> VXOR                    0     240     240     120     120     120
XXEVAL to fuse ANDC -> XOR   3900   0  

[gcc(refs/users/meissner/heads/work199-sha)] Update ChangeLog.*

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:4bbec558e667e9a6c304a9c834db83cbcb0e32df

commit 4bbec558e667e9a6c304a9c834db83cbcb0e32df
Author: Michael Meissner 
Date:   Thu Apr 3 19:11:45 2025 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.sha | 168 ++
 1 file changed, 168 insertions(+)

diff --git a/gcc/ChangeLog.sha b/gcc/ChangeLog.sha
index 6829dd35cdc4..1292c924bf6f 100644
--- a/gcc/ChangeLog.sha
+++ b/gcc/ChangeLog.sha
@@ -1,5 +1,173 @@
+ Branch work199-sha, patch #401 
+
+Add potential p-future XVRLD and XVRLDI instructions.
+
+2025-04-03  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/altivec.md (altivec_vrl): Add support for a
+   possible XVRLD instruction in the future.
+   (altivec_vrl_immediate): New insns.
+   * config/rs6000/predicates.md (vector_shift_immediate): New predicate.
+   * config/rs6000/rs6000.h (TARGET_XVRLW): New macro.
+   * config/rs6000/rs6000.md (isa attribute): Add xvrlw.
+   (enabled attribute): Add support for xvrlw.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/vector-rotate-left.c: New test.
+   * lib/target-supports.exp (check_effective_target_powerpc_future_ok):
+   Add support to test -mcpu=future.
+
+ Branch work199-sha, patch #400 
+
+PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations
+
+The multibuff.c benchmark attached to PR target/117251, which implements SHA3,
+has a slowdown when compiled for Power10 with the current trunk and GCC 14
+compared to GCC 11 - GCC 13, due to excessive amounts of spilling.
+
+The main function for the multibuff.c file has 3,747 lines, all of which are
+using vector unsigned long long.  There are 696 vector rotates (all rotates are
+constant), 1,824 vector xor's and 600 vector andc's.
+
+In looking at it, the main thing that stands out is that the reason for either
+spilling or moving variables is the support in fusion.md (generated by
+genfusion.pl) that tries to fuse the vec_andc feeding into vec_xor, and other
+vec_xor's feeding into vec_xor.
+
+On the powerpc for power10, there is a special fusion mode that happens if the
+machine has a VANDC or VXOR instruction that is adjacent to a VXOR instruction
+and the VANDC/VXOR feeds into the 2nd VXOR instruction.
+
+While the Power10 has 64 vector registers (which uses the XXL prefix to do
+logical operations), the fusion only works with the older Altivec instruction
+set (which uses the V prefix).  The Altivec instruction only has 32 vector
+registers (which are overlaid over the VSX vector registers 32-63).
+
+By having the combiner patterns fuse_vandc_vxor and fuse_vxor_vxor to do this
+fusion, it means that the register allocator has more register pressure for the
+traditional Altivec registers instead of the VSX registers.
+
+In addition, since there are vector rotates, these rotates only work on the
+traditional Altivec registers, which adds to the Altivec register pressure.
+
+Finally in addition to doing the explicit xor, andc, and rotates using the
+Altivec registers, we have to also load vector constants for the rotate amount
+and these registers also are allocated as Altivec registers.
+
+Current trunk and GCC 12-14 have more vector spills than GCC 11, but GCC 11 has
+many more vector moves than the later compilers.  Thus even though it has far
+fewer spills, the vector moves are why GCC 11 has the slowest results.
+
+There is an instruction that was added in power10 (XXEVAL) that does provide
+fusion between VSX vectors that includes ANDC->XOR and XOR->XOR fusion.
+
+The latency of XXEVAL is slightly more than the fused VANDC/VXOR or VXOR/VXOR,
+so I have written the patch to prefer doing the Altivec instructions if they
+don't need a temporary register.
+
+Here are the results for adding support for XXEVAL for the multibuff.c
+benchmark attached to the PR.  Note that we essentially recover the speed with
+this patch that was lost with GCC 14 and the current trunk:
+
+                                  XXEVAL   Trunk   GCC14   GCC13   GCC12   GCC11
+                                  ------   -----   -----   -----   -----   -----
+Benchmark time in seconds           5.53    6.15    6.26    5.57    5.61    9.56
+
+Fuse VANDC -> VXOR                   209     600     600     600     600     600
+Fuse VXOR -> VXOR                      0     240     240     120     120     120
+XXEVAL to fuse ANDC -> XOR           391       0       0       0       0       0
+XXEVAL to fuse XOR -> XOR            240       0       0       0       0       0
+
+Spill vector to stack                 78     364     364     172     184     110
+Load spilled vector from stack       431     962     962     713     723     166
+Vector moves                          10     100     100      70      72   3,055
+
+Vector rotate right                  696     696     696     696     696     696
+XXLANDC or VANDC 209 600  600 600  

[gcc] Created branch 'meissner/heads/work199-sha' in namespace 'refs/users'

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-sha' was created in namespace 'refs/users' 
pointing to:

 34cf8c8901ed... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work199-bugs)] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:e5bef7d99ae23af4d111e3ab0417028963df9b46

commit e5bef7d99ae23af4d111e3ab0417028963df9b46
Author: Michael Meissner 
Date:   Thu Apr 3 17:08:12 2025 -0400

Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

Bernhard Reutner-Fischer pointed out some typos in the patch for PR 118541.
Here is the changed patch.

In bug PR target/118541 on power9, power10, and power11 systems, for the
function:

extern double __ieee754_acos (double);

double
__acospi (double x)
{
  double ret = __ieee754_acos (x) / 3.14;
  return __builtin_isgreater (ret, 1.0) ? 1.0 : ret;
}

GCC currently generates the following code:

Power9                          Power10 and Power11
======                          ===================
bl __ieee754_acos               bl __ieee754_acos@notoc
nop                             plfd 0,.LC0@pcrel
addis 9,2,.LC2@toc@ha           xxspltidp 12,1065353216
addi 1,1,32                     addi 1,1,32
lfd 0,.LC2@toc@l(9)             ld 0,16(1)
addis 9,2,.LC0@toc@ha           fdiv 0,1,0
ld 0,16(1)                      mtlr 0
lfd 12,.LC0@toc@l(9)            xscmpgtdp 1,0,12
fdiv 0,1,0                      xxsel 1,0,12,1
mtlr 0                          blr
xscmpgtdp 1,0,12
xxsel 1,0,12,1
blr

This is because ifcvt.c optimizes the conditional floating point move to use the
XSCMPGTDP instruction.

However, the XSCMPGTDP instruction will generate an interrupt if one of the
arguments is a signalling NaN and signalling NaNs are allowed to generate an
interrupt.  The IEEE comparison functions (isgreater, etc.) require that the
comparison not raise an interrupt.

The following patch changes the PowerPC back end so that ifcvt.c will not
convert the if/then test and move into a conditional move if the comparison is
one of the comparisons that do not raise an error with signalling NaNs and
-Ofast is not used.  If a normal comparison is used or -Ofast is used, GCC will
continue to generate XSCMPGTDP and XXSEL.

For the following code:

double
ordered_compare (double a, double b, double c, double d)
{
  return __builtin_isgreater (a, b) ? c : d;
}

/* Verify normal > does generate xscmpgtdp.  */

double
normal_compare (double a, double b, double c, double d)
{
  return a > b ? c : d;
}

with the following patch, GCC generates the following for power9, power10, and
power11:

ordered_compare:
fcmpu 0,1,2
fmr 1,4
bnglr 0
fmr 1,3
blr

normal_compare:
xscmpgtdp 1,1,2
xxsel 1,4,3,1
blr

I have built bootstrap compilers on big endian power9 systems and little endian
power9/power10 systems and there were no regressions.  Can I check this patch
into the GCC trunk, and after a waiting period, can I check this into the active
older branches?

2025-04-03  Michael Meissner  

gcc/

PR target/118541
* config/rs6000/predicates.md (invert_fpmask_comparison_operator): Do
not allow UNLT and UNLE unless -ffast-math.
* config/rs6000/rs6000-protos.h (enum rev_cond_ordered): New enumeration.
(rs6000_reverse_condition): Add argument.
* config/rs6000/rs6000.cc (rs6000_reverse_condition): Do not allow
ordered comparisons to be reversed for floating point conditional moves,
but allow ordered comparisons to be reversed on jumps.
(rs6000_emit_sCOND): Adjust rs6000_reverse_condition call.
* config/rs6000/rs6000.h (REVERSE_CONDITION): Likewise.
* config/rs6000/rs6000.md (reverse_branch_comparison): Name insn.
Adjust rs6000_reverse_condition calls.

gcc/testsuite/

PR target/118541
* gcc.target/powerpc/pr118541-1.c: New test.
* gcc.target/powerpc/pr118541-2.c: Likewise.
* gcc.target/powerpc/pr118541-3.c: Likewise.
* gcc.target/powerpc/pr118541-4.c: Likewise.
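
The ChangeLog entries above restrict when a floating point comparison may be
reversed.  As a minimal sketch of why reversal needs care (this is not code
from the patch, and the function names are hypothetical): the logical inverse
of "greater than" is "unordered or less than or equal", which differs from a
plain "less than or equal" whenever an operand is a NaN.

```c
/* Sketch only: reversing a floating point comparison in the presence of
   NaNs.  Function names are hypothetical.  */

/* True whenever "a > b" is false, including when either value is a NaN.
   This is the exact inverse of GT.  */
int
not_greater (double a, double b)
{
  return !(a > b);
}

/* True only when both values are ordered and a <= b.  Using this as the
   inverse of GT is only valid when NaNs cannot occur.  */
int
less_or_equal (double a, double b)
{
  return a <= b;
}
```

For ordinary values the two functions agree.  With a NaN operand, for example
__builtin_nan (""), not_greater returns 1 while less_or_equal returns 0.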

Diff:
---
 gcc/config/rs6000/predicates.md   | 10 +-
 gcc/config/rs6000/rs6000-protos.h | 17 +-
 gcc/config/rs6000/rs6000.cc   | 46 ---
 gcc/config/rs6000/rs6000.h| 10 --
 gcc/config/rs6000/rs6000.md   | 25 +--
 gcc/testsuite/gcc.target/powerpc/pr118541-1.c | 28 +

[gcc(refs/users/meissner/heads/work199-bugs)] Update ChangeLog.*

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:15e5c65d551249f51233ab699959b8ea7771ee4a

commit 15e5c65d551249f51233ab699959b8ea7771ee4a
Author: Michael Meissner 
Date:   Thu Apr 3 17:14:19 2025 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.bugs | 195 +
 1 file changed, 195 insertions(+)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
index 8f82235b2084..46ab980fcaac 100644
--- a/gcc/ChangeLog.bugs
+++ b/gcc/ChangeLog.bugs
@@ -1,5 +1,200 @@
+ Branch work199-bugs, patch #202 
+
+PR 99293: Optimize splat of a V2DF/V2DI extract with constant element
+
+We had optimizations for splat of a vector extract for the other vector
+types, but we missed having one for V2DI and V2DF.  This patch adds a
+combiner insn to do this optimization.
+
+In looking at the source, we had similar optimizations for V4SI and V4SF
+extract and splats, but we missed doing V2DI/V2DF.
+
+Without the patch for the code:
+
+   vector long long splat_dup_l_0 (vector long long v)
+   {
+ return __builtin_vec_splats (__builtin_vec_extract (v, 0));
+   }
+
+the compiler generates (on a little endian power9):
+
+   splat_dup_l_0:
+   mfvsrld 9,34
+   mtvsrdd 34,9,9
+   blr
+
+Now it generates:
+
+   splat_dup_l_0:
+   xxpermdi 34,34,34,3
+   blr
+
+2025-04-03  Michael Meissner  
+
+gcc/
+
+   PR target/99293
+   * config/rs6000/vsx.md (vsx_splat_extract_): New insn.
+
+gcc/testsuite/
+
+   PR target/99293
+   * gcc.target/powerpc/builtins-1.c: Adjust insn count.
+   * gcc.target/powerpc/pr99293.c: New test.
+
+ Branch work199-bugs, patch #201 
+
+PR target/108958 -- use mtvsrdd to zero extend GPR DImode to VSX TImode
+
+Previously GCC would zero extend a DImode GPR value to TImode by first zero
+extending the DImode value into a GPR TImode value, and then do a MTVSRDD to
+move this value to a VSX register.
+
+This patch does the move directly, since if the middle argument to MTVSRDD is 0,
+it does the zero extend.
+
+If the DImode value is already in a vector register, it does a XXSPLTIB and
+XXPERMDI to get the value into the bottom 64-bits of the register.
+
+I have built GCC with the patches in this patch set applied on both little and
+big endian PowerPC systems and there were no regressions.  Can I apply this
+patch to GCC 15?
+
+2025-04-03  Michael Meissner  
+
+gcc/
+
+   PR target/108958
+   * gcc/config/rs6000/rs6000.md (zero_extendditi2): New insn.
+
+gcc/testsuite/
+
+   PR target/108958
+   * gcc.target/powerpc/pr108958.c: New test.
+
+ Branch work199-bugs, patch #200 
+
+Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.
+
+Bernhard Reutner-Fischer pointed out some typos in the patch for PR 118541.
+Here is the changed patch.
+
+In bug PR target/118541 on power9, power10, and power11 systems, for the
+function:
+
+extern double __ieee754_acos (double);
+
+double
+__acospi (double x)
+{
+  double ret = __ieee754_acos (x) / 3.14;
+  return __builtin_isgreater (ret, 1.0) ? 1.0 : ret;
+}
+
+GCC currently generates the following code:
+
+Power9                          Power10 and Power11
+======                          ===================
+bl __ieee754_acos               bl __ieee754_acos@notoc
+nop                             plfd 0,.LC0@pcrel
+addis 9,2,.LC2@toc@ha           xxspltidp 12,1065353216
+addi 1,1,32                     addi 1,1,32
+lfd 0,.LC2@toc@l(9)             ld 0,16(1)
+addis 9,2,.LC0@toc@ha           fdiv 0,1,0
+ld 0,16(1)                      mtlr 0
+lfd 12,.LC0@toc@l(9)            xscmpgtdp 1,0,12
+fdiv 0,1,0                      xxsel 1,0,12,1
+mtlr 0                          blr
+xscmpgtdp 1,0,12
+xxsel 1,0,12,1
+blr
+
+This is because ifcvt.c optimizes the conditional floating point move to use the
+XSCMPGTDP instruction.
+
+However, the XSCMPGTDP instruction will generate an interrupt if one of the
+arguments is a signalling NaN and signalling NaNs can generate an interrupt.
+The IEEE comparison functions (isgreater, etc.) require that the comparison not
+raise an interrupt.
+
+The following patch changes the PowerPC back end so that ifcvt.c will not change
+the if/then test and move into a conditional move if the comparison is one of
+the comparisons that do not raise an error with signalling NaNs and -Ofast is
+not used.  If a normal comparison is used or -Ofast is used, GCC will continue
+to generate XSCMPGTDP and XXSEL.
+
+For the following code:
+
+double
+ordered_compare (double a, double b, double c, double d)
+{
+  return __builtin_isgreater (a, b) ? c : d;
+}
+
+/* Verify normal 

[gcc(refs/users/meissner/heads/work199-bugs)] PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:ed66f877926df566d290774c9919e91777058e8a

commit ed66f877926df566d290774c9919e91777058e8a
Author: Michael Meissner 
Date:   Thu Apr 3 17:11:33 2025 -0400

PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

We had optimizations for splat of a vector extract for the other vector
types, but we missed having one for V2DI and V2DF.  This patch adds a
combiner insn to do this optimization.

In looking at the source, we had similar optimizations for V4SI and V4SF
extract and splats, but we missed doing V2DI/V2DF.

Without the patch for the code:

vector long long splat_dup_l_0 (vector long long v)
{
  return __builtin_vec_splats (__builtin_vec_extract (v, 0));
}

the compiler generates (on a little endian power9):

splat_dup_l_0:
mfvsrld 9,34
mtvsrdd 34,9,9
blr

Now it generates:

splat_dup_l_0:
xxpermdi 34,34,34,3
blr
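
The single XXPERMDI above selects the same doubleword for both halves of the
result.  The following is a minimal sketch, not part of the patch, that models
this: xxpermdi_model mimics the doubleword selection and splat_dm mirrors how
the immediate is chosen from the element number and endianness.  Both names
and the struct layout are hypothetical, and dword[0] is assumed to be the most
significant doubleword of the register.

```c
#include <stdint.h>

/* Sketch only: a 128-bit VSX register viewed as two doublewords, with
   dword[0] assumed to be the most significant one.  */
struct v2di_model
{
  uint64_t dword[2];
};

/* Model of XXPERMDI: the high bit of DM picks the doubleword taken from A,
   the low bit of DM picks the doubleword taken from B.  */
struct v2di_model
xxpermdi_model (struct v2di_model a, struct v2di_model b, int dm)
{
  struct v2di_model r;

  r.dword[0] = a.dword[(dm >> 1) & 1];
  r.dword[1] = b.dword[dm & 1];
  return r;
}

/* Choose the DM immediate that splats ELEMENT (0 or 1).  On little endian
   the element numbering is reversed relative to the register layout.  */
int
splat_dm (int element, int big_endian)
{
  int which_word = big_endian ? element : 1 - element;

  return which_word ? 3 : 0;
}
```

For element 0 on a little endian system splat_dm returns 3, which matches the
"xxpermdi 34,34,34,3" shown above.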

2025-04-03  Michael Meissner  

gcc/

PR target/99293
* config/rs6000/vsx.md (vsx_splat_extract_): New insn.

gcc/testsuite/

PR target/99293
* gcc.target/powerpc/builtins-1.c: Adjust insn count.
* gcc.target/powerpc/pr99293.c: New test.

Diff:
---
 gcc/config/rs6000/vsx.md  | 18 ++
 gcc/testsuite/gcc.target/powerpc/builtins-1.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr99293.c| 22 ++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index dd3573b80868..d84a2a357a31 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4798,6 +4798,24 @@
   "lxvdsx %x0,%y1"
   [(set_attr "type" "vecload")])
 
+;; Optimize SPLAT of an extract from a V2DF/V2DI vector with a constant element
+(define_insn "*vsx_splat_extract_"
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")
+   (vec_duplicate:VSX_D
+(vec_select:
+ (match_operand:VSX_D 1 "vsx_register_operand" "wa")
+ (parallel [(match_operand 2 "const_0_to_1_operand" "n")]]
+  "VECTOR_MEM_VSX_P (mode)"
+{
+  int which_word = INTVAL (operands[2]);
+  if (!BYTES_BIG_ENDIAN)
+which_word = 1 - which_word;
+
+  operands[3] = GEN_INT (which_word ? 3 : 0);
+  return "xxpermdi %x0,%x1,%x1,%3";
+}
+  [(set_attr "type" "vecperm")])
+
 ;; V4SI splat support
 (define_insn "vsx_splat_v4si"
   [(set (match_operand:V4SI 0 "vsx_register_operand" "=wa,wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-1.c b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
index 8410a5fd4319..4e7e5384675f 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
@@ -1035,4 +1035,4 @@ foo156 (vector unsigned short usa)
 /* { dg-final { scan-assembler-times {\mvmrglb\M} 3 } } */
 /* { dg-final { scan-assembler-times {\mvmrgew\M} 4 } } */
 /* { dg-final { scan-assembler-times {\mvsplth|xxsplth\M} 4 } } */
-/* { dg-final { scan-assembler-times {\mxxpermdi\M} 44 } } */
+/* { dg-final { scan-assembler-times {\mxxpermdi\M} 42 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr99293.c b/gcc/testsuite/gcc.target/powerpc/pr99293.c
new file mode 100644
index ..20adc1f27f65
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr99293.c
@@ -0,0 +1,22 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+/* Test for PR 99293, which wants to do:
+   __builtin_vec_splats (__builtin_vec_extract (v, n))
+
+   where v is a V2DF or V2DI vector and n is either 0 or 1.  Previously the
+   compiler would do a direct move to the GPR registers to select the item and a
+   direct move from the GPR registers to do the splat.  */
+
+vector long long splat_dup_l_0 (vector long long v)
+{
+  return __builtin_vec_splats (__builtin_vec_extract (v, 0));
+}
+
+vector long long splat_dup_l_1 (vector long long v)
+{
+  return __builtin_vec_splats (__builtin_vec_extract (v, 1));
+}
+
+/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */


[gcc(refs/users/meissner/heads/work199-bugs)] PR target/108958 -- use mtvsrdd to zero extend GPR DImode to VSX TImode

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:1ec01865ae1d81b4c75955ad0b4034293459fa8a

commit 1ec01865ae1d81b4c75955ad0b4034293459fa8a
Author: Michael Meissner 
Date:   Thu Apr 3 17:10:28 2025 -0400

PR target/108958 -- use mtvsrdd to zero extend GPR DImode to VSX TImode

Previously GCC would zero extend a DImode GPR value to TImode by first zero
extending the DImode value into a GPR TImode value, and then do a MTVSRDD to
move this value to a VSX register.

This patch does the move directly, since if the middle argument to MTVSRDD is 0,
it does the zero extend.

If the DImode value is already in a vector register, it does a XXSPLTIB and
XXPERMDI to get the value into the bottom 64-bits of the register.
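
As a rough model of why MTVSRDD with a middle argument of 0 performs the zero
extend (a sketch only, not code from the patch): the instruction builds a
128-bit value from two 64-bit halves, and a first source operand of 0 means
"use zero" for the upper half rather than the contents of GPR 0.  The names
vsx_reg_model and mtvsrdd_model are hypothetical, and a null pointer stands in
for an RA field of 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch only: a 128-bit VSX register as a high and a low doubleword.  */
struct vsx_reg_model
{
  uint64_t hi;
  uint64_t lo;
};

/* Model of "mtvsrdd XT,RA,RB".  A null RA models an RA field of 0, in which
   case the upper doubleword is set to zero.  */
struct vsx_reg_model
mtvsrdd_model (const uint64_t *ra, uint64_t rb)
{
  struct vsx_reg_model r;

  r.hi = ra ? *ra : 0;
  r.lo = rb;
  return r;
}
```

Calling mtvsrdd_model (NULL, x) therefore yields a value whose upper 64 bits
are zero and whose lower 64 bits are x, which is the zero extension of x to
128 bits.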

I have built GCC with the patches in this patch set applied on both little and
big endian PowerPC systems and there were no regressions.  Can I apply this
patch to GCC 15?

2025-04-03  Michael Meissner  

gcc/

PR target/108958
* gcc/config/rs6000/rs6000.md (zero_extendditi2): New insn.

gcc/testsuite/

PR target/108958
* gcc.target/powerpc/pr108958.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000.md | 46 +
 gcc/testsuite/gcc.target/powerpc/pr108958.c | 27 +
 2 files changed, 73 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 02c31b576b65..4c9e2dc6390b 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -1026,6 +1026,52 @@
(set_attr "dot" "yes")
(set_attr "length" "4,8")])
 
+(define_insn_and_split "zero_extendditi2"
+  [(set (match_operand:TI 0 "gpc_reg_operand" "=r,wa,&wa")
+   (zero_extend:TI
+(match_operand:DI 1 "gpc_reg_operand" "rwa,r,wa")))]
+  "TARGET_P9_VECTOR && TARGET_POWERPC64"
+  "@
+  #
+  mtvsrdd %x0,0,%1
+  #"
+  "&& reload_completed
+   && (int_reg_operand (operands[0], TImode)
+   || vsx_register_operand (operands[1], DImode))"
+  [(set (match_dup 2)
+   (match_dup 3))
+   (set (match_dup 4)
+   (match_dup 5))]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  int r = reg_or_subregno (op0);
+
+  if (int_reg_operand (op0, TImode))
+{
+  int lo = BYTES_BIG_ENDIAN ? 1 : 0;
+  int hi = 1 - lo;
+
+  operands[2] = gen_rtx_REG (DImode, r + lo);
+  operands[3] = op1;
+  operands[4] = gen_rtx_REG (DImode, r + hi);
+  operands[5] = const0_rtx;
+}
+  else
+{
+  rtx op0_di = gen_rtx_REG (DImode, r);
+  rtx op0_v2di = gen_rtx_REG (V2DImode, r);
+  rtx lo = WORDS_BIG_ENDIAN ? op1 : op0_di;
+  rtx hi = WORDS_BIG_ENDIAN ? op0_di : op1;
+
+  operands[2] = op0_v2di;
+  operands[3] = CONST0_RTX (V2DImode);
+  operands[4] = op0_v2di;
+  operands[5] = gen_rtx_VEC_CONCAT (V2DImode, hi, lo);
+}
+}
+  [(set_attr "type" "*,mtvsr,vecperm")
+   (set_attr "length" "8,*,8")])
 
 (define_insn "extendqi2"
   [(set (match_operand:EXTQI 0 "gpc_reg_operand" "=r,?*v")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr108958.c b/gcc/testsuite/gcc.target/powerpc/pr108958.c
new file mode 100644
index ..03eb58d069e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr108958.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int128 } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
+
+/* PR target/108958, use mtvsrdd to zero extend gpr to vsx register.  */
+
+void
+gpr_to_vsx (unsigned long long x, __uint128_t *p)
+{
+  /* mtvsrdd vsx,0,gpr.  */
+  __uint128_t y = x;
+  __asm__ (" # %x0" : "+wa" (y));
+  *p = y;
+}
+
+void
+gpr_to_gpr (unsigned long long x, __uint128_t *p)
+{
+  /* mr and li.  */
+  __uint128_t y = x;
+  __asm__ (" # %0" : "+r" (y));
+  *p = y;
+}
+
+/* { dg-final { scan-assembler-times {\mli\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrdd .*,0,.*\M} 1 } } */


[gcc(refs/users/meissner/heads/work199-paddis)] RFC2653-PowerPC: Add support for 1,024 bit DMR registers.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:0191c3ad6f4501f9093f4522bfcf860bc05bb111

commit 0191c3ad6f4501f9093f4522bfcf860bc05bb111
Author: Michael Meissner 
Date:   Thu Apr 3 17:16:15 2025 -0400

RFC2653-PowerPC: Add support for 1,024 bit DMR registers.

This patch is a preliminary patch to add the full 1,024 bit dense math registers
(DMRs) for -mcpu=future.  The MMA 512-bit accumulators map onto the top of the
DMR register.

This patch only adds the new 1,024 bit register support.  It does not add
support for any instructions that need 1,024 bit registers instead of 512 bit
registers.

I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit
registers.  The 'wD' constraint added in previous patches is used for these
registers.  I added support to do load and store of DMRs via the VSX registers,
since there are no load/store dense math instructions.  I added the new keyword
'__dmr' to create 1,024 bit types that can be loaded into DMRs.  At present, I
don't have aliases for __dmr512 and __dmr1024 that we've discussed internally.

The patches have been tested on both little and big endian systems.  Can I check
it into the master branch?

2025-04-03   Michael Meissner  

gcc/

* config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec.
(UNSPEC_DM_INSERT512_LOWER): Likewise.
(UNSPEC_DM_EXTRACT512): Likewise.
(UNSPEC_DMR_RELOAD_FROM_MEMORY): Likewise.
(UNSPEC_DMR_RELOAD_TO_MEMORY): Likewise.
(movtdo): New define_expand and define_insn_and_split to implement 1,024
bit DMR registers.
(movtdo_insert512_upper): New insn.
(movtdo_insert512_lower): Likewise.
(movtdo_extract512): Likewise.
(reload_dmr_from_memory): Likewise.
(reload_dmr_to_memory): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMR
support.
(rs6000_init_builtins): Add support for __dmr keyword.
* config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add support
for TDOmode.
(rs6000_function_arg): Likewise.
* config/rs6000/rs6000-modes.def (TDOmode): New mode.
* config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add
support for TDOmode.
(rs6000_hard_regno_mode_ok_uncached): Likewise.
(rs6000_hard_regno_mode_ok): Likewise.
(rs6000_modes_tieable_p): Likewise.
(rs6000_debug_reg_global): Likewise.
(rs6000_setup_reg_addr_masks): Likewise.
(rs6000_init_hard_regno_mode_ok): Add support for TDOmode.  Setup reload
hooks for DMR mode.
(reg_offset_addressing_ok_p): Add support for TDOmode.
(rs6000_emit_move): Likewise.
(rs6000_secondary_reload_simple_move): Likewise.
(rs6000_preferred_reload_class): Likewise.
(rs6000_secondary_reload_class): Likewise.
(rs6000_mangle_type): Add mangling for __dmr type.
(rs6000_dmr_register_move_cost): Add support for TDOmode.
(rs6000_split_multireg_move): Likewise.
(rs6000_invalid_conversion): Likewise.
* config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode.
(enum rs6000_builtin_type_index): Add DMR type nodes.
(dmr_type_node): Likewise.
(ptr_dmr_type_node): Likewise.

gcc/testsuite/

* gcc.target/powerpc/dm-1024bit.c: New test.
* lib/target-supports.exp (check_effective_target_ppc_dmr_ok): New
target test.

RFC2653-Add support for dense math registers.

The MMA subsystem added the notion of accumulator registers as an optional
feature of ISA 3.1 (power10).  In ISA 3.1, these accumulators overlapped with
the VSX registers 0..31, but logically the accumulator registers were separate
from the FPR registers.  In ISA 3.1, it was anticipated that in future systems,
the accumulator registers may not overlap with the FPR registers.  This patch
adds the support for dense math registers as separate registers.

This particular patch does not change the MMA support to use the accumulators
within the dense math registers.  This patch just adds the basic support for
having separate DMRs.  The next patch will switch the MMA support to use the
accumulators if -mcpu=future is used.

For testing purposes, I added an undocumented option '-mdense-math' to enable
or disable the dense math support.

This patch updates the wD constraint added in the previous patch.  If MMA is
selected but dense math is not selected (i.e. -mcpu=power10), the wD constraint
will allow access to accumulators that overlap with VSX registers 0..31.  If
both MMA and dense math are selected (i.e. -mcpu=future), the

[gcc(refs/users/meissner/heads/work199-paddis)] RFC2655-Add saturating subtract built-ins.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:e7407e11739d5407c18a13b4d634726c6a7fdf00

commit e7407e11739d5407c18a13b4d634726c6a7fdf00
Author: Michael Meissner 
Date:   Thu Apr 3 16:52:18 2025 -0400

RFC2655-Add saturating subtract built-ins.

This patch adds support for a saturating subtract built-in function that may be
added to a future PowerPC processor.  Note, if it is added, the name of the
built-in function may change before GCC 13 is released.  If the name changes,
we will submit a patch changing the name.
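
This message does not spell out the exact semantics of the proposed built-in.
As a general illustration of what a saturating subtract means, the following
is a minimal sketch of a signed 32-bit version written in portable GNU C.  The
function name sat_sub32_sketch is hypothetical, and the assumption that the
result clamps to INT32_MIN and INT32_MAX is mine, not taken from the patch; the
real instruction may saturate differently.

```c
#include <stdint.h>

/* Sketch only: signed 32-bit saturating subtract.  Assumption: on overflow
   the result clamps to INT32_MIN or INT32_MAX.  */
int32_t
sat_sub32_sketch (int32_t a, int32_t b)
{
  int32_t result;

  /* __builtin_sub_overflow returns true when a - b does not fit.  */
  if (__builtin_sub_overflow (a, b, &result))
    /* Subtracting a positive value can only overflow downwards;
       subtracting a negative value can only overflow upwards.  */
    return b > 0 ? INT32_MIN : INT32_MAX;

  return result;
}
```

For example, sat_sub32_sketch (5, 3) is 2, while sat_sub32_sketch (INT32_MIN, 1)
stays at INT32_MIN instead of wrapping around.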

I also added support for providing dense math built-in functions, even though
at present, we have not added any new built-in functions for dense math.  It is
likely we will want to add new dense math built-in functions as the dense math
support is fleshed out.

The patches have been tested on both little and big endian systems.  Can I check
it into the master branch?

2025-03-27   Michael Meissner  

gcc/

* config/rs6000/rs6000-builtin.cc (rs6000_invalid_builtin): Add support
for flagging invalid use of future built-in functions.
(rs6000_builtin_is_supported): Add support for future built-in
functions.
* config/rs6000/rs6000-builtins.def (__builtin_saturate_subtract32): New
built-in function for -mcpu=future.
(__builtin_saturate_subtract64): Likewise.
* config/rs6000/rs6000-gen-builtins.cc (enum bif_stanza): Add stanzas
for -mcpu=future built-ins.
(stanza_map): Likewise.
(enable_string): Likewise.
(struct attrinfo): Likewise.
(parse_bif_attrs): Likewise.
(write_decls): Likewise.
* config/rs6000/rs6000.md (sat_sub3): Add saturating subtract
built-in insn declarations.
(sat_sub3_dot): Likewise.
(sat_sub3_dot2): Likewise.
* doc/extend.texi (Future PowerPC built-ins): New section.

gcc/testsuite/

* gcc.target/powerpc/subfus-1.c: New test.
* gcc.target/powerpc/subfus-2.c: Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc | 17 
 gcc/config/rs6000/rs6000-builtins.def   | 10 +
 gcc/config/rs6000/rs6000-gen-builtins.cc| 35 ++---
 gcc/config/rs6000/rs6000.md | 60 +
 gcc/doc/extend.texi | 24 
 gcc/testsuite/gcc.target/powerpc/subfus-1.c | 32 +++
 gcc/testsuite/gcc.target/powerpc/subfus-2.c | 32 +++
 7 files changed, 205 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index d8ff7cf32dfd..7a56175ebe52 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -139,6 +139,17 @@ rs6000_invalid_builtin (enum rs6000_gen_builtins fncode)
 case ENB_MMA:
   error ("%qs requires the %qs option", name, "-mmma");
   break;
+case ENB_FUTURE:
+  error ("%qs requires the %qs option", name, "-mcpu=future");
+  break;
+case ENB_FUTURE_64:
+  error ("%qs requires the %qs option and either the %qs or %qs option",
+name, "-mcpu=future", "-m64", "-mpowerpc64");
+  break;
+case ENB_DM:
+  error ("%qs requires the %qs or %qs options", name, "-mcpu=future",
+"-mdense-math");
+  break;
 default:
 case ENB_ALWAYS:
   gcc_unreachable ();
@@ -194,6 +205,12 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins fncode)
   return TARGET_HTM;
 case ENB_MMA:
   return TARGET_MMA;
+case ENB_FUTURE:
+  return TARGET_FUTURE;
+case ENB_FUTURE_64:
+  return TARGET_FUTURE && TARGET_POWERPC64;
+case ENB_DM:
+  return TARGET_DENSE_MATH;
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 555d7d589506..eef5f41f7615 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -137,6 +137,8 @@
 ;   endian   Needs special handling for endianness
 ;   ibmldRestrict usage to the case when TFmode is IBM-128
 ;   ibm128   Restrict usage to the case where __ibm128 is supported or if ibmld
+;   future   Restrict usage to future instructions
+;   dm   Restrict usage to dense math
 ;
 ; Each attribute corresponds to extra processing required when
 ; the built-in is expanded.  All such special processing should
@@ -3924,3 +3926,11 @@
 
   void __builtin_vsx_stxvp (v256, unsigned long, const v256 *);
 STXVP nothing {mma,pair}
+
+[future]
+  const signed int __builtin_saturate_subtract32 (signed int, signed int);
+  SAT_SUBSI sat_subsi3 {}
+
+[future-64]
+  const signed long __builtin_saturate_subtract64 (signed long,  signed long);
+  SAT_SUBDI sat_subdi3 {}
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.cc 
b/gcc/confi

[gcc(refs/users/meissner/heads/work199-paddis)] Update ChangeLog.*

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:7740bd23ee8b87da2d5157630ff93339c05f92b5

commit 7740bd23ee8b87da2d5157630ff93339c05f92b5
Author: Michael Meissner 
Date:   Thu Apr 3 17:17:57 2025 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.dmf | 239 ++
 1 file changed, 239 insertions(+)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
new file mode 100644
index ..b53858e5
--- /dev/null
+++ b/gcc/ChangeLog.dmf
@@ -0,0 +1,239 @@
+ Branch work199-dmf, patch #102 
+
+RFC2653-PowerPC: Add support for 1,024 bit DMR registers.
+
+This patch is a preliminary patch to add the full 1,024 bit dense math registers
+(DMRs) for -mcpu=future.  The MMA 512-bit accumulators map onto the top of the
+DMR register.
+
+This patch only adds the new 1,024 bit register support.  It does not add
+support for any instructions that need 1,024 bit registers instead of 512 bit
+registers.
+
+I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit
+registers.  The 'wD' constraint added in previous patches is used for these
+registers.  I added support to do load and store of DMRs via the VSX registers,
+since there are no load/store dense math instructions.  I added the new keyword
+'__dmr' to create 1,024 bit types that can be loaded into DMRs.  At present, I
+don't have aliases for __dmr512 and __dmr1024 that we've discussed internally.
+
+The patches have been tested on both little and big endian systems.  Can I check
+it into the master branch?
+
+2025-03-27   Michael Meissner  
+
+gcc/
+
+   * config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec.
+   (UNSPEC_DM_INSERT512_LOWER): Likewise.
+   (UNSPEC_DM_EXTRACT512): Likewise.
+   (UNSPEC_DMR_RELOAD_FROM_MEMORY): Likewise.
+   (UNSPEC_DMR_RELOAD_TO_MEMORY): Likewise.
+   (movtdo): New define_expand and define_insn_and_split to implement 1,024
+   bit DMR registers.
+   (movtdo_insert512_upper): New insn.
+   (movtdo_insert512_lower): Likewise.
+   (movtdo_extract512): Likewise.
+   (reload_dmr_from_memory): Likewise.
+   (reload_dmr_to_memory): Likewise.
+   * config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMR
+   support.
+   (rs6000_init_builtins): Add support for __dmr keyword.
+   * config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add support
+   for TDOmode.
+   (rs6000_function_arg): Likewise.
+   * config/rs6000/rs6000-modes.def (TDOmode): New mode.
+   * config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add
+   support for TDOmode.
+   (rs6000_hard_regno_mode_ok_uncached): Likewise.
+   (rs6000_hard_regno_mode_ok): Likewise.
+   (rs6000_modes_tieable_p): Likewise.
+   (rs6000_debug_reg_global): Likewise.
+   (rs6000_setup_reg_addr_masks): Likewise.
+   (rs6000_init_hard_regno_mode_ok): Add support for TDOmode.  Setup reload
+   hooks for DMR mode.
+   (reg_offset_addressing_ok_p): Add support for TDOmode.
+   (rs6000_emit_move): Likewise.
+   (rs6000_secondary_reload_simple_move): Likewise.
+   (rs6000_preferred_reload_class): Likewise.
+   (rs6000_secondary_reload_class): Likewise.
+   (rs6000_mangle_type): Add mangling for __dmr type.
+   (rs6000_dmr_register_move_cost): Add support for TDOmode.
+   (rs6000_split_multireg_move): Likewise.
+   (rs6000_invalid_conversion): Likewise.
+   * config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode.
+   (enum rs6000_builtin_type_index): Add DMR type nodes.
+   (dmr_type_node): Likewise.
+   (ptr_dmr_type_node): Likewise.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/dm-1024bit.c: New test.
+   * lib/target-supports.exp (check_effective_target_ppc_dmr_ok): New
+   target test.
+
+ Branch work199-dmf, patch #101 
+
+RFC2653-Add support for dense math registers.
+
+The MMA subsystem added the notion of accumulator registers as an optional
+feature of ISA 3.1 (power10).  In ISA 3.1, these accumulators overlapped with
+the VSX registers 0..31, but logically the accumulator registers were separate
+from the FPR registers.  In ISA 3.1, it was anticipated that in future systems,
+the accumulator registers may not overlap with the FPR registers.  This patch
+adds the support for dense math registers as separate registers.
+
+This particular patch does not change the MMA support to use the accumulators
+within the dense math registers.  This patch just adds the basic support for
+having separate DMRs.  The next patch will switch the MMA support to use the
+accumulators if -mcpu=future is used.
+
+For testing purposes, I added an undocumented option '-mdense-math' to enable
+or disable the dense math support.
+
+This patch updates the wD constraint added in the previous patch.  If MMA is
+selected but dense math is not selected (i.e. -mcpu=power10), the wD constraint
+will allow acce

[gcc(refs/users/meissner/heads/work199-submit)] Add ChangeLog.submit and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:bda5f3ddd59152e1a7ac967a66195e331245344a

commit bda5f3ddd59152e1a7ac967a66195e331245344a
Author: Michael Meissner 
Date:   Thu Apr 3 15:29:12 2025 -0400

Add ChangeLog.submit and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.submit: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.submit | 5 +
 gcc/REVISION | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.submit b/gcc/ChangeLog.submit
new file mode 100644
index ..9b0b1e840f22
--- /dev/null
+++ b/gcc/ChangeLog.submit
@@ -0,0 +1,5 @@
+ Branch work199-submit, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..f90b025b5af7 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-submit branch


[gcc(refs/users/meissner/heads/work199-paddis)] Add ChangeLog.paddis and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:3d09f8d126ca2e9c24adf04bb563cc72928aee03

commit 3d09f8d126ca2e9c24adf04bb563cc72928aee03
Author: Michael Meissner 
Date:   Thu Apr 3 15:30:04 2025 -0400

Add ChangeLog.paddis and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.paddis: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.paddis | 5 +
 gcc/REVISION | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.paddis b/gcc/ChangeLog.paddis
new file mode 100644
index ..ef499f386bcd
--- /dev/null
+++ b/gcc/ChangeLog.paddis
@@ -0,0 +1,5 @@
+ Branch work199-paddis, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..60be995efd8d 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-paddis branch


[gcc] Created branch 'meissner/heads/work199-paddis' in namespace 'refs/users'

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-paddis' was created in namespace 'refs/users'
pointing to:

 34cf8c8901ed... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work199-orig)] Add REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:2d05704bdda50acb6d9c9e598580f21e7b700881

commit 2d05704bdda50acb6d9c9e598580f21e7b700881
Author: Michael Meissner 
Date:   Thu Apr 3 15:30:53 2025 -0400

Add REVISION.

2025-04-03  Michael Meissner  

gcc/

* REVISION: New file for branch.

Diff:
---
 gcc/REVISION | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/REVISION b/gcc/REVISION
new file mode 100644
index ..f72928d47193
--- /dev/null
+++ b/gcc/REVISION
@@ -0,0 +1 @@
+work199-orig branch


[gcc(refs/users/meissner/heads/work199-test)] Add ChangeLog.test and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c09fa69f394a46e5a292a17577398952ead94a12

commit c09fa69f394a46e5a292a17577398952ead94a12
Author: Michael Meissner 
Date:   Thu Apr 3 15:27:30 2025 -0400

Add ChangeLog.test and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.test: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.test | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
new file mode 100644
index ..6b729265d631
--- /dev/null
+++ b/gcc/ChangeLog.test
@@ -0,0 +1,5 @@
+ Branch work199-test, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..5f364831c648 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-test branch


[gcc(refs/users/meissner/heads/work199-math)] Add ChangeLog.math and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:7c9347563cb475bd1a2c09b70558ce19b954d19b

commit 7c9347563cb475bd1a2c09b70558ce19b954d19b
Author: Michael Meissner 
Date:   Thu Apr 3 15:28:23 2025 -0400

Add ChangeLog.math and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.math: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.math | 5 +
 gcc/REVISION   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.math b/gcc/ChangeLog.math
new file mode 100644
index ..126300a17a75
--- /dev/null
+++ b/gcc/ChangeLog.math
@@ -0,0 +1,5 @@
+ Branch work199-math, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..365a2f9e2106 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-math branch


[gcc] Created branch 'meissner/heads/work199-orig' in namespace 'refs/users'

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-orig' was created in namespace 'refs/users' 
pointing to:

 c669ab0a8666... rs6000: Add Cobol support to traceback table [PR119308]


[gcc(refs/users/meissner/heads/work199-vpair)] Vector pair support.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:e147bca6262a2b6ced84a6cc0f5457d261ba3a59

commit e147bca6262a2b6ced84a6cc0f5457d261ba3a59
Author: Michael Meissner 
Date:   Thu Apr 3 19:13:10 2025 -0400

Vector pair support.

This patch adds a new include file (vector-pair.h) that adds support so that
users writing high performance libraries can change their code to allow the
generation of the vector pair load and store instructions on power10.

The intention is that if the library authors need to write special loops that
go over arrays that they could modify their code to use the functions provided
to change loops that can take advantage of the higher bandwidth for load vector
pair and store instructions.

This particular patch just adds a new include file (vector-pair.h) that
provides a bunch of functions that on a power10 system would use the vector
pair load operation, 2 floating point operations, and a vector pair store.  It
does not add any new types, modes, or built-in function.

I have additional patches that can add built-in functions that the functions in
vector-pair.h could utilize so that the compiler can optimize and combine
operations.  I may submit those patches in the future, but I would like to
provide this patch to allow the library writer to optimize their code.

I've measured the performance of these new functions on a power10.  For default
unrolling, the percentage of change for the 3 methods over the normal vector
loop method:

116%    Vector-pair.h function, default unroll
 93%    Vector pair split built-in & 2 vector stores, default unroll
 86%    Vector pair split & combine built-ins, default unroll

Using explicit 2 way unrolling the numbers are:

114%    Vector-pair.h function, unroll 2
106%    Vector pair split built-in & 2 vector stores, unroll 2
 98%    Vector pair split & combine built-ins, unroll 2

These new functions provided in vector-pair.h use the vector pair load/store
instructions, and don't generate extra vector moves.  Using the existing
vector pair disassemble and assemble built-ins generate extra vector moves
which can hinder performance.

If I compile the loop code for power9, there is a minor speed up for default
unrolling and more of an improvement using the framework provided in the
vector-pair.h for explicit unrolling by 2:

101%    Vector-pair.h function, default unroll for power9
107%    Vector-pair.h function, unroll 2 for power9

Of course this is a synthetic benchmark run on a quiet power10 system.  Results
would vary for real code on real systems.  However, I feel adding these
functions can allow the writers of high performance libraries to better
optimize their code.

As an example, if the library wants to code a simple fused multiply-add loop,
they might write the code as follows:

#include 
#include 
#include 

void
fma_vector (double * __restrict__ r,
            const double * __restrict__ a,
            const double * __restrict__ b,
            size_t n)
{
  vector double * __restrict__ vr = (vector double * __restrict__)r;
  const vector double * __restrict__ va = (const vector double * __restrict__)a;
  const vector double * __restrict__ vb = (const vector double * __restrict__)b;
  size_t num_elements = sizeof (vector double) / sizeof (double);
  size_t nv = n / num_elements;
  size_t i;

  for (i = 0; i < nv; i++)
    vr[i] = __builtin_vsx_xvmadddp (va[i], vb[i], vr[i]);

  for (i = nv * num_elements; i < n; i++)
    r[i] = fma (a[i], b[i], r[i]);
}

The inner loop would look like:

.L3:
lxvx 0,3,9
lxvx 12,4,9
addi 10,9,16
addi 2,2,-2
lxvx 11,5,9
xvmaddadp 0,12,11
lxvx 12,4,10
lxvx 11,5,10
stxvx 0,3,9
lxvx 0,3,10
addi 9,9,32
xvmaddadp 0,12,11
stxvx 0,3,10
bdnz .L3
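
For readers without a PowerPC toolchain, the loop structure above (whole
vector-sized blocks first, then a scalar remainder) can be sketched in portable
C.  This is only an illustration and not part of the patch: the names
fma_blocked and LANES are hypothetical, and the sketch uses an unfused
multiply and add rather than fma or the VSX built-in, so it says nothing about
the code GCC would generate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant: number of doubles in one 128-bit vector.  */
#define LANES 2

/* Portable stand-in for the blocked loop: r[i] = a[i] * b[i] + r[i].
   The multiply-add here is not fused; a real implementation would use
   fma or the vector built-in shown in the commit message.  */
void
fma_blocked (double *restrict r,
             const double *restrict a,
             const double *restrict b,
             size_t n)
{
  assert (n == 0 || (r && a && b));

  size_t nv = n / LANES;        /* number of whole blocks */
  size_t i;

  /* Main loop: one block of LANES elements per iteration.  */
  for (i = 0; i < nv * LANES; i += LANES)
    for (size_t j = 0; j < LANES; j++)
      r[i + j] = a[i + j] * b[i + j] + r[i + j];

  /* Remainder loop: elements that do not fill a whole block.  */
  for (i = nv * LANES; i < n; i++)
    r[i] = a[i] * b[i] + r[i];
}
```

With n = 5 the main loop covers elements 0..3 and the remainder loop handles
element 4, which mirrors the nv / num_elements split in fma_vector.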

Now if you code the loop to use __builtin_vsx_disassemble_pair to do a vector
pair load, but then do 2 vector stores:

#include 
#include 
#include 

void
fma_mma_ld (double * __restrict__ r,
const double * __restrict__ a,
const double * __restrict__ b,
size_t n)
{
  __vector_pair * __restrict__ vp_r 

[gcc/devel/omp/gcc-14] OpenMP: Require target and/or targetsync init modifier [PR118965]

2025-04-03 Thread Sandra Loosemore via Gcc-cvs
https://gcc.gnu.org/g:be92d54c5622705b03f7db3e3e214b055bf82176

commit be92d54c5622705b03f7db3e3e214b055bf82176
Author: Sandra Loosemore 
Date:   Mon Mar 31 22:02:35 2025 +

OpenMP: Require target and/or targetsync init modifier [PR118965]

As noted in PR 118965, the initial interop implementation overlooked
the requirement in the OpenMP spec that at least one of the "target"
and "targetsync" modifiers is required in both the interop construct
init clause and the declare variant append_args clause.

Adding the check was fairly straightforward, but it broke about a
gazillion existing test cases.  In particular, things like "init (x, y)"
which were previously accepted (and tested for being accepted) aren't
supposed to be allowed by the spec, much less things like "init (target)"
where target was previously interpreted as a variable name instead of a
modifier.  Since one of the effects of the change is that at least one
modifier is always required, I found that deleting all the code that was
trying to detect and handle the no-modifier case allowed for better
diagnostics.
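
The rule described above can be illustrated with a toy check in C.  This is
not the code in c-parser.cc, parser.cc, or openmp.cc; the function name
init_modifiers_ok and the idea of passing the modifiers as an array of strings
are assumptions made purely for the sketch.  It encodes only the requirement
stated in this message: a clause is acceptable only if at least one of the
"target" and "targetsync" modifiers is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model of the check: MODS holds the N modifier keywords that
   precede the colon in an init clause.  For example, "init (target: x)"
   has the single modifier "target", while "init (x, y)" has none and is
   therefore rejected.  Hypothetical helper, not GCC's parser.  */
bool
init_modifiers_ok (const char *const *mods, size_t n)
{
  assert (n == 0 || mods != NULL);

  for (size_t i = 0; i < n; i++)
    if (strcmp (mods[i], "target") == 0
        || strcmp (mods[i], "targetsync") == 0)
      return true;

  /* Neither required modifier was seen, so diagnose the clause.  */
  return false;
}
```

Because every valid clause now carries at least one modifier, a checker of
this shape needs no special path for the no-modifier case, which is the
simplification the message describes.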

gcc/c/ChangeLog
PR middle-end/118965
* c-parser.cc (c_parser_omp_clause_init_modifiers): Adjust
error message.
(c_parser_omp_clause_init): Remove code for recognizing clauses
without modifiers.  Diagnose missing target/targetsync modifier.
(c_finish_omp_declare_variant): Diagnose missing target/targetsync
modifier.

gcc/cp/ChangeLog
PR middle-end/118965
* parser.cc (c_parser_omp_clause_init_modifiers): Adjust
error message.
(cp_parser_omp_clause_init): Remove code for recognizing clauses
without modifiers.  Diagnose missing target/targetsync modifier.
(cp_finish_omp_declare_variant): Diagnose missing target/targetsync
modifier.

gcc/fortran/ChangeLog
PR middle-end/118965
* openmp.cc (gfc_parser_omp_clause_init_modifiers): Fix some
inconsistent code indentation.  Remove code for recognizing
clauses without modifiers.  Diagnose prefer_type without a
following paren.  Adjust error message for an unrecognized modifier.
Diagnose missing target/targetsync modifier.
(gfc_match_omp_init): Fix more inconsistent code indentation.

gcc/testsuite/ChangeLog
PR middle-end/118965
* c-c++-common/gomp/append-args-1.c: Add target/targetsync
modifiers so tests do what they were previously supposed to do.
Adjust expected output.
* c-c++-common/gomp/append-args-7.c: Likewise.
* c-c++-common/gomp/append-args-8.c: Likewise.
* c-c++-common/gomp/append-args-9.c: Likewise.
* c-c++-common/gomp/interop-1.c: Likewise.
* c-c++-common/gomp/interop-2.c: Likewise.
* c-c++-common/gomp/interop-3.c: Likewise.
* c-c++-common/gomp/interop-4.c: Likewise.
* c-c++-common/gomp/pr118965-1.c: New.
* c-c++-common/gomp/pr118965-2.c: New.
* g++.dg/gomp/append-args-1.C: Add target/targetsync modifiers
and adjust expected output.
* g++.dg/gomp/append-args-2.C: Likewise.
* g++.dg/gomp/append-args-6.C: Likewise.
* g++.dg/gomp/append-args-7.C: Likewise.
* g++.dg/gomp/append-args-8.C: Likewise.
* g++.dg/gomp/interop-5.C: Likewise.
* gfortran.dg/gomp/append_args-1.f90: Add target/targetsync
modifiers and adjust expected output.
* gfortran.dg/gomp/append_args-2.f90: Likewise.
* gfortran.dg/gomp/append_args-3.f90: Likewise.
* gfortran.dg/gomp/append_args-4.f90: Likewise.
* gfortran.dg/gomp/interop-1.f90: Likewise.
* gfortran.dg/gomp/interop-2.f90: Likewise.
* gfortran.dg/gomp/interop-3.f90: Likewise.
* gfortran.dg/gomp/interop-4.f90: Likewise.
* gfortran.dg/gomp/pr118965-1.f90: New.
* gfortran.dg/gomp/pr118965-2.f90: New.

(cherry picked from commit aca8155c09001f269a20d6df438fa0e749dd5388)

Diff:
---
 gcc/c/c-parser.cc| 44 -
 gcc/cp/parser.cc | 45 +
 gcc/fortran/openmp.cc| 77 +++
 gcc/testsuite/c-c++-common/gomp/append-args-1.c  |  4 +-
 gcc/testsuite/c-c++-common/gomp/append-args-7.c  |  4 +-
 gcc/testsuite/c-c++-common/gomp/append-args-8.c  |  9 +--
 gcc/testsuite/c-c++-common/gomp/append-args-9.c  |  7 ++-
 gcc/testsuite/c-c++-common/gomp/interop-1.c  | 80 
 gcc/testsuite/c-c++-common/gomp/interop-2.c  | 64 ---
 gcc/testsuite/c-c++-common/gomp/interop-3.c  | 26 +++-
 gcc/testsuite/c-c++

[gcc/meissner/heads/work199-sha] (14 commits) Merge commit 'refs/users/meissner/heads/work199-sha' of git

2025-04-03 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work199-sha' was updated to point to:

 f3bbcfce9895... Merge commit 'refs/users/meissner/heads/work199-sha' of git

It previously pointed to:

 70afce512eb5... Add ChangeLog.sha and update REVISION.

Diff:

Summary of changes (added commits):
---

  f3bbcfc... Merge commit 'refs/users/meissner/heads/work199-sha' of git
  1a7b01d... Add ChangeLog.sha and update REVISION.
  c62780a... Update ChangeLog.* (*)
  76c081b... Use architecture flags for defining _ARCH_PWR macros. (*)
  aa1860c... Add rs6000 architecture masks. (*)
  9dacc68... Use vector pair load/store for memcpy with -mcpu=future (*)
  c590949... Add -mcpu=future tests. (*)
  230fbe1... Add -mcpu=future tuning support. (*)
  c099046... Add support for -mcpu=future (*)
  04725bf... Change TARGET_MODULO to TARGET_POWER9. (*)
  08e5c71... Change TARGET_POPCNTD to TARGET_POWER7. (*)
  b9b3a52... Change TARGET_CMPB to TARGET_POWER6. (*)
  d0dbdbf... Change TARGET_FPRND to TARGET_POWER5X. (*)
  31d7966... Change TARGET_POPCNTB to TARGET_POWER5. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work199-sha' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work199-vpair)] Update ChangeLog.*

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:21297519e31613e5cea2ea7ad921ecfe1527c6df

commit 21297519e31613e5cea2ea7ad921ecfe1527c6df
Author: Michael Meissner 
Date:   Thu Apr 3 19:14:50 2025 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.vpair | 420 
 1 file changed, 420 insertions(+)

diff --git a/gcc/ChangeLog.vpair b/gcc/ChangeLog.vpair
index f83306603144..a2d32ba404cd 100644
--- a/gcc/ChangeLog.vpair
+++ b/gcc/ChangeLog.vpair
@@ -1,5 +1,425 @@
+ Branch work199-vpair, patch #500 
+
+Vector pair support.
+
+This patch adds a new include file (vector-pair.h) that adds support so that
+users writing high performance libraries can change their code to allow the
+generation of the vector pair load and store instructions on power10.
+
+The intention is that if the library authors need to write special loops that
+go over arrays that they could modify their code to use the functions provided
+to change loops that can take advantage of the higher bandwidth for load vector
+pair and store instructions.
+
+This particular patch just adds a new include file (vector-pair.h) that
+provides a bunch of functions that on a power10 system would use the vector
+pair load operation, 2 floating point operations, and a vector pair store.  It
+does not add any new types, modes, or built-in function.
+
+I have additional patches that can add built-in functions that the functions in
+vector-pair.h could utilize so that the compiler can optimize and combine
+operations.  I may submit those patches in the future, but I would like to
+provide this patch to allow the library writer to optimize their code.
+
+I've measured the performance of these new functions on a power10.  For default
+unrolling, the percentage of change for the 3 methods over the normal vector
+loop method:
+
+   116%    Vector-pair.h function, default unroll
+    93%    Vector pair split built-in & 2 vector stores, default unroll
+    86%    Vector pair split & combine built-ins, default unroll
+
+Using explicit 2 way unrolling the numbers are:
+
+   114%    Vector-pair.h function, unroll 2
+   106%    Vector pair split built-in & 2 vector stores, unroll 2
+    98%    Vector pair split & combine built-ins, unroll 2
+
+These new functions provided in vector-pair.h use the vector pair load/store
+instructions, and don't generate extra vector moves.  Using the existing
+vector pair disassemble and assemble built-ins generate extra vector moves
+which can hinder performance.
+
+If I compile the loop code for power9, there is a minor speed up for default
+unrolling and more of an improvement using the framework provided in the
+vector-pair.h for explicit unrolling by 2:
+
+   101%    Vector-pair.h function, default unroll for power9
+   107%    Vector-pair.h function, unroll 2 for power9
+
+Of course this is a synthetic benchmark run on a quiet power10 system.  Results
+would vary for real code on real systems.  However, I feel adding these
+functions can allow the writers of high performance libraries to better
+optimize their code.
+
+As an example, if the library wants to code a simple fused multiply-add loop,
+they might write the code as follows:
+
+   #include 
+   #include 
+   #include 
+
+   void
+   fma_vector (double * __restrict__ r,
+   const double * __restrict__ a,
+   const double * __restrict__ b,
+   size_t n)
+   {
+ vector double * __restrict__ vr = (vector double * __restrict__)r;
+ const vector double * __restrict__ va = (const vector double * __restrict__)a;
+ const vector double * __restrict__ vb = (const vector double * __restrict__)b;
+ size_t num_elements = sizeof (vector double) / sizeof (double);
+ size_t nv = n / num_elements;
+ size_t i;
+
+ for (i = 0; i < nv; i++)
+   vr[i] = __builtin_vsx_xvmadddp (va[i], vb[i], vr[i]);
+
+ for (i = nv * num_elements; i < n; i++)
+   r[i] = fma (a[i], b[i], r[i]);
+   }
+
+The inner loop would look like:
+
+   .L3:
+   lxvx 0,3,9
+   lxvx 12,4,9
+   addi 10,9,16
+   addi 2,2,-2
+   lxvx 11,5,9
+   xvmaddadp 0,12,11
+   lxvx 12,4,10
+   lxvx 11,5,10
+   stxvx 0,3,9
+   lxvx 0,3,10
+   addi 9,9,32
+   xvmaddadp 0,12,11
+   stxvx 0,3,10
+   bdnz .L3
+
+Now if you code the loop to use __builtin_vsx_disassemble_pair to do a vector
+pair load, but then do 2 vector stores:
+
+
+   #include 
+   #include 
+   #include 
+
+   void
+   fma_mma_ld (double * __restrict__ r,
+   const double * __restrict__ a,
+   const double * __restrict__ b,
+   size_t n)
+   {
+ __vector_pair * __restrict__ vp_r 

[gcc(refs/users/meissner/heads/work199-submit)] Update ChangeLog.*

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:df57af1c8d5728cb58261f3b0faf196d419dfc3f

commit df57af1c8d5728cb58261f3b0faf196d419dfc3f
Author: Michael Meissner 
Date:   Thu Apr 3 20:36:50 2025 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.submit | 121 +++
 1 file changed, 121 insertions(+)

diff --git a/gcc/ChangeLog.submit b/gcc/ChangeLog.submit
index 9b0b1e840f22..bd732716b4f1 100644
--- a/gcc/ChangeLog.submit
+++ b/gcc/ChangeLog.submit
@@ -1,5 +1,126 @@
+ Branch work199-submit, patch #600 
+
+Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.
+
+Bernhard Reutner-Fischer pointed out some typos in the patch for PR 118541.  Here
+is the changed patch.
+
+In bug PR target/118541 on power9, power10, and power11 systems, for the
+function:
+
+extern double __ieee754_acos (double);
+
+double
+__acospi (double x)
+{
+  double ret = __ieee754_acos (x) / 3.14;
+  return __builtin_isgreater (ret, 1.0) ? 1.0 : ret;
+}
+
+GCC currently generates the following code:
+
+Power9                          Power10 and Power11
+======                          ===================
+bl __ieee754_acos               bl __ieee754_acos@notoc
+nop                             plfd 0,.LC0@pcrel
+addis 9,2,.LC2@toc@ha           xxspltidp 12,1065353216
+addi 1,1,32                     addi 1,1,32
+lfd 0,.LC2@toc@l(9)             ld 0,16(1)
+addis 9,2,.LC0@toc@ha           fdiv 0,1,0
+ld 0,16(1)                      mtlr 0
+lfd 12,.LC0@toc@l(9)            xscmpgtdp 1,0,12
+fdiv 0,1,0                      xxsel 1,0,12,1
+mtlr 0                          blr
+xscmpgtdp 1,0,12
+xxsel 1,0,12,1
+blr
+
+This is because ifcvt.c optimizes the conditional floating point move to use the
+XSCMPGTDP instruction.
+
+However, the XSCMPGTDP instruction will generate an interrupt if one of the
+arguments is a signalling NaN and signalling NaNs can generate an interrupt.
+The IEEE comparison functions (isgreater, etc.) require that the comparison not
+raise an interrupt.
+
+The following patch changes the PowerPC back end so that ifcvt.c will not change
+the if/then test and move into a conditional move if the comparison is one of
+the comparisons that do not raise an error with signalling NaNs and -Ofast is
+not used.  If a normal comparison is used or -Ofast is used, GCC will continue
+to generate XSCMPGTDP and XXSEL.
+
+For the following code:
+
+double
+ordered_compare (double a, double b, double c, double d)
+{
+  return __builtin_isgreater (a, b) ? c : d;
+}
+
+/* Verify normal > does generate xscmpgtdp.  */
+
+double
+normal_compare (double a, double b, double c, double d)
+{
+  return a > b ? c : d;
+}
+
+with the following patch, GCC generates the following for power9, power10, and
+power11:
+
+ordered_compare:
+fcmpu 0,1,2
+fmr 1,4
+bnglr 0
+fmr 1,3
+blr
+
+normal_compare:
+xscmpgtdp 1,1,2
+xxsel 1,4,3,1
+blr
+
+I have built bootstrap compilers on big endian power9 systems and little endian
+power9/power10 systems and there were no regressions.  Can I check this patch
+into the GCC trunk, and after a waiting period, can I check this into the active
+older branches?
+
+2025-04-03  Michael Meissner  
+
+gcc/
+
+   PR target/118541
+   * config/rs6000/predicates.md (invert_fpmask_comparison_operator): Do
+   not allow UNLT and UNLE unless -ffast-math.
+   * config/rs6000/rs6000-protos.h (enum rev_cond_ordered): New enumeration.
+   (rs6000_reverse_condition): Add argument.
+   * config/rs6000/rs6000.cc (rs6000_reverse_condition): Do not allow
+   ordered comparisons to be reversed for floating point conditional moves,
+   but allow ordered comparisons to be reversed on jumps.
+   (rs6000_emit_sCOND): Adjust rs6000_reverse_condition call.
+   * config/rs6000/rs6000.h (REVERSE_CONDITION): Likewise.
+   * config/rs6000/rs6000.md (reverse_branch_comparison): Name insn.
+   Adjust rs6000_reverse_condition calls.
+
+gcc/testsuite/
+
+   PR target/118541
+   * gcc.target/powerpc/pr118541-1.c: New test.
+   * gcc.target/powerpc/pr118541-2.c: Likewise.
+   * gcc.target/powerpc/pr118541-3.c: Likewise.
+   * gcc.target/powerpc/pr118541-4.c: Likewise.
+
  Branch work199-submit, baseline 
 
+Add ChangeLog.submit and update REVISION.
+
+2025-04-03  Michael Meissner  
+
+gcc/
+
+   * ChangeLog.submit: New file for branch.
+   * REVISION: Update.
+
 2025-04-03   Michael Meissner  
 
Clone branch


[gcc(refs/users/meissner/heads/work199-submit)] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:b3e60cfb8cd93ab3dc6f5b533778435ee9c6c798

commit b3e60cfb8cd93ab3dc6f5b533778435ee9c6c798
Author: Michael Meissner 
Date:   Thu Apr 3 20:31:36 2025 -0400

Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

Bernhard Reutner-Fischer pointed out some typos in the patch for PR 118541.  Here
is the changed patch.

In bug PR target/118541 on power9, power10, and power11 systems, for the
function:

extern double __ieee754_acos (double);

double
__acospi (double x)
{
  double ret = __ieee754_acos (x) / 3.14;
  return __builtin_isgreater (ret, 1.0) ? 1.0 : ret;
}

GCC currently generates the following code:

Power9                          Power10 and Power11
======                          ===================
bl __ieee754_acos               bl __ieee754_acos@notoc
nop                             plfd 0,.LC0@pcrel
addis 9,2,.LC2@toc@ha           xxspltidp 12,1065353216
addi 1,1,32                     addi 1,1,32
lfd 0,.LC2@toc@l(9)             ld 0,16(1)
addis 9,2,.LC0@toc@ha           fdiv 0,1,0
ld 0,16(1)                      mtlr 0
lfd 12,.LC0@toc@l(9)            xscmpgtdp 1,0,12
fdiv 0,1,0                      xxsel 1,0,12,1
mtlr 0                          blr
xscmpgtdp 1,0,12
xxsel 1,0,12,1
blr

This is because ifcvt.c optimizes the conditional floating point move to use the
XSCMPGTDP instruction.

However, the XSCMPGTDP instruction will generate an interrupt if one of the
arguments is a signalling NaN and signalling NaNs can generate an interrupt.
The IEEE comparison functions (isgreater, etc.) require that the comparison not
raise an interrupt.
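
To make the distinction concrete, here is a small C sketch of the two source
forms.  It is an illustration only and not taken from the patch or its tests;
the names clamp_quiet, clamp_signalling, and check_clamps are hypothetical.
Both forms return the same values, including for a NaN input.  The difference
the patch cares about is only whether the comparison is permitted to raise the
invalid exception, which this sketch does not try to observe.

```c
#include <assert.h>
#include <math.h>

/* Quiet form: the isgreater macro is specified not to raise the
   invalid exception when the operands are unordered (quiet NaN).  */
double
clamp_quiet (double ret)
{
  return isgreater (ret, 1.0) ? 1.0 : ret;
}

/* Ordinary form: under IEEE 754 semantics the > operator may raise
   the invalid exception when an operand is a NaN.  */
double
clamp_signalling (double ret)
{
  return ret > 1.0 ? 1.0 : ret;
}

/* Hypothetical self-check: both forms agree on the returned value.
   A NaN compares false either way, so it is passed through.  */
void
check_clamps (void)
{
  assert (clamp_quiet (2.0) == 1.0);
  assert (clamp_signalling (2.0) == 1.0);
  assert (clamp_quiet (0.5) == 0.5);
  assert (clamp_signalling (0.5) == 0.5);
  assert (isnan (clamp_quiet (NAN)));
  assert (isnan (clamp_signalling (NAN)));
}
```

Since the results are identical, a compiler may only swap one form for the
other when the exception behaviour does not matter, which is why the choice of
instruction is tied to options such as -Ofast.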

The following patch changes the PowerPC back end so that ifcvt.c will not change
the if/then test and move into a conditional move if the comparison is one of
the comparisons that do not raise an error with signalling NaNs and -Ofast is
not used.  If a normal comparison is used or -Ofast is used, GCC will continue
to generate XSCMPGTDP and XXSEL.

For the following code:

double
ordered_compare (double a, double b, double c, double d)
{
  return __builtin_isgreater (a, b) ? c : d;
}

/* Verify normal > does generate xscmpgtdp.  */

double
normal_compare (double a, double b, double c, double d)
{
  return a > b ? c : d;
}

with the following patch, GCC generates the following for power9, power10, and
power11:

ordered_compare:
fcmpu 0,1,2
fmr 1,4
bnglr 0
fmr 1,3
blr

normal_compare:
xscmpgtdp 1,1,2
xxsel 1,4,3,1
blr

I have built bootstrap compilers on big endian power9 systems and little endian
power9/power10 systems and there were no regressions.  Can I check this patch
into the GCC trunk, and after a waiting period, can I check this into the active
older branches?

2025-04-03  Michael Meissner  

gcc/

PR target/118541
* config/rs6000/predicates.md (invert_fpmask_comparison_operator): Do
not allow UNLT and UNLE unless -ffast-math.
* config/rs6000/rs6000-protos.h (enum rev_cond_ordered): New enumeration.
(rs6000_reverse_condition): Add argument.
* config/rs6000/rs6000.cc (rs6000_reverse_condition): Do not allow
ordered comparisons to be reversed for floating point conditional moves,
but allow ordered comparisons to be reversed on jumps.
(rs6000_emit_sCOND): Adjust rs6000_reverse_condition call.
* config/rs6000/rs6000.h (REVERSE_CONDITION): Likewise.
* config/rs6000/rs6000.md (reverse_branch_comparison): Name insn.
Adjust rs6000_reverse_condition calls.

gcc/testsuite/

PR target/118541
* gcc.target/powerpc/pr118541-1.c: New test.
* gcc.target/powerpc/pr118541-2.c: Likewise.
* gcc.target/powerpc/pr118541-3.c: Likewise.
* gcc.target/powerpc/pr118541-4.c: Likewise.

Diff:
---
 gcc/config/rs6000/predicates.md   | 10 +-
 gcc/config/rs6000/rs6000-protos.h | 17 +-
 gcc/config/rs6000/rs6000.cc   | 46 ---
 gcc/config/rs6000/rs6000.h| 10 --
 gcc/config/rs6000/rs6000.md   | 25 +--
 gcc/testsuite/gcc.target/powerpc/pr118541-1.c | 28 +

[gcc(refs/users/meissner/heads/work199-dmf)] Add ChangeLog.dmf and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:149b47aae3312424c7b762a2e0e9e5f3e9b0

commit 149b47aae3312424c7b762a2e0e9e5f3e9b0
Author: Michael Meissner 
Date:   Thu Apr 3 15:22:57 2025 -0400

Add ChangeLog.dmf and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.dmf: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.dmf | 5 +
 gcc/REVISION  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
new file mode 100644
index ..3a7dad677be5
--- /dev/null
+++ b/gcc/ChangeLog.dmf
@@ -0,0 +1,5 @@
+ Branch work199-dmf, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..da7d8a6c744e 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-dmf branch


[gcc(refs/users/meissner/heads/work199-vpair)] Add ChangeLog.vpair and update REVISION.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:67139bc62ba599794967b12d8e4dfc3786f428e2

commit 67139bc62ba599794967b12d8e4dfc3786f428e2
Author: Michael Meissner 
Date:   Thu Apr 3 15:23:48 2025 -0400

Add ChangeLog.vpair and update REVISION.

2025-04-03  Michael Meissner  

gcc/

* ChangeLog.vpair: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.vpair | 5 +
 gcc/REVISION| 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.vpair b/gcc/ChangeLog.vpair
new file mode 100644
index ..f83306603144
--- /dev/null
+++ b/gcc/ChangeLog.vpair
@@ -0,0 +1,5 @@
+ Branch work199-vpair, baseline 
+
+2025-04-03   Michael Meissner  
+
+   Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
index 113e419bda0d..1d07d4d01805 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work199 branch
+work199-vpair branch


[gcc(refs/users/meissner/heads/work199)] Add rs6000 architecture masks.

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:aa1860c17665576d8ab2a38aa0ea37bc1f556ee9

commit aa1860c17665576d8ab2a38aa0ea37bc1f556ee9
Author: Michael Meissner 
Date:   Thu Apr 3 16:13:50 2025 -0400

Add rs6000 architecture masks.

This patch begins the journey to move architecture bits that are not user ISA
options from rs6000_isa_flags to a new target variable rs6000_arch_flags.  The
intention is to remove switches that are currently ISA options but that users
should not be using directly.  For example, we want users to use
-mcpu=power10 and not just -mpower10.

This patch also changes the target_clones support to use an architecture mask
instead of isa bits.

This patch also switches the handling of .machine to use architecture masks if
they exist (power4 through power11).  All of the other PowerPCs will continue to
use the existing code for setting the .machine option.
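
As a rough illustration of how a single list of architectures can generate
both mask bits and a name table, here is a self-contained C sketch using the
X-macro idiom.  Everything in it is an assumption made for the example: the
ARCH_LIST macro stands in for a .def file, and the identifiers
(ARCH_MASK_PWR9, arch_masks, arch_mask_from_name, arch_can_inline) are
hypothetical, not the names used in rs6000-arch.def or rs6000.cc.  The subset
test in arch_can_inline is likewise just one plausible way to express "the
callee must not require a newer architecture than the caller".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a .def file: one entry per architecture.  */
#define ARCH_LIST(X)      \
  X (PWR8,  "power8")     \
  X (PWR9,  "power9")     \
  X (PWR10, "power10")

/* First expansion: assign each architecture a bit position.  */
enum
{
#define ARCH_ENUM(id, name) ARCH_ENUM_##id,
  ARCH_LIST (ARCH_ENUM)
#undef ARCH_ENUM
  ARCH_ENUM_MAX
};

/* Second expansion: turn each bit position into a mask.  */
enum
{
#define ARCH_MASK(id, name) ARCH_MASK_##id = 1u << ARCH_ENUM_##id,
  ARCH_LIST (ARCH_MASK)
#undef ARCH_MASK
};

/* Third expansion: a table mapping names to masks.  */
struct arch_mask_entry
{
  const char *name;
  unsigned mask;
};

static const struct arch_mask_entry arch_masks[] = {
#define ARCH_ENTRY(id, name) { name, ARCH_MASK_##id },
  ARCH_LIST (ARCH_ENTRY)
#undef ARCH_ENTRY
};

/* Look up a mask by name; returns 0 if the name is unknown.  */
unsigned
arch_mask_from_name (const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof arch_masks / sizeof arch_masks[0]; i++)
    if (strcmp (arch_masks[i].name, name) == 0)
      return arch_masks[i].mask;
  return 0;
}

/* Inlining is allowed only if every architecture bit the callee
   needs is also set in the caller's flags.  */
int
arch_can_inline (unsigned caller_flags, unsigned callee_flags)
{
  return (callee_flags & ~caller_flags) == 0;
}
```

Keeping the list in one place means adding an architecture is a one-line
change, and the enum, the masks, and the name table cannot drift out of sync.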

I have built both big endian and little endian bootstrap compilers and there
were no regressions.

In addition, I constructed a test case that used every architecture define (like
_ARCH_PWR4, etc.) and I also looked at the .machine directive generated.  I ran
this test for all supported combinations of -mcpu, big/little endian, and 32/64
bit support.  Every single instance generated exactly the same code with the
patches installed compared to the compiler before installing the patches.

The only difference in this patch compared to the first version posted on
November 6th is that I corrected the attribution and copyright year (i.e. that I
created rs6000-arch.def in 2024).

Can I install this patch on the GCC 15 trunk?

2025-04-03  Michael Meissner  

gcc/

* config/rs6000/default64.h (TARGET_CPU_DEFAULT): Set default cpu name.
* config/rs6000/rs6000-arch.def: New file.
* config/rs6000/rs6000.cc (struct clone_map): Switch to using
architecture masks instead of ISA masks.
(rs6000_clone_map): Likewise.
(rs6000_print_isa_options): Add an architecture flags argument, change
all callers.
(get_arch_flag): New function.
(rs6000_debug_reg_global): Update rs6000_print_isa_options calls.
(rs6000_option_override_internal): Likewise.
(rs6000_machine_from_flags): Switch to using architecture masks instead
of ISA masks.
(struct rs6000_arch_mask): New structure.
(rs6000_arch_masks): New table of architecture masks and names.
(rs6000_function_specific_save): Save architecture flags.
(rs6000_function_specific_restore): Restore architecture flags.
(rs6000_function_specific_print): Update rs6000_print_isa_options calls.
(rs6000_print_options_internal): Add architecture flags options.
(rs6000_clone_priority): Switch to using architecture masks instead of
ISA masks.
(rs6000_can_inline_p): Don't allow inlining if the callee requires a newer
architecture than the caller.
* config/rs6000/rs6000.h: Use rs6000-arch.def to create the architecture
masks.
* config/rs6000/rs6000.opt (rs6000_arch_flags): New target variable.
(x_rs6000_arch_flags): New save/restore field for rs6000_arch_flags.

Diff:
---
 gcc/config/rs6000/default64.h |  11 ++
 gcc/config/rs6000/rs6000-arch.def |  49 +
 gcc/config/rs6000/rs6000.cc   | 222 +++---
 gcc/config/rs6000/rs6000.h|  24 +
 gcc/config/rs6000/rs6000.opt  |   8 ++
 5 files changed, 277 insertions(+), 37 deletions(-)

diff --git a/gcc/config/rs6000/default64.h b/gcc/config/rs6000/default64.h
index 7f6001ded852..188f5c1d1378 100644
--- a/gcc/config/rs6000/default64.h
+++ b/gcc/config/rs6000/default64.h
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 #define RS6000_CPU(NAME, CPU, FLAGS)
 #include "rs6000-cpus.def"
 #undef RS6000_CPU
+#undef TARGET_CPU_DEFAULT
 
 #if (TARGET_DEFAULT & MASK_LITTLE_ENDIAN)
 #undef TARGET_DEFAULT
@@ -28,10 +29,20 @@ along with GCC; see the file COPYING3.  If not see
| MASK_LITTLE_ENDIAN)
 #undef ASM_DEFAULT_SPEC
 #define ASM_DEFAULT_SPEC "-mpower8"
+#define TARGET_CPU_DEFAULT "power8"
+
 #else
 #undef TARGET_DEFAULT
 #define TARGET_DEFAULT (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT \
| OPTION_MASK_MFCRF | MASK_POWERPC64 | MASK_64BIT)
 #undef ASM_DEFAULT_SPEC
 #define ASM_DEFAULT_SPEC "-mpower4"
+
+#if (TARGET_DEFAULT & MASK_POWERPC64)
+#define TARGET_CPU_DEFAULT "powerpc64"
+
+#else
+#define TARGET_CPU_DEFAULT "powerpc"
+#endif
+
 #endif
diff --git a/gcc/config/rs6000/rs6000-arch.def b/gcc/config/rs6000/rs6000-arch.def
new file mode 100644
index ..c0dbc5834333
--- /dev/null
+++ b/gcc/config/rs6000/rs6000-arch.def
@@

[gcc(refs/users/meissner/heads/work199-libs)] Merge commit 'refs/users/meissner/heads/work199-libs' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2025-04-03 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:360e55041f4387d4c794f461e24adcfce6da6c43

commit 360e55041f4387d4c794f461e24adcfce6da6c43
Merge: 0696a01b15df f46e8c6a02f2
Author: Michael Meissner 
Date:   Thu Apr 3 16:23:30 2025 -0400

Merge commit 'refs/users/meissner/heads/work199-libs' of git+ssh://gcc.gnu.org/git/gcc into me/work199-libs

Diff:


[gcc r14-11522] Fortran: Fix freeing procedure pointer components [PR119380]

2025-04-03 Thread Andre Vehreschild via Gcc-cvs
https://gcc.gnu.org/g:f955c5b409a96bd12765680517ce583d7086c62d

commit r14-11522-gf955c5b409a96bd12765680517ce583d7086c62d
Author: Andre Vehreschild 
Date:   Fri Mar 21 09:13:29 2025 +0100

Fortran: Fix freeing procedure pointer components [PR119380]

Backported from gcc-15.

PR fortran/119380

gcc/fortran/ChangeLog:

* trans-array.cc (structure_alloc_comps): Prevent freeing of
procedure pointer components.

gcc/testsuite/ChangeLog:

* gfortran.dg/proc_ptr_comp_54.f90: New test.

Diff:
---
 gcc/fortran/trans-array.cc |  4 ++--
 gcc/testsuite/gfortran.dg/proc_ptr_comp_54.f90 | 30 ++
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index c1c2b933b279..50286e4120e6 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -9694,13 +9694,13 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl, tree dest,
  if (c->ts.type == BT_CLASS)
{
  attr = &CLASS_DATA (c)->attr;
- if (attr->class_pointer)
+ if (attr->class_pointer || c->attr.proc_pointer)
continue;
}
  else
{
  attr = &c->attr;
- if (attr->pointer)
+ if (attr->pointer || attr->proc_pointer)
continue;
}
 
diff --git a/gcc/testsuite/gfortran.dg/proc_ptr_comp_54.f90 b/gcc/testsuite/gfortran.dg/proc_ptr_comp_54.f90
new file mode 100644
index ..73abc590e9ef
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/proc_ptr_comp_54.f90
@@ -0,0 +1,30 @@
+!{ dg-do run }
+
+! Check that components of procedure pointer aren't freeed.
+! Contributed by Damian Rouson  
+
+  implicit none
+
+  type foo_t
+integer, allocatable :: i_
+procedure(f), pointer, nopass :: f_
+procedure(c), pointer, nopass :: c_
+  end type
+
+  class(foo_t), allocatable :: ff
+
+  associate(foo => foo_t(1,f))
+  end associate
+
+contains
+
+  function f()
+logical, allocatable :: f
+f = .true.
+  end function
+
+  function c()
+class(foo_t), allocatable :: c
+allocate(c)
+  end function
+end


[gcc r15-9181] libgcobol: Ensure that config.h is included where needed.

2025-04-03 Thread Iain D Sandoe via Gcc-cvs
https://gcc.gnu.org/g:ce1cf361992463c84d225fb3f3b8b9f18fa96adb

commit r15-9181-gce1cf361992463c84d225fb3f3b8b9f18fa96adb
Author: Iain Sandoe 
Date:   Mon Mar 24 07:55:32 2025 +

libgcobol: Ensure that config.h is included where needed.

This includes config.h before any other project-related headers
or sources so that they properly make use of the values determined
by configure.

libgcobol/ChangeLog:

* gfileio.cc: Include config.h.
* gmath.cc: Likewise.
* io.cc: Likewise.
* libgcobol.cc: Likewise.

Signed-off-by: Iain Sandoe 

Diff:
---
 libgcobol/gfileio.cc   | 2 ++
 libgcobol/gmath.cc | 2 ++
 libgcobol/io.cc| 3 +++
 libgcobol/libgcobol.cc | 2 ++
 4 files changed, 9 insertions(+)

diff --git a/libgcobol/gfileio.cc b/libgcobol/gfileio.cc
index 0216c7b8275d..ed250c47f941 100644
--- a/libgcobol/gfileio.cc
+++ b/libgcobol/gfileio.cc
@@ -40,6 +40,8 @@
 #include 
 #include 
 
+#include "config.h"
+
 #include "ec.h"
 #include "io.h"
 #include "common-defs.h"
diff --git a/libgcobol/gmath.cc b/libgcobol/gmath.cc
index 2af0e8a8614a..fb2eae38ef3e 100644
--- a/libgcobol/gmath.cc
+++ b/libgcobol/gmath.cc
@@ -39,6 +39,8 @@
 #include 
 #include 
 
+#include "config.h"
+
 #include "ec.h"
 #include "common-defs.h"
 #include "io.h"
diff --git a/libgcobol/io.cc b/libgcobol/io.cc
index 4dca42e0badd..95e1d0266861 100644
--- a/libgcobol/io.cc
+++ b/libgcobol/io.cc
@@ -27,6 +27,9 @@
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
+
+#include "config.h"
+
 #include "io.h"
 #include "stdio.h"
 #include "stdlib.h"
diff --git a/libgcobol/libgcobol.cc b/libgcobol/libgcobol.cc
index 224c5f26e963..d8f4deda863a 100644
--- a/libgcobol/libgcobol.cc
+++ b/libgcobol/libgcobol.cc
@@ -49,6 +49,8 @@
 #include 
 #include 
 
+#include "config.h"
+
 #include "ec.h"
 #include "common-defs.h"
 #include "io.h"


[gcc r15-9183] libgcobol: Provide fallbacks for C23 strfromf32/64 functions.

2025-04-03 Thread Iain D Sandoe via Gcc-cvs
https://gcc.gnu.org/g:b6aafe9a5b1452f3bafae5889aebad0b3de408a3

commit r15-9183-gb6aafe9a5b1452f3bafae5889aebad0b3de408a3
Author: Iain Sandoe 
Date:   Tue Mar 25 15:10:12 2025 +

libgcobol: Provide fallbacks for C23 strfromf32/64 functions.

strfrom{f,d,l,fN} are all C23 and might not be available in general.
This uses snprintf() to provide fall-backs where the libc does not
yet have support.
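
A minimal sketch of how such a fall-back behaves, assuming only the standard
snprintf().  The demo_ names are hypothetical so that the sketch cannot clash
with a libc that already provides the C23 functions.  Note that the C23
functions accept only a restricted format (a '%', an optional precision, and
one conversion such as f, e, g or a), whereas snprintf() does not enforce
that restriction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for strfromf32: float is promoted to double because
// snprintf() takes floating-point arguments as double.
static int demo_strfromf32(char *s, std::size_t n, const char *format, float v)
{
  return std::snprintf(s, n, format, static_cast<double>(v));
}

// Hypothetical stand-in for strfromf64: forwards directly to snprintf().
static int demo_strfromf64(char *s, std::size_t n, const char *format, double v)
{
  return std::snprintf(s, n, format, v);
}
```

As with snprintf(), the return value is the number of characters the full
result needs (excluding the terminating NUL), even when the buffer is too
small and the output is truncated.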

libgcobol/ChangeLog:

* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Check for availability of strfromf32 and
strfromf64.
* libgcobol.cc (strfromf32, strfromf64): New.

Signed-off-by: Iain Sandoe 

Diff:
---
 libgcobol/config.h.in  |  6 ++
 libgcobol/configure| 13 +++--
 libgcobol/configure.ac |  3 +++
 libgcobol/libgcobol.cc | 24 
 4 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/libgcobol/config.h.in b/libgcobol/config.h.in
index b201266e0610..5dd2b5071ff1 100644
--- a/libgcobol/config.h.in
+++ b/libgcobol/config.h.in
@@ -36,6 +36,12 @@
 /* Define to 1 if you have the  header file. */
 #undef HAVE_STDLIB_H
 
+/* Define to 1 if you have the `strfromf32' function. */
+#undef HAVE_STRFROMF32
+
+/* Define to 1 if you have the `strfromf64' function. */
+#undef HAVE_STRFROMF64
+
 /* Define to 1 if you have the  header file. */
 #undef HAVE_STRINGS_H
 
diff --git a/libgcobol/configure b/libgcobol/configure
index 44190d7e2fe0..e12b72e0817c 100755
--- a/libgcobol/configure
+++ b/libgcobol/configure
@@ -2520,6 +2520,8 @@ as_fn_append ac_func_list " random_r"
 as_fn_append ac_func_list " srandom_r"
 as_fn_append ac_func_list " initstate_r"
 as_fn_append ac_func_list " setstate_r"
+as_fn_append ac_func_list " strfromf32"
+as_fn_append ac_func_list " strfromf64"
 # Check that the precious variables saved in the cache have kept the same
 # value.
 ac_cache_corrupted=false
@@ -12906,7 +12908,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12909 "configure"
+#line 12911 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -13012,7 +13014,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 13015 "configure"
+#line 13017 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -16385,6 +16387,13 @@ done
 
 
 
+# These are C23, and might not be available in libc.
+
+
+
+
+
+
 if test "${multilib}" = "yes"; then
   multilib_arg="--enable-multilib"
 else
diff --git a/libgcobol/configure.ac b/libgcobol/configure.ac
index 383b4134b130..34c0235c56b9 100644
--- a/libgcobol/configure.ac
+++ b/libgcobol/configure.ac
@@ -218,6 +218,9 @@ AC_SUBST(enable_static)
 # These are GLIBC
 AC_CHECK_FUNCS_ONCE(random_r srandom_r initstate_r setstate_r)
 
+# These are C23, and might not be available in libc.
+AC_CHECK_FUNCS_ONCE(strfromf32 strfromf64)
+
 if test "${multilib}" = "yes"; then
   multilib_arg="--enable-multilib"
 else
diff --git a/libgcobol/libgcobol.cc b/libgcobol/libgcobol.cc
index d8f4deda863a..c163e2c92f2b 100644
--- a/libgcobol/libgcobol.cc
+++ b/libgcobol/libgcobol.cc
@@ -68,6 +68,30 @@
 
 #include "exceptl.h"
 
+#if !defined (HAVE_STRFROMF32)
+# if __FLT_MANT_DIG__ == 24 && __FLT_MAX_EXP__ == 128
+static int
+strfromf32 (char *s, size_t n, const char *f, float v)
+{
+  return snprintf (s, n, f, (double) v);
+}
+# else
+#  error "It looks like float on this platform is not IEEE754"
+# endif
+#endif
+
+#if !defined (HAVE_STRFROMF64)
+# if __DBL_MANT_DIG__ == 53 && __DBL_MAX_EXP__ == 1024
+static int
+strfromf64 (char *s, size_t n, const char *f, double v)
+{
+  return snprintf (s, n, f, v);
+}
+# else
+#  error "It looks like double on this platform is not IEEE754"
+# endif
+#endif
+
 // This couldn't be defined in symbols.h because it conflicts with a LEVEL66
 // in parse.h
 #define LEVEL66 (66)


[gcc r15-9178] libstdc++: Fix handling of field width for wide strings and characters [PR119593]

2025-04-03 Thread Tomasz Kaminski via Gcc-cvs
https://gcc.gnu.org/g:5c7f6272f43f4265dc08eac4ee91164672c1c441

commit r15-9178-g5c7f6272f43f4265dc08eac4ee91164672c1c441
Author: Tomasz Kamiński 
Date:   Thu Apr 3 10:23:45 2025 +0200

libstdc++: Fix handling of field width for wide strings and characters [PR119593]

This patch corrects handling of UTF-32LE and UTF-32BE in
__unicode::__literal_encoding_is_unicode<_CharT>, so they are
recognized as Unicode and the functions produce correct results for wchar_t.

Use `__unicode::__field_width` to compute the estimated width
of the character for Unicode wide encodings.
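
The suffix handling can be illustrated with a stand-alone sketch.  This is
not the libstdc++ implementation: demo_ends_with and
demo_is_wide_unicode_name are hypothetical helpers, and requiring a "UTF-"
prefix is an assumption made for this sketch.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper, written out so the sketch also builds as C++17.
constexpr bool demo_ends_with(std::string_view s, std::string_view suffix)
{
  return s.size() >= suffix.size()
         && s.substr(s.size() - suffix.size()) == suffix;
}

// Return true if ENC names a 16- or 32-bit Unicode encoding, with or without
// a trailing "//" and with or without an "LE"/"BE" byte-order suffix.
constexpr bool demo_is_wide_unicode_name(std::string_view enc)
{
  constexpr std::string_view prefix = "UTF-";
  if (enc.substr(0, prefix.size()) != prefix)
    return false;
  enc.remove_prefix(prefix.size());
  if (demo_ends_with(enc, "//"))
    enc.remove_suffix(2);
  // Strip the byte-order suffix before comparing the remaining digits.
  if (demo_ends_with(enc, "LE") || demo_ends_with(enc, "BE"))
    enc.remove_suffix(2);
  return enc == "16" || enc == "32";
}
```

Without the byte-order step, a name such as "UTF-32LE" leaves "32LE" to
compare against "32" and is rejected.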

PR libstdc++/119593

libstdc++-v3/ChangeLog:

* include/bits/unicode.h
(__unicode::__literal_encoding_is_unicode<_CharT>):
Corrected handling for UTF-16 and UTF-32 with "LE" or "BE" suffix.
* include/std/format (__formatter_str::_S_character_width):
Define.
(__formatter_str::_S_character_width): Updated passed char
length.
* testsuite/std/format/functions/format.cc: Test for wchar_t.

Reviewed-by: Jonathan Wakely 
Signed-off-by: Tomasz Kamiński 

Diff:
---
 libstdc++-v3/include/bits/unicode.h   |  2 ++
 libstdc++-v3/include/std/format   | 16 +++-
 libstdc++-v3/testsuite/std/format/functions/format.cc |  8 ++--
 3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/bits/unicode.h b/libstdc++-v3/include/bits/unicode.h
index 24b1ac3d53d6..99d972eccff8 100644
--- a/libstdc++-v3/include/bits/unicode.h
+++ b/libstdc++-v3/include/bits/unicode.h
@@ -1039,6 +1039,8 @@ inline namespace __v16_0_0
  string_view __s(__enc);
  if (__s.ends_with("//"))
__s.remove_suffix(2);
+ if (__s.ends_with("LE") || __s.ends_with("BE"))
+   __s.remove_suffix(2);
  return __s == "16" || __s == "32";
}
}
diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index c3327e1d3841..9ef719edcf03 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -1277,12 +1277,26 @@ namespace __format
  _M_spec);
}
 
+  [[__gnu__::__always_inline__]]
+  static size_t
+  _S_character_width(_CharT __c)
+  {
+   // N.B. single byte cannot encode charcter of width greater than 1
+   if constexpr (sizeof(_CharT) > 1u && 
+   __unicode::__literal_encoding_is_unicode<_CharT>())
+ return __unicode::__field_width(__c);
+   else
+ return 1u;
+  }
+
   template
typename basic_format_context<_Out, _CharT>::iterator
_M_format_character(_CharT __c,
  basic_format_context<_Out, _CharT>& __fc) const
{
- return __format::__write_padded_as_spec({&__c, 1u}, 1, __fc, _M_spec);
+ return __format::__write_padded_as_spec({&__c, 1u},
+ _S_character_width(__c),
+ __fc, _M_spec);
}
 
   template
diff --git a/libstdc++-v3/testsuite/std/format/functions/format.cc b/libstdc++-v3/testsuite/std/format/functions/format.cc
index 7fc420170458..d8dbf4634133 100644
--- a/libstdc++-v3/testsuite/std/format/functions/format.cc
+++ b/libstdc++-v3/testsuite/std/format/functions/format.cc
@@ -501,9 +501,14 @@ test_unicode()
 {
   // Similar to sC example in test_std_examples, but not from the standard.
   // Verify that the character "🤡" has estimated field width 2,
-  // rather than estimated field width equal to strlen("🤡"), which would be 4.
+  // rather than estimated field width equal to strlen("🤡"), which would be 4,
+  // or just width 1 for single character.
   std::string sC = std::format("{:*<3}", "🤡");
   VERIFY( sC == "🤡*" );
+  std::wstring wsC = std::format(L"{:*<3}", L"🤡");
+  VERIFY( wsC == L"🤡*" );
+  wsC = std::format(L"{:*<3}", L'🤡');
+  VERIFY( wsC == L"🤡*" );
 
   // Verify that "£" has estimated field width 1, not strlen("£") == 2.
   std::string sL = std::format("{:*<3}", "£");
@@ -517,7 +522,6 @@ test_unicode()
   std::string sP = std::format("{:1.1} {:*<1.1}", "£", "🤡");
   VERIFY( sP == "£ *" );
   sP = std::format("{:*<2.1} {:*<2.1}", "£", "🤡");
-  VERIFY( sP == "£* **" );
 
   // Verify field width handling for extended grapheme clusters,
   // and that a cluster gets output as a single item, not truncated.


[gcc r15-9179] libstdc++: Restored accidentally removed test case.

2025-04-03 Thread Tomasz Kaminski via Libstdc++-cvs
https://gcc.gnu.org/g:81c990aa84b22562157ce2926577b392b4a129d3

commit r15-9179-g81c990aa84b22562157ce2926577b392b4a129d3
Author: Tomasz Kamiński 
Date:   Thu Apr 3 14:56:49 2025 +0200

libstdc++: Restored accidentally removed test case.

It was removed by accident in r15-9178-g5c7f6272f43f42.

libstdc++-v3/ChangeLog:

* testsuite/std/format/functions/format.cc: Restored line.

Diff:
---
 libstdc++-v3/testsuite/std/format/functions/format.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libstdc++-v3/testsuite/std/format/functions/format.cc b/libstdc++-v3/testsuite/std/format/functions/format.cc
index d8dbf4634133..000f2671816d 100644
--- a/libstdc++-v3/testsuite/std/format/functions/format.cc
+++ b/libstdc++-v3/testsuite/std/format/functions/format.cc
@@ -522,6 +522,7 @@ test_unicode()
   std::string sP = std::format("{:1.1} {:*<1.1}", "£", "🤡");
   VERIFY( sP == "£ *" );
   sP = std::format("{:*<2.1} {:*<2.1}", "£", "🤡");
+  VERIFY( sP == "£* **" );
 
   // Verify field width handling for extended grapheme clusters,
   // and that a cluster gets output as a single item, not truncated.


[gcc r14-11524] libstdc++: Restored accidentally removed test case.

2025-04-03 Thread Tomasz Kaminski via Libstdc++-cvs
https://gcc.gnu.org/g:83cd4bda12f1218fea878acfc964949649ca9fc7

commit r14-11524-g83cd4bda12f1218fea878acfc964949649ca9fc7
Author: Tomasz Kamiński 
Date:   Thu Apr 3 14:56:49 2025 +0200

libstdc++: Restored accidentally removed test case.

It was removed by accident in r14-11523-gad1b71fc2882c1.

libstdc++-v3/ChangeLog:

* testsuite/std/format/functions/format.cc: Restored line.

(cherry picked from commit 81c990aa84b22562157ce2926577b392b4a129d3)

Diff:
---
 libstdc++-v3/testsuite/std/format/functions/format.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libstdc++-v3/testsuite/std/format/functions/format.cc b/libstdc++-v3/testsuite/std/format/functions/format.cc
index 97eb0957e5e1..78010e159d36 100644
--- a/libstdc++-v3/testsuite/std/format/functions/format.cc
+++ b/libstdc++-v3/testsuite/std/format/functions/format.cc
@@ -518,6 +518,7 @@ test_unicode()
   std::string sP = std::format("{:1.1} {:*<1.1}", "£", "🤡");
   VERIFY( sP == "£ *" );
   sP = std::format("{:*<2.1} {:*<2.1}", "£", "🤡");
+  VERIFY( sP == "£* **" );
 
   // Verify field width handling for extended grapheme clusters,
   // and that a cluster gets output as a single item, not truncated.


[gcc r15-9186] c++: operator!= rewriting and arg-dep lookup

2025-04-03 Thread Jason Merrill via Gcc-cvs
https://gcc.gnu.org/g:bd5597156ca0c7d6fb50c6fe92a7abe6717cb2b5

commit r15-9186-gbd5597156ca0c7d6fb50c6fe92a7abe6717cb2b5
Author: Jason Merrill 
Date:   Tue Apr 1 13:04:05 2025 -0400

c++: operator!= rewriting and arg-dep lookup

When considering an op== as a rewrite target, we need to disqualify it if
there is a matching op!= in the same scope.  But add_candidates was assuming
that we could use the same set of op!= for all op==, which is wrong if
arg-dep lookup finds op== in multiple namespaces.

This broke 20_util/optional/relops/constrained.cc if the order of the ADL
set changed.
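
The lookup problem can be modelled with a toy sketch.  It does not use GCC's
data structures: demo_candidate, demo_scopes and demo_rewrite_targets are
hypothetical, and the model only asks whether a namespace declares any
operator!= at all, ignoring the requirement that it correspond to the
operator== in question.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// An operator== candidate, identified only by the namespace it lives in.
struct demo_candidate { std::string ns; };

// Namespace name -> names of the operators declared there.
using demo_scopes = std::map<std::string, std::set<std::string>>;

// For each candidate, report whether it remains a rewrite target, i.e.
// whether no operator!= was found in the scope that was consulted.  With
// RELOOKUP false, the scope of the first candidate is reused for all of
// them, which models the behaviour described above as wrong.
inline std::vector<bool>
demo_rewrite_targets(const std::vector<demo_candidate> &cands,
                     const demo_scopes &scopes, bool relookup)
{
  std::vector<bool> out;
  std::string ne_context;
  bool has_ne = false;
  bool first = true;
  for (const demo_candidate &c : cands)
    {
      if (first || (relookup && c.ns != ne_context))
        {
          // Look for operator!= in the candidate's own namespace.
          ne_context = c.ns;
          auto it = scopes.find(ne_context);
          has_ne = it != scopes.end() && it->second.count("operator!=") > 0;
          first = false;
        }
      out.push_back(!has_ne);
    }
  return out;
}
```

With candidates from two namespaces, the answer for the second candidate then
no longer depends on which namespace happened to come first in the set.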

gcc/cp/ChangeLog:

* call.cc (add_candidates): Re-lookup ne_fns if we move into
another namespace.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-rewrite6.C: New test.

Diff:
---
 gcc/cp/call.cc  | 27 +---
 gcc/testsuite/g++.dg/cpp2a/spaceship-rewrite6.C | 33 +
 2 files changed, 57 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index d8533623c015..6caac8963cc9 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -6673,6 +6673,7 @@ add_candidates (tree fns, tree first_arg, const vec *args,
   bool check_list_ctor = false;
   bool check_converting = false;
   unification_kind_t strict;
+  tree ne_context = NULL_TREE;
   tree ne_fns = NULL_TREE;
 
   if (!fns)
@@ -6719,6 +6720,7 @@ add_candidates (tree fns, tree first_arg, const vec *args,
   tree ne_name = ovl_op_identifier (false, NE_EXPR);
   if (DECL_CLASS_SCOPE_P (fn))
{
+ ne_context = DECL_CONTEXT (fn);
  ne_fns = lookup_fnfields (TREE_TYPE ((*args)[0]), ne_name,
1, tf_none);
  if (ne_fns == error_mark_node || ne_fns == NULL_TREE)
@@ -6728,8 +6730,9 @@ add_candidates (tree fns, tree first_arg, const vec *args,
}
   else
{
- tree context = decl_namespace_context (fn);
- ne_fns = lookup_qualified_name (context, ne_name, LOOK_want::NORMAL,
+ ne_context = decl_namespace_context (fn);
+ ne_fns = lookup_qualified_name (ne_context, ne_name,
+ LOOK_want::NORMAL,
  /*complain*/false);
  if (ne_fns == error_mark_node
  || !is_overloaded_fn (ne_fns))
@@ -6828,8 +6831,26 @@ add_candidates (tree fns, tree first_arg, const 
vec *args,
 
   /* When considering reversed operator==, if there's a corresponding
 operator!= in the same scope, it's not a rewrite target.  */
-  if (ne_fns)
+  if (ne_context)
{
+ if (TREE_CODE (ne_context) == NAMESPACE_DECL)
+   {
+ /* With argument-dependent lookup, fns can span multiple
+namespaces; make sure we look in the fn's namespace for a
+corresponding operator!=.  */
+ tree fn_ns = decl_namespace_context (fn);
+ if (fn_ns != ne_context)
+   {
+ ne_context = fn_ns;
+ tree ne_name = ovl_op_identifier (false, NE_EXPR);
+ ne_fns = lookup_qualified_name (ne_context, ne_name,
+ LOOK_want::NORMAL,
+ /*complain*/false);
+ if (ne_fns == error_mark_node
+ || !is_overloaded_fn (ne_fns))
+   ne_fns = NULL_TREE;
+   }
+   }
  bool found = false;
  for (lkp_iterator ne (ne_fns); !found && ne; ++ne)
if (0 && !ne.using_p ()
diff --git a/gcc/testsuite/g++.dg/cpp2a/spaceship-rewrite6.C b/gcc/testsuite/g++.dg/cpp2a/spaceship-rewrite6.C
new file mode 100644
index ..0ec74e891024
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/spaceship-rewrite6.C
@@ -0,0 +1,33 @@
+// { dg-do compile { target c++20 } }
+
+// We wrongly considered D to be ne_comparable because we were looking for a
+// corresponding op!= for N::op== in ::, because ::op== happened to be the
+// first thing in the lookup set.
+
+template
+struct enable_if;
+
+template
+struct enable_if
+{ typedef _Tp type; };
+
+template  struct A { };
+
+namespace N {
+  struct X { };
+  template  auto operator== (const A&, const A&)
+-> typename enable_if::type;
+  template  auto operator!= (const A&, const A&)
+-> typename enable_if::type;
+}
+
+template
+concept ne_comparable
+= requires (const A& t, const A& u) {
+  t != u;
+};
+
+struct D { };
+int operator==(D, D);
+bool operator!=(D, D) = delete;
+static_assert( ! ne_comparable );


[gcc r15-9176] Fix costs of x86 move instructions at -Os

2025-04-03 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:564e4e0819022925dd160e455ee44baf0fda5805

commit r15-9176-g564e4e0819022925dd160e455ee44baf0fda5805
Author: Jan Hubicka 
Date:   Thu Apr 3 13:06:07 2025 +0200

Fix costs of x86 move instructions at -Os

This patch fixes a problem with the size costs declaring all moves to have equal
size (which was caught by the sanity check I tried in the prologue move cost
hook).  Costs are relative to a reg-reg move, which is two.  Coincidentally, that
is also the size of its encoding, so the costs should represent the typical size
of a move instruction.

The patch reduces cc1plus text size 26391115->26205707 (0.7%), and similar
changes also happen to other binaries built during bootstrap.

Bootstrapped/regtested x86_64-linux, plan to commit it tomorrow if there
are no complaints.

There are other targets that define some load/store costs to be 2 that probably
should be fixed too, but they are mostly very old ones and I don't have a way of
benchmarking them.

* config/i386/x86-tune-costs.h (ix86_size_cost): Fix sizes of move
instructions

Diff:
---
 gcc/config/i386/x86-tune-costs.h | 57 ++--
 1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index a4a128cd5dde..7c8cb738d7cd 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -37,34 +37,37 @@ static stringop_algs ix86_size_memset[2] = {
 const
 struct processor_costs ix86_size_cost = {/* costs for tuning for size */
   {
-  /* Start of register allocator costs.  integer->integer move cost is 2. */
-  2,/* cost for loading QImode using movzbl */
-  {2, 2, 2},   /* cost of loading integer registers
+  /* Start of register allocator costs.  integer->integer move cost is 2
+ and coststs are relative to it.  movl %eax, %ebx is 2 bytes, so the
+ sizes coincides with average size of instruction encoding.  */
+  3,/* cost for loading QImode using movzbl */
+  /* Typical load/save from stack frame is 4 bytes with ebp and 5 with esp.  */
+  {5, 6, 5},   /* cost of loading integer registers
   in QImode, HImode and SImode.
   Relative to reg-reg move (2).  */
-  {2, 2, 2},   /* cost of storing integer registers */
+  {5, 6, 5},   /* cost of storing integer registers */
   2,   /* cost of reg,reg fld/fst */
-  {2, 2, 2},   /* cost of loading fp registers
+  {5, 6, 5},   /* cost of loading fp registers
   in SFmode, DFmode and XFmode */
-  {2, 2, 2},   /* cost of storing fp registers
+  {5, 6, 5},   /* cost of storing fp registers
   in SFmode, DFmode and XFmode */
   3,   /* cost of moving MMX register */
-  {3, 3},  /* cost of loading MMX registers
+  {6, 6},  /* cost of loading MMX registers
   in SImode and DImode */
-  {3, 3},  /* cost of storing MMX registers
+  {6, 6},  /* cost of storing MMX registers
   in SImode and DImode */
-  3, 3, 3, /* cost of moving XMM,YMM,ZMM register 
*/
-  {3, 3, 3, 3, 3}, /* cost of loading SSE registers
+  4, 4, 6, /* cost of moving XMM,YMM,ZMM register 
*/
+  {6, 6, 6, 6, 11},/* cost of loading SSE registers
   in 32,64,128,256 and 512-bit */
-  {3, 3, 3, 3, 3}, /* cost of storing SSE registers
+  {6, 6, 6, 6, 11},/* cost of storing SSE registers
   in 32,64,128,256 and 512-bit */
-  3, 3,/* SSE->integer and integer->SSE moves 
*/
-  3, 3,/* mask->integer and integer->mask 
moves */
-  {2, 2, 2},   /* cost of loading mask register
+  4, 4,/* SSE->integer and integer->SSE moves 
*/
+  4, 4,/* mask->integer and integer->mask 
moves */
+  {7, 7, 7},   /* cost of loading mask register
   in QImode, HImode, SImode.  */
-  {2, 2, 2},   /* cost if storing mask register
+  {7, 7, 7},   /* cost if storing mask register
   in QImode, HImo

[gcc r15-9177] c++: Fix typo in RAW_DATA_CST build_list_conv subsubconv handling [PR119563]

2025-04-03 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:70bf0ee44017e8e26bb1bdcb6a3fd114c25c39c7

commit r15-9177-g70bf0ee44017e8e26bb1bdcb6a3fd114c25c39c7
Author: Jakub Jelinek 
Date:   Thu Apr 3 13:21:56 2025 +0200

c++: Fix typo in RAW_DATA_CST build_list_conv subsubconv handling [PR119563]

The following testcase ICEs (the embed one actually doesn't but
dereferences random uninitialized pointer far after allocated memory)
because of a typo.  In the RAW_DATA_CST handling of list conversion
where there are conversions to something other than
initializer_list<{{,un}signed ,}char>, the code now calls
implicit_conversion for all the RAW_DATA_CST elements and stores them
into subsubconvs array.
The next loop (done in a separate loop because subsubconvs[0] is
handled differently) attempts to do the
  for (i = 0; i < len; ++i)
{
  conversion *sub = subconvs[i];
  if (sub->rank > t->rank)
t->rank = sub->rank;
  if (sub->user_conv_p)
t->user_conv_p = true;
  if (sub->bad_p)
t->bad_p = true;
}
rank/user_conv_p/bad_p merging, but I mistyped the index, the loop
iterates with j iterator and i is subconvs index, so the loop effectively
doesn't do anything interesting except for merging from one of the
subsubconvs element, if lucky within the subsubconvs array, if unlucky
not even from inside of the array.
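
The intended merging step can be sketched in isolation.  This is a toy model,
not the code in call.cc: demo_conv and demo_merge are hypothetical, and
integer ranks stand in for the real conversion ranks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy summary of a conversion: higher rank means a worse conversion.
struct demo_conv {
  int rank;
  bool user_conv_p;
  bool bad_p;
};

// Fold the per-element conversions into one summary: the worst rank seen,
// and each flag set if any element has it set.
inline demo_conv demo_merge(const std::vector<demo_conv> &subs)
{
  demo_conv t{0, false, false};
  for (std::size_t j = 0; j < subs.size(); ++j)
    {
      // Index with the loop variable j; using an unrelated outer index here
      // would read the same element (or an out-of-range one) every time.
      const demo_conv &sub = subs[j];
      if (sub.rank > t.rank)
        t.rank = sub.rank;
      if (sub.user_conv_p)
        t.user_conv_p = true;
      if (sub.bad_p)
        t.bad_p = true;
    }
  return t;
}
```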

The following patch fixes that.

2025-04-03  Andrew Pinski  
Jakub Jelinek  

PR c++/119563
* call.cc (build_list_conv): Fix a typo in loop gathering
summary information from subsubconvs.

* g++.dg/cpp0x/pr119563.C: New test.
* g++.dg/cpp/embed-26.C: New test.

Diff:
---
 gcc/cp/call.cc|  2 +-
 gcc/testsuite/g++.dg/cpp/embed-26.C   | 63 
 gcc/testsuite/g++.dg/cpp0x/pr119563.C | 79 +++
 3 files changed, 143 insertions(+), 1 deletion(-)

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index b1469cb5a4c9..d8533623c015 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -917,7 +917,7 @@ build_list_conv (tree type, tree ctor, int flags, tsubst_flags_t complain)
  t->rank = cr_exact;
  for (j = 0; j < (unsigned) RAW_DATA_LENGTH (val); ++j)
{
- sub = subsubconvs[i];
+ sub = subsubconvs[j];
  if (sub->rank > t->rank)
t->rank = sub->rank;
  if (sub->user_conv_p)
diff --git a/gcc/testsuite/g++.dg/cpp/embed-26.C b/gcc/testsuite/g++.dg/cpp/embed-26.C
new file mode 100644
index ..ad3f9de6b1f0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp/embed-26.C
@@ -0,0 +1,63 @@
+// PR c++/119563
+// { dg-do run { target c++11 } }
+// { dg-options "-O2" }
+
+namespace std {
+template 
+struct initializer_list {
+private:
+  T *_M_array;
+  decltype (sizeof 0) _M_len;
+public:
+  constexpr decltype (sizeof 0)
+  size () const noexcept { return _M_len; }
+  constexpr const T *
+  begin () const noexcept { return _M_array; }
+  constexpr const T *
+  end () const noexcept { return begin () + size (); }
+};
+}
+
+struct A {} a;
+
+struct B {
+  constexpr B (int x) : B (a, x) {}
+  template 
+  constexpr B (A, T... x) : b(x...) {}
+  int b;
+};
+
+struct C {
+  C (std::initializer_list x)
+  {
+unsigned char buf[] = {
+#embed __FILE__
+};
+if (x.size () != 2 * sizeof (buf) + 1024)
+  __builtin_abort ();
+unsigned int i = 0;
+for (auto a = x.begin (); a < x.end (); ++a, ++i)
+  if (a->b != (i < sizeof (buf) ? buf[i]
+  : i < sizeof (buf) + 1024 ? ((i - sizeof (buf)) & 7) + 1
+  : buf[i - sizeof (buf) - 1024]))
+   __builtin_abort ();
+c = true;
+  }
+  bool c;
+};
+
+#define D 1 + 0, 2 + 0, 3 + 0, 4 + 0, 5 + 0, 6 + 0, 7 + 0, 8 + 0
+#define E D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D
+
+C c {
+#embed __FILE__ suffix (,)
+  E, E, E, E, E, E, E, E,
+#embed __FILE__
+};
+
+int
+main ()
+{
+  if (!c.c)
+__builtin_abort ();
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/pr119563.C b/gcc/testsuite/g++.dg/cpp0x/pr119563.C
new file mode 100644
index ..9363a09e8bb3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/pr119563.C
@@ -0,0 +1,79 @@
+// PR c++/119563
+// { dg-do run { target c++11 } }
+// { dg-options "-O2" }
+
+namespace std {
+template 
+struct initializer_list {
+private:
+  T *_M_array;
+  decltype (sizeof 0) _M_len;
+public:
+  constexpr decltype (sizeof 0)
+  size () const noexcept { return _M_len; }
+  constexpr const T *
+  begin () const noexcept { return _M_array; }
+  constexpr const T *
+  end () const noexcept { return begin () + size (); }
+};
+}
+
+struct A {} a;
+
+struct B {
+  constexpr B (int x) : B (a, x) {}
+  template 
+  constexpr B (A, T... x) : b(x...) {}
+  int b;
+};
+
+struct C {
+  C (std::initializer_list x)
+  {
+if (x.size () != 130 + 1

[gcc/devel/rust/master] Update exclusion list

2025-04-03 Thread Thomas Schwinge via Gcc-cvs
https://gcc.gnu.org/g:45f70e36a90f895e4d5a0070e77512c4798ba308

commit 45f70e36a90f895e4d5a0070e77512c4798ba308
Author: Pierre-Emmanuel Patry 
Date:   Tue Mar 25 18:47:04 2025 +0100

Update exclusion list

gcc/testsuite/ChangeLog:

* rust/compile/nr2/exclude: Remove now passing tests from exclusion
list.

Signed-off-by: Pierre-Emmanuel Patry 

Diff:
---
 gcc/testsuite/rust/compile/nr2/exclude | 6 --
 1 file changed, 6 deletions(-)

diff --git a/gcc/testsuite/rust/compile/nr2/exclude b/gcc/testsuite/rust/compile/nr2/exclude
index 9273bd1489e1..75a0ae0ea0ed 100644
--- a/gcc/testsuite/rust/compile/nr2/exclude
+++ b/gcc/testsuite/rust/compile/nr2/exclude
@@ -2,12 +2,8 @@ canonical_paths1.rs
 cfg1.rs
 generics9.rs
 issue-2043.rs
-issue-2330.rs
 issue-2812.rs
-issue-850.rs
-issue-855.rs
 issue-3315-2.rs
-iterators1.rs
 lookup_err1.rs
 macros/mbe/macro43.rs
 macros/mbe/macro6.rs
@@ -27,8 +23,6 @@ derive_clone_enum3.rs
 derive-debug1.rs
 derive-default1.rs
 issue-3402-1.rs
-for-loop1.rs
-for-loop2.rs
 issue-3403.rs
 derive-eq-invalid.rs
 derive-hash1.rs