Re: Fix PR 118541 (V3), do not generate unordered fp cmoves for IEEE compares

2025-05-30 Thread Michael Meissner
On Thu, May 22, 2025 at 02:17:41PM +0530, Surya Kumari Jangala wrote: > Hi Mike, > The source code changes are missing. Whoops. I just posted a completely new patch. https://gcc.gnu.org/pipermail/gcc-patches/2025-May/685233.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusett

[PATCH, V7] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares

2025-05-30 Thread Michael Meissner
Now if I compare my original patches to the original code, only one benchmark is faster: 526.blender_r: 1.0% faster I have done bootstrap builds on both little endian and big endian power servers. Can I check this patch into the GCC trunk? 2025-05-29 Michael Meissner gcc/ P

Re: Fix PR 118541, do not generate unordered fp cmoves for IEEE compares

2025-05-21 Thread Michael Meissner
I have posted a new version of the patch at: https://gcc.gnu.org/pipermail/gcc-patches/2025-May/684473.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Re: Fix PR 118541 (V6 not V3), do not generate unordered fp cmoves for IEEE compares

2025-05-21 Thread Michael Meissner
I got the version number of the patch wrong. This patch is something like V6 of the patch, not V3. -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Fix PR 118541 (V3), do not generate unordered fp cmoves for IEEE compares

2025-05-21 Thread Michael Meissner
here were no regressions. Can I check this patch into the GCC trunk, and after a waiting period, can I check this into the active older branches? 2025-05-21 Michael Meissner gcc/ PR target/118541 * config/rs6000/predicates.md (invert_fpmask_comparison_operator):

Re: Fix PR 118541, do not generate unordered fp cmoves for IEEE compares

2025-05-12 Thread Michael Meissner
her desires, I remove the test for Ofast. -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

2025-05-08 Thread Michael Meissner
endian power9):
splat_dup_l_0:
        mfvsrld 9,34
        mtvsrdd 34,9,9
        blr
Now it generates:
splat_dup_l_0:
        xxpermdi 34,34,34,3
        blr
2025-04-30 Michael Meissner gcc/ PR target/99293 * config
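
For context, a source function of roughly the following shape is the kind of code that leads to the `splat_dup_l_0` sequences in the snippet. This is only a sketch: it uses GCC's generic vector extensions instead of the PowerPC `<altivec.h>` interface so that it builds on any target, and the type name `v2di` and the function body are assumptions, since the original test source is not shown in this excerpt.

```c
/* Sketch only: extract one element of a 2 x 64-bit vector and splat it
   across both lanes.  v2di and the function body are assumed, not
   taken from the patch.  */
typedef long long v2di __attribute__ ((vector_size (16)));

v2di
splat_dup_l_0 (v2di x)
{
  long long elt = x[0];        /* extract with a constant element number */
  return (v2di) { elt, elt };  /* duplicate it into both lanes */
}
```

Whether a given compiler turns this into a single permute depends on the target and options; the listing above is the only evidence of the generated code.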

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-05-08 Thread Michael Meissner
xxlnand => xxlnand xxlorc => xxlnand xxleqv => xxlnand xxlnor => xxlnand xxlor => xxlnand xxlxor => xxlnand xxlandc => xxlnand xxland => xxlnand 2025-04-30 Michael Meissner gcc/ PR target/117251 * confi
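
As background for the XXEVAL entries, the instruction combines three inputs bitwise under control of an 8-bit truth table. The scalar model below is a sketch under stated assumptions: the function name `eval3` is hypothetical, and the bit numbering (inputs form an index with `a` as the high bit, and the table is read from its most significant bit) is my reading and should be checked against the Power ISA before relying on it.

```c
#include <stdint.h>

/* Hypothetical scalar model of a three-input bitwise "evaluate"
   operation.  For every bit position, the bits of a, b and c form a
   3-bit index (a is the high bit), and the result bit is the entry of
   the 8-bit truth table imm at that index, counting from the most
   significant bit of imm.  The numbering is an assumption.  */
uint64_t
eval3 (uint64_t a, uint64_t b, uint64_t c, uint8_t imm)
{
  uint64_t result = 0;
  for (unsigned idx = 0; idx < 8; idx++)
    {
      if (((imm >> (7 - idx)) & 1) == 0)
        continue;
      /* Select the bit positions where (a, b, c) equals idx.  */
      uint64_t ma = (idx & 4) ? a : ~a;
      uint64_t mb = (idx & 2) ? b : ~b;
      uint64_t mc = (idx & 1) ? c : ~c;
      result |= ma & mb & mc;
    }
  return result;
}
```

Under this model, a Keccak-style `a ^ (~b & c)` step, which is why the subject line mentions SHA3, collapses into one table lookup with `imm` equal to `0x4B`.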

PR target/108958 -- use mtvsrdd to zero extend GPR DImode to VSX TImode

2025-05-08 Thread Michael Meissner
both little and big endian PowerPC systems and there were no regressions. Can I apply this patch to GCC 15? 2025-04-30 Michael Meissner gcc/ PR target/108958 * gcc/config/rs6000/rs6000.md (zero_extendditi2): New insn. gcc/testsuite/ PR target/108958 * gcc.t
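
The `zero_extendditi2` pattern named in the ChangeLog covers widening a 64-bit (DImode) value to 128 bits (TImode) with a zero upper half. A minimal C sketch of source that needs such an extension is shown below; the function name is hypothetical, `unsigned __int128` is a GCC/Clang extension on 64-bit targets, and whether the result lands in a VSX register depends on the surrounding code.

```c
/* Sketch: widen a 64-bit value to 128 bits with the upper half zero.
   The function name is an assumption made for illustration.  */
unsigned __int128
zero_extend_di_to_ti (unsigned long long x)
{
  return x;
}
```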

Fix PR 118541, do not generate unordered fp cmoves for IEEE compares

2025-05-08 Thread Michael Meissner
power9/power10 systems and there were no regressions. Can I check this patch into the GCC trunk, and after a waiting period, can I check this into the active older branches? 2025-04-30 Michael Meissner gcc/ PR target/118541 * config/rs6000/predica

[PATCH, V5] PR target/118541 - Do not generate unordered fp cmoves for IEEE compares on PowerPC

2025-04-05 Thread Michael Meissner
built bootstrap compilers on big endian power9 systems and little endian power9/power10 systems and there were no regressions. Can I check this patch into the GCC trunk, and after a waiting period, can I check this into the active older branches? 2025-03-28 Michael Meissner gcc/

[PATCH, V6] PR target/118541 - Do not generate unordered fp cmoves for IEEE compares on PowerPC

2025-04-01 Thread Michael Meissner
ower10 systems and there were no regressions. Can I check this patch into the GCC trunk, and after a waiting period, can I check this into the active older branches? 2025-04-01 Michael Meissner gcc/ PR target/118541 * config/rs6000/predicates.md (invert_fpmask_comparison_operato

Re: [PATCH, V3] PR target/118541 - Do not generate unordered fp cmoves for IEEE compares on PowerPC

2025-03-26 Thread Michael Meissner
On Mon, Mar 24, 2025 at 09:15:26PM +0100, Florian Weimer wrote: > * Michael Meissner: > > > +enum reverse_cond_t { > > + REVERSE_COND_ORDERED_OK, > > + REVERSE_COND_NO_ORDERED > > +}; > > This should probably be something > like > > enum re
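
The enum in the quoted hunk distinguishes whether a reversed condition may be expressed as an ordered comparison. The underlying issue can be seen in plain C: when an operand is a NaN, the negation of `a < b` is not the same as `a >= b`. The sketch below uses hypothetical function names and only standard `<math.h>` macros.

```c
#include <math.h>
#include <stdbool.h>

/* Logical negation of "a < b": true when a >= b OR when either operand
   is a NaN (the "unordered or greater-or-equal" condition).  */
bool
not_less (double a, double b)
{
  return !isless (a, b);
}

/* Ordered "a >= b": false when either operand is a NaN.  */
bool
greater_equal (double a, double b)
{
  return isgreaterequal (a, b);
}
```

For ordinary numbers the two functions agree; they differ only when a NaN is involved, which is the case a compiler has to respect unless it is told it may assume no NaNs.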

[PATCH, V4] PR target/118541 - Do not generate unordered fp cmoves for IEEE compares on PowerPC

2025-03-26 Thread Michael Meissner
ers on big endian power9 systems and little endian power9/power10 systems and there were no regressions. Can I check this patch into the GCC trunk, and after a waiting period, can I check this into the active older branches? 2025-03-26 Michael Meissner gcc/ PR target/118541 *

Re: [PATCH v2] rs6000: Adding missed ISA 3.0 atomic memory operation instructions.

2025-02-21 Thread Michael Meissner
amo6.c: Likewise. > * gcc.target/powerpc/amo7.c: Likewise. > > Co-authored-by: Jeevitha Palanisamy It looks reasonable to me. Hopefully Segher will approve. -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #4: [PATCH repost] PR target/117251 Add PowerPC XXEVAL support for fusion optimization in power10

2025-02-12 Thread Michael Meissner
. Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669138.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #2: [PATCH V2], Add PowerPC Dense Math Support for future cpus

2025-02-12 Thread Michael Meissner
.html Patch 3 of 3, add support for 1,024 bit dense math registers: https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670792.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #2: [PATCH, V2] Add Vector pair support

2025-02-12 Thread Michael Meissner
.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #4: [PATCH report] PR target/99293 Optimize splat of a V2DF/V2DI extract with constant element

2025-02-12 Thread Michael Meissner
. Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669136.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #4: [PATCH V4 0/2] Separate PowerPC ISA bits from architecture bits set by -mcpu=

2025-02-12 Thread Michael Meissner
/gcc-patches/2024-November/669110.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #4: [PATCH] PR target/108958: Use mtvsrdd to zero extend GPR DImode to VSX TImode

2025-02-12 Thread Michael Meissner
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669242.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #4: [PATCH V4 0/5] Add more user friendly TARGET_ names for PowerPC

2025-02-12 Thread Michael Meissner
: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669072.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #4: [PATCH] PR target/117487 Add power9/power10 float to logical operations

2025-02-12 Thread Michael Meissner
://gcc.gnu.org/pipermail/gcc-patches/2024-November/669137.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH, V3] PR target/118541 - Do not generate unordered fp cmoves for IEEE compares on PowerPC

2025-02-12 Thread Michael Meissner
fter a waiting period, can I check this into the active older branches? 2025-02-12 Michael Meissner gcc/ PR target/118541 * config/rs6000/predicates.md (invert_fpmask_comparison_operator): Do not allow UNLT and UNLE unless -ffast-math. * config/rs6000/rs6000-pr

Re: [PATCH, V2] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

2025-02-07 Thread Michael Meissner
On Fri, Feb 07, 2025 at 05:51:19PM -0600, Peter Bergner wrote: > On 2/7/25 4:02 PM, Michael Meissner wrote: > > (define_predicate "invert_fpmask_comparison_operator" > > - (match_code "ne,unlt,unle")) > > + (ior (match_code "ne") > > +

[PATCH, V2] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

2025-02-07 Thread Michael Meissner
red_compare:
        fcmpu 0,1,2
        fmr 1,4
        bnglr 0
        fmr 1,3
        blr
normal_compare:
        xscmpgtdp 1,1,2
        xxsel 1,4,3,1
        blr
2025-02-06 Michael Meissner gcc/ PR target/118541 * co
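
The two assembly sequences in the snippet correspond to two kinds of source-level comparison. The sketch below is my reading of the thread, not the test case from the patch: the function names are hypothetical, and the pairing of each function with a particular instruction sequence is an assumption.

```c
#include <math.h>

/* Quiet comparison: isgreater() must not raise the invalid exception
   for quiet NaN operands, so a conditional move built on a signaling
   compare is not a safe substitute here.  */
double
select_quiet (double a, double b, double x, double y)
{
  return isgreater (a, b) ? x : y;
}

/* Ordinary comparison: the > operator is allowed to signal on NaN.  */
double
select_ordinary (double a, double b, double x, double y)
{
  return (a > b) ? x : y;
}
```

Both functions return the same values for every input, NaN included; they differ only in which floating-point exceptions may be raised.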

Re: [PATCH] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

2025-02-05 Thread Michael Meissner
s unless they use -fsignaling-nans, but if the user explicitly uses isgreater, which says it does not trap, we should not generate code that will trap in some cases. Normal code using '>', etc. will only generate GT, GE, etc. and it will generate the cmove. -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Re: [PATCH] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

2025-01-31 Thread Michael Meissner
On Fri, Jan 31, 2025 at 08:04:53AM +0100, Richard Biener wrote: > On Fri, Jan 31, 2025 at 3:55 AM Michael Meissner > wrote: > > > > Fix PR 118541, do not generate unordered fp cmoves for IEEE compares. > > > > In bug PR target/118541 on power9, power10, and power11

[PATCH] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

2025-01-30 Thread Michael Meissner
ssions. Can I check this patch into the GCC trunk, and after a waiting period, can I check this into the active older branches? 2025-01-30 Michael Meissner gcc/ PR target/118541 * config/rs6000/rs6000-protos.h (REVERSE_COND_ORDERED_OK): New macro. (REVERSE_COND_NO_O

Ping #3: [PATCH] PR target/117487 Add power9/power10 float to logical operations

2025-01-23 Thread Michael Meissner
Ping patch to fix PR target/117487, Add power9/power10 float to logical operations Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669137.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping: [PATCH V2], Add PowerPC Dense Math Support for future cpus

2025-01-23 Thread Michael Meissner
of 3, add support for dense math registers: https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670791.html Patch 3 of 3, add support for 1,024 bit dense math registers: https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670792.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts

Ping #3: [PATCH V4 0/5] Add more user friendly TARGET_ names for PowerPC

2025-01-23 Thread Michael Meissner
://gcc.gnu.org/pipermail/gcc-patches/2024-November/669071.html Patch #5: Change TARGET_MODULO to TARGET_POWER9: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669072.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #3: [PATCH report] PR target/99293 Optimize splat of a V2DF/V2DI extract with constant element

2025-01-23 Thread Michael Meissner
Ping patch to fix PR target/99293, Optimize splat of a V2DF/V2DI extract with constant element: Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669136.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #3: [PATCH repost] PR target/117251 Add PowerPC XXEVAL support for fusion optimization in power10

2025-01-23 Thread Michael Meissner
Ping patch to fix PR target/117251, Add PowerPC XXEVAL support for fusion optimization in power10 Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669138.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #3: [PATCH] PR target/108958: Use mtvsrdd to zero extend GPR DImode to VSX TImode

2025-01-23 Thread Michael Meissner
Ping patch for PR target/108958, Use mtvsrdd to zero extend GPR DImode to VSX TImode Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669242.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #3: [PATCH V4 0/2] Separate PowerPC ISA bits from architecture bits set by -mcpu=

2025-01-23 Thread Michael Meissner
/669109.html Patch #2, use architecture flags for defining _ARCH_PWR macros: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669110.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #2: [PATCH repost] PR target/117251 Add PowerPC XXEVAL support for fusion optimization in power10

2025-01-09 Thread Michael Meissner
Ping patch to fix PR target/117251, Add PowerPC XXEVAL support for fusion optimization in power10 Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669138.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #2: [PATCH] PR target/108958: Use mtvsrdd to zero extend GPR DImode to VSX TImode

2025-01-09 Thread Michael Meissner
Ping patch for PR target/108958, Use mtvsrdd to zero extend GPR DImode to VSX TImode Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669242.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #2: [PATCH] PR target/117487 Add power9/power10 float to logical operations

2025-01-09 Thread Michael Meissner
Ping patch to fix PR target/117487, Add power9/power10 float to logical operations Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669137.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #2: [PATCH V4 0/2] Separate PowerPC ISA bits from architecture bits set by -mcpu=

2025-01-09 Thread Michael Meissner
/669109.html Patch #2, use architecture flags for defining _ARCH_PWR macros: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669110.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #2: [PATCH report] PR target/99293 Optimize splat of a V2DF/V2DI extract with constant element

2025-01-09 Thread Michael Meissner
Ping patch to fix PR target/99293, Optimize splat of a V2DF/V2DI extract with constant element: Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669136.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #2: [PATCH V4] Do not allow -mvsx to boost the cpu to power7

2025-01-09 Thread Michael Meissner
Ping patch to not allow -mvsx to boost the cpu to power7 Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669106.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #2: [PATCH V4 0/4] Add support for -mcpu=future in the PowerPC

2025-01-09 Thread Michael Meissner
second file is to change the test condition for the new future-3.c to exclude 32-bit tests. https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669104.html https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669132.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432

Ping #2: [PATCH V4 0/5] Add more user friendly TARGET_ names for PowerPC

2025-01-09 Thread Michael Meissner
://gcc.gnu.org/pipermail/gcc-patches/2024-November/669071.html Patch #5: Change TARGET_MODULO to TARGET_POWER9: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669072.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Re: [PING^3][PATCH] testsuite: Simplify target test and dg-options for AMO tests

2024-12-04 Thread Michael Meissner
> > /* { dg-do run { target { powerpc*-*-linux* && { lp64 && p9vector_hw } } }
> > } */
> > -/* { dg-options "-O2 -mvsx -mpower9-misc" } */
> > -/* { dg-additional-options "-mdejagnu-cpu=power9" { target { !
> > has_arch_pwr9 } } } */
> > +/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
> > +/* { dg-require-effective-target powerpc_vsx } */
> >
> > #include
> > #include
> >
> >
> >
>
-- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Re: [PATCH repost, 0/5] Add PowerPC Dense Math Support for future cpus

2024-12-04 Thread Michael Meissner
-December/670791.html https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670792.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH V2, 3/3], Add support for 1,024 Dense Math Registers

2024-12-04 Thread Michael Meissner
endian systems. Can I check it into the master branch? 2024-12-04 Michael Meissner gcc/ * config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec. (UNSPEC_DM_INSERT512_LOWER): Likewise. (UNSPEC_DM_EXTRACT512): Likewise. (

[PATCH V2, 2/3] Add support for dense math registers

2024-12-04 Thread Michael Meissner
espondence. It is possible that the mangling for DMRs and the GDB register numbers may produce other changes in the future. gcc/ 2024-12-04 Michael Meissner * config/rs6000/mma.md (UNSPEC_MMA_DMSETDMRZ): New unspec. (movxo): Add comments about dense math registers.

[PATCH V2, 1/3], Add wD constraint

2024-12-04 Thread Michael Meissner
"register_operand")
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 0878929de22..3047a9e9a9b 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -2412,6 +2412,7 @@ rs6000_debug_reg_global (void)
            "wr reg_class = %s\n"
            "wx reg_class = %s\n"
            "wA reg_class = %s\n"
+           "wD reg_class = %s\n"
            "\n",
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_d]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_v]],
@@ -2419,7 +2420,8 @@ rs6000_debug_reg_global (void)
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_we]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wr]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]],
-           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wA]]);
+           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wA]],
+           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wD]]);
 
   nl = "\n";
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
@@ -3082,6 +3084,9 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
   if (TARGET_DIRECT_MOVE_128)
     rs6000_constraints[RS6000_CONSTRAINT_we] = VSX_REGS;
 
+  if (TARGET_MMA)
+    rs6000_constraints[RS6000_CONSTRAINT_wD] = FLOAT_REGS;
+
   /* Set up the reload helper and direct move functions. */
   if (TARGET_VSX || TARGET_ALTIVEC)
     {
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 392ca858fc4..69519851326 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1197,6 +1197,7 @@ enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_wr,         /* GPR register if 64-bit */
   RS6000_CONSTRAINT_wx,         /* FPR register for STFIWX */
   RS6000_CONSTRAINT_wA,         /* BASE_REGS if 64-bit. */
+  RS6000_CONSTRAINT_wD,         /* Accumulator regs if MMA/Dense Math. */
   RS6000_CONSTRAINT_MAX
 };
 
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 69605bf75c0..5ceccc9b97f 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3379,6 +3379,11 @@ Like @code{d}, if @option{-mpowerpc-gfxopt} is used; otherwise, @code{NO_REGS}.
 @item wA
 Like @code{b}, if @option{-mpowerpc64} is used; otherwise, @code{NO_REGS}.
 
+@item wD
+Accumulator register if @option{-mma} is used; otherwise,
+@code{NO_REGS}. For @option{-mcpu=power10} the accumulator registers
+overlap with VSX vector registers 0..31.
+
 @item wB
 Signed 5-bit constant integer that can be loaded into an Altivec register.
--
2.47.0
-- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH V2], Add PowerPC Dense Math Support for future cpus

2024-12-04 Thread Michael Meissner
combination: Do not allow -mvsx to boost the cpu to power7 https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669106.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Re: [PATCH] Add Vector pair support

2024-12-04 Thread Michael Meissner
I provided an update to this patch here: https://gcc.gnu.org/pipermail/gcc-patches/2024-December/670787.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH, V2] Add Vector pair support

2024-12-04 Thread Michael Meissner
        xvmaddadp 0,12,11
        lxvx 12,7,10
        lxvx 11,11,10
        stxvx 0,3,10
        lxvx 0,8,10
        xvmaddadp 0,12,11
        stxvx 0,8,10
        bdnz .L93
2024-12-02 Michael Meissner gcc/ * config.gcc (pow

Ping: [PATCH] PR target/108958: Use mtvsrdd to zero extend GPR DImode to VSX TImode

2024-12-04 Thread Michael Meissner
Ping patch for PR target/108958, Use mtvsrdd to zero extend GPR DImode to VSX TImode Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669242.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping: [PATCH] PR target/117251 Add PowerPC XXEVAL support for fusion optimization in power10

2024-12-04 Thread Michael Meissner
Ping patch to fix PR target/117251, Add PowerPC XXEVAL support for fusion optimization in power10 Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669138.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping: [PATCH] PR target/117487 Add power9/power10 float to logical operations

2024-12-04 Thread Michael Meissner
Ping patch to fix PR target/117487, Add power9/power10 float to logical operations Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669137.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping: [PATCH V4 0/4] Add support for -mcpu=future in the PowerPC

2024-12-04 Thread Michael Meissner
second file is to change the test condition for the new future-3.c to exclude 32-bit tests. https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669104.html https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669132.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432

Ping: [PATCH] PR target/99293 Optimize splat of a V2DF/V2DI extract with constant element

2024-12-04 Thread Michael Meissner
Ping patch to fix PR target/99293, Optimize splat of a V2DF/V2DI extract with constant element: Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669136.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping: [PATCH V4] Do not allow -mvsx to boost the cpu to power7

2024-12-03 Thread Michael Meissner
Ping patch to not allow -mvsx to boost the cpu to power7 Message-ID https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669106.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping: [PATCH V4 0/2] Separate PowerPC ISA bits from architecture bits set by -mcpu=

2024-12-03 Thread Michael Meissner
/669109.html Patch #2, use architecture flags for defining _ARCH_PWR macros: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669110.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping: [PATCH V4 0/5] Add more user friendly TARGET_ names for PowerPC

2024-12-03 Thread Michael Meissner
://gcc.gnu.org/pipermail/gcc-patches/2024-November/669071.html Patch #5: Change TARGET_MODULO to TARGET_POWER9: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669072.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH] PR target/108958: Use mtvsrdd to zero extend GPR DImode to VSX TImode

2024-11-17 Thread Michael Meissner
patch to GCC 15? 2024-11-17 Michael Meissner gcc/ PR target/108958 * gcc/config/rs6000/rs6000.md (zero_extendditi2): New insn. gcc/testsuite/ PR target/108958 * gcc.target/powerpc/pr108958.c: New test. --- gcc/config/rs6000/rs6000.md | 46

Re: [PATCH repost, 3/5] PowerPC: Switch to dense math names for all MMA operations

2024-11-17 Thread Michael Meissner
If we eliminate patches #3 (switch to dense math names for all MMA operations) and patch #4 (add dense math test for new instruction) it will continue to generate the power10 form of the shared instructions and not the future form dense math registers. -- Michael Meissner, IBM PO Box 98, Ayer

Re: [PATCH V2 4/11] Change TARGET_POPCNTB to TARGET_POWER5

2024-11-17 Thread Michael Meissner
On Thu, Nov 14, 2024 at 06:26:11PM -0600, Peter Bergner wrote: > On 11/8/24 1:49 PM, Michael Meissner wrote: > > As part of the architecture flags patches, this patch changes the use of > > TARGET_POPCNTB to TARGET_POWER5. The POPCNTB instruction was added in ISA > > 2.02

Re: [PATCH V2 9/11] Update tests to work with architecture flags changes.

2024-11-17 Thread Michael Meissner
On Thu, Nov 14, 2024 at 06:47:58PM -0600, Peter Bergner wrote: > On 11/8/24 1:55 PM, Michael Meissner wrote: > > Two tests used -mvsx to raise the processor level to at least power7. These > > tests were rewritten to add cpu=power7 support. > > Again, this cleanup p

[PATCH repost, 1/5] Add wD constraint

2024-11-16 Thread Michael Meissner
"register_operand")
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index dc5b7eb74d4..7551d7452bc 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -2410,6 +2410,7 @@ rs6000_debug_reg_global (void)
            "wr reg_class = %s\n"
            "wx reg_class = %s\n"
            "wA reg_class = %s\n"
+           "wD reg_class = %s\n"
            "\n",
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_d]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_v]],
@@ -2417,7 +2418,8 @@ rs6000_debug_reg_global (void)
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_we]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wr]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]],
-           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wA]]);
+           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wA]],
+           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wD]]);
 
   nl = "\n";
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
@@ -3080,6 +3082,9 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
   if (TARGET_DIRECT_MOVE_128)
     rs6000_constraints[RS6000_CONSTRAINT_we] = VSX_REGS;
 
+  if (TARGET_MMA)
+    rs6000_constraints[RS6000_CONSTRAINT_wD] = FLOAT_REGS;
+
   /* Set up the reload helper and direct move functions. */
   if (TARGET_VSX || TARGET_ALTIVEC)
     {
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index f95318dd553..86171275ff5 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1200,6 +1200,7 @@ enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_wr,         /* GPR register if 64-bit */
   RS6000_CONSTRAINT_wx,         /* FPR register for STFIWX */
   RS6000_CONSTRAINT_wA,         /* BASE_REGS if 64-bit. */
+  RS6000_CONSTRAINT_wD,         /* Accumulator regs if MMA/Dense Math. */
   RS6000_CONSTRAINT_MAX
 };
 
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 25ded86f0d1..0d73b35f6de 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3440,6 +3440,11 @@ Like @code{d}, if @option{-mpowerpc-gfxopt} is used; otherwise, @code{NO_REGS}.
 @item wA
 Like @code{b}, if @option{-mpowerpc64} is used; otherwise, @code{NO_REGS}.
 
+@item wD
+Accumulator register if @option{-mma} is used; otherwise,
+@code{NO_REGS}. For @option{-mcpu=power10} the accumulator registers
+overlap with VSX vector registers 0..31.
+
 @item wB
 Signed 5-bit constant integer that can be loaded into an Altivec register.
--
2.47.0
-- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH repost, 4/5] Add dense math test for new instruction

2024-11-16 Thread Michael Meissner
#error "target does not have dense math support."
+ #else
+   /* Make sure we have dense math support. */
+   __vector_quad dmr;
+   __asm__ ("dmsetaccz %A0" : "=wD" (dmr));
+   vq = dmr;
+ #endif
+

[PATCH repost, 3/5] PowerPC: Switch to dense math names for all MMA operations

2024-11-16 Thread Michael Meissner
add the 'dm' prefix afterwards. To prevent having two sets of parallel int attributes, we remove the "pm" prefix from the instruction string in the attributes, and add it later, both in the insn name and in the output template. 2024-11-16 Michael Meissner gcc/ *

[PATCH repost, 2/5] Add support for dense math registers.

2024-11-16 Thread Michael Meissner
espondence. It is possible that the mangling for DMRs and the GDB register numbers may produce other changes in the future. gcc/ 2024-11-16 Michael Meissner * config/rs6000/mma.md (UNSPEC_MMA_DMSETDMRZ): New unspec. (movxo): Add comments about dense math registers.

[PATCH repost, 0/5] Add PowerPC Dense Math Support for future cpus

2024-11-16 Thread Michael Meissner
08.html The other bug fixes posted are independent of this patch. -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH repost, 5/5] Add support for 1,024 bit Dense Math registers

2024-11-16 Thread Michael Meissner
endian systems. Can I check it into the master branch? 2024-11-16 Michael Meissner gcc/ * config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec. (UNSPEC_DM_INSERT512_LOWER): Likewise. (UNSPEC_DM_EXTRACT512): Likewise. (

[PATCH, repost] Add Vector pair support

2024-11-16 Thread Michael Meissner
        lxvx 12,7,10
        lxvx 11,11,10
        stxvx 0,3,10
        lxvx 0,8,10
        xvmaddadp 0,12,11
        stxvx 0,8,10
        bdnz .L93
2024-11-16 Michael Meissner gcc/ * config.gcc (powerpc*-*-*): Add vector-pair.h to extra he

[PATCH repost] PR target/117251 Add PowerPC XXEVAL support for fusion optimization in power10

2024-11-16 Thread Michael Meissner
t; xxlor xxlnand => xxlnand xxlorc => xxlnand xxleqv => xxlnand xxlnor => xxlnand xxlor => xxlnand xxlxor => xxlnand xxlandc => xxlnand xxland => xxlnand I have built GCC with the patches in this patch set applied on both li

[PATCH] PR target/117487 Add power9/power10 float to logical operations

2024-11-16 Thread Michael Meissner
s. Can I apply this patch to GCC 15? 2024-11-16 Michael Meissner gcc/ PR target/117487 * config/rs6000/vsx.md (SFmode logical peephole): Update comments in the original code that supports power8. Add a new define_peephole2 to do the optimization on power

[PATCH report] PR target/99293 Optimize splat of a V2DF/V2DI extract with constant element

2024-11-16 Thread Michael Meissner
es in this patch set applied on both little and big endian PowerPC systems and there were no regressions. Can I apply this patch to GCC 15? 2024-11-16 Michael Meissner gcc/ * config/rs6000/vsx.md (vsx_splat_extract_): New insn. gcc/testsuite/ * gcc.target/powerpc/builtins-1

[PATCH V4 5/4] Restrict future-3.c test to 64-bits

2024-11-16 Thread Michael Meissner
When I checked the previous patch, I didn't check it out on 32-bits. In 32-bit mode, the vector pair load and stores are not generated, even if -mcpu=future is used. Only run the future-3.c in 64-bit mode. 2024-11-16 Michael Meissner gcc/testsuite/ * gcc.target/powerpc/futur

[PATCH V4 3/4] Add -mcpu=future tests

2024-11-16 Thread Michael Meissner
This patch adds simple tests for -mcpu=future. I have built GCC with the patches in this patch set applied on both little and big endian PowerPC systems and there were no regressions. Can I apply this patch to GCC 15? 2024-11-16 Michael Meissner gcc/testsuite/ * gcc.target/powerpc

[PATCH V4 1/2] Add rs6000 architecture masks.

2024-11-16 Thread Michael Meissner
this patch to GCC 15? 2024-11-16 Michael Meissner gcc/ * config/rs6000/default64.h (TARGET_CPU_DEFAULT): Set default cpu name. * config/rs6000/rs6000-arch.def: New file. * config/rs6000/rs6000.cc (struct clone_map): Switch to using architecture masks instead of

[PATCH V4 2/2] Use architecture flags for defining _ARCH_PWR macros.

2024-11-16 Thread Michael Meissner
bits that aren't removed with this patch because the built-in function support uses those bits. I have built both big endian and little endian bootstrap compilers and there were no regressions. Can I install this patch on the GCC 15 trunk? 2024-11-16 Michael Meissner gcc/ * c

[PATCH V4 0/2] Separate PowerPC ISA bits from architecture bits set by -mcpu=

2024-11-16 Thread Michael Meissner
ittle and big endian PowerPC systems and there were no regressions. Can I apply these patches to GCC 15? -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com
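The patch series above concerns how the compiler derives the `_ARCH_PWR*` macros from the `-mcpu=` selection. As a rough illustration of what those macros look like from user code, here is a minimal sketch. The helper name `powerpc_arch_level` is hypothetical and not part of the patches; the code relies only on the `_ARCH_PWR5` through `_ARCH_PWR10` macros that GCC predefines on PowerPC, and returns "unknown" on any other target.

```c
/* Hypothetical helper (not from the patch series): report the highest
   PowerPC architecture level the compiler was asked to target, using
   the predefined _ARCH_PWR* macros.  On non-PowerPC targets none of
   the macros are defined, so "unknown" is returned.  */
const char *
powerpc_arch_level (void)
{
#if defined (_ARCH_PWR10)
  return "power10";
#elif defined (_ARCH_PWR9)
  return "power9";
#elif defined (_ARCH_PWR8)
  return "power8";
#elif defined (_ARCH_PWR7)
  return "power7";
#elif defined (_ARCH_PWR6)
  return "power6";
#elif defined (_ARCH_PWR5)
  return "power5";
#else
  return "unknown";
#endif
}
```

Compiling the same file with different `-mcpu=` values on a PowerPC toolchain should change which branch is selected; the exact mapping depends on the compiler version.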

[PATCH V4] Do not allow -mvsx to boost the cpu to power7

2024-11-16 Thread Michael Meissner
pc*-*-* && lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target longdouble128 } */
-/* { dg-options "-O2 -mdejagnu-cpu=power7 -mabi=ieeelongdouble -mno-popcntd -Wno-psabi" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power6 -mabi=ieeelongdouble -Wno-psabi" } */

 int i;
--
2.47.0

-- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH V4 1/4] Add support for -mcpu=future in the PowerPC

2024-11-16 Thread Michael Meissner
ppce300c2,ppce300c3,ppce500mc,ppce500mc64,ppce5500,ppce6500,
- power4,power5,power6,power7,power8,power9,power10,power11,
+ power4,power5,power6,power7,power8,power9,power10,power11,future,
 rs64a,mpccore,cell,ppca2,titan"
 (const (symbol_ref "(enum attr_cpu) rs6000_tune")))

diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 94323bd1db2..876b9f0d4af 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -630,6 +630,12 @@ mieee128-constant
 Target Var(TARGET_IEEE128_CONSTANT) Init(1) Save
 Generate (do not generate) code that uses the LXVKQ instruction.

+;; Users should not use -mfuture, but we need to use a bit to identify when
+;; the user changes the default cpu via #pragma GCC target("cpu=future")
+;; and then resets it later.
+mfuture
+Target Undocumented Mask(FUTURE) Var(rs6000_isa_flags) WarnRemoved
+
 ; Documented parameters

 -param=rs6000-vect-unroll-limit=
--
2.47.0

-- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com
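The comment in the rs6000.opt hunk refers to user code that changes the cpu with a target pragma and later resets it. A minimal sketch of that pattern follows, under stated assumptions: `cpu=power9` is used as a stand-in because `cpu=future` needs a compiler with this patch applied, the function `sum_bytes` and the macro `HAVE_CPU_OVERRIDE` are hypothetical names, and the pragmas are guarded so the file still compiles unchanged on other targets.

```c
/* Sketch only: switch the target cpu for one function, then restore
   the command-line setting.  The pragmas are skipped unless this is
   GCC on 64-bit PowerPC.  */
#if defined (__powerpc64__) && defined (__GNUC__) && !defined (__clang__)
#pragma GCC push_options
#pragma GCC target ("cpu=power9")   /* stand-in for "cpu=future" */
#define HAVE_CPU_OVERRIDE 1         /* hypothetical marker macro */
#endif

/* Hypothetical function compiled under the overridden cpu setting.  */
long
sum_bytes (const unsigned char *p, long n)
{
  long total = 0;
  for (long i = 0; i < n; i++)
    total += p[i];
  return total;
}

#ifdef HAVE_CPU_OVERRIDE
#pragma GCC pop_options             /* reset to the original cpu */
#endif
```

Code after the `pop_options` line is compiled with the original cpu again, which is the "resets it later" case the patch comment describes.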

[PATCH V4 4/4] Use vector pair load/store for memcpy with -mcpu=future

2024-11-16 Thread Michael Meissner
on both little and big endian PowerPC systems and there were no regressions. Can I apply this patch to GCC 15? 2024-11-16 Michael Meissner gcc/ * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Enable using load vector pair and store vector pair instructions for

[PATCH V4 2/4] Add tuning support for -mcpu=future

2024-11-16 Thread Michael Meissner
This patch makes -mtune=future use the same tuning decision as -mtune=power11. I have built GCC with the patches in this patch set applied on both little and big endian PowerPC systems and there were no regressions. Can I apply this patch to GCC 15? 2024-11-16 Michael Meissner gcc

[PATCH V4 0/4] Add support for -mcpu=future in the PowerPC

2024-11-16 Thread Michael Meissner
stems and there were no regressions. Can I apply these patches to GCC 15? -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH V4 5/5] Change TARGET_MODULO to TARGET_POWER9.

2024-11-16 Thread Michael Meissner
patch into GCC 15? 2024-11-15 Michael Meissner gcc/ * gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Change TARGET_MODULO to TARGET_POWER9. * gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal): Likewise. * gcc/config/rs6000

[PATCH V4 3/5] Change TARGET_CMPB to TARGET_POWER6.

2024-11-16 Thread Michael Meissner
GCC 15? 2024-11-16 Michael Meissner gcc/ * gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Change TARGET_CMPB to TARGET_POWER6. * gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal): Likewise. (rs6000_rtx_costs): Likewise

[PATCH V4 4/5] Change TARGET_POPCNTD to TARGET_POWER7.

2024-11-16 Thread Michael Meissner
patch into GCC 15? 2024-11-16 Michael Meissner gcc/ * gcc/config/rs6000/dfp.md (cmp_internal1): Change TARGET_POPCNTD to TARGET_POWER7. * gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Likewise. * gcc/config/rs6000/rs6000-string.cc

[PATCH V4 2/5] Change TARGET_FPRND to TARGET_POWER5X.

2024-11-16 Thread Michael Meissner
check this patch into GCC 15? 2024-11-15 Michael Meissner gcc/ * gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal): Change TARGET_FPRND to TARGET_POWER5X. * gcc/config/rs6000/rs6000.h (TARGET_POWER5X): New macro. * gcc/config/rs6000/rs6000.md (fmod3

[PATCH V4 1/5] Change TARGET_POPCNTB to TARGET_POWER5.

2024-11-16 Thread Michael Meissner
regression. Can I check it into GCC 15? 2024-11-15 Michael Meissner gcc/ * gcc/config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Change TARGET_POPCNTB to TARGET_POWER5. * gcc/config/rs6000/rs6000.cc (rs6000_option_override_internal): Likewise. * gcc

[PATCH V4 0/5] Add more user friendly TARGET_ names for PowerPC

2024-11-15 Thread Michael Meissner
l the power7 population count instructions, but TARGET_POWER7 is used elsewhere. 5: Use TARGET_POWER9 instead of TARGET_MODULO. These patches have been tested on both little endian and big endian systems. Can I check these changes into GCC 15? -- Michael Meissner, IBM PO Box 98,

Re: [PATCH V2 4/11] Change TARGET_POPCNTB to TARGET_POWER5

2024-11-15 Thread Michael Meissner
On Thu, Nov 14, 2024 at 06:26:11PM -0600, Peter Bergner wrote: > On 11/8/24 1:49 PM, Michael Meissner wrote: > > As part of the architecture flags patches, this patch changes the use of > > TARGET_POPCNTB to TARGET_POWER5. The POPCNTB instruction was added in ISA > > 2.02

Ping #4: [PATCH] PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

2024-11-13 Thread Michael Meissner
runtime with this patch applied on a power10 system:
505.mcf_r 101.67%
520.omnetpp_r 103.35%
523.xalancbmk_r 101.15%
-- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH V3, 06/11] Change TARGET_CMPB to TARGET_POWER6

2024-11-13 Thread Michael Meissner
generated exactly the same code with the patches installed compared to the compiler before installing the patches. Can I install this patch on the GCC 15 trunk? 2024-11-13 Michael Meissner * config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use TARGET_POWER6 instead of

Ping: [PATCH] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2024-11-13 Thread Michael Meissner
This patch seems to have been overlooked: https://gcc.gnu.org/pipermail/gcc-patches/2024-October/666393.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH V3, 11/11] Add -mcpu=future tuning support.

2024-11-13 Thread Michael Meissner
This patch makes -mtune=future use the same tuning decision as -mtune=power11. 2024-11-13 Michael Meissner gcc/ * config/rs6000/power10.md (all reservations): Add future as an alternative to power10 and power11. --- gcc/config/rs6000/power10.md | 144

Ping: [PATCH 0/6] PowerPC Future support (Dense Math Registers)

2024-11-13 Thread Michael Meissner
://gcc.gnu.org/pipermail/gcc-patches/2024-October/65.html https://gcc.gnu.org/pipermail/gcc-patches/2024-October/66.html https://gcc.gnu.org/pipermail/gcc-patches/2024-October/67.html https://gcc.gnu.org/pipermail/gcc-patches/2024-October/68.html -- Michael Meissner, IBM PO Box 98

Ping: [PATCH, V2] PowerPC vector pair support

2024-11-13 Thread Michael Meissner
This patch appears to be overlooked: The first link is the long explanation of the patch, and the second link is the patch itself. https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667451.html https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667452.html -- Michael Meissner, IBM PO

Re: [PATCH V2 1/11] Add rs6000 architecture masks.

2024-11-13 Thread Michael Meissner
On Fri, Nov 08, 2024 at 02:28:11PM -0600, Peter Bergner wrote: > On 11/8/24 1:44 PM, Michael Meissner wrote: > > diff --git a/gcc/config/rs6000/rs6000-arch.def > > b/gcc/config/rs6000/rs6000-arch.def > > new file mode 100644 > > index 000..e5b6e958133 >

[PATCH V3, 10/11] Add support for -mcpu=future

2024-11-13 Thread Michael Meissner
This patch adds the support that can be used in developing GCC support for future PowerPC processors. 2024-11-13 Michael Meissner * config.gcc (powerpc*-*-*): Add support for --with-cpu=future. * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=future
