Hello All:

This patch replaces the 6-cycle fmr instruction with the 2-cycle xxlor
instruction on the p7 and p8 architectures.

I have implemented this with a switch statement over the alternatives;
otherwise it is difficult to emit xxlor for p7 and p8 while still emitting
fmr for the other architectures.
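
The choice made for the FPR-to-FPR move alternative can be sketched in
plain C as follows. This is only an illustration of the condition used in
the patch below: the struct target_flags type and the fpr_move_template
function are hypothetical names invented for this sketch, and the boolean
fields stand in for GCC's TARGET_VSX, TARGET_P8_VECTOR, TARGET_P9_VECTOR
and TARGET_POWER10 macros.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GCC target macros tested in the patch.  */
struct target_flags
{
  bool vsx;        /* models TARGET_VSX */
  bool p8_vector;  /* models TARGET_P8_VECTOR */
  bool p9_vector;  /* models TARGET_P9_VECTOR */
  bool power10;    /* models TARGET_POWER10 */
};

/* Return the output template for an FPR-to-FPR move.
   xxlor is chosen only when VSX (or the p8 vector extension) is enabled
   and the target is older than p9; every other target keeps fmr.  */
static const char *
fpr_move_template (const struct target_flags *t)
{
  if ((t->vsx || t->p8_vector) && !t->p9_vector && !t->power10)
    return "xxlor %0,%1,%1";
  return "fmr %0,%1";
}
```

Under these assumptions, a target without VSX always gets fmr, so the
pattern stays valid on older CPUs.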

Bootstrapped and regtested.

Thanks & Regards
Ajit

                
        rs6000: fmr gets used instead of faster xxlor [PR93571]

        This patch replaces the 6-cycle fmr instruction with the
        2-cycle xxlor instruction on the p7 and p8 architectures.

        2023-02-21  Ajit Kumar Agarwal  <aagar...@linux.ibm.com>

gcc/ChangeLog:

        * config/rs6000/rs6000.md (*mov<mode>_hardfloat64): Replace fmr
        with xxlor instruction.
---
 gcc/config/rs6000/rs6000.md | 49 ++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index dfd6c73ffcb..ef587033367 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -8433,26 +8433,35 @@ (define_insn "*mov<mode>_hardfloat64"
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
-  "@
-   stfd%U0%X0 %1,%0
-   lfd%U1%X1 %0,%1
-   xxlor %0,%1,%1
-   lxsd %0,%1
-   stxsd %1,%0
-   lxsdx %x0,%y1
-   stxsdx %x1,%y0
-   xxlor %x0,%x1,%x1
-   xxlxor %x0,%x0,%x0
-   li %0,0
-   std%U0%X0 %1,%0
-   ld%U1%X1 %0,%1
-   mr %0,%1
-   mt%0 %1
-   mf%1 %0
-   nop
-   mfvsrd %0,%x1
-   mtvsrd %x0,%1
-   #"
+{
+  switch (which_alternative) {
+    case 0 :  return "stfd%U0%X0 %1,%0";
+    case 1 :  return "lfd%U1%X1 %0,%1";
+    case 2 : if ((TARGET_VSX || TARGET_P8_VECTOR)
+                  && !TARGET_P9_VECTOR
+                  && !TARGET_POWER10)
+               return "xxlor %0,%1,%1";
+              else
+                return "fmr %0,%1";
+
+     case 3 : return "lxsd %0,%1";
+     case 4 : return "stxsd %1,%0";
+     case 5 : return "lxsdx %x0,%y1";
+     case 6 : return "stxsdx %x1,%y0";
+     case 7 : return "xxlor %x0,%x1,%x1";
+     case 8 : return "xxlxor %x0,%x0,%x0";
+     case 9 : return "li %0,0";
+     case 10 : return "std%U0%X0 %1,%0";
+     case 11 : return "ld%U1%X1 %0,%1";
+     case 12 : return "mr %0,%1";
+     case 13 : return "mt%0 %1";
+     case 14 : return "mf%1 %0";
+     case 15 : return "nop";
+     case 16: return "mfvsrd %0,%x1";
+     case 17 : return "mtvsrd %x0,%1";
+   }
+   return "unreachable";
+}
   [(set_attr "type"
             "fpstore,     fpload,     fpsimple,   fpload,     fpstore,
              fpload,      fpstore,    veclogical, veclogical, integer,
-- 
2.31.1


On 17/02/23 10:53 pm, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Feb 17, 2023 at 10:28:41PM +0530, Ajit Agarwal wrote:
>> This patch replaces fmr instruction (6 cycles) with xxlor instruction (2 cycles)
>> Bootstrapped and regtested on powerpc64-linux-gnu.
> 
> You tested this on a CPU that does have VSX.  It is incorrect on other
> (older) CPUs.
> 
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -8436,7 +8436,7 @@
>>    "@
>>     stfd%U0%X0 %1,%0
>>     lfd%U1%X1 %0,%1
>> -   fmr %0,%1
>> +   xxlor %0,%1,%1
>>     lxsd %0,%1
>>     stxsd %1,%0
>>     lxsdx %x0,%y1
> 
> This is the *mov<mode>_hardfloat64 pattern.  You can add some magic to
> your Git config so that will show in the patch: in .git/config:
> 
> [diff "md"]
>         xfuncname = "^\\(define.*$"
> 
> (As it says in .gitattributes:
>   # Make diff on MD files use "(define" as a function marker.
>   # Use together with git config diff.md.xfuncname '^\(define.*$'
>   # which is run by contrib/gcc-git-customization.sh too.
>   *.md diff=md
> )
> 
> The third alternative to this insn, the fmr one, has "d" as both input
> and output constraint, and has "*" as isa attribute, so it will be used
> on any CPU that has floating point registers.  The eighth alternative
> (the existing xxlor one) has "wa" constraints (via <f64_vsx>) so it
> implicitly requires VSX to be enabled.  You need to do something similar
> for what you want, but you also need to still allow fmr.
> 
> 
> Segher
