Re: PATCH [8 of 8], rs6000, add support for scalar floating point in Altivec registers

Michael Meissner Fri, 14 Nov 2014 12:17:39 -0800

I tracked down the regression in the SPEC benchmarks.  It was caused by turning
off pre-increment/pre-decrement addressing for floating point values, and these
two benchmarks use pre-increment/pre-decrement quite a bit.  My secondary reload
handlers are capable of adding in the pre-increment/pre-decrement if such an
operation is attempted on an Altivec register.
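
For illustration only (this is not code from the benchmarks), the kind of loop
affected looks like the sketch below.  The function name sum_preinc is
hypothetical.  On PowerPC, a load through *++p is the kind of access a compiler
can map onto an update-form floating point load when pre-increment addressing
is allowed for the mode; whether it actually does so depends on the compiler
version and options.

```c
#include <stddef.h>

/* Sum N doubles starting at A, advancing the pointer with pre-increment.
   The expression *++p both bumps the address and loads from it, which is
   the access pattern that pre-increment addressing is meant to serve.  */
double
sum_preinc (const double *a, size_t n)
{
  if (n == 0)
    return 0.0;

  const double *p = a;
  double sum = *p;      /* first element, no increment needed */

  while (--n > 0)
    sum += *++p;        /* pre-increment, then load */

  return sum;
}
```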


I am also including a patch to make the compiler work with -ffast-math.  If you
use -ffast-math, the easy_fp_constant predicate says that all constants are
easy in order to enable using the reciprocal approximation instructions for
division.  I put in a define_split to move the constants to the constant pool
after the reciprocal approximation work has been done but before reload
starts.  I had this patch in during development and thought I did not need it
when making up the patches, but perhaps recent changes to the register
allocator make it necessary again.
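
The decision made by the new memory_fp_constant predicate (see the
predicates.md hunk) can be modeled in plain C roughly as follows.  This is a
sketch, not compiler code: struct target, enum fp_mode, and
needs_constant_pool are hypothetical names, the boolean fields stand in for
the TARGET_* macros, and is_zero stands in for the op == CONST0_RTX (mode)
test.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the SFmode/DFmode machine modes.  */
enum fp_mode { MODE_SF, MODE_DF };

/* Hypothetical stand-ins for the TARGET_* macros the predicate tests.  */
struct target
{
  bool vsx;
  bool hard_float;
  bool fprs;
  bool single_float;
  bool double_float;
};

/* Return true if a floating point constant of MODE must be loaded from
   the constant pool, following the same order of tests as the predicate.  */
bool
needs_constant_pool (const struct target *t, enum fp_mode mode, bool is_zero)
{
  /* With VSX, zero is not forced into memory.  */
  if (t->vsx && is_zero)
    return false;

  /* Without hardware floating point for this mode, the predicate
     does not apply.  */
  if (!t->hard_float || !t->fprs
      || (mode == MODE_SF && !t->single_float)
      || (mode == MODE_DF && !t->double_float))
    return false;

  return true;
}
```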

I added an option (-mupper-regs) to simplify setting both -mupper-regs-sf and
-mupper-regs-df.  It will only set the options that the particular machine
supports.
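
As a rough model of how the combined option resolves (mirroring the
rs6000_option_override_internal hunk below, but not compiler code), consider
the following sketch.  The names struct isa_state, apply_one, and
resolve_upper_regs are hypothetical, and the two mask values are placeholders
for OPTION_MASK_UPPER_REGS_DF and OPTION_MASK_UPPER_REGS_SF.  The upper_regs
argument is -1 when neither -mupper-regs nor -mno-upper-regs was given, which
matches the Init(-1) in rs6000.opt.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bits standing in for OPTION_MASK_UPPER_REGS_DF/SF.  */
#define UPPER_REGS_DF 0x1u
#define UPPER_REGS_SF 0x2u

struct isa_state
{
  unsigned flags;       /* options currently enabled */
  unsigned explicit_;   /* options considered explicitly set */
};

/* Set or clear MASK unless the cpu lacks support or the user already
   set the individual option; then record the option as explicit.  */
static void
apply_one (struct isa_state *s, unsigned mask, bool supported, bool enable)
{
  if (!supported || (s->explicit_ & mask))
    return;

  if (enable)
    s->flags |= mask;
  else
    s->flags &= ~mask;

  s->explicit_ |= mask;
}

/* UPPER_REGS is -1 (not given), 0 (-mno-upper-regs) or 1 (-mupper-regs).  */
void
resolve_upper_regs (struct isa_state *s, int upper_regs,
                    bool has_vsx, bool has_p8_vector)
{
  assert (upper_regs >= -1 && upper_regs <= 1);

  if (upper_regs < 0)
    return;             /* leave the cpu defaults alone */

  apply_one (s, UPPER_REGS_DF, has_vsx, upper_regs > 0);
  apply_one (s, UPPER_REGS_SF, has_p8_vector, upper_regs > 0);
}
```

In this model, -mupper-regs on a VSX-only machine enables just the DF bit, and
an individual option given by the user is never overridden by the combined
option.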

Finally, I changed the defaults to turn on -mupper-regs-df on power7/power8
systems, and -mupper-regs-sf on power8 systems.  I have run the regression test
suite with these options on, and there were no regressions.  Once all of the
other patches go in, can I check in these patches?

If you would prefer the default for GCC 5.0 not to enable the upper register
support, let me know, and I can remove the lines in rs6000-cpus.def that set
the default.

2014-11-14  Michael Meissner  <[email protected]>

        * config/rs6000/predicates.md (memory_fp_constant): New predicate
        to return true if the operand is a floating point constant that
        must be put into the constant pool, before register allocation
        occurs.

        * config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Enable
        -mupper-regs-df by default.
        (ISA_2_7_MASKS_SERVER): Enable -mupper-regs-sf by default.
        (POWERPC_MASKS): Add -mupper-regs-{sf,df} as options set by the
        various -mcpu=... options.
        (power7 cpu): Enable -mupper-regs-df by default.

        * config/rs6000/rs6000.opt (-mupper-regs): New combination option
        that sets -mupper-regs-sf and -mupper-regs-df by default if the
        cpu supports the instructions.

        * config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): Allow
        pre-increment and pre-decrement on floating point, even if the
        -mupper-regs-{sf,df} options were used.
        (rs6000_option_override_internal): If -mupper-regs, set both
        -mupper-regs-sf and -mupper-regs-df, depending on the underlying
        cpu.

        * config/rs6000/rs6000.md (DFmode splitter): Add a define_split to
        move floating point constants to the constant pool before register
        allocation.  Normally constants are put into the pool immediately,
        but -ffast-math delays putting them into the constant pool for the
        reciprocal approximation support.
        (SFmode splitter): Likewise.

        * doc/invoke.texi (RS/6000 and PowerPC Options): Document
        -mupper-regs.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: [email protected], phone: +1 (978) 899-4797

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md     (revision 217448)
+++ gcc/config/rs6000/predicates.md     (working copy)
@@ -521,6 +521,27 @@ (define_predicate "easy_fp_constant"
   }
 })
 
+;; Return 1 if the operand must be loaded from memory.  This is used by a
+;; define_split to ensure constants get pushed to the constant pool before
+;; reload.  If -ffast-math is used, easy_fp_constant will allow move insns to
+;; have constants in order not to interfere with reciprocal estimation.  However,
+;; with -mupper-regs support, these constants must be moved to the constant
+;; pool before register allocation.
+
+(define_predicate "memory_fp_constant"
+  (match_code "const_double")
+{
+  if (TARGET_VSX && op == CONST0_RTX (mode))
+    return 0;
+
+  if (!TARGET_HARD_FLOAT || !TARGET_FPRS
+      || (mode == SFmode && !TARGET_SINGLE_FLOAT)
+      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
+    return 0;
+         
+  return 1;
+})
+
 ;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
 ;; vector register without using memory.
 (define_predicate "easy_vector_constant"
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def   (revision 217448)
+++ gcc/config/rs6000/rs6000-cpus.def   (working copy)
@@ -44,7 +44,8 @@
 #define ISA_2_6_MASKS_SERVER   (ISA_2_5_MASKS_SERVER                   \
                                 | OPTION_MASK_POPCNTD                  \
                                 | OPTION_MASK_ALTIVEC                  \
-                                | OPTION_MASK_VSX)
+                                | OPTION_MASK_VSX                      \
+                                | OPTION_MASK_UPPER_REGS_DF)
 
 /* For now, don't provide an embedded version of ISA 2.07.  */
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER                   \
@@ -54,7 +55,8 @@
                                 | OPTION_MASK_DIRECT_MOVE              \
                                 | OPTION_MASK_HTM                      \
                                 | OPTION_MASK_QUAD_MEMORY              \
-                                | OPTION_MASK_QUAD_MEMORY_ATOMIC)
+                                | OPTION_MASK_QUAD_MEMORY_ATOMIC       \
+                                | OPTION_MASK_UPPER_REGS_SF)
 
 #define POWERPC_7400_MASK      (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)
 
@@ -94,6 +96,8 @@
                                 | OPTION_MASK_RECIP_PRECISION          \
                                 | OPTION_MASK_SOFT_FLOAT               \
                                 | OPTION_MASK_STRICT_ALIGN_OPTIONAL    \
+                                | OPTION_MASK_UPPER_REGS_DF            \
+                                | OPTION_MASK_UPPER_REGS_SF            \
                                 | OPTION_MASK_VSX                      \
                                 | OPTION_MASK_VSX_TIMODE)
 
@@ -184,7 +188,7 @@ RS6000_CPU ("power6x", PROCESSOR_POWER6,
 RS6000_CPU ("power7", PROCESSOR_POWER7,   /* Don't add MASK_ISEL by default */
            POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
            | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
-           | MASK_VSX | MASK_RECIP_PRECISION)
+           | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF)
 RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER)
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
 RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, MASK_PPC_GFXOPT | MASK_POWERPC64)
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt        (revision 217448)
+++ gcc/config/rs6000/rs6000.opt        (working copy)
@@ -589,6 +589,10 @@ mupper-regs-sf
 Target Report Mask(UPPER_REGS_SF) Var(rs6000_isa_flags)
 Allow float variables in upper registers with -mcpu=power8 or -mpower8-vector
 
+mupper-regs
+Target Report Var(TARGET_UPPER_REGS) Init(-1) Save
+Allow float/double variables in upper registers if cpu allows it
+
 moptimize-swaps
 Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
 Analyze and remove doubleword swaps from VSX computations.
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (revision 217448)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -2462,9 +2462,7 @@ rs6000_setup_reg_addr_masks (void)
              /* Figure out if we can do PRE_INC, PRE_DEC, or PRE_MODIFY
                 addressing.  Restrict addressing on SPE for 64-bit types
                 because of the SUBREG hackery used to address 64-bit floats in
-                '32-bit' GPRs.  To simplify secondary reload, don't allow
-                update forms on scalar floating point types that can go in the
-                upper registers.  */
+                '32-bit' GPRs.  */
 
              if (TARGET_UPDATE
                  && (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR)
@@ -2472,8 +2470,7 @@ rs6000_setup_reg_addr_masks (void)
                  && !VECTOR_MODE_P (m2)
                  && !COMPLEX_MODE_P (m2)
                  && !indexed_only_p
-                 && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8)
-                 && !reg_addr[m2].scalar_in_vmx_p)
+                 && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8))
                {
                  addr_mask |= RELOAD_REG_PRE_INCDEC;
 
@@ -3509,6 +3506,40 @@ rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_DFP;
     }
 
+  /* Allow an explicit -mupper-regs to set both -mupper-regs-df and
+     -mupper-regs-sf, depending on the cpu, unless the user explicitly also set
+     the individual option.  */
+  if (TARGET_UPPER_REGS > 0)
+    {
+      if (TARGET_VSX
+         && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF))
+       {
+         rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DF;
+         rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
+       }
+      if (TARGET_P8_VECTOR
+         && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
+       {
+         rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_SF;
+         rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_SF;
+       }
+    }
+  else if (TARGET_UPPER_REGS == 0)
+    {
+      if (TARGET_VSX
+         && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF))
+       {
+         rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
+         rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
+       }
+      if (TARGET_P8_VECTOR
+         && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
+       {
+         rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_SF;
+         rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_SF;
+       }
+    }
+
   if (TARGET_UPPER_REGS_DF && !TARGET_VSX)
     {
       if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF)
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 217448)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -8137,6 +8137,21 @@ (define_insn_and_split "*mov<mode>_softf
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
   [(set_attr "length" "20,20,16")])
 
+;; If we are using -ffast-math, easy_fp_constant assumes all constants are
+;; 'easy' in order to allow for reciprocal estimation.  Make sure the constant
+;; is in the constant pool before reload occurs.  This simplifies accessing
+;; scalars in the traditional Altivec registers.
+
+(define_split
+  [(set (match_operand:SFDF 0 "register_operand" "")
+       (match_operand:SFDF 1 "memory_fp_constant" ""))]
+  "TARGET_<MODE>_FPR && flag_unsafe_math_optimizations
+   && !reload_in_progress && !reload_completed && !lra_in_progress"
+  [(set (match_dup 0) (match_dup 2))]
+{
+  operands[2] = validize_mem (force_const_mem (<MODE>mode, operands[1]));
+})
+
 (define_expand "extenddftf2"
   [(set (match_operand:TF 0 "nonimmediate_operand" "")
        (float_extend:TF (match_operand:DF 1 "input_operand" "")))]
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 217448)
+++ gcc/doc/invoke.texi (working copy)
@@ -940,7 +940,8 @@ See RS/6000 and PowerPC Options.
 -mquad-memory -mno-quad-memory @gol
 -mquad-memory-atomic -mno-quad-memory-atomic @gol
 -mcompat-align-parm -mno-compat-align-parm @gol
--mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf}
+-mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf @gol
+-mupper-regs -mno-upper-regs}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
@@ -19691,10 +19692,9 @@ instructions.  The @option{-mquad-memory
 Generate code that uses (does not use) the scalar double precision
 instructions that target all 64 registers in the vector/scalar
 floating point register set that were added in version 2.06 of the
-PowerPC ISA.  If @option{-mupper-regs-df} is not set, the traditional
-floating instructions will be generated that target the first 32
-registers.  This option requires the @option{-mvsx},
-@option{-mcpu=power7}, or @option{-mcpu=power8} options to be set.
+PowerPC ISA.  The @option{-mupper-regs-df} option is turned on by
+default if you use any of the @option{-mcpu=power7},
+@option{-mcpu=power8}, or @option{-mvsx} options.
 
 @item -mupper-regs-sf
 @itemx -mno-upper-regs-sf
@@ -19703,10 +19703,20 @@ registers.  This option requires the @op
 Generate code that uses (does not use) the scalar single precision
 instructions that target all 64 registers in the vector/scalar
 floating point register set that were added in version 2.07 of the
-PowerPC ISA.  If @option{-mupper-regs-sf} is not set, the traditional
-floating instructions will be generated that target the first 32
-registers.  This option requires the @option{-mpower8-vector},
-@option{-mcpu=power7}, or @option{-mcpu=power8} options to be set.
+PowerPC ISA.  The @option{-mupper-regs-sf} option is turned on by
+default if you use either the @option{-mcpu=power8} or
+@option{-mpower8-vector} option.
+
+@item -mupper-regs
+@itemx -mno-upper-regs
+@opindex mupper-regs
+@opindex mno-upper-regs
+Generate code that uses (does not use) the scalar
+instructions that target all 64 registers in the vector/scalar
+floating point register set, depending on the model of the machine.
+
+If the @option{-mno-upper-regs} option was used, it will turn off both
+@option{-mupper-regs-sf} and @option{-mupper-regs-df} options.
 
 @item -mfloat-gprs=@var{yes/single/double/no}
 @itemx -mfloat-gprs
