I tracked down the regression in the SPEC benchmarks. It was caused by turning
off pre-increment/pre-decrement for floating point values, and these two
benchmarks use pre-increment/pre-decrement quite a bit. My secondary reload
handlers are capable of adding in the pre-increment/pre-decrement if such an
operation is attempted on an Altivec register, so the restriction is not needed.
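
For reference, the kind of source loop that relies on pre-increment addressing
looks roughly like the sketch below. The function name sum_preinc is made up
for illustration and is not taken from the benchmarks; whether the compiler
actually emits an update-form load (such as lfdu) for it depends on the target
and the optimization level.

```c
#include <stddef.h>

/* Hypothetical example: sum N doubles, stepping through the array with a
   pre-increment before each load after the first.  */
double
sum_preinc (const double *a, size_t n)
{
  const double *p = a;
  double sum;

  if (n == 0)
    return 0.0;

  sum = *p;			/* First element, no increment.  */
  while (--n > 0)
    sum += *++p;		/* Pre-increment, then load.  */

  return sum;
}
```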

I am also including a patch to make the compiler work with -ffast-math. If you
use -ffast-math, the easy_fp_constant predicate says that all constants are
easy in order to enable using the reciprocal approximation instructions for
division. I put in a define_split to move the constants to the constant pool
after the reciprocal approximation work has been done but before reload
starts. I had this split in place during development and dropped it when
preparing the patches because I thought it was no longer needed, but perhaps
recent changes to the register allocator make it necessary again.
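
As a rough illustration of the case the define_split targets, a function like
the one below mixes floating point constants with a division. The function
name is hypothetical, and whether the reciprocal estimate sequence is actually
used for it depends on options such as -ffast-math and -mrecip and on the cpu.

```c
/* Hypothetical example: floating point constants feeding a division.  Under
   -ffast-math the constants may stay in the move insns until the split
   pushes them to the constant pool.  */
double
scaled_ratio (double x, double y)
{
  return (x * 1.5) / (y + 2.5);
}
```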

I added an option (-mupper-regs) to simplify setting both -mupper-regs-sf and
-mupper-regs-df. It sets only the options that the target cpu supports, and it
does not override an individual option that the user set explicitly.
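
The way the combined option resolves can be modeled with the small
stand-alone sketch below. It only mirrors the logic added to
rs6000_option_override_internal; the struct, the function name, and the mask
values are made up for this sketch and are not the real GCC definitions.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the real option masks.  */
#define UPPER_DF 0x1u
#define UPPER_SF 0x2u

struct flags
{
  unsigned isa;		/* Options currently enabled.  */
  unsigned explicit_;	/* Options the user set explicitly.  */
};

/* UPPER_REGS is -1 if the option was not given, 0 for -mno-upper-regs, and
   1 for -mupper-regs.  HAS_VSX and HAS_P8_VECTOR say what the cpu supports.  */
void
resolve_upper_regs (struct flags *f, int upper_regs,
		    bool has_vsx, bool has_p8_vector)
{
  if (upper_regs < 0)
    return;			/* Leave the cpu defaults alone.  */

  if (has_vsx && !(f->explicit_ & UPPER_DF))
    {
      if (upper_regs > 0)
	f->isa |= UPPER_DF;
      else
	f->isa &= ~UPPER_DF;
      f->explicit_ |= UPPER_DF;
    }

  if (has_p8_vector && !(f->explicit_ & UPPER_SF))
    {
      if (upper_regs > 0)
	f->isa |= UPPER_SF;
      else
	f->isa &= ~UPPER_SF;
      f->explicit_ |= UPPER_SF;
    }
}
```

In this model, -mupper-regs on a machine with VSX but without the power8
vector support turns on only the DF flag, and -mno-upper-regs leaves alone any
individual flag the user set explicitly.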

Finally, I changed the defaults to turn on -mupper-regs-df on power7/power8
systems, and -mupper-regs-sf on power8 systems. I have run the regression test
suite with these options on, and there were no regressions. Once all of the
other patches go in, can I check in these patches?

If you would prefer the default for GCC 5.0 not to enable the upper register
support, let me know, and I can remove the lines in rs6000-cpus.def that set
the default.

2014-11-14 Michael Meissner <[email protected]>
* config/rs6000/predicates.md (memory_fp_constant): New predicate
to return true if the operand is a floating point constant that
must be put into the constant pool, before register allocation
occurs.
* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Enable
-mupper-regs-df by default.
(ISA_2_7_MASKS_SERVER): Enable -mupper-regs-sf by default.
(POWERPC_MASKS): Add -mupper-regs-{sf,df} as options set by the
various -mcpu=... options.
(power7 cpu): Enable -mupper-regs-df by default.
* config/rs6000/rs6000.opt (-mupper-regs): New combination option
that sets -mupper-regs-sf and -mupper-regs-df by default if the
cpu supports the instructions.
* config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): Allow
pre-increment and pre-decrement on floating point, even if the
-mupper-regs-{sf,df} options were used.
(rs6000_option_override_internal): If -mupper-regs, set both
-mupper-regs-sf and -mupper-regs-df, depending on the underlying
cpu.
* config/rs6000/rs6000.md (DFmode splitter): Add a define_split to
move floating point constants to the constant pool before register
allocation. Normally constants are put into the pool immediately,
but -ffast-math delays putting them into the constant pool for the
reciprocal approximation support.
(SFmode splitter): Likewise.
* doc/invoke.texi (RS/6000 and PowerPC Options): Document
-mupper-regs.
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: [email protected], phone: +1 (978) 899-4797
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md (revision 217448)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -521,6 +521,27 @@ (define_predicate "easy_fp_constant"
}
})
+;; Return 1 if the operand must be loaded from memory. This is used by a
+;; define_split to ensure constants get pushed to the constant pool before
+;; reload. If -ffast-math is used, easy_fp_constant will allow move insns to
+;; have constants in order not to interfere with reciprocal estimation.
+;; However, with -mupper-regs support, these constants must be moved to the
+;; constant pool before register allocation.
+
+(define_predicate "memory_fp_constant"
+ (match_code "const_double")
+{
+ if (TARGET_VSX && op == CONST0_RTX (mode))
+ return 0;
+
+ if (!TARGET_HARD_FLOAT || !TARGET_FPRS
+ || (mode == SFmode && !TARGET_SINGLE_FLOAT)
+ || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
+ return 0;
+
+ return 1;
+})
+
;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
;; vector register without using memory.
(define_predicate "easy_vector_constant"
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def (revision 217448)
+++ gcc/config/rs6000/rs6000-cpus.def (working copy)
@@ -44,7 +44,8 @@
#define ISA_2_6_MASKS_SERVER (ISA_2_5_MASKS_SERVER \
| OPTION_MASK_POPCNTD \
| OPTION_MASK_ALTIVEC \
- | OPTION_MASK_VSX)
+ | OPTION_MASK_VSX \
+ | OPTION_MASK_UPPER_REGS_DF)
/* For now, don't provide an embedded version of ISA 2.07. */
#define ISA_2_7_MASKS_SERVER (ISA_2_6_MASKS_SERVER \
@@ -54,7 +55,8 @@
| OPTION_MASK_DIRECT_MOVE \
| OPTION_MASK_HTM \
| OPTION_MASK_QUAD_MEMORY \
- | OPTION_MASK_QUAD_MEMORY_ATOMIC)
+ | OPTION_MASK_QUAD_MEMORY_ATOMIC \
+ | OPTION_MASK_UPPER_REGS_SF)
#define POWERPC_7400_MASK (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)
@@ -94,6 +96,8 @@
| OPTION_MASK_RECIP_PRECISION \
| OPTION_MASK_SOFT_FLOAT \
| OPTION_MASK_STRICT_ALIGN_OPTIONAL \
+ | OPTION_MASK_UPPER_REGS_DF \
+ | OPTION_MASK_UPPER_REGS_SF \
| OPTION_MASK_VSX \
| OPTION_MASK_VSX_TIMODE)
@@ -184,7 +188,7 @@ RS6000_CPU ("power6x", PROCESSOR_POWER6,
RS6000_CPU ("power7", PROCESSOR_POWER7, /* Don't add MASK_ISEL by default */
POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
| MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
- | MASK_VSX | MASK_RECIP_PRECISION)
+ | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF)
RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER)
RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, MASK_PPC_GFXOPT | MASK_POWERPC64)
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt (revision 217448)
+++ gcc/config/rs6000/rs6000.opt (working copy)
@@ -589,6 +589,10 @@ mupper-regs-sf
Target Report Mask(UPPER_REGS_SF) Var(rs6000_isa_flags)
Allow float variables in upper registers with -mcpu=power8 or -mpower8-vector
+mupper-regs
+Target Report Var(TARGET_UPPER_REGS) Init(-1) Save
+Allow float/double variables in upper registers if cpu allows it
+
moptimize-swaps
Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
Analyze and remove doubleword swaps from VSX computations.
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 217448)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -2462,9 +2462,7 @@ rs6000_setup_reg_addr_masks (void)
/* Figure out if we can do PRE_INC, PRE_DEC, or PRE_MODIFY
addressing. Restrict addressing on SPE for 64-bit types
because of the SUBREG hackery used to address 64-bit floats in
- '32-bit' GPRs. To simplify secondary reload, don't allow
- update forms on scalar floating point types that can go in the
- upper registers. */
+ '32-bit' GPRs. */
if (TARGET_UPDATE
&& (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR)
@@ -2472,8 +2470,7 @@ rs6000_setup_reg_addr_masks (void)
&& !VECTOR_MODE_P (m2)
&& !COMPLEX_MODE_P (m2)
&& !indexed_only_p
- && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8)
- && !reg_addr[m2].scalar_in_vmx_p)
+ && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8))
{
addr_mask |= RELOAD_REG_PRE_INCDEC;
@@ -3509,6 +3506,40 @@ rs6000_option_override_internal (bool gl
rs6000_isa_flags &= ~OPTION_MASK_DFP;
}
+ /* Allow an explicit -mupper-regs to set both -mupper-regs-df and
+ -mupper-regs-sf, depending on the cpu, unless the user explicitly also set
+ the individual option. */
+ if (TARGET_UPPER_REGS > 0)
+ {
+ if (TARGET_VSX
+ && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF))
+ {
+ rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DF;
+ rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
+ }
+ if (TARGET_P8_VECTOR
+ && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
+ {
+ rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_SF;
+ rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_SF;
+ }
+ }
+ else if (TARGET_UPPER_REGS == 0)
+ {
+ if (TARGET_VSX
+ && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF))
+ {
+ rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
+ rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
+ }
+ if (TARGET_P8_VECTOR
+ && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
+ {
+ rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_SF;
+ rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_SF;
+ }
+ }
+
if (TARGET_UPPER_REGS_DF && !TARGET_VSX)
{
if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF)
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 217448)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -8137,6 +8137,21 @@ (define_insn_and_split "*mov<mode>_softf
{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
[(set_attr "length" "20,20,16")])
+;; If we are using -ffast-math, easy_fp_constant assumes all constants are
+;; 'easy' in order to allow for reciprocal estimation. Make sure the constant
+;; is in the constant pool before reload occurs. This simplifies accessing
+;; scalars in the traditional Altivec registers.
+
+(define_split
+ [(set (match_operand:SFDF 0 "register_operand" "")
+ (match_operand:SFDF 1 "memory_fp_constant" ""))]
+ "TARGET_<MODE>_FPR && flag_unsafe_math_optimizations
+ && !reload_in_progress && !reload_completed && !lra_in_progress"
+ [(set (match_dup 0) (match_dup 2))]
+{
+ operands[2] = validize_mem (force_const_mem (<MODE>mode, operands[1]));
+})
+
(define_expand "extenddftf2"
[(set (match_operand:TF 0 "nonimmediate_operand" "")
(float_extend:TF (match_operand:DF 1 "input_operand" "")))]
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 217448)
+++ gcc/doc/invoke.texi (working copy)
@@ -940,7 +940,8 @@ See RS/6000 and PowerPC Options.
-mquad-memory -mno-quad-memory @gol
-mquad-memory-atomic -mno-quad-memory-atomic @gol
-mcompat-align-parm -mno-compat-align-parm @gol
--mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf}
+-mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf @gol
+-mupper-regs -mno-upper-regs}
@emph{RX Options}
@gccoptlist{-m64bit-doubles -m32bit-doubles -fpu -nofpu@gol
@@ -19691,10 +19692,9 @@ instructions. The @option{-mquad-memory
Generate code that uses (does not use) the scalar double precision
instructions that target all 64 registers in the vector/scalar
floating point register set that were added in version 2.06 of the
-PowerPC ISA. If @option{-mupper-regs-df} is not set, the traditional
-floating instructions will be generated that target the first 32
-registers. This option requires the @option{-mvsx},
-@option{-mcpu=power7}, or @option{-mcpu=power8} options to be set.
+PowerPC ISA. The @option{-mupper-regs-df} option is turned on by default if
+you use any of the @option{-mcpu=power7}, @option{-mcpu=power8}, or
+@option{-mvsx} options.
@item -mupper-regs-sf
@itemx -mno-upper-regs-sf
@@ -19703,10 +19703,20 @@ registers. This option requires the @op
Generate code that uses (does not use) the scalar single precision
instructions that target all 64 registers in the vector/scalar
floating point register set that were added in version 2.07 of the
-PowerPC ISA. If @option{-mupper-regs-sf} is not set, the traditional
-floating instructions will be generated that target the first 32
-registers. This option requires the @option{-mpower8-vector},
-@option{-mcpu=power7}, or @option{-mcpu=power8} options to be set.
+PowerPC ISA. The @option{-mupper-regs-sf} option is turned on by default if
+you use either of the @option{-mcpu=power8} or @option{-mpower8-vector}
+options.
+
+@item -mupper-regs
+@itemx -mno-upper-regs
+@opindex mupper-regs
+@opindex mno-upper-regs
+Generate code that uses (does not use) the scalar
+instructions that target all 64 registers in the vector/scalar
+floating point register set, depending on the model of the machine.
+
+If the @option{-mno-upper-regs} option was used, it will turn off both
+@option{-mupper-regs-sf} and @option{-mupper-regs-df} options.
@item -mfloat-gprs=@var{yes/single/double/no}
@itemx -mfloat-gprs