I am now breaking the patches down into more bite-sized pieces. Ultimately, I hope these patches will provide support to allow scalar floating point to occupy the Altivec (upper) registers if the ISA allows it (ISA 2.06 for DFmode, ISA 2.07 for SFmode). One effect of later patches will be to go back to using the traditional DFmode instructions for VSX if all of the registers come from the traditional floating point register set.
This patch adds the new switches and constraints that will be used in future patches. It produces exactly the same code on the targets I tested on and passes the bootstrap/make check stages. Is it ok to apply this patch?

I have tested the code generation for the following targets:

        power4, power5, power5+, power6, power6x, power7, power8 for 64/32-bit
        power7 with VSX disabled using altivec instructions for 64/32-bit
        power7 with both VSX and altivec disabled for 64/32-bit
        cell 64/32-bit
        e5500, e6500 64/32-bit
        G4 32-bit
        G5 64/32-bit
        linuxpaired 32-bit
        linuxspe 32-bit

2013-09-20  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add new
        constraints: wu, ww, and wy.  Repurpose wv constraint added during
        power8 changes.  Put wg constraint in alphabetical order.

        * config/rs6000/rs6000.opt (-mvsx-scalar-float): New debug switch
        for future work to add ISA 2.07 VSX single precision support.
        (-mvsx-scalar-double): Change default from -1 to 1, update
        documentation comment.
        (-mvsx-scalar-memory): Rename debug switch to -mupper-regs-df.
        (-mupper-regs-df): New debug switch to control whether DF values
        can go in the traditional Altivec registers.
        (-mupper-regs-sf): New debug switch to control whether SF values
        can go in the traditional Altivec registers.

        * config/rs6000/rs6000.c (rs6000_debug_reg_global): Print wu, ww,
        and wy constraints.
        (rs6000_init_hard_regno_mode_ok): Use ssize_t instead of int for
        loop variables.  Rename -mvsx-scalar-memory to -mupper-regs-df.
        Add new constraints, wu/ww/wy.  Repurpose wv constraint.
        (rs6000_debug_legitimate_address_p): Print if we are running
        before, during, or after reload.
        (rs6000_secondary_reload): Add a comment.
        (rs6000_opt_masks): Add -mupper-regs-df, -mupper-regs-sf.

        * config/rs6000/constraints.md (wa constraint): Sort w<x>
        constraints.  Update documentation string.
        (wd constraint): Likewise.
        (wf constraint): Likewise.
        (wg constraint): Likewise.
        (wn constraint): Likewise.
        (ws constraint): Likewise.
        (wt constraint): Likewise.
        (wx constraint): Likewise.
        (wz constraint): Likewise.
        (wu constraint): New constraint for ISA 2.07 SFmode scalar
        instructions.
        (ww constraint): Likewise.
        (wy constraint): Likewise.
        (wv constraint): Repurpose ISA 2.07 constraint that was not used in
        the previous submissions.

        * doc/md.texi (PowerPC and IBM RS6000): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
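
For readers who want the register-class selection in isolation, the following is a minimal standalone sketch of how the ws, wy, and ww constraints are resolved in rs6000_init_hard_regno_mode_ok by this patch. It is not code from GCC: the enum, the struct, and every sketch_* name are hypothetical stand-ins for reg_class, TARGET_VSX, TARGET_P8_VECTOR, TARGET_UPPER_REGS_DF, and TARGET_UPPER_REGS_SF, and it assumes an unset constraint resolves to NO_REGS, as the constraint documentation strings say.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's register classes; these are not the
   real enum reg_class values.  */
enum sketch_reg_class
{
  SKETCH_NO_REGS,
  SKETCH_FLOAT_REGS,
  SKETCH_VSX_REGS
};

/* Hypothetical flag bundle standing in for TARGET_VSX, TARGET_P8_VECTOR,
   TARGET_UPPER_REGS_DF, and TARGET_UPPER_REGS_SF.  */
struct sketch_target
{
  bool vsx;            /* ISA 2.06 VSX enabled.  */
  bool p8_vector;      /* ISA 2.07 vector support enabled.  */
  bool upper_regs_df;  /* DFmode may use the upper registers.  */
  bool upper_regs_sf;  /* SFmode may use the upper registers.  */
};

/* ws: register class for ISA 2.06 DF operations.  Assumed NO_REGS
   unless VSX is enabled.  */
enum sketch_reg_class
sketch_ws_class (const struct sketch_target *t)
{
  assert (t != NULL);
  if (!t->vsx)
    return SKETCH_NO_REGS;
  return t->upper_regs_df ? SKETCH_VSX_REGS : SKETCH_FLOAT_REGS;
}

/* wy: register class for ISA 2.07 SF operations.  Assumed NO_REGS
   unless power8 vector support is enabled.  */
enum sketch_reg_class
sketch_wy_class (const struct sketch_target *t)
{
  assert (t != NULL);
  if (!t->p8_vector)
    return SKETCH_NO_REGS;
  return t->upper_regs_sf ? SKETCH_VSX_REGS : SKETCH_FLOAT_REGS;
}

/* ww: register class for SF conversions with VSX operations.  It matches
   wy on power8, and falls back to the traditional floating point
   registers when only VSX is available.  */
enum sketch_reg_class
sketch_ww_class (const struct sketch_target *t)
{
  assert (t != NULL);
  if (t->p8_vector)
    return sketch_wy_class (t);
  return t->vsx ? SKETCH_FLOAT_REGS : SKETCH_NO_REGS;
}
```

With the upper-register flags off, ws and wy stay in the traditional floating point registers, which would be consistent with the patch producing the same code as before on the tested targets.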
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h (revision 202793)
+++ gcc/config/rs6000/rs6000.h (working copy)
@@ -1403,15 +1403,18 @@ enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_v,          /* Altivec registers */
   RS6000_CONSTRAINT_wa,         /* Any VSX register */
   RS6000_CONSTRAINT_wd,         /* VSX register for V2DF */
-  RS6000_CONSTRAINT_wg,         /* FPR register for -mmfpgpr */
   RS6000_CONSTRAINT_wf,         /* VSX register for V4SF */
+  RS6000_CONSTRAINT_wg,         /* FPR register for -mmfpgpr */
   RS6000_CONSTRAINT_wl,         /* FPR register for LFIWAX */
   RS6000_CONSTRAINT_wm,         /* VSX register for direct move */
   RS6000_CONSTRAINT_wr,         /* GPR register if 64-bit */
   RS6000_CONSTRAINT_ws,         /* VSX register for DF */
   RS6000_CONSTRAINT_wt,         /* VSX register for TImode */
-  RS6000_CONSTRAINT_wv,         /* Altivec register for power8 vector */
+  RS6000_CONSTRAINT_wu,         /* Altivec register for float load/stores. */
+  RS6000_CONSTRAINT_wv,         /* Altivec register for double load/stores. */
+  RS6000_CONSTRAINT_ww,         /* FP or VSX register for vsx float ops. */
   RS6000_CONSTRAINT_wx,         /* FPR register for STFIWX */
+  RS6000_CONSTRAINT_wy,         /* VSX register for SF */
   RS6000_CONSTRAINT_wz,         /* FPR register for LFIWZX */
   RS6000_CONSTRAINT_MAX
 };
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt (revision 202793)
+++ gcc/config/rs6000/rs6000.opt (working copy)
@@ -181,13 +181,16 @@ mvsx
 Target Report Mask(VSX) Var(rs6000_isa_flags)
 Use vector/scalar (VSX) instructions
 
+mvsx-scalar-float
+Target Undocumented Report Var(TARGET_VSX_SCALAR_FLOAT) Init(1)
+; If -mpower8-vector, use VSX arithmetic instructions for SFmode (on by default)
+
 mvsx-scalar-double
-Target Undocumented Report Var(TARGET_VSX_SCALAR_DOUBLE) Init(-1)
-; If -mvsx, use VSX arithmetic instructions for scalar double (on by default)
+Target Undocumented Report Var(TARGET_VSX_SCALAR_DOUBLE) Init(1)
+; If -mvsx, use VSX arithmetic instructions for DFmode (on by default)
 
 mvsx-scalar-memory
-Target Undocumented Report Var(TARGET_VSX_SCALAR_MEMORY)
-; If -mvsx, use VSX scalar memory reference instructions for scalar double (off by default)
+Target Undocumented Report Alias(mupper-regs-df)
 
 mvsx-align-128
 Target Undocumented Report Var(TARGET_VSX_ALIGN_128)
@@ -550,3 +553,11 @@ Generate the quad word memory instructio
 mcompat-align-parm
 Target Report Var(rs6000_compat_align_parm) Init(0) Save
 Generate aggregate parameter passing code with at most 64-bit alignment.
+
+mupper-regs-df
+Target Undocumented Mask(UPPER_REGS_DF) Var(rs6000_isa_flags)
+Allow double variables in upper registers with -mcpu=power7 or -mvsx
+
+mupper-regs-sf
+Target Undocumented Mask(UPPER_REGS_SF) Var(rs6000_isa_flags)
+Allow float variables in upper registers with -mcpu=power8 or -mp8-vector
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 202793)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -1891,8 +1891,11 @@ rs6000_debug_reg_global (void)
            "wr reg_class = %s\n"
            "ws reg_class = %s\n"
            "wt reg_class = %s\n"
+           "wu reg_class = %s\n"
            "wv reg_class = %s\n"
+           "ww reg_class = %s\n"
            "wx reg_class = %s\n"
+           "wy reg_class = %s\n"
            "wz reg_class = %s\n"
            "\n",
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_d]],
@@ -1907,8 +1910,11 @@ rs6000_debug_reg_global (void)
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wr]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ws]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wt]],
+           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wu]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wv]],
+           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ww]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]],
+           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wy]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]]);
 
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
@@ -2168,7 +2174,7 @@ rs6000_debug_reg_global (void)
 static void
 rs6000_init_hard_regno_mode_ok (bool global_init_p)
 {
-  int r, m, c;
+  ssize_t r, m, c;
   int align64;
   int align32;
 
@@ -2320,7 +2326,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
     {
       rs6000_vector_unit[DFmode] = VECTOR_VSX;
       rs6000_vector_mem[DFmode]
-        = (TARGET_VSX_SCALAR_MEMORY ? VECTOR_VSX : VECTOR_NONE);
+        = (TARGET_UPPER_REGS_DF ? VECTOR_VSX : VECTOR_NONE);
       rs6000_vector_align[DFmode] = align64;
     }
 
@@ -2334,7 +2340,34 @@ rs6000_init_hard_regno_mode_ok (bool glo
   /* TODO add SPE and paired floating point vector support. */
 
   /* Register class constraints for the constraints that depend on compile
-     switches. */
+     switches. When the VSX code was added, different constraints were added
+     based on the type (DFmode, V2DFmode, V4SFmode). For the vector types, all
+     of the VSX registers are used. The register classes for scalar floating
+     point types is set, based on whether we allow that type into the upper
+     (Altivec) registers. GCC has register classes to target the Altivec
+     registers for load/store operations, to select using a VSX memory
+     operation instead of the traditional floating point operation. The
+     constraints are:
+
+        d  - Register class to use with traditional DFmode instructions.
+        f  - Register class to use with traditional SFmode instructions.
+        v  - Altivec register.
+        wa - Any VSX register.
+        wd - Preferred register class for V2DFmode.
+        wf - Preferred register class for V4SFmode.
+        wg - Float register for power6x move insns.
+        wl - Float register if we can do 32-bit signed int loads.
+        wm - VSX register for ISA 2.07 direct move operations.
+        wr - GPR if 64-bit mode is permitted.
+        ws - Register class to do ISA 2.06 DF operations.
+        wu - Altivec register for ISA 2.07 VSX SF/SI load/stores.
+        wv - Altivec register for ISA 2.06 VSX DF/DI load/stores.
+        wt - VSX register for TImode in VSX registers.
+        ww - Register class to do SF conversions in with VSX operations.
+        wx - Float register if we can do 32-bit int stores.
+        wy - Register class to do ISA 2.07 SF operations.
+        wz - Float register if we can do 32-bit unsigned int loads. */
+
   if (TARGET_HARD_FLOAT && TARGET_FPRS)
     rs6000_constraints[RS6000_CONSTRAINT_f] = FLOAT_REGS;
 
@@ -2343,19 +2376,16 @@ rs6000_init_hard_regno_mode_ok (bool glo
 
   if (TARGET_VSX)
     {
-      /* At present, we just use VSX_REGS, but we have different constraints
-         based on the use, in case we want to fine tune the default register
-         class used. wa = any VSX register, wf = register class to use for
-         V4SF, wd = register class to use for V2DF, and ws = register classs to
-         use for DF scalars. */
       rs6000_constraints[RS6000_CONSTRAINT_wa] = VSX_REGS;
-      rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;
       rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS;
-      rs6000_constraints[RS6000_CONSTRAINT_ws] = (TARGET_VSX_SCALAR_MEMORY
-                                                  ? VSX_REGS
-                                                  : FLOAT_REGS);
+      rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+
       if (TARGET_VSX_TIMODE)
         rs6000_constraints[RS6000_CONSTRAINT_wt] = VSX_REGS;
+
+      rs6000_constraints[RS6000_CONSTRAINT_ws]
+        = (TARGET_UPPER_REGS_DF) ? VSX_REGS : FLOAT_REGS;
     }
 
   /* Add conditional constraints based on various options, to allow us to
@@ -2376,7 +2406,14 @@ rs6000_init_hard_regno_mode_ok (bool glo
     rs6000_constraints[RS6000_CONSTRAINT_wr] = GENERAL_REGS;
 
   if (TARGET_P8_VECTOR)
-    rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+    {
+      rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+      rs6000_constraints[RS6000_CONSTRAINT_wy]
+        = rs6000_constraints[RS6000_CONSTRAINT_ww]
+        = (TARGET_UPPER_REGS_SF) ? VSX_REGS : FLOAT_REGS;
+    }
+  else if (TARGET_VSX)
+    rs6000_constraints[RS6000_CONSTRAINT_ww] = FLOAT_REGS;
 
   if (TARGET_STFIWX)
     rs6000_constraints[RS6000_CONSTRAINT_wx] = FLOAT_REGS;
@@ -2409,7 +2446,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
           rs6000_vector_reload[V4SFmode][1] = CODE_FOR_reload_v4sf_di_load;
           rs6000_vector_reload[V2DFmode][0] = CODE_FOR_reload_v2df_di_store;
           rs6000_vector_reload[V2DFmode][1] = CODE_FOR_reload_v2df_di_load;
-          if (TARGET_VSX && TARGET_VSX_SCALAR_MEMORY)
+          if (TARGET_VSX && TARGET_UPPER_REGS_DF)
             {
               rs6000_vector_reload[DFmode][0] = CODE_FOR_reload_df_di_store;
               rs6000_vector_reload[DFmode][1] = CODE_FOR_reload_df_di_load;
@@ -2472,7 +2509,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
           rs6000_vector_reload[V4SFmode][1] = CODE_FOR_reload_v4sf_si_load;
           rs6000_vector_reload[V2DFmode][0] = CODE_FOR_reload_v2df_si_store;
           rs6000_vector_reload[V2DFmode][1] = CODE_FOR_reload_v2df_si_load;
-          if (TARGET_VSX && TARGET_VSX_SCALAR_MEMORY)
+          if (TARGET_VSX && TARGET_UPPER_REGS_DF)
             {
               rs6000_vector_reload[DFmode][0] = CODE_FOR_reload_df_si_store;
               rs6000_vector_reload[DFmode][1] = CODE_FOR_reload_df_si_load;
@@ -7195,10 +7232,13 @@ rs6000_debug_legitimate_address_p (enum
   bool ret = rs6000_legitimate_address_p (mode, x, reg_ok_strict);
   fprintf (stderr,
            "\nrs6000_legitimate_address_p: return = %s, mode = %s, "
-           "strict = %d, code = %s\n",
+           "strict = %d, reload = %s, code = %s\n",
            ret ? "true" : "false",
            GET_MODE_NAME (mode),
            reg_ok_strict,
+           (reload_completed
+            ? "after"
+            : (reload_in_progress ? "progress" : "before")),
            GET_RTX_NAME (GET_CODE (x)));
   debug_rtx (x);
 
@@ -14865,6 +14905,7 @@ rs6000_secondary_reload (bool in_p,
       from_type = exchange;
     }
 
+  /* Can we do a direct move of some sort? */
   if (rs6000_secondary_reload_move (to_type, from_type, mode, sri,
                                     altivec_p))
     {
@@ -29162,6 +29203,8 @@ static struct rs6000_opt_mask const rs60
   { "recip-precision", OPTION_MASK_RECIP_PRECISION, false, true },
   { "string", OPTION_MASK_STRING, false, true },
   { "update", OPTION_MASK_NO_UPDATE, true , true },
+  { "upper-regs-df", OPTION_MASK_UPPER_REGS_DF, false, false },
+  { "upper-regs-sf", OPTION_MASK_UPPER_REGS_SF, false, false },
   { "vsx", OPTION_MASK_VSX, false, true },
   { "vsx-timode", OPTION_MASK_VSX_TIMODE, false, true },
 #ifdef OPTION_MASK_64BIT
Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md (revision 202793)
+++ gcc/config/rs6000/constraints.md (working copy)
@@ -52,29 +52,18 @@ (define_register_constraint "z" "CA_REGS
   "@internal")
 
 ;; Use w as a prefix to add VSX modes
-;; vector double (V2DF)
+;; any VSX register
+(define_register_constraint "wa" "rs6000_constraints[RS6000_CONSTRAINT_wa]"
+  "Any VSX register if the -mvsx option was used or NO_REGS.")
+
 (define_register_constraint "wd" "rs6000_constraints[RS6000_CONSTRAINT_wd]"
-  "@internal")
+  "VSX vector register to hold vector double data or NO_REGS.")
 
-;; vector float (V4SF)
 (define_register_constraint "wf" "rs6000_constraints[RS6000_CONSTRAINT_wf]"
-  "@internal")
-
-;; scalar double (DF)
-(define_register_constraint "ws" "rs6000_constraints[RS6000_CONSTRAINT_ws]"
-  "@internal")
-
-;; TImode in VSX registers
-(define_register_constraint "wt" "rs6000_constraints[RS6000_CONSTRAINT_wt]"
-  "@internal")
-
-;; any VSX register
-(define_register_constraint "wa" "rs6000_constraints[RS6000_CONSTRAINT_wa]"
-  "@internal")
+  "VSX vector register to hold vector float data or NO_REGS.")
 
-;; Register constraints to simplify move patterns
 (define_register_constraint "wg" "rs6000_constraints[RS6000_CONSTRAINT_wg]"
-  "Floating point register if -mmfpgpr is used, or NO_REGS.")
+  "If -mmfpgpr was used, a floating point register or NO_REGS.")
 
 (define_register_constraint "wl" "rs6000_constraints[RS6000_CONSTRAINT_wl]"
   "Floating point register if the LFIWAX instruction is enabled or NO_REGS.")
@@ -82,23 +71,38 @@ (define_register_constraint "wl" "rs6000
 (define_register_constraint "wm" "rs6000_constraints[RS6000_CONSTRAINT_wm]"
   "VSX register if direct move instructions are enabled, or NO_REGS.")
 
+;; NO_REGs register constraint, used to merge mov{sd,sf}, since movsd can use
+;; direct move directly, and movsf can't to move between the register sets.
+;; There is a mode_attr that resolves to wm for SDmode and wn for SFmode
+(define_register_constraint "wn" "NO_REGS" "No register (NO_REGS).")
+
 (define_register_constraint "wr" "rs6000_constraints[RS6000_CONSTRAINT_wr]"
   "General purpose register if 64-bit instructions are enabled or NO_REGS.")
 
+(define_register_constraint "ws" "rs6000_constraints[RS6000_CONSTRAINT_ws]"
+  "VSX vector register to hold scalar double values or NO_REGS.")
+
+(define_register_constraint "wt" "rs6000_constraints[RS6000_CONSTRAINT_wt]"
+  "VSX vector register to hold 128 bit integer or NO_REGS.")
+
+(define_register_constraint "wu" "rs6000_constraints[RS6000_CONSTRAINT_wu]"
+  "Altivec register to use for float/32-bit int loads/stores or NO_REGS.")
+
 (define_register_constraint "wv" "rs6000_constraints[RS6000_CONSTRAINT_wv]"
-  "Altivec register if -mpower8-vector is used or NO_REGS.")
+  "Altivec register to use for double loads/stores or NO_REGS.")
+
+(define_register_constraint "ww" "rs6000_constraints[RS6000_CONSTRAINT_ww]"
+  "FP or VSX register to perform float operations under -mvsx or NO_REGS.")
 
 (define_register_constraint "wx" "rs6000_constraints[RS6000_CONSTRAINT_wx]"
   "Floating point register if the STFIWX instruction is enabled or NO_REGS.")
 
+(define_register_constraint "wy" "rs6000_constraints[RS6000_CONSTRAINT_wy]"
+  "VSX vector register to hold scalar float values or NO_REGS.")
+
 (define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]"
   "Floating point register if the LFIWZX instruction is enabled or NO_REGS.")
 
-;; NO_REGs register constraint, used to merge mov{sd,sf}, since movsd can use
-;; direct move directly, and movsf can't to move between the register sets.
-;; There is a mode_attr that resolves to wm for SDmode and wn for SFmode
-(define_register_constraint "wn" "NO_REGS")
-
 ;; Lq/stq validates the address for load/store quad
 (define_memory_constraint "wQ"
   "Memory operand suitable for the load/store quad instructions"
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi (revision 202793)
+++ gcc/doc/md.texi (working copy)
@@ -2067,40 +2067,52 @@ Floating point register (containing 32-b
 Altivec vector register
 
 @item wa
-Any VSX register
+Any VSX register if the -mvsx option was used or NO_REGS.
 
 @item wd
-VSX vector register to hold vector double data
+VSX vector register to hold vector double data or NO_REGS.
 
 @item wf
-VSX vector register to hold vector float data
+VSX vector register to hold vector float data or NO_REGS.
 
 @item wg
-If @option{-mmfpgpr} was used, a floating point register
+If @option{-mmfpgpr} was used, a floating point register or NO_REGS.
 
 @item wl
-If the LFIWAX instruction is enabled, a floating point register
+Floating point register if the LFIWAX instruction is enabled or NO_REGS.
 
 @item wm
-If direct moves are enabled, a VSX register.
+VSX register if direct move instructions are enabled, or NO_REGS.
 
 @item wn
-No register.
+No register (NO_REGS).
 
 @item wr
-General purpose register if 64-bit mode is used
+General purpose register if 64-bit instructions are enabled or NO_REGS.
 
 @item ws
-VSX vector register to hold scalar float data
+VSX vector register to hold scalar double values or NO_REGS.
 
 @item wt
-VSX vector register to hold 128 bit integer
+VSX vector register to hold 128 bit integer or NO_REGS.
+
+@item wu
+Altivec register to use for float/32-bit int loads/stores or NO_REGS.
+
+@item wv
+Altivec register to use for double loads/stores or NO_REGS.
+
+@item ww
+FP or VSX register to perform float operations under @option{-mvsx} or NO_REGS.
 
 @item wx
-If the STFIWX instruction is enabled, a floating point register
+Floating point register if the STFIWX instruction is enabled or NO_REGS.
+
+@item wy
+VSX vector register to hold scalar float values or NO_REGS.
 
 @item wz
-If the LFIWZX instruction is enabled, a floating point register
+Floating point register if the LFIWZX instruction is enabled or NO_REGS.
 
 @item wQ
 A memory address that will work with the @code{lq} and @code{stq}