Hi,

This continues the series of -maltivec=be patches, this one handling vec_splat.  Since vec_splat is no longer treated as a true intrinsic, its default behavior for little endian is changed to use right-to-left element indexing when selecting the element to splat.  With -maltivec=be for little endian, this reverts to left-to-right element indexing.
The main changes are in altivec.md, and are quite similar to what was done for vector merge high/low.  Each of the vector splat builtins is split into a define_expand and an "_internal" define_insn.  The expand uses the natural element order for the target, unless we have -maltivec=be for an LE target, in which case the order is reversed.  The _internal insn modifies the element number on the hardware instruction for LE to account for the instruction's big-endian bias.  For those vector splat instructions that are generated internally, rather than through a builtin, a "_direct" insn is supplied that generates precisely the instruction requested.

vsx.md contains the same changes for the V4SI and V4SF patterns when VSX instructions are enabled.  We don't need a separate define_expand for this one because the define_expand for altivec_vspltw is used to generate the pattern that is recognized by vsx_xxspltw_<mode>.  Either vsx_xxspltw_<mode> or *altivec_vspltw_internal handles the pattern, depending on whether or not VSX instructions are enabled.

Most of the changes in rs6000.c are to use the new _direct forms.  There is one other change to rs6000_expand_vector_init, where the modification of the selector field for a generated splat pattern is removed.  That is properly handled instead by the code in altivec.md and vsx.md.

As usual, there are four new test cases to cover the various vector types for VMX and VSX.  Two existing test cases require adjustment because of the change in the default semantics of vec_splat for little endian.

Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no regressions.  Ok for trunk?

Thanks,
Bill


gcc:

2014-01-29  Bill Schmidt  <wschm...@linux.vnet.ibm.com>

	* gcc/config/rs6000/rs6000.c (rs6000_expand_vector_init): Use
	gen_vsx_xxspltw_v4sf_direct instead of gen_vsx_xxspltw_v4sf;
	remove element index adjustment for endian (now handled in
	vsx.md and altivec.md).
	(altivec_expand_vec_perm_const): Use gen_altivec_vsplt[bhw]_direct
	instead of gen_altivec_vsplt[bhw].
	* gcc/config/rs6000/vsx.md (UNSPEC_VSX_XXSPLTW): New unspec.
	(vsx_xxspltw_<mode>): Adjust element index for little endian.
	* gcc/config/rs6000/altivec.md (altivec_vspltb): Divide into a
	define_expand and a new define_insn *altivec_vspltb_internal;
	adjust for -maltivec=be on a little endian target.
	(altivec_vspltb_direct): New.
	(altivec_vsplth): Divide into a define_expand and a new
	define_insn *altivec_vsplth_internal; adjust for -maltivec=be
	on a little endian target.
	(altivec_vsplth_direct): New.
	(altivec_vspltw): Divide into a define_expand and a new
	define_insn *altivec_vspltw_internal; adjust for -maltivec=be
	on a little endian target.
	(altivec_vspltw_direct): New.
	(altivec_vspltsf): Divide into a define_expand and a new
	define_insn *altivec_vspltsf_internal; adjust for -maltivec=be
	on a little endian target.

gcc/testsuite:

2014-01-29  Bill Schmidt  <wschm...@linux.vnet.ibm.com>

	* gcc.dg/vmx/splat.c: New.
	* gcc.dg/vmx/splat-vsx.c: New.
	* gcc.dg/vmx/splat-be-order.c: New.
	* gcc.dg/vmx/splat-vsx-be-order.c: New.
	* gcc.dg/vmx/eg-5.c: Remove special casing for little endian.
	* gcc.dg/vmx/sn7153.c: Add special casing for little endian.


Index: gcc/testsuite/gcc.dg/vmx/splat.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/splat.c	(revision 0)
+++ gcc/testsuite/gcc.dg/vmx/splat.c	(revision 0)
@@ -0,0 +1,47 @@
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector unsigned short vus = {0,1,2,3,4,5,6,7};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector float vf = {-2.0,-1.0,0.0,1.0};
+
+  /* Result vectors.  */
+  vector unsigned char vucr;
+  vector signed char vscr;
+  vector unsigned short vusr;
+  vector signed short vssr;
+  vector unsigned int vuir;
+  vector signed int vsir;
+  vector float vfr;
+
+  /* Expected result vectors.  */
+  vector unsigned char vucer = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
+  vector signed char vscer = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
+  vector unsigned short vuser = {7,7,7,7,7,7,7,7};
+  vector signed short vsser = {-4,-4,-4,-4,-4,-4,-4,-4};
+  vector unsigned int vuier = {2,2,2,2};
+  vector signed int vsier = {1,1,1,1};
+  vector float vfer = {-1.0,-1.0,-1.0,-1.0};
+
+  vucr = vec_splat (vuc, 1);
+  vscr = vec_splat (vsc, 8);
+  vusr = vec_splat (vus, 7);
+  vssr = vec_splat (vss, 0);
+  vuir = vec_splat (vui, 2);
+  vsir = vec_splat (vsi, 3);
+  vfr = vec_splat (vf, 1);
+
+  check (vec_all_eq (vucr, vucer), "vuc");
+  check (vec_all_eq (vscr, vscer), "vsc");
+  check (vec_all_eq (vusr, vuser), "vus");
+  check (vec_all_eq (vssr, vsser), "vss");
+  check (vec_all_eq (vuir, vuier), "vui");
+  check (vec_all_eq (vsir, vsier), "vsi");
+  check (vec_all_eq (vfr, vfer ), "vf");
+}
Index: gcc/testsuite/gcc.dg/vmx/splat-vsx-be-order.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/splat-vsx-be-order.c	(revision 0)
+++ gcc/testsuite/gcc.dg/vmx/splat-vsx-be-order.c	(revision 0)
@@ -0,0 +1,37 @@
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector float vf = {-2.0,-1.0,0.0,1.0};
+
+  /* Result vectors.  */
+  vector unsigned int vuir;
+  vector signed int vsir;
+  vector float vfr;
+
+  /* Expected result vectors.  */
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned int vuier = {1,1,1,1};
+  vector signed int vsier = {-2,-2,-2,-2};
+  vector float vfer = {0.0,0.0,0.0,0.0};
+#else
+  vector unsigned int vuier = {2,2,2,2};
+  vector signed int vsier = {1,1,1,1};
+  vector float vfer = {-1.0,-1.0,-1.0,-1.0};
+#endif
+
+  vuir = vec_splat (vui, 2);
+  vsir = vec_splat (vsi, 3);
+  vfr = vec_splat (vf, 1);
+
+  check (vec_all_eq (vuir, vuier), "vui");
+  check (vec_all_eq (vsir, vsier), "vsi");
+  check (vec_all_eq (vfr, vfer ), "vf");
+}
Index: gcc/testsuite/gcc.dg/vmx/eg-5.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/eg-5.c	(revision 207294)
+++ gcc/testsuite/gcc.dg/vmx/eg-5.c	(working copy)
@@ -6,19 +6,10 @@ matvecmul4 (vector float c0, vector float c1, vect
 {
   /* Set result to a vector of f32 0's */
   vector float result = ((vector float){0.,0.,0.,0.});
-
-#ifdef __LITTLE_ENDIAN__
-  result = vec_madd (c0, vec_splat (v, 3), result);
-  result = vec_madd (c1, vec_splat (v, 2), result);
-  result = vec_madd (c2, vec_splat (v, 1), result);
-  result = vec_madd (c3, vec_splat (v, 0), result);
-#else
   result = vec_madd (c0, vec_splat (v, 0), result);
   result = vec_madd (c1, vec_splat (v, 1), result);
   result = vec_madd (c2, vec_splat (v, 2), result);
   result = vec_madd (c3, vec_splat (v, 3), result);
-#endif
-
-
   return result;
 }
Index: gcc/testsuite/gcc.dg/vmx/splat-be-order.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/splat-be-order.c	(revision 0)
+++ gcc/testsuite/gcc.dg/vmx/splat-be-order.c	(revision 0)
@@ -0,0 +1,59 @@
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector unsigned short vus = {0,1,2,3,4,5,6,7};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector float vf = {-2.0,-1.0,0.0,1.0};
+
+  /* Result vectors.  */
+  vector unsigned char vucr;
+  vector signed char vscr;
+  vector unsigned short vusr;
+  vector signed short vssr;
+  vector unsigned int vuir;
+  vector signed int vsir;
+  vector float vfr;
+
+  /* Expected result vectors.  */
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char vucer = {14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14};
+  vector signed char vscer = {-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1};
+  vector unsigned short vuser = {0,0,0,0,0,0,0,0};
+  vector signed short vsser = {3,3,3,3,3,3,3,3};
+  vector unsigned int vuier = {1,1,1,1};
+  vector signed int vsier = {-2,-2,-2,-2};
+  vector float vfer = {0.0,0.0,0.0,0.0};
+#else
+  vector unsigned char vucer = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
+  vector signed char vscer = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
+  vector unsigned short vuser = {7,7,7,7,7,7,7,7};
+  vector signed short vsser = {-4,-4,-4,-4,-4,-4,-4,-4};
+  vector unsigned int vuier = {2,2,2,2};
+  vector signed int vsier = {1,1,1,1};
+  vector float vfer = {-1.0,-1.0,-1.0,-1.0};
+#endif
+
+  vucr = vec_splat (vuc, 1);
+  vscr = vec_splat (vsc, 8);
+  vusr = vec_splat (vus, 7);
+  vssr = vec_splat (vss, 0);
+  vuir = vec_splat (vui, 2);
+  vsir = vec_splat (vsi, 3);
+  vfr = vec_splat (vf, 1);
+
+  check (vec_all_eq (vucr, vucer), "vuc");
+  check (vec_all_eq (vscr, vscer), "vsc");
+  check (vec_all_eq (vusr, vuser), "vus");
+  check (vec_all_eq (vssr, vsser), "vss");
+  check (vec_all_eq (vuir, vuier), "vui");
+  check (vec_all_eq (vsir, vsier), "vsi");
+  check (vec_all_eq (vfr, vfer ), "vf");
+}
Index: gcc/testsuite/gcc.dg/vmx/sn7153.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/sn7153.c	(revision 207294)
+++ gcc/testsuite/gcc.dg/vmx/sn7153.c	(working copy)
@@ -34,7 +34,11 @@ main()
 
 void validate_sat()
 {
+#ifdef __LITTLE_ENDIAN__
+  if (vec_any_ne(vec_splat(vec_mfvscr(), 0), ((vector unsigned short){1,1,1,1,1,1,1,1})))
+#else
   if (vec_any_ne(vec_splat(vec_mfvscr(), 7), ((vector unsigned short){1,1,1,1,1,1,1,1})))
+#endif
     {
       union {vector unsigned short v; unsigned short s[8];} u;
       u.v = vec_mfvscr();
Index: gcc/testsuite/gcc.dg/vmx/splat-vsx.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/splat-vsx.c	(revision 0)
+++ gcc/testsuite/gcc.dg/vmx/splat-vsx.c	(revision 0)
@@ -0,0 +1,31 @@
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector float vf = {-2.0,-1.0,0.0,1.0};
+
+  /* Result vectors.  */
+  vector unsigned int vuir;
+  vector signed int vsir;
+  vector float vfr;
+
+  /* Expected result vectors.  */
+  vector unsigned int vuier = {2,2,2,2};
+  vector signed int vsier = {1,1,1,1};
+  vector float vfer = {-1.0,-1.0,-1.0,-1.0};
+
+  vuir = vec_splat (vui, 2);
+  vsir = vec_splat (vsi, 3);
+  vfr = vec_splat (vf, 1);
+
+  check (vec_all_eq (vuir, vuier), "vui");
+  check (vec_all_eq (vsir, vsier), "vsi");
+  check (vec_all_eq (vfr, vfer ), "vf");
+}
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 207294)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5485,7 +5485,7 @@ rs6000_expand_vector_init (rtx target, rtx vals)
		      : gen_vsx_xscvdpsp_scalar (freg, sreg));
 	  emit_insn (cvt);
-	  emit_insn (gen_vsx_xxspltw_v4sf (target, freg, const0_rtx));
+	  emit_insn (gen_vsx_xxspltw_v4sf_direct (target, freg, const0_rtx));
 	}
       else
 	{
@@ -5522,11 +5522,9 @@ rs6000_expand_vector_init (rtx target, rtx vals)
					      gen_rtx_SET (VOIDmode,
							   target, mem),
					      x)));
-      field = (BYTES_BIG_ENDIAN ? const0_rtx
-	       : GEN_INT (GET_MODE_NUNITS (mode) - 1));
       x = gen_rtx_VEC_SELECT (inner_mode, target,
			      gen_rtx_PARALLEL (VOIDmode,
-						gen_rtvec (1, field)));
+						gen_rtvec (1, const0_rtx)));
       emit_insn (gen_rtx_SET (VOIDmode, target,
			      gen_rtx_VEC_DUPLICATE (mode, x)));
       return;
@@ -29980,7 +29978,7 @@ altivec_expand_vec_perm_const (rtx operands[4])
     {
       if (!BYTES_BIG_ENDIAN)
	 elt = 15 - elt;
-      emit_insn (gen_altivec_vspltb (target, op0, GEN_INT (elt)));
+      emit_insn (gen_altivec_vspltb_direct (target, op0, GEN_INT (elt)));
       return true;
     }
@@ -29993,8 +29991,8 @@ altivec_expand_vec_perm_const (rtx operands[4])
     {
       int field = BYTES_BIG_ENDIAN ? elt / 2 : 7 - elt / 2;
       x = gen_reg_rtx (V8HImode);
-      emit_insn (gen_altivec_vsplth (x, gen_lowpart (V8HImode, op0),
-				     GEN_INT (field)));
+      emit_insn (gen_altivec_vsplth_direct (x, gen_lowpart (V8HImode, op0),
+					    GEN_INT (field)));
       emit_move_insn (target, gen_lowpart (V16QImode, x));
       return true;
     }
@@ -30012,8 +30010,8 @@ altivec_expand_vec_perm_const (rtx operands[4])
     {
       int field = BYTES_BIG_ENDIAN ? elt / 4 : 3 - elt / 4;
       x = gen_reg_rtx (V4SImode);
-      emit_insn (gen_altivec_vspltw (x, gen_lowpart (V4SImode, op0),
-				     GEN_INT (field)));
+      emit_insn (gen_altivec_vspltw_direct (x, gen_lowpart (V4SImode, op0),
+					    GEN_INT (field)));
       emit_move_insn (target, gen_lowpart (V16QImode, x));
       return true;
     }
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 207294)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -213,6 +213,7 @@
    UNSPEC_VSX_ROUND_I
    UNSPEC_VSX_ROUND_IC
    UNSPEC_VSX_SLDWI
+   UNSPEC_VSX_XXSPLTW
   ])

 ;; VSX moves
@@ -1751,6 +1752,20 @@
	 (parallel
	  [(match_operand:QI 2 "u5bit_cint_operand" "i,i")]))))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
+{
+  if (!BYTES_BIG_ENDIAN)
+    operands[2] = GEN_INT (3 - INTVAL (operands[2]));
+
+  return "xxspltw %x0,%x1,%2";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "vsx_xxspltw_<mode>_direct"
+  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
+        (unspec:VSX_W [(match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
+                       (match_operand:QI 2 "u5bit_cint_operand" "i,i")]
+                      UNSPEC_VSX_XXSPLTW))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)"
   "xxspltw %x0,%x1,%2"
   [(set_attr "type" "vecperm")])
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md	(revision 207294)
+++ gcc/config/rs6000/altivec.md	(working copy)
@@ -1600,44 +1600,187 @@
   "vsumsws %0,%1,%2"
   [(set_attr "type" "veccomplex")])

-(define_insn "altivec_vspltb"
+(define_expand "altivec_vspltb"
+  [(match_operand:V16QI 0 "register_operand" "")
+   (match_operand:V16QI 1 "register_operand" "")
+   (match_operand:QI 2 "u5bit_cint_operand" "")]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  We have to reflect
+     the actual selected index for the splat in the RTL.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    operands[2] = GEN_INT (15 - INTVAL (operands[2]));
+
+  v = gen_rtvec (1, operands[2]);
+  x = gen_rtx_VEC_SELECT (QImode, operands[1], gen_rtx_PARALLEL (VOIDmode, v));
+  x = gen_rtx_VEC_DUPLICATE (V16QImode, x);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vspltb_internal"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
	(vec_duplicate:V16QI
	 (vec_select:QI (match_operand:V16QI 1 "register_operand" "v")
			(parallel
			 [(match_operand:QI 2 "u5bit_cint_operand" "")]))))]
   "TARGET_ALTIVEC"
+{
+  /* For true LE, this adjusts the selected index.  For LE with
+     -maltivec=be, this reverses what was done in the define_expand
+     because the instruction already has big-endian bias.  */
+  if (!BYTES_BIG_ENDIAN)
+    operands[2] = GEN_INT (15 - INTVAL (operands[2]));
+
+  return "vspltb %0,%1,%2";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vspltb_direct"
+  [(set (match_operand:V16QI 0 "register_operand" "=v")
+        (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
+                       (match_operand:QI 2 "u5bit_cint_operand" "i")]
+                      UNSPEC_VSPLT_DIRECT))]
+  "TARGET_ALTIVEC"
   "vspltb %0,%1,%2"
   [(set_attr "type" "vecperm")])

-(define_insn "altivec_vsplth"
+(define_expand "altivec_vsplth"
+  [(match_operand:V8HI 0 "register_operand" "")
+   (match_operand:V8HI 1 "register_operand" "")
+   (match_operand:QI 2 "u5bit_cint_operand" "")]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  We have to reflect
+     the actual selected index for the splat in the RTL.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    operands[2] = GEN_INT (7 - INTVAL (operands[2]));
+
+  v = gen_rtvec (1, operands[2]);
+  x = gen_rtx_VEC_SELECT (HImode, operands[1], gen_rtx_PARALLEL (VOIDmode, v));
+  x = gen_rtx_VEC_DUPLICATE (V8HImode, x);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vsplth_internal"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
	(vec_duplicate:V8HI
	 (vec_select:HI (match_operand:V8HI 1 "register_operand" "v")
			(parallel
			 [(match_operand:QI 2 "u5bit_cint_operand" "")]))))]
   "TARGET_ALTIVEC"
+{
+  /* For true LE, this adjusts the selected index.  For LE with
+     -maltivec=be, this reverses what was done in the define_expand
+     because the instruction already has big-endian bias.  */
+  if (!BYTES_BIG_ENDIAN)
+    operands[2] = GEN_INT (7 - INTVAL (operands[2]));
+
+  return "vsplth %0,%1,%2";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vsplth_direct"
+  [(set (match_operand:V8HI 0 "register_operand" "=v")
+        (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
+                      (match_operand:QI 2 "u5bit_cint_operand" "i")]
+                     UNSPEC_VSPLT_DIRECT))]
+  "TARGET_ALTIVEC"
   "vsplth %0,%1,%2"
   [(set_attr "type" "vecperm")])

-(define_insn "altivec_vspltw"
+(define_expand "altivec_vspltw"
+  [(match_operand:V4SI 0 "register_operand" "")
+   (match_operand:V4SI 1 "register_operand" "")
+   (match_operand:QI 2 "u5bit_cint_operand" "")]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  We have to reflect
+     the actual selected index for the splat in the RTL.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    operands[2] = GEN_INT (3 - INTVAL (operands[2]));
+
+  v = gen_rtvec (1, operands[2]);
+  x = gen_rtx_VEC_SELECT (SImode, operands[1], gen_rtx_PARALLEL (VOIDmode, v));
+  x = gen_rtx_VEC_DUPLICATE (V4SImode, x);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vspltw_internal"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
	(vec_duplicate:V4SI
	 (vec_select:SI (match_operand:V4SI 1 "register_operand" "v")
			(parallel
			 [(match_operand:QI 2 "u5bit_cint_operand" "i")]))))]
   "TARGET_ALTIVEC"
+{
+  /* For true LE, this adjusts the selected index.  For LE with
+     -maltivec=be, this reverses what was done in the define_expand
+     because the instruction already has big-endian bias.  */
+  if (!BYTES_BIG_ENDIAN)
+    operands[2] = GEN_INT (3 - INTVAL (operands[2]));
+
+  return "vspltw %0,%1,%2";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vspltw_direct"
+  [(set (match_operand:V4SI 0 "register_operand" "=v")
+        (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
+                      (match_operand:QI 2 "u5bit_cint_operand" "i")]
+                     UNSPEC_VSPLT_DIRECT))]
+  "TARGET_ALTIVEC"
   "vspltw %0,%1,%2"
   [(set_attr "type" "vecperm")])

-(define_insn "altivec_vspltsf"
+(define_expand "altivec_vspltsf"
+  [(match_operand:V4SF 0 "register_operand" "")
+   (match_operand:V4SF 1 "register_operand" "")
+   (match_operand:QI 2 "u5bit_cint_operand" "")]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  We have to reflect
+     the actual selected index for the splat in the RTL.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    operands[2] = GEN_INT (3 - INTVAL (operands[2]));
+
+  v = gen_rtvec (1, operands[2]);
+  x = gen_rtx_VEC_SELECT (SFmode, operands[1], gen_rtx_PARALLEL (VOIDmode, v));
+  x = gen_rtx_VEC_DUPLICATE (V4SFmode, x);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vspltsf_internal"
   [(set (match_operand:V4SF 0 "register_operand" "=v")
	(vec_duplicate:V4SF
	 (vec_select:SF (match_operand:V4SF 1 "register_operand" "v")
			(parallel
			 [(match_operand:QI 2 "u5bit_cint_operand" "i")]))))]
   "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
-  "vspltw %0,%1,%2"
+{
+  /* For true LE, this adjusts the selected index.  For LE with
+     -maltivec=be, this reverses what was done in the define_expand
+     because the instruction already has big-endian bias.  */
+  if (!BYTES_BIG_ENDIAN)
+    operands[2] = GEN_INT (3 - INTVAL (operands[2]));
+
+  return "vspltw %0,%1,%2";
+}
   [(set_attr "type" "vecperm")])

 (define_insn "altivec_vspltis<VI_char>"