Hi, While testing another patch, I hit a regression at -O2 for two of the vector shuffle tests. This patch fixes the problem.
The problem was introduced with the little endian fixes for VEC_PERM_EXPR. The original change performed the necessary transformation at expand time, but this is incorrect. This creates a technically incorrect representation in the RTL, and RTL simplification can subsequently fold the expression in an unexpected way. The correct fix is to delay the transformation until final code generation. Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no regressions. Is this ok for trunk? I will also plan to backport this to ibm/gcc-4_8-branch after burn-in. Thanks, Bill 2014-01-23 Bill Schmidt <wschm...@linux.vnet.ibm.com> * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Remove correction for little endian... * config/rs6000/vsx.md (vsx_xxpermdi2_<mode>_1): ...and move it to here. Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 206889) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -30115,22 +30121,6 @@ rs6000_expand_vec_perm_const_1 (rtx target, rtx op vmode = GET_MODE (target); gcc_assert (GET_MODE_NUNITS (vmode) == 2); dmode = mode_for_vector (GET_MODE_INNER (vmode), 4); - - /* For little endian, swap operands and invert/swap selectors - to get the correct xxpermdi. The operand swap sets up the - inputs as a little endian array. The selectors are swapped - because they are defined to use big endian ordering. The - selectors are inverted to get the correct doublewords for - little endian ordering. */ - if (!BYTES_BIG_ENDIAN) - { - int n; - perm0 = 3 - perm0; - perm1 = 3 - perm1; - n = perm0, perm0 = perm1, perm1 = n; - x = op0, op0 = op1, op1 = x; - } - x = gen_rtx_VEC_CONCAT (dmode, op0, op1); v = gen_rtvec (2, GEN_INT (perm0), GEN_INT (perm1)); x = gen_rtx_VEC_SELECT (vmode, x, gen_rtx_PARALLEL (VOIDmode, v)); Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 206889) +++ gcc/config/rs6000/vsx.md (working copy) @@ -1634,9 +1634,26 @@ (match_operand 4 "const_2_to_3_operand" "")])))] "VECTOR_MEM_VSX_P (<MODE>mode)" { - int mask = (INTVAL (operands[3]) << 1) | (INTVAL (operands[4]) - 2); + int op3, op4, mask; + + if (BYTES_BIG_ENDIAN) + { + op3 = INTVAL (operands[3]); + op4 = INTVAL (operands[4]); + } + else + { + op3 = 3 - INTVAL (operands[4]); + op4 = 3 - INTVAL (operands[3]); + } + + mask = (op3 << 1) | (op4 - 2); operands[3] = GEN_INT (mask); - return "xxpermdi %x0,%x1,%x2,%3"; + + if (BYTES_BIG_ENDIAN) + return "xxpermdi %x0,%x1,%x2,%3"; + else + return "xxpermdi %x0,%x2,%x1,%3"; } [(set_attr "type" "vecperm")])