Whoops, looks like I missed some simpler cases (REG with the wrong mode instead of SUBREG with the wrong mode). Is this revised version ok, assuming it passes testing? It should fix a few more test cases.
The changed code from the previous version is in the last hunk.

Thanks,
Bill

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 203792)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -28837,17 +28838,23 @@ altivec_expand_vec_perm_const (rtx operands[4])
       {  1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },
     { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum,
       {  2,  3,  6,  7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghb,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb : CODE_FOR_altivec_vmrglb,
       {  0, 16,  1, 17,  2, 18,  3, 19,  4, 20,  5, 21,  6, 22,  7, 23 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghh,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh : CODE_FOR_altivec_vmrglh,
       {  0,  1, 16, 17,  2,  3, 18, 19,  4,  5, 20, 21,  6,  7, 22, 23 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghw,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw : CODE_FOR_altivec_vmrglw,
       {  0,  1,  2,  3, 16, 17, 18, 19,  4,  5,  6,  7, 20, 21, 22, 23 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglb,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb : CODE_FOR_altivec_vmrghb,
       {  8, 24,  9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglh,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh : CODE_FOR_altivec_vmrghh,
       {  8,  9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglw,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw : CODE_FOR_altivec_vmrghw,
       {  8,  9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
     { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew,
       {  0,  1,  2,  3, 16, 17, 18, 19,  8,  9, 10, 11, 24, 25, 26, 27 } },
@@ -28980,6 +28987,26 @@ altivec_expand_vec_perm_const (rtx operands[4])
           enum machine_mode omode = insn_data[icode].operand[0].mode;
           enum machine_mode imode = insn_data[icode].operand[1].mode;
 
+          /* For little-endian, don't use vpkuwum and vpkuhum if the
+             underlying vector type is not V4SI and V8HI, respectively.
+             For example, using vpkuwum with a V8HI picks up the even
+             halfwords (BE numbering) when the even halfwords (LE
+             numbering) are what we need.  */
+          if (!BYTES_BIG_ENDIAN
+              && icode == CODE_FOR_altivec_vpkuwum
+              && ((GET_CODE (op0) == REG
+                   && GET_MODE (op0) != V4SImode)
+                  || (GET_CODE (op0) == SUBREG
+                      && GET_MODE (XEXP (op0, 0)) != V4SImode)))
+            continue;
+          if (!BYTES_BIG_ENDIAN
+              && icode == CODE_FOR_altivec_vpkuhum
+              && ((GET_CODE (op0) == REG
+                   && GET_MODE (op0) != V8HImode)
+                  || (GET_CODE (op0) == SUBREG
+                      && GET_MODE (XEXP (op0, 0)) != V8HImode)))
+            continue;
+
           /* For little-endian, the two input operands must be swapped
              (or swapped back) to ensure proper right-to-left numbering
              from 0 to 2N-1.  */

On Mon, 2013-10-21 at 10:02 -0400, David Edelsohn wrote:
> On Mon, Oct 21, 2013 at 8:49 AM, Bill Schmidt
> <wschm...@linux.vnet.ibm.com> wrote:
> > Hi,
> >
> > In altivec_expand_vec_perm_const, we look for special masks that match
> > the behavior of specific instructions, so we can use those instructions
> > rather than load a constant control vector and perform a permute.  Some
> > of the masks must be treated differently for little endian mode.
> >
> > The masks that represent merge-high and merge-low operations have
> > reversed meanings in little-endian, because of the reversed ordering of
> > the vector elements.
> >
> > The masks that represent vector-pack operations remain correct when the
> > mode of the input operands matches the natural mode of the instruction,
> > but not otherwise.  This is because the pack instructions always select
> > the rightmost, low-order bits of the vector element.  There are cases
> > where we use this, for example, with a V8HI vector matching a vpkuwum
> > mask in order to select the odd-numbered elements of the vector.  In
> > little endian mode, this instruction will get us the even-numbered
> > elements instead.  There is no alternative instruction with the desired
> > behavior, so I've just disabled use of those masks for little endian
> > when the mode isn't natural.
> >
> > These changes fix 32 failures in the test suite for little endian mode.
> > Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new
> > failures.  Is this ok for trunk?
> >
> > Thanks,
> > Bill
> >
> >
> > 2013-10-21  Bill Schmidt  <wschm...@vnet.ibm.com>
> >
> >         * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Reverse
> >         meaning of merge-high and merge-low masks for little endian; avoid
> >         use of vector-pack masks for little endian for mismatched modes.
>
> Okay.
>
> Thanks, David
>