Hi,

Uli Weigand discovered that the gcc.target/powerpc/swaps-p8-21.c test case fails when the large or small code model is used rather than the default medium code model. analyze_swaps has to determine whether the mask used for a vperm insn is loaded from the constant pool, and it does not account for the extra indirection present in such loads under the large and small code models. This patch changes analyze_swaps to handle the extra indirection correctly, and adds a new test case variant to check for it.
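
For readers who don't have the RTL in front of them, here is a rough toy model of the check being changed. It is only a sketch: the toy_* names and strip_one_mem are hypothetical stand-ins I made up for illustration, not GCC's rtx, GET_CODE, XEXP or toc_relative_expr_p. It assumes the only difference between the code models is a single MEM wrapped around the TOC-relative expression.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; these are NOT GCC's definitions.  */
enum toy_code { TOY_MEM, TOY_TOCREL, TOY_OTHER };

struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op0;  /* Address operand when code is TOY_MEM.  */
};

/* Peel at most one MEM wrapper, mirroring what the patch does before
   it calls the TOC-relative predicate.  */
const struct toy_expr *
strip_one_mem (const struct toy_expr *x)
{
  assert (x != NULL);
  if (x->code == TOY_MEM)
    return x->op0;
  return x;
}

/* Hypothetical predicate standing in for toc_relative_expr_p.  */
bool
toy_toc_relative_p (const struct toy_expr *x)
{
  return x != NULL && x->code == TOY_TOCREL;
}
```

In this model a medium-model source is the TOC-relative node itself, while a small/large-model source is a TOY_MEM whose operand is that node. Without strip_one_mem the predicate rejects the second form. Only one level is peeled, so anything that isn't TOC-relative underneath is still rejected.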
Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions. Ok for trunk?

Thanks,
Bill


[gcc]

2015-12-01  Bill Schmidt  <wschm...@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (const_load_sequence_p): Handle extra
	indirection for large and small code models.
	(adjust_vperm): Likewise.

[gcc/testsuite]

2015-12-01  Bill Schmidt  <wschm...@linux.vnet.ibm.com>

	* gcc.target/powerpc/swaps-p8-22.c: New.


Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 231083)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -36613,7 +36613,12 @@ const_load_sequence_p (swap_web_entry *insn_entry,
 	  rtx base, offset;
 	  if (GET_CODE (tocrel_body) != SET)
 	    return false;
-	  if (!toc_relative_expr_p (SET_SRC (tocrel_body), false))
+	  /* There is an extra level of indirection for small/large
+	     code models. */
+	  rtx tocrel_expr = SET_SRC (tocrel_body);
+	  if (GET_CODE (tocrel_expr) == MEM)
+	    tocrel_expr = XEXP (tocrel_expr, 0);
+	  if (!toc_relative_expr_p (tocrel_expr, false))
 	    return false;
 	  split_const (XVECEXP (tocrel_base, 0, 0), &base, &offset);
 	  if (GET_CODE (base) != SYMBOL_REF || !CONSTANT_POOL_ADDRESS_P (base))
@@ -37294,10 +37299,19 @@ adjust_vperm (rtx_insn *insn)
      to set tocrel_base; otherwise it would be unnecessary as we've
      already established it will return true. */
   rtx base, offset;
-  if (!toc_relative_expr_p (SET_SRC (PATTERN (tocrel_insn)), false))
+  rtx tocrel_expr = SET_SRC (PATTERN (tocrel_insn));
+  /* There is an extra level of indirection for small/large code models. */
+  if (GET_CODE (tocrel_expr) == MEM)
+    tocrel_expr = XEXP (tocrel_expr, 0);
+  if (!toc_relative_expr_p (tocrel_expr, false))
     gcc_unreachable ();
   split_const (XVECEXP (tocrel_base, 0, 0), &base, &offset);
   rtx const_vector = get_pool_constant (base);
+  /* With the extra indirection, get_pool_constant will produce the
+     real constant from the reg_equal expression, so get the real
+     constant. */
+  if (GET_CODE (const_vector) == SYMBOL_REF)
+    const_vector = get_pool_constant (const_vector);
   gcc_assert (GET_CODE (const_vector) == CONST_VECTOR);
 
   /* Create an adjusted mask from the initial mask. */
Index: gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c	(working copy)
@@ -0,0 +1,29 @@
+/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-O2 -mcpu=power8 -maltivec -mcmodel=large" } */
+
+/* The expansion for vector character multiply introduces a vperm operation.
+   This tests that changing the vperm mask allows us to remove all swaps
+   from the generated code. It is a duplicate of swaps-p8-21.c, except
+   that it applies the large code model, which requires an extra indirection
+   in the load of the constant mask. */
+
+#include <altivec.h>
+
+void abort ();
+
+vector unsigned char r;
+vector unsigned char v =
+  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
+vector unsigned char i =
+  { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };
+
+int main ()
+{
+  int j;
+  r = v * i;
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "vperm" 1 } } */
+/* { dg-final { scan-assembler-not "xxpermdi" } } */