The following fixes a wrong-code bug when SLP vectorizing shifts, where we may end up treating the shift amount as a scalar even though it really isn't one.
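To make the failure mode concrete, here is a minimal standalone sketch in plain C (not GCC code). It assumes a vector of four 32-bit lanes that covers two loop iterations of a two-statement SLP group, as in the first testcase, so the lanes carry two different shift amounts. The names `LANES`, `shift_per_lane`, `shift_by_first_amount` and `count_wrong_lanes` are made up for illustration.

```c
#include <stdint.h>

/* Assumed vector width for this sketch: four 32-bit lanes.  */
enum { LANES = 4 };

/* Reference semantics: each lane is shifted by its own amount.
   All amounts must be below 32 to stay within defined behavior.  */
static void
shift_per_lane (const uint32_t in[LANES], const uint32_t amt[LANES],
                uint32_t out[LANES])
{
  for (int l = 0; l < LANES; ++l)
    out[l] = in[l] << amt[l];
}

/* Model of a vector shift by a scalar: every lane uses the amount of
   lane 0, which is what happens when the operand is taken to be
   uniform.  */
static void
shift_by_first_amount (const uint32_t in[LANES], const uint32_t amt[LANES],
                       uint32_t out[LANES])
{
  for (int l = 0; l < LANES; ++l)
    out[l] = in[l] << amt[0];
}

/* Count the lanes where the scalar-amount model disagrees with the
   per-lane reference.  */
static int
count_wrong_lanes (const uint32_t in[LANES], const uint32_t amt[LANES])
{
  uint32_t want[LANES], got[LANES];
  int wrong = 0;

  shift_per_lane (in, amt, want);
  shift_by_first_amount (in, amt, got);
  for (int l = 0; l < LANES; ++l)
    wrong += (want[l] != got[l]);
  return wrong;
}
```

With inputs `{1, 1, 1, 1}` and amounts `{1, 1, 2, 2}` (iterations i = 0 and i = 1 of a loop shifting by `i+1`), the two lanes belonging to the second iteration come out wrong; with a genuinely uniform amount the two models agree.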

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied on trunk so far.

Richard.

2019-01-18  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/88903
        * tree-vect-stmts.c (vectorizable_shift): Verify we see all
        scalar stmts a SLP shift amount is composed of when detecting
        shifts by scalars.

        * gcc.dg/vect/pr88903-1.c: New testcase.
        * gcc.dg/vect/pr88903-2.c: Likewise.

Index: gcc/testsuite/gcc.dg/vect/pr88903-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/pr88903-1.c (nonexistent)
+++ gcc/testsuite/gcc.dg/vect/pr88903-1.c (working copy)
@@ -0,0 +1,26 @@
+#include "tree-vect.h"
+
+int x[1024];
+
+void __attribute__((noinline))
+foo()
+{
+  for (int i = 0; i < 512; ++i)
+    {
+      x[2*i] = x[2*i] << (i+1);
+      x[2*i+1] = x[2*i+1] << (i+1);
+    }
+}
+
+int
+main()
+{
+  check_vect ();
+  for (int i = 0; i < 1024; ++i)
+    x[i] = i;
+  foo ();
+  for (int i = 0; i < 1024; ++i)
+    if (x[i] != i << (i/2+1))
+      __builtin_abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/vect/pr88903-2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/pr88903-2.c (nonexistent)
+++ gcc/testsuite/gcc.dg/vect/pr88903-2.c (working copy)
@@ -0,0 +1,28 @@
+#include "tree-vect.h"
+
+int x[1024];
+int y[1024];
+int z[1024];
+
+void __attribute__((noinline)) foo()
+{
+  for (int i = 0; i < 512; ++i)
+    {
+      x[2*i] = x[2*i] << y[2*i];
+      x[2*i+1] = x[2*i+1] << y[2*i];
+      z[2*i] = y[2*i];
+      z[2*i+1] = y[2*i+1];
+    }
+}
+
+int main()
+{
+  check_vect ();
+  for (int i = 0; i < 1024; ++i)
+    x[i] = i, y[i] = i % 8;
+  foo ();
+  for (int i = 0; i < 1024; ++i)
+    if (x[i] != i << ((i & ~1) % 8))
+      __builtin_abort ();
+  return 0;
+}
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c (revision 268068)
+++ gcc/tree-vect-stmts.c (working copy)
@@ -5540,6 +5540,15 @@ vectorizable_shift (stmt_vec_info stmt_i
               if (!operand_equal_p (gimple_assign_rhs2 (slpstmt), op1, 0))
                 scalar_shift_arg = false;
             }
+
+          /* For internal SLP defs we have to make sure we see scalar stmts
+             for all vector elements.
+             ???  For different vectors we could resort to a different
+             scalar shift operand but code-generation below simply always
+             takes the first.  */
+          if (dt[1] == vect_internal_def
+              && maybe_ne (nunits_out * SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node), stmts.length ()))
+            scalar_shift_arg = false;
         }
 
       /* If the shift amount is computed by a pattern stmt we cannot
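
For readers who do not have the vectorizer source at hand, the condition added to vectorizable_shift can be modelled in isolation roughly as follows. This is only a sketch under simplifying assumptions: plain integers stand in for the poly_int comparison done by `maybe_ne`, and `enum def_kind` and `keep_scalar_shift_arg` are hypothetical names, not GCC API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the vectorizer's classification of the
   shift-amount operand.  */
enum def_kind { DEF_EXTERNAL, DEF_INTERNAL };

/* Decide whether a shift amount that looked uniform across the SLP
   group may still be treated as a scalar.

   scalar_shift_arg  result of the earlier per-statement comparison
   amount_def        how the shift amount is defined
   nunits            lanes per vector
   num_vec_stmts     vector statements generated for the SLP node
   num_scalar_stmts  scalar statements recorded in the SLP node

   For an internally defined amount, the scalar statements must
   account for every lane of every vector statement; otherwise some
   lanes come from statements that were never compared.  */
static bool
keep_scalar_shift_arg (bool scalar_shift_arg, enum def_kind amount_def,
                       size_t nunits, size_t num_vec_stmts,
                       size_t num_scalar_stmts)
{
  if (amount_def == DEF_INTERNAL
      && nunits * num_vec_stmts != num_scalar_stmts)
    return false;
  return scalar_shift_arg;
}
```

Assuming four lanes and a single vector statement for the two-statement group of pr88903-1.c, the call `keep_scalar_shift_arg (true, DEF_INTERNAL, 4, 1, 2)` yields false, so the shift falls back to a vector shift amount. The actual lane count depends on the target.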