http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48616
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-04-15 16:00:43 UTC ---
Created attachment 23997
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23997
pr48616.ii.bz2

Preprocessed, for -O2 -ftree-vectorize -mxop -m64.

The bug is during the second vect_schedule_slp call on that testcase, or
perhaps in its analysis. The first vect_schedule_slp call is on a bb that
reads from shift and is vectorized correctly.

The problem on the second bb is that none of the 4 rhs2s of the shifts has
its definition in the current bb, thus vect_is_simple_use does:

  if ((loop && !flow_bb_inside_loop_p (loop, bb))
      || (!loop && bb != BB_VINFO_BB (bb_vinfo))
      || (!loop && gimple_code (*def_stmt) == GIMPLE_PHI))
    *dt = vect_external_def;

and vectorizable_shift assumes that vect_external_def means vector shift by
scalar. Nothing during the analysis checks that, when it assumes a shift by
scalar, the shift count is actually the same in all the stmts that SLP is
going to replace.

In this case, if it remembered the vector_var_ from the defining bb, it could
very well just use it as the rhs2 operand of the vector shift, but that
assumes the bb in which it is defined has already been slp vectorized.
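
To illustrate the missing check described above, here is a minimal, self-contained sketch in C. It is not GCC code: struct toy_shift_stmt and toy_can_use_scalar_shift are hypothetical names, and the integer count_def_id is only an assumed stand-in for "the definition that feeds rhs2". It shows the condition under which treating a group of shifts as a vector shift by scalar would be valid.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one scalar shift stmt in an
   SLP group.  Not a GCC data structure.  */
struct toy_shift_stmt
{
  int count_def_id;        /* Identifies the definition feeding rhs2.  */
  bool count_is_external;  /* True if rhs2 is defined outside the bb.  */
};

/* Return true if the N stmts in STMTS may be replaced by a single
   vector shift by scalar: every shift count must be external and all
   of them must come from the same definition.  */
bool
toy_can_use_scalar_shift (const struct toy_shift_stmt *stmts, size_t n)
{
  if (n == 0)
    return false;

  for (size_t i = 0; i < n; i++)
    {
      /* A count defined inside the bb is not a shift-by-scalar case.  */
      if (!stmts[i].count_is_external)
        return false;

      /* External alone is not enough: the counts must also agree.  */
      if (stmts[i].count_def_id != stmts[0].count_def_id)
        return false;
    }

  return true;
}
```

In this toy model, four shifts whose counts are all external but come from four different definitions are rejected, which corresponds to the situation on the second bb of the testcase.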