------- Comment #2 from victork at gcc dot gnu dot org 2008-08-05 11:07 -------
> Works for me if you schedule another copyprop and dce before the vectorizer.
Yes, on powerpc it gets vectorized by non-SLP vectorization even without the additional copyprop and dce, but in this case non-SLP vectorization requires permute operations and ends up being less effective than SLP vectorization.

By the way, I've noticed that after the tuples merge, the dump of permute operations looks different:

Before:
  vect_perm_even.32_50 = VEC_EXTRACT_EVEN_EXPR < vect_var_.30_47, vect_var_.31_49 > ;
  vect_perm_odd.33_51 = VEC_EXTRACT_ODD_EXPR < vect_var_.30_47, vect_var_.31_49 > ;

After:
  vect_perm_even.32_49 = vect_var_.30_46 <<< ??? >>> vect_var_.31_48;
  vect_perm_odd.33_50 = vect_var_.30_46 <<< ??? >>> vect_var_.31_48;

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37027