https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115843
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
t.c:9:17: note:   misalignment for fully-masked loop: 15

so in the first iteration only the last element should be active.  But

  # loop_mask_58 = PHI <_100(10), { 0, 0, 0, 0, 0, 0, 0, 0 }(2)>
  # loop_mask_57 = PHI <_101(10), { 0, 0, 0, 0, 0, 0, 0, 0 }(2)>

is then wrong and

  # vect_vec_iv_.6_46 = PHI <_47(10), { -15, -14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0 }(2)>
  _47 = vect_vec_iv_.6_46 + { 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16 };
  vect__1.7_49 = vect_vec_iv_.6_46 & { 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7 };

are the values for { 1, 2, ... } thus the next iteration (not relevant for
this particular induction use).  There are then un(loop-)masked uses of the
mask derived from vect__1.7_49 in

  vect_patt_15.25_84 = VEC_COND_EXPR <mask_patt_23.15_66, vect_patt_20.18_72, vect__6.22_79>;

but ultimately the loop mask is applied in its uses via .MASK_STORE.

I have a fix.