https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77536
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-09-14
                 CC|                            |hubicka at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think the reason for the current behavior is that without a profile we
predict loops to run for 10 iterations (or so), thus after vectorization with
a vectorization factor of two you'd get two vector iterations out of that
guess.

Honza fixed the profile updating, but the real issue is probably the
frequencies we use for the extra tests the vectorizer performs, which seem to
be all 66% / 33%:

  /* There are many aspects to how likely the first loop is going to be executed.
     Without histogram we can't really do good job.  Simply set it to
     2/3, so the first loop is not reordered to the end of function and
     the hot path through stays short.  */
  int first_guard_probability = 2 * REG_BR_PROB_BASE / 3;
  int second_guard_probability = 2 * REG_BR_PROB_BASE / 3;

There isn't any attempt to distinguish the different cases.

Even with that fixed, the low number of predicted iterations for the scalar
loop makes coming up with a "hot" profile for the vectorized loop difficult.
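
To make the 66% / 33% figures concrete, here is a minimal standalone sketch of the arithmetic behind the two guard probabilities. It assumes REG_BR_PROB_BASE is 10000, its value in the GCC sources; the three helper functions are hypothetical names introduced only for illustration and are not part of the vectorizer.

```c
/* Fixed-point scale for branch probabilities.  Assumed to be 10000,
   as defined in the GCC sources.  */
#define REG_BR_PROB_BASE 10000

/* Same expression as the quoted guard probabilities: 2/3 of the base,
   computed with integer division.  Hypothetical wrapper.  */
int
guard_probability (void)
{
  return 2 * REG_BR_PROB_BASE / 3;
}

/* Hypothetical helper: probability of the other outgoing edge.  */
int
complement_probability (int prob)
{
  return REG_BR_PROB_BASE - prob;
}

/* Hypothetical helper: convert a fixed-point probability to a whole
   percentage, truncating any fractional part.  */
int
probability_to_percent (int prob)
{
  return prob * 100 / REG_BR_PROB_BASE;
}
```

Under that assumption both guards evaluate to 6666 out of 10000, leaving 3334 for the other edge, which truncates to the 66% / 33% split mentioned above. Since both guards use the same constant, nothing in this arithmetic distinguishes one case from the other.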