https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91435
Bug ID: 91435
Summary: Better induction variable for vectorization
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: glisse at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-*-*

(from https://stackoverflow.com/q/57465290/1918193)

long RegularTest(int n) {
  long sum = 0;
  for (int i = 0; i < n; ++i)
    if (i % 2 != 0)
      sum += i + 1;
  return sum;
}

Compiling with -O3 -march=skylake, this gets vectorized, but the result has

  # vect_vec_iv_.14_60 = PHI <{ 0, 1, 2, 3, 4, 5, 6, 7 }(5), vect_vec_iv_.14_61(6)>
  vect_vec_iv_.14_61 = vect_vec_iv_.14_60 + { 8, 8, 8, 8, 8, 8, 8, 8 };
  vect__3.17_66 = vect_vec_iv_.14_60 + { 2, 2, 2, 2, 2, 2, 2, 2 };

(those are the only uses of vect_vec_iv_.14_6[01])

If we are only ever going to use x+2, why not use that instead, initialize with
{2,3,4,...}, and skip the +2 at every iteration?

(there are other things to discuss about optimizing this testcase, for instance
clang is clever enough to unroll by a factor of 2 and remove the condition, but
let's stick to the induction variable for this PR)
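
To make the request concrete, here is a rough scalar model of the two induction-variable shapes. It is only an illustrative sketch, not GCC output: the function names and the LANES constant are made up for this example, each "vector" is modelled as a plain array of 8 ints, and the lanes are simply summed so the two variants can be compared.

```c
#include <stdint.h>

#define LANES 8  /* assumption: 8 x 32-bit lanes, matching the dump above */

/* Shape seen in the dump: the IV starts at {0,...,7}; each iteration first
   adds {2,...,2} to get the value that is actually used, then steps the IV
   by {8,...,8}. */
int64_t sum_with_per_iteration_add(int iterations)
{
    int32_t iv[LANES];
    int64_t total = 0;

    for (int lane = 0; lane < LANES; ++lane)
        iv[lane] = lane;                    /* PHI initial value {0,...,7} */

    for (int it = 0; it < iterations; ++it) {
        for (int lane = 0; lane < LANES; ++lane) {
            int32_t used = iv[lane] + 2;    /* extra add on every iteration */
            total += used;
            iv[lane] += 8;                  /* IV step */
        }
    }
    return total;
}

/* Requested shape: fold the +2 into the initial value, so the IV starts at
   {2,...,9} and is used directly; only the step remains in the loop. */
int64_t sum_with_folded_offset(int iterations)
{
    int32_t iv[LANES];
    int64_t total = 0;

    for (int lane = 0; lane < LANES; ++lane)
        iv[lane] = lane + 2;                /* PHI initial value {2,...,9} */

    for (int it = 0; it < iterations; ++it) {
        for (int lane = 0; lane < LANES; ++lane) {
            total += iv[lane];              /* no per-iteration +2 */
            iv[lane] += 8;                  /* IV step */
        }
    }
    return total;
}
```

Both functions visit the same lane values on every iteration. The second one just saves one vector add per iteration, which is valid here only because the un-offset IV has no other use.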