https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #23 from Richard Biener <rguenth at gcc dot gnu.org> ---
Maybe easier to understand testcase:

long x[9];
long a[20];
struct { long x; long b[40]; } b;

int __attribute__((noipa))
foo (int n)
{
  int i = 0;
  int k = 0;
  do
    {
      if (x[k++])  // early exit, loop upper bound is 8 because of this
        break;
      a[i] = b.b[2*i]; // the misaligned 2*i access causes peeling for gaps
    }
  while (++i < n);
  return i;
}

int
main ()
{
  x[8] = 1;
  if (foo (20) != 8)
    __builtin_abort ();
  return 0;
}

with -O3 -msse4.1 -fno-vect-cost-model we return 20 instead of 8.  Adding
-fdisable-tree-cunroll avoids the issue.  The upper bound we set on the
vector loop causes us to force taking the IV exit which continues
with i == (niter - 1) / VF * VF, but 'niter' is 20 here.