http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50328
--- Comment #3 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-09-08 13:40:30 UTC ---
Triggered by report http://gcc.gnu.org/ml/gcc/2011-09/msg00052.html

OpenCC then unrolls the outer loop to get

.LBB16_double_array_mults_by_const:
#<loop> Loop body line 62, nesting depth: 2, iterations: 16384
#<loop> unrolled 4 times
        mulpd %xmm6,%xmm0                       # [0]
        movaps %xmm0,%xmm1                      # [4]
        mulpd %xmm6,%xmm1                       # [6]
        mulpd %xmm6,%xmm1                       # [10]
        addq $8,%rax                            # [14]
        mulpd %xmm6,%xmm1                       # [14]
        cmpq $131071,%rax                       # [15]
        setle %dil                              # [16]
        testb %dil,%dil                         # [17]
        movaps %xmm1,%xmm0                      # [18]
        jne .LBB16_double_array_mults_by_const  # [18]

instead of what we get with the patch

.L2:
        subl    $1, %eax
        mulpd   %xmm1, %xmm0
        jne     .L2

we don't have outer loop unrolling either.
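
For illustration only, a rough C sketch of the difference between the two
listings, assuming the loop boils down to repeatedly multiplying a
register-resident value by a constant. The function names and the scalar
(non-SSE) formulation are made up here; this is not the benchmark source.

```c
/* Hypothetical sketch, not the benchmark source: multiply x by c, n times. */

/* Rolled form, roughly what the three-instruction .L2 loop does. */
double mults_rolled(double x, double c, int n)
{
    while (n-- > 0)
        x *= c;
    return x;
}

/* Unrolled 4 times, in the spirit of the OpenCC listing: four multiplies
   per trip through the loop control.  The order of the multiplications
   is unchanged, so the result is the same as in the rolled form. */
double mults_unrolled4(double x, double c, int n)
{
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        x *= c;
        x *= c;
        x *= c;
        x *= c;
    }
    for (; i < n; i++)      /* remainder when n is not a multiple of 4 */
        x *= c;
    return x;
}
```

Each multiply in the sketch still depends on the previous one, so unrolling
of this kind only amortizes the counter and branch; it does not shorten the
dependency chain.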