------- Comment #10 from rguenth at gcc dot gnu dot org  2006-12-14 15:46 -------

On the trunk with -O2 -funroll-loops I get

main:
.LFB2:
        xorl    %eax, %eax
        .p2align 4,,7
.L2:
        cvtsi2sd        %eax, %xmm0
        addl    $1, %eax
        cmpl    $1000000000, %eax
        movq    $0, data+24(%rip)
        movq    $0, data+48(%rip)
        movq    $0, data+8(%rip)
        movq    $0, data+56(%rip)
        movq    $0, data+16(%rip)
        movq    $0, data+40(%rip)
        movsd   %xmm0, data(%rip)
        movsd   %xmm0, data+32(%rip)
        movsd   %xmm0, data+64(%rip)
        jne     .L2
        xorl    %eax, %eax
        ret

It doesn't look like we can do better w/o cheating and moving the
benchmark-loop-invariant movq $0 out of the loop ;)

The UNROLL variant looks like

.L2:
        leal    8(%rdx), %ecx
        addl    $9, %edx
        movq    $0, data+8(%rip)
        cmpl    $1000000000, %edx
        movq    $0, data+16(%rip)
        movq    $0, data+24(%rip)
        cvtsi2sd        %ecx, %xmm0
        movq    $0, data+40(%rip)
        movq    $0, data+48(%rip)
        movq    $0, data+56(%rip)
        movsd   %xmm0, data(%rip)
        movsd   %xmm0, data+32(%rip)
        movsd   %xmm0, data+64(%rip)
        jne     .L2


-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30201