------- Comment #2 from uros at kss-loka dot si  2005-10-24 06:53 -------
Some updated timings with gcc version 4.1.0 20051021 (experimental):

-O2 -march=pentium4 -fprefetch-loop-arrays:
user    0m17.805s
user    0m17.752s
user    0m17.744s

-O2 -march=pentium4:
user    0m17.750s
user    0m17.758s
user    0m17.838s

The main loop with -fprefetch-loop-arrays now looks like this:

.L2:
        prefetcht0      buf+384(,%eax,4)
        movl    %eax, buf(,%eax,4)
        addl    $1, %eax
        cmpl    $10000000, %eax
        jne     .L2

And without prefetch:

.L2:
        movl    %eax, buf(,%eax,4)
        addl    $1, %eax
        cmpl    $10000000, %eax
        jne     .L2

-- 

uros at kss-loka dot si changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.1.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20748
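
For readers who want to reproduce the comparison, the two loop listings in this comment correspond roughly to the C below. This is a reconstruction from the assembly, not the original test case from the report: the function names and the PREFETCH_AHEAD constant are made up for illustration, and buf is assumed to be a global int array. The offset of 96 elements comes from the buf+384 operand (384 bytes / 4-byte int). The second function uses GCC's __builtin_prefetch to request the hint by hand; the compiler is not guaranteed to emit exactly the instruction shown in the listing.

```c
#include <stddef.h>

#define N 10000000          /* loop bound, taken from the cmpl in the listing */
#define PREFETCH_AHEAD 96   /* 384 bytes ahead / sizeof(int) == 4 (assumed) */

/* Padded so that &buf[i + PREFETCH_AHEAD] is always a valid address
   expression; the padding is an addition for this sketch only. */
static int buf[N + PREFETCH_AHEAD];

/* Plain store loop: matches the listing without prefetch. */
void fill_plain(void)
{
    for (int i = 0; i < N; i++)
        buf[i] = i;
}

/* Same loop with an explicit prefetch hint ahead of each store.
   Arguments: address, rw = 1 (prepare for write), locality = 3 (keep
   in all cache levels). */
void fill_prefetch(void)
{
    for (int i = 0; i < N; i++) {
        __builtin_prefetch(&buf[i + PREFETCH_AHEAD], 1, 3);
        buf[i] = i;
    }
}
```

Compiling only fill_plain with and without -fprefetch-loop-arrays and inspecting the output of gcc -S is one way to check whether a given compiler version inserts the prefetch automatically.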