http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50693
             Bug #: 50693
           Summary: Slightly different loop body leads to 5.5x slower performance
    Classification: Unclassified
           Product: gcc
           Version: 4.6.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: alex.gay...@gmail.com

Created attachment 25460
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25460
Script which reproduces the speed difference.

Given the two loops in the attached file, the first generates excellent code, which performs the same as a memset; however, the second results in very poor code which is 5.5x slower than the first. If anyone is curious why we have such strange looking code: it comes from a compiler which targets C.

This was tested by compiling at -O3 with -std=gnu99 on my machine. Running with no argument:

alex@alex-gaynor-laptop:/tmp$ gcc -O3 test.c -std=gnu99
alex@alex-gaynor-laptop:/tmp$ time ./a.out > /dev/null

real    0m0.427s
user    0m0.412s
sys     0m0.016s
alex@alex-gaynor-laptop:/tmp$ time ./a.out > /dev/null

real    0m0.428s
user    0m0.416s
sys     0m0.008s
alex@alex-gaynor-laptop:/tmp$ time ./a.out > /dev/null

real    0m0.432s
user    0m0.404s
sys     0m0.024s

Running with the argument 0:

alex@alex-gaynor-laptop:/tmp$ time ./a.out 0 > /dev/null

real    0m2.225s
user    0m2.200s
sys     0m0.020s
alex@alex-gaynor-laptop:/tmp$ time ./a.out 0 > /dev/null

real    0m2.217s
user    0m2.196s
sys     0m0.016s
alex@alex-gaynor-laptop:/tmp$ time ./a.out 0 > /dev/null

real    0m2.268s
user    0m2.252s
sys     0m0.012s
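
For readers without the attachment, the comparison against memset mentioned above can be illustrated with a small stand-in. This is only a sketch under stated assumptions: the names zero_with_loop, zero_with_memset and fills_match are hypothetical, and the loop shown is a generic byte-zeroing loop, not either of the loops from attachment 25460. It checks only that a plain loop and memset leave the same bytes behind; it says nothing about the 5.5x slowdown itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a simple fill loop.
 * NOT the loop from attachment 25460, whose contents are not shown here. */
void zero_with_loop(unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = 0;
}

/* Reference implementation using the C library. */
void zero_with_memset(unsigned char *buf, size_t n)
{
    memset(buf, 0, n);
}

/* Returns 1 if both routines leave identical buffer contents,
 * 0 if they differ, and -1 if allocation fails. */
int fills_match(size_t n)
{
    /* Allocate at least one byte so malloc(0) behaviour does not matter. */
    unsigned char *a = malloc(n ? n : 1);
    unsigned char *b = malloc(n ? n : 1);
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return -1;
    }

    /* Pre-fill with a non-zero pattern so the zeroing is observable. */
    memset(a, 0xAA, n);
    memset(b, 0xAA, n);

    zero_with_loop(a, n);
    zero_with_memset(b, n);

    int same = (memcmp(a, b, n) == 0);
    free(a);
    free(b);
    return same;
}
```

Calling fills_match from a main function and timing each routine separately, as done with time in the transcript above, would be one way to compare a candidate loop against memset on a given compiler version.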