http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51078
--- Comment #10 from Paolo Carlini <paolo.carlini at oracle dot com> 2011-11-10 18:20:03 UTC ---

As a general observation about this road to performance improvement: before manually unrolling loops, I think we should **carefully** analyze why the compiler's own loop unrolling optimizations cannot simply be relied on. Indeed, I know we already have a few manually unrolled loops in <algorithm>, but those are *very* old, essentially dating back to the HP / SGI times: I would not be surprised at all to learn that, with current GCC, maybe further tweaked with the help of the compiler people, they no longer provide any benefit. At minimum, we should reassess how *much* to unroll for today's CPUs, or whether the optimal solution now would be to leave the loop optimization to the compiler, supported by hints in the code.
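
To make the two alternatives concrete, here is a minimal sketch, not the actual libstdc++ code. The names find_unrolled and find_hinted are hypothetical. The first function is unrolled by four by hand, in the spirit of the old loops in <algorithm>; the second is a plain loop carrying an unroll hint. The hint used is `#pragma GCC unroll`, which is an assumption about the toolchain: it is only available in GCC 8 and later, so it is guarded and the loop is left plain elsewhere. Nothing here says which variant is faster; that is exactly what would have to be measured.

```cpp
#include <cstddef>

// Hypothetical helper: linear search, manually unrolled by four.
template <typename T>
const T* find_unrolled(const T* first, const T* last, const T& value)
{
    // Number of complete groups of four elements.
    std::ptrdiff_t trips = (last - first) >> 2;
    for (; trips > 0; --trips) {
        if (*first == value) return first;
        ++first;
        if (*first == value) return first;
        ++first;
        if (*first == value) return first;
        ++first;
        if (*first == value) return first;
        ++first;
    }
    // Handle the zero to three leftover elements.
    switch (last - first) {
    case 3:
        if (*first == value) return first;
        ++first;
        // fall through
    case 2:
        if (*first == value) return first;
        ++first;
        // fall through
    case 1:
        if (*first == value) return first;
        ++first;
        // fall through
    case 0:
    default:
        return last;
    }
}

// Hypothetical helper: the same search as a plain loop, with the
// unrolling decision left to the compiler plus an optional hint.
template <typename T>
const T* find_hinted(const T* first, const T* last, const T& value)
{
    // Assumption: GCC 8 or later; other compilers see a plain loop.
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 8
#pragma GCC unroll 4
#endif
    for (; first != last; ++first)
        if (*first == value) return first;
    return last;
}
```

Both functions return a pointer to the first match, or last if there is none, so they can be swapped in a benchmark and the generated code compared at different optimization levels.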