Jakub Jelinek wrote:

Including loop unrolling to -O2 is IMNSHO a bad idea, as loop unrolling
increases code size, sometimes a lot.  And the distinction between -O2
and -O3 is exactly in the space-for-speed tradeoffs.

That's certainly a valid way of defining the difference (and it certainly
used to be the case in the old days, when the principal extra optimization
was inlining).

On many CPUs for many programs, -O3 generates slower code than -O2,
because the cache footprint disadvantages override positive effects
of the loop unrolling, extra inlining etc.

That's what we have found, though I would have thought it unusual that
loop unrolling would run into this cache effect in most cases.
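
For readers who want the code-size point made concrete, here is a minimal
hand-written sketch in C. It is not actual GCC output: the function names
and the unroll factor of 4 are made up for illustration. It only shows why
an unrolled loop takes more instructions than the rolled one while
computing the same result.

```c
#include <stddef.h>

/* Rolled form: one add plus one compare-and-branch per element. */
static long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by 4 (factor chosen arbitrarily for illustration).
   Fewer branches per element, but the loop body is four times as long
   and a remainder loop is needed for the leftover 0..3 elements. */
static long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)   /* remainder loop */
        s += a[i];
    return s;
}
```

Whether the extra bytes cost anything at run time depends on how much else
is competing for the instruction cache, which is the effect being
discussed above.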

        Jakub

