http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186
--- Comment #8 from Julian Andres Klode <[email protected]> 2010-10-26 15:25:56 UTC ---
(In reply to comment #6)
> You get this kind of speedup if the compiler knows that the result of the loop
> is
>
> sum=(b*(b-1)-a*(a-1))/2
>
> In which case the timing is meaningless (it is 0.000s on my laptop), so is the
> ratio with the execution of the loop.
>
> The basic question is: how much the user's ignorance should be repaired by the
> optimizer? (A colleague of mine told me that he once audited a CFD code and
> found that \int_a^b dx/x was evaluated numerically instead of using
> log(b)-log(a)!-)

Since the optimization seems to be mostly there in -O3, it's just a matter of
enabling it in -O2.

I just found out that it does not optimize if you call f() via a global
function pointer, it still takes 1.6 seconds despite being compiled at -O3,
whereas clang can optimize it to 0.001s.
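
For readers without the attachment at hand, here is a minimal sketch of the kind of code under discussion. It is an assumption, not the test case from the bug report: it supposes that f(a, b) sums the integers a, a+1, ..., b-1, which is the loop whose result matches the closed form quoted from comment #6. The names sum_closed and fp are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Assumed shape of the benchmarked function: adds a, a+1, ..., b-1. */
unsigned long f(unsigned long a, unsigned long b)
{
    unsigned long sum = 0;
    for (unsigned long i = a; i < b; i++)
        sum += i;
    return sum;
}

/* Closed form quoted from comment #6. Only meaningful for a <= b and
 * for values small enough that b*(b-1) does not wrap around. */
unsigned long sum_closed(unsigned long a, unsigned long b)
{
    assert(a <= b);
    return (b * (b - 1) - a * (a - 1)) / 2;
}

/* Hypothetical global function pointer. It is a writable global with
 * external linkage, so without whole-program knowledge the compiler
 * cannot assume that it still points at f when it is called. */
unsigned long (*fp)(unsigned long, unsigned long) = f;
```

A call such as fp(0, 1000) goes through the pointer, while f(0, 1000) is a direct call that the optimizer can inline and then reduce. Whether a given compiler version replaces the loop with the closed form in either case would have to be checked in the generated assembly.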
