(I'm new here and don't know the norms, so sorry if I'm doing this all wrong.) Tested on gcc 6.2.0 and 5.4.0, amd64, with the same results.
Code: http://paste.debian.net/plain/894930

In this code, method1 is significantly slower than the other methods, and it is called twice in the loop. I believe -Os (correctly) concluded that the second call to method1 was useless and optimized it away, while -O3 (and -O2) did not. As a result, -Os is much faster than -O2/-O3.

Average run times on my system:

-O2 and -O3: 1.3 seconds
-O3 -ffast-math: 1.7 seconds (no idea why, but a noticeable difference)
-Os: 0.7 seconds

I guess this is an optimization bug. Optimizing away useless calls isn't just a size thing (which -Os specializes in); it's most definitely a performance thing too (which is, if I remember correctly, -O3's specialty).
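For anyone who can't open the paste, here is a minimal sketch of the kind of pattern I mean. It is NOT the code from the paste: the body of method1 here is a made-up stand-in, and loop_called_twice, loop_called_once and check_same_result are hypothetical names used only for illustration. It assumes method1 depends only on its argument and has no side effects, which is what would make the second call redundant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the slow method1 (not the code from the paste).
   Assumption: the result depends only on x and there are no side effects. */
static uint64_t method1(uint64_t x)
{
    uint64_t h = x;
    for (int i = 0; i < 64; ++i)
        h = h * 6364136223846793005ULL + 1442695040888963407ULL;
    return h;
}

/* The pattern from the report: the same call appears twice per iteration. */
uint64_t loop_called_twice(uint64_t n)
{
    uint64_t sum = 0;
    for (uint64_t i = 0; i < n; ++i) {
        sum += method1(i);
        sum += method1(i); /* same argument, so same result as the line above */
    }
    return sum;
}

/* What I would expect an optimizer to turn it into: one call, result reused. */
uint64_t loop_called_once(uint64_t n)
{
    uint64_t sum = 0;
    for (uint64_t i = 0; i < n; ++i) {
        uint64_t r = method1(i);
        sum += r + r;
    }
    return sum;
}

/* Sanity check: merging the two calls is only valid if both forms agree. */
void check_same_result(uint64_t n)
{
    assert(loop_called_twice(n) == loop_called_once(n));
}
```

Calling loop_called_twice from a small main, building it once per optimization level, and comparing either the run time or the assembly from gcc -S should show whether the second call survives. I haven't checked that this particular sketch reproduces the timings above.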