------- Additional Comments From jbucata at tulsaconnect dot com 2005-07-18 04:42 -------
For me, with -march=athlon-xp, -funroll-loops on 4.0.0 did indeed pessimize slightly. However, -fprofile-{generate,use} pessimized more on top of that. So there's still a problem with regard to the PO.
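For anyone wanting to reproduce the comparison, here is a minimal sketch of the three builds involved. The source file name `bench.c`, the output names, and the `-O2` level are assumptions for illustration only; the comment above does not state them. Only the `-march`, `-funroll-loops`, and `-fprofile-{generate,use}` flags come from the report.

```python
import shlex


def build_commands(source="bench.c", march="athlon-xp", opt="-O2"):
    """Return the command sequences for each build configuration.

    `source` and `opt` are hypothetical placeholders, not taken from the report.
    """
    base = ["gcc", opt, f"-march={march}"]
    unroll = base + ["-funroll-loops"]
    return {
        # Plain build, no loop unrolling.
        "baseline": [base + [source, "-o", "bench_base"]],
        # Loop unrolling only.
        "unroll": [unroll + [source, "-o", "bench_unroll"]],
        # Profile-guided build: instrument, run once to collect data, rebuild.
        "unroll+profile": [
            unroll + ["-fprofile-generate", source, "-o", "bench_prof"],
            ["./bench_prof"],
            unroll + ["-fprofile-use", source, "-o", "bench_prof"],
        ],
    }


if __name__ == "__main__":
    for name, steps in build_commands().items():
        print(f"# {name}")
        for step in steps:
            print(shlex.join(step))
```

Passing `march="i686"` gives the same matrix for the other target. Timing each resulting binary with `time` and comparing the user times would show whether each added flag helps or hurts.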
I tried it again with your -march=i686 and it went from 9.5s to 13.5s user time with plain -funroll-loops. The PO made it a tad worse from there. IOW, consistent with what you saw. Further, I tried -funroll-loops w/o the PO on 3.3.5 and 3.4.3 and it improved run times for both, more so in 3.4.3 than in 3.3.5. So it looks like that's another regression in 4.0.0, in -funroll-loops by itself!

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21527