http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47298
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> 2013-03-27 13:02:05 UTC ---
On trunk we now vectorize the loop and then unroll it in the cunroll
(complete unrolling) pass.
4.6 -O2 -funroll-loops -ftree-vectorize -ffast-math: 10.7s
4.6 -O3 -funroll-loops -ftree-vectorize -ffast-math: 8.3s
4.7 -O2 -funroll-loops -ftree-vectorize -ffast-math: 7.4s
4.7 -O3 -funroll-loops -ftree-vectorize -ffast-math: 8.5s
4.8 -O2 -funroll-loops -ftree-vectorize -ffast-math: 6.1s
4.8 -O3 -funroll-loops -ftree-vectorize -ffast-math: 6.5s
With -march=native added (Core i5):
4.8 -O2 ... -march=native: 3.9s
4.8 -O3 ... -march=native: 4s
Apart from very minor scheduling differences I see no difference in
code generation on trunk -O2 vs. -O3.
Without more details, I'd call this "fixed".
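
For readers unfamiliar with the vectorize-then-unroll sequence mentioned for trunk, here is a minimal sketch of the kind of loop it can apply to. This is not the test case from this report: the function name sum_of_products and the trip count N are hypothetical, chosen only for illustration. Whether a given GCC release actually vectorizes and then completely unrolls it depends on the version, the target, and the flags.

```c
#include <stddef.h>

/* Hypothetical kernel, NOT the test case attached to this bug.
   The trip count is a small compile-time constant, so once the
   vectorizer has turned the body into vector operations, only a few
   iterations remain and the loop becomes a candidate for complete
   unrolling. */
#define N 16

float sum_of_products(const float *restrict a, const float *restrict b)
{
    float sum = 0.0f;
    for (size_t i = 0; i < N; i++)
        sum += a[i] * b[i];  /* float reduction: vectorizing it requires
                                reassociation, e.g. via -ffast-math */
    return sum;
}
```

Assuming a reasonably recent GCC, compiling with the flags from the timings above plus -fdump-tree-vect-details and -fdump-tree-cunroll-details writes dump files that show what the vectorizer and the complete unrolling pass each decided for this loop.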