This is with the Polyhedron test "fatigue" on an AMD Athlon64 X2 4800+ running openSUSE 10.3b6 x86-64, using today's GCC 4.3.0 20070727.
Test case available from: http://www.polyhedron.co.uk/MFL6VW74649

On the one hand, compiling with

  gfortran -march=opteron -ffast-math -funroll-loops -ftree-loop-linear -ftree-vectorize -msse3 -O3 fatigue.f90

and on the other hand, using the same options, first with -fprofile-generate added and then, after one ./a.out run, with -fprofile-use instead.

http://physik.fu-berlin.de/~tburnus/gcc-trunk/benchmark/#rt

Result:
  12.32s [100]  (without profiling)
  15.30s [124]  (with -fprofile-use)

Thus the profile-use case is 24% slower.

This does not seem to be a regression: the non-profile version became faster around 2007-03-20, whereas the profile-use version stayed at the old execution time.

--
           Summary: -fprofile-generate/use: Program 24% slower than without
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: burnus at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32913