http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #36 from davidxl <xinliangli at gmail dot com> 2011-01-25 17:28:30 UTC ---
(In reply to comment #35)
> (In reply to comment #34)
> > -march=native is ambiguous, please see with -v what actually is being used.
>
> This was mentioned in the initial comment:
> -march=k8-sse3 -mcx16 -msahf
> --param l1-cache-size=64 --param l1-cache-line-size=64 --param
> l2-cache-size=1024 -mtune=k8
>
> The latest timings are on a newer machine (old one is gone now) which has:
> -march=amdfam10 -mcx16 -msahf -mpopcnt -mabm --param l1-cache-size=64 --param
> l1-cache-line-size=64 --param l2-cache-size=512 -mtune=amdfam10

I did use the options you originally posted: "-ftime-report -cpp -fbounds-check -g
-O3 -ffast-math -funroll-loops -ftree-vectorize -march=native -ffree-form".
The timing is consistently 58s on my 2.4GHz Core 2 box, and 42s on the 2.67GHz
Xeon machine.