------- Comment #6 from burnus at gcc dot gnu dot org 2010-02-23 15:29 -------
Result with current trunk (4.5.0 2010-02-23 Rev. 156999)

In a nutshell: gas_dyn is still slower - now 35% instead of 70%.
-fgraphite-identity has normal speed (or a tiny bit faster?!?), but all other
options (-floop-interchange, -floop-strip-mine, and -floop-block) cause the
slow down.

nf is 6% slower and the rest looks fine.

* * *

System: AMD Athlon 64 X2 Dual Core Processor 4800+ @ 2.4 GHz

Base options:
  gfortran -march=opteron -ffast-math -funroll-loops -ftree-vectorize
           -ftree-loop-linear -msse3 -O3
LTO uses additionally the options "-flto -fwhole-program -fno-protect-parens"
Used graphite options: "-floop-interchange -floop-strip-mine -floop-block"

          LTO                              No LTO
ac         1% faster (13.16s vs. 13.31s)       =      (13.29s vs. 13.35s)
aermod        =      (30.69s vs. 30.87s)       =      (34.32s vs. 34.58s)
air           =      (15.64s vs. 15.68s)    2% faster (15.72s vs. 15.68s)
capacita   6% SLOWER (86.92s vs. 82.14s)    2% faster (81.66s vs. 82.92s)
channel       =      (15.36s vs. 15.28s)    3% faster (15.26s vs. 15.71s)
doduc      5% faster (40.97s vs. 43.05s)    5% faster (40.28s vs. 42.51s)
fatigue    2% SLOWER ( 7.21s vs.  7.08s)    4% SLOWER ( 9.99s vs.  9.57s)
gas_dyn   35% SLOWER (15.35s vs. 11.36s)   37% SLOWER (15.36s vs. 11.19s)
induct    14% faster (29.15s vs. 33.90s)   24% faster (28.07s vs. 37.08s)
linpk      1% faster (30.34s vs. 30.68s)    2% faster (30.40s vs. 31.03s)
mdbx       2% faster (20.15s vs. 20.56s)       =      (19.40s vs. 19.48s)
nf         6% SLOWER (33.49s vs. 31.49s)    6% SLOWER (33.61s vs. 31.62s)
protein       =      (64.51s vs. 64.24s)    3% SLOWER (65.65s vs. 63.84s)
rnflow     2% faster (36.07s vs. 36.82s)       =      (35.10s vs. 34.96s)
test_fpu      =      (21.93s vs. 21.85s)    2% faster (20.76s vs. 21.28s)
tfft       2% faster ( 8.25s vs.  8.43s)    1% faster ( 8.22s vs.  8.33s)
Geo.Mean   1% SLOWER (23.66s vs. 23.42s)       =      (24.00s vs. 23.99s)

* * *

gas_dyn.f90 only results:

A) w/o Graphite
   real    0m11.281s
   user    0m11.013s
   sys     0m0.044s

B) w/ -fgraphite-identity
   real    0m10.622s
   user    0m10.533s
   sys     0m0.080s

C) w/ -floop-interchange
   real    0m15.077s
   user    0m14.785s
   sys     0m0.068s

D) w/ -floop-strip-mine
   real    0m15.818s
   user    0m15.205s
   sys     0m0.052s

E) w/ -floop-block
   real    0m15.349s
   user    0m15.249s
   sys     0m0.080s

F) w/ -floop-interchange -floop-strip-mine
   real    0m15.740s
   user    0m15.589s
   sys     0m0.044s

G) w/ -floop-interchange -floop-strip-mine -floop-block
   real    0m15.658s
   user    0m15.333s
   sys     0m0.040s


--

burnus at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[Graphite] 70% slower using |[Graphite] 35% slower using
                   |-floop* than without        |-floop* than without
                   |graphite (gas_dyn of        |graphite (gas_dyn of
                   |Polyhedron)                 |Polyhedron)

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38846
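
For anyone re-checking the percentages in the Polyhedron table: a minimal Python sketch of how such labels could be derived from a pair of timings. This is not the script that produced the numbers above. It assumes that the first time in each pair is the run with the Graphite options and the second the run without, that the change is taken relative to the run without Graphite, and that differences below 1% are printed as "=". The function names (rel_change, label, geo_mean) are made up for illustration. Under these assumptions it reproduces, for example, the gas_dyn, induct and aermod entries of the LTO column.

```python
import math


def rel_change(graphite_s, baseline_s):
    """Signed change in percent of the Graphite run relative to the baseline.

    Positive means the Graphite run took longer.
    """
    return (graphite_s - baseline_s) / baseline_s * 100.0


def label(graphite_s, baseline_s):
    """Format a timing pair in the style of the table.

    Assumption: anything below 1% in either direction is shown as "=".
    """
    pct = rel_change(graphite_s, baseline_s)
    if abs(pct) < 1.0:
        return "="
    word = "SLOWER" if pct > 0 else "faster"
    return f"{round(abs(pct))}% {word}"


def geo_mean(times):
    """Geometric mean of a list of positive run times."""
    return math.exp(sum(math.log(t) for t in times) / len(times))


if __name__ == "__main__":
    print(label(15.35, 11.36))  # gas_dyn, LTO
    print(label(29.15, 33.90))  # induct, LTO
    print(label(30.69, 30.87))  # aermod, LTO
```

Feeding one column of per-benchmark times to geo_mean gives a summary figure of the kind shown in the Geo.Mean row; the threshold for "=" is a guess and may need adjusting.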