On Thu, Feb 3, 2011 at 6:43 AM, Jack Howarth <howa...@bromo.med.uc.edu> wrote:
> Sebastian,
> Below are the results for the Polyhedron 2005 benchmarks on
> x86_64-apple-darwin10 using -O3 -ffast-math -funroll-loops under gcc
> trunk at r169776, with -fgraphite-identity and with -fgraphite-identity
> -ftree-loop-linear. I am surprised at the absence of any impact from
> -ftree-loop-linear in either run-time or executable size. The increase
> in compile time on some of the benchmarks suggested it was in effect.
> Is this a poor combination of optimizations for -ftree-loop-linear or
> is fortran less effective in using that optimization?

Well, I don't know of any bogously nested (hot) loop in polyhedron, do you?

Richard.

> Jack
> ps Hopefully when the remaining loop regressions in -fgraphite-identity
> are solved, the graphite results will improve a bit more.
>
> Using built-in specs.
> COLLECT_GCC=gcc-4
> COLLECT_LTO_WRAPPER=/sw/lib/gcc4.6/libexec/gcc/x86_64-apple-darwin10.7.0/4.6.0/lto-wrapper
> Target: x86_64-apple-darwin10.7.0
> Configured with: ../gcc-4.6-20110202/configure --prefix=/sw
> --prefix=/sw/lib/gcc4.6 --mandir=/sw/share/man --infodir=/sw/lib/gcc4.6/info
> --with-build-config=bootstrap-lto --enable-stage1-languages=c,lto
> --enable-languages=c,c++,fortran,lto,objc,obj-c++,java --with-gmp=/sw
> --with-libiconv-prefix=/sw --with-ppl=/sw --with-cloog=/sw --with-mpc=/sw
> --with-system-zlib --x-includes=/usr/X11R6/include
> --x-libraries=/usr/X11R6/lib --program-suffix=-fsf-4.6 --enable-checking=yes
> --enable-cloog-backend=isl
> Thread model: posix
> gcc version 4.6.0 20110203 (experimental) (GCC)
>
> command=gfortran -O3 -ffast-math -funroll-loops
>
> Run-time
>             stock   -fgraphite-identity   -fgraphite-identity
>                                           -ftree-loop-linear
>
> ac           8.80      8.80                  8.80
> aermod      17.32     17.43                 17.43
> air          5.48      5.43                  5.44
> capacita    32.45     32.52                 32.53
> channel      1.84      1.84                  1.84
> doduc       28.30     26.28                 26.28
> fatigue      8.13      8.09                  8.09
> gas_dyn      4.30      4.32                  4.31
> induct      13.07     12.51                 12.51
> linpk       15.47     15.41                 15.41
> mdbx        11.21     11.21                 11.21
> nf          29.91     30.20                 30.01
> protein     32.86     32.21                 32.20
> rnflow      23.94     24.18                 24.17
> test_fpu     8.02      8.05                  8.04
> tfft         1.87      1.87                  1.87
>
> Compile-time
>             stock   -fgraphite-identity   -fgraphite-identity
>                                           -ftree-loop-linear
>
> ac           2.12      2.12                  2.12
> aermod      57.45     59.22                 59.30
> air          3.84      4.37                  4.93
> capacita     2.82      2.94                  3.07
> channel      1.00      1.20                  1.33
> doduc        8.57      8.92                  8.95
> fatigue      3.19      3.17                  3.17
> gas_dyn      5.38      5.57                  5.57
> induct       6.59      6.77                  8.81
> linpk        1.08      1.33                  1.31
> mdbx         2.83      2.92                  2.92
> nf           3.09      3.08                  3.10
> protein      8.51      8.70                  8.67
> rnflow       9.94     10.09                 10.09
> test_fpu     7.22      7.24                  7.28
> tfft         0.81      0.88                  0.83
>
> Executable size
>             stock   -fgraphite-identity   -fgraphite-identity
>                                           -ftree-loop-linear
>
> ac          50976     50976                 50976
> aermod    1264832   1268928               1268928
> air         73984     82184                 82184
> capacita    77976     77976                 77976
> channel     34792     34792                 34792
> doduc      193096    193096                193096
> fatigue     86032     86032                 86032
> gas_dyn    119704    115608                115608
> induct     174848    174848                174848
> linpk       38648     38648                 38648
> mdbx        82072     82072                 82072
> nf          75912     71816                 71816
> protein    131992    131992                131992
> rnflow     181080    181080                181080
> test_fpu   155048    150952                150952
> tfft        30760     30760                 30760
>
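
For anyone who wants to check the "no impact" observation numerically, here is a minimal sketch that turns the quoted Run-time table into percent changes. The figures are copied from the table; the names RUN_TIME, pct_change and summarize are made up for this illustration and are not part of any benchmark harness. The timings appear to be single runs, so small differences may be noise.

```python
# Run-times (seconds) copied from the quoted Run-time table:
# (stock, -fgraphite-identity, -fgraphite-identity -ftree-loop-linear)
RUN_TIME = {
    "ac":       (8.80, 8.80, 8.80),
    "aermod":   (17.32, 17.43, 17.43),
    "air":      (5.48, 5.43, 5.44),
    "capacita": (32.45, 32.52, 32.53),
    "channel":  (1.84, 1.84, 1.84),
    "doduc":    (28.30, 26.28, 26.28),
    "fatigue":  (8.13, 8.09, 8.09),
    "gas_dyn":  (4.30, 4.32, 4.31),
    "induct":   (13.07, 12.51, 12.51),
    "linpk":    (15.47, 15.41, 15.41),
    "mdbx":     (11.21, 11.21, 11.21),
    "nf":       (29.91, 30.20, 30.01),
    "protein":  (32.86, 32.21, 32.20),
    "rnflow":   (23.94, 24.18, 24.17),
    "test_fpu": (8.02, 8.05, 8.04),
    "tfft":     (1.87, 1.87, 1.87),
}


def pct_change(base, new):
    """Relative change of `new` versus `base` in percent (negative = faster)."""
    return (new - base) / base * 100.0


def summarize(table):
    """Map each benchmark to (identity vs stock, linear vs identity) in percent."""
    return {
        name: (pct_change(stock, ident), pct_change(ident, linear))
        for name, (stock, ident, linear) in table.items()
    }


if __name__ == "__main__":
    print(f"{'benchmark':<10}{'ident/stock':>13}{'linear/ident':>14}")
    for name, (d_ident, d_linear) in summarize(RUN_TIME).items():
        print(f"{name:<10}{d_ident:>+12.2f}%{d_linear:>+13.2f}%")
```

With these numbers the second column (adding -ftree-loop-linear on top of -fgraphite-identity) stays below 1% in magnitude for every benchmark, while the first column shows the larger shifts, such as doduc and induct.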