gfortran seemingly generates a significantly inferior internal TREE
representation compared with g95: for Polyhedron's induct.f90, gfortran is 18%
slower than g95, which is based on GCC 4.0.3. (Compared with other compilers,
the difference is even larger.)

(GCC 4.3 is in general faster than GCC 4.1; for induct one does not see any
runtime change with the gfortran frontend during the last 1.5 years, though
GCC/gfortran 4.1.2 was seemingly slightly faster:
http://www.suse.de/~gcctest/c++bench/polyhedron/polyhedron-summary.txt-induct-19.png
)

Judging by the -ftree-vectorizer-verbose output, GCC 4.3 is able to vectorize
3 loops with gfortran, whereas GCC 4.0 vectorizes 0 loops with g95.


For a reduced-size example (395 instead of 6635 lines), gfortran is still 13%
slower:

$ gfortran -march=opteron -ffast-math -funroll-loops -ftree-vectorize
-ftree-loop-linear -msse3 -O3  test2.f90
$ time a.out
real    0m4.632s  user    0m4.624s  sys     0m0.004s

$ g95 -march=opteron -ffast-math -funroll-loops -ftree-vectorize -msse3 -O3
test2.f90
$ time a.out
real    0m4.030s  user    0m4.024s  sys     0m0.004s

$ ifort test2.f90
$ time a.out
real    0m3.859s  user    0m3.856s  sys     0m0.000s

# NAG f95 + system gcc 4.1.3
$ f95 -O4 -ieee=full -Bstatic -march=opteron -ffast-math -funroll-loops
-ftree-vectorize -msse3 test2.f90
$ time a.out
real    0m3.381s  user    0m3.380s  sys     0m0.004s

$ sunf95 -w4 -fast -xarch=amd64a -xipo=0 test2.f90
$ time a.out
real    0m3.741s  user    0m3.736s  sys     0m0.000s
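
The size of the gap depends on which run is taken as the baseline. The
following is a minimal sketch of that arithmetic using the "real" times
reported above; the helper names time_saved_pct and slowdown_pct are
hypothetical and exist only for this illustration.

```python
# Wall-clock ("real") times in seconds, copied from the runs above.
real_times = {
    "gfortran": 4.632,
    "g95": 4.030,
    "ifort": 3.859,
    "NAG f95": 3.381,
    "sunf95": 3.741,
}


def time_saved_pct(baseline, other):
    """Percent less time `other` needs, measured against `baseline`."""
    return (1.0 - other / baseline) * 100.0


def slowdown_pct(fast, slow):
    """Percent more time `slow` needs, measured against `fast`."""
    return (slow / fast - 1.0) * 100.0


if __name__ == "__main__":
    gfortran = real_times["gfortran"]
    for name, t in real_times.items():
        if name == "gfortran":
            continue
        print(f"{name:8s} needs {time_saved_pct(gfortran, t):4.1f}% less time; "
              f"gfortran takes {slowdown_pct(t, gfortran):4.1f}% longer")
```

With these numbers g95 needs about 13.0% less time than gfortran, which is the
same as gfortran taking about 14.9% longer than g95.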




For induct (on x86_64-unknown-linux-gnu), run time, with the percentage
relative to gfortran -m64 in brackets:
51.65 [100%]  gfortran -m64 as above
51.90 [100%]  gfortran with -fprofile-use
61.41 [118%]  gfortran 32bit, x87
46.12 [ 89%]  gfortran 32bit, SSE
43.33 [ 83%]  ifort 9.1
40.73 [ 78%]  ifort 10beta
42.53 [ 82%]  sunf95
30.16 [ 58%]  pathscale
38.86 [ 75%]  NAG f95 using system gcc 4.1
42.65 [ 82%]  g95/gcc 4.0.3 (g95 0.91!)
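
The bracketed percentages appear to be each time divided by the gfortran -m64
time, truncated rather than rounded (61.41/51.65 is about 118.9%, listed as
118%). The sketch below reproduces the column under that assumption; the
truncation rule is inferred from the numbers, not stated in the report, and
the helper name relative_pct is hypothetical.

```python
# (time, label) pairs copied from the induct table above.
induct = [
    (51.65, "gfortran -m64 as above"),
    (51.90, "gfortran with -fprofile-use"),
    (61.41, "gfortran 32bit, x87"),
    (46.12, "gfortran 32bit, SSE"),
    (43.33, "ifort 9.1"),
    (40.73, "ifort 10beta"),
    (42.53, "sunf95"),
    (30.16, "pathscale"),
    (38.86, "NAG f95 using system gcc 4.1"),
    (42.65, "g95/gcc 4.0.3 (g95 0.91!)"),
]


def relative_pct(t, baseline):
    """Time as a percentage of the baseline, truncated (assumed rule)."""
    return int(t / baseline * 100)


baseline = induct[0][0]  # gfortran -m64 is the 100% reference
percentages = [relative_pct(t, baseline) for t, _ in induct]

if __name__ == "__main__":
    for (t, label), pct in zip(induct, percentages):
        print(f"{t:5.2f} [{pct:3d}%]  {label}")
```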


-- 
           Summary: gfortran 4.3 13%-18% slower for induct.f90 than gcc 4.0-
                    based competitor
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: fortran
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: burnus at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32084
