On an AMD amdfam10 system, 434.zeusmp built with gcc 4.5 runs in 713s versus
763s with gcc 4.6, so gcc 4.6 is about 7% slower, with the following settings:

4.6: gcc version 4.6.0 20100812 (experimental) (GCC) 
FOPTIMIZE         = -Ofast -funroll-all-loops -fno-tree-pre -mveclibabi=acml -m64 -march=amdfam10
EXTRA_LDFLAGS = -L$(ACML_DIR) -lacml_mv

4.5: gcc version 4.5.2 20100818 (prerelease) (GCC)

COPTIMIZE         = -O3 -ffast-math -funroll-all-loops -fno-tree-pre
FOPTIMIZE         = -O3 -ffast-math -funroll-all-loops -fno-tree-pre -mveclibabi=acml -m64 -march=amdfam10
EXTRA_LDFLAGS = -L$(ACML_DIR) -lacml_mv



Note that for gcc 4.6, "-Ofast" is equivalent to "-O3 -ffast-math", and
"-fprefetch-loop-arrays" is turned on at -O3.

Also, ACML 4.4.0 is used for both tests.
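
For reference, the 7% figure can be checked from the two run times quoted at
the top of this report. The snippet below is only a sketch of that arithmetic;
the function name slowdown_percent is made up for illustration and is not part
of any SPEC or gcc tooling.

```python
def slowdown_percent(baseline_s, candidate_s):
    """Percent increase in run time of candidate relative to baseline."""
    return (candidate_s - baseline_s) / baseline_s * 100.0


# Run times reported above for 434.zeusmp, in seconds.
gcc45_s = 713.0
gcc46_s = 763.0

# Extra run time of the gcc 4.6 build, relative to the gcc 4.5 build.
regression = slowdown_percent(gcc45_s, gcc46_s)
print(f"gcc 4.6 vs gcc 4.5: {regression:.1f}% slower")
```

Taking the gcc 4.5 time as the baseline is an assumption here; measured
against the gcc 4.6 time instead, the same 50s gap is slightly smaller in
percentage terms.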


-- 
           Summary: CPU2006 434.zeusmp: gcc 4.6 7% regression from gcc 4.5
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: changpeng dot fang at amd dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45390
