http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49089

           Summary: Regression on CFP2006 on Bulldozer From Splitting AVX
                    32-byte Unaligned Loads
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: changpeng.f...@amd.com


The regression is caused by the following patch, which splits AVX 32-byte
unaligned loads and stores:
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01839.html
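
For readers unfamiliar with the transformation, here is a rough source-level
sketch of what "splitting" a 32-byte unaligned access means. The compiler does
this during code generation, not with intrinsics. The function names
copy32_unsplit and copy32_split are made up for illustration, and the
non-AVX fallback is only there so the sketch builds without -mavx.

```c
#include <string.h>
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Unsplit form: one 32-byte unaligned load and one 32-byte unaligned store. */
void copy32_unsplit(float *dst, const float *src)
{
#if defined(__AVX__)
    __m256 v = _mm256_loadu_ps(src);
    _mm256_storeu_ps(dst, v);
#else
    memcpy(dst, src, 32);   /* fallback so the sketch builds without AVX */
#endif
}

/* Split form: each 32-byte access becomes two 16-byte halves. */
void copy32_split(float *dst, const float *src)
{
#if defined(__AVX__)
    /* Load split: two 16-byte unaligned loads, joined with an insert. */
    __m128 lo = _mm_loadu_ps(src);
    __m128 hi = _mm_loadu_ps(src + 4);
    __m256 v  = _mm256_insertf128_ps(_mm256_castps128_ps256(lo), hi, 1);

    /* Store split: low half stored directly, high half extracted first. */
    _mm_storeu_ps(dst, _mm256_castps256_ps128(v));
    _mm_storeu_ps(dst + 4, _mm256_extractf128_ps(v, 1));
#else
    memcpy(dst, src, 16);
    memcpy((char *)dst + 16, (const char *)src + 16, 16);
#endif
}
```

Both functions copy the same 8 floats. Only the width of the individual memory
accesses differs, and that width is what the two columns below measure.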

Here is the performance impact of each split on a Bulldozer system:

                store-split    load-split
410.bwaves           0.48         -0.48
416.gamess           0.55          0.00
433.milc             1.76         -3.96
434.zeusmp           3.48         -3.48
435.gromacs          0.51          1.54
436.cactusADM       -0.72         -0.72
437.leslie3d        10.33         -0.94
444.namd             1.03          0.00
447.dealII           0.70         -1.41
450.soplex           0.79          0.40
453.povray          -0.50         -0.50
454.calculix         5.07         -1.84
459.GemsFDTD         4.33         -6.25
465.tonto            1.27          0.00
470.lbm             -0.86          1.44
481.wrf              1.35         -3.59
482.sphinx3          0.00         -2.11
geomean              1.71         -1.31

While splitting stores is good, Bulldozer does not seem to like unaligned
load splitting.
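
As a cross-check of the geomean row, here is a small sketch that recomputes it
from the per-benchmark numbers. It assumes each entry is a percent change p,
so each benchmark contributes a ratio of 1 + p/100. The report does not state
the units, but the reported geomeans are consistent with that reading. The
name geomean_percent is made up for this example.

```c
#include <stddef.h>

/* Per-benchmark numbers copied from the table, in table order. */
static const double store_split[17] = {
    0.48, 0.55, 1.76, 3.48, 0.51, -0.72, 10.33, 1.03, 0.70,
    0.79, -0.50, 5.07, 4.33, 1.27, -0.86, 1.35, 0.00
};
static const double load_split[17] = {
    -0.48, 0.00, -3.96, -3.48, 1.54, -0.72, -0.94, 0.00, -1.41,
    0.40, -0.50, -1.84, -6.25, 0.00, 1.44, -3.59, -2.11
};

/*
 * Geometric mean of percent changes, returned as a percent change.
 * Assumption: entry p means a ratio of 1 + p/100.
 * The n-th root is found by bisection so the sketch does not need libm;
 * it assumes the resulting mean ratio lies in [0, 2].
 */
double geomean_percent(const double *pct, size_t n)
{
    double prod = 1.0;
    for (size_t i = 0; i < n; i++)
        prod *= 1.0 + pct[i] / 100.0;

    double lo = 0.0, hi = 2.0;
    for (int iter = 0; iter < 100; iter++) {
        double mid = 0.5 * (lo + hi);
        double p = 1.0;
        for (size_t k = 0; k < n; k++)
            p *= mid;               /* mid raised to the n-th power */
        if (p < prod)
            lo = mid;
        else
            hi = mid;
    }
    return (0.5 * (lo + hi) - 1.0) * 100.0;
}
```

Calling geomean_percent(store_split, 17) and geomean_percent(load_split, 17)
should land close to the two values in the geomean row, up to rounding of the
per-benchmark inputs.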
