http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49089
Summary: Regression on CFP2006 on Bulldozer From Splitting AVX 32-byte Unaligned Loads
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: changpeng.f...@amd.com

The regression is caused by the following patch, which splits AVX 32-byte
unaligned loads and stores:
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01839.html

Here is the performance impact on a Bulldozer system:

                 store-split   load-split
410.bwaves           0.48         -0.48
416.gamess           0.55          0.00
433.milc             1.76         -3.96
434.zeusmp           3.48         -3.48
435.gromacs          0.51          1.54
436.cactusADM       -0.72         -0.72
437.leslie3d        10.33         -0.94
444.namd             1.03          0.00
447.dealII           0.70         -1.41
450.soplex           0.79          0.40
453.povray          -0.50         -0.50
454.calculix         5.07         -1.84
459.GemsFDTD         4.33         -6.25
465.tonto            1.27          0.00
470.lbm             -0.86          1.44
481.wrf              1.35         -3.59
482.sphinx3          0.00         -2.11
geomean              1.71         -1.31

While splitting stores is beneficial overall, Bulldozer does not seem to like
unaligned load splitting: it regresses on most benchmarks.
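For anyone who wants to re-check the geomean row, here is a small sketch in
Python. It assumes the figures are percentage changes and that the geomean is
taken over the ratios 1 + p/100; the report does not state either point, so
treat both as assumptions. The names RESULTS and geomean_percent are only
illustrative.

```python
import math

# (benchmark, store-split, load-split), copied from the table in this report.
# Assumption: each figure is a percentage change in benchmark score.
RESULTS = [
    ("410.bwaves", 0.48, -0.48),
    ("416.gamess", 0.55, 0.00),
    ("433.milc", 1.76, -3.96),
    ("434.zeusmp", 3.48, -3.48),
    ("435.gromacs", 0.51, 1.54),
    ("436.cactusADM", -0.72, -0.72),
    ("437.leslie3d", 10.33, -0.94),
    ("444.namd", 1.03, 0.00),
    ("447.dealII", 0.70, -1.41),
    ("450.soplex", 0.79, 0.40),
    ("453.povray", -0.50, -0.50),
    ("454.calculix", 5.07, -1.84),
    ("459.GemsFDTD", 4.33, -6.25),
    ("465.tonto", 1.27, 0.00),
    ("470.lbm", -0.86, 1.44),
    ("481.wrf", 1.35, -3.59),
    ("482.sphinx3", 0.00, -2.11),
]


def geomean_percent(changes):
    """Geometric mean of percentage changes, taken over ratios 1 + p/100."""
    log_sum = sum(math.log1p(p / 100.0) for p in changes)
    return math.expm1(log_sum / len(changes)) * 100.0


store = geomean_percent([row[1] for row in RESULTS])
load = geomean_percent([row[2] for row in RESULTS])

print(f"store-split geomean: {store:.2f}")
print(f"load-split geomean:  {load:.2f}")
```

Under those assumptions the two results agree with the geomean row above
(1.71 and -1.31) to within rounding.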