On Mon, Oct 31, 2011 at 9:36 PM, Jagasia, Harsha <harsha.jaga...@amd.com> wrote:
>> > > We would like to propose changing AVX generic mode tuning to
>> > > generate 128-bit AVX instead of 256-bit AVX.
>> >
>> > You indicate a 3% reduction on bulldozer with avx256.
>> > How does avx128 compare to -mno-avx -msse4.2?
>>
>> We see these % differences going from SSE42 to AVX128 to AVX256 on
>> Bulldozer with "-mtune=generic -Ofast".
>> (Positive is improvement, negative is degradation)
>>
>> Bulldozer:
>>                  AVX128/SSE42   AVX256/AVX-128
>> 410.bwaves          -1.4%          -1.4%
>> 416.gamess          -1.1%           0.0%
>> 433.milc             0.5%          -2.4%
>> 434.zeusmp           9.7%          -2.1%
>> 435.gromacs          5.1%           0.5%
>> 436.cactusADM        8.2%         -23.8%
>> 437.leslie3d         8.1%           0.4%
>> 444.namd             3.6%           0.0%
>> 447.dealII          -1.4%          -0.4%
>> 450.soplex          -0.4%          -0.4%
>> 453.povray           0.0%          -1.5%
>> 454.calculix        15.7%          -8.3%
>> 459.GemsFDTD         4.9%           1.4%
>> 465.tonto            1.3%          -0.6%
>> 470.lbm              0.9%           0.3%
>> 481.wrf              7.3%          -3.6%
>> 482.sphinx3          5.0%          -9.8%
>> SPECFP               3.8%          -3.2%
>>
>> > Will the next AMD generation have a useable avx256?
>> > I'm not keen on the idea of generic mode being tune
>> > for a single processor revision that maybe shouldn't
>> > actually be using avx at all.
>>
>> We see a substantial gain in several SPECFP benchmarks going from SSE42
>> to AVX128 on Bulldozer.
>> IMHO, accomplishing even a 5% gain in an individual benchmark takes a
>> hardware company several man months.
>> The loss with AVX256 for Bulldozer is much more significant than the
>> gain for SandyBridge.
>> While the general trend in the industry is a move toward AVX256, for
>> now we would be disadvantaging Bulldozer with this choice.
>>
>> We have several customers who use -mtune=generic and it is default,
>> unless a user explicitly overrides it with -mtune=native. They are the
>> ones who want to experiment with latest ISA using gcc, but want to keep
>> their ISA selection and tuning agnostic on x86/64. IMHO, it is with
>> these customers in mind that generic was introduced in the first place.
>
> Since stage 1 closure is around the corner, just wanted to ping to see if the
> maintainers have made up their mind on this one.
> AVX-128 is an improvement over SSE42 for Bulldozer and AVX-256 wipes out
> pretty much all of that gain in generic mode.
> Until there is a convergence on AVX-256 for x86/64, we would like to propose
> having generic generate avx-128 by default and have a user override to
> avx-256 manually when known to benefit performance.

Did somebody spend the time analyzing why CactusADM shows so much of a
difference? With the recent improvements in vectorizing for AVX, did
you re-do the measurements with a recent trunk?

I don't think disabling avx-256 by default is a good idea until we
understand why these numbers happen and are convinced we cannot fix
this by proper cost modeling.

Richard.

> Thanks,
> Harsha
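
As a rough check on the "wipes out pretty much all of that gain" claim, the two quoted columns can be combined into a single AVX256-versus-SSE42 figure. This is only a sketch. It assumes each percentage is relative to the configuration before it (AVX128 against SSE42, then AVX256 against AVX128), so the two changes compound multiplicatively. The function names are made up for illustration, and only four rows of the quoted table are included.

```python
# Per-benchmark % changes quoted in the thread (Bulldozer, -mtune=generic -Ofast).
# First value: AVX128 relative to SSE42; second value: AVX256 relative to AVX128.
RESULTS = {
    "436.cactusADM": (8.2, -23.8),
    "454.calculix": (15.7, -8.3),
    "482.sphinx3": (5.0, -9.8),
    "SPECFP": (3.8, -3.2),
}


def compound(first_pct, second_pct):
    """Net % change after applying two successive relative changes.

    Assumption: the second percentage is measured against the result of
    the first, so the factors multiply rather than add.
    """
    return ((1 + first_pct / 100) * (1 + second_pct / 100) - 1) * 100


def net_avx256_vs_sse42(results=RESULTS):
    """Return the compounded AVX256-vs-SSE42 change per benchmark, in %."""
    return {name: round(compound(a, b), 1) for name, (a, b) in results.items()}


if __name__ == "__main__":
    for name, pct in net_avx256_vs_sse42().items():
        print(f"{name:15s} {pct:+.1f}%")
```

Under that assumption the SPECFP aggregate nets out to roughly +0.5% for AVX256 over SSE42, while 436.cactusADM ends up well below its SSE42 result, which is why that benchmark is the one worth analyzing first.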