On Mon, Oct 31, 2011 at 9:36 PM, Jagasia, Harsha <harsha.jaga...@amd.com> wrote:
>> > > We would like to propose changing AVX generic mode tuning to generate
>> > > 128-bit AVX instead of 256-bit AVX.
>> >
>> > You indicate a 3% reduction on bulldozer with avx256.
>> > How does avx128 compare to -mno-avx -msse4.2?
>>
>> We see these % differences going from SSE42 to AVX128 to AVX256 on
>> Bulldozer with "-mtune=generic -Ofast".
>> (Positive is improvement, negative is degradation)
>>
>> Bulldozer:
>>                 AVX128/SSE42    AVX256/AVX-128
>> 410.bwaves       -1.4%           -1.4%
>> 416.gamess       -1.1%            0.0%
>> 433.milc          0.5%           -2.4%
>> 434.zeusmp        9.7%           -2.1%
>> 435.gromacs       5.1%            0.5%
>> 436.cactusADM     8.2%          -23.8%
>> 437.leslie3d      8.1%            0.4%
>> 444.namd          3.6%            0.0%
>> 447.dealII       -1.4%           -0.4%
>> 450.soplex       -0.4%           -0.4%
>> 453.povray        0.0%           -1.5%
>> 454.calculix     15.7%           -8.3%
>> 459.GemsFDTD      4.9%            1.4%
>> 465.tonto         1.3%           -0.6%
>> 470.lbm           0.9%            0.3%
>> 481.wrf           7.3%           -3.6%
>> 482.sphinx3       5.0%           -9.8%
>> SPECFP            3.8%           -3.2%
>>
>> > Will the next AMD generation have a useable avx256?
>> > I'm not keen on the idea of generic mode being tuned
>> > for a single processor revision that maybe shouldn't
>> > actually be using avx at all.
>>
>> We see a substantial gain in several SPECFP benchmarks going from SSE42
>> to AVX128 on Bulldozer.
>> IMHO, accomplishing even a 5% gain in an individual benchmark takes a
>> hardware company several man-months.
>> The loss with AVX256 for Bulldozer is much more significant than the
>> gain for SandyBridge.
>> While the general trend in the industry is a move toward AVX256, for
>> now we would be disadvantaging Bulldozer with this choice.
>>
>> We have several customers who use -mtune=generic, and it is the default
>> unless a user explicitly overrides it with -mtune=native. They are the
>> ones who want to experiment with latest ISA using gcc, but want to keep
>> their ISA selection and tuning agnostic on x86/64. IMHO, it is with
>> these customers in mind that generic was introduced in the first place.
>
> Since stage 1 closure is around the corner, just wanted to ping to see if the 
> maintainers have made up their mind on this one.
> AVX-128 is an improvement over SSE42 for Bulldozer and AVX-256 wipes out 
> pretty much all of that gain in generic mode.
> Until there is a convergence on AVX-256 for x86/64, we would like to propose 
> having generic generate avx-128 by default and have a user override to 
> avx-256 manually when known to benefit performance.

Did somebody spend the time analyzing why CactusADM shows so much of a
difference?  With the recent improvements in vectorizing for AVX, did you
re-do the measurements with a recent trunk?

I don't think disabling avx-256 by default is a good idea until we
understand why these numbers happen and are convinced we cannot fix
this by proper cost modeling.

Richard.

> Thanks,
> Harsha
>
>
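
For reference, the two SPECFP columns quoted above can be chained to
estimate AVX256 relative to SSE42. This is only a minimal sketch: it
assumes each percentage is a ratio of scores, so that the columns compose
by multiplication, and the variable names are illustrative rather than
taken from the thread.

```python
# Aggregate and cactusADM deltas as quoted in the thread
# (Bulldozer, "-mtune=generic -Ofast").
quoted = {
    "SPECFP":        (0.038, -0.032),   # (AVX128/SSE42, AVX256/AVX-128)
    "436.cactusADM": (0.082, -0.238),
}

def chain(avx128_vs_sse42, avx256_vs_avx128):
    """Estimate AVX256 vs SSE42, assuming the two deltas are ratios
    that compose multiplicatively (an assumption, not a measurement)."""
    return (1 + avx128_vs_sse42) * (1 + avx256_vs_avx128) - 1

net = {name: chain(*deltas) for name, deltas in quoted.items()}

for name, value in net.items():
    print(f"{name}: AVX256 vs SSE42 = {value * 100:+.1f}%")
```

Under that assumption the SPECFP net comes out at roughly +0.5%, which is
consistent with the statement that AVX-256 removes nearly all of the
AVX-128 gain, while cactusADM ends up well below its SSE42 score.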
