https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616

--- Comment #49 from Jan Hubicka <hubicka at ucw dot cz> ---
> matrix.c is still needing additional options to get the best out of the Ryzen
> processor. But is better than before (223029 clocks vs 371978 originally), 
> but 122677 is achievable with the right options. However the same can also be

Aha, for Ryzen we would still benefit from 256-bit vectorization. It is not a
win overall, and it will need bigger surgery to the vectorizer to implement
properly, so that will unfortunately have to wait for the next stage1.

This is the gap between -march=znver1 -mtune=generic and -march=znver1, so
about 17%.

Concerning your options -mprefer-vector-width=none -mno-fma -mno-avx2 -O3:
with Martin's patch in, -mno-fma should no longer have any effect here. I am
not sure why -mno-avx2 would be a win either. We originally introduced it to
disable scatter/gather in the other benchmark, but that one is solved too.
Do those two options still improve the scores for you?
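
For anyone who wants to check the effect of these flags on a small case, here
is a minimal sketch of a vectorizable kernel. It is a hypothetical stand-in,
not the matrix.c from this report; the name matmul and the size N are
assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical stand-in kernel; NOT the matrix.c discussed in this PR. */
#define N 64

/* Computes c = a * b for N x N row-major matrices.
   The i-k-j loop order keeps the innermost accesses to b and c
   unit-stride, which makes the inner loop a simple vectorization
   candidate. */
void matmul(const double *restrict a, const double *restrict b,
            double *restrict c)
{
    for (size_t i = 0; i < N; i++) {
        /* Clear the output row before accumulating into it. */
        for (size_t j = 0; j < N; j++)
            c[i * N + j] = 0.0;

        for (size_t k = 0; k < N; k++) {
            double aik = a[i * N + k];
            for (size_t j = 0; j < N; j++)
                c[i * N + j] += aik * b[k * N + j];
        }
    }
}
```

Building it once with -O3 -march=znver1 and once with -O3 -march=znver1
-mtune=generic (optionally adding -mprefer-vector-width=256 or
-fopt-info-vec-optimized) should show which vector width the vectorizer
picks; the timings seen on this toy kernel need not match the benchmark.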

It is also a mystery to me why -march=ivybridge would benefit anything, as the
znver ISA is more or less a superset of it. I will try to find out more.

Honza
