https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616

--- Comment #23 from Andrew Roberts <andrewm.roberts at sky dot com> ---
Thanks Honza,

Getting closer. With the original matrix.c on Ryzen:

/usr/local/gcc/bin/gcc -march=znver1 -mtune=znver1 -O3 matrix.c -o matrix
        mult took     364850 clocks

/usr/local/gcc/bin/gcc -march=znver1 -mtune=znver1 -mprefer-vector-width=none \
    -O3 matrix.c -o matrix
        mult took     194517 clocks

/usr/local/gcc/bin/gcc -march=znver1 -mtune=znver1 -mprefer-vector-width=none \
    -mno-fma -O3 matrix.c -o matrix
        mult took     130343 clocks

/usr/local/gcc/bin/gcc -march=haswell -mtune=haswell -mprefer-vector-width=none \
    -mno-fma -O3 matrix.c -o matrix
        mult took     130129 clocks

These last two are about 2.8x faster than the plain -march=znver1 build and
are comparable with the fastest times obtained from trying all combinations
of -march and -mtune.
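
For anyone following along without the attachment, here is a rough sketch of
the kind of kernel and timing these numbers come from. It is not the matrix.c
from this bug: the size N, the fill values and the names fill_inputs, mult and
time_mult are all assumptions made for illustration, and it assumes the
"clocks" figure is a difference of two clock() readings.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical problem size; the real matrix.c may use a different one. */
#define N 64

double a[N][N], b[N][N], c[N][N];

/* Fill the inputs with constants so the product is easy to check. */
void fill_inputs(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            a[i][j] = 1.0;
            b[i][j] = 2.0;
        }
}

/* Naive triple loop: the multiply-add in the inner loop is where the
   vectorizer and FMA contraction decisions show up. */
void mult(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}

/* Return the clock() ticks spent in one call to mult(). */
long time_mult(void)
{
    clock_t start = clock();
    mult();
    return (long)(clock() - start);
}
```

Calling fill_inputs() and then printing time_mult() from main, for example
with printf("mult took %10ld clocks\n", ...), would give output in the same
shape as above. A matrix this small will finish in very few ticks, so a larger
N or a repeat loop would be needed to get stable numbers.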
