[Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling)

hubicka at gcc dot gnu.org via Gcc-bugs Fri, 05 Jan 2024 12:26:53 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235


Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|SMHasher SHA3-256 benchmark |SMHasher SHA3-256 benchmark
                   |is almost 40% slower vs.    |is almost 40% slower vs.
                   |Clang                       |Clang (not enough complete
                   |                            |loop peeling)

--- Comment #5 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
On my zen3 machine the default build gets me 180MB/s.
-O3 -flto -funroll-all-loops gets me 193MB/s.
-O3 -flto --param max-completely-peel-times=30 gets me 382MB/s. The speedup is
gone with --param max-completely-peel-times=20; the default is 16.
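
For illustration only, here is a minimal sketch of the kind of loop these numbers point at: a loop with a constant trip count above 20 and at most 30, so it is over the limit at 16 or 20 but within it at 30. The trip count of 24, the SHIFT table and the names toy_rounds and toy_rounds_check are all assumptions made up for this example. This is not the SMHasher code, and the toy is not claimed to reproduce the slowdown.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 24 is an illustrative trip count, picked only because it
   is above 20 and below 30. The real loop is not shown in this report. */
#define ROUNDS 24

/* Hypothetical per-round constants: round i toggles bit 2*i. */
static const unsigned char SHIFT[ROUNDS] = {
     0,  2,  4,  6,  8, 10, 12, 14,
    16, 18, 20, 22, 24, 26, 28, 30,
    32, 34, 36, 38, 40, 42, 44, 46
};

/* Loop with a constant trip count. If the compiler peels it completely,
   every SHIFT[i] lookup can be folded to a compile-time constant. */
uint64_t toy_rounds(uint64_t x)
{
    for (int i = 0; i < ROUNDS; i++)
        x ^= (uint64_t)1 << SHIFT[i];
    return x;
}

/* Sanity check: toggling the same 24 bits twice restores the input. */
void toy_rounds_check(void)
{
    assert(toy_rounds(toy_rounds(0x1234u)) == 0x1234u);
}
```

One way to see what the complete unrolling pass decides for such a loop might be to compile once with -O3 and once with -O3 --param max-completely-peel-times=30, adding -fdump-tree-cunroll-details both times and comparing the dumps.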
