[Bug target/108922] fmod() 13x slowdown in gcc4.9 dropping "fprem" and calling fmod()

jkratochvil at azul dot com via Gcc-bugs Sun, 26 Feb 2023 15:37:05 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922


--- Comment #13 from Jan Kratochvil <jkratochvil at azul dot com> ---
(In reply to Uroš Bizjak from comment #12)
> (In reply to Jan Kratochvil from comment #8)
> 
> > The revert makes it 13x faster. But the produced code still falls back to
> > calling glibc fmod() as shown in the disassembly in Comment 0.
> > If I use the "fprem" instruction directly it gets 15x faster - but I did not
> > figure out some (easy) way for me how to patch GCC to no longer produce the
> > call to fmod() at all and produce only the "fprem" instruction.
> 
> Use -ffinite-math-only option:
> 
> -ffinite-math-only
>    Allow optimizations for floating-point arithmetic that assume that
> arguments and results are not NaNs or +-Infs.

That works for the Comment 0 reproducer, but I find -ffinite-math-only
incorrect to use because of other calculations in the OpenJDK codebase as a
whole. Infinite values are documented behavior for Java code, so with this
option such code could produce invalid results.

To fully fix the performance (no "call fmod" left at all) I find it better to
use -fno-math-errno. Nothing in OpenJDK should rely on errno from math
operations. But that option still requires reverting your patch.

The question is whether gcc can rely on the undocumented Intel behavior as
described in Comment 7. glibc already relies on it anyway.

I have submitted this revert proposal only for the benefit of GCC. I (or my
employer) do not need it, as I have already submitted a fix for OpenJDK
using an asm "fprem" expression. Relying on a fix in GCC would not be
acceptable for OpenJDK, as it is still going to be built with old/existing
OSes/compilers for years: https://github.com/openjdk/jdk/pull/12508/files
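
For readers who have not seen such an asm "fprem" expression, here is a
minimal sketch of the idea. It is not the code from the OpenJDK pull request;
the name fmod_fprem and the fallback to the library fmod() on non-x86 targets
are assumptions made for illustration. "fprem" only does a partial reduction
when the exponents differ a lot, so it is repeated until the C2 flag in the
x87 status word is clear. It never touches errno.

```c
#include <math.h>

/* Hypothetical helper (the name is an assumption for this sketch):
   computes fmod(x, y) with the x87 "fprem" instruction. */
static double fmod_fprem(double x, double y)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    double res;
    __asm__ ("1:\n\t"
             "fprem\n\t"            /* st(0) = partial remainder of st(0) / st(1) */
             "fnstsw %%ax\n\t"      /* copy the x87 status word into %ax */
             "testb $4, %%ah\n\t"   /* C2 is bit 10 of the status word, bit 2 of %ah */
             "jnz 1b"               /* C2 set: reduction incomplete, repeat */
             : "=t" (res)           /* result is left in st(0) */
             : "0" (x), "u" (y)     /* x in st(0), y in st(1) */
             : "ax", "cc");
    return res;
#else
    /* Assumption: on other targets simply defer to the C library. */
    return fmod(x, y);
#endif
}
```

For example, fmod_fprem(5.5, 2.0) yields 1.5, and the result keeps the sign of
x just as fmod() does.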
