https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86819
--- Comment #3 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Right, sorry, somehow I imagined the hook had to do with instructions for
computing an approximate inverse. Indeed the two aspects are independent.
I think there may be realistic situations where the change can introduce a
regression: while a win throughput-wise, it adds one multiplication latency
after the division latency in the dependency chain, so if the original
divisions were on the critical path, that path grows longer.
A minimal testcase would be
void f(float, float);
void g(float x, float y, float d)
{
  f(x / d, y / d);
}
where at -O2 -funsafe-math-optimizations we'd expect only one divss instruction
in the output.
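For illustration, here is a hand-written sketch of the two shapes being
compared. The names div_twice and div_once are made up for this example and
are not anything in GCC; this only approximates at the source level what the
transformation would do, it is not the compiler's actual output.

```c
/* Original shape: two divisions that do not depend on each other. */
static void div_twice(float x, float y, float d, float out[2])
{
  out[0] = x / d;
  out[1] = y / d;
}

/* Transformed shape: a single division, followed by two multiplications
   that both wait on its result.  Each output now costs division latency
   plus multiplication latency.  */
static void div_once(float x, float y, float d, float out[2])
{
  float r = 1.0f / d;   /* the only division */
  out[0] = x * r;
  out[1] = y * r;
}
```

Note that the two shapes can differ in the last bit for general inputs, which
is why the rewrite is only valid under unsafe math flags.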