https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69132

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Your snippet is not self-contained. The cout in there is presumably irrelevant
to the testcase, but it is unclear which headers you are using and thus whether
the sqrt ends up as __builtin_sqrtf or __builtin_sqrt.
Also, GCC 4.8 is no longer supported.

That said, the "weird" single precision vector division is because it is
computing the division using Newton-Raphson approximation, as
a / b = a * ((rcp(b) + rcp(b)) - (b * rcp(b) * rcp(b)))
You can disable this e.g. with -mrecip='default,!vec-div'
Now, whether this is beneficial even for AVX capable CPUs by default or not
depends on the timing/latencies of vrcpps+3*vmulps+vaddps+vsubps instructions
vs. vdivps.
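
To illustrate the formula above, here is a minimal scalar sketch in C. It is
not what GCC emits; approx_rcp is a hypothetical stand-in for the hardware
reciprocal estimate, with an injected relative error of 2^-12 chosen only for
illustration, and nr_div and rel_err are likewise made-up helper names.

```c
/* Hypothetical stand-in for the hardware reciprocal estimate: returns 1/b
   with a deliberately injected relative error of 2^-12 (an assumption made
   for illustration, not a statement about any particular CPU). */
static float approx_rcp(float b)
{
    return (1.0f / b) * (1.0f + 1.0f / 4096.0f);
}

/* a / b computed with one Newton-Raphson step on the reciprocal estimate:
   a * ((x + x) - (b * x * x)), where x = approx_rcp(b). */
static float nr_div(float a, float b)
{
    float x = approx_rcp(b);
    return a * ((x + x) - (b * x * x));
}

/* Relative error of approx with respect to a nonzero exact value. */
static float rel_err(float approx, float exact)
{
    float d = approx - exact;
    float e = exact;
    if (d < 0.0f)
        d = -d;
    if (e < 0.0f)
        e = -e;
    return d / e;
}
```

If the estimate has relative error e, the refined reciprocal has relative
error of about e*e, so a single step roughly doubles the number of correct
bits. Comparing rel_err(nr_div(a, b), a / b) with
rel_err(a * approx_rcp(b), a / b) shows the difference.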
