https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69132
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Your snippet is not self-contained. The cout in there is presumably
irrelevant to the testcase, but it is unclear which headers you are using
and thus whether the sqrt ends up as __builtin_sqrtf or __builtin_sqrt.
Also, GCC 4.8 is no longer supported.

That said, the "weird" single precision vector division is there because
the division is computed using a Newton-Raphson approximation, as
  a / b = a * ((rcp(b) + rcp(b)) - (b * rcp(b) * rcp(b)))
You can disable this e.g. with -mrecip='default,!vec-div'

Now, whether this is beneficial by default even for AVX capable CPUs
depends on the timing/latencies of the
vrcpps+3*vmulps+vaddps+vsubps instructions vs. vdivps.
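
For illustration only, here is a scalar C sketch of the refinement step in the formula above. It is not what GCC emits: approx_rcp() is a hypothetical stand-in for the hardware estimate (rcpps/vrcpps), assumed here to deliver roughly 12 significant bits, and nr_div() is likewise a made-up name. The sketch only shows that one Newton-Raphson step on such an estimate gets close to full single precision, using the same operation mix as above (three multiplies, one add, one subtract).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the hardware reciprocal estimate (assumption):
   compute 1/b, then clear the low 12 mantissa bits so that only about
   12 significant bits remain, similar in spirit to rcpps. */
static float approx_rcp(float b)
{
    float r = 1.0f / b;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFFF000u;   /* keep sign, exponent, top 11 mantissa bits */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* a / b with one Newton-Raphson step, written in the form quoted above:
   a * ((rcp(b) + rcp(b)) - (b * rcp(b) * rcp(b))) */
static float nr_div(float a, float b)
{
    float x0 = approx_rcp(b);              /* rough estimate of 1/b   */
    float x1 = (x0 + x0) - (b * x0 * x0);  /* refined estimate of 1/b */
    return a * x1;
}
```

If the estimate is x0 = (1 - e) / b, the refined value is (1 - e*e) / b, so a relative error of about 2^-11 shrinks to about 2^-22. For example, nr_div(1.0f, 3.0f) should agree with 1.0f / 3.0f to within about 1e-6, while approx_rcp(3.0f) alone is off by several times 1e-5.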