https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95814
            Bug ID: 95814
           Summary: Failure to optimize __builtin_ia32_rsqrtss properly
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

typedef float v4f32 __attribute__((vector_size(16)));

float f(float x)
{
    return __builtin_ia32_rsqrtss((v4f32){x, 0, 0, 0})[0];
}

With -O3, LLVM outputs this:

f(float):
  rsqrtss xmm0, xmm0
  ret

GCC outputs this:

f(float):
  pxor xmm1, xmm1
  movss xmm1, xmm0
  movaps xmm0, xmm1
  rsqrtss xmm0, xmm1
  ret
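
For anyone who wants to compare compilers on this, here is a minimal sketch of the same test case written with the <xmmintrin.h> intrinsics instead of the raw builtin. The name f_intrin is hypothetical and not part of the original report, the sketch assumes an x86 target with SSE enabled, and nothing is claimed here about the code either compiler generates for it.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_set_ss, _mm_rsqrt_ss, _mm_cvtss_f32 */

/* Hypothetical intrinsic spelling of the reproducer (assumes x86 with SSE):
   build {x, 0, 0, 0}, take the approximate reciprocal square root of
   lane 0, and return lane 0. */
float f_intrin(float x)
{
    __m128 v = _mm_set_ss(x);    /* lane 0 = x, lanes 1..3 = 0 */
    __m128 r = _mm_rsqrt_ss(v);  /* lane 0 = approx 1/sqrt(x), upper lanes copied from v */
    return _mm_cvtss_f32(r);     /* extract lane 0 as a scalar float */
}
```

Only lane 0 of the result is read, so the zeroed upper lanes of the input do not affect the returned value. Note that rsqrtss yields an approximation, so f_intrin(4.0f) is close to, but not necessarily exactly, 0.5f.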