https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98169
--- Comment #2 from denis.campredon at gmail dot com ---
This also applies to vector types.

-------
typedef float __attribute__((vector_size(8))) T;

T f(T a) {
    return a != a;
}
-------

GCC could generate:
------
f:
        xorps   xmm1, xmm1
        cmpunordps      xmm0, xmm1
        ret
------

But instead it generates less optimal code:
------
f:
        ucomiss xmm0, xmm0
        mov     edx, -1
        movaps  xmm1, xmm0
        mov     eax, 0
        mov     ecx, edx
        shufps  xmm1, xmm1, 0xe5
        cmovnp  ecx, eax
        ucomiss xmm1, xmm1
        movd    xmm0, ecx
        cmovp   eax, edx
        movd    xmm2, eax
        punpckldq       xmm0, xmm2
        ret
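
For reference, a minimal sketch of what the expression computes, assuming the GCC vector extensions and default floating-point flags (no -ffast-math). The names v2f, v2i, nan_mask and nan_mask_selfcheck are made up for illustration and are not part of the report. The comparison a != a yields, per lane, -1 (all bits set) when the lane is NaN and 0 otherwise, which is the same mask a single cmpunordps produces.

```c
#include <assert.h>
#include <math.h>   /* NAN */

/* Hypothetical names for illustration only.
   Two floats / two 32-bit ints, 8 bytes each (GCC vector extension). */
typedef float v2f __attribute__((vector_size(8)));
typedef int   v2i __attribute__((vector_size(8)));

/* Lane-wise "is NaN" mask: a lane compares unequal to itself only if
   it is NaN, so the result is -1 in NaN lanes and 0 elsewhere. */
static v2i nan_mask(v2f a)
{
    return a != a;
}

/* Small self-check of the expected lane values. */
static void nan_mask_selfcheck(void)
{
    v2f x = { 1.0f, NAN };
    v2i m = nan_mask(x);
    assert(m[0] == 0);   /* ordinary value: equal to itself */
    assert(m[1] == -1);  /* NaN: unordered, mask is all ones */
}
```

The sketch returns an integer vector so that it does not rely on the implicit vector conversion used in the testcase above.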