https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161
--- Comment #22 from Sergei Trofimovich <slyfox at gcc dot gnu.org> ---
(In reply to Sergei Trofimovich from comment #21)
gcc generates the following code for this C code:
> int main() {
>     const __m128i su = _mm_set1_epi32(0x4f800000);
>     const __m128 sf = _mm_castsi128_ps(su);
>
>     const __m128 overflow_mask_f32 = _mm_cmpge_ps(sf,
>         _mm_set1_ps(2147483648.0f));
>     const __m128i overflow_mask = _mm_castps_si128(overflow_mask_f32);
>
>     const __m128i conv = _mm_cvttps_epi32(sf); // overflows
>     const __m128i yes = _mm_set1_epi32(INT32_MAX);
>
>     const __m128i a = _mm_and_si128(overflow_mask, yes);
>     const __m128i na = _mm_andnot_si128(overflow_mask, conv);
>
>     const __m128i conv_masked = _mm_or_si128(a, na);
Dump of assembler code for function main:
0x0000000000401020 <+0>: sub $0x8,%rsp
0x0000000000401024 <+4>: movss 0xfdc(%rip),%xmm2 # 0x402008
0x000000000040102c <+12>: movss 0xfd0(%rip),%xmm0 # 0x402004
0x0000000000401034 <+20>: movss 0xfd0(%rip),%xmm3 # 0x40200c
0x000000000040103c <+28>: shufps $0x0,%xmm2,%xmm2
0x0000000000401040 <+32>: shufps $0x0,%xmm0,%xmm0
0x0000000000401044 <+36>: cmpleps %xmm2,%xmm0
0x0000000000401048 <+40>: cvttps2dq %xmm2,%xmm2
0x000000000040104c <+44>: shufps $0x0,%xmm3,%xmm3
0x0000000000401050 <+48>: movdqa %xmm0,%xmm1
0x0000000000401054 <+52>: andps %xmm3,%xmm0
0x0000000000401057 <+55>: pandn %xmm2,%xmm1
0x000000000040105b <+59>: por %xmm0,%xmm1
All of this looks fine.
>     const __m128i actual = _mm_cmpeq_epi32(conv_masked,
>         _mm_set1_epi32(INT32_MAX));
>     const __m128i expected = _mm_set1_epi32(-1);
0x000000000040105f <+63>: pcmpeqd %xmm0,%xmm0
0x0000000000401063 <+67>: pcmpeqd %xmm2,%xmm1
0x0000000000401067 <+71>: call 0x401160 <_ZL9assert_eqDv2_xS_>
Here `pcmpeqd %xmm2,%xmm1` is the problematic instruction. Why does `gcc` use
`%xmm2` (the result of `cvttps2dq`) instead of, say, `%xmm0`, which contains
the `0xFFFFffff` pattern?
>     assert_eq(expected, actual);
> }
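
For reference, here is a scalar, one-lane sketch of the values I would expect
each step of the test case to produce. It is only a model, not gcc's output:
the `model_*` and `bits_to_float` names are made up for illustration, and the
`0x80000000` overflow result is an assumption based on the documented
"integer indefinite" behaviour of `cvttps2dq` with the invalid exception
masked.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret 32 bits as a float, as _mm_castsi128_ps does for each lane. */
static float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical one-lane model of cvttps2dq: truncate toward zero, and
   return 0x80000000 ("integer indefinite") for NaN or out-of-range input. */
static uint32_t model_cvttps2dq(float f) {
    if (f >= -2147483648.0f && f < 2147483648.0f)
        return (uint32_t)(int32_t)f;
    return UINT32_C(0x80000000);
}

/* One-lane model of pcmpeqd: all ones when equal, zero otherwise. */
static uint32_t model_cmpeq(uint32_t x, uint32_t y) {
    return x == y ? UINT32_C(0xFFFFFFFF) : 0;
}

/* One-lane model of the masked conversion from the test case. */
static uint32_t model_conv_masked(uint32_t su) {
    float sf = bits_to_float(su);                  /* 0x4f800000 is 2^32 */
    uint32_t overflow_mask =
        (sf >= 2147483648.0f) ? UINT32_C(0xFFFFFFFF) : 0;
    uint32_t conv = model_cvttps2dq(sf);           /* overflows */
    uint32_t yes = (uint32_t)INT32_MAX;
    uint32_t a = overflow_mask & yes;              /* pand / andps */
    uint32_t na = ~overflow_mask & conv;           /* pandn */
    return a | na;                                 /* por */
}
```

Under this model `model_conv_masked(0x4f800000)` is `0x7fffffff`, so comparing
it against `INT32_MAX` gives all ones, while comparing it against the raw
`cvttps2dq` result (`0x80000000`, what `%xmm2` would hold under the assumption
above) gives zero. Judging by the `andps %xmm3,%xmm0` in the dump, the
`INT32_MAX` constant itself appears to live in `%xmm3`.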