https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62664
--- Comment #2 from g.peterhoff.home at talkplus dot de ---
(In reply to Andrew Pinski from comment #1)
> I think you forgot to clear the MMX state. MMX and FP share the same
> register set and need to be cleared before going from MMX to FP.
I know, but that's not the problem. If you replace
z.m=_mm_set1_pi32(r);
by
z.m=_mm_setzero_si64();
z.m+=r; // gcc only
it works.
With SSE intrinsics it works just as well:
typedef union
{
__m64 m;
__m128i s;
int16_t e[4];
} ZZ;
z.s=_mm_set1_epi32(r);
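For reference, a minimal self-contained sketch of the SSE variant, assuming an x86 target with SSE2. The names ZZ128, broadcast_lane and self_check are made up for this illustration and are not part of my original test case; the __m64 member is left out so that no MMX state is touched, and e is widened to 8 lanes to cover the full 128 bits.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: __m128i, _mm_set1_epi32 */
#include <stdint.h>

/* Illustrative union (hypothetical name): like ZZ, but without the __m64
   member, and with e[8] instead of e[4] so it spans all 128 bits. */
typedef union
{
    __m128i s;
    int16_t e[8];
} ZZ128;

/* Broadcast r into all four 32-bit lanes, then read back 16-bit lane i (0..7). */
static int16_t broadcast_lane(int r, int i)
{
    ZZ128 z;
    z.s = _mm_set1_epi32(r);
    return z.e[i];
}

/* Hypothetical self-check: x86 is little-endian, so even 16-bit lanes hold
   the low half of r and odd lanes hold the high half. */
static void self_check(void)
{
    const int r = 0x00010002;
    for (int i = 0; i < 8; ++i)
        assert(broadcast_lane(r, i) == ((i & 1) ? 1 : 2));
}
```

Calling self_check() from main should pass silently if the broadcast is compiled correctly; this only exercises the SSE2 path, not the _mm_set1_pi32 case the report is about.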
Unfortunately I cannot test AVX, because my hardware only supports AVX for
float/double (AVX1), not for integers (AVX2).