https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85090
--- Comment #3 from ktkachov at gcc dot gnu.org ---
Hmm, I don't have access to AVX512F hardware, so I can't reproduce the runtime failure. Judging from the dumps, the vector simplifications that my patch introduces look correct to me.

I'm not very familiar with i386.md, but the *movdi_internal pattern that produces the vmovq, which zeroes out the top of the zmm register, doesn't seem to represent that zeroing in RTL. So GCC ends up loading xmm20 as a DImode register but then stores it as a V32HImode register, and the zero-extending effect of the DImode load is not represented in RTL.

Any ideas?
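To make the suspected mismatch concrete, here is a minimal, portable C sketch of the two views of the load. It is only an illustration, not GCC code: the `zmm_model` type and the functions `hw_vmovq_load`, `rtl_dimode_set` and `upper_lanes_zero` are hypothetical names, and modelling "not represented in RTL" as "the upper lanes keep their old contents" is an assumption made for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 512-bit register as eight 64-bit lanes. */
typedef struct { uint64_t lane[8]; } zmm_model;

/* What the vmovq load does in hardware, as described in the comment:
   write the low 64 bits and zero the rest of the register. */
void hw_vmovq_load(zmm_model *reg, uint64_t value)
{
    memset(reg, 0, sizeof *reg);
    reg->lane[0] = value;
}

/* What a plain DImode set expresses: only the low 64 bits are defined.
   Assumption for this sketch: the upper lanes keep whatever they held,
   standing in for "nothing is promised about them". */
void rtl_dimode_set(zmm_model *reg, uint64_t value)
{
    reg->lane[0] = value;
}

/* Returns 1 if lanes 1..7 are all zero, 0 otherwise. */
int upper_lanes_zero(const zmm_model *reg)
{
    for (int i = 1; i < 8; i++)
        if (reg->lane[i] != 0)
            return 0;
    return 1;
}
```

Starting from a register filled with stale data, `hw_vmovq_load` leaves the upper lanes zero while `rtl_dimode_set` does not, so any later V32HImode use that relies on the zeroing is depending on something the second model never promised.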