https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82897
--- Comment #10 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Looks like this was fixed in GCC 15:
```
foo:
.LFB7284:
        .cfi_startproc
        vmovd   %edi, %xmm2
        vmovdqa32       %zmm1, %zmm4
        kmovw   m(%rip), %k1
        vpsrad  %xmm2, %zmm0, %zmm4{%k1}
        vmovdqa32       %zmm4, %zmm0
        ret
```

Though for comment #5 we get:
```
foo:
.LFB7470:
        .cfi_startproc
        vmovdqa64       %zmm0, %zmm3
        vmovd   %edi, %xmm2
        vmovdqa32       %zmm1, %zmm0
        kmovw   m(%rip), %k1
        vmovdqa32       %zmm1, %zmm4
        vpslld  %xmm2, %zmm3, %zmm0{%k1}
        kmovw   m(%rip), %k2
        vpsrad  %xmm2, %zmm3, %zmm4{%k2}
        vmovdqa32       %zmm0, zzz(%rip)
        vmovdqa32       %zmm4, %zmm0
        ret
```
Note the extra kmovw.

But we get for the trunk:
```
foo:
.LFB7470:
        .cfi_startproc
        vmovdqa64       %zmm0, %zmm3
        vmovd   %edi, %xmm2
        vmovdqa32       %zmm1, %zmm0
        kmovw   m(%rip), %k1
        vmovdqa32       %zmm1, %zmm4
        vpslld  %xmm2, %zmm3, %zmm0{%k1}
        vpsrad  %xmm2, %zmm3, %zmm4{%k1}
        vmovdqa32       %zmm0, zzz(%rip)
        vmovdqa32       %zmm4, %zmm0
        ret
```
Which looks fixed.
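
For anyone wanting to check this mechanically, here is a rough sketch that counts mask loads per memory operand in the two comment #5 listings quoted in this comment. The helper name `mask_loads` and the regex are my own and are not part of GCC or its testsuite. It is only a heuristic: a repeated load would be legitimate if something stored to `m` in between, and this script does not check for that.

```python
import re

# Instruction lines copied from the comment #5 listings in this comment.
GCC15_ASM = """\
vmovdqa64 %zmm0, %zmm3
vmovd %edi, %xmm2
vmovdqa32 %zmm1, %zmm0
kmovw m(%rip), %k1
vmovdqa32 %zmm1, %zmm4
vpslld %xmm2, %zmm3, %zmm0{%k1}
kmovw m(%rip), %k2
vpsrad %xmm2, %zmm3, %zmm4{%k2}
vmovdqa32 %zmm0, zzz(%rip)
vmovdqa32 %zmm4, %zmm0
ret
"""

TRUNK_ASM = """\
vmovdqa64 %zmm0, %zmm3
vmovd %edi, %xmm2
vmovdqa32 %zmm1, %zmm0
kmovw m(%rip), %k1
vmovdqa32 %zmm1, %zmm4
vpslld %xmm2, %zmm3, %zmm0{%k1}
vpsrad %xmm2, %zmm3, %zmm4{%k1}
vmovdqa32 %zmm0, zzz(%rip)
vmovdqa32 %zmm4, %zmm0
ret
"""

# Matches a RIP-relative mask load such as "kmovw m(%rip), %k1"
# and captures the memory operand.
MASK_LOAD = re.compile(r"^\s*kmov[bwdq]\s+(\S+\(%rip\)),\s*%k\d")


def mask_loads(asm):
    """Map each RIP-relative memory operand to the number of kmov loads from it."""
    counts = {}
    for line in asm.splitlines():
        match = MASK_LOAD.match(line)
        if match:
            operand = match.group(1)
            counts[operand] = counts.get(operand, 0) + 1
    return counts


print(mask_loads(GCC15_ASM))  # {'m(%rip)': 2}
print(mask_loads(TRUNK_ASM))  # {'m(%rip)': 1}
```

A count above 1 for the same operand is what "the extra kmovw" refers to in the GCC 15 output; the trunk listing loads `m` once and reuses `%k1` for both shifts.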