https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47860
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-08-16

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed, ICC is able to vectorize this loop even without AVX (GCC can
currently vectorize the loop with AVX).

        movdqa    %xmm0, %xmm11                                 #10.11
        lea       1(%r14), %r15d                                #9.31
        movups    (%rdx,%r15,8), %xmm9                          #9.27
        movups    (%rcx,%r14,8), %xmm10                         #10.24
        cmpltpd   %xmm1, %xmm10                                 #10.24
        pxor      %xmm2, %xmm10                                 #10.24
        movmskpd  %xmm10, %r15d                                 #10.24
        testl     %r15d, %r15d                                  #10.24
        je        ..B1.14       # Prob 50%                      #10.24
                                # LOE rax rdx rcx rbx rsi rdi ebp r8d r9d r10d r11d r12d r13d r14d xmm0 xmm1 xmm2 xmm3 xmm4 xmm5 xmm6 xmm7 xmm8 xmm9 xmm10 xmm11
..B1.13:                        # Preds ..B1.12
                                # Execution count [1.25e+01]
        pshufd    $8, %xmm10, %xmm11                            #10.24
        movaps    %xmm9, %xmm8                                  #5.21
        pand      %xmm6, %xmm11                                 #10.24
                                # LOE rax rdx rcx rbx rsi rdi ebp r8d r9d r10d r11d r12d r13d r14d xmm0 xmm1 xmm2 xmm3 xmm4 xmm5 xmm6 xmm7 xmm8 xmm11
..B1.14:                        # Pre