https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111332
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |6.4.0
      Known to fail|                            |7.3.0, 7.5.0, 8.5.0, 9.5.0

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
GCC 11+ produces:
.L3:
        vmovdqu (%rsi), %ymm2
        vmovdqu 32(%rsi), %ymm1
        subq    $-128, %rdi
        subq    $-128, %rsi
        vmovdqu -64(%rsi), %ymm0
        vmovdqu -32(%rsi), %ymm3
        vmovdqu %ymm2, -128(%rdi)
        vmovdqu %ymm3, -32(%rdi)
        vmovdqu %ymm1, -96(%rdi)
        vmovdqu %ymm0, -64(%rdi)
        cmpq    %rax, %rdi
        jne     .L3

Which is the best code ...