https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54803
Mikhail Maltsev <miyuki at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |miyuki at gcc dot gnu.org

--- Comment #4 from Mikhail Maltsev <miyuki at gcc dot gnu.org> ---
On x86_64 the testcase is also vectorized. For example, with -O3
-march=haswell:

.L9:
        vmovdqa (%r9,%rax), %ymm0
        addq    $1, %r8
        vpsrlq  $32, %ymm0, %ymm1
        vpsllq  $32, %ymm0, %ymm0
        vpor    %ymm0, %ymm1, %ymm0
        vmovdqa %ymm0, (%r9,%rax)
        addq    $32, %rax
        cmpq    %r8, %rcx
        ja      .L9

On bdver2 vprotq insn is used:

.L14:
        incq    %rcx
        vprotq  $32, (%rax,%r8), %xmm0
        vmovaps %xmm0, (%rdx,%r8)
        addq    $16, %r8
        cmpq    %r10, %rcx
        jb      .L14
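
For readers without the testcase at hand: it is not quoted in this comment, so the following is only a hypothetical stand-in, not the code attached to the bug. The shift pair (vpsrlq $32 / vpsllq $32) combined with vpor, and the vprotq $32 on bdver2, both amount to rotating each 64-bit element by 32 bits, so a loop of roughly this shape would be consistent with the listings above. The function name, the in-place update, and the element type are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, NOT the testcase from the bug report.
   Rotates every 64-bit element of a[] by 32 bits in place,
   i.e. swaps the upper and lower 32-bit halves. */
void rotate64_by_32(uint64_t *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Shift-left / shift-right / OR is the classic rotate idiom;
           both shift counts are 32, so neither shift is undefined. */
        a[i] = (a[i] << 32) | (a[i] >> 32);
    }
}
```

Whether a given GCC version emits exactly the instructions shown above for this sketch depends on the version, the flags, and what it can prove about alignment, so it should be checked with -S rather than assumed.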