https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104610
--- Comment #23 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuho...@gcc.gnu.org>:

https://gcc.gnu.org/g:8c40b72036c967fbb1d1150515cf70aec382f0a2

commit r14-5002-g8c40b72036c967fbb1d1150515cf70aec382f0a2
Author: liuhongt <hongtao....@intel.com>
Date:   Mon Oct 9 15:07:54 2023 +0800

    Improve memcmpeq for 512-bit vector with vpcmpeq + kortest.

    When 2 vectors are equal, kmask is allones and kortest will set CF,
    else CF will be cleared.

    So CF bit can be used to check for the result of the comparison.

    Before:
            vmovdqu (%rsi), %ymm0
            vpxorq  (%rdi), %ymm0, %ymm0
            vptest  %ymm0, %ymm0
            jne     .L2
            vmovdqu 32(%rsi), %ymm0
            vpxorq  32(%rdi), %ymm0, %ymm0
            vptest  %ymm0, %ymm0
            je      .L5
    .L2:
            movl    $1, %eax
            xorl    $1, %eax
            vzeroupper
            ret

    After:
            vmovdqu64       (%rsi), %zmm0
            xorl    %eax, %eax
            vpcmpeqd        (%rdi), %zmm0, %k0
            kortestw        %k0, %k0
            setc    %al
            vzeroupper
            ret

    gcc/ChangeLog:

            PR target/104610
            * config/i386/i386-expand.cc (ix86_expand_branch): Handle
            512-bit vector with vpcmpeq + kortest.
            * config/i386/i386.md (cbranchxi4): New expander.
            * config/i386/sse.md: (cbranch<mode>4): Extend to V16SImode
            and V8DImode.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr104610-2.c: New test.
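
For readers following the CF reasoning in the commit message, here is a rough scalar sketch in portable C. It is not GCC code and not the contents of the new test: all function names (cmpeq_mask16, kortestw_cf, equal64, equal64_model) are hypothetical, and the 64-byte memcmp-equality form in equal64 is only an assumption about the kind of source the expander targets. The sketch models vpcmpeqd on a 512-bit operand as a 16-bit mask with one bit per 32-bit lane, and models the carry flag after kortestw as "the OR of the two masks is all ones".

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of vpcmpeqd on 512-bit operands (hypothetical helper):
   bit i of the result is set when 32-bit lane i of a and b are equal. */
static uint16_t cmpeq_mask16(const void *a, const void *b)
{
    uint16_t mask = 0;
    for (int i = 0; i < 16; i++) {
        uint32_t x, y;
        /* memcpy avoids alignment and aliasing assumptions. */
        memcpy(&x, (const unsigned char *)a + 4 * i, sizeof x);
        memcpy(&y, (const unsigned char *)b + 4 * i, sizeof y);
        if (x == y)
            mask |= (uint16_t)(1u << i);
    }
    return mask;
}

/* Scalar model of CF after "kortestw k1, k2" (hypothetical helper):
   CF is set exactly when the OR of the two 16-bit masks is all ones. */
static int kortestw_cf(uint16_t k1, uint16_t k2)
{
    return (uint16_t)(k1 | k2) == 0xFFFF;
}

/* Assumed source-level pattern: a 64-byte (512-bit) equality-only compare. */
int equal64(const void *a, const void *b)
{
    return memcmp(a, b, 64) == 0;
}

/* Same result computed the way the "After" sequence does it:
   compare all lanes, kortest the mask with itself, read CF (setc). */
int equal64_model(const void *a, const void *b)
{
    uint16_t k = cmpeq_mask16(a, b);
    return kortestw_cf(k, k);
}
```

With two identical 64-byte buffers both functions return 1. Flipping any single byte clears the mask bit of the lane containing it, so the mask is no longer all ones and both functions return 0.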