On Wed, May 12, 2021 at 1:42 PM Hongtao Liu <crazy...@gmail.com> wrote:
>
> On Wed, May 12, 2021 at 4:36 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 1:05 PM Hongtao Liu via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > Hi:
> > >   As described in the subject line, this patch is about to do the
> > > below transformation.
> > >
> > > -       vpcmpeqd        %ymm3, %ymm3, %ymm3
> > > -       vpandn  %ymm3, %ymm2, %ymm2
> > > -       vpblendvb       %ymm2, %ymm1, %ymm0, %ymm0
> > > +       vpblendvb       %ymm2, %ymm0, %ymm1, %ymm0
> > >
> > >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > >
> > > gcc/ChangeLog:
> > >
> > >         PR target/99908
> > >         * config/i386/sse.md (<sse4_1_avx2>_pblendvb): Add
> > >         splitters for pblendvb of NOT mask register.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         PR target/99908
> > >         * gcc.target/i386/avx2-pr99908.c: New test.
> > >         * gcc.target/i386/sse4_1-pr99908.c: New test.
>
> Thanks for the review.

OTOH, have you considered ix86_fold_builtin or
ix86_gimple_fold_builtin? These functions are implemented as builtins,
so perhaps the transformation can be implemented more efficiently by
calling these two target functions.

Uros.
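
For reference, the quoted transformation rests on a simple identity: blending
with an inverted mask gives the same result as blending with the original
mask and the two data operands swapped. Below is a minimal scalar sketch of
that identity for a single byte lane. It is only a model, not GCC code and
not the patch's test case: blend_byte and not_mask_equals_swap are
hypothetical names, and the model assumes the usual pblendvb rule that the
top bit of each mask byte selects the second source.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one pblendvb byte lane (hypothetical helper):
   pick b when the top bit of the mask byte is set, otherwise a.  */
static uint8_t
blend_byte (uint8_t a, uint8_t b, uint8_t mask)
{
  return (mask & 0x80) ? b : a;
}

/* Return 1 if blend (a, b, ~mask) == blend (b, a, mask) holds for
   every mask byte and a small sample of data bytes, else 0.  */
static int
not_mask_equals_swap (void)
{
  for (unsigned a = 0; a < 256; a += 51)
    for (unsigned b = 0; b < 256; b += 85)
      for (unsigned m = 0; m < 256; m++)
        if (blend_byte (a, b, (uint8_t) ~m) != blend_byte (b, a, m))
          return 0;
  return 1;
}

/* Convenience wrapper that aborts if the identity fails.  */
static void
self_test (void)
{
  assert (not_mask_equals_swap ());
}
```

Calling self_test () from a small driver checks the identity. In the vector
case it is what lets the vpcmpeqd/vpandn pair be dropped in favour of
swapping the two source operands of vpblendvb.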