On Thu, May 13, 2021 at 8:43 AM Hongtao Liu <crazy...@gmail.com> wrote:
>
> On Wed, May 12, 2021 at 8:38 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Wed, May 12, 2021 at 1:42 PM Hongtao Liu <crazy...@gmail.com> wrote:
> > >
> > > On Wed, May 12, 2021 at 4:36 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 27, 2021 at 1:05 PM Hongtao Liu via Gcc-patches
> > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > >
> > > > > Hi:
> > > > >   As described in the subject line, this patch is about to do the
> > > > > below transformation.
> > > > >
> > > > > -       vpcmpeqd        %ymm3, %ymm3, %ymm3
> > > > > -       vpandn  %ymm3, %ymm2, %ymm2
> > > > > -       vpblendvb       %ymm2, %ymm1, %ymm0, %ymm0
> > > > > +       vpblendvb       %ymm2, %ymm0, %ymm1, %ymm0
> > > > >
> > > > >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >         PR target/99908
> > > > >         * config/i386/sse.md (<sse4_1_avx2>_pblendvb): Add
> > > > >         splitters for pblendvb of NOT mask register.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > >         PR target/99908
> > > > >         * gcc.target/i386/avx2-pr99908.c: New test.
> > > > >         * gcc.target/i386/sse4_1-pr99908.c: New test.
> > >
> > > Thanks for the review.
> >
> > OTOH, have you considered ix86_fold_builtin or
> > ix86_gimple_fold_builtin? These functions are implemented as builtins,
> > so perhaps the transformation can be more efficiently implemented by
> > calling these two target functions.
> Good idea, I'll try that.

I find it's not a good idea to fold andn into 2 gimple statements: they
don't always get combined back into andn in RTL, so some optimizations
are lost. Folding blendv, on the other hand, seems obviously good.

> >
> > Uros.
> >
>
> --
> BR,
> Hongtao

--
BR,
Hongtao
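
For reference, the identity the patch relies on can be checked with a
scalar model: blending with an inverted mask should give the same bytes
as blending with the original mask and the two data operands swapped.
This is only a sketch under stated assumptions. It models pblendvb as a
per-byte select on the top bit of each mask byte, and the names
blendv_bytes, blend_not_matches_swap and VEC_BYTES are hypothetical
helpers for illustration, not anything from GCC or the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 32  /* assumed width: one ymm register's worth of bytes */

/* Scalar model of a variable byte blend (hypothetical helper):
   take b[i] when the top bit of mask[i] is set, otherwise a[i]. */
static void blendv_bytes(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                         const uint8_t *mask, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (mask[i] & 0x80) ? b[i] : a[i];
}

/* Returns 1 when blend(a, b, ~mask) equals blend(b, a, mask),
   i.e. when the NOT on the mask can be replaced by an operand swap. */
static int blend_not_matches_swap(const uint8_t *a, const uint8_t *b,
                                  const uint8_t *mask, size_t n)
{
    uint8_t inverted[VEC_BYTES], with_not[VEC_BYTES], swapped[VEC_BYTES];

    assert(n <= VEC_BYTES);

    /* The "before" form: invert the mask, then blend. */
    for (size_t i = 0; i < n; i++)
        inverted[i] = (uint8_t)~mask[i];
    blendv_bytes(with_not, a, b, inverted, n);

    /* The "after" form: keep the mask, swap the data operands. */
    blendv_bytes(swapped, b, a, mask, n);

    return memcmp(with_not, swapped, n) == 0;
}
```

Calling blend_not_matches_swap on any three buffers of up to VEC_BYTES
bytes should return 1, since inverting a mask byte flips its top bit and
therefore flips which operand each byte is taken from.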