https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115683

--- Comment #6 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #5)
> (In reply to Hongtao Liu from comment #0)
> 
> > g++: g++.target/i386/pr100637-1b.C 
> > g++: g++.target/i386/pr100637-1w.C
> > g++: g++.target/i386/pr103861-1.C
> >
> > There're extra 1 pcmpeq instruction generated in below 3 testcase for
> > comparison of GTU, x86 doesn't support native GTU comparison, but use
> > psubusw + pcmpeq + pcmpeq, the second pcmpeq is used to negate the mask, and
> > the negate can be eliminated in vcond{,u,eq} expander by just swapping
> > if_true and if_else.
> 
> How to do that? The output from vec_cmpu is a mask value in the output
> register that is used by vcond_mask as an input. I fail to see how the swap
> of if_true and if_false operands (in vcond_mask RTX) can be communicated
> from vec_cmpu to vcond_mask.

One possible solution is to define a "fake" blendv pattern so that combine
can do the optimization, and then split this fake pattern back to
op1 & mask | op2 & ~mask when !TARGET_SSE4_1.
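
To illustrate why the second pcmpeq is redundant, here is a minimal scalar
sketch of one 16-bit lane. It is only a model of the instruction semantics,
not GCC code: the function names (psubusw_lane, pcmpeqw_lane, select_lane,
gtu_select_negated, gtu_select_swapped) are hypothetical helpers made up for
this example. It shows that selecting with the negated mask gives the same
result as selecting with the un-negated mask and swapped operands.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 16-bit lane of psubusw: unsigned saturating subtract. */
uint16_t psubusw_lane(uint16_t a, uint16_t b)
{
    return (uint16_t)(a > b ? a - b : 0);
}

/* Scalar model of one 16-bit lane of pcmpeqw: all-ones if equal, else zero. */
uint16_t pcmpeqw_lane(uint16_t a, uint16_t b)
{
    return (uint16_t)(a == b ? 0xFFFF : 0);
}

/* The non-SSE4.1 blend form: op1 & mask | op2 & ~mask. */
uint16_t select_lane(uint16_t mask, uint16_t op1, uint16_t op2)
{
    return (uint16_t)((op1 & mask) | (op2 & (uint16_t)~mask));
}

/* a >u b ? t : f using psubusw + pcmpeq + pcmpeq (explicit mask negation). */
uint16_t gtu_select_negated(uint16_t a, uint16_t b, uint16_t t, uint16_t f)
{
    uint16_t leu = pcmpeqw_lane(psubusw_lane(a, b), 0); /* a <=u b */
    uint16_t gtu = pcmpeqw_lane(leu, 0);                /* negate the mask */
    return select_lane(gtu, t, f);
}

/* Same result with one pcmpeq fewer: keep the LEU mask, swap the operands. */
uint16_t gtu_select_swapped(uint16_t a, uint16_t b, uint16_t t, uint16_t f)
{
    uint16_t leu = pcmpeqw_lane(psubusw_lane(a, b), 0); /* a <=u b */
    return select_lane(leu, f, t);
}

/* Check both forms against the plain C comparison on boundary values. */
void check_swap_equivalence(void)
{
    static const uint16_t v[] = { 0, 1, 2, 0x7FFF, 0x8000, 0xFFFE, 0xFFFF };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            uint16_t expect = (uint16_t)(v[i] > v[j] ? 0x1234 : 0xABCD);
            assert(gtu_select_negated(v[i], v[j], 0x1234, 0xABCD) == expect);
            assert(gtu_select_swapped(v[i], v[j], 0x1234, 0xABCD) == expect);
        }
    }
}
```

The swapped form is what a blendv-style pattern would let combine reach: the
negation is folded into the choice of which operand goes where.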
