https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103771

--- Comment #23 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 18 Jan 2022, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103771
> 
> --- Comment #22 from Hongtao.liu <crazylht at gmail dot com> ---
> (In reply to Hongtao.liu from comment #21)
> > (In reply to Hongtao.liu from comment #20)
> > > (In reply to Richard Biener from comment #19)
> > > > Ah, so the issue is missing -mavx512bw which means we end up with an AVX2
> > > > style mask for V32QImode.  With -mavx512bw the code vectorizes fine.
> > > 
> > > The vectorized code is worse than before: we now need to pack the
> > > vectorized mask, which takes 3 extra instructions.
> > 
> > Currently ifcvt converts
> > 
> > ---------dump of .ch_vect-------
> >   if (x.1_14 > 255)
> >     goto <bb 4>; [50.00%]
> >   else
> >     goto <bb 5>; [50.00%]
> > 
> >   <bb 4> [local count: 477815112]:
> >   _17 = -_5;
> >   _18 = _17 >> 31;
> >   iftmp.0_19 = (unsigned char) _18;
> >   goto <bb 6>; [100.00%]
> > 
> >   <bb 5> [local count: 477815112]:
> >   iftmp.0_20 = (unsigned char) _5;
> > 
> >   <bb 6> [local count: 955630225]:
> >   # iftmp.0_21 = PHI <iftmp.0_19(4), iftmp.0_20(5)>
> > -------dump end---------
> > 
> > 
> > to 
> > ---- dump of .ifcvt---------
> >   _41 = -x.1_14;
> >   _17 = (int) _41;
> >   _18 = _17 >> 31;
> >   iftmp.0_19 = (unsigned char) _18; -- vec_pack_trunc
> >   iftmp.0_20 = (unsigned char) _5; -- vec_pack_trunc
> >   iftmp.0_21 = x.1_14 > 255 ? iftmp.0_19 : iftmp.0_20; -- vec_pack_trunc
> >   *_6 = iftmp.0_21;
> >   x_16 = x_24 + 1;
> > -----dump end----------
> > 
> > 
> > If ifcvt instead output something like
> > ------------optimal .ifcvt------
> >   _41 = -x.1_14;
> >   _17 = (int) _41;
> >   _18 = _17 >> 31;
> >   iftmp.0_21 = x.1_14 > 255 ? _18 : _5;
> >   iftmp.0_22 = (unsigned char) iftmp.0_21; --- vec_pack_trunc
> >   *_6 = iftmp.0_22;
> >   x_16 = x_24 + 1;
> > ------------end------------
> > 
> > we could save the operations for packing the mask (3 vec_pack_trunc vs 1
> > vec_pack_trunc?).
> 
> Or maybe a gimple simplification for it?

Yes, I think that's a candidate for a match.pd simplification.
Fortunately if-conversion already folds the built stmts.
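
For reference, the equivalence such a simplification would rely on can be
sketched in scalar C.  This is only an illustration, not the match.pd
pattern itself: the function names are made up, the single int parameter
stands in for both x.1_14 and _5 from the dumps, and it assumes GCC's
behaviour for out-of-range unsigned-to-int conversion and for right shifts
of negative values.

```c
#include <assert.h>

/* Shape of the current .ifcvt output: both arms are narrowed to
   unsigned char first, then the select picks one of them.  */
static unsigned char narrow_then_select(int x)
{
    unsigned int ux = (unsigned int) x;   /* plays the role of x.1_14 */
    int neg = (int) (0u - ux);            /* _41 = -x.1_14; _17 = (int) _41 */
    int shifted = neg >> 31;              /* _18; arithmetic shift assumed (GCC) */
    unsigned char a = (unsigned char) shifted;  /* iftmp.0_19 */
    unsigned char b = (unsigned char) x;        /* iftmp.0_20 */
    return ux > 255 ? a : b;                    /* iftmp.0_21 */
}

/* Shape of the "optimal" form: select on the wide values, then narrow
   once.  */
static unsigned char select_then_narrow(int x)
{
    unsigned int ux = (unsigned int) x;
    int neg = (int) (0u - ux);
    int shifted = neg >> 31;
    int wide = ux > 255 ? shifted : x;    /* select on int operands */
    return (unsigned char) wide;          /* single truncation */
}

/* Hypothetical helper: count inputs in [lo, hi] where the two forms
   disagree.  Expected to be zero, since truncation distributes over the
   select.  */
static long count_mismatches(int lo, int hi)
{
    long bad = 0;
    for (long x = lo; x <= hi; x++)
        if (narrow_then_select((int) x) != select_then_narrow((int) x))
            bad++;
    return bad;
}
```

Both functions compute the same value for every input because the
conversion is applied to whichever arm is selected either way; the second
form just needs one narrowing instead of one per arm, which is what saves
the extra vec_pack_trunc operations after vectorization.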
