On Wed, 21 Aug 2024, Robin Dapp wrote:

> > > > Why?  I don't think the vectorizer relies on a particular else
> > > > value?  I'd say it would be appropriate for if-conversion to
> > > > use "ANY" and for the vectorizer to then pick a supported
> > > > version and/or enforce the else value it needs via a blend?
> > > 
> > > In PR115336 we have something like
> > > 
> > >   _Bool iftmp.0_113;
> > >   _Bool iftmp.0_114;
> > >   iftmp.0_113 = .MASK_LOAD (_170, 8B, _169, _171(D));
> > >   iftmp.0_114 = _47 | iftmp.0_113;
> > > 
> > > which assumes zeroing.
> >
> > I see - is that some trick ifcvt performs?  I can't immediately
> > see the connection to the PR and it only contains RISC-V assembly
> > analysis.
> 
> It happens in predicate_scalar_phi where we build the cond_expr.
> 
> After converting
> 
>   iftmp.0_114 = PHI <iftmp.0_113(45), 1(38)>
> 
> we have
> 
>   _BoolD.2746 _47;
>   iftmp.0_114 = _47 ? 1 : iftmp.0_113;
> which is folded into
>   iftmp.0_114 = _47 | iftmp.0_113;
> 
> I should really have documented that and more in the PR already...
> So it's not an ifcvt trick but rather match.

_47 was the .MASK_LOAD def, right?

It's not exactly obvious what goes wrong - the transform above
is correct - it's only "unexpected" for the lanes that were
masked.  So the actual bug must be downstream from iftmp.0_144.

I think one can try to reason on the ifcvt (scalar) code by
assuming the .MASK_LOAD def would be undefined.  Then we'd
have _47(D) ? 1 : iftmp.0_133 -> _47(D) | iftmp.0_133, I think
that's at most phishy as the COND_EXPR has a well-defined
value while the IOR might spill "undefined" elsewhere causing
divergence.  Is that what is actually happening?

Richard.

> Another related case is PR116059.
> 
> > > In order to circumvent that we could use COND_IOR
> > > but I suppose that wouldn't be optimized away even on targets that zero
> > > masked elements?  "ANY" would seem to be wrong here.
> >
> > What I was trying to say is that of course any transform we perform
> > that requires zero-masking should either make .MAKS_LOAD perform
> > that or add a COND_EXPR.  But it shouldn't be required to make
> > all .MASK_LOADs be zero-masked, no?
> >
> > I'm of course fine if you think that's the best way for RISC-V given
> > other targets are likely unaffected as they can perform zero-masking.
> 
> No, the less zeroing the better of course :)
> 
> Richard S's point was to make the COND_EXPR explicit, so that e.g.
> a MASK_LOAD (mask, ..., 1) does not appear cheap as cheap as
> MASK_LOAD (mask, ..., 0) on zeroing targets.
> 
> From this I kind of jumped to the conclusion (see below) that we might want
> it everywhere.
> 
> With the patches as is, ifcvt would enforce the zero here while all other
> masked-load occurrences in the vectorizer would just query the target's
> preferred else value and simply use that without blend/cond_expr.
> 
> > > What I didn't do (in the posted version, just locally) is an explicit
> > > VEC_COND_EXPR after each masked (gather/load lanes) call the vectorizer
> > > does.  Do we need that?  AFAICT loop masking (be it len style or
> > > fully-masked style) should be safe.
> >
> > Well, why should we need that?  There seem to be the assumption that
> > .MASK_LOAD is zero-masked in very few places (PR115336, but not
> > identified there), if we'd assume this everywhere there would be
> > way more issues with RISC-V?
> 
> Ok, I was already pretty sure we don't need - and glad to hear it confirmed.
> I was just thinking for consistency reasons we might want a masked
> load to always look like
>   foo123 = .MASK_..._LOAD (mask, ..., else)
>   foo124 = COND_EXPR (mask, foo123, 0);
> where foo124 would be optimized away (or not even emitted) for zeroing
> targets).  That way subsequent code could always rely on zero.
> But as this requirement seems very rare it doesn't look like a useful
> invariant to enforce.
> 
> All in all, it seems we don't need major changes to the approach.
> I'm going to work on the comments for the other patches.
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to