On Wed, 7 Aug 2024, Richard Biener wrote:
> > > + data = *(const v16qi_u *)s;
> > > + /* Prevent propagation into pshufb and pcmp as memory operand. */
> > > + __asm__ ("" : "+x" (data));
> >
> > It would probably make sense to file a PR on this separately,
> > to eventually fix the compiler to not need such workarounds.
> > Not sure how much difference it makes however.
>
> This is probably to work around bugs in older compiler versions? If
> not I agree.
This is deliberate hand-tuning to avoid a subtle issue: pshufb with a
memory operand is not micro-fused on Intel, so with propagation it is
two uops early in the CPU front-end.
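
To illustrate the idiom outside the patch, here is a minimal sketch. It is
not the patch's code: `newline_mask` is a hypothetical helper, and it uses
SSE2 intrinsics (pcmpeqb only) rather than the patch's vector typedefs and
pshufb, so that it builds on any x86-64 target without extra flags.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: return a 16-bit mask with bit i set when s[i]
   is '\n' or '\r'.  Reads exactly 16 bytes starting at s.  */
static unsigned
newline_mask (const char *s)
{
  /* Unaligned 16-byte load.  */
  __m128i data = _mm_loadu_si128 ((const __m128i *) s);

  /* Empty asm with a "+x" (SSE register, read-write) operand.  The
     compiler must have the value in an xmm register here and must
     assume the asm may have changed it, so the load cannot be folded
     into the compares below as a memory operand.  */
  __asm__ ("" : "+x" (data));

  __m128i nl = _mm_cmpeq_epi8 (data, _mm_set1_epi8 ('\n'));
  __m128i cr = _mm_cmpeq_epi8 (data, _mm_set1_epi8 ('\r'));
  return (unsigned) _mm_movemask_epi8 (_mm_or_si128 (nl, cr));
}
```

The asm emits no instructions; it only constrains how the compiler may
combine the load with its users. Whether that pays off depends on the
target microarchitecture and on what the register allocator would have
done without it.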
The "propagation" actually falls out of IRA/LRA decisions, and stopped
happening in gcc-14. I'm not sure if there were relevant RA changes.
In any case, this can potentially flip-flop in the future again.
Considering the trunk gets this right, I think the next move is to
add a testcase for this, not a PR, correct?

> Otherwise the patch is OK.

Still OK with the asms, or would you prefer them be taken out?

Thanks.
Alexander