On Wed, 7 Aug 2024, Richard Biener wrote:
> > > +      data = *(const v16qi_u *)s;
> > > +      /* Prevent propagation into pshufb and pcmp as memory operand.  */
> > > +      __asm__ ("" : "+x" (data));
> >
> > It would probably make sense to a file a PR on this separately,
> > to eventually fix the compiler to not need such workarounds.
> > Not sure how much difference it makes however.
>
> This is probably to work around bugs in older compiler versions?  If
> not I agree.

This is deliberate hand-tuning to avoid a subtle issue: pshufb is not
macro-fused on Intel, so with propagation it is two uops early in the
CPU front-end.

The "propagation" actually falls out of IRA/LRA decisions, and stopped
happening in gcc-14. I'm not sure if there were relevant RA changes.
In any case, this can potentially flip-flop in the future again.

Considering the trunk gets this right, I think the next move is to add
a testcase for this, not a PR, correct?

> Otherwise the patch is OK.

Still OK with the asms, or would you prefer them be taken out?

Thanks.
Alexander
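
For readers following the thread, here is a minimal standalone sketch of
the empty-asm idiom quoted above. It is not the code from the patch: the
names FORCE_IN_REGISTER, count_matches and self_check are hypothetical,
the typedefs are only assumed to resemble the ones the patch uses, and
the non-SSE2 fallback is an addition to keep the sketch portable. The
empty asm with a "+x" operand tells the compiler that the value is read
and modified in an SSE register at that point, so the load has to be
materialized in a register and cannot be folded into a later instruction
as a memory operand.

```c
#include <assert.h>

/* 16-byte vector of chars (GCC/Clang vector extension); the _u variant
   is assumed to allow unaligned, aliasing loads from a char buffer.  */
typedef char v16qi __attribute__ ((vector_size (16)));
typedef v16qi v16qi_u __attribute__ ((aligned (1), may_alias));

#if defined (__SSE2__)
/* Empty asm that claims to read and write DATA in an SSE register ("+x"),
   so the loaded value must live in a register at this point.  */
# define FORCE_IN_REGISTER(data) __asm__ ("" : "+x" (data))
#else
/* Assumption for this sketch only: no barrier on non-SSE2 targets.  */
# define FORCE_IN_REGISTER(data) ((void) 0)
#endif

/* Hypothetical helper: count the bytes equal to C in the 16 bytes at S.  */
int
count_matches (const char *s, char c)
{
  v16qi data = *(const v16qi_u *) s;
  FORCE_IN_REGISTER (data);

  v16qi needle = { c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c };
  /* Element-wise compare: each lane becomes -1 on a match, 0 otherwise.  */
  __typeof__ (data == needle) eq = (data == needle);

  int n = 0;
  for (int i = 0; i < 16; i++)
    n -= eq[i];
  return n;
}

/* Small self-check; call it from main () when trying the sketch out.  */
void
self_check (void)
{
  assert (count_matches ("a,b,c,d,e,f,g,h,", ',') == 8);
  assert (count_matches ("0123456789abcdef", ',') == 0);
}
```

The barrier does not change the result, only what the register allocator
is allowed to do with the load. Whether it makes a measurable difference
depends on the compiler version and the target CPU, so the generated
assembly is worth checking (for example with -O2 -S) with and without it.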