https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351

--- Comment #6 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to ktkachov from comment #5)
> (In reply to Tamar Christina from comment #4)
> > While looking at the codegen it looks like GROMACS has a lot of loops that
> > get vectorized now and it's showing some inefficiencies in the codegen,
> > including missing foldings for SVE in match.pd
> > 
> > I have written patches to fix many of these so will post them for GCC 16.
> > 
> > Anyways, miscompile is in nbnxn_make_pairlist_part, bisecting the loop as it
> > has quite a number of them in that function.
> 
> Nice, thanks!
> FTR, Jennifer is also looking at optimising some inefficiencies we found
> here with SVE, particularly around making more use of unpredicated
> instructions e.g. PR117978. Just mentioning this here to avoid duplicate
> effort

No overlap; the ones I'm looking at are just the inefficiencies in the code
generated for early break by the backend and vectorizer.

That said, we do have a patch similar to what Jennifer is doing in PR117978,
but it lowers predicated arithmetic operations like fmul/mul to Adv. SIMD when
only the lower 128 bits are used.

It looks like there's some commonality in the helper functions created, but I
think that's just a matter of re-using whichever one gets in first.
