https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
--- Comment #6 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to ktkachov from comment #5)
> (In reply to Tamar Christina from comment #4)
> > While looking at the codegen it looks like GROMACS has a lot of loops that
> > get vectorized now and it's showing some inefficiencies in the codegen,
> > including missing foldings for SVE in match.pd
> >
> > I have written patches to fix many of these so will post them for GCC 16.
> >
> > Anyways, miscompile is in nbnxn_make_pairlist_part, bisecting the loop as it
> > has quite a number of them in that function.
>
> Nice, thanks!
> FTR, Jennifer is also looking at optimising some inefficiencies we found
> here with SVE, particularly around making more use of unpredicated
> instructions e.g. PR117978. Just mentioning this here to avoid duplicate
> effort

No overlap: the ones I'm looking at are only the inefficiencies in the code
that the backend and vectorizer generate for early break.

That said, we do have a patch similar to what Jennifer is doing in PR117978,
but it lowers predicated arithmetic operations like fmul/mul to Adv. SIMD when
only the lower 128 bits are used, etc.

It looks like there's some commonality in the helper functions created, but
that's just a matter of re-using whichever one gets in first, I think.