https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101929
--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #12)
> > It's difficult (if not impossible) for the vectorizer to second-guess
> > the followup FRE, we're a long way from doing loop + SLP vectorization
> > in one go and discovering we can elide the vector store.
>
> I'm thinking of adding some detection in the vectorizer to find the
> "fre pair" of the new vectorized store and the existing vector load, then
> eliminate the vector_store cost in add_stmt_cost since the store will
> probably be eliminated.
>
> New vector store:
> MEM <vector(4) unsigned int> [(unsigned int *)&tmp] = vect__192.68_825;
>
> Existing vector load below:
> vect__63.9_482 = MEM <vector(4) unsigned int> [(unsigned int *)&tmp]

I think it would be more useful to explore whether we can find a special
kind of "SLP seed" by looking for vector loads that load from a previously
identified store group. We can then have the vector _load_ be the SLP seed,
similar to how we handle vector CTORs. The only special sauce would be that
we need to perform dependence analysis and make the "store" cheaper (but only
if it is really dead afterwards, which would mean doing some kind of DSE
analysis - but of course we can offset the vector load cost that goes away,
maybe that alone helps).
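
To make the shape under discussion concrete, here is a reduced C sketch. It is a hypothetical illustration, not the testcase from this PR: the function name scale_via_tmp and its parameters are made up, and whether GCC actually produces a vector store followed by a vector load for it depends on the version, target and flags. The four adjacent stores to tmp form a store group that SLP vectorization could turn into one vector store, and the loop reading tmp back is the kind of consumer that loop vectorization could turn into one vector load from the same address.

```c
#include <stdint.h>

/* Hypothetical reduced example (names are assumptions, not from the PR).  */
void
scale_via_tmp (uint32_t *out, const uint32_t *a, const uint32_t *b)
{
  uint32_t tmp[4];

  /* Store group: four adjacent scalar stores to tmp.  SLP vectorization
     may replace them with a single vector(4) store to &tmp.  */
  tmp[0] = a[0] + b[0];
  tmp[1] = a[1] + b[1];
  tmp[2] = a[2] + b[2];
  tmp[3] = a[3] + b[3];

  /* Consumer: reads all of tmp back.  If this loop is vectorized first,
     it becomes a vector(4) load from &tmp, i.e. the load that would act
     as the "SLP seed" for the store group above.  */
  for (int i = 0; i < 4; i++)
    out[i] = tmp[i] * 2;
}
```

If both sides are vectorized and nothing else reads tmp, the stored value can be forwarded to the load, so both the vector load and (when it is really dead) the vector store go away; that is the saving the costing would need to reflect. Comparing the -fdump-tree-vect-details and -fdump-tree-slp-details dumps is one way to check what a given compiler does with a sketch like this.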