https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100173

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---

> but yes, cselim will also sink the first store, moving it across the
> scalar compute in the block.  I might note that ideally we'd sink
> all the compute as well and end up with just a conditional load of
> either pIn1->m_esState or pIn2_89->m_esState.  That might then allow
> scheduling to recover the original performance.
> 

I want to clarify that this regression is not related to the 2 sunk stores; it
just triggers some micro-architecture bound.

Also, w/o -fvect-cost-model=very-cheap, it can be 2-3x faster. The trip count
is constant, so I wonder why the very-cheap cost model doesn't vectorize this loop?
