https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107093

--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 11 Oct 2022, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107093
> 
> --- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
> 
> > 
> > One downside for a fully masked body is that we're using masked stores
> > which usually have higher latency due to the "merge" semantics which
> > means an extra memory input + merge operation.  Not sure if modern
> > uArchs can optimize the all-ones mask case, the vectorizer, for
> Also I guess a masked store won't store-forward even if the load is inside
> the masked store.

I guess the masking of the store is resolved in the load-store unit
and not by splitting the operation into a load, modify, and store, because
that cannot easily hide exceptions.  So yes, a masked store in the
store buffer likely cannot act as a forwarding source (though the
actual mask should be fully resolved there) since the actual
merging will take place later.
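
To make the "merge" semantics concrete, here is a minimal scalar sketch in
C of what a masked store does architecturally.  It is only a model under
stated assumptions, not how any particular uArch implements it: the names
masked_store_model and load_covered_by_store are hypothetical, lanes are
32-bit, and the mask has one bit per lane.  The second helper only checks
whether a later load touches nothing but lanes the store wrote; as said
above, hardware likely does not forward even in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a masked vector store: lane i of src is
   written to dst only when bit i of mask is set.  Lanes with a clear
   mask bit keep whatever memory already held (the "merge" semantics). */
static void masked_store_model(int32_t *dst, const int32_t *src,
                               uint32_t mask, size_t lanes)
{
    assert(lanes <= 32);  /* one mask bit per lane */
    for (size_t i = 0; i < lanes; i++)
        if ((mask >> i) & 1u)
            dst[i] = src[i];
}

/* Hypothetical check: returns 1 when every lane read by a later load
   (load_mask) was written by the pending masked store (store_mask).
   This is a necessary condition for forwarding in this model, not a
   claim that real hardware forwards when it holds. */
static int load_covered_by_store(uint32_t store_mask, uint32_t load_mask)
{
    return (load_mask & ~store_mask) == 0;
}
```

For example, storing {1, 2, 3, 4} over {10, 20, 30, 40} with mask 0x5
leaves {1, 20, 3, 40} in memory, so a load of lane 1 still needs the old
memory contents.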
