[Bug target/107093] AVX512 mask operations not simplified in fully masked loop

crazylht at gmail dot com via Gcc-bugs Tue, 11 Oct 2022 04:08:05 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107093


--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---

> 
> One downside for a fully masked body is that we're using masked stores
> which usually have higher latency due to the "merge" semantics which
> means an extra memory input + merge operation.  Not sure if modern
> uArchs can optimize the all-ones mask case, the vectorizer, for
Also, I guess a masked store won't be store-forwarded, even if the load is
inside the masked store.
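
For readers unfamiliar with the "merge" semantics mentioned in the quote, here
is a minimal scalar sketch in portable C of what a 16-lane, 32-bit masked store
does architecturally. The names `model_mask_store` and `LANES` are hypothetical
and only for illustration. This models the visible result, not how any
particular uArch implements the store or its forwarding behaviour.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16  /* assumption: sixteen 32-bit lanes, i.e. one 512-bit vector */

/* Hypothetical scalar model of a masked vector store.
   Lanes whose mask bit is set are written from src; lanes whose
   mask bit is clear keep the old memory contents ("merge"). */
static void model_mask_store(int32_t *dst, uint16_t mask, const int32_t *src)
{
    assert(dst != 0 && src != 0);
    for (int i = 0; i < LANES; i++) {
        if (mask & (1u << i))
            dst[i] = src[i];
        /* else: dst[i] is left untouched */
    }
}
```

With an all-ones mask (0xFFFF) every lane is written, so the result is the
same as a plain unmasked store, which is the case the quoted text wonders
about.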
