[Bug target/111874] Missed mask_fold_left_plus with AVX512

crazylht at gmail dot com via Gcc-bugs Mon, 23 Oct 2023 19:50:20 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111874


--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
> For the case of conditional (or loop masked) fold-left reductions the scalar
> fallback isn't implemented.  But AVX512 has vpcompress that could be used
> to implement a more efficient sequence for a masked fold-left, possibly
> using a loop and population count of the mask.
That sequence needs an extra kmov + vpcompress + popcnt, so I'm afraid the
performance could be worse than the scalar version.
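
For reference, here is a rough scalar model of the two alternatives being
compared. It is only a sketch under assumptions, not what the vectorizer
emits: the function names, the byte-per-element mask layout and the fixed
VL of 8 doubles per 512-bit vector are all hypothetical, and the compress
step is emulated in plain C rather than done with a real vpcompress (on
hardware that would be something like _mm512_maskz_compress_pd after a kmov
of the mask to a GPR for the popcnt).

```c
#include <stddef.h>

#define VL 8  /* assumed: 8 doubles per 512-bit vector */

/* In-order masked sum: the plain scalar version. */
double masked_fold_left_scalar(double init, const double *a,
                               const unsigned char *m, size_t n)
{
    double sum = init;
    for (size_t i = 0; i < n; i++)
        if (m[i])
            sum += a[i];
    return sum;
}

/* Same reduction, modelled as: build mask, compress, popcount, short loop.
   Hypothetical emulation of the kmov + vpcompress + popcnt sequence. */
double masked_fold_left_compress(double init, const double *a,
                                 const unsigned char *m, size_t n)
{
    double sum = init;
    for (size_t i = 0; i < n; i += VL) {
        size_t lanes = (n - i < VL) ? n - i : VL;

        /* Build the lane mask (stands in for the k register). */
        unsigned k = 0;
        for (size_t j = 0; j < lanes; j++)
            k |= (unsigned)(m[i + j] != 0) << j;

        /* Emulated compress: pack active lanes to the front, in order. */
        double packed[VL];
        size_t w = 0;
        for (size_t j = 0; j < lanes; j++)
            if ((k >> j) & 1u)
                packed[w++] = a[i + j];

        /* The popcount of the mask gives the trip count of the add loop. */
        int cnt = __builtin_popcount(k);
        for (int j = 0; j < cnt; j++)
            sum += packed[j];
    }
    return sum;
}
```

Both functions add the active elements in the same order, so they should
give bit-identical results; the open question is only whether the extra
mask move, compress and popcount per vector pay for themselves compared
with the simple scalar loop.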
