https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88497
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |powerpc*
          Component|middle-end                  |target

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think it is reassoc doing it "wrong" based on the target's reassoc-width?
Because the vectorizer generates exactly the code you are proposing.

Though you didn't even provide a fully compilable testcase. I guessed N to be
16 here, and your ideal examples use 'accumulator', which I assume to be 0.0.
Your x86 code-gen examples are also from GCC 8, I assume (plus some
-march/tune flag you didn't disclose, given it uses haddpd).
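
For illustration only, here is a minimal C sketch of the two summation shapes under discussion. It is not the reporter's testcase, which is not given in this comment. It assumes N is 16 and that the accumulator starts at 0.0, as guessed above; the function names are hypothetical. The first function keeps a single serial chain of additions, while the second splits the sum into two independent partial sums that are combined once at the end, which is roughly the shape of a two-lane vector reduction.

```c
#include <stddef.h>

#define N 16  /* assumption: the comment above guesses N to be 16 */

/* One serial dependency chain: each add waits on the previous one. */
double sum_serial(const double *a)
{
    double acc = 0.0;  /* assumption: 'accumulator' starts at 0.0 */
    for (size_t i = 0; i < N; i++)
        acc += a[i];
    return acc;
}

/* Two independent partial sums, combined once after the loop.
   Hypothetical illustration of a two-lane reduction, not GCC output. */
double sum_two_lanes(const double *a)
{
    double lane0 = 0.0, lane1 = 0.0;
    for (size_t i = 0; i < N; i += 2) {
        lane0 += a[i];      /* even elements */
        lane1 += a[i + 1];  /* odd elements */
    }
    return lane0 + lane1;   /* single final combine */
}
```

The two functions agree whenever every intermediate sum is exactly representable, but in general floating-point addition is not associative, so a compiler may only rewrite one form into the other when reassociation is permitted (for example under -ffast-math).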