https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #6 from Robin Dapp <rdapp at gcc dot gnu.org> ---
As convoluted (and redundant) as it looks, the optimized tree at least looks correct to me. Maybe a backend issue? But I don't see any costing for what we emit in the vectorizer, and I haven't yet found where we decide to go for this vectorization scheme. We even do predictive commoning on the scalars that get broadcast to vectors, but that's not the issue.