https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #5 from Li Pan <pan2.li at intel dot com> ---
(In reply to Robin Dapp from comment #4)
> Very weird indeed. It looks like we're not even vectorizing? I mean, sure,
> we use vector instructions but they are all broadcast from scalars?
> (VMAT_INVARIANT) And in the end we extract the first element without a
> reduction.
>
> Can't reproduce it on aarch64.

Yes, I think the tree.optimized may look like above (only VEC_EXTRACT); let me
double check and update later.