https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187
--- Comment #3 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > There is another bug report for a similar thing but with SSE and AVX2.
>
> yes PR 95960.

Ah yeah, I guess I wanted to start with something simpler that doesn't involve
cross lane operations. I could have written the above loop using just gcc
vector extensions.