https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118166
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2024-12-22

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Do you have a benchmark where using SSE would be better than the scalar
instructions? Especially considering that there are more than 2 ALUs, which
would cause the code that you provided to be better right now with the scalar
instructions, because there is no movement between the 2 register sets.