https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114556
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-04-03

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Without the align:
Inserting a partition copy on edge BB2->BB3 : PART.0 = PART.3
Inserting a value copy on edge BB2->BB3 : PART.1 = 2048

vs. with:
Inserting a partition copy on edge BB3->BB3 : PART.7 = PART.0
Inserting a partition copy on edge BB2->BB3 : PART.7 = PART.3
Inserting a value copy on edge BB2->BB3 : PART.1 = 2048

Basically, out-of-SSA is doing an extra move because it could not "merge" the
loop induction variable for the (vector) addition due to the different
alignment requirements ...
And then things go downhill from there.
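To make the difference between the two dumps concrete, here is a toy Python model of copy insertion on PHI edges. It is only a sketch and not GCC's out-of-SSA code: the SSA names (`iv_1`, `iv_2`, `base_0`, `n_1`, `n_2`), the `edge_copies` helper, and the assignment of names to partitions are all assumptions chosen so that the output mirrors the messages quoted in the comment. The idea it illustrates is that a copy is needed on an incoming edge whenever a PHI argument lives in a different partition from the PHI result, so failing to merge the induction variable across the back edge costs one extra copy per iteration.

```python
def edge_copies(phis, partition_of):
    """Return the copies needed on incoming edges for a list of PHI nodes.

    phis: list of (result_name, [(edge, argument), ...]); an argument is
          either an SSA name (str) or an integer constant.
    partition_of: maps each SSA name to its partition number.
    """
    copies = []
    for result, args in phis:
        dest = partition_of[result]
        for edge, arg in args:
            if isinstance(arg, int):
                # A constant always has to be materialised on the edge.
                copies.append(
                    f"Inserting a value copy on edge {edge} : PART.{dest} = {arg}")
            elif partition_of[arg] != dest:
                # Argument and result were not merged: a move is required.
                copies.append(
                    f"Inserting a partition copy on edge {edge} : "
                    f"PART.{dest} = PART.{partition_of[arg]}")
    return copies


# Hypothetical loop header BB3 with two PHIs: the induction variable and a counter.
phis = [
    ("iv_1", [("BB3->BB3", "iv_2"), ("BB2->BB3", "base_0")]),
    ("n_1", [("BB2->BB3", 2048), ("BB3->BB3", "n_2")]),
]

# Without the align: iv_1 and iv_2 share PART.0, so the back edge needs no copy.
merged = {"iv_1": 0, "iv_2": 0, "base_0": 3, "n_1": 1, "n_2": 1}

# With the align (assumed): iv_1 ends up alone in PART.7, so the back edge needs a copy.
split = {"iv_1": 7, "iv_2": 0, "base_0": 3, "n_1": 1, "n_2": 1}

if __name__ == "__main__":
    for label, parts in (("without the align", merged), ("with", split)):
        print(label + ":")
        for line in edge_copies(phis, parts):
            print(line)
```

In this model the only difference between the two runs is whether `iv_1` and `iv_2` share a partition; the extra `BB3->BB3` copy sits on the loop back edge, which is why it is paid on every iteration.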