https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83851
--- Comment #3 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
To give a few more details, the loop in question is:

.L3:
        mov     ip, r3
        add     r3, r3, #48
        cmp     r3, r4
        vld3.32 {d16, d18, d20}, [ip]!
        vld3.32 {d17, d19, d21}, [ip]
        vstmia  sp, {d16-d21}
        vld1.64 {d16-d17}, [sp:64]
        vst1.64 {d16-d17}, [lr:64]!
        bne     .L3

which apart from being awful code, has the effect of switching the elements.
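For readers less familiar with the NEON structure loads, here is a rough C model of the data movement in one iteration of that loop. It is only a sketch: the names vld3_32 and loop_body are made up for illustration, the registers are treated as plain arrays of 32-bit lanes, and byte order, alignment and lane numbering are ignored. It therefore shows which input words are meant to reach the output, not the element switch described above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of "vld3.32 {da, db, dc}, [p]": read two 3-element
   structures (six 32-bit words) and de-interleave them into three
   2-lane D registers. */
static void vld3_32(const uint32_t *p,
                    uint32_t da[2], uint32_t db[2], uint32_t dc[2])
{
    for (int lane = 0; lane < 2; lane++) {
        da[lane] = p[3 * lane + 0];
        db[lane] = p[3 * lane + 1];
        dc[lane] = p[3 * lane + 2];
    }
}

/* Hypothetical model of one loop iteration: 12 input words (48 bytes)
   are consumed and 4 output words (16 bytes) are produced. */
static void loop_body(const uint32_t in[12], uint32_t out[4])
{
    uint32_t d[6][2];   /* d[0]..d[5] stand in for d16..d21 */
    uint32_t stack[12]; /* the 48-byte spill slot at sp */

    assert(sizeof stack == sizeof d);

    vld3_32(in,     d[0], d[2], d[4]);   /* vld3.32 {d16, d18, d20}, [ip]! */
    vld3_32(in + 6, d[1], d[3], d[5]);   /* vld3.32 {d17, d19, d21}, [ip]  */
    memcpy(stack, d, sizeof stack);      /* vstmia  sp, {d16-d21}          */
    memcpy(out, stack, 4 * sizeof *out); /* vld1.64 from sp, vst1.64 to lr */
}
```

In this simplified model, feeding in the words 0..11 gives 0, 3, 6, 9 in out, i.e. the first field of each of the four 3-word structures. Only d16-d17 are used after the spill, so four of the six registers loaded are dead.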