https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112470
--- Comment #10 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The instruction increase is 2:
        sub     sp, sp, #128
...
        stp     x29, x30, [sp, 112]

vs:
        stp     x29, x30, [sp, -128]!

and
        ldp     x29, x30, [sp, 112]
...
        add     sp, sp, 128

vs:
        ldp     x29, x30, [sp], 128

Depending on the core, the performance might be the same. Without a full
performance testcase that shows the difference, it is hard to tell whether
this is an issue overall or just something that shows up in one small
testcase.
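
As a starting point for such a testcase, here is a minimal sketch of a call-heavy microbenchmark. Everything in it is an assumption made for illustration: the names frame_user, sink, and time_calls are hypothetical, and the 96-byte local buffer is only a guess at something that yields a frame of about 128 bytes; the actual prologue and epilogue depend on the compiler version and flags and should be checked in the generated assembly. The idea is to build it with both compilers and compare timings on the cores of interest.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical callee; noinline so that frame_user stays a non-leaf
   function and has to save and restore x29/x30. */
__attribute__((noinline))
uint64_t sink(const unsigned char *buf, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}

/* Hypothetical function under test. The local buffer forces a stack
   frame; the size is an assumption, not taken from the bug report. */
__attribute__((noinline))
uint64_t frame_user(unsigned char seed)
{
    unsigned char buf[96];
    memset(buf, seed, sizeof buf);
    return sink(buf, sizeof buf);
}

/* Call frame_user iters times and return the elapsed wall-clock time in
   seconds. The accumulated result is stored through acc so the calls
   cannot be optimized away. */
double time_calls(unsigned long iters, uint64_t *acc)
{
    struct timespec t0, t1;
    uint64_t total = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned long i = 0; i < iters; i++)
        total += frame_user((unsigned char)i);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *acc = total;
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

A driver would call time_calls with a large iteration count and print the returned time. Note that the function body here likely costs far more than the two extra instructions, so a measurable difference would probably need a much smaller body or a real workload.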