https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42575
--- Comment #20 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Wilco from comment #19)
> (In reply to Christophe Lyon from comment #18)
> > This is still wrong with current trunk.
>
> I don't see it happening since expansion of DImode instructions improved.
> The only case that uses an extra register is -mcpu=cortex-a9/-mcpu=cortex-a5
> with -O2 -mthumb:
>
>         mul     r3, r0, r3
>         push    {r4}
>         mov     r4, r1
>         umull   r0, r1, r0, r2
>         mla     r2, r2, r4, r3
>         ldr     r4, [sp], #4
>         add     r1, r1, r2
>         bx      lr
>
> I don't think we should expect perfect register allocation in severely
> constrained cases like this - scheduling can increase register pressure.

Interestingly this will be fixed by
https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00576.html:

        mul     r3, r0, r3
        mov     ip, r1
        umull   r0, r1, r0, r2
        mla     ip, r2, ip, r3
        add     r1, r1, ip
        bx      lr

With r12 as an extra temporary r4 no longer needs to be saved/restored.
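
For reference, both listings appear to compute the same thing: the low 64 bits of a 64x64-bit product, assembled from 32-bit pieces. Below is a minimal C sketch of that decomposition. It assumes the little-endian AAPCS layout (first operand in r0/r1, second in r2/r3). The function name mul64_by_parts is hypothetical and not taken from this bug report, and the register comments follow the second (ip) listing.

```c
#include <stdint.h>

/* Hypothetical helper: 64x64 -> 64-bit multiply built from 32-bit parts,
   mirroring the mul/umull/mla/add sequence. Assumes x = r1:r0, y = r3:r2. */
uint64_t mul64_by_parts(uint64_t x, uint64_t y)
{
    uint32_t xlo = (uint32_t)x, xhi = (uint32_t)(x >> 32);
    uint32_t ylo = (uint32_t)y, yhi = (uint32_t)(y >> 32);

    /* mul r3, r0, r3: low 32 bits of xlo * yhi */
    uint32_t cross = xlo * yhi;

    /* umull r0, r1, r0, r2: full 64-bit product of the low halves */
    uint64_t low = (uint64_t)xlo * ylo;

    /* mla ip, r2, ip, r3: add the low 32 bits of ylo * xhi */
    cross = ylo * xhi + cross;

    /* add r1, r1, ip: fold the cross terms into the high word */
    uint32_t hi = (uint32_t)(low >> 32) + cross;

    return ((uint64_t)hi << 32) | (uint32_t)low;
}
```

The xhi * yhi term is never computed because it only affects bits 64 and above. The sequence needs one scratch register to hold xhi (r1) while umull overwrites r1; that is the value kept in r4 in the first listing and in ip in the second.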