https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706
--- Comment #13 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
Created attachment 53812
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53812&action=edit
Test case with 32-bit integer.

This problem is still present in current master (future v13) and also occurs
with 32-bit integers.

> avr-gcc -S -Os -mul.c -fdump-rtl-ira

With v8, mul.s has 15 instructions.

With newer versions, mul.s has 26 additional instructions:

* 12 silly, useless stores into / loads from frame.
* 12 instructions to set up the frame.
* More instructions due to sub-optimal register alloc.
* Uses 6 bytes stack frame where v8 needs no frame at all.

In the IRA dump, there is:

Pass 0 for finding pseudo/allocno costs

    a0 (r53,l0) best NO_REGS, allocno NO_REGS
    a2 (r49,l0) best GENERAL_REGS, allocno GENERAL_REGS
    a1 (r48,l0) best NO_REGS, allocno NO_REGS
...
Pass 1 for finding pseudo/allocno costs

    r53: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
    r49: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
    r48: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
...
      Spill a0(r53,l0)
      Spill a1(r48,l0)
      Allocno a2r49 of GENERAL_REGS(30) ...

So there are 2 register spills for no reason that lead to that code bloat.
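
For comparing dumps across compiler versions, the spilled pseudos can be pulled out of an IRA dump mechanically. The following is only a minimal sketch: the function name summarize_ira_dump is hypothetical, and the two patterns are modelled solely on the dump lines quoted above, so they may need adjusting for other GCC versions or dump layouts.

```python
import re

# Patterns modelled on the dump excerpt quoted in this comment (assumption:
# other GCC versions may format these lines differently).
SPILL_RE = re.compile(r"^\s*Spill a(\d+)\(r(\d+),l(\d+)\)")
PASS0_RE = re.compile(r"^\s*a(\d+)\s*\(r(\d+),l(\d+)\) best (\w+), allocno (\w+)")


def summarize_ira_dump(text):
    """Return pseudo register numbers that IRA spilled or costed as NO_REGS."""
    spilled = []   # pseudos named in "Spill aN(rM,lK)" lines
    no_regs = []   # pseudos whose Pass 0 "best" class is NO_REGS
    for line in text.splitlines():
        m = SPILL_RE.match(line)
        if m:
            spilled.append(int(m.group(2)))
            continue
        m = PASS0_RE.match(line)
        if m and m.group(4) == "NO_REGS":
            no_regs.append(int(m.group(2)))
    return {"spilled": spilled, "no_regs": no_regs}
```

Feeding it the text of the file written by -fdump-rtl-ira (the exact file name depends on the GCC version) gives a short list of pseudos to compare between a good and a bad compiler.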