https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117366
Bug ID: 117366
Summary: arm thumb1 epilogue size optimizer violates -ffixed
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: matt.pa...@go-aps.com
Target Milestone: ---

The arm thumb1 epilogue size optimizer violates -ffixed-r4.

Compile the following minimal code snippet with these options:

    arm-none-eabi-gcc -Os -ffixed-r4 -march=armv5te -mthumb -mabi=apcs-gnu -S test_thumb.c -o test_thumb.s

test_thumb.c:
---
void func(void *, void *, void *, int, int);

int bad_func(void)
{
  int a, b, c;
  func(&a, &b, &c, 1, 2);
  return b;
}
---

In test_thumb.s:

Function prologue is:

    push {r0, r1, r2, r3, lr}

Function epilogue is:

    pop {r1, r2, r3, r4, pc}    <-- popping into r4 violates -ffixed-r4

The bug is in the function thumb1_extra_regs_pushed. When optimize_size is set, it optimizes the prologue and epilogue by pushing or popping extra dummy registers instead of emitting a separate sub/add on the stack pointer. The bug has been present since the function first appeared in gcc 4.6.0 and is still present in the git master branch.

Code snippet with the bug:

    while (reg_base + n_free < 8 && !(live_regs_mask & 1)
           && (for_prologue || call_used_or_fixed_reg_p (reg_base + n_free)))

The problem is that call_used_or_fixed_reg_p (and call_used_regs[reg_base + n_free], which was used prior to gcc 10) includes fixed registers, because a register named by -ffixed-reg is added to both call_used_regs and fixed_regs. Because the fixed register r4 is adjacent to the normal call-used registers r0-r3, it is included as a candidate free register. For the prologue that is fine, because pushing a fixed register to reserve dummy stack space does no harm. For the epilogue, popping into it corrupts the fixed register.

I tested the following fix, which changes the back half of the conditional expression to:

    ...
    && (for_prologue
        || (call_used_or_fixed_reg_p (reg_base + n_free)
            && !fixed_regs[reg_base + n_free])))

With the fixed register r4 excluded, n_free drops to a value for which amount > n_free * 4, so thumb1_extra_regs_pushed returns 0 and no dummy registers are popped. The corrected epilogue is:

    add sp, sp, #16
    pop {pc}