https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289

Bug ID: 98289
Summary: [x86] Suboptimal optimization of stack usage when function call to cold function is not needed
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---

void f(bool cond)
{
    if (cond)
        __builtin_abort();
}

On x86 with current trunk and -O3, this results in:

f(bool):
        sub     rsp, 8
        test    dil, dil
        jne     .L3
        add     rsp, 8
        ret
f(bool) [clone .cold]:
.L3:
        call    abort

This seems like a regression from GCC 7.5, which outputs:

f(bool):
        test    dil, dil
        jne     .L7
        rep ret
.L7:
        sub     rsp, 8
        call    abort

LLVM produces similar output.

Emitting the stack adjustment for the call only on the path that actually
makes the call seems quicker in the case where the call doesn't occur.
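
For anyone wanting to reproduce this locally, the following is a minimal sketch of a test file. The function f is taken verbatim from the report; run_hot_path is a hypothetical helper that is not part of the report and only exercises the path where the call does not occur. The suggested command line is an assumption: the report states only -O3, and -masm=intel is inferred from the Intel-syntax listings.

```cpp
// Reproducer sketch. One way to inspect the generated code (assumed flags):
//   g++ -O3 -S -masm=intel repro.cpp -o repro.s

// Function from the report: the abort call sits on the cold path only.
void f(bool cond)
{
    if (cond)
        __builtin_abort();
}

// Hypothetical helper (not in the report): calls f only with cond == false,
// so the path containing the call to abort is never taken.
// Returns the number of calls that returned normally.
int run_hot_path(int iterations)
{
    int completed = 0;
    for (int i = 0; i < iterations; ++i) {
        f(false);
        ++completed;
    }
    return completed;
}
```

Because f has external linkage, its standalone body should still be emitted in the assembly output even if the compiler inlines it into run_hot_path, so the prologue of f can be compared across compiler versions.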