https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164
            Bug ID: 64164
           Summary: [4.9/5 Regression] one more stack slot used due to one
                    less inlining level
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ipa
          Assignee: unassigned at gcc dot gnu.org
          Reporter: patrick.marlier at gmail dot com
              Host: x86_64-linux-gnu
            Target: x86_64-linux-gnu
             Build: x86_64-linux-gnu

Created attachment 34178
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34178&action=edit
testcase

$ cc -O2 -S -o stm.s stm.c
$ cc -O2 -DOPT -S -o stm-opt.s stm.c

If you compare the two outputs, the stm_load function uses one more stack
slot in stm.s (built without -DOPT) than in stm-opt.s.

The only difference between the two builds is this:

static inline size_t AO_myload2(const volatile size_t *addr)
{
  return *(size_t *)addr;
}

static inline size_t AO_myload(const volatile size_t *addr)
{
#ifdef OPT
  size_t result = AO_myload2(addr);
#else
  size_t result = *(size_t *)addr;
#endif
  return result;
}

Adding one more level of inlining should produce the same code, not better
code.

GCC 4.8 does not have this problem: the generated code is the same for both
builds.
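
For readers without the attachment, here is a minimal sketch that wraps the two
variants above in a compilable file. The caller demo_load is hypothetical and
only stands in for stm_load, whose real body is in attachment 34178. A caller
this small is not expected to reproduce the extra stack slot by itself; it
only shows the shape of the comparison.

```c
#include <stddef.h>

/* Extra inlining level, used only when built with -DOPT. */
static inline size_t AO_myload2(const volatile size_t *addr)
{
  return *(size_t *)addr;
}

static inline size_t AO_myload(const volatile size_t *addr)
{
#ifdef OPT
  size_t result = AO_myload2(addr);   /* load through one more inline call */
#else
  size_t result = *(size_t *)addr;    /* direct load */
#endif
  return result;
}

/* Hypothetical stand-in for stm_load (assumption, not from the testcase). */
size_t demo_load(const volatile size_t *addr)
{
  return AO_myload(addr);
}
```

Saved under an assumed name such as demo.c, it can be built twice in the same
way as the testcase (cc -O2 -S, with and without -DOPT) and the two assembly
files compared with diff. Adding GCC's -fstack-usage option also writes a .su
file with the per-function stack size, which may be easier to compare than
reading the assembly.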