https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88456
--- Comment #3 from Patrick Oppenlander <patrick at motec dot com.au> ---
(In reply to jos...@codesourcery.com from comment #2)
> If the call is one GCC can't expand on its own (atomic operations on large
> objects needing locks, architecture lacks required atomic operation
> instructions, etc.), it would be reasonable for GCC to inline a definition
> to which it would otherwise generate an out-of-line call.

What you describe is the current behaviour when using LTO. gcc happily inlines
the implementations of atomic library functions for which it can't expand
builtins.

For context, I came across this problem while implementing atomic support on
ARM Cortex-M0. Cortex-M0 doesn't support load/store-exclusive so a full suite
of functions must be provided.

I then built the same project targeting Cortex-M4 for which the Cortex-M0
implementations are not optimal, but should still work. However, the resultant
binary used gcc provided builtins in some places and my Cortex-M0
implementations in others.

I think it needs to be consistently one way or the other, or fail to build in
this situation.

Personally, I like the concept of being able to provide external
implementations, especially when considering bare-metal embedded programming.
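
For anyone reading along, here is a rough sketch of the kind of external implementation I mean. It is not the code from my project. The function has the same shape as the out-of-line helper gcc calls for a 4-byte fetch-add (__atomic_fetch_add_4), but it is renamed to the hypothetical m0_atomic_fetch_add_4 so that it builds on a host compiler without clashing with the builtin. irq_save/irq_restore are also hypothetical names; they are stubbed here, whereas on a real Cortex-M0 they would save PRIMASK, disable interrupts, and restore the saved state afterwards.

```c
#include <stdint.h>

/* Hypothetical critical-section hooks (assumption, not a real API).
 * Stubbed with a nesting counter so the sketch compiles on a host;
 * on Cortex-M0 these would mask and unmask interrupts instead. */
static unsigned irq_depth;

static inline uint32_t irq_save(void)
{
    /* Return the previous state and enter the critical section. */
    return irq_depth++;
}

static inline void irq_restore(uint32_t state)
{
    /* Leave the critical section by restoring the previous state. */
    irq_depth = state;
}

/* Same signature shape as __atomic_fetch_add_4(ptr, val, memorder).
 * The memory order is ignored: on a single-core part with interrupts
 * masked, the read-modify-write below cannot be interleaved. */
uint32_t m0_atomic_fetch_add_4(volatile void *ptr, uint32_t val, int memorder)
{
    volatile uint32_t *p = ptr;
    uint32_t state;
    uint32_t old;

    (void)memorder;

    state = irq_save();
    old = *p;        /* load the current value */
    *p = old + val;  /* store the updated value */
    irq_restore(state);

    return old;      /* fetch-add returns the value before the add */
}
```

This only makes sense on a single-core target where masking interrupts is enough to make the sequence indivisible. On Cortex-M4 the compiler can expand the same operation inline with load/store-exclusive, which is where the mix of the two approaches in one binary came from.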