https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105495
Bug ID: 105495
Summary: `__atomic_compare_exchange` prevents tail-call optimization
Product: gcc
Version: 11.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: lh_mouse at 126 dot com
Target Milestone: ---

Godbolt: https://gcc.godbolt.org/z/7ob6zc17P

Offending testcase:

```c
typedef struct { int b; } cond;

int __MCF_batch_release_common(cond* p, int c);

int
_MCF_cond_signal_some(cond* p, int x)
  {
    cond c = {x}, n = {2};
    __atomic_compare_exchange(p, &c, &n, 1, 0, 0);
    return __MCF_batch_release_common(p, x);
  }
```

GCC output:

```asm
_MCF_cond_signal_some:
        sub     rsp, 24
        mov     edx, 2
        mov     eax, esi
        mov     DWORD PTR [rsp+12], esi
        lock cmpxchg    DWORD PTR [rdi], edx
        je      .L2
        mov     DWORD PTR [rsp+12], eax   <------- note this extra store, which clang doesn't generate
.L2:
        call    __MCF_batch_release_common
        add     rsp, 24
        ret
```

Clang output:

```asm
_MCF_cond_signal_some:                  # @_MCF_cond_signal_some
        mov     ecx, 2
        mov     eax, esi
        lock            cmpxchg dword ptr [rdi], ecx
        jmp     __MCF_batch_release_common      # TAILCALL
```

1. If `cond` was defined as a scalar type such as `long`, there is no such issue.
2. `__atomic_exchange` doesn't suffer from this issue.
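For comparison, note 1 can be tried with a variant of the testcase in which `cond` is a plain `long`. The following is only a minimal sketch: the names `cond_signal_some_scalar` and `batch_release_stub` are hypothetical stand-ins (the stub replaces `__MCF_batch_release_common` so that the snippet links on its own), and the literal `0` memory orders from the testcase are spelled as `__ATOMIC_RELAXED`. The code generated for this variant was not checked here; the report only states that the scalar form does not show the problem.

```c
/* Scalar variant of the testcase: `cond` is a long instead of a struct. */
typedef long cond;

/* Hypothetical stand-in for __MCF_batch_release_common, defined here only
   so the sketch is self-contained. noinline keeps it a real call target. */
__attribute__((noinline))
int batch_release_stub(cond* p, int c)
{
    return (int)*p + c;
}

/* Same shape as _MCF_cond_signal_some: a weak, relaxed compare-exchange
   whose result is ignored, followed by a call in tail position. */
int cond_signal_some_scalar(cond* p, int x)
{
    cond c = x, n = 2;
    /* weak = 1; __ATOMIC_RELAXED corresponds to the 0 used in the report. */
    __atomic_compare_exchange(p, &c, &n, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
    return batch_release_stub(p, x);
}
```

To inspect the call sequence, the stub's definition can be replaced with an `extern` declaration and the file compiled with `-O2 -S`, as in the Godbolt link above.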