https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94804
--- Comment #1 from Gabriel Ravier <gabravier at gmail dot com> ---
For subtraction, it's even worse.

using i128 = __int128;

i128 sub128(i128 a, i128 b)
{
    return a - b;
}

results in

sub128(__int128, __int128):
  mov rax, rdi
  sub rax, rdx
  sbb rsi, rcx
  mov rdx, rsi
  ret

with LLVM and

sub128(__int128, __int128):
  mov r9, rdi
  mov r8, rsi
  mov rdi, r8
  mov rax, r9
  mov r8, rdx
  sub rax, r8
  mov rdx, rdi
  sbb rdx, rcx
  ret

with GCC.

The excess of `mov`s feels to me like there is some sort of bug in the 128-bit register allocator or something like that.
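
For reference, here is a rough sketch of what the `sub`/`sbb` pair computes, written out on 64-bit halves. It assumes the x86-64 System V calling convention, where `a` arrives in rdi (low) and rsi (high), `b` in rdx (low) and rcx (high), and the result is returned in rax (low) and rdx (high). The helper name `sub128_by_halves` and the `self_check` function are made up for illustration and are not part of the report. It also relies on GCC/Clang's modular conversion between `unsigned __int128` and `__int128`.

```cpp
#include <cassert>
#include <cstdint>

using i128 = __int128;
using u128 = unsigned __int128;

// The function from the report: compiler-provided 128-bit subtraction.
i128 sub128(i128 a, i128 b) { return a - b; }

// Hypothetical helper (not from the report): the same subtraction spelled
// out on 64-bit halves, mirroring the sub/sbb instruction pair.
i128 sub128_by_halves(i128 a, i128 b)
{
    const u128 ua = static_cast<u128>(a);
    const u128 ub = static_cast<u128>(b);

    const std::uint64_t a_lo = static_cast<std::uint64_t>(ua);        // rdi
    const std::uint64_t a_hi = static_cast<std::uint64_t>(ua >> 64);  // rsi
    const std::uint64_t b_lo = static_cast<std::uint64_t>(ub);        // rdx
    const std::uint64_t b_hi = static_cast<std::uint64_t>(ub >> 64);  // rcx

    const std::uint64_t lo = a_lo - b_lo;            // sub: low halves, wraps mod 2^64
    const std::uint64_t borrow = (a_lo < b_lo);      // carry flag left by the sub
    const std::uint64_t hi = a_hi - b_hi - borrow;   // sbb: high halves minus borrow

    return static_cast<i128>((static_cast<u128>(hi) << 64) | lo);
}

// Small usage example: both versions should agree on non-overflowing inputs.
void self_check()
{
    const i128 two_pow_64 = static_cast<i128>(1) << 64;
    assert(sub128_by_halves(5, 3) == sub128(5, 3));
    assert(sub128_by_halves(0, 1) == sub128(0, 1));                    // borrow into the high half
    assert(sub128_by_halves(two_pow_64, 1) == sub128(two_pow_64, 1));  // borrow out of the low half
}
```

Only two arithmetic instructions are needed, plus whatever moves it takes to get the halves into the return registers, which is why the four-instruction LLVM sequence looks close to minimal and the extra `mov`s in the GCC output stand out.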