https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108471

            Bug ID: 108471
           Summary: Suboptimal codegen for __int128 subtraction on x86_64
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rl.alt.accnt at gmail dot com
  Target Milestone: ---

On x86_64, GCC generates an excessive number of redundant `mov` instructions
for `__int128` subtraction in C/C++. Stepping through compiler versions on
godbolt shows that the codegen regressed starting with GCC 9.1. See also
https://godbolt.org/z/86v6ar457

The code:
```
__int128 sub(__int128 a, __int128 b) { return a - b; }
```

At -O3 or -O2, GCC (trunk) generates:
```
sub:
        mov     r8, rdi
        mov     rax, rsi
        mov     rsi, r8
        mov     rdi, rax
        mov     r8, rdx
        mov     rax, rsi
        mov     rdx, rdi
        sub     rax, r8
        sbb     rdx, rcx
        ret
```
Interestingly, the use of `r8` in the first three instructions disappears when
compiling with -O1, and those instructions are folded into two `mov`s instead.

By contrast, Clang (also at -O3) generates:
```
sub:
        mov     rax, rdi
        sub     rax, rdx
        sbb     rsi, rcx
        mov     rdx, rsi
        ret
```
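In both listings the essential work is just one `sub` and one `sbb`; everything
else is register shuffling. As a rough illustration of why only those two
instructions are fundamentally needed, here is a minimal C sketch of the
half-by-half lowering (the struct and helper name are illustrative, not part of
the original testcase):
```
#include <stdint.h>

/* Hypothetical helper showing what `a - b` amounts to for __int128:
 * a plain subtract of the low 64-bit halves (sub), then a subtract of
 * the high halves with the borrow folded in (sbb). */
typedef struct { uint64_t lo, hi; } u128_parts;

static u128_parts sub128(u128_parts a, u128_parts b) {
    u128_parts r;
    /* low halves: subtract and record the borrow */
    unsigned borrow = __builtin_sub_overflow(a.lo, b.lo, &r.lo);
    /* high halves: subtract, then take the borrow into account */
    r.hi = a.hi - b.hi - borrow;
    return r;
}
```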

This is probably not a high-priority bug; I just wanted to bring attention to
the issue.
