https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122272

            Bug ID: 122272
           Summary: Missed optimizations in classic "gcd" implementation
           Product: gcc
           Version: 15.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: matt at godbolt dot org
  Target Milestone: ---

In the straightforward code:

```
unsigned gcd(unsigned a, unsigned b) {
  if (b == 0) return a;
  return gcd(b, a % b);
}
```

GCC for x86 at -O2 (or -O3) generates the following assembly, which has a
number of redundant register moves (marked with ?):

```asm
gcd(unsigned int, unsigned int):
  mov eax, edi      ; eax (return value) = a
  test esi, esi     ; is b == 0?
  jne .L2           ; if b != 0; goto starting point L2
  jmp .L10          ; if b = 0; goto special case

.L11:               ; loop top
  mov esi, edx      ; set b to be a % b from the previous iteration

.L2:
  xor edx, edx      ; set up for divide
  div esi           ; eax = edx:eax (0:a) / esi (b)
                    ; edx = edx:eax (0:a) % esi (b)
  mov eax, esi      ; eax = b
  test edx, edx     ; was a % b zero?
  jne .L11          ; if not, loop around
  mov eax, esi      ; return esi (? eax is already esi)
  ret               ; returns a
.L10:
; This seems daft; when we get here, eax already has
; the right value, so the compiler has overcomplicated things.
  mov esi, edi      ; (?) esi = a
  mov eax, esi      ; (?) eax (return value) = esi
  ret               ; returns a
```

Nothing obvious (to me, anyway) in the optimisation report.

Clang 21 produces much tighter code (unannotated here):

```
gcd(unsigned int, unsigned int):
  mov eax, edi
  test esi, esi
  je .LBB0_4
  mov edx, esi
.LBB0_2:
  mov ecx, edx
  xor edx, edx
  div ecx
  mov eax, ecx
  test edx, edx
  jne .LBB0_2
  mov eax, ecx
.LBB0_4:
  ret
```

CE link: https://godbolt.org/z/34jfKrW57

Reply via email to