https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103855

            Bug ID: 103855
           Summary: Missed optimization: 64bit division used instead of
                    32bit division
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zhaoweiliew at gmail dot com
  Target Milestone: ---

Compiler Explorer link: https://gcc.godbolt.org/z/KvW8sMsqz

I compiled the following code with `x86-64 gcc (trunk) -std=c++20 -O3 -Wall
-Wextra -Werror`:

```
unsigned int optimized(unsigned int a, unsigned int b) {
    return (unsigned long long)a / b;
}

unsigned int unoptimized(unsigned int a, unsigned int b) {
    unsigned long long all = a;
    return all / b;
}
```

This is the assembly output:

```
optimized(unsigned int, unsigned int):
        mov     eax, edi
        xor     edx, edx
        div     esi
        ret
unoptimized(unsigned int, unsigned int):
        mov     eax, edi
        mov     esi, esi
        xor     edx, edx
        div     rsi
        ret
```

GCC uses a 64-bit divide for `unoptimized()` (note the `mov esi, esi` zero-extending
the divisor), even though a 32-bit divide would be equivalent and faster: the
dividend is a zero-extended 32-bit value, so the quotient always fits in 32 bits.
GCC uses a 32-bit divide for `optimized()`, which is fine. Note that LLVM emits a
32-bit division in both cases.

I would like to tackle this optimization, but I'm not sure how to go about it.
Could someone tell me, or point me to resources that explain, which part of the
compiler and codebase I should be looking at? Thanks!
