https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103855
Bug ID: 103855
Summary: Missed optimization: 64-bit division used instead of 32-bit division
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: zhaoweiliew at gmail dot com
Target Milestone: ---

Compiler Explorer link: https://gcc.godbolt.org/z/KvW8sMsqz

I compiled the following code with `x86-64 gcc (trunk) -std=c++20 -O3 -Wall -Wextra -Werror`:

```
unsigned int optimized(unsigned int a, unsigned int b) {
    return (unsigned long long)a / b;
}

unsigned int unoptimized(unsigned int a, unsigned int b) {
    unsigned long long all = a;
    return all / b;
}
```

This is the assembly output:

```
optimized(unsigned int, unsigned int):
        mov     eax, edi
        xor     edx, edx
        div     esi
        ret
unoptimized(unsigned int, unsigned int):
        mov     eax, edi
        mov     esi, esi
        xor     edx, edx
        div     rsi
        ret
```

GCC uses a 64-bit divide for `unoptimized()`, when a 32-bit divide would be equivalent and faster. GCC uses a 32-bit divide for `optimized()`, which is fine.

Note that LLVM emits a 32-bit division in both cases.

I would like to tackle this optimization, but I am not sure how to go about it. Could someone point me to the part of the compiler and codebase I should be looking at, or to resources that explain it? Thanks!
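For what it's worth, here is my reasoning for why the narrowing should always be valid: since the dividend is a zero-extended 32-bit value, `a < 2^32`, and `b >= 1` (division by zero is undefined), so the quotient satisfies `a / b <= a < 2^32`. The 64-bit quotient therefore always fits in 32 bits, and a 32-bit `div` produces the same result. A minimal sanity-check sketch of this equivalence (the `narrow_div` helper is hypothetical, written only for illustration):

```
#include <cassert>

// Hypothetical narrowed version: divide as 32-bit values.
// Valid because a / b <= a < 2^32 whenever b >= 1, so the
// 64-bit quotient always fits in an unsigned int.
unsigned int narrow_div(unsigned int a, unsigned int b) {
    return a / b;
}

int main() {
    // Spot-check a few boundary values (not exhaustive).
    unsigned int as[] = {0u, 1u, 0x7FFFFFFFu, 0xFFFFFFFFu};
    unsigned int bs[] = {1u, 2u, 3u, 0xFFFFFFFFu};
    for (unsigned int a : as)
        for (unsigned int b : bs)
            assert((unsigned long long)a / b == narrow_div(a, b));
    return 0;
}
```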