https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102014
Bug ID: 102014
Summary: [missed optimization] __uint128_t % uint64_t emits a
call to __umodti3 instead of div instruction
Product: gcc
Version: 11.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: kamkaz at windowslive dot com
Target Milestone: ---
The following code:
#include <stdint.h>
extern uint64_t safe_mul(uint64_t a, uint64_t b, uint64_t n) {
return (((__uint128_t)a)*b)%n;
}
compiled with -O2 for the x86_64 architecture generates the following assembly:
safe_mul(unsigned long, unsigned long, unsigned long):
mov rax, rdi
mov r8, rdx
sub rsp, 8
xor ecx, ecx
mul rsi
mov rsi, rdx
mov rdi, rax
mov rdx, r8
call __umodti3
add rsp, 8
ret
with a call to __umodti3, while it could be compiled to:
safe_mul(unsigned long, unsigned long, unsigned long):
mov rax, rdi
mov rcx, rdx
mul rsi
div rcx
mov rax, rdx
ret
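For reference, a minimal sketch of the mul/div sequence written in C with GCC-style inline asm. The helper name mulmod_div is hypothetical and not part of the report. One caveat it makes explicit: the x86-64 div instruction raises a divide error when the quotient does not fit in 64 bits, which happens whenever the high half of the product is >= n, so the sketch first reduces the high half modulo n. On targets other than x86-64 it simply falls back to the __uint128_t expression.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the report): (a * b) % n via one hardware div. */
static uint64_t mulmod_div(uint64_t a, uint64_t b, uint64_t n)
{
    assert(n != 0);                      /* division by zero is undefined */

    __uint128_t p = (__uint128_t)a * b;  /* full 128-bit product */
    uint64_t lo = (uint64_t)p;
    uint64_t hi = (uint64_t)(p >> 64);

    /* (hi * 2^64 + lo) % n == ((hi % n) * 2^64 + lo) % n, and hi % n < n
       guarantees that the quotient of the 128/64 division fits in 64 bits. */
    hi %= n;

#if defined(__x86_64__)
    uint64_t q, r;
    /* div: rax = rdx:rax / operand, rdx = rdx:rax % operand */
    __asm__("divq %4"
            : "=a"(q), "=d"(r)
            : "a"(lo), "d"(hi), "rm"(n)
            : "cc");
    (void)q;                             /* only the remainder is needed */
    return r;
#else
    /* Portable fallback, assumed equivalent to the asm path above. */
    return (uint64_t)((((__uint128_t)hi << 64) | lo) % n);
#endif
}
```

If the caller can guarantee a < n or b < n, the high half is already smaller than n and the extra reduction step is a no-op.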
The same thing happens with the division __uint128_t / uint64_t, which emits an
unnecessary call to __udivti3 instead of a div instruction.
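A reproducer for the division case might look like the sketch below. The function name mul_div_trunc is hypothetical, and the cast back to uint64_t is an assumption made here to mirror safe_mul; the report does not include this code. Note that the full quotient of a 128-bit by 64-bit division can exceed 64 bits, so a single div is only valid when the high half of the product is smaller than n.

```c
#include <stdint.h>

/* Hypothetical reproducer (not from the report): 128-bit product divided by
   a 64-bit value, with the quotient truncated to its low 64 bits. */
uint64_t mul_div_trunc(uint64_t a, uint64_t b, uint64_t n)
{
    return (uint64_t)(((__uint128_t)a * b) / n);
}
```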