http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51837
Bug #: 51837
Summary: Use of result from 64*64->128 bit multiply via __uint128_t not optimized
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: middle-end
AssignedTo: [email protected]
ReportedBy: [email protected]
unsigned long long foo(unsigned long long x, unsigned long long y)
{
    __uint128_t z = (__uint128_t)x * y;
    return z ^ (z >> 64);
}
On x86-64 this compiles into:
    mov %rsi, %rax
    mul %rdi
    mov %rax, %r9
    mov %rdx, %rax
    xor %r9, %rax
    retq
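The two mov instructions after the mul are not needed. mul leaves the low half of the product in %rax and the high half in %rdx, and xor is commutative, so the two halves can be combined directly. The above is equivalent to: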
    mov %rsi, %rax
    mul %rdi
    xor %rdx, %rax
    retq
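
For anyone wanting to check the equivalence at the source level, here is a minimal sketch. foo is the testcase from this report; foo_split is a hypothetical variant (not part of the original report) that names the low and high halves explicitly, mirroring what the shorter sequence computes with %rax and %rdx. It assumes a compiler that provides __uint128_t, such as GCC on x86-64.

```c
#include <assert.h>

/* Testcase from the report: XOR the two 64-bit halves of the 128-bit product. */
unsigned long long foo(unsigned long long x, unsigned long long y)
{
    __uint128_t z = (__uint128_t)x * y;
    return z ^ (z >> 64);
}

/* Hypothetical variant for illustration: the same computation with the
 * halves split out. lo corresponds to %rax and hi to %rdx after mul. */
unsigned long long foo_split(unsigned long long x, unsigned long long y)
{
    __uint128_t z = (__uint128_t)x * y;
    unsigned long long lo = (unsigned long long)z;         /* low 64 bits  */
    unsigned long long hi = (unsigned long long)(z >> 64); /* high 64 bits */
    return lo ^ hi;
}

/* Hypothetical helper: abort if the two forms disagree for a given input. */
void check_same(unsigned long long x, unsigned long long y)
{
    assert(foo(x, y) == foo_split(x, y));
}
```

Comparing the assembly for the two functions (for example with gcc -O2 -S) should show whether the extra mov instructions appear in both forms; the result may differ between compiler versions.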