https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109896
--- Comment #3 from Thiago Macieira <thiago at kde dot org> ---
(In reply to H.J. Lu from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > I suspect the overflow code was added before __builtin_*_overflow were added
> > which is why the generated code is this way.
>
> Should the C++ front-end use __builtin_mul_overflow?
That's what that code is doing, yes.
But mind you, not all element sizes result in an actual multiplication. That's
why I picked the unusual size of 47.
When the size is a power of 2, the overflow check is just a bit test. For example, with 16:
    movq    %rdi, %rax
    shrq    $59, %rax
    jne     .L2
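To spell out why the shift is enough, here is a minimal C++ sketch. It is not GCC's code: the function names are made up for illustration, and the byte limit of 2^63 - 1 is an assumption inferred from the constants in these listings on x86-64.

```cpp
#include <cassert>
#include <cstdint>

// Assumed byte limit (2^63 - 1), inferred from the listings; not taken from GCC's source.
constexpr std::uint64_t kMaxBytes = INT64_MAX;

// Hypothetical helper: true if n elements of 16 bytes exceed the limit.
constexpr bool overflows_by_compare(std::uint64_t n) {
    return n > kMaxBytes / 16;          // kMaxBytes / 16 == 2^59 - 1
}

// The same test as a bit check: is any bit at position 59 or above set?
// This is what the shrq $59 / jne pair does.
constexpr bool overflows_by_shift(std::uint64_t n) {
    return (n >> 59) != 0;
}

// Check that both forms agree on either side of the boundary.
inline void check_boundary() {
    constexpr std::uint64_t limit = std::uint64_t{1} << 59;
    assert(overflows_by_compare(limit - 1) == overflows_by_shift(limit - 1));
    assert(overflows_by_compare(limit) == overflows_by_shift(limit));
}
```

Because the limit divided by a power of 2 is itself one less than a power of 2, "n is above the limit" is the same as "some high bit of n is set".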
Other sizes do a compare against a constant, but there's still no multiplication involved. For 24:
    movabsq $384307168202282325, %rax
    cmpq    %rdi, %rax
    jb      .L2
    leaq    (%rdi,%rdi,2), %rdi
    salq    $3, %rdi
That's 5 instructions and 4 cycles (not including front-end decode), so roughly
the same as the imulq example above (4 cycles), but with far more ports to
dispatch to.
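For reference, a rough C++ equivalent of the size-24 sequence, next to a version written with __builtin_mul_overflow. This is only a sketch: the function names are hypothetical, and the 2^63 - 1 byte limit is an assumption inferred from the movabsq constant, not taken from GCC's source.

```cpp
#include <cassert>
#include <cstdint>

// Assumed byte limit (2^63 - 1), inferred from the listing.
constexpr std::uint64_t kMaxBytes = INT64_MAX;
// Largest element count that still fits: the movabsq constant.
constexpr std::uint64_t kMaxCount24 = kMaxBytes / 24;
static_assert(kMaxCount24 == 384307168202282325ULL, "matches the listing");

// Hypothetical helper mirroring the listing: compare, then lea + sal.
// Returns false if the byte count would exceed the limit.
inline bool array_bytes_24(std::uint64_t n, std::uint64_t &bytes) {
    if (n > kMaxCount24)        // cmpq / jb
        return false;
    bytes = (n + n * 2) << 3;   // leaq (%rdi,%rdi,2) gives 3n; salq $3 gives 24n
    return true;
}

// The same check using the GCC/Clang builtin, with an explicit limit test
// because the builtin alone only detects wraparound past 2^64.
inline bool array_bytes_24_builtin(std::uint64_t n, std::uint64_t &bytes) {
    return !__builtin_mul_overflow(n, std::uint64_t{24}, &bytes)
           && bytes <= kMaxBytes;
}

// Check that both versions agree on either side of the boundary.
inline void check_agreement() {
    std::uint64_t a = 0, b = 0;
    assert(array_bytes_24(kMaxCount24, a) && array_bytes_24_builtin(kMaxCount24, b));
    assert(a == b);
    assert(!array_bytes_24(kMaxCount24 + 1, a));
    assert(!array_bytes_24_builtin(kMaxCount24 + 1, b));
}
```

The first form needs no multiply at all because the comparison constant is known at compile time and 24n decomposes into a lea and a shift.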