https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113560
Alexander Monakov <amonakov at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org
--- Comment #3 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to Roger Sayle from comment #2)
> The costs look sane, and I'd expect the synth_mult generated sequence to be
> faster, though it would be good to get some microbenchmarking.
> A reduced test case is:
> __int128 foo(__int128 x) { return x*100; }
This is not an equivalent testcase: mulx is a widening multiply from 64-bit
source operands, with a latency of 3 or 4 cycles on most implementations.
Costing it as a synthesized general 128-bit multiplication is wrong.
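
To make the distinction concrete, here is a minimal sketch contrasting the two
shapes of multiplication. The function name widening_mul is hypothetical and
is not taken from the original report; foo is the reduced test case quoted
from comment #2. Whether the compiler actually selects mulx for the first
function is an assumption that depends on target flags (for example -mbmi2)
and on the surrounding code; without BMI2 a plain mul may be used instead.

```c
#include <stdint.h>

/* Widening multiply: both inputs are 64-bit, the result is the full
   128-bit product. This is the operation a single mul/mulx performs. */
unsigned __int128 widening_mul(uint64_t a, uint64_t b)
{
    return (unsigned __int128)a * b;
}

/* General 128-bit multiply by a constant (test case from comment #2).
   The input is already 128 bits wide, so the low and high halves both
   contribute to the result and one widening multiply is not enough. */
__int128 foo(__int128 x)
{
    return x * 100;
}
```

Comparing the output of gcc -O2 -S for the two functions should show whether
the first is costed as a single widening multiply or as a general 128-bit
multiplication.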