https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113560
Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #3 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to Roger Sayle from comment #2)
> The costs look sane, and I'd expect the synth_mult generated sequence to be
> faster, though it would be good to get some microbenchmarking.
> A reduced test case is:
> __int128 foo(__int128 x) { return x*100; }

This is not an equivalent testcase, mulx is a widening multiply from 64-bit
source operands. It has latency 3 or 4 on most implementations. Costing it as a
synthesized general 128-bit multiplication is wrong.
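
For readers following the distinction, here is a minimal sketch in C of the two shapes being compared. It is not the test case from the original report (which is not quoted in this comment); the function names mul128_by_100 and widen_mul_by_100 are hypothetical and chosen only for illustration. It assumes a 64-bit target where GCC or Clang provides __int128.

```c
#include <stdint.h>

/* General 128-bit multiply: the operand is already a full 128-bit value.
   This is the shape of the reduced test case quoted above. */
__int128 mul128_by_100(__int128 x)
{
    return x * 100;
}

/* Widening multiply: the source operand is 64 bits wide and only the
   result is 128 bits. This is the 64x64->128 shape that a single
   mul/mulx instruction computes on x86-64 (assumed shape, see lead-in). */
unsigned __int128 widen_mul_by_100(uint64_t x)
{
    return (unsigned __int128)x * 100;
}
```

In the widening form the high half of the source is known to be zero, so no cross terms are needed; in the general form the high half of x also contributes to the result. Which instructions a compiler actually emits for either function depends on the target, tuning, and optimization flags, so inspecting the -S output is the way to check.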