https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113560

--- Comment #5 from accelerator0099 at gmail dot com ---
If we are targeting an arch without BMI2, we can use a single MUL instruction
instead. Here is the description of MUL reg64/mem64:
Multiplies a 64-bit register or memory operand by the contents of the RAX
register and stores the 128-bit result in the RDX:RAX register pair, with the
high-order bits of the product in RDX and the low-order bits in RAX.
And on Zen 4, it takes 3 or 4 cycles to do this.
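
For illustration, here is a minimal sketch of the kind of source that should map
onto a single widening MUL. It assumes a 64-bit target and the GCC/Clang
`unsigned __int128` extension; the helper names `mul_hi64` and `mul_lo64` are
made up for this example, and the instruction actually selected still depends
on the compiler version and the -march/-mtune flags.

```c
#include <stdint.h>

/* Hypothetical helper: high 64 bits of the full 128-bit product a * b.
 * With a plain MUL this is the half that ends up in RDX. */
uint64_t mul_hi64(uint64_t a, uint64_t b)
{
    unsigned __int128 product = (unsigned __int128)a * b;
    return (uint64_t)(product >> 64);
}

/* Hypothetical helper: low 64 bits of the same product.
 * With a plain MUL this is the half that ends up in RAX. */
uint64_t mul_lo64(uint64_t a, uint64_t b)
{
    unsigned __int128 product = (unsigned __int128)a * b;
    return (uint64_t)product;
}
```

Compiling this with -O2 and inspecting the assembly (gcc -O2 -S) is one way to
check whether a single MUL or a MULX is generated for a given -march setting.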
