On 11/17/2017 04:09 PM, Palmer Dabbelt wrote:
From: Kito Cheng <kito.ch...@gmail.com>

2017-11-17  Kito Cheng  <kito.ch...@gmail.com>

         * longlong.h [__riscv] (__umulsidi3): Define.
         [__riscv] (umul_ppmm) Likewise.
         [__riscv] (__muluw3) Likewise.

Apparently the point of this is that by defining __mulsi3/__muldi3 as an extended asm, we get better register allocation, and hence better performance. It would be nice if this info was provided when a patch was submitted, so that we don't have to figure it out ourselves.

I see one minor issue though

+    (w0) = __muluw3 (__ll_lowpart (__x1), __ll_B)                      \
+          + __ll_lowpart (__x0);                                       \

The multiply here is just a shift and should be written as a shift by (W_TYPE_SIZE / 2).

You copied this code from the default definition which uses *, and assumes that the compiler will optimize the multiply into a shift. However, because __muluw3 expands into a call to __mulsi3/__muldi3 for a risc-v part with no multiply, the optimization will not take place here, and you end up with one extra unnecessary multiply operation.

Otherwise this looks OK.

Jim

Reply via email to