@tqchen The problem arises because LLVM codegen is unable to select the suitable instructions. A fixed-point multiply at the Relay level has to upcast the input tensors to int64, whereas the ARM instructions that @giuseros shared take int32 inputs and perform the widening internally in hardware (please correct me if I am wrong - @giuseros). As a result, QNN/Relay graphs today do not use the best possible ARM instructions.
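For reference, here is a minimal C sketch (gemmlowp-style rounding-doubling high multiply, not TVM's actual lowering) of why the generic path needs int64 intermediates; an instruction like ARM's SQRDMULH performs this whole operation on int32 lanes in a single instruction:

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Generic fixed-point multiply step: the int32 operands must be widened to
 * int64 before the (doubled, rounded) high half of the product is taken. */
static int32_t rounding_doubling_high_mul(int32_t a, int32_t b)
{
    bool overflow = (a == b) && (a == INT32_MIN);   /* only case that saturates */
    int64_t ab = (int64_t)a * (int64_t)b;           /* explicit int64 upcast    */
    int64_t nudge = ab >= 0 ? (1LL << 30) : (1 - (1LL << 30));
    int32_t high = (int32_t)((ab + nudge) / (1LL << 31));
    return overflow ? INT32_MAX : high;
}
```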
At the same time, I share the concern that this may be overkill. I had missed this earlier, but introducing a new op prevents operator fusion, which reduces the speedup from 3% to 1.5%.