While I can certainly see the value of fixed-point multiplication (fpm), there are a few simpler alternatives, which I list below:
- When the scale itself is a power of two, the operation can be turned directly into a right shift, without invoking any multiplication. However, since a right shift rounds down by default, we need to add a 0.5 compensation (that is, add half of the divisor before shifting) so that the result rounds to nearest.
- When the scale is not a power of two, it is still worth asking whether we need to upcast to i64 at all, and how the computation could be done in the i32 domain. For example, one alternative is to shift both a and b first, so that the result cannot overflow, and then do the multiplication in i32. Given that the final result will be in i8, i32 should be more than sufficient for this kind of scaling.

If we only use the intrinsic in limited cases, I can see us adding the support as an intrinsic.
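
To make the two alternatives concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not TVM code: the function names (`round_shift_right`, `scale_i32`, `saturate_i8`) and the pre-shift parameters `x_pre` and `m_pre` are hypothetical, the multiplier is assumed to be a Q31 fixed-point value, and ties are rounded toward positive infinity, which may differ from the rounding a particular intrinsic uses. Python integers are unbounded, so the i32 range is checked explicitly.

```python
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1


def round_shift_right(x: int, shift: int) -> int:
    """Divide x by 2**shift, rounding to nearest (ties toward +infinity).

    A plain `x >> shift` rounds down, so half of the divisor is added
    first; this is the "0.5 compensation".
    """
    if shift == 0:
        return x
    return (x + (1 << (shift - 1))) >> shift


def scale_i32(x: int, multiplier: int, total_shift: int,
              x_pre: int, m_pre: int) -> int:
    """Approximate (x * multiplier) >> total_shift with i32-sized values.

    Both operands are shifted right first (hypothetical amounts x_pre and
    m_pre) so the product fits in i32; the remaining shift is applied
    afterwards. The pre-shifts discard low-order bits, so the result can
    differ slightly from the exact i64 computation.
    """
    rest = total_shift - x_pre - m_pre
    if rest < 0:
        raise ValueError("pre-shifts exceed the total shift")
    xs = round_shift_right(x, x_pre)
    ms = round_shift_right(multiplier, m_pre)
    prod = xs * ms
    if not INT32_MIN <= prod <= INT32_MAX:
        raise ValueError("product does not fit in i32; pre-shift more")
    return round_shift_right(prod, rest)


def saturate_i8(v: int) -> int:
    """Clamp to the i8 range, since the final result is stored as i8."""
    return max(-128, min(127, v))


if __name__ == "__main__":
    # Power-of-two scale: divide by 4 with rounding, no multiplication.
    print(round_shift_right(7, 2))
    # Non-power-of-two scale: Q31 multiplier, i32 path vs. wide reference.
    x, m = 12345, 1_500_000_000
    print(scale_i32(x, m, 31, x_pre=0, m_pre=16),
          round_shift_right(x * m, 31))
```

How much to pre-shift each operand is a trade-off between headroom and precision; the values used here were picked by hand for this one input and are not a general recommendation.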