[TVM Discuss] [Development/RFC] [RFC] Using arm intrinsics to implement fixed point multiplication in TVM

Giuseppe Rossini via TVM Discuss Wed, 01 Jul 2020 12:36:06 -0700


Hi @anijain2305,


Yes, they are fused together, but at the end. 

`nn.conv2d` is usually implemented as three compute nodes: `pack+core+unpack`.

The requantization operator is currently fused after the `unpack`, whereas 
ideally it would be fused after `core` (`unpack` can be hard to vectorize).
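
To illustrate why the fusion point is a performance question rather than a correctness one, here is a minimal sketch in plain Python. The `requantize` and `unpack` helpers and the blocked layout are hypothetical stand-ins, not the actual TVM compute nodes; the only assumption is that requantization is elementwise and `unpack` is a pure layout change.

```python
def requantize(x, scale_num=3, scale_shift=2):
    # Hypothetical elementwise requantization:
    # integer multiply followed by a rounding right shift.
    return (x * scale_num + (1 << (scale_shift - 1))) >> scale_shift


def unpack(blocks):
    # Hypothetical unpack: flatten a blocked layout [n_blocks][block_size]
    # back into a flat list. Only the layout changes, not the values.
    return [v for block in blocks for v in block]


# Stand-in for the output of `core`, still in the packed (blocked) layout.
core_out = [[100, -200, 300, -400], [500, 600, -700, 800]]

# Fusion at the end: requantize runs on the unpacked data.
fused_at_end = [requantize(v) for v in unpack(core_out)]

# Fusion after `core`: requantize runs on the packed blocks, then unpack.
fused_after_core = unpack([[requantize(v) for v in block] for block in core_out])

assert fused_at_end == fused_after_core
```

Both orders give the same values, so moving the requantization next to `core` would only change where the elementwise work runs (on the regular packed blocks, which are easier to vectorize).
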

However, this is a topic for another discussion :slight_smile: 

The Relay operator I wrote should be fusion-friendly, so it should not 
introduce any slowdown. We still need to decide how to implement the `fpm`, 
though :sweat_smile:
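
For readers unfamiliar with the operation, the following is a rough reference sketch of a fixed point multiplication in plain Python. It assumes the common convention of a Q31 `multiplier` plus a power-of-two `shift`, with round-half-up; the function name, signature, and rounding mode are illustrative assumptions, not the implementation proposed in this RFC.

```python
def fixed_point_multiply(x, multiplier, shift):
    """Approximate x * (multiplier / 2**31) * 2**shift with integers only.

    Assumptions (not taken from the RFC text):
      - x is an int32 value,
      - multiplier is a Q31 fixed point number (int32),
      - shift is a small integer with shift < 31,
      - ties are rounded towards +infinity.
    """
    total_right_shift = 31 - shift
    assert total_right_shift > 0, "shift must be smaller than 31"

    # For int32 inputs the product fits in 64 bits.
    product = x * multiplier

    # Add half of the divisor so the arithmetic shift rounds to nearest.
    rounding = 1 << (total_right_shift - 1)
    return (product + rounding) >> total_right_shift


# 0.5 in Q31 is 2**30.
half = 1 << 30
print(fixed_point_multiply(100, half, 0))  # 100 * 0.5
print(fixed_point_multiply(100, half, 1))  # 100 * 0.5 * 2
```

An intrinsic-based lowering would aim to produce the same results as such a reference while avoiding the widening to 64 bits.
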




