Hi @anijain2305, all correct, except that the fusion problem has more to do with the fact that `qnn.conv2d` is lowered as an `nn.conv2d` followed by a `requantize`.
Ideally, the requantization would be fused before the unpacking of the output tensor (i.e., right after the main compute node), but I cannot do that, because the requantization happens later in the lowered graph and therefore ends up after the unpacking. I think this problem is common to most `conv2d` implementations in the `arm_cpu` path.
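To make the ordering issue concrete, here is a minimal pure-Python sketch, not TVM code. All names (`requantize`, `unpack`, `map_packed`), the tiled layout, and the simplified rounding are assumptions made for illustration only. It shows that, since requantization is elementwise, applying it on the packed accumulator and then unpacking should give the same values as unpacking first and requantizing afterwards, which is the order the current lowering ends up with.

```python
# Hypothetical model of the two orderings; none of these names are TVM APIs.

def requantize(acc, multiplier, shift, zero_point=0):
    """Elementwise fixed-point rescale of an int32 accumulator to the int8 range.

    Simplified rounding (add half, then arithmetic right shift); the real
    operator may round differently.
    """
    scaled = (acc * multiplier + (1 << (shift - 1))) >> shift
    return max(-128, min(127, scaled + zero_point))


def unpack(packed, rows, cols, tile):
    """Turn a tiled buffer packed[row_block][col_block][i][j] into a
    row-major rows x cols matrix, dropping the padding elements."""
    return [
        [packed[r // tile][c // tile][r % tile][c % tile] for c in range(cols)]
        for r in range(rows)
    ]


def map_packed(fn, packed):
    """Apply an elementwise function to every element of the tiled buffer."""
    return [
        [[[fn(v) for v in row] for row in tile_] for tile_ in block_row]
        for block_row in packed
    ]


# A 3x3 result stored in 2x2 tiles (the zeros at the tile edges are padding).
tile, rows, cols = 2, 3, 3
packed = [
    [[[1000, -2000], [300, 40000]], [[5, 0], [-70000, 0]]],
    [[[123456, -1], [0, 0]], [[999, 0], [0, 0]]],
]


def rq(v):
    # Arbitrary illustrative scale: multiplier / 2**shift = 3 / 1024.
    return requantize(v, multiplier=3, shift=10)


# Order forced by the lowering: unpack first, requantize afterwards.
current_order = [[rq(v) for v in row] for row in unpack(packed, rows, cols, tile)]

# Desired order: requantize right after the main compute, then unpack.
fused_order = unpack(map_packed(rq, packed), rows, cols, tile)

assert current_order == fused_order
```

In this sketch the fused variant also requantizes the padding elements, so it does slightly more arithmetic, but it avoids a separate pass over the unpacked output.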