Hi all,
I am trying to improve quantized performance for memory-bound operators (e.g., depthwise or 1x1 convolutions with small shapes).

### Bottom line question

Is there any way to know the strategy picked by the autotuner during the legalization pass of a quantized convolution (qnn.conv2d)?

### Long story

In general, for any int8 -> int32 convolution there are two strategies to follow:

* Convert to int16, subtract the offsets, and then execute conv2d + requantization.
* Stay in int8 and use some magic instruction to compute the int8 -> int32 convolution. This introduces the evaluation of 4 terms: Term 1 is the core conv2d (int8 -> int32), while Terms 2-4 are the offset contributions (see [here](https://github.com/apache/incubator-tvm/blob/ef6e52f191888ee2a5f2221bde3b69391766903f/src/relay/qnn/op/convolution.cc#L542)). They come from expanding conv2d(data - data_zp, weight - weight_zp): the data*weight product is Term 1, the two cross terms scaled by the zero points are Terms 2 and 3, and the constant data_zp*weight_zp contribution is Term 4.

In theory, the int8 approach should outperform the int16 one, but for memory-bound operators the additional Terms 2-4 can hurt performance (I have situations where Term 2 alone takes as long as the nn.conv2d itself).

To get the best of both worlds, we should implement both strategies and try them both. At the moment, this is (I think) not possible in TVM. Indeed, the decision to convert to int16 (and then subtract the offsets) happens during [the legalization pass](https://github.com/apache/incubator-tvm/blob/main/python/tvm/relay/qnn/op/legalizations.py), i.e., when the qnn.conv2d is lowered to a plain nn.conv2d.

So, back to the main question: is there a way to know the autotuner's strategy during the legalization pass? The ideal code would be:

```
@qnn_conv2d_legalize.register("arm_cpu")
def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
    if strategy == "conv2d_int16":
        convert_to_int16(data)
    else:
        convert_to_int8(data)
```
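For concreteness, here is a slightly fuller sketch of that hook. Note that `get_selected_strategy()` is purely hypothetical (it is exactly the query I am asking about), the input unpacking assumes the `(data, kernel, zero points, scales)` order used by the existing helpers in legalizations.py, and the int16 branch just mirrors what the pass does unconditionally today:

```
from tvm import relay
from tvm.relay.qnn.op.legalizations import qnn_conv2d_legalize


def get_selected_strategy():
    # Hypothetical: no such query exists today -- the autotuner's choice
    # is not visible at legalization time. This is the missing piece.
    raise NotImplementedError


@qnn_conv2d_legalize.register("arm_cpu")
def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
    data, kernel, in_zp, kernel_zp, _, _ = inputs
    if get_selected_strategy() == "conv2d_int16":
        # int16 strategy: widen and fold the zero points in up front, so
        # the op lowers to a plain nn.conv2d with no offset terms left.
        shifted_data = relay.subtract(
            relay.cast(data, "int16"), relay.cast(in_zp, "int16"))
        shifted_kernel = relay.subtract(
            relay.cast(kernel, "int16"), relay.cast(kernel_zp, "int16"))
        new_attrs = {k: attrs[k] for k in attrs.keys()}
        return relay.nn.conv2d(shifted_data, shifted_kernel, **new_attrs)
    # int8 strategy: return None to leave qnn.conv2d untouched, so its
    # canonicalization emits Term 1 plus the offset Terms 2-4.
    return None
```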
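The only genuinely missing piece in the sketch is `get_selected_strategy()`: everything else can be written with existing Relay ops. As far as I can tell, legalization runs before any strategy selection happens, so the information simply is not available at that point in the pipeline. If there is a way to reach it (or an accepted pattern for deferring this choice until tuning time), I would be glad to hear it.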