[quote="anijain2305, post:27, topic:6256, full:true"] For rasp3 and rasp4, we saw 1.3x - 1.5x performance speedup going from FP32 to Int8.
The link comparing QNNPACK and TVM is not upstreamed yet. If I understand correctly, it will be some time before the authors of that work are able to upstream it. There are also some differences in the underlying design, which might cause some delay in reaching that performance. Regarding int16, we observed that LLVM can generate good enough code with int16 instead of int8 for rasp3/4, so we uplift the datatype to int16 (the exceptions are Intel Cascade Lake and NVIDIA devices). When we write a better schedule with int8 datatypes, we can remove the upcasting. [/quote] Thanks for the speedup report for reference. TVM is really an excellent framework! I hope to see more speedup comparisons between TVM, QNNPACK, and other DL frameworks on different CPUs, such as ARM and Intel. The speedup data can help users estimate the upper limit of compute performance on certain devices, which can significantly reduce the risk of heading in the wrong direction. --- [Visit Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/30) to respond.
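The int16 upcasting mentioned in the quote can be illustrated with a minimal sketch in pure Python (helper names are hypothetical, and this is the general quantized-GEMM arithmetic, not TVM's actual lowering): the product of two int8 values can reach 127 * 127 = 16129, which overflows int8, so the multiply is done in a wider type and the accumulation in int32.

```python
# Sketch of symmetric int8 quantization and a dot product that upcasts
# its operands, mirroring why int8 kernels widen to int16/int32.
# Helper names are illustrative, not TVM APIs.

def quantize(values, scale):
    """Symmetric per-tensor quantization of FP32 values to int8 range."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def int8_dot(a_q, b_q):
    """Dot product of two int8 vectors, accumulated in a wider type."""
    acc = 0  # conceptually an int32 accumulator
    for a, b in zip(a_q, b_q):
        acc += a * b  # each product fits in int16, not int8
    return acc

a = [0.5, -1.2, 0.9]
b = [1.1, 0.3, -0.7]
scale_a, scale_b = 0.01, 0.01

a_q = quantize(a, scale_a)
b_q = quantize(b, scale_b)

# Dequantize the int32 accumulator back to FP32; this lands close to
# the exact FP32 dot product.
approx = int8_dot(a_q, b_q) * scale_a * scale_b
exact = sum(x * y for x, y in zip(a, b))
```

A fast int8 schedule keeps the operands at 8 bits and relies on widening multiply instructions; uplifting to int16, as described above, trades some of that density for code LLVM already generates well on rasp3/4.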