[TVM Discuss] [Questions] Is there any speed comparison of quantization on cpu

kindlehe via TVM Discuss Thu, 09 Apr 2020 22:23:26 -0700


[quote="kindlehe, post:19, topic:6256, full:true"]
@anijain2305 
How much speedup does INT8 give over FP32 on rasp4? 1.5×?

I saw a speedup result
[here](https://github.com/tvmai/meetup-slides/tree/master/tvm-meetup-shanghai-Nov-16-2019)
saying that tvm is about 1.3× (= 2.08/1.60) faster than QNNPACK on
mobilenet-v2 @ rasp 3b+ AARCH64.

They reported a clear speedup for both mobilenet-v1 and mobilenet-v2:
![image|690x431](upload://oqRljzqKWe45ll979kPI6Z8PeOE.jpeg)

However, you say qnnpack-int8 is better than tvm-int8 @rasp4. Which conclusion
is more reliable?

If qnnpack is better, then why does tvm develop its own int8 support instead of
using qnnpack?
[/quote]

[quote="anijain2305, post:27, topic:6256, full:true"]
For rasp3 and rasp4, we saw 1.3x - 1.5x performance speedup going from FP32 to
Int8.

The work in the link comparing QNNPACK and TVM is not upstreamed yet. If I
understand correctly, it will be some time before the authors of that work are
able to upstream it. There are some differences in the underlying design as
well, which might cause some delays in getting to that performance.

Regarding int16, we observed that LLVM can generate good enough code with int16
instead of int8 for rasp3/4. So we uplift the datatype to int16 (the exceptions
are Intel Cascadelake and Nvidia devices). When we write a better schedule with
int8 datatypes, we can remove the upcasting.
[/quote]

AliOS reported that tvm-int8 is 1.3x faster than qnnpack on rasp 3b+ AARCH64,
but you said qnnpack-int8 is faster than tvm-int8 on rasp4. What do you think
about the mismatch?

You also said **For rasp3 and rasp4, we saw 1.3x - 1.5x performance speedup
going from FP32 to Int8.** If your conclusion that **qnnpack-int8 is faster than
tvm-int8 @rasp4** is right, then could tvm get **more than** 1.3x - 1.5x (maybe
1.5x - 1.8x or more, just a guess) speedup going from FP32 to Int8 once
tvm-int8 is as fast as qnnpack-int8 on rasp4?
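
To make the arithmetic behind that guess explicit, here is a rough sketch. It
assumes speedups simply multiply: if tvm-fp32 is `s` times slower than tvm-int8,
and tvm-int8 is `g` times slower than qnnpack-int8, then matching qnnpack would
give `s * g` over tvm-fp32. The function name and the gap values are
hypothetical, chosen only for illustration; nobody in this thread reported a
measured qnnpack/tvm gap on rasp4.

```python
def projected_speedup(fp32_to_int8: float, qnnpack_over_tvm: float) -> float:
    """FP32 -> int8 speedup if tvm-int8 matched qnnpack-int8 latency.

    fp32_to_int8:     measured tvm-fp32 / tvm-int8 latency ratio (1.3 - 1.5 above)
    qnnpack_over_tvm: tvm-int8 / qnnpack-int8 latency ratio (ASSUMED, not measured)
    """
    return fp32_to_int8 * qnnpack_over_tvm


# Ratio quoted from the slides: 2.08 / 1.60 is about 1.3
slide_ratio = 2.08 / 1.60

# Hypothetical gaps between qnnpack-int8 and tvm-int8 on rasp4
assumed_gaps = (1.1, 1.2)

for gap in assumed_gaps:
    low = projected_speedup(1.3, gap)
    high = projected_speedup(1.5, gap)
    print(f"gap {gap:.1f}x -> projected FP32->int8 speedup {low:.2f}x - {high:.2f}x")
```

With these assumptions, a gap of roughly 1.2x is what the 1.5x - 1.8x guess
corresponds to at the upper end; a smaller gap gives a smaller gain.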

---
[Visit
Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/33)
to respond.
