@anijain2305, thanks a lot.

I thought a model quantized with TVM's relay quantize pass would perform the same as a TVM model converted from a pre-quantized one.
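
For reference, this is roughly what I mean by the relay quantize path (a minimal sketch; the helper name and the `global_scale` calibration settings are just my assumptions, and `mod`/`params` are an fp32 model already imported into Relay):

```python
from tvm import relay

def quantize_with_relay(mod, params):
    """Rewrite an fp32 Relay module into int8 with TVM's own quantizer."""
    # "global_scale" is the simplest calibrate mode and needs no dataset;
    # a real deployment would likely calibrate on representative inputs.
    with relay.quantize.qconfig(calibrate_mode="global_scale", global_scale=8.0):
        return relay.quantize.quantize(mod, params=params)
```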

I also tested a TVM int8 model converted from a PyTorch QAT model; its speed is the same as the tvm-relay-quantize int8 model.
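
And this is the pre-quantized path (again only a sketch; the input name "input" and the shape are placeholders I chose):

```python
import torch
from tvm import relay

def import_prequantized(quantized_model, input_shape=(1, 3, 224, 224)):
    """Import a PyTorch model that was already quantized (e.g. via QAT),
    so TVM receives int8 ops instead of quantizing the graph itself."""
    quantized_model.eval()
    scripted = torch.jit.trace(quantized_model, torch.rand(input_shape)).eval()
    # Returns (mod, params) ready for relay.build.
    return relay.frontend.from_pytorch(scripted, [("input", input_shape)])
```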

I really have no idea how to get the 1.3x-1.5x speedup with either path, the pre-quantized int8 model or the tvm-relay-quantize int8 model. I am eager for your kind help to reproduce the speedup on an Android ARM device.

It's nice to see your effort on the quantization tutorial!

I also recommend adding more tutorials on how to get the desired int8 speedup over fp32 on the supported device platforms.

Many TVM users are not as experienced as the TVM authors; they may want to see more tutorials or reports on why to choose TVM quantization instead of another DL framework's quantization.
 
Also, more test cases for CPU int8 would be a great help, e.g. which CPU devices support int8 quantization, and how to set a proper target for different kinds of devices. (I have seen many users ask about TARGET usage, but it is still not clear which TARGET setting achieves the best performance for a given device.)
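
For example, is something like the following the right way to set TARGET for an aarch64 Android phone? (A sketch; the `-mattr` features are my guesses and must match the actual SoC, e.g. `+dotprod` is only available from ARMv8.2-A on.)

```python
import tvm
from tvm import relay

# TARGET string for an ARM CPU on Android; adjust -mtriple/-mattr per device.
target = "llvm -device=arm_cpu -mtriple=aarch64-linux-android -mattr=+neon,+v8.2a,+dotprod"

def build_for_android(mod, params):
    """Compile a Relay module for the Android ARM target above."""
    with tvm.transform.PassContext(opt_level=3):
        return relay.build(mod, target=target, params=params)
```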

Thanks again to all of TVM's authors and contributors.




