1. I don't have experience using QAT in Torch. I think post-training quantization is easier to work with. In any case, post-training quantization should be the first thing you try; if you need extra accuracy, QAT may help. (See the first sketch below.)
2. Yes. See https://docs.tvm.ai/tutorials/frontend/deploy_quantized.html#sphx-glr-tutorials-frontend-deploy-quantized-py. This is TVM's own quantization support: since TVM performs the quantization itself, it doesn't matter which framework the model comes from. (See the second sketch below.)
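
For the first point, here is a minimal sketch of eager-mode post-training static quantization in PyTorch. The toy module, random calibration data, and the `fbgemm` backend choice are my own illustration, not something from this thread:

```python
import torch
import torch.nn as nn

# Toy model; a real model would wrap its fp32 graph with the same stubs.
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()    # fp32 -> int8 boundary
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # int8 -> fp32 boundary

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = SmallNet().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")  # x86 backend
prepared = torch.quantization.prepare(model)

# Calibration: run a few representative batches so the observers collect ranges
# (random tensors here; use real data in practice).
with torch.no_grad():
    for _ in range(8):
        prepared(torch.randn(1, 3, 32, 32))

quantized = torch.quantization.convert(prepared)  # int8 model
```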
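
For the second point, the linked tutorial boils down to something like the sketch below. The tiny Relay graph stands in for a model you would normally import through a Relay frontend (e.g. `relay.frontend.from_pytorch`), and the global-scale calibration value is just the tutorial's default:

```python
import numpy as np
import tvm
from tvm import relay

# Toy fp32 graph standing in for an imported model; purely illustrative.
data = relay.var("data", shape=(1, 3, 32, 32), dtype="float32")
w1 = relay.var("w1", shape=(8, 3, 3, 3), dtype="float32")
w2 = relay.var("w2", shape=(8, 8, 3, 3), dtype="float32")
net = relay.nn.relu(relay.nn.conv2d(data, w1, kernel_size=(3, 3), channels=8))
net = relay.nn.relu(relay.nn.conv2d(net, w2, kernel_size=(3, 3), channels=8))
mod = tvm.IRModule.from_expr(relay.Function([data, w1, w2], net))
params = {
    "w1": np.random.rand(8, 3, 3, 3).astype("float32"),
    "w2": np.random.rand(8, 8, 3, 3).astype("float32"),
}

# TVM's own quantization pass: rewrites the fp32 graph to int8 inside TVM,
# so the framework the model originally came from doesn't matter.
# (By default the very first conv layer is kept in fp32.)
with relay.quantize.qconfig(calibrate_mode="global_scale", global_scale=8.0):
    qmod = relay.quantize.quantize(mod, params)

# Compile the quantized module as usual.
lib = relay.build(qmod, target="llvm")
```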