Thank you very much for your detailed reply! It's great to see you all so willing to help users. Thanks sincerely, and I look forward to your updated resources on TVM quantization.

I will try these scripts later. However, I still have two questions:

1. Before I apply `[Torch, QNN] Add support for quantized models via QNN #4977` or use the scripts you offered above, should I use post-training static quantization or quantization-aware training on my own model, as described in the [static_quantization_tutorial](https://pytorch.org/tutorials/advanced/static_quantization_tutorial.html)? (A sketch of what I mean is below this list.)
2. Does TVM itself support converting an FP32 PyTorch model to an int8 model? If so, how do I do that while keeping accuracy and speed? (A second sketch follows.)
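For question 1, here is a minimal sketch of the post-training static quantization flow I have in mind, following the tutorial's eager-mode `torch.quantization` API. `SmallNet` is just a hypothetical stand-in for my own model, and the random batches stand in for real calibration data:

```python
import torch
import torch.nn as nn
import torch.quantization

class SmallNet(nn.Module):
    # Hypothetical stand-in for my model; QuantStub/DeQuantStub mark
    # where tensors cross the float <-> int8 boundary.
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 8, kernel_size=3)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = SmallNet().eval()

# Pick the x86 (fbgemm) backend config and insert observers.
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
torch.quantization.prepare(model, inplace=True)

# Calibrate with a few representative batches (random tensors here).
with torch.no_grad():
    for _ in range(10):
        model(torch.randn(1, 3, 224, 224))

# Replace observed modules with quantized int8 kernels.
torch.quantization.convert(model, inplace=True)

# Trace the int8 model so it can be fed to TVM's PyTorch frontend.
script_module = torch.jit.trace(model, torch.randn(1, 3, 224, 224))
```

As far as I can tell, quantization-aware training uses the same prepare/convert flow (with `torch.quantization.prepare_qat` and fine-tuning in between), so my question is really which of the two gives better accuracy once imported through QNN.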
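For question 2, this is the TVM-side flow I have in mind: import the FP32 model into Relay, then use `relay.quantize` to rewrite it to int8. A minimal sketch, assuming a traced torchvision ResNet-18 as the FP32 model; I am not sure this is the recommended path, hence the question:

```python
import torch
import torchvision
import tvm
from tvm import relay

# Trace an ordinary FP32 PyTorch model.
fp32_model = torchvision.models.resnet18(pretrained=True).eval()
inp = torch.randn(1, 3, 224, 224)
traced = torch.jit.trace(fp32_model, inp)

# Import the FP32 graph into Relay.
mod, params = relay.frontend.from_pytorch(traced, [("input", (1, 3, 224, 224))])

# TVM's own quantization pass: rewrite the FP32 Relay graph to int8.
# calibrate_mode="global_scale" needs no calibration data; using
# "kl_divergence" with a calibration dataset should keep accuracy better.
with relay.quantize.qconfig(calibrate_mode="global_scale", global_scale=8.0):
    qmod = relay.quantize.quantize(mod, params)

# Compile the quantized module as usual.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(qmod, target="llvm")
```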
I will try your suggestion to try these scripts later. However, there are still some question for me: 1. Should I use `Post-training static quantization` or `Quantization-aware training` for my own model as [static_quantization_tutorial](https://pytorch.org/tutorials/advanced/static_quantization_tutorial.html) said, before I apply `[Torch, QNN] Add support for quantized models via QNN #4977` or use scripts you offered above ? 2. Dose tvm itself support convert a FP32 pytorch model to int8 model ? If so, how to do it to keep accuracy and speed? Nice to see you are all so good to offer helps for users! Thanks sincerely and expect for your updated resources on tvm quantization. --- [Visit Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/7) to respond. You are receiving this because you enabled mailing list mode. To unsubscribe from these emails, [click here](https://discuss.tvm.ai/email/unsubscribe/3c67cf434f3ea3c7b81c18c423ac7cd2bf6efff4960ba7add30a7598d675ec15).