I have encountered the same problem. When running inference on the CPU, AVX instructions accelerate the quantized model, and its inference is then faster than the floating-point model's. However, combining TVM quantization with AVX instructions causes other problems.
--- [Visit Topic](https://discuss.tvm.apache.org/t/slower-execution-times-after-8-bit-quantization/15502/4) to respond.