I have encountered the same problem. When running inference on the CPU, AVX 
instructions can accelerate the quantized model, making its inference speed 
faster than that of the floating-point model. However, combining TVM 
quantization with AVX instructions can cause other problems.
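For context: the speedup comes from mapping float32 tensors to int8, which lets the CPU process many more lanes per AVX instruction (in TVM the AVX code path is typically selected through the LLVM target string, e.g. `llvm -mcpu=core-avx2`). Below is a minimal, generic NumPy sketch of symmetric per-tensor int8 quantization to illustrate the idea; it is not TVM's actual quantization pass, and the function names are illustrative:

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization: x ~= scale * q."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to float32 for comparison."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
x = rng.standard_normal(1024).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# Round-to-nearest keeps the reconstruction error within one quantization step.
err = float(np.abs(x - x_hat).max())
```

The accuracy loss is bounded by the quantization step, which is why int8 inference can match float32 closely while running faster on SIMD-capable CPUs.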

---
[Visit Topic](https://discuss.tvm.apache.org/t/slower-execution-times-after-8-bit-quantization/15502/4) to respond.
