@masahi @anijain2305  
I am not sure whether INT8 is actually used in `perf_bench`, because I see this log:
```
autotvm:Cannot find config for target=llvm -mcpu=core-avx2, workload=('dense_nopack.x86', ('TENSOR', (1, 1280), 'int16'), ('TENSOR', (1000, 1280), 'int16'), None, 'int32'). A fallback configuration is used, which may bring great performance regression.
```
I suspect that TVM being slower than torch on a single core is caused by INT8 not being used.

One possibility is that TVM converts the INT8 weights to INT16, as the log above suggests (the workload tensors are reported as `'int16'`), and runs inference in INT16 instead of INT8, while torch runs inference in INT8. In other words, we would be comparing the speed of tvm-int16 against torch-int8. This is just an assumption, but I don't know how to check whether INT8 is actually used in TVM inference.
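One rough way to check, I think, is to walk the Relay IR and collect the dtypes that actually appear. A minimal sketch, assuming `mod` is the Relay module returned by the frontend (e.g. `relay.frontend.from_pytorch`) before building; `collect_dtypes` is just a hypothetical helper name, not a TVM API:

```python
import tvm
from tvm import relay

def collect_dtypes(mod):
    """Return the set of output dtypes of all call nodes in the module."""
    # Run type inference so every expression carries a checked_type.
    mod = relay.transform.InferType()(mod)
    dtypes = set()

    def visit(expr):
        # Call nodes (dense, conv2d, ...) expose their result type here.
        if isinstance(expr, relay.Call):
            ty = expr.checked_type
            if isinstance(ty, relay.TensorType):
                dtypes.add(ty.dtype)

    relay.analysis.post_order_visit(mod["main"], visit)
    return dtypes

# print(collect_dtypes(mod))  # e.g. {'int8', 'int16', 'int32'}
```

If `'int8'` never shows up, the compiled graph is presumably not computing in INT8. Alternatively, just `print(mod["main"])` and look at the dtypes in the IR text.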

If so, this might be a bug or something else going wrong in TVM (not sure yet, just a guess :grinning:).
