@masahi  I set `os.environ["TVM_NUM_THREADS"] = str(2)`, but it does not improve the speed.

I also watched the CPU usage of `tvm_model.module.time_evaluator` and `pt_model(inp)` with the `top` command. In both cases cpu% stayed at or below 100%, which suggests that both TVM and PyTorch are using only one thread for inference.
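One thing worth double-checking is whether the environment variable is set before the TVM runtime initializes its thread pool; if it is set afterward, it may be ignored. A minimal sketch of the ordering I use (the variable name is the real TVM one, the rest is just illustrative):

```python
import os

# Set the thread count first, before importing tvm or running any
# inference, since the runtime reads TVM_NUM_THREADS when its thread
# pool is initialized.
os.environ["TVM_NUM_THREADS"] = str(2)

# Sanity check: confirm the variable is actually visible to the process.
print(os.environ["TVM_NUM_THREADS"])  # → 2
```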

Here is the speed comparison with `os.environ["TVM_NUM_THREADS"] = str(2)`:
```shell
Model name: resnet18, per channel quantization
TVM elapsed ms:126.58331409
Torch elapsed ms:34.38628673553467

Model name: resnet50, per channel quantization
TVM elapsed ms:292.58252946
Torch elapsed ms:77.93493032455444

Model name: mobilenet_v2, per channel quantization
TVM elapsed ms:24.695743800000006
Torch elapsed ms:11.568100452423096

Model name: mobilenet_v3 small, per channel quantization
TVM elapsed ms:7.13273288
Torch elapsed ms:9.331259727478027

Model name: mobilenet_v2_pretrained small, per channel quantization
TVM elapsed ms:19.51776834
Torch elapsed ms:13.21192979812622
```
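For reference, the numbers above were collected in roughly this style; here is a simplified, self-contained sketch of the timing loop (a plain `time.perf_counter` harness standing in for TVM's `time_evaluator`, with a dummy callable in place of the real models):

```python
import time

def bench_ms(fn, number=100, repeat=3):
    """Return the best average latency in milliseconds over `repeat`
    runs of `number` calls each, similar in spirit to time_evaluator."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        for _ in range(number):
            fn()
        elapsed_ms = (time.perf_counter() - start) / number * 1e3
        best = min(best, elapsed_ms)
    return best

# Hypothetical stand-in for a model's forward pass.
latency = bench_ms(lambda: sum(range(1000)))
print(f"elapsed ms: {latency:.6f}")
```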





---
[Visit Topic](https://discuss.tvm.ai/t/is-there-any-speed-comparison-of-quantization-on-cpu/6256/20) to respond.
