@masahi I set `os.environ["TVM_NUM_THREADS"] = str(2)`, but it does not improve the speed.
I also watched the CPU usage of `tvm_model.module.time_evaluator` and `pt_model(inp)` with the `top` command; it stays at or below 100%, which suggests that both TVM and PyTorch are using only one thread for inference. Here is the speed comparison with `os.environ["TVM_NUM_THREADS"] = str(2)`:

```shell
Model name: resnet18, per channel quantization
TVM elapsed ms: 126.58331409
Torch elapsed ms: 34.38628673553467

Model name: resnet50, per channel quantization
TVM elapsed ms: 292.58252946
Torch elapsed ms: 77.93493032455444

Model name: mobilenet_v2, per channel quantization
TVM elapsed ms: 24.695743800000006
Torch elapsed ms: 11.568100452423096

Model name: mobilenet_v3 small, per channel quantization
TVM elapsed ms: 7.13273288
Torch elapsed ms: 9.331259727478027

Model name: mobilenet_v2_pretrained small, per channel quantization
TVM elapsed ms: 19.51776834
Torch elapsed ms: 13.21192979812622
```
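One thing worth double-checking is *when* the environment variable is set: the TVM runtime reads `TVM_NUM_THREADS` when its thread pool starts up, so (as an assumption to verify) setting it after `import tvm` may have no effect. A minimal sketch of pinning both runtimes to two threads, with the env var set before any TVM import:

```python
import os

# Assumption: TVM_NUM_THREADS must be in the environment BEFORE the
# TVM runtime is first loaded, i.e. before `import tvm` runs anywhere
# in the process. Setting it afterwards may be silently ignored.
os.environ["TVM_NUM_THREADS"] = str(2)

# After this point it would be safe to `import tvm` and build/run the module.
# For PyTorch, the intra-op thread count is controlled at runtime instead:
#   import torch
#   torch.set_num_threads(2)
# Check with `torch.get_num_threads()` that both sides really use the
# same thread budget before comparing latencies.

print(os.environ["TVM_NUM_THREADS"])
```

If `top` still shows only ~100% CPU for the TVM run after this, the kernel schedules for that target may simply not parallelize, which would be a separate tuning question.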