I see. Graph tuning inserts layout transformations before some or all layers, which adds to TVM's end-to-end latency. Given that TVM is slower than MKLDNN for many conv2d kernels, how can TVM still achieve better end-to-end runtime? Is there some other kind of latency that MKLDNN incurs here that I'm missing?
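
To make my accounting concrete, here is a toy sketch of the cost model I have in mind (all numbers are made up for illustration, not measurements of any real model):

```python
# Toy e2e latency accounting (hypothetical numbers, in milliseconds).

# Per-layer conv2d kernel times: in my mental model MKLDNN is faster on many layers.
tvm_kernel_ms    = [1.2, 0.9, 1.5, 0.8]
mkldnn_kernel_ms = [1.0, 0.8, 1.3, 0.7]

# Layout transforms that graph tuning inserts before some layers (0.0 = none inserted).
tvm_transform_ms = [0.1, 0.0, 0.1, 0.0]

tvm_e2e    = sum(tvm_kernel_ms) + sum(tvm_transform_ms)
mkldnn_e2e = sum(mkldnn_kernel_ms)

print(f"TVM e2e:    {tvm_e2e:.2f} ms")     # 4.60 ms in this toy example
print(f"MKLDNN e2e: {mkldnn_e2e:.2f} ms")  # 3.80 ms

# Under this accounting TVM looks strictly slower, which is why I'm asking what
# other latency MKLDNN incurs (its own internal layout conversions?) that this
# simple model leaves out.
```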