Do you know if we get the same speedups as with 2D conv networks, e.g. ResNet-50 quantized to 8-bit vs. FP32/FP16? Is the full range of optimizations available for this?

For comparison, TensorRT doesn't support 8-bit quantization (and some other optimizations) for 3D operations, so its int8 speedup over FP16 is tiny.
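For context, here is a minimal NumPy sketch of what symmetric per-tensor int8 post-training quantization does to a 3D conv's weights. This is purely illustrative (it is not TVM's or TensorRT's implementation); the shapes and function name are my own:

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization: x ~= scale * q."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
# Hypothetical 3D conv weights, layout (out_ch, in_ch, kD, kH, kW)
w = rng.standard_normal((8, 3, 3, 3, 3)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale  # dequantized reconstruction

# Rounding error is bounded by half a quantization step
err = np.abs(w - w_hat).max()
print(f"max abs quantization error: {err:.4f} (step = {scale:.4f})")
```

The speedup question is then whether the backend can actually run the 3D conv on int8 tensor cores / DP4A paths rather than just storing the weights in int8 and computing in FP.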

https://github.com/apache/incubator-tvm/issues/4009#issuecomment-692124105