Thanks @tqchen for the detailed explanation. Actually, my proposal is simpler. My `qnn.relu` does not convert to the three stages that you mentioned; it only performs `relu_in_i8`.
The frameworks (at least TFLite and MXNet) do not go back to FP32 unless an operator is not supported in `i8` or its accuracy in `i8` is very bad. For example, TFLite `qconv2d` will translate to `qnn.conv2d + qnn.requantize`, or, as you explained, to the `conv_in_i8/i32 -> convert_to_int8` domain, but there won't be any FP32 in between.

To complete the picture, suppose the quantized framework graph is (`fw` stands for framework):

`fw.quantize -> fw.qconv2d -> fw.qrelu -> fw.dequantize`

The Relay graph would be:

`qnn.quantize -> qnn.conv2d -> qnn.requantize -> qnn.relu -> qnn.dequantize`

or, in the notation above:

`convert_to_i8 -> conv_in_i8/i32 -> convert_to_i8 -> relu_in_i8 -> convert_to_FP32`

Essentially, if the framework does not convert back to FP32 in between, we would not go to FP32 either.
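To illustrate why no FP32 round trip is needed, here is a minimal NumPy sketch (not TVM code; the function names are hypothetical) of `relu_in_i8`: in the affine-quantized domain, ReLU is just a clamp at the zero point, and it matches the dequantize -> ReLU -> FP32 reference path exactly.

```python
import numpy as np

def qrelu_i8(x, zero_point):
    # ReLU directly in the int8 domain: values below the zero point
    # (which represents real 0.0) are clamped to the zero point.
    return np.maximum(x, zero_point).astype(np.int8)

def dequantize(x, scale, zero_point):
    # Affine dequantization: real = scale * (q - zero_point).
    return scale * (x.astype(np.int32) - zero_point)

# Example with scale=0.1, zero_point=5 (illustrative values).
x = np.array([-128, 0, 5, 42, 127], dtype=np.int8)
y = qrelu_i8(x, zero_point=5)

# Cross-check against the FP32 reference path:
ref = np.maximum(dequantize(x, 0.1, 5), 0.0)
assert np.allclose(dequantize(y, 0.1, 5), ref)
```

Since the int8 result dequantizes to the same values as the FP32 reference, the framework (and Relay) can stay in `i8` across the whole `qconv2d -> qrelu` chain.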