Regarding @jnorwood's comments on the output min/max of conv2d: your observations about the **values** of the output min/max are correct, but they are still activations. The point I keep trying to make is that the INT8 values in quantization represent FP32 values.
When we talk about ReLU6 activations, we mean that in FP32 format the op outputs FP32 values in the range [0, 6]. In INT8 quantization, the INT8 data is a representation of FP32 values, which means the output min/max (typically [0, 255] of INT8 type in the pre-provided quantized MobileNet) represent [0, 6] of FP32 type: INT8 0/255 is actually FP32 0/6. Apply the output scale (0.023528477177023888) to the activation min/max and we get a value range like [0, 5.999761581420898] (from the output of the first conv of the pre-provided quantized MobileNet).
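
To make the arithmetic concrete, here is a minimal sketch of mapping the quantized min/max back to FP32. It assumes the usual affine scheme `real = scale * (q - zero_point)` with a zero point of 0 for this tensor; the zero point is my assumption, it is not stated above, and `dequantize` is a hypothetical helper rather than a TVM or TFLite API.

```python
# Output scale of the first conv in the pre-provided quantized MobileNet
# (value taken from the comment above).
OUTPUT_SCALE = 0.023528477177023888

# Assumption: the zero point of this output tensor is 0.
ZERO_POINT = 0


def dequantize(q, scale=OUTPUT_SCALE, zero_point=ZERO_POINT):
    """Map a quantized 8-bit value to the FP32 value it represents."""
    return scale * (q - zero_point)


# Quantized activation min/max as seen in the model.
q_min, q_max = 0, 255

fp_min = dequantize(q_min)
fp_max = dequantize(q_max)

# fp_min is exactly 0.0 and fp_max is just under 6.0, i.e. the ReLU6 range.
print(fp_min, fp_max)
```

So clamping the quantized output to [0, 255] is the same operation as clamping the FP32 output to roughly [0, 6]. The last digits printed here can differ slightly from the figure quoted above, since Python computes in double precision.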