Yes, right. The scaling constant computed during training is based on the range of values seen after any fused-in activation (at least that is true for the tflite quantized models I've looked at). That range is also measured after the ReLU6 clipping. During inference, the min and max saturation values just handle values that fall outside the range expected from training, whether or not there was a fused-in activation operation during training.
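To make that concrete, here is a minimal NumPy sketch, assuming uint8 affine quantization. The helper names (`quant_params`, `quantize`) and the sample values are hypothetical and only for illustration; this is not the TFLite or TVM API, just the arithmetic as I understand it.

```python
import numpy as np

def quant_params(rmin, rmax, qmin=0, qmax=255):
    """Hypothetical helper: affine parameters for the real range [rmin, rmax]."""
    # The representable range is widened to include zero.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Hypothetical helper: quantize, saturating at qmin/qmax."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.uint8)

# Made-up pre-activation outputs observed during training.
pre_act = np.array([-3.0, 0.5, 2.0, 7.5, 9.0])
# The range is recorded AFTER the fused ReLU6, so it is [0, 6].
post_act = np.clip(pre_act, 0.0, 6.0)

scale, zero_point = quant_params(post_act.min(), post_act.max())

# At inference, anything outside the trained range simply saturates.
q = quantize(np.array([-1.0, 0.0, 6.0, 8.0]), scale, zero_point)
print(scale, zero_point, q.tolist())
```

With the range taken after ReLU6, the saturation at 0 and 255 lines up with the real values 0 and 6, so the clamp at inference reproduces the activation without a separate op. If the range had instead been taken before the activation, the same clamp would only guard against out-of-range values.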