The min and max are not conditional on existence of activation operation in the original model. They are there to saturate the downscaled and offset adjusted 32 bit signed int accumulator to the min and max value of the uint8 quantized bit range.
Although the quantized conv result is held in uint8, it could be static casted to signed int8, or even fewer than 8 bit quantization. That would require both min and max saturations, as in the reference tflite quantized conv implementation. -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/dmlc/tvm/issues/2351#issuecomment-502481388