> I think you may not have fully understood my previous comment. One question I want to ask: do your quantized models have conv + relu / relu6 like our model? If not, the range is obviously 0 ~ 255, no matter how many models there are. Please see:
> https://github.com/tensorflow/tensorflow/blob/v2.0.0-beta1/tensorflow/lite/kernels/kernel_util.cc#L138
> @jackwish and I have emphasized this function many times.

The quantized MobileNet v1 inference model is from the TFLite model repository. The training model includes ReLU6 and batch normalization operations, but these are fused into the convolution operations in the inference model, as the Netron diagram shows.
Even if the convolution and ReLU operations were separate, you would still see 0 and 255 for those min and max values, because they are applied after the downscale and offset have been applied to the convolution accumulator. The min and max values serve only to saturate the downscaled result to the quantized uint8 range, avoiding the wrap-around overflow/underflow of the 8-bit value that would occur if the downscaled accumulator were simply masked to 8 bits.
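The clamp range computed by the linked TFLite code (`CalculateActivationRangeUint8` in `kernel_util.cc`) can be sketched roughly as below. This is an illustrative Python reimplementation, not the actual TFLite API; the function name and parameters mirror the C++ source only loosely. The key point it demonstrates: with a typical fused-ReLU6 output quantization (scale = 6/255, zero point = 0), the clamp range works out to the full 0..255 anyway, which is why separate vs. fused ReLU makes no visible difference to these min/max values.

```python
def calculate_activation_range_uint8(activation, scale, zero_point,
                                     qmin=0, qmax=255):
    """Sketch of TFLite's CalculateActivationRangeUint8 (kernel_util.cc).

    Returns the (act_min, act_max) pair used to clamp the requantized
    uint8 accumulator. `activation` is None, "relu", or "relu6".
    """
    def quantize(x):
        # Map a real value into the quantized domain of the output tensor.
        return zero_point + round(x / scale)

    if activation == "relu":
        act_min = max(qmin, quantize(0.0))
        act_max = qmax
    elif activation == "relu6":
        act_min = max(qmin, quantize(0.0))
        act_max = min(qmax, quantize(6.0))
    else:
        # No fused activation: clamp only to the uint8 representable range.
        act_min, act_max = qmin, qmax
    return act_min, act_max

# Fused ReLU6 with scale = 6/255 and zero_point = 0 still clamps to 0..255,
# because the quantization parameters already encode the [0, 6] range:
print(calculate_activation_range_uint8("relu6", 6.0 / 255.0, 0))  # (0, 255)
print(calculate_activation_range_uint8(None, 6.0 / 255.0, 0))     # (0, 255)
```

Only when the output quantization parameters cover a wider real range than the activation function allows (e.g. a larger scale or a nonzero zero point) would the computed clamp narrow to something tighter than 0..255.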