> I think you may not have fully understood my previous comment. One question
> I want to ask: do your quantized models have conv + relu / relu6 like our
> model does? If not, the range is obviously 0 ~ 255, no matter how many
> models you check. Please see:
> https://github.com/tensorflow/tensorflow/blob/v2.0.0-beta1/tensorflow/lite/kernels/kernel_util.cc#L138
> @jackwish and I have emphasized this function's code many times.
> 
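For reference, the linked kernel_util.cc function computes the clamping range roughly as follows. This is a paraphrased sketch with simplified names and signature, not the verbatim TFLite source:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

enum FusedActivation { kActNone, kActRelu, kActRelu6 };  // simplified enum

// The clamp range depends on the fused activation and on the output
// tensor's quantization parameters (scale, zero_point).
void CalculateActivationRangeUint8(FusedActivation activation, float scale,
                                   int32_t zero_point, int32_t* act_min,
                                   int32_t* act_max) {
  const int32_t qmin = 0;    // std::numeric_limits<uint8_t>::min()
  const int32_t qmax = 255;  // std::numeric_limits<uint8_t>::max()
  // Quantize a real value into the output's uint8 domain.
  auto quantize = [=](float f) {
    return zero_point + static_cast<int32_t>(std::round(f / scale));
  };
  if (activation == kActRelu) {
    *act_min = std::max(qmin, quantize(0.0f));
    *act_max = qmax;
  } else if (activation == kActRelu6) {
    *act_min = std::max(qmin, quantize(0.0f));
    *act_max = std::min(qmax, quantize(6.0f));
  } else {
    // No fused activation: the full uint8 range.
    *act_min = qmin;
    *act_max = qmax;
  }
}
```

For example, if a fused relu6 output happens to be quantized with scale = 6/255 and zero_point = 0, this computes act_min = 0 and act_max = 255, i.e. the full uint8 range.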
The quantized MobileNet v1 inference model is from the TFLite model repository.
The training model includes relu6 and batch normalization operations, but
these are fused into the convolution operations in the inference model, as the
Netron diagram shows.

If the convolution and relu operations were separate, you would still see 0 and
255 for those min and max values, because the clamp is applied after the
downscale and the output offset have been applied to the convolution
accumulator. The min and max values serve only to saturate the downscaled
result to the quantized uint8 range, avoiding the wrap-around overflow or
underflow you would get if the downscaled accumulator were simply masked to
8 bits.
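As a concrete illustration, here is a minimal sketch of that requantization path. This is my own example, not TFLite's actual implementation, which uses a fixed-point multiply rather than the float multiply below:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Bring the int32 convolution accumulator back to uint8: downscale by the
// requantization multiplier, add the output offset (zero point), then
// saturate to [act_min, act_max] instead of masking to 8 bits.
uint8_t RequantizeAccumulator(int32_t acc, float real_multiplier,
                              int32_t output_offset, int32_t act_min,
                              int32_t act_max) {
  // Downscale: real_multiplier = (input_scale * filter_scale) / output_scale.
  int32_t scaled = static_cast<int32_t>(std::lround(acc * real_multiplier));
  // Apply the output offset, then clamp. Without the clamp, a value such as
  // 300 masked to 8 bits would wrap around to 44; the clamp saturates it
  // to 255 instead.
  int32_t shifted = scaled + output_offset;
  int32_t clamped = std::min(std::max(shifted, act_min), act_max);
  return static_cast<uint8_t>(clamped);
}
```

Since the clamp runs after the downscale and offset, its bounds are expressed in the output's quantized domain, which is why they come out as 0 and 255 even when a relu/relu6 is fused in.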
