> > I think you may not have fully understood my previous comment. One 
> > question I want to ask: do your quantized models have conv + relu / relu6 
> > like our model? If not, the range is obviously 0 ~ 255, no matter how many 
> > models there are. Please see: 
> > https://github.com/tensorflow/tensorflow/blob/v2.0.0-beta1/tensorflow/lite/kernels/kernel_util.cc#L138
> > @jackwish and I have emphasized this function code many times.
> 
> The quantized MobileNet v1 inference model is from the TFLite model 
> repository. The training model includes relu6 and batch normalization 
> operations, but these are fused into the convolution operations in the 
> inference model, as the Netron diagram shows.
> 
> The link you reference shows the floating point activation bounds that would 
> be applied during training. They do represent the range expected of the 
> upscaled values in the accumulator in the inference model. However, the min 
> and max saturation values passed into the inference quantized convolution are 
> applied _after downscale_ ... I previously provided the code and the link. 
> They are int32 values, not float values, and they are applied after both the 
> downscale and the offset. They are 0..255 even though the scaled-up range 
> expected from the fused-in relu6 operation is 0..6.
> 
> If the convolution and relu operations were separate, you would still see 0 
> and 255 for those min and max values, because they are applied after the 
> downscale and offset are applied to the convolution accumulator. The min and 
> max values serve only to saturate the downscaled result to the quantized 
> uint8 range, avoiding the wrap-around overflow/underflow that would occur if 
> the downscaled accumulator were simply masked to 8 bits.
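
For reference, the requantization ordering described in the quote above can be sketched as follows (hypothetical names, not the actual TFLite source): the downscale and the output offset are applied to the int32 accumulator first, and only then do the min / max saturate the result into the uint8 range.

```python
def requantize(acc, downscale, output_offset, act_min=0, act_max=255):
    """Downscale an int32 conv accumulator to the uint8 output range.

    Sketch only: the clamp to [act_min, act_max] happens AFTER the
    downscale and offset are applied, so it saturates rather than masks.
    """
    v = int(round(acc * downscale)) + output_offset
    return max(act_min, min(act_max, v))  # saturate, don't mask to 8 bits
```

For example, a large positive accumulator saturates to 255 instead of wrapping around the way a plain 8-bit mask would.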

I have emphasized that the model diagram is of a `quantized` model. Let me show 
more detail of the op's properties: 
![image](https://user-images.githubusercontent.com/7287321/59662050-b95a9780-91de-11e9-89c2-b252b8b3a8ae.png)
That is to say, not all relu / relu6 ops can be fused into the convolution in a 
TFLite `quantized` model. When they are not fused, the min / max are whatever 
the previously linked code, 
https://github.com/tensorflow/tensorflow/blob/v2.0.0-beta1/tensorflow/lite/kernels/kernel_util.cc#L138
computes. They are `NOT` simply 0 ~ 255. 
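To sketch what that function does (a rough Python rendering of the linked `CalculateActivationRangeUint8` logic; names and structure are my own approximation, not the TFLite source): the float activation bounds are quantized with the output tensor's scale and zero point, so the resulting min / max depend on the output quantization parameters and need not be 0 ~ 255.

```python
def quantize(v, scale, zero_point):
    # Map a float value into the uint8 domain of the output tensor.
    return int(round(v / scale)) + zero_point

def activation_range_uint8(activation, scale, zero_point):
    """Approximate the linked kernel_util.cc logic: clamp the quantized
    float activation bounds into the uint8 range [0, 255]."""
    qmin, qmax = 0, 255
    if activation == "relu6":          # float bounds 0.0 .. 6.0
        act_min = max(qmin, quantize(0.0, scale, zero_point))
        act_max = min(qmax, quantize(6.0, scale, zero_point))
    elif activation == "relu":         # float lower bound 0.0, no upper bound
        act_min = max(qmin, quantize(0.0, scale, zero_point))
        act_max = qmax
    else:                              # no activation: full uint8 range
        act_min, act_max = qmin, qmax
    return act_min, act_max
```

With an output scale of 6/255 and zero point 0 (the common fused-relu6 case), this does yield 0 ~ 255; with any other output quantization, an unfused relu / relu6 yields a tighter range.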

-- 
View this comment on GitHub:
https://github.com/dmlc/tvm/issues/2351#issuecomment-502986425