> > For `q_conv2d`, we will add two more arguments:
> >
> > ```python
> > output_min=0,
> > output_max=0
> > ```
> >
> > These will be used to restrict the output range, which could be calculated previously.
>
> I see what you are saying, but I am not sure this is the right approach. In my opinion, it would be better to keep it out of conv. The reason we have these two extra min/maxes is the fused activation in TFLite. It seems better to keep it separate so that both MXNet and TFLite can share quantized_conv2d. In the case of TFLite, when we see a fused conv, we can add one more clamp operator at the end of the sequence of ops.
Whether or not there is a fused activation function, we always need output_min / output_max: conv produces an int32 accumulator, but we need a uint8 result, so we must restrict the int32 values to the uint8 range. If there is no fused activation function (and in many quantized TFLite models there isn't), output_min / output_max will be 0 / 255 to restrict the int32 result. If we have relu6, output_min / output_max will be 0 / 6.

So I think we are better off putting these two into the conv arguments. We then avoid producing a separate clamp operator, since the restriction falls out naturally in conv2d's requantize (int32 -> uint8) step.

View it on GitHub: https://github.com/dmlc/tvm/issues/2351#issuecomment-497031857
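A minimal NumPy sketch of the requantize step described above: the int32 accumulator from a quantized conv2d is scaled back to uint8, and the fused activation is folded into the same clamp that the uint8 cast needs anyway. The names `requantize`, `output_min`, and `output_max` follow this discussion; the scale/zero-point arithmetic is generic affine quantization, not TVM's actual API, and the relu6 bounds below are illustrative placeholders.

```python
import numpy as np

def requantize(acc_int32, scale, zero_point, output_min=0, output_max=255):
    """Scale int32 conv accumulators down to uint8.

    output_min / output_max default to the full uint8 range [0, 255];
    a fused activation such as relu6 would instead pass its bounds
    expressed in the quantized (uint8) domain.
    """
    scaled = np.round(acc_int32 * scale).astype(np.int32) + zero_point
    # One clamp serves both the uint8 cast and the fused activation,
    # which is why no separate clamp operator is needed.
    clamped = np.clip(scaled, output_min, output_max)
    return clamped.astype(np.uint8)

acc = np.array([-120, 0, 300, 90000], dtype=np.int32)

# No fused activation: clamp to the full uint8 range.
print(requantize(acc, scale=0.01, zero_point=10))

# Hypothetical fused activation: tighter bounds, still a single clamp.
print(requantize(acc, scale=0.01, zero_point=10, output_min=0, output_max=60))
```

The point of the sketch is that the clamp is unavoidable for the int32 -> uint8 cast, so tightening its bounds for a fused activation is free.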