> > For `q_conv2d`, we will add two more arguments:
> >
> > ```python
> > output_min=0,
> > output_max=0
> > ```
> >
> > These will be used to restrict the output range, which could be calculated previously.
>
> I see what you are saying, but I am not sure this is the right approach. In my opinion, it would be better to keep it out of conv. The reason we have these two extra min/maxes is the fused activation in TFLite. It seems better to keep it separate so that both MXNet and TFLite can share quantized_conv2d. In the case of TFLite, when we see a fused conv, we can add one more clamp operator at the end of the sequence of ops.
Whether or not there is a fused activation function, we always need output_min / output_max: conv produces an int32 accumulator, but we need a uint8 result, so we must restrict the int32 values to the uint8 range. If there is no fused activation function (and in many quantized TFLite models there isn't), output_min / output_max will be 0 / 255 to restrict the int32 result. If we have relu6, output_min / output_max will be 0 / 6.

So I think we are better off putting these two into the conv arguments. We then avoid producing a separate clamp operator, since the restriction falls out naturally in conv2d's requantize (int32 -> uint8) step.

View it on GitHub: https://github.com/dmlc/tvm/issues/2351#issuecomment-497031857
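A minimal NumPy sketch of the requantize step described above: the int32 accumulator from a quantized conv2d is scaled back to uint8, and the fused activation is folded into the same clamp that the uint8 cast needs anyway. The names `requantize`, `output_min`, and `output_max` follow this discussion; the scale/zero-point arithmetic is generic affine quantization, not TVM's actual API, and the relu6 bounds below are illustrative placeholders.

```python
import numpy as np

def requantize(acc_int32, scale, zero_point, output_min=0, output_max=255):
    """Scale int32 conv accumulators down to uint8.

    output_min / output_max default to the full uint8 range [0, 255];
    a fused activation such as relu6 would instead pass its bounds
    expressed in the quantized (uint8) domain.
    """
    scaled = np.round(acc_int32 * scale).astype(np.int32) + zero_point
    # One clamp serves both the uint8 cast and the fused activation,
    # which is why no separate clamp operator is needed.
    clamped = np.clip(scaled, output_min, output_max)
    return clamped.astype(np.uint8)

acc = np.array([-120, 0, 300, 90000], dtype=np.int32)

# No fused activation: clamp to the full uint8 range.
print(requantize(acc, scale=0.01, zero_point=10))

# Hypothetical fused activation: tighter bounds, still a single clamp.
print(requantize(acc, scale=0.01, zero_point=10, output_min=0, output_max=60))
```

The point of the sketch is that the clamp is unavoidable for the int32 -> uint8 cast, so tightening its bounds for a fused activation is free.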