> @FrozenGene For the output_min and max, isn't the out_dtype enough? If it's 
> uint8, we can clamp at 0 and 255. If it's int8, we can clamp at -128 and 127. 
> I don't see any reason the values would be different, unless you want to 
> fuse the quantized relu into the quantized convolution from the start. 
> Please let me know if I am misunderstanding something. I think we 
> should not fuse operators in the frontend and let Relay graph fusion take 
> care of that.
> 
> Let's see what others think about this. @tqchen @yzhliu @ZihengJiang What are 
> your thoughts on this?

I think that is OK. If we go this way, we should insert one clamp when there is a fused activation, like in our TFLite frontend:
```python
# If we have fused activations
if fused_activation_fn != ActivationFunctionType.NONE:
    if weight_tensor_type == TensorType.UINT8:
        # Quantized path: compute the clamp bounds in the uint8 domain
        # (this helper still needs to be implemented)
        output_min, output_max = self.calculate_activation_range_uint8(
            output_scale, output_zero_point, fused_activation_fn)
        # Insert a clip instead of the float activation op
        out = _op.clip(out, output_min, output_max)
    else:
        # Float path: keep the existing activation conversion
        out = self.convert_fused_activation_function(out, fused_activation_fn)
```
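
For what it's worth, here is a rough sketch of what `calculate_activation_range_uint8` could look like. It is not code we have in the frontend yet; it just mirrors how TFLite's kernels derive activation ranges in the quantized domain. The enum members come from the generated `tflite` flatbuffer schema, it would live as a method on the converter class, and for an int8 output the 0/255 bounds would become -128/127:

```python
from tflite.ActivationFunctionType import ActivationFunctionType


def calculate_activation_range_uint8(scale, zero_point, fused_activation_fn):
    """Return (output_min, output_max) in the uint8 quantized domain for a
    fused activation, given the output tensor's scale and zero point."""
    # Quantize a real value into the output's uint8 domain.
    def quantize(real):
        return int(round(real / scale)) + zero_point

    qmin, qmax = 0, 255  # full uint8 range
    if fused_activation_fn == ActivationFunctionType.RELU:
        return max(qmin, quantize(0.0)), qmax
    if fused_activation_fn == ActivationFunctionType.RELU6:
        return max(qmin, quantize(0.0)), min(qmax, quantize(6.0))
    if fused_activation_fn == ActivationFunctionType.RELU_N1_TO_1:
        return max(qmin, quantize(-1.0)), min(qmax, quantize(1.0))
    # NONE or anything unhandled keeps the full range
    return qmin, qmax
```

With that, the activation ends up folded into the single `_op.clip` above and nothing extra is needed for the quantized case.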
