> @FrozenGene For the output_min and max, isn't the out_dtype enough? If it's
> uint8, we can clamp at 0 and 255. If it's int8, we can clamp at -128 and 127.
> I don't see any reason the values would be any different, unless you want to
> fuse the quantized relu into the quantized convolution from the start. Please
> let me know if I am misunderstanding something. I think we should not fuse
> operators in the frontend and should let Relay graph fusion take care of
> that.
>
> Let's see what others think about this. @tqchen @yzhliu @ZihengJiang What are
> your thoughts on this?
I think that is OK. If we go this way, we should insert a clip whenever there
is a fused activation, as in our TFLite frontend:
```python
# If we have fused activations
if fused_activation_fn != ActivationFunctionType.NONE:
    if weight_tensor_type == TensorType.UINT8:
        # Quantized path: compute the clip range for this activation
        # (calculate_activation_range_uint8 still needs to be implemented)
        output_min, output_max = self.calculate_activation_range_uint8(
            output_scale, output_zero_point, fused_activation_fn)
        # Insert the clip to realize the fused activation
        out = _op.clip(out, output_min, output_max)
    else:
        # Float path: apply the activation function directly
        out = self.convert_fused_activation_function(out, fused_activation_fn)
```
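For reference, here is a minimal sketch of what `calculate_activation_range_uint8` could look like; it is an assumption, not a final implementation. The idea is to clamp the quantized float bounds of the fused activation against the uint8 dtype limits (0 and 255), using the usual affine mapping q = round(r / scale) + zero_point:

```python
def calculate_activation_range_uint8(self, scale, zero_point, fused_activation_fn):
    """Sketch (assumed implementation): return (min, max) in the uint8 domain.

    The dtype already bounds the output to [0, 255]; a fused activation only
    narrows that range by quantizing its float bounds.
    """
    def quantize(real_value):
        return int(round(real_value / scale)) + zero_point

    qmin, qmax = 0, 255  # uint8 dtype limits
    if fused_activation_fn == ActivationFunctionType.RELU:
        return max(qmin, quantize(0.0)), qmax
    if fused_activation_fn == ActivationFunctionType.RELU6:
        return max(qmin, quantize(0.0)), min(qmax, quantize(6.0))
    if fused_activation_fn == ActivationFunctionType.RELU_N1_TO_1:
        return max(qmin, quantize(-1.0)), min(qmax, quantize(1.0))
    # NONE (or anything else): just the dtype range
    return qmin, qmax
```

This roughly follows the CalculateActivationRangeUint8 helper in TFLite's kernel utilities, where the activation is folded into the clip bounds instead of emitting a separate relu op.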