> @FrozenGene For the output_min and output_max, isn't the out_dtype enough? If it's uint8, we can clamp at 0 and 255. If it's int8, we can clamp at -128 and 127. I don't see any reason the values would be different, unless you want to fuse the quantized relu into the quantized convolution from the start. Please let me know if I am misunderstanding something. I think we should not fuse operators in the frontend and let Relay graph fusion take care of that.
>
> Let's see what others think about this. @tqchen @yzhliu @ZihengJiang What are your thoughts on this?
I think it is OK. If we go this way, we should insert a clamp when there is a fused activation, like this in our TFLite frontend:

```python
# If we have fused activations
if fused_activation_fn != ActivationFunctionType.NONE:
    if weight_tensor_type == TensorType.UINT8:
        # implement this function
        output_min, output_max = self.calculate_activation_range_uint8(
            output_scale, output_zero_point, fused_activation_fn)
        # insert clip
        out = _op.clip(out, output_min, output_max)
    out = self.convert_fused_activation_function(out, fused_activation_fn)
```
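For illustration only, here is a minimal sketch of what `calculate_activation_range_uint8` could look like. The helper name and arguments are taken from the snippet above rather than from existing frontend code, and plain strings stand in for the tflite `ActivationFunctionType` values so the example is self-contained. The idea is to quantize the activation's float bounds with `q = round(r / scale) + zero_point` and intersect them with the natural uint8 range:

```python
import numpy as np

def calculate_activation_range_uint8(scale, zero_point, fused_activation_fn):
    """Return (qmin, qmax) in the uint8 domain for a fused activation.

    The activation's float bounds are quantized with
    q = round(r / scale) + zero_point and then intersected with the
    full uint8 range [0, 255].
    """
    qmin, qmax = 0, 255  # full uint8 range

    def quantize(real_value):
        return int(np.round(real_value / scale)) + zero_point

    if fused_activation_fn == "NONE":
        return qmin, qmax
    if fused_activation_fn == "RELU":          # clamp to [0, +inf)
        return max(qmin, quantize(0.0)), qmax
    if fused_activation_fn == "RELU6":         # clamp to [0, 6]
        return max(qmin, quantize(0.0)), min(qmax, quantize(6.0))
    if fused_activation_fn == "RELU_N1_TO_1":  # clamp to [-1, 1]
        return max(qmin, quantize(-1.0)), min(qmax, quantize(1.0))
    raise ValueError("Unsupported fused activation: %s" % fused_activation_fn)

# Example: output scale 0.05, zero point 10, fused RELU6
print(calculate_activation_range_uint8(0.05, 10, "RELU6"))  # -> (10, 130)
```

With this, a RELU6 fused into a convolution whose output has scale 0.05 and zero point 10 becomes `clip(out, 10, 130)`, whereas looking at out_dtype alone would only give `clip(out, 0, 255)`.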