Let's start with just Requantize to keep it focused.

### QNN proposal

~~~
def requantize(data,
               input_scale,
               input_zero_point,
               output_scale,
               output_zero_point,
               rounding="AWAY_FROM_ZERO",
               out_dtype="int8"):
    r"""Requantize operator.

    The requantize operator converts one quantized tensor representation to
    another quantized tensor representation. For the output tensor, we are
    provided with the output scale and zero point. The computation is as
    follows

        Q_output = zp_output + (scale_input)/(scale_output) * (Q_input - zp_input)

    Parameters
    ----------
    data : tvm.relay.Expr
        The input data to the operator.

    input_scale : float
        The quantization scale for the input tensor.

    input_zero_point : int
        The zero point of the input tensor.

    output_scale : float
        The quantization scale for the output tensor.

    output_zero_point : int
        The zero point of the output tensor.

    rounding : string, optional
        Defines the rounding direction when the value is midway between two
        representable values.

    out_dtype : str, optional
        Specifies the output data type.

    Returns
    -------
    result : tvm.relay.Expr
        The computed result.
    """
~~~
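To make the formula concrete, here is a minimal NumPy reference of the computation. This is only a sketch: a real lowering would use fixed-point arithmetic rather than float64, and the `requantize_ref` name and the `"TOWARDS_ZERO"` spelling are my assumptions, not part of the proposal.

~~~
import numpy as np

def requantize_ref(q_input, input_scale, input_zero_point,
                   output_scale, output_zero_point,
                   rounding="AWAY_FROM_ZERO", out_dtype="int8"):
    # Q_output = zp_output + (scale_input / scale_output) * (Q_input - zp_input)
    scaled = ((q_input.astype(np.float64) - input_zero_point)
              * (input_scale / output_scale))
    if rounding == "AWAY_FROM_ZERO":
        # Midway values move away from zero: 2.5 -> 3, -2.5 -> -3.
        rounded = np.sign(scaled) * np.floor(np.abs(scaled) + 0.5)
    elif rounding == "TOWARDS_ZERO":
        # Midway values move toward zero: 2.5 -> 2, -2.5 -> -2
        # (typically cheaper, at a small accuracy cost).
        rounded = np.sign(scaled) * np.ceil(np.abs(scaled) - 0.5)
    else:
        raise ValueError("unknown rounding mode: " + rounding)
    # Shift by the output zero point and saturate to the out_dtype range.
    info = np.iinfo(np.dtype(out_dtype))
    return np.clip(rounded + output_zero_point,
                   info.min, info.max).astype(out_dtype)

# Example: requantize int32 values (scale 0.5, zp 0) to int8 (scale 2.0, zp 1).
acc = np.array([-500, -450, 0, 450, 500], dtype=np.int32)
print(requantize_ref(acc, 0.5, 0, 2.0, 1))  # -> [-124 -112 1 114 126]
~~~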
### TF Requantize

~~~
Arguments:
  scope: A Scope object
  input_min: The float value that the minimum quantized input value represents.
  input_max: The float value that the maximum quantized input value represents.
  requested_output_min: The float value that the minimum quantized output value represents.
  requested_output_max: The float value that the maximum quantized output value represents.
  out_type: The type of the output. Should be a lower bit depth than Tinput.
~~~

* The min/max pair in TF is represented by scale and zero_point in QNN (see the conversion sketch below).
* `rounding` in the QNN proposal gives a choice between standard rounding (away_from_zero) and round_towards_zero, providing a performance-accuracy knob.
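The mapping between the two parameterizations is mechanical. Below is an illustrative sketch of it; the helper name `min_max_to_scale_zero_point` and the exact rounding/clamping policy for the zero point are assumptions, not part of either API.

~~~
import numpy as np

def min_max_to_scale_zero_point(fmin, fmax, dtype="uint8"):
    # Hypothetical helper: derive QNN-style (scale, zero_point) from a
    # TF-style (min, max) float range for the given quantized dtype.
    info = np.iinfo(np.dtype(dtype))
    qmin, qmax = info.min, info.max
    # One real-valued step per quantized step across the representable range.
    scale = (fmax - fmin) / (qmax - qmin)
    # zero_point is the quantized integer that represents real 0.0,
    # rounded and clamped into the dtype's range.
    zero_point = int(round(qmin - fmin / scale))
    return scale, min(max(zero_point, qmin), qmax)

# Example: the TF range [-1.0, 1.0] on uint8 becomes
# scale ~= 0.00784 (= 2/255) and zero_point = 128.
print(min_max_to_scale_zero_point(-1.0, 1.0))
~~~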