Let's start with just Requantize to keep the discussion focused.
### QNN proposal
~~~
def requantize(data,
               input_scale,
               input_zero_point,
               output_scale,
               output_zero_point,
               rounding="AWAY_FROM_ZERO",
               out_dtype="int8"):
    r"""Requantize operator.

    The requantize operator converts one quantized tensor representation to
    another quantized tensor representation. For the output tensor, we are
    provided with the output scale and zero point. The computation is as
    follows:

        Q_output = zp_output + (scale_input / scale_output) * (Q_input - zp_input)

    Parameters
    ----------
    data : tvm.relay.Expr
        The input data to the operator.
    input_scale : float
        The quantization scale for the input tensor.
    input_zero_point : int
        The zero point of the input tensor.
    output_scale : float
        The quantization scale for the output tensor.
    output_zero_point : int
        The zero point of the output tensor.
    rounding : string, optional
        Defines the rounding direction when the value is midway between two
        representable values.
    out_dtype : str, optional
        Specifies the output data type.

    Returns
    -------
    result : tvm.relay.Expr
        The computed result.
    """
~~~
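To make these semantics concrete, here is a minimal NumPy sketch of the formula above. The name `requantize_ref`, the float simulation, and treating the towards-zero mode as plain truncation are my assumptions rather than part of the proposal; a real kernel would use fixed-point arithmetic.

~~~
import numpy as np

def requantize_ref(q_input, input_scale, input_zero_point,
                   output_scale, output_zero_point,
                   rounding="AWAY_FROM_ZERO", out_dtype=np.int8):
    """Float-based reference for the requantize formula above."""
    # Affine mapping in floating point:
    # (scale_input / scale_output) * (Q_input - zp_input)
    real = (input_scale / output_scale) * (
        q_input.astype(np.float64) - input_zero_point)
    if rounding == "AWAY_FROM_ZERO":
        # Halfway cases move away from zero: 2.5 -> 3, -2.5 -> -3.
        rounded = np.sign(real) * np.floor(np.abs(real) + 0.5)
    else:
        # Interpreting round_towards_zero as truncation (an assumption;
        # the proposal may intend it only for midway values): 2.5 -> 2.
        rounded = np.trunc(real)
    q_output = rounded + output_zero_point
    # Saturate to the representable range of out_dtype.
    info = np.iinfo(out_dtype)
    return np.clip(q_output, info.min, info.max).astype(out_dtype)
~~~

For example, `requantize_ref(np.array([10]), 0.5, 0, 0.25, 0)` gives `[20]`: the input scale is twice the output scale, so quantized values double. Truncation is the cheaper of the two modes in fixed-point arithmetic, which is the performance-accuracy trade-off the `rounding` attribute exposes.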
### TF Requantize
~~~
Arguments:
scope: A Scope object
input_min: The float value that the minimum quantized input value represents.
input_max: The float value that the maximum quantized input value represents.
requested_output_min: The float value that the minimum quantized output value
represents.
requested_output_max: The float value that the maximum quantized output value
represents.
out_type: The type of the output. Should be a lower bit depth than Tinput.
~~~
* The min/max range in TF corresponds to scale and zero_point in QNN; the mapping between the two conventions is sketched after this list.
* `rounding` in the QNN proposal gives a choice between standard rounding
  (away_from_zero) and round_towards_zero, providing a performance-accuracy knob.
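As a rough illustration of the first point, here is how a TF-style (min, max) range can map to an affine (scale, zero_point) pair. The helper name `minmax_to_scale_zp` and the int8 qmin/qmax defaults are mine; real frameworks additionally nudge the range so that real 0.0 is exactly representable.

~~~
def minmax_to_scale_zp(range_min, range_max, qmin=-128, qmax=127):
    """Map a TF-style (min, max) float range to QNN (scale, zero_point).

    Assumes the affine convention: real = scale * (q - zero_point).
    """
    scale = (range_max - range_min) / (qmax - qmin)
    # zero_point is the quantized value that represents real 0.0,
    # clamped into the representable range.
    zero_point = int(round(qmin - range_min / scale))
    return scale, max(qmin, min(qmax, zero_point))

# e.g. minmax_to_scale_zp(-1.0, 1.0) -> (2/255, 0)
~~~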