Let's start with just Requantize to keep the discussion focused.
### QNN proposal
~~~
def requantize(data,
               input_scale,
               input_zero_point,
               output_scale,
               output_zero_point,
               rounding="AWAY_FROM_ZERO",
               out_dtype="int8"):
    r"""Requantize operator.

    The requantize operator converts one quantized tensor representation to
    another quantized tensor representation. For the output tensor, we are
    provided with the output scale and zero point. The computation is as
    follows:

        Q_output = zp_output + (scale_input / scale_output) * (Q_input - zp_input)

    Parameters
    ----------
    data : tvm.relay.Expr
        The input data to the operator.
    input_scale : float
        The quantization scale for the input tensor.
    input_zero_point : int
        The zero point of the input tensor.
    output_scale : float
        The quantization scale for the output tensor.
    output_zero_point : int
        The zero point of the output tensor.
    rounding : string, optional
        Defines the rounding direction when the value is midway between two
        representable values.
    out_dtype : str, optional
        Specifies the output data type.

    Returns
    -------
    result : tvm.relay.Expr
        The computed result.
    """
~~~
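To make these semantics concrete, here is a minimal NumPy sketch of the formula above. The name `requantize_ref`, the float simulation, and treating the towards-zero mode as plain truncation are my assumptions rather than part of the proposal; a real kernel would use fixed-point arithmetic.

~~~
import numpy as np

def requantize_ref(q_input, input_scale, input_zero_point,
                   output_scale, output_zero_point,
                   rounding="AWAY_FROM_ZERO", out_dtype=np.int8):
    """Float-based reference for the requantize formula above."""
    # Affine mapping in floating point:
    # (scale_input / scale_output) * (Q_input - zp_input)
    real = (input_scale / output_scale) * (
        q_input.astype(np.float64) - input_zero_point)
    if rounding == "AWAY_FROM_ZERO":
        # Halfway cases move away from zero: 2.5 -> 3, -2.5 -> -3.
        rounded = np.sign(real) * np.floor(np.abs(real) + 0.5)
    else:
        # Interpreting round_towards_zero as truncation (an assumption;
        # the proposal may intend it only for midway values): 2.5 -> 2.
        rounded = np.trunc(real)
    q_output = rounded + output_zero_point
    # Saturate to the representable range of out_dtype.
    info = np.iinfo(out_dtype)
    return np.clip(q_output, info.min, info.max).astype(out_dtype)
~~~

For example, `requantize_ref(np.array([10]), 0.5, 0, 0.25, 0)` gives `[20]`: the input scale is twice the output scale, so quantized values double. Truncation is the cheaper of the two modes in fixed-point arithmetic, which is the performance-accuracy trade-off the `rounding` attribute exposes.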
### TF Requantize
~~~
Arguments:
scope: A Scope object
input_min: The float value that the minimum quantized input value represents.
input_max: The float value that the maximum quantized input value represents.
requested_output_min: The float value that the minimum quantized output value
represents.
requested_output_max: The float value that the maximum quantized output value
represents.
out_type: The type of the output. Should be a lower bit depth than Tinput.
~~~
* The min/max range in TF corresponds to scale and zero_point in QNN; the mapping between the two conventions is sketched after this list.
* `rounding` in the QNN proposal gives a choice between standard rounding
  (away_from_zero) and round_towards_zero, providing a performance-accuracy knob.
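As a rough illustration of the first point, here is how a TF-style (min, max) range can map to an affine (scale, zero_point) pair. The helper name `minmax_to_scale_zp` and the int8 qmin/qmax defaults are mine; real frameworks additionally nudge the range so that real 0.0 is exactly representable.

~~~
def minmax_to_scale_zp(range_min, range_max, qmin=-128, qmax=127):
    """Map a TF-style (min, max) float range to QNN (scale, zero_point).

    Assumes the affine convention: real = scale * (q - zero_point).
    """
    scale = (range_max - range_min) / (qmax - qmin)
    # zero_point is the quantized value that represents real 0.0,
    # clamped into the representable range.
    zero_point = int(round(qmin - range_min / scale))
    return scale, max(qmin, min(qmax, zero_point))

# e.g. minmax_to_scale_zp(-1.0, 1.0) -> (2/255, 0)
~~~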