Relevant QNN Dialect RFC - #3591
Some QNN operators, like Requantize and Conv2D, are more amenable to lowering
through a C++ pass. A C++ implementation seems better when the new operator is
conceptually very different from existing operators (e.g., Requantize), when
the input/output TensorType needs to be known for lowering, when the lowering
needs a lot of type checking, etc.
However, as we start adding more QNN operators, we should try to reduce the
engineering effort needed to add a new one. Here, we propose a way to reduce
the number of additional lines required to add a new operator when the
lowering is straightforward: we still add a new QNN operator, but it exists
only in Python, and we directly return the lowered sequence of Relay
operators from Python.
~~~
from tvm import relay

# QNN max_pool2d can be lowered to cast, subtract, and nn.max_pool2d.
# This operator lives in the qnn namespace.
def max_pool2d(quantized_data,
               input_zero_point,
               pool_size=(1, 1),
               strides=(1, 1),
               padding=(0, 0),
               layout="NCHW",
               ceil_mode=False):
    # Upcast to int32 so the zero-point subtraction cannot overflow.
    casted_data = relay.cast(quantized_data, dtype="int32")
    # Shift the quantized values by the input zero point.
    shifted_data = relay.subtract(casted_data,
                                  relay.const(input_zero_point, "int32"))
    # Max pooling is order-preserving, so the regular nn.max_pool2d can run
    # directly on the shifted integer data.
    return relay.nn.max_pool2d(shifted_data,
                               pool_size=pool_size,
                               strides=strides,
                               padding=padding,
                               layout=layout,
                               ceil_mode=ceil_mode)
~~~
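As a rough illustration of how such a Python-only operator would be consumed, the sketch below builds a small Relay graph with it. The shapes, dtype, and zero-point value are made-up; only `relay.var`, `relay.Function`, and the `max_pool2d` definition above come from the surrounding context.

~~~
# Hypothetical usage sketch: shapes, dtype, and the zero point are
# illustrative values, not part of this RFC.
from tvm import relay

data = relay.var("quantized_data", shape=(1, 3, 224, 224), dtype="uint8")
# Calling the Python-only operator returns the already-lowered Relay
# expression (cast -> subtract -> nn.max_pool2d); no C++ lowering pass runs.
out = max_pool2d(data, input_zero_point=128,
                 pool_size=(2, 2), strides=(2, 2))
func = relay.Function([data], out)
~~~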
* Therefore we don't add new code in C++ files, which makes it easier to add a
new *simple* operator. The operator can also be shared among the framework
parsers.
* Many operators are pretty simple - qnn.concat can be converted to a
requantize on each input followed by nn.concat, qnn.split can be converted to
nn.split followed by a requantize on each output, and so on. Avg_pool2d and
Relu (and many other unary compute operations) are also very simple and can be
lowered in this manner; a hedged sketch of such a lowering follows this list.
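To make the concat example concrete, here is a minimal sketch of what a Python-only qnn concatenate could look like. It assumes a qnn requantize operator (per RFC #3591) exposed as `relay.qnn.op.requantize` with scalar scale and zero-point arguments; the function name, parameter names, and call signature are assumptions for illustration, not the final API.

~~~
# Hedged sketch of a Python-only qnn concatenate. The requantize call and
# its parameter names are assumptions based on the Requantize RFC (#3591).
from tvm import relay

def concatenate(data_list, input_scales, input_zero_points,
                output_scale, output_zero_point, axis=0):
    # Requantize every input to the common output scale/zero point, then
    # fall back to the regular Relay concatenate.
    requantized = [
        relay.qnn.op.requantize(data,
                                input_scale=scale,
                                input_zero_point=zero_point,
                                output_scale=output_scale,
                                output_zero_point=output_zero_point)
        for data, scale, zero_point in zip(data_list,
                                           input_scales,
                                           input_zero_points)
    ]
    return relay.concatenate(requantized, axis=axis)
~~~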
@tqchen @yzhliu @FrozenGene @shoubhik
Thanks @rankyung-hong for prototyping and helping write the doc.