> Thanks @jackwish for confirming the python lowering looks good.
>
> For max pooling, we used casting, because we have to subtract the zero point from the quantized tensor. That subtract needs to happen in higher precision than (u)int8. Correct me if I am wrong.
To me, for pooling operators the quantization parameters of the input and output tensors are the same, so there is no need to introduce zero-point semantics. For MaxPooling, since we are only selecting the maximum among uint8 values, the selection does not shift the zero point. For AveragePooling, we accumulate in int32, and the accumulated zero point can easily be folded into the division. You may check the [TFLite code](https://github.com/tensorflow/tensorflow/blob/v2.0.0-beta1/tensorflow/lite/kernels/internal/reference/integer_ops/pooling.h#L81-L136) if you are interested. (I don't mean to enforce the TFLite implementation, just the reasoning above.)

To summarize: in general, if the zero point won't shift during the computation, it can be safely ignored. (Maybe I have been talking too much... but I was trying to explain why...)
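To make the argument concrete, here is a minimal NumPy sketch (not the TVM or TFLite code; the function names, 2x2 window, and shapes are hypothetical) showing why neither pooling needs an explicit zero-point subtraction when input and output share quantization parameters. Since `max(s*(v - z)) == s*(max(v) - z)` and `mean(s*(v_i - z)) == s*(mean(v_i) - z)`, the results stay in the same quantization domain:

```python
import numpy as np

def quantized_max_pool_2x2(q: np.ndarray) -> np.ndarray:
    """Max pooling directly on uint8: max is monotonic, so the winner
    among raw uint8 values is also the winner among dequantized values.
    No zero-point subtraction (and no widening cast) is needed."""
    h, w = q.shape
    return q.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def quantized_avg_pool_2x2(q: np.ndarray) -> np.ndarray:
    """Average pooling: accumulate in int32, then rounding-divide.
    The accumulated zero point (count * z) is reduced back to z by the
    division, so the integer mean is already correctly quantized."""
    h, w = q.shape
    acc = q.astype(np.int32).reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))
    count = 4  # elements per 2x2 window
    return ((acc + count // 2) // count).astype(np.uint8)

q = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
print(quantized_max_pool_2x2(q))
print(quantized_avg_pool_2x2(q))
```

Note that only the average pooling path widens to int32, and only to avoid overflow in the accumulation, not to handle the zero point.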