> Thanks @jackwish for confirming the python lowering looks good.
>
> For max pooling, we used casting, because we have to subtract the zero point from the quantized tensor. That subtract needs to happen in higher precision than (u)int8. Correct me if I am wrong.
To me, for pooling operators the quantization parameters of the input and output tensors are the same, so there is no need to introduce zero-point semantics. For MaxPooling, since we are only selecting the maximum among uint8 values, the selection does not shift the zero point. For AveragePooling, we accumulate in int32, and the accumulated zero point can easily be folded into the division. You may check the [TFLite code](https://github.com/tensorflow/tensorflow/blob/v2.0.0-beta1/tensorflow/lite/kernels/internal/reference/integer_ops/pooling.h#L81-L136) if you are interested. (I don't mean to enforce the TFLite implementation, just the reasoning above.)

To summarize: in general, if the zero point won't shift during the computation, it can be safely ignored. (Maybe I have been talking too much... but I was trying to explain why...)
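To make the argument concrete, here is a minimal NumPy sketch (not the TVM or TFLite code; the function names, 2x2 window, and shapes are hypothetical) showing why neither pooling needs an explicit zero-point subtraction when input and output share quantization parameters. Since `max(s*(v - z)) == s*(max(v) - z)` and `mean(s*(v_i - z)) == s*(mean(v_i) - z)`, the results stay in the same quantization domain:

```python
import numpy as np

def quantized_max_pool_2x2(q: np.ndarray) -> np.ndarray:
    """Max pooling directly on uint8: max is monotonic, so the winner
    among raw uint8 values is also the winner among dequantized values.
    No zero-point subtraction (and no widening cast) is needed."""
    h, w = q.shape
    return q.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def quantized_avg_pool_2x2(q: np.ndarray) -> np.ndarray:
    """Average pooling: accumulate in int32, then rounding-divide.
    The accumulated zero point (count * z) is reduced back to z by the
    division, so the integer mean is already correctly quantized."""
    h, w = q.shape
    acc = q.astype(np.int32).reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))
    count = 4  # elements per 2x2 window
    return ((acc + count // 2) // count).astype(np.uint8)

q = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
print(quantized_max_pool_2x2(q))
print(quantized_avg_pool_2x2(q))
```

Note that only the average pooling path widens to int32, and only to avoid overflow in the accumulation, not to handle the zero point.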