> Thanks @jackwish and @FrozenGene I understand your points.
>
> This can be treated as optimization then. If the input zero point is zero OR if the input and output quantization params are same, don't cast, directly apply maxpool. Generally, we would like to keep QNN APIs generic. So, if MxNet for some reason decides to have different mix/maxes, we should be able to support that. Does that sound good?
If a generic API is preferred, then the scale and zero point of both the input and output tensors should probably all be included. If the input and output zero points differ, the scales may differ as well, which would require requantization. Pooling doesn't seem to be such a case, since the input and output share the same value distribution. Anyway, I am fine with subtracting the zero point, but remember to add it back after the lowered pooling. :)
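To make the zero-point handling concrete, here is a rough Relay sketch of what I mean (this is not the actual QNN lowering; the function name `quantized_avg_pool2d` and its zero-point arguments are made up for illustration):

```python
import tvm
from tvm import relay


def quantized_avg_pool2d(data, input_zero_point, output_zero_point,
                         pool_size=(2, 2), strides=(2, 2)):
    """Hypothetical lowering: data is a uint8 NCHW tensor, zero points are Python ints."""
    # Widen to int32 so the subtraction and the pooling sum cannot overflow.
    widened = relay.cast(data, "int32")
    # Shift into the zero-centered integer domain by subtracting the input zero point.
    shifted = relay.subtract(widened, relay.const(input_zero_point, "int32"))
    # Pooling itself is quantization-agnostic once the zero point is removed.
    pooled = relay.nn.avg_pool2d(shifted, pool_size=pool_size, strides=strides)
    # Add the *output* zero point back before narrowing to uint8 again.
    rezeroed = relay.add(pooled, relay.const(output_zero_point, "int32"))
    clipped = relay.clip(rezeroed, a_min=0, a_max=255)
    return relay.cast(clipped, "uint8")


# Example usage (hypothetical): identical input/output quantization params,
# so no requantization is needed, only the zero-point shift and restore.
x = relay.var("x", shape=(1, 3, 32, 32), dtype="uint8")
y = quantized_avg_pool2d(x, input_zero_point=128, output_zero_point=128)
print(relay.Function([x], y))
```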