Hi @anijain2305, regarding requantization: if it is not going to be fused
into the conv op, then the conv op should output FP32; otherwise the
semantics are confusing. A separate requantize op can then convert FP32
back to INT8. The multiplier/shift based requantization approach introduced
by TFLite is also adopted by Caffe2/QNNPACK.
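To make the multiplier/shift idea concrete, here is a minimal NumPy sketch
of TFLite-style requantization. The helper names (`quantize_multiplier`,
`requantize`) and the 31-bit fixed-point convention are illustrative
assumptions for this example, not TVM's actual API:

```python
import numpy as np

def quantize_multiplier(real_multiplier):
    # Decompose a real multiplier in (0, 1) into a 31-bit fixed-point
    # significand and a power-of-two exponent (hypothetical helper).
    assert 0.0 < real_multiplier < 1.0
    significand, shift = np.frexp(real_multiplier)   # significand in [0.5, 1)
    fixed = int(round(significand * (1 << 31)))
    if fixed == (1 << 31):        # rounding pushed us to 1.0; renormalize
        fixed //= 2
        shift += 1
    return fixed, shift

def requantize(acc, input_scale, output_scale, output_zero_point):
    # Requantize INT32 accumulators to INT8 using only an integer
    # multiply and a rounding right shift at runtime.
    fixed, shift = quantize_multiplier(input_scale / output_scale)
    total_shift = 31 - shift
    rounded = (acc.astype(np.int64) * fixed
               + (1 << (total_shift - 1))) >> total_shift
    out = rounded + output_zero_point
    return np.clip(out, -128, 127).astype(np.int8)
```

The point is that the floating-point work (`frexp`, the division of scales)
happens once at compile time; the per-element runtime path is pure integer
arithmetic, which is what makes this scheme attractive for INT8 backends.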

Also, maybe we can put the quantization parameters in the tensor, as the
*scale* and *zero point* describe the INT8 tensor data rather than the op.
Ops are then supposed to read these parameters from their input tensors and
get things done.
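As a rough illustration of that layout, the parameters could travel with
the data in one container. `QTensor` below is an assumed type for this
sketch, not anything in TVM:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class QTensor:
    # Illustrative container: scale and zero_point live on the tensor,
    # so ops read them from inputs instead of carrying op attributes.
    data: np.ndarray   # int8 values
    scale: float
    zero_point: int

    def dequantize(self) -> np.ndarray:
        # real_value = scale * (quantized_value - zero_point)
        return self.scale * (self.data.astype(np.float32) - self.zero_point)
```

An op consuming two `QTensor` inputs would then have everything it needs to
derive its output scale without extra attributes on the op node.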

-- 
https://github.com/dmlc/tvm/issues/2351#issuecomment-496892642