> Hi @anijain2305, regarding the requantization: if it is not going to be put 
> in the conv op, the op should probably output FP32, otherwise the semantics 
> are confusing. The requantization can convert FP32 to INT8. The 
> multiplier/shift based requantization approach introduced by TFLite is also 
> adopted by Caffe2/QNNPACK.

Makes sense. Does it make sense to add accumulator_dtype as one of the 
attributes of quantized_conv2d? It would be set to int32 for TFLite, Caffe2, 
and QNNPACK, but if some network needs accumulation in FP32, the op would 
support that as well.
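
For concreteness, here is a minimal NumPy sketch of the multiplier/shift 
requantization described above, assuming an int32 accumulator and an int8 
output. The function names and the simplified rounding are illustrative, not 
TFLite's actual implementation:

```python
import numpy as np

def quantize_multiplier(real_multiplier):
    # Decompose 0 < real_multiplier < 1 into a normalized fixed-point
    # multiplier in [2^30, 2^31] and a right shift, TFLite-style.
    shift = 0
    while real_multiplier < 0.5:
        real_multiplier *= 2.0
        shift += 1
    quantized = int(round(real_multiplier * (1 << 31)))
    return quantized, shift

def requantize(acc, input_scale, weight_scale, output_scale, output_zp):
    # Rescale int32 accumulators to int8 using only integer arithmetic at
    # runtime; the float multiplier is folded away ahead of time.
    q, shift = quantize_multiplier(input_scale * weight_scale / output_scale)
    total_shift = 31 + shift
    prod = acc.astype(np.int64) * q
    rounded = (prod + (1 << (total_shift - 1))) >> total_shift
    return np.clip(rounded + output_zp, -128, 127).astype(np.int8)

# Example: conv accumulators rescaled to the output quantization.
acc = np.array([12345, -6789, 250000], dtype=np.int32)
print(requantize(acc, input_scale=0.05, weight_scale=0.02,
                 output_scale=0.1, output_zp=0))
# -> [ 123  -68  127]
```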

> And, maybe we can put the quantization parameters in the tensor, as the 
> scale and zero point describe the INT8 tensor data rather than the op. The 
> op is supposed to read these parameters and get things done.

Not sure about this. The good thing is that the conv2d Relay operator could be 
shared across FP32 and quantized tensor types. The bad thing is that the 
compute would then depend on the quantized tensor type, which might require 
new Relay optimizations and prevent us from fully reusing the existing 
infrastructure.
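
To make the trade-off concrete, here is a hypothetical sketch (these class 
names are illustrative, not the actual Relay API) of the two designs: carrying 
the scale/zero point as attributes on the op versus carrying them in the 
tensor type:

```python
from dataclasses import dataclass

# Design A: quantization parameters live on the op. The tensor type stays a
# plain shape + dtype, so conv2d and its passes are shared with FP32.
@dataclass
class TensorType:
    shape: tuple
    dtype: str  # "float32" or "int8"

@dataclass
class QuantizedConv2DAttrs:
    input_scale: float
    input_zero_point: int
    kernel_scale: float
    kernel_zero_point: int
    accumulator_dtype: str = "int32"

# Design B: quantization parameters live in the tensor type. The scale and
# zero point describe the data itself, but every pass that inspects types
# must now understand quantized tensors.
@dataclass
class QuantizedTensorType:
    shape: tuple
    dtype: str  # storage dtype, e.g. "int8"
    scale: float
    zero_point: int
```

With design A the existing FP32 infrastructure keeps working unchanged; with 
design B the parameters travel with the data, at the cost of teaching type 
inference and the optimizer about the new type.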
