I would suggest designing the infrastructure so that it supports both 
symmetric and asymmetric quantization. We can certainly start with symmetric to 
flush out the flow, while keeping in mind that the two should share as much 
infrastructure as possible.
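
To make the sharing concrete, here is a minimal sketch (plain Python, not TVM code) in which symmetric quantization is just the asymmetric case with the zero point fixed at 0. The names `QParams`, `choose_qparams`, `quantize` and `dequantize` are hypothetical and only for illustration; the int8 range is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QParams:
    """Hypothetical container shared by both quantization styles."""
    scale: float
    zero_point: int
    qmin: int
    qmax: int


def choose_qparams(rmin, rmax, symmetric, qmin=-128, qmax=127):
    """Pick scale/zero_point for the real range [rmin, rmax] (int8 assumed)."""
    # Make sure real 0.0 is inside the range so it can be represented exactly.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    if symmetric:
        # Symmetric: zero_point is always 0, range is centred on zero.
        bound = max(abs(rmin), abs(rmax))
        scale = bound / qmax if bound > 0 else 1.0
        return QParams(scale, 0, qmin, qmax)
    # Asymmetric: map [rmin, rmax] onto [qmin, qmax] with an offset.
    scale = (rmax - rmin) / (qmax - qmin) if rmax > rmin else 1.0
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return QParams(scale, zero_point, qmin, qmax)


def quantize(x, p):
    """Real -> integer; the same code path serves both styles."""
    # Note: Python's round() rounds ties to even; a real backend may differ.
    q = int(round(x / p.scale)) + p.zero_point
    return max(p.qmin, min(p.qmax, q))


def dequantize(q, p):
    """Integer -> real; the same code path serves both styles."""
    return (q - p.zero_point) * p.scale
```

With this shape, only the parameter selection differs between the two styles; everything downstream takes a scale and a zero point and does not care which style produced them.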

> * namespace for the tflite quantize style dialect

I think a namespace is required for both asymmetric and symmetric quantization. 
These ops will be rewritten to low-level instructions by a Relay pass. How about 
using `relay.op._quantization` as the namespace? The operations would then be, 
for example, `relay.op._quantization.conv2d` or `relay.op._quantization.quantize`.

> * List of ops that might need tvm's compute declaration

I am not sure yet. The only unknowns to me are the special rounding operations 
used to convert the floating-point multiplication into an integer 
multiplication when scaling the quantized conv matrix. However, they might 
already be covered by the current low-level ops.
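
For reference, the kind of rounding I have in mind looks roughly like the sketch below: the floating-point scale is decomposed into a 32-bit integer mantissa and a power-of-two exponent, and the multiplication is then done with an integer multiply, a rounding add and a shift. This is plain Python with hypothetical function names, and it is a simplification: it rounds once (half away from zero), whereas existing integer kernels may round in two steps and saturate.

```python
import math


def quantize_multiplier(real_multiplier):
    """Decompose a positive float into (mantissa, exponent) such that
    real_multiplier ~= mantissa * 2**(exponent - 31), mantissa in [2**30, 2**31).
    """
    assert real_multiplier > 0.0
    # frexp gives real_multiplier == frac * 2**exponent with 0.5 <= frac < 1.
    frac, exponent = math.frexp(real_multiplier)
    mantissa = int(round(frac * (1 << 31)))
    if mantissa == (1 << 31):
        # Rounding pushed the mantissa out of range; renormalise.
        mantissa //= 2
        exponent += 1
    return mantissa, exponent


def fixed_point_multiply(x, mantissa, exponent):
    """Compute round(x * real_multiplier) using only integer operations."""
    prod = x * mantissa            # would be a 64-bit product in hardware
    shift = 31 - exponent
    if shift <= 0:
        # Large multipliers: no fractional bits left to round.
        return prod << (-shift)
    rounding = 1 << (shift - 1)    # adds 0.5 before truncation
    if prod >= 0:
        return (prod + rounding) >> shift
    # Round half away from zero for negative values.
    return -((-prod + rounding) >> shift)
```

If the existing low-level ops already provide a widening multiply, add and right shift, this sequence would need no new compute declaration.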

> * set of possible passes that lower the rest into the core ops

I was hoping to re-use the FForwardRewrite infrastructure to lower the ops. Do 
you anticipate more passes here?

