I would suggest designing the infrastructure so that it supports both symmetric and asymmetric quantization. We can certainly start with symmetric to flush out the flow, while keeping in mind that the two schemes should share as much infrastructure as possible.
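To illustrate why the two schemes can share most of the infrastructure, here is a minimal sketch in plain Python. It treats symmetric quantization as the special case of asymmetric quantization with a zero point of 0. The names `QParams`, `symmetric_params`, `asymmetric_params`, `quantize`, and `dequantize` are hypothetical and not part of TVM or Relay; the value ranges are only assumptions for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QParams:
    """Hypothetical container: real_value = (q - zero_point) * scale."""
    scale: float
    zero_point: int
    qmin: int
    qmax: int


def symmetric_params(abs_max, qmin=-127, qmax=127):
    # Symmetric: the zero point is fixed at 0, only the scale is free.
    # Assumes abs_max > 0.
    return QParams(scale=abs_max / qmax, zero_point=0, qmin=qmin, qmax=qmax)


def asymmetric_params(rmin, rmax, qmin=0, qmax=255):
    # Asymmetric: scale and zero point are both derived from the real range.
    # The range is widened to include 0.0 so that zero is exactly representable.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)  # assumes rmax > rmin
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return QParams(scale=scale, zero_point=zero_point, qmin=qmin, qmax=qmax)


def quantize(x, p):
    # One code path serves both schemes; symmetric just has zero_point == 0.
    # Note: Python's round() rounds ties to even.
    q = int(round(x / p.scale)) + p.zero_point
    return max(p.qmin, min(p.qmax, q))


def dequantize(q, p):
    return (q - p.zero_point) * p.scale


sym = symmetric_params(abs_max=1.0)
asym = asymmetric_params(rmin=-1.0, rmax=3.0)
```

With this framing, only the parameter computation differs between the schemes; `quantize` and `dequantize` are shared.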
> * namespace for the tflite quantize style dialect

I think this is required for both asymmetric and symmetric quantization. These ops will be rewritten to low-level instructions by a Relay pass. How about using `relay.op._quantization` as the namespace? The operations would then be, for example, `relay.op._quantization.conv2d` or `relay.op._quantization.quantize`.

> * List of ops that might need tvm's compute declaration

I am not sure yet. The only unknowns to me are the special rounding operations used when converting the floating-point multiplication to an integer multiplication while scaling the quantized conv matrix. They might already be covered by the current low-level ops, though.

> * set of possible passes that lower the rest into the core ops

I was hoping to reuse the FForwardRewrite infrastructure to lower the ops. Do you anticipate more passes here?
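
For context on the rounding operations mentioned above, here is a rough sketch of one common way to replace a floating-point rescale with integer arithmetic: the real multiplier is encoded as a 31-bit fixed-point mantissa plus a power-of-two exponent, and the product is shifted back with rounding. This is only an illustration in plain Python, not TVM code; `quantize_multiplier` and `fixed_point_multiply` are hypothetical names, and the tie-breaking rule here (ties go toward +infinity) is an assumption that may differ from what TFLite-style kernels use.

```python
import math


def quantize_multiplier(real_multiplier):
    """Encode real_multiplier ~= q_mult * 2**(exponent - 31).

    Hypothetical helper; assumes real_multiplier > 0.
    """
    assert real_multiplier > 0
    mantissa, exponent = math.frexp(real_multiplier)  # mantissa in [0.5, 1)
    q_mult = int(round(mantissa * (1 << 31)))
    if q_mult == (1 << 31):  # mantissa rounded up to 1.0; renormalize
        q_mult //= 2
        exponent += 1
    return q_mult, exponent


def fixed_point_multiply(x, q_mult, exponent):
    """Integer-only approximation of round(x * real_multiplier)."""
    total_shift = 31 - exponent
    assert total_shift > 0  # sketch only covers multipliers well below 2**31
    # Python ints do not overflow; real hardware needs a 64-bit intermediate.
    prod = x * q_mult
    # Add half of the divisor before the arithmetic right shift:
    # round to nearest, with exact ties going toward +infinity.
    rounding = 1 << (total_shift - 1)
    return (prod + rounding) >> total_shift


q_mult, exponent = quantize_multiplier(0.0003)
scaled = fixed_point_multiply(100000, q_mult, exponent)
```

Whether such a step needs its own compute declaration depends on whether the existing low-level ops can express the widening multiply and the rounding shift.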