Sorry for the delayed reply in this discussion. Here are a few thoughts.
Let us pick a concise namespace for the quantization dialect. Two possible
candidates:
- ```relay.op.qnn```, e.g. ```relay.op.qnn.conv2d```
  - The qnn name is consistent with QNNPack
- ```relay.op.tflite```
  - The name directly identifies the dialect the ops come from.
In both cases, the ops form a dialect of relay, which means that by default we
do not want to introduce special implementations for them, but will instead
translate them into existing core ops. We need to have a special op_level for
these ops.
I still think we should minimize the number of operators, and directly
translate to lower-level ops if possible. This includes things like
```quantize/dequantize``` and ```qnn.concat```. Please discuss this
alternative and list pros and cons.
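
To make the "translate to lower ops" alternative concrete, here is a rough sketch of how ```quantize/dequantize``` could be expressed purely as a composition of elementwise primitives. It is plain Python, not relay code. All function names here (```divide```, ```round_half_even```, ```clip```, and so on) are hypothetical stand-ins for core ops, and the affine scheme with a ```scale``` and ```zero_point``` and a default uint8 range is an assumption made for illustration.

```python
# Hypothetical sketch: these helpers stand in for existing core ops.
# None of the names below are real TVM/relay APIs.

def divide(xs, s):
    return [x / s for x in xs]

def multiply(xs, s):
    return [x * s for x in xs]

def add(xs, c):
    return [x + c for x in xs]

def subtract(xs, c):
    return [x - c for x in xs]

def round_half_even(xs):
    # Python's built-in round() rounds halves to the nearest even integer.
    return [round(x) for x in xs]

def clip(xs, lo, hi):
    return [min(max(x, lo), hi) for x in xs]

def cast_int(xs):
    return [int(x) for x in xs]

def quantize(xs, scale, zero_point, qmin=0, qmax=255):
    # Assumed affine scheme: q = clip(round(x / scale) + zero_point, qmin, qmax)
    scaled = round_half_even(divide(xs, scale))
    return cast_int(clip(add(scaled, zero_point), qmin, qmax))

def dequantize(qs, scale, zero_point):
    # Inverse of the assumed scheme: x = (q - zero_point) * scale
    return multiply(subtract(qs, zero_point), scale)
```

In this form neither op needs a dedicated implementation, since each is only a chain of ops that would already exist in the core set. The cost is that the quantization intent is no longer visible as a single node after translation, which is one of the trade-offs worth listing.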