@FrozenGene Thanks for replying. I might be wrong, but I don't think it is good design to let one codegen backend like QNNPACK drive changes all the way into the Relay APIs just to make the connection. In my opinion, the APIs should be minimal.
But your point about using QNNPACK is completely valid. I have been thinking about this myself, dreading the painful experience of writing tensorized kernels for Intel x86 and hoping to somehow use OpenVINO/MKLDNN. Similarly, though, I don't think adding MKLDNN/OpenVINO arguments to the Relay API would be the right choice either.

One way to handle this is to separate the Relay operator API we are discussing from the infrastructure for using an external codegen like QNNPACK. I think it is entirely possible to write a Relay pass for each codegen backend that rewrites/fuses the Relay ops into a form that backend can understand. That way, we keep backend-specific idiosyncrasies out of the Relay op API, while still having a well-defined infrastructure that shows how to add external codegens.
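To make the separation concrete, here is a schematic Python sketch of the idea. This is not the actual TVM/Relay pass API; the op classes and the fused op name (`QnnpackFusedConv`) are hypothetical stand-ins. The point is only that a backend-specific pass can pattern-match generic ops and rewrite them into a fused form the external codegen understands, without the generic op definitions knowing anything about the backend.

```python
# Schematic sketch (NOT real TVM/Relay APIs): a backend-specific rewrite pass
# that fuses generic ops into a form an external codegen expects, keeping
# backend idiosyncrasies out of the core op set.
from dataclasses import dataclass

# Generic, backend-agnostic "ops" (stand-ins for Relay ops).
@dataclass
class Conv2d:
    data: object
    weight: object

@dataclass
class Requantize:
    input: object

# Backend-specific fused op, produced ONLY by the QNNPACK pass (hypothetical name).
@dataclass
class QnnpackFusedConv:
    data: object
    weight: object

def qnnpack_fuse_pass(expr):
    """Rewrite Requantize(Conv2d(...)) into the fused form QNNPACK would consume."""
    if isinstance(expr, Requantize) and isinstance(expr.input, Conv2d):
        conv = expr.input
        return QnnpackFusedConv(qnnpack_fuse_pass(conv.data),
                                qnnpack_fuse_pass(conv.weight))
    if isinstance(expr, Requantize):
        return Requantize(qnnpack_fuse_pass(expr.input))
    if isinstance(expr, Conv2d):
        return Conv2d(qnnpack_fuse_pass(expr.data),
                      qnnpack_fuse_pass(expr.weight))
    return expr  # leaf, e.g. an input tensor name

graph = Requantize(Conv2d("x", "w"))
rewritten = qnnpack_fuse_pass(graph)
print(type(rewritten).__name__)  # QnnpackFusedConv
```

A backend that has no such pass simply sees the original `Conv2d`/`Requantize` graph, so the Relay op API itself stays untouched.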