@tqchen Here is a usage scenario we have in mind for Relay-to-ONNX
serialization. Suppose a HW chip vendor has a model compiler toolchain that
already accepts ONNX as an input format. Because ONNX is fairly limited in
scope, only a small set of models can be supported on this HW out of the box.
Relay already has graph annotation and graph partitioning support, so our
intention is to create a pass that splits a model graph into an ONNX-compatible
sub-graph and a remaining part that is compiled with TVM. The HW vendor would
develop Codegen and Runtime support for their HW using the BYOCG methodology.
Once that is done, they could use Relay-to-ONNX serialization (this RFC) to
export the ONNX-compatible sub-graph to ONNX. Their compiler would convert the
ONNX sub-graph into something that executes on their HW, while TVM compiles the
rest of the graph. Since the vendor's BYOCG work also covers the Runtime, both
parts can run together on the target: the TVM-compiled part on the CPU and the
ONNX-compatible sub-graph on the HW accelerator. This way, we help HW vendors
expand the range of models they can support on their HW.
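
To make the split concrete, here is a toy sketch of the partitioning idea in plain Python. It is not the Relay annotation or partitioning API: the model is reduced to a linear list of op names, and `ONNX_SUPPORTED`, `Segment`, and `partition` are hypothetical names made up for illustration. A real pass would work on the Relay graph and handle branches, not just a chain.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List

# Assumption: the set of ops the vendor's ONNX toolchain accepts.
# These names are illustrative, not taken from any real vendor.
ONNX_SUPPORTED = {"conv2d", "relu", "add", "max_pool2d"}


@dataclass
class Segment:
    target: str      # "onnx" (vendor accelerator) or "tvm" (CPU fallback)
    ops: List[str]   # consecutive ops assigned to that target


def partition(ops: List[str]) -> List[Segment]:
    """Group consecutive ops by whether the vendor toolchain supports them."""
    segments = []
    for supported, run in groupby(ops, key=lambda op: op in ONNX_SUPPORTED):
        segments.append(Segment("onnx" if supported else "tvm", list(run)))
    return segments


# A made-up model: "topk" stands in for an op the vendor cannot handle.
model = ["conv2d", "relu", "max_pool2d", "topk", "conv2d", "add"]

for seg in partition(model):
    print(seg.target, seg.ops)
```

In this sketch, each `onnx` segment is what would be exported through Relay-to-ONNX serialization and handed to the vendor compiler, while each `tvm` segment stays with the regular TVM build.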

---
[Visit Topic](https://discuss.tvm.ai/t/rfc-relay-to-onnx/6101/6) to respond.
