@masahi I see, thanks for sharing. I also considered an approach like this.
It seems like a lot of additional overhead to create and maintain a whole new
IR just for serialization. Since many external codegens (not just TRT) will
need to do something like this, I won
Hi @masahi, that is correct. I am using [TVM's native json serialization
API](https://github.com/neo-ai/tvm/blob/dev/include/tvm/node/serialization.h#L39-L48)
to serialize the relay subgraphs during codegen and deserialize it in the
runtime for my TRT integration.
I am curious what binary f
Hi @jonso, when I do relay.build with target="cuda", the data inputs supplied
to my runtime module have already been placed on the GPU by the graph runtime.
The DLTensor->data will be a device pointer to the data in GPU memory, and you
can pass it directly to CUDA libraries.
If you need to get the