Hi @jonso, when I do relay.build with target="cuda", the data inputs supplied to my runtime module are already placed on the GPU by the graph runtime. The DLTensor->data field will be a device pointer to the data in GPU memory, and you can pass it directly to CUDA libraries.
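As a rough sketch of what that looks like in an external runtime function (assuming the CUDA toolkit and the dlpack header are available; `consume` is a hypothetical name, and the cuBLAS/cuDNN call is elided):

```cpp
#include <cuda_runtime.h>
#include <dlpack/dlpack.h>
#include <cstdint>
#include <vector>

void consume(DLTensor* t) {
  // Under target="cuda", t->data is already a device pointer.
  float* dev_ptr = static_cast<float*>(t->data);

  // dev_ptr can be handed directly to CUDA libraries
  // (e.g. a cuBLAS or cuDNN call) -- no copy needed.

  // If host access is needed, copy back explicitly:
  int64_t n = 1;
  for (int i = 0; i < t->ndim; ++i) n *= t->shape[i];
  std::vector<float> host(n);
  cudaMemcpy(host.data(), dev_ptr, n * sizeof(float),
             cudaMemcpyDeviceToHost);
}
```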
If you need to get the data back onto the CPU, you can use cudaMemcpy to move it as part of your generated code. But it sounds like you want everything to stay on the GPU.

---

[Visit Topic](https://discuss.tvm.ai/t/external-codegen-with-cuda-target/6159/6) to respond.