Hi @jonso, when I do relay.build with target="cuda", the data inputs supplied 
to my runtime module are already placed on the GPU by the graph runtime. The 
DLTensor->data field will be a device pointer to the data in GPU memory, and 
you can pass it directly to CUDA libraries.

If you need to get the data back onto the CPU, you could use cudaMemcpy to move 
the data as part of your generated code. But it sounds like you want everything 
to be on the GPU.

---
[Visit Topic](https://discuss.tvm.ai/t/external-codegen-with-cuda-target/6159/6) to respond.
