Sorry about that, I think I misspoke. I already have the annotation pass set up properly and my codegen is being called. However, when I try to print out one of my inputs from my codegen, the program crashes.
I have a feeling that since the target is “cuda”, the data isn’t being moved from GPU back to CPU. Is there a way to verify this flow? Do you have an example with external codegen on GPU? When the target is llvm it works properly.