I would still recommend using the PackedFunc interface, since quite a few 
things are needed to use the raw kernel directly. For example, both the 
launch parameter calculation and the data unpacking are part of the host 
code.
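
To make the two host-side steps concrete, here is a minimal sketch in plain Python. It does not use TVM, and every name in it (`LaunchConfig`, `compute_launch_config`, `unpack_args`) is hypothetical. The argument layout it produces is only an illustration; the real order is decided by TVM's code generator and is not modelled here.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class LaunchConfig:
    grid_dim_x: int   # number of thread blocks
    block_dim_x: int  # threads per block


def compute_launch_config(n: int, threads_per_block: int = 256) -> LaunchConfig:
    """Hypothetical stand-in for the launch parameter calculation in host code."""
    if n <= 0 or threads_per_block <= 0:
        raise ValueError("n and threads_per_block must be positive")
    # Ceiling division so that every one of the n elements is covered.
    blocks = (n + threads_per_block - 1) // threads_per_block
    return LaunchConfig(blocks, threads_per_block)


def unpack_args(packed_args: List[Dict[str, Any]]) -> List[Any]:
    """Hypothetical stand-in for data unpacking.

    Each packed argument is a dict with a "data" handle and a "shape" tuple.
    The result is a flat list: all data handles first, then all shape extents.
    This layout is an assumption made for illustration only.
    """
    data_handles = [arg["data"] for arg in packed_args]
    extents = [dim for arg in packed_args for dim in arg["shape"]]
    return data_handles + extents


# Example: two 1-D tensors of 1000 elements, with made-up pointer values.
packed = [
    {"data": 0x1000, "shape": (1000,)},
    {"data": 0x2000, "shape": (1000,)},
]
config = compute_launch_config(1000)
raw_args = unpack_args(packed)
```

With the PackedFunc interface, the generated host code does the equivalent of both steps for you; calling the raw kernel means reproducing them by hand.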

Depending on the schedule, we could also generate a function that contains 
multiple CUDA kernel launches.
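
As a rough illustration of that point, the sketch below models one host function that issues more than one launch. It is plain Python with no GPU or TVM involved: launches are recorded as tuples instead of being executed, and the stage names (`reduce_stage0`, `reduce_stage1`) and the helper `make_host_function` are invented for the example.

```python
from typing import Callable, List, Tuple

# (kernel name, grid size, block size), recorded instead of a real GPU launch.
Launch = Tuple[str, int, int]


def make_host_function(
    stages: List[Tuple[str, int]], threads_per_block: int = 256
) -> Callable[[], List[Launch]]:
    """Build a hypothetical host function that launches one kernel per stage."""

    def host_function() -> List[Launch]:
        launches: List[Launch] = []
        for kernel_name, extent in stages:
            # Each stage gets its own launch parameters, computed on the host.
            blocks = (extent + threads_per_block - 1) // threads_per_block
            launches.append((kernel_name, blocks, threads_per_block))
        return launches

    return host_function


# A made-up two-stage reduction: one caller-visible function, two launches.
run = make_host_function([("reduce_stage0", 4096), ("reduce_stage1", 16)])
recorded = run()
```

A caller that bypasses the host function would need to know how many kernels exist, their order, and the launch parameters for each one.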

If the primary concern is the C++ ABI, the TVM runtime also provides a C 
interface with a stable ABI; see 
https://github.com/apache/incubator-tvm/blob/main/include/tvm/runtime/c_runtime_api.h





