Re: [I] [Proposal] Refactor cubin launcher [tvm-ffi]

via GitHub Sun, 30 Nov 2025 19:08:54 -0800


yaoyaoding commented on issue #292:
URL: https://github.com/apache/tvm-ffi/issues/292#issuecomment-3594338242


   Thanks @oraluben for the proposal and for your willingness to contribute!
   
   **Runtime API vs Driver API**
   I also think it's good to switch from the runtime API to the driver API to remove the dependency on the CUDA runtime. I suggest using the modern [library management APIs](https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__LIBRARY.html#group__CUDA__LIBRARY) rather than the module management APIs, since they decouple the kernel from the CUDA context and make the multi-GPU case easier to support. [This version](https://github.com/apache/tvm-ffi/pull/283/files/b12edd1c035b8f32d7750258233e7db15f59cf6b#diff-328af6a8e19e7712dd600bd593241491af8542003456c96744958f8e0114203a) used the library API and can serve as a reference.
   
   **Embed cubin**
   Currently, we use link-stage tools to embed the cubin into an object file. @oraluben suggests using `bin2c` in the proposal. For context, the CUDA toolkit provides `bin2c`, which translates a cubin into a C source file that holds the cubin as a byte buffer.
   ```cuda
   // a.cu content
   extern "C" __global__ void AddOneKernel(float* x) {
       int idx = blockIdx.x * blockDim.x + threadIdx.x;
       x[idx] += 1;
   }
   extern "C" __global__ void AddOneKernel_AA(float* x) {
       int idx = blockIdx.x * blockDim.x + threadIdx.x;
       x[idx] += 1;
   }
   ```
   ```sh
   $ nvcc -cubin a.cu -o a.cubin
   $ bin2c a.cubin --name my_kernel
   ```
   ```c
   // output of bin2c
   #ifdef __cplusplus
   extern "C" {
   #endif
   
   unsigned char my_kernel[] = {
   0x7f,0x45,0x4c,0x46, ...
   };
   
   #ifdef __cplusplus
   }
   #endif
   ```
   I was not aware of this tool, and I now prefer `bin2c` over the current method (using link-stage tools to perform the embedding).
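   
   To make the shape of that generated source concrete, here is a minimal C++ sketch of the same byte-to-array formatting. `BytesToCArray` is a hypothetical helper written for illustration only: it is not `bin2c` itself, it is not part of tvm-ffi, and it omits the `extern "C"` guards.
   ```c++
   #include <cassert>
   #include <cstddef>
   #include <cstdio>
   #include <string>
   #include <vector>
   
   // Hypothetical helper (not part of the CUDA toolkit or tvm-ffi): renders a
   // byte buffer as a C array definition, similar in shape to the bin2c output.
   std::string BytesToCArray(const std::vector<unsigned char>& bytes,
                             const std::string& name) {
     std::string out = "unsigned char " + name + "[] = {\n";
     char hex[8];
     for (std::size_t i = 0; i < bytes.size(); ++i) {
       // Format each byte as a two-digit lowercase hex literal, e.g. 0x7f.
       std::snprintf(hex, sizeof(hex), "0x%02x", static_cast<unsigned>(bytes[i]));
       out += hex;
       if (i + 1 != bytes.size()) out += ",";
     }
     out += "\n};\n";
     return out;
   }
   ```
   Calling `BytesToCArray(bytes, "my_kernel")` on the contents of a cubin file would give an array definition that a C or C++ compiler can turn into an ordinary object file, which is what makes this approach independent of link-stage embedding tools.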
   
   **Interface**
   We might need more discussion on the design of the interface.
   
   This is the current definition of `TVM_FFI_EMBED_CUBIN`
   ```c++
   #define TVM_FFI_EMBED_CUBIN(name)                        \
     extern "C" const char __tvm_ffi__cubin_##name[];       \
     extern "C" const char __tvm_ffi__cubin_##name##_end[]; \
     namespace {                                            \
     struct EmbedCubinModule_##name {                       \
       tvm::ffi::CubinModule mod{__tvm_ffi__cubin_##name};  \
       static EmbedCubinModule_##name* Global() {           \
         static EmbedCubinModule_##name inst;               \
         return &inst;                                      \
       }                                                    \
     };                                                     \
     } /* anonymous namespace */
   ```
   
   We can change it to something like
   ```c++
   #define TVM_FFI_EMBED_CUBIN(name)                           \
     extern "C" const unsigned char __tvm_ffi__cubin_##name[]; \
     namespace {                                               \
     struct EmbedCubinModule_##name {                          \
       tvm::ffi::CubinModule mod{__tvm_ffi__cubin_##name};     \
       static EmbedCubinModule_##name* Global() {              \
         static EmbedCubinModule_##name inst;                  \
         return &inst;                                         \
       }                                                       \
     };                                                        \
     } /* anonymous namespace */
   ```
   (this removes the `_end` symbol and declares the buffer as `unsigned char` to match the element type of the `bin2c` output).
   The workflow looks like:
   - use `bin2c` to generate `cubin.c` with the name `__tvm_ffi__cubin_##name` from a cubin binary
   - compile `cubin.c` to `cubin.o`
   - link the user's object with `cubin.o`
   - localize the cubin symbol
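   
   A minimal sketch of how the revised macro could pick up the `bin2c`-generated symbol, runnable without a GPU. Everything under `sketch::` and the name `SKETCH_EMBED_CUBIN` are hypothetical: the stub module only records the pointer, whereas the real `tvm::ffi::CubinModule` loads the image through CUDA and its constructor may take different arguments. The array is defined in the same file as a stand-in for what `cubin.o` would provide at link time, and its four bytes are just the ELF magic number, not a real cubin.
   ```c++
   #include <cassert>
   
   // Stand-in for the definition that would normally come from cubin.c.
   extern "C" const unsigned char __tvm_ffi__cubin_my_kernel[] = {0x7f, 0x45, 0x4c, 0x46};
   
   namespace sketch {
   // Hypothetical stub: keeps the image pointer instead of loading it via CUDA.
   struct CubinModule {
     const unsigned char* image;
     explicit CubinModule(const unsigned char* data) : image(data) {}
   };
   }  // namespace sketch
   
   // Same structure as the proposed macro, with the stub module swapped in.
   #define SKETCH_EMBED_CUBIN(name)                            \
     extern "C" const unsigned char __tvm_ffi__cubin_##name[]; \
     namespace {                                               \
     struct EmbedCubinModule_##name {                          \
       sketch::CubinModule mod{__tvm_ffi__cubin_##name};       \
       static EmbedCubinModule_##name* Global() {              \
         static EmbedCubinModule_##name inst;                  \
         return &inst;                                         \
       }                                                       \
     };                                                        \
     } /* anonymous namespace */
   
   SKETCH_EMBED_CUBIN(my_kernel)
   ```
   With this in place, `EmbedCubinModule_my_kernel::Global()->mod.image` points at the embedded bytes, and the module is constructed once on first use. One detail to settle: the macro declares the buffer `const`, while the `bin2c` output shown earlier defines a non-`const` array, so the declaration and the generated definition would need to be made to agree.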
   
   We have defined some Python/CMake utilities ([embed_cubin.py](https://github.com/apache/tvm-ffi/blob/main/python/tvm_ffi/utils/embed_cubin.py) and [EmbedCubin.cmake](https://github.com/apache/tvm-ffi/blob/main/cmake/Utils/EmbedCubin.cmake)) that might need to be updated.
   
   
   @oraluben @tqchen happy to hear your opinions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

