tqchen commented on code in PR #18519:
URL: https://github.com/apache/tvm/pull/18519#discussion_r2592722729
##########
python/tvm/contrib/nvcc.py:
##########
@@ -315,9 +541,39 @@ def find_nvshmem_paths() -> Tuple[str, str]:
 @tvm_ffi.register_global_func
 def tvm_callback_cuda_compile(code, target):  # pylint: disable=unused-argument
-    """use nvcc to generate fatbin code for better optimization"""
-    ptx = compile_cuda(code, target_format="fatbin")
-    return ptx
+    """
+    Compile CUDA code using the configured backend (nvcc or nvrtc).
+
+    This callback is invoked by TVM's C++ backend during CUDA module compilation.
+    By default, uses nvcc to generate fatbin.
+
+    Environment Variables
+    ---------------------
+    TVM_CUDA_COMPILE_MODE : str
+        Compiler backend: "nvcc" (default) or "nvrtc"

Review Comment:
   I think we should cross-check the speed difference between the two backends; once that is confirmed, we can switch the default to nvrtc.
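One way the suggested cross-check could be approached is sketched below. This is a minimal, hypothetical harness, not code from the PR: `get_compile_mode`, `time_backend`, and `compare_backends` are illustrative names, lower-casing the variable's value is an assumption, and the compile callables are stand-ins supplied by the caller. Only the `TVM_CUDA_COMPILE_MODE` variable and its `"nvcc"` default come from the docstring in the diff.

```python
import os
import time
from statistics import median

# The two values named in the docstring under review.
VALID_MODES = ("nvcc", "nvrtc")


def get_compile_mode(environ=os.environ):
    """Read TVM_CUDA_COMPILE_MODE, defaulting to "nvcc" as the docstring states.

    Lower-casing and stripping the value is an assumption of this sketch.
    """
    mode = environ.get("TVM_CUDA_COMPILE_MODE", "nvcc").strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(
            f"TVM_CUDA_COMPILE_MODE must be one of {VALID_MODES}, got {mode!r}"
        )
    return mode


def time_backend(compile_fn, code, repeats=5):
    """Return the median wall-clock seconds of compile_fn(code) over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        compile_fn(code)
        samples.append(time.perf_counter() - start)
    return median(samples)


def compare_backends(backends, code, repeats=5):
    """Time each named compile callable on the same source string."""
    return {name: time_backend(fn, code, repeats) for name, fn in backends.items()}


if __name__ == "__main__":
    # Stand-in callables; real ones would wrap the nvcc and nvrtc compile paths.
    fake_backends = {
        "nvcc": lambda code: time.sleep(0.002),
        "nvrtc": lambda code: time.sleep(0.001),
    }
    for name, seconds in compare_backends(fake_backends, "__global__ void k() {}").items():
        print(f"{name}: {seconds * 1e3:.2f} ms (median)")
```

The median is used rather than the mean so that a single slow first run (for example, a cold disk cache) does not dominate the comparison. Compile time is only one side of "speed"; the runtime of the generated kernels would need a separate measurement.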
