tra added a comment.
> HIP generates one fat binary for all devices after linking. However, for each
> compilation unit a ctor function is emitted which registers the same fat
> binary. Measures need to be taken to make sure the fat binary is only
> registered once.
Are you saying that for HIP there's only one fatbin file with GPU code for the
complete host executable, even if it consists of multiple HIP TUs?
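
To make sure I understand the "register once" requirement, here is a minimal sketch of the guard I have in mind. It is not the code in this patch: `fakeRegisterFatBinary`, `g_gpu_binary_handle`, `g_fatbin_wrapper` and `moduleCtor` are hypothetical stand-ins, and I am assuming that all TUs can share a single handle variable (which would presumably need linkonce/weak linkage in the real IR).

```cpp
#include <cassert>

// Hypothetical stand-in for the fat binary wrapper; not the real layout.
struct FatbinWrapper { const void *data; };

static int g_register_calls = 0;  // counts how often the stub is called
static int g_dummy_target = 0;    // something for the fake handle to point at

// Stub standing in for the runtime's fat binary registration entry point.
// Assumption: it returns an opaque, non-null handle on success.
void **fakeRegisterFatBinary(const FatbinWrapper *) {
  ++g_register_calls;
  static void *handle = &g_dummy_target;
  return &handle;
}

// One handle shared by every TU in the executable (assumed to be merged
// across TUs by the linker in a real build).
void **g_gpu_binary_handle = nullptr;
FatbinWrapper g_fatbin_wrapper = {nullptr};

// What each TU's module ctor would do under this scheme.
void moduleCtor() {
  // Register only if no earlier ctor has done so already.
  if (g_gpu_binary_handle == nullptr)
    g_gpu_binary_handle = fakeRegisterFatBinary(&g_fatbin_wrapper);
  // Per-TU kernel and variable registration would follow here.
}

// Simulates the ctors of three TUs running at program start and
// returns how many registrations actually happened.
int registerFromThreeTUs() {
  moduleCtor();
  moduleCtor();
  moduleCtor();
  return g_register_calls;
}
```

In this sketch the ctors run sequentially, as static initializers do at startup, so the null check needs no synchronization.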
================
Comment at: lib/CodeGen/CGCUDANV.cpp:449
+ CtorBuilder.SetInsertPoint(IfBlock);
+ // GpuBinaryHandle = __{cuda|hip}RegisterFatBinary(&FatbinWrapper);
+ llvm::CallInst *RegisterFatbinCall = CtorBuilder.CreateCall(
----------------
Given that this is HIP-only code, there will be no `cuda` here, so the comment should mention only the `hip` variant.
https://reviews.llvm.org/D49083