tra added a comment.

I didn't get a chance to review the patch before it got committed.



================
Comment at: lib/Basic/Targets.cpp:161
+    case CudaArch::GFX902:
+      return "320";
+    case CudaArch::UNKNOWN:
----------------
Unless you're planning to guarantee 1:1 match to functionality provided by 
nvidia's sm_32, it would be prudent to use some other value for the macro so 
the source code has a way to tell these GPUs apart.

Another issue with this approach is that typical use pattern for __CUDA_ARCH__ 
is 
`#if __CUDA_ARCH__ >= XXX`. I don't expect that we'll always be able to 
maintain order across GPU architectures among NVIDIA and AMD GPUs. Perhaps for 
HIP compilation it would make more sense to define __CUDA_ARCH__ as 1 (this 
should serve as a legacy indication of device-side compilation) and define 
__HIP_ARCH__ to indicate which AMD GPU we're compiling for without accidentally 
enabling something that was intended for NVIDIA's GPUs only.


Repository:
  rC Clang

https://reviews.llvm.org/D45277



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to