tra added inline comments.

================
Comment at: lib/Basic/Targets.cpp:161
+    case CudaArch::GFX902:
+      return "320";
+    case CudaArch::UNKNOWN:
----------------
yaxunl wrote:
> tra wrote:
> > Unless you're planning to guarantee 1:1 match to functionality provided by 
> > nvidia's sm_32, it would be prudent to use some other value for the macro 
> > so the source code has a way to tell these GPUs apart.
> > 
> > Another issue with this approach is that typical use pattern for 
> > __CUDA_ARCH__ is 
> > `#if __CUDA_ARCH__ >= XXX`. I don't expect that we'll always be able to 
> > maintain order across GPU architectures among NVIDIA and AMD GPUs. Perhaps 
> > for HIP compilation it would make more sense to define __CUDA_ARCH__ as 1 
> > (this should serve as a legacy indication of device-side compilation) and 
> > define __HIP_ARCH__ to indicate which AMD GPU we're compiling for without 
> > accidentally enabling something that was intended for NVIDIA's GPUs only.
> I think let `__CUDA_ARCH__`==1 for amdgcn is reasonable and I can make that 
> change.
> 
> On the other hand, I think it may be difficult to define `__HIP_ARCH__` which 
> can sort mixed nvptx/amdgcn GPU's by capability. I do think a well defined 
> `__HIP_ARCH__` would be useful for users. Just need some further discussion 
> how to define it.
> 
> For now, if there are specific codes for nvptx, it can continue use 
> `__CUDA_ARCH__`. If there are specific codes for amdgcn, it can check 
> predefined amdgpu canonical names, e.g. `__gfx803__`, etc. 
OK.



Repository:
  rC Clang

https://reviews.llvm.org/D45277



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to