tra added inline comments.
================
Comment at: lib/Basic/Targets.cpp:161
+  case CudaArch::GFX902:
+    return "320";
+  case CudaArch::UNKNOWN:
----------------
yaxunl wrote:
> tra wrote:
> > Unless you're planning to guarantee 1:1 match to functionality provided by
> > nvidia's sm_32, it would be prudent to use some other value for the macro
> > so the source code has a way to tell these GPUs apart.
> >
> > Another issue with this approach is that typical use pattern for
> > __CUDA_ARCH__ is
> > `#if __CUDA_ARCH__ >= XXX`. I don't expect that we'll always be able to
> > maintain order across GPU architectures among NVIDIA and AMD GPUs. Perhaps
> > for HIP compilation it would make more sense to define __CUDA_ARCH__ as 1
> > (this should serve as a legacy indication of device-side compilation) and
> > define __HIP_ARCH__ to indicate which AMD GPU we're compiling for without
> > accidentally enabling something that was intended for NVIDIA's GPUs only.
> I think let `__CUDA_ARCH__`==1 for amdgcn is reasonable and I can make that
> change.
>
> On the other hand, I think it may be difficult to define `__HIP_ARCH__` which
> can sort mixed nvptx/amdgcn GPU's by capability. I do think a well defined
> `__HIP_ARCH__` would be useful for users. Just need some further discussion
> how to define it.
>
> For now, if there are specific codes for nvptx, it can continue use
> `__CUDA_ARCH__`. If there are specific codes for amdgcn, it can check
> predefined amdgpu canonical names, e.g. `__gfx803__`, etc.

OK.


Repository:
  rC Clang

https://reviews.llvm.org/D45277



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
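
For illustration only, here is a rough sketch of how source code might tell the targets apart if the scheme discussed above were adopted. It assumes `__CUDA_ARCH__` is defined as 1 when compiling for amdgcn (a proposal in this thread, not confirmed behavior) and that `__gfx803__` is predefined for that GPU, as yaxunl suggests. The helper name `selected_path` is hypothetical, and the `__device__` attributes a real kernel would need are left out so the snippet builds with a plain host compiler.

```cpp
#include <string>

// Hypothetical helper: reports which branch the preprocessor selected.
// Assumption: on amdgcn, __CUDA_ARCH__ would be 1, so a check written for
// NVIDIA GPUs such as ">= 320" is not enabled there by accident.
inline std::string selected_path() {
#if defined(__gfx803__)
  // AMD-specific code keys off the canonical amdgpu name.
  return "amdgcn gfx803";
#elif defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 320
  // Typical NVIDIA-only pattern: an ordered comparison on the arch value.
  return "nvptx sm_32 or newer";
#elif defined(__CUDA_ARCH__)
  // Device-side compilation with no arch-specific branch taken.
  return "device (generic)";
#else
  // Neither macro is defined: host-side compilation.
  return "host";
#endif
}
```

With `return "320"` as in the patch under review, a GFX902 build would instead pass the `>= 320` check and take the NVIDIA branch, which is the concern raised above.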