[clang] [Clang] Fix 'gpuintrin.h' match when included with no arch set (PR #129927)

Joseph Huber via cfe-commits Wed, 05 Mar 2025 16:37:58 -0800

================
@@ -179,8 +179,10 @@ __gpu_shuffle_idx_u64(uint64_t __lane_mask, uint32_t __idx, uint64_t __x,
 _DEFAULT_FN_ATTRS static __inline__ uint64_t
 __gpu_match_any_u32(uint64_t __lane_mask, uint32_t __x) {
   // Newer targets can use the dedicated CUDA support.
-  if (__CUDA_ARCH__ >= 700 || __nvvm_reflect("__CUDA_ARCH") >= 700)
+#if __CUDA_ARCH__ >= 700
+  if (__nvvm_reflect("__CUDA_ARCH") >= 700)
----------------
jhuber6 wrote:


Probably just because I forgot to, but it was a no-op, so I didn't notice.

And yeah, I wish there was a way to let `nvvm_reflect` call these without 
incurring the wrath of the compiler. You can spoof some target attributes, but 
that's always broken when inlined.
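
For anyone reading along, here is a rough host-side sketch of why the hunk above moves the check into `#if`, as I read the diff. An identifier that is not defined as a macro evaluates to 0 inside `#if`, whereas using it in a plain `if` expression fails to compile. `DEMO_ARCH` and `demo_reflect_arch` are hypothetical stand-ins for `__CUDA_ARCH__` and `__nvvm_reflect("__CUDA_ARCH")`; this is not the header's code.

```c
#include <assert.h>
#include <stdint.h>

/* HYPOTHETICAL stand-in for __nvvm_reflect("__CUDA_ARCH"). The real builtin
   is only available when compiling for NVPTX, so a plain function is used. */
static uint32_t demo_reflect_arch(void) { return 0; }

/* DEMO_ARCH is a HYPOTHETICAL stand-in for __CUDA_ARCH__. It is deliberately
   left undefined unless the build passes -DDEMO_ARCH=<value>. */
static int demo_uses_fast_path(void) {
#if DEMO_ARCH >= 700 /* an undefined identifier evaluates to 0 inside #if */
  if (demo_reflect_arch() >= 700)
    return 1;
#endif
  /* Writing `if (DEMO_ARCH >= 700 || ...)` here instead would not compile
     when DEMO_ARCH is undefined, since it would be an undeclared name. */
  return 0;
}

/* With no arch macro set, the guarded block is compiled out entirely. */
static void demo_check(void) { assert(demo_uses_fast_path() == 0); }
```

Building this with no `-DDEMO_ARCH` compiles cleanly and takes the fallback path, which is the "no arch set" case from the PR title.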

https://github.com/llvm/llvm-project/pull/129927
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
