[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

Matt Arsenault via cfe-commits Mon, 04 Nov 2024 11:15:35 -0800

================
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, 
IntrinsicInst &II) const {
     }
     break;
   }
+  case Intrinsic::amdgcn_wavefrontsize: {
+    // TODO: this is a workaround for the pseudo-generic target one gets with 
no
+    // specified mcpu, which spoofs its wave size to 64; it should be removed.
----------------
arsenm wrote:


We already do some light 64->32 folds, that are only sort of correct.

Technically we could make exec_hi an allocatable scratch register in wave32, 
but what we do now bakes in an assumption that exec_hi must always be 0.

But yes, the only way to really avoid any possible edge cases (and support a 
future of machine linked libraries) requires just having totally separate 
builds 

https://github.com/llvm/llvm-project/pull/114481
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

Reply via email to