================
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
     }
     break;
   }
+  case Intrinsic::amdgcn_wavefrontsize: {
+    // TODO: this is a workaround for the pseudo-generic target one gets with no
+    // specified mcpu, which spoofs its wave size to 64; it should be removed.
----------------
arsenm wrote:

We already do some light 64->32 folds that are only sort of correct. Technically we could make exec_hi an allocatable scratch register in wave32, but what we do now bakes in an assumption that exec_hi must always be 0. But yes, the only way to really avoid any possible edge cases (and support a future of machine-linked libraries) is to have totally separate builds.

https://github.com/llvm/llvm-project/pull/114481
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits