================ @@ -2496,6 +2496,26 @@ def int_amdgcn_flat_atomic_fmax_num : AMDGPUAtomicRtn<llvm_anyfloat_ty>; def int_amdgcn_global_atomic_fmin_num : AMDGPUAtomicRtn<llvm_anyfloat_ty>; def int_amdgcn_global_atomic_fmax_num : AMDGPUAtomicRtn<llvm_anyfloat_ty>; +class AMDGPUGlobalLoadTr<LLVMType data_ty> : + Intrinsic< + [data_ty], + [global_ptr_ty], + [IntrReadMem, IntrWillReturn, IntrConvergent, NoCapture<ArgIndex<0>>, IntrNoCallback, IntrNoFree], + "", + [SDNPMemOperand] + >; + +// Wave32 +// <2 x i32> @llvm.amdgcn.global.load.tr.v2i32(ptr addrspace(1)) -> global_load_tr_b64 +// <8 x i16> @llvm.amdgcn.global.load.tr.v8i16(ptr addrspace(1)) -> global_load_tr_b128 +// <8 x half> @llvm.amdgcn.global.load.tr.v8f16(ptr addrspace(1)) -> global_load_tr_b128 ---------------- piotrAMD wrote:
Here is just a comment to say how the intrinsic could be used, but the right patterns need to be added. I added an extra one for bf16 in the most recent update. https://github.com/llvm/llvm-project/pull/77772 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits