krzysz00 created this revision.
krzysz00 added reviewers: nhaehnle, arsenm, b-sumner, piotr, sstefan1, jdoerfert.
Herald added subscribers: kosarev, jeroen.dobbelaere, foad, wenlei, okura, kuter, kerbowa, arphaman, zzheng, hiraditya, arichardson, tpr, dstuttard, yaxunl, jvesely, kzhuravl, MatzeB.
Herald added a project: All.
krzysz00 requested review of this revision.
Herald added subscribers: llvm-commits, cfe-commits, pcwang-thead, wdng.
Herald added projects: clang, LLVM.

Per discussion at https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798, we define two new address spaces for AMDGCN targets.

The first is address space 7, a non-integral address space (which was already in the data layout) that has 160-bit pointers (which are 256-bit aligned) and uses a 32-bit offset. These pointers combine a 128-bit buffer descriptor and a 32-bit offset, and will be usable with normal LLVM operations (load, store, GEP). However, they will be rewritten out of existence before code generation.

The second is address space 8, the address space for "buffer resources". These will be used to represent the resource arguments to buffer instructions, and the intrinsics will, in the future, be changed from taking <4 x i32> as the resource argument to taking a ptr addrspace(8). These pointers are 128 bits long and 128-bit aligned. However, they must not be used as arguments to getelementptr or otherwise used in address computations, since they can have arbitrarily complex inherent addressing semantics that can't be represented in LLVM. They are nevertheless integral, since inttoptr and ptrtoint behave deterministically and reasonably on them. Marking them integral runs the risk of GEPs being optimized into incorrect pointer arithmetic, but address space 8 pointers / buffer resources must not appear in a GEP anyway, so this is acceptable.

Future work includes:
- Upgrading (including auto-upgrading) buffer intrinsics from <4 x i32> to ptr addrspace(8).
- A late rewrite to turn address space 7 operations into buffer intrinsics and offset computations.

This commit also updates the "fallback address space" for buffer intrinsics to the buffer resource, and updates the alias analysis table.
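
To make the relationship between the two address spaces concrete, here is a rough host-side C++ model of the description above. It is only a sketch: the type and function names (ToyBufferResource, ToyFatPointer, gepBytes, lowerAccess) are hypothetical and are not part of this patch or of LLVM, and the wrap-around behaviour of the offset is an assumption made for illustration. The model treats the resource as 128 opaque bits with no arithmetic, and treats a GEP on an address space 7 pointer as a change to the 32-bit offset only.

```cpp
#include <array>
#include <cstdint>

// Toy stand-in for a buffer resource (ptr addrspace(8)): 128 opaque bits.
// No arithmetic is defined on it, mirroring the rule that such values
// must not be used as getelementptr arguments.
struct alignas(16) ToyBufferResource {
  std::array<std::uint32_t, 4> words;
};

// Toy stand-in for an address space 7 pointer: a 128-bit resource plus a
// 32-bit offset (160 bits of payload), stored with 256-bit alignment.
struct alignas(32) ToyFatPointer {
  ToyBufferResource rsrc;
  std::uint32_t offset;
};

static_assert(sizeof(ToyBufferResource) * 8 == 128, "resource is 128 bits");
static_assert(alignof(ToyFatPointer) * 8 == 256, "fat pointer is 256-bit aligned");

// Models a byte-wise GEP: only the 32-bit offset changes, and the resource
// part is carried along untouched. Wrapping modulo 2^32 is an assumption
// of this sketch, not a statement about the real lowering.
inline ToyFatPointer gepBytes(ToyFatPointer p, std::int32_t delta) {
  p.offset += static_cast<std::uint32_t>(delta);
  return p;
}

// The two pieces a late rewrite would pass on to a buffer intrinsic:
// the resource operand and the computed offset.
struct ToyLoweredAccess {
  ToyBufferResource rsrc;
  std::uint32_t offset;
};

inline ToyLoweredAccess lowerAccess(const ToyFatPointer &p) {
  return ToyLoweredAccess{p.rsrc, p.offset};
}
```

In this model, advancing a pointer by 16 bytes and then lowering it yields the original resource words and an offset that is 16 larger, which is the shape of the "buffer intrinsics and offset computations" rewrite listed under future work.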

Depends on D143437 <https://reviews.llvm.org/D143437>

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D145441

Files:
  clang/lib/Basic/Targets/AMDGPU.cpp
  clang/test/CodeGen/target-data.c
  clang/test/CodeGenOpenCL/amdgpu-env-amdgcn.cl
  llvm/docs/AMDGPUUsage.rst
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/Target/AMDGPU/AMDGPU.h
  llvm/lib/Target/AMDGPU/AMDGPUAliasAnalysis.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f32-no-rtn.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f32-rtn.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f64.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.v2f16-no-rtn.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.v2f16-rtn.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-non-integral-address-spaces.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.atomic.dim.a16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.dim.a16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.2d.d16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.2d.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.2darraymsaa.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.3d.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.a16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.d.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.g16.a16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.g16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.store.2d.d16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.atomic.dim.mir
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.add.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.cmpswap.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.fadd-with-ret.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.fadd.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.format.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.format.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f32.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.load.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.i8.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.add.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.cmpswap.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.fadd-with-ret.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.fadd.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.format.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.format.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.format.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.format.f32.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.tbuffer.load.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.tbuffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.image.load.1d.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.image.sample.1d.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.raw.buffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.struct.buffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.struct.buffer.store.ll
  llvm/test/CodeGen/AMDGPU/addrspacecast-captured.ll
  llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa.ll
  llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f32-no-rtn.ll
  llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f32-rtn.ll
  llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f64.ll
  llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.v2f16-no-rtn.ll
  llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.v2f16-rtn.ll
  llvm/test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll
  llvm/test/CodeGen/AMDGPU/cgp-addressing-modes.ll
  llvm/test/CodeGen/AMDGPU/extract_subvector_vec4_vec3.ll
  llvm/test/CodeGen/AMDGPU/force-alwaysinline-lds-global-address.ll
  llvm/test/CodeGen/AMDGPU/loop-idiom.ll
  llvm/test/CodeGen/AMDGPU/mdt-preserving-crash.ll
  llvm/test/CodeGen/AMDGPU/noop-shader-O0.ll
  llvm/test/CodeGen/AMDGPU/nullptr-long-address-spaces.ll
  llvm/test/CodeGen/AMDGPU/nullptr.ll
  llvm/test/CodeGen/AMDGPU/promote-alloca-lifetime.ll
  llvm/test/CodeGen/AMDGPU/promote-alloca-to-lds-select.ll
  llvm/test/CodeGen/AMDGPU/sgpr-copy-local-cse.ll
  llvm/test/CodeGen/AMDGPU/splitkit-getsubrangeformask.ll
  llvm/test/CodeGen/AMDGPU/unroll.ll
  llvm/test/CodeGen/AMDGPU/unsupported-image-a16.ll
  llvm/test/CodeGen/AMDGPU/unsupported-image-g16.ll
  llvm/test/CodeGen/AMDGPU/vgpr-liverange-ir.ll
  llvm/test/CodeGen/MIR/AMDGPU/custom-pseudo-source-values.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/adaptive_constant_global_redzones.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/adaptive_global_redzones.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_do_not_instrument_lds.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_do_not_instrument_scratch.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_instrument_constant_address_space.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_instrument_generic_address_space.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_instrument_global_address_space.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/global_metadata_addrspacecasts.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/no_redzones_in_lds_globals.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/no_redzones_in_scratch_globals.ll
  llvm/test/Transforms/AlignmentFromAssumptions/amdgpu-crash.ll
  llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i16.ll
  llvm/test/Transforms/EarlyCSE/AMDGPU/memrealtime.ll
  llvm/test/Transforms/IndVarSimplify/AMDGPU/no-widen-to-i64.ll
  llvm/test/Transforms/InferAddressSpaces/AMDGPU/noop-ptrint-pair.ll
  llvm/test/Transforms/InferAddressSpaces/AMDGPU/ptrmask.ll
  llvm/test/Transforms/InferAddressSpaces/X86/noop-ptrint-pair.ll
  llvm/test/Transforms/Inline/AMDGPU/amdgpu-inline-alloca-argument.ll
  llvm/test/Transforms/InstCombine/AMDGPU/memcpy-from-constant.ll
  llvm/test/Transforms/InstCombine/alloca-in-non-alloca-as.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/aa-metadata.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/adjust-alloca-alignment.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/complex-index.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/extended-index.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/gep-bitcast.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/insertion-point.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/interleaved-mayalias-store.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/invariant-load.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores-private.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/missing-alignment.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/multiple_tails.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/no-implicit-float.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/optnone.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/pointer-elements.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/selects-inseltpoison.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/selects.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/store_with_aliasing_load.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/weird-type-accesses.ll
  llvm/test/Transforms/LoopLoadElim/pr46854-adress-spaces.ll
  llvm/test/Transforms/LoopStrengthReduce/AMDGPU/atomics.ll
  llvm/test/Transforms/LoopStrengthReduce/AMDGPU/different-addrspace-addressing-mode-loops.ll
  llvm/test/Transforms/LoopStrengthReduce/AMDGPU/lsr-invalid-ptr-extend.ll
  llvm/test/Transforms/LoopStrengthReduce/AMDGPU/lsr-postinc-pos-addrspace.ll
  llvm/test/Transforms/LoopStrengthReduce/AMDGPU/preserve-addrspace-assert.ll
  llvm/test/Transforms/OpenMP/attributor_pointer_offset_crash.ll
  llvm/test/Transforms/OpenMP/spmdization_constant_prop.ll
  llvm/test/Transforms/OpenMP/values_in_offload_arrays.alloca.ll
  llvm/test/Transforms/SLPVectorizer/AMDGPU/address-space-ptr-sze-gep-index-assert.ll
  llvm/test/Transforms/VectorCombine/AMDGPU/as-transition-inseltpoison.ll
  llvm/test/Transforms/VectorCombine/AMDGPU/as-transition.ll
  llvm/unittests/Bitcode/DataLayoutUpgradeTest.cpp

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits