[PATCH] D145441: [AMDGPU] Define data layout entries for buffers

Krzysztof Drewniak via Phabricator via cfe-commits Mon, 06 Mar 2023 15:23:40 -0800

krzysz00 created this revision.
krzysz00 added reviewers: nhaehnle, arsenm, b-sumner, piotr, sstefan1, 
jdoerfert.
Herald added subscribers: kosarev, jeroen.dobbelaere, foad, wenlei, okura, 
kuter, kerbowa, arphaman, zzheng, hiraditya, arichardson, tpr, dstuttard, 
yaxunl, jvesely, kzhuravl, MatzeB.
Herald added a project: All.
krzysz00 requested review of this revision.
Herald added subscribers: llvm-commits, cfe-commits, pcwang-thead, wdng.
Herald added projects: clang, LLVM.


Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.

The first is address space 7, which the data layout already marked as
non-integral. It now has 160-bit pointers (which are 256-bit aligned)
and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.
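
As a rough illustration of the paragraph above, an address space 7
pointer can be modelled in C++ as a descriptor plus an offset. This is
a sketch only: BufferDescriptor, FatBufferPointer, gep, and
fatPointerDemo are hypothetical names, and the real value is an LLVM
ptr addrspace(7), not a C++ struct. Given the sizes stated above, the
matching data layout entry would presumably take the form
p7:160:256:256:32 (size, ABI alignment, preferred alignment, index
width).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the 128-bit buffer descriptor (four 32-bit words).
struct BufferDescriptor {
  std::uint32_t words[4];
};

// Hypothetical model of an address space 7 pointer: descriptor plus a
// 32-bit offset, aligned to 256 bits (32 bytes).
struct alignas(32) FatBufferPointer {
  BufferDescriptor rsrc;
  std::uint32_t offset;
};

static_assert(sizeof(BufferDescriptor) * 8 == 128, "descriptor is 128 bits");
static_assert(alignof(FatBufferPointer) * 8 == 256, "256-bit alignment");

// Payload bits carried by the pointer, excluding alignment padding.
constexpr unsigned kFatPointerBits =
    sizeof(BufferDescriptor) * 8 + sizeof(std::uint32_t) * 8;
static_assert(kFatPointerBits == 160, "128-bit descriptor + 32-bit offset");

// GEP-like arithmetic in this model touches only the 32-bit offset; the
// descriptor is carried along unchanged. Unsigned arithmetic wraps
// modulo 2^32.
inline FatBufferPointer gep(FatBufferPointer p, std::int32_t byteDelta) {
  p.offset += static_cast<std::uint32_t>(byteDelta);
  return p;
}

// Small usage check: stepping forward and then back restores the offset.
inline void fatPointerDemo() {
  FatBufferPointer p{{{1, 2, 3, 4}}, 16};
  FatBufferPointer q = gep(gep(p, 8), -8);
  assert(q.offset == 16);
}
```

In this model the 32-bit index width means address arithmetic never
needs to look at the descriptor half of the pointer.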

The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and the intrinsics will, in the future, be
changed from taking <4 x i32> as the resource arguments to a
ptr addrspace(8). These pointers are 128 bits long (with the same
alignment). However, they must not be used as the arguments to
getelementptr or otherwise used in address computations, since they
can have arbitrarily complex inherent addressing semantics that can't
be represented in LLVM. These pointers are nonetheless integral, since
inttoptr and ptrtoint behave deterministically and reasonably on them.
Marking them integral runs the risk of a GEP being optimized into
incorrect pointer arithmetic, but address space 8 pointers (buffer
resources) must not appear in a GEP anyway, so this is acceptable.
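
To make the two sets of properties concrete, the following sketch
parses a data layout fragment and reads back the pointer parameters.
The fragment "p7:160:256:256:32-p8:128:128-ni:7" is an assumption
derived from the sizes stated above, not a quotation of the full
AMDGPU data layout string, and the parser is a hand-rolled stand-in
(PointerSpec, ParsedLayout, parseLayoutFragment are hypothetical
names), not LLVM's own DataLayout class.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Parameters of one "p<n>:<size>:<abi>[:<pref>[:<idx>]]" entry, in bits.
struct PointerSpec {
  unsigned sizeBits = 0;
  unsigned abiAlignBits = 0;
  unsigned prefAlignBits = 0;
  unsigned indexBits = 0;
};

struct ParsedLayout {
  std::map<unsigned, PointerSpec> pointers; // keyed by address space
  std::set<unsigned> nonIntegral;           // from "ni:<n>:<n>..."
};

static std::vector<std::string> splitOn(const std::string &s, char sep) {
  std::vector<std::string> parts;
  std::stringstream ss(s);
  std::string item;
  while (std::getline(ss, item, sep))
    parts.push_back(item);
  return parts;
}

// Parses only pointer ("p...") and non-integral ("ni") entries; every
// other kind of entry is skipped.
inline ParsedLayout parseLayoutFragment(const std::string &layout) {
  ParsedLayout result;
  for (const std::string &entry : splitOn(layout, '-')) {
    std::vector<std::string> f = splitOn(entry, ':');
    if (f.empty() || f[0].empty())
      continue;
    if (f[0] == "ni") {
      for (std::size_t i = 1; i < f.size(); ++i)
        result.nonIntegral.insert(std::stoul(f[i]));
    } else if (f[0][0] == 'p') {
      assert(f.size() >= 3 && "pointer entry needs size and ABI alignment");
      unsigned addrSpace = f[0].size() > 1 ? std::stoul(f[0].substr(1)) : 0;
      PointerSpec spec;
      spec.sizeBits = std::stoul(f[1]);
      spec.abiAlignBits = std::stoul(f[2]);
      // Omitted fields default to the ABI alignment and the pointer size.
      spec.prefAlignBits = f.size() > 3 ? std::stoul(f[3]) : spec.abiAlignBits;
      spec.indexBits = f.size() > 4 ? std::stoul(f[4]) : spec.sizeBits;
      result.pointers[addrSpace] = spec;
    }
  }
  return result;
}
```

Parsing the assumed fragment yields a 160-bit pointer with a 32-bit
index for address space 7, a 128-bit pointer with 128-bit alignment
for address space 8, and only address space 7 in the non-integral set.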

Future work includes:

- Upgrading (including auto-upgrading) buffer intrinsics from 4xi32 to
  ptr addrspace(8).

- A late rewrite to turn address space 7 operations into buffer
  intrinsics and offset computations.
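
The second item can be pictured with a toy model. Everything below is
hypothetical: Resource, FatPointer, ToyMemory, rawBufferLoadU32, and
loweredLoadU32 are illustrative names, and treating words[0] as an
index into a table of byte buffers is an invention of this sketch,
since real descriptor semantics are defined by the hardware. The point
is only the shape of the rewrite: a load through an address space 7
pointer is split into a resource operand and an offset operand.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a 128-bit buffer resource (address space 8).
struct Resource {
  std::uint32_t words[4];
};

// Hypothetical stand-in for an address space 7 pointer.
struct FatPointer {
  Resource rsrc;
  std::uint32_t offset;
};

// Toy memory: in this model only, words[0] selects one of several buffers.
using ToyMemory = std::vector<std::vector<std::uint8_t>>;

// Stand-in for a raw buffer load taking (resource, byte offset).
inline std::uint32_t rawBufferLoadU32(const ToyMemory &mem, const Resource &r,
                                      std::uint32_t offset) {
  const std::vector<std::uint8_t> &buf = mem.at(r.words[0]);
  assert(offset + sizeof(std::uint32_t) <= buf.size());
  std::uint32_t value;
  std::memcpy(&value, buf.data() + offset, sizeof value);
  return value;
}

// Conceptual result of the rewrite for a 32-bit load: the fat pointer is
// taken apart and both halves are passed to the buffer operation.
inline std::uint32_t loweredLoadU32(const ToyMemory &mem, FatPointer p) {
  return rawBufferLoadU32(mem, p.rsrc, p.offset);
}
```

After such a rewrite no address space 7 value remains; only the
resource and a plain 32-bit integer offset reach code generation.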

This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.

Depends on D143437 <https://reviews.llvm.org/D143437>


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D145441

Files:
  clang/lib/Basic/Targets/AMDGPU.cpp
  clang/test/CodeGen/target-data.c
  clang/test/CodeGenOpenCL/amdgpu-env-amdgcn.cl
  llvm/docs/AMDGPUUsage.rst
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/Target/AMDGPU/AMDGPU.h
  llvm/lib/Target/AMDGPU/AMDGPUAliasAnalysis.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f32-no-rtn.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f32-rtn.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f64.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.v2f16-no-rtn.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.v2f16-rtn.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-non-integral-address-spaces.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.atomic.dim.a16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.dim.a16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.2d.d16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.2d.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.2darraymsaa.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.3d.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.a16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.d.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.g16.a16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.g16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.store.2d.d16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.atomic.dim.mir
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.add.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.cmpswap.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.fadd-with-ret.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.fadd.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.format.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.format.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f32.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.load.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.i8.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.add.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.cmpswap.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.fadd-with-ret.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.fadd.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.format.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.format.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.format.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.format.f32.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.tbuffer.load.f16.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.tbuffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.image.load.1d.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.image.sample.1d.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.raw.buffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.struct.buffer.load.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.struct.buffer.store.ll
  llvm/test/CodeGen/AMDGPU/addrspacecast-captured.ll
  llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa.ll
  llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f32-no-rtn.ll
  llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f32-rtn.ll
  llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f64.ll
  llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.v2f16-no-rtn.ll
  llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.v2f16-rtn.ll
  llvm/test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll
  llvm/test/CodeGen/AMDGPU/cgp-addressing-modes.ll
  llvm/test/CodeGen/AMDGPU/extract_subvector_vec4_vec3.ll
  llvm/test/CodeGen/AMDGPU/force-alwaysinline-lds-global-address.ll
  llvm/test/CodeGen/AMDGPU/loop-idiom.ll
  llvm/test/CodeGen/AMDGPU/mdt-preserving-crash.ll
  llvm/test/CodeGen/AMDGPU/noop-shader-O0.ll
  llvm/test/CodeGen/AMDGPU/nullptr-long-address-spaces.ll
  llvm/test/CodeGen/AMDGPU/nullptr.ll
  llvm/test/CodeGen/AMDGPU/promote-alloca-lifetime.ll
  llvm/test/CodeGen/AMDGPU/promote-alloca-to-lds-select.ll
  llvm/test/CodeGen/AMDGPU/sgpr-copy-local-cse.ll
  llvm/test/CodeGen/AMDGPU/splitkit-getsubrangeformask.ll
  llvm/test/CodeGen/AMDGPU/unroll.ll
  llvm/test/CodeGen/AMDGPU/unsupported-image-a16.ll
  llvm/test/CodeGen/AMDGPU/unsupported-image-g16.ll
  llvm/test/CodeGen/AMDGPU/vgpr-liverange-ir.ll
  llvm/test/CodeGen/MIR/AMDGPU/custom-pseudo-source-values.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/adaptive_constant_global_redzones.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/adaptive_global_redzones.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_do_not_instrument_lds.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_do_not_instrument_scratch.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_instrument_constant_address_space.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_instrument_generic_address_space.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_instrument_global_address_space.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/global_metadata_addrspacecasts.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/no_redzones_in_lds_globals.ll
  llvm/test/Instrumentation/AddressSanitizer/AMDGPU/no_redzones_in_scratch_globals.ll
  llvm/test/Transforms/AlignmentFromAssumptions/amdgpu-crash.ll
  llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i16.ll
  llvm/test/Transforms/EarlyCSE/AMDGPU/memrealtime.ll
  llvm/test/Transforms/IndVarSimplify/AMDGPU/no-widen-to-i64.ll
  llvm/test/Transforms/InferAddressSpaces/AMDGPU/noop-ptrint-pair.ll
  llvm/test/Transforms/InferAddressSpaces/AMDGPU/ptrmask.ll
  llvm/test/Transforms/InferAddressSpaces/X86/noop-ptrint-pair.ll
  llvm/test/Transforms/Inline/AMDGPU/amdgpu-inline-alloca-argument.ll
  llvm/test/Transforms/InstCombine/AMDGPU/memcpy-from-constant.ll
  llvm/test/Transforms/InstCombine/alloca-in-non-alloca-as.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/aa-metadata.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/adjust-alloca-alignment.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/complex-index.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/extended-index.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/gep-bitcast.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/insertion-point.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/interleaved-mayalias-store.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/invariant-load.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores-private.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/missing-alignment.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/multiple_tails.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/no-implicit-float.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/optnone.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/pointer-elements.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/selects-inseltpoison.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/selects.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/store_with_aliasing_load.ll
  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/weird-type-accesses.ll
  llvm/test/Transforms/LoopLoadElim/pr46854-adress-spaces.ll
  llvm/test/Transforms/LoopStrengthReduce/AMDGPU/atomics.ll
  llvm/test/Transforms/LoopStrengthReduce/AMDGPU/different-addrspace-addressing-mode-loops.ll
  llvm/test/Transforms/LoopStrengthReduce/AMDGPU/lsr-invalid-ptr-extend.ll
  llvm/test/Transforms/LoopStrengthReduce/AMDGPU/lsr-postinc-pos-addrspace.ll
  llvm/test/Transforms/LoopStrengthReduce/AMDGPU/preserve-addrspace-assert.ll
  llvm/test/Transforms/OpenMP/attributor_pointer_offset_crash.ll
  llvm/test/Transforms/OpenMP/spmdization_constant_prop.ll
  llvm/test/Transforms/OpenMP/values_in_offload_arrays.alloca.ll
  llvm/test/Transforms/SLPVectorizer/AMDGPU/address-space-ptr-sze-gep-index-assert.ll
  llvm/test/Transforms/VectorCombine/AMDGPU/as-transition-inseltpoison.ll
  llvm/test/Transforms/VectorCombine/AMDGPU/as-transition.ll
  llvm/unittests/Bitcode/DataLayoutUpgradeTest.cpp

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
