[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

Gheorghe-Teodor Bercea via Phabricator via cfe-commits Wed, 28 Jun 2023 06:55:07 -0700

doru1004 added inline comments.


================
Comment at: clang/lib/CodeGen/CGDecl.cpp:1603
+    // deallocation call of __kmpc_free_shared() is emitted later.
+    if (getLangOpts().OpenMP && getTarget().getTriple().isAMDGCN()) {
+      // Emit call to __kmpc_alloc_shared() instead of the alloca.
----------------
arsenm wrote:
> ABataev wrote:
> > doru1004 wrote:
> > > jhuber6 wrote:
> > > > ABataev wrote:
> > > > > OpenMPIsDevice?
> > > > Does NVPTX handle this already? If not, is there a compelling reason to 
> > > > exclude NVPTX? Otherwise we should check if we are the OpenMP device.
> > > Does NVPTX support dynamic allocas?
> > It does not matter here, it depends on the runtime library implementations. 
> > The compiler just shall provide proper runtime calls emission, everything 
> > else is part of the runtime support.
> I think I heard recent PTX introduced new instructions for it. amdgpu codegen 
> just happens to be broken because we don't properly restore the stack 
> afterwards. When I added the support we had no way of testing (and still 
> don't really, __builtin_alloca doesn't handle non-0 stack address space 
> correctly)
If NVPTX supports dynamic allocas, then there is no reason to stop NVPTX from 
emitting them, and the condition can stay as it is right now. That said, I am 
happy to go with the consensus, so please let me know which option you would 
all prefer.
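
For reference, the two candidate guards discussed in this thread can be 
compared with a small standalone sketch. It does not use the real Clang types: 
MockState, Arch, and both function names are hypothetical stand-ins that only 
model getLangOpts().OpenMP, the OpenMPIsDevice flag suggested above, and the 
target triple check. It is meant to show which configurations each guard would 
select, not how CGDecl.cpp is actually structured.

```cpp
#include <cassert>

// Hypothetical stand-ins for the state the guard consults.
// These are NOT clang::LangOptions or llvm::Triple.
enum class Arch { X86_64, AMDGCN, NVPTX };

struct MockState {
  bool OpenMP;         // models getLangOpts().OpenMP
  bool OpenMPIsDevice; // models the OpenMPIsDevice language option
  Arch Target;         // models getTarget().getTriple()
};

// Guard as it appears in the patch: OpenMP enabled and an AMDGCN target.
bool useAllocSharedAMDGCNOnly(const MockState &S) {
  return S.OpenMP && S.Target == Arch::AMDGCN;
}

// Alternative raised in review: any OpenMP device compilation,
// regardless of which GPU target it is.
bool useAllocSharedAnyDevice(const MockState &S) {
  return S.OpenMP && S.OpenMPIsDevice;
}
```

The two guards agree for an AMDGCN device compile and for a host compile. They 
differ for an NVPTX device compile, which only the second guard routes to 
__kmpc_alloc_shared().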


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153883/new/

https://reviews.llvm.org/D153883

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
