[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

Joseph Huber via Phabricator via cfe-commits Tue, 27 Jun 2023 08:23:03 -0700

jhuber6 added a comment.

So this implements `stacksave` using `__kmpc_alloc_shared` instead? That 
makes sense, since the OpenMP standard expects stack variables to be shareable 
between threads. I wonder how this interacts with `-fopenmp-cuda-mode`.
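
For reference, a minimal sketch of the kind of code the patch title refers to: a VLA defined inside an offloaded region. The function name `sum_first_n` and the loop bodies are hypothetical, chosen only for illustration. Without `-fopenmp` the pragma is ignored and the code runs on the host.

```c
#include <stdio.h>

/* Hypothetical example: a VLA declared inside an OpenMP target region.
 * The array size is only known at run time, so the compiler cannot
 * reserve a fixed-size stack slot for it. */
int sum_first_n(int n) {
  int total = 0;
#pragma omp target map(tofrom : total)
  {
    int buf[n]; /* VLA defined in the offloaded region */
    for (int i = 0; i < n; ++i)
      buf[i] = i;
    for (int i = 0; i < n; ++i)
      total += buf[i];
  }
  return total;
}
```

Calling `sum_first_n(5)` sums 0 through 4. How the storage for `buf` is allocated on the device is what the patch changes.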




================
Comment at: clang/lib/CodeGen/CGDecl.cpp:1603
+    // deallocation call of __kmpc_free_shared() is emitted later.
+    if (getLangOpts().OpenMP && getTarget().getTriple().isAMDGCN()) {
+      // Emit call to __kmpc_alloc_shared() instead of the alloca.
----------------
Does NVPTX handle this already? If not, is there a compelling reason to exclude 
NVPTX? Otherwise, we should check whether we are compiling for the OpenMP 
device rather than checking for AMDGCN specifically.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153883/new/

https://reviews.llvm.org/D153883

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
