[llvm-branch-commits] [openmp] 33a5d21 - [OpenMP][NVPTX] Added forward declaration to pave the way for building deviceRTLs with OpenMP

Shilei Tian via llvm-branch-commits Wed, 20 Jan 2021 13:00:57 -0800

Author: Shilei Tian
Date: 2021-01-20T15:56:02-05:00
New Revision: 33a5d212c6198af2bd902bb8e4cfd0f0bec0114f


URL: https://github.com/llvm/llvm-project/commit/33a5d212c6198af2bd902bb8e4cfd0f0bec0114f
DIFF: https://github.com/llvm/llvm-project/commit/33a5d212c6198af2bd902bb8e4cfd0f0bec0114f.diff

LOG: [OpenMP][NVPTX] Added forward declaration to pave the way for building deviceRTLs with OpenMP

Once we switch to building deviceRTLs with OpenMP, CUDA primitives and
intrinsics can no longer be used directly because `__device__` is not
recognized by the OpenMP compiler. To avoid pulling in all of the CUDA
internal headers we had in `clang`, we forward declared these functions.
Eventually they will be transformed into the right LLVM intrinsics.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95058
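
The forward-declaration approach described in the log can be illustrated with
a minimal host-side sketch. It is not code from the patch: the `demo_` names,
the `WARPSIZE` value of 32, and the stand-in definitions are all assumptions
made so the example compiles as plain C++ without any CUDA headers.

```cpp
// Hypothetical stand-in for the runtime's WARPSIZE macro (assumed value).
#define WARPSIZE 32

// Forward declarations only: no CUDA headers and no __device__ attribute.
// C linkage keeps the symbol names unmangled. The demo_ names are
// illustrative and are not the functions declared in the patch.
extern "C" {
unsigned int demo_activemask();
int demo_shfl_sync(unsigned mask, int val, int src_lane, int width = WARPSIZE);
}

// A caller written purely against the declarations above. The default
// argument supplies the width, as in the declarations the patch adds.
int broadcast_from_lane0(int val) {
  return demo_shfl_sync(demo_activemask(), val, /*src_lane=*/0);
}

// Host-only stand-ins so the sketch links as ordinary C++. In the device
// runtime the declared functions are expected to be resolved by the
// compiler instead; these bodies only model a single lane.
extern "C" unsigned int demo_activemask() { return 0xFFFFFFFFu; }

extern "C" int demo_shfl_sync(unsigned mask, int val, int src_lane, int width) {
  (void)mask;
  (void)src_lane;
  (void)width;
  return val; // a single "lane" trivially reads back its own value
}
```

The point of the sketch is only that callers need nothing beyond the
declarations; how the real symbols get lowered to intrinsics is outside
what this example shows.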

Added: 
    

Modified: 
    openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu

Removed: 
    


################################################################################
diff --git a/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu b/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
index ffc7498e662e..75945e3cd8c4 100644
--- a/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
+++ b/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
@@ -16,6 +16,23 @@
 
 #include <cuda.h>
 
+// Forward declaration of CUDA primitives which will be eventually transformed
+// into LLVM intrinsics.
+extern "C" {
+unsigned int __activemask();
+unsigned int __ballot(unsigned);
+// The default argument here is based on NVIDIA's website
+// https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/
+int __shfl_sync(unsigned mask, int val, int src_line, int width = WARPSIZE);
+int __shfl(int val, int src_line, int width = WARPSIZE);
+int __shfl_down(int var, unsigned delta, int width);
+int __shfl_down_sync(unsigned mask, int var, unsigned delta, int width);
+void __syncwarp(int mask);
+void __threadfence();
+void __threadfence_block();
+void __threadfence_system();
+}
+
 DEVICE void __kmpc_impl_unpack(uint64_t val, uint32_t &lo, uint32_t &hi) {
   asm volatile("mov.b64 {%0,%1}, %2;" : "=r"(lo), "=r"(hi) : "l"(val));
 }


        