Author: Shilei Tian
Date: 2021-01-20T15:56:02-05:00
New Revision: 33a5d212c6198af2bd902bb8e4cfd0f0bec0114f

URL: https://github.com/llvm/llvm-project/commit/33a5d212c6198af2bd902bb8e4cfd0f0bec0114f
DIFF: https://github.com/llvm/llvm-project/commit/33a5d212c6198af2bd902bb8e4cfd0f0bec0114f.diff

LOG: [OpenMP][NVPTX] Added forward declaration to pave the way for building deviceRTLs with OpenMP

Once we switch to building deviceRTLs with OpenMP, primitives and CUDA
intrinsics can no longer be used directly, because `__device__` is not
recognized by the OpenMP compiler. To avoid pulling in all the CUDA internal
headers we have in `clang`, we forward declare these functions. Eventually
they will be transformed into the right LLVM intrinsics.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95058

Added:


Modified:
    openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu

Removed:


################################################################################
diff --git a/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu b/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
index ffc7498e662e..75945e3cd8c4 100644
--- a/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
+++ b/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
@@ -16,6 +16,23 @@
 
 #include <cuda.h>
 
+// Forward declaration of CUDA primitives which will be evetually transformed
+// into LLVM intrinsics.
+extern "C" {
+unsigned int __activemask();
+unsigned int __ballot(unsigned);
+// The default argument here is based on NVIDIA's website
+// https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/
+int __shfl_sync(unsigned mask, int val, int src_line, int width = WARPSIZE);
+int __shfl(int val, int src_line, int width = WARPSIZE);
+int __shfl_down(int var, unsigned detla, int width);
+int __shfl_down_sync(unsigned mask, int var, unsigned detla, int width);
+void __syncwarp(int mask);
+void __threadfence();
+void __threadfence_block();
+void __threadfence_system();
+}
+
 DEVICE void __kmpc_impl_unpack(uint64_t val, uint32_t &lo, uint32_t &hi) {
   asm volatile("mov.b64 {%0,%1}, %2;" : "=r"(lo), "=r"(hi) : "l"(val));
 }
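
The forward-declaration pattern described in the log can be illustrated on the host with a minimal sketch. Everything named here is hypothetical and not part of the commit: `demo_activemask`, `demo_shfl`, `broadcast_from_lane0`, and `kWarpSize` (a stand-in for `WARPSIZE`) are illustration-only names, and the stub definitions exist only so the sketch links on a CPU. In the real deviceRTL the declared primitives are expected to be supplied by the compiler, not by stubs.

```cpp
// Hypothetical stand-in for WARPSIZE from the deviceRTL headers.
constexpr int kWarpSize = 32;

// Forward declarations with C linkage. No header defining these functions is
// included; the caller compiles against the declarations alone.
extern "C" {
unsigned demo_activemask();
// A default argument is allowed on an extern "C" declaration in C++.
int demo_shfl(int val, int src_lane, int width = kWarpSize);
}

// Code written against the declarations only.
int broadcast_from_lane0(int val) {
  if (demo_activemask() == 0)
    return val;               // no active lanes: nothing to exchange
  return demo_shfl(val, 0);   // width falls back to kWarpSize
}

// Host-only stubs (an assumption of this sketch) so that it links and runs.
// They model a single lane: the mask is non-zero and a shuffle returns the
// caller's own value.
extern "C" unsigned demo_activemask() { return 0xffffffffu; }
extern "C" int demo_shfl(int val, int src_lane, int width) {
  (void)src_lane;
  (void)width;
  return val;
}
```

The names are kept out of the reserved double-underscore namespace on purpose, so the sketch cannot collide with real CUDA builtins when compiled with an ordinary host compiler.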