r340772 - [OpenMP][NVPTX] Use appropriate _CALL_ELF macro when offloading
Author: gbercea
Date: Mon Aug 27 13:16:20 2018
New Revision: 340772

URL: http://llvm.org/viewvc/llvm-project?rev=340772&view=rev
Log:
[OpenMP][NVPTX] Use appropriate _CALL_ELF macro when offloading

Summary: When offloading to a device and using the powerpc64le version of the auxiliary triple, the _CALL_ELF macro is not set correctly to 2 resulting in the attempt to include a header that does not exist. This patch fixes this problem.

Reviewers: Hahnfeld, ABataev, caomhin

Reviewed By: Hahnfeld

Subscribers: guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D51312

Modified:
    cfe/trunk/lib/Frontend/InitPreprocessor.cpp
    cfe/trunk/test/Preprocessor/aux-triple.c

Modified: cfe/trunk/lib/Frontend/InitPreprocessor.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/InitPreprocessor.cpp?rev=340772&r1=340771&r2=340772&view=diff
==
--- cfe/trunk/lib/Frontend/InitPreprocessor.cpp (original)
+++ cfe/trunk/lib/Frontend/InitPreprocessor.cpp Mon Aug 27 13:16:20 2018
@@ -1106,14 +1106,19 @@ static void InitializePredefinedAuxMacro
   auto AuxTriple = AuxTI.getTriple();
 
   // Define basic target macros needed by at least bits/wordsize.h and
-  // bits/mathinline.h
+  // bits/mathinline.h.
+  // On PowerPC, explicitely set _CALL_ELF macro needed for gnu/stubs.h.
   switch (AuxTriple.getArch()) {
   case llvm::Triple::x86_64:
     Builder.defineMacro("__x86_64__");
     break;
   case llvm::Triple::ppc64:
+    Builder.defineMacro("__powerpc64__");
+    Builder.defineMacro("_CALL_ELF", "1");
+    break;
   case llvm::Triple::ppc64le:
     Builder.defineMacro("__powerpc64__");
+    Builder.defineMacro("_CALL_ELF", "2");
     break;
   default:
     break;

Modified: cfe/trunk/test/Preprocessor/aux-triple.c
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Preprocessor/aux-triple.c?rev=340772&r1=340771&r2=340772&view=diff
==
--- cfe/trunk/test/Preprocessor/aux-triple.c (original)
+++ cfe/trunk/test/Preprocessor/aux-triple.c Mon Aug 27 13:16:20 2018
@@ -14,7 +14,7 @@
 // RUN: %clang_cc1 -x cuda -E -dM -ffreestanding < /dev/null \
 // RUN: -triple nvptx64-none-none -aux-triple powerpc64le-unknown-linux-gnu \
 // RUN: | FileCheck -match-full-lines %s \
-// RUN: -check-prefixes NVPTX64,PPC64,LINUX,LINUX-CPP
+// RUN: -check-prefixes NVPTX64,PPC64LE,LINUX,LINUX-CPP
 // RUN: %clang_cc1 -x cuda -E -dM -ffreestanding < /dev/null \
 // RUN: -triple nvptx64-none-none -aux-triple x86_64-unknown-linux-gnu \
 // RUN: | FileCheck -match-full-lines %s \
@@ -24,7 +24,7 @@
 // RUN: %clang_cc1 -E -dM -ffreestanding < /dev/null \
 // RUN: -fopenmp -fopenmp-is-device -triple nvptx64-none-none \
 // RUN: -aux-triple powerpc64le-unknown-linux-gnu \
-// RUN: | FileCheck -match-full-lines -check-prefixes NVPTX64,PPC64,LINUX %s
+// RUN: | FileCheck -match-full-lines -check-prefixes NVPTX64,PPC64LE,LINUX %s
 // RUN: %clang_cc1 -E -dM -ffreestanding < /dev/null \
 // RUN: -fopenmp -fopenmp-is-device -triple nvptx64-none-none \
 // RUN: -aux-triple x86_64-unknown-linux-gnu \
@@ -33,13 +33,15 @@
 // RUN: -fopenmp -fopenmp-is-device -triple nvptx64-none-none \
 // RUN: -aux-triple powerpc64le-unknown-linux-gnu \
 // RUN: | FileCheck -match-full-lines %s \
-// RUN: -check-prefixes NVPTX64,PPC64,LINUX,LINUX-CPP
+// RUN: -check-prefixes NVPTX64,PPC64LE,LINUX,LINUX-CPP
 // RUN: %clang_cc1 -x c++ -E -dM -ffreestanding < /dev/null \
 // RUN: -fopenmp -fopenmp-is-device -triple nvptx64-none-none \
 // RUN: -aux-triple x86_64-unknown-linux-gnu \
 // RUN: | FileCheck -match-full-lines %s \
 // RUN: -check-prefixes NVPTX64,X86_64,LINUX,LINUX-CPP
 
+// PPC64LE:#define _CALL_ELF 2
+
 // NONE-NOT:#define _GNU_SOURCE
 // LINUX-CPP:#define _GNU_SOURCE 1
 
@@ -56,7 +58,7 @@
 // LINUX:#define __linux__ 1
 
 // NONE-NOT:#define __powerpc64__
-// PPC64:#define __powerpc64__ 1
+// PPC64LE:#define __powerpc64__ 1
 
 // NONE-NOT:#define __x86_64__
 // X86_64:#define __x86_64__ 1

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
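Editor's note: the switch in r340772 can be read as a small table from the host (auxiliary) architecture to the macros predefined for the device compilation. The sketch below models only that mapping. AuxArch and auxTargetMacros are hypothetical names made up for illustration and are not part of Clang; the ELFv1/ELFv2 comments reflect the usual meaning of _CALL_ELF on 64-bit PowerPC rather than anything stated in the commit.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for llvm::Triple::ArchType; not Clang's real API.
enum class AuxArch { X86_64, PPC64, PPC64LE, Other };

// Returns the host macros predefined for a device compilation, keyed by
// macro name. A macro defined without an explicit value is modelled as "1",
// matching the "#define __powerpc64__ 1" line checked by the test.
std::map<std::string, std::string> auxTargetMacros(AuxArch Arch) {
  std::map<std::string, std::string> Macros;
  switch (Arch) {
  case AuxArch::X86_64:
    Macros["__x86_64__"] = "1";
    break;
  case AuxArch::PPC64:
    Macros["__powerpc64__"] = "1";
    Macros["_CALL_ELF"] = "1"; // assumed: big-endian hosts use the ELFv1 ABI
    break;
  case AuxArch::PPC64LE:
    Macros["__powerpc64__"] = "1";
    Macros["_CALL_ELF"] = "2"; // assumed: little-endian hosts use the ELFv2 ABI
    break;
  case AuxArch::Other:
    break; // no host macros are predefined for other architectures
  }
  return Macros;
}
```

Before the patch, ppc64 and ppc64le shared one case label, so neither received _CALL_ELF; splitting the cases is what lets each endianness get its own value.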
r305294 - Add comma to comment.
Author: gbercea
Date: Tue Jun 13 10:35:27 2017
New Revision: 305294

URL: http://llvm.org/viewvc/llvm-project?rev=305294&view=rev
Log:
Add comma to comment.

Modified:
    cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=305294&r1=305293&r2=305294&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Tue Jun 13 10:35:27 2017
@@ -6327,7 +6327,7 @@ bool CGOpenMPRuntime::emitTargetGlobalVa
     }
   }
 
-  // If we are in target mode we do not emit any global (declare target is not
+  // If we are in target mode, we do not emit any global (declare target is not
   // implemented yet). Therefore we signal that GD was processed in this case.
   return true;
 }
r326948 - [OpenMP] Remove implicit data sharing code gen that aims to use device shared memory
Author: gbercea Date: Wed Mar 7 13:59:50 2018 New Revision: 326948 URL: http://llvm.org/viewvc/llvm-project?rev=326948&view=rev Log: [OpenMP] Remove implicit data sharing code gen that aims to use device shared memory Summary: Remove this scheme for now since it will be covered by another more generic scheme using global memory. This code will be worked into an optimization for the generic data sharing scheme. Removing this completely and then adding it via future patches will make all future data sharing patches cleaner. Reviewers: ABataev, carlo.bertolli, caomhin Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D43625 Removed: cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h cfe/trunk/test/OpenMP/nvptx_parallel_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_teams_codegen.cpp Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=326948&r1=326947&r2=326948&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Wed Mar 7 13:59:50 2018 @@ -33,11 +33,11 @@ enum OpenMPRTLFunctionNVPTX { /// \brief Call to void __kmpc_spmd_kernel_deinit(); OMPRTL_NVPTX__kmpc_spmd_kernel_deinit, /// \brief Call to void __kmpc_kernel_prepare_parallel(void - /// *outlined_function, void ***args, kmp_int32 nArgs, int16_t + /// *outlined_function, int16_t /// IsOMPRuntimeInitialized); OMPRTL_NVPTX__kmpc_kernel_prepare_parallel, - /// \brief Call to bool __kmpc_kernel_parallel(void **outlined_function, void - /// ***args, int16_t IsOMPRuntimeInitialized); + /// \brief Call to bool __kmpc_kernel_parallel(void **outlined_function, + /// int16_t IsOMPRuntimeInitialized); OMPRTL_NVPTX__kmpc_kernel_parallel, /// \brief Call to void __kmpc_kernel_end_parallel(); 
OMPRTL_NVPTX__kmpc_kernel_end_parallel, @@ -288,7 +288,6 @@ void CGOpenMPRuntimeNVPTX::emitGenericKe EntryFunctionState EST; WorkerFunctionState WST(CGM, D.getLocStart()); Work.clear(); - WrapperFunctionsMap.clear(); // Emit target region as a standalone region. class NVPTXPrePostActionTy : public PrePostActionTy { @@ -508,11 +507,8 @@ void CGOpenMPRuntimeNVPTX::emitWorkerLoo CGF.InitTempAlloca(ExecStatus, Bld.getInt8(/*C=*/0)); CGF.InitTempAlloca(WorkFn, llvm::Constant::getNullValue(CGF.Int8PtrTy)); - // Set up shared arguments - Address SharedArgs = - CGF.CreateDefaultAlignTempAlloca(CGF.Int8PtrPtrTy, "shared_args"); // TODO: Optimize runtime initialization and pass in correct value. - llvm::Value *Args[] = {WorkFn.getPointer(), SharedArgs.getPointer(), + llvm::Value *Args[] = {WorkFn.getPointer(), /*RequiresOMPRuntime=*/Bld.getInt16(1)}; llvm::Value *Ret = CGF.EmitRuntimeCall( createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_kernel_parallel), Args); @@ -532,9 +528,6 @@ void CGOpenMPRuntimeNVPTX::emitWorkerLoo // Signal start of parallel region. CGF.EmitBlock(ExecuteBB); - // Current context - ASTContext &Ctx = CGF.getContext(); - // Process work items: outlined parallel functions. for (auto *W : Work) { // Try to match this outlined function. @@ -550,19 +543,14 @@ void CGOpenMPRuntimeNVPTX::emitWorkerLoo // Execute this outlined function. CGF.EmitBlock(ExecuteFNBB); -// Insert call to work function via shared wrapper. The shared -// wrapper takes exactly three arguments: -// - the parallelism level; -// - the master thread ID; -// - the list of references to shared arguments. -// -// TODO: Assert that the function is a wrapper function.s -Address Capture = CGF.EmitLoadOfPointer(SharedArgs, - Ctx.getPointerType( - Ctx.getPointerType(Ctx.VoidPtrTy)).castAs()); -emitOutlinedFunctionCall(CGF, WST.Loc, W, - {Bld.getInt16(/*ParallelLevel=*/0), - getMasterThreadID(CGF), Capture.getPointer()}); +// Insert call to work function. 
+// FIXME: Pass arguments to outlined function from master thread. +auto *Fn = cast(W); +Address ZeroAddr = +CGF.CreateDefaultAlignTempAlloca(CGF.Int32Ty, /*Name=*/".zero.addr"); +CGF.InitTempAlloca(ZeroAddr, CGF.Builder.getInt32(/*C=*/0)); +llvm::Value *FnArgs[] = {ZeroAddr.getPointer(), ZeroAddr.getPointer()}; +emitCall(CGF, WST.Loc, Fn, FnArgs); // Go to end of parallel region. CGF.EmitBranch(TerminateBB); @@ -630,10 +618,8 @@ CGOpenMPRuntimeNVPTX::createNVPTXRuntime } case OMPRTL_NVPTX__kmpc_kernel_prepare_parallel: { /// Build void __kmpc_kernel_prepare_parallel( -/// void *outlined_function, void ***args, kmp_int32 n
r327438 - [OpenMP] Add flag for linking runtime bitcode library
Author: gbercea Date: Tue Mar 13 12:39:19 2018 New Revision: 327438 URL: http://llvm.org/viewvc/llvm-project?rev=327438&view=rev Log: [OpenMP] Add flag for linking runtime bitcode library Summary: This patch adds an additional flag to the OpenMP device offloading toolchain to link in the runtime library bitcode. Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos, hfinkel Reviewed By: ABataev, grokos Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D43197 Added: cfe/trunk/test/Driver/Inputs/libomptarget/ cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td cfe/trunk/lib/Driver/ToolChains/Cuda.cpp cfe/trunk/test/Driver/openmp-offload-gpu.c Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td?rev=327438&r1=327437&r2=327438&view=diff == --- cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td (original) +++ cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td Tue Mar 13 12:39:19 2018 @@ -203,6 +203,9 @@ def err_drv_expecting_fopenmp_with_fopen def warn_drv_omp_offload_target_duplicate : Warning< "The OpenMP offloading target '%0' is similar to target '%1' already specified - will be ignored.">, InGroup; +def warn_drv_omp_offload_target_missingbcruntime : Warning< + "No library '%0' found in the default clang lib directory or in LIBRARY_PATH. 
Expect degraded performance due to no inlining of runtime functions on target devices.">, + InGroup; def err_drv_bitcode_unsupported_on_toolchain : Error< "-fembed-bitcode is not supported on versions of iOS prior to 6.0">; Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=327438&r1=327437&r2=327438&view=diff == --- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original) +++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Tue Mar 13 12:39:19 2018 @@ -21,6 +21,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/Path.h" +#include "llvm/Support/Process.h" #include "llvm/Support/Program.h" #include @@ -580,6 +581,44 @@ void CudaToolChain::addClangTargetOption CC1Args.push_back("-target-feature"); CC1Args.push_back("+ptx42"); } + + if (DeviceOffloadingKind == Action::OFK_OpenMP) { +SmallVector LibraryPaths; +// Add path to lib and/or lib64 folders. +SmallString<256> DefaultLibPath = + llvm::sys::path::parent_path(getDriver().Dir); +llvm::sys::path::append(DefaultLibPath, +Twine("lib") + CLANG_LIBDIR_SUFFIX); +LibraryPaths.emplace_back(DefaultLibPath.c_str()); + +// Add user defined library paths from LIBRARY_PATH. 
+llvm::Optional LibPath = +llvm::sys::Process::GetEnv("LIBRARY_PATH"); +if (LibPath) { + SmallVector Frags; + const char EnvPathSeparatorStr[] = {llvm::sys::EnvPathSeparator, '\0'}; + llvm::SplitString(*LibPath, Frags, EnvPathSeparatorStr); + for (StringRef Path : Frags) +LibraryPaths.emplace_back(Path.trim()); +} + +std::string LibOmpTargetName = + "libomptarget-nvptx-" + GpuArch.str() + ".bc"; +bool FoundBCLibrary = false; +for (StringRef LibraryPath : LibraryPaths) { + SmallString<128> LibOmpTargetFile(LibraryPath); + llvm::sys::path::append(LibOmpTargetFile, LibOmpTargetName); + if (llvm::sys::fs::exists(LibOmpTargetFile)) { +CC1Args.push_back("-mlink-cuda-bitcode"); +CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetFile)); +FoundBCLibrary = true; +break; + } +} +if (!FoundBCLibrary) + getDriver().Diag(diag::warn_drv_omp_offload_target_missingbcruntime) + << LibOmpTargetName; + } } void CudaToolChain::AddCudaIncludeArgs(const ArgList &DriverArgs, Added: cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc?rev=327438&view=auto == (empty) Modified: cfe/trunk/test/Driver/openmp-offload-gpu.c URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload-gpu.c?rev=327438&r1=327437&r2=327438&view=diff == --- cfe/trunk/test/Driver/openmp-offload-gpu.c (original) +++ cfe/trunk/test/Driver/openmp-offload-gpu.c Tue Mar 13 12:39:19 2018 @@ -142,3 +142,23 @@ // RUN: | FileCheck -check-prefix=CHK-NOLIBDEVICE %s // CHK-NOLIBDEVICE-NOT: error:{{.*}}sm_60 + +/// ##
r327447 - Revert revision 327438.
Author: gbercea Date: Tue Mar 13 13:50:12 2018 New Revision: 327447 URL: http://llvm.org/viewvc/llvm-project?rev=327447&view=rev Log: Revert revision 327438. Removed: cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td cfe/trunk/lib/Driver/ToolChains/Cuda.cpp cfe/trunk/test/Driver/openmp-offload-gpu.c Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td?rev=327447&r1=327446&r2=327447&view=diff == --- cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td (original) +++ cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td Tue Mar 13 13:50:12 2018 @@ -203,9 +203,6 @@ def err_drv_expecting_fopenmp_with_fopen def warn_drv_omp_offload_target_duplicate : Warning< "The OpenMP offloading target '%0' is similar to target '%1' already specified - will be ignored.">, InGroup; -def warn_drv_omp_offload_target_missingbcruntime : Warning< - "No library '%0' found in the default clang lib directory or in LIBRARY_PATH. Expect degraded performance due to no inlining of runtime functions on target devices.">, - InGroup; def err_drv_bitcode_unsupported_on_toolchain : Error< "-fembed-bitcode is not supported on versions of iOS prior to 6.0">; Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=327447&r1=327446&r2=327447&view=diff == --- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original) +++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Tue Mar 13 13:50:12 2018 @@ -581,44 +581,6 @@ void CudaToolChain::addClangTargetOption CC1Args.push_back("-target-feature"); CC1Args.push_back("+ptx42"); } - - if (DeviceOffloadingKind == Action::OFK_OpenMP) { -SmallVector LibraryPaths; -// Add path to lib and/or lib64 folders. 
-SmallString<256> DefaultLibPath = - llvm::sys::path::parent_path(getDriver().Dir); -llvm::sys::path::append(DefaultLibPath, -Twine("lib") + CLANG_LIBDIR_SUFFIX); -LibraryPaths.emplace_back(DefaultLibPath.c_str()); - -// Add user defined library paths from LIBRARY_PATH. -llvm::Optional LibPath = -llvm::sys::Process::GetEnv("LIBRARY_PATH"); -if (LibPath) { - SmallVector Frags; - const char EnvPathSeparatorStr[] = {llvm::sys::EnvPathSeparator, '\0'}; - llvm::SplitString(*LibPath, Frags, EnvPathSeparatorStr); - for (StringRef Path : Frags) -LibraryPaths.emplace_back(Path.trim()); -} - -std::string LibOmpTargetName = - "libomptarget-nvptx-" + GpuArch.str() + ".bc"; -bool FoundBCLibrary = false; -for (StringRef LibraryPath : LibraryPaths) { - SmallString<128> LibOmpTargetFile(LibraryPath); - llvm::sys::path::append(LibOmpTargetFile, LibOmpTargetName); - if (llvm::sys::fs::exists(LibOmpTargetFile)) { -CC1Args.push_back("-mlink-cuda-bitcode"); -CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetFile)); -FoundBCLibrary = true; -break; - } -} -if (!FoundBCLibrary) - getDriver().Diag(diag::warn_drv_omp_offload_target_missingbcruntime) - << LibOmpTargetName; - } } void CudaToolChain::AddCudaIncludeArgs(const ArgList &DriverArgs, Removed: cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc?rev=327446&view=auto == (empty) Modified: cfe/trunk/test/Driver/openmp-offload-gpu.c URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload-gpu.c?rev=327447&r1=327446&r2=327447&view=diff == --- cfe/trunk/test/Driver/openmp-offload-gpu.c (original) +++ cfe/trunk/test/Driver/openmp-offload-gpu.c Tue Mar 13 13:50:12 2018 @@ -142,23 +142,3 @@ // RUN: | FileCheck -check-prefix=CHK-NOLIBDEVICE %s // CHK-NOLIBDEVICE-NOT: error:{{.*}}sm_60 - -/// ### - -/// Check that the runtime bitcode library is part of the compile line. 
Create a bogus -/// bitcode library and add it to the LIBRARY_PATH. -// RUN: env LIBRARY_PATH=%S/Inputs/libomptarget %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda \ -// RUN: -Xopenmp-target -march=sm_20 -fopenmp-relocatable-target -save-temps \ -// RUN: -no-canonical-prefixes %s 2>&1 | FileCheck -check-prefix=CHK-BCLIB %s - -// CHK-BCLIB: clang{{.*}}-triple{{.*}}nvptx64-nvidia-cuda{{.*}}-mlink-cuda-bitcode{{.*}}libomptarget-nvptx-sm_20.bc - -/// #
r327460 - [OpenMP] Add flag for linking runtime bitcode library
Author: gbercea Date: Tue Mar 13 16:19:52 2018 New Revision: 327460 URL: http://llvm.org/viewvc/llvm-project?rev=327460&view=rev Log: [OpenMP] Add flag for linking runtime bitcode library Summary: This patch adds an additional flag to the OpenMP device offloading toolchain to link in the runtime library bitcode. Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos, hfinkel Reviewed By: ABataev, grokos Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D43197 Added: cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td cfe/trunk/lib/Driver/ToolChains/Cuda.cpp cfe/trunk/test/Driver/openmp-offload-gpu.c Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td?rev=327460&r1=327459&r2=327460&view=diff == --- cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td (original) +++ cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td Tue Mar 13 16:19:52 2018 @@ -203,6 +203,9 @@ def err_drv_expecting_fopenmp_with_fopen def warn_drv_omp_offload_target_duplicate : Warning< "The OpenMP offloading target '%0' is similar to target '%1' already specified - will be ignored.">, InGroup; +def warn_drv_omp_offload_target_missingbcruntime : Warning< + "No library '%0' found in the default clang lib directory or in LIBRARY_PATH. 
Expect degraded performance due to no inlining of runtime functions on target devices.">, + InGroup; def err_drv_bitcode_unsupported_on_toolchain : Error< "-fembed-bitcode is not supported on versions of iOS prior to 6.0">; Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=327460&r1=327459&r2=327460&view=diff == --- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original) +++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Tue Mar 13 16:19:52 2018 @@ -581,6 +581,44 @@ void CudaToolChain::addClangTargetOption CC1Args.push_back("-target-feature"); CC1Args.push_back("+ptx42"); } + + if (DeviceOffloadingKind == Action::OFK_OpenMP) { +SmallVector LibraryPaths; +// Add path to lib and/or lib64 folders. +SmallString<256> DefaultLibPath = + llvm::sys::path::parent_path(getDriver().Dir); +llvm::sys::path::append(DefaultLibPath, +Twine("lib") + CLANG_LIBDIR_SUFFIX); +LibraryPaths.emplace_back(DefaultLibPath.c_str()); + +// Add user defined library paths from LIBRARY_PATH. 
+llvm::Optional LibPath = +llvm::sys::Process::GetEnv("LIBRARY_PATH"); +if (LibPath) { + SmallVector Frags; + const char EnvPathSeparatorStr[] = {llvm::sys::EnvPathSeparator, '\0'}; + llvm::SplitString(*LibPath, Frags, EnvPathSeparatorStr); + for (StringRef Path : Frags) +LibraryPaths.emplace_back(Path.trim()); +} + +std::string LibOmpTargetName = + "libomptarget-nvptx-" + GpuArch.str() + ".bc"; +bool FoundBCLibrary = false; +for (StringRef LibraryPath : LibraryPaths) { + SmallString<128> LibOmpTargetFile(LibraryPath); + llvm::sys::path::append(LibOmpTargetFile, LibOmpTargetName); + if (llvm::sys::fs::exists(LibOmpTargetFile)) { +CC1Args.push_back("-mlink-cuda-bitcode"); +CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetFile)); +FoundBCLibrary = true; +break; + } +} +if (!FoundBCLibrary) + getDriver().Diag(diag::warn_drv_omp_offload_target_missingbcruntime) + << LibOmpTargetName; + } } void CudaToolChain::AddCudaIncludeArgs(const ArgList &DriverArgs, Added: cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc?rev=327460&view=auto == (empty) Modified: cfe/trunk/test/Driver/openmp-offload-gpu.c URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload-gpu.c?rev=327460&r1=327459&r2=327460&view=diff == --- cfe/trunk/test/Driver/openmp-offload-gpu.c (original) +++ cfe/trunk/test/Driver/openmp-offload-gpu.c Tue Mar 13 16:19:52 2018 @@ -142,3 +142,26 @@ // RUN: | FileCheck -check-prefix=CHK-NOLIBDEVICE %s // CHK-NOLIBDEVICE-NOT: error:{{.*}}sm_60 + +/// ### + +/// Check that the runtime bitcode library is part of the compile line. Create a bogus +/// bitcode library and add it to the LIBRARY_PATH. +// RUN: env LIBRARY_PATH=%S/Inputs/libomptarget %clang -### -fopenmp=li
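Editor's note: the lookup added to Cuda.cpp in r327460 builds a list of directories (the default clang lib directory first, then every entry of LIBRARY_PATH) and takes the first one that contains libomptarget-nvptx-<arch>.bc. The sketch below shows that search with the C++17 standard library instead of the LLVM support classes. splitPaths, bitcodeLibName and findBitcodeLib are hypothetical helper names, the ':' separator assumes a POSIX host, and the trimming of each path entry done by the patch is left out.

```cpp
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Splits a LIBRARY_PATH-style string on Sep, dropping empty fragments.
std::vector<std::string> splitPaths(const std::string &Paths, char Sep = ':') {
  std::vector<std::string> Frags;
  std::string::size_type Start = 0;
  while (Start <= Paths.size()) {
    std::string::size_type End = Paths.find(Sep, Start);
    if (End == std::string::npos)
      End = Paths.size();
    if (End > Start)
      Frags.push_back(Paths.substr(Start, End - Start));
    Start = End + 1;
  }
  return Frags;
}

// Name of the device runtime bitcode file for a GPU arch such as "sm_20".
std::string bitcodeLibName(const std::string &GpuArch) {
  return "libomptarget-nvptx-" + GpuArch + ".bc";
}

// Returns the first existing candidate, searching DefaultLibDir before the
// LIBRARY_PATH entries, or std::nullopt when no directory has the file
// (the case in which the driver emits its warning).
std::optional<fs::path> findBitcodeLib(const fs::path &DefaultLibDir,
                                       const std::string &LibraryPathEnv,
                                       const std::string &GpuArch) {
  std::vector<fs::path> Dirs{DefaultLibDir};
  for (const std::string &Frag : splitPaths(LibraryPathEnv))
    Dirs.emplace_back(Frag);

  const std::string Name = bitcodeLibName(GpuArch);
  for (const fs::path &Dir : Dirs) {
    fs::path Candidate = Dir / Name;
    std::error_code EC; // non-throwing overload; errors count as "not found"
    if (fs::exists(Candidate, EC))
      return Candidate;
  }
  return std::nullopt;
}
```

In the patch, a hit is forwarded to cc1 as "-mlink-cuda-bitcode <file>", and a miss produces warn_drv_omp_offload_target_missingbcruntime rather than an error, so compilation proceeds without the inlinable runtime.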
r327513 - [OpenMP] Add OpenMP data sharing infrastructure using global memory
Author: gbercea Date: Wed Mar 14 07:17:45 2018 New Revision: 327513 URL: http://llvm.org/viewvc/llvm-project?rev=327513&view=rev Log: [OpenMP] Add OpenMP data sharing infrastructure using global memory Summary: This patch handles the Clang code generation phase for the OpenMP data sharing infrastructure. TODO: add a more detailed description. Reviewers: ABataev, carlo.bertolli, caomhin, hfinkel, Hahnfeld Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D43660 Added: cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp Modified: cfe/trunk/lib/CodeGen/CGDecl.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp cfe/trunk/lib/CodeGen/CodeGenFunction.cpp cfe/trunk/test/OpenMP/nvptx_parallel_codegen.cpp Modified: cfe/trunk/lib/CodeGen/CGDecl.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGDecl.cpp?rev=327513&r1=327512&r2=327513&view=diff == --- cfe/trunk/lib/CodeGen/CGDecl.cpp (original) +++ cfe/trunk/lib/CodeGen/CGDecl.cpp Wed Mar 14 07:17:45 2018 @@ -1068,9 +1068,17 @@ CodeGenFunction::EmitAutoVarAlloca(const } // A normal fixed sized variable becomes an alloca in the entry block, -// unless it's an NRVO variable. - -if (NRVO) { +// unless: +// - it's an NRVO variable. +// - we are compiling OpenMP and it's an OpenMP local variable. + +Address OpenMPLocalAddr = +getLangOpts().OpenMP +? CGM.getOpenMPRuntime().getAddressOfLocalVariable(*this, &D) +: Address::invalid(); +if (getLangOpts().OpenMP && OpenMPLocalAddr.isValid()) { + address = OpenMPLocalAddr; +} else if (NRVO) { // The named return value optimization: allocate this variable in the // return slot, so that we can elide the copy when returning this // variable (C++0x [class.copy]p34). 
@@ -1896,9 +1904,18 @@ void CodeGenFunction::EmitParmDecl(const } } } else { -// Otherwise, create a temporary to hold the value. -DeclPtr = CreateMemTemp(Ty, getContext().getDeclAlign(&D), -D.getName() + ".addr"); +// Check if the parameter address is controlled by OpenMP runtime. +Address OpenMPLocalAddr = +getLangOpts().OpenMP +? CGM.getOpenMPRuntime().getAddressOfLocalVariable(*this, &D) +: Address::invalid(); +if (getLangOpts().OpenMP && OpenMPLocalAddr.isValid()) { + DeclPtr = OpenMPLocalAddr; +} else { + // Otherwise, create a temporary to hold the value. + DeclPtr = CreateMemTemp(Ty, getContext().getDeclAlign(&D), + D.getName() + ".addr"); +} DoStore = true; } Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=327513&r1=327512&r2=327513&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Wed Mar 14 07:17:45 2018 @@ -8100,6 +8100,11 @@ Address CGOpenMPRuntime::getParameterAdd return CGF.GetAddrOfLocalVar(NativeParam); } +Address CGOpenMPRuntime::getAddressOfLocalVariable(CodeGenFunction &CGF, + const VarDecl *VD) { + return Address::invalid(); +} + llvm::Value *CGOpenMPSIMDRuntime::emitParallelOutlinedFunction( const OMPExecutableDirective &D, const VarDecl *ThreadIDVar, OpenMPDirectiveKind InnermostKind, const RegionCodeGenTy &CodeGen) { Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h?rev=327513&r1=327512&r2=327513&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h Wed Mar 14 07:17:45 2018 @@ -676,7 +676,7 @@ public: /// \brief Cleans up references to the objects in finished function. 
/// - void functionFinished(CodeGenFunction &CGF); + virtual void functionFinished(CodeGenFunction &CGF); /// \brief Emits code for parallel or serial call of the \a OutlinedFn with /// variables captured in a record which address is stored in \a @@ -1362,6 +1362,14 @@ public: emitOutlinedFunctionCall(CodeGenFunction &CGF, SourceLocation Loc, llvm::Value *OutlinedFn, ArrayRef Args = llvm::None) const; + + /// Emits OpenMP-specific function prolog. + /// Required for device constructs. + virtual void emitFunctionProlog(CodeGenFunction &CGF, const Decl *D) {} + + /// Gets the OpenMP
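Editor's note: the CGDecl.cpp hunks in r327513 follow one pattern: ask the OpenMP runtime for the address of a local variable, and fall back to the usual storage (NRVO return slot or entry-block alloca) when the runtime returns an invalid address. The base CGOpenMPRuntime implementation always returns Address::invalid(). The sketch below models that pattern with strings in place of addresses. Every name in it is hypothetical, and the device-side override is an assumption about how a subclass might use the hook, since the rest of the diff is cut off here.

```cpp
#include <optional>
#include <set>
#include <string>

// Hypothetical stand-in for the base runtime; not Clang's real class.
struct RuntimeSketch {
  virtual ~RuntimeSketch() = default;
  // Default hook: the runtime manages no local variables, which mirrors
  // the base implementation returning Address::invalid().
  virtual std::optional<std::string>
  addressOfLocal(const std::string & /*Var*/) const {
    return std::nullopt;
  }
};

// Assumed device-side override: claims only the variables it was told about.
struct DeviceRuntimeSketch : RuntimeSketch {
  std::set<std::string> Managed;
  std::optional<std::string>
  addressOfLocal(const std::string &Var) const override {
    if (Managed.count(Var))
      return "runtime-managed:" + Var;
    return std::nullopt;
  }
};

// Mirrors the order of checks in EmitAutoVarAlloca after the patch:
// OpenMP-managed storage first, then NRVO, then a plain alloca.
std::string storageFor(const RuntimeSketch &RT, bool OpenMPEnabled,
                       bool IsNRVO, const std::string &Var) {
  if (OpenMPEnabled)
    if (std::optional<std::string> Addr = RT.addressOfLocal(Var))
      return *Addr;
  if (IsNRVO)
    return "return-slot:" + Var;
  return "alloca:" + Var;
}
```

Because the default hook declines every variable, host compilation and non-OpenMP compilation keep their previous behaviour; only a runtime that overrides the hook changes where a variable lives.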
r345417 - [NFC][OpenMP] Add new test for parallel for code generation.
Author: gbercea Date: Fri Oct 26 11:59:52 2018 New Revision: 345417 URL: http://llvm.org/viewvc/llvm-project?rev=345417&view=rev Log: [NFC][OpenMP] Add new test for parallel for code generation. Summary: This is a simple test of the parallel for code generation. It will be used to showcase the change introduced by patch D53443. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D53772 Added: cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp Added: cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp?rev=345417&view=auto == --- cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp (added) +++ cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp Fri Oct 26 11:59:52 2018 @@ -0,0 +1,101 @@ +// Test target codegen - host bc file has to be created first. +// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-ppc-host.bc +// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --check-prefix CHECK --check-prefix CHECK-64 +// expected-no-diagnostics +#ifndef HEADER +#define HEADER + +template +tx ftemplate(int n) { + tx b[10]; + + #pragma omp target + { +tx d = n; +#pragma omp parallel for +for(int i=0; i<10; i++) { + b[i] += d; +} +b[3] += 1; + } + + return b[3]; +} + +int bar(int n){ + int a = 0; + + a += ftemplate(n); + + return a; +} + +// CHECK-LABEL: define {{.*}}void {{@__omp_offloading_.+template.+l12}}_worker() +// CHECK: call void @llvm.nvvm.barrier0() +// CHECK: call i1 @__kmpc_kernel_parallel( +// CHECK: call void @__omp_outlined___wrapper( + +// CHECK: define weak void @__omp_offloading_{{.*}}l12( +// CHECK: call void @__omp_offloading_{{.*}}l12_worker() 
+// CHECK: call void @__kmpc_kernel_init( +// CHECK: call void @__kmpc_data_sharing_init_stack() +// CHECK: call i8* @__kmpc_data_sharing_push_stack(i64 4, i16 0) +// CHECK: call void @__kmpc_kernel_prepare_parallel( +// CHECK: call void @__kmpc_begin_sharing_variables({{.*}}, i64 2) +// CHECK: call void @llvm.nvvm.barrier0() +// CHECK: call void @llvm.nvvm.barrier0() +// CHECK: call void @__kmpc_end_sharing_variables() +// CHECK: call void @__kmpc_data_sharing_pop_stack( +// CHECK: call void @__kmpc_kernel_deinit(i16 1) + +// CHECK: define internal void @__omp_outlined__( +// CHECK: alloca +// CHECK: alloca +// CHECK: alloca +// CHECK: alloca +// CHECK: [[OMP_IV:%.*]] = alloca i32 +// CHECK: store i32 0, {{.*}} [[OMP_LB:%.+]], +// CHECK: store i32 9, {{.*}} [[OMP_UB:%.+]], +// CHECK: store i32 1, {{.*}} [[OMP_ST:%.+]], +// CHECK: call void @__kmpc_for_static_init_4({{.*}} i32 34, {{.*}} [[OMP_LB]], {{.*}} [[OMP_UB]], {{.*}} [[OMP_ST]], i32 1, i32 1) +// CHECK: [[OMP_UB_1:%.+]] = load {{.*}} [[OMP_UB]] +// CHECK: [[COMP_1:%.+]] = icmp sgt {{.*}} [[OMP_UB_1]] +// CHECK: br i1 [[COMP_1]], label %[[COND_TRUE:.+]], label %[[COND_FALSE:.+]] + +// CHECK: [[COND_TRUE]] +// CHECK: br label %[[COND_END:.+]] + +// CHECK: [[COND_FALSE]] +// CHECK: [[OMP_UB_2:%.+]] = load {{.*}}* [[OMP_UB]] +// CHECK: br label %[[COND_END]] + +// CHECK: [[COND_END]] +// CHECK: [[COND_RES:%.+]] = phi i32 [ 9, %[[COND_TRUE]] ], [ [[OMP_UB_2]], %[[COND_FALSE]] ] +// CHECK: store i32 [[COND_RES]], i32* [[OMP_UB]] +// CHECK: [[OMP_LB_1:%.+]] = load i32, i32* [[OMP_LB]] +// CHECK: store i32 [[OMP_LB_1]], i32* [[OMP_IV]] +// CHECK: br label %[[OMP_INNER_FOR_COND:.+]] + +// CHECK: [[OMP_INNER_FOR_COND]] +// CHECK: [[OMP_IV_2:%.+]] = load i32, i32* [[OMP_IV]] +// CHECK: [[OMP_UB_4:%.+]] = load i32, i32* [[OMP_UB]] +// CHECK: [[COMP_3:%.+]] = icmp sle i32 [[OMP_IV_2]], [[OMP_UB_4]] +// CHECK: br i1 [[COMP_3]], label %[[OMP_INNER_FOR_BODY:.+]], label %[[OMP_INNER_FOR_END:.+]] + +// CHECK: 
[[OMP_INNER_FOR_BODY]] +// CHECK: br label %[[OMP_BODY_CONTINUE:.+]] + +// CHECK: [[OMP_BODY_CONTINUE]] +// CHECK: br label %[[OMP_INNER_FOR_INC:.+]] + +// CHECK: [[OMP_INNER_FOR_INC]] +// CHECK: [[OMP_IV_3:%.+]] = load i32, i32* [[OMP_IV]] +// CHECK: [[ADD_1:%.+]] = add nsw i32 [[OMP_IV_3]], 1 +// CHECK: store i32 [[ADD_1]], i32* [[OMP_IV]] +// CHECK: br label %[[OMP_INNER_FOR_COND]] + +// CHECK: [[OMP_INNER_FOR_END]] +// CHECK: call void @__kmpc_for_static_fini( +// CHECK: ret void + +#endif
r345507 - [OpenMP][NVPTX] Enable default scheduling for parallel for in non-SPMD cases.
Author: gbercea Date: Mon Oct 29 08:23:23 2018 New Revision: 345507 URL: http://llvm.org/viewvc/llvm-project?rev=345507&view=rev Log: [OpenMP][NVPTX] Enable default scheduling for parallel for in non-SPMD cases. Summary: This patch enables the choosing of the default schedule for parallel for loops even in non-SPMD cases. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D53443 Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=345507&r1=345506&r2=345507&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Mon Oct 29 08:23:23 2018 @@ -4238,16 +4238,17 @@ void CGOpenMPRuntimeNVPTX::getDefaultDis Chunk = CGF.EmitScalarConversion(getNVPTXNumThreads(CGF), CGF.getContext().getIntTypeForBitwidth(32, /*Signed=*/0), S.getIterationVariable()->getType(), S.getBeginLoc()); +return; } + CGOpenMPRuntime::getDefaultDistScheduleAndChunk( + CGF, S, ScheduleKind, Chunk); } void CGOpenMPRuntimeNVPTX::getDefaultScheduleAndChunk( CodeGenFunction &CGF, const OMPLoopDirective &S, OpenMPScheduleClauseKind &ScheduleKind, llvm::Value *&Chunk) const { - if (getExecutionMode() == CGOpenMPRuntimeNVPTX::EM_SPMD) { -ScheduleKind = OMPC_SCHEDULE_static; -Chunk = CGF.Builder.getIntN(CGF.getContext().getTypeSize( -S.getIterationVariable()->getType()), 1); - } + ScheduleKind = OMPC_SCHEDULE_static; + Chunk = CGF.Builder.getIntN(CGF.getContext().getTypeSize( + S.getIterationVariable()->getType()), 1); } Modified: cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp?rev=345507&r1=345506&r2=345507&view=diff == --- 
cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp (original) +++ cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp Mon Oct 29 08:23:23 2018 @@ -57,7 +57,10 @@ int bar(int n){ // CHECK: store i32 0, {{.*}} [[OMP_LB:%.+]], // CHECK: store i32 9, {{.*}} [[OMP_UB:%.+]], // CHECK: store i32 1, {{.*}} [[OMP_ST:%.+]], -// CHECK: call void @__kmpc_for_static_init_4({{.*}} i32 34, {{.*}} [[OMP_LB]], {{.*}} [[OMP_UB]], {{.*}} [[OMP_ST]], i32 1, i32 1) +// CHECK: call void @__kmpc_for_static_init_4({{.*}} i32 33, {{.*}} [[OMP_LB]], {{.*}} [[OMP_UB]], {{.*}} [[OMP_ST]], i32 1, i32 1) +// CHECK: br label %[[OMP_DISPATCH_COND:.+]] + +// CHECK: [[OMP_DISPATCH_COND]] // CHECK: [[OMP_UB_1:%.+]] = load {{.*}} [[OMP_UB]] // CHECK: [[COMP_1:%.+]] = icmp sgt {{.*}} [[OMP_UB_1]] // CHECK: br i1 [[COMP_1]], label %[[COND_TRUE:.+]], label %[[COND_FALSE:.+]] @@ -74,6 +77,12 @@ int bar(int n){ // CHECK: store i32 [[COND_RES]], i32* [[OMP_UB]] // CHECK: [[OMP_LB_1:%.+]] = load i32, i32* [[OMP_LB]] // CHECK: store i32 [[OMP_LB_1]], i32* [[OMP_IV]] +// CHECK: [[OMP_IV_1:%.+]] = load i32, i32* [[OMP_IV]] +// CHECK: [[OMP_UB_3:%.+]] = load i32, i32* [[OMP_UB]] +// CHECK: [[COMP_2:%.+]] = icmp sle i32 [[OMP_IV_1]], [[OMP_UB_3]] +// CHECK: br i1 [[COMP_2]], label %[[DISPATCH_BODY:.+]], label %[[DISPATCH_END:.+]] + +// CHECK: [[DISPATCH_BODY]] // CHECK: br label %[[OMP_INNER_FOR_COND:.+]] // CHECK: [[OMP_INNER_FOR_COND]] @@ -94,7 +103,20 @@ int bar(int n){ // CHECK: store i32 [[ADD_1]], i32* [[OMP_IV]] // CHECK: br label %[[OMP_INNER_FOR_COND]] -// CHECK: [[OMP_INNER_FOR_END]] +// CHECK: [[OMP_INNER_FOR_COND]] +// CHECK: br label %[[OMP_DISPATCH_INC:.+]] + +// CHECK: [[OMP_DISPATCH_INC]] +// CHECK: [[OMP_LB_2:%.+]] = load i32, i32* [[OMP_LB]] +// CHECK: [[OMP_ST_1:%.+]] = load i32, i32* [[OMP_ST]] +// CHECK: [[ADD_2:%.+]] = add nsw i32 [[OMP_LB_2]], [[OMP_ST_1]] +// CHECK: store i32 [[ADD_2]], i32* [[OMP_LB]] +// CHECK: [[OMP_UB_5:%.+]] = load i32, i32* [[OMP_UB]] +// CHECK: [[OMP_ST_2:%.+]] 
= load i32, i32* [[OMP_ST]] +// CHECK: [[ADD_3:%.+]] = add nsw i32 [[OMP_UB_5]], [[OMP_ST_2]] +// CHECK: store i32 [[ADD_3]], i32* [[OMP_UB]] + +// CHECK: [[DISPATCH_END]] // CHECK: call void @__kmpc_for_static_fini( // CHECK: ret void
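The updated CHECK lines in r345507 describe a dispatch loop wrapped around the inner loop, with the lower and upper bounds advanced by the stride after each chunk. The change of the schedule constant from 34 to 33 is consistent with a switch from unchunked to chunked static scheduling. The following is a minimal host-side sketch of that control flow, not the code Clang emits. The function name and parameters are hypothetical, and the stride is assumed to be the number of threads times the chunk size.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not part of Clang or the OpenMP runtime): returns the
// iterations that thread `tid` out of `nthreads` executes for a loop over
// 0..last (inclusive) under a static schedule with the given chunk size.
std::vector<int> staticChunkedIterations(int tid, int nthreads, int last,
                                         int chunk) {
  std::vector<int> seen;
  int lb = tid * chunk;                 // first chunk owned by this thread
  int ub = lb + chunk - 1;
  const int stride = nthreads * chunk;  // assumed distance between chunks
  // Dispatch loop: one trip per chunk owned by this thread.
  while (lb <= last) {
    const int clampedUb = std::min(ub, last);  // mirrors the UB clamp
    // Inner loop: the iterations of the current chunk.
    for (int iv = lb; iv <= clampedUb; ++iv)
      seen.push_back(iv);
    lb += stride;  // mirrors OMP_LB += OMP_ST in the dispatch increment
    ub += stride;  // mirrors OMP_UB += OMP_ST in the dispatch increment
  }
  return seen;
}
```

With a chunk size of 1, as chosen by the patch, each thread visits every nthreads-th iteration starting at its own thread id.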
r345509 - [OpenMP][NVPTX] Use single loops when generating code for distribute parallel for
Author: gbercea Date: Mon Oct 29 08:45:47 2018 New Revision: 345509 URL: http://llvm.org/viewvc/llvm-project?rev=345509&view=rev Log: [OpenMP][NVPTX] Use single loops when generating code for distribute parallel for Summary: This patch adds a new code generation path for bound sharing directives containing distribute parallel for. The new code generation scheme applies to chunked schedules on distribute and parallel for directives. The scheme simplifies the code that is being generated by eliminating the need for an outer for loop over chunks for both distribute and parallel for directives. In the case of distribute it applies to any sized chunk while in the parallel for case it only applies when chunk size is 1. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D53448 Modified: cfe/trunk/include/clang/AST/StmtOpenMP.h cfe/trunk/lib/AST/StmtOpenMP.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp cfe/trunk/lib/Sema/SemaOpenMP.cpp cfe/trunk/lib/Serialization/ASTReaderStmt.cpp cfe/trunk/lib/Serialization/ASTWriterStmt.cpp cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp Modified: cfe/trunk/include/clang/AST/StmtOpenMP.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/AST/StmtOpenMP.h?rev=345509&r1=345508&r2=345509&view=diff == --- cfe/trunk/include/clang/AST/StmtOpenMP.h (original) +++ cfe/trunk/include/clang/AST/StmtOpenMP.h Mon Oct 29 08:45:47 2018 @@ -392,9 +392,11 @@ class OMPLoopDirective : public OMPExecu CombinedConditionOffset = 25, CombinedNextLowerBoundOffset = 26, CombinedNextUpperBoundOffset = 27, +CombinedDistConditionOffset = 28, 
+CombinedParForInDistConditionOffset = 29, // Offset to the end (and start of the following counters/updates/finals // arrays) for combined distribute loop directives. -CombinedDistributeEnd = 28, +CombinedDistributeEnd = 30, }; /// Get the counters storage. @@ -605,6 +607,17 @@ protected: "expected loop bound sharing directive"); *std::next(child_begin(), CombinedNextUpperBoundOffset) = CombNUB; } + void setCombinedDistCond(Expr *CombDistCond) { +assert(isOpenMPLoopBoundSharingDirective(getDirectiveKind()) && + "expected loop bound distribute sharing directive"); +*std::next(child_begin(), CombinedDistConditionOffset) = CombDistCond; + } + void setCombinedParForInDistCond(Expr *CombParForInDistCond) { +assert(isOpenMPLoopBoundSharingDirective(getDirectiveKind()) && + "expected loop bound distribute sharing directive"); +*std::next(child_begin(), + CombinedParForInDistConditionOffset) = CombParForInDistCond; + } void setCounters(ArrayRef A); void setPrivateCounters(ArrayRef A); void setInits(ArrayRef A); @@ -637,6 +650,13 @@ public: /// Update of UpperBound for statically scheduled omp loops for /// outer loop in combined constructs (e.g. 'distribute parallel for') Expr *NUB; +/// Distribute Loop condition used when composing 'omp distribute' +/// with 'omp for' in a same construct when schedule is chunked. +Expr *DistCond; +/// 'omp parallel for' loop condition used when composed with +/// 'omp distribute' in the same construct and when schedule is +/// chunked and the chunk size is 1. 
+Expr *ParForInDistCond; }; /// The expressions built for the OpenMP loop CodeGen for the @@ -754,6 +774,8 @@ public: DistCombinedFields.Cond = nullptr; DistCombinedFields.NLB = nullptr; DistCombinedFields.NUB = nullptr; + DistCombinedFields.DistCond = nullptr; + DistCombinedFields.ParForInDistCond = nullptr; } }; @@ -922,6 +944,18 @@ public: return const_cast(reinterpret_cast( *std::next(child_begin(), CombinedNextUpperBoundOffset))); } + Expr *getCombinedDistCond() const { +assert(isOpenMPLoopBoundSharingDirective(getDirectiveKind()) && + "expected loop bound distribute sharing directive"); +return const_cast(reinterpret_cast( +*std::next(child_begin(), CombinedDistConditionOffset))); + } + Expr *getCombinedParForInDistCond() const { +assert(isOpenMPLoopBoundSharingDirective(getDirectiveKind()) && + "expected loop bound distribute sharing directive"); +return const_cast(reinterpret_cast( +*std::next(child_begin(), CombinedParForInDistConditionOffset))); + } const Stmt *getBody() const { // This relies on the loop form is al
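The summary of r345509 says the new scheme removes the outer loop over chunks, and that for parallel for this applies only when the chunk size is 1. A rough illustration of why the two forms visit the same iterations in that case is given below. Both functions are hypothetical host-side models, and the stride is assumed to equal the number of threads. This is not the code generation path added by the patch.

```cpp
#include <vector>

// Two-level form: an outer loop over chunks and an inner loop over the
// iterations of each chunk, specialised to chunk size 1. Hypothetical model.
std::vector<int> twoLevelLoop(int tid, int nthreads, int numIterations) {
  std::vector<int> seen;
  for (int lb = tid; lb < numIterations; lb += nthreads) {  // outer: chunks
    const int ub = lb;  // chunk size 1, so the chunk is [lb, lb]
    for (int iv = lb; iv <= ub; ++iv)                       // inner: chunk body
      seen.push_back(iv);
  }
  return seen;
}

// Single-loop form: the iteration variable itself advances by the stride and
// is compared directly against the number of iterations.
std::vector<int> singleLoop(int tid, int nthreads, int numIterations) {
  std::vector<int> seen;
  for (int iv = tid; iv < numIterations; iv += nthreads)
    seen.push_back(iv);
  return seen;
}
```

For any thread id, both functions return the same sequence, which is the property that allows the outer loop to be dropped in this case.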
r345527 - [OpenMP] Fix condition.
Author: gbercea Date: Mon Oct 29 12:44:25 2018 New Revision: 345527 URL: http://llvm.org/viewvc/llvm-project?rev=345527&view=rev Log: [OpenMP] Fix condition. Summary: Iteration variable must be strictly less than the number of iterations. This fixes a bug introduced by previous patch D53448. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D53827 Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOpenMP.cpp?rev=345527&r1=345526&r2=345527&view=diff == --- cfe/trunk/lib/Sema/SemaOpenMP.cpp (original) +++ cfe/trunk/lib/Sema/SemaOpenMP.cpp Mon Oct 29 12:44:25 2018 @@ -5299,7 +5299,8 @@ checkOpenMPLoop(OpenMPDirectiveKind DKin ExprResult CombDistCond; if (isOpenMPLoopBoundSharingDirective(DKind)) { CombDistCond = -SemaRef.BuildBinOp(CurScope, CondLoc, BO_LE, IV.get(), NumIterations.get()); +SemaRef.BuildBinOp( +CurScope, CondLoc, BO_LT, IV.get(), NumIterations.get()); } ExprResult CombCond; Modified: cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp?rev=345527&r1=345526&r2=345527&view=diff == --- cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp (original) +++ cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp Mon Oct 29 12:44:25 2018 @@ -447,7 +447,7 @@ int main() { // LAMBDA-DAG: [[OMP_IV_VAL_1:%.+]] = load {{.+}} [[OMP_IV]], // LAMBDA-DAG: [[OMP_UB_VAL_3:%.+]] = load {{.+}} // LAMBDA-DAG: [[OMP_UB_VAL_3_PLUS_ONE:%.+]] = add {{.+}} [[OMP_UB_VAL_3]], 1 - // LAMBDA: [[CMP_IV_UB:%.+]] = icmp sle {{.+}} [[OMP_IV_VAL_1]], [[OMP_UB_VAL_3_PLUS_ONE]] + // LAMBDA: [[CMP_IV_UB:%.+]] 
= icmp slt {{.+}} [[OMP_IV_VAL_1]], [[OMP_UB_VAL_3_PLUS_ONE]] // LAMBDA: br {{.+}} [[CMP_IV_UB]], label %[[DIST_INNER_LOOP_BODY:.+]], label %[[DIST_INNER_LOOP_END:.+]] // check that PrevLB and PrevUB are passed to the 'for' @@ -1210,7 +1210,7 @@ int main() { // CHECK-DAG: [[OMP_IV_VAL_1:%.+]] = load {{.+}} [[OMP_IV]], // CHECK-DAG: [[OMP_UB_VAL_3:%.+]] = load {{.+}} // CHECK-DAG: [[OMP_UB_VAL_3_PLUS_ONE:%.+]] = add {{.+}} [[OMP_UB_VAL_3]], 1 -// CHECK: [[CMP_IV_UB:%.+]] = icmp sle {{.+}} [[OMP_IV_VAL_1]], [[OMP_UB_VAL_3_PLUS_ONE]] +// CHECK: [[CMP_IV_UB:%.+]] = icmp slt {{.+}} [[OMP_IV_VAL_1]], [[OMP_UB_VAL_3_PLUS_ONE]] // CHECK: br {{.+}} [[CMP_IV_UB]], label %[[DIST_INNER_LOOP_BODY:.+]], label %[[DIST_INNER_LOOP_END:.+]] // check that PrevLB and PrevUB are passed to the 'for' @@ -1938,7 +1938,7 @@ int main() { // CHECK-DAG: [[OMP_IV_VAL_1:%.+]] = load {{.+}} [[OMP_IV]], // CHECK-DAG: [[OMP_UB_VAL_3:%.+]] = load {{.+}} // CHECK-DAG: [[OMP_UB_VAL_3_PLUS_ONE:%.+]] = add {{.+}} [[OMP_UB_VAL_3]], 1 -// CHECK: [[CMP_IV_UB:%.+]] = icmp sle {{.+}} [[OMP_IV_VAL_1]], [[OMP_UB_VAL_3_PLUS_ONE]] +// CHECK: [[CMP_IV_UB:%.+]] = icmp slt {{.+}} [[OMP_IV_VAL_1]], [[OMP_UB_VAL_3_PLUS_ONE]] // CHECK: br {{.+}} [[CMP_IV_UB]], label %[[DIST_INNER_LOOP_BODY:.+]], label %[[DIST_INNER_LOOP_END:.+]] // check that PrevLB and PrevUB are passed to the 'for' Modified: cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp?rev=345527&r1=345526&r2=345527&view=diff == --- cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp (original) +++ cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp Mon Oct 29 12:44:25 2018 @@ -446,7 +446,7 @@ int main() { // LAMBDA-DAG: [[OMP_IV_VAL_1:%.+]] = load {{.+}} [[OMP_IV]], // LAMBDA-DAG: [[OMP_UB_VAL_3:%.+]] = load {{.+}} // LAMBDA-DAG: [[OMP_UB_VAL_3_PLUS_ONE:%.+]] = add {{.+}} [[OMP_UB_VAL_3]], 1 - // LAMBDA: [[CMP_IV_UB:%.+]] 
= icmp sle {{.+}} [[OMP_IV_VAL_1]], [[OMP_UB_VAL_3_PLUS_ONE]] + // LAMBDA: [[CMP_IV_UB:%.+]] = icmp slt {{.+}} [[OMP_IV_VAL_1]], [[OMP_UB_VAL_3_PLUS_ONE]] // LAMBDA: br {{.+}} [[CMP_IV_UB]], label %[[DIST_INNER_LOOP_BODY:.+]], label %[[DIST_INNER_LOOP_END:.+]] // check that PrevLB and PrevUB are passed to the 'for' @@ -1209,7 +1209,7 @@ int main() { // CHECK-DAG: [[OMP_IV_VAL_1:%.+]] = load {{.+}} [[OMP_IV]], // CHECK-DAG: [[OMP_UB_VAL_3:%.+]] = load {{.+}} // CHECK-DAG: [[OMP_UB_VAL_3_PLUS_ONE:%.+]] = add {{.+}} [[OMP_UB
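The fix in r345527 replaces BO_LE with BO_LT because the iteration variable must stay strictly below the number of iterations. The off-by-one can be seen in a small sketch that counts which indices each comparison admits. The function names are hypothetical and the loops are plain host code, not the generated IR.

```cpp
#include <vector>

// Indices visited with the pre-fix comparison (iv <= numIterations).
std::vector<int> visitedWithLE(int numIterations, int stride) {
  std::vector<int> seen;
  for (int iv = 0; iv <= numIterations; iv += stride)
    seen.push_back(iv);
  return seen;
}

// Indices visited with the fixed comparison (iv < numIterations).
std::vector<int> visitedWithLT(int numIterations, int stride) {
  std::vector<int> seen;
  for (int iv = 0; iv < numIterations; iv += stride)
    seen.push_back(iv);
  return seen;
}
```

For four iterations and a stride of 1, the first form also visits index 4, which lies one past the last valid iteration.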
r347915 - [OpenMP] Add a new version of the SPMD deinit kernel function
Author: gbercea Date: Thu Nov 29 12:53:49 2018 New Revision: 347915 URL: http://llvm.org/viewvc/llvm-project?rev=347915&view=rev Log: [OpenMP] Add a new version of the SPMD deinit kernel function Summary: This patch adds a new version of the SPMD deinit kernel runtime function, which replaces the previous function. The new function takes as an argument a flag that signals whether the runtime is required. This enables the compiler to optimize out the parts of the deinit function that are not needed. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D54970 Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_teams_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp cfe/trunk/test/OpenMP/nvptx_teams_reduction_codegen.cpp Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=347915&r1=347914&r2=347915&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Thu Nov 29 12:53:49 2018 @@ -33,8 +33,8 @@ enum OpenMPRTLFunctionNVPTX { /// Call to void __kmpc_spmd_kernel_init(kmp_int32 thread_limit, /// int16_t RequiresOMPRuntime, int16_t RequiresDataSharing); OMPRTL_NVPTX__kmpc_spmd_kernel_init, - /// Call to void __kmpc_spmd_kernel_deinit(); - OMPRTL_NVPTX__kmpc_spmd_kernel_deinit, + /// Call to void __kmpc_spmd_kernel_deinit_v2(int16_t RequiresOMPRuntime); +
OMPRTL_NVPTX__kmpc_spmd_kernel_deinit_v2, /// Call to void __kmpc_kernel_prepare_parallel(void /// *outlined_function, int16_t /// IsOMPRuntimeInitialized); @@ -1413,8 +1413,11 @@ void CGOpenMPRuntimeNVPTX::emitSPMDEntry CGF.EmitBlock(OMPDeInitBB); // DeInitialize the OMP state in the runtime; called by all active threads. + llvm::Value *Args[] = {/*RequiresOMPRuntime=*/ + CGF.Builder.getInt16(RequiresFullRuntime ? 1 : 0)}; CGF.EmitRuntimeCall( - createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_spmd_kernel_deinit), None); + createNVPTXRuntimeFunction( + OMPRTL_NVPTX__kmpc_spmd_kernel_deinit_v2), Args); CGF.EmitBranch(EST.ExitBB); CGF.EmitBlock(EST.ExitBB); @@ -1597,11 +1600,12 @@ CGOpenMPRuntimeNVPTX::createNVPTXRuntime RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_spmd_kernel_init"); break; } - case OMPRTL_NVPTX__kmpc_spmd_kernel_deinit: { -// Build void __kmpc_spmd_kernel_deinit(); + case OMPRTL_NVPTX__kmpc_spmd_kernel_deinit_v2: { +// Build void __kmpc_spmd_kernel_deinit_v2(int16_t RequiresOMPRuntime); +llvm::Type *TypeParams[] = {CGM.Int16Ty}; auto *FnTy = -llvm::FunctionType::get(CGM.VoidTy, llvm::None, /*isVarArg*/ false); -RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_spmd_kernel_deinit"); +llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false); +RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_spmd_kernel_deinit_v2"); break; } case OMPRTL_NVPTX__kmpc_kernel_prepare_parallel: { Modified: cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp?rev=347915&r1=347914&r2=347915&view=diff == --- cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp (original) +++ cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp Thu Nov 29 12:53:49 2018 @@ -68,7 +68,7 @@ int bar(int n){ // CHECK: br label {{%?}}[[DONE:.+]] // // CHECK: [[DONE]] - // CHECK: call void @__kmpc_spmd_kernel_deinit() + // CHECK: call void @__kmpc_spmd_kernel_deinit_v2(i16 1) // CHECK: br 
label {{%?}}[[EXIT:.+]] // // CHECK: [[EXIT]] @@ -111,7 +111,7 @@ int bar(int n){ // CHECK: br label {{%?}}[[DONE:.+]] // // CHECK: [[DONE]] - // CHECK: call void @__kmpc_spmd_kernel_deinit() + // CHECK: call void @__kmpc_spmd_kernel_deinit_v2(i16 1) // CHECK: br label {{%?}}[[EXIT:.+]] // // CHECK: [[EXIT]] Modified: cfe/trunk/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp?rev=347915&r1=347914&r2=347915&view=diff == --- cfe/trunk/test/Ope
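The point of r347915 is that the deinit entry point now receives the RequiresOMPRuntime flag as an argument, so a call with a constant flag lets the compiler drop the work that is not needed. The sketch below only illustrates that shape. The state structure, the function name, and the body are all assumptions made for illustration, and the actual body of __kmpc_spmd_kernel_deinit_v2 is not shown in this message.

```cpp
#include <cstdint>

// Hypothetical stand-in for per-kernel runtime state.
struct FakeRuntimeState {
  int teardownSteps = 0;
};

// Hypothetical sketch of a deinit entry point that takes the flag as a
// parameter. When the call site passes a constant 0 and the call is inlined,
// the cleanup below becomes dead code that an optimizer can remove.
inline void spmdKernelDeinitSketch(FakeRuntimeState &state,
                                   std::int16_t requiresOMPRuntime) {
  if (!requiresOMPRuntime)
    return;               // nothing to tear down when the runtime is unused
  ++state.teardownSteps;  // placeholder for the full-runtime cleanup
}
```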
r358709 - [OpenMP] Add checks for requires and target directives.
Author: gbercea Date: Thu Apr 18 12:53:43 2019 New Revision: 358709 URL: http://llvm.org/viewvc/llvm-project?rev=358709&view=rev Log: [OpenMP] Add checks for requires and target directives. Summary: The requires directive containing target related clauses must appear before any target region in the compilation unit. Reviewers: ABataev, AlexEichenberger, caomhin Reviewed By: ABataev Subscribers: guansong, jfb, jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60875 Added: cfe/trunk/test/OpenMP/requires_target_messages.cpp Modified: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td cfe/trunk/lib/Sema/SemaOpenMP.cpp cfe/trunk/test/OpenMP/requires_messages.cpp Modified: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td?rev=358709&r1=358708&r2=358709&view=diff == --- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td (original) +++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td Thu Apr 18 12:53:43 2019 @@ -9132,6 +9132,10 @@ def err_omp_requires_clause_redeclaratio "Only one %0 clause can appear on a requires directive in a single translation unit">; def note_omp_requires_previous_clause : Note < "%0 clause previously used here">; +def err_omp_target_before_requires : Error < + "target region encountered before requires directive with '%0' clause">; +def note_omp_requires_encountered_target : Note < + "target previously encountered here">; def err_omp_invalid_scope : Error < "'#pragma omp %0' directive must appear only in file scope">; def note_omp_invalid_length_on_this_ptr_mapping : Note < Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOpenMP.cpp?rev=358709&r1=358708&r2=358709&view=diff == --- cfe/trunk/lib/Sema/SemaOpenMP.cpp (original) +++ cfe/trunk/lib/Sema/SemaOpenMP.cpp Thu Apr 18 12:53:43 2019 @@ -193,6 +193,8 @@ private: /// Expression for the predefined 
allocators. Expr *OMPPredefinedAllocators[OMPAllocateDeclAttr::OMPUserDefinedMemAlloc] = { nullptr}; + /// Vector of previously encountered target directives + SmallVector TargetLocations; public: explicit DSAStackTy(Sema &S) : SemaRef(S) {} @@ -454,6 +456,16 @@ public: return IsDuplicate; } + /// Add location of previously encountered target to internal vector + void addTargetDirLocation(SourceLocation LocStart) { +TargetLocations.push_back(LocStart); + } + + // Return previously encountered target region locations. + ArrayRef getEncounteredTargetLocs() const { +return TargetLocations; + } + /// Set default data sharing attribute to none. void setDefaultDSANone(SourceLocation Loc) { assert(!isStackEmpty()); @@ -2418,6 +2430,27 @@ Sema::ActOnOpenMPRequiresDirective(Sourc OMPRequiresDecl *Sema::CheckOMPRequiresDecl(SourceLocation Loc, ArrayRef ClauseList) { + /// For target specific clauses, the requires directive cannot be + /// specified after the handling of any of the target regions in the + /// current compilation unit. + ArrayRef TargetLocations = + DSAStack->getEncounteredTargetLocs(); + if (!TargetLocations.empty()) { +for (const OMPClause *CNew : ClauseList) { + // Check if any of the requires clauses affect target regions. 
+ if (isa(CNew) || + isa(CNew) || + isa(CNew) || + isa(CNew)) { +Diag(Loc, diag::err_omp_target_before_requires) +<< getOpenMPClauseName(CNew->getClauseKind()); +for (SourceLocation TargetLoc : TargetLocations) { + Diag(TargetLoc, diag::note_omp_requires_encountered_target); +} + } +} + } + if (!DSAStack->hasDuplicateRequiresClause(ClauseList)) return OMPRequiresDecl::Create(Context, getCurLexicalContext(), Loc, ClauseList); @@ -4167,6 +4200,16 @@ StmtResult Sema::ActOnOpenMPExecutableDi ->setIsOMPStructuredBlock(true); } + if (!CurContext->isDependentContext() && + isOpenMPTargetExecutionDirective(Kind) && + !(DSAStack->hasRequiresDeclWithClause() || +DSAStack->hasRequiresDeclWithClause() || +DSAStack->hasRequiresDeclWithClause() || +DSAStack->hasRequiresDeclWithClause())) { +// Register target to DSA Stack. +DSAStack->addTargetDirLocation(StartLoc); + } + return Res; } Modified: cfe/trunk/test/OpenMP/requires_messages.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/requires_messages.cpp?rev=358709&r1=358708&r2=358709&view=diff == --- cfe/trunk/test/OpenMP/requires_messages.cpp (original) +++ cfe/trunk/test/OpenMP/requires
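The check added in r358709 records the locations of target regions and, when a later requires directive carries a clause that affects target regions, emits one error plus a note per recorded location. A simplified model of that bookkeeping is sketched below. The class and method names are hypothetical and source locations are reduced to line numbers. The set of four clauses is taken from the requires_target_messages.cpp test, since the clause class names are not legible in the diff above.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the ordering check; this is not Clang's DSAStackTy.
class RequiresOrderChecker {
public:
  // Record a target region, identified here by its source line.
  void noteTargetRegion(int line) { targetLines_.push_back(line); }

  // Return the diagnostics a requires directive with `clause` would produce.
  std::vector<std::string> checkRequires(const std::string &clause) const {
    std::vector<std::string> diags;
    if (targetLines_.empty() || !affectsTargetRegions(clause))
      return diags;
    diags.push_back("error: target region encountered before requires "
                    "directive with '" + clause + "' clause");
    for (int line : targetLines_)
      diags.push_back("note: target previously encountered here (line " +
                      std::to_string(line) + ")");
    return diags;
  }

private:
  // Clauses that the test file expects to be diagnosed.
  static bool affectsTargetRegions(const std::string &clause) {
    return clause == "unified_address" ||
           clause == "unified_shared_memory" ||
           clause == "reverse_offload" || clause == "dynamic_allocators";
  }

  std::vector<int> targetLines_;
};
```

A clause such as atomic_default_mem_order passes through without diagnostics in this model, which matches the test file, where that directive carries no expected-error annotation.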
r358711 - [OpenMP][NFC] Fix requires target test.
Author: gbercea Date: Thu Apr 18 13:34:43 2019 New Revision: 358711 URL: http://llvm.org/viewvc/llvm-project?rev=358711&view=rev Log: [OpenMP][NFC] Fix requires target test. Summary: Fix requires target test. Reviewers: ABataev Subscribers: guansong, jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60886 Modified: cfe/trunk/test/OpenMP/requires_target_messages.cpp Modified: cfe/trunk/test/OpenMP/requires_target_messages.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/requires_target_messages.cpp?rev=358711&r1=358710&r2=358711&view=diff == --- cfe/trunk/test/OpenMP/requires_target_messages.cpp (original) +++ cfe/trunk/test/OpenMP/requires_target_messages.cpp Thu Apr 18 13:34:43 2019 @@ -2,14 +2,14 @@ void foo2() { int a; - #pragma omp target // expected-note 4 {{Target previously encountered here}} + #pragma omp target // expected-note 4 {{target previously encountered here}} { a = a + 1; } } #pragma omp requires atomic_default_mem_order(seq_cst) -#pragma omp requires unified_address //expected-error {{Target region encountered before requires directive with 'unified_address' clause}} -#pragma omp requires unified_shared_memory //expected-error {{Target region encountered before requires directive with 'unified_shared_memory' clause}} -#pragma omp requires reverse_offload //expected-error {{Target region encountered before requires directive with 'reverse_offload' clause}} -#pragma omp requires dynamic_allocators //expected-error {{Target region encountered before requires directive with 'dynamic_allocators' clause}} +#pragma omp requires unified_address //expected-error {{target region encountered before requires directive with 'unified_address' clause}} +#pragma omp requires unified_shared_memory //expected-error {{target region encountered before requires directive with 'unified_shared_memory' clause}} +#pragma omp requires reverse_offload //expected-error {{target region encountered before requires directive 
with 'reverse_offload' clause}} +#pragma omp requires dynamic_allocators //expected-error {{target region encountered before requires directive with 'dynamic_allocators' clause}}
r359910 - [CUDA][Clang][Bugfix] Add missing CUDA 9.2 case
Author: gbercea Date: Fri May 3 10:59:18 2019 New Revision: 359910 URL: http://llvm.org/viewvc/llvm-project?rev=359910&view=rev Log: [CUDA][Clang][Bugfix] Add missing CUDA 9.2 case Summary: The bug was reported on the OpenMP-dev list: .../obj-release/lib/clang/9.0.0/include/__clang_cuda_intrinsics.h:173:35: error: '__nvvm_shfl_sync_idx_i32' needs target feature ptx60|ptx61|ptx63|ptx64 __MAKE_SYNC_SHUFFLES(__shfl_sync, __nvvm_shfl_sync_idx_i32, This problem occurs when trying to compile a .cu file that requires a newer ptx version (ptx60 or newer in this case) than ptx42. Reviewers: tra, ABataev, caomhin Reviewed By: tra Subscribers: jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61474 Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=359910&r1=359909&r2=359910&view=diff == --- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original) +++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Fri May 3 10:59:18 2019 @@ -656,6 +656,9 @@ void CudaToolChain::addClangTargetOption case CudaVersion::CUDA_100: PtxFeature = "+ptx63"; break; +case CudaVersion::CUDA_92: + PtxFeature = "+ptx61"; + break; case CudaVersion::CUDA_91: PtxFeature = "+ptx61"; break;
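The effect of r359910 is that CUDA 9.2 no longer falls through to the default PTX feature. A reduced sketch of the mapping is shown below. The enum is a hypothetical subset, CUDA_80 stands in for any version without its own case, and the "+ptx42" fallback is an assumption based on the log message rather than on code visible in the diff.

```cpp
#include <string>

// Hypothetical subset of toolkit versions; not Clang's CudaVersion enum.
enum class CudaVersionSketch { CUDA_80, CUDA_91, CUDA_92, CUDA_100 };

// Map a toolkit version to the PTX target feature string.
std::string ptxFeatureFor(CudaVersionSketch version) {
  switch (version) {
  case CudaVersionSketch::CUDA_100:
    return "+ptx63";
  case CudaVersionSketch::CUDA_92:  // the case added by the patch
    return "+ptx61";
  case CudaVersionSketch::CUDA_91:
    return "+ptx61";
  default:
    return "+ptx42";  // assumed fallback, as suggested by the log message
  }
}
```

Without the CUDA_92 case, a 9.2 toolkit would get the fallback feature, which is too old for builtins such as __nvvm_shfl_sync_idx_i32 that need ptx60 or newer.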
r360063 - [OpenMP][Clang] Support for target math functions
Author: gbercea Date: Mon May 6 11:19:15 2019 New Revision: 360063 URL: http://llvm.org/viewvc/llvm-project?rev=360063&view=rev Log: [OpenMP][Clang] Support for target math functions Summary: In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang. We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism. Authors: @gtbercea @jdoerfert Reviewers: hfinkel, caomhin, ABataev, tra Reviewed By: hfinkel, ABataev, tra Subscribers: mgorny, guansong, cfe-commits, jdoerfert Tags: #clang Differential Revision: https://reviews.llvm.org/D61399 Added: cfe/trunk/lib/Headers/openmp_wrappers/ cfe/trunk/lib/Headers/openmp_wrappers/__clang_openmp_math.h cfe/trunk/lib/Headers/openmp_wrappers/cmath cfe/trunk/lib/Headers/openmp_wrappers/math.h cfe/trunk/test/Headers/Inputs/include/cmath cfe/trunk/test/Headers/Inputs/include/limits cfe/trunk/test/Headers/nvptx_device_cmath_functions.c cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp cfe/trunk/test/Headers/nvptx_device_math_functions.c cfe/trunk/test/Headers/nvptx_device_math_functions.cpp Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp cfe/trunk/lib/Headers/CMakeLists.txt cfe/trunk/lib/Headers/__clang_cuda_cmath.h cfe/trunk/lib/Headers/__clang_cuda_device_functions.h cfe/trunk/lib/Headers/__clang_cuda_libdevice_declares.h cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h cfe/trunk/test/Driver/openmp-offload-gpu.c cfe/trunk/test/Headers/Inputs/include/math.h Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=360063&r1=360062&r2=360063&view=diff == --- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original) +++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Mon May 6 11:19:15 2019 @@ -1151,6 +1151,21 @@ void Clang::AddPreprocessingOptions(Comp if 
(JA.isOffloading(Action::OFK_Cuda)) getToolChain().AddCudaIncludeArgs(Args, CmdArgs); + // If we are offloading to a target via OpenMP we need to include the + // openmp_wrappers folder which contains alternative system headers. + if (JA.isDeviceOffloading(Action::OFK_OpenMP) && + getToolChain().getTriple().isNVPTX()){ +if (!Args.hasArg(options::OPT_nobuiltininc)) { + // Add openmp_wrappers/* to our system include path. This lets us wrap + // standard library headers. + SmallString<128> P(D.ResourceDir); + llvm::sys::path::append(P, "include"); + llvm::sys::path::append(P, "openmp_wrappers"); + CmdArgs.push_back("-internal-isystem"); + CmdArgs.push_back(Args.MakeArgString(P)); +} + } + // Add -i* options, and automatically translate to // -include-pch/-include-pth for transparent PCH support. It's // wonky, but we include looking for .gch so we can support seamless Modified: cfe/trunk/lib/Headers/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/CMakeLists.txt?rev=360063&r1=360062&r2=360063&view=diff == --- cfe/trunk/lib/Headers/CMakeLists.txt (original) +++ cfe/trunk/lib/Headers/CMakeLists.txt Mon May 6 11:19:15 2019 @@ -33,6 +33,9 @@ set(files avxintrin.h bmi2intrin.h bmiintrin.h + openmp_wrappers/math.h + openmp_wrappers/cmath + openmp_wrappers/__clang_openmp_math.h __clang_cuda_builtin_vars.h __clang_cuda_cmath.h __clang_cuda_complex_builtins.h Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_cmath.h?rev=360063&r1=360062&r2=360063&view=diff == --- cfe/trunk/lib/Headers/__clang_cuda_cmath.h (original) +++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h Mon May 6 11:19:15 2019 @@ -30,7 +30,11 @@ // implementation. Declaring in the global namespace and pulling into namespace // std covers all of the known knowns. 
+#ifdef _OPENMP +#define __DEVICE__ static __attribute__((always_inline)) +#else #define __DEVICE__ static __device__ __inline__ __attribute__((always_inline)) +#endif __DEVICE__ long long abs(long long __n) { return ::llabs(__n); } __DEVICE__ long abs(long __n) { return ::labs(__n); } @@ -47,6 +51,8 @@ __DEVICE__ float exp(float __x) { return __DEVICE__ float fabs(float __x) { return ::fabsf(__x); } __DEVICE__ float floor(float __x) { return ::floorf(__x); } __DEVICE__ float fmod(float __x, float __y) { return ::fmodf(__x, __y); } +// TODO: remove when variant is supported +#ifndef _OPENMP __DEVICE__ int fpclassify(float __x) { return __builtin_fpclassify(FP_NAN, FP_INFINITE, FP_NORMAL, FP_SUBNORMAL,
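The driver change in r360063 adds the openmp_wrappers directory under the resource directory as an internal system include path when offloading to NVPTX via OpenMP, unless builtin includes are disabled. A simplified model of that decision is sketched below. The function and parameter names are hypothetical, and plain string concatenation with '/' stands in for the path-joining utilities the driver uses.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the driver logic; not Clang's AddPreprocessingOptions.
std::vector<std::string> openmpWrapperIncludeArgs(const std::string &resourceDir,
                                                  bool isOpenMPDeviceNVPTX,
                                                  bool noBuiltinInc) {
  std::vector<std::string> args;
  // Only OpenMP device compilations for NVPTX get the wrapper headers, and
  // only when builtin include directories have not been disabled.
  if (!isOpenMPDeviceNVPTX || noBuiltinInc)
    return args;
  args.push_back("-internal-isystem");
  args.push_back(resourceDir + "/include/openmp_wrappers");
  return args;
}
```

Because the wrapper directory is searched as a system include path, an #include of math.h or cmath in device code can resolve to the wrapper header, which is how the patch intercepts those headers.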
r360265 - [OpenMP][Clang] Support for target math functions
Author: gbercea Date: Wed May 8 08:52:33 2019 New Revision: 360265 URL: http://llvm.org/viewvc/llvm-project?rev=360265&view=rev Log: [OpenMP][Clang] Support for target math functions Summary: In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang. We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism. Authors: @gtbercea @jdoerfert Reviewers: hfinkel, caomhin, ABataev, tra Reviewed By: hfinkel, ABataev, tra Subscribers: JDevlieghere, mgorny, guansong, cfe-commits, jdoerfert Tags: #clang Differential Revision: https://reviews.llvm.org/D61399 Added: cfe/trunk/lib/Headers/openmp_wrappers/ cfe/trunk/lib/Headers/openmp_wrappers/__clang_openmp_math.h cfe/trunk/lib/Headers/openmp_wrappers/cmath cfe/trunk/lib/Headers/openmp_wrappers/math.h cfe/trunk/test/Headers/Inputs/include/cmath cfe/trunk/test/Headers/Inputs/include/limits cfe/trunk/test/Headers/nvptx_device_cmath_functions.c cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp cfe/trunk/test/Headers/nvptx_device_math_functions.c cfe/trunk/test/Headers/nvptx_device_math_functions.cpp Modified: cfe/trunk/lib/Driver/ToolChain.cpp cfe/trunk/lib/Driver/ToolChains/Clang.cpp cfe/trunk/lib/Headers/CMakeLists.txt cfe/trunk/lib/Headers/__clang_cuda_cmath.h cfe/trunk/lib/Headers/__clang_cuda_device_functions.h cfe/trunk/lib/Headers/__clang_cuda_libdevice_declares.h cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h cfe/trunk/test/Driver/openmp-offload-gpu.c cfe/trunk/test/Headers/Inputs/include/math.h Modified: cfe/trunk/lib/Driver/ToolChain.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChain.cpp?rev=360265&r1=360264&r2=360265&view=diff == --- cfe/trunk/lib/Driver/ToolChain.cpp (original) +++ cfe/trunk/lib/Driver/ToolChain.cpp Wed May 8 08:52:33 2019 @@ -425,7 +425,7 @@ bool ToolChain::needsProfileRT(const 
Arg Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation)) + Args.hasArg(options::OPT_forder_file_instrumentation)) return true; return false; Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=360265&r1=360264&r2=360265&view=diff == --- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original) +++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Wed May 8 08:52:33 2019 @@ -1151,6 +1151,24 @@ void Clang::AddPreprocessingOptions(Comp if (JA.isOffloading(Action::OFK_Cuda)) getToolChain().AddCudaIncludeArgs(Args, CmdArgs); + // If we are offloading to a target via OpenMP we need to include the + // openmp_wrappers folder which contains alternative system headers. + if (JA.isDeviceOffloading(Action::OFK_OpenMP) && + getToolChain().getTriple().isNVPTX()){ +if (!Args.hasArg(options::OPT_nobuiltininc)) { + // Add openmp_wrappers/* to our system include path. This lets us wrap + // standard library headers. + SmallString<128> P(D.ResourceDir); + llvm::sys::path::append(P, "include"); + llvm::sys::path::append(P, "openmp_wrappers"); + CmdArgs.push_back("-internal-isystem"); + CmdArgs.push_back(Args.MakeArgString(P)); +} + +CmdArgs.push_back("-include"); +CmdArgs.push_back("__clang_openmp_math.h"); + } + // Add -i* options, and automatically translate to // -include-pch/-include-pth for transparent PCH support. 
It's // wonky, but we include looking for .gch so we can support seamless Modified: cfe/trunk/lib/Headers/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/CMakeLists.txt?rev=360265&r1=360264&r2=360265&view=diff == --- cfe/trunk/lib/Headers/CMakeLists.txt (original) +++ cfe/trunk/lib/Headers/CMakeLists.txt Wed May 8 08:52:33 2019 @@ -128,6 +128,12 @@ set(ppc_wrapper_files ppc_wrappers/mmintrin.h ) +set(openmp_wrapper_files + openmp_wrappers/math.h + openmp_wrappers/cmath + openmp_wrappers/__clang_openmp_math.h +) + set(output_dir ${LLVM_LIBRARY_OUTPUT_INTDIR}/clang/${CLANG_VERSION}/include) set(out_files) set(generated_files) @@ -156,7 +162,7 @@ endfunction(clang_generate_header) # Copy header files from the source directory to the build directory -foreach( f ${files} ${cuda_wrapper_files} ${ppc_wrapper_files} ) +foreach( f ${files} ${cuda_wrapper_files} ${ppc_wrapper_files} ${openmp_wrapper_files}) copy_header_to_output_dir(${CMAKE_CUR
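The driver change in r360265 can be summarised with a small standalone sketch. This is not Clang's driver API: `openmpWrapperArgs` and its boolean parameters are hypothetical stand-ins for `JA.isDeviceOffloading(Action::OFK_OpenMP)`, `getTriple().isNVPTX()` and `-nobuiltininc`. Only the flag names and the `include/openmp_wrappers` path come from the diff.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the logic added to Clang::AddPreprocessingOptions.
// Returns the extra cc1 arguments for an OpenMP device compile on NVPTX.
std::vector<std::string> openmpWrapperArgs(const std::string &resourceDir,
                                           bool isOpenMPDeviceOffload,
                                           bool isNVPTX,
                                           bool noBuiltinInc) {
  std::vector<std::string> args;
  if (!(isOpenMPDeviceOffload && isNVPTX))
    return args; // host compiles and other targets are left untouched

  if (!noBuiltinInc) {
    // Put the wrapper directory on the system include path so that
    // <math.h> and <cmath> resolve to the wrapper headers first.
    args.push_back("-internal-isystem");
    args.push_back(resourceDir + "/include/openmp_wrappers");
  }

  // The forced include is added even when -nobuiltininc is given,
  // matching the placement of the push_back calls in the diff.
  args.push_back("-include");
  args.push_back("__clang_openmp_math.h");
  return args;
}
```

Note that the sketch joins path components with a plain `/`, whereas the patch uses `llvm::sys::path::append`.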
r343253 - [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing
Author: gbercea
Date: Thu Sep 27 12:22:56 2018
New Revision: 343253

URL: http://llvm.org/viewvc/llvm-project?rev=343253&view=rev
Log:
[OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing

Summary:
For the OpenMP NVPTX toolchain choose a default distribute schedule that ensures coalescing on the GPU when in SPMD mode. This significantly increases the performance of offloaded target code and reduces the number of registers used on the GPU side.

Reviewers: ABataev, caomhin, Hahnfeld

Reviewed By: ABataev, Hahnfeld

Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D52434

Modified:
    cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
    cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
    cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
    cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
    cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
    cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h?rev=343253&r1=343252&r2=343253&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h Thu Sep 27 12:22:56 2018
@@ -1490,6 +1490,12 @@ public:
                                       const VarDecl *NativeParam,
                                       const VarDecl *TargetParam) const;
 
+  /// Choose default schedule type and chunk value for the
+  /// dist_schedule clause.
+  virtual void getDefaultDistScheduleAndChunk(CodeGenFunction &CGF,
+      const OMPLoopDirective &S, OpenMPDistScheduleClauseKind &ScheduleKind,
+      llvm::Value *&Chunk) const {}
+
   /// Emits call of the outlined function with the provided arguments,
   /// translating these arguments to correct target-specific arguments.
   virtual void

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=343253&r1=343252&r2=343253&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Thu Sep 27 12:22:56 2018
@@ -4081,3 +4081,15 @@ void CGOpenMPRuntimeNVPTX::functionFinis
   FunctionGlobalizedDecls.erase(CGF.CurFn);
   CGOpenMPRuntime::functionFinished(CGF);
 }
+
+void CGOpenMPRuntimeNVPTX::getDefaultDistScheduleAndChunk(
+    CodeGenFunction &CGF, const OMPLoopDirective &S,
+    OpenMPDistScheduleClauseKind &ScheduleKind,
+    llvm::Value *&Chunk) const {
+  if (getExecutionMode() == CGOpenMPRuntimeNVPTX::EM_SPMD) {
+    ScheduleKind = OMPC_DIST_SCHEDULE_static;
+    Chunk = CGF.EmitScalarConversion(getNVPTXNumThreads(CGF),
+        CGF.getContext().getIntTypeForBitwidth(32, /*Signed=*/0),
+        S.getIterationVariable()->getType(), S.getBeginLoc());
+  }
+}

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h?rev=343253&r1=343252&r2=343253&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h Thu Sep 27 12:22:56 2018
@@ -340,6 +340,11 @@ public:
   ///
   void functionFinished(CodeGenFunction &CGF) override;
 
+  /// Choose a default value for the schedule clause.
+  void getDefaultDistScheduleAndChunk(CodeGenFunction &CGF,
+      const OMPLoopDirective &S, OpenMPDistScheduleClauseKind &ScheduleKind,
+      llvm::Value *&Chunk) const override;
+
 private:
   /// Track the execution mode when codegening directives within a target
   /// region. The appropriate mode (SPMD/NON-SPMD) is set on entry to the

Modified: cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp?rev=343253&r1=343252&r2=343253&view=diff
==
--- cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp Thu Sep 27 12:22:56 2018
@@ -3325,6 +3325,10 @@ void CodeGenFunction::EmitOMPDistributeL
                                      S.getIterationVariable()->getType(),
                                      S.getBeginLoc());
       }
+    } else {
+      // Default behaviour for dist_schedule clause.
+      CGM.getOpenMPRuntime().getDefaultDistScheduleAndChunk(
+          *this, S, ScheduleKind, Chunk);
     }
     const unsigned IVSize = getContext().getTypeSize(IVExpr->getType());
     const bool IVSigned = IVExpr->getType()->hasSignedIntegerRepresentation();

Modified: cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_para
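To illustrate what the r343253 default means in practice, the following sketch models `dist_schedule(static, chunk)` with the chunk set to the number of threads per team, as the patch does through `getNVPTXNumThreads`. The function `owningTeam` is a hypothetical model of round-robin chunk distribution, not code from Clang or the OpenMP runtime.

```cpp
#include <cassert>
#include <cstdint>

// Team that executes a given logical iteration when chunks of `chunk`
// consecutive iterations are dealt to teams in round-robin order.
std::uint64_t owningTeam(std::uint64_t iteration, std::uint64_t chunk,
                         std::uint64_t numTeams) {
  assert(chunk > 0 && numTeams > 0);
  return (iteration / chunk) % numTeams;
}
```

With a chunk equal to the team's thread count, each team receives exactly one iteration per thread per round, so the iterations a team works on at any moment are contiguous.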
r343260 - [OpenMP] Make default parallel for schedule in NVPTX target regions in SPMD mode achieve coalescing
Author: gbercea
Date: Thu Sep 27 13:29:00 2018
New Revision: 343260

URL: http://llvm.org/viewvc/llvm-project?rev=343260&view=rev
Log:
[OpenMP] Make default parallel for schedule in NVPTX target regions in SPMD mode achieve coalescing

Summary:
Set default schedule for parallel for loops to schedule(static, 1) when using SPMD mode on the NVPTX device offloading toolchain to ensure coalescing.

Reviewers: ABataev, Hahnfeld, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D52629

Modified:
    cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
    cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
    cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
    cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
    cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
    cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h?rev=343260&r1=343259&r2=343260&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h Thu Sep 27 13:29:00 2018
@@ -1496,6 +1496,12 @@ public:
       const OMPLoopDirective &S, OpenMPDistScheduleClauseKind &ScheduleKind,
       llvm::Value *&Chunk) const {}
 
+  /// Choose default schedule type and chunk value for the
+  /// schedule clause.
+  virtual void getDefaultScheduleAndChunk(CodeGenFunction &CGF,
+      const OMPLoopDirective &S, OpenMPScheduleClauseKind &ScheduleKind,
+      llvm::Value *&Chunk) const {}
+
   /// Emits call of the outlined function with the provided arguments,
   /// translating these arguments to correct target-specific arguments.
   virtual void

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=343260&r1=343259&r2=343260&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Thu Sep 27 13:29:00 2018
@@ -4093,3 +4093,14 @@ void CGOpenMPRuntimeNVPTX::getDefaultDis
         S.getIterationVariable()->getType(), S.getBeginLoc());
   }
 }
+
+void CGOpenMPRuntimeNVPTX::getDefaultScheduleAndChunk(
+    CodeGenFunction &CGF, const OMPLoopDirective &S,
+    OpenMPScheduleClauseKind &ScheduleKind,
+    llvm::Value *&Chunk) const {
+  if (getExecutionMode() == CGOpenMPRuntimeNVPTX::EM_SPMD) {
+    ScheduleKind = OMPC_SCHEDULE_static;
+    Chunk = CGF.Builder.getIntN(CGF.getContext().getTypeSize(
+        S.getIterationVariable()->getType()), 1);
+  }
+}

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h?rev=343260&r1=343259&r2=343260&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h Thu Sep 27 13:29:00 2018
@@ -340,11 +340,16 @@ public:
   ///
   void functionFinished(CodeGenFunction &CGF) override;
 
-  /// Choose a default value for the schedule clause.
+  /// Choose a default value for the dist_schedule clause.
   void getDefaultDistScheduleAndChunk(CodeGenFunction &CGF,
       const OMPLoopDirective &S, OpenMPDistScheduleClauseKind &ScheduleKind,
       llvm::Value *&Chunk) const override;
 
+  /// Choose a default value for the schedule clause.
+  void getDefaultScheduleAndChunk(CodeGenFunction &CGF,
+      const OMPLoopDirective &S, OpenMPScheduleClauseKind &ScheduleKind,
+      llvm::Value *&Chunk) const override;
+
 private:
   /// Track the execution mode when codegening directives within a target
   /// region. The appropriate mode (SPMD/NON-SPMD) is set on entry to the

Modified: cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp?rev=343260&r1=343259&r2=343260&view=diff
==
--- cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp Thu Sep 27 13:29:00 2018
@@ -2310,6 +2310,10 @@ bool CodeGenFunction::EmitOMPWorksharing
                                        S.getIterationVariable()->getType(),
                                        S.getBeginLoc());
         }
+      } else {
+        // Default behaviour for schedule clause.
+        CGM.getOpenMPRuntime().getDefaultScheduleAndChunk(
+            *this, S, ScheduleKind.Schedule, Chunk);
       }
       const unsigned IVSize = getContext().getTypeSize(IVExpr->getType());
       const bool IVSigned = IVExpr->getType()->hasSignedIntegerRepresentation();

Modified: cfe/trunk/test/OpenMP/nvptx_targe
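Why `schedule(static, 1)` helps coalescing can be seen from a small model of static chunked scheduling. The helper `iterationsOfThread` below is hypothetical and only models the iteration-to-thread assignment; it is not runtime code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Iterations executed by thread `tid` under schedule(static, chunk):
// chunks are handed to threads round-robin, in thread-number order.
std::vector<unsigned> iterationsOfThread(unsigned tid, unsigned numThreads,
                                         unsigned chunk, unsigned tripCount) {
  assert(numThreads > 0 && chunk > 0 && tid < numThreads);
  std::vector<unsigned> out;
  for (unsigned start = tid * chunk; start < tripCount;
       start += numThreads * chunk) {
    const unsigned end = std::min(start + chunk, tripCount);
    for (unsigned i = start; i < end; ++i)
      out.push_back(i);
  }
  return out;
}
```

With a chunk of 1, threads 0, 1, 2, ... handle iterations 0, 1, 2, ... in the same step, so a loop body indexing `a[i]` has neighbouring threads touching neighbouring elements. With a larger chunk, each thread walks its own contiguous block and neighbouring threads access addresses `chunk` elements apart.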
r363435 - [OpenMP] Avoid emitting maps for target link variables when unified memory is used
Author: gbercea Date: Fri Jun 14 10:58:26 2019 New Revision: 363435 URL: http://llvm.org/viewvc/llvm-project?rev=363435&view=rev Log: [OpenMP] Avoid emitting maps for target link variables when unified memory is used Summary: This patch avoids the emission of maps for target link variables when unified memory is present. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60883 Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h cfe/trunk/lib/Sema/SemaOpenMP.cpp cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=363435&r1=363434&r2=363435&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Fri Jun 14 10:58:26 2019 @@ -8266,7 +8266,8 @@ public: continue; llvm::Optional Res = OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD); -if (!Res || *Res != OMPDeclareTargetDeclAttr::MT_Link) +if (CGF.CGM.getOpenMPRuntime().hasRequiresUnifiedSharedMemory() || +!Res || *Res != OMPDeclareTargetDeclAttr::MT_Link) continue; StructRangeInfoTy PartialStruct; generateInfoForComponentList( @@ -9251,6 +9252,10 @@ bool CGOpenMPRuntime::hasAllocateAttribu return false; } +bool CGOpenMPRuntime::hasRequiresUnifiedSharedMemory() const { + return HasRequiresUnifiedSharedMemory; +} + CGOpenMPRuntime::DisableAutoDeclareTargetRAII::DisableAutoDeclareTargetRAII( CodeGenModule &CGM) : CGM(CGM) { Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h?rev=363435&r1=363434&r2=363435&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h Fri Jun 14 10:58:26 2019 @@ -1623,6 +1623,9 @@ public: /// the 
predefined allocator and translates it into the corresponding address /// space. virtual bool hasAllocateAttributeForGlobalVar(const VarDecl *VD, LangAS &AS); + + /// Return whether the unified_shared_memory has been specified. + bool hasRequiresUnifiedSharedMemory() const; }; /// Class supports emissionof SIMD-only code. Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOpenMP.cpp?rev=363435&r1=363434&r2=363435&view=diff == --- cfe/trunk/lib/Sema/SemaOpenMP.cpp (original) +++ cfe/trunk/lib/Sema/SemaOpenMP.cpp Fri Jun 14 10:58:26 2019 @@ -2667,7 +2667,8 @@ public: llvm::Optional Res = OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD); if (VD->hasGlobalStorage() && CS && !CS->capturesVariable(VD) && - (!Res || *Res != OMPDeclareTargetDeclAttr::MT_Link)) + (Stack->hasRequiresDeclWithClause() || + !Res || *Res != OMPDeclareTargetDeclAttr::MT_Link)) return; SourceLocation ELoc = E->getExprLoc(); Modified: cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp?rev=363435&r1=363434&r2=363435&view=diff == --- cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp (original) +++ cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp Fri Jun 14 10:58:26 2019 @@ -26,42 +26,35 @@ int bar(int n){ // CHECK: [[VAR:@.+]] = global double 1.00e+01 // CHECK: [[VAR_DECL_TGT_LINK_PTR:@.+]] = global double* [[VAR]] -// CHECK: [[OFFLOAD_SIZES:@.+]] = private unnamed_addr constant [3 x i64] [i64 4, i64 8, i64 8] -// CHECK: [[OFFLOAD_MAPTYPES:@.+]] = private unnamed_addr constant [3 x i64] [i64 800, i64 800, i64 531] +// CHECK: [[OFFLOAD_SIZES:@.+]] = private unnamed_addr constant [2 x i64] [i64 4, i64 8] +// CHECK: [[OFFLOAD_MAPTYPES:@.+]] = private unnamed_addr constant [2 x i64] [i64 800, i64 800] // CHECK: [[N_CASTED:%.+]] = alloca i64 // CHECK: [[SUM_CASTED:%.+]] = 
alloca i64 -// CHECK: [[OFFLOAD_BASEPTRS:%.+]] = alloca [3 x i8*] -// CHECK: [[OFFLOAD_PTRS:%.+]] = alloca [3 x i8*] +// CHECK: [[OFFLOAD_BASEPTRS:%.+]] = alloca [2 x i8*] +// CHECK: [[OFFLOAD_PTRS:%.+]] = alloca [2 x i8*] // CHECK: [[LOAD1:%.+]] = load i64, i64* [[N_CASTED]] // CHECK: [[LOAD2:%.+]] = load i64, i64* [[SUM_CASTED]] -// CHECK: [[BPTR1:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_BASEPTRS]], i32 0, i32 0 +// CHEC
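The changed condition in `CGOpenMPRuntime.cpp` reduces to a small predicate, sketched here in isolation. The enum `DeclTarget` and the function `emitsLinkMap` are hypothetical names; `None` stands for the case where `isDeclareTargetDeclaration` returns no value.

```cpp
// Hypothetical model of the declare target state of a variable.
enum class DeclTarget { None, To, Link };

// True when an implicit map entry is generated for the variable.
// After r363435 the entry is skipped whenever unified shared memory
// is required, even for variables under the link clause.
bool emitsLinkMap(DeclTarget kind, bool requiresUnifiedSharedMemory) {
  if (requiresUnifiedSharedMemory || kind != DeclTarget::Link)
    return false; // corresponds to the `continue` in the diff
  return true;
}
```

This is consistent with the updated test expectations above, where the offload arrays shrink from three entries to two.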
r363451 - [OpenMP] Add target task alloc function with device ID
Author: gbercea Date: Fri Jun 14 13:19:54 2019 New Revision: 363451 URL: http://llvm.org/viewvc/llvm-project?rev=363451&view=rev Log: [OpenMP] Add target task alloc function with device ID Summary: Add a new call to Clang to perform task allocation for the target. Reviewers: ABataev, AlexEichenberger, caomhin Reviewed By: ABataev, AlexEichenberger Subscribers: openmp-commits, Hahnfeld, guansong, jdoerfert, cfe-commits Tags: #clang, #openmp Differential Revision: https://reviews.llvm.org/D63009 Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp cfe/trunk/test/OpenMP/target_depend_codegen.cpp cfe/trunk/test/OpenMP/target_enter_data_depend_codegen.cpp cfe/trunk/test/OpenMP/target_exit_data_depend_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_depend_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_for_depend_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_for_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_update_depend_codegen.cpp Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=363451&r1=363450&r2=363451&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Fri Jun 14 13:19:54 2019 @@ -475,6 +475,12 @@ enum OpenMPOffloadingRequiresDirFlags : OMP_REQ_DYNAMIC_ALLOCATORS = 0x010, LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/OMP_REQ_DYNAMIC_ALLOCATORS) }; + +enum OpenMPOffloadingReservedDeviceIDs { + /// Device ID if the device was not defined, runtime should get it + /// from environment variables in the spec. 
+ OMP_DEVICEID_UNDEF = -1, +}; } // anonymous namespace /// Describes ident structure that describes a source location. @@ -604,6 +610,11 @@ enum OpenMPRTLFunction { // kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, // kmp_routine_entry_t *task_entry); OMPRTL__kmpc_omp_task_alloc, + // Call to kmp_task_t * __kmpc_omp_target_task_alloc(ident_t *, + // kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, + // size_t sizeof_shareds, kmp_routine_entry_t *task_entry, + // kmp_int64 device_id); + OMPRTL__kmpc_omp_target_task_alloc, // Call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t * // new_task); OMPRTL__kmpc_omp_task, @@ -1912,6 +1923,21 @@ llvm::FunctionCallee CGOpenMPRuntime::cr RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_omp_task_alloc"); break; } + case OMPRTL__kmpc_omp_target_task_alloc: { +// Build kmp_task_t *__kmpc_omp_target_task_alloc(ident_t *, kmp_int32 gtid, +// kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, +// kmp_routine_entry_t *task_entry, kmp_int64 device_id); +assert(KmpRoutineEntryPtrTy != nullptr && + "Type kmp_routine_entry_t must be created."); +llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, CGM.Int32Ty, +CGM.SizeTy, CGM.SizeTy, KmpRoutineEntryPtrTy, +CGM.Int64Ty}; +// Return void * and then cast to particular kmp_task_t type. +auto *FnTy = +llvm::FunctionType::get(CGM.VoidPtrTy, TypeParams, /*isVarArg=*/false); +RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_omp_target_task_alloc"); +break; + } case OMPRTL__kmpc_omp_task: { // Build kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t // *new_task); @@ -5074,13 +5100,30 @@ CGOpenMPRuntime::emitTaskInit(CodeGenFun : CGF.Builder.getInt32(Data.Final.getInt() ? 
FinalFlag : 0); TaskFlags = CGF.Builder.CreateOr(TaskFlags, CGF.Builder.getInt32(Flags)); llvm::Value *SharedsSize = CGM.getSize(C.getTypeSizeInChars(SharedsTy)); - llvm::Value *AllocArgs[] = {emitUpdateLocation(CGF, Loc), - getThreadID(CGF, Loc), TaskFlags, - KmpTaskTWithPrivatesTySize, SharedsSize, - CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( - TaskEntry, KmpRoutineEntryPtrTy)}; - llvm::Value *NewTask = CGF.EmitRuntimeCall( + SmallVector AllocArgs = {emitUpdateLocation(CGF, Loc), + getThreadID(CGF, Loc), TaskFlags, KmpTaskTWithPrivatesTySize, + SharedsSize, CGF.Builder.CreatePointerBitCastOrAddrSpaceCast( + TaskEntry, KmpRoutineEntryPtrTy)}; + llvm::Value *NewTask; + if (D.hasClausesOfKind()) { +// Check if we h
r363809 - [OpenMP] Strengthen regression tests for task allocation under nowait depend clauses NFC
Author: gbercea Date: Wed Jun 19 07:26:43 2019 New Revision: 363809 URL: http://llvm.org/viewvc/llvm-project?rev=363809&view=rev Log: [OpenMP] Strengthen regression tests for task allocation under nowait depend clauses NFC Summary: This patch strengthens the tests introduced in D63009 by: - adding new test for default device ID. - modifying existing tests to pass device ID local variable to the task allocation function. Reviewers: ABataev, Hahnfeld, caomhin, jdoerfert Reviewed By: ABataev Subscribers: guansong, jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D63454 Added: cfe/trunk/test/OpenMP/target_constant_device_codegen.cpp Modified: cfe/trunk/test/OpenMP/target_depend_codegen.cpp cfe/trunk/test/OpenMP/target_enter_data_depend_codegen.cpp cfe/trunk/test/OpenMP/target_exit_data_depend_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_depend_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_for_depend_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_for_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_update_depend_codegen.cpp Added: cfe/trunk/test/OpenMP/target_constant_device_codegen.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/target_constant_device_codegen.cpp?rev=363809&view=auto == --- cfe/trunk/test/OpenMP/target_constant_device_codegen.cpp (added) +++ cfe/trunk/test/OpenMP/target_constant_device_codegen.cpp Wed Jun 19 07:26:43 2019 @@ -0,0 +1,34 @@ +// Test host codegen. 
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm %s -o - | FileCheck %s --check-prefix CHECK --check-prefix CHECK-64 +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-pch -o %t %s +// RUN: %clang_cc1 -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -std=c++11 -include-pch %t -verify %s -emit-llvm -o - | FileCheck %s --check-prefix CHECK --check-prefix CHECK-64 + +// expected-no-diagnostics +#ifndef HEADER +#define HEADER + +int global; +extern int global; + +// CHECK: define {{.*}}[[FOO:@.+]]( +int foo(int n) { + int a = 0; + float b[10]; + double cn[5][n]; + + #pragma omp target nowait depend(in: global) depend(out: a, b, cn[4]) + { + } + + // CHECK: call i8* @__kmpc_omp_target_task_alloc({{.*}}, i64 -1) + + #pragma omp target device(1) nowait depend(in: global) depend(out: a, b, cn[4]) + { + } + + // CHECK: call i8* @__kmpc_omp_target_task_alloc({{.*}}, i64 1) + + return a; +} + +#endif Modified: cfe/trunk/test/OpenMP/target_depend_codegen.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/target_depend_codegen.cpp?rev=363809&r1=363808&r2=363809&view=diff == --- cfe/trunk/test/OpenMP/target_depend_codegen.cpp (original) +++ cfe/trunk/test/OpenMP/target_depend_codegen.cpp Wed Jun 19 07:26:43 2019 @@ -132,8 +132,10 @@ int foo(int n) { // CHECK: [[GEP:%.+]] = getelementptr inbounds %{{.+}}, %{{.+}}* %{{.+}}, i32 0, i32 2 // CHECK: [[DEV:%.+]] = load i32, i32* [[DEVICE_CAP]], // CHECK: store i32 [[DEV]], i32* [[GEP]], + // CHECK: [[DEV1:%.+]] = load i32, i32* [[DEVICE_CAP]], + // CHECK: [[DEV2:%.+]] = sext i32 [[DEV1]] to i64 - // CHECK: [[TASK:%.+]] = call i8* @__kmpc_omp_target_task_alloc(%struct.ident_t* @0, i32 [[GTID]], i32 1, i[[SZ]] {{104|52}}, i[[SZ]] {{16|12}}, i32 (i32, i8*)* bitcast (i32 (i32, %{{.+}}*)* [[TASK_ENTRY1_:@.+]] to i32 
(i32, i8*)*), i64 + // CHECK: [[TASK:%.+]] = call i8* @__kmpc_omp_target_task_alloc(%struct.ident_t* @0, i32 [[GTID]], i32 1, i[[SZ]] {{104|52}}, i[[SZ]] {{16|12}}, i32 (i32, i8*)* bitcast (i32 (i32, %{{.+}}*)* [[TASK_ENTRY1_:@.+]] to i32 (i32, i8*)*), i64 [[DEV2]]) // CHECK: [[BC_TASK:%.+]] = bitcast i8* [[TASK]] to [[TASK_TY1_:%.+]]* // CHECK: getelementptr inbounds [3 x %struct.kmp_depend_info], [3 x %struct.kmp_depend_info]* %{{.+}}, i[[SZ]] 0, i[[SZ]] 0 // CHECK: getelementptr inbounds [3 x %struct.kmp_depend_info], [3 x %struct.kmp_depend_info]* %{{.+}}, i[[SZ]] 0, i[[SZ]] 1 @@ -148,8 +150,10 @@ int foo(int n) { // CHECK: [[GEP:%.+]] = getelementptr inbounds %{{.+}}, %{{.+}}* %{{.+}}, i32 0, i32 2 // CHECK: [[DEV:%.+]] = load i32, i32* [[DEVICE_CAP]], //
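The CHECK lines above imply a simple rule for the last argument of `__kmpc_omp_target_task_alloc`: `i64 -1` when no `device` clause is present, otherwise the 32-bit device value sign-extended to 64 bits. A minimal sketch of that rule follows; `taskAllocDeviceId` is a hypothetical helper, and a null pointer is used here to model a missing clause.

```cpp
#include <cstdint>

// Mirrors OMP_DEVICEID_UNDEF from r363451: the runtime picks the device
// from the environment when it sees this value.
constexpr std::int64_t kDeviceIdUndef = -1;

// `deviceClause` points at the evaluated device(...) expression, or is
// null when the directive has no device clause (modelling assumption).
std::int64_t taskAllocDeviceId(const std::int32_t *deviceClause) {
  if (!deviceClause)
    return kDeviceIdUndef;
  // Matches the `sext i32 ... to i64` seen in the updated tests.
  return static_cast<std::int64_t>(*deviceClause);
}
```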
r363959 - [OpenMP] Add support for handling declare target to clause when unified memory is required
Author: gbercea Date: Thu Jun 20 11:04:47 2019 New Revision: 363959 URL: http://llvm.org/viewvc/llvm-project?rev=363959&view=rev Log: [OpenMP] Add support for handling declare target to clause when unified memory is required Summary: This patch adds support for the handling of the variables under the declare target to clause. The variables in this case are handled like link variables are. A pointer is created on the host and then mapped to the device. The runtime will then copy the address of the host variable in the device pointer. Reviewers: ABataev, AlexEichenberger, caomhin Reviewed By: ABataev Subscribers: guansong, jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D63108 Modified: cfe/trunk/lib/CodeGen/CGDeclCXX.cpp cfe/trunk/lib/CodeGen/CGExpr.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h cfe/trunk/lib/CodeGen/CodeGenModule.cpp cfe/trunk/test/OpenMP/declare_target_codegen.cpp cfe/trunk/test/OpenMP/declare_target_link_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp Modified: cfe/trunk/lib/CodeGen/CGDeclCXX.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGDeclCXX.cpp?rev=363959&r1=363958&r2=363959&view=diff == --- cfe/trunk/lib/CodeGen/CGDeclCXX.cpp (original) +++ cfe/trunk/lib/CodeGen/CGDeclCXX.cpp Thu Jun 20 11:04:47 2019 @@ -74,7 +74,7 @@ static void EmitDeclDestroy(CodeGenFunct // bails even if the attribute is not present. if (D.isNoDestroy(CGF.getContext())) return; - + CodeGenModule &CGM = CGF.CGM; // FIXME: __attribute__((cleanup)) ? 
Modified: cfe/trunk/lib/CodeGen/CGExpr.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGExpr.cpp?rev=363959&r1=363958&r2=363959&view=diff == --- cfe/trunk/lib/CodeGen/CGExpr.cpp (original) +++ cfe/trunk/lib/CodeGen/CGExpr.cpp Thu Jun 20 11:04:47 2019 @@ -2295,15 +2295,22 @@ static LValue EmitThreadPrivateVarDeclLV return CGF.MakeAddrLValue(Addr, T, AlignmentSource::Decl); } -static Address emitDeclTargetLinkVarDeclLValue(CodeGenFunction &CGF, - const VarDecl *VD, QualType T) { +static Address emitDeclTargetVarDeclLValue(CodeGenFunction &CGF, + const VarDecl *VD, QualType T) { llvm::Optional Res = OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD); - if (!Res || *Res == OMPDeclareTargetDeclAttr::MT_To) + // Return an invalid address if variable is MT_To and unified + // memory is not enabled. For all other cases: MT_Link and + // MT_To with unified memory, return a valid address. + if (!Res || (*Res == OMPDeclareTargetDeclAttr::MT_To && + !CGF.CGM.getOpenMPRuntime().hasRequiresUnifiedSharedMemory())) return Address::invalid(); - assert(*Res == OMPDeclareTargetDeclAttr::MT_Link && "Expected link clause"); + assert(((*Res == OMPDeclareTargetDeclAttr::MT_Link) || + (*Res == OMPDeclareTargetDeclAttr::MT_To && + CGF.CGM.getOpenMPRuntime().hasRequiresUnifiedSharedMemory())) && + "Expected link clause OR to clause with unified memory enabled."); QualType PtrTy = CGF.getContext().getPointerType(VD->getType()); - Address Addr = CGF.CGM.getOpenMPRuntime().getAddrOfDeclareTargetLink(VD); + Address Addr = CGF.CGM.getOpenMPRuntime().getAddrOfDeclareTargetVar(VD); return CGF.EmitLoadOfPointer(Addr, PtrTy->castAs()); } @@ -2359,7 +2366,7 @@ static LValue EmitGlobalVarDeclLValue(Co // Check if the variable is marked as declare target with link clause in // device codegen. 
if (CGF.getLangOpts().OpenMPIsDevice) { -Address Addr = emitDeclTargetLinkVarDeclLValue(CGF, VD, T); +Address Addr = emitDeclTargetVarDeclLValue(CGF, VD, T); if (Addr.isValid()) return CGF.MakeAddrLValue(Addr, T, AlignmentSource::Decl); } Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=363959&r1=363958&r2=363959&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Thu Jun 20 11:04:47 2019 @@ -2552,16 +2552,18 @@ CGOpenMPRuntime::createDispatchNextFunct return CGM.CreateRuntimeFunction(FnTy, Name); } -Address CGOpenMPRuntime::getAddrOfDeclareTargetLink(const VarDecl *VD) { +Address CGOpenMPRuntime::getAddrOfDeclareTargetVar(const VarDecl *VD) { if (CGM.getLangOpts().OpenMPSimd) return Address::invalid(); llvm::Optional Res = OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD); - if (Res && *Res == OMPDeclareTargetDeclAttr::MT_Link) { + if (Res && (*Res == OMPDeclareTargetDeclAttr::MT_Link |
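The new condition in `emitDeclTargetVarDeclLValue` decides whether a global is accessed through a reference pointer. As a sketch, with hypothetical names (`DeclTargetKind`, `usesRefPointer`) and `None` modelling a variable that is not declare target:

```cpp
// Hypothetical model of the declare target clause on a variable.
enum class DeclTargetKind { None, To, Link };

// True when device code should load the variable's address from a
// reference pointer (the "valid address" case in the diff).
bool usesRefPointer(DeclTargetKind kind, bool requiresUnifiedSharedMemory) {
  if (kind == DeclTargetKind::Link)
    return true; // link variables always go through the pointer
  if (kind == DeclTargetKind::To)
    return requiresUnifiedSharedMemory; // new behaviour in r363959
  return false; // not a declare target variable
}
```

The `false` results correspond to the `Address::invalid()` return, after which `EmitGlobalVarDeclLValue` falls through to the ordinary global access path.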
r367613 - [OpenMP] Fix declare target link implementation
Author: gbercea Date: Thu Aug 1 14:15:58 2019 New Revision: 367613 URL: http://llvm.org/viewvc/llvm-project?rev=367613&view=rev Log: [OpenMP] Fix declare target link implementation Summary: This patch fixes the case where variables in different compilation units or the same compilation unit are under the declare target link clause AND have the same name. This also fixes the name clash error that occurs when unified memory is activated. The changes in this patch include: - Pointers to internal variables are given unique names. - Externally visible variables are given the same name as before. - All pointer variables (external or internal) are weakly linked. Reviewers: ABataev, jdoerfert, caomhin Reviewed By: ABataev Subscribers: lebedev.ri, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D64592 Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp cfe/trunk/test/OpenMP/declare_target_codegen.cpp cfe/trunk/test/OpenMP/declare_target_link_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=367613&r1=367612&r2=367613&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Thu Aug 1 14:15:58 2019 @@ -2552,6 +2552,32 @@ CGOpenMPRuntime::createDispatchNextFunct return CGM.CreateRuntimeFunction(FnTy, Name); } +/// Obtain information that uniquely identifies a target entry. This +/// consists of the file and device IDs as well as line number associated with +/// the relevant entry source location. 
+static void getTargetEntryUniqueInfo(ASTContext &C, SourceLocation Loc, + unsigned &DeviceID, unsigned &FileID, + unsigned &LineNum) { + SourceManager &SM = C.getSourceManager(); + + // The loc should be always valid and have a file ID (the user cannot use + // #pragma directives in macros) + + assert(Loc.isValid() && "Source location is expected to be always valid."); + + PresumedLoc PLoc = SM.getPresumedLoc(Loc); + assert(PLoc.isValid() && "Source location is expected to be always valid."); + + llvm::sys::fs::UniqueID ID; + if (auto EC = llvm::sys::fs::getUniqueID(PLoc.getFilename(), ID)) +SM.getDiagnostics().Report(diag::err_cannot_open_file) +<< PLoc.getFilename() << EC.message(); + + DeviceID = ID.getDevice(); + FileID = ID.getFile(); + LineNum = PLoc.getLine(); +} + Address CGOpenMPRuntime::getAddrOfDeclareTargetVar(const VarDecl *VD) { if (CGM.getLangOpts().OpenMPSimd) return Address::invalid(); @@ -2563,19 +2589,27 @@ Address CGOpenMPRuntime::getAddrOfDeclar SmallString<64> PtrName; { llvm::raw_svector_ostream OS(PtrName); - OS << CGM.getMangledName(GlobalDecl(VD)) << "_decl_tgt_ref_ptr"; + OS << CGM.getMangledName(GlobalDecl(VD)); + if (!VD->isExternallyVisible()) { +unsigned DeviceID, FileID, Line; +getTargetEntryUniqueInfo(CGM.getContext(), + VD->getCanonicalDecl()->getBeginLoc(), + DeviceID, FileID, Line); +OS << llvm::format("_%x", FileID); + } + OS << "_decl_tgt_ref_ptr"; } llvm::Value *Ptr = CGM.getModule().getNamedValue(PtrName); if (!Ptr) { QualType PtrTy = CGM.getContext().getPointerType(VD->getType()); Ptr = getOrCreateInternalVariable(CGM.getTypes().ConvertTypeForMem(PtrTy), PtrName); - if (!CGM.getLangOpts().OpenMPIsDevice) { -auto *GV = cast(Ptr); -GV->setLinkage(llvm::GlobalValue::ExternalLinkage); + + auto *GV = cast(Ptr); + GV->setLinkage(llvm::GlobalValue::WeakAnyLinkage); + + if (!CGM.getLangOpts().OpenMPIsDevice) GV->setInitializer(CGM.GetAddrOfGlobal(VD)); - } - CGM.addUsedGlobal(cast(Ptr)); registerTargetGlobalVariable(VD, cast(Ptr)); 
} return Address(Ptr, CGM.getContext().getDeclAlign(VD)); @@ -2749,32 +2783,6 @@ llvm::Function *CGOpenMPRuntime::emitThr return nullptr; } -/// Obtain information that uniquely identifies a target entry. This -/// consists of the file and device IDs as well as line number associated with -/// the relevant entry source location. -static void getTargetEntryUniqueInfo(ASTContext &C, SourceLocation Loc, - unsigned &DeviceID, unsigned &FileID, - unsigned &LineNum) { - SourceManager &SM = C.getSourceManager(); - - // The loc should be always valid and have a file ID (the user cannot use - // #pragma directives in macros) - - assert(Loc.isValid() && "Source location is expected to be always valid."); - - PresumedLoc PLoc = SM.getPresumedLoc(Loc); -
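Note on the naming change above: the reference pointer name for a declare target variable that is not externally visible now carries the file ID in hex between the mangled name and the "_decl_tgt_ref_ptr" suffix. The following is a minimal sketch of that string construction only. buildDeclTgtRefPtrName and its plain parameters are hypothetical stand-ins; in Clang the values come from CodeGenModule, the VarDecl and llvm::sys::fs::UniqueID.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper mirroring the name construction in the diff.
// MangledName, ExternallyVisible and FileID are passed in directly here;
// the real code queries them from the compiler's own data structures.
std::string buildDeclTgtRefPtrName(const std::string &MangledName,
                                   bool ExternallyVisible, unsigned FileID) {
  std::string Name = MangledName;
  if (!ExternallyVisible) {
    // Same "_%x" formatting as llvm::format("_%x", FileID) in the patch.
    char Buf[16];
    std::snprintf(Buf, sizeof(Buf), "_%x", FileID);
    Name += Buf;
  }
  Name += "_decl_tgt_ref_ptr";
  return Name;
}
```

Under this sketch, two internal variables with the same mangled name in different files get different pointer names, while externally visible variables keep the old name.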
r368491 - [OpenMP] Add support for close map modifier in Clang
Author: gbercea Date: Fri Aug 9 14:42:13 2019 New Revision: 368491 URL: http://llvm.org/viewvc/llvm-project?rev=368491&view=rev Log: [OpenMP] Add support for close map modifier in Clang Summary: This patch adds support for the close map modifier in Clang. This ensures that the new map type is marked and passed to the OpenMP runtime appropriately. Additional regression tests have been merged from patch D55892 (author @saghir). Reviewers: ABataev, caomhin, jdoerfert, kkwli0 Reviewed By: ABataev Subscribers: kkwli0, Hahnfeld, saghir, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D65341 Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp cfe/trunk/test/OpenMP/target_data_codegen.cpp cfe/trunk/test/OpenMP/target_enter_data_codegen.cpp cfe/trunk/test/OpenMP/target_exit_data_codegen.cpp cfe/trunk/test/OpenMP/target_map_codegen.cpp Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=368491&r1=368490&r2=368491&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Fri Aug 9 14:42:13 2019 @@ -7116,6 +7116,9 @@ public: OMP_MAP_LITERAL = 0x100, /// Implicit map OMP_MAP_IMPLICIT = 0x200, +/// Close is a hint to the runtime to allocate memory close to +/// the target device. +OMP_MAP_CLOSE = 0x400, /// The 16 MSBs of the flags indicate whether the entry is member of some /// struct/class. OMP_MAP_MEMBER_OF = 0x, @@ -7296,6 +7299,9 @@ private: if (llvm::find(MapModifiers, OMPC_MAP_MODIFIER_always) != MapModifiers.end()) Bits |= OMP_MAP_ALWAYS; +if (llvm::find(MapModifiers, OMPC_MAP_MODIFIER_close) +!= MapModifiers.end()) + Bits |= OMP_MAP_CLOSE; return Bits; } @@ -7724,10 +7730,10 @@ private: if (!IsExpressionFirstInfo) { // If we have a PTR_AND_OBJ pair where the OBJ is a pointer as well, -// then we reset the TO/FROM/ALWAYS/DELETE flags. +// then we reset the TO/FROM/ALWAYS/DELETE/CLOSE flags. 
if (IsPointer) Flags &= ~(OMP_MAP_TO | OMP_MAP_FROM | OMP_MAP_ALWAYS | - OMP_MAP_DELETE); + OMP_MAP_DELETE | OMP_MAP_CLOSE); if (ShouldBeMemberOf) { // Set placeholder value MEMBER_OF= to indicate that the flag Modified: cfe/trunk/test/OpenMP/target_data_codegen.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/target_data_codegen.cpp?rev=368491&r1=368490&r2=368491&view=diff == --- cfe/trunk/test/OpenMP/target_data_codegen.cpp (original) +++ cfe/trunk/test/OpenMP/target_data_codegen.cpp Fri Aug 9 14:42:13 2019 @@ -40,6 +40,10 @@ double gc[100]; // CK1: [[SIZE04:@.+]] = {{.+}}constant [2 x i64] [i64 sdiv exact (i64 sub (i64 ptrtoint (double** getelementptr (double*, double** getelementptr inbounds (%struct.ST, %struct.ST* @gb, i32 0, i32 1), i32 1) to i64), i64 ptrtoint (double** getelementptr inbounds (%struct.ST, %struct.ST* @gb, i32 0, i32 1) to i64)), i64 ptrtoint (i8* getelementptr (i8, i8* null, i32 1) to i64)), i64 24] // CK1: [[MTYPE04:@.+]] = {{.+}}constant [2 x i64] [i64 32, i64 281474976710673] +// CK1: [[MTYPE05:@.+]] = {{.+}}constant [1 x i64] [i64 1057] + +// CK1: [[MTYPE06:@.+]] = {{.+}}constant [1 x i64] [i64 1061] + // CK1-LABEL: _Z3fooi void foo(int arg) { int la; @@ -163,6 +167,64 @@ void foo(int arg) { // CK1-DAG: [[GEPP]] = getelementptr inbounds {{.+}}[[P]] #pragma omp target data map(to: gb.b[:3]) {++arg;} + + // CK1: %{{.+}} = add nsw i32 %{{[^,]+}}, 1 + {++arg;} + + // Region 05 + // CK1-DAG: call void @__tgt_target_data_begin(i64 -1, i32 1, i8** [[GEPBP:%.+]], i8** [[GEPP:%.+]], i[[sz]]* [[GEPS:%.+]], {{.+}}getelementptr {{.+}}[1 x i{{.+}}]* [[MTYPE05]]{{.+}}) + // CK1-DAG: [[GEPBP]] = getelementptr inbounds {{.+}}[[BP:%[^,]+]] + // CK1-DAG: [[GEPP]] = getelementptr inbounds {{.+}}[[P:%[^,]+]] + // CK1-DAG: [[GEPS]] = getelementptr inbounds {{.+}}[[S:%[^,]+]] + + // CK1-DAG: [[BP0:%.+]] = getelementptr inbounds {{.+}}[[BP]], i{{.+}} 0, i{{.+}} 0 + // CK1-DAG: [[P0:%.+]] = getelementptr inbounds {{.+}}[[P]], i{{.+}} 0, 
i{{.+}} 0 + // CK1-DAG: [[S0:%.+]] = getelementptr inbounds {{.+}}[[S]], i{{.+}} 0, i{{.+}} 0 + // CK1-DAG: [[CBP0:%.+]] = bitcast i8** [[BP0]] to float** + // CK1-DAG: [[CP0:%.+]] = bitcast i8** [[P0]] to float** + // CK1-DAG: store float* [[VAR0:%.+]], float** [[CBP0]] + // CK1-DAG: store float* [[VAR0]], float** [[CP0]] + // CK1-DAG: store i[[sz]] [[CSVAL0:%[^,]+]], i[[sz]]* [[S0]] + // CK1-64-DAG: [[CSVAL0]] = mul nuw i64 %{{[^,]+}}, 4 + // CK1-32-DAG: [[CSVAL0]] = sext i32 [[CSVAL032:%.+]] to i64
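Note on the map type values above: the new flag is one more bit in the map type mask, so the expected constants in the tests can be read as sums of flag bits. The sketch below is an illustration, not the Clang code. Only the 0x400 value for close comes from this diff; the values for TO, FROM, ALWAYS, DELETE and TARGET_PARAM are assumptions about the rest of the enum, and reading 1057 as TO | TARGET_PARAM | CLOSE is likewise an assumption. The helper names are hypothetical.

```cpp
#include <cstdint>

// Flag bits. MAP_CLOSE (0x400) is taken from the diff; the other values
// are assumed for illustration.
enum MapFlags : uint64_t {
  MAP_TO = 0x01,           // assumed
  MAP_FROM = 0x02,         // assumed
  MAP_ALWAYS = 0x04,       // assumed
  MAP_DELETE = 0x08,       // assumed
  MAP_TARGET_PARAM = 0x20, // assumed
  MAP_CLOSE = 0x400        // from the diff
};

// Hypothetical helper: add the bits for the always/close map modifiers,
// in the same style as the modifier checks in the patch.
inline uint64_t addModifierBits(uint64_t Bits, bool HasAlways, bool HasClose) {
  if (HasAlways)
    Bits |= MAP_ALWAYS;
  if (HasClose)
    Bits |= MAP_CLOSE;
  return Bits;
}

// Hypothetical helper: the reset applied to the pointer half of a
// PTR_AND_OBJ pair, now including the close bit.
inline uint64_t resetForPointer(uint64_t Flags) {
  return Flags & ~(MAP_TO | MAP_FROM | MAP_ALWAYS | MAP_DELETE | MAP_CLOSE);
}
```

With these assumed values, 0x01 + 0x20 + 0x400 gives 1057, and adding the always bit (0x04) gives 1061, which matches the two new MTYPE constants.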
r360626 - [OpenMP][Clang][BugFix] Split declares and math functions inclusion.
Author: gbercea Date: Mon May 13 15:11:44 2019 New Revision: 360626 URL: http://llvm.org/viewvc/llvm-project?rev=360626&view=rev Log: [OpenMP][Clang][BugFix] Split declares and math functions inclusion. Summary: This patch fixes an issue in which the __clang_cuda_cmath.h header is being included even when cmath or math.h headers are not included. Reviewers: jdoerfert, ABataev, hfinkel, caomhin, tra Reviewed By: tra Subscribers: tra, mgorny, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61765 Added: cfe/trunk/lib/Headers/openmp_wrappers/__clang_openmp_math_declares.h cfe/trunk/test/Headers/Inputs/include/cstdlib Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp cfe/trunk/lib/Headers/CMakeLists.txt cfe/trunk/lib/Headers/__clang_cuda_cmath.h cfe/trunk/lib/Headers/__clang_cuda_device_functions.h cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h cfe/trunk/lib/Headers/openmp_wrappers/__clang_openmp_math.h cfe/trunk/lib/Headers/openmp_wrappers/cmath cfe/trunk/lib/Headers/openmp_wrappers/math.h cfe/trunk/test/Headers/nvptx_device_cmath_functions.c cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp cfe/trunk/test/Headers/nvptx_device_math_functions.c cfe/trunk/test/Headers/nvptx_device_math_functions.cpp Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=360626&r1=360625&r2=360626&view=diff == --- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original) +++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Mon May 13 15:11:44 2019 @@ -1166,7 +1166,7 @@ void Clang::AddPreprocessingOptions(Comp } CmdArgs.push_back("-include"); -CmdArgs.push_back("__clang_openmp_math.h"); +CmdArgs.push_back("__clang_openmp_math_declares.h"); } // Add -i* options, and automatically translate to Modified: cfe/trunk/lib/Headers/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/CMakeLists.txt?rev=360626&r1=360625&r2=360626&view=diff == --- 
cfe/trunk/lib/Headers/CMakeLists.txt (original) +++ cfe/trunk/lib/Headers/CMakeLists.txt Mon May 13 15:11:44 2019 @@ -132,6 +132,7 @@ set(openmp_wrapper_files openmp_wrappers/math.h openmp_wrappers/cmath openmp_wrappers/__clang_openmp_math.h + openmp_wrappers/__clang_openmp_math_declares.h ) set(output_dir ${LLVM_LIBRARY_OUTPUT_INTDIR}/clang/${CLANG_VERSION}/include) Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_cmath.h?rev=360626&r1=360625&r2=360626&view=diff == --- cfe/trunk/lib/Headers/__clang_cuda_cmath.h (original) +++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h Mon May 13 15:11:44 2019 @@ -36,8 +36,10 @@ #define __DEVICE__ static __device__ __inline__ __attribute__((always_inline)) #endif +#if !(defined(_OPENMP) && defined(__cplusplus)) __DEVICE__ long long abs(long long __n) { return ::llabs(__n); } __DEVICE__ long abs(long __n) { return ::labs(__n); } +#endif __DEVICE__ float abs(float __x) { return ::fabsf(__x); } __DEVICE__ double abs(double __x) { return ::fabs(__x); } __DEVICE__ float acos(float __x) { return ::acosf(__x); } Modified: cfe/trunk/lib/Headers/__clang_cuda_device_functions.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_device_functions.h?rev=360626&r1=360625&r2=360626&view=diff == --- cfe/trunk/lib/Headers/__clang_cuda_device_functions.h (original) +++ cfe/trunk/lib/Headers/__clang_cuda_device_functions.h Mon May 13 15:11:44 2019 @@ -1493,8 +1493,10 @@ __DEVICE__ double cbrt(double __a) { ret __DEVICE__ float cbrtf(float __a) { return __nv_cbrtf(__a); } __DEVICE__ double ceil(double __a) { return __nv_ceil(__a); } __DEVICE__ float ceilf(float __a) { return __nv_ceilf(__a); } +#ifndef _OPENMP __DEVICE__ int clock() { return __nvvm_read_ptx_sreg_clock(); } __DEVICE__ long long clock64() { return __nvvm_read_ptx_sreg_clock64(); } +#endif __DEVICE__ double copysign(double __a, double __b) { return __nv_copysign(__a, __b); } Modified: 
cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h?rev=360626&r1=360625&r2=360626&view=diff == --- cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h (original) +++ cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h Mon May 13 15:11:44 2019 @@ -27,11 +27,13 @@ static __inline__ __attribute__((always_inline)) __attribute__((device)) #endif -__DEVICE__
r360804 - [OpenMP][bugfix] Fix issues with C++ 17 compilation when handling math functions
Author: gbercea Date: Wed May 15 13:18:21 2019 New Revision: 360804 URL: http://llvm.org/viewvc/llvm-project?rev=360804&view=rev Log: [OpenMP][bugfix] Fix issues with C++ 17 compilation when handling math functions Summary: In OpenMP device offloading we must ensure that under C++ 17, the inclusion of cstdlib works correctly. Reviewers: ABataev, tra, jdoerfert, hfinkel, caomhin Reviewed By: jdoerfert Subscribers: Hahnfeld, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61949 Added: cfe/trunk/test/Headers/nvptx_device_cmath_functions_cxx17.cpp cfe/trunk/test/Headers/nvptx_device_math_functions_cxx17.cpp Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h cfe/trunk/lib/Headers/__clang_cuda_device_functions.h cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h cfe/trunk/test/Headers/Inputs/include/cstdlib Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_cmath.h?rev=360804&r1=360803&r2=360804&view=diff == --- cfe/trunk/lib/Headers/__clang_cuda_cmath.h (original) +++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h Wed May 15 13:18:21 2019 @@ -36,6 +36,15 @@ #define __DEVICE__ static __device__ __inline__ __attribute__((always_inline)) #endif +// For C++ 17 we need to include noexcept attribute to be compatible +// with the header-defined version. This may be removed once +// variant is supported. 
+#if defined(_OPENMP) && defined(__cplusplus) && __cplusplus >= 201703L +#define __NOEXCEPT noexcept +#else +#define __NOEXCEPT +#endif + #if !(defined(_OPENMP) && defined(__cplusplus)) __DEVICE__ long long abs(long long __n) { return ::llabs(__n); } __DEVICE__ long abs(long __n) { return ::labs(__n); } @@ -50,7 +59,7 @@ __DEVICE__ float ceil(float __x) { retur __DEVICE__ float cos(float __x) { return ::cosf(__x); } __DEVICE__ float cosh(float __x) { return ::coshf(__x); } __DEVICE__ float exp(float __x) { return ::expf(__x); } -__DEVICE__ float fabs(float __x) { return ::fabsf(__x); } +__DEVICE__ float fabs(float __x) __NOEXCEPT { return ::fabsf(__x); } __DEVICE__ float floor(float __x) { return ::floorf(__x); } __DEVICE__ float fmod(float __x, float __y) { return ::fmodf(__x, __y); } // TODO: remove when variant is supported @@ -465,6 +474,7 @@ _GLIBCXX_END_NAMESPACE_VERSION } // namespace std #endif +#undef __NOEXCEPT #undef __DEVICE__ #endif Modified: cfe/trunk/lib/Headers/__clang_cuda_device_functions.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_device_functions.h?rev=360804&r1=360803&r2=360804&view=diff == --- cfe/trunk/lib/Headers/__clang_cuda_device_functions.h (original) +++ cfe/trunk/lib/Headers/__clang_cuda_device_functions.h Wed May 15 13:18:21 2019 @@ -37,6 +37,15 @@ #define __FAST_OR_SLOW(fast, slow) slow #endif +// For C++ 17 we need to include noexcept attribute to be compatible +// with the header-defined version. This may be removed once +// variant is supported. 
+#if defined(_OPENMP) && defined(__cplusplus) && __cplusplus >= 201703L +#define __NOEXCEPT noexcept +#else +#define __NOEXCEPT +#endif + __DEVICE__ int __all(int __a) { return __nvvm_vote_all(__a); } __DEVICE__ int __any(int __a) { return __nvvm_vote_any(__a); } __DEVICE__ unsigned int __ballot(int __a) { return __nvvm_vote_ballot(__a); } @@ -1474,7 +1483,8 @@ __DEVICE__ unsigned int __vsubus4(unsign return r; } #endif // CUDA_VERSION >= 9020 -__DEVICE__ int abs(int __a) { return __nv_abs(__a); } +__DEVICE__ int abs(int __a) __NOEXCEPT { return __nv_abs(__a); } +__DEVICE__ double fabs(double __a) __NOEXCEPT { return __nv_fabs(__a); } __DEVICE__ double acos(double __a) { return __nv_acos(__a); } __DEVICE__ float acosf(float __a) { return __nv_acosf(__a); } __DEVICE__ double acosh(double __a) { return __nv_acosh(__a); } @@ -1533,7 +1543,6 @@ __DEVICE__ float exp2f(float __a) { retu __DEVICE__ float expf(float __a) { return __nv_expf(__a); } __DEVICE__ double expm1(double __a) { return __nv_expm1(__a); } __DEVICE__ float expm1f(float __a) { return __nv_expm1f(__a); } -__DEVICE__ double fabs(double __a) { return __nv_fabs(__a); } __DEVICE__ float fabsf(float __a) { return __nv_fabsf(__a); } __DEVICE__ double fdim(double __a, double __b) { return __nv_fdim(__a, __b); } __DEVICE__ float fdimf(float __a, float __b) { return __nv_fdimf(__a, __b); } @@ -1572,15 +1581,15 @@ __DEVICE__ float j1f(float __a) { return __DEVICE__ double jn(int __n, double __a) { return __nv_jn(__n, __a); } __DEVICE__ float jnf(int __n, float __a) { return __nv_jnf(__n, __a); } #if defined(__LP64__) || defined(_WIN64) -__DEVICE__ long labs(long __a) { return __nv_llabs(__a); }; +__DEVICE__ long labs(long __a) __NOEXCEPT { return __nv_llabs(__a); }; #else -__DEVICE__ long labs(long __a) {
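Note on the __NOEXCEPT macro above: the pattern is a macro that expands to noexcept only under C++17 and to nothing otherwise, attached to the wrappers that must match the host header declarations. A minimal sketch of the same pattern follows. SKETCH_NOEXCEPT, sketch_abs and kSketchAbsIsNoexcept are hypothetical names, and the _OPENMP condition from the patch is dropped here so that the sketch builds without OpenMP.

```cpp
// Hypothetical macro with the same shape as __NOEXCEPT in the patch,
// minus the _OPENMP check.
#if defined(__cplusplus) && __cplusplus >= 201703L
#define SKETCH_NOEXCEPT noexcept
#else
#define SKETCH_NOEXCEPT
#endif

// Hypothetical wrapper declared with the conditional specifier.
inline int sketch_abs(int a) SKETCH_NOEXCEPT { return a < 0 ? -a : a; }

// True only when the macro expanded to noexcept for this translation unit.
constexpr bool kSketchAbsIsNoexcept = noexcept(sketch_abs(0));

// As in the patch, the helper macro is undefined once it is no longer needed.
#undef SKETCH_NOEXCEPT
```

The noexcept operator gives a way to check at compile time which branch of the macro was taken for a given language standard.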
r360809 - [OpenMP][Bugfix] Move double and float versions of abs under c++ macro
Author: gbercea Date: Wed May 15 13:28:23 2019 New Revision: 360809 URL: http://llvm.org/viewvc/llvm-project?rev=360809&view=rev Log: [OpenMP][Bugfix] Move double and float versions of abs under c++ macro Summary: This is a fix for the reported bug: [[ https://bugs.llvm.org/show_bug.cgi?id=41861 | 41861 ]] abs functions need to be moved under the c++ macro to avoid conflicts with included headers. Reviewers: tra, jdoerfert, hfinkel, ABataev, caomhin Reviewed By: jdoerfert Subscribers: guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61959 Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h cfe/trunk/test/Headers/Inputs/include/cstdlib cfe/trunk/test/Headers/nvptx_device_cmath_functions.c cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp cfe/trunk/test/Headers/nvptx_device_cmath_functions_cxx17.cpp cfe/trunk/test/Headers/nvptx_device_math_functions.c cfe/trunk/test/Headers/nvptx_device_math_functions.cpp cfe/trunk/test/Headers/nvptx_device_math_functions_cxx17.cpp Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_cmath.h?rev=360809&r1=360808&r2=360809&view=diff == --- cfe/trunk/lib/Headers/__clang_cuda_cmath.h (original) +++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h Wed May 15 13:28:23 2019 @@ -48,9 +48,9 @@ #if !(defined(_OPENMP) && defined(__cplusplus)) __DEVICE__ long long abs(long long __n) { return ::llabs(__n); } __DEVICE__ long abs(long __n) { return ::labs(__n); } -#endif __DEVICE__ float abs(float __x) { return ::fabsf(__x); } __DEVICE__ double abs(double __x) { return ::fabs(__x); } +#endif __DEVICE__ float acos(float __x) { return ::acosf(__x); } __DEVICE__ float asin(float __x) { return ::asinf(__x); } __DEVICE__ float atan(float __x) { return ::atanf(__x); } Modified: cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h?rev=360809&r1=360808&r2=360809&view=diff == --- cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h (original) +++ cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h Wed May 15 13:28:23 2019 @@ -39,10 +39,10 @@ #if !(defined(_OPENMP) && defined(__cplusplus)) __DEVICE__ long abs(long); __DEVICE__ long long abs(long long); -#endif -__DEVICE__ int abs(int) __NOEXCEPT; __DEVICE__ double abs(double); __DEVICE__ float abs(float); +#endif +__DEVICE__ int abs(int) __NOEXCEPT; __DEVICE__ double acos(double); __DEVICE__ float acos(float); __DEVICE__ double acosh(double); Modified: cfe/trunk/test/Headers/Inputs/include/cstdlib URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Headers/Inputs/include/cstdlib?rev=360809&r1=360808&r2=360809&view=diff == --- cfe/trunk/test/Headers/Inputs/include/cstdlib (original) +++ cfe/trunk/test/Headers/Inputs/include/cstdlib Wed May 15 13:28:23 2019 @@ -3,9 +3,11 @@ #if __cplusplus >= 201703L extern int abs (int __x) throw() __attribute__ ((__const__)) ; extern long int labs (long int __x) throw() __attribute__ ((__const__)) ; +extern float fabs (float __x) throw() __attribute__ ((__const__)) ; #else extern int abs (int __x) __attribute__ ((__const__)) ; extern long int labs (long int __x) __attribute__ ((__const__)) ; +extern float fabs (float __x) __attribute__ ((__const__)) ; #endif namespace std Modified: cfe/trunk/test/Headers/nvptx_device_cmath_functions.c URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Headers/nvptx_device_cmath_functions.c?rev=360809&r1=360808&r2=360809&view=diff == --- cfe/trunk/test/Headers/nvptx_device_cmath_functions.c (original) +++ cfe/trunk/test/Headers/nvptx_device_cmath_functions.c Wed May 15 13:28:23 2019 @@ -17,5 +17,9 @@ void test_sqrt(double a1) { double l2 = pow(a1, a1); // CHECK-YES: call double @__nv_modf(double double l3 = modf(a1 + 3.5, &a1); +// CHECK-YES: call double 
@__nv_fabs(double +double l4 = fabs(a1); +// CHECK-YES: call i32 @__nv_abs(i32 +double l5 = abs((int)a1); } } Modified: cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp?rev=360809&r1=360808&r2=360809&view=diff == --- cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp (original) +++ cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp Wed May 15 13:28:23 2019 @@ -18,5 +18,9 @@ void test_sqrt(double a1) { double l2 = pow(a1, a1);
r361066 - [OpenMP][bugfix] Add missing math functions variants for log and abs.
Author: gbercea Date: Fri May 17 12:15:53 2019 New Revision: 361066 URL: http://llvm.org/viewvc/llvm-project?rev=361066&view=rev Log: [OpenMP][bugfix] Add missing math functions variants for log and abs. Summary: When including the random header in C++, some of the math functions it relies on are not present in the CUDA headers. We include these variants in this case. Reviewers: jdoerfert, hfinkel, tra, caomhin Reviewed By: tra Subscribers: efriedma, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D62046 Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_cmath.h?rev=361066&r1=361065&r2=361066&view=diff == --- cfe/trunk/lib/Headers/__clang_cuda_cmath.h (original) +++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h Fri May 17 12:15:53 2019 @@ -51,6 +51,11 @@ __DEVICE__ long abs(long __n) { return : __DEVICE__ float abs(float __x) { return ::fabsf(__x); } __DEVICE__ double abs(double __x) { return ::fabs(__x); } #endif +// TODO: remove once variat is supported. 
+#if defined(_OPENMP) && defined(__cplusplus) +__DEVICE__ const float abs(const float __x) { return ::fabsf((float)__x); } +__DEVICE__ const double abs(const double __x) { return ::fabs((double)__x); } +#endif __DEVICE__ float acos(float __x) { return ::acosf(__x); } __DEVICE__ float asin(float __x) { return ::asinf(__x); } __DEVICE__ float atan(float __x) { return ::atanf(__x); } Modified: cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h?rev=361066&r1=361065&r2=361066&view=diff == --- cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h (original) +++ cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h Fri May 17 12:15:53 2019 @@ -42,6 +42,14 @@ __DEVICE__ long long abs(long long); __DEVICE__ double abs(double); __DEVICE__ float abs(float); #endif +// While providing the CUDA declarations and definitions for math functions, +// we may manually define additional functions. +// TODO: Once variant is supported the additional functions will have +// to be removed. +#if defined(_OPENMP) && defined(__cplusplus) +__DEVICE__ const double abs(const double); +__DEVICE__ const float abs(const float); +#endif __DEVICE__ int abs(int) __NOEXCEPT; __DEVICE__ double acos(double); __DEVICE__ float acos(float); @@ -144,6 +152,9 @@ __DEVICE__ double log2(double); __DEVICE__ float log2(float); __DEVICE__ double logb(double); __DEVICE__ float logb(float); +#if defined(_OPENMP) && defined(__cplusplus) +__DEVICE__ long double log(long double); +#endif __DEVICE__ double log(double); __DEVICE__ float log(float); __DEVICE__ long lrint(double); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
r361298 - [OpenMP] Add support for registering requires directives with the runtime
Author: gbercea Date: Tue May 21 12:42:01 2019 New Revision: 361298 URL: http://llvm.org/viewvc/llvm-project?rev=361298&view=rev Log: [OpenMP] Add support for registering requires directives with the runtime Summary: This patch adds support for the registration of the requires directives with the runtime. Each requires directive clause will enable a particular flag to be set. The set of flags is passed to the runtime to be checked for compatibility with other such flags coming from other object files. The registration function is called whenever OpenMP is present even if a requires directive is not present. This helps detect cases in which requires directives are used inconsistently. Reviewers: ABataev, AlexEichenberger, caomhin Reviewed By: ABataev, AlexEichenberger Subscribers: jholewinski, guansong, jfb, jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60568 Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h cfe/trunk/lib/CodeGen/CodeGenModule.cpp cfe/trunk/test/OpenMP/openmp_offload_registration.cpp cfe/trunk/test/OpenMP/target_codegen.cpp cfe/trunk/test/OpenMP/target_codegen_registration.cpp cfe/trunk/test/OpenMP/target_depend_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_codegen_registration.cpp cfe/trunk/test/OpenMP/target_parallel_depend_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_for_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_for_codegen_registration.cpp cfe/trunk/test/OpenMP/target_parallel_for_depend_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_for_simd_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_for_simd_codegen_registration.cpp cfe/trunk/test/OpenMP/target_parallel_for_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_if_codegen.cpp cfe/trunk/test/OpenMP/target_parallel_num_threads_codegen.cpp 
cfe/trunk/test/OpenMP/target_simd_codegen.cpp cfe/trunk/test/OpenMP/target_simd_codegen_registration.cpp cfe/trunk/test/OpenMP/target_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_codegen.cpp cfe/trunk/test/OpenMP/target_teams_codegen_registration.cpp cfe/trunk/test/OpenMP/target_teams_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_codegen_registration.cpp cfe/trunk/test/OpenMP/target_teams_distribute_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_simd_codegen_registration.cpp cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_simd_codegen.cpp cfe/trunk/test/OpenMP/target_teams_distribute_simd_codegen_registration.cpp cfe/trunk/test/OpenMP/target_teams_distribute_simd_depend_codegen.cpp cfe/trunk/test/OpenMP/target_teams_num_teams_codegen.cpp cfe/trunk/test/OpenMP/target_teams_thread_limit_codegen.cpp Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=361298&r1=361297&r2=361298&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Tue May 21 12:42:01 2019 @@ -457,6 +457,26 @@ enum OpenMPLocationFlags : unsigned { LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/OMP_IDENT_WORK_DISTRIBUTE) }; +namespace { +LLVM_ENABLE_BITMASK_ENUMS_IN_NAMESPACE(); +/// Values for bit flags for marking which requires clauses have been used. +enum OpenMPOffloadingRequiresDirFlags : int64_t { + /// flag undefined. + OMP_REQ_UNDEFINED = 0x000, + /// no requires clause present. + OMP_REQ_NONE= 0x001, + /// reverse_offload clause. + OMP_REQ_REVERSE_OFFLOAD = 0x002, + /// unified_address clause. + OMP_REQ_UNIFIED_ADDRESS = 0x004, + /// unified_shared_memory clause. 
+ OMP_REQ_UNIFIED_SHARED_MEMORY = 0x008, + /// dynamic_allocators clause. + OMP_REQ_DYNAMIC_ALLOCATORS = 0x010, + LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/OMP_REQ_DYNAMIC_ALLOCATORS) +}; +} // anonymous namespace + /// Describes ident structure that describes a source location. /// All descriptions are taken from /// https://github.com/llvm/llvm-project/blob/master/openmp/runtime/src/kmp.h @@ -694,6 +714,8 @@ enum OpenMPRTLFunction { // *host_ptr, int32_t arg_num, void** args_base, void **args, size_t // *arg_sizes, int64_t *arg_types, int32_t num_teams, int32_t thread_limit); OMPRTL__tgt_target_teams_nowait, + // Call to void __tgt_register_requires(int64_t flags); + OMPRT
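Note on the requires flags above: each clause maps to one bit, and the combined mask is what gets passed to the registration call. The sketch below shows only the accumulation of the mask, using the flag values from the enum in this diff. collectRequiresFlags is a hypothetical helper, and the fallback to the "no requires clause" value when nothing was seen is an assumption based on the enum comments, not on code shown in this message.

```cpp
#include <cstdint>
#include <initializer_list>

// Flag values copied from the enum added by the patch.
enum RequiresFlags : int64_t {
  REQ_UNDEFINED = 0x000,
  REQ_NONE = 0x001,
  REQ_REVERSE_OFFLOAD = 0x002,
  REQ_UNIFIED_ADDRESS = 0x004,
  REQ_UNIFIED_SHARED_MEMORY = 0x008,
  REQ_DYNAMIC_ALLOCATORS = 0x010
};

// Hypothetical helper: OR together the flags of all clauses seen in a
// translation unit. Falling back to REQ_NONE when no clause was seen is an
// assumption made for this sketch.
inline int64_t collectRequiresFlags(std::initializer_list<RequiresFlags> Clauses) {
  int64_t Flags = REQ_UNDEFINED;
  for (RequiresFlags C : Clauses)
    Flags |= C;
  return Flags == REQ_UNDEFINED ? REQ_NONE : Flags;
}
```

Since the registration function is called even when no requires directive is present, every object file would report some value under this scheme, which is what lets inconsistent use be detected across object files.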
r361658 - [OpenMP] Add test for requires and unified shared memory clause with declare target link
Author: gbercea Date: Fri May 24 11:48:42 2019 New Revision: 361658 URL: http://llvm.org/viewvc/llvm-project?rev=361658&view=rev Log: [OpenMP] Add test for requires and unified shared memory clause with declare target link Summary: This patch adds a test for requires with unified shared memory clause when a declare target link is present. This test needs to go in prior to changes to declare target link for comparison purposes. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D62407 Added: cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp Added: cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp?rev=361658&view=auto == --- cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp (added) +++ cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp Fri May 24 11:48:42 2019 @@ -0,0 +1,67 @@ +// Test declare target link under unified memory requirement. 
+// RUN: %clang_cc1 -verify -fopenmp -fopenmp-cuda-mode -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -o - | FileCheck %s --check-prefix CHECK +// expected-no-diagnostics + +#ifndef HEADER +#define HEADER + +#define N 1000 + +double var = 10.0; + +#pragma omp requires unified_shared_memory +#pragma omp declare target link(var) + +int bar(int n){ + double sum = 0; + +#pragma omp target + for(int i = 0; i < n; i++) { +sum += var; + } + + return sum; +} + +// CHECK: [[VAR:@.+]] = global double 1.00e+01 +// CHECK: [[VAR_DECL_TGT_LINK_PTR:@.+]] = global double* [[VAR]] + +// CHECK: [[OFFLOAD_SIZES:@.+]] = private unnamed_addr constant [3 x i64] [i64 4, i64 8, i64 8] +// CHECK: [[OFFLOAD_MAPTYPES:@.+]] = private unnamed_addr constant [3 x i64] [i64 800, i64 800, i64 531] + +// CHECK: [[N_CASTED:%.+]] = alloca i64 +// CHECK: [[SUM_CASTED:%.+]] = alloca i64 + +// CHECK: [[OFFLOAD_BASEPTRS:%.+]] = alloca [3 x i8*] +// CHECK: [[OFFLOAD_PTRS:%.+]] = alloca [3 x i8*] + +// CHECK: [[LOAD1:%.+]] = load i64, i64* [[N_CASTED]] +// CHECK: [[LOAD2:%.+]] = load i64, i64* [[SUM_CASTED]] + +// CHECK: [[BPTR1:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_BASEPTRS]], i32 0, i32 0 +// CHECK: [[BCAST1:%.+]] = bitcast i8** [[BPTR1]] to i64* +// CHECK: store i64 [[LOAD1]], i64* [[BCAST1]] +// CHECK: [[BPTR2:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_PTRS]], i32 0, i32 0 +// CHECK: [[BCAST2:%.+]] = bitcast i8** [[BPTR2]] to i64* +// CHECK: store i64 [[LOAD1]], i64* [[BCAST2]] + +// CHECK: [[BPTR3:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_BASEPTRS]], i32 0, i32 1 +// CHECK: [[BCAST3:%.+]] = bitcast i8** [[BPTR3]] to i64* +// CHECK: store i64 [[LOAD2]], i64* [[BCAST3]] +// CHECK: [[BPTR4:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_PTRS]], i32 0, i32 1 +// CHECK: [[BCAST4:%.+]] = bitcast i8** [[BPTR4]] to i64* +// CHECK: store i64 [[LOAD2]], i64* [[BCAST4]] + +// CHECK: 
[[BPTR5:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_BASEPTRS]], i32 0, i32 2 +// CHECK: [[BCAST5:%.+]] = bitcast i8** [[BPTR5]] to double*** +// CHECK: store double** [[VAR_DECL_TGT_LINK_PTR]], double*** [[BCAST5]] +// CHECK: [[BPTR6:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_PTRS]], i32 0, i32 2 +// CHECK: [[BCAST6:%.+]] = bitcast i8** [[BPTR6]] to double** +// CHECK: store double* [[VAR]], double** [[BCAST6]] + +// CHECK: [[BPTR7:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_BASEPTRS]], i32 0, i32 0 +// CHECK: [[BPTR8:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_PTRS]], i32 0, i32 0 + +// CHECK: call i32 @__tgt_target(i64 -1, i8* @{{.*}}.region_id, i32 3, i8** [[BPTR7]], i8** [[BPTR8]], i64* getelementptr inbounds ([3 x i64], [3 x i64]* [[OFFLOAD_SIZES]], i32 0, i32 0), i64* getelementptr inbounds ([3 x i64], [3 x i64]* [[OFFLOAD_MAPTYPES]], i32 0, i32 0)) + +#endif
r350758 - [OpenMP] Add flag for preventing the extension to 64 bits for the collapse loop counter
Author: gbercea Date: Wed Jan 9 12:38:35 2019 New Revision: 350758 URL: http://llvm.org/viewvc/llvm-project?rev=350758&view=rev Log: [OpenMP] Add flag for preventing the extension to 64 bits for the collapse loop counter Summary: Introduce a compiler flag for cases when the user knows that the collapsed loop counter can be safely represented using at most 32 bits. This will prevent the emission of expensive mathematical operations (such as the div operation) on the iteration variable using 64 bits where 32 bit operations are sufficient. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: hfinkel, kkwli0, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D55928 Modified: cfe/trunk/docs/OpenMPSupport.rst cfe/trunk/include/clang/Basic/LangOptions.def cfe/trunk/include/clang/Driver/Options.td cfe/trunk/lib/Driver/ToolChains/Clang.cpp cfe/trunk/lib/Frontend/CompilerInvocation.cpp cfe/trunk/lib/Sema/SemaOpenMP.cpp cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp Modified: cfe/trunk/docs/OpenMPSupport.rst URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/OpenMPSupport.rst?rev=350758&r1=350757&r2=350758&view=diff == --- cfe/trunk/docs/OpenMPSupport.rst (original) +++ cfe/trunk/docs/OpenMPSupport.rst Wed Jan 9 12:38:35 2019 @@ -108,6 +108,16 @@ are stored in the global memory. In `Cud between the threads and it is user responsibility to share the required data between the threads in the parallel regions. +Collapsed loop nest counter +--- + +When using the collapse clause on a loop nest the default behaviour is to +automatically extend the representation of the loop counter to 64 bits for +the cases where the sizes of the collapsed loops are not known at compile +time. To prevent this conservative choice and use at most 32 bits, +compile your program with the `-fopenmp-optimistic-collapse`. 
+ + Features not supported or with limited support for Cuda devices --- Modified: cfe/trunk/include/clang/Basic/LangOptions.def URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/LangOptions.def?rev=350758&r1=350757&r2=350758&view=diff == --- cfe/trunk/include/clang/Basic/LangOptions.def (original) +++ cfe/trunk/include/clang/Basic/LangOptions.def Wed Jan 9 12:38:35 2019 @@ -207,6 +207,7 @@ LANGOPT(OpenMPCUDAForceFullRuntime , 1, LANGOPT(OpenMPHostCXXExceptions, 1, 0, "C++ exceptions handling in the host code.") LANGOPT(OpenMPCUDANumSMs , 32, 0, "Number of SMs for CUDA devices.") LANGOPT(OpenMPCUDABlocksPerSM , 32, 0, "Number of blocks per SM for CUDA devices.") +LANGOPT(OpenMPOptimisticCollapse , 1, 0, "Use at most 32 bits to represent the collapsed loop nest counter.") LANGOPT(RenderScript , 1, 0, "RenderScript") LANGOPT(CUDAIsDevice , 1, 0, "compiling for CUDA device") Modified: cfe/trunk/include/clang/Driver/Options.td URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=350758&r1=350757&r2=350758&view=diff == --- cfe/trunk/include/clang/Driver/Options.td (original) +++ cfe/trunk/include/clang/Driver/Options.td Wed Jan 9 12:38:35 2019 @@ -1574,6 +1574,10 @@ def fopenmp_cuda_number_of_sm_EQ : Joine Flags<[CC1Option, NoArgumentUnused, HelpHidden]>; def fopenmp_cuda_blocks_per_sm_EQ : Joined<["-"], "fopenmp-cuda-blocks-per-sm=">, Group, Flags<[CC1Option, NoArgumentUnused, HelpHidden]>; +def fopenmp_optimistic_collapse : Flag<["-"], "fopenmp-optimistic-collapse">, Group, + Flags<[CC1Option, NoArgumentUnused, HelpHidden]>; +def fno_openmp_optimistic_collapse : Flag<["-"], "fno-openmp-optimistic-collapse">, Group, + Flags<[NoArgumentUnused, HelpHidden]>; def fno_optimize_sibling_calls : Flag<["-"], "fno-optimize-sibling-calls">, Group; def foptimize_sibling_calls : Flag<["-"], "foptimize-sibling-calls">, Group; def fno_escaping_block_tail_calls : Flag<["-"], "fno-escaping-block-tail-calls">, Group, 
Flags<[CC1Option]>; Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=350758&r1=350757&r2=350758&view=diff == --- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original) +++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Wed Jan 9 12:38:35 2019 @@ -4434,6 +4434,10 @@ void Clang::ConstructJob(Compilation &C, Args.AddAllArgs(CmdArgs, options::OPT_fopenmp_version_EQ); Args.AddAllArgs(CmdArgs, options::OPT_fopenmp_cuda_number_of_sm_EQ); Args.AddAllArgs(CmdArgs, options::OPT_fopenmp_cuda_blocks_per_sm_EQ); + if (Args.hasFlag(options::OPT
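As an illustration of when `-fopenmp-optimistic-collapse` is safe, here is a minimal sketch that is not part of the patch. The helper `collapsedCountFitsIn32Bits` and the function `sumCollapsed` are hypothetical names. The check against the signed 32-bit maximum is a conservative assumption, since the commit only says "at most 32 bits".

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Clang): returns true when the collapsed
// trip count n0 * n1 fits in a signed 32-bit integer. The product is
// computed in 64 bits so the check itself cannot overflow.
bool collapsedCountFitsIn32Bits(std::int32_t n0, std::int32_t n1) {
  assert(n0 >= 0 && n1 >= 0 && "trip counts must be non-negative");
  const std::int64_t total = static_cast<std::int64_t>(n0) * n1;
  return total <= std::numeric_limits<std::int32_t>::max();
}

// A loop nest whose bounds are only known at run time. This is the case in
// which the collapsed counter is extended to 64 bits by default.
// Without -fopenmp the pragma is ignored and the loops run sequentially.
long long sumCollapsed(int n0, int n1) {
  long long sum = 0;
#pragma omp parallel for collapse(2) reduction(+ : sum)
  for (int i = 0; i < n0; ++i)
    for (int j = 0; j < n1; ++j)
      sum += static_cast<long long>(i) * n1 + j;
  return sum;
}
```

With bounds such as 100000 x 100000 the product no longer fits in 32 bits, so the flag would not be safe for that nest.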
r350759 - [OpenMP] Avoid remainder operations for loop index values on a collapsed loop nest.
Author: gbercea Date: Wed Jan 9 12:45:26 2019 New Revision: 350759 URL: http://llvm.org/viewvc/llvm-project?rev=350759&view=rev Log: [OpenMP] Avoid remainder operations for loop index values on a collapsed loop nest. Summary: Change the strategy for computing loop index variables after collapsing a loop nest via the collapse clause by replacing the expensive remainder operation with multiplications and additions. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, arphaman, cfe-commits Differential Revision: https://reviews.llvm.org/D56413 Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp cfe/trunk/test/OpenMP/for_codegen.cpp cfe/trunk/test/OpenMP/for_simd_codegen.cpp cfe/trunk/test/OpenMP/parallel_for_simd_codegen.cpp cfe/trunk/test/OpenMP/simd_codegen.cpp Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOpenMP.cpp?rev=350759&r1=350758&r2=350759&view=diff == --- cfe/trunk/lib/Sema/SemaOpenMP.cpp (original) +++ cfe/trunk/lib/Sema/SemaOpenMP.cpp Wed Jan 9 12:45:26 2019 @@ -5579,31 +5579,59 @@ checkOpenMPLoop(OpenMPDirectiveKind DKin Built.Updates.resize(NestedLoopCount); Built.Finals.resize(NestedLoopCount); { -ExprResult Div; -// Go from inner nested loop to outer. -for (int Cnt = NestedLoopCount - 1; Cnt >= 0; --Cnt) { +// We implement the following algorithm for obtaining the +// original loop iteration variable values based on the +// value of the collapsed loop iteration variable IV. +// +// Let n+1 be the number of collapsed loops in the nest. +// Iteration variables (I0, I1, In) +// Iteration counts (N0, N1, ... Nn) +// +// Acc = IV; +// +// To compute Ik for loop k, 0 <= k <= n, generate: +//Prod = N(k+1) * N(k+2) * ... 
* Nn; +//Ik = Acc / Prod; +//Acc -= Ik * Prod; +// +ExprResult Acc = IV; +for (unsigned int Cnt = 0; Cnt < NestedLoopCount; ++Cnt) { LoopIterationSpace &IS = IterSpaces[Cnt]; SourceLocation UpdLoc = IS.IncSrcRange.getBegin(); - // Build: Iter = (IV / Div) % IS.NumIters - // where Div is product of previous iterations' IS.NumIters. ExprResult Iter; - if (Div.isUsable()) { -Iter = -SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Div, IV.get(), Div.get()); - } else { -Iter = IV; -assert((Cnt == (int)NestedLoopCount - 1) && - "unusable div expected on first iteration only"); - } - if (Cnt != 0 && Iter.isUsable()) -Iter = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Rem, Iter.get(), - IS.NumIterations); + // Compute prod + ExprResult Prod = + SemaRef.ActOnIntegerConstant(SourceLocation(), 1).get(); + for (unsigned int K = Cnt+1; K < NestedLoopCount; ++K) +Prod = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Mul, Prod.get(), + IterSpaces[K].NumIterations); + + // Iter = Acc / Prod + // If there is at least one more inner loop to avoid + // multiplication by 1. + if (Cnt + 1 < NestedLoopCount) +Iter = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Div, + Acc.get(), Prod.get()); + else +Iter = Acc; if (!Iter.isUsable()) { HasErrors = true; break; } + // Update Acc: + // Acc -= Iter * Prod + // Check if there is at least one more inner loop to avoid + // multiplication by 1. 
+ if (Cnt + 1 < NestedLoopCount) +Prod = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Mul, + Iter.get(), Prod.get()); + else +Prod = Iter; + Acc = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Sub, + Acc.get(), Prod.get()); + // Build update: IS.CounterVar(Private) = IS.Start + Iter * IS.Step auto *VD = cast(cast(IS.CounterVar)->getDecl()); DeclRefExpr *CounterVar = buildDeclRefExpr( @@ -5632,22 +5660,6 @@ checkOpenMPLoop(OpenMPDirectiveKind DKin break; } - // Build Div for the next iteration: Div <- Div * IS.NumIters - if (Cnt != 0) { -if (Div.isUnset()) - Div = IS.NumIterations; -else - Div = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Mul, Div.get(), - IS.NumIterations); - -// Add parentheses (for debugging purposes only). -if (Div.isUsable()) - Div = tryBuildCapture(SemaRef, Div.get(), Captures); -if (!Div.isUsable()) { - HasErrors = true; - break; -} - } if (!Update.isUsable() || !Final.isUsable()) { HasErrors = true; break; Modified: cfe/trunk/test/OpenMP/for_codegen.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/for_codegen.c
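The algorithm described in the comment above can be sketched on plain integers as follows. This is only an illustration: the patch builds AST expressions through `SemaRef.BuildBinOp` and skips the multiplication by 1 for the innermost loop, whereas the hypothetical function `recoverIndices` below works on run-time values and keeps the loop uniform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Recover the original loop indices (I0, ..., In) from the collapsed
// iteration variable iv, given the iteration counts (N0, ..., Nn).
// Only division, multiplication and subtraction are used; no remainder.
std::vector<std::int64_t> recoverIndices(std::int64_t iv,
                                         const std::vector<std::int64_t> &counts) {
  std::vector<std::int64_t> indices(counts.size());
  std::int64_t acc = iv;
  for (std::size_t k = 0; k < counts.size(); ++k) {
    // Prod = N(k+1) * N(k+2) * ... * Nn; it is 1 for the innermost loop.
    std::int64_t prod = 1;
    for (std::size_t j = k + 1; j < counts.size(); ++j)
      prod *= counts[j];
    indices[k] = acc / prod;   // Ik = Acc / Prod
    acc -= indices[k] * prod;  // Acc -= Ik * Prod
  }
  return indices;
}
```

For iteration counts (2, 3, 4) and IV = 23, the steps are 23 / 12 = 1 with Acc = 11, then 11 / 4 = 2 with Acc = 3, then 3 / 1 = 3, giving indices (1, 2, 3).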
r337015 - [OpenMP] Initialize data sharing stack for SPMD case
Author: gbercea Date: Fri Jul 13 09:18:24 2018 New Revision: 337015 URL: http://llvm.org/viewvc/llvm-project?rev=337015&view=rev Log: [OpenMP] Initialize data sharing stack for SPMD case Summary: In the SPMD case, we need to initialize the data sharing and globalization infrastructure. This covers the case when an SPMD region calls a function in a different compilation unit. Reviewers: ABataev, carlo.bertolli, caomhin Reviewed By: ABataev Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D49188 Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_teams_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=337015&r1=337014&r2=337015&view=diff == --- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original) +++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Fri Jul 13 09:18:24 2018 @@ -81,6 +81,8 @@ enum OpenMPRTLFunctionNVPTX { OMPRTL_NVPTX__kmpc_end_reduce_nowait, /// Call to void __kmpc_data_sharing_init_stack(); OMPRTL_NVPTX__kmpc_data_sharing_init_stack, + /// Call to void __kmpc_data_sharing_init_stack_spmd(); + OMPRTL_NVPTX__kmpc_data_sharing_init_stack_spmd, /// Call to void* __kmpc_data_sharing_push_stack(size_t size, /// int16_t UseSharedMemory); OMPRTL_NVPTX__kmpc_data_sharing_push_stack, @@ -1025,6 +1027,12 @@ void CGOpenMPRuntimeNVPTX::emitSPMDEntry /*RequiresDataSharing=*/Bld.getInt16(1)}; CGF.EmitRuntimeCall( 
createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_spmd_kernel_init), Args); + + // For data sharing, we need to initialize the stack. + CGF.EmitRuntimeCall( + createNVPTXRuntimeFunction( + OMPRTL_NVPTX__kmpc_data_sharing_init_stack_spmd)); + CGF.EmitBranch(ExecuteBB); CGF.EmitBlock(ExecuteBB); @@ -1107,11 +1115,6 @@ void CGOpenMPRuntimeNVPTX::emitWorkerLoo // Wait for parallel work syncCTAThreads(CGF); - // For data sharing, we need to initialize the stack for workers. - CGF.EmitRuntimeCall( - createNVPTXRuntimeFunction( - OMPRTL_NVPTX__kmpc_data_sharing_init_stack)); - Address WorkFn = CGF.CreateDefaultAlignTempAlloca(CGF.Int8PtrTy, /*Name=*/"work_fn"); Address ExecStatus = @@ -1417,6 +1420,13 @@ CGOpenMPRuntimeNVPTX::createNVPTXRuntime RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_data_sharing_init_stack"); break; } + case OMPRTL_NVPTX__kmpc_data_sharing_init_stack_spmd: { +/// Build void __kmpc_data_sharing_init_stack_spmd(); +auto *FnTy = +llvm::FunctionType::get(CGM.VoidTy, llvm::None, /*isVarArg*/ false); +RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_data_sharing_init_stack_spmd"); +break; + } case OMPRTL_NVPTX__kmpc_data_sharing_push_stack: { // Build void *__kmpc_data_sharing_push_stack(size_t size, // int16_t UseSharedMemory); Modified: cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp?rev=337015&r1=337014&r2=337015&view=diff == --- cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp (original) +++ cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp Fri Jul 13 09:18:24 2018 @@ -30,7 +30,7 @@ void test_ds(){ /// = In the worker function = /// // CK1: {{.*}}define internal void @__omp_offloading{{.*}}test_ds{{.*}}_worker() // CK1: call void @llvm.nvvm.barrier0() -// CK1: call void @__kmpc_data_sharing_init_stack +// CK1-NOT: call void @__kmpc_data_sharing_init_stack /// = In the kernel function = /// Modified: cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp?rev=337015&r1=337014&r2=337015&view=diff == --- cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp (original) +++ cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp Fri Jul 13 09:18:24 2018 @@ -60,6 +60,7 @@ int bar(int n){ // CHECK: [[AA:%.+]] = load i16*, i16** [[AA_ADDR]], align // CHECK: [[THREAD_LIMIT:%.+]] = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x() // CHECK: call void @__kmpc_spmd_kernel_init(i32 [[THREAD_LIMIT]], + // CHECK: call void @__kmpc_data_sharing_init_stack_spm
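A rough sketch of the kind of source this change is aimed at, under stated assumptions: the names `scale` and `sumScaled` are hypothetical, and `scale` is defined in the same file only to keep the example self-contained. In the scenario from the summary it would live in a different compilation unit. Whether a given construct is actually compiled in SPMD mode depends on the compiler version and is assumed here.

```cpp
// Stand-in for a function that, in the scenario described in the summary,
// would be defined in a different compilation unit from the target region.
int scale(int x) { return 2 * x; }

// A combined construct of the kind exercised by the modified tests.
// Without -fopenmp the pragma is ignored and the loop runs on the host.
int sumScaled(int n) {
  int total = 0;
#pragma omp target teams distribute parallel for reduction(+ : total) map(tofrom : total)
  for (int i = 0; i < n; ++i)
    total += scale(i);
  return total;
}
```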
[openmp] [clang] [OpenMP][Fix] Fix test initializations (PR #74797)
https://github.com/doru1004 created https://github.com/llvm/llvm-project/pull/74797 Make sure arrays used in test are properly initialized. >From 6712acd1175d1d6d55ce261651a543872a221c9a Mon Sep 17 00:00:00 2001 From: Doru Bercea Date: Wed, 15 Nov 2023 11:07:09 -0500 Subject: [PATCH 1/2] Fix ordering when mapping a struct. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 22 +++ clang/test/OpenMP/map_struct_ordering.cpp | 172 ++ .../struct_mapping_with_pointers.cpp | 114 3 files changed, 308 insertions(+) create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp create mode 100644 openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index d2be8141a3a4b..84a6b36646897 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -7731,10 +7731,30 @@ class MappableExprsHandler { IsImplicit, Mapper, VarRef, ForDeviceAddr); }; +// Sort all map clauses and make sure all the maps containing array +// sections are processed last. 
+llvm::SmallVector SortedMapClauses; for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) continue; + const auto *EI = C->getVarRefs().begin(); + if (*EI && !isa(*EI)) { +SortedMapClauses.emplace_back(C); + } +} +for (const auto *Cl : Clauses) { + const auto *C = dyn_cast(Cl); + if (!C) +continue; + const auto *EI = C->getVarRefs().begin(); + if (*EI && isa(*EI)) { +SortedMapClauses.emplace_back(C); + } +} + +// Iterate over all map clauses: +for (const OMPMapClause *C : SortedMapClauses) { MapKind Kind = Other; if (llvm::is_contained(C->getMapTypeModifiers(), OMPC_MAP_MODIFIER_present)) @@ -7751,6 +7771,7 @@ class MappableExprsHandler { ++EI; } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) @@ -7767,6 +7788,7 @@ class MappableExprsHandler { ++EI; } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) diff --git a/clang/test/OpenMP/map_struct_ordering.cpp b/clang/test/OpenMP/map_struct_ordering.cpp new file mode 100644 index 0..035b39b5b12ab --- /dev/null +++ b/clang/test/OpenMP/map_struct_ordering.cpp @@ -0,0 +1,172 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --include-generated-funcs --replace-value-regex "__omp_offloading_[0-9a-z]+_[0-9a-z]+" --prefix-filecheck-ir-name _ --version 4 + +// RUN: %clang_cc1 -verify -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm %s -o - -Wno-openmp-mapping | FileCheck %s --check-prefix=CHECK + +// expected-no-diagnostics +#ifndef HEADER +#define HEADER + +struct Descriptor { + int *datum; + long int x; + int xi; + long int arr[1][30]; +}; + +int map_struct() { + Descriptor dat = Descriptor(); + dat.xi = 3; + dat.arr[0][0] = 1; + + #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat) + + #pragma omp target + { +dat.xi = 4; +dat.datum[dat.arr[0][0]] = dat.xi; + } + + #pragma omp target exit data map(from: dat) + + return dat.xi; +} + 
+#endif +// CHECK-LABEL: define dso_local noundef signext i32 @_Z10map_structv( +// CHECK-SAME: ) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[DAT:%.*]] = alloca [[STRUCT_DESCRIPTOR:%.*]], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_SIZES:%.*]] = alloca [3 x i64], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS4:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS5:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS6:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[KERNEL_ARGS:%.*]] = alloca [[STRUCT___TGT_KERNEL_ARGUMENTS:%.*]], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS7:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS8:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS9:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:call void @llvm.memset.p0.i64(ptr align 8 [[DAT]], i8 0, i64 264, i1 false) +// CHECK-NEXT:[[XI:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], ptr [[DAT]], i32 0, i32 2 +// CHECK-NEXT:store i32 3, ptr [[XI]], align 8 +// CHECK-NEXT:[[ARR:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], ptr [[DAT]], i32 0, i32 3 +// CHECK-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [1 x [30 x i64]], ptr [[ARR]], i64 0, i64 0 +// CHECK-NEXT:[[ARRAYIDX1:%
[clang] [openmp] [OpenMP][Fix] Fix test initializations (PR #74797)
https://github.com/doru1004 closed https://github.com/llvm/llvm-project/pull/74797
[clang] [openmp] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
https://github.com/doru1004 created https://github.com/llvm/llvm-project/pull/72410 Mapping a struct, if done in the wrong order, can overwrite the pointer attachment details. This fixes this problem. Original failing example: ``` #include #include struct Descriptor { int *datum; long int x; int xi; long int arr[1][30]; }; int main() { Descriptor dat = Descriptor(); dat.datum = (int *)malloc(sizeof(int)*10); dat.xi = 3; dat.arr[0][0] = 1; #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat) #pragma omp target { dat.xi = 4; dat.datum[dat.arr[0][0]] = dat.xi; } #pragma omp target exit data map(from: dat) return 0; } ``` Previous attempt at fixing this: https://github.com/llvm/llvm-project/pull/70821 >From 6f9450b5fa9ff47c35e7498b3a536a218655a9d6 Mon Sep 17 00:00:00 2001 From: Doru Bercea Date: Wed, 15 Nov 2023 11:07:09 -0500 Subject: [PATCH] Fix ordering when mapping a struct. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 44 +-- .../struct_mapping_with_pointers.cpp | 114 ++ 2 files changed, 151 insertions(+), 7 deletions(-) create mode 100644 openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index d2be8141a3a4b31..50518c46152bbaf 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -7731,6 +7731,8 @@ class MappableExprsHandler { IsImplicit, Mapper, VarRef, ForDeviceAddr); }; +// Iterate over all non-section maps first to avoid overwriting pointer +// attachment. for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) @@ -7742,15 +7744,42 @@ class MappableExprsHandler { else if (C->getMapType() == OMPC_MAP_alloc) Kind = Allocs; const auto *EI = C->getVarRefs().begin(); - for (const auto L : C->component_lists()) { -const Expr *E = (C->getMapLoc().isValid()) ? 
*EI : nullptr; -InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(), -C->getMapTypeModifiers(), std::nullopt, -/*ReturnDevicePointer=*/false, C->isImplicit(), std::get<2>(L), -E); -++EI; + if (*EI && !isa(*EI)) { +for (const auto L : C->component_lists()) { + const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr; + InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(), + C->getMapTypeModifiers(), std::nullopt, + /*ReturnDevicePointer=*/false, C->isImplicit(), std::get<2>(L), + E); + ++EI; +} + } +} + +// Process the maps with sections. +for (const auto *Cl : Clauses) { + const auto *C = dyn_cast(Cl); + if (!C) +continue; + MapKind Kind = Other; + if (llvm::is_contained(C->getMapTypeModifiers(), + OMPC_MAP_MODIFIER_present)) +Kind = Present; + else if (C->getMapType() == OMPC_MAP_alloc) +Kind = Allocs; + const auto *EI = C->getVarRefs().begin(); + if (*EI && isa(*EI)) { +for (const auto L : C->component_lists()) { + const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr; + InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(), + C->getMapTypeModifiers(), std::nullopt, + /*ReturnDevicePointer=*/false, C->isImplicit(), std::get<2>(L), + E); + ++EI; +} } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) @@ -7767,6 +7796,7 @@ class MappableExprsHandler { ++EI; } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) diff --git a/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp b/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp new file mode 100644 index 000..c7ce4bade8de9a2 --- /dev/null +++ b/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp @@ -0,0 +1,114 @@ +// clang-format off +// RUN: %libomptarget-compilexx-generic && env LIBOMPTARGET_DEBUG=1 %libomptarget-run-generic 2>&1 | %fcheck-generic +// clang-format on + +#include +#include + +struct Descriptor { + int *datum; + long int x; + int *more_datum; + int xi; + int val_datum, 
val_more_datum; + long int arr[1][30]; + int val_arr; +}; + +int main() { + Descriptor dat = Descriptor(); + dat.datum = (int *)malloc(sizeof(int) * 10); + dat.more_datum = (int *)malloc(sizeof(int) * 20); + dat.xi = 3; + dat.arr[0][0] = 1; + + dat.datum[7] = 7; + dat.more_datum[17] = 17; + + /// The struct is mapped with type 0x0 when the pointer fields are mapped. + /// The struct is also map explicitely by the user. The second mapping by + /// the user must not overwrite
[openmp] [clang] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
https://github.com/doru1004 edited https://github.com/llvm/llvm-project/pull/72410
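The ordering idea behind this PR, processing maps without array sections before maps with array sections while keeping the relative order inside each group, can be sketched as follows. This is an illustration under assumptions, not the patch itself: `MapClause` and `orderForProcessing` are hypothetical stand-ins for Clang's internal types, `std::stable_partition` is swapped in for the two passes over `Clauses` used in the patch, and the null check on the first variable reference is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a map clause; this is not Clang's OMPMapClause.
struct MapClause {
  std::string expr;     // mapped expression, e.g. "dat" or "dat.datum[:10]"
  bool isArraySection;  // true when the first mapped expression is a section
};

// Return the clauses with all non-section maps first and all array-section
// maps last. Relative order within each group is preserved.
std::vector<MapClause> orderForProcessing(std::vector<MapClause> clauses) {
  std::stable_partition(clauses.begin(), clauses.end(),
                        [](const MapClause &c) { return !c.isArraySection; });
  return clauses;
}
```

For the failing example, `map(to: dat.datum[:10]) map(to: dat)` would be processed as `dat` first and `dat.datum[:10]` second.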
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
https://github.com/doru1004 updated https://github.com/llvm/llvm-project/pull/72410 >From ed9d50576cf167b4d9017e55333220d1601d088f Mon Sep 17 00:00:00 2001 From: Doru Bercea Date: Wed, 15 Nov 2023 11:07:09 -0500 Subject: [PATCH] Fix ordering when mapping a struct. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 44 +-- .../struct_mapping_with_pointers.cpp | 114 ++ 2 files changed, 151 insertions(+), 7 deletions(-) create mode 100644 openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index d2be8141a3a4b31..0079530f90f723d 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -7731,6 +7731,8 @@ class MappableExprsHandler { IsImplicit, Mapper, VarRef, ForDeviceAddr); }; +// Iterate over all non-section maps first to avoid overwriting pointer +// attachment. for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) @@ -7742,15 +7744,42 @@ class MappableExprsHandler { else if (C->getMapType() == OMPC_MAP_alloc) Kind = Allocs; const auto *EI = C->getVarRefs().begin(); - for (const auto L : C->component_lists()) { -const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr; -InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(), -C->getMapTypeModifiers(), std::nullopt, -/*ReturnDevicePointer=*/false, C->isImplicit(), std::get<2>(L), -E); -++EI; + if (*EI && !isa(*EI)) { +for (const auto L : C->component_lists()) { + const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr; + InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(), + C->getMapTypeModifiers(), std::nullopt, + /*ReturnDevicePointer=*/false, C->isImplicit(), + std::get<2>(L), E); + ++EI; +} + } +} + +// Process the maps with sections. 
+for (const auto *Cl : Clauses) { + const auto *C = dyn_cast(Cl); + if (!C) +continue; + MapKind Kind = Other; + if (llvm::is_contained(C->getMapTypeModifiers(), + OMPC_MAP_MODIFIER_present)) +Kind = Present; + else if (C->getMapType() == OMPC_MAP_alloc) +Kind = Allocs; + const auto *EI = C->getVarRefs().begin(); + if (*EI && isa(*EI)) { +for (const auto L : C->component_lists()) { + const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr; + InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(), + C->getMapTypeModifiers(), std::nullopt, + /*ReturnDevicePointer=*/false, C->isImplicit(), + std::get<2>(L), E); + ++EI; +} } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) @@ -7767,6 +7796,7 @@ class MappableExprsHandler { ++EI; } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) diff --git a/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp b/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp new file mode 100644 index 000..c7ce4bade8de9a2 --- /dev/null +++ b/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp @@ -0,0 +1,114 @@ +// clang-format off +// RUN: %libomptarget-compilexx-generic && env LIBOMPTARGET_DEBUG=1 %libomptarget-run-generic 2>&1 | %fcheck-generic +// clang-format on + +#include +#include + +struct Descriptor { + int *datum; + long int x; + int *more_datum; + int xi; + int val_datum, val_more_datum; + long int arr[1][30]; + int val_arr; +}; + +int main() { + Descriptor dat = Descriptor(); + dat.datum = (int *)malloc(sizeof(int) * 10); + dat.more_datum = (int *)malloc(sizeof(int) * 20); + dat.xi = 3; + dat.arr[0][0] = 1; + + dat.datum[7] = 7; + dat.more_datum[17] = 17; + + /// The struct is mapped with type 0x0 when the pointer fields are mapped. + /// The struct is also map explicitely by the user. 
The second mapping by + /// the user must not overwrite the mapping set up for the pointer fields + /// when mapping the struct happens after the mapping of the pointers. + + // clang-format off + // CHECK: Libomptarget --> Entry 0: Base=[[DAT_HST_PTR_BASE:0x.*]], Begin=[[DAT_HST_PTR_BASE]], Size=288, Type=0x0, Name=unknown + // CHECK: Libomptarget --> Entry 1: Base=[[DAT_HST_PTR_BASE]], Begin=[[DAT_HST_PTR_BASE]], Size=288, Type=0x10001, Name=unknown + // CHECK: Libomptarget --> Entry 2: Base=[[DAT_HST_PTR_BASE]], Begin=[[DATUM_HST_PTR_BASE:0x.*]], Size=40, Type=0x10011, Name=unknown + // CHECK: Libomptarget --> Entry 3: Base=[[MORE_DATUM_HST_PTR_BASE:0x.*]], Begin=[[MORE_DATUM_HST_PTR_BEGIN:0x.*]], Si
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
doru1004 wrote:

> This being in clang instead seems like a good change. Are there no CodeGen
> tests changed? We should add one if so. Probably just take your
> `libomptarget` test and run `update_cc_test_checks` on it with the arguments
> found in other test files.

No CodeGen tests changed. Happy to add one, no problem.

https://github.com/llvm/llvm-project/pull/72410
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
https://github.com/doru1004 updated https://github.com/llvm/llvm-project/pull/72410 >From a16ffab67e8f8134fd943761da730c120bbae88d Mon Sep 17 00:00:00 2001 From: Doru Bercea Date: Wed, 15 Nov 2023 11:07:09 -0500 Subject: [PATCH] Fix ordering when mapping a struct. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 44 - clang/test/OpenMP/map_struct_ordering.cpp | 172 ++ .../struct_mapping_with_pointers.cpp | 114 3 files changed, 323 insertions(+), 7 deletions(-) create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp create mode 100644 openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index d2be8141a3a4b31..0079530f90f723d 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -7731,6 +7731,8 @@ class MappableExprsHandler { IsImplicit, Mapper, VarRef, ForDeviceAddr); }; +// Iterate over all non-section maps first to avoid overwriting pointer +// attachment. for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) @@ -7742,15 +7744,42 @@ class MappableExprsHandler { else if (C->getMapType() == OMPC_MAP_alloc) Kind = Allocs; const auto *EI = C->getVarRefs().begin(); - for (const auto L : C->component_lists()) { -const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr; -InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(), -C->getMapTypeModifiers(), std::nullopt, -/*ReturnDevicePointer=*/false, C->isImplicit(), std::get<2>(L), -E); -++EI; + if (*EI && !isa(*EI)) { +for (const auto L : C->component_lists()) { + const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr; + InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(), + C->getMapTypeModifiers(), std::nullopt, + /*ReturnDevicePointer=*/false, C->isImplicit(), + std::get<2>(L), E); + ++EI; +} + } +} + +// Process the maps with sections. 
+for (const auto *Cl : Clauses) { + const auto *C = dyn_cast(Cl); + if (!C) +continue; + MapKind Kind = Other; + if (llvm::is_contained(C->getMapTypeModifiers(), + OMPC_MAP_MODIFIER_present)) +Kind = Present; + else if (C->getMapType() == OMPC_MAP_alloc) +Kind = Allocs; + const auto *EI = C->getVarRefs().begin(); + if (*EI && isa(*EI)) { +for (const auto L : C->component_lists()) { + const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr; + InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(), + C->getMapTypeModifiers(), std::nullopt, + /*ReturnDevicePointer=*/false, C->isImplicit(), + std::get<2>(L), E); + ++EI; +} } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) @@ -7767,6 +7796,7 @@ class MappableExprsHandler { ++EI; } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) diff --git a/clang/test/OpenMP/map_struct_ordering.cpp b/clang/test/OpenMP/map_struct_ordering.cpp new file mode 100644 index 000..035b39b5b12ab4a --- /dev/null +++ b/clang/test/OpenMP/map_struct_ordering.cpp @@ -0,0 +1,172 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --include-generated-funcs --replace-value-regex "__omp_offloading_[0-9a-z]+_[0-9a-z]+" --prefix-filecheck-ir-name _ --version 4 + +// RUN: %clang_cc1 -verify -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm %s -o - -Wno-openmp-mapping | FileCheck %s --check-prefix=CHECK + +// expected-no-diagnostics +#ifndef HEADER +#define HEADER + +struct Descriptor { + int *datum; + long int x; + int xi; + long int arr[1][30]; +}; + +int map_struct() { + Descriptor dat = Descriptor(); + dat.xi = 3; + dat.arr[0][0] = 1; + + #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat) + + #pragma omp target + { +dat.xi = 4; +dat.datum[dat.arr[0][0]] = dat.xi; + } + + #pragma omp target exit data map(from: dat) + + return dat.xi; +} + +#endif +// CHECK-LABEL: define 
dso_local noundef signext i32 @_Z10map_structv( +// CHECK-SAME: ) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[DAT:%.*]] = alloca [[STRUCT_DESCRIPTOR:%.*]], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_SIZES:%.*]] = alloca [3 x i64], a
[openmp] [clang] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
doru1004 wrote:

> This being in clang instead seems like a good change. Are there no CodeGen
> tests changed? We should add one if so. Probably just take your
> `libomptarget` test and run `update_cc_test_checks` on it with the arguments
> found in other test files.

Just added the test.

https://github.com/llvm/llvm-project/pull/72410
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
https://github.com/doru1004 updated https://github.com/llvm/llvm-project/pull/72410 >From d29229095203dccdee5ded18c0df0474e006ad53 Mon Sep 17 00:00:00 2001 From: Doru Bercea Date: Wed, 15 Nov 2023 11:07:09 -0500 Subject: [PATCH] Fix ordering when mapping a struct. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 27 ++- clang/test/OpenMP/map_struct_ordering.cpp | 172 ++ .../struct_mapping_with_pointers.cpp | 114 3 files changed, 311 insertions(+), 2 deletions(-) create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp create mode 100644 openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index d2be8141a3a4b31..a39115300fa641e 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -7731,10 +7731,31 @@ class MappableExprsHandler { IsImplicit, Mapper, VarRef, ForDeviceAddr); }; +// Sort all map clauses and make sure all the maps containing array +// sections are processed last. +llvm::SmallVector SortedMapClauses; for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) continue; + const auto *EI = C->getVarRefs().begin(); + if (*EI && !isa(*EI)) { +SortedMapClauses.emplace_back(C); + } +} +for (const auto *Cl : Clauses) { + const auto *C = dyn_cast(Cl); + if (!C) +continue; + const auto *EI = C->getVarRefs().begin(); + if (*EI && isa(*EI)) { +SortedMapClauses.emplace_back(C); + } +} + +// Iterate over all non-section maps first to avoid overwriting pointer +// attachment. +for (const OMPMapClause *C : SortedMapClauses) { MapKind Kind = Other; if (llvm::is_contained(C->getMapTypeModifiers(), OMPC_MAP_MODIFIER_present)) @@ -7746,11 +7767,12 @@ class MappableExprsHandler { const Expr *E = (C->getMapLoc().isValid()) ? 
*EI : nullptr; InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(), C->getMapTypeModifiers(), std::nullopt, -/*ReturnDevicePointer=*/false, C->isImplicit(), std::get<2>(L), -E); +/*ReturnDevicePointer=*/false, C->isImplicit(), +std::get<2>(L), E); ++EI; } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) @@ -7767,6 +7789,7 @@ class MappableExprsHandler { ++EI; } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) diff --git a/clang/test/OpenMP/map_struct_ordering.cpp b/clang/test/OpenMP/map_struct_ordering.cpp new file mode 100644 index 000..035b39b5b12ab4a --- /dev/null +++ b/clang/test/OpenMP/map_struct_ordering.cpp @@ -0,0 +1,172 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --include-generated-funcs --replace-value-regex "__omp_offloading_[0-9a-z]+_[0-9a-z]+" --prefix-filecheck-ir-name _ --version 4 + +// RUN: %clang_cc1 -verify -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm %s -o - -Wno-openmp-mapping | FileCheck %s --check-prefix=CHECK + +// expected-no-diagnostics +#ifndef HEADER +#define HEADER + +struct Descriptor { + int *datum; + long int x; + int xi; + long int arr[1][30]; +}; + +int map_struct() { + Descriptor dat = Descriptor(); + dat.xi = 3; + dat.arr[0][0] = 1; + + #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat) + + #pragma omp target + { +dat.xi = 4; +dat.datum[dat.arr[0][0]] = dat.xi; + } + + #pragma omp target exit data map(from: dat) + + return dat.xi; +} + +#endif +// CHECK-LABEL: define dso_local noundef signext i32 @_Z10map_structv( +// CHECK-SAME: ) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[DAT:%.*]] = alloca [[STRUCT_DESCRIPTOR:%.*]], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS:%.*]] = alloca [3 x 
ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_SIZES:%.*]] = alloca [3 x i64], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS4:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS5:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS6:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[KERNEL_ARGS:%.*]] = alloca [[STRUCT___TGT_KERNEL_ARGUMENTS:%.*]], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS7:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS8:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS9:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:call void @llvm.memset.p0.i64(ptr align 8 [[DAT]], i
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
https://github.com/doru1004 updated https://github.com/llvm/llvm-project/pull/72410 >From 2ea93a7b4841671dc12ee39a25a66c536d92d83f Mon Sep 17 00:00:00 2001 From: Doru Bercea Date: Wed, 15 Nov 2023 11:07:09 -0500 Subject: [PATCH] Fix ordering when mapping a struct. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 23 +++ clang/test/OpenMP/map_struct_ordering.cpp | 172 ++ .../struct_mapping_with_pointers.cpp | 114 3 files changed, 309 insertions(+) create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp create mode 100644 openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index d2be8141a3a4b31..b4b8794947687c0 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -7731,10 +7731,31 @@ class MappableExprsHandler { IsImplicit, Mapper, VarRef, ForDeviceAddr); }; +// Sort all map clauses and make sure all the maps containing array +// sections are processed last. +llvm::SmallVector SortedMapClauses; for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) continue; + const auto *EI = C->getVarRefs().begin(); + if (*EI && !isa(*EI)) { +SortedMapClauses.emplace_back(C); + } +} +for (const auto *Cl : Clauses) { + const auto *C = dyn_cast(Cl); + if (!C) +continue; + const auto *EI = C->getVarRefs().begin(); + if (*EI && isa(*EI)) { +SortedMapClauses.emplace_back(C); + } +} + +// Iterate over all non-section maps first to avoid overwriting pointer +// attachment. 
+for (const OMPMapClause *C : SortedMapClauses) { MapKind Kind = Other; if (llvm::is_contained(C->getMapTypeModifiers(), OMPC_MAP_MODIFIER_present)) @@ -7751,6 +7772,7 @@ class MappableExprsHandler { ++EI; } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) @@ -7767,6 +7789,7 @@ class MappableExprsHandler { ++EI; } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) diff --git a/clang/test/OpenMP/map_struct_ordering.cpp b/clang/test/OpenMP/map_struct_ordering.cpp new file mode 100644 index 000..035b39b5b12ab4a --- /dev/null +++ b/clang/test/OpenMP/map_struct_ordering.cpp @@ -0,0 +1,172 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --include-generated-funcs --replace-value-regex "__omp_offloading_[0-9a-z]+_[0-9a-z]+" --prefix-filecheck-ir-name _ --version 4 + +// RUN: %clang_cc1 -verify -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm %s -o - -Wno-openmp-mapping | FileCheck %s --check-prefix=CHECK + +// expected-no-diagnostics +#ifndef HEADER +#define HEADER + +struct Descriptor { + int *datum; + long int x; + int xi; + long int arr[1][30]; +}; + +int map_struct() { + Descriptor dat = Descriptor(); + dat.xi = 3; + dat.arr[0][0] = 1; + + #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat) + + #pragma omp target + { +dat.xi = 4; +dat.datum[dat.arr[0][0]] = dat.xi; + } + + #pragma omp target exit data map(from: dat) + + return dat.xi; +} + +#endif +// CHECK-LABEL: define dso_local noundef signext i32 @_Z10map_structv( +// CHECK-SAME: ) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[DAT:%.*]] = alloca [[STRUCT_DESCRIPTOR:%.*]], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_SIZES:%.*]] = 
alloca [3 x i64], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS4:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS5:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS6:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[KERNEL_ARGS:%.*]] = alloca [[STRUCT___TGT_KERNEL_ARGUMENTS:%.*]], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS7:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS8:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS9:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:call void @llvm.memset.p0.i64(ptr align 8 [[DAT]], i8 0, i64 264, i1 false) +// CHECK-NEXT:[[XI:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], ptr [[DAT]], i32 0, i32 2 +// CHECK-NEXT:store i32 3, ptr [[XI]], align 8 +// CHECK-NEXT:[[ARR:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], ptr [[DAT]], i32 0, i32 3 +// CHECK-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [1 x [30 x i64]], ptr [[ARR]], i64 0, i64 0 +// CHECK-NEXT:[[ARRAY
[openmp] [clang] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
https://github.com/doru1004 updated https://github.com/llvm/llvm-project/pull/72410 >From 6712acd1175d1d6d55ce261651a543872a221c9a Mon Sep 17 00:00:00 2001 From: Doru Bercea Date: Wed, 15 Nov 2023 11:07:09 -0500 Subject: [PATCH] Fix ordering when mapping a struct. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 22 +++ clang/test/OpenMP/map_struct_ordering.cpp | 172 ++ .../struct_mapping_with_pointers.cpp | 114 3 files changed, 308 insertions(+) create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp create mode 100644 openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index d2be8141a3a4b31..84a6b36646897d7 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -7731,10 +7731,30 @@ class MappableExprsHandler { IsImplicit, Mapper, VarRef, ForDeviceAddr); }; +// Sort all map clauses and make sure all the maps containing array +// sections are processed last. 
+llvm::SmallVector SortedMapClauses; for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) continue; + const auto *EI = C->getVarRefs().begin(); + if (*EI && !isa(*EI)) { +SortedMapClauses.emplace_back(C); + } +} +for (const auto *Cl : Clauses) { + const auto *C = dyn_cast(Cl); + if (!C) +continue; + const auto *EI = C->getVarRefs().begin(); + if (*EI && isa(*EI)) { +SortedMapClauses.emplace_back(C); + } +} + +// Iterate over all map clauses: +for (const OMPMapClause *C : SortedMapClauses) { MapKind Kind = Other; if (llvm::is_contained(C->getMapTypeModifiers(), OMPC_MAP_MODIFIER_present)) @@ -7751,6 +7771,7 @@ class MappableExprsHandler { ++EI; } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) @@ -7767,6 +7788,7 @@ class MappableExprsHandler { ++EI; } } + for (const auto *Cl : Clauses) { const auto *C = dyn_cast(Cl); if (!C) diff --git a/clang/test/OpenMP/map_struct_ordering.cpp b/clang/test/OpenMP/map_struct_ordering.cpp new file mode 100644 index 000..035b39b5b12ab4a --- /dev/null +++ b/clang/test/OpenMP/map_struct_ordering.cpp @@ -0,0 +1,172 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --include-generated-funcs --replace-value-regex "__omp_offloading_[0-9a-z]+_[0-9a-z]+" --prefix-filecheck-ir-name _ --version 4 + +// RUN: %clang_cc1 -verify -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm %s -o - -Wno-openmp-mapping | FileCheck %s --check-prefix=CHECK + +// expected-no-diagnostics +#ifndef HEADER +#define HEADER + +struct Descriptor { + int *datum; + long int x; + int xi; + long int arr[1][30]; +}; + +int map_struct() { + Descriptor dat = Descriptor(); + dat.xi = 3; + dat.arr[0][0] = 1; + + #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat) + + #pragma omp target + { +dat.xi = 4; +dat.datum[dat.arr[0][0]] = dat.xi; + } + + #pragma omp target exit data map(from: dat) + + return dat.xi; +} + 
+#endif +// CHECK-LABEL: define dso_local noundef signext i32 @_Z10map_structv( +// CHECK-SAME: ) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[DAT:%.*]] = alloca [[STRUCT_DESCRIPTOR:%.*]], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS:%.*]] = alloca [3 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_SIZES:%.*]] = alloca [3 x i64], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS4:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS5:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS6:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[KERNEL_ARGS:%.*]] = alloca [[STRUCT___TGT_KERNEL_ARGUMENTS:%.*]], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS7:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS8:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS9:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:call void @llvm.memset.p0.i64(ptr align 8 [[DAT]], i8 0, i64 264, i1 false) +// CHECK-NEXT:[[XI:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], ptr [[DAT]], i32 0, i32 2 +// CHECK-NEXT:store i32 3, ptr [[XI]], align 8 +// CHECK-NEXT:[[ARR:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], ptr [[DAT]], i32 0, i32 3 +// CHECK-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [1 x [30 x i64]], ptr [[ARR]], i64 0, i64 0 +// CHECK-NEXT:[[ARRAYIDX1:%.*]] = getelementptr inbounds [30 x i64], ptr [[ARRA
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
 IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

I don't understand the question.

https://github.com/llvm/llvm-project/pull/72410
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
 IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

Are you asking what the sorting criterion is?

https://github.com/llvm/llvm-project/pull/72410
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
 IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

Ah yes, so I just moved all the maps containing sections to the end of the clause list. I want those maps to happen last, after all the structs and other maps have happened.

https://github.com/llvm/llvm-project/pull/72410
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
 IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

It's a form of sorting, but it's more like a split between the section-containing maps and the ones that contain no sections.

https://github.com/llvm/llvm-project/pull/72410
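The split described in this comment can be sketched outside of clang as a stable partition. This is only an illustration under assumptions: `MapClauseSketch` and `sectionsLast` are hypothetical names, not clang types or functions, and the sketch does not model the `*EI` null check from the patch's two filtering loops.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a map clause; this is NOT clang's OMPMapClause.
struct MapClauseSketch {
  std::string Expr;     // mapped expression, e.g. "dat" or "dat.datum[:10]"
  bool HasArraySection; // true when the mapped expression is an array section
};

// Move section-containing clauses to the end while preserving the relative
// order inside each group. The two filtering loops in the patch have the
// same effect on the order in which clauses are later processed.
std::vector<MapClauseSketch>
sectionsLast(std::vector<MapClauseSketch> Clauses) {
  std::stable_partition(
      Clauses.begin(), Clauses.end(),
      [](const MapClauseSketch &C) { return !C.HasArraySection; });
  return Clauses;
}
```

For the clause list from the test case, `map(to: dat.datum[:10]) map(to: dat)`, this ordering would process `dat` before `dat.datum[:10]`.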
[openmp] [clang] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
 IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

@alexey-bataev I agree it's not ideal. The problem is related to the order in which the clauses are processed. We cannot process the base struct after we have processed an array section inside the struct.

https://github.com/llvm/llvm-project/pull/72410
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
 IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

At runtime, if things happen in the wrong order, the processing of the base struct overwrites the pointer attachment for the array.

https://github.com/llvm/llvm-project/pull/72410
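A toy model may make the overwrite easier to see. It is a sketch under strong assumptions: it treats "map the struct" as a plain copy of the host struct into a device copy and "attach" as storing the device buffer address in the device copy. The real libomptarget runtime tracks mapping tables and reference counts, none of which are modelled here, and `Step` and `devicePointerAttached` are hypothetical names.

```cpp
#include <vector>

// Reduced version of the struct from the test case.
struct Descriptor {
  int *datum;
  int xi;
};

// The two runtime actions this toy model distinguishes.
enum class Step { MapStruct, AttachSection };

// Runs the steps in the given order and reports whether the device copy of
// the struct still points at the device buffer afterwards.
bool devicePointerAttached(const std::vector<Step> &Order) {
  int HostBuf[10] = {};
  int DeviceBuf[10] = {};
  Descriptor Host{HostBuf, 3};
  Descriptor Device{nullptr, 0};
  for (Step S : Order) {
    if (S == Step::MapStruct)
      Device = Host; // copying the whole struct brings the host pointer along
    else
      Device.datum = DeviceBuf; // pointer attachment for the array section
  }
  return Device.datum == DeviceBuf;
}
```

In this model, `{MapStruct, AttachSection}` leaves the device pointer attached, while `{AttachSection, MapStruct}` replaces it with the host address, which matches the failure described in the comment.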
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
 IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

Implicit ordering isn't working for the example above; please see the code. The entries reach the runtime in the wrong order, and the problem starts here.

https://github.com/llvm/llvm-project/pull/72410
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
 IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

Well, I don't see anything wrong other than the order, and the order comes from how the user wrote the code, so I am not sure how else to fix it.

https://github.com/llvm/llvm-project/pull/72410
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
 IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

I can't find any "bug" in the existing code. It works as intended. The problem is that it doesn't handle these types of situations and I don't see how else to fix an ordering problem other than by re-ordering. If you have a different solution in mind please let me know.

https://github.com/llvm/llvm-project/pull/72410
[openmp] [clang] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
 IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

@alexey-bataev I have looked at the code again and I really can't see another solution to this problem. If you have a different fix in mind please let me know.

https://github.com/llvm/llvm-project/pull/72410
[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)
https://github.com/doru1004 created https://github.com/llvm/llvm-project/pull/75642

Fix mapping of structs to device. The following example fails:

```
#include <stdio.h>
#include <stdlib.h>

struct Descriptor {
  int *datum;
  long int x;
  int xi;
  long int arr[1][30];
};

int main() {
  Descriptor dat = Descriptor();
  dat.datum = (int *)malloc(sizeof(int)*10);
  dat.xi = 3;
  dat.arr[0][0] = 1;

  #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat)

  #pragma omp target
  {
    dat.xi = 4;
    dat.datum[dat.arr[0][0]] = dat.xi;
  }

  #pragma omp target exit data map(from: dat)

  return 0;
}
```

This is a rework of the previous attempt: https://github.com/llvm/llvm-project/pull/72410

>From 2dc40b67e55985de4e9e89758d6c65eb73faac02 Mon Sep 17 00:00:00 2001
From: Doru Bercea
Date: Fri, 15 Dec 2023 10:22:38 -0500
Subject: [PATCH] Fix mapping of structs to device.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 147 +++
 clang/test/OpenMP/map_struct_ordering.cpp | 172 ++
 .../struct_mapping_with_pointers.cpp | 115
 3 files changed, 401 insertions(+), 33 deletions(-)
 create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp
 create mode 100644 openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 7f7e6f53066644..02f5d8fca7090c 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -6811,8 +6811,10 @@ class MappableExprsHandler {
 OpenMPMapClauseKind MapType, ArrayRef MapModifiers,
 ArrayRef MotionModifiers,
 OMPClauseMappableExprCommon::MappableExprComponentListRef Components,
- MapCombinedInfoTy &CombinedInfo, StructRangeInfoTy &PartialStruct,
- bool IsFirstComponentList, bool IsImplicit,
+ MapCombinedInfoTy &CombinedInfo,
+ MapCombinedInfoTy &StructBaseCombinedInfo,
+ StructRangeInfoTy &PartialStruct, bool IsFirstComponentList,
+ bool IsImplicit, bool GenerateAllInfoForClauses,
 const ValueDecl *Mapper = nullptr, bool ForDeviceAddr = false,
 const ValueDecl *BaseDecl =
nullptr, const Expr *MapExpr = nullptr, ArrayRef @@ -7098,6 +7100,25 @@ class MappableExprsHandler { bool IsNonContiguous = CombinedInfo.NonContigInfo.IsNonContiguous; bool IsPrevMemberReference = false; +// We need to check if we will be encountering any MEs. If we do not +// encounter any ME expression it means we will be mapping the whole struct. +// In that case we need to skip adding an entry for the struct to the +// CombinedInfo list and instead add an entry to the StructBaseCombinedInfo +// list only when generating all info for clauses. +bool IsMappingWholeStruct = true; +if (!GenerateAllInfoForClauses) { + IsMappingWholeStruct = false; +} else { + for (auto TempI = I; TempI != CE; ++TempI) { +const MemberExpr *PossibleME = +dyn_cast(TempI->getAssociatedExpression()); +if (PossibleME) { + IsMappingWholeStruct = false; + break; +} + } +} + for (; I != CE; ++I) { // If the current component is member of a struct (parent struct) mark it. if (!EncounteredME) { @@ -7317,21 +7338,41 @@ class MappableExprsHandler { break; } llvm::Value *Size = getExprTypeSize(I->getAssociatedExpression()); +// Skip adding an entry in the CurInfo of this combined entry if the +// whole struct is currently being mapped. The struct needs to be added +// in the first position before any data internal to the struct is being +// mapped. if (!IsMemberPointerOrAddr || (Next == CE && MapType != OMPC_MAP_unknown)) { - CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr); - CombinedInfo.BasePointers.push_back(BP.getPointer()); - CombinedInfo.DevicePtrDecls.push_back(nullptr); - CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None); - CombinedInfo.Pointers.push_back(LB.getPointer()); - CombinedInfo.Sizes.push_back( - CGF.Builder.CreateIntCast(Size, CGF.Int64Ty, /*isSigned=*/true)); - CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? 
DimSize -: 1); + if (!IsMappingWholeStruct) { +CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr); +CombinedInfo.BasePointers.push_back(BP.getPointer()); +CombinedInfo.DevicePtrDecls.push_back(nullptr); +CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None); +CombinedInfo.Pointers.push_back(LB.getPointer()); +CombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast( +Size, CGF.Int64Ty, /*isSigned=*/true)); +CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize
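The whole-struct check in the patch above can be sketched without clang's AST types. This is a minimal illustration under assumptions: `ComponentKind`, `MapRequest`, and `emitOrder` are hypothetical names, and the way the two lists are concatenated only reflects the patch comment that the struct has to come first, not the exact code in `CGOpenMPRuntime.cpp`.

```cpp
#include <string>
#include <vector>

// Hypothetical classification of the components of a mapped expression.
enum class ComponentKind { DeclRef, Member, ArraySection };

// Mirrors the check in the patch: the whole struct is being mapped only when
// all info for clauses is being generated and no component is a member
// expression.
bool isMappingWholeStruct(const std::vector<ComponentKind> &Components,
                          bool GenerateAllInfoForClauses) {
  if (!GenerateAllInfoForClauses)
    return false;
  for (ComponentKind K : Components)
    if (K == ComponentKind::Member)
      return false;
  return true;
}

// Hypothetical description of one mapped expression.
struct MapRequest {
  std::string Name;
  std::vector<ComponentKind> Components;
};

// Collect whole-struct entries in a separate list and emit them ahead of the
// remaining entries (assumed combination step).
std::vector<std::string> emitOrder(const std::vector<MapRequest> &Requests) {
  std::vector<std::string> StructBase;
  std::vector<std::string> Combined;
  for (const MapRequest &R : Requests) {
    if (isMappingWholeStruct(R.Components, /*GenerateAllInfoForClauses=*/true))
      StructBase.push_back(R.Name);
    else
      Combined.push_back(R.Name);
  }
  StructBase.insert(StructBase.end(), Combined.begin(), Combined.end());
  return StructBase;
}
```

With the two maps from the failing example, `dat.datum[:10]` contains a member component and stays in the regular list, while `dat` goes to the struct-base list and is emitted first regardless of the order in which the user wrote the clauses.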
[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)
https://github.com/doru1004 updated https://github.com/llvm/llvm-project/pull/75642 >From ae6cf04a149f00f52c1da8e7b9c1ca3af5393f99 Mon Sep 17 00:00:00 2001 From: Doru Bercea Date: Fri, 15 Dec 2023 10:22:38 -0500 Subject: [PATCH] Fix mapping of structs to device. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 147 +++ clang/test/OpenMP/map_struct_ordering.cpp | 172 ++ .../struct_mapping_with_pointers.cpp | 114 3 files changed, 400 insertions(+), 33 deletions(-) create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp create mode 100644 openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index 7f7e6f53066644..02f5d8fca7090c 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -6811,8 +6811,10 @@ class MappableExprsHandler { OpenMPMapClauseKind MapType, ArrayRef MapModifiers, ArrayRef MotionModifiers, OMPClauseMappableExprCommon::MappableExprComponentListRef Components, - MapCombinedInfoTy &CombinedInfo, StructRangeInfoTy &PartialStruct, - bool IsFirstComponentList, bool IsImplicit, + MapCombinedInfoTy &CombinedInfo, + MapCombinedInfoTy &StructBaseCombinedInfo, + StructRangeInfoTy &PartialStruct, bool IsFirstComponentList, + bool IsImplicit, bool GenerateAllInfoForClauses, const ValueDecl *Mapper = nullptr, bool ForDeviceAddr = false, const ValueDecl *BaseDecl = nullptr, const Expr *MapExpr = nullptr, ArrayRef @@ -7098,6 +7100,25 @@ class MappableExprsHandler { bool IsNonContiguous = CombinedInfo.NonContigInfo.IsNonContiguous; bool IsPrevMemberReference = false; +// We need to check if we will be encountering any MEs. If we do not +// encounter any ME expression it means we will be mapping the whole struct. +// In that case we need to skip adding an entry for the struct to the +// CombinedInfo list and instead add an entry to the StructBaseCombinedInfo +// list only when generating all info for clauses. 
+bool IsMappingWholeStruct = true; +if (!GenerateAllInfoForClauses) { + IsMappingWholeStruct = false; +} else { + for (auto TempI = I; TempI != CE; ++TempI) { +const MemberExpr *PossibleME = +dyn_cast(TempI->getAssociatedExpression()); +if (PossibleME) { + IsMappingWholeStruct = false; + break; +} + } +} + for (; I != CE; ++I) { // If the current component is member of a struct (parent struct) mark it. if (!EncounteredME) { @@ -7317,21 +7338,41 @@ class MappableExprsHandler { break; } llvm::Value *Size = getExprTypeSize(I->getAssociatedExpression()); +// Skip adding an entry in the CurInfo of this combined entry if the +// whole struct is currently being mapped. The struct needs to be added +// in the first position before any data internal to the struct is being +// mapped. if (!IsMemberPointerOrAddr || (Next == CE && MapType != OMPC_MAP_unknown)) { - CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr); - CombinedInfo.BasePointers.push_back(BP.getPointer()); - CombinedInfo.DevicePtrDecls.push_back(nullptr); - CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None); - CombinedInfo.Pointers.push_back(LB.getPointer()); - CombinedInfo.Sizes.push_back( - CGF.Builder.CreateIntCast(Size, CGF.Int64Ty, /*isSigned=*/true)); - CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize -: 1); + if (!IsMappingWholeStruct) { +CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr); +CombinedInfo.BasePointers.push_back(BP.getPointer()); +CombinedInfo.DevicePtrDecls.push_back(nullptr); +CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None); +CombinedInfo.Pointers.push_back(LB.getPointer()); +CombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast( +Size, CGF.Int64Ty, /*isSigned=*/true)); +CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? 
DimSize + : 1); + } else { +StructBaseCombinedInfo.Exprs.emplace_back(MapDecl, MapExpr); +StructBaseCombinedInfo.BasePointers.push_back(BP.getPointer()); +StructBaseCombinedInfo.DevicePtrDecls.push_back(nullptr); + StructBaseCombinedInfo.DevicePointers.push_back(DeviceInfoTy::None); +StructBaseCombinedInfo.Pointers.push_back(LB.getPointer()); +StructBaseCombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast( +Size, CGF.Int64Ty, /*isSigned=*/true)); +
[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)
https://github.com/doru1004 updated https://github.com/llvm/llvm-project/pull/75642 >From e0e1f5e7bb2f95f2568b5dd647b883f4740bcafd Mon Sep 17 00:00:00 2001 From: Doru Bercea Date: Fri, 15 Dec 2023 10:22:38 -0500 Subject: [PATCH] Fix mapping of structs to device. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 146 +++ clang/test/OpenMP/map_struct_ordering.cpp | 172 ++ .../struct_mapping_with_pointers.cpp | 114 3 files changed, 399 insertions(+), 33 deletions(-) create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp create mode 100644 openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index 7f7e6f53066644..350e7108b8d5a7 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -6811,8 +6811,10 @@ class MappableExprsHandler { OpenMPMapClauseKind MapType, ArrayRef MapModifiers, ArrayRef MotionModifiers, OMPClauseMappableExprCommon::MappableExprComponentListRef Components, - MapCombinedInfoTy &CombinedInfo, StructRangeInfoTy &PartialStruct, - bool IsFirstComponentList, bool IsImplicit, + MapCombinedInfoTy &CombinedInfo, + MapCombinedInfoTy &StructBaseCombinedInfo, + StructRangeInfoTy &PartialStruct, bool IsFirstComponentList, + bool IsImplicit, bool GenerateAllInfoForClauses, const ValueDecl *Mapper = nullptr, bool ForDeviceAddr = false, const ValueDecl *BaseDecl = nullptr, const Expr *MapExpr = nullptr, ArrayRef @@ -7098,6 +7100,25 @@ class MappableExprsHandler { bool IsNonContiguous = CombinedInfo.NonContigInfo.IsNonContiguous; bool IsPrevMemberReference = false; +// We need to check if we will be encountering any MEs. If we do not +// encounter any ME expression it means we will be mapping the whole struct. +// In that case we need to skip adding an entry for the struct to the +// CombinedInfo list and instead add an entry to the StructBaseCombinedInfo +// list only when generating all info for clauses. 
+bool IsMappingWholeStruct = true; +if (!GenerateAllInfoForClauses) { + IsMappingWholeStruct = false; +} else { + for (auto TempI = I; TempI != CE; ++TempI) { +const MemberExpr *PossibleME = +dyn_cast(TempI->getAssociatedExpression()); +if (PossibleME) { + IsMappingWholeStruct = false; + break; +} + } +} + for (; I != CE; ++I) { // If the current component is member of a struct (parent struct) mark it. if (!EncounteredME) { @@ -7317,21 +7338,41 @@ class MappableExprsHandler { break; } llvm::Value *Size = getExprTypeSize(I->getAssociatedExpression()); +// Skip adding an entry in the CurInfo of this combined entry if the +// whole struct is currently being mapped. The struct needs to be added +// in the first position before any data internal to the struct is being +// mapped. if (!IsMemberPointerOrAddr || (Next == CE && MapType != OMPC_MAP_unknown)) { - CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr); - CombinedInfo.BasePointers.push_back(BP.getPointer()); - CombinedInfo.DevicePtrDecls.push_back(nullptr); - CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None); - CombinedInfo.Pointers.push_back(LB.getPointer()); - CombinedInfo.Sizes.push_back( - CGF.Builder.CreateIntCast(Size, CGF.Int64Ty, /*isSigned=*/true)); - CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize -: 1); + if (!IsMappingWholeStruct) { +CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr); +CombinedInfo.BasePointers.push_back(BP.getPointer()); +CombinedInfo.DevicePtrDecls.push_back(nullptr); +CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None); +CombinedInfo.Pointers.push_back(LB.getPointer()); +CombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast( +Size, CGF.Int64Ty, /*isSigned=*/true)); +CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? 
DimSize + : 1); + } else { +StructBaseCombinedInfo.Exprs.emplace_back(MapDecl, MapExpr); +StructBaseCombinedInfo.BasePointers.push_back(BP.getPointer()); +StructBaseCombinedInfo.DevicePtrDecls.push_back(nullptr); + StructBaseCombinedInfo.DevicePointers.push_back(DeviceInfoTy::None); +StructBaseCombinedInfo.Pointers.push_back(LB.getPointer()); +StructBaseCombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast( +Size, CGF.Int64Ty, /*isSigned=*/true)); +
[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)
https://github.com/doru1004 updated https://github.com/llvm/llvm-project/pull/75642 >From 32454489d4e77f22ab935827dffe0febbb7b0626 Mon Sep 17 00:00:00 2001 From: Doru Bercea Date: Fri, 15 Dec 2023 10:22:38 -0500 Subject: [PATCH] Fix mapping of structs to device. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 148 +++ clang/test/OpenMP/map_struct_ordering.cpp | 172 ++ .../struct_mapping_with_pointers.cpp | 114 3 files changed, 401 insertions(+), 33 deletions(-) create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp create mode 100644 openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index 7f7e6f53066644..ea6645a39e8321 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -6811,8 +6811,10 @@ class MappableExprsHandler { OpenMPMapClauseKind MapType, ArrayRef MapModifiers, ArrayRef MotionModifiers, OMPClauseMappableExprCommon::MappableExprComponentListRef Components, - MapCombinedInfoTy &CombinedInfo, StructRangeInfoTy &PartialStruct, - bool IsFirstComponentList, bool IsImplicit, + MapCombinedInfoTy &CombinedInfo, + MapCombinedInfoTy &StructBaseCombinedInfo, + StructRangeInfoTy &PartialStruct, bool IsFirstComponentList, + bool IsImplicit, bool GenerateAllInfoForClauses, const ValueDecl *Mapper = nullptr, bool ForDeviceAddr = false, const ValueDecl *BaseDecl = nullptr, const Expr *MapExpr = nullptr, ArrayRef @@ -7098,6 +7100,25 @@ class MappableExprsHandler { bool IsNonContiguous = CombinedInfo.NonContigInfo.IsNonContiguous; bool IsPrevMemberReference = false; +// We need to check if we will be encountering any MEs. If we do not +// encounter any ME expression it means we will be mapping the whole struct. +// In that case we need to skip adding an entry for the struct to the +// CombinedInfo list and instead add an entry to the StructBaseCombinedInfo +// list only when generating all info for clauses. 
+bool IsMappingWholeStruct = true; +if (!GenerateAllInfoForClauses) { + IsMappingWholeStruct = false; +} else { + for (auto TempI = I; TempI != CE; ++TempI) { +const MemberExpr *PossibleME = +dyn_cast(TempI->getAssociatedExpression()); +if (PossibleME) { + IsMappingWholeStruct = false; + break; +} + } +} + for (; I != CE; ++I) { // If the current component is member of a struct (parent struct) mark it. if (!EncounteredME) { @@ -7317,21 +7338,41 @@ class MappableExprsHandler { break; } llvm::Value *Size = getExprTypeSize(I->getAssociatedExpression()); +// Skip adding an entry in the CurInfo of this combined entry if the +// whole struct is currently being mapped. The struct needs to be added +// in the first position before any data internal to the struct is being +// mapped. if (!IsMemberPointerOrAddr || (Next == CE && MapType != OMPC_MAP_unknown)) { - CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr); - CombinedInfo.BasePointers.push_back(BP.getPointer()); - CombinedInfo.DevicePtrDecls.push_back(nullptr); - CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None); - CombinedInfo.Pointers.push_back(LB.getPointer()); - CombinedInfo.Sizes.push_back( - CGF.Builder.CreateIntCast(Size, CGF.Int64Ty, /*isSigned=*/true)); - CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize -: 1); + if (!IsMappingWholeStruct) { +CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr); +CombinedInfo.BasePointers.push_back(BP.getPointer()); +CombinedInfo.DevicePtrDecls.push_back(nullptr); +CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None); +CombinedInfo.Pointers.push_back(LB.getPointer()); +CombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast( +Size, CGF.Int64Ty, /*isSigned=*/true)); +CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? 
DimSize + : 1); + } else { +StructBaseCombinedInfo.Exprs.emplace_back(MapDecl, MapExpr); +StructBaseCombinedInfo.BasePointers.push_back(BP.getPointer()); +StructBaseCombinedInfo.DevicePtrDecls.push_back(nullptr); + StructBaseCombinedInfo.DevicePointers.push_back(DeviceInfoTy::None); +StructBaseCombinedInfo.Pointers.push_back(LB.getPointer()); +StructBaseCombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast( +Size, CGF.Int64Ty, /*isSigned=*/true)); +
[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)
doru1004 wrote: @alexey-bataev I have reworked the previous patch with your advice in mind. The emitCombinedEntry function was not changed, since eliminating the combined entry has many ramifications that would need to be handled in a separate patch. For now this fixes the immediate error in a way that allows us to get rid of the combined entry later on if we want to. https://github.com/llvm/llvm-project/pull/75642 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
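The ordering requirement described in the patch comments can be modeled in isolation. The sketch below is not Clang's code: `MapEntry` and `orderEntries` are hypothetical names, and the final concatenation is an assumption about how the two lists are combined, which the excerpt above does not show. It keeps whole-struct entries in one list and member entries in another, then emits the struct entries first.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one map entry. Clang's MapCombinedInfoTy keeps
// parallel arrays (Exprs, BasePointers, Pointers, Sizes, ...) instead.
struct MapEntry {
  std::string Name;
  bool IsWholeStruct;
};

// Emit whole-struct entries first, then member entries, preserving the
// relative order inside each group.
std::vector<MapEntry> orderEntries(const std::vector<MapEntry> &Clauses) {
  std::vector<MapEntry> StructBase; // models StructBaseCombinedInfo
  std::vector<MapEntry> Members;    // models CombinedInfo
  for (const MapEntry &E : Clauses)
    (E.IsWholeStruct ? StructBase : Members).push_back(E);
  // Assumption: the struct base list is placed ahead of the member list.
  StructBase.insert(StructBase.end(), Members.begin(), Members.end());
  assert(StructBase.size() == Clauses.size());
  return StructBase;
}
```

With clauses given as a member section, the whole struct, and another member section, the whole-struct entry moves to the front and the two member entries keep their relative order.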
[openmp] [clang] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)
https://github.com/doru1004 closed https://github.com/llvm/llvm-project/pull/75642
[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)
https://github.com/doru1004 closed https://github.com/llvm/llvm-project/pull/72410
[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)
doru1004 wrote: > The newly added test `offloading/struct_mapping_with_pointers.cpp` fails on > NVIDIA GPUs as well. > > ``` > TEST 'libomptarget :: nvptx64-nvidia-cuda :: > offloading/struct_mapping_with_pointers.cpp' FAILED > Exit Code: 1 > > Command Output (stdout): > -- > # RUN: at line 2 > /gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/bin/clang++ -fopenmp > -pthread -I > /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test > -I /gpfs/jlse > -fs0/users/ac.shilei.tian/build/openmp/release/runtime/src -L > /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget -L > /gpfs/jlse-fs0/users/ac.shilei.tian/build/ll > vm/release/./lib -L > /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/runtime/src > -Wl,-rpath,/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget > -Wl,-rpa > th,/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/runtime/src > -Wl,-rpath,/gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/./lib > -Wl,-rpath,/soft/compilers/cuda/cud > a-11.8.0/targets/x86_64-linux/lib > --libomptarget-nvptx-bc-path=/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget/DeviceRTL > -fopenmp-targets=nvptx64-nvidia-cuda > > /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp > -o /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/releas > e/libomptarget/test/nvptx64-nvidia-cuda/offloading/Output/struct_mapping_with_pointers.cpp.tmp > > /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget/libomptarget.d > evicertl.a && env LIBOMPTARGET_DEBUG=1 > /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget/test/nvptx64-nvidia-cuda/offloading/Output/struct_mapping_with_pointer > s.cpp.tmp 2>&1 | > /gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/bin/FileCheck > /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/struct > _mapping_with_pointers.cpp > # 
executed command: > /gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/bin/clang++ -fopenmp > -pthread -I > /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/ > test -I /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/runtime/src > -L /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget -L > /gpfs/jlse-fs0/users/ac.sh > ilei.tian/build/llvm/release/./lib -L > /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/runtime/src > -Wl,-rpath,/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libo > mptarget > -Wl,-rpath,/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/runtime/src > -Wl,-rpath,/gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/./lib > -Wl,-rpath,/soft/c > ompilers/cuda/cuda-11.8.0/targets/x86_64-linux/lib > --libomptarget-nvptx-bc-path=/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget/DeviceRTL > -fopenmp-targets=nv > ptx64-nvidia-cuda > /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp > -o /gpfs/jlse-fs0/users/ac.shilei.tian/bu > ild/openmp/release/libomptarget/test/nvptx64-nvidia-cuda/offloading/Output/struct_mapping_with_pointers.cpp.tmp > /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarg > et/libomptarget.devicertl.a > # executed command: env LIBOMPTARGET_DEBUG=1 > /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget/test/nvptx64-nvidia-cuda/offloading/Output/struct_mapping_with_p > ointers.cpp.tmp > # executed command: > /gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/bin/FileCheck > /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/str > uct_mapping_with_pointers.cpp > # .---command stderr > # | > /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp:106:12: > error: CHECK: expected string not found in inpu > t > # | // CHECK: 
dat.datum[dat.arr[0][0]] = 0 > # |^ > # | :124:24: note: scanning from here > # | dat.val_more_datum = 18 > # |^ > # | :125:1: note: possible intended match here > # | dat.datum[dat.arr[0][0]] = 32542 > # | ^ > # | > # | Input file: > # | Check file: > /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp > # | > # | -dump-input=help explains the following input dump. > # | > # | Input was: > # | << > # | . > # | . > # | . > # |119: omptarget --> Done unregistering library! > # |120: omptarget --> Deinit offload library! > # |121: TARGET CUDA RTL --> Missing 2 resources to be returned > # |122: dat.xi = 4 > # |123: dat.val_datum = 8 > # |124: dat.val_more_datum = 18 > # | check:106'0
[flang] [clang] [mlir] [Flang][OpenMP][MLIR] Add support for -nogpulib option (PR #71045)
https://github.com/doru1004 approved this pull request. LG https://github.com/llvm/llvm-project/pull/71045
[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)
@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public OMPLoopDirective {
 class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective {
   friend class ASTStmtReader;
   friend class OMPExecutableDirective;
+  /// true if loop directive's associated loop can be a parallel for.
+  bool CanBeParallelFor = false;

doru1004 wrote:

I don't think it is possible to have the analysis in Sema and not use a flag here. The two options we have are:

1. Do the analysis in Sema and have the flag and then read the flag in CG.
2. Have the analysis in CG and then there's no reason to pass anything around and CG can call the function when needed.

There is a 3rd hybrid way to do this where this function is moved back into CG:

```
bool Sema::teamsLoopCanBeParallelFor(Stmt *AStmt) {
  TeamsLoopChecker Checker(*this);
  Checker.Visit(AStmt);
  return Checker.teamsLoopCanBeParallelFor();
}
```

But then I don't know how you can call the TeamsLoopChecker which lives in Sema.

https://github.com/llvm/llvm-project/pull/72417
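A toy model of the first two options, for comparison only. `LoopDirective`, `analyze`, `emitWithFlag`, and `emitWithLocalAnalysis` are hypothetical names, the returned strings are placeholders, and the criterion used (no nested construct) stands in for whatever TeamsLoopChecker actually checks, which this thread does not spell out.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for an AST directive node.
struct LoopDirective {
  bool HasNestedConstruct = false; // toy input to the analysis
  bool CanBeParallelFor = false;   // flag stored on the node (option 1)
};

// Option 1, analysis phase ("Sema" in the discussion): decide once and
// cache the answer on the node.
void analyze(LoopDirective &D) {
  D.CanBeParallelFor = !D.HasNestedConstruct;
}

// Option 1, codegen phase: only reads the cached flag.
std::string emitWithFlag(const LoopDirective &D) {
  return D.CanBeParallelFor ? "parallel-for lowering" : "fallback lowering";
}

// Option 2: codegen runs the analysis itself, so no flag has to travel
// with the node.
std::string emitWithLocalAnalysis(const LoopDirective &D) {
  const bool CanBeParallelFor = !D.HasNestedConstruct;
  return CanBeParallelFor ? "parallel-for lowering" : "fallback lowering";
}
```

Both variants produce the same lowering decision once `analyze` has run. The difference is where the analysis lives and whether the node must carry state between phases.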
[llvm] [openmp] [clang] [OpenMP] Remove `register_requires` global constructor (PR #80460)
@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy &Device) {
                 Entry.size) != OFFLOAD_SUCCESS)
           REPORT("Failed to write symbol for USM %s\n", Entry.name);
       }
-    } else {
+    } else if (Entry.addr) {

doru1004 wrote:

So now we don't have a "default" else branch here. Is it not needed?

https://github.com/llvm/llvm-project/pull/80460
[llvm] [openmp] [clang] [OpenMP] Remove `register_requires` global constructor (PR #80460)
@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy &Device) {
                 Entry.size) != OFFLOAD_SUCCESS)
           REPORT("Failed to write symbol for USM %s\n", Entry.name);
       }
-    } else {
+    } else if (Entry.addr) {

doru1004 wrote:

Also, could you explain in a comment why Entry.addr is used here?

https://github.com/llvm/llvm-project/pull/80460
[openmp] [clang] [llvm] [OpenMP] Remove `register_requires` global constructor (PR #80460)
@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy &Device) {
                 Entry.size) != OFFLOAD_SUCCESS)
           REPORT("Failed to write symbol for USM %s\n", Entry.name);
       }
-    } else {
+    } else if (Entry.addr) {

doru1004 wrote:

Should we check for size > 0 too?

https://github.com/llvm/llvm-project/pull/80460
[openmp] [clang] [llvm] [OpenMP] Remove `register_requires` global constructor (PR #80460)
@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy &Device) {
                 Entry.size) != OFFLOAD_SUCCESS)
           REPORT("Failed to write symbol for USM %s\n", Entry.name);
       }
-    } else {
+    } else if (Entry.addr) {

doru1004 wrote:

Or is that too restrictive?

https://github.com/llvm/llvm-project/pull/80460
[openmp] [clang] [llvm] [OpenMP] Remove `register_requires` global constructor (PR #80460)
@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy &Device) {
                 Entry.size) != OFFLOAD_SUCCESS)
           REPORT("Failed to write symbol for USM %s\n", Entry.name);
       }
-    } else {
+    } else if (Entry.addr) {

doru1004 wrote:

So it will only enter this branch if size is 0 and the address is not nullptr.

https://github.com/llvm/llvm-project/pull/80460
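The control flow under discussion, reduced to a standalone sketch. It assumes the preceding condition tests `Entry.size`, as the last comment suggests. `OffloadEntry`, `EntryKind`, and `classify` are hypothetical simplifications, not the libomptarget types.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, reduced view of an offload entry: just the two fields
// that the review comments talk about.
struct OffloadEntry {
  void *addr;
  std::size_t size;
};

enum class EntryKind { Sized, AddressOnly, Skipped };

// Mirrors the shape `if (size) ... else if (addr) ...` with no final else:
// an entry with size 0 and a null address falls through untouched.
EntryKind classify(const OffloadEntry &Entry) {
  if (Entry.size)
    return EntryKind::Sized;
  else if (Entry.addr)
    return EntryKind::AddressOnly;
  return EntryKind::Skipped;
}
```

The third outcome is the case the first question ("Is it not needed?") is about: with the plain `else` gone, such entries are no longer handled by either branch.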
r310263 - [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading
Author: gbercea Date: Mon Aug 7 08:39:11 2017 New Revision: 310263 URL: http://llvm.org/viewvc/llvm-project?rev=310263&view=rev Log: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading Summary: OpenMP has the ability to offload target regions to devices which may have different architectures. A new -fopenmp-target-arch flag is introduced to specify the device architecture. In this patch I use the new flag to specify the compute capability of the underlying NVIDIA architecture for the OpenMP offloading CUDA tool chain. Only a host-offloading test is provided since full device offloading capability will only be available when [[ https://reviews.llvm.org/D29654 | D29654 ]] lands. Reviewers: hfinkel, Hahnfeld, carlo.bertolli, caomhin, ABataev Reviewed By: hfinkel Subscribers: guansong, cfe-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D34784 Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td cfe/trunk/include/clang/Driver/Options.td cfe/trunk/include/clang/Driver/ToolChain.h cfe/trunk/lib/Driver/Compilation.cpp cfe/trunk/lib/Driver/ToolChain.cpp cfe/trunk/lib/Driver/ToolChains/Cuda.cpp cfe/trunk/test/Driver/openmp-offload.c Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td?rev=310263&r1=310262&r2=310263&view=diff == --- cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td (original) +++ cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td Mon Aug 7 08:39:11 2017 @@ -69,6 +69,10 @@ def err_drv_invalid_Xarch_argument_with_ "invalid Xarch argument: '%0', options requiring arguments are unsupported">; def err_drv_invalid_Xarch_argument_isdriver : Error< "invalid Xarch argument: '%0', cannot change driver behavior inside Xarch argument">; +def err_drv_Xopenmp_target_missing_triple : Error< + "cannot deduce implicit triple value for -Xopenmp-target, specify triple using 
-Xopenmp-target=">; +def err_drv_invalid_Xopenmp_target_with_args : Error< + "invalid -Xopenmp-target argument: '%0', options requiring arguments are unsupported">; def err_drv_argument_only_allowed_with : Error< "invalid argument '%0' only allowed with '%1'">; def err_drv_argument_not_allowed_with : Error< Modified: cfe/trunk/include/clang/Driver/Options.td URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=310263&r1=310262&r2=310263&view=diff == --- cfe/trunk/include/clang/Driver/Options.td (original) +++ cfe/trunk/include/clang/Driver/Options.td Mon Aug 7 08:39:11 2017 @@ -459,6 +459,10 @@ def Xcuda_fatbinary : Separate<["-"], "X HelpText<"Pass to fatbinary invocation">, MetaVarName<"">; def Xcuda_ptxas : Separate<["-"], "Xcuda-ptxas">, HelpText<"Pass to the ptxas assembler">, MetaVarName<"">; +def Xopenmp_target : Separate<["-"], "Xopenmp-target">, + HelpText<"Pass to the target offloading toolchain.">, MetaVarName<"">; +def Xopenmp_target_EQ : JoinedAndSeparate<["-"], "Xopenmp-target=">, + HelpText<"Pass to the specified target offloading toolchain. The triple that identifies the toolchain must be provided after the equals sign.">, MetaVarName<"">; def z : Separate<["-"], "z">, Flags<[LinkerInput, RenderAsInput]>, HelpText<"Pass -z to the linker">, MetaVarName<"">, Group; Modified: cfe/trunk/include/clang/Driver/ToolChain.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/ToolChain.h?rev=310263&r1=310262&r2=310263&view=diff == --- cfe/trunk/include/clang/Driver/ToolChain.h (original) +++ cfe/trunk/include/clang/Driver/ToolChain.h Mon Aug 7 08:39:11 2017 @@ -217,6 +217,17 @@ public: return nullptr; } + /// TranslateOpenMPTargetArgs - Create a new derived argument list for + /// that contains the OpenMP target specific flags passed via + /// -Xopenmp-target -opt=val OR -Xopenmp-target= -opt=val + /// Translation occurs only when the \p DeviceOffloadKind is specified. 
+ /// + /// \param DeviceOffloadKind - The device offload kind used for the + /// translation. + virtual llvm::opt::DerivedArgList * + TranslateOpenMPTargetArgs(const llvm::opt::DerivedArgList &Args, + Action::OffloadKind DeviceOffloadKind) const; + /// Choose a tool to use to handle the action \p JA. /// /// This can be overridden when a particular ToolChain needs to use Modified: cfe/trunk/lib/Driver/Compilation.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Compilation.cpp?rev=310263&r1=310262&r2=310263&view=diff == --- cfe/trunk/lib/Driver/Compilation.cpp (original) +++ cfe/trunk/lib/Dr
r310282 - Non-functional change. Fix previous patch D34784.
Author: gbercea
Date: Mon Aug 7 11:43:37 2017
New Revision: 310282

URL: http://llvm.org/viewvc/llvm-project?rev=310282&view=rev
Log:
Non-functional change. Fix previous patch D34784.

Modified:
    cfe/trunk/lib/Driver/Compilation.cpp

Modified: cfe/trunk/lib/Driver/Compilation.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Compilation.cpp?rev=310282&r1=310281&r2=310282&view=diff
==============================================================================
--- cfe/trunk/lib/Driver/Compilation.cpp (original)
+++ cfe/trunk/lib/Driver/Compilation.cpp Mon Aug 7 11:43:37 2017
@@ -60,11 +60,15 @@ Compilation::getArgsForToolChain(const T
   DerivedArgList *&Entry = TCArgs[{TC, BoundArch, DeviceOffloadKind}];
   if (!Entry) {
     // Translate OpenMP toolchain arguments provided via the -Xopenmp-target flags.
-    Entry = TC->TranslateOpenMPTargetArgs(*TranslatedArgs, DeviceOffloadKind);
-    if (!Entry)
-      Entry = TranslatedArgs;
+    DerivedArgList *OpenMPArgs = TC->TranslateOpenMPTargetArgs(*TranslatedArgs,
+        DeviceOffloadKind);
+    if (!OpenMPArgs) {
+      Entry = TC->TranslateArgs(*TranslatedArgs, BoundArch, DeviceOffloadKind);
+    } else {
+      Entry = TC->TranslateArgs(*OpenMPArgs, BoundArch, DeviceOffloadKind);
+      delete OpenMPArgs;
+    }
 
-    Entry = TC->TranslateArgs(*Entry, BoundArch, DeviceOffloadKind);
     if (!Entry)
       Entry = TranslatedArgs;
   }
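The two-stage translation in this change can be sketched outside the driver. The code below is a simplified model, not Clang's API: `ArgList`, the lowercase function names, and the placeholder arguments are hypothetical, and `std::unique_ptr` is swapped in for the raw pointer plus `delete` used in the commit.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

using ArgList = std::vector<std::string>; // hypothetical stand-in

// Stage 1: returns a new list for OpenMP device toolchains, nullptr otherwise.
std::unique_ptr<ArgList> translateOpenMPTargetArgs(const ArgList &Args,
                                                   bool IsOpenMPDevice) {
  if (!IsOpenMPDevice)
    return nullptr;
  auto Out = std::make_unique<ArgList>(Args);
  Out->push_back("<openmp-target-arg>"); // placeholder for forwarded flags
  return Out;
}

// Stage 2: toolchain-specific translation, always returns a new list here.
std::unique_ptr<ArgList> translateArgs(const ArgList &Args) {
  auto Out = std::make_unique<ArgList>(Args);
  Out->push_back("<toolchain-arg>"); // placeholder
  return Out;
}

// Run stage 2 on the stage 1 result if there is one, else on the original
// arguments. The intermediate list is released when OpenMPArgs goes out of
// scope, which plays the role of `delete OpenMPArgs` in the commit.
std::unique_ptr<ArgList> getArgsForToolChain(const ArgList &TranslatedArgs,
                                             bool IsOpenMPDevice) {
  std::unique_ptr<ArgList> OpenMPArgs =
      translateOpenMPTargetArgs(TranslatedArgs, IsOpenMPDevice);
  const ArgList &Input = OpenMPArgs ? *OpenMPArgs : TranslatedArgs;
  return translateArgs(Input);
}
```

The point of the model is that the original argument list is never modified and the intermediate list does not outlive the call.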
[clang] [OpenMP][libomptarget] Add map checks when running under unified shared memory (PR #69005)
https://github.com/doru1004 updated https://github.com/llvm/llvm-project/pull/69005 >From cb4121c466a0fc357d6ca129bfdd4e7c5e2d11ee Mon Sep 17 00:00:00 2001 From: Doru Bercea Date: Wed, 16 Nov 2022 17:23:48 -0600 Subject: [PATCH 1/2] Fix declare target implementation to support enter. --- clang/include/clang/Basic/Attr.td | 4 +- .../clang/Basic/DiagnosticParseKinds.td | 12 - clang/lib/AST/AttrImpl.cpp| 2 +- clang/lib/CodeGen/CGExpr.cpp | 12 +++-- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 24 ++--- clang/lib/CodeGen/CodeGenModule.cpp | 6 ++- clang/lib/Parse/ParseOpenMP.cpp | 39 ++ clang/lib/Sema/SemaOpenMP.cpp | 10 ++-- .../test/OpenMP/declare_target_ast_print.cpp | 53 +++ 9 files changed, 130 insertions(+), 32 deletions(-) diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td index 16cf932c3760bd3..eaf4a6db3600e07 100644 --- a/clang/include/clang/Basic/Attr.td +++ b/clang/include/clang/Basic/Attr.td @@ -3749,8 +3749,8 @@ def OMPDeclareTargetDecl : InheritableAttr { let Documentation = [OMPDeclareTargetDocs]; let Args = [ EnumArgument<"MapType", "MapTypeTy", - [ "to", "link" ], - [ "MT_To", "MT_Link" ]>, + [ "to", "enter", "link" ], + [ "MT_To", "MT_Enter", "MT_Link" ]>, EnumArgument<"DevType", "DevTypeTy", [ "host", "nohost", "any" ], [ "DT_Host", "DT_NoHost", "DT_Any" ]>, diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td b/clang/include/clang/Basic/DiagnosticParseKinds.td index 674d6bd34fc544f..27cd3da1f191c3d 100644 --- a/clang/include/clang/Basic/DiagnosticParseKinds.td +++ b/clang/include/clang/Basic/DiagnosticParseKinds.td @@ -1383,12 +1383,22 @@ def note_omp_assumption_clause_continue_here : Note<"the ignored tokens spans until here">; def err_omp_declare_target_unexpected_clause: Error< "unexpected '%0' clause, only %select{'device_type'|'to' or 'link'|'to', 'link' or 'device_type'|'device_type', 'indirect'|'to', 'link', 'device_type' or 'indirect'}1 clauses expected">; +def err_omp_declare_target_unexpected_clause_52: 
Error< + "unexpected '%0' clause, only %select{'device_type'|'enter' or 'link'|'enter', 'link' or 'device_type'|'device_type', 'indirect'|'enter', 'link', 'device_type' or 'indirect'}1 clauses expected">; def err_omp_begin_declare_target_unexpected_implicit_to_clause: Error< "unexpected '(', only 'to', 'link' or 'device_type' clauses expected for 'begin declare target' directive">; -def err_omp_declare_target_unexpected_clause_after_implicit_to: Error< +def err_omp_declare_target_wrong_clause_after_implicit_to: Error< "unexpected clause after an implicit 'to' clause">; +def err_omp_declare_target_wrong_clause_after_implicit_enter: Error< + "unexpected clause after an implicit 'enter' clause">; def err_omp_declare_target_missing_to_or_link_clause: Error< "expected at least one %select{'to' or 'link'|'to', 'link' or 'indirect'}0 clause">; +def err_omp_declare_target_missing_enter_or_link_clause: Error< + "expected at least one %select{'enter' or 'link'|'enter', 'link' or 'indirect'}0 clause">; +def err_omp_declare_target_unexpected_to_clause: Error< + "unexpected 'to' clause, use 'enter' instead">; +def err_omp_declare_target_unexpected_enter_clause: Error< + "unexpected 'enter' clause, use 'to' instead">; def err_omp_declare_target_multiple : Error< "%0 appears multiple times in clauses on the same declare target directive">; def err_omp_declare_target_indirect_device_type: Error< diff --git a/clang/lib/AST/AttrImpl.cpp b/clang/lib/AST/AttrImpl.cpp index cecbd703ac61e8c..da842f6b190e74d 100644 --- a/clang/lib/AST/AttrImpl.cpp +++ b/clang/lib/AST/AttrImpl.cpp @@ -137,7 +137,7 @@ void OMPDeclareTargetDeclAttr::printPrettyPragma( // Use fake syntax because it is for testing and debugging purpose only. 
if (getDevType() != DT_Any) OS << " device_type(" << ConvertDevTypeTyToStr(getDevType()) << ")"; - if (getMapType() != MT_To) + if (getMapType() != MT_To && getMapType() != MT_Enter) OS << ' ' << ConvertMapTypeTyToStr(getMapType()); if (Expr *E = getIndirectExpr()) { OS << " indirect("; diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index ee09a8566c3719e..77085ff34fca233 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -2495,14 +2495,16 @@ static Address emitDeclTargetVarDeclLValue(CodeGenFunction &CGF, const VarDecl *VD, QualType T) { llvm::Optional Res = OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD); - // Return an invalid address if variable is MT_To and unified - // memory is not enabled. For all other cases: MT_Link and - // MT_To with unified memory, return a valid address. - if (!Res || (*
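For context, a minimal sketch of the clause spelling that the first patch in this series accepts. It assumes a compiler in OpenMP 5.2 mode for the `enter` clause; `Scale` and `scaled` are illustrative names. When OpenMP is not enabled, the pragmas are ignored and the code runs on the host.

```cpp
#include <cassert>

int Scale = 3;
// OpenMP 5.2 spelling. Earlier versions of the specification spell this
// clause `to(Scale)`, which is what the new diagnostics distinguish.
#pragma omp declare target enter(Scale)

// Multiplies X by Scale inside a target region. Falls back to the host
// when no device is available or OpenMP is disabled.
int scaled(int X) {
  assert(X >= 0);
  int Result = 0;
#pragma omp target map(from : Result) firstprivate(X)
  { Result = X * Scale; }
  return Result;
}
```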
[clang] [OpenMP][libomptarget] Add map checks when running under unified shared memory (PR #69005)
@@ -444,6 +486,29 @@ DeviceTy::getTgtPtrBegin(void *HstPtrBegin, int64_t Size, bool UpdateRefCount,
          LR.TPR.getEntry()->dynRefCountToStr().c_str(), DynRefCountAction,
          LR.TPR.getEntry()->holdRefCountToStr().c_str(), HoldRefCountAction);
     LR.TPR.TargetPointer = (void *)TP;
+
+    // If this entry is not marked as being host pointer (the way the
+    // implementation works today this is never true, mistake?) then we
+    // have to check if this is a host pointer or not. This is a host pointer
+    // if the host address matches the target address.
+    if ((PM->RTLs.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY) &&
+        !LR.TPR.Flags.IsHostPointer) {

doru1004 wrote:

Several tests exercise the call to getTgtPtrBegin. This change is needed because, if the first condition is true (and it can be true even when USM is enabled), the USM branch is not taken at all and IsHostPointer and IsPresent are not set correctly.

https://github.com/llvm/llvm-project/pull/69005
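The check described in the added comment can be illustrated on its own. This is a reduced sketch under stated assumptions: `LookupFlags` and `markHostPointer` are hypothetical, and only `IsHostPointer` is updated because the excerpt does not show how the patch sets `IsPresent`.

```cpp
#include <cassert>

// Hypothetical subset of the lookup flags mentioned in the review comment.
struct LookupFlags {
  bool IsHostPointer = false;
  bool IsPresent = false;
};

// Under unified shared memory, an entry whose host address equals its
// target address is treated as a host pointer.
LookupFlags markHostPointer(bool RequiresUSM, const void *HstPtr,
                            const void *TgtPtr, LookupFlags Flags) {
  if (RequiresUSM && !Flags.IsHostPointer && HstPtr == TgtPtr)
    Flags.IsHostPointer = true;
  return Flags;
}
```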
r328219 - [OpenMP][Clang] Add call to global data sharing stack initialization on the workers side
Author: gbercea
Date: Thu Mar 22 10:33:27 2018
New Revision: 328219

URL: http://llvm.org/viewvc/llvm-project?rev=328219&view=rev
Log:
[OpenMP][Clang] Add call to global data sharing stack initialization on the workers side

Summary: The workers also need to initialize the global stack. The call to the initialization function needs to happen after the kernel_init() function is called by the master. This ensures that the per-team data structures of the runtime have been initialized.

Reviewers: ABataev, grokos, carlo.bertolli, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D44749

Modified:
    cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
    cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=328219&r1=328218&r2=328219&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Thu Mar 22 10:33:27 2018
@@ -801,6 +801,11 @@ void CGOpenMPRuntimeNVPTX::emitWorkerLoo
   // Wait for parallel work
   syncCTAThreads(CGF);
 
+  // For data sharing, we need to initialize the stack for workers.
+  CGF.EmitRuntimeCall(
+      createNVPTXRuntimeFunction(
+          OMPRTL_NVPTX__kmpc_data_sharing_init_stack));
+
   Address WorkFn =
       CGF.CreateDefaultAlignTempAlloca(CGF.Int8PtrTy, /*Name=*/"work_fn");
   Address ExecStatus =

Modified: cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp?rev=328219&r1=328218&r2=328219&view=diff
==
--- cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp (original)
+++ cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp Thu Mar 22 10:33:27 2018
@@ -27,6 +27,11 @@ void test_ds(){
   }
 }
 
+/// = In the worker function = ///
+// CK1: {{.*}}define internal void @__omp_offloading{{.*}}test_ds{{.*}}_worker()
+// CK1: call void @llvm.nvvm.barrier0()
+// CK1: call void @__kmpc_data_sharing_init_stack
+
 /// = In the kernel function = ///
 // CK1: {{.*}}define void @__omp_offloading{{.*}}test_ds{{.*}}()

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
r306689 - [OpenMP] Add support for auxiliary triple specification
Author: gbercea
Date: Thu Jun 29 08:49:03 2017
New Revision: 306689

URL: http://llvm.org/viewvc/llvm-project?rev=306689&view=rev
Log:
[OpenMP] Add support for auxiliary triple specification

Summary: Device offloading requires the specification of an additional flag containing the triple of the //other// architecture the code is being compiled on if such an architecture exists. If compiling for the host, the auxiliary triple flag will contain the triple describing the device and vice versa.

Reviewers: arpith-jacob, sfantao, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar

Reviewed By: Hahnfeld

Subscribers: rengolin, cfe-commits

Differential Revision: https://reviews.llvm.org/D29339

Modified:
    cfe/trunk/lib/Driver/ToolChains/Clang.cpp
    cfe/trunk/lib/Frontend/CompilerInstance.cpp
    cfe/trunk/lib/Frontend/CompilerInvocation.cpp
    cfe/trunk/lib/Frontend/InitPreprocessor.cpp
    cfe/trunk/test/Driver/openmp-offload.c

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=306689&r1=306688&r2=306689&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Thu Jun 29 08:49:03 2017
@@ -129,6 +129,13 @@ forAllAssociatedToolChains(Compilation &
   else if (JA.isDeviceOffloading(Action::OFK_Cuda))
     Work(*C.getSingleOffloadToolChain());
 
+  if (JA.isHostOffloading(Action::OFK_OpenMP)) {
+    auto TCs = C.getOffloadToolChains();
+    for (auto II = TCs.first, IE = TCs.second; II != IE; ++II)
+      Work(*II->second);
+  } else if (JA.isDeviceOffloading(Action::OFK_OpenMP))
+    Work(*C.getSingleOffloadToolChain());
+
   //
   // TODO: Add support for other offloading programming models here.
   //
@@ -1991,6 +1998,16 @@ void Clang::ConstructJob(Compilation &C,
     CmdArgs.push_back("-aux-triple");
     CmdArgs.push_back(Args.MakeArgString(NormalizedTriple));
   }
+
+  if (IsOpenMPDevice) {
+    // We have to pass the triple of the host if compiling for an OpenMP device.
+    std::string NormalizedTriple =
+        C.getSingleOffloadToolChain()
+            ->getTriple()
+            .normalize();
+    CmdArgs.push_back("-aux-triple");
+    CmdArgs.push_back(Args.MakeArgString(NormalizedTriple));
+  }
 
   if (Triple.isOSWindows() && (Triple.getArch() == llvm::Triple::arm ||
                                Triple.getArch() == llvm::Triple::thumb)) {

Modified: cfe/trunk/lib/Frontend/CompilerInstance.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/CompilerInstance.cpp?rev=306689&r1=306688&r2=306689&view=diff
==
--- cfe/trunk/lib/Frontend/CompilerInstance.cpp (original)
+++ cfe/trunk/lib/Frontend/CompilerInstance.cpp Thu Jun 29 08:49:03 2017
@@ -936,8 +936,9 @@ bool CompilerInstance::ExecuteAction(Fro
   if (!hasTarget())
     return false;
 
-  // Create TargetInfo for the other side of CUDA compilation.
-  if (getLangOpts().CUDA && !getFrontendOpts().AuxTriple.empty()) {
+  // Create TargetInfo for the other side of CUDA and OpenMP compilation.
+  if ((getLangOpts().CUDA || getLangOpts().OpenMPIsDevice) &&
+      !getFrontendOpts().AuxTriple.empty()) {
     auto TO = std::make_shared();
     TO->Triple = getFrontendOpts().AuxTriple;
     TO->HostTriple = getTarget().getTriple().str();

Modified: cfe/trunk/lib/Frontend/CompilerInvocation.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/CompilerInvocation.cpp?rev=306689&r1=306688&r2=306689&view=diff
==
--- cfe/trunk/lib/Frontend/CompilerInvocation.cpp (original)
+++ cfe/trunk/lib/Frontend/CompilerInvocation.cpp Thu Jun 29 08:49:03 2017
@@ -2644,6 +2644,10 @@ bool CompilerInvocation::CreateFromArgs(
     Res.getTargetOpts().HostTriple = Res.getFrontendOpts().AuxTriple;
   }
 
+  // Set the triple of the host for OpenMP device compile.
+  if (LangOpts.OpenMPIsDevice)
+    Res.getTargetOpts().HostTriple = Res.getFrontendOpts().AuxTriple;
+
   // FIXME: Override value name discarding when asan or msan is used because the
   // backend passes depend on the name of the alloca in order to print out
   // names.

Modified: cfe/trunk/lib/Frontend/InitPreprocessor.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/InitPreprocessor.cpp?rev=306689&r1=306688&r2=306689&view=diff
==
--- cfe/trunk/lib/Frontend/InitPreprocessor.cpp (original)
+++ cfe/trunk/lib/Frontend/InitPreprocessor.cpp Thu Jun 29 08:49:03 2017
@@ -1043,7 +1043,7 @@ void clang::InitializePreprocessor(
   if (InitOpts.UsePredefines) {
     // FIXME: This will create multiple definitions for most of the predefined
     // macros. This is not the right way to handle this.
-
r306691 - [OpenMP] Pass -fopenmp-is-device to preprocessing and machine specific code generation stages
Author: gbercea
Date: Thu Jun 29 08:59:19 2017
New Revision: 306691

URL: http://llvm.org/viewvc/llvm-project?rev=306691&view=rev
Log:
[OpenMP] Pass -fopenmp-is-device to preprocessing and machine specific code generation stages

Summary: The preprocessing and code generation and optimization stages of the compiler are also passed the "-fopenmp-is-device" flag. This is used to trigger machine specific preprocessing and code generation when performing device offloading to an NVIDIA GPU via OpenMP directives.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, Hahnfeld, hfinkel, tstellar

Reviewed By: Hahnfeld

Subscribers: Hahnfeld, rengolin

Differential Revision: https://reviews.llvm.org/D29645

Modified:
    cfe/trunk/lib/Driver/ToolChains/Clang.cpp
    cfe/trunk/test/Driver/openmp-offload.c

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=306691&r1=306690&r2=306691&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Thu Jun 29 08:59:19 2017
@@ -4429,10 +4429,12 @@ void Clang::ConstructJob(Compilation &C,
   // device declarations can be identified. Also, -fopenmp-is-device is passed
   // along to tell the frontend that it is generating code for a device, so that
   // only the relevant declarations are emitted.
-  if (IsOpenMPDevice && Inputs.size() == 2) {
+  if (IsOpenMPDevice) {
     CmdArgs.push_back("-fopenmp-is-device");
-    CmdArgs.push_back("-fopenmp-host-ir-file-path");
-    CmdArgs.push_back(Args.MakeArgString(Inputs.back().getFilename()));
+    if (Inputs.size() == 2) {
+      CmdArgs.push_back("-fopenmp-host-ir-file-path");
+      CmdArgs.push_back(Args.MakeArgString(Inputs.back().getFilename()));
+    }
   }
 
   // For all the host OpenMP offloading compile jobs we need to pass the targets

Modified: cfe/trunk/test/Driver/openmp-offload.c
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload.c?rev=306691&r1=306690&r2=306691&view=diff
==
--- cfe/trunk/test/Driver/openmp-offload.c (original)
+++ cfe/trunk/test/Driver/openmp-offload.c Thu Jun 29 08:59:19 2017
@@ -589,3 +589,13 @@
 // CHK-UBUJOBS-ST-SAME: [[HOSTOBJ:[^\\/]+\.o]]" "{{.*}}[[HOSTASM]]"
 // CHK-UBUJOBS-ST: clang-offload-bundler{{.*}}" "-type=o" "-targets=openmp-powerpc64le-ibm-linux-gnu,openmp-x86_64-pc-linux-gnu,host-powerpc64le--linux" "-outputs=
 // CHK-UBUJOBS-ST-SAME: [[RES:[^\\/]+\.o]]" "-inputs={{.*}}[[T1OBJ]],{{.*}}[[T2OBJ]],{{.*}}[[HOSTOBJ]]"
+
+/// ###
+
+/// Check -fopenmp-is-device is also passed when generating the *.i and *.s intermediate files.
+// RUN: %clang -### -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -save-temps -no-canonical-prefixes %s 2>&1 \
+// RUN: | FileCheck -check-prefix=CHK-FOPENMP-IS-DEVICE %s
+
+// CHK-FOPENMP-IS-DEVICE: clang{{.*}}.i" {{.*}}" "-fopenmp-is-device"
+// CHK-FOPENMP-IS-DEVICE-NEXT: clang{{.*}}.bc" {{.*}}.i" "-fopenmp-is-device" "-fopenmp-host-ir-file-path"
+// CHK-FOPENMP-IS-DEVICE-NEXT: clang{{.*}}.s" {{.*}}.bc" "-fopenmp-is-device"
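The Clang.cpp hunk in r306691 separates two decisions that were previously tied together: whether to pass -fopenmp-is-device, and whether to pass the host IR file. The control flow can be sketched outside the driver as follows. This is only an illustration: `buildDeviceArgs` is a hypothetical helper, and plain `std::vector<std::string>` stands in for the driver's argument and input lists, which are not modelled here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the relevant part of Clang::ConstructJob.
// The name and the container types are assumptions made for this sketch.
std::vector<std::string>
buildDeviceArgs(bool IsOpenMPDevice, const std::vector<std::string> &Inputs) {
  std::vector<std::string> CmdArgs;
  if (IsOpenMPDevice) {
    // Precondition of this sketch only: a device job has at least one input.
    assert(!Inputs.empty() && "expected at least the source input");
    // Every device job is told that it is generating code for a device.
    CmdArgs.push_back("-fopenmp-is-device");
    // The host IR path is added only when a second input is present.
    if (Inputs.size() == 2) {
      CmdArgs.push_back("-fopenmp-host-ir-file-path");
      CmdArgs.push_back(Inputs.back());
    }
  }
  return CmdArgs;
}
```

With one input the sketch yields only the device flag; with two inputs it also yields the host IR option followed by the last input's name, mirroring the nesting introduced by the patch.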
r306724 - [OpenMP] Fix test for revision D29645. NFC
Author: gbercea
Date: Thu Jun 29 11:49:16 2017
New Revision: 306724

URL: http://llvm.org/viewvc/llvm-project?rev=306724&view=rev
Log:
[OpenMP] Fix test for revision D29645. NFC

Modified:
    cfe/trunk/test/Driver/openmp-offload.c

Modified: cfe/trunk/test/Driver/openmp-offload.c
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload.c?rev=306724&r1=306723&r2=306724&view=diff
==
--- cfe/trunk/test/Driver/openmp-offload.c (original)
+++ cfe/trunk/test/Driver/openmp-offload.c Thu Jun 29 11:49:16 2017
@@ -592,10 +592,8 @@
 
 /// ###
 
-/// Check -fopenmp-is-device is also passed when generating the *.i and *.s intermediate files.
-// RUN: %clang -### -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -save-temps -no-canonical-prefixes %s 2>&1 \
+/// Check -fopenmp-is-device is passed when compiling for the device.
+// RUN: %clang -### -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu %s 2>&1 \
 // RUN: | FileCheck -check-prefix=CHK-FOPENMP-IS-DEVICE %s
 
-// CHK-FOPENMP-IS-DEVICE: clang{{.*}}.i" {{.*}}" "-fopenmp-is-device"
-// CHK-FOPENMP-IS-DEVICE-NEXT: clang{{.*}}.bc" {{.*}}.i" "-fopenmp-is-device" "-fopenmp-host-ir-file-path"
-// CHK-FOPENMP-IS-DEVICE-NEXT: clang{{.*}}.s" {{.*}}.bc" "-fopenmp-is-device"
+// CHK-FOPENMP-IS-DEVICE: clang{{.*}} "-aux-triple" "powerpc64le-unknown-linux-gnu" {{.*}}.c" "-fopenmp-is-device" "-fopenmp-host-ir-file-path"
r307271 - [OpenMP] Customize CUDA-based tool chain selection
Author: gbercea
Date: Thu Jul 6 09:08:15 2017
New Revision: 307271

URL: http://llvm.org/viewvc/llvm-project?rev=307271&view=rev
Log:
[OpenMP] Customize CUDA-based tool chain selection

Summary: This patch provides a generic way of selecting CUDA based tool chains as host-device pairs.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar

Reviewed By: Hahnfeld

Subscribers: rengolin, cfe-commits

Differential Revision: https://reviews.llvm.org/D29658

Modified:
    cfe/trunk/lib/Driver/Driver.cpp

Modified: cfe/trunk/lib/Driver/Driver.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Driver.cpp?rev=307271&r1=307270&r2=307271&view=diff
==
--- cfe/trunk/lib/Driver/Driver.cpp (original)
+++ cfe/trunk/lib/Driver/Driver.cpp Thu Jul 6 09:08:15 2017
@@ -572,8 +572,22 @@ void Driver::CreateOffloadingDeviceToolC
           if (TT.getArch() == llvm::Triple::UnknownArch)
             Diag(clang::diag::err_drv_invalid_omp_target) << Val;
           else {
-            const ToolChain &TC = getToolChain(C.getInputArgs(), TT);
-            C.addOffloadDeviceToolChain(&TC, Action::OFK_OpenMP);
+            const ToolChain *TC;
+            // CUDA toolchains have to be selected differently. They pair host
+            // and device in their implementation.
+            if (TT.isNVPTX()) {
+              const ToolChain *HostTC =
+                  C.getSingleOffloadToolChain();
+              assert(HostTC && "Host toolchain should be always defined.");
+              auto &CudaTC =
+                  ToolChains[TT.str() + "/" + HostTC->getTriple().str()];
+              if (!CudaTC)
+                CudaTC = llvm::make_unique(
+                    *this, TT, *HostTC, C.getInputArgs());
+              TC = CudaTC.get();
+            } else
+              TC = &getToolChain(C.getInputArgs(), TT);
+            C.addOffloadDeviceToolChain(TC, Action::OFK_OpenMP);
           }
         }
       } else
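The selection logic in r307271 amounts to a lazily filled cache whose key combines the device and host triples, so that one CUDA-style toolchain object exists per host-device pair. A minimal standalone sketch of that pattern is given below. It rests on stated assumptions: `PairedToolChain` and `ToolChainCache` are hypothetical types invented here, and `std::map` with `std::unique_ptr` stands in for the driver's own `ToolChains` container, whose exact type is not shown in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a toolchain that pairs a device with its host.
struct PairedToolChain {
  std::string DeviceTriple;
  std::string HostTriple;
};

// Hypothetical cache modelled on the "device/host" keying used in the patch.
class ToolChainCache {
  std::map<std::string, std::unique_ptr<PairedToolChain>> ToolChains;

public:
  // Returns the toolchain for the pair, creating it on the first request.
  const PairedToolChain *get(const std::string &DeviceTriple,
                             const std::string &HostTriple) {
    assert(!HostTriple.empty() && "Host triple should always be defined.");
    // operator[] default-constructs a null entry when the key is new.
    auto &Entry = ToolChains[DeviceTriple + "/" + HostTriple];
    if (!Entry)
      Entry.reset(new PairedToolChain{DeviceTriple, HostTriple});
    return Entry.get();
  }

  std::size_t size() const { return ToolChains.size(); }
};
```

Requesting the same pair twice returns the same object, while the same device triple paired with a different host produces a second entry. That is the property the combined key is there to provide.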
r307272 - [OpenMP] Extend CLANG target options with device offloading kind.
Author: gbercea
Date: Thu Jul 6 09:22:21 2017
New Revision: 307272

URL: http://llvm.org/viewvc/llvm-project?rev=307272&view=rev
Log:
[OpenMP] Extend CLANG target options with device offloading kind.

Summary: Pass the type of the device offloading when building the tool chain for a particular target architecture. This is required when supporting multiple tool chains that target a single device type. In our particular use case, the OpenMP and CUDA tool chains will use the same ```addClangTargetOptions ``` method. This enables the reuse of common options and ensures control over options only supported by a particular tool chain.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, jlebar, hfinkel, tstellar, Hahnfeld

Reviewed By: hfinkel

Subscribers: jgravelle-google, aheejin, rengolin, jfb, dschuff, sbc100, cfe-commits

Differential Revision: https://reviews.llvm.org/D29647

Modified:
    cfe/trunk/include/clang/Driver/ToolChain.h
    cfe/trunk/lib/Driver/ToolChain.cpp
    cfe/trunk/lib/Driver/ToolChains/BareMetal.cpp
    cfe/trunk/lib/Driver/ToolChains/BareMetal.h
    cfe/trunk/lib/Driver/ToolChains/Clang.cpp
    cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
    cfe/trunk/lib/Driver/ToolChains/Cuda.h
    cfe/trunk/lib/Driver/ToolChains/Darwin.cpp
    cfe/trunk/lib/Driver/ToolChains/Darwin.h
    cfe/trunk/lib/Driver/ToolChains/Fuchsia.cpp
    cfe/trunk/lib/Driver/ToolChains/Fuchsia.h
    cfe/trunk/lib/Driver/ToolChains/Gnu.cpp
    cfe/trunk/lib/Driver/ToolChains/Gnu.h
    cfe/trunk/lib/Driver/ToolChains/Hexagon.cpp
    cfe/trunk/lib/Driver/ToolChains/Hexagon.h
    cfe/trunk/lib/Driver/ToolChains/WebAssembly.cpp
    cfe/trunk/lib/Driver/ToolChains/WebAssembly.h
    cfe/trunk/lib/Driver/ToolChains/XCore.cpp
    cfe/trunk/lib/Driver/ToolChains/XCore.h

Modified: cfe/trunk/include/clang/Driver/ToolChain.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/ToolChain.h?rev=307272&r1=307271&r2=307272&view=diff
==
--- cfe/trunk/include/clang/Driver/ToolChain.h (original)
+++ cfe/trunk/include/clang/Driver/ToolChain.h Thu Jul 6 09:22:21 2017
@@ -411,7 +411,8 @@ public:
 
   /// \brief Add options that need to be passed to cc1 for this target.
   virtual void addClangTargetOptions(const llvm::opt::ArgList &DriverArgs,
-                                     llvm::opt::ArgStringList &CC1Args) const;
+                                     llvm::opt::ArgStringList &CC1Args,
+                                     Action::OffloadKind DeviceOffloadKind) const;
 
   /// \brief Add warning options that need to be passed to cc1 for this target.
   virtual void addClangWarningOptions(llvm::opt::ArgStringList &CC1Args) const;

Modified: cfe/trunk/lib/Driver/ToolChain.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChain.cpp?rev=307272&r1=307271&r2=307272&view=diff
==
--- cfe/trunk/lib/Driver/ToolChain.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChain.cpp Thu Jul 6 09:22:21 2017
@@ -544,9 +544,9 @@ void ToolChain::AddClangSystemIncludeArg
   // Each toolchain should provide the appropriate include flags.
 }
 
-void ToolChain::addClangTargetOptions(const ArgList &DriverArgs,
-                                      ArgStringList &CC1Args) const {
-}
+void ToolChain::addClangTargetOptions(
+    const ArgList &DriverArgs, ArgStringList &CC1Args,
+    Action::OffloadKind DeviceOffloadKind) const {}
 
 void ToolChain::addClangWarningOptions(ArgStringList &CC1Args) const {}

Modified: cfe/trunk/lib/Driver/ToolChains/BareMetal.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/BareMetal.cpp?rev=307272&r1=307271&r2=307272&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/BareMetal.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/BareMetal.cpp Thu Jul 6 09:22:21 2017
@@ -98,7 +98,8 @@ void BareMetal::AddClangSystemIncludeArg
 }
 
 void BareMetal::addClangTargetOptions(const ArgList &DriverArgs,
-                                      ArgStringList &CC1Args) const {
+                                      ArgStringList &CC1Args,
+                                      Action::OffloadKind) const {
   CC1Args.push_back("-nostdsysteminc");
 }

Modified: cfe/trunk/lib/Driver/ToolChains/BareMetal.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/BareMetal.h?rev=307272&r1=307271&r2=307272&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/BareMetal.h (original)
+++ cfe/trunk/lib/Driver/ToolChains/BareMetal.h Thu Jul 6 09:22:21 2017
@@ -54,7 +54,8 @@ public:
   void AddClangSystemIncludeArgs(const llvm::opt::ArgList &DriverArgs,
                                  llvm::opt::ArgStringList &CC1Args) const override;
   void addClangTargetOptio
[clang] [llvm] [openmp] [OpenMP][offload] Fix dynamic schedule tracking (PR #97065)
doru1004 wrote:

> The code changes look good now, but I'd prefer to have a non-SPMD mode test
> case.

All good @shiltian ? The test you requested was added Friday.

https://github.com/llvm/llvm-project/pull/97065
[clang] [llvm] [openmp] [OpenMP][offload] Fix dynamic schedule tracking (PR #97065)
https://github.com/doru1004 closed https://github.com/llvm/llvm-project/pull/97065
[clang] [AMDGPU] Correctly pass the target-id to `ld.lld` (PR #101037)
https://github.com/doru1004 approved this pull request.

LG

https://github.com/llvm/llvm-project/pull/101037
[clang] [llvm] [openmp] [OpenMP][offload] Fix dynamic schedule tracking (PR #97065)
doru1004 wrote:

> Could you provide a more descriptive summary?
>
> I thought we discussed that the dynamic support would just use the static
> scheduler, but this seems to implement it? I personally don't want to see
> more things in the OpenMP runtime relying on `malloc` if we can avoid it.

I can add a summary. The problem with the dispatch loop right now is that it is just not testable/runnable. This patch fixes that aspect of it. So wherever the dispatch loop is used it has to be correct. We can re-evaluate the use of static schedule to implement dynamic schedule once this actually works. Malloc cannot be helped here if we want to have correctness. Currently it is just broken and not even runnable.

https://github.com/llvm/llvm-project/pull/97065
[clang] [llvm] [openmp] [OpenMP][offload] Fix dynamic schedule tracking (PR #97065)
@@ -444,32 +444,81 @@ template struct omptarget_nvptx_LoopSupport {

 // KMP interface implementation (dyn loops)

-// TODO: This is a stopgap. We probably want to expand the dispatch API to take
-// an DST pointer which can then be allocated properly without malloc.
-static DynamicScheduleTracker *THREAD_LOCAL(ThreadDSTPtr);
+// TODO: Expand the dispatch API to take a DST pointer which can then be
+// allocated properly without malloc.
+// For now, each team will contain an LDS pointer (ThreadDST) to a global array
+// of references to the DST structs allocated (in global memory) for each thread
+// in the team. The global memory array is allocated during the init phase if it
+// was not allocated already and will be deallocated when the dispatch phase
+// ends:
+//
+// __kmpc_dispatch_init
+//
+// ** Dispatch loop **
+//
+// __kmpc_dispatch_deinit
+//
+static DynamicScheduleTracker **SHARED(ThreadDST);

 // Create a new DST, link the current one, and define the new as current.
 static DynamicScheduleTracker *pushDST() {
+  int32_t ThreadOffset = mapping::getThreadIdInBlock();
+  // Each block will allocate an array of pointers to DST structs. The array is
+  // equal in length to the number of threads in that block.
+  if (ThreadDST == 0) {

doru1004 wrote:

Done

https://github.com/llvm/llvm-project/pull/97065
[clang] [llvm] [openmp] [OpenMP][offload] Fix dynamic schedule tracking (PR #97065)
@@ -444,32 +444,81 @@ template struct omptarget_nvptx_LoopSupport {

 // KMP interface implementation (dyn loops)

-// TODO: This is a stopgap. We probably want to expand the dispatch API to take
-// an DST pointer which can then be allocated properly without malloc.
-static DynamicScheduleTracker *THREAD_LOCAL(ThreadDSTPtr);
+// TODO: Expand the dispatch API to take a DST pointer which can then be
+// allocated properly without malloc.
+// For now, each team will contain an LDS pointer (ThreadDST) to a global array
+// of references to the DST structs allocated (in global memory) for each thread
+// in the team. The global memory array is allocated during the init phase if it
+// was not allocated already and will be deallocated when the dispatch phase
+// ends:
+//
+// __kmpc_dispatch_init
+//
+// ** Dispatch loop **
+//
+// __kmpc_dispatch_deinit
+//
+static DynamicScheduleTracker **SHARED(ThreadDST);

 // Create a new DST, link the current one, and define the new as current.
 static DynamicScheduleTracker *pushDST() {
+  int32_t ThreadOffset = mapping::getThreadIdInBlock();

doru1004 wrote:

Changed to ThreadIndex

https://github.com/llvm/llvm-project/pull/97065
[clang] [llvm] [openmp] [OpenMP][offload] Fix dynamic schedule tracking (PR #97065)
@@ -444,32 +444,81 @@ template struct omptarget_nvptx_LoopSupport {
 // KMP interface implementation (dyn loops)
-// TODO: This is a stopgap. We probably want to expand the dispatch API to take
-// an DST pointer which can then be allocated properly without malloc.
-static DynamicScheduleTracker *THREAD_LOCAL(ThreadDSTPtr);
+// TODO: Expand the dispatch API to take a DST pointer which can then be
+// allocated properly without malloc.
+// For now, each team will contain an LDS pointer (ThreadDST) to a global array
+// of references to the DST structs allocated (in global memory) for each thread
+// in the team. The global memory array is allocated during the init phase if it
+// was not allocated already and will be deallocated when the dispatch phase
+// ends:
+//
+// __kmpc_dispatch_init
+//
+// ** Dispatch loop **
+//
+// __kmpc_dispatch_deinit
+//
+static DynamicScheduleTracker **SHARED(ThreadDST);
 // Create a new DST, link the current one, and define the new as current.
 static DynamicScheduleTracker *pushDST() {
+  int32_t ThreadIndex = mapping::getThreadIdInBlock();
+  // Each block will allocate an array of pointers to DST structs. The array is
+  // equal in length to the number of threads in that block.
+  if (!ThreadDST) {
+    // Allocate global memory array of pointers to DST structs:
+    if (ThreadIndex == 0)

doru1004 wrote:

Thread 0 is an arbitrary choice.

https://github.com/llvm/llvm-project/pull/97065
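The scheme in the quoted comment (one team-wide pointer to an array with one tracker slot per thread, allocated lazily by a single designated thread and freed when dispatch ends) can be sketched on the host as below. This is an assumption-laden illustration, not the PR's device code: `Team`, `Tracker`, `ensureSlots`, `pushTracker`, `popTracker`, `releaseSlots` and `demoTeam` are hypothetical names, "threads" are simulated sequentially by index, and the team-wide synchronization that a real GPU runtime would need between the allocation and the first use of the array is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for DynamicScheduleTracker.
struct Tracker {
  int LoopId = 0;
  Tracker *Next = nullptr;
};

// Models one team. Slots plays the role of the team-shared pointer
// (ThreadDST in the patch): an array with one entry per thread.
struct Team {
  explicit Team(std::size_t N) : NumThreads(N) {}
  std::size_t NumThreads;
  Tracker **Slots = nullptr;
  int Allocations = 0; // bookkeeping for the demo only
};

// Called by every thread before it touches its slot. Only the designated
// thread (index 0, an arbitrary choice) allocates, and only if needed.
void ensureSlots(Team &T, std::size_t ThreadIndex) {
  if (!T.Slots && ThreadIndex == 0) {
    T.Slots = new Tracker *[T.NumThreads](); // all slots start as nullptr
    ++T.Allocations;
  }
  // A real device implementation would synchronize the team here so that
  // no thread reads Slots before the allocation is visible.
}

// Push a new tracker onto the calling thread's slot.
Tracker *pushTracker(Team &T, std::size_t ThreadIndex, int LoopId) {
  ensureSlots(T, ThreadIndex);
  Tracker *New = new Tracker();
  New->LoopId = LoopId;
  New->Next = T.Slots[ThreadIndex];
  T.Slots[ThreadIndex] = New;
  return New;
}

// Pop the calling thread's current tracker.
void popTracker(Team &T, std::size_t ThreadIndex) {
  Tracker *Old = T.Slots[ThreadIndex];
  assert(Old && "no active tracker for this thread");
  T.Slots[ThreadIndex] = Old->Next;
  delete Old;
}

// Free the team array once every thread has popped its trackers.
void releaseSlots(Team &T) {
  if (!T.Slots)
    return;
  for (std::size_t I = 0; I < T.NumThreads; ++I)
    assert(T.Slots[I] == nullptr && "tracker still active");
  delete[] T.Slots;
  T.Slots = nullptr;
}

// Simulates init, dispatch, and deinit for a team of four threads.
// Thread 0 runs first because the sketch has no barrier.
bool demoTeam() {
  Team T(4);
  for (std::size_t I = 0; I < T.NumThreads; ++I)
    pushTracker(T, I, /*LoopId=*/1);
  bool OneAllocation = T.Allocations == 1;
  bool AllSet = true;
  for (std::size_t I = 0; I < T.NumThreads; ++I)
    AllSet = AllSet && T.Slots[I] && T.Slots[I]->LoopId == 1;
  for (std::size_t I = 0; I < T.NumThreads; ++I)
    popTracker(T, I);
  releaseSlots(T);
  return OneAllocation && AllSet && T.Slots == nullptr;
}
```

Any single thread could play the allocating role; what matters in the sketch is that exactly one does, which `Allocations == 1` checks.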
[clang] [llvm] [openmp] [OpenMP][offload] Fix dynamic schedule tracking (PR #97065)
@@ -444,32 +444,81 @@ template struct omptarget_nvptx_LoopSupport {
 // KMP interface implementation (dyn loops)
-// TODO: This is a stopgap. We probably want to expand the dispatch API to take
-// an DST pointer which can then be allocated properly without malloc.
-static DynamicScheduleTracker *THREAD_LOCAL(ThreadDSTPtr);
+// TODO: Expand the dispatch API to take a DST pointer which can then be
+// allocated properly without malloc.
+// For now, each team will contain an LDS pointer (ThreadDST) to a global array
+// of references to the DST structs allocated (in global memory) for each thread
+// in the team. The global memory array is allocated during the init phase if it
+// was not allocated already and will be deallocated when the dispatch phase
+// ends:
+//
+// __kmpc_dispatch_init
+//
+// ** Dispatch loop **
+//
+// __kmpc_dispatch_deinit
+//
+static DynamicScheduleTracker **SHARED(ThreadDST);
 // Create a new DST, link the current one, and define the new as current.
 static DynamicScheduleTracker *pushDST() {
+  int32_t ThreadIndex = mapping::getThreadIdInBlock();
+  // Each block will allocate an array of pointers to DST structs. The array is
+  // equal in length to the number of threads in that block.
+  if (!ThreadDST) {
+    // Allocate global memory array of pointers to DST structs:
+    if (ThreadIndex == 0)

doru1004 wrote:

Is there a better check to do then?

https://github.com/llvm/llvm-project/pull/97065
[clang] [llvm] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
doru1004 wrote:

This is failing for me:
```
ld.lld: error: undefined symbol: llvm::InstrProfSymtab::create(llvm::StringRef)
>>> referenced by GlobalHandler.cpp
>>>               GlobalHandler.cpp.o:(llvm::omp::target::plugin::GPUProfGlobals::dump() const) in archive /home/dobercea/upstream/llvm-project/build/lib/libomptarget.rtl.amdgpu.a

ld.lld: error: undefined symbol: llvm::InstrProfSymtab::dumpNames(llvm::raw_ostream&) const
>>> referenced by GlobalHandler.cpp
>>>               GlobalHandler.cpp.o:(llvm::omp::target::plugin::GPUProfGlobals::dump() const) in archive /home/dobercea/upstream/llvm-project/build/lib/libomptarget.rtl.amdgpu.a
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
```

https://github.com/llvm/llvm-project/pull/76587
[clang] [llvm] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
doru1004 wrote:

Should this be reverted?

https://github.com/llvm/llvm-project/pull/76587
[clang-tools-extra] Revert: [clangd] Replace an include with a forward declaration (PR #97082)
https://github.com/doru1004 created https://github.com/llvm/llvm-project/pull/97082

Reverting due to failures on several buildbots.

>From beb28561c632a9c76412d78210f6c7cdcf50819a Mon Sep 17 00:00:00 2001
From: Doru Bercea
Date: Fri, 28 Jun 2024 12:37:31 -0400
Subject: [PATCH] Revert: [clangd] Replace an include with a forward declaration

---
 clang-tools-extra/clangd/index/remote/Client.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/clang-tools-extra/clangd/index/remote/Client.h b/clang-tools-extra/clangd/index/remote/Client.h
index 9755fb23c2ba5..9004012f066ae 100644
--- a/clang-tools-extra/clangd/index/remote/Client.h
+++ b/clang-tools-extra/clangd/index/remote/Client.h
@@ -9,6 +9,7 @@
 #ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_REMOTE_CLIENT_H
 #define LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_REMOTE_CLIENT_H
+#include "index/Index.h"
 #include "llvm/ADT/StringRef.h"
 #include
@@ -16,8 +17,6 @@ namespace clang {
 namespace clangd {
-class SymbolIndex;
-
 namespace remote {
 /// Returns an SymbolIndex client that passes requests to remote index located
[clang-tools-extra] Revert: [clangd] Replace an include with a forward declaration (PR #97082)
https://github.com/doru1004 updated https://github.com/llvm/llvm-project/pull/97082

>From beb28561c632a9c76412d78210f6c7cdcf50819a Mon Sep 17 00:00:00 2001
From: Doru Bercea
Date: Fri, 28 Jun 2024 12:37:31 -0400
Subject: [PATCH] Revert: [clangd] Replace an include with a forward declaration

---
 clang-tools-extra/clangd/index/remote/Client.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/clang-tools-extra/clangd/index/remote/Client.h b/clang-tools-extra/clangd/index/remote/Client.h
index 9755fb23c2ba5..9004012f066ae 100644
--- a/clang-tools-extra/clangd/index/remote/Client.h
+++ b/clang-tools-extra/clangd/index/remote/Client.h
@@ -9,6 +9,7 @@
 #ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_REMOTE_CLIENT_H
 #define LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_REMOTE_CLIENT_H
+#include "index/Index.h"
 #include "llvm/ADT/StringRef.h"
 #include
@@ -16,8 +17,6 @@ namespace clang {
 namespace clangd {
-class SymbolIndex;
-
 namespace remote {
 /// Returns an SymbolIndex client that passes requests to remote index located
[clang] [llvm] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
doru1004 wrote:

I'm building on an x86 + AMD GPU. What fails is this command:
```
[8/14] Performing build step for 'runtimes'
[1/4] Linking CXX shared library /home/dobercea/upstream/llvm-project/build/lib/libomptarget.so.19.0git
FAILED: /home/dobercea/upstream/llvm-project/build/lib/libomptarget.so.19.0git
: && /home/dobercea/upstream/llvm-project/build/./bin/clang++ --target=x86_64-unknown-linux-gnu -fPIC []
```

https://github.com/llvm/llvm-project/pull/76587
[clang] [llvm] [openmp] [OpenMP][offload] Fix dynamic schedule tracking (PR #97065)
@@ -444,32 +444,81 @@ template struct omptarget_nvptx_LoopSupport {
 // KMP interface implementation (dyn loops)
-// TODO: This is a stopgap. We probably want to expand the dispatch API to take
-// an DST pointer which can then be allocated properly without malloc.
-static DynamicScheduleTracker *THREAD_LOCAL(ThreadDSTPtr);
+// TODO: Expand the dispatch API to take a DST pointer which can then be
+// allocated properly without malloc.
+// For now, each team will contain an LDS pointer (ThreadDST) to a global array
+// of references to the DST structs allocated (in global memory) for each thread
+// in the team. The global memory array is allocated during the init phase if it
+// was not allocated already and will be deallocated when the dispatch phase
+// ends:
+//
+// __kmpc_dispatch_init
+//
+// ** Dispatch loop **
+//
+// __kmpc_dispatch_deinit
+//
+static DynamicScheduleTracker **SHARED(ThreadDST);
 // Create a new DST, link the current one, and define the new as current.
 static DynamicScheduleTracker *pushDST() {
+  int32_t ThreadIndex = mapping::getThreadIdInBlock();
+  // Each block will allocate an array of pointers to DST structs. The array is
+  // equal in length to the number of threads in that block.
+  if (!ThreadDST) {
+    // Allocate global memory array of pointers to DST structs:
+    if (ThreadIndex == 0)

doru1004 wrote:

I'll check that in a bit. It looks like trunk is broken, and I've rebased on the latest. I'm trying to resolve the multiple failures on trunk so that I can actually build the compiler again.

https://github.com/llvm/llvm-project/pull/97065
[clang] [llvm] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
doru1004 wrote:

Yes of course:
```
cmake \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=~/rocm/trunk_1.0 \
  -DLLVM_ENABLE_PROJECTS="clang;lld;llvm;clang-tools-extra;compiler-rt;flang" \
  -DLLVM_LIT_ARGS="-vv --show-unsupported --show-xfail -j 32" \
  -DLLVM_TARGETS_TO_BUILD="X86;AMDGPU" \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLIBOMPTARGET_ENABLE_DEBUG=ON \
  -DPACKAGE_VENDOR="AMD" \
  -DCLANG_DEFAULT_LINKER=lld \
  -DLLVM_ENABLE_RUNTIMES="openmp;offload" \
  -DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=libc \
  -DLLVM_RUNTIME_TARGETS="default;amdgcn-amd-amdhsa" \
  -DBUILD_SHARED_LIBS=ON \
  ../llvm -GNinja
```

https://github.com/llvm/llvm-project/pull/76587