[PATCH] D29879: [OpenMP] Teams reduction on the NVPTX device.

2017-02-16 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 88715. arpith-jacob added a comment. Addressed review comments. https://reviews.llvm.org/D29879 Files: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp lib/CodeGen/CGStmtOpenMP.cpp test/OpenMP/nvptx_teams_reduction_codegen.cpp Index: test/OpenMP/nvptx_teams

[PATCH] D29879: [OpenMP] Teams reduction on the NVPTX device.

2017-02-16 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob marked an inline comment as done. arpith-jacob added a comment. Alexey, do you have any more comments on this patch? https://reviews.llvm.org/D29879

[PATCH] D29910: [OpenMP] Specialize default schedule on a worksharing loop on the NVPTX device.

2017-02-14 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 88423. arpith-jacob added a comment. Hi Alexey, Thank you for reviewing this patch. > I don't like the idea of adding some kind of default scheduling, that is not > defined in standard in Sema Actually, "default scheduling" is defined in the OpenMP sp

[PATCH] D29879: [OpenMP] Teams reduction on the NVPTX device.

2017-02-14 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob marked 8 inline comments as done. arpith-jacob added a comment. Alexey, thank you for your review. I have used SizeTy instead of assuming 64-bits. Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:725 +/*isVarArg=*/false); +llvm

[PATCH] D29879: [OpenMP] Teams reduction on the NVPTX device.

2017-02-14 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 88407. arpith-jacob added a comment. Use SizeTy instead of assuming 64 bits! https://reviews.llvm.org/D29879 Files: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp lib/CodeGen/CGStmtOpenMP.cpp test/OpenMP/nvptx_teams_reduction_codegen.cpp Index: test/OpenM

[PATCH] D29910: [OpenMP] Specialize default schedule on a worksharing loop on the NVPTX device.

2017-02-14 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added a comment. Hi Alexey, Thank you for your review. The main differences in the specialized codegen (if-part vs. else-part in CGStmtOpenMP.cpp) are: If-part: emitForStaticInit uses the Chunk parameter (the else-part has it set to null). If-part: does not use EmitIgnoredExpr(). I can combine if- a

[PATCH] D29910: [OpenMP] Specialize default schedule on a worksharing loop on the NVPTX device.

2017-02-13 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. The default schedule type on a worksharing loop is implementation defined according to the OpenMP specifications. Currently, the compiler codegens a doubly nested loop that effectively implements a schedule of type (static). This is ideal for threads on CPUs.
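For readers unfamiliar with what a schedule of type (static) means for iteration ownership, here is a minimal host-side sketch. The helper names staticBlocked and staticChunked are hypothetical and not part of the patch. The blocked split shown is one common way to satisfy the OpenMP requirement of roughly equal contiguous chunks; the exact split is implementation defined. The snippet above is cut off before it says which schedule the patch selects for NVPTX, so the chunked variant is included only for contrast.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Iterations owned by thread `tid` under schedule(static) with no chunk size:
// one contiguous block per thread, block sizes differing by at most one.
// (Assumed split; implementations may distribute the remainder differently.)
std::vector<std::size_t> staticBlocked(std::size_t n, std::size_t nthreads,
                                       std::size_t tid) {
  const std::size_t base = n / nthreads;
  const std::size_t rem = n % nthreads;
  const std::size_t begin = tid * base + std::min(tid, rem);
  const std::size_t len = base + (tid < rem ? 1 : 0);
  std::vector<std::size_t> its;
  for (std::size_t i = begin; i < begin + len; ++i)
    its.push_back(i);
  return its;
}

// Iterations owned by thread `tid` under schedule(static, chunk):
// chunks of `chunk` iterations are dealt to threads round-robin.
std::vector<std::size_t> staticChunked(std::size_t n, std::size_t nthreads,
                                       std::size_t tid, std::size_t chunk) {
  std::vector<std::size_t> its;
  for (std::size_t start = tid * chunk; start < n; start += nthreads * chunk)
    for (std::size_t i = start; i < std::min(start + chunk, n); ++i)
      its.push_back(i);
  return its;
}
```

With 10 iterations and 4 threads, the blocked form gives thread 1 the consecutive iterations 3, 4, 5, while a chunk size of 1 gives it the strided iterations 1, 5, 9.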

[PATCH] D29879: [OpenMP] Teams reduction on the NVPTX device.

2017-02-12 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. Herald added a subscriber: jholewinski. This patch implements codegen for the reduction clause on any teams construct for elementary data types. It builds on parallel reductions on the GPU. Subsequently, the team master writes to a unique location in a global

[PATCH] D29758: [OpenMP] Parallel reduction on the NVPTX device.

2017-02-12 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 88149. arpith-jacob added a comment. Minor fixup of comment style on emitInterWarpCopyFunction(). https://reviews.llvm.org/D29758 Files: lib/CodeGen/CGOpenMPRuntime.cpp lib/CodeGen/CGOpenMPRuntime.h lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp lib/Code

[PATCH] D29758: [OpenMP] Parallel reduction on the NVPTX device.

2017-02-12 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 88144. arpith-jacob added a comment. Updated patch to address Alexey's comments. Condensed parameters in emitReduction() to a struct Options. https://reviews.llvm.org/D29758 Files: lib/CodeGen/CGOpenMPRuntime.cpp lib/CodeGen/CGOpenMPRuntime.h l

[PATCH] D29758: [OpenMP] Parallel reduction on the NVPTX device.

2017-02-10 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntime.h:956-962 virtual void emitReduction(CodeGenFunction &CGF, SourceLocation Loc, ArrayRef Privates, ArrayRef LHSExprs,

[PATCH] D29506: [OpenMP] Teams reduction on the NVPTX device.

2017-02-09 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob marked 2 inline comments as done. arpith-jacob added a comment. In https://reviews.llvm.org/D29506#669542, @ABataev wrote: > The patch is too big and quite hard to review? Could you split it into > several smaller parts? Alexey, thank you for your time. I have addressed your comm

[PATCH] D29758: [OpenMP] Parallel reduction on the NVPTX device.

2017-02-09 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. Herald added a subscriber: jholewinski. This patch implements codegen for the reduction clause on any parallel construct for elementary data types. An efficient implementation requires hierarchical reduction within a warp and a threadblock. It is complicated b
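The idea of a hierarchical reduction within a warp and then a threadblock can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions, not the code the patch emits: reduceWarp and reduceBlock are hypothetical names, the operation is fixed to integer addition, and the warp width of 32 is the usual NVIDIA value rather than something stated in the snippet.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr std::size_t kWarpSize = 32; // assumed warp width

// Stage 1: tree reduction across the lanes of one warp. Inactive lanes are
// padded with 0, the identity for addition. On a GPU each halving step would
// exchange values between lanes; here it is a plain loop.
int reduceWarp(const int *lanes, std::size_t active) {
  int vals[kWarpSize] = {0};
  for (std::size_t i = 0; i < active; ++i)
    vals[i] = lanes[i];
  for (std::size_t offset = kWarpSize / 2; offset > 0; offset /= 2)
    for (std::size_t lane = 0; lane < offset; ++lane)
      vals[lane] += vals[lane + offset];
  return vals[0];
}

// Stage 2: every warp produces one partial result, and the partials are then
// combined. A serial loop stands in for the inter-warp step.
int reduceBlock(const std::vector<int> &threadVals) {
  std::vector<int> warpSums;
  for (std::size_t begin = 0; begin < threadVals.size(); begin += kWarpSize) {
    const std::size_t active =
        std::min(kWarpSize, threadVals.size() - begin);
    warpSums.push_back(reduceWarp(threadVals.data() + begin, active));
  }
  int total = 0;
  for (int s : warpSums)
    total += s;
  return total;
}
```

The two-stage shape keeps most of the work inside a warp and leaves only one value per warp for the cross-warp step.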

[PATCH] D29506: [OpenMP] Teams reduction on the NVPTX device.

2017-02-03 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntime.h:524 + + static bool classof(const CGOpenMPRuntime *RT) { +return RT->getKind() == RK_HOST; This is required to cast to the NVPTX runtime in a static function as follows; CGOpenMPR
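The classof hook mentioned in this comment is the basis of LLVM-style RTTI: a kind tag on the base class lets code downcast without C++ dynamic_cast. A minimal self-contained sketch of the pattern follows. Runtime, NVPTXRuntime, RK_NVPTX and dynCast are hypothetical stand-ins (the real code would use the CGOpenMPRuntime classes and llvm::cast / llvm::dyn_cast); only RK_HOST appears in the snippet.

```cpp
// Base class carrying a kind tag, in the style of LLVM's hand-rolled RTTI.
class Runtime {
public:
  enum Kind { RK_HOST, RK_NVPTX }; // RK_NVPTX is an assumed enumerator
  explicit Runtime(Kind K = RK_HOST) : TheKind(K) {}
  virtual ~Runtime() = default;
  Kind getKind() const { return TheKind; }

private:
  const Kind TheKind;
};

// Derived class that identifies itself through classof.
class NVPTXRuntime : public Runtime {
public:
  NVPTXRuntime() : Runtime(RK_NVPTX) {}
  static bool classof(const Runtime *RT) { return RT->getKind() == RK_NVPTX; }
  const char *name() const { return "nvptx"; }
};

// Stand-in for llvm::dyn_cast: returns nullptr when the kind does not match.
template <typename To, typename From> To *dynCast(From *V) {
  return To::classof(V) ? static_cast<To *>(V) : nullptr;
}
```

A static function that only holds a base-class pointer can then recover the device-specific runtime with dynCast<NVPTXRuntime>(ptr) and check the result for null.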

[PATCH] D29143: [OpenMP] Codegen support for 'target teams' on the NVPTX device.

2017-01-25 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. Herald added a subscriber: jholewinski. This is a simple patch to teach OpenMP codegen to emit the construct in Generic mode. https://reviews.llvm.org/D29143 Files: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp test/OpenMP/nvptx_target_teams_codegen.cpp Index: te

[PATCH] D29128: [OpenMP] Support for the proc_bind-clause on 'target parallel' on the NVPTX device.

2017-01-25 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. Herald added a subscriber: jholewinski. This patch adds support for the proc_bind clause on the Spmd construct 'target parallel' on the NVPTX device. Since the parallel region is created upon kernel launch, this clause can be safely ignored on the NVPTX device

[PATCH] D29087: [OpenMP] Support for thread_limit-clause on the 'target teams' directive.

2017-01-24 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. The thread_limit-clause on the combined directive applies to the 'teams' region of this construct. We modify the ThreadLimitClause class to capture the clause expression within the 'target' region. https://reviews.llvm.org/D29087 Files: include/clang/AST/O

[PATCH] D29085: [OpenMP] Support for num_teams-clause on the 'target teams' directive.

2017-01-24 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. The num_teams-clause on the combined directive applies to the 'teams' region of this construct. We modify the NumTeamsClause class to capture the clause expression within the 'target' region. https://reviews.llvm.org/D29085 Files: include/clang/AST/OpenMPC

[PATCH] D29084: [OpenMP] Codegen support for 'target teams' on the host.

2017-01-24 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. This patch adds support for codegen of 'target teams' on the host. This combined directive has two captured statements, one for the 'teams' region, and the other for the 'parallel'. This target teams region is offloaded using the __tgt_target_teams() call. The

[PATCH] D29082: [OpenMP] Support for the num_threads-clause on 'target parallel'.

2017-01-24 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. The num_threads-clause on the combined directive applies to the 'parallel' region of this construct. We modify the NumThreadsClause class to capture the clause expression within the 'target' region. The offload runtime call for 'target parallel' is changed to

[PATCH] D29083: [OpenMP] Support for the num_threads-clause on 'target parallel' on the NVPTX device.

2017-01-24 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. Herald added a subscriber: jholewinski. This patch adds support for the Spmd construct 'target parallel' on the NVPTX device. This involves ignoring the num_threads clause on the device since the number of threads in this combined construct is already set on th

[PATCH] D29026: [OpenMP] DSAChecker bug fix for combined directives.

2017-01-23 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. The DSAChecker code in SemaOpenMP looks at the captured statement associated with an OpenMP directive. A combined directive such as 'target parallel' has nested capture statements, which have to be fully traversed before executing the DSAChecker. This is a pat

[PATCH] D28753: [OpenMP] Codegen support for 'target parallel' on the host.

2017-01-18 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added inline comments. Comment at: lib/Sema/SemaOpenMP.cpp:1933-1937 + StmtResult SR = S; + int ThisCaptureLevel = + getOpenMPCaptureLevels(DSAStack->getCurrentDirective()); + while (--ThisCaptureLevel >= 0) +SR = ActOnCapturedRegionEnd(SR.get()); ---

[PATCH] D28781: [OpenMP] Support for the if-clause on the combined directive 'target parallel'.

2017-01-18 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 84816. arpith-jacob added a comment. Inherit from OMPLexical scope with an added argument to reduce code duplication. https://reviews.llvm.org/D28781 Files: include/clang/AST/OpenMPClause.h include/clang/AST/RecursiveASTVisitor.h lib/AST/OpenMPCl

[PATCH] D28781: [OpenMP] Support for the if-clause on the combined directive 'target parallel'.

2017-01-17 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added a comment. Another correction. We'll have to create a similar scope OMPTeamsScope that inherits from OMPLexicalScope for target-teams combined directives. https://reviews.llvm.org/D28781

[PATCH] D28781: [OpenMP] Support for the if-clause on the combined directive 'target parallel'.

2017-01-17 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added inline comments. Comment at: lib/CodeGen/CGStmtOpenMP.cpp:84-115 +/// Lexical scope for OpenMP parallel construct, that handles correct codegen +/// for captured expressions. +class OMPParallelScope final : public CodeGenFunction::LexicalScope { + void emitPre

[PATCH] D28753: [OpenMP] Codegen support for 'target parallel' on the host.

2017-01-17 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 84689. arpith-jacob added a comment. The patch was updated to split 'emitParallelOrTeamsOutlinedFunction' into 'emitParallelOutlinedFunction' and 'emitTeamsOutlinedFunction' to enable the use of getCapturedStmt(). Also updated an assert statement for c

[PATCH] D28753: [OpenMP] Codegen support for 'target parallel' on the host.

2017-01-17 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntime.h:543 virtual llvm::Value *emitParallelOrTeamsOutlinedFunction( - const OMPExecutableDirective &D, const VarDecl *ThreadIDVar, - OpenMPDirectiveKind InnermostKind, const RegionCodeGenTy &Code

[PATCH] D28753: [OpenMP] Codegen support for 'target parallel' on the host.

2017-01-16 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 84629. arpith-jacob added a comment. Updated 'getOpenMPCaptureRegions' to return the OMPD_teams region kind for the teams directive. https://reviews.llvm.org/D28753 Files: include/clang/AST/StmtOpenMP.h include/clang/Basic/OpenMPKinds.h include/

[PATCH] D28753: [OpenMP] Codegen support for 'target parallel' on the host.

2017-01-16 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 84627. arpith-jacob added a comment. Added a method 'getCapturedStmt' as part of OMPExecutableDirective. https://reviews.llvm.org/D28753 Files: include/clang/AST/StmtOpenMP.h include/clang/Basic/OpenMPKinds.h include/clang/Sema/Sema.h lib/Basic

[PATCH] D28781: [OpenMP] Support for the if-clause on the combined directive 'target parallel'.

2017-01-16 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. The if-clause on the combined directive potentially applies to both the 'target' and the 'parallel' regions. Codegen'ing the if-clause on the combined directive requires additional support because the expression in the clause must be captured by the 'target' cap
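As a user-level illustration of why the if-clause can apply to both regions, OpenMP 4.5 lets a combined directive carry one if-clause per constituent construct via a directive-name modifier. The function below is a hypothetical example, not taken from the patch or its tests; scaleInPlace and offloadThreshold are made-up names. Built without OpenMP support, the pragma is ignored and the loop simply runs serially on the host.

```cpp
#include <cstddef>
#include <vector>

// Scales every element of `v` by `factor`.
// if(target: ...)   decides whether the region is offloaded to a device.
// if(parallel: ...) decides whether the parallel region is active or
//                   executed by a single thread.
void scaleInPlace(std::vector<double> &v, double factor,
                  std::size_t offloadThreshold) {
  double *data = v.data();
  const std::size_t n = v.size();
#pragma omp target parallel for if(target: n >= offloadThreshold) \
    if(parallel: n > 1) map(tofrom: data[0:n])
  for (std::size_t i = 0; i < n; ++i)
    data[i] *= factor;
}
```

An if-clause written without a modifier would apply to every construct in the combined directive that accepts one, here both 'target' and 'parallel'.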

[PATCH] D28752: [OpenMP] Refactor code that calls codegen for target regions on the device.

2017-01-16 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added a comment. Thanks Alexey. > Is this an NFC patch? If so add 'NFC' to this patch. Do you mean NVPTX? No, this is a patch to support target directives for any accelerator. https://reviews.llvm.org/D28752

[PATCH] D28755: [OpenMP] Codegen for the 'target parallel' directive on the NVPTX device.

2017-01-15 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, caomhin, kkwli0, gtbercea. arpith-jacob added a subscriber: cfe-commits. Herald added a subscriber: jholewinski. This patch adds codegen for the 'target parallel' directive on the NVPTX device. We

[PATCH] D28753: [OpenMP] Codegen support for 'target parallel' on the host.

2017-01-15 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, caomhin, kkwli0, gtbercea. arpith-jacob added a subscriber: cfe-commits. Herald added a subscriber: jholewinski. This patch adds support for codegen of 'target parallel' on the host. It is also the

[PATCH] D28752: [OpenMP] Refactor code that calls codegen for target regions on the device.

2017-01-15 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, kkwli0, caomhin, gtbercea. arpith-jacob added a subscriber: cfe-commits. This patch refactors code that calls codegen for target regions. Currently the codebase only supports the 'target' directive

[PATCH] D28145: [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device.

2017-01-09 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 83641. arpith-jacob added a comment. Use i1 type for bool after all. But this time use the api ConvertType(). https://reviews.llvm.org/D28145 Files: lib/CodeGen/CGOpenMPRuntime.cpp lib/CodeGen/CGOpenMPRuntime.h lib/CodeGen/CGOpenMPRuntimeNVPTX.c

[PATCH] D28145: [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device.

2017-01-09 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 83631. arpith-jacob marked an inline comment as done. arpith-jacob added a comment. Using CGF.ConvertTypeForMem(Context.getBoolType()) to get the right type for 'bool' rather than using i1. https://reviews.llvm.org/D28145 Files: lib/CodeGen/CGOpenMP

[PATCH] D28145: [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device.

2017-01-09 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob marked 2 inline comments as done. arpith-jacob added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:365 +llvm::FunctionType *FnTy = +llvm::FunctionType::get(llvm::Type::getInt1Ty(CGM.getLLVMContext()), +T

[PATCH] D28145: [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device.

2017-01-09 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 83609. arpith-jacob added a comment. Moved CommonActionTy to CGOpenMPRuntimeNVPTX.cpp and renamed it to NVPTXActionTy, allowing us to customize the class in the future, if necessary. https://reviews.llvm.org/D28145 Files: lib/CodeGen/CGOpenMPRuntime

[PATCH] D28145: [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device.

2017-01-03 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 82922. arpith-jacob added a comment. Updated patch based on reviews. The serialized parallel region executed by the master was modified to call the 'end' runtime call with a PrePostActionTy so that it is called upon exit of any cleanup scope. Added re

[PATCH] D28145: [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device.

2017-01-03 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:511-516 +// Activate workers. +syncCTAThreads(CGF); + +// Barrier at end of parallel region. +syncCTAThreads(CGF); + tra wrote: > Are two back-to-back syncCTAThre

[PATCH] D28145: [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device.

2017-01-02 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:539-542 +llvm::Value *EndArgs[] = {emitUpdateLocation(CGF, Loc), ThreadID}; +CGF.EmitRuntimeCall( +createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_end_serialized_parallel), +

[PATCH] D28145: [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device.

2016-12-30 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 82750. arpith-jacob added a comment. Alexey, thank you for your review. I've updated the patch addressing your comments. - I experimented with various ways of changing the name of the outlined function. In the end I decided against moving the two clas

[PATCH] D28145: [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device.

2016-12-29 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:114 /// \brief Get the name of the capture helper. - StringRef getHelperName() const override { return ".omp_outlined."; } + StringRef getHelperName() const override { return "__omp_outlined__";

[PATCH] D28145: [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device.

2016-12-28 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:114 /// \brief Get the name of the capture helper. - StringRef getHelperName() const override { return ".omp_outlined."; } + StringRef getHelperName() const override { return "__omp_outlined__";

[PATCH] D28145: [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device.

2016-12-28 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, kkwli0, caomhin. arpith-jacob added a subscriber: cfe-commits. Herald added a subscriber: jholewinski. This patch introduces support for the execution of parallel constructs in a target region on t

[PATCH] D28125: [OpenMP] Update target codegen for NVPTX device.

2016-12-28 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 82621. arpith-jacob added a comment. Alexey and Justin, thank you for spending the time to review this patch. I've updated the patch accordingly. I've also removed a dot ('.') from the worker function name since the character is not accepted by the nv

[PATCH] D28124: [OpenMP] Code cleanup for NVPTX OpenMP codegen

2016-12-28 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob updated this revision to Diff 82619. arpith-jacob added a comment. Addressed comments in review to start function names with a lowercase letter and to fix the enum type name along with the enumerator name. https://reviews.llvm.org/D28124 Files: lib/CodeGen/CGOpenMPRuntimeNVPTX.c

[PATCH] D28125: [OpenMP] Update target codegen for NVPTX device.

2016-12-27 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, kkwli0, caomhin. arpith-jacob added a subscriber: cfe-commits. Herald added subscribers: aprantl, jholewinski. This patch includes updates for codegen of the target region for the NVPTX device. It

[PATCH] D28124: [OpenMP] Code cleanup for NVPTX OpenMP codegen

2016-12-27 Thread Arpith Jacob via Phabricator via cfe-commits
arpith-jacob created this revision. arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, kkwli0, caomhin. arpith-jacob added a subscriber: cfe-commits. Herald added a subscriber: jholewinski. This patch cleans up private methods for NVPTX OpenMP codegen. It converts private members to