arpith-jacob updated this revision to Diff 88715.
arpith-jacob added a comment.
Addressed review comments.
https://reviews.llvm.org/D29879
Files:
lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
lib/CodeGen/CGStmtOpenMP.cpp
test/OpenMP/nvptx_teams_reduction_codegen.cpp
Index: test/OpenMP/nvptx_teams
arpith-jacob marked an inline comment as done.
arpith-jacob added a comment.
Alexey, do you have any more comments on this patch?
https://reviews.llvm.org/D29879
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailma
arpith-jacob updated this revision to Diff 88423.
arpith-jacob added a comment.
Hi Alexey,
Thank you for reviewing this patch.
> I don't like the idea of adding some kind of default scheduling, that is not
> defined in standard in Sema
Actually, "default scheduling" is defined in the OpenMP sp
arpith-jacob marked 8 inline comments as done.
arpith-jacob added a comment.
Alexey, thank you for your review. I have used SizeTy instead of assuming
64 bits.
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:725
+/*isVarArg=*/false);
+llvm
arpith-jacob updated this revision to Diff 88407.
arpith-jacob added a comment.
Use SizeTy instead of assuming 64 bits!
https://reviews.llvm.org/D29879
Files:
lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
lib/CodeGen/CGStmtOpenMP.cpp
test/OpenMP/nvptx_teams_reduction_codegen.cpp
Index: test/OpenM
arpith-jacob added a comment.
Hi Alexey,
Thank you for your review. The main differences in the specialized codegen (if
vs. else part in CGStmtOpenMP.cpp) are:
If-part: emitForStaticInit uses the Chunk parameter (else has it set to null).
If-part: does not use EmitIgnoredExpr()
I can combine if- a
arpith-jacob created this revision.
The default schedule type on a worksharing loop is implementation
defined according to the OpenMP specifications. Currently, the
compiler codegens a doubly nested loop that effectively implements
a schedule of type (static). This is ideal for threads on CPUs.
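A minimal sketch of what a static schedule looks like as a doubly nested loop, written as plain C++ rather than as Clang codegen. The helper names `staticIterations` and `staticBlockIterations` are hypothetical, and modeling the unchunked case as one block of size ceil(N / threads) per thread is an assumption made for illustration; the OpenMP specification only requires chunks of approximately equal size.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper, not a Clang or libomp API: lists the iterations that
// thread `tid` executes under a static schedule with the given chunk size.
std::vector<std::int64_t> staticIterations(std::int64_t tid,
                                           std::int64_t numThreads,
                                           std::int64_t numIters,
                                           std::int64_t chunk) {
  std::vector<std::int64_t> mine;
  const std::int64_t stride = numThreads * chunk;
  // Outer loop: walk the chunks assigned to this thread in round-robin order.
  for (std::int64_t lb = tid * chunk; lb < numIters; lb += stride) {
    const std::int64_t ub = std::min(lb + chunk, numIters);
    // Inner loop: run the iterations inside the current chunk.
    for (std::int64_t i = lb; i < ub; ++i)
      mine.push_back(i);
  }
  return mine;
}

// Assumption: an unchunked static schedule is modeled as a single block per
// thread of size ceil(numIters / numThreads), so the outer loop runs once.
std::vector<std::int64_t> staticBlockIterations(std::int64_t tid,
                                                std::int64_t numThreads,
                                                std::int64_t numIters) {
  const std::int64_t chunk =
      std::max<std::int64_t>((numIters + numThreads - 1) / numThreads, 1);
  return staticIterations(tid, numThreads, numIters, chunk);
}
```

With 4 threads and 10 iterations, `staticBlockIterations(1, 4, 10)` yields iterations 3, 4, 5, while `staticIterations(1, 4, 10, 1)` yields 1, 5, 9.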
arpith-jacob created this revision.
Herald added a subscriber: jholewinski.
This patch implements codegen for the reduction clause on
any teams construct for elementary data types. It builds
on parallel reductions on the GPU. Subsequently,
the team master writes to a unique location in a global
arpith-jacob updated this revision to Diff 88149.
arpith-jacob added a comment.
Minor fixup of comment style on emitInterWarpCopyFunction().
https://reviews.llvm.org/D29758
Files:
lib/CodeGen/CGOpenMPRuntime.cpp
lib/CodeGen/CGOpenMPRuntime.h
lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
lib/Code
arpith-jacob updated this revision to Diff 88144.
arpith-jacob added a comment.
Updated patch to address Alexey's comments. Condensed parameters in
emitReduction() to a struct Options.
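As an illustration of condensing trailing parameters into an options struct, here is a small standalone sketch. The struct `ReductionOptions`, its fields, and `describeReduction` are hypothetical stand-ins and do not reproduce the signature used in the patch.

```cpp
#include <string>

// Hypothetical option bundle; field names are assumptions for illustration.
struct ReductionOptions {
  bool withNowait = false;      // skip the trailing barrier
  bool simpleReduction = false; // use a simplified lowering
};

// Hypothetical consumer: callers pass one struct instead of several bools,
// so adding a new option does not change every call site.
std::string describeReduction(const std::string &var,
                              const ReductionOptions &opts) {
  std::string out = "reduce(" + var + ")";
  if (opts.simpleReduction)
    out += " simple";
  if (opts.withNowait)
    out += " nowait";
  return out;
}
```

A caller sets only the fields it cares about, for example `ReductionOptions o; o.withNowait = true;`, and leaves the rest at their defaults.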
https://reviews.llvm.org/D29758
Files:
lib/CodeGen/CGOpenMPRuntime.cpp
lib/CodeGen/CGOpenMPRuntime.h
l
arpith-jacob added inline comments.
Comment at: lib/CodeGen/CGOpenMPRuntime.h:956-962
virtual void emitReduction(CodeGenFunction &CGF, SourceLocation Loc,
ArrayRef Privates,
ArrayRef LHSExprs,
arpith-jacob marked 2 inline comments as done.
arpith-jacob added a comment.
In https://reviews.llvm.org/D29506#669542, @ABataev wrote:
> The patch is too big and quite hard to review? Could you split it into
> several smaller parts?
Alexey, thank you for your time. I have addressed your comm
arpith-jacob created this revision.
Herald added a subscriber: jholewinski.
This patch implements codegen for the reduction clause on
any parallel construct for elementary data types. An efficient
implementation requires hierarchical reduction within a
warp and a threadblock. It is complicated b
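The hierarchical pattern mentioned above (reduce within each warp, then combine the per-warp results) can be sketched as a sequential CPU simulation. This is not the code emitted by the patch: the warp width of 32, the helper names `reduceWarp` and `reduceBlock`, and the use of integer addition as the reduction operator are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr std::size_t kWarpSize = 32; // assumption: 32 lanes per warp

// Tree-reduces up to kWarpSize values, halving the active lanes each step,
// in the style of a shuffle-down reduction. Precondition: at most kWarpSize
// inputs; missing lanes are padded with 0, the identity for addition.
int reduceWarp(std::vector<int> lanes) {
  lanes.resize(kWarpSize, 0);
  for (std::size_t offset = kWarpSize / 2; offset > 0; offset /= 2)
    for (std::size_t lane = 0; lane < offset; ++lane)
      lanes[lane] += lanes[lane + offset];
  return lanes[0];
}

// Reduces one value per thread: first inside each warp, then across the
// per-warp results, repeating until a single warp's worth remains.
int reduceBlock(const std::vector<int> &values) {
  if (values.size() <= kWarpSize)
    return reduceWarp(values);
  std::vector<int> warpResults;
  for (std::size_t begin = 0; begin < values.size(); begin += kWarpSize) {
    const std::size_t end = std::min(begin + kWarpSize, values.size());
    warpResults.push_back(reduceWarp(std::vector<int>(
        values.begin() + static_cast<std::ptrdiff_t>(begin),
        values.begin() + static_cast<std::ptrdiff_t>(end))));
  }
  return reduceBlock(warpResults);
}
```

On a real GPU the inner step would use warp shuffles and the cross-warp step would go through shared memory; the simulation only shows the two-level structure.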
arpith-jacob added inline comments.
Comment at: lib/CodeGen/CGOpenMPRuntime.h:524
+
+ static bool classof(const CGOpenMPRuntime *RT) {
+return RT->getKind() == RK_HOST;
This is required to cast to the NVPTX runtime in a static function as follows:
CGOpenMPR
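For readers unfamiliar with why `classof` matters: LLVM-style casts decide at run time whether a cast is valid by calling the target class's static `classof`. The following dependency-free sketch mimics that idea. The classes `Runtime` and `NVPTXRuntime` and the helper `castIfKind` are hypothetical stand-ins, not the Clang classes or LLVM's `cast<>`/`dyn_cast<>` templates.

```cpp
// Hypothetical base class carrying a kind tag, in the style of LLVM RTTI.
class Runtime {
public:
  enum RuntimeKind { RK_Host, RK_NVPTX };
  explicit Runtime(RuntimeKind K) : Kind(K) {}
  virtual ~Runtime() = default;
  RuntimeKind getKind() const { return Kind; }
  static bool classof(const Runtime *RT) { return RT->getKind() == RK_Host; }

private:
  const RuntimeKind Kind;
};

// Hypothetical derived class; its classof identifies objects of this kind.
class NVPTXRuntime : public Runtime {
public:
  NVPTXRuntime() : Runtime(RK_NVPTX) {}
  static bool classof(const Runtime *RT) { return RT->getKind() == RK_NVPTX; }
  int warpSize() const { return 32; } // illustrative member only
};

// Hand-written stand-in for a checked downcast: returns nullptr when the
// object's kind does not match the requested class.
template <typename To> To *castIfKind(Runtime *RT) {
  return To::classof(RT) ? static_cast<To *>(RT) : nullptr;
}
```

A static function holding only a `Runtime *` can then call `castIfKind<NVPTXRuntime>(RT)` and use NVPTX-specific members when the result is non-null.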
arpith-jacob created this revision.
Herald added a subscriber: jholewinski.
This is a simple patch to teach OpenMP codegen to emit the construct
in Generic mode.
https://reviews.llvm.org/D29143
Files:
lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
test/OpenMP/nvptx_target_teams_codegen.cpp
Index: te
arpith-jacob created this revision.
Herald added a subscriber: jholewinski.
This patch adds support for the proc_bind clause on the Spmd construct
'target parallel' on the NVPTX device. Since the parallel region is created
upon kernel launch, this clause can be safely ignored on the NVPTX device
arpith-jacob created this revision.
The thread_limit-clause on the combined directive applies to the
'teams' region of this construct. We modify the ThreadLimitClause
class to capture the clause expression within the 'target' region.
https://reviews.llvm.org/D29087
Files:
include/clang/AST/O
arpith-jacob created this revision.
The num_teams-clause on the combined directive applies to the
'teams' region of this construct. We modify the NumTeamsClause
class to capture the clause expression within the 'target' region.
https://reviews.llvm.org/D29085
Files:
include/clang/AST/OpenMPC
arpith-jacob created this revision.
This patch adds support for codegen of 'target teams' on the host.
This combined directive has two captured statements, one for the
'teams' region, and the other for the 'parallel'.
This target teams region is offloaded using the __tgt_target_teams()
call. The
arpith-jacob created this revision.
The num_threads-clause on the combined directive applies to the
'parallel' region of this construct. We modify the NumThreadsClause
class to capture the clause expression within the 'target' region.
The offload runtime call for 'target parallel' is changed to
arpith-jacob created this revision.
Herald added a subscriber: jholewinski.
This patch adds support for the Spmd construct 'target parallel' on the
NVPTX device. This involves ignoring the num_threads clause on the device
since the number of threads in this combined construct is already set on
th
arpith-jacob created this revision.
The DSAChecker code in SemaOpenMP looks at the captured statement
associated with an OpenMP directive. A combined directive such as
'target parallel' has nested capture statements, which have to be
fully traversed before executing the DSAChecker. This is a pat
arpith-jacob added inline comments.
Comment at: lib/Sema/SemaOpenMP.cpp:1933-1937
+ StmtResult SR = S;
+ int ThisCaptureLevel =
+ getOpenMPCaptureLevels(DSAStack->getCurrentDirective());
+ while (--ThisCaptureLevel >= 0)
+SR = ActOnCapturedRegionEnd(SR.get());
---
arpith-jacob updated this revision to Diff 84816.
arpith-jacob added a comment.
Inherit from OMPLexicalScope with an added argument to reduce code duplication.
https://reviews.llvm.org/D28781
Files:
include/clang/AST/OpenMPClause.h
include/clang/AST/RecursiveASTVisitor.h
lib/AST/OpenMPCl
arpith-jacob added a comment.
Another correction. We'll have to create a similar scope OMPTeamsScope that
inherits from OMPLexicalScope for target-teams combined directives.
https://reviews.llvm.org/D28781
arpith-jacob added inline comments.
Comment at: lib/CodeGen/CGStmtOpenMP.cpp:84-115
+/// Lexical scope for OpenMP parallel construct, that handles correct codegen
+/// for captured expressions.
+class OMPParallelScope final : public CodeGenFunction::LexicalScope {
+ void emitPre
arpith-jacob added inline comments.
Comment at: lib/CodeGen/CGStmtOpenMP.cpp:84-115
+/// Lexical scope for OpenMP parallel construct, that handles correct codegen
+/// for captured expressions.
+class OMPParallelScope final : public CodeGenFunction::LexicalScope {
+ void emitPre
arpith-jacob updated this revision to Diff 84689.
arpith-jacob added a comment.
The patch was updated to split 'emitParallelOrTeamsOutlinedFunction' into
'emitParallelOutlinedFunction' and 'emitTeamsOutlinedFunction' to enable the
use of getCapturedStmt().
Also updated an assert statement for c
arpith-jacob added inline comments.
Comment at: lib/CodeGen/CGOpenMPRuntime.h:543
virtual llvm::Value *emitParallelOrTeamsOutlinedFunction(
- const OMPExecutableDirective &D, const VarDecl *ThreadIDVar,
- OpenMPDirectiveKind InnermostKind, const RegionCodeGenTy &Code
arpith-jacob updated this revision to Diff 84629.
arpith-jacob added a comment.
Updated 'getOpenMPCaptureRegions' to return the OMPD_teams region kind for the
teams directive.
https://reviews.llvm.org/D28753
Files:
include/clang/AST/StmtOpenMP.h
include/clang/Basic/OpenMPKinds.h
include/
arpith-jacob updated this revision to Diff 84627.
arpith-jacob added a comment.
Added a method 'getCapturedStmt' as part of OMPExecutableDirective.
https://reviews.llvm.org/D28753
Files:
include/clang/AST/StmtOpenMP.h
include/clang/Basic/OpenMPKinds.h
include/clang/Sema/Sema.h
lib/Basic
arpith-jacob created this revision.
The if-clause on the combined directive potentially applies to both the
'target' and the 'parallel' regions. Codegen'ing the if-clause on the
combined directive requires additional support because the expression in
the clause must be captured by the 'target' cap
arpith-jacob added a comment.
Thanks Alexey.
> Is this an NFC patch? If so add 'NFC' to this patch.
Do you mean NVPTX? No, this is a patch to support target directives for any
accelerator.
https://reviews.llvm.org/D28752
arpith-jacob created this revision.
arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, caomhin,
kkwli0, gtbercea.
arpith-jacob added a subscriber: cfe-commits.
Herald added a subscriber: jholewinski.
This patch adds codegen for the 'target parallel' directive on the NVPTX
device. We
arpith-jacob created this revision.
arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, caomhin,
kkwli0, gtbercea.
arpith-jacob added a subscriber: cfe-commits.
Herald added a subscriber: jholewinski.
This patch adds support for codegen of 'target parallel' on the host.
It is also the
arpith-jacob created this revision.
arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, kkwli0,
caomhin, gtbercea.
arpith-jacob added a subscriber: cfe-commits.
This patch refactors code that calls codegen for target regions. Currently
the codebase only supports the 'target' directive
arpith-jacob updated this revision to Diff 83641.
arpith-jacob added a comment.
Use the i1 type for bool after all. But this time use the API ConvertType().
https://reviews.llvm.org/D28145
Files:
lib/CodeGen/CGOpenMPRuntime.cpp
lib/CodeGen/CGOpenMPRuntime.h
lib/CodeGen/CGOpenMPRuntimeNVPTX.c
arpith-jacob updated this revision to Diff 83631.
arpith-jacob marked an inline comment as done.
arpith-jacob added a comment.
Using CGF.ConvertTypeForMem(Context.getBoolType()) to get the right type for
'bool' rather than using i1.
https://reviews.llvm.org/D28145
Files:
lib/CodeGen/CGOpenMP
arpith-jacob marked 2 inline comments as done.
arpith-jacob added inline comments.
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:365
+llvm::FunctionType *FnTy =
+llvm::FunctionType::get(llvm::Type::getInt1Ty(CGM.getLLVMContext()),
+T
arpith-jacob updated this revision to Diff 83609.
arpith-jacob added a comment.
Moved CommonActionTy to CGOpenMPRuntimeNVPTX.cpp and renamed it to
NVPTXActionTy, allowing us to customize the class in the future, if necessary.
https://reviews.llvm.org/D28145
Files:
lib/CodeGen/CGOpenMPRuntime
arpith-jacob updated this revision to Diff 82922.
arpith-jacob added a comment.
Updated patch based on reviews.
The serialized parallel region executed by the master was modified to call the
'end' runtime call with a PrePostActionTy so that it is called upon exit of any
cleanup scope.
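The idea of tying the 'end' call to scope exit can be illustrated with a run-time analogy in plain C++. This is not Clang's PrePostActionTy, which operates while emitting IR; the class `ScopeExitAction`, the function `runSerializedRegion`, and the logged strings are hypothetical names used only to show that the exit action fires on every path out of the scope.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical guard: runs the stored action when the enclosing scope ends,
// whether by falling off the end or by an early return.
class ScopeExitAction {
public:
  explicit ScopeExitAction(std::function<void()> onExit)
      : onExit_(std::move(onExit)) {}
  ~ScopeExitAction() {
    if (onExit_)
      onExit_();
  }
  ScopeExitAction(const ScopeExitAction &) = delete;
  ScopeExitAction &operator=(const ScopeExitAction &) = delete;

private:
  std::function<void()> onExit_;
};

// Hypothetical region: records a begin marker, the body, and the end marker.
// The end marker is appended by the guard, so it also appears on early exit.
void runSerializedRegion(bool earlyExit, std::vector<std::string> &log) {
  log.push_back("begin");
  ScopeExitAction end([&log] { log.push_back("end"); });
  log.push_back("body");
  if (earlyExit)
    return; // the guard still appends "end"
  log.push_back("tail");
}
```

The log is passed by reference so that the guard's destructor, which runs after the return statement is evaluated, still writes into the caller's vector.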
Added re
arpith-jacob added inline comments.
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:511-516
+// Activate workers.
+syncCTAThreads(CGF);
+
+// Barrier at end of parallel region.
+syncCTAThreads(CGF);
+
tra wrote:
> Are two back-to-back syncCTAThre
arpith-jacob added inline comments.
Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:539-542
+llvm::Value *EndArgs[] = {emitUpdateLocation(CGF, Loc), ThreadID};
+CGF.EmitRuntimeCall(
+createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_end_serialized_parallel),
+
arpith-jacob updated this revision to Diff 82750.
arpith-jacob added a comment.
Alexey, thank you for your review. I've updated the patch addressing your
comments.
- I experimented with various ways of changing the name of the outlined
function. In the end I decided against moving the two clas
arpith-jacob added inline comments.
Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:114
/// \brief Get the name of the capture helper.
- StringRef getHelperName() const override { return ".omp_outlined."; }
+ StringRef getHelperName() const override { return "__omp_outlined__";
arpith-jacob added inline comments.
Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:114
/// \brief Get the name of the capture helper.
- StringRef getHelperName() const override { return ".omp_outlined."; }
+ StringRef getHelperName() const override { return "__omp_outlined__";
arpith-jacob created this revision.
arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, kkwli0, caomhin.
arpith-jacob added a subscriber: cfe-commits.
Herald added a subscriber: jholewinski.
This patch introduces support for the execution of parallel constructs in a
target region on t
arpith-jacob updated this revision to Diff 82621.
arpith-jacob added a comment.
Alexey and Justin, thank you for spending the time to review this patch. I've
updated the patch accordingly. I've also removed a dot ('.') from the worker
function name since the character is not accepted by the nv
arpith-jacob updated this revision to Diff 82619.
arpith-jacob added a comment.
Addressed comments in review to start function names with a lowercase letter
and to fix the enum type name along with the enumerator name.
https://reviews.llvm.org/D28124
Files:
lib/CodeGen/CGOpenMPRuntimeNVPTX.c
arpith-jacob created this revision.
arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, kkwli0, caomhin.
arpith-jacob added a subscriber: cfe-commits.
Herald added subscribers: aprantl, jholewinski.
This patch includes updates for codegen of the target region for the NVPTX
device. It
arpith-jacob created this revision.
arpith-jacob added reviewers: ABataev, sfantao, carlo.bertolli, kkwli0, caomhin.
arpith-jacob added a subscriber: cfe-commits.
Herald added a subscriber: jholewinski.
This patch cleans up private methods for NVPTX OpenMP codegen. It converts
private members to
51 matches