[PATCH] D101976: [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL

Jon Chesterfield via Phabricator via cfe-commits Wed, 12 May 2021 12:16:39 -0700

JonChesterfield accepted this revision.
JonChesterfield added a comment.
This revision is now accepted and ready to land.


I'm not certain what this 'aligned' limitation for nvptx syncthreads is, but I 
can't think of a corresponding one for amdgcn. So we may not need the LDS 
barrier construction on amdgcn, and it'll be much faster if we don't.

This was reported working on amdgpu by a third party against an earlier trunk 
build, but sadly the current trunk seems to have regressed (debugging offline). 
So I have no reason to believe this doesn't work, and some reason to believe it 
will do. Objection withdrawn.

The code itself always looked fine; I was only nervous about the changes to 
concurrency primitives in nvptx.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101976/new/

https://reviews.llvm.org/D101976

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
