JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land.
I'm not certain what this 'aligned' limitation for nvptx syncthreads is, but I can't think of a corresponding one for amdgcn. So we may not need the LDS barrier construction, and it will be much faster if we don't.

This was reported working on amdgpu by a third party against an earlier trunk build, but sadly the current trunk seems to have regressed (debugging offline). So I have no reason to believe this doesn't work, and some reason to believe it will.

Objection withdrawn. The code itself always looked fine; I was only nervous about the changes to concurrency primitives in nvptx.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101976/new/

https://reviews.llvm.org/D101976