jdoerfert added a comment.

In D101976#2742788 <https://reviews.llvm.org/D101976#2742788>, @JonChesterfield wrote:

> In D101976#2742188 <https://reviews.llvm.org/D101976#2742188>, @jdoerfert
> wrote:
>
>> In D101976#2742166 <https://reviews.llvm.org/D101976#2742166>,
>> @JonChesterfield wrote:
>>
>>> What are the required semantics of the barrier operations? Amdgcn builds
>>> them on shared memory, so probably needs a change to the corresponding
>>> target_impl to match
>>
>> I have *not* tested AMDGCN but I was not expecting a problem. The semantics
>> I need here is:
>> warp N, thread 0 hits a barrier instruction I0
>> warp N, threads 1-31 hit a barrier instruction I1
>> the entire warp synchronizes and moves on.
>
> One hazard is the amdgpu devicertl only has one barrier. D102016
> <https://reviews.llvm.org/D102016> makes it simpler to add a second. I'd
> guess we want named_sync to call one barrier and syncthreads to call a
> different one, so we should probably rename those functions. The LDS barrier
> implementation needs to know how many threads to wait for, we may be OK
> passing 'all the threads' down from the __syncthreads entry point.
>
> The other is the single instruction pointer per wavefront, like pre-volta
> nvidia cards (which I believe we also expect to work). I'm not sure whether
> totally independent barriers will work, or whether we'll need to arrange for
> thread 0 and thread 1-31 to call the two different barriers at the same point
> in control flow.

So what do you want me to change for this patch now?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101976/new/

https://reviews.llvm.org/D101976

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits