jdoerfert added a comment.

In D101976#2742788 <https://reviews.llvm.org/D101976#2742788>, @JonChesterfield 
wrote:

> In D101976#2742188 <https://reviews.llvm.org/D101976#2742188>, @jdoerfert 
> wrote:
>
>> In D101976#2742166 <https://reviews.llvm.org/D101976#2742166>, 
>> @JonChesterfield wrote:
>>
>>> What are the required semantics of the barrier operations? Amdgcn builds 
>>> them on shared memory, so probably needs a change to the corresponding 
>>> target_impl to match
>>
>> I have *not* tested AMDGCN but I was not expecting a problem. The semantics 
>> I need here is: 
>>  warp N, thread     0 hits a barrier instruction I0
>>  warp N, threads 1-31 hit  a barrier instruction I1
>>  the entire warp synchronizes and moves on.
>
> One hazard is the amdgpu devicertl only has one barrier. D102016 
> <https://reviews.llvm.org/D102016> makes it simpler to add a second. I'd 
> guess we want named_sync to call one barrier and syncthreads to call a 
> different one, so we should probably rename those functions. The LDS barrier 
> implementation needs to know how many threads to wait for, we may be OK 
> passing 'all the threads' down from the __syncthreads entry point.
>
> The other is the single instruction pointer per wavefront, like pre-volta 
> nvidia cards (which I believe we also expect to work). I'm not sure whether 
> totally independent barriers will work, or whether we'll need to arrange for 
> thread 0 and thread 1-31 to call the two different barriers at the same point 
> in control flow.

So what do you want me to change for this patch now?
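
For reference, here is a rough host-side sketch of the semantics quoted above: thread 0 reaches the barrier from one call site (I0), threads 1-31 reach it from another (I1), and the whole warp moves on together. This is only an illustration, not DeviceRTL code. `WarpBarrier` and `runWarp` are hypothetical names, and `std::thread` stands in for GPU lanes. Each host thread has its own program counter, so the sketch does not model the single-instruction-pointer concern for wavefronts or pre-Volta warps.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical host-side stand-in for a warp-wide barrier (assumption: not
// the DeviceRTL implementation). All callers block until `Expected` threads
// have arrived, regardless of which call site they arrived from.
class WarpBarrier {
public:
  explicit WarpBarrier(int NumThreads) : Expected(NumThreads) {}

  void arriveAndWait() {
    std::unique_lock<std::mutex> Lock(M);
    const unsigned MyGeneration = Generation;
    if (++Arrived == Expected) {
      // Last thread to arrive releases everyone and resets the barrier.
      Arrived = 0;
      ++Generation;
      CV.notify_all();
      return;
    }
    CV.wait(Lock, [&] { return Generation != MyGeneration; });
  }

private:
  std::mutex M;
  std::condition_variable CV;
  const int Expected;
  int Arrived = 0;
  unsigned Generation = 0;
};

// Runs one simulated warp and returns how many threads got past the barrier.
int runWarp(int WarpSize) {
  WarpBarrier Barrier(WarpSize);
  std::mutex CountMutex;
  int Passed = 0;
  std::vector<std::thread> Threads;
  for (int Tid = 0; Tid < WarpSize; ++Tid) {
    Threads.emplace_back([&, Tid] {
      if (Tid == 0)
        Barrier.arriveAndWait(); // call site "I0": thread 0 only
      else
        Barrier.arriveAndWait(); // call site "I1": threads 1..WarpSize-1
      std::lock_guard<std::mutex> Lock(CountMutex);
      ++Passed;
    });
  }
  for (std::thread &T : Threads)
    T.join();
  return Passed;
}
```

Calling `runWarp(32)` only returns once all 32 simulated threads have passed the barrier. If the two call sites did not synchronize with each other, the join would hang.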


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101976/new/

https://reviews.llvm.org/D101976

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
