arsenm added a comment.

In D124158#3478319 <https://reviews.llvm.org/D124158#3478319>, @jdoerfert wrote:

>> The issue you're describing sounds like it's specific to @__shfl_sync.  In 
>> general, in C++, you aren't allowed to read from an uninitialized variable; 
>> see [basic.indet] in the standard.  But if your testcase doesn't have 
>> undefined behavior, CUDA language rules must somehow allow this particular 
>> builtin function to take undef variables as input.  (Is this documented 
>> somewhere?)
>>
>> That isn't related to the "convergent" attribute; the transform you're 
>> describing doesn't break convergence rules.
>
> I concur, especially on the last part. So far I have not seen why this is 
> tied in any way to convergent. It might be a shfl oddity in which case the 
> proper solution is to `freeze` all shuffle arguments in clang.
> EDIT: https://godbolt.org/z/dnv63bzjn

I'm thinking noundef is a bit of a red herring here. The real problem seems to
arise from the inserted assume call, which now introduces the assumption that
the lane ID must be 0.
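
To make the concern concrete, here is a host-side C++ model (not CUDA, and not
the testcase from the review) of the kind of pattern this would affect. It
assumes a value that only lane 0 initializes and that every lane then reads
back through a broadcast from lane 0. The names `WarpValues`, `init_on_lane0`,
`broadcast`, and `count_initialized` are made up for illustration, and
`std::nullopt` stands in for an uninitialized value.

```cpp
#include <array>
#include <optional>

// Host-side model of a 32-lane warp. This is an illustration, not real CUDA.
constexpr int kWarpSize = 32;

// Each lane's private copy of `v`; std::nullopt models "uninitialized".
using WarpValues = std::array<std::optional<int>, kWarpSize>;

// Models `int v; if (lane == 0) v = x;` executed by every lane of the warp.
WarpValues init_on_lane0(int x) {
    WarpValues v{};  // every lane starts out uninitialized
    v[0] = x;        // only lane 0 takes the branch that assigns
    return v;
}

// Models a broadcast from src_lane: every lane receives src_lane's value.
// The other lanes' own values are passed in but never observed.
std::array<int, kWarpSize> broadcast(const WarpValues& v, int src_lane) {
    std::array<int, kWarpSize> out{};
    out.fill(v[src_lane].value());  // throws if src_lane is uninitialized
    return out;
}

// Counts the lanes for which "the shuffle argument is initialized" holds.
int count_initialized(const WarpValues& v) {
    int n = 0;
    for (const auto& lane : v) {
        if (lane.has_value()) ++n;
    }
    return n;
}
```

In this model only lane 0 ever holds an initialized value, so requiring the
shuffle argument to be initialized on every lane amounts to requiring that the
lane ID is 0, even though the broadcast never observes the other lanes' values.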


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124158/new/

https://reviews.llvm.org/D124158
