efriedma added a comment.

The issue you're describing sounds like it's specific to @__shfl_sync.  In 
general, in C++, you aren't allowed to read from an uninitialized variable; see 
[basic.indet] in the standard.  But if your testcase doesn't have undefined 
behavior, the CUDA language rules must somehow allow this particular builtin 
function to take uninitialized (undef) variables as input.  (Is this 
documented somewhere?)
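
To make the distinction concrete, here is a minimal host-side sketch in plain 
C++.  It is not the testcase from this review and it is not CUDA code: the 
names shuffle_model, all_operands_initialized, LaneValues and kLanes are 
hypothetical, a "warp" is shrunk to 4 lanes, and std::nullopt stands in for a 
lane whose variable was never initialized.  It only models the idea that every 
lane passes an operand while only the source lane's operand ends up being 
observed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical model: 4 "lanes" instead of a real warp.
constexpr std::size_t kLanes = 4;

// std::nullopt models a lane whose variable was never initialized.
using LaneValues = std::array<std::optional<int>, kLanes>;

// Models a broadcast-style shuffle: every lane supplies an operand,
// but each lane receives only the operand held by srcLane.
LaneValues shuffle_model(const LaneValues& operands, std::size_t srcLane) {
    LaneValues result;
    result.fill(operands[srcLane % kLanes]);
    return result;
}

// Under ordinary C++ rules, passing a variable by value reads it, so the
// call is only well-defined if every lane initialized its operand first.
bool all_operands_initialized(const LaneValues& operands) {
    for (const auto& v : operands) {
        if (!v.has_value()) return false;
    }
    return true;
}
```

In this model, a call where only the source lane initialized its operand still 
produces a well-defined result for every lane, yet all_operands_initialized 
returns false.  That gap is the open question: the result does not depend on 
the other lanes' operands, but plain C++ would still treat passing them as a 
read of an uninitialized variable.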

Either way, that isn't related to the "convergent" attribute; the transform 
you're describing doesn't break convergence rules.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124158/new/

https://reviews.llvm.org/D124158

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
