efriedma added a comment. The issue you're describing sounds like it's specific to @__shfl_sync. In general, in C++, you aren't allowed to read from an uninitialized variable; see [basic.indet] in the standard. But if your testcase doesn't have undefined behavior, CUDA language rules must somehow allow this particular builtin function to take undef variables as input. (Is this documented somewhere?)
That isn't related to the "convergent" attribute; the transform you're describing doesn't break convergence rules. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D124158/new/ https://reviews.llvm.org/D124158 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits