arsenm added a comment.

In D124158#3478319 <https://reviews.llvm.org/D124158#3478319>, @jdoerfert wrote:

>> The issue you're describing sounds like it's specific to @__shfl_sync. In
>> general, in C++, you aren't allowed to read from an uninitialized variable;
>> see [basic.indet] in the standard. But if your testcase doesn't have
>> undefined behavior, CUDA language rules must somehow allow this particular
>> builtin function to take undef variables as input. (Is this documented
>> somewhere?)
>>
>> That isn't related to the "convergent" attribute; the transform you're
>> describing doesn't break convergence rules.
>
> I concur, especially on the last part. So far I have not seen why this is
> tied in any way to convergent. It might be a shfl oddity in which case the
> proper solution is to `freeze` all shuffle arguments in clang.
> EDIT: https://godbolt.org/z/dnv63bzjn

I'm thinking noundef is a bit of a red herring here. The real problem seems to
arise from the inserted assume call, which introduces the assumption that the
lane ID must be 0.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124158/new/

https://reviews.llvm.org/D124158

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
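
For readers without the godbolt link at hand, here is a minimal sketch of the kind of pattern under discussion. It is a hypothetical reconstruction, not the testcase from the review: the kernel name `broadcast_from_lane0`, the buffers, and the launch shape are all assumptions made for illustration. Only lane 0 initializes `value`, yet every lane passes it to `__shfl_sync`. Whether the source language permits this is exactly the open question in the quoted text.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not from the review): only lane 0 initializes `value`,
// but every lane passes it to __shfl_sync in order to read lane 0's copy.
__global__ void broadcast_from_lane0(const int *in, int *out) {
  int lane = threadIdx.x % warpSize;
  int value;  // intentionally left uninitialized on lanes other than 0
  if (lane == 0)
    value = in[0];
  // All lanes named in the mask must reach this call. If the argument is
  // treated as noundef, the lane != 0 path can look like immediate undefined
  // behavior, and the optimizer may then reason (for example through an
  // inserted llvm.assume) that lane == 0 holds here.
  out[threadIdx.x] = __shfl_sync(0xffffffffu, value, /*srcLane=*/0);
}

int main() {
  int *in = nullptr;
  int *out = nullptr;
  cudaMallocManaged(&in, sizeof(int));
  cudaMallocManaged(&out, 32 * sizeof(int));
  in[0] = 42;

  // One warp of 32 threads, so the full mask above matches the active lanes.
  broadcast_from_lane0<<<1, 32>>>(in, out);
  cudaDeviceSynchronize();

  // The result depends on how the compiler treats the uninitialized argument,
  // so no expected value is claimed here.
  printf("lane 31 received %d\n", out[31]);

  cudaFree(in);
  cudaFree(out);
  return 0;
}
```

Initializing `value` on every lane (for instance `int value = 0;`) sidesteps the question entirely at the source level. The `freeze` suggestion quoted above would instead address it in clang's code generation.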