jdoerfert added a comment.

> For example, the SimplifyCFG pass removes the branch leading to a BB that
> has an incoming value that will always trigger undefined behavior. This
> modifies the CFG and combines the basic blocks. This works for CPU execution.
> But on a GPU there are intrinsics like "__shfl_sync(unsigned mask, T var,
> int srcLane, int width=warpSize)", where the exchange of the variable occurs
> simultaneously for all active threads within the warp. So in a CUDA/HIP
> kernel, the variable var passed to __shfl_sync may not be initialized, and
> LLVM IR treats it as undef. Currently all arguments are tagged with the
> noundef attribute, so the above-mentioned SimplifyCFG optimization gets
> applied and the kernel execution becomes ambiguous. The proposed change is
> to skip adding the noundef attribute to arguments when a function has been
> tagged with the convergent attribute.

Can we please have an example for this? I don't know what would be broken
with noundef + convergent, and I somewhat doubt noundef is the problem.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124158/new/

https://reviews.llvm.org/D124158

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits