https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99932
--- Comment #10 from Tom de Vries <vries at gcc dot gnu.org> ---
[ FTR, T400, driver 470.94 ]

Interestingly, changing the default ptx version to 6.3 makes the minimal
test-case pass, as well as the full parallel-dims.c.

The only code changes are shfl -> shfl.sync and vote -> vote.sync. Both of
these require sm_30, so from that perspective we won't leave any
architectures behind.

OTOH, this may leave behind:
- some older drivers
- some older CUDAs (if ptxas is used for ptx verification in the
  nvptx-none-as).

Rerunning the entire testsuite, though, shows that the non-32-vector-length
test-cases are still failing.
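For illustration only, here is a rough host-side C model of what the .sync
forms add over the plain ones: an explicit membermask operand naming the lanes
that take part. This is a simplified sketch, not the code the nvptx backend
emits. The model_* names are made up for this example, the full mask
0xffffffff is an assumption, and leaving non-participating lanes untouched is
a modelling choice.

```c
#include <assert.h>
#include <stdint.h>

#define WARP_SIZE 32

/* Simplified model of `shfl.sync.down.b32 d, a, b, 31, membermask`:
   each participating lane reads the value held by lane (lane + delta),
   or keeps its own value when that source lane is out of range.
   Lanes outside membermask are simply copied through (modelling choice). */
static void model_shfl_sync_down(uint32_t membermask,
                                 const int32_t in[WARP_SIZE],
                                 unsigned delta,
                                 int32_t out[WARP_SIZE])
{
    assert(delta < WARP_SIZE);
    for (unsigned lane = 0; lane < WARP_SIZE; lane++) {
        unsigned src = lane + delta;
        int participating = (int)((membermask >> lane) & 1u);
        out[lane] = (participating && src < WARP_SIZE) ? in[src] : in[lane];
    }
}

/* Simplified model of `vote.sync.ballot.b32 d, p, membermask`:
   bit i of the result is set when lane i is in membermask and its
   predicate is true. */
static uint32_t model_vote_sync_ballot(uint32_t membermask,
                                       const int pred[WARP_SIZE])
{
    uint32_t result = 0;
    for (unsigned lane = 0; lane < WARP_SIZE; lane++)
        if (((membermask >> lane) & 1u) && pred[lane])
            result |= (uint32_t)1 << lane;
    return result;
}

/* Warp-wide sum built from the shuffle model: halve the distance each
   step and add the shuffled-in value. Only lane 0 ends up holding the
   full sum; the other lanes hold partial or meaningless values. */
static int32_t model_warp_sum(const int32_t values[WARP_SIZE])
{
    int32_t cur[WARP_SIZE], shuffled[WARP_SIZE];
    for (unsigned lane = 0; lane < WARP_SIZE; lane++)
        cur[lane] = values[lane];
    for (unsigned delta = WARP_SIZE / 2; delta > 0; delta /= 2) {
        model_shfl_sync_down(0xffffffffu, cur, delta, shuffled);
        for (unsigned lane = 0; lane < WARP_SIZE; lane++)
            cur[lane] += shuffled[lane];
    }
    return cur[0];
}
```

With every lane in the mask this behaves like the pre-.sync forms. The
difference only shows up once a narrower mask is passed, e.g.
model_vote_sync_ballot (0xf, pred) reports on lanes 0-3 only.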