https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97203
--- Comment #3 from Tom de Vries <vries at gcc dot gnu.org> ---
[ Note, this is with GOMP_NVPTX_JIT=-O0. ]

In sinf, we have:
...
45:	return -__kernel_cosf(y[0],y[1]);
...
which translates to:
...
        .loc 1 45 12
        ld.f32 %r67,[%frame+4];
        ld.f32 %r65,[%frame];
        {
                .param .f32 %value_in;
                .param .f32 %out_arg1;
                st.param.f32 [%out_arg1],%r65;
                .param .f32 %out_arg2;
                st.param.f32 [%out_arg2],%r67;
                call (%value_in),__kernel_cosf,(%out_arg1,%out_arg2);
                ld.param.f32 %r68,[%value_in];
        }
        .loc 1 45 11
        neg.f32 %r37,%r68;
...

If I place (using GOMP_NVPTX_PTXRW) a trap before the first load:
...
        .loc 1 45 12
+trap
        ld.f32 %r67,[%frame+4];
...
I get:
...
libgomp: cuCtxSynchronize error: an illegal instruction was encountered
...

If I place it after the first load, I get:
...
libgomp: cuCtxSynchronize error: an illegal memory access was encountered
...
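
For reference, the source shape behind those two loads can be sketched in C as below. This is a minimal sketch, not the newlib code: reduce_stub and kernel_cos_stub are hypothetical stand-ins for the argument reduction and for __kernel_cosf, and sinf_tail_sketch is a made-up name. The point is only that y is an address-taken local array, so it lives in the stack frame; going by the PTX above, y[0] is read from [%frame] and y[1] from [%frame+4]. The trap experiment suggests the fault happens on the first load, i.e. the read of y[1].

```c
/* Hypothetical stand-in for __kernel_cosf (the real one lives in newlib).
   It just adds its arguments so the sketch is easy to check. */
static float kernel_cos_stub(float hi, float lo)
{
    return hi + lo;
}

/* Hypothetical stand-in for the argument reduction step.
   It writes its result through the pointer, which forces the
   caller's array to have an address, i.e. to live in the frame. */
static void reduce_stub(float x, float y[2])
{
    y[0] = x;
    y[1] = 0.0f;
}

/* Sketch of the tail of sinf that matches line 45 in the comment. */
float sinf_tail_sketch(float x)
{
    float y[2];                 /* frame-allocated: y[0] at frame+0, y[1] at frame+4 */

    reduce_stub(x, y);

    /* Both elements are loaded from the frame, passed as the two
       call arguments, and the returned value is negated. */
    return -kernel_cos_stub(y[0], y[1]);
}
```

On the host this simply returns -(y[0] + y[1]); it does not reproduce the fault, it only shows which source objects the frame-relative loads correspond to.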