https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88981

--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
One thing worth noting here: with #pragma acc wait added, the program (compiled
with -O0) takes ~10 seconds to finish on my quadro 1200m.

Without the #pragma acc wait, it still takes ~10 seconds.

When inspecting with a debugger where it's waiting (since there's no wait
responsible for this), we're hanging in either cuMemFree or cuCtxDestroy.  I
can't find documentation of this hanging behaviour, so it may be specific to
the driver version, card, or architecture.
