https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97203

--- Comment #10 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Alexander Monakov from comment #8)
> No, -msoft-stack-reserve-local is really meant to be in bytes: it may not
> exceed the amount of .local memory reserved by CUDA driver (which is just
> 1-2 KB, unless overridden via cuCtxSetLimit, which nvptx-run.c does, but
> plugin-nvptx.c does not).
> 
> Keep in mind that .local memory reservation is multiplied by number of
> active contexts, which could be in range 20000-30000 when the code was
> written: 128KB local memory per active thread would imply a 2.5GB allocation
> on the GPU.

By "number of active contexts", do you mean sm_count * thread_max as used in
nvptx-run.c (which, FWIW, is 10240 on my card)?
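
For scale, here is a minimal sketch of the arithmetic behind the quoted
estimate. It assumes the total .local reservation is simply the per-thread
size multiplied by the number of resident threads, and that this number is
sm_count * thread_max. The function names are hypothetical and are not taken
from nvptx-run.c or plugin-nvptx.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of threads that can be resident at once,
   assumed here to be sm_count * thread_max.  */
uint64_t
resident_threads (uint64_t sm_count, uint64_t thread_max)
{
  return sm_count * thread_max;
}

/* Hypothetical helper: total .local memory in bytes, assuming the
   per-thread reservation is multiplied by the number of resident
   threads.  */
uint64_t
local_reservation_bytes (uint64_t bytes_per_thread, uint64_t threads)
{
  /* Guard against 64-bit overflow of the product.  */
  assert (threads == 0 || bytes_per_thread <= UINT64_MAX / threads);
  return bytes_per_thread * threads;
}
```

Under these assumptions, 128 KB per thread with 20000 threads gives
2621440000 bytes (about 2.4 GiB), in line with the quoted 2.5GB estimate.
Reading the figure for my card as 10240 threads, the same per-thread size
gives 1342177280 bytes (1.25 GiB).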
