https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97203
--- Comment #7 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Alexander Monakov from comment #6)
> (In reply to Tom de Vries from comment #4)
> > So, I think calling functions from simd code is atm not supported for nvptx.
> >
> > Stack variables in simd code are mapped on a per-thread stack rather than on
> > the usual per-warp stack.
> >
> > The functions are compiled with the usual per-warp stack, so calling those
> > functions from simd might mean the different lanes are gonna disagree about
> > what the value in a stack variable should be.
>
> This is inaccurate. In -msoft-stack mode there's no baked-in assumption that
> stacks are always per-warp. The "soft stack" pointer can point either to
> global memory (outside of SIMD regions), or to local memory (inside SIMD
> regions). The pointer is switched between per-warp global memory and
> per-lane local memory by nvptx.c:nvptx_output_softstack_switch.
>
> The main requirement is that functions callable from OpenMP offloaded code
> are compiled for -mgomp multilib variant. The design allows calling
> functions even from inside SIMD regions, and it should be supported.

I see, that's helpful, thanks.

I guess I was thrown off by seeing a %simtstack_ar of 136 bytes:
...
        .local .align 8 .b8 %simtstack_ar[136];
...
which seems more like an amount claimed by a single function.

Is it possible you meant the default of -msoft-stack-reserve-local=128 to mean
128kb (similar to what is claimed in nvptx_stacks_size in the plugin)?

Because currently it means 128 bytes.
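
For reference, a minimal sketch of the kind of code under discussion: a function with its own stack variables called from inside a simd region of offloaded code. This is a hypothetical illustration, not the testcase from this PR; the names sum_of_squares_from and fill_from_simd are made up, and whether it triggers any problem on nvptx is not claimed here.

```c
/* Hypothetical illustration, not the testcase from this PR.  */

#pragma omp declare target
/* Helper that keeps a small array in its own stack frame.  When called
   from a simd region, each lane needs its own copy of tmp.  */
int sum_of_squares_from(int x)
{
  int tmp[4];
  for (int k = 0; k < 4; k++)
    tmp[k] = (x + k) * (x + k);
  return tmp[0] + tmp[1] + tmp[2] + tmp[3];
}
#pragma omp end declare target

/* Calls the helper from inside a simd loop in an offloaded region.  */
void fill_from_simd(int *out, int n)
{
#pragma omp target parallel for simd map(from: out[0:n])
  for (int i = 0; i < n; i++)
    out[i] = sum_of_squares_from(i);
}
```

Built without -fopenmp the pragmas are ignored and the loop simply runs on the host, which gives a reference result to compare an offloaded run against.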