https://gcc.gnu.org/g:c80ecfa0927a1ada31864c709220a2adb7c96662
commit r15-5961-gc80ecfa0927a1ada31864c709220a2adb7c96662
Author: Thomas Schwinge <tschwi...@baylibre.com>
Date:   Tue Nov 12 09:54:35 2024 +0100

    Clarify libgomp nvptx 'omp_low_lat_mem_space' documentation

    PTX '%dynamic_smem_size' was "Introduced in PTX ISA version 4.1", and
    "Requires 'sm_20' or higher". Given that GCC/nvptx generally supports
    'sm_20', only the PTX ISA version matters here, and that's all fine if
    just using GCC's defaults.

    Follow-up to commit e9a19ead498fcc89186b724c6e76854f7751a89b
    "openmp, nvptx: low-lat memory access traits".

            libgomp/
            * libgomp.texi: Clarify nvptx 'omp_low_lat_mem_space'
            documentation.

Diff:
---
 libgomp/libgomp.texi | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 453c35679077..6b8000c696f6 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -6972,8 +6972,10 @@ The implementation remark:
       memory-copy functions of the CUDA library. Higher dimensions will
       call those functions in a loop and are therefore supported.
 @item Low-latency memory (@code{omp_low_lat_mem_space}) is supported when the
-      the @code{access} trait is set to @code{cgroup}, the ISA is at least
-      @code{sm_53}, and the PTX version is at least 4.1. The default pool size
+      the @code{access} trait is set to @code{cgroup}, and libgomp has
+      been built for PTX ISA version 4.1 or higher (such as in GCC's
+      default configuration). @c -mptx=4.1
+      The default pool size
       is 8 kiB per team, but may be adjusted at runtime by setting environment
       variable @code{GOMP_NVPTX_LOWLAT_POOL=@var{bytes}}. The maximum value is
       limited by the available hardware, and care should be taken that the