https://gcc.gnu.org/g:c80ecfa0927a1ada31864c709220a2adb7c96662

commit r15-5961-gc80ecfa0927a1ada31864c709220a2adb7c96662
Author: Thomas Schwinge <tschwi...@baylibre.com>
Date:   Tue Nov 12 09:54:35 2024 +0100

    Clarify libgomp nvptx 'omp_low_lat_mem_space' documentation
    
    PTX '%dynamic_smem_size' was "Introduced in PTX ISA version 4.1", and
    "Requires 'sm_20' or higher".  Given that GCC/nvptx generally supports
    'sm_20', only the PTX ISA version matters here, and that's all fine if
    just using GCC's defaults.  Follow-up to
    commit e9a19ead498fcc89186b724c6e76854f7751a89b
    "openmp, nvptx: low-lat memory access traits".
    
            libgomp/
            * libgomp.texi: Clarify nvptx 'omp_low_lat_mem_space'
            documentation.

Diff:
---
 libgomp/libgomp.texi | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 453c35679077..6b8000c696f6 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -6972,8 +6972,10 @@ The implementation remark:
       memory-copy functions of the CUDA library.  Higher dimensions will
       call those functions in a loop and are therefore supported.
 @item Low-latency memory (@code{omp_low_lat_mem_space}) is supported when the
-      the @code{access} trait is set to @code{cgroup}, the ISA is at least
-      @code{sm_53}, and the PTX version is at least 4.1.  The default pool size
+      the @code{access} trait is set to @code{cgroup}, and libgomp has
+      been built for PTX ISA version 4.1 or higher (such as in GCC's
+      default configuration).  @c -mptx=4.1
+      The default pool size
       is 8 kiB per team, but may be adjusted at runtime by setting environment
       variable @code{GOMP_NVPTX_LOWLAT_POOL=@var{bytes}}.  The maximum value is
       limited by the available hardware, and care should be taken that the
