The attached patch cleans up the memory allocation description. I mainly started on it because I was wondering about some of the details myself, especially about the pool_size feature.

It also includes the documentation about omp::allocator::* by Alex.

And, as I proposed back then (cf. below), it moves the list of supported traits and predefined memspaces/allocators to the memory-allocation section; previously it was under OMP_ALLOCATOR, because the memory-allocation section did not exist yet when I completed the list of ICV environment variables.

Any comments before I commit the attached patch?

Tobias

PS: The pages currently look as follows: https://gcc.gnu.org/onlinedocs/libgomp/Memory-allocation.html and https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fALLOCATOR.html
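
For readers who want the pool_size wording in the patch spelled out: the sketch below models "no real pool, only an accounting limit, and the fallback trait applies once the limit is exceeded". It is a hypothetical illustration, not libgomp's code. The names pool_model, fallback_kind, allocate and release are invented for this example, only the null_fb and default_mem_fb fallbacks are modelled, and the real runtime may charge extra bytes (alignment, bookkeeping) against the limit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical model of the documented pool_size behaviour.
// NOT libgomp's implementation; every name here is made up.
enum class fallback_kind { null_fb, default_mem_fb };

struct pool_model {
  std::size_t pool_size;   // pool_size trait, in bytes
  fallback_kind fallback;  // fallback trait (only two values modelled)
  std::size_t used = 0;    // bytes currently charged to the pool

  // Memory always comes from malloc; the "pool" is only a limit.
  // from_pool reports whether the bytes were charged to the pool.
  void *allocate(std::size_t size, bool &from_pool) {
    from_pool = size <= pool_size - used;
    if (from_pool) {
      void *p = std::malloc(size);
      if (p)
        used += size;
      else
        from_pool = false;
      return p;
    }
    // Limit exceeded: the fallback trait decides what happens.
    if (fallback == fallback_kind::null_fb)
      return nullptr;
    return std::malloc(size);  // default_mem_fb: allocation is not charged
  }

  // Gives the memory back and, if it was charged, credits the pool.
  void release(void *p, std::size_t size, bool from_pool) {
    if (from_pool) {
      assert(size <= used);
      used -= size;
    }
    std::free(p);
  }
};
```

With pool_model m{100, fallback_kind::null_fb}, a first 60-byte request is served and charged, a second 60-byte request returns a null pointer, and releasing the first block brings the accounted size back to zero.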

Tobias Burnus wrote:
Alex wrote:
Here is a follow up patch for documentation of the omp.h allocators,
[…]
I want the table in there somewhere but I'm not confident that where I
put it was the right place.

I think having the C++ template classes listed under the OMP_ALLOCATOR environment variable feels odd.

I think it is best to move the two tables, the existing one under
"OMP_ALLOCATOR – Set the default allocator",
https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fALLOCATOR.html
and the one you added to "11.3 Memory allocation",
https://gcc.gnu.org/onlinedocs/libgomp/Memory-allocation.html
libgomp.texi: Document omp(x)::allocator::*, restructure memory allocator doc

libgomp/ChangeLog:

	* libgomp.texi (omp_init_allocator): Refer to 'Memory allocation'
	for available memory spaces.
	(OMP_ALLOCATOR): Move list of traits and predefined memspaces
	and allocators to ...
	(Memory allocation): ... here. Document omp(x)::allocator::*;
	minor wording tweaks, be more explicit about memkind, pinned and
	pool_size.

Co-authored-by: waffl3x <waff...@baylibre.com>

 libgomp/libgomp.texi | 179 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 112 insertions(+), 67 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 8374595bc82..06bd5419acc 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -3453,7 +3453,7 @@ traits; if an allocator that fulfills the requirements cannot be created,
 @code{omp_null_allocator} is returned.
 
 The predefined memory spaces and available traits can be found at
-@ref{OMP_ALLOCATOR}, where the trait names have to be prefixed by
+@ref{Memory allocation}, where the trait names have to be prefixed by
 @code{omp_atk_} (e.g. @code{omp_atk_pinned}) and the named trait values by
 @code{omp_atv_} (e.g. @code{omp_atv_true}); additionally, @code{omp_atv_default}
 may be used as trait value to specify that the default value should be used.
@@ -3476,7 +3476,7 @@ may be used as trait value to specify that the default value should be used.
 @end multitable
 
 @item @emph{See also}:
-@ref{OMP_ALLOCATOR}, @ref{Memory allocation}, @ref{omp_destroy_allocator}
+@ref{Memory allocation}, @ref{OMP_ALLOCATOR}, @ref{omp_destroy_allocator}
 
 @item @emph{Reference}:
 @uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.7.2
@@ -4057,63 +4057,15 @@ The value can either be a predefined allocator or a predefined memory space
 or a predefined memory space followed by a colon and a comma-separated list
 of memory trait and value pairs, separated by @code{=}.
 
+See @ref{Memory allocation} for a list of supported predefined allocators,
+memory spaces, and traits.
+
 Note: The corresponding device environment variables are currently not
 supported.  Therefore, the non-host @var{def-allocator-var} ICVs are always
 initialized to @code{omp_default_mem_alloc}.  However, on all devices,
 the @code{omp_set_default_allocator} API routine can be used to change
 value.
 
-@multitable @columnfractions .45 .45
-@headitem Predefined allocators @tab Associated predefined memory spaces
-@item omp_default_mem_alloc     @tab omp_default_mem_space
-@item omp_large_cap_mem_alloc   @tab omp_large_cap_mem_space
-@item omp_const_mem_alloc       @tab omp_const_mem_space
-@item omp_high_bw_mem_alloc     @tab omp_high_bw_mem_space
-@item omp_low_lat_mem_alloc     @tab omp_low_lat_mem_space
-@item omp_cgroup_mem_alloc      @tab omp_low_lat_mem_space (implementation defined)
-@item omp_pteam_mem_alloc       @tab omp_low_lat_mem_space (implementation defined)
-@item omp_thread_mem_alloc      @tab omp_low_lat_mem_space (implementation defined)
-@item ompx_gnu_pinned_mem_alloc @tab omp_default_mem_space (GNU extension)
-@end multitable
-
-The predefined allocators use the default values for the traits,
-as listed below.  Except that the last three allocators have the
-@code{access} trait set to @code{cgroup}, @code{pteam}, and
-@code{thread}, respectively.
-
-@multitable @columnfractions .25 .40 .25
-@headitem Trait @tab Allowed values @tab Default value
-@item @code{sync_hint} @tab @code{contended}, @code{uncontended},
-                            @code{serialized}, @code{private}
-                       @tab @code{contended}
-@item @code{alignment} @tab Positive integer being a power of two
-                       @tab 1 byte
-@item @code{access}    @tab @code{all}, @code{cgroup},
-                            @code{pteam}, @code{thread}
-                       @tab @code{all}
-@item @code{pool_size} @tab Positive integer
-                       @tab See @ref{Memory allocation}
-@item @code{fallback}  @tab @code{default_mem_fb}, @code{null_fb},
-                            @code{abort_fb}, @code{allocator_fb}
-                       @tab See below
-@item @code{fb_data}   @tab @emph{unsupported as it needs an allocator handle}
-                       @tab (none)
-@item @code{pinned}    @tab @code{true}, @code{false}
-                       @tab See below
-@item @code{partition} @tab @code{environment}, @code{nearest},
-                            @code{blocked}, @code{interleaved}
-                       @tab @code{environment}
-@end multitable
-
-For the @code{fallback} trait, the default value is @code{null_fb} for the
-@code{omp_default_mem_alloc} allocator and any allocator that is associated
-with device memory; for all other allocators, it is @code{default_mem_fb}
-by default.
-
-For the @code{pinned} trait, the default value is @code{true} for
-predefined allocator @code{ompx_gnu_pinned_mem_alloc} (a GNU extension), and
-@code{false} for all others.
-
 Examples:
 @smallexample
 OMP_ALLOCATOR=omp_high_bw_mem_alloc
@@ -6883,6 +6835,7 @@ on more architectures, GCC currently does not match any @code{arch} or
       @tab See @code{-march=} in ``Nvidia PTX Options''
 @end multitable
 
+
 @node Memory allocation
 @section Memory allocation
 
@@ -6917,11 +6870,94 @@ The description below applies to:
       @code{_Alignof} and C++'s @code{alignof}.
 @end itemize
 
-For the available predefined allocators and, as applicable, their associated
-predefined memory spaces and for the available traits and their default values,
-see @ref{OMP_ALLOCATOR}.  Predefined allocators without an associated memory
-space use the @code{omp_default_mem_space} memory space.  See additionally
-@ref{Offload-Target Specifics}.
+GCC supports the following predefined allocators and predefined memory spaces:
+
+@multitable @columnfractions .45 .45
+@headitem Predefined allocators @tab Associated predefined memory spaces
+@item omp_default_mem_alloc     @tab omp_default_mem_space
+@item omp_large_cap_mem_alloc   @tab omp_large_cap_mem_space
+@item omp_const_mem_alloc       @tab omp_const_mem_space
+@item omp_high_bw_mem_alloc     @tab omp_high_bw_mem_space
+@item omp_low_lat_mem_alloc     @tab omp_low_lat_mem_space
+@item omp_cgroup_mem_alloc      @tab omp_low_lat_mem_space (implementation defined)
+@item omp_pteam_mem_alloc       @tab omp_low_lat_mem_space (implementation defined)
+@item omp_thread_mem_alloc      @tab omp_low_lat_mem_space (implementation defined)
+@item ompx_gnu_pinned_mem_alloc @tab omp_default_mem_space (GNU extension)
+@end multitable
+
+Each predefined allocator, including @code{omp_null_allocator}, has a corresponding
+allocator class template that meets the C++ allocator completeness requirements.
+These are located in the @code{omp::allocator} namespace or, for GNU
+extensions, in the @code{ompx::allocator} namespace.  This allows the
+allocator-aware C++ standard library containers to use OpenMP allocation routines;
+for instance:
+
+@smallexample
+std::vector<int, omp::allocator::cgroup_mem<int>> vec;
+@end smallexample
+
+The following allocator templates are supported:
+
+@multitable @columnfractions .45 .45
+@headitem Predefined allocators @tab Associated allocator template
+@item omp_null_allocator        @tab omp::allocator::null_allocator
+@item omp_default_mem_alloc     @tab omp::allocator::default_mem
+@item omp_large_cap_mem_alloc   @tab omp::allocator::large_cap_mem
+@item omp_const_mem_alloc       @tab omp::allocator::const_mem
+@item omp_high_bw_mem_alloc     @tab omp::allocator::high_bw_mem
+@item omp_low_lat_mem_alloc     @tab omp::allocator::low_lat_mem
+@item omp_cgroup_mem_alloc      @tab omp::allocator::cgroup_mem
+@item omp_pteam_mem_alloc       @tab omp::allocator::pteam_mem
+@item omp_thread_mem_alloc      @tab omp::allocator::thread_mem
+@item ompx_gnu_pinned_mem_alloc @tab ompx::allocator::gnu_pinned_mem
+@end multitable
+
+The following traits are available when constructing a new allocator;
+if a trait is not specified or is specified with the value @code{default},
+the default value listed below is used for that trait.  The predefined
+allocators use the default values of each trait, except that the
+@code{omp_cgroup_mem_alloc}, @code{omp_pteam_mem_alloc}, and
+@code{omp_thread_mem_alloc} allocators have the @code{access} trait
+set to @code{cgroup}, @code{pteam}, and @code{thread}, respectively.
+For each trait, a named constant prefixed by @code{omp_atk_} exists;
+for each non-numeric value, a named constant prefixed by @code{omp_atv_}
+exists.
+
+@multitable @columnfractions .25 .40 .25
+@headitem Trait @tab Allowed values @tab Default value
+@item @code{sync_hint} @tab @code{contended}, @code{uncontended},
+                            @code{serialized}, @code{private}
+                       @tab @code{contended}
+@item @code{alignment} @tab Positive integer being a power of two
+                       @tab 1 byte
+@item @code{access}    @tab @code{all}, @code{cgroup},
+                            @code{pteam}, @code{thread}
+                       @tab @code{all}
+@item @code{pool_size} @tab Positive integer (bytes)
+                       @tab See below
+@item @code{fallback}  @tab @code{default_mem_fb}, @code{null_fb},
+                            @code{abort_fb}, @code{allocator_fb}
+                       @tab See below
+@item @code{fb_data}   @tab @emph{allocator handle}
+                       @tab (none)
+@item @code{pinned}    @tab @code{true}, @code{false}
+                       @tab See below
+@item @code{partition} @tab @code{environment}, @code{nearest},
+                            @code{blocked}, @code{interleaved}
+                       @tab @code{environment}
+@end multitable
+
+For the @code{fallback} trait, the default value is @code{null_fb} for the
+@code{omp_default_mem_alloc} allocator and any allocator that is associated
+with device memory; for all other allocators, it is @code{default_mem_fb}
+by default.
+
+For the @code{pinned} trait, the default value is @code{true} for
+predefined allocator @code{ompx_gnu_pinned_mem_alloc} (a GNU extension), and
+@code{false} for all others.
+
+The following description applies to the initial device (the host) and largely
+also to non-host devices; for the latter, also see @ref{Offload-Target Specifics}.
 
 For the memory spaces, the following applies:
 @itemize
@@ -6936,14 +6972,16 @@ For the memory spaces, the following applies:
 @end itemize
 
 On Linux systems, where the @uref{https://github.com/memkind/memkind, memkind
-library} (@code{libmemkind.so.0}) is available at runtime, it is used when
-creating memory allocators requesting
+library} (@code{libmemkind.so.0}) is available at runtime and the respective
+memkind kind is supported, it is used when creating memory allocators requesting
 
 @itemize
-@item the memory space @code{omp_high_bw_mem_space}
-@item the memory space @code{omp_large_cap_mem_space}
-@item the @code{partition} trait @code{interleaved}; note that for
-      @code{omp_large_cap_mem_space} the allocation will not be interleaved
+@item the @code{partition} trait @code{interleaved}, except when the memory
+      space is @code{omp_large_cap_mem_space} (uses @code{MEMKIND_HBW_INTERLEAVE})
+@item the memory space @code{omp_high_bw_mem_space} (uses
+      @code{MEMKIND_HBW_PREFERRED})
+@item the memory space @code{omp_large_cap_mem_space} (uses
+      @code{MEMKIND_DAX_KMEM_ALL} or, if not available, @code{MEMKIND_DAX_KMEM})
 @end itemize
 
 On Linux systems, where the @uref{https://github.com/numactl/numactl, numa
@@ -6969,10 +7007,15 @@ a @code{nearest} allocation.
 Additional notes regarding the traits:
 @itemize
 @item The @code{pinned} trait is supported on Linux hosts, but is subject to
-      the OS @code{ulimit}/@code{rlimit} locked memory settings.
+      the OS @code{ulimit}/@code{rlimit} locked memory settings.  It currently
+      uses @code{mmap} and is therefore optimized for few allocations, including
+      large data.  If the conditions for numa or memkind allocations are
+      fulfilled, those allocators are used instead.
 @item The default for the @code{pool_size} trait is no pool and for every
       (re)allocation the associated library routine is called, which might
-      internally use a memory pool.
+      internally use a memory pool.  Currently, the same applies when a
+      @code{pool_size} has been specified, except that once allocations exceed
+      the pool size, the action of the @code{fallback} trait applies.
 @item For the @code{partition} trait, the partition part size will be the same
       as the requested size (i.e. @code{interleaved} or @code{blocked} has no
       effect), except for @code{interleaved} when the memkind library is
@@ -6981,13 +7024,15 @@ Additional notes regarding the traits:
       that allocated the memory; on Linux, this is in particular the case when
       the memory placement policy is set to preferred.
 @item The @code{access} trait has no effect such that memory is always
-      accessible by all threads.
+      accessible by all threads (except on supported non-host devices).
 @item The @code{sync_hint} trait has no effect.
 @end itemize
 
 See also:
 @ref{Offload-Target Specifics}
 
+
+
 @c ---------------------------------------------------------------------
 @c Offload-Target Specifics
 @c ---------------------------------------------------------------------
