On 21/10/2025 12:20, Tobias Burnus wrote:
Andrew Stubbs wrote in "[PATCH v7 0/2] libgomp: OpenMP pinned memory
improvements":
Compared to the v6 patchset I sent last week, this series has had the
gomp_debug noise and some vestigial cruft removed, [...]
Andrew Stubbs wrote in this thread:
Use Cuda to pin memory, instead of Linux mlock, when available.
There are two advantages: firstly, this gives a significant speed
boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.
...
OK for mainline?
...
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
-@item The @code{pinned} trait is supported on Linux hosts, but is
subject to
- the OS @code{ulimit}/@code{rlimit} locked memory settings. It
currently
- uses @code{mmap} and is therefore optimized for few
allocations, including
- large data. If the conditions for numa or memkind allocations are
- fulfilled, those allocators are used instead.
+@item The @code{pinned} trait is supported on Linux hosts, but is
usually
+ subject to the OS @code{ulimit}/@code{rlimit} locked memory
settings (see
+ @ref{Offload-Target Specifics} for exceptions). It currently uses
+ @code{mmap} and is therefore optimized for few allocations,
including
+ large data.
I think I mentioned this before, but it looks as if it got lost
while removing large quoted code blocks in the reply email :-/
The patch removes the sentence:
"If the conditions for numa or memkind allocations are fulfilled,
those allocators are used instead."
Given the current implementation, I think this sensence should
stay. However, it seems as if the proper change is the
following.
(A) Looking at the current code, I wonder whether we need in
omp_init_allocator something like:
/* Reject unsupported memory spaces. */
#if defined(LIBGOMP_USE_MEMKIND) || defined(LIBGOMP_USE_LIBNUMA)
if (data.pinned || data.memkind != GOMP_MEMKIND_NONE)
return omp_null_allocator;
#end if
if (!MEMSPACE_VALIDATE (data.memspace, data.access, data.pinned))
return omp_null_allocator;
(B) Mentioning that (possibly not in this @item but elsewhere in the
section) that 'if requesting pinned memory and the conditions for
numa or memkind allocations are fulfilled', the creation of the
user allocator will fail.
[Or, alternatively, changing the priority: Using the pinned-memory
allocator, resetting the use of memkind – under the assumption that
pinning is more important than using memkind/numa - as it might be
detectable as a user.]
As the more proper changes still imply that the removed sentence
should be removed: I think this patch is fine, but we really should
do the follow-up fix.
The sentence was removed because it's not about the "pinned" trait.
Memkind can be used for the high-bandwidth and large-capacity memory
spaces, but this part of the manual is talking about pinned memory.
The patch probably does make sense though, although I don't like that
supporting libmemkind makes some allocators invalid that would normally
work fine.
* * *
--- a/libgomp/target.c
+++ b/libgomp/target.c
...
+static struct gomp_device_descr *
+get_device_for_page_locked (void)
+{
...
+ if (device == (void *) -1)
+ {
...
+ if (device)
+ gomp_fatal ("Unclear how %s and %s libgomp plugins may"
+ " simultaneously provide functionality"
+ " for page-locked memory",
+ device->name, devices[i].name);
+ else
+ device = &devices[i];
+ }
* * *
+gomp_page_locked_host_alloc (void **ptr, size_t size)
+{
+ struct gomp_device_descr *device = get_device_for_page_locked ();
+ if (device)
+ {
+ gomp_mutex_lock (&device->lock);
+ if (device->state == GOMP_DEVICE_UNINITIALIZED)
+ {
+ gomp_init_device (device);
+ gomp_debug (0, "Using device %s for page-locked memory\n",
+ device->name);
Wouldn't make more sense to move this diagnostic up to
after 'device =' in get_device_for_page_locked (as quoted above,
above the '* * *' line)?
In both locations, it should print once – but here only if
the device wasn't initialized before, in the other case at
the first time device-for-page-lock is checked.
Having this output also after any device-initializing function
call seems to make more sense.
(BTW: 'gomp_fatal' calls 'exit', the 'else' is not really
needed.)
OK, I think it makes sense to always have this message in the log,
without depending on the initialization order.
* * *
LGTM – as is or with the suggestion above applied, but I think
with the moved diagnostic it is nicer.
Thanks for the patch!
I will make the diagnostic change, rebase, retest and then commit both
patches.
Andrew