Re: -foffload-memory=pinned (was: [PATCH 1/5] openmp: Add -foffload-memory)

Andrew Stubbs Mon, 13 Feb 2023 07:20:24 -0800

On 13/02/2023 14:38, Thomas Schwinge wrote:

Hi!


On 2022-03-08T11:30:55+0000, Hafiz Abid Qadeer <ab...@codesourcery.com> wrote:

From: Andrew Stubbs <a...@codesourcery.com>

Add a new option.  It will be used in follow-up patches.

--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi

+@option{-foffload-memory=pinned} forces all host memory to be pinned (this
+mode may require the user to increase the ulimit setting for locked memory).


So, this is currently implemented via 'mlockall', which, as discussed,
(a) has issues ('ulimit -l'), and (b) doesn't actually achieve what it
meant to achieve (because it doesn't register the page-locked memory with
the GPU driver).

So one idea was to re-purpose the unified shared memory
'gcc/omp-low.cc:pass_usm_transform' (compiler pass that "changes calls to
malloc/free/calloc/realloc and operator new to memory allocation
functions in libgomp with allocator=ompx_unified_shared_mem_alloc"),
<https://inbox.sourceware.org/gcc-patches/20220308113059.688551-5-ab...@codesourcery.com>
(I have not yet looked into that in detail.)

Here's now a different idea.  As '-foffload-memory=pinned', per the name
of the option, concerns itself with memory used in offloading but not
host execution generally, why are we actually attempting to "[force] all
host memory to be pinned" -- why not just the memory that's being used
with offloading?  That is, if '-foffload-memory=pinned' is set, register
as page-locked with the GPU driver all memory that appears in OMP
offloading data regions, such as OpenMP 'target' 'map' clauses etc.  That
way, this is directed at the offloading data transfers, as intended, but
at the same time we don't "waste" page-locked memory for generic host
memory allocations.  What do you think -- you, who've spent a lot more
time on this topic than I have, so it's likely possible that I fail to
realize some "details"?

The main reason it is the way it is is because in general it's not
possible to know what memory is going to be offloaded at the time it is
allocated (and stack/static memory is never allocated that way).

If there's a way to pin it after the fact then maybe that's not a
terrible idea? The downside is that the memory might already have been
paged out at that point, and we'd have to track what we'd previously
pinned, or else re-pin it every time we launch a kernel. We'd also have
no way to unpin previously pinned memory (not that that's relevant to
the "lock all" case).
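
To make the "track what we'd previously pinned" part concrete, here is a
rough sketch, not anything from the patch series. It uses plain 'mlock' as
a stand-in for a real driver registration call, and all the names
('pin_for_offload', 'is_tracked', 'page_span', 'MAX_PINNED') are
hypothetical. It records page-aligned ranges in a small fixed table so
that a repeated mapping of the same memory doesn't pin again; it never
unpins, which is the limitation mentioned above.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAX_PINNED 64  /* Arbitrary table size, for illustration only.  */

struct pinned_range { uintptr_t start; uintptr_t end; };

static struct pinned_range pinned[MAX_PINNED];
static size_t n_pinned;

/* Round [addr, addr+len) outwards to whole pages.  */
static struct pinned_range page_span(const void *addr, size_t len)
{
    uintptr_t page = (uintptr_t) sysconf(_SC_PAGESIZE);
    uintptr_t a = (uintptr_t) addr;
    struct pinned_range r;
    r.start = a & ~(page - 1);
    r.end = (a + len + page - 1) & ~(page - 1);
    return r;
}

/* Return 1 if the range lies inside one previously recorded range.  */
int is_tracked(const void *addr, size_t len)
{
    struct pinned_range r = page_span(addr, len);
    for (size_t i = 0; i < n_pinned; i++)
        if (pinned[i].start <= r.start && r.end <= pinned[i].end)
            return 1;
    return 0;
}

/* Pin the pages backing [addr, addr+len) unless already recorded.
   Return 0 on success, -1 if the table is full or mlock fails.  */
int pin_for_offload(const void *addr, size_t len)
{
    if (is_tracked(addr, len))
        return 0;
    if (n_pinned == MAX_PINNED)
        return -1;
    struct pinned_range r = page_span(addr, len);
    if (mlock((const void *) r.start, r.end - r.start) != 0)
        return -1;
    pinned[n_pinned++] = r;
    return 0;
}
```

A runtime would presumably call something like 'pin_for_offload' for each
mapped object before a transfer. A real implementation would also have to
cope with memory being freed and reused, which this table does not.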

My original plan was to use omp_alloc for both the standard OpenMP
support and the -foffload-memory option (to get the benefit of pinning
without modifying any source), but then I decided that the mlockall
option was much less invasive. This is still the best way to implement
target-independent pinning, when there's no driver registration option.
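
For reference, the "lock all" approach amounts to roughly the following.
This is a minimal sketch rather than the code in the patch; the helper
name 'pin_all_host_memory' is made up for illustration. It also shows
where the 'ulimit -l' limit bites: without sufficient RLIMIT_MEMLOCK (or
privileges) the call can fail outright, or later allocations can fail.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical helper: lock all current and future pages of the process.
   Return 0 on success, otherwise the errno value from mlockall.  */
int pin_all_host_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) == 0)
        return 0;

    int err = errno;
    struct rlimit lim;
    /* Report the locked-memory limit, i.e. what 'ulimit -l' controls.  */
    if (getrlimit(RLIMIT_MEMLOCK, &lim) == 0
        && lim.rlim_cur != RLIM_INFINITY)
        fprintf(stderr, "mlockall: %s (locked-memory limit: %llu bytes)\n",
                strerror(err), (unsigned long long) lim.rlim_cur);
    else
        fprintf(stderr, "mlockall: %s\n", strerror(err));
    return err;
}
```

Note that this only keeps the pages resident; it does nothing to register
them with any GPU driver.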


Andrew
