On 07/07/2022 12:54, Tobias Burnus wrote:
Hi Andrew,
On 07.07.22 12:34, Andrew Stubbs wrote:
Implement the -foffload-memory=pinned option such that libgomp is
instructed to enable fully-pinned memory at start-up. The option is
intended to provide a performance boost to certain offload programs
without modifying the code.
...
gcc/ChangeLog:
* omp-builtins.def (BUILT_IN_GOMP_ENABLE_PINNED_MODE): New.
* omp-low.cc (omp_enable_pinned_mode): New function.
(execute_lower_omp): Call omp_enable_pinned_mode.
libgomp/ChangeLog:
* config/linux/allocator.c (always_pinned_mode): New variable.
(GOMP_enable_pinned_mode): New function.
(linux_memspace_alloc): Disable pinning when always_pinned_mode set.
(linux_memspace_calloc): Likewise.
(linux_memspace_free): Likewise.
(linux_memspace_realloc): Likewise.
* libgomp.map: Add GOMP_enable_pinned_mode.
* testsuite/libgomp.c/alloc-pinned-7.c: New test.
...
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -14620,6 +14620,68 @@ lower_omp (gimple_seq *body, omp_context *ctx)
input_location = saved_location;
}
+/* Emit a constructor function to enable -foffload-memory=pinned
+ at runtime. Libgomp handles the OS mode setting, but we need to trigger
+ it by calling GOMP_enable_pinned_mode before the program proper runs. */
+
+static void
+omp_enable_pinned_mode ()
Is there a reason not to use the mechanism of OpenMP's 'requires'
directive for this?
(Okay, I have to admit that the final patch was only committed on
Monday. But still ...)
Possibly; I had most of this done before then. I'll have a look next
time I visit this patch.
The CUDA-specific solution can't work this way in any case, because
there's no mlockall equivalent, so I will have to make conditional
adjustments anyway.
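For readers unfamiliar with the mechanism, here is a minimal sketch of a
constructor that tries to pin all process memory with mlockall before the
program proper runs. It is not the code from the patch: the names
enable_pinned_mode_sketch and pinned_mode_ctor and the two status flags are
hypothetical, and the body of the real GOMP_enable_pinned_mode is not shown
in this thread.

```c
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical status flags, for illustration only.  */
static int constructor_ran = 0;
static int pinned_mode_enabled = 0;

/* Hypothetical stand-in for a runtime entry point such as
   GOMP_enable_pinned_mode; the real implementation may differ.  */
static void
enable_pinned_mode_sketch (void)
{
  /* Lock all current and future pages into RAM.  This can fail, for
     example when RLIMIT_MEMLOCK is too low for the process.  */
  if (mlockall (MCL_CURRENT | MCL_FUTURE) == 0)
    pinned_mode_enabled = 1;
  else
    perror ("mlockall");
}

/* Runs before main, much like a compiler-emitted constructor would.  */
__attribute__ ((constructor))
static void
pinned_mode_ctor (void)
{
  constructor_ran = 1;
  enable_pinned_mode_sketch ();
}
```

Whether the lock succeeds depends on the locked-memory limit of the
environment, so a caller should treat failure as "run unpinned" rather than
as a fatal error.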
Likewise, the 'requires' mechanism could then also be used in '[PATCH
16/17] amdgcn, openmp: Auto-detect USM mode and set HSA_XNACK'.
No, I don't think so; that environment variable needs to be set before
the libraries are loaded or it's too late. There are other ways to
achieve the same thing, by leaving messages for the libgomp plugin to
pick up, perhaps, but it's all extra complexity for no real gain.
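To illustrate the ordering constraint, here is a rough sketch of setting the
variable from an early constructor. The value "1", the priority 101, and the
function names are assumptions made for the example, not taken from the
patch, and it only helps if the library that reads HSA_XNACK is loaded after
this constructor has run (for instance via a later dlopen).

```c
#include <stdlib.h>

/* Hypothetical early constructor; priorities 0..100 are reserved for
   the implementation, so 101 is the earliest a user may request.  */
__attribute__ ((constructor (101)))
static void
set_xnack_early (void)
{
  /* The final argument 0 means "do not overwrite": a value the user
     already exported takes precedence over this default.  */
  setenv ("HSA_XNACK", "1", 0);
}

/* Hypothetical helper: report the value that later loads will see.  */
const char *
xnack_setting (void)
{
  return getenv ("HSA_XNACK");
}
```

A library that was already loaded and has already read its environment will
not notice the change, which is the "too late" case described above.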
Andrew