On 11.12.23 18:04, Andrew Stubbs wrote:
Use CUDA to pin memory, instead of Linux mlock, when available.
There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.
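
For context on the second point: mlock is subject to the per-process RLIMIT_MEMLOCK limit. The snippet below is only an illustrative sketch, not part of the patch. The helper names memlock_limit and mlock_may_succeed are hypothetical, and the check is approximate, because the kernel counts all locked memory against the limit and privileged processes may bypass it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft RLIMIT_MEMLOCK limit in bytes.
   Returns 0 on success, -1 if getrlimit fails.  */
static int
memlock_limit (rlim_t *soft)
{
  struct rlimit rl;
  if (getrlimit (RLIMIT_MEMLOCK, &rl) != 0)
    return -1;
  *soft = rl.rlim_cur;
  return 0;
}

/* Hypothetical helper: rough check whether a single mlock of SIZE bytes
   could fit under the soft limit.  It ignores memory that is already
   locked, so a "yes" here is not a guarantee.  */
static int
mlock_may_succeed (size_t size)
{
  rlim_t soft;
  if (memlock_limit (&soft) != 0)
    return 0;
  return soft == RLIM_INFINITY || (rlim_t) size <= soft;
}
```

On many distributions the default soft limit is small, which is why large mlock-based pinned allocations tend to fail unless the limit is raised.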
The design adds a device independent plugin API for allocating pinned memory,
and then implements it for NVPTX. At present, the other supported devices do
not have equivalent capabilities (or requirements).
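
The dispatch described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from the patch: struct device_hooks stands in for the two new fields of struct gomp_device_descr, the hook signatures are guessed, toy_alloc and toy_free are hypothetical stand-ins for a real plugin, and the reading of init0 (device-pinned memory must be zeroed explicitly, whereas fresh anonymous mmap pages are already zero) is an inference from the ChangeLog.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical stand-in for the hooks added to struct gomp_device_descr;
   the real signatures may differ.  */
struct device_hooks
{
  bool (*page_locked_host_alloc_func) (void **ptr, size_t size);
  bool (*page_locked_host_free_func) (void *ptr);
};

/* Toy "plugin": uses malloc/free where a real plugin would call the
   device driver to obtain page-locked host memory.  */
static bool
toy_alloc (void **ptr, size_t size)
{
  *ptr = malloc (size);
  return *ptr != NULL;
}

static bool
toy_free (void *ptr)
{
  free (ptr);
  return true;
}

/* Allocate SIZE bytes of pinned memory.  If DEV offers the hook, use it
   and zero the memory only when INIT0 is set; otherwise fall back to
   mmap + mlock.  *USED_DEVICE records which path was taken so that the
   matching free path can be chosen later.  */
static void *
pinned_alloc (const struct device_hooks *dev, size_t size, bool init0,
              bool *used_device)
{
  *used_device = dev != NULL && dev->page_locked_host_alloc_func != NULL;
  if (*used_device)
    {
      void *p = NULL;
      if (!dev->page_locked_host_alloc_func (&p, size))
        return NULL;
      if (init0)
        memset (p, 0, size);
      return p;
    }

  /* Fallback: anonymous mapping (already zero-filled) locked with mlock.
     This path is subject to RLIMIT_MEMLOCK and may fail.  */
  void *p = mmap (NULL, size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;
  if (mlock (p, size) != 0)
    {
      munmap (p, size);
      return NULL;
    }
  return p;
}

/* Release memory obtained from pinned_alloc via the matching path.  */
static void
pinned_free (const struct device_hooks *dev, void *p, size_t size,
             bool used_device)
{
  if (used_device)
    dev->page_locked_host_free_func (p);
  else
    munmap (p, size); /* Unmapping also drops the lock.  */
}
```

A caller would pass a hook table such as { toy_alloc, toy_free } to take the device path, or NULL to exercise the mlock fallback.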
LGTM. If you temporarily back out the .texi wording "@code{ompx_pinned_mem_alloc}
allocator or", it can go in immediately after 1/6, i.e. before the
ompx_pinned_mem_alloc patch.
As discussed in the thread for the previous patch, I think the initialization
should be done in the worker function of gomp_init_targets_once().
As discussed elsewhere, the patch can go in as is and I will provide a
follow-up patch.
Thanks,
Tobias
libgomp/ChangeLog:
* config/linux/allocator.c: Include assert.h.
(using_device_for_page_locked): New variable.
(linux_memspace_alloc): Add init0 parameter. Support device pinning.
(linux_memspace_calloc): Set init0 to true.
(linux_memspace_free): Support device pinning.
(linux_memspace_realloc): Support device pinning.
(MEMSPACE_ALLOC): Set init0 to false.
* libgomp-plugin.h
(GOMP_OFFLOAD_page_locked_host_alloc): New prototype.
(GOMP_OFFLOAD_page_locked_host_free): Likewise.
* libgomp.h (gomp_page_locked_host_alloc): Likewise.
(gomp_page_locked_host_free): Likewise.
(struct gomp_device_descr): Add page_locked_host_alloc_func and
page_locked_host_free_func.
* libgomp.texi: Adjust the docs for the pinned trait.
* libgomp_g.h (GOMP_enable_pinned_mode): New prototype.
* plugin/plugin-nvptx.c
(GOMP_OFFLOAD_page_locked_host_alloc): New function.
(GOMP_OFFLOAD_page_locked_host_free): Likewise.
* target.c (device_for_page_locked): New variable.
(get_device_for_page_locked): New function.
(gomp_page_locked_host_alloc): Likewise.
(gomp_page_locked_host_free): Likewise.
(gomp_load_plugin_for_device): Add page_locked_host_alloc and
page_locked_host_free.
* testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX
devices.
* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.
Co-Authored-By: Thomas Schwinge <tho...@codesourcery.com>
---
 libgomp/config/linux/allocator.c             | 137 ++++++++++++++-----
 libgomp/libgomp-plugin.h                     |   2 +
 libgomp/libgomp.h                            |   4 +
 libgomp/libgomp.texi                         |  11 +-
 libgomp/libgomp_g.h                          |   1 +
 libgomp/plugin/plugin-nvptx.c                |  42 ++++++
 libgomp/target.c                             | 136 ++++++++++++++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-1.c |  26 ++++
 libgomp/testsuite/libgomp.c/alloc-pinned-2.c |  26 ++++
 libgomp/testsuite/libgomp.c/alloc-pinned-3.c |  45 +++++-
 libgomp/testsuite/libgomp.c/alloc-pinned-4.c |  44 +++++-
 libgomp/testsuite/libgomp.c/alloc-pinned-5.c |  26 ++++
 libgomp/testsuite/libgomp.c/alloc-pinned-6.c |  35 ++++-
 13 files changed, 487 insertions(+), 48 deletions(-)
--
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
Munich; limited liability company; Managing Directors: Thomas Heurung,
Frank Thürauf; Registered office: Munich; Commercial register: Munich,
HRB 106955