This patch series is a rework of part of the series I posted about a year ago:
https://patchwork.sourceware.org/project/gcc/list/?series=10748&state=%2A&archive=both

The series depends on the low-latency patch series I posted a few weeks
ago:

https://patchwork.sourceware.org/project/gcc/list/?series=23045&state=%2A&archive=both

I will post the Unified Shared Memory and allocator directive patches at a
later time.

This version of the patches implements the same basic features, rebased on
the current sourcebase, plus a CUDA-specific allocator for improved
performance with NVPTX offloading, and a custom allocator for better
handling of small allocations.  The whole series has been bug-fixed and
generally improved (mostly by Thomas :) ).

An older, less compact version of these patches is already applied to the
devel/omp/gcc-13 (OG13) branch.

OK for mainline?

Andrew

Andrew Stubbs (5):
  libgomp: basic pinned memory on Linux
  libgomp, openmp: Add ompx_pinned_mem_alloc
  openmp: Add -foffload-memory
  openmp: -foffload-memory=pinned
  libgomp: fine-grained pinned memory allocator

Thomas Schwinge (1):
  libgomp, nvptx: Cuda pinned memory

 gcc/common.opt                               |  16 +
 gcc/coretypes.h                              |   7 +
 gcc/doc/invoke.texi                          |  16 +-
 gcc/omp-builtins.def                         |   3 +
 gcc/omp-low.cc                               |  66 ++++
 libgomp/Makefile.am                          |   2 +-
 libgomp/Makefile.in                          |   5 +-
 libgomp/allocator.c                          |  96 ++++--
 libgomp/config/gcn/allocator.c               |  17 +-
 libgomp/config/linux/allocator.c             | 234 +++++++++++++
 libgomp/config/nvptx/allocator.c             |  17 +-
 libgomp/libgomp-plugin.h                     |   2 +
 libgomp/libgomp.h                            |  14 +
 libgomp/libgomp.map                          |   1 +
 libgomp/libgomp_g.h                          |   1 +
 libgomp/omp.h.in                             |   1 +
 libgomp/omp_lib.f90.in                       |   2 +
 libgomp/plugin/plugin-nvptx.c                |  34 ++
 libgomp/target.c                             | 136 ++++++++
 .../libgomp.c-c++-common/alloc-pinned-1.c    |  28 ++
 libgomp/testsuite/libgomp.c/alloc-pinned-1.c | 134 ++++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-2.c | 139 ++++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-3.c | 174 ++++++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-4.c | 176 ++++++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-5.c | 128 +++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-6.c | 127 +++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-7.c |  63 ++++
 libgomp/testsuite/libgomp.c/alloc-pinned-8.c | 127 +++++++
 .../libgomp.fortran/alloc-pinned-1.f90       |  16 +
 libgomp/usmpin-allocator.c                   | 319 ++++++++++++++++++
 30 files changed, 2051 insertions(+), 50 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/alloc-pinned-1.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-1.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-2.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-3.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-4.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-5.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-6.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-7.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-8.c
 create mode 100644 libgomp/testsuite/libgomp.fortran/alloc-pinned-1.f90
 create mode 100644 libgomp/usmpin-allocator.c

-- 
2.41.0