On 14/11/2025 14:23, Tobias Burnus wrote:
> Hi Andrew,
>
> Andrew Stubbs wrote:
>> On 11/11/2025 21:35, Tobias Burnus wrote:
>>> Can you also update: https://gcc.gnu.org/onlinedocs/libgomp/AMD-Radeon.html
>>> – search for HSA_XNACK.
>>>
>>> That's only for the USM case, but I think that's the only one that
>>> really matters. (Even if xnack+ / on is also affected.)
>>
>> OK, I'll add that to my todo list.
>
> Thanks!

How is the attached?
Andrew
From f31a7d595133f17c44695e0d76e1a658838f82ce Mon Sep 17 00:00:00 2001
From: Andrew Stubbs <[email protected]>
Date: Fri, 28 Nov 2025 16:20:46 +0000
Subject: [PATCH] libgomp, amdgcn: document HSA_XNACK

Mention that the HSA_XNACK variable is automatically set by the toolchain.

libgomp/ChangeLog:

	* libgomp.texi (AMD GCN): Mention HSA_XNACK is set automatically.
---
libgomp/libgomp.texi | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 967e95d72fb..b3fd8f02b3a 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -7122,13 +7122,14 @@ The implementation remark:
such that the next reverse offload region is only executed after the previous
one returned.
@item OpenMP code that has a @code{requires} directive with @code{self_maps} or
- @code{unified_shared_memory} is only supported if all AMD GPUs have the
- @code{HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT} property; for
- discrete GPUs, this may require setting the @code{HSA_XNACK} environment
- variable to @samp{1}; for systems with both an APU and a discrete GPU that
- does not support XNACK, consider using @code{ROCR_VISIBLE_DEVICES} to
- enable only the APU. If not supported, all AMD GPU devices are removed
- from the list of available devices (``host fallback'').
+ @code{unified_shared_memory} is only supported if @emph{all} the AMD GPUs
+ present have the @code{HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT}
+ property; some systems require the ``xnack'' feature enabled for this to be
+ true, in which case the runtime will attempt to set the @code{HSA_XNACK}
+ environment variable to @samp{1} automatically (user-set values are not
+ overridden, and the setting only affects the executable itself and any
+ child processes). If any AMD GPU device is not supported, all AMD GPUs
+ are removed from the list of available devices (``host fallback'').
@item The available stack size can be changed using the @code{GCN_STACK_SIZE}
environment variable; the default is 32 kiB per thread.
@item Low-latency memory (@code{omp_low_lat_mem_space}) is supported when the
--
2.51.0
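
For reference, the semantics described in the new text ("user-set values are not overridden, and the setting only affects the executable itself and any child processes") are what POSIX `setenv` gives when called with `overwrite` set to 0. The following is only a minimal sketch of that behaviour, not the actual libgomp or toolchain code; the helper name `request_xnack_default` is hypothetical, and where and when the real implementation sets the variable is not shown here.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/* Hypothetical helper, NOT the libgomp implementation: ask for XNACK
   by default, but respect any value the user has already exported.
   Returns 0 on success, -1 on failure (as setenv does).  */
static int
request_xnack_default (void)
{
  /* overwrite == 0: if HSA_XNACK is already present in the environment
     (for example HSA_XNACK=0 set by the user), it is left untouched.  */
  return setenv ("HSA_XNACK", "1", 0);
}
```

Because `setenv` only modifies the calling process's environment, the value is seen by that process and inherited by any children it starts afterwards; the parent shell is unaffected. A user can still opt out by exporting `HSA_XNACK=0` before launching the program.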