On 08/12/2025 12:43, Tobias Burnus wrote:
Andrew Stubbs wrote:
Unified Shared Memory does not appear to work well on gfx908, which is why we disabled xnack by default.  For this reason it makes sense to inform the user as compile time, but this is causing trouble in the testsuite which assumes
that USM only fails at runtime.

This patch changes the gfx908 compile time message to a warning only (in case
some other target does this differently), and prevents the tests from
attempting to run in host-fallback mode (given that that is not what they are
trying to test).  It also changes the existing warning to only fire once.

The patch assumes that effective target "omp_usm" also implies self-maps.
...
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
      {
        warning_at (UNKNOWN_LOCATION, 0,
            "Unified Shared Memory is enabled, but XNACK is disabled");
        inform (UNKNOWN_LOCATION, "Try -foffload-options=-mxnack=any");
+      warned_xnack = 1;
      }

Wording wise, I am wondering whether it should be
  "Unified Shared Memory requested" or "… required" instead of "enabled"?

Fair enough; let's use "required".

[Two side remarks to bystanders:

* The hint (inform/"note:") won't help with gfx908 – bad luck!
   (xnack is always disabled on gfx908, as Andrew wrote).

* Warning only once is more useful than it seems as even relatively
   simple programs (libgomp/testsuite size) could warn tens if not hundreds
   of times!

* * *

+++ b/gcc/config/gcn/mkoffload.cc
@@ -627,9 +627,12 @@ process_asm (FILE *in, FILE *out, FILE *cfile, uint32_t omp_requires)
                || TEST_XNACK_ON (elf_flags)
                || xnack_required);
    if (TEST_XNACK_OFF (elf_flags) && xnack_required)
-    fatal_error (input_location,
-         "conflicting settings; XNACK is forced off but Unified "
-         "Shared Memory is on");
+    {
+      warning (input_location,
+           "conflicting settings; XNACK is forced off but Unified "
+           "Shared Memory is on");
+      xnack_required = 0;
+    }

Remarks:

* IMHO, 'on' is better than 'enabled', but still it might be easier
   to understand with 'required' or 'requested'.

OK, "required" it is.

* Warning vs. error: One can argue whether one or the other is better.
   If USM is not supported, host fallback is used. Thus, permitting the
   compilation is fine - even if it is not usable. (And "just" warning
   about it.) On the other hand, an error is also fine - as the user
   can still make it compile with -foffload=disabled (or only nvptx-none)
   or by switching to a USM-supporting -march=...

* * *

+++ b/libgomp/testsuite/lib/libgomp.exp
+# return 1 if OpenMP Unified Shared Memory is supported

I have two small issues with it: (A) It is always supported
by only using the host (e.g. with no device present or only
devices for which GCC does not handle the requirement).
(B) The check does not look at actual capabilities but at the
requirement check.

Thus, I think you should address (A), e.g. by appending:
"by non-host devices" or "by offload devices".

Done.

Maybe even adding 'requirement' on top as in:

# return 1 if the OpenMP requirement Unified Shared Memory
# is supported by at least one non-host device

This is a feature test, not a requirement at this stage, so I like it better without.

+proc check_effective_target_omp_usm { } {
+    if { [check_effective_target_offload_device_nvptx] } {
+    return 1
+    }
+
+    if { [libgomp_check_effective_target_offload_target "amdgcn"] } {
+    if [check_runtime_nocache usm_available_ {
+        #include <omp.h>
+        #pragma omp requires unified_shared_memory
+        int main ()
+        {
+          int a;
+          #pragma omp target map(from: a)
+        a = omp_is_initial_device ();
+          return a;
+        }
+    } ] {
+      return 1
+    }
+    }
+
+    return 0
+}

You have to do the same check for Nvidia devices. Contrary to
managed-memory support, not all Nvidia devices support USM.

Fixed.

* * *

--- a/libgomp/testsuite/libgomp.c++/target-std__array-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__array-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
  #pragma omp requires unified_shared_memory self_maps
  #define MEM_SHARED

If anyone wonders about the host fallback: The non '-usm' testcase
checks whether a stack variable is self mapped by default - if it
is (i.e. host fallback), it is compiled with '-DMEM_SHARED.
Otherwise, it assumes mapping happens.

Thus, there is no need to also run the '-usm' testcase with host
fallback. And if there is no automatic self mapping, but supported,
it makes sense to run it again - but that's what happens with the
added effective target check.

* * *

Thanks,

Tobias


v2 patch attached.

OK for mainline?

Andrew
From 68394e303ea71dcd3ba4b424c43445ebb1deaead Mon Sep 17 00:00:00 2001
From: Andrew Stubbs <[email protected]>
Date: Mon, 8 Dec 2025 16:18:59 +0000
Subject: [PATCH v2] amdgcn: Adjust failure mode for gfx908 USM

Unified Shared Memory does not appear to work well on gfx908, which is why we
disabled xnack by default.  For this reason it makes sense to inform the user
as compile time, but this is causing trouble in the testsuite which assumes
that USM only fails at runtime.

This patch changes the gfx908 compile time message to a warning only (in case
some other target does this differently), and prevents the tests from
attempting to run in host-fallback mode (given that that is not what they are
trying to test).  It also changes the existing warning to only fire once.

The patch assumes that effective target "omp_usm" also implies self-maps.

gcc/ChangeLog:

	* config/gcn/gcn.cc (gcn_init_cumulative_args): Only warn once.
	Use "required" instead of "enabled" in the warning.
	* config/gcn/mkoffload.cc (process_asm): Warn, don't error.
	Use "required" instead of "on" in the warning.

libgomp/ChangeLog:

	* testsuite/lib/libgomp.exp (check_effective_target_omp_usm): New.
	* testsuite/libgomp.c++/target-std__array-concurrent-usm.C: Require
	working Unified Shared Memory to run the test.
	* testsuite/libgomp.c++/target-std__bitset-concurrent-usm.C: Likewise.
	* testsuite/libgomp.c++/target-std__deque-concurrent-usm.C: Likewise.
	* testsuite/libgomp.c++/target-std__forward_list-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__list-concurrent-usm.C: Likewise.
	* testsuite/libgomp.c++/target-std__map-concurrent-usm.C: Likewise.
	* testsuite/libgomp.c++/target-std__multimap-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__multiset-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__set-concurrent-usm.C: Likewise.
	* testsuite/libgomp.c++/target-std__span-concurrent-usm.C: Likewise.
	* testsuite/libgomp.c++/target-std__valarray-concurrent-usm.C:
	Likewise.
	* testsuite/libgomp.c++/target-std__vector-concurrent-usm.C: Likewise.
	* testsuite/libgomp.c-c++-common/target-implicit-map-4.c: Likewise.
	* testsuite/libgomp.c-c++-common/target-link-3.c: Likewise.
	* testsuite/libgomp.c-c++-common/target-link-4.c: Likewise.
	* testsuite/libgomp.fortran/self_maps.f90: Likewise.
---
 gcc/config/gcn/gcn.cc                         |  9 +++++---
 gcc/config/gcn/mkoffload.cc                   |  9 +++++---
 libgomp/testsuite/lib/libgomp.exp             | 23 +++++++++++++++++++
 .../target-std__array-concurrent-usm.C        |  1 +
 .../target-std__bitset-concurrent-usm.C       |  1 +
 .../target-std__deque-concurrent-usm.C        |  1 +
 .../target-std__forward_list-concurrent-usm.C |  1 +
 .../target-std__list-concurrent-usm.C         |  1 +
 .../target-std__map-concurrent-usm.C          |  1 +
 .../target-std__multimap-concurrent-usm.C     |  1 +
 .../target-std__multiset-concurrent-usm.C     |  1 +
 .../target-std__set-concurrent-usm.C          |  1 +
 .../target-std__span-concurrent-usm.C         |  1 +
 .../target-std__valarray-concurrent-usm.C     |  1 +
 .../target-std__vector-concurrent-usm.C       |  1 +
 .../target-implicit-map-4.c                   |  1 +
 .../libgomp.c-c++-common/target-link-3.c      |  1 +
 .../libgomp.c-c++-common/target-link-4.c      |  1 +
 .../testsuite/libgomp.fortran/self_maps.f90   |  1 +
 19 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index a729ea4de36..54abf8c1a74 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -2940,14 +2940,17 @@ gcn_init_cumulative_args (CUMULATIVE_ARGS *cum /* Argument info to init */ ,
   if (!caller && cfun->machine->normal_function)
     gcn_detect_incoming_pointer_arg (fndecl);
 
-  if ((omp_requires_mask & (OMP_REQUIRES_UNIFIED_SHARED_MEMORY
-			    | OMP_REQUIRES_SELF_MAPS))
+  static bool warned_xnack = 0;
+  if (!warned_xnack
+      && (omp_requires_mask & (OMP_REQUIRES_UNIFIED_SHARED_MEMORY
+			       | OMP_REQUIRES_SELF_MAPS))
       && gcn_devices[gcn_arch].xnack_default != HSACO_ATTR_UNSUPPORTED
       && flag_xnack == HSACO_ATTR_OFF)
     {
       warning_at (UNKNOWN_LOCATION, 0,
-		  "Unified Shared Memory is enabled, but XNACK is disabled");
+		  "Unified Shared Memory is required, but XNACK is disabled");
       inform (UNKNOWN_LOCATION, "Try -foffload-options=-mxnack=any");
+      warned_xnack = 1;
     }
 
   reinit_regs ();
diff --git a/gcc/config/gcn/mkoffload.cc b/gcc/config/gcn/mkoffload.cc
index d9d89c64f95..ac6aae52adb 100644
--- a/gcc/config/gcn/mkoffload.cc
+++ b/gcc/config/gcn/mkoffload.cc
@@ -627,9 +627,12 @@ process_asm (FILE *in, FILE *out, FILE *cfile, uint32_t omp_requires)
 		      || TEST_XNACK_ON (elf_flags)
 		      || xnack_required);
   if (TEST_XNACK_OFF (elf_flags) && xnack_required)
-    fatal_error (input_location,
-		 "conflicting settings; XNACK is forced off but Unified "
-		 "Shared Memory is on");
+    {
+      warning (input_location,
+	       "conflicting settings; XNACK is forced off but Unified "
+	       "Shared Memory is required");
+      xnack_required = 0;
+    }
 
   /* Start generating the C code.  */
   if (gcn_stack_size)
diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp
index 076b775560f..cce2e93f857 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -725,6 +725,29 @@ int main() {
 } } "-lhipblas" ]
 }
 
+# return 1 if OpenMP Unified Shared Memory is supported by offload devices
+
+proc check_effective_target_omp_usm { } {
+    if { [check_effective_target_offload_device_nvptx] 
+         || [check_effective_target_offload_target_amdgcn] } {
+	if [check_runtime usm_available_ {
+	    #include <omp.h>
+	    #pragma omp requires unified_shared_memory
+	    int main ()
+	    {
+	      int a;
+	      #pragma omp target map(from: a)
+		a = omp_is_initial_device ();
+	      return a;
+	    }
+	} ] {
+	  return 1
+	}
+    }
+
+    return 0
+}
+
 # return 1 if OpenMP Device Managed Memory is supported
 
 proc check_effective_target_omp_managedmem { } {
diff --git a/libgomp/testsuite/libgomp.c++/target-std__array-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__array-concurrent-usm.C
index 9923783bcb1..aa36f7109e9 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__array-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__array-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory self_maps
 
 #define MEM_SHARED
diff --git a/libgomp/testsuite/libgomp.c++/target-std__bitset-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__bitset-concurrent-usm.C
index 9023ef85c55..d08ea710971 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__bitset-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__bitset-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory self_maps
 
 #define MEM_SHARED
diff --git a/libgomp/testsuite/libgomp.c++/target-std__deque-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__deque-concurrent-usm.C
index 863a1de7687..b30ade4f086 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__deque-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__deque-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory self_maps
 
 #define MEM_SHARED
diff --git a/libgomp/testsuite/libgomp.c++/target-std__forward_list-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__forward_list-concurrent-usm.C
index 60d5cee5ef3..65004b25510 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__forward_list-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__forward_list-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory self_maps
 
 #define MEM_SHARED
diff --git a/libgomp/testsuite/libgomp.c++/target-std__list-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__list-concurrent-usm.C
index 5057bf96fef..3cdd44db427 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__list-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__list-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory self_maps
 
 #define MEM_SHARED
diff --git a/libgomp/testsuite/libgomp.c++/target-std__map-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__map-concurrent-usm.C
index fe37426b8c8..b7d3dd822a7 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__map-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__map-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory self_maps
 
 #define MEM_SHARED
diff --git a/libgomp/testsuite/libgomp.c++/target-std__multimap-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__multimap-concurrent-usm.C
index 79f9245117f..f243790a638 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__multimap-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__multimap-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory self_maps
 
 #define MEM_SHARED
diff --git a/libgomp/testsuite/libgomp.c++/target-std__multiset-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__multiset-concurrent-usm.C
index 2d8075663bd..d869e8937ff 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__multiset-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__multiset-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory self_maps
 
 #define MEM_SHARED
diff --git a/libgomp/testsuite/libgomp.c++/target-std__set-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__set-concurrent-usm.C
index 54f62e3e84b..5fbf91b2e07 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__set-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__set-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory self_maps
 
 #define MEM_SHARED
diff --git a/libgomp/testsuite/libgomp.c++/target-std__span-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__span-concurrent-usm.C
index 7ef16bfb574..09f98790b3c 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__span-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__span-concurrent-usm.C
@@ -1,4 +1,5 @@
 // { dg-additional-options "-std=c++20" }
+/* { dg-require-effective-target omp_usm } */
 
 #pragma omp requires unified_shared_memory self_maps
 
diff --git a/libgomp/testsuite/libgomp.c++/target-std__valarray-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__valarray-concurrent-usm.C
index 41ec80ee900..828b67c3930 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__valarray-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__valarray-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory self_maps
 
 #define MEM_SHARED
diff --git a/libgomp/testsuite/libgomp.c++/target-std__vector-concurrent-usm.C b/libgomp/testsuite/libgomp.c++/target-std__vector-concurrent-usm.C
index 967bff3b81a..835f6d5287e 100644
--- a/libgomp/testsuite/libgomp.c++/target-std__vector-concurrent-usm.C
+++ b/libgomp/testsuite/libgomp.c++/target-std__vector-concurrent-usm.C
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory self_maps
 
 #define MEM_SHARED
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-implicit-map-4.c b/libgomp/testsuite/libgomp.c-c++-common/target-implicit-map-4.c
index d0b0cd178c0..97bb97abccf 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/target-implicit-map-4.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-implicit-map-4.c
@@ -4,6 +4,7 @@
    and for not mapping the stack variables 'A' and 'B' (not mapped
    but accessible -> USM makes this tested feature even more important.)  */
 
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory
 
 /* Ensure that defaultmap(default : pointer) uses correct OpenMP 5.2
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-link-3.c b/libgomp/testsuite/libgomp.c-c++-common/target-link-3.c
index c707b38b7d4..96642353d4a 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/target-link-3.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-link-3.c
@@ -3,6 +3,7 @@
 #include <stdint.h>
 #include <omp.h>
 
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires unified_shared_memory
 
 int A[3] = {-3,-4,-5};
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-link-4.c b/libgomp/testsuite/libgomp.c-c++-common/target-link-4.c
index 785055e216d..009c521a996 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/target-link-4.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-link-4.c
@@ -3,6 +3,7 @@
 #include <stdint.h>
 #include <omp.h>
 
+/* { dg-require-effective-target omp_usm } */
 #pragma omp requires self_maps
 
 int A[3] = {-3,-4,-5};
diff --git a/libgomp/testsuite/libgomp.fortran/self_maps.f90 b/libgomp/testsuite/libgomp.fortran/self_maps.f90
index 208fd1c71d5..60889687c2c 100644
--- a/libgomp/testsuite/libgomp.fortran/self_maps.f90
+++ b/libgomp/testsuite/libgomp.fortran/self_maps.f90
@@ -1,4 +1,5 @@
 ! Basic test whether self_maps work
+! { dg-require-effective-target omp_usm }
 
 module m
   !$omp requires self_maps
-- 
2.51.0

Reply via email to