Andrew Stubbs wrote:
Otherwise, this patch seems fine (I have not reviewed the new magic numbers and settings.)

As Andrew mentioned via chat, we also have to update the 'amdhsa.version'.

Well, that's what the attached patch does. (I have no idea which tool / library relies on it, but it makes sense to use the right value.)

OK for mainline? (*)

Tobias

(*) Lightly tested with offloading against the ROCm 6.0.2 and 6.3.2 runtimes. However, ROCm accepted 1.0 for V4 as well, and as generic does not work, the 1.2 used for V6 could not really be tested. Still, looking at the ROCR code and while debugging, I did not spot any issue with it. In fact, the string 'amdhsa' does not appear in ROCR at all.

PS: If you wonder where V6 is set: that's a few lines up in the .awk file.
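
For reference, the mapping from code object version to 'amdhsa.version' that the commit message below describes can be sketched as a small standalone C helper. This is only an illustration: the names amdhsa_minor_version and print_amdhsa_version are hypothetical and do not exist in GCC; the patch itself only distinguishes V4 from V6 via generic_version.

```c
#include <stdio.h>

/* Hypothetical helper (not part of GCC): return the minor number of
   'amdhsa.version' for a given code object version, following the
   mapping V3 = 1.0, V4 = 1.1, V5/V6 = 1.2.  Returns -1 for versions
   this sketch does not know about.  */
static int
amdhsa_minor_version (int code_object_version)
{
  switch (code_object_version)
    {
    case 3:
      return 0;
    case 4:
      return 1;
    case 5:
    case 6:
      return 2;
    default:
      return -1;
    }
}

/* Hypothetical helper: print the YAML fragment for 'amdhsa.version'.
   As in gcn.cc, the output is indented with spaces only, since tabs
   are not allowed in the embedded YAML.  */
static void
print_amdhsa_version (FILE *file, int code_object_version)
{
  fprintf (file,
           "        amdhsa.version:\n"
           "          - 1\n"
           "          - %d\n",
           amdhsa_minor_version (code_object_version));
}
```

Calling print_amdhsa_version (stdout, 6) would print the same three metadata lines that the patched gcn.cc emits for a -generic target.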
[gcn] Fix the output amdhsa.version

The amdhsa.version depends on the code object version; while V3 had 1.0,
V4 has 1.1 and V5 (and V6) have 1.2. GCC emitted 1.0 although it has for
a while generated either V4 or, with -march=gfx...-generic, V6. Now it
uses the proper version again.

gcc/ChangeLog:

	* config/gcn/gcn.cc (gcn_hsa_declare_function_name): Update
	'amdhsa.version' output to match used code version.
	* config/gcn/gen-gcn-device-macros.awk: Add a comment to
	crosslink.

 gcc/config/gcn/gcn.cc                    | 17 +++++++++++------
 gcc/config/gcn/gen-gcn-device-macros.awk |  4 +++-
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 82fc6ff1e41..b0c06d5e632 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -6668,18 +6668,23 @@ gcn_hsa_declare_function_name (FILE *file, const char *name,
     fprintf (file,
 	     "\t  .amdhsa_tg_split\t0\n");
   fputs ("\t.end_amdhsa_kernel\n", file);
 
 #if 1
   /* The following is YAML embedded in assembler; tabs are not allowed.  */
-  fputs ("        .amdgpu_metadata\n"
-	 "        amdhsa.version:\n"
-	 "          - 1\n"
-	 "          - 0\n"
-	 "        amdhsa.kernels:\n"
-	 "          - .name: ", file);
+
+  /* 'amdhsa.version': code object V3 = 1.0, V4 = 1.1, V5/V6 = 1.2.  */
+  /* Keep in sync with 'amdhsa-code-object' in gen-gcn-device-macros.awk.  */
+  fprintf (file,
+	   "        .amdgpu_metadata\n"
+	   "        amdhsa.version:\n"
+	   "          - 1\n"
+	   "          - %d\n"
+	   "        amdhsa.kernels:\n"
+	   "          - .name: ",
+	   gcn_devices[gcn_arch].generic_version ? 2 /* V6 */ : 1 /* V4 */);
   assemble_name (file, name);
   fputs ("\n            .symbol: ", file);
   assemble_name (file, name);
   fprintf (file,
 	   ".kd\n"
 	   "            .kernarg_segment_size: %i\n"
diff --git a/gcc/config/gcn/gen-gcn-device-macros.awk b/gcc/config/gcn/gen-gcn-device-macros.awk
index aa271004c27..d227e6fcedf 100644
--- a/gcc/config/gcn/gen-gcn-device-macros.awk
+++ b/gcc/config/gcn/gen-gcn-device-macros.awk
@@ -114,13 +114,15 @@ BEGIN {
 # ABI Version: In principle, the LLVM default would work. However,
 # however, when debugging symbols are turned on, mkoffload.cc
 # writes a new AMD GPU object file and the ABI version needs to be the
 # same. - LLVM <= 17 defaults to 4 while LLVM >= 18 defaults to 5.
 # GCC supports LLVM >= 13.0.1 and only LLVM >= 14 supports version 5.
 # Code object V6 is supported since LLVM 19.
-
+#
+# Keep in sync with 'amdhsa.version' in gcn.cc
+#
 END {
   print ""
   print ""
   printf "#define ABI_VERSION_OPT \"%%{\"%s \"!march=*|march=*:--amdhsa-code-object-version=4} \"\n", generic_list
   printf "#define XNACKOPT \"%%{\"%s \":%%eexpected march\\n} \"\n", gensub (/OPT/, "XNACK", "g", list)
   printf "#define SRAMOPT \"%%{\"%s \":%%eexpected march\\n} \"\n", gensub (/OPT/, "SRAM", "g", list)
