On 05/02/2025 11:14, Tobias Burnus wrote:
The number of AMD GPUs is huge - and, unfortunately, every GPU device
is potentially slightly different, requiring different code generation
either in some dusty corner case or for standard code.

As identical code can run on several GPUs (either all code or when
disabling some features), AMD introduced some gfx*-generic targets with
LLVM 19. GCC added support for gfx10-3-generic and gfx11-generic with
commit r15-4550-g1bdeebe69b71bf in October 2024 (undocumented).

GCC itself always supports all -march= targets, but an assembler
supporting the arch is required, such that at user runtime and when
building a multilib, an assembler (and linker) supporting the new
features is required. (GCC uses LLVM's assembler (llvm-mc) and linker
(lld), i.e. LLVM 19+ is required for gfx*-generic.)

However, the required runtime code landed in ROCm much later; namely,
commit 0c18ff22 "rocr: Generic ISA targets support" (Oct 28, 2024) in
https://github.com/ROCm/ROCR-Runtime

It is believed that the next ROCm release contained this feature, which
is ROCm 6.3, released on Dec 3, 2024. The latest ROCm is 6.3.2 of
Jan 28, 2025.

Still, adding gfx*-generic increases the number of required multilibs as
it does not seem to be possible to link mixed code of generic and
specific GPU code.

See https://llvm.org/docs/AMDGPUUsage.html#amdgpu-generic-processor-table
for a list of gfx*-generic and supported gfx* devices and some generic
restrictions due to using multilib.

While gfx11-generic and gfx10-3-generic include all GPUs of that
generation, with no or few restrictions, GFX9 devices are rather
different and, hence, gfx9-generic only covers a subset of the devices.

* * *

This patch now enables support for gfx10-3-generic, gfx11-generic and
(new!) gfx9-generic in libgomp, making it actually usable.

In libgomp, GCC prints its own diagnostic if there is an ISA mismatch
between the actual GPU and the compiled-for GPU. Hence, not only ROCm
but also GCC needs to know which GPUs are compatible - in order to
propose the -foffload-options=-march=gfx... to compile for.

That diagnostic now also proposes to try compiling for the specific
gfx*-generic besides compiling for the specific GPU. Reasoning: As the
number of multilibs is limited, having only a gfx11-generic multilib, it
makes sense to propose -march=gfx11-generic besides, e.g.,
-march=gfx1103, especially when the gfx1103 multilib is unavailable -
and vice versa.

In case GCC thinks that the ISA is supported but (a too old) ROCm does
not recognize it, the error is now inferior; however, some wording has
been added to the generic error message, which might still help.

As there are a couple of GPUs, previously unsupported, that are
supported by ROCm with the same gfx*-generic as GPUs we support, it
makes sense to add those GPUs as well - both to handle them in libgomp's
generic diagnostic and to support them in general.

Therefore, the following GPUs are now supported in addition: gfx902,
gfx904, gfx909, gfx1031, gfx1032, gfx1033, gfx1034, gfx1035, gfx1101,
gfx1102, gfx1150, gfx1151, gfx1152, and gfx1153.

However, the multilib config has not been touched, hence, those 14
device types and gfx{9,10-3,11}-generic are not supported by default.
Currently, the following 9 GPUs are enabled by default: gfx900, gfx906,
gfx908, gfx90a, gfx90c, gfx1030, gfx1036, gfx1100, and gfx1103.

I'm not too happy about adding a whole list of specific devices that we have not tested. So far, whenever I have added a new device there have been meta-data oddities and such-like that needed to be tweaked. Admittedly, adding a new device to an existing generation has been easier, but still there have been unexpected issues.

Adding the generic architectures does make sense, assuming we can test them, and seems like a much better way to support these devices, until somebody can add properly tested and tuned support for an individual device.

I also don't like adding knowledge of unsupported devices purely for improving diagnostics. It's fine for the known-unsupported devices, but wait a month or so and there will be new unknown-unsupported devices, and the message degrades again. Worse, the new diagnostic can recommend trying -march=<name> for devices which the compiler will recognize but have never been tested, and probably don't have multilibs configured.

A better approach might be to pattern-match "gfx{9,10,11}" in the name HSA gives you for the physical device and recommend generic -march=gfx{9,10,11}-generic in those cases?
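
A rough sketch of what such a prefix match could look like is given here. It is only an illustration under stated assumptions: the function name suggest_generic_march is hypothetical and not part of libgomp, the input is assumed to be the short device name (e.g. "gfx1103") rather than a full HSA ISA string, and "gfx103" is used as the prefix for gfx10-3-generic. Because gfx9-generic only covers a subset of GFX9 devices, a plain "gfx9" prefix match can over-suggest, so the result should be treated as a hint and not as a guarantee.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libgomp): map a short device name
   such as "gfx1103" to a generic -march= value, by prefix only.
   Returns NULL when no generic target is suggested.  */
const char *
suggest_generic_march (const char *name)
{
  /* Assumed prefix table; gfx9-generic covers only a subset of GFX9
     devices, so a "gfx9" match is merely a hint.  */
  static const struct
  {
    const char *prefix;
    const char *generic;
  } map[] = {
    { "gfx103", "gfx10-3-generic" },
    { "gfx11", "gfx11-generic" },
    { "gfx9", "gfx9-generic" },
  };

  if (name == NULL)
    return NULL;

  for (size_t i = 0; i < sizeof map / sizeof map[0]; i++)
    if (strncmp (name, map[i].prefix, strlen (map[i].prefix)) == 0)
      return map[i].generic;

  return NULL;
}
```

With this table, "gfx1036" would map to gfx10-3-generic and "gfx1010" would get no suggestion, since only the gfx103x family is matched for GFX10.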

* * *

For distros building with LLVM 19, I could imagine that adding the
gfx10-3-generic and gfx11-generic (and possibly gfx9-generic) multilibs
could make sense; whether gfx1030, gfx1036, gfx1100, and gfx1103 could
already be dropped - or only later (once ROCm 6.3 is more widely
deployed) - is a good question. [My gut feeling is that a distro should
wait until next year, given that December 2024 is still very recent.]

* * *

Thus back to the attached patch, which does:

* Add gfx9-generic - and enable libgomp support for gfx10-3-generic
* Add gfx902, gfx904, gfx909, gfx1031, gfx1032, gfx1033, gfx1034, gfx1035, gfx1101, gfx1102, gfx1150, gfx1151, gfx1152, and gfx1153.
* Update the install + invoke (-march=) documentation for it

The patch has loosely been tested - but I currently do not have a ROCm 6.3
available with a gfx*-generic supported device; hence, I don't know whether
it really works.


Thus, I would be happy if someone with a supported gfx{9,10-3,11}-generic
device - or a newly added non-generic gfx* could test whether it actually
works!


[I am about to get a ROCm 6.3.2 with a gfx906 device, possibly later also
for gfx900 and even later for gfx1100.]

Any comment, remark, suggestion?
OK for mainline, once someone has shown that any gfx*-generic actually works?

I'm happy to add the new gfx9-generic, and improving the diagnostics is always good, but I'm not convinced about making it look like we support devices we've never tested.

(Of course, if someone is able to test them, then that's different.)

Andrew
