On Mon, Jan 20, 2020 at 10:36:31AM +0000, Andrew Stubbs wrote:
> The HSA/ROCm runtime rejects binaries not built for the exact device
> present.
> 
> In practice, binaries built by GCC for GCN3 "fiji" devices would probably
> run on any of the devices we currently support, if only the driver would
> load it. It would not be optimal, but AFAIK the subset of the ISA we
> actually use is compatible.
> 
> However, in theory, the meta-data isn't quite the same (for one thing, GCN3
> allocates registers in increments of 8, where GCN5 uses 16) so some programs
> may misbehave. I have suggested to the ROCm folks that possibly this could
> be fixed in the drivers, but that's not been done yet. I suppose that
> running sub-optimal code on an accelerator device is not a priority.
> 
> It would be possible for us to a) ensure we always use fully portable
> instruction and meta-data encodings in the compiler, and b) patch the ELF
> flags in libgomp at load time, to achieve some degree of
> forward-portability. We have not attempted this to date. Sooner or later AMD
> would change something in such a way that it couldn't be worked around.

:(
Another option would be to build the GCN offloading code multiple times, once
for each incompatible ISA the user asks for, so that the resulting binaries
work on different hardware.
With the distro vendor hat on, it is hard to guess which devices users will
have if we were to enable GCN offloading and ship binaries with offloading
support in the distro.  For PTX it is easy: we can just emit sm_30, the
lowest supported ISA (and the default anyway), and be done with it.
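
To make the multi-build idea concrete, here is a rough sketch of how a loader
could pick among several per-ISA images at run time.  Everything in it is
hypothetical: the struct, the function name and the ISA strings are
illustrative only and do not describe the existing libgomp plugin interface.
It uses an exact name match because the runtime is said to reject anything
not built for the device present.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one offload image built for a single ISA.  */
struct offload_image
{
  const char *isa;   /* ISA name the image was built for, e.g. "fiji".  */
  const void *data;  /* Start of the code object.  */
  size_t size;       /* Size of the code object in bytes.  */
};

/* Return the image whose ISA name matches DEVICE_ISA exactly, or NULL if
   none was built for that device.  No "compatible" fallback is attempted.  */
static const struct offload_image *
select_image (const struct offload_image *images, size_t n,
              const char *device_isa)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (images[i].isa, device_isa) == 0)
      return &images[i];
  return NULL;
}
```

A NULL result would mean falling back to host execution, which is the case a
distro build wants to make as rare as possible.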

        Jakub
