On Fri, Mar 15, 2024 at 5:36 PM Andrew Stubbs <a...@baylibre.com> wrote: > > On 15/03/2024 13:56, Tobias Burnus wrote: > > Hi Andrew, > > > > Andrew Stubbs wrote: > >> This is more-or-less what I was planning to do myself, but as I want > >> to include all the other features that get parametrized in gcn.cc, > >> gcn.h, gcn-hsa.h, gcn-opts.h, I hadn't got around to it yet. > >> Unfortunately, I think the gcn.opt and config.gcc will always need > >> manually updating, but if that's all it'll be an improvement. > > > > Well, for .opt see how nvptx does it – it actually generates an .opt file. > > > >> I don't like the idea of including AMDGPU_ISA_UNSUPPORTED; > > > > I concur – I was initially thinking of reporting the device name > > ("Unsupported %s") but I then realized that the agent returns a string > > while only for GCC generated files (→ eflag) the hexcode is used. Thus, > > I ended up not using it. > > > >> Ultimately, I want to replace many of the conditionals like > >> "TARGET_CDNA2_PLUS" from the code and replace them with feature flags > >> derived from a def file, or at least a header file. We've acquired too > >> many places where there are unsearchable conditionals that need > >> finding and fixing every time a new device comes along. > > I was thinking of having more flags, but those where the only ones > > required for the two files. > >> I had imagined that this .def file would exist in gcc/config/gcn, but > >> you've placed it in libgomp.... maybe it makes sense to have multiple > >> such files if they contain very different data, but I had imagined one > >> file and I'm not sure that the compiler definitions live in libgomp. > > > > There is already: > > > > gcc/config/darwin-c.cc:#include "../../libcpp/internal.h" > > > > gcc/config/gcn/gcn-run.cc:#include > > "../../../libgomp/config/gcn/libgomp-gcn.h" > > > > gcc/fortran/cpp.cc:#include "../../libcpp/internal.h" > > > > gcc/fortran/trigd_fe.inc:#include "../../libgfortran/intrinsics/trigd.inc" > > > > But there is also the reverse: > > > > libcpp/lex.cc:#include "../gcc/config/i386/cpuid.h" > > > > libgfortran/libgfortran.h:#include "../gcc/fortran/libgfortran.h" > > > > lto-plugin/lto-plugin.c:#include "../gcc/lto/common.h" > > > > If you add more items, it is probably better to have it under > > gcc/config/gcn/ - and I really prefer a single file for all. > > > > * * * > > > > Talking about feature sets: This would be a bit like LLVM (see below) > > but I think they have a bit too much indirections. But I do concur that > > we need to consolidate the current support – and hopefully make it > > easier to keep adding more GPU support; we seem to have already covered > > a larger chunk :-) > > > > I also did wonder whether we should support, e.g., running a gfx1100 > > code (or a gfx11-generic one) on, e.g., a gfx1103 device. Alternatively, > > we could keep the current check which requires an exact match. > > We didn't invent that restriction; the runtime won't let you do it. We > only have the check because the HSA/ROCr error message is not very > user-friendly.
Note that I heard/read somewhere that they plan to support a "blend" version that would allow running a kernel on any gfx11xx or gfx106x versions (supposedly at some runtime cost). Guess we'll need to watch the LLVM side of things here (or the ROCm runtime side of it). > > BTW: I do note that looking at the feature sets in LLVM that all GFX110x > > GPUs seem to have common silicon bugs: FeatureMSAALoadDstSelBug and > > FeatureMADIntraFwdBug, while 1100 and 1102 additionally have the > > FeatureUserSGPRInit16Bug but 1101 and 1103 don't. — For some reasons, > > FeatureISAVersion11_Generic only consists of two of those bugs (it > > doesn't have FeatureMADIntraFwdBug), which doesn't seem to be that > > consistent. Maybe the workaround has issues elsewhere? If so, a generic > > -march=gfx11 might be not as useful as one might hope for. > > > > * * * > > > > If I look at LLVM's > > https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/AMDGPU.td > > , > > > > they first define several features – like 'FeatureUnalignedScratchAccess'. > > > > Then they combine them like in: > > > > def FeatureISAVersion11_Common ... [FeatureGFX11, ... > > FeatureAtomicFaddRtnInsts ... > > > > And then they use those to map them to feature sets like: > > > > def FeatureISAVersion11_0_Common ... > > listconcat(FeatureISAVersion11_Common.Features, > > [FeatureMSAALoadDstSelBug ... > > > > And for gfx1103: > > > > def FeatureISAVersion11_0_3 : FeatureSet< > > !listconcat(FeatureISAVersion11_0_Common.Features, > > [])>; > > > > The mapping to gfx... names then happens in > > https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/GCNProcessors.td > > such as: > > > > def : ProcessorModel<"gfx1103", GFX11SpeedModel, > > FeatureISAVersion11_0_3.Features > > >; > > > > Or for the generic one, i.e.: > > > > // [gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151] > > def : ProcessorModel<"gfx11-generic", GFX11SpeedModel, > > FeatureISAVersion11_Generic.Features > > > > LLVM also has some generic flags like the following in > > https://github.com/llvm/llvm-project/blob/main/llvm/lib/TargetParser/TargetParser.cpp > > > > {{"gfx1013"}, {"gfx1013"}, GK_GFX1013, > > FEATURE_FAST_FMA_F32|FEATURE_FAST_DENORMAL_F32|FEATURE_WAVE32|FEATURE_XNACK|FEATURE_WGP}, > > > > I hope that this will give some inspiration – but I assume that at least > > the initial implementation will be much shorter. > > Yeah, we can have one macro for each arch, or multiple macros for > building different tables. First one seems easier but less readable, > second one will need some thinking about. Probably best to keep it > simple though. I'd say whatever gets us to smaller required changes to say introduce support for gfx1013 is welcome. Applies to the libgomp plugin side as well, of course. Richard. > Andrew