tra added inline comments.
================ Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167 + llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> VersionFile = + FS.getBufferForFile(BinPath + "/.hipVersion"); + if (!VersionFile) ---------------- arsenm wrote: > tra wrote: > > arsenm wrote: > > > yaxunl wrote: > > > > yaxunl wrote: > > > > > yaxunl wrote: > > > > > > tra wrote: > > > > > > > arsenm wrote: > > > > > > > > yaxunl wrote: > > > > > > > > > arsenm wrote: > > > > > > > > > > arsenm wrote: > > > > > > > > > > > yaxunl wrote: > > > > > > > > > > > > arsenm wrote: > > > > > > > > > > > > > I don't think the detection should fail if this is > > > > > > > > > > > > > missing > > > > > > > > > > > > why? this file is unique to HIP installation. If it is > > > > > > > > > > > > missing, the installation is broken. > > > > > > > > > > > Because I should be able to use this without a complete > > > > > > > > > > > hip installation. Without a specified version, it should > > > > > > > > > > > assume the most modern layout. This will for example > > > > > > > > > > > break pointing --rocm-path at the device library build > > > > > > > > > > > directory for library tests > > > > > > > > > > I also don't see what value checking the version really > > > > > > > > > > provides; it may be informative to print it, but I don't > > > > > > > > > > think it's useful to derive information from it > > > > > > > > > what is the directory structure of your most modern layout? > > > > > > > > /opt/rocm/amdgcn/bitcode/foo.bc > > > > > > > > > > > > > > > > The plan is to remove this and rely on symlinks in the resource > > > > > > > > directory though > > > > > > > > I also don't see what value checking the version really > > > > > > > > provides; it may be informative to print it, but I don't think > > > > > > > > it's useful to derive information from it > > > > > > > > > > > > > > In CUDA it's used to detect which back-end features to enable > > > > > > > (they depend on what's supported by ptxas supplied by that CUDA > > > > > > > version). I don't think that would be relevant to AMDGPU as it > > > > > > > does not need external dependencies to generate GPU code. > > > > > > > > > > > > > > It may be useful for filesystem layout detection. E.g. where to > > > > > > > find bitcode files and what names to expect. This part seems to > > > > > > > be relevant for ROCm, but I'm not sure if the layout is closely > > > > > > > tied to the version. > > > > > > > > > > > > > We are required to support previous ROCm releases for certain time > > > > > > range. To do that we need to detect HIP version and enable/disable > > > > > > certain HIP features based on HIP version when necessary. > > > > > > > > > > > > Therefore if we have a new directory structure for ROCm, that > > > > > > directory structure should contain a HIP version file so that we > > > > > > can detect HIP version. > > > > > > > > > > > > > > > > > Currently clang includes some wrapper headers by default, which does > > > > > not work with ROCm 3.5 since those device functions are defined in > > > > > HIP headers of ROCm 3.5. To support ROCm 3.5, we have to disable > > > > > including those wrapper headers. This is one example why we need > > > > > detect HIP version. > > > > I think we need to separate HIP runtime detection and device library > > > > detection and also separate hasValidHIPRuntime and > > > > hasValidDeviceLibrary. ROCm toolchain only need to make sure > > > > hasValidDeviceLibrary but HIP toolchain need both hasValidDeviceLibrary > > > > and hasValidHIPRuntime. > > > Regardless of whether there's a version file or if does anything, I think > > > the absence of one implies do the most modern thing > > "The most modern thing" is an ever moving target. If the failure modes keep > > changing it will create a source of new problems every time something > > changes. Probably not a big issue in this case, but with (eventually) wide > > enough user base there will be some. > > > > I assume that AMD does not have much of legacy user base yet for > > compilations with clang, so defaulting to a recent version may be sensible > > (unlike CUDA where clang had to deal with a lot of existing CUDA users > > using old CUDA versions). That said, I'd recommend pinning it to a > > **specific** version, so the expectations remain stable. Then we can > > document the failure/fallback in a way users can deal with -- `if no > > version file is found, clang assumes version x.y and expects to find files > > X, Y and Z in directory A, B, C. You can specify the locations manually > > with --something-something-X=A or specify the version with > > --hip-version=3.7`. > > > > Considering that clang is being used from various derived tools, you may > > not always have the luxury of having the whole ROCm installation around and > > need to be able to override the expectations via command line flags. > > > > If we have a way to override the version, then it does not matter all that > > much which version we pick as a fallback. If the guess is wrong, we can > > always correct it with a flag. > Unlike CUDA we're not really chasing some 3rd party thing that periodically > drops unpredictable changes. We know what the most modern thing is on tip of > tree > Unlike CUDA we're not really chasing some 3rd party thing that periodically > drops unpredictable changes. We know what the most modern thing is on tip of > tree We're in agreement on this part. If lack of the version file is non-fatal, I'm fine with whatever version AMD picks as the default. I'm just pointing out that you'll likely need a way to keep compiler working even if the filesystem does not have everything you expect for the full end-to-end compilation, be it the version file or bitcode. Reasonable default is fine. Overrides can be implemented later, if necessary. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82930/new/ https://reviews.llvm.org/D82930 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits