tra added inline comments.
================
Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167
+    llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> VersionFile =
+        FS.getBufferForFile(BinPath + "/.hipVersion");
+    if (!VersionFile)
----------------
arsenm wrote:
> tra wrote:
> > arsenm wrote:
> > > yaxunl wrote:
> > > > yaxunl wrote:
> > > > > yaxunl wrote:
> > > > > > tra wrote:
> > > > > > > arsenm wrote:
> > > > > > > > yaxunl wrote:
> > > > > > > > > arsenm wrote:
> > > > > > > > > > arsenm wrote:
> > > > > > > > > > > yaxunl wrote:
> > > > > > > > > > > > arsenm wrote:
> > > > > > > > > > > > > I don't think the detection should fail if this is 
> > > > > > > > > > > > > missing
> > > > > > > > > > > > why? this file is unique to HIP installation. If it is 
> > > > > > > > > > > > missing, the installation is broken.
> > > > > > > > > > > Because I should be able to use this without a complete 
> > > > > > > > > > > hip installation. Without a specified version, it should 
> > > > > > > > > > > assume the most modern layout. This will for example 
> > > > > > > > > > > break pointing --rocm-path at the device library build 
> > > > > > > > > > > directory for library tests
> > > > > > > > > > I also don't see what value checking the version really 
> > > > > > > > > > provides; it may be informative to print it, but I don't 
> > > > > > > > > > think it's useful to derive information from it
> > > > > > > > > what is the directory structure of your most modern layout?
> > > > > > > > /opt/rocm/amdgcn/bitcode/foo.bc
> > > > > > > > 
> > > > > > > > The plan is to remove this and rely on symlinks in the resource 
> > > > > > > > directory though
> > > > > > > > I also don't see what value checking the version really 
> > > > > > > > provides; it may be informative to print it, but I don't think 
> > > > > > > > it's useful to derive information from it
> > > > > > > 
> > > > > > > In CUDA it's used to detect which back-end features to enable 
> > > > > > > (they depend on what's supported by ptxas supplied by that CUDA 
> > > > > > > version). I don't think that would be relevant to AMDGPU as it 
> > > > > > > does not need external dependencies to generate GPU code.
> > > > > > > 
> > > > > > > It may be useful for filesystem layout detection. E.g. where to 
> > > > > > > find bitcode files and what names to expect. This part seems to 
> > > > > > > be relevant for ROCm, but I'm not sure if the layout is closely 
> > > > > > > tied to the version.
> > > > > > > 
> > > > > > We are required to support previous ROCm releases for certain time 
> > > > > > range. To do that we need to detect HIP version and enable/disable 
> > > > > > certain HIP features based on HIP version when necessary.
> > > > > > 
> > > > > > Therefore if we have a new directory structure for ROCm, that 
> > > > > > directory structure should contain a HIP version file so that we 
> > > > > > can detect HIP version.
> > > > > > 
> > > > > > 
> > > > > Currently clang includes some wrapper headers by default, which does 
> > > > > not work with ROCm 3.5 since those device functions are defined in 
> > > > > HIP headers of ROCm 3.5. To support ROCm 3.5, we have to disable 
> > > > > including those wrapper headers. This is one example why we need 
> > > > > detect HIP version.
> > > > I think we need to separate HIP runtime detection and device library 
> > > > detection and also separate hasValidHIPRuntime and 
> > > > hasValidDeviceLibrary. ROCm toolchain only need to make sure 
> > > > hasValidDeviceLibrary but HIP toolchain need both hasValidDeviceLibrary 
> > > > and hasValidHIPRuntime.
> > > Regardless of whether there's a version file or if does anything, I think 
> > > the absence of one implies do the most modern thing
> > "The most modern thing" is an ever moving target. If the failure modes keep 
> > changing it will create a source of new problems every time something 
> > changes. Probably not a big issue in this case, but with (eventually) wide 
> > enough user base there will be some.
> > 
> > I assume that AMD does not have much of legacy user base yet for 
> > compilations with clang, so defaulting to a recent version may be sensible 
> > (unlike CUDA where clang had to deal with a lot of existing CUDA users 
> > using old CUDA versions). That said, I'd recommend pinning it to a 
> > **specific** version, so the expectations remain stable. Then we can 
> > document the failure/fallback in a way users can deal with -- `if no 
> > version file is found, clang assumes version x.y and expects to find files 
> > X, Y and Z in directory A, B, C. You can specify the locations manually 
> > with --something-something-X=A or specify the version with 
> > --hip-version=3.7`. 
> > 
> > Considering that clang is being used from various derived tools, you may 
> > not always have the luxury of having the whole ROCm installation around and 
> > need to be able to override the expectations via command line flags.
> > 
> > If we have a way to override the version, then it does not matter all that 
> > much which version we pick as a fallback. If the guess is wrong, we can 
> > always correct it with a flag.
> Unlike CUDA we're not really chasing some 3rd party thing that periodically 
> drops unpredictable changes. We know what the most modern thing is on tip of 
> tree
> Unlike CUDA we're not really chasing some 3rd party thing that periodically 
> drops unpredictable changes. We know what the most modern thing is on tip of 
> tree
We're in agreement on this part. If lack of the version file is non-fatal, I'm 
fine with whatever version AMD picks as the default.

I'm just pointing out that you'll likely need a way to keep compiler working 
even if the filesystem does not have everything you expect for the full 
end-to-end compilation, be it the version file or bitcode. Reasonable default 
is fine. Overrides can be implemented later, if necessary. 



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82930/new/

https://reviews.llvm.org/D82930



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to