[PATCH] D77670: [CUDA] Add partial support for recent CUDA versions.

Jonas Hahnfeld via Phabricator via cfe-commits Fri, 13 Aug 2021 13:38:56 -0700

Hahnfeld added a comment.

In D77670#2944192 <https://reviews.llvm.org/D77670#2944192>, @tra wrote:


> In D77670#2943753 <https://reviews.llvm.org/D77670#2943753>, @Hahnfeld wrote:
>
>> @tra The split between `LATEST` and `LATEST_SUPPORTED` leads to very weird 
>> warning and error messages:
>
> Agreed, it's far from ideal. There's also more than one issue involved.

Unfortunately, yes...

>> clang-14: warning: unknown CUDA version: cuda.h: CUDA_VERSION=11040.; 
>> assuming the latest supported version 10.1 [-Wunknown-cuda-version]
>
> The good news is that we've grown support for enough clang builtins and PTX 
> instructions to bump the "latest supported" to ~CUDA-11.3 or, maybe, even 
> 11.4. At least, clang should be able to compile all CUDA headers in those 
> versions.
> This should reduce the noise.

Great!

>> clang-14: error: cannot find libdevice for sm_20; provide path to different 
>> CUDA installation via '--cuda-path', or pass '-nocudalib' to build without 
>> linking with libdevice
>
> It's also time to bump the default GPU target to something that's supported 
> by the CUDA versions we reasonably expect to see. That should probably be 
> sm_35 as that's probably the oldest GPU platform that's still widely 
> available (e.g. there are tons of them on Google cloud and AWS) and is still 
> supported by all CUDA versions clang accepts.

+1 for at least `sm_35` - that would match recent `nvcc`s, right?

>> clang-14: error: GPU arch sm_20 is supported by CUDA versions between 7.0 
>> and 8.0 (inclusive), but installation at /usr/local/cuda-11.4 is 11.2; use 
>> '--cuda-path' to specify a different CUDA install, pass a different GPU arch 
>> with '--cuda-gpu-arch', or pass '--no-cuda-version-check'
>
> Perhaps it's time to start considering decommissioning sm_20 support in clang 
> and NVPTX. nvcc did that long ago and is already on the way to dropping 
> sm_3x, too. sm_30 is no longer supported and sm_35 has been deprecated and is 
> expected to be gone in the next CUDA release.

+1 - given that Clang 13.x just branched, now may be an ideal moment to make 
this cut.

>> Clang is mentioning three different CUDA versions here: 11.4 is what I 
>> really have installed, 11.2 is `LATEST` and therefore the one returned by 
>> `getCudaVersion` or as the "last resort" in `CudaInstallationDetector`, and 
>> the first warning says that Clang assumes the latest supported version 10.1. 
>> As a developer looking into the code, I get that the first warning is about 
>> saying that 10.1 is the latest fully supported version in terms of features, 
>> but I think this is really confusing to users. Do you see a chance to 
>> improve this? (other than adding just 11.3 and 11.4 to the enumerations 
>> where we'll always run behind)
>
> I'm open to suggestions. This was the least bad compromise I managed to come 
> up with.
>
> We could report the actually detected version, instead of the 'latest' 
> version clang knows about. Or not report it at all as it's not particularly 
> helpful for the end user. That would mitigate one source of confusion.
>
> As for the `latest supported`, I think we may still want to have it in some 
> form. Clang has to deal with version-specific CUDA quirks, so a CUDA version 
> outside of the range that clang is known to work with puts the user in 
> uncharted waters. E.g. until recently clang worked well enough with 
> CUDA-11.3, but only if you were compiling for the older GPUs. Attempts to 
> compile some headers for sm_80 would fail and that *was* confusing to users 
> who ran into that when the warning was disabled.

Yeah, the problem was that I didn't have better suggestions either when I wrote 
the first comment. But maybe now: How about having a "past-the-latest" value in 
the enum that Clang remembers if it detects a version more recent than it knows 
about? Then we could have two warnings:

- If we have a "past-the-latest" version, tell the user that Clang has no clue 
about this version and assumes the `LATEST` version; things might work, but 
there are no guarantees.
- If we have a known version that is greater than the latest supported version, 
emit the current warning and say that support is "best-effort" (or something 
along those lines). In that case, both the detected version and the "assumed" 
supported version should make sense to the user.
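
To make the idea a bit more concrete, here is a rough, standalone sketch. It is 
not the actual Clang code: the enum is a cut-down stand-in, and 
`NEWER_THAN_KNOWN`, `classifyCudaVersion` and `versionDiagnostic` are 
hypothetical names I made up for illustration. The message wording is only a 
placeholder, and versions older than 10.1 are lumped into `UNKNOWN` to keep it 
short.

```cpp
#include <string>

// Cut-down stand-in for the version enum; NOT the real clang definition.
enum class CudaVersion {
  UNKNOWN,
  CUDA_101,
  CUDA_102,
  CUDA_110,
  CUDA_111,
  CUDA_112,
  NEWER_THAN_KNOWN,            // hypothetical "past-the-latest" sentinel
  LATEST = CUDA_112,           // newest version the compiler knows about
  LATEST_SUPPORTED = CUDA_101, // newest version that is fully supported
};

// Map the CUDA_VERSION macro value from cuda.h (major * 1000 + minor * 10,
// e.g. 11040 for 11.4) to the enum. Hypothetical helper.
CudaVersion classifyCudaVersion(int Raw) {
  int Key = (Raw / 1000) * 10 + (Raw % 1000) / 10; // 11.2 -> 112
  switch (Key) {
  case 101: return CudaVersion::CUDA_101;
  case 102: return CudaVersion::CUDA_102;
  case 110: return CudaVersion::CUDA_110;
  case 111: return CudaVersion::CUDA_111;
  case 112: return CudaVersion::CUDA_112;
  default:
    // Anything past the newest known version gets the sentinel instead of
    // being silently folded into LATEST.
    return Key > 112 ? CudaVersion::NEWER_THAN_KNOWN : CudaVersion::UNKNOWN;
  }
}

// Return the warning text for a detected version, or "" if none is needed.
// Both messages report the version that was actually detected.
std::string versionDiagnostic(int Raw) {
  CudaVersion V = classifyCudaVersion(Raw);
  std::string Detected =
      std::to_string(Raw / 1000) + "." + std::to_string((Raw % 1000) / 10);
  if (V == CudaVersion::NEWER_THAN_KNOWN)
    return "CUDA version " + Detected +
           " is newer than any version known to this compiler; assuming "
           "11.2, things might work but there are no guarantees";
  if (V > CudaVersion::LATEST_SUPPORTED)
    return "CUDA version " + Detected +
           " is newer than the latest fully supported version 10.1; "
           "support is best-effort";
  return "";
}
```

With this split, an 11.4 installation would only ever see 11.4 and the assumed 
11.2 in the first message, and an 11.2 installation would see 11.2 and 10.1 in 
the second, so the user never gets three unrelated version numbers at once.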


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77670/new/

https://reviews.llvm.org/D77670

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
