[PATCH] D45061: [NVPTX, CUDA] Use custom feature detection to handle NVPTX target builtins.

Artem Belevich via Phabricator via cfe-commits Tue, 03 Apr 2018 10:14:05 -0700

tra added a comment.

In https://reviews.llvm.org/D45061#1053795, @echristo wrote:


> Let's talk about the rest of it more. I'm not sure I'm seeing the need here 
> rather than the annotations that are already here. Can you elaborate more 
> here on why we need an additional method when you've already got subtarget 
> features for each of the ptx versions anyhow?


The patch intends to address two issues:

- A mismatch in constraint semantics between LLVM and clang. E.g. hasPTX60() on 
the LLVM side means "ptx60 or newer", while "ptx60" in TARGET_BUILTIN on the 
clang side means "ptx60" *only*. It is possible to address this within the 
existing implementation by enumerating all PTX versions that are newer. That 
works OK for ptx60, as we only need to write "ptx60|ptx61". It gets more 
interesting for older GPUs, e.g. "ptx31+" would have to become 
"ptx31|ptx32|ptx40|ptx41|ptx42|ptx43|ptx50|ptx60|ptx61". A similar enumeration 
will need to happen for GPU versions, which brings us to the next point.
- NVIDIA keeps adding PTX versions and GPU variants. Recently they've changed 
the CUDA release frequency to ~1/quarter, and they tend to add minor variants 
of PTX and GPU versions fairly frequently. It's going to be an unnecessary 
maintenance headache, as after every newly introduced variant I'll need to go 
and update all NVPTX builtins that have nothing to do with the CUDA changes. 
Granted, it's not a showstopper, but it is an annoyance that is guaranteed to 
stay.

With this patch, TARGET_BUILTIN constraints become semantically identical to 
LLVM's, and we no longer need to chase every bump in PTX version or GPU variant.
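
To make the difference concrete, here is a minimal standalone sketch of the two semantics. It is not the code in the patch: the names exactMatch, atLeast, enumerateFrom and KnownPTXVersions are hypothetical, and the version list is just the enumeration quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical: the PTX versions from the enumeration above. This list is
// what would have to grow with every new PTX release.
static const std::vector<int> KnownPTXVersions = {31, 32, 40, 41, 42,
                                                  43, 50, 60, 61};

// Exact-match semantics: the builtin is available only if the active PTX
// feature is literally listed in a '|'-separated constraint string.
bool exactMatch(const std::string &Constraint, int ActivePTX) {
  const std::string Feature = "ptx" + std::to_string(ActivePTX);
  std::size_t Start = 0;
  while (Start <= Constraint.size()) {
    std::size_t End = Constraint.find('|', Start);
    if (End == std::string::npos)
      End = Constraint.size();
    if (Constraint.compare(Start, End - Start, Feature) == 0)
      return true;
    Start = End + 1;
  }
  return false;
}

// "This version or newer" semantics: a single comparison, no enumeration.
bool atLeast(int MinPTX, int ActivePTX) { return ActivePTX >= MinPTX; }

// Builds the enumeration that exact matching needs in order to mean
// "MinPTX or newer".
std::string enumerateFrom(int MinPTX) {
  std::string Result;
  for (int V : KnownPTXVersions) {
    if (V < MinPTX)
      continue;
    if (!Result.empty())
      Result += '|';
    Result += "ptx" + std::to_string(V);
  }
  return Result;
}

// Small usage example for the helpers above.
void checkExamples() {
  assert(!exactMatch("ptx60", 61)); // "ptx60" alone rejects ptx61
  assert(atLeast(60, 61));          // "ptx60 or newer" accepts it
  assert(enumerateFrom(60) == "ptx60|ptx61");
  assert(exactMatch(enumerateFrom(31), 50));
}
```

In this sketch, a PTX version missing from KnownPTXVersions is rejected by every enumerated constraint until the list and the constraints are regenerated, while atLeast() keeps working unchanged.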


https://reviews.llvm.org/D45061



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
