AlexVlx wrote:

> On a different point: I don't think this builtin is actually semantically 
> different from `__builtin_cpu_is`. As long as we're not treating it as 
> `constexpr`, the fact that it's lowered by the compiler and doesn't need a 
> runtime check is just a happy property of GPU targeting rather than a 
> fundamental difference. You could certainly imagine targets that _do_ simply 
> do this with a runtime switch. And the behavior of allowing additional 
> builtin to be used within the guarded block seems like a nice feature that 
> other targets would probably like to take advantage of.
> 
> We could allow `__builtin_processor_is` as an alternative name for that 
> builtin if folks feel weird about having "cpu" in the name for a GPU target.

Initially there was no `processor_is` interface; instead, `__builtin_cpu_is` 
gained the ability to be resolved statically in the FE in certain cases, and 
so to generate no run-time code. There was strong opposition from some of my 
colleagues (some of whom are on this thread), who argued that the semantics of 
`__builtin_cpu_is` mandate the existence of a run-time check. The "cpu" bit 
wasn't really a problem :)

If you / other Clang owners are happy with extending `__builtin_cpu_is`, I 
would personally prefer that, since I believe it can be beneficial for targets 
other than ours / GPUs in general. For example, even on x86 there's a 
difference between e.g. `x86_64-v2` and `znver5`. A check for it could be 
resolved in the FE, which would remove the need to do a cpuid check at run 
time and then go via a function call rather than direct inline code.
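
To make the x86 point concrete, here is a rough sketch of the same idea done by hand with the preprocessor. It is not what the PR implements. It assumes GCC or Clang on x86, uses SSE4.2 (one of the features in `x86-64-v2`) as a stand-in for a full processor check, and the names `CheckKind`, `sse42_check_kind` and `has_sse42` are made up for illustration.

```cpp
// Hypothetical illustration: how the SSE4.2 question gets answered in this TU.
enum class CheckKind { CompileTime, RunTime, Unsupported };

// Reports which of the three paths below the compiler selected.
inline CheckKind sse42_check_kind() {
#if defined(__SSE4_2__)
  return CheckKind::CompileTime;
#elif (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
  return CheckKind::RunTime;
#else
  return CheckKind::Unsupported;
#endif
}

// True if SSE4.2 can be used on the machine running this code.
inline bool has_sse42() {
#if defined(__SSE4_2__)
  // The -march / -m flags already guarantee SSE4.2, so the answer is known
  // statically and no cpuid-based check is emitted.
  return true;
#elif (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
  // Baseline build: fall back to the existing run-time builtin.
  __builtin_cpu_init();
  return __builtin_cpu_supports("sse4.2") != 0;
#else
  // Not x86, or a compiler without these builtins.
  return false;
#endif
}
```

Built with something like `-march=x86-64-v2`, `has_sse42()` folds to `true`; built for baseline `x86-64`, it goes through the run-time check. A statically resolvable `__builtin_cpu_is` would let the FE make that choice itself, without the macro dance.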

https://github.com/llvm/llvm-project/pull/134016
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits