AlexVlx wrote:

> On a different point: I don't think this builtin is actually semantically
> different from `__builtin_cpu_is`. As long as we're not treating it as
> `constexpr`, the fact that it's lowered by the compiler and doesn't need a
> runtime check is just a happy property of GPU targeting rather than a
> fundamental difference. You could certainly imagine targets that _do_ simply
> do this with a runtime switch. And the behavior of allowing additional
> builtin to be used within the guarded block seems like a nice feature that
> other targets would probably like to take advantage of.
>
> We could allow `__builtin_processor_is` as an alternative name for that
> builtin if folks feel weird about having "cpu" in the name for a GPU target.
The `processor_is` interface did not exist initially; instead, `__builtin_cpu_is` gained the ability to be statically resolved in the FE in certain cases, generating no run time code. There was strong opposition from some of my colleagues (some of whom are on this thread), who claimed that the semantics of `__builtin_cpu_is` mandate the existence of a run time check. The "cpu" bit wasn't really a problem :) If you / other Clang owners are happy with extending `__builtin_cpu_is`, personally I would prefer that, since I believe it can be beneficial for targets other than ours / GPUs in general. For example, even for x86 there's a difference between e.g. `x86_64-v2` and `znver5`, which could be resolved in the FE, removing the need to do a cpuid check at run time and then go via a function call rather than direct inline code.

https://github.com/llvm/llvm-project/pull/134016
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits