================
@@ -9102,6 +9102,15 @@ bool InitializationSequence::Diagnose(Sema &S,
 
   case FK_ConversionFailed: {
     QualType FromType = OnlyArg->getType();
+    // __amdgpu_feature_predicate_t can be explicitly cast to the logical op
+    // type, although this is almost always an error and we advise against it
----------------
AaronBallman wrote:

> > > The __has_builtin counter-example actually does not work and cannot work, 
> > > please see: https://gcc.godbolt.org/z/7G5Y1d85b.
> > 
> > 
> > I cannot imagine a situation in which this isn't indicative of a bug, but 
> > perhaps this situation is the same one that necessitated [this 
> > PR](https://github.com/llvm/llvm-project/pull/126324#issuecomment-2706655366)
> >  which eventually concluded that we should change the behavior of 
> > `__has_builtin` rather than introduce a new builtin.
> 
> This is not actually a bug, it's intended behaviour. To obtain what you 
> expect the `b` would have to be `constexpr`, and then the `if` itself would 
> have to be `if constexpr`. Otherwise there's no binding commitment to 
> evaluate this at compile time (and, in effect, if this gets trivially 
> evaluated / removed in the FE, it induces dependence on optimisation level).

I... am an idiot. :-D Sorry, I think I must have been braindead when I wrote 
that because you're exactly correct. Sorry for the noise!
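
To spell out the distinction for anyone following along (a minimal sketch, not 
code from this patch; the names are made up):

```c++
// Whether the builtin exists at all is a preprocessor-time fact, so it can
// seed a constexpr flag.
#if __has_builtin(__builtin_amdgcn_processor_is)
constexpr bool b = true;
#else
constexpr bool b = false;
#endif

void f() {
  if constexpr (b) {
    // When `b` is false this arm is discarded during translation and is not
    // emitted, so nothing in it needs to survive to the backend.
  }

  // A plain `if (b)` only promises a branch on `b`; removing the untaken arm
  // before ISEL is an optimisation, not something the language guarantees.
}
```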

> > Backing up a step.. my expectation is that this eventually lowers down to a 
> > test and jump which jumps past the target code if the test fails. e.g.,
> > ```
> >   %0 = load i8, ptr %b, align 1
> >   %loadedv = trunc i8 %0 to i1
> >   br i1 %loadedv, label %if.then, label %if.end
> > 
> > if.then:
> >   # the target-specific instructions live here
> >   br label %if.end
> > 
> > if.end:
> >   ret void
> > ```
> >   
> > So we'd be generating instructions for the target which may be invalid if 
> > the test lies. If something did change that value so it no longer 
> > represents the predicate, I think that's UB (and we could help users catch 
> > that UB via a sanitizer check if we wanted to, rather than try to make the 
> > backend have to try to figure it out at compile time).
> 
> This cannot work reliably (e.g. there are instructions that would simply fail 
> at ISEL, and a run time jump doesn't mean that you do not lower to ISA the 
> jumped around block), and introducing dependence on sanitizers seems not 
> ideal. Furthermore, a run time jump isn't free, which is a concern for us, 
> and we also already have a mechanism for that case 
> (`__attribute__((target))`). Note that these can also control e.g. resource 
> allocation, so actually generating both might lead to arbitrary exhaustion of 
> a limited resource, and spurious compilation failures, consider e.g. (I'll 
> use CUDA/HIP syntax):
> 
> ```c++
> // This is a bit odd, and technically a race because multiple lanes write
> // to shared_buf
> void foo() {
>   __shared__ int* shared_buf;
>   if (__builtin_amdgcn_processor_is("gfx950")) {
>     __shared__ int buf[70 * 1024];
>     shared_buf = buf;
>   } else {
>     __shared__ int buf[60 * 1024];
>     shared_buf = buf;
>   }
> 
>   __syncthreads();
>   // use shared_buf
> }
> ```
> 
> If we tried to lower that we'd exhaust LDS, and spuriously fail to compile. 
> This would have originated from perfectly valid uses of `#if 
> defined(__gfx950__) #else`. We'd like these to work, so we must unambiguously 
> do the fold ourselves.

Okay, so the situation is different than what I expected. I was unaware this 
would cause ISEL failures.
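
For anyone skimming, the `#if defined(__gfx950__)` pattern referenced above 
looks roughly like the following (a sketch in the same CUDA/HIP syntax, not 
code from the PR); only one of the two buffers ever exists in the object code, 
so LDS is not exhausted:

```c++
void foo() {
#if defined(__gfx950__)
  __shared__ int buf[70 * 1024];
#else
  __shared__ int buf[60 * 1024];
#endif
  int* shared_buf = buf;

  __syncthreads();
  // use shared_buf
}
```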

> > > if for a chain from point of cast to final point of use folding fails 
> > > (because you passed your value to an
> > > opaque function, modified it based on a run time value etc.), you get an 
> > > error and a diagnostic.
> > 
> > 
> > I was thinking you would not get a diagnostic; you'd get the behavior you 
> > asked for, which may be utter nonsense.
> 
> One of the difficulties here (ignoring that the utter nonsense behaviour at 
> run time might be nasal demons - GPUs aren't always as polite as to issue a 
> `SIGILL` and graciously die:)) is that not all constructs / IR sequences / 
> ASM uses lower into ISA, so what the user is more likely to get is an ICE 
> with an error that makes no sense unless they work on LLVM. That's fairly 
> grim user experience, IMHO, and one that we have the ability to prevent.

Yeah, we obviously don't want the user experience to be compiler crashes. :-) 

> > Am I missing something still? If so, maybe it would be quicker for us to 
> > hop in a telecon call? I'm going to be out of the office until Monday, but 
> > I'm happy to meet with you if that's more productive.
> 
> I would be absolutely happy to if you think it'd help. I regret not coming to 
> the Sofia meeting, we could've probably sorted this out directly with a 
> laptop:)

FWIW, I'm still pretty uncomfortable about this design. I keep coming back to 
the feeling that this is really novel and seems designed to work around 
backend issues. If the user did something like this:
```c++
#include <vector>

void func(std::vector<bool> processor_features) {
  if (processor_features[12]) { // SSE3 is allowed
    __asm__ ("do a bunch of sse3 stuff");
  } else {
    // Do slow fallback stuff
  }
}
```
they would reasonably expect that inline assembly to be non-problematic even if 
SSE3 isn't available. But... when I try to play silly games in practice, we 
assert: https://godbolt.org/z/xc13WhW4W, so maybe I'm just wrong. CC @nikic 
for more opinions.
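
For contrast, the existing `__attribute__((target))` mechanism mentioned above 
handles this shape of code by keeping the SSE3 body in a function compiled with 
the right subtarget features and leaving only the dispatch at run time (a 
sketch, not code from the patch; `has_sse3` stands in for whatever run-time 
detection the caller does):

```c++
// The SSE3-specific body lives in a function whose subtarget features are
// declared to the compiler, so a generic function never has to select SSE3
// instructions directly.
__attribute__((target("sse3")))
void do_work_sse3() {
  // SSE3-specific implementation.
}

void do_work_fallback() {
  // Slow fallback stuff.
}

void do_work(bool has_sse3) {
  if (has_sse3)
    do_work_sse3();
  else
    do_work_fallback();
}
```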

As for meeting to discuss, are you free sometime this week? I'm on US East 
Coast time; what times typically work best for you?

https://github.com/llvm/llvm-project/pull/134016