AlexVlx wrote:

> > Definitely, more than happy to have a 1-on-1 (2-on-1 even, since I think
> > @AaronBallman also suggested something along these lines as well :) ).
>
> Please email me with some times that will work for you.
>
> > We've just made the call to foo() illegal on anything that is not gfx9000
>
> I... don't think I'm suggesting this? The fact that a call to foo() from a
> __builtin_amdgcn_processor_is block shouldn't imply anything about other
> calls to foo().

Perhaps I am misunderstanding, in which case I apologise. I started from: "We can tell, statically, that the first call is correctly guarded by an if statement: it's guaranteed it will never run on a non-gfx9000 processor. The second call, on the other hand, is not. So we can add a frontend rule: **the first call is legal, the second is not**." I'm saying we cannot really infer anything about the legality of a naked call to a builtin either, at this point, because the builtin might be available on many processors / processors other than gfx9000. We can develop the argument to say "well, fine, what we actually meant here is `is_invocable`, rather than `processor_is`, and then things work out", but the corollary to that appears to be that if you ever use the predicate on a builtin, you must touch every other use of that builtin within at least the same function, and relate it to the predicate evaluation.

> What I'm basically suggesting is just exposing SPIR-V specialization
> constants as a C construct. Your example SPIR-V was something like:
>
> ```
> %cmp = OpIEqual %bool %runtime_known_hw_id %hw_id_that_supports_feature
> if (%cmp = true) {
>   /* some feature */
> } else {
>   /* other feature */
> }
> ```
>
> We want to come up with a corresponding C construct that's guaranteed to
> compile to valid SPIR-V. My suggestion is something like:
>
> ```
> if (__runtime_known_hw_id_eq("hw_id_that_supports_feature")) {
>   /* some feature */
> }
> ```

I'm confused as to what is different from what this PR does, which does generate valid SPIRV / LLVM IR. Perhaps there is an underlying assumption that there is some construct that makes the otherwise dead block still contain valid code, and there really isn't.
There's an example I provided above where what is guarded is (static) finite resource allocation, not just the use of an intrinsic; we'd not know in the FE which is correct, and we cannot allocate both until we know the target at JIT / finalisation time (so before executing the code), and we cannot generate executable code with both allocation requests live, as the finite resource gets exhausted. So the only place where we can meaningfully deal with this is in the ME / over IR, before hitting the BE. We should be careful to avoid focusing on the `processor_is` / `hw_id` aspect, as this leads to brittle code that has to constantly grow additional identity checks via `||` disjunction.

> In the body of the if statement, you can use whatever intrinsics are legal on
> hw_id_that_supports_feature.
>
> > we're just sliding in immediately after Clang, before optimisation
>
> Isn't doing checks immediately after IR generation basically the same as
> checking the AST, just on a slightly different representation?

Not in this case.
There are at least two aspects that make a difference:

- linking in bitcode, which can allow more extensive analysis than what you can do per TU in the AST - this is minor, however please note the conversation above about having to be conservative around external symbols, and the risks of leaving them around;
- lack of information when generating the AST, when dealing with abstract targets like SPIRV (more specifically, AMDGCN flavoured SPIRV, for the purposes of this PR)
    - the FE targets `amdgcnspirv`, which is generic across all concrete AMDGPU targets (union of features);
    - the predicates proposed here offer customisation points for which the resolution is deferred to the point where the target is known;
    - we only know the concrete target when we are finalising, which happens at a completely different time-point, on possibly a different machine;
    - we cannot time-travel to inform the AST about this, but we can compose generic IR with target IR, and lower it as target IR (this is already how various flavours of device / offload libs work, so it's hardly novel).

None of the above matters for concrete targets, where we just resolve everything in the AST already, because we have full information in the FE.

https://github.com/llvm/llvm-project/pull/134016
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits