[PATCH] D155850: [Clang][CodeGen][RFC] Add codegen support for C++ Parallel Algorithm Offload

Alex Voicu via Phabricator via cfe-commits Thu, 20 Jul 2023 14:09:04 -0700

AlexVlx added inline comments.

================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:5542
+  if (!getLangOpts().HIPStdPar)
+    ErrorUnsupported(E, "builtin function");

----------------
efriedma wrote:
> This doesn't make sense; we can't just ignore bits of the source code.  I 
> guess this is related to "the decision on their validity is deferred", but I 
> don't see how you expect this to work.
This is one of the weirder parts, so let's consider the following example:

```cpp
void foo() { __builtin_ia32_pause(); }
void bar() { __builtin_trap(); }

void baz(const vector<int>& v) {
    return for_each(par_unseq, cbegin(v), cend(v), [](auto&& x) { if (x == 42) 
bar(); });
}
```

In the case above, what we'd offload to the accelerator, and ask the target BE 
to lower, is the implementation of `for_each`, and `bar`, because it is 
reachable from the latter. `foo` is not reachable by any execution path on the 
accelerator side, however it includes a builtin that is unsupported by the 
accelerator (unless said accelerator is x86, which is not impossible, but not 
something we're dealing with at the moment). If we were to actually error out 
early, in the FE, in these cases, there's almost no appeal to what is being 
proposed, because standard headers, as well as other libraries, are littered 
with various target specific builtins that are not going to be supported. This 
all builds on the core invariant of this model / extension / thingamabob, which 
is that the algorithms, and only the algorithms, are targets for offload. It 
thus follows that as long as code that is reachable from an algorithm's 
implementation is safe, all is fine, but we cannot know this in the FE / on an 
AST level, because we need the actual CFG. This part is handled in LLVM in the 
`SelectAcceleratorCodePass` that's in the last patch in this series.

Now, you will quite correctly observe that there's nothing preventing an user 
from calling `foo` in the callable they pass to an algorithm; they might read 
the docs / appreciate that this won't work, but even there they are not safe, 
because there via some opaque call chain they might end up touching some 
unsupported builtin. My intuition here, which is reflected above in letting 
builtins just flow through, is that such cases are better served with a compile 
time error, which is what will obtain once the target BE chokes trying to lower 
an unsupported builtin. It's not going to be a beautiful error, and we could 
probably prettify it somewhat if we were to check after we've done the 
accelerator code selection pass, but it will happen at compile time. Another 
solution would be to emit these as traps (poison + trap for value returning 
ones), but I am concerned that it would lead to really fascinating debug 
journeys.

Having said this, if there's a better way to deal with these scenarios, it 
would be rather nice. Similarly, if the above doesn't make sense, please let me 
know.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155850/new/

https://reviews.llvm.org/D155850

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D155850: [Clang][CodeGen][RFC] Add codegen support for C++ Parallel Algorithm Offload

Reply via email to