@@ -0,0 +1,5 @@
+/// Some target-specific options are ignored for GPU, so %clang exits with code 0.
+// DEFINE: %{check} = %clang -### -c -mcmodel=medium
Artem-B wrote:
> Also, what exactly are we checking here? With `-###` CC1 sub-compilations do
> not run and
@@ -0,0 +1,5 @@
+/// Some target-specific options are ignored for GPU, so %clang exits with code 0.
+// DEFINE: %{check} = %clang -### -c -mcmodel=medium
Artem-B wrote:
In this particular case, the changes we test (and the error messages) were
originating in th
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/79222
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
Artem-B wrote:
> This is what we already do for `--offload-arch=native` on CUDA, but this is
> somewhat tangential. I've updated this patch to present the warning in the
> case of multiple GPUs being detected, so I don't think there's a concern here
> with the user being confused. If they have
Artem-B wrote:
> I think the semantics of native on other architectures are clear enough here.
I don't think we have the same idea about that. Let's spell it out, so there's
no confusion.
[GCC manual](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-march-16) says:
> Using -march=na
Artem-B wrote:
> This method of compilation is not like CUDA, so we can't target all the GPUs
> at the same time.
I think this is the key fact I was missing. If the patch is only for a
standalone compilation which does not do multi-GPU compilation in principle,
then your approach makes sense.
https://github.com/Artem-B approved this pull request.
LGTM, as we can only handle a single GPU target during compilation.
https://github.com/llvm/llvm-project/pull/79373
Artem-B wrote:
Found another issue. We merge four independent byte loads with `align 1` into a
32-bit load, which fails at runtime on misaligned pointers.
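For readers less familiar with the hazard, here is a minimal host-side sketch of why the merge is only legal when the pointer is known to be 4-byte aligned. It is an analogy in plain C++, not the LLVM transform itself, and the helper names `is_aligned` and `load4_bytewise` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if p is aligned to `align` bytes
// (`align` must be a power of two).
static bool is_aligned(const void *p, std::uintptr_t align) {
  return (reinterpret_cast<std::uintptr_t>(p) & (align - 1)) == 0;
}

// Four independent byte loads. Each one only needs `align 1`, so this is
// valid for any pointer. Bytes are combined in little-endian order.
static std::uint32_t load4_bytewise(const unsigned char *p) {
  return (std::uint32_t(p[0]) << 0) | (std::uint32_t(p[1]) << 8) |
         (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Replacing the four loads above with a single 32-bit load is only safe
// when this predicate holds. On targets that trap on misaligned accesses,
// the merged load faults whenever it returns false.
static bool can_merge_to_u32_load(const unsigned char *p) {
  return is_aligned(p, 4);
}
```

With a 4-byte-aligned buffer `buf`, `can_merge_to_u32_load(buf)` holds but `can_merge_to_u32_load(buf + 1)` does not, even though `load4_bytewise(buf + 1)` is still well defined. The reproducer below shows the same situation in IR.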
```
%t0 = type { [17 x i8] }
@shared_storage = linkonce_odr local_unnamed_addr addrspace(3) global %t0 undef, align 1
define <4 x i8> @i