jlebar wrote:

I think I'm with Art on this one.

>> Problem #2 [...] The arch=native will create a working configuration, but 
>> would build more than necessary.
>
> It will target the first GPU it finds. We could maybe change the behavior to 
> detect the newest, but the idea is just to target the user's system.

OK, but I think this is worse.

Now it's basically always incorrect to ship a build system which uses 
arch=native, because the people running the build might very reasonably have 
multiple GPUs in their system, and which GPU clang picks is unspecified.

But we all know people are going to do it anyway.

Given that this feature cannot correctly be used with a build system, and given 
that 99.99% of invocations of clang are from a build system that the user 
running the build did not write, it seems to me that we should not add a 
feature that is such a footgun when used with a build system.

(A non-CUDA C++ file compiled with march=native will almost surely run on your 
computer, whereas a CUDA file compiled with arch=native may not, and whether 
it does is unpredictable because it depends on the order in which the nvidia 
driver returns GPUs.  So there's no good analogy here.)

If we were going to add this, I think we should compile for all the GPUs in 
your system, like Art had assumed.  I think that's better, but it has other 
problems.  Builds get slower, and your graphics GPU is likely less powerful 
than your compute GPU, so compilation is going to fail because you're e.g. 
using tensor cores and compiling for a GPU that doesn't have them.  So again 
you can't really use arch=native in a build system, even if you say "requires 
an sm80 GPU", because really the requirement is "has an sm80 GPU and no others 
in the machine".
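
To make the failure modes concrete, here is a rough sketch of the selection 
policies discussed in this thread.  It is a toy model, not clang's 
implementation: the function names, the policy strings, and the idea that a 
build "needs" a minimum sm version are all assumptions made up for 
illustration.

```python
# Toy model of how arch=native could choose targets. Hypothetical names;
# this is not how clang is implemented.

def parse_sm(arch):
    """Turn a string like 'sm_80' into the integer 80."""
    return int(arch[3:])


def pick_targets(detected, policy):
    """Choose which architectures to compile for.

    detected: GPU architectures in the order the driver reported them.
    policy:   'first', 'newest', or 'all'.
    """
    if not detected:
        raise ValueError("no GPUs detected")
    if policy == "first":
        # Depends entirely on enumeration order.
        return [detected[0]]
    if policy == "newest":
        return [max(detected, key=parse_sm)]
    if policy == "all":
        # One target per distinct architecture, oldest first.
        return sorted(set(detected), key=parse_sm)
    raise ValueError("unknown policy: " + policy)


def build_ok(targets, min_required):
    """Assume the build fails if any target lacks a required feature,
    modeled here as 'target is older than min_required'."""
    return all(parse_sm(t) >= min_required for t in targets)
```

With a machine that has an sm_61 display GPU and an sm_80 compute GPU, and 
code that needs sm_80 features, the "first" policy builds or fails depending 
only on which GPU is enumerated first, and the "all" policy fails regardless 
of order because sm_61 is always among the targets.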

https://github.com/llvm/llvm-project/pull/79373
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
