jlebar added a comment.

Here are two other approaches we considered and rejected, for the record:

1. Copy-paste a <complex> implementation from e.g. libc++ into 
__clang_cuda_runtime_wrapper.h, and edit it appropriately.  Then #define the 
real <complex>'s include guards.

  The main problem with this is the obvious one: we're copying a big chunk of 
the standard library into the compiler, where it doesn't belong, and now we 
have two divergent copies of this code to maintain.  In addition, we can't 
necessarily use libc++'s implementation, since we need to support pre-C++11 
and, AIUI, libc++ does not.



2. Provide `__device__` overloads for all the functions defined in <complex>.  
This almost works, except that we do not (currently) have a way to let you 
inject new overloads for member functions into classes we don't own.  E.g. we 
can add a `__device__` overload of `std::real(const complex<T>&)`, just like we 
could overload `std::real` in any other way, but we can't add a new 
`__device__` overload of `std::complex<T>::real()`.

  This approach also has a similar problem to (1), which is that we'd end up 
copy/pasting almost all of <complex> into the compiler.
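
  The free-function versus member-function asymmetry can be sketched in plain 
C++.  This is only an analogy: `lib::Complex` is a hypothetical library type, 
and the `device_tag` parameter stands in for the `__device__` attribute, which 
is not expressible in standard C++.

```cpp
// A hypothetical third-party library that we do not own.
namespace lib {
struct Complex {
  float re;
  float im;
  float real() const { return re; }  // member: fixed by the class definition
};
inline float real(const Complex& z) { return z.real(); }  // free function
}  // namespace lib

// Later, in code that does not own lib::Complex. Namespaces are open, so a
// new free-function overload can be added next to the existing one.
namespace lib {
struct device_tag {};  // stand-in for the __device__ target (assumption)
inline float real(const Complex& z, device_tag) { return z.re; }
}  // namespace lib

// Classes are closed: there is no way to add a member overload from outside.
// The following would be rejected, because no such member was declared
// inside the class:
//
//   float lib::Complex::real(lib::device_tag) const { return re; }
```

  With this setup, `lib::real(z)` and `lib::real(z, lib::device_tag{})` both 
work, but any caller that writes `z.real()` is stuck with the one member the 
library declared.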


================
Comment at: include/clang/Driver/Options.td:383-384
@@ -382,2 +382,4 @@
   HelpText<"Enable device-side debug info generation. Disables ptxas optimizations.">;
+def cuda_allow_std_complex : Flag<["--"], "cuda-allow-std-complex">,
+  HelpText<"Allow CUDA device code to use definitions from <complex>, other than operator>> and operator<<.">;
 def cuda_path_EQ : Joined<["--"], "cuda-path=">, Group<i_Group>,
----------------
tra wrote:
> rsmith wrote:
> > I don't think it's reasonable to have something this hacky / arbitrary in 
> > the stable Clang driver interface.
> What would be a better way to enable this 'feature'? I guess we could live 
> with -Xclang -fcuda-allow-std-complex for now, but that does not seem to be 
> a particularly good way to give users control, either.
> 
> Perhaps we should have some sort of --cuda-enable-extension=foo option to 
> control CUDA hacks.
> I don't think it's reasonable to have something this hacky / arbitrary in the 
> stable Clang driver interface.

This is an important feature for a lot of projects, including TensorFlow and 
Eigen.  No matter how we define the flag, I suspect people are going to use it 
en masse.  (Most projects I've seen pass the equivalent flag to nvcc.)  Once 
many or even most projects rely on it, I suspect we'll have difficulty changing 
this flag, regardless of whether it is officially part of our stable API.

There's also the issue of discoverability.  nvcc actually gives a nice error 
message when you try to use std::complex -- it seems pretty unfriendly not to 
even list the relevant flag in clang --help.

I don't feel particularly strongly about this, though -- I'm more concerned 
about getting something that works.


http://reviews.llvm.org/D18328


