Re: [PATCH] D18328: [CUDA] Add option to mark most functions inside as host+device.

Justin Lebar via cfe-commits Mon, 21 Mar 2016 15:49:13 -0700

jlebar added inline comments.

================
Comment at: include/clang/Driver/Options.td:383-384
@@ -382,2 +382,4 @@
   HelpText<"Enable device-side debug info generation. Disables ptxas 
optimizations.">;
+def cuda_allow_std_complex : Flag<["--"], "cuda-allow-std-complex">,
+  HelpText<"Allow CUDA device code to use definitions from <complex>, other 
than operator>> and operator<<.">;
 def cuda_path_EQ : Joined<["--"], "cuda-path=">, Group<i_Group>,
----------------
rsmith wrote:
> jlebar wrote:
> > jlebar wrote:
> > > rnk wrote:
> > > > jlebar wrote:
> > > > > tra wrote:
> > > > > > rsmith wrote:
> > > > > > > I don't think it's reasonable to have something this hacky / 
> > > > > > > arbitrary in the stable Clang driver interface.
> > > > > > What would be a better way to enable this 'feature'? I guess we 
> > > > > > could live with -Xclang -fcuda-allow-std-complex for now, but that 
> > > > > > does not seem to be particularly good way to give user control, 
> > > > > > either.
> > > > > > 
> > > > > > Perhaps we should have some sort of --cuda-enable-extension=foo 
> > > > > > option to control CUDA hacks.
> > > > > > I don't think it's reasonable to have something this hacky / 
> > > > > > arbitrary in the stable Clang driver interface.
> > > > > 
> > > > > This is an important feature for a lot of projects, including 
> > > > > tensorflow and eigen.  No matter how we define the flag, I suspect 
> > > > > people are going to use it en masse.  (Most projects I've seen pass 
> > > > > the equivalent flag to nvcc.)  At the point that many or even most 
> > > > > projects are relying on it, I'd suspect we'll have difficulty 
> > > > > changing this flag, regardless of whether or not it is officially 
> > > > > part of our stable API.
> > > > > 
> > > > > There's also the issue of discoverability.  nvcc actually gives a 
> > > > > nice error message when you try to use std::complex -- it seems 
> > > > > pretty unfriendly not to even list the relevant flag in clang --help.
> > > > > 
> > > > > I don't feel particularly strongly about this, though -- I'm more 
> > > > > concerned about getting something that works.
> > > > What if we had a catchall nvcc quirks mode flag similar to 
> > > > -fms-compatibility? We probably don't want a super fine grained LangOpt 
> > > > like this.
> > > An alternative wrt the flag is to enable it by default.  This would be 
> > > somewhat consistent with existing behavior, wherein we make most std math 
> > > functions available without a special flag, even though they're not 
> > > technically host-device.  The main difference here is that there we're 
> > > matching nvcc's default behavior, whereas here we're actually going 
> > > further than nvcc -- nvcc by default doesn't let you touch std::complex 
> > > from device code at all, and with a flag, you can touch its *constexpr* 
> > > functions.  Which is not actually very much.
> > > 
> > > Nonetheless, since the user-visible effect is consistent with our 
> > > approach of making std math stuff available, and since this shouldn't 
> > > make us reject code nvcc accepts, I'd be more OK hiding the flag to turn 
> > > it off.
> > > What if we had a catchall nvcc quirks mode flag similar to 
> > > -fms-compatibility?
> > 
> > I think we midair'ed on this.  See above comment about turning this flag on 
> > by default -- calling this "nvcc compat" wouldn't quite be right.  We could 
> > certainly have a broader flag, but I'm not sure at the moment what else 
> > would reasonably go in with this one.
> I'd find either of these suggestions (-fnvcc-compatibility or a cc1-only flag 
> to turn this behaviour off) more palatable than the current approach.
> 
> I'd also be a lot happier about this if we can view it as a short-term 
> workaround, with the longer-term fix being to get the host/device attributes 
> added to standard library implementations (even if it turns out we can never 
> actually remove this workaround in practice). If we can legitimately claim 
> that this is the way that CUDA is intended to work, and the missing 
> attributes in <complex> are a bug in that header (in CUDA mode), then that 
> provides a solid justification for having this complexity in Clang.
> If we can legitimately claim that this is the way that CUDA is intended to 
> work, and the missing attributes in <complex> are a bug in that header (in 
> CUDA mode), then that provides a solid justification for having this 
> complexity in Clang.


I think that

  1) the number of people passing --relaxed-constexpr to nvcc just so they can 
use a limited subset of std::complex, and
  
  2) the fact that we're already doing this for (basically all) other std math 
functions

may be decent arguments for this.  But I don't know if I'm a great judge of 
what we can legitimately claim here.

================
Comment at: lib/Sema/SemaCUDA.cpp:464-465
@@ +463,4 @@
+// <complex> without passing -fcuda-allow-std-complex.
+// TODO: Output a nvcc-compat warning if you try to use a non-constexpr 
function
+// from <complex> -- nvcc only lets you use constexpr functions.
+bool Sema::declShouldBeCUDAHostDevice(const FunctionDecl &FD) {
----------------
rsmith wrote:
> Does nvcc do this "`constexpr` implies `__host__ __device__`" thing only for 
> functions declared within <complex>, or for all functions?
> 
> Another alternative strategy: a wrapper `<complex>` header that does this:
> 
>     #include // ... union of includes from libc++ and libstdc++ <complex>
>     #define constexpr __host__ __device__ constexpr
>     #include_next <complex>
>     #undef constexpr
> Does nvcc do this "constexpr implies __host__ __device__" thing only for 
> functions declared within <complex>, or for all functions?

All functions.  Although std::complex is the main use I've observed.

> Another alternative strategy: a wrapper <complex> header that does this:

That one is quite clever, although I'm not sure about enumerating all of the 
includes from the headers.  I guess that should be reasonably stable...

I think I would like to get full complex support, though, if we can agree on a 
path towards that.  The current limitation is silly, it seems clear that people 
want this, and the constexpr thing gives you but a shadow of the actual library.

================
Comment at: lib/Sema/SemaCUDA.cpp:479-481
@@ +478,5 @@
+    return false;
+  StringRef Filename = FE->getName();
+  if (Filename != "complex" && !Filename.endswith("/complex"))
+    return false;
+
----------------
rsmith wrote:
> I don't think this works: the standard library might factor parts of 
> <complex> out into separate header files. For instance, libstdc++ 4.4 
> includes the TR1 pieces of <complex> in that way.
Hm, that is unfortunate.  One option would be to say that we just don't support 
this.  Otherwise we have to go down the road of identifying all the relevant 
functions...


http://reviews.llvm.org/D18328



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Re: [PATCH] D18328: [CUDA] Add option to mark most functions inside as host+device.

Reply via email to