rjmccall added a comment.

It is totally unreasonable, at the time you are resolving a template argument, 
to investigate how the corresponding template parameter is used within the 
template and use that to shape how the argument is resolved.  That is simply 
not how the C++ template model works.  Given that CUDA doesn't distinguish 
between host and device functions in the type system, if you are going to have 
a rule here, it has to be based on, at most, (1) the current semantic context 
(which may not even be a function), (2) the template being specialized, and 
(3) the declarations in the template-argument set.

As I've said on a previous patch, I think the *best* rule would be to 
recognize a hard difference between host and device function types, probably 
by making function types default to host function types and requiring function 
pointers that can store device function pointers to be explicitly annotated.  
However, that would not be source-compatible with ordinary CUDA, which is 
presumably unacceptable.
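
To make that concrete, here is a rough sketch of the shape of that rule.  The 
names are made up for illustration, the snippet compiles as CUDA today, and the 
explicitly annotated device function pointer type the comments refer to is 
hypothetical syntax, not something Clang or NVCC accepts:

    __device__ void dev_fn(int) {}
    void host_fn(int) {}

    __device__ void device_caller() {
      // Today this pointer's type, void (*)(int), says nothing about where the
      // pointee runs.  Under the proposed rule, the unannotated type would mean
      // "host function", and storing dev_fn here would require a (hypothetical)
      // explicitly annotated device function pointer type instead.
      void (*fp)(int) = dev_fn;
      fp(0);
    }

    void host_caller() {
      void (*fp)(int) = host_fn;   // fine under either model: a host function
      fp(0);                       // stored in a host function pointer
    }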

The second-best rule would be to preserve compatibility by making an 
unannotated function type still be "unknown whether host or device", but to 
also allow the creation of explicitly host-only and device-only function 
types.  For source compatibility, DREs to functions would formally have the 
unknown function type.  Converting a pointer to an unknown function into a 
pointer to a host function would do some basic checking on the operand 
expression (essentially verifying that it's not obviously a device function), 
and resolving an overload set in the context of a host-only function pointer 
type would do the obvious filtering.
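
Roughly, and with the caveat that the explicitly host-only function pointer 
type below is hypothetical syntax, I would expect that to look something like 
this (the overloading of 'f' on host-ness relies on Clang's target-based 
overloading extension):

    __host__ void f(int) {}
    __device__ void f(int) {}

    void host_caller() {
      // For source compatibility, the DRE 'f' formally has the "unknown"
      // function type, so existing code like this keeps compiling:
      void (*unknown)(int) = f;

      // Hypothetical: an explicitly host-only pointer type.  Initializing it
      // from the overload set would do the obvious filtering (pick the host
      // 'f'), and converting 'unknown' into it would do the basic check that
      // the operand is not obviously a device function.
      // __host__ void (*host_only)(int) = f;
      unknown(0);
    }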

Otherwise, you're going to be stuck where you are right now, which is that 
you're messing around with heuristics because somebody added a language 
extension that isn't actually very well thought out.  But if that's what you 
have to do, it's what you have to do.  For this specific question, where you 
are trying to resolve an overloaded template argument, I think there are 
basically two sensible options.

- You can filter the overloads by the host-ness of the template.  This makes 
some sense, because a function template that takes a function as a template 
argument is most likely going to call it, but not necessarily: it might well 
decide instead to call over to the device to invoke the function.  Also, not 
all templates have a "host-ness"; that's pretty much exclusive to function 
templates.
- You can filter the overloads by the host-ness of the current context.  
Again, this makes some sense, because a host function is likely to be passing 
down a host function, but again, it's not hard to think of exceptions.  And 
again, this has the problem that the context isn't always a function and so 
doesn't necessarily have a host-ness.  The sketch after this list shows how 
the two filters can pull in different directions.
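
Here is the sketch referred to above.  It relies on Clang's target-based 
overloading extension, the names are made up for illustration, and whether 
(and how) the template argument 'f' resolves at all is exactly the question 
under discussion:

    __host__ int f(int x) { return x + 1; }
    __device__ int f(int x) { return x + 2; }

    template <int (*Fn)(int)>
    __global__ void kernel(int *out) { *out = Fn(0); }   // the body wants the device 'f'

    void host_caller(int *out) {
      // Filtering by the host-ness of the template would presumably pick the
      // device 'f', since 'kernel' executes on the device.  Filtering by the
      // host-ness of the current context (a host function) would pick the host
      // 'f'.  Neither guess is obviously right in general.
      kernel<f><<<1, 1>>>(out);
    }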

Any sort of additional template-specific guidance seems doomed to gradually 
turn into the second design I mentioned above where you have the ability to be 
more specific about function types.

For the time being, this is still a Clang extension, and while Artem mentioned 
that NVIDIA is investigating it, that's presumably still an investigation and 
we still have an opportunity to shape their thinking.  So I would really 
recommend taking the second approach, or maybe even trying to convince them to 
take the first.  (How common is higher-order programming on the device, anyway, 
that you can't break source compatibility for it?)  For this specific line of 
inquiry, that would probably mean not trying to automatically apply any 
particular filter to the overload set, but instead just relying on the 
programmer to annotate what kind of function they want.
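
In terms of the sketch above, that means the caller says which 'f' it means 
instead of the compiler guessing.  With the hypothetical explicitly-annotated 
function pointer types from the second design, that might look something like 
this (again, not syntax that exists today):

    // Hypothetical: the template argument is annotated as a device function,
    // so no heuristic filter on the overload set is needed.
    kernel<static_cast<__device__ int (*)(int)>(f)><<<1, 1>>>(out);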




