[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

Yaxun Liu via Phabricator via cfe-commits Fri, 15 Feb 2019 21:29:20 -0800

yaxunl added a comment.

In D56411#1400251 <https://reviews.llvm.org/D56411#1400251>, @rjmccall wrote:


> It is totally unreasonable, at the time you are resolving a template 
> argument, to investigate how the corresponding template parameter is used 
> within the template and use that to shape how the template argument is 
> resolved.  That is simply not how the C++ template model works.  Given that 
> CUDA doesn't distinguish between host and device functions in the type 
> system, if you are going to have a rule here, it has to be based on, at most, 
> (1) the current semantic context (which may not even be a function), (2) the 
> template being specialized, and (3) the declarations in the template-argument 
> set.
>
> As I've said before on a previous patch, I think the *best* rule would be to 
> recognize a hard difference between host and device function types, probably 
> by making function types default to being host function types and requiring 
> function pointers that can store device function pointers to be explicitly 
> annotated.  However, that would not be source-compatible with ordinary CUDA, 
> which is presumably unacceptable.
>
> The second-best rule would be to preserve compatibility by making an 
> unannotated function type still be "unknown whether host or device", but to 
> also allow the creation of explicitly host-only and device-only function 
> types.  For source compatibility, DREs to functions would formally have the 
> unknown function type.  Converting a pointer to an unknown function into a 
> pointer to a host function would do some basic checking on the operand 
> expression (basically to verify that it's not obviously a device function), 
> and resolving an overload set in the context of a host-only function pointer 
> type would do the obvious filtering.
>
> Otherwise, you're going to be stuck where you are right now, which is that 
> you're messing around with heuristics because somebody added a language 
> extension that isn't actually very well thought out.  But if that's what you 
> have to do, it's what you have to do.  For this specific question, where you 
> are trying to resolve an overloaded template argument, I think there are 
> basically two sensible options.
>
> - You can filter the overloads by the host-ness of the template.  This makes 
> some sense, because it's probably most likely that a function template that 
> takes a function as a template argument is going to call it — but not 
> necessarily, because it very well might decide instead to call over to the 
> device to invoke the function.  Also, not all templates have a "host-ness"; 
> that's pretty much exclusive to function templates.
> - You can filter the overload by the host-ness of the current context.  
> Again, this makes some sense because it's likely that a host function is 
> trying to pass down a host function — but again, it's not hard to think of 
> exceptions.  And again, this has the problem that the context isn't always a 
> function and so doesn't necessarily have a host-ness.
>
> Any sort of additional template-specific guidance seems doomed to gradually 
> turn into the second design I mentioned above where you have the ability to 
> be more specific about function types.
>
> For the time being, this is still a Clang extension, and while Artem 
> mentioned that NVIDIA is investigating it, that's presumably still an 
> investigation and we still have an opportunity to shape their thinking.  So I 
> would really recommend taking the second approach, or maybe even trying to 
> convince them to take the first.  (How common is higher-order programming on 
> the device, anyway, that you can't break source compatibility for it?)  For 
> this specific line of inquiry, that would probably mean not trying to 
> automatically use any particular filter on the overload set but instead just 
> relying on the programmer to annotate what kind of function they want.


I have seen important machine learning frameworks make heavy use of 
function-type template parameters. If we make host-ness part of the type 
system, templates that expect device functions as template arguments will have 
to be rewritten; otherwise they won't compile. I don't think it will be easy to 
persuade developers to make that change, since nvcc does not require it.

However, since this host-ness-based overload resolution is already in place 
and used by existing code, I do not want to break it. I consider your 
suggestion of heuristic, host-ness-based overload resolution the most viable 
option for the current situation: if the function being resolved is a template 
argument of a function template, use the host-ness of that function template as 
the first heuristic; otherwise, fall back to the host-ness of the current 
context.
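
To make the proposed ordering concrete, here is a rough sketch in plain C++ of how the two heuristics could be combined. It is only a toy model and not Clang's Sema code: `Target`, `Side`, `Candidate`, `executionSide`, `filterSide` and `filterOverloads` are all hypothetical names invented for illustration. It also assumes that a `__global__` kernel template prefers device callees and that a `__host__ __device__` template or context applies no filter; neither point is settled in this review.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of CUDA function targets (not Clang's internal enum).
// None means "no host-ness": the context is not a function, or the template
// being specialized is not a function template.
enum class Target { None, Host, Device, HostDevice, Global };

// Side of the machine used to filter an overload set.
enum class Side { Any, Host, Device };

struct Candidate {
    std::string name;  // label for illustration only
    Target target;     // only Host, Device and HostDevice are modelled here
};

// Assumption: a __global__ kernel body runs on the device, so it is treated
// like a device function. HostDevice and None express no preference.
Side executionSide(Target t) {
    switch (t) {
    case Target::Host:   return Side::Host;
    case Target::Device:
    case Target::Global: return Side::Device;
    default:             return Side::Any;
    }
}

// First heuristic: the host-ness of the function template being specialized,
// when the name is one of its template arguments. Second heuristic: the
// host-ness of the current context.
Side filterSide(Target templateTarget, Target contextTarget) {
    if (templateTarget != Target::None)
        return executionSide(templateTarget);
    return executionSide(contextTarget);
}

// Keep candidates callable on the chosen side; with Side::Any nothing is
// filtered and ordinary overload resolution sees the whole set.
std::vector<Candidate> filterOverloads(const std::vector<Candidate>& set,
                                       Side side) {
    std::vector<Candidate> kept;
    for (const Candidate& c : set) {
        Side s = executionSide(c.target);
        if (side == Side::Any || s == Side::Any || s == side)
            kept.push_back(c);
    }
    return kept;
}
```

Under these assumptions, an overloaded `f` with a host and a device version, passed as a template argument to a kernel template from host code, would resolve to the device version, because the template's host-ness is consulted before the context's. The same name used in host code outside any template argument would resolve to the host version.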


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56411/new/

https://reviews.llvm.org/D56411



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
