tra added a comment.

In D80450#2418775 <https://reviews.llvm.org/D80450#2418775>, @yaxunl wrote:

> For the call `ag(e)`. There are two candidates:
>
> 1. `ag` in namespace `b`. The function arguments can match. However it is a 
> host function, therefore is a wrong-sided candidate and not viable.
>
> 2. `ag` in default name space. It is a host device function. However the 
> function arguments requires `a<ae>`, therefore cannot match.
>
> Before my patch, wrong-sided candidate is allowed. clang resolves to 
> candidate 1 and this results in a diagnostic about host function referenced 
> in device host function, which can be deferred. Since `f()` is not emitted on 
> device side, the deferred diags is not emitted.

This used to be a fairly common pattern in existing CUDA code. A lot of 
templated code had `__host__ __device__` slapped on it because NVCC had no 
target overloading and it's hard to control who/where/how will instantiate 
particular template. Some of those templates could only be instantiated on one 
side of the compilation. Clang's only choice was to allow the wrong-side 
candidates and/or defer the diagnostics. I vaguely recall that it was one of 
the trickier bits of the overload resolution rules to handle.

> After my patch, wrong-sided candidate is not allowed. clang resolves to 
> candidate 2, which results in a diagnostic that no matching function, which 
> is not a deferred diagnostics by default and is emitted even if `f()` is not 
> emitted on device side.
> ...
> Basically it all boils down to the issue that overloading resolution 
> diagnostics are not deferred by default.

Looks that way.

> I think a conservative solution is that we keep the old overloading 
> resolution behavior by default (i.e. do not exclude wrong-sided candidates), 
> whereas enable the correct hostness based overloading resolution when 
> -fgpu-defer-diags is on. If developers would like correct hostness based 
> overloading resolution, they can use -fgpu-defer-diags. Then as 
> -fgpu-defer-diags become stable, we turn it on by default.

SGTM. I'll check how the patch fares on our CUDA code.



================
Comment at: clang/test/SemaCUDA/function-overload.cu:614
+
+// Two HD candidates competes with H candidate.
+// HDs have type mismatch whereas H has type match.
----------------
competes->compete


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80450/new/

https://reviews.llvm.org/D80450

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to