tra added a comment.
In https://reviews.llvm.org/D47849#1124638, @Hahnfeld wrote:
> IMO this goes into the right direction, we should use the fast implementation
> in libdevice. If LLVM doesn't lower these calls in the NVPTX backend, I think
> it's ok to use header wrappers as CUDA already does.
Using wrapper headers may be an OK solution for now. Ideally we should grow our
own equivalent of device-side libm so we don't have to rely on libdevice
bitcode.
> Two questions:
>
> 1. Can you explain where this is important for "correctness"? Yesterday I
> compiled a code using `sqrt` and it seems to spit out the correct results.
> Maybe that's relevant for other functions?
> 2. Incidentally I ran into a closely related problem: I can't `#include
> <math.h>` in translation units compiled for offloading, Clang complains about
> inline assembly for x86 (see below). Does that work for you?
>
> ```
> In file included from /usr/include/math.h:413:
> /usr/include/bits/mathinline.h:131:43: error: invalid input constraint 'x' in asm
>   __asm ("pmovmskb %1, %0" : "=r" (__m) : "x" (__x));
>                                           ^
> /usr/include/bits/mathinline.h:143:43: error: invalid input constraint 'x' in asm
>   __asm ("pmovmskb %1, %0" : "=r" (__m) : "x" (__x));
>                                           ^
> 2 errors generated.
> ```
Avoiding conflicts between host and device implementations of the same
functions in C++ requires use of attribute-based overloading
(https://goo.gl/EXnymm). For CUDA compilation, we provide device-side overloads
with __device__ attributes but otherwise identical signatures. We may need to
extend this overloading to work in C mode, too. Clang already supports
__attribute__((overloadable)), so the basic overloading mechanism should be in
place.
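
As a rough illustration of the attribute-based overloading described above,
here is a minimal sketch. The name `sketch_fabs` is hypothetical and does not
appear in any Clang or CUDA header. The device overload is only compiled when
Clang is in CUDA mode (assumed here to be detectable via `__clang__` and
`__CUDA__`); any other compiler sees just the host definition.

```cpp
#include <cmath>

// Host-side definition. A plain C++ compiler sees only this function.
inline float sketch_fabs(float x) { return std::fabs(x); }

#if defined(__clang__) && defined(__CUDA__)
// Device-side overload with an identical signature. In Clang's CUDA mode the
// target attribute takes part in overload resolution, so this is an overload
// of the host function above rather than a redefinition.
// Assumption: a real wrapper header would forward to a libdevice routine
// such as __nv_fabsf here; a compiler builtin keeps the sketch self-contained.
__attribute__((device)) inline float sketch_fabs(float x) {
  return __builtin_fabsf(x);
}
#endif
```

Host code calling `sketch_fabs(-2.5f)` picks the host definition; in a CUDA
compilation with Clang, calls from device code would resolve to the device
overload instead.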
================
Comment at: lib/Headers/__clang_cuda_device_functions.h:1153-1155
+__DEVICE__ long long llabs(long long __a) { return __nv_llabs(__a); }
#if defined(__LP64__)
__DEVICE__ long labs(long __a) { return llabs(__a); };
----------------
I think it should've been `return __nv_llabs(__a)` here, and the definition of
`long long llabs()` should stay where it was.
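
A minimal sketch of the shape suggested here, written so it builds without
CUDA: `nv_llabs_standin` is a hypothetical stand-in for libdevice's
`__nv_llabs`, the `sketch` namespace only avoids clashing with the C library,
and the non-LP64 branch is an illustrative fallback rather than what the
header actually does.

```cpp
namespace sketch {

// Hypothetical stand-in for __nv_llabs so the snippet compiles on the host.
inline long long nv_llabs_standin(long long a) { return a < 0 ? -a : a; }

// llabs keeps its own definition and does not need to precede labs.
inline long long llabs(long long a) { return nv_llabs_standin(a); }

#if defined(__LP64__)
// On LP64 targets long is 64 bits wide, so labs can call the 64-bit routine
// directly instead of going through llabs().
inline long labs(long a) { return static_cast<long>(nv_llabs_standin(a)); }
#else
// Illustrative fallback only; not taken from the header under review.
inline long labs(long a) { return a < 0 ? -a : a; }
#endif

}  // namespace sketch
```

Because `labs` no longer depends on `llabs`, the order of the two definitions
stops mattering, which is why `llabs` would not have to move.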
Repository:
rC Clang
https://reviews.llvm.org/D47849