tra added inline comments.

================
Comment at: lib/Headers/__clang_cuda_intrinsics.h:77-80
@@ +76,6 @@
+    _Static_assert(sizeof(__tmp) == sizeof(__in)); \
+    memcpy(&__tmp, &__in, sizeof(__in)); \
+    __tmp = ::__FnName(__tmp, __offset, __width); \
+    double __out; \
+    memcpy(&__out, &__tmp, sizeof(__out)); \
+    return __out; \
----------------
Could we use a union instead?

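For reference, a minimal host-side sketch of the two options for the double overload. It assumes __tmp is a 64-bit integer (the quoted hunk does not show its declaration), and fake_shuffle is a hypothetical identity stand-in for the 64-bit ::__FnName overload, so only the bit-copying part is exercised:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the 64-bit shuffle overload (identity on the host).
static std::int64_t fake_shuffle(std::int64_t v) { return v; }

// memcpy round trip, mirroring the structure of the quoted hunk.
double via_memcpy(double in) {
  std::int64_t tmp;
  static_assert(sizeof(tmp) == sizeof(in), "size mismatch");
  std::memcpy(&tmp, &in, sizeof(in));
  tmp = fake_shuffle(tmp);
  double out;
  std::memcpy(&out, &tmp, sizeof(out));
  return out;
}

// Union alternative: write one member, read the other.
// Reading a member other than the one last written is undefined behaviour
// in ISO C++; GCC documents it as a supported extension.
double via_union(double in) {
  union {
    double d;
    std::int64_t i;
  } u;
  u.d = in;
  u.i = fake_shuffle(u.i);
  return u.d;
}
```

The union version is shorter, but it relies on compiler behaviour rather than on the language standard. The memcpy version is well defined, and compilers normally lower a fixed-size memcpy like this to a plain register move.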
================
Comment at: lib/Headers/__clang_cuda_intrinsics.h:87
@@ +86,3 @@
+__MAKE_SHUFFLES(__shfl_up, __builtin_ptx_shfl_up_i32, __builtin_ptx_shfl_up_f32,
+                0);
+__MAKE_SHUFFLES(__shfl_down, __builtin_ptx_shfl_down_i32,
----------------
Ugh. Took me a while to figure out why 0 is used here. Unlike other variants shfl.up apparently applies to lanes >= maxLane. Who would have thought.
Might add a comment here so it's not mistaken for a typo.

http://reviews.llvm.org/D21162

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
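On the shfl.up point: a small host-side model of the source-lane selection, following the pseudocode in the PTX ISA description of shfl, shows why the clamp value has to be 0 for the up variant. The names shfl_source_lane and pack_c are hypothetical, and the packing ((32 - width) << 8) | mask is an assumption about how the macro builds its third operand, since the quoted hunk does not show it:

```cpp
enum class ShflMode { Up, Down, Bfly, Idx };

// Hypothetical helper: pack segment mask and clamp value into operand c.
// Assumes a warp size of 32 and a power-of-two width.
int pack_c(int width, int mask) { return ((32 - width) << 8) | mask; }

// Returns the lane that `lane` reads from, or `lane` itself when the
// computed source is out of range (per the PTX shfl pseudocode).
int shfl_source_lane(ShflMode mode, int lane, int b, int c) {
  const int bval = b & 0x1f;
  const int cval = c & 0x1f;            // clamp value
  const int segmask = (c >> 8) & 0x1f;  // selects the sub-warp segment
  const int max_lane = (lane & segmask) | (cval & ~segmask & 0x1f);
  const int min_lane = lane & segmask;

  int j = lane;
  bool in_range = false;
  switch (mode) {
    case ShflMode::Up:  // note the >= comparison, unlike the other modes
      j = lane - bval;
      in_range = (j >= max_lane);
      break;
    case ShflMode::Down:
      j = lane + bval;
      in_range = (j <= max_lane);
      break;
    case ShflMode::Bfly:
      j = lane ^ bval;
      in_range = (j <= max_lane);
      break;
    case ShflMode::Idx:
      j = min_lane | (bval & ~segmask & 0x1f);
      in_range = (j <= max_lane);
      break;
  }
  return in_range ? j : lane;
}
```

In this model, shfl_source_lane(ShflMode::Up, 5, 2, pack_c(32, 0)) picks lane 3, while the same call with a clamp of 0x1f leaves lane 5 reading its own value, because 3 >= 31 is false. For the up variant, maxLane acts as a lower bound, so it must be the first lane of the segment rather than the last.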