tra added inline comments.

================
Comment at: lib/Headers/__clang_cuda_intrinsics.h:77-80
@@ +76,6 @@
+    _Static_assert(sizeof(__tmp) == sizeof(__in));                           \
+    memcpy(&__tmp, &__in, sizeof(__in));                                     \
+    __tmp = ::__FnName(__tmp, __offset, __width);                            \
+    double __out;                                                            \
+    memcpy(&__out, &__tmp, sizeof(__out));                                   \
+    return __out;                                                            \
----------------
Could we use a union instead?
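
For comparison, here is a minimal host-side sketch of the two approaches. The helper names are hypothetical and not part of the patch, and `long long` is only assumed as the 64-bit carrier type. Note that reading the inactive member of a union is not guaranteed by the C++ standard; GCC and Clang document it as supported, so the union form leans on compiler behavior, whereas the memcpy form is well-defined.

```cpp
#include <cstring>

// Hypothetical helpers for illustration only; not taken from the patch.

// memcpy form: copies the object representation, well-defined in C++.
inline long long bits_via_memcpy(double in) {
  static_assert(sizeof(long long) == sizeof(double), "size mismatch");
  long long tmp;
  std::memcpy(&tmp, &in, sizeof(in));
  return tmp;
}

// union form: writes one member and reads the other. This relies on the
// compiler supporting type punning through unions (GCC and Clang do).
inline long long bits_via_union(double in) {
  static_assert(sizeof(long long) == sizeof(double), "size mismatch");
  union {
    double d;
    long long ll;
  } u;
  u.d = in;
  return u.ll;
}
```

With Clang or GCC both helpers should return the same bit pattern, and at -O1 or higher the memcpy calls are typically folded into a plain register move, so the union mostly buys brevity.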

================
Comment at: lib/Headers/__clang_cuda_intrinsics.h:87
@@ +86,3 @@
+__MAKE_SHUFFLES(__shfl_up, __builtin_ptx_shfl_up_i32, __builtin_ptx_shfl_up_f32,
+                0);
+__MAKE_SHUFFLES(__shfl_down, __builtin_ptx_shfl_down_i32,
----------------
Ugh. Took me a while to figure out why 0 is used here.
Unlike the other variants, shfl.up apparently applies to lanes >= maxLane, so its clamp value acts as a lower bound rather than an upper bound. Who would have thought.
It might be worth adding a comment here so the 0 isn't mistaken for a typo.
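
To make the point concrete, here is a rough host-side model of the lane selection, written from my reading of the shfl pseudocode in the PTX ISA documentation. The function names are hypothetical, only the up and down modes are modeled, and the packing in `pack_c` (segment mask in bits 12:8, clamp in bits 4:0, computed from the width) is an assumption about how the header builds the third operand.

```cpp
// Hypothetical model for illustration; not code from the patch.
enum class ShflMode { Up, Down };

// Assumed packing of the shfl "c" operand for a power-of-two width <= 32:
// segment mask in bits 12:8, clamp value in bits 4:0.
inline unsigned pack_c(unsigned width, unsigned clamp) {
  return ((32u - width) << 8) | (clamp & 0x1fu);
}

// Returns the lane that `lane` reads from, or `lane` itself when the
// computed source lane falls outside the allowed range.
inline unsigned shfl_source_lane(ShflMode mode, unsigned lane, unsigned b,
                                 unsigned c) {
  unsigned bval = b & 0x1fu;
  unsigned cval = c & 0x1fu;
  unsigned mask = (c >> 8) & 0x1fu;
  // Despite the name, "maxLane" is the bound for every mode.
  int maxLane = static_cast<int>((lane & mask) | (cval & ~mask & 0x1fu));
  int j;
  bool inRange;
  if (mode == ShflMode::Up) {
    j = static_cast<int>(lane) - static_cast<int>(bval);
    inRange = j >= maxLane;  // shfl.up: the bound is a lower bound
  } else {
    j = static_cast<int>(lane) + static_cast<int>(bval);
    inRange = j <= maxLane;  // shfl.down: the bound is an upper bound
  }
  return inRange ? static_cast<unsigned>(j) : lane;
}
```

Under this model, an up-shuffle by 2 with clamp 0 makes lane 5 read from lane 3, while the same shuffle with clamp 0x1f makes lane 5 read from itself, because 3 >= 31 fails. That is why 0 is the right value for shfl.up and 0x1f would silently turn it into a no-op for nearly every lane.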


http://reviews.llvm.org/D21162



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
