Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-19 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL270150: [CUDA] Implement __ldg using intrinsics. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D19990?vs=56603&id=57873#toc Repository: rL LLVM http://reviews.llvm.org/D1999

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-19 Thread Richard Smith via cfe-commits
rsmith accepted this revision. rsmith added a comment. This revision is now accepted and ready to land. After offline discussion: we don't know for sure whether we're going to hit the combinatorial explosion in future or not. Let's go ahead with this as-is for now, then, with the explicit acknow

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-17 Thread Richard Smith via cfe-commits
rsmith added inline comments. Comment at: include/clang/Basic/BuiltinsNVPTX.def:569-603 @@ -568,1 +568,37 @@ +// __ldg. This is not implemented as a builtin by nvcc. +BUILTIN(__nvvm_ldg_c, "ccC*", "") +BUILTIN(__nvvm_ldg_s, "ssC*", "") +BUILTIN(__nvvm_ldg_i, "iiC*", "") +BUILTI
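
For readers unfamiliar with the `BUILTIN` type strings quoted above: in Clang's builtin tables the first type is the return type and the rest are parameters, so "ccC*" reads as `char (const char *)`. The sketch below shows how a header could expose one overloaded name on top of such per-type builtins. It is a host-side illustration only: the `stub_ldg_*` functions are hypothetical stand-ins for the `__nvvm_ldg_*` builtins (which exist only when compiling for NVPTX), and `ldg_sketch` is an assumed name, not the patch's actual header code.

```cpp
// Hypothetical host-side stand-ins for the per-type builtins.
// Each one mirrors the signature encoded by its type string.
static inline char  stub_ldg_c(const char *p)  { return *p; }  // "ccC*"
static inline short stub_ldg_s(const short *p) { return *p; }  // "ssC*"
static inline int   stub_ldg_i(const int *p)   { return *p; }  // "iiC*"

// One overload per element type, each forwarding to the matching builtin.
// Overload resolution on the pointer type picks the right one.
inline char  ldg_sketch(const char *p)  { return stub_ldg_c(p); }
inline short ldg_sketch(const short *p) { return stub_ldg_s(p); }
inline int   ldg_sketch(const int *p)   { return stub_ldg_i(p); }
```

With this layout every supported type needs its own builtin and its own overload, which is the growth the reviewers discuss elsewhere in the thread.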

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-17 Thread Justin Lebar via cfe-commits
jlebar added a comment. Friendly ping. This is a big help with some TensorFlow benchmarks. http://reviews.llvm.org/D19990

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-09 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 56603. jlebar added a comment. Remove static_asserts. http://reviews.llvm.org/D19990 Files: include/clang/Basic/BuiltinsNVPTX.def lib/CodeGen/CGBuiltin.cpp lib/Headers/CMakeLists.txt lib/Headers/__clang_cuda_intrinsics.h lib/Headers/__clang_cuda_ru

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-09 Thread Justin Lebar via cfe-commits
jlebar added a comment. Art pointed out that static_assert is C++11-only. I'll just remove them and make a note to move them into the CUDA test-suite stuff Art is working on. http://reviews.llvm.org/D19990

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-09 Thread Artem Belevich via cfe-commits
tra added a comment. OK. Let's stick with __ldg for now. http://reviews.llvm.org/D19990

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-09 Thread Justin Lebar via cfe-commits
jlebar added a comment. Art pointed me to the fact that CUDA 8 adds a bunch more load intrinsics, and I said ohmygosh maybe we *do* want to do the variadic intrinsic thing here. But now looking at how __builtin_add_overflow is implemented, we'd need special sema checking to make it work. We wo
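
For context on the variadic option mentioned above: `__builtin_add_overflow` is a type-generic builtin in Clang and GCC, where one name accepts operands of any integer type and the compiler checks the argument types itself. A single generic load builtin would have needed the same kind of handling. A minimal host-side sketch of that genericity follows; the wrapper names are hypothetical and not part of the patch.

```cpp
#include <climits>

// Hypothetical helper: true if a + b does not fit in int.
inline bool add_overflows_int(int a, int b) {
    int result;
    // Same builtin name as below; the result type comes from the pointer.
    return __builtin_add_overflow(a, b, &result);
}

// Hypothetical helper: true if a + b does not fit in long long.
inline bool add_overflows_long_long(long long a, long long b) {
    long long result;
    return __builtin_add_overflow(a, b, &result);
}
```

The same call spelling serves both widths, so no per-type builtin is declared, at the cost of custom type checking inside the compiler.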

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-05 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: include/clang/Basic/BuiltinsNVPTX.def:569-603 @@ -568,1 +568,37 @@ +// __ldg. This is not implemented as a builtin by nvcc. +BUILTIN(__nvvm_ldg_c, "ccC*", "") +BUILTIN(__nvvm_ldg_s, "ssC*", "") +BUILTIN(__nvvm_ldg_i, "iiC*", "") +BUILTI

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-05 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: include/clang/Basic/BuiltinsNVPTX.def:569-603 @@ -568,1 +568,37 @@ +// __ldg. This is not implemented as a builtin by nvcc. +BUILTIN(__nvvm_ldg_c, "ccC*", "") +BUILTIN(__nvvm_ldg_s, "ssC*", "") +BUILTIN(__nvvm_ldg_i, "iiC*", "") +BUILTI

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-05 Thread David Majnemer via cfe-commits
majnemer added a subscriber: majnemer. Comment at: include/clang/Basic/BuiltinsNVPTX.def:569-603 @@ -568,1 +568,37 @@ +// __ldg. This is not implemented as a builtin by nvcc. +BUILTIN(__nvvm_ldg_c, "ccC*", "") +BUILTIN(__nvvm_ldg_s, "ssC*", "") +BUILTIN(__nvvm_ldg_i, "iiC*", ""