r264963 - [CUDA] Add math forward declares to CUDA header wrapper.

2016-03-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Mar 30 18:30:14 2016 New Revision: 264963 URL: http://llvm.org/viewvc/llvm-project?rev=264963&view=rev Log: [CUDA] Add math forward declares to CUDA header wrapper. Summary: This is necessary for a future patch which will make all constexpr functions implicitly host+devic

r264964 - [CUDA] Make unattributed constexpr functions implicitly host+device.

2016-03-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Mar 30 18:30:21 2016 New Revision: 264964 URL: http://llvm.org/viewvc/llvm-project?rev=264964&view=rev Log: [CUDA] Make unattributed constexpr functions implicitly host+device. With this patch, a constexpr function is implicitly host+device unless: a) it's a variadic

Re: [PATCH] D18629: [CUDA] Don't initialize the CUDA toolchain if we don't have any CUDA inputs.

2016-03-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL264965: [CUDA] Don't initialize the CUDA toolchain if we don't have any CUDA inputs. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D18629?vs=52146&id=52153#toc Repository: rL

Re: [PATCH] D18380: [CUDA] Make unattributed constexpr functions (usually) implicitly host+device.

2016-03-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL264964: [CUDA] Make unattributed constexpr functions implicitly host+device. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D18380?vs=51868&id=52152#toc Repository: rL LLVM h

Re: [PATCH] D18539: [CUDA] Add math forward declares.

2016-03-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL264963: [CUDA] Add math forward declares to CUDA header wrapper. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D18539?vs=51988&id=52151#toc Repository: rL LLVM http://review

r264965 - [CUDA] Don't initialize the CUDA toolchain if we don't have any CUDA inputs.

2016-03-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Mar 30 18:30:25 2016 New Revision: 264965 URL: http://llvm.org/viewvc/llvm-project?rev=264965&view=rev Log: [CUDA] Don't initialize the CUDA toolchain if we don't have any CUDA inputs. Summary: This prevents errors when you invoke clang with a flag that the NVPTX toolchai

r264969 - [CUDA] Add -disable-llvm-passes to CodeGenCUDA/link-device-bitcode.cu. NFC

2016-03-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Mar 30 18:45:38 2016 New Revision: 264969 URL: http://llvm.org/viewvc/llvm-project?rev=264969&view=rev Log: [CUDA] Add -disable-llvm-passes to CodeGenCUDA/link-device-bitcode.cu. NFC We already have this flag in most of the file, but we need it everywhere else, to disabl

[PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

2016-03-31 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added reviewers: tra, rnk. jlebar added a subscriber: cfe-commits. Setting this flag causes all functions to be annotated with the "nvvm-f32ftz" = "true" attribute. In addition, we annotate the module with "nvvm-reflect-ftz" set to 0 or 1, depending on whether -
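For readers unfamiliar with the term, the following is a minimal host-side sketch of what "flush denormals to zero" means for a single-precision value. It is not the code path the patch adds (the patch sets function attributes and a module flag for the NVPTX backend); the helper name flush_denormal_to_zero is hypothetical and exists only for illustration.

```cpp
#include <cmath>
#include <limits>

// Host-side model of flush-to-zero for f32: a subnormal (denormal) input
// becomes a zero of the same sign; every other value passes through.
// Hypothetical helper, not part of clang or the CUDA headers.
inline float flush_denormal_to_zero(float x) {
  if (std::fpclassify(x) == FP_SUBNORMAL)
    return std::copysign(0.0f, x);
  return x;
}
```

Under this model the smallest normal float is left unchanged, while anything smaller in magnitude (other than zero itself) collapses to a signed zero.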

[PATCH] D18672: [NVPTX] Read __CUDA_FTZ from module flags in NVVMReflect.

2016-03-31 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added reviewers: tra, rnk. jlebar added a subscriber: cfe-commits. Herald added a subscriber: jholewinski. Previously the NVVMReflect pass would read its configuration from command-line flags or a static configuration given to the pass at instantiation time. T

Re: [PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

2016-03-31 Thread Justin Lebar via cfe-commits
jlebar added a comment. Thank you for the review, Art! Comment at: include/clang/Driver/Options.td:385 @@ -384,1 +384,3 @@ HelpText<"CUDA installation path">; +def cuda_flush_denormals_to_zero : Flag<["--"], "cuda-flush-denormals-to-zero">, + HelpText<"Flush denormal floati

Re: [PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

2016-03-31 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 52292. jlebar added a comment. Address tra's review comments. I also decided no longer to turn this on when -menable-unsafe-fp-math is on: That's a cc1 flag that's implied by various clang flags, but now ftz is a clang flag, so turning it on implicitly didn't

Re: [PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

2016-03-31 Thread Justin Lebar via cfe-commits
jlebar marked an inline comment as done. Comment at: include/clang/Driver/Options.td:385 @@ -384,1 +384,3 @@ HelpText<"CUDA installation path">; +def fcuda_flush_denormals_to_zero : Flag<["-"], "fcuda-flush-denormals-to-zero">, + Group, Flags<[CC1Option]>, rn

Re: [PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

2016-03-31 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 52302. jlebar marked an inline comment as done. jlebar added a comment. Add -fno variant. http://reviews.llvm.org/D18671 Files: include/clang/Basic/LangOptions.def include/clang/Driver/Options.td lib/CodeGen/CGCall.cpp lib/CodeGen/CodeGenModule.cpp

Re: [PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

2016-03-31 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 52310. jlebar marked an inline comment as done. jlebar added a comment. Update flags so we only pass -fcuda-flush-denormals-to-zero to cc1 if appropriate. http://reviews.llvm.org/D18671 Files: include/clang/Basic/LangOptions.def include/clang/Driver/Opt

Re: [PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

2016-03-31 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: lib/Frontend/CompilerInvocation.cpp:1567 @@ +1566,3 @@ + if (Opts.CUDAIsDevice && Args.hasArg(OPT_fcuda_flush_denormals_to_zero)) +Opts.CUDADeviceFlushDenormalsToZero = 1; + Aha, I knew there had to be a better way to

Re: [PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

2016-03-31 Thread Justin Lebar via cfe-commits
jlebar added a comment. Thank you for explaining that, Reid! http://reviews.llvm.org/D18671 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

r265083 - [CUDA] Fix typo in __clang_cuda_runtime_wrapper.h.

2016-03-31 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Mar 31 19:25:42 2016 New Revision: 265083 URL: http://llvm.org/viewvc/llvm-project?rev=265083&view=rev Log: [CUDA] Fix typo in __clang_cuda_runtime_wrapper.h. We're #including the wrong file! Modified: cfe/trunk/lib/Headers/__clang_cuda_runtime_wrapper.h Modified: c

Re: [PATCH] D18672: [NVPTX] Read __CUDA_FTZ from module flags in NVVMReflect.

2016-03-31 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL265090: [NVPTX] Read __CUDA_FTZ from module flags in NVVMReflect. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D18672?vs=52274&id=52320#toc Repository: rL LLVM http://revie

Re: [PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

2016-04-05 Thread Justin Lebar via cfe-commits
jlebar added a comment. Reid, are we good here? http://reviews.llvm.org/D18671

Re: [PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

2016-04-05 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL265435: [CUDA] Add -fcuda-flush-denormals-to-zero. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D18671?vs=52310&id=52718#toc Repository: rL LLVM http://reviews.llvm.org/D18

r265435 - [CUDA] Add -fcuda-flush-denormals-to-zero.

2016-04-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue Apr 5 13:26:20 2016 New Revision: 265435 URL: http://llvm.org/viewvc/llvm-project?rev=265435&view=rev Log: [CUDA] Add -fcuda-flush-denormals-to-zero. Summary: Setting this flag causes all functions to be annotated with the "nvvm-f32ftz" = "true" attribute. In addition, we

r265436 - [CUDA] Show --cuda-gpu-arch option in clang --help.

2016-04-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue Apr 5 13:26:25 2016 New Revision: 265436 URL: http://llvm.org/viewvc/llvm-project?rev=265436&view=rev Log: [CUDA] Show --cuda-gpu-arch option in clang --help. For some reason it was hidden. Modified: cfe/trunk/include/clang/Driver/Options.td Modified: cfe/trunk/inc

Re: [PATCH] D18617: Call TargetMachine::addEarlyAsPossiblePasses from BackendUtil.

2016-04-07 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: lib/CodeGen/BackendUtil.cpp:347 @@ +346,3 @@ +PassManagerBuilder::EP_EarlyAsPossible, +[this](const PassManagerBuilder &, legacy::PassManagerBase &PM) { + TM->addEarlyAsPossiblePasses(PM); chandler

[PATCH] D18882: [CUDA] Tweak math forward declares so we're compatible with libstdc++4.9.

2016-04-07 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: rsmith. jlebar added subscribers: tra, cfe-commits. See comments in patch; we were assuming that some stdlib math functions would be defined in namespace std, when in fact the spec says they should be defined in the global namespace. libstdc+

Re: [PATCH] D18882: [CUDA] Tweak math forward declares so we're compatible with libstdc++4.9.

2016-04-07 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL265751: [CUDA] Tweak math forward declares so we're compatible with libstdc++4.9. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D18882?vs=52979&id=52980#toc Repository: rL LL

r265751 - [CUDA] Tweak math forward declares so we're compatible with libstdc++4.9.

2016-04-07 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Apr 7 18:55:53 2016 New Revision: 265751 URL: http://llvm.org/viewvc/llvm-project?rev=265751&view=rev Log: [CUDA] Tweak math forward declares so we're compatible with libstdc++4.9. Summary: See comments in patch; we were assuming that some stdlib math functions would be

Re: [PATCH] D18882: [CUDA] Tweak math forward declares so we're compatible with libstdc++4.9.

2016-04-07 Thread Justin Lebar via cfe-commits
jlebar added a comment. Thank you for the review! Repository: rL LLVM http://reviews.llvm.org/D18882

Re: [PATCH] D18617: Call TargetMachine::addEarlyAsPossiblePasses from BackendUtil.

2016-04-12 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 53415. jlebar added a comment. Switch [this] to [&]. http://reviews.llvm.org/D18617 Files: lib/CodeGen/BackendUtil.cpp Index: lib/CodeGen/BackendUtil.cpp === --- lib/CodeGen/BackendUtil.cpp +

[PATCH] D19180: [CUDA] Raise an error if the CUDA install can't be found.

2016-04-15 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. Without this change, we silently proceed on without including __clang_cuda_runtime_wrapper.h. This leads to very strange behavior -- you say you're compiling CUDA code, but e.g. __device__ is not d

r266496 - [CUDA] Raise an error if the CUDA install can't be found.

2016-04-15 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Fri Apr 15 19:11:11 2016 New Revision: 266496 URL: http://llvm.org/viewvc/llvm-project?rev=266496&view=rev Log: [CUDA] Raise an error if the CUDA install can't be found. Summary: Without this change, we silently proceed on without including __clang_cuda_runtime_wrapper.h. Th

Re: [PATCH] D19180: [CUDA] Raise an error if the CUDA install can't be found.

2016-04-15 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL266496: [CUDA] Raise an error if the CUDA install can't be found. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D19180?vs=53954&id=53973#toc Repository: rL LLVM http://revie

[PATCH] D19248: [CUDA] Add --cuda-compile-host-device, which overrides --cuda-host-only and --cuda-device-only.

2016-04-18 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added subscribers: jhen, cfe-commits. This completes the flag's tristate, letting you override it at will on the command line. http://reviews.llvm.org/D19248 Files: include/clang/Driver/Options.td lib/Driver/Driver.cpp test
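The "tristate" described above can be pictured as three mutually overriding flags. The sketch below assumes the last flag on the command line wins and that compiling both sides is the default when none is given; both are assumptions made for illustration, and resolve_compile_mode and CudaCompileMode are hypothetical names, not the driver's own.

```cpp
#include <string>
#include <vector>

enum class CudaCompileMode { HostAndDevice, HostOnly, DeviceOnly };

// Scans the arguments left to right; the last of the three flags wins.
// With none present, both host and device are compiled (assumed default).
inline CudaCompileMode
resolve_compile_mode(const std::vector<std::string> &args) {
  CudaCompileMode mode = CudaCompileMode::HostAndDevice;
  for (const std::string &arg : args) {
    if (arg == "--cuda-host-only")
      mode = CudaCompileMode::HostOnly;
    else if (arg == "--cuda-device-only")
      mode = CudaCompileMode::DeviceOnly;
    else if (arg == "--cuda-compile-host-device")
      mode = CudaCompileMode::HostAndDevice;
  }
  return mode;
}
```

With this model, a build system that always passes --cuda-host-only can be overridden by appending --cuda-compile-host-device, which is the use case the summary hints at.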

[PATCH] D19251: [CUDA] Add --no-cuda-noopt-debug, which disables --cuda-noopt-debug.

2016-04-18 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added subscribers: jhen, cfe-commits. http://reviews.llvm.org/D19251 Files: include/clang/Driver/Options.td lib/Driver/Tools.cpp test/Driver/cuda-external-tools.cu Index: test/Driver/cuda-external-tools.cu =

Re: [PATCH] D19248: [CUDA] Add --cuda-compile-host-device, which overrides --cuda-host-only and --cuda-device-only.

2016-04-18 Thread Justin Lebar via cfe-commits
jlebar added a comment. Wow, oops. Thank you. http://reviews.llvm.org/D19248

Re: [PATCH] D19248: [CUDA] Add --cuda-compile-host-device, which overrides --cuda-host-only and --cuda-device-only.

2016-04-18 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 54148. jlebar marked 2 inline comments as done. jlebar added a comment. Fix help text. http://reviews.llvm.org/D19248 Files: include/clang/Driver/Options.td lib/Driver/Driver.cpp test/Driver/cuda-options.cu test/Driver/cuda-unused-arg-warning.cu Ind

r266708 - [CUDA] Add --no-cuda-noopt-debug, which disables --cuda-noopt-debug.

2016-04-18 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Mon Apr 18 21:27:11 2016 New Revision: 266708 URL: http://llvm.org/viewvc/llvm-project?rev=266708&view=rev Log: [CUDA] Add --no-cuda-noopt-debug, which disables --cuda-noopt-debug. Reviewers: tra Subscribers: cfe-commits, jhen Differential Revision: http://reviews.llvm.org/

r266707 - [CUDA] Add --cuda-compile-host-device, which overrides --cuda-host-only and --cuda-device-only.

2016-04-18 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Mon Apr 18 21:27:07 2016 New Revision: 266707 URL: http://llvm.org/viewvc/llvm-project?rev=266707&view=rev Log: [CUDA] Add --cuda-compile-host-device, which overrides --cuda-host-only and --cuda-device-only. Summary: This completes the flag's tristate, letting you override i

Re: [PATCH] D19248: [CUDA] Add --cuda-compile-host-device, which overrides --cuda-host-only and --cuda-device-only.

2016-04-18 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL266707: [CUDA] Add --cuda-compile-host-device, which overrides --cuda-host-only and… (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D19248?vs=54148&id=54152#toc Repository: rL

Re: [PATCH] D19251: [CUDA] Add --no-cuda-noopt-debug, which disables --cuda-noopt-debug.

2016-04-18 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL266708: [CUDA] Add --no-cuda-noopt-debug, which disables --cuda-noopt-debug. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D19251?vs=54145&id=54153#toc Repository: rL LLVM h

Re: r266496 - [CUDA] Raise an error if the CUDA install can't be found.

2016-04-19 Thread Justin Lebar via cfe-commits
'd have to change our test. On Tue, Apr 19, 2016 at 11:21 AM, Chandler Carruth wrote: > This commit is missing a test. > > > On Fri, Apr 15, 2016 at 5:16 PM Justin Lebar via cfe-commits > wrote: >> >> Author: jlebar >> Date: Fri Apr 15 19:11:11 2016 >> New

Re: r266496 - [CUDA] Raise an error if the CUDA install can't be found.

2016-04-19 Thread Justin Lebar via cfe-commits
t test changes... > > We have several fake install trees in the driver tests to check pretty much > exactly these kinds of things? > > On Tue, Apr 19, 2016 at 11:31 AM Justin Lebar via cfe-commits > wrote: >> >> Yes, in general our testing story around the CUDA instal

r266796 - [CUDA] Add a test for r266496 (raise an error if a CUDA installation isn't found)

2016-04-19 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue Apr 19 13:52:28 2016 New Revision: 266796 URL: http://llvm.org/viewvc/llvm-project?rev=266796&view=rev Log: [CUDA] Add a test for r266496 (raise an error if a CUDA installation isn't found) Added: cfe/trunk/test/Driver/cuda-not-found.cu Added: cfe/trunk/test/Driver/

Re: r266496 - [CUDA] Raise an error if the CUDA install can't be found.

2016-04-19 Thread Justin Lebar via cfe-commits
2016 at 11:33 AM, Chandler Carruth > wrote: >> I don't really understand why having to change the test when we change the >> code it test changes... >> >> We have several fake install trees in the driver tests to check pretty much >> exactly these kinds of things?

[PATCH] D19346: [CUDA] Copy host builtin types to NVPTXTargetInfo.

2016-04-20 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: rsmith. jlebar added subscribers: tra, jhen, cfe-commits. Host and device types must match, otherwise when we pass values back and forth between the host and device, we will get the wrong result. This patch makes NVPTXTargetInfo inherit most

Re: [PATCH] D20457: Update -ffast-math documentation to match reality.

2016-05-20 Thread Justin Lebar via cfe-commits
jlebar added a comment. Thank you for the review! http://reviews.llvm.org/D20457

Re: [PATCH] D20457: Update -ffast-math documentation to match reality.

2016-05-20 Thread Justin Lebar via cfe-commits
jlebar marked 2 inline comments as done. jlebar added a comment. Repository: rL LLVM http://reviews.llvm.org/D20457

Re: [PATCH] D20493: [CUDA] Add -fcuda-approx-transcendentals flag.

2016-05-23 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 58123. jlebar added a comment. More tightly scope the __USE_FAST_MATH__ macro. tra pointed out that device_functions.hpp uses __USE_FAST_MATH__ for its own purposes. For this CL, we only want to define __USE_FAST_MATH__ around math_functions.hpp. http://rev

r270484 - [CUDA] Add -fcuda-approx-transcendentals flag.

2016-05-23 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Mon May 23 15:19:56 2016 New Revision: 270484 URL: http://llvm.org/viewvc/llvm-project?rev=270484&view=rev Log: [CUDA] Add -fcuda-approx-transcendentals flag. Summary: This lets us emit e.g. sin.approx.f32. See http://docs.nvidia.com/cuda/parallel-thread-execution/#floating-

Re: [PATCH] D20493: [CUDA] Add -fcuda-approx-transcendentals flag.

2016-05-23 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL270484: [CUDA] Add -fcuda-approx-transcendentals flag. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D20493?vs=58123&id=58145#toc Repository: rL LLVM http://reviews.llvm.org

[PATCH] D20794: [CUDA] Fix order of vectorized ldg intrinsics' elements.

2016-05-30 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added subscribers: tra, cfe-commits. The order is [x, y, z, w], not [w, x, y, z]. http://reviews.llvm.org/D20794 Files: lib/Headers/__clang_cuda_intrinsics.h Index: lib/Headers/__clang_cuda_intrinsics.h =
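As a rough host-side illustration of the ordering being fixed, the sketch below maps a four-element vector onto named fields so that element 0 lands in x and element 3 in w. Float4 and unpack_xyzw are hypothetical names; the real change is in lib/Headers/__clang_cuda_intrinsics.h, whose contents are not shown in this listing.

```cpp
#include <array>

// Hypothetical stand-in for a four-component CUDA vector type.
struct Float4 {
  float x, y, z, w;
};

// Element order is [x, y, z, w]: index 0 is x, index 3 is w.
inline Float4 unpack_xyzw(const std::array<float, 4> &v) {
  return Float4{v[0], v[1], v[2], v[3]};
}
```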

r271215 - [CUDA] Fix order of vectorized ldg intrinsics' elements.

2016-05-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Mon May 30 12:12:55 2016 New Revision: 271215 URL: http://llvm.org/viewvc/llvm-project?rev=271215&view=rev Log: [CUDA] Fix order of vectorized ldg intrinsics' elements. Summary: The order is [x, y, z, w], not [w, x, y, z]. Subscribers: cfe-commits, tra Differential Revision

Re: [PATCH] D20794: [CUDA] Fix order of vectorized ldg intrinsics' elements.

2016-05-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL271215: [CUDA] Fix order of vectorized ldg intrinsics' elements. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D20794?vs=58972&id=58976#toc Repository: rL LLVM http://review

[PATCH] D20836: [CUDA] Conservatively mark inline asm as convergent.

2016-05-31 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. This is particularly important because some convergent CUDA intrinsics (e.g. __shfl_down) are implemented in terms of inline asm. http://reviews.llvm.org/D20836 Files: lib/CodeGen/CGStmt.cpp

r271336 - [CUDA] Conservatively mark inline asm as convergent.

2016-05-31 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue May 31 16:27:13 2016 New Revision: 271336 URL: http://llvm.org/viewvc/llvm-project?rev=271336&view=rev Log: [CUDA] Conservatively mark inline asm as convergent. Summary: This is particularly important because some convergent CUDA intrinsics (e.g. __shfl_down) are imple

Re: [PATCH] D20836: [CUDA] Conservatively mark inline asm as convergent.

2016-05-31 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL271336: [CUDA] Conservatively mark inline asm as convergent. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D20836?vs=59130&id=59133#toc Repository: rL LLVM http://reviews.ll

Re: [PATCH] D20836: [CUDA] Conservatively mark inline asm as convergent.

2016-05-31 Thread Justin Lebar via cfe-commits
jlebar added a comment. In http://reviews.llvm.org/D20836#444911, @tra wrote: > I guess we would not be able to remove convergent from inline asm > automatically. Do we need a way to explicitly remove convergent from inline > asm? We can think about it. I'm not sure it will make a big differ

Re: r271336 - [CUDA] Conservatively mark inline asm as convergent.

2016-06-01 Thread Justin Lebar via cfe-commits
Thank you, Tom. I will have a look. On Wed, Jun 1, 2016 at 11:22 AM, Tom Stellard wrote: > On Tue, May 31, 2016 at 09:27:13PM -0000, Justin Lebar via cfe-commits wrote: >> Author: jlebar >> Date: Tue May 31 16:27:13 2016 >> New Revision: 271336 >> >> URL: http:/

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Justin Lebar via cfe-commits
jlebar added a comment. How is this different from test/SemaCUDA/launch_bounds.cu:27-28? It does const int constint = 512; __launch_bounds__(constint) void TestConstInt(void); which looks verbatim the same as this testcase. http://reviews.llvm.org/D20985

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Justin Lebar via cfe-commits
jlebar added a comment. In http://reviews.llvm.org/D20985#448836, @tra wrote: > In http://reviews.llvm.org/D20985#448822, @jlebar wrote: > > > How is this different from test/SemaCUDA/launch_bounds.cu:27-28? It does > > > > const int constint = 512; > > __launch_bounds__(constint) void TestC

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Justin Lebar via cfe-commits
jlebar accepted this revision. This revision is now accepted and ready to land. Comment at: lib/Sema/SemaDeclAttr.cpp:4044 @@ +4043,3 @@ +// Checks whether an argument of launch_bounds attribute is +// acceptable, performs implicit conversion to Rvalue and returns +// non-nullptr

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: lib/Sema/SemaDeclAttr.cpp:4079 @@ +4078,3 @@ + if (ValArg.isInvalid()) +return nullptr; + OK, so then we want an assert, not an if? http://reviews.llvm.org/D20985

[PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-08 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added subscribers: cfe-commits, jholewinski. Clang changes to make use of the LLVM intrinsics added in D21160. http://reviews.llvm.org/D21162 Files: include/clang/Basic/BuiltinsNVPTX.def lib/Headers/__clang_cuda_intrinsics.h

Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
jlebar added a comment. (Art, I would appreciate a second set of eyes on this one, as the last time I did this -- with ldg -- I messed up pretty badly.) http://reviews.llvm.org/D21162

Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
jlebar added a comment. Thank you for the reviews, Justin! http://reviews.llvm.org/D21162

Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 60223. jlebar marked 2 inline comments as done. jlebar added a comment. Update after tra's review. http://reviews.llvm.org/D21162 Files: include/clang/Basic/BuiltinsNVPTX.def lib/Headers/__clang_cuda_intrinsics.h lib/Headers/__clang_cuda_runtime_wrappe

Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: lib/Headers/__clang_cuda_intrinsics.h:77-80 @@ +76,6 @@ +_Static_assert(sizeof(__tmp) == sizeof(__in)); \ +memcpy(&__tmp, &__in, sizeof(__in)); \ +__tmp = ::__
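The quoted fragment copies a value into a same-sized temporary with memcpy before handing it to an integer primitive. A minimal host-side sketch of that pattern follows; int_only_op is a hypothetical identity stand-in for the integer shuffle (the real call is truncated in the snippet above), and float_via_int_op is likewise an illustrative name.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an integer-only primitive such as a warp
// shuffle; here it simply returns its input.
inline std::int32_t int_only_op(std::int32_t bits) { return bits; }

// Routes a float through the integer-only primitive by copying its bit
// pattern with memcpy, rather than converting its numeric value.
inline float float_via_int_op(float in) {
  std::int32_t tmp;
  static_assert(sizeof(tmp) == sizeof(in), "sizes must match for memcpy");
  std::memcpy(&tmp, &in, sizeof(in));
  tmp = int_only_op(tmp);
  float out;
  std::memcpy(&out, &tmp, sizeof(out));
  return out;
}
```

Using memcpy instead of a pointer cast keeps the reinterpretation well defined in C++, and the size check catches a mismatched temporary at compile time.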

r272299 - [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jun 9 15:04:57 2016 New Revision: 272299 URL: http://llvm.org/viewvc/llvm-project?rev=272299&view=rev Log: [CUDA] Implement __shfl* intrinsics in clang headers. Summary: Clang changes to make use of the LLVM intrinsics added in D21160. Reviewers: tra Subscribers: jhole

Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL272299: [CUDA] Implement __shfl* intrinsics in clang headers. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21162?vs=60223&id=60230#toc Repository: rL LLVM http://reviews.l

[PATCH] D21337: [CUDA] Give templated device functions internal linkage, templated kernels external linkage.

2016-06-14 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: rsmith. jlebar added subscribers: tra, cfe-commits. This lets LLVM perform IPO over these functions. In particular, it allows LLVM to emit ld.global.nc for loads to __restrict pointers in kernels that are never written to. http://reviews.llv

Re: [PATCH] D21337: [CUDA] Give templated device functions internal linkage, templated kernels external linkage.

2016-06-14 Thread Justin Lebar via cfe-commits
jlebar added a comment. tra makes the good point that maybe this should be done in ASTContext, where we already have a special case for __global__. (I think I gravitated to doing it this way because the GVA* enums have zero documentation -- at least I have a vague idea of what the LLVM attribu

[PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added subscribers: echristo, cfe-commits. Previously if you did e.g. $ clang -march=haswell -x cuda foo.cu we would pass "-march=haswell -march=sm_20" down to the ptxas tool. This causes it to assert, and rightly so! http://re
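The intended behaviour can be sketched as a filter over the argument list: host -march values are dropped before the device-side -march is added. This is only an illustration under that assumption; device_args is a hypothetical helper, and the actual change lives in lib/Driver/ToolChains.cpp.

```cpp
#include <string>
#include <vector>

// Builds a device-side argument list: drops any host "-march=..." and
// appends a single "-march=" for the GPU architecture.
// Hypothetical helper, not the driver's real implementation.
inline std::vector<std::string>
device_args(const std::vector<std::string> &host_args,
            const std::string &gpu_arch) {
  const std::string prefix = "-march=";
  std::vector<std::string> out;
  for (const std::string &arg : host_args) {
    // Keep everything that does not start with "-march=".
    if (arg.compare(0, prefix.size(), prefix) != 0)
      out.push_back(arg);
  }
  out.push_back(prefix + gpu_arch);
  return out;
}
```

For the example in the summary, passing {"-march=haswell"} with a GPU architecture of "sm_20" would yield only "-march=sm_20" on the device side.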

Re: [PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 60931. jlebar added a comment. Remove redundant test. http://reviews.llvm.org/D21419 Files: lib/Driver/ToolChains.cpp test/Driver/cuda-march.cu Index: test/Driver/cuda-march.cu === --- /dev

Re: [PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: test/Driver/cuda-march.cu:15-16 @@ +14,4 @@ + +// RUN: %clang -### -target x86_64-linux-gnu -c -march=skylake --cuda-gpu-arch=sm_30 %s 2>&1 | \ +// RUN: FileCheck -check-prefix SKYLAKE -check-prefix SM30 %s + tra wrote: >

Re: [PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 60932. jlebar added a comment. Fix tests for real this time. http://reviews.llvm.org/D21419 Files: lib/Driver/ToolChains.cpp test/Driver/cuda-march.cu Index: test/Driver/cuda-march.cu === -

r272857 - [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Jun 15 18:46:11 2016 New Revision: 272857 URL: http://llvm.org/viewvc/llvm-project?rev=272857&view=rev Log: [CUDA] Don't pass top-level -march down to device cc1 or ptxas. Summary: Previously if you did e.g. $ clang -march=haswell -x cuda foo.cu we would pass "-march=

Re: [PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL272857: [CUDA] Don't pass top-level -march down to device cc1 or ptxas. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21419?vs=60932&id=60935#toc Repository: rL LLVM http:/

Re: [PATCH] D21507: Changes after running check modernize-use-emplace (D20964)

2016-06-20 Thread Justin Lebar via cfe-commits
jlebar added a subscriber: jlebar. jlebar added a comment. There seem to be many nontrivial whitespace errors introduced by this patch. For example, -Attrs.push_back(HTMLStartTagComment::Attribute(Ident.getLocation(), - Ident.ge

Re: [PATCH] D21337: [CUDA] Give templated device functions internal linkage, templated kernels external linkage.

2016-06-24 Thread Justin Lebar via cfe-commits
jlebar added a comment. Friendly ping. http://reviews.llvm.org/D21337

[PATCH] D21778: [CUDA] Add support for CUDA 8 and sm_60-62.

2016-06-27 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. Also add sm_32, which was missing. http://reviews.llvm.org/D21778 Files: lib/Basic/Targets.cpp lib/Driver/Action.cpp lib/Driver/ToolChains.cpp Index: lib/Driver/ToolChains.cpp =

[PATCH] D21810: Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-28 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: chandlerc. jlebar added a subscriber: cfe-commits. Herald added a subscriber: klimek. This test was stat()'ing large swaths of /usr/lib hundreds of times, as every invocation of matchesConditionally*() created a new Linux toolchain. In additi

Re: [PATCH] D21810: Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-28 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: unittests/ASTMatchers/ASTMatchersTest.h:81-83 @@ +80,5 @@ + // + // FIXME: This is a hack to work around the fact that there's no way to do the + // equivalent of runToolOnCodeWithArgs without instantiating a full Driver. + // We shou

Re: [PATCH] D21810: Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-28 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 62132. jlebar marked 3 inline comments as done. jlebar added a comment. Fix typo in comment. http://reviews.llvm.org/D21810 Files: unittests/ASTMatchers/ASTMatchersTest.h Index: unittests/ASTMatchers/ASTMatchersTest.h =

[PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-29 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. Currently our handling of CUDA architectures is scattered all around clang. This patch centralizes it. A key advantage of this centralization is that you can now write a C++ switch on e.g. CudaArc
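The benefit of centralizing architecture handling behind an enum can be illustrated with a small sketch. This is not the code from D21867: the names `GpuArch`, `archToString`, and `stringToArch`, and the set of enumerators, are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <initializer_list>

// Hypothetical subset of GPU architectures, for illustration only.
enum class GpuArch { UNKNOWN, SM_20, SM_30, SM_35, SM_60 };

// Map an enumerator to its name. The switch has no default case, so a
// compiler that warns about unhandled enumerators will flag any
// architecture added to the enum but forgotten here.
const char *archToString(GpuArch A) {
  switch (A) {
  case GpuArch::UNKNOWN: return "unknown";
  case GpuArch::SM_20:   return "sm_20";
  case GpuArch::SM_30:   return "sm_30";
  case GpuArch::SM_35:   return "sm_35";
  case GpuArch::SM_60:   return "sm_60";
  }
  assert(false && "invalid GpuArch value");
  return "unknown";
}

// Parse a name back into an enumerator; unrecognized names map to UNKNOWN.
GpuArch stringToArch(const char *S) {
  for (GpuArch A : {GpuArch::SM_20, GpuArch::SM_30, GpuArch::SM_35,
                    GpuArch::SM_60}) {
    if (std::strcmp(S, archToString(A)) == 0)
      return A;
  }
  return GpuArch::UNKNOWN;
}
```

With string comparisons scattered across the code base, adding an architecture means finding every comparison by hand. With a single enum, the compiler can point at each switch that needs updating.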

[PATCH] D21869: [CUDA] Check that our CUDA install supports the requested architectures.

2016-06-29 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. Raise an error if you're using a CUDA installation that's too old for the requested architectures. In practice, this means that you need a CUDA 8 install to compile for sm_6*. http://reviews.llvm.
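A check of this kind might look roughly like the sketch below. It is not the code from D21869: `CudaVersion::UNKNOWN` and `CudaVersion::CUDA_70` appear in the D21867 review excerpt, but the other enumerators, the helpers `minVersionForArch` and `installSupportsArch`, and the treatment of an unknown install as unsupported are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Enumerators are ordered so that later toolkit versions compare greater.
// CUDA_75 and CUDA_80 are assumed names, not taken from the patch.
enum class CudaVersion { UNKNOWN, CUDA_70, CUDA_75, CUDA_80 };

// Hypothetical helper: minimum toolkit version for an "sm_XY" arch string.
// Per the summary above, sm_6* needs CUDA 8; this sketch assumes every
// other architecture works with 7.0.
CudaVersion minVersionForArch(const std::string &Arch) {
  assert(!Arch.empty() && "architecture name must not be empty");
  if (Arch.rfind("sm_6", 0) == 0) // true when Arch starts with "sm_6"
    return CudaVersion::CUDA_80;
  return CudaVersion::CUDA_70;
}

// Returns true if the detected install is new enough for the requested arch.
// Assumption: an install of unknown version is treated as unsupported.
bool installSupportsArch(CudaVersion Installed, const std::string &Arch) {
  if (Installed == CudaVersion::UNKNOWN)
    return false;
  return Installed >= minVersionForArch(Arch);
}
```

A driver would presumably run such a check once per requested architecture and emit a diagnostic for each failure, rather than letting a later tool fail with a less helpful message.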

[PATCH] D21868: [CUDA] Rename member variables in CudaInstallationDetector.

2016-06-29 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. Remove the "Cuda" prefix from these variables -- it's clear that they're related to CUDA given their containing type. http://reviews.llvm.org/D21868 Files: lib/Driver/ToolChains.cpp lib/Driver/To

Re: [PATCH] D21810: Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar added a comment. > But I think this is a reasonable workaround until such an API can be provided. Should I take that as an LG, or are we waiting for someone else to approve this? http://reviews.llvm.org/D21810

r274257 - Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jun 30 13:12:25 2016 New Revision: 274257 URL: http://llvm.org/viewvc/llvm-project?rev=274257&view=rev Log: Don't instantiate a full host toolchain in ASTMatchersTest. Summary: This test was stat()'ing large swaths of /usr/lib hundreds of times, as every invocation of mat

Re: [PATCH] D21810: Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL274257: Don't instantiate a full host toolchain in ASTMatchersTest. (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21810?vs=62132&id=62385#toc Repository: rL LLVM http://rev

Re: [PATCH] D21337: [CUDA] Give templated device functions internal linkage, templated kernels external linkage.

2016-06-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL274261: [CUDA] Give templated device functions internal linkage, templated kernels… (authored by jlebar). Changed prior to commit: http://reviews.llvm.org/D21337?vs=60728&id=62391#toc Repository: rL

r274261 - [CUDA] Give templated device functions internal linkage, templated kernels external linkage.

2016-06-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jun 30 13:41:33 2016 New Revision: 274261 URL: http://llvm.org/viewvc/llvm-project?rev=274261&view=rev Log: [CUDA] Give templated device functions internal linkage, templated kernels external linkage. Summary: This lets LLVM perform IPO over these functions. In particul

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar marked an inline comment as done. Comment at: lib/Basic/Cuda.cpp:8-19 @@ +7,14 @@ + +const char *CudaVersionToString(CudaVersion V) { + switch (V) { + case CudaVersion::UNKNOWN: +return "unknown"; + case CudaVersion::CUDA_70: +return "7.0"; + case CudaVersion::C

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar marked an inline comment as done. Comment at: lib/Driver/Driver.cpp:1026-1028 @@ -1024,4 +1025,5 @@ } else if (CudaDeviceAction *CDA = dyn_cast(A)) { -os << '"' - << (CDA->getGpuArchName() ? CDA->getGpuArchName() : "(multiple archs)") +os << '"' << (CDA->ge

r274269 - Fix ASTMatchersNodeTest to work on Windows.

2016-06-30 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jun 30 15:29:29 2016 New Revision: 274269 URL: http://llvm.org/viewvc/llvm-project?rev=274269&view=rev Log: Fix ASTMatchersNodeTest to work on Windows. It was failing because it had an explicit check for whether we're on Windows. There are a few other similar explicit ch

Re: r264008 - [sema] [CUDA] Use std algorithms in EraseUnwantedCUDAMatchesImpl.

2016-06-30 Thread Justin Lebar via cfe-commits
Interestingly all the clang tests pass with that whole line commented out. So something *really* seems missing here. Thank you for finding this. On Thu, Jun 30, 2016 at 5:08 AM, Benjamin Kramer wrote: > On Tue, Mar 22, 2016 at 1:09 AM, Justin Lebar via cfe-commits > wrote: >> Au

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 62409. jlebar added a comment. Address Art's review. http://reviews.llvm.org/D21867 Files: include/clang/Basic/Cuda.h include/clang/Driver/Action.h lib/Basic/CMakeLists.txt lib/Basic/Cuda.cpp lib/Basic/Targets.cpp lib/Driver/Action.cpp lib/Driv

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar marked 4 inline comments as done. jlebar added a comment. http://reviews.llvm.org/D21867

Re: r274257 - Don't instantiate a full host toolchain in ASTMatchersTest.

2016-06-30 Thread Justin Lebar via cfe-commits
1 test from 1 test case ran. (20 ms total) > [ PASSED ] 0 tests. > [ FAILED ] 1 test, listed below: > [ FAILED ] DeclarationMatcher.MatchClass > > 1 FAILED TEST > > > > > > 2016-06-30 21:12 GMT+03:00 Justin Lebar via cfe-commits > : >>

Re: r264008 - [sema] [CUDA] Use std algorithms in EraseUnwantedCUDAMatchesImpl.

2016-06-30 Thread Justin Lebar via cfe-commits
t 5:08 AM, Benjamin Kramer wrote: >> On Tue, Mar 22, 2016 at 1:09 AM, Justin Lebar via cfe-commits >> wrote: >>> Author: jlebar >>> Date: Mon Mar 21 19:09:25 2016 >>> New Revision: 264008 >>> >>> URL: http://llvm.or

Re: [PATCH] D18172: [CUDA][OpenMP] Add a generic offload action builder

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar added a comment. Alexey, it seems that you're asking for "final" on all classes that are not inherited from. Forgive my ignorance, but would you mind pointing me to the document that talks about our position on "final" in LLVM source? I don't see it in the style guide, but I may be mis

[PATCH] D21912: [CUDA] Don't assume that destructors can't be overloaded.

2016-06-30 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: rsmith. jlebar added subscribers: tra, cfe-commits. You can overload a destructor in CUDA, and SemaOverload needs to be tweaked not to crash when it sees an explicit call to an overloaded destructor. http://reviews.llvm.org/D21912 Files: l
