r261018 - [CUDA] pass debug options to ptxas.

2016-02-16 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Feb 16 16:03:20 2016 New Revision: 261018 URL: http://llvm.org/viewvc/llvm-project?rev=261018&view=rev Log: [CUDA] pass debug options to ptxas. ptxas optimizations are disabled if we need to generate debug info as ptxas does not accept '-g' otherwise. Differential Revision:

Re: [PATCH] D17111: [CUDA] Added --cuda-noopt-device-debug option to control ptxas' debug info generation.

2016-02-16 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL261018: [CUDA] pass debug options to ptxas. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D17111?vs=47680&id=48108#toc Repository: rL LLVM http://reviews.llvm.org/D17111 Files

Re: [PATCH] D17561: [CUDA] Add conversion operators for threadIdx, blockIdx, gridDim, and blockDim to uint3 and dim3.

2016-02-24 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Headers/cuda_builtin_vars.h:72 @@ -66,1 +71,3 @@ + // uint3). This function is defined after we pull in vector_types.h. + __attribute__((device)) operator uint3() const; private: Considering that built-in variables ar

Re: [PATCH] D17562: [CUDA] Add hack so code which includes "curand.h" doesn't break.

2016-02-24 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. OK. http://reviews.llvm.org/D17562 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D17581: [CUDA] disable attribute-based overloading for __global__ functions.

2016-02-24 Thread Artem Belevich via cfe-commits
tra created this revision. tra added a reviewer: jlebar. tra added a subscriber: cfe-commits. __global__ functions are present on both host and device side, so providing __host__ or __device__ overloads is not going to do anything useful. http://reviews.llvm.org/D17581 Files: lib/Sema/SemaOver

Re: [PATCH] D17561: [CUDA] Add conversion operators for threadIdx, blockIdx, gridDim, and blockDim to uint3 and dim3.

2016-02-24 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. OK. http://reviews.llvm.org/D17561

r261778 - [CUDA] do not allow attribute-based overloading for __global__ functions.

2016-02-24 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Feb 24 15:54:45 2016 New Revision: 261778 URL: http://llvm.org/viewvc/llvm-project?rev=261778&view=rev Log: [CUDA] do not allow attribute-based overloading for __global__ functions. __global__ functions are present on both host and device side, so providing __host__ or __dev

Re: [PATCH] D19346: [CUDA] Copy host builtin types to NVPTXTargetInfo.

2016-04-28 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Basic/Targets.cpp:1642 @@ +1641,3 @@ + +std::unique_ptr HostTarget( +AllocateTarget(llvm::Triple(Opts.HostTriple), Opts)); You may want to make sure we don't recurse here if someone specifies host triple to b

[PATCH] D19748: [CUDA] Make sure device-side __global__ functions are always visible.

2016-04-29 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, rnk. tra added a subscriber: cfe-commits. __global__ functions are a special case in CUDA. Even when the symbol would normally not be externally visible according to C++ rules, they still must be visible to host-side stub which launches the

Re: [PATCH] D19748: [CUDA] Make sure device-side __global__ functions are always visible.

2016-05-02 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL268299: [CUDA] Make sure device-side __global__ functions are always visible. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D19748?vs=55674&id=55885#toc Repository: rL LLVM htt

r268299 - [CUDA] Make sure device-side __global__ functions are always visible.

2016-05-02 Thread Artem Belevich via cfe-commits
Author: tra Date: Mon May 2 15:30:03 2016 New Revision: 268299 URL: http://llvm.org/viewvc/llvm-project?rev=268299&view=rev Log: [CUDA] Make sure device-side __global__ functions are always visible. __global__ functions are a special case in CUDA. Even when the symbol would normally not be exte

[PATCH] D20034: [CUDA] Only __shared__ variables can be static local on device side.

2016-05-06 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jingyue, jlebar. tra added a subscriber: cfe-commits. According to CUDA programming guide (v7.5): > E.2.9.4: Within the body of a __device__ or __global__ function, only > __shared__ variables may be declared with static storage class. http://re

Re: [PATCH] D20034: [CUDA] Only __shared__ variables can be static local on device side.

2016-05-06 Thread Artem Belevich via cfe-commits
tra added a comment. In http://reviews.llvm.org/D20034#423945, @jlebar wrote: > What are we supposed to do if we encounter a static __shared__ variable in an > HD function? Presumably that also should be an error if we invoke the HD > function from the device? nvcc produces an error only of

[PATCH] D20039: [CUDA] Restrict init of local __shared__ variables to empty constructors only.

2016-05-06 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jingyue, jlebar, rnk. tra added a subscriber: cfe-commits. While __shared__ variables look like any other variable with a static storage class to compiler, they behave differently on device side. * one instance is created per block of GPUS, so stan

Re: [PATCH] D19990: [CUDA] Implement __ldg using intrinsics.

2016-05-09 Thread Artem Belevich via cfe-commits
tra added a comment. OK. Let's stick with __ldg for now. http://reviews.llvm.org/D19990

Re: [PATCH] D20034: [CUDA] Only __shared__ variables can be static local on device side.

2016-05-09 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 56610. tra added a comment. Updated tests in CodeGenCUDA/address-spaces.cu http://reviews.llvm.org/D20034 Files: include/clang/Basic/DiagnosticSemaKinds.td lib/Sema/SemaDecl.cpp test/CodeGenCUDA/address-spaces.cu test/CodeGenCUDA/device-var-init.cu Ind

r268962 - [CUDA] Only __shared__ variables can be static local on device side.

2016-05-09 Thread Artem Belevich via cfe-commits
Author: tra Date: Mon May 9 14:36:08 2016 New Revision: 268962 URL: http://llvm.org/viewvc/llvm-project?rev=268962&view=rev Log: [CUDA] Only __shared__ variables can be static local on device side. According to CUDA programming guide (v7.5): > E.2.9.4: Within the body of a device or global funct

Re: [PATCH] D20034: [CUDA] Only __shared__ variables can be static local on device side.

2016-05-09 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL268962: [CUDA] Only __shared__ variables can be static local on device side. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D20034?vs=56610&id=56611#toc Repository: rL LLVM http

Re: [PATCH] D20039: [CUDA] Restrict init of local __shared__ variables to empty constructors only.

2016-05-09 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 56619. tra added a comment. Reworded comments. Removed tests that no longer apply as we don't generate constructors for static local variables on device side. Empty constructor cases are already covered by test/CodeGenCUDA/device-var-init.cu. http://reviews.l

Re: [PATCH] D20039: [CUDA] Restrict init of local __shared__ variables to empty constructors only.

2016-05-09 Thread Artem Belevich via cfe-commits
tra added a comment. In http://reviews.llvm.org/D20039#424067, @jlebar wrote: > While I think this is 100% the right thing to do, I am worried about breaking > existing targets. Maybe we need an escape valve, at least until we get that > sorted out? Unless you're pretty confident this isn't h

r268982 - [CUDA] Restrict init of local __shared__ variables to empty constructors only.

2016-05-09 Thread Artem Belevich via cfe-commits
Author: tra Date: Mon May 9 17:09:56 2016 New Revision: 268982 URL: http://llvm.org/viewvc/llvm-project?rev=268982&view=rev Log: [CUDA] Restrict init of local __shared__ variables to empty constructors only. Allow only empty constructors for local __shared__ variables in a way identical to restr

Re: [PATCH] D20039: [CUDA] Restrict init of local __shared__ variables to empty constructors only.

2016-05-09 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL268982: [CUDA] Restrict init of local __shared__ variables to empty constructors only. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D20039?vs=56619&id=56642#toc Repository: rL

[PATCH] D20139: [CUDA] Split device-var-init.cu tests into separate Sema and CodeGen parts.

2016-05-10 Thread Artem Belevich via cfe-commits
tra created this revision. tra added a reviewer: jlebar. tra added a subscriber: cfe-commits. Codegen tests for device-side variable initialization are subset of test cases used to verify Sema's part of the job. Including CodeGenCUDA/device-var-init.cu from SemaCUDA makes it easier to keep both

[PATCH] D20140: [CUDA] Do not allow non-empty destructors for global device-side variables.

2016-05-10 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, rsmith, jingyue. tra added a subscriber: cfe-commits. According to Cuda Programming guide (v7.5, E2.3.1): > __device__, __constant__ and __shared__ variables defined in namespace > scope, that are of class type, cannot have a non-empty constr

Re: [PATCH] D20139: [CUDA] Split device-var-init.cu tests into separate Sema and CodeGen parts.

2016-05-10 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: test/SemaCUDA/device-var-init.cu:7-11 @@ -6,9 +6,7 @@ // RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fcuda-is-device -std=c++11 \ -// RUN: -fno-threadsafe-statics -emit-llvm -o - %s | FileCheck %s -// RUN: %clang_cc1 -triple nvptx64-nvi

[PATCH] D20141: Check for nullptr argument.

2016-05-10 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, jordan_rose. tra added a subscriber: cfe-commits. GetOrCreateLLVMGlobal() accepts nullptr D, but in some cases we end up dereferencing it without checking if it's non-null. Fixes PR15492. http://reviews.llvm.org/D20141 Files: lib/CodeGe

Re: [PATCH] D20141: Check for nullptr argument.

2016-05-10 Thread Artem Belevich via cfe-commits
tra added a comment. I've never seen it triggered. Fix is based on the comment above the function that D==nullptr is acceptable and the fact that we are checking D in other places in this function. Two cases where nullptr D is passed explicitly have something to do with -fblocks, but that does

[PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-17 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, jingyue. tra added a subscriber: cfe-commits. This matches default nvcc behavior and gives substantial performance boost on GPU where fmad is much cheaper compared to add+mul. http://reviews.llvm.org/D20341 Files: lib/Frontend/Compiler

Re: [PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-17 Thread Artem Belevich via cfe-commits
tra added a subscriber: scanon. tra added a comment. Things are even more interesting. -ffp-contract=fast is *not* what this change does. :-) We have two places where we can fuse FP instructions -- in clang and in LLVM back-end. Clang fuses add+mul into llvm.fmuladd intrinsic if -ffp-contract=o

Re: [PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-17 Thread Artem Belevich via cfe-commits
tra added a comment. In http://reviews.llvm.org/D20341#432494, @hfinkel wrote: > > That having been said, is this change the equivalent of -ffp-contract=fast or > -ffp-contract=on? I think it is the latter and we want the former (i.e. where > we let the backend be as aggressive as possible

Re: [PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-17 Thread Artem Belevich via cfe-commits
tra added a comment. OK. Consensus seems to be that -ffp-contract=fast is the way to go. I'll update the patch. I've just checked Steve's example with nvcc and indeed it fused mul+add. http://reviews.llvm.org/D20341

Re: [PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-17 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 57540. tra added a comment. Changed default to -ffp-contract=fast. http://reviews.llvm.org/D20341 Files: lib/Frontend/CompilerInvocation.cpp Index: lib/Frontend/CompilerInvocation.cpp === --- li

Re: [PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-17 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 57541. tra added a comment. Added test case. Is there a better way to test that correct options are passed to back-end? This test resorts to checking assembly generated by back-end which is way too far away from what actually needs testing. http://reviews.llvm

Re: [PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-18 Thread Artem Belevich via cfe-commits
tra added a comment. I don't think using FMA throws away IEEE compliance. IEEE 754-2008 says: > A language standard should also define, and require implementations to > provide, attributes that allow and > disallow value-changing optimizations, separately or collectively, for a > block. Thes
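To illustrate why contraction counts as a value-changing optimization, here is a minimal C++ sketch (not taken from the patch under review) comparing a separately rounded multiply-then-add with a fused std::fma. The function names and the inputs mentioned afterwards are assumptions chosen for illustration; the volatile intermediate is there only to stop the host compiler from fusing the "unfused" path on its own.

```cpp
#include <cmath>

// Multiply, round the product to double, then add.
// The volatile store forces the product to be rounded before the addition,
// regardless of the compiler's -ffp-contract setting.
double mul_then_add(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// One fused operation: a * b + c is evaluated exactly and rounded once.
double fused_mul_add(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With a = b = 1 + 2^-27 and c = -(1 + 2^-26), the exact product is 1 + 2^-26 + 2^-54. Rounding the product to double drops the 2^-54 term, so mul_then_add returns 0, while fused_mul_add keeps it and returns 2^-54.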

[PATCH] D20405: [CUDA] allow sm_50,52,53 GPUs

2016-05-18 Thread Artem Belevich via cfe-commits
tra created this revision. tra added a reviewer: jlebar. tra added a subscriber: cfe-commits. LLVM accepts them since r233575. http://reviews.llvm.org/D20405 Files: lib/Basic/Targets.cpp lib/Driver/ToolChains.cpp test/CodeGen/nvptx-cpus.c Index: test/CodeGen/nvptx-cpus.c

r270084 - [CUDA] Allow sm_50,52,53 GPUs

2016-05-19 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu May 19 12:47:47 2016 New Revision: 270084 URL: http://llvm.org/viewvc/llvm-project?rev=270084&view=rev Log: [CUDA] Allow sm_50,52,53 GPUs LLVM accepts them since r233575. Differential Revision: http://reviews.llvm.org/D20405 Modified: cfe/trunk/lib/Basic/Targets.cpp

Re: [PATCH] D20405: [CUDA] allow sm_50,52,53 GPUs

2016-05-19 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL270084: [CUDA] Allow sm_50,52,53 GPUs (authored by tra). Changed prior to commit: http://reviews.llvm.org/D20405?vs=57715&id=57822#toc Repository: rL LLVM http://reviews.llvm.org/D20405 Files: cfe

Re: [PATCH] D20141: Check for nullptr argument.

2016-05-19 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL270086: Check for nullptr argument. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D20141?vs=56832&id=57825#toc Repository: rL LLVM http://reviews.llvm.org/D20141 Files: cfe/t

r270086 - Check for nullptr argument.

2016-05-19 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu May 19 13:00:18 2016 New Revision: 270086 URL: http://llvm.org/viewvc/llvm-project?rev=270086&view=rev Log: Check for nullptr argument. Addresses static analysis report in PR15492. Differential Revision: http://reviews.llvm.org/D20141 Modified: cfe/trunk/lib/CodeGen/Co

Re: [PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-19 Thread Artem Belevich via cfe-commits
tra added a subscriber: chandlerc. tra added a comment. Short version of offline discussion with @chandlerc: Default of -ffp-contract=fast for CUDA is fine. http://reviews.llvm.org/D20341

r270094 - [CUDA] Enable fusing FP ops (-ffp-contract=fast) for CUDA by default.

2016-05-19 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu May 19 13:44:45 2016 New Revision: 270094 URL: http://llvm.org/viewvc/llvm-project?rev=270094&view=rev Log: [CUDA] Enable fusing FP ops (-ffp-contract=fast) for CUDA by default. This matches default nvcc behavior and gives substantial performance boost on GPU where fmad is m

Re: [PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

2016-05-19 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL270094: [CUDA] Enable fusing FP ops (-ffp-contract=fast) for CUDA by default. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D20341?vs=57541&id=57833#toc Repository: rL LLVM htt

r270107 - [CUDA] Split device-var-init.cu tests into separate Sema and CodeGen parts.

2016-05-19 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu May 19 15:13:39 2016 New Revision: 270107 URL: http://llvm.org/viewvc/llvm-project?rev=270107&view=rev Log: [CUDA] Split device-var-init.cu tests into separate Sema and CodeGen parts. Codegen tests for device-side variable initialization are subset of test cases used to veri

Re: [PATCH] D20140: [CUDA] Do not allow non-empty destructors for global device-side variables.

2016-05-19 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL270108: [CUDA] Do not allow non-empty destructors for global device-side variables. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D20140?vs=56829&id=57849#toc Repository: rL LLV

Re: [PATCH] D20139: [CUDA] Split device-var-init.cu tests into separate Sema and CodeGen parts.

2016-05-19 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL270107: [CUDA] Split device-var-init.cu tests into separate Sema and CodeGen parts. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D20139?vs=56824&id=57848#toc Repository: rL LLV

r270108 - [CUDA] Do not allow non-empty destructors for global device-side variables.

2016-05-19 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu May 19 15:13:53 2016 New Revision: 270108 URL: http://llvm.org/viewvc/llvm-project?rev=270108&view=rev Log: [CUDA] Do not allow non-empty destructors for global device-side variables. According to Cuda Programming guide (v7.5, E2.3.1): > __device__, __constant__ and __shared

Re: [PATCH] D18380: [CUDA] Make unattributed constexpr functions (usually) implicitly host+device.

2016-03-29 Thread Artem Belevich via cfe-commits
tra added a comment. LGTM. http://reviews.llvm.org/D18380

Re: r264783 - [PGO] Move the instrumentation point closer to the value site.

2016-03-29 Thread Artem Belevich via cfe-commits
Hi, FYI, cxx-indirect-call.cpp test fails on platforms with different alignment. It may help to either use specific target or change your patterns to accommodate other targets. --Artem TEST 'Clang :: Profile/cxx-indirect-call.cpp' FAILED Script: -- /usr/

Re: r264783 - [PGO] Move the instrumentation point closer to the value site.

2016-03-29 Thread Artem Belevich via cfe-commits
Thanks for the quick fix. The test works on x86_64-unknown-linux-gnu now. --Artem On Tue, Mar 29, 2016 at 3:24 PM, Betul Buyukkurt wrote: > Hi Artem, > > > > I’ve uploaded a patch to remove the alignment. > > > > Thanks, > > -Betul > > > > *From:* Artem Belevich [mailto:t...@google.com] > *Sent

Re: [PATCH] D18539: [CUDA] Add math forward declares.

2016-03-29 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. One nit. LGTM otherwise. Comment at: lib/Headers/__clang_cuda_math_forward_declares.h:1 @@ +1,2 @@ +/*=== __clang_cuda_cmath.h - Device-side CUDA cmath support === +

Re: [PATCH] D18671: [CUDA] Add --cuda-flush-denormals-to-zero.

2016-03-31 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: include/clang/Driver/Options.td:385 @@ -384,1 +384,3 @@ HelpText<"CUDA installation path">; +def cuda_flush_denormals_to_zero : Flag<["--"], "cuda-flush-denormals-to-zero">, + HelpText<"Flush denormal floating point values to zero in CUD

Re: [PATCH] D18672: [NVPTX] Read __CUDA_FTZ from module flags in NVVMReflect.

2016-03-31 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. http://reviews.llvm.org/D18672

[PATCH] D19074: [CUDA] removed unneeded __nvvm_reflect_anchor()

2016-04-13 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, majnemer. tra added a subscriber: cfe-commits. Since r265060 LLVM infers correct __nvvm_reflect attributes. http://reviews.llvm.org/D19074 Files: lib/Headers/__clang_cuda_runtime_wrapper.h Index: lib/Headers/__clang_cuda_runtime_wrappe

Re: [PATCH] D19180: [CUDA] Raise an error if the CUDA install can't be found.

2016-04-15 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. http://reviews.llvm.org/D19180

Re: [PATCH] D19248: [CUDA] Add --cuda-compile-host-device, which overrides --cuda-host-only and --cuda-device-only.

2016-04-18 Thread Artem Belevich via cfe-commits
tra added a comment. Help strings seem to be backwards. LGTM otherwise. Comment at: include/clang/Driver/Options.td:378 @@ -377,2 +377,3 @@ def cuda_device_only : Flag<["--"], "cuda-device-only">, - HelpText<"Do device-side CUDA compilation only">; + HelpText<"Compile CUDA co

r267062 - [CUDA] removed unneeded __nvvm_reflect_anchor()

2016-04-21 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu Apr 21 16:40:27 2016 New Revision: 267062 URL: http://llvm.org/viewvc/llvm-project?rev=267062&view=rev Log: [CUDA] removed unneeded __nvvm_reflect_anchor() Since r265060 LLVM infers correct __nvvm_reflect attributes, so explicit declaration of __nvvm_reflect() is no longer n

Re: [PATCH] D19074: [CUDA] removed unneeded __nvvm_reflect_anchor()

2016-04-21 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL267062: [CUDA] removed unneeded __nvvm_reflect_anchor() (authored by tra). Changed prior to commit: http://reviews.llvm.org/D19074?vs=53618&id=54585#toc Repository: rL LLVM http://reviews.llvm.org/D

Re: [PATCH] D20493: [CUDA] Add -fcuda-approx-transcendentals flag.

2016-05-23 Thread Artem Belevich via cfe-commits
tra added a comment. LGTM. http://reviews.llvm.org/D20493

Re: [PATCH] D20836: [CUDA] Conservatively mark inline asm as convergent.

2016-05-31 Thread Artem Belevich via cfe-commits
tra added a comment. I guess we would not be able to remove convergent from inline asm automatically. Do we need a way to explicitly remove convergent from inline asm? http://reviews.llvm.org/D20836

[PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: rsmith, jlebar. tra added a subscriber: cfe-commits. Fixes clang crash reported in PR27778. http://reviews.llvm.org/D20985 Files: lib/Sema/SemaDeclAttr.cpp test/CodeGenCUDA/launch-bounds.cu test/SemaCUDA/pr27778.cu Index: test/SemaCUDA/pr277

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Artem Belevich via cfe-commits
tra added a comment. In http://reviews.llvm.org/D20985#448822, @jlebar wrote: > How is this different from test/SemaCUDA/launch_bounds.cu:27-28? It does > > const int constint = 512; > __launch_bounds__(constint) void TestConstInt(void); > > > which looks verbatim the same as this testcas

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 59624. tra added a comment. Addressed Justin's comments. http://reviews.llvm.org/D20985 Files: lib/Sema/SemaDeclAttr.cpp test/CodeGenCUDA/launch-bounds.cu test/SemaCUDA/pr27778.cu Index: test/SemaCUDA/pr27778.cu ==

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 59631. tra marked an inline comment as done. tra added a comment. Rephrased comments http://reviews.llvm.org/D20985 Files: lib/Sema/SemaDeclAttr.cpp test/CodeGenCUDA/launch-bounds.cu test/SemaCUDA/pr27778.cu Index: test/SemaCUDA/pr27778.cu ==

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-03 Thread Artem Belevich via cfe-commits
tra marked 3 inline comments as done. Comment at: lib/Sema/SemaDeclAttr.cpp:4046 @@ +4045,3 @@ +// non-nullptr Expr result on success. Returns nullptr otherwise and +// may output an error. +static Expr *makeLaunchBoundsArgExpr(Sema &S, Expr *E, jlebar wrote: > Pr

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-06 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 59778. tra added a comment. Replaced if() with assert() to catch unexpected PerformCopyInitialization() failures. http://reviews.llvm.org/D20985 Files: lib/Sema/SemaDeclAttr.cpp test/CodeGenCUDA/launch-bounds.cu test/SemaCUDA/pr27778.cu Index: test/Sema

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-06 Thread Artem Belevich via cfe-commits
tra marked an inline comment as done. tra added a comment. http://reviews.llvm.org/D20985

r271951 - [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-06 Thread Artem Belevich via cfe-commits
Author: tra Date: Mon Jun 6 17:54:57 2016 New Revision: 271951 URL: http://llvm.org/viewvc/llvm-project?rev=271951&view=rev Log: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue. Fixes clang crash reported in PR27778. Differential Revision: http://reviews.llvm.org/D20985

Re: [PATCH] D20985: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.

2016-06-06 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL271951: [CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D20985?vs=59778&id=59800#toc Repository: rL LLVM

Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Headers/__clang_cuda_intrinsics.h:77-80 @@ +76,6 @@ +_Static_assert(sizeof(__tmp) == sizeof(__in)); \ +memcpy(&__tmp, &__in, sizeof(__in)); \ +__tmp = ::__FnN

Re: [PATCH] D18170: [CUDA][OpenMP] Create generic offload toolchains

2016-06-10 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. http://reviews.llvm.org/D18170

r272573 - Test fix -- use captured call result instead of hardcoded %2.

2016-06-13 Thread Artem Belevich via cfe-commits
Author: tra Date: Mon Jun 13 13:44:22 2016 New Revision: 272573 URL: http://llvm.org/viewvc/llvm-project?rev=272573&view=rev Log: Test fix -- use captured call result instead of hardcoded %2. Modified: cfe/trunk/test/CodeGen/bitscan-builtins.c Modified: cfe/trunk/test/CodeGen/bitscan-builtin

Re: [clang-tools-extra] r272816 - clang-rename: implement renaming of classes with a dtor

2016-06-15 Thread Artem Belevich via cfe-commits
Miklos, TokenName produces unused variable warning in builds with asserts disabled. Could you add LLVM_ATTRIBUTE_UNUSED to it? Thanks, --Artem On Wed, Jun 15, 2016 at 11:35 AM, Miklos Vajna via cfe-commits < cfe-commits@lists.llvm.org> wrote: > Author: vmiklos > Date: Wed Jun 15 13:35:41 2016

[clang-tools-extra] r272852 - [clang-tools] mark TokenName as unused

2016-06-15 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Jun 15 18:04:42 2016 New Revision: 272852 URL: http://llvm.org/viewvc/llvm-project?rev=272852&view=rev Log: [clang-tools] mark TokenName as unused Otherwise it produces compiler warning if asserts are disabled. Modified: clang-tools-extra/trunk/clang-rename/USRLocFinder
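The warning appears because assert() expands to nothing when NDEBUG is defined, leaving a variable that is read only inside the assertion with no remaining uses. A minimal sketch of the pattern, using the standard C++17 [[maybe_unused]] attribute as a stand-in for LLVM_ATTRIBUTE_UNUSED so that it compiles without LLVM headers; the function and variable names are hypothetical and this is not the clang-rename code:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not the clang-rename code. Length is read only
// inside assert(), so a build with NDEBUG defined would otherwise report
// it as an unused variable.
std::size_t checkedTokenLength(const std::string &Token) {
  [[maybe_unused]] const std::size_t Length = Token.size();
  assert(Length > 0 && "expected a non-empty token");
  return Token.size();
}
```

Marking the variable keeps the assertion in debug builds and silences the warning in release builds without changing behavior in either.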

Re: [clang-tools-extra] r272816 - clang-rename: implement renaming of classes with a dtor

2016-06-15 Thread Artem Belevich via cfe-commits
Should be fixed in r272852 --Artem On Wed, Jun 15, 2016 at 3:16 PM, Artem Belevich wrote: > Miklos, > > TokenName produces unused variable warning in builds with asserts disabled. > Could you add LLVM_ATTRIBUTE_UNUSED to it? > > Thanks, > --Artem > > > On Wed, Jun 15, 2016 at 11:35 AM, Miklos V

Re: [PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: test/Driver/cuda-march.cu:15-16 @@ +14,4 @@ + +// RUN: %clang -### -target x86_64-linux-gnu -c -march=skylake --cuda-gpu-arch=sm_30 %s 2>&1 | \ +// RUN: FileCheck -check-prefix SKYLAKE -check-prefix SM30 %s + These look redu

Re: [PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: test/Driver/cuda-march.cu:22-28 @@ +21,9 @@ + +// SM30:clang +// SM30: "-cc1" +// SM30-SAME: "-triple" "nvptx +// SM30-SAME: "-target-cpu" "sm_30" +// SM30: ptxas +// SM30-SAME: "--gpu-name" "sm_30" + +// HASWELL:clang You do

Re: [PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

2016-06-15 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. http://reviews.llvm.org/D21419

r272947 - Minor fixes for miamcpu-opt.c test

2016-06-16 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu Jun 16 15:16:49 2016 New Revision: 272947 URL: http://llvm.org/viewvc/llvm-project?rev=272947&view=rev Log: Minor fixes for miamcpu-opt.c test Added -no-canonical-prefixes to make cc1 binary name more predictable. Added appropriate REQUIRES keywords. Modified: cfe/trunk

Re: [libcxx] r273034 - Add Filesystem TS -- Complete

2016-06-20 Thread Artem Belevich via cfe-commits
Eric, Some tests appear to fail if the path to the tests' current directory has some symlinks in it. In my case source and build tree are in directory 'work' that's symlinked to from my home directory: /usr/local/home/tra/work -> /work/tra This causes multiple failures in libcxx tests. One exampl

r273289 - [aarch64] Update datalayout for aarch64 tests

2016-06-21 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Jun 21 12:35:31 2016 New Revision: 273289 URL: http://llvm.org/viewvc/llvm-project?rev=273289&view=rev Log: [aarch64] Update datalayout for aarch64 tests This brings the tests in sync with the changes in r273280. Modified: cfe/trunk/test/CodeGen/aarch64-type-sizes.c

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-30 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. Comment at: lib/Basic/Cuda.cpp:8-19 @@ +7,14 @@ + +const char *CudaVersionToString(CudaVersion V) { + switch (V) { + case CudaVersion::UNKNOWN: +return "unknown"; + cas
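The quoted snippet is cut off after the first `case`. The following is a self-contained sketch of the same enum-to-string pattern, not the patch itself: only `CudaVersion::UNKNOWN` and the `"unknown"` string appear in the quote above, and the remaining enumerators and strings are assumptions made for illustration.

```cpp
#include <cstring>

// Only UNKNOWN is visible in the quoted review; the other enumerators are
// assumed here for illustration.
enum class CudaVersion { UNKNOWN, CUDA_70, CUDA_75, CUDA_80 };

// Maps each enumerator to a printable version string. Omitting a `default`
// label lets the compiler warn (-Wswitch) if a new enumerator is added
// without a matching case.
const char *CudaVersionToString(CudaVersion V) {
  switch (V) {
  case CudaVersion::UNKNOWN:
    return "unknown";
  case CudaVersion::CUDA_70:
    return "7.0";
  case CudaVersion::CUDA_75:
    return "7.5";
  case CudaVersion::CUDA_80:
    return "8.0";
  }
  // Reached only if V holds a value outside the enumerator list.
  return "unknown";
}
```

The design choice worth noting is the fully covered switch with a fallback return after it, which keeps both the compiler's exhaustiveness warning and a defined result for out-of-range values.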

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-30 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Driver/Driver.cpp:1026-1028 @@ -1024,4 +1025,5 @@ } else if (CudaDeviceAction *CDA = dyn_cast(A)) { -os << '"' - << (CDA->getGpuArchName() ? CDA->getGpuArchName() : "(multiple archs)") +os << '"' << (CDA->getGpuArch() !=

Re: [PATCH] D21867: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

2016-06-30 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Driver/Driver.cpp:1026-1028 @@ -1024,4 +1025,5 @@ } else if (CudaDeviceAction *CDA = dyn_cast(A)) { -os << '"' - << (CDA->getGpuArchName() ? CDA->getGpuArchName() : "(multiple archs)") +os << '"' << (CDA->getGpuArch() !=

Re: [PATCH] D21913: [CUDA] Add additional testcases for EraseUnwantedCUDAMatches.

2016-07-06 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. http://reviews.llvm.org/D21913

Re: [PATCH] D21914: [CUDA] Use the multi-element remove function in EraseUnwantedCUDAMatches.

2016-07-06 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. http://reviews.llvm.org/D21914

Re: [PATCH] D21778: [CUDA] Add support for CUDA 8 and sm_60-62.

2016-07-06 Thread Artem Belevich via cfe-commits
tra added a comment. The changes look good. They will need to wait for corresponding patch on LLVM side to deal with new SM variants, though. Comment at: lib/Driver/ToolChains.cpp:1715 @@ -1714,2 +1714,3 @@ CudaPathCandidates.push_back(D.SysRoot + "/usr/local/cuda"); +

Re: [PATCH] D21869: [CUDA] Check that our CUDA install supports the requested architectures.

2016-07-07 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: include/clang/Basic/DiagnosticDriverKinds.td:32 @@ -29,1 +31,3 @@ + "Use --cuda-path to specify a different CUDA install, or pass " + "--nocuda-version-check.">; def err_drv_invalid_thread_model_for_target : Error< Is it s

Re: [PATCH] D21869: [CUDA] Check that our CUDA install supports the requested architectures.

2016-07-07 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. Comment at: lib/Driver/ToolChains.cpp:1798 @@ +1797,3 @@ +FS.getBufferForFile(InstallPath + "/version.txt"); +if (!VersionFile) { + // CUDA 7.0 doesn't have a

Re: [PATCH] D18171: [CUDA][OpenMP] Create generic offload action

2016-07-12 Thread Artem Belevich via cfe-commits
tra added a comment. Few minor nits and suggestions. Other than that I'm OK with the patch. Comment at: lib/Driver/Action.cpp:156 @@ +155,3 @@ + // Propagate info to the dependencies. + for (unsigned i = 0; i < getInputs().size(); ++i) +getInputs()[i]->propagateDeviceOfflo

Re: [PATCH] D18171: [CUDA][OpenMP] Create generic offload action

2016-07-13 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Driver/Action.cpp:191-202 @@ +190,14 @@ +const OffloadActionWorkTy &Work) const { + auto I = getInputs().begin(); + auto E = getInputs().end(); + if (I == E) +return; + + // Skip host action + if (HostTC) +++I; + + auto

Re: [PATCH] D18171: [CUDA][OpenMP] Create generic offload action

2016-07-13 Thread Artem Belevich via cfe-commits
tra added a comment. The changes look good. Now we just need some tests. Something along the lines of test/Driver/phases.c should do. http://reviews.llvm.org/D18171

Re: [PATCH] D18171: [CUDA][OpenMP] Create generic offload action

2016-07-13 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: test/Driver/cuda_phases.cu:1 @@ +1,2 @@ +// RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s 2>&1 \ +// RUN: | FileCheck -check-prefix=BIN %s Few words describing the test would be nic

Re: [PATCH] D18171: [CUDA][OpenMP] Create generic offload action

2016-07-13 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. Comment at: test/Driver/cuda_phases.cu:48 @@ +47,3 @@ +// +// Test two gpu architecture with complete compilation. +// architecture*s*. There are few more cop

[PATCH] D17779: [CUDA] Emit host-side 'shadows' for device-side global variables

2016-03-01 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, jingyue. tra added a subscriber: cfe-commits. .. and register them with CUDA runtime. This is needed for commonly used cudaMemcpy*() APIs that use address of host-side shadow to access their counterparts on device side. Fixes PR26340. htt

[PATCH] D17780: [CUDA] Do not generate unnecessary runtime init code.

2016-03-01 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, jingyue. tra added a subscriber: cfe-commits. Do not generate runtime init code if we don't have anything to init. http://reviews.llvm.org/D17780 Files: lib/CodeGen/CGCUDANV.cpp test/CodeGenCUDA/device-stub.cu Index: test/CodeGenCUDA/d

Re: [PATCH] D17779: [CUDA] Emit host-side 'shadows' for device-side global variables

2016-03-01 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 49561. tra marked 9 inline comments as done. tra added a comment. Addressed Justin's comments. http://reviews.llvm.org/D17779 Files: lib/CodeGen/CGCUDANV.cpp lib/CodeGen/CGCUDARuntime.h lib/CodeGen/CodeGenModule.cpp test/CodeGenCUDA/device-stub.cu tes

r262499 - [CUDA] Do not generate unnecessary runtime init code.

2016-03-02 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Mar 2 12:28:53 2016 New Revision: 262499 URL: http://llvm.org/viewvc/llvm-project?rev=262499&view=rev Log: [CUDA] Do not generate unnecessary runtime init code. Differential Revision: http://reviews.llvm.org/D17780 Modified: cfe/trunk/lib/CodeGen/CGCUDANV.cpp cfe/t

r262498 - [CUDA] Emit host-side 'shadows' for device-side global variables

2016-03-02 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Mar 2 12:28:50 2016 New Revision: 262498 URL: http://llvm.org/viewvc/llvm-project?rev=262498&view=rev Log: [CUDA] Emit host-side 'shadows' for device-side global variables ... and register them with CUDA runtime. This is needed for commonly used cudaMemcpy*() APIs that use

Re: [PATCH] D17779: [CUDA] Emit host-side 'shadows' for device-side global variables

2016-03-02 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL262498: [CUDA] Emit host-side 'shadows' for device-side global variables (authored by tra). Changed prior to commit: http://reviews.llvm.org/D17779?vs=49561&id=49645#toc Repository: rL LLVM http://r

Re: [PATCH] D17780: [CUDA] Do not generate unnecessary runtime init code.

2016-03-02 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL262499: [CUDA] Do not generate unnecessary runtime init code. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D17780?vs=49539&id=49646#toc Repository: rL LLVM http://reviews.llvm
