Author: tra
Date: Tue Feb 16 16:03:20 2016
New Revision: 261018
URL: http://llvm.org/viewvc/llvm-project?rev=261018&view=rev
Log:
[CUDA] pass debug options to ptxas.
ptxas optimizations are disabled if we need to generate debug info
as ptxas does not accept '-g' otherwise.
Differential Revision:
This revision was automatically updated to reflect the committed changes.
Closed by commit rL261018: [CUDA] pass debug options to ptxas. (authored by
tra).
Changed prior to commit:
http://reviews.llvm.org/D17111?vs=47680&id=48108#toc
Repository:
rL LLVM
http://reviews.llvm.org/D17111
Files
tra added inline comments.
Comment at: lib/Headers/cuda_builtin_vars.h:72
@@ -66,1 +71,3 @@
+ // uint3). This function is defined after we pull in vector_types.h.
+ __attribute__((device)) operator uint3() const;
private:
Considering that built-in variables ar
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
OK.
http://reviews.llvm.org/D17562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
tra created this revision.
tra added a reviewer: jlebar.
tra added a subscriber: cfe-commits.
__global__ functions are present on both host and device side,
so providing __host__ or __device__ overloads is not going to
do anything useful.
http://reviews.llvm.org/D17581
Files:
lib/Sema/SemaOver
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
OK.
http://reviews.llvm.org/D17561
Author: tra
Date: Wed Feb 24 15:54:45 2016
New Revision: 261778
URL: http://llvm.org/viewvc/llvm-project?rev=261778&view=rev
Log:
[CUDA] do not allow attribute-based overloading for __global__ functions.
__global__ functions are present on both host and device side,
so providing __host__ or __dev
tra added inline comments.
Comment at: lib/Basic/Targets.cpp:1642
@@ +1641,3 @@
+
+std::unique_ptr HostTarget(
+AllocateTarget(llvm::Triple(Opts.HostTriple), Opts));
You may want to make sure we don't recurse here if someone specifies host
triple to b
tra created this revision.
tra added reviewers: jlebar, rnk.
tra added a subscriber: cfe-commits.
__global__ functions are a special case in CUDA.
Even when the symbol would normally not be externally
visible according to C++ rules, they still must be visible
to host-side stub which launches the
This revision was automatically updated to reflect the committed changes.
Closed by commit rL268299: [CUDA] Make sure device-side __global__ functions
are always visible. (authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D19748?vs=55674&id=55885#toc
Repository:
rL LLVM
htt
Author: tra
Date: Mon May 2 15:30:03 2016
New Revision: 268299
URL: http://llvm.org/viewvc/llvm-project?rev=268299&view=rev
Log:
[CUDA] Make sure device-side __global__ functions are always visible.
__global__ functions are a special case in CUDA.
Even when the symbol would normally not be exte
tra created this revision.
tra added reviewers: jingyue, jlebar.
tra added a subscriber: cfe-commits.
According to CUDA programming guide (v7.5):
> E.2.9.4: Within the body of a __device__ or __global__ function, only
> __shared__ variables may be declared with static storage class.
http://re
tra added a comment.
In http://reviews.llvm.org/D20034#423945, @jlebar wrote:
> What are we supposed to do if we encounter a static __shared__ variable in an
> HD function? Presumably that also should be an error if we invoke the HD
> function from the device?
nvcc produces an error only of
tra created this revision.
tra added reviewers: jingyue, jlebar, rnk.
tra added a subscriber: cfe-commits.
While __shared__ variables look to the compiler like any other variable with
static storage class, they behave differently on the device side.
* one instance is created per block of GPUS, so stan
tra added a comment.
OK. Let's stick with __ldg for now.
http://reviews.llvm.org/D19990
tra updated this revision to Diff 56610.
tra added a comment.
Updated tests in CodeGenCUDA/address-spaces.cu
http://reviews.llvm.org/D20034
Files:
include/clang/Basic/DiagnosticSemaKinds.td
lib/Sema/SemaDecl.cpp
test/CodeGenCUDA/address-spaces.cu
test/CodeGenCUDA/device-var-init.cu
Ind
Author: tra
Date: Mon May 9 14:36:08 2016
New Revision: 268962
URL: http://llvm.org/viewvc/llvm-project?rev=268962&view=rev
Log:
[CUDA] Only __shared__ variables can be static local on device side.
According to CUDA programming guide (v7.5):
> E.2.9.4: Within the body of a device or global funct
This revision was automatically updated to reflect the committed changes.
Closed by commit rL268962: [CUDA] Only __shared__ variables can be static local
on device side. (authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D20034?vs=56610&id=56611#toc
Repository:
rL LLVM
http
tra updated this revision to Diff 56619.
tra added a comment.
Reworded comments.
Removed tests that no longer apply as we don't generate constructors for static
local variables on device side.
Empty constructor cases are already covered by
test/CodeGenCUDA/device-var-init.cu.
http://reviews.l
tra added a comment.
In http://reviews.llvm.org/D20039#424067, @jlebar wrote:
> While I think this is 100% the right thing to do, I am worried about breaking
> existing targets. Maybe we need an escape valve, at least until we get that
> sorted out? Unless you're pretty confident this isn't h
Author: tra
Date: Mon May 9 17:09:56 2016
New Revision: 268982
URL: http://llvm.org/viewvc/llvm-project?rev=268982&view=rev
Log:
[CUDA] Restrict init of local __shared__ variables to empty constructors only.
Allow only empty constructors for local __shared__ variables in a way
identical to restr
This revision was automatically updated to reflect the committed changes.
Closed by commit rL268982: [CUDA] Restrict init of local __shared__ variables
to empty constructors only. (authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D20039?vs=56619&id=56642#toc
Repository:
rL
tra created this revision.
tra added a reviewer: jlebar.
tra added a subscriber: cfe-commits.
Codegen tests for device-side variable initialization are a subset of the test
cases used to verify Sema's part of the job.
Including CodeGenCUDA/device-var-init.cu from SemaCUDA makes it easier to
keep both
tra created this revision.
tra added reviewers: jlebar, rsmith, jingyue.
tra added a subscriber: cfe-commits.
According to the CUDA Programming Guide (v7.5, E2.3.1):
> __device__, __constant__ and __shared__ variables defined in namespace
> scope, that are of class type, cannot have a non-empty constr
tra added inline comments.
Comment at: test/SemaCUDA/device-var-init.cu:7-11
@@ -6,9 +6,7 @@
// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fcuda-is-device -std=c++11 \
-// RUN: -fno-threadsafe-statics -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -triple nvptx64-nvi
tra created this revision.
tra added reviewers: jlebar, jordan_rose.
tra added a subscriber: cfe-commits.
GetOrCreateLLVMGlobal() accepts nullptr D, but in some cases we end up
dereferencing it without checking if it's non-null.
Fixes PR15492.
http://reviews.llvm.org/D20141
Files:
lib/CodeGe
tra added a comment.
I've never seen it triggered. The fix is based on the comment above the function
that D==nullptr is acceptable and the fact that we are checking D in other
places in this function.
The two cases where nullptr D is passed explicitly have something to do with
-fblocks, but that does
tra created this revision.
tra added reviewers: jlebar, jingyue.
tra added a subscriber: cfe-commits.
This matches default nvcc behavior and gives a substantial performance boost on
GPUs, where fmad is much cheaper than a separate add+mul.
http://reviews.llvm.org/D20341
Files:
lib/Frontend/Compiler
tra added a subscriber: scanon.
tra added a comment.
Things are even more interesting. -ffp-contract=fast is *not* what this change
does. :-)
We have two places where we can fuse FP instructions -- in clang and in LLVM
back-end.
Clang fuses add+mul into llvm.fmuladd intrinsic if -ffp-contract=o
tra added a comment.
In http://reviews.llvm.org/D20341#432494, @hfinkel wrote:
>
> That having been said, is this change the equivalent of -ffp-contract=fast or
> -ffp-contract=on? I think it is the latter and we want the former (i.e. where
> we let the backend be as aggressive as possible
tra added a comment.
OK. Consensus seems to be that -ffp-contract=fast is the way to go. I'll update
the patch.
I've just checked Steve's example with nvcc and indeed it fused mul+add.
http://reviews.llvm.org/D20341
tra updated this revision to Diff 57540.
tra added a comment.
Changed default to -ffp-contract=fast.
http://reviews.llvm.org/D20341
Files:
lib/Frontend/CompilerInvocation.cpp
Index: lib/Frontend/CompilerInvocation.cpp
===
--- li
tra updated this revision to Diff 57541.
tra added a comment.
Added test case.
Is there a better way to test that the correct options are passed to the back-end?
This test resorts to checking the assembly generated by the back-end, which is way too
far away from what actually needs testing.
http://reviews.llvm
tra added a comment.
I don't think using FMA throws away IEEE compliance.
IEEE 754-2008 says:
> A language standard should also define, and require implementations to
> provide, attributes that allow and
> disallow value-changing optimizations, separately or collectively, for a
> block. Thes
tra created this revision.
tra added a reviewer: jlebar.
tra added a subscriber: cfe-commits.
LLVM accepts them since r233575.
http://reviews.llvm.org/D20405
Files:
lib/Basic/Targets.cpp
lib/Driver/ToolChains.cpp
test/CodeGen/nvptx-cpus.c
Index: test/CodeGen/nvptx-cpus.c
Author: tra
Date: Thu May 19 12:47:47 2016
New Revision: 270084
URL: http://llvm.org/viewvc/llvm-project?rev=270084&view=rev
Log:
[CUDA] Allow sm_50,52,53 GPUs
LLVM accepts them since r233575.
Differential Revision: http://reviews.llvm.org/D20405
Modified:
cfe/trunk/lib/Basic/Targets.cpp
This revision was automatically updated to reflect the committed changes.
Closed by commit rL270084: [CUDA] Allow sm_50,52,53 GPUs (authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D20405?vs=57715&id=57822#toc
Repository:
rL LLVM
http://reviews.llvm.org/D20405
Files:
cfe
This revision was automatically updated to reflect the committed changes.
Closed by commit rL270086: Check for nullptr argument. (authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D20141?vs=56832&id=57825#toc
Repository:
rL LLVM
http://reviews.llvm.org/D20141
Files:
cfe/t
Author: tra
Date: Thu May 19 13:00:18 2016
New Revision: 270086
URL: http://llvm.org/viewvc/llvm-project?rev=270086&view=rev
Log:
Check for nullptr argument.
Addresses static analysis report in PR15492.
Differential Revision: http://reviews.llvm.org/D20141
Modified:
cfe/trunk/lib/CodeGen/Co
tra added a subscriber: chandlerc.
tra added a comment.
Short version of offline discussion with @chandlerc : Default of
-ffp-contract=fast for CUDA is fine.
http://reviews.llvm.org/D20341
Author: tra
Date: Thu May 19 13:44:45 2016
New Revision: 270094
URL: http://llvm.org/viewvc/llvm-project?rev=270094&view=rev
Log:
[CUDA] Enable fusing FP ops (-ffp-contract=fast) for CUDA by default.
This matches default nvcc behavior and gives substantial
performance boost on GPU where fmad is m
This revision was automatically updated to reflect the committed changes.
Closed by commit rL270094: [CUDA] Enable fusing FP ops (-ffp-contract=fast) for
CUDA by default. (authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D20341?vs=57541&id=57833#toc
Repository:
rL LLVM
htt
Author: tra
Date: Thu May 19 15:13:39 2016
New Revision: 270107
URL: http://llvm.org/viewvc/llvm-project?rev=270107&view=rev
Log:
[CUDA] Split device-var-init.cu tests into separate Sema and CodeGen parts.
Codegen tests for device-side variable initialization are a subset of the test
cases used to veri
This revision was automatically updated to reflect the committed changes.
Closed by commit rL270108: [CUDA] Do not allow non-empty destructors for global
device-side variables. (authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D20140?vs=56829&id=57849#toc
Repository:
rL LLV
This revision was automatically updated to reflect the committed changes.
Closed by commit rL270107: [CUDA] Split device-var-init.cu tests into separate
Sema and CodeGen parts. (authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D20139?vs=56824&id=57848#toc
Repository:
rL LLV
Author: tra
Date: Thu May 19 15:13:53 2016
New Revision: 270108
URL: http://llvm.org/viewvc/llvm-project?rev=270108&view=rev
Log:
[CUDA] Do not allow non-empty destructors for global device-side variables.
According to the CUDA Programming Guide (v7.5, E2.3.1):
> __device__, __constant__ and __shared
tra added a comment.
LGTM.
http://reviews.llvm.org/D18380
Hi,
FYI, cxx-indirect-call.cpp test fails on platforms with different
alignment. It may help to either use specific target or change your
patterns to accommodate other targets.
--Artem
TEST 'Clang :: Profile/cxx-indirect-call.cpp' FAILED
Script:
--
/usr/
Thanks for the quick fix. The test works on x86_64-unknown-linux-gnu now.
--Artem
On Tue, Mar 29, 2016 at 3:24 PM, Betul Buyukkurt
wrote:
> Hi Artem,
>
> I’ve uploaded a patch to remove the alignment.
>
> Thanks,
> -Betul
>
> *From:* Artem Belevich [mailto:t...@google.com]
> *Sent
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
One nit. LGTM otherwise.
Comment at: lib/Headers/__clang_cuda_math_forward_declares.h:1
@@ +1,2 @@
+/*=== __clang_cuda_cmath.h - Device-side CUDA cmath support ===
+
tra added inline comments.
Comment at: include/clang/Driver/Options.td:385
@@ -384,1 +384,3 @@
HelpText<"CUDA installation path">;
+def cuda_flush_denormals_to_zero : Flag<["--"], "cuda-flush-denormals-to-zero">,
+ HelpText<"Flush denormal floating point values to zero in CUD
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
LGTM.
http://reviews.llvm.org/D18672
tra created this revision.
tra added reviewers: jlebar, majnemer.
tra added a subscriber: cfe-commits.
Since r265060 LLVM infers correct __nvvm_reflect attributes.
http://reviews.llvm.org/D19074
Files:
lib/Headers/__clang_cuda_runtime_wrapper.h
Index: lib/Headers/__clang_cuda_runtime_wrappe
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
LGTM.
http://reviews.llvm.org/D19180
tra added a comment.
Help strings seem to be backwards.
LGTM otherwise.
Comment at: include/clang/Driver/Options.td:378
@@ -377,2 +377,3 @@
def cuda_device_only : Flag<["--"], "cuda-device-only">,
- HelpText<"Do device-side CUDA compilation only">;
+ HelpText<"Compile CUDA co
Author: tra
Date: Thu Apr 21 16:40:27 2016
New Revision: 267062
URL: http://llvm.org/viewvc/llvm-project?rev=267062&view=rev
Log:
[CUDA] removed unneeded __nvvm_reflect_anchor()
Since r265060 LLVM infers correct __nvvm_reflect attributes, so
explicit declaration of __nvvm_reflect() is no longer n
This revision was automatically updated to reflect the committed changes.
Closed by commit rL267062: [CUDA] removed unneeded __nvvm_reflect_anchor()
(authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D19074?vs=53618&id=54585#toc
Repository:
rL LLVM
http://reviews.llvm.org/D
tra added a comment.
LGTM.
http://reviews.llvm.org/D20493
tra added a comment.
I guess we would not be able to remove convergent from inline asm
automatically. Do we need a way to explicitly remove convergent from inline asm?
http://reviews.llvm.org/D20836
tra created this revision.
tra added reviewers: rsmith, jlebar.
tra added a subscriber: cfe-commits.
Fixes clang crash reported in PR27778.
http://reviews.llvm.org/D20985
Files:
lib/Sema/SemaDeclAttr.cpp
test/CodeGenCUDA/launch-bounds.cu
test/SemaCUDA/pr27778.cu
Index: test/SemaCUDA/pr277
tra added a comment.
In http://reviews.llvm.org/D20985#448822, @jlebar wrote:
> How is this different from test/SemaCUDA/launch_bounds.cu:27-28? It does
>
> const int constint = 512;
> __launch_bounds__(constint) void TestConstInt(void);
>
>
> which looks verbatim the same as this testcas
tra updated this revision to Diff 59624.
tra added a comment.
Addressed Justin's comments.
http://reviews.llvm.org/D20985
Files:
lib/Sema/SemaDeclAttr.cpp
test/CodeGenCUDA/launch-bounds.cu
test/SemaCUDA/pr27778.cu
Index: test/SemaCUDA/pr27778.cu
==
tra updated this revision to Diff 59631.
tra marked an inline comment as done.
tra added a comment.
Rephrased comments
http://reviews.llvm.org/D20985
Files:
lib/Sema/SemaDeclAttr.cpp
test/CodeGenCUDA/launch-bounds.cu
test/SemaCUDA/pr27778.cu
Index: test/SemaCUDA/pr27778.cu
==
tra marked 3 inline comments as done.
Comment at: lib/Sema/SemaDeclAttr.cpp:4046
@@ +4045,3 @@
+// non-nullptr Expr result on success. Returns nullptr otherwise and
+// may output an error.
+static Expr *makeLaunchBoundsArgExpr(Sema &S, Expr *E,
jlebar wrote:
> Pr
tra updated this revision to Diff 59778.
tra added a comment.
Replaced if() with assert() to catch unexpected PerformCopyInitialization()
failures.
http://reviews.llvm.org/D20985
Files:
lib/Sema/SemaDeclAttr.cpp
test/CodeGenCUDA/launch-bounds.cu
test/SemaCUDA/pr27778.cu
Index: test/Sema
tra marked an inline comment as done.
tra added a comment.
http://reviews.llvm.org/D20985
Author: tra
Date: Mon Jun 6 17:54:57 2016
New Revision: 271951
URL: http://llvm.org/viewvc/llvm-project?rev=271951&view=rev
Log:
[CUDA] Add implicit conversion of __launch_bounds__ arguments to rvalue.
Fixes clang crash reported in PR27778.
Differential Revision: http://reviews.llvm.org/D20985
This revision was automatically updated to reflect the committed changes.
Closed by commit rL271951: [CUDA] Add implicit conversion of __launch_bounds__
arguments to rvalue. (authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D20985?vs=59778&id=59800#toc
Repository:
rL LLVM
tra added inline comments.
Comment at: lib/Headers/__clang_cuda_intrinsics.h:77-80
@@ +76,6 @@
+_Static_assert(sizeof(__tmp) == sizeof(__in)); \
+memcpy(&__tmp, &__in, sizeof(__in)); \
+__tmp = ::__FnN
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
LGTM.
http://reviews.llvm.org/D18170
Author: tra
Date: Mon Jun 13 13:44:22 2016
New Revision: 272573
URL: http://llvm.org/viewvc/llvm-project?rev=272573&view=rev
Log:
Test fix -- use captured call result instead of hardcoded %2.
Modified:
cfe/trunk/test/CodeGen/bitscan-builtins.c
Modified: cfe/trunk/test/CodeGen/bitscan-builtin
Miklos,
TokenName produces unused variable warning in builds with asserts disabled.
Could you add LLVM_ATTRIBUTE_UNUSED to it?
Thanks,
--Artem
On Wed, Jun 15, 2016 at 11:35 AM, Miklos Vajna via cfe-commits <
cfe-commits@lists.llvm.org> wrote:
> Author: vmiklos
> Date: Wed Jun 15 13:35:41 2016
Author: tra
Date: Wed Jun 15 18:04:42 2016
New Revision: 272852
URL: http://llvm.org/viewvc/llvm-project?rev=272852&view=rev
Log:
[clang-tools] mark TokenName as unused
Otherwise it produces compiler warning if asserts are disabled.
Modified:
clang-tools-extra/trunk/clang-rename/USRLocFinder
Should be fixed in r272852
--Artem
On Wed, Jun 15, 2016 at 3:16 PM, Artem Belevich wrote:
> Miklos,
>
> TokenName produces unused variable warning in builds with asserts disabled.
> Could you add LLVM_ATTRIBUTE_UNUSED to it?
>
> Thanks,
> --Artem
>
>
> On Wed, Jun 15, 2016 at 11:35 AM, Miklos V
tra added inline comments.
Comment at: test/Driver/cuda-march.cu:15-16
@@ +14,4 @@
+
+// RUN: %clang -### -target x86_64-linux-gnu -c -march=skylake --cuda-gpu-arch=sm_30 %s 2>&1 | \
+// RUN: FileCheck -check-prefix SKYLAKE -check-prefix SM30 %s
+
These look redu
tra added inline comments.
Comment at: test/Driver/cuda-march.cu:22-28
@@ +21,9 @@
+
+// SM30:clang
+// SM30: "-cc1"
+// SM30-SAME: "-triple" "nvptx
+// SM30-SAME: "-target-cpu" "sm_30"
+// SM30: ptxas
+// SM30-SAME: "--gpu-name" "sm_30"
+
+// HASWELL:clang
You do
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
LGTM.
http://reviews.llvm.org/D21419
Author: tra
Date: Thu Jun 16 15:16:49 2016
New Revision: 272947
URL: http://llvm.org/viewvc/llvm-project?rev=272947&view=rev
Log:
Minor fixes for miamcu-opt.c test
Added -no-canonical-prefixes to make cc1 binary name more predictable.
Added appropriate REQUIRES keywords.
Modified:
cfe/trunk
Eric,
Some tests appear to fail if the path to the tests' current directory has
some symlinks in it.
In my case source and build tree are in directory 'work' that's symlinked
to from my home directory:
/usr/local/home/tra/work -> /work/tra
This causes multiple failures in libcxx tests. One exampl
Author: tra
Date: Tue Jun 21 12:35:31 2016
New Revision: 273289
URL: http://llvm.org/viewvc/llvm-project?rev=273289&view=rev
Log:
[aarch64] Update datalayout for aarch64 tests
This brings the tests in sync with the changes in r273280.
Modified:
cfe/trunk/test/CodeGen/aarch64-type-sizes.c
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
LGTM.
Comment at: lib/Basic/Cuda.cpp:8-19
@@ +7,14 @@
+
+const char *CudaVersionToString(CudaVersion V) {
+ switch (V) {
+ case CudaVersion::UNKNOWN:
+return "unknown";
+ cas
tra added inline comments.
Comment at: lib/Driver/Driver.cpp:1026-1028
@@ -1024,4 +1025,5 @@
} else if (CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {
-os << '"'
- << (CDA->getGpuArchName() ? CDA->getGpuArchName() : "(multiple archs)")
+os << '"' << (CDA->getGpuArch() !=
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
LGTM.
http://reviews.llvm.org/D21913
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
LGTM.
http://reviews.llvm.org/D21914
tra added a comment.
The changes look good.
They will need to wait for the corresponding patch on the LLVM side to deal with
the new SM variants, though.
Comment at: lib/Driver/ToolChains.cpp:1715
@@ -1714,2 +1714,3 @@
CudaPathCandidates.push_back(D.SysRoot + "/usr/local/cuda");
+
tra added inline comments.
Comment at: include/clang/Basic/DiagnosticDriverKinds.td:32
@@ -29,1 +31,3 @@
+ "Use --cuda-path to specify a different CUDA install, or pass "
+ "--nocuda-version-check.">;
def err_drv_invalid_thread_model_for_target : Error<
Is it s
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
LGTM.
Comment at: lib/Driver/ToolChains.cpp:1798
@@ +1797,3 @@
+FS.getBufferForFile(InstallPath + "/version.txt");
+if (!VersionFile) {
+ // CUDA 7.0 doesn't have a
tra added a comment.
A few minor nits and suggestions. Other than that I'm OK with the patch.
Comment at: lib/Driver/Action.cpp:156
@@ +155,3 @@
+ // Propagate info to the dependencies.
+ for (unsigned i = 0; i < getInputs().size(); ++i)
+getInputs()[i]->propagateDeviceOfflo
tra added inline comments.
Comment at: lib/Driver/Action.cpp:191-202
@@ +190,14 @@
+const OffloadActionWorkTy &Work) const {
+ auto I = getInputs().begin();
+ auto E = getInputs().end();
+ if (I == E)
+return;
+
+ // Skip host action
+ if (HostTC)
+++I;
+
+ auto
tra added a comment.
The changes look good.
Now we just need some tests. Something along the lines of test/Driver/phases.c
should do.
http://reviews.llvm.org/D18171
tra added inline comments.
Comment at: test/Driver/cuda_phases.cu:1
@@ +1,2 @@
+// RUN: %clang -target powerpc64le-ibm-linux-gnu -ccc-print-phases --cuda-gpu-arch=sm_30 %s 2>&1 \
+// RUN: | FileCheck -check-prefix=BIN %s
Few words describing the test would be nic
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
LGTM.
Comment at: test/Driver/cuda_phases.cu:48
@@ +47,3 @@
+//
+// Test two gpu architecture with complete compilation.
+//
architecture*s*.
There are few more cop
tra created this revision.
tra added reviewers: jlebar, jingyue.
tra added a subscriber: cfe-commits.
.. and register them with CUDA runtime.
This is needed for commonly used cudaMemcpy*() APIs that use address of
host-side shadow to access their counterparts on device side.
Fixes PR26340.
htt
tra created this revision.
tra added reviewers: jlebar, jingyue.
tra added a subscriber: cfe-commits.
Do not generate runtime init code if we don't have anything to init.
http://reviews.llvm.org/D17780
Files:
lib/CodeGen/CGCUDANV.cpp
test/CodeGenCUDA/device-stub.cu
Index: test/CodeGenCUDA/d
tra updated this revision to Diff 49561.
tra marked 9 inline comments as done.
tra added a comment.
Addressed Justin's comments.
http://reviews.llvm.org/D17779
Files:
lib/CodeGen/CGCUDANV.cpp
lib/CodeGen/CGCUDARuntime.h
lib/CodeGen/CodeGenModule.cpp
test/CodeGenCUDA/device-stub.cu
tes
Author: tra
Date: Wed Mar 2 12:28:53 2016
New Revision: 262499
URL: http://llvm.org/viewvc/llvm-project?rev=262499&view=rev
Log:
[CUDA] Do not generate unnecessary runtime init code.
Differential Revision: http://reviews.llvm.org/D17780
Modified:
cfe/trunk/lib/CodeGen/CGCUDANV.cpp
cfe/t
Author: tra
Date: Wed Mar 2 12:28:50 2016
New Revision: 262498
URL: http://llvm.org/viewvc/llvm-project?rev=262498&view=rev
Log:
[CUDA] Emit host-side 'shadows' for device-side global variables
... and register them with CUDA runtime.
This is needed for commonly used cudaMemcpy*() APIs that use
This revision was automatically updated to reflect the committed changes.
Closed by commit rL262498: [CUDA] Emit host-side 'shadows' for device-side
global variables (authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D17779?vs=49561&id=49645#toc
Repository:
rL LLVM
http://r
This revision was automatically updated to reflect the committed changes.
Closed by commit rL262499: [CUDA] Do not generate unnecessary runtime init
code. (authored by tra).
Changed prior to commit:
http://reviews.llvm.org/D17780?vs=49539&id=49646#toc
Repository:
rL LLVM
http://reviews.llvm