r332619 - [CUDA] Make std::min/max work when compiling in C++14 mode with a C++11 stdlib.

2018-05-17 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Thu May 17 09:12:42 2018
New Revision: 332619

URL: http://llvm.org/viewvc/llvm-project?rev=332619&view=rev
Log:
[CUDA] Make std::min/max work when compiling in C++14 mode with a C++11 stdlib.

Reviewers: rsmith

Subscribers: sanjoy, cfe-commits, tra

Differential Revision: https://reviews.llvm.org/D46993

Modified:
cfe/trunk/lib/Headers/cuda_wrappers/algorithm

Modified: cfe/trunk/lib/Headers/cuda_wrappers/algorithm
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/cuda_wrappers/algorithm?rev=332619&r1=332618&r2=332619&view=diff
==============================================================================
--- cfe/trunk/lib/Headers/cuda_wrappers/algorithm (original)
+++ cfe/trunk/lib/Headers/cuda_wrappers/algorithm Thu May 17 09:12:42 2018
@@ -24,28 +24,36 @@
 #ifndef __CLANG_CUDA_WRAPPERS_ALGORITHM
 #define __CLANG_CUDA_WRAPPERS_ALGORITHM
 
-// This header defines __device__ overloads of std::min/max, but only if we're
-// <= C++11.  In C++14, these functions are constexpr, and so are implicitly
-// __host__ __device__.
+// This header defines __device__ overloads of std::min/max.
 //
-// We don't support the initializer_list overloads because
-// initializer_list::begin() and end() are not __host__ __device__ functions.
+// Ideally we'd declare these functions only if we're <= C++11.  In C++14,
+// these functions are constexpr, and so are implicitly __host__ __device__.
 //
-// When compiling in C++14 mode, we could force std::min/max to have different
-// implementations for host and device, by declaring the device overloads
-// before the constexpr overloads appear.  We choose not to do this because
-
-//  a) why write our own implementation when we can use one from the standard
-// library? and
-//  b) libstdc++ is evil and declares min/max inside a header that is included
-// *before* we include <algorithm>.  So we'd have to unconditionally
-// declare our __device__ overloads of min/max, but that would pollute
-// things for people who choose not to include .
+// However, the compiler being in C++14 mode does not imply that the standard
+// library supports C++14.  There is no macro we can test to check that the
+// stdlib has constexpr std::min/max.  Thus we have to unconditionally define
+// our device overloads.
+//
+// A host+device function cannot be overloaded, and a constexpr function
+// implicitly becomes host+device if there's no explicit host or device
+// overload preceding it.  So the simple thing to do would be to declare our
+// device min/max overloads, and then #include_next <algorithm>.  This way our
+// device overloads would come first, and so if we have a C++14 stdlib, its
+// min/max won't become host+device and conflict with our device overloads.
+//
+// But that also doesn't work.  libstdc++ is evil and declares std::min/max in
+// an internal header that is included *before* <algorithm>.  Thus by the time
+// we're inside of this file, std::min/max may already have been declared, and
+// thus we can't prevent them from becoming host+device if they're constexpr.
+//
+// Therefore we perpetrate the following hack: We mark our __device__ overloads
+// with __attribute__((enable_if(true, ""))).  This causes the signature of the
+// function to change without changing anything else about it.  (Except that
+// overload resolution will prefer it over the __host__ __device__ version
+// rather than considering them equally good).
 
 #include_next <algorithm>
 
-#if __cplusplus <= 201103L
-
 // We need to define these overloads in exactly the namespace our standard
 // library uses (including the right inline namespace), otherwise they won't be
 // picked up by other functions in the standard library (e.g. functions in
@@ -60,24 +68,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
 template <class __T, class __Cmp>
+__attribute__((enable_if(true, "")))
 inline __device__ const __T &
 max(const __T &__a, const __T &__b, __Cmp __cmp) {
   return __cmp(__a, __b) ? __b : __a;
 }
 
 template <class __T>
+__attribute__((enable_if(true, "")))
 inline __device__ const __T &
 max(const __T &__a, const __T &__b) {
   return __a < __b ? __b : __a;
 }
 
 template <class __T, class __Cmp>
+__attribute__((enable_if(true, "")))
 inline __device__ const __T &
 min(const __T &__a, const __T &__b, __Cmp __cmp) {
   return __cmp(__b, __a) ? __b : __a;
 }
 
 template <class __T>
+__attribute__((enable_if(true, "")))
 inline __device__ const __T &
 min(const __T &__a, const __T &__b) {
   return __a < __b ? __a : __b;
@@ -92,5 +104,4 @@ _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 #endif
 
-#endif // __cplusplus <= 201103L
 #endif // __CLANG_CUDA_WRAPPERS_ALGORITHM


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r332621 - [CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces.

2018-05-17 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Thu May 17 09:15:07 2018
New Revision: 332621

URL: http://llvm.org/viewvc/llvm-project?rev=332621&view=rev
Log:
[CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces.

Summary:
Previously this triggered a -Wundefined-internal warning.  But it's not
an undefined variable -- any variable of this form is a pointer to the
base of the GPU core's shared memory.

Reviewers: tra

Subscribers: sanjoy, rsmith

Differential Revision: https://reviews.llvm.org/D46782

Modified:
cfe/trunk/include/clang/AST/Decl.h
cfe/trunk/lib/AST/Decl.cpp
cfe/trunk/lib/Sema/Sema.cpp
cfe/trunk/test/SemaCUDA/extern-shared.cu

Modified: cfe/trunk/include/clang/AST/Decl.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/AST/Decl.h?rev=332621&r1=332620&r2=332621&view=diff
==============================================================================
--- cfe/trunk/include/clang/AST/Decl.h (original)
+++ cfe/trunk/include/clang/AST/Decl.h Thu May 17 09:15:07 2018
@@ -1456,6 +1456,11 @@ public:
 
   void setDescribedVarTemplate(VarTemplateDecl *Template);
 
+  // Is this variable known to have a definition somewhere in the complete
+  // program? This may be true even if the declaration has internal linkage and
+  // has no definition within this source file.
+  bool isKnownToBeDefined() const;
+
   // Implement isa/cast/dyncast/etc.
   static bool classof(const Decl *D) { return classofKind(D->getKind()); }
   static bool classofKind(Kind K) { return K >= firstVar && K <= lastVar; }

Modified: cfe/trunk/lib/AST/Decl.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/AST/Decl.cpp?rev=332621&r1=332620&r2=332621&view=diff
==============================================================================
--- cfe/trunk/lib/AST/Decl.cpp (original)
+++ cfe/trunk/lib/AST/Decl.cpp Thu May 17 09:15:07 2018
@@ -2432,6 +2432,23 @@ void VarDecl::setDescribedVarTemplate(Va
   getASTContext().setTemplateOrSpecializationInfo(this, Template);
 }
 
+bool VarDecl::isKnownToBeDefined() const {
+  const auto &LangOpts = getASTContext().getLangOpts();
+  // In CUDA mode without relocatable device code, variables of form 'extern
+  // __shared__ Foo foo[]' are pointers to the base of the GPU core's shared
+  // memory pool.  These are never undefined variables, even if they appear
+  // inside of an anon namespace or static function.
+  //
+  // With CUDA relocatable device code enabled, these variables don't get
+  // special handling; they're treated like regular extern variables.
+  if (LangOpts.CUDA && !LangOpts.CUDARelocatableDeviceCode &&
+  hasExternalStorage() && hasAttr<CUDASharedAttr>() &&
+  isa<IncompleteArrayType>(getType()))
+return true;
+
+  return hasDefinition();
+}
+
 MemberSpecializationInfo *VarDecl::getMemberSpecializationInfo() const {
   if (isStaticDataMember())
 // FIXME: Remove ?

Modified: cfe/trunk/lib/Sema/Sema.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/Sema.cpp?rev=332621&r1=332620&r2=332621&view=diff
==============================================================================
--- cfe/trunk/lib/Sema/Sema.cpp (original)
+++ cfe/trunk/lib/Sema/Sema.cpp Thu May 17 09:15:07 2018
@@ -653,6 +653,11 @@ void Sema::getUndefinedButUsed(
   !isExternalWithNoLinkageType(VD) &&
   !VD->getMostRecentDecl()->isInline())
 continue;
+
+  // Skip VarDecls that lack formal definitions but which we know are in
+  // fact defined somewhere.
+  if (VD->isKnownToBeDefined())
+continue;
 }
 
 Undefined.push_back(std::make_pair(ND, UndefinedUse.second));

Modified: cfe/trunk/test/SemaCUDA/extern-shared.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/extern-shared.cu?rev=332621&r1=332620&r2=332621&view=diff
==============================================================================
--- cfe/trunk/test/SemaCUDA/extern-shared.cu (original)
+++ cfe/trunk/test/SemaCUDA/extern-shared.cu Thu May 17 09:15:07 2018
@@ -1,10 +1,10 @@
-// RUN: %clang_cc1 -fsyntax-only -verify %s
-// RUN: %clang_cc1 -fsyntax-only -fcuda-is-device -verify %s
+// RUN: %clang_cc1 -fsyntax-only -Wundefined-internal -verify %s
+// RUN: %clang_cc1 -fsyntax-only -Wundefined-internal -fcuda-is-device -verify %s
 
-// RUN: %clang_cc1 -fsyntax-only -fcuda-rdc -verify=rdc %s
-// RUN: %clang_cc1 -fsyntax-only -fcuda-is-device -fcuda-rdc -verify=rdc %s
-// These declarations are fine in separate compilation mode:
-// rdc-no-diagnostics
+// RUN: %clang_cc1 -fsyntax-only -Wundefined-internal -fcuda-rdc -verify=rdc %s
+// RUN: %clang_cc1 -fsyntax-only -Wundefined-internal -fcuda-is-device -fcuda-rdc -verify=rdc %s
+
+// Most of these declarations are fine in separate compilation mode.
 
 #include "Inputs/cuda.h"
 
@@ -26,3 +26,18 @@ __host__ __device__ void bar() {
 extern __shared__ int global; // expected-error {{__shared__ variable 'global' cannot be 'extern'}}
 extern __shared__ int global_arr[]; // ok
 extern __shared__ int global_arr1[1]; // expected-

r314142 - Revert "[NVPTX] added match.{any, all}.sync instructions, intrinsics & builtins.", rL314135.

2017-09-25 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Mon Sep 25 12:41:56 2017
New Revision: 314142

URL: http://llvm.org/viewvc/llvm-project?rev=314142&view=rev
Log:
Revert "[NVPTX] added match.{any,all}.sync instructions, intrinsics & 
builtins.", rL314135.

Causing assertion failures on macos:

> Assertion failed: (Num < NumOperands && "Invalid child # of SDNode!"),
> function getOperand, file
> /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm/include/llvm/CodeGen/SelectionDAGNodes.h,
> line 835.

http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/42739/testReport/LLVM/CodeGen_NVPTX/surf_read_cuda_ll/

Modified:
cfe/trunk/include/clang/Basic/BuiltinsNVPTX.def
cfe/trunk/lib/CodeGen/CGBuiltin.cpp
cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h
cfe/trunk/test/CodeGen/builtins-nvptx-ptx60.cu

Modified: cfe/trunk/include/clang/Basic/BuiltinsNVPTX.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/BuiltinsNVPTX.def?rev=314142&r1=314141&r2=314142&view=diff
==============================================================================
--- cfe/trunk/include/clang/Basic/BuiltinsNVPTX.def (original)
+++ cfe/trunk/include/clang/Basic/BuiltinsNVPTX.def Mon Sep 25 12:41:56 2017
@@ -413,13 +413,6 @@ TARGET_BUILTIN(__nvvm_vote_any_sync, "bU
 TARGET_BUILTIN(__nvvm_vote_uni_sync, "bUib", "", "ptx60")
 TARGET_BUILTIN(__nvvm_vote_ballot_sync, "UiUib", "", "ptx60")
 
-// Match
-TARGET_BUILTIN(__nvvm_match_any_sync_i32, "UiUiUi", "", "ptx60")
-TARGET_BUILTIN(__nvvm_match_any_sync_i64, "WiUiWi", "", "ptx60")
-// These return a pair {value, predicate}, which requires custom lowering.
-TARGET_BUILTIN(__nvvm_match_all_sync_i32p, "UiUiUii*", "", "ptx60")
-TARGET_BUILTIN(__nvvm_match_all_sync_i64p, "WiUiWii*", "", "ptx60")
-
 // Membar
 
 BUILTIN(__nvvm_membar_cta, "v", "")

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=314142&r1=314141&r2=314142&view=diff
==============================================================================
--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Mon Sep 25 12:41:56 2017
@@ -9589,21 +9589,6 @@ Value *CodeGenFunction::EmitNVPTXBuiltin
 {Ptr->getType()->getPointerElementType(), Ptr->getType()}),
 {Ptr, EmitScalarExpr(E->getArg(1)), EmitScalarExpr(E->getArg(2))});
   }
-  case NVPTX::BI__nvvm_match_all_sync_i32p:
-  case NVPTX::BI__nvvm_match_all_sync_i64p: {
-Value *Mask = EmitScalarExpr(E->getArg(0));
-Value *Val = EmitScalarExpr(E->getArg(1));
-Address PredOutPtr = EmitPointerWithAlignment(E->getArg(2));
-Value *ResultPair = Builder.CreateCall(
-CGM.getIntrinsic(BuiltinID == NVPTX::BI__nvvm_match_all_sync_i32p
- ? Intrinsic::nvvm_match_all_sync_i32p
- : Intrinsic::nvvm_match_all_sync_i64p),
-{Mask, Val});
-Value *Pred = Builder.CreateZExt(Builder.CreateExtractValue(ResultPair, 1),
- PredOutPtr.getElementType());
-Builder.CreateStore(Pred, PredOutPtr);
-return Builder.CreateExtractValue(ResultPair, 0);
-  }
   default:
 return nullptr;
   }

Modified: cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h?rev=314142&r1=314141&r2=314142&view=diff
==============================================================================
--- cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h Mon Sep 25 12:41:56 2017
@@ -92,9 +92,8 @@ __MAKE_SHUFFLES(__shfl_xor, __nvvm_shfl_
 
 #endif // !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 300
 
-#if CUDA_VERSION >= 9000
-#if (!defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 300)
 // __shfl_sync_* variants available in CUDA-9
+#if CUDA_VERSION >= 9000 && (!defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 300)
 #pragma push_macro("__MAKE_SYNC_SHUFFLES")
 #define __MAKE_SYNC_SHUFFLES(__FnName, __IntIntrinsic, __FloatIntrinsic,      \
                              __Mask)                                          \
@@ -188,33 +187,8 @@ inline __device__ unsigned int __ballot_
 
 inline __device__ unsigned int activemask() { return __nvvm_vote_ballot(1); }
 
-#endif // !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 300
-
-// Define __match* builtins CUDA-9 headers expect to see.
-#if !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 700
-inline __device__ unsigned int __match32_any_sync(unsigned int mask,
-  unsigned int value) {
-  return __nvvm_match_any_sync_i32(mask, value);
-}
-
-inline __device__ unsigned long long
-__match64_any_sync(unsigned int mask, unsigned long long value) {
-  return __nvvm_match_any_sync_i64(mask, value);
-}
-
-inline __device__ unsigned int
-__match32_all_sync(unsigned int mask, unsigned int value, int *pred) {
-  return __nvvm_match_all_sync

Re: [PATCH] D55456: [CUDA] added missing 'inline' for the functions defined in the header.

2018-12-07 Thread Justin Lebar via cfe-commits
Lgtm

On Fri, Dec 7, 2018, 1:12 PM Artem Belevich via Phabricator <
revi...@reviews.llvm.org> wrote:

> tra created this revision.
> tra added a reviewer: jlebar.
> Herald added subscribers: bixia, sanjoy.
>
> https://reviews.llvm.org/D55456
>
> Files:
>   clang/lib/Headers/cuda_wrappers/new
>
>
> Index: clang/lib/Headers/cuda_wrappers/new
> ===================================================================
> --- clang/lib/Headers/cuda_wrappers/new
> +++ clang/lib/Headers/cuda_wrappers/new
> @@ -73,10 +73,12 @@
>
>  // Sized delete, C++14 only.
>  #if __cplusplus >= 201402L
> -__device__ void operator delete(void *ptr, __SIZE_TYPE__ size) CUDA_NOEXCEPT {
> +__device__ inline void operator delete(void *ptr,
> +   __SIZE_TYPE__ size) CUDA_NOEXCEPT {
>::operator delete(ptr);
>  }
> -__device__ void operator delete[](void *ptr, __SIZE_TYPE__ size) CUDA_NOEXCEPT {
> +__device__ inline void operator delete[](void *ptr,
> + __SIZE_TYPE__ size) CUDA_NOEXCEPT {
>::operator delete(ptr);
>  }
>  #endif
>
>
>


Re: [PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-02 Thread Justin Lebar via cfe-commits
> So, actually, I wonder if that's not the right answer. We generally allow
different overloads to have different return types. What if, for example,
the return type on the host is __float128 and on the device it's
`MyLongFloatTy`?

The problem is that conceptually compiling for host/device does not create
a new set of overloads.

When we compile for (say) host, we build a full AST for all functions,
including device functions, and that AST must pass sema checks.  This is
significant for example because when compiling for device we need to know
which kernel templates were instantiated on the host side, so we know which
kernels to emit.

Here's a contrived example.

```
 __host__ int8 bar();
__device__ int16 bar();
__host__ __device__ auto foo() -> decltype(bar()) {}

template <typename T> __global__ void kernel();

void launch_kernel() {
  kernel<<<...>>>();
}
```

This template instantiation had better be the same when compiling for host
and device.

That's contrived, but consider this much simpler case:

```
void host_fn() {
  static_assert(sizeof(decltype(foo())) == sizeof(int8));
}
```

If we let foo return int16 in device mode, this static_assert will fail
when compiling in *device* mode even though host_fn is never called on the
device.  https://gcc.godbolt.org/z/gYq901

Why are we doing sema checks on the host code when compiling for device?
See contrived example above, we need quite a bit of info about the host
code to infer those templates.

On Thu, May 2, 2019 at 7:05 PM Hal Finkel via Phabricator <
revi...@reviews.llvm.org> wrote:

> hfinkel added a comment.
>
> In D61458#1488970 , @jlebar
> wrote:
>
> > Here's one for you:
> >
> >   __host__ float bar();
> >   __device__ int bar();
> >   __host__ __device__ auto foo() -> decltype(bar()) {}
> >
> >
> > What is the return type of `foo`?  :)
> >
> > I don't believe the right answer is, "float when compiling for host, int
> when compiling for device."
>
>
> So, actually, I wonder if that's not the right answer. We generally allow
> different overloads to have different return types. What if, for example,
> the return type on the host is __float128 and on the device it's
> `MyLongFloatTy`?
>
> > I'd be happy if we said this was an error, so long as it's well-defined
> what exactly we're disallowing.  But I bet @rsmith can come up with
> substantially more evil testcases than this.
>
>
>
>
> Repository:
>   rG LLVM Github Monorepo
>
> CHANGES SINCE LAST ACTION
>   https://reviews.llvm.org/D61458/new/
>
> https://reviews.llvm.org/D61458
>
>
>
>


Re: [PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-02 Thread Justin Lebar via cfe-commits
> In any case, it seems like your examples argue for disallowing a
return-type mismatch between host and device overloads, not disallowing
observing the type?

Oh no, we have to allow return-type mismatches between host and device
overloads, that is a common thing in CUDA code I've seen.  You can safely
observe this difference *so long as you're inside of a function*.  This is
because we have this caller-sensitive function parsing thing.  When parsing
a __host__ __device__ function, we look at the caller to understand what
context we're in.

What I think you can't do is observe the return-type mismatch between host
and device overloads *from outside of a function*, e.g. from within a
trailing return type.

But perhaps rsmith or another expert can take my attempt at a
contract above and trap me in a Faustian contradiction.

On Thu, May 2, 2019 at 7:47 PM Finkel, Hal J.  wrote:

> Thanks, Justin. It sees like we have the standard set of options: We can
> disallow the mismatch. We can allow it with a warning. We can allow it
> without a warning. We can say that if the mismatch contributes to the type
> of a kernel function, that's illformed (NDR).
>
> In any case, it seems like your examples argue for disallowing a
> return-type mismatch between host and device overloads, not disallowing
> observing the type? Or maybe disallowing observing the type only when
> there's a mismatch?
>
>  -Hal
>
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
> --
> *From:* Justin Lebar 
> *Sent:* Thursday, May 2, 2019 9:16 PM
> *To:* reviews+d61458+public+f6ea501465ad5...@reviews.llvm.org
> *Cc:* michael.hl...@gmail.com; Artem Belevich; John McCall; Liu, Yaxun
> (Sam); Finkel, Hal J.; Richard Smith; Clang Commits; mlek...@skidmore.edu;
> blitzrak...@gmail.com; Han Shen
> *Subject:* Re: [PATCH] D61458: [hip] Relax CUDA call restriction within
> `decltype` context.
>
> > So, actually, I wonder if that's not the right answer. We generally
> allow different overloads to have different return types. What if, for
> example, the return type on the host is __float128 and on the device it's
> `MyLongFloatTy`?
>
> The problem is that conceptually compiling for host/device does not create
> a new set of overloads.
>
> When we compile for (say) host, we build a full AST for all functions,
> including device functions, and that AST must pass sema checks.  This is
> significant for example because when compiling for device we need to know
> which kernel templates were instantiated on the host side, so we know which
> kernels to emit.
>
> Here's a contrived example.
>
> ```
>  __host__ int8 bar();
> __device__ int16 bar();
> __host__ __device__ auto foo() -> decltype(bar()) {}
>
> template <typename T> __global__ void kernel();
>
> void launch_kernel() {
>   kernel<<<...>>>();
> }
> ```
>
> This template instantiation had better be the same when compiling for host
> and device.
>
> That's contrived, but consider this much simpler case:
>
> ```
> void host_fn() {
>   static_assert(sizeof(decltype(foo())) == sizeof(int8));
> }
> ```
>
> If we let foo return int16 in device mode, this static_assert will fail
> when compiling in *device* mode even though host_fn is never called on the
> device.  https://gcc.godbolt.org/z/gYq901
>
> Why are we doing sema checks on the host code when compiling for device?
> See contrived example above, we need quite a bit of info about the host
> code to infer those templates.
>
> On Thu, May 2, 2019 at 7:05 PM Hal Finkel via Phabricator <
> revi...@reviews.llvm.org> wrote:
>
> hfinkel added a comment.
>
> In D61458#1488970 , @jlebar
> wrote:
>
> > Here's one for you:
> >
> >   __host__ float bar();
> >   __device__ int bar();
> >   __host__ __device__ auto foo() -> decltype(bar()) {}
> >
> >
> > What is the return type of `foo`?  :)
> >
> > I don't believe the right answer is, "float when compiling for host, int
> when compiling for device."
>
>
> So, actually, I wonder if that's not the right answer. We generally allow
> different overloads to have different return types. What if, for example,
> the return type on the host is __float128 and on the device it's
> `MyLongFloatTy`?
>
> > I'd be happy if we said this was an error, so long as it's well-defined
> what exactly we're disallowing.  But I bet @rsmith can come up with
> substantially more evil testcases than this.
>
>
>
>
> Repository:
>   rG LLVM Github Monorepo
>
> CHANGES SINCE LAST ACTION
>   https://reviews.llvm.org/D61458/new/
>
> https://reviews.llvm.org/D61458
>
>
>
>


r312681 - [CUDA] Add device overloads for non-placement new/delete.

2017-09-06 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Wed Sep  6 17:37:20 2017
New Revision: 312681

URL: http://llvm.org/viewvc/llvm-project?rev=312681&view=rev
Log:
[CUDA] Add device overloads for non-placement new/delete.

Summary:
Tests have to live in the test-suite, and so will come in a separate
patch.

Fixes PR34360.

Reviewers: tra

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D37539

Modified:
cfe/trunk/lib/Headers/cuda_wrappers/new

Modified: cfe/trunk/lib/Headers/cuda_wrappers/new
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/cuda_wrappers/new?rev=312681&r1=312680&r2=312681&view=diff
==============================================================================
--- cfe/trunk/lib/Headers/cuda_wrappers/new (original)
+++ cfe/trunk/lib/Headers/cuda_wrappers/new Wed Sep  6 17:37:20 2017
@@ -26,7 +26,6 @@
 
 #include_next <new>
 
-// Device overrides for placement new and delete.
 #pragma push_macro("CUDA_NOEXCEPT")
 #if __cplusplus >= 201103L
 #define CUDA_NOEXCEPT noexcept
@@ -34,6 +33,55 @@
 #define CUDA_NOEXCEPT
 #endif
 
+// Device overrides for non-placement new and delete.
+__device__ inline void *operator new(__SIZE_TYPE__ size) {
+  if (size == 0) {
+size = 1;
+  }
+  return ::malloc(size);
+}
+__device__ inline void *operator new(__SIZE_TYPE__ size,
+ const std::nothrow_t &) CUDA_NOEXCEPT {
+  return ::operator new(size);
+}
+
+__device__ inline void *operator new[](__SIZE_TYPE__ size) {
+  return ::operator new(size);
+}
+__device__ inline void *operator new[](__SIZE_TYPE__ size,
+   const std::nothrow_t &) {
+  return ::operator new(size);
+}
+
+__device__ inline void operator delete(void* ptr) CUDA_NOEXCEPT {
+  if (ptr) {
+::free(ptr);
+  }
+}
+__device__ inline void operator delete(void *ptr,
+   const std::nothrow_t &) CUDA_NOEXCEPT {
+  ::operator delete(ptr);
+}
+
+__device__ inline void operator delete[](void* ptr) CUDA_NOEXCEPT {
+  ::operator delete(ptr);
+}
+__device__ inline void operator delete[](void *ptr,
+ const std::nothrow_t &) CUDA_NOEXCEPT 
{
+  ::operator delete(ptr);
+}
+
+// Sized delete, C++14 only.
+#if __cplusplus >= 201402L
+__device__ void operator delete(void *ptr, __SIZE_TYPE__ size) CUDA_NOEXCEPT {
+  ::operator delete(ptr);
+}
+__device__ void operator delete[](void *ptr, __SIZE_TYPE__ size) CUDA_NOEXCEPT {
+  ::operator delete(ptr);
+}
+#endif
+
+// Device overrides for placement new and delete.
 __device__ inline void *operator new(__SIZE_TYPE__, void *__ptr) CUDA_NOEXCEPT {
   return __ptr;
 }
@@ -42,6 +90,7 @@ __device__ inline void *operator new[](_
 }
 __device__ inline void operator delete(void *, void *) CUDA_NOEXCEPT {}
 __device__ inline void operator delete[](void *, void *) CUDA_NOEXCEPT {}
+
 #pragma pop_macro("CUDA_NOEXCEPT")
 
 #endif // include guard




r312736 - [CUDA] When compilation fails, print the compilation mode.

2017-09-07 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Thu Sep  7 11:37:16 2017
New Revision: 312736

URL: http://llvm.org/viewvc/llvm-project?rev=312736&view=rev
Log:
[CUDA] When compilation fails, print the compilation mode.

Summary:
That is, instead of "1 error generated", we now say "1 error generated
when compiling for sm_35".

This (partially) solves a usability footgun wherein e.g. users call a
function that's only defined on sm_60 when compiling for sm_35, and they
get an unhelpful error message.

Reviewers: tra

Subscribers: sanjoy, cfe-commits

Differential Revision: https://reviews.llvm.org/D37548

Added:
cfe/trunk/test/SemaCUDA/error-includes-mode.cu
Modified:
cfe/trunk/lib/Frontend/CompilerInstance.cpp

Modified: cfe/trunk/lib/Frontend/CompilerInstance.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/CompilerInstance.cpp?rev=312736&r1=312735&r2=312736&view=diff
==============================================================================
--- cfe/trunk/lib/Frontend/CompilerInstance.cpp (original)
+++ cfe/trunk/lib/Frontend/CompilerInstance.cpp Thu Sep  7 11:37:16 2017
@@ -1003,8 +1003,17 @@ bool CompilerInstance::ExecuteAction(Fro
   OS << " and ";
 if (NumErrors)
   OS << NumErrors << " error" << (NumErrors == 1 ? "" : "s");
-if (NumWarnings || NumErrors)
-  OS << " generated.\n";
+if (NumWarnings || NumErrors) {
+  OS << " generated";
+  if (getLangOpts().CUDA) {
+if (!getLangOpts().CUDAIsDevice) {
+  OS << " when compiling for host";
+} else {
+  OS << " when compiling for " << getTargetOpts().CPU;
+}
+  }
+  OS << ".\n";
+}
   }
 
   if (getFrontendOpts().ShowStats) {

Added: cfe/trunk/test/SemaCUDA/error-includes-mode.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/error-includes-mode.cu?rev=312736&view=auto
==============================================================================
--- cfe/trunk/test/SemaCUDA/error-includes-mode.cu (added)
+++ cfe/trunk/test/SemaCUDA/error-includes-mode.cu Thu Sep  7 11:37:16 2017
@@ -0,0 +1,7 @@
+// RUN: not %clang_cc1 -fsyntax-only %s 2>&1 | FileCheck --check-prefix HOST %s
+// RUN: not %clang_cc1 -triple nvptx-unknown-unknown -target-cpu sm_35 \
+// RUN:   -fcuda-is-device -fsyntax-only %s 2>&1 | FileCheck --check-prefix SM35 %s
+
+// HOST: 1 error generated when compiling for host
+// SM35: 1 error generated when compiling for sm_35
+error;




r336026 - [CUDA] Make __host__/__device__ min/max overloads constexpr in C++14.

2018-06-29 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Jun 29 15:28:09 2018
New Revision: 336026

URL: http://llvm.org/viewvc/llvm-project?rev=336026&view=rev
Log:
[CUDA] Make __host__/__device__ min/max overloads constexpr in C++14.

Summary: Tests in a separate change to the test-suite.

Reviewers: rsmith, tra

Subscribers: lahwaacz, sanjoy, cfe-commits

Differential Revision: https://reviews.llvm.org/D48151

Modified:
cfe/trunk/lib/Headers/cuda_wrappers/algorithm

Modified: cfe/trunk/lib/Headers/cuda_wrappers/algorithm
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/cuda_wrappers/algorithm?rev=336026&r1=336025&r2=336026&view=diff
==============================================================================
--- cfe/trunk/lib/Headers/cuda_wrappers/algorithm (original)
+++ cfe/trunk/lib/Headers/cuda_wrappers/algorithm Fri Jun 29 15:28:09 2018
@@ -67,34 +67,43 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 #endif
 
+#pragma push_macro("_CPP14_CONSTEXPR")
+#if __cplusplus >= 201402L
+#define _CPP14_CONSTEXPR constexpr
+#else
+#define _CPP14_CONSTEXPR
+#endif
+
 template <class __T, class __Cmp>
 __attribute__((enable_if(true, "")))
-inline __host__ __device__ const __T &
+inline _CPP14_CONSTEXPR __host__ __device__ const __T &
 max(const __T &__a, const __T &__b, __Cmp __cmp) {
   return __cmp(__a, __b) ? __b : __a;
 }
 
 template <class __T>
 __attribute__((enable_if(true, "")))
-inline __host__ __device__ const __T &
+inline _CPP14_CONSTEXPR __host__ __device__ const __T &
 max(const __T &__a, const __T &__b) {
   return __a < __b ? __b : __a;
 }
 
 template <class __T, class __Cmp>
 __attribute__((enable_if(true, "")))
-inline __host__ __device__ const __T &
+inline _CPP14_CONSTEXPR __host__ __device__ const __T &
 min(const __T &__a, const __T &__b, __Cmp __cmp) {
   return __cmp(__b, __a) ? __b : __a;
 }
 
 template <class __T>
 __attribute__((enable_if(true, "")))
-inline __host__ __device__ const __T &
+inline _CPP14_CONSTEXPR __host__ __device__ const __T &
 min(const __T &__a, const __T &__b) {
   return __a < __b ? __a : __b;
 }
 
+#pragma pop_macro("_CPP14_CONSTEXPR")
+
 #ifdef _LIBCPP_END_NAMESPACE_STD
 _LIBCPP_END_NAMESPACE_STD
 #else




r336025 - [CUDA] Make min/max shims host+device.

2018-06-29 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Jun 29 15:27:56 2018
New Revision: 336025

URL: http://llvm.org/viewvc/llvm-project?rev=336025&view=rev
Log:
[CUDA] Make min/max shims host+device.

Summary:
Fixes PR37753: min/max can't be called from __host__ __device__
functions in C++14 mode.

Testcase in a separate test-suite commit.

Reviewers: rsmith

Subscribers: sanjoy, lahwaacz, cfe-commits

Differential Revision: https://reviews.llvm.org/D48036

Modified:
cfe/trunk/lib/Headers/cuda_wrappers/algorithm

Modified: cfe/trunk/lib/Headers/cuda_wrappers/algorithm
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/cuda_wrappers/algorithm?rev=336025&r1=336024&r2=336025&view=diff
==============================================================================
--- cfe/trunk/lib/Headers/cuda_wrappers/algorithm (original)
+++ cfe/trunk/lib/Headers/cuda_wrappers/algorithm Fri Jun 29 15:27:56 2018
@@ -69,28 +69,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 template <class __T, class __Cmp>
 __attribute__((enable_if(true, "")))
-inline __device__ const __T &
+inline __host__ __device__ const __T &
 max(const __T &__a, const __T &__b, __Cmp __cmp) {
   return __cmp(__a, __b) ? __b : __a;
 }
 
 template <class __T>
 __attribute__((enable_if(true, "")))
-inline __device__ const __T &
+inline __host__ __device__ const __T &
 max(const __T &__a, const __T &__b) {
   return __a < __b ? __b : __a;
 }
 
 template <class __T, class __Cmp>
 __attribute__((enable_if(true, "")))
-inline __device__ const __T &
+inline __host__ __device__ const __T &
 min(const __T &__a, const __T &__b, __Cmp __cmp) {
   return __cmp(__b, __a) ? __b : __a;
 }
 
 template <class __T>
 __attribute__((enable_if(true, "")))
-inline __device__ const __T &
+inline __host__ __device__ const __T &
 min(const __T &__a, const __T &__b) {
   return __a < __b ? __a : __b;
 }




Re: [PATCH] D9168: [NVPTX] Check if callsite is defined when computing argument alignment

2016-09-20 Thread Justin Lebar via cfe-commits
jlebar added a subscriber: jlebar.
jlebar added a comment.

FWIW I have run into this in the past and just not managed to muster up the 
energy to fix it.  So, thank you!


https://reviews.llvm.org/D9168





Re: [PATCH] D9168: [NVPTX] Check if callsite is defined when computing argument alignment

2016-09-20 Thread Justin Lebar via cfe-commits
jlebar added a comment.

> I was not able to figure out how to comandeer a revision, so i just went 
> ahead and pushed it.


Under "leap into action", one of the options is to commandeer the revision.


https://reviews.llvm.org/D9168





[PATCH] D24975: [CUDA] Add #pragma clang force_cuda_host_device_{begin, end} pragmas.

2016-09-27 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: rsmith.
jlebar added subscribers: cfe-commits, jhen, tra.

These cause us to consider all functions in-between to be __host__
__device__.

You can nest these pragmas; you just can't have more 'end's than
'begin's.

https://reviews.llvm.org/D24975

Files:
  clang/include/clang/Basic/DiagnosticParseKinds.td
  clang/include/clang/Parse/Parser.h
  clang/include/clang/Sema/Sema.h
  clang/lib/Parse/ParsePragma.cpp
  clang/lib/Sema/SemaCUDA.cpp
  clang/test/Parser/cuda-force-host-device.cu

Index: clang/test/Parser/cuda-force-host-device.cu
===
--- /dev/null
+++ clang/test/Parser/cuda-force-host-device.cu
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+
+// Check the force_cuda_host_device_{begin,end} pragmas.
+
+#pragma clang force_cuda_host_device_begin
+void f();
+#pragma clang force_cuda_host_device_begin
+void g();
+#pragma clang force_cuda_host_device_end
+void h();
+#pragma clang force_cuda_host_device_end
+
+void i(); // expected-note {{not viable}}
+
+void host() {
+  f();
+  g();
+  h();
+  i();
+}
+
+__attribute__((device)) void device() {
+  f();
+  g();
+  h();
+  i(); // expected-error {{no matching function}}
+}
Index: clang/lib/Sema/SemaCUDA.cpp
===
--- clang/lib/Sema/SemaCUDA.cpp
+++ clang/lib/Sema/SemaCUDA.cpp
@@ -23,6 +23,19 @@
 #include "llvm/ADT/SmallVector.h"
 using namespace clang;
 
+void Sema::PushForceCUDAHostDevice() {
+  assert(getLangOpts().CUDA && "May be called only for CUDA compilations.");
+  ForceCUDAHostDeviceDepth++;
+}
+
+bool Sema::PopForceCUDAHostDevice() {
+  assert(getLangOpts().CUDA && "May be called only for CUDA compilations.");
+  if (ForceCUDAHostDeviceDepth == 0)
+    return false;
+  ForceCUDAHostDeviceDepth--;
+  return true;
+}
+
 ExprResult Sema::ActOnCUDAExecConfigExpr(Scope *S, SourceLocation LLLLoc,
                                          MultiExprArg ExecConfig,
                                          SourceLocation GGGLoc) {
@@ -441,9 +454,23 @@
 //  * a __device__ function with this signature was already declared, in which
 //case in which case we output an error, unless the __device__ decl is in a
 //system header, in which case we leave the constexpr function unattributed.
+//
+// In addition, all function decls are treated as __host__ __device__ when
+// ForceCUDAHostDeviceDepth > 0 (corresponding to code within a
+//   #pragma clang force_cuda_host_device_begin/end
+// pair).
 void Sema::maybeAddCUDAHostDeviceAttrs(Scope *S, FunctionDecl *NewD,
const LookupResult &Previous) {
   assert(getLangOpts().CUDA && "May be called only for CUDA compilations.");
+
+  if (ForceCUDAHostDeviceDepth > 0) {
+    if (!NewD->hasAttr<CUDAHostAttr>())
+      NewD->addAttr(CUDAHostAttr::CreateImplicit(Context));
+    if (!NewD->hasAttr<CUDADeviceAttr>())
+      NewD->addAttr(CUDADeviceAttr::CreateImplicit(Context));
+    return;
+  }
+
   if (!getLangOpts().CUDAHostDeviceConstexpr || !NewD->isConstexpr() ||
       NewD->isVariadic() || NewD->hasAttr<CUDAHostAttr>() ||
       NewD->hasAttr<CUDADeviceAttr>() || NewD->hasAttr<CUDAGlobalAttr>())
Index: clang/lib/Parse/ParsePragma.cpp
===
--- clang/lib/Parse/ParsePragma.cpp
+++ clang/lib/Parse/ParsePragma.cpp
@@ -167,6 +167,26 @@
 Token &FirstToken) override;
 };
 
+struct PragmaForceCUDAHostDeviceStartHandler : public PragmaHandler {
+  PragmaForceCUDAHostDeviceStartHandler(Sema &Actions)
+  : PragmaHandler("force_cuda_host_device_begin"), Actions(Actions) {}
+  void HandlePragma(Preprocessor &PP, PragmaIntroducerKind Introducer,
+                    Token &NameTok) override;
+
+private:
+  Sema &Actions;
+};
+
+struct PragmaForceCUDAHostDeviceEndHandler : public PragmaHandler {
+  PragmaForceCUDAHostDeviceEndHandler(Sema &Actions)
+  : PragmaHandler("force_cuda_host_device_end"), Actions(Actions) {}
+  void HandlePragma(Preprocessor &PP, PragmaIntroducerKind Introducer,
+                    Token &NameTok) override;
+
+private:
+  Sema &Actions;
+};
+
 }  // end namespace
 
 void Parser::initializePragmaHandlers() {
@@ -239,6 +259,15 @@
 PP.AddPragmaHandler(MSIntrinsic.get());
   }
 
+  if (getLangOpts().CUDA) {
+    CUDAForceHostDeviceStartHandler.reset(
+        new PragmaForceCUDAHostDeviceStartHandler(Actions));
+    PP.AddPragmaHandler("clang", CUDAForceHostDeviceStartHandler.get());
+    CUDAForceHostDeviceEndHandler.reset(
+        new PragmaForceCUDAHostDeviceEndHandler(Actions));
+    PP.AddPragmaHandler("clang", CUDAForceHostDeviceEndHandler.get());
+  }
+
   OptimizeHandler.reset(new PragmaOptimizeHandler(Actions));
   PP.AddPragmaHandler("clang", OptimizeHandler.get());
 
@@ -309,6 +338,13 @@
 MSIntrinsic.reset();
   }
 
+  if (getLangOpts().CUDA) {
+    PP.RemovePragmaHandler("clang", CUDAForceHostDeviceStartHandler.get());
+CUDAForceHostDe

[PATCH] D24978: [CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h.

2016-09-27 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: tra.
jlebar added a subscriber: cfe-commits.
Herald added subscribers: mgorny, beanz.

This matches the idiom we use for our other CUDA wrapper headers.

https://reviews.llvm.org/D24978

Files:
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/lib/Headers/CMakeLists.txt
  clang/lib/Headers/__clang_cuda_builtin_vars.h
  clang/lib/Headers/__clang_cuda_runtime_wrapper.h
  clang/lib/Headers/cuda_builtin_vars.h
  clang/test/CodeGenCUDA/cuda-builtin-vars.cu
  clang/test/SemaCUDA/cuda-builtin-vars.cu

Index: clang/test/SemaCUDA/cuda-builtin-vars.cu
===
--- clang/test/SemaCUDA/cuda-builtin-vars.cu
+++ clang/test/SemaCUDA/cuda-builtin-vars.cu
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 "-triple" "nvptx-nvidia-cuda" -fcuda-is-device -fsyntax-only -verify %s
 
-#include "cuda_builtin_vars.h"
+#include "__clang_cuda_builtin_vars.h"
 __attribute__((global))
 void kernel(int *out) {
   int i = 0;
@@ -34,20 +34,20 @@
 
   out[i++] = warpSize;
   warpSize = 0; // expected-error {{cannot assign to variable 'warpSize' with const-qualified type 'const int'}}
-  // expected-note@cuda_builtin_vars.h:* {{variable 'warpSize' declared const here}}
+  // expected-note@__clang_cuda_builtin_vars.h:* {{variable 'warpSize' declared const here}}
 
   // Make sure we can't construct or assign to the special variables.
   __cuda_builtin_threadIdx_t x; // expected-error {{calling a private constructor of class '__cuda_builtin_threadIdx_t'}}
-  // expected-note@cuda_builtin_vars.h:* {{declared private here}}
+  // expected-note@__clang_cuda_builtin_vars.h:* {{declared private here}}
 
   __cuda_builtin_threadIdx_t y = threadIdx; // expected-error {{calling a private constructor of class '__cuda_builtin_threadIdx_t'}}
-  // expected-note@cuda_builtin_vars.h:* {{declared private here}}
+  // expected-note@__clang_cuda_builtin_vars.h:* {{declared private here}}
 
   threadIdx = threadIdx; // expected-error {{'operator=' is a private member of '__cuda_builtin_threadIdx_t'}}
-  // expected-note@cuda_builtin_vars.h:* {{declared private here}}
+  // expected-note@__clang_cuda_builtin_vars.h:* {{declared private here}}
 
   void *ptr = &threadIdx; // expected-error {{'operator&' is a private member of '__cuda_builtin_threadIdx_t'}}
-  // expected-note@cuda_builtin_vars.h:* {{declared private here}}
+  // expected-note@__clang_cuda_builtin_vars.h:* {{declared private here}}
 
   // Following line should've caused an error as one is not allowed to
   // take address of a built-in variable in CUDA. Alas there's no way
Index: clang/test/CodeGenCUDA/cuda-builtin-vars.cu
===
--- clang/test/CodeGenCUDA/cuda-builtin-vars.cu
+++ clang/test/CodeGenCUDA/cuda-builtin-vars.cu
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 "-triple" "nvptx-nvidia-cuda" -emit-llvm -fcuda-is-device -o - %s | FileCheck %s
 
-#include "cuda_builtin_vars.h"
+#include "__clang_cuda_builtin_vars.h"
 
 // CHECK: define void @_Z6kernelPi(i32* %out)
 __attribute__((global))
Index: clang/lib/Headers/__clang_cuda_runtime_wrapper.h
===
--- clang/lib/Headers/__clang_cuda_runtime_wrapper.h
+++ clang/lib/Headers/__clang_cuda_runtime_wrapper.h
@@ -72,9 +72,9 @@
 #define __CUDA_ARCH__ 350
 #endif
 
-#include "cuda_builtin_vars.h"
+#include "__clang_cuda_builtin_vars.h"
 
-// No need for device_launch_parameters.h as cuda_builtin_vars.h above
+// No need for device_launch_parameters.h as __clang_cuda_builtin_vars.h above
 // has taken care of builtin variables declared in the file.
 #define __DEVICE_LAUNCH_PARAMETERS_H__
 
@@ -267,8 +267,8 @@
 }
 } // namespace std
 
-// Out-of-line implementations from cuda_builtin_vars.h.  These need to come
-// after we've pulled in the definition of uint3 and dim3.
+// Out-of-line implementations from __clang_cuda_builtin_vars.h.  These need to
+// come after we've pulled in the definition of uint3 and dim3.
 
 __device__ inline __cuda_builtin_threadIdx_t::operator uint3() const {
   uint3 ret;
@@ -299,10 +299,10 @@
 
 // curand_mtgp32_kernel helpfully redeclares blockDim and threadIdx in host
 // mode, giving them their "proper" types of dim3 and uint3.  This is
-// incompatible with the types we give in cuda_builtin_vars.h.  As as hack,
-// force-include the header (nvcc doesn't include it by default) but redefine
-// dim3 and uint3 to our builtin types.  (Thankfully dim3 and uint3 are only
-// used here for the redeclarations of blockDim and threadIdx.)
+// incompatible with the types we give in __clang_cuda_builtin_vars.h.  As a
+// hack, force-include the header (nvcc doesn't include it by default) but
+// redefine dim3 and uint3 to our builtin types.  (Thankfully dim3 and uint3 are
+// only used here for the redeclarations of blockDim and threadIdx.)
 #pragma push_macro("dim3")
 #pragma push_macro("uint3")
 #define dim3 __cuda

[PATCH] D24977: [CUDA] Declare our __device__ math functions in the same inline namespace as our standard library.

2016-09-27 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: tra.
jlebar added subscribers: jhen, cfe-commits.

Currently we declare our inline __device__ math functions in namespace
std.  But libstdc++ and libc++ declare these functions in an inline
namespace inside namespace std.  We need to match this because, in a
later patch, we want to get e.g. <complex> to use our device overloads,
and it only will if those overloads are in the right inline namespace.

https://reviews.llvm.org/D24977

Files:
  clang/lib/Headers/__clang_cuda_cmath.h
  clang/lib/Headers/__clang_cuda_math_forward_declares.h

Index: clang/lib/Headers/__clang_cuda_math_forward_declares.h
===
--- clang/lib/Headers/__clang_cuda_math_forward_declares.h
+++ clang/lib/Headers/__clang_cuda_math_forward_declares.h
@@ -185,7 +185,19 @@
 __DEVICE__ double trunc(double);
 __DEVICE__ float trunc(float);
 
+// We need to define these overloads in exactly the namespace our standard
+// library uses (including the right inline namespace), otherwise they won't be
+// picked up by other functions in the standard library (e.g. functions in
+// <complex>).  Thus the ugliness below.
+#ifdef _LIBCPP_BEGIN_NAMESPACE_STD
+_LIBCPP_BEGIN_NAMESPACE_STD
+#else
 namespace std {
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+#endif
+#endif
+
 using ::abs;
 using ::acos;
 using ::acosh;
@@ -259,7 +271,15 @@
 using ::tanh;
 using ::tgamma;
 using ::trunc;
+
+#ifdef _LIBCPP_END_NAMESPACE_STD
+_LIBCPP_END_NAMESPACE_STD
+#else
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_END_NAMESPACE_VERSION
+#endif
 } // namespace std
+#endif
 
 #pragma pop_macro("__DEVICE__")
 
Index: clang/lib/Headers/__clang_cuda_cmath.h
===
--- clang/lib/Headers/__clang_cuda_cmath.h
+++ clang/lib/Headers/__clang_cuda_cmath.h
@@ -316,7 +316,19 @@
   return std::scalbn((double)__x, __exp);
 }
 
+// We need to define these overloads in exactly the namespace our standard
+// library uses (including the right inline namespace), otherwise they won't be
+// picked up by other functions in the standard library (e.g. functions in
+// <complex>).  Thus the ugliness below.
+#ifdef _LIBCPP_BEGIN_NAMESPACE_STD
+_LIBCPP_BEGIN_NAMESPACE_STD
+#else
 namespace std {
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+#endif
+#endif
+
 // Pull the new overloads we defined above into namespace std.
 using ::acos;
 using ::acosh;
@@ -451,7 +463,15 @@
 using ::tanhf;
 using ::tgammaf;
 using ::truncf;
-}
+
+#ifdef _LIBCPP_END_NAMESPACE_STD
+_LIBCPP_END_NAMESPACE_STD
+#else
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_END_NAMESPACE_VERSION
+#endif
+} // namespace std
+#endif
 
 #undef __DEVICE__
 



[PATCH] D24979: [CUDA] Support <complex> and std::min/max on the device.

2016-09-27 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: tra.
jlebar added subscribers: cfe-commits, jhen.
Herald added subscribers: mgorny, beanz.

We do this by wrapping <algorithm> and <complex>.

Tests are in the test-suite.   support to come separately.

https://reviews.llvm.org/D24979

Files:
  clang/lib/Driver/ToolChains.cpp
  clang/lib/Headers/CMakeLists.txt
  clang/lib/Headers/__clang_cuda_complex_builtins.h
  clang/lib/Headers/__clang_cuda_runtime_wrapper.h
  clang/lib/Headers/cuda_wrappers/algorithm
  clang/lib/Headers/cuda_wrappers/complex

Index: clang/lib/Headers/cuda_wrappers/complex
===
--- /dev/null
+++ clang/lib/Headers/cuda_wrappers/complex
@@ -0,0 +1,79 @@
+/*=== complex - CUDA wrapper for <complex> --===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#pragma once
+
+// Wrapper around <complex> that forces its functions to be __host__
+// __device__.
+
+// First, include host-only headers we think are likely to be included by
+// <complex>, so that the pragma below only applies to <complex> itself.
+#if __cplusplus >= 201103L
+#include <type_traits>
+#endif
+#include <stdexcept>
+#include <cmath>
+#include <sstream>
+
+// Next, include our <algorithm> wrapper, to ensure that device overloads of
+// std::min/max are available.
+#include <algorithm>
+
+#pragma clang force_cuda_host_device_begin
+
+// When compiling for device, ask libstdc++ to use its own implementations of
+// complex functions, rather than calling builtins (which resolve to library
+// functions that don't exist when compiling CUDA device code).
+//
+// This is a little dicey, because it causes libstdc++ to define a different
+// set of overloads on host and device.
+//
+//   // Present only when compiling for host.
+//   __host__ __device__ complex<float> sin(const complex<float>& x) {
+//     return __builtin_csinf(x);
+//   }
+//
+//   // Present when compiling for host and for device.
+//   template <class T>
+//   __host__ __device__ complex<T> sin(const complex<T>& x) {
+//     return complex<T>(sin(x.real()) * cosh(x.imag()),
+//                       cos(x.real()) * sinh(x.imag()));
+//   }
+//
+// This is safe because when compiling for device, all function calls in
+// __host__ code to sin() will still resolve to *something*, even if they don't
+// resolve to the same function as they resolve to when compiling for host.  We
+// don't care that they don't resolve to the right function because we won't
+// codegen this host code when compiling for device.
+
+#pragma push_macro("_GLIBCXX_USE_C99_COMPLEX")
+#pragma push_macro("_GLIBCXX_USE_C99_COMPLEX_TR1")
+#define _GLIBCXX_USE_C99_COMPLEX 0
+#define _GLIBCXX_USE_C99_COMPLEX_TR1 0
+
+#include_next <complex>
+
+#pragma pop_macro("_GLIBCXX_USE_C99_COMPLEX_TR1")
+#pragma pop_macro("_GLIBCXX_USE_C99_COMPLEX")
+
+#pragma clang force_cuda_host_device_end
Index: clang/lib/Headers/cuda_wrappers/algorithm
===
--- /dev/null
+++ clang/lib/Headers/cuda_wrappers/algorithm
@@ -0,0 +1,96 @@
+/*=== algorithm - CUDA wrapper for <algorithm> ===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEM

Re: [PATCH] D24977: [CUDA] Declare our __device__ math functions in the same inline namespace as our standard library.

2016-09-27 Thread Justin Lebar via cfe-commits
jlebar added a comment.

> That is way too much knowledge about details of standard library 
> implementation.


Honestly I think this looks a lot scarier than it is.  Or, to be specific, I 
think we are already relying on implementation details much more implicit and 
fragile than what is explicit here.  See the git log of all of the changes I've 
had to make to this file before now to make us compatible with all of the 
standard libraries we want to support.

> If it changes, I suspect users will end up with a rather uninformative error.


You mean, if the standard libraries change the macro they're using here?  If 
so, we'll fall back to plain "namespace std", which is what we had before, so 
it should work fine.  In fact the only way I think this can affect things one 
way or another is if the standard library does

  namespace std {
  inline namespace foo {
  void some_fn(std::complex<float>);
  
  void test() {
    some_fn(std::complex<float>());
  }
  } // inline namespace foo
  }  // namespace std

ADL on some_fn will prefer the some_fn inside std::foo, so if we declare an 
overload of some_fn inside plain namespace std, it won't match.

> We could whitelist libc++/libstdc++ version we've tested with and produce 
> #warning "Unsupported standard library version" if we see something else.


In practice, we are testing with versions of libstdc++ that are so much newer 
than what anyone has on their systems, I am not exactly worried about this.

But I think more generally these questions are probably better handled in a 
separate patch?  Like I say, we are already rather tightly-coupled to the 
standard libraries -- I don't think this patch changes that reality too much.


https://reviews.llvm.org/D24977





Re: [PATCH] D24979: [CUDA] Support <complex> and std::min/max on the device.

2016-09-27 Thread Justin Lebar via cfe-commits
jlebar added a comment.

> I' personally would prefer to force-include these files. I suspect it will 
> not change things much as we already include a lot.


We have already had bugs filed by users whose root cause was that we #included 
more things than nvcc #includes.  I know exact compatibility with nvcc is not 
our goal, but unless we have a good reason I don't think we should break 
compatibility with nvcc *and* the C++ standard by force-including additional 
system headers.

> This looks like fix-includes and it may be somewhat shaky if users start 
> messing with include paths.


We add this include path first, so I think it should be OK?  What do you think, 
@echristo.


https://reviews.llvm.org/D24979





Re: [PATCH] D24946: [CUDA] Added support for CUDA-8

2016-09-27 Thread Justin Lebar via cfe-commits
jlebar added inline comments.


Comment at: lib/Headers/__clang_cuda_runtime_wrapper.h:139
@@ -137,1 +138,3 @@
 
+// CUDA 8.0.41 relies on __USE_FAST_MATH__ and __CUDA_PREC_DIV's values
+// Previous versions used to check thether they are defined or not.

Nit, missing period.


Comment at: lib/Headers/__clang_cuda_runtime_wrapper.h:140
@@ +139,3 @@
+// CUDA 8.0.41 relies on __USE_FAST_MATH__ and __CUDA_PREC_DIV's values
+// Previous versions used to check thether they are defined or not.
+// CU_DEVICE_INVALID macro is only defined in 8.0.41, so we use it

typo


Comment at: lib/Headers/__clang_cuda_runtime_wrapper.h:156
@@ +155,3 @@
+#endif
+#endif
+

I don't understand what we are doing here...

We're saying, if __USE_FAST_MATH__ is defined, and if it's not equal to 0, then 
redefine it equal to 1?  Isn't that a compile error?


https://reviews.llvm.org/D24946





Re: [PATCH] D24975: [CUDA] Add #pragma clang force_cuda_host_device_{begin, end} pragmas.

2016-09-27 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 72717.
jlebar marked 2 inline comments as done.
jlebar added a comment.

Address Richard Smith's review comments:

- Change macro format.
- Add tests (these Just Worked).


https://reviews.llvm.org/D24975

Files:
  clang/include/clang/Basic/DiagnosticParseKinds.td
  clang/include/clang/Parse/Parser.h
  clang/include/clang/Sema/Sema.h
  clang/lib/Parse/ParsePragma.cpp
  clang/lib/Sema/SemaCUDA.cpp
  clang/test/Parser/cuda-force-host-device-templates.cu
  clang/test/Parser/cuda-force-host-device.cu

Index: clang/test/Parser/cuda-force-host-device.cu
===
--- /dev/null
+++ clang/test/Parser/cuda-force-host-device.cu
@@ -0,0 +1,33 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+
+// Check the force_cuda_host_device pragma.
+
+#pragma clang force_cuda_host_device begin
+void f();
+#pragma clang force_cuda_host_device begin
+void g();
+#pragma clang force_cuda_host_device end
+void h();
+#pragma clang force_cuda_host_device end
+
+void i(); // expected-note {{not viable}}
+
+void host() {
+  f();
+  g();
+  h();
+  i();
+}
+
+__attribute__((device)) void device() {
+  f();
+  g();
+  h();
+  i(); // expected-error {{no matching function}}
+}
+
+#pragma clang force_cuda_host_device foo
+// expected-warning@-1 {{Incorrect use of #pragma clang force_cuda_host_device begin|end}}
+
+#pragma clang force_cuda_host_device
+// expected-warning@-1 {{Incorrect use of #pragma clang force_cuda_host_device begin|end}}
Index: clang/test/Parser/cuda-force-host-device-templates.cu
===
--- /dev/null
+++ clang/test/Parser/cuda-force-host-device-templates.cu
@@ -0,0 +1,41 @@
+// RUN: %clang_cc1 -S -verify -fcuda-is-device %s -o /dev/null
+
+// Check how the force_cuda_host_device pragma interacts with template
+// instantiations.  The errors here are emitted at codegen, so we can't do
+// -fsyntax-only.
+
+template <class T>
+T foo() {  // expected-note {{declared here}}
+  return T();
+}
+
+template <class T>
+struct X {
+  void foo(); // expected-note {{declared here}}
+};
+
+#pragma clang force_cuda_host_device begin
+__attribute__((host)) __attribute__((device)) void test() {
+  int n = foo<int>();  // expected-error {{reference to __host__ function 'foo'}}
+  X<int>().foo();  // expected-error {{reference to __host__ function 'foo'}}
+}
+#pragma clang force_cuda_host_device end
+
+// Same thing as above, but within a force_cuda_host_device block without a
+// corresponding end.
+
+template <class T>
+T bar() {  // expected-note {{declared here}}
+  return T();
+}
+
+template <class T>
+struct Y {
+  void bar(); // expected-note {{declared here}}
+};
+
+#pragma clang force_cuda_host_device begin
+__attribute__((host)) __attribute__((device)) void test2() {
+  int n = bar<int>();  // expected-error {{reference to __host__ function 'bar'}}
+  Y<int>().bar();  // expected-error {{reference to __host__ function 'bar'}}
+}
Index: clang/lib/Sema/SemaCUDA.cpp
===
--- clang/lib/Sema/SemaCUDA.cpp
+++ clang/lib/Sema/SemaCUDA.cpp
@@ -23,6 +23,19 @@
 #include "llvm/ADT/SmallVector.h"
 using namespace clang;
 
+void Sema::PushForceCUDAHostDevice() {
+  assert(getLangOpts().CUDA && "May be called only for CUDA compilations.");
+  ForceCUDAHostDeviceDepth++;
+}
+
+bool Sema::PopForceCUDAHostDevice() {
+  assert(getLangOpts().CUDA && "May be called only for CUDA compilations.");
+  if (ForceCUDAHostDeviceDepth == 0)
+    return false;
+  ForceCUDAHostDeviceDepth--;
+  return true;
+}
+
 ExprResult Sema::ActOnCUDAExecConfigExpr(Scope *S, SourceLocation LLLLoc,
                                          MultiExprArg ExecConfig,
                                          SourceLocation GGGLoc) {
@@ -441,9 +454,23 @@
 //  * a __device__ function with this signature was already declared, in which
 //case in which case we output an error, unless the __device__ decl is in a
 //system header, in which case we leave the constexpr function unattributed.
+//
+// In addition, all function decls are treated as __host__ __device__ when
+// ForceCUDAHostDeviceDepth > 0 (corresponding to code within a
+//   #pragma clang force_cuda_host_device_begin/end
+// pair).
 void Sema::maybeAddCUDAHostDeviceAttrs(Scope *S, FunctionDecl *NewD,
const LookupResult &Previous) {
   assert(getLangOpts().CUDA && "May be called only for CUDA compilations.");
+
+  if (ForceCUDAHostDeviceDepth > 0) {
+    if (!NewD->hasAttr<CUDAHostAttr>())
+      NewD->addAttr(CUDAHostAttr::CreateImplicit(Context));
+    if (!NewD->hasAttr<CUDADeviceAttr>())
+      NewD->addAttr(CUDADeviceAttr::CreateImplicit(Context));
+    return;
+  }
+
   if (!getLangOpts().CUDAHostDeviceConstexpr || !NewD->isConstexpr() ||
       NewD->isVariadic() || NewD->hasAttr<CUDAHostAttr>() ||
       NewD->hasAttr<CUDADeviceAttr>() || NewD->hasAttr<CUDAGlobalAttr>())
Index: clang/lib/Parse/ParsePragma.cpp
===
---

Re: [PATCH] D24979: [CUDA] Support <complex> and std::min/max on the device.

2016-09-27 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 72719.
jlebar added a comment.
Herald added a subscriber: mehdi_amini.

s/libgcc/runtime/


https://reviews.llvm.org/D24979

Files:
  clang/lib/Driver/ToolChains.cpp
  clang/lib/Headers/CMakeLists.txt
  clang/lib/Headers/__clang_cuda_complex_builtins.h
  clang/lib/Headers/__clang_cuda_runtime_wrapper.h
  clang/lib/Headers/cuda_wrappers/algorithm
  clang/lib/Headers/cuda_wrappers/complex

Index: clang/lib/Headers/cuda_wrappers/complex
===
--- /dev/null
+++ clang/lib/Headers/cuda_wrappers/complex
@@ -0,0 +1,79 @@
+/*=== complex - CUDA wrapper for <complex> --===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#pragma once
+
+// Wrapper around <complex> that forces its functions to be __host__
+// __device__.
+
+// First, include host-only headers we think are likely to be included by
+// <complex>, so that the pragma below only applies to <complex> itself.
+#if __cplusplus >= 201103L
+#include <type_traits>
+#endif
+#include <stdexcept>
+#include <cmath>
+#include <sstream>
+
+// Next, include our <algorithm> wrapper, to ensure that device overloads of
+// std::min/max are available.
+#include <algorithm>
+
+#pragma clang force_cuda_host_device_begin
+
+// When compiling for device, ask libstdc++ to use its own implementations of
+// complex functions, rather than calling builtins (which resolve to library
+// functions that don't exist when compiling CUDA device code).
+//
+// This is a little dicey, because it causes libstdc++ to define a different
+// set of overloads on host and device.
+//
+//   // Present only when compiling for host.
+//   __host__ __device__ complex<float> sin(const complex<float>& x) {
+//     return __builtin_csinf(x);
+//   }
+//
+//   // Present when compiling for host and for device.
+//   template <typename T>
+//   __host__ __device__ complex<T> sin(const complex<T>& x) {
+//     return complex<T>(sin(x.real()) * cosh(x.imag()),
+//                       cos(x.real()) * sinh(x.imag()));
+//   }
+//
+// This is safe because when compiling for device, all function calls in
+// __host__ code to sin() will still resolve to *something*, even if they don't
+// resolve to the same function as they resolve to when compiling for host.  We
+// don't care that they don't resolve to the right function because we won't
+// codegen this host code when compiling for device.
+
+#pragma push_macro("_GLIBCXX_USE_C99_COMPLEX")
+#pragma push_macro("_GLIBCXX_USE_C99_COMPLEX_TR1")
+#define _GLIBCXX_USE_C99_COMPLEX 0
+#define _GLIBCXX_USE_C99_COMPLEX_TR1 0
+
+#include_next <complex>
+
+#pragma pop_macro("_GLIBCXX_USE_C99_COMPLEX_TR1")
+#pragma pop_macro("_GLIBCXX_USE_C99_COMPLEX")
+
+#pragma clang force_cuda_host_device_end
Index: clang/lib/Headers/cuda_wrappers/algorithm
===
--- /dev/null
+++ clang/lib/Headers/cuda_wrappers/algorithm
@@ -0,0 +1,96 @@
+/*===---- algorithm - CUDA wrapper for <algorithm> ------------------------===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,

Re: [PATCH] D24975: [CUDA] Add #pragma clang force_cuda_host_device_{begin, end} pragmas.

2016-09-27 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 72734.
jlebar marked 2 inline comments as done.
jlebar added a comment.

Address Richard's comments.

I'm fairly neutral on whether we want to make it an error not to match all of
your "begin" pragmas with "end"s.  I checked #pragma push_macro, and it looks
like it's not an error to leave those pushes unpopped, so with that prior art,
and since it was simpler not to check for matching begins/ends, I did the same.
But as I say, I don't feel strongly either way (or even about whether we want
to make these new pragmas non-nestable).


https://reviews.llvm.org/D24975

Files:
  clang/include/clang/Basic/DiagnosticParseKinds.td
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Parse/Parser.h
  clang/include/clang/Sema/Sema.h
  clang/include/clang/Serialization/ASTBitCodes.h
  clang/include/clang/Serialization/ASTReader.h
  clang/include/clang/Serialization/ASTWriter.h
  clang/lib/Parse/ParsePragma.cpp
  clang/lib/Sema/SemaCUDA.cpp
  clang/lib/Serialization/ASTReader.cpp
  clang/lib/Serialization/ASTWriter.cpp
  clang/test/Parser/cuda-force-host-device-templates.cu
  clang/test/Parser/cuda-force-host-device.cu

Index: clang/test/Parser/cuda-force-host-device.cu
===
--- /dev/null
+++ clang/test/Parser/cuda-force-host-device.cu
@@ -0,0 +1,36 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+
+// Check the force_cuda_host_device pragma.
+
+#pragma clang force_cuda_host_device begin
+void f();
+#pragma clang force_cuda_host_device begin
+void g();
+#pragma clang force_cuda_host_device end
+void h();
+#pragma clang force_cuda_host_device end
+
+void i(); // expected-note {{not viable}}
+
+void host() {
+  f();
+  g();
+  h();
+  i();
+}
+
+__attribute__((device)) void device() {
+  f();
+  g();
+  h();
+  i(); // expected-error {{no matching function}}
+}
+
+#pragma clang force_cuda_host_device foo
+// expected-warning@-1 {{incorrect use of #pragma clang force_cuda_host_device begin|end}}
+
+#pragma clang force_cuda_host_device
+// expected-warning@-1 {{incorrect use of #pragma clang force_cuda_host_device begin|end}}
+
+#pragma clang force_cuda_host_device begin foo
+// expected-warning@-1 {{incorrect use of #pragma clang force_cuda_host_device begin|end}}
Index: clang/test/Parser/cuda-force-host-device-templates.cu
===
--- /dev/null
+++ clang/test/Parser/cuda-force-host-device-templates.cu
@@ -0,0 +1,41 @@
+// RUN: %clang_cc1 -std=c++14 -S -verify -fcuda-is-device %s -o /dev/null
+
+// Check how the force_cuda_host_device pragma interacts with template
+// instantiations.  The errors here are emitted at codegen, so we can't do
+// -fsyntax-only.
+
+template <typename T>
+auto foo() {  // expected-note {{declared here}}
+  return T();
+}
+
+template <typename T>
+struct X {
+  void foo(); // expected-note {{declared here}}
+};
+
+#pragma clang force_cuda_host_device begin
+__attribute__((host)) __attribute__((device)) void test() {
+  int n = foo<int>();  // expected-error {{reference to __host__ function 'foo'}}
+  X<int>().foo();  // expected-error {{reference to __host__ function 'foo'}}
+}
+}
+#pragma clang force_cuda_host_device end
+
+// Same thing as above, but within a force_cuda_host_device block without a
+// corresponding end.
+
+template <typename T>
+T bar() {  // expected-note {{declared here}}
+  return T();
+}
+
+template <typename T>
+struct Y {
+  void bar(); // expected-note {{declared here}}
+};
+
+#pragma clang force_cuda_host_device begin
+__attribute__((host)) __attribute__((device)) void test2() {
+  int n = bar<int>();  // expected-error {{reference to __host__ function 'bar'}}
+  Y<int>().bar();  // expected-error {{reference to __host__ function 'bar'}}
+}
+}
Index: clang/lib/Serialization/ASTWriter.cpp
===
--- clang/lib/Serialization/ASTWriter.cpp
+++ clang/lib/Serialization/ASTWriter.cpp
@@ -1069,6 +1069,7 @@
   RECORD(POINTERS_TO_MEMBERS_PRAGMA_OPTIONS);
   RECORD(UNUSED_LOCAL_TYPEDEF_NAME_CANDIDATES);
   RECORD(DELETE_EXPRS_TO_ANALYZE);
+  RECORD(CUDA_PRAGMA_FORCE_HOST_DEVICE_DEPTH);
 
   // SourceManager Block.
   BLOCK(SOURCE_MANAGER_BLOCK);
@@ -3943,6 +3944,13 @@
   Stream.EmitRecord(OPENCL_EXTENSIONS, Record);
 }
 
+void ASTWriter::WriteCUDAPragmas(Sema &SemaRef) {
+  if (SemaRef.ForceCUDAHostDeviceDepth > 0) {
+    RecordData::value_type Record[] = {SemaRef.ForceCUDAHostDeviceDepth};
+    Stream.EmitRecord(CUDA_PRAGMA_FORCE_HOST_DEVICE_DEPTH, Record);
+  }
+}
+
 void ASTWriter::WriteObjCCategories() {
   SmallVector<ObjCCategoriesInfo, 2> CategoriesMap;
   RecordData Categories;
@@ -4618,6 +4626,7 @@
   WriteIdentifierTable(PP, SemaRef.IdResolver, isModule);
   WriteFPPragmaOptions(SemaRef.getFPOptions());
   WriteOpenCLExtensions(SemaRef);
+  WriteCUDAPragmas(SemaRef);
   WritePragmaDiagnosticMappings(Context.getDiagnostics(), isModule);
 
   // If we're emitting a module, write out the submodule information.  
Index: clang/lib/Serialization/ASTReader.cpp

Re: [PATCH] D24975: [CUDA] Add #pragma clang force_cuda_host_device_{begin, end} pragmas.

2016-09-27 Thread Justin Lebar via cfe-commits
jlebar added a comment.

> What happens if there are trailing tokens after the pragma?


Added code to make this an error.


https://reviews.llvm.org/D24975



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25036: [CUDA] Disallow exceptions in device code.

2016-09-28 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: tra.
jlebar added subscribers: jhen, cfe-commits.

https://reviews.llvm.org/D25036

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaCUDA.cpp
  clang/lib/Sema/SemaExprCXX.cpp
  clang/lib/Sema/SemaStmt.cpp
  clang/test/SemaCUDA/exceptions-host-device.cu
  clang/test/SemaCUDA/exceptions.cu

Index: clang/test/SemaCUDA/exceptions.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/exceptions.cu
@@ -0,0 +1,21 @@
+// RUN: %clang_cc1 -fcxx-exceptions -fcuda-is-device -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fcxx-exceptions -fsyntax-only -verify %s
+
+#include "Inputs/cuda.h"
+
+void host() {
+  throw NULL;
+  try {} catch(void*) {}
+}
+__device__ void device() {
+  throw NULL;
+  // expected-error@-1 {{cannot use 'throw' in __device__ function 'device'}}
+  try {} catch(void*) {}
+  // expected-error@-1 {{cannot use 'try' in __device__ function 'device'}}
+}
+__global__ void kernel() {
+  throw NULL;
+  // expected-error@-1 {{cannot use 'throw' in __global__ function 'kernel'}}
+  try {} catch(void*) {}
+  // expected-error@-1 {{cannot use 'try' in __global__ function 'kernel'}}
+}
Index: clang/test/SemaCUDA/exceptions-host-device.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/exceptions-host-device.cu
@@ -0,0 +1,38 @@
+// RUN: %clang_cc1 -fcxx-exceptions -fcuda-is-device -verify %s -S -o /dev/null
+// RUN: %clang_cc1 -fcxx-exceptions -verify -DHOST %s -S -o /dev/null
+
+#include "Inputs/cuda.h"
+
+// Check that it's an error to use 'try' and 'throw' from a __host__ __device__
+// function if and only if it's codegen'ed for device.
+
+#ifdef HOST
+// expected-no-diagnostics
+#endif
+
+__host__ __device__ void hd1() {
+  throw NULL;
+  try {} catch(void*) {}
+#ifndef HOST
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd1'}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd1'}}
+#endif
+}
+
+// No error, never instantiated on device.
+inline __host__ __device__ void hd2() {
+  throw NULL;
+  try {} catch(void*) {}
+}
+void call_hd2() { hd2(); }
+
+// Error, instantiated on device.
+inline __host__ __device__ void hd3() {
+  throw NULL;
+  try {} catch(void*) {}
+#ifndef HOST
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd3'}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd3'}}
+#endif
+}
+__device__ void call_hd3() { hd3(); }
Index: clang/lib/Sema/SemaStmt.cpp
===
--- clang/lib/Sema/SemaStmt.cpp
+++ clang/lib/Sema/SemaStmt.cpp
@@ -3644,6 +3644,10 @@
   !getSourceManager().isInSystemHeader(TryLoc))
 Diag(TryLoc, diag::err_exceptions_disabled) << "try";
 
+  // Exceptions aren't allowed in CUDA device code.
+  if (getLangOpts().CUDA)
+    CheckCUDAExceptionExpr(TryLoc, "try");
+
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(TryLoc, diag::err_omp_simd_region_cannot_use_stmt) << "try";
 
Index: clang/lib/Sema/SemaExprCXX.cpp
===
--- clang/lib/Sema/SemaExprCXX.cpp
+++ clang/lib/Sema/SemaExprCXX.cpp
@@ -683,6 +683,10 @@
   !getSourceManager().isInSystemHeader(OpLoc))
 Diag(OpLoc, diag::err_exceptions_disabled) << "throw";
 
+  // Exceptions aren't allowed in CUDA device code.
+  if (getLangOpts().CUDA)
+    CheckCUDAExceptionExpr(OpLoc, "throw");
+
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(OpLoc, diag::err_omp_simd_region_cannot_use_stmt) << "throw";
 
Index: clang/lib/Sema/SemaCUDA.cpp
===
--- clang/lib/Sema/SemaCUDA.cpp
+++ clang/lib/Sema/SemaCUDA.cpp
@@ -515,3 +515,27 @@
   }
   return true;
 }
+
+bool Sema::CheckCUDAExceptionExpr(SourceLocation Loc, StringRef ExprTy) {
+  assert(getLangOpts().CUDA && "Should only be called during CUDA compilation");
+  FunctionDecl *CurFn = dyn_cast<FunctionDecl>(CurContext);
+  if (!CurFn)
+    return true;
+  CUDAFunctionTarget Target = IdentifyCUDATarget(CurFn);
+
+  // Raise an error immediately if this is a __global__ or __device__ function.
+  // If it's a __host__ __device__ function, enqueue a deferred error which will
+  // be emitted if the function is codegen'ed for device.
+  if (Target == CFT_Global || Target == CFT_Device) {
+    Diag(Loc, diag::err_cuda_device_exceptions) << ExprTy << Target << CurFn;
+    return false;
+  }
+  if (Target == CFT_HostDevice && getLangOpts().CUDAIsDevice) {
+    PartialDiagnostic ErrPD{PartialDiagnostic::NullDiagnostic()};
+    ErrPD.Reset(diag::err_cuda_device_exceptions);
+    ErrPD << ExprTy << Target << CurFn;
+    CurFn->addDeferredDiag({Loc, std::move(ErrPD)});
+    return false;
+  }
+  return true;
+}
Index: c

Re: [PATCH] D25036: [CUDA] Disallow exceptions in device code.

2016-09-28 Thread Justin Lebar via cfe-commits
jlebar marked an inline comment as done.


Comment at: clang/lib/Sema/SemaExprCXX.cpp:688
@@ +687,3 @@
+  if (getLangOpts().CUDA)
+CheckCUDAExceptionExpr(OpLoc, "throw");
+

tra wrote:
> Do you need/want to check returned result?
We could, and we could return ExprError here, but I thought it would make sense 
to do the same thing that we do right above, for -fno-exceptions: Continue 
parsing as normal, since we in fact *can* understand what the user is trying to 
do.  In theory (certainly for try/catch) this will let us emit better errors 
elsewhere.


https://reviews.llvm.org/D25036





[PATCH] D25050: [CUDA] Disallow variable-length arrays in CUDA device code.

2016-09-28 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: tra.
jlebar added subscribers: jhen, cfe-commits.

https://reviews.llvm.org/D25050

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaCUDA.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/SemaCUDA/vla-host-device.cu
  clang/test/SemaCUDA/vla.cu

Index: clang/test/SemaCUDA/vla.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/vla.cu
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 -fcuda-is-device -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fsyntax-only -verify -DHOST %s
+
+#include "Inputs/cuda.h"
+
+void host(int n) {
+  int x[n];
+}
+
+__device__ void device(int n) {
+  int x[n];  // expected-error {{cannot use variable-length arrays in __device__ functions}}
+}
Index: clang/test/SemaCUDA/vla-host-device.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/vla-host-device.cu
@@ -0,0 +1,21 @@
+// RUN: %clang_cc1 -fcuda-is-device -verify -S %s -o /dev/null
+// RUN: %clang_cc1 -verify -DHOST %s -S -o /dev/null
+
+#include "Inputs/cuda.h"
+
+#ifdef HOST
+// expected-no-diagnostics
+#endif
+
+__host__ __device__ void hd(int n) {
+  int x[n];
+#ifndef HOST
+  // expected-error@-2 {{cannot use variable-length arrays in __host__ __device__ functions}}
+#endif
+}
+
+// No error because never codegen'ed for device.
+__host__ __device__ inline void hd_inline(int n) {
+  int x[n];
+}
+void call_hd_inline() { hd_inline(42); }
Index: clang/lib/Sema/SemaType.cpp
===
--- clang/lib/Sema/SemaType.cpp
+++ clang/lib/Sema/SemaType.cpp
@@ -2241,6 +2241,10 @@
 Diag(Loc, diag::err_opencl_vla);
 return QualType();
   }
+  // CUDA device code doesn't support VLAs.
+  if (getLangOpts().CUDA && T->isVariableArrayType() && !CheckCUDAVLA(Loc))
+    return QualType();
+
   // If this is not C99, extwarn about VLA's and C99 array size modifiers.
   if (!getLangOpts().C99) {
 if (T->isVariableArrayType()) {
Index: clang/lib/Sema/SemaCUDA.cpp
===
--- clang/lib/Sema/SemaCUDA.cpp
+++ clang/lib/Sema/SemaCUDA.cpp
@@ -539,3 +539,23 @@
   }
   return true;
 }
+
+bool Sema::CheckCUDAVLA(SourceLocation Loc) {
+  assert(getLangOpts().CUDA && "Should only be called during CUDA compilation");
+  FunctionDecl *CurFn = dyn_cast<FunctionDecl>(CurContext);
+  if (!CurFn)
+    return true;
+  CUDAFunctionTarget Target = IdentifyCUDATarget(CurFn);
+  if (Target == CFT_Global || Target == CFT_Device) {
+    Diag(Loc, diag::err_cuda_vla) << Target;
+    return false;
+  }
+  if (Target == CFT_HostDevice && getLangOpts().CUDAIsDevice) {
+    PartialDiagnostic ErrPD{PartialDiagnostic::NullDiagnostic()};
+    ErrPD.Reset(diag::err_cuda_vla);
+    ErrPD << Target;
+    CurFn->addDeferredDiag({Loc, std::move(ErrPD)});
+    return false;
+  }
+  return true;
+}
Index: clang/include/clang/Sema/Sema.h
===
--- clang/include/clang/Sema/Sema.h
+++ clang/include/clang/Sema/Sema.h
@@ -9255,6 +9255,8 @@
   /// ExprTy should be the string "try" or "throw", as appropriate.
   bool CheckCUDAExceptionExpr(SourceLocation Loc, StringRef ExprTy);
 
+  bool CheckCUDAVLA(SourceLocation Loc);
+
   /// Finds a function in \p Matches with highest calling priority
   /// from \p Caller context and erases all functions with lower
   /// calling priority.
Index: clang/include/clang/Basic/DiagnosticSemaKinds.td
===
--- clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6709,6 +6709,10 @@
 def err_device_static_local_var : Error<
 "Within a __device__/__global__ function, "
 "only __shared__ variables may be marked \"static\"">;
+def err_cuda_vla : Error<
+    "cannot use variable-length arrays in "
+    "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
+
 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "
   "%select{function|block|method|constructor}2; expected type from format "


r282647 - [CUDA] Disallow variable-length arrays in CUDA device code.

2016-09-28 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Wed Sep 28 17:45:58 2016
New Revision: 282647

URL: http://llvm.org/viewvc/llvm-project?rev=282647&view=rev
Log:
[CUDA] Disallow variable-length arrays in CUDA device code.

Reviewers: tra

Subscribers: cfe-commits, jhen

Differential Revision: https://reviews.llvm.org/D25050

Added:
cfe/trunk/test/SemaCUDA/vla-host-device.cu
cfe/trunk/test/SemaCUDA/vla.cu
Modified:
cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
cfe/trunk/include/clang/Sema/Sema.h
cfe/trunk/lib/Sema/SemaCUDA.cpp
cfe/trunk/lib/Sema/SemaType.cpp

Modified: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td?rev=282647&r1=282646&r2=282647&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td Wed Sep 28 17:45:58 2016
@@ -6713,6 +6713,10 @@ def err_shared_var_init : Error<
 def err_device_static_local_var : Error<
 "Within a __device__/__global__ function, "
 "only __shared__ variables may be marked \"static\"">;
+def err_cuda_vla : Error<
+    "cannot use variable-length arrays in "
+    "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
+
 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "
   "%select{function|block|method|constructor}2; expected type from format "

Modified: cfe/trunk/include/clang/Sema/Sema.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Sema/Sema.h?rev=282647&r1=282646&r2=282647&view=diff
==
--- cfe/trunk/include/clang/Sema/Sema.h (original)
+++ cfe/trunk/include/clang/Sema/Sema.h Wed Sep 28 17:45:58 2016
@@ -9255,6 +9255,8 @@ public:
   /// ExprTy should be the string "try" or "throw", as appropriate.
   bool CheckCUDAExceptionExpr(SourceLocation Loc, StringRef ExprTy);
 
+  bool CheckCUDAVLA(SourceLocation Loc);
+
   /// Finds a function in \p Matches with highest calling priority
   /// from \p Caller context and erases all functions with lower
   /// calling priority.

Modified: cfe/trunk/lib/Sema/SemaCUDA.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaCUDA.cpp?rev=282647&r1=282646&r2=282647&view=diff
==
--- cfe/trunk/lib/Sema/SemaCUDA.cpp (original)
+++ cfe/trunk/lib/Sema/SemaCUDA.cpp Wed Sep 28 17:45:58 2016
@@ -539,3 +539,23 @@ bool Sema::CheckCUDAExceptionExpr(Source
   }
   return true;
 }
+
+bool Sema::CheckCUDAVLA(SourceLocation Loc) {
+  assert(getLangOpts().CUDA && "Should only be called during CUDA compilation");
+  FunctionDecl *CurFn = dyn_cast<FunctionDecl>(CurContext);
+  if (!CurFn)
+    return true;
+  CUDAFunctionTarget Target = IdentifyCUDATarget(CurFn);
+  if (Target == CFT_Global || Target == CFT_Device) {
+    Diag(Loc, diag::err_cuda_vla) << Target;
+    return false;
+  }
+  if (Target == CFT_HostDevice && getLangOpts().CUDAIsDevice) {
+    PartialDiagnostic ErrPD{PartialDiagnostic::NullDiagnostic()};
+    ErrPD.Reset(diag::err_cuda_vla);
+    ErrPD << Target;
+    CurFn->addDeferredDiag({Loc, std::move(ErrPD)});
+    return false;
+  }
+  return true;
+}

Modified: cfe/trunk/lib/Sema/SemaType.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaType.cpp?rev=282647&r1=282646&r2=282647&view=diff
==
--- cfe/trunk/lib/Sema/SemaType.cpp (original)
+++ cfe/trunk/lib/Sema/SemaType.cpp Wed Sep 28 17:45:58 2016
@@ -2241,6 +2241,10 @@ QualType Sema::BuildArrayType(QualType T
 Diag(Loc, diag::err_opencl_vla);
 return QualType();
   }
+  // CUDA device code doesn't support VLAs.
+  if (getLangOpts().CUDA && T->isVariableArrayType() && !CheckCUDAVLA(Loc))
+return QualType();
+
   // If this is not C99, extwarn about VLA's and C99 array size modifiers.
   if (!getLangOpts().C99) {
 if (T->isVariableArrayType()) {

Added: cfe/trunk/test/SemaCUDA/vla-host-device.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/vla-host-device.cu?rev=282647&view=auto
==
--- cfe/trunk/test/SemaCUDA/vla-host-device.cu (added)
+++ cfe/trunk/test/SemaCUDA/vla-host-device.cu Wed Sep 28 17:45:58 2016
@@ -0,0 +1,21 @@
+// RUN: %clang_cc1 -fcuda-is-device -verify -S %s -o /dev/null
+// RUN: %clang_cc1 -verify -DHOST %s -S -o /dev/null
+
+#include "Inputs/cuda.h"
+
+#ifdef HOST
+// expected-no-diagnostics
+#endif
+
+__host__ __device__ void hd(int n) {
+  int x[n];
+#ifndef HOST
+  // expected-error@-2 {{cannot use variable-length arrays in __host__ __device__ functions}}
+#endif
+}
+
+// No error because never codegen'ed for device.
+__host__ __

Re: [PATCH] D25050: [CUDA] Disallow variable-length arrays in CUDA device code.

2016-09-28 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL282647: [CUDA] Disallow variable-length arrays in CUDA 
device code. (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D25050?vs=72914&id=72919#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25050

Files:
  cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
  cfe/trunk/include/clang/Sema/Sema.h
  cfe/trunk/lib/Sema/SemaCUDA.cpp
  cfe/trunk/lib/Sema/SemaType.cpp
  cfe/trunk/test/SemaCUDA/vla-host-device.cu
  cfe/trunk/test/SemaCUDA/vla.cu

Index: cfe/trunk/test/SemaCUDA/vla.cu
===
--- cfe/trunk/test/SemaCUDA/vla.cu
+++ cfe/trunk/test/SemaCUDA/vla.cu
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 -fcuda-is-device -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fsyntax-only -verify -DHOST %s
+
+#include "Inputs/cuda.h"
+
+void host(int n) {
+  int x[n];
+}
+
+__device__ void device(int n) {
+  int x[n];  // expected-error {{cannot use variable-length arrays in __device__ functions}}
+}
Index: cfe/trunk/test/SemaCUDA/vla-host-device.cu
===
--- cfe/trunk/test/SemaCUDA/vla-host-device.cu
+++ cfe/trunk/test/SemaCUDA/vla-host-device.cu
@@ -0,0 +1,21 @@
+// RUN: %clang_cc1 -fcuda-is-device -verify -S %s -o /dev/null
+// RUN: %clang_cc1 -verify -DHOST %s -S -o /dev/null
+
+#include "Inputs/cuda.h"
+
+#ifdef HOST
+// expected-no-diagnostics
+#endif
+
+__host__ __device__ void hd(int n) {
+  int x[n];
+#ifndef HOST
+  // expected-error@-2 {{cannot use variable-length arrays in __host__ __device__ functions}}
+#endif
+}
+
+// No error because never codegen'ed for device.
+__host__ __device__ inline void hd_inline(int n) {
+  int x[n];
+}
+void call_hd_inline() { hd_inline(42); }
Index: cfe/trunk/lib/Sema/SemaType.cpp
===
--- cfe/trunk/lib/Sema/SemaType.cpp
+++ cfe/trunk/lib/Sema/SemaType.cpp
@@ -2241,6 +2241,10 @@
 Diag(Loc, diag::err_opencl_vla);
 return QualType();
   }
+  // CUDA device code doesn't support VLAs.
+  if (getLangOpts().CUDA && T->isVariableArrayType() && !CheckCUDAVLA(Loc))
+    return QualType();
+
   // If this is not C99, extwarn about VLA's and C99 array size modifiers.
   if (!getLangOpts().C99) {
 if (T->isVariableArrayType()) {
Index: cfe/trunk/lib/Sema/SemaCUDA.cpp
===
--- cfe/trunk/lib/Sema/SemaCUDA.cpp
+++ cfe/trunk/lib/Sema/SemaCUDA.cpp
@@ -539,3 +539,23 @@
   }
   return true;
 }
+
+bool Sema::CheckCUDAVLA(SourceLocation Loc) {
+  assert(getLangOpts().CUDA && "Should only be called during CUDA compilation");
+  FunctionDecl *CurFn = dyn_cast<FunctionDecl>(CurContext);
+  if (!CurFn)
+    return true;
+  CUDAFunctionTarget Target = IdentifyCUDATarget(CurFn);
+  if (Target == CFT_Global || Target == CFT_Device) {
+    Diag(Loc, diag::err_cuda_vla) << Target;
+    return false;
+  }
+  if (Target == CFT_HostDevice && getLangOpts().CUDAIsDevice) {
+    PartialDiagnostic ErrPD{PartialDiagnostic::NullDiagnostic()};
+    ErrPD.Reset(diag::err_cuda_vla);
+    ErrPD << Target;
+    CurFn->addDeferredDiag({Loc, std::move(ErrPD)});
+    return false;
+  }
+  return true;
+}
Index: cfe/trunk/include/clang/Sema/Sema.h
===
--- cfe/trunk/include/clang/Sema/Sema.h
+++ cfe/trunk/include/clang/Sema/Sema.h
@@ -9255,6 +9255,8 @@
   /// ExprTy should be the string "try" or "throw", as appropriate.
   bool CheckCUDAExceptionExpr(SourceLocation Loc, StringRef ExprTy);
 
+  bool CheckCUDAVLA(SourceLocation Loc);
+
   /// Finds a function in \p Matches with highest calling priority
   /// from \p Caller context and erases all functions with lower
   /// calling priority.
Index: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
===
--- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
+++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6713,6 +6713,10 @@
 def err_device_static_local_var : Error<
 "Within a __device__/__global__ function, "
 "only __shared__ variables may be marked \"static\"">;
+def err_cuda_vla : Error<
+    "cannot use variable-length arrays in "
+    "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
+
 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "
   "%select{function|block|method|constructor}2; expected type from format "


r282646 - [CUDA] Disallow exceptions in device code.

2016-09-28 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Wed Sep 28 17:45:54 2016
New Revision: 282646

URL: http://llvm.org/viewvc/llvm-project?rev=282646&view=rev
Log:
[CUDA] Disallow exceptions in device code.

Reviewers: tra

Subscribers: cfe-commits, jhen

Differential Revision: https://reviews.llvm.org/D25036

Added:
cfe/trunk/test/SemaCUDA/exceptions-host-device.cu
cfe/trunk/test/SemaCUDA/exceptions.cu
Modified:
cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
cfe/trunk/include/clang/Sema/Sema.h
cfe/trunk/lib/Sema/SemaCUDA.cpp
cfe/trunk/lib/Sema/SemaExprCXX.cpp
cfe/trunk/lib/Sema/SemaStmt.cpp

Modified: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td?rev=282646&r1=282645&r2=282646&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td Wed Sep 28 17:45:54 2016
@@ -6702,6 +6702,9 @@ def err_cuda_unattributed_constexpr_cann
   "attribute, or build with -fno-cuda-host-device-constexpr.">;
 def note_cuda_conflicting_device_function_declared_here : Note<
   "conflicting __device__ function declared here">;
+def err_cuda_device_exceptions : Error<
+  "cannot use '%0' in "
+  "%select{__device__|__global__|__host__|__host__ __device__}1 function %2">;
 def err_dynamic_var_init : Error<
 "dynamic initialization is not supported for "
 "__device__, __constant__, and __shared__ variables.">;

Modified: cfe/trunk/include/clang/Sema/Sema.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Sema/Sema.h?rev=282646&r1=282645&r2=282646&view=diff
==
--- cfe/trunk/include/clang/Sema/Sema.h (original)
+++ cfe/trunk/include/clang/Sema/Sema.h Wed Sep 28 17:45:54 2016
@@ -9245,6 +9245,16 @@ public:
   /// Otherwise, returns true without emitting any diagnostics.
   bool CheckCUDACall(SourceLocation Loc, FunctionDecl *Callee);
 
+  /// Check whether a 'try' or 'throw' expression is allowed within the current
+  /// context, and raise an error or create a deferred error, as appropriate.
+  ///
+  /// 'try' and 'throw' are never allowed in CUDA __device__ functions, and are
+  /// allowed in __host__ __device__ functions only if those functions are never
+  /// codegen'ed for the device.
+  ///
+  /// ExprTy should be the string "try" or "throw", as appropriate.
+  bool CheckCUDAExceptionExpr(SourceLocation Loc, StringRef ExprTy);
+
   /// Finds a function in \p Matches with highest calling priority
   /// from \p Caller context and erases all functions with lower
   /// calling priority.

Modified: cfe/trunk/lib/Sema/SemaCUDA.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaCUDA.cpp?rev=282646&r1=282645&r2=282646&view=diff
==
--- cfe/trunk/lib/Sema/SemaCUDA.cpp (original)
+++ cfe/trunk/lib/Sema/SemaCUDA.cpp Wed Sep 28 17:45:54 2016
@@ -515,3 +515,27 @@ bool Sema::CheckCUDACall(SourceLocation
   }
   return true;
 }
+
+bool Sema::CheckCUDAExceptionExpr(SourceLocation Loc, StringRef ExprTy) {
+  assert(getLangOpts().CUDA && "Should only be called during CUDA compilation");
+  FunctionDecl *CurFn = dyn_cast<FunctionDecl>(CurContext);
+  if (!CurFn)
+    return true;
+  CUDAFunctionTarget Target = IdentifyCUDATarget(CurFn);
+
+  // Raise an error immediately if this is a __global__ or __device__ function.
+  // If it's a __host__ __device__ function, enqueue a deferred error which will
+  // be emitted if the function is codegen'ed for device.
+  if (Target == CFT_Global || Target == CFT_Device) {
+    Diag(Loc, diag::err_cuda_device_exceptions) << ExprTy << Target << CurFn;
+    return false;
+  }
+  if (Target == CFT_HostDevice && getLangOpts().CUDAIsDevice) {
+    PartialDiagnostic ErrPD{PartialDiagnostic::NullDiagnostic()};
+    ErrPD.Reset(diag::err_cuda_device_exceptions);
+    ErrPD << ExprTy << Target << CurFn;
+    CurFn->addDeferredDiag({Loc, std::move(ErrPD)});
+    return false;
+  }
+  return true;
+}

Modified: cfe/trunk/lib/Sema/SemaExprCXX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaExprCXX.cpp?rev=282646&r1=282645&r2=282646&view=diff
==
--- cfe/trunk/lib/Sema/SemaExprCXX.cpp (original)
+++ cfe/trunk/lib/Sema/SemaExprCXX.cpp Wed Sep 28 17:45:54 2016
@@ -683,6 +683,10 @@ ExprResult Sema::BuildCXXThrow(SourceLoc
   !getSourceManager().isInSystemHeader(OpLoc))
 Diag(OpLoc, diag::err_exceptions_disabled) << "throw";
 
+  // Exceptions aren't allowed in CUDA device code.
+  if (getLangOpts().CUDA)
+    CheckCUDAExceptionExpr(OpLoc, "throw");
+
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(OpLoc, diag::err_omp_simd_region_cannot_use_stmt) << "thro

Re: [PATCH] D25036: [CUDA] Disallow exceptions in device code.

2016-09-28 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
jlebar marked an inline comment as done.
Closed by commit rL282646: [CUDA] Disallow exceptions in device code. (authored 
by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D25036?vs=72872&id=72918#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25036

Files:
  cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
  cfe/trunk/include/clang/Sema/Sema.h
  cfe/trunk/lib/Sema/SemaCUDA.cpp
  cfe/trunk/lib/Sema/SemaExprCXX.cpp
  cfe/trunk/lib/Sema/SemaStmt.cpp
  cfe/trunk/test/SemaCUDA/exceptions-host-device.cu
  cfe/trunk/test/SemaCUDA/exceptions.cu

Index: cfe/trunk/include/clang/Sema/Sema.h
===
--- cfe/trunk/include/clang/Sema/Sema.h
+++ cfe/trunk/include/clang/Sema/Sema.h
@@ -9245,6 +9245,16 @@
   /// Otherwise, returns true without emitting any diagnostics.
   bool CheckCUDACall(SourceLocation Loc, FunctionDecl *Callee);
 
+  /// Check whether a 'try' or 'throw' expression is allowed within the current
+  /// context, and raise an error or create a deferred error, as appropriate.
+  ///
+  /// 'try' and 'throw' are never allowed in CUDA __device__ functions, and are
+  /// allowed in __host__ __device__ functions only if those functions are never
+  /// codegen'ed for the device.
+  ///
+  /// ExprTy should be the string "try" or "throw", as appropriate.
+  bool CheckCUDAExceptionExpr(SourceLocation Loc, StringRef ExprTy);
+
   /// Finds a function in \p Matches with highest calling priority
   /// from \p Caller context and erases all functions with lower
   /// calling priority.
Index: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
===
--- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
+++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6702,6 +6702,9 @@
   "attribute, or build with -fno-cuda-host-device-constexpr.">;
 def note_cuda_conflicting_device_function_declared_here : Note<
   "conflicting __device__ function declared here">;
+def err_cuda_device_exceptions : Error<
+  "cannot use '%0' in "
+  "%select{__device__|__global__|__host__|__host__ __device__}1 function %2">;
 def err_dynamic_var_init : Error<
 "dynamic initialization is not supported for "
 "__device__, __constant__, and __shared__ variables.">;
Index: cfe/trunk/test/SemaCUDA/exceptions-host-device.cu
===
--- cfe/trunk/test/SemaCUDA/exceptions-host-device.cu
+++ cfe/trunk/test/SemaCUDA/exceptions-host-device.cu
@@ -0,0 +1,38 @@
+// RUN: %clang_cc1 -fcxx-exceptions -fcuda-is-device -verify %s -S -o /dev/null
+// RUN: %clang_cc1 -fcxx-exceptions -verify -DHOST %s -S -o /dev/null
+
+#include "Inputs/cuda.h"
+
+// Check that it's an error to use 'try' and 'throw' from a __host__ __device__
+// function if and only if it's codegen'ed for device.
+
+#ifdef HOST
+// expected-no-diagnostics
+#endif
+
+__host__ __device__ void hd1() {
+  throw NULL;
+  try {} catch(void*) {}
+#ifndef HOST
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd1'}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd1'}}
+#endif
+}
+
+// No error, never instantiated on device.
+inline __host__ __device__ void hd2() {
+  throw NULL;
+  try {} catch(void*) {}
+}
+void call_hd2() { hd2(); }
+
+// Error, instantiated on device.
+inline __host__ __device__ void hd3() {
+  throw NULL;
+  try {} catch(void*) {}
+#ifndef HOST
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd3'}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd3'}}
+#endif
+}
+__device__ void call_hd3() { hd3(); }
Index: cfe/trunk/test/SemaCUDA/exceptions.cu
===
--- cfe/trunk/test/SemaCUDA/exceptions.cu
+++ cfe/trunk/test/SemaCUDA/exceptions.cu
@@ -0,0 +1,21 @@
+// RUN: %clang_cc1 -fcxx-exceptions -fcuda-is-device -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fcxx-exceptions -fsyntax-only -verify %s
+
+#include "Inputs/cuda.h"
+
+void host() {
+  throw NULL;
+  try {} catch(void*) {}
+}
+__device__ void device() {
+  throw NULL;
+  // expected-error@-1 {{cannot use 'throw' in __device__ function 'device'}}
+  try {} catch(void*) {}
+  // expected-error@-1 {{cannot use 'try' in __device__ function 'device'}}
+}
+__global__ void kernel() {
+  throw NULL;
+  // expected-error@-1 {{cannot use 'throw' in __global__ function 'kernel'}}
+  try {} catch(void*) {}
+  // expected-error@-1 {{cannot use 'try' in __global__ function 'kernel'}}
+}
Index: cfe/trunk/lib/Sema/SemaCUDA.cpp
===
--- cfe/trunk/lib/Sema/SemaCUDA.cpp
+++ cfe/trunk/lib/Sema/SemaCUDA.cpp
@@ -515,3 +515,27 @@
   }
   return true;
 }
+
+bool Sema::CheckCUDAExceptionExpr(SourceLocation Loc, StringRef ExprTy) {

Re: [PATCH] D24573: [CUDA] Do a better job at detecting wrong-side calls.

2016-09-28 Thread Justin Lebar via cfe-commits
jlebar added a comment.

I'm sorry to put more stuff on your plate, Richard, but you already have 
context on this, so you may be the best person to review this.  Unfortunately 
people keep hitting the bug fixed here (and being confused), so it's maybe not 
as much of an edge case as I thought.

FWIW this patch is not actually as big as it looks at first: The ActOnCallExpr 
changes are just moving things out of an "Impl" function, nothing interesting 
there.


https://reviews.llvm.org/D24573



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r282822 - Move UTF functions into namespace llvm.

2016-09-29 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Thu Sep 29 19:38:45 2016
New Revision: 282822

URL: http://llvm.org/viewvc/llvm-project?rev=282822&view=rev
Log:
Move UTF functions into namespace llvm.

Summary:
This lets people link against LLVM and their own version of the UTF
library.

I determined this only affects llvm, clang, lld, and lldb by running

$ git grep -wl 'UTF[0-9]\+\|\bConvertUTF\bisLegalUTF\|getNumBytesFor' | cut -f 
1 -d '/' | sort | uniq
  clang
  lld
  lldb
  llvm

Tested with

  ninja lldb
  ninja check-clang check-llvm check-lld

(ninja check-lldb doesn't complete for me with or without this patch.)
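
The point of the move can be illustrated with a tiny example: once the UTF helpers live in `namespace llvm`, a client that links against LLVM can define its own top-level function of the same name without a symbol clash. The byte-count logic below is a simplified stand-in, not the real ConvertUTF implementation:

```cpp
#include <cassert>

namespace llvm {
// Simplified model of getNumBytesForUTF8: length of a UTF-8 sequence
// from its first byte.
inline unsigned getNumBytesForUTF8(unsigned char FirstByte) {
  if (FirstByte < 0x80) return 1;            // ASCII
  if ((FirstByte & 0xE0) == 0xC0) return 2;  // 2-byte sequence
  if ((FirstByte & 0xF0) == 0xE0) return 3;  // 3-byte sequence
  return 4;
}
}  // namespace llvm

// The client's own, unrelated top-level definition now coexists with
// llvm's at link time instead of colliding.
inline unsigned getNumBytesForUTF8(unsigned char) { return 99; }
```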

Reviewers: rnk

Subscribers: klimek, beanz, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D24996

Modified:
cfe/trunk/lib/Analysis/FormatString.cpp
cfe/trunk/lib/CodeGen/CodeGenModule.cpp
cfe/trunk/lib/Format/Encoding.h
cfe/trunk/lib/Frontend/TextDiagnostic.cpp
cfe/trunk/lib/Lex/Lexer.cpp
cfe/trunk/lib/Lex/LiteralSupport.cpp
cfe/trunk/lib/Sema/SemaChecking.cpp
cfe/trunk/lib/Sema/SemaExpr.cpp

Modified: cfe/trunk/lib/Analysis/FormatString.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Analysis/FormatString.cpp?rev=282822&r1=282821&r2=282822&view=diff
==
--- cfe/trunk/lib/Analysis/FormatString.cpp (original)
+++ cfe/trunk/lib/Analysis/FormatString.cpp Thu Sep 29 19:38:45 2016
@@ -266,14 +266,15 @@ bool clang::analyze_format_string::Parse
   if (SpecifierBegin + 1 >= FmtStrEnd)
 return false;
 
-  const UTF8 *SB = reinterpret_cast<const UTF8 *>(SpecifierBegin + 1);
-  const UTF8 *SE = reinterpret_cast<const UTF8 *>(FmtStrEnd);
+  const llvm::UTF8 *SB =
+      reinterpret_cast<const llvm::UTF8 *>(SpecifierBegin + 1);
+  const llvm::UTF8 *SE = reinterpret_cast<const llvm::UTF8 *>(FmtStrEnd);
   const char FirstByte = *SB;
 
   // If the invalid specifier is a multibyte UTF-8 string, return the
   // total length accordingly so that the conversion specifier can be
   // properly updated to reflect a complete UTF-8 specifier.
-  unsigned NumBytes = getNumBytesForUTF8(FirstByte);
+  unsigned NumBytes = llvm::getNumBytesForUTF8(FirstByte);
   if (NumBytes == 1)
 return false;
   if (SB + NumBytes > SE)

Modified: cfe/trunk/lib/CodeGen/CodeGenModule.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenModule.cpp?rev=282822&r1=282821&r2=282822&view=diff
==
--- cfe/trunk/lib/CodeGen/CodeGenModule.cpp (original)
+++ cfe/trunk/lib/CodeGen/CodeGenModule.cpp Thu Sep 29 19:38:45 2016
@@ -3136,13 +3136,12 @@ GetConstantCFStringEntry(llvm::StringMap
   // Otherwise, convert the UTF8 literals into a string of shorts.
   IsUTF16 = true;
 
-  SmallVector<UTF16, 128> ToBuf(NumBytes + 1); // +1 for ending nulls.
-  const UTF8 *FromPtr = (const UTF8 *)String.data();
-  UTF16 *ToPtr = &ToBuf[0];
+  SmallVector<llvm::UTF16, 128> ToBuf(NumBytes + 1); // +1 for ending nulls.
+  const llvm::UTF8 *FromPtr = (const llvm::UTF8 *)String.data();
+  llvm::UTF16 *ToPtr = &ToBuf[0];
 
-  (void)ConvertUTF8toUTF16(&FromPtr, FromPtr + NumBytes,
-   &ToPtr, ToPtr + NumBytes,
-   strictConversion);
+  (void)llvm::ConvertUTF8toUTF16(&FromPtr, FromPtr + NumBytes, &ToPtr,
+ ToPtr + NumBytes, llvm::strictConversion);
 
   // ConvertUTF8toUTF16 returns the length in ToPtr.
   StringLength = ToPtr - &ToBuf[0];

Modified: cfe/trunk/lib/Format/Encoding.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Format/Encoding.h?rev=282822&r1=282821&r2=282822&view=diff
==
--- cfe/trunk/lib/Format/Encoding.h (original)
+++ cfe/trunk/lib/Format/Encoding.h Thu Sep 29 19:38:45 2016
@@ -33,16 +33,17 @@ enum Encoding {
 /// \brief Detects encoding of the Text. If the Text can be decoded using 
UTF-8,
 /// it is considered UTF8, otherwise we treat it as some 8-bit encoding.
 inline Encoding detectEncoding(StringRef Text) {
-  const UTF8 *Ptr = reinterpret_cast<const UTF8 *>(Text.begin());
-  const UTF8 *BufEnd = reinterpret_cast<const UTF8 *>(Text.end());
-  if (::isLegalUTF8String(&Ptr, BufEnd))
+  const llvm::UTF8 *Ptr = reinterpret_cast<const llvm::UTF8 *>(Text.begin());
+  const llvm::UTF8 *BufEnd = reinterpret_cast<const llvm::UTF8 *>(Text.end());
+  if (llvm::isLegalUTF8String(&Ptr, BufEnd))
 return Encoding_UTF8;
   return Encoding_Unknown;
 }
 
 inline unsigned getCodePointCountUTF8(StringRef Text) {
   unsigned CodePoints = 0;
-  for (size_t i = 0, e = Text.size(); i < e; i += getNumBytesForUTF8(Text[i])) 
{
+  for (size_t i = 0, e = Text.size(); i < e;
+   i += llvm::getNumBytesForUTF8(Text[i])) {
 ++CodePoints;
   }
   return CodePoints;
@@ -97,7 +98,7 @@ inline unsigned columnWidthWithTabs(Stri
 inline unsigned getCodePointNumBytes(char FirstChar, Encoding Encoding) {
   switch (Encoding) {
   case Encoding_UTF8:
-    return getNumBytesForUTF8(FirstChar);
+    return llvm::getNumBytesForUTF8(FirstChar);
   default:

[PATCH] D25103: [CUDA] Handle attributes on CUDA lambdas appearing between [...] and (...).

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: rnk.
jlebar added subscribers: tra, cfe-commits.

This is ugh, but it makes us compatible with NVCC.  Fixes bug 26341.


https://reviews.llvm.org/D25103

Files:
  clang/lib/Parse/ParseExprCXX.cpp
  clang/test/Parser/lambda-attr.cu

Index: clang/test/Parser/lambda-attr.cu
===
--- /dev/null
+++ clang/test/Parser/lambda-attr.cu
@@ -0,0 +1,30 @@
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -fcuda-is-device -verify %s
+
+// expected-no-diagnostics
+
+__attribute__((device)) void device_attr() {
+  ([]() __attribute__((device)){})();
+  ([] __attribute__((device)) () {})();
+  ([] __attribute__((device)) {})();
+
+  ([&]() __attribute__((device)){})();
+  ([&] __attribute__((device)) () {})();
+  ([&] __attribute__((device)) {})();
+
+  ([&](int) __attribute__((device)){})(0);
+  ([&] __attribute__((device)) (int) {})(0);
+}
+
+__attribute__((host)) __attribute__((device)) void host_device_attrs() {
+  ([]() __attribute__((host)) __attribute__((device)){})();
+  ([] __attribute__((host)) __attribute__((device)) () {})();
+  ([] __attribute__((host)) __attribute__((device)) {})();
+
+  ([&]() __attribute__((host)) __attribute__((device)){})();
+  ([&] __attribute__((host)) __attribute__((device)) () {})();
+  ([&] __attribute__((host)) __attribute__((device)) {})();
+
+  ([&](int) __attribute__((host)) __attribute__((device)){})(0);
+  ([&] __attribute__((host)) __attribute__((device)) (int) {})(0);
+}
Index: clang/lib/Parse/ParseExprCXX.cpp
===
--- clang/lib/Parse/ParseExprCXX.cpp
+++ clang/lib/Parse/ParseExprCXX.cpp
@@ -1124,22 +1124,29 @@
   DeclSpec DS(AttrFactory);
   Declarator D(DS, Declarator::LambdaExprContext);
   TemplateParameterDepthRAII CurTemplateDepthTracker(TemplateParameterDepth);
-  Actions.PushLambdaScope();
+  Actions.PushLambdaScope();
+
+  ParsedAttributes Attr(AttrFactory);
+  SourceLocation DeclLoc = Tok.getLocation();
+  SourceLocation DeclEndLoc = DeclLoc;
+  if (getLangOpts().CUDA) {
+    // In CUDA code, GNU attributes are allowed to appear immediately after the
+    // "[...]", even if there is no "(...)" before the lambda body.
+    MaybeParseGNUAttributes(Attr, &DeclEndLoc);
+  }
 
   TypeResult TrailingReturnType;
   if (Tok.is(tok::l_paren)) {
 ParseScope PrototypeScope(this,
   Scope::FunctionPrototypeScope |
   Scope::FunctionDeclarationScope |
   Scope::DeclScope);
 
-SourceLocation DeclEndLoc;
 BalancedDelimiterTracker T(*this, tok::l_paren);
 T.consumeOpen();
 SourceLocation LParenLoc = T.getOpenLocation();
 
 // Parse parameter-declaration-clause.
-ParsedAttributes Attr(AttrFactory);
    SmallVector<DeclaratorChunk::ParamInfo, 16> ParamInfo;
 SourceLocation EllipsisLoc;
 
@@ -1245,12 +1252,10 @@
 Diag(Tok, diag::err_lambda_missing_parens)
   << TokKind
   << FixItHint::CreateInsertion(Tok.getLocation(), "() ");
-SourceLocation DeclLoc = Tok.getLocation();
-SourceLocation DeclEndLoc = DeclLoc;
+DeclEndLoc = DeclLoc;
 
 // GNU-style attributes must be parsed before the mutable specifier to be
 // compatible with GCC.
-ParsedAttributes Attr(AttrFactory);
 MaybeParseGNUAttributes(Attr, &DeclEndLoc);
 
 // Parse 'mutable', if it's there.
@@ -1296,8 +1301,35 @@
DeclLoc, DeclEndLoc, D,
TrailingReturnType),
   Attr, DeclEndLoc);
+  } else if (getLangOpts().CUDA) {
+// CUDA code may have attributes, which we need to add if they weren't added
+// in one of the if statements above.
+SourceLocation NoLoc;
+D.AddTypeInfo(DeclaratorChunk::getFunction(/*hasProto=*/true,
+   /*isAmbiguous=*/false,
+   /*LParenLoc=*/NoLoc,
+   /*Params=*/nullptr,
+   /*NumParams=*/0,
+   /*EllipsisLoc=*/NoLoc,
+   /*RParenLoc=*/NoLoc,
+   /*TypeQuals=*/0,
+   /*RefQualifierIsLValueRef=*/true,
+   /*RefQualifierLoc=*/NoLoc,
+   /*ConstQualifierLoc=*/NoLoc,
+   /*VolatileQualifierLoc=*/NoLoc,
+   /*RestrictQualifierLoc=*/NoLoc,
+   /*MutableLoc=*/NoLoc,
+   EST_None,
+   

[PATCH] D25105: [CUDA] Make lambdas inherit __host__ and __device__ attributes from the scope in which they're created.

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: tra.
jlebar added subscribers: rnk, cfe-commits.

NVCC compat.  Fixes bug 30567.


https://reviews.llvm.org/D25105

Files:
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaCUDA.cpp
  clang/lib/Sema/SemaLambda.cpp
  clang/test/SemaCUDA/implicit-device-lambda-hd.cu
  clang/test/SemaCUDA/implicit-device-lambda.cu

Index: clang/test/SemaCUDA/implicit-device-lambda.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/implicit-device-lambda.cu
@@ -0,0 +1,86 @@
+// RUN: %clang_cc1 -std=c++11 -fcuda-is-device -verify -fsyntax-only -verify-ignore-unexpected=note %s
+// RUN: %clang_cc1 -std=c++11 -verify -fsyntax-only -verify-ignore-unexpected=note %s
+
+#include "Inputs/cuda.h"
+
+__device__ void device_fn() {
+  auto f1 = [&] {};
+  f1(); // implicitly __device__
+
+  auto f2 = [&] __device__ {};
+  f2();
+
+  auto f3 = [&] __host__ {};
+  f3();  // expected-error {{no matching function}}
+
+  auto f4 = [&] __host__ __device__ {};
+  f4();
+
+  // Now do it all again with '()'s in the lambda declarations: This is a
+  // different parse path.
+  auto g1 = [&]() {};
+  g1(); // implicitly __device__
+
+  auto g2 = [&]() __device__ {};
+  g2();
+
+  auto g3 = [&]() __host__ {};
+  g3();  // expected-error {{no matching function}}
+
+  auto g4 = [&]() __host__ __device__ {};
+  g4();
+
+  // Once more, with the '()'s in a different place.
+  auto h1 = [&]() {};
+  h1(); // implicitly __device__
+
+  auto h2 = [&] __device__ () {};
+  h2();
+
+  auto h3 = [&] __host__ () {};
+  h3();  // expected-error {{no matching function}}
+
+  auto h4 = [&] __host__ __device__ () {};
+  h4();
+}
+
+// Behaves identically to device_fn.
+__global__ void kernel_fn() {
+  auto f1 = [&] {};
+  f1(); // implicitly __device__
+
+  auto f2 = [&] __device__ {};
+  f2();
+
+  auto f3 = [&] __host__ {};
+  f3();  // expected-error {{no matching function}}
+
+  auto f4 = [&] __host__ __device__ {};
+  f4();
+
+  // No need to re-test all the parser contortions we test in the device
+  // function.
+}
+
+__host__ void host_fn() {
+  auto f1 = [&] {};
+  f1(); // implicitly __host__ (i.e., no magic)
+
+  auto f2 = [&] __device__ {};
+  f2();  // expected-error {{no matching function}}
+
+  auto f3 = [&] __host__ {};
+  f3();
+
+  auto f4 = [&] __host__ __device__ {};
+  f4();
+}
+
+// The special treatment above only applies to lambdas.
+__device__ void foo() {
+  struct X {
+void foo() {}
+  };
+  X x;
+  x.foo(); // expected-error {{reference to __host__ function 'foo' in __device__ function}}
+}
Index: clang/test/SemaCUDA/implicit-device-lambda-hd.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/implicit-device-lambda-hd.cu
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 -std=c++11 -fcuda-is-device -verify -verify-ignore-unexpected=note \
+// RUN:   -S -o /dev/null %s
+// RUN: %clang_cc1 -std=c++11 -verify -fsyntax-only -verify-ignore-unexpected=note \
+// RUN:   -DHOST -S -o /dev/null %s
+#include "Inputs/cuda.h"
+
+__host__ __device__ void hd_fn() {
+  auto f1 = [&] {};
+  f1(); // implicitly __host__ __device__
+
+  auto f2 = [&] __device__ {};
+  f2();
+#ifdef HOST
+  // expected-error@-2 {{reference to __device__ function}}
+#endif
+
+  auto f3 = [&] __host__ {};
+  f3();
+#ifndef HOST
+  // expected-error@-2 {{reference to __host__ function}}
+#endif
+
+  auto f4 = [&] __host__ __device__ {};
+  f4();
+}
+
+
Index: clang/lib/Sema/SemaLambda.cpp
===
--- clang/lib/Sema/SemaLambda.cpp
+++ clang/lib/Sema/SemaLambda.cpp
@@ -886,7 +886,12 @@
   
   // Attributes on the lambda apply to the method.  
   ProcessDeclAttributes(CurScope, Method, ParamInfo);
-  
+
+  // CUDA lambdas get implicit attributes based on the scope in which they're
+  // declared.
+  if (getLangOpts().CUDA)
+    CUDASetLambdaAttrs(Method);
+
   // Introduce the function call operator as the current declaration context.
   PushDeclContext(CurScope, Method);
 
Index: clang/lib/Sema/SemaCUDA.cpp
===
--- clang/lib/Sema/SemaCUDA.cpp
+++ clang/lib/Sema/SemaCUDA.cpp
@@ -559,3 +559,22 @@
   }
   return true;
 }
+
+void Sema::CUDASetLambdaAttrs(CXXMethodDecl *Method) {
+  if (Method->hasAttr<CUDADeviceAttr>() || Method->hasAttr<CUDAHostAttr>())
+    return;
+  FunctionDecl *CurFn = dyn_cast<FunctionDecl>(CurContext);
+  if (!CurFn)
+    return;
+  CUDAFunctionTarget Target = IdentifyCUDATarget(CurFn);
+  if (Target == CFT_Global || Target == CFT_Device) {
+    Method->addAttr(CUDADeviceAttr::CreateImplicit(Context));
+  } else if (Target == CFT_HostDevice) {
+    Method->addAttr(CUDADeviceAttr::CreateImplicit(Context));
+    Method->addAttr(CUDAHostAttr::CreateImplicit(Context));
+  }
+
+  // TODO: nvcc doesn't allow you to specify __host__ or __device__ attributes
+  // on lambdas in all contexts -- we should emit a compatibility warning where
+  // we're more permissive.
+}

[PATCH] D25103: [CUDA] Handle attributes on CUDA lambdas appearing between [...] and (...).

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 73085.
jlebar marked an inline comment as done.
jlebar added a comment.

Don't hallucinate a function declarator.


https://reviews.llvm.org/D25103

Files:
  clang/lib/Parse/ParseExprCXX.cpp
  clang/test/Parser/lambda-attr.cu


Index: clang/test/Parser/lambda-attr.cu
===
--- /dev/null
+++ clang/test/Parser/lambda-attr.cu
@@ -0,0 +1,33 @@
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -fcuda-is-device -verify %s
+
+// expected-no-diagnostics
+
+__attribute__((device)) void device_fn() {}
+__attribute__((device)) void hd_fn() {}
+
+__attribute__((device)) void device_attr() {
+  ([]() __attribute__((device)) { device_fn(); })();
+  ([] __attribute__((device)) () { device_fn(); })();
+  ([] __attribute__((device)) { device_fn(); })();
+
+  ([&]() __attribute__((device)){ device_fn(); })();
+  ([&] __attribute__((device)) () { device_fn(); })();
+  ([&] __attribute__((device)) { device_fn(); })();
+
+  ([&](int) __attribute__((device)){ device_fn(); })(0);
+  ([&] __attribute__((device)) (int) { device_fn(); })(0);
+}
+
+__attribute__((host)) __attribute__((device)) void host_device_attrs() {
+  ([]() __attribute__((host)) __attribute__((device)){ hd_fn(); })();
+  ([] __attribute__((host)) __attribute__((device)) () { hd_fn(); })();
+  ([] __attribute__((host)) __attribute__((device)) { hd_fn(); })();
+
+  ([&]() __attribute__((host)) __attribute__((device)){ hd_fn(); })();
+  ([&] __attribute__((host)) __attribute__((device)) () { hd_fn(); })();
+  ([&] __attribute__((host)) __attribute__((device)) { hd_fn(); })();
+
+  ([&](int) __attribute__((host)) __attribute__((device)){ hd_fn(); })(0);
+  ([&] __attribute__((host)) __attribute__((device)) (int) { hd_fn(); })(0);
+}
Index: clang/lib/Parse/ParseExprCXX.cpp
===
--- clang/lib/Parse/ParseExprCXX.cpp
+++ clang/lib/Parse/ParseExprCXX.cpp
@@ -1124,22 +1124,30 @@
   DeclSpec DS(AttrFactory);
   Declarator D(DS, Declarator::LambdaExprContext);
   TemplateParameterDepthRAII CurTemplateDepthTracker(TemplateParameterDepth);
-  Actions.PushLambdaScope();
+  Actions.PushLambdaScope();
+
+  ParsedAttributes Attr(AttrFactory);
+  SourceLocation DeclLoc = Tok.getLocation();
+  SourceLocation DeclEndLoc = DeclLoc;
+  if (getLangOpts().CUDA) {
+    // In CUDA code, GNU attributes are allowed to appear immediately after the
+    // "[...]", even if there is no "(...)" before the lambda body.
+    MaybeParseGNUAttributes(Attr, &DeclEndLoc);
+    D.takeAttributes(Attr, DeclEndLoc);
+  }
 
   TypeResult TrailingReturnType;
   if (Tok.is(tok::l_paren)) {
 ParseScope PrototypeScope(this,
   Scope::FunctionPrototypeScope |
   Scope::FunctionDeclarationScope |
   Scope::DeclScope);
 
-SourceLocation DeclEndLoc;
 BalancedDelimiterTracker T(*this, tok::l_paren);
 T.consumeOpen();
 SourceLocation LParenLoc = T.getOpenLocation();
 
 // Parse parameter-declaration-clause.
-ParsedAttributes Attr(AttrFactory);
    SmallVector<DeclaratorChunk::ParamInfo, 16> ParamInfo;
 SourceLocation EllipsisLoc;
 
@@ -1245,12 +1253,10 @@
 Diag(Tok, diag::err_lambda_missing_parens)
   << TokKind
   << FixItHint::CreateInsertion(Tok.getLocation(), "() ");
-SourceLocation DeclLoc = Tok.getLocation();
-SourceLocation DeclEndLoc = DeclLoc;
+DeclEndLoc = DeclLoc;
 
 // GNU-style attributes must be parsed before the mutable specifier to be
 // compatible with GCC.
-ParsedAttributes Attr(AttrFactory);
 MaybeParseGNUAttributes(Attr, &DeclEndLoc);
 
 // Parse 'mutable', if it's there.
@@ -1297,7 +1303,6 @@
TrailingReturnType),
   Attr, DeclEndLoc);
   }
-  
 
   // FIXME: Rename BlockScope -> ClosureScope if we decide to continue using
   // it.



r282880 - [CUDA] Make lambdas inherit __host__ and __device__ attributes from the scope in which they're created.

2016-09-30 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Sep 30 12:14:53 2016
New Revision: 282880

URL: http://llvm.org/viewvc/llvm-project?rev=282880&view=rev
Log:
[CUDA] Make lambdas inherit __host__ and __device__ attributes from the scope 
in which they're created.

Summary: NVCC compat.  Fixes bug 30567.

Reviewers: tra

Subscribers: cfe-commits, rnk

Differential Revision: https://reviews.llvm.org/D25105

Added:
cfe/trunk/test/SemaCUDA/implicit-device-lambda-hd.cu
cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu
Modified:
cfe/trunk/include/clang/Sema/Sema.h
cfe/trunk/lib/Sema/SemaCUDA.cpp
cfe/trunk/lib/Sema/SemaLambda.cpp

Modified: cfe/trunk/include/clang/Sema/Sema.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Sema/Sema.h?rev=282880&r1=282879&r2=282880&view=diff
==
--- cfe/trunk/include/clang/Sema/Sema.h (original)
+++ cfe/trunk/include/clang/Sema/Sema.h Fri Sep 30 12:14:53 2016
@@ -9264,6 +9264,14 @@ public:
   /// an error otherwise.
   bool CheckCUDAVLA(SourceLocation Loc);
 
+  /// Set __device__ or __host__ __device__ attributes on the given lambda
+  /// operator() method.
+  ///
+  /// CUDA lambdas declared inside __device__ or __global__ functions inherit
+  /// the __device__ attribute.  Similarly, lambdas inside __host__ __device__
+  /// functions become __host__ __device__ themselves.
+  void CUDASetLambdaAttrs(CXXMethodDecl *Method);
+
   /// Finds a function in \p Matches with highest calling priority
   /// from \p Caller context and erases all functions with lower
   /// calling priority.

Modified: cfe/trunk/lib/Sema/SemaCUDA.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaCUDA.cpp?rev=282880&r1=282879&r2=282880&view=diff
==
--- cfe/trunk/lib/Sema/SemaCUDA.cpp (original)
+++ cfe/trunk/lib/Sema/SemaCUDA.cpp Fri Sep 30 12:14:53 2016
@@ -559,3 +559,22 @@ bool Sema::CheckCUDAVLA(SourceLocation L
   }
   return true;
 }
+
+void Sema::CUDASetLambdaAttrs(CXXMethodDecl *Method) {
+  if (Method->hasAttr<CUDADeviceAttr>() || Method->hasAttr<CUDAHostAttr>())
+    return;
+  FunctionDecl *CurFn = dyn_cast<FunctionDecl>(CurContext);
+  if (!CurFn)
+    return;
+  CUDAFunctionTarget Target = IdentifyCUDATarget(CurFn);
+  if (Target == CFT_Global || Target == CFT_Device) {
+    Method->addAttr(CUDADeviceAttr::CreateImplicit(Context));
+  } else if (Target == CFT_HostDevice) {
+    Method->addAttr(CUDADeviceAttr::CreateImplicit(Context));
+    Method->addAttr(CUDAHostAttr::CreateImplicit(Context));
+  }
+
+  // TODO: nvcc doesn't allow you to specify __host__ or __device__ attributes
+  // on lambdas in all contexts -- we should emit a compatibility warning where
+  // we're more permissive.
+}
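
The inheritance rule implemented by CUDASetLambdaAttrs above can be modeled as a small pure function. This is a hypothetical sketch for illustration (enum and function names mirror the patch, but the set-of-strings interface is invented):

```cpp
#include <cassert>
#include <set>
#include <string>

enum Target { CFT_Host, CFT_Device, CFT_Global, CFT_HostDevice };

// Returns the target attributes a lambda's operator() ends up with, given
// the enclosing function's target and any attributes written explicitly
// on the lambda.
std::set<std::string> lambdaTargets(Target Enclosing,
                                    std::set<std::string> Explicit) {
  if (!Explicit.empty())
    return Explicit;  // explicit attributes suppress inheritance
  if (Enclosing == CFT_Device || Enclosing == CFT_Global)
    return {"device"};  // inherit __device__
  if (Enclosing == CFT_HostDevice)
    return {"device", "host"};  // inherit __host__ __device__
  return {};  // plain host function: no implicit attributes
}
```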

Modified: cfe/trunk/lib/Sema/SemaLambda.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaLambda.cpp?rev=282880&r1=282879&r2=282880&view=diff
==
--- cfe/trunk/lib/Sema/SemaLambda.cpp (original)
+++ cfe/trunk/lib/Sema/SemaLambda.cpp Fri Sep 30 12:14:53 2016
@@ -886,7 +886,12 @@ void Sema::ActOnStartOfLambdaDefinition(
   
   // Attributes on the lambda apply to the method.  
   ProcessDeclAttributes(CurScope, Method, ParamInfo);
-  
+
+  // CUDA lambdas get implicit attributes based on the scope in which they're
+  // declared.
+  if (getLangOpts().CUDA)
+    CUDASetLambdaAttrs(Method);
+
   // Introduce the function call operator as the current declaration context.
   PushDeclContext(CurScope, Method);
 

Added: cfe/trunk/test/SemaCUDA/implicit-device-lambda-hd.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/implicit-device-lambda-hd.cu?rev=282880&view=auto
==
--- cfe/trunk/test/SemaCUDA/implicit-device-lambda-hd.cu (added)
+++ cfe/trunk/test/SemaCUDA/implicit-device-lambda-hd.cu Fri Sep 30 12:14:53 
2016
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 -std=c++11 -fcuda-is-device -verify 
-verify-ignore-unexpected=note \
+// RUN:   -S -o /dev/null %s
+// RUN: %clang_cc1 -std=c++11 -verify -fsyntax-only 
-verify-ignore-unexpected=note \
+// RUN:   -DHOST -S -o /dev/null %s
+#include "Inputs/cuda.h"
+
+__host__ __device__ void hd_fn() {
+  auto f1 = [&] {};
+  f1(); // implicitly __host__ __device__
+
+  auto f2 = [&] __device__ {};
+  f2();
+#ifdef HOST
+  // expected-error@-2 {{reference to __device__ function}}
+#endif
+
+  auto f3 = [&] __host__ {};
+  f3();
+#ifndef HOST
+  // expected-error@-2 {{reference to __host__ function}}
+#endif
+
+  auto f4 = [&] __host__ __device__ {};
+  f4();
+}
+
+

Added: cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu?rev=282880&view=auto
==
--- cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu (added)

r282879 - [CUDA] Handle attributes on CUDA lambdas appearing between [...] and (...).

2016-09-30 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Sep 30 12:14:48 2016
New Revision: 282879

URL: http://llvm.org/viewvc/llvm-project?rev=282879&view=rev
Log:
[CUDA] Handle attributes on CUDA lambdas appearing between [...] and (...).

Summary: This is ugh, but it makes us compatible with NVCC.  Fixes bug 26341.

Reviewers: rnk

Subscribers: cfe-commits, tra

Differential Revision: https://reviews.llvm.org/D25103

Added:
cfe/trunk/test/Parser/lambda-attr.cu
Modified:
cfe/trunk/lib/Parse/ParseExprCXX.cpp

Modified: cfe/trunk/lib/Parse/ParseExprCXX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Parse/ParseExprCXX.cpp?rev=282879&r1=282878&r2=282879&view=diff
==
--- cfe/trunk/lib/Parse/ParseExprCXX.cpp (original)
+++ cfe/trunk/lib/Parse/ParseExprCXX.cpp Fri Sep 30 12:14:48 2016
@@ -1124,7 +1124,17 @@ ExprResult Parser::ParseLambdaExpression
   DeclSpec DS(AttrFactory);
   Declarator D(DS, Declarator::LambdaExprContext);
   TemplateParameterDepthRAII CurTemplateDepthTracker(TemplateParameterDepth);
-  Actions.PushLambdaScope();
+  Actions.PushLambdaScope();
+
+  ParsedAttributes Attr(AttrFactory);
+  SourceLocation DeclLoc = Tok.getLocation();
+  SourceLocation DeclEndLoc = DeclLoc;
+  if (getLangOpts().CUDA) {
+    // In CUDA code, GNU attributes are allowed to appear immediately after the
+    // "[...]", even if there is no "(...)" before the lambda body.
+    MaybeParseGNUAttributes(Attr, &DeclEndLoc);
+    D.takeAttributes(Attr, DeclEndLoc);
+  }
 
   TypeResult TrailingReturnType;
   if (Tok.is(tok::l_paren)) {
@@ -1133,13 +1143,11 @@ ExprResult Parser::ParseLambdaExpression
   Scope::FunctionDeclarationScope |
   Scope::DeclScope);
 
-SourceLocation DeclEndLoc;
 BalancedDelimiterTracker T(*this, tok::l_paren);
 T.consumeOpen();
 SourceLocation LParenLoc = T.getOpenLocation();
 
 // Parse parameter-declaration-clause.
-ParsedAttributes Attr(AttrFactory);
    SmallVector<DeclaratorChunk::ParamInfo, 16> ParamInfo;
 SourceLocation EllipsisLoc;
 
@@ -1245,12 +1253,10 @@ ExprResult Parser::ParseLambdaExpression
 Diag(Tok, diag::err_lambda_missing_parens)
   << TokKind
   << FixItHint::CreateInsertion(Tok.getLocation(), "() ");
-SourceLocation DeclLoc = Tok.getLocation();
-SourceLocation DeclEndLoc = DeclLoc;
+DeclEndLoc = DeclLoc;
 
 // GNU-style attributes must be parsed before the mutable specifier to be
 // compatible with GCC.
-ParsedAttributes Attr(AttrFactory);
 MaybeParseGNUAttributes(Attr, &DeclEndLoc);
 
 // Parse 'mutable', if it's there.
@@ -1297,7 +1303,6 @@ ExprResult Parser::ParseLambdaExpression
TrailingReturnType),
   Attr, DeclEndLoc);
   }
-  
 
   // FIXME: Rename BlockScope -> ClosureScope if we decide to continue using
   // it.

Added: cfe/trunk/test/Parser/lambda-attr.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Parser/lambda-attr.cu?rev=282879&view=auto
==
--- cfe/trunk/test/Parser/lambda-attr.cu (added)
+++ cfe/trunk/test/Parser/lambda-attr.cu Fri Sep 30 12:14:48 2016
@@ -0,0 +1,33 @@
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -fcuda-is-device -verify %s
+
+// expected-no-diagnostics
+
+__attribute__((device)) void device_fn() {}
+__attribute__((device)) void hd_fn() {}
+
+__attribute__((device)) void device_attr() {
+  ([]() __attribute__((device)) { device_fn(); })();
+  ([] __attribute__((device)) () { device_fn(); })();
+  ([] __attribute__((device)) { device_fn(); })();
+
+  ([&]() __attribute__((device)){ device_fn(); })();
+  ([&] __attribute__((device)) () { device_fn(); })();
+  ([&] __attribute__((device)) { device_fn(); })();
+
+  ([&](int) __attribute__((device)){ device_fn(); })(0);
+  ([&] __attribute__((device)) (int) { device_fn(); })(0);
+}
+
+__attribute__((host)) __attribute__((device)) void host_device_attrs() {
+  ([]() __attribute__((host)) __attribute__((device)){ hd_fn(); })();
+  ([] __attribute__((host)) __attribute__((device)) () { hd_fn(); })();
+  ([] __attribute__((host)) __attribute__((device)) { hd_fn(); })();
+
+  ([&]() __attribute__((host)) __attribute__((device)){ hd_fn(); })();
+  ([&] __attribute__((host)) __attribute__((device)) () { hd_fn(); })();
+  ([&] __attribute__((host)) __attribute__((device)) { hd_fn(); })();
+
+  ([&](int) __attribute__((host)) __attribute__((device)){ hd_fn(); })(0);
+  ([&] __attribute__((host)) __attribute__((device)) (int) { hd_fn(); })(0);
+}


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r282878 - [CUDA] Add missing comment on Sema::CheckCUDAVLA.

2016-09-30 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Sep 30 12:14:44 2016
New Revision: 282878

URL: http://llvm.org/viewvc/llvm-project?rev=282878&view=rev
Log:
[CUDA] Add missing comment on Sema::CheckCUDAVLA.

Modified:
cfe/trunk/include/clang/Sema/Sema.h

Modified: cfe/trunk/include/clang/Sema/Sema.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Sema/Sema.h?rev=282878&r1=282877&r2=282878&view=diff
==
--- cfe/trunk/include/clang/Sema/Sema.h (original)
+++ cfe/trunk/include/clang/Sema/Sema.h Fri Sep 30 12:14:44 2016
@@ -9259,6 +9259,9 @@ public:
   /// ExprTy should be the string "try" or "throw", as appropriate.
   bool CheckCUDAExceptionExpr(SourceLocation Loc, StringRef ExprTy);
 
+  /// Check whether it's legal for us to create a variable-length array in the
+  /// current context.  Returns true if the VLA is OK; returns false and emits
+  /// an error otherwise.
   bool CheckCUDAVLA(SourceLocation Loc);
 
   /// Finds a function in \p Matches with highest calling priority




[PATCH] D25105: [CUDA] Make lambdas inherit __host__ and __device__ attributes from the scope in which they're created.

2016-09-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL282880: [CUDA] Make lambdas inherit __host__ and __device__ 
attributes from the scope… (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D25105?vs=73073&id=73089#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25105

Files:
  cfe/trunk/include/clang/Sema/Sema.h
  cfe/trunk/lib/Sema/SemaCUDA.cpp
  cfe/trunk/lib/Sema/SemaLambda.cpp
  cfe/trunk/test/SemaCUDA/implicit-device-lambda-hd.cu
  cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu
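
The inheritance rule this patch implements can be sketched as a small decision function: given the CUDA target of the enclosing function, which implicit attributes does the lambda's operator() receive? A minimal stand-alone C++ sketch follows; the enum and function names are illustrative, not Clang's actual API:

```cpp
#include <cassert>
#include <string>

// Illustrative stand-in for Clang's CUDA function targets.
enum class Target { Host, Device, Global, HostDevice };

// Sketch of the rule: lambdas declared inside __device__ or __global__
// functions become __device__; lambdas inside __host__ __device__ functions
// become __host__ __device__; plain host lambdas get no implicit attributes.
std::string implicitLambdaAttrs(Target enclosing) {
  switch (enclosing) {
  case Target::Device:
  case Target::Global:
    return "__device__";
  case Target::HostDevice:
    return "__host__ __device__";
  case Target::Host:
    return "";  // no magic for host code
  }
  return "";
}
```

This matches the behavior exercised by the implicit-device-lambda tests below: the lambda is callable exactly where a function with the inherited attributes would be.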

Index: cfe/trunk/include/clang/Sema/Sema.h
===
--- cfe/trunk/include/clang/Sema/Sema.h
+++ cfe/trunk/include/clang/Sema/Sema.h
@@ -9264,6 +9264,14 @@
   /// an error otherwise.
   bool CheckCUDAVLA(SourceLocation Loc);
 
+  /// Set __device__ or __host__ __device__ attributes on the given lambda
+  /// operator() method.
+  ///
+  /// CUDA lambdas declared inside __device__ or __global__ functions inherit
+  /// the __device__ attribute.  Similarly, lambdas inside __host__ __device__
+  /// functions become __host__ __device__ themselves.
+  void CUDASetLambdaAttrs(CXXMethodDecl *Method);
+
   /// Finds a function in \p Matches with highest calling priority
   /// from \p Caller context and erases all functions with lower
   /// calling priority.
Index: cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu
===
--- cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu
+++ cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu
@@ -0,0 +1,86 @@
+// RUN: %clang_cc1 -std=c++11 -fcuda-is-device -verify -fsyntax-only -verify-ignore-unexpected=note %s
+// RUN: %clang_cc1 -std=c++11 -verify -fsyntax-only -verify-ignore-unexpected=note %s
+
+#include "Inputs/cuda.h"
+
+__device__ void device_fn() {
+  auto f1 = [&] {};
+  f1(); // implicitly __device__
+
+  auto f2 = [&] __device__ {};
+  f2();
+
+  auto f3 = [&] __host__ {};
+  f3();  // expected-error {{no matching function}}
+
+  auto f4 = [&] __host__ __device__ {};
+  f4();
+
+  // Now do it all again with '()'s in the lambda declarations: This is a
+  // different parse path.
+  auto g1 = [&]() {};
+  g1(); // implicitly __device__
+
+  auto g2 = [&]() __device__ {};
+  g2();
+
+  auto g3 = [&]() __host__ {};
+  g3();  // expected-error {{no matching function}}
+
+  auto g4 = [&]() __host__ __device__ {};
+  g4();
+
+  // Once more, with the '()'s in a different place.
+  auto h1 = [&]() {};
+  h1(); // implicitly __device__
+
+  auto h2 = [&] __device__ () {};
+  h2();
+
+  auto h3 = [&] __host__ () {};
+  h3();  // expected-error {{no matching function}}
+
+  auto h4 = [&] __host__ __device__ () {};
+  h4();
+}
+
+// Behaves identically to device_fn.
+__global__ void kernel_fn() {
+  auto f1 = [&] {};
+  f1(); // implicitly __device__
+
+  auto f2 = [&] __device__ {};
+  f2();
+
+  auto f3 = [&] __host__ {};
+  f3();  // expected-error {{no matching function}}
+
+  auto f4 = [&] __host__ __device__ {};
+  f4();
+
+  // No need to re-test all the parser contortions we test in the device
+  // function.
+}
+
+__host__ void host_fn() {
+  auto f1 = [&] {};
+  f1(); // implicitly __host__ (i.e., no magic)
+
+  auto f2 = [&] __device__ {};
+  f2();  // expected-error {{no matching function}}
+
+  auto f3 = [&] __host__ {};
+  f3();
+
+  auto f4 = [&] __host__ __device__ {};
+  f4();
+}
+
+// The special treatment above only applies to lambdas.
+__device__ void foo() {
+  struct X {
+void foo() {}
+  };
+  X x;
+  x.foo(); // expected-error {{reference to __host__ function 'foo' in __device__ function}}
+}
Index: cfe/trunk/test/SemaCUDA/implicit-device-lambda-hd.cu
===
--- cfe/trunk/test/SemaCUDA/implicit-device-lambda-hd.cu
+++ cfe/trunk/test/SemaCUDA/implicit-device-lambda-hd.cu
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 -std=c++11 -fcuda-is-device -verify -verify-ignore-unexpected=note \
+// RUN:   -S -o /dev/null %s
+// RUN: %clang_cc1 -std=c++11 -verify -fsyntax-only -verify-ignore-unexpected=note \
+// RUN:   -DHOST -S -o /dev/null %s
+#include "Inputs/cuda.h"
+
+__host__ __device__ void hd_fn() {
+  auto f1 = [&] {};
+  f1(); // implicitly __host__ __device__
+
+  auto f2 = [&] __device__ {};
+  f2();
+#ifdef HOST
+  // expected-error@-2 {{reference to __device__ function}}
+#endif
+
+  auto f3 = [&] __host__ {};
+  f3();
+#ifndef HOST
+  // expected-error@-2 {{reference to __host__ function}}
+#endif
+
+  auto f4 = [&] __host__ __device__ {};
+  f4();
+}
+
+
Index: cfe/trunk/lib/Sema/SemaLambda.cpp
===
--- cfe/trunk/lib/Sema/SemaLambda.cpp
+++ cfe/trunk/lib/Sema/SemaLambda.cpp
@@ -886,7 +886,12 @@
   
   // Attributes on the lambda apply to the method.  
   ProcessDeclAttributes(CurScope, Method, ParamInfo);
-  
+
+  // CUDA lambdas get implicit attributes based on 

[PATCH] D25103: [CUDA] Handle attributes on CUDA lambdas appearing between [...] and (...).

2016-09-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL282879: [CUDA] Handle attributes on CUDA lambdas appearing 
between [...] and (...). (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D25103?vs=73085&id=73088#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25103

Files:
  cfe/trunk/lib/Parse/ParseExprCXX.cpp
  cfe/trunk/test/Parser/lambda-attr.cu


Index: cfe/trunk/test/Parser/lambda-attr.cu
===
--- cfe/trunk/test/Parser/lambda-attr.cu
+++ cfe/trunk/test/Parser/lambda-attr.cu
@@ -0,0 +1,33 @@
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -fcuda-is-device -verify %s
+
+// expected-no-diagnostics
+
+__attribute__((device)) void device_fn() {}
+__attribute__((device)) void hd_fn() {}
+
+__attribute__((device)) void device_attr() {
+  ([]() __attribute__((device)) { device_fn(); })();
+  ([] __attribute__((device)) () { device_fn(); })();
+  ([] __attribute__((device)) { device_fn(); })();
+
+  ([&]() __attribute__((device)){ device_fn(); })();
+  ([&] __attribute__((device)) () { device_fn(); })();
+  ([&] __attribute__((device)) { device_fn(); })();
+
+  ([&](int) __attribute__((device)){ device_fn(); })(0);
+  ([&] __attribute__((device)) (int) { device_fn(); })(0);
+}
+
+__attribute__((host)) __attribute__((device)) void host_device_attrs() {
+  ([]() __attribute__((host)) __attribute__((device)){ hd_fn(); })();
+  ([] __attribute__((host)) __attribute__((device)) () { hd_fn(); })();
+  ([] __attribute__((host)) __attribute__((device)) { hd_fn(); })();
+
+  ([&]() __attribute__((host)) __attribute__((device)){ hd_fn(); })();
+  ([&] __attribute__((host)) __attribute__((device)) () { hd_fn(); })();
+  ([&] __attribute__((host)) __attribute__((device)) { hd_fn(); })();
+
+  ([&](int) __attribute__((host)) __attribute__((device)){ hd_fn(); })(0);
+  ([&] __attribute__((host)) __attribute__((device)) (int) { hd_fn(); })(0);
+}
Index: cfe/trunk/lib/Parse/ParseExprCXX.cpp
===
--- cfe/trunk/lib/Parse/ParseExprCXX.cpp
+++ cfe/trunk/lib/Parse/ParseExprCXX.cpp
@@ -1124,22 +1124,30 @@
   DeclSpec DS(AttrFactory);
   Declarator D(DS, Declarator::LambdaExprContext);
   TemplateParameterDepthRAII CurTemplateDepthTracker(TemplateParameterDepth);
-  Actions.PushLambdaScope();
+  Actions.PushLambdaScope();
+
+  ParsedAttributes Attr(AttrFactory);
+  SourceLocation DeclLoc = Tok.getLocation();
+  SourceLocation DeclEndLoc = DeclLoc;
+  if (getLangOpts().CUDA) {
+// In CUDA code, GNU attributes are allowed to appear immediately after the
+// "[...]", even if there is no "(...)" before the lambda body.
+MaybeParseGNUAttributes(Attr, &DeclEndLoc);
+D.takeAttributes(Attr, DeclEndLoc);
+  }
 
   TypeResult TrailingReturnType;
   if (Tok.is(tok::l_paren)) {
 ParseScope PrototypeScope(this,
   Scope::FunctionPrototypeScope |
   Scope::FunctionDeclarationScope |
   Scope::DeclScope);
 
-SourceLocation DeclEndLoc;
 BalancedDelimiterTracker T(*this, tok::l_paren);
 T.consumeOpen();
 SourceLocation LParenLoc = T.getOpenLocation();
 
 // Parse parameter-declaration-clause.
-ParsedAttributes Attr(AttrFactory);
 SmallVector<DeclaratorChunk::ParamInfo, 16> ParamInfo;
 SourceLocation EllipsisLoc;
 
@@ -1245,12 +1253,10 @@
 Diag(Tok, diag::err_lambda_missing_parens)
   << TokKind
   << FixItHint::CreateInsertion(Tok.getLocation(), "() ");
-SourceLocation DeclLoc = Tok.getLocation();
-SourceLocation DeclEndLoc = DeclLoc;
+DeclEndLoc = DeclLoc;
 
 // GNU-style attributes must be parsed before the mutable specifier to be
 // compatible with GCC.
-ParsedAttributes Attr(AttrFactory);
 MaybeParseGNUAttributes(Attr, &DeclEndLoc);
 
 // Parse 'mutable', if it's there.
@@ -1297,7 +1303,6 @@
TrailingReturnType),
   Attr, DeclEndLoc);
   }
-  
 
   // FIXME: Rename BlockScope -> ClosureScope if we decide to continue using
   // it.



[PATCH] D25103: [CUDA] Handle attributes on CUDA lambdas appearing between [...] and (...).

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar added inline comments.


> rnk wrote in ParseExprCXX.cpp:1135
> Does nvcc support __declspec style attributes? Maybe we should check for 
> those too?

nvcc doesn't seem to support __declspec attributes.

I have no strong opinion on whether or not we should add them ourselves, though 
I guess I have a weak aversion to mucking with the parsing rules more than is 
necessary.  (I put off this patch for as long as I could while I tried to get 
nvcc to put the attributes in the right place.)

> rnk wrote in ParseExprCXX.cpp:1280
> Let's not duplicate this amazingly long type info thingy. I think you can 
> avoid it if you hoist MutableLoc and add a bool like 
> `NeedFuncDeclaratorChunk`. Also, maybe we shouldn't be hallucinating a 
> function declarator chunk in CUDA when there are no attributes?

Oh, I didn't look closely enough at the API to realize that I could add 
attributes without creating a new function declarator chunk.  New patch is a 
much smaller change.

https://reviews.llvm.org/D25103





[PATCH] D25114: [CUDA] Fix up MaybeParseGNUAttributes call used for out-of-place attributes on CUDA lambdas.

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: rnk.
jlebar added subscribers: tra, cfe-commits.

There's an overload that we can use to make this a bit cleaner.
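
The cleanup relies on an overload that takes the declarator directly and owns the attribute/end-location bookkeeping itself. The general pattern can be sketched as follows; the types and names here are made up for illustration, not Clang's real signatures:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the parser's data structures.
struct Attrs { std::vector<std::string> list; };
struct Declarator { Attrs attrs; int endLoc = 0; };

// Low-level overload: fills an attribute list and reports an end location,
// leaving the caller to attach both to the declarator.
void maybeParseAttributes(Attrs &out, int *endLoc) {
  out.list.push_back("device");  // pretend we parsed one attribute
  *endLoc = 42;                  // pretend this is where it ended
}

// Convenience overload: callers hand over the declarator and no longer
// track the attribute list or end location themselves.
void maybeParseAttributes(Declarator &d) {
  int endLoc = 0;
  maybeParseAttributes(d.attrs, &endLoc);
  d.endLoc = endLoc;
}
```

The diff below is exactly this shape of change: the two-argument call plus manual `takeAttributes` becomes a single call on the declarator.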


https://reviews.llvm.org/D25114

Files:
  clang/lib/Parse/ParseExprCXX.cpp


Index: clang/lib/Parse/ParseExprCXX.cpp
===
--- clang/lib/Parse/ParseExprCXX.cpp
+++ clang/lib/Parse/ParseExprCXX.cpp
@@ -1128,12 +1128,10 @@
 
   ParsedAttributes Attr(AttrFactory);
   SourceLocation DeclLoc = Tok.getLocation();
-  SourceLocation DeclEndLoc = DeclLoc;
   if (getLangOpts().CUDA) {
 // In CUDA code, GNU attributes are allowed to appear immediately after the
 // "[...]", even if there is no "(...)" before the lambda body.
-MaybeParseGNUAttributes(Attr, &DeclEndLoc);
-D.takeAttributes(Attr, DeclEndLoc);
+MaybeParseGNUAttributes(D);
   }
 
   TypeResult TrailingReturnType;
@@ -1161,7 +1159,7 @@
 }
 T.consumeClose();
 SourceLocation RParenLoc = T.getCloseLocation();
-DeclEndLoc = RParenLoc;
+SourceLocation DeclEndLoc = RParenLoc;
 
 // GNU-style attributes must be parsed before the mutable specifier to be
 // compatible with GCC.
@@ -1253,7 +1251,7 @@
 Diag(Tok, diag::err_lambda_missing_parens)
   << TokKind
   << FixItHint::CreateInsertion(Tok.getLocation(), "() ");
-DeclEndLoc = DeclLoc;
+SourceLocation DeclEndLoc = DeclLoc;
 
 // GNU-style attributes must be parsed before the mutable specifier to be
 // compatible with GCC.




[PATCH] D25115: [CUDA] Emit a warning if a CUDA host/device/global attribute is placed after '(...)'.

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: rnk.
jlebar added subscribers: tra, cfe-commits.

This is probably the sane place for the attribute to go, but nvcc
specifically rejects it.  Other GNU-style attributes are allowed in this
position (although judging from the warning it emits for
host/device/global, those attributes are applied to the lambda's
anonymous struct, not to the function itself).

It would be nice to have a FixIt message here, but doing so, or even
just getting the correct range for the attribute, including its '((' and
'))'s, is apparently Hard.
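
The check the patch adds can be sketched independently of Clang's attribute-list API: only the CUDA target attributes (host/device/global) trigger the compatibility warning in this position; other GNU attributes remain legal after '(...)'. The names below are illustrative:

```cpp
#include <cassert>
#include <string>
#include <vector>

// The warning fires only for CUDA target attributes, not for other
// GNU attributes that may legally follow '(...)'.
bool warnsAfterParens(const std::string &attrKind) {
  return attrKind == "device" || attrKind == "host" || attrKind == "global";
}

// Count how many attributes in a parsed list would be diagnosed.
int countWarnings(const std::vector<std::string> &attrs) {
  int n = 0;
  for (const auto &a : attrs)
    if (warnsAfterParens(a))
      ++n;
  return n;
}
```

A `__host__ __device__` lambda with both attributes after '(...)' therefore gets two warnings, which is what the updated test expectations below check for.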


https://reviews.llvm.org/D25115

Files:
  clang/include/clang/Basic/DiagnosticParseKinds.td
  clang/lib/Parse/ParseExprCXX.cpp
  clang/test/Parser/lambda-attr.cu

Index: clang/test/Parser/lambda-attr.cu
===
--- clang/test/Parser/lambda-attr.cu
+++ clang/test/Parser/lambda-attr.cu
@@ -1,33 +1,42 @@
 // RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s
 // RUN: %clang_cc1 -std=c++11 -fsyntax-only -fcuda-is-device -verify %s
 
-// expected-no-diagnostics
-
 __attribute__((device)) void device_fn() {}
 __attribute__((device)) void hd_fn() {}
 
 __attribute__((device)) void device_attr() {
   ([]() __attribute__((device)) { device_fn(); })();
+  // expected-warning@-1 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([] __attribute__((device)) () { device_fn(); })();
   ([] __attribute__((device)) { device_fn(); })();
 
   ([&]() __attribute__((device)){ device_fn(); })();
+  // expected-warning@-1 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([&] __attribute__((device)) () { device_fn(); })();
   ([&] __attribute__((device)) { device_fn(); })();
 
   ([&](int) __attribute__((device)){ device_fn(); })(0);
+  // expected-warning@-1 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([&] __attribute__((device)) (int) { device_fn(); })(0);
 }
 
 __attribute__((host)) __attribute__((device)) void host_device_attrs() {
   ([]() __attribute__((host)) __attribute__((device)){ hd_fn(); })();
+  // expected-warning@-1 {{nvcc does not allow '__host__' to appear after '()' in lambdas}}
+  // expected-warning@-2 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([] __attribute__((host)) __attribute__((device)) () { hd_fn(); })();
   ([] __attribute__((host)) __attribute__((device)) { hd_fn(); })();
 
   ([&]() __attribute__((host)) __attribute__((device)){ hd_fn(); })();
+  // expected-warning@-1 {{nvcc does not allow '__host__' to appear after '()' in lambdas}}
+  // expected-warning@-2 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([&] __attribute__((host)) __attribute__((device)) () { hd_fn(); })();
   ([&] __attribute__((host)) __attribute__((device)) { hd_fn(); })();
 
   ([&](int) __attribute__((host)) __attribute__((device)){ hd_fn(); })(0);
+  // expected-warning@-1 {{nvcc does not allow '__host__' to appear after '()' in lambdas}}
+  // expected-warning@-2 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([&] __attribute__((host)) __attribute__((device)) (int) { hd_fn(); })(0);
 }
+
+// TODO: Add tests for __attribute__((global)) once we support global lambdas.
Index: clang/lib/Parse/ParseExprCXX.cpp
===
--- clang/lib/Parse/ParseExprCXX.cpp
+++ clang/lib/Parse/ParseExprCXX.cpp
@@ -1134,6 +1134,18 @@
 MaybeParseGNUAttributes(D);
   }
 
+  // Helper to emit a warning if we see a CUDA host/device/global attribute
+  // after '(...)'. nvcc doesn't accept this.
+  auto WarnIfHasCUDATargetAttr = [&] {
+if (getLangOpts().CUDA)
+  for (auto *A = Attr.getList(); A != nullptr; A = A->getNext())
+if (A->getKind() == AttributeList::AT_CUDADevice ||
+A->getKind() == AttributeList::AT_CUDAHost ||
+A->getKind() == AttributeList::AT_CUDAGlobal)
+  Diag(A->getLoc(), diag::warn_cuda_attr_lambda_position)
+  << A->getName()->getName();
+  };
+
   TypeResult TrailingReturnType;
   if (Tok.is(tok::l_paren)) {
 ParseScope PrototypeScope(this,
@@ -1210,6 +1222,8 @@
 
 PrototypeScope.Exit();
 
+WarnIfHasCUDATargetAttr();
+
 SourceLocation NoLoc;
 D.AddTypeInfo(DeclaratorChunk::getFunction(/*hasProto=*/true,
/*isAmbiguous=*/false,
@@ -1275,6 +1289,8 @@
 DeclEndLoc = Range.getEnd();
 }
 
+WarnIfHasCUDATargetAttr();
+
 SourceLocation NoLoc;
 D.AddTypeInfo(DeclaratorChunk::getFunction(/*hasProto=*/true,
/*isAmbiguous=*/false,
Index: clang/include/clang/Basic/DiagnosticParseKinds.td
===
--- clang/include/clang/Basic/DiagnosticParseKinds.td
+++ clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1022,6 +1022,10 @@
 def warn_pragma_unroll_cuda

r282911 - [CUDA] Emit a warning if a CUDA host/device/global attribute is placed after '(...)'.

2016-09-30 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Sep 30 14:55:55 2016
New Revision: 282911

URL: http://llvm.org/viewvc/llvm-project?rev=282911&view=rev
Log:
[CUDA] Emit a warning if a CUDA host/device/global attribute is placed after 
'(...)'.

Summary:
This is probably the sane place for the attribute to go, but nvcc
specifically rejects it.  Other GNU-style attributes are allowed in this
position (although judging from the warning it emits for
host/device/global, those attributes are applied to the lambda's
anonymous struct, not to the function itself).

It would be nice to have a FixIt message here, but doing so, or even
just getting the correct range for the attribute, including its '((' and
'))'s, is apparently Hard.

Reviewers: rnk

Subscribers: cfe-commits, tra

Differential Revision: https://reviews.llvm.org/D25115

Modified:
cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
cfe/trunk/lib/Parse/ParseExprCXX.cpp
cfe/trunk/test/Parser/lambda-attr.cu

Modified: cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td?rev=282911&r1=282910&r2=282911&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td Fri Sep 30 14:55:55 
2016
@@ -1022,6 +1022,10 @@ def err_pragma_invalid_keyword : Error<
 def warn_pragma_unroll_cuda_value_in_parens : Warning<
   "argument to '#pragma unroll' should not be in parentheses in CUDA C/C++">,
  InGroup<CudaCompat>;
+
+def warn_cuda_attr_lambda_position : Warning<
+  "nvcc does not allow '__%0__' to appear after '()' in lambdas">,
  InGroup<CudaCompat>;
 } // end of Parse Issue category.
 
 let CategoryName = "Modules Issue" in {

Modified: cfe/trunk/lib/Parse/ParseExprCXX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Parse/ParseExprCXX.cpp?rev=282911&r1=282910&r2=282911&view=diff
==
--- cfe/trunk/lib/Parse/ParseExprCXX.cpp (original)
+++ cfe/trunk/lib/Parse/ParseExprCXX.cpp Fri Sep 30 14:55:55 2016
@@ -1134,6 +1134,18 @@ ExprResult Parser::ParseLambdaExpression
 MaybeParseGNUAttributes(D);
   }
 
+  // Helper to emit a warning if we see a CUDA host/device/global attribute
+  // after '(...)'. nvcc doesn't accept this.
+  auto WarnIfHasCUDATargetAttr = [&] {
+if (getLangOpts().CUDA)
+  for (auto *A = Attr.getList(); A != nullptr; A = A->getNext())
+if (A->getKind() == AttributeList::AT_CUDADevice ||
+A->getKind() == AttributeList::AT_CUDAHost ||
+A->getKind() == AttributeList::AT_CUDAGlobal)
+  Diag(A->getLoc(), diag::warn_cuda_attr_lambda_position)
+  << A->getName()->getName();
+  };
+
   TypeResult TrailingReturnType;
   if (Tok.is(tok::l_paren)) {
 ParseScope PrototypeScope(this,
@@ -1210,6 +1222,8 @@ ExprResult Parser::ParseLambdaExpression
 
 PrototypeScope.Exit();
 
+WarnIfHasCUDATargetAttr();
+
 SourceLocation NoLoc;
 D.AddTypeInfo(DeclaratorChunk::getFunction(/*hasProto=*/true,
/*isAmbiguous=*/false,
@@ -1275,6 +1289,8 @@ ExprResult Parser::ParseLambdaExpression
 DeclEndLoc = Range.getEnd();
 }
 
+WarnIfHasCUDATargetAttr();
+
 SourceLocation NoLoc;
 D.AddTypeInfo(DeclaratorChunk::getFunction(/*hasProto=*/true,
/*isAmbiguous=*/false,

Modified: cfe/trunk/test/Parser/lambda-attr.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Parser/lambda-attr.cu?rev=282911&r1=282910&r2=282911&view=diff
==
--- cfe/trunk/test/Parser/lambda-attr.cu (original)
+++ cfe/trunk/test/Parser/lambda-attr.cu Fri Sep 30 14:55:55 2016
@@ -1,33 +1,42 @@
 // RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s
 // RUN: %clang_cc1 -std=c++11 -fsyntax-only -fcuda-is-device -verify %s
 
-// expected-no-diagnostics
-
 __attribute__((device)) void device_fn() {}
 __attribute__((device)) void hd_fn() {}
 
 __attribute__((device)) void device_attr() {
   ([]() __attribute__((device)) { device_fn(); })();
+  // expected-warning@-1 {{nvcc does not allow '__device__' to appear after 
'()' in lambdas}}
   ([] __attribute__((device)) () { device_fn(); })();
   ([] __attribute__((device)) { device_fn(); })();
 
   ([&]() __attribute__((device)){ device_fn(); })();
+  // expected-warning@-1 {{nvcc does not allow '__device__' to appear after 
'()' in lambdas}}
   ([&] __attribute__((device)) () { device_fn(); })();
   ([&] __attribute__((device)) { device_fn(); })();
 
   ([&](int) __attribute__((device)){ device_fn(); })(0);
+  // expected-warning@-1 {{nvcc does not allow '__device__' to appear after 
'()' in lambdas}}
   ([&] __attribute__((device)) (int) { device_fn(); })(0);
 }
 
 __attribute__((host)) __attribute__((

r282910 - [CUDA] Fix up MaybeParseGNUAttributes call used for out-of-place attributes on CUDA lambdas.

2016-09-30 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Sep 30 14:55:48 2016
New Revision: 282910

URL: http://llvm.org/viewvc/llvm-project?rev=282910&view=rev
Log:
[CUDA] Fix up MaybeParseGNUAttributes call used for out-of-place attributes on 
CUDA lambdas.

Summary: There's an overload that we can use to make this a bit cleaner.

Reviewers: rnk

Subscribers: cfe-commits, tra

Differential Revision: https://reviews.llvm.org/D25114

Modified:
cfe/trunk/lib/Parse/ParseExprCXX.cpp

Modified: cfe/trunk/lib/Parse/ParseExprCXX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Parse/ParseExprCXX.cpp?rev=282910&r1=282909&r2=282910&view=diff
==
--- cfe/trunk/lib/Parse/ParseExprCXX.cpp (original)
+++ cfe/trunk/lib/Parse/ParseExprCXX.cpp Fri Sep 30 14:55:48 2016
@@ -1128,12 +1128,10 @@ ExprResult Parser::ParseLambdaExpression
 
   ParsedAttributes Attr(AttrFactory);
   SourceLocation DeclLoc = Tok.getLocation();
-  SourceLocation DeclEndLoc = DeclLoc;
   if (getLangOpts().CUDA) {
 // In CUDA code, GNU attributes are allowed to appear immediately after the
 // "[...]", even if there is no "(...)" before the lambda body.
-MaybeParseGNUAttributes(Attr, &DeclEndLoc);
-D.takeAttributes(Attr, DeclEndLoc);
+MaybeParseGNUAttributes(D);
   }
 
   TypeResult TrailingReturnType;
@@ -1161,7 +1159,7 @@ ExprResult Parser::ParseLambdaExpression
 }
 T.consumeClose();
 SourceLocation RParenLoc = T.getCloseLocation();
-DeclEndLoc = RParenLoc;
+SourceLocation DeclEndLoc = RParenLoc;
 
 // GNU-style attributes must be parsed before the mutable specifier to be
 // compatible with GCC.
@@ -1253,7 +1251,7 @@ ExprResult Parser::ParseLambdaExpression
 Diag(Tok, diag::err_lambda_missing_parens)
   << TokKind
   << FixItHint::CreateInsertion(Tok.getLocation(), "() ");
-DeclEndLoc = DeclLoc;
+SourceLocation DeclEndLoc = DeclLoc;
 
 // GNU-style attributes must be parsed before the mutable specifier to be
 // compatible with GCC.




r282912 - [CUDA] Remove incorrect comment in CUDASetLambdaAttrs.

2016-09-30 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Sep 30 14:55:59 2016
New Revision: 282912

URL: http://llvm.org/viewvc/llvm-project?rev=282912&view=rev
Log:
[CUDA] Remove incorrect comment in CUDASetLambdaAttrs.

I'd said that nvcc doesn't allow you to add __host__ or __device__
attributes on lambdas in all circumstances, but I believe this was user
error on my part.  I can't reproduce these warnings/errors if I pass
--expt-extended-lambda to nvcc.

Modified:
cfe/trunk/lib/Sema/SemaCUDA.cpp

Modified: cfe/trunk/lib/Sema/SemaCUDA.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaCUDA.cpp?rev=282912&r1=282911&r2=282912&view=diff
==
--- cfe/trunk/lib/Sema/SemaCUDA.cpp (original)
+++ cfe/trunk/lib/Sema/SemaCUDA.cpp Fri Sep 30 14:55:59 2016
@@ -573,8 +573,4 @@ void Sema::CUDASetLambdaAttrs(CXXMethodD
 Method->addAttr(CUDADeviceAttr::CreateImplicit(Context));
 Method->addAttr(CUDAHostAttr::CreateImplicit(Context));
   }
-
-  // TODO: nvcc doesn't allow you to specify __host__ or __device__ attributes
-  // on lambdas in all contexts -- we should emit a compatibility warning where
-  // we're more permissive.
 }




[PATCH] D25115: [CUDA] Emit a warning if a CUDA host/device/global attribute is placed after '(...)'.

2016-09-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL282911: [CUDA] Emit a warning if a CUDA host/device/global 
attribute is placed after '(...)'. (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D25115?vs=73105&id=73123#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25115

Files:
  cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
  cfe/trunk/lib/Parse/ParseExprCXX.cpp
  cfe/trunk/test/Parser/lambda-attr.cu

Index: cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
===
--- cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
+++ cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
@@ -1022,6 +1022,10 @@
 def warn_pragma_unroll_cuda_value_in_parens : Warning<
   "argument to '#pragma unroll' should not be in parentheses in CUDA C/C++">,
  InGroup<CudaCompat>;
+
+def warn_cuda_attr_lambda_position : Warning<
+  "nvcc does not allow '__%0__' to appear after '()' in lambdas">,
  InGroup<CudaCompat>;
 } // end of Parse Issue category.
 
 let CategoryName = "Modules Issue" in {
Index: cfe/trunk/test/Parser/lambda-attr.cu
===
--- cfe/trunk/test/Parser/lambda-attr.cu
+++ cfe/trunk/test/Parser/lambda-attr.cu
@@ -1,33 +1,42 @@
 // RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s
 // RUN: %clang_cc1 -std=c++11 -fsyntax-only -fcuda-is-device -verify %s
 
-// expected-no-diagnostics
-
 __attribute__((device)) void device_fn() {}
 __attribute__((device)) void hd_fn() {}
 
 __attribute__((device)) void device_attr() {
   ([]() __attribute__((device)) { device_fn(); })();
+  // expected-warning@-1 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([] __attribute__((device)) () { device_fn(); })();
   ([] __attribute__((device)) { device_fn(); })();
 
   ([&]() __attribute__((device)){ device_fn(); })();
+  // expected-warning@-1 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([&] __attribute__((device)) () { device_fn(); })();
   ([&] __attribute__((device)) { device_fn(); })();
 
   ([&](int) __attribute__((device)){ device_fn(); })(0);
+  // expected-warning@-1 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([&] __attribute__((device)) (int) { device_fn(); })(0);
 }
 
 __attribute__((host)) __attribute__((device)) void host_device_attrs() {
   ([]() __attribute__((host)) __attribute__((device)){ hd_fn(); })();
+  // expected-warning@-1 {{nvcc does not allow '__host__' to appear after '()' in lambdas}}
+  // expected-warning@-2 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([] __attribute__((host)) __attribute__((device)) () { hd_fn(); })();
   ([] __attribute__((host)) __attribute__((device)) { hd_fn(); })();
 
   ([&]() __attribute__((host)) __attribute__((device)){ hd_fn(); })();
+  // expected-warning@-1 {{nvcc does not allow '__host__' to appear after '()' in lambdas}}
+  // expected-warning@-2 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([&] __attribute__((host)) __attribute__((device)) () { hd_fn(); })();
   ([&] __attribute__((host)) __attribute__((device)) { hd_fn(); })();
 
   ([&](int) __attribute__((host)) __attribute__((device)){ hd_fn(); })(0);
+  // expected-warning@-1 {{nvcc does not allow '__host__' to appear after '()' in lambdas}}
+  // expected-warning@-2 {{nvcc does not allow '__device__' to appear after '()' in lambdas}}
   ([&] __attribute__((host)) __attribute__((device)) (int) { hd_fn(); })(0);
 }
+
+// TODO: Add tests for __attribute__((global)) once we support global lambdas.
Index: cfe/trunk/lib/Parse/ParseExprCXX.cpp
===
--- cfe/trunk/lib/Parse/ParseExprCXX.cpp
+++ cfe/trunk/lib/Parse/ParseExprCXX.cpp
@@ -1134,6 +1134,18 @@
 MaybeParseGNUAttributes(D);
   }
 
+  // Helper to emit a warning if we see a CUDA host/device/global attribute
+  // after '(...)'. nvcc doesn't accept this.
+  auto WarnIfHasCUDATargetAttr = [&] {
+    if (getLangOpts().CUDA)
+      for (auto *A = Attr.getList(); A != nullptr; A = A->getNext())
+        if (A->getKind() == AttributeList::AT_CUDADevice ||
+            A->getKind() == AttributeList::AT_CUDAHost ||
+            A->getKind() == AttributeList::AT_CUDAGlobal)
+          Diag(A->getLoc(), diag::warn_cuda_attr_lambda_position)
+              << A->getName()->getName();
+  };
+
   TypeResult TrailingReturnType;
   if (Tok.is(tok::l_paren)) {
 ParseScope PrototypeScope(this,
@@ -1210,6 +1222,8 @@
 
 PrototypeScope.Exit();
 
+WarnIfHasCUDATargetAttr();
+
 SourceLocation NoLoc;
 D.AddTypeInfo(DeclaratorChunk::getFunction(/*hasProto=*/true,
/*isAmbiguous=*/false,
@@ -1275,6 +1289,8 @@
 DeclEndLoc = Range.getEnd();
 }
 
+WarnIfHasCUDATargetAttr();
+
 SourceLocation NoLoc;
 D.AddTypeInfo(DeclaratorChunk::ge

[PATCH] D25114: [CUDA] Fix up MaybeParseGNUAttributes call used for out-of-place attributes on CUDA lambdas.

2016-09-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL282910: [CUDA] Fix up MaybeParseGNUAttributes call used for 
out-of-place attributes on… (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D25114?vs=73104&id=73122#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25114

Files:
  cfe/trunk/lib/Parse/ParseExprCXX.cpp


Index: cfe/trunk/lib/Parse/ParseExprCXX.cpp
===
--- cfe/trunk/lib/Parse/ParseExprCXX.cpp
+++ cfe/trunk/lib/Parse/ParseExprCXX.cpp
@@ -1128,12 +1128,10 @@
 
   ParsedAttributes Attr(AttrFactory);
   SourceLocation DeclLoc = Tok.getLocation();
-  SourceLocation DeclEndLoc = DeclLoc;
   if (getLangOpts().CUDA) {
 // In CUDA code, GNU attributes are allowed to appear immediately after the
 // "[...]", even if there is no "(...)" before the lambda body.
-MaybeParseGNUAttributes(Attr, &DeclEndLoc);
-D.takeAttributes(Attr, DeclEndLoc);
+MaybeParseGNUAttributes(D);
   }
 
   TypeResult TrailingReturnType;
@@ -1161,7 +1159,7 @@
 }
 T.consumeClose();
 SourceLocation RParenLoc = T.getCloseLocation();
-DeclEndLoc = RParenLoc;
+SourceLocation DeclEndLoc = RParenLoc;
 
 // GNU-style attributes must be parsed before the mutable specifier to be
 // compatible with GCC.
@@ -1253,7 +1251,7 @@
 Diag(Tok, diag::err_lambda_missing_parens)
   << TokKind
   << FixItHint::CreateInsertion(Tok.getLocation(), "() ");
-DeclEndLoc = DeclLoc;
+SourceLocation DeclEndLoc = DeclLoc;
 
 // GNU-style attributes must be parsed before the mutable specifier to be
 // compatible with GCC.


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r282927 - [CUDA] Fix implicit-device-lambda.cu after r282911.

2016-09-30 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Sep 30 15:17:37 2016
New Revision: 282927

URL: http://llvm.org/viewvc/llvm-project?rev=282927&view=rev
Log:
[CUDA] Fix implicit-device-lambda.cu after r282911.

This commit added a warning that we're (correctly) hitting in this test.
Just ignore it.

Modified:
cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu

Modified: cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu?rev=282927&r1=282926&r2=282927&view=diff
==
--- cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu (original)
+++ cfe/trunk/test/SemaCUDA/implicit-device-lambda.cu Fri Sep 30 15:17:37 2016
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -std=c++11 -fcuda-is-device -verify -fsyntax-only -verify-ignore-unexpected=note %s
-// RUN: %clang_cc1 -std=c++11 -verify -fsyntax-only -verify-ignore-unexpected=note %s
+// RUN: %clang_cc1 -std=c++11 -fcuda-is-device -verify -fsyntax-only -verify-ignore-unexpected=warning -verify-ignore-unexpected=note %s
+// RUN: %clang_cc1 -std=c++11 -verify -fsyntax-only -verify-ignore-unexpected=warning -verify-ignore-unexpected=note %s
 
 #include "Inputs/cuda.h"
 




[PATCH] D25125: [CUDA] Disallow 'extern __shared__' variables.

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: rnk.
jlebar added subscribers: tra, cfe-commits.

https://reviews.llvm.org/D25125

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Sema/SemaDeclAttr.cpp
  clang/test/SemaCUDA/extern-shared.cu


Index: clang/test/SemaCUDA/extern-shared.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/extern-shared.cu
@@ -0,0 +1,14 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fsyntax-only -fcuda-is-device -verify %s
+
+#include "Inputs/cuda.h"
+
+__device__ void foo() {
+  extern __shared__ int x; // expected-error {{__shared__ variable 'x' cannot be 'extern'}}
+}
+
+__host__ __device__ void bar() {
+  extern __shared__ int x; // expected-error {{__shared__ variable 'x' cannot be 'extern'}}
+}
+
+extern __shared__ int global; // expected-error {{__shared__ variable 'global' cannot be 'extern'}}
Index: clang/lib/Sema/SemaDeclAttr.cpp
===
--- clang/lib/Sema/SemaDeclAttr.cpp
+++ clang/lib/Sema/SemaDeclAttr.cpp
@@ -3696,6 +3696,19 @@
 D->addAttr(Optnone);
 }
 
+static void handleSharedAttr(Sema &S, Decl *D, const AttributeList &Attr) {
+  if (checkAttrMutualExclusion<CUDAConstantAttr>(S, D, Attr.getRange(),
+                                                 Attr.getName()))
+    return;
+  auto *VD = cast<VarDecl>(D);
+  if (VD->hasExternalStorage()) {
+    S.Diag(Attr.getLoc(), diag::err_cuda_extern_shared) << VD;
+    return;
+  }
+  D->addAttr(::new (S.Context) CUDASharedAttr(
+      Attr.getRange(), S.Context, Attr.getAttributeSpellingListIndex()));
+}
+
 static void handleGlobalAttr(Sema &S, Decl *D, const AttributeList &Attr) {
   if (checkAttrMutualExclusion<CUDADeviceAttr>(S, D, Attr.getRange(),
                                                Attr.getName()) ||
@@ -5639,8 +5652,7 @@
 handleSimpleAttribute(S, D, Attr);
 break;
   case AttributeList::AT_CUDAShared:
-    handleSimpleAttributeWithExclusions<CUDASharedAttr,
-                                        CUDAConstantAttr>(S, D, Attr);
+    handleSharedAttr(S, D, Attr);
 break;
   case AttributeList::AT_VecReturn:
 handleVecReturnAttr(S, D, Attr);
Index: clang/include/clang/Basic/DiagnosticSemaKinds.td
===
--- clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6723,6 +6723,7 @@
 def err_cuda_vla : Error<
 "cannot use variable-length arrays in "
 "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
+def err_cuda_extern_shared : Error<"__shared__ variable %0 cannot be 'extern'">;
 
 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "



[PATCH] D25125: [CUDA] Disallow 'extern __shared__' variables.

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 73136.
jlebar added a comment.

Fix typo (and add a test to catch it).


https://reviews.llvm.org/D25125

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Sema/SemaDeclAttr.cpp
  clang/test/SemaCUDA/bad-attributes.cu
  clang/test/SemaCUDA/extern-shared.cu


Index: clang/test/SemaCUDA/extern-shared.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/extern-shared.cu
@@ -0,0 +1,14 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fsyntax-only -fcuda-is-device -verify %s
+
+#include "Inputs/cuda.h"
+
+__device__ void foo() {
+  extern __shared__ int x; // expected-error {{__shared__ variable 'x' cannot be 'extern'}}
+}
+
+__host__ __device__ void bar() {
+  extern __shared__ int x; // expected-error {{__shared__ variable 'x' cannot be 'extern'}}
+}
+
+extern __shared__ int global; // expected-error {{__shared__ variable 'global' cannot be 'extern'}}
Index: clang/test/SemaCUDA/bad-attributes.cu
===
--- clang/test/SemaCUDA/bad-attributes.cu
+++ clang/test/SemaCUDA/bad-attributes.cu
@@ -42,6 +42,8 @@
 __shared__ __device__ int z9;
__shared__ __constant__ int z10;  // expected-error {{attributes are not compatible}}
 // expected-note@-1 {{conflicting attribute is here}}
+__constant__ __shared__ int z10a;  // expected-error {{attributes are not compatible}}
+// expected-note@-1 {{conflicting attribute is here}}
 
__global__ __device__ void z11();  // expected-error {{attributes are not compatible}}
 // expected-note@-1 {{conflicting attribute is here}}
Index: clang/lib/Sema/SemaDeclAttr.cpp
===
--- clang/lib/Sema/SemaDeclAttr.cpp
+++ clang/lib/Sema/SemaDeclAttr.cpp
@@ -3696,6 +3696,19 @@
 D->addAttr(Optnone);
 }
 
+static void handleSharedAttr(Sema &S, Decl *D, const AttributeList &Attr) {
+  if (checkAttrMutualExclusion<CUDAConstantAttr>(S, D, Attr.getRange(),
+                                                 Attr.getName()))
+    return;
+  auto *VD = cast<VarDecl>(D);
+  if (VD->hasExternalStorage()) {
+    S.Diag(Attr.getLoc(), diag::err_cuda_extern_shared) << VD;
+    return;
+  }
+  D->addAttr(::new (S.Context) CUDASharedAttr(
+      Attr.getRange(), S.Context, Attr.getAttributeSpellingListIndex()));
+}
+
 static void handleGlobalAttr(Sema &S, Decl *D, const AttributeList &Attr) {
   if (checkAttrMutualExclusion<CUDADeviceAttr>(S, D, Attr.getRange(),
                                                Attr.getName()) ||
@@ -5639,8 +5652,7 @@
 handleSimpleAttribute(S, D, Attr);
 break;
   case AttributeList::AT_CUDAShared:
-    handleSimpleAttributeWithExclusions<CUDASharedAttr,
-                                        CUDAConstantAttr>(S, D, Attr);
+    handleSharedAttr(S, D, Attr);
 break;
   case AttributeList::AT_VecReturn:
 handleVecReturnAttr(S, D, Attr);
Index: clang/include/clang/Basic/DiagnosticSemaKinds.td
===
--- clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6723,6 +6723,7 @@
 def err_cuda_vla : Error<
 "cannot use variable-length arrays in "
 "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
+def err_cuda_extern_shared : Error<"__shared__ variable %0 cannot be 'extern'">;
 
 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "



[PATCH] D25129: [CUDA] Disallow __constant__ local variables.

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added reviewers: tra, rnk.
jlebar added a subscriber: cfe-commits.

https://reviews.llvm.org/D25129

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Sema/SemaDeclAttr.cpp
  clang/test/SemaCUDA/bad-attributes.cu


Index: clang/test/SemaCUDA/bad-attributes.cu
===
--- clang/test/SemaCUDA/bad-attributes.cu
+++ clang/test/SemaCUDA/bad-attributes.cu
@@ -61,3 +61,11 @@
 #ifdef EXPECT_INLINE_WARNING
// expected-warning@-2 {{ignored 'inline' attribute on kernel function 'foobar'}}
 #endif
+
+__constant__ int global_constant;
+void host_fn() {
+  __constant__ int c; // expected-error {{__constant__ variables must be global}}
+}
+__device__ void device_fn() {
+  __constant__ int c; // expected-error {{__constant__ variables must be global}}
+}
Index: clang/lib/Sema/SemaDeclAttr.cpp
===
--- clang/lib/Sema/SemaDeclAttr.cpp
+++ clang/lib/Sema/SemaDeclAttr.cpp
@@ -3696,6 +3696,19 @@
 D->addAttr(Optnone);
 }
 
+static void handleConstantAttr(Sema &S, Decl *D, const AttributeList &Attr) {
+  if (checkAttrMutualExclusion<CUDASharedAttr>(S, D, Attr.getRange(),
+                                               Attr.getName()))
+    return;
+  auto *VD = cast<VarDecl>(D);
+  if (!VD->hasGlobalStorage()) {
+    S.Diag(Attr.getLoc(), diag::err_cuda_nonglobal_constant);
+    return;
+  }
+  D->addAttr(::new (S.Context) CUDAConstantAttr(
+      Attr.getRange(), S.Context, Attr.getAttributeSpellingListIndex()));
+}
+
 static void handleSharedAttr(Sema &S, Decl *D, const AttributeList &Attr) {
   if (checkAttrMutualExclusion<CUDAConstantAttr>(S, D, Attr.getRange(),
                                                  Attr.getName()))
@@ -5541,8 +5554,7 @@
 handleCommonAttr(S, D, Attr);
 break;
   case AttributeList::AT_CUDAConstant:
-    handleSimpleAttributeWithExclusions<CUDAConstantAttr,
-                                        CUDASharedAttr>(S, D, Attr);
+    handleConstantAttr(S, D, Attr);
 break;
   case AttributeList::AT_PassObjectSize:
 handlePassObjectSizeAttr(S, D, Attr);
Index: clang/include/clang/Basic/DiagnosticSemaKinds.td
===
--- clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6724,6 +6724,7 @@
 "cannot use variable-length arrays in "
 "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
 def err_cuda_extern_shared : Error<"__shared__ variable %0 cannot be 'extern'">;
+def err_cuda_nonglobal_constant : Error<"__constant__ variables must be global">;
 
 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "



[PATCH] D25129: [CUDA] Disallow __constant__ local variables.

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar added inline comments.


> tra wrote in DiagnosticSemaKinds.td:6727
> Nit: Technically they are allowed in namespace scope.

That's still a "global variable"?  Or do you think calling it such will be confusing?

https://reviews.llvm.org/D25129





[PATCH] D25139: [CUDA] Add Sema::CUDADiagBuilder and Sema::CUDADiagIfDeviceCode().

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: rnk.
jlebar added subscribers: tra, cfe-commits.

Together these let you easily create diagnostics that

- are never emitted for host code
- are always emitted for __device__ and __global__ functions, and
- are emitted for __host__ __device__ functions iff these functions are 
codegen'ed.

At the moment there are only three diagnostics that need this treatment,
but I have more to add, and it's not sustainable to write code for emitting
every such diagnostic twice, and from a special wrapper in SemaCUDA.cpp.

While we're at it, don't emit the function name in
err_cuda_device_exceptions: It's not necessary to print it, and making
this work in the new framework in the face of a null value for
dyn_cast<FunctionDecl>(CurContext) isn't worth the effort.


https://reviews.llvm.org/D25139

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaCUDA.cpp
  clang/lib/Sema/SemaExprCXX.cpp
  clang/lib/Sema/SemaStmt.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/SemaCUDA/exceptions-host-device.cu
  clang/test/SemaCUDA/exceptions.cu

Index: clang/test/SemaCUDA/exceptions.cu
===
--- clang/test/SemaCUDA/exceptions.cu
+++ clang/test/SemaCUDA/exceptions.cu
@@ -9,13 +9,13 @@
 }
 __device__ void device() {
   throw NULL;
-  // expected-error@-1 {{cannot use 'throw' in __device__ function 'device'}}
+  // expected-error@-1 {{cannot use 'throw' in __device__ function}}
   try {} catch(void*) {}
-  // expected-error@-1 {{cannot use 'try' in __device__ function 'device'}}
+  // expected-error@-1 {{cannot use 'try' in __device__ function}}
 }
 __global__ void kernel() {
   throw NULL;
-  // expected-error@-1 {{cannot use 'throw' in __global__ function 'kernel'}}
+  // expected-error@-1 {{cannot use 'throw' in __global__ function}}
   try {} catch(void*) {}
-  // expected-error@-1 {{cannot use 'try' in __global__ function 'kernel'}}
+  // expected-error@-1 {{cannot use 'try' in __global__ function}}
 }
Index: clang/test/SemaCUDA/exceptions-host-device.cu
===
--- clang/test/SemaCUDA/exceptions-host-device.cu
+++ clang/test/SemaCUDA/exceptions-host-device.cu
@@ -14,8 +14,8 @@
   throw NULL;
   try {} catch(void*) {}
 #ifndef HOST
-  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd1'}}
-  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd1'}}
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function}}
 #endif
 }
 
@@ -31,8 +31,8 @@
   throw NULL;
   try {} catch(void*) {}
 #ifndef HOST
-  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd3'}}
-  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd3'}}
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function}}
 #endif
 }
 __device__ void call_hd3() { hd3(); }
Index: clang/lib/Sema/SemaType.cpp
===
--- clang/lib/Sema/SemaType.cpp
+++ clang/lib/Sema/SemaType.cpp
@@ -2249,8 +2249,8 @@
 return QualType();
   }
   // CUDA device code doesn't support VLAs.
-  if (getLangOpts().CUDA && T->isVariableArrayType() && !CheckCUDAVLA(Loc))
-return QualType();
+  if (getLangOpts().CUDA && T->isVariableArrayType())
+CUDADiagIfDeviceCode(Loc, diag::err_cuda_vla) << CurrentCUDATarget();
 
   // If this is not C99, extwarn about VLA's and C99 array size modifiers.
   if (!getLangOpts().C99) {
Index: clang/lib/Sema/SemaStmt.cpp
===
--- clang/lib/Sema/SemaStmt.cpp
+++ clang/lib/Sema/SemaStmt.cpp
@@ -3646,7 +3646,8 @@
 
   // Exceptions aren't allowed in CUDA device code.
   if (getLangOpts().CUDA)
-CheckCUDAExceptionExpr(TryLoc, "try");
+CUDADiagIfDeviceCode(TryLoc, diag::err_cuda_device_exceptions)
+<< "try" << CurrentCUDATarget();
 
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(TryLoc, diag::err_omp_simd_region_cannot_use_stmt) << "try";
Index: clang/lib/Sema/SemaExprCXX.cpp
===
--- clang/lib/Sema/SemaExprCXX.cpp
+++ clang/lib/Sema/SemaExprCXX.cpp
@@ -685,7 +685,8 @@
 
   // Exceptions aren't allowed in CUDA device code.
   if (getLangOpts().CUDA)
-CheckCUDAExceptionExpr(OpLoc, "throw");
+CUDADiagIfDeviceCode(OpLoc, diag::err_cuda_device_exceptions)
+<< "throw" << CurrentCUDATarget();
 
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(OpLoc, diag::err_omp_simd_region_cannot_use_stmt) << "throw";
Index: clang/lib/Sema/SemaCUDA.cpp
===
-

[PATCH] D25139: [CUDA] Add Sema::CUDADiagBuilder and Sema::CUDADiagIfDeviceCode().

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 73165.
jlebar added a comment.

Add CUDADiagIfHostCode().


https://reviews.llvm.org/D25139

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaCUDA.cpp
  clang/lib/Sema/SemaExprCXX.cpp
  clang/lib/Sema/SemaStmt.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/SemaCUDA/exceptions-host-device.cu
  clang/test/SemaCUDA/exceptions.cu

Index: clang/test/SemaCUDA/exceptions.cu
===
--- clang/test/SemaCUDA/exceptions.cu
+++ clang/test/SemaCUDA/exceptions.cu
@@ -9,13 +9,13 @@
 }
 __device__ void device() {
   throw NULL;
-  // expected-error@-1 {{cannot use 'throw' in __device__ function 'device'}}
+  // expected-error@-1 {{cannot use 'throw' in __device__ function}}
   try {} catch(void*) {}
-  // expected-error@-1 {{cannot use 'try' in __device__ function 'device'}}
+  // expected-error@-1 {{cannot use 'try' in __device__ function}}
 }
 __global__ void kernel() {
   throw NULL;
-  // expected-error@-1 {{cannot use 'throw' in __global__ function 'kernel'}}
+  // expected-error@-1 {{cannot use 'throw' in __global__ function}}
   try {} catch(void*) {}
-  // expected-error@-1 {{cannot use 'try' in __global__ function 'kernel'}}
+  // expected-error@-1 {{cannot use 'try' in __global__ function}}
 }
Index: clang/test/SemaCUDA/exceptions-host-device.cu
===
--- clang/test/SemaCUDA/exceptions-host-device.cu
+++ clang/test/SemaCUDA/exceptions-host-device.cu
@@ -14,8 +14,8 @@
   throw NULL;
   try {} catch(void*) {}
 #ifndef HOST
-  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd1'}}
-  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd1'}}
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function}}
 #endif
 }
 
@@ -31,8 +31,8 @@
   throw NULL;
   try {} catch(void*) {}
 #ifndef HOST
-  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd3'}}
-  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd3'}}
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function}}
 #endif
 }
 __device__ void call_hd3() { hd3(); }
Index: clang/lib/Sema/SemaType.cpp
===
--- clang/lib/Sema/SemaType.cpp
+++ clang/lib/Sema/SemaType.cpp
@@ -2249,8 +2249,8 @@
 return QualType();
   }
   // CUDA device code doesn't support VLAs.
-  if (getLangOpts().CUDA && T->isVariableArrayType() && !CheckCUDAVLA(Loc))
-return QualType();
+  if (getLangOpts().CUDA && T->isVariableArrayType())
+CUDADiagIfDeviceCode(Loc, diag::err_cuda_vla) << CurrentCUDATarget();
 
   // If this is not C99, extwarn about VLA's and C99 array size modifiers.
   if (!getLangOpts().C99) {
Index: clang/lib/Sema/SemaStmt.cpp
===
--- clang/lib/Sema/SemaStmt.cpp
+++ clang/lib/Sema/SemaStmt.cpp
@@ -3646,7 +3646,8 @@
 
   // Exceptions aren't allowed in CUDA device code.
   if (getLangOpts().CUDA)
-CheckCUDAExceptionExpr(TryLoc, "try");
+CUDADiagIfDeviceCode(TryLoc, diag::err_cuda_device_exceptions)
+<< "try" << CurrentCUDATarget();
 
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(TryLoc, diag::err_omp_simd_region_cannot_use_stmt) << "try";
Index: clang/lib/Sema/SemaExprCXX.cpp
===
--- clang/lib/Sema/SemaExprCXX.cpp
+++ clang/lib/Sema/SemaExprCXX.cpp
@@ -685,7 +685,8 @@
 
   // Exceptions aren't allowed in CUDA device code.
   if (getLangOpts().CUDA)
-CheckCUDAExceptionExpr(OpLoc, "throw");
+CUDADiagIfDeviceCode(OpLoc, diag::err_cuda_device_exceptions)
+<< "throw" << CurrentCUDATarget();
 
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(OpLoc, diag::err_omp_simd_region_cannot_use_stmt) << "throw";
Index: clang/lib/Sema/SemaCUDA.cpp
===
--- clang/lib/Sema/SemaCUDA.cpp
+++ clang/lib/Sema/SemaCUDA.cpp
@@ -42,6 +42,10 @@
 
 /// IdentifyCUDATarget - Determine the CUDA compilation target for this function
 Sema::CUDAFunctionTarget Sema::IdentifyCUDATarget(const FunctionDecl *D) {
+  // Code that lives outside a function is run on the host.
+  if (D == nullptr)
+    return CFT_Host;
+
   if (D->hasAttr<CUDAInvalidTargetAttr>())
     return CFT_InvalidTarget;
 
@@ -95,9 +99,8 @@
 Sema::IdentifyCUDAPreference(const FunctionDecl *Caller,
  const FunctionDecl *Callee) {
   assert(Callee && "Callee must be valid.");
+  CUDAFunctionTarget CallerTarget = IdentifyCUDATarget(Caller);
   CUDAFunctionTarget CalleeTarget = IdentifyCUDATarget

[PATCH] D25143: [CUDA] Disallow __shared__ variables in host functions.

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added reviewers: tra, rnk.
jlebar added a subscriber: cfe-commits.

https://reviews.llvm.org/D25143

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Sema/SemaDeclAttr.cpp
  clang/test/SemaCUDA/bad-attributes.cu


Index: clang/test/SemaCUDA/bad-attributes.cu
===
--- clang/test/SemaCUDA/bad-attributes.cu
+++ clang/test/SemaCUDA/bad-attributes.cu
@@ -65,6 +65,7 @@
 __constant__ int global_constant;
 void host_fn() {
__constant__ int c; // expected-error {{__constant__ variables must be global}}
+  __shared__ int s; // expected-error {{__shared__ local variables not allowed in __host__ functions}}
 }
 __device__ void device_fn() {
__constant__ int c; // expected-error {{__constant__ variables must be global}}
Index: clang/lib/Sema/SemaDeclAttr.cpp
===
--- clang/lib/Sema/SemaDeclAttr.cpp
+++ clang/lib/Sema/SemaDeclAttr.cpp
@@ -3718,6 +3718,11 @@
 S.Diag(Attr.getLoc(), diag::err_cuda_extern_shared) << VD;
 return;
   }
+  if (VD->hasLocalStorage() &&
+      !(S.CUDADiagIfHostCode(Attr.getLoc(), diag::err_cuda_host_shared)
+        << S.CurrentCUDATarget())
+           .IsDeferredOrNop())
+    return;
   D->addAttr(::new (S.Context) CUDASharedAttr(
   Attr.getRange(), S.Context, Attr.getAttributeSpellingListIndex()));
 }
Index: clang/include/clang/Basic/DiagnosticSemaKinds.td
===
--- clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6724,6 +6724,9 @@
 "cannot use variable-length arrays in "
 "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
 def err_cuda_extern_shared : Error<"__shared__ variable %0 cannot be 'extern'">;
+def err_cuda_host_shared : Error<
+"__shared__ local variables not allowed in "
+"%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
 def err_cuda_nonglobal_constant : Error<"__constant__ variables must be global">;
 
 def warn_non_pod_vararg_with_format_string : Warning<


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25129: [CUDA] Disallow __constant__ local variables.

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar added inline comments.


> tra wrote in DiagnosticSemaKinds.td:6727
> It's not clear whether you mean global storage class or global namespace.
> The code checks for global storage, but the error message could be interpreted 
> either way, IMO.
> 
> I'll leave phrasing up to you.

> It's not clear whether you mean global storage class or global namespace.

So there's actually no such thing as "global" storage class.  It's *static* 
storage (so helpful), meaning, a global variable or a `static` variable inside 
of a function.

But `__constant__` symbols must actually be global variables -- nvcc doesn't 
allow them to appear inside function bodies.

I think "global" is the right way to describe what we're going after.  Until a 
Language Lawyer corrects me, anyway.  :)
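
For reference, the rule being discussed can be sketched in a few lines of CUDA (illustrative names, not taken from the patch's tests):

```cuda
#include <cuda_runtime.h>

__constant__ int table[4];        // OK: namespace-scope ("global") variable

__device__ int lookup(int i) {
  // __constant__ int local = 0;  // error: __constant__ variables must be global
  return table[i];
}
```

A `static` variable inside a function also has static storage duration, but it is not "global" in this sense, and nvcc rejects it as a `__constant__` too.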

https://reviews.llvm.org/D25129





r282985 - [CUDA] Disallow 'extern __shared__' variables.

2016-09-30 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Sep 30 18:57:30 2016
New Revision: 282985

URL: http://llvm.org/viewvc/llvm-project?rev=282985&view=rev
Log:
[CUDA] Disallow 'extern __shared__' variables.

Also add a test that we disallow

  __constant__ __shared__ int x;

because it's possible to break this without breaking

  __shared__ __constant__ int x;

Reviewers: rnk

Subscribers: cfe-commits, tra

Differential Revision: https://reviews.llvm.org/D25125

Added:
cfe/trunk/test/SemaCUDA/extern-shared.cu
Modified:
cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
cfe/trunk/lib/Sema/SemaDeclAttr.cpp
cfe/trunk/test/SemaCUDA/bad-attributes.cu

Modified: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td?rev=282985&r1=282984&r2=282985&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td Fri Sep 30 18:57:30 
2016
@@ -6722,6 +6722,7 @@ def err_device_static_local_var : Error<
 def err_cuda_vla : Error<
 "cannot use variable-length arrays in "
 "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
+def err_cuda_extern_shared : Error<"__shared__ variable %0 cannot be 'extern'">;
 
 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "

Modified: cfe/trunk/lib/Sema/SemaDeclAttr.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaDeclAttr.cpp?rev=282985&r1=282984&r2=282985&view=diff
==
--- cfe/trunk/lib/Sema/SemaDeclAttr.cpp (original)
+++ cfe/trunk/lib/Sema/SemaDeclAttr.cpp Fri Sep 30 18:57:30 2016
@@ -3696,6 +3696,19 @@ static void handleOptimizeNoneAttr(Sema
 D->addAttr(Optnone);
 }
 
+static void handleSharedAttr(Sema &S, Decl *D, const AttributeList &Attr) {
+  if (checkAttrMutualExclusion<CUDAConstantAttr>(S, D, Attr.getRange(),
+                                                 Attr.getName()))
+    return;
+  auto *VD = cast<VarDecl>(D);
+  if (VD->hasExternalStorage()) {
+    S.Diag(Attr.getLoc(), diag::err_cuda_extern_shared) << VD;
+    return;
+  }
+  D->addAttr(::new (S.Context) CUDASharedAttr(
+      Attr.getRange(), S.Context, Attr.getAttributeSpellingListIndex()));
+}
+
 static void handleGlobalAttr(Sema &S, Decl *D, const AttributeList &Attr) {
   if (checkAttrMutualExclusion<CUDADeviceAttr>(S, D, Attr.getRange(),
                                                Attr.getName()) ||
@@ -5639,8 +5652,7 @@ static void ProcessDeclAttribute(Sema &S
 handleSimpleAttribute(S, D, Attr);
 break;
   case AttributeList::AT_CUDAShared:
-    handleSimpleAttributeWithExclusions<CUDASharedAttr, CUDAConstantAttr>(S, D,
-                                                                          Attr);
+    handleSharedAttr(S, D, Attr);
 break;
   case AttributeList::AT_VecReturn:
 handleVecReturnAttr(S, D, Attr);

Modified: cfe/trunk/test/SemaCUDA/bad-attributes.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/bad-attributes.cu?rev=282985&r1=282984&r2=282985&view=diff
==
--- cfe/trunk/test/SemaCUDA/bad-attributes.cu (original)
+++ cfe/trunk/test/SemaCUDA/bad-attributes.cu Fri Sep 30 18:57:30 2016
@@ -42,6 +42,8 @@ __constant__ __shared__ int z8;  // expe
 __shared__ __device__ int z9;
 __shared__ __constant__ int z10;  // expected-error {{attributes are not compatible}}
 // expected-note@-1 {{conflicting attribute is here}}
+__constant__ __shared__ int z10a;  // expected-error {{attributes are not compatible}}
+// expected-note@-1 {{conflicting attribute is here}}
 
 __global__ __device__ void z11();  // expected-error {{attributes are not compatible}}
 // expected-note@-1 {{conflicting attribute is here}}

Added: cfe/trunk/test/SemaCUDA/extern-shared.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/extern-shared.cu?rev=282985&view=auto
==
--- cfe/trunk/test/SemaCUDA/extern-shared.cu (added)
+++ cfe/trunk/test/SemaCUDA/extern-shared.cu Fri Sep 30 18:57:30 2016
@@ -0,0 +1,14 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fsyntax-only -fcuda-is-device -verify %s
+
+#include "Inputs/cuda.h"
+
+__device__ void foo() {
+  extern __shared__ int x; // expected-error {{__shared__ variable 'x' cannot be 'extern'}}
+}
+
+__host__ __device__ void bar() {
+  extern __shared__ int x; // expected-error {{__shared__ variable 'x' cannot be 'extern'}}
+}
+
+extern __shared__ int global; // expected-error {{__shared__ variable 'global' cannot be 'extern'}}




r282986 - [CUDA] Disallow __constant__ local variables.

2016-09-30 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Sep 30 18:57:34 2016
New Revision: 282986

URL: http://llvm.org/viewvc/llvm-project?rev=282986&view=rev
Log:
[CUDA] Disallow __constant__ local variables.

Reviewers: tra, rnk

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D25129

Modified:
cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
cfe/trunk/lib/Sema/SemaDeclAttr.cpp
cfe/trunk/test/SemaCUDA/bad-attributes.cu

Modified: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td?rev=282986&r1=282985&r2=282986&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td Fri Sep 30 18:57:34 
2016
@@ -6723,6 +6723,7 @@ def err_cuda_vla : Error<
 "cannot use variable-length arrays in "
 "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
 def err_cuda_extern_shared : Error<"__shared__ variable %0 cannot be 'extern'">;
+def err_cuda_nonglobal_constant : Error<"__constant__ variables must be global">;
 
 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "

Modified: cfe/trunk/lib/Sema/SemaDeclAttr.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaDeclAttr.cpp?rev=282986&r1=282985&r2=282986&view=diff
==
--- cfe/trunk/lib/Sema/SemaDeclAttr.cpp (original)
+++ cfe/trunk/lib/Sema/SemaDeclAttr.cpp Fri Sep 30 18:57:34 2016
@@ -3696,6 +3696,19 @@ static void handleOptimizeNoneAttr(Sema
 D->addAttr(Optnone);
 }
 
+static void handleConstantAttr(Sema &S, Decl *D, const AttributeList &Attr) {
+  if (checkAttrMutualExclusion<CUDASharedAttr>(S, D, Attr.getRange(),
+                                               Attr.getName()))
+    return;
+  auto *VD = cast<VarDecl>(D);
+  if (!VD->hasGlobalStorage()) {
+    S.Diag(Attr.getLoc(), diag::err_cuda_nonglobal_constant);
+    return;
+  }
+  D->addAttr(::new (S.Context) CUDAConstantAttr(
+      Attr.getRange(), S.Context, Attr.getAttributeSpellingListIndex()));
+}
+
 static void handleSharedAttr(Sema &S, Decl *D, const AttributeList &Attr) {
   if (checkAttrMutualExclusion<CUDAConstantAttr>(S, D, Attr.getRange(),
                                                  Attr.getName()))
@@ -5541,8 +5554,7 @@ static void ProcessDeclAttribute(Sema &S
 handleCommonAttr(S, D, Attr);
 break;
   case AttributeList::AT_CUDAConstant:
-    handleSimpleAttributeWithExclusions<CUDAConstantAttr, CUDASharedAttr>(S, D,
-                                                                          Attr);
+    handleConstantAttr(S, D, Attr);
 break;
   case AttributeList::AT_PassObjectSize:
 handlePassObjectSizeAttr(S, D, Attr);

Modified: cfe/trunk/test/SemaCUDA/bad-attributes.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/bad-attributes.cu?rev=282986&r1=282985&r2=282986&view=diff
==
--- cfe/trunk/test/SemaCUDA/bad-attributes.cu (original)
+++ cfe/trunk/test/SemaCUDA/bad-attributes.cu Fri Sep 30 18:57:34 2016
@@ -61,3 +61,11 @@ __global__ static inline void foobar() {
 #ifdef EXPECT_INLINE_WARNING
 // expected-warning@-2 {{ignored 'inline' attribute on kernel function 'foobar'}}
 #endif
+
+__constant__ int global_constant;
+void host_fn() {
+  __constant__ int c; // expected-error {{__constant__ variables must be global}}
+}
+__device__ void device_fn() {
+  __constant__ int c; // expected-error {{__constant__ variables must be global}}
+}




[PATCH] D25129: [CUDA] Disallow __constant__ local variables.

2016-09-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL282986: [CUDA] Disallow __constant__ local variables. 
(authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D25129?vs=73139&id=73168#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25129

Files:
  cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
  cfe/trunk/lib/Sema/SemaDeclAttr.cpp
  cfe/trunk/test/SemaCUDA/bad-attributes.cu


Index: cfe/trunk/lib/Sema/SemaDeclAttr.cpp
===
--- cfe/trunk/lib/Sema/SemaDeclAttr.cpp
+++ cfe/trunk/lib/Sema/SemaDeclAttr.cpp
@@ -3696,6 +3696,19 @@
 D->addAttr(Optnone);
 }
 
+static void handleConstantAttr(Sema &S, Decl *D, const AttributeList &Attr) {
+  if (checkAttrMutualExclusion<CUDASharedAttr>(S, D, Attr.getRange(),
+                                               Attr.getName()))
+    return;
+  auto *VD = cast<VarDecl>(D);
+  if (!VD->hasGlobalStorage()) {
+    S.Diag(Attr.getLoc(), diag::err_cuda_nonglobal_constant);
+    return;
+  }
+  D->addAttr(::new (S.Context) CUDAConstantAttr(
+      Attr.getRange(), S.Context, Attr.getAttributeSpellingListIndex()));
+}
+
 static void handleSharedAttr(Sema &S, Decl *D, const AttributeList &Attr) {
   if (checkAttrMutualExclusion<CUDAConstantAttr>(S, D, Attr.getRange(),
                                                  Attr.getName()))
@@ -5541,8 +5554,7 @@
 handleCommonAttr(S, D, Attr);
 break;
   case AttributeList::AT_CUDAConstant:
-    handleSimpleAttributeWithExclusions<CUDAConstantAttr, CUDASharedAttr>(S, D,
-                                                                          Attr);
+    handleConstantAttr(S, D, Attr);
 break;
   case AttributeList::AT_PassObjectSize:
 handlePassObjectSizeAttr(S, D, Attr);
Index: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
===
--- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
+++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6723,6 +6723,7 @@
 "cannot use variable-length arrays in "
 "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
 def err_cuda_extern_shared : Error<"__shared__ variable %0 cannot be 'extern'">;
+def err_cuda_nonglobal_constant : Error<"__constant__ variables must be global">;
 
 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "
Index: cfe/trunk/test/SemaCUDA/bad-attributes.cu
===
--- cfe/trunk/test/SemaCUDA/bad-attributes.cu
+++ cfe/trunk/test/SemaCUDA/bad-attributes.cu
@@ -61,3 +61,11 @@
 #ifdef EXPECT_INLINE_WARNING
 // expected-warning@-2 {{ignored 'inline' attribute on kernel function 'foobar'}}
 #endif
+
+__constant__ int global_constant;
+void host_fn() {
+  __constant__ int c; // expected-error {{__constant__ variables must be global}}
+}
+__device__ void device_fn() {
+  __constant__ int c; // expected-error {{__constant__ variables must be global}}
+}



[PATCH] D25125: [CUDA] Disallow 'extern __shared__' variables.

2016-09-30 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL282985: [CUDA] Disallow 'extern __shared__' variables. 
(authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D25125?vs=73136&id=73167#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25125

Files:
  cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
  cfe/trunk/lib/Sema/SemaDeclAttr.cpp
  cfe/trunk/test/SemaCUDA/bad-attributes.cu
  cfe/trunk/test/SemaCUDA/extern-shared.cu


Index: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
===
--- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
+++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6722,6 +6722,7 @@
 def err_cuda_vla : Error<
 "cannot use variable-length arrays in "
 "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
+def err_cuda_extern_shared : Error<"__shared__ variable %0 cannot be 'extern'">;
 
 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "
Index: cfe/trunk/test/SemaCUDA/bad-attributes.cu
===
--- cfe/trunk/test/SemaCUDA/bad-attributes.cu
+++ cfe/trunk/test/SemaCUDA/bad-attributes.cu
@@ -42,6 +42,8 @@
 __shared__ __device__ int z9;
 __shared__ __constant__ int z10;  // expected-error {{attributes are not compatible}}
 // expected-note@-1 {{conflicting attribute is here}}
+__constant__ __shared__ int z10a;  // expected-error {{attributes are not compatible}}
+// expected-note@-1 {{conflicting attribute is here}}
 
 __global__ __device__ void z11();  // expected-error {{attributes are not compatible}}
 // expected-note@-1 {{conflicting attribute is here}}
Index: cfe/trunk/test/SemaCUDA/extern-shared.cu
===
--- cfe/trunk/test/SemaCUDA/extern-shared.cu
+++ cfe/trunk/test/SemaCUDA/extern-shared.cu
@@ -0,0 +1,14 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fsyntax-only -fcuda-is-device -verify %s
+
+#include "Inputs/cuda.h"
+
+__device__ void foo() {
+  extern __shared__ int x; // expected-error {{__shared__ variable 'x' cannot be 'extern'}}
+}
+
+__host__ __device__ void bar() {
+  extern __shared__ int x; // expected-error {{__shared__ variable 'x' cannot be 'extern'}}
+}
+
+extern __shared__ int global; // expected-error {{__shared__ variable 'global' cannot be 'extern'}}
Index: cfe/trunk/lib/Sema/SemaDeclAttr.cpp
===
--- cfe/trunk/lib/Sema/SemaDeclAttr.cpp
+++ cfe/trunk/lib/Sema/SemaDeclAttr.cpp
@@ -3696,6 +3696,19 @@
 D->addAttr(Optnone);
 }
 
+static void handleSharedAttr(Sema &S, Decl *D, const AttributeList &Attr) {
+  if (checkAttrMutualExclusion<CUDAConstantAttr>(S, D, Attr.getRange(),
+                                                 Attr.getName()))
+    return;
+  auto *VD = cast<VarDecl>(D);
+  if (VD->hasExternalStorage()) {
+    S.Diag(Attr.getLoc(), diag::err_cuda_extern_shared) << VD;
+    return;
+  }
+  D->addAttr(::new (S.Context) CUDASharedAttr(
+      Attr.getRange(), S.Context, Attr.getAttributeSpellingListIndex()));
+}
+
 static void handleGlobalAttr(Sema &S, Decl *D, const AttributeList &Attr) {
   if (checkAttrMutualExclusion<CUDADeviceAttr>(S, D, Attr.getRange(),
                                                Attr.getName()) ||
@@ -5639,8 +5652,7 @@
 handleSimpleAttribute(S, D, Attr);
 break;
   case AttributeList::AT_CUDAShared:
-    handleSimpleAttributeWithExclusions<CUDASharedAttr, CUDAConstantAttr>(S, D,
-                                                                          Attr);
+    handleSharedAttr(S, D, Attr);
 break;
   case AttributeList::AT_VecReturn:
 handleVecReturnAttr(S, D, Attr);



r282987 - [CUDA] Harmonize asserts in SemaCUDA, NFC.

2016-09-30 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Sep 30 18:57:38 2016
New Revision: 282987

URL: http://llvm.org/viewvc/llvm-project?rev=282987&view=rev
Log:
[CUDA] Harmonize asserts in SemaCUDA, NFC.

Modified:
cfe/trunk/lib/Sema/SemaCUDA.cpp

Modified: cfe/trunk/lib/Sema/SemaCUDA.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaCUDA.cpp?rev=282987&r1=282986&r2=282987&view=diff
==
--- cfe/trunk/lib/Sema/SemaCUDA.cpp (original)
+++ cfe/trunk/lib/Sema/SemaCUDA.cpp Fri Sep 30 18:57:38 2016
@@ -443,7 +443,7 @@ bool Sema::isEmptyCudaDestructor(SourceL
 //    system header, in which case we leave the constexpr function unattributed.
 void Sema::maybeAddCUDAHostDeviceAttrs(Scope *S, FunctionDecl *NewD,
                                        const LookupResult &Previous) {
-  assert(getLangOpts().CUDA && "May be called only for CUDA compilations.");
+  assert(getLangOpts().CUDA && "Should only be called during CUDA compilation");
   if (!getLangOpts().CUDAHostDeviceConstexpr || !NewD->isConstexpr() ||
       NewD->isVariadic() || NewD->hasAttr<CUDAHostAttr>() ||
       NewD->hasAttr<CUDADeviceAttr>() || NewD->hasAttr<CUDAGlobalAttr>())
@@ -482,8 +482,7 @@ void Sema::maybeAddCUDAHostDeviceAttrs(S
 }
 
 bool Sema::CheckCUDACall(SourceLocation Loc, FunctionDecl *Callee) {
-  assert(getLangOpts().CUDA &&
-         "Should only be called during CUDA compilation.");
+  assert(getLangOpts().CUDA && "Should only be called during CUDA compilation");
   assert(Callee && "Callee may not be null.");
   FunctionDecl *Caller = dyn_cast<FunctionDecl>(CurContext);
   if (!Caller)
@@ -561,6 +560,7 @@ bool Sema::CheckCUDAVLA(SourceLocation L
 }
 
 void Sema::CUDASetLambdaAttrs(CXXMethodDecl *Method) {
+  assert(getLangOpts().CUDA && "Should only be called during CUDA compilation");
   if (Method->hasAttr<CUDAHostAttr>() || Method->hasAttr<CUDADeviceAttr>())
     return;
   FunctionDecl *CurFn = dyn_cast<FunctionDecl>(CurContext);




[PATCH] D25139: [CUDA] Add Sema::CUDADiagBuilder and Sema::CUDADiagIfDeviceCode().

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 73171.
jlebar added a comment.

Tweak API a bit.

Now we rely on an implicit conversion to bool.  Which is not great, I know, but
in practice I think it works better than an explicit named function.


https://reviews.llvm.org/D25139

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaCUDA.cpp
  clang/lib/Sema/SemaExprCXX.cpp
  clang/lib/Sema/SemaStmt.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/SemaCUDA/exceptions-host-device.cu
  clang/test/SemaCUDA/exceptions.cu

Index: clang/test/SemaCUDA/exceptions.cu
===
--- clang/test/SemaCUDA/exceptions.cu
+++ clang/test/SemaCUDA/exceptions.cu
@@ -9,13 +9,13 @@
 }
 __device__ void device() {
   throw NULL;
-  // expected-error@-1 {{cannot use 'throw' in __device__ function 'device'}}
+  // expected-error@-1 {{cannot use 'throw' in __device__ function}}
   try {} catch(void*) {}
-  // expected-error@-1 {{cannot use 'try' in __device__ function 'device'}}
+  // expected-error@-1 {{cannot use 'try' in __device__ function}}
 }
 __global__ void kernel() {
   throw NULL;
-  // expected-error@-1 {{cannot use 'throw' in __global__ function 'kernel'}}
+  // expected-error@-1 {{cannot use 'throw' in __global__ function}}
   try {} catch(void*) {}
-  // expected-error@-1 {{cannot use 'try' in __global__ function 'kernel'}}
+  // expected-error@-1 {{cannot use 'try' in __global__ function}}
 }
Index: clang/test/SemaCUDA/exceptions-host-device.cu
===
--- clang/test/SemaCUDA/exceptions-host-device.cu
+++ clang/test/SemaCUDA/exceptions-host-device.cu
@@ -14,8 +14,8 @@
   throw NULL;
   try {} catch(void*) {}
 #ifndef HOST
-  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd1'}}
-  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd1'}}
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function}}
 #endif
 }
 
@@ -31,8 +31,8 @@
   throw NULL;
   try {} catch(void*) {}
 #ifndef HOST
-  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd3'}}
-  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd3'}}
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function}}
 #endif
 }
 __device__ void call_hd3() { hd3(); }
Index: clang/lib/Sema/SemaType.cpp
===
--- clang/lib/Sema/SemaType.cpp
+++ clang/lib/Sema/SemaType.cpp
@@ -2249,8 +2249,8 @@
 return QualType();
   }
   // CUDA device code doesn't support VLAs.
-  if (getLangOpts().CUDA && T->isVariableArrayType() && !CheckCUDAVLA(Loc))
-return QualType();
+  if (getLangOpts().CUDA && T->isVariableArrayType())
+CUDADiagIfDeviceCode(Loc, diag::err_cuda_vla) << CurrentCUDATarget();
 
   // If this is not C99, extwarn about VLA's and C99 array size modifiers.
   if (!getLangOpts().C99) {
Index: clang/lib/Sema/SemaStmt.cpp
===
--- clang/lib/Sema/SemaStmt.cpp
+++ clang/lib/Sema/SemaStmt.cpp
@@ -3646,7 +3646,8 @@
 
   // Exceptions aren't allowed in CUDA device code.
   if (getLangOpts().CUDA)
-CheckCUDAExceptionExpr(TryLoc, "try");
+CUDADiagIfDeviceCode(TryLoc, diag::err_cuda_device_exceptions)
+<< "try" << CurrentCUDATarget();
 
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(TryLoc, diag::err_omp_simd_region_cannot_use_stmt) << "try";
Index: clang/lib/Sema/SemaExprCXX.cpp
===
--- clang/lib/Sema/SemaExprCXX.cpp
+++ clang/lib/Sema/SemaExprCXX.cpp
@@ -685,7 +685,8 @@
 
   // Exceptions aren't allowed in CUDA device code.
   if (getLangOpts().CUDA)
-CheckCUDAExceptionExpr(OpLoc, "throw");
+CUDADiagIfDeviceCode(OpLoc, diag::err_cuda_device_exceptions)
+<< "throw" << CurrentCUDATarget();
 
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(OpLoc, diag::err_omp_simd_region_cannot_use_stmt) << "throw";
Index: clang/lib/Sema/SemaCUDA.cpp
===
--- clang/lib/Sema/SemaCUDA.cpp
+++ clang/lib/Sema/SemaCUDA.cpp
@@ -42,6 +42,10 @@
 
 /// IdentifyCUDATarget - Determine the CUDA compilation target for this function
 Sema::CUDAFunctionTarget Sema::IdentifyCUDATarget(const FunctionDecl *D) {
+  // Code that lives outside a function is run on the host.
+  if (D == nullptr)
+    return CFT_Host;
+
   if (D->hasAttr<CUDAInvalidTargetAttr>())
     return CFT_InvalidTarget;
 
@@ -95,9 +99,8 @@
 Sema::IdentifyCUDAPreference(const FunctionDecl *Caller,
  const FunctionDecl *Callee) {
   assert(Callee && "Callee 

[PATCH] D25143: [CUDA] Disallow __shared__ variables in host functions.

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 73172.
jlebar added a comment.

Update to new CUDADiagIfHostCode API.


https://reviews.llvm.org/D25143

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Sema/SemaDeclAttr.cpp
  clang/test/SemaCUDA/bad-attributes.cu


Index: clang/test/SemaCUDA/bad-attributes.cu
===
--- clang/test/SemaCUDA/bad-attributes.cu
+++ clang/test/SemaCUDA/bad-attributes.cu
@@ -65,6 +65,7 @@
 __constant__ int global_constant;
 void host_fn() {
   __constant__ int c; // expected-error {{__constant__ variables must be global}}
+  __shared__ int s; // expected-error {{__shared__ local variables not allowed in __host__ functions}}
 }
 __device__ void device_fn() {
   __constant__ int c; // expected-error {{__constant__ variables must be global}}
Index: clang/lib/Sema/SemaDeclAttr.cpp
===
--- clang/lib/Sema/SemaDeclAttr.cpp
+++ clang/lib/Sema/SemaDeclAttr.cpp
@@ -3718,6 +3718,10 @@
     S.Diag(Attr.getLoc(), diag::err_cuda_extern_shared) << VD;
     return;
   }
+  if (S.getLangOpts().CUDA && VD->hasLocalStorage() &&
+      S.CUDADiagIfHostCode(Attr.getLoc(), diag::err_cuda_host_shared)
+          << S.CurrentCUDATarget())
+    return;
   D->addAttr(::new (S.Context) CUDASharedAttr(
   Attr.getRange(), S.Context, Attr.getAttributeSpellingListIndex()));
 }
Index: clang/include/clang/Basic/DiagnosticSemaKinds.td
===
--- clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6723,6 +6723,9 @@
 "cannot use variable-length arrays in "
 "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
 def err_cuda_extern_shared : Error<"__shared__ variable %0 cannot be 'extern'">;
+def err_cuda_host_shared : Error<
+    "__shared__ local variables not allowed in "
+    "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
 def err_cuda_nonglobal_constant : Error<"__constant__ variables must be global">;
 
 def warn_non_pod_vararg_with_format_string : Warning<




[PATCH] D25150: [CUDA] Allow static variables in __host__ __device__ functions, so long as they're never codegen'ed for device.

2016-09-30 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added reviewers: tra, rnk.
jlebar added a subscriber: cfe-commits.

https://reviews.llvm.org/D25150

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Sema/SemaDecl.cpp
  clang/test/SemaCUDA/device-var-init.cu
  clang/test/SemaCUDA/static-vars-hd.cu


Index: clang/test/SemaCUDA/static-vars-hd.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/static-vars-hd.cu
@@ -0,0 +1,20 @@
+// RUN: %clang_cc1 -fcxx-exceptions -fcuda-is-device -S -o /dev/null -verify %s
+// RUN: %clang_cc1 -fcxx-exceptions -S -o /dev/null -D HOST -verify %s
+
+#include "Inputs/cuda.h"
+
+#ifdef HOST
+// expected-no-diagnostics
+#endif
+
+__host__ __device__ void f() {
+  static int x = 42;
+#ifndef HOST
+  // expected-error@-2 {{within a __host__ __device__ function, only __shared__ variables may be marked 'static'}}
+#endif
+}
+
+inline __host__ __device__ void g() {
+  static int x = 42; // no error on device because this is never codegen'ed there.
+}
+void call_g() { g(); }
Index: clang/test/SemaCUDA/device-var-init.cu
===
--- clang/test/SemaCUDA/device-var-init.cu
+++ clang/test/SemaCUDA/device-var-init.cu
@@ -207,9 +207,9 @@
   // expected-error@-1 {{initialization is not supported for __shared__ variables.}}
 
   static __device__ int ds;
-  // expected-error@-1 {{Within a __device__/__global__ function, only __shared__ variables may be marked "static"}}
+  // expected-error@-1 {{within a __device__ function, only __shared__ variables may be marked 'static'}}
   static __constant__ int dc;
-  // expected-error@-1 {{Within a __device__/__global__ function, only __shared__ variables may be marked "static"}}
+  // expected-error@-1 {{within a __device__ function, only __shared__ variables may be marked 'static'}}
   static int v;
-  // expected-error@-1 {{Within a __device__/__global__ function, only __shared__ variables may be marked "static"}}
+  // expected-error@-1 {{within a __device__ function, only __shared__ variables may be marked 'static'}}
 }
Index: clang/lib/Sema/SemaDecl.cpp
===
--- clang/lib/Sema/SemaDecl.cpp
+++ clang/lib/Sema/SemaDecl.cpp
@@ -10642,12 +10642,11 @@
   // CUDA E.2.9.4: Within the body of a __device__ or __global__
   // function, only __shared__ variables may be declared with
   // static storage class.
-  if (getLangOpts().CUDA && getLangOpts().CUDAIsDevice &&
-      (FD->hasAttr<CUDADeviceAttr>() || FD->hasAttr<CUDAGlobalAttr>()) &&
-      !VD->hasAttr<CUDASharedAttr>()) {
-    Diag(VD->getLocation(), diag::err_device_static_local_var);
+  if (getLangOpts().CUDA && !VD->hasAttr<CUDASharedAttr>() &&
+      CUDADiagIfDeviceCode(VD->getLocation(),
+                           diag::err_device_static_local_var)
+          << CurrentCUDATarget())
     VD->setInvalidDecl();
-  }
 }
   }
 
@@ -10661,7 +10660,7 @@
 if (Init && VD->hasGlobalStorage()) {
       if (VD->hasAttr<CUDADeviceAttr>() || VD->hasAttr<CUDAConstantAttr>() ||
           VD->hasAttr<CUDASharedAttr>()) {
-        assert((!VD->isStaticLocal() || VD->hasAttr<CUDASharedAttr>()));
+        assert(!VD->isStaticLocal() || VD->hasAttr<CUDASharedAttr>());
         bool AllowedInit = false;
         if (const CXXConstructExpr *CE = dyn_cast<CXXConstructExpr>(Init))
   AllowedInit =
Index: clang/include/clang/Basic/DiagnosticSemaKinds.td
===
--- clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6717,8 +6717,8 @@
 def err_shared_var_init : Error<
 "initialization is not supported for __shared__ variables.">;
 def err_device_static_local_var : Error<
-"Within a __device__/__global__ function, "
-"only __shared__ variables may be marked \"static\"">;
+"within a %select{__device__|__global__|__host__|__host__ __device__}0 "
+"function, only __shared__ variables may be marked 'static'">;
 def err_cuda_vla : Error<
 "cannot use variable-length arrays in "
 "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;


Index: clang/test/SemaCUDA/static-vars-hd.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/static-vars-hd.cu
@@ -0,0 +1,20 @@
+// RUN: %clang_cc1 -fcxx-exceptions -fcuda-is-device -S -o /dev/null -verify %s
+// RUN: %clang_cc1 -fcxx-exceptions -S -o /dev/null -D HOST -verify %s
+
+#include "Inputs/cuda.h"
+
+#ifdef HOST
+// expected-no-diagnostics
+#endif
+
+__host__ __device__ void f() {
+  static int x = 42;
+#ifndef HOST
+  // expected-error@-2 {{within a __host__ __device__ function, only __shared__ variables may be marked 'static'}}
+#endif
+}
+
+inline __host__ __device__ void g() {
+  static int x = 42; // no error on device because this is never codegen'ed there.
+}
+void call_g() { g(); }
Index: clang/test/SemaCUDA/device-var-init.cu
==

r283068 - [CUDA] Allow extern __shared__ on empty-length arrays.

2016-10-02 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Sun Oct  2 10:24:50 2016
New Revision: 283068

URL: http://llvm.org/viewvc/llvm-project?rev=283068&view=rev
Log:
[CUDA] Allow extern __shared__ on empty-length arrays.

"extern __shared__ int x[]" is OK.

Modified:
cfe/trunk/lib/Sema/SemaDeclAttr.cpp
cfe/trunk/test/SemaCUDA/extern-shared.cu

Modified: cfe/trunk/lib/Sema/SemaDeclAttr.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaDeclAttr.cpp?rev=283068&r1=283067&r2=283068&view=diff
==
--- cfe/trunk/lib/Sema/SemaDeclAttr.cpp (original)
+++ cfe/trunk/lib/Sema/SemaDeclAttr.cpp Sun Oct  2 10:24:50 2016
@@ -3714,7 +3714,9 @@ static void handleSharedAttr(Sema &S, De
  Attr.getName()))
 return;
   auto *VD = cast(D);
-  if (VD->hasExternalStorage()) {
+  // extern __shared__ is only allowed on arrays with no length (e.g.
+  // "int x[]").
+  if (VD->hasExternalStorage() && !isa<IncompleteArrayType>(VD->getType())) {
 S.Diag(Attr.getLoc(), diag::err_cuda_extern_shared) << VD;
 return;
   }

Modified: cfe/trunk/test/SemaCUDA/extern-shared.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/extern-shared.cu?rev=283068&r1=283067&r2=283068&view=diff
==
--- cfe/trunk/test/SemaCUDA/extern-shared.cu (original)
+++ cfe/trunk/test/SemaCUDA/extern-shared.cu Sun Oct  2 10:24:50 2016
@@ -5,10 +5,19 @@
 
 __device__ void foo() {
   extern __shared__ int x; // expected-error {{__shared__ variable 'x' cannot be 'extern'}}
+  extern __shared__ int arr[];  // ok
+  extern __shared__ int arr0[0]; // expected-error {{__shared__ variable 'arr0' cannot be 'extern'}}
+  extern __shared__ int arr1[1]; // expected-error {{__shared__ variable 'arr1' cannot be 'extern'}}
+  extern __shared__ int* ptr ; // expected-error {{__shared__ variable 'ptr' cannot be 'extern'}}
 }
 
 __host__ __device__ void bar() {
-  extern __shared__ int x; // expected-error {{__shared__ variable 'x' cannot be 'extern'}}
+  extern __shared__ int arr[];  // ok
+  extern __shared__ int arr0[0]; // expected-error {{__shared__ variable 'arr0' cannot be 'extern'}}
+  extern __shared__ int arr1[1]; // expected-error {{__shared__ variable 'arr1' cannot be 'extern'}}
+  extern __shared__ int* ptr ; // expected-error {{__shared__ variable 'ptr' cannot be 'extern'}}
 }
 
 extern __shared__ int global; // expected-error {{__shared__ variable 'global' cannot be 'extern'}}
+extern __shared__ int global_arr[]; // ok
+extern __shared__ int global_arr1[1]; // expected-error {{__shared__ variable 'global_arr1' cannot be 'extern'}}


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25166: [CUDA] Mark device functions as nounwind.

2016-10-02 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: tra.
jlebar added a subscriber: cfe-commits.

This prevents clang from emitting 'invoke's and catch statements.

Things previously mostly worked thanks to TryToMarkNoThrow() in
CodeGenFunction.  But this is not a proper IPO, and it doesn't properly
handle cases like mutual recursion.

Fixes bug 30593.


https://reviews.llvm.org/D25166

Files:
  clang/lib/Sema/SemaDecl.cpp
  clang/test/CodeGenCUDA/convergent.cu
  clang/test/CodeGenCUDA/device-var-init.cu
  clang/test/CodeGenCUDA/nothrow.cu


Index: clang/test/CodeGenCUDA/nothrow.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/nothrow.cu
@@ -0,0 +1,29 @@
+// RUN: %clang_cc1 -fcxx-exceptions -fexceptions -fcuda-is-device -triple nvptx-nvidia-cuda -emit-llvm \
+// RUN:   -disable-llvm-passes -o - %s | FileCheck -check-prefix DEVICE %s
+
+// RUN: %clang_cc1 -fcxx-exceptions -fexceptions -triple x86_64-unknown-linux-gnu -emit-llvm \
+// RUN:   -disable-llvm-passes -o - %s | \
+// RUN:  FileCheck -check-prefix HOST %s
+
+#include "Inputs/cuda.h"
+
+__host__ __device__ void f();
+
+// HOST: define void @_Z7host_fnv() [[HOST_ATTR:#[0-9]+]]
+void host_fn() { f(); }
+
+// DEVICE: define void @_Z3foov() [[DEVICE_ATTR:#[0-9]+]]
+__device__ void foo() { f(); }
+
+// This is nounwind only on the device side.
+// CHECK: define void @_Z3foov() [[DEVICE_ATTR:#[0-9]+]]
+__host__ __device__ void bar() { f(); }
+
+// DEVICE: define void @_Z3bazv() [[DEVICE_ATTR:#[0-9]+]]
+__global__ void baz() { f(); }
+
+// DEVICE: attributes [[DEVICE_ATTR]] = {
+// DEVICE-SAME: nounwind
+// HOST: attributes [[HOST_ATTR]] = {
+// HOST-NOT: nounwind
+// HOST-SAME: }
Index: clang/test/CodeGenCUDA/device-var-init.cu
===
--- clang/test/CodeGenCUDA/device-var-init.cu
+++ clang/test/CodeGenCUDA/device-var-init.cu
@@ -182,9 +182,9 @@
   df(); // CHECK: call void @_Z2dfv()
 
   // Verify that we only call non-empty destructors
-  // CHECK-NEXT: call void @_ZN8T_FA_NEDD1Ev(%struct.T_FA_NED* %t_fa_ned) #6
-  // CHECK-NEXT: call void @_ZN7T_F_NEDD1Ev(%struct.T_F_NED* %t_f_ned) #6
-  // CHECK-NEXT: call void @_ZN7T_B_NEDD1Ev(%struct.T_B_NED* %t_b_ned) #6
+  // CHECK-NEXT: call void @_ZN8T_FA_NEDD1Ev(%struct.T_FA_NED* %t_fa_ned)
+  // CHECK-NEXT: call void @_ZN7T_F_NEDD1Ev(%struct.T_F_NED* %t_f_ned)
+  // CHECK-NEXT: call void @_ZN7T_B_NEDD1Ev(%struct.T_B_NED* %t_b_ned)
   // CHECK-NEXT: call void @_ZN2VDD1Ev(%struct.VD* %vd)
   // CHECK-NEXT: call void @_ZN3NEDD1Ev(%struct.NED* %ned)
   // CHECK-NEXT: call void @_ZN2UDD1Ev(%struct.UD* %ud)
Index: clang/test/CodeGenCUDA/convergent.cu
===
--- clang/test/CodeGenCUDA/convergent.cu
+++ clang/test/CodeGenCUDA/convergent.cu
@@ -36,8 +36,8 @@
 // DEVICE: attributes [[BAZ_ATTR]] = {
 // DEVICE-SAME: convergent
 // DEVICE-SAME: }
-// DEVICE: attributes [[CALL_ATTR]] = { convergent }
-// DEVICE: attributes [[ASM_ATTR]] = { convergent
+// DEVICE-DAG: attributes [[CALL_ATTR]] = { convergent
+// DEVICE-DAG: attributes [[ASM_ATTR]] = { convergent
 
 // HOST: declare void @_Z3bazv() [[BAZ_ATTR:#[0-9]+]]
 // HOST: attributes [[BAZ_ATTR]] = {
Index: clang/lib/Sema/SemaDecl.cpp
===
--- clang/lib/Sema/SemaDecl.cpp
+++ clang/lib/Sema/SemaDecl.cpp
@@ -12074,6 +12074,14 @@
   FD->addAttr(NoThrowAttr::CreateImplicit(Context, FD->getLocation()));
   }
 
+  // CUDA device functions cannot throw.
+  if (getLangOpts().CUDA && !FD->hasAttr<NoThrowAttr>()) {
+    CUDAFunctionTarget T = IdentifyCUDATarget(FD);
+    if (T == CFT_Device || T == CFT_Global ||
+        (getLangOpts().CUDAIsDevice && T == CFT_HostDevice))
+      FD->addAttr(NoThrowAttr::CreateImplicit(Context, FD->getLocation()));
+  }
+
   IdentifierInfo *Name = FD->getIdentifier();
   if (!Name)
 return;



[PATCH] D25166: [CUDA] Mark device functions as nounwind.

2016-10-03 Thread Justin Lebar via cfe-commits
jlebar added a comment.

In https://reviews.llvm.org/D25166#559117, @rnk wrote:

> It feels like the right thing is to disable EH in device side compilation, 
> but obviously that won't work because it would reject try/throw in host code.


Exactly.

> I think instead of doing that, we should make sure that CUDA diagnoses try / 
> catch / throw in device functions, > and then do what you've done here.

Disallowing try/throw is https://reviews.llvm.org/D25036.

> Also, take a look at CodeGenFunction::getInvokeDestImpl(). I think you should 
> add some checks in there to return nullptr if we're doing a device-side CUDA 
> compilation. That's a much more direct way to ensure we never generate 
> invokes on the device side.

Hm...if we did this but didn't mark functions as "this never throws", would 
that really take care of everything?  For example, if we were to call an 
external function which is itself not marked as noexcept, llvm wouldn't be able 
to infer that we don't throw.

Are you saying we should do both?  I'm happy to do that, but I am not sure, if 
we keep the nounwind attribute in, whether it's possible to write a testcase.

> Can we use a noexcept exception specification instead of this GCC attribute?

I have no attachment to this specific attribute, but wouldn't adding noexcept 
change the semantics of the program (because traits can detect whether or not 
noexcept is present)?


https://reviews.llvm.org/D25166





[PATCH] D24571: [CUDA] Disallow overloading destructors.

2016-10-03 Thread Justin Lebar via cfe-commits
jlebar added inline comments.


> rnk wrote in SemaOverload.cpp:1131
> I feel like we should exit early on destructors here, before we do any target 
> checks. The assert also feels kind of trivial because we only come into this 
> overload machinery if looking up New's DeclarationName found Old.

Will do when I check this in.  Thank you for the review!

https://reviews.llvm.org/D24571





r283120 - [CUDA] Disallow overloading destructors.

2016-10-03 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Mon Oct  3 11:48:23 2016
New Revision: 283120

URL: http://llvm.org/viewvc/llvm-project?rev=283120&view=rev
Log:
[CUDA] Disallow overloading destructors.

Summary:
We'd attempted to allow this, but turns out we were doing a very bad
job.  :)

Making this work properly would be a giant change in clang.  For
example, we'd need to make CXXRecordDecl::getDestructor()
context-sensitive, because the destructor you end up with depends on
where you're calling it from.

For now (and hopefully for ever), just disallow overloading of
destructors in CUDA.

Reviewers: rsmith

Subscribers: cfe-commits, tra

Differential Revision: https://reviews.llvm.org/D24571

Added:
cfe/trunk/test/SemaCUDA/no-destructor-overload.cu
Removed:
cfe/trunk/test/SemaCUDA/call-overloaded-destructor.cu
Modified:
cfe/trunk/lib/Sema/SemaOverload.cpp
cfe/trunk/test/CodeGenCUDA/function-overload.cu
cfe/trunk/test/SemaCUDA/function-overload.cu

Modified: cfe/trunk/lib/Sema/SemaOverload.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOverload.cpp?rev=283120&r1=283119&r2=283120&view=diff
==
--- cfe/trunk/lib/Sema/SemaOverload.cpp (original)
+++ cfe/trunk/lib/Sema/SemaOverload.cpp Mon Oct  3 11:48:23 2016
@@ -1129,6 +1129,11 @@ bool Sema::IsOverload(FunctionDecl *New,
   }
 
   if (getLangOpts().CUDA && ConsiderCudaAttrs) {
+// Don't allow overloading of destructors.  (In theory we could, but it
+// would be a giant change to clang.)
+    if (isa<CXXDestructorDecl>(New))
+      return false;
+
 CUDAFunctionTarget NewTarget = IdentifyCUDATarget(New),
OldTarget = IdentifyCUDATarget(Old);
 if (NewTarget == CFT_InvalidTarget || NewTarget == CFT_Global)

Modified: cfe/trunk/test/CodeGenCUDA/function-overload.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCUDA/function-overload.cu?rev=283120&r1=283119&r2=283120&view=diff
==
--- cfe/trunk/test/CodeGenCUDA/function-overload.cu (original)
+++ cfe/trunk/test/CodeGenCUDA/function-overload.cu Mon Oct  3 11:48:23 2016
@@ -16,8 +16,6 @@ int x;
 struct s_cd_dh {
   __host__ s_cd_dh() { x = 11; }
   __device__ s_cd_dh() { x = 12; }
-  __host__ ~s_cd_dh() { x = 21; }
-  __device__ ~s_cd_dh() { x = 22; }
 };
 
 struct s_cd_hd {
@@ -38,7 +36,6 @@ void wrapper() {
   // CHECK-BOTH: call void @_ZN7s_cd_hdC1Ev
 
   // CHECK-BOTH: call void @_ZN7s_cd_hdD1Ev(
-  // CHECK-BOTH: call void @_ZN7s_cd_dhD1Ev(
 }
 // CHECK-BOTH: ret void
 
@@ -56,8 +53,3 @@ void wrapper() {
 // CHECK-BOTH: define linkonce_odr void @_ZN7s_cd_hdD2Ev(
 // CHECK-BOTH: store i32 32,
 // CHECK-BOTH: ret void
-
-// CHECK-BOTH: define linkonce_odr void @_ZN7s_cd_dhD2Ev(
-// CHECK-HOST:   store i32 21,
-// CHECK-DEVICE: store i32 22,
-// CHECK-BOTH: ret void

Removed: cfe/trunk/test/SemaCUDA/call-overloaded-destructor.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/call-overloaded-destructor.cu?rev=283119&view=auto
==
--- cfe/trunk/test/SemaCUDA/call-overloaded-destructor.cu (original)
+++ cfe/trunk/test/SemaCUDA/call-overloaded-destructor.cu (removed)
@@ -1,17 +0,0 @@
-// expected-no-diagnostics
-
-// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsyntax-only -fcuda-is-device -verify %s
-
-#include "Inputs/cuda.h"
-
-struct S {
-  __host__ ~S() {}
-  __device__ ~S() {}
-};
-
-__host__ __device__ void test() {
-  S s;
-  // This should not crash clang.
-  s.~S();
-}

Modified: cfe/trunk/test/SemaCUDA/function-overload.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/function-overload.cu?rev=283120&r1=283119&r2=283120&view=diff
==
--- cfe/trunk/test/SemaCUDA/function-overload.cu (original)
+++ cfe/trunk/test/SemaCUDA/function-overload.cu Mon Oct  3 11:48:23 2016
@@ -210,44 +210,11 @@ struct d_h {
   __host__ ~d_h() {} // expected-error {{destructor cannot be redeclared}}
 };
 
-// H/D overloading is OK
-struct d_dh {
-  __device__ ~d_dh() {}
-  __host__ ~d_dh() {}
-};
-
 // HD is OK
 struct d_hd {
   __host__ __device__ ~d_hd() {}
 };
 
-// Mixing H/D and HD is not allowed.
-struct d_dhhd {
-  __device__ ~d_dhhd() {}
-  __host__ ~d_dhhd() {} // expected-note {{previous declaration is here}}
-  __host__ __device__ ~d_dhhd() {} // expected-error {{destructor cannot be redeclared}}
-};
-
-struct d_hhd {
-  __host__ ~d_hhd() {} // expected-note {{previous declaration is here}}
-  __host__ __device__ ~d_hhd() {} // expected-error {{destructor cannot be redeclared}}
-};
-
-struct d_hdh {
-  __host__ __device__ ~d_hdh() {} // expected-note {{previous declaration is here}}
-  __host__ ~d_hdh() {} // expected-error {{destructor cannot be redeclared}}

r283121 - [CUDA] Clean up some comments in Sema::IsOverload. NFC

2016-10-03 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Mon Oct  3 11:48:27 2016
New Revision: 283121

URL: http://llvm.org/viewvc/llvm-project?rev=283121&view=rev
Log:
[CUDA] Clean up some comments in Sema::IsOverload.  NFC

Modified:
cfe/trunk/lib/Sema/SemaOverload.cpp

Modified: cfe/trunk/lib/Sema/SemaOverload.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOverload.cpp?rev=283121&r1=283120&r2=283121&view=diff
==
--- cfe/trunk/lib/Sema/SemaOverload.cpp (original)
+++ cfe/trunk/lib/Sema/SemaOverload.cpp Mon Oct  3 11:48:27 2016
@@ -1141,17 +1141,17 @@ bool Sema::IsOverload(FunctionDecl *New,
 
 assert((OldTarget != CFT_InvalidTarget) && "Unexpected invalid target.");
 
-// Don't allow mixing of HD with other kinds. This guarantees that
-// we have only one viable function with this signature on any
-// side of CUDA compilation .
-// __global__ functions can't be overloaded based on attribute
-// difference because, like HD, they also exist on both sides.
+// Don't allow HD and global functions to overload other functions with the
+// same signature.  We allow overloading based on CUDA attributes so that
+// functions can have different implementations on the host and device, but
+// HD/global functions "exist" in some sense on both the host and device, so
+// should have the same implementation on both sides.
 if ((NewTarget == CFT_HostDevice) || (OldTarget == CFT_HostDevice) ||
 (NewTarget == CFT_Global) || (OldTarget == CFT_Global))
   return false;
 
-// Allow overloading of functions with same signature, but
-// different CUDA target attributes.
+// Allow overloading of functions with same signature and different CUDA
+// target attributes.
 return NewTarget != OldTarget;
   }
 




[PATCH] D25139: [CUDA] Add Sema::CUDADiagBuilder and Sema::CUDADiagIfDeviceCode().

2016-10-03 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 73315.
jlebar marked an inline comment as done.
jlebar added a comment.

Address review comments, and rebase atop https://reviews.llvm.org/D24573.


https://reviews.llvm.org/D25139

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaCUDA.cpp
  clang/lib/Sema/SemaExprCXX.cpp
  clang/lib/Sema/SemaStmt.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/SemaCUDA/exceptions-host-device.cu
  clang/test/SemaCUDA/exceptions.cu

Index: clang/test/SemaCUDA/exceptions.cu
===
--- clang/test/SemaCUDA/exceptions.cu
+++ clang/test/SemaCUDA/exceptions.cu
@@ -9,13 +9,13 @@
 }
 __device__ void device() {
   throw NULL;
-  // expected-error@-1 {{cannot use 'throw' in __device__ function 'device'}}
+  // expected-error@-1 {{cannot use 'throw' in __device__ function}}
   try {} catch(void*) {}
-  // expected-error@-1 {{cannot use 'try' in __device__ function 'device'}}
+  // expected-error@-1 {{cannot use 'try' in __device__ function}}
 }
 __global__ void kernel() {
   throw NULL;
-  // expected-error@-1 {{cannot use 'throw' in __global__ function 'kernel'}}
+  // expected-error@-1 {{cannot use 'throw' in __global__ function}}
   try {} catch(void*) {}
-  // expected-error@-1 {{cannot use 'try' in __global__ function 'kernel'}}
+  // expected-error@-1 {{cannot use 'try' in __global__ function}}
 }
Index: clang/test/SemaCUDA/exceptions-host-device.cu
===
--- clang/test/SemaCUDA/exceptions-host-device.cu
+++ clang/test/SemaCUDA/exceptions-host-device.cu
@@ -14,8 +14,8 @@
   throw NULL;
   try {} catch(void*) {}
 #ifndef HOST
-  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd1'}}
-  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd1'}}
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function}}
 #endif
 }
 
@@ -31,8 +31,8 @@
   throw NULL;
   try {} catch(void*) {}
 #ifndef HOST
-  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd3'}}
-  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd3'}}
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function}}
 #endif
 }
 __device__ void call_hd3() { hd3(); }
Index: clang/lib/Sema/SemaType.cpp
===
--- clang/lib/Sema/SemaType.cpp
+++ clang/lib/Sema/SemaType.cpp
@@ -2249,8 +2249,8 @@
 return QualType();
   }
   // CUDA device code doesn't support VLAs.
-  if (getLangOpts().CUDA && T->isVariableArrayType() && !CheckCUDAVLA(Loc))
-return QualType();
+  if (getLangOpts().CUDA && T->isVariableArrayType())
+CUDADiagIfDeviceCode(Loc, diag::err_cuda_vla) << CurrentCUDATarget();
 
   // If this is not C99, extwarn about VLA's and C99 array size modifiers.
   if (!getLangOpts().C99) {
Index: clang/lib/Sema/SemaStmt.cpp
===
--- clang/lib/Sema/SemaStmt.cpp
+++ clang/lib/Sema/SemaStmt.cpp
@@ -3646,7 +3646,8 @@
 
   // Exceptions aren't allowed in CUDA device code.
   if (getLangOpts().CUDA)
-CheckCUDAExceptionExpr(TryLoc, "try");
+CUDADiagIfDeviceCode(TryLoc, diag::err_cuda_device_exceptions)
+<< "try" << CurrentCUDATarget();
 
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(TryLoc, diag::err_omp_simd_region_cannot_use_stmt) << "try";
Index: clang/lib/Sema/SemaExprCXX.cpp
===
--- clang/lib/Sema/SemaExprCXX.cpp
+++ clang/lib/Sema/SemaExprCXX.cpp
@@ -685,7 +685,8 @@
 
   // Exceptions aren't allowed in CUDA device code.
   if (getLangOpts().CUDA)
-CheckCUDAExceptionExpr(OpLoc, "throw");
+CUDADiagIfDeviceCode(OpLoc, diag::err_cuda_device_exceptions)
+<< "throw" << CurrentCUDATarget();
 
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(OpLoc, diag::err_omp_simd_region_cannot_use_stmt) << "throw";
Index: clang/lib/Sema/SemaCUDA.cpp
===
--- clang/lib/Sema/SemaCUDA.cpp
+++ clang/lib/Sema/SemaCUDA.cpp
@@ -42,6 +42,10 @@
 
 /// IdentifyCUDATarget - Determine the CUDA compilation target for this function
 Sema::CUDAFunctionTarget Sema::IdentifyCUDATarget(const FunctionDecl *D) {
+  // Code that lives outside a function is run on the host.
+  if (D == nullptr)
+return CFT_Host;
+
   if (D->hasAttr<CUDAInvalidTargetAttr>())
     return CFT_InvalidTarget;
 
@@ -95,9 +99,8 @@
 Sema::IdentifyCUDAPreference(const FunctionDecl *Caller,
  const FunctionDecl *Callee) {
   assert(Callee && "Callee must be valid.");
+  CUDAFunctionTarget CallerTar

[PATCH] D25139: [CUDA] Add Sema::CUDADiagBuilder and Sema::CUDADiagIfDeviceCode().

2016-10-03 Thread Justin Lebar via cfe-commits
jlebar added inline comments.


> rnk wrote in Sema.h:9238
> I'm concerned that this usage pattern isn't going to be efficient because you 
> build the complete diagnostic before calling the bool conversion operator to 
> determine that it doesn't need to be emitted. I think you want to construct 
> something more like:
> 
>   if (isCUDADeviceCode())
> CUDADiag(...) << ...;
> 
> Otherwise you are going to construct and destruct a large number of 
> diagnostics about language features that are forbidden in device code, but 
> are legal in host code, and 99% of the TU is going to be host code that uses 
> these illegal features.

I think the comment is misleading -- I tried to update it to resolve this 
misunderstanding.  Does it make more sense now?

> rnk wrote in Sema.h:9258
> Remind me why we need to do this? Which arena is this stuff allocated in and 
> where would I go to read more about it? My thought is that, if we don't 
> construct very many of these, we should just allocate them in the usual 
> ASTContext arena and let them live forever. It would be more consistent with 
> our usual way of doing things.

These diagnostics live until the end of codegen, and so are destroyed after the 
ASTContext.

I am becoming increasingly displeased with emitting these errors during 
codegen.  In particular, it makes it annoying to write tests.  You cannot test 
deferred and immediate errors in the same test, because if you have any 
immediate errors, we never codegen, so we never emit the deferred ones.  This 
will also be a suboptimal user experience.

The only serious alternative mooted thus far is emitting the deferred errors 
when a function is marked used.  But this is going to emit deferred errors if 
e.g. two inline host+device functions have mutual recursion but are otherwise 
never touched (so don't need to be codegen'ed).  I am not sure if this will be 
OK or not, but I need to look at it.

If we move the errors to being emitted earlier, we won't need to do this dance.
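A sketch of the mutual-recursion case described above (function names invented for illustration): each function would be a deferred error if codegen'ed for the device, but since nothing references them, device codegen never touches them and the deferred diagnostics are correctly dropped. A "diagnose when marked used" scheme would have to reason about the cycle instead.

```cuda
// Each body contains a device-illegal construct (throw), but is only an
// error if it is actually codegen'ed for the device.  Neither function is
// called from any kernel, so deferring to codegen stays silent; an
// emit-at-use scheme would see ping <-> pong reference each other and
// could report both spuriously.
inline __host__ __device__ void ping(int n);
inline __host__ __device__ void pong(int n) { if (n) ping(n - 1); else throw 0; }
inline __host__ __device__ void ping(int n) { if (n) pong(n - 1); else throw 0; }
```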

https://reviews.llvm.org/D25139





[PATCH] D24571: [CUDA] Disallow overloading destructors.

2016-10-04 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL283120: [CUDA] Disallow overloading destructors. (authored 
by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D24571?vs=71379&id=73413#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D24571

Files:
  cfe/trunk/lib/Sema/SemaOverload.cpp
  cfe/trunk/test/CodeGenCUDA/function-overload.cu
  cfe/trunk/test/SemaCUDA/call-overloaded-destructor.cu
  cfe/trunk/test/SemaCUDA/function-overload.cu
  cfe/trunk/test/SemaCUDA/no-destructor-overload.cu

Index: cfe/trunk/test/CodeGenCUDA/function-overload.cu
===
--- cfe/trunk/test/CodeGenCUDA/function-overload.cu
+++ cfe/trunk/test/CodeGenCUDA/function-overload.cu
@@ -16,8 +16,6 @@
 struct s_cd_dh {
   __host__ s_cd_dh() { x = 11; }
   __device__ s_cd_dh() { x = 12; }
-  __host__ ~s_cd_dh() { x = 21; }
-  __device__ ~s_cd_dh() { x = 22; }
 };
 
 struct s_cd_hd {
@@ -38,7 +36,6 @@
   // CHECK-BOTH: call void @_ZN7s_cd_hdC1Ev
 
   // CHECK-BOTH: call void @_ZN7s_cd_hdD1Ev(
-  // CHECK-BOTH: call void @_ZN7s_cd_dhD1Ev(
 }
 // CHECK-BOTH: ret void
 
@@ -56,8 +53,3 @@
 // CHECK-BOTH: define linkonce_odr void @_ZN7s_cd_hdD2Ev(
 // CHECK-BOTH: store i32 32,
 // CHECK-BOTH: ret void
-
-// CHECK-BOTH: define linkonce_odr void @_ZN7s_cd_dhD2Ev(
-// CHECK-HOST:   store i32 21,
-// CHECK-DEVICE: store i32 22,
-// CHECK-BOTH: ret void
Index: cfe/trunk/test/SemaCUDA/no-destructor-overload.cu
===
--- cfe/trunk/test/SemaCUDA/no-destructor-overload.cu
+++ cfe/trunk/test/SemaCUDA/no-destructor-overload.cu
@@ -0,0 +1,33 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -fcuda-is-device -fsyntax-only -verify %s
+
+#include "Inputs/cuda.h"
+
+// We don't allow destructors to be overloaded.  Making this work would be a
+// giant change to clang, and the use cases seem quite limited.
+
+struct A {
+  ~A() {} // expected-note {{previous declaration is here}}
+  __device__ ~A() {} // expected-error {{destructor cannot be redeclared}}
+};
+
+struct B {
+  __host__ ~B() {} // expected-note {{previous declaration is here}}
+  __host__ __device__ ~B() {} // expected-error {{destructor cannot be redeclared}}
+};
+
+struct C {
+  __host__ __device__ ~C() {} // expected-note {{previous declaration is here}}
+  __host__ ~C() {} // expected-error {{destructor cannot be redeclared}}
+};
+
+struct D {
+  __device__ ~D() {} // expected-note {{previous declaration is here}}
+  __host__ __device__ ~D() {} // expected-error {{destructor cannot be redeclared}}
+};
+
+struct E {
+  __host__ __device__ ~E() {} // expected-note {{previous declaration is here}}
+  __device__ ~E() {} // expected-error {{destructor cannot be redeclared}}
+};
+
Index: cfe/trunk/test/SemaCUDA/function-overload.cu
===
--- cfe/trunk/test/SemaCUDA/function-overload.cu
+++ cfe/trunk/test/SemaCUDA/function-overload.cu
@@ -210,44 +210,11 @@
   __host__ ~d_h() {} // expected-error {{destructor cannot be redeclared}}
 };
 
-// H/D overloading is OK
-struct d_dh {
-  __device__ ~d_dh() {}
-  __host__ ~d_dh() {}
-};
-
 // HD is OK
 struct d_hd {
   __host__ __device__ ~d_hd() {}
 };
 
-// Mixing H/D and HD is not allowed.
-struct d_dhhd {
-  __device__ ~d_dhhd() {}
-  __host__ ~d_dhhd() {} // expected-note {{previous declaration is here}}
-  __host__ __device__ ~d_dhhd() {} // expected-error {{destructor cannot be redeclared}}
-};
-
-struct d_hhd {
-  __host__ ~d_hhd() {} // expected-note {{previous declaration is here}}
-  __host__ __device__ ~d_hhd() {} // expected-error {{destructor cannot be redeclared}}
-};
-
-struct d_hdh {
-  __host__ __device__ ~d_hdh() {} // expected-note {{previous declaration is here}}
-  __host__ ~d_hdh() {} // expected-error {{destructor cannot be redeclared}}
-};
-
-struct d_dhd {
-  __device__ ~d_dhd() {} // expected-note {{previous declaration is here}}
-  __host__ __device__ ~d_dhd() {} // expected-error {{destructor cannot be redeclared}}
-};
-
-struct d_hdd {
-  __host__ __device__ ~d_hdd() {} // expected-note {{previous declaration is here}}
-  __device__ ~d_hdd() {} // expected-error {{destructor cannot be redeclared}}
-};
-
 // Test overloading of member functions
 struct m_h {
   void operator delete(void *ptr); // expected-note {{previous declaration is here}}
Index: cfe/trunk/test/SemaCUDA/call-overloaded-destructor.cu
===
--- cfe/trunk/test/SemaCUDA/call-overloaded-destructor.cu
+++ cfe/trunk/test/SemaCUDA/call-overloaded-destructor.cu
@@ -1,17 +0,0 @@
-// expected-no-diagnostics
-
-// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fsyntax-only -verify %s
-// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsyntax-only -fcuda-is-device -verify %s
-
-#include "Inputs/cuda.h"
-
-struct S {
-  __host__ ~S() {}
-  __device__ ~S()

[PATCH] D25260: [CUDA] Destroy deferred diagnostics before destroying the ASTContext's PartialDiagnostic allocator.

2016-10-04 Thread Justin Lebar via cfe-commits
jlebar created this revision.
jlebar added a reviewer: rnk.
jlebar added a subscriber: cfe-commits.

This will let us (in a separate patch) allocate deferred diagnostics in
the ASTContext's PartialDiagnostic arena.


https://reviews.llvm.org/D25260

Files:
  clang/include/clang/AST/ASTContext.h
  clang/lib/CodeGen/CodeGenModule.cpp


Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -509,6 +509,9 @@
 DiagnosticBuilder Builder(getDiags().Report(Loc, PD.getDiagID()));
 PD.Emit(Builder);
   }
+  // Clear the deferred diags so they don't outlive the ASTContext from whence
+  // they're allocated.
+  DeferredDiags.clear();
 }
 
 void CodeGenModule::UpdateCompletedType(const TagDecl *TD) {
Index: clang/include/clang/AST/ASTContext.h
===
--- clang/include/clang/AST/ASTContext.h
+++ clang/include/clang/AST/ASTContext.h
@@ -325,12 +325,6 @@
   };
   llvm::DenseMap<Module *, PerModuleInitializers *> ModuleInitializers;
 
-  /// Diagnostics that are emitted if and only if the given function is
-  /// codegen'ed.  Access these through FunctionDecl::addDeferredDiag() and
-  /// FunctionDecl::takeDeferredDiags().
-  llvm::DenseMap<const FunctionDecl *, std::vector<PartialDiagnosticAt>>
-      DeferredDiags;
-
 public:
   /// \brief A type synonym for the TemplateOrInstantiation mapping.
   typedef llvm::PointerUnion
@@ -454,6 +448,12 @@
   /// \brief Allocator for partial diagnostics.
   PartialDiagnostic::StorageAllocator DiagAllocator;
 
+  /// Diagnostics that are emitted if and only if the given function is
+  /// codegen'ed.  Access these through FunctionDecl::addDeferredDiag() and
+  /// FunctionDecl::takeDeferredDiags().
+  llvm::DenseMap<const FunctionDecl *, std::vector<PartialDiagnosticAt>>
+      DeferredDiags;
+
   /// \brief The current C++ ABI.
   std::unique_ptr<CXXABI> ABI;
   CXXABI *createCXXABI(const TargetInfo &T);




[PATCH] D25139: [CUDA] Add Sema::CUDADiagBuilder and Sema::CUDADiagIfDeviceCode().

2016-10-04 Thread Justin Lebar via cfe-commits
jlebar added inline comments.


> rnk wrote in Sema.h:9258
> The ASTContext should outlive IRgen, since the AST is allocated in its arena. 
> Is there a separate diagnostic memory pool that I don't know about?

You're right, this is a silly bug.  Fixed in a separate patch, 
https://reviews.llvm.org/D25260 -- with that change we can do the natural thing 
here.

https://reviews.llvm.org/D25139
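
The lifetime hazard discussed above can be sketched with a toy model. All names here are hypothetical stand-ins, not clang's real classes: diagnostic storage comes from an arena owned by one object (as `PartialDiagnostic::StorageAllocator` is owned by the `ASTContext`), and a consumer holding pointers into that arena must drop them before the arena is destroyed.

```cpp
#include <cstddef>
#include <vector>

// Toy arena allocator standing in for the ASTContext-owned
// PartialDiagnostic::StorageAllocator.  Illustrative only.
class Arena {
  std::vector<char *> Blocks;

public:
  char *Allocate(std::size_t N) {
    Blocks.push_back(new char[N]);
    return Blocks.back();
  }
  ~Arena() {
    for (char *B : Blocks)
      delete[] B;
  }
};

// A diagnostic whose storage points into the Arena.
struct Diag {
  const char *Storage;
};

// Consumer analogous to CodeGenModule: it holds diagnostics whose storage
// lives in the Arena, so it must drop them before the Arena is destroyed.
class Consumer {
  std::vector<Diag> DeferredDiags;

public:
  void addDiag(Arena &A) { DeferredDiags.push_back(Diag{A.Allocate(16)}); }
  // Mirrors the DeferredDiags.clear() added to CodeGenModule::Release().
  void release() { DeferredDiags.clear(); }
  std::size_t pending() const { return DeferredDiags.size(); }
};

// Correct ordering: clear the consumer's diagnostics, then destroy the
// Arena.  Returns how many pointers into the dead Arena the consumer still
// holds; zero means nothing outlived the allocator.
inline std::size_t demoCorrectOrdering() {
  Consumer C;
  {
    Arena A;
    C.addDiag(A);
    C.addDiag(A);
    C.release(); // clear before ~Arena() runs, as in r283271
  }
  return C.pending();
}
```

With the reversed ordering (arena destroyed first, container cleared later), `DeferredDiags` would briefly hold dangling pointers — the bug the separate patch removes.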





[PATCH] D25139: [CUDA] Add Sema::CUDADiagBuilder and Sema::CUDADiagIfDeviceCode().

2016-10-04 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 73576.
jlebar marked an inline comment as done.
jlebar added a comment.

Rebase atop https://reviews.llvm.org/D25260, which obviates the need for this
ugly PD allocation dance.


https://reviews.llvm.org/D25139

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Sema/Sema.h
  clang/lib/Sema/SemaCUDA.cpp
  clang/lib/Sema/SemaExprCXX.cpp
  clang/lib/Sema/SemaStmt.cpp
  clang/lib/Sema/SemaType.cpp
  clang/test/SemaCUDA/exceptions-host-device.cu
  clang/test/SemaCUDA/exceptions.cu

Index: clang/test/SemaCUDA/exceptions.cu
===
--- clang/test/SemaCUDA/exceptions.cu
+++ clang/test/SemaCUDA/exceptions.cu
@@ -9,13 +9,13 @@
 }
 __device__ void device() {
   throw NULL;
-  // expected-error@-1 {{cannot use 'throw' in __device__ function 'device'}}
+  // expected-error@-1 {{cannot use 'throw' in __device__ function}}
   try {} catch(void*) {}
-  // expected-error@-1 {{cannot use 'try' in __device__ function 'device'}}
+  // expected-error@-1 {{cannot use 'try' in __device__ function}}
 }
 __global__ void kernel() {
   throw NULL;
-  // expected-error@-1 {{cannot use 'throw' in __global__ function 'kernel'}}
+  // expected-error@-1 {{cannot use 'throw' in __global__ function}}
   try {} catch(void*) {}
-  // expected-error@-1 {{cannot use 'try' in __global__ function 'kernel'}}
+  // expected-error@-1 {{cannot use 'try' in __global__ function}}
 }
Index: clang/test/SemaCUDA/exceptions-host-device.cu
===
--- clang/test/SemaCUDA/exceptions-host-device.cu
+++ clang/test/SemaCUDA/exceptions-host-device.cu
@@ -14,8 +14,8 @@
   throw NULL;
   try {} catch(void*) {}
 #ifndef HOST
-  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd1'}}
-  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd1'}}
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function}}
 #endif
 }
 
@@ -31,8 +31,8 @@
   throw NULL;
   try {} catch(void*) {}
 #ifndef HOST
-  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function 'hd3'}}
-  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function 'hd3'}}
+  // expected-error@-3 {{cannot use 'throw' in __host__ __device__ function}}
+  // expected-error@-3 {{cannot use 'try' in __host__ __device__ function}}
 #endif
 }
 __device__ void call_hd3() { hd3(); }
Index: clang/lib/Sema/SemaType.cpp
===
--- clang/lib/Sema/SemaType.cpp
+++ clang/lib/Sema/SemaType.cpp
@@ -2249,8 +2249,8 @@
 return QualType();
   }
   // CUDA device code doesn't support VLAs.
-  if (getLangOpts().CUDA && T->isVariableArrayType() && !CheckCUDAVLA(Loc))
-return QualType();
+  if (getLangOpts().CUDA && T->isVariableArrayType())
+CUDADiagIfDeviceCode(Loc, diag::err_cuda_vla) << CurrentCUDATarget();
 
   // If this is not C99, extwarn about VLA's and C99 array size modifiers.
   if (!getLangOpts().C99) {
Index: clang/lib/Sema/SemaStmt.cpp
===
--- clang/lib/Sema/SemaStmt.cpp
+++ clang/lib/Sema/SemaStmt.cpp
@@ -3646,7 +3646,8 @@
 
   // Exceptions aren't allowed in CUDA device code.
   if (getLangOpts().CUDA)
-CheckCUDAExceptionExpr(TryLoc, "try");
+CUDADiagIfDeviceCode(TryLoc, diag::err_cuda_device_exceptions)
+<< "try" << CurrentCUDATarget();
 
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(TryLoc, diag::err_omp_simd_region_cannot_use_stmt) << "try";
Index: clang/lib/Sema/SemaExprCXX.cpp
===
--- clang/lib/Sema/SemaExprCXX.cpp
+++ clang/lib/Sema/SemaExprCXX.cpp
@@ -685,7 +685,8 @@
 
   // Exceptions aren't allowed in CUDA device code.
   if (getLangOpts().CUDA)
-CheckCUDAExceptionExpr(OpLoc, "throw");
+CUDADiagIfDeviceCode(OpLoc, diag::err_cuda_device_exceptions)
+<< "throw" << CurrentCUDATarget();
 
   if (getCurScope() && getCurScope()->isOpenMPSimdDirectiveScope())
 Diag(OpLoc, diag::err_omp_simd_region_cannot_use_stmt) << "throw";
Index: clang/lib/Sema/SemaCUDA.cpp
===
--- clang/lib/Sema/SemaCUDA.cpp
+++ clang/lib/Sema/SemaCUDA.cpp
@@ -18,6 +18,7 @@
 #include "clang/Sema/Lookup.h"
 #include "clang/Sema/Sema.h"
 #include "clang/Sema/SemaDiagnostic.h"
+#include "clang/Sema/SemaInternal.h"
 #include "clang/Sema/Template.h"
 #include "llvm/ADT/Optional.h"
 #include "llvm/ADT/SmallVector.h"
@@ -42,6 +43,10 @@
 
 /// IdentifyCUDATarget - Determine the CUDA compilation target for this function
 Sema::CUDAFunctionTarget Sema::IdentifyCUDATarget(const FunctionDecl *D) {
+  // Code that lives outside a function is run on the host.
+  if (D == nullptr)
+ 

[PATCH] D25166: [CUDA] Mark device functions as nounwind.

2016-10-04 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 73577.
jlebar added a comment.

Move everything into codegen.


https://reviews.llvm.org/D25166

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CGException.cpp
  clang/test/CodeGenCUDA/convergent.cu
  clang/test/CodeGenCUDA/device-var-init.cu
  clang/test/CodeGenCUDA/nothrow.cu


Index: clang/test/CodeGenCUDA/nothrow.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/nothrow.cu
@@ -0,0 +1,29 @@
+// RUN: %clang_cc1 -fcxx-exceptions -fexceptions -fcuda-is-device -triple nvptx-nvidia-cuda -emit-llvm \
+// RUN:   -disable-llvm-passes -o - %s | FileCheck -check-prefix DEVICE %s
+
+// RUN: %clang_cc1 -fcxx-exceptions -fexceptions -triple x86_64-unknown-linux-gnu -emit-llvm \
+// RUN:   -disable-llvm-passes -o - %s | \
+// RUN:  FileCheck -check-prefix HOST %s
+
+#include "Inputs/cuda.h"
+
+__host__ __device__ void f();
+
+// HOST: define void @_Z7host_fnv() [[HOST_ATTR:#[0-9]+]]
+void host_fn() { f(); }
+
+// DEVICE: define void @_Z3foov() [[DEVICE_ATTR:#[0-9]+]]
+__device__ void foo() { f(); }
+
+// This is nounwind only on the device side.
+// CHECK: define void @_Z3foov() [[DEVICE_ATTR:#[0-9]+]]
+__host__ __device__ void bar() { f(); }
+
+// DEVICE: define void @_Z3bazv() [[DEVICE_ATTR:#[0-9]+]]
+__global__ void baz() { f(); }
+
+// DEVICE: attributes [[DEVICE_ATTR]] = {
+// DEVICE-SAME: nounwind
+// HOST: attributes [[HOST_ATTR]] = {
+// HOST-NOT: nounwind
+// HOST-SAME: }
Index: clang/test/CodeGenCUDA/device-var-init.cu
===
--- clang/test/CodeGenCUDA/device-var-init.cu
+++ clang/test/CodeGenCUDA/device-var-init.cu
@@ -182,9 +182,9 @@
   df(); // CHECK: call void @_Z2dfv()
 
   // Verify that we only call non-empty destructors
-  // CHECK-NEXT: call void @_ZN8T_FA_NEDD1Ev(%struct.T_FA_NED* %t_fa_ned) #6
-  // CHECK-NEXT: call void @_ZN7T_F_NEDD1Ev(%struct.T_F_NED* %t_f_ned) #6
-  // CHECK-NEXT: call void @_ZN7T_B_NEDD1Ev(%struct.T_B_NED* %t_b_ned) #6
+  // CHECK-NEXT: call void @_ZN8T_FA_NEDD1Ev(%struct.T_FA_NED* %t_fa_ned)
+  // CHECK-NEXT: call void @_ZN7T_F_NEDD1Ev(%struct.T_F_NED* %t_f_ned)
+  // CHECK-NEXT: call void @_ZN7T_B_NEDD1Ev(%struct.T_B_NED* %t_b_ned)
   // CHECK-NEXT: call void @_ZN2VDD1Ev(%struct.VD* %vd)
   // CHECK-NEXT: call void @_ZN3NEDD1Ev(%struct.NED* %ned)
   // CHECK-NEXT: call void @_ZN2UDD1Ev(%struct.UD* %ud)
Index: clang/test/CodeGenCUDA/convergent.cu
===
--- clang/test/CodeGenCUDA/convergent.cu
+++ clang/test/CodeGenCUDA/convergent.cu
@@ -36,8 +36,8 @@
 // DEVICE: attributes [[BAZ_ATTR]] = {
 // DEVICE-SAME: convergent
 // DEVICE-SAME: }
-// DEVICE: attributes [[CALL_ATTR]] = { convergent }
-// DEVICE: attributes [[ASM_ATTR]] = { convergent
+// DEVICE-DAG: attributes [[CALL_ATTR]] = { convergent
+// DEVICE-DAG: attributes [[ASM_ATTR]] = { convergent
 
 // HOST: declare void @_Z3bazv() [[BAZ_ATTR:#[0-9]+]]
 // HOST: attributes [[BAZ_ATTR]] = {
Index: clang/lib/CodeGen/CGException.cpp
===
--- clang/lib/CodeGen/CGException.cpp
+++ clang/lib/CodeGen/CGException.cpp
@@ -698,6 +698,10 @@
   return nullptr;
   }
 
+  // CUDA device code doesn't have exceptions.
+  if (LO.CUDA && LO.CUDAIsDevice)
+return nullptr;
+
   // Check the innermost scope for a cached landing pad.  If this is
   // a non-EH cleanup, we'll check enclosing scopes in EmitLandingPad.
   llvm::BasicBlock *LP = EHStack.begin()->getCachedLandingPad();
Index: clang/lib/CodeGen/CGCall.cpp
===
--- clang/lib/CodeGen/CGCall.cpp
+++ clang/lib/CodeGen/CGCall.cpp
@@ -1805,6 +1805,9 @@
 // them).  LLVM will remove this attribute where it safely can.
 FuncAttrs.addAttribute(llvm::Attribute::Convergent);
 
+// Exceptions aren't supported in CUDA device code.
+FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
+
 // Respect -fcuda-flush-denormals-to-zero.
 if (getLangOpts().CUDADeviceFlushDenormalsToZero)
   FuncAttrs.addAttribute("nvptx-f32ftz", "true");



[PATCH] D25260: [CUDA] Destroy deferred diagnostics before destroying the ASTContext's PartialDiagnostic allocator.

2016-10-04 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 73578.
jlebar added a comment.

Update comment.


https://reviews.llvm.org/D25260

Files:
  clang/include/clang/AST/ASTContext.h
  clang/lib/CodeGen/CodeGenModule.cpp


Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -509,6 +509,9 @@
 DiagnosticBuilder Builder(getDiags().Report(Loc, PD.getDiagID()));
 PD.Emit(Builder);
   }
+  // Clear the deferred diags so they don't outlive the ASTContext's
+  // PartialDiagnostic allocator.
+  DeferredDiags.clear();
 }
 
 void CodeGenModule::UpdateCompletedType(const TagDecl *TD) {
Index: clang/include/clang/AST/ASTContext.h
===
--- clang/include/clang/AST/ASTContext.h
+++ clang/include/clang/AST/ASTContext.h
@@ -325,12 +325,6 @@
   };
   llvm::DenseMap ModuleInitializers;
 
-  /// Diagnostics that are emitted if and only if the given function is
-  /// codegen'ed.  Access these through FunctionDecl::addDeferredDiag() and
-  /// FunctionDecl::takeDeferredDiags().
-  llvm::DenseMap>
-  DeferredDiags;
-
 public:
   /// \brief A type synonym for the TemplateOrInstantiation mapping.
   typedef llvm::PointerUnion
@@ -454,6 +448,12 @@
   /// \brief Allocator for partial diagnostics.
   PartialDiagnostic::StorageAllocator DiagAllocator;
 
+  /// Diagnostics that are emitted if and only if the given function is
+  /// codegen'ed.  Access these through FunctionDecl::addDeferredDiag() and
+  /// FunctionDecl::takeDeferredDiags().
+  llvm::DenseMap>
+  DeferredDiags;
+
   /// \brief The current C++ ABI.
   std::unique_ptr ABI;
   CXXABI *createCXXABI(const TargetInfo &T);




[PATCH] D25166: [CUDA] Mark device functions as nounwind.

2016-10-04 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 73579.
jlebar marked an inline comment as done.
jlebar added a comment.

Update tests.


https://reviews.llvm.org/D25166

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/lib/CodeGen/CGException.cpp
  clang/test/CodeGenCUDA/convergent.cu
  clang/test/CodeGenCUDA/device-var-init.cu
  clang/test/CodeGenCUDA/nothrow.cu

Index: clang/test/CodeGenCUDA/nothrow.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/nothrow.cu
@@ -0,0 +1,39 @@
+// RUN: %clang_cc1 -std=c++11 -fcxx-exceptions -fexceptions -fcuda-is-device \
+// RUN:   -triple nvptx-nvidia-cuda -emit-llvm -disable-llvm-passes -o - %s | \
+// RUN:  FileCheck -check-prefix DEVICE %s
+
+// RUN: %clang_cc1 -std=c++11 -fcxx-exceptions -fexceptions \
+// RUN:   -triple x86_64-unknown-linux-gnu -emit-llvm -disable-llvm-passes -o - %s | \
+// RUN:  FileCheck -check-prefix HOST %s
+
+#include "Inputs/cuda.h"
+
+__host__ __device__ void f();
+
+// HOST: define void @_Z7host_fnv() [[HOST_ATTR:#[0-9]+]]
+void host_fn() { f(); }
+
+// DEVICE: define void @_Z3foov() [[DEVICE_ATTR:#[0-9]+]]
+__device__ void foo() {
+  // DEVICE: call void @_Z1fv
+  f();
+}
+
+// DEVICE: define void @_Z12foo_noexceptv() [[DEVICE_ATTR:#[0-9]+]]
+__device__ void foo_noexcept() noexcept {
+  // DEVICE: call void @_Z1fv
+  f();
+}
+
+// This is nounwind only on the device side.
+// CHECK: define void @_Z3foov() [[DEVICE_ATTR:#[0-9]+]]
+__host__ __device__ void bar() { f(); }
+
+// DEVICE: define void @_Z3bazv() [[DEVICE_ATTR:#[0-9]+]]
+__global__ void baz() { f(); }
+
+// DEVICE: attributes [[DEVICE_ATTR]] = {
+// DEVICE-SAME: nounwind
+// HOST: attributes [[HOST_ATTR]] = {
+// HOST-NOT: nounwind
+// HOST-SAME: }
Index: clang/test/CodeGenCUDA/device-var-init.cu
===
--- clang/test/CodeGenCUDA/device-var-init.cu
+++ clang/test/CodeGenCUDA/device-var-init.cu
@@ -182,9 +182,9 @@
   df(); // CHECK: call void @_Z2dfv()
 
   // Verify that we only call non-empty destructors
-  // CHECK-NEXT: call void @_ZN8T_FA_NEDD1Ev(%struct.T_FA_NED* %t_fa_ned) #6
-  // CHECK-NEXT: call void @_ZN7T_F_NEDD1Ev(%struct.T_F_NED* %t_f_ned) #6
-  // CHECK-NEXT: call void @_ZN7T_B_NEDD1Ev(%struct.T_B_NED* %t_b_ned) #6
+  // CHECK-NEXT: call void @_ZN8T_FA_NEDD1Ev(%struct.T_FA_NED* %t_fa_ned)
+  // CHECK-NEXT: call void @_ZN7T_F_NEDD1Ev(%struct.T_F_NED* %t_f_ned)
+  // CHECK-NEXT: call void @_ZN7T_B_NEDD1Ev(%struct.T_B_NED* %t_b_ned)
   // CHECK-NEXT: call void @_ZN2VDD1Ev(%struct.VD* %vd)
   // CHECK-NEXT: call void @_ZN3NEDD1Ev(%struct.NED* %ned)
   // CHECK-NEXT: call void @_ZN2UDD1Ev(%struct.UD* %ud)
Index: clang/test/CodeGenCUDA/convergent.cu
===
--- clang/test/CodeGenCUDA/convergent.cu
+++ clang/test/CodeGenCUDA/convergent.cu
@@ -36,8 +36,8 @@
 // DEVICE: attributes [[BAZ_ATTR]] = {
 // DEVICE-SAME: convergent
 // DEVICE-SAME: }
-// DEVICE: attributes [[CALL_ATTR]] = { convergent }
-// DEVICE: attributes [[ASM_ATTR]] = { convergent
+// DEVICE-DAG: attributes [[CALL_ATTR]] = { convergent
+// DEVICE-DAG: attributes [[ASM_ATTR]] = { convergent
 
 // HOST: declare void @_Z3bazv() [[BAZ_ATTR:#[0-9]+]]
 // HOST: attributes [[BAZ_ATTR]] = {
Index: clang/lib/CodeGen/CGException.cpp
===
--- clang/lib/CodeGen/CGException.cpp
+++ clang/lib/CodeGen/CGException.cpp
@@ -698,6 +698,10 @@
   return nullptr;
   }
 
+  // CUDA device code doesn't have exceptions.
+  if (LO.CUDA && LO.CUDAIsDevice)
+return nullptr;
+
   // Check the innermost scope for a cached landing pad.  If this is
   // a non-EH cleanup, we'll check enclosing scopes in EmitLandingPad.
   llvm::BasicBlock *LP = EHStack.begin()->getCachedLandingPad();
Index: clang/lib/CodeGen/CGCall.cpp
===
--- clang/lib/CodeGen/CGCall.cpp
+++ clang/lib/CodeGen/CGCall.cpp
@@ -1805,6 +1805,9 @@
 // them).  LLVM will remove this attribute where it safely can.
 FuncAttrs.addAttribute(llvm::Attribute::Convergent);
 
+// Exceptions aren't supported in CUDA device code.
+FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
+
 // Respect -fcuda-flush-denormals-to-zero.
 if (getLangOpts().CUDADeviceFlushDenormalsToZero)
   FuncAttrs.addAttribute("nvptx-f32ftz", "true");


r283271 - [CUDA] Destroy deferred diagnostics before destroying the ASTContext's PartialDiagnostic allocator.

2016-10-04 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Tue Oct  4 18:41:45 2016
New Revision: 283271

URL: http://llvm.org/viewvc/llvm-project?rev=283271&view=rev
Log:
[CUDA] Destroy deferred diagnostics before destroying the ASTContext's 
PartialDiagnostic allocator.

Summary:
This will let us (in a separate patch) allocate deferred diagnostics in
the ASTContext's PartialDiagnostic arena.

Reviewers: rnk

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D25260

Modified:
cfe/trunk/include/clang/AST/ASTContext.h
cfe/trunk/lib/CodeGen/CodeGenModule.cpp
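
The mechanism this commit touches — diagnostics recorded per function and reported only if that function is actually codegen'ed — can be sketched as follows. This is a toy model with illustrative names; the real accessors are `FunctionDecl::addDeferredDiag()` and `takeDeferredDiags()`, and the real map is keyed by `FunctionDecl`, not by name.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model of clang's deferred-diagnostic mechanism: diagnostics are
// recorded per function during Sema, but only become real errors when the
// function is codegen'ed.
class DeferredDiagMap {
  std::map<std::string, std::vector<std::string>> Diags;

public:
  // Record a diagnostic against Fn without emitting it yet.
  void addDeferredDiag(const std::string &Fn, std::string Msg) {
    Diags[Fn].push_back(std::move(Msg));
  }

  // Called when Fn is codegen'ed: hand back (and forget) its diagnostics.
  std::vector<std::string> takeDeferredDiags(const std::string &Fn) {
    auto It = Diags.find(Fn);
    if (It == Diags.end())
      return {};
    std::vector<std::string> Out = std::move(It->second);
    Diags.erase(It);
    return Out;
  }

  // Mirrors the DeferredDiags.clear() in CodeGenModule::Release(): drop
  // anything never codegen'ed so it cannot outlive its allocator.
  void clear() { Diags.clear(); }
  bool empty() const { return Diags.empty(); }
};
```

Entries for functions that are never codegen'ed simply stay in the map until `clear()` — which is exactly why the clear must happen before the allocator backing the stored diagnostics goes away.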

Modified: cfe/trunk/include/clang/AST/ASTContext.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/AST/ASTContext.h?rev=283271&r1=283270&r2=283271&view=diff
==
--- cfe/trunk/include/clang/AST/ASTContext.h (original)
+++ cfe/trunk/include/clang/AST/ASTContext.h Tue Oct  4 18:41:45 2016
@@ -325,12 +325,6 @@ class ASTContext : public RefCountedBase
   };
   llvm::DenseMap ModuleInitializers;
 
-  /// Diagnostics that are emitted if and only if the given function is
-  /// codegen'ed.  Access these through FunctionDecl::addDeferredDiag() and
-  /// FunctionDecl::takeDeferredDiags().
-  llvm::DenseMap>
-  DeferredDiags;
-
 public:
   /// \brief A type synonym for the TemplateOrInstantiation mapping.
   typedef llvm::PointerUnion
@@ -454,6 +448,12 @@ private:
   /// \brief Allocator for partial diagnostics.
   PartialDiagnostic::StorageAllocator DiagAllocator;
 
+  /// Diagnostics that are emitted if and only if the given function is
+  /// codegen'ed.  Access these through FunctionDecl::addDeferredDiag() and
+  /// FunctionDecl::takeDeferredDiags().
+  llvm::DenseMap>
+  DeferredDiags;
+
   /// \brief The current C++ ABI.
   std::unique_ptr ABI;
   CXXABI *createCXXABI(const TargetInfo &T);

Modified: cfe/trunk/lib/CodeGen/CodeGenModule.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenModule.cpp?rev=283271&r1=283270&r2=283271&view=diff
==
--- cfe/trunk/lib/CodeGen/CodeGenModule.cpp (original)
+++ cfe/trunk/lib/CodeGen/CodeGenModule.cpp Tue Oct  4 18:41:45 2016
@@ -509,6 +509,9 @@ void CodeGenModule::Release() {
 DiagnosticBuilder Builder(getDiags().Report(Loc, PD.getDiagID()));
 PD.Emit(Builder);
   }
+  // Clear the deferred diags so they don't outlive the ASTContext's
+  // PartialDiagnostic allocator.
+  DeferredDiags.clear();
 }
 
 void CodeGenModule::UpdateCompletedType(const TagDecl *TD) {




[PATCH] D25260: [CUDA] Destroy deferred diagnostics before destroying the ASTContext's PartialDiagnostic allocator.

2016-10-04 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL283271: [CUDA] Destroy deferred diagnostics before 
destroying the ASTContext's… (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D25260?vs=73578&id=73580#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25260

Files:
  cfe/trunk/include/clang/AST/ASTContext.h
  cfe/trunk/lib/CodeGen/CodeGenModule.cpp


Index: cfe/trunk/include/clang/AST/ASTContext.h
===
--- cfe/trunk/include/clang/AST/ASTContext.h
+++ cfe/trunk/include/clang/AST/ASTContext.h
@@ -325,12 +325,6 @@
   };
   llvm::DenseMap ModuleInitializers;
 
-  /// Diagnostics that are emitted if and only if the given function is
-  /// codegen'ed.  Access these through FunctionDecl::addDeferredDiag() and
-  /// FunctionDecl::takeDeferredDiags().
-  llvm::DenseMap>
-  DeferredDiags;
-
 public:
   /// \brief A type synonym for the TemplateOrInstantiation mapping.
   typedef llvm::PointerUnion
@@ -454,6 +448,12 @@
   /// \brief Allocator for partial diagnostics.
   PartialDiagnostic::StorageAllocator DiagAllocator;
 
+  /// Diagnostics that are emitted if and only if the given function is
+  /// codegen'ed.  Access these through FunctionDecl::addDeferredDiag() and
+  /// FunctionDecl::takeDeferredDiags().
+  llvm::DenseMap>
+  DeferredDiags;
+
   /// \brief The current C++ ABI.
   std::unique_ptr ABI;
   CXXABI *createCXXABI(const TargetInfo &T);
Index: cfe/trunk/lib/CodeGen/CodeGenModule.cpp
===
--- cfe/trunk/lib/CodeGen/CodeGenModule.cpp
+++ cfe/trunk/lib/CodeGen/CodeGenModule.cpp
@@ -509,6 +509,9 @@
 DiagnosticBuilder Builder(getDiags().Report(Loc, PD.getDiagID()));
 PD.Emit(Builder);
   }
+  // Clear the deferred diags so they don't outlive the ASTContext's
+  // PartialDiagnostic allocator.
+  DeferredDiags.clear();
 }
 
 void CodeGenModule::UpdateCompletedType(const TagDecl *TD) {




r283272 - [CUDA] Mark device functions as nounwind.

2016-10-04 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Tue Oct  4 18:41:49 2016
New Revision: 283272

URL: http://llvm.org/viewvc/llvm-project?rev=283272&view=rev
Log:
[CUDA] Mark device functions as nounwind.

Summary:
This prevents clang from emitting 'invoke's and catch statements.

Things previously mostly worked thanks to TryToMarkNoThrow() in
CodeGenFunction.  But this is not a proper IPO, and it doesn't properly
handle cases like mutual recursion.

Fixes bug 30593.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D25166
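
The log's point about mutual recursion can be made concrete with a toy analysis (hypothetical code, not clang's): a local bottom-up pass in the spirit of `TryToMarkNoThrow` marks a function nothrow only once all of its callees are known nothrow, so it can never break a cycle — whereas the language-level guarantee "device code cannot throw" lets us mark everything nounwind up front.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph: function name -> names of callees.  Illustrative only.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Local, bottom-up inference: a function is nothrow only if every callee is
// already known nothrow.  No iteration order can break a cycle, so mutually
// recursive functions are never marked.
inline std::set<std::string> inferNothrowLocally(const CallGraph &CG) {
  std::set<std::string> Nothrow;
  for (const auto &[Fn, Callees] : CG) {
    bool AllCalleesNothrow = true;
    for (const std::string &Callee : Callees)
      if (!Nothrow.count(Callee))
        AllCalleesNothrow = false;
    if (AllCalleesNothrow)
      Nothrow.insert(Fn);
  }
  return Nothrow;
}

// The patch's approach: device code cannot throw at all, so simply mark
// every function nounwind up front -- no interprocedural reasoning needed.
inline std::set<std::string> markAllNounwind(const CallGraph &CG) {
  std::set<std::string> Nothrow;
  for (const auto &Entry : CG)
    Nothrow.insert(Entry.first);
  return Nothrow;
}

// 'a' and 'b' are mutually recursive: the local pass proves only 'leaf'
// nothrow, while the attribute-based approach covers all three.
inline bool demoMutualRecursion() {
  CallGraph CG{{"leaf", {}}, {"a", {"b", "leaf"}}, {"b", {"a"}}};
  return inferNothrowLocally(CG) == std::set<std::string>{"leaf"} &&
         markAllNounwind(CG).size() == 3;
}
```

Even iterating the local pass to a fixed point would leave `a` and `b` unmarked (conservatively correct, but pessimistic), which is why an up-front attribute is the right fix here.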

Added:
cfe/trunk/test/CodeGenCUDA/nothrow.cu
Modified:
cfe/trunk/lib/CodeGen/CGCall.cpp
cfe/trunk/lib/CodeGen/CGException.cpp
cfe/trunk/test/CodeGenCUDA/convergent.cu
cfe/trunk/test/CodeGenCUDA/device-var-init.cu

Modified: cfe/trunk/lib/CodeGen/CGCall.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGCall.cpp?rev=283272&r1=283271&r2=283272&view=diff
==
--- cfe/trunk/lib/CodeGen/CGCall.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGCall.cpp Tue Oct  4 18:41:49 2016
@@ -1814,6 +1814,9 @@ void CodeGenModule::ConstructAttributeLi
 // them).  LLVM will remove this attribute where it safely can.
 FuncAttrs.addAttribute(llvm::Attribute::Convergent);
 
+// Exceptions aren't supported in CUDA device code.
+FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
+
 // Respect -fcuda-flush-denormals-to-zero.
 if (getLangOpts().CUDADeviceFlushDenormalsToZero)
   FuncAttrs.addAttribute("nvptx-f32ftz", "true");

Modified: cfe/trunk/lib/CodeGen/CGException.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGException.cpp?rev=283272&r1=283271&r2=283272&view=diff
==
--- cfe/trunk/lib/CodeGen/CGException.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGException.cpp Tue Oct  4 18:41:49 2016
@@ -698,6 +698,10 @@ llvm::BasicBlock *CodeGenFunction::getIn
   return nullptr;
   }
 
+  // CUDA device code doesn't have exceptions.
+  if (LO.CUDA && LO.CUDAIsDevice)
+return nullptr;
+
   // Check the innermost scope for a cached landing pad.  If this is
   // a non-EH cleanup, we'll check enclosing scopes in EmitLandingPad.
   llvm::BasicBlock *LP = EHStack.begin()->getCachedLandingPad();

Modified: cfe/trunk/test/CodeGenCUDA/convergent.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCUDA/convergent.cu?rev=283272&r1=283271&r2=283272&view=diff
==
--- cfe/trunk/test/CodeGenCUDA/convergent.cu (original)
+++ cfe/trunk/test/CodeGenCUDA/convergent.cu Tue Oct  4 18:41:49 2016
@@ -36,8 +36,8 @@ __host__ __device__ void bar() {
 // DEVICE: attributes [[BAZ_ATTR]] = {
 // DEVICE-SAME: convergent
 // DEVICE-SAME: }
-// DEVICE: attributes [[CALL_ATTR]] = { convergent }
-// DEVICE: attributes [[ASM_ATTR]] = { convergent
+// DEVICE-DAG: attributes [[CALL_ATTR]] = { convergent
+// DEVICE-DAG: attributes [[ASM_ATTR]] = { convergent
 
 // HOST: declare void @_Z3bazv() [[BAZ_ATTR:#[0-9]+]]
 // HOST: attributes [[BAZ_ATTR]] = {

Modified: cfe/trunk/test/CodeGenCUDA/device-var-init.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCUDA/device-var-init.cu?rev=283272&r1=283271&r2=283272&view=diff
==
--- cfe/trunk/test/CodeGenCUDA/device-var-init.cu (original)
+++ cfe/trunk/test/CodeGenCUDA/device-var-init.cu Tue Oct  4 18:41:49 2016
@@ -182,9 +182,9 @@ __device__ void df() {
   df(); // CHECK: call void @_Z2dfv()
 
   // Verify that we only call non-empty destructors
-  // CHECK-NEXT: call void @_ZN8T_FA_NEDD1Ev(%struct.T_FA_NED* %t_fa_ned) #6
-  // CHECK-NEXT: call void @_ZN7T_F_NEDD1Ev(%struct.T_F_NED* %t_f_ned) #6
-  // CHECK-NEXT: call void @_ZN7T_B_NEDD1Ev(%struct.T_B_NED* %t_b_ned) #6
+  // CHECK-NEXT: call void @_ZN8T_FA_NEDD1Ev(%struct.T_FA_NED* %t_fa_ned)
+  // CHECK-NEXT: call void @_ZN7T_F_NEDD1Ev(%struct.T_F_NED* %t_f_ned)
+  // CHECK-NEXT: call void @_ZN7T_B_NEDD1Ev(%struct.T_B_NED* %t_b_ned)
   // CHECK-NEXT: call void @_ZN2VDD1Ev(%struct.VD* %vd)
   // CHECK-NEXT: call void @_ZN3NEDD1Ev(%struct.NED* %ned)
   // CHECK-NEXT: call void @_ZN2UDD1Ev(%struct.UD* %ud)

Added: cfe/trunk/test/CodeGenCUDA/nothrow.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCUDA/nothrow.cu?rev=283272&view=auto
==
--- cfe/trunk/test/CodeGenCUDA/nothrow.cu (added)
+++ cfe/trunk/test/CodeGenCUDA/nothrow.cu Tue Oct  4 18:41:49 2016
@@ -0,0 +1,39 @@
+// RUN: %clang_cc1 -std=c++11 -fcxx-exceptions -fexceptions -fcuda-is-device \
+// RUN:   -triple nvptx-nvidia-cuda -emit-llvm -disable-llvm-passes -o - %s | \
+// RUN:  FileCheck -check-prefix DEVICE %s
+
+// RUN: %clang_cc1 -std=c++11 -fcxx-exceptions -fexceptions \
+// RUN:   -tripl

[PATCH] D25166: [CUDA] Mark device functions as nounwind.

2016-10-04 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL283272: [CUDA] Mark device functions as nounwind. (authored 
by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D25166?vs=73579&id=73581#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25166

Files:
  cfe/trunk/lib/CodeGen/CGCall.cpp
  cfe/trunk/lib/CodeGen/CGException.cpp
  cfe/trunk/test/CodeGenCUDA/convergent.cu
  cfe/trunk/test/CodeGenCUDA/device-var-init.cu
  cfe/trunk/test/CodeGenCUDA/nothrow.cu

Index: cfe/trunk/test/CodeGenCUDA/convergent.cu
===
--- cfe/trunk/test/CodeGenCUDA/convergent.cu
+++ cfe/trunk/test/CodeGenCUDA/convergent.cu
@@ -36,8 +36,8 @@
 // DEVICE: attributes [[BAZ_ATTR]] = {
 // DEVICE-SAME: convergent
 // DEVICE-SAME: }
-// DEVICE: attributes [[CALL_ATTR]] = { convergent }
-// DEVICE: attributes [[ASM_ATTR]] = { convergent
+// DEVICE-DAG: attributes [[CALL_ATTR]] = { convergent
+// DEVICE-DAG: attributes [[ASM_ATTR]] = { convergent
 
 // HOST: declare void @_Z3bazv() [[BAZ_ATTR:#[0-9]+]]
 // HOST: attributes [[BAZ_ATTR]] = {
Index: cfe/trunk/test/CodeGenCUDA/nothrow.cu
===
--- cfe/trunk/test/CodeGenCUDA/nothrow.cu
+++ cfe/trunk/test/CodeGenCUDA/nothrow.cu
@@ -0,0 +1,39 @@
+// RUN: %clang_cc1 -std=c++11 -fcxx-exceptions -fexceptions -fcuda-is-device \
+// RUN:   -triple nvptx-nvidia-cuda -emit-llvm -disable-llvm-passes -o - %s | \
+// RUN:  FileCheck -check-prefix DEVICE %s
+
+// RUN: %clang_cc1 -std=c++11 -fcxx-exceptions -fexceptions \
+// RUN:   -triple x86_64-unknown-linux-gnu -emit-llvm -disable-llvm-passes -o - %s | \
+// RUN:  FileCheck -check-prefix HOST %s
+
+#include "Inputs/cuda.h"
+
+__host__ __device__ void f();
+
+// HOST: define void @_Z7host_fnv() [[HOST_ATTR:#[0-9]+]]
+void host_fn() { f(); }
+
+// DEVICE: define void @_Z3foov() [[DEVICE_ATTR:#[0-9]+]]
+__device__ void foo() {
+  // DEVICE: call void @_Z1fv
+  f();
+}
+
+// DEVICE: define void @_Z12foo_noexceptv() [[DEVICE_ATTR:#[0-9]+]]
+__device__ void foo_noexcept() noexcept {
+  // DEVICE: call void @_Z1fv
+  f();
+}
+
+// This is nounwind only on the device side.
+// CHECK: define void @_Z3foov() [[DEVICE_ATTR:#[0-9]+]]
+__host__ __device__ void bar() { f(); }
+
+// DEVICE: define void @_Z3bazv() [[DEVICE_ATTR:#[0-9]+]]
+__global__ void baz() { f(); }
+
+// DEVICE: attributes [[DEVICE_ATTR]] = {
+// DEVICE-SAME: nounwind
+// HOST: attributes [[HOST_ATTR]] = {
+// HOST-NOT: nounwind
+// HOST-SAME: }
Index: cfe/trunk/test/CodeGenCUDA/device-var-init.cu
===
--- cfe/trunk/test/CodeGenCUDA/device-var-init.cu
+++ cfe/trunk/test/CodeGenCUDA/device-var-init.cu
@@ -182,9 +182,9 @@
   df(); // CHECK: call void @_Z2dfv()
 
   // Verify that we only call non-empty destructors
-  // CHECK-NEXT: call void @_ZN8T_FA_NEDD1Ev(%struct.T_FA_NED* %t_fa_ned) #6
-  // CHECK-NEXT: call void @_ZN7T_F_NEDD1Ev(%struct.T_F_NED* %t_f_ned) #6
-  // CHECK-NEXT: call void @_ZN7T_B_NEDD1Ev(%struct.T_B_NED* %t_b_ned) #6
+  // CHECK-NEXT: call void @_ZN8T_FA_NEDD1Ev(%struct.T_FA_NED* %t_fa_ned)
+  // CHECK-NEXT: call void @_ZN7T_F_NEDD1Ev(%struct.T_F_NED* %t_f_ned)
+  // CHECK-NEXT: call void @_ZN7T_B_NEDD1Ev(%struct.T_B_NED* %t_b_ned)
   // CHECK-NEXT: call void @_ZN2VDD1Ev(%struct.VD* %vd)
   // CHECK-NEXT: call void @_ZN3NEDD1Ev(%struct.NED* %ned)
   // CHECK-NEXT: call void @_ZN2UDD1Ev(%struct.UD* %ud)
Index: cfe/trunk/lib/CodeGen/CGCall.cpp
===
--- cfe/trunk/lib/CodeGen/CGCall.cpp
+++ cfe/trunk/lib/CodeGen/CGCall.cpp
@@ -1814,6 +1814,9 @@
 // them).  LLVM will remove this attribute where it safely can.
 FuncAttrs.addAttribute(llvm::Attribute::Convergent);
 
+// Exceptions aren't supported in CUDA device code.
+FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
+
 // Respect -fcuda-flush-denormals-to-zero.
 if (getLangOpts().CUDADeviceFlushDenormalsToZero)
   FuncAttrs.addAttribute("nvptx-f32ftz", "true");
Index: cfe/trunk/lib/CodeGen/CGException.cpp
===
--- cfe/trunk/lib/CodeGen/CGException.cpp
+++ cfe/trunk/lib/CodeGen/CGException.cpp
@@ -698,6 +698,10 @@
   return nullptr;
   }
 
+  // CUDA device code doesn't have exceptions.
+  if (LO.CUDA && LO.CUDAIsDevice)
+return nullptr;
+
   // Check the innermost scope for a cached landing pad.  If this is
   // a non-EH cleanup, we'll check enclosing scopes in EmitLandingPad.
   llvm::BasicBlock *LP = EHStack.begin()->getCachedLandingPad();
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r283280 - [CUDA] Add missing ':' to noexcept.cu test.

2016-10-04 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Tue Oct  4 19:27:38 2016
New Revision: 283280

URL: http://llvm.org/viewvc/llvm-project?rev=283280&view=rev
Log:
[CUDA] Add missing ':' to noexcept.cu test.

Modified:
cfe/trunk/test/CodeGenCUDA/nothrow.cu

Modified: cfe/trunk/test/CodeGenCUDA/nothrow.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCUDA/nothrow.cu?rev=283280&r1=283279&r2=283280&view=diff
==
--- cfe/trunk/test/CodeGenCUDA/nothrow.cu (original)
+++ cfe/trunk/test/CodeGenCUDA/nothrow.cu Tue Oct  4 19:27:38 2016
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 -std=c++11 -fcxx-exceptions -fexceptions -fcuda-is-device \
 // RUN:   -triple nvptx-nvidia-cuda -emit-llvm -disable-llvm-passes -o - %s | \
-// RUN  FileCheck -check-prefix DEVICE %s
+// RUN: FileCheck -check-prefix DEVICE %s
 
 // RUN: %clang_cc1 -std=c++11 -fcxx-exceptions -fexceptions \
 // RUN:   -triple x86_64-unknown-linux-gnu -emit-llvm -disable-llvm-passes -o - %s | \




r283487 - [Sema] Replace smart quote with "'" in comment.

2016-10-06 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Thu Oct  6 14:47:56 2016
New Revision: 283487

URL: http://llvm.org/viewvc/llvm-project?rev=283487&view=rev
Log:
[Sema] Replace smart quote with "'" in comment.

Looks like the smart quote was copy/pasted from the C++ standard.

The smart quote was not encoded as valid UTF-8 (?), even though vim was
detecting the file as UTF-8.  This broke the clang-format Python script,
which tried to read the file using the same encoding as vim detected.

Modified:
cfe/trunk/lib/Sema/SemaExpr.cpp

Modified: cfe/trunk/lib/Sema/SemaExpr.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaExpr.cpp?rev=283487&r1=283486&r2=283487&view=diff
==
--- cfe/trunk/lib/Sema/SemaExpr.cpp (original)
+++ cfe/trunk/lib/Sema/SemaExpr.cpp Thu Oct  6 14:47:56 2016
@@ -13687,7 +13687,7 @@ static bool captureInLambda(LambdaScopeI
 // C++ [expr.prim.lambda]p5:
 //   The closure type for a lambda-expression has a public inline 
 //   function call operator [...]. This function call operator is 
-//   declared const (9.3.1) if and only if the lambda-expression’s 
+//   declared const (9.3.1) if and only if the lambda-expression's 
 //   parameter-declaration-clause is not followed by mutable.
 DeclRefType = CaptureType.getNonReferenceType();
 if (!LSI->Mutable && !CaptureType->isReferenceType())




r283637 - [CUDA] Do a better job at detecting wrong-side calls.

2016-10-07 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Fri Oct  7 20:07:11 2016
New Revision: 283637

URL: http://llvm.org/viewvc/llvm-project?rev=283637&view=rev
Log:
[CUDA] Do a better job at detecting wrong-side calls.

Summary:
Move CheckCUDACall from ActOnCallExpr and BuildDeclRefExpr to
DiagnoseUseOfDecl.  This lets us catch some edge cases we were missing,
specifically around class operators.

This necessitates a few other changes:

 - Avoid emitting duplicate deferred diags in CheckCUDACall.

   Previously we'd carefully placed our call to CheckCUDACall such that
   it would only ever run once for a particular callsite.  But now this
   isn't the case.

 - Emit deferred diagnostics from a template
   specialization/instantiation's primary template, in addition to from
   the specialization/instantiation itself.  DiagnoseUseOfDecl ends up
   putting the deferred diagnostics on the template, rather than the
   specialization, so we need to check both.

Reviewers: rsmith

Subscribers: cfe-commits, tra

Differential Revision: https://reviews.llvm.org/D24573

Modified:
cfe/trunk/include/clang/Sema/Sema.h
cfe/trunk/lib/CodeGen/CodeGenModule.cpp
cfe/trunk/lib/Sema/SemaCUDA.cpp
cfe/trunk/lib/Sema/SemaExpr.cpp
cfe/trunk/test/SemaCUDA/Inputs/cuda.h
cfe/trunk/test/SemaCUDA/call-host-fn-from-device.cu

Modified: cfe/trunk/include/clang/Sema/Sema.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Sema/Sema.h?rev=283637&r1=283636&r2=283637&view=diff
==
--- cfe/trunk/include/clang/Sema/Sema.h (original)
+++ cfe/trunk/include/clang/Sema/Sema.h Fri Oct  7 20:07:11 2016
@@ -9267,16 +9267,27 @@ public:
   void maybeAddCUDAHostDeviceAttrs(Scope *S, FunctionDecl *FD,
const LookupResult &Previous);
 
+private:
+  /// Raw encodings of SourceLocations for which CheckCUDACall has emitted a
+  /// deferred "bad call" diagnostic.  We use this to avoid emitting the same
+  /// deferred diag twice.
+  llvm::DenseSet<unsigned> LocsWithCUDACallDeferredDiags;
+
+public:
   /// Check whether we're allowed to call Callee from the current context.
   ///
-  /// If the call is never allowed in a semantically-correct program
-  /// (CFP_Never), emits an error and returns false.
+  /// - If the call is never allowed in a semantically-correct program
+  ///   (CFP_Never), emits an error and returns false.
+  ///
+  /// - If the call is allowed in semantically-correct programs, but only if
+  ///   it's never codegen'ed (CFP_WrongSide), creates a deferred diagnostic to
+  ///   be emitted if and when the caller is codegen'ed, and returns true.
   ///
-  /// If the call is allowed in semantically-correct programs, but only if it's
-  /// never codegen'ed (CFP_WrongSide), creates a deferred diagnostic to be
-  /// emitted if and when the caller is codegen'ed, and returns true.
+  ///   Will only create deferred diagnostics for a given SourceLocation once,
+  ///   so you can safely call this multiple times without generating duplicate
+  ///   deferred errors.
   ///
-  /// Otherwise, returns true without emitting any diagnostics.
+  /// - Otherwise, returns true without emitting any diagnostics.
   bool CheckCUDACall(SourceLocation Loc, FunctionDecl *Callee);
 
   /// Check whether a 'try' or 'throw' expression is allowed within the current

Modified: cfe/trunk/lib/CodeGen/CodeGenModule.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenModule.cpp?rev=283637&r1=283636&r2=283637&view=diff
==
--- cfe/trunk/lib/CodeGen/CodeGenModule.cpp (original)
+++ cfe/trunk/lib/CodeGen/CodeGenModule.cpp Fri Oct  7 20:07:11 2016
@@ -2923,6 +2923,10 @@ void CodeGenModule::EmitGlobalFunctionDe
   // non-error diags here, because order can be significant, e.g. with notes
   // that follow errors.)
   auto Diags = D->takeDeferredDiags();
+  if (auto *Templ = D->getPrimaryTemplate()) {
+auto TemplDiags = Templ->getAsFunction()->takeDeferredDiags();
+Diags.insert(Diags.end(), TemplDiags.begin(), TemplDiags.end());
+  }
   bool HasError = llvm::any_of(Diags, [this](const PartialDiagnosticAt &PDAt) {
    return getDiags().getDiagnosticLevel(PDAt.second.getDiagID(), PDAt.first) >=
           DiagnosticsEngine::Error;

Modified: cfe/trunk/lib/Sema/SemaCUDA.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaCUDA.cpp?rev=283637&r1=283636&r2=283637&view=diff
==
--- cfe/trunk/lib/Sema/SemaCUDA.cpp (original)
+++ cfe/trunk/lib/Sema/SemaCUDA.cpp Fri Oct  7 20:07:11 2016
@@ -495,7 +495,13 @@ bool Sema::CheckCUDACall(SourceLocation
 Diag(Callee->getLocation(), diag::note_previous_decl) << Callee;
 return false;
   }
-  if (Pref == Sema::CFP_WrongSide) {
+
+  // Insert into LocsWithCUDADeferredDiags to avoid emitting duplicate deferred
+  // diagnostics for 

[PATCH] D24573: [CUDA] Do a better job at detecting wrong-side calls.

2016-10-08 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL283637: [CUDA] Do a better job at detecting wrong-side 
calls. (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D24573?vs=71381&id=74030#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D24573

Files:
  cfe/trunk/include/clang/Sema/Sema.h
  cfe/trunk/lib/CodeGen/CodeGenModule.cpp
  cfe/trunk/lib/Sema/SemaCUDA.cpp
  cfe/trunk/lib/Sema/SemaExpr.cpp
  cfe/trunk/test/SemaCUDA/Inputs/cuda.h
  cfe/trunk/test/SemaCUDA/call-host-fn-from-device.cu

Index: cfe/trunk/include/clang/Sema/Sema.h
===
--- cfe/trunk/include/clang/Sema/Sema.h
+++ cfe/trunk/include/clang/Sema/Sema.h
@@ -9267,16 +9267,27 @@
   void maybeAddCUDAHostDeviceAttrs(Scope *S, FunctionDecl *FD,
const LookupResult &Previous);
 
+private:
+  /// Raw encodings of SourceLocations for which CheckCUDACall has emitted a
+  /// deferred "bad call" diagnostic.  We use this to avoid emitting the same
+  /// deferred diag twice.
+  llvm::DenseSet<unsigned> LocsWithCUDACallDeferredDiags;
+
+public:
   /// Check whether we're allowed to call Callee from the current context.
   ///
-  /// If the call is never allowed in a semantically-correct program
-  /// (CFP_Never), emits an error and returns false.
+  /// - If the call is never allowed in a semantically-correct program
+  ///   (CFP_Never), emits an error and returns false.
   ///
-  /// If the call is allowed in semantically-correct programs, but only if it's
-  /// never codegen'ed (CFP_WrongSide), creates a deferred diagnostic to be
-  /// emitted if and when the caller is codegen'ed, and returns true.
+  /// - If the call is allowed in semantically-correct programs, but only if
+  ///   it's never codegen'ed (CFP_WrongSide), creates a deferred diagnostic to
+  ///   be emitted if and when the caller is codegen'ed, and returns true.
+  ///
+  ///   Will only create deferred diagnostics for a given SourceLocation once,
+  ///   so you can safely call this multiple times without generating duplicate
+  ///   deferred errors.
   ///
-  /// Otherwise, returns true without emitting any diagnostics.
+  /// - Otherwise, returns true without emitting any diagnostics.
   bool CheckCUDACall(SourceLocation Loc, FunctionDecl *Callee);
 
   /// Check whether a 'try' or 'throw' expression is allowed within the current
Index: cfe/trunk/test/SemaCUDA/call-host-fn-from-device.cu
===
--- cfe/trunk/test/SemaCUDA/call-host-fn-from-device.cu
+++ cfe/trunk/test/SemaCUDA/call-host-fn-from-device.cu
@@ -12,6 +12,9 @@
 // expected-note@-4 {{'host_fn' declared here}}
 // expected-note@-5 {{'host_fn' declared here}}
 // expected-note@-6 {{'host_fn' declared here}}
+// expected-note@-7 {{'host_fn' declared here}}
+
+struct Dummy {};
 
 struct S {
   S() {}
@@ -34,6 +37,15 @@
 
   void h() {}
   // expected-note@-1 {{'h' declared here}}
+
+  void operator+();
+  // expected-note@-1 {{'operator+' declared here}}
+
+  void operator-(const T&) {}
+  // expected-note@-1 {{'operator-' declared here}}
+
+  operator Dummy() { return Dummy(); }
+  // expected-note@-1 {{'operator Dummy' declared here}}
 };
 
 __host__ __device__ void T::hd3() {
@@ -92,3 +104,30 @@
 __host__ __device__ void fn_ptr_template() {
   auto* ptr = &host_fn;  // Not an error because the template isn't instantiated.
 }
+
+__host__ __device__ void unaryOp() {
+  T t;
+  (void) +t; // expected-error {{reference to __host__ function 'operator+' in __host__ __device__ function}}
+}
+
+__host__ __device__ void binaryOp() {
+  T t;
+  (void) (t - t); // expected-error {{reference to __host__ function 'operator-' in __host__ __device__ function}}
+}
+
+__host__ __device__ void implicitConversion() {
+  T t;
+  Dummy d = t; // expected-error {{reference to __host__ function 'operator Dummy' in __host__ __device__ function}}
+}
+
+template <typename T>
+struct TmplStruct {
+  template <typename U> __host__ __device__ void fn() {}
+};
+
+template <>
+template <>
+__host__ __device__ void TmplStruct<int>::fn<int>() { host_fn(); }
+// expected-error@-1 {{reference to __host__ function 'host_fn' in __host__ __device__ function}}
+
+__device__ void double_specialization() { TmplStruct<int>().fn<int>(); }
Index: cfe/trunk/test/SemaCUDA/Inputs/cuda.h
===
--- cfe/trunk/test/SemaCUDA/Inputs/cuda.h
+++ cfe/trunk/test/SemaCUDA/Inputs/cuda.h
@@ -22,7 +22,9 @@
 int cudaConfigureCall(dim3 gridSize, dim3 blockSize, size_t sharedSize = 0,
   cudaStream_t stream = 0);
 
-// Device-side placement new overloads.
+// Host- and device-side placement new overloads.
+void *operator new(__SIZE_TYPE__, void *p) { return p; }
+void *operator new[](__SIZE_TYPE__, void *p) { return p; }
 __device__ void *operator new(__SIZE_TYPE__, void *p) { return p; }
__device__ void *operator new[](__SIZE_TYPE__, void *p) { return p; }

[PATCH] D24975: [CUDA] Add #pragma clang force_cuda_host_device_{begin, end} pragmas.

2016-10-08 Thread Justin Lebar via cfe-commits
jlebar marked 2 inline comments as done.
jlebar added a comment.

In https://reviews.llvm.org/D24975#565054, @rsmith wrote:

> Please add a test to test/PCH for the serialization code.  Otherwise, LGTM.


Test added.  It caught a bug, too.  :)

Thank you for the review.


https://reviews.llvm.org/D24975





r283678 - [CUDA] Declare our __device__ math functions in the same inline namespace as our standard library.

2016-10-08 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Sat Oct  8 17:16:03 2016
New Revision: 283678

URL: http://llvm.org/viewvc/llvm-project?rev=283678&view=rev
Log:
[CUDA] Declare our __device__ math functions in the same inline namespace as 
our standard library.

Summary:
Currently we declare our inline __device__ math functions in namespace
std.  But libstdc++ and libc++ declare these functions in an inline
namespace inside namespace std.  We need to match this because, in a
later patch, we want to get e.g. <complex> to use our device overloads,
and it only will if those overloads are in the right inline namespace.

Reviewers: tra

Subscribers: cfe-commits, jhen

Differential Revision: https://reviews.llvm.org/D24977

Modified:
cfe/trunk/lib/Headers/__clang_cuda_cmath.h
cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h

Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_cmath.h?rev=283678&r1=283677&r2=283678&view=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_cmath.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h Sat Oct  8 17:16:03 2016
@@ -316,7 +316,19 @@ scalbn(__T __x, int __exp) {
   return std::scalbn((double)__x, __exp);
 }
 
+// We need to define these overloads in exactly the namespace our standard
+// library uses (including the right inline namespace), otherwise they won't be
+// picked up by other functions in the standard library (e.g. functions in
+// <complex>).  Thus the ugliness below.
+#ifdef _LIBCPP_BEGIN_NAMESPACE_STD
+_LIBCPP_BEGIN_NAMESPACE_STD
+#else
 namespace std {
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+#endif
+#endif
+
 // Pull the new overloads we defined above into namespace std.
 using ::acos;
 using ::acosh;
@@ -451,7 +463,15 @@ using ::tanf;
 using ::tanhf;
 using ::tgammaf;
 using ::truncf;
-}
+
+#ifdef _LIBCPP_END_NAMESPACE_STD
+_LIBCPP_END_NAMESPACE_STD
+#else
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_END_NAMESPACE_VERSION
+#endif
+} // namespace std
+#endif
 
 #undef __DEVICE__
 

Modified: cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h?rev=283678&r1=283677&r2=283678&view=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h Sat Oct  8 
17:16:03 2016
@@ -185,7 +185,19 @@ __DEVICE__ float tgamma(float);
 __DEVICE__ double trunc(double);
 __DEVICE__ float trunc(float);
 
+// We need to define these overloads in exactly the namespace our standard
+// library uses (including the right inline namespace), otherwise they won't be
+// picked up by other functions in the standard library (e.g. functions in
+// <complex>).  Thus the ugliness below.
+#ifdef _LIBCPP_BEGIN_NAMESPACE_STD
+_LIBCPP_BEGIN_NAMESPACE_STD
+#else
 namespace std {
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+#endif
+#endif
+
 using ::abs;
 using ::acos;
 using ::acosh;
@@ -259,7 +271,15 @@ using ::tan;
 using ::tanh;
 using ::tgamma;
 using ::trunc;
+
+#ifdef _LIBCPP_END_NAMESPACE_STD
+_LIBCPP_END_NAMESPACE_STD
+#else
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_END_NAMESPACE_VERSION
+#endif
 } // namespace std
+#endif
 
 #pragma pop_macro("__DEVICE__")
 




r283677 - [CUDA] Add #pragma clang force_cuda_host_device_{begin, end} pragmas.

2016-10-08 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Sat Oct  8 17:15:58 2016
New Revision: 283677

URL: http://llvm.org/viewvc/llvm-project?rev=283677&view=rev
Log:
[CUDA] Add #pragma clang force_cuda_host_device_{begin,end} pragmas.

Summary:
These cause us to consider all functions in-between to be __host__
__device__.

You can nest these pragmas; you just can't have more 'end's than
'begin's.

Reviewers: rsmith

Subscribers: tra, jhen, cfe-commits

Differential Revision: https://reviews.llvm.org/D24975

Added:
cfe/trunk/test/PCH/pragma-cuda-force-host-device.cu
cfe/trunk/test/Parser/cuda-force-host-device-templates.cu
cfe/trunk/test/Parser/cuda-force-host-device.cu
Modified:
cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
cfe/trunk/include/clang/Parse/Parser.h
cfe/trunk/include/clang/Sema/Sema.h
cfe/trunk/include/clang/Serialization/ASTBitCodes.h
cfe/trunk/include/clang/Serialization/ASTReader.h
cfe/trunk/include/clang/Serialization/ASTWriter.h
cfe/trunk/lib/Parse/ParsePragma.cpp
cfe/trunk/lib/Sema/SemaCUDA.cpp
cfe/trunk/lib/Serialization/ASTReader.cpp
cfe/trunk/lib/Serialization/ASTWriter.cpp

Modified: cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td?rev=283677&r1=283676&r2=283677&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td Sat Oct  8 17:15:58 
2016
@@ -1026,6 +1026,12 @@ def warn_pragma_unroll_cuda_value_in_par
 def warn_cuda_attr_lambda_position : Warning<
   "nvcc does not allow '__%0__' to appear after '()' in lambdas">,
   InGroup;
+def warn_pragma_force_cuda_host_device_bad_arg : Warning<
+  "incorrect use of #pragma clang force_cuda_host_device begin|end">,
+  InGroup;
+def err_pragma_cannot_end_force_cuda_host_device : Error<
+  "force_cuda_host_device end pragma without matching "
+  "force_cuda_host_device begin">;
 } // end of Parse Issue category.
 
 let CategoryName = "Modules Issue" in {

Modified: cfe/trunk/include/clang/Parse/Parser.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Parse/Parser.h?rev=283677&r1=283676&r2=283677&view=diff
==
--- cfe/trunk/include/clang/Parse/Parser.h (original)
+++ cfe/trunk/include/clang/Parse/Parser.h Sat Oct  8 17:15:58 2016
@@ -173,6 +173,7 @@ class Parser : public CodeCompletionHand
  std::unique_ptr<PragmaHandler> MSSection;
  std::unique_ptr<PragmaHandler> MSRuntimeChecks;
  std::unique_ptr<PragmaHandler> MSIntrinsic;
+  std::unique_ptr<PragmaHandler> CUDAForceHostDeviceHandler;
  std::unique_ptr<PragmaHandler> OptimizeHandler;
  std::unique_ptr<PragmaHandler> LoopHintHandler;
  std::unique_ptr<PragmaHandler> UnrollHintHandler;

Modified: cfe/trunk/include/clang/Sema/Sema.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Sema/Sema.h?rev=283677&r1=283676&r2=283677&view=diff
==
--- cfe/trunk/include/clang/Sema/Sema.h (original)
+++ cfe/trunk/include/clang/Sema/Sema.h Sat Oct  8 17:15:58 2016
@@ -9219,6 +9219,20 @@ public:
 QualType FieldTy, bool IsMsStruct,
 Expr *BitWidth, bool *ZeroWidth = nullptr);
 
+private:
+  unsigned ForceCUDAHostDeviceDepth = 0;
+
+public:
+  /// Increments our count of the number of times we've seen a pragma forcing
+  /// functions to be __host__ __device__.  So long as this count is greater
+  /// than zero, all functions encountered will be __host__ __device__.
+  void PushForceCUDAHostDevice();
+
+  /// Decrements our count of the number of times we've seen a pragma forcing
+  /// functions to be __host__ __device__.  Returns false if the count is 0
+  /// before incrementing, so you can emit an error.
+  bool PopForceCUDAHostDevice();
+
   enum CUDAFunctionTarget {
 CFT_Device,
 CFT_Global,

Modified: cfe/trunk/include/clang/Serialization/ASTBitCodes.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Serialization/ASTBitCodes.h?rev=283677&r1=283676&r2=283677&view=diff
==
--- cfe/trunk/include/clang/Serialization/ASTBitCodes.h (original)
+++ cfe/trunk/include/clang/Serialization/ASTBitCodes.h Sat Oct  8 17:15:58 2016
@@ -580,7 +580,11 @@ namespace clang {
   MSSTRUCT_PRAGMA_OPTIONS = 55,
 
   /// \brief Record code for \#pragma ms_struct options.
-  POINTERS_TO_MEMBERS_PRAGMA_OPTIONS = 56
+  POINTERS_TO_MEMBERS_PRAGMA_OPTIONS = 56,
+
+  /// \brief Number of unmatched #pragma clang cuda_force_host_device begin
+  /// directives we've seen.
+  CUDA_PRAGMA_FORCE_HOST_DEVICE_DEPTH = 57,
 };
 
 /// \brief Record types used within a source manager block.

Modified: cfe/trunk/include/clang/Serialization/ASTReader.h
URL: 
http://llvm.org/viewvc/llvm-projec

[PATCH] D24977: [CUDA] Declare our __device__ math functions in the same inline namespace as our standard library.

2016-10-08 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL283678: [CUDA] Declare our __device__ math functions in the 
same inline namespace as… (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D24977?vs=72684&id=74052#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D24977

Files:
  cfe/trunk/lib/Headers/__clang_cuda_cmath.h
  cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h


Index: cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
===
--- cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
+++ cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
@@ -185,7 +185,19 @@
 __DEVICE__ double trunc(double);
 __DEVICE__ float trunc(float);
 
+// We need to define these overloads in exactly the namespace our standard
+// library uses (including the right inline namespace), otherwise they won't be
+// picked up by other functions in the standard library (e.g. functions in
+// <complex>).  Thus the ugliness below.
+#ifdef _LIBCPP_BEGIN_NAMESPACE_STD
+_LIBCPP_BEGIN_NAMESPACE_STD
+#else
 namespace std {
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+#endif
+#endif
+
 using ::abs;
 using ::acos;
 using ::acosh;
@@ -259,7 +271,15 @@
 using ::tanh;
 using ::tgamma;
 using ::trunc;
+
+#ifdef _LIBCPP_END_NAMESPACE_STD
+_LIBCPP_END_NAMESPACE_STD
+#else
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_END_NAMESPACE_VERSION
+#endif
 } // namespace std
+#endif
 
 #pragma pop_macro("__DEVICE__")
 
Index: cfe/trunk/lib/Headers/__clang_cuda_cmath.h
===
--- cfe/trunk/lib/Headers/__clang_cuda_cmath.h
+++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h
@@ -316,7 +316,19 @@
   return std::scalbn((double)__x, __exp);
 }
 
+// We need to define these overloads in exactly the namespace our standard
+// library uses (including the right inline namespace), otherwise they won't be
+// picked up by other functions in the standard library (e.g. functions in
+// <complex>).  Thus the ugliness below.
+#ifdef _LIBCPP_BEGIN_NAMESPACE_STD
+_LIBCPP_BEGIN_NAMESPACE_STD
+#else
 namespace std {
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+#endif
+#endif
+
 // Pull the new overloads we defined above into namespace std.
 using ::acos;
 using ::acosh;
@@ -451,7 +463,15 @@
 using ::tanhf;
 using ::tgammaf;
 using ::truncf;
-}
+
+#ifdef _LIBCPP_END_NAMESPACE_STD
+_LIBCPP_END_NAMESPACE_STD
+#else
+#ifdef _GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_END_NAMESPACE_VERSION
+#endif
+} // namespace std
+#endif
 
 #undef __DEVICE__
 



[PATCH] D24975: [CUDA] Add #pragma clang force_cuda_host_device_{begin, end} pragmas.

2016-10-08 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL283677: [CUDA] Add #pragma clang 
force_cuda_host_device_{begin,end} pragmas. (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D24975?vs=72734&id=74051#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D24975

Files:
  cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
  cfe/trunk/include/clang/Parse/Parser.h
  cfe/trunk/include/clang/Sema/Sema.h
  cfe/trunk/include/clang/Serialization/ASTBitCodes.h
  cfe/trunk/include/clang/Serialization/ASTReader.h
  cfe/trunk/include/clang/Serialization/ASTWriter.h
  cfe/trunk/lib/Parse/ParsePragma.cpp
  cfe/trunk/lib/Sema/SemaCUDA.cpp
  cfe/trunk/lib/Serialization/ASTReader.cpp
  cfe/trunk/lib/Serialization/ASTWriter.cpp
  cfe/trunk/test/PCH/pragma-cuda-force-host-device.cu
  cfe/trunk/test/Parser/cuda-force-host-device-templates.cu
  cfe/trunk/test/Parser/cuda-force-host-device.cu

Index: cfe/trunk/include/clang/Serialization/ASTReader.h
===
--- cfe/trunk/include/clang/Serialization/ASTReader.h
+++ cfe/trunk/include/clang/Serialization/ASTReader.h
@@ -772,6 +772,10 @@
   /// Sema tracks these to emit warnings.
   SmallVector UnusedLocalTypedefNameCandidates;
 
+  /// \brief Our current depth in #pragma cuda force_host_device begin/end
+  /// macros.
+  unsigned ForceCUDAHostDeviceDepth = 0;
+
   /// \brief The IDs of the declarations Sema stores directly.
   ///
   /// Sema tracks a few important decls, such as namespace std, directly.
Index: cfe/trunk/include/clang/Serialization/ASTWriter.h
===
--- cfe/trunk/include/clang/Serialization/ASTWriter.h
+++ cfe/trunk/include/clang/Serialization/ASTWriter.h
@@ -459,6 +459,7 @@
   void WriteDeclContextVisibleUpdate(const DeclContext *DC);
   void WriteFPPragmaOptions(const FPOptions &Opts);
   void WriteOpenCLExtensions(Sema &SemaRef);
+  void WriteCUDAPragmas(Sema &SemaRef);
   void WriteObjCCategories();
   void WriteLateParsedTemplates(Sema &SemaRef);
   void WriteOptimizePragmaOptions(Sema &SemaRef);
Index: cfe/trunk/include/clang/Serialization/ASTBitCodes.h
===
--- cfe/trunk/include/clang/Serialization/ASTBitCodes.h
+++ cfe/trunk/include/clang/Serialization/ASTBitCodes.h
@@ -580,7 +580,11 @@
   MSSTRUCT_PRAGMA_OPTIONS = 55,
 
   /// \brief Record code for \#pragma ms_struct options.
-  POINTERS_TO_MEMBERS_PRAGMA_OPTIONS = 56
+  POINTERS_TO_MEMBERS_PRAGMA_OPTIONS = 56,
+
+  /// \brief Number of unmatched #pragma clang cuda_force_host_device begin
+  /// directives we've seen.
+  CUDA_PRAGMA_FORCE_HOST_DEVICE_DEPTH = 57,
 };
 
 /// \brief Record types used within a source manager block.
Index: cfe/trunk/include/clang/Parse/Parser.h
===
--- cfe/trunk/include/clang/Parse/Parser.h
+++ cfe/trunk/include/clang/Parse/Parser.h
@@ -173,6 +173,7 @@
  std::unique_ptr<PragmaHandler> MSSection;
  std::unique_ptr<PragmaHandler> MSRuntimeChecks;
  std::unique_ptr<PragmaHandler> MSIntrinsic;
+  std::unique_ptr<PragmaHandler> CUDAForceHostDeviceHandler;
  std::unique_ptr<PragmaHandler> OptimizeHandler;
  std::unique_ptr<PragmaHandler> LoopHintHandler;
  std::unique_ptr<PragmaHandler> UnrollHintHandler;
Index: cfe/trunk/include/clang/Sema/Sema.h
===
--- cfe/trunk/include/clang/Sema/Sema.h
+++ cfe/trunk/include/clang/Sema/Sema.h
@@ -9219,6 +9219,20 @@
 QualType FieldTy, bool IsMsStruct,
 Expr *BitWidth, bool *ZeroWidth = nullptr);
 
+private:
+  unsigned ForceCUDAHostDeviceDepth = 0;
+
+public:
+  /// Increments our count of the number of times we've seen a pragma forcing
+  /// functions to be __host__ __device__.  So long as this count is greater
+  /// than zero, all functions encountered will be __host__ __device__.
+  void PushForceCUDAHostDevice();
+
+  /// Decrements our count of the number of times we've seen a pragma forcing
+  /// functions to be __host__ __device__.  Returns false if the count is 0
+  /// before decrementing, so you can emit an error.
+  bool PopForceCUDAHostDevice();
+
   enum CUDAFunctionTarget {
 CFT_Device,
 CFT_Global,
Index: cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
===
--- cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
+++ cfe/trunk/include/clang/Basic/DiagnosticParseKinds.td
@@ -1026,6 +1026,12 @@
 def warn_cuda_attr_lambda_position : Warning<
   "nvcc does not allow '__%0__' to appear after '()' in lambdas">,
   InGroup;
+def warn_pragma_force_cuda_host_device_bad_arg : Warning<
+  "incorrect use of #pragma clang force_cuda_host_device begin|end">,
+  InGroup;
+def err_pragma_cannot_end_force_cuda_host_device : Error<
+  "force_cuda_host_device end pragma without matching "
+  "force_

r283679 - [CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h.

2016-10-08 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Sat Oct  8 17:16:08 2016
New Revision: 283679

URL: http://llvm.org/viewvc/llvm-project?rev=283679&view=rev
Log:
[CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h.

Summary: This matches the idiom we use for our other CUDA wrapper headers.

Reviewers: tra

Subscribers: beanz, mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D24978

Added:
cfe/trunk/lib/Headers/__clang_cuda_builtin_vars.h
Removed:
cfe/trunk/lib/Headers/cuda_builtin_vars.h
Modified:
cfe/trunk/lib/Frontend/CompilerInvocation.cpp
cfe/trunk/lib/Headers/CMakeLists.txt
cfe/trunk/lib/Headers/__clang_cuda_runtime_wrapper.h
cfe/trunk/test/CodeGenCUDA/cuda-builtin-vars.cu
cfe/trunk/test/SemaCUDA/cuda-builtin-vars.cu

Modified: cfe/trunk/lib/Frontend/CompilerInvocation.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/CompilerInvocation.cpp?rev=283679&r1=283678&r2=283679&view=diff
==
--- cfe/trunk/lib/Frontend/CompilerInvocation.cpp (original)
+++ cfe/trunk/lib/Frontend/CompilerInvocation.cpp Sat Oct  8 17:16:08 2016
@@ -2012,9 +2012,10 @@ static void ParseLangArgs(LangOptions &O
   // enabled for Microsoft Extensions or Borland Extensions, here.
   //
   // FIXME: __declspec is also currently enabled for CUDA, but isn't really a
-  // CUDA extension, however it is required for supporting cuda_builtin_vars.h,
-  // which uses __declspec(property). Once that has been rewritten in terms of
-  // something more generic, remove the Opts.CUDA term here.
+  // CUDA extension. However, it is required for supporting
+  // __clang_cuda_builtin_vars.h, which uses __declspec(property). Once that has
+  // been rewritten in terms of something more generic, remove the Opts.CUDA
+  // term here.
   Opts.DeclSpecKeyword =
   Args.hasFlag(OPT_fdeclspec, OPT_fno_declspec,
(Opts.MicrosoftExt || Opts.Borland || Opts.CUDA));

Modified: cfe/trunk/lib/Headers/CMakeLists.txt
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/CMakeLists.txt?rev=283679&r1=283678&r2=283679&view=diff
==
--- cfe/trunk/lib/Headers/CMakeLists.txt (original)
+++ cfe/trunk/lib/Headers/CMakeLists.txt Sat Oct  8 17:16:08 2016
@@ -22,12 +22,12 @@ set(files
   avxintrin.h
   bmi2intrin.h
   bmiintrin.h
+  __clang_cuda_builtin_vars.h
   __clang_cuda_cmath.h
   __clang_cuda_intrinsics.h
   __clang_cuda_math_forward_declares.h
   __clang_cuda_runtime_wrapper.h
   cpuid.h
-  cuda_builtin_vars.h
   clflushoptintrin.h
   emmintrin.h
   f16cintrin.h

Added: cfe/trunk/lib/Headers/__clang_cuda_builtin_vars.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_builtin_vars.h?rev=283679&view=auto
==
--- cfe/trunk/lib/Headers/__clang_cuda_builtin_vars.h (added)
+++ cfe/trunk/lib/Headers/__clang_cuda_builtin_vars.h Sat Oct  8 17:16:08 2016
@@ -0,0 +1,126 @@
+/*=== cuda_builtin_vars.h - CUDA built-in variables -===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#ifndef __CUDA_BUILTIN_VARS_H
+#define __CUDA_BUILTIN_VARS_H
+
+// Forward declares from vector_types.h.
+struct uint3;
+struct dim3;
+
+// The file implements built-in CUDA variables using __declspec(property).
+// https://msdn.microsoft.com/en-us/library/yhfk0thd.aspx
+// All read accesses of built-in variable fields get converted into calls to a
+// getter function which in turn calls the appropriate builtin to fetch the
+// value.
+//
+// Example:
+//int x = threadIdx.x;
+// IR output:
+//  %0 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() #3
+// PTX output:
+//  mov.u32 %r2, %tid.x;
+
+#define __CUDA_DEVICE_BUILTIN(FIELD

r283680 - [CUDA] Support <complex> and std::min/max on the device.

2016-10-08 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Sat Oct  8 17:16:12 2016
New Revision: 283680

URL: http://llvm.org/viewvc/llvm-project?rev=283680&view=rev
Log:
[CUDA] Support <complex> and std::min/max on the device.

Summary:
We do this by wrapping <algorithm> and <complex>.

Tests are in the test-suite.

Reviewers: tra

Subscribers: jhen, beanz, cfe-commits, mgorny

Differential Revision: https://reviews.llvm.org/D24979

Added:
cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h
cfe/trunk/lib/Headers/cuda_wrappers/
cfe/trunk/lib/Headers/cuda_wrappers/algorithm
cfe/trunk/lib/Headers/cuda_wrappers/complex
Modified:
cfe/trunk/lib/Driver/ToolChains.cpp
cfe/trunk/lib/Headers/CMakeLists.txt
cfe/trunk/lib/Headers/__clang_cuda_runtime_wrapper.h

Modified: cfe/trunk/lib/Driver/ToolChains.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains.cpp?rev=283680&r1=283679&r2=283680&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains.cpp Sat Oct  8 17:16:12 2016
@@ -4694,6 +4694,15 @@ void Linux::AddClangCXXStdlibIncludeArgs
 
 void Linux::AddCudaIncludeArgs(const ArgList &DriverArgs,
ArgStringList &CC1Args) const {
+  if (!DriverArgs.hasArg(options::OPT_nobuiltininc)) {
+    // Add cuda_wrappers/* to our system include path.  This lets us wrap
+    // standard library headers.
+    SmallString<128> P(getDriver().ResourceDir);
+    llvm::sys::path::append(P, "include");
+    llvm::sys::path::append(P, "cuda_wrappers");
+    addSystemInclude(DriverArgs, CC1Args, P);
+  }
+
   if (DriverArgs.hasArg(options::OPT_nocudainc))
     return;
 

Modified: cfe/trunk/lib/Headers/CMakeLists.txt
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/CMakeLists.txt?rev=283680&r1=283679&r2=283680&view=diff
==
--- cfe/trunk/lib/Headers/CMakeLists.txt (original)
+++ cfe/trunk/lib/Headers/CMakeLists.txt Sat Oct  8 17:16:12 2016
@@ -24,10 +24,13 @@ set(files
   bmiintrin.h
   __clang_cuda_builtin_vars.h
   __clang_cuda_cmath.h
+  __clang_cuda_complex_builtins.h
   __clang_cuda_intrinsics.h
   __clang_cuda_math_forward_declares.h
   __clang_cuda_runtime_wrapper.h
   cpuid.h
+  cuda_wrappers/algorithm
+  cuda_wrappers/complex
   clflushoptintrin.h
   emmintrin.h
   f16cintrin.h

Added: cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h?rev=283680&view=auto
==
--- cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h (added)
+++ cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h Sat Oct  8 17:16:12 2016
@@ -0,0 +1,203 @@
+/*===-- __clang_cuda_complex_builtins - CUDA impls of runtime complex fns ---===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#ifndef __CLANG_CUDA_COMPLEX_BUILTINS
+#define __CLANG_CUDA_COMPLEX_BUILTINS
+
+// This header defines __muldc3, __mulsc3, __divdc3, and __divsc3.  These are
+// libgcc functions that clang assumes are available when compiling c99 complex
+// operations.  (These implementations come from libc++, and have been modified
+// to work with CUDA.)
+
+extern "C" inline __device__ double _Complex __muldc3(double __a, double __b,
+  double __c, double __d) {
+  double __ac = __a * __c;
+  double __bd = __b * __d;
+  double __ad = __a * __d;
+  double __bc = __b * __c;
+  double _Complex z;
+  __real__(z) = __ac - __bd;
+  __imag__(z) = __ad + __bc;
+  if (std::isnan(__real__(z)) && std::isnan(__imag__(z))) {
+    int __recalc = 0;
+    if (std::isinf(__a) || std::isinf(__b)) {
+      __a = std::copysign(std::isinf(__a) ? 1 : 0

[PATCH] D24978: [CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h.

2016-10-08 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL283679: [CUDA] Rename cuda_builtin_vars.h to __clang_cuda_builtin_vars.h. (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D24978?vs=72685&id=74054#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D24978

Files:
  cfe/trunk/lib/Frontend/CompilerInvocation.cpp
  cfe/trunk/lib/Headers/CMakeLists.txt
  cfe/trunk/lib/Headers/__clang_cuda_builtin_vars.h
  cfe/trunk/lib/Headers/__clang_cuda_runtime_wrapper.h
  cfe/trunk/lib/Headers/cuda_builtin_vars.h
  cfe/trunk/test/CodeGenCUDA/cuda-builtin-vars.cu
  cfe/trunk/test/SemaCUDA/cuda-builtin-vars.cu

Index: cfe/trunk/test/SemaCUDA/cuda-builtin-vars.cu
===
--- cfe/trunk/test/SemaCUDA/cuda-builtin-vars.cu
+++ cfe/trunk/test/SemaCUDA/cuda-builtin-vars.cu
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 "-triple" "nvptx-nvidia-cuda" -fcuda-is-device -fsyntax-only -verify %s
 
-#include "cuda_builtin_vars.h"
+#include "__clang_cuda_builtin_vars.h"
 __attribute__((global))
 void kernel(int *out) {
   int i = 0;
@@ -34,20 +34,20 @@
 
   out[i++] = warpSize;
   warpSize = 0; // expected-error {{cannot assign to variable 'warpSize' with const-qualified type 'const int'}}
-  // expected-note@cuda_builtin_vars.h:* {{variable 'warpSize' declared const here}}
+  // expected-note@__clang_cuda_builtin_vars.h:* {{variable 'warpSize' declared const here}}
 
   // Make sure we can't construct or assign to the special variables.
   __cuda_builtin_threadIdx_t x; // expected-error {{calling a private constructor of class '__cuda_builtin_threadIdx_t'}}
-  // expected-note@cuda_builtin_vars.h:* {{declared private here}}
+  // expected-note@__clang_cuda_builtin_vars.h:* {{declared private here}}
 
   __cuda_builtin_threadIdx_t y = threadIdx; // expected-error {{calling a private constructor of class '__cuda_builtin_threadIdx_t'}}
-  // expected-note@cuda_builtin_vars.h:* {{declared private here}}
+  // expected-note@__clang_cuda_builtin_vars.h:* {{declared private here}}
 
   threadIdx = threadIdx; // expected-error {{'operator=' is a private member of '__cuda_builtin_threadIdx_t'}}
-  // expected-note@cuda_builtin_vars.h:* {{declared private here}}
+  // expected-note@__clang_cuda_builtin_vars.h:* {{declared private here}}
 
   void *ptr = &threadIdx; // expected-error {{'operator&' is a private member of '__cuda_builtin_threadIdx_t'}}
-  // expected-note@cuda_builtin_vars.h:* {{declared private here}}
+  // expected-note@__clang_cuda_builtin_vars.h:* {{declared private here}}
 
   // Following line should've caused an error as one is not allowed to
   // take address of a built-in variable in CUDA. Alas there's no way
Index: cfe/trunk/test/CodeGenCUDA/cuda-builtin-vars.cu
===
--- cfe/trunk/test/CodeGenCUDA/cuda-builtin-vars.cu
+++ cfe/trunk/test/CodeGenCUDA/cuda-builtin-vars.cu
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 "-triple" "nvptx-nvidia-cuda" -emit-llvm -fcuda-is-device -o - %s | FileCheck %s
 
-#include "cuda_builtin_vars.h"
+#include "__clang_cuda_builtin_vars.h"
 
 // CHECK: define void @_Z6kernelPi(i32* %out)
 __attribute__((global))
Index: cfe/trunk/lib/Frontend/CompilerInvocation.cpp
===
--- cfe/trunk/lib/Frontend/CompilerInvocation.cpp
+++ cfe/trunk/lib/Frontend/CompilerInvocation.cpp
@@ -2012,9 +2012,10 @@
   // enabled for Microsoft Extensions or Borland Extensions, here.
   //
   // FIXME: __declspec is also currently enabled for CUDA, but isn't really a
-  // CUDA extension, however it is required for supporting cuda_builtin_vars.h,
-  // which uses __declspec(property). Once that has been rewritten in terms of
-  // something more generic, remove the Opts.CUDA term here.
+  // CUDA extension. However, it is required for supporting
+  // __clang_cuda_builtin_vars.h, which uses __declspec(property). Once that has
+  // been rewritten in terms of something more generic, remove the Opts.CUDA
+  // term here.
   Opts.DeclSpecKeyword =
   Args.hasFlag(OPT_fdeclspec, OPT_fno_declspec,
(Opts.MicrosoftExt || Opts.Borland || Opts.CUDA));
Index: cfe/trunk/lib/Headers/CMakeLists.txt
===
--- cfe/trunk/lib/Headers/CMakeLists.txt
+++ cfe/trunk/lib/Headers/CMakeLists.txt
@@ -22,12 +22,12 @@
   avxintrin.h
   bmi2intrin.h
   bmiintrin.h
+  __clang_cuda_builtin_vars.h
   __clang_cuda_cmath.h
   __clang_cuda_intrinsics.h
   __clang_cuda_math_forward_declares.h
   __clang_cuda_runtime_wrapper.h
   cpuid.h
-  cuda_builtin_vars.h
   clflushoptintrin.h
   emmintrin.h
   f16cintrin.h
Index: cfe/trunk/lib/Headers/__clang_cuda_builtin_vars.h
===
--- cfe/trunk/lib/Headers/__clang_cuda_builtin_vars.h
+++ cfe/trunk/lib/Headers/__clang_cuda_builtin_vars.h
@@

[PATCH] D24979: [CUDA] Support <complex> and std::min/max on the device.

2016-10-08 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL283680: [CUDA] Support <complex> and std::min/max on the device. (authored by jlebar).

Changed prior to commit:
  https://reviews.llvm.org/D24979?vs=72719&id=74053#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D24979

Files:
  cfe/trunk/lib/Driver/ToolChains.cpp
  cfe/trunk/lib/Headers/CMakeLists.txt
  cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h
  cfe/trunk/lib/Headers/__clang_cuda_runtime_wrapper.h
  cfe/trunk/lib/Headers/cuda_wrappers/algorithm
  cfe/trunk/lib/Headers/cuda_wrappers/complex

Index: cfe/trunk/lib/Driver/ToolChains.cpp
===
--- cfe/trunk/lib/Driver/ToolChains.cpp
+++ cfe/trunk/lib/Driver/ToolChains.cpp
@@ -4694,6 +4694,15 @@
 
 void Linux::AddCudaIncludeArgs(const ArgList &DriverArgs,
ArgStringList &CC1Args) const {
+  if (!DriverArgs.hasArg(options::OPT_nobuiltininc)) {
+    // Add cuda_wrappers/* to our system include path.  This lets us wrap
+    // standard library headers.
+    SmallString<128> P(getDriver().ResourceDir);
+    llvm::sys::path::append(P, "include");
+    llvm::sys::path::append(P, "cuda_wrappers");
+    addSystemInclude(DriverArgs, CC1Args, P);
+  }
+
   if (DriverArgs.hasArg(options::OPT_nocudainc))
     return;
 
Index: cfe/trunk/lib/Headers/cuda_wrappers/complex
===
--- cfe/trunk/lib/Headers/cuda_wrappers/complex
+++ cfe/trunk/lib/Headers/cuda_wrappers/complex
@@ -0,0 +1,79 @@
+/*=== complex - CUDA wrapper for <complex> --===
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ *===---===
+ */
+
+#pragma once
+
+// Wrapper around <complex> that forces its functions to be __host__
+// __device__.
+
+// First, include host-only headers we think are likely to be included by
+// <complex>, so that the pragma below only applies to <complex> itself.
+#if __cplusplus >= 201103L
+#include <type_traits>
+#endif
+#include <stdexcept>
+#include <cmath>
+#include <sstream>
+
+// Next, include our <algorithm> wrapper, to ensure that device overloads of
+// std::min/max are available.
+#include <algorithm>
+
+#pragma clang force_cuda_host_device begin
+
+// When compiling for device, ask libstdc++ to use its own implementations of
+// complex functions, rather than calling builtins (which resolve to library
+// functions that don't exist when compiling CUDA device code).
+//
+// This is a little dicey, because it causes libstdc++ to define a different
+// set of overloads on host and device.
+//
+//   // Present only when compiling for host.
+//   __host__ __device__ complex<float> sin(const complex<float>& x) {
+//     return __builtin_csinf(x);
+//   }
+//
+//   // Present when compiling for host and for device.
+//   template <typename T>
+//   __host__ __device__ complex<T> sin(const complex<T>& x) {
+//     return complex<T>(sin(x.real()) * cosh(x.imag()),
+//                       cos(x.real()) * sinh(x.imag()));
+//   }
+//
+// This is safe because when compiling for device, all function calls in
+// __host__ code to sin() will still resolve to *something*, even if they don't
+// resolve to the same function as they resolve to when compiling for host.  We
+// don't care that they don't resolve to the right function because we won't
+// codegen this host code when compiling for device.
+
+#pragma push_macro("_GLIBCXX_USE_C99_COMPLEX")
+#pragma push_macro("_GLIBCXX_USE_C99_COMPLEX_TR1")
+#define _GLIBCXX_USE_C99_COMPLEX 0
+#define _GLIBCXX_USE_C99_COMPLEX_TR1 0
+
+#include_next 
+
+#pragma pop_macro("_GLIBCXX_USE_C99_COMPLEX_TR1")
+#pragma pop_macro("_GLIBCXX_USE_C99_COMPLEX")
+
+#pragma clang force_cuda_host_device end
Index: cfe/trunk/lib/Headers/cuda_wrappers/algorithm
===
--- cfe/trunk/lib/Headers/cuda_wrappers/algorithm
+++ cfe/trunk/lib/Headers/cuda_wra

Re: r283680 - [CUDA] Support <complex> and std::min/max on the device.

2016-10-08 Thread Justin Lebar via cfe-commits
Hal,

On NVPTX, these functions eventually get resolved to function calls in
libdevice, e.g. __nv_isinff and __nv_isnanf.

llvm does not do a good job understanding the body of e.g.
__nvvm_isnanf, because it uses nvptx-specific intrinsic functions,
notably @llvm.nvvm.fabs.f.  These are opaque to the LLVM optimizer.

The fix is not as simple as simply changing our implementation of e.g.
std::isnan to call __builtin_isnanf, because we also would want to fix
::isnanf, but we can't override that implementation without some major
surgery on the nvptx headers.

David Majnemer and I talked about one way to fix this, namely by using
IR intrinsic upgrades to replace the opaque nvptx intrinsics with LLVM
intrinsics.  LLVM would then be able to understand these intrinsics
and optimize them.  We would reap benefits not just for std::isnan,
but also e.g. constant-folding calls like std::abs that also
eventually end up in libnvvm.

I did the first half of this work, by adding lowerings for the various
LLVM intrinsics to the NVPTX backend [1].  But David is now busy with
other things and hasn't been able to help with the second half, namely
using IR upgrades to replace the nvptx target-specific intrinsics with
generalized LLVM intrinsics.  Perhaps this is something you'd be able
to help with?

In any case, using builtins here without fixing std::isnan and ::isnan
feels to me to be the wrong solution.  It seems to me that we should
be able to rely on std::isnan and friends being fast, and if they're
not, we should fix that.  Using builtins here would be "cheating" to
make our implementation faster than user code.

I'll note, separately, that on x86, clang does not seem to
constant-fold std::isinf or __builtin_isinff to false with -ffast-math
-ffinite-math-only.  GCC can do it.  Clang gets std::isnan.
https://godbolt.org/g/vZB55a

By the way, the changes you made to libc++ unfortunately break this
patch with libc++, because e.g. __libcpp_isnan is not a device
function.  I'll have to think about how to fix that -- I may send you
a patch.

Regards,
-Justin

[1] https://reviews.llvm.org/D24300

On Sat, Oct 8, 2016 at 3:36 PM, Hal Finkel  wrote:
> Hi Justin,
>
> This is neat!
>
> I see a bunch of uses of std::isinf, etc. here. It tends to be important 
> that, when using -ffast-math (or -ffinite-math-only) these checks get 
> optimized away. Can you please check that they do? If not, you might mirror 
> what I've done in r283051 for libc++, which is similar to what libstdc++ ends 
> up doing, so that we use __builtin_isnan/isinf/isfinite.
>
> Thanks again,
> Hal
>
> - Original Message -
>> From: "Justin Lebar via cfe-commits" 
>> To: cfe-commits@lists.llvm.org
>> Sent: Saturday, October 8, 2016 5:16:13 PM
>> Subject: r283680 - [CUDA] Support <complex> and std::min/max on the device.
>>
>> Author: jlebar
>> Date: Sat Oct  8 17:16:12 2016
>> New Revision: 283680
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=283680&view=rev
>> Log:
>> [CUDA] Support <complex> and std::min/max on the device.
>>
>> Summary:
>> We do this by wrapping <algorithm> and <complex>.
>>
>> Tests are in the test-suite.
>>
>> Reviewers: tra
>>
>> Subscribers: jhen, beanz, cfe-commits, mgorny
>>
>> Differential Revision: https://reviews.llvm.org/D24979
>>
>> Added:
>> cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h
>> cfe/trunk/lib/Headers/cuda_wrappers/
>> cfe/trunk/lib/Headers/cuda_wrappers/algorithm
>> cfe/trunk/lib/Headers/cuda_wrappers/complex
>> Modified:
>> cfe/trunk/lib/Driver/ToolChains.cpp
>> cfe/trunk/lib/Headers/CMakeLists.txt
>> cfe/trunk/lib/Headers/__clang_cuda_runtime_wrapper.h
>>
>> Modified: cfe/trunk/lib/Driver/ToolChains.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains.cpp?rev=283680&r1=283679&r2=283680&view=diff
>> ==
>> --- cfe/trunk/lib/Driver/ToolChains.cpp (original)
>> +++ cfe/trunk/lib/Driver/ToolChains.cpp Sat Oct  8 17:16:12 2016
>> @@ -4694,6 +4694,15 @@ void Linux::AddClangCXXStdlibIncludeArgs
>>
>>  void Linux::AddCudaIncludeArgs(const ArgList &DriverArgs,
>> ArgStringList &CC1Args) const {
>> +  if (!DriverArgs.hasArg(options::OPT_nobuiltininc)) {
>> +// Add cuda_wrappers/* to our system include path.  This lets us
>> wrap
>> +// standard library headers.
>> +SmallString<128> P(getDriver().ResourceDir);
>> +llvm::sys::path::append(P, "include");
>> +llvm::sys::path::append(P, "cuda_wrappers");
>> +add

r283683 - [CUDA] Don't install cuda_wrappers/{algorithm, complex} into the main include dir.

2016-10-08 Thread Justin Lebar via cfe-commits
Author: jlebar
Date: Sat Oct  8 19:27:39 2016
New Revision: 283683

URL: http://llvm.org/viewvc/llvm-project?rev=283683&view=rev
Log:
[CUDA] Don't install cuda_wrappers/{algorithm,complex} into the main include dir.

This is obviously wrong -- if we do this, then all compiles will pick up
these wrappers, which is not what we want.

Modified:
cfe/trunk/lib/Headers/CMakeLists.txt

Modified: cfe/trunk/lib/Headers/CMakeLists.txt
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/CMakeLists.txt?rev=283683&r1=283682&r2=283683&view=diff
==
--- cfe/trunk/lib/Headers/CMakeLists.txt (original)
+++ cfe/trunk/lib/Headers/CMakeLists.txt Sat Oct  8 19:27:39 2016
@@ -29,8 +29,6 @@ set(files
   __clang_cuda_math_forward_declares.h
   __clang_cuda_runtime_wrapper.h
   cpuid.h
-  cuda_wrappers/algorithm
-  cuda_wrappers/complex
   clflushoptintrin.h
   emmintrin.h
   f16cintrin.h
@@ -92,6 +90,11 @@ set(files
   xtestintrin.h
   )
 
+set(cuda_wrapper_files
+  cuda_wrappers/algorithm
+  cuda_wrappers/complex
+)
+
 set(output_dir ${LLVM_LIBRARY_OUTPUT_INTDIR}/clang/${CLANG_VERSION}/include)
 
 # Generate arm_neon.h
@@ -99,7 +102,7 @@ clang_tablegen(arm_neon.h -gen-arm-neon
   SOURCE ${CLANG_SOURCE_DIR}/include/clang/Basic/arm_neon.td)
 
 set(out_files)
-foreach( f ${files} )
+foreach( f ${files} ${cuda_wrapper_files} )
   set( src ${CMAKE_CURRENT_SOURCE_DIR}/${f} )
   set( dst ${output_dir}/${f} )
   add_custom_command(OUTPUT ${dst}
@@ -124,6 +127,12 @@ install(
   PERMISSIONS OWNER_READ OWNER_WRITE GROUP_READ WORLD_READ
   DESTINATION lib${LLVM_LIBDIR_SUFFIX}/clang/${CLANG_VERSION}/include)
 
+install(
+  FILES ${cuda_wrapper_files}
+  COMPONENT clang-headers
+  PERMISSIONS OWNER_READ OWNER_WRITE GROUP_READ WORLD_READ
+  DESTINATION lib${LLVM_LIBDIR_SUFFIX}/clang/${CLANG_VERSION}/include/cuda_wrappers)
+
 if (NOT CMAKE_CONFIGURATION_TYPES) # don't add this for IDE's.
   add_custom_target(install-clang-headers
 DEPENDS clang-headers


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: r283680 - [CUDA] Support <complex> and std::min/max on the device.

2016-10-08 Thread Justin Lebar via cfe-commits