Re: r339603 - [OPENMP] Fix emission of the loop doacross constructs.

2018-08-15 Thread Jonas Hahnfeld via cfe-commits

Alexey, Hans,

does it make sense to backport for 7.0 as it fixes PR37580?

Thanks,
Jonas

On 2018-08-13 21:04, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Mon Aug 13 12:04:24 2018
New Revision: 339603

URL: http://llvm.org/viewvc/llvm-project?rev=339603&view=rev
Log:
[OPENMP] Fix emission of the loop doacross constructs.

The number of loops associated with the OpenMP loop constructs should
not be considered as the number of loops to collapse.
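
For readers unfamiliar with doacross loops, here is a minimal sketch of the kind of loop nest the log refers to. It is an illustration only and is not taken from the commit or its tests; the function name wavefront and the table layout are assumptions. The ordered(2) clause associates two loops with the construct so that depend(sink)/depend(source) can refer to both induction variables, and no collapse clause is present.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of the commit): fills a table in which each
// interior cell depends on its upper and left neighbours.
std::vector<std::vector<int>> wavefront(int n, int m) {
  std::vector<std::vector<int>> a(n, std::vector<int>(m, 1));
  // ordered(2) associates both loops with the doacross dependences.
  // There is no collapse clause on this directive.
#pragma omp parallel for ordered(2)
  for (int i = 1; i < n; ++i) {
    for (int j = 1; j < m; ++j) {
      // Wait until iterations (i-1, j) and (i, j-1) have signalled completion.
#pragma omp ordered depend(sink : i - 1, j) depend(sink : i, j - 1)
      a[i][j] = a[i - 1][j] + a[i][j - 1];
      // Signal that iteration (i, j) is finished.
#pragma omp ordered depend(source)
    }
  }
  return a;
}
```

Built without OpenMP support the pragmas are ignored and the loops run serially, which yields the same table.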

Modified:
cfe/trunk/include/clang/AST/OpenMPClause.h
cfe/trunk/lib/AST/OpenMPClause.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
cfe/trunk/lib/Sema/SemaOpenMP.cpp
cfe/trunk/lib/Serialization/ASTReaderStmt.cpp
cfe/trunk/lib/Serialization/ASTWriterStmt.cpp
cfe/trunk/test/OpenMP/ordered_doacross_codegen.c
cfe/trunk/test/OpenMP/ordered_doacross_codegen.cpp
cfe/trunk/test/OpenMP/parallel_for_simd_ast_print.cpp

Modified: cfe/trunk/include/clang/AST/OpenMPClause.h
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/AST/OpenMPClause.h?rev=339603&r1=339602&r2=339603&view=diff
==
--- cfe/trunk/include/clang/AST/OpenMPClause.h (original)
+++ cfe/trunk/include/clang/AST/OpenMPClause.h Mon Aug 13 12:04:24 2018
@@ -930,8 +930,11 @@ public:
 /// \endcode
 /// In this example directive '#pragma omp for' has 'ordered' clause with
 /// parameter 2.
-class OMPOrderedClause : public OMPClause {
+class OMPOrderedClause final
+: public OMPClause,
+  private llvm::TrailingObjects {
   friend class OMPClauseReader;
+  friend TrailingObjects;

   /// Location of '('.
   SourceLocation LParenLoc;
@@ -939,6 +942,26 @@ class OMPOrderedClause : public OMPClaus
   /// Number of for-loops.
   Stmt *NumForLoops = nullptr;

+  /// Real number of loops.
+  unsigned NumberOfLoops = 0;
+
+  /// Build 'ordered' clause.
+  ///
+  /// \param Num Expression, possibly associated with this clause.
+  /// \param NumLoops Number of loops, associated with this clause.
+  /// \param StartLoc Starting location of the clause.
+  /// \param LParenLoc Location of '('.
+  /// \param EndLoc Ending location of the clause.
+  OMPOrderedClause(Expr *Num, unsigned NumLoops, SourceLocation StartLoc,
+   SourceLocation LParenLoc, SourceLocation EndLoc)
+  : OMPClause(OMPC_ordered, StartLoc, EndLoc), LParenLoc(LParenLoc),
+NumForLoops(Num), NumberOfLoops(NumLoops) {}
+
+  /// Build an empty clause.
+  explicit OMPOrderedClause(unsigned NumLoops)
+  : OMPClause(OMPC_ordered, SourceLocation(), SourceLocation()),
+NumberOfLoops(NumLoops) {}
+
   /// Set the number of associated for-loops.
   void setNumForLoops(Expr *Num) { NumForLoops = Num; }

@@ -946,17 +969,17 @@ public:
   /// Build 'ordered' clause.
   ///
   /// \param Num Expression, possibly associated with this clause.
+  /// \param NumLoops Number of loops, associated with this clause.
   /// \param StartLoc Starting location of the clause.
   /// \param LParenLoc Location of '('.
   /// \param EndLoc Ending location of the clause.
-  OMPOrderedClause(Expr *Num, SourceLocation StartLoc,
-SourceLocation LParenLoc, SourceLocation EndLoc)
-  : OMPClause(OMPC_ordered, StartLoc, EndLoc), LParenLoc(LParenLoc),
-NumForLoops(Num) {}
+  static OMPOrderedClause *Create(const ASTContext &C, Expr *Num,
+  unsigned NumLoops, SourceLocation StartLoc,
+  SourceLocation LParenLoc,
+  SourceLocation EndLoc);

   /// Build an empty clause.
-  explicit OMPOrderedClause()
-  : OMPClause(OMPC_ordered, SourceLocation(), SourceLocation()) {}
+  static OMPOrderedClause* CreateEmpty(const ASTContext &C, unsigned NumLoops);

   /// Sets the location of '('.
   void setLParenLoc(SourceLocation Loc) { LParenLoc = Loc; }
@@ -967,6 +990,17 @@ public:
   /// Return the number of associated for-loops.
   Expr *getNumForLoops() const { return cast_or_null(NumForLoops); }

+  /// Set number of iterations for the specified loop.
+  void setLoopNumIterations(unsigned NumLoop, Expr *NumIterations);
+  /// Get number of iterations for all the loops.
+  ArrayRef getLoopNumIterations() const;
+
+  /// Set loop counter for the specified loop.
+  void setLoopCounter(unsigned NumLoop, Expr *Counter);
+  /// Get loops counter for the specified loop.
+  Expr *getLoopCunter(unsigned NumLoop);
+  const Expr *getLoopCunter(unsigned NumLoop) const;
+
   child_range children() { return child_range(&NumForLoops, &NumForLoops + 1); }

   static bool classof(const OMPClause *T) {
@@ -3095,24 +3129,32 @@ class OMPDependClause final
   /// Colon location.
   SourceLocation ColonLoc;

+  /// Number of loops, associated with the depend clause.
+  unsigned NumLoops = 0;
+
   /// Build clause with number of v

Re: r339603 - [OPENMP] Fix emission of the loop doacross constructs.

2018-08-16 Thread Jonas Hahnfeld via cfe-commits

Thanks Hans!

On 2018-08-16 11:35, Hans Wennborg wrote:

I've gone ahead and merged it in r339851.

On Wed, Aug 15, 2018 at 3:23 PM, Alexey Bataev wrote:

I think it would be good to backport it. Could you do that, Jonas?

-
Best regards,
Alexey Bataev

15.08.2018 5:02, Jonas Hahnfeld via cfe-commits wrote:

Alexey, Hans,

does it make sense to backport for 7.0 as it fixes PR37580?

Thanks,
Jonas


r333283 - [Sema] Add tests for weak functions

2018-05-25 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Fri May 25 08:56:12 2018
New Revision: 333283

URL: http://llvm.org/viewvc/llvm-project?rev=333283&view=rev
Log:
[Sema] Add tests for weak functions

I found these checks to be missing, just add some simple cases.

Differential Revision: https://reviews.llvm.org/D47200

Modified:
cfe/trunk/test/Sema/attr-weak.c

Modified: cfe/trunk/test/Sema/attr-weak.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Sema/attr-weak.c?rev=333283&r1=333282&r2=333283&view=diff
==
--- cfe/trunk/test/Sema/attr-weak.c (original)
+++ cfe/trunk/test/Sema/attr-weak.c Fri May 25 08:56:12 2018
@@ -1,7 +1,9 @@
 // RUN: %clang_cc1 -verify -fsyntax-only %s
 
+extern int f0() __attribute__((weak));
 extern int g0 __attribute__((weak));
 extern int g1 __attribute__((weak_import));
+int f2() __attribute__((weak));
 int g2 __attribute__((weak));
 int g3 __attribute__((weak_import)); // expected-warning {{'weak_import' attribute cannot be specified on a definition}}
 int __attribute__((weak_import)) g4(void);
@@ -11,6 +13,7 @@ void __attribute__((weak_import)) g5(voi
 struct __attribute__((weak)) s0 {}; // expected-warning {{'weak' attribute only applies to variables, functions, and classes}}
 struct __attribute__((weak_import)) s1 {}; // expected-warning {{'weak_import' attribute only applies to variables and functions}}
 
+static int f() __attribute__((weak)); // expected-error {{weak declaration cannot have internal linkage}}
 static int x __attribute__((weak)); // expected-error {{weak declaration cannot have internal linkage}}
 
 // rdar://9538608


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r314329 - [OpenMP] Fix passing of -m arguments to device toolchain

2017-09-27 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Wed Sep 27 11:12:34 2017
New Revision: 314329

URL: http://llvm.org/viewvc/llvm-project?rev=314329&view=rev
Log:
[OpenMP] Fix passing of -m arguments to device toolchain

AuxTriple is not set if host and device share a toolchain. Also,
removing an argument modifies the DAL which needs to be returned
for future use.
(Move tests back to offload-openmp.c as they are not related to GPUs.)

Differential Revision: https://reviews.llvm.org/D38258

Modified:
cfe/trunk/lib/Driver/ToolChain.cpp
cfe/trunk/test/Driver/openmp-offload-gpu.c
cfe/trunk/test/Driver/openmp-offload.c

Modified: cfe/trunk/lib/Driver/ToolChain.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChain.cpp?rev=314329&r1=314328&r2=314329&view=diff
==
--- cfe/trunk/lib/Driver/ToolChain.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChain.cpp Wed Sep 27 11:12:34 2017
@@ -807,16 +807,19 @@ llvm::opt::DerivedArgList *ToolChain::Tr
   if (DeviceOffloadKind == Action::OFK_OpenMP) {
 DerivedArgList *DAL = new DerivedArgList(Args.getBaseArgs());
 const OptTable &Opts = getDriver().getOpts();
-bool NewArgAdded = false;
+bool Modified = false;
 
 // Handle -Xopenmp-target flags
 for (Arg *A : Args) {
   // Exclude flags which may only apply to the host toolchain.
-  // Do not exclude flags when the host triple (AuxTriple),
-  // matches the current toolchain triple.
+  // Do not exclude flags when the host triple (AuxTriple)
+  // matches the current toolchain triple. If it is not present
+  // at all, target and host share a toolchain.
   if (A->getOption().matches(options::OPT_m_Group)) {
-if (getAuxTriple() && getAuxTriple()->str() == getTriple().str())
+if (!getAuxTriple() || getAuxTriple()->str() == getTriple().str())
   DAL->append(A);
+else
+  Modified = true;
 continue;
   }
 
@@ -857,10 +860,10 @@ llvm::opt::DerivedArgList *ToolChain::Tr
   A = XOpenMPTargetArg.release();
   AllocatedArgs.push_back(A);
   DAL->append(A);
-  NewArgAdded = true;
+  Modified = true;
 }
 
-if (NewArgAdded) {
+if (Modified) {
   return DAL;
 } else {
   delete DAL;

Modified: cfe/trunk/test/Driver/openmp-offload-gpu.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload-gpu.c?rev=314329&r1=314328&r2=314329&view=diff
==
--- cfe/trunk/test/Driver/openmp-offload-gpu.c (original)
+++ cfe/trunk/test/Driver/openmp-offload-gpu.c Wed Sep 27 11:12:34 2017
@@ -9,38 +9,6 @@
 
 /// ###
 
-/// Check -Xopenmp-target=powerpc64le-ibm-linux-gnu -march=pwr7 is passed when compiling for the device.
-// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -mcpu=pwr7 %s 2>&1 \
-// RUN:   | FileCheck -check-prefix=CHK-FOPENMP-EQ-TARGET %s
-
-// CHK-FOPENMP-EQ-TARGET: clang{{.*}} "-target-cpu" "pwr7"
-
-/// ###
-
-/// Check -Xopenmp-target -march=pwr7 is passed when compiling for the device.
-// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target -mcpu=pwr7 %s 2>&1 \
-// RUN:   | FileCheck -check-prefix=CHK-FOPENMP-TARGET %s
-
-// CHK-FOPENMP-TARGET: clang{{.*}} "-target-cpu" "pwr7"
-
-/// ###
-
-/// Check -Xopenmp-target triggers error when multiple triples are used.
-// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu,powerpc64le-unknown-linux-gnu -Xopenmp-target -mcpu=pwr8 %s 2>&1 \
-// RUN:   | FileCheck -check-prefix=CHK-FOPENMP-TARGET-AMBIGUOUS-ERROR %s
-
-// CHK-FOPENMP-TARGET-AMBIGUOUS-ERROR: clang{{.*}} error: cannot deduce implicit triple value for -Xopenmp-target, specify triple using -Xopenmp-target=
-
-/// ###
-
-/// Check -Xopenmp-target triggers error when an option requiring arguments is passed to it.
-// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target -Xopenmp-target -mcpu=pwr8 %s 2>&1 \
-// RUN:   | FileCheck -check-prefix=CHK-FOPENMP-TARGET-NESTED-ERROR %s
-
-// CHK-FOPENMP-TARGET-NESTED-ERROR: clang{{.*}} error: invalid -Xopenmp-target argument: '-Xopenmp-target -Xopenmp-target', options requiring arguments are unsupported
-
-/// ###
-
 /// Check -Xopenmp-target uses one of the archs provided when several archs are used.
 // RUN:   %clang -### -no-canonical-prefixe

r314328 - [OpenMP] Fix memory leak when translating arguments

2017-09-27 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Wed Sep 27 11:12:31 2017
New Revision: 314328

URL: http://llvm.org/viewvc/llvm-project?rev=314328&view=rev
Log:
[OpenMP] Fix memory leak when translating arguments

Parsing the argument after -Xopenmp-target allocates memory that needs
to be freed. Associate it with the final DerivedArgList after we know
which one will be used.

Differential Revision: https://reviews.llvm.org/D38257
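
The ownership handoff described in the log can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the driver code: Arg, ArgList, translate and finalArgs are hypothetical stand-ins for llvm::opt::Arg, DerivedArgList, TranslateOpenMPTargetArgs and the caller in Compilation.cpp. The translation step only reports what it allocated; the caller attaches those allocations to whichever list ends up being final.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for llvm::opt::Arg.
struct Arg { int Id; };

// Hypothetical stand-in for DerivedArgList.
class ArgList {
  std::vector<Arg *> Args;                       // non-owning view
  std::vector<std::unique_ptr<Arg>> Synthesized; // owned, freed with the list
public:
  void append(Arg *A) { Args.push_back(A); }
  void addSynthesizedArg(Arg *A) { Synthesized.emplace_back(A); }
  std::size_t size() const { return Args.size(); }
  std::size_t ownedCount() const { return Synthesized.size(); }
};

// Returns a new list, or nullptr if nothing changed. Every Arg allocated here
// is reported through Allocated so the caller can assign an owner later.
std::unique_ptr<ArgList> translate(const std::vector<int> &Extra,
                                   std::vector<Arg *> &Allocated) {
  if (Extra.empty())
    return nullptr;
  auto DAL = std::make_unique<ArgList>();
  for (int Id : Extra) {
    Arg *A = new Arg{Id};
    Allocated.push_back(A);
    DAL->append(A);
  }
  return DAL;
}

// The caller decides which list is final, then transfers ownership to it.
std::unique_ptr<ArgList> finalArgs(const std::vector<int> &Extra) {
  std::vector<Arg *> Allocated;
  std::unique_ptr<ArgList> Entry = translate(Extra, Allocated);
  if (!Entry)
    Entry = std::make_unique<ArgList>();
  for (Arg *A : Allocated)
    Entry->addSynthesizedArg(A);
  return Entry;
}
```

Deferring the ownership decision to the caller matters when an intermediate list may be discarded while a later list still refers to the same arguments.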

Modified:
cfe/trunk/include/clang/Driver/ToolChain.h
cfe/trunk/lib/Driver/Compilation.cpp
cfe/trunk/lib/Driver/ToolChain.cpp
cfe/trunk/test/Driver/openmp-offload-gpu.c

Modified: cfe/trunk/include/clang/Driver/ToolChain.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/ToolChain.h?rev=314328&r1=314327&r2=314328&view=diff
==
--- cfe/trunk/include/clang/Driver/ToolChain.h (original)
+++ cfe/trunk/include/clang/Driver/ToolChain.h Wed Sep 27 11:12:31 2017
@@ -249,9 +249,10 @@ public:
   ///
   /// \param DeviceOffloadKind - The device offload kind used for the
   /// translation.
-  virtual llvm::opt::DerivedArgList *
-  TranslateOpenMPTargetArgs(const llvm::opt::DerivedArgList &Args,
-  Action::OffloadKind DeviceOffloadKind) const;
+  virtual llvm::opt::DerivedArgList *TranslateOpenMPTargetArgs(
+  const llvm::opt::DerivedArgList &Args,
+  Action::OffloadKind DeviceOffloadKind,
+  SmallVector &AllocatedArgs) const;
 
   /// Choose a tool to use to handle the action \p JA.
   ///

Modified: cfe/trunk/lib/Driver/Compilation.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Compilation.cpp?rev=314328&r1=314327&r2=314328&view=diff
==
--- cfe/trunk/lib/Driver/Compilation.cpp (original)
+++ cfe/trunk/lib/Driver/Compilation.cpp Wed Sep 27 11:12:31 2017
@@ -51,9 +51,10 @@ Compilation::getArgsForToolChain(const T
 
   DerivedArgList *&Entry = TCArgs[{TC, BoundArch, DeviceOffloadKind}];
   if (!Entry) {
+SmallVector AllocatedArgs;
 // Translate OpenMP toolchain arguments provided via the -Xopenmp-target flags.
-DerivedArgList *OpenMPArgs = TC->TranslateOpenMPTargetArgs(*TranslatedArgs,
-DeviceOffloadKind);
+DerivedArgList *OpenMPArgs = TC->TranslateOpenMPTargetArgs(
+*TranslatedArgs, DeviceOffloadKind, AllocatedArgs);
 if (!OpenMPArgs) {
   Entry = TC->TranslateArgs(*TranslatedArgs, BoundArch, DeviceOffloadKind);
 } else {
@@ -63,6 +64,11 @@ Compilation::getArgsForToolChain(const T
 
 if (!Entry)
   Entry = TranslatedArgs;
+
+// Add allocated arguments to the final DAL.
+for (auto ArgPtr : AllocatedArgs) {
+  Entry->AddSynthesizedArg(ArgPtr);
+}
   }
 
   return *Entry;

Modified: cfe/trunk/lib/Driver/ToolChain.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChain.cpp?rev=314328&r1=314327&r2=314328&view=diff
==
--- cfe/trunk/lib/Driver/ToolChain.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChain.cpp Wed Sep 27 11:12:31 2017
@@ -800,9 +800,10 @@ ToolChain::computeMSVCVersion(const Driv
   return VersionTuple();
 }
 
-llvm::opt::DerivedArgList *
-ToolChain::TranslateOpenMPTargetArgs(const llvm::opt::DerivedArgList &Args,
-Action::OffloadKind DeviceOffloadKind) const {
+llvm::opt::DerivedArgList *ToolChain::TranslateOpenMPTargetArgs(
+const llvm::opt::DerivedArgList &Args,
+Action::OffloadKind DeviceOffloadKind,
+SmallVector &AllocatedArgs) const {
   if (DeviceOffloadKind == Action::OFK_OpenMP) {
 DerivedArgList *DAL = new DerivedArgList(Args.getBaseArgs());
 const OptTable &Opts = getDriver().getOpts();
@@ -854,6 +855,7 @@ ToolChain::TranslateOpenMPTargetArgs(con
   }
   XOpenMPTargetArg->setBaseArg(A);
   A = XOpenMPTargetArg.release();
+  AllocatedArgs.push_back(A);
   DAL->append(A);
   NewArgAdded = true;
 }

Modified: cfe/trunk/test/Driver/openmp-offload-gpu.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload-gpu.c?rev=314328&r1=314327&r2=314328&view=diff
==
--- cfe/trunk/test/Driver/openmp-offload-gpu.c (original)
+++ cfe/trunk/test/Driver/openmp-offload-gpu.c Wed Sep 27 11:12:31 2017
@@ -2,9 +2,6 @@
 /// Perform several driver tests for OpenMP offloading
 ///
 
-// Until this test is stabilized on all local configurations.
-// UNSUPPORTED: linux
-
 // REQUIRES: clang-driver
 // REQUIRES: x86-registered-target
 // REQUIRES: powerpc-registered-target




r314330 - [OpenMP] Fix translation of target args

2017-09-27 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Wed Sep 27 11:12:36 2017
New Revision: 314330

URL: http://llvm.org/viewvc/llvm-project?rev=314330&view=rev
Log:
[OpenMP] Fix translation of target args

ToolChain::TranslateArgs() returns nullptr if no changes are performed.
This would currently mean that OpenMPArgs are lost. Patch fixes this
by falling back to simply using OpenMPArgs in that case.

Differential Revision: https://reviews.llvm.org/D38259
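
The fallback described in the log can be sketched with plain standard-library types. This is an illustration under assumptions, not the driver code: ArgList, translateArgs and pickFinal are hypothetical names, and the "-drop-me" flag exists only to give the translation step something to change.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using ArgList = std::vector<std::string>;

// Hypothetical translation step. It returns nullptr when it makes no changes,
// the convention the log describes for ToolChain::TranslateArgs().
std::unique_ptr<ArgList> translateArgs(const ArgList &In) {
  bool Changed = false;
  auto Out = std::make_unique<ArgList>();
  for (const std::string &A : In) {
    if (A == "-drop-me") { // illustrative flag only
      Changed = true;
      continue;
    }
    Out->push_back(A);
  }
  if (!Changed)
    return nullptr;
  return Out;
}

// Picks the final list: the translated one if there is one, otherwise the
// OpenMP list itself, so its contents are not lost.
std::unique_ptr<ArgList> pickFinal(std::unique_ptr<ArgList> OpenMPArgs) {
  std::unique_ptr<ArgList> Entry = translateArgs(*OpenMPArgs);
  if (!Entry)
    Entry = std::move(OpenMPArgs); // fall back to the untranslated list
  return Entry; // if a translated list exists, OpenMPArgs is freed here
}
```

The input list is released only when a translated replacement exists, which corresponds to the else branch around delete OpenMPArgs in the diff.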

Modified:
cfe/trunk/lib/Driver/Compilation.cpp

Modified: cfe/trunk/lib/Driver/Compilation.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Compilation.cpp?rev=314330&r1=314329&r2=314330&view=diff
==
--- cfe/trunk/lib/Driver/Compilation.cpp (original)
+++ cfe/trunk/lib/Driver/Compilation.cpp Wed Sep 27 11:12:36 2017
@@ -57,14 +57,16 @@ Compilation::getArgsForToolChain(const T
 *TranslatedArgs, DeviceOffloadKind, AllocatedArgs);
 if (!OpenMPArgs) {
   Entry = TC->TranslateArgs(*TranslatedArgs, BoundArch, DeviceOffloadKind);
+  if (!Entry)
+Entry = TranslatedArgs;
 } else {
   Entry = TC->TranslateArgs(*OpenMPArgs, BoundArch, DeviceOffloadKind);
-  delete OpenMPArgs;
+  if (!Entry)
+Entry = OpenMPArgs;
+  else
+delete OpenMPArgs;
 }
 
-if (!Entry)
-  Entry = TranslatedArgs;
-
 // Add allocated arguments to the final DAL.
 for (auto ArgPtr : AllocatedArgs) {
   Entry->AddSynthesizedArg(ArgPtr);




r314691 - [CUDA] Fix name of __activemask()

2017-10-02 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Mon Oct  2 10:50:11 2017
New Revision: 314691

URL: http://llvm.org/viewvc/llvm-project?rev=314691&view=rev
Log:
[CUDA] Fix name of __activemask()

The name has two underscores in the official CUDA documentation:
http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#warp-vote-functions

Differential Revision: https://reviews.llvm.org/D38468

Modified:
cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h

Modified: cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h?rev=314691&r1=314690&r2=314691&view=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_intrinsics.h Mon Oct  2 10:50:11 2017
@@ -186,7 +186,7 @@ inline __device__ unsigned int __ballot_
   return __nvvm_vote_ballot_sync(mask, pred);
 }
 
-inline __device__ unsigned int activemask() { return __nvvm_vote_ballot(1); }
+inline __device__ unsigned int __activemask() { return __nvvm_vote_ballot(1); }
 
 #endif // !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 300
 




r314902 - [OpenMP] Fix passing of -m arguments correctly

2017-10-04 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Wed Oct  4 06:32:59 2017
New Revision: 314902

URL: http://llvm.org/viewvc/llvm-project?rev=314902&view=rev
Log:
[OpenMP] Fix passing of -m arguments correctly

The recent fix in D38258 was wrong: getAuxTriple() only returns
non-null values for the CUDA toolchain. That is why the now added
test for PPC and X86 failed.

Differential Revision: https://reviews.llvm.org/D38372
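
A rough sketch of the idea, assuming the rule is "drop -m options for the device unless it has the same triple as the host": the names isMachineFlag and deviceArgs are hypothetical, and the prefix test is a simplification of the driver's option-group matching. The triple comparison is done by the caller from the two triples rather than through getAuxTriple().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for matching the -m option group: any flag that
// starts with "-m".
bool isMachineFlag(const std::string &Flag) { return Flag.rfind("-m", 0) == 0; }

// Builds the device argument list from the host arguments. Machine flags are
// kept only when host and device use the same triple.
std::vector<std::string> deviceArgs(const std::vector<std::string> &HostArgs,
                                    const std::string &HostTriple,
                                    const std::string &DeviceTriple) {
  const bool SameTripleAsHost = (HostTriple == DeviceTriple);
  std::vector<std::string> Out;
  for (const std::string &A : HostArgs) {
    if (isMachineFlag(A) && !SameTripleAsHost)
      continue; // host-only flag, do not forward to the device
    Out.push_back(A);
  }
  return Out;
}
```

With a powerpc64le host and an nvptx64 device, -march=pwr7 would be filtered out; with identical triples it is passed through unchanged.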

Modified:
cfe/trunk/include/clang/Driver/ToolChain.h
cfe/trunk/lib/Driver/Compilation.cpp
cfe/trunk/lib/Driver/ToolChain.cpp
cfe/trunk/test/Driver/openmp-offload.c

Modified: cfe/trunk/include/clang/Driver/ToolChain.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/ToolChain.h?rev=314902&r1=314901&r2=314902&view=diff
==
--- cfe/trunk/include/clang/Driver/ToolChain.h (original)
+++ cfe/trunk/include/clang/Driver/ToolChain.h Wed Oct  4 06:32:59 2017
@@ -245,14 +245,9 @@ public:
   /// TranslateOpenMPTargetArgs - Create a new derived argument list for
   /// that contains the OpenMP target specific flags passed via
   /// -Xopenmp-target -opt=val OR -Xopenmp-target= -opt=val
-  /// Translation occurs only when the \p DeviceOffloadKind is specified.
-  ///
-  /// \param DeviceOffloadKind - The device offload kind used for the
-  /// translation.
   virtual llvm::opt::DerivedArgList *TranslateOpenMPTargetArgs(
-  const llvm::opt::DerivedArgList &Args,
-  Action::OffloadKind DeviceOffloadKind,
-  SmallVector &AllocatedArgs) const;
+  const llvm::opt::DerivedArgList &Args, bool SameTripleAsHost,
+  SmallVectorImpl &AllocatedArgs) const;
 
   /// Choose a tool to use to handle the action \p JA.
   ///

Modified: cfe/trunk/lib/Driver/Compilation.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Compilation.cpp?rev=314902&r1=314901&r2=314902&view=diff
==
--- cfe/trunk/lib/Driver/Compilation.cpp (original)
+++ cfe/trunk/lib/Driver/Compilation.cpp Wed Oct  4 06:32:59 2017
@@ -52,9 +52,15 @@ Compilation::getArgsForToolChain(const T
   DerivedArgList *&Entry = TCArgs[{TC, BoundArch, DeviceOffloadKind}];
   if (!Entry) {
 SmallVector AllocatedArgs;
+DerivedArgList *OpenMPArgs = nullptr;
 // Translate OpenMP toolchain arguments provided via the -Xopenmp-target flags.
-DerivedArgList *OpenMPArgs = TC->TranslateOpenMPTargetArgs(
-*TranslatedArgs, DeviceOffloadKind, AllocatedArgs);
+if (DeviceOffloadKind == Action::OFK_OpenMP) {
+  const ToolChain *HostTC = getSingleOffloadToolChain();
+  bool SameTripleAsHost = (TC->getTriple() == HostTC->getTriple());
+  OpenMPArgs = TC->TranslateOpenMPTargetArgs(
+  *TranslatedArgs, SameTripleAsHost, AllocatedArgs);
+}
+
 if (!OpenMPArgs) {
   Entry = TC->TranslateArgs(*TranslatedArgs, BoundArch, DeviceOffloadKind);
   if (!Entry)

Modified: cfe/trunk/lib/Driver/ToolChain.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChain.cpp?rev=314902&r1=314901&r2=314902&view=diff
==
--- cfe/trunk/lib/Driver/ToolChain.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChain.cpp Wed Oct  4 06:32:59 2017
@@ -801,74 +801,68 @@ ToolChain::computeMSVCVersion(const Driv
 }
 
 llvm::opt::DerivedArgList *ToolChain::TranslateOpenMPTargetArgs(
-const llvm::opt::DerivedArgList &Args,
-Action::OffloadKind DeviceOffloadKind,
-SmallVector &AllocatedArgs) const {
-  if (DeviceOffloadKind == Action::OFK_OpenMP) {
-DerivedArgList *DAL = new DerivedArgList(Args.getBaseArgs());
-const OptTable &Opts = getDriver().getOpts();
-bool Modified = false;
-
-// Handle -Xopenmp-target flags
-for (Arg *A : Args) {
-  // Exclude flags which may only apply to the host toolchain.
-  // Do not exclude flags when the host triple (AuxTriple)
-  // matches the current toolchain triple. If it is not present
-  // at all, target and host share a toolchain.
-  if (A->getOption().matches(options::OPT_m_Group)) {
-if (!getAuxTriple() || getAuxTriple()->str() == getTriple().str())
-  DAL->append(A);
-else
-  Modified = true;
-continue;
-  }
-
-  unsigned Index;
-  unsigned Prev;
-  bool XOpenMPTargetNoTriple = A->getOption().matches(
-  options::OPT_Xopenmp_target);
-
-  if (A->getOption().matches(options::OPT_Xopenmp_target_EQ)) {
-// Passing device args: -Xopenmp-target= -opt=val.
-if (A->getValue(0) == getTripleString())
-  Index = Args.getBaseArgs().MakeIndex(A->getValue(1));
-else
-  continue;
-  } else if (XOpenMPTargetNoTriple) {
-// Passing device args: -Xopenmp-target -opt=val.
-Index = Args.getBaseArgs().MakeIndex(A->getValue(0));
-  } else {
+const llvm::opt::DerivedArgList 

r314904 - [test] Pass in fixed triple for openmp-offload.c

2017-10-04 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Wed Oct  4 06:54:09 2017
New Revision: 314904

URL: http://llvm.org/viewvc/llvm-project?rev=314904&view=rev
Log:
[test] Pass in fixed triple for openmp-offload.c

This should fix the test on other architectures.

Related to: https://reviews.llvm.org/D38372

Modified:
cfe/trunk/test/Driver/openmp-offload.c

Modified: cfe/trunk/test/Driver/openmp-offload.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload.c?rev=314904&r1=314903&r2=314904&view=diff
==
--- cfe/trunk/test/Driver/openmp-offload.c (original)
+++ cfe/trunk/test/Driver/openmp-offload.c Wed Oct  4 06:54:09 2017
@@ -64,7 +64,7 @@
 /// ##
 
 /// Check -march=pwr7 is NOT passed to nvptx64-nvidia-cuda.
-// RUN:%clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -march=pwr7 %s 2>&1 \
+// RUN:%clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -target powerpc64le-ibm-linux-gnu -march=pwr7 %s 2>&1 \
 // RUN:| FileCheck -check-prefix=CHK-FOPENMP-MARCH-TO-GPU %s
 
 // CHK-FOPENMP-MARCH-TO-GPU-NOT: clang{{.*}} "-target-cpu" "pwr7" {{.*}}"-fopenmp-is-device"
@@ -72,7 +72,7 @@
 /// ###
 
 /// Check -march=pwr7 is NOT passed to x86_64-unknown-linux-gnu.
-// RUN:%clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=x86_64-unknown-linux-gnu -march=pwr7 %s 2>&1 \
+// RUN:%clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=x86_64-unknown-linux-gnu -target powerpc64le-ibm-linux-gnu -march=pwr7 %s 2>&1 \
 // RUN:| FileCheck -check-prefix=CHK-FOPENMP-MARCH-TO-X86 %s
 
 // CHK-FOPENMP-MARCH-TO-X86-NOT: clang{{.*}} "-target-cpu" "pwr7" {{.*}}"-fopenmp-is-device"




r315902 - [CUDA] Require libdevice only if needed

2017-10-16 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Mon Oct 16 06:31:30 2017
New Revision: 315902

URL: http://llvm.org/viewvc/llvm-project?rev=315902&view=rev
Log:
[CUDA] Require libdevice only if needed

If the user passes -nocudalib, we can live without it being present.
Simplify the code by just checking whether LibDeviceMap is empty.

Differential Revision: https://reviews.llvm.org/D38901
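
The resulting validity rule can be sketched as a small predicate. This is an illustration under assumptions rather than the detector itself: Candidate and isValidInstallation are hypothetical names, and the two booleans stand in for the FS.exists() checks on the include and bin directories.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical summary of what was found under one candidate install path.
struct Candidate {
  bool HasInclude;
  bool HasBin;
  std::map<std::string, std::string> LibDeviceMap; // GPU arch -> libdevice file
};

// An installation needs include and bin directories. libdevice is required
// only when the user has not passed -nocudalib.
bool isValidInstallation(const Candidate &C, bool NoCudaLib) {
  if (!(C.HasInclude && C.HasBin))
    return false;
  if (C.LibDeviceMap.empty() && !NoCudaLib)
    return false;
  return true;
}
```

Checking whether the map is empty replaces the earlier loop over all keys, on the assumption that entries are only added for libdevice files that were actually found.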

Added:
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/bin/
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/bin/.keep
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/include/
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/include/.keep
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/lib/
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/lib/.keep
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/lib64/
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/lib64/.keep
Modified:
cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
cfe/trunk/test/Driver/cuda-detect.cu

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=315902&r1=315901&r2=315902&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Mon Oct 16 06:31:30 2017
@@ -87,8 +87,7 @@ CudaInstallationDetector::CudaInstallati
 LibDevicePath = InstallPath + "/nvvm/libdevice";
 
 auto &FS = D.getVFS();
-if (!(FS.exists(IncludePath) && FS.exists(BinPath) &&
-  FS.exists(LibDevicePath)))
+if (!(FS.exists(IncludePath) && FS.exists(BinPath)))
   continue;
 
 // On Linux, we have both lib and lib64 directories, and we need to choose
@@ -167,17 +166,9 @@ CudaInstallationDetector::CudaInstallati
   }
 }
 
-// This code prevents IsValid from being set when
-// no libdevice has been found.
-bool allEmpty = true;
-std::string LibDeviceFile;
-for (auto key : LibDeviceMap.keys()) {
-  LibDeviceFile = LibDeviceMap.lookup(key);
-  if (!LibDeviceFile.empty())
-allEmpty = false;
-}
-
-if (allEmpty)
+// Check that we have found at least one libdevice that we can link in if
+// -nocudalib hasn't been specified.
+if (LibDeviceMap.empty() && !Args.hasArg(options::OPT_nocudalib))
   continue;
 
 IsValid = true;

Added: cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/bin/.keep
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/bin/.keep?rev=315902&view=auto
==
(empty)

Added: 
cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/include/.keep
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/include/.keep?rev=315902&view=auto
==
(empty)

Added: cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/lib/.keep
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/lib/.keep?rev=315902&view=auto
==
(empty)

Added: cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/lib64/.keep
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/lib64/.keep?rev=315902&view=auto
==
(empty)

Modified: cfe/trunk/test/Driver/cuda-detect.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/cuda-detect.cu?rev=315902&r1=315901&r2=315902&view=diff
==
--- cfe/trunk/test/Driver/cuda-detect.cu (original)
+++ cfe/trunk/test/Driver/cuda-detect.cu Mon Oct 16 06:31:30 2017
@@ -2,7 +2,7 @@
 // REQUIRES: x86-registered-target
 // REQUIRES: nvptx-registered-target
 //
-// # Check that we properly detect CUDA installation.
+// Check that we properly detect CUDA installation.
 // RUN: %clang -v --target=i386-unknown-linux \
 // RUN:   --sysroot=%S/no-cuda-there 2>&1 | FileCheck %s -check-prefix NOCUDA
 // RUN: %clang -v --target=i386-apple-macosx \
@@ -18,6 +18,19 @@
 // RUN: %clang -v --target=i386-apple-macosx \
 // RUN:   --cuda-path=%S/Inputs/CUDA/usr/local/cuda 2>&1 | FileCheck %s
 
+// Check that we don't find a CUDA installation without libdevice ...
+// RUN: %clang -v --target=i386-unknown-linux \
+// RUN:   --sysroot=%S/Inputs/CUDA-nolibdev

r315996 - [CMake][OpenMP] Customize default offloading arch

2017-10-17 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Tue Oct 17 06:37:36 2017
New Revision: 315996

URL: http://llvm.org/viewvc/llvm-project?rev=315996&view=rev
Log:
[CMake][OpenMP] Customize default offloading arch

For the shuffle instructions in reductions we need at least sm_30
but the user may want to customize the default architecture.
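
The check that the CMake change below performs on the configured value can be sketched in C++ as follows. This is only an illustration of the intent (accept "sm_NN" with NN >= 30, otherwise fall back to sm_30); the helper name sanitizeDefaultArch is hypothetical and not part of Clang.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Clang): returns the architecture to use as
// the OpenMP NVPTX default, falling back to "sm_30" when the requested value
// is malformed or names a compute capability below 30.
std::string sanitizeDefaultArch(const std::string &Requested) {
  const std::string Fallback = "sm_30";
  const std::string Prefix = "sm_";
  // The value must look like "sm_<digits>", in the spirit of ^sm_([0-9]+)$.
  if (Requested.size() <= Prefix.size() ||
      Requested.compare(0, Prefix.size(), Prefix) != 0)
    return Fallback;
  unsigned Number = 0;
  for (std::size_t I = Prefix.size(); I < Requested.size(); ++I) {
    const unsigned char C = static_cast<unsigned char>(Requested[I]);
    if (!std::isdigit(C))
      return Fallback;
    // Stop accumulating once the threshold is reached; this avoids overflow
    // and does not change the outcome of the comparison below.
    if (Number < 30)
      Number = Number * 10 + static_cast<unsigned>(C - '0');
  }
  return Number < 30 ? Fallback : Requested;
}

// Usage example.
void checkDefaultArchExamples() {
  assert(sanitizeDefaultArch("sm_35") == "sm_35");
  assert(sanitizeDefaultArch("sm_20") == "sm_30"); // too old, reset
  assert(sanitizeDefaultArch("bogus") == "sm_30"); // malformed, reset
}
```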

Differential Revision: https://reviews.llvm.org/D38883

Modified:
cfe/trunk/CMakeLists.txt
cfe/trunk/include/clang/Config/config.h.cmake
cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
cfe/trunk/lib/Driver/ToolChains/Cuda.h

Modified: cfe/trunk/CMakeLists.txt
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/CMakeLists.txt?rev=315996&r1=315995&r2=315996&view=diff
==
--- cfe/trunk/CMakeLists.txt (original)
+++ cfe/trunk/CMakeLists.txt Tue Oct 17 06:37:36 2017
@@ -235,6 +235,17 @@ endif()
 set(CLANG_DEFAULT_OPENMP_RUNTIME "libomp" CACHE STRING
   "Default OpenMP runtime used by -fopenmp.")
 
+# OpenMP offloading requires at least sm_30 because we use shuffle instructions
+# to generate efficient code for reductions.
+set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
+  "Default architecture for OpenMP offloading to Nvidia GPUs.")
+string(REGEX MATCH "^sm_([0-9]+)$" MATCHED_ARCH "${CLANG_OPENMP_NVPTX_DEFAULT_ARCH}")
+if (NOT DEFINED MATCHED_ARCH OR "${CMAKE_MATCH_1}" LESS 30)
+  message(WARNING "Resetting default architecture for OpenMP offloading to Nvidia GPUs to sm_30")
+  set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
+"Default architecture for OpenMP offloading to Nvidia GPUs." FORCE)
+endif()
+
 set(CLANG_VENDOR ${PACKAGE_VENDOR} CACHE STRING
   "Vendor-specific text for showing with version information.")
 

Modified: cfe/trunk/include/clang/Config/config.h.cmake
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Config/config.h.cmake?rev=315996&r1=315995&r2=315996&view=diff
==
--- cfe/trunk/include/clang/Config/config.h.cmake (original)
+++ cfe/trunk/include/clang/Config/config.h.cmake Tue Oct 17 06:37:36 2017
@@ -20,6 +20,9 @@
 /* Default OpenMP runtime used by -fopenmp. */
 #define CLANG_DEFAULT_OPENMP_RUNTIME "${CLANG_DEFAULT_OPENMP_RUNTIME}"
 
+/* Default architecture for OpenMP offloading to Nvidia GPUs. */
+#define CLANG_OPENMP_NVPTX_DEFAULT_ARCH "${CLANG_OPENMP_NVPTX_DEFAULT_ARCH}"
+
 /* Multilib suffix for libdir. */
 #define CLANG_LIBDIR_SUFFIX "${CLANG_LIBDIR_SUFFIX}"
 

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=315996&r1=315995&r2=315996&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Tue Oct 17 06:37:36 2017
@@ -542,9 +542,9 @@ CudaToolChain::TranslateArgs(const llvm:
   // flags are not duplicated.
   // Also append the compute capability.
   if (DeviceOffloadKind == Action::OFK_OpenMP) {
-for (Arg *A : Args){
+for (Arg *A : Args) {
   bool IsDuplicate = false;
-  for (Arg *DALArg : *DAL){
+  for (Arg *DALArg : *DAL) {
 if (A == DALArg) {
   IsDuplicate = true;
   break;
@@ -555,14 +555,9 @@ CudaToolChain::TranslateArgs(const llvm:
 }
 
 StringRef Arch = DAL->getLastArgValue(options::OPT_march_EQ);
-if (Arch.empty()) {
-  // Default compute capability for CUDA toolchain is the
-  // lowest compute capability supported by the installed
-  // CUDA version.
-  DAL->AddJoinedArg(nullptr,
-  Opts.getOption(options::OPT_march_EQ),
-  CudaInstallation.getLowestExistingArch());
-}
+if (Arch.empty())
+  DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
+CLANG_OPENMP_NVPTX_DEFAULT_ARCH);
 
 return DAL;
   }

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.h?rev=315996&r1=315995&r2=315996&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.h (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.h Tue Oct 17 06:37:36 2017
@@ -76,17 +76,6 @@ public:
   std::string getLibDeviceFile(StringRef Gpu) const {
 return LibDeviceMap.lookup(Gpu);
   }
-  /// \brief Get lowest available compute capability
-  /// for which a libdevice library exists.
-  std::string getLowestExistingArch() const {
-std::string LibDeviceFile;
-for (auto key : LibDeviceMap.keys()) {
-  LibDeviceFile = LibDeviceMap.lookup(key);
-  if (!LibDeviceFile.empty())
-return key;
-}
-return "sm_20";
-  }
 };
 
 namespace tools {


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinf

r316001 - [OpenMP] Implement omp_is_initial_device() as builtin

2017-10-17 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Tue Oct 17 07:28:14 2017
New Revision: 316001

URL: http://llvm.org/viewvc/llvm-project?rev=316001&view=rev
Log:
[OpenMP] Implement omp_is_initial_device() as builtin

This allows returning the static value that we know at compile time.
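
As a rough sketch of what "static value" means here: the result depends only on whether the current compilation is a device compilation. The struct LangOptionsSketch below is a hypothetical stand-in for the relevant bit of clang::LangOptions; the conditional mirrors the expression added to ExprConstant.cpp in this commit.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the relevant field of
// clang::LangOptions.
struct LangOptionsSketch {
  bool OpenMPIsDevice;
};

// Mirrors the folded value: 1 when compiling for the host (initial device),
// 0 when compiling device code.
constexpr int foldOmpIsInitialDevice(const LangOptionsSketch &Opts) {
  return Opts.OpenMPIsDevice ? 0 : 1;
}

// Because the function is constexpr, the value is available at compile time.
static_assert(foldOmpIsInitialDevice(LangOptionsSketch{false}) == 1,
              "host compilation folds to 1");
static_assert(foldOmpIsInitialDevice(LangOptionsSketch{true}) == 0,
              "device compilation folds to 0");

// Usage example with run-time checks.
void checkIsInitialDeviceFolding() {
  const LangOptionsSketch Host = {false};
  const LangOptionsSketch Device = {true};
  assert(foldOmpIsInitialDevice(Host) == 1);
  assert(foldOmpIsInitialDevice(Device) == 0);
}
```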

Differential Revision: https://reviews.llvm.org/D38968

Added:
cfe/trunk/test/OpenMP/is_initial_device.c
Modified:
cfe/trunk/include/clang/Basic/Builtins.def
cfe/trunk/include/clang/Basic/Builtins.h
cfe/trunk/lib/AST/ExprConstant.cpp
cfe/trunk/lib/Basic/Builtins.cpp

Modified: cfe/trunk/include/clang/Basic/Builtins.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.def?rev=316001&r1=316000&r2=316001&view=diff
==
--- cfe/trunk/include/clang/Basic/Builtins.def (original)
+++ cfe/trunk/include/clang/Basic/Builtins.def Tue Oct 17 07:28:14 2017
@@ -1434,6 +1434,9 @@ LANGBUILTIN(__builtin_load_halff, "fhC*"
 BUILTIN(__builtin_os_log_format_buffer_size, "zcC*.", "p:0:nut")
 BUILTIN(__builtin_os_log_format, "v*v*cC*.", "p:0:nt")
 
+// OpenMP 4.0
+LANGBUILTIN(omp_is_initial_device, "i", "nc", OMP_LANG)
+
 // Builtins for XRay
 BUILTIN(__xray_customevent, "vcC*z", "")
 

Modified: cfe/trunk/include/clang/Basic/Builtins.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Builtins.h?rev=316001&r1=316000&r2=316001&view=diff
==
--- cfe/trunk/include/clang/Basic/Builtins.h (original)
+++ cfe/trunk/include/clang/Basic/Builtins.h Tue Oct 17 07:28:14 2017
@@ -38,6 +38,7 @@ enum LanguageID {
   MS_LANG = 0x10, // builtin requires MS mode.
   OCLC20_LANG = 0x20, // builtin for OpenCL C 2.0 only.
   OCLC1X_LANG = 0x40, // builtin for OpenCL C 1.x only.
+  OMP_LANG = 0x80,// builtin requires OpenMP.
   ALL_LANGUAGES = C_LANG | CXX_LANG | OBJC_LANG, // builtin for all languages.
   ALL_GNU_LANGUAGES = ALL_LANGUAGES | GNU_LANG,  // builtin requires GNU mode.
   ALL_MS_LANGUAGES = ALL_LANGUAGES | MS_LANG,// builtin requires MS mode.

Modified: cfe/trunk/lib/AST/ExprConstant.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/AST/ExprConstant.cpp?rev=316001&r1=316000&r2=316001&view=diff
==
--- cfe/trunk/lib/AST/ExprConstant.cpp (original)
+++ cfe/trunk/lib/AST/ExprConstant.cpp Tue Oct 17 07:28:14 2017
@@ -7929,6 +7929,9 @@ bool IntExprEvaluator::VisitBuiltinCallE
 return BuiltinOp == Builtin::BI__atomic_always_lock_free ?
 Success(0, E) : Error(E);
   }
+  case Builtin::BIomp_is_initial_device:
+// We can decide statically which value the runtime would return if called.
+return Success(Info.getLangOpts().OpenMPIsDevice ? 0 : 1, E);
   }
 }
 

Modified: cfe/trunk/lib/Basic/Builtins.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Basic/Builtins.cpp?rev=316001&r1=316000&r2=316001&view=diff
==
--- cfe/trunk/lib/Basic/Builtins.cpp (original)
+++ cfe/trunk/lib/Basic/Builtins.cpp Tue Oct 17 07:28:14 2017
@@ -75,8 +75,9 @@ bool Builtin::Context::builtinIsSupporte
   (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES) == OCLC20_LANG;
   bool OclCUnsupported = !LangOpts.OpenCL &&
  (BuiltinInfo.Langs & ALL_OCLC_LANGUAGES);
+  bool OpenMPUnsupported = !LangOpts.OpenMP && BuiltinInfo.Langs == OMP_LANG;
   return !BuiltinsUnsupported && !MathBuiltinsUnsupported && !OclCUnsupported &&
- !OclC1Unsupported && !OclC2Unsupported &&
+ !OclC1Unsupported && !OclC2Unsupported && !OpenMPUnsupported &&
  !GnuModeUnsupported && !MSModeUnsupported && !ObjCUnsupported;
 }
 

Added: cfe/trunk/test/OpenMP/is_initial_device.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/is_initial_device.c?rev=316001&view=auto
==
--- cfe/trunk/test/OpenMP/is_initial_device.c (added)
+++ cfe/trunk/test/OpenMP/is_initial_device.c Tue Oct 17 07:28:14 2017
@@ -0,0 +1,36 @@
+// REQUIRES: powerpc-registered-target
+
+// RUN: %clang_cc1 -verify -fopenmp -x c -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-unknown-unknown \
+// RUN:   -emit-llvm-bc %s -o %t-ppc-host.bc
+// RUN: %clang_cc1 -verify -fopenmp -x ir -triple powerpc64le-unknown-unknown -emit-llvm \
+// RUN:   %t-ppc-host.bc -o - | FileCheck %s -check-prefixes HOST,OUTLINED
+// RUN: %clang_cc1 -verify -fopenmp -x c -triple powerpc64le-unknown-unknown -emit-llvm -fopenmp-is-device \
+// RUN:   %s -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s -check-prefixes DEVICE,OUTLINED
+
+// expected-no-diagnostics
+int check() {
+  int host = omp_is_initial_device();
+  int device;
+#pragma omp target map(tofrom: device)
+  {
+device 

r316229 - [OpenMP] Avoid VLAs for some reductions on array sections

2017-10-20 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Fri Oct 20 12:40:40 2017
New Revision: 316229

URL: http://llvm.org/viewvc/llvm-project?rev=316229&view=rev
Log:
[OpenMP] Avoid VLAs for some reductions on array sections

In some cases the compiler can deduce the length of an array section
as a constant. With this information, a VLA can be replaced by a
constant-sized array, or even by a scalar value if the length is 1.
Example:
int a[4], b[2];
#pragma omp parallel reduction(+: a[1:2], b[1:1])
{ }

For chained array sections, this optimization is restricted to cases
where all array sections except the last have a constant length 1.
This trivially guarantees that there are no holes in the memory region
that needs to be privatized.
Example:
int c[3][4];
#pragma omp parallel reduction(+: c[1:1][1:2])
{ }
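
The rule described above can be sketched as a small predicate. This is not the Sema code from the patch; SectionLength and both function names are hypothetical, and the sketch only models the lengths of the chained sections, outermost first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one array section's length.
struct SectionLength {
  bool IsConstant;          // false if the length is only known at run time
  unsigned long long Value; // meaningful only when IsConstant is true
};

// True if a constant-sized private copy can be used: every section except
// the last must have constant length 1, and the last must be constant.
bool canUseConstantSizedCopy(const std::vector<SectionLength> &Lengths) {
  if (Lengths.empty())
    return false;
  for (std::size_t I = 0; I + 1 < Lengths.size(); ++I)
    if (!Lengths[I].IsConstant || Lengths[I].Value != 1)
      return false;
  return Lengths.back().IsConstant;
}

// True if the private copy can even be a scalar (a single element).
bool isSingleElement(const std::vector<SectionLength> &Lengths) {
  return canUseConstantSizedCopy(Lengths) && Lengths.back().Value == 1;
}

// Usage example with the sections from the log message.
void checkArraySectionExamples() {
  const std::vector<SectionLength> A = {SectionLength{true, 2}}; // a[1:2]
  const std::vector<SectionLength> B = {SectionLength{true, 1}}; // b[1:1]
  const std::vector<SectionLength> C = {SectionLength{true, 1},
                                        SectionLength{true, 2}}; // c[1:1][1:2]
  assert(canUseConstantSizedCopy(A) && !isSingleElement(A));
  assert(isSingleElement(B));
  assert(canUseConstantSizedCopy(C));
}
```

A chain such as c[0:2][1:2] would be rejected by this predicate, because the selected elements are not contiguous in memory.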

Differential Revision: https://reviews.llvm.org/D39136

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
cfe/trunk/lib/Sema/SemaOpenMP.cpp
cfe/trunk/test/OpenMP/for_reduction_codegen.cpp
cfe/trunk/test/OpenMP/for_reduction_codegen_UDR.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=316229&r1=316228&r2=316229&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Fri Oct 20 12:40:40 2017
@@ -925,7 +925,7 @@ void ReductionCodeGen::emitAggregateType
   cast<VarDecl>(cast<DeclRefExpr>(ClausesData[N].Private)->getDecl());
   QualType PrivateType = PrivateVD->getType();
   bool AsArraySection = isa<OMPArraySectionExpr>(ClausesData[N].Ref);
-  if (!AsArraySection && !PrivateType->isVariablyModifiedType()) {
+  if (!PrivateType->isVariablyModifiedType()) {
 Sizes.emplace_back(
 CGF.getTypeSize(
 SharedAddresses[N].first.getType().getNonReferenceType()),
@@ -963,10 +963,9 @@ void ReductionCodeGen::emitAggregateType
   auto *PrivateVD =
   cast<VarDecl>(cast<DeclRefExpr>(ClausesData[N].Private)->getDecl());
   QualType PrivateType = PrivateVD->getType();
-  bool AsArraySection = isa<OMPArraySectionExpr>(ClausesData[N].Ref);
-  if (!AsArraySection && !PrivateType->isVariablyModifiedType()) {
+  if (!PrivateType->isVariablyModifiedType()) {
 assert(!Size && !Sizes[N].second &&
-   "Size should be nullptr for non-variably modified redution "
+   "Size should be nullptr for non-variably modified reduction "
"items.");
 return;
   }
@@ -994,8 +993,7 @@ void ReductionCodeGen::emitInitializatio
CGF.ConvertTypeForMem(SharedType)),
   SharedType, SharedAddresses[N].first.getBaseInfo(),
   CGF.CGM.getTBAAAccessInfo(SharedType));
-  if (isa<OMPArraySectionExpr>(ClausesData[N].Ref) ||
-  CGF.getContext().getAsArrayType(PrivateVD->getType())) {
+  if (CGF.getContext().getAsArrayType(PrivateVD->getType())) {
 emitAggregateInitialization(CGF, N, PrivateAddr, SharedLVal, DRD);
   } else if (DRD && (DRD->getInitializer() || !PrivateVD->hasInit())) {
 emitInitWithReductionInitializer(CGF, DRD, ClausesData[N].ReductionOp,

Modified: cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp?rev=316229&r1=316228&r2=316229&view=diff
==
--- cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp Fri Oct 20 12:40:40 2017
@@ -996,7 +996,9 @@ void CodeGenFunction::EmitOMPReductionCl
 
 auto *LHSVD = cast<VarDecl>(cast<DeclRefExpr>(*ILHS)->getDecl());
 auto *RHSVD = cast<VarDecl>(cast<DeclRefExpr>(*IRHS)->getDecl());
-if (isa<OMPArraySectionExpr>(IRef)) {
+QualType Type = PrivateVD->getType();
+bool isaOMPArraySectionExpr = isa<OMPArraySectionExpr>(IRef);
+if (isaOMPArraySectionExpr && Type->isVariablyModifiedType()) {
   // Store the address of the original variable associated with the LHS
   // implicit variable.
   PrivateScope.addPrivate(LHSVD, [&RedCG, Count]() -> Address {
@@ -1005,7 +1007,8 @@ void CodeGenFunction::EmitOMPReductionCl
   PrivateScope.addPrivate(RHSVD, [this, PrivateVD]() -> Address {
 return GetAddrOfLocalVar(PrivateVD);
   });
-} else if (isa<ArraySubscriptExpr>(IRef)) {
+} else if ((isaOMPArraySectionExpr && Type->isScalarType()) ||
+   isa<ArraySubscriptExpr>(IRef)) {
   // Store the address of the original variable associated with the LHS
   // implicit variable.
   PrivateScope.addPrivate(LHSVD, [&RedCG, Count]() -> Address {

Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOpenMP.cpp?rev=316229&r1=316228&r2=316229&view=diff
==
--- cfe/trunk/lib/Sema/SemaOpenMP.cpp (original)
+++ cfe/trunk/lib/Sema/SemaOpenMP.cpp Fri Oct 20 12:40:40 2017
@@ -9330,6 +9330,68 @@ struct ReductionData {
 };
 } // namespace
 
+static bool CheckOMPArraySectionConstantForReduction(
+ASTC

r316235 - Revert "[OpenMP] Avoid VLAs for some reductions on array sections"

2017-10-20 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Fri Oct 20 13:16:17 2017
New Revision: 316235

URL: http://llvm.org/viewvc/llvm-project?rev=316235&view=rev
Log:
Revert "[OpenMP] Avoid VLAs for some reductions on array sections"

This breaks at least two buildbots:
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/1175
http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/10478

This reverts commit r316229 during local investigation.

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
cfe/trunk/lib/Sema/SemaOpenMP.cpp
cfe/trunk/test/OpenMP/for_reduction_codegen.cpp
cfe/trunk/test/OpenMP/for_reduction_codegen_UDR.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=316235&r1=316234&r2=316235&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Fri Oct 20 13:16:17 2017
@@ -925,7 +925,7 @@ void ReductionCodeGen::emitAggregateType
   cast<VarDecl>(cast<DeclRefExpr>(ClausesData[N].Private)->getDecl());
   QualType PrivateType = PrivateVD->getType();
   bool AsArraySection = isa<OMPArraySectionExpr>(ClausesData[N].Ref);
-  if (!PrivateType->isVariablyModifiedType()) {
+  if (!AsArraySection && !PrivateType->isVariablyModifiedType()) {
 Sizes.emplace_back(
 CGF.getTypeSize(
 SharedAddresses[N].first.getType().getNonReferenceType()),
@@ -963,9 +963,10 @@ void ReductionCodeGen::emitAggregateType
   auto *PrivateVD =
   cast<VarDecl>(cast<DeclRefExpr>(ClausesData[N].Private)->getDecl());
   QualType PrivateType = PrivateVD->getType();
-  if (!PrivateType->isVariablyModifiedType()) {
+  bool AsArraySection = isa<OMPArraySectionExpr>(ClausesData[N].Ref);
+  if (!AsArraySection && !PrivateType->isVariablyModifiedType()) {
 assert(!Size && !Sizes[N].second &&
-   "Size should be nullptr for non-variably modified reduction "
+   "Size should be nullptr for non-variably modified redution "
"items.");
 return;
   }
@@ -993,7 +994,8 @@ void ReductionCodeGen::emitInitializatio
CGF.ConvertTypeForMem(SharedType)),
   SharedType, SharedAddresses[N].first.getBaseInfo(),
   CGF.CGM.getTBAAAccessInfo(SharedType));
-  if (CGF.getContext().getAsArrayType(PrivateVD->getType())) {
+  if (isa<OMPArraySectionExpr>(ClausesData[N].Ref) ||
+  CGF.getContext().getAsArrayType(PrivateVD->getType())) {
 emitAggregateInitialization(CGF, N, PrivateAddr, SharedLVal, DRD);
   } else if (DRD && (DRD->getInitializer() || !PrivateVD->hasInit())) {
 emitInitWithReductionInitializer(CGF, DRD, ClausesData[N].ReductionOp,

Modified: cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp?rev=316235&r1=316234&r2=316235&view=diff
==
--- cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp Fri Oct 20 13:16:17 2017
@@ -996,9 +996,7 @@ void CodeGenFunction::EmitOMPReductionCl
 
 auto *LHSVD = cast<VarDecl>(cast<DeclRefExpr>(*ILHS)->getDecl());
 auto *RHSVD = cast<VarDecl>(cast<DeclRefExpr>(*IRHS)->getDecl());
-QualType Type = PrivateVD->getType();
-bool isaOMPArraySectionExpr = isa<OMPArraySectionExpr>(IRef);
-if (isaOMPArraySectionExpr && Type->isVariablyModifiedType()) {
+if (isa<OMPArraySectionExpr>(IRef)) {
   // Store the address of the original variable associated with the LHS
   // implicit variable.
   PrivateScope.addPrivate(LHSVD, [&RedCG, Count]() -> Address {
@@ -1007,8 +1005,7 @@ void CodeGenFunction::EmitOMPReductionCl
   PrivateScope.addPrivate(RHSVD, [this, PrivateVD]() -> Address {
 return GetAddrOfLocalVar(PrivateVD);
   });
-} else if ((isaOMPArraySectionExpr && Type->isScalarType()) ||
-   isa<ArraySubscriptExpr>(IRef)) {
+} else if (isa<ArraySubscriptExpr>(IRef)) {
   // Store the address of the original variable associated with the LHS
   // implicit variable.
   PrivateScope.addPrivate(LHSVD, [&RedCG, Count]() -> Address {

Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOpenMP.cpp?rev=316235&r1=316234&r2=316235&view=diff
==
--- cfe/trunk/lib/Sema/SemaOpenMP.cpp (original)
+++ cfe/trunk/lib/Sema/SemaOpenMP.cpp Fri Oct 20 13:16:17 2017
@@ -9330,68 +9330,6 @@ struct ReductionData {
 };
 } // namespace
 
-static bool CheckOMPArraySectionConstantForReduction(
-ASTContext &Context, const OMPArraySectionExpr *OASE, bool &SingleElement,
-SmallVectorImpl<llvm::APSInt> &ArraySizes) {
-  const Expr *Length = OASE->getLength();
-  if (Length == nullptr) {
-// For array sections of the form [1:] or [:], we would need to analyze
-// the lower bound...
-if (OASE->getColonLoc().isValid())
-  return false;
-
-// This is an array subscript which has impli

r316362 - [OpenMP] Avoid VLAs for some reductions on array sections

2017-10-23 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Mon Oct 23 12:01:35 2017
New Revision: 316362

URL: http://llvm.org/viewvc/llvm-project?rev=316362&view=rev
Log:
[OpenMP] Avoid VLAs for some reductions on array sections

In some cases the compiler can deduce the length of an array section
as a constant. With this information, a VLA can be replaced by a
constant-sized array, or even by a scalar value if the length is 1.
Example:
int a[4], b[2];
#pragma omp parallel reduction(+: a[1:2], b[1:1])
{ }

For chained array sections, this optimization is restricted to cases
where all array sections except the last have a constant length 1.
This trivially guarantees that there are no holes in the memory region
that needs to be privatized.
Example:
int c[3][4];
#pragma omp parallel reduction(+: c[1:1][1:2])
{ }

This relands commit r316229 that I reverted in r316235 because it
failed on some bots. During investigation I found that this was because
Clang and GCC evaluate the two arguments to emplace_back() in
ReductionCodeGen::emitSharedLValue() in a different order, hence
leading to a different order of generated instructions in the final
LLVM IR. Fix this by passing in the arguments from temporary variables
that are evaluated in a defined order.
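
The underlying C++ rule is that the order in which function arguments are evaluated is unspecified, while separate statements are sequenced. A minimal sketch, using hypothetical stand-ins (emitFirst, emitSecond) for the two emit functions and a log in place of the generated IR:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for emitSharedLValue / emitSharedLValueUB: each call
// records itself in a log, much as the real functions append IR instructions.
int emitFirst(std::vector<std::string> &Log) {
  Log.push_back("first");
  return 1;
}
int emitSecond(std::vector<std::string> &Log) {
  Log.push_back("second");
  return 2;
}

// The order of the two calls is unspecified here, so the log order may differ
// between compilers even though the stored pair is always (1, 2).
void emitUnordered(std::vector<std::pair<int, int>> &Out,
                   std::vector<std::string> &Log) {
  Out.emplace_back(emitFirst(Log), emitSecond(Log));
}

// Separate statements are sequenced, so the log order is always
// "first", "second".
void emitOrdered(std::vector<std::pair<int, int>> &Out,
                 std::vector<std::string> &Log) {
  int First = emitFirst(Log);
  int Second = emitSecond(Log);
  Out.emplace_back(First, Second);
}

// Usage example.
void checkOrderedEmission() {
  std::vector<std::pair<int, int>> Out;
  std::vector<std::string> Log;
  emitOrdered(Out, Log);
  assert(Log.size() == 2 && Log[0] == "first" && Log[1] == "second");
  assert(Out.back().first == 1 && Out.back().second == 2);
}
```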

Differential Revision: https://reviews.llvm.org/D39136

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
cfe/trunk/lib/Sema/SemaOpenMP.cpp
cfe/trunk/test/OpenMP/for_reduction_codegen.cpp
cfe/trunk/test/OpenMP/for_reduction_codegen_UDR.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=316362&r1=316361&r2=316362&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Mon Oct 23 12:01:35 2017
@@ -916,8 +916,9 @@ ReductionCodeGen::ReductionCodeGen(Array
 void ReductionCodeGen::emitSharedLValue(CodeGenFunction &CGF, unsigned N) {
   assert(SharedAddresses.size() == N &&
  "Number of generated lvalues must be exactly N.");
-  SharedAddresses.emplace_back(emitSharedLValue(CGF, ClausesData[N].Ref),
-   emitSharedLValueUB(CGF, ClausesData[N].Ref));
+  LValue First = emitSharedLValue(CGF, ClausesData[N].Ref);
+  LValue Second = emitSharedLValueUB(CGF, ClausesData[N].Ref);
+  SharedAddresses.emplace_back(First, Second);
 }
 
 void ReductionCodeGen::emitAggregateType(CodeGenFunction &CGF, unsigned N) {
@@ -925,7 +926,7 @@ void ReductionCodeGen::emitAggregateType
   cast<VarDecl>(cast<DeclRefExpr>(ClausesData[N].Private)->getDecl());
   QualType PrivateType = PrivateVD->getType();
   bool AsArraySection = isa<OMPArraySectionExpr>(ClausesData[N].Ref);
-  if (!AsArraySection && !PrivateType->isVariablyModifiedType()) {
+  if (!PrivateType->isVariablyModifiedType()) {
 Sizes.emplace_back(
 CGF.getTypeSize(
 SharedAddresses[N].first.getType().getNonReferenceType()),
@@ -963,10 +964,9 @@ void ReductionCodeGen::emitAggregateType
   auto *PrivateVD =
   cast<VarDecl>(cast<DeclRefExpr>(ClausesData[N].Private)->getDecl());
   QualType PrivateType = PrivateVD->getType();
-  bool AsArraySection = isa<OMPArraySectionExpr>(ClausesData[N].Ref);
-  if (!AsArraySection && !PrivateType->isVariablyModifiedType()) {
+  if (!PrivateType->isVariablyModifiedType()) {
 assert(!Size && !Sizes[N].second &&
-   "Size should be nullptr for non-variably modified redution "
+   "Size should be nullptr for non-variably modified reduction "
"items.");
 return;
   }
@@ -994,8 +994,7 @@ void ReductionCodeGen::emitInitializatio
CGF.ConvertTypeForMem(SharedType)),
   SharedType, SharedAddresses[N].first.getBaseInfo(),
   CGF.CGM.getTBAAAccessInfo(SharedType));
-  if (isa<OMPArraySectionExpr>(ClausesData[N].Ref) ||
-  CGF.getContext().getAsArrayType(PrivateVD->getType())) {
+  if (CGF.getContext().getAsArrayType(PrivateVD->getType())) {
 emitAggregateInitialization(CGF, N, PrivateAddr, SharedLVal, DRD);
   } else if (DRD && (DRD->getInitializer() || !PrivateVD->hasInit())) {
 emitInitWithReductionInitializer(CGF, DRD, ClausesData[N].ReductionOp,

Modified: cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp?rev=316362&r1=316361&r2=316362&view=diff
==
--- cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp Mon Oct 23 12:01:35 2017
@@ -996,7 +996,9 @@ void CodeGenFunction::EmitOMPReductionCl
 
 auto *LHSVD = cast<VarDecl>(cast<DeclRefExpr>(*ILHS)->getDecl());
 auto *RHSVD = cast<VarDecl>(cast<DeclRefExpr>(*IRHS)->getDecl());
-if (isa<OMPArraySectionExpr>(IRef)) {
+QualType Type = PrivateVD->getType();
+bool isaOMPArraySectionExpr = isa<OMPArraySectionExpr>(IRef);
+if (isaOMPArraySectionExpr && Type->isVariablyModifiedType()) {
   // Store the address of the original variable associ

r340681 - [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-25 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Sat Aug 25 06:42:40 2018
New Revision: 340681

URL: http://llvm.org/viewvc/llvm-project?rev=340681&view=rev
Log:
[CUDA/OpenMP] Define only some host macros during device compilation

When compiling CUDA or OpenMP device code Clang parses header files
that expect certain predefined macros from the host architecture. To
make this work the compiler passes the host triple via the -aux-triple
argument and (until now) pulls in all macros for that "auxiliary triple"
unconditionally.

However this results in defines like __SSE_MATH__ that will trigger
inline assembly making use of the "advertised" target features. See
the discussion of D47849 and PR38464 for a detailed explanation of
the encountered problems.

Instead of blacklisting "known bad" examples this patch starts adding
defines that are needed for certain headers like bits/wordsize.h and
bits/mathinline.h.
The disadvantage of this approach is that it decouples the definitions
from their target toolchain. However in my opinion it's more important
to keep definitions for one header close together. For one this will
include a clear documentation why these particular defines are needed.
Furthermore it simplifies maintenance because adding defines for a new
header or support for a new aux-triple only needs to touch one piece
of code.
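
The whitelist approach can be sketched outside of Clang as a plain function from a (simplified) triple to a list of macro names. The enums and function names below are hypothetical stand-ins for llvm::Triple and MacroBuilder; the macro names themselves are the ones added in the diff that follows.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for llvm::Triple's architecture and OS.
enum class AuxArch { X86_64, PPC64, PPC64LE, Other };
enum class AuxOS { Linux, Darwin, WindowsMSVC, WindowsGNU, Other };

// Returns only the host macros that device compilation is meant to see.
std::vector<std::string> auxTripleMacros(AuxArch Arch, AuxOS OS,
                                         bool CPlusPlus) {
  std::vector<std::string> Macros;
  switch (Arch) {
  case AuxArch::X86_64:
    Macros.push_back("__x86_64__");
    break;
  case AuxArch::PPC64:
  case AuxArch::PPC64LE:
    Macros.push_back("__powerpc64__");
    break;
  case AuxArch::Other:
    break;
  }
  if (OS == AuxOS::Linux) {
    Macros.push_back("__ELF__");
    Macros.push_back("__linux__");
    if (CPlusPlus)
      Macros.push_back("_GNU_SOURCE");
  } else if (OS == AuxOS::Darwin) {
    Macros.push_back("__APPLE__");
    Macros.push_back("__MACH__");
  } else if (OS == AuxOS::WindowsMSVC || OS == AuxOS::WindowsGNU) {
    Macros.push_back("_WIN32");
    if (OS == AuxOS::WindowsGNU)
      Macros.push_back("__MINGW32__");
  }
  return Macros;
}

bool definesMacro(const std::vector<std::string> &Macros,
                  const std::string &Name) {
  return std::find(Macros.begin(), Macros.end(), Name) != Macros.end();
}

// Usage example: an x86_64 Linux host no longer leaks feature macros such as
// __SSE_MATH__ into the device compilation.
void checkAuxTripleMacros() {
  const std::vector<std::string> M =
      auxTripleMacros(AuxArch::X86_64, AuxOS::Linux, /*CPlusPlus=*/true);
  assert(definesMacro(M, "__x86_64__"));
  assert(definesMacro(M, "_GNU_SOURCE"));
  assert(!definesMacro(M, "__SSE_MATH__"));
}
```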

Differential Revision: https://reviews.llvm.org/D50845

Added:
cfe/trunk/test/Preprocessor/aux-triple.c
Modified:
cfe/trunk/lib/Frontend/InitPreprocessor.cpp
cfe/trunk/test/SemaCUDA/builtins.cu

Modified: cfe/trunk/lib/Frontend/InitPreprocessor.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/InitPreprocessor.cpp?rev=340681&r1=340680&r2=340681&view=diff
==
--- cfe/trunk/lib/Frontend/InitPreprocessor.cpp (original)
+++ cfe/trunk/lib/Frontend/InitPreprocessor.cpp Sat Aug 25 06:42:40 2018
@@ -1099,6 +1099,44 @@ static void InitializePredefinedMacros(c
   TI.getTargetDefines(LangOpts, Builder);
 }
 
+/// Initialize macros based on AuxTargetInfo.
+static void InitializePredefinedAuxMacros(const TargetInfo &AuxTI,
+  const LangOptions &LangOpts,
+  MacroBuilder &Builder) {
+  auto AuxTriple = AuxTI.getTriple();
+
+  // Define basic target macros needed by at least bits/wordsize.h and
+  // bits/mathinline.h
+  switch (AuxTriple.getArch()) {
+  case llvm::Triple::x86_64:
+Builder.defineMacro("__x86_64__");
+break;
+  case llvm::Triple::ppc64:
+  case llvm::Triple::ppc64le:
+Builder.defineMacro("__powerpc64__");
+break;
+  default:
+break;
+  }
+
+  // libc++ needs to find out the object file format and threading API.
+  if (AuxTriple.getOS() == llvm::Triple::Linux) {
+Builder.defineMacro("__ELF__");
+Builder.defineMacro("__linux__");
+// Used in features.h. If this is omitted, math.h doesn't declare float
+// versions of the functions in bits/mathcalls.h.
+if (LangOpts.CPlusPlus)
+  Builder.defineMacro("_GNU_SOURCE");
+  } else if (AuxTriple.isOSDarwin()) {
+Builder.defineMacro("__APPLE__");
+Builder.defineMacro("__MACH__");
+  } else if (AuxTriple.isOSWindows()) {
+Builder.defineMacro("_WIN32");
+if (AuxTriple.isWindowsGNUEnvironment())
+  Builder.defineMacro("__MINGW32__");
+  }
+}
+
 /// InitializePreprocessor - Initialize the preprocessor getting it and the
 /// environment ready to process a single file. This returns true on error.
 ///
@@ -1120,13 +1158,9 @@ void clang::InitializePreprocessor(
 
   // Install things like __POWERPC__, __GNUC__, etc into the macro table.
   if (InitOpts.UsePredefines) {
-// FIXME: This will create multiple definitions for most of the predefined
-// macros. This is not the right way to handle this.
-if ((LangOpts.CUDA || LangOpts.OpenMPIsDevice) && PP.getAuxTargetInfo())
-  InitializePredefinedMacros(*PP.getAuxTargetInfo(), LangOpts, FEOpts,
- Builder);
-
 InitializePredefinedMacros(PP.getTargetInfo(), LangOpts, FEOpts, Builder);
+if ((LangOpts.CUDA || LangOpts.OpenMPIsDevice) && PP.getAuxTargetInfo())
+  InitializePredefinedAuxMacros(*PP.getAuxTargetInfo(), LangOpts, Builder);
 
 // Install definitions to make Objective-C++ ARC work well with various
 // C++ Standard Library implementations.

Added: cfe/trunk/test/Preprocessor/aux-triple.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Preprocessor/aux-triple.c?rev=340681&view=auto
==
--- cfe/trunk/test/Preprocessor/aux-triple.c (added)
+++ cfe/trunk/test/Preprocessor/aux-triple.c Sat Aug 25 06:42:40 2018
@@ -0,0 +1,62 @@
+// Ensure that Clang sets some very basic target defines based on -aux-triple.
+
+// RUN: %clang_cc1 -E -dM -ffreestanding < /dev/null \
+// RUN: -triple nvptx64-none-none \
+// RUN:   | FileCheck -match-f

Re: r336467 - [OPENMP] Fix PR38026: Link -latomic when -fopenmp is used.

2018-07-19 Thread Jonas Hahnfeld via cfe-commits

On 2018-07-19 15:43, Hal Finkel wrote:

On 07/16/2018 01:19 PM, Jonas Hahnfeld wrote:

[ Moving discussion from https://reviews.llvm.org/D49386 to the
relevant comment on cfe-commits, CC'ing Hal who commented on the
original issue ]

Is this change really a good idea? It always requires libatomic for
all OpenMP applications, even if there is no 'omp atomic' directive or
all of them can be lowered to atomic instructions that don't require a
runtime library. I'd argue that it's a larger restriction than the
problem it solves.


Can you please elaborate on why you feel that this is problematic?


The linked patch deals with the case where there is no libatomic,
which effectively disables all tests of the OpenMP runtime (even though
only a few of them require atomic instructions). So apparently there are
Linux systems without libatomic. Taking away any chance for them to use
OpenMP with Clang is a large regression IMO and not user-friendly either.



Per https://clang.llvm.org/docs/Toolchain.html#libatomic-gnu the user
is expected to manually link -latomic whenever Clang can't lower
atomic instructions - including C11 atomics and C++ atomics. In my
opinion OpenMP is just another abstraction that doesn't require a
special treatment.
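
To illustrate the distinction between atomics that lower to instructions and those that may need a runtime library, here is a small sketch. It relies on the GCC/Clang builtin __atomic_always_lock_free, which is folded at compile time; the types Counter and Big and the helper mayNeedAtomicRuntime are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Small enough to map to a single atomic instruction on mainstream targets.
struct Counter {
  int Value;
};

// Deliberately oversized payload (64 bytes) that is not expected to be
// lock-free on common targets.
struct Big {
  char Bytes[64];
};

// True if atomic operations on T are not guaranteed to be lock-free, i.e. the
// compiler may have to emit calls into an atomic runtime library such as
// libatomic. A null pointer argument asks about typical alignment.
template <typename T> constexpr bool mayNeedAtomicRuntime() {
  return !__atomic_always_lock_free(sizeof(T), 0);
}

// Usage example.
void checkAtomicRuntimeNeeds() {
  assert(!mayNeedAtomicRuntime<Counter>()); // lowered to instructions
  assert(mayNeedAtomicRuntime<Big>());      // may require -latomic
}
```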


From my perspective, because we instruct our users that all you need to
do in order to enable OpenMP is pass -fopenmp flags during compiling and
linking. The user should not need to know or care about how atomics are
implemented.

It's not clear to me that our behavior for C++ atomics is good either.
From the documentation, it looks like the rationale is to avoid choosing
between the GNU libatomic implementation and the compiler-rt
implementation? We should probably make a default choice and provide a
flag to override. That would seem more user-friendly to me.


I didn't mean to say it's a good default, but OpenMP is now different 
from C and C++. And as you said, the choice was probably made for a 
reason, so there should be some discussion whether to change it.


Jonas
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: r336467 - [OPENMP] Fix PR38026: Link -latomic when -fopenmp is used.

2018-07-23 Thread Jonas Hahnfeld via cfe-commits

On 2018-07-19 20:55, Hal Finkel wrote:

On 07/19/2018 09:01 AM, Jonas Hahnfeld wrote:

On 2018-07-19 15:43, Hal Finkel wrote:

On 07/16/2018 01:19 PM, Jonas Hahnfeld wrote:

[ Moving discussion from https://reviews.llvm.org/D49386 to the
relevant comment on cfe-commits, CC'ing Hal who commented on the
original issue ]

Is this change really a good idea? It always requires libatomic for
all OpenMP applications, even if there is no 'omp atomic' directive or
all of them can be lowered to atomic instructions that don't require a
runtime library. I'd argue that it's a larger restriction than the
problem it solves.


Can you please elaborate on why you feel that this is problematic?


The linked patch deals with the case that there is no libatomic,
effectively disabling all tests of the OpenMP runtime (even though
only a few of them require atomic instructions). So apparently there are
Linux systems without libatomic. Taking away any chance for them to use
OpenMP with Clang is a large regression IMO and not user-friendly either.


If there's a significant population of such systems, then this
certainly seems like a problem.

Let's revert this for now while we figure out what to do (which might
just mean updating the documentation to include OpenMP where we talk
about atomics).


Alexey, what do you think? Can I go ahead and revert this commit?

Thanks,
Jonas
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r337722 - Revert "[OPENMP] Fix PR38026: Link -latomic when -fopenmp is used."

2018-07-23 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Mon Jul 23 11:27:09 2018
New Revision: 337722

URL: http://llvm.org/viewvc/llvm-project?rev=337722&view=rev
Log:
Revert "[OPENMP] Fix PR38026: Link -latomic when -fopenmp is used."

This reverts commit r336467: libatomic is not available on all Linux
systems and this commit completely breaks OpenMP on them, even if there
are no atomic operations or all of them can be lowered to hardware
instructions.

See http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180716/234816.html
for post-commit discussion.

Modified:
cfe/trunk/lib/Driver/ToolChains/Gnu.cpp
cfe/trunk/test/OpenMP/linking.c

Modified: cfe/trunk/lib/Driver/ToolChains/Gnu.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Gnu.cpp?rev=337722&r1=337721&r2=337722&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Gnu.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Gnu.cpp Mon Jul 23 11:27:09 2018
@@ -479,7 +479,6 @@ void tools::gnutools::Linker::ConstructJ
 
   bool WantPthread = Args.hasArg(options::OPT_pthread) ||
  Args.hasArg(options::OPT_pthreads);
-  bool WantAtomic = false;
 
   // FIXME: Only pass GompNeedsRT = true for platforms with libgomp that
   // require librt. Most modern Linux platforms do, but some may not.
@@ -488,16 +487,13 @@ void tools::gnutools::Linker::ConstructJ
/* GompNeedsRT= */ true))
 // OpenMP runtimes implies pthreads when using the GNU toolchain.
 // FIXME: Does this really make sense for all GNU toolchains?
-WantAtomic = WantPthread = true;
+WantPthread = true;
 
   AddRunTimeLibs(ToolChain, D, CmdArgs, Args);
 
   if (WantPthread && !isAndroid)
 CmdArgs.push_back("-lpthread");
 
-  if (WantAtomic)
-CmdArgs.push_back("-latomic");
-
   if (Args.hasArg(options::OPT_fsplit_stack))
 CmdArgs.push_back("--wrap=pthread_create");
 

Modified: cfe/trunk/test/OpenMP/linking.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/linking.c?rev=337722&r1=337721&r2=337722&view=diff
==
--- cfe/trunk/test/OpenMP/linking.c (original)
+++ cfe/trunk/test/OpenMP/linking.c Mon Jul 23 11:27:09 2018
@@ -8,14 +8,14 @@
 // RUN:   | FileCheck --check-prefix=CHECK-LD-32 %s
 // CHECK-LD-32: "{{.*}}ld{{(.exe)?}}"
 // CHECK-LD-32: "-l[[DEFAULT_OPENMP_LIB:[^"]*]]"
-// CHECK-LD-32: "-lpthread" "-latomic" "-lc"
+// CHECK-LD-32: "-lpthread" "-lc"
 //
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -fopenmp -target x86_64-unknown-linux -rtlib=platform \
 // RUN:   | FileCheck --check-prefix=CHECK-LD-64 %s
 // CHECK-LD-64: "{{.*}}ld{{(.exe)?}}"
 // CHECK-LD-64: "-l[[DEFAULT_OPENMP_LIB:[^"]*]]"
-// CHECK-LD-64: "-lpthread" "-latomic" "-lc"
+// CHECK-LD-64: "-lpthread" "-lc"
 //
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -fopenmp=libgomp -target i386-unknown-linux -rtlib=platform \
@@ -27,7 +27,7 @@
 // SIMD-ONLY2-NOT: liomp
 // CHECK-GOMP-LD-32: "{{.*}}ld{{(.exe)?}}"
 // CHECK-GOMP-LD-32: "-lgomp" "-lrt"
-// CHECK-GOMP-LD-32: "-lpthread" "-latomic" "-lc"
+// CHECK-GOMP-LD-32: "-lpthread" "-lc"
 
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 -fopenmp-simd -target i386-unknown-linux -rtlib=platform | FileCheck --check-prefix SIMD-ONLY2 %s
 // SIMD-ONLY2-NOT: lgomp
@@ -39,21 +39,21 @@
 // RUN:   | FileCheck --check-prefix=CHECK-GOMP-LD-64 %s
 // CHECK-GOMP-LD-64: "{{.*}}ld{{(.exe)?}}"
 // CHECK-GOMP-LD-64: "-lgomp" "-lrt"
-// CHECK-GOMP-LD-64: "-lpthread" "-latomic" "-lc"
+// CHECK-GOMP-LD-64: "-lpthread" "-lc"
 //
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -fopenmp -target i386-unknown-linux -rtlib=platform \
 // RUN:   | FileCheck --check-prefix=CHECK-IOMP5-LD-32 %s
 // CHECK-IOMP5-LD-32: "{{.*}}ld{{(.exe)?}}"
 // CHECK-IOMP5-LD-32: "-l[[DEFAULT_OPENMP_LIB:[^"]*]]"
-// CHECK-IOMP5-LD-32: "-lpthread" "-latomic" "-lc"
+// CHECK-IOMP5-LD-32: "-lpthread" "-lc"
 //
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -fopenmp -target x86_64-unknown-linux -rtlib=platform \
 // RUN:   | FileCheck --check-prefix=CHECK-IOMP5-LD-64 %s
 // CHECK-IOMP5-LD-64: "{{.*}}ld{{(.exe)?}}"
 // CHECK-IOMP5-LD-64: "-l[[DEFAULT_OPENMP_LIB:[^"]*]]"
-// CHECK-IOMP5-LD-64: "-lpthread" "-latomic" "-lc"
+// CHECK-IOMP5-LD-64: "-lpthread" "-lc"
 //
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -fopenmp=lib -target i386-unknown-linux \
@@ -71,7 +71,7 @@
 // RUN:   | FileCheck --check-prefix=CHECK-LD-OVERRIDE-32 %s
 // CHECK-LD-OVERRIDE-32: "{{.*}}ld{{(.exe)?}}"
 // CHECK-LD-OVERRIDE-32: "-lgomp" "-lrt"
-// CHECK-LD-OVERRIDE-32: "-lpthread" "-latomic" "-lc"
+// CHECK-LD-OVERRIDE-32: "-lpthread" "-lc"
 //
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RU

Re: r336467 - [OPENMP] Fix PR38026: Link -latomic when -fopenmp is used.

2018-07-23 Thread Jonas Hahnfeld via cfe-commits

[ + cfe-commits ]

Reverted in r337722, I've reopened the bug.

Regards,
Jonas

On 2018-07-23 11:37, Alexey Bataev wrote:

Hi Jonas, yes, go ahead.

Best regards,
Alexey Bataev


On July 23, 2018, at 5:16, Jonas Hahnfeld wrote:



On 2018-07-23 11:08, Jonas Hahnfeld via cfe-commits wrote:

On 2018-07-19 20:55, Hal Finkel wrote:

On 07/19/2018 09:01 AM, Jonas Hahnfeld wrote:

On 2018-07-19 15:43, Hal Finkel wrote:

On 07/16/2018 01:19 PM, Jonas Hahnfeld wrote:
[ Moving discussion from https://reviews.llvm.org/D49386 to the
relevant comment on cfe-commits, CC'ing Hal who commented on the
original issue ]

Is this change really a good idea? It always requires libatomic for
all OpenMP applications, even if there is no 'omp atomic' directive or
all of them can be lowered to atomic instructions that don't require a
runtime library. I'd argue that it's a larger restriction than the
problem it solves.


Can you please elaborate on why you feel that this is problematic?


The linked patch deals with the case that there is no libatomic,
effectively disabling all tests of the OpenMP runtime (even though
only a few of them require atomic instructions). So apparently there are
Linux systems without libatomic. Taking away any chance for them to use
OpenMP with Clang is a large regression IMO and not user-friendly either.


If there's a significant population of such systems, then this
certainly seems like a problem.

Let's revert this for now while we figure out what to do (which might
just mean updating the documentation to include OpenMP where we talk
about atomics).


Alexey, what do you think? Can I go ahead and revert this commit?

Thanks,
Jonas


Meh, my message got blocked by @hotmail.com :-( I hope you received the
message(s) via the mailing list...

Regards,
Jonas


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [libcxx] r337669 - Use possibly cached directory entry values when performing recursive directory iteration.

2018-07-23 Thread Jonas Hahnfeld via cfe-commits

Hi Eric,

this breaks
test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp
for me:
In access_denied_on_recursion_test_case():176 Assertion TEST_CHECK(ec) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In access_denied_on_recursion_test_case():177 Assertion TEST_CHECK(it == endIt) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In access_denied_on_recursion_test_case():189 Assertion TEST_REQUIRE_THROW(filesystem_error,++it) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In test_PR35078():285 Assertion TEST_REQUIRE(it != endIt) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In test_PR35078_with_symlink():384 Assertion TEST_CHECK(ec) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In test_PR35078_with_symlink():385 Assertion TEST_CHECK(ec == eacess_ec) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In test_PR35078_with_symlink_file():461 Assertion TEST_CHECK(*it == symFile) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In test_PR35078_with_symlink_file():467 Assertion TEST_REQUIRE(it != EndIt) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

Summary for testsuite recursive_directory_iterator_increment_tests:
5 of 9 test cases passed.
156 of 164 assertions passed.
0 unsupported test cases.

Do you have an idea? I'm on a local XFS mount, the sources are on NFS...

Thanks,
Jonas

On 2018-07-23 06:55, Eric Fiselier via cfe-commits wrote:

Author: ericwf
Date: Sun Jul 22 21:55:57 2018
New Revision: 337669

URL: http://llvm.org/viewvc/llvm-project?rev=337669&view=rev
Log:
Use possibly cached directory entry values when performing recursive
directory iteration.

Modified:
libcxx/trunk/src/experimental/filesystem/directory_iterator.cpp

Modified: libcxx/trunk/src/experimental/filesystem/directory_iterator.cpp
URL:
http://llvm.org/viewvc/llvm-project/libcxx/trunk/src/experimental/filesystem/directory_iterator.cpp?rev=337669&r1=337668&r2=337669&view=diff
==
--- libcxx/trunk/src/experimental/filesystem/directory_iterator.cpp (original)
+++ libcxx/trunk/src/experimental/filesystem/directory_iterator.cpp Sun Jul 22 21:55:57 2018
@@ -359,13 +359,13 @@ bool recursive_directory_iterator::__try
   bool skip_rec = false;
   std::error_code m_ec;
   if (!rec_sym) {
-file_status st = curr_it.__entry_.symlink_status(m_ec);
+file_status st(curr_it.__entry_.__get_sym_ft(&m_ec));
 if (m_ec && status_known(st))
   m_ec.clear();
 if (m_ec || is_symlink(st) || !is_directory(st))
   skip_rec = true;
   } else {
-file_status st = curr_it.__entry_.status(m_ec);
+file_status st(curr_it.__entry_.__get_ft(&m_ec));
 if (m_ec && status_known(st))
   m_ec.clear();
 if (m_ec || !is_directory(st))


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [libcxx] r337669 - Use possibly cached directory entry values when performing recursive directory iteration.

2018-07-24 Thread Jonas Hahnfeld via cfe-commits

Thanks for your work, the test is now passing on my system.

Cheers,
Jonas

On 2018-07-24 01:00, Eric Fiselier wrote:

Sorry for the repeated emails. I did more digging and we were indeed
handling DT_UNKNOWN incorrectly.
I've fixed that in r337768.

/Eric
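
For readers unfamiliar with DT_UNKNOWN: readdir() may leave d_type unset on some filesystems, in which case the caller has to fall back to a stat-family call. The following is only a generic POSIX sketch of that fallback pattern; it is not the code from r337768, and both function names are invented for the example.

```cpp
#include <cassert>
#include <dirent.h>
#include <sys/stat.h>
#include <string>

// Returns true if the entry names a directory (symlinks are not followed).
// Falls back to lstat() when the filesystem did not fill in d_type.
static bool entryIsDirectory(const std::string &dir, const dirent &entry) {
  if (entry.d_type != DT_UNKNOWN)
    return entry.d_type == DT_DIR;
  struct stat st;
  const std::string path = dir + "/" + entry.d_name;
  return lstat(path.c_str(), &st) == 0 && S_ISDIR(st.st_mode);
}

// Counts the immediate subdirectories of `dir`; returns -1 if `dir`
// cannot be opened.
int countSubdirectories(const std::string &dir) {
  DIR *handle = opendir(dir.c_str());
  if (!handle)
    return -1;
  int count = 0;
  while (const dirent *entry = readdir(handle)) {
    const std::string name = entry->d_name;
    if (name == "." || name == "..")
      continue;
    if (entryIsDirectory(dir, *entry))
      ++count;
  }
  closedir(handle);
  return count;
}
```

Treating DT_UNKNOWN as "not a directory" instead of falling back would silently skip recursion on such filesystems, which is consistent with the kind of failure reported above.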

On Mon, Jul 23, 2018 at 4:43 PM Eric Fiselier  wrote:


Hi Jonas,

I believe I fixed the issue, and I've recommitted the change as
r337765.
Please let me know if you still see the failures. I think there
might be a lingering issue with how we handle DT_UNKNOWN.

/Eric

On Mon, Jul 23, 2018 at 3:53 PM Eric Fiselier  wrote:

I think I've found the bug, but I need to spend some more time on
it.

I've reverted it for now in r337749.

/Eric

On Mon, Jul 23, 2018 at 1:25 PM Eric Fiselier  wrote:

Thanks. I'm looking into this.

/Eric

On Mon, Jul 23, 2018 at 12:58 PM Jonas Hahnfeld wrote:
Hi Eric,

this breaks
test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp
for me:
In access_denied_on_recursion_test_case():176 Assertion TEST_CHECK(ec) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In access_denied_on_recursion_test_case():177 Assertion TEST_CHECK(it == endIt) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In access_denied_on_recursion_test_case():189 Assertion TEST_REQUIRE_THROW(filesystem_error,++it) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In test_PR35078():285 Assertion TEST_REQUIRE(it != endIt) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In test_PR35078_with_symlink():384 Assertion TEST_CHECK(ec) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In test_PR35078_with_symlink():385 Assertion TEST_CHECK(ec == eacess_ec) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In test_PR35078_with_symlink_file():461 Assertion TEST_CHECK(*it == symFile) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp

In test_PR35078_with_symlink_file():467 Assertion TEST_REQUIRE(it != EndIt) failed.
in file: <<>>/projects/libcxx/test/std/experimental/filesystem/class.rec.dir.itr/rec.dir.itr.members/increment.pass.cpp


Summary for testsuite recursive_directory_iterator_increment_tests:
5 of 9 test cases passed.
156 of 164 assertions passed.
0 unsupported test cases.

Do you have an idea? I'm on a local XFS mount, the sources are on
NFS...

Thanks,
Jonas

On 2018-07-23 06:55, Eric Fiselier via cfe-commits wrote:

Author: ericwf
Date: Sun Jul 22 21:55:57 2018
New Revision: 337669

URL: http://llvm.org/viewvc/llvm-project?rev=337669&view=rev
Log:
Use possibly cached directory entry values when performing recursive
directory iteration.

Modified:
libcxx/trunk/src/experimental/filesystem/directory_iterator.cpp

Modified: libcxx/trunk/src/experimental/filesystem/directory_iterator.cpp
URL:
http://llvm.org/viewvc/llvm-project/libcxx/trunk/src/experimental/filesystem/directory_iterator.cpp?rev=337669&r1=337668&r2=337669&view=diff
==
--- libcxx/trunk/src/experimental/filesystem/directory_iterator.cpp (original)
+++ libcxx/trunk/src/experimental/filesystem/directory_iterator.cpp Sun Jul 22 21:55:57 2018
@@ -359,13 +359,13 @@ bool recursive_directory_iterator::__try
   bool skip_rec = false;
   std::error_code m_ec;
   if (!rec_sym) {
-    file_status st = curr_it.__entry_.symlink_status(m_ec);
+    file_status st(curr_it.__entry_.__get_sym_ft(&m_ec));
     if (m_ec && status_known(st))
       m_ec.clear();
     if (m_ec || is_symlink(st) || !is_directory(st))
       skip_rec = true;
   } else {
-    file_status st = curr_it.__entry_.status(m_ec);
+    file_status st(curr_it.__entry_.__get_ft(&m_ec));
     if (m_ec && status_known(st))
       m_ec.clear();
     if (m_ec || !is_directory(st))


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: r338032 - [OPENMP] Force OpenMP 4.5 when compiling for offloading.

2018-07-26 Thread Jonas Hahnfeld via cfe-commits

Hi Alexey,

maybe we can now raise the version generally? Is that something that we 
want to do before branching for 7.0?
According to http://clang.llvm.org/docs/OpenMPSupport.html, all 
directives of OpenMP 4.5 are supported (even if Clang may not generate 
optimal code).


Cheers,
Jonas

On 2018-07-26 17:17, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Thu Jul 26 08:17:38 2018
New Revision: 338032

URL: http://llvm.org/viewvc/llvm-project?rev=338032&view=rev
Log:
[OPENMP] Force OpenMP 4.5 when compiling for offloading.

If the user requested compilation for OpenMP with the offloading
support, force the version of the OpenMP standard to 4.5 by default.
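
The defaulting rule described in this log message can be modelled in a few lines. This is a simplified sketch, not the frontend code: the function and parameter names are invented, an absent -fopenmp-version is represented by 0, and `baseline` stands for whatever version would otherwise be in effect.

```cpp
#include <cassert>

// Simplified model of the version choice described above.
//   simdOnly        : -fopenmp-simd was given
//   hasTargets      : -fopenmp-targets=... or -fopenmp-is-device was given
//   explicitVersion : value of -fopenmp-version=N, or 0 when absent
//   baseline        : version used when nothing forces 4.5 (an assumption
//                     of this sketch; the caller supplies it)
int effectiveOpenMPVersion(bool simdOnly, bool hasTargets,
                           int explicitVersion, int baseline) {
  if (explicitVersion != 0)
    return explicitVersion;  // an explicit request always wins
  if (simdOnly || hasTargets)
    return 45;               // offloading and SIMD-only default to 4.5
  return baseline;
}
```

With version 45 in effect, the updated test in this commit expects `_OPENMP` to be defined as 201511.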

Modified:
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Frontend/CompilerInvocation.cpp
cfe/trunk/test/OpenMP/driver.c

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=338032&r1=338031&r2=338032&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Thu Jul 26 08:17:38 2018
@@ -4698,7 +4698,7 @@ void Clang::ConstructJob(Compilation &C,

   // For all the host OpenMP offloading compile jobs we need to pass the targets
   // information using -fopenmp-targets= option.
-  if (isa(JA) && JA.isHostOffloading(Action::OFK_OpenMP)) {
+  if (JA.isHostOffloading(Action::OFK_OpenMP)) {
     SmallString<128> TargetInfo("-fopenmp-targets=");

     Arg *Tgts = Args.getLastArg(options::OPT_fopenmp_targets_EQ);

Modified: cfe/trunk/lib/Frontend/CompilerInvocation.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/CompilerInvocation.cpp?rev=338032&r1=338031&r2=338032&view=diff
==
--- cfe/trunk/lib/Frontend/CompilerInvocation.cpp (original)
+++ cfe/trunk/lib/Frontend/CompilerInvocation.cpp Thu Jul 26 08:17:38 2018
@@ -2594,13 +2594,15 @@ static void ParseLangArgs(LangOptions &O
       Opts.OpenMP && !Args.hasArg(options::OPT_fnoopenmp_use_tls);
   Opts.OpenMPIsDevice =
       Opts.OpenMP && Args.hasArg(options::OPT_fopenmp_is_device);
+  bool IsTargetSpecified =
+      Opts.OpenMPIsDevice || Args.hasArg(options::OPT_fopenmp_targets_EQ);

   if (Opts.OpenMP || Opts.OpenMPSimd) {
-    if (int Version =
-            getLastArgIntValue(Args, OPT_fopenmp_version_EQ,
-                               IsSimdSpecified ? 45 : Opts.OpenMP, Diags))
+    if (int Version = getLastArgIntValue(
+            Args, OPT_fopenmp_version_EQ,
+            (IsSimdSpecified || IsTargetSpecified) ? 45 : Opts.OpenMP, Diags))
       Opts.OpenMP = Version;
-    else if (IsSimdSpecified)
+    else if (IsSimdSpecified || IsTargetSpecified)
       Opts.OpenMP = 45;
     // Provide diagnostic when a given target is not expected to be an OpenMP
     // device or host.

Modified: cfe/trunk/test/OpenMP/driver.c
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/driver.c?rev=338032&r1=338031&r2=338032&view=diff
==
--- cfe/trunk/test/OpenMP/driver.c (original)
+++ cfe/trunk/test/OpenMP/driver.c Thu Jul 26 08:17:38 2018
@@ -1,3 +1,4 @@
+// REQUIRES: x86-registered-target
 // Test that by default -fnoopenmp-use-tls is passed to frontend.
 //
 // RUN: %clang %s -### -o %t.o 2>&1 -fopenmp=libomp | FileCheck --check-prefix=CHECK-DEFAULT %s
@@ -23,7 +24,9 @@

 // RUN: %clang %s -c -E -dM -fopenmp=libomp -fopenmp-version=45 | FileCheck --check-prefix=CHECK-45-VERSION %s
 // RUN: %clang %s -c -E -dM -fopenmp=libomp -fopenmp-simd | FileCheck --check-prefix=CHECK-45-VERSION %s
+// RUN: %clang %s -c -E -dM -fopenmp=libomp -fopenmp-targets=x86_64-unknown-unknown -o - | FileCheck --check-prefix=CHECK-45-VERSION --check-prefix=CHECK-45-VERSION2 %s
 // CHECK-45-VERSION: #define _OPENMP 201511
+// CHECK-45-VERSION2: #define _OPENMP 201511

 // RUN: %clang %s -c -E -dM -fopenmp-version=1 | FileCheck --check-prefix=CHECK-VERSION %s
 // RUN: %clang %s -c -E -dM -fopenmp-version=31 | FileCheck --check-prefix=CHECK-VERSION %s


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: r338049 - [OPENMP] What's new for OpenMP in clang.

2018-07-26 Thread Jonas Hahnfeld via cfe-commits

Hi Alexey,

On 2018-07-26 19:53, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Thu Jul 26 10:53:45 2018
New Revision: 338049

URL: http://llvm.org/viewvc/llvm-project?rev=338049&view=rev
Log:
[OPENMP] What's new for OpenMP in clang.

Updated ReleaseNotes + Status of the OpenMP support in clang.

Modified:
cfe/trunk/docs/OpenMPSupport.rst
cfe/trunk/docs/ReleaseNotes.rst

Modified: cfe/trunk/docs/OpenMPSupport.rst
[...]

Modified: cfe/trunk/docs/ReleaseNotes.rst
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNotes.rst?rev=338049&r1=338048&r2=338049&view=diff
==
--- cfe/trunk/docs/ReleaseNotes.rst (original)
+++ cfe/trunk/docs/ReleaseNotes.rst Thu Jul 26 10:53:45 2018
@@ -216,7 +216,21 @@ OpenCL C Language Changes in Clang
 OpenMP Support in Clang
 --

-- ...
+- Clang gained basic support for OpenMP 4.5 offloading for NVPTX target.
+   To compile your program for NVPTX target use the following options:
+   `-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda` for 64 bit platforms or
+   `-fopenmp -fopenmp-targets=nvptx-nvidia-cuda` for 32 bit platform.
+
+- Passing options to the OpenMP device offloading toolchain can be done using
+  the `-Xopenmp-target= -opt=val` flag. In this way the `-opt=val`
+  option will be forwarded to the respective OpenMP device offloading toolchain
+  described by the triple. For example passing the compute capability to
+  the OpenMP NVPTX offloading toolchain can be done as follows:
+  `-Xopenmp-target=nvptx62-nvidia-cuda -march=sm_60`.


Is that a typo and should say "nvptx64"?
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: r338139 - [OPENMP] Static variables on device must be externally visible.

2018-07-27 Thread Jonas Hahnfeld via cfe-commits

Hi Alexey,

from what I can see this change can't handle the case where there are 
static variables with the same name in multiple TUs.
(The same problem exists for static CUDA kernels with -fcuda-rdc. I 
found that nvcc mangles the function names in this case, but didn't have 
time yet to prepare a similar patch for Clang.)


I think for now it would be better to emit a meaningful error instead of 
generating incorrect code and letting the user figure out what went 
wrong.
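
To illustrate the kind of per-TU mangling meant here, below is a hypothetical sketch: it appends a hash of a translation-unit identifier to the variable name so that two TUs defining a static variable with the same name end up with distinct external symbols. This is neither nvcc's actual scheme nor anything Clang implements; the function names and the use of FNV-1a are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// 64-bit FNV-1a hash of a string (here: an identifier for the TU, such as
// its source path).
static std::uint64_t fnv1a(const std::string &text) {
  std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
  for (unsigned char c : text) {
    hash ^= c;
    hash *= 1099511628211ULL;                    // FNV prime
  }
  return hash;
}

// Builds an externally visible name for a static variable by suffixing the
// hash of the TU identifier. Hypothetical naming scheme.
std::string uniqueDeviceName(const std::string &varName,
                             const std::string &tuId) {
  return varName + "_" + std::to_string(fnv1a(tuId));
}
```

With such a scheme, `static S stat` in a.cpp and in b.cpp would no longer collide once both are made externally visible on the device.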


My 2 cents,
Jonas

On 2018-07-27 19:37, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Fri Jul 27 10:37:32 2018
New Revision: 338139

URL: http://llvm.org/viewvc/llvm-project?rev=338139&view=rev
Log:
[OPENMP] Static variables on device must be externally visible.

Do not mark static variable as internal on the device as they must be
visible from the host to be mapped correctly.

Modified:
cfe/trunk/lib/AST/ASTContext.cpp
cfe/trunk/test/OpenMP/declare_target_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp

Modified: cfe/trunk/lib/AST/ASTContext.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/AST/ASTContext.cpp?rev=338139&r1=338138&r2=338139&view=diff
==
--- cfe/trunk/lib/AST/ASTContext.cpp (original)
+++ cfe/trunk/lib/AST/ASTContext.cpp Fri Jul 27 10:37:32 2018
@@ -9504,6 +9504,21 @@ static GVALinkage basicGVALinkageForFunc
   return GVA_DiscardableODR;
 }

+static bool isDeclareTargetToDeclaration(const Decl *VD) {
+  for (const Decl *D : VD->redecls()) {
+if (!D->hasAttrs())
+  continue;
+if (const auto *Attr = D->getAttr())
+  return Attr->getMapType() == OMPDeclareTargetDeclAttr::MT_To;
+  }
+  if (const auto *V = dyn_cast(VD)) {
+if (const VarDecl *TD = V->getTemplateInstantiationPattern())
+  return isDeclareTargetToDeclaration(TD);
+  }
+
+  return false;
+}
+
 static GVALinkage adjustGVALinkageForAttributes(const ASTContext &Context,
                                                 const Decl *D, GVALinkage L) {
   // See http://msdn.microsoft.com/en-us/library/xa0d9ste.aspx
@@ -9520,6 +9535,12 @@ static GVALinkage adjustGVALinkageForAtt
     // visible externally so they can be launched from host.
     if (L == GVA_DiscardableODR || L == GVA_Internal)
       return GVA_StrongODR;
+  } else if (Context.getLangOpts().OpenMP &&
+             Context.getLangOpts().OpenMPIsDevice &&
+             isDeclareTargetToDeclaration(D)) {
+    // Static variables must be visible externally so they can be mapped from
+    // host.
+    if (L == GVA_Internal)
+      return GVA_StrongODR;
   }
   return L;
 }

Modified: cfe/trunk/test/OpenMP/declare_target_codegen.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/declare_target_codegen.cpp?rev=338139&r1=338138&r2=338139&view=diff
==
--- cfe/trunk/test/OpenMP/declare_target_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/declare_target_codegen.cpp Fri Jul 27 10:37:32 2018
@@ -18,12 +18,14 @@
 // CHECK-DAG: @d = global i32 0,
 // CHECK-DAG: @c = external global i32,
 // CHECK-DAG: @globals = global %struct.S zeroinitializer,
-// CHECK-DAG: @llvm.used = appending global [1 x i8*] [i8* bitcast (void ()* @__omp_offloading__{{.+}}_globals_l[[@LINE+41]]_ctor to i8*)], section "llvm.metadata"
+// CHECK-DAG: @{{.+}}stat = weak_odr global %struct.S zeroinitializer,
+// CHECK-DAG: @llvm.used = appending global [2 x i8*] [i8* bitcast (void ()* @__omp_offloading__{{.+}}_globals_l[[@LINE+42]]_ctor to i8*), i8* bitcast (void ()* @__omp_offloading__{{.+}}_stat_l[[@LINE+43]]_ctor to i8*)], section "llvm.metadata"

 // CHECK-DAG: define {{.*}}i32 @{{.*}}{{foo|bar|baz2|baz3|FA|f_method}}{{.*}}()
 // CHECK-DAG: define {{.*}}void @{{.*}}TemplateClass{{.*}}(%class.TemplateClass* %{{.*}})
 // CHECK-DAG: define {{.*}}i32 @{{.*}}TemplateClass{{.*}}f_method{{.*}}(%class.TemplateClass* %{{.*}})
-// CHECK-DAG: define {{.*}}void @__omp_offloading__{{.*}}_globals_l[[@LINE+36]]_ctor()
+// CHECK-DAG: define {{.*}}void @__omp_offloading__{{.*}}_globals_l[[@LINE+37]]_ctor()
+// CHECK-DAG: define {{.*}}void @__omp_offloading__{{.*}}_stat_l[[@LINE+37]]_ctor()

 #ifndef HEADER
 #define HEADER
@@ -60,6 +62,7 @@ int foo() { return 0; }
 int b = 15;
 int d;
 S globals(d);
+static S stat(d);
 #pragma omp end declare target
 int c;


Modified: cfe/trunk/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp?rev=338139&r1=338138&r2=338139&view=diff
==
--- cfe/trunk/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp Fri Jul 27 10:37:32 2018
@@ -15,7 +15,7 @@

 // SIMD-ONLY-NOT: {{__kmpc|__tgt}}

-// D

Re: r338049 - [OPENMP] What's new for OpenMP in clang.

2018-07-29 Thread Jonas Hahnfeld via cfe-commits
I just noticed that UsersManual says: "Clang supports all OpenMP 3.1 
directives and clauses." Maybe this should link to OpenMPSupport?


On 2018-07-26 19:53, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Thu Jul 26 10:53:45 2018
New Revision: 338049

URL: http://llvm.org/viewvc/llvm-project?rev=338049&view=rev
Log:
[OPENMP] What's new for OpenMP in clang.

Updated ReleaseNotes + Status of the OpenMP support in clang.

Modified:
cfe/trunk/docs/OpenMPSupport.rst
cfe/trunk/docs/ReleaseNotes.rst

Modified: cfe/trunk/docs/OpenMPSupport.rst
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/OpenMPSupport.rst?rev=338049&r1=338048&r2=338049&view=diff
==
--- cfe/trunk/docs/OpenMPSupport.rst (original)
+++ cfe/trunk/docs/OpenMPSupport.rst Thu Jul 26 10:53:45 2018
@@ -10,13 +10,15 @@
 .. role:: partial
 .. role:: good

+.. contents::
+   :local:
+
 ==
 OpenMP Support
 ==

-Clang fully supports OpenMP 3.1 + some elements of OpenMP 4.5. Clang supports offloading to X86_64, AArch64 and PPC64[LE] devices.
-Support for Cuda devices is not ready yet.
-The status of major OpenMP 4.5 features support in Clang.
+Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64, AArch64,
+PPC64[LE] and has `basic support for Cuda devices`_.

 Standalone directives
 =
@@ -35,7 +37,7 @@ Standalone directives

 * #pragma omp target: :good:`Complete`.

-* #pragma omp declare target: :partial:`Partial`.  No full codegen support.
+* #pragma omp declare target: :good:`Complete`.

 * #pragma omp teams: :good:`Complete`.

@@ -64,5 +66,66 @@ Combined directives

 * #pragma omp target teams distribute parallel for [simd]: :good:`Complete`.

-Clang does not support any constructs/updates from upcoming OpenMP 5.0 except for `reduction`-based clauses in the `task` and `target`-based directives.
-In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and mac OS.
+Clang does not support any constructs/updates from upcoming OpenMP 5.0 except
+for `reduction`-based clauses in the `task` and `target`-based directives.
+
+In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools
+Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and mac OS.
+ows, and mac OS.
+
+.. _basic support for Cuda devices:
+
+Cuda devices support
+
+
+Directives execution modes
+--
+
+Clang code generation for target regions supports two modes: the SPMD and
+non-SPMD modes. Clang chooses one of these two modes automatically based on the
+way directives and clauses on those directives are used. The SPMD mode uses a
+simplified set of runtime functions thus increasing performance at the cost of
+supporting some OpenMP features. The non-SPMD mode is the most generic mode and
+supports all currently available OpenMP features. The compiler will always
+attempt to use the SPMD mode wherever possible. SPMD mode will not be used if:
+
+   - The target region contains an `if()` clause that refers to a `parallel`
+     directive.
+
+   - The target region contains a `parallel` directive with a `num_threads()`
+     clause.
+
+   - The target region contains user code (other than OpenMP-specific
+     directives) in between the `target` and the `parallel` directives.
+
+Data-sharing modes
+--
+
+Clang supports two data-sharing models for Cuda devices: `Generic` and `Cuda`
+modes. The default mode is `Generic`. `Cuda` mode can give an additional
+performance and can be activated using the `-fopenmp-cuda-mode` flag. In
+`Generic` mode all local variables that can be shared in the parallel regions
+are stored in the global memory. In `Cuda` mode local variables are not shared
+between the threads and it is user responsibility to share the required data
+between the threads in the parallel regions.
+
+Features not supported or with limited support for Cuda devices
+---
+
+- Reductions across the teams are not supported yet.
+
+- Cancellation constructs are not supported.
+
+- Doacross loop nest is not supported.
+
+- User-defined reductions are supported only for trivial types.
+
+- Nested parallelism: inner parallel regions are executed sequentially.
+
+- Static linking of libraries containing device code is not supported yet.
+
+- Automatic translation of math functions in target regions to device-specific
+  math functions is not implemented yet.
+
+- Debug information for OpenMP target regions is not supported yet.
+
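The execution-mode rules in the documentation hunk above can be illustrated with a small sketch. This is only an illustration under assumptions: the function names `scaleCombined`, `scaleSplit` and `exampleScale` are hypothetical, and which mode Clang actually selects depends on the compiler version. The first function uses a combined construct that hits none of the three listed exclusions; the second places user code between `target` and `parallel` and adds `num_threads()`, so by the listed rules it would not be compiled in SPMD mode.

```cpp
#include <cassert>

// Combined construct: no `if()` or `num_threads()` clause and no user code
// between `target` and `parallel`, so none of the listed exclusions applies.
void scaleCombined(float *data, int n, float factor) {
#pragma omp target teams distribute parallel for map(tofrom : data[0 : n])
  for (int i = 0; i < n; ++i)
    data[i] *= factor;
}

// Split construct: the assignment to `half` is user code between `target`
// and `parallel`, and the `parallel` directive carries `num_threads()`.
// By the rules quoted above this region would use the non-SPMD mode.
void scaleSplit(float *data, int n, float factor) {
#pragma omp target map(tofrom : data[0 : n])
  {
    float half = factor * 0.5f; // local variable used inside the parallel region
#pragma omp parallel for num_threads(64)
    for (int i = 0; i < n; ++i)
      data[i] *= half;
  }
}

// Host-side usage check; both variants compute the same kind of result.
void exampleScale() {
  float a[3] = {1.0f, 2.0f, 3.0f};
  scaleCombined(a, 3, 2.0f);
  assert(a[0] == 2.0f && a[1] == 4.0f && a[2] == 6.0f);

  float b[3] = {1.0f, 2.0f, 3.0f};
  scaleSplit(b, 3, 4.0f);
  assert(b[0] == 2.0f && b[1] == 4.0f && b[2] == 6.0f);
}
```

The local variable `half` in `scaleSplit` is also the kind of variable the data-sharing section talks about: it is declared in the target region and read inside the parallel region, so in `Generic` mode it would be a candidate for storage in global memory. Without `-fopenmp` the pragmas are ignored and both functions run serially on the host.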

Modified: cfe/trunk/docs/ReleaseNotes.rst
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNotes.rst?rev=338049&r1=338048&r2=338049&view=diff
==
---

r338360 - Fix linux-header-search.cpp with CLANG_DEFAULT_CXX_STDLIB

2018-07-31 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Tue Jul 31 04:36:14 2018
New Revision: 338360

URL: http://llvm.org/viewvc/llvm-project?rev=338360&view=rev
Log:
Fix linux-header-search.cpp with CLANG_DEFAULT_CXX_STDLIB

This configuration was broken after r338294 because Clang might
be configured to always use libc++.

Modified:
cfe/trunk/test/Driver/linux-header-search.cpp

Modified: cfe/trunk/test/Driver/linux-header-search.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/linux-header-search.cpp?rev=338360&r1=338359&r2=338360&view=diff
==
--- cfe/trunk/test/Driver/linux-header-search.cpp (original)
+++ cfe/trunk/test/Driver/linux-header-search.cpp Tue Jul 31 04:36:14 2018
@@ -496,7 +496,7 @@
 
 // Check header search on OpenEmbedded ARM.
 // RUN: %clang -no-canonical-prefixes %s -### -fsyntax-only 2>&1 \
-// RUN: -target arm-oe-linux-gnueabi \
+// RUN: -target arm-oe-linux-gnueabi -stdlib=libstdc++ \
 // RUN: --sysroot=%S/Inputs/openembedded_arm_linux_tree \
 // RUN:   | FileCheck --check-prefix=CHECK-OE-ARM %s
 
@@ -507,7 +507,7 @@
 
 // Check header search on OpenEmbedded AArch64.
 // RUN: %clang -no-canonical-prefixes %s -### -fsyntax-only 2>&1 \
-// RUN: -target aarch64-oe-linux \
+// RUN: -target aarch64-oe-linux -stdlib=libstdc++ \
 // RUN: --sysroot=%S/Inputs/openembedded_aarch64_linux_tree \
 // RUN:   | FileCheck --check-prefix=CHECK-OE-AARCH64 %s
 


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r338414 - Fix riscv32-toolchain.c with CLANG_DEFAULT_CXX_STDLIB

2018-07-31 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Tue Jul 31 11:47:48 2018
New Revision: 338414

URL: http://llvm.org/viewvc/llvm-project?rev=338414&view=rev
Log:
Fix riscv32-toolchain.c with CLANG_DEFAULT_CXX_STDLIB

This configuration was (again) broken after r338385 because Clang
might be configured to always use libc++.

Modified:
cfe/trunk/test/Driver/riscv32-toolchain.c

Modified: cfe/trunk/test/Driver/riscv32-toolchain.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/riscv32-toolchain.c?rev=338414&r1=338413&r2=338414&view=diff
==
--- cfe/trunk/test/Driver/riscv32-toolchain.c (original)
+++ cfe/trunk/test/Driver/riscv32-toolchain.c Tue Jul 31 11:47:48 2018
@@ -19,7 +19,7 @@
 // C-RV32-BAREMETAL-ILP32: "{{.*}}/Inputs/basic_riscv32_tree/lib/gcc/riscv32-unknown-elf/8.0.1{{/|}}crtend.o"
 
 // RUN: %clangxx %s -### -no-canonical-prefixes \
-// RUN:   -target riscv32-unknown-elf \
+// RUN:   -target riscv32-unknown-elf -stdlib=libstdc++ \
 // RUN:   --gcc-toolchain=%S/Inputs/basic_riscv32_tree \
 // RUN:   --sysroot=%S/Inputs/basic_riscv32_tree/riscv32-unknown-elf 2>&1 \
 // RUN:   | FileCheck -check-prefix=CXX-RV32-BAREMETAL-ILP32 %s




r324877 - [CUDA] Fix test cuda-external-tools.cu

2018-02-12 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Mon Feb 12 02:46:34 2018
New Revision: 324877

URL: http://llvm.org/viewvc/llvm-project?rev=324877&view=rev
Log:
[CUDA] Fix test cuda-external-tools.cu

This didn't verify the CHECK prefix before!

Differential Revision: https://reviews.llvm.org/D42920

Modified:
cfe/trunk/test/Driver/cuda-external-tools.cu

Modified: cfe/trunk/test/Driver/cuda-external-tools.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/cuda-external-tools.cu?rev=324877&r1=324876&r2=324877&view=diff
==
--- cfe/trunk/test/Driver/cuda-external-tools.cu (original)
+++ cfe/trunk/test/Driver/cuda-external-tools.cu Mon Feb 12 02:46:34 2018
@@ -7,112 +7,115 @@
 
 // Regular compiles with -O{0,1,2,3,4,fast}.  -O4 and -Ofast map to ptxas O3.
 // RUN: %clang -### -target x86_64-linux-gnu -O0 -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT0 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,OPT0 %s
 // RUN: %clang -### -target x86_64-linux-gnu -O1 -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT1 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,OPT1 %s
 // RUN: %clang -### -target x86_64-linux-gnu -O2 -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT2 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,OPT2 %s
 // RUN: %clang -### -target x86_64-linux-gnu -O3 -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT3 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,OPT3 %s
 // RUN: %clang -### -target x86_64-linux-gnu -O4 -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT3 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,OPT3 %s
 // RUN: %clang -### -target x86_64-linux-gnu -Ofast -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT3 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,OPT3 %s
 
 // With debugging enabled, ptxas should be run with with no ptxas optimizations.
 // RUN: %clang -### -target x86_64-linux-gnu --cuda-noopt-device-debug -O2 -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix DBG %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,DBG %s
 
 // --no-cuda-noopt-device-debug overrides --cuda-noopt-device-debug.
 // RUN: %clang -### -target x86_64-linux-gnu --cuda-noopt-device-debug \
 // RUN:   --no-cuda-noopt-device-debug -O2 -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT2 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,OPT2 %s
 
 // Regular compile without -O.  This should result in us passing -O0 to ptxas.
 // RUN: %clang -### -target x86_64-linux-gnu -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT0 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,OPT0 %s
 
 // Regular compiles with -Os and -Oz.  For lack of a better option, we map
 // these to ptxas -O3.
 // RUN: %clang -### -target x86_64-linux-gnu -Os -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT2 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,OPT2 %s
 // RUN: %clang -### -target x86_64-linux-gnu -Oz -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT2 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,OPT2 %s
 
 // Regular compile targeting sm_35.
 // RUN: %clang -### -target x86_64-linux-gnu --cuda-gpu-arch=sm_35 -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM35 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM35 %s
 
 // 32-bit compile.
-// RUN: %clang -### -target x86_32-linux-gnu -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH32 -check-prefix SM20 %s
+// RUN: %clang -### -target i386-linux-gnu -c %s 2>&1 \
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH32,SM20 %s
 
 // Compile with -fintegrated-as.  This should still cause us to invoke ptxas.
 // RUN: %clang -### -target x86_64-linux-gnu -fintegrated-as -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT0 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM20,OPT0 %s
 
 // Check -Xcuda-ptxas and -Xcuda-fatbinary
 // RUN: %clang -### -target x86_64-linux-gnu -c -Xcuda-ptxas -foo1 \
 // RUN:   -Xcuda-fatbinary -bar1 -Xcuda-ptxas -foo2 -Xcuda-fatbinary -bar2 %s 2>&1 \
-// RUN: | FileCheck -check-prefix SM20 -check-prefix PTXAS-EXTRA \
-// RUN:   -check-prefix FATBINARY-EXTRA %s
+// RUN: | FileCheck -check-prefixes=CHECK,SM20,PTXAS-EXTRA,FATBINARY-EXTRA %s
 
 // MacOS spot-checks
 // RUN: %clang -### -target x86_64-apple-macosx -O0 -c %s 2>&1 \
-// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT0 %s
+// RUN: | FileCheck -check-prefixes=CHECK,ARCH64

r324878 - [CUDA] Add option to generate relocatable device code

2018-02-12 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Mon Feb 12 02:46:45 2018
New Revision: 324878

URL: http://llvm.org/viewvc/llvm-project?rev=324878&view=rev
Log:
[CUDA] Add option to generate relocatable device code

As a first step, pass '-c/--compile-only' to ptxas so that it
doesn't complain about references to external function. This
will successfully generate object files, but they won't work
at runtime because the registration routines need to adapted.

Differential Revision: https://reviews.llvm.org/D42921

Modified:
cfe/trunk/include/clang/Basic/LangOptions.def
cfe/trunk/include/clang/Driver/Options.td
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
cfe/trunk/lib/Frontend/CompilerInvocation.cpp
cfe/trunk/test/Driver/cuda-external-tools.cu

Modified: cfe/trunk/include/clang/Basic/LangOptions.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/LangOptions.def?rev=324878&r1=324877&r2=324878&view=diff
==
--- cfe/trunk/include/clang/Basic/LangOptions.def (original)
+++ cfe/trunk/include/clang/Basic/LangOptions.def Mon Feb 12 02:46:45 2018
@@ -204,6 +204,7 @@ LANGOPT(CUDAAllowVariadicFunctions, 1, 0
 LANGOPT(CUDAHostDeviceConstexpr, 1, 1, "treating unattributed constexpr functions as __host__ __device__")
 LANGOPT(CUDADeviceFlushDenormalsToZero, 1, 0, "flushing denormals to zero")
 LANGOPT(CUDADeviceApproxTranscendentals, 1, 0, "using approximate transcendental functions")
+LANGOPT(CUDARelocatableDeviceCode, 1, 0, "generate relocatable device code")
 
 LANGOPT(SizedDeallocation , 1, 0, "sized deallocation")
 LANGOPT(AlignedAllocation , 1, 0, "aligned allocation")

Modified: cfe/trunk/include/clang/Driver/Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=324878&r1=324877&r2=324878&view=diff
==
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Mon Feb 12 02:46:45 2018
@@ -566,6 +566,9 @@ def fno_cuda_flush_denormals_to_zero : F
 def fcuda_approx_transcendentals : Flag<["-"], "fcuda-approx-transcendentals">,
   Flags<[CC1Option]>, HelpText<"Use approximate transcendental functions">;
 def fno_cuda_approx_transcendentals : Flag<["-"], "fno-cuda-approx-transcendentals">;
+def fcuda_rdc : Flag<["-"], "fcuda-rdc">, Flags<[CC1Option, HelpHidden]>,
+  HelpText<"Generate relocatable device code, also known as separate compilation mode.">;
+def fno_cuda_rdc : Flag<["-"], "fno-cuda-rdc">;
 def dA : Flag<["-"], "dA">, Group;
 def dD : Flag<["-"], "dD">, Group, Flags<[CC1Option]>,
   HelpText<"Print macro definitions in -E mode in addition to normal output">;

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=324878&r1=324877&r2=324878&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Mon Feb 12 02:46:45 2018
@@ -4658,14 +4658,20 @@ void Clang::ConstructJob(Compilation &C,
 CmdArgs.push_back(Args.MakeArgString(Flags));
   }
 
-  // Host-side cuda compilation receives device-side outputs as Inputs[1...].
-  // Include them with -fcuda-include-gpubinary.
-  if (IsCuda && Inputs.size() > 1)
-    for (auto I = std::next(Inputs.begin()), E = Inputs.end(); I != E; ++I) {
-      CmdArgs.push_back("-fcuda-include-gpubinary");
-      CmdArgs.push_back(I->getFilename());
+  if (IsCuda) {
+    // Host-side cuda compilation receives device-side outputs as Inputs[1...].
+    // Include them with -fcuda-include-gpubinary.
+    if (Inputs.size() > 1) {
+      for (auto I = std::next(Inputs.begin()), E = Inputs.end(); I != E; ++I) {
+        CmdArgs.push_back("-fcuda-include-gpubinary");
+        CmdArgs.push_back(I->getFilename());
+      }
     }
 
+    if (Args.hasFlag(options::OPT_fcuda_rdc, options::OPT_fno_cuda_rdc, false))
+      CmdArgs.push_back("-fcuda-rdc");
+  }
+
   // OpenMP offloading device jobs take the argument -fopenmp-host-ir-file-path
   // to specify the result of the compile phase on the host, so the meaningful
   // device declarations can be identified. Also, -fopenmp-is-device is passed
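The `Args.hasFlag(options::OPT_fcuda_rdc, options::OPT_fno_cuda_rdc, false)` call in the hunk above resolves a positive/negative flag pair, with the last occurrence on the command line deciding the result and `false` used when neither is present. A minimal standalone sketch of that resolution follows; `lastFlagWins` and `exampleRdcFlag` are hypothetical names for illustration, not part of the Clang driver API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns true if `positive` is the last of the two
// flags on the command line, false if `negative` is, and `defaultValue`
// if neither appears.
bool lastFlagWins(const std::vector<std::string> &args,
                  const std::string &positive, const std::string &negative,
                  bool defaultValue) {
  bool value = defaultValue;
  for (const std::string &arg : args) {
    if (arg == positive)
      value = true;
    else if (arg == negative)
      value = false;
  }
  return value;
}

// Usage: the default matches the `false` passed in the driver hunk.
void exampleRdcFlag() {
  assert(!lastFlagWins({"-c"}, "-fcuda-rdc", "-fno-cuda-rdc", false));
  assert(lastFlagWins({"-fno-cuda-rdc", "-fcuda-rdc"}, "-fcuda-rdc",
                      "-fno-cuda-rdc", false));
}
```

With this behaviour a build system can append `-fno-cuda-rdc` after an earlier `-fcuda-rdc` to switch the feature off again, and only the positive outcome is forwarded to the frontend.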

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=324878&r1=324877&r2=324878&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Mon Feb 12 02:46:45 2018
@@ -355,11 +355,17 @@ void NVPTX::Assembler::ConstructJob(Comp
   for (const auto& A : Args.getAllArgValues(options::OPT_Xcuda_ptxas))
 CmdArgs.push_back(Args.MakeArgString(A));
 
-  // In OpenMP we need to 

r325136 - [CUDA] Allow external variables in separate compilation

2018-02-14 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Wed Feb 14 08:04:03 2018
New Revision: 325136

URL: http://llvm.org/viewvc/llvm-project?rev=325136&view=rev
Log:
[CUDA] Allow external variables in separate compilation

According to the CUDA Programming Guide this is prohibited in
whole program compilation mode. This makes sense because external
references cannot be satisfied in that mode anyway. However,
such variables are allowed in separate compilation mode which
is a valid use case.

Differential Revision: https://reviews.llvm.org/D42923

Modified:
cfe/trunk/lib/Sema/SemaDeclAttr.cpp
cfe/trunk/test/SemaCUDA/extern-shared.cu

Modified: cfe/trunk/lib/Sema/SemaDeclAttr.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaDeclAttr.cpp?rev=325136&r1=325135&r2=325136&view=diff
==
--- cfe/trunk/lib/Sema/SemaDeclAttr.cpp (original)
+++ cfe/trunk/lib/Sema/SemaDeclAttr.cpp Wed Feb 14 08:04:03 2018
@@ -4112,7 +4112,8 @@ static void handleSharedAttr(Sema &S, De
   auto *VD = cast<VarDecl>(D);
   // extern __shared__ is only allowed on arrays with no length (e.g.
   // "int x[]").
-  if (VD->hasExternalStorage() && !isa<IncompleteArrayType>(VD->getType())) {
+  if (!S.getLangOpts().CUDARelocatableDeviceCode && VD->hasExternalStorage() &&
+      !isa<IncompleteArrayType>(VD->getType())) {
     S.Diag(Attr.getLoc(), diag::err_cuda_extern_shared) << VD;
     return;
   }
   }
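The changed condition can be read as a three-input predicate. The sketch below restates it outside of Sema, assuming (as the comment in the hunk suggests) that the type check tests for an array with no length; `SharedVarInfo`, `shouldRejectExternShared` and `exampleExternShared` are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical summary of the two properties the hunk inspects on the variable.
struct SharedVarInfo {
  bool hasExternalStorage;   // declared `extern`
  bool isArrayWithoutLength; // e.g. "int x[]"
};

// Mirrors the new condition: the diagnostic is only emitted in whole-program
// mode, for extern variables that are not arrays of unknown length.
bool shouldRejectExternShared(bool relocatableDeviceCode,
                              const SharedVarInfo &var) {
  return !relocatableDeviceCode && var.hasExternalStorage &&
         !var.isArrayWithoutLength;
}

void exampleExternShared() {
  const SharedVarInfo externScalar{true, false};
  const SharedVarInfo externUnsizedArray{true, true};
  assert(shouldRejectExternShared(false, externScalar));        // still an error
  assert(!shouldRejectExternShared(false, externUnsizedArray)); // always allowed
  assert(!shouldRejectExternShared(true, externScalar));        // allowed with -fcuda-rdc
}
```

The only behavioural change is the third case: with relocatable device code enabled, an external reference can be satisfied by another translation unit, so the diagnostic is skipped.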

Modified: cfe/trunk/test/SemaCUDA/extern-shared.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/SemaCUDA/extern-shared.cu?rev=325136&r1=325135&r2=325136&view=diff
==
--- cfe/trunk/test/SemaCUDA/extern-shared.cu (original)
+++ cfe/trunk/test/SemaCUDA/extern-shared.cu Wed Feb 14 08:04:03 2018
@@ -1,6 +1,11 @@
 // RUN: %clang_cc1 -fsyntax-only -verify %s
 // RUN: %clang_cc1 -fsyntax-only -fcuda-is-device -verify %s
 
+// RUN: %clang_cc1 -fsyntax-only -fcuda-rdc -verify=rdc %s
+// RUN: %clang_cc1 -fsyntax-only -fcuda-is-device -fcuda-rdc -verify=rdc %s
+// These declarations are fine in separate compilation mode:
+// rdc-no-diagnostics
+
 #include "Inputs/cuda.h"
 
 __device__ void foo() {




Re: r325391 - [OPENMP] Do not emit messages for templates in declare target

2018-02-17 Thread Jonas Hahnfeld via cfe-commits

Hi Alexey,

I think that's mostly my test case from PR35348?

Am 2018-02-16 22:23, schrieb Alexey Bataev via cfe-commits:

Author: abataev
Date: Fri Feb 16 13:23:23 2018
New Revision: 325391

URL: http://llvm.org/viewvc/llvm-project?rev=325391&view=rev
Log:
[OPENMP] Do not emit messages for templates in declare target
constructs.

The compiler may emit some extra warnings for functions, that are
implicit specialization of the templates, declared in the target region.


Modified:
cfe/trunk/lib/Sema/SemaOpenMP.cpp
cfe/trunk/test/OpenMP/declare_target_messages.cpp

[...]

Modified: cfe/trunk/test/OpenMP/declare_target_messages.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/declare_target_messages.cpp?rev=325391&r1=325390&r2=325391&view=diff
==
--- cfe/trunk/test/OpenMP/declare_target_messages.cpp (original)
+++ cfe/trunk/test/OpenMP/declare_target_messages.cpp Fri Feb 16 13:23:23 2018

@@ -33,6 +33,33 @@ struct NonT {

 typedef int sint;

+template 
+T bla1() { return 0; }
+
+#pragma omp declare target
+template 
+T bla2() { return 0; }
+#pragma omp end declare target
+
+template<>
+float bla2() { return 1.0; }
+
+#pragma omp declare target
+void blub2() {
+  bla2();


I don't agree with this: The compiler has to warn about calling an 
explicit template specialization that is outside of any 'declare target' 
region. That's at least the case for OpenMP 4.5, I know that there are 
changes for OpenMP 5.0. But in that case the compiler needs to add an 
implicit 'declare target' attribute to generate correct code.
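
To make the objection concrete, here is a reduced sketch of the pattern under discussion. The template parameter lists and the `<float>` arguments are assumptions filled in for illustration (the archive stripped them from the quoted test), and `exampleDeclareTarget` is a hypothetical host-side check.

```cpp
#include <cassert>

#pragma omp declare target
template <typename T>
T bla2() { return 0; }
#pragma omp end declare target

// Explicit specialization that lies outside of any 'declare target' region.
template <>
float bla2<float>() { return 1.0f; }

#pragma omp declare target
float blub2() {
  // Resolves to the explicit specialization above, not to the primary
  // template that was declared inside the 'declare target' region.
  return bla2<float>();
}
#pragma omp end declare target

// On the host the specialization is what gets called.
void exampleDeclareTarget() {
  assert(blub2() == 1.0f);
}
```

Because the call in `blub2` binds to the specialization, device code for `blub2` needs a device version of a function that was never marked 'declare target', which is the situation the reply argues should be diagnosed under OpenMP 4.5 or handled by an implicit attribute.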



r325805 - [docs] Fix duplicate arguments for JoinedAndSeparate

2018-02-22 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Thu Feb 22 09:06:27 2018
New Revision: 325805

URL: http://llvm.org/viewvc/llvm-project?rev=325805&view=rev
Log:
[docs] Fix duplicate arguments for JoinedAndSeparate

We can't see how many arguments are in the meta var name, so just
assume that it is the right number.

Differential Revision: https://reviews.llvm.org/D42840

Modified:
cfe/trunk/utils/TableGen/ClangOptionDocEmitter.cpp

Modified: cfe/trunk/utils/TableGen/ClangOptionDocEmitter.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/utils/TableGen/ClangOptionDocEmitter.cpp?rev=325805&r1=325804&r2=325805&view=diff
==
--- cfe/trunk/utils/TableGen/ClangOptionDocEmitter.cpp (original)
+++ cfe/trunk/utils/TableGen/ClangOptionDocEmitter.cpp Thu Feb 22 09:06:27 2018
@@ -245,19 +245,27 @@ void emitOptionWithArgs(StringRef Prefix
 void emitOptionName(StringRef Prefix, const Record *Option, raw_ostream &OS) {
   // Find the arguments to list after the option.
   unsigned NumArgs = getNumArgsForKind(Option->getValueAsDef("Kind"), Option);
+  bool HasMetaVarName = !Option->isValueUnset("MetaVarName");
 
   std::vector Args;
-  if (!Option->isValueUnset("MetaVarName"))
+  if (HasMetaVarName)
 Args.push_back(Option->getValueAsString("MetaVarName"));
   else if (NumArgs == 1)
 Args.push_back("");
 
-  while (Args.size() < NumArgs) {
-Args.push_back(("").str());
-// Use '--args  ...' if any number of args are allowed.
-if (Args.size() == 2 && NumArgs == UnlimitedArgs) {
-  Args.back() += "...";
-  break;
+  // Fill up arguments if this option didn't provide a meta var name or it
+  // supports an unlimited number of arguments. We can't see how many arguments
+  // already are in a meta var name, so assume it has right number. This is
+  // needed for JoinedAndSeparate options so that there arent't too many
+  // arguments.
+  if (!HasMetaVarName || NumArgs == UnlimitedArgs) {
+while (Args.size() < NumArgs) {
+  Args.push_back(("").str());
+  // Use '--args  ...' if any number of args are allowed.
+  if (Args.size() == 2 && NumArgs == UnlimitedArgs) {
+Args.back() += "...";
+break;
+  }
 }
   }
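The effect of the new guard can be checked with a standalone sketch of the argument-filling logic. This is a simplified re-statement, not the TableGen emitter itself: the placeholder strings such as `<arg1>` are assumptions (the archive stripped the originals), an empty `metaVarName` stands in for an unset "MetaVarName", and `UnlimitedArgs`, `listArgs` and `exampleListArgs` are names defined here for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumed sentinel for "any number of arguments".
constexpr unsigned UnlimitedArgs = static_cast<unsigned>(-1);

// Builds the list of argument placeholders shown after an option name.
std::vector<std::string> listArgs(const std::string &metaVarName,
                                  unsigned numArgs) {
  const bool hasMetaVarName = !metaVarName.empty();

  std::vector<std::string> args;
  if (hasMetaVarName)
    args.push_back(metaVarName);
  else if (numArgs == 1)
    args.push_back("<arg>");

  // Only fill up when there is no meta var name or the option takes an
  // unlimited number of arguments; a meta var name is trusted to already
  // describe the right number of arguments.
  if (!hasMetaVarName || numArgs == UnlimitedArgs) {
    while (args.size() < numArgs) {
      args.push_back("<arg" + std::to_string(args.size() + 1) + ">");
      if (args.size() == 2 && numArgs == UnlimitedArgs) {
        args.back() += "...";
        break;
      }
    }
  }
  return args;
}

void exampleListArgs() {
  // A two-argument option whose meta var name already covers both arguments
  // no longer gets an extra placeholder appended.
  const std::vector<std::string> joined = listArgs("<triple> <arg>", 2);
  assert(joined.size() == 1 && joined[0] == "<triple> <arg>");
}
```

Before the change the loop ran unconditionally, so the same input would have produced the meta var name followed by a second generated placeholder, which is the duplicate the commit title refers to.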
 




r325806 - [docs] Improve help for OpenMP options, NFC.

2018-02-22 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Thu Feb 22 09:06:35 2018
New Revision: 325806

URL: http://llvm.org/viewvc/llvm-project?rev=325806&view=rev
Log:
[docs] Improve help for OpenMP options, NFC.

 * Add HelpText for -fopenmp so that it appears in clang --help.
 * Hide -fno-openmp-simd, only list the positive option.
 * Hide -fopenmp-relocatable-target and -fopenmp-use-tls from
   clang --help and from ClangCommandLineReference.
 * Improve MetaVarName for -Xopenmp-target=<...>.

Differential Revision: https://reviews.llvm.org/D42841

Modified:
cfe/trunk/include/clang/Driver/Options.td

Modified: cfe/trunk/include/clang/Driver/Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=325806&r1=325805&r2=325806&view=diff
==
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Thu Feb 22 09:06:35 2018
@@ -466,7 +466,8 @@ def Xcuda_ptxas : Separate<["-"], "Xcuda
 def Xopenmp_target : Separate<["-"], "Xopenmp-target">,
   HelpText<"Pass  to the target offloading toolchain.">, MetaVarName<"">;
 def Xopenmp_target_EQ : JoinedAndSeparate<["-"], "Xopenmp-target=">,
-  HelpText<"Pass  to the specified target offloading toolchain. The triple that identifies the toolchain must be provided after the equals sign.">, MetaVarName<"">;
+  HelpText<"Pass  to the target offloading toolchain identified by .">,
+  MetaVarName<" ">;
 def z : Separate<["-"], "z">, Flags<[LinkerInput, RenderAsInput]>,
   HelpText<"Pass -z  to the linker">, MetaVarName<"">,
   Group;
@@ -1397,24 +1398,26 @@ def fno_objc_nonfragile_abi : Flag<["-"]
 
 def fobjc_sender_dependent_dispatch : Flag<["-"], "fobjc-sender-dependent-dispatch">, Group;
 def fomit_frame_pointer : Flag<["-"], "fomit-frame-pointer">, Group;
-def fopenmp : Flag<["-"], "fopenmp">, Group, Flags<[CC1Option, NoArgumentUnused]>;
+def fopenmp : Flag<["-"], "fopenmp">, Group, Flags<[CC1Option, NoArgumentUnused]>,
+  HelpText<"Parse OpenMP pragmas and generate parallel code.">;
 def fno_openmp : Flag<["-"], "fno-openmp">, Group, Flags<[NoArgumentUnused]>;
 def fopenmp_version_EQ : Joined<["-"], "fopenmp-version=">, Group, Flags<[CC1Option, NoArgumentUnused]>;
 def fopenmp_EQ : Joined<["-"], "fopenmp=">, Group;
-def fopenmp_use_tls : Flag<["-"], "fopenmp-use-tls">, Group, Flags<[NoArgumentUnused]>;
-def fnoopenmp_use_tls : Flag<["-"], "fnoopenmp-use-tls">, Group, Flags<[CC1Option, NoArgumentUnused]>;
+def fopenmp_use_tls : Flag<["-"], "fopenmp-use-tls">, Group,
+  Flags<[NoArgumentUnused, HelpHidden]>;
+def fnoopenmp_use_tls : Flag<["-"], "fnoopenmp-use-tls">, Group,
+  Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
 def fopenmp_targets_EQ : CommaJoined<["-"], "fopenmp-targets=">, Flags<[DriverOption, CC1Option]>,
   HelpText<"Specify comma-separated list of triples OpenMP offloading targets to be supported">;
-def fopenmp_dump_offload_linker_script : Flag<["-"], "fopenmp-dump-offload-linker-script">, Group,
-  Flags<[NoArgumentUnused]>;
-def fopenmp_relocatable_target : Flag<["-"], "fopenmp-relocatable-target">, Group, Flags<[CC1Option, NoArgumentUnused]>,
-  HelpText<"OpenMP target code is compiled as relocatable using the -c flag. For OpenMP targets the code is relocatable by default.">;
-def fnoopenmp_relocatable_target : Flag<["-"], "fnoopenmp-relocatable-target">, Group, Flags<[CC1Option, NoArgumentUnused]>,
-  HelpText<"Do not compile OpenMP target code as relocatable.">;
+def fopenmp_dump_offload_linker_script : Flag<["-"], "fopenmp-dump-offload-linker-script">,
+  Group, Flags<[NoArgumentUnused, HelpHidden]>;
+def fopenmp_relocatable_target : Flag<["-"], "fopenmp-relocatable-target">,
+  Group, Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
+def fnoopenmp_relocatable_target : Flag<["-"], "fnoopenmp-relocatable-target">,
+  Group, Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
 def fopenmp_simd : Flag<["-"], "fopenmp-simd">, Group, Flags<[CC1Option, NoArgumentUnused]>,
   HelpText<"Emit OpenMP code only for SIMD-based constructs.">;
-def fno_openmp_simd : Flag<["-"], "fno-openmp-simd">, Group, Flags<[CC1Option, NoArgumentUnused]>,
-  HelpText<"Disable OpenMP code for SIMD-based constructs.">;
+def fno_openmp_simd : Flag<["-"], "fno-openmp-simd">, Group, Flags<[CC1Option, NoArgumentUnused]>;
 def fno_optimize_sibling_calls : Flag<["-"], "fno-optimize-sibling-calls">, Group;
 def foptimize_sibling_calls : Flag<["-"], "foptimize-sibling-calls">, Group;
 def force__cpusubtype__ALL : Flag<["-"], "force_cpusubtype_ALL">;




r325807 - [docs] Regenerate command line reference

2018-02-22 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Thu Feb 22 09:10:28 2018
New Revision: 325807

URL: http://llvm.org/viewvc/llvm-project?rev=325807&view=rev
Log:
[docs] Regenerate command line reference

Modified:
cfe/trunk/docs/ClangCommandLineReference.rst

Modified: cfe/trunk/docs/ClangCommandLineReference.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ClangCommandLineReference.rst?rev=325807&r1=325806&r2=325807&view=diff
==
--- cfe/trunk/docs/ClangCommandLineReference.rst (original)
+++ cfe/trunk/docs/ClangCommandLineReference.rst Thu Feb 22 09:10:28 2018
@@ -61,10 +61,10 @@ Pass  to the ptxas assembler
 Pass  to the target offloading toolchain.
 
 .. program:: clang1
-.. option:: -Xopenmp-target= 
+.. option:: -Xopenmp-target= 
 .. program:: clang
 
-Pass  to the specified target offloading toolchain. The triple that identifies the toolchain must be provided after the equals sign.
+Pass  to the target offloading toolchain identified by .
 
 .. option:: -Z
 
@@ -710,6 +710,14 @@ Print source range spans in numeric form
 
 .. option:: -fdiagnostics-show-category=
 
+.. option:: -fdiscard-value-names, -fno-discard-value-names
+
+Discard value names in LLVM IR
+
+.. option:: -fexperimental-isel, -fno-experimental-isel
+
+Enables the experimental global instruction selector
+
 .. option:: -fexperimental-new-pass-manager, -fno-experimental-new-pass-manager
 
 Enables an experimental new pass manager in LLVM.
@@ -744,6 +752,10 @@ Level of field padding for AddressSaniti
 
 Enable linker dead stripping of globals in AddressSanitizer
 
+.. option:: -fsanitize-address-poison-class-member-array-new-cookie, -fno-sanitize-address-poison-class-member-array-new-cookie
+
+Enable poisoning array cookies when using class member operator new\[\] in AddressSanitizer
+
 .. option:: -fsanitize-address-use-after-scope, -fno-sanitize-address-use-after-scope
 
 Enable use-after-scope detection in AddressSanitizer
@@ -876,6 +888,10 @@ Add directory to include search path
 
 Restrict all prior -I flags to double-quoted inclusion and remove current directory from include path
 
+.. option:: --cuda-path-ignore-env
+
+Ignore environment variables to detect CUDA installation
+
 .. option:: --cuda-path=
 
 CUDA installation path
@@ -1507,12 +1523,6 @@ Do not treat C++ operator name keywords
 
 .. option:: -fno-working-directory
 
-.. option:: -fnoopenmp-relocatable-target
-
-Do not compile OpenMP target code as relocatable.
-
-.. option:: -fnoopenmp-use-tls
-
 .. option:: -fobjc-abi-version=
 
 .. option:: -fobjc-arc, -fno-objc-arc
@@ -1551,18 +1561,12 @@ Enable ARC-style weak references in Obje
 
 .. option:: -fopenmp, -fno-openmp
 
-.. option:: -fopenmp-dump-offload-linker-script
-
-.. option:: -fopenmp-relocatable-target
-
-OpenMP target code is compiled as relocatable using the -c flag. For OpenMP targets the code is relocatable by default.
+Parse OpenMP pragmas and generate parallel code.
 
 .. option:: -fopenmp-simd, -fno-openmp-simd
 
 Emit OpenMP code only for SIMD-based constructs.
 
-.. option:: -fopenmp-use-tls
-
 .. option:: -fopenmp-version=
 
 .. program:: clang1
@@ -1748,7 +1752,7 @@ Enable the superword-level parallelism v
 
 .. option:: -fsplit-dwarf-inlining, -fno-split-dwarf-inlining
 
-Place debug types in their own section (ELF Only)
+Provide minimal debug info in the object/executable to facilitate online symbolication/stack traces in the absence of .dwo/.dwp files when using Split DWARF
 
 .. option:: -fsplit-stack
 
@@ -1974,6 +1978,10 @@ OpenCL language standard to compile for.
 
 OpenCL only. This option is added for compatibility with OpenCL 1.0.
 
+.. option:: -cl-uniform-work-group-size
+
+OpenCL only. Defines that the global work-size be a multiple of the work-group size specified to clEnqueueNDRangeKernel
+
 .. option:: -cl-unsafe-math-optimizations
 
 OpenCL only. Allow unsafe floating-point optimizations.  Also implies -cl-no-signed-zeros and -cl-mad-enable.
@@ -2086,6 +2094,10 @@ Use Intel MCU ABI
 
 (integrated-as) Emit an object file which can be used with an incremental linker
 
+.. option:: -mindirect-jump=
+
+Change indirect jump instructions to inhibit speculation
+
 .. option:: -miphoneos-version-min=, -mios-version-min=
 
 .. option:: -mips16
@@ -2436,6 +2448,8 @@ X86
 
 .. option:: -mrtm, -mno-rtm
 
+.. option:: -msahf, -mno-sahf
+
 .. option:: -msgx, -mno-sgx
 
 .. option:: -msha, -mno-sha


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r321486 - [OpenMP] Further adjustments of nvptx runtime functions

2017-12-27 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Wed Dec 27 02:39:56 2017
New Revision: 321486

URL: http://llvm.org/viewvc/llvm-project?rev=321486&view=rev
Log:
[OpenMP] Further adjustments of nvptx runtime functions

Pass in default value of 1, similar to previous commit r318836.

Differential Revision: https://reviews.llvm.org/D41012

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=321486&r1=321485&r2=321486&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Wed Dec 27 02:39:56 2017
@@ -33,10 +33,11 @@ enum OpenMPRTLFunctionNVPTX {
   /// \brief Call to void __kmpc_spmd_kernel_deinit();
   OMPRTL_NVPTX__kmpc_spmd_kernel_deinit,
   /// \brief Call to void __kmpc_kernel_prepare_parallel(void
-  /// *outlined_function, void ***args, kmp_int32 nArgs);
+  /// *outlined_function, void ***args, kmp_int32 nArgs, int16_t
+  /// IsOMPRuntimeInitialized);
   OMPRTL_NVPTX__kmpc_kernel_prepare_parallel,
   /// \brief Call to bool __kmpc_kernel_parallel(void **outlined_function, void
-  /// ***args);
+  /// ***args, int16_t IsOMPRuntimeInitialized);
   OMPRTL_NVPTX__kmpc_kernel_parallel,
   /// \brief Call to void __kmpc_kernel_end_parallel();
   OMPRTL_NVPTX__kmpc_kernel_end_parallel,
@@ -521,7 +522,9 @@ void CGOpenMPRuntimeNVPTX::emitWorkerLoo
   // Set up shared arguments
   Address SharedArgs =
   CGF.CreateDefaultAlignTempAlloca(CGF.Int8PtrPtrTy, "shared_args");
-  llvm::Value *Args[] = {WorkFn.getPointer(), SharedArgs.getPointer()};
+  // TODO: Optimize runtime initialization and pass in correct value.
+  llvm::Value *Args[] = {WorkFn.getPointer(), SharedArgs.getPointer(),
+ /*RequiresOMPRuntime=*/Bld.getInt16(1)};
   llvm::Value *Ret = CGF.EmitRuntimeCall(
   createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_kernel_parallel), Args);
   Bld.CreateStore(Bld.CreateZExt(Ret, CGF.Int8Ty), ExecStatus);
@@ -637,18 +640,21 @@ CGOpenMPRuntimeNVPTX::createNVPTXRuntime
   }
   case OMPRTL_NVPTX__kmpc_kernel_prepare_parallel: {
 /// Build void __kmpc_kernel_prepare_parallel(
-/// void *outlined_function, void ***args, kmp_int32 nArgs);
+/// void *outlined_function, void ***args, kmp_int32 nArgs, int16_t
+/// IsOMPRuntimeInitialized);
 llvm::Type *TypeParams[] = {CGM.Int8PtrTy,
-CGM.Int8PtrPtrTy->getPointerTo(0), CGM.Int32Ty};
+CGM.Int8PtrPtrTy->getPointerTo(0), CGM.Int32Ty,
+CGM.Int16Ty};
 llvm::FunctionType *FnTy =
 llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false);
 RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_kernel_prepare_parallel");
 break;
   }
   case OMPRTL_NVPTX__kmpc_kernel_parallel: {
-/// Build bool __kmpc_kernel_parallel(void **outlined_function, void ***args);
+/// Build bool __kmpc_kernel_parallel(void **outlined_function, void
+/// ***args, int16_t IsOMPRuntimeInitialized);
 llvm::Type *TypeParams[] = {CGM.Int8PtrPtrTy,
-CGM.Int8PtrPtrTy->getPointerTo(0)};
+CGM.Int8PtrPtrTy->getPointerTo(0), CGM.Int16Ty};
 llvm::Type *RetTy = CGM.getTypes().ConvertType(CGM.getContext().BoolTy);
 llvm::FunctionType *FnTy =
 llvm::FunctionType::get(RetTy, TypeParams, /*isVarArg*/ false);
@@ -949,8 +955,10 @@ void CGOpenMPRuntimeNVPTX::emitGenericPa
   CGF.CreateDefaultAlignTempAlloca(CGF.VoidPtrPtrTy,
   "shared_args");
   llvm::Value *SharedArgsPtr = SharedArgs.getPointer();
+  // TODO: Optimize runtime initialization and pass in correct value.
   llvm::Value *Args[] = {ID, SharedArgsPtr,
- Bld.getInt32(CapturedVars.size())};
+ Bld.getInt32(CapturedVars.size()),
+ /*RequiresOMPRuntime=*/Bld.getInt16(1)};
 
   CGF.EmitRuntimeCall(
   createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_kernel_prepare_parallel),
@@ -970,9 +978,10 @@ void CGOpenMPRuntimeNVPTX::emitGenericPa
 Idx++;
   }
 } else {
-  llvm::Value *Args[] = {ID,
-  llvm::ConstantPointerNull::get(CGF.VoidPtrPtrTy->getPointerTo(0)),
-  /*nArgs=*/Bld.getInt32(0)};
+  // TODO: Optimize runtime initialization and pass in correct value.
+  llvm::Value *Args[] = {
+  ID, llvm::ConstantPointerNull::get(CGF.VoidPtrPtrTy->getPointerTo(0)),
+  /*nArgs=*/Bld.getInt32(0), /*RequiresOMPRuntime=*/Bld.getInt16(1)};
   CGF.EmitRuntimeCall(
   createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_kernel_prepare_parallel),
   Args);

Modified: cfe/trunk/test/OpenMP/nvptx

Re: r321816 - [OPENMP] Add debug info for generated functions.

2018-01-04 Thread Jonas Hahnfeld via cfe-commits

Hi Alexey,

should this change be backported to 6.0?

Regards,
Jonas

On 2018-01-04 20:45, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Thu Jan  4 11:45:16 2018
New Revision: 321816

URL: http://llvm.org/viewvc/llvm-project?rev=321816&view=rev
Log:
[OPENMP] Add debug info for generated functions.

Most of the functions generated for OpenMP were generated with
debug info disabled. This patch fixes that for a better user experience.

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
cfe/trunk/test/OpenMP/target_parallel_debug_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=321816&r1=321815&r2=321816&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Thu Jan  4 11:45:16 2018
@@ -1216,7 +1216,8 @@ emitCombinerOrInitializer(CodeGenModule
   CodeGenFunction CGF(CGM);
   // Map "T omp_in;" variable to "*omp_in_parm" value in all expressions.
   // Map "T omp_out;" variable to "*omp_out_parm" value in all expressions.
-  CGF.StartFunction(GlobalDecl(), C.VoidTy, Fn, FnInfo, Args);
+  CGF.StartFunction(GlobalDecl(), C.VoidTy, Fn, FnInfo, Args, In->getLocation(),
+Out->getLocation());
   CodeGenFunction::OMPPrivateScope Scope(CGF);
   Address AddrIn = CGF.GetAddrOfLocalVar(&OmpInParm);
   Scope.addPrivate(In, [&CGF, AddrIn, PtrTy]() -> Address {
@@ -2383,7 +2384,8 @@ llvm::Function *CGOpenMPRuntime::emitThr
   // threadprivate copy of the variable VD
   CodeGenFunction CtorCGF(CGM);
   FunctionArgList Args;
-  ImplicitParamDecl Dst(CGM.getContext(), CGM.getContext().VoidPtrTy,
+  ImplicitParamDecl Dst(CGM.getContext(), /*DC=*/nullptr, Loc,
+/*Id=*/nullptr, CGM.getContext().VoidPtrTy,
 ImplicitParamDecl::Other);
   Args.push_back(&Dst);

@@ -2393,13 +2395,13 @@ llvm::Function *CGOpenMPRuntime::emitThr
   auto Fn = CGM.CreateGlobalInitOrDestructFunction(
   FTy, ".__kmpc_global_ctor_.", FI, Loc);
   CtorCGF.StartFunction(GlobalDecl(), CGM.getContext().VoidPtrTy, Fn, FI,
-Args, SourceLocation());
+Args, Loc, Loc);
   auto ArgVal = CtorCGF.EmitLoadOfScalar(
   CtorCGF.GetAddrOfLocalVar(&Dst), /*Volatile=*/false,
   CGM.getContext().VoidPtrTy, Dst.getLocation());
   Address Arg = Address(ArgVal, VDAddr.getAlignment());
-  Arg = CtorCGF.Builder.CreateElementBitCast(Arg,
- CtorCGF.ConvertTypeForMem(ASTTy));
+  Arg = CtorCGF.Builder.CreateElementBitCast(
+  Arg, CtorCGF.ConvertTypeForMem(ASTTy));
   CtorCGF.EmitAnyExprToMem(Init, Arg, Init->getType().getQualifiers(),
 /*IsInitializer=*/true);
   ArgVal = CtorCGF.EmitLoadOfScalar(
@@ -2414,7 +2416,8 @@ llvm::Function *CGOpenMPRuntime::emitThr
   // of the variable VD
   CodeGenFunction DtorCGF(CGM);
   FunctionArgList Args;
-  ImplicitParamDecl Dst(CGM.getContext(), CGM.getContext().VoidPtrTy,
+  ImplicitParamDecl Dst(CGM.getContext(), /*DC=*/nullptr, Loc,
+/*Id=*/nullptr, CGM.getContext().VoidPtrTy,
 ImplicitParamDecl::Other);
   Args.push_back(&Dst);

@@ -2425,7 +2428,7 @@ llvm::Function *CGOpenMPRuntime::emitThr
   FTy, ".__kmpc_global_dtor_.", FI, Loc);
   auto NL = ApplyDebugLocation::CreateEmpty(DtorCGF);
   DtorCGF.StartFunction(GlobalDecl(), CGM.getContext().VoidTy, Fn, FI, Args,
-SourceLocation());
+Loc, Loc);
   // Create a scope with an artificial location for the body of this function.
   auto AL = ApplyDebugLocation::CreateArtificial(DtorCGF);
   auto ArgVal = DtorCGF.EmitLoadOfScalar(
@@ -2469,7 +2472,7 @@ llvm::Function *CGOpenMPRuntime::emitThr
   FunctionArgList ArgList;
   InitCGF.StartFunction(GlobalDecl(), CGM.getContext().VoidTy, InitFunction,
 CGM.getTypes().arrangeNullaryFunction(), ArgList,
-Loc);
+Loc, Loc);
   emitThreadPrivateVarInit(InitCGF, VDAddr, Ctor, CopyCtor, Dtor, Loc);

   InitCGF.FinishFunction();
   return InitFunction;
@@ -2783,12 +2786,15 @@ static Address emitAddrOfVarFromArray(Co
 static llvm::Value *emitCopyprivateCopyFunction(
 CodeGenModule &CGM, llvm::Type *ArgsType,
 ArrayRef CopyprivateVars, ArrayRef DestExprs,
-ArrayRef SrcExprs, ArrayRef AssignmentOps) {
+ArrayRef SrcExprs, ArrayRef AssignmentOps,
+SourceLocation Loc) {
   a

Re: r321816 - [OPENMP] Add debug info for generated functions.

2018-01-04 Thread Jonas Hahnfeld via cfe-commits
You mean r321818 and r321820? I skipped them because they are for NVPTX 
and target directives which aren't fully functional in 6.0 anyway, 
right?

Or patches in the future?

On 2018-01-04 21:58, Alexey Bataev wrote:

Hi Jonas, I don't think it is necessary. It would be better to backport
my next 2 patches, which contain bug fixes.

Best regards,
Alexey

On 04.01.2018 15:54, Jonas Hahnfeld wrote:


Hi Alexey,

should this change be backported to 6.0?

Regards,
Jonas

On 2018-01-04 20:45, Alexey Bataev via cfe-commits wrote:


Author: abataev
Date: Thu Jan  4 11:45:16 2018
New Revision: 321816

URL: http://llvm.org/viewvc/llvm-project?rev=321816&view=rev

Log:
[OPENMP] Add debug info for generated functions.

Most of the functions generated for OpenMP were generated with
debug info disabled. This patch fixes that for a better user experience.

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
cfe/trunk/test/OpenMP/target_parallel_debug_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=321816&r1=321815&r2=321816&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Thu Jan  4 11:45:16 2018
@@ -1216,7 +1216,8 @@ emitCombinerOrInitializer(CodeGenModule
CodeGenFunction CGF(CGM);
// Map "T omp_in;" variable to "*omp_in_parm" value in all expressions.
// Map "T omp_out;" variable to "*omp_out_parm" value in all expressions.
-  CGF.StartFunction(GlobalDecl(), C.VoidTy, Fn, FnInfo, Args);
+  CGF.StartFunction(GlobalDecl(), C.VoidTy, Fn, FnInfo, Args, In->getLocation(),
+Out->getLocation());
CodeGenFunction::OMPPrivateScope Scope(CGF);
Address AddrIn = CGF.GetAddrOfLocalVar(&OmpInParm);
Scope.addPrivate(In, [&CGF, AddrIn, PtrTy]() -> Address {
@@ -2383,7 +2384,8 @@ llvm::Function *CGOpenMPRuntime::emitThr
// threadprivate copy of the variable VD
CodeGenFunction CtorCGF(CGM);
FunctionArgList Args;
-  ImplicitParamDecl Dst(CGM.getContext(), CGM.getContext().VoidPtrTy,
+  ImplicitParamDecl Dst(CGM.getContext(), /*DC=*/nullptr, Loc,
+/*Id=*/nullptr, CGM.getContext().VoidPtrTy,
ImplicitParamDecl::Other);
Args.push_back(&Dst);

@@ -2393,13 +2395,13 @@ llvm::Function *CGOpenMPRuntime::emitThr
auto Fn = CGM.CreateGlobalInitOrDestructFunction(
FTy, ".__kmpc_global_ctor_.", FI, Loc);
CtorCGF.StartFunction(GlobalDecl(), CGM.getContext().VoidPtrTy, Fn, FI,
-Args, SourceLocation());
+Args, Loc, Loc);
auto ArgVal = CtorCGF.EmitLoadOfScalar(
CtorCGF.GetAddrOfLocalVar(&Dst), /*Volatile=*/false,
CGM.getContext().VoidPtrTy, Dst.getLocation());
Address Arg = Address(ArgVal, VDAddr.getAlignment());
-  Arg = CtorCGF.Builder.CreateElementBitCast(Arg,
-  CtorCGF.ConvertTypeForMem(ASTTy));
+  Arg = CtorCGF.Builder.CreateElementBitCast(
+  Arg, CtorCGF.ConvertTypeForMem(ASTTy));
CtorCGF.EmitAnyExprToMem(Init, Arg, Init->getType().getQualifiers(),
/*IsInitializer=*/true);
ArgVal = CtorCGF.EmitLoadOfScalar(
@@ -2414,7 +2416,8 @@ llvm::Function *CGOpenMPRuntime::emitThr
// of the variable VD
CodeGenFunction DtorCGF(CGM);
FunctionArgList Args;
-  ImplicitParamDecl Dst(CGM.getContext(), CGM.getContext().VoidPtrTy,
+  ImplicitParamDecl Dst(CGM.getContext(), /*DC=*/nullptr, Loc,
+/*Id=*/nullptr, CGM.getContext().VoidPtrTy,
ImplicitParamDecl::Other);
Args.push_back(&Dst);

@@ -2425,7 +2428,7 @@ llvm::Function *CGOpenMPRuntime::emitThr
FTy, ".__kmpc_global_dtor_.", FI, Loc);
auto NL = ApplyDebugLocation::CreateEmpty(DtorCGF);
DtorCGF.StartFunction(GlobalDecl(), CGM.getContext().VoidTy, Fn, FI, Args,
-SourceLocation());
+Loc, Loc);
// Create a scope with an artificial location for the body of this function.
auto AL = ApplyDebugLocation::CreateArtificial(DtorCGF);
auto ArgVal = DtorCGF.EmitLoadOfScalar(
@@ -2469,7 +2472,7 @@ llvm::Function *CGOpenMPRuntime::emitThr
FunctionArgList ArgList;
InitCGF.StartFunction(GlobalDecl(), CGM.getContext().VoidTy, InitFunction,
CGM.getTypes().arrangeNullaryFunction(), ArgList,
-Loc);
+ 

Re: r322018 - [OPENMP] Current status of OpenMP support.

2018-01-08 Thread Jonas Hahnfeld via cfe-commits
Can we backport this page to release_60? I think the documented support
is also valid for 6.0, or did I miss recent commits that added support
for new directives / clauses?


On 2018-01-08 20:02, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Mon Jan  8 11:02:51 2018
New Revision: 322018

URL: http://llvm.org/viewvc/llvm-project?rev=322018&view=rev
Log:
[OPENMP] Current status of OpenMP support.

Summary: Some info about supported features of OpenMP 4.5-5.0.

Reviewers: hfinkel, rsmith

Subscribers: kkwli0, Hahnfeld, cfe-commits

Differential Revision: https://reviews.llvm.org/D39457

Added:
cfe/trunk/docs/OpenMPSupport.rst
Modified:
cfe/trunk/docs/index.rst

Added: cfe/trunk/docs/OpenMPSupport.rst
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/OpenMPSupport.rst?rev=322018&view=auto
==
--- cfe/trunk/docs/OpenMPSupport.rst (added)
+++ cfe/trunk/docs/OpenMPSupport.rst Mon Jan  8 11:02:51 2018
@@ -0,0 +1,68 @@
+.. raw:: html
+
+  
+.none { background-color: #FF }
+.partial { background-color: #99 }
+.good { background-color: #CCFF99 }
+  
+
+.. role:: none
+.. role:: partial
+.. role:: good
+
+==
+OpenMP Support
+==
+
+Clang fully supports OpenMP 3.1 + some elements of OpenMP 4.5. Clang supports offloading to X86_64, AArch64 and PPC64[LE] devices.
+Support for Cuda devices is not ready yet.
+The status of major OpenMP 4.5 features support in Clang.
+
+Standalone directives
+=
+
+* #pragma omp [for] simd: :good:`Complete`.
+
+* #pragma omp declare simd: :partial:`Partial`.  We support parsing/semantic
+  analysis + generation of special attributes for X86 target, but still
+  missing the LLVM pass for vectorization.
+
+* #pragma omp taskloop [simd]: :good:`Complete`.
+
+* #pragma omp target [enter|exit] data: :good:`Complete`.
+
+* #pragma omp target update: :good:`Complete`.
+
+* #pragma omp target: :partial:`Partial`.  No support for the `depend` clauses.
+
+* #pragma omp declare target: :partial:`Partial`.  No full codegen support.
+
+* #pragma omp teams: :good:`Complete`.
+
+* #pragma omp distribute [simd]: :good:`Complete`.
+
+* #pragma omp distribute parallel for [simd]: :good:`Complete`.
+
+Combined directives
+===
+
+* #pragma omp parallel for simd: :good:`Complete`.
+
+* #pragma omp target parallel: :partial:`Partial`.  No support for the `depend` clauses.
+
+* #pragma omp target parallel for [simd]: :partial:`Partial`.  No support for the `depend` clauses.
+
+* #pragma omp target simd: :partial:`Partial`.  No support for the `depend` clauses.
+
+* #pragma omp target teams: :partial:`Partial`.  No support for the `depend` clauses.
+
+* #pragma omp teams distribute [simd]: :good:`Complete`.
+
+* #pragma omp target teams distribute [simd]: :partial:`Partial`.  No support for the and `depend` clauses.
+
+* #pragma omp teams distribute parallel for [simd]: :good:`Complete`.
+
+* #pragma omp target teams distribute parallel for [simd]: :partial:`Partial`.  No full codegen support.
+
+Clang does not support any constructs/updates from upcoming OpenMP 5.0 except for `reduction`-based clauses in the `task` and `target`-based directives.
+

Modified: cfe/trunk/docs/index.rst
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/index.rst?rev=322018&r1=322017&r2=322018&view=diff
==
--- cfe/trunk/docs/index.rst (original)
+++ cfe/trunk/docs/index.rst Mon Jan  8 11:02:51 2018
@@ -39,6 +39,7 @@ Using Clang as a Compiler
SourceBasedCodeCoverage
Modules
MSVCCompatibility
+   OpenMPSupport
ThinLTO
CommandGuide/index
FAQ


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits



Re: r322112 - [OPENMP] Fix directive kind on stand-alone target data directives, NFC.

2018-01-09 Thread Jonas Hahnfeld via cfe-commits

Why is this NFC and doesn't change a test?

On 2018-01-09 20:59, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Tue Jan  9 11:59:25 2018
New Revision: 322112

URL: http://llvm.org/viewvc/llvm-project?rev=322112&view=rev
Log:
[OPENMP] Fix directive kind on stand-alone target data directives, NFC.

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=322112&r1=322111&r2=322112&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Tue Jan  9 11:59:25 2018
@@ -7662,7 +7662,7 @@ void CGOpenMPRuntime::emitTargetDataStan
 if (D.hasClausesOfKind())
   CGF.EmitOMPTargetTaskBasedDirective(D, ThenGen, InputInfo);
 else
-  emitInlinedDirective(CGF, OMPD_target_update, ThenGen);
+  emitInlinedDirective(CGF, D.getDirectiveKind(), ThenGen);
   };

   if (IfCond)




r326342 - [CUDA] Include single GPU binary, NFCI.

2018-02-28 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Wed Feb 28 09:53:46 2018
New Revision: 326342

URL: http://llvm.org/viewvc/llvm-project?rev=326342&view=rev
Log:
[CUDA] Include single GPU binary, NFCI.

Binaries for multiple architectures are combined by fatbinary,
so the current code was effectively not needed.

Differential Revision: https://reviews.llvm.org/D43461

Modified:
cfe/trunk/include/clang/Frontend/CodeGenOptions.h
cfe/trunk/lib/CodeGen/CGCUDANV.cpp
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Frontend/CompilerInvocation.cpp
cfe/trunk/test/Driver/cuda-options.cu

Modified: cfe/trunk/include/clang/Frontend/CodeGenOptions.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Frontend/CodeGenOptions.h?rev=326342&r1=326341&r2=326342&view=diff
==
--- cfe/trunk/include/clang/Frontend/CodeGenOptions.h (original)
+++ cfe/trunk/include/clang/Frontend/CodeGenOptions.h Wed Feb 28 09:53:46 2018
@@ -205,10 +205,9 @@ public:
   /// the summary and module symbol table (and not, e.g. any debug metadata).
   std::string ThinLinkBitcodeFile;
 
-  /// A list of file names passed with -fcuda-include-gpubinary options to
-  /// forward to CUDA runtime back-end for incorporating them into host-side
-  /// object file.
-  std::vector CudaGpuBinaryFileNames;
+  /// Name of file passed with -fcuda-include-gpubinary option to forward to
+  /// CUDA runtime back-end for incorporating them into host-side object file.
+  std::string CudaGpuBinaryFileName;
 
   /// The name of the file to which the backend should save YAML optimization
   /// records.

Modified: cfe/trunk/lib/CodeGen/CGCUDANV.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGCUDANV.cpp?rev=326342&r1=326341&r2=326342&view=diff
==
--- cfe/trunk/lib/CodeGen/CGCUDANV.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGCUDANV.cpp Wed Feb 28 09:53:46 2018
@@ -41,10 +41,10 @@ private:
   /// Keeps track of kernel launch stubs emitted in this module
   llvm::SmallVector EmittedKernels;
   llvm::SmallVector, 16> DeviceVars;
-  /// Keeps track of variables containing handles of GPU binaries. Populated by
+  /// Keeps track of variable containing handle of GPU binary. Populated by
   /// ModuleCtorFunction() and used to create corresponding cleanup calls in
   /// ModuleDtorFunction()
-  llvm::SmallVector GpuBinaryHandles;
+  llvm::GlobalVariable *GpuBinaryHandle = nullptr;
 
   llvm::Constant *getSetupArgumentFn() const;
   llvm::Constant *getLaunchFn() const;
@@ -245,16 +245,14 @@ llvm::Function *CGNVCUDARuntime::makeReg
 /// Creates a global constructor function for the module:
 /// \code
 /// void __cuda_module_ctor(void*) {
-/// Handle0 = __cudaRegisterFatBinary(GpuBinaryBlob0);
-/// __cuda_register_globals(Handle0);
-/// ...
-/// HandleN = __cudaRegisterFatBinary(GpuBinaryBlobN);
-/// __cuda_register_globals(HandleN);
+/// Handle = __cudaRegisterFatBinary(GpuBinaryBlob);
+/// __cuda_register_globals(Handle);
 /// }
 /// \endcode
 llvm::Function *CGNVCUDARuntime::makeModuleCtorFunction() {
-  // No need to generate ctors/dtors if there are no GPU binaries.
-  if (CGM.getCodeGenOpts().CudaGpuBinaryFileNames.empty())
+  // No need to generate ctors/dtors if there is no GPU binary.
+  std::string GpuBinaryFileName = CGM.getCodeGenOpts().CudaGpuBinaryFileName;
+  if (GpuBinaryFileName.empty())
 return nullptr;
 
   // void __cuda_register_globals(void* handle);
@@ -267,6 +265,18 @@ llvm::Function *CGNVCUDARuntime::makeMod
   llvm::StructType *FatbinWrapperTy =
   llvm::StructType::get(IntTy, IntTy, VoidPtrTy, VoidPtrTy);
 
+  // Register GPU binary with the CUDA runtime, store returned handle in a
+  // global variable and save a reference in GpuBinaryHandle to be cleaned up
+  // in destructor on exit. Then associate all known kernels with the GPU binary
+  // handle so CUDA runtime can figure out what to call on the GPU side.
+  llvm::ErrorOr> GpuBinaryOrErr =
+  llvm::MemoryBuffer::getFileOrSTDIN(GpuBinaryFileName);
+  if (std::error_code EC = GpuBinaryOrErr.getError()) {
+CGM.getDiags().Report(diag::err_cannot_open_file)
+<< GpuBinaryFileName << EC.message();
+return nullptr;
+  }
+
   llvm::Function *ModuleCtorFunc = llvm::Function::Create(
   llvm::FunctionType::get(VoidTy, VoidPtrTy, false),
   llvm::GlobalValue::InternalLinkage, "__cuda_module_ctor", &TheModule);
@@ -276,79 +286,56 @@ llvm::Function *CGNVCUDARuntime::makeMod
 
   CtorBuilder.SetInsertPoint(CtorEntryBB);
 
-  // For each GPU binary, register it with the CUDA runtime and store returned
-  // handle in a global variable and save the handle in GpuBinaryHandles vector
-  // to be cleaned up in destructor on exit. Then associate all known kernels
-  // with the GPU binary handle so CUDA runtime can figure out what to call on
-  // the GPU side.
-  for (co

Re: r326590 - [OPENMP] Treat local variables in CUDA mode as thread local.

2018-03-02 Thread Jonas Hahnfeld via cfe-commits

Hi Alexey,

On 2018-03-02 18:17, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Fri Mar  2 09:17:12 2018
New Revision: 326590

URL: http://llvm.org/viewvc/llvm-project?rev=326590&view=rev
Log:
[OPENMP] Treat local variables in CUDA mode as thread local.

In CUDA mode all local variables are actually thread
local|threadprivate, not private, and, thus, they cannot be shared
between threads|lanes.

Added:
cfe/trunk/test/OpenMP/nvptx_target_cuda_mode_messages.cpp
Modified:
cfe/trunk/include/clang/Driver/Options.td
cfe/trunk/lib/Sema/SemaOpenMP.cpp

Modified: cfe/trunk/include/clang/Driver/Options.td
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=326590&r1=326589&r2=326590&view=diff
==
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Fri Mar  2 09:17:12 2018
@@ -1427,7 +1427,7 @@ def fopenmp_simd : Flag<["-"], "fopenmp-
   HelpText<"Emit OpenMP code only for SIMD-based constructs.">;
 def fno_openmp_simd : Flag<["-"], "fno-openmp-simd">, Group, Flags<[CC1Option, NoArgumentUnused]>;
 def fopenmp_cuda_mode : Flag<["-"], "fopenmp-cuda-mode">, Group, Flags<[CC1Option, NoArgumentUnused]>;
-def fno_openmp_cuda_mode : Flag<["-"], "fno-openmp-cuda-mode">, Group, Flags<[CC1Option, NoArgumentUnused]>;
+def fno_openmp_cuda_mode : Flag<["-"], "fno-openmp-cuda-mode">, Group, Flags<[NoArgumentUnused]>;


Did you remove CC1Option intentionally? I think we need this with the
method Carlo chose.
Btw, unless I'm missing something, OpenMPCUDAMode is not the default? If I
remember correctly, we discussed this on Phabricator.


Regards,
Jonas


r333757 - [OpenMP] Fix typo in NVPTX linker, NFC.

2018-06-01 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Fri Jun  1 07:43:48 2018
New Revision: 333757

URL: http://llvm.org/viewvc/llvm-project?rev=333757&view=rev
Log:
[OpenMP] Fix typo in NVPTX linker, NFC.

Clang calls "nvlink" for linking multiple object files with OpenMP
target functions, so correct this information when printing errors.

Modified:
cfe/trunk/lib/Driver/ToolChains/Cuda.h

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.h?rev=333757&r1=333756&r2=333757&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.h (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.h Fri Jun  1 07:43:48 2018
@@ -115,7 +115,7 @@ class LLVM_LIBRARY_VISIBILITY Linker : p
 class LLVM_LIBRARY_VISIBILITY OpenMPLinker : public Tool {
  public:
OpenMPLinker(const ToolChain &TC)
-   : Tool("NVPTX::OpenMPLinker", "fatbinary", TC, RF_Full, llvm::sys::WEM_UTF8,
+   : Tool("NVPTX::OpenMPLinker", "nvlink", TC, RF_Full, llvm::sys::WEM_UTF8,
   "--options-file") {}
 
bool hasIntegratedCPP() const override { return false; }




r334281 - [CUDA] Fix emission of constant strings in sections

2018-06-08 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Fri Jun  8 04:17:08 2018
New Revision: 334281

URL: http://llvm.org/viewvc/llvm-project?rev=334281&view=rev
Log:
[CUDA] Fix emission of constant strings in sections

CGM.GetAddrOfConstantCString() sets the address of the created GlobalValue
to unnamed. When emitting the object file LLVM will mark the surrounding
section as SHF_MERGE iff the string is nul-terminated and contains no
other nuls (see IsNullTerminatedString). This results in problems when
saving temporaries because LLVM doesn't set an EntrySize, so reading in
the serialized assembly file fails.
This never happened for the GPU binaries because they usually contain
a nul-character somewhere. Instead this only affected the module ID
when compiling relocatable device code.

However, this points to a potentially larger problem: If we put a
constant string into a named section, we really want the data to end
up in that section in the object file. To avoid LLVM merging sections
this patch unmarks the GlobalVariable's address as unnamed which also
fixes the problem of invalid serialized assembly files when saving
temporaries.
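
The condition described above ("nul-terminated and contains no other nuls") can be illustrated with a small standalone check. This is only a sketch: isSingleNulTerminatedString is a hypothetical name, and the code is not LLVM's IsNullTerminatedString.

```cpp
#include <string>

// Hypothetical helper, not LLVM's implementation: returns true when the
// bytes form exactly one C string, i.e. the last byte is nul and no
// earlier byte is nul.
bool isSingleNulTerminatedString(const std::string &Bytes) {
  if (Bytes.empty() || Bytes.back() != '\0')
    return false;
  // The first nul must also be the last byte of the data.
  return Bytes.find('\0') == Bytes.size() - 1;
}
```

Under this check, a module ID such as the 10 bytes "module-id\0" qualifies, while data with an embedded nul (as GPU binaries usually have) does not, which matches the observation that only the module ID was affected.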

Differential Revision: https://reviews.llvm.org/D47902

Modified:
cfe/trunk/lib/CodeGen/CGCUDANV.cpp
cfe/trunk/test/CodeGenCUDA/device-stub.cu

Modified: cfe/trunk/lib/CodeGen/CGCUDANV.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGCUDANV.cpp?rev=334281&r1=334280&r2=334281&view=diff
==
--- cfe/trunk/lib/CodeGen/CGCUDANV.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGCUDANV.cpp Fri Jun  8 04:17:08 2018
@@ -75,8 +75,12 @@ private:
 auto ConstStr = CGM.GetAddrOfConstantCString(Str, Name.c_str());
 llvm::GlobalVariable *GV =
 cast(ConstStr.getPointer());
-if (!SectionName.empty())
+if (!SectionName.empty()) {
   GV->setSection(SectionName);
+  // Mark the address as used which make sure that this section isn't
+  // merged and we will really have it in the object file.
+  GV->setUnnamedAddr(llvm::GlobalValue::UnnamedAddr::None);
+}
 if (Alignment)
   GV->setAlignment(Alignment);
 

Modified: cfe/trunk/test/CodeGenCUDA/device-stub.cu
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCUDA/device-stub.cu?rev=334281&r1=334280&r2=334281&view=diff
==
--- cfe/trunk/test/CodeGenCUDA/device-stub.cu (original)
+++ cfe/trunk/test/CodeGenCUDA/device-stub.cu Fri Jun  8 04:17:08 2018
@@ -65,7 +65,7 @@ void use_pointers() {
 // ALL: private unnamed_addr constant{{.*}}kernelfunc{{.*}}\00"
 // * constant unnamed string with GPU binary
 // HIP: @[[FATBIN:__hip_fatbin]] = external constant i8, section ".hip_fatbin"
-// CUDA: @[[FATBIN:.*]] = private unnamed_addr constant{{.*GPU binary would be here.*}}\00",
+// CUDA: @[[FATBIN:.*]] = private constant{{.*GPU binary would be here.*}}\00",
 // CUDANORDC-SAME: section ".nv_fatbin", align 8
 // CUDARDC-SAME: section "__nv_relfatbin", align 8
 // * constant struct that wraps GPU binary
@@ -81,7 +81,7 @@ void use_pointers() {
 // * variable to save GPU binary handle after initialization
 // NORDC: @__[[PREFIX]]_gpubin_handle = internal global i8** null
 // * constant unnamed string with NVModuleID
-// RDC: [[MODULE_ID_GLOBAL:@.*]] = private unnamed_addr constant
+// RDC: [[MODULE_ID_GLOBAL:@.*]] = private constant
 // CUDARDC-SAME: c"[[MODULE_ID:.+]]\00", section "__nv_module_id", align 32
 // HIPRDC-SAME: c"[[MODULE_ID:.+]]\00", section "__hip_module_id", align 32
 // * Make sure our constructor was added to global ctor list.
@@ -141,7 +141,7 @@ void hostfunc(void) { kernelfunc<<<1, 1>
 // There should be no __[[PREFIX]]_register_globals if we have no
 // device-side globals, but we still need to register GPU binary.
 // Skip GPU binary string first.
-// CUDANOGLOBALS: @{{.*}} = private unnamed_addr constant{{.*}}
+// CUDANOGLOBALS: @{{.*}} = private constant{{.*}}
 // HIPNOGLOBALS: @{{.*}} = external constant{{.*}}
 // NOGLOBALS-NOT: define internal void @__{{.*}}_register_globals
 // NOGLOBALS: define internal void @__[[PREFIX:cuda|hip]]_module_ctor


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r343230 - [OpenMP] Improve search for libomptarget-nvptx

2018-09-27 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Thu Sep 27 09:12:32 2018
New Revision: 343230

URL: http://llvm.org/viewvc/llvm-project?rev=343230&view=rev
Log:
[OpenMP] Improve search for libomptarget-nvptx

When looking for the bclib Clang considered the default library
path first while it preferred directories in LIBRARY_PATH when
constructing the invocation of nvlink. The latter actually makes
more sense because during development it allows using a non-default
runtime library. So change the search for the bclib to start
looking in directories given by LIBRARY_PATH.
Additionally add a new option --libomptarget-nvptx-path= which
will be searched first. This will be handy for testing purposes.
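
The resulting search order can be sketched as a standalone function. This is an illustration, not the driver code: bclibSearchPaths and its parameters are hypothetical names, and LIBRARY_PATH is assumed to be ':'-separated.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not the driver's actual code): builds the list of
// directories to search for the bclib in the order the log message describes.
std::vector<std::string> bclibSearchPaths(const std::string &OptionPath,
                                          const std::string &LibraryPathEnv,
                                          const std::string &DefaultLibDir) {
  std::vector<std::string> Paths;
  // 1. Value of --libomptarget-nvptx-path=, if given.
  if (!OptionPath.empty())
    Paths.push_back(OptionPath);
  // 2. Entries of LIBRARY_PATH, assumed here to be ':'-separated.
  std::istringstream Env(LibraryPathEnv);
  for (std::string Dir; std::getline(Env, Dir, ':');)
    if (!Dir.empty())
      Paths.push_back(Dir);
  // 3. The default library directory, searched last.
  Paths.push_back(DefaultLibDir);
  return Paths;
}
```

For example, calling it with "/opt/test", "/a:/b" and "/usr/lib" yields /opt/test, /a, /b, /usr/lib in that order, so a development build of the runtime library takes precedence over the installed one.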

Differential Revision: https://reviews.llvm.org/D51686

Modified:
cfe/trunk/include/clang/Driver/Options.td
cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
cfe/trunk/test/Driver/openmp-offload-gpu.c

Modified: cfe/trunk/include/clang/Driver/Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=343230&r1=343229&r2=343230&view=diff
==
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Thu Sep 27 09:12:32 2018
@@ -596,6 +596,8 @@ def hip_device_lib_EQ : Joined<["--"], "
   HelpText<"HIP device library">;
 def fhip_dump_offload_linker_script : Flag<["-"], "fhip-dump-offload-linker-script">,
   Group, Flags<[NoArgumentUnused, HelpHidden]>;
+def libomptarget_nvptx_path_EQ : Joined<["--"], "libomptarget-nvptx-path=">, Group,
+  HelpText<"Path to libomptarget-nvptx libraries">;
 def dA : Flag<["-"], "dA">, Group;
 def dD : Flag<["-"], "dD">, Group, Flags<[CC1Option]>,
   HelpText<"Print macro definitions in -E mode in addition to normal output">;

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=343230&r1=343229&r2=343230&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Thu Sep 27 09:12:32 2018
@@ -511,6 +511,11 @@ void NVPTX::OpenMPLinker::ConstructJob(C
   CmdArgs.push_back("-arch");
   CmdArgs.push_back(Args.MakeArgString(GPUArch));
 
+  // Assume that the directory specified with --libomptarget_nvptx_path
+  // contains the static library libomptarget-nvptx.a.
+  if (const Arg *A = Args.getLastArg(options::OPT_libomptarget_nvptx_path_EQ))
+CmdArgs.push_back(Args.MakeArgString(Twine("-L") + A->getValue()));
+
   // Add paths specified in LIBRARY_PATH environment variable as -L options.
   addDirectoryList(Args, CmdArgs, "-L", "LIBRARY_PATH");
 
@@ -647,12 +652,9 @@ void CudaToolChain::addClangTargetOption
 
   if (DeviceOffloadingKind == Action::OFK_OpenMP) {
 SmallVector LibraryPaths;
-// Add path to lib and/or lib64 folders.
-SmallString<256> DefaultLibPath =
-  llvm::sys::path::parent_path(getDriver().Dir);
-llvm::sys::path::append(DefaultLibPath,
-Twine("lib") + CLANG_LIBDIR_SUFFIX);
-LibraryPaths.emplace_back(DefaultLibPath.c_str());
+
+if (const Arg *A = DriverArgs.getLastArg(options::OPT_libomptarget_nvptx_path_EQ))
+  LibraryPaths.push_back(A->getValue());
 
 // Add user defined library paths from LIBRARY_PATH.
 llvm::Optional LibPath =
@@ -665,6 +667,12 @@ void CudaToolChain::addClangTargetOption
 LibraryPaths.emplace_back(Path.trim());
 }
 
+// Add path to lib / lib64 folder.
+SmallString<256> DefaultLibPath =
+llvm::sys::path::parent_path(getDriver().Dir);
+llvm::sys::path::append(DefaultLibPath, Twine("lib") + CLANG_LIBDIR_SUFFIX);
+LibraryPaths.emplace_back(DefaultLibPath.c_str());
+
 std::string LibOmpTargetName =
   "libomptarget-nvptx-" + GpuArch.str() + ".bc";
 bool FoundBCLibrary = false;

Modified: cfe/trunk/test/Driver/openmp-offload-gpu.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload-gpu.c?rev=343230&r1=343229&r2=343230&view=diff
==
--- cfe/trunk/test/Driver/openmp-offload-gpu.c (original)
+++ cfe/trunk/test/Driver/openmp-offload-gpu.c Thu Sep 27 09:12:32 2018
@@ -30,6 +30,22 @@
 
 /// ###
 
+/// Check that -lomptarget-nvptx is passed to nvlink.
+// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp \
+// RUN:  -fopenmp-targets=nvptx64-nvidia-cuda %s 2>&1 \
+// RUN:   | FileCheck -check-prefix=CHK-NVLINK %s
+/// Check that the value of --libomptarget-nvptx-path is forwarded to nvlink.
+// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp \
+// RUN:  --libomptarget-nvptx-path=/path/to/libomptarget/ \
+// RUN:  -fopenmp-targets=nvptx64-nvidia-cuda %s 2>&1 \
+// RUN:   | FileCheck -check-prefixes=CHK-NVLINK,C

r343240 - Fix greedy FileCheck expression in test/Driver/mips-abi.c

2018-09-27 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Thu Sep 27 10:27:48 2018
New Revision: 343240

URL: http://llvm.org/viewvc/llvm-project?rev=343240&view=rev
Log:
Fix greedy FileCheck expression in test/Driver/mips-abi.c

'ld{{.*}}"' seems to match the complete line for me which is failing
the test. Only allow an optional '.exe' for Windows systems as most
other tests do.
Another possibility would be to collapse the greedy expression with
the next check to avoid matching the full line.
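
The greedy match described above can be reproduced outside of FileCheck.
The following is only a sketch: it uses std::regex as a stand-in for
FileCheck's own regex engine, drops the '{{...}}' wrapper, and runs on a
made-up, shortened linker line (the helper names firstMatch and
restAfterMatch are hypothetical). It shows that 'ld.*"' extends to the last
quote on the line, leaving nothing for a following check, while
'ld(.exe)?"' stops right after the linker name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the text of the first match of `pattern` in `line`,
// or an empty string if the pattern does not match at all.
std::string firstMatch(const std::string &line, const std::string &pattern) {
  assert(!pattern.empty() && "expected a non-empty pattern");
  std::smatch m;
  if (!std::regex_search(line, m, std::regex(pattern)))
    return "";
  return m.str(0);
}

// Returns the part of `line` after the first match, i.e. the text that a
// subsequent check on the same line could still match against.
std::string restAfterMatch(const std::string &line, const std::string &pattern) {
  assert(!pattern.empty() && "expected a non-empty pattern");
  std::smatch m;
  if (!std::regex_search(line, m, std::regex(pattern)))
    return line;
  return m.suffix().str();
}
```

With the sample line '"/usr/bin/ld" "-m" "elf32btsmip"', the greedy pattern
leaves an empty remainder, so a later check for '"-m" "elf32btsmip"' on the
same line has nothing left to match; the restricted pattern leaves
' "-m" "elf32btsmip"'.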

Differential Revision: https://reviews.llvm.org/D52619

Modified:
cfe/trunk/test/Driver/mips-abi.c

Modified: cfe/trunk/test/Driver/mips-abi.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/mips-abi.c?rev=343240&r1=343239&r2=343240&view=diff
==
--- cfe/trunk/test/Driver/mips-abi.c (original)
+++ cfe/trunk/test/Driver/mips-abi.c Thu Sep 27 10:27:48 2018
@@ -169,7 +169,7 @@
 // TARGET-O32: "-triple" "mips-unknown-linux-gnu"
 // TARGET-O32: "-target-cpu" "mips32r2"
 // TARGET-O32: "-target-abi" "o32"
-// TARGET-O32: ld{{.*}}"
+// TARGET-O32: ld{{(.exe)?}}"
 // TARGET-O32: "-m" "elf32btsmip"
 
 // RUN: %clang -target mips-linux-gnu -mabi=n32 -### %s 2>&1 \
@@ -177,7 +177,7 @@
 // TARGET-N32: "-triple" "mips64-unknown-linux-gnu"
 // TARGET-N32: "-target-cpu" "mips64r2"
 // TARGET-N32: "-target-abi" "n32"
-// TARGET-N32: ld{{.*}}"
+// TARGET-N32: ld{{(.exe)?}}"
 // TARGET-N32: "-m" "elf32btsmipn32"
 
 // RUN: %clang -target mips-linux-gnu -mabi=64 -### %s 2>&1 \
@@ -185,5 +185,5 @@
 // TARGET-N64: "-triple" "mips64-unknown-linux-gnu"
 // TARGET-N64: "-target-cpu" "mips64r2"
 // TARGET-N64: "-target-abi" "n64"
-// TARGET-N64: ld{{.*}}"
+// TARGET-N64: ld{{(.exe)?}}"
 // TARGET-N64: "-m" "elf64btsmip"


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: r343492 - [OPENMP][NVPTX] Handle `requires datasharing` flag correctly with

2018-10-01 Thread Jonas Hahnfeld via cfe-commits
Does this fix an actual issue, or is it just 'this looks incorrect'?
__kmpc_data_sharing_push_stack has an extra case for an uninitialized
runtime, so AFAICS the initialization in __kmpc_spmd_kernel_init should
not be needed.

If the runtime needs to be initialized, Clang emits a call to
__kmpc_data_sharing_init_stack_spmd, which should set up the data
structures.


Jonas

On 2018-10-01 18:20, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Mon Oct  1 09:20:57 2018
New Revision: 343492

URL: http://llvm.org/viewvc/llvm-project?rev=343492&view=rev
Log:
[OPENMP][NVPTX] Handle `requires datasharing` flag correctly with
lightweight runtime.

The datasharing flag must be set to `1` when executing SPMD-mode
compatible directive with reduction|lastprivate clauses.
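
As a rough sketch of the rule stated in this log message (and of the
RequiresDatasharing computation in the diff below), the flag can be modeled
as a pure function. InnerDirective and datasharingFlag are hypothetical
names used only for illustration; the real code walks the AST to find the
nested directive, which this model takes as already resolved.

```cpp
// Hypothetical summary of the directive found inside the target region.
struct InnerDirective {
  bool isDistribute;   // the innermost relevant directive is a 'distribute'
  bool hasLastprivate; // it carries a 'lastprivate' clause
  bool hasReduction;   // it carries a 'reduction' clause
};

// Value passed as the RequiresDataSharing argument of the SPMD kernel
// init call: 1 if the full runtime is required anyway, or if an inner
// 'distribute' has lastprivate or reduction clauses; 0 otherwise.
int datasharingFlag(bool requiresFullRuntime, const InnerDirective &inner) {
  if (requiresFullRuntime)
    return 1;
  if (inner.isDistribute && (inner.hasLastprivate || inner.hasReduction))
    return 1;
  return 0;
}
```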

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/test/OpenMP/nvptx_SPMD_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=343492&r1=343491&r2=343492&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Mon Oct  1 09:20:57 2018
@@ -1207,6 +1207,10 @@ void CGOpenMPRuntimeNVPTX::emitSPMDKerne
IsOffloadEntry, CodeGen);
 }

+static void
+getDistributeLastprivateVars(const OMPExecutableDirective &D,
+ llvm::SmallVectorImpl &Vars);
+
 void CGOpenMPRuntimeNVPTX::emitSPMDEntryHeader(
 CodeGenFunction &CGF, EntryFunctionState &EST,
 const OMPExecutableDirective &D) {
@@ -1219,11 +1223,33 @@ void CGOpenMPRuntimeNVPTX::emitSPMDEntry
   // Initialize the OMP state in the runtime; called by all active threads.
   bool RequiresFullRuntime = CGM.getLangOpts().OpenMPCUDAForceFullRuntime ||
                              !supportsLightweightRuntime(CGF.getContext(), D);
+  // Check if we have inner distribute + lastprivate|reduction clauses.
+  bool RequiresDatasharing = RequiresFullRuntime;
+  if (!RequiresDatasharing) {
+const OMPExecutableDirective *TD = &D;
+if (!isOpenMPTeamsDirective(TD->getDirectiveKind()) &&
+!isOpenMPParallelDirective(TD->getDirectiveKind())) {
+  const Stmt *S = getSingleCompoundChild(
+  TD->getInnermostCapturedStmt()->getCapturedStmt()->IgnoreContainers(
+  /*IgnoreCaptured=*/true));
+  TD = cast(S);
+}
+if (!isOpenMPDistributeDirective(TD->getDirectiveKind()) &&
+!isOpenMPParallelDirective(TD->getDirectiveKind())) {
+  const Stmt *S = getSingleCompoundChild(
+  TD->getInnermostCapturedStmt()->getCapturedStmt()->IgnoreContainers(
+  /*IgnoreCaptured=*/true));
+  TD = cast(S);
+}
+if (isOpenMPDistributeDirective(TD->getDirectiveKind()))
+  RequiresDatasharing = TD->hasClausesOfKind() ||
+                        TD->hasClausesOfKind();
+  }
   llvm::Value *Args[] = {
   getThreadLimit(CGF, /*IsInSPMDExecutionMode=*/true),
   /*RequiresOMPRuntime=*/
   Bld.getInt16(RequiresFullRuntime ? 1 : 0),
-  /*RequiresDataSharing=*/Bld.getInt16(RequiresFullRuntime ? 1 : 0)};
+  /*RequiresDataSharing=*/Bld.getInt16(RequiresDatasharing ? 1 : 0)};
   CGF.EmitRuntimeCall(
   createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_spmd_kernel_init), Args);

Modified: cfe/trunk/test/OpenMP/nvptx_SPMD_codegen.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_SPMD_codegen.cpp?rev=343492&r1=343491&r2=343492&view=diff
==
--- cfe/trunk/test/OpenMP/nvptx_SPMD_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/nvptx_SPMD_codegen.cpp Mon Oct  1 09:20:57 2018
@@ -40,7 +40,7 @@ void foo() {
   for (int i = 0; i < 10; ++i)
 ;
 int a;
-// CHECK: call void @__kmpc_spmd_kernel_init(i32 {{.+}}, i16 0, i16 0)
+// CHECK: call void @__kmpc_spmd_kernel_init(i32 {{.+}}, i16 0, i16 1)
 // CHECK: call void @__kmpc_spmd_kernel_init(i32 {{.+}}, i16 0, i16 0)
 // CHECK: call void @__kmpc_spmd_kernel_init(i32 {{.+}}, i16 0, i16 0)
 // CHECK: call void @__kmpc_spmd_kernel_init(i32 {{.+}}, i16 1, i16 {{.+}})

Modified: cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp?rev=343492&r1=343491&r2=343492&view=diff
==
--- cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp Mon Oct  1 09:20:57 2018
@@

r343618 - [OpenMP][NVPTX] Simplify codegen for orphaned parallel, NFCI.

2018-10-02 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Tue Oct  2 12:12:54 2018
New Revision: 343618

URL: http://llvm.org/viewvc/llvm-project?rev=343618&view=rev
Log:
[OpenMP][NVPTX] Simplify codegen for orphaned parallel, NFCI.

Worker threads fork off to the compiler generated worker function
directly after entering the kernel function. Hence, there is no
need to check whether the current thread is the master if we are
outside of a parallel region (neither SPMD nor parallel_level > 0).
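
The simplified decision can be sketched as a plain function. This is a
model, not the emitted code: the two parameters stand in for the results of
__kmpc_is_spmd_exec_mode() and __kmpc_parallel_level(loc, gtid) named in
the source comment of the diff below, and ParallelPath and
selectOrphanedParallelPath are hypothetical names.

```cpp
#include <cassert>

// Code path taken by an orphaned 'parallel' in non-SPMD code generation.
enum class ParallelPath { Serialized, MasterWorkerCall };

// If we are in SPMD mode or already inside a parallel region, the region
// is executed serially. Otherwise the current thread must be the master
// (workers left for the worker function at kernel entry), so it can hand
// the work to the workers without a separate master check.
ParallelPath selectOrphanedParallelPath(bool isSpmdExecMode, int parallelLevel) {
  assert(parallelLevel >= 0 && "parallel level cannot be negative");
  if (isSpmdExecMode || parallelLevel > 0)
    return ParallelPath::Serialized;
  return ParallelPath::MasterWorkerCall;
}
```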

Differential Revision: https://reviews.llvm.org/D52732

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/test/OpenMP/nvptx_target_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=343618&r1=343617&r2=343618&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Tue Oct  2 12:12:54 2018
@@ -2233,30 +2233,24 @@ void CGOpenMPRuntimeNVPTX::emitNonSPMDPa
 Work.emplace_back(WFn);
   };
 
-  auto &&LNParallelGen = [this, Loc, &SeqGen, &L0ParallelGen, &CodeGen,
-  &ThreadIDAddr](CodeGenFunction &CGF,
- PrePostActionTy &Action) {
-RegionCodeGenTy RCG(CodeGen);
+  auto &&LNParallelGen = [this, Loc, &SeqGen, &L0ParallelGen](
+ CodeGenFunction &CGF, PrePostActionTy &Action) {
 if (IsInParallelRegion) {
   SeqGen(CGF, Action);
 } else if (IsInTargetMasterThreadRegion) {
   L0ParallelGen(CGF, Action);
-} else if (getExecutionMode() == CGOpenMPRuntimeNVPTX::EM_NonSPMD) {
-  RCG(CGF);
 } else {
   // Check for master and then parallelism:
   // if (__kmpc_is_spmd_exec_mode() || __kmpc_parallel_level(loc, gtid)) {
-  //  Serialized execution.
-  // } else if (master) {
-  //   Worker call.
+  //   Serialized execution.
   // } else {
-  //   Outlined function call.
+  //   Worker call.
   // }
   CGBuilderTy &Bld = CGF.Builder;
   llvm::BasicBlock *ExitBB = CGF.createBasicBlock(".exit");
   llvm::BasicBlock *SeqBB = CGF.createBasicBlock(".sequential");
   llvm::BasicBlock *ParallelCheckBB = CGF.createBasicBlock(".parcheck");
-  llvm::BasicBlock *MasterCheckBB = CGF.createBasicBlock(".mastercheck");
+  llvm::BasicBlock *MasterBB = CGF.createBasicBlock(".master");
   llvm::Value *IsSPMD = Bld.CreateIsNotNull(CGF.EmitNounwindRuntimeCall(
   createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_is_spmd_exec_mode)));
   Bld.CreateCondBr(IsSPMD, SeqBB, ParallelCheckBB);
@@ -2269,29 +2263,17 @@ void CGOpenMPRuntimeNVPTX::emitNonSPMDPa
   createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_parallel_level),
   {RTLoc, ThreadID});
   llvm::Value *Res = Bld.CreateIsNotNull(PL);
-  Bld.CreateCondBr(Res, SeqBB, MasterCheckBB);
+  Bld.CreateCondBr(Res, SeqBB, MasterBB);
   CGF.EmitBlock(SeqBB);
   SeqGen(CGF, Action);
   CGF.EmitBranch(ExitBB);
   // There is no need to emit line number for unconditional branch.
   (void)ApplyDebugLocation::CreateEmpty(CGF);
-  CGF.EmitBlock(MasterCheckBB);
-  llvm::BasicBlock *MasterThenBB = CGF.createBasicBlock("master.then");
-  llvm::BasicBlock *ElseBlock = CGF.createBasicBlock("omp_if.else");
-  llvm::Value *IsMaster =
-  Bld.CreateICmpEQ(getNVPTXThreadID(CGF), getMasterThreadID(CGF));
-  Bld.CreateCondBr(IsMaster, MasterThenBB, ElseBlock);
-  CGF.EmitBlock(MasterThenBB);
+  CGF.EmitBlock(MasterBB);
   L0ParallelGen(CGF, Action);
   CGF.EmitBranch(ExitBB);
   // There is no need to emit line number for unconditional branch.
   (void)ApplyDebugLocation::CreateEmpty(CGF);
-  CGF.EmitBlock(ElseBlock);
-  // In the worker need to use the real thread id.
-  ThreadIDAddr = emitThreadIDAddress(CGF, Loc);
-  RCG(CGF);
-  // There is no need to emit line number for unconditional branch.
-  (void)ApplyDebugLocation::CreateEmpty(CGF);
   // Emit the continuation block for code after the if.
   CGF.EmitBlock(ExitBB, /*IsFinished=*/true);
 }

Modified: cfe/trunk/test/OpenMP/nvptx_target_codegen.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_codegen.cpp?rev=343618&r1=343617&r2=343618&view=diff
==
--- cfe/trunk/test/OpenMP/nvptx_target_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/nvptx_target_codegen.cpp Tue Oct  2 12:12:54 2018
@@ -557,7 +557,6 @@ int baz(int f, double &a) {
   // CHECK: [[STACK:%.+]] = alloca [[GLOBAL_ST:%.+]],
   // CHECK: [[ZERO_ADDR:%.+]] = alloca i32,
   // CHECK: [[GTID:%.+]] = call i32 @__kmpc_global_thread_num(%struct.ident_t*
-  // CHECK: [[GTID_ADDR:%.+]] = alloca i32,
   // CHECK: store i32 0, i32* [[ZERO_ADDR]]
  

r343617 - [OpenMP] Simplify code for reductions on distribute directives, NFC.

2018-10-02 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Tue Oct  2 12:12:47 2018
New Revision: 343617

URL: http://llvm.org/viewvc/llvm-project?rev=343617&view=rev
Log:
[OpenMP] Simplify code for reductions on distribute directives, NFC.

Only need to care about the 'distribute simd' case, all other composite
directives are handled elsewhere. This was already reflected in the
outer 'if' condition, so all other inner conditions could never be true.
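
The claim can be checked exhaustively on a small model. The sketch below
replaces the directive-kind queries with three booleans and mirrors the
removed if-chain from the diff below; the names are hypothetical, and the
removed llvm_unreachable branch is modeled as Unknown because it is only
reachable when the directive is not a simd directive.

```cpp
// Reduction kinds the removed code could select.
enum class ReductionKind { Unknown, ParallelForSimd, ParallelFor, Simd };

// Mirror of the removed if-chain, with the directive queries as booleans.
ReductionKind removedSelection(bool isParallel, bool isSimd) {
  if (isParallel && isSimd)
    return ReductionKind::ParallelForSimd;
  if (isParallel)
    return ReductionKind::ParallelFor;
  if (isSimd)
    return ReductionKind::Simd;
  return ReductionKind::Unknown;
}

// Mirror of the outer 'if' that guards the reduction finalization.
bool outerCondition(bool isSimd, bool isParallel, bool isTeams) {
  return isSimd && !isParallel && !isTeams;
}

// Tries all eight combinations and reports whether the removed chain
// always selected Simd whenever the outer condition held.
bool alwaysSelectsSimd() {
  for (int bits = 0; bits < 8; ++bits) {
    const bool isSimd = (bits & 1) != 0;
    const bool isParallel = (bits & 2) != 0;
    const bool isTeams = (bits & 4) != 0;
    if (outerCondition(isSimd, isParallel, isTeams) &&
        removedSelection(isParallel, isSimd) != ReductionKind::Simd)
      return false;
  }
  return true;
}
```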

Differential Revision: https://reviews.llvm.org/D52731

Modified:
cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp

Modified: cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp?rev=343617&r1=343616&r2=343617&view=diff
==
--- cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp Tue Oct  2 12:12:47 2018
@@ -3400,20 +3400,7 @@ void CodeGenFunction::EmitOMPDistributeL
   if (isOpenMPSimdDirective(S.getDirectiveKind()) &&
   !isOpenMPParallelDirective(S.getDirectiveKind()) &&
   !isOpenMPTeamsDirective(S.getDirectiveKind())) {
-OpenMPDirectiveKind ReductionKind = OMPD_unknown;
-if (isOpenMPParallelDirective(S.getDirectiveKind()) &&
-isOpenMPSimdDirective(S.getDirectiveKind())) {
-  ReductionKind = OMPD_parallel_for_simd;
-} else if (isOpenMPParallelDirective(S.getDirectiveKind())) {
-  ReductionKind = OMPD_parallel_for;
-} else if (isOpenMPSimdDirective(S.getDirectiveKind())) {
-  ReductionKind = OMPD_simd;
-} else if (!isOpenMPTeamsDirective(S.getDirectiveKind()) &&
-   S.hasClausesOfKind()) {
-  llvm_unreachable(
-  "No reduction clauses is allowed in distribute directive.");
-}
-EmitOMPReductionClauseFinal(S, ReductionKind);
+EmitOMPReductionClauseFinal(S, OMPD_simd);
 // Emit post-update of the reduction variables if IsLastIter != 0.
 emitPostUpdateForReductionClause(
 *this, S, [IL, &S](CodeGenFunction &CGF) {


r330426 - [CUDA] Document recent changes

2018-04-20 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Fri Apr 20 06:04:54 2018
New Revision: 330426

URL: http://llvm.org/viewvc/llvm-project?rev=330426&view=rev
Log:
[CUDA] Document recent changes

 * Finding installations via ptxas binary
 * Relocatable device code

Differential Revision: https://reviews.llvm.org/D45449

Modified:
cfe/trunk/docs/ReleaseNotes.rst
cfe/trunk/include/clang/Driver/Options.td

Modified: cfe/trunk/docs/ReleaseNotes.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNotes.rst?rev=330426&r1=330425&r2=330426&view=diff
==
--- cfe/trunk/docs/ReleaseNotes.rst (original)
+++ cfe/trunk/docs/ReleaseNotes.rst Fri Apr 20 06:04:54 2018
@@ -163,6 +163,18 @@ OpenMP Support in Clang
 
 - ...
 
+CUDA Support in Clang
+-
+
+- Clang will now try to locate the CUDA installation next to :program:`ptxas`
+  in the `PATH` environment variable. This behavior can be turned off by passing
+  the new flag `--cuda-path-ignore-env`.
+
+- Clang now supports generating object files with relocatable device code. This
+  feature needs to be enabled with `-fcuda-rdc` and may result in performance
+  penalties compared to whole program compilation. Please note that NVIDIA's
+  :program:`nvcc` must be used for linking.
+
 Internal API Changes
 
 

Modified: cfe/trunk/include/clang/Driver/Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=330426&r1=330425&r2=330426&view=diff
==
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Fri Apr 20 06:04:54 2018
@@ -573,7 +573,7 @@ def fno_cuda_flush_denormals_to_zero : F
 def fcuda_approx_transcendentals : Flag<["-"], "fcuda-approx-transcendentals">,
   Flags<[CC1Option]>, HelpText<"Use approximate transcendental functions">;
 def fno_cuda_approx_transcendentals : Flag<["-"], "fno-cuda-approx-transcendentals">;
-def fcuda_rdc : Flag<["-"], "fcuda-rdc">, Flags<[CC1Option, HelpHidden]>,
+def fcuda_rdc : Flag<["-"], "fcuda-rdc">, Flags<[CC1Option]>,
   HelpText<"Generate relocatable device code, also known as separate compilation mode.">;
 def fno_cuda_rdc : Flag<["-"], "fno-cuda-rdc">;
 def dA : Flag<["-"], "dA">, Group;


r330425 - [CUDA] Register relocatable GPU binaries

2018-04-20 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Fri Apr 20 06:04:45 2018
New Revision: 330425

URL: http://llvm.org/viewvc/llvm-project?rev=330425&view=rev
Log:
[CUDA] Register relocatable GPU binaries

nvcc generates a unique registration function for each object file
that contains relocatable device code. Unique names are achieved
with a module id that is also reflected in the function's name.

Differential Revision: https://reviews.llvm.org/D42922

Modified:
cfe/trunk/lib/CodeGen/CGCUDANV.cpp
cfe/trunk/test/CodeGenCUDA/device-stub.cu

Modified: cfe/trunk/lib/CodeGen/CGCUDANV.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGCUDANV.cpp?rev=330425&r1=330424&r2=330425&view=diff
==
--- cfe/trunk/lib/CodeGen/CGCUDANV.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGCUDANV.cpp Fri Apr 20 06:04:45 2018
@@ -15,12 +15,13 @@
 #include "CGCUDARuntime.h"
 #include "CodeGenFunction.h"
 #include "CodeGenModule.h"
-#include "clang/CodeGen/ConstantInitBuilder.h"
 #include "clang/AST/Decl.h"
+#include "clang/CodeGen/ConstantInitBuilder.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/CallSite.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/DerivedTypes.h"
+#include "llvm/Support/Format.h"
 
 using namespace clang;
 using namespace CodeGen;
@@ -45,10 +46,16 @@ private:
   /// ModuleCtorFunction() and used to create corresponding cleanup calls in
   /// ModuleDtorFunction()
   llvm::GlobalVariable *GpuBinaryHandle = nullptr;
+  /// Whether we generate relocatable device code.
+  bool RelocatableDeviceCode;
 
   llvm::Constant *getSetupArgumentFn() const;
   llvm::Constant *getLaunchFn() const;
 
+  llvm::FunctionType *getRegisterGlobalsFnTy() const;
+  llvm::FunctionType *getCallbackFnTy() const;
+  llvm::FunctionType *getRegisterLinkedBinaryFnTy() const;
+
   /// Creates a function to register all kernel stubs generated in this module.
   llvm::Function *makeRegisterGlobalsFn();
 
@@ -71,7 +78,23 @@ private:
 
 return llvm::ConstantExpr::getGetElementPtr(ConstStr.getElementType(),
 ConstStr.getPointer(), Zeros);
- }
+  }
+
+  /// Helper function that generates an empty dummy function returning void.
+  llvm::Function *makeDummyFunction(llvm::FunctionType *FnTy) {
+assert(FnTy->getReturnType()->isVoidTy() &&
+   "Can only generate dummy functions returning void!");
+llvm::Function *DummyFunc = llvm::Function::Create(
+FnTy, llvm::GlobalValue::InternalLinkage, "dummy", &TheModule);
+
+llvm::BasicBlock *DummyBlock =
+llvm::BasicBlock::Create(Context, "", DummyFunc);
+CGBuilderTy FuncBuilder(CGM, Context);
+FuncBuilder.SetInsertPoint(DummyBlock);
+FuncBuilder.CreateRetVoid();
+
+return DummyFunc;
+  }
 
   void emitDeviceStubBody(CodeGenFunction &CGF, FunctionArgList &Args);
 
@@ -93,7 +116,8 @@ public:
 
 CGNVCUDARuntime::CGNVCUDARuntime(CodeGenModule &CGM)
 : CGCUDARuntime(CGM), Context(CGM.getLLVMContext()),
-  TheModule(CGM.getModule()) {
+  TheModule(CGM.getModule()),
+  RelocatableDeviceCode(CGM.getLangOpts().CUDARelocatableDeviceCode) {
   CodeGen::CodeGenTypes &Types = CGM.getTypes();
   ASTContext &Ctx = CGM.getContext();
 
@@ -120,6 +144,22 @@ llvm::Constant *CGNVCUDARuntime::getLaun
   llvm::FunctionType::get(IntTy, CharPtrTy, false), "cudaLaunch");
 }
 
+llvm::FunctionType *CGNVCUDARuntime::getRegisterGlobalsFnTy() const {
+  return llvm::FunctionType::get(VoidTy, VoidPtrPtrTy, false);
+}
+
+llvm::FunctionType *CGNVCUDARuntime::getCallbackFnTy() const {
+  return llvm::FunctionType::get(VoidTy, VoidPtrTy, false);
+}
+
+llvm::FunctionType *CGNVCUDARuntime::getRegisterLinkedBinaryFnTy() const {
+  auto CallbackFnTy = getCallbackFnTy();
+  auto RegisterGlobalsFnTy = getRegisterGlobalsFnTy();
+  llvm::Type *Params[] = {RegisterGlobalsFnTy->getPointerTo(), VoidPtrTy,
+  VoidPtrTy, CallbackFnTy->getPointerTo()};
+  return llvm::FunctionType::get(VoidTy, Params, false);
+}
+
 void CGNVCUDARuntime::emitDeviceStub(CodeGenFunction &CGF,
  FunctionArgList &Args) {
   EmittedKernels.push_back(CGF.CurFn);
@@ -181,8 +221,8 @@ llvm::Function *CGNVCUDARuntime::makeReg
 return nullptr;
 
   llvm::Function *RegisterKernelsFunc = llvm::Function::Create(
-  llvm::FunctionType::get(VoidTy, VoidPtrPtrTy, false),
-  llvm::GlobalValue::InternalLinkage, "__cuda_register_globals", &TheModule);
+  getRegisterGlobalsFnTy(), llvm::GlobalValue::InternalLinkage,
+  "__cuda_register_globals", &TheModule);
   llvm::BasicBlock *EntryBB =
   llvm::BasicBlock::Create(Context, "entry", RegisterKernelsFunc);
   CGBuilderTy Builder(CGM, Context);
@@ -257,6 +297,11 @@ llvm::Function *CGNVCUDARuntime::makeMod
 
   // void __cuda_register_globals(void* handle);
   llvm::Function *RegisterGlobalsFunc = makeRegisterGlobalsFn();
+  // We always need a function to pass 

r330430 - [docs] Regenerate command line reference

2018-04-20 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Fri Apr 20 06:26:03 2018
New Revision: 330430

URL: http://llvm.org/viewvc/llvm-project?rev=330430&view=rev
Log:
[docs] Regenerate command line reference

This correctly sorts some manually added entries. Adding entries
manually should generally be avoided!

Modified:
cfe/trunk/docs/ClangCommandLineReference.rst

Modified: cfe/trunk/docs/ClangCommandLineReference.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ClangCommandLineReference.rst?rev=330430&r1=330429&r2=330430&view=diff
==
--- cfe/trunk/docs/ClangCommandLineReference.rst (original)
+++ cfe/trunk/docs/ClangCommandLineReference.rst Fri Apr 20 06:26:03 2018
@@ -144,14 +144,14 @@ Compile CUDA code for device only
 
 CUDA GPU architecture (e.g. sm\_35).  May be specified more than once.
 
-.. option:: --cuda-include-ptx=, --no-cuda-include-ptx=
-
-Include (or not) PTX along with CUDA GPU binary for the given architecture (e.g. sm\_35). Argument may be 'all'. The option may be specified more than once. Default: --cuda-include-ptx=all
-
 .. option:: --cuda-host-only
 
 Compile CUDA code for host only.  Has no effect on non-CUDA compilations.
 
+.. option:: --cuda-include-ptx=, --no-cuda-include-ptx=
+
+Include PTX for the following GPU architecture (e.g. sm\_35) or 'all'. May be specified more than once.
+
 .. option:: --cuda-noopt-device-debug, --no-cuda-noopt-device-debug
 
 Enable device-side debug info generation. Disables ptxas optimizations.
@@ -202,6 +202,10 @@ Use approximate transcendental functions
 
 Flush denormal floating point values to zero in CUDA device mode.
 
+.. option:: -ffixed-r19
+
+Reserve the r19 register (Hexagon only)
+
 .. option:: -fheinous-gnu-extensions
 
 .. option:: -flat\_namespace
@@ -242,6 +246,8 @@ Display available options
 
 .. option:: --help-hidden
 
+Display help for hidden options
+
 .. option:: -image\_base 
 
 .. option:: -index-header-map
@@ -940,7 +946,7 @@ Specify the module user build path
 
 Don't verify input files for the modules if the module has been successfully validated or loaded during this build session
 
-.. option:: -fmodules-validate-system-headers
+.. option:: -fmodules-validate-system-headers, -fno-modules-validate-system-headers
 
 Validate the system headers that a module depends on when loading the module
 
@@ -948,8 +954,6 @@ Validate the system headers that a modul
 
 Specify the prebuilt module path
 
-.. option:: -i
-
 .. option:: -idirafter, --include-directory-after , --include-directory-after=
 
 Add directory to AFTER include search path
@@ -1135,6 +1139,12 @@ Target-independent compilation options
 
 .. option:: -faccess-control, -fno-access-control
 
+.. option:: -falign-functions, -fno-align-functions
+
+.. program:: clang1
+.. option:: -falign-functions=
+.. program:: clang
+
 .. program:: clang1
 .. option:: -faligned-allocation, -faligned-new, -fno-aligned-allocation
 .. program:: clang
@@ -1337,6 +1347,8 @@ Use emutls functions to access thread\_l
 
 .. option:: -ferror-limit=
 
+.. option:: -fescaping-block-tail-calls, -fno-escaping-block-tail-calls
+
 .. option:: -fexceptions, -fno-exceptions
 
 Enable support for exception handling
@@ -1353,6 +1365,10 @@ Allow aggressive, lossy floating-point o
 
 .. option:: -ffor-scope, -fno-for-scope
 
+.. option:: -fforce-enable-int128, -fno-force-enable-int128
+
+Enable support for int128\_t type
+
 .. option:: -ffp-contract=
 
 Form fused FP ops (e.g. FMAs): fast (everywhere) \| on (according to FP\_CONTRACT pragma, default) \| off (never fuse)
@@ -1441,6 +1457,8 @@ Specify the maximum alignment to enforce
 
 .. option:: -fmerge-all-constants, -fno-merge-all-constants
 
+Allow merging of constants
+
 .. option:: -fmessage-length=
 
 .. option:: -fmodule-file-deps, -fno-module-file-deps
@@ -1523,10 +1541,16 @@ Do not elide types when printing diagnos
 
 Do not treat C++ operator name keywords as synonyms for operators
 
+.. option:: -fno-rtti-data
+
+Control emission of RTTI data
+
 .. option:: -fno-strict-modules-decluse
 
 .. option:: -fno-working-directory
 
+.. option:: -fnoxray-link-deps
+
 .. option:: -fobjc-abi-version=
 
 .. option:: -fobjc-arc, -fno-objc-arc
@@ -1680,6 +1704,10 @@ Allow division operations to be reassoci
 
 Override the default ABI to return small structs in registers
 
+.. option:: -fregister-global-dtors-with-atexit, -fno-register-global-dtors-with-atexit
+
+Use atexit or \_\_cxa\_atexit to register global destructors
+
 .. option:: -frelaxed-template-template-args, -fno-relaxed-template-template-args
 
 Enable C++17 relaxed template template argument matching
@@ -1906,9 +1934,17 @@ Store string literals as writable data
 
 Determine whether to always emit \_\_xray\_customevent(...) calls even if the function it appears in is not always instrumented.
 
+.. option:: -fxray-always-emit-typedevents, -fno-xray-always-emit-typedevents
+
+Determine whether to always emit \_\_xray\_typedevent(...) calls

r330429 - [OpenMP] Hide -fopenmp-cuda-mode

2018-04-20 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Fri Apr 20 06:25:59 2018
New Revision: 330429

URL: http://llvm.org/viewvc/llvm-project?rev=330429&view=rev
Log:
[OpenMP] Hide -fopenmp-cuda-mode

This is an advanced flag that should show up neither in clang --help
nor in the ClangCommandLineReference.

Modified:
cfe/trunk/include/clang/Driver/Options.td

Modified: cfe/trunk/include/clang/Driver/Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=330429&r1=330428&r2=330429&view=diff
==
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Fri Apr 20 06:25:59 2018
@@ -1463,8 +1463,10 @@ def fnoopenmp_relocatable_target : Flag<
 def fopenmp_simd : Flag<["-"], "fopenmp-simd">, Group, Flags<[CC1Option, NoArgumentUnused]>,
   HelpText<"Emit OpenMP code only for SIMD-based constructs.">;
 def fno_openmp_simd : Flag<["-"], "fno-openmp-simd">, Group, Flags<[CC1Option, NoArgumentUnused]>;
-def fopenmp_cuda_mode : Flag<["-"], "fopenmp-cuda-mode">, Group, Flags<[CC1Option, NoArgumentUnused]>;
-def fno_openmp_cuda_mode : Flag<["-"], "fno-openmp-cuda-mode">, Group, Flags<[NoArgumentUnused]>;
+def fopenmp_cuda_mode : Flag<["-"], "fopenmp-cuda-mode">, Group,
+  Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
+def fno_openmp_cuda_mode : Flag<["-"], "fno-openmp-cuda-mode">, Group,
+  Flags<[NoArgumentUnused, HelpHidden]>;
 def fno_optimize_sibling_calls : Flag<["-"], "fno-optimize-sibling-calls">, Group;
 def foptimize_sibling_calls : Flag<["-"], "foptimize-sibling-calls">, Group;
 def fno_escaping_block_tail_calls : Flag<["-"], "fno-escaping-block-tail-calls">, Group, Flags<[CC1Option]>;


Re: r336467 - [OPENMP] Fix PR38026: Link -latomic when -fopenmp is used.

2018-07-16 Thread Jonas Hahnfeld via cfe-commits
[ Moving discussion from https://reviews.llvm.org/D49386 to the relevant 
comment on cfe-commits, CC'ing Hal who commented on the original issue ]


Is this change really a good idea? It always requires libatomic for all 
OpenMP applications, even if there is no 'omp atomic' directive or all 
of them can be lowered to atomic instructions that don't require a 
runtime library. I'd argue that it's a larger restriction than the 
problem it solves.
Per https://clang.llvm.org/docs/Toolchain.html#libatomic-gnu the user is 
expected to manually link -latomic whenever Clang can't lower atomic 
instructions - including C11 atomics and C++ atomics. In my opinion 
OpenMP is just another abstraction that doesn't require a special 
treatment.
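
To illustrate the point about atomics that are lowered to plain
instructions, here is a small sketch. It relies only on the standard
ATOMIC_INT_LOCK_FREE macro and std::atomic<int>; the helper names are
hypothetical. Which types are lock-free is target-dependent, so this says
nothing about wider types, which are the ones that may end up calling into
libatomic.

```cpp
#include <atomic>
#include <cassert>

// ATOMIC_INT_LOCK_FREE is 2 when std::atomic<int> is always lock-free on
// the target, i.e. its operations map to atomic instructions and need no
// runtime library.
bool intAtomicsAreAlwaysLockFree() { return ATOMIC_INT_LOCK_FREE == 2; }

// Performs `increments` atomic additions on an int counter and returns
// the final value.
int countWithAtomicInt(int increments) {
  assert(increments >= 0 && "expected a non-negative count");
  std::atomic<int> counter{0};
  for (int i = 0; i < increments; ++i)
    counter.fetch_add(1, std::memory_order_relaxed);
  return counter.load();
}
```

On targets where intAtomicsAreAlwaysLockFree() returns true, a program like
this is expected to link without -latomic, which is the case the argument
above is about.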


Thoughts?
Jonas

On 2018-07-06 23:13, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Fri Jul  6 14:13:41 2018
New Revision: 336467

URL: http://llvm.org/viewvc/llvm-project?rev=336467&view=rev
Log:
[OPENMP] Fix PR38026: Link -latomic when -fopenmp is used.

On Linux atomic constructs in OpenMP require libatomic library. Patch
links libatomic when -fopenmp is used.

Modified:
cfe/trunk/lib/Driver/ToolChains/Gnu.cpp
cfe/trunk/test/OpenMP/linking.c

Modified: cfe/trunk/lib/Driver/ToolChains/Gnu.cpp
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Gnu.cpp?rev=336467&r1=336466&r2=336467&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Gnu.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Gnu.cpp Fri Jul  6 14:13:41 2018
@@ -479,6 +479,7 @@ void tools::gnutools::Linker::ConstructJ

   bool WantPthread = Args.hasArg(options::OPT_pthread) ||
  Args.hasArg(options::OPT_pthreads);
+  bool WantAtomic = false;

   // FIXME: Only pass GompNeedsRT = true for platforms with libgomp that
   // require librt. Most modern Linux platforms do, but some may not.
@@ -487,13 +488,16 @@ void tools::gnutools::Linker::ConstructJ
 /* GompNeedsRT= */ true))
     // OpenMP runtimes implies pthreads when using the GNU toolchain.
 // FIXME: Does this really make sense for all GNU toolchains?
-WantPthread = true;
+WantAtomic = WantPthread = true;

   AddRunTimeLibs(ToolChain, D, CmdArgs, Args);

   if (WantPthread && !isAndroid)
 CmdArgs.push_back("-lpthread");

+  if (WantAtomic)
+CmdArgs.push_back("-latomic");
+
   if (Args.hasArg(options::OPT_fsplit_stack))
 CmdArgs.push_back("--wrap=pthread_create");


Modified: cfe/trunk/test/OpenMP/linking.c
URL:
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/linking.c?rev=336467&r1=336466&r2=336467&view=diff
==
--- cfe/trunk/test/OpenMP/linking.c (original)
+++ cfe/trunk/test/OpenMP/linking.c Fri Jul  6 14:13:41 2018
@@ -8,14 +8,14 @@
 // RUN:   | FileCheck --check-prefix=CHECK-LD-32 %s
 // CHECK-LD-32: "{{.*}}ld{{(.exe)?}}"
 // CHECK-LD-32: "-l[[DEFAULT_OPENMP_LIB:[^"]*]]"
-// CHECK-LD-32: "-lpthread" "-lc"
+// CHECK-LD-32: "-lpthread" "-latomic" "-lc"
 //
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -fopenmp -target x86_64-unknown-linux -rtlib=platform \
 // RUN:   | FileCheck --check-prefix=CHECK-LD-64 %s
 // CHECK-LD-64: "{{.*}}ld{{(.exe)?}}"
 // CHECK-LD-64: "-l[[DEFAULT_OPENMP_LIB:[^"]*]]"
-// CHECK-LD-64: "-lpthread" "-lc"
+// CHECK-LD-64: "-lpthread" "-latomic" "-lc"
 //
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -fopenmp=libgomp -target i386-unknown-linux -rtlib=platform \
@@ -27,7 +27,7 @@
 // SIMD-ONLY2-NOT: liomp
 // CHECK-GOMP-LD-32: "{{.*}}ld{{(.exe)?}}"
 // CHECK-GOMP-LD-32: "-lgomp" "-lrt"
-// CHECK-GOMP-LD-32: "-lpthread" "-lc"
+// CHECK-GOMP-LD-32: "-lpthread" "-latomic" "-lc"

 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 -fopenmp-simd -target i386-unknown-linux -rtlib=platform | FileCheck --check-prefix SIMD-ONLY2 %s
 // SIMD-ONLY2-NOT: lgomp
@@ -39,21 +39,21 @@
 // RUN:   | FileCheck --check-prefix=CHECK-GOMP-LD-64 %s
 // CHECK-GOMP-LD-64: "{{.*}}ld{{(.exe)?}}"
 // CHECK-GOMP-LD-64: "-lgomp" "-lrt"
-// CHECK-GOMP-LD-64: "-lpthread" "-lc"
+// CHECK-GOMP-LD-64: "-lpthread" "-latomic" "-lc"
 //
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -fopenmp -target i386-unknown-linux -rtlib=platform \
 // RUN:   | FileCheck --check-prefix=CHECK-IOMP5-LD-32 %s
 // CHECK-IOMP5-LD-32: "{{.*}}ld{{(.exe)?}}"
 // CHECK-IOMP5-LD-32: "-l[[DEFAULT_OPENMP_LIB:[^"]*]]"
-// CHECK-IOMP5-LD-32: "-lpthread" "-lc"
+// CHECK-IOMP5-LD-32: "-lpthread" "-latomic" "-lc"
 //
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
 // RUN: -fopenmp -target x86_64-unknown-linux -rtlib=platform \
 // RUN:   | FileCheck --check-prefix=CHECK-IOMP5-LD-64 %s
 // CHECK-IOMP5-LD-64: "{{.*}}ld{{(.exe)?}}"
 // CHECK-IOMP5-LD-64: "-l[[DE

Re: [PATCH] D24601: XFAIL Driver/darwin-stdlib.cpp if CLANG_DEFAULT_CXX_STDLIB is set

2016-09-23 Thread Jonas Hahnfeld via cfe-commits
Hahnfeld added a comment.

ping


https://reviews.llvm.org/D24601



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D24601: XFAIL Driver/darwin-stdlib.cpp if CLANG_DEFAULT_CXX_STDLIB is set

2016-09-27 Thread Jonas Hahnfeld via cfe-commits
Hahnfeld added a comment.

In https://reviews.llvm.org/D24601#553482, @vsk wrote:

> It should be fine to XFAIL this test temporarily. Is there a PR for this?


I've now created an entry in Bugzilla: 
https://llvm.org/bugs/show_bug.cgi?id=30548


https://reviews.llvm.org/D24601





Re: [PATCH] D24601: XFAIL Driver/darwin-stdlib.cpp if CLANG_DEFAULT_CXX_STDLIB is set

2016-09-27 Thread Jonas Hahnfeld via cfe-commits
Hahnfeld updated the summary for this revision.
Hahnfeld updated this revision to Diff 72765.
Hahnfeld added a comment.

Link PR


https://reviews.llvm.org/D24601

Files:
  test/Driver/darwin-stdlib.cpp
  test/lit.cfg
  test/lit.site.cfg.in

Index: test/lit.site.cfg.in
===
--- test/lit.site.cfg.in
+++ test/lit.site.cfg.in
@@ -16,6 +16,7 @@
 config.llvm_use_sanitizer = "@LLVM_USE_SANITIZER@"
 config.have_zlib = "@HAVE_LIBZ@"
 config.clang_arcmt = @ENABLE_CLANG_ARCMT@
+config.clang_default_cxx_stdlib = "@CLANG_DEFAULT_CXX_STDLIB@"
 config.clang_staticanalyzer = @ENABLE_CLANG_STATIC_ANALYZER@
 config.clang_examples = @ENABLE_CLANG_EXAMPLES@
 config.enable_shared = @ENABLE_SHARED@
Index: test/lit.cfg
===
--- test/lit.cfg
+++ test/lit.cfg
@@ -343,6 +343,9 @@
 
 # Set available features we allow tests to conditionalize on.
 #
+if config.clang_default_cxx_stdlib != '':
+    config.available_features.add('default-cxx-stdlib-set')
+
 # Enabled/disabled features
 if config.clang_staticanalyzer != 0:
     config.available_features.add("staticanalyzer")
Index: test/Driver/darwin-stdlib.cpp
===
--- test/Driver/darwin-stdlib.cpp
+++ test/Driver/darwin-stdlib.cpp
@@ -1,3 +1,7 @@
+// This test will fail if CLANG_DEFAULT_CXX_STDLIB is set to anything different
+// than the platform default. (see https://llvm.org/bugs/show_bug.cgi?id=30548)
+// XFAIL: default-cxx-stdlib-set
+
 // RUN: %clang -target x86_64-apple-darwin -arch arm64 -miphoneos-version-min=7.0 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-LIBCXX
 // RUN: %clang -target x86_64-apple-darwin -mmacosx-version-min=10.8 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-LIBSTDCXX
 // RUN: %clang -target x86_64-apple-darwin -mmacosx-version-min=10.9 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-LIBCXX


r282701 - XFAIL Driver/darwin-stdlib.cpp if CLANG_DEFAULT_CXX_STDLIB is set

2016-09-29 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Thu Sep 29 02:43:08 2016
New Revision: 282701

URL: http://llvm.org/viewvc/llvm-project?rev=282701&view=rev
Log:
XFAIL Driver/darwin-stdlib.cpp if CLANG_DEFAULT_CXX_STDLIB is set

Until someone rewrites the stdlib logic for Darwin so that we don't need
to pass down the -stdlib argument to cc1.
(see https://llvm.org/bugs/show_bug.cgi?id=30548)

Differential Revision: https://reviews.llvm.org/D24601

Modified:
cfe/trunk/test/Driver/darwin-stdlib.cpp
cfe/trunk/test/lit.cfg
cfe/trunk/test/lit.site.cfg.in

Modified: cfe/trunk/test/Driver/darwin-stdlib.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/darwin-stdlib.cpp?rev=282701&r1=282700&r2=282701&view=diff
==
--- cfe/trunk/test/Driver/darwin-stdlib.cpp (original)
+++ cfe/trunk/test/Driver/darwin-stdlib.cpp Thu Sep 29 02:43:08 2016
@@ -1,3 +1,7 @@
+// This test will fail if CLANG_DEFAULT_CXX_STDLIB is set to anything different
+// than the platform default. (see https://llvm.org/bugs/show_bug.cgi?id=30548)
+// XFAIL: default-cxx-stdlib-set
+
 // RUN: %clang -target x86_64-apple-darwin -arch arm64 -miphoneos-version-min=7.0 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-LIBCXX
 // RUN: %clang -target x86_64-apple-darwin -mmacosx-version-min=10.8 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-LIBSTDCXX
 // RUN: %clang -target x86_64-apple-darwin -mmacosx-version-min=10.9 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-LIBCXX

Modified: cfe/trunk/test/lit.cfg
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/lit.cfg?rev=282701&r1=282700&r2=282701&view=diff
==
--- cfe/trunk/test/lit.cfg (original)
+++ cfe/trunk/test/lit.cfg Thu Sep 29 02:43:08 2016
@@ -343,6 +343,9 @@ for pattern in tool_patterns:
 
 # Set available features we allow tests to conditionalize on.
 #
+if config.clang_default_cxx_stdlib != '':
+    config.available_features.add('default-cxx-stdlib-set')
+
 # Enabled/disabled features
 if config.clang_staticanalyzer != 0:
     config.available_features.add("staticanalyzer")

Modified: cfe/trunk/test/lit.site.cfg.in
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/lit.site.cfg.in?rev=282701&r1=282700&r2=282701&view=diff
==
--- cfe/trunk/test/lit.site.cfg.in (original)
+++ cfe/trunk/test/lit.site.cfg.in Thu Sep 29 02:43:08 2016
@@ -16,6 +16,7 @@ config.target_triple = "@TARGET_TRIPLE@"
 config.llvm_use_sanitizer = "@LLVM_USE_SANITIZER@"
 config.have_zlib = "@HAVE_LIBZ@"
 config.clang_arcmt = @ENABLE_CLANG_ARCMT@
+config.clang_default_cxx_stdlib = "@CLANG_DEFAULT_CXX_STDLIB@"
 config.clang_staticanalyzer = @ENABLE_CLANG_STATIC_ANALYZER@
 config.clang_examples = @ENABLE_CLANG_EXAMPLES@
 config.enable_shared = @ENABLE_SHARED@




Re: [PATCH] D24601: XFAIL Driver/darwin-stdlib.cpp if CLANG_DEFAULT_CXX_STDLIB is set

2016-09-29 Thread Jonas Hahnfeld via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL282701: XFAIL Driver/darwin-stdlib.cpp if CLANG_DEFAULT_CXX_STDLIB is set (authored by Hahnfeld).

Changed prior to commit:
  https://reviews.llvm.org/D24601?vs=72765&id=72960#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D24601

Files:
  cfe/trunk/test/Driver/darwin-stdlib.cpp
  cfe/trunk/test/lit.cfg
  cfe/trunk/test/lit.site.cfg.in

Index: cfe/trunk/test/lit.site.cfg.in
===
--- cfe/trunk/test/lit.site.cfg.in
+++ cfe/trunk/test/lit.site.cfg.in
@@ -16,6 +16,7 @@
 config.llvm_use_sanitizer = "@LLVM_USE_SANITIZER@"
 config.have_zlib = "@HAVE_LIBZ@"
 config.clang_arcmt = @ENABLE_CLANG_ARCMT@
+config.clang_default_cxx_stdlib = "@CLANG_DEFAULT_CXX_STDLIB@"
 config.clang_staticanalyzer = @ENABLE_CLANG_STATIC_ANALYZER@
 config.clang_examples = @ENABLE_CLANG_EXAMPLES@
 config.enable_shared = @ENABLE_SHARED@
Index: cfe/trunk/test/Driver/darwin-stdlib.cpp
===
--- cfe/trunk/test/Driver/darwin-stdlib.cpp
+++ cfe/trunk/test/Driver/darwin-stdlib.cpp
@@ -1,3 +1,7 @@
+// This test will fail if CLANG_DEFAULT_CXX_STDLIB is set to anything different
+// than the platform default. (see https://llvm.org/bugs/show_bug.cgi?id=30548)
+// XFAIL: default-cxx-stdlib-set
+
 // RUN: %clang -target x86_64-apple-darwin -arch arm64 -miphoneos-version-min=7.0 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-LIBCXX
 // RUN: %clang -target x86_64-apple-darwin -mmacosx-version-min=10.8 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-LIBSTDCXX
 // RUN: %clang -target x86_64-apple-darwin -mmacosx-version-min=10.9 %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-LIBCXX
Index: cfe/trunk/test/lit.cfg
===
--- cfe/trunk/test/lit.cfg
+++ cfe/trunk/test/lit.cfg
@@ -343,6 +343,9 @@
 
 # Set available features we allow tests to conditionalize on.
 #
+if config.clang_default_cxx_stdlib != '':
+    config.available_features.add('default-cxx-stdlib-set')
+
 # Enabled/disabled features
 if config.clang_staticanalyzer != 0:
     config.available_features.add("staticanalyzer")


Re: [PATCH] D22452: [libcxx] Fix last_write_time tests for filesystems that don't support negative and very large times.

2016-09-29 Thread Jonas Hahnfeld via cfe-commits
Hahnfeld added a comment.

ping?


https://reviews.llvm.org/D22452





[PATCH] D25263: [Driver] Allow setting the default linker during build

2016-10-17 Thread Jonas Hahnfeld via cfe-commits
Hahnfeld added a comment.

Have you run all tests with `CLANG_DEFAULT_LINKER` not being the platform 
default? I imagine there might be some tests that expect `ld` to be used...




Comment at: CMakeLists.txt:198
 
+set(CLANG_DEFAULT_LINKER "" CACHE STRING
+  "Default linker to use (\"bfd\" or \"gold\" or \"lld\", empty for platform default")

bruno wrote:
> mgorny wrote:
> > Is there a reason not to allow using the absolute path here, like for the 
> > command-line option?
> I agree here, if we're adding a cmake option for this, it should accept full 
> paths to the linker to be used (without any need for its type like gold, bfd, 
> etc) as well.
> 
> Additionally, if "" maps to "ld", plain CLANG_DEFAULT_LINKER="ld" should also 
> work here.
I agree with both points here.



Comment at: lib/Driver/ToolChain.cpp:352
+  return UseLinker;
+  } else if (A && (UseLinker.empty() || UseLinker == "ld")) {
+// If we're passed -fuse-ld= with no argument, or with the argument ld,

I wonder whether this is really correct: If `DefaultLinker` is not `ld` (it is 
`lld` for some ToolChains), `-fuse-ld=` with an empty argument should probably 
not use `ld` but rather whatever `DefaultLinker` says...
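
A minimal sketch of the precedence suggested in this comment, assuming three inputs: the optional `-fuse-ld=` value, the build-time `CLANG_DEFAULT_LINKER`, and the toolchain's own default. The function `pickLinkerName` and its parameters are hypothetical and not part of the driver; the real code additionally resolves names to program paths and special-cases `ld`.

```cpp
#include <optional>
#include <string>

// Hypothetical helper (not clang API): chooses a linker *name* only.
//   fuseLd           - value of -fuse-ld=, or nullopt if the flag is absent
//   buildDefault     - build-time default, empty if not configured
//   toolchainDefault - what the toolchain would pick on its own (e.g. "lld")
std::string pickLinkerName(const std::optional<std::string> &fuseLd,
                           const std::string &buildDefault,
                           const std::string &toolchainDefault) {
  // An explicit, non-empty user choice always wins.
  if (fuseLd && !fuseLd->empty())
    return *fuseLd;
  // Flag absent: honour the build-time default if one was configured.
  if (!fuseLd && !buildDefault.empty())
    return buildDefault;
  // Flag present but empty, or nothing configured: defer to the toolchain
  // instead of hard-coding "ld".
  return toolchainDefault;
}
```

With this ordering, an empty `-fuse-ld=` on a toolchain whose default is `lld` yields `lld` rather than `ld`, which is the behaviour the comment argues for.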


Repository:
  rL LLVM

https://reviews.llvm.org/D25263





[PATCH] D25669: [Driver] Simplify ToolChain::GetCXXStdlibType (NFC)

2016-10-17 Thread Jonas Hahnfeld via cfe-commits
Hahnfeld created this revision.
Hahnfeld added reviewers: mgorny, ddunbar, phosek.
Hahnfeld added subscribers: cfe-commits, zlei.

I made the wrong assumption that execution would continue after an error Diag, 
which led to unnecessarily complex code.
This patch aligns the implementation with the better one in 
`ToolChain::GetRuntimeLibType`.
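
The early-return shape this patch moves to can be sketched in isolation as follows. This is only an illustration under stated assumptions: `CXXStdlib`, `Resolution` and `resolveStdlib` are hypothetical stand-ins, and the `diagnosed` flag merely models where the driver would emit its invalid-stdlib error.

```cpp
#include <optional>
#include <string_view>

// Hypothetical stand-ins for the driver types; not the real clang API.
enum class CXXStdlib { Libcxx, Libstdcxx };

struct Resolution {
  CXXStdlib type;
  bool diagnosed;  // true where an "invalid stdlib name" error would be emitted
};

// userArg:         value of -stdlib=, or nullopt if the flag is absent
// buildDefault:    build-time default (may be empty)
// platformDefault: what the toolchain picks on its own
Resolution resolveStdlib(std::optional<std::string_view> userArg,
                         std::string_view buildDefault,
                         CXXStdlib platformDefault) {
  std::string_view name = userArg ? *userArg : buildDefault;

  if (name == "libc++")
    return {CXXStdlib::Libcxx, false};
  if (name == "libstdc++")
    return {CXXStdlib::Libstdcxx, false};
  if (name == "platform")
    return {platformDefault, false};

  // Unknown name: only a user-supplied value is diagnosed; an empty or
  // unrecognised build default silently falls back to the platform default.
  return {platformDefault, userArg.has_value()};
}
```

Each recognised name returns immediately, so no "has valid type" or "force platform default" state needs to be tracked.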


https://reviews.llvm.org/D25669

Files:
  lib/Driver/ToolChain.cpp


Index: lib/Driver/ToolChain.cpp
===
--- lib/Driver/ToolChain.cpp
+++ lib/Driver/ToolChain.cpp
@@ -546,43 +546,22 @@
   return GetDefaultRuntimeLibType();
 }
 
-static bool ParseCXXStdlibType(const StringRef& Name,
-                               ToolChain::CXXStdlibType& Type) {
-  if (Name == "libc++")
-    Type = ToolChain::CST_Libcxx;
-  else if (Name == "libstdc++")
-    Type = ToolChain::CST_Libstdcxx;
-  else
-    return false;
-
-  return true;
-}
-
 ToolChain::CXXStdlibType ToolChain::GetCXXStdlibType(const ArgList &Args) const{
-  ToolChain::CXXStdlibType Type;
-  bool HasValidType = false;
-  bool ForcePlatformDefault = false;
-
   const Arg *A = Args.getLastArg(options::OPT_stdlib_EQ);
-  if (A) {
-    StringRef Value = A->getValue();
-    HasValidType = ParseCXXStdlibType(Value, Type);
-
-    // Only use in tests to override CLANG_DEFAULT_CXX_STDLIB!
-    if (Value == "platform")
-      ForcePlatformDefault = true;
-    else if (!HasValidType)
-      getDriver().Diag(diag::err_drv_invalid_stdlib_name)
-        << A->getAsString(Args);
-  }
+  StringRef LibName = A ? A->getValue() : CLANG_DEFAULT_CXX_STDLIB;
 
-  // If no argument was provided or its value was invalid, look for the
-  // default unless forced or configured to take the platform default.
-  if (!HasValidType && (ForcePlatformDefault ||
-      !ParseCXXStdlibType(CLANG_DEFAULT_CXX_STDLIB, Type)))
-    Type = GetDefaultCXXStdlibType();
+  // "platform" is only used in tests to override CLANG_DEFAULT_CXX_STDLIB
+  if (LibName == "libc++")
+    return ToolChain::CST_Libcxx;
+  else if (LibName == "libstdc++")
+    return ToolChain::CST_Libstdcxx;
+  else if (LibName == "platform")
+    return GetDefaultCXXStdlibType();
+
+  if (A)
+    getDriver().Diag(diag::err_drv_invalid_stdlib_name) << A->getAsString(Args);
 
-  return Type;
+  return GetDefaultCXXStdlibType();
 }
 
 /// \brief Utility function to add a system include directory to CC1 arguments.


r318836 - [OpenMP] Adjust arguments of nvptx runtime functions

2017-11-22 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Wed Nov 22 06:46:49 2017
New Revision: 318836

URL: http://llvm.org/viewvc/llvm-project?rev=318836&view=rev
Log:
[OpenMP] Adjust arguments of nvptx runtime functions

In the future the compiler will analyze whether the OpenMP
runtime needs to be (fully) initialized and avoid that overhead
if possible. The functions already take an argument to transfer
that information to the runtime, so pass in the default value 1.
(This is needed for binary compatibility with libomptarget-nvptx
currently being upstreamed.)

Differential Revision: https://reviews.llvm.org/D40354

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/test/OpenMP/nvptx_parallel_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_teams_reduction_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=318836&r1=318835&r2=318836&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Wed Nov 22 06:46:49 2017
@@ -22,19 +22,21 @@ using namespace CodeGen;
 
 namespace {
 enum OpenMPRTLFunctionNVPTX {
-  /// \brief Call to void __kmpc_kernel_init(kmp_int32 thread_limit);
+  /// \brief Call to void __kmpc_kernel_init(kmp_int32 thread_limit,
+  /// int16_t RequiresOMPRuntime);
   OMPRTL_NVPTX__kmpc_kernel_init,
-  /// \brief Call to void __kmpc_kernel_deinit();
+  /// \brief Call to void __kmpc_kernel_deinit(int16_t IsOMPRuntimeInitialized);
   OMPRTL_NVPTX__kmpc_kernel_deinit,
   /// \brief Call to void __kmpc_spmd_kernel_init(kmp_int32 thread_limit,
-  /// short RequiresOMPRuntime, short RequiresDataSharing);
+  /// int16_t RequiresOMPRuntime, int16_t RequiresDataSharing);
   OMPRTL_NVPTX__kmpc_spmd_kernel_init,
   /// \brief Call to void __kmpc_spmd_kernel_deinit();
   OMPRTL_NVPTX__kmpc_spmd_kernel_deinit,
   /// \brief Call to void __kmpc_kernel_prepare_parallel(void
-  /// *outlined_function);
+  /// *outlined_function, void ***args, kmp_int32 nArgs);
   OMPRTL_NVPTX__kmpc_kernel_prepare_parallel,
-  /// \brief Call to bool __kmpc_kernel_parallel(void **outlined_function);
+  /// \brief Call to bool __kmpc_kernel_parallel(void **outlined_function, void
+  /// ***args);
   OMPRTL_NVPTX__kmpc_kernel_parallel,
   /// \brief Call to void __kmpc_kernel_end_parallel();
   OMPRTL_NVPTX__kmpc_kernel_end_parallel,
@@ -355,7 +357,9 @@ void CGOpenMPRuntimeNVPTX::emitGenericEn
   CGF.EmitBlock(MasterBB);
   // First action in sequential region:
   // Initialize the state of the OpenMP runtime library on the GPU.
-  llvm::Value *Args[] = {getThreadLimit(CGF)};
+  // TODO: Optimize runtime initialization and pass in correct value.
+  llvm::Value *Args[] = {getThreadLimit(CGF),
+ Bld.getInt16(/*RequiresOMPRuntime=*/1)};
   CGF.EmitRuntimeCall(
   createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_kernel_init), Args);
 }
@@ -370,8 +374,10 @@ void CGOpenMPRuntimeNVPTX::emitGenericEn
 
   CGF.EmitBlock(TerminateBB);
   // Signal termination condition.
+  // TODO: Optimize runtime initialization and pass in correct value.
+  llvm::Value *Args[] = {CGF.Builder.getInt16(/*IsOMPRuntimeInitialized=*/1)};
   CGF.EmitRuntimeCall(
-  createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_kernel_deinit), None);
+  createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_kernel_deinit), Args);
   // Barrier to terminate worker threads.
   syncCTAThreads(CGF);
   // Master thread jumps to exit point.
@@ -597,23 +603,25 @@ CGOpenMPRuntimeNVPTX::createNVPTXRuntime
   llvm::Constant *RTLFn = nullptr;
   switch (static_cast<OpenMPRTLFunctionNVPTX>(Function)) {
   case OMPRTL_NVPTX__kmpc_kernel_init: {
-// Build void __kmpc_kernel_init(kmp_int32 thread_limit);
-llvm::Type *TypeParams[] = {CGM.Int32Ty};
+// Build void __kmpc_kernel_init(kmp_int32 thread_limit, int16_t
+// RequiresOMPRuntime);
+llvm::Type *TypeParams[] = {CGM.Int32Ty, CGM.Int16Ty};
 llvm::FunctionType *FnTy =
 llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false);
 RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_kernel_init");
 break;
   }
   case OMPRTL_NVPTX__kmpc_kernel_deinit: {
-// Build void __kmpc_kernel_deinit();
+// Build void __kmpc_kernel_deinit(int16_t IsOMPRuntimeInitialized);
+llvm::Type *TypeParams[] = {CGM.Int16Ty};
 llvm::FunctionType *FnTy =
-llvm::FunctionType::get(CGM.VoidTy, llvm::None, /*isVarArg*/ false);
+llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false);
 RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_kernel_deinit");
 break;
   }
   case OMPRTL_NVPTX__kmpc_spmd_kernel_init: {
 // Build void __kmpc_spmd_kernel_init(kmp_int32 thread_limit,
-// short RequiresOMPRuntime, short RequiresDataSharin

r319931 - Fix PR35542: Correct adjusting of private reduction variable

2017-12-06 Thread Jonas Hahnfeld via cfe-commits
Author: hahnfeld
Date: Wed Dec  6 11:15:28 2017
New Revision: 319931

URL: http://llvm.org/viewvc/llvm-project?rev=319931&view=rev
Log:
Fix PR35542: Correct adjusting of private reduction variable

The adjustment is calculated with CreatePtrDiff() which returns
the difference in (base) elements. This is passed to CreateGEP()
so make sure that the GEP base has the correct pointer type:
It needs to be a pointer to the base type, not a pointer to a
constant sized array.
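
The same effect can be reproduced with ordinary pointer arithmetic, since a GEP index scales by the pointee type just as `p + n` does in C++. The sketch below is an analogy under that assumption, not the CodeGen change itself; `Block`, `strideInBytes` and `adjustPrivate` are hypothetical names.

```cpp
#include <cstddef>

using Elem = int;
using Block = Elem[1][2];  // mirrors the [1 x [2 x i32]] type in the test

// Number of bytes covered by advancing `p` by one step of its pointee type.
template <typename T>
std::size_t strideInBytes(T *p) {
  return static_cast<std::size_t>(reinterpret_cast<const char *>(p + 1) -
                                  reinterpret_cast<const char *>(p));
}

// Apply an offset counted in base elements. The pointer is first converted
// to a pointer to the element type, so each step is sizeof(Elem), not
// sizeof(Block).
Elem *adjustPrivate(Block *priv, std::ptrdiff_t elemOffset) {
  Elem *base = &(*priv)[0][0];
  return base + elemOffset;
}
```

Stepping a `Block *` by an element count overshoots by a factor of the array size, which is the kind of mismatch described in the log message; converting to `Elem *` first keeps the offset and the stride in the same unit.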

Differential Revision: https://reviews.llvm.org/D40911

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/test/OpenMP/for_reduction_codegen.cpp
cfe/trunk/test/OpenMP/for_reduction_codegen_UDR.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=319931&r1=319930&r2=319931&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Wed Dec  6 11:15:28 2017
@@ -1104,11 +1104,14 @@ Address ReductionCodeGen::adjustPrivateA
 OriginalBaseLValue);
 llvm::Value *Adjustment = CGF.Builder.CreatePtrDiff(
 BaseLValue.getPointer(), SharedAddresses[N].first.getPointer());
-llvm::Value *Ptr =
-CGF.Builder.CreateGEP(PrivateAddr.getPointer(), Adjustment);
+llvm::Value *PrivatePointer =
+CGF.Builder.CreatePointerBitCastOrAddrSpaceCast(
+PrivateAddr.getPointer(),
+SharedAddresses[N].first.getAddress().getType());
+llvm::Value *Ptr = CGF.Builder.CreateGEP(PrivatePointer, Adjustment);
 return castToBase(CGF, OrigVD->getType(),
   SharedAddresses[N].first.getType(),
-  OriginalBaseLValue.getPointer()->getType(),
+  OriginalBaseLValue.getAddress().getType(),
   OriginalBaseLValue.getAlignment(), Ptr);
   }
   BaseDecls.emplace_back(

Modified: cfe/trunk/test/OpenMP/for_reduction_codegen.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/for_reduction_codegen.cpp?rev=319931&r1=319930&r2=319931&view=diff
==
--- cfe/trunk/test/OpenMP/for_reduction_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/for_reduction_codegen.cpp Wed Dec  6 11:15:28 2017
@@ -944,8 +944,8 @@ int main() {
 // CHECK: [[LOW_BOUND:%.+]] = ptrtoint i32* [[LOW]] to i64
 // CHECK: [[OFFSET_BYTES:%.+]] = sub i64 [[START]], [[LOW_BOUND]]
 // CHECK: [[OFFSET:%.+]] = sdiv exact i64 [[OFFSET_BYTES]], ptrtoint (i32* getelementptr (i32, i32* null, i32 1) to i64)
-// CHECK: [[PSEUDO_ARR_PRIV:%.+]] = getelementptr [1 x [2 x i32]], [1 x [2 x i32]]* [[ARR_PRIV]], i64 [[OFFSET]]
-// CHECK: [[ARR_PRIV:%.+]] = bitcast [1 x [2 x i32]]* [[PSEUDO_ARR_PRIV]] to i32*
+// CHECK: [[ARR_PRIV_PTR:%.+]] = bitcast [1 x [2 x i32]]* [[ARR_PRIV]] to i32*
+// CHECK: [[ARR_PRIV:%.+]] = getelementptr i32, i32* [[ARR_PRIV_PTR]], i64 [[OFFSET]]
 
 // CHECK: ret void
 
@@ -1000,9 +1000,9 @@ int main() {
 // CHECK: [[LOW_BOUND:%.+]] = ptrtoint [[S_FLOAT_TY]]* [[LOW]] to i64
 // CHECK: [[OFFSET_BYTES:%.+]] = sub i64 [[START]], [[LOW_BOUND]]
 // CHECK: [[OFFSET:%.+]] = sdiv exact i64 [[OFFSET_BYTES]], ptrtoint (float* getelementptr (float, float* null, i32 1) to i64)
-// CHECK: [[PSEUDO_VAR2_PRIV:%.+]] = getelementptr [1 x [6 x [[S_FLOAT_TY]]]], [1 x [6 x [[S_FLOAT_TY]]]]* [[VAR2_PRIV]], i64 [[OFFSET]]
+// CHECK: [[VAR2_PRIV_PTR:%.+]] = bitcast [1 x [6 x [[S_FLOAT_TY]]]]* [[VAR2_PRIV]] to [[S_FLOAT_TY]]*
+// CHECK: [[VAR2_PRIV:%.+]] = getelementptr [[S_FLOAT_TY]], [[S_FLOAT_TY]]* [[VAR2_PRIV_PTR]], i64 [[OFFSET]]
 // CHECK: store [[S_FLOAT_TY]]** [[REF:.+]], [[S_FLOAT_TY]]*** %
-// CHECK: [[VAR2_PRIV:%.+]] = bitcast [1 x [6 x [[S_FLOAT_TY]]]]* [[PSEUDO_VAR2_PRIV]] to [[S_FLOAT_TY]]*
 // CHECK: store [[S_FLOAT_TY]]* [[VAR2_PRIV]], [[S_FLOAT_TY]]** [[REF]]
 // CHECK: ret void
 
@@ -1029,9 +1029,9 @@ int main() {
 // CHECK: [[LOW_BOUND:%.+]] = ptrtoint [[S_FLOAT_TY]]* [[LOW]] to i64
 // CHECK: [[OFFSET_BYTES:%.+]] = sub i64 [[START]], [[LOW_BOUND]]
 // CHECK: [[OFFSET:%.+]] = sdiv exact i64 [[OFFSET_BYTES]], ptrtoint (float* getelementptr (float, float* null, i32 1) to i64)
-// CHECK: [[PSEUDO_VAR2_PRIV:%.+]] = getelementptr [1 x [6 x [[S_FLOAT_TY]]]], [1 x [6 x [[S_FLOAT_TY]]]]* [[VAR2_PRIV]], i64 [[OFFSET]]
+// CHECK: [[VAR2_PRIV_PTR:%.+]] = bitcast [1 x [6 x [[S_FLOAT_TY]]]]* [[VAR2_PRIV]] to [[S_FLOAT_TY]]*
+// CHECK: [[VAR2_PRIV:%.+]] = getelementptr [[S_FLOAT_TY]], [[S_FLOAT_TY]]* [[VAR2_PRIV_PTR]], i64 [[OFFSET]]
 // CHECK: store [[S_FLOAT_TY]]** [[REF:.+]], [[S_FLOAT_TY]]*** %
-// CHECK: [[VAR2_PRIV:%.+]] = bitcast [1 x [6 x [[S_FLOAT_TY]]]]* [[PSEUDO_VAR2_PRIV]] to [[S_FLOAT_TY]]*
 // CHECK: store [[S_FLOAT_TY]]* [[VAR2_PRIV]], [[S_FLOAT_TY]]** [[REF]]
 // CHECK: ret void
 
@@ -1080,7 +1080,8 @@ int main() {
 // CHECK: 

Re: r315996 - [CMake][OpenMP] Customize default offloading arch

2017-12-07 Thread Jonas Hahnfeld via cfe-commits

Hi Ahmed,

On 2017-12-07 19:57, Ahmed Bougacha wrote:

Hi Jonas,

On Tue, Oct 17, 2017 at 6:37 AM, Jonas Hahnfeld via cfe-commits wrote:

Author: hahnfeld
Date: Tue Oct 17 06:37:36 2017
New Revision: 315996

URL: http://llvm.org/viewvc/llvm-project?rev=315996&view=rev
Log:
[CMake][OpenMP] Customize default offloading arch

For the shuffle instructions in reductions we need at least sm_30
but the user may want to customize the default architecture.

Differential Revision: https://reviews.llvm.org/D38883

Modified:
cfe/trunk/CMakeLists.txt
cfe/trunk/include/clang/Config/config.h.cmake
cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
cfe/trunk/lib/Driver/ToolChains/Cuda.h

Modified: cfe/trunk/CMakeLists.txt
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/CMakeLists.txt?rev=315996&r1=315995&r2=315996&view=diff

==
--- cfe/trunk/CMakeLists.txt (original)
+++ cfe/trunk/CMakeLists.txt Tue Oct 17 06:37:36 2017
@@ -235,6 +235,17 @@ endif()
 set(CLANG_DEFAULT_OPENMP_RUNTIME "libomp" CACHE STRING
   "Default OpenMP runtime used by -fopenmp.")

+# OpenMP offloading requires at least sm_30 because we use shuffle instructions
+# to generate efficient code for reductions.
+set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
+  "Default architecture for OpenMP offloading to Nvidia GPUs.")
+string(REGEX MATCH "^sm_([0-9]+)$" MATCHED_ARCH "${CLANG_OPENMP_NVPTX_DEFAULT_ARCH}")
+if (NOT DEFINED MATCHED_ARCH OR "${CMAKE_MATCH_1}" LESS 30)
+  message(WARNING "Resetting default architecture for OpenMP offloading to Nvidia GPUs to sm_30")


This warning is pretty noisy and doesn't affect most people: I don't
know what it means but I get it in every cmake run.
Can we somehow restrict or disable it?


So the next line used to say

+  set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
+    "Default architecture for OpenMP offloading to Nvidia GPUs." FORCE)


which should make sure that the cache is updated to a "correct" value 
and you only see the warning once. That said, we have raised the default 
to "sm_35" today, maybe something has gone wrong here. Let me check that 
and come back to you!


Cheers,
Jonas


Re: r315996 - [CMake][OpenMP] Customize default offloading arch

2017-12-07 Thread Jonas Hahnfeld via cfe-commits

On 2017-12-07 20:34, Jonas Hahnfeld via cfe-commits wrote:

So the next line used to say

+  set(CLANG_OPENMP_NVPTX_DEFAULT_ARCH "sm_30" CACHE STRING
+"Default architecture for OpenMP offloading to Nvidia GPUs." FORCE)


which should make sure that the cache is updated to a "correct" value
and you only see the warning once. That said, we have raised the
default to "sm_35" today, maybe something has gone wrong here. Let me
check that and come back to you!


Works "correctly" (at least as intended) for me: I get a warning if the 
cache has an incorrect value or the user specifies it on the command 
line. Right then the cache is updated (FORCEd set) and the warning isn't 
printed in future CMake invocations. I'm using CMake 3.5.2, maybe a 
newer version behaves differently? In that case I agree that we should 
fix this, the warning wasn't meant to annoy everyone on each 
reconfiguration!


Jonas


[clang] f0403c8 - Fix build of Lex unit test with CLANG_DYLIB

2022-09-12 Thread Jonas Hahnfeld via cfe-commits

Author: Jonas Hahnfeld
Date: 2022-09-12T13:49:57+02:00
New Revision: f0403c853bc93fe1127fef7493a4feff1479191e

URL: 
https://github.com/llvm/llvm-project/commit/f0403c853bc93fe1127fef7493a4feff1479191e
DIFF: 
https://github.com/llvm/llvm-project/commit/f0403c853bc93fe1127fef7493a4feff1479191e.diff

LOG: Fix build of Lex unit test with CLANG_DYLIB

If CLANG_LINK_CLANG_DYLIB, clang_target_link_libraries ignores all
individual libraries and only links clang-cpp. As LLVMTestingSupport
is separate, pass it via target_link_libraries directly.

Added: 


Modified: 
clang/unittests/Lex/CMakeLists.txt

Removed: 




diff  --git a/clang/unittests/Lex/CMakeLists.txt 
b/clang/unittests/Lex/CMakeLists.txt
index 5b498f54fb0af..bed5fd9186f22 100644
--- a/clang/unittests/Lex/CMakeLists.txt
+++ b/clang/unittests/Lex/CMakeLists.txt
@@ -20,6 +20,9 @@ clang_target_link_libraries(LexTests
   clangLex
   clangParse
   clangSema
+  )
 
+target_link_libraries(LexTests
+  PRIVATE
   LLVMTestingSupport
   )





[clang] Fix crash with modules and constexpr destructor (PR #69076)

2023-11-15 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

ping @shafik @cor3ntin @ChuanqiXu9, how can we make progress here?

https://github.com/llvm/llvm-project/pull/69076


[clang] [JITLink][RISCV] Implement eh_frame handling (PR #68253)

2023-10-28 Thread Jonas Hahnfeld via cfe-commits

https://github.com/hahnjo closed https://github.com/llvm/llvm-project/pull/68253


[clang] Fix crash with modules and constexpr destructor (PR #69076)

2023-10-30 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

I can add the comment as requested, but for the other questions related to full 
expressions and modules I'd really need input from experts...

https://github.com/llvm/llvm-project/pull/69076


[clang] [clang-repl] [test] Make an XFAIL more precise (PR #70991)

2023-11-01 Thread Jonas Hahnfeld via cfe-commits

https://github.com/hahnjo approved this pull request.

Very interesting... See also https://github.com/llvm/llvm-project/issues/68092, 
now I understand even less what the problem is...

https://github.com/llvm/llvm-project/pull/70991


[llvm] [clang] Reapply #2 [clang-repl] [test] Make an XFAIL more precise (#70991) (PR #71168)

2023-11-03 Thread Jonas Hahnfeld via cfe-commits

https://github.com/hahnjo approved this pull request.

Looks reasonable to me. I know this fixes a test error for MinGW, but if 
possible maybe let it sit until early next week in case somebody else has a 
different opinion on moving `host=` to `lit`.

https://github.com/llvm/llvm-project/pull/71168


[clang] [clang][Sema] Always clear UndefinedButUsed (PR #73955)

2023-12-12 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

I tried to craft a test here, but declaration unloading in `clang-repl` is not 
powerful enough (yet) to show a (user-visible) consequence of the problem.

On a high level, the problem should be triggered by:
```
clang-repl> template <typename T> struct A { void f() { } };
clang-repl> A<int>().f();
clang-repl> %undo
clang-repl> A<int>().f();
```
In principle, the `%undo` should remove the implicit template instantiation of 
`f` which is then subsequently re-instantiated. This is currently not 
implemented in `clang-repl` (we just re-use the first instantiation, which is 
wrong in case there are more lines in between that could influence the 
instantiation). With debug statements, I can verify that `f` is in 
`UndefinedButUsed`, but `getUndefinedButUsed` filters it out.

The slightly more complex
```
clang-repl> template <typename T> struct A { void f() { } };
clang-repl> A<int>().f();
clang-repl> %undo
clang-repl> %undo
clang-repl> template <typename T> struct A { void f() { } };
clang-repl> A<int>().f();
```
i.e. unloading the entire class, doesn't work either, for the same reason: we 
never actually treat the instantiated member function. (Subsequently, this leads 
to the entire `clang-repl` crashing after printing `definition with same mangled 
name '_ZN1AIiE1fEv' as another definition`...)

https://github.com/llvm/llvm-project/pull/73955


[clang] [clang][Sema] Always clear UndefinedButUsed (PR #73955)

2023-12-12 Thread Jonas Hahnfeld via cfe-commits

https://github.com/hahnjo closed https://github.com/llvm/llvm-project/pull/73955


[clang] [clang][Sema] Always clear UndefinedButUsed (PR #73955)

2023-11-30 Thread Jonas Hahnfeld via cfe-commits

https://github.com/hahnjo created 
https://github.com/llvm/llvm-project/pull/73955

Before, it was only cleared if there were undefined entities. This is important 
for Clang's incremental parsing as used by `clang-repl` that might receive 
multiple calls to `Sema.ActOnEndOfTranslationUnit`.
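
To illustrate why the position of the `clear()` call matters, here is a minimal sketch on a toy model. It is not Clang's real `Sema`: `ToySema`, `checkOld`, and `checkNew` are hypothetical names, and only the ordering of the early return relative to `clear()` mirrors the patch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model (assumption, not Clang's Sema): maps a used entity to whether
// it has a definition by now.
struct ToySema {
  std::map<std::string, bool> UndefinedButUsed;

  // Only report the entries that are still undefined.
  std::vector<std::string> getUndefinedButUsed() const {
    std::vector<std::string> Result;
    for (const auto &Entry : UndefinedButUsed)
      if (!Entry.second)
        Result.push_back(Entry.first);
    return Result;
  }
};

// Old ordering: the early return skips the clear() at the end, so entries
// that were filtered out stay in the container.
inline void checkOld(ToySema &S) {
  std::vector<std::string> Undefined = S.getUndefinedButUsed();
  if (Undefined.empty())
    return;
  // ... diagnostics would be emitted here ...
  S.UndefinedButUsed.clear();
}

// New ordering: clear unconditionally, right after collecting.
inline void checkNew(ToySema &S) {
  std::vector<std::string> Undefined = S.getUndefinedButUsed();
  S.UndefinedButUsed.clear();
  if (Undefined.empty())
    return;
  // ... diagnostics would be emitted here ...
}
```

In this model, an entry that is defined at the time of the check survives `checkOld` and would be revisited on the next call, while `checkNew` always leaves the container empty.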

>From 9dd0362e1dcffa5475d9f959ce9bfc6a7e83083b Mon Sep 17 00:00:00 2001
From: Jonas Hahnfeld 
Date: Thu, 30 Nov 2023 16:51:23 +0100
Subject: [PATCH] [clang][Sema] Always clear UndefinedButUsed

Before, it was only cleared if there were undefined entities. This
is important for Clang's incremental parsing as used by clang-repl
that might receive multiple calls to Sema.ActOnEndOfTranslationUnit.
---
 clang/lib/Sema/Sema.cpp | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/clang/lib/Sema/Sema.cpp b/clang/lib/Sema/Sema.cpp
index 9771aaa2f3b0371..d08f8cd56b39bde 100644
--- a/clang/lib/Sema/Sema.cpp
+++ b/clang/lib/Sema/Sema.cpp
@@ -870,6 +870,7 @@ static void checkUndefinedButUsed(Sema &S) {
   // Collect all the still-undefined entities with internal linkage.
   SmallVector<std::pair<NamedDecl *, SourceLocation>, 16> Undefined;
   S.getUndefinedButUsed(Undefined);
+  S.UndefinedButUsed.clear();
   if (Undefined.empty()) return;
 
   for (const auto &Undef : Undefined) {
@@ -923,8 +924,6 @@ static void checkUndefinedButUsed(Sema &S) {
 if (UseLoc.isValid())
   S.Diag(UseLoc, diag::note_used_here);
   }
-
-  S.UndefinedButUsed.clear();
 }
 
 void Sema::LoadExternalWeakUndeclaredIdentifiers() {



[clang] [clang][Sema] Always clear UndefinedButUsed (PR #73955)

2023-11-30 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

I will try, but observing the consequences of this depends on unloading: 
Basically it happens if a declaration in `UndefinedButUsed` that was previously 
defined is unloaded, which makes it undefined. For now, it's possible that for 
the upstream cases it's only an optimization because the data structure doesn't 
grow as much.

https://github.com/llvm/llvm-project/pull/73955


[clang] [Serialization] Fix instantiated default arguments (PR #76473)

2023-12-27 Thread Jonas Hahnfeld via cfe-commits

https://github.com/hahnjo created 
https://github.com/llvm/llvm-project/pull/76473

`ParmVarDecl::getDefaultArg()` strips the outermost `FullExpr`, such as 
`ExprWithCleanups`. This leads to an `llvm_unreachable` being executed with the 
added test `clang/test/Modules/pr68702.cpp`; instead use the more generic 
`VarDecl::getInit()` which also returns `FullExpr`'s.

Closes https://github.com/llvm/llvm-project/issues/68702
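
The difference between the two accessors can be sketched on a toy AST. This is an illustration under assumptions, not Clang's classes: `ToyExpr`, `ToyParam`, and `makeParamWithCleanups` are hypothetical, and the wrapper flag stands in for a `FullExpr` such as `ExprWithCleanups`.

```cpp
#include <cassert>
#include <memory>

// Toy AST (assumption, not Clang's): a node is either a plain expression or
// a "full expression" wrapper around another node.
struct ToyExpr {
  bool IsFullExprWrapper = false;
  std::shared_ptr<ToyExpr> Sub; // only set for wrappers
};

struct ToyParam {
  std::shared_ptr<ToyExpr> Init;

  // Returns the initializer exactly as stored, wrapper included.
  const ToyExpr *getInit() const { return Init.get(); }

  // Strips one outer wrapper, analogous to what the description above says
  // about ParmVarDecl::getDefaultArg().
  const ToyExpr *getDefaultArg() const {
    const ToyExpr *E = Init.get();
    if (E && E->IsFullExprWrapper)
      return E->Sub.get();
    return E;
  }
};

// Builds a parameter whose default argument is wrapped in a full expression.
inline ToyParam makeParamWithCleanups() {
  auto Inner = std::make_shared<ToyExpr>();
  auto Wrapper = std::make_shared<ToyExpr>();
  Wrapper->IsFullExprWrapper = true;
  Wrapper->Sub = Inner;
  return ToyParam{Wrapper};
}
```

Writing out the result of `getDefaultArg()` in this model loses the wrapper, whereas `getInit()` keeps it, which is the property the fix relies on.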

>From def58be1572162b4862e37ba983793e422208d38 Mon Sep 17 00:00:00 2001
From: Jonas Hahnfeld 
Date: Wed, 27 Dec 2023 23:10:13 +0100
Subject: [PATCH] [Serialization] Fix instantiated default arguments

ParmVarDecl::getDefaultArg() strips the outermost FullExpr, such as
ExprWithCleanups. This leads to an llvm_unreachable being executed
with the added test clang/test/Modules/pr68702.cpp; instead use the
more generic VarDecl::getInit() which also returns FullExpr's.

Closes https://github.com/llvm/llvm-project/issues/68702
---
 clang/lib/Serialization/ASTWriter.cpp |  6 ++-
 clang/test/Modules/pr68702.cpp| 65 +++
 2 files changed, 69 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/Modules/pr68702.cpp

diff --git a/clang/lib/Serialization/ASTWriter.cpp 
b/clang/lib/Serialization/ASTWriter.cpp
index 78939bfd533ffa..b9f65dc6d452fc 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -5293,8 +5293,10 @@ void ASTWriter::WriteDeclUpdatesBlocks(RecordDataImpl 
&OffsetsRecord) {
 break;
 
   case UPD_CXX_INSTANTIATED_DEFAULT_ARGUMENT:
-Record.AddStmt(const_cast<Expr *>(
-cast<ParmVarDecl>(Update.getDecl())->getDefaultArg()));
+// Do not use ParmVarDecl::getDefaultArg(): It strips the outermost
+// FullExpr, such as ExprWithCleanups.
+Record.AddStmt(
+const_cast<Expr *>(cast<ParmVarDecl>(Update.getDecl())->getInit()));
 break;
 
   case UPD_CXX_INSTANTIATED_DEFAULT_MEMBER_INITIALIZER:
diff --git a/clang/test/Modules/pr68702.cpp b/clang/test/Modules/pr68702.cpp
new file mode 100644
index 00..d32f946910f4fb
--- /dev/null
+++ b/clang/test/Modules/pr68702.cpp
@@ -0,0 +1,65 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -std=c++20 -fmodules -fimplicit-module-maps 
-fmodules-cache-path=%t %t/main.cpp -o %t/main.o
+
+//--- V.h
+#ifndef V_H
+#define V_H
+
+class A {
+public:
+  constexpr A() { }
+  constexpr ~A() { }
+};
+
+template 
+class V {
+public:
+  V() = default;
+
+  constexpr V(int n, const A& a = A()) {}
+};
+
+#endif
+
+//--- inst1.h
+#include "V.h"
+
+static void inst1() {
+  V v;
+}
+
+//--- inst2.h
+#include "V.h"
+
+static void inst2() {
+  V v(100);
+}
+
+//--- module.modulemap
+module "M" {
+  export *
+  module "V.h" {
+export *
+header "V.h"
+  }
+  module "inst1.h" {
+export *
+header "inst1.h"
+  }
+}
+
+module "inst2.h" {
+  export *
+  header "inst2.h"
+}
+
+//--- main.cpp
+#include "V.h"
+#include "inst2.h"
+
+static void m() {
+  static V v(100);
+}



[clang] Fix crash with modules and constexpr destructor (PR #69076)

2023-12-27 Thread Jonas Hahnfeld via cfe-commits

https://github.com/hahnjo updated 
https://github.com/llvm/llvm-project/pull/69076

>From a55ca99a373b17501d56d18af9e8aa2dc2cbcea0 Mon Sep 17 00:00:00 2001
From: Jonas Hahnfeld 
Date: Sat, 14 Oct 2023 20:10:28 +0200
Subject: [PATCH] Fix crash with modules and constexpr destructor

With modules, serialization might omit the outer ExprWithCleanups
as it calls ParmVarDecl::getDefaultArg(). Complementary to fixing
this in a separate change, make the code more robust by adding a
FullExpressionRAII and avoid the llvm_unreachable in the added test
clang/test/Modules/pr68702.cpp.

Closes https://github.com/llvm/llvm-project/issues/68702
---
 clang/lib/AST/ExprConstant.cpp | 16 ++---
 clang/test/Modules/pr68702.cpp | 65 ++
 2 files changed, 77 insertions(+), 4 deletions(-)
 create mode 100644 clang/test/Modules/pr68702.cpp

diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index f6aeee1a4e935d..416d48ae82933f 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -15754,10 +15754,18 @@ bool Expr::EvaluateAsInitializer(APValue &Value, 
const ASTContext &Ctx,
 LValue LVal;
 LVal.set(VD);
 
-if (!EvaluateInPlace(Value, Info, LVal, this,
- /*AllowNonLiteralTypes=*/true) ||
-EStatus.HasSideEffects)
-  return false;
+{
+  // C++23 [intro.execution]/p5
+  // A full-expression is ... an init-declarator ([dcl.decl]) or a
+  // mem-initializer.
+  // So we need to make sure temporary objects are destroyed after having
+  // evaluated the expression (per C++23 [class.temporary]/p4).
+  FullExpressionRAII Scope(Info);
+  if (!EvaluateInPlace(Value, Info, LVal, this,
+   /*AllowNonLiteralTypes=*/true) ||
+  EStatus.HasSideEffects)
+return false;
+}
 
 // At this point, any lifetime-extended temporaries are completely
 // initialized.
diff --git a/clang/test/Modules/pr68702.cpp b/clang/test/Modules/pr68702.cpp
new file mode 100644
index 00..d32f946910f4fb
--- /dev/null
+++ b/clang/test/Modules/pr68702.cpp
@@ -0,0 +1,65 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -std=c++20 -fmodules -fimplicit-module-maps 
-fmodules-cache-path=%t %t/main.cpp -o %t/main.o
+
+//--- V.h
+#ifndef V_H
+#define V_H
+
+class A {
+public:
+  constexpr A() { }
+  constexpr ~A() { }
+};
+
+template 
+class V {
+public:
+  V() = default;
+
+  constexpr V(int n, const A& a = A()) {}
+};
+
+#endif
+
+//--- inst1.h
+#include "V.h"
+
+static void inst1() {
+  V v;
+}
+
+//--- inst2.h
+#include "V.h"
+
+static void inst2() {
+  V v(100);
+}
+
+//--- module.modulemap
+module "M" {
+  export *
+  module "V.h" {
+export *
+header "V.h"
+  }
+  module "inst1.h" {
+export *
+header "inst1.h"
+  }
+}
+
+module "inst2.h" {
+  export *
+  header "inst2.h"
+}
+
+//--- main.cpp
+#include "V.h"
+#include "inst2.h"
+
+static void m() {
+  static V v(100);
+}



[clang] Fix crash with modules and constexpr destructor (PR #69076)

2023-12-27 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

I finally had time to debug this: The reason for modules being involved here is 
because the serialization code calls `ParmVarDecl::getDefaultArg()` which 
strips the outermost `FullExpr`, such as `ExprWithCleanups`. A potential fix is 
in https://github.com/llvm/llvm-project/pull/76473 though I'm not really 
convinced by this asymmetry between `getInit()` but calling `setDefaultArg()`. 
However, removing the handling of `FullExpr` in `setDefaultArg()` causes a 
total of 29 test failures, so that's not an (easy) option...

Personally, I would argue that adding `FullExpressionRAII` makes the code more 
robust against the absence of `ExprWithCleanups` so that's maybe a good thing 
to have regardless of fixing the serialization code.
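
As a rough illustration of what a full-expression scope guard buys, here is a toy sketch. It does not reproduce Clang's `FullExpressionRAII` or `EvalInfo`: `ToyEvalInfo`, `ToyFullExprScope`, and `evaluateInitializer` are hypothetical names, and the cleanup stack is an assumed simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy evaluator state (assumption, not Clang's EvalInfo): pending cleanups.
struct ToyEvalInfo {
  std::vector<std::function<void()>> Cleanups;
};

// Hypothetical guard in the spirit of FullExpressionRAII: every cleanup
// pushed after construction runs, in reverse order, when the guard dies.
class ToyFullExprScope {
  ToyEvalInfo &Info;
  std::size_t OldSize;

public:
  explicit ToyFullExprScope(ToyEvalInfo &I)
      : Info(I), OldSize(I.Cleanups.size()) {}
  ToyFullExprScope(const ToyFullExprScope &) = delete;
  ToyFullExprScope &operator=(const ToyFullExprScope &) = delete;
  ~ToyFullExprScope() {
    while (Info.Cleanups.size() > OldSize) {
      auto Cleanup = std::move(Info.Cleanups.back());
      Info.Cleanups.pop_back();
      Cleanup();
    }
  }
};

// Evaluates an "initializer" that creates one temporary and returns how many
// temporaries are still alive once evaluation has finished.
inline int evaluateInitializer(bool UseScope) {
  ToyEvalInfo Info;
  int LiveTemporaries = 0;
  auto Evaluate = [&] {
    ++LiveTemporaries;                                   // temporary created
    Info.Cleanups.push_back([&] { --LiveTemporaries; }); // its destructor
  };
  if (UseScope) {
    ToyFullExprScope Scope(Info);
    Evaluate();
  } else {
    Evaluate();
  }
  return LiveTemporaries;
}
```

With the guard, the temporary is destroyed at the end of the evaluation even if no outer wrapper node asks for it; without the guard, it stays alive in this model.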

https://github.com/llvm/llvm-project/pull/69076


[clang] [Serialization] Load Specializations Lazily (1/2) (PR #76774)

2024-01-05 Thread Jonas Hahnfeld via cfe-commits


@@ -1249,3 +1249,5 @@ void ODRHash::AddQualType(QualType T) {
 void ODRHash::AddBoolean(bool Value) {
   Bools.push_back(Value);
 }
+
+void ODRHash::AddInteger(unsigned Value) { ID.AddInteger(Value); }

hahnjo wrote:

The review related to `ODRHash` is this one: https://reviews.llvm.org/D153003

In short, my understanding is that `ODRHash` gives the following guarantee: If 
the hashes are different, there is guaranteed to be a ODR violation. In the 
other direction, if two hashes are the same, the declarations have to be 
compared in more detail, ie there may or may not be an ODR violation.

For the specializations, we need the opposite: If two template arguments are 
semantically the same (*), they *must* hash to the same value or otherwise we 
will not find the correct bucket. On the other hand, two different 
specialization arguments may have the same hash, that's fine for the map data 
structure.

Now the additional caveat (*) is that "semantically the same" is not the same 
congruence as "no ODR violation". In https://reviews.llvm.org/D153003 we 
discuss `using` declarations, but IIRC it's also possible to construct 
problematic cases with (nested) namespaces, top-level `::` prefixes, and 
template template parameters. Taken together, my conclusion from the discussion 
above is that `ODRHash` is simply not the right method to find template 
specialization parameters in a map.
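
The lookup requirement can be sketched with a toy map keyed on type spellings. This is only an illustration under assumptions: `canonicalize` is a hypothetical helper that merely drops a leading `::`, whereas real canonicalization of template arguments is far more involved, and none of this is `ODRHash` itself.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical canonicalization (assumption): drop a leading "::" so that
// "::NS::A" and "NS::A" are treated as the same type.
inline std::string canonicalize(const std::string &Spelling) {
  if (Spelling.rfind("::", 0) == 0)
    return Spelling.substr(2);
  return Spelling;
}

// Key on the spelling as written: two spellings of the same type end up
// under different keys, so the stored specialization is not found.
inline bool foundBySpelling(const std::string &Stored,
                            const std::string &Query) {
  std::unordered_map<std::string, int> Specializations;
  Specializations[Stored] = 1;
  return Specializations.count(Query) != 0;
}

// Key on the canonical form: semantically equal arguments map to the same
// key (and hence the same hash), so the lookup succeeds.
inline bool foundByCanonical(const std::string &Stored,
                             const std::string &Query) {
  std::unordered_map<std::string, int> Specializations;
  Specializations[canonicalize(Stored)] = 1;
  return Specializations.count(canonicalize(Query)) != 0;
}
```

A collision between two different arguments would only cost an extra comparison in such a map, but a semantically equal argument that hashes differently is never found, which is the failure mode described above.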

https://github.com/llvm/llvm-project/pull/76774


[clang] Fix crash with modules and constexpr destructor (PR #69076)

2024-01-07 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

Ping, is this ok to be accepted and landed?

> So personally I am fine with the current workaround with a `FIXME`.

You mean next to the comment I already added referring to the C++ standard? Can 
you formulate what I should put there?

https://github.com/llvm/llvm-project/pull/69076


[clang] Fix crash with modules and constexpr destructor (PR #69076)

2024-01-07 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

Well, this patch has been up for almost three months now (!). Sure, we can keep 
carrying a similar fix downstream, but ideally I would really like to get rid 
of as many local changes as possible. That's not possible without proper 
review, but the current situation is quite unsatisfactory...

https://github.com/llvm/llvm-project/pull/69076


[clang] [Serialization] Load Specializations Lazily (1/2) (PR #76774)

2024-01-07 Thread Jonas Hahnfeld via cfe-commits


@@ -1249,3 +1249,5 @@ void ODRHash::AddQualType(QualType T) {
 void ODRHash::AddBoolean(bool Value) {
   Bools.push_back(Value);
 }
+
+void ODRHash::AddInteger(unsigned Value) { ID.AddInteger(Value); }

hahnjo wrote:

That test does not exercise an alias argument to a template template argument. 
IIRC the code you cite is only active for typenames. Also see the test in 
https://reviews.llvm.org/D153003 that exercises different spellings of the 
semantically equivalent type `NS::A` inside a namespace.

https://github.com/llvm/llvm-project/pull/76774


[clang] Fix crash with modules and constexpr destructor (PR #69076)

2024-01-09 Thread Jonas Hahnfeld via cfe-commits

https://github.com/hahnjo edited https://github.com/llvm/llvm-project/pull/69076


[clang] Fix crash with modules and constexpr destructor (PR #69076)

2024-01-09 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

> address my previous comment: [#69076 
> (comment)](https://github.com/llvm/llvm-project/pull/69076#issuecomment-1780327252)

I had already expanded the commit message with the full details, now also 
copied to the PR summary. Is that sufficient to address the comment?

https://github.com/llvm/llvm-project/pull/69076


[clang] Fix crash with modules and constexpr destructor (PR #69076)

2024-01-09 Thread Jonas Hahnfeld via cfe-commits

https://github.com/hahnjo updated 
https://github.com/llvm/llvm-project/pull/69076

>From a55ca99a373b17501d56d18af9e8aa2dc2cbcea0 Mon Sep 17 00:00:00 2001
From: Jonas Hahnfeld 
Date: Sat, 14 Oct 2023 20:10:28 +0200
Subject: [PATCH 1/3] Fix crash with modules and constexpr destructor

With modules, serialization might omit the outer ExprWithCleanups
as it calls ParmVarDecl::getDefaultArg(). Complementary to fixing
this in a separate change, make the code more robust by adding a
FullExpressionRAII and avoid the llvm_unreachable in the added test
clang/test/Modules/pr68702.cpp.

Closes https://github.com/llvm/llvm-project/issues/68702
---
 clang/lib/AST/ExprConstant.cpp | 16 ++---
 clang/test/Modules/pr68702.cpp | 65 ++
 2 files changed, 77 insertions(+), 4 deletions(-)
 create mode 100644 clang/test/Modules/pr68702.cpp

diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index f6aeee1a4e935d..416d48ae82933f 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -15754,10 +15754,18 @@ bool Expr::EvaluateAsInitializer(APValue &Value, 
const ASTContext &Ctx,
 LValue LVal;
 LVal.set(VD);
 
-if (!EvaluateInPlace(Value, Info, LVal, this,
- /*AllowNonLiteralTypes=*/true) ||
-EStatus.HasSideEffects)
-  return false;
+{
+  // C++23 [intro.execution]/p5
+  // A full-expression is ... an init-declarator ([dcl.decl]) or a
+  // mem-initializer.
+  // So we need to make sure temporary objects are destroyed after having
+  // evaluated the expression (per C++23 [class.temporary]/p4).
+  FullExpressionRAII Scope(Info);
+  if (!EvaluateInPlace(Value, Info, LVal, this,
+   /*AllowNonLiteralTypes=*/true) ||
+  EStatus.HasSideEffects)
+return false;
+}
 
 // At this point, any lifetime-extended temporaries are completely
 // initialized.
diff --git a/clang/test/Modules/pr68702.cpp b/clang/test/Modules/pr68702.cpp
new file mode 100644
index 00..d32f946910f4fb
--- /dev/null
+++ b/clang/test/Modules/pr68702.cpp
@@ -0,0 +1,65 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -std=c++20 -fmodules -fimplicit-module-maps 
-fmodules-cache-path=%t %t/main.cpp -o %t/main.o
+
+//--- V.h
+#ifndef V_H
+#define V_H
+
+class A {
+public:
+  constexpr A() { }
+  constexpr ~A() { }
+};
+
+template 
+class V {
+public:
+  V() = default;
+
+  constexpr V(int n, const A& a = A()) {}
+};
+
+#endif
+
+//--- inst1.h
+#include "V.h"
+
+static void inst1() {
+  V v;
+}
+
+//--- inst2.h
+#include "V.h"
+
+static void inst2() {
+  V v(100);
+}
+
+//--- module.modulemap
+module "M" {
+  export *
+  module "V.h" {
+export *
+header "V.h"
+  }
+  module "inst1.h" {
+export *
+header "inst1.h"
+  }
+}
+
+module "inst2.h" {
+  export *
+  header "inst2.h"
+}
+
+//--- main.cpp
+#include "V.h"
+#include "inst2.h"
+
+static void m() {
+  static V v(100);
+}

>From 8381d35fc3e1dd57ba0dd2a76aea2931c659e419 Mon Sep 17 00:00:00 2001
From: Jonas Hahnfeld 
Date: Wed, 10 Jan 2024 08:41:59 +0100
Subject: [PATCH 2/3] Expand comment

---
 clang/lib/AST/ExprConstant.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 416d48ae82933f..f20850d14c0c86 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -15760,6 +15760,10 @@ bool Expr::EvaluateAsInitializer(APValue &Value, const 
ASTContext &Ctx,
   // mem-initializer.
   // So we need to make sure temporary objects are destroyed after having
   // evaluated the expression (per C++23 [class.temporary]/p4).
+  //
+  // FIXME: Otherwise this may break test/Modules/pr68702.cpp because the
+  // serialization code calls ParmVarDecl::getDefaultArg() which strips the
+  // outermost FullExpr, such as ExprWithCleanups.
   FullExpressionRAII Scope(Info);
   if (!EvaluateInPlace(Value, Info, LVal, this,
/*AllowNonLiteralTypes=*/true) ||

>From 676c7ea4ad35ae9114023573d5698a22adeb1460 Mon Sep 17 00:00:00 2001
From: Jonas Hahnfeld 
Date: Wed, 10 Jan 2024 08:47:50 +0100
Subject: [PATCH 3/3] Add a release note

---
 clang/docs/ReleaseNotes.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ee211c16a48ac8..aa3252b1d4f5f4 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -877,6 +877,8 @@ Miscellaneous Clang Crashes Fixed
 - Fixed a crash when a lambda marked as ``static`` referenced a captured
   variable in an expression.
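   `Issue 74608 <https://github.com/llvm/llvm-project/issues/74608>`_
+- Fixed a crash with modules and a ``constexpr`` destructor.
+  `Issue 68702 <https://github.com/llvm/llvm-project/issues/68702>`_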
 
 
 OpenACC Specific Changes


[clang] Fix crash with modules and constexpr destructor (PR #69076)

2024-01-09 Thread Jonas Hahnfeld via cfe-commits


@@ -15754,10 +15754,18 @@ bool Expr::EvaluateAsInitializer(APValue &Value, 
const ASTContext &Ctx,
 LValue LVal;
 LVal.set(VD);
 
-if (!EvaluateInPlace(Value, Info, LVal, this,
- /*AllowNonLiteralTypes=*/true) ||
-EStatus.HasSideEffects)
-  return false;
+{
+  // C++23 [intro.execution]/p5
+  // A full-expression is ... an init-declarator ([dcl.decl]) or a
+  // mem-initializer.
+  // So we need to make sure temporary objects are destroyed after having
+  // evaluated the expression (per C++23 [class.temporary]/p4).

hahnjo wrote:

Done.

https://github.com/llvm/llvm-project/pull/69076


[clang] Fix crash with modules and constexpr destructor (PR #69076)

2024-01-09 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

> Please add a release note

> This change needs a release note. Please add an entry to 
> `clang/docs/ReleaseNotes.rst` in the section the most adapted to the change, 
> and referencing any Github issue this change fixes. Thanks!

Done.

https://github.com/llvm/llvm-project/pull/69076


[clang] f22795d - [Interpreter] Pass target features to JIT

2022-06-30 Thread Jonas Hahnfeld via cfe-commits

Author: Jonas Hahnfeld
Date: 2022-06-30T21:25:14+02:00
New Revision: f22795de683d571bbf7e655a7b4ed5ccda186e66

URL: 
https://github.com/llvm/llvm-project/commit/f22795de683d571bbf7e655a7b4ed5ccda186e66
DIFF: 
https://github.com/llvm/llvm-project/commit/f22795de683d571bbf7e655a7b4ed5ccda186e66.diff

LOG: [Interpreter] Pass target features to JIT

This is required to support RISC-V where the '+d' target feature
indicates the presence of the D instruction set extension, which
changes to the Hard-float 'd' ABI.

Differential Revision: https://reviews.llvm.org/D128853

Added: 


Modified: 
clang/lib/Interpreter/IncrementalExecutor.cpp
clang/lib/Interpreter/IncrementalExecutor.h
clang/lib/Interpreter/Interpreter.cpp

Removed: 




diff  --git a/clang/lib/Interpreter/IncrementalExecutor.cpp 
b/clang/lib/Interpreter/IncrementalExecutor.cpp
index c055827281b4f..227ab9703dc76 100644
--- a/clang/lib/Interpreter/IncrementalExecutor.cpp
+++ b/clang/lib/Interpreter/IncrementalExecutor.cpp
@@ -12,6 +12,8 @@
 
 #include "IncrementalExecutor.h"
 
+#include "clang/Basic/TargetInfo.h"
+#include "clang/Basic/TargetOptions.h"
 #include "clang/Interpreter/PartialTranslationUnit.h"
 #include "llvm/ExecutionEngine/ExecutionEngine.h"
 #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
@@ -28,12 +30,13 @@ namespace clang {
 
 IncrementalExecutor::IncrementalExecutor(llvm::orc::ThreadSafeContext &TSC,
  llvm::Error &Err,
- const llvm::Triple &Triple)
+ const clang::TargetInfo &TI)
 : TSCtx(TSC) {
   using namespace llvm::orc;
   llvm::ErrorAsOutParameter EAO(&Err);
 
-  auto JTMB = JITTargetMachineBuilder(Triple);
+  auto JTMB = JITTargetMachineBuilder(TI.getTriple());
+  JTMB.addFeatures(TI.getTargetOpts().Features);
   if (auto JitOrErr = LLJITBuilder().setJITTargetMachineBuilder(JTMB).create())
 Jit = std::move(*JitOrErr);
   else {

diff  --git a/clang/lib/Interpreter/IncrementalExecutor.h 
b/clang/lib/Interpreter/IncrementalExecutor.h
index 580724e1e24e2..f11ec0aa9e758 100644
--- a/clang/lib/Interpreter/IncrementalExecutor.h
+++ b/clang/lib/Interpreter/IncrementalExecutor.h
@@ -15,7 +15,6 @@
 
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/StringRef.h"
-#include "llvm/ADT/Triple.h"
 #include "llvm/ExecutionEngine/Orc/ExecutionUtils.h"
 
 #include 
@@ -32,6 +31,7 @@ class ThreadSafeContext;
 namespace clang {
 
 struct PartialTranslationUnit;
+class TargetInfo;
 
 class IncrementalExecutor {
   using CtorDtorIterator = llvm::orc::CtorDtorIterator;
@@ -45,7 +45,7 @@ class IncrementalExecutor {
   enum SymbolNameKind { IRName, LinkerName };
 
   IncrementalExecutor(llvm::orc::ThreadSafeContext &TSC, llvm::Error &Err,
-  const llvm::Triple &Triple);
+  const clang::TargetInfo &TI);
   ~IncrementalExecutor();
 
   llvm::Error addModule(PartialTranslationUnit &PTU);

diff  --git a/clang/lib/Interpreter/Interpreter.cpp 
b/clang/lib/Interpreter/Interpreter.cpp
index a10eb79b413b3..0191ad78581d9 100644
--- a/clang/lib/Interpreter/Interpreter.cpp
+++ b/clang/lib/Interpreter/Interpreter.cpp
@@ -213,10 +213,10 @@ Interpreter::Parse(llvm::StringRef Code) {
 llvm::Error Interpreter::Execute(PartialTranslationUnit &T) {
   assert(T.TheModule);
   if (!IncrExecutor) {
-const llvm::Triple &Triple =
-getCompilerInstance()->getASTContext().getTargetInfo().getTriple();
+const clang::TargetInfo &TI =
+getCompilerInstance()->getASTContext().getTargetInfo();
 llvm::Error Err = llvm::Error::success();
-IncrExecutor = std::make_unique<IncrementalExecutor>(*TSCtx, Err, Triple);
+IncrExecutor = std::make_unique<IncrementalExecutor>(*TSCtx, Err, TI);
 
 if (Err)
   return Err;





[clang] e4903d8 - [CUDA/HIP] Remove argument from module ctor/dtor signatures

2022-04-09 Thread Jonas Hahnfeld via cfe-commits

Author: Jonas Hahnfeld
Date: 2022-04-09T12:34:41+02:00
New Revision: e4903d8be399864cc978236fc4a28087f91c20fe

URL: 
https://github.com/llvm/llvm-project/commit/e4903d8be399864cc978236fc4a28087f91c20fe
DIFF: 
https://github.com/llvm/llvm-project/commit/e4903d8be399864cc978236fc4a28087f91c20fe.diff

LOG: [CUDA/HIP] Remove argument from module ctor/dtor signatures

In theory, constructors can take arguments when called via .init_array
where at least glibc passes in (argc, argv, envp). This isn't used in
the generated code and if it was, the first argument should be an
integer, not a pointer. For destructors registered via atexit, the
function should never take an argument.

Differential Revision: https://reviews.llvm.org/D123370
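
For reference, a minimal sketch of the signatures involved, written by hand rather than taken from the generated code. `module_ctor`, `module_dtor`, and `g_unregistered` are hypothetical stand-ins; the only fact relied upon is that `std::atexit` takes a function with no parameters and returns 0 on success.

```cpp
#include <cassert>
#include <cstdlib>

// Counts how often the stand-in destructor ran (illustration only).
inline int g_unregistered = 0;

// Hypothetical stand-in for a generated module destructor: it takes no
// parameters, matching the handler type that std::atexit expects.
inline void module_dtor() { ++g_unregistered; }

// Hypothetical stand-in for a generated module constructor: it also takes no
// parameters and registers the destructor to run at normal program exit.
inline bool module_ctor() {
  return std::atexit(module_dtor) == 0;
}
```

A handler declared with a `void *` parameter would not match the type `std::atexit` expects in C++, which is one way to see why the parameter-less signature is the natural one here.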

Added: 


Modified: 
clang/lib/CodeGen/CGCUDANV.cpp
clang/test/CodeGenCUDA/device-stub.cu

Removed: 




diff  --git a/clang/lib/CodeGen/CGCUDANV.cpp b/clang/lib/CodeGen/CGCUDANV.cpp
index 3ae152d743206..187817d0e5059 100644
--- a/clang/lib/CodeGen/CGCUDANV.cpp
+++ b/clang/lib/CodeGen/CGCUDANV.cpp
@@ -659,7 +659,7 @@ llvm::Function *CGNVCUDARuntime::makeRegisterGlobalsFn() {
 ///
 /// For CUDA:
 /// \code
-/// void __cuda_module_ctor(void*) {
+/// void __cuda_module_ctor() {
 /// Handle = __cudaRegisterFatBinary(GpuBinaryBlob);
 /// __cuda_register_globals(Handle);
 /// }
@@ -667,7 +667,7 @@ llvm::Function *CGNVCUDARuntime::makeRegisterGlobalsFn() {
 ///
 /// For HIP:
 /// \code
-/// void __hip_module_ctor(void*) {
+/// void __hip_module_ctor() {
 /// if (__hip_gpubin_handle == 0) {
 /// __hip_gpubin_handle  = __hipRegisterFatBinary(GpuBinaryBlob);
 /// __hip_register_globals(__hip_gpubin_handle);
@@ -717,7 +717,7 @@ llvm::Function *CGNVCUDARuntime::makeModuleCtorFunction() {
   }
 
   llvm::Function *ModuleCtorFunc = llvm::Function::Create(
-  llvm::FunctionType::get(VoidTy, VoidPtrTy, false),
+  llvm::FunctionType::get(VoidTy, false),
   llvm::GlobalValue::InternalLinkage,
   addUnderscoredPrefixToName("_module_ctor"), &TheModule);
   llvm::BasicBlock *CtorEntryBB =
@@ -931,14 +931,14 @@ llvm::Function *CGNVCUDARuntime::makeModuleCtorFunction() {
 ///
 /// For CUDA:
 /// \code
-/// void __cuda_module_dtor(void*) {
+/// void __cuda_module_dtor() {
 /// __cudaUnregisterFatBinary(Handle);
 /// }
 /// \endcode
 ///
 /// For HIP:
 /// \code
-/// void __hip_module_dtor(void*) {
+/// void __hip_module_dtor() {
 /// if (__hip_gpubin_handle) {
 /// __hipUnregisterFatBinary(__hip_gpubin_handle);
 /// __hip_gpubin_handle = 0;
@@ -956,7 +956,7 @@ llvm::Function *CGNVCUDARuntime::makeModuleDtorFunction() {
   addUnderscoredPrefixToName("UnregisterFatBinary"));
 
   llvm::Function *ModuleDtorFunc = llvm::Function::Create(
-  llvm::FunctionType::get(VoidTy, VoidPtrTy, false),
+  llvm::FunctionType::get(VoidTy, false),
   llvm::GlobalValue::InternalLinkage,
   addUnderscoredPrefixToName("_module_dtor"), &TheModule);
 

diff  --git a/clang/test/CodeGenCUDA/device-stub.cu b/clang/test/CodeGenCUDA/device-stub.cu
index aa7211aeaf8e7..0f925a29c215d 100644
--- a/clang/test/CodeGenCUDA/device-stub.cu
+++ b/clang/test/CodeGenCUDA/device-stub.cu
@@ -257,8 +257,8 @@ void hostfunc(void) { kernelfunc<<<1, 1>>>(1, 1, 1); }
 // CUDANORDC-NEXT: call void @__[[PREFIX]]_register_globals
 // HIP-NEXT: call void @__[[PREFIX]]_register_globals
 // * In separate mode we also register a destructor.
-// CUDANORDC-NEXT: call i32 @atexit(void (i8*)* @__[[PREFIX]]_module_dtor)
-// HIP-NEXT: call i32 @atexit(void (i8*)* @__[[PREFIX]]_module_dtor)
+// CUDANORDC-NEXT: call i32 @atexit(void ()* @__[[PREFIX]]_module_dtor)
+// HIP-NEXT: call i32 @atexit(void ()* @__[[PREFIX]]_module_dtor)
 
 // With relocatable device code we call __[[PREFIX]]RegisterLinkedBinary%NVModuleID%
 // CUDARDC: call{{.*}}__[[PREFIX]]RegisterLinkedBinary[[MODULE_ID]](





[clang] [clang-repl] Set up executor implicitly to account for init PTUs (PR #84758)

2024-03-19 Thread Jonas Hahnfeld via cfe-commits



@@ -14,7 +14,7 @@ struct A { int a; A(int a) : a(a) {} virtual ~A(); };
 // PartialTranslationUnit.
 inline A::~A() { printf("~A(%d)\n", a); }
 
-// Create one instance with new and delete it.
+// Create one instance with new and delete it. We crash here now:
 A *a1 = new A(1);

hahnjo wrote:

I had a quick look here and got the same backtraces on Linux as 
@weliveindetail. From the debugger, it seems to fail because the DataLayout is 
not properly initialized. This could mean that executing the initial, empty PTU 
damages some internal data structures, but I'm not sure...

https://github.com/llvm/llvm-project/pull/84758


[clang] [CUDA] Correctly set CUDA default architecture (PR #84017)

2024-03-05 Thread Jonas Hahnfeld via cfe-commits


@@ -2,56 +2,56 @@
 // REQUIRES: nvptx-registered-target
 // REQUIRES: zlib
 
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -g -gz 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -g -gz 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -gdwarf -fdebug-info-for-profiling 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -gdwarf -fdebug-info-for-profiling 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -gdwarf-2 -gsplit-dwarf 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -gdwarf-2 -gsplit-dwarf 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -gdwarf-3 -glldb 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -gdwarf-3 -glldb 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -gdwarf-4 -gcodeview 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -gdwarf-4 -gcodeview 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -gdwarf-5 -gmodules 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -gdwarf-5 -gmodules 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -ggdb1 -fdebug-macro 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -ggdb1 -fdebug-macro 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -ggdb2 -ggnu-pubnames 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -ggdb2 -ggnu-pubnames 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -ggdb3 -gdwarf-aranges 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -ggdb3 -gdwarf-aranges 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -g -gcolumn-info -fdebug-types-section 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -g -gcolumn-info -fdebug-types-section 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
 
 // Same tests for OpenMP
-// RUN: not %clang -### --target=x86_64-linux-gnu -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -c %s \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -fopenmp=libomp -c %s \

hahnjo wrote:

Please note that this completely dropped 
`-fopenmp-targets=nvptx64-nvidia-cuda`, so the test likely lost coverage...

https://github.com/llvm/llvm-project/pull/84017


[clang] [clang] Add cuda-path arguments to failing test (PR #84008)

2024-03-06 Thread Jonas Hahnfeld via cfe-commits

https://github.com/hahnjo closed https://github.com/llvm/llvm-project/pull/84008


[clang] [clang] Add cuda-path arguments to failing test (PR #84008)

2024-03-06 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

https://github.com/llvm/llvm-project/pull/84017 changed the test in such a way 
that this isn't needed anymore.

https://github.com/llvm/llvm-project/pull/84008


[clang] [CUDA] Correctly set CUDA default architecture (PR #84017)

2024-03-06 Thread Jonas Hahnfeld via cfe-commits


@@ -2,56 +2,56 @@
 // REQUIRES: nvptx-registered-target
 // REQUIRES: zlib
 
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -g -gz 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -g -gz 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -gdwarf -fdebug-info-for-profiling 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -gdwarf -fdebug-info-for-profiling 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -gdwarf-2 -gsplit-dwarf 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -gdwarf-2 -gsplit-dwarf 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -gdwarf-3 -glldb 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -gdwarf-3 -glldb 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -gdwarf-4 -gcodeview 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -gdwarf-4 -gcodeview 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -gdwarf-5 -gmodules 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -gdwarf-5 -gmodules 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -ggdb1 -fdebug-macro 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -ggdb1 -fdebug-macro 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -ggdb2 -ggnu-pubnames 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -ggdb2 -ggnu-pubnames 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -ggdb3 -gdwarf-aranges 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -ggdb3 -gdwarf-aranges 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
-// RUN: not %clang -### --target=x86_64-linux-gnu -c %s -g -gcolumn-info -fdebug-types-section 2>&1 \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -c %s -g -gcolumn-info -fdebug-types-section 2>&1 \
 // RUN: | FileCheck %s --check-prefixes WARN,COMMON
 
 // Same tests for OpenMP
-// RUN: not %clang -### --target=x86_64-linux-gnu -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -c %s \
+// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -fopenmp=libomp -c %s \

hahnjo wrote:

Sure, but this one is explicitly testing the (in)compatibility of debug options 
and these run lines are supposed to test them together with OpenMP offloading. 
Now they don't anymore...

https://github.com/llvm/llvm-project/pull/84017


[clang] [clang-repl] Set up executor implicitly to account for init PTUs (PR #84758)

2024-03-11 Thread Jonas Hahnfeld via cfe-commits



@@ -14,7 +14,7 @@ struct A { int a; A(int a) : a(a) {} virtual ~A(); };
 // PartialTranslationUnit.
 inline A::~A() { printf("~A(%d)\n", a); }
 
-// Create one instance with new and delete it.
+// Create one instance with new and delete it. We crash here now:
 A *a1 = new A(1);

hahnjo wrote:

Do we even have initial PTUs in the default case? Also the minimal reproducer 
shows a more general version where the `virtual` destructor is actually defined 
inline (c861d32d7c2791bdc058d9d9fbaecc1c2f07b8c7 addresses the case where it is 
out-of-line, which is special due to key `virtual` functions). So if that 
breaks entirely (which is critical for us), I'm personally not ok with just 
`XFAIL`ing it to land another change...
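
The distinction between inline and out-of-line `virtual` destructors can be 
sketched as follows. This is a minimal illustration, not the test from the PR: 
the type names, `g_log`, and `exercise()` are hypothetical, and the comments 
assume the Itanium C++ ABI, where the key function is the first non-pure 
virtual function that is not inline at the point of class definition.

```cpp
#include <string>
#include <vector>

// Hypothetical names throughout; only the shape mirrors the discussion above.
static std::vector<std::string> g_log; // records destructor calls in order

// Destructor defined in the class body: it is implicitly inline, so this
// class has no key function and its vtable may be emitted in any translation
// unit that needs it.
struct InlineDtor {
  int id;
  explicit InlineDtor(int id) : id(id) {}
  virtual ~InlineDtor() {
    g_log.push_back("~InlineDtor(" + std::to_string(id) + ")");
  }
};

// Destructor declared in the class and defined out of line without 'inline':
// it is the key function, so the vtable is emitted only in the translation
// unit that contains this definition.
struct OutOfLineDtor {
  int id;
  explicit OutOfLineDtor(int id) : id(id) {}
  virtual ~OutOfLineDtor();
};
OutOfLineDtor::~OutOfLineDtor() {
  g_log.push_back("~OutOfLineDtor(" + std::to_string(id) + ")");
}

// Create one instance of each type with new and delete it again; both
// deletes dispatch through the vtable.
void exercise() {
  InlineDtor *a = new InlineDtor(1);
  delete a;
  OutOfLineDtor *b = new OutOfLineDtor(2);
  delete b;
}
```

Calling `exercise()` once should record one destructor call per type, in the 
order the objects are deleted.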

https://github.com/llvm/llvm-project/pull/84758


[clang] [clang] Add cuda-path arguments to failing test (PR #84008)

2024-03-05 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

> We had a lot that were like this previously. Guessing this one slipped 
> through because of the `zlib` requirement.

I actually think there are some more left; for example 
`clang/test/Driver/cuda-dwarf-2.cu` has many `not %clang` invocations that 
don't specify `--cuda-path`. This happens to work on my system because my CUDA 
installation is too recent to support `sm_20`, but I suspect it would actually 
fail with an older CUDA installed...

> Does this work with `-nogpulib` instead? Usually easier than passing the 
> dummy CUDA path.

It definitely doesn't work for the "pure" CUDA invocations, it still finds my 
local installation and complains. It might work for the OpenMP invocations, but 
hard to tell for me on a system with CUDA installed. As it's a `.cu` test after 
all, I think I would prefer the uniformity of passing `--cuda-path` everywhere 
and be safe from any weird interaction.

https://github.com/llvm/llvm-project/pull/84008


[clang] [clang] Add cuda-path arguments to failing test (PR #84008)

2024-03-05 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

> > It definitely doesn't work for the "pure" CUDA invocations, it still finds 
> > my local installation and complains. It might work for the OpenMP 
> > invocations, but hard to tell for me on a system with CUDA installed. As 
> > it's a `.cu` test after all, I think I would prefer the uniformity of 
> > passing `--cuda-path` everywhere and be safe from any weird interaction.
> 
> Might need `-nogpulib -nogpuinc` in those cases, we do that in other `.cu` 
> files in the test suite.

No, I already tried that, it doesn't work for me. All `clang/test/Driver/*.cu` 
that supply `-nocudainc` also pass `--cuda-path`...

https://github.com/llvm/llvm-project/pull/84008


[clang] [clang] Add cuda-path arguments to failing test (PR #84008)

2024-03-05 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

> > > Might need `-nogpulib -nogpuinc` in those cases, we do that in other 
> > > `.cu` files in the test suite.
> > 
> > 
> > No, I already tried that, it doesn't work for me. All 
> > `clang/test/Driver/*.cu` that supply `-nocudainc` also pass `--cuda-path`...
> 
> The only reason it will fail without a CUDA installation if you pass both 
> `-nogpulib` and `-nogpuinc` is if you need to use `ptxas`, which shouldn't be 
> the case if you're using options like `-###`. Tested it out by moving my CUDA 
> installation and the following works as I expect.
> 
> ```
> > cat /dev/null | clang -x cuda - --offload-arch=sm_52 -nogpulib -nogpuinc -emit-llvm -c && echo $?
> 0
> > cat /dev/null | clang -x cuda - --offload-arch=sm_52
> clang: error: cannot find libdevice for sm_52; provide path to different CUDA installation via '--cuda-path', or pass '-nocudalib' to build without linking with libdevice
> clang: error: cannot find CUDA installation; provide its path via '--cuda-path', or pass '-nocudainc' to build without CUDA includes
> clang: error: cannot find CUDA installation; provide its path via '--cuda-path', or pass '-nocudainc' to build without CUDA includes
> ```
> 
> Which tests aren't working in your case?

The invocations in `clang/test/Driver/cuda-omp-unsupported-debug-options.cu` 
pass `-###`, not `-emit-llvm`. For example,
```
 > cat /dev/null | ./bin/clang -### -x cuda - -nogpulib -nogpuinc -c && echo $?
```
should error with recent CUDA installations because of `sm_35`.

https://github.com/llvm/llvm-project/pull/84008


[clang] [clang] Add cuda-path arguments to failing test (PR #84008)

2024-03-05 Thread Jonas Hahnfeld via cfe-commits

hahnjo wrote:

Ok, but that still doesn't change the fact that the Clang driver will search 
for a system-wide CUDA installation unless passed `--cuda-path`...

https://github.com/llvm/llvm-project/pull/84008

