[clang-tools-extra] [clang-pseudo] Add a --print-terminal-tokens option (PR #87898)

2024-04-06 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin created 
https://github.com/llvm/llvm-project/pull/87898

This PR adds a `--print-terminal-tokens` option to clang-pseudo which prints 
tokens in a parse forest in addition to providing the token index:

```
› bin/clang-pseudo --source test.cpp --print-forest
[  0, end) translation-unit~simple-declaration := decl-specifier-seq 
init-declarator-list ;
[  0,   1) ├─decl-specifier-seq~simple-type-specifier := 
[  0,   1) │ ├─simple-type-specifier~IDENTIFIER := tok[0]
[  0,   1) │ └─simple-type-specifier~IDENTIFIER := tok[0]
[  1,   3) ├─init-declarator-list~ptr-declarator := ptr-operator ptr-declarator
[  1,   2) │ ├─ptr-operator~* := tok[1]
[  2,   3) │ └─ptr-declarator~IDENTIFIER := tok[2]
[  3, end) └─; := tok[3]
```
```
› bin/clang-pseudo --source test.cpp --print-forest --print-terminal-tokens
[  0, end) translation-unit~simple-declaration := decl-specifier-seq 
init-declarator-list ;
[  0,   1) ├─decl-specifier-seq~simple-type-specifier := 
[  0,   1) │ ├─simple-type-specifier~IDENTIFIER := tok[0] (identifier 1:0 "T" 
flags=1)
[  0,   1) │ └─simple-type-specifier~IDENTIFIER := tok[0] (identifier 1:0 "T" 
flags=1)
[  1,   3) ├─init-declarator-list~ptr-declarator := ptr-operator ptr-declarator
[  1,   2) │ ├─ptr-operator~* := tok[1] (star 1:0 "*")
[  2,   3) │ └─ptr-declarator~IDENTIFIER := tok[2] (identifier 1:0 "y")
[  3, end) └─; := tok[3] (semi 1:0 ";")
```



>From 2ebb15e08b5e2d8a9fe6cfddbe0dd2a8942b2542 Mon Sep 17 00:00:00 2001
From: Jeremy <51220084+jeremy-rif...@users.noreply.github.com>
Date: Sat, 6 Apr 2024 17:02:20 -0500
Subject: [PATCH] Add a --print-terminal-tokens option

---
 clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp|  2 +-
 .../pseudo/include/clang-pseudo/Forest.h  | 11 ++--
 clang-tools-extra/pseudo/lib/Forest.cpp   | 26 +--
 clang-tools-extra/pseudo/tool/ClangPseudo.cpp | 12 +++--
 4 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp 
b/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp
index 87b9d15480cc35..33b3da1ed6ea9f 100644
--- a/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp
+++ b/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp
@@ -46,7 +46,7 @@ class Fuzzer {
 glrParse(clang::pseudo::ParseParams{ParseableStream, Arena, GSS},
  *Lang.G.findNonterminal("translation-unit"), Lang);
 if (Print)
-  llvm::outs() << Root.dumpRecursive(Lang.G);
+  llvm::outs() << Root.dumpRecursive(Lang.G, std::nullopt);
   }
 };
 
diff --git a/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h 
b/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h
index e9edb40e02b64e..642c489b3fba41 100644
--- a/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h
+++ b/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h
@@ -26,6 +26,8 @@
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Support/Allocator.h"
 #include 
+#include 
+#include 
 
 namespace clang {
 namespace pseudo {
@@ -112,8 +114,13 @@ class alignas(class ForestNode *) ForestNode {
   // Iteration over all nodes in the forest, including this.
   llvm::iterator_range descendants() const;
 
-  std::string dump(const Grammar &) const;
-  std::string dumpRecursive(const Grammar &, bool Abbreviated = false) const;
+  std::string
+  dump(const Grammar &,
+   std::optional>) const;
+  std::string
+  dumpRecursive(const Grammar &,
+std::optional>,
+bool Abbreviated = false) const;
 
 private:
   friend class ForestArena;
diff --git a/clang-tools-extra/pseudo/lib/Forest.cpp 
b/clang-tools-extra/pseudo/lib/Forest.cpp
index e8e60e5ec475a4..adce731d6c1e1c 100644
--- a/clang-tools-extra/pseudo/lib/Forest.cpp
+++ b/clang-tools-extra/pseudo/lib/Forest.cpp
@@ -45,13 +45,21 @@ ForestNode::descendants() const {
   return {RecursiveIterator(this), RecursiveIterator()};
 }
 
-std::string ForestNode::dump(const Grammar &G) const {
+std::string ForestNode::dump(
+const Grammar &G,
+std::optional> Code) const {
   switch (kind()) {
   case Ambiguous:
 return llvm::formatv("{0} := ", G.symbolName(symbol()));
   case Terminal:
-return llvm::formatv("{0} := tok[{1}]", G.symbolName(symbol()),
- startTokenIndex());
+if (Code) {
+  return llvm::formatv("{0} := tok[{1}] ({2})", G.symbolName(symbol()),
+   startTokenIndex(),
+   Code->get().tokens()[startTokenIndex()]);
+} else {
+  return llvm::formatv("{0} := tok[{1}]", G.symbolName(symbol()),
+   startTokenIndex());
+}
   case Sequence:
 return G.dumpRule(rule());
   case Opaque:
@@ -60,8 +68,10 @@ std::string ForestNode::dump(const Grammar &G) const {
   llvm_unreachable("Unhandled node kind!");
 }
 
-std::string ForestNode::dumpRecursive(const Grammar &G,
-  bool Abbreviated) const {
+std::string ForestNode::dumpRecursive(
+const Grammar &G,
+std::optional> Code,

[clang-tools-extra] [clang-pseudo] clang-format file (PR #87900)

2024-04-06 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin created 
https://github.com/llvm/llvm-project/pull/87900

This PR just clang-formats the `clang-tools-extra/pseudo` directory to make 
contribution easier.

One issue I ran into is clang-format kept changing this region, despite the 
`clang-format off` directives, which I manually reverted:
```
//clang-format off
#define NONTERMINAL(NAME, ID) \
};\
} \
namespace NAME {  \
enum Rule : RuleID {
//clang-format on
```

>From 7add6f47c280b5345f0ad15738a0e82a39f51cb8 Mon Sep 17 00:00:00 2001
From: Jeremy <51220084+jeremy-rif...@users.noreply.github.com>
Date: Sat, 6 Apr 2024 17:39:48 -0500
Subject: [PATCH 1/2] clang-format clang-tools-extra/pseudo

---
 clang-tools-extra/pseudo/gen/Main.cpp |   4 +-
 .../pseudo/include/clang-pseudo/GLR.h |   4 +-
 .../pseudo/include/clang-pseudo/cxx/CXX.h |   2 +-
 .../include/clang-pseudo/grammar/LRGraph.h|   6 +-
 .../include/clang-pseudo/grammar/LRTable.h|   5 +-
 .../pseudo/lib/DirectiveTree.cpp  |   8 +-
 clang-tools-extra/pseudo/lib/GLR.cpp  |  19 +--
 clang-tools-extra/pseudo/lib/cxx/CXX.cpp  | 130 +-
 .../pseudo/lib/grammar/GrammarBNF.cpp |   8 +-
 clang-tools-extra/pseudo/tool/ClangPseudo.cpp |   9 +-
 .../pseudo/unittests/GLRTest.cpp  |  38 +++--
 11 files changed, 116 insertions(+), 117 deletions(-)

diff --git a/clang-tools-extra/pseudo/gen/Main.cpp 
b/clang-tools-extra/pseudo/gen/Main.cpp
index 25cb26563837a6..779d5077f40ba7 100644
--- a/clang-tools-extra/pseudo/gen/Main.cpp
+++ b/clang-tools-extra/pseudo/gen/Main.cpp
@@ -74,7 +74,7 @@ std::string mangleSymbol(SymbolID SID, const Grammar &G) {
 #define TOK(X) llvm::StringRef(#X).upper(),
 #define KEYWORD(Keyword, Condition) llvm::StringRef(#Keyword).upper(),
 #include "clang/Basic/TokenKinds.def"
-  };
+  };
   if (isToken(SID))
 return TokNames[symbolToToken(SID)];
   std::string Name = G.symbolName(SID).str();
@@ -84,7 +84,7 @@ std::string mangleSymbol(SymbolID SID, const Grammar &G) {
 }
 
 // Mangles the RHS of a rule definition into a valid identifier.
-// 
+//
 // These are unique only for a fixed LHS.
 // e.g. for the grammar rule `ptr-declarator := ptr-operator ptr-declarator`,
 // it is `ptr_operator__ptr_declarator`.
diff --git a/clang-tools-extra/pseudo/include/clang-pseudo/GLR.h 
b/clang-tools-extra/pseudo/include/clang-pseudo/GLR.h
index 0100f818d4ed78..84610b47360b9d 100644
--- a/clang-tools-extra/pseudo/include/clang-pseudo/GLR.h
+++ b/clang-tools-extra/pseudo/include/clang-pseudo/GLR.h
@@ -120,8 +120,8 @@ struct ParseParams {
   const TokenStream &Code;
 
   // Arena for data structure used by the GLR algorithm.
-  ForestArena &Forest;  // Storage for the output forest.
-  GSS &GSStack; // Storage for parsing stacks.
+  ForestArena &Forest; // Storage for the output forest.
+  GSS &GSStack;// Storage for parsing stacks.
 };
 
 // Parses the given token stream as the start symbol with the GLR algorithm,
diff --git a/clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h 
b/clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h
index 7bbb4d2c00201f..897727ab61ca7a 100644
--- a/clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h
+++ b/clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h
@@ -65,7 +65,7 @@ enum Rule : RuleID {
 #define RULE(LHS, RHS, ID) RHS = ID,
 #include "CXXSymbols.inc"
 };
-}
+} // namespace dummy
 } // namespace rules
 } // namespace detail
 
diff --git a/clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRGraph.h 
b/clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRGraph.h
index dd9e87c2c172bf..4527f0e41c9452 100644
--- a/clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRGraph.h
+++ b/clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRGraph.h
@@ -143,9 +143,9 @@ class LRGraph {
   //   stmt := { . stmt-seq [recover=braces] }
   // has a Recovery { Src = S, Strategy=braces, Result=stmt-seq }.
   struct Recovery {
-StateID Src; // The state we are in when encountering the error.
-ExtensionID Strategy;  // Heuristic choosing the tokens to match.
-SymbolID Result;   // The symbol that is produced.
+StateID Src;  // The state we are in when encountering the error.
+ExtensionID Strategy; // Heuristic choosing the tokens to match.
+SymbolID Result;  // The symbol that is produced.
   };
 
   llvm::ArrayRef states() const { return States; }
diff --git a/clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRTable.h 
b/clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRTable.h
index 1706b6936c9ea2..26c3d1f6c8c829 100644
--- a/clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRTable.h
+++ b/clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRTable.h
@@ -76,14 +76,13 @@ class LRTable {
   // Expected to be called by LR parsers.
   // If the nonterminal is invalid here, returns std::nullopt.

[clang-tools-extra] [clang-pseudo] clang-format files (PR #87900)

2024-04-06 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin edited 
https://github.com/llvm/llvm-project/pull/87900
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang-pseudo] Add a --print-terminal-tokens option (PR #87898)

2024-04-06 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin updated 
https://github.com/llvm/llvm-project/pull/87898

>From 2ebb15e08b5e2d8a9fe6cfddbe0dd2a8942b2542 Mon Sep 17 00:00:00 2001
From: Jeremy <51220084+jeremy-rif...@users.noreply.github.com>
Date: Sat, 6 Apr 2024 17:02:20 -0500
Subject: [PATCH 1/2] Add a --print-terminal-tokens option

---
 clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp|  2 +-
 .../pseudo/include/clang-pseudo/Forest.h  | 11 ++--
 clang-tools-extra/pseudo/lib/Forest.cpp   | 26 +--
 clang-tools-extra/pseudo/tool/ClangPseudo.cpp | 12 +++--
 4 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp 
b/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp
index 87b9d15480cc35..33b3da1ed6ea9f 100644
--- a/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp
+++ b/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp
@@ -46,7 +46,7 @@ class Fuzzer {
 glrParse(clang::pseudo::ParseParams{ParseableStream, Arena, GSS},
  *Lang.G.findNonterminal("translation-unit"), Lang);
 if (Print)
-  llvm::outs() << Root.dumpRecursive(Lang.G);
+  llvm::outs() << Root.dumpRecursive(Lang.G, std::nullopt);
   }
 };
 
diff --git a/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h 
b/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h
index e9edb40e02b64e..642c489b3fba41 100644
--- a/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h
+++ b/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h
@@ -26,6 +26,8 @@
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Support/Allocator.h"
 #include 
+#include 
+#include 
 
 namespace clang {
 namespace pseudo {
@@ -112,8 +114,13 @@ class alignas(class ForestNode *) ForestNode {
   // Iteration over all nodes in the forest, including this.
   llvm::iterator_range descendants() const;
 
-  std::string dump(const Grammar &) const;
-  std::string dumpRecursive(const Grammar &, bool Abbreviated = false) const;
+  std::string
+  dump(const Grammar &,
+   std::optional>) const;
+  std::string
+  dumpRecursive(const Grammar &,
+std::optional>,
+bool Abbreviated = false) const;
 
 private:
   friend class ForestArena;
diff --git a/clang-tools-extra/pseudo/lib/Forest.cpp 
b/clang-tools-extra/pseudo/lib/Forest.cpp
index e8e60e5ec475a4..adce731d6c1e1c 100644
--- a/clang-tools-extra/pseudo/lib/Forest.cpp
+++ b/clang-tools-extra/pseudo/lib/Forest.cpp
@@ -45,13 +45,21 @@ ForestNode::descendants() const {
   return {RecursiveIterator(this), RecursiveIterator()};
 }
 
-std::string ForestNode::dump(const Grammar &G) const {
+std::string ForestNode::dump(
+const Grammar &G,
+std::optional> Code) const {
   switch (kind()) {
   case Ambiguous:
 return llvm::formatv("{0} := ", G.symbolName(symbol()));
   case Terminal:
-return llvm::formatv("{0} := tok[{1}]", G.symbolName(symbol()),
- startTokenIndex());
+if (Code) {
+  return llvm::formatv("{0} := tok[{1}] ({2})", G.symbolName(symbol()),
+   startTokenIndex(),
+   Code->get().tokens()[startTokenIndex()]);
+} else {
+  return llvm::formatv("{0} := tok[{1}]", G.symbolName(symbol()),
+   startTokenIndex());
+}
   case Sequence:
 return G.dumpRule(rule());
   case Opaque:
@@ -60,8 +68,10 @@ std::string ForestNode::dump(const Grammar &G) const {
   llvm_unreachable("Unhandled node kind!");
 }
 
-std::string ForestNode::dumpRecursive(const Grammar &G,
-  bool Abbreviated) const {
+std::string ForestNode::dumpRecursive(
+const Grammar &G,
+std::optional> Code,
+bool Abbreviated) const {
   using llvm::formatv;
   Token::Index MaxToken = 0;
   // Count visits of nodes so we can mark those seen multiple times.
@@ -95,7 +105,7 @@ std::string ForestNode::dumpRecursive(const Grammar &G,
   std::string Result;
   constexpr Token::Index KEnd = std::numeric_limits::max();
   std::function,
- LineDecoration &LineDec)>
+ LineDecoration LineDec)>
   Dump = [&](const ForestNode *P, Token::Index End,
  std::optional ElidedParent, LineDecoration LineDec) 
{
 bool SharedNode = VisitCounts.find(P)->getSecond() > 1;
@@ -145,13 +155,13 @@ std::string ForestNode::dumpRecursive(const Grammar &G,
 
   // The first time, print as #1. Later, =#1.
   if (First) {
-Result += formatv("{0} #{1}", P->dump(G), ID);
+Result += formatv("{0} #{1}", P->dump(G, Code), ID);
   } else {
 Result += formatv("{0} =#{1}", G.symbolName(P->symbol()), ID);
 Children = {}; // Don't walk the children again.
   }
 } else {
-  Result.append(P->dump(G));
+  Result.append(P->dump(G, Code));
 }
 Result.push_back('\n');
 
diff --git a/clang-tools-extra/pseudo/tool/ClangPseudo.cpp 
b/clang-tools-extra/pseudo/tool/ClangPseud

[clang-tools-extra] [clang-pseudo] Add a --print-terminal-tokens option (PR #87898)

2024-04-06 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin updated 
https://github.com/llvm/llvm-project/pull/87898

>From 2ebb15e08b5e2d8a9fe6cfddbe0dd2a8942b2542 Mon Sep 17 00:00:00 2001
From: Jeremy <51220084+jeremy-rif...@users.noreply.github.com>
Date: Sat, 6 Apr 2024 17:02:20 -0500
Subject: [PATCH 1/3] Add a --print-terminal-tokens option

---
 clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp|  2 +-
 .../pseudo/include/clang-pseudo/Forest.h  | 11 ++--
 clang-tools-extra/pseudo/lib/Forest.cpp   | 26 +--
 clang-tools-extra/pseudo/tool/ClangPseudo.cpp | 12 +++--
 4 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp 
b/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp
index 87b9d15480cc35..33b3da1ed6ea9f 100644
--- a/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp
+++ b/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp
@@ -46,7 +46,7 @@ class Fuzzer {
 glrParse(clang::pseudo::ParseParams{ParseableStream, Arena, GSS},
  *Lang.G.findNonterminal("translation-unit"), Lang);
 if (Print)
-  llvm::outs() << Root.dumpRecursive(Lang.G);
+  llvm::outs() << Root.dumpRecursive(Lang.G, std::nullopt);
   }
 };
 
diff --git a/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h 
b/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h
index e9edb40e02b64e..642c489b3fba41 100644
--- a/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h
+++ b/clang-tools-extra/pseudo/include/clang-pseudo/Forest.h
@@ -26,6 +26,8 @@
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Support/Allocator.h"
 #include 
+#include 
+#include 
 
 namespace clang {
 namespace pseudo {
@@ -112,8 +114,13 @@ class alignas(class ForestNode *) ForestNode {
   // Iteration over all nodes in the forest, including this.
   llvm::iterator_range descendants() const;
 
-  std::string dump(const Grammar &) const;
-  std::string dumpRecursive(const Grammar &, bool Abbreviated = false) const;
+  std::string
+  dump(const Grammar &,
+   std::optional>) const;
+  std::string
+  dumpRecursive(const Grammar &,
+std::optional>,
+bool Abbreviated = false) const;
 
 private:
   friend class ForestArena;
diff --git a/clang-tools-extra/pseudo/lib/Forest.cpp 
b/clang-tools-extra/pseudo/lib/Forest.cpp
index e8e60e5ec475a4..adce731d6c1e1c 100644
--- a/clang-tools-extra/pseudo/lib/Forest.cpp
+++ b/clang-tools-extra/pseudo/lib/Forest.cpp
@@ -45,13 +45,21 @@ ForestNode::descendants() const {
   return {RecursiveIterator(this), RecursiveIterator()};
 }
 
-std::string ForestNode::dump(const Grammar &G) const {
+std::string ForestNode::dump(
+const Grammar &G,
+std::optional> Code) const {
   switch (kind()) {
   case Ambiguous:
 return llvm::formatv("{0} := ", G.symbolName(symbol()));
   case Terminal:
-return llvm::formatv("{0} := tok[{1}]", G.symbolName(symbol()),
- startTokenIndex());
+if (Code) {
+  return llvm::formatv("{0} := tok[{1}] ({2})", G.symbolName(symbol()),
+   startTokenIndex(),
+   Code->get().tokens()[startTokenIndex()]);
+} else {
+  return llvm::formatv("{0} := tok[{1}]", G.symbolName(symbol()),
+   startTokenIndex());
+}
   case Sequence:
 return G.dumpRule(rule());
   case Opaque:
@@ -60,8 +68,10 @@ std::string ForestNode::dump(const Grammar &G) const {
   llvm_unreachable("Unhandled node kind!");
 }
 
-std::string ForestNode::dumpRecursive(const Grammar &G,
-  bool Abbreviated) const {
+std::string ForestNode::dumpRecursive(
+const Grammar &G,
+std::optional> Code,
+bool Abbreviated) const {
   using llvm::formatv;
   Token::Index MaxToken = 0;
   // Count visits of nodes so we can mark those seen multiple times.
@@ -95,7 +105,7 @@ std::string ForestNode::dumpRecursive(const Grammar &G,
   std::string Result;
   constexpr Token::Index KEnd = std::numeric_limits::max();
   std::function,
- LineDecoration &LineDec)>
+ LineDecoration LineDec)>
   Dump = [&](const ForestNode *P, Token::Index End,
  std::optional ElidedParent, LineDecoration LineDec) 
{
 bool SharedNode = VisitCounts.find(P)->getSecond() > 1;
@@ -145,13 +155,13 @@ std::string ForestNode::dumpRecursive(const Grammar &G,
 
   // The first time, print as #1. Later, =#1.
   if (First) {
-Result += formatv("{0} #{1}", P->dump(G), ID);
+Result += formatv("{0} #{1}", P->dump(G, Code), ID);
   } else {
 Result += formatv("{0} =#{1}", G.symbolName(P->symbol()), ID);
 Children = {}; // Don't walk the children again.
   }
 } else {
-  Result.append(P->dump(G));
+  Result.append(P->dump(G, Code));
 }
 Result.push_back('\n');
 
diff --git a/clang-tools-extra/pseudo/tool/ClangPseudo.cpp 
b/clang-tools-extra/pseudo/tool/ClangPseud

[clang] [compiler-rt] [Clang] Make `-Wreturn-type` default to an error in all language modes (PR #123470)

2025-01-19 Thread Jeremy Rifkin via cfe-commits

jeremy-rifkin wrote:

There are implications with #123166, which might also be clang 21 material

https://github.com/llvm/llvm-project/pull/123470
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang][Analysis] Don't treat the default switch path as unreachable when all enumerators are covered (PR #123166)

2025-01-18 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin updated 
https://github.com/llvm/llvm-project/pull/123166

>From e1ce92c0f54301cacaba316d38d44d20c6d61cb8 Mon Sep 17 00:00:00 2001
From: Jeremy Rifkin <51220084+jeremy-rif...@users.noreply.github.com>
Date: Thu, 16 Jan 2025 00:27:03 -0600
Subject: [PATCH 1/4] Don't treat the default switch case as unreachable even
 when all enumerators are covered

---
 clang/lib/Analysis/CFG.cpp   | 12 ++--
 clang/lib/Sema/AnalysisBasedWarnings.cpp |  3 ---
 clang/test/Analysis/cfg.cpp  |  8 
 ...cxx-uninitialized-object-unguarded-access.cpp |  2 ++
 clang/test/Sema/return.c |  2 +-
 clang/test/SemaCXX/array-bounds.cpp  | 16 +++-
 6 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/clang/lib/Analysis/CFG.cpp b/clang/lib/Analysis/CFG.cpp
index 304bbb2b422c61..b819df35189bb8 100644
--- a/clang/lib/Analysis/CFG.cpp
+++ b/clang/lib/Analysis/CFG.cpp
@@ -4408,17 +4408,9 @@ CFGBlock *CFGBuilder::VisitSwitchStmt(SwitchStmt 
*Terminator) {
   }
 
   // If we have no "default:" case, the default transition is to the code
-  // following the switch body.  Moreover, take into account if all the
-  // cases of a switch are covered (e.g., switching on an enum value).
-  //
-  // Note: We add a successor to a switch that is considered covered yet has no
-  //   case statements if the enumeration has no enumerators.
-  bool SwitchAlwaysHasSuccessor = false;
-  SwitchAlwaysHasSuccessor |= switchExclusivelyCovered;
-  SwitchAlwaysHasSuccessor |= Terminator->isAllEnumCasesCovered() &&
-  Terminator->getSwitchCaseList();
+  // following the switch body.
   addSuccessor(SwitchTerminatedBlock, DefaultCaseBlock,
-   !SwitchAlwaysHasSuccessor);
+   !switchExclusivelyCovered);
 
   // Add the terminator and condition in the switch block.
   SwitchTerminatedBlock->setTerminator(Terminator);
diff --git a/clang/lib/Sema/AnalysisBasedWarnings.cpp 
b/clang/lib/Sema/AnalysisBasedWarnings.cpp
index 589869d0186575..2fa6f0a881e808 100644
--- a/clang/lib/Sema/AnalysisBasedWarnings.cpp
+++ b/clang/lib/Sema/AnalysisBasedWarnings.cpp
@@ -456,10 +456,7 @@ static ControlFlowKind 
CheckFallThrough(AnalysisDeclContext &AC) {
   bool HasPlainEdge = false;
   bool HasAbnormalEdge = false;
 
-  // Ignore default cases that aren't likely to be reachable because all
-  // enums in a switch(X) have explicit case statements.
   CFGBlock::FilterOptions FO;
-  FO.IgnoreDefaultsWithCoveredEnums = 1;
 
   for (CFGBlock::filtered_pred_iterator I =
cfg->getExit().filtered_pred_start_end(FO);
diff --git a/clang/test/Analysis/cfg.cpp b/clang/test/Analysis/cfg.cpp
index 44a89df28e3b29..940994a7932192 100644
--- a/clang/test/Analysis/cfg.cpp
+++ b/clang/test/Analysis/cfg.cpp
@@ -178,7 +178,7 @@ namespace NoReturnSingleSuccessor {
 // CHECK-NEXT:1: x
 // CHECK-NEXT:2: [B1.1] (ImplicitCastExpr, LValueToRValue, int)
 // CHECK-NEXT:3: return [B1.2];
-// CHECK-NEXT:Preds (5): B3 B4 B5 B6 B2(Unreachable)
+// CHECK-NEXT:Preds (5): B3 B4 B5 B6 B2
 // CHECK-NEXT:Succs (1): B0
 // CHECK:  [B2]
 // CHECK-NEXT:1: 0
@@ -188,7 +188,7 @@ namespace NoReturnSingleSuccessor {
 // CHECK-NEXT:5: [B2.4] (ImplicitCastExpr, IntegralCast, int)
 // CHECK-NEXT:T: switch [B2.5]
 // CHECK-NEXT:Preds (1): B7
-// CHECK-NEXT:Succs (5): B3 B4 B5 B6 B1(Unreachable)
+// CHECK-NEXT:Succs (5): B3 B4 B5 B6 B1
 // CHECK:  [B3]
 // CHECK-NEXT:   case D:
 // CHECK-NEXT:1: 4
@@ -254,14 +254,14 @@ int test_enum_with_extension(enum MyEnum value) {
 // CHECK-NEXT:5: [B2.4] (ImplicitCastExpr, IntegralCast, int)
 // CHECK-NEXT:T: switch [B2.5]
 // CHECK-NEXT:Preds (1): B7
-// CHECK-NEXT:Succs (4): B4 B5 B6 B3(Unreachable)
+// CHECK-NEXT:Succs (4): B4 B5 B6 B3
 // CHECK:  [B3]
 // CHECK-NEXT:   default:
 // CHECK-NEXT:1: 4
 // CHECK-NEXT:2: x
 // CHECK-NEXT:3: [B3.2] = [B3.1]
 // CHECK-NEXT:T: break;
-// CHECK-NEXT:Preds (1): B2(Unreachable)
+// CHECK-NEXT:Preds (1): B2
 // CHECK-NEXT:Succs (1): B1
 // CHECK:  [B4]
 // CHECK-NEXT:   case C:
diff --git a/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp 
b/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
index 611e1d8255976c..b791c527c2e33e 100644
--- a/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
+++ b/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
@@ -383,6 +383,7 @@ class SwitchGuardedFieldsTest {
 case Kind::V:
   return -1;
 }
+halt();
   }
 
   int operator+() {
@@ -392,6 +393,7 @@ class SwitchGuardedFieldsTest {
 case Kind::V:
   return -1;
 }
+halt();
   }
 };
 
diff --git a/clang/test/Sema/return.c b/clang/test/Sema/return.c
index 14d5300e819492..b0814f152cbe95 100644
--- a/clang/test/Sema/return.c
+++ b/clang/test/Sema/return.c
@@ -265,7 +265,

[clang] [Clang][Analysis] Don't treat the default switch path as unreachable when all enumerators are covered (PR #123166)

2025-01-18 Thread Jeremy Rifkin via cfe-commits


@@ -170,16 +170,14 @@ void test_nested_switch() {
   }
 }
 
-// Test that if all the values of an enum covered, that the 'default' branch
-// is unreachable.
+// Test that a warning is not emitted if the code is unreachable.
 enum Values { A, B, C, D };
 void test_all_enums_covered(enum Values v) {

jeremy-rifkin wrote:

It looks like there's another test case that does effectively make an 
unreachable branch, I'll update this case as you suggested

https://github.com/llvm/llvm-project/pull/123166
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang-pseudo] clang-format files (PR #87900)

2025-01-15 Thread Jeremy Rifkin via cfe-commits

jeremy-rifkin wrote:

Clang pseudo has apparently been removed

https://github.com/llvm/llvm-project/pull/87900
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang-pseudo] Add a --print-terminal-tokens option (PR #87898)

2025-01-15 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin closed 
https://github.com/llvm/llvm-project/pull/87898
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang-pseudo] Add a --print-terminal-tokens option (PR #87898)

2025-01-15 Thread Jeremy Rifkin via cfe-commits

jeremy-rifkin wrote:

Clang pseudo has apparently been removed

https://github.com/llvm/llvm-project/pull/87898
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang-pseudo] clang-format files (PR #87900)

2025-01-15 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin closed 
https://github.com/llvm/llvm-project/pull/87900
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Don't treat the default switch path as unreachable when all enumerators are covered (PR #123166)

2025-01-15 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin updated 
https://github.com/llvm/llvm-project/pull/123166

>From e1ce92c0f54301cacaba316d38d44d20c6d61cb8 Mon Sep 17 00:00:00 2001
From: Jeremy Rifkin <51220084+jeremy-rif...@users.noreply.github.com>
Date: Thu, 16 Jan 2025 00:27:03 -0600
Subject: [PATCH 1/2] Don't treat the default switch case as unreachable even
 when all enumerators are covered

---
 clang/lib/Analysis/CFG.cpp   | 12 ++--
 clang/lib/Sema/AnalysisBasedWarnings.cpp |  3 ---
 clang/test/Analysis/cfg.cpp  |  8 
 ...cxx-uninitialized-object-unguarded-access.cpp |  2 ++
 clang/test/Sema/return.c |  2 +-
 clang/test/SemaCXX/array-bounds.cpp  | 16 +++-
 6 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/clang/lib/Analysis/CFG.cpp b/clang/lib/Analysis/CFG.cpp
index 304bbb2b422c61..b819df35189bb8 100644
--- a/clang/lib/Analysis/CFG.cpp
+++ b/clang/lib/Analysis/CFG.cpp
@@ -4408,17 +4408,9 @@ CFGBlock *CFGBuilder::VisitSwitchStmt(SwitchStmt 
*Terminator) {
   }
 
   // If we have no "default:" case, the default transition is to the code
-  // following the switch body.  Moreover, take into account if all the
-  // cases of a switch are covered (e.g., switching on an enum value).
-  //
-  // Note: We add a successor to a switch that is considered covered yet has no
-  //   case statements if the enumeration has no enumerators.
-  bool SwitchAlwaysHasSuccessor = false;
-  SwitchAlwaysHasSuccessor |= switchExclusivelyCovered;
-  SwitchAlwaysHasSuccessor |= Terminator->isAllEnumCasesCovered() &&
-  Terminator->getSwitchCaseList();
+  // following the switch body.
   addSuccessor(SwitchTerminatedBlock, DefaultCaseBlock,
-   !SwitchAlwaysHasSuccessor);
+   !switchExclusivelyCovered);
 
   // Add the terminator and condition in the switch block.
   SwitchTerminatedBlock->setTerminator(Terminator);
diff --git a/clang/lib/Sema/AnalysisBasedWarnings.cpp 
b/clang/lib/Sema/AnalysisBasedWarnings.cpp
index 589869d0186575..2fa6f0a881e808 100644
--- a/clang/lib/Sema/AnalysisBasedWarnings.cpp
+++ b/clang/lib/Sema/AnalysisBasedWarnings.cpp
@@ -456,10 +456,7 @@ static ControlFlowKind 
CheckFallThrough(AnalysisDeclContext &AC) {
   bool HasPlainEdge = false;
   bool HasAbnormalEdge = false;
 
-  // Ignore default cases that aren't likely to be reachable because all
-  // enums in a switch(X) have explicit case statements.
   CFGBlock::FilterOptions FO;
-  FO.IgnoreDefaultsWithCoveredEnums = 1;
 
   for (CFGBlock::filtered_pred_iterator I =
cfg->getExit().filtered_pred_start_end(FO);
diff --git a/clang/test/Analysis/cfg.cpp b/clang/test/Analysis/cfg.cpp
index 44a89df28e3b29..940994a7932192 100644
--- a/clang/test/Analysis/cfg.cpp
+++ b/clang/test/Analysis/cfg.cpp
@@ -178,7 +178,7 @@ namespace NoReturnSingleSuccessor {
 // CHECK-NEXT:1: x
 // CHECK-NEXT:2: [B1.1] (ImplicitCastExpr, LValueToRValue, int)
 // CHECK-NEXT:3: return [B1.2];
-// CHECK-NEXT:Preds (5): B3 B4 B5 B6 B2(Unreachable)
+// CHECK-NEXT:Preds (5): B3 B4 B5 B6 B2
 // CHECK-NEXT:Succs (1): B0
 // CHECK:  [B2]
 // CHECK-NEXT:1: 0
@@ -188,7 +188,7 @@ namespace NoReturnSingleSuccessor {
 // CHECK-NEXT:5: [B2.4] (ImplicitCastExpr, IntegralCast, int)
 // CHECK-NEXT:T: switch [B2.5]
 // CHECK-NEXT:Preds (1): B7
-// CHECK-NEXT:Succs (5): B3 B4 B5 B6 B1(Unreachable)
+// CHECK-NEXT:Succs (5): B3 B4 B5 B6 B1
 // CHECK:  [B3]
 // CHECK-NEXT:   case D:
 // CHECK-NEXT:1: 4
@@ -254,14 +254,14 @@ int test_enum_with_extension(enum MyEnum value) {
 // CHECK-NEXT:5: [B2.4] (ImplicitCastExpr, IntegralCast, int)
 // CHECK-NEXT:T: switch [B2.5]
 // CHECK-NEXT:Preds (1): B7
-// CHECK-NEXT:Succs (4): B4 B5 B6 B3(Unreachable)
+// CHECK-NEXT:Succs (4): B4 B5 B6 B3
 // CHECK:  [B3]
 // CHECK-NEXT:   default:
 // CHECK-NEXT:1: 4
 // CHECK-NEXT:2: x
 // CHECK-NEXT:3: [B3.2] = [B3.1]
 // CHECK-NEXT:T: break;
-// CHECK-NEXT:Preds (1): B2(Unreachable)
+// CHECK-NEXT:Preds (1): B2
 // CHECK-NEXT:Succs (1): B1
 // CHECK:  [B4]
 // CHECK-NEXT:   case C:
diff --git a/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp 
b/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
index 611e1d8255976c..b791c527c2e33e 100644
--- a/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
+++ b/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
@@ -383,6 +383,7 @@ class SwitchGuardedFieldsTest {
 case Kind::V:
   return -1;
 }
+halt();
   }
 
   int operator+() {
@@ -392,6 +393,7 @@ class SwitchGuardedFieldsTest {
 case Kind::V:
   return -1;
 }
+halt();
   }
 };
 
diff --git a/clang/test/Sema/return.c b/clang/test/Sema/return.c
index 14d5300e819492..b0814f152cbe95 100644
--- a/clang/test/Sema/return.c
+++ b/clang/test/Sema/return.c
@@ -265,7 +265,

[clang] Don't treat the default switch path as unreachable when all enumerators are covered (PR #123166)

2025-01-15 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin edited 
https://github.com/llvm/llvm-project/pull/123166
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Don't treat the default switch path as unreachable when all enumerators are covered (PR #123166)

2025-01-15 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin created 
https://github.com/llvm/llvm-project/pull/123166

This PR updates the CFG building logic and analysis based warning logic to not 
consider the default path for `switch` statements to be unreachable when all 
enumerators are covered. I.e.:

```cpp
enum class E { a, b };

int foo(E e) {
switch(e) {
case E::a:
return 20;
case E::b:
return 30;
}
// reachable
}
```

The motivation is here is to have `-Wreturn-type` warnings in cases such as 
above. Even though it's often sufficient to handle all enumerators, extraneous 
enum values are valid and common in the case of flag enums. More context at 
#123153. This is in-line with behavior in other compilers (gcc, msvc, and edg 
emit warnings for this).

A few test cases broke due to relying on missing returns in the default case. 
Two objective C test cases failed, which I would appreciate guidance on so that 
I can be sure to test the right things.

On the llvm discord a point was raised about changes to this warning behavior 
maybe being disruptive for any codebases that rely heavily on this missing 
return analysis behavior. I think the fact so few tests are affected in llvm is 
a good sign, and similarly the fact other compilers warn about this makes any 
potential disruption limited.

The two relevant commits introducing the enum covering logic:
https://github.com/llvm/llvm-project/commit/50205744c32d3925ba50d1bd69b1c9e6ab3a28b8
https://github.com/llvm/llvm-project/commit/f69ce860487ec0b1eceb23f8391614d5772aae3b

Closes #123153


>From e1ce92c0f54301cacaba316d38d44d20c6d61cb8 Mon Sep 17 00:00:00 2001
From: Jeremy Rifkin <51220084+jeremy-rif...@users.noreply.github.com>
Date: Thu, 16 Jan 2025 00:27:03 -0600
Subject: [PATCH] Don't treat the default switch case as unreachable even when
 all enumerators are covered

---
 clang/lib/Analysis/CFG.cpp   | 12 ++--
 clang/lib/Sema/AnalysisBasedWarnings.cpp |  3 ---
 clang/test/Analysis/cfg.cpp  |  8 
 ...cxx-uninitialized-object-unguarded-access.cpp |  2 ++
 clang/test/Sema/return.c |  2 +-
 clang/test/SemaCXX/array-bounds.cpp  | 16 +++-
 6 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/clang/lib/Analysis/CFG.cpp b/clang/lib/Analysis/CFG.cpp
index 304bbb2b422c61..b819df35189bb8 100644
--- a/clang/lib/Analysis/CFG.cpp
+++ b/clang/lib/Analysis/CFG.cpp
@@ -4408,17 +4408,9 @@ CFGBlock *CFGBuilder::VisitSwitchStmt(SwitchStmt 
*Terminator) {
   }
 
   // If we have no "default:" case, the default transition is to the code
-  // following the switch body.  Moreover, take into account if all the
-  // cases of a switch are covered (e.g., switching on an enum value).
-  //
-  // Note: We add a successor to a switch that is considered covered yet has no
-  //   case statements if the enumeration has no enumerators.
-  bool SwitchAlwaysHasSuccessor = false;
-  SwitchAlwaysHasSuccessor |= switchExclusivelyCovered;
-  SwitchAlwaysHasSuccessor |= Terminator->isAllEnumCasesCovered() &&
-  Terminator->getSwitchCaseList();
+  // following the switch body.
   addSuccessor(SwitchTerminatedBlock, DefaultCaseBlock,
-   !SwitchAlwaysHasSuccessor);
+   !switchExclusivelyCovered);
 
   // Add the terminator and condition in the switch block.
   SwitchTerminatedBlock->setTerminator(Terminator);
diff --git a/clang/lib/Sema/AnalysisBasedWarnings.cpp 
b/clang/lib/Sema/AnalysisBasedWarnings.cpp
index 589869d0186575..2fa6f0a881e808 100644
--- a/clang/lib/Sema/AnalysisBasedWarnings.cpp
+++ b/clang/lib/Sema/AnalysisBasedWarnings.cpp
@@ -456,10 +456,7 @@ static ControlFlowKind 
CheckFallThrough(AnalysisDeclContext &AC) {
   bool HasPlainEdge = false;
   bool HasAbnormalEdge = false;
 
-  // Ignore default cases that aren't likely to be reachable because all
-  // enums in a switch(X) have explicit case statements.
   CFGBlock::FilterOptions FO;
-  FO.IgnoreDefaultsWithCoveredEnums = 1;
 
   for (CFGBlock::filtered_pred_iterator I =
cfg->getExit().filtered_pred_start_end(FO);
diff --git a/clang/test/Analysis/cfg.cpp b/clang/test/Analysis/cfg.cpp
index 44a89df28e3b29..940994a7932192 100644
--- a/clang/test/Analysis/cfg.cpp
+++ b/clang/test/Analysis/cfg.cpp
@@ -178,7 +178,7 @@ namespace NoReturnSingleSuccessor {
 // CHECK-NEXT:1: x
 // CHECK-NEXT:2: [B1.1] (ImplicitCastExpr, LValueToRValue, int)
 // CHECK-NEXT:3: return [B1.2];
-// CHECK-NEXT:Preds (5): B3 B4 B5 B6 B2(Unreachable)
+// CHECK-NEXT:Preds (5): B3 B4 B5 B6 B2
 // CHECK-NEXT:Succs (1): B0
 // CHECK:  [B2]
 // CHECK-NEXT:1: 0
@@ -188,7 +188,7 @@ namespace NoReturnSingleSuccessor {
 // CHECK-NEXT:5: [B2.4] (ImplicitCastExpr, IntegralCast, int)
 // CHECK-NEXT:T: switch [B2.5]
 // CHECK-NEXT:Preds (1): B7
-// CHECK-NEXT:Succs (5): B3 B4 B5 B6 B1(Unreachable)
+// CHECK-NE

[clang] Don't treat the default switch path as unreachable when all enumerators are covered (PR #123166)

2025-01-16 Thread Jeremy Rifkin via cfe-commits

jeremy-rifkin wrote:

Hi Erich, thanks for taking a look.
My preference here would be to explicitly mark the default path as unreachable 
as such if the intent of the code is that the code path should be unreachable, 
similar to how one might write:
```cpp
bool is_prime(int num) {
if(num < 0) {
std::unreachable();
}
}
```

For such a function, the input `num` would never be negative under normal 
operation similar to how an enum might be assumed to never take on abnormal 
values under normal operation.

On one hand I can see an argument for treating enums as special, however, from 
a safety standpoint I think it’s much better to not make any assumptions about 
valid enum values.

Even setting flag enums and such aside, assuming enums can't take on abnormal 
values seems a very fickle assumption to bake into the compiler. Especially 
when C enums are involved, e.g. there is no warning on the following:
```cpp
enum E { a, b };

int foo(E e) {
switch(e) {
case E::a:
return 20;
case E::b:
return 30;
}
}
```

I don't know of any existing annotation that could be used to indicate the 
intent of an enum to clang. While on one hand I can see value in annotating a 
flag enum is such, e.g., my preference would be to not annotate enums about 
intended use and rather annotate the uses about relevant assumptions (i.e. 
marking paths unreachable explicitly). Sometimes enums are used in ways they 
aren’t intended and it would be impossible to predict that at the point of 
declaration (whether this is good practice or not is a different question). I'd 
also prefer to follow existing good practice (warn about a potentially 
non-returning path) here rather than rely on adoption of a new practice 
(annotations about enum use).

https://github.com/llvm/llvm-project/pull/123166
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Don't treat the default switch path as unreachable when all enumerators are covered (PR #123166)

2025-01-16 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin updated 
https://github.com/llvm/llvm-project/pull/123166

>From e1ce92c0f54301cacaba316d38d44d20c6d61cb8 Mon Sep 17 00:00:00 2001
From: Jeremy Rifkin <51220084+jeremy-rif...@users.noreply.github.com>
Date: Thu, 16 Jan 2025 00:27:03 -0600
Subject: [PATCH 1/3] Don't treat the default switch case as unreachable even
 when all enumerators are covered

---
 clang/lib/Analysis/CFG.cpp   | 12 ++--
 clang/lib/Sema/AnalysisBasedWarnings.cpp |  3 ---
 clang/test/Analysis/cfg.cpp  |  8 
 ...cxx-uninitialized-object-unguarded-access.cpp |  2 ++
 clang/test/Sema/return.c |  2 +-
 clang/test/SemaCXX/array-bounds.cpp  | 16 +++-
 6 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/clang/lib/Analysis/CFG.cpp b/clang/lib/Analysis/CFG.cpp
index 304bbb2b422c61..b819df35189bb8 100644
--- a/clang/lib/Analysis/CFG.cpp
+++ b/clang/lib/Analysis/CFG.cpp
@@ -4408,17 +4408,9 @@ CFGBlock *CFGBuilder::VisitSwitchStmt(SwitchStmt 
*Terminator) {
   }
 
   // If we have no "default:" case, the default transition is to the code
-  // following the switch body.  Moreover, take into account if all the
-  // cases of a switch are covered (e.g., switching on an enum value).
-  //
-  // Note: We add a successor to a switch that is considered covered yet has no
-  //   case statements if the enumeration has no enumerators.
-  bool SwitchAlwaysHasSuccessor = false;
-  SwitchAlwaysHasSuccessor |= switchExclusivelyCovered;
-  SwitchAlwaysHasSuccessor |= Terminator->isAllEnumCasesCovered() &&
-  Terminator->getSwitchCaseList();
+  // following the switch body.
   addSuccessor(SwitchTerminatedBlock, DefaultCaseBlock,
-   !SwitchAlwaysHasSuccessor);
+   !switchExclusivelyCovered);
 
   // Add the terminator and condition in the switch block.
   SwitchTerminatedBlock->setTerminator(Terminator);
diff --git a/clang/lib/Sema/AnalysisBasedWarnings.cpp 
b/clang/lib/Sema/AnalysisBasedWarnings.cpp
index 589869d0186575..2fa6f0a881e808 100644
--- a/clang/lib/Sema/AnalysisBasedWarnings.cpp
+++ b/clang/lib/Sema/AnalysisBasedWarnings.cpp
@@ -456,10 +456,7 @@ static ControlFlowKind 
CheckFallThrough(AnalysisDeclContext &AC) {
   bool HasPlainEdge = false;
   bool HasAbnormalEdge = false;
 
-  // Ignore default cases that aren't likely to be reachable because all
-  // enums in a switch(X) have explicit case statements.
   CFGBlock::FilterOptions FO;
-  FO.IgnoreDefaultsWithCoveredEnums = 1;
 
   for (CFGBlock::filtered_pred_iterator I =
cfg->getExit().filtered_pred_start_end(FO);
diff --git a/clang/test/Analysis/cfg.cpp b/clang/test/Analysis/cfg.cpp
index 44a89df28e3b29..940994a7932192 100644
--- a/clang/test/Analysis/cfg.cpp
+++ b/clang/test/Analysis/cfg.cpp
@@ -178,7 +178,7 @@ namespace NoReturnSingleSuccessor {
 // CHECK-NEXT:1: x
 // CHECK-NEXT:2: [B1.1] (ImplicitCastExpr, LValueToRValue, int)
 // CHECK-NEXT:3: return [B1.2];
-// CHECK-NEXT:Preds (5): B3 B4 B5 B6 B2(Unreachable)
+// CHECK-NEXT:Preds (5): B3 B4 B5 B6 B2
 // CHECK-NEXT:Succs (1): B0
 // CHECK:  [B2]
 // CHECK-NEXT:1: 0
@@ -188,7 +188,7 @@ namespace NoReturnSingleSuccessor {
 // CHECK-NEXT:5: [B2.4] (ImplicitCastExpr, IntegralCast, int)
 // CHECK-NEXT:T: switch [B2.5]
 // CHECK-NEXT:Preds (1): B7
-// CHECK-NEXT:Succs (5): B3 B4 B5 B6 B1(Unreachable)
+// CHECK-NEXT:Succs (5): B3 B4 B5 B6 B1
 // CHECK:  [B3]
 // CHECK-NEXT:   case D:
 // CHECK-NEXT:1: 4
@@ -254,14 +254,14 @@ int test_enum_with_extension(enum MyEnum value) {
 // CHECK-NEXT:5: [B2.4] (ImplicitCastExpr, IntegralCast, int)
 // CHECK-NEXT:T: switch [B2.5]
 // CHECK-NEXT:Preds (1): B7
-// CHECK-NEXT:Succs (4): B4 B5 B6 B3(Unreachable)
+// CHECK-NEXT:Succs (4): B4 B5 B6 B3
 // CHECK:  [B3]
 // CHECK-NEXT:   default:
 // CHECK-NEXT:1: 4
 // CHECK-NEXT:2: x
 // CHECK-NEXT:3: [B3.2] = [B3.1]
 // CHECK-NEXT:T: break;
-// CHECK-NEXT:Preds (1): B2(Unreachable)
+// CHECK-NEXT:Preds (1): B2
 // CHECK-NEXT:Succs (1): B1
 // CHECK:  [B4]
 // CHECK-NEXT:   case C:
diff --git a/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp 
b/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
index 611e1d8255976c..b791c527c2e33e 100644
--- a/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
+++ b/clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
@@ -383,6 +383,7 @@ class SwitchGuardedFieldsTest {
 case Kind::V:
   return -1;
 }
+halt();
   }
 
   int operator+() {
@@ -392,6 +393,7 @@ class SwitchGuardedFieldsTest {
 case Kind::V:
   return -1;
 }
+halt();
   }
 };
 
diff --git a/clang/test/Sema/return.c b/clang/test/Sema/return.c
index 14d5300e819492..b0814f152cbe95 100644
--- a/clang/test/Sema/return.c
+++ b/clang/test/Sema/return.c
@@ -265,7 +265,

[clang] [Clang][Analysis] Don't treat the default switch path as unreachable when all enumerators are covered (PR #123166)

2025-01-16 Thread Jeremy Rifkin via cfe-commits

https://github.com/jeremy-rifkin edited 
https://github.com/llvm/llvm-project/pull/123166
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Don't treat the default switch path as unreachable when all enumerators are covered (PR #123166)

2025-01-16 Thread Jeremy Rifkin via cfe-commits


@@ -170,16 +170,14 @@ void test_nested_switch() {
   }
 }
 
-// Test that if all the values of an enum covered, that the 'default' branch
-// is unreachable.
+// Test that a warning is not emitted if the code is unreachable.
 enum Values { A, B, C, D };
 void test_all_enums_covered(enum Values v) {

jeremy-rifkin wrote:

Thanks for taking a look. What I was trying to do here was make a control flow 
graph where the out of bounds access happened in a path clang knew was 
unreachable (and thus not fire). I'm not sure if that's tested anywhere else. 
Checking if the warning is emitted here with the switch seems a little 
redundant to me with other checks about no longer assuming the default path is 
unreachable if enum values are covered. Thoughts?

https://github.com/llvm/llvm-project/pull/123166
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [compiler-rt] [Clang] Make `-Wreturn-type` default to an error in all language modes (PR #123470)

2025-02-18 Thread Jeremy Rifkin via cfe-commits

jeremy-rifkin wrote:

Hi @foxtran, I have been working on a proposal on the C++ side. Extensive 
justification is needed to justify breakage here and I have been attempting to 
survey the extent of potential breakage. The wording is also tricky especially 
since attributes are designed to be ignorable. Would you be interested in 
collaborating on a paper?

https://github.com/llvm/llvm-project/pull/123470
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits