[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2023-12-21 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Thank you for your advice. I have confirmed locally that the check reports
"clang-format did not modify any files".



https://github.com/llvm/llvm-project/pull/76059
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2023-12-22 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/76059

From b0080a41c1802517e4a02976058231cf37a82adb Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Fri, 3 Nov 2023 20:58:17 +0900
Subject: [PATCH] [clang-format] Support of TableGen formatting.

Currently, TableGen has its own language style, but it does not work
well. This patch adds full support for TableGen formatting, including
support for code (multi-line strings), DAG args, bang operators,
the cond operator, and the paste operators.
---
 clang/include/clang/Format/Format.h           |  47 ++
 clang/lib/Format/ContinuationIndenter.cpp     |  18 +-
 clang/lib/Format/Format.cpp                   |  29 ++
 clang/lib/Format/FormatToken.h                |  98
 clang/lib/Format/FormatTokenLexer.cpp         | 142 ++
 clang/lib/Format/FormatTokenLexer.h           |   6 +
 clang/lib/Format/TokenAnnotator.cpp           | 426 +-
 clang/lib/Format/UnwrappedLineParser.cpp      |  54 ++-
 clang/lib/Format/WhitespaceManager.cpp        |  31 +-
 clang/lib/Format/WhitespaceManager.h          |  14 +
 clang/unittests/Format/FormatTestTableGen.cpp | 307 +
 clang/unittests/Format/TokenAnnotatorTest.cpp |  50 ++
 12 files changed, 1197 insertions(+), 25 deletions(-)

diff --git a/clang/include/clang/Format/Format.h b/clang/include/clang/Format/Format.h
index 8604dea689f937..30a38aed99866e 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -396,6 +396,36 @@ struct FormatStyle {
   /// \version 17
   ShortCaseStatementsAlignmentStyle AlignConsecutiveShortCaseStatements;
 
+  /// Style of aligning consecutive TableGen cond operator colons.
+  /// \code
+  ///   !cond(!eq(size, 1) : 1,
+  ///         !eq(size, 16): 1,
+  ///         true         : 0)
+  /// \endcode
+  /// \version 18
+  AlignConsecutiveStyle AlignConsecutiveTableGenCondOperatorColons;
+
+  /// Style of aligning consecutive TableGen DAGArg operator colons.
+  /// Intended to be used with TableGenBreakInsideDAGArgList
+  /// \code
+  ///   let dagarg = (ins
+  ///       a  :$src1,
+  ///       aa :$src2,
+  ///       aaa:$src3
+  ///   )
+  /// \endcode
+  /// \version 18
+  AlignConsecutiveStyle AlignConsecutiveTableGenBreakingDAGArgColons;
+
+  /// Style of aligning consecutive TableGen def colons.
+  /// \code
+  ///   def Def       : Parent {}
+  ///   def DefDef    : Parent {}
+  ///   def DefDefDef : Parent {}
+  /// \endcode
+  /// \version 18
+  AlignConsecutiveStyle AlignConsecutiveTableGenDefinitions;
+
   /// Different styles for aligning escaped newlines.
   enum EscapedNewlineAlignmentStyle : int8_t {
     /// Don't align escaped newlines.
@@ -3037,6 +3067,7 @@ struct FormatStyle {
   bool isProto() const {
     return Language == LK_Proto || Language == LK_TextProto;
   }
+  bool isTableGen() const { return Language == LK_TableGen; }
 
   /// Language, this format style is targeted at.
   /// \version 3.5
@@ -4656,6 +4687,15 @@ struct FormatStyle {
   /// \version 8
   std::vector<std::string> StatementMacros;
 
+  /// Tablegen
+  bool TableGenAllowBreakBeforeInheritColon;
+  bool TableGenAllowBreakAfterInheritColon;
+  bool TableGenBreakInsideCondOperator;
+  bool TableGenBreakInsideDAGArgList;
+  bool TableGenPreferBreakInsideSquareBracket;
+  bool TableGenSpaceAroundDAGArgColon;
+  std::vector<std::string> TableGenBreakingDAGArgOperators;
+
   /// The number of columns used for tab stops.
   /// \version 3.7
   unsigned TabWidth;
@@ -4753,6 +4793,13 @@ struct FormatStyle {
            AlignConsecutiveMacros == R.AlignConsecutiveMacros &&
            AlignConsecutiveShortCaseStatements ==
                R.AlignConsecutiveShortCaseStatements &&
+           AlignConsecutiveTableGenCondOperatorColons ==
+               R.AlignConsecutiveTableGenCondOperatorColons &&
+           AlignConsecutiveTableGenBreakingDAGArgColons ==
+               R.AlignConsecutiveTableGenBreakingDAGArgColons &&
+           AlignConsecutiveTableGenDefinitions ==
+               R.AlignConsecutiveTableGenDefinitions &&
+           AlignConsecutiveMacros == R.AlignConsecutiveMacros &&
            AlignEscapedNewlines == R.AlignEscapedNewlines &&
            AlignOperands == R.AlignOperands &&
            AlignTrailingComments == R.AlignTrailingComments &&
diff --git a/clang/lib/Format/ContinuationIndenter.cpp b/clang/lib/Format/ContinuationIndenter.cpp
index bd319f21b05f86..176dc7744a5576 100644
--- a/clang/lib/Format/ContinuationIndenter.cpp
+++ b/clang/lib/Format/ContinuationIndenter.cpp
@@ -800,6 +800,7 @@ void ContinuationIndenter::addTokenOnCurrentLine(LineState &State, bool DryRun,
   if (Style.AlignAfterOpenBracket != FormatStyle::BAS_DontAlign &&
       !CurrentState.IsCSharpGenericTypeConstraint && Previous.opensScope() &&
       Previous.isNot(TT_ObjCMethodExpr) && Previous.isNot(TT_RequiresClause) &&
+      Previous.isNot(TT_TableGenDAGArgOpener) &&
       !(Current.MacroParent && Previous.MacroParent) &&
       (

[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2023-12-23 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/76059

[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2023-12-23 Thread Hirofumi Nakamura via cfe-commits




hnakamura5 wrote:

Thank you for the information. I have added the documentation.

https://github.com/llvm/llvm-project/pull/76059


[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2023-12-23 Thread Hirofumi Nakamura via cfe-commits


@@ -396,6 +396,36 @@ struct FormatStyle {
   /// \version 17
   ShortCaseStatementsAlignmentStyle AlignConsecutiveShortCaseStatements;
 
+  /// Style of aligning consecutive TableGen cond operator colons.
+  /// \code
+  ///   !cond(!eq(size, 1) : 1,
+  ///         !eq(size, 16): 1,
+  ///         true         : 0)
+  /// \endcode
+  /// \version 18
+  AlignConsecutiveStyle AlignConsecutiveTableGenCondOperatorColons;
+
+  /// Style of aligning consecutive TableGen DAGArg operator colons.
+  /// Intended to be used with TableGenBreakInsideDAGArgList
+  /// \code
+  ///   let dagarg = (ins
+  ///       a  :$src1,
+  ///       aa :$src2,
+  ///       aaa:$src3
+  ///   )
+  /// \endcode
+  /// \version 18
+  AlignConsecutiveStyle AlignConsecutiveTableGenBreakingDAGArgColons;
+
+  /// Style of aligning consecutive TableGen def colons.
+  /// \code
+  ///   def Def       : Parent {}
+  ///   def DefDef    : Parent {}
+  ///   def DefDefDef : Parent {}
+  /// \endcode
+  /// \version 18
+  AlignConsecutiveStyle AlignConsecutiveTableGenDefinitions;

hnakamura5 wrote:

The AlignConsecutiveDeclarations option aligns the identifiers being declared.
The AlignConsecutiveTableGenDefinitions option instead aligns the inheritance
colons of defs, a style often seen in hand-formatted TableGen files.
I agree the name is confusing, so I renamed the option to make clear that it
aligns colons.
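
To illustrate the layout being discussed (this is not the clang-format implementation), here is a minimal standalone sketch of aligning the inheritance colons of consecutive defs. The helper names `trimRight` and `alignDefColons` are hypothetical, and the sketch assumes the first `:` on each line is the inheritance colon.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: drops trailing spaces from S.
std::string trimRight(const std::string &S) {
  std::size_t End = S.find_last_not_of(' ');
  return End == std::string::npos ? std::string() : S.substr(0, End + 1);
}

// Hypothetical helper, not part of clang-format: pads the text before the
// first ':' of each line so that the colons end up in the same column.
// Lines without a colon are passed through unchanged.
std::vector<std::string> alignDefColons(const std::vector<std::string> &Lines) {
  // First pass: find the widest "def Name" part.
  std::size_t Widest = 0;
  for (const std::string &Line : Lines) {
    std::size_t Colon = Line.find(':');
    if (Colon != std::string::npos)
      Widest = std::max(Widest, trimRight(Line.substr(0, Colon)).size());
  }

  // Second pass: pad every head to the widest one, then re-attach the rest.
  std::vector<std::string> Result;
  for (const std::string &Line : Lines) {
    std::size_t Colon = Line.find(':');
    if (Colon == std::string::npos) {
      Result.push_back(Line);
      continue;
    }
    std::string Head = trimRight(Line.substr(0, Colon));
    assert(Head.size() <= Widest);
    Head.append(Widest - Head.size(), ' ');
    Result.push_back(Head + " " + Line.substr(Colon));
  }
  return Result;
}
```

Given the three defs from the option's documentation comment (`def Def`, `def DefDef`, `def DefDefDef`, each followed by `: Parent {}`), the sketch pads the shorter names so that all three colons share the column of the longest one.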

https://github.com/llvm/llvm-project/pull/76059


[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2023-12-23 Thread Hirofumi Nakamura via cfe-commits


@@ -4656,6 +4687,15 @@ struct FormatStyle {
   /// \version 8
   std::vector<std::string> StatementMacros;
 
+  /// Tablegen
+  bool TableGenAllowBreakBeforeInheritColon;
+  bool TableGenAllowBreakAfterInheritColon;

hnakamura5 wrote:

Thank you for the information. I removed these options; BreakInheritanceList
works well instead. This changes some of the formatting results, but the
changes seem acceptable.

https://github.com/llvm/llvm-project/pull/76059


[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2023-12-23 Thread Hirofumi Nakamura via cfe-commits


@@ -40,6 +40,13 @@ class FormatTestTableGen : public ::testing::Test {
     EXPECT_EQ(Code.str(), format(Code)) << "Expected code is not stable";
     EXPECT_EQ(Code.str(), format(test::messUp(Code)));
   }
+
+  static void verifyFormat(llvm::StringRef Code, const FormatStyle &Style) {

hnakamura5 wrote:

I had not noticed that. Thank you. I can fix this as well, but I am currently
planning whether (and how) to split this growing pull request. This kind of
refactoring may be better done after that.

https://github.com/llvm/llvm-project/pull/76059


[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2023-12-23 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

@rymiel @HazardyKnusperkeks
Thank you for your review!
I have fixed the points you raised, except for the refactoring of the test
base class in
https://github.com/llvm/llvm-project/commit/f8d10d5ac9ab4b45b388c74357fc82fb96562e66
I am not sure whether I should do that here; if I should, I would do it in a
separate pull request.

I now understand that I should split this pull request into several parts. It
was large to begin with and keeps growing as I add documentation.
I am still working out how to split it. My current plan is to separate it by
functionality:

- Handling multi-line strings (~100 lines).
- Handling numeric-like identifiers (~100 lines).
- Handling TableGen-specific keywords (~100 lines).
- Unwrapped line parsing (~100 lines).
- Parsing TableGen values (500+ lines including unit tests).
- Basic options, excluding the alignment ones (500+ lines including the
  documentation).
- Alignment options (about 100 lines including documentation).
- Refactoring the unit tests.

I am not sure this is a good plan; the split may turn out to be complicated.
Could you help me plan this if you have any ideas?

In addition, I do not know the appropriate way to split a pull request after
it has been opened. Is it enough for the new pull requests to reference each
other and to close this one at the end?

https://github.com/llvm/llvm-project/pull/76059


[clang] [clang-format] TableGen keywords support. (PR #77477)

2024-01-09 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 created 
https://github.com/llvm/llvm-project/pull/77477

Add TableGen keywords to the additional keyword list of the formatter.
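
As a rough illustration of the idea (not the code in the patch), an "additional keyword list" can be modelled as a set of interned identifier pointers, so that a keyword test is a pointer lookup rather than a string comparison. All names below (`Ident`, `IdentTable`, `TableGenKeywordSet`) are hypothetical stand-ins, and the keyword spellings are only the ones newly registered in the patch that follows.

```cpp
#include <cassert>
#include <initializer_list>
#include <memory>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for an interned identifier (not clang's
// IdentifierInfo).
struct Ident {
  std::string Name;
};

// Hypothetical interning table: returns one stable Ident per spelling.
class IdentTable {
public:
  const Ident *get(const std::string &Name) {
    auto It = Table.find(Name);
    if (It == Table.end())
      It = Table.emplace(Name, std::make_unique<Ident>(Ident{Name})).first;
    return It->second.get();
  }

private:
  std::unordered_map<std::string, std::unique_ptr<Ident>> Table;
};

// Hypothetical keyword set: interns each keyword once in the constructor and
// answers membership queries by comparing pointers.
class TableGenKeywordSet {
public:
  explicit TableGenKeywordSet(IdentTable &Idents) {
    for (const char *Name :
         {"bit", "bits", "code", "dag", "def", "defm", "defset", "defvar",
          "dump", "include", "list", "multiclass", "then"})
      Extra.insert(Idents.get(Name));
  }

  bool isKeyword(const Ident *Id) const {
    assert(Id && "expected an interned identifier");
    return Extra.count(Id) != 0;
  }

private:
  std::unordered_set<const Ident *> Extra;
};
```

Because every spelling maps to exactly one `Ident`, an identifier such as `defm` obtained from the same table is recognized as a keyword, while an ordinary name such as `Parent` is not.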

This pull request is split out from 
https://github.com/llvm/llvm-project/pull/76059.

From 915d1822f68f975f60e49b3cc236fe97a19e7f49 Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Tue, 9 Jan 2024 22:57:53 +0900
Subject: [PATCH] [clang-format] TableGen keywords support.

Add TableGen keywords to the additional keyword list of the formatter.
---
 clang/include/clang/Format/Format.h           |  1 +
 clang/lib/Format/FormatToken.h                | 75 +++
 clang/lib/Format/FormatTokenLexer.cpp         |  3 +
 clang/lib/Format/TokenAnnotator.cpp           |  6 ++
 clang/unittests/Format/TokenAnnotatorTest.cpp | 18 +
 5 files changed, 103 insertions(+)

diff --git a/clang/include/clang/Format/Format.h b/clang/include/clang/Format/Format.h
index 8604dea689f937..d63e96ea95832d 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -3037,6 +3037,7 @@ struct FormatStyle {
   bool isProto() const {
     return Language == LK_Proto || Language == LK_TextProto;
   }
+  bool isTableGen() const { return Language == LK_TableGen; }
 
   /// Language, this format style is targeted at.
   /// \version 3.5
diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index 3f9664f8f78a3e..bd21a972441a98 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -1202,6 +1202,21 @@ struct AdditionalKeywords {
     kw_verilogHashHash = &IdentTable.get("##");
     kw_apostrophe = &IdentTable.get("\'");
 
+    // TableGen keywords
+    kw_bit = &IdentTable.get("bit");
+    kw_bits = &IdentTable.get("bits");
+    kw_code = &IdentTable.get("code");
+    kw_dag = &IdentTable.get("dag");
+    kw_def = &IdentTable.get("def");
+    kw_defm = &IdentTable.get("defm");
+    kw_defset = &IdentTable.get("defset");
+    kw_defvar = &IdentTable.get("defvar");
+    kw_dump = &IdentTable.get("dump");
+    kw_include = &IdentTable.get("include");
+    kw_list = &IdentTable.get("list");
+    kw_multiclass = &IdentTable.get("multiclass");
+    kw_then = &IdentTable.get("then");
+
     // Keep this at the end of the constructor to make sure everything here
     // is
     // already initialized.
@@ -1294,6 +1309,27 @@ struct AdditionalKeywords {
          kw_wildcard,     kw_wire,
          kw_with,         kw_wor,
          kw_verilogHash,  kw_verilogHashHash});
+
+    TableGenExtraKeywords = std::unordered_set<IdentifierInfo *>({
+        kw_assert,
+        kw_bit,
+        kw_bits,
+        kw_code,
+        kw_dag,
+        kw_def,
+        kw_defm,
+        kw_defset,
+        kw_defvar,
+        kw_dump,
+        kw_foreach,
+        kw_in,
+        kw_include,
+        kw_let,
+        kw_list,
+        kw_multiclass,
+        kw_string,
+        kw_then,
+    });
   }
 
   // Context sensitive keywords.
@@ -1539,6 +1575,21 @@ struct AdditionalKeywords {
   // Symbols in Verilog that don't exist in C++.
   IdentifierInfo *kw_apostrophe;
 
+  // TableGen keywords
+  IdentifierInfo *kw_bit;
+  IdentifierInfo *kw_bits;
+  IdentifierInfo *kw_code;
+  IdentifierInfo *kw_dag;
+  IdentifierInfo *kw_def;
+  IdentifierInfo *kw_defm;
+  IdentifierInfo *kw_defset;
+  IdentifierInfo *kw_defvar;
+  IdentifierInfo *kw_dump;
+  IdentifierInfo *kw_include;
+  IdentifierInfo *kw_list;
+  IdentifierInfo *kw_multiclass;
+  IdentifierInfo *kw_then;
+
   /// Returns \c true if \p Tok is a keyword or an identifier.
   bool isWordLike(const FormatToken &Tok) const {
     // getIdentifierinfo returns non-null for keywords as well as identifiers.
@@ -1811,6 +1862,27 @@ struct AdditionalKeywords {
     }
   }
 
+  bool isTableGenDefinition(const FormatToken &Tok) const {
+    return Tok.isOneOf(kw_def, kw_defm, kw_defset, kw_defvar, kw_multiclass,
+                       kw_let, tok::kw_class);
+  }
+
+  bool isTableGenKeyword(const FormatToken &Tok) const {
+    switch (Tok.Tok.getKind()) {
+    case tok::kw_class:
+    case tok::kw_else:
+    case tok::kw_false:
+    case tok::kw_if:
+    case tok::kw_int:
+    case tok::kw_true:
+      return true;
+    default:
+      return Tok.is(tok::identifier) &&
+             TableGenExtraKeywords.find(Tok.Tok.getIdentifierInfo()) !=
+                 TableGenExtraKeywords.end();
+    }
+  }
+
 private:
   /// The JavaScript keywords beyond the C++ keyword set.
   std::unordered_set<IdentifierInfo *> JsExtraKeywords;
@@ -1820,6 +1892,9 @@ struct AdditionalKeywords {
 
   /// The Verilog keywords beyond the C++ keyword set.
   std::unordered_set<IdentifierInfo *> VerilogExtraKeywords;
+
+  /// The TableGen keywords beyond the C++ keyword set.
+  std::unordered_set<IdentifierInfo *> TableGenExtraKeywords;
 };
 
 inline bool isLineComment(const FormatToken &FormatTok) {
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index 61430282c6f88c..a1fd6dd6effe6c 100644
--- a/clang/lib/Format/For

[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2024-01-09 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Thanks to your advice, I have begun splitting this pull request into several
parts.

The keywords part is now in https://github.com/llvm/llvm-project/pull/77477.

https://github.com/llvm/llvm-project/pull/76059


[clang] [clang-format] TableGen keywords support. (PR #77477)

2024-01-11 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/77477

From 4e9f2bc86c4e48c4d412fea7804c226f041d022c Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Tue, 9 Jan 2024 22:57:53 +0900
Subject: [PATCH] [clang-format] TableGen keywords support.

Add TableGen keywords to the additional keyword list of the formatter.
---
 clang/include/clang/Format/Format.h           |  1 +
 clang/lib/Format/FormatToken.h                | 75 +++
 clang/lib/Format/FormatTokenLexer.cpp         |  3 +
 clang/lib/Format/TokenAnnotator.cpp           |  6 ++
 clang/unittests/Format/TokenAnnotatorTest.cpp | 18 +
 5 files changed, 103 insertions(+)

diff --git a/clang/include/clang/Format/Format.h b/clang/include/clang/Format/Format.h
index 59b645ecab715b..5ffd63ee73fc36 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -3055,6 +3055,7 @@ struct FormatStyle {
   bool isProto() const {
     return Language == LK_Proto || Language == LK_TextProto;
   }
+  bool isTableGen() const { return Language == LK_TableGen; }
 
   /// Language, this format style is targeted at.
   /// \version 3.5
diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index 3f9664f8f78a3e..bd21a972441a98 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -1202,6 +1202,21 @@ struct AdditionalKeywords {
     kw_verilogHashHash = &IdentTable.get("##");
     kw_apostrophe = &IdentTable.get("\'");
 
+    // TableGen keywords
+    kw_bit = &IdentTable.get("bit");
+    kw_bits = &IdentTable.get("bits");
+    kw_code = &IdentTable.get("code");
+    kw_dag = &IdentTable.get("dag");
+    kw_def = &IdentTable.get("def");
+    kw_defm = &IdentTable.get("defm");
+    kw_defset = &IdentTable.get("defset");
+    kw_defvar = &IdentTable.get("defvar");
+    kw_dump = &IdentTable.get("dump");
+    kw_include = &IdentTable.get("include");
+    kw_list = &IdentTable.get("list");
+    kw_multiclass = &IdentTable.get("multiclass");
+    kw_then = &IdentTable.get("then");
+
     // Keep this at the end of the constructor to make sure everything here
     // is
     // already initialized.
@@ -1294,6 +1309,27 @@ struct AdditionalKeywords {
          kw_wildcard,     kw_wire,
          kw_with,         kw_wor,
          kw_verilogHash,  kw_verilogHashHash});
+
+    TableGenExtraKeywords = std::unordered_set<IdentifierInfo *>({
+        kw_assert,
+        kw_bit,
+        kw_bits,
+        kw_code,
+        kw_dag,
+        kw_def,
+        kw_defm,
+        kw_defset,
+        kw_defvar,
+        kw_dump,
+        kw_foreach,
+        kw_in,
+        kw_include,
+        kw_let,
+        kw_list,
+        kw_multiclass,
+        kw_string,
+        kw_then,
+    });
   }
 
   // Context sensitive keywords.
@@ -1539,6 +1575,21 @@ struct AdditionalKeywords {
   // Symbols in Verilog that don't exist in C++.
   IdentifierInfo *kw_apostrophe;
 
+  // TableGen keywords
+  IdentifierInfo *kw_bit;
+  IdentifierInfo *kw_bits;
+  IdentifierInfo *kw_code;
+  IdentifierInfo *kw_dag;
+  IdentifierInfo *kw_def;
+  IdentifierInfo *kw_defm;
+  IdentifierInfo *kw_defset;
+  IdentifierInfo *kw_defvar;
+  IdentifierInfo *kw_dump;
+  IdentifierInfo *kw_include;
+  IdentifierInfo *kw_list;
+  IdentifierInfo *kw_multiclass;
+  IdentifierInfo *kw_then;
+
   /// Returns \c true if \p Tok is a keyword or an identifier.
   bool isWordLike(const FormatToken &Tok) const {
     // getIdentifierinfo returns non-null for keywords as well as identifiers.
@@ -1811,6 +1862,27 @@ struct AdditionalKeywords {
     }
   }
 
+  bool isTableGenDefinition(const FormatToken &Tok) const {
+    return Tok.isOneOf(kw_def, kw_defm, kw_defset, kw_defvar, kw_multiclass,
+                       kw_let, tok::kw_class);
+  }
+
+  bool isTableGenKeyword(const FormatToken &Tok) const {
+    switch (Tok.Tok.getKind()) {
+    case tok::kw_class:
+    case tok::kw_else:
+    case tok::kw_false:
+    case tok::kw_if:
+    case tok::kw_int:
+    case tok::kw_true:
+      return true;
+    default:
+      return Tok.is(tok::identifier) &&
+             TableGenExtraKeywords.find(Tok.Tok.getIdentifierInfo()) !=
+                 TableGenExtraKeywords.end();
+    }
+  }
+
 private:
   /// The JavaScript keywords beyond the C++ keyword set.
   std::unordered_set<IdentifierInfo *> JsExtraKeywords;
@@ -1820,6 +1892,9 @@ struct AdditionalKeywords {
 
   /// The Verilog keywords beyond the C++ keyword set.
   std::unordered_set<IdentifierInfo *> VerilogExtraKeywords;
+
+  /// The TableGen keywords beyond the C++ keyword set.
+  std::unordered_set<IdentifierInfo *> TableGenExtraKeywords;
 };
 
 inline bool isLineComment(const FormatToken &FormatTok) {
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index 61430282c6f88c..a1fd6dd6effe6c 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -1182,6 +1182,9 @@ FormatToken *FormatTokenLexer::getNextToken() {
   

[clang] [clang-format] TableGen keywords support. (PR #77477)

2024-01-11 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

@HazardyKnusperkeks
Thank you for reviewing!
I do not have write permission to the repository. Could you please commit this,
or tell me what I should do?

https://github.com/llvm/llvm-project/pull/77477


[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-03-17 Thread Hirofumi Nakamura via cfe-commits


@@ -2332,6 +2332,77 @@ TEST_F(TokenAnnotatorTest, UnderstandTableGenTokens) {
   EXPECT_TOKEN(Tokens[4], tok::less, TT_TemplateOpener);
   EXPECT_TOKEN(Tokens[6], tok::greater, TT_TemplateCloser);
   EXPECT_TOKEN(Tokens[7], tok::l_brace, TT_FunctionLBrace);
+
+  // DAGArg breaking options. They use different token types depending on what
+  // is specified.
+  Style.TableGenBreakInsideDAGArg = FormatStyle::DAS_BreakElements;
+
+  // When TableGenBreakInsideDAGArg is DAS_BreakElements and
+  // TableGenBreakingDAGArgOperators is not specified, it makes all the DAGArg
+  // elements to have line break.
+  Tokens = AnnotateValue("(ins type1:$src1, type2:$src2)");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[0], tok::l_paren, TT_TableGenDAGArgOpenerToBreak);
+  EXPECT_TOKEN(Tokens[1], tok::identifier,
+               TT_TableGenDAGArgOperatorID); // ins
+  EXPECT_TOKEN(Tokens[5], tok::comma, TT_TableGenDAGArgListCommaToBreak);
+  EXPECT_TOKEN(Tokens[9], tok::r_paren, TT_TableGenDAGArgCloser);
+
+  Tokens = AnnotateValue("(other type1:$src1, type2:$src2)");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[0], tok::l_paren, TT_TableGenDAGArgOpenerToBreak);
+  EXPECT_TOKEN(Tokens[1], tok::identifier,
+               TT_TableGenDAGArgOperatorID); // other
+  EXPECT_TOKEN(Tokens[5], tok::comma, TT_TableGenDAGArgListCommaToBreak);
+  EXPECT_TOKEN(Tokens[9], tok::r_paren, TT_TableGenDAGArgCloser);
+
+  // For non-identifier operators, breaks after the operator.
+  Tokens = AnnotateValue("(!cast<Type>(\"Name\") type1:$src1, type2:$src2)");
+  ASSERT_EQ(Tokens.size(), 16u) << Tokens;
+  EXPECT_TOKEN(Tokens[0], tok::l_paren, TT_TableGenDAGArgOpenerToBreak);
+  EXPECT_TOKEN(Tokens[7], tok::r_paren, TT_TableGenDAGArgOperatorToBreak);
+  EXPECT_TOKEN(Tokens[11], tok::comma, TT_TableGenDAGArgListCommaToBreak);
+  EXPECT_TOKEN(Tokens[15], tok::r_paren, TT_TableGenDAGArgCloser);
+
+  Style.TableGenBreakInsideDAGArg = FormatStyle::DAS_BreakAll;
+
+  // When TableGenBreakInsideDAGArg is DAS_BreakAll and
+  // TableGenBreakingDAGArgOperators is not specified, it makes all the DAGArg
+  // to have line break inside it.
+  Tokens = AnnotateValue("(ins type1:$src1, type2:$src2)");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[0], tok::l_paren, TT_TableGenDAGArgOpenerToBreak);
+  EXPECT_TOKEN(Tokens[1], tok::identifier,
+   TT_TableGenDAGArgOperatorToBreak); // ins
+  EXPECT_TOKEN(Tokens[5], tok::comma, TT_TableGenDAGArgListCommaToBreak);
+  EXPECT_TOKEN(Tokens[9], tok::r_paren, TT_TableGenDAGArgCloser);
+
+  Tokens = AnnotateValue("(other type1:$src1, type2:$src2)");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[0], tok::l_paren, TT_TableGenDAGArgOpenerToBreak);
+  EXPECT_TOKEN(Tokens[1], tok::identifier,
+   TT_TableGenDAGArgOperatorToBreak); // other
+  EXPECT_TOKEN(Tokens[5], tok::comma, TT_TableGenDAGArgListCommaToBreak);
+  EXPECT_TOKEN(Tokens[9], tok::r_paren, TT_TableGenDAGArgCloser);
+
+  // If TableGenBreakingDAGArgOperators is specified, it is limited to the
+  // specified operators.
+  Style.TableGenBreakingDAGArgOperators = {"ins", "outs"};
+  Tokens = AnnotateValue("(ins type1:$src1, type2:$src2)");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[0], tok::l_paren, TT_TableGenDAGArgOpenerToBreak);
+  EXPECT_TOKEN(Tokens[1], tok::identifier,
+   TT_TableGenDAGArgOperatorToBreak); // ins
+  EXPECT_TOKEN(Tokens[5], tok::comma, TT_TableGenDAGArgListCommaToBreak);
+  EXPECT_TOKEN(Tokens[9], tok::r_paren, TT_TableGenDAGArgCloser);
+
+  Tokens = AnnotateValue("(other type1:$src1, type2:$src2)");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[0], tok::l_paren, TT_TableGenDAGArgOpener);
+  EXPECT_TOKEN(Tokens[1], tok::identifier,
+   TT_Unknown); // other

hnakamura5 wrote:

Fixed as suggested.
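
The token-annotation test quoted above suggests a simple selection rule: when TableGenBreakingDAGArgOperators is empty every DAGArg is eligible for breaking, and when it is set only DAGArgs whose operator is listed are. A minimal sketch of that rule follows. The enum and the helper name isBreakingDAGArg are hypothetical stand-ins for illustration, not clang-format's own code.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the style values exercised by the test.
enum class DAGArgStyle { DontBreak, BreakElements, BreakAll };

// Hypothetical helper: may a DAGArg whose operator is OperatorID receive
// line breaks at all? An empty operator list means "no restriction".
bool isBreakingDAGArg(DAGArgStyle Style,
                      const std::vector<std::string> &BreakingOperators,
                      const std::string &OperatorID) {
  if (Style == DAGArgStyle::DontBreak)
    return false;
  if (BreakingOperators.empty())
    return true;
  return std::find(BreakingOperators.begin(), BreakingOperators.end(),
                   OperatorID) != BreakingOperators.end();
}
```

With the operator list set to ins and outs, this sketch accepts "(ins ...)" and rejects "(other ...)", which matches the TT_TableGenDAGArgOpenerToBreak versus TT_TableGenDAGArgOpener expectations in the test.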

https://github.com/llvm/llvm-project/pull/83149


[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-03-17 Thread Hirofumi Nakamura via cfe-commits


@@ -332,6 +332,84 @@ TEST_F(FormatTestTableGen, Assert) {
   verifyFormat("assert !le(DefVar1, 0), \"Assert1\";\n");
 }
 
+TEST_F(FormatTestTableGen, DAGArgBreakElements) {
+  FormatStyle Style = getGoogleStyle(FormatStyle::LK_TableGen);
+  Style.ColumnLimit = 60;
+  // By default, the DAGArg does not have a break inside.
+  verifyFormat("def Def : Parent {\n"

hnakamura5 wrote:

Added the suggested check of default option.

https://github.com/llvm/llvm-project/pull/83149


[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-03-17 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/83149

>From becb28f6daa1fed9cabe40375a7ed863207b6bd2 Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Wed, 28 Feb 2024 01:10:12 +0900
Subject: [PATCH 1/4] [clang-format] Add Options to break inside the TableGen
 DAGArg.

---
 clang/docs/ClangFormatStyleOptions.rst| 42 
 clang/include/clang/Format/Format.h   | 46 +++--
 clang/lib/Format/ContinuationIndenter.cpp |  3 +-
 clang/lib/Format/Format.cpp   |  6 +++
 clang/lib/Format/FormatToken.h|  2 +
 clang/lib/Format/TokenAnnotator.cpp   | 49 ++-
 clang/unittests/Format/FormatTestTableGen.cpp | 42 
 clang/unittests/Format/TokenAnnotatorTest.cpp | 41 
 8 files changed, 226 insertions(+), 5 deletions(-)

diff --git a/clang/docs/ClangFormatStyleOptions.rst 
b/clang/docs/ClangFormatStyleOptions.rst
index df399a229d8d4f..9b055d16b24ac9 100644
--- a/clang/docs/ClangFormatStyleOptions.rst
+++ b/clang/docs/ClangFormatStyleOptions.rst
@@ -6158,6 +6158,48 @@ the configuration (without a prefix: ``Auto``).
 **TabWidth** (``Unsigned``) :versionbadge:`clang-format 3.7` :ref:`¶ 
`
   The number of columns used for tab stops.
 
+.. _TableGenBreakInsideDAGArgList:
+
+**TableGenBreakInsideDAGArgList** (``Boolean``) :versionbadge:`clang-format 
19` :ref:`¶ `
+  Insert the line break for each element of DAGArg list in TableGen.
+
+
+  .. code-block:: c++
+
+let DAGArgIns = (ins
+i32:$src1,
+i32:$src2
+);
+
+.. _TableGenBreakingDAGArgOperators:
+
+**TableGenBreakingDAGArgOperators** (``List of Strings``) 
:versionbadge:`clang-format 19` :ref:`¶ `
+  Works only when TableGenBreakInsideDAGArgList is true.
+  The string list needs to consist of identifiers in TableGen.
+  If any identifier is specified, this limits the line breaks by
+  TableGenBreakInsideDAGArgList option only on DAGArg values beginning with
+  the specified identifiers.
+
+  For example the configuration,
+
+  .. code-block:: c++
+
+TableGenBreakInsideDAGArgList: true
+TableGenBreakingDAGArgOperators: ['ins', 'outs']
+
+  makes the line break only occurs inside DAGArgs beginning with the
+  specified identifiers 'ins' and 'outs'.
+
+
+  .. code-block:: c++
+
+let DAGArgIns = (ins
+i32:$src1,
+i32:$src2
+);
+let DAGArgOtherID = (other i32:$other1, i32:$other2);
+let DAGArgBang = (!cast("Some") i32:$src1, i32:$src2)
+
 .. _TypeNames:
 
 **TypeNames** (``List of Strings``) :versionbadge:`clang-format 17` :ref:`¶ 
`
diff --git a/clang/include/clang/Format/Format.h 
b/clang/include/clang/Format/Format.h
index 613f1fd168465d..9729634183110c 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -4728,6 +4728,43 @@ struct FormatStyle {
   /// \version 8
   std::vector<std::string> StatementMacros;
 
+  /// Works only when TableGenBreakInsideDAGArgList is true.
+  /// The string list needs to consist of identifiers in TableGen.
+  /// If any identifier is specified, this limits the line breaks by
+  /// TableGenBreakInsideDAGArgList option only on DAGArg values beginning with
+  /// the specified identifiers.
+  ///
+  /// For example the configuration,
+  /// \code
+  ///   TableGenBreakInsideDAGArgList: true
+  ///   TableGenBreakingDAGArgOperators: ['ins', 'outs']
+  /// \endcode
+  ///
+  /// makes the line break only occurs inside DAGArgs beginning with the
+  /// specified identifiers 'ins' and 'outs'.
+  ///
+  /// \code
+  ///   let DAGArgIns = (ins
+  ///   i32:$src1,
+  ///   i32:$src2
+  ///   );
+  ///   let DAGArgOtherID = (other i32:$other1, i32:$other2);
+  ///   let DAGArgBang = (!cast("Some") i32:$src1, i32:$src2)
+  /// \endcode
+  /// \version 19
+  std::vector<std::string> TableGenBreakingDAGArgOperators;
+
+  /// Insert the line break for each element of DAGArg list in TableGen.
+  ///
+  /// \code
+  ///   let DAGArgIns = (ins
+  ///   i32:$src1,
+  ///   i32:$src2
+  ///   );
+  /// \endcode
+  /// \version 19
+  bool TableGenBreakInsideDAGArgList;
+
   /// The number of columns used for tab stops.
   /// \version 3.7
   unsigned TabWidth;
@@ -4980,9 +5017,12 @@ struct FormatStyle {
SpacesInSquareBrackets == R.SpacesInSquareBrackets &&
Standard == R.Standard &&
StatementAttributeLikeMacros == R.StatementAttributeLikeMacros &&
-   StatementMacros == R.StatementMacros && TabWidth == R.TabWidth &&
-   TypeNames == R.TypeNames && TypenameMacros == R.TypenameMacros &&
-   UseTab == R.UseTab &&
+   StatementMacros == R.StatementMacros &&
+   TableGenBreakingDAGArgOperators ==
+   R.TableGenBreakingDAGArgOperators &&
+   TableGenBreakInsideDAGArgList == R.TableGenBreakInsideDAGArgList &&
+   TabWidth == R.TabWidth && TypeNames == R.TypeNames &&
+   TypenameMacros == R.TypenameMacros && UseTa

[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-03-17 Thread Hirofumi Nakamura via cfe-commits


@@ -1842,6 +1846,19 @@ void 
ContinuationIndenter::moveStatePastScopeOpener(LineState &State,
 Style.ContinuationIndentWidth +
 std::max(CurrentState.LastSpace, CurrentState.StartOfFunctionCall);
 
+if (Style.isTableGen()) {
+  if (Current.is(TT_TableGenDAGArgOpenerToBreak) &&
+  Style.TableGenBreakInsideDAGArg == FormatStyle::DAS_BreakElements) {

hnakamura5 wrote:

Changed as suggested.

https://github.com/llvm/llvm-project/pull/83149
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-03-18 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 closed 
https://github.com/llvm/llvm-project/pull/83149
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-03-18 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Thank you very much!
I really appreciate your reviewing such a complicated option.

https://github.com/llvm/llvm-project/pull/83149
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Fixed the warning in building document for TableGenBreakingDAGArgOperators. (PR #85760)

2024-03-19 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 created 
https://github.com/llvm/llvm-project/pull/85760

This is intended to fix the `Test documentation build` check, which regressed in 
https://github.com/llvm/llvm-project/pull/83149.

>From 612bc89ef805a3324520f4b7ef1ebb13e334ec0b Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Tue, 19 Mar 2024 18:46:37 +0900
Subject: [PATCH] [clang-format] Fixed the warning in building document for
 TableGenBreakingDAGArgOperators.

---
 clang/docs/ClangFormatStyleOptions.rst | 2 +-
 clang/include/clang/Format/Format.h| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/docs/ClangFormatStyleOptions.rst 
b/clang/docs/ClangFormatStyleOptions.rst
index 35b6d0a2b52b67..be021dfc5c084c 100644
--- a/clang/docs/ClangFormatStyleOptions.rst
+++ b/clang/docs/ClangFormatStyleOptions.rst
@@ -6204,7 +6204,7 @@ the configuration (without a prefix: ``Auto``).
 
   For example the configuration,
 
-  .. code-block:: c++
+  .. code-block:: yaml
 
 TableGenBreakInsideDAGArg: BreakAll
 TableGenBreakingDAGArgOperators: ['ins', 'outs']
diff --git a/clang/include/clang/Format/Format.h 
b/clang/include/clang/Format/Format.h
index 54861a66889e22..7ad2579bf7773b 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -4735,7 +4735,7 @@ struct FormatStyle {
   /// the specified identifiers.
   ///
   /// For example the configuration,
-  /// \code
+  /// \code{.yaml}
   ///   TableGenBreakInsideDAGArg: BreakAll
   ///   TableGenBreakingDAGArgOperators: ['ins', 'outs']
   /// \endcode



[clang] [clang-format] Fixed the warning in building document for TableGenBreakingDAGArgOperators. (PR #85760)

2024-03-19 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Thank you so much for the quick response!

https://github.com/llvm/llvm-project/pull/85760
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-03-19 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

@wangpc-pp 
Thank you for telling me, and sorry for overlooking the detailed CI check.

https://github.com/llvm/llvm-project/pull/83149
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-03-19 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

@HazardyKnusperkeks 
Thank you for reviewing and merging the fix.

https://github.com/llvm/llvm-project/pull/83149
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Added AlignConsecutiveTableGenBreakingDAGArgColons option. (PR #86150)

2024-03-21 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 created 
https://github.com/llvm/llvm-project/pull/86150

This adds an option that specifies how the colons inside a TableGen DAGArg 
are aligned.
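
To make the intended effect concrete, here is a minimal, self-contained sketch of colon alignment over the elements of one DAGArg. It only illustrates the visible result; the function name alignColons is hypothetical and this is not how WhitespaceManager implements the option.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Pads the text before the first ':' of each element so that all colons
// end up in the same column. Elements without a colon are left unchanged.
std::vector<std::string> alignColons(const std::vector<std::string> &Elements) {
  std::size_t Widest = 0;
  for (const std::string &E : Elements) {
    std::size_t Colon = E.find(':');
    if (Colon != std::string::npos)
      Widest = std::max(Widest, Colon);
  }
  std::vector<std::string> Result;
  for (const std::string &E : Elements) {
    std::size_t Colon = E.find(':');
    if (Colon == std::string::npos) {
      Result.push_back(E);
      continue;
    }
    Result.push_back(E.substr(0, Colon) + std::string(Widest - Colon, ' ') +
                     E.substr(Colon));
  }
  return Result;
}
```

Applied to the elements a:$src1, aa:$src2 and aaa:$src3, the sketch yields the same column layout as the documentation example in the patch below.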

>From 4a0d3cd3d20220b7f363922b49eae8cd0c740426 Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Fri, 22 Mar 2024 01:36:47 +0900
Subject: [PATCH] [clang-format] Added
 AlignConsecutiveTableGenBreakingDAGArgColons option.

---
 clang/docs/ClangFormatStyleOptions.rst| 145 ++
 clang/include/clang/Format/Format.h   |  17 ++
 clang/lib/Format/Format.cpp   |   3 +
 clang/lib/Format/FormatToken.h|   1 +
 clang/lib/Format/TokenAnnotator.cpp   |  18 ++-
 clang/lib/Format/WhitespaceManager.cpp|   6 +
 clang/lib/Format/WhitespaceManager.h  |   3 +
 clang/unittests/Format/FormatTestTableGen.cpp |  32 
 clang/unittests/Format/TokenAnnotatorTest.cpp |  16 ++
 9 files changed, 236 insertions(+), 5 deletions(-)

diff --git a/clang/docs/ClangFormatStyleOptions.rst 
b/clang/docs/ClangFormatStyleOptions.rst
index be021dfc5c084c..2ee36f24d7ce4b 100644
--- a/clang/docs/ClangFormatStyleOptions.rst
+++ b/clang/docs/ClangFormatStyleOptions.rst
@@ -955,6 +955,151 @@ the configuration (without a prefix: ``Auto``).
   }
 
 
+.. _AlignConsecutiveTableGenBreakingDAGArgColons:
+
+**AlignConsecutiveTableGenBreakingDAGArgColons** (``AlignConsecutiveStyle``) 
:versionbadge:`clang-format 19` :ref:`¶ 
`
+  Style of aligning consecutive TableGen DAGArg operator colons.
+  If enabled, align the colon inside DAGArg which have line break inside.
+  This works only when TableGenBreakInsideDAGArg is BreakElements or
+  BreakAll and the DAGArg is not excepted by
+  TableGenBreakingDAGArgOperators's effect.
+
+  .. code-block:: c++
+
+let dagarg = (ins
+a  :$src1,
+aa :$src2,
+aaa:$src3
+)
+
+  Nested configuration flags:
+
+  Alignment options.
+
+  They can also be read as a whole for compatibility. The choices are:
+  - None
+  - Consecutive
+  - AcrossEmptyLines
+  - AcrossComments
+  - AcrossEmptyLinesAndComments
+
+  For example, to align across empty lines and not across comments, either
+  of these work.
+
+  .. code-block:: c++
+
+AlignConsecutiveTableGenBreakingDAGArgColons: AcrossEmptyLines
+
+AlignConsecutiveTableGenBreakingDAGArgColons:
+  Enabled: true
+  AcrossEmptyLines: true
+  AcrossComments: false
+
+  * ``bool Enabled`` Whether aligning is enabled.
+
+.. code-block:: c++
+
+  #define SHORT_NAME   42
+  #define LONGER_NAME  0x007f
+  #define EVEN_LONGER_NAME (2)
+  #define foo(x)   (x * x)
+  #define bar(y, z)(y + z)
+
+  int a= 1;
+  int somelongname = 2;
+  double c = 3;
+
+  int  : 1;
+  int b: 12;
+  int ccc  : 8;
+
+  int  = 12;
+  float   b = 23;
+  std::string ccc;
+
+  * ``bool AcrossEmptyLines`` Whether to align across empty lines.
+
+.. code-block:: c++
+
+  true:
+  int a= 1;
+  int somelongname = 2;
+  double c = 3;
+
+  int d= 3;
+
+  false:
+  int a= 1;
+  int somelongname = 2;
+  double c = 3;
+
+  int d = 3;
+
+  * ``bool AcrossComments`` Whether to align across comments.
+
+.. code-block:: c++
+
+  true:
+  int d= 3;
+  /* A comment. */
+  double e = 4;
+
+  false:
+  int d = 3;
+  /* A comment. */
+  double e = 4;
+
+  * ``bool AlignCompound`` Only for ``AlignConsecutiveAssignments``.  Whether 
compound assignments
+like ``+=`` are aligned along with ``=``.
+
+.. code-block:: c++
+
+  true:
+  a   &= 2;
+  bbb  = 2;
+
+  false:
+  a &= 2;
+  bbb = 2;
+
+  * ``bool AlignFunctionPointers`` Only for ``AlignConsecutiveDeclarations``. 
Whether function pointers are
+aligned.
+
+.. code-block:: c++
+
+  true:
+  unsigned i;
+  int &r;
+  int *p;
+  int  (*f)();
+
+  false:
+  unsigned i;
+  int &r;
+  int *p;
+  int (*f)();
+
+  * ``bool PadOperators`` Only for ``AlignConsecutiveAssignments``.  Whether 
short assignment
+operators are left-padded to the same length as long ones in order to
+put all assignment operators to the right of the left hand side.
+
+.. code-block:: c++
+
+  true:
+  a   >>= 2;
+  bbb   = 2;
+
+  a = 2;
+  bbb >>= 2;
+
+  false:
+  a >>= 2;
+  bbb = 2;
+
+  a = 2;
+  bbb >>= 2;
+
+
 .. _AlignConsecutiveTableGenCondOperatorColons:
 
 **AlignConsecutiveTableGenCondOperatorColons** (``AlignConsecutiveStyle``) 
:versionbadge:`clang-format 19` :ref:`¶ 
`
diff --git a/clang/include/clang/Format/Format.h 
b/clang/include/clang/Format/Format.h
index 7ad2579bf7773b..0720c8283cd75c 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/F

[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2024-03-21 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Alignment option for DAGArg: https://github.com/llvm/llvm-project/pull/86150

https://github.com/llvm/llvm-project/pull/76059
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Added AlignConsecutiveTableGenBreakingDAGArgColons option. (PR #86150)

2024-03-22 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 closed 
https://github.com/llvm/llvm-project/pull/86150
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Added AlignConsecutiveTableGenBreakingDAGArgColons option. (PR #86150)

2024-03-22 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

@HazardyKnusperkeks 
Thank you very much!
It took about four months and ten or more PRs to land the parts of the first 
concept of TableGen formatting. 
You reviewed every one of them and gave me many valuable suggestions. I thank 
you again and again!

https://github.com/llvm/llvm-project/pull/86150
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2024-03-22 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

All the split parts of this PR have been merged. Thank you!

https://github.com/llvm/llvm-project/pull/76059
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2024-03-22 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 closed 
https://github.com/llvm/llvm-project/pull/76059
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-10 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/80299

>From 36f83a124ea8ad27cfefa1d12ae5aa781f8e6e3e Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Thu, 1 Feb 2024 23:07:42 +0900
Subject: [PATCH 1/3] [clang-format] Support of TableGen value annotations.

---
 clang/lib/Format/FormatToken.h|  10 +
 clang/lib/Format/FormatTokenLexer.cpp |   2 +-
 clang/lib/Format/TokenAnnotator.cpp   | 303 +-
 clang/lib/Format/UnwrappedLineParser.cpp  |  13 +-
 clang/unittests/Format/TokenAnnotatorTest.cpp |  45 +++
 5 files changed, 364 insertions(+), 9 deletions(-)

diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index bace91b5f99b4d..0c1dce7a294082 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -150,7 +150,17 @@ namespace format {
   TYPE(StructuredBindingLSquare)                                               \
   TYPE(TableGenBangOperator)                                                   \
   TYPE(TableGenCondOperator)                                                   \
+  TYPE(TableGenCondOperatorColon)                                              \
+  TYPE(TableGenCondOperatorComma)                                              \
+  TYPE(TableGenDAGArgCloser)                                                   \
+  TYPE(TableGenDAGArgListColon)                                                \
+  TYPE(TableGenDAGArgListComma)                                                \
+  TYPE(TableGenDAGArgOpener)                                                   \
+  TYPE(TableGenListCloser)                                                     \
+  TYPE(TableGenListOpener)                                                     \
   TYPE(TableGenMultiLineString)                                                \
+  TYPE(TableGenTrailingPasteOperator)                                          \
+  TYPE(TableGenValueSuffix)                                                    \
   TYPE(TemplateCloser)                                                         \
   TYPE(TemplateOpener)                                                         \
   TYPE(TemplateString)                                                         \
diff --git a/clang/lib/Format/FormatTokenLexer.cpp 
b/clang/lib/Format/FormatTokenLexer.cpp
index d7de09ef0e12ab..27b2b1b619b1d3 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -816,7 +816,7 @@ void FormatTokenLexer::handleTableGenMultilineString() {
   auto CloseOffset = Lex->getBuffer().find("}]", OpenOffset);
   if (CloseOffset == StringRef::npos)
 return;
-  auto Text = Lex->getBuffer().substr(OpenOffset, CloseOffset + 2);
+  auto Text = Lex->getBuffer().substr(OpenOffset, CloseOffset - OpenOffset + 2);
   MultiLineString->TokenText = Text;
   resetLexer(SourceMgr.getFileOffset(
   Lex->getSourceLocation(Lex->getBufferLocation() - 2 + Text.size())));
diff --git a/clang/lib/Format/TokenAnnotator.cpp 
b/clang/lib/Format/TokenAnnotator.cpp
index df1c5bc19de1e8..afcf5f638ce4c8 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -256,6 +256,18 @@ class AnnotatingParser {
   }
 }
   }
+  if (Style.isTableGen()) {
+if (CurrentToken->isOneOf(tok::comma, tok::equal)) {
+  // They appears as a separator. Unless it is not in class definition.
+  next();
+  continue;
+}
+// In angle, there must be Value like tokens. Types are also able to be
+// parsed in the same way with Values.
+if (!parseTableGenValue())
+  return false;
+continue;
+  }
   if (!consumeToken())
 return false;
 }
@@ -388,6 +400,28 @@ class AnnotatingParser {
   Contexts.back().IsExpression = !IsForOrCatch;
 }
 
+if (Style.isTableGen()) {
+  if (FormatToken *Prev = OpeningParen.Previous) {
+if (Prev->is(TT_TableGenCondOperator)) {
+  Contexts.back().IsTableGenCondOpe = true;
+  Contexts.back().IsExpression = true;
+} else if (Contexts.size() > 1 &&
+   Contexts[Contexts.size() - 2].IsTableGenBangOpe) {
+  // Hack to handle bang operators. The parent context's flag
+  // was set by parseTableGenSimpleValue().
+  // We have to specify the context outside because the prev of "(" may
+  // be ">", not the bang operator in this case.
+  Contexts.back().IsTableGenBangOpe = true;
+  Contexts.back().IsExpression = true;
+} else {
+  // Otherwise, this paren seems DAGArg.
+  if (!parseTableGenDAGArg())
+return false;
+  return parseTableGenDAGArgAndList(&OpeningParen);
+}
+  }
+}
+
 // Infer the role of the l_paren based on the previous token if we haven't
 // detected one yet.
 if (PrevNonComment && Opening
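
The FormatTokenLexer.cpp hunk in the patch above changes the second argument of substr from an end position to a length. As a standalone illustration of why that matters, the sketch below uses std::string_view, whose substr(pos, count) has the same position-plus-length shape as the call in the patch. The function name multiLineStringText is hypothetical.

```cpp
#include <cstddef>
#include <string_view>

// Returns the text of the first "[{ ... }]" block that starts at OpenOffset,
// including both delimiters, or an empty view if no closer is found.
std::string_view multiLineStringText(std::string_view Buffer,
                                     std::size_t OpenOffset) {
  std::size_t CloseOffset = Buffer.find("}]", OpenOffset);
  if (CloseOffset == std::string_view::npos)
    return {};
  // substr(pos, count): the count is measured from OpenOffset, not from the
  // start of the buffer, hence the subtraction. The +2 covers "}]" itself.
  return Buffer.substr(OpenOffset, CloseOffset - OpenOffset + 2);
}
```

Passing CloseOffset + 2 as the count would run past the closing "}]" whenever OpenOffset is not zero, which is the situation the one-line fix addresses.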

[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-10 Thread Hirofumi Nakamura via cfe-commits


@@ -833,13 +885,207 @@ class AnnotatingParser {
 Left->setType(TT_ArrayInitializerLSquare);
   }
   FormatToken *Tok = CurrentToken;
+  if (Style.isTableGen()) {
+if (CurrentToken->isOneOf(tok::comma, tok::minus, tok::ellipsis)) {
+  // '-' and '...' appears as a separator in slice.
+  next();
+} else {
+  // In TableGen there must be a list of Values in square brackets.
+  // It must be ValueList or SliceElements.
+  if (!parseTableGenValue())
+return false;
+}
+updateParameterCount(Left, Tok);
+continue;
+  }
   if (!consumeToken())
 return false;
   updateParameterCount(Left, Tok);
 }
 return false;
   }
 
+  void nextTableGenNonComment() {
+next();
+while (CurrentToken && CurrentToken->is(tok::comment))
+  next();
+  }
+
+  bool parseTableGenValue(bool ParseNameMode = false) {

hnakamura5 wrote:

I added some more comments, here and on tryToParseTableGenTokVar.
The return value actually behaves the same as in the other parse functions here, 
such as parseAngle and parseBrace.
That is, returning false makes parseLine itself fail as a whole. These functions 
do not backtrack on failure.
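
As a rough illustration of that convention, here is a miniature recursive-descent parser in which every parse function returns false on failure and callers simply propagate it. MiniParser and its grammar are hypothetical and far simpler than AnnotatingParser. The point is only that nothing rewinds the token position after a failure.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature parser: each parse function consumes tokens and
// returns false on failure; callers pass the failure upward instead of
// rewinding and trying an alternative.
struct MiniParser {
  std::vector<std::string> Tokens;
  std::size_t Pos = 0;

  bool atEnd() const { return Pos >= Tokens.size(); }

  // Value ::= identifier | '[' (Value (',' Value)*)? ']'
  bool parseValue() {
    if (atEnd())
      return false;
    if (Tokens[Pos] == "[")
      return parseList();
    if (Tokens[Pos] == "]" || Tokens[Pos] == ",")
      return false;
    ++Pos; // Plain identifier.
    return true;
  }

  bool parseList() {
    ++Pos; // Consume '['.
    while (!atEnd()) {
      if (Tokens[Pos] == "]") {
        ++Pos;
        return true;
      }
      if (Tokens[Pos] == ",") {
        ++Pos; // Separator, skipped like the comma handling in the patch.
        continue;
      }
      if (!parseValue())
        return false; // No rewind: the failure reaches parseLine().
    }
    return false; // Unterminated list.
  }

  bool parseLine() {
    while (!atEnd())
      if (!parseValue())
        return false;
    return true;
  }
};
```

After a failed parseLine() the position stays wherever the failure occurred, which mirrors the "do not backtrack" behavior described above.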

https://github.com/llvm/llvm-project/pull/80299
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-12 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 closed 
https://github.com/llvm/llvm-project/pull/80299
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-12 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Thank you very much!
I think this will be the largest part, and I will try to keep it that way. Thank 
you again for this review and, of course, for the other parts.

https://github.com/llvm/llvm-project/pull/80299
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Support of TableGen basic format restrictions. (PR #81611)

2024-02-13 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 created 
https://github.com/llvm/llvm-project/pull/81611

- Decide where a line break is allowed, forced, or forbidden.
- Decide where a space may be inserted.

This part is split out from https://github.com/llvm/llvm-project/pull/76059.
With it, TableGen files can now be formatted in the basic style!
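
The spacing half of these restrictions can be pictured as a small decision function over a pair of adjacent tokens. The sketch below restates a few of the rules from the TokenAnnotator.cpp hunk in the patch that follows, using a hypothetical, much-simplified Kind enum instead of clang-format's FormatToken and TokenType. The final default is an assumption of this sketch only.

```cpp
// Hypothetical, much-simplified token kinds for illustration.
enum class Kind {
  LSquare,
  RSquare,
  LBrace,
  RBrace,
  DAGArgColon,
  CondColon,
  Other
};

// Returns true if a space must be kept between Left and Right.
bool spaceRequiredBetween(Kind Left, Kind Right) {
  // "[" followed by "{" must not fuse into "[{", which starts a multi-line
  // string. The same holds for "}" followed by "]".
  if (Left == Kind::LSquare && Right == Kind::LBrace)
    return true;
  if (Left == Kind::RBrace && Right == Kind::RSquare)
    return true;
  // No space on either side of a DAGArg colon, nor before a cond colon.
  if (Left == Kind::DAGArgColon || Right == Kind::DAGArgColon)
    return false;
  if (Right == Kind::CondColon)
    return false;
  return true; // Default chosen for this sketch, not taken from the patch.
}
```

Checking the fusing cases first keeps the rule that protects "[{" from being overridden by any later, more general rule.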

>From 7ee4b35f0aed434053b6fd6329ef39de97bc22db Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Tue, 13 Feb 2024 23:50:15 +0900
Subject: [PATCH] [clang-format] Support of TableGen basic format restrictions.

- Allow/force to break the line or not.
- Allow to insert space or not.
---
 clang/lib/Format/ContinuationIndenter.cpp |  14 +-
 clang/lib/Format/TokenAnnotator.cpp   |  56 
 clang/unittests/Format/FormatTestTableGen.cpp | 263 ++
 3 files changed, 331 insertions(+), 2 deletions(-)

diff --git a/clang/lib/Format/ContinuationIndenter.cpp 
b/clang/lib/Format/ContinuationIndenter.cpp
index 0b2ef97af44d83..1879af94f6da49 100644
--- a/clang/lib/Format/ContinuationIndenter.cpp
+++ b/clang/lib/Format/ContinuationIndenter.cpp
@@ -821,6 +821,7 @@ void ContinuationIndenter::addTokenOnCurrentLine(LineState 
&State, bool DryRun,
   if (Style.AlignAfterOpenBracket != FormatStyle::BAS_DontAlign &&
   !CurrentState.IsCSharpGenericTypeConstraint && Previous.opensScope() &&
   Previous.isNot(TT_ObjCMethodExpr) && Previous.isNot(TT_RequiresClause) &&
+  Previous.isNot(TT_TableGenDAGArgOpener) &&
   !(Current.MacroParent && Previous.MacroParent) &&
   (Current.isNot(TT_LineComment) ||
Previous.isOneOf(BK_BracedInit, TT_VerilogMultiLineListLParen))) {
@@ -1250,7 +1251,7 @@ unsigned ContinuationIndenter::getNewLineColumn(const 
LineState &State) {
 return CurrentState.Indent;
   }
   if ((Current.isOneOf(tok::r_brace, tok::r_square) ||
-   (Current.is(tok::greater) && Style.isProto())) &&
+   (Current.is(tok::greater) && (Style.isProto() || Style.isTableGen()))) &&
   State.Stack.size() > 1) {
 if (Current.closesBlockOrBlockTypeList(Style))
   return State.Stack[State.Stack.size() - 2].NestedBlockIndent;
@@ -1278,6 +1279,12 @@ unsigned ContinuationIndenter::getNewLineColumn(const 
LineState &State) {
Current.Next->isOneOf(tok::semi, tok::kw_const, tok::l_brace))) {
 return State.Stack[State.Stack.size() - 2].LastSpace;
   }
+  // When DAGArg closer exists top of line, it should be aligned in the similar
+  // way as function call above.
+  if (Style.isTableGen() && Current.is(TT_TableGenDAGArgCloser) &&
+  State.Stack.size() > 1) {
+return State.Stack[State.Stack.size() - 2].LastSpace;
+  }
   if (Style.AlignAfterOpenBracket == FormatStyle::BAS_BlockIndent &&
   (Current.is(tok::r_paren) ||
(Current.is(tok::r_brace) && Current.MatchingParen &&
@@ -1696,7 +1703,9 @@ void 
ContinuationIndenter::moveStatePastFakeLParens(LineState &State,
 (!Previous || Previous->isNot(tok::kw_return) ||
  (Style.Language != FormatStyle::LK_Java && PrecedenceLevel > 0)) &&
 (Style.AlignAfterOpenBracket != FormatStyle::BAS_DontAlign ||
- PrecedenceLevel != prec::Comma || Current.NestingLevel == 0)) {
+ PrecedenceLevel != prec::Comma || Current.NestingLevel == 0) &&
+(!Style.isTableGen() ||
+ (Previous && Previous->is(TT_TableGenDAGArgListComma)))) {
   NewParenState.Indent = std::max(
   std::max(State.Column, NewParenState.Indent), 
CurrentState.LastSpace);
 }
@@ -1942,6 +1951,7 @@ void 
ContinuationIndenter::moveStatePastScopeCloser(LineState &State) {
   (Current.isOneOf(tok::r_paren, tok::r_square, TT_TemplateString) ||
(Current.is(tok::r_brace) && State.NextToken != State.Line->First) ||
State.NextToken->is(TT_TemplateCloser) ||
+   State.NextToken->is(TT_TableGenListCloser) ||
 (Current.is(tok::greater) && Current.is(TT_DictLiteral)))) {
 State.Stack.pop_back();
   }
diff --git a/clang/lib/Format/TokenAnnotator.cpp 
b/clang/lib/Format/TokenAnnotator.cpp
index d353388a862b56..636d098881c97e 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -5072,7 +5072,38 @@ bool TokenAnnotator::spaceRequiredBefore(const 
AnnotatedLine &Line,
  Left.endsSequence(tok::greatergreater, tok::l_brace))) {
   return false;
 }
+  } else if (Style.isTableGen()) {
+// Avoid to connect [ and {. [{ is start token of multiline string.
+if (Left.is(tok::l_square) && Right.is(tok::l_brace))
+  return true;
+if (Left.is(tok::r_brace) && Right.is(tok::r_square))
+  return true;
+// Do not insert around colon in DAGArg and cond operator.
+if (Right.is(TT_TableGenDAGArgListColon) ||
+Left.is(TT_TableGenDAGArgListColon)) {
+  return false;
+}
+if (Right.is(TT_TableGenCondOperatorColon))
+  return false;
+// Do not insert bang operators and consequent openers.
+if (Right.isOneOf(tok::l_paren, tok::greater) &&
+Left.isOneOf(TT_TableGenBangOperator, TT_TableG

[clang] [clang-format] Support of TableGen basic format restrictions. (PR #81611)

2024-02-13 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/81611

>From 7ee4b35f0aed434053b6fd6329ef39de97bc22db Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Tue, 13 Feb 2024 23:50:15 +0900
Subject: [PATCH 1/2] [clang-format] Support of TableGen basic format
 restrictions.

- Allow/force to break the line or not.
- Allow to insert space or not.
---
 clang/lib/Format/ContinuationIndenter.cpp |  14 +-
 clang/lib/Format/TokenAnnotator.cpp   |  56 
 clang/unittests/Format/FormatTestTableGen.cpp | 263 ++
 3 files changed, 331 insertions(+), 2 deletions(-)

diff --git a/clang/lib/Format/ContinuationIndenter.cpp 
b/clang/lib/Format/ContinuationIndenter.cpp
index 0b2ef97af44d83..1879af94f6da49 100644
--- a/clang/lib/Format/ContinuationIndenter.cpp
+++ b/clang/lib/Format/ContinuationIndenter.cpp
@@ -821,6 +821,7 @@ void ContinuationIndenter::addTokenOnCurrentLine(LineState 
&State, bool DryRun,
   if (Style.AlignAfterOpenBracket != FormatStyle::BAS_DontAlign &&
   !CurrentState.IsCSharpGenericTypeConstraint && Previous.opensScope() &&
   Previous.isNot(TT_ObjCMethodExpr) && Previous.isNot(TT_RequiresClause) &&
+  Previous.isNot(TT_TableGenDAGArgOpener) &&
   !(Current.MacroParent && Previous.MacroParent) &&
   (Current.isNot(TT_LineComment) ||
Previous.isOneOf(BK_BracedInit, TT_VerilogMultiLineListLParen))) {
@@ -1250,7 +1251,7 @@ unsigned ContinuationIndenter::getNewLineColumn(const 
LineState &State) {
 return CurrentState.Indent;
   }
   if ((Current.isOneOf(tok::r_brace, tok::r_square) ||
-   (Current.is(tok::greater) && Style.isProto())) &&
+   (Current.is(tok::greater) && (Style.isProto() || Style.isTableGen()))) &&
   State.Stack.size() > 1) {
 if (Current.closesBlockOrBlockTypeList(Style))
   return State.Stack[State.Stack.size() - 2].NestedBlockIndent;
@@ -1278,6 +1279,12 @@ unsigned ContinuationIndenter::getNewLineColumn(const 
LineState &State) {
Current.Next->isOneOf(tok::semi, tok::kw_const, tok::l_brace))) {
 return State.Stack[State.Stack.size() - 2].LastSpace;
   }
+  // When DAGArg closer exists top of line, it should be aligned in the similar
+  // way as function call above.
+  if (Style.isTableGen() && Current.is(TT_TableGenDAGArgCloser) &&
+  State.Stack.size() > 1) {
+return State.Stack[State.Stack.size() - 2].LastSpace;
+  }
   if (Style.AlignAfterOpenBracket == FormatStyle::BAS_BlockIndent &&
   (Current.is(tok::r_paren) ||
(Current.is(tok::r_brace) && Current.MatchingParen &&
@@ -1696,7 +1703,9 @@ void 
ContinuationIndenter::moveStatePastFakeLParens(LineState &State,
 (!Previous || Previous->isNot(tok::kw_return) ||
  (Style.Language != FormatStyle::LK_Java && PrecedenceLevel > 0)) &&
 (Style.AlignAfterOpenBracket != FormatStyle::BAS_DontAlign ||
- PrecedenceLevel != prec::Comma || Current.NestingLevel == 0)) {
+ PrecedenceLevel != prec::Comma || Current.NestingLevel == 0) &&
+(!Style.isTableGen() ||
+ (Previous && Previous->is(TT_TableGenDAGArgListComma)))) {
   NewParenState.Indent = std::max(
   std::max(State.Column, NewParenState.Indent), 
CurrentState.LastSpace);
 }
@@ -1942,6 +1951,7 @@ void 
ContinuationIndenter::moveStatePastScopeCloser(LineState &State) {
   (Current.isOneOf(tok::r_paren, tok::r_square, TT_TemplateString) ||
(Current.is(tok::r_brace) && State.NextToken != State.Line->First) ||
State.NextToken->is(TT_TemplateCloser) ||
+   State.NextToken->is(TT_TableGenListCloser) ||
(Current.is(tok::greater) && Current.is(TT_DictLiteral)))) {
 State.Stack.pop_back();
   }
diff --git a/clang/lib/Format/TokenAnnotator.cpp 
b/clang/lib/Format/TokenAnnotator.cpp
index d353388a862b56..636d098881c97e 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -5072,7 +5072,38 @@ bool TokenAnnotator::spaceRequiredBefore(const 
AnnotatedLine &Line,
  Left.endsSequence(tok::greatergreater, tok::l_brace))) {
   return false;
 }
+  } else if (Style.isTableGen()) {
+// Avoid to connect [ and {. [{ is start token of multiline string.
+if (Left.is(tok::l_square) && Right.is(tok::l_brace))
+  return true;
+if (Left.is(tok::r_brace) && Right.is(tok::r_square))
+  return true;
+// Do not insert around colon in DAGArg and cond operator.
+if (Right.is(TT_TableGenDAGArgListColon) ||
+Left.is(TT_TableGenDAGArgListColon)) {
+  return false;
+}
+if (Right.is(TT_TableGenCondOperatorColon))
+  return false;
+// Do not insert bang operators and consequent openers.
+if (Right.isOneOf(tok::l_paren, tok::greater) &&
+Left.isOneOf(TT_TableGenBangOperator, TT_TableGenCondOperator)) {
+  return false;
+}
+// Trailing paste requires space before '{' or ':', the case in name 
values.
+// Not before ';', the case in normal values.
+   

[clang] [clang-format] Support of TableGen basic format restrictions. (PR #81611)

2024-02-13 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 edited 
https://github.com/llvm/llvm-project/pull/81611


[clang] [clang-format] Support of TableGen basic format restrictions. (PR #81611)

2024-02-13 Thread Hirofumi Nakamura via cfe-commits


@@ -55,5 +55,268 @@ TEST_F(FormatTestTableGen, NoSpacesInSquareBracketLists) {
   verifyFormat("def flag : Flag<[\"-\", \"--\"], \"foo\">;");
 }
 
+TEST_F(FormatTestTableGen, LiteralsAndIdentifiers) {
+  verifyFormat("def LiteralAndIdentifiers {\n"
+   "  let someInteger = -42;\n"
+   "  let 0startID = $TokVarName;\n"
+   "  let 0xstartInteger = 0x42;\n"
+   "  let someIdentifier = $TokVarName;\n"
+   "}\n");
+}
+
+TEST_F(FormatTestTableGen, BangOperators) {
+  verifyFormat("def BangOperators {\n"
+   "  let IfOpe = !if(\n"
+   "  !not(!and(!gt(!add(1, 2), !sub(3, 4)), !isa($x))),\n"
+   "  !foldl(0, !listconcat(!range(5, 6), !range(7, 8)),\n"
+   " total, rec, !add(total, rec.Number)),\n"
+   "  !tail(!range(9, 10)));\n"
+   "  let ForeachOpe = !foreach(\n"
+   "  arg, arglist,\n"
+   "  !if(!isa(arg.Type),\n"
+   "  !add(!cast(arg).Number, x), arg));\n"
+   "  let CondOpe1 = !cond(!eq(size, 1): 1,\n"
+   "   !eq(size, 2): 1,\n"
+   "   !eq(size, 4): 1,\n"
+   "   !eq(size, 8): 1,\n"
+   "   !eq(size, 16): 1,\n"
+   "   true: 0);\n"
+   "  let CondOpe2 = !cond(!lt(x, 0): \"negativenegative\",\n"
+   "   !eq(x, 0): \"zerozero\",\n"
+   "   true: \"positivepositive\");\n"
+   "  let CondOpe2WithComment = !cond(!lt(x, 0):  // negative\n"
+   "  \"negativenegative\",\n"
+   "  !eq(x, 0):  // zero\n"
+   "  \"zerozero\",\n"
+   "  true:  // default\n"
+   "  \"positivepositive\");\n"
+   "}\n");
+}
+
+TEST_F(FormatTestTableGen, Include) {
+  verifyFormat("include \"test/IncludeFile.h\"\n");
+}
+
+TEST_F(FormatTestTableGen, Types) {
+  verifyFormat("def Types : list, bits<3>, list> {}\n");
+}
+
+TEST_F(FormatTestTableGen, SimpleValue1_SingleLiterals) {
+  verifyFormat("def SimpleValue {\n"
+   "  let Integer = 42;\n"
+   "  let String = \"some string\";\n"
+   "}\n");
+}
+
+TEST_F(FormatTestTableGen, SimpleValue1_MultilineString) {
+  // verifyFormat does not understand multiline TableGen code-literals
+  std::string DefWithCode =
+  "def SimpleValueCode {\n"
+  "  let Code =\n"
+  "  [{ A TokCode is  nothing more than a multi-line string literal "
+  "delimited by \\[{ and }\\]. It  can break across lines and the line "
+  "breaks are retained in the string. "
+  
"(https://llvm.org/docs/TableGen/ProgRef.html#grammar-token-TokCode)}];\n"
+  "}\n";
+  std::string DefWithCodeMessingUp =
+  "def SimpleValueCode {\n"
+  "  let   Code=   "
+  "[{ A TokCode is  nothing more than a multi-line string literal "
+  "delimited by \\[{ and }\\]. It  can break across lines and the line "
+  "breaks are retained in the string. "
+  
"(https://llvm.org/docs/TableGen/ProgRef.html#grammar-token-TokCode)}];\n"
+  "   }\n";
+  EXPECT_EQ(DefWithCode, format(DefWithCodeMessingUp));
+}
+
+TEST_F(FormatTestTableGen, SimpleValue2) {
+  verifyFormat("def SimpleValue2 {\n"
+   "  let True = true;\n"
+   "  let False = false;\n"
+   "}\n");
+}
+
+TEST_F(FormatTestTableGen, SimpleValue3) {
+  verifyFormat("class SimpleValue3 { int Question = ?; }\n");
+}
+
+TEST_F(FormatTestTableGen, SimpleValue4) {
+  verifyFormat("def SimpleValue4 { let ValueList = {1, 2, 3}; }\n");
+}
+
+TEST_F(FormatTestTableGen, SimpleValue5) {
+  verifyFormat("def SimpleValue5 {\n"
+   "  let SquareList = [1, 4, 9];\n"
+   "  let SquareListWithType = [\"a\", \"b\", \"c\"];\n"
+   "  let SquareListListWithType = [[1, 2], [3, 4, 5], [7]]<\n"
+   "  list>;\n"
+   "  let SquareBitsListWithType = [ {1, 2},\n"
+   " {3, 4} ]>>;\n"
+   "}\n");

hnakamura5 wrote:

SquareListListWithType and SquareBitsListWithType look a little strange: they 
are similar, but they are formatted differently.
One reason is that '[' and '{' cannot be joined, because '[{' is the beginning 
of a multiline string.
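
A minimal sketch of why the two brackets have to stay apart, assuming a toy check on raw text (the helper name `startsCodeLiteral` is hypothetical and is not part of clang-format):

```cpp
#include <string_view>

// Hypothetical helper, for illustration only: reports whether the text at
// the cursor would be read as the opener of a TableGen code literal ("[{").
bool startsCodeLiteral(std::string_view Text) {
  return Text.size() >= 2 && Text[0] == '[' && Text[1] == '{';
}
```

With this check, `[{1, 2}]` would be taken for the start of a multiline string, while `[ {1, 2} ]` would not, which is why the formatter keeps a space between '[' and '{'.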

https://github.com/llvm/llvm-project/pull/81611


[clang] [clang-format] Support of TableGen basic format restrictions. (PR #81611)

2024-02-13 Thread Hirofumi Nakamura via cfe-commits


@@ -5822,6 +5860,24 @@ bool TokenAnnotator::canBreakBefore(const AnnotatedLine 
&Line,
   return false;
 if (Left.is(TT_TemplateString) && Left.opensScope())
   return true;
+  } else if (Style.isTableGen()) {
+// Avoid to break after "def", "class", "let" and so on.
+if (Keywords.isTableGenDefinition(Left))
+  return false;
+// Avoid to break after '(' in the cases that is in bang operators.
+if (Right.is(tok::l_paren)) {
+  return !Left.isOneOf(TT_TableGenBangOperator, TT_TableGenCondOperator,
+   TT_TemplateCloser);
+}
+// Avoid to split the value and its suffix part.
+if (Left.is(TT_TableGenValueSuffix))
+  return false;
+// Avoid to break between the value and its suffix part.
+if (Left.is(TT_TableGenValueSuffix))
+  return false;
+// Avoid to break around paste operator.
+if (Left.is(tok::hash) || Right.is(tok::hash))
+  return false;

hnakamura5 wrote:

Returning false disallows a line break before the Right token.
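
As a rough illustration of the rules in this hunk, here is a self-contained sketch that uses a hypothetical `Kind` enum in place of the real token types. The name `canBreakBeforeSketch` and the final `return true` are assumptions; the real function falls through to further rules instead.

```cpp
// Hypothetical stand-ins for the token kinds used in the hunk above.
enum class Kind {
  DefKeyword,    // "def", "class", "let" and so on
  BangOperator,  // e.g. !add
  CondOperator,  // !cond
  TemplateCloser,
  LParen,
  ValueSuffix,
  Hash,          // paste operator '#'
  Other
};

// Returns true when a line break may be placed before Right.
bool canBreakBeforeSketch(Kind Left, Kind Right) {
  // No break right after a defining keyword.
  if (Left == Kind::DefKeyword)
    return false;
  // No break between a bang operator (or similar) and its '('.
  if (Right == Kind::LParen) {
    return Left != Kind::BangOperator && Left != Kind::CondOperator &&
           Left != Kind::TemplateCloser;
  }
  // Keep a value and its suffix together.
  if (Left == Kind::ValueSuffix)
    return false;
  // No break around the paste operator.
  if (Left == Kind::Hash || Right == Kind::Hash)
    return false;
  // Assumption for this sketch only: everything else may break.
  return true;
}
```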

https://github.com/llvm/llvm-project/pull/81611


[clang] [clang-format] Support of TableGen basic format restrictions. (PR #81611)

2024-02-13 Thread Hirofumi Nakamura via cfe-commits


@@ -5072,7 +5072,38 @@ bool TokenAnnotator::spaceRequiredBefore(const 
AnnotatedLine &Line,
  Left.endsSequence(tok::greatergreater, tok::l_brace))) {
   return false;
 }
+  } else if (Style.isTableGen()) {
+// Avoid to connect [ and {. [{ is start token of multiline string.
+if (Left.is(tok::l_square) && Right.is(tok::l_brace))
+  return true;
+if (Left.is(tok::r_brace) && Right.is(tok::r_square))
+  return true;
+// Do not insert around colon in DAGArg and cond operator.
+if (Right.is(TT_TableGenDAGArgListColon) ||
+Left.is(TT_TableGenDAGArgListColon)) {
+  return false;
+}
+if (Right.is(TT_TableGenCondOperatorColon))
+  return false;
+// Do not insert bang operators and consequent openers.
+if (Right.isOneOf(tok::l_paren, tok::less) &&
+Left.isOneOf(TT_TableGenBangOperator, TT_TableGenCondOperator)) {
+  return false;
+}
+// Trailing paste requires space before '{' or ':', the case in name 
values.
+// Not before ';', the case in normal values.
+if (Left.is(TT_TableGenTrailingPasteOperator) &&
+Right.isOneOf(tok::l_brace, tok::colon)) {
+  return true;
+}
+// Otherwise paste operator does not prefer space around.
+if (Left.is(tok::hash) || Right.is(tok::hash))
+  return false;
+// Sure not to connect after defining keywords.
+if (Keywords.isTableGenDefinition(Left))
+  return true;

hnakamura5 wrote:

Returning true forces a space between the Left and Right tokens.
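
To make the three possible outcomes explicit, the sketch below models a few of the rules with `std::optional<bool>` (C++17): `true` forces a space, `false` forbids it, and an empty result means that this block makes no decision. The `Tok` enum and the function name `tableGenSpaceRule` are hypothetical; the real code works on `FormatToken` and simply continues to the later rules.

```cpp
#include <optional>

// Hypothetical stand-ins for the token kinds used in the hunk above.
enum class Tok { LSquare, LBrace, RBrace, RSquare, Hash, DefKeyword, Other };

// Returns true to force a space between Left and Right, false to forbid it,
// and std::nullopt when none of the sketched rules applies.
std::optional<bool> tableGenSpaceRule(Tok Left, Tok Right) {
  // "[{" and "}]" delimit a multiline string, so keep the brackets apart.
  if (Left == Tok::LSquare && Right == Tok::LBrace)
    return true;
  if (Left == Tok::RBrace && Right == Tok::RSquare)
    return true;
  // The paste operator does not take spaces around it.
  if (Left == Tok::Hash || Right == Tok::Hash)
    return false;
  // Always separate a defining keyword from what follows.
  if (Left == Tok::DefKeyword)
    return true;
  return std::nullopt;
}
```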

https://github.com/llvm/llvm-project/pull/81611


[clang] [clang-format] Support of TableGen basic format restrictions. (PR #81611)

2024-02-13 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/81611

>From 7ee4b35f0aed434053b6fd6329ef39de97bc22db Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Tue, 13 Feb 2024 23:50:15 +0900
Subject: [PATCH 1/3] [clang-format] Support of TableGen basic format
 restrictions.

- Allow/force to break the line or not.
- Allow to insert space or not.
---
 clang/lib/Format/ContinuationIndenter.cpp |  14 +-
 clang/lib/Format/TokenAnnotator.cpp   |  56 
 clang/unittests/Format/FormatTestTableGen.cpp | 263 ++
 3 files changed, 331 insertions(+), 2 deletions(-)

diff --git a/clang/lib/Format/ContinuationIndenter.cpp 
b/clang/lib/Format/ContinuationIndenter.cpp
index 0b2ef97af44d83..1879af94f6da49 100644
--- a/clang/lib/Format/ContinuationIndenter.cpp
+++ b/clang/lib/Format/ContinuationIndenter.cpp
@@ -821,6 +821,7 @@ void ContinuationIndenter::addTokenOnCurrentLine(LineState 
&State, bool DryRun,
   if (Style.AlignAfterOpenBracket != FormatStyle::BAS_DontAlign &&
   !CurrentState.IsCSharpGenericTypeConstraint && Previous.opensScope() &&
   Previous.isNot(TT_ObjCMethodExpr) && Previous.isNot(TT_RequiresClause) &&
+  Previous.isNot(TT_TableGenDAGArgOpener) &&
   !(Current.MacroParent && Previous.MacroParent) &&
   (Current.isNot(TT_LineComment) ||
Previous.isOneOf(BK_BracedInit, TT_VerilogMultiLineListLParen))) {
@@ -1250,7 +1251,7 @@ unsigned ContinuationIndenter::getNewLineColumn(const 
LineState &State) {
 return CurrentState.Indent;
   }
   if ((Current.isOneOf(tok::r_brace, tok::r_square) ||
-   (Current.is(tok::greater) && Style.isProto())) &&
+   (Current.is(tok::greater) && (Style.isProto() || Style.isTableGen()))) &&
   State.Stack.size() > 1) {
 if (Current.closesBlockOrBlockTypeList(Style))
   return State.Stack[State.Stack.size() - 2].NestedBlockIndent;
@@ -1278,6 +1279,12 @@ unsigned ContinuationIndenter::getNewLineColumn(const 
LineState &State) {
Current.Next->isOneOf(tok::semi, tok::kw_const, tok::l_brace))) {
 return State.Stack[State.Stack.size() - 2].LastSpace;
   }
+  // When DAGArg closer exists top of line, it should be aligned in the similar
+  // way as function call above.
+  if (Style.isTableGen() && Current.is(TT_TableGenDAGArgCloser) &&
+  State.Stack.size() > 1) {
+return State.Stack[State.Stack.size() - 2].LastSpace;
+  }
   if (Style.AlignAfterOpenBracket == FormatStyle::BAS_BlockIndent &&
   (Current.is(tok::r_paren) ||
(Current.is(tok::r_brace) && Current.MatchingParen &&
@@ -1696,7 +1703,9 @@ void 
ContinuationIndenter::moveStatePastFakeLParens(LineState &State,
 (!Previous || Previous->isNot(tok::kw_return) ||
  (Style.Language != FormatStyle::LK_Java && PrecedenceLevel > 0)) &&
 (Style.AlignAfterOpenBracket != FormatStyle::BAS_DontAlign ||
- PrecedenceLevel != prec::Comma || Current.NestingLevel == 0)) {
+ PrecedenceLevel != prec::Comma || Current.NestingLevel == 0) &&
+(!Style.isTableGen() ||
+ (Previous && Previous->is(TT_TableGenDAGArgListComma)))) {
   NewParenState.Indent = std::max(
   std::max(State.Column, NewParenState.Indent), 
CurrentState.LastSpace);
 }
@@ -1942,6 +1951,7 @@ void 
ContinuationIndenter::moveStatePastScopeCloser(LineState &State) {
   (Current.isOneOf(tok::r_paren, tok::r_square, TT_TemplateString) ||
(Current.is(tok::r_brace) && State.NextToken != State.Line->First) ||
State.NextToken->is(TT_TemplateCloser) ||
+   State.NextToken->is(TT_TableGenListCloser) ||
(Current.is(tok::greater) && Current.is(TT_DictLiteral)))) {
 State.Stack.pop_back();
   }
diff --git a/clang/lib/Format/TokenAnnotator.cpp 
b/clang/lib/Format/TokenAnnotator.cpp
index d353388a862b56..636d098881c97e 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -5072,7 +5072,38 @@ bool TokenAnnotator::spaceRequiredBefore(const 
AnnotatedLine &Line,
  Left.endsSequence(tok::greatergreater, tok::l_brace))) {
   return false;
 }
+  } else if (Style.isTableGen()) {
+// Avoid to connect [ and {. [{ is start token of multiline string.
+if (Left.is(tok::l_square) && Right.is(tok::l_brace))
+  return true;
+if (Left.is(tok::r_brace) && Right.is(tok::r_square))
+  return true;
+// Do not insert around colon in DAGArg and cond operator.
+if (Right.is(TT_TableGenDAGArgListColon) ||
+Left.is(TT_TableGenDAGArgListColon)) {
+  return false;
+}
+if (Right.is(TT_TableGenCondOperatorColon))
+  return false;
+// Do not insert bang operators and consequent openers.
+if (Right.isOneOf(tok::l_paren, tok::greater) &&
+Left.isOneOf(TT_TableGenBangOperator, TT_TableGenCondOperator)) {
+  return false;
+}
+// Trailing paste requires space before '{' or ':', the case in name 
values.
+// Not before ';', the case in normal values.
+   

[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-03-10 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/83149

>From becb28f6daa1fed9cabe40375a7ed863207b6bd2 Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Wed, 28 Feb 2024 01:10:12 +0900
Subject: [PATCH 1/2] [clang-format] Add Options to break inside the TableGen
 DAGArg.

---
 clang/docs/ClangFormatStyleOptions.rst| 42 
 clang/include/clang/Format/Format.h   | 46 +++--
 clang/lib/Format/ContinuationIndenter.cpp |  3 +-
 clang/lib/Format/Format.cpp   |  6 +++
 clang/lib/Format/FormatToken.h|  2 +
 clang/lib/Format/TokenAnnotator.cpp   | 49 ++-
 clang/unittests/Format/FormatTestTableGen.cpp | 42 
 clang/unittests/Format/TokenAnnotatorTest.cpp | 41 
 8 files changed, 226 insertions(+), 5 deletions(-)

diff --git a/clang/docs/ClangFormatStyleOptions.rst 
b/clang/docs/ClangFormatStyleOptions.rst
index df399a229d8d4f..9b055d16b24ac9 100644
--- a/clang/docs/ClangFormatStyleOptions.rst
+++ b/clang/docs/ClangFormatStyleOptions.rst
@@ -6158,6 +6158,48 @@ the configuration (without a prefix: ``Auto``).
 **TabWidth** (``Unsigned``) :versionbadge:`clang-format 3.7` :ref:`¶ <TabWidth>`
   The number of columns used for tab stops.
 
+.. _TableGenBreakInsideDAGArgList:
+
+**TableGenBreakInsideDAGArgList** (``Boolean``) :versionbadge:`clang-format 19` :ref:`¶ <TableGenBreakInsideDAGArgList>`
+  Insert the line break for each element of DAGArg list in TableGen.
+
+
+  .. code-block:: c++
+
+let DAGArgIns = (ins
+i32:$src1,
+i32:$src2
+);
+
+.. _TableGenBreakingDAGArgOperators:
+
+**TableGenBreakingDAGArgOperators** (``List of Strings``) :versionbadge:`clang-format 19` :ref:`¶ <TableGenBreakingDAGArgOperators>`
+  Works only when TableGenBreakInsideDAGArgList is true.
+  The string list needs to consist of identifiers in TableGen.
+  If any identifier is specified, this limits the line breaks by
+  TableGenBreakInsideDAGArgList option only on DAGArg values beginning with
+  the specified identifiers.
+
+  For example the configuration,
+
+  .. code-block:: c++
+
+TableGenBreakInsideDAGArgList: true
+TableGenBreakingDAGArgOperators: ['ins', 'outs']
+
+  makes the line break only occurs inside DAGArgs beginning with the
+  specified identifiers 'ins' and 'outs'.
+
+
+  .. code-block:: c++
+
+let DAGArgIns = (ins
+i32:$src1,
+i32:$src2
+);
+let DAGArgOtherID = (other i32:$other1, i32:$other2);
+let DAGArgBang = (!cast("Some") i32:$src1, i32:$src2)
+
 .. _TypeNames:
 
 **TypeNames** (``List of Strings``) :versionbadge:`clang-format 17` :ref:`¶ <TypeNames>`
diff --git a/clang/include/clang/Format/Format.h 
b/clang/include/clang/Format/Format.h
index 613f1fd168465d..9729634183110c 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -4728,6 +4728,43 @@ struct FormatStyle {
   /// \version 8
   std::vector<std::string> StatementMacros;
 
+  /// Works only when TableGenBreakInsideDAGArgList is true.
+  /// The string list needs to consist of identifiers in TableGen.
+  /// If any identifier is specified, this limits the line breaks by
+  /// TableGenBreakInsideDAGArgList option only on DAGArg values beginning with
+  /// the specified identifiers.
+  ///
+  /// For example the configuration,
+  /// \code
+  ///   TableGenBreakInsideDAGArgList: true
+  ///   TableGenBreakingDAGArgOperators: ['ins', 'outs']
+  /// \endcode
+  ///
+  /// makes the line break only occurs inside DAGArgs beginning with the
+  /// specified identifiers 'ins' and 'outs'.
+  ///
+  /// \code
+  ///   let DAGArgIns = (ins
+  ///   i32:$src1,
+  ///   i32:$src2
+  ///   );
+  ///   let DAGArgOtherID = (other i32:$other1, i32:$other2);
+  ///   let DAGArgBang = (!cast("Some") i32:$src1, i32:$src2)
+  /// \endcode
+  /// \version 19
+  std::vector<std::string> TableGenBreakingDAGArgOperators;
+
+  /// Insert the line break for each element of DAGArg list in TableGen.
+  ///
+  /// \code
+  ///   let DAGArgIns = (ins
+  ///   i32:$src1,
+  ///   i32:$src2
+  ///   );
+  /// \endcode
+  /// \version 19
+  bool TableGenBreakInsideDAGArgList;
+
   /// The number of columns used for tab stops.
   /// \version 3.7
   unsigned TabWidth;
@@ -4980,9 +5017,12 @@ struct FormatStyle {
SpacesInSquareBrackets == R.SpacesInSquareBrackets &&
Standard == R.Standard &&
StatementAttributeLikeMacros == R.StatementAttributeLikeMacros &&
-   StatementMacros == R.StatementMacros && TabWidth == R.TabWidth &&
-   TypeNames == R.TypeNames && TypenameMacros == R.TypenameMacros &&
-   UseTab == R.UseTab &&
+   StatementMacros == R.StatementMacros &&
+   TableGenBreakingDAGArgOperators ==
+   R.TableGenBreakingDAGArgOperators &&
+   TableGenBreakInsideDAGArgList == R.TableGenBreakInsideDAGArgList &&
+   TabWidth == R.TabWidth && TypeNames == R.TypeNames &&
+   TypenameMacros == R.TypenameMacros && UseTa

[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-03-10 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 edited 
https://github.com/llvm/llvm-project/pull/83149


[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-03-10 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/83149


[clang] [clang-format] Add AlignConsecutiveTableGenCondOperatorColons option. (PR #82878)

2024-02-24 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 created 
https://github.com/llvm/llvm-project/pull/82878

To align colons inside TableGen !cond operators.

>From d0ceab536cc9aa06ce5de1324eee1e3a05dac804 Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Sat, 24 Feb 2024 22:21:04 +0900
Subject: [PATCH] [clang-format] Add AlignConsecutiveTableGenCondOperatorColons
  option to align colons in tablegen cond operators.

---
 clang/docs/ClangFormatStyleOptions.rst| 140 ++
 clang/include/clang/Format/Format.h   |  12 ++
 clang/lib/Format/Format.cpp   |   3 +
 clang/lib/Format/WhitespaceManager.cpp|  18 ++-
 clang/lib/Format/WhitespaceManager.h  |   8 +
 clang/unittests/Format/FormatTestTableGen.cpp |  21 +++
 6 files changed, 199 insertions(+), 3 deletions(-)

diff --git a/clang/docs/ClangFormatStyleOptions.rst 
b/clang/docs/ClangFormatStyleOptions.rst
index fdf7bfaeaa4ec7..2cb55503038f66 100644
--- a/clang/docs/ClangFormatStyleOptions.rst
+++ b/clang/docs/ClangFormatStyleOptions.rst
@@ -955,6 +955,146 @@ the configuration (without a prefix: ``Auto``).
   }
 
 
+.. _AlignConsecutiveTableGenCondOperatorColons:
+
+**AlignConsecutiveTableGenCondOperatorColons** (``AlignConsecutiveStyle``) :versionbadge:`clang-format 19` :ref:`¶ <AlignConsecutiveTableGenCondOperatorColons>`
+  Style of aligning consecutive TableGen cond operator colons.
+  Align the colons of cases inside !cond operators.
+
+  .. code-block:: c++
+
+!cond(!eq(size, 1) : 1,
+  !eq(size, 16): 1,
+  true : 0)
+
+  Nested configuration flags:
+
+  Alignment options.
+
+  They can also be read as a whole for compatibility. The choices are:
+  - None
+  - Consecutive
+  - AcrossEmptyLines
+  - AcrossComments
+  - AcrossEmptyLinesAndComments
+
+  For example, to align across empty lines and not across comments, either
+  of these work.
+
+  .. code-block:: c++
+
+AlignConsecutiveMacros: AcrossEmptyLines
+
+AlignConsecutiveMacros:
+  Enabled: true
+  AcrossEmptyLines: true
+  AcrossComments: false
+
+  * ``bool Enabled`` Whether aligning is enabled.
+
+.. code-block:: c++
+
+  #define SHORT_NAME   42
+  #define LONGER_NAME  0x007f
+  #define EVEN_LONGER_NAME (2)
+  #define foo(x)   (x * x)
+  #define bar(y, z)(y + z)
+
+  int a= 1;
+  int somelongname = 2;
+  double c = 3;
+
+  int  : 1;
+  int b: 12;
+  int ccc  : 8;
+
+  int  = 12;
+  float   b = 23;
+  std::string ccc;
+
+  * ``bool AcrossEmptyLines`` Whether to align across empty lines.
+
+.. code-block:: c++
+
+  true:
+  int a= 1;
+  int somelongname = 2;
+  double c = 3;
+
+  int d= 3;
+
+  false:
+  int a= 1;
+  int somelongname = 2;
+  double c = 3;
+
+  int d = 3;
+
+  * ``bool AcrossComments`` Whether to align across comments.
+
+.. code-block:: c++
+
+  true:
+  int d= 3;
+  /* A comment. */
+  double e = 4;
+
+  false:
+  int d = 3;
+  /* A comment. */
+  double e = 4;
+
+  * ``bool AlignCompound`` Only for ``AlignConsecutiveAssignments``.  Whether 
compound assignments
+like ``+=`` are aligned along with ``=``.
+
+.. code-block:: c++
+
+  true:
+  a   &= 2;
+  bbb  = 2;
+
+  false:
+  a &= 2;
+  bbb = 2;
+
+  * ``bool AlignFunctionPointers`` Only for ``AlignConsecutiveDeclarations``. 
Whether function pointers are
+aligned.
+
+.. code-block:: c++
+
+  true:
+  unsigned i;
+  int &r;
+  int *p;
+  int  (*f)();
+
+  false:
+  unsigned i;
+  int &r;
+  int *p;
+  int (*f)();
+
+  * ``bool PadOperators`` Only for ``AlignConsecutiveAssignments``.  Whether 
short assignment
+operators are left-padded to the same length as long ones in order to
+put all assignment operators to the right of the left hand side.
+
+.. code-block:: c++
+
+  true:
+  a   >>= 2;
+  bbb   = 2;
+
+  a = 2;
+  bbb >>= 2;
+
+  false:
+  a >>= 2;
+  bbb = 2;
+
+  a = 2;
+  bbb >>= 2;
+
+
 .. _AlignEscapedNewlines:
 
 **AlignEscapedNewlines** (``EscapedNewlineAlignmentStyle``) :versionbadge:`clang-format 5` :ref:`¶ <AlignEscapedNewlines>`
diff --git a/clang/include/clang/Format/Format.h 
b/clang/include/clang/Format/Format.h
index e9b2160a7b9243..11853d23f2b42b 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -414,6 +414,16 @@ struct FormatStyle {
   /// \version 17
   ShortCaseStatementsAlignmentStyle AlignConsecutiveShortCaseStatements;
 
+  /// Style of aligning consecutive TableGen cond operator colons.
+  /// Align the colons of cases inside !cond operators.
+  /// \code
+  ///   !cond(!eq(size, 1) : 1,
+  /// !eq(size, 16): 1,
+  /// true : 0)
+  /// \endcode
+  /// \version 19
+  AlignConsecutiveStyle Alig

[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2024-02-24 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Alignment option for cond operator: 
https://github.com/llvm/llvm-project/pull/82878.

https://github.com/llvm/llvm-project/pull/76059


[clang] [clang-format] Add AlignConsecutiveTableGenCondOperatorColons option. (PR #82878)

2024-02-25 Thread Hirofumi Nakamura via cfe-commits


@@ -849,7 +851,12 @@ void WhitespaceManager::alignConsecutiveAssignments() {
 }
 
 void WhitespaceManager::alignConsecutiveBitFields() {
-  if (!Style.AlignConsecutiveBitFields.Enabled)
+  alignConsecutiveColons(Style.AlignConsecutiveBitFields, TT_BitFieldColon);
+}
+
+void WhitespaceManager::alignConsecutiveColons(
+const FormatStyle::AlignConsecutiveStyle &AlignStyle, TokenType Type) {

hnakamura5 wrote:

Both are OK, but other such functions seem to pass the style by const reference.
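
For illustration only, here is a toy version of a shared colon-alignment helper that takes its options by const reference. The struct `AlignOptionsSketch` and the function `alignColonsSketch` are hypothetical names; the real `alignConsecutiveColons` works on the `WhitespaceManager` change list, not on strings.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, reduced stand-in for FormatStyle::AlignConsecutiveStyle.
struct AlignOptionsSketch {
  bool Enabled = false;
};

// Renders each (condition, value) pair as "condition: value". When alignment
// is enabled, conditions are padded so that all colons share one column.
std::vector<std::string> alignColonsSketch(
    const AlignOptionsSketch &Options,
    const std::vector<std::pair<std::string, std::string>> &Cases) {
  std::size_t Width = 0;
  if (Options.Enabled) {
    for (const auto &Case : Cases)
      Width = std::max(Width, Case.first.size());
  }
  std::vector<std::string> Lines;
  for (const auto &Case : Cases) {
    std::string Line = Case.first;
    if (Line.size() < Width)
      Line.append(Width - Line.size(), ' ');
    Lines.push_back(Line + ": " + Case.second);
  }
  return Lines;
}
```

Passing the options by const reference avoids a copy and matches how the neighbouring alignment functions receive the style. In this sketch the parameter type is the only place where that choice shows.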

https://github.com/llvm/llvm-project/pull/82878


[clang] [clang-format] Add AlignConsecutiveTableGenCondOperatorColons option. (PR #82878)

2024-02-26 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 closed 
https://github.com/llvm/llvm-project/pull/82878


[clang] [clang-format] Add AlignConsecutiveTableGenCondOperatorColons option. (PR #82878)

2024-02-26 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Thank you!

https://github.com/llvm/llvm-project/pull/82878


[clang] [clang-format] Add AlignConsecutiveTableGenDefinitions option. (PR #83008)

2024-02-26 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 created 
https://github.com/llvm/llvm-project/pull/83008

To align TableGen consecutive definitions.

>From 4d22f709eff00b38cce6e9f4087bea14d04424fd Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Mon, 26 Feb 2024 23:17:55 +0900
Subject: [PATCH 1/2] [clang-format] Add AlignConsecutiveTableGenDefinitions
 option to align TableGen consecutive definitions.

---
 clang/docs/ClangFormatStyleOptions.rst| 140 ++
 clang/include/clang/Format/Format.h   |  12 ++
 clang/lib/Format/Format.cpp   |   3 +
 clang/lib/Format/WhitespaceManager.cpp|   9 +-
 clang/lib/Format/WhitespaceManager.h  |   3 +
 clang/unittests/Format/FormatTestTableGen.cpp |  14 ++
 6 files changed, 180 insertions(+), 1 deletion(-)

diff --git a/clang/docs/ClangFormatStyleOptions.rst 
b/clang/docs/ClangFormatStyleOptions.rst
index d509bb80767979..7be66df3aec61d 100644
--- a/clang/docs/ClangFormatStyleOptions.rst
+++ b/clang/docs/ClangFormatStyleOptions.rst
@@ -1095,6 +1095,146 @@ the configuration (without a prefix: ``Auto``).
   bbb >>= 2;
 
 
+.. _AlignConsecutiveTableGenDefinitionsColons:
+
+**AlignConsecutiveTableGenDefinitionsColons** (``AlignConsecutiveStyle``) 
:versionbadge:`clang-format 19` :ref:`¶ 
`
+  Style of aligning consecutive TableGen definition colons.
+  This aligns the inheritance colons of consecutive definitions.
+
+  .. code-block:: c++
+
+def Def   : Parent {}
+def DefDef: Parent {}
+def DefDefDef : Parent {}
+
+  Nested configuration flags:
+
+  Alignment options.
+
+  They can also be read as a whole for compatibility. The choices are:
+  - None
+  - Consecutive
+  - AcrossEmptyLines
+  - AcrossComments
+  - AcrossEmptyLinesAndComments
+
+  For example, to align across empty lines and not across comments, either
+  of these work.
+
+  .. code-block:: c++
+
+AlignConsecutiveMacros: AcrossEmptyLines
+
+AlignConsecutiveMacros:
+  Enabled: true
+  AcrossEmptyLines: true
+  AcrossComments: false
+
+  * ``bool Enabled`` Whether aligning is enabled.
+
+.. code-block:: c++
+
+  #define SHORT_NAME   42
+  #define LONGER_NAME  0x007f
+  #define EVEN_LONGER_NAME (2)
+  #define foo(x)   (x * x)
+  #define bar(y, z)(y + z)
+
+  int a= 1;
+  int somelongname = 2;
+  double c = 3;
+
+  int  : 1;
+  int b: 12;
+  int ccc  : 8;
+
+  int  = 12;
+  float   b = 23;
+  std::string ccc;
+
+  * ``bool AcrossEmptyLines`` Whether to align across empty lines.
+
+.. code-block:: c++
+
+  true:
+  int a= 1;
+  int somelongname = 2;
+  double c = 3;
+
+  int d= 3;
+
+  false:
+  int a= 1;
+  int somelongname = 2;
+  double c = 3;
+
+  int d = 3;
+
+  * ``bool AcrossComments`` Whether to align across comments.
+
+.. code-block:: c++
+
+  true:
+  int d= 3;
+  /* A comment. */
+  double e = 4;
+
+  false:
+  int d = 3;
+  /* A comment. */
+  double e = 4;
+
+  * ``bool AlignCompound`` Only for ``AlignConsecutiveAssignments``.  Whether 
compound assignments
+like ``+=`` are aligned along with ``=``.
+
+.. code-block:: c++
+
+  true:
+  a   &= 2;
+  bbb  = 2;
+
+  false:
+  a &= 2;
+  bbb = 2;
+
+  * ``bool AlignFunctionPointers`` Only for ``AlignConsecutiveDeclarations``. 
Whether function pointers are
+aligned.
+
+.. code-block:: c++
+
+  true:
+  unsigned i;
+  int &r;
+  int *p;
+  int  (*f)();
+
+  false:
+  unsigned i;
+  int &r;
+  int *p;
+  int (*f)();
+
+  * ``bool PadOperators`` Only for ``AlignConsecutiveAssignments``.  Whether 
short assignment
+operators are left-padded to the same length as long ones in order to
+put all assignment operators to the right of the left hand side.
+
+.. code-block:: c++
+
+  true:
+  a   >>= 2;
+  bbb   = 2;
+
+  a = 2;
+  bbb >>= 2;
+
+  false:
+  a >>= 2;
+  bbb = 2;
+
+  a = 2;
+  bbb >>= 2;
+
+
 .. _AlignEscapedNewlines:
 
 **AlignEscapedNewlines** (``EscapedNewlineAlignmentStyle``) 
:versionbadge:`clang-format 5` :ref:`¶ `
diff --git a/clang/include/clang/Format/Format.h 
b/clang/include/clang/Format/Format.h
index 449ce9e53be147..1b66d6b9fc6ced 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -424,6 +424,16 @@ struct FormatStyle {
   /// \version 19
   AlignConsecutiveStyle AlignConsecutiveTableGenCondOperatorColons;
 
+  /// Style of aligning consecutive TableGen definition colons.
+  /// This aligns the inheritance colons of consecutive definitions.
+  /// \code
+  ///   def Def   : Parent {}
+  ///   def DefDef: Parent {}
+  ///   def DefDefDef : Parent {}
+  /// \endcode
+  /// \version 19
+  Alig

[clang] [clang-format] Add AlignConsecutiveTableGenDefinitions option. (PR #83008)

2024-02-27 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 closed 
https://github.com/llvm/llvm-project/pull/83008


[clang] [clang-format] Add AlignConsecutiveTableGenDefinitions option. (PR #83008)

2024-02-27 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Thank you!

https://github.com/llvm/llvm-project/pull/83008


[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-02-27 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 created 
https://github.com/llvm/llvm-project/pull/83149

This adds two options that control line breaks inside a TableGen DAGArg.
- TableGenBreakInsideDAGArgList 
- TableGenBreakingDAGArgOperators


>From becb28f6daa1fed9cabe40375a7ed863207b6bd2 Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Wed, 28 Feb 2024 01:10:12 +0900
Subject: [PATCH] [clang-format] Add Options to break inside the TableGen
 DAGArg.

---
 clang/docs/ClangFormatStyleOptions.rst| 42 
 clang/include/clang/Format/Format.h   | 46 +++--
 clang/lib/Format/ContinuationIndenter.cpp |  3 +-
 clang/lib/Format/Format.cpp   |  6 +++
 clang/lib/Format/FormatToken.h|  2 +
 clang/lib/Format/TokenAnnotator.cpp   | 49 ++-
 clang/unittests/Format/FormatTestTableGen.cpp | 42 
 clang/unittests/Format/TokenAnnotatorTest.cpp | 41 
 8 files changed, 226 insertions(+), 5 deletions(-)

diff --git a/clang/docs/ClangFormatStyleOptions.rst 
b/clang/docs/ClangFormatStyleOptions.rst
index df399a229d8d4f..9b055d16b24ac9 100644
--- a/clang/docs/ClangFormatStyleOptions.rst
+++ b/clang/docs/ClangFormatStyleOptions.rst
@@ -6158,6 +6158,48 @@ the configuration (without a prefix: ``Auto``).
 **TabWidth** (``Unsigned``) :versionbadge:`clang-format 3.7` :ref:`¶ 
`
   The number of columns used for tab stops.
 
+.. _TableGenBreakInsideDAGArgList:
+
+**TableGenBreakInsideDAGArgList** (``Boolean``) :versionbadge:`clang-format 
19` :ref:`¶ `
+  Insert the line break for each element of DAGArg list in TableGen.
+
+
+  .. code-block:: c++
+
+let DAGArgIns = (ins
+i32:$src1,
+i32:$src2
+);
+
+.. _TableGenBreakingDAGArgOperators:
+
+**TableGenBreakingDAGArgOperators** (``List of Strings``) 
:versionbadge:`clang-format 19` :ref:`¶ `
+  Works only when TableGenBreakInsideDAGArgList is true.
+  The string list needs to consist of identifiers in TableGen.
+  If any identifier is specified, this limits the line breaks by
+  TableGenBreakInsideDAGArgList option only on DAGArg values beginning with
+  the specified identifiers.
+
+  For example the configuration,
+
+  .. code-block:: c++
+
+TableGenBreakInsideDAGArgList: true
+TableGenBreakingDAGArgOperators: ['ins', 'outs']
+
+  makes the line break only occurs inside DAGArgs beginning with the
+  specified identifiers 'ins' and 'outs'.
+
+
+  .. code-block:: c++
+
+let DAGArgIns = (ins
+i32:$src1,
+i32:$src2
+);
+let DAGArgOtherID = (other i32:$other1, i32:$other2);
+let DAGArgBang = (!cast("Some") i32:$src1, i32:$src2)
+
 .. _TypeNames:
 
 **TypeNames** (``List of Strings``) :versionbadge:`clang-format 17` :ref:`¶ 
`
diff --git a/clang/include/clang/Format/Format.h 
b/clang/include/clang/Format/Format.h
index 613f1fd168465d..9729634183110c 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -4728,6 +4728,43 @@ struct FormatStyle {
   /// \version 8
   std::vector StatementMacros;
 
+  /// Works only when TableGenBreakInsideDAGArgList is true.
+  /// The string list needs to consist of identifiers in TableGen.
+  /// If any identifier is specified, this limits the line breaks by
+  /// TableGenBreakInsideDAGArgList option only on DAGArg values beginning with
+  /// the specified identifiers.
+  ///
+  /// For example the configuration,
+  /// \code
+  ///   TableGenBreakInsideDAGArgList: true
+  ///   TableGenBreakingDAGArgOperators: ['ins', 'outs']
+  /// \endcode
+  ///
+  /// makes the line break only occurs inside DAGArgs beginning with the
+  /// specified identifiers 'ins' and 'outs'.
+  ///
+  /// \code
+  ///   let DAGArgIns = (ins
+  ///   i32:$src1,
+  ///   i32:$src2
+  ///   );
+  ///   let DAGArgOtherID = (other i32:$other1, i32:$other2);
+  ///   let DAGArgBang = (!cast("Some") i32:$src1, i32:$src2)
+  /// \endcode
+  /// \version 19
+  std::vector TableGenBreakingDAGArgOperators;
+
+  /// Insert the line break for each element of DAGArg list in TableGen.
+  ///
+  /// \code
+  ///   let DAGArgIns = (ins
+  ///   i32:$src1,
+  ///   i32:$src2
+  ///   );
+  /// \endcode
+  /// \version 19
+  bool TableGenBreakInsideDAGArgList;
+
   /// The number of columns used for tab stops.
   /// \version 3.7
   unsigned TabWidth;
@@ -4980,9 +5017,12 @@ struct FormatStyle {
SpacesInSquareBrackets == R.SpacesInSquareBrackets &&
Standard == R.Standard &&
StatementAttributeLikeMacros == R.StatementAttributeLikeMacros &&
-   StatementMacros == R.StatementMacros && TabWidth == R.TabWidth &&
-   TypeNames == R.TypeNames && TypenameMacros == R.TypenameMacros &&
-   UseTab == R.UseTab &&
+   StatementMacros == R.StatementMacros &&
+   TableGenBreakingDAGArgOperators ==
+   R.TableGenBreakingDAGArgOperators &&
+   TableGenBreakInsideDAGArgList == R.TableGenBreakInsi

[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2024-02-27 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Alignment option for definitions: 
https://github.com/llvm/llvm-project/pull/83008.

https://github.com/llvm/llvm-project/pull/76059


[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2024-02-27 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Break options for DAGArg: https://github.com/llvm/llvm-project/pull/83149.

https://github.com/llvm/llvm-project/pull/76059


[clang] [clang-format] Add Options to break inside the TableGen DAGArg. (PR #83149)

2024-02-27 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

These options depend on each other: TableGenBreakingDAGArgOperators takes
effect only when TableGenBreakInsideDAGArgList is set to true.
I'm not sure this is the best design.
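
A minimal sketch of that dependency, following the option documentation in
this patch. The struct DAGArgBreakStyle and the function
shouldBreakInsideDAGArg are hypothetical names for illustration; the real
decision is made in TokenAnnotator and may differ in detail.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy stand-in for the two FormatStyle fields added by the patch.
struct DAGArgBreakStyle {
  bool BreakInsideDAGArgList = false;
  std::vector<std::string> BreakingDAGArgOperators;
};

// Hypothetical helper: decides whether a DAGArg whose operator is
// `Operator` (e.g. "ins") gets one element per line.
bool shouldBreakInsideDAGArg(const DAGArgBreakStyle &Style,
                             const std::string &Operator) {
  // The operator list is ignored unless the boolean option is on.
  if (!Style.BreakInsideDAGArgList)
    return false;
  // An empty list places no restriction: every DAGArg is broken.
  if (Style.BreakingDAGArgOperators.empty())
    return true;
  // Otherwise only DAGArgs starting with a listed identifier are broken.
  const auto &Ops = Style.BreakingDAGArgOperators;
  return std::find(Ops.begin(), Ops.end(), Operator) != Ops.end();
}
```

With TableGenBreakInsideDAGArgList set to true and the operators set to
['ins', 'outs'], this returns true for "ins" and false for "other", which
matches the example in the documentation above.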

https://github.com/llvm/llvm-project/pull/83149


[clang] [clang-format] Support of TableGen tokens with unary operator like form, bang operators and numeric literals. (PR #78996)

2024-01-30 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 closed 
https://github.com/llvm/llvm-project/pull/78996


[clang] [clang-format] Support of TableGen tokens with unary operator like form, bang operators and numeric literals. (PR #78996)

2024-01-30 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

@HazardyKnusperkeks 
Thank you for checking and accepting!

@mydeveloperday 
You will be able to see the points in the subsequent PRs.

https://github.com/llvm/llvm-project/pull/78996


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-01 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 created 
https://github.com/llvm/llvm-project/pull/80299

This implements the annotation of values in TableGen.
The main changes are:

- parseTableGenValue(), a simplified parser method for the syntax of values.
- modified consumeToken() to call parseTableGenValue() in 'if', 'assert' and
after '='.
- modified parseParens() to call parseTableGenValue() inside.
- modified parseSquare() to call parseTableGenValue() inside, skipping
separator tokens.
- modified parseAngle() to call parseTableGenValue() inside, skipping
separator tokens.

This PR is separated from https://github.com/llvm/llvm-project/pull/76059 .
Although this is a fairly large patch, I could not split it into smaller
self-contained patches. I will add comments to clarify where each part of the
diff belongs.

>From 36f83a124ea8ad27cfefa1d12ae5aa781f8e6e3e Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Thu, 1 Feb 2024 23:07:42 +0900
Subject: [PATCH] [clang-format] Support of TableGen value annotations.

---
 clang/lib/Format/FormatToken.h|  10 +
 clang/lib/Format/FormatTokenLexer.cpp |   2 +-
 clang/lib/Format/TokenAnnotator.cpp   | 303 +-
 clang/lib/Format/UnwrappedLineParser.cpp  |  13 +-
 clang/unittests/Format/TokenAnnotatorTest.cpp |  45 +++
 5 files changed, 364 insertions(+), 9 deletions(-)

diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index bace91b5f99b4..0c1dce7a29408 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -150,7 +150,17 @@ namespace format {
   TYPE(StructuredBindingLSquare)   
\
   TYPE(TableGenBangOperator)   
\
   TYPE(TableGenCondOperator)   
\
+  TYPE(TableGenCondOperatorColon)  
\
+  TYPE(TableGenCondOperatorComma)  
\
+  TYPE(TableGenDAGArgCloser)   
\
+  TYPE(TableGenDAGArgListColon)
\
+  TYPE(TableGenDAGArgListComma)
\
+  TYPE(TableGenDAGArgOpener)   
\
+  TYPE(TableGenListCloser) 
\
+  TYPE(TableGenListOpener) 
\
   TYPE(TableGenMultiLineString)
\
+  TYPE(TableGenTrailingPasteOperator)  
\
+  TYPE(TableGenValueSuffix)
\
   TYPE(TemplateCloser) 
\
   TYPE(TemplateOpener) 
\
   TYPE(TemplateString) 
\
diff --git a/clang/lib/Format/FormatTokenLexer.cpp 
b/clang/lib/Format/FormatTokenLexer.cpp
index d7de09ef0e12a..27b2b1b619b1d 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -816,7 +816,7 @@ void FormatTokenLexer::handleTableGenMultilineString() {
   auto CloseOffset = Lex->getBuffer().find("}]", OpenOffset);
   if (CloseOffset == StringRef::npos)
 return;
-  auto Text = Lex->getBuffer().substr(OpenOffset, CloseOffset + 2);
+  auto Text = Lex->getBuffer().substr(OpenOffset, CloseOffset - OpenOffset + 
2);
   MultiLineString->TokenText = Text;
   resetLexer(SourceMgr.getFileOffset(
       Lex->getSourceLocation(Lex->getBufferLocation() - 2 + Text.size())));
diff --git a/clang/lib/Format/TokenAnnotator.cpp 
b/clang/lib/Format/TokenAnnotator.cpp
index df1c5bc19de1e..afcf5f638ce4c 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -256,6 +256,18 @@ class AnnotatingParser {
   }
 }
   }
+  if (Style.isTableGen()) {
+if (CurrentToken->isOneOf(tok::comma, tok::equal)) {
+  // They appears as a separator. Unless it is not in class definition.
+  next();
+  continue;
+}
+// In angle, there must be Value like tokens. Types are also able to be
+// parsed in the same way with Values.
+if (!parseTableGenValue())
+  return false;
+continue;
+  }
   if (!consumeToken())
 return false;
 }
@@ -388,6 +400,28 @@ class AnnotatingParser {
   Contexts.back().IsExpression = !IsForOrCatch;
 }
 
+if (Style.isTableGen()) {
+  if (FormatToken *Prev = OpeningParen.Previous) {
+if (Prev->is(TT_TableGenCondOperator)) {
+  Contexts.back().IsTableGenCondOpe = true;
+  Contexts.back().IsExpression = true;
+} else if (Contexts.size() > 1 &&
+   Contexts[Contexts.size() - 2].IsTableGenBangOpe) {
+  // Hack to ha

[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-01 Thread Hirofumi Nakamura via cfe-commits


@@ -816,7 +816,7 @@ void FormatTokenLexer::handleTableGenMultilineString() {
   auto CloseOffset = Lex->getBuffer().find("}]", OpenOffset);
   if (CloseOffset == StringRef::npos)
 return;
-  auto Text = Lex->getBuffer().substr(OpenOffset, CloseOffset + 2);
+  auto Text = Lex->getBuffer().substr(OpenOffset, CloseOffset - OpenOffset + 
2);

hnakamura5 wrote:

This is actually a bug fix for https://github.com/llvm/llvm-project/pull/78032.
I need this to run the tests. If requested, I will split this change into a
separate pull request.
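
For context, the second argument of StringRef::substr is a length, not an end
offset, so passing CloseOffset + 2 takes too many characters whenever
OpenOffset is greater than zero. The sketch below shows the corrected
computation with std::string_view, whose substr has the same (position,
count) shape. The helper extractCodeBlock is a hypothetical name for this
example and is not part of clang-format.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns the first "[{ ... }]" block that starts at OpenOffset, including
// both delimiters, or an empty view if no closing "}]" is found.
// extractCodeBlock is a hypothetical helper used for illustration only.
std::string_view extractCodeBlock(std::string_view Buffer,
                                  std::size_t OpenOffset) {
  assert(OpenOffset <= Buffer.size() && "offset must lie inside the buffer");
  std::size_t CloseOffset = Buffer.find("}]", OpenOffset);
  if (CloseOffset == std::string_view::npos)
    return {};
  // substr(Pos, Count): convert the end offset into a length by subtracting
  // the start offset; the + 2 keeps the closing "}]" in the result.
  return Buffer.substr(OpenOffset, CloseOffset - OpenOffset + 2);
}
```

With the old expression the result would run past the closing delimiter by
OpenOffset characters (clamped at the end of the buffer), which only goes
unnoticed when the code block starts at offset 0.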

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-01 Thread Hirofumi Nakamura via cfe-commits


@@ -256,6 +256,18 @@ class AnnotatingParser {
   }
 }
   }
+  if (Style.isTableGen()) {
+if (CurrentToken->isOneOf(tok::comma, tok::equal)) {
+  // They appears as a separator. Unless it is not in class definition.
+  next();
+  continue;
+}
+// In angle, there must be Value like tokens. Types are also able to be
+// parsed in the same way with Values.
+if (!parseTableGenValue())
+  return false;
+continue;
+  }

hnakamura5 wrote:

This is inside parseAngle().

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-01 Thread Hirofumi Nakamura via cfe-commits


@@ -388,6 +400,28 @@ class AnnotatingParser {
   Contexts.back().IsExpression = !IsForOrCatch;
 }
 
+if (Style.isTableGen()) {
+  if (FormatToken *Prev = OpeningParen.Previous) {
+if (Prev->is(TT_TableGenCondOperator)) {
+  Contexts.back().IsTableGenCondOpe = true;
+  Contexts.back().IsExpression = true;
+} else if (Contexts.size() > 1 &&
+   Contexts[Contexts.size() - 2].IsTableGenBangOpe) {
+  // Hack to handle bang operators. The parent context's flag
+  // was set by parseTableGenSimpleValue().
+  // We have to specify the context outside because the prev of "(" may
+  // be ">", not the bang operator in this case.
+  Contexts.back().IsTableGenBangOpe = true;
+  Contexts.back().IsExpression = true;
+} else {
+  // Otherwise, this paren seems DAGArg.
+  if (!parseTableGenDAGArg())
+return false;
+  return parseTableGenDAGArgAndList(&OpeningParen);
+}
+  }
+}
+

hnakamura5 wrote:

In parseParens().

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-01 Thread Hirofumi Nakamura via cfe-commits


@@ -549,6 +583,22 @@ class AnnotatingParser {
   if (CurrentToken->is(tok::comma))
 Contexts.back().CanBeExpression = true;
 
+  if (Style.isTableGen()) {
+if (CurrentToken->is(tok::comma)) {
+  if (Contexts.back().IsTableGenCondOpe)
+CurrentToken->setType(TT_TableGenCondOperatorComma);
+  next();
+} else if (CurrentToken->is(tok::colon)) {
+  if (Contexts.back().IsTableGenCondOpe)
+CurrentToken->setType(TT_TableGenCondOperatorColon);
+  next();
+}
+// In TableGen there must be Values in parens.
+if (!parseTableGenValue())
+  return false;
+continue;
+  }
+

hnakamura5 wrote:

In parseParens().

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-01 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 edited 
https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-01 Thread Hirofumi Nakamura via cfe-commits


@@ -833,13 +885,207 @@ class AnnotatingParser {
 Left->setType(TT_ArrayInitializerLSquare);
   }
   FormatToken *Tok = CurrentToken;
+  if (Style.isTableGen()) {
+if (CurrentToken->isOneOf(tok::comma, tok::minus, tok::ellipsis)) {
+  // '-' and '...' appears as a separator in slice.
+  next();
+} else {
+  // In TableGen there must be a list of Values in square brackets.
+  // It must be ValueList or SliceElements.
+  if (!parseTableGenValue())
+return false;
+}
+updateParameterCount(Left, Tok);
+continue;
+  }

hnakamura5 wrote:

In parseSquare(). 

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-01 Thread Hirofumi Nakamura via cfe-commits


@@ -1423,11 +1692,30 @@ class AnnotatingParser {
 if (!Tok->getPreviousNonComment())
   Line.IsContinuation = true;
   }
+  if (Style.isTableGen()) {
+if (Tok->is(Keywords.kw_assert)) {
+  if (!parseTableGenValue())
+return false;
+} else if (Tok->isOneOf(Keywords.kw_def, Keywords.kw_defm) &&
+   (!Tok->Next ||
+!Tok->Next->isOneOf(tok::colon, tok::l_brace))) {
+  // The case NameValue appears.
+  if (!parseTableGenValue(true))
+return false;
+}
+  }

hnakamura5 wrote:

In consumeToken().

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-01 Thread Hirofumi Nakamura via cfe-commits


@@ -1423,11 +1692,30 @@ class AnnotatingParser {
 if (!Tok->getPreviousNonComment())
   Line.IsContinuation = true;
   }
+  if (Style.isTableGen()) {
+if (Tok->is(Keywords.kw_assert)) {
+  if (!parseTableGenValue())
+return false;
+} else if (Tok->isOneOf(Keywords.kw_def, Keywords.kw_defm) &&
+   (!Tok->Next ||
+!Tok->Next->isOneOf(tok::colon, tok::l_brace))) {
+  // The case NameValue appears.
+  if (!parseTableGenValue(true))
+return false;
+}
+  }
   break;
 case tok::arrow:
   if (Tok->Previous && Tok->Previous->is(tok::kw_noexcept))
 Tok->setType(TT_TrailingReturnArrow);
   break;
+case tok::equal:
+  // In TableGen, there must be a value after "=";
+  if (Style.isTableGen()) {
+if (!parseTableGenValue())
+  return false;
+  }
+  break;

hnakamura5 wrote:

In consumeToken().

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-01 Thread Hirofumi Nakamura via cfe-commits


@@ -915,10 +1163,12 @@ class AnnotatingParser {
 Previous->setType(TT_SelectorName);
   }
 }
-if (CurrentToken->is(tok::colon) && OpeningBrace.is(TT_Unknown))
+if (CurrentToken->is(tok::colon) && OpeningBrace.is(TT_Unknown) &&
+!Style.isTableGen()) {
   OpeningBrace.setType(TT_DictLiteral);
-else if (Style.isJavaScript())
+} else if (Style.isJavaScript()) {
   OpeningBrace.overwriteFixedType(TT_DictLiteral);
+}

hnakamura5 wrote:

From line 1126 to here is inside parseBrace(), which determines whether the
l_brace opens a dictionary literal.
TableGen does not have dictionary literals.

https://github.com/llvm/llvm-project/pull/80299
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-01 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 edited 
https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-06 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/80299

>From 36f83a124ea8ad27cfefa1d12ae5aa781f8e6e3e Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Thu, 1 Feb 2024 23:07:42 +0900
Subject: [PATCH 1/2] [clang-format] Support of TableGen value annotations.

---
 clang/lib/Format/FormatToken.h|  10 +
 clang/lib/Format/FormatTokenLexer.cpp |   2 +-
 clang/lib/Format/TokenAnnotator.cpp   | 303 +-
 clang/lib/Format/UnwrappedLineParser.cpp  |  13 +-
 clang/unittests/Format/TokenAnnotatorTest.cpp |  45 +++
 5 files changed, 364 insertions(+), 9 deletions(-)

diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index bace91b5f99b4..0c1dce7a29408 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -150,7 +150,17 @@ namespace format {
   TYPE(StructuredBindingLSquare)   
\
   TYPE(TableGenBangOperator)   
\
   TYPE(TableGenCondOperator)   
\
+  TYPE(TableGenCondOperatorColon)  
\
+  TYPE(TableGenCondOperatorComma)  
\
+  TYPE(TableGenDAGArgCloser)   
\
+  TYPE(TableGenDAGArgListColon)
\
+  TYPE(TableGenDAGArgListComma)
\
+  TYPE(TableGenDAGArgOpener)   
\
+  TYPE(TableGenListCloser) 
\
+  TYPE(TableGenListOpener) 
\
   TYPE(TableGenMultiLineString)
\
+  TYPE(TableGenTrailingPasteOperator)  
\
+  TYPE(TableGenValueSuffix)
\
   TYPE(TemplateCloser) 
\
   TYPE(TemplateOpener) 
\
   TYPE(TemplateString) 
\
diff --git a/clang/lib/Format/FormatTokenLexer.cpp 
b/clang/lib/Format/FormatTokenLexer.cpp
index d7de09ef0e12a..27b2b1b619b1d 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -816,7 +816,7 @@ void FormatTokenLexer::handleTableGenMultilineString() {
   auto CloseOffset = Lex->getBuffer().find("}]", OpenOffset);
   if (CloseOffset == StringRef::npos)
 return;
-  auto Text = Lex->getBuffer().substr(OpenOffset, CloseOffset + 2);
+  auto Text = Lex->getBuffer().substr(OpenOffset, CloseOffset - OpenOffset + 
2);
   MultiLineString->TokenText = Text;
   resetLexer(SourceMgr.getFileOffset(
       Lex->getSourceLocation(Lex->getBufferLocation() - 2 + Text.size())));
diff --git a/clang/lib/Format/TokenAnnotator.cpp 
b/clang/lib/Format/TokenAnnotator.cpp
index df1c5bc19de1e..afcf5f638ce4c 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -256,6 +256,18 @@ class AnnotatingParser {
   }
 }
   }
+  if (Style.isTableGen()) {
+if (CurrentToken->isOneOf(tok::comma, tok::equal)) {
+  // They appears as a separator. Unless it is not in class definition.
+  next();
+  continue;
+}
+// In angle, there must be Value like tokens. Types are also able to be
+// parsed in the same way with Values.
+if (!parseTableGenValue())
+  return false;
+continue;
+  }
   if (!consumeToken())
 return false;
 }
@@ -388,6 +400,28 @@ class AnnotatingParser {
   Contexts.back().IsExpression = !IsForOrCatch;
 }
 
+if (Style.isTableGen()) {
+  if (FormatToken *Prev = OpeningParen.Previous) {
+if (Prev->is(TT_TableGenCondOperator)) {
+  Contexts.back().IsTableGenCondOpe = true;
+  Contexts.back().IsExpression = true;
+} else if (Contexts.size() > 1 &&
+   Contexts[Contexts.size() - 2].IsTableGenBangOpe) {
+  // Hack to handle bang operators. The parent context's flag
+  // was set by parseTableGenSimpleValue().
+  // We have to specify the context outside because the prev of "(" may
+  // be ">", not the bang operator in this case.
+  Contexts.back().IsTableGenBangOpe = true;
+  Contexts.back().IsExpression = true;
+} else {
+  // Otherwise, this paren seems DAGArg.
+  if (!parseTableGenDAGArg())
+return false;
+  return parseTableGenDAGArgAndList(&OpeningParen);
+}
+  }
+}
+
 // Infer the role of the l_paren based on the previous token if we haven't
 // detected one yet.
 if (PrevNonComment && OpeningParen.

[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-06 Thread Hirofumi Nakamura via cfe-commits


@@ -256,6 +256,18 @@ class AnnotatingParser {
   }
 }
   }
+  if (Style.isTableGen()) {
+if (CurrentToken->isOneOf(tok::comma, tok::equal)) {
+  // They appears as a separator. Unless it is not in class definition.

hnakamura5 wrote:

Changed this sentence.

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-06 Thread Hirofumi Nakamura via cfe-commits


@@ -833,13 +885,207 @@ class AnnotatingParser {
 Left->setType(TT_ArrayInitializerLSquare);
   }
   FormatToken *Tok = CurrentToken;
+  if (Style.isTableGen()) {
+if (CurrentToken->isOneOf(tok::comma, tok::minus, tok::ellipsis)) {
+  // '-' and '...' appears as a separator in slice.
+  next();
+} else {
+  // In TableGen there must be a list of Values in square brackets.
+  // It must be ValueList or SliceElements.
+  if (!parseTableGenValue())
+return false;
+}
+updateParameterCount(Left, Tok);
+continue;
+  }
   if (!consumeToken())
 return false;
   updateParameterCount(Left, Tok);
 }
 return false;
   }
 
+  void nextTableGenNonComment() {

hnakamura5 wrote:

Changed the name to skipToNextNonComment.
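
For readers skimming the thread, the helper advances once and then keeps
advancing while the current token is a comment. A self-contained sketch of
the same loop over a toy token vector follows; ToyToken and the index-based
signature are assumptions for this example, while the real method works on
the parser's CurrentToken pointer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy token: only what the sketch needs.
struct ToyToken {
  std::string Text;
  bool IsComment;
};

// Moves past the token at Index and then past any comments that follow.
// Returns Tokens.size() when the stream is exhausted.
std::size_t skipToNextNonComment(const std::vector<ToyToken> &Tokens,
                                 std::size_t Index) {
  if (Index < Tokens.size())
    ++Index; // Corresponds to the initial next().
  while (Index < Tokens.size() && Tokens[Index].IsComment)
    ++Index; // Corresponds to next() while the current token is a comment.
  return Index;
}
```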

https://github.com/llvm/llvm-project/pull/80299
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-06 Thread Hirofumi Nakamura via cfe-commits


@@ -833,13 +885,207 @@ class AnnotatingParser {
 Left->setType(TT_ArrayInitializerLSquare);
   }
   FormatToken *Tok = CurrentToken;
+  if (Style.isTableGen()) {
+if (CurrentToken->isOneOf(tok::comma, tok::minus, tok::ellipsis)) {
+  // '-' and '...' appears as a separator in slice.
+  next();
+} else {
+  // In TableGen there must be a list of Values in square brackets.
+  // It must be ValueList or SliceElements.
+  if (!parseTableGenValue())
+return false;
+}
+updateParameterCount(Left, Tok);
+continue;
+  }
   if (!consumeToken())
 return false;
   updateParameterCount(Left, Tok);
 }
 return false;
   }
 
+  void nextTableGenNonComment() {
+next();
+while (CurrentToken && CurrentToken->is(tok::comment))
+  next();
+  }
+
+  bool parseTableGenValue(bool ParseNameMode = false) {

hnakamura5 wrote:

I added a description of the syntax of TableGen's Value.

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-06 Thread Hirofumi Nakamura via cfe-commits


@@ -833,13 +885,207 @@ class AnnotatingParser {
 Left->setType(TT_ArrayInitializerLSquare);
   }
   FormatToken *Tok = CurrentToken;
+  if (Style.isTableGen()) {
+if (CurrentToken->isOneOf(tok::comma, tok::minus, tok::ellipsis)) {
+  // '-' and '...' appears as a separator in slice.
+  next();
+} else {
+  // In TableGen there must be a list of Values in square brackets.
+  // It must be ValueList or SliceElements.
+  if (!parseTableGenValue())
+return false;
+}
+updateParameterCount(Left, Tok);
+continue;
+  }
   if (!consumeToken())
 return false;
   updateParameterCount(Left, Tok);
 }
 return false;
   }
 
+  void nextTableGenNonComment() {
+next();
+while (CurrentToken && CurrentToken->is(tok::comment))
+  next();
+  }
+
+  bool parseTableGenValue(bool ParseNameMode = false) {
+if (!CurrentToken)
+  return false;
+while (CurrentToken->is(tok::comment))
+  next();
+if (!parseTableGenSimpleValue())
+  return false;
+if (!CurrentToken)
+  return true;
+// Value "#" [Value]
+if (CurrentToken->is(tok::hash)) {
+  if (CurrentToken->Next &&
+  CurrentToken->Next->isOneOf(tok::colon, tok::semi, tok::l_brace)) {
+// Trailing paste operator.
+// These are only the allowed cases in TGParser::ParseValue().
+CurrentToken->setType(TT_TableGenTrailingPasteOperator);
+next();
+return true;
+  }
+  FormatToken *HashTok = CurrentToken;
+  nextTableGenNonComment();
+  HashTok->setType(TT_Unknown);
+  if (!parseTableGenValue(ParseNameMode))
+return false;
+}
+// In name mode, '{' is regarded as the end of the value.
+// See TGParser::ParseValue in TGParser.cpp
+if (ParseNameMode && CurrentToken->is(tok::l_brace))
+  return true;
+if (CurrentToken->isOneOf(tok::l_brace, tok::l_square, tok::period)) {
+  // Delegate ValueSuffix to normal consumeToken
+  CurrentToken->setType(TT_TableGenValueSuffix);
+  FormatToken *Suffix = CurrentToken;
+  nextTableGenNonComment();
+  if (Suffix->is(tok::l_square)) {
+return parseSquare();
+  } else if (Suffix->is(tok::l_brace)) {

hnakamura5 wrote:

Removed 'else'.
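
For context, the LLVM coding standards discourage `else` after a branch that returns. A minimal sketch of the resulting shape follows; `SuffixKind` and `describeSuffix` are hypothetical names for this example and do not appear in the patch.

```cpp
#include <string_view>

// Hypothetical enum standing in for the three suffix openers '[', '{', '.'.
enum class SuffixKind { Square, Brace, Period };

// Every branch returns, so no 'else' is needed between them.
std::string_view describeSuffix(SuffixKind Kind) {
  if (Kind == SuffixKind::Square)
    return "slice";
  if (Kind == SuffixKind::Brace)
    return "bit range";
  return "field access";
}
```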

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-06 Thread Hirofumi Nakamura via cfe-commits


@@ -1423,11 +1692,30 @@ class AnnotatingParser {
 if (!Tok->getPreviousNonComment())
   Line.IsContinuation = true;
   }
+  if (Style.isTableGen()) {
+if (Tok->is(Keywords.kw_assert)) {
+  if (!parseTableGenValue())
+return false;
+} else if (Tok->isOneOf(Keywords.kw_def, Keywords.kw_defm) &&
+   (!Tok->Next ||
+!Tok->Next->isOneOf(tok::colon, tok::l_brace))) {
+  // The case NameValue appears.
+  if (!parseTableGenValue(true))
+return false;
+}
+  }
   break;
 case tok::arrow:
   if (Tok->Previous && Tok->Previous->is(tok::kw_noexcept))
 Tok->setType(TT_TrailingReturnArrow);
   break;
+case tok::equal:
+  // In TableGen, there must be a value after "=";
+  if (Style.isTableGen()) {
+if (!parseTableGenValue())

hnakamura5 wrote:

Fixed as suggested.

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen value annotations. (PR #80299)

2024-02-06 Thread Hirofumi Nakamura via cfe-commits


@@ -833,13 +885,207 @@ class AnnotatingParser {
 Left->setType(TT_ArrayInitializerLSquare);
   }
   FormatToken *Tok = CurrentToken;
+  if (Style.isTableGen()) {
+if (CurrentToken->isOneOf(tok::comma, tok::minus, tok::ellipsis)) {
+  // '-' and '...' appears as a separator in slice.
+  next();
+} else {
+  // In TableGen there must be a list of Values in square brackets.
+  // It must be ValueList or SliceElements.
+  if (!parseTableGenValue())
+return false;
+}
+updateParameterCount(Left, Tok);
+continue;
+  }
   if (!consumeToken())
 return false;
   updateParameterCount(Left, Tok);
 }
 return false;
   }
 
+  void nextTableGenNonComment() {
+next();
+while (CurrentToken && CurrentToken->is(tok::comment))
+  next();
+  }
+
+  bool parseTableGenValue(bool ParseNameMode = false) {
+if (!CurrentToken)
+  return false;
+while (CurrentToken->is(tok::comment))
+  next();
+if (!parseTableGenSimpleValue())
+  return false;
+if (!CurrentToken)
+  return true;
+// Value "#" [Value]
+if (CurrentToken->is(tok::hash)) {
+  if (CurrentToken->Next &&
+  CurrentToken->Next->isOneOf(tok::colon, tok::semi, tok::l_brace)) {
+// Trailing paste operator.
+// These are only the allowed cases in TGParser::ParseValue().
+CurrentToken->setType(TT_TableGenTrailingPasteOperator);
+next();
+return true;
+  }
+  FormatToken *HashTok = CurrentToken;
+  nextTableGenNonComment();
+  HashTok->setType(TT_Unknown);
+  if (!parseTableGenValue(ParseNameMode))
+return false;
+}
+// In name mode, '{' is regarded as the end of the value.
+// See TGParser::ParseValue in TGParser.cpp
+if (ParseNameMode && CurrentToken->is(tok::l_brace))
+  return true;
+if (CurrentToken->isOneOf(tok::l_brace, tok::l_square, tok::period)) {
+  // Delegate ValueSuffix to normal consumeToken
+  CurrentToken->setType(TT_TableGenValueSuffix);
+  FormatToken *Suffix = CurrentToken;
+  nextTableGenNonComment();
+  if (Suffix->is(tok::l_square)) {
+return parseSquare();
+  } else if (Suffix->is(tok::l_brace)) {
+Scopes.push_back(getScopeType(*Suffix));
+return parseBrace();
+  }
+  return true;

hnakamura5 wrote:

Removed this.

https://github.com/llvm/llvm-project/pull/80299


[clang] [clang-format] Support of TableGen tokens with unary operator like form, bang operators and numeric literals. (PR #78996)

2024-01-29 Thread Hirofumi Nakamura via cfe-commits


@@ -276,13 +276,44 @@ void FormatTokenLexer::tryMergePreviousTokens() {
   return;
 }
   }
-  // TableGen's Multi line string starts with [{
-  if (Style.isTableGen() && tryMergeTokens({tok::l_square, tok::l_brace},
-   TT_TableGenMultiLineString)) {
-// Set again with finalizing. This must never be annotated as other types.
-Tokens.back()->setFinalizedType(TT_TableGenMultiLineString);
-Tokens.back()->Tok.setKind(tok::string_literal);
-return;
+  if (Style.isTableGen()) {
+// TableGen's Multi line string starts with [{
+if (tryMergeTokens({tok::l_square, tok::l_brace},
+   TT_TableGenMultiLineString)) {
+  // Set again with finalizing. This must never be annotated as other types.
+  Tokens.back()->setFinalizedType(TT_TableGenMultiLineString);
+  Tokens.back()->Tok.setKind(tok::string_literal);
+  return;
+}
+// TableGen's bang operator is the form !<name>.
+// !cond is a special case with specific syntax.
+if (tryMergeTokens({tok::exclaim, tok::identifier},
+   TT_TableGenBangOperator)) {
+  Tokens.back()->Tok.setKind(tok::identifier);
+  Tokens.back()->Tok.setIdentifierInfo(nullptr);
+  if (Tokens.back()->TokenText == "!cond")
+Tokens.back()->setFinalizedType(TT_TableGenCondOperator);
+  else
+Tokens.back()->setFinalizedType(TT_TableGenBangOperator);
+  return;
+}
+if (tryMergeTokens({tok::exclaim, tok::kw_if}, TT_TableGenBangOperator)) {
+  // Here, "! if" becomes "!if".  That is, ! captures if even when the space
+  // exists. That is only one possibility in TableGen's syntax.
+  Tokens.back()->Tok.setKind(tok::identifier);
+  Tokens.back()->Tok.setIdentifierInfo(nullptr);
+  Tokens.back()->setFinalizedType(TT_TableGenBangOperator);
+  return;
+}
+// +, - with numbers are literals. Not unary operators.
+if (tryMergeTokens({tok::plus, tok::numeric_constant}, TT_Unknown)) {
+  Tokens.back()->Tok.setKind(tok::numeric_constant);
+  return;

hnakamura5 wrote:

@mydeveloperday 
Thank you for reviewing this pull request.
It has been one week since this PR was opened, and I would like to continue 
while the details are still fresh.
Would you mind approving it or adding further suggestions?
If you do not intend to do either, may I request another reviewer?

https://github.com/llvm/llvm-project/pull/78996


[clang] [clang-format] TableGen keywords support. (PR #77477)

2024-01-13 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Thank you very much!

https://github.com/llvm/llvm-project/pull/77477


[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-13 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 created 
https://github.com/llvm/llvm-project/pull/78032

Support handling of TableGen's multiline string (code) literal, which has the 
form
[{ this is the string, possibly spanning multiple lines... }]

This part is split out from https://github.com/llvm/llvm-project/pull/76059.
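
To illustrate the literal form outside of clang-format, the following is a minimal standalone sketch that extracts such a literal from a buffer. `extractCodeLiteral` is a hypothetical helper written for this example with `std::string_view`; it is not the lexer code in this patch, which works on the lexer's buffer and resets the lexer position afterwards.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, not part of clang-format: given a buffer and the
// offset of an opening "[{", return the whole literal including both
// delimiters, or std::nullopt when there is no opener at that offset or no
// closing "}]" after it.
std::optional<std::string_view> extractCodeLiteral(std::string_view Buffer,
                                                   std::size_t OpenOffset) {
  if (OpenOffset > Buffer.size() || Buffer.substr(OpenOffset, 2) != "[{")
    return std::nullopt;
  // The first "}]" after the opener closes the literal; there is no nesting.
  std::size_t CloseOffset = Buffer.find("}]", OpenOffset + 2);
  if (CloseOffset == std::string_view::npos)
    return std::nullopt;
  return Buffer.substr(OpenOffset, CloseOffset + 2 - OpenOffset);
}
```

Called on `"let x = [{ a\n b }]; def"` with offset 8, this yields the text from `[{` through `}]`. An unterminated literal yields `std::nullopt`.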

>From d0767350f26215e86dee039427183630b3f02668 Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Sat, 13 Jan 2024 21:44:34 +0900
Subject: [PATCH] [clang-format] TableGen multi line string support.

---
 clang/lib/Format/ContinuationIndenter.cpp |  3 +
 clang/lib/Format/FormatToken.h|  1 +
 clang/lib/Format/FormatTokenLexer.cpp | 57 +++
 clang/lib/Format/FormatTokenLexer.h   |  3 +
 clang/lib/Format/TokenAnnotator.cpp   |  2 +-
 clang/unittests/Format/TokenAnnotatorTest.cpp |  5 ++
 6 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Format/ContinuationIndenter.cpp b/clang/lib/Format/ContinuationIndenter.cpp
index 102504182c4505..e6eaaa9ab45706 100644
--- a/clang/lib/Format/ContinuationIndenter.cpp
+++ b/clang/lib/Format/ContinuationIndenter.cpp
@@ -1591,6 +1591,9 @@ unsigned ContinuationIndenter::moveStateToNextToken(LineState &State,
 State.StartOfStringLiteral = State.Column + 1;
   if (Current.is(TT_CSharpStringLiteral) && State.StartOfStringLiteral == 0) {
 State.StartOfStringLiteral = State.Column + 1;
+  } else if (Current.is(TT_TableGenMultiLineString) &&
+ State.StartOfStringLiteral == 0) {
+State.StartOfStringLiteral = State.Column + 1;
   } else if (Current.isStringLiteral() && State.StartOfStringLiteral == 0) {
 State.StartOfStringLiteral = State.Column;
   } else if (!Current.isOneOf(tok::comment, tok::identifier, tok::hash) &&
diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index d5ef627f1348d3..dede89f2600150 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -148,6 +148,7 @@ namespace format {
   TYPE(StructLBrace) \
   TYPE(StructRBrace) \
   TYPE(StructuredBindingLSquare) \
+  TYPE(TableGenMultiLineString) \
   TYPE(TemplateCloser) \
   TYPE(TemplateOpener) \
   TYPE(TemplateString) \
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index a1fd6dd6effe6c..1060009bdcf131 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -93,6 +93,8 @@ ArrayRef<FormatToken *> FormatTokenLexer::lex() {
   // string literals are correctly identified.
   handleCSharpVerbatimAndInterpolatedStrings();
 }
+if (Style.isTableGen())
+  handleTableGenMultilineString();
 if (Tokens.back()->NewlinesBefore > 0 || Tokens.back()->IsMultiline)
   FirstInLineIndex = Tokens.size() - 1;
   } while (Tokens.back()->isNot(tok::eof));
@@ -272,6 +274,14 @@ void FormatTokenLexer::tryMergePreviousTokens() {
   return;
 }
   }
+  if (Style.isTableGen()) {
+if (tryMergeTokens({tok::l_square, tok::l_brace},
+   TT_TableGenMultiLineString)) {
+  // Multi line string starts with [{
+  Tokens.back()->Tok.setKind(tok::string_literal);
+  return;
+}
+  }
 }
 
 bool FormatTokenLexer::tryMergeNSStringLiteral() {
@@ -763,6 +773,53 @@ void FormatTokenLexer::handleCSharpVerbatimAndInterpolatedStrings() {
   resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Offset + 1)));
 }
 
+void FormatTokenLexer::handleTableGenMultilineString() {
+  FormatToken *MultiLineString = Tokens.back();
+  if (MultiLineString->isNot(TT_TableGenMultiLineString))
+return;
+
+  bool PrevIsRBrace = false;
+  const char *FirstBreak = nullptr;
+  const char *LastBreak = nullptr;
+  const char *Begin = MultiLineString->TokenText.begin();
+  // Skip until }], the closer of multi line string found.
+  for (const char *Current = Begin, *End = Lex->getBuffer().end();
+   Current != End; ++Current) {
+if (PrevIsRBrace && *Current == ']') {
+  // }] is the end of multi line string.
+  if (!FirstBreak)
+FirstBreak = Current;
+  MultiLineString->TokenText = StringRef(Begin, Current - Begin + 1);
+  // ColumnWidth is only the width of the first line.
+  MultiLineString->ColumnWidth = encoding::columnWidthWithTabs(
+  StringRef(Begin, FirstBreak - Begin + 1),
+  MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
+  if (LastBreak) {
+// Set LastLineColumnWidth if multi line string has multiple lines.
+MultiLineString->LastLineColumnWidth = encoding::columnWidthWithTabs(
+StringRef(Last

[clang] [clang-format] Support of TableGen formatting. (PR #76059)

2024-01-13 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

The multiline string part is in https://github.com/llvm/llvm-project/pull/78032

https://github.com/llvm/llvm-project/pull/76059


[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-13 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/78032

>From d0767350f26215e86dee039427183630b3f02668 Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Sat, 13 Jan 2024 21:44:34 +0900
Subject: [PATCH 1/2] [clang-format] TableGen multi line string support.

---
 clang/lib/Format/ContinuationIndenter.cpp |  3 +
 clang/lib/Format/FormatToken.h|  1 +
 clang/lib/Format/FormatTokenLexer.cpp | 57 +++
 clang/lib/Format/FormatTokenLexer.h   |  3 +
 clang/lib/Format/TokenAnnotator.cpp   |  2 +-
 clang/unittests/Format/TokenAnnotatorTest.cpp |  5 ++
 6 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Format/ContinuationIndenter.cpp b/clang/lib/Format/ContinuationIndenter.cpp
index 102504182c4505..e6eaaa9ab45706 100644
--- a/clang/lib/Format/ContinuationIndenter.cpp
+++ b/clang/lib/Format/ContinuationIndenter.cpp
@@ -1591,6 +1591,9 @@ unsigned ContinuationIndenter::moveStateToNextToken(LineState &State,
 State.StartOfStringLiteral = State.Column + 1;
   if (Current.is(TT_CSharpStringLiteral) && State.StartOfStringLiteral == 0) {
 State.StartOfStringLiteral = State.Column + 1;
+  } else if (Current.is(TT_TableGenMultiLineString) &&
+ State.StartOfStringLiteral == 0) {
+State.StartOfStringLiteral = State.Column + 1;
   } else if (Current.isStringLiteral() && State.StartOfStringLiteral == 0) {
 State.StartOfStringLiteral = State.Column;
   } else if (!Current.isOneOf(tok::comment, tok::identifier, tok::hash) &&
diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index d5ef627f1348d3..dede89f2600150 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -148,6 +148,7 @@ namespace format {
   TYPE(StructLBrace) \
   TYPE(StructRBrace) \
   TYPE(StructuredBindingLSquare) \
+  TYPE(TableGenMultiLineString) \
   TYPE(TemplateCloser) \
   TYPE(TemplateOpener) \
   TYPE(TemplateString) \
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index a1fd6dd6effe6c..1060009bdcf131 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -93,6 +93,8 @@ ArrayRef<FormatToken *> FormatTokenLexer::lex() {
   // string literals are correctly identified.
   handleCSharpVerbatimAndInterpolatedStrings();
 }
+if (Style.isTableGen())
+  handleTableGenMultilineString();
 if (Tokens.back()->NewlinesBefore > 0 || Tokens.back()->IsMultiline)
   FirstInLineIndex = Tokens.size() - 1;
   } while (Tokens.back()->isNot(tok::eof));
@@ -272,6 +274,14 @@ void FormatTokenLexer::tryMergePreviousTokens() {
   return;
 }
   }
+  if (Style.isTableGen()) {
+if (tryMergeTokens({tok::l_square, tok::l_brace},
+   TT_TableGenMultiLineString)) {
+  // Multi line string starts with [{
+  Tokens.back()->Tok.setKind(tok::string_literal);
+  return;
+}
+  }
 }
 
 bool FormatTokenLexer::tryMergeNSStringLiteral() {
@@ -763,6 +773,53 @@ void FormatTokenLexer::handleCSharpVerbatimAndInterpolatedStrings() {
   resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Offset + 1)));
 }
 
+void FormatTokenLexer::handleTableGenMultilineString() {
+  FormatToken *MultiLineString = Tokens.back();
+  if (MultiLineString->isNot(TT_TableGenMultiLineString))
+return;
+
+  bool PrevIsRBrace = false;
+  const char *FirstBreak = nullptr;
+  const char *LastBreak = nullptr;
+  const char *Begin = MultiLineString->TokenText.begin();
+  // Skip until }], the closer of multi line string found.
+  for (const char *Current = Begin, *End = Lex->getBuffer().end();
+   Current != End; ++Current) {
+if (PrevIsRBrace && *Current == ']') {
+  // }] is the end of multi line string.
+  if (!FirstBreak)
+FirstBreak = Current;
+  MultiLineString->TokenText = StringRef(Begin, Current - Begin + 1);
+  // ColumnWidth is only the width of the first line.
+  MultiLineString->ColumnWidth = encoding::columnWidthWithTabs(
+  StringRef(Begin, FirstBreak - Begin + 1),
+  MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
+  if (LastBreak) {
+// Set LastLineColumnWidth if multi line string has multiple lines.
+MultiLineString->LastLineColumnWidth = encoding::columnWidthWithTabs(
+StringRef(LastBreak + 1, Current - LastBreak),
+MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
+  }
+  resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Current + 1)));
+  return;
+}

[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-13 Thread Hirofumi Nakamura via cfe-commits


@@ -272,6 +274,14 @@ void FormatTokenLexer::tryMergePreviousTokens() {
   return;
 }
   }
+  if (Style.isTableGen()) {
+if (tryMergeTokens({tok::l_square, tok::l_brace},

hnakamura5 wrote:

Followed this suggestion.

https://github.com/llvm/llvm-project/pull/78032


[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-13 Thread Hirofumi Nakamura via cfe-commits


@@ -2193,6 +2193,11 @@ TEST_F(TokenAnnotatorTest, UnderstandTableGenTokens) {
   ASSERT_TRUE(Keywords.isTableGenDefinition(*Tokens[0]));
   ASSERT_TRUE(Tokens[0]->is(Keywords.kw_def));
   ASSERT_TRUE(Tokens[1]->is(TT_StartOfName));
+
+  // Code, the multiline string token.

hnakamura5 wrote:

Added the multiline case.

https://github.com/llvm/llvm-project/pull/78032


[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-13 Thread Hirofumi Nakamura via cfe-commits


@@ -1710,7 +1710,7 @@ class AnnotatingParser {
 TT_UnionLBrace, TT_RequiresClause,
 TT_RequiresClauseInARequiresExpression, TT_RequiresExpression,
 TT_RequiresExpressionLParen, TT_RequiresExpressionLBrace,
-TT_BracedListLBrace)) {
+TT_BracedListLBrace, TT_TableGenMultiLineString)) {

hnakamura5 wrote:

Modified to finalize the type as soon as the token is found.

https://github.com/llvm/llvm-project/pull/78032


[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-13 Thread Hirofumi Nakamura via cfe-commits


@@ -763,6 +773,53 @@ void FormatTokenLexer::handleCSharpVerbatimAndInterpolatedStrings() {
   resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Offset + 1)));
 }
 
+void FormatTokenLexer::handleTableGenMultilineString() {
+  FormatToken *MultiLineString = Tokens.back();
+  if (MultiLineString->isNot(TT_TableGenMultiLineString))
+return;
+
+  bool PrevIsRBrace = false;
+  const char *FirstBreak = nullptr;
+  const char *LastBreak = nullptr;
+  const char *Begin = MultiLineString->TokenText.begin();
+  // Skip until }], the closer of multi line string found.
+  for (const char *Current = Begin, *End = Lex->getBuffer().end();

hnakamura5 wrote:

Changed the algorithm to use find and rfind as suggested. The new version 
looks better.
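
The idea can be sketched in isolation as follows. `firstAndLastLine` is a hypothetical helper for this example that uses `std::string_view` rather than `llvm::StringRef`; unlike the patch, it leaves the newline out of the first line, and it only shows how `find` and `rfind` locate the two line boundaries that the column widths are computed from.

```cpp
#include <cstddef>
#include <string_view>
#include <utility>

// Hypothetical helper: return the first and the last line of Text.
// For single-line text both halves are the whole text.
std::pair<std::string_view, std::string_view>
firstAndLastLine(std::string_view Text) {
  std::size_t FirstBreak = Text.find('\n');
  if (FirstBreak == std::string_view::npos)
    return {Text, Text};
  // rfind cannot fail here because at least one '\n' exists.
  std::size_t LastBreak = Text.rfind('\n');
  // substr with a single argument runs to the end of the text.
  return {Text.substr(0, FirstBreak), Text.substr(LastBreak + 1)};
}
```

Compared with a hand-written character loop, this needs no state such as a "previous character was `}`" flag, and the single-argument `substr` avoids passing a redundant length.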

https://github.com/llvm/llvm-project/pull/78032


[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-15 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/78032

>From d0767350f26215e86dee039427183630b3f02668 Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Sat, 13 Jan 2024 21:44:34 +0900
Subject: [PATCH 1/3] [clang-format] TableGen multi line string support.

---
 clang/lib/Format/ContinuationIndenter.cpp |  3 +
 clang/lib/Format/FormatToken.h|  1 +
 clang/lib/Format/FormatTokenLexer.cpp | 57 +++
 clang/lib/Format/FormatTokenLexer.h   |  3 +
 clang/lib/Format/TokenAnnotator.cpp   |  2 +-
 clang/unittests/Format/TokenAnnotatorTest.cpp |  5 ++
 6 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Format/ContinuationIndenter.cpp b/clang/lib/Format/ContinuationIndenter.cpp
index 102504182c4505..e6eaaa9ab45706 100644
--- a/clang/lib/Format/ContinuationIndenter.cpp
+++ b/clang/lib/Format/ContinuationIndenter.cpp
@@ -1591,6 +1591,9 @@ unsigned ContinuationIndenter::moveStateToNextToken(LineState &State,
 State.StartOfStringLiteral = State.Column + 1;
   if (Current.is(TT_CSharpStringLiteral) && State.StartOfStringLiteral == 0) {
 State.StartOfStringLiteral = State.Column + 1;
+  } else if (Current.is(TT_TableGenMultiLineString) &&
+ State.StartOfStringLiteral == 0) {
+State.StartOfStringLiteral = State.Column + 1;
   } else if (Current.isStringLiteral() && State.StartOfStringLiteral == 0) {
 State.StartOfStringLiteral = State.Column;
   } else if (!Current.isOneOf(tok::comment, tok::identifier, tok::hash) &&
diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index d5ef627f1348d3..dede89f2600150 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -148,6 +148,7 @@ namespace format {
   TYPE(StructLBrace) \
   TYPE(StructRBrace) \
   TYPE(StructuredBindingLSquare) \
+  TYPE(TableGenMultiLineString) \
   TYPE(TemplateCloser) \
   TYPE(TemplateOpener) \
   TYPE(TemplateString) \
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index a1fd6dd6effe6c..1060009bdcf131 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -93,6 +93,8 @@ ArrayRef<FormatToken *> FormatTokenLexer::lex() {
   // string literals are correctly identified.
   handleCSharpVerbatimAndInterpolatedStrings();
 }
+if (Style.isTableGen())
+  handleTableGenMultilineString();
 if (Tokens.back()->NewlinesBefore > 0 || Tokens.back()->IsMultiline)
   FirstInLineIndex = Tokens.size() - 1;
   } while (Tokens.back()->isNot(tok::eof));
@@ -272,6 +274,14 @@ void FormatTokenLexer::tryMergePreviousTokens() {
   return;
 }
   }
+  if (Style.isTableGen()) {
+if (tryMergeTokens({tok::l_square, tok::l_brace},
+   TT_TableGenMultiLineString)) {
+  // Multi line string starts with [{
+  Tokens.back()->Tok.setKind(tok::string_literal);
+  return;
+}
+  }
 }
 
 bool FormatTokenLexer::tryMergeNSStringLiteral() {
@@ -763,6 +773,53 @@ void FormatTokenLexer::handleCSharpVerbatimAndInterpolatedStrings() {
   resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Offset + 1)));
 }
 
+void FormatTokenLexer::handleTableGenMultilineString() {
+  FormatToken *MultiLineString = Tokens.back();
+  if (MultiLineString->isNot(TT_TableGenMultiLineString))
+return;
+
+  bool PrevIsRBrace = false;
+  const char *FirstBreak = nullptr;
+  const char *LastBreak = nullptr;
+  const char *Begin = MultiLineString->TokenText.begin();
+  // Skip until }], the closer of multi line string found.
+  for (const char *Current = Begin, *End = Lex->getBuffer().end();
+   Current != End; ++Current) {
+if (PrevIsRBrace && *Current == ']') {
+  // }] is the end of multi line string.
+  if (!FirstBreak)
+FirstBreak = Current;
+  MultiLineString->TokenText = StringRef(Begin, Current - Begin + 1);
+  // ColumnWidth is only the width of the first line.
+  MultiLineString->ColumnWidth = encoding::columnWidthWithTabs(
+  StringRef(Begin, FirstBreak - Begin + 1),
+  MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
+  if (LastBreak) {
+// Set LastLineColumnWidth if multi line string has multiple lines.
+MultiLineString->LastLineColumnWidth = encoding::columnWidthWithTabs(
+StringRef(LastBreak + 1, Current - LastBreak),
+MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
+  }
+  resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Current + 1)));
+  return;
+}

[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-15 Thread Hirofumi Nakamura via cfe-commits


@@ -778,45 +778,31 @@ void FormatTokenLexer::handleTableGenMultilineString() {
   if (MultiLineString->isNot(TT_TableGenMultiLineString))
 return;
 
-  bool PrevIsRBrace = false;
-  const char *FirstBreak = nullptr;
-  const char *LastBreak = nullptr;
-  const char *Begin = MultiLineString->TokenText.begin();
-  // Skip until }], the closer of multi line string found.
-  for (const char *Current = Begin, *End = Lex->getBuffer().end();
-   Current != End; ++Current) {
-if (PrevIsRBrace && *Current == ']') {
-  // }] is the end of multi line string.
-  if (!FirstBreak)
-FirstBreak = Current;
-  MultiLineString->TokenText = StringRef(Begin, Current - Begin + 1);
-  // ColumnWidth is only the width of the first line.
-  MultiLineString->ColumnWidth = encoding::columnWidthWithTabs(
-  StringRef(Begin, FirstBreak - Begin + 1),
-  MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
-  if (LastBreak) {
-// Set LastLineColumnWidth if multi line string has multiple lines.
-MultiLineString->LastLineColumnWidth = encoding::columnWidthWithTabs(
-StringRef(LastBreak + 1, Current - LastBreak),
-MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
-  }
-  resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Current + 1)));
-  return;
-}
-PrevIsRBrace = false;
-if (*Current == '\n') {
-  MultiLineString->IsMultiline = true;
-  // Assure LastBreak is not equal to FirstBreak.
-  if (!FirstBreak)
-FirstBreak = Current;
-  LastBreak = Current;
-  continue;
-}
-if (*Current == '}') {
-  // Memorize '}'. If next character is ']', they are the closer.
-  PrevIsRBrace = true;
-  continue;
-}
+  auto OpenOffset = Lex->getCurrentBufferOffset() - 2 /* "[{" */;
+  // "}]" is the end of multi line string.
+  auto CloseOffset = Lex->getBuffer().find("}]", OpenOffset);
+  if (CloseOffset == StringRef::npos)
+return;
+  auto Text = Lex->getBuffer().substr(OpenOffset, CloseOffset + 2);
+  MultiLineString->TokenText = Text;
+  resetLexer(SourceMgr.getFileOffset(
+  Lex->getSourceLocation(Lex->getBufferLocation() - 2 + Text.size())));
+  // Set ColumnWidth and LastLineColumnWidth.
+  auto FirstLineText = Text;
+  auto FirstBreak = Text.find('\n');
+  if (FirstBreak != StringRef::npos) {
+MultiLineString->IsMultiline = true;
+FirstLineText = Text.substr(0, FirstBreak + 1);
+  }
+  // ColumnWidth holds only the width of the first line.
+  MultiLineString->ColumnWidth = encoding::columnWidthWithTabs(
+  FirstLineText, MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
+  auto LastBreak = Text.rfind('\n');
+  if (LastBreak != StringRef::npos) {
+// Set LastLineColumnWidth if it has multiple lines.
+MultiLineString->LastLineColumnWidth = encoding::columnWidthWithTabs(
+Text.substr(LastBreak + 1, Text.size()),

hnakamura5 wrote:

Text.size() was redundant, as you say. I removed it.

https://github.com/llvm/llvm-project/pull/78032


[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-15 Thread Hirofumi Nakamura via cfe-commits


@@ -778,45 +778,31 @@ void FormatTokenLexer::handleTableGenMultilineString() {
   if (MultiLineString->isNot(TT_TableGenMultiLineString))
 return;
 
-  bool PrevIsRBrace = false;
-  const char *FirstBreak = nullptr;
-  const char *LastBreak = nullptr;
-  const char *Begin = MultiLineString->TokenText.begin();
-  // Skip until }], the closer of multi line string found.
-  for (const char *Current = Begin, *End = Lex->getBuffer().end();
-   Current != End; ++Current) {
-if (PrevIsRBrace && *Current == ']') {
-  // }] is the end of multi line string.
-  if (!FirstBreak)
-FirstBreak = Current;
-  MultiLineString->TokenText = StringRef(Begin, Current - Begin + 1);
-  // ColumnWidth is only the width of the first line.
-  MultiLineString->ColumnWidth = encoding::columnWidthWithTabs(
-  StringRef(Begin, FirstBreak - Begin + 1),
-  MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
-  if (LastBreak) {
-// Set LastLineColumnWidth if multi line string has multiple lines.
-MultiLineString->LastLineColumnWidth = encoding::columnWidthWithTabs(
-StringRef(LastBreak + 1, Current - LastBreak),
-MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
-  }
-  resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Current + 1)));
-  return;
-}
-PrevIsRBrace = false;
-if (*Current == '\n') {
-  MultiLineString->IsMultiline = true;
-  // Assure LastBreak is not equal to FirstBreak.
-  if (!FirstBreak)
-FirstBreak = Current;
-  LastBreak = Current;
-  continue;
-}
-if (*Current == '}') {
-  // Memorize '}'. If next character is ']', they are the closer.
-  PrevIsRBrace = true;
-  continue;
-}
+  auto OpenOffset = Lex->getCurrentBufferOffset() - 2 /* "[{" */;
+  // "}]" is the end of multi line string.
+  auto CloseOffset = Lex->getBuffer().find("}]", OpenOffset);
+  if (CloseOffset == StringRef::npos)
+return;
+  auto Text = Lex->getBuffer().substr(OpenOffset, CloseOffset + 2);
+  MultiLineString->TokenText = Text;
+  resetLexer(SourceMgr.getFileOffset(
+  Lex->getSourceLocation(Lex->getBufferLocation() - 2 + Text.size())));
+  // Set ColumnWidth and LastLineColumnWidth.
+  auto FirstLineText = Text;
+  auto FirstBreak = Text.find('\n');
+  if (FirstBreak != StringRef::npos) {
+MultiLineString->IsMultiline = true;
+FirstLineText = Text.substr(0, FirstBreak + 1);
+  }
+  // ColumnWidth holds only the width of the first line.
+  MultiLineString->ColumnWidth = encoding::columnWidthWithTabs(
+  FirstLineText, MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
+  auto LastBreak = Text.rfind('\n');

hnakamura5 wrote:

Modified as suggested.

https://github.com/llvm/llvm-project/pull/78032


[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-15 Thread Hirofumi Nakamura via cfe-commits


@@ -274,13 +274,13 @@ void FormatTokenLexer::tryMergePreviousTokens() {
   return;
 }
   }
-  if (Style.isTableGen()) {
-if (tryMergeTokens({tok::l_square, tok::l_brace},
-   TT_TableGenMultiLineString)) {
-  // Multi line string starts with [{
-  Tokens.back()->Tok.setKind(tok::string_literal);
-  return;
-}
+  // TableGen's Multi line string starts with [{
+  if (Style.isTableGen() && tryMergeTokens({tok::l_square, tok::l_brace},
+   TT_TableGenMultiLineString)) {
+// This must never be annotated as other types.
+Tokens.back()->setTypeIsFinalized();

hnakamura5 wrote:

This is to make clear that the intention is only to finalize the merged token, 
not to set its type to something else.
What do you think?

I know the two approaches do not differ much. If you regard 
`setTypeIsFinalized()` as an unsuitable API, I will change it.
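
To make the two options concrete, here is a toy model of the difference. `ToyToken` and `ToyTokenType` are invented for this sketch and are not the real `FormatToken`; in particular, silently ignoring `setType` after finalization is an assumption of the model, not a statement about how clang-format behaves.

```cpp
// Toy model only; not clang::format::FormatToken.
enum class ToyTokenType { Unknown, MultiLineString, ArraySubscript };

class ToyToken {
  ToyTokenType Type = ToyTokenType::Unknown;
  bool Finalized = false;

public:
  ToyTokenType getType() const { return Type; }

  // Assumed behaviour for this sketch: later annotation passes may call
  // setType, but the call has no effect once the type is finalized.
  void setType(ToyTokenType T) {
    if (!Finalized)
      Type = T;
  }

  // Option 1: set the type and finalize it in one call.
  void setFinalizedType(ToyTokenType T) {
    Type = T;
    Finalized = true;
  }

  // Option 2: keep whatever type is already set and only finalize it.
  void setTypeIsFinalized() { Finalized = true; }
};
```

Either way the token ends up with the same type and the same finalized flag. The second form only states that the type set during merging is being kept.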

https://github.com/llvm/llvm-project/pull/78032


[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-15 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 edited 
https://github.com/llvm/llvm-project/pull/78032


[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-16 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/78032

>From d0767350f26215e86dee039427183630b3f02668 Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Sat, 13 Jan 2024 21:44:34 +0900
Subject: [PATCH 1/4] [clang-format] TableGen multi line string support.

---
 clang/lib/Format/ContinuationIndenter.cpp |  3 +
 clang/lib/Format/FormatToken.h|  1 +
 clang/lib/Format/FormatTokenLexer.cpp | 57 +++
 clang/lib/Format/FormatTokenLexer.h   |  3 +
 clang/lib/Format/TokenAnnotator.cpp   |  2 +-
 clang/unittests/Format/TokenAnnotatorTest.cpp |  5 ++
 6 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Format/ContinuationIndenter.cpp b/clang/lib/Format/ContinuationIndenter.cpp
index 102504182c4505..e6eaaa9ab45706 100644
--- a/clang/lib/Format/ContinuationIndenter.cpp
+++ b/clang/lib/Format/ContinuationIndenter.cpp
@@ -1591,6 +1591,9 @@ unsigned ContinuationIndenter::moveStateToNextToken(LineState &State,
 State.StartOfStringLiteral = State.Column + 1;
   if (Current.is(TT_CSharpStringLiteral) && State.StartOfStringLiteral == 0) {
 State.StartOfStringLiteral = State.Column + 1;
+  } else if (Current.is(TT_TableGenMultiLineString) &&
+ State.StartOfStringLiteral == 0) {
+State.StartOfStringLiteral = State.Column + 1;
   } else if (Current.isStringLiteral() && State.StartOfStringLiteral == 0) {
 State.StartOfStringLiteral = State.Column;
   } else if (!Current.isOneOf(tok::comment, tok::identifier, tok::hash) &&
diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index d5ef627f1348d3..dede89f2600150 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -148,6 +148,7 @@ namespace format {
   TYPE(StructLBrace) \
   TYPE(StructRBrace) \
   TYPE(StructuredBindingLSquare) \
+  TYPE(TableGenMultiLineString) \
   TYPE(TemplateCloser) \
   TYPE(TemplateOpener) \
   TYPE(TemplateString) \
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index a1fd6dd6effe6c..1060009bdcf131 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -93,6 +93,8 @@ ArrayRef<FormatToken *> FormatTokenLexer::lex() {
       // string literals are correctly identified.
       handleCSharpVerbatimAndInterpolatedStrings();
     }
+    if (Style.isTableGen())
+      handleTableGenMultilineString();
     if (Tokens.back()->NewlinesBefore > 0 || Tokens.back()->IsMultiline)
       FirstInLineIndex = Tokens.size() - 1;
   } while (Tokens.back()->isNot(tok::eof));
@@ -272,6 +274,14 @@ void FormatTokenLexer::tryMergePreviousTokens() {
       return;
     }
   }
+  if (Style.isTableGen()) {
+    if (tryMergeTokens({tok::l_square, tok::l_brace},
+                       TT_TableGenMultiLineString)) {
+      // Multi line string starts with [{
+      Tokens.back()->Tok.setKind(tok::string_literal);
+      return;
+    }
+  }
 }
 
 bool FormatTokenLexer::tryMergeNSStringLiteral() {
@@ -763,6 +773,53 @@ void FormatTokenLexer::handleCSharpVerbatimAndInterpolatedStrings() {
   resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Offset + 1)));
 }
 
+void FormatTokenLexer::handleTableGenMultilineString() {
+  FormatToken *MultiLineString = Tokens.back();
+  if (MultiLineString->isNot(TT_TableGenMultiLineString))
+    return;
+
+  bool PrevIsRBrace = false;
+  const char *FirstBreak = nullptr;
+  const char *LastBreak = nullptr;
+  const char *Begin = MultiLineString->TokenText.begin();
+  // Skip until }], the closer of multi line string found.
+  for (const char *Current = Begin, *End = Lex->getBuffer().end();
+       Current != End; ++Current) {
+    if (PrevIsRBrace && *Current == ']') {
+      // }] is the end of multi line string.
+      if (!FirstBreak)
+        FirstBreak = Current;
+      MultiLineString->TokenText = StringRef(Begin, Current - Begin + 1);
+      // ColumnWidth is only the width of the first line.
+      MultiLineString->ColumnWidth = encoding::columnWidthWithTabs(
+          StringRef(Begin, FirstBreak - Begin + 1),
+          MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
+      if (LastBreak) {
+        // Set LastLineColumnWidth if multi line string has multiple lines.
+        MultiLineString->LastLineColumnWidth = encoding::columnWidthWithTabs(
+            StringRef(LastBreak + 1, Current - LastBreak),
+            MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
+      }
+      resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Current + 1)));
+      return;
+    }
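
The scan for the closing `}]` can be sketched in isolation. The following is a minimal standalone illustration, not the patch's code: `MultiLineStringSpan` and `scanTableGenMultiLineString` are hypothetical names, and widths are counted in plain characters with the newline excluded, whereas the patch goes through `encoding::columnWidthWithTabs`.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical result type for illustration; not part of clang-format.
struct MultiLineStringSpan {
  std::size_t Length;          // Whole literal, including "[{" and "}]".
  std::size_t FirstLineLength; // Characters before the first '\n'.
  std::size_t LastLineLength;  // Characters after the last '\n'.
  bool IsMultiline;            // True if the literal contains a '\n'.
};

// Scans Text, which is assumed to start at "[{", for the first "}]".
// Returns std::nullopt if the opener is missing or the literal is unterminated.
std::optional<MultiLineStringSpan>
scanTableGenMultiLineString(std::string_view Text) {
  if (Text.substr(0, 2) != "[{")
    return std::nullopt;
  const std::size_t Close = Text.find("}]", 2);
  if (Close == std::string_view::npos)
    return std::nullopt;

  // The literal runs from the opener through the closing "}]".
  const std::string_view Literal = Text.substr(0, Close + 2);
  const std::size_t FirstBreak = Literal.find('\n');
  const std::size_t LastBreak = Literal.rfind('\n');

  MultiLineStringSpan Span;
  Span.Length = Literal.size();
  Span.IsMultiline = FirstBreak != std::string_view::npos;
  // For a single-line literal, both widths are the whole literal.
  Span.FirstLineLength = Span.IsMultiline ? FirstBreak : Literal.size();
  Span.LastLineLength =
      Span.IsMultiline ? Literal.size() - LastBreak - 1 : Literal.size();
  return Span;
}
```

For the input `[{ It can break\n   the string. }]` this reports a total length of 33, a first line of 15 characters and a last line of 17 characters.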

[clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-16 Thread Hirofumi Nakamura via cfe-commits


@@ -274,13 +274,13 @@ void FormatTokenLexer::tryMergePreviousTokens() {
       return;
     }
   }
-  if (Style.isTableGen()) {
-    if (tryMergeTokens({tok::l_square, tok::l_brace},
-                       TT_TableGenMultiLineString)) {
-      // Multi line string starts with [{
-      Tokens.back()->Tok.setKind(tok::string_literal);
-      return;
-    }
+  // TableGen's Multi line string starts with [{
+  if (Style.isTableGen() && tryMergeTokens({tok::l_square, tok::l_brace},
+                                           TT_TableGenMultiLineString)) {
+    // This must never be annotated as other types.
+    Tokens.back()->setTypeIsFinalized();

hnakamura5 wrote:

Changed to use `setFinalizedType`.
I could not see what other use `setTypeIsFinalized` would have. It may be too
specific.

https://github.com/llvm/llvm-project/pull/78032
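
The difference between setting a type and finalizing it can be pictured with a toy model. This sketch assumes that "finalized" means later annotation passes may not change the type. `ToyToken`, `ToyTokenType` and their members are hypothetical and do not reproduce clang-format's `FormatToken`.

```cpp
// Hypothetical stand-ins for illustration; not clang-format's real classes.
enum class ToyTokenType { Unknown, TableGenMultiLineString, StringLiteral };

class ToyToken {
public:
  ToyTokenType type() const { return Type; }
  bool isTypeFinalized() const { return Finalized; }

  // Sets the type unless it has been finalized to a different value.
  // Returns true if the token now has type T.
  bool setType(ToyTokenType T) {
    if (Finalized && T != Type)
      return false;
    Type = T;
    return true;
  }

  // Sets the type and locks it in a single step, so there is no window in
  // which the token is finalized with the wrong type.
  void setFinalizedType(ToyTokenType T) {
    Type = T;
    Finalized = true;
  }

private:
  ToyTokenType Type = ToyTokenType::Unknown;
  bool Finalized = false;
};
```

In this model, once `setFinalizedType(ToyTokenType::TableGenMultiLineString)` has been called, a later `setType(ToyTokenType::StringLiteral)` is rejected and the type stays as it was.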


[llvm] [clang-tools-extra] [clang] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-17 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 updated 
https://github.com/llvm/llvm-project/pull/78032


[clang] [clang-tools-extra] [llvm] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-17 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 closed 
https://github.com/llvm/llvm-project/pull/78032


[clang] [clang-tools-extra] [llvm] [clang-format] TableGen multi line string support. (PR #78032)

2024-01-17 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

Thank you very much!

https://github.com/llvm/llvm-project/pull/78032


[clang] [clang-format] Support of TableGen identifiers beginning with a number. (PR #78571)

2024-01-18 Thread Hirofumi Nakamura via cfe-commits

https://github.com/hnakamura5 created 
https://github.com/llvm/llvm-project/pull/78571

TableGen allows identifiers that begin with a digit.
This patch adds support for recognizing such identifiers.

>From b472c08735b3ce3b6f7b81e499a2ef16c3faad4a Mon Sep 17 00:00:00 2001
From: hnakamura5 
Date: Thu, 18 Jan 2024 21:49:06 +0900
Subject: [PATCH] [clang-format] Support of TableGen identifiers beginning with
 a number.

---
 clang/lib/Format/FormatTokenLexer.cpp | 44 ++-
 clang/lib/Format/FormatTokenLexer.h   |  4 ++
 clang/unittests/Format/TokenAnnotatorTest.cpp | 21 +
 3 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index 25ac9be57c81a9..f1982533f112c7 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -93,8 +93,10 @@ ArrayRef<FormatToken *> FormatTokenLexer::lex() {
       // string literals are correctly identified.
       handleCSharpVerbatimAndInterpolatedStrings();
     }
-    if (Style.isTableGen())
+    if (Style.isTableGen()) {
       handleTableGenMultilineString();
+      handleTableGenNumericLikeIdentifier();
+    }
     if (Tokens.back()->NewlinesBefore > 0 || Tokens.back()->IsMultiline)
       FirstInLineIndex = Tokens.size() - 1;
   } while (Tokens.back()->isNot(tok::eof));
@@ -804,6 +806,46 @@ void FormatTokenLexer::handleTableGenMultilineString() {
       FirstLineText, MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
 }
 
+void FormatTokenLexer::handleTableGenNumericLikeIdentifier() {
+  FormatToken *Tok = Tokens.back();
+  // TableGen identifiers can begin with digits. Such tokens are lexed as
+  // numeric_constant now.
+  if (Tok->isNot(tok::numeric_constant))
+    return;
+  StringRef Text = Tok->TokenText;
+  // Identifiers cannot begin with + or -.
+  if (Text.size() < 1 || Text[0] == '+' || Text[0] == '-')
+    return;
+  // The following check is based on llvm::TGLexer::LexToken.
+  if (isdigit(Text[0])) {
+    size_t I = 0;
+    char NextChar = (char)0;
+    // Identifiers in TableGen may begin with digits. Skip to first non-digit.
+    do {
+      NextChar = Text[I++];
+    } while (I < Text.size() && isdigit(NextChar));
+    // All the characters are digits.
+    if (I >= Text.size())
+      return;
+    // Base character. But it does not check the first 0 and that the base is
+    // the second character.
+    if (NextChar == 'x' || NextChar == 'b') {
+      char NextNextChar = Text[I];
+      // This is regarded as binary number.
+      if (isxdigit(NextNextChar)) {
+        if (NextChar == 'b' && (NextNextChar == '0' || NextNextChar == '1'))
+          return;
+        // Regarded as hex number or decimal number.
+        if (NextChar == 'x' || isdigit(NextNextChar))
+          return;
+      }
+    }
+  }
+  // Otherwise, this is actually an identifier.
+  Tok->Tok.setKind(tok::identifier);
+  Tok->Tok.setIdentifierInfo(nullptr);
+}
+
 void FormatTokenLexer::handleTemplateStrings() {
   FormatToken *BacktickToken = Tokens.back();
 
diff --git a/clang/lib/Format/FormatTokenLexer.h b/clang/lib/Format/FormatTokenLexer.h
index 1dec6bbc41514c..65dd733bd53352 100644
--- a/clang/lib/Format/FormatTokenLexer.h
+++ b/clang/lib/Format/FormatTokenLexer.h
@@ -97,6 +97,10 @@ class FormatTokenLexer {
 
   // Handles TableGen multiline strings. It has the form [{ ... }].
   void handleTableGenMultilineString();
+  // Handles TableGen numeric-like identifiers.
+  // They have the form [0-9]*[_a-zA-Z]([_a-zA-Z0-9]*), but only in the case
+  // where the token is not lexed as an integer.
+  void handleTableGenNumericLikeIdentifier();
 
   void tryParsePythonComment();
 
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp b/clang/unittests/Format/TokenAnnotatorTest.cpp
index 117d8fe8f7dc12..753e749befa57e 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -2209,6 +2209,27 @@ TEST_F(TokenAnnotatorTest, UnderstandTableGenTokens) {
   EXPECT_EQ(Tokens[0]->ColumnWidth, sizeof("[{ It can break\n") - 1);
   EXPECT_TRUE(Tokens[0]->IsMultiline);
   EXPECT_EQ(Tokens[0]->LastLineColumnWidth, sizeof("   the string. }]") - 1);
+
+  // Identifier tokens. In TableGen, identifiers can begin with a number.
+  // In ambiguous cases, the lexer tries to lex it as a number.
+  // Even if the attempt fails, it does not fall back to identifier lexing and
+  // reports an error instead.
+  // The ambiguity is not documented. The results of these tests are based on
+  // the implementation of llvm::TGLexer::LexToken.
+  Tokens = Annotate("1234");
+  EXPECT_TOKEN(Tokens[0], tok::numeric_constant, TT_Unknown);
+  Tokens = Annotate("0x1abC");
+  EXPECT_TOKEN(Tokens[0], tok::numeric_constant, TT_Unknown);
+  // This is invalid syntax of number, but not an identifier.
+  Tokens = Annotate("0x1234x");
+  EXPECT_TOKEN(Tokens[0], tok::numeric_constant, TT_Unknown);
+  Tokens = Annotate("

[clang] [clang-format] Support of TableGen identifiers beginning with a number. (PR #78571)

2024-01-18 Thread Hirofumi Nakamura via cfe-commits

hnakamura5 wrote:

I checked the corner cases in this patch's unit tests with the following
simple sample.

```
// test_id.td
class 01234Vector<int i> {
  int 2dVector = 0x1abc;
  int invalid_num = 0x1234x;
  int 0x1234x = i;
}

def Def: 01234Vector<1>;
```

The following are the results of running
```
llvm-tblgen .\test_id.td
```

(1) With the whole sample code above.
```
.\test_id.td:4:27: error: expected ';' after declaration
  int invalid_num = 0x1234x;
   ^ ← This caret points to the last x.
```
We can see `0x1234x` is not lexed as a valid token.

(2) With the `invalid_num` line commented out of the sample.
```
.\test_id.td:5:7: error: Expected identifier in declaration
  int 0x1234x = i;
  ^ ← This caret points to the first 0.
```
`0x1234x` is NOT an identifier.

(3) With the `int 0x1234x = i;` line additionally commented out.
```
.\test_id.td:2:23: warning: unused template argument: 01234Vector:i
class 01234Vector<int i> {
                      ^
- Classes -
class 01234Vector {
  int 2dVector = 6844;
}
- Defs -
def Def {   // 01234Vector
  int 2dVector = 6844;
}
```
This is the output once processing completes successfully.

So the remaining code below has valid syntax.
```
class 01234Vector<int i> {
  int 2dVector = 0x1abc;
}

def Def: 01234Vector<1>;
```


https://github.com/llvm/llvm-project/pull/78571


[clang] [clang-format] Support of TableGen identifiers beginning with a number. (PR #78571)

2024-01-18 Thread Hirofumi Nakamura via cfe-commits


@@ -804,6 +806,46 @@ void FormatTokenLexer::handleTableGenMultilineString() {
       FirstLineText, MultiLineString->OriginalColumn, Style.TabWidth, Encoding);
 }
 
+void FormatTokenLexer::handleTableGenNumericLikeIdentifier() {
+  FormatToken *Tok = Tokens.back();
+  // TableGen identifiers can begin with digits. Such tokens are lexed as
+  // numeric_constant now.
+  if (Tok->isNot(tok::numeric_constant))
+    return;
+  StringRef Text = Tok->TokenText;
+  // Identifiers cannot begin with + or -.
+  if (Text.size() < 1 || Text[0] == '+' || Text[0] == '-')
+    return;
+  // The following check is based on llvm::TGLexer::LexToken.
+  if (isdigit(Text[0])) {
+    size_t I = 0;
+    char NextChar = (char)0;
+    // Identifiers in TableGen may begin with digits. Skip to first non-digit.
+    do {
+      NextChar = Text[I++];
+    } while (I < Text.size() && isdigit(NextChar));
+    // All the characters are digits.
+    if (I >= Text.size())
+      return;
+    // Base character. But it does not check the first 0 and that the base is
+    // the second character.

hnakamura5 wrote:

Yes to both questions. This is about the TableGen compiler's lexer.
As you suspected, this comment may not be precise enough. I will fix it later.

For example,
`0x1234x` is regarded as an integer because the lexer decides it is an integer
as soon as it has read the `0x1` part. This is the syntax error example
written in the unit test.
What I want to note with this comment is that
`1x1234x` is also regarded as an integer (and a syntax error). This behavior
arises because the lexer does not check whether the character before 'x' is 0
or some other digit.
(FYI, `1y1234x` is a valid identifier. Such ambiguity arises only when the
first non-digit character is 'x' or 'b'.)

https://github.com/llvm/llvm-project/pull/78571
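
The rule described in this comment can be written down as a small standalone classifier. This is a simplified reading of the rule as stated in the thread, not the patch's exact code and not `llvm::TGLexer` itself. `DigitLedKind` and `classifyDigitLedToken` are hypothetical names, and the input is assumed to be a token that the C++ lexer has already produced as a numeric constant.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical names for illustration; not part of clang-format or TableGen.
enum class DigitLedKind { Number, Identifier };

// Classifies a token that the C++ lexer produced as a numeric constant.
DigitLedKind classifyDigitLedToken(std::string_view Text) {
  auto IsDigit = [](char C) {
    return std::isdigit(static_cast<unsigned char>(C)) != 0;
  };
  auto IsHexDigit = [](char C) {
    return std::isxdigit(static_cast<unsigned char>(C)) != 0;
  };
  auto IsIdentifierStart = [](char C) {
    return std::isalpha(static_cast<unsigned char>(C)) != 0 || C == '_';
  };

  // Skip the leading run of decimal digits.
  std::size_t I = 0;
  while (I < Text.size() && IsDigit(Text[I]))
    ++I;
  // Not digit-led (empty or sign-led), or digits only: leave it as a number.
  if (I == 0 || I == Text.size())
    return DigitLedKind::Number;

  const char First = Text[I];
  const char Next = I + 1 < Text.size() ? Text[I + 1] : '\0';
  // <digits>x<hex digit> is committed to being a number, whichever digits
  // precede the 'x'. This is why both 0x1234x and 1x1234x are numbers.
  if (First == 'x' && IsHexDigit(Next))
    return DigitLedKind::Number;
  // Likewise <digits>b<0 or 1>.
  if (First == 'b' && (Next == '0' || Next == '1'))
    return DigitLedKind::Number;
  // Any other letter or underscore after the digits makes an identifier.
  if (IsIdentifierStart(First))
    return DigitLedKind::Identifier;
  return DigitLedKind::Number;
}
```

Under this rule `1234`, `0x1abC`, `0x1234x` and `1x1234x` classify as numbers, while `1y1234x`, `2dVector` and `01234Vector` classify as identifiers.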

