[llvm-branch-commits] [clang] release/19.x: [clang-format] Fix a serious bug in `git clang-format -f` (#102629) (PR #102770)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/102770

From efdd0e9fda292f9040367fcd6a9b2c2590fea739 Mon Sep 17 00:00:00 2001
From: Owen Pan 
Date: Sat, 10 Aug 2024 13:31:35 -0700
Subject: [PATCH] [clang-format] Fix a serious bug in `git clang-format -f`
 (#102629)

With the --force (or -f) option, git-clang-format wipes out input files
excluded by a .clang-format-ignore file if they have unstaged changes.

This patch adds a hidden clang-format option --list-ignored that lists
such excluded files for git-clang-format to filter out.

Fixes #102459.

(cherry picked from commit 986bc3d0719af653fecb77e8cfc59f39bec148fd)
---
 clang/test/Format/list-ignored.cpp        | 60 +++
 clang/tools/clang-format/ClangFormat.cpp  | 12 -
 clang/tools/clang-format/git-clang-format | 15 +-
 3 files changed, 84 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/Format/list-ignored.cpp

diff --git a/clang/test/Format/list-ignored.cpp b/clang/test/Format/list-ignored.cpp
new file mode 100644
index 00..6e65a68a6f9968
--- /dev/null
+++ b/clang/test/Format/list-ignored.cpp
@@ -0,0 +1,60 @@
+// RUN: rm -rf %t.dir
+// RUN: mkdir -p %t.dir/level1/level2
+
+// RUN: cd %t.dir
+// RUN: echo "*" > .clang-format-ignore
+// RUN: echo "level*/*.c*" >> .clang-format-ignore
+// RUN: echo "*/*2/foo.*" >> .clang-format-ignore
+
+// RUN: touch foo.cc
+// RUN: clang-format -list-ignored .clang-format-ignore foo.cc \
+// RUN:   | FileCheck %s
+// CHECK: .clang-format-ignore
+// CHECK-NEXT: foo.cc
+
+// RUN: cd level1
+// RUN: touch bar.cc baz.c
+// RUN: clang-format -list-ignored bar.cc baz.c \
+// RUN:   | FileCheck %s -check-prefix=CHECK2
+// CHECK2: bar.cc
+// CHECK2-NEXT: baz.c
+
+// RUN: cd level2
+// RUN: touch foo.c foo.js
+// RUN: clang-format -list-ignored foo.c foo.js \
+// RUN:   | FileCheck %s -check-prefix=CHECK3
+// CHECK3: foo.c
+// CHECK3-NEXT: foo.js
+
+// RUN: touch .clang-format-ignore
+// RUN: clang-format -list-ignored foo.c foo.js \
+// RUN:   | FileCheck %s -allow-empty -check-prefix=CHECK4
+// CHECK4-NOT: foo.c
+// CHECK4-NOT: foo.js
+
+// RUN: echo "*.js" > .clang-format-ignore
+// RUN: clang-format -list-ignored foo.c foo.js \
+// RUN:   | FileCheck %s -check-prefix=CHECK5
+// CHECK5-NOT: foo.c
+// CHECK5: foo.js
+
+// RUN: cd ../..
+// RUN: clang-format -list-ignored *.cc level1/*.c* level1/level2/foo.* \
+// RUN:   | FileCheck %s -check-prefix=CHECK6
+// CHECK6: foo.cc
+// CHECK6-NEXT: bar.cc
+// CHECK6-NEXT: baz.c
+// CHECK6-NOT: foo.c
+// CHECK6-NEXT: foo.js
+
+// RUN: rm .clang-format-ignore
+// RUN: clang-format -list-ignored *.cc level1/*.c* level1/level2/foo.* \
+// RUN:   | FileCheck %s -check-prefix=CHECK7
+// CHECK7-NOT: foo.cc
+// CHECK7-NOT: bar.cc
+// CHECK7-NOT: baz.c
+// CHECK7-NOT: foo.c
+// CHECK7: foo.js
+
+// RUN: cd ..
+// RUN: rm -r %t.dir
diff --git a/clang/tools/clang-format/ClangFormat.cpp b/clang/tools/clang-format/ClangFormat.cpp
index 6cba1267f3b0db..c4b6209a71a88f 100644
--- a/clang/tools/clang-format/ClangFormat.cpp
+++ b/clang/tools/clang-format/ClangFormat.cpp
@@ -210,6 +210,10 @@ static cl::opt<bool> FailOnIncompleteFormat(
     cl::desc("If set, fail with exit code 1 on incomplete format."),
     cl::init(false), cl::cat(ClangFormatCategory));
 
+static cl::opt<bool> ListIgnored("list-ignored",
+                                 cl::desc("List ignored files."),
+                                 cl::cat(ClangFormatCategory), cl::Hidden);
+
 namespace clang {
 namespace format {
 
@@ -715,7 +719,13 @@ int main(int argc, const char **argv) {
   unsigned FileNo = 1;
   bool Error = false;
   for (const auto &FileName : FileNames) {
-    if (isIgnored(FileName))
+    const bool Ignored = isIgnored(FileName);
+    if (ListIgnored) {
+      if (Ignored)
+        outs() << FileName << '\n';
+      continue;
+    }
+    if (Ignored)
       continue;
     if (Verbose) {
       errs() << "Formatting [" << FileNo++ << "/" << FileNames.size() << "] "
diff --git a/clang/tools/clang-format/git-clang-format b/clang/tools/clang-format/git-clang-format
index d33fd478d77fd9..714ba8a6e77d51 100755
--- a/clang/tools/clang-format/git-clang-format
+++ b/clang/tools/clang-format/git-clang-format
@@ -173,11 +173,12 @@ def main():
   # those files.
   cd_to_toplevel()
   filter_symlinks(changed_lines)
+  filter_ignored_files(changed_lines, binary=opts.binary)
   if opts.verbose >= 1:
     ignored_files.difference_update(changed_lines)
     if ignored_files:
-      print(
-        'Ignoring changes in the following files (wrong extension or symlink):')
+      print('Ignoring the following files (wrong extension, symlink, or '
+            'ignored by clang-format):')
       for filename in ignored_files:
         print('%s' % filename)
     if changed_lines:
@@ -399,6 +400,16 @@ def filter_symlinks(dictionary):
       del dictionary[filename]
 
 
+def filter_ignored_files(dictionary, binary):
+  """Delete every key in `dictionary` that is ignor

[llvm-branch-commits] [clang] efdd0e9 - [clang-format] Fix a serious bug in `git clang-format -f` (#102629)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

Author: Owen Pan
Date: 2024-08-13T09:00:45+02:00
New Revision: efdd0e9fda292f9040367fcd6a9b2c2590fea739

URL: 
https://github.com/llvm/llvm-project/commit/efdd0e9fda292f9040367fcd6a9b2c2590fea739
DIFF: 
https://github.com/llvm/llvm-project/commit/efdd0e9fda292f9040367fcd6a9b2c2590fea739.diff

LOG: [clang-format] Fix a serious bug in `git clang-format -f` (#102629)

With the --force (or -f) option, git-clang-format wipes out input files
excluded by a .clang-format-ignore file if they have unstaged changes.

This patch adds a hidden clang-format option --list-ignored that lists
such excluded files for git-clang-format to filter out.

Fixes #102459.

(cherry picked from commit 986bc3d0719af653fecb77e8cfc59f39bec148fd)

Added: 
clang/test/Format/list-ignored.cpp

Modified: 
clang/tools/clang-format/ClangFormat.cpp
clang/tools/clang-format/git-clang-format

Removed: 




diff  --git a/clang/test/Format/list-ignored.cpp b/clang/test/Format/list-ignored.cpp
new file mode 100644
index 00..6e65a68a6f9968
--- /dev/null
+++ b/clang/test/Format/list-ignored.cpp
@@ -0,0 +1,60 @@
+// RUN: rm -rf %t.dir
+// RUN: mkdir -p %t.dir/level1/level2
+
+// RUN: cd %t.dir
+// RUN: echo "*" > .clang-format-ignore
+// RUN: echo "level*/*.c*" >> .clang-format-ignore
+// RUN: echo "*/*2/foo.*" >> .clang-format-ignore
+
+// RUN: touch foo.cc
+// RUN: clang-format -list-ignored .clang-format-ignore foo.cc \
+// RUN:   | FileCheck %s
+// CHECK: .clang-format-ignore
+// CHECK-NEXT: foo.cc
+
+// RUN: cd level1
+// RUN: touch bar.cc baz.c
+// RUN: clang-format -list-ignored bar.cc baz.c \
+// RUN:   | FileCheck %s -check-prefix=CHECK2
+// CHECK2: bar.cc
+// CHECK2-NEXT: baz.c
+
+// RUN: cd level2
+// RUN: touch foo.c foo.js
+// RUN: clang-format -list-ignored foo.c foo.js \
+// RUN:   | FileCheck %s -check-prefix=CHECK3
+// CHECK3: foo.c
+// CHECK3-NEXT: foo.js
+
+// RUN: touch .clang-format-ignore
+// RUN: clang-format -list-ignored foo.c foo.js \
+// RUN:   | FileCheck %s -allow-empty -check-prefix=CHECK4
+// CHECK4-NOT: foo.c
+// CHECK4-NOT: foo.js
+
+// RUN: echo "*.js" > .clang-format-ignore
+// RUN: clang-format -list-ignored foo.c foo.js \
+// RUN:   | FileCheck %s -check-prefix=CHECK5
+// CHECK5-NOT: foo.c
+// CHECK5: foo.js
+
+// RUN: cd ../..
+// RUN: clang-format -list-ignored *.cc level1/*.c* level1/level2/foo.* \
+// RUN:   | FileCheck %s -check-prefix=CHECK6
+// CHECK6: foo.cc
+// CHECK6-NEXT: bar.cc
+// CHECK6-NEXT: baz.c
+// CHECK6-NOT: foo.c
+// CHECK6-NEXT: foo.js
+
+// RUN: rm .clang-format-ignore
+// RUN: clang-format -list-ignored *.cc level1/*.c* level1/level2/foo.* \
+// RUN:   | FileCheck %s -check-prefix=CHECK7
+// CHECK7-NOT: foo.cc
+// CHECK7-NOT: bar.cc
+// CHECK7-NOT: baz.c
+// CHECK7-NOT: foo.c
+// CHECK7: foo.js
+
+// RUN: cd ..
+// RUN: rm -r %t.dir

diff  --git a/clang/tools/clang-format/ClangFormat.cpp b/clang/tools/clang-format/ClangFormat.cpp
index 6cba1267f3b0db..c4b6209a71a88f 100644
--- a/clang/tools/clang-format/ClangFormat.cpp
+++ b/clang/tools/clang-format/ClangFormat.cpp
@@ -210,6 +210,10 @@ static cl::opt<bool> FailOnIncompleteFormat(
     cl::desc("If set, fail with exit code 1 on incomplete format."),
     cl::init(false), cl::cat(ClangFormatCategory));
 
+static cl::opt<bool> ListIgnored("list-ignored",
+                                 cl::desc("List ignored files."),
+                                 cl::cat(ClangFormatCategory), cl::Hidden);
+
 namespace clang {
 namespace format {
 
@@ -715,7 +719,13 @@ int main(int argc, const char **argv) {
   unsigned FileNo = 1;
   bool Error = false;
   for (const auto &FileName : FileNames) {
-    if (isIgnored(FileName))
+    const bool Ignored = isIgnored(FileName);
+    if (ListIgnored) {
+      if (Ignored)
+        outs() << FileName << '\n';
+      continue;
+    }
+    if (Ignored)
       continue;
     if (Verbose) {
       errs() << "Formatting [" << FileNo++ << "/" << FileNames.size() << "] "

diff  --git a/clang/tools/clang-format/git-clang-format b/clang/tools/clang-format/git-clang-format
index d33fd478d77fd9..714ba8a6e77d51 100755
--- a/clang/tools/clang-format/git-clang-format
+++ b/clang/tools/clang-format/git-clang-format
@@ -173,11 +173,12 @@ def main():
   # those files.
   cd_to_toplevel()
   filter_symlinks(changed_lines)
+  filter_ignored_files(changed_lines, binary=opts.binary)
   if opts.verbose >= 1:
     ignored_files.difference_update(changed_lines)
     if ignored_files:
-      print(
-        'Ignoring changes in the following files (wrong extension or symlink):')
+      print('Ignoring the following files (wrong extension, symlink, or '
+            'ignored by clang-format):')
       for filename in ignored_files:
         print('%s' % filename)
     if changed_lines:
@@ -399,6 +400,16 @@ def filter_symlinks(dictionary):
       del dictionary[filename]
 
 
+def filter_ignored_files(dictionary, binary):
+  """Delete 

[llvm-branch-commits] [clang] release/19.x: [clang-format] Fix a serious bug in `git clang-format -f` (#102629) (PR #102770)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/102770
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/19.x: [clang-format] Fix a serious bug in `git clang-format -f` (#102629) (PR #102770)

2024-08-13 Thread via llvm-branch-commits

github-actions[bot] wrote:

@owenca (or anyone else): if you would like to add a note about this fix to the
release notes (completely optional), please reply to this comment with a one- or
two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/102770
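
For readers following the #102629 thread above, the filtering step that the
patch describes can be sketched roughly as follows. This is a minimal sketch,
not the code from the patch (the quoted diff is cut off before the body of
filter_ignored_files). The helper names list_ignored and drop_ignored and the
injectable run parameter are hypothetical. It assumes a clang-format binary
that accepts the hidden -list-ignored option and prints one ignored file name
per line, as the list-ignored.cpp test suggests.

```python
import subprocess


def list_ignored(binary, filenames, run=subprocess.run):
    """Return the subset of `filenames` that clang-format reports as ignored.

    Assumption: `binary` supports the hidden -list-ignored option from the
    patch. `run` is injectable so the function can be exercised without a
    real clang-format on the PATH.
    """
    if not filenames:
        return set()
    result = run([binary, "-list-ignored", *filenames],
                 capture_output=True, text=True, check=True)
    # One ignored file name per line; blank lines carry no information.
    return {line for line in result.stdout.splitlines() if line}


def drop_ignored(changed_lines, binary="clang-format", run=subprocess.run):
    """Delete every key of `changed_lines` that clang-format ignores."""
    for filename in list_ignored(binary, list(changed_lines), run):
        changed_lines.pop(filename, None)
    return changed_lines
```

Filtering before any formatting runs means an ignored file is never handed to
the formatter, which is the behaviour the commit message relies on to avoid
wiping out ignored files under -f.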


[llvm-branch-commits] [lldb] release/19.x: [lldb] Fix crash when adding members to an "incomplete" type (#102116) (PR #102895)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/102895

From c261de724b01a1db029d896914adc8995cd2c35a Mon Sep 17 00:00:00 2001
From: Pavel Labath 
Date: Thu, 8 Aug 2024 10:53:15 +0200
Subject: [PATCH] [lldb] Fix crash when adding members to an "incomplete" type
 (#102116)

This fixes a regression caused by delayed type definition searching
(#96755 and friends): If we end up adding a member (e.g. a typedef) to a
type that we've already attempted to complete (and failed), the
resulting AST would end up inconsistent (we would start to "forcibly"
complete it, but never finish it), and importing it into an expression
AST would crash.

This patch fixes this by detecting the situation and finishing the
definition as well.

(cherry picked from commit 57cd1000c9c93fd0e64352cfbc9fbbe5b8a8fcef)
---
 .../SymbolFile/DWARF/DWARFASTParserClang.cpp  | 11 +++--
 .../DWARF/x86/typedef-in-incomplete-type.cpp  | 23 +++
 2 files changed, 32 insertions(+), 2 deletions(-)
 create mode 100644 lldb/test/Shell/SymbolFile/DWARF/x86/typedef-in-incomplete-type.cpp

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index 85c59a605c675c..ac769ad9fbd52c 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -269,8 +269,15 @@ static void PrepareContextToReceiveMembers(TypeSystemClang &ast,
   }
 
   // We don't have a type definition and/or the import failed, but we need to
-  // add members to it. Start the definition to make that possible.
-  tag_decl_ctx->startDefinition();
+  // add members to it. Start the definition to make that possible. If the type
+  // has no external storage we also have to complete the definition. Otherwise,
+  // that will happen when we are asked to complete the type
+  // (CompleteTypeFromDWARF).
+  ast.StartTagDeclarationDefinition(type);
+  if (!tag_decl_ctx->hasExternalLexicalStorage()) {
+    ast.SetDeclIsForcefullyCompleted(tag_decl_ctx);
+    ast.CompleteTagDeclarationDefinition(type);
+  }
 }
 
 ParsedDWARFTypeAttributes::ParsedDWARFTypeAttributes(const DWARFDIE &die) {
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/typedef-in-incomplete-type.cpp b/lldb/test/Shell/SymbolFile/DWARF/x86/typedef-in-incomplete-type.cpp
new file mode 100644
index 00..591607784b0a9b
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/typedef-in-incomplete-type.cpp
@@ -0,0 +1,23 @@
+// RUN: %clangxx --target=x86_64-pc-linux -flimit-debug-info -o %t -c %s -g
+// RUN: %lldb %t -o "target var a" -o "expr -- var" -o exit | FileCheck %s
+
+// This forces lldb to attempt to complete the type A. Since it has no
+// definition it will fail.
+// CHECK: target var a
+// CHECK: (A) a = 
+
+// Now attempt to display the second variable, which will try to add a typedef
+// to the incomplete type. Make sure that succeeds. Use the expression command
+// to make sure the resulting AST can be imported correctly.
+// CHECK: expr -- var
+// CHECK: (A::X) $0 = 0
+
+struct A {
+  // Declare the constructor, but don't define it to avoid emitting the
+  // definition in the debug info.
+  A();
+  using X = int;
+};
+
+A a;
+A::X var;



[llvm-branch-commits] [lldb] c261de7 - [lldb] Fix crash when adding members to an "incomplete" type (#102116)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

Author: Pavel Labath
Date: 2024-08-13T09:02:18+02:00
New Revision: c261de724b01a1db029d896914adc8995cd2c35a

URL: 
https://github.com/llvm/llvm-project/commit/c261de724b01a1db029d896914adc8995cd2c35a
DIFF: 
https://github.com/llvm/llvm-project/commit/c261de724b01a1db029d896914adc8995cd2c35a.diff

LOG: [lldb] Fix crash when adding members to an "incomplete" type (#102116)

This fixes a regression caused by delayed type definition searching
(#96755 and friends): If we end up adding a member (e.g. a typedef) to a
type that we've already attempted to complete (and failed), the
resulting AST would end up inconsistent (we would start to "forcibly"
complete it, but never finish it), and importing it into an expression
AST would crash.

This patch fixes this by detecting the situation and finishing the
definition as well.

(cherry picked from commit 57cd1000c9c93fd0e64352cfbc9fbbe5b8a8fcef)

Added: 
lldb/test/Shell/SymbolFile/DWARF/x86/typedef-in-incomplete-type.cpp

Modified: 
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp

Removed: 




diff  --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index 85c59a605c675c..ac769ad9fbd52c 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -269,8 +269,15 @@ static void PrepareContextToReceiveMembers(TypeSystemClang &ast,
   }
 
   // We don't have a type definition and/or the import failed, but we need to
-  // add members to it. Start the definition to make that possible.
-  tag_decl_ctx->startDefinition();
+  // add members to it. Start the definition to make that possible. If the type
+  // has no external storage we also have to complete the definition. Otherwise,
+  // that will happen when we are asked to complete the type
+  // (CompleteTypeFromDWARF).
+  ast.StartTagDeclarationDefinition(type);
+  if (!tag_decl_ctx->hasExternalLexicalStorage()) {
+    ast.SetDeclIsForcefullyCompleted(tag_decl_ctx);
+    ast.CompleteTagDeclarationDefinition(type);
+  }
 }
 
 ParsedDWARFTypeAttributes::ParsedDWARFTypeAttributes(const DWARFDIE &die) {

diff  --git a/lldb/test/Shell/SymbolFile/DWARF/x86/typedef-in-incomplete-type.cpp b/lldb/test/Shell/SymbolFile/DWARF/x86/typedef-in-incomplete-type.cpp
new file mode 100644
index 00..591607784b0a9b
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/typedef-in-incomplete-type.cpp
@@ -0,0 +1,23 @@
+// RUN: %clangxx --target=x86_64-pc-linux -flimit-debug-info -o %t -c %s -g
+// RUN: %lldb %t -o "target var a" -o "expr -- var" -o exit | FileCheck %s
+
+// This forces lldb to attempt to complete the type A. Since it has no
+// definition it will fail.
+// CHECK: target var a
+// CHECK: (A) a = 
+
+// Now attempt to display the second variable, which will try to add a typedef
+// to the incomplete type. Make sure that succeeds. Use the expression command
+// to make sure the resulting AST can be imported correctly.
+// CHECK: expr -- var
+// CHECK: (A::X) $0 = 0
+
+struct A {
+  // Declare the constructor, but don't define it to avoid emitting the
+  // definition in the debug info.
+  A();
+  using X = int;
+};
+
+A a;
+A::X var;





[llvm-branch-commits] [lldb] release/19.x: [lldb] Fix crash when adding members to an "incomplete" type (#102116) (PR #102895)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/102895
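
The control flow of the #102116 fix above can be illustrated with a toy state
model. This is only a sketch of the idea in the commit message (start the
definition, and also finish it when nothing else will). ToyTagDecl,
prepare_to_receive_members and can_import are hypothetical names and do not
correspond to lldb or clang classes. The "being_defined blocks import" rule is
an assumption that stands in for the crash the commit message describes.

```python
class ToyTagDecl:
    """Toy stand-in for a record declaration; not an lldb or clang class."""

    def __init__(self, name, has_external_storage):
        self.name = name
        self.has_external_storage = has_external_storage
        self.state = "declared"  # declared -> being_defined -> complete
        self.forcefully_completed = False

    def start_definition(self):
        if self.state == "declared":
            self.state = "being_defined"

    def complete_definition(self):
        self.state = "complete"


def prepare_to_receive_members(decl):
    """Start the definition, and finish it if no later completion will come."""
    if decl.state != "declared":
        return  # already started or complete, nothing to do
    decl.start_definition()
    if not decl.has_external_storage:
        # No external source will be asked to complete this type later,
        # so mark it as forcefully completed and close the definition now.
        decl.forcefully_completed = True
        decl.complete_definition()


def can_import(decl):
    """Toy rule: a half-defined declaration cannot be imported safely."""
    return decl.state != "being_defined"
```

In this model a type without external storage ends up "complete" right away,
while a type with external storage stays "being_defined" until something else
completes it, which mirrors the comment added in the patch.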


[llvm-branch-commits] [clang] release/19.x: [clang] Implement -fptrauth-auth-traps. (#102417) (PR #102938)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/102938

From 955fe3f1ef193d26c73fb54ab6e01a818a6bde8e Mon Sep 17 00:00:00 2001
From: Ahmed Bougacha 
Date: Fri, 9 Aug 2024 12:32:01 -0700
Subject: [PATCH] [clang] Implement -fptrauth-auth-traps. (#102417)

This provides -fptrauth-auth-traps, which at the frontend level only
controls the addition of the "ptrauth-auth-traps" function attribute.

The attribute in turn controls various aspects of backend codegen, by
providing the guarantee that every "auth" operation generated will trap
on failure.

This can either be delegated to the hardware (if AArch64 FPAC is known
to be available), in which case this attribute doesn't change codegen.
Otherwise, if FPAC isn't available, this asks the backend to emit
additional instructions to check and trap on auth failure.

(cherry picked from commit d179acd0484bac30c5ebbbed4d29a4734d92ac93)
---
 clang/include/clang/Basic/PointerAuthOptions.h   | 3 +++
 clang/lib/CodeGen/CodeGenFunction.cpp            | 2 ++
 clang/lib/Frontend/CompilerInvocation.cpp        | 7 ---
 clang/test/CodeGen/ptrauth-function-attributes.c | 5 +
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/clang/include/clang/Basic/PointerAuthOptions.h b/clang/include/clang/Basic/PointerAuthOptions.h
index c0ab35bce5d84b..a26af69e1fa246 100644
--- a/clang/include/clang/Basic/PointerAuthOptions.h
+++ b/clang/include/clang/Basic/PointerAuthOptions.h
@@ -162,6 +162,9 @@ struct PointerAuthOptions {
   /// Should return addresses be authenticated?
   bool ReturnAddresses = false;
 
+  /// Do authentication failures cause a trap?
+  bool AuthTraps = false;
+
   /// Do indirect goto label addresses need to be authenticated?
   bool IndirectGotos = false;
 
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index 4dc57d0ff5b269..2b2e23f1e5d7fb 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -884,6 +884,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
     Fn->addFnAttr("ptrauth-returns");
   if (CodeGenOpts.PointerAuth.FunctionPointers)
     Fn->addFnAttr("ptrauth-calls");
+  if (CodeGenOpts.PointerAuth.AuthTraps)
+    Fn->addFnAttr("ptrauth-auth-traps");
   if (CodeGenOpts.PointerAuth.IndirectGotos)
     Fn->addFnAttr("ptrauth-indirect-gotos");
 
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index fa5d076c202a36..028fdb2cc6b9da 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -1504,16 +1504,17 @@ void CompilerInvocation::setDefaultPointerAuthOptions(
     Opts.CXXMemberFunctionPointers =
         PointerAuthSchema(Key::ASIA, false, Discrimination::Type);
   }
-  Opts.IndirectGotos = LangOpts.PointerAuthIndirectGotos;
   Opts.ReturnAddresses = LangOpts.PointerAuthReturns;
+  Opts.AuthTraps = LangOpts.PointerAuthAuthTraps;
+  Opts.IndirectGotos = LangOpts.PointerAuthIndirectGotos;
 }
 
 static void parsePointerAuthOptions(PointerAuthOptions &Opts,
                                     const LangOptions &LangOpts,
                                     const llvm::Triple &Triple,
                                     DiagnosticsEngine &Diags) {
-  if (!LangOpts.PointerAuthCalls && !LangOpts.PointerAuthIndirectGotos &&
-      !LangOpts.PointerAuthReturns)
+  if (!LangOpts.PointerAuthCalls && !LangOpts.PointerAuthReturns &&
+      !LangOpts.PointerAuthAuthTraps && !LangOpts.PointerAuthIndirectGotos)
     return;
 
   CompilerInvocation::setDefaultPointerAuthOptions(Opts, LangOpts, Triple);
diff --git a/clang/test/CodeGen/ptrauth-function-attributes.c b/clang/test/CodeGen/ptrauth-function-attributes.c
index 17ebf9d6e2e01c..e7081f00b4f686 100644
--- a/clang/test/CodeGen/ptrauth-function-attributes.c
+++ b/clang/test/CodeGen/ptrauth-function-attributes.c
@@ -8,6 +8,9 @@
 // RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-returns -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,RETS
 // RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-returns -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,RETS
 
+// RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-auth-traps -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,TRAPS
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-auth-traps -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,TRAPS
+
 // RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-indirect-gotos -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,GOTOS
 // RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-indirect-gotos -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,GOTOS
 
@@ -19,6 +22,8 @@ void test() {
 
 // RETS: attributes #0 = {{{.*}} "ptrauth-returns" {{.*}}}
 
+// TRAPS: attributes #0 = {{{.*}} "ptrauth-auth-traps" {{.*}}}
+
 // GOTOS: attributes #0 = {{{.*}} "ptrauth-indirect-gotos" {{.*}}}
 
 // OFF-NOT: attributes {{.*}} "ptrauth-


[llvm-branch-commits] [clang] release/19.x: [clang] Implement -fptrauth-auth-traps. (#102417) (PR #102938)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/102938
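
The frontend side of the #102417 change above (a language option that only
controls whether a function attribute is attached) can be modelled in a few
lines. This is a simplified sketch, not clang code. ToyLangOpts and both
functions are hypothetical, and the real implementation goes through
PointerAuthOptions and attaches "ptrauth-calls" based on a function-pointer
schema rather than directly on a flag. Only the attribute strings and the
shape of the early-return test are taken from the diff.

```python
from dataclasses import dataclass


@dataclass
class ToyLangOpts:
    """Hypothetical mirror of the LangOptions fields named in the diff."""
    pointer_auth_calls: bool = False
    pointer_auth_returns: bool = False
    pointer_auth_auth_traps: bool = False
    pointer_auth_indirect_gotos: bool = False


def any_ptrauth_enabled(opts):
    """Mirrors the patched early-return test in parsePointerAuthOptions."""
    return (opts.pointer_auth_calls or opts.pointer_auth_returns
            or opts.pointer_auth_auth_traps
            or opts.pointer_auth_indirect_gotos)


def function_attributes(opts):
    """Return the ptrauth attribute strings a function would receive."""
    if not any_ptrauth_enabled(opts):
        return []
    attrs = []
    if opts.pointer_auth_returns:
        attrs.append("ptrauth-returns")
    if opts.pointer_auth_calls:  # simplification, see the note above
        attrs.append("ptrauth-calls")
    if opts.pointer_auth_auth_traps:
        attrs.append("ptrauth-auth-traps")
    if opts.pointer_auth_indirect_gotos:
        attrs.append("ptrauth-indirect-gotos")
    return attrs
```

The point of extending the early-return test is visible here: without the
auth-traps term, enabling only that option would skip option setup entirely
and no attribute would be emitted.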


[llvm-branch-commits] [clang] 955fe3f - [clang] Implement -fptrauth-auth-traps. (#102417)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

Author: Ahmed Bougacha
Date: 2024-08-13T09:02:58+02:00
New Revision: 955fe3f1ef193d26c73fb54ab6e01a818a6bde8e

URL: 
https://github.com/llvm/llvm-project/commit/955fe3f1ef193d26c73fb54ab6e01a818a6bde8e
DIFF: 
https://github.com/llvm/llvm-project/commit/955fe3f1ef193d26c73fb54ab6e01a818a6bde8e.diff

LOG: [clang] Implement -fptrauth-auth-traps. (#102417)

This provides -fptrauth-auth-traps, which at the frontend level only
controls the addition of the "ptrauth-auth-traps" function attribute.

The attribute in turn controls various aspects of backend codegen, by
providing the guarantee that every "auth" operation generated will trap
on failure.

This can either be delegated to the hardware (if AArch64 FPAC is known
to be available), in which case this attribute doesn't change codegen.
Otherwise, if FPAC isn't available, this asks the backend to emit
additional instructions to check and trap on auth failure.

(cherry picked from commit d179acd0484bac30c5ebbbed4d29a4734d92ac93)

Added: 


Modified: 
clang/include/clang/Basic/PointerAuthOptions.h
clang/lib/CodeGen/CodeGenFunction.cpp
clang/lib/Frontend/CompilerInvocation.cpp
clang/test/CodeGen/ptrauth-function-attributes.c

Removed: 




diff  --git a/clang/include/clang/Basic/PointerAuthOptions.h b/clang/include/clang/Basic/PointerAuthOptions.h
index c0ab35bce5d84b..a26af69e1fa246 100644
--- a/clang/include/clang/Basic/PointerAuthOptions.h
+++ b/clang/include/clang/Basic/PointerAuthOptions.h
@@ -162,6 +162,9 @@ struct PointerAuthOptions {
   /// Should return addresses be authenticated?
   bool ReturnAddresses = false;
 
+  /// Do authentication failures cause a trap?
+  bool AuthTraps = false;
+
   /// Do indirect goto label addresses need to be authenticated?
   bool IndirectGotos = false;
 

diff  --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index 4dc57d0ff5b269..2b2e23f1e5d7fb 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -884,6 +884,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
     Fn->addFnAttr("ptrauth-returns");
   if (CodeGenOpts.PointerAuth.FunctionPointers)
     Fn->addFnAttr("ptrauth-calls");
+  if (CodeGenOpts.PointerAuth.AuthTraps)
+    Fn->addFnAttr("ptrauth-auth-traps");
   if (CodeGenOpts.PointerAuth.IndirectGotos)
     Fn->addFnAttr("ptrauth-indirect-gotos");
 

diff  --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index fa5d076c202a36..028fdb2cc6b9da 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -1504,16 +1504,17 @@ void CompilerInvocation::setDefaultPointerAuthOptions(
     Opts.CXXMemberFunctionPointers =
         PointerAuthSchema(Key::ASIA, false, Discrimination::Type);
   }
-  Opts.IndirectGotos = LangOpts.PointerAuthIndirectGotos;
   Opts.ReturnAddresses = LangOpts.PointerAuthReturns;
+  Opts.AuthTraps = LangOpts.PointerAuthAuthTraps;
+  Opts.IndirectGotos = LangOpts.PointerAuthIndirectGotos;
 }
 
 static void parsePointerAuthOptions(PointerAuthOptions &Opts,
                                     const LangOptions &LangOpts,
                                     const llvm::Triple &Triple,
                                     DiagnosticsEngine &Diags) {
-  if (!LangOpts.PointerAuthCalls && !LangOpts.PointerAuthIndirectGotos &&
-      !LangOpts.PointerAuthReturns)
+  if (!LangOpts.PointerAuthCalls && !LangOpts.PointerAuthReturns &&
+      !LangOpts.PointerAuthAuthTraps && !LangOpts.PointerAuthIndirectGotos)
     return;
 
   CompilerInvocation::setDefaultPointerAuthOptions(Opts, LangOpts, Triple);

diff  --git a/clang/test/CodeGen/ptrauth-function-attributes.c b/clang/test/CodeGen/ptrauth-function-attributes.c
index 17ebf9d6e2e01c..e7081f00b4f686 100644
--- a/clang/test/CodeGen/ptrauth-function-attributes.c
+++ b/clang/test/CodeGen/ptrauth-function-attributes.c
@@ -8,6 +8,9 @@
 // RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-returns -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,RETS
 // RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-returns -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,RETS
 
+// RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-auth-traps -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,TRAPS
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-auth-traps -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,TRAPS
+
 // RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-indirect-gotos -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,GOTOS
 // RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-indirect-gotos -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,GOTOS
 
@@ -19,6 +22,8 @@ void test() {
 
 // RETS: attributes #0 = {{{.*}} "ptrauth-returns" {{.*}}}
 
+// TRAPS: attributes #0 = {{{.*}} "ptrauth-auth-traps" {{.*}}}
+
 // GOTOS

[llvm-branch-commits] [lldb] release/19.x: [lldb] Fix crash when adding members to an "incomplete" type (#102116) (PR #102895)

2024-08-13 Thread via llvm-branch-commits

github-actions[bot] wrote:

@labath (or anyone else): if you would like to add a note about this fix to the
release notes (completely optional), please reply to this comment with a one- or
two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/102895


[llvm-branch-commits] [clang] release/19.x: [clang] Implement -fptrauth-auth-traps. (#102417) (PR #102938)

2024-08-13 Thread via llvm-branch-commits

github-actions[bot] wrote:

@asl (or anyone else): if you would like to add a note about this fix to the
release notes (completely optional), please reply to this comment with a one- or
two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/102938


[llvm-branch-commits] [clang] release/19.x: [Clang] Correctly forward `--cuda-path` to the nvlink wrapper (#100170) (PR #100216)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

What's the state of this one?

https://github.com/llvm/llvm-project/pull/100216


[llvm-branch-commits] [llvm] release/19.x: [RISCV] Use experimental.vp.splat to splat specific vector length elements. (#101329) (PR #101506)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

Should this still be merged? Who needs to review it? cc @topperc 

https://github.com/llvm/llvm-project/pull/101506


[llvm-branch-commits] [llvm] [FixIrreducible] Use CycleInfo instead of a custom SCC traversal (PR #103014)

2024-08-13 Thread Sameer Sahasrabuddhe via llvm-branch-commits

https://github.com/ssahasra created https://github.com/llvm/llvm-project/pull/103014

1. CycleInfo efficiently locates all cycles in a single pass, while the SCC is 
repeated inside every natural loop.

2. CycleInfo provides a hierarchy of irreducible cycles, and the new 
implementation transforms each cycle in this hierarchy separately instead of 
reducing an entire irreducible SCC in a single step. This reduces the number of 
control-flow paths that pass through the header of each newly created loop. 
This is evidenced by the reduced number of predecessors on the "guard" blocks 
in the lit tests, and fewer operands on the corresponding PHI nodes.

3. When an entry of an irreducible cycle is the header of a child natural loop, 
the original implementation destroyed that loop. This is now preserved, since 
the incoming edges on non-header entries are not touched.
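
As background for the points above, the notion of an irreducible cycle (a
strongly connected region with more than one entry block) can be sketched as
follows. This is an illustration only and not how CycleInfo or FixIrreducible
is implemented: it inspects top-level SCCs and does not build the nested cycle
hierarchy the description refers to, so an irreducible cycle nested inside a
single-entry SCC would be missed. All function names are hypothetical, and
the CFG is assumed to be a dict mapping each block name to a list of
successors.

```python
def strongly_connected_components(cfg):
    """Tarjan's algorithm; `cfg` maps each block to a list of successors."""
    index, low, on_stack, stack, sccs = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in cfg.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of an SCC; pop its members off the stack.
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in cfg:
        if v not in index:
            visit(v)
    return sccs


def cycle_entries(cfg, cycle):
    """Blocks of `cycle` that have at least one predecessor outside it."""
    return {succ for block, succs in cfg.items() if block not in cycle
            for succ in succs if succ in cycle}


def irreducible_cycles(cfg):
    """Top-level SCCs that form a cycle and have more than one entry."""
    result = []
    for scc in strongly_connected_components(cfg):
        is_cycle = len(scc) > 1 or any(b in cfg.get(b, ()) for b in scc)
        if is_cycle and len(cycle_entries(cfg, scc)) > 1:
            result.append(scc)
    return result
```

For example, with cfg = {"entry": ["A", "B"], "A": ["B"], "B": ["A"]} the
cycle {A, B} is entered at both A and B, so irreducible_cycles(cfg) reports
it, whereas a natural loop entered only through its header is not reported.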

From 0ba4872d47179a4d54a06224008cc160905360dc Mon Sep 17 00:00:00 2001
From: Sameer Sahasrabuddhe 
Date: Mon, 12 Aug 2024 14:44:13 +0530
Subject: [PATCH] [FixIrreducible] Use CycleInfo instead of a custom SCC
 traversal

1. CycleInfo efficiently locates all cycles in a single pass, while the SCC is
   repeated inside every natural loop.

2. CycleInfo provides a hierarchy of irreducible cycles, and the new
   implementation transforms each cycle in this hierarchy separately instead of
   reducing an entire irreducible SCC in a single step. This reduces the number
   of control-flow paths that pass through the header of each newly created
   loop. This is evidenced by the reduced number of predecessors on the "guard"
   blocks in the lit tests, and fewer operands on the corresponding PHI nodes.

3. When an entry of an irreducible cycle is the header of a child natural loop,
   the original implementation destroyed that loop. This is now preserved,
   since the incoming edges on non-header entries are not touched.
---
 llvm/include/llvm/ADT/GenericCycleInfo.h      |  28 +-
 llvm/lib/Transforms/Utils/FixIrreducible.cpp  | 364 +-
 llvm/test/CodeGen/AMDGPU/llc-pipeline.ll      |  15 +-
 llvm/test/Transforms/FixIrreducible/basic.ll  |  98 ++---
 .../Transforms/FixIrreducible/bug45623.ll     |   9 +-
 llvm/test/Transforms/FixIrreducible/nested.ll | 143 ---
 llvm/test/Transforms/FixIrreducible/switch.ll |   8 +-
 .../Transforms/FixIrreducible/unreachable.ll  |   1 +
 .../workarounds/needs-fix-reducible.ll        |  56 +--
 .../workarounds/needs-fr-ule.ll               | 173 +
 10 files changed, 500 insertions(+), 395 deletions(-)

diff --git a/llvm/include/llvm/ADT/GenericCycleInfo.h b/llvm/include/llvm/ADT/GenericCycleInfo.h
index b5d719c6313c43..cf13f8e95a35e3 100644
--- a/llvm/include/llvm/ADT/GenericCycleInfo.h
+++ b/llvm/include/llvm/ADT/GenericCycleInfo.h
@@ -107,6 +107,13 @@ template <typename ContextT> class GenericCycle {
     return is_contained(Entries, Block);
   }
 
+  /// \brief Replace all entries with \p Block as single entry.
+  void setSingleEntry(BlockT *Block) {
+    assert(contains(Block));
+    Entries.clear();
+    Entries.push_back(Block);
+  }
+
   /// \brief Return whether \p Block is contained in the cycle.
   bool contains(const BlockT *Block) const { return Blocks.contains(Block); }
 
@@ -192,11 +199,16 @@ template <typename ContextT> class GenericCycle {
   //@{
   using const_entry_iterator =
       typename SmallVectorImpl<BlockT *>::const_iterator;
-
+  const_entry_iterator entry_begin() const { return Entries.begin(); }
+  const_entry_iterator entry_end() const { return Entries.end(); }
   size_t getNumEntries() const { return Entries.size(); }
   iterator_range<const_entry_iterator> entries() const {
-    return llvm::make_range(Entries.begin(), Entries.end());
+    return llvm::make_range(entry_begin(), entry_end());
   }
+  using const_reverse_entry_iterator =
+      typename SmallVectorImpl<BlockT *>::const_reverse_iterator;
+  const_reverse_entry_iterator entry_rbegin() const { return Entries.rbegin(); }
+  const_reverse_entry_iterator entry_rend() const { return Entries.rend(); }
   //@}
 
   Printable printEntries(const ContextT &Ctx) const {
@@ -257,12 +269,6 @@ template <typename ContextT> class GenericCycleInfo {
   /// the subtree.
   void moveTopLevelCycleToNewParent(CycleT *NewParent, CycleT *Child);
 
-  /// Assumes that \p Cycle is the innermost cycle containing \p Block.
-  /// \p Block will be appended to \p Cycle and all of its parent cycles.
-  /// \p Block will be added to BlockMap with \p Cycle and
-  /// BlockMapTopLevel with \p Cycle's top level parent cycle.
-  void addBlockToCycle(BlockT *Block, CycleT *Cycle);
-
 public:
   GenericCycleInfo() = default;
   GenericCycleInfo(GenericCycleInfo &&) = default;
@@ -280,6 +286,12 @@ template <typename ContextT> class GenericCycleInfo {
   unsigned getCycleDepth(const BlockT *Block) const;
   CycleT *getTopLevelParentCycle(BlockT *Block);
 
+  /// Assumes that \p Cycle is the innermost cycle containing \p Block.
+  /// \p Block will be appended to \p Cycle and all of its parent cycles.
+  /// \p Block will be added to BlockMap with \p Cyc

[llvm-branch-commits] [llvm] [FixIrreducible] Use CycleInfo instead of a custom SCC traversal (PR #103014)

2024-08-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-adt

Author: Sameer Sahasrabuddhe (ssahasra)


Changes

1. CycleInfo efficiently locates all cycles in a single pass, while the SCC is 
repeated inside every natural loop.

2. CycleInfo provides a hierarchy of irreducible cycles, and the new 
implementation transforms each cycle in this hierarchy separately instead of 
reducing an entire irreducible SCC in a single step. This reduces the number of 
control-flow paths that pass through the header of each newly created loop. 
This is evidenced by the reduced number of predecessors on the "guard" blocks 
in the lit tests, and fewer operands on the corresponding PHI nodes.

3. When an entry of an irreducible cycle is the header of a child natural loop, 
the original implementation destroyed that loop. This is now preserved, since 
the incoming edges on non-header entries are not touched.

---

Patch is 74.95 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/103014.diff


10 Files Affected:

- (modified) llvm/include/llvm/ADT/GenericCycleInfo.h (+20-8) 
- (modified) llvm/lib/Transforms/Utils/FixIrreducible.cpp (+179-185) 
- (modified) llvm/test/CodeGen/AMDGPU/llc-pipeline.ll (+10-5) 
- (modified) llvm/test/Transforms/FixIrreducible/basic.ll (+54-44) 
- (modified) llvm/test/Transforms/FixIrreducible/bug45623.ll (+5-4) 
- (modified) llvm/test/Transforms/FixIrreducible/nested.ll (+97-46) 
- (modified) llvm/test/Transforms/FixIrreducible/switch.ll (+5-3) 
- (modified) llvm/test/Transforms/FixIrreducible/unreachable.ll (+1) 
- (modified) llvm/test/Transforms/StructurizeCFG/workarounds/needs-fix-reducible.ll (+34-22) 
- (modified) llvm/test/Transforms/StructurizeCFG/workarounds/needs-fr-ule.ll (+95-78) 


``diff
diff --git a/llvm/include/llvm/ADT/GenericCycleInfo.h b/llvm/include/llvm/ADT/GenericCycleInfo.h
index b5d719c6313c43..cf13f8e95a35e3 100644
--- a/llvm/include/llvm/ADT/GenericCycleInfo.h
+++ b/llvm/include/llvm/ADT/GenericCycleInfo.h
@@ -107,6 +107,13 @@ template <typename ContextT> class GenericCycle {
     return is_contained(Entries, Block);
   }
 
+  /// \brief Replace all entries with \p Block as single entry.
+  void setSingleEntry(BlockT *Block) {
+    assert(contains(Block));
+    Entries.clear();
+    Entries.push_back(Block);
+  }
+
   /// \brief Return whether \p Block is contained in the cycle.
   bool contains(const BlockT *Block) const { return Blocks.contains(Block); }
 
@@ -192,11 +199,16 @@ template <typename ContextT> class GenericCycle {
   //@{
   using const_entry_iterator =
       typename SmallVectorImpl<BlockT *>::const_iterator;
-
+  const_entry_iterator entry_begin() const { return Entries.begin(); }
+  const_entry_iterator entry_end() const { return Entries.end(); }
   size_t getNumEntries() const { return Entries.size(); }
   iterator_range<const_entry_iterator> entries() const {
-    return llvm::make_range(Entries.begin(), Entries.end());
+    return llvm::make_range(entry_begin(), entry_end());
   }
+  using const_reverse_entry_iterator =
+      typename SmallVectorImpl<BlockT *>::const_reverse_iterator;
+  const_reverse_entry_iterator entry_rbegin() const { return Entries.rbegin(); }
+  const_reverse_entry_iterator entry_rend() const { return Entries.rend(); }
   //@}
 
   Printable printEntries(const ContextT &Ctx) const {
@@ -257,12 +269,6 @@ template <typename ContextT> class GenericCycleInfo {
   /// the subtree.
   void moveTopLevelCycleToNewParent(CycleT *NewParent, CycleT *Child);
 
-  /// Assumes that \p Cycle is the innermost cycle containing \p Block.
-  /// \p Block will be appended to \p Cycle and all of its parent cycles.
-  /// \p Block will be added to BlockMap with \p Cycle and
-  /// BlockMapTopLevel with \p Cycle's top level parent cycle.
-  void addBlockToCycle(BlockT *Block, CycleT *Cycle);
-
 public:
   GenericCycleInfo() = default;
   GenericCycleInfo(GenericCycleInfo &&) = default;
@@ -280,6 +286,12 @@ template <typename ContextT> class GenericCycleInfo {
   unsigned getCycleDepth(const BlockT *Block) const;
   CycleT *getTopLevelParentCycle(BlockT *Block);
 
+  /// Assumes that \p Cycle is the innermost cycle containing \p Block.
+  /// \p Block will be appended to \p Cycle and all of its parent cycles.
+  /// \p Block will be added to BlockMap with \p Cycle and
+  /// BlockMapTopLevel with \p Cycle's top level parent cycle.
+  void addBlockToCycle(BlockT *Block, CycleT *Cycle);
+
   /// Methods for debug and self-test.
   //@{
   void verifyCycleNest(bool VerifyFull = false, LoopInfoT *LI = nullptr) const;
diff --git a/llvm/lib/Transforms/Utils/FixIrreducible.cpp b/llvm/lib/Transforms/Utils/FixIrreducible.cpp
index 30075af2ffc654..11a26e63c8d375 100644
--- a/llvm/lib/Transforms/Utils/FixIrreducible.cpp
+++ b/llvm/lib/Transforms/Utils/FixIrreducible.cpp
@@ -6,50 +6,66 @@
 //
 //===----------------------------------------------------------------------===//
 //
-// An irreducible SCC is one which has multiple "header" blocks, i.e., blocks
-// with control-flow edges incident from outside the SCC.  This pass converts a
-/

[llvm-branch-commits] [llvm] [FixIrreducible] Use CycleInfo instead of a custom SCC traversal (PR #103014)

2024-08-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Sameer Sahasrabuddhe (ssahasra)



[llvm-branch-commits] [libcxx] Cherry-pick fixes to std::hypot for PowerPC (PR #102052)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/102052

>From 5230e327a10023b4349474bd89d67f8a1764c648 Mon Sep 17 00:00:00 2001
From: Mitch Phillips 
Date: Wed, 24 Jul 2024 12:58:24 +0200
Subject: [PATCH 1/2] Revert "[libc++][math] Fix undue overflowing of
 `std::hypot(x,y,z)` (#93350)"

This reverts commit 9628777479a970db5d0c2d0b456dac6633864760.

More details in https://github.com/llvm/llvm-project/pull/93350, but
this broke the PowerPC sanitizer bots.

(cherry picked from commit 1031335f2ee1879737576fde3a3425ce0046e773)
---
 libcxx/include/__math/hypot.h                 | 89 --
 libcxx/include/cmath                          | 25 -
 .../test/libcxx/transitive_includes/cxx17.csv |  3 -
 .../test/libcxx/transitive_includes/cxx20.csv |  3 -
 .../test/libcxx/transitive_includes/cxx23.csv |  3 -
 .../test/libcxx/transitive_includes/cxx26.csv |  3 -
 .../test/std/numerics/c.math/cmath.pass.cpp   | 91 ---
 libcxx/test/support/fp_compare.h              | 45 +
 8 files changed, 65 insertions(+), 197 deletions(-)

diff --git a/libcxx/include/__math/hypot.h b/libcxx/include/__math/hypot.h
index 61fd260c594095..1bf193a9ab7ee9 100644
--- a/libcxx/include/__math/hypot.h
+++ b/libcxx/include/__math/hypot.h
@@ -15,21 +15,10 @@
 #include <__type_traits/is_same.h>
 #include <__type_traits/promote.h>
 
-#if _LIBCPP_STD_VER >= 17
-#  include <__algorithm/max.h>
-#  include <__math/abs.h>
-#  include <__math/roots.h>
-#  include <__utility/pair.h>
-#  include <limits>
-#endif
-
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
 #  pragma GCC system_header
 #endif
 
-_LIBCPP_PUSH_MACROS
-#include <__undef_macros>
-
 _LIBCPP_BEGIN_NAMESPACE_STD
 
 namespace __math {
@@ -52,86 +41,8 @@ inline _LIBCPP_HIDE_FROM_ABI typename __promote<_A1, _A2>::type hypot(_A1 __x, _
   return __math::hypot((__result_type)__x, (__result_type)__y);
 }
 
-#if _LIBCPP_STD_VER >= 17
-// Factors needed to determine if over-/underflow might happen for `std::hypot(x,y,z)`.
-// returns [overflow_threshold, overflow_scale]
-template <class _Real>
-_LIBCPP_HIDE_FROM_ABI std::pair<_Real, _Real> __hypot_factors() {
-  static_assert(std::numeric_limits<_Real>::is_iec559);
-
-  if constexpr (std::is_same_v<_Real, float>) {
-    static_assert(-125 == std::numeric_limits<_Real>::min_exponent);
-    static_assert(+128 == std::numeric_limits<_Real>::max_exponent);
-    return {0x1.0p+62f, 0x1.0p-70f};
-  } else if constexpr (std::is_same_v<_Real, double>) {
-    static_assert(-1021 == std::numeric_limits<_Real>::min_exponent);
-    static_assert(+1024 == std::numeric_limits<_Real>::max_exponent);
-    return {0x1.0p+510, 0x1.0p-600};
-  } else { // long double
-    static_assert(std::is_same_v<_Real, long double>);
-
-    // preprocessor guard necessary, otherwise literals (e.g. `0x1.0p+8'190l`) throw warnings even when shielded by `if
-    // constexpr`
-#  if __DBL_MAX_EXP__ == __LDBL_MAX_EXP__
-    static_assert(sizeof(_Real) == sizeof(double));
-    return static_cast<std::pair<_Real, _Real>>(__math::__hypot_factors<double>());
-#  else
-    static_assert(sizeof(_Real) > sizeof(double));
-    static_assert(-16381 == std::numeric_limits<_Real>::min_exponent);
-    static_assert(+16384 == std::numeric_limits<_Real>::max_exponent);
-    return {0x1.0p+8190l, 0x1.0p-9000l};
-#  endif
-  }
-}
-
-// Computes the three-dimensional hypotenuse: `std::hypot(x,y,z)`.
-// The naive implementation might over-/underflow which is why this implementation is more involved:
-//    If the square of an argument might run into issues, we scale the arguments appropriately.
-// See https://github.com/llvm/llvm-project/issues/92782 for a detailed discussion and summary.
-template <class _Real>
-_LIBCPP_HIDE_FROM_ABI _Real __hypot(_Real __x, _Real __y, _Real __z) {
-  const _Real __max_abs = std::max(__math::fabs(__x), std::max(__math::fabs(__y), __math::fabs(__z)));
-  const auto [__overflow_threshold, __overflow_scale] = __math::__hypot_factors<_Real>();
-  _Real __scale;
-  if (__max_abs > __overflow_threshold) { // x*x + y*y + z*z might overflow
-    __scale = __overflow_scale;
-    __x *= __scale;
-    __y *= __scale;
-    __z *= __scale;
-  } else if (__max_abs < 1 / __overflow_threshold) { // x*x + y*y + z*z might underflow
-    __scale = 1 / __overflow_scale;
-    __x *= __scale;
-    __y *= __scale;
-    __z *= __scale;
-  } else
-    __scale = 1;
-  return __math::sqrt(__x * __x + __y * __y + __z * __z) / __scale;
-}
-
-inline _LIBCPP_HIDE_FROM_ABI float hypot(float __x, float __y, float __z) { return __math::__hypot(__x, __y, __z); }
-
-inline _LIBCPP_HIDE_FROM_ABI double hypot(double __x, double __y, double __z) { return __math::__hypot(__x, __y, __z); }
-
-inline _LIBCPP_HIDE_FROM_ABI long double hypot(long double __x, long double __y, long double __z) {
-  return __math::__hypot(__x, __y, __z);
-}
-
-template <class _A1, class _A2, class _A3, std::enable_if_t< is_arithmetic_v<_A1> && is_arithmetic_v<_A2> && is_arithmetic_v<_A3>, int> = 0 >
-_LIBCPP_HIDE_FROM_ABI typename __promote<_A1, _A2, _A3>::type hypot(_A1 __x, _A2 _

[llvm-branch-commits] [libcxx] 5230e32 - Revert "[libc++][math] Fix undue overflowing of `std::hypot(x, y, z)` (#93350)"

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

Author: Mitch Phillips
Date: 2024-08-13T09:07:01+02:00
New Revision: 5230e327a10023b4349474bd89d67f8a1764c648

URL: https://github.com/llvm/llvm-project/commit/5230e327a10023b4349474bd89d67f8a1764c648
DIFF: https://github.com/llvm/llvm-project/commit/5230e327a10023b4349474bd89d67f8a1764c648.diff

LOG: Revert "[libc++][math] Fix undue overflowing of `std::hypot(x,y,z)` (#93350)"

This reverts commit 9628777479a970db5d0c2d0b456dac6633864760.

More details in https://github.com/llvm/llvm-project/pull/93350, but
this broke the PowerPC sanitizer bots.

(cherry picked from commit 1031335f2ee1879737576fde3a3425ce0046e773)

Added: 


Modified: 
libcxx/include/__math/hypot.h
libcxx/include/cmath
libcxx/test/libcxx/transitive_includes/cxx17.csv
libcxx/test/libcxx/transitive_includes/cxx20.csv
libcxx/test/libcxx/transitive_includes/cxx23.csv
libcxx/test/libcxx/transitive_includes/cxx26.csv
libcxx/test/std/numerics/c.math/cmath.pass.cpp
libcxx/test/support/fp_compare.h

Removed: 





[llvm-branch-commits] [libcxx] 2c29bd3 - [libc++][math] Fix undue overflowing of `std::hypot(x, y, z)` (#100820)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

Author: PaulXiCao
Date: 2024-08-13T09:07:01+02:00
New Revision: 2c29bd3d4cdf8b81ebeb8e562fbc91f63a90a1ad

URL: https://github.com/llvm/llvm-project/commit/2c29bd3d4cdf8b81ebeb8e562fbc91f63a90a1ad
DIFF: https://github.com/llvm/llvm-project/commit/2c29bd3d4cdf8b81ebeb8e562fbc91f63a90a1ad.diff

LOG: [libc++][math] Fix undue overflowing of `std::hypot(x,y,z)` (#100820)

This is in relation to mr #93350. It was merged to main, but reverted
because of failing sanitizer builds on PowerPC.

The fix includes replacing the hard-coded threshold constants (e.g.
`__overflow_threshold`) for different floating-point sizes by a general
computation using `std::ldexp`. Thus, it should now work for all architectures.
This has the drawback of not being `constexpr` anymore as `std::ldexp`
is not implemented as `constexpr` (even though the standard mandates it
for C++23).

Closes #92782

(cherry picked from commit 72825fde03aab3ce9eba2635b872144d1fb6b6b2)
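
The scaling approach described in this log message can be sketched outside of libc++ as follows. This is only an illustration under stated assumptions, not the library's code: `hypot3` is a hypothetical name, and it uses the public `std::ldexp`, `std::fabs` and `std::sqrt` rather than libc++'s internal `__math` helpers. The thresholds are derived from `max_exponent` instead of being hard-coded per floating-point format.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical stand-alone version of a three-argument hypot that avoids
// undue overflow and underflow by scaling its arguments first.
template <class Real>
Real hypot3(Real x, Real y, Real z) {
  // Thresholds computed from the exponent range of Real.
  constexpr int exp = std::numeric_limits<Real>::max_exponent / 2;
  const Real overflow_threshold = std::ldexp(Real(1), exp);
  const Real overflow_scale = std::ldexp(Real(1), -(exp + 20));

  // Pick a power-of-two scale factor based on the largest magnitude.
  const Real max_abs =
      std::max(std::fabs(x), std::max(std::fabs(y), std::fabs(z)));
  Real scale;
  if (max_abs > overflow_threshold) { // x*x + y*y + z*z might overflow
    scale = overflow_scale;
  } else if (max_abs < 1 / overflow_threshold) { // the sum might underflow
    scale = 1 / overflow_scale;
  } else {
    scale = 1;
  }
  x *= scale;
  y *= scale;
  z *= scale;

  // Compute the hypotenuse of the scaled arguments and undo the scaling.
  return std::sqrt(x * x + y * y + z * z) / scale;
}
```

With `double` arguments of magnitude 1e200, squaring directly would overflow to infinity, whereas the scaled computation stays finite. Because the scale factors are powers of two, multiplying and dividing by them introduces no rounding error as long as the intermediate values stay in the normal range.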

Added: 


Modified: 
libcxx/include/__math/hypot.h
libcxx/include/cmath
libcxx/test/libcxx/transitive_includes/cxx03.csv
libcxx/test/libcxx/transitive_includes/cxx11.csv
libcxx/test/libcxx/transitive_includes/cxx14.csv
libcxx/test/libcxx/transitive_includes/cxx17.csv
libcxx/test/libcxx/transitive_includes/cxx20.csv
libcxx/test/libcxx/transitive_includes/cxx23.csv
libcxx/test/libcxx/transitive_includes/cxx26.csv
libcxx/test/std/numerics/c.math/cmath.pass.cpp
libcxx/test/support/fp_compare.h

Removed: 




diff  --git a/libcxx/include/__math/hypot.h b/libcxx/include/__math/hypot.h
index 1bf193a9ab7ee9..b992163711010a 100644
--- a/libcxx/include/__math/hypot.h
+++ b/libcxx/include/__math/hypot.h
@@ -9,16 +9,25 @@
 #ifndef _LIBCPP___MATH_HYPOT_H
 #define _LIBCPP___MATH_HYPOT_H
 
+#include <__algorithm/max.h>
 #include <__config>
+#include <__math/abs.h>
+#include <__math/exponential_functions.h>
+#include <__math/roots.h>
 #include <__type_traits/enable_if.h>
 #include <__type_traits/is_arithmetic.h>
 #include <__type_traits/is_same.h>
 #include <__type_traits/promote.h>
+#include <__utility/pair.h>
+#include <limits>
 
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
 #  pragma GCC system_header
 #endif
 
+_LIBCPP_PUSH_MACROS
+#include <__undef_macros>
+
 _LIBCPP_BEGIN_NAMESPACE_STD
 
 namespace __math {
@@ -41,8 +50,60 @@ inline _LIBCPP_HIDE_FROM_ABI typename __promote<_A1, _A2>::type hypot(_A1 __x, _
   return __math::hypot((__result_type)__x, (__result_type)__y);
 }
 
+#if _LIBCPP_STD_VER >= 17
+// Computes the three-dimensional hypotenuse: `std::hypot(x,y,z)`.
+// The naive implementation might over-/underflow which is why this implementation is more involved:
+//    If the square of an argument might run into issues, we scale the arguments appropriately.
+// See https://github.com/llvm/llvm-project/issues/92782 for a detailed discussion and summary.
+template <class _Real>
+_LIBCPP_HIDE_FROM_ABI _Real __hypot(_Real __x, _Real __y, _Real __z) {
+  // Factors needed to determine if over-/underflow might happen
+  constexpr int __exp              = std::numeric_limits<_Real>::max_exponent / 2;
+  const _Real __overflow_threshold = __math::ldexp(_Real(1), __exp);
+  const _Real __overflow_scale     = __math::ldexp(_Real(1), -(__exp + 20));
+
+  // Scale arguments depending on their size
+  const _Real __max_abs = std::max(__math::fabs(__x), std::max(__math::fabs(__y), __math::fabs(__z)));
+  _Real __scale;
+  if (__max_abs > __overflow_threshold) { // x*x + y*y + z*z might overflow
+    __scale = __overflow_scale;
+  } else if (__max_abs < 1 / __overflow_threshold) { // x*x + y*y + z*z might underflow
+    __scale = 1 / __overflow_scale;
+  } else {
+    __scale = 1;
+  }
+  __x *= __scale;
+  __y *= __scale;
+  __z *= __scale;
+
+  // Compute hypot of scaled arguments and undo scaling
+  return __math::sqrt(__x * __x + __y * __y + __z * __z) / __scale;
+}
+
+inline _LIBCPP_HIDE_FROM_ABI float hypot(float __x, float __y, float __z) { return __math::__hypot(__x, __y, __z); }
+
+inline _LIBCPP_HIDE_FROM_ABI double hypot(double __x, double __y, double __z) { return __math::__hypot(__x, __y, __z); }
+
+inline _LIBCPP_HIDE_FROM_ABI long double hypot(long double __x, long double __y, long double __z) {
+  return __math::__hypot(__x, __y, __z);
+}
+
+template <class _A1, class _A2, class _A3, std::enable_if_t< is_arithmetic_v<_A1> && is_arithmetic_v<_A2> && is_arithmetic_v<_A3>, int> = 0 >
+_LIBCPP_HIDE_FROM_ABI typename __promote<_A1, _A2, _A3>::type hypot(_A1 __x, _A2 __y, _A3 __z) _NOEXCEPT {
+  using __result_type = typename __promote<_A1, _A2, _A3>::type;
+  static_assert(!(
+      std::is_same_v<_A1, __result_type> && std::is_same_v<_A2, __result_type> && std::is_same_v<_A3, __result_type>));
+  return __math::__hypot(
+      static_cast<__result_type>(__x), static_cast<__result_type>(__y), static_cast<__result_type>(__z));
+}
+#endif
+
 } // namespace __math
 
 _LIBCPP_END_NAMESPACE_STD
+_LIBCPP_POP_MACROS
 
 #endif // _

[llvm-branch-commits] [libcxx] Cherry-pick fixes to std::hypot for PowerPC (PR #102052)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/102052


[llvm-branch-commits] [clang] release/19.x: Fix codegen of consteval functions returning an empty class, and related issues (#93115) (PR #102070)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

@zygoloid @RKSimon @phoebewang does this look good to merge?

https://github.com/llvm/llvm-project/pull/102070


[llvm-branch-commits] [libcxx] Cherry-pick fixes to std::hypot for PowerPC (PR #102052)

2024-08-13 Thread via llvm-branch-commits

github-actions[bot] wrote:

@ldionne (or anyone else): if you would like to add a note about this fix in 
the release notes (completely optional), please reply to this comment with a 
one- or two-sentence description of the fix. When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/102052


[llvm-branch-commits] [llvm] release/19.x: [Hexagon] Do not optimize address of another function's block (#101209) (PR #102179)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

@SundeepKushwaha should this be merged?

https://github.com/llvm/llvm-project/pull/102179


[llvm-branch-commits] [clang] release/19.x: [C++20] [Modules] Don't diagnose duplicated implicit decl in multiple named modules (#102423) (PR #102425)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/102425

>From 3f193bcc5a72f05a4b2d5f74a0ee16a76b3b32f2 Mon Sep 17 00:00:00 2001
From: Chuanqi Xu 
Date: Thu, 8 Aug 2024 13:29:59 +0800
Subject: [PATCH] [C++20] [Modules] Don't diagnose duplicated implicit decl in
 multiple named modules (#102423)

Close https://github.com/llvm/llvm-project/issues/102360
Close https://github.com/llvm/llvm-project/issues/102349

http://eel.is/c++draft/basic.def.odr#15.3 makes it clear that duplicated
definitions are not allowed to be attached to named modules.

But we need to filter out the implicit declarations, since the user can do
nothing about them and the diagnostic message is annoying.

(cherry picked from commit e72d956b99e920b0fe2a7946eb3a51b9e889c73c)
---
 clang/lib/Serialization/ASTReaderDecl.cpp | 65 +--
 clang/test/Modules/pr102349.cppm          | 52 ++
 clang/test/Modules/pr102360.cppm          | 53 ++
 3 files changed, 154 insertions(+), 16 deletions(-)
 create mode 100644 clang/test/Modules/pr102349.cppm
 create mode 100644 clang/test/Modules/pr102360.cppm

diff --git a/clang/lib/Serialization/ASTReaderDecl.cpp 
b/clang/lib/Serialization/ASTReaderDecl.cpp
index 31ab6c651d59f4..ccc97f65526d6c 100644
--- a/clang/lib/Serialization/ASTReaderDecl.cpp
+++ b/clang/lib/Serialization/ASTReaderDecl.cpp
@@ -3684,6 +3684,54 @@ static void inheritDefaultTemplateArguments(ASTContext 
&Context,
   }
 }
 
+// [basic.link]/p10:
+//If two declarations of an entity are attached to different modules,
+//the program is ill-formed;
+static void checkMultipleDefinitionInNamedModules(ASTReader &Reader, Decl *D,
+  Decl *Previous) {
+  Module *M = Previous->getOwningModule();
+
+  // We only care about the case in named modules.
+  if (!M || !M->isNamedModule())
+return;
+
+  // If it is previous implcitly introduced, it is not meaningful to
+  // diagnose it.
+  if (Previous->isImplicit())
+return;
+
+  // FIXME: Get rid of the enumeration of decl types once we have an 
appropriate
+  // abstract for decls of an entity. e.g., the namespace decl and using decl
+  // doesn't introduce an entity.
+  if (!isa(Previous))
+return;
+
+  // Skip implicit instantiations since it may give false positive diagnostic
+  // messages.
+  // FIXME: Maybe this shows the implicit instantiations may have incorrect
+  // module owner ships. But given we've finished the compilation of a module,
+  // how can we add new entities to that module?
+  if (auto *VTSD = dyn_cast(Previous);
+  VTSD && !VTSD->isExplicitSpecialization())
+return;
+  if (auto *CTSD = dyn_cast(Previous);
+  CTSD && !CTSD->isExplicitSpecialization())
+return;
+  if (auto *Func = dyn_cast(Previous))
+if (auto *FTSI = Func->getTemplateSpecializationInfo();
+FTSI && !FTSI->isExplicitSpecialization())
+  return;
+
+  // It is fine if they are in the same module.
+  if (Reader.getContext().isInSameModule(M, D->getOwningModule()))
+return;
+
+  Reader.Diag(Previous->getLocation(),
+  diag::err_multiple_decl_in_different_modules)
+  << cast(Previous) << M->Name;
+  Reader.Diag(D->getLocation(), diag::note_also_found);
+}
+
 void ASTDeclReader::attachPreviousDecl(ASTReader &Reader, Decl *D,
Decl *Previous, Decl *Canon) {
   assert(D && Previous);
@@ -3697,22 +3745,7 @@ void ASTDeclReader::attachPreviousDecl(ASTReader 
&Reader, Decl *D,
 #include "clang/AST/DeclNodes.inc"
   }
 
-  // [basic.link]/p10:
-  //If two declarations of an entity are attached to different modules,
-  //the program is ill-formed;
-  //
-  // FIXME: Get rid of the enumeration of decl types once we have an 
appropriate
-  // abstract for decls of an entity. e.g., the namespace decl and using decl
-  // doesn't introduce an entity.
-  if (Module *M = Previous->getOwningModule();
-  M && M->isNamedModule() &&
-  isa(Previous) 
&&
-  !Reader.getContext().isInSameModule(M, D->getOwningModule())) {
-Reader.Diag(Previous->getLocation(),
-diag::err_multiple_decl_in_different_modules)
-<< cast(Previous) << M->Name;
-Reader.Diag(D->getLocation(), diag::note_also_found);
-  }
+  checkMultipleDefinitionInNamedModules(Reader, D, Previous);
 
   // If the declaration was visible in one module, a redeclaration of it in
   // another module remains visible even if it wouldn't be visible by itself.
diff --git a/clang/test/Modules/pr102349.cppm b/clang/test/Modules/pr102349.cppm
new file mode 100644
index 00..2d166c9e93fcfc
--- /dev/null
+++ b/clang/test/Modules/pr102349.cppm
@@ -0,0 +1,52 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -std=c++20 %t/a.cppm -emit-module-interface -o %t/a.pcm
+// RUN: %clang_cc1 -std=c++20 %t/b.cppm -emit-module-interface -o %t/b.pcm \
+// RUN:   -fprebuilt-module-path=%t
+// 

[llvm-branch-commits] [clang] 3f193bc - [C++20] [Modules] Don't diagnose duplicated implicit decl in multiple named modules (#102423)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

Author: Chuanqi Xu
Date: 2024-08-13T09:08:58+02:00
New Revision: 3f193bcc5a72f05a4b2d5f74a0ee16a76b3b32f2

URL: 
https://github.com/llvm/llvm-project/commit/3f193bcc5a72f05a4b2d5f74a0ee16a76b3b32f2
DIFF: 
https://github.com/llvm/llvm-project/commit/3f193bcc5a72f05a4b2d5f74a0ee16a76b3b32f2.diff

LOG: [C++20] [Modules] Don't diagnose duplicated implicit decl in multiple 
named modules (#102423)

Close https://github.com/llvm/llvm-project/issues/102360
Close https://github.com/llvm/llvm-project/issues/102349

http://eel.is/c++draft/basic.def.odr#15.3 makes it clear that
duplicated definitions are not allowed to be attached to named modules.

But we need to filter the implicit declarations as user can do nothing
about it and the diagnostic message is annoying.

(cherry picked from commit e72d956b99e920b0fe2a7946eb3a51b9e889c73c)

Added: 
clang/test/Modules/pr102349.cppm
clang/test/Modules/pr102360.cppm

Modified: 
clang/lib/Serialization/ASTReaderDecl.cpp

Removed: 




diff  --git a/clang/lib/Serialization/ASTReaderDecl.cpp 
b/clang/lib/Serialization/ASTReaderDecl.cpp
index 31ab6c651d59f4..ccc97f65526d6c 100644
--- a/clang/lib/Serialization/ASTReaderDecl.cpp
+++ b/clang/lib/Serialization/ASTReaderDecl.cpp
@@ -3684,6 +3684,54 @@ static void inheritDefaultTemplateArguments(ASTContext 
&Context,
   }
 }
 
+// [basic.link]/p10:
+//If two declarations of an entity are attached to different modules,
+//the program is ill-formed;
+static void checkMultipleDefinitionInNamedModules(ASTReader &Reader, Decl *D,
+  Decl *Previous) {
+  Module *M = Previous->getOwningModule();
+
+  // We only care about the case in named modules.
+  if (!M || !M->isNamedModule())
+return;
+
+  // If it is previous implcitly introduced, it is not meaningful to
+  // diagnose it.
+  if (Previous->isImplicit())
+return;
+
+  // FIXME: Get rid of the enumeration of decl types once we have an 
appropriate
+  // abstract for decls of an entity. e.g., the namespace decl and using decl
+  // doesn't introduce an entity.
+  if (!isa(Previous))
+return;
+
+  // Skip implicit instantiations since it may give false positive diagnostic
+  // messages.
+  // FIXME: Maybe this shows the implicit instantiations may have incorrect
+  // module owner ships. But given we've finished the compilation of a module,
+  // how can we add new entities to that module?
+  if (auto *VTSD = dyn_cast(Previous);
+  VTSD && !VTSD->isExplicitSpecialization())
+return;
+  if (auto *CTSD = dyn_cast(Previous);
+  CTSD && !CTSD->isExplicitSpecialization())
+return;
+  if (auto *Func = dyn_cast(Previous))
+if (auto *FTSI = Func->getTemplateSpecializationInfo();
+FTSI && !FTSI->isExplicitSpecialization())
+  return;
+
+  // It is fine if they are in the same module.
+  if (Reader.getContext().isInSameModule(M, D->getOwningModule()))
+return;
+
+  Reader.Diag(Previous->getLocation(),
+  diag::err_multiple_decl_in_different_modules)
+  << cast(Previous) << M->Name;
+  Reader.Diag(D->getLocation(), diag::note_also_found);
+}
+
 void ASTDeclReader::attachPreviousDecl(ASTReader &Reader, Decl *D,
Decl *Previous, Decl *Canon) {
   assert(D && Previous);
@@ -3697,22 +3745,7 @@ void ASTDeclReader::attachPreviousDecl(ASTReader 
&Reader, Decl *D,
 #include "clang/AST/DeclNodes.inc"
   }
 
-  // [basic.link]/p10:
-  //If two declarations of an entity are attached to different modules,
-  //the program is ill-formed;
-  //
-  // FIXME: Get rid of the enumeration of decl types once we have an 
appropriate
-  // abstract for decls of an entity. e.g., the namespace decl and using decl
-  // doesn't introduce an entity.
-  if (Module *M = Previous->getOwningModule();
-  M && M->isNamedModule() &&
-  isa(Previous) 
&&
-  !Reader.getContext().isInSameModule(M, D->getOwningModule())) {
-Reader.Diag(Previous->getLocation(),
-diag::err_multiple_decl_in_different_modules)
-<< cast(Previous) << M->Name;
-Reader.Diag(D->getLocation(), diag::note_also_found);
-  }
+  checkMultipleDefinitionInNamedModules(Reader, D, Previous);
 
   // If the declaration was visible in one module, a redeclaration of it in
   // another module remains visible even if it wouldn't be visible by itself.

diff  --git a/clang/test/Modules/pr102349.cppm 
b/clang/test/Modules/pr102349.cppm
new file mode 100644
index 00..2d166c9e93fcfc
--- /dev/null
+++ b/clang/test/Modules/pr102349.cppm
@@ -0,0 +1,52 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -std=c++20 %t/a.cppm -emit-module-interface -o %t/a.pcm
+// RUN: %clang_cc1 -std=c++20 %t/b.cppm -emit-module-interface -o %t/b.pcm \
+// RUN:   -fprebuilt-module-path=%t
+// RUN: %clang_cc1 -std=c++20 %t/c.cppm -emit

[llvm-branch-commits] [clang] release/19.x: [C++20] [Modules] Don't diagnose duplicated implicit decl in multiple named modules (#102423) (PR #102425)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/102425


[llvm-branch-commits] [clang] Cherry pick: [Clang][Sema] Make UnresolvedLookupExprs in class scope explicit spec… (PR #102514)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

@AaronBallman can you have a look?

https://github.com/llvm/llvm-project/pull/102514


[llvm-branch-commits] [clang] release/19.x: [C++20] [Modules] Don't diagnose duplicated implicit decl in multiple named modules (#102423) (PR #102425)

2024-08-13 Thread via llvm-branch-commits

github-actions[bot] wrote:

@ChuanqiXu9 (or anyone else): if you would like to add a note about this fix in 
the release notes (completely optional), please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/102425


[llvm-branch-commits] [clang] release/19.x: Reland [C++20] [Modules] [Itanium ABI] Generate the vtable in the mod… (#102287) (PR #102561)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

@ChuanqiXu9 are we good to merge this? It seems like a pretty big patch but 
reading the description it seems like some good fixes in there. Any risk taking 
this on?

https://github.com/llvm/llvm-project/pull/102561


[llvm-branch-commits] [clang] release/19.x: Reland [C++20] [Modules] [Itanium ABI] Generate the vtable in the mod… (#102287) (PR #102561)

2024-08-13 Thread Chuanqi Xu via llvm-branch-commits

ChuanqiXu9 wrote:

> @ChuanqiXu9 are we good to merge this? It seems like a pretty big patch but 
> reading the description it seems like some good fixes in there. Any risk 
> taking this on?

Yes, this is not small. And I am 100% sure it is fine. But given this is 
important, I would still like to backport this. How about waiting 2 weeks to 
see if there are any regression reports? If not, I'll try to ping you again to 
merge this.

https://github.com/llvm/llvm-project/pull/102561


[llvm-branch-commits] [clang] release/19.x: Reland [C++20] [Modules] [Itanium ABI] Generate the vtable in the mod… (#102287) (PR #102561)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

That seems fine. We should have another 3 weeks before the final is released, 
so if you have higher confidence before that we can merge.

https://github.com/llvm/llvm-project/pull/102561


[llvm-branch-commits] [llvm] d0b1a58 - Address comments

2024-08-13 Thread Alexis Engelke via llvm-branch-commits

Author: Alexis Engelke
Date: 2024-08-13T07:50:05Z
New Revision: d0b1a582fd33e8c3605c027883c6deb35757f560

URL: 
https://github.com/llvm/llvm-project/commit/d0b1a582fd33e8c3605c027883c6deb35757f560
DIFF: 
https://github.com/llvm/llvm-project/commit/d0b1a582fd33e8c3605c027883c6deb35757f560.diff

LOG: Address comments

Added: 


Modified: 
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp 
b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 91e180f9eea13c..edacb2fb33540f 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -5991,6 +5991,9 @@ BoUpSLP::collectUserStores(const BoUpSLP::TreeEntry *TE) 
const {
   DenseMap> PtrToStoresMap;
   for (unsigned Lane : seq(0, TE->Scalars.size())) {
 Value *V = TE->Scalars[Lane];
+// Don't iterate over the users of constant data.
+if (isa(V))
+  continue;
 // To save compilation time we don't visit if we have too many users.
 if (V->hasNUsesOrMore(UsesLimit))
   break;
@@ -5998,8 +6001,8 @@ BoUpSLP::collectUserStores(const BoUpSLP::TreeEntry *TE) 
const {
 // Collect stores per pointer object.
 for (User *U : V->users()) {
   auto *SI = dyn_cast(U);
-  // Test whether we can handle the store. If V is a constant, its users
-  // might be in different functions.
+  // Test whether we can handle the store. V might be a global, which could
+  // be used in a different function.
   if (SI == nullptr || !SI->isSimple() || SI->getFunction() != F ||
   !isValidElementType(SI->getValueOperand()->getType()))
 continue;
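
The filtering order in this hunk can be illustrated outside of LLVM with a toy model. This is only a sketch under stated assumptions: `ToyValue`, `ToyUser` and this free-standing `collectUserStores` are hypothetical stand-ins, not the SLPVectorizer types, and a plain flag replaces the real constant-data check. It shows constant data being skipped before its users are walked, and stores from other functions being rejected afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a user of a value (for example a store).
struct ToyUser {
  std::string Function; // function that contains the user
  bool IsSimpleStore;   // true if the user is a store we could handle
};

// Hypothetical stand-in for a scalar in a tree entry.
struct ToyValue {
  bool IsConstantData;  // models the constant-data check from the patch
  std::vector<ToyUser> Users;
};

// Collects the simple stores in function F that use one of the scalars.
std::vector<const ToyUser *>
collectUserStores(const std::vector<ToyValue> &Scalars, const std::string &F,
                  std::size_t UsesLimit) {
  std::vector<const ToyUser *> Stores;
  for (const ToyValue &V : Scalars) {
    // Never walk the users of constant data.
    if (V.IsConstantData)
      continue;
    // Give up once a value has too many users, to bound compile time.
    if (V.Users.size() >= UsesLimit)
      break;
    for (const ToyUser &U : V.Users) {
      // A non-constant value (such as a global) may still be used in a
      // different function, so the function check stays.
      if (!U.IsSimpleStore || U.Function != F)
        continue;
      Stores.push_back(&U);
    }
  }
  return Stores;
}
```

In this model a constant scalar contributes nothing even when one of its users is a store in `F`, which mirrors the early `continue` added by the patch.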





[llvm-branch-commits] [llvm] [FixIrreducible] Use CycleInfo instead of a custom SCC traversal (PR #103014)

2024-08-13 Thread Sameer Sahasrabuddhe via llvm-branch-commits

https://github.com/ssahasra closed 
https://github.com/llvm/llvm-project/pull/103014


[llvm-branch-commits] [MC][NFC] Statically allocate storage for decoded pseudo probes and function records (PR #102789)

2024-08-13 Thread Amir Ayupov via llvm-branch-commits


@@ -443,17 +446,24 @@ bool MCPseudoProbeDecoder::buildAddress2ProbeMap(
   // If the incoming node is null, all its children nodes should be disgarded.
   if (Cur) {
 // Switch/add to a new tree node(inlinee)
-Cur = Cur->getOrAddNode(std::make_tuple(Guid, Index));
-Cur->Guid = Guid;
+Cur->Children[CurChild] = MCDecodedPseudoProbeInlineTree(Guid, Index, Cur);
+Cur = &Cur->Children[CurChild];
 if (IsTopLevelFunc && !EncodingIsAddrBased) {
   if (auto V = FuncStartAddrs.lookup(Guid))
 LastAddr = V;
 }
   }
+  // Advance CurChild for non-skipped top-level functions and unconditionally
+  // for inlined functions.
+  if (IsTopLevelFunc)
+CurChild += !!Cur;

aaupov wrote:

It was initially this way, but I had to change to advancing CurChildIndex from 
within buildAddress2ProbeMap. The problem is if GuidFilter is in place, we will 
only allocate enough top-level entries for filtered functions. Therefore we 
can't advance CurChildIndex from top-level buildAddress2ProbeMap invocation 
(GUID is not yet parsed).

We can advance non-top level child index though.

Let me refactor this a bit so it's easier to follow.

https://github.com/llvm/llvm-project/pull/102789
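
The indexing constraint described in the comment above can be shown with a small standalone sketch. It is not the LLVM code: `Node`, `passesFilter`, `decodeTopLevel` and `decodeAll` are hypothetical names, a plain list of GUIDs stands in for the encoded section, and an empty filter is assumed to keep everything. When storage is allocated only for the functions that pass the filter, the caller cannot advance the slot index before the callee has read the GUID, so the callee reports whether it used the slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for a decoded top-level inline-tree node.
struct Node {
  uint64_t Guid = 0;
  uint32_t Index = 0;
};

// Assumption for this sketch: an empty filter keeps every function.
static bool passesFilter(uint64_t Guid, const std::set<uint64_t> &Filter) {
  return Filter.empty() || Filter.count(Guid) != 0;
}

// Decodes one top-level record into Slots[SlotIndex] unless it is filtered
// out. Returns true when the slot was used, so the caller can advance.
bool decodeTopLevel(uint64_t Guid, const std::set<uint64_t> &Filter,
                    std::vector<Node> &Slots, uint32_t SlotIndex) {
  if (!passesFilter(Guid, Filter))
    return false;
  Slots[SlotIndex].Guid = Guid;
  Slots[SlotIndex].Index = SlotIndex;
  return true;
}

std::vector<Node> decodeAll(const std::vector<uint64_t> &Guids,
                            const std::set<uint64_t> &Filter) {
  // Allocate exactly one slot per function that survives the filter.
  std::size_t Kept = 0;
  for (uint64_t Guid : Guids)
    Kept += passesFilter(Guid, Filter);
  std::vector<Node> Slots(Kept);

  // The index is passed by value; it advances only when a slot was filled.
  uint32_t CurSlot = 0;
  for (uint64_t Guid : Guids)
    CurSlot += decodeTopLevel(Guid, Filter, Slots, CurSlot);
  return Slots;
}
```

With GUIDs `{10, 20, 30, 40}` and filter `{20, 40}`, two slots are allocated and the kept functions receive indices 0 and 1. Advancing the index unconditionally in the caller would run past the end of the storage.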


[llvm-branch-commits] [llvm] AMDGPU/NewPM: Fill out passes in addCodeGenPrepare (PR #102867)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits


@@ -106,3 +135,12 @@ Error 
AMDGPUCodeGenPassBuilder::addInstSelector(AddMachinePass &addPass) const {
   addPass(SILowerI1CopiesPass());
   return Error::success();
 }
+
+bool AMDGPUCodeGenPassBuilder::isPassEnabled(const cl::opt &Opt,

arsenm wrote:

> IMO adding code noise around these flags is a good thing since these flags 
> are undesirable.

I'm not moved by this argument at all 

https://github.com/llvm/llvm-project/pull/102867


[llvm-branch-commits] [llvm] [MC][NFC] Statically allocate storage for decoded pseudo probes and function records (PR #102789)

2024-08-13 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/102789

>From ddcbb593f72ca47acaa82f9c14a7fd2c4e30903b Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Tue, 13 Aug 2024 03:51:31 -0700
Subject: [PATCH] Pass CurChildIndex by value

Created using spr 1.3.4
---
 llvm/include/llvm/MC/MCPseudoProbe.h |  6 --
 llvm/lib/MC/MCPseudoProbe.cpp| 26 +++---
 2 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/llvm/include/llvm/MC/MCPseudoProbe.h 
b/llvm/include/llvm/MC/MCPseudoProbe.h
index a46188e565c7e8..32d7a4e9129eca 100644
--- a/llvm/include/llvm/MC/MCPseudoProbe.h
+++ b/llvm/include/llvm/MC/MCPseudoProbe.h
@@ -474,11 +474,13 @@ class MCPseudoProbeDecoder {
   }
 
 private:
+  // Recursively parse an inlining tree encoded in pseudo_probe section. Returns
+  // whether the top-level node should be skipped.
   template 
-  void buildAddress2ProbeMap(MCDecodedPseudoProbeInlineTree *Cur,
+  bool buildAddress2ProbeMap(MCDecodedPseudoProbeInlineTree *Cur,
  uint64_t &LastAddr, const Uint64Set &GuildFilter,
  const Uint64Map &FuncStartAddrs,
- uint32_t &CurChild);
+ const uint32_t CurChildIndex);
 };
 
 } // end namespace llvm
diff --git a/llvm/lib/MC/MCPseudoProbe.cpp b/llvm/lib/MC/MCPseudoProbe.cpp
index c4c2dfcec40564..e6f6e797b4ee71 100644
--- a/llvm/lib/MC/MCPseudoProbe.cpp
+++ b/llvm/lib/MC/MCPseudoProbe.cpp
@@ -420,17 +420,17 @@ bool MCPseudoProbeDecoder::buildGUID2FuncDescMap(const 
uint8_t *Start,
 }
 
 template 
-void MCPseudoProbeDecoder::buildAddress2ProbeMap(
+bool MCPseudoProbeDecoder::buildAddress2ProbeMap(
 MCDecodedPseudoProbeInlineTree *Cur, uint64_t &LastAddr,
 const Uint64Set &GuidFilter, const Uint64Map &FuncStartAddrs,
-uint32_t &CurChild) {
+const uint32_t CurChildIndex) {
   // The pseudo_probe section encodes an inline forest and each tree has a
   // format defined in MCPseudoProbe.h
 
   uint32_t Index = 0;
   if (IsTopLevelFunc) {
 // Use a sequential id for top level inliner.
-Index = CurChild;
+Index = CurChildIndex;
   } else {
 // Read inline site for inlinees
 Index = cantFail(errorOrToExpected(readUnsignedNumber()));
@@ -446,19 +446,14 @@ void MCPseudoProbeDecoder::buildAddress2ProbeMap(
   // If the incoming node is null, all its children nodes should be disgarded.
   if (Cur) {
 // Switch/add to a new tree node(inlinee)
-Cur->Children[CurChild] = MCDecodedPseudoProbeInlineTree(Guid, Index, Cur);
-Cur = &Cur->Children[CurChild];
+Cur->Children[CurChildIndex] =
+MCDecodedPseudoProbeInlineTree(Guid, Index, Cur);
+Cur = &Cur->Children[CurChildIndex];
 if (IsTopLevelFunc && !EncodingIsAddrBased) {
   if (auto V = FuncStartAddrs.lookup(Guid))
 LastAddr = V;
 }
   }
-  // Advance CurChild for non-skipped top-level functions and unconditionally
-  // for inlined functions.
-  if (IsTopLevelFunc)
-CurChild += !!Cur;
-  else
-++CurChild;
 
   // Read number of probes in the current node.
   uint32_t NodeCount =
@@ -519,9 +514,10 @@ void MCPseudoProbeDecoder::buildAddress2ProbeMap(
 InlineTreeVec.resize(InlineTreeVec.size() + ChildrenToProcess);
 Cur->Children = 
MutableArrayRef(InlineTreeVec).take_back(ChildrenToProcess);
   }
-  for (uint32_t I = 0; I < ChildrenToProcess;) {
+  for (uint32_t I = 0; I < ChildrenToProcess; I++) {
 buildAddress2ProbeMap(Cur, LastAddr, GuidFilter, FuncStartAddrs, I);
   }
+  return Cur;
 }
 
 template 
@@ -630,10 +626,10 @@ bool MCPseudoProbeDecoder::buildAddress2ProbeMap(
   Data = Start;
   End = Data + Size;
   uint64_t LastAddr = 0;
-  uint32_t Child = 0;
+  uint32_t CurChildIndex = 0;
   while (Data < End)
-buildAddress2ProbeMap(&DummyInlineRoot, LastAddr, GuidFilter,
-FuncStartAddrs, Child);
+CurChildIndex += buildAddress2ProbeMap(
+&DummyInlineRoot, LastAddr, GuidFilter, FuncStartAddrs, CurChildIndex);
   assert(Data == End && "Have unprocessed data in pseudo_probe section");
   return true;
 }



[llvm-branch-commits] [llvm] release/19.x: [AArch64] Add streaming-mode stack hazard optimization remarks (#101695) (PR #102168)

2024-08-13 Thread Hari Limaye via llvm-branch-commits

hazzlim wrote:

> > The patch here is pretty big in size, but it seems to only affect the 
> > remarks, on the other hand it doesn't seem to really fix anything and in 
> > that case I feel like RC3 might be the wrong time to merge this. Is there a 
> > huge upside to take this this late in the process?
> > Also ping @jroelofs as aarch64 domain expert and @AaronBallman as clang 
> > maintainer.
> 
> We had 8 release candidates for 18.x and I would _very much_ like to avoid 
> that happening again, so I think that because we're about to hit rc3 (the 
> last scheduled rc before we release according to the release schedule posted 
> at https://llvm.org/) we should only be taking low-risk, high-impact changes 
> such as fixes to regressions or obviously correct changes. I don't think this 
> patch qualifies; is there significant risk to not putting this in? (e.g., 
> does this fix what you would consider to be a stop-ship level issue of some 
> kind?)

Thank you for taking a look at this PR @AaronBallman. To reiterate points from 
above, I do think that it is low risk (as it is very opt-in) and high impact in 
the sense it helps to diagnose code that can incur a fairly significant 
performance hit. However I appreciate we're almost at rc3 and so understand if 
you don't think this qualifies for the 19 release at this stage.

https://github.com/llvm/llvm-project/pull/102168


[llvm-branch-commits] [clang] release/19.x: [AIX] Revert `#pragma mc_func` check (#102919) (PR #102968)

2024-08-13 Thread Aaron Ballman via llvm-branch-commits

https://github.com/AaronBallman approved this pull request.

LGTM!

https://github.com/llvm/llvm-project/pull/102968


[llvm-branch-commits] [llvm] [Transforms] Refactor CreateControlFlowHub (PR #103013)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits


@@ -0,0 +1,341 @@
+//===- ControlFlowUtils.cpp - Control Flow Utilities 
---==//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Utilities to manipulate the CFG and restore SSA for the new control flow.
+//
+//===--===//
+
+#include "llvm/Transforms/Utils/ControlFlowUtils.h"
+#include "llvm/ADT/SetVector.h"
+#include "llvm/ADT/SmallSet.h"
+#include "llvm/Analysis/DomTreeUpdater.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/ValueHandle.h"
+#include "llvm/Transforms/Utils/Local.h"
+
+#define DEBUG_TYPE "control-flow-hub"
+
+using namespace llvm;
+
+using BBPredicates = DenseMap;
+using EdgeDescriptor = ControlFlowHub::BranchDescriptor;
+
+// Redirects the terminator of the incoming block to the first guard block in
+// the hub. Returns the branch condition from `BB` if it exists.
+// - If only one of Succ0 or Succ1 is not null, the corresponding branch
+//   successor is redirected to the FirstGuardBlock.
+// - Else both are not null, and branch is replaced with an unconditional
+//   branch to the FirstGuardBlock.
+static Value *redirectToHub(BasicBlock *BB, BasicBlock *Succ0,
+BasicBlock *Succ1, BasicBlock *FirstGuardBlock) {
+  assert(isa(BB->getTerminator()) &&
+ "Only support branch terminator.");
+  auto *Branch = cast(BB->getTerminator());
+  auto *Condition = Branch->isConditional() ? Branch->getCondition() : nullptr;
+
+  assert(Succ0 || Succ1);
+
+  if (Branch->isUnconditional()) {
+assert(Succ0 == Branch->getSuccessor(0));
+assert(!Succ1);
+Branch->setSuccessor(0, FirstGuardBlock);
+  } else {
+assert(!Succ1 || Succ1 == Branch->getSuccessor(1));
+if (Succ0 && !Succ1) {
+  Branch->setSuccessor(0, FirstGuardBlock);
+} else if (Succ1 && !Succ0) {
+  Branch->setSuccessor(1, FirstGuardBlock);
+} else {
+  Branch->eraseFromParent();
+  BranchInst::Create(FirstGuardBlock, BB);
+}
+  }
+
+  return Condition;
+}
+
+// Setup the branch instructions for guard blocks.
+//
+// Each guard block terminates in a conditional branch that transfers
+// control to the corresponding outgoing block or the next guard
+// block. The last guard block has two outgoing blocks as successors.
+static void setupBranchForGuard(ArrayRef GuardBlocks,
+ArrayRef Outgoing,
+BBPredicates &GuardPredicates) {
+  assert(Outgoing.size() > 1);
+  assert(GuardBlocks.size() == Outgoing.size() - 1);
+  int I = 0;
+  for (int E = GuardBlocks.size() - 1; I != E; ++I) {
+BasicBlock *Out = Outgoing[I];
+BranchInst::Create(Out, GuardBlocks[I + 1], GuardPredicates[Out],
+   GuardBlocks[I]);
+  }
+  BasicBlock *Out = Outgoing[I];
+  BranchInst::Create(Out, Outgoing[I + 1], GuardPredicates[Out],
+ GuardBlocks[I]);
+}
+
+// Assign an index to each outgoing block. At the corresponding guard
+// block, compute the branch condition by comparing this index.
+static void calcPredicateUsingInteger(ArrayRef Branches,
+  ArrayRef Outgoing,
+  ArrayRef GuardBlocks,
+  BBPredicates &GuardPredicates) {
+  LLVMContext &Context = GuardBlocks.front()->getContext();
+  BasicBlock *FirstGuardBlock = GuardBlocks.front();
+
+  auto *Phi = PHINode::Create(Type::getInt32Ty(Context), Branches.size(),
+  "merged.bb.idx", FirstGuardBlock);
+
+  for (auto [BB, Succ0, Succ1] : Branches) {
+Value *Condition = redirectToHub(BB, Succ0, Succ1, FirstGuardBlock);
+Value *IncomingId = nullptr;
+if (Succ0 && Succ1) {
+  auto Succ0Iter = find(Outgoing, Succ0);
+  auto Succ1Iter = find(Outgoing, Succ1);
+  Value *Id0 = ConstantInt::get(Type::getInt32Ty(Context),
+std::distance(Outgoing.begin(), 
Succ0Iter));
+  Value *Id1 = ConstantInt::get(Type::getInt32Ty(Context),
+std::distance(Outgoing.begin(), 
Succ1Iter));
+  IncomingId = SelectInst::Create(Condition, Id0, Id1, "target.bb.idx",
+  BB->getTerminator()->getIterator());
+} else {
+  // Get the index of the non-null successor.
+  auto SuccIter = Succ0 ? find(Outgoing, Succ0) : find(Outgoing, Succ1);
+  IncomingId = ConstantInt::get(Type::getInt32Ty(Context),
+std::distance(Outgoing.begin(), SuccIter));
+}
+Phi->addIncoming(IncomingId, BB);
+  }
+
+  for (int I = 0, E = Outgoing.size() - 1; I != E; ++I) {
+Basic
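
The quoted hunk is cut off here. As a rough illustration of the guard chain that the comments on `setupBranchForGuard` and `calcPredicateUsingInteger` describe, the following sketch models only the control transfer, not LLVM IR. `routeThroughGuards` is a hypothetical helper, and it assumes each guard's predicate is an equality test between the merged index and that guard's position.

```cpp
#include <cassert>

// Models control transfer through a chain of guard blocks.
// With N outgoing blocks there are N - 1 guards. Guard I branches to outgoing
// block I when the merged index equals I, otherwise to the next guard. The
// last guard falls through to the last outgoing block.
// Returns the outgoing block reached; GuardsVisited counts guards entered.
int routeThroughGuards(int TargetIdx, int NumOutgoing, int &GuardsVisited) {
  assert(NumOutgoing > 1 && TargetIdx >= 0 && TargetIdx < NumOutgoing);
  const int NumGuards = NumOutgoing - 1;
  GuardsVisited = 0;
  for (int Guard = 0; Guard < NumGuards; ++Guard) {
    ++GuardsVisited;
    if (TargetIdx == Guard)
      return Guard; // predicate holds: leave the hub here
  }
  // No guard matched, so the last guard's false edge is taken.
  return NumOutgoing - 1;
}
```

The last two outgoing blocks share a guard, which is why the number of guards is one less than the number of outgoing blocks.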

[llvm-branch-commits] [llvm] [Transforms] Refactor CreateControlFlowHub (PR #103013)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits


@@ -0,0 +1,341 @@
+//===- ControlFlowUtils.cpp - Control Flow Utilities ---==//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Utilities to manipulate the CFG and restore SSA for the new control flow.
+//
+//===--===//
+
+#include "llvm/Transforms/Utils/ControlFlowUtils.h"
+#include "llvm/ADT/SetVector.h"
+#include "llvm/ADT/SmallSet.h"
+#include "llvm/Analysis/DomTreeUpdater.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/ValueHandle.h"
+#include "llvm/Transforms/Utils/Local.h"
+
+#define DEBUG_TYPE "control-flow-hub"
+
+using namespace llvm;
+
+using BBPredicates = DenseMap;
+using EdgeDescriptor = ControlFlowHub::BranchDescriptor;
+
+// Redirects the terminator of the incoming block to the first guard block in
+// the hub. Returns the branch condition from `BB` if it exits.
+// - If only one of Succ0 or Succ1 is not null, the corresponding branch
+//   successor is redirected to the FirstGuardBlock.
+// - Else both are not null, and branch is replaced with an unconditional
+//   branch to the FirstGuardBlock.
+static Value *redirectToHub(BasicBlock *BB, BasicBlock *Succ0,
+                            BasicBlock *Succ1, BasicBlock *FirstGuardBlock) {
+  assert(isa<BranchInst>(BB->getTerminator()) &&
+         "Only support branch terminator.");
+  auto *Branch = cast<BranchInst>(BB->getTerminator());
+  auto *Condition = Branch->isConditional() ? Branch->getCondition() : nullptr;
+
+  assert(Succ0 || Succ1);
+
+  if (Branch->isUnconditional()) {
+    assert(Succ0 == Branch->getSuccessor(0));
+    assert(!Succ1);
+    Branch->setSuccessor(0, FirstGuardBlock);
+  } else {
+    assert(!Succ1 || Succ1 == Branch->getSuccessor(1));
+    if (Succ0 && !Succ1) {
+      Branch->setSuccessor(0, FirstGuardBlock);
+    } else if (Succ1 && !Succ0) {
+      Branch->setSuccessor(1, FirstGuardBlock);
+    } else {
+      Branch->eraseFromParent();
+      BranchInst::Create(FirstGuardBlock, BB);
+    }
+  }
+
+  return Condition;
+}
+
+// Setup the branch instructions for guard blocks.
+//
+// Each guard block terminates in a conditional branch that transfers
+// control to the corresponding outgoing block or the next guard
+// block. The last guard block has two outgoing blocks as successors.
+static void setupBranchForGuard(ArrayRef<BasicBlock *> GuardBlocks,
+                                ArrayRef<BasicBlock *> Outgoing,
+                                BBPredicates &GuardPredicates) {
+  assert(Outgoing.size() > 1);
+  assert(GuardBlocks.size() == Outgoing.size() - 1);
+  int I = 0;
+  for (int E = GuardBlocks.size() - 1; I != E; ++I) {
+    BasicBlock *Out = Outgoing[I];
+    BranchInst::Create(Out, GuardBlocks[I + 1], GuardPredicates[Out],
+                       GuardBlocks[I]);
+  }
+  BasicBlock *Out = Outgoing[I];
+  BranchInst::Create(Out, Outgoing[I + 1], GuardPredicates[Out],
+                     GuardBlocks[I]);
+}
+
+// Assign an index to each outgoing block. At the corresponding guard
+// block, compute the branch condition by comparing this index.
+static void calcPredicateUsingInteger(ArrayRef<EdgeDescriptor> Branches,
+                                      ArrayRef<BasicBlock *> Outgoing,
+                                      ArrayRef<BasicBlock *> GuardBlocks,
+                                      BBPredicates &GuardPredicates) {
+  LLVMContext &Context = GuardBlocks.front()->getContext();
+  BasicBlock *FirstGuardBlock = GuardBlocks.front();
+
+  auto *Phi = PHINode::Create(Type::getInt32Ty(Context), Branches.size(),
+                              "merged.bb.idx", FirstGuardBlock);
+
+  for (auto [BB, Succ0, Succ1] : Branches) {
+    Value *Condition = redirectToHub(BB, Succ0, Succ1, FirstGuardBlock);
+    Value *IncomingId = nullptr;
+    if (Succ0 && Succ1) {
+      auto Succ0Iter = find(Outgoing, Succ0);
+      auto Succ1Iter = find(Outgoing, Succ1);
+      Value *Id0 = ConstantInt::get(Type::getInt32Ty(Context),
+                                    std::distance(Outgoing.begin(), Succ0Iter));
+      Value *Id1 = ConstantInt::get(Type::getInt32Ty(Context),

arsenm wrote:

Avoid repeated getInt32Ty calls 
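
For readers following the quoted patch, the dispatch scheme its comments describe can be modelled without any LLVM dependency. The sketch below is only an illustration under assumptions: blocks are plain strings, the names `BlockList`, `indexOf`, and `dispatch` are hypothetical and do not appear in the patch, and each guard is assumed to test the merged index for equality, which the quoted excerpt does not show. Routing the index lookup through one helper is in the same spirit as the suggestion above to compute a repeated value once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for the list of outgoing blocks, identified by name.
using BlockList = std::vector<std::string>;

// Index of `Succ` in `Outgoing`. The lookup is written once here instead of
// being repeated at every use site.
std::size_t indexOf(const BlockList &Outgoing, const std::string &Succ) {
  auto It = std::find(Outgoing.begin(), Outgoing.end(), Succ);
  assert(It != Outgoing.end() && "successor must be an outgoing block");
  return static_cast<std::size_t>(std::distance(Outgoing.begin(), It));
}

// Walk the guard chain: guard I sends control to Outgoing[I] when the merged
// index equals I (assumed comparison), otherwise to the next guard. The last
// guard has two outgoing blocks as successors, so N outgoing blocks need
// N - 1 guards.
const std::string &dispatch(const BlockList &Outgoing, std::size_t MergedIdx) {
  assert(Outgoing.size() > 1);
  assert(MergedIdx < Outgoing.size());
  const std::size_t NumGuards = Outgoing.size() - 1;
  for (std::size_t I = 0; I != NumGuards; ++I)
    if (MergedIdx == I)
      return Outgoing[I];
  return Outgoing[NumGuards]; // fall-through edge of the last guard
}
```

With three outgoing blocks `{"A", "B", "C"}` the model uses two guards; an incoming edge that targets `"C"` carries index 2, fails both comparisons, and leaves through the second successor of the last guard.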

https://github.com/llvm/llvm-project/pull/103013
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/19.x: [AIX] Revert `#pragma mc_func` check (#102919) (PR #102968)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/102968

>From 28f2d04b3ca36faffe997fa86833e5ed83699272 Mon Sep 17 00:00:00 2001
From: Qiongsi Wu <274595+qiongs...@users.noreply.github.com>
Date: Mon, 12 Aug 2024 16:00:25 -0400
Subject: [PATCH] [AIX] Revert `#pragma mc_func` check (#102919)

https://github.com/llvm/llvm-project/pull/99888 added a specific
diagnostic for `#pragma mc_func` on AIX. There are some disagreements
on:

1. If the check should be on by default. Leaving the check off by
default is dangerous, since it is difficult to be aware of such a check.
Turning it on by default at the moment causes build failures on AIX. See
https://github.com/llvm/llvm-project/pull/101336 for more details.
2. If the check can be made more general. See
https://github.com/llvm/llvm-project/pull/101336#issuecomment-2269283906.

This PR reverts this check from `main` so we can flush out these
disagreements.

(cherry picked from commit 123b6fcc70af17d81c903b839ffb55afc9a9728f)
---
 .../clang/Basic/DiagnosticParseKinds.td   |  3 ---
 clang/include/clang/Driver/Options.td |  7 --
 clang/include/clang/Lex/PreprocessorOptions.h |  5 
 clang/include/clang/Parse/Parser.h|  1 -
 clang/lib/Driver/ToolChains/AIX.cpp   |  6 -
 clang/lib/Parse/ParsePragma.cpp   | 25 ---
 clang/test/Preprocessor/pragma_mc_func.c  | 23 -
 7 files changed, 70 deletions(-)
 delete mode 100644 clang/test/Preprocessor/pragma_mc_func.c

diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td b/clang/include/clang/Basic/DiagnosticParseKinds.td
index f8d50d12bb9351..12aab09f285567 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1260,9 +1260,6 @@ def warn_pragma_intrinsic_builtin : Warning<
 def warn_pragma_unused_expected_var : Warning<
   "expected '#pragma unused' argument to be a variable name">,
   InGroup;
-// - #pragma mc_func
-def err_pragma_mc_func_not_supported :
-   Error<"#pragma mc_func is not supported">;
 // - #pragma init_seg
 def warn_pragma_init_seg_unsupported_target : Warning<
   "'#pragma init_seg' is only supported when targeting a "
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 5e412cb5f51894..15f9ee75492e3f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -8090,13 +8090,6 @@ def source_date_epoch : Separate<["-"], "source-date-epoch">,
 
 } // let Visibility = [CC1Option]
 
-defm err_pragma_mc_func_aix : BoolFOption<"err-pragma-mc-func-aix",
-  PreprocessorOpts<"ErrorOnPragmaMcfuncOnAIX">, DefaultFalse,
-  PosFlag,
-  NegFlag>;
-
 //===--===//
 // CUDA Options
 //===--===//
diff --git a/clang/include/clang/Lex/PreprocessorOptions.h b/clang/include/clang/Lex/PreprocessorOptions.h
index 3f7dd9db18ba7d..c2e3d68333024a 100644
--- a/clang/include/clang/Lex/PreprocessorOptions.h
+++ b/clang/include/clang/Lex/PreprocessorOptions.h
@@ -211,10 +211,6 @@ class PreprocessorOptions {
   /// If set, the UNIX timestamp specified by SOURCE_DATE_EPOCH.
   std::optional SourceDateEpoch;
 
-  /// If set, the preprocessor reports an error when processing #pragma mc_func
-  /// on AIX.
-  bool ErrorOnPragmaMcfuncOnAIX = false;
-
 public:
   PreprocessorOptions() : PrecompiledPreambleBytes(0, false) {}
 
@@ -252,7 +248,6 @@ class PreprocessorOptions {
 PrecompiledPreambleBytes.first = 0;
 PrecompiledPreambleBytes.second = false;
 RetainExcludedConditionalBlocks = false;
-ErrorOnPragmaMcfuncOnAIX = false;
   }
 };
 
diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h
index 35bb1a19d40f0a..f256d603ae6268 100644
--- a/clang/include/clang/Parse/Parser.h
+++ b/clang/include/clang/Parse/Parser.h
@@ -221,7 +221,6 @@ class Parser : public CodeCompletionHandler {
   std::unique_ptr<PragmaHandler> MaxTokensHerePragmaHandler;
   std::unique_ptr<PragmaHandler> MaxTokensTotalPragmaHandler;
   std::unique_ptr<PragmaHandler> RISCVPragmaHandler;
-  std::unique_ptr<PragmaHandler> MCFuncPragmaHandler;
 
   std::unique_ptr<CommentHandler> CommentSemaHandler;
 
diff --git a/clang/lib/Driver/ToolChains/AIX.cpp b/clang/lib/Driver/ToolChains/AIX.cpp
index fb780fb75651d2..b04502a57a9f7a 100644
--- a/clang/lib/Driver/ToolChains/AIX.cpp
+++ b/clang/lib/Driver/ToolChains/AIX.cpp
@@ -557,12 +557,6 @@ void AIX::addClangTargetOptions(
   if (!Args.getLastArgNoClaim(options::OPT_fsized_deallocation,
   options::OPT_fno_sized_deallocation))
 CC1Args.push_back("-fno-sized-deallocation");
-
-  if (Args.hasFlag(options::OPT_ferr_pragma_mc_func_aix,
-   options::OPT_fno_err_pragma_mc_func_aix, false))
-CC1Args.push_back("-ferr-pragma-mc-func-aix");
-  else
-CC1Args.push_back("-fno-err-pragma-mc-func-aix");
 }
 
 void AIX:

[llvm-branch-commits] [clang] 28f2d04 - [AIX] Revert `#pragma mc_func` check (#102919)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

Author: Qiongsi Wu
Date: 2024-08-13T14:04:18+02:00
New Revision: 28f2d04b3ca36faffe997fa86833e5ed83699272

URL: 
https://github.com/llvm/llvm-project/commit/28f2d04b3ca36faffe997fa86833e5ed83699272
DIFF: 
https://github.com/llvm/llvm-project/commit/28f2d04b3ca36faffe997fa86833e5ed83699272.diff

LOG: [AIX] Revert `#pragma mc_func` check (#102919)

https://github.com/llvm/llvm-project/pull/99888 added a specific
diagnostic for `#pragma mc_func` on AIX. There are some disagreements
on:

1. If the check should be on by default. Leaving the check off by
default is dangerous, since it is difficult to be aware of such a check.
Turning it on by default at the moment causes build failures on AIX. See
https://github.com/llvm/llvm-project/pull/101336 for more details.
2. If the check can be made more general. See
https://github.com/llvm/llvm-project/pull/101336#issuecomment-2269283906.

This PR reverts this check from `main` so we can flush out these
disagreements.

(cherry picked from commit 123b6fcc70af17d81c903b839ffb55afc9a9728f)
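
The driver code removed by this revert resolved the `-ferr-pragma-mc-func-aix` / `-fno-err-pragma-mc-func-aix` pair with a default of false, which is the "off by default" behaviour discussed in point 1. A minimal standalone sketch of that last-one-wins resolution follows; `resolveFlag` is a hypothetical helper written for illustration and is not the clang driver's API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Resolve a positive/negative flag pair: the last occurrence on the command
// line wins, and `Default` applies when neither spelling is present.
bool resolveFlag(const std::vector<std::string> &Args, const std::string &Pos,
                 const std::string &Neg, bool Default) {
  bool Value = Default;
  for (const std::string &Arg : Args) {
    if (Arg == Pos)
      Value = true;
    else if (Arg == Neg)
      Value = false;
  }
  return Value;
}
```

Under this model a command line that never mentions the option leaves the check disabled, which is why a user would have to know about the flag in order to benefit from it.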

Added: 


Modified: 
clang/include/clang/Basic/DiagnosticParseKinds.td
clang/include/clang/Driver/Options.td
clang/include/clang/Lex/PreprocessorOptions.h
clang/include/clang/Parse/Parser.h
clang/lib/Driver/ToolChains/AIX.cpp
clang/lib/Parse/ParsePragma.cpp

Removed: 
clang/test/Preprocessor/pragma_mc_func.c



diff  --git a/clang/include/clang/Basic/DiagnosticParseKinds.td b/clang/include/clang/Basic/DiagnosticParseKinds.td
index f8d50d12bb9351..12aab09f285567 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1260,9 +1260,6 @@ def warn_pragma_intrinsic_builtin : Warning<
 def warn_pragma_unused_expected_var : Warning<
   "expected '#pragma unused' argument to be a variable name">,
   InGroup;
-// - #pragma mc_func
-def err_pragma_mc_func_not_supported :
-   Error<"#pragma mc_func is not supported">;
 // - #pragma init_seg
 def warn_pragma_init_seg_unsupported_target : Warning<
   "'#pragma init_seg' is only supported when targeting a "

diff  --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 5e412cb5f51894..15f9ee75492e3f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -8090,13 +8090,6 @@ def source_date_epoch : Separate<["-"], "source-date-epoch">,
 
 } // let Visibility = [CC1Option]
 
-defm err_pragma_mc_func_aix : BoolFOption<"err-pragma-mc-func-aix",
-  PreprocessorOpts<"ErrorOnPragmaMcfuncOnAIX">, DefaultFalse,
-  PosFlag,
-  NegFlag>;
-
 //===--===//
 // CUDA Options
 //===--===//

diff  --git a/clang/include/clang/Lex/PreprocessorOptions.h b/clang/include/clang/Lex/PreprocessorOptions.h
index 3f7dd9db18ba7d..c2e3d68333024a 100644
--- a/clang/include/clang/Lex/PreprocessorOptions.h
+++ b/clang/include/clang/Lex/PreprocessorOptions.h
@@ -211,10 +211,6 @@ class PreprocessorOptions {
   /// If set, the UNIX timestamp specified by SOURCE_DATE_EPOCH.
   std::optional SourceDateEpoch;
 
-  /// If set, the preprocessor reports an error when processing #pragma mc_func
-  /// on AIX.
-  bool ErrorOnPragmaMcfuncOnAIX = false;
-
 public:
   PreprocessorOptions() : PrecompiledPreambleBytes(0, false) {}
 
@@ -252,7 +248,6 @@ class PreprocessorOptions {
 PrecompiledPreambleBytes.first = 0;
 PrecompiledPreambleBytes.second = false;
 RetainExcludedConditionalBlocks = false;
-ErrorOnPragmaMcfuncOnAIX = false;
   }
 };
 

diff  --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h
index 35bb1a19d40f0a..f256d603ae6268 100644
--- a/clang/include/clang/Parse/Parser.h
+++ b/clang/include/clang/Parse/Parser.h
@@ -221,7 +221,6 @@ class Parser : public CodeCompletionHandler {
   std::unique_ptr<PragmaHandler> MaxTokensHerePragmaHandler;
   std::unique_ptr<PragmaHandler> MaxTokensTotalPragmaHandler;
   std::unique_ptr<PragmaHandler> RISCVPragmaHandler;
-  std::unique_ptr<PragmaHandler> MCFuncPragmaHandler;
 
   std::unique_ptr<CommentHandler> CommentSemaHandler;
 

diff  --git a/clang/lib/Driver/ToolChains/AIX.cpp b/clang/lib/Driver/ToolChains/AIX.cpp
index fb780fb75651d2..b04502a57a9f7a 100644
--- a/clang/lib/Driver/ToolChains/AIX.cpp
+++ b/clang/lib/Driver/ToolChains/AIX.cpp
@@ -557,12 +557,6 @@ void AIX::addClangTargetOptions(
   if (!Args.getLastArgNoClaim(options::OPT_fsized_deallocation,
   options::OPT_fno_sized_deallocation))
 CC1Args.push_back("-fno-sized-deallocation");
-
-  if (Args.hasFlag(options::OPT_ferr_pragma_mc_func_aix,
-   options::OPT_fno_err_pragma_mc_func_aix, false))
-CC1Args.push_back("-ferr-pragma-mc-func-aix");
-  else
-CC1Args.push_back("-fno-err-pragma-mc-func-aix");
 }
 
 void AIX::addProfileRTLibs(const llvm::opt::ArgList &Args,

diff  --

[llvm-branch-commits] [clang] release/19.x: [AIX] Revert `#pragma mc_func` check (#102919) (PR #102968)

2024-08-13 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/102968


[llvm-branch-commits] [clang] release/19.x: [AIX] Revert `#pragma mc_func` check (#102919) (PR #102968)

2024-08-13 Thread via llvm-branch-commits

github-actions[bot] wrote:

@qiongsiwu (or anyone else): if you would like to add a note about this fix to
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/102968


[llvm-branch-commits] [clang] release/19.x: [Clang] Correctly forward `--cuda-path` to the nvlink wrapper (#100170) (PR #100216)

2024-08-13 Thread Joseph Huber via llvm-branch-commits

https://github.com/jhuber6 approved this pull request.


https://github.com/llvm/llvm-project/pull/100216


[llvm-branch-commits] [llvm] release/19.x: [CodeGen][ARM64EC] Define hybrid_patchable EXP thunk symbol as a function. (#102898) (PR #103048)

2024-08-13 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/103048


[llvm-branch-commits] [llvm] release/19.x: [CodeGen][ARM64EC] Define hybrid_patchable EXP thunk symbol as a function. (#102898) (PR #103048)

2024-08-13 Thread via llvm-branch-commits

llvmbot wrote:

@efriedma-quic What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/103048


[llvm-branch-commits] [llvm] release/19.x: [CodeGen][ARM64EC] Define hybrid_patchable EXP thunk symbol as a function. (#102898) (PR #103048)

2024-08-13 Thread via llvm-branch-commits

https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/103048

Backport d550ada5ab6cd6e49de71ac4c9aa27ced4c11de0

Requested by: @cjacek

>From f1e58884bfc55c3ec83f3f882a4219bf48059fdc Mon Sep 17 00:00:00 2001
From: Jacek Caban 
Date: Tue, 13 Aug 2024 13:39:42 +0200
Subject: [PATCH] [CodeGen][ARM64EC] Define hybrid_patchable EXP thunk symbol
 as a function. (#102898)

This is needed for MSVC link.exe to generate redirection metadata for hybrid
patchable thunks.

(cherry picked from commit d550ada5ab6cd6e49de71ac4c9aa27ced4c11de0)
---
 llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp |  7 ++
 .../AArch64/arm64ec-hybrid-patchable.ll   | 24 +++
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index 3c9b07ad45bf24..c64454cc253c35 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -1292,6 +1292,13 @@ void AArch64AsmPrinter::emitGlobalAlias(const Module &M,
   StringRef ExpStr = cast(Node->getOperand(0))->getString();
   MCSymbol *ExpSym = MMI->getContext().getOrCreateSymbol(ExpStr);
   MCSymbol *Sym = MMI->getContext().getOrCreateSymbol(GA.getName());
+
+  OutStreamer->beginCOFFSymbolDef(ExpSym);
+  OutStreamer->emitCOFFSymbolStorageClass(COFF::IMAGE_SYM_CLASS_EXTERNAL);
+  OutStreamer->emitCOFFSymbolType(COFF::IMAGE_SYM_DTYPE_FUNCTION
+  << COFF::SCT_COMPLEX_TYPE_SHIFT);
+  OutStreamer->endCOFFSymbolDef();
+
   OutStreamer->beginCOFFSymbolDef(Sym);
   OutStreamer->emitCOFFSymbolStorageClass(COFF::IMAGE_SYM_CLASS_EXTERNAL);
   OutStreamer->emitCOFFSymbolType(COFF::IMAGE_SYM_DTYPE_FUNCTION
diff --git a/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll b/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
index 64fb5b36b2c623..1ed6a273338abb 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
@@ -240,6 +240,10 @@ define dso_local void @caller() nounwind {
 ; CHECK-NEXT:  .section.drectve,"yni"
 ; CHECK-NEXT:  .ascii  " /EXPORT:exp"
 
+; CHECK-NEXT:  .def"EXP+#func";
+; CHECK-NEXT:  .scl2;
+; CHECK-NEXT:  .type   32;
+; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .deffunc;
 ; CHECK-NEXT:  .scl2;
 ; CHECK-NEXT:  .type   32;
@@ -252,6 +256,10 @@ define dso_local void @caller() nounwind {
 ; CHECK-NEXT:  .type   32;
 ; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .set "#func", "#func$hybpatch_thunk"{{$}}
+; CHECK-NEXT:  .def"EXP+#has_varargs";
+; CHECK-NEXT:  .scl2;
+; CHECK-NEXT:  .type   32;
+; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .defhas_varargs;
 ; CHECK-NEXT:  .scl2;
 ; CHECK-NEXT:  .type   32;
@@ -264,6 +272,10 @@ define dso_local void @caller() nounwind {
 ; CHECK-NEXT:  .type   32;
 ; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .set "#has_varargs", "#has_varargs$hybpatch_thunk"
+; CHECK-NEXT:  .def"EXP+#has_sret";
+; CHECK-NEXT:  .scl2;
+; CHECK-NEXT:  .type   32;
+; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .defhas_sret;
 ; CHECK-NEXT:  .scl2;
 ; CHECK-NEXT:  .type   32;
@@ -276,6 +288,10 @@ define dso_local void @caller() nounwind {
 ; CHECK-NEXT:  .type   32;
 ; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .set "#has_sret", "#has_sret$hybpatch_thunk"
+; CHECK-NEXT:  .def"EXP+#exp";
+; CHECK-NEXT:  .scl2;
+; CHECK-NEXT:  .type   32;
+; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .defexp;
 ; CHECK-NEXT:  .scl2;
 ; CHECK-NEXT:  .type   32;
@@ -295,18 +311,18 @@ define dso_local void @caller() nounwind {
 ; SYM:  [78](sec 20)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x #exp$hybpatch_thunk
 ; SYM:  [110](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x func
 ; SYM-NEXT: AUX indx 112 srch 3
-; SYM-NEXT: [112](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x EXP+#func
+; SYM-NEXT: [112](sec  0)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x EXP+#func
 ; SYM:  [116](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x #func
 ; SYM-NEXT: AUX indx 53 srch 3
 ; SYM:  [122](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x has_varargs
 ; SYM-NEXT: AUX indx 124 srch 3
-; SYM-NEXT: [124](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x EXP+#has_varargs
+; SYM-NEXT: [124](sec  0)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x EXP+#has_varargs
 ; SYM-NEXT: [125](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x has_sret
 ; SYM-NEXT: AUX indx 127 srch 3
-; SYM-NEXT: [127](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x EXP+#has_sret
+; SYM-NEXT: [127](sec  0)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x EXP+#has_sret
 ; SYM-NEXT: [128](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x exp
 ; SYM-NEXT: AUX indx 130 srch 3
-; SYM-NEXT: [130](sec  0)(fl 0x00)(

[llvm-branch-commits] [llvm] release/19.x: [CodeGen][ARM64EC] Define hybrid_patchable EXP thunk symbol as a function. (#102898) (PR #103048)

2024-08-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: None (llvmbot)


Changes

Backport d550ada5ab6cd6e49de71ac4c9aa27ced4c11de0

Requested by: @cjacek

---
Full diff: https://github.com/llvm/llvm-project/pull/103048.diff


2 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+7) 
- (modified) llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll (+20-4) 


```diff
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index 3c9b07ad45bf24..c64454cc253c35 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -1292,6 +1292,13 @@ void AArch64AsmPrinter::emitGlobalAlias(const Module &M,
   StringRef ExpStr = cast(Node->getOperand(0))->getString();
   MCSymbol *ExpSym = MMI->getContext().getOrCreateSymbol(ExpStr);
   MCSymbol *Sym = MMI->getContext().getOrCreateSymbol(GA.getName());
+
+  OutStreamer->beginCOFFSymbolDef(ExpSym);
+  OutStreamer->emitCOFFSymbolStorageClass(COFF::IMAGE_SYM_CLASS_EXTERNAL);
+  OutStreamer->emitCOFFSymbolType(COFF::IMAGE_SYM_DTYPE_FUNCTION
+  << COFF::SCT_COMPLEX_TYPE_SHIFT);
+  OutStreamer->endCOFFSymbolDef();
+
   OutStreamer->beginCOFFSymbolDef(Sym);
   OutStreamer->emitCOFFSymbolStorageClass(COFF::IMAGE_SYM_CLASS_EXTERNAL);
   OutStreamer->emitCOFFSymbolType(COFF::IMAGE_SYM_DTYPE_FUNCTION
diff --git a/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll b/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
index 64fb5b36b2c623..1ed6a273338abb 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
@@ -240,6 +240,10 @@ define dso_local void @caller() nounwind {
 ; CHECK-NEXT:  .section.drectve,"yni"
 ; CHECK-NEXT:  .ascii  " /EXPORT:exp"
 
+; CHECK-NEXT:  .def"EXP+#func";
+; CHECK-NEXT:  .scl2;
+; CHECK-NEXT:  .type   32;
+; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .deffunc;
 ; CHECK-NEXT:  .scl2;
 ; CHECK-NEXT:  .type   32;
@@ -252,6 +256,10 @@ define dso_local void @caller() nounwind {
 ; CHECK-NEXT:  .type   32;
 ; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .set "#func", "#func$hybpatch_thunk"{{$}}
+; CHECK-NEXT:  .def"EXP+#has_varargs";
+; CHECK-NEXT:  .scl2;
+; CHECK-NEXT:  .type   32;
+; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .defhas_varargs;
 ; CHECK-NEXT:  .scl2;
 ; CHECK-NEXT:  .type   32;
@@ -264,6 +272,10 @@ define dso_local void @caller() nounwind {
 ; CHECK-NEXT:  .type   32;
 ; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .set "#has_varargs", "#has_varargs$hybpatch_thunk"
+; CHECK-NEXT:  .def"EXP+#has_sret";
+; CHECK-NEXT:  .scl2;
+; CHECK-NEXT:  .type   32;
+; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .defhas_sret;
 ; CHECK-NEXT:  .scl2;
 ; CHECK-NEXT:  .type   32;
@@ -276,6 +288,10 @@ define dso_local void @caller() nounwind {
 ; CHECK-NEXT:  .type   32;
 ; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .set "#has_sret", "#has_sret$hybpatch_thunk"
+; CHECK-NEXT:  .def"EXP+#exp";
+; CHECK-NEXT:  .scl2;
+; CHECK-NEXT:  .type   32;
+; CHECK-NEXT:  .endef
 ; CHECK-NEXT:  .defexp;
 ; CHECK-NEXT:  .scl2;
 ; CHECK-NEXT:  .type   32;
@@ -295,18 +311,18 @@ define dso_local void @caller() nounwind {
 ; SYM:  [78](sec 20)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x #exp$hybpatch_thunk
 ; SYM:  [110](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x func
 ; SYM-NEXT: AUX indx 112 srch 3
-; SYM-NEXT: [112](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x EXP+#func
+; SYM-NEXT: [112](sec  0)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x EXP+#func
 ; SYM:  [116](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x #func
 ; SYM-NEXT: AUX indx 53 srch 3
 ; SYM:  [122](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x has_varargs
 ; SYM-NEXT: AUX indx 124 srch 3
-; SYM-NEXT: [124](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x EXP+#has_varargs
+; SYM-NEXT: [124](sec  0)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x EXP+#has_varargs
 ; SYM-NEXT: [125](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x has_sret
 ; SYM-NEXT: AUX indx 127 srch 3
-; SYM-NEXT: [127](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x EXP+#has_sret
+; SYM-NEXT: [127](sec  0)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x EXP+#has_sret
 ; SYM-NEXT: [128](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x exp
 ; SYM-NEXT: AUX indx 130 srch 3
-; SYM-NEXT: [130](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x EXP+#exp
+; SYM-NEXT: [130](sec  0)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x EXP+#exp
 ; SYM-NEXT: [131](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x #has_varargs
 ; SYM-NEXT: AUX indx 58 srch 3
 ; SYM-NEXT: [133](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x #has_sret

```
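
The `.type 32` directives and the `(ty  20)` fields in the test above are the same number in decimal and hexadecimal. A small sketch of the arithmetic, assuming the PE/COFF values 2 for the function derived type and a shift of 4 for the complex-type field; the constant and function names below are local to this sketch and are not LLVM's `COFF::` enumerators.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values, mirroring IMAGE_SYM_DTYPE_FUNCTION and the complex-type
// shift used in the patch; names are local to this illustration.
constexpr std::uint16_t DTypeFunction = 2;
constexpr std::uint16_t ComplexTypeShift = 4;
constexpr std::uint16_t BaseTypeNull = 0;

// Pack a derived (complex) type and a base type into one COFF symbol type.
constexpr std::uint16_t makeSymbolType(std::uint16_t Complex,
                                       std::uint16_t Base) {
  return static_cast<std::uint16_t>((Complex << ComplexTypeShift) | Base);
}

constexpr std::uint16_t FunctionType =
    makeSymbolType(DTypeFunction, BaseTypeNull);

// 2 << 4 is 32 in decimal (the .type operand) and 0x20 in hexadecimal (the
// ty field in the symbol table dump).
static_assert(FunctionType == 32, "decimal form");
static_assert(FunctionType == 0x20, "hexadecimal form");
```

Before the patch the `EXP+#...` symbols carried type 0, so the symbol table showed `(ty   0)`; defining them with the function type yields `(ty  20)`.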

[llvm-branch-commits] [clang] Cherry pick: [Clang][Sema] Make UnresolvedLookupExprs in class scope explicit spec… (PR #102514)

2024-08-13 Thread Aaron Ballman via llvm-branch-commits

https://github.com/AaronBallman approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/102514


[llvm-branch-commits] [llvm] AtomicExpand: Convert ARM test to generated checks (PR #103064)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/103064

This was close to manually written full checks, and was missing
a change in a future commit.

>From b35e514f602290b5b3b5f85f1f2237f795c1e472 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Tue, 13 Aug 2024 14:03:14 +0400
Subject: [PATCH] AtomicExpand: Convert ARM test to generated checks

This was close to manually written full checks, and was missing
a change in a future commit.
---
 .../AtomicExpand/ARM/cmpxchg-weak.ll  | 275 +-
 1 file changed, 130 insertions(+), 145 deletions(-)

diff --git a/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll b/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll
index 23aa57e18ecc5a..8195a5b6145e3a 100644
--- a/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll
+++ b/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll
@@ -1,169 +1,154 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
 ; RUN: opt -passes=atomic-expand -codegen-opt-level=1 -S -mtriple=thumbv7s-apple-ios7.0 %s | FileCheck %s
 
-define i32 @test_cmpxchg_seq_cst(ptr %addr, i32 %desired, i32 %new) {
-; CHECK-LABEL: @test_cmpxchg_seq_cst
 ; Intrinsic for "dmb ishst" is then expected
-; CHECK: br label %[[START:.*]]
-
-; CHECK: [[START]]:
-; CHECK: [[LOADED:%.*]] = call i32 @llvm.arm.ldrex.p0(ptr elementtype(i32) %addr)
-; CHECK: [[SHOULD_STORE:%.*]] = icmp eq i32 [[LOADED]], %desired
-; CHECK: br i1 [[SHOULD_STORE]], label %[[FENCED_STORE:.*]], label %[[NO_STORE_BB:.*]]
-
-; CHECK: [[FENCED_STORE]]:
-; CHECK: call void @llvm.arm.dmb(i32 10)
-; CHECK: br label %[[TRY_STORE:.*]]
-
-; CHECK: [[TRY_STORE]]:
-; CHECK: [[LOADED_TRYSTORE:%.*]] = phi i32 [ [[LOADED]], %[[FENCED_STORE]] ]
-; CHECK: [[STREX:%.*]] = call i32 @llvm.arm.strex.p0(i32 %new, ptr elementtype(i32) %addr)
-; CHECK: [[SUCCESS:%.*]] = icmp eq i32 [[STREX]], 0
-; CHECK: br i1 [[SUCCESS]], label %[[SUCCESS_BB:.*]], label %[[FAILURE_BB:.*]]
-
-; CHECK: [[SUCCESS_BB]]:
-; CHECK: call void @llvm.arm.dmb(i32 11)
-; CHECK: br label %[[END:.*]]
-
-; CHECK: [[NO_STORE_BB]]:
-; CHECK: [[LOADED_NOSTORE:%.*]] = phi i32 [ [[LOADED]], %[[START]] ]
-; CHECK: call void @llvm.arm.clrex()
-; CHECK: br label %[[FAILURE_BB]]
-
-; CHECK: [[FAILURE_BB]]:
-; CHECK: [[LOADED_FAILURE:%.*]] = phi i32 [ [[LOADED_NOSTORE]], 
%[[NO_STORE_BB]] ], [ [[LOADED_TRYSTORE]], %[[TRY_STORE]] ]
-; CHECK: call void @llvm.arm.dmb(i32 11)
-; CHECK: br label %[[END]]
-
-; CHECK: [[END]]:
-; CHECK: [[LOADED_EXIT:%.*]] = phi i32 [ [[LOADED_TRYSTORE]], 
%[[SUCCESS_BB]] ], [ [[LOADED_FAILURE]], %[[FAILURE_BB]] ]
-; CHECK: [[SUCCESS:%.*]] = phi i1 [ true, %[[SUCCESS_BB]] ], [ false, 
%[[FAILURE_BB]] ]
-; CHECK: ret i32 [[LOADED_EXIT]]
-
+define i32 @test_cmpxchg_seq_cst(ptr %addr, i32 %desired, i32 %new) {
+; CHECK-LABEL: define i32 @test_cmpxchg_seq_cst(
+; CHECK-SAME: ptr [[ADDR:%.*]], i32 [[DESIRED:%.*]], i32 [[NEW:%.*]]) {
+; CHECK-NEXT:br label %[[CMPXCHG_START:.*]]
+; CHECK:   [[CMPXCHG_START]]:
+; CHECK-NEXT:[[TMP1:%.*]] = call i32 @llvm.arm.ldrex.p0(ptr 
elementtype(i32) [[ADDR]])
+; CHECK-NEXT:[[SHOULD_STORE:%.*]] = icmp eq i32 [[TMP1]], [[DESIRED]]
+; CHECK-NEXT:br i1 [[SHOULD_STORE]], label %[[CMPXCHG_FENCEDSTORE:.*]], 
label %[[CMPXCHG_NOSTORE:.*]]
+; CHECK:   [[CMPXCHG_FENCEDSTORE]]:
+; CHECK-NEXT:call void @llvm.arm.dmb(i32 10)
+; CHECK-NEXT:br label %[[CMPXCHG_TRYSTORE:.*]]
+; CHECK:   [[CMPXCHG_TRYSTORE]]:
+; CHECK-NEXT:[[LOADED_TRYSTORE:%.*]] = phi i32 [ [[TMP1]], 
%[[CMPXCHG_FENCEDSTORE]] ]
+; CHECK-NEXT:[[TMP2:%.*]] = call i32 @llvm.arm.strex.p0(i32 [[NEW]], ptr 
elementtype(i32) [[ADDR]])
+; CHECK-NEXT:[[SUCCESS:%.*]] = icmp eq i32 [[TMP2]], 0
+; CHECK-NEXT:br i1 [[SUCCESS]], label %[[CMPXCHG_SUCCESS:.*]], label 
%[[CMPXCHG_FAILURE:.*]]
+; CHECK:   [[CMPXCHG_RELEASEDLOAD:.*:]]
+; CHECK-NEXT:unreachable
+; CHECK:   [[CMPXCHG_SUCCESS]]:
+; CHECK-NEXT:call void @llvm.arm.dmb(i32 11)
+; CHECK-NEXT:br label %[[CMPXCHG_END:.*]]
+; CHECK:   [[CMPXCHG_NOSTORE]]:
+; CHECK-NEXT:[[LOADED_NOSTORE:%.*]] = phi i32 [ [[TMP1]], 
%[[CMPXCHG_START]] ]
+; CHECK-NEXT:call void @llvm.arm.clrex()
+; CHECK-NEXT:br label %[[CMPXCHG_FAILURE]]
+; CHECK:   [[CMPXCHG_FAILURE]]:
+; CHECK-NEXT:[[LOADED_FAILURE:%.*]] = phi i32 [ [[LOADED_NOSTORE]], 
%[[CMPXCHG_NOSTORE]] ], [ [[LOADED_TRYSTORE]], %[[CMPXCHG_TRYSTORE]] ]
+; CHECK-NEXT:call void @llvm.arm.dmb(i32 11)
+; CHECK-NEXT:br label %[[CMPXCHG_END]]
+; CHECK:   [[CMPXCHG_END]]:
+; CHECK-NEXT:[[LOADED_EXIT:%.*]] = phi i32 [ [[LOADED_TRYSTORE]], 
%[[CMPXCHG_SUCCESS]] ], [ [[LOADED_FAILURE]], %[[CMPXCHG_FAILURE]] ]
+; CHECK-NEXT:[[SUCCESS1:%.*]] = phi i1 [ true, %[[CMPXCHG_SUCCESS]] ], [ 
false, %[[CMPXCHG_FAILURE]] ]
+; CHECK-NEXT:ret i32 [[LOADED_EXIT]]
+;
   %pair = cmpx

[llvm-branch-commits] [llvm] AtomicExpand: Convert ARM test to generated checks (PR #103064)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/103064
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AtomicExpand: Convert ARM test to generated checks (PR #103064)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

* **#103064** 👈
* **#103063**
* `main`

This stack of pull requests is managed by Graphite.
  

https://github.com/llvm/llvm-project/pull/103064
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AtomicExpand: Convert ARM test to generated checks (PR #103064)

2024-08-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Matt Arsenault (arsenm)


Changes

This was close to manually written full checks, and was missing
a change in a future commit.

---
Full diff: https://github.com/llvm/llvm-project/pull/103064.diff


1 Files Affected:

- (modified) llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll (+130-145) 


``diff
diff --git a/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll 
b/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll
index 23aa57e18ecc5..8195a5b6145e3 100644
--- a/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll
+++ b/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll
@@ -1,169 +1,154 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 5
 ; RUN: opt -passes=atomic-expand -codegen-opt-level=1 -S 
-mtriple=thumbv7s-apple-ios7.0 %s | FileCheck %s
 
-define i32 @test_cmpxchg_seq_cst(ptr %addr, i32 %desired, i32 %new) {
-; CHECK-LABEL: @test_cmpxchg_seq_cst
 ; Intrinsic for "dmb ishst" is then expected
-; CHECK: br label %[[START:.*]]
-
-; CHECK: [[START]]:
-; CHECK: [[LOADED:%.*]] = call i32 @llvm.arm.ldrex.p0(ptr elementtype(i32) 
%addr)
-; CHECK: [[SHOULD_STORE:%.*]] = icmp eq i32 [[LOADED]], %desired
-; CHECK: br i1 [[SHOULD_STORE]], label %[[FENCED_STORE:.*]], label 
%[[NO_STORE_BB:.*]]
-
-; CHECK: [[FENCED_STORE]]:
-; CHECK: call void @llvm.arm.dmb(i32 10)
-; CHECK: br label %[[TRY_STORE:.*]]
-
-; CHECK: [[TRY_STORE]]:
-; CHECK: [[LOADED_TRYSTORE:%.*]] = phi i32 [ [[LOADED]], %[[FENCED_STORE]] 
]
-; CHECK: [[STREX:%.*]] = call i32 @llvm.arm.strex.p0(i32 %new, ptr 
elementtype(i32) %addr)
-; CHECK: [[SUCCESS:%.*]] = icmp eq i32 [[STREX]], 0
-; CHECK: br i1 [[SUCCESS]], label %[[SUCCESS_BB:.*]], label 
%[[FAILURE_BB:.*]]
-
-; CHECK: [[SUCCESS_BB]]:
-; CHECK: call void @llvm.arm.dmb(i32 11)
-; CHECK: br label %[[END:.*]]
-
-; CHECK: [[NO_STORE_BB]]:
-; CHECK: [[LOADED_NOSTORE:%.*]] = phi i32 [ [[LOADED]], %[[START]] ]
-; CHECK: call void @llvm.arm.clrex()
-; CHECK: br label %[[FAILURE_BB]]
-
-; CHECK: [[FAILURE_BB]]:
-; CHECK: [[LOADED_FAILURE:%.*]] = phi i32 [ [[LOADED_NOSTORE]], 
%[[NO_STORE_BB]] ], [ [[LOADED_TRYSTORE]], %[[TRY_STORE]] ]
-; CHECK: call void @llvm.arm.dmb(i32 11)
-; CHECK: br label %[[END]]
-
-; CHECK: [[END]]:
-; CHECK: [[LOADED_EXIT:%.*]] = phi i32 [ [[LOADED_TRYSTORE]], 
%[[SUCCESS_BB]] ], [ [[LOADED_FAILURE]], %[[FAILURE_BB]] ]
-; CHECK: [[SUCCESS:%.*]] = phi i1 [ true, %[[SUCCESS_BB]] ], [ false, 
%[[FAILURE_BB]] ]
-; CHECK: ret i32 [[LOADED_EXIT]]
-
+define i32 @test_cmpxchg_seq_cst(ptr %addr, i32 %desired, i32 %new) {
+; CHECK-LABEL: define i32 @test_cmpxchg_seq_cst(
+; CHECK-SAME: ptr [[ADDR:%.*]], i32 [[DESIRED:%.*]], i32 [[NEW:%.*]]) {
+; CHECK-NEXT:br label %[[CMPXCHG_START:.*]]
+; CHECK:   [[CMPXCHG_START]]:
+; CHECK-NEXT:[[TMP1:%.*]] = call i32 @llvm.arm.ldrex.p0(ptr 
elementtype(i32) [[ADDR]])
+; CHECK-NEXT:[[SHOULD_STORE:%.*]] = icmp eq i32 [[TMP1]], [[DESIRED]]
+; CHECK-NEXT:br i1 [[SHOULD_STORE]], label %[[CMPXCHG_FENCEDSTORE:.*]], 
label %[[CMPXCHG_NOSTORE:.*]]
+; CHECK:   [[CMPXCHG_FENCEDSTORE]]:
+; CHECK-NEXT:call void @llvm.arm.dmb(i32 10)
+; CHECK-NEXT:br label %[[CMPXCHG_TRYSTORE:.*]]
+; CHECK:   [[CMPXCHG_TRYSTORE]]:
+; CHECK-NEXT:[[LOADED_TRYSTORE:%.*]] = phi i32 [ [[TMP1]], 
%[[CMPXCHG_FENCEDSTORE]] ]
+; CHECK-NEXT:[[TMP2:%.*]] = call i32 @llvm.arm.strex.p0(i32 [[NEW]], ptr 
elementtype(i32) [[ADDR]])
+; CHECK-NEXT:[[SUCCESS:%.*]] = icmp eq i32 [[TMP2]], 0
+; CHECK-NEXT:br i1 [[SUCCESS]], label %[[CMPXCHG_SUCCESS:.*]], label 
%[[CMPXCHG_FAILURE:.*]]
+; CHECK:   [[CMPXCHG_RELEASEDLOAD:.*:]]
+; CHECK-NEXT:unreachable
+; CHECK:   [[CMPXCHG_SUCCESS]]:
+; CHECK-NEXT:call void @llvm.arm.dmb(i32 11)
+; CHECK-NEXT:br label %[[CMPXCHG_END:.*]]
+; CHECK:   [[CMPXCHG_NOSTORE]]:
+; CHECK-NEXT:[[LOADED_NOSTORE:%.*]] = phi i32 [ [[TMP1]], 
%[[CMPXCHG_START]] ]
+; CHECK-NEXT:call void @llvm.arm.clrex()
+; CHECK-NEXT:br label %[[CMPXCHG_FAILURE]]
+; CHECK:   [[CMPXCHG_FAILURE]]:
+; CHECK-NEXT:[[LOADED_FAILURE:%.*]] = phi i32 [ [[LOADED_NOSTORE]], 
%[[CMPXCHG_NOSTORE]] ], [ [[LOADED_TRYSTORE]], %[[CMPXCHG_TRYSTORE]] ]
+; CHECK-NEXT:call void @llvm.arm.dmb(i32 11)
+; CHECK-NEXT:br label %[[CMPXCHG_END]]
+; CHECK:   [[CMPXCHG_END]]:
+; CHECK-NEXT:[[LOADED_EXIT:%.*]] = phi i32 [ [[LOADED_TRYSTORE]], 
%[[CMPXCHG_SUCCESS]] ], [ [[LOADED_FAILURE]], %[[CMPXCHG_FAILURE]] ]
+; CHECK-NEXT:[[SUCCESS1:%.*]] = phi i1 [ true, %[[CMPXCHG_SUCCESS]] ], [ 
false, %[[CMPXCHG_FAILURE]] ]
+; CHECK-NEXT:ret i32 [[LOADED_EXIT]]
+;
   %pair = cmpxchg weak ptr %addr, i32 %desired, i32 %new seq_cst seq_cst
   %oldval = extractvalue { i32, i1 } %pair, 0
   ret i32 %oldval
 }
 
 define i1 @test_cmpxchg_weak_fail(ptr %addr, i32 %desired, i32 %new) {
-; CHECK-LABEL: @test_c

[llvm-branch-commits] [llvm] [Transforms] Refactor CreateControlFlowHub (PR #103013)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits


@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt < %s -passes='lower-switch,fix-irreducible' -S | FileCheck %s
+; RUN: opt < %s -lowerswitch -fix-irreducible --verify-loop-info -S | 
FileCheck %s

arsenm wrote:

Does this really need the old PM run lines? 

https://github.com/llvm/llvm-project/pull/103013
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Transforms] Refactor CreateControlFlowHub (PR #103013)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits


@@ -140,53 +141,43 @@ static void restoreSSA(const DominatorTree &DT, const 
Loop *L,
   }
 }
 
+static bool isExitBlock(Loop *L, BasicBlock *Succ, LoopInfo &LI) {
+  Loop *SL = LI.getLoopFor(Succ);
+  if (SL == L || L->contains(SL))
+return false;
+  return true;

arsenm wrote:

return bool expression 
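
A minimal sketch of what this comment seems to ask for: the `if` / `return false` / `return true` sequence collapses into a single returned expression. The `Loop`, `BasicBlock`, and `LoopInfo` types below are simplified, hypothetical stand-ins so the snippet compiles on its own; they are not the LLVM classes.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::Loop: only tracks the parent loop.
struct Loop {
  const Loop *Parent = nullptr;

  // True if Other is this loop or is nested inside it; a null loop is
  // never contained.
  bool contains(const Loop *Other) const {
    for (; Other; Other = Other->Parent)
      if (Other == this)
        return true;
    return false;
  }
};

// Hypothetical stand-in for llvm::BasicBlock: remembers its innermost loop.
struct BasicBlock {
  Loop *InnermostLoop = nullptr;
};

// Hypothetical stand-in for llvm::LoopInfo.
struct LoopInfo {
  Loop *getLoopFor(const BasicBlock *BB) const { return BB->InnermostLoop; }
};

// Same predicate as in the quoted hunk, written as one boolean expression.
static bool isExitBlock(Loop *L, BasicBlock *Succ, LoopInfo &LI) {
  Loop *SL = LI.getLoopFor(Succ);
  return SL != L && !L->contains(SL);
}
```

By De Morgan's law, `!(SL == L || L->contains(SL))` and `SL != L && !L->contains(SL)` are equivalent, so either spelling keeps the behavior of the original.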

https://github.com/llvm/llvm-project/pull/103013
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Transforms] Refactor CreateControlFlowHub (PR #103013)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits


@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt < %s -passes='lower-switch,fix-irreducible' -S | FileCheck %s
+; RUN: opt < %s -lowerswitch -fix-irreducible --verify-loop-info -S | 
FileCheck %s
+; RUN: opt < %s -passes="lower-switch,fix-irreducible,verify" -S | 
FileCheck %s

arsenm wrote:

I think these usually use single quotes 

https://github.com/llvm/llvm-project/pull/103013
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AtomicExpand: Stop trying to prune cmpxchg extractvalue users (PR #103211)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/103211

The expansion for cmpxchg was trying to tidy up extractvalue users
to directly use the lowered pieces, and then erasing the now-dead
extractvalues. This assumed that the iteration order did not depend
on those user instructions.

Continue doing the replacement, but just leave the dead extractvalues.
This is a minor regression, but it is of no importance since the
dead instructions will just get dropped during codegen anyway.
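
As a rough illustration of the approach described above, the sketch below redirects each extract to the lowered value and deliberately leaves the dead extract in place. Everything here (`ExtractUser`, `CmpXchg`, `forwardExtracts`, the integer value ids) is a hypothetical model for illustration only, not the `AtomicExpandPass` code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of an extractvalue user of a cmpxchg result.
struct ExtractUser {
  unsigned Index;  // 0 = loaded value, 1 = success flag
  int ForwardedTo; // id of the value its uses were redirected to, -1 if none

  explicit ExtractUser(unsigned I) : Index(I), ForwardedTo(-1) {}
};

// Hypothetical model of the cmpxchg instruction and its user list.
struct CmpXchg {
  std::vector<ExtractUser> Users;
};

// Redirect every extract to the lowered piece it stands for. The user list
// is never modified while it is being walked, and nothing is erased: the
// now-dead extracts are left for later cleanup.
static void forwardExtracts(CmpXchg &CI, int LoadedId, int SuccessId) {
  for (ExtractUser &EV : CI.Users)
    EV.ForwardedTo = EV.Index == 0 ? LoadedId : SuccessId;
}
```

The trade-off matches the one stated in the description: the output keeps a few dead instructions, but the pass no longer relies on the relationship between iteration order and the instructions it removes.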

>From 6c3af75b41ef54b831f3886bc62171127369467e Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Tue, 13 Aug 2024 13:40:13 +0400
Subject: [PATCH] AtomicExpand: Stop trying to prune cmpxchg extractvalue users

The expansion for cmpxchg was trying to tidy up extractvalue users
to directly use the lowered pieces, and then erasing the now-dead
extractvalues. This assumed that the iteration order did not depend
on those user instructions.

Continue doing the replacement, but just leave the dead extractvalues.
This is a minor regression, but it is of no importance since the
dead instructions will just get dropped during codegen anyway.
---
 llvm/lib/CodeGen/AtomicExpandPass.cpp |   7 -
 .../AtomicExpand/ARM/cmpxchg-weak.ll  | 287 +-
 2 files changed, 142 insertions(+), 152 deletions(-)

diff --git a/llvm/lib/CodeGen/AtomicExpandPass.cpp 
b/llvm/lib/CodeGen/AtomicExpandPass.cpp
index d8f33c42a8a14c..b6344e0c27c4e6 100644
--- a/llvm/lib/CodeGen/AtomicExpandPass.cpp
+++ b/llvm/lib/CodeGen/AtomicExpandPass.cpp
@@ -1514,7 +1514,6 @@ bool 
AtomicExpandImpl::expandAtomicCmpXchg(AtomicCmpXchgInst *CI) {
 
   // Look for any users of the cmpxchg that are just comparing the loaded value
   // against the desired one, and replace them with the CFG-derived version.
-  SmallVector PrunedInsts;
   for (auto *User : CI->users()) {
 ExtractValueInst *EV = dyn_cast<ExtractValueInst>(User);
 if (!EV)
@@ -1527,14 +1526,8 @@ bool 
AtomicExpandImpl::expandAtomicCmpXchg(AtomicCmpXchgInst *CI) {
   EV->replaceAllUsesWith(Loaded);
 else
   EV->replaceAllUsesWith(Success);
-
-PrunedInsts.push_back(EV);
   }
 
-  // We can remove the instructions now we're no longer iterating through them.
-  for (auto *EV : PrunedInsts)
-EV->eraseFromParent();
-
   if (!CI->use_empty()) {
 // Some use of the full struct return that we don't understand has 
happened,
 // so we've got to reconstruct it properly.
diff --git a/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll 
b/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll
index 23aa57e18ecc5a..d1d6f89bccade1 100644
--- a/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll
+++ b/llvm/test/Transforms/AtomicExpand/ARM/cmpxchg-weak.ll
@@ -1,169 +1,166 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 5
 ; RUN: opt -passes=atomic-expand -codegen-opt-level=1 -S 
-mtriple=thumbv7s-apple-ios7.0 %s | FileCheck %s
 
-define i32 @test_cmpxchg_seq_cst(ptr %addr, i32 %desired, i32 %new) {
-; CHECK-LABEL: @test_cmpxchg_seq_cst
 ; Intrinsic for "dmb ishst" is then expected
-; CHECK: br label %[[START:.*]]
-
-; CHECK: [[START]]:
-; CHECK: [[LOADED:%.*]] = call i32 @llvm.arm.ldrex.p0(ptr elementtype(i32) 
%addr)
-; CHECK: [[SHOULD_STORE:%.*]] = icmp eq i32 [[LOADED]], %desired
-; CHECK: br i1 [[SHOULD_STORE]], label %[[FENCED_STORE:.*]], label 
%[[NO_STORE_BB:.*]]
-
-; CHECK: [[FENCED_STORE]]:
-; CHECK: call void @llvm.arm.dmb(i32 10)
-; CHECK: br label %[[TRY_STORE:.*]]
-
-; CHECK: [[TRY_STORE]]:
-; CHECK: [[LOADED_TRYSTORE:%.*]] = phi i32 [ [[LOADED]], %[[FENCED_STORE]] 
]
-; CHECK: [[STREX:%.*]] = call i32 @llvm.arm.strex.p0(i32 %new, ptr 
elementtype(i32) %addr)
-; CHECK: [[SUCCESS:%.*]] = icmp eq i32 [[STREX]], 0
-; CHECK: br i1 [[SUCCESS]], label %[[SUCCESS_BB:.*]], label 
%[[FAILURE_BB:.*]]
-
-; CHECK: [[SUCCESS_BB]]:
-; CHECK: call void @llvm.arm.dmb(i32 11)
-; CHECK: br label %[[END:.*]]
-
-; CHECK: [[NO_STORE_BB]]:
-; CHECK: [[LOADED_NOSTORE:%.*]] = phi i32 [ [[LOADED]], %[[START]] ]
-; CHECK: call void @llvm.arm.clrex()
-; CHECK: br label %[[FAILURE_BB]]
-
-; CHECK: [[FAILURE_BB]]:
-; CHECK: [[LOADED_FAILURE:%.*]] = phi i32 [ [[LOADED_NOSTORE]], 
%[[NO_STORE_BB]] ], [ [[LOADED_TRYSTORE]], %[[TRY_STORE]] ]
-; CHECK: call void @llvm.arm.dmb(i32 11)
-; CHECK: br label %[[END]]
-
-; CHECK: [[END]]:
-; CHECK: [[LOADED_EXIT:%.*]] = phi i32 [ [[LOADED_TRYSTORE]], 
%[[SUCCESS_BB]] ], [ [[LOADED_FAILURE]], %[[FAILURE_BB]] ]
-; CHECK: [[SUCCESS:%.*]] = phi i1 [ true, %[[SUCCESS_BB]] ], [ false, 
%[[FAILURE_BB]] ]
-; CHECK: ret i32 [[LOADED_EXIT]]
-
+define i32 @test_cmpxchg_seq_cst(ptr %addr, i32 %desired, i32 %new) {
+; CHECK-LABEL: define i32 @test_cmpxchg_seq_cst(
+; CHECK-SAME: ptr [[ADDR:%.*]], i32 [[DESIRED:%.*]], i32 [[NEW:%.*]]) {
+; CHECK-NEXT:br label %[[CMPXCHG_START:.

[llvm-branch-commits] [llvm] AtomicExpand: Stop trying to prune cmpxchg extractvalue users (PR #103211)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

* **#103211** 👈
* **#102914**
* `main`

This stack of pull requests is managed by Graphite.
  

https://github.com/llvm/llvm-project/pull/103211
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AtomicExpand: Stop trying to prune cmpxchg extractvalue users (PR #103211)

2024-08-13 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/103211
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [PPC][AIX] Save/restore r31 when using base pointer (#100182) (PR #103301)

2024-08-13 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/103301

Backport d07f106e512c08455b76cc1889ee48318e73c810

Requested by: @syzaara

>From 84853313eed6b778285f6a1329837f2da0a5ff10 Mon Sep 17 00:00:00 2001
From: Zaara Syeda 
Date: Wed, 7 Aug 2024 09:59:45 -0400
Subject: [PATCH] [PPC][AIX] Save/restore r31 when using base pointer (#100182)

When the base pointer r30 is used to hold the stack pointer, r30 is
spilled in the prologue. On AIX registers are saved from highest to
lowest, so r31 also needs to be saved.

Fixes https://github.com/llvm/llvm-project/issues/96411

(cherry picked from commit d07f106e512c08455b76cc1889ee48318e73c810)
---
 llvm/lib/Target/PowerPC/PPCFrameLowering.cpp  | 14 --
 llvm/test/CodeGen/PowerPC/aix-base-pointer.ll |  5 +
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp 
b/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
index 1963582ce68631..a57ed33bda9c77 100644
--- a/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
@@ -1007,7 +1007,7 @@ void PPCFrameLowering::emitPrologue(MachineFunction &MF,
 // R0 cannot be used as a base register, but it can be used as an
 // index in a store-indexed.
 int LastOffset = 0;
-if (HasFP)  {
+if (HasFP) {
   // R0 += (FPOffset-LastOffset).
   // Need addic, since addi treats R0 as 0.
   BuildMI(MBB, MBBI, dl, TII.get(PPC::ADDIC), ScratchReg)
@@ -2025,8 +2025,18 @@ void 
PPCFrameLowering::determineCalleeSaves(MachineFunction &MF,
   // code. Same goes for the base pointer and the PIC base register.
   if (needsFP(MF))
 SavedRegs.reset(isPPC64 ? PPC::X31 : PPC::R31);
-  if (RegInfo->hasBasePointer(MF))
+  if (RegInfo->hasBasePointer(MF)) {
 SavedRegs.reset(RegInfo->getBaseRegister(MF));
+// On AIX, when BaseRegister(R30) is used, need to spill r31 too to match
+// AIX trackback table requirement.
+if (!needsFP(MF) && !SavedRegs.test(isPPC64 ? PPC::X31 : PPC::R31) &&
+Subtarget.isAIXABI()) {
+  assert(
+  (RegInfo->getBaseRegister(MF) == (isPPC64 ? PPC::X30 : PPC::R30)) &&
+  "Invalid base register on AIX!");
+  SavedRegs.set(isPPC64 ? PPC::X31 : PPC::R31);
+}
+  }
   if (FI->usesPICBase())
 SavedRegs.reset(PPC::R30);
 
diff --git a/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll 
b/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll
index ab222d770360ce..5e66e5ec276389 100644
--- a/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll
+++ b/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll
@@ -6,6 +6,7 @@
 
 ; Use an overaligned buffer to force base-pointer usage. Test verifies:
 ; - base pointer register (r30) is saved/defined/restored.
+; - frame pointer register (r31) is saved/defined/restored.
 ; - stack frame is allocated with correct alignment.
 ; - Address of %AlignedBuffer is calculated based off offset from the stack
 ;   pointer.
@@ -25,7 +26,9 @@ declare void @callee(ptr)
 ; 32BIT: subfic 0, 0, -224
 ; 32BIT: stwux 1, 1, 0
 ; 32BIT: addi 3, 1, 64
+; 32BIT: stw 31, -12(30)
 ; 32BIT: bl .callee
+; 32BIT: lwz 31, -12(30)
 ; 32BIT: mr 1, 30
 ; 32BIT: lwz 30, -16(1)
 
@@ -36,6 +39,8 @@ declare void @callee(ptr)
 ; 64BIT: subfic 0, 0, -288
 ; 64BIT: stdux 1, 1, 0
 ; 64BIT: addi 3, 1, 128
+; 64BIT: std 31, -16(30)
 ; 64BIT: bl .callee
+; 64BIT: ld 31, -16(30)
 ; 64BIT: mr 1, 30
 ; 64BIT: ld 30, -24(1)
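
The callee-save decision in the patch above can be modeled roughly as follows. This is a sketch under stated assumptions: `FrameFacts`, `adjustSavedRegs`, and the plain register numbers are hypothetical stand-ins, not the `PPCFrameLowering` interface, and only the 32-bit register names are modeled.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical register numbers; the real code uses PPC::R30 / PPC::R31
// (or X30 / X31 on 64-bit).
enum : unsigned { R30 = 30, R31 = 31 };

// Hypothetical summary of the queries the patch makes.
struct FrameFacts {
  bool NeedsFP;
  bool HasBasePointer;
  bool IsAIX;
};

// Adjust the generic callee-saved set the way the patch describes.
static void adjustSavedRegs(std::bitset<32> &Saved, const FrameFacts &F) {
  if (F.NeedsFP)
    Saved.reset(R31); // frame pointer is handled outside the generic set
  if (F.HasBasePointer) {
    Saved.reset(R30); // same for the base pointer
    // On AIX registers are saved from highest to lowest, so a saved r30
    // implies r31 must be saved as well.
    if (!F.NeedsFP && F.IsAIX)
      Saved.set(R31);
  }
}
```

The patch additionally checks that r31 is not already in the set before setting it; since setting a bit twice is harmless, the sketch omits that check.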

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [PPC][AIX] Save/restore r31 when using base pointer (#100182) (PR #103301)

2024-08-13 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/103301
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [PPC][AIX] Save/restore r31 when using base pointer (#100182) (PR #103301)

2024-08-13 Thread via llvm-branch-commits

llvmbot wrote:

@mandlebug What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/103301
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [PPC][AIX] Save/restore r31 when using base pointer (#100182) (PR #103301)

2024-08-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-powerpc

Author: None (llvmbot)


Changes

Backport d07f106e512c08455b76cc1889ee48318e73c810

Requested by: @syzaara

---
Full diff: https://github.com/llvm/llvm-project/pull/103301.diff


2 Files Affected:

- (modified) llvm/lib/Target/PowerPC/PPCFrameLowering.cpp (+12-2) 
- (modified) llvm/test/CodeGen/PowerPC/aix-base-pointer.ll (+5) 


``diff
diff --git a/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp 
b/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
index 1963582ce68631..a57ed33bda9c77 100644
--- a/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
@@ -1007,7 +1007,7 @@ void PPCFrameLowering::emitPrologue(MachineFunction &MF,
 // R0 cannot be used as a base register, but it can be used as an
 // index in a store-indexed.
 int LastOffset = 0;
-if (HasFP)  {
+if (HasFP) {
   // R0 += (FPOffset-LastOffset).
   // Need addic, since addi treats R0 as 0.
   BuildMI(MBB, MBBI, dl, TII.get(PPC::ADDIC), ScratchReg)
@@ -2025,8 +2025,18 @@ void 
PPCFrameLowering::determineCalleeSaves(MachineFunction &MF,
   // code. Same goes for the base pointer and the PIC base register.
   if (needsFP(MF))
 SavedRegs.reset(isPPC64 ? PPC::X31 : PPC::R31);
-  if (RegInfo->hasBasePointer(MF))
+  if (RegInfo->hasBasePointer(MF)) {
 SavedRegs.reset(RegInfo->getBaseRegister(MF));
+// On AIX, when BaseRegister(R30) is used, need to spill r31 too to match
+// AIX trackback table requirement.
+if (!needsFP(MF) && !SavedRegs.test(isPPC64 ? PPC::X31 : PPC::R31) &&
+Subtarget.isAIXABI()) {
+  assert(
+  (RegInfo->getBaseRegister(MF) == (isPPC64 ? PPC::X30 : PPC::R30)) &&
+  "Invalid base register on AIX!");
+  SavedRegs.set(isPPC64 ? PPC::X31 : PPC::R31);
+}
+  }
   if (FI->usesPICBase())
 SavedRegs.reset(PPC::R30);
 
diff --git a/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll 
b/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll
index ab222d770360ce..5e66e5ec276389 100644
--- a/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll
+++ b/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll
@@ -6,6 +6,7 @@
 
 ; Use an overaligned buffer to force base-pointer usage. Test verifies:
 ; - base pointer register (r30) is saved/defined/restored.
+; - frame pointer register (r31) is saved/defined/restored.
 ; - stack frame is allocated with correct alignment.
 ; - Address of %AlignedBuffer is calculated based off offset from the stack
 ;   pointer.
@@ -25,7 +26,9 @@ declare void @callee(ptr)
 ; 32BIT: subfic 0, 0, -224
 ; 32BIT: stwux 1, 1, 0
 ; 32BIT: addi 3, 1, 64
+; 32BIT: stw 31, -12(30)
 ; 32BIT: bl .callee
+; 32BIT: lwz 31, -12(30)
 ; 32BIT: mr 1, 30
 ; 32BIT: lwz 30, -16(1)
 
@@ -36,6 +39,8 @@ declare void @callee(ptr)
 ; 64BIT: subfic 0, 0, -288
 ; 64BIT: stdux 1, 1, 0
 ; 64BIT: addi 3, 1, 128
+; 64BIT: std 31, -16(30)
 ; 64BIT: bl .callee
+; 64BIT: ld 31, -16(30)
 ; 64BIT: mr 1, 30
 ; 64BIT: ld 30, -24(1)

``




https://github.com/llvm/llvm-project/pull/103301
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc++][format][2/7] Optimizes c-string arguments. (PR #101805)

2024-08-13 Thread Mark de Wever via llvm-branch-commits

https://github.com/mordante edited 
https://github.com/llvm/llvm-project/pull/101805
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc++][format][2/7] Optimizes c-string arguments. (PR #101805)

2024-08-13 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne edited 
https://github.com/llvm/llvm-project/pull/101805
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [InstCombine] Don't look at ConstantData users (PR #103302)

2024-08-13 Thread Alexis Engelke via llvm-branch-commits

https://github.com/aengelke created 
https://github.com/llvm/llvm-project/pull/103302

When looking at PHI operands for combining, only look at instructions and
arguments. The loop later iterates over Arg's users, which is not
useful if Arg is a constant -- its users are not meaningful and might
be in different functions, which causes problems for the dominates()
query.
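
A small sketch of the guard described above, assuming a toy class hierarchy: `Value`, `Instruction`, `Argument`, and `Constant` here are hypothetical stand-ins, and `dynamic_cast` plays the role that `isa<>` plays in LLVM. The helper name `hasFunctionLocalUsers` is made up for the example.

```cpp
#include <cassert>

// Hypothetical stand-ins for a slice of the llvm::Value hierarchy.
struct Value {
  virtual ~Value() = default;
};
struct Instruction : Value {};
struct Argument : Value {};
struct Constant : Value {};

// Instructions and arguments belong to exactly one function, so walking
// their users stays inside that function. Constants are shared across the
// module, so their users may live in unrelated functions.
static bool hasFunctionLocalUsers(const Value *V) {
  return dynamic_cast<const Instruction *>(V) != nullptr ||
         dynamic_cast<const Argument *>(V) != nullptr;
}
```

A caller would bail out early when this returns false for an incoming PHI operand, before any dominance query over that operand's users is attempted.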



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc++][format][2/7] Optimizes c-string arguments. (PR #101805)

2024-08-13 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne approved this pull request.


https://github.com/llvm/llvm-project/pull/101805
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc++][format][2/7] Optimizes c-string arguments. (PR #101805)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -64,32 +64,14 @@ struct _LIBCPP_TEMPLATE_VIS formatter : public __formatte
   template 
   _LIBCPP_HIDE_FROM_ABI typename _FormatContext::iterator format(const _CharT* 
__str, _FormatContext& __ctx) const {
 _LIBCPP_ASSERT_INTERNAL(__str, "The basic_format_arg constructor should 
have prevented an invalid pointer.");
-
-__format_spec::__parsed_specifications<_CharT> __specs = 
_Base::__parser_.__get_parsed_std_specifications(__ctx);
-#  if _LIBCPP_STD_VER >= 23
-if (_Base::__parser_.__type_ == __format_spec::__type::__debug)
-  return 
__formatter::__format_escaped_string(basic_string_view<_CharT>{__str}, 
__ctx.out(), __specs);
-#  endif
-
-// When using a center or right alignment and the width option the length
-// of __str must be known to add the padding upfront. This case is handled
-// by the base class by converting the argument to a basic_string_view.
+// Converting the input to a basic_string_view means the data is looped 
over twice;
+// - once to determine the lenght, and

ldionne wrote:

```suggestion
// - once to determine the length, and
```

https://github.com/llvm/llvm-project/pull/101805
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [InstCombine] Don't look at ConstantData users (PR #103302)

2024-08-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Alexis Engelke (aengelke)


Changes

When looking at PHI operands for combining, only look at instructions and
arguments. The loop later iterates over Arg's users, which is not
useful if Arg is a constant -- its users are not meaningful and might
be in different functions, which causes problems for the dominates()
query.


---

Patch is 22.42 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/103302.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp (+4) 
- (added) llvm/test/Transforms/InstCombine/phi-int-users.ll (+576) 


``diff
diff --git a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp 
b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
index 86411320ab2487..bcff9a72b65724 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
@@ -143,6 +143,10 @@ bool InstCombinerImpl::foldIntegerTypedPHI(PHINode &PN) {
 BasicBlock *BB = std::get<0>(Incoming);
 Value *Arg = std::get<1>(Incoming);
 
+// Arg could be a constant, constant expr, etc., which we don't cover here.
+if (!isa<Instruction>(Arg) && !isa<Argument>(Arg))
+  return false;
+
 // First look backward:
 if (auto *PI = dyn_cast(Arg)) {
   AvailablePtrVals.emplace_back(PI->getOperand(0));
diff --git a/llvm/test/Transforms/InstCombine/phi-int-users.ll 
b/llvm/test/Transforms/InstCombine/phi-int-users.ll
new file mode 100644
index 00..ce81c5d7e36267
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/phi-int-users.ll
@@ -0,0 +1,576 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 5
+; RUN: opt -mtriple=arm64 
-passes='inline,function(sroa,jump-threading,instcombine)' -S < %s 
| FileCheck %s
+
+; Verify that instcombine doesn't look at users of Constant in different
+; functions for dominates() queries.
+
+%struct.widget = type { %struct.baz, i8, [7 x i8] }
+%struct.baz = type { %struct.snork }
+%struct.snork = type { [8 x i8] }
+
+define void @spam(ptr %arg) {
+; CHECK-LABEL: define void @spam(
+; CHECK-SAME: ptr [[ARG:%.*]]) personality ptr null {
+; CHECK-NEXT:  [[BB:.*:]]
+; CHECK-NEXT:[[LOAD_I:%.*]] = load volatile i1, ptr null, align 1
+; CHECK-NEXT:br i1 [[LOAD_I]], label %[[BB2_I:.*]], label %[[BB3_I:.*]]
+; CHECK:   [[BB2_I]]:
+; CHECK-NEXT:store i64 1, ptr [[ARG]], align 8
+; CHECK-NEXT:br label %[[BARNEY_EXIT:.*]]
+; CHECK:   [[BB3_I]]:
+; CHECK-NEXT:[[LOAD_I_I:%.*]] = load volatile i32, ptr null, align 4
+; CHECK-NEXT:[[ICMP_I_I:%.*]] = icmp eq i32 [[LOAD_I_I]], 0
+; CHECK-NEXT:br i1 [[ICMP_I_I]], label %[[BB2_I_I:.*]], label 
%[[BB3_I_I:.*]]
+; CHECK:   [[BB2_I_I]]:
+; CHECK-NEXT:br label %[[BB1_I:.*]]
+; CHECK:   [[BB1_I]]:
+; CHECK-NEXT:[[LOAD_I_I_I:%.*]] = load volatile i1, ptr null, align 1
+; CHECK-NEXT:br i1 [[LOAD_I_I_I]], label %[[SPAM_EXIT_I:.*]], label 
%[[BB3_I_I_I:.*]]
+; CHECK:   [[BB3_I_I_I]]:
+; CHECK-NEXT:call void @zot.4()
+; CHECK-NEXT:br label %[[SPAM_EXIT_I]]
+; CHECK:   [[SPAM_EXIT_I]]:
+; CHECK-NEXT:[[ALLOCA_SROA_0_1_I:%.*]] = phi i64 [ 0, %[[BB3_I_I_I]] ], [ 
1, %[[BB1_I]] ]
+; CHECK-NEXT:[[TMP0:%.*]] = inttoptr i64 [[ALLOCA_SROA_0_1_I]] to ptr
+; CHECK-NEXT:store i32 0, ptr [[TMP0]], align 4
+; CHECK-NEXT:br label %[[BB1_I]]
+; CHECK:   [[EGGS_EXIT:.*:]]
+; CHECK-NEXT:br label %[[BARNEY_EXIT]]
+; CHECK:   [[BB3_I_I]]:
+; CHECK-NEXT:[[LOAD_I_I1:%.*]] = load volatile i1, ptr null, align 1
+; CHECK-NEXT:br i1 [[LOAD_I_I1]], label %[[QUUX_EXIT:.*]], label 
%[[BB3_I_I2:.*]]
+; CHECK:   [[BB3_I_I2]]:
+; CHECK-NEXT:call void @snork()
+; CHECK-NEXT:unreachable
+; CHECK:   [[QUUX_EXIT]]:
+; CHECK-NEXT:store ptr poison, ptr null, align 8
+; CHECK-NEXT:br label %[[BARNEY_EXIT]]
+; CHECK:   [[BARNEY_EXIT]]:
+; CHECK-NEXT:ret void
+;
+bb:
+  call void @barney(ptr %arg)
+  ret void
+}
+
+define ptr @zot(ptr %arg) {
+; CHECK-LABEL: define ptr @zot(
+; CHECK-SAME: ptr [[ARG:%.*]]) personality ptr null {
+; CHECK-NEXT:  [[BB:.*:]]
+; CHECK-NEXT:[[LOAD_I_I_I_I:%.*]] = load ptr, ptr [[ARG]], align 8
+; CHECK-NEXT:store ptr null, ptr [[ARG]], align 8
+; CHECK-NEXT:store i32 0, ptr [[LOAD_I_I_I_I]], align 4
+; CHECK-NEXT:ret ptr null
+;
+bb:
+  %call = call ptr @ham.8(ptr %arg)
+  ret ptr null
+}
+
+define void @wombat() personality ptr null {
+; CHECK-LABEL: define void @wombat() personality ptr null {
+; CHECK-NEXT:  [[BB:.*:]]
+; CHECK-NEXT:call void @snork()
+; CHECK-NEXT:unreachable
+;
+bb:
+  call void @snork()
+  unreachable
+}
+
+define ptr @wombat.1(ptr %arg) {
+; CHECK-LABEL: define ptr @wombat.1(
+; CHECK-SAME: ptr [[ARG:%.*]]) {
+; CHECK-NEXT:  [[BB:.*:]]
+; CHECK-NEXT:store i64 1, ptr [[ARG]], align 8
+; CHECK-NEXT:ret ptr null
+;
+bb:
+  %call = call ptr @foo.9(ptr 

[llvm-branch-commits] [libcxx] [libc++][format][2/7] Optimizes c-string arguments. (PR #101805)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -64,32 +64,14 @@ struct _LIBCPP_TEMPLATE_VIS formatter : public __formatte
   template <class _FormatContext>
   _LIBCPP_HIDE_FROM_ABI typename _FormatContext::iterator format(const _CharT* 
__str, _FormatContext& __ctx) const {
 _LIBCPP_ASSERT_INTERNAL(__str, "The basic_format_arg constructor should 
have prevented an invalid pointer.");
-
-__format_spec::__parsed_specifications<_CharT> __specs = 
_Base::__parser_.__get_parsed_std_specifications(__ctx);
-#  if _LIBCPP_STD_VER >= 23
-if (_Base::__parser_.__type_ == __format_spec::__type::__debug)
-  return 
__formatter::__format_escaped_string(basic_string_view<_CharT>{__str}, 
__ctx.out(), __specs);
-#  endif
-
-// When using a center or right alignment and the width option the length
-// of __str must be known to add the padding upfront. This case is handled
-// by the base class by converting the argument to a basic_string_view.
+// Converting the input to a basic_string_view means the data is looped over twice;
+// - once to determine the length, and
+// - once to process the data.
 //
-// When using left alignment and the width option the padding is added
-// after outputting __str so the length can be determined while outputting
-// __str. The same holds true for the precision, during outputting __str it
-// can be validated whether the precision threshold has been reached. For
-// now these optimizations aren't implemented. Instead the base class
-// handles these options.
-// TODO FMT Implement these improvements.
-if (__specs.__has_width() || __specs.__has_precision())
-  return __formatter::__write_string(basic_string_view<_CharT>{__str}, 
__ctx.out(), __specs);
-
-// No formatting required, copy the string to the output.
-auto __out_it = __ctx.out();
-while (*__str)
-  *__out_it++ = *__str++;
-return __out_it;
+// This sounds slower than writing the output directly. However internally
+// the output algorithms have optimizations for "bulk" operations. This
+// means processing the data twice is faster than processing it once.

ldionne wrote:

```suggestion
// This sounds slower than writing the output directly. However internally
// the output algorithms have optimizations for "bulk" operations, which 
make
// this faster than a single-pass character-by-character output.
```
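
The two strategies that this comment compares can be sketched as follows. This is only an illustration using `std::string` as the output; the helper names `copy_per_char` and `copy_bulk` are hypothetical and are not part of libc++. The sketch shows the shape of each approach and makes no performance claim of its own.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Single pass: emit one character at a time until the terminator is found.
std::string copy_per_char(const char* str) {
  assert(str != nullptr);
  std::string out;
  while (*str)
    out.push_back(*str++);
  return out;
}

// Two passes: constructing the string_view walks the data once to find the
// length, then a single bulk append copies the whole range.
std::string copy_bulk(const char* str) {
  assert(str != nullptr);
  std::string_view view{str};
  std::string out;
  out.append(view.data(), view.size());
  return out;
}
```

Both helpers produce the same output; the difference is only whether the copy is done per character or as one range operation.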

https://github.com/llvm/llvm-project/pull/101805
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne edited 
https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [InstCombine] Don't look at ConstantData users (PR #103302)

2024-08-13 Thread Nikita Popov via llvm-branch-commits


@@ -0,0 +1,576 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 5
+; RUN: opt -mtriple=arm64 
-passes='inline,function(sroa,jump-threading,instcombine)' -S < %s 
| FileCheck %s

nikic wrote:

Why can't this be an instcombine only test?

https://github.com/llvm/llvm-project/pull/103302


[llvm-branch-commits] [llvm] [InstCombine] Don't look at ConstantData users (PR #103302)

2024-08-13 Thread Alexis Engelke via llvm-branch-commits

https://github.com/aengelke updated 
https://github.com/llvm/llvm-project/pull/103302

>From 6a2ac00a8424a4402475e2b7972bfb01330c3bf8 Mon Sep 17 00:00:00 2001
From: Alexis Engelke 
Date: Tue, 13 Aug 2024 16:10:38 +
Subject: [PATCH] Only run instcombine in test case

Created using spr 1.3.5-bogner
---
 .../Transforms/InstCombine/phi-int-users.ll   | 416 --
 1 file changed, 379 insertions(+), 37 deletions(-)

diff --git a/llvm/test/Transforms/InstCombine/phi-int-users.ll 
b/llvm/test/Transforms/InstCombine/phi-int-users.ll
index ce81c5d7e3626..8a6bf44b884a2 100644
--- a/llvm/test/Transforms/InstCombine/phi-int-users.ll
+++ b/llvm/test/Transforms/InstCombine/phi-int-users.ll
@@ -1,14 +1,10 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 5
-; RUN: opt -mtriple=arm64 
-passes='inline,function(sroa,jump-threading,instcombine)' -S < %s 
| FileCheck %s
+; RUN: opt -mtriple=arm64 -S < %s -passes=instcombine | FileCheck %s
 
 ; Verify that instcombine doesn't look at users of Constant in different
 ; functions for dominates() queries.
 
-%struct.widget = type { %struct.baz, i8, [7 x i8] }
-%struct.baz = type { %struct.snork }
-%struct.snork = type { [8 x i8] }
-
-define void @spam(ptr %arg) {
+define void @spam(ptr %arg) personality ptr null {
 ; CHECK-LABEL: define void @spam(
 ; CHECK-SAME: ptr [[ARG:%.*]]) personality ptr null {
 ; CHECK-NEXT:  [[BB:.*:]]
@@ -49,11 +45,55 @@ define void @spam(ptr %arg) {
 ; CHECK-NEXT:ret void
 ;
 bb:
-  call void @barney(ptr %arg)
+  %load.i = load volatile i1, ptr null, align 1
+  br i1 %load.i, label %bb2.i, label %bb3.i
+
+bb2.i:; preds = %bb
+  store i64 1, ptr %arg, align 8
+  br label %barney.exit
+
+bb3.i:; preds = %bb
+  %load.i.i = load volatile i32, ptr null, align 4
+  %icmp.i.i = icmp eq i32 %load.i.i, 0
+  br i1 %icmp.i.i, label %bb2.i.i, label %bb3.i.i
+
+bb2.i.i:  ; preds = %bb3.i
+  br label %bb1.i
+
+bb1.i:; preds = %spam.exit.i, 
%bb2.i.i
+  %load.i.i.i = load volatile i1, ptr null, align 1
+  br i1 %load.i.i.i, label %spam.exit.i, label %bb3.i.i.i
+
+bb3.i.i.i:; preds = %bb1.i
+  call void @zot.4()
+  br label %spam.exit.i
+
+spam.exit.i:  ; preds = %bb3.i.i.i, %bb1.i
+  %alloca.sroa.0.1.i = phi i64 [ 0, %bb3.i.i.i ], [ 1, %bb1.i ]
+  %0 = inttoptr i64 %alloca.sroa.0.1.i to ptr
+  store i32 0, ptr %0, align 4
+  br label %bb1.i
+
+eggs.exit:; No predecessors!
+  br label %barney.exit
+
+bb3.i.i:  ; preds = %bb3.i
+  %load.i.i1 = load volatile i1, ptr null, align 1
+  br i1 %load.i.i1, label %quux.exit, label %bb3.i.i2
+
+bb3.i.i2: ; preds = %bb3.i.i
+  call void @snork()
+  unreachable
+
+quux.exit:; preds = %bb3.i.i
+  store ptr null, ptr null, align 8
+  br label %barney.exit
+
+barney.exit:  ; preds = %quux.exit, 
%eggs.exit, %bb2.i
   ret void
 }
 
-define ptr @zot(ptr %arg) {
+define ptr @zot(ptr %arg) personality ptr null {
 ; CHECK-LABEL: define ptr @zot(
 ; CHECK-SAME: ptr [[ARG:%.*]]) personality ptr null {
 ; CHECK-NEXT:  [[BB:.*:]]
@@ -63,7 +103,9 @@ define ptr @zot(ptr %arg) {
 ; CHECK-NEXT:ret ptr null
 ;
 bb:
-  %call = call ptr @ham.8(ptr %arg)
+  %load.i.i.i.i = load ptr, ptr %arg, align 8
+  store ptr null, ptr %arg, align 8
+  store i32 0, ptr %load.i.i.i.i, align 4
   ret ptr null
 }
 
@@ -86,7 +128,7 @@ define ptr @wombat.1(ptr %arg) {
 ; CHECK-NEXT:ret ptr null
 ;
 bb:
-  %call = call ptr @foo.9(ptr %arg)
+  store i64 1, ptr %arg, align 8
   ret ptr null
 }
 
@@ -103,7 +145,15 @@ define void @quux() personality ptr null {
 ; CHECK-NEXT:ret void
 ;
 bb:
-  call void @wobble()
+  %load.i = load volatile i1, ptr null, align 1
+  br i1 %load.i, label %wibble.exit, label %bb3.i
+
+bb3.i:; preds = %bb
+  call void @snork()
+  unreachable
+
+wibble.exit:  ; preds = %bb
+  store ptr null, ptr null, align 8
   ret void
 }
 
@@ -120,7 +170,15 @@ define void @wobble() personality ptr null {
 ; CHECK-NEXT:ret void
 ;
 bb:
-  call void @quux.3()
+  %load.i.i = load volatile i1, ptr null, align 1
+  br i1 %load.i.i, label %wobble.2.exit, label %bb3.i.i
+
+bb3.i.i:  ; preds = %bb
+  call void @snork()
+  unreachable
+
+wobble.2.exit:; preds = %bb
+  store ptr null, ptr null, align 8
   ret void
 }
 
@@ -141,12 +199,20 @@ define void @eggs() personality ptr null {
 ; CHECK-NEXT:br label %[[BB1]]
 ;
 bb:
-  %alloca = alloca %struct.widget, align 8
   br label %bb1
 
-bb1:  

[llvm-branch-commits] [llvm] [InstCombine] Don't look at ConstantData users (PR #103302)

2024-08-13 Thread Alexis Engelke via llvm-branch-commits


@@ -0,0 +1,576 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 5
+; RUN: opt -mtriple=arm64 
-passes='inline,function(sroa,jump-threading,instcombine)' -S < %s 
| FileCheck %s

aengelke wrote:

True, changed

https://github.com/llvm/llvm-project/pull/103302


[llvm-branch-commits] [llvm] [InstCombine] Don't look at ConstantData users (PR #103302)

2024-08-13 Thread Nikita Popov via llvm-branch-commits


@@ -0,0 +1,918 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 5
+; RUN: opt -mtriple=arm64 -S < %s -passes=instcombine | FileCheck %s

nikic wrote:

Triple should not be needed.

https://github.com/llvm/llvm-project/pull/103302


[llvm-branch-commits] [llvm] [InstCombine] Don't look at ConstantData users (PR #103302)

2024-08-13 Thread Nikita Popov via llvm-branch-commits




nikic wrote:

This test still doesn't look anything approaching minimal. Please run it 
through llvm-reduce.

https://github.com/llvm/llvm-project/pull/103302


[llvm-branch-commits] [llvm] release/19.x: [RISCV] Use experimental.vp.splat to splat specific vector length elements. (#101329) (PR #101506)

2024-08-13 Thread Craig Topper via llvm-branch-commits

https://github.com/topperc approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/101506


[llvm-branch-commits] [llvm] [PowerPC][GlobalMerge] Enable GlobalMerge by default on AIX (PR #101226)

2024-08-13 Thread Amy Kwan via llvm-branch-commits

https://github.com/amy-kwan updated 
https://github.com/llvm/llvm-project/pull/101226

>From a5d47e331e3bd754db092c194a5ca5b25ff99011 Mon Sep 17 00:00:00 2001
From: Amy Kwan 
Date: Tue, 30 Jul 2024 12:55:34 -0500
Subject: [PATCH] [PowerPC][GlobalMerge] Enable GlobalMerge by default on AIX

This patch turns on the GlobalMerge pass by default on AIX and updates LIT
tests accordingly.
---
 llvm/lib/Target/PowerPC/PPCTargetMachine.cpp   | 7 ++-
 llvm/test/CodeGen/PowerPC/merge-private.ll | 6 ++
 llvm/test/CodeGen/PowerPC/mergeable-string-pool.ll | 4 ++--
 .../test/DebugInfo/Symbolize/XCOFF/xcoff-symbolize-data.ll | 2 +-
 4 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Target/PowerPC/PPCTargetMachine.cpp 
b/llvm/lib/Target/PowerPC/PPCTargetMachine.cpp
index 6a502230482816..6ee25b1816892a 100644
--- a/llvm/lib/Target/PowerPC/PPCTargetMachine.cpp
+++ b/llvm/lib/Target/PowerPC/PPCTargetMachine.cpp
@@ -500,7 +500,12 @@ void PPCPassConfig::addIRPasses() {
 }
 
 bool PPCPassConfig::addPreISel() {
-  if (EnableGlobalMerge)
+  // The GlobalMerge pass is intended to be on by default on AIX.
+  // Specifying the command line option overrides the AIX default.
+  if ((EnableGlobalMerge.getNumOccurrences() > 0)
+  ? EnableGlobalMerge
+  : (TM->getTargetTriple().isOSAIX() &&
+ getOptLevel() != CodeGenOptLevel::None))
 addPass(createGlobalMergePass(TM, GlobalMergeMaxOffset, false, false,
   true));
 
diff --git a/llvm/test/CodeGen/PowerPC/merge-private.ll 
b/llvm/test/CodeGen/PowerPC/merge-private.ll
index 6ed2d6dfc542b7..0ca706abb275fc 100644
--- a/llvm/test/CodeGen/PowerPC/merge-private.ll
+++ b/llvm/test/CodeGen/PowerPC/merge-private.ll
@@ -11,6 +11,12 @@
 ; RUN: llc -verify-machineinstrs -mtriple powerpc64-unknown-linux -mcpu=pwr8 \
 ; RUN: -ppc-asm-full-reg-names -ppc-global-merge=true < %s | FileCheck %s \
 ; RUN: --check-prefix=LINUX64BE
+; The below run line is added to ensure that the assembly corresponding to
+; the following check-prefix is generated by default on AIX (without any
+; options).
+; RUN: llc -verify-machineinstrs -mtriple powerpc64-ibm-aix-xcoff -mcpu=pwr8 \
+; RUN: -ppc-asm-full-reg-names < %s | FileCheck %s \
+; RUN: --check-prefix=AIX64
 
 @.str = private unnamed_addr constant [15 x i8] c"Private global\00", align 1
 @str = internal constant [16 x i8] c"Internal global\00", align 1
diff --git a/llvm/test/CodeGen/PowerPC/mergeable-string-pool.ll 
b/llvm/test/CodeGen/PowerPC/mergeable-string-pool.ll
index 81147d10cde6e7..833ed9fa65acf1 100644
--- a/llvm/test/CodeGen/PowerPC/mergeable-string-pool.ll
+++ b/llvm/test/CodeGen/PowerPC/mergeable-string-pool.ll
@@ -1,6 +1,6 @@
-; RUN: llc -verify-machineinstrs -mtriple powerpc-ibm-aix-xcoff -mcpu=pwr8 \
+; RUN: llc -verify-machineinstrs -mtriple powerpc-ibm-aix-xcoff -mcpu=pwr8 
-enable-global-merge=false \
 ; RUN:   -ppc-asm-full-reg-names < %s | FileCheck %s 
--check-prefixes=AIX32,AIXDATA
-; RUN: llc -verify-machineinstrs -mtriple powerpc64-ibm-aix-xcoff -mcpu=pwr8 \
+; RUN: llc -verify-machineinstrs -mtriple powerpc64-ibm-aix-xcoff -mcpu=pwr8 
-enable-global-merge=false \
 ; RUN:   -ppc-asm-full-reg-names < %s | FileCheck %s 
--check-prefixes=AIX64,AIXDATA
 ; RUN: llc -verify-machineinstrs -mtriple powerpc64-unknown-linux -mcpu=pwr8 \
 ; RUN:   -ppc-asm-full-reg-names < %s | FileCheck %s 
--check-prefixes=LINUX64BE,LINUXDATA
diff --git a/llvm/test/DebugInfo/Symbolize/XCOFF/xcoff-symbolize-data.ll 
b/llvm/test/DebugInfo/Symbolize/XCOFF/xcoff-symbolize-data.ll
index 5432b59d583bac..1a467ec72a75da 100644
--- a/llvm/test/DebugInfo/Symbolize/XCOFF/xcoff-symbolize-data.ll
+++ b/llvm/test/DebugInfo/Symbolize/XCOFF/xcoff-symbolize-data.ll
@@ -5,7 +5,7 @@
 ;; AIX assembly syntax.
 
 ; REQUIRES: powerpc-registered-target
-; RUN: llc -filetype=obj -o %t -mtriple=powerpc-aix-ibm-xcoff < %s
+; RUN: llc -filetype=obj -o %t -mtriple=powerpc-aix-ibm-xcoff 
-ppc-global-merge=false < %s
 ; RUN: llvm-symbolizer --obj=%t 'DATA 0x60' 'DATA 0x61' 'DATA 0x64' 'DATA 
0X68' \
 ; RUN:   'DATA 0x90' 'DATA 0x94' 'DATA 0X98' | \
 ; RUN:   FileCheck %s
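
The selection logic in the addPreISel() hunk above (an explicit command line option wins, otherwise the pass defaults to on for AIX when optimizing) can be sketched in isolation. This is a minimal sketch, not the LLVM code: `should_add_global_merge` and its parameters are hypothetical, and `std::optional<bool>` stands in for "the option was given on the command line".

```cpp
#include <cassert>
#include <optional>

// flag:       the option's value if it was passed explicitly, else empty.
// is_aix:     whether the target OS is AIX.
// optimizing: whether the optimization level is above "none".
bool should_add_global_merge(std::optional<bool> flag, bool is_aix,
                             bool optimizing) {
  // An explicit option always overrides the platform default.
  if (flag.has_value())
    return *flag;
  // Otherwise the pass is on by default only for AIX with optimization.
  return is_aix && optimizing;
}
```

For example, `should_add_global_merge(false, true, true)` yields false, which corresponds to the updated RUN lines that explicitly turn the pass off on AIX.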



[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -58,23 +59,156 @@ namespace __format {
 /// This helper is used together with the @ref back_insert_iterator to offer
 /// type-erasure for the formatting functions. This reduces the number to
 /// template instantiations.
+///
+/// The design of the class is being changed to improve performance and do some
+/// code cleanups.
+/// The original design (as shipped up to LLVM-19) uses the following design:
+/// - There is an external object that connects the buffer to the output.
+/// - The class constructor stores a function pointer to a grow function and a
+///   type-erased pointer to the object that does the grow.
+/// - When writing data to the buffer would exceed the external buffer's
+///   capacity it requests the external buffer to flush its contents.
+///
+/// The new design tries to solve some issues with the current design:
+/// - The buffer used is a fixed-size buffer, benchmarking shows that using a
+///   dynamic allocated buffer has performance benefits.
+/// - Implementing P3107R5 "Permit an efficient implementation of std::print"
+///   is not trivial with the current buffers. Using the code from this series
+///   makes it trivial.
+///
+/// This class is ABI-tagged, still the new design does not change the size of
+/// objects of this class.
+///
+/// The new design contains information regarding format_to_n changes, these
+/// will be implemented in follow-up patch.
+///
+/// The new design is the following.
+/// - There is an external object that connects the buffer to the output.
+/// - This buffer object:
+///   - inherits publicly from this class.
+///   - has a static or dynamic buffer.
+///   - has a static member function to make space in its buffer write
+/// operations. This can be done by increasing the size of the internal
+/// buffer or by writing the contents of the buffer to the output iterator.
+///
+/// This member function is a constructor argument, so its name is not
+/// fixed. The code uses the name __prepare_write.
+/// - The number of output code units can be limited by a __max_output_size
+///   object. This is used in format_to_n. This object:
+///   - Contains the maximum number of code units to be written.
+///   - Contains the number of code units that are requested to be written.
+/// This number is returned to the user of format_to_n.
+///   - The write functions call the object's __request_write member function.
+/// This function:
+/// - Updates the number of code units that are requested to be written.
+/// - Returns the number of code units that can be written without
+///   exceeding the maximum number of code units to be written.
+///
+/// Documentation for the buffer usage members:
+/// - __ptr_ the start of the buffer.
+/// - __capacity_ the number of code units that can be written.
+///   This means [__ptr_, __ptr_ + __capacity_) is a valid range to write to.
+/// - __size_ the number of code units written in the buffer. The next code
+///   unit will be written at __ptr_ + __size_. This __size_ may NOT contain
+///   the total number of code units written by the __output_buffer. Whether or
+///   not it does depends on the sub-class used. Typically the total number of
+///   code units written is not interesting. It is interesting for format_to_n
+///   which has its own way to track this number.

ldionne wrote:

```suggestion
/// - __ptr_
///The start of the buffer.
/// - __capacity_
///The number of code units that can be written.
///   This means [__ptr_, __ptr_ + __capacity_) is a valid range to write to.
/// - __size_
///The number of code units written in the buffer. The next code
///   unit will be written at __ptr_ + __size_. This __size_ may NOT contain
///   the total number of code units written by the __output_buffer. Whether or
///   not it does depends on the sub-class used. Typically the total number of
///   code units written is not interesting. It is interesting for format_to_n
///   which has its own way to track this number.
```

Breaking the line makes it easier to see that the first word is the name of the 
member.
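
For readers following the design comment, here is a rough sketch of how `__ptr_`, `__capacity_`, `__size_` and the `__prepare_write` callback might fit together. It is an assumption-laden toy, not the libc++ implementation: the class names `toy_output_buffer` and `toy_flushing_buffer`, the 8-unit storage, and the `std::string` sink are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for the base buffer described in the comment.
class toy_output_buffer {
public:
  using prepare_write_fn = void (*)(toy_output_buffer&, std::size_t);

  toy_output_buffer(char* ptr, std::size_t capacity, prepare_write_fn fn)
      : ptr_(ptr), capacity_(capacity), prepare_write_(fn) {
    assert(capacity > 0);
  }

  // Write first, then make room, so there is always space for one unit.
  void push_back(char c) {
    ptr_[size_++] = c;
    if (size_ == capacity_)
      prepare_write_(*this, 2); // request size is illustrative only
  }

  void copy(std::string_view text) {
    for (char c : text)
      push_back(c);
  }

protected:
  char* ptr_;              // start of the buffer
  std::size_t capacity_;   // [ptr_, ptr_ + capacity_) may be written
  std::size_t size_ = 0;   // next write goes to ptr_ + size_
  prepare_write_fn prepare_write_;
};

// Subclass with a fixed buffer that makes room by flushing to a sink.
class toy_flushing_buffer : public toy_output_buffer {
public:
  explicit toy_flushing_buffer(std::string& sink)
      : toy_output_buffer(storage_, sizeof(storage_), prepare_write),
        sink_(&sink) {}

  void flush() {
    sink_->append(ptr_, size_);
    size_ = 0;
  }

private:
  // The base reference is known to be this subclass, so the cast is safe.
  static void prepare_write(toy_output_buffer& buffer, std::size_t) {
    static_cast<toy_flushing_buffer&>(buffer).flush();
  }

  char storage_[8];
  std::string* sink_;
};
```

Note that `size_` in this sketch is reset on every flush, so it does not track the total number of code units written, which matches the caveat in the `__size_` documentation.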

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne requested changes to this pull request.

As discussed, I think we should behave more like `std::vector` and not assume 
that elements between `[size(), capacity())` are already initialized. IMO this 
is a simpler mental model that is consistent with `std::vector` and I don't 
think we leave anything on the table by doing that.

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -58,23 +59,156 @@ namespace __format {
 /// This helper is used together with the @ref back_insert_iterator to offer
 /// type-erasure for the formatting functions. This reduces the number to
 /// template instantiations.
+///
+/// The design of the class is being changed to improve performance and do some
+/// code cleanups.
+/// The original design (as shipped up to LLVM-19) uses the following design:
+/// - There is an external object that connects the buffer to the output.
+/// - The class constructor stores a function pointer to a grow function and a
+///   type-erased pointer to the object that does the grow.
+/// - When writing data to the buffer would exceed the external buffer's
+///   capacity it requests the external buffer to flush its contents.
+///
+/// The new design tries to solve some issues with the current design:
+/// - The buffer used is a fixed-size buffer, benchmarking shows that using a
+///   dynamic allocated buffer has performance benefits.
+/// - Implementing P3107R5 "Permit an efficient implementation of std::print"
+///   is not trivial with the current buffers. Using the code from this series
+///   makes it trivial.
+///
+/// This class is ABI-tagged, still the new design does not change the size of
+/// objects of this class.
+///
+/// The new design contains information regarding format_to_n changes, these
+/// will be implemented in follow-up patch.
+///
+/// The new design is the following.
+/// - There is an external object that connects the buffer to the output.
+/// - This buffer object:
+///   - inherits publicly from this class.
+///   - has a static or dynamic buffer.
+///   - has a static member function to make space in its buffer write
+/// operations. This can be done by increasing the size of the internal
+/// buffer or by writing the contents of the buffer to the output iterator.
+///
+/// This member function is a constructor argument, so its name is not
+/// fixed. The code uses the name __prepare_write.
+/// - The number of output code units can be limited by a __max_output_size
+///   object. This is used in format_to_n. This object:
+///   - Contains the maximum number of code units to be written.
+///   - Contains the number of code units that are requested to be written.
+/// This number is returned to the user of format_to_n.
+///   - The write functions call the object's __request_write member function.
+/// This function:
+/// - Updates the number of code units that are requested to be written.
+/// - Returns the number of code units that can be written without
+///   exceeding the maximum number of code units to be written.
+///
+/// Documentation for the buffer usage members:
+/// - __ptr_ the start of the buffer.
+/// - __capacity_ the number of code units that can be written.
+///   This means [__ptr_, __ptr_ + __capacity_) is a valid range to write to.
+/// - __size_ the number of code units written in the buffer. The next code
+///   unit will be written at __ptr_ + __size_. This __size_ may NOT contain
+///   the total number of code units written by the __output_buffer. Whether or
+///   not it does depends on the sub-class used. Typically the total number of
+///   code units written is not interesting. It is interesting for format_to_n
+///   which has its own way to track this number.
+///
+/// Documentation for the buffer changes function:
+/// The subclasses have a function with the following signature:
+///
+///   static void __prepare_write(
+/// __output_buffer<_CharT>& __buffer, size_t __code_units);
+///
+/// This function is called when a write function writes more code units than
+/// the buffer's available space. When a __max_output_size object is provided
+/// the number of code units is the number of code units returned from
+/// __max_output_size::__request_write function.
+///
+/// - The __buffer contains *this. Since the class containing this function
+///   inherits from __output_buffer it's safe to cast it to the subclass being
+///   used.
+/// - The __code_units is the number of code units the caller will write + 1.
+///   - This value does not take the available space of the buffer into account.
+///   - The push_back function is more efficient when writing before resizing,
+/// this means the buffer should always have room for one code unit. Hence
+/// the + 1 is the size.
+/// - When the function returns there is room for at least one code unit. There
+///   is no requirement there is room for __code_units code units:
+///   - The class has some "bulk" operations. For example, __copy which copies
+/// the contents of a basic_string_view to the output. If the sub-class has
+/// a fixed size buffer the size of the basic_string_view may be larger
+/// than the buffer. In that case it's impossible to honor the requested
+/// size.
+///   - The at least one code unit makes sure the entire output can be written.
+/// (Obviously making room on

[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -58,23 +59,156 @@ namespace __format {
 /// This helper is used together with the @ref back_insert_iterator to offer
 /// type-erasure for the formatting functions. This reduces the number to
 /// template instantiations.
+///
+/// The design of the class is being changed to improve performance and do some
+/// code cleanups.
+/// The original design (as shipped up to LLVM-19) uses the following design:
+/// - There is an external object that connects the buffer to the output.
+/// - The class constructor stores a function pointer to a grow function and a
+///   type-erased pointer to the object that does the grow.

ldionne wrote:

```suggestion
/// - The class constructor stores a function pointer to a "grow" function and a
///   type-erased pointer to the object that should be grown.
```

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -499,11 +652,78 @@ struct _LIBCPP_TEMPLATE_VIS __format_to_n_buffer final
   _LIBCPP_HIDE_FROM_ABI auto __make_output_iterator() { return 
this->__output_.__make_output_iterator(); }
 
   _LIBCPP_HIDE_FROM_ABI format_to_n_result<_OutIt> __result() && {
-this->__output_.__flush();
+this->__output_.__flush(0);
 return {std::move(this->__writer_).__out_it(), this->__size_};
   }
 };
 
+// * * * LLVM-20 classes * * *
+
+// A dynamically growing buffer.
+template <__fmt_char_type _CharT>
+class _LIBCPP_TEMPLATE_VIS __allocating_buffer : public 
__output_buffer<_CharT> {
+public:
+  __allocating_buffer(const __allocating_buffer&)= delete;
+  __allocating_buffer& operator=(const __allocating_buffer&) = delete;
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI __allocating_buffer()
+  : __output_buffer<_CharT>{__buffer_, __buffer_size_, __prepare_write} {}
+
+  _LIBCPP_HIDE_FROM_ABI ~__allocating_buffer() {
+if (__ptr_ != __buffer_) {
+  ranges::destroy_n(__ptr_, this->__size());
+  allocator_traits<_Alloc>::deallocate(__alloc_, __ptr_, 
this->__capacity());
+}
+  }
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI basic_string_view<_CharT> __view() { 
return {__ptr_, this->__size()}; }
+
+private:
+  // At the moment the allocator is hard-coded. There might be reasons to have
+  // an allocator trait in the future. This ensures forward compatibility.
+  using _Alloc = allocator<_CharT>;
+  _LIBCPP_NO_UNIQUE_ADDRESS _Alloc __alloc_;
+
+  // Since allocating is expensive the class has a small internal buffer. When
+  // its capacity is exceeded a dynamic buffer will be allocated.
+  static constexpr size_t __buffer_size_ = 256;
+  _CharT __buffer_[__buffer_size_];
+  _CharT* __ptr_{__buffer_};
+
+  _LIBCPP_HIDE_FROM_ABI void __grow_buffer(size_t __capacity) {
+if (__capacity < __buffer_size_)
+  return;
+
+_LIBCPP_ASSERT_INTERNAL(__capacity > this->__capacity(), "the buffer must 
grow");
+auto __result = std::__allocate_at_least(__alloc_, __capacity);
+auto __guard  = std::__make_exception_guard([&] {
+  allocator_traits<_Alloc>::deallocate(__alloc_, __result.ptr, 
__result.count);
+});
+// This shouldn't throw, but just to be safe. Note that at -O1 this
+// guard is optimized away so there is no runtime overhead.
+new (__result.ptr) _CharT[__result.count];
+std::copy_n(__ptr_, this->__size(), __result.ptr);
+__guard.__complete();
+if (__ptr_ != __buffer_) {
+  ranges::destroy_n(__ptr_, this->__capacity());
+  allocator_traits<_Alloc>::deallocate(__alloc_, __ptr_, 
this->__capacity());
+}

ldionne wrote:

```suggestion
auto __guard  = std::__make_exception_guard([&] {
  allocator_traits<_Alloc>::deallocate(__alloc_, __result.ptr, __result.count);
});
// This shouldn't throw, but just to be safe. Note that at -O1 this
// guard is optimized away so there is no runtime overhead.
std::__uninitialized_allocator_relocate(__alloc_, __ptr_, __ptr_ + this->__size(), __result.ptr);
__guard.__complete();
if (__ptr_ != __buffer_) {
  allocator_traits<_Alloc>::deallocate(__alloc_, __ptr_, this->__capacity());
}
```
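
For readers following the thread without the libc++ sources at hand, the grow step under discussion can be approximated in portable C++. This is a minimal sketch, not the libc++ implementation: the class `small_buffer`, its members, and the doubling growth policy are all hypothetical, it is fixed to `char`, and it uses only public standard-library facilities (`std::allocator`, `std::uninitialized_copy_n`) in place of the internal `__allocate_at_least`, `__make_exception_guard`, and `__uninitialized_allocator_relocate` helpers. Because copying `char` cannot throw, the exception guard from the suggestion is omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string_view>

// Hypothetical stand-in for the allocating buffer in the patch:
// a small inline array that switches to heap storage once it is full.
class small_buffer {
public:
  small_buffer() = default;
  small_buffer(const small_buffer&)            = delete;
  small_buffer& operator=(const small_buffer&) = delete;

  ~small_buffer() {
    // Only heap storage is released; the inline array is part of the object.
    if (ptr_ != inline_)
      alloc_.deallocate(ptr_, capacity_);
  }

  void push_back(char c) {
    if (size_ == capacity_)
      grow(capacity_ * 2); // assumed growth policy, not taken from the patch
    ::new (static_cast<void*>(ptr_ + size_)) char(c);
    ++size_;
  }

  std::string_view view() const { return {ptr_, size_}; }
  bool on_heap() const { return ptr_ != inline_; }

private:
  void grow(std::size_t new_capacity) {
    char* fresh = alloc_.allocate(new_capacity);
    // Move the existing contents into the new storage before releasing the old.
    std::uninitialized_copy_n(ptr_, size_, fresh);
    if (ptr_ != inline_)
      alloc_.deallocate(ptr_, capacity_);
    ptr_      = fresh;
    capacity_ = new_capacity;
  }

  static constexpr std::size_t inline_size = 8; // the patch uses 256
  std::allocator<char> alloc_;
  char inline_[inline_size];
  char* ptr_            = inline_;
  std::size_t capacity_ = inline_size;
  std::size_t size_     = 0;
};
```

The old storage is released only after the copy has completed, which mirrors the ordering in the suggestion above.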

https://github.com/llvm/llvm-project/pull/101817
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -58,23 +59,156 @@ namespace __format {
 /// This helper is used together with the @ref back_insert_iterator to offer
 /// type-erasure for the formatting functions. This reduces the number to
 /// template instantiations.
+///
+/// The design of the class is being changed to improve performance and do some
+/// code cleanups.
+/// The original design (as shipped up to LLVM-19) uses the following design:
+/// - There is an external object that connects the buffer to the output.
+/// - The class constructor stores a function pointer to a grow function and a
+///   type-erased pointer to the object that does the grow.
+/// - When writing data to the buffer would exceed the external buffer's
+///   capacity it requests the external buffer to flush its contents.
+///
+/// The new design tries to solve some issues with the current design:
+/// - The buffer used is a fixed-size buffer, benchmarking shows that using a
+///   dynamic allocated buffer has performance benefits.
+/// - Implementing P3107R5 "Permit an efficient implementation of std::print"
+///   is not trivial with the current buffers. Using the code from this series
+///   makes it trivial.
+///
+/// This class is ABI-tagged, still the new design does not change the size of
+/// objects of this class.
+///
+/// The new design contains information regarding format_to_n changes, these
+/// will be implemented in follow-up patch.
+///
+/// The new design is the following.
+/// - There is an external object that connects the buffer to the output.
+/// - This buffer object:
+///   - inherits publicly from this class.
+///   - has a static or dynamic buffer.
+///   - has a static member function to make space in its buffer write
+/// operations. This can be done by increasing the size of the internal
+/// buffer or by writing the contents of the buffer to the output iterator.
+///
+/// This member function is a constructor argument, so its name is not
+/// fixed. The code uses the name __prepare_write.
+/// - The number of output code units can be limited by a __max_output_size
+///   object. This is used in format_to_n This object:
+///   - Contains the maximum number of code units to be written.
+///   - Contains the number of code units that are requested to be written.
+/// This number is returned to the user of format_to_n.

ldionne wrote:

```suggestion
///   - Contains the number of code units that were actually written.
/// This number is returned to the user of format_to_n.
```

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -58,23 +59,156 @@ namespace __format {
 /// This helper is used together with the @ref back_insert_iterator to offer
 /// type-erasure for the formatting functions. This reduces the number to
 /// template instantiations.
+///
+/// The design of the class is being changed to improve performance and do some
+/// code cleanups.
+/// The original design (as shipped up to LLVM-19) uses the following design:
+/// - There is an external object that connects the buffer to the output.
+/// - The class constructor stores a function pointer to a grow function and a
+///   type-erased pointer to the object that does the grow.
+/// - When writing data to the buffer would exceed the external buffer's
+///   capacity it requests the external buffer to flush its contents.
+///
+/// The new design tries to solve some issues with the current design:
+/// - The buffer used is a fixed-size buffer, benchmarking shows that using a
+///   dynamic allocated buffer has performance benefits.
+/// - Implementing P3107R5 "Permit an efficient implementation of std::print"
+///   is not trivial with the current buffers. Using the code from this series
+///   makes it trivial.
+///
+/// This class is ABI-tagged, still the new design does not change the size of
+/// objects of this class.
+///
+/// The new design contains information regarding format_to_n changes, these
+/// will be implemented in follow-up patch.
+///
+/// The new design is the following.
+/// - There is an external object that connects the buffer to the output.
+/// - This buffer object:
+///   - inherits publicly from this class.
+///   - has a static or dynamic buffer.
+///   - has a static member function to make space in its buffer write
+/// operations. This can be done by increasing the size of the internal
+/// buffer or by writing the contents of the buffer to the output iterator.
+///
+/// This member function is a constructor argument, so its name is not
+/// fixed. The code uses the name __prepare_write.
+/// - The number of output code units can be limited by a __max_output_size
+///   object. This is used in format_to_n This object:
+///   - Contains the maximum number of code units to be written.
+///   - Contains the number of code units that are requested to be written.
+/// This number is returned to the user of format_to_n.
+///   - The write functions call objects __request_write member function.
+/// This function:
+/// - Updates the number of code units that are requested to be written.
+/// - Returns the number of code units that can be written without
+///   exceeding the maximum number of code units to be written.
+///
+/// Documentation for the buffer usage members:
+/// - __ptr_ the start of the buffer.
+/// - __capacity_ the number of code units that can be written.
+///   This means [__ptr_, __ptr_ + __capacity_) is a valid range to write to.
+/// - __size_ the number of code units written in the buffer. The next code
+///   unit will be written at __ptr_ + __size_. This __size_ may NOT contain
+///   the total number of code units written by the __output_buffer. Whether or
+///   not it does depends on the sub-class used. Typically the total number of
+///   code units written is not interesting. It is interesting for format_to_n
+///   which has its own way to track this number.
+///
+/// Documentation for the buffer changes function:
+/// The subclasses have a function with the following signature:
+///
+///   static void __prepare_write(
+/// __output_buffer<_CharT>& __buffer, size_t __code_units);
+///
+/// This function is called when a write function writes more code units than
+/// the buffer' available space. When an __max_output_size object is provided

ldionne wrote:

```suggestion
/// the buffer's available space. When an __max_output_size object is provided
```
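
The bookkeeping that the quoted comment attributes to `__max_output_size` can be illustrated in a few lines. This is a sketch under assumptions: the class name `max_output_size` and its member names are hypothetical, and the arithmetic is only one way to satisfy the two properties listed in the comment (update the requested count, return how many code units still fit under the maximum). It is not the code from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical model of the limit object described in the quoted comment.
class max_output_size {
public:
  explicit max_output_size(std::size_t max) : max_(max) {}

  // Records that n more code units were requested and returns how many of
  // them can be written without exceeding the maximum.
  std::size_t request_write(std::size_t n) {
    std::size_t already_written = std::min(requested_, max_);
    requested_ += n;
    return std::min(n, max_ - already_written);
  }

  // Total number of code units requested so far, including those that
  // did not fit.
  std::size_t requested() const { return requested_; }

private:
  std::size_t max_;
  std::size_t requested_ = 0;
};
```

With a maximum of 5, requests of 3, 4, and 1 code units are granted 3, 2, and 0 code units respectively, while the requested total grows to 8.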

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -58,23 +59,156 @@ namespace __format {
 /// This helper is used together with the @ref back_insert_iterator to offer
 /// type-erasure for the formatting functions. This reduces the number to
 /// template instantiations.
+///
+/// The design of the class is being changed to improve performance and do some
+/// code cleanups.
+/// The original design (as shipped up to LLVM-19) uses the following design:
+/// - There is an external object that connects the buffer to the output.
+/// - The class constructor stores a function pointer to a grow function and a
+///   type-erased pointer to the object that does the grow.
+/// - When writing data to the buffer would exceed the external buffer's
+///   capacity it requests the external buffer to flush its contents.
+///
+/// The new design tries to solve some issues with the current design:
+/// - The buffer used is a fixed-size buffer, benchmarking shows that using a
+///   dynamic allocated buffer has performance benefits.
+/// - Implementing P3107R5 "Permit an efficient implementation of std::print"
+///   is not trivial with the current buffers. Using the code from this series
+///   makes it trivial.
+///
+/// This class is ABI-tagged, still the new design does not change the size of
+/// objects of this class.
+///
+/// The new design contains information regarding format_to_n changes, these
+/// will be implemented in follow-up patch.
+///
+/// The new design is the following.
+/// - There is an external object that connects the buffer to the output.
+/// - This buffer object:
+///   - inherits publicly from this class.
+///   - has a static or dynamic buffer.
+///   - has a static member function to make space in its buffer write
+/// operations. This can be done by increasing the size of the internal
+/// buffer or by writing the contents of the buffer to the output iterator.
+///
+/// This member function is a constructor argument, so its name is not
+/// fixed. The code uses the name __prepare_write.
+/// - The number of output code units can be limited by a __max_output_size
+///   object. This is used in format_to_n This object:
+///   - Contains the maximum number of code units to be written.
+///   - Contains the number of code units that are requested to be written.
+/// This number is returned to the user of format_to_n.
+///   - The write functions call objects __request_write member function.
+/// This function:
+/// - Updates the number of code units that are requested to be written.
+/// - Returns the number of code units that can be written without
+///   exceeding the maximum number of code units to be written.
+///
+/// Documentation for the buffer usage members:
+/// - __ptr_ the start of the buffer.
+/// - __capacity_ the number of code units that can be written.
+///   This means [__ptr_, __ptr_ + __capacity_) is a valid range to write to.
+/// - __size_ the number of code units written in the buffer. The next code
+///   unit will be written at __ptr_ + __size_. This __size_ may NOT contain
+///   the total number of code units written by the __output_buffer. Whether or
+///   not it does depends on the sub-class used. Typically the total number of
+///   code units written is not interesting. It is interesting for format_to_n
+///   which has its own way to track this number.
+///
+/// Documentation for the buffer changes function:
+/// The subclasses have a function with the following signature:
+///
+///   static void __prepare_write(
+/// __output_buffer<_CharT>& __buffer, size_t __code_units);
+///
+/// This function is called when a write function writes more code units than
+/// the buffer' available space. When an __max_output_size object is provided
+/// the number of code units is the number of code units returned from
+/// __max_output_size::__request_write function.
+///
+/// - The __buffer contains *this. Since the class containing this function
+///   inherits from __output_buffer it's save to cast it to the subclass being
+///   used.
+/// - The __code_units is the number of code units the caller will write + 1.
+///   - This value does not take the avaiable space of the buffer into account.
+///   - The push_back function is more efficient when writing before resizing,
+/// this means the buffer should always have room for one code unit. Hence
+/// the + 1 is the size.
+/// - When the function returns there is room for at least one code unit. There
+///   is no requirement there is room for __code_units code units:
+///   - The class has some "bulk" operations. For example, __copy which copies
+/// the contents of a basic_string_view to the output. If the sub-class has
+/// a fixed size buffer the size of the basic_string_view may be larger
+/// than the buffer. In that case it's impossible to honor the requested
+/// size.
+///   - The at least one code unit makes sure the entire output can be written.
+/// (Obviously making room on

[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -58,23 +59,156 @@ namespace __format {
 /// This helper is used together with the @ref back_insert_iterator to offer
 /// type-erasure for the formatting functions. This reduces the number to
 /// template instantiations.
+///
+/// The design of the class is being changed to improve performance and do some
+/// code cleanups.
+/// The original design (as shipped up to LLVM-19) uses the following design:
+/// - There is an external object that connects the buffer to the output.
+/// - The class constructor stores a function pointer to a grow function and a
+///   type-erased pointer to the object that does the grow.
+/// - When writing data to the buffer would exceed the external buffer's
+///   capacity it requests the external buffer to flush its contents.
+///
+/// The new design tries to solve some issues with the current design:
+/// - The buffer used is a fixed-size buffer, benchmarking shows that using a
+///   dynamic allocated buffer has performance benefits.
+/// - Implementing P3107R5 "Permit an efficient implementation of std::print"
+///   is not trivial with the current buffers. Using the code from this series
+///   makes it trivial.
+///
+/// This class is ABI-tagged, still the new design does not change the size of
+/// objects of this class.
+///
+/// The new design contains information regarding format_to_n changes, these
+/// will be implemented in follow-up patch.
+///
+/// The new design is the following.
+/// - There is an external object that connects the buffer to the output.
+/// - This buffer object:
+///   - inherits publicly from this class.
+///   - has a static or dynamic buffer.
+///   - has a static member function to make space in its buffer write
+/// operations. This can be done by increasing the size of the internal
+/// buffer or by writing the contents of the buffer to the output iterator.
+///
+/// This member function is a constructor argument, so its name is not
+/// fixed. The code uses the name __prepare_write.
+/// - The number of output code units can be limited by a __max_output_size
+///   object. This is used in format_to_n This object:
+///   - Contains the maximum number of code units to be written.
+///   - Contains the number of code units that are requested to be written.
+/// This number is returned to the user of format_to_n.
+///   - The write functions call objects __request_write member function.
+/// This function:
+/// - Updates the number of code units that are requested to be written.
+/// - Returns the number of code units that can be written without
+///   exceeding the maximum number of code units to be written.
+///
+/// Documentation for the buffer usage members:
+/// - __ptr_ the start of the buffer.
+/// - __capacity_ the number of code units that can be written.
+///   This means [__ptr_, __ptr_ + __capacity_) is a valid range to write to.
+/// - __size_ the number of code units written in the buffer. The next code
+///   unit will be written at __ptr_ + __size_. This __size_ may NOT contain
+///   the total number of code units written by the __output_buffer. Whether or
+///   not it does depends on the sub-class used. Typically the total number of
+///   code units written is not interesting. It is interesting for format_to_n
+///   which has its own way to track this number.
+///
+/// Documentation for the buffer changes function:
+/// The subclasses have a function with the following signature:
+///
+///   static void __prepare_write(
+/// __output_buffer<_CharT>& __buffer, size_t __code_units);
+///
+/// This function is called when a write function writes more code units than
+/// the buffer' available space. When an __max_output_size object is provided
+/// the number of code units is the number of code units returned from
+/// __max_output_size::__request_write function.
+///
+/// - The __buffer contains *this. Since the class containing this function
+///   inherits from __output_buffer it's save to cast it to the subclass being

ldionne wrote:

```suggestion
///   inherits from __output_buffer it's safe to cast it to the subclass being
```
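
The cast mentioned in this comment follows from the design described in the quoted hunk: the base class stores a function pointer supplied by the subclass constructor, and the subclass's static function receives the base object and casts it back. The following is a minimal sketch of that pattern, assuming hypothetical names (`output_buffer`, `flushing_buffer`, `prepare_write_fn`) and a `std::string` sink; the handling of the code-unit count is simplified and does not reproduce the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical base class: type-erases the subclass behind a function pointer.
class output_buffer {
public:
  using prepare_write_fn = void (*)(output_buffer&, std::size_t);

  output_buffer(char* ptr, std::size_t capacity, prepare_write_fn fn)
      : ptr_(ptr), capacity_(capacity), prepare_write_(fn) {}

  void push_back(char c) {
    // Write first, then make room, so there is always space for one code unit.
    ptr_[size_++] = c;
    if (size_ == capacity_)
      prepare_write_(*this, 1); // the count is ignored in this sketch
  }

protected:
  char* ptr_;
  std::size_t capacity_;
  std::size_t size_ = 0;
  prepare_write_fn prepare_write_;
};

// Hypothetical subclass with a fixed-size buffer that flushes to a string.
class flushing_buffer : public output_buffer {
public:
  explicit flushing_buffer(std::string& sink)
      : output_buffer(storage_, sizeof(storage_), prepare_write), sink_(sink) {}

  void flush() {
    sink_.append(ptr_, size_);
    size_ = 0;
  }

private:
  // The base object is known to be a flushing_buffer because only this
  // class passes this function to the base constructor.
  static void prepare_write(output_buffer& buffer, std::size_t) {
    static_cast<flushing_buffer&>(buffer).flush();
  }

  char storage_[4];
  std::string& sink_;
};
```

The `static_cast` is valid only because the function pointer and the derived object are tied together in the constructor; a base object constructed with a different function would never reach this code path.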

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -58,23 +59,156 @@ namespace __format {
 /// This helper is used together with the @ref back_insert_iterator to offer
 /// type-erasure for the formatting functions. This reduces the number to
 /// template instantiations.
+///
+/// The design of the class is being changed to improve performance and do some
+/// code cleanups.
+/// The original design (as shipped up to LLVM-19) uses the following design:
+/// - There is an external object that connects the buffer to the output.
+/// - The class constructor stores a function pointer to a grow function and a
+///   type-erased pointer to the object that does the grow.
+/// - When writing data to the buffer would exceed the external buffer's
+///   capacity it requests the external buffer to flush its contents.
+///
+/// The new design tries to solve some issues with the current design:
+/// - The buffer used is a fixed-size buffer, benchmarking shows that using a
+///   dynamic allocated buffer has performance benefits.
+/// - Implementing P3107R5 "Permit an efficient implementation of std::print"
+///   is not trivial with the current buffers. Using the code from this series
+///   makes it trivial.
+///
+/// This class is ABI-tagged, still the new design does not change the size of
+/// objects of this class.
+///
+/// The new design contains information regarding format_to_n changes, these
+/// will be implemented in follow-up patch.
+///
+/// The new design is the following.
+/// - There is an external object that connects the buffer to the output.
+/// - This buffer object:
+///   - inherits publicly from this class.
+///   - has a static or dynamic buffer.
+///   - has a static member function to make space in its buffer write
+/// operations. This can be done by increasing the size of the internal
+/// buffer or by writing the contents of the buffer to the output iterator.
+///
+/// This member function is a constructor argument, so its name is not
+/// fixed. The code uses the name __prepare_write.
+/// - The number of output code units can be limited by a __max_output_size
+///   object. This is used in format_to_n This object:
+///   - Contains the maximum number of code units to be written.
+///   - Contains the number of code units that are requested to be written.
+/// This number is returned to the user of format_to_n.
+///   - The write functions call objects __request_write member function.
+/// This function:

ldionne wrote:

```suggestion
///   - The write functions call the object's __request_write member function.
/// This function:
```

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -499,11 +652,78 @@ struct _LIBCPP_TEMPLATE_VIS __format_to_n_buffer final
   _LIBCPP_HIDE_FROM_ABI auto __make_output_iterator() { return 
this->__output_.__make_output_iterator(); }
 
   _LIBCPP_HIDE_FROM_ABI format_to_n_result<_OutIt> __result() && {
-this->__output_.__flush();
+this->__output_.__flush(0);
 return {std::move(this->__writer_).__out_it(), this->__size_};
   }
 };
 
+// * * * LLVM-20 classes * * *
+
+// A dynamically growing buffer.
+template <__fmt_char_type _CharT>
+class _LIBCPP_TEMPLATE_VIS __allocating_buffer : public __output_buffer<_CharT> {
+public:
+  __allocating_buffer(const __allocating_buffer&)= delete;
+  __allocating_buffer& operator=(const __allocating_buffer&) = delete;
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI __allocating_buffer()
+  : __output_buffer<_CharT>{__buffer_, __buffer_size_, __prepare_write} {}
+
+  _LIBCPP_HIDE_FROM_ABI ~__allocating_buffer() {
+if (__ptr_ != __buffer_) {
+  ranges::destroy_n(__ptr_, this->__size());
+  allocator_traits<_Alloc>::deallocate(__alloc_, __ptr_, this->__capacity());
+}
+  }
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI basic_string_view<_CharT> __view() { return {__ptr_, this->__size()}; }
+
+private:
+  // At the moment the allocator is hard-code. There might be reasons to have
+  // an allocator trait in the future. This ensures forward compatibility.
+  using _Alloc = allocator<_CharT>;
+  _LIBCPP_NO_UNIQUE_ADDRESS _Alloc __alloc_;

ldionne wrote:

If we hardcode the allocator, I would simply not use one. This would simplify 
business with `construct_at` & friends too.

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -452,9 +452,9 @@ format_to(_OutIt __out_it, wformat_string<_Args...> __fmt, _Args&&... __args) {
 // fires too eagerly, see http://llvm.org/PR61563.
 template 
 [[nodiscard]] _LIBCPP_ALWAYS_INLINE inline _LIBCPP_HIDE_FROM_ABI string vformat(string_view __fmt, format_args __args) {
-  string __res;
-  std::vformat_to(std::back_inserter(__res), __fmt, __args);
-  return __res;
+  __format::__allocating_buffer __buffer;
+  std::vformat_to(__buffer.__make_output_iterator(), __fmt, __args);

ldionne wrote:

Can we remove an include since you don't mention `back_inserter` directly 
anymore?

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-13 Thread Louis Dionne via llvm-branch-commits


@@ -499,11 +652,78 @@ struct _LIBCPP_TEMPLATE_VIS __format_to_n_buffer final
   _LIBCPP_HIDE_FROM_ABI auto __make_output_iterator() { return 
this->__output_.__make_output_iterator(); }
 
   _LIBCPP_HIDE_FROM_ABI format_to_n_result<_OutIt> __result() && {
-this->__output_.__flush();
+this->__output_.__flush(0);
 return {std::move(this->__writer_).__out_it(), this->__size_};
   }
 };
 
+// * * * LLVM-20 classes * * *
+
+// A dynamically growing buffer.
+template <__fmt_char_type _CharT>
+class _LIBCPP_TEMPLATE_VIS __allocating_buffer : public __output_buffer<_CharT> {
+public:
+  __allocating_buffer(const __allocating_buffer&)= delete;
+  __allocating_buffer& operator=(const __allocating_buffer&) = delete;
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI __allocating_buffer()
+  : __output_buffer<_CharT>{__buffer_, __buffer_size_, __prepare_write} {}
+
+  _LIBCPP_HIDE_FROM_ABI ~__allocating_buffer() {
+if (__ptr_ != __buffer_) {
+  ranges::destroy_n(__ptr_, this->__size());
+  allocator_traits<_Alloc>::deallocate(__alloc_, __ptr_, this->__capacity());
+}
+  }
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI basic_string_view<_CharT> __view() { return {__ptr_, this->__size()}; }
+
+private:
+  // At the moment the allocator is hard-code. There might be reasons to have
+  // an allocator trait in the future. This ensures forward compatibility.
+  using _Alloc = allocator<_CharT>;
+  _LIBCPP_NO_UNIQUE_ADDRESS _Alloc __alloc_;
+
+  // Since allocating is expensive the class has a small internal buffer. When
+  // its capacity is exceeded a dynamic buffer will be allocated.
+  static constexpr size_t __buffer_size_ = 256;
+  _CharT __buffer_[__buffer_size_];

ldionne wrote:

```suggestion
  std::byte __buffer_[__buffer_size_ * sizeof(_CharT)];
```

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [llvm] [InstCombine] Don't look at ConstantData users (PR #103302)

2024-08-13 Thread Alexis Engelke via llvm-branch-commits

https://github.com/aengelke updated https://github.com/llvm/llvm-project/pull/103302

From 6a2ac00a8424a4402475e2b7972bfb01330c3bf8 Mon Sep 17 00:00:00 2001
From: Alexis Engelke 
Date: Tue, 13 Aug 2024 16:10:38 +
Subject: [PATCH 1/2] Only run instcombine in test case

Created using spr 1.3.5-bogner
---
 .../Transforms/InstCombine/phi-int-users.ll   | 416 --
 1 file changed, 379 insertions(+), 37 deletions(-)

diff --git a/llvm/test/Transforms/InstCombine/phi-int-users.ll b/llvm/test/Transforms/InstCombine/phi-int-users.ll
index ce81c5d7e3626..8a6bf44b884a2 100644
--- a/llvm/test/Transforms/InstCombine/phi-int-users.ll
+++ b/llvm/test/Transforms/InstCombine/phi-int-users.ll
@@ -1,14 +1,10 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
-; RUN: opt -mtriple=arm64 -passes='inline,function(sroa,jump-threading,instcombine)' -S < %s | FileCheck %s
+; RUN: opt -mtriple=arm64 -S < %s -passes=instcombine | FileCheck %s
 
 ; Verify that instcombine doesn't look at users of Constant in different
 ; functions for dominates() queries.
 
-%struct.widget = type { %struct.baz, i8, [7 x i8] }
-%struct.baz = type { %struct.snork }
-%struct.snork = type { [8 x i8] }
-
-define void @spam(ptr %arg) {
+define void @spam(ptr %arg) personality ptr null {
 ; CHECK-LABEL: define void @spam(
 ; CHECK-SAME: ptr [[ARG:%.*]]) personality ptr null {
 ; CHECK-NEXT:  [[BB:.*:]]
@@ -49,11 +45,55 @@ define void @spam(ptr %arg) {
 ; CHECK-NEXT:ret void
 ;
 bb:
-  call void @barney(ptr %arg)
+  %load.i = load volatile i1, ptr null, align 1
+  br i1 %load.i, label %bb2.i, label %bb3.i
+
+bb2.i:; preds = %bb
+  store i64 1, ptr %arg, align 8
+  br label %barney.exit
+
+bb3.i:; preds = %bb
+  %load.i.i = load volatile i32, ptr null, align 4
+  %icmp.i.i = icmp eq i32 %load.i.i, 0
+  br i1 %icmp.i.i, label %bb2.i.i, label %bb3.i.i
+
+bb2.i.i:  ; preds = %bb3.i
+  br label %bb1.i
+
+bb1.i:; preds = %spam.exit.i, %bb2.i.i
+  %load.i.i.i = load volatile i1, ptr null, align 1
+  br i1 %load.i.i.i, label %spam.exit.i, label %bb3.i.i.i
+
+bb3.i.i.i:; preds = %bb1.i
+  call void @zot.4()
+  br label %spam.exit.i
+
+spam.exit.i:  ; preds = %bb3.i.i.i, %bb1.i
+  %alloca.sroa.0.1.i = phi i64 [ 0, %bb3.i.i.i ], [ 1, %bb1.i ]
+  %0 = inttoptr i64 %alloca.sroa.0.1.i to ptr
+  store i32 0, ptr %0, align 4
+  br label %bb1.i
+
+eggs.exit:; No predecessors!
+  br label %barney.exit
+
+bb3.i.i:  ; preds = %bb3.i
+  %load.i.i1 = load volatile i1, ptr null, align 1
+  br i1 %load.i.i1, label %quux.exit, label %bb3.i.i2
+
+bb3.i.i2: ; preds = %bb3.i.i
+  call void @snork()
+  unreachable
+
+quux.exit:; preds = %bb3.i.i
+  store ptr null, ptr null, align 8
+  br label %barney.exit
+
+barney.exit:  ; preds = %quux.exit, %eggs.exit, %bb2.i
   ret void
 }
 
-define ptr @zot(ptr %arg) {
+define ptr @zot(ptr %arg) personality ptr null {
 ; CHECK-LABEL: define ptr @zot(
 ; CHECK-SAME: ptr [[ARG:%.*]]) personality ptr null {
 ; CHECK-NEXT:  [[BB:.*:]]
@@ -63,7 +103,9 @@ define ptr @zot(ptr %arg) {
 ; CHECK-NEXT:ret ptr null
 ;
 bb:
-  %call = call ptr @ham.8(ptr %arg)
+  %load.i.i.i.i = load ptr, ptr %arg, align 8
+  store ptr null, ptr %arg, align 8
+  store i32 0, ptr %load.i.i.i.i, align 4
   ret ptr null
 }
 
@@ -86,7 +128,7 @@ define ptr @wombat.1(ptr %arg) {
 ; CHECK-NEXT:ret ptr null
 ;
 bb:
-  %call = call ptr @foo.9(ptr %arg)
+  store i64 1, ptr %arg, align 8
   ret ptr null
 }
 
@@ -103,7 +145,15 @@ define void @quux() personality ptr null {
 ; CHECK-NEXT:ret void
 ;
 bb:
-  call void @wobble()
+  %load.i = load volatile i1, ptr null, align 1
+  br i1 %load.i, label %wibble.exit, label %bb3.i
+
+bb3.i:; preds = %bb
+  call void @snork()
+  unreachable
+
+wibble.exit:  ; preds = %bb
+  store ptr null, ptr null, align 8
   ret void
 }
 
@@ -120,7 +170,15 @@ define void @wobble() personality ptr null {
 ; CHECK-NEXT:ret void
 ;
 bb:
-  call void @quux.3()
+  %load.i.i = load volatile i1, ptr null, align 1
+  br i1 %load.i.i, label %wobble.2.exit, label %bb3.i.i
+
+bb3.i.i:  ; preds = %bb
+  call void @snork()
+  unreachable
+
+wobble.2.exit:; preds = %bb
+  store ptr null, ptr null, align 8
   ret void
 }
 
@@ -141,12 +199,20 @@ define void @eggs() personality ptr null {
 ; CHECK-NEXT:br label %[[BB1]]
 ;
 bb:
-  %alloca = alloca %struct.widget, align 8
   br label %bb1
 
-bb

[llvm-branch-commits] [llvm] [InstCombine] Don't look at ConstantData users (PR #103302)

2024-08-13 Thread Alexis Engelke via llvm-branch-commits




aengelke wrote:

Done. Fun fact, llvm-reduce crashed on the input due to this bug.

https://github.com/llvm/llvm-project/pull/103302


[llvm-branch-commits] [llvm] release/19.x: [PPC][AIX] Save/restore r31 when using base pointer (#100182) (PR #103301)

2024-08-13 Thread Sean Fertile via llvm-branch-commits

https://github.com/mandlebug approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/103301


[llvm-branch-commits] [libcxx] release/19.x: [libc++] Use a different smart ptr type alias (#102089) (PR #103003)

2024-08-13 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne approved this pull request.


https://github.com/llvm/llvm-project/pull/103003

