date:20250514

[llvm-branch-commits] [clang] [llvm] Enable fexec-charset option (PR #138895)

2025-05-14 Thread via llvm-branch-commits



@@ -0,0 +1,36 @@
+//===--- clang/Lex/LiteralConverter.h - Translator for Literals -*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_LEX_LITERALCONVERTER_H
+#define LLVM_CLANG_LEX_LITERALCONVERTER_H
+
+#include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/LangOptions.h"
+#include "clang/Basic/TargetInfo.h"
+#include "llvm/ADT/StringMap.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/CharSet.h"
+
+enum ConversionAction { NoConversion, ToSystemCharset, ToExecCharset };
+
+class LiteralConverter {
+  llvm::StringRef InternalCharset;
+  llvm::StringRef SystemCharset;
+  llvm::StringRef ExecCharset;
+  llvm::StringMap CharsetConverters;

cor3ntin wrote:

Let's have a single converter, and think about the pragma later.

(One solution would be a stack of converters, so that getting to the active
one is a single pointer read. Either way, the map is insufficient to
implement the pragma, and we don't want to design that pragma now: this PR
has enough complexity already.)
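
To illustrate the stack idea only, here is a minimal sketch in Python rather than the C++ of the PR. The class name `ConverterStack`, its methods, and the charset names are hypothetical and are not part of the patch; a converter is represented by its charset name.

```python
class ConverterStack:
    """Hypothetical stack of charset converters; the top entry is the active one."""

    def __init__(self, default_charset: str) -> None:
        # The bottom entry is the default converter and is never popped.
        self._stack = [default_charset]

    @property
    def active(self) -> str:
        # Reaching the active converter is a single read of the top of the stack.
        return self._stack[-1]

    def push(self, charset: str) -> None:
        # A pragma-like construct would push a converter on entry...
        self._stack.append(charset)

    def pop(self) -> None:
        # ...and pop it on exit, restoring the previously active converter.
        if len(self._stack) == 1:
            raise RuntimeError("cannot pop the default converter")
        self._stack.pop()
```

With a single converter, as requested for this PR, the stack would simply never grow beyond its default entry.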

https://github.com/llvm/llvm-project/pull/138895
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [libcxx] [libc++] Implements the new FTM documentation generator. (PR #139774)

2025-05-14 Thread Louis Dionne via llvm-branch-commits



@@ -0,0 +1,187 @@
+# ===----------------------------------------------------------------------===##
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+# ===----------------------------------------------------------------------===##
+
+# RUN: %{python} %s %{libcxx-dir}/utils %{libcxx-dir}/test/libcxx/feature_test_macro/test_data.json
+
+import sys
+import unittest
+
+UTILS = sys.argv[1]
+TEST_DATA = sys.argv[2]
+del sys.argv[1:3]
+
+sys.path.append(UTILS)
+from generate_feature_test_macro_components import FeatureTestMacros
+
+
+class Test(unittest.TestCase):
+    def setUp(self):
+        self.ftm = FeatureTestMacros(TEST_DATA, ["charconv"])
+        self.maxDiff = None  # This causes the diff to be printed when the test fails
+
+    def test_implementeation(self):

ldionne wrote:

```suggestion
def test_implementation(self):
```

https://github.com/llvm/llvm-project/pull/139774

[llvm-branch-commits] [libcxx] [libc++] Implements the new FTM documentation generator. (PR #139774)

2025-05-14 Thread Louis Dionne via llvm-branch-commits



@@ -2555,6 +2645,72 @@ def generate_header_test_directory(self, path: os.path) -> None:
 f.write(self.generate_header_test_file(header))
 
 
+    @functools.cached_property
+    def status_list_table(self) -> str:
+        """Creates the rst status table using a list-table."""
+
+        result = ""
+        for std in self.std_dialects:
+            result += create_table_row(
+                [[f'**{std.replace("c++", "C++")}**', "", "", "", ""]]
+            )
+
+            for feature in self.__data:
+                if std not in feature["values"].keys():
+                    continue
+
+                row = list()
+
+                ftm = feature["name"]
+                libcxx_value = (
+                    f"{self.standard_ftms[ftm][std]}"
+                    if self.is_implemented(ftm, std)
+                    else "*unimplemented*"
+                )
+
+                values = feature["values"][std]
+                assert len(values) > 0, f"{feature['name']}[{std}] has no entries"
+                for value in values:

ldionne wrote:

```suggestion
for value, papers in values.items():
```

Would that work too?
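
A minimal sketch of the difference, assuming `values` is a dict that maps each macro value to its list of papers. That shape is an assumption, since the diff does not show the data, and the entries below are placeholders.

```python
# Placeholder data: macro value -> list of paper entries (assumed shape).
values = {
    "value-a": ["paper-1"],
    "value-b": ["paper-2", "paper-3"],
}

# Iterating the dict directly yields keys only, so each step needs a lookup.
by_lookup = [(value, values[value]) for value in values]

# .items() yields (key, value) pairs, so the lookup is no longer needed.
by_items = [(value, papers) for value, papers in values.items()]
```

Both forms visit the entries in the same order; the second only works if `values` is a mapping rather than a list.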

https://github.com/llvm/llvm-project/pull/139774

[llvm-branch-commits] [libcxx] [libc++] Implements the new FTM documentation generator. (PR #139774)

2025-05-14 Thread Louis Dionne via llvm-branch-commits


https://github.com/ldionne edited 
https://github.com/llvm/llvm-project/pull/139774

[llvm-branch-commits] [libcxx] [libc++] Implements the new FTM documentation generator. (PR #139774)

2025-05-14 Thread Louis Dionne via llvm-branch-commits


https://github.com/ldionne approved this pull request.

I am really excited for this change! This looks really good, with a few 
comments.

https://github.com/llvm/llvm-project/pull/139774

[llvm-branch-commits] [clang] [clang][analyzer] Handle CXXParenInitListExpr alongside InitListExpr (PR #139909)

2025-05-14 Thread Fangyi Zhou via llvm-branch-commits


fangyi-zhou wrote:

I think you probably want to cherry-pick 
https://github.com/llvm/llvm-project/pull/136041/commits/5dc9d55eb04d94c01dba0364b51a509f975e542a
 which addresses reviewer comments and fixes the tests.

https://github.com/llvm/llvm-project/pull/139909

[llvm-branch-commits] [clang] [llvm] Enable fexec-charset option (PR #138895)

2025-05-14 Thread via llvm-branch-commits



@@ -0,0 +1,36 @@
+//===--- clang/Lex/LiteralConverter.h - Translator for Literals -*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_LEX_LITERALCONVERTER_H
+#define LLVM_CLANG_LEX_LITERALCONVERTER_H
+
+#include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/LangOptions.h"
+#include "clang/Basic/TargetInfo.h"
+#include "llvm/ADT/StringMap.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/CharSet.h"
+
+enum ConversionAction { NoConversion, ToSystemCharset, ToExecCharset };
+
+class LiteralConverter {
+  llvm::StringRef InternalCharset;
+  llvm::StringRef SystemCharset;
+  llvm::StringRef ExecCharset;
+  llvm::StringMap CharsetConverters;

cor3ntin wrote:

Even if we were to do that, I don't think a map would be the best solution - 
there is only a given pair of converters active at any given time.

https://github.com/llvm/llvm-project/pull/138895

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: improve handling of unreachable basic blocks (PR #136183)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/136183

From c11b9017a4d3a4c07946081a355506e0f69d312b Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Thu, 17 Apr 2025 20:51:16 +0300
Subject: [PATCH 1/3] [BOLT] Gadget scanner: improve handling of unreachable
 basic blocks

Instead of refusing to analyze an instruction completely when it is
unreachable according to the CFG reconstructed by BOLT, pessimistically
assume all registers to be unsafe at the start of basic blocks without
any predecessors. Nevertheless, unreachable basic blocks found in
optimized code likely mean imprecise CFG reconstruction, so report a
warning once per basic block without predecessors.
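
The choice of starting state described above can be modelled roughly as follows. This is a simplified Python sketch, not the C++ of the patch: the `Block` type and both helper functions are hypothetical, a state is reduced to the set of registers considered safe, and `None` stands for the "empty" state that is filled in later by propagation.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


@dataclass
class Block:
    """Hypothetical stand-in for a basic block."""
    name: str
    is_entry: bool = False
    preds: List[str] = field(default_factory=list)


def initial_state(bb: Block, entry_safe: FrozenSet[str]) -> Optional[FrozenSet[str]]:
    """Return the registers assumed safe at the start of bb."""
    if bb.is_entry:
        return entry_safe      # registers trusted on function entry
    if not bb.preds:
        return frozenset()     # no predecessors: every register is unsafe
    return None                # empty state, to be propagated from predecessors


def blocks_to_warn_about(blocks: List[Block]) -> List[str]:
    """One warning per non-entry block that has no predecessors."""
    return [bb.name for bb in blocks if not bb.preds and not bb.is_entry]
```

The point of the distinction is that an all-unsafe state is still a real state that the analysis can work with, whereas an empty state means no information has reached the block yet.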
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 46 ++-
 .../AArch64/gs-pacret-autiasp.s   |  7 ++-
 .../binary-analysis/AArch64/gs-pauth-calls.s  | 57 +++
 3 files changed, 95 insertions(+), 15 deletions(-)

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index cd7c077a6412e..3cee579ef2a15 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -343,6 +343,12 @@ class SrcSafetyAnalysis {
 return S;
   }
 
+  /// Creates a state with all registers marked unsafe (not to be confused
+  /// with empty state).
+  SrcState createUnsafeState() const {
+return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
+  }
+
   BitVector getClobberedRegs(const MCInst &Point) const {
 BitVector Clobbered(NumRegs);
 // Assume a call can clobber all registers, including callee-saved
@@ -585,6 +591,13 @@ class DataflowSrcSafetyAnalysis
 if (BB.isEntryPoint())
   return createEntryState();
 
+// If a basic block without any predecessors is found in an optimized code,
+// this likely means that some CFG edges were not detected. Pessimistically
+// assume all registers to be unsafe before this basic block and warn about
+// this fact in FunctionAnalysis::findUnsafeUses().
+if (BB.pred_empty())
+  return createUnsafeState();
+
 return SrcState();
   }
 
@@ -658,12 +671,6 @@ class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -1342,19 +1349,30 @@ void FunctionAnalysisContext::findUnsafeUses(
 BF.dump();
   });
 
+  if (BF.hasCFG()) {
+// Warn on basic blocks being unreachable according to BOLT, as this
+// likely means CFG is imprecise.
+for (BinaryBasicBlock &BB : BF) {
+  if (!BB.pred_empty() || BB.isEntryPoint())
+continue;
+  // Arbitrarily attach the report to the first instruction of BB.
+  MCInst *InstToReport = BB.getFirstNonPseudoInstr();
+  if (!InstToReport)
+continue; // BB has no real instructions
+
+  Reports.push_back(
+  make_generic_report(MCInstReference::get(InstToReport, BF),
+  "Warning: no predecessor basic blocks detected "
+  "(possibly incomplete CFG)"));
+}
+  }
+
   iterateOverInstrs(BF, [&](MCInstReference Inst) {
 if (BC.MIB->isCFI(Inst))
   return;
 
 const SrcState &S = Analysis->getStateBefore(Inst);
-
-// If non-empty state was never propagated from the entry basic block
-// to Inst, assume it to be unreachable and report a warning.
-if (S.empty()) {
-  Reports.push_back(
-  make_generic_report(Inst, "Warning: unreachable instruction found"));
-  return;
-}
+assert(!S.empty() && "Instruction has no associated state");
 
 if (auto Report = shouldReportReturnGadget(BC, Inst, S))
   Reports.push_back(*Report);
diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
index 284f0bea607a5..6559ba336e8de 100644
--- a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
+++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s
@@ -215,12 +215,17 @@ f_callclobbered_calleesaved:
 .globl  f_unreachable_instruction
 .type   f_unreachable_instruction,@function
 f_unreachable_instruction:
-// CHECK-LABEL: GS-PAUTH: Warning: unreachable instruction found in function f_unreachable_instruction, basic block {{[0-9a-zA-Z.]+}}, at address
+// CHECK-LABEL: GS-PAUTH: Warning: no predecessor basic blocks detected (possibly incomplete CFG) in function f_unreachable_instruction, basic block {{[0-9a-zA-Z.]+}}, at address
 // CHECK-NEXT:The instruction is {{[0-9a-f]+}}:   add x0, x1, x2
 // CHECK-NOT:   instructi

[llvm-branch-commits] [llvm] [BOLT] Factor out MCInstReference from gadget scanner (NFC) (PR #138655)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/138655

From cbcac1bc4612b601a5ec963663ba69a5f212feb1 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 28 Apr 2025 18:35:48 +0300
Subject: [PATCH] [BOLT] Factor out MCInstReference from gadget scanner (NFC)

Move MCInstReference out of the gadget scanner into a separate header,
bolt/Core/MCInstUtils.h. MCInstReference represents a constant reference
to an instruction inside a parent entity: either a basic block (which has
a reference to its parent function) or the function itself (when CFG
information is not available).
---
 bolt/include/bolt/Core/MCInstUtils.h  | 172 +
 bolt/include/bolt/Passes/PAuthGadgetScanner.h | 180 +-
 bolt/lib/Core/CMakeLists.txt  |   1 +
 bolt/lib/Core/MCInstUtils.cpp |  57 ++
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  99 --
 5 files changed, 272 insertions(+), 237 deletions(-)
 create mode 100644 bolt/include/bolt/Core/MCInstUtils.h
 create mode 100644 bolt/lib/Core/MCInstUtils.cpp

diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h
new file mode 100644
index 0..a3912a8fb265a
--- /dev/null
+++ b/bolt/include/bolt/Core/MCInstUtils.h
@@ -0,0 +1,172 @@
+//===- bolt/Core/MCInstUtils.h --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef BOLT_CORE_MCINSTUTILS_H
+#define BOLT_CORE_MCINSTUTILS_H
+
+#include "bolt/Core/BinaryBasicBlock.h"
+
+#include 
+#include 
+#include 
+
+namespace llvm {
+namespace bolt {
+
+class BinaryFunction;
+
+/// MCInstReference represents a reference to a constant MCInst as stored either
+/// in a BinaryFunction (i.e. before a CFG is created), or in a BinaryBasicBlock
+/// (after a CFG is created).
+class MCInstReference {
+  using nocfg_const_iterator = std::map::const_iterator;
+
+  // Two cases are possible:
+  // * functions with CFG reconstructed - a function stores a collection of
+  //   basic blocks, each basic block stores a contiguous vector of MCInst
+  // * functions without CFG - there are no basic blocks created,
+  //   the instructions are directly stored in std::map in BinaryFunction
+  //
+  // In both cases, the direct parent of MCInst is stored together with an
+  // iterator pointing to the instruction.
+
+  // Helper struct: CFG is available, the direct parent is a basic block,
+  // iterator's type is `MCInst *`.
+  struct RefInBB {
+RefInBB(const BinaryBasicBlock *BB, const MCInst *Inst)
+: BB(BB), It(Inst) {}
+RefInBB(const RefInBB &Other) = default;
+RefInBB &operator=(const RefInBB &Other) = default;
+
+const BinaryBasicBlock *BB;
+BinaryBasicBlock::const_iterator It;
+
+bool operator<(const RefInBB &Other) const {
+  if (BB != Other.BB)
+return std::less{}(BB, Other.BB);
+  return It < Other.It;
+}
+
+bool operator==(const RefInBB &Other) const {
+  return BB == Other.BB && It == Other.It;
+}
+  };
+
+  // Helper struct: CFG is *not* available, the direct parent is a function,
+  // iterator's type is std::map::iterator (the mapped value
+  // is an instruction's offset).
+  struct RefInBF {
+RefInBF(const BinaryFunction *BF, nocfg_const_iterator It)
+: BF(BF), It(It) {}
+RefInBF(const RefInBF &Other) = default;
+RefInBF &operator=(const RefInBF &Other) = default;
+
+const BinaryFunction *BF;
+nocfg_const_iterator It;
+
+bool operator<(const RefInBF &Other) const {
+  if (BF != Other.BF)
+return std::less{}(BF, Other.BF);
+  return It->first < Other.It->first;
+}
+
+bool operator==(const RefInBF &Other) const {
+  return BF == Other.BF && It->first == Other.It->first;
+}
+  };
+
+  std::variant Reference;
+
+  // Utility methods to be used like this:
+  //
+  // if (auto *Ref = tryGetRefInBB())
+  //   return Ref->doSomething(...);
+  // return getRefInBF().doSomethingElse(...);
+  const RefInBB *tryGetRefInBB() const {
+assert(std::get_if(&Reference) ||
+   std::get_if(&Reference));
+return std::get_if(&Reference);
+  }
+  const RefInBF &getRefInBF() const {
+assert(std::get_if(&Reference));
+return *std::get_if(&Reference);
+  }
+
+public:
+  /// Constructs an empty reference.
+  MCInstReference() : Reference(RefInBB(nullptr, nullptr)) {}
+  /// Constructs a reference to the instruction inside the basic block.
+  MCInstReference(const BinaryBasicBlock *BB, const MCInst *Inst)
+  : Reference(RefInBB(BB, Inst)) {
+assert(BB && Inst && "Neither BB nor Inst should be nullptr");
+  }
+  /// Constructs a reference to the instruction inside the basic block.
+  MCInstReference(const BinaryBasicBlock *BB, uns

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: account for BRK when searching for auth oracles (PR #137975)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137975

From 91b67b64a4e731cdfabc09bd224c32e1ab25e21d Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Wed, 30 Apr 2025 16:08:10 +0300
Subject: [PATCH] [BOLT] Gadget scanner: account for BRK when searching for
 auth oracles

An authenticated pointer can be explicitly checked by the compiler via a
sequence of instructions that executes BRK on failure. It is important
to recognize such a BRK instruction as checking every register (as it is
expected to immediately trigger an abnormal program termination) to
prevent false positive reports about authentication oracles:

autia   x2, x3
autia   x0, x1
; neither x0 nor x2 are checked at this point
eor x16, x0, x0, lsl #1
tbz x16, #62, on_success ; marks x0 as checked
; end of BB: for x2 to be checked here, it must be checked in both
; successor basic blocks
  on_failure:
brk 0xc470
  on_success:
; x2 is checked
ldr x1, [x2] ; marks x2 as checked
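
The effect of the trap on the example above can be modelled with a small Python sketch. It is a deliberately simplified model, not the C++ of the patch: a state is just the set of registers known to be checked, the register list and the helper `merge_successors` are hypothetical, and a register counts as checked at the end of a block only if it is checked in every successor.

```python
from functools import reduce

# Registers that appear in the example above.
ALL_REGS = frozenset({"x0", "x1", "x2", "x3", "x16"})


def merge_successors(successor_states):
    """Intersect the sets of checked registers over all successor blocks."""
    return reduce(frozenset.intersection, successor_states, ALL_REGS)


# on_success: the load through x2 checks x2.
on_success = frozenset({"x2"})
# on_failure with trap handling: BRK terminates the program, so every
# register is treated as checked on that path.
on_failure_with_trap = ALL_REGS
# on_failure without trap handling: nothing is known to be checked.
on_failure_without_trap = frozenset()

checked_with_trap = merge_successors([on_success, on_failure_with_trap])
checked_without_trap = merge_successors([on_success, on_failure_without_trap])
```

In this model x2 counts as checked at the end of the first block only when the BRK path is treated as checking everything; otherwise the intersection is empty and x2 would be reported.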
---
 bolt/include/bolt/Core/MCPlusBuilder.h| 14 ++
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 13 +-
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   | 24 --
 .../AArch64/gs-pauth-address-checks.s | 44 +--
 .../AArch64/gs-pauth-authentication-oracles.s |  9 ++--
 .../AArch64/gs-pauth-signing-oracles.s|  6 +--
 6 files changed, 75 insertions(+), 35 deletions(-)

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 6d3aa4f5f0feb..87de6754017db 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -706,6 +706,20 @@ class MCPlusBuilder {
 return false;
   }
 
+  /// Returns true if Inst is a trap instruction.
+  ///
+  /// Tests if Inst is an instruction that immediately causes an abnormal
+  /// program termination, for example when a security violation is detected
+  /// by a compiler-inserted check.
+  ///
+  /// @note An implementation of this method should likely return false for
+  /// calls to library functions like abort(), as it is possible that the
+  /// execution state is partially attacker-controlled at this point.
+  virtual bool isTrap(const MCInst &Inst) const {
+llvm_unreachable("not implemented");
+return false;
+  }
+
   virtual bool isBreakpoint(const MCInst &Inst) const {
 llvm_unreachable("not implemented");
 return false;
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index dfb71575b2b39..835ee26aaf08a 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1028,6 +1028,15 @@ class DstSafetyAnalysis {
   dbgs() << ")\n";
 });
 
+// If this instruction terminates the program immediately, no
+// authentication oracles are possible past this point.
+if (BC.MIB->isTrap(Point)) {
+  LLVM_DEBUG({ traceInst(BC, "Trap instruction found", Point); });
+  DstState Next(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
+  Next.CannotEscapeUnchecked.set();
+  return Next;
+}
+
 // If this instruction is reachable by the analysis, a non-empty state will
 // be propagated to it sooner or later. Until then, skip computeNext().
 if (Cur.empty()) {
@@ -1133,8 +1142,8 @@ class DataflowDstSafetyAnalysis
 //
 // A basic block without any successors, on the other hand, can be
 // pessimistically initialized to everything-is-unsafe: this will naturally
-// handle both return and tail call instructions and is harmless for
-// internal indirect branch instructions (such as computed gotos).
+// handle return, trap and tail call instructions. At the same time, it is
+// harmless for internal indirect branch instructions, like computed gotos.
 if (BB.succ_empty())
   return createUnsafeState();
 
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index f3c29e6ee43b9..4d11c5b206eab 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -386,10 +386,9 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 // the list of successors of this basic block as appropriate.
 
 // Any of the above code sequences assume the fall-through basic block
-// is a dead-end BRK instruction (any immediate operand is accepted).
+// is a dead-end trap instruction.
 const BinaryBasicBlock *BreakBB = BB.getFallthrough();
-if (!BreakBB || BreakBB->empty() ||
-BreakBB->front().getOpcode() != AArch64::BRK)
+if (!BreakBB || BreakBB->empty() || !isTrap(BreakBB->front()))
   return std::nullopt;
 
 // Iterate over the instructions of BB in reverse order, matching opcodes
@@ -1745,6 +1744,25 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 Inst.addOperand(MCOperand::createImm(0));
   }

[llvm-branch-commits] [llvm] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) (PR #138883)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/138883

From 1c135a144d7f21e05c3598a992baa170cdde7950 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Wed, 7 May 2025 16:42:00 +0300
Subject: [PATCH] [BOLT] Introduce helpers to match `MCInst`s one at a time
 (NFC)

Introduce the matchInst helper function to capture and/or match the operands
of an MCInst. Unlike the existing `MCPlusBuilder::MCInstMatcher` machinery,
matchInst is intended for use cases where precise control over the
instruction order is required. For example, when validating PtrAuth
hardening, all registers are usually considered unsafe after a function
call, even though callee-saved registers should preserve their old
values *under normal operation*.
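
The capture-then-match behaviour can be modelled with a short Python sketch. It is not the C++ API of the patch: the names `Operand` and `match_inst` are hypothetical, an instruction is reduced to an `(opcode, operands)` tuple, and the register/immediate distinction is dropped.

```python
from typing import Optional, Sequence, Tuple

# (opcode, operands): a hypothetical stand-in for MCInst.
Inst = Tuple[str, Sequence[object]]


class Operand:
    """Captures an operand on first use, then requires later operands to equal it."""

    def __init__(self, value: Optional[object] = None) -> None:
        self.value = value

    def matches(self, operand: object) -> bool:
        ok = self.value is None or self.value == operand
        # Assign unconditionally; match_inst rolls back on failure.
        self.value = operand
        return ok


def match_inst(inst: Inst, opcode: str, *ops: Operand) -> bool:
    """Match the opcode and an operand prefix; restore all captures on failure."""
    inst_opcode, operands = inst
    if inst_opcode != opcode or len(operands) < len(ops):
        return False
    saved = [op.value for op in ops]
    for op, operand in zip(ops, operands):
        if not op.matches(operand):
            for matcher, old in zip(ops, saved):
                matcher.value = old
            return False
    return True


# Capture Xm and Xn from the add, then require the branch to use the same Xm.
xn, xm = Operand(), Operand()
add = ("ADDXrs", ["x1", "x2", "x1", 0])
br = ("BR", ["x1"])
matched = match_inst(add, "ADDXrs", xm, xn, xm, Operand(0)) and match_inst(br, "BR", xm)
```

Because each instruction is matched by a separate call, the caller decides exactly which instructions are examined and in what order.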
---
 bolt/include/bolt/Core/MCInstUtils.h  | 128 ++
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |  90 +---
 2 files changed, 162 insertions(+), 56 deletions(-)

diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h
index a3912a8fb265a..b495eb8ef5eec 100644
--- a/bolt/include/bolt/Core/MCInstUtils.h
+++ b/bolt/include/bolt/Core/MCInstUtils.h
@@ -166,6 +166,134 @@ static inline raw_ostream &operator<<(raw_ostream &OS,
   return Ref.print(OS);
 }
 
+/// Instruction-matching helpers operating on a single instruction at a time.
+///
+/// Unlike MCPlusBuilder::MCInstMatcher, this matchInst() function focuses on
+/// the cases where a precise control over the instruction order is important:
+///
+/// // Bring the short names into the local scope:
+/// using namespace MCInstMatcher;
+/// // Declare the registers to capture:
+/// Reg Xn, Xm;
+/// // Capture the 0th and 1st operands, match the 2nd operand against the
+/// // just captured Xm register, match the 3rd operand against literal 0:
+/// if (!matchInst(MaybeAdd, AArch64::ADDXrs, Xm, Xn, Xm, Imm(0)))
+///   return AArch64::NoRegister;
+/// // Match the 0th operand against Xm:
+/// if (!matchInst(MaybeBr, AArch64::BR, Xm))
+///   return AArch64::NoRegister;
+/// // Return the matched register:
+/// return Xm.get();
+namespace MCInstMatcher {
+
+// The base class to match an operand of type T.
+//
+// The subclasses of OpMatcher are intended to be allocated on the stack and
+// to only be used by passing them to matchInst() and by calling their get()
+// function, thus the peculiar `mutable` specifiers: to make the calling code
+// compact and readable, the templated matchInst() function has to accept both
+// long-lived Imm/Reg wrappers declared as local variables (intended to capture
+// the first operand's value and match the subsequent operands, whether inside
+// a single instruction or across multiple instructions), as well as temporary
+// wrappers around literal values to match, f.e. Imm(42) or Reg(AArch64::XZR).
+template  class OpMatcher {
+  mutable std::optional Value;
+  mutable std::optional SavedValue;
+
+  // Remember/restore the last Value - to be called by matchInst.
+  void remember() const { SavedValue = Value; }
+  void restore() const { Value = SavedValue; }
+
+  template 
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+protected:
+  OpMatcher(std::optional ValueToMatch) : Value(ValueToMatch) {}
+
+  bool matchValue(T OpValue) const {
+// Check that OpValue does not contradict the existing Value.
+bool MatchResult = !Value || *Value == OpValue;
+// If MatchResult is false, all matchers will be reset before returning from
+// matchInst, including this one, thus no need to assign conditionally.
+Value = OpValue;
+
+return MatchResult;
+  }
+
+public:
+  /// Returns the captured value.
+  T get() const {
+assert(Value.has_value());
+return *Value;
+  }
+};
+
+class Reg : public OpMatcher {
+  bool matches(const MCOperand &Op) const {
+if (!Op.isReg())
+  return false;
+
+return matchValue(Op.getReg());
+  }
+
+  template 
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+public:
+  Reg(std::optional RegToMatch = std::nullopt)
+  : OpMatcher(RegToMatch) {}
+};
+
+class Imm : public OpMatcher {
+  bool matches(const MCOperand &Op) const {
+if (!Op.isImm())
+  return false;
+
+return matchValue(Op.getImm());
+  }
+
+  template 
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+public:
+  Imm(std::optional ImmToMatch = std::nullopt)
+  : OpMatcher(ImmToMatch) {}
+};
+
+/// Tries to match Inst and updates Ops on success.
+///
+/// If Inst has the specified Opcode and its operand list prefix matches Ops,
+/// this function returns true and updates Ops, otherwise false is returned and
+/// values of Ops are kept as before matchInst was called.
+///
+/// Please note that while Ops are technically passed by a const reference to
+/// make invocations like `matchInst(MI, Opcode, Imm(42))` possible, all their
+/// fields are marked mut

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

From d20efbedd0be34942e8e28cc91eefeb28d1b8108 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with an untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing as to
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 7a5d47a3ff812..dfb71575b2b39 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -701,8 +701,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -713,12 +714,6 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -729,6 +724,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -743,7 +739,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1288,6 +1284,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // see

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)

2025-05-14 Thread Momchil Velikov via llvm-branch-commits



@@ -0,0 +1,304 @@
+//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file implements lowering patterns from vector.contract to
+// SVE I8MM operations.
+//
+//===---
+
+#include "mlir/Dialect/Arith/IR/Arith.h"
+#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h"
+#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h"
+#include "mlir/Dialect/Func/IR/FuncOps.h"
+#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
+#include "mlir/Dialect/Utils/IndexingUtils.h"
+#include "mlir/Dialect/Vector/IR/VectorOps.h"
+#include "mlir/IR/AffineMap.h"
+#include "mlir/IR/PatternMatch.h"
+#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
+
+#include "mlir/Dialect/UB/IR/UBOps.h"
+
+#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm"
+
+using namespace mlir;
+using namespace mlir::arm_sve;
+
+namespace {
+// Check if the given value is a result of the operation `T` (which must be
+// sign- or zero- extend) from i8 to i32. Return the value before the 
extension.
+template 
+inline std::enable_if_t<(std::is_base_of_v ||
+ std::is_base_of_v),
+std::optional>
+extractExtOperand(Value v, Type i8Ty, Type i32Ty) {
+  auto extOp = dyn_cast_or_null(v.getDefiningOp());
+  if (!extOp)
+return {};
+
+  auto inOp = extOp.getIn();
+  auto inTy = dyn_cast(inOp.getType());
+  if (!inTy || inTy.getElementType() != i8Ty)
+return {};
+
+  auto outTy = dyn_cast(extOp.getType());
+  if (!outTy || outTy.getElementType() != i32Ty)
+return {};
+
+  return inOp;
+}
+
+// Designate the operation (resp. instruction) used to do sub-tile matrix
+// multiplications.
+enum class MMLA {
+  Signed,  // smmla
+  Unsigned,// ummla
+  Mixed,   // usmmla
+  MixedSwapped // usmmla with LHS and RHS swapped
+};
+
+// Create the matrix multply and accumulate operation according to `op`.
+Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc,
+ mlir::VectorType accType, Value acc, Value lhs, Value rhs) {
+  switch (op) {
+  case MMLA::Signed:
+return rewriter.create(loc, accType, acc, lhs, rhs);
+  case MMLA::Unsigned:
+return rewriter.create(loc, accType, acc, lhs, rhs);
+  case MMLA::Mixed:
+return rewriter.create(loc, accType, acc, lhs, rhs);
+  case MMLA::MixedSwapped:
+// The accumulator comes transposed and the result will be transposed
+// later, so all we have to do here is swap the operands.
+return rewriter.create(loc, accType, acc, rhs, lhs);
+  }
+}
+
+class LowerContractionToSVEI8MMPattern
+: public OpRewritePattern {
+public:
+  using OpRewritePattern::OpRewritePattern;
+  LogicalResult matchAndRewrite(vector::ContractionOp op,
+PatternRewriter &rewriter) const override {
+
+Location loc = op.getLoc();
+mlir::VectorType lhsType = op.getLhsType();
+mlir::VectorType rhsType = op.getRhsType();
+
+// For now handle LHS and RHS<8x[N]> - these are the types we
+// eventually expect from MMT4D. M and N dimensions must be even and at
+// least 2.
+if (!lhsType.hasRank() || lhsType.getRank() != 2 || !rhsType.hasRank() ||
+rhsType.getRank() != 2)
+  return failure();
+
+if (lhsType.isScalable() || !rhsType.isScalable())
+  return failure();
+
+// M, N, and K are the conventional names for matrix dimensions in the
+// context of matrix multiplication.
+auto M = lhsType.getDimSize(0);
+auto N = rhsType.getDimSize(0);
+auto K = rhsType.getDimSize(1);
+
+if (lhsType.getDimSize(1) != K || K != 8 || M < 2 || M % 2 != 0 || N < 2 ||
+N % 2 != 0 || !rhsType.getScalableDims()[0])
+  return failure();
+
+// Check permutation maps. For now only accept
+//   lhs: (d0, d1, d2) -> (d0, d2)
+//   rhs: (d0, d1, d2) -> (d1, d2)
+//   acc: (d0, d1, d2) -> (d0, d1)
+// Note: RHS is transposed.
+if (op.getIndexingMapsArray()[0] !=
+AffineMap::getMultiDimMapWithTargets(3, ArrayRef{0u, 2u},
+ op.getContext()) ||
+op.getIndexingMapsArray()[1] !=
+AffineMap::getMultiDimMapWithTargets(3, ArrayRef{1u, 2u},
+ op.getContext()) ||
+op.getIndexingMapsArray()[2] !=
+AffineMap::getMultiDimMapWithTargets(3, ArrayRef{0u, 1u},
+ op.getContext()))
+  return failure();
+
+// Check iterator types for matrix multiplication.
+auto itTypes = op.getIteratorTypesArray();
+if (itTypes.size() != 3 || itTypes[0] != vector::IteratorType::parallel ||
+itTypes[1] != vector::IteratorType

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: improve handling of unreachable basic blocks (PR #136183)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


atrosinenko wrote:

Moved the `computePessimisticState` function here from #137224, as accounting 
for the registers that were never clobbered in a function (basically, 
accounting for the leaf functions) turns out to decrease the number of false 
positive reports quite significantly.

https://github.com/llvm/llvm-project/pull/136183

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits



@@ -7538,6 +7538,14 @@ static bool IsEligibleForTrivialRelocation(Sema &SemaRef,
 if (!SemaRef.IsCXXTriviallyRelocatableType(Field->getType()))
   return false;
   }
+
+  // FIXME: PFP should not affect trivial relocatability, instead it should
+  // affect the implementation of std::trivially_relocate. See:
+  // 
https://discourse.llvm.org/t/rfc-structure-protection-a-family-of-uaf-mitigation-techniques/8/16?u=pcc
+  if (!SemaRef.Context.arePFPFieldsTriviallyRelocatable(D) &&

ojhunt wrote:

If PFP fields are present, and we haven't yet implemented support for them in 
`trivially_relocate`, `IsCXXTriviallyRelocatableType` should be returning false.

This is something we have to address for address discriminated pointer auth as 
well.

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits



@@ -1319,14 +1319,66 @@ static llvm::Value 
*CoerceIntOrPtrToIntOrPtr(llvm::Value *Val, llvm::Type *Ty,
 /// This safely handles the case when the src type is smaller than the
 /// destination type; in this situation the values of bits which not
 /// present in the src are undefined.
-static llvm::Value *CreateCoercedLoad(Address Src, llvm::Type *Ty,
+static llvm::Value *CreateCoercedLoad(Address Src, QualType SrcFETy,
+  llvm::Type *Ty,
   CodeGenFunction &CGF) {
   llvm::Type *SrcTy = Src.getElementType();
 
   // If SrcTy and Ty are the same, just do a load.
   if (SrcTy == Ty)
 return CGF.Builder.CreateLoad(Src);
 
+  // Coercion directly through memory does not work if the structure has 
pointer
+  // field protection because the struct in registers has a different bit
+  // pattern to the struct in memory, so we must read the elements one by one
+  // and use them to form the coerced structure.
+  std::vector PFPFields;
+  CGF.getContext().findPFPFields(SrcFETy, CharUnits::Zero(), PFPFields, true);

ojhunt wrote:

This should be done by generating a copy thunk and calling that, rather than 
performing this logic repeatedly, and more importantly regenerating it 
repeatedly. If the update function is small enough it will be inlined, and if 
it's not you probably do not want a large number of copies of the code.

Your intent appears to be to maintain the register transfer ABI for POD types, 
despite the objects logically no longer being POD types. That isn't 
unreasonable, given that the current design means that all structs containing 
pointers would become non-POD, which is probably suboptimal. But if you are 
doing so, then because a large enough struct can spill, I'd recommend you adopt 
a model where the PFP fields in a return or parameter copy of the object are 
tagged specifically as transfer fields, something akin to `hash(argument 
position || PFP offset || some_this_is_a_transfer_constant)`.
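
A minimal sketch of the copy-thunk idea, under loud assumptions: `Node`, `encode`, `decode`, `NextDisc`, `DataDisc`, and `copyNodeToRegisters` are all hypothetical, and the XOR "encoding" merely stands in for whatever PFP does to a field in memory. The point is only that the per-field re-encoding lives in one function per type, and each coercion site calls it instead of emitting the field-by-field logic inline.

```cpp
#include <cstdint>

// Hypothetical in-memory layout: pointer fields are stored encoded.
struct Node {
  std::uintptr_t Next; // encoded in memory
  std::uintptr_t Data; // encoded in memory
};

// Toy stand-in for the real field encoding (NOT the PFP scheme).
constexpr std::uintptr_t encode(std::uintptr_t Ptr, std::uintptr_t Disc) {
  return Ptr ^ Disc;
}
constexpr std::uintptr_t decode(std::uintptr_t Enc, std::uintptr_t Disc) {
  return Enc ^ Disc;
}

// Hypothetical per-field discriminators.
constexpr std::uintptr_t NextDisc = 0x1111;
constexpr std::uintptr_t DataDisc = 0x2222;

// The "copy thunk": emitted once per type and called from every site that
// needs the register form of the struct. A small thunk can still be inlined.
Node copyNodeToRegisters(const Node &InMemory) {
  return Node{decode(InMemory.Next, NextDisc), decode(InMemory.Data, DataDisc)};
}
```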

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits


https://github.com/ojhunt requested changes to this pull request.

Thoughts:

This should be opt-in on a field or struct granularity, not just a global 
behavior.

In the RFC I think you mentioned not applying PFP to C types, but I'm unsure 
how you're deciding what is a C type?

There are a lot of special cases being added in places that should not need to 
be aware of PFP - the core type queries should be returning correct values for 
types containing PFP fields.

A lot of the "this read/write/copy/init special work" exactly aligns with the 
constraints of pointer auth, and I think the overall implementation could be 
made better by introducing the concept of a `HardenedPointerQualifier` or 
similar, that could be either PointerAuthQualifier or PFPQualifier. In 
principle this could even just be treated as a different PointerAuth key set.

That approach would also help with some of the problems you have with offsetof 
and pointers to PFP fields - the thing that causes the problems with the 
pointer to a PFP field is that the PFP schema for a given field is in reality 
part of the type, so given

```cpp
struct A {
   void *field1;
   void *field2;
};
```

It looks like you are trying to maintain the idea that `A::field1` and 
`A::field2` have the same type, which makes this code valid according to the 
type system:

```cpp
void f(void** obj);
void g(A *a) {
  f(&a->field1);
  f(&a->field2);
}
```

but the real types are not the same. The semantic equivalent under pointer auth 
is something akin to

```cpp
struct A {
   void * __ptrauth(1, 1, hash("A::field1")) field1;
   void * __ptrauth(1, 1, hash("A::field2")) field2;
};
```

Which makes it very clear that the types are different, and I think trying to 
pretend they are not is part of what is causing problems.
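
The same point can be modeled in standard C++ without any extension. In the sketch below, `Protected` and its tag values are hypothetical; the wrapper only models "the schema is part of the type", so that the compiler rejects passing one field's address where another's is expected.

```cpp
#include <type_traits>

// Hypothetical wrapper: the discriminator is a template argument, so two
// fields with different discriminators have different types.
template <unsigned Disc> struct Protected {
  void *Raw;
};

struct A {
  Protected<1> field1;
  Protected<2> field2;
};

// Accepts only pointers to fields protected with discriminator 1.
inline void f(Protected<1> *) {}

inline void g(A *a) {
  f(&a->field1);    // OK
  // f(&a->field2); // would not compile: Protected<2>* is not Protected<1>*
}

static_assert(!std::is_same_v<decltype(A::field1), decltype(A::field2)>);
static_assert(!std::is_convertible_v<Protected<2> *, Protected<1> *>);
```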

The more I think about it, the more I feel that this could be implemented 
almost entirely on top of the existing pointer auth work.

Currently for pointer auth there's a set of ARM keys specified, and I think you 
just need to create a different set, PFP_Keys or similar, and set up the 
appropriate schemas (basically null schemas for all the existing cases), and 
add a new schema: DefaultFieldSchema or similar, and apply that schema to every 
pointer field in an object unless there's an explicit qualifier applied already.

Then it would in principle just be a matter of adding the appropriate 
logic to the ptrauth intrinsics in llvm.

Now there are some moderately large caveats to that idea:

* the existing ptrauth semantics simply say that any struct containing address 
discriminated values is no longer a POD type and so gets copied through the 
stack, which is something you're trying to avoid, so you'd still need to keep 
that logic (though as I said, that should be done by generating and then 
calling the functions to do so rather than regenerating the entire copy 
repeatedly).
* the way that pointer auth is currently implemented assumes that there is only 
a single pointer auth ABI active, so you wouldn't be able to trivially have both 
pointer auth and PFP live at the same time. That's what makes me think having a 
SecuredPointer abstraction might be a better approach, but it might also not be 
too difficult to update the PointerAuthQualifier to include the ABI being used 
-- I'm really not sure.

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits



@@ -2976,7 +3006,15 @@ void CodeGenFunction::EmitForwardingCallToLambda(
   QualType resultType = FPT->getReturnType();
   ReturnValueSlot returnSlot;
   if (!resultType->isVoidType() &&
-  calleeFnInfo->getReturnInfo().getKind() == ABIArgInfo::Indirect &&
+  (calleeFnInfo->getReturnInfo().getKind() == ABIArgInfo::Indirect ||
+   // With pointer field protection, we need to set up the return slot when

ojhunt wrote:

This should also not need to be modified. Something is up with how you're 
setting up the record decls if you need to have so many insertions of PFP 
specific behavior.

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits



@@ -362,6 +362,17 @@ class LangOptionsBase {
 BKey
   };
 
+  enum class PointerFieldProtectionKind {

ojhunt wrote:

I'm not sure I like this being solely a global decision - it makes custom 
allocators much harder, and it makes it hard for allocators that have different 
policies, e.g.

```cpp
struct ImportantStruct {
  void *operator new(size_t) { /*tagged allocator*/ }
};

struct BoringStruct {
  // ...
};

struct SomeOtherStruct {
  ImportantStruct *important;  // should be a tagged pointer
  BoringStruct *lessImportant; // should be a non-tagged pointer
};

This would again require a struct attribute to indicate the pointer policy.

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits



@@ -928,6 +936,11 @@ namespace {
   if (PointerAuthQualifier Q = F->getType().getPointerAuth();
   Q && Q.isAddressDiscriminated())
 return false;
+  // Non-trivially-copyable fields with pointer field protection need to be

ojhunt wrote:

This is an example of the code I was thinking of when I talked about sharing 
code with PointerAuth: the PFP and pointer auth code is essentially identical 
here.

Essentially any place that interacts with a pointer auth qualifier should also 
have PFP code, and the code is going to be essentially the same (at the place 
of interaction, not in the underlying implementation).

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits



@@ -3011,6 +3011,12 @@ defm experimental_omit_vtable_rtti : 
BoolFOption<"experimental-omit-vtable-rtti"
   NegFlag,
   BothFlags<[], [CC1Option], " the RTTI component from virtual tables">>;
 
+def experimental_pointer_field_protection_EQ : Joined<["-"], 
"fexperimental-pointer-field-protection=">, Group,

ojhunt wrote:

As above, not super sure that I like this being a global compile time setting, 
but maybe it makes sense to be.

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits



@@ -2268,13 +2293,22 @@ CodeGenFunction::EmitNullInitialization(Address 
DestPtr, QualType Ty) {
 
 // Get and call the appropriate llvm.memcpy overload.
 Builder.CreateMemCpy(DestPtr, SrcPtr, SizeVal, false);
-return;
+  } else {
+// Otherwise, just memset the whole thing to zero.  This is legal

ojhunt wrote:

This means types that require constructors will be doing zero initialization, 
then reinitializing fields. From a codegen PoV this can lead to codegen along 
the lines of

```
zero the memory;       // compiler is set to initialize all memory
zero the memory;       // this branch means objects that are not zero
                       // initializable get initialized again
initialize the memory; // the struct is not zero initializable, so the
                       // constructor/initializer will run
```
Ideally the compiler will optimize this down, but it's both extra codegen and 
extra optimization work.

Given that your model assumes that null is a safe value for PFP fields, the 
`isZeroInitializable()` call should return true. For non-zero initializable 
objects, the initialization code for the object is responsible for initializing 
the PFP fields.

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits



@@ -2513,6 +2513,12 @@ def CountedByOrNull : DeclOrTypeAttr {
   let LangOpts = [COnly];
 }
 
+def NoPointerFieldProtection : DeclOrTypeAttr {

ojhunt wrote:

This is an ABI break, so I don't think it can reasonably be on by default for 
all structs - we can already see annotations in libc++, and they would be 
needed on every single struct field.

We can imagine a new platform where this would be the platform ABI, but for 
every other case it is functionally unusable.

I would recommend attributes to permit opt in and opt out on a per-struct 
basis, and CLI flags to select the default behavior.
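
The recommended layering can be sketched as follows. This is an illustration only: `PFPPolicy`, `StructInfo`, and `effectivePolicy` are hypothetical names, not anything in the PR or in clang. An explicit per-struct attribute, when present, overrides the command-line default.

```cpp
#include <optional>

enum class PFPPolicy { Disabled, Enabled };

// Hypothetical per-struct annotation: absent means "use the default".
struct StructInfo {
  std::optional<PFPPolicy> ExplicitAttr;
};

// An explicit opt-in/opt-out attribute wins; otherwise the command-line
// default applies.
inline PFPPolicy effectivePolicy(const StructInfo &S, PFPPolicy CLIDefault) {
  return S.ExplicitAttr.value_or(CLIDefault);
}
```

With this shape, an existing platform could leave the default off and opt individual structs in, while a new platform could flip the default and opt out only where ABI compatibility demands it.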

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits


https://github.com/ojhunt edited 
https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits


https://github.com/ojhunt commented:

Unfortunately I really don't know enough about actual backend codegen to 
comment on the actual codegen implementation here, but I think there are some 
design issues with having the backend determine the discriminators rather than 
having those selected in the front end.

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits



@@ -441,6 +445,254 @@ bool 
PreISelIntrinsicLowering::expandMemIntrinsicUses(Function &F) const {
   return Changed;
 }
 
+namespace {
+
+enum class PointerEncoding {
+  Rotate,
+  PACCopyable,
+  PACNonCopyable,
+};
+
+bool expandProtectedFieldPtr(Function &Intr) {
+  Module &M = *Intr.getParent();
+  bool IsAArch64 = Triple(M.getTargetTriple()).isAArch64();
+
+  std::set NonPFPFields;
+  std::set LoadsStores;
+
+  Type *Int8Ty = Type::getInt8Ty(M.getContext());
+  Type *Int64Ty = Type::getInt64Ty(M.getContext());
+  PointerType *PtrTy = PointerType::get(M.getContext(), 0);
+
+  Function *SignIntr =
+  Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_sign, {});
+  Function *AuthIntr =
+  Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_auth, {});
+
+  auto *EmuFnTy = FunctionType::get(Int64Ty, {Int64Ty, Int64Ty}, false);
+  FunctionCallee EmuSignIntr = M.getOrInsertFunction("__emupac_pacda", 
EmuFnTy);
+  FunctionCallee EmuAuthIntr = M.getOrInsertFunction("__emupac_autda", 
EmuFnTy);
+
+  auto CreateSign = [&](IRBuilder<> &B, Value *Val, Value *Disc,
+   OperandBundleDef DSBundle) {
+Function *F = B.GetInsertBlock()->getParent();
+Attribute FSAttr = F->getFnAttribute("target-features");
+if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth"))
+  return B.CreateCall(SignIntr, {Val, B.getInt32(2), Disc}, DSBundle);
+return B.CreateCall(EmuSignIntr, {Val, Disc}, DSBundle);
+  };
+
+  auto CreateAuth = [&](IRBuilder<> &B, Value *Val, Value *Disc,
+   OperandBundleDef DSBundle) {
+Function *F = B.GetInsertBlock()->getParent();
+Attribute FSAttr = F->getFnAttribute("target-features");
+if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth"))
+  return B.CreateCall(AuthIntr, {Val, B.getInt32(2), Disc}, DSBundle);
+return B.CreateCall(EmuAuthIntr, {Val, Disc}, DSBundle);
+  };
+
+  for (User *U : Intr.users()) {
+auto *Call = cast(U);
+auto *FieldName = cast(

ojhunt wrote:

I'm not really a backend person, but I feel that it's incorrect for the 
discriminator selection/computation to be occurring in llvm. I think clang 
should be passing the discriminator as a parameter to the intrinsics - this 
would also permit the PFP system to allow users to override the discriminator 
for fields, which is likely to end up being necessary as the exact types or 
naming of fields change over time.

Side note: because I'm a muppet, I only just realized that the current model 
derives the discriminator from the name of the field, which means the names of 
fields actually become ABI. It might be better to use something like "name of 
struct + offset in struct + type", though obviously that reduces the variation 
in discriminators. OTOH it's not uncommon for different projects to declare 
their own versions of POD structs with slightly different naming, so maybe it 
helps there.
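
To make the trade-off concrete, here is a small sketch comparing the two derivations. Everything in it is an assumption for illustration: FNV-1a is just a convenient stand-in hash, and `discFromFieldName` and `discFromLayout` are hypothetical helpers, not the hashing used by the PR or by pointer auth.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// 64-bit FNV-1a, used here only as a convenient stand-in hash.
inline std::uint64_t fnv1a(std::string_view S) {
  std::uint64_t H = 14695981039346656037ull;
  for (unsigned char C : S) {
    H ^= C;
    H *= 1099511628211ull;
  }
  return H;
}

// Scheme A (hypothetical): derive from the field name, which makes it ABI.
inline std::uint64_t discFromFieldName(std::string_view Struct,
                                       std::string_view Field) {
  return fnv1a(std::string(Struct) + "::" + std::string(Field));
}

// Scheme B (hypothetical): derive from struct name, byte offset, and type.
inline std::uint64_t discFromLayout(std::string_view Struct, std::size_t Offset,
                                    std::string_view Type) {
  return fnv1a(std::string(Struct) + "+" + std::to_string(Offset) + ":" +
               std::string(Type));
}
```

Under scheme A, renaming a field changes its discriminator. Under scheme B, the discriminator survives a rename but changes if the field moves or changes type.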

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits



@@ -544,6 +544,7 @@ TYPE_TRAIT_2(__is_pointer_interconvertible_base_of, 
IsPointerInterconvertibleBas
 #include "clang/Basic/TransformTypeTraits.def"
 
 // Clang-only C++ Type Traits
+TYPE_TRAIT_1(__has_non_relocatable_fields, HasNonRelocatableFields, KEYCXX)

ojhunt wrote:

I do not like this -- there are existing functions for handling whether an 
object is trivially copyable, relocatable, etc. that should just be updated to 
correctly report the traits of impacted types. See the various Sema functions 
querying `isAddressDiscriminated()`. That function currently only cares about 
pointer auth, but it would not be unreasonable to have it consider other 
properties, like PFP.

In fact a lot of the behavior you're wanting in this feature is essentially 
covered by the PointerAuthQualifier. It would perhaps be reasonable to 
generalize that to something akin to SecurePointerQualifier that could be 
either a PAQ or PFP qualifier. Then most of the current semantic queries made 
to support pointer auth could just be a single shared implementation.
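
One possible shape for such a shared query, as a heavily simplified sketch: `PointerAuthQual`, `PFPQual`, `SecurePointerQual`, and `isTriviallyCopyableField` are hypothetical stand-ins rather than clang types, and treating every PFP field as not trivially copyable is an assumption made only to keep the example short.

```cpp
#include <variant>

// Hypothetical, heavily simplified stand-ins for the two qualifier kinds.
struct PointerAuthQual {
  bool AddressDiscriminated;
};
struct PFPQual {};

// Hypothetical unifying abstraction: unqualified, pointer auth, or PFP.
using SecurePointerQual =
    std::variant<std::monostate, PointerAuthQual, PFPQual>;

// One shared query instead of separate pointer-auth and PFP special cases:
// can a field with this qualifier be copied with a plain memcpy?
inline bool isTriviallyCopyableField(const SecurePointerQual &Q) {
  if (const auto *PA = std::get_if<PointerAuthQual>(&Q))
    return !PA->AddressDiscriminated;
  if (std::holds_alternative<PFPQual>(Q))
    return false; // assumption: the PFP encoding depends on the field address
  return true;    // unqualified pointer
}
```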

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Oliver Hunt via llvm-branch-commits



@@ -1415,6 +1469,52 @@ void CodeGenFunction::CreateCoercedStore(llvm::Value 
*Src, Address Dst,
 }
   }
 
+  // Coercion directly through memory does not work if the structure has 
pointer
+  // field protection because the struct passed by value has a different bit
+  // pattern to the struct in memory, so we must read the elements one by one
+  // and use them to form the coerced structure.
+  std::vector PFPFields;
+  getContext().findPFPFields(SrcFETy, CharUnits::Zero(), PFPFields, true);

ojhunt wrote:

Same as for CoercedLoad :D

https://github.com/llvm/llvm-project/pull/133538

[llvm-branch-commits] [llvm] [llvm][EmbedBitcodePass] Prevent modifying the module with ThinLTO (PR #139999)

2025-05-14 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Paul Kirth (ilovepi)


Changes

Since ThinLTOBitcodeWriterPass handles many things for CFI and WPD, like
updating vtable linkage, we need to prevent those changes from
persisting in the non-LTO object code we will compile under FatLTO.

The only non-invasive way to do that is to clone the module when
serializing the module in ThinLTOBitcodeWriterPass. We may be able to
avoid cloning in the future with additional infrastructure to restore
the IR to its original state.
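
The clone-before-write pattern can be shown with a toy model. Nothing below is LLVM API: `ToyModule`, `writeAndMutate`, and `embedWithoutModifying` are hypothetical stand-ins for the module, for a writer that mutates what it serializes, and for the fixed pass.

```cpp
#include <memory>
#include <string>

// Toy stand-in for an IR module; not llvm::Module.
struct ToyModule {
  std::string VTableLinkage = "linkonce_odr";
};

// Toy stand-in for a writer that mutates the module while serializing it,
// as the description says ThinLTOBitcodeWriterPass does for vtable linkage.
inline std::string writeAndMutate(ToyModule &M) {
  M.VTableLinkage = "external";
  return "bitcode(" + M.VTableLinkage + ")";
}

// Serialize a copy so the caller's module is left untouched.
inline std::string embedWithoutModifying(const ToyModule &M) {
  auto Clone = std::make_unique<ToyModule>(M);
  return writeAndMutate(*Clone);
}
```

The cost is one extra copy of the module per embed, which is the overhead the description hopes to avoid later.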

Fixes #139440

---
Full diff: https://github.com/llvm/llvm-project/pull/13.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp (+5-1) 
- (modified) llvm/test/Transforms/EmbedBitcode/embed-wpd.ll (+4-3) 


``diff
diff --git a/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp 
b/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp
index 73f567734a91b..7a378c61cb899 100644
--- a/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp
+++ b/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp
@@ -16,6 +16,7 @@
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/TargetParser/Triple.h"
 #include "llvm/Transforms/IPO/ThinLTOBitcodeWriter.h"
+#include "llvm/Transforms/Utils/Cloning.h"
 #include "llvm/Transforms/Utils/ModuleUtils.h"
 
 #include 
@@ -33,8 +34,11 @@ PreservedAnalyses EmbedBitcodePass::run(Module &M, 
ModuleAnalysisManager &AM) {
 
   std::string Data;
   raw_string_ostream OS(Data);
+  // Clone the module with with Thin LTO, since ThinLTOBitcodeWriterPass 
changes
+  // vtable linkage that would break the non-lto object code for FatLTO.
   if (IsThinLTO)
-ThinLTOBitcodeWriterPass(OS, /*ThinLinkOS=*/nullptr).run(M, AM);
+ThinLTOBitcodeWriterPass(OS, /*ThinLinkOS=*/nullptr)
+.run(*llvm::CloneModule(M), AM);
   else
 BitcodeWriterPass(OS, /*ShouldPreserveUseListOrder=*/false, EmitLTOSummary)
 .run(M, AM);
diff --git a/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll 
b/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll
index f1f7712f54039..54931be42b4eb 100644
--- a/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll
+++ b/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll
@@ -1,12 +1,13 @@
 ; RUN: opt --mtriple x86_64-unknown-linux-gnu < %s 
-passes="embed-bitcode" -S | FileCheck %s
 
-; CHECK-NOT: $_ZTV3Foo = comdat any
+; CHECK: $_ZTV3Foo = comdat any
 $_ZTV3Foo = comdat any
 
 $_ZTI3Foo = comdat any
 
-; CHECK: @_ZTV3Foo = external hidden unnamed_addr constant { [5 x ptr] }, 
align 8
-; CHECK: @_ZTI3Foo = linkonce_odr hidden constant { ptr, ptr, ptr } { ptr 
getelementptr inbounds (ptr, ptr @_ZTVN10__cxxabiv120__si_class_type_infoE, i64 
2), ptr @_ZTS3Foo, ptr @_ZTISt13runtime_error }, comdat, align 8
+;; ThinLTOBitcodeWriter will remove the vtable for Foo, and make it an 
external symbol
+; CHECK: @_ZTV3Foo = linkonce_odr hidden unnamed_addr constant { [5 x ptr] } { 
[5 x ptr] [ptr null, ptr @_ZTI3Foo, ptr @_ZN3FooD2Ev, ptr @_ZN3FooD0Ev, ptr 
@_ZNKSt13runtime_error4whatEv] }, comdat, align 8, !type !0, !type !1, !type 
!2, !type !3, !type !4, !type !5
+; CHECK-NOT: @foo = external unnamed_addr constant { [5 x ptr] }, align 8
 ; CHECK: @llvm.embedded.object = private constant {{.*}}, section ".llvm.lto", 
align 1
 ; CHECK: @llvm.compiler.used = appending global [1 x ptr] [ptr 
@llvm.embedded.object], section "llvm.metadata"
 @_ZTV3Foo = linkonce_odr hidden unnamed_addr constant { [5 x ptr] } { [5 x 
ptr] [ptr null, ptr @_ZTI3Foo, ptr @_ZN3FooD2Ev, ptr @_ZN3FooD0Ev, ptr 
@_ZNKSt13runtime_error4whatEv] }, comdat, align 8, !type !0, !type !1, !type 
!2, !type !3, !type !4, !type !5

``




https://github.com/llvm/llvm-project/pull/13
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV][Scheduler] Add scheduler definitions for the Q extension (PR #139495)

2025-05-14 Thread Craig Topper via llvm-branch-commits


https://github.com/topperc approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/139495
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [StaticDataLayout][PGO]Implement reader and writer change for data access profiles (PR #139997)

2025-05-14 Thread Mingming Liu via llvm-branch-commits


https://github.com/mingmingl-llvm created 
https://github.com/llvm/llvm-project/pull/139997

None

>From 75878647c2c36cca00e9d003dc84bf4597e19187 Mon Sep 17 00:00:00 2001
From: mingmingl 
Date: Tue, 13 May 2025 22:54:59 -0700
Subject: [PATCH] [StaticDataLayout][PGO]Implement reader and writer change for
 data access profiles

---
 .../include/llvm/ProfileData/DataAccessProf.h | 12 +++-
 .../llvm/ProfileData/IndexedMemProfData.h | 12 +++-
 .../llvm/ProfileData/InstrProfReader.h|  6 +-
 .../llvm/ProfileData/InstrProfWriter.h|  6 ++
 llvm/include/llvm/ProfileData/MemProfReader.h | 12 
 llvm/include/llvm/ProfileData/MemProfYAML.h   | 65 +++
 llvm/lib/ProfileData/DataAccessProf.cpp   |  6 +-
 llvm/lib/ProfileData/IndexedMemProfData.cpp   | 61 +
 llvm/lib/ProfileData/InstrProfReader.cpp  | 14 
 llvm/lib/ProfileData/InstrProfWriter.cpp  | 20 --
 llvm/lib/ProfileData/MemProfReader.cpp| 34 ++
 .../tools/llvm-profdata/memprof-yaml.test | 11 
 llvm/tools/llvm-profdata/llvm-profdata.cpp|  5 ++
 .../ProfileData/DataAccessProfTest.cpp| 11 ++--
 14 files changed, 244 insertions(+), 31 deletions(-)

diff --git a/llvm/include/llvm/ProfileData/DataAccessProf.h 
b/llvm/include/llvm/ProfileData/DataAccessProf.h
index e8504102238d1..f5f6abf0a2817 100644
--- a/llvm/include/llvm/ProfileData/DataAccessProf.h
+++ b/llvm/include/llvm/ProfileData/DataAccessProf.h
@@ -41,6 +41,8 @@ namespace data_access_prof {
 struct SourceLocation {
   SourceLocation(StringRef FileNameRef, uint32_t Line)
   : FileName(FileNameRef.str()), Line(Line) {}
+
+  SourceLocation() {}
   /// The filename where the data is located.
   std::string FileName;
   /// The line number in the source code.
@@ -53,6 +55,8 @@ namespace internal {
 // which strings are owned by `DataAccessProfData`. Used by 
`DataAccessProfData`
 // to represent data locations internally.
 struct SourceLocationRef {
+  SourceLocationRef(StringRef FileNameRef, uint32_t Line)
+  : FileName(FileNameRef), Line(Line) {}
   // The filename where the data is located.
   StringRef FileName;
   // The line number in the source code.
@@ -100,8 +104,9 @@ using SymbolHandle = std::variant;
 /// The data access profiles for a symbol.
 struct DataAccessProfRecord {
 public:
-  DataAccessProfRecord(SymbolHandleRef SymHandleRef,
-   ArrayRef LocRefs) {
+  DataAccessProfRecord(SymbolHandleRef SymHandleRef, uint64_t AccessCount,
+   ArrayRef LocRefs)
+  : AccessCount(AccessCount) {
 if (std::holds_alternative(SymHandleRef)) {
   SymHandle = std::get(SymHandleRef).str();
 } else
@@ -110,8 +115,9 @@ struct DataAccessProfRecord {
 for (auto Loc : LocRefs)
   Locations.push_back(SourceLocation(Loc.FileName, Loc.Line));
   }
+  DataAccessProfRecord() {}
   SymbolHandle SymHandle;
-
+  uint64_t AccessCount;
   // The locations of data in the source code. Optional.
   SmallVector Locations;
 };
diff --git a/llvm/include/llvm/ProfileData/IndexedMemProfData.h 
b/llvm/include/llvm/ProfileData/IndexedMemProfData.h
index 3c6c329d1c49d..66fa38472059b 100644
--- a/llvm/include/llvm/ProfileData/IndexedMemProfData.h
+++ b/llvm/include/llvm/ProfileData/IndexedMemProfData.h
@@ -10,14 +10,20 @@
 //
 
//===--===//
 
+#include "llvm/ProfileData/DataAccessProf.h"
 #include "llvm/ProfileData/InstrProf.h"
 #include "llvm/ProfileData/MemProf.h"
 
+#include 
+#include 
+
 namespace llvm {
 
 // Write the MemProf data to OS.
-Error writeMemProf(ProfOStream &OS, memprof::IndexedMemProfData &MemProfData,
-   memprof::IndexedVersion MemProfVersionRequested,
-   bool MemProfFullSchema);
+Error writeMemProf(
+ProfOStream &OS, memprof::IndexedMemProfData &MemProfData,
+memprof::IndexedVersion MemProfVersionRequested, bool MemProfFullSchema,
+std::optional>
+DataAccessProfileData);
 
 } // namespace llvm
diff --git a/llvm/include/llvm/ProfileData/InstrProfReader.h 
b/llvm/include/llvm/ProfileData/InstrProfReader.h
index c250a9ede39bc..210df6be46f04 100644
--- a/llvm/include/llvm/ProfileData/InstrProfReader.h
+++ b/llvm/include/llvm/ProfileData/InstrProfReader.h
@@ -18,6 +18,7 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/IR/ProfileSummary.h"
 #include "llvm/Object/BuildID.h"
+#include "llvm/ProfileData/DataAccessProf.h"
 #include "llvm/ProfileData/InstrProf.h"
 #include "llvm/ProfileData/InstrProfCorrelator.h"
 #include "llvm/ProfileData/MemProf.h"
@@ -704,9 +705,12 @@ class IndexedMemProfReader {
   // The number of elements in the radix tree array.
   unsigned RadixTreeSize = 0;
 
+  std::unique_ptr DataAccessProfileData;
+
   Error deserializeV2(const unsigned char *Start, const unsigned char *Ptr);
   Error deserializeRadixTreeBased(const unsigned char *Start,
-  const unsigned char *Ptr);
+

[llvm-branch-commits] [llvm] [llvm][EmbedBitcodePass] Prevent modifying the module with ThinLTO (PR #139999)

2025-05-14 Thread Paul Kirth via llvm-branch-commits


ilovepi wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack on 
> Graphite.
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#13** 👈 (View in Graphite)
* **#139998**
* `main`

This stack of pull requests is managed by Graphite. Learn more about stacking.


https://github.com/llvm/llvm-project/pull/13
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [llvm][EmbedBitcodePass] Prevent modifying the module with ThinLTO (PR #139999)

2025-05-14 Thread Paul Kirth via llvm-branch-commits


https://github.com/ilovepi created 
https://github.com/llvm/llvm-project/pull/13

Since ThinLTOBitcodeWriterPass handles many things for CFI and WPD, like
updating vtable linkage, we need to prevent those changes from
persisting in the non-LTO object code we will compile under FatLTO.

The only non-invasive way to do that is to clone the module when
serializing the module in ThinLTOBitcodeWriterPass. We may be able to
avoid cloning in the future with additional infrastructure to restore
the IR to its original state.
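
The clone-before-serialize idea described above can be sketched outside of LLVM as follows. This is only a toy model: `ToyModule`, `writeThinLTOBitcode` and `embedBitcode` are hypothetical names that stand in for the real module and writer pass, and the "linkage" string is an illustration rather than actual IR.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for an IR module; this is not an LLVM type.
struct ToyModule {
  std::string VTableLinkage = "linkonce_odr";
};

// Hypothetical writer that, like the pass described above, mutates its
// input as a side effect of serializing it.
std::string writeThinLTOBitcode(ToyModule &M) {
  M.VTableLinkage = "external";
  return "bitcode(vtable=" + M.VTableLinkage + ")";
}

// Serialize a copy so that the caller's module is left untouched and can
// still be compiled to non-LTO object code afterwards.
std::string embedBitcode(const ToyModule &M) {
  auto Clone = std::make_unique<ToyModule>(M);
  return writeThinLTOBitcode(*Clone);
}

// Small self-check: the original keeps its linkage, the payload does not.
void checkCloneLeavesModuleUntouched() {
  ToyModule M;
  std::string Data = embedBitcode(M);
  assert(M.VTableLinkage == "linkonce_odr");
  assert(Data == "bitcode(vtable=external)");
}
```

The cost of this approach is one full copy of the module per embed, which is the trade-off the description alludes to when it mentions avoiding the clone in the future.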

Fixes #139440

>From 0bc0cc2c64a7c7a347cf71888e9a57339537ce25 Mon Sep 17 00:00:00 2001
From: Paul Kirth 
Date: Wed, 14 May 2025 20:47:26 -0700
Subject: [PATCH] [llvm][EmbedBitcodePass] Prevent modifying the module with
 ThinLTO

Since ThinLTOBitcodeWriterPass handles many things for CFI and WPD, like
updating vtable linkage, we need to prevent those changes from
persisting in the non-LTO object code we will compile under FatLTO.

The only non-invasive way to do that is to clone the module when
serializing the module in ThinLTOBitcodeWriterPass. We may be able to
avoid cloning in the future with additional infrastructure to restore
the IR to its original state.

Fixes #139440
---
 llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp   | 6 +-
 llvm/test/Transforms/EmbedBitcode/embed-wpd.ll | 7 ---
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp 
b/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp
index 73f567734a91b..7a378c61cb899 100644
--- a/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp
+++ b/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp
@@ -16,6 +16,7 @@
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/TargetParser/Triple.h"
 #include "llvm/Transforms/IPO/ThinLTOBitcodeWriter.h"
+#include "llvm/Transforms/Utils/Cloning.h"
 #include "llvm/Transforms/Utils/ModuleUtils.h"
 
 #include 
@@ -33,8 +34,11 @@ PreservedAnalyses EmbedBitcodePass::run(Module &M, 
ModuleAnalysisManager &AM) {
 
   std::string Data;
   raw_string_ostream OS(Data);
+  // Clone the module with Thin LTO, since ThinLTOBitcodeWriterPass changes
+  // vtable linkage that would break the non-lto object code for FatLTO.
   if (IsThinLTO)
-ThinLTOBitcodeWriterPass(OS, /*ThinLinkOS=*/nullptr).run(M, AM);
+ThinLTOBitcodeWriterPass(OS, /*ThinLinkOS=*/nullptr)
+.run(*llvm::CloneModule(M), AM);
   else
 BitcodeWriterPass(OS, /*ShouldPreserveUseListOrder=*/false, EmitLTOSummary)
 .run(M, AM);
diff --git a/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll 
b/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll
index f1f7712f54039..54931be42b4eb 100644
--- a/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll
+++ b/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll
@@ -1,12 +1,13 @@
 ; RUN: opt --mtriple x86_64-unknown-linux-gnu < %s 
-passes="embed-bitcode" -S | FileCheck %s
 
-; CHECK-NOT: $_ZTV3Foo = comdat any
+; CHECK: $_ZTV3Foo = comdat any
 $_ZTV3Foo = comdat any
 
 $_ZTI3Foo = comdat any
 
-; CHECK: @_ZTV3Foo = external hidden unnamed_addr constant { [5 x ptr] }, 
align 8
-; CHECK: @_ZTI3Foo = linkonce_odr hidden constant { ptr, ptr, ptr } { ptr 
getelementptr inbounds (ptr, ptr @_ZTVN10__cxxabiv120__si_class_type_infoE, i64 
2), ptr @_ZTS3Foo, ptr @_ZTISt13runtime_error }, comdat, align 8
+;; ThinLTOBitcodeWriter will remove the vtable for Foo, and make it an 
external symbol
+; CHECK: @_ZTV3Foo = linkonce_odr hidden unnamed_addr constant { [5 x ptr] } { 
[5 x ptr] [ptr null, ptr @_ZTI3Foo, ptr @_ZN3FooD2Ev, ptr @_ZN3FooD0Ev, ptr 
@_ZNKSt13runtime_error4whatEv] }, comdat, align 8, !type !0, !type !1, !type 
!2, !type !3, !type !4, !type !5
+; CHECK-NOT: @foo = external unnamed_addr constant { [5 x ptr] }, align 8
 ; CHECK: @llvm.embedded.object = private constant {{.*}}, section ".llvm.lto", 
align 1
 ; CHECK: @llvm.compiler.used = appending global [1 x ptr] [ptr 
@llvm.embedded.object], section "llvm.metadata"
 @_ZTV3Foo = linkonce_odr hidden unnamed_addr constant { [5 x ptr] } { [5 x 
ptr] [ptr null, ptr @_ZTI3Foo, ptr @_ZN3FooD2Ev, ptr @_ZN3FooD0Ev, ptr 
@_ZNKSt13runtime_error4whatEv] }, comdat, align 8, !type !0, !type !1, !type 
!2, !type !3, !type !4, !type !5

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits



@@ -544,6 +544,7 @@ TYPE_TRAIT_2(__is_pointer_interconvertible_base_of, 
IsPointerInterconvertibleBas
 #include "clang/Basic/TransformTypeTraits.def"
 
 // Clang-only C++ Type Traits
+TYPE_TRAIT_1(__has_non_relocatable_fields, HasNonRelocatableFields, KEYCXX)

pcc wrote:

I don't like it either. As mentioned in the main comment, I hope to get rid of 
this when the P2786 implementation is finalized.

Sharing the qualifiers with PAuth ABI would be a good idea for the separate 
opt-in solution.

https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits



@@ -2976,7 +3006,15 @@ void CodeGenFunction::EmitForwardingCallToLambda(
   QualType resultType = FPT->getReturnType();
   ReturnValueSlot returnSlot;
   if (!resultType->isVoidType() &&
-  calleeFnInfo->getReturnInfo().getKind() == ABIArgInfo::Indirect &&
+  (calleeFnInfo->getReturnInfo().getKind() == ABIArgInfo::Indirect ||
+   // With pointer field protection, we need to set up the return slot when

pcc wrote:

The return value slot for forwarding lambdas is set up in an unusual way 
compared to other functions. I think I first tried simplifying this to work 
like normal functions but that led to other problems. I agree that ideally we 
wouldn't need to modify this.

https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits



@@ -1415,6 +1469,52 @@ void CodeGenFunction::CreateCoercedStore(llvm::Value 
*Src, Address Dst,
 }
   }
 
+  // Coercion directly through memory does not work if the structure has 
pointer
+  // field protection because the struct passed by value has a different bit
+  // pattern to the struct in memory, so we must read the elements one by one
+  // and use them to form the coerced structure.
+  std::vector PFPFields;
+  getContext().findPFPFields(SrcFETy, CharUnits::Zero(), PFPFields, true);

pcc wrote:

See my other comment.

https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits



@@ -7756,6 +7756,10 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   Args.addOptInFlag(CmdArgs, options::OPT_funique_source_file_names,
 options::OPT_fno_unique_source_file_names);
 
+  if (!IsCudaDevice)

pcc wrote:

With CUDA the driver creates two compilation jobs, one for the host and one for 
the device. Since it is an ABI requirement that all C++ translation units are 
built with the same `-fexperimental-pointer-field-protection` flags, the flag 
must be passed when compiling the CUDA translation units as well, so that the 
host side is built correctly. And since GPUs don't support PFP, we need to 
exclude the flag from the CUDA device job. It looks like the same ends up 
happening for PAuth ABI because the flags are copied in 
`Clang::AddAArch64TargetArgs`, which means that they are ignored for the CUDA 
device which will have a different architecture.

If someone did end up implementing PFP for GPUs, I think we would create a 
separate flag to enable PFP for the device.
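
As a rough illustration of the host/device split described above, here is a minimal sketch. `pfpArgsForJob` is a hypothetical helper, not part of the Clang driver, and `Mode` stands for whatever value was passed to `-fexperimental-pointer-field-protection=`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not the Clang driver's API: returns the extra
// arguments for one compilation job. The flag is forwarded to the host job
// and dropped for the CUDA device job.
std::vector<std::string> pfpArgsForJob(bool IsCudaDevice,
                                       const std::string &Mode) {
  std::vector<std::string> Args;
  if (!IsCudaDevice && !Mode.empty())
    Args.push_back("-fexperimental-pointer-field-protection=" + Mode);
  return Args;
}

// Small self-check using a placeholder mode value.
void checkHostAndDeviceJobs() {
  assert(pfpArgsForJob(/*IsCudaDevice=*/false, "x").size() == 1);
  assert(pfpArgsForJob(/*IsCudaDevice=*/true, "x").empty());
}
```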

https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits



@@ -7538,6 +7538,14 @@ static bool IsEligibleForTrivialRelocation(Sema &SemaRef,
 if (!SemaRef.IsCXXTriviallyRelocatableType(Field->getType()))
   return false;
   }
+
+  // FIXME: PFP should not affect trivial relocatability, instead it should
+  // affect the implementation of std::trivially_relocate. See:
+  // 
https://discourse.llvm.org/t/rfc-structure-protection-a-family-of-uaf-mitigation-techniques/8/16?u=pcc
+  if (!SemaRef.Context.arePFPFieldsTriviallyRelocatable(D) &&

pcc wrote:

Yes, that does look like the right place for now until we have 
`std::trivially_relocate` fully implemented. I see there is already a call to 
`BaseElementType.hasAddressDiscriminatedPointerAuth()` in that function.

https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits



@@ -2513,6 +2513,12 @@ def CountedByOrNull : DeclOrTypeAttr {
   let LangOpts = [COnly];
 }
 
+def NoPointerFieldProtection : DeclOrTypeAttr {

pcc wrote:

There are numerous circumstances where the C++ ABI is distinct from the 
platform ABI and is allowed to change. For example, most Chromium-based web 
browsers ship with a statically linked libc++, and the same is true for many 
Android apps and server-side software at Google. In the intended usage model, 
all of libc++ would be built with a `-fexperimental-pointer-field-protection=` 
flag consistent with the rest of the program.

The libc++abi opt-outs that I added are only for opting out pointer fields of 
RTTI structs that are generated by the compiler. In principle, we could teach 
the compiler to sign those pointer fields which would let us remove the 
opt-outs, but there is a lower ROI for subjecting those fields to PFP because 
the RTTI structs are not typically dynamically allocated.

We may consider adding an attribute to allow opting in in the case where the 
flag is not passed. But I think we should do that in a followup. We would need 
to carefully consider how that would interact with the other aspects of the 
opt-in solution, such as the pointer qualifiers.

https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits



@@ -441,6 +445,254 @@ bool 
PreISelIntrinsicLowering::expandMemIntrinsicUses(Function &F) const {
   return Changed;
 }
 
+namespace {
+
+enum class PointerEncoding {
+  Rotate,
+  PACCopyable,
+  PACNonCopyable,
+};
+
+bool expandProtectedFieldPtr(Function &Intr) {
+  Module &M = *Intr.getParent();
+  bool IsAArch64 = Triple(M.getTargetTriple()).isAArch64();
+
+  std::set NonPFPFields;
+  std::set LoadsStores;
+
+  Type *Int8Ty = Type::getInt8Ty(M.getContext());
+  Type *Int64Ty = Type::getInt64Ty(M.getContext());
+  PointerType *PtrTy = PointerType::get(M.getContext(), 0);
+
+  Function *SignIntr =
+  Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_sign, {});
+  Function *AuthIntr =
+  Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_auth, {});
+
+  auto *EmuFnTy = FunctionType::get(Int64Ty, {Int64Ty, Int64Ty}, false);
+  FunctionCallee EmuSignIntr = M.getOrInsertFunction("__emupac_pacda", 
EmuFnTy);
+  FunctionCallee EmuAuthIntr = M.getOrInsertFunction("__emupac_autda", 
EmuFnTy);
+
+  auto CreateSign = [&](IRBuilder<> &B, Value *Val, Value *Disc,
+   OperandBundleDef DSBundle) {
+Function *F = B.GetInsertBlock()->getParent();
+Attribute FSAttr = F->getFnAttribute("target-features");
+if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth"))
+  return B.CreateCall(SignIntr, {Val, B.getInt32(2), Disc}, DSBundle);
+return B.CreateCall(EmuSignIntr, {Val, Disc}, DSBundle);
+  };
+
+  auto CreateAuth = [&](IRBuilder<> &B, Value *Val, Value *Disc,
+   OperandBundleDef DSBundle) {
+Function *F = B.GetInsertBlock()->getParent();
+Attribute FSAttr = F->getFnAttribute("target-features");
+if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth"))
+  return B.CreateCall(AuthIntr, {Val, B.getInt32(2), Disc}, DSBundle);
+return B.CreateCall(EmuAuthIntr, {Val, Disc}, DSBundle);
+  };
+
+  for (User *U : Intr.users()) {
+auto *Call = cast(U);
+auto *FieldName = cast(

pcc wrote:

Yeah, the exact semantics of the intrinsic are a bit inelegant at the moment. 
For example, Clang shares knowledge with this pass about how deactivation 
symbols are named, as well as how to compute hashes (it uses `std::hash` 
because I didn't get around to switching it to SipHash yet, my bad). It may be 
best to move all of that knowledge into Clang and then the intrinsic would take 
a deactivation symbol as well as a discriminator hash.

https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits



@@ -3011,6 +3011,12 @@ defm experimental_omit_vtable_rtti : 
BoolFOption<"experimental-omit-vtable-rtti"
   NegFlag,
   BothFlags<[], [CC1Option], " the RTTI component from virtual tables">>;
 
+def experimental_pointer_field_protection_EQ : Joined<["-"], 
"fexperimental-pointer-field-protection=">, Group,

pcc wrote:

(See other reply.)

https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits



@@ -2268,13 +2293,22 @@ CodeGenFunction::EmitNullInitialization(Address 
DestPtr, QualType Ty) {
 
 // Get and call the appropriate llvm.memcpy overload.
 Builder.CreateMemCpy(DestPtr, SrcPtr, SizeVal, false);
-return;
+  } else {
+// Otherwise, just memset the whole thing to zero.  This is legal

pcc wrote:

A PFP field containing a null pointer has a non-null bit pattern (it is a 
signed null). This avoids a conditional branch at every load or store of a 
pointer whose nullness cannot be statically determined. That's why we need 
to store nulls here.

The EmitNullInitialization function is not used in case of a constructor call. 
It is used in cases like
```
struct S {
  void *p;
};

void f() {
  S s{};
}
```
I am not adding a call to memset here. The change is a bit confusing 
because I needed to change some of the indentation to get rid of an early 
return. The memset that was on line 2277 has moved to line 2301; both before 
and after this change it is executed if `isZeroInitializable`. The only code 
being added is the null stores.
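
To make the "signed null" point concrete, here is a toy sketch. It assumes a made-up encoding (XOR with a per-field constant) purely for illustration; the real scheme uses pointer signing or rotation, and `kFieldDisc`, `encodeField` and `decodeField` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical per-field discriminator; a real one would be derived from
// the field's identity.
constexpr std::uint64_t kFieldDisc = 0x5a5a5a5a5a5a5a5aULL;

// Toy encoding: what is stored in memory differs from the raw pointer bits.
constexpr std::uint64_t encodeField(std::uint64_t Raw) {
  return Raw ^ kFieldDisc;
}
constexpr std::uint64_t decodeField(std::uint64_t Stored) {
  return Stored ^ kFieldDisc;
}

// A null pointer is stored as a non-zero bit pattern...
static_assert(encodeField(0) != 0, "encoded null is not all-zero");
// ...so memory that was merely zero-filled does not decode to null...
static_assert(decodeField(0) != 0, "zero bytes are not a valid null");
// ...while an explicitly stored null round-trips without any branch.
static_assert(decodeField(encodeField(0)) == 0, "stored null decodes to null");
```

Because encoding and decoding are unconditional, no null check is needed on the load/store path, at the price of having to store encoded nulls wherever a zero fill used to suffice.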

https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits



@@ -928,6 +936,11 @@ namespace {
   if (PointerAuthQualifier Q = F->getType().getPointerAuth();
   Q && Q.isAddressDiscriminated())
 return false;
+  // Non-trivially-copyable fields with pointer field protection need to be

pcc wrote:

We could move lines 936-943 into a separate function that takes a QualType and 
call it from other places. However, I checked the other callers of 
`PointerAuthQualifier::isAddressDiscriminated` and I think almost all of them 
are already correct.

https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits


https://github.com/pcc edited https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits



@@ -2201,6 +2215,22 @@ void CodeGenFunction::EmitCXXConstructorCall(
 EmitTypeCheck(CodeGenFunction::TCK_ConstructorCall, Loc, This,
   getContext().getRecordType(ClassDecl), CharUnits::Zero());
 
+  // When initializing an object that has pointer field protection and whose
+  // fields are not trivially relocatable we must initialize any pointer fields
+  // to a valid signed pointer (any pointer value will do, but we just use null
+  // pointers). This is because if the object is subsequently copied, its copy
+  // constructor will need to read and authenticate any pointer fields in order
+  // to copy the object to a new address, which will fail if the pointers are
+  // uninitialized.
+  if (!getContext().arePFPFieldsTriviallyRelocatable(D->getParent())) {

pcc wrote:

That's fair. This code was added while testing in our internal codebase at a 
time when we had a more expansive view of which fields would be subject to PFP. 
I will re-evaluate whether this is still needed.

https://github.com/llvm/llvm-project/pull/133538
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] Add pointer field protection feature. (PR #133538)

2025-05-14 Thread Peter Collingbourne via llvm-branch-commits


https://github.com/pcc commented:

Hi Oliver, thanks for your comments! I'll address them below.

> Thoughts:
> 
> This should be opt-in on a field or struct granularity, not just a global 
> behavior.

This would certainly be easier if it were an opt-in behavior, as it would 
avoid a substantial amount of the complexity that you point out below. The 
problem is that an opt-in design would not be very effective as a UAF 
mitigation, because it would only help with code that was opted in, which is 
likely a tiny fraction of an entire codebase. That fraction is unlikely to be 
the fraction with UAF bugs, not only because of its size but also because if 
developers aren't thinking about memory safety issues, they certainly won't be 
manually opting into this. With PFP, the problem that I set out to solve was to 
find a way to automatically protect as many pointers stored in memory as 
possible in standards-conforming code.

> In the RFC I think you mentioned not applying PFP to C types, but I'm unsure 
> how you're deciding what is a C type?

This is based on whether the type has standard layout. C does not permit 
declaring a type that is non-standard layout.
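
For reference, the standard-layout distinction can be checked with `std::is_standard_layout`. The struct names below are made up for illustration, and how PFP itself classifies each type is an assumption based only on the statement above.

```cpp
#include <type_traits>

// Expressible in C: standard layout, so (per the rule above) not
// implicitly subject to PFP.
struct CLike {
  void *p;
  int n;
};

// A virtual member function makes the type non-standard-layout.
struct WithVirtual {
  virtual ~WithVirtual() = default;
  void *p;
};

// Non-static data members in both a base and a derived class also make the
// derived type non-standard-layout.
struct Base {
  int x;
};
struct Derived : Base {
  void *p;
};

static_assert(std::is_standard_layout<CLike>::value, "C-like struct");
static_assert(!std::is_standard_layout<WithVirtual>::value, "has vtable");
static_assert(!std::is_standard_layout<Derived>::value, "split members");
```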

> There are a lot of special cases being added in places that should not need 
> to be aware of PFP - the core type queries should be returning correct values 
> for types containing PFP fields.

Agreed on type queries. The current adjustment to how trivially-relocatable is 
defined is a workaround that will be removed once P2786 is fully implemented in 
libc++, and at that point we shouldn't need `__has_non_relocatable_fields` 
either. The intent is that turning on PFP should be invisible to the developer 
(in other words, the modification to the storage format of pointer fields in 
certain structs is permitted by the as-if rule). As a result, most of the 
implementation is in CodeGen and below.

> A lot of the "this read/write/copy/init special work" exactly aligns with the 
> constraints of pointer auth, and I think the overall implementation could be 
> made better by introducing the concept of a `HardenedPointerQualifier` or 
> similar, that could be either PointerAuthQualifier or PFPQualifier. In 
> principle this could even just be treated as a different PointerAuth key set.
>
> That approach would also help with some of the problems you have with 
> offsetof and pointers to PFP fields - the thing that causes the problems with 
> the pointer to a PFP field is that the PFP schema for a given field is in 
> reality part of the type, so given
> 
> ```c++
> struct A {
>void *field1;
>void *field2;
> };
> ```
> 
> It looks like you are trying to maintain the idea that `A::field1` and 
> `A::field2` have the same type, which makes this code valid according to the 
> type system:
> 
> ```c++
> void f(void** obj);
> void g(A *a) {
>   f(&a->field1);
>   f(&a->field2);
> }
> ```
> 
> but the real types are not the same. The semantic equivalent under pointer 
> auth is something akin to
> 
> ```c++
> struct A {
>void * __ptrauth(1, 1, hash("A::field1")) field1;
>void * __ptrauth(1, 1, hash("A::field2")) field2;
> };
> ```
> 
> Which makes it very clear that the types are different, and I think trying to 
> pretend they are not is part of what is causing problems.

In the early stages of the project, I wanted to make it so that the PFP fields 
would implicitly receive qualifiers, similar to what you proposed, and I was 
trying to come up with an approach that would make it work, but I couldn't see 
a way because of the inter-convertibility of pointers from different structs 
(and pointers outside of structs entirely), and disallowing conversions between 
differently qualified pointer types would render the compiler unable to compile 
standard C++. The only way that I could see to make it work was to utilize the 
as-if rule and invisibly modify the in-memory representation.

For pointers in standard-layout types (i.e. pointers that explicitly opt in) I 
agree with you that the right approach would be to use a qualifier, but I think 
that should be done as a separate change, possibly as a followup. In this 
change, I would like to implement the solution for implicitly protected fields.

> The more I think about it, the more I feel that this could be implemented 
> almost entirely on top of the existing pointer auth work.
> 
> Currently for pointer auth there's a set of ARM keys specified, and I think 
> you just need to create a different set, PFP_Keys or similar, and set up the 
> appropriate schemas (basically null schemas for all the existing cases), and 
> add a new schema: DefaultFieldSchema or similar, and apply that schema to 
> every pointer field in an object unless there's an explicit qualifier applied 
> already.
> 
> Then it would in principle just be a matter of adding the appropriate 
> logic to the ptrauth intrinsics in llvm.
>
> Now there are some moderately large caveats to that idea
> 
> * the existing ptrauth

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: optionally assume auth traps on failure (PR #139778)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/139778

>From 8f997500a97b5ad7acc9dea416cca6b2f7bb615d Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 13 May 2025 19:50:41 +0300
Subject: [PATCH] [BOLT] Gadget scanner: optionally assume auth traps on
 failure

On AArch64 it is possible for an auth instruction to either return an
invalid address value on failure (without FEAT_FPAC) or generate an
error (with FEAT_FPAC). It thus may be possible to never emit explicit
pointer checks, if the target CPU is known to support FEAT_FPAC.

This commit implements an --auth-traps-on-failure command line option,
which essentially makes "safe-to-dereference" and "trusted" register
properties identical and disables scanning for authentication oracles
completely.
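
The relation between the two register properties can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `RegState`, `afterAuth` and `afterCheck` are hypothetical names invented for the illustration and are not part of BOLT, and the real analysis tracks considerably more state.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical, simplified model of the two register properties mentioned
// above; these names do not exist in BOLT.
constexpr unsigned NumRegs = 32;

struct RegState {
  std::bitset<NumRegs> SafeToDeref; // written by an authentication instruction
  std::bitset<NumRegs> Trusted;     // additionally known to hold a valid pointer
};

// Models an authentication instruction writing its result to Reg.
inline RegState afterAuth(RegState S, unsigned Reg, bool AuthTrapsOnFailure) {
  S.SafeToDeref.set(Reg);
  // If a failed authentication faults (FEAT_FPAC), reaching the next
  // instruction implies success, so no explicit check is required.
  if (AuthTrapsOnFailure)
    S.Trusted.set(Reg);
  return S;
}

// Models an explicit check: it only upgrades a register that was already
// safe to dereference.
inline RegState afterCheck(RegState S, unsigned Reg) {
  if (S.SafeToDeref.test(Reg))
    S.Trusted.set(Reg);
  return S;
}
```

In this model, passing `AuthTrapsOnFailure = true` makes the two bit sets evolve identically after every authentication, which is the sense in which the option merges the properties.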
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++
 .../binary-analysis/AArch64/cmdline-args.test |   1 +
 .../AArch64/gs-pauth-authentication-oracles.s |   6 +-
 .../binary-analysis/AArch64/gs-pauth-calls.s  |   5 +-
 .../AArch64/gs-pauth-debug-output.s   | 177 ++---
 .../AArch64/gs-pauth-jump-table.s |   6 +-
 .../AArch64/gs-pauth-signing-oracles.s|  54 ++---
 .../AArch64/gs-pauth-tail-calls.s | 184 +-
 8 files changed, 318 insertions(+), 227 deletions(-)

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index bda971bcd9343..cfe86d32df798 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -14,6 +14,7 @@
 #include "bolt/Passes/PAuthGadgetScanner.h"
 #include "bolt/Core/ParallelUtilities.h"
 #include "bolt/Passes/DataflowAnalysis.h"
+#include "bolt/Utils/CommandLineOpts.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/MC/MCInst.h"
@@ -26,6 +27,11 @@ namespace llvm {
 namespace bolt {
 namespace PAuthGadgetScanner {
 
+static cl::opt<bool> AuthTrapsOnFailure(
+"auth-traps-on-failure",
+cl::desc("Assume authentication instructions always trap on failure"),
+cl::cat(opts::BinaryAnalysisCategory));
+
 [[maybe_unused]] static void traceInst(const BinaryContext &BC, StringRef Label,
const MCInst &MI) {
   dbgs() << "  " << Label << ": ";
@@ -365,6 +371,34 @@ class SrcSafetyAnalysis {
 return Clobbered;
   }
 
+  std::optional<MCPhysReg> getRegMadeTrustedByChecking(const MCInst &Inst,
+   SrcState Cur) const {
+// This function cannot return multiple registers. This is never the case
+// on AArch64.
+std::optional<MCPhysReg> RegCheckedByInst =
+BC.MIB->getAuthCheckedReg(Inst, /*MayOverwrite=*/false);
+if (RegCheckedByInst && Cur.SafeToDerefRegs[*RegCheckedByInst])
+  return *RegCheckedByInst;
+
+auto It = CheckerSequenceInfo.find(&Inst);
+if (It == CheckerSequenceInfo.end())
+  return std::nullopt;
+
+MCPhysReg RegCheckedBySequence = It->second.first;
+const MCInst *FirstCheckerInst = It->second.second;
+
+// FirstCheckerInst should belong to the same basic block (see the
+// assertion in DataflowSrcSafetyAnalysis::run()), meaning it was
+// deterministically processed a few steps before this instruction.
+const SrcState &StateBeforeChecker = getStateBefore(*FirstCheckerInst);
+
+// The sequence checks the register, but it should be authenticated before.
+if (!StateBeforeChecker.SafeToDerefRegs[RegCheckedBySequence])
+  return std::nullopt;
+
+return RegCheckedBySequence;
+  }
+
   // Returns all registers that can be treated as if they are written by an
   // authentication instruction.
   SmallVector<MCPhysReg> getRegsMadeSafeToDeref(const MCInst &Point,
@@ -387,18 +421,38 @@ class SrcSafetyAnalysis {
 Regs.push_back(DstAndSrc->first);
 }
 
+// Make sure explicit checker sequence keeps register safe-to-dereference
+// when the register would be clobbered according to the regular rules:
+//
+//; LR is safe to dereference here
+//mov   x16, x30  ; start of the sequence, LR is s-t-d right before
+//xpaclri ; clobbers LR, LR is not safe anymore
+//cmp   x30, x16
+//b.eq  1f; end of the sequence: LR is marked as trusted
+//brk   0x1234
+//  1:
+//; at this point LR would be marked as trusted,
+//; but not safe-to-dereference
+//
+// or even just
+//
+//; X1 is safe to dereference here
+//ldr x0, [x1, #8]!
+//; X1 is trusted here, but it was clobbered due to address write-back
+if (auto CheckedReg = getRegMadeTrustedByChecking(Point, Cur))
+  Regs.push_back(*CheckedReg);
+
 return Regs;
   }
 
   // Returns all registers made trusted by this instruction.
   SmallVector<MCPhysReg> getRegsMadeTrusted(const MCInst &Point,
 const SrcState &Cur) const {
+assert(!AuthTrapsOnFailure &&

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: prevent false positives due to jump tables (PR #138884)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/138884

>From cdc385aac7abe960340e26b2427ac9215e0a54fd Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 6 May 2025 11:31:03 +0300
Subject: [PATCH] [BOLT] Gadget scanner: prevent false positives due to jump
 tables

As part of PAuth hardening, AArch64 LLVM backend can use a special
BR_JumpTable pseudo (enabled by -faarch64-jump-table-hardening
Clang option) which is expanded in the AsmPrinter into a contiguous
sequence without unsafe instructions in the middle.

This commit adds another target-specific callback to MCPlusBuilder
to make it possible to inhibit false positives for known-safe jump
table dispatch sequences. Without special handling, the branch
instruction is likely to be reported as a non-protected call (as its
destination is not produced by an auth instruction, PC-relative address
materialization, etc.) and possibly as a tail call being performed with
unsafe link register (as the detection whether the branch instruction
is a tail call is a heuristic).

For now, only the specific instruction sequence used by the AArch64
LLVM backend is matched.
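
The matching strategy can be sketched on a toy CFG: starting from the branch, walk backwards one instruction at a time, following only unique predecessors, and compare against the expected sequence. Everything below is an assumption made for illustration: `Block`, `InstRef`, `singlePredecessor` and `endsWithSequence` are hypothetical, and the mnemonics in the usage are placeholders rather than the exact sequence emitted by the backend.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy CFG; not BOLT's BinaryBasicBlock.
struct Block {
  std::vector<std::string> Insts;    // instruction mnemonics, in program order
  std::vector<const Block *> Preds;  // predecessor blocks
};

struct InstRef {
  const Block *BB;
  std::size_t Index;
};

// Returns the only instruction that can execute right before Ref, or
// std::nullopt if there are several candidates or none.
std::optional<InstRef> singlePredecessor(InstRef Ref) {
  if (Ref.Index != 0)
    return InstRef{Ref.BB, Ref.Index - 1};
  if (Ref.BB->Preds.size() != 1)
    return std::nullopt;
  const Block *Pred = Ref.BB->Preds.front();
  if (Pred->Insts.empty())
    return std::nullopt;
  return InstRef{Pred, Pred->Insts.size() - 1};
}

// Returns true if the instructions ending at Last are exactly Expected
// (given in program order), crossing block boundaries only through a
// unique predecessor.
bool endsWithSequence(InstRef Last, const std::vector<std::string> &Expected) {
  std::optional<InstRef> Cur = Last;
  for (auto It = Expected.rbegin(); It != Expected.rend(); ++It) {
    if (!Cur || Cur->BB->Insts[Cur->Index] != *It)
      return false;
    Cur = singlePredecessor(*Cur);
  }
  return true;
}
```

Rejecting any position with more than one predecessor is what keeps the match conservative: an attacker-reachable side entry into the middle of the sequence makes the walk stop.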
---
 bolt/include/bolt/Core/MCInstUtils.h  |   9 +
 bolt/include/bolt/Core/MCPlusBuilder.h|  14 +
 bolt/lib/Core/MCInstUtils.cpp |  20 +
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  10 +
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |  73 ++
 .../AArch64/gs-pauth-jump-table.s | 703 ++
 6 files changed, 829 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-jump-table.s

diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h
index b495eb8ef5eec..5dd0aaa48d6e7 100644
--- a/bolt/include/bolt/Core/MCInstUtils.h
+++ b/bolt/include/bolt/Core/MCInstUtils.h
@@ -158,6 +158,15 @@ class MCInstReference {
 return nullptr;
   }
 
+  /// Returns the only preceding instruction, or std::nullopt if multiple or no
+  /// predecessors are possible.
+  ///
+  /// If CFG information is available, basic block boundary can be crossed,
+  /// provided there is exactly one predecessor. If CFG is not available, the
+  /// preceding instruction in the offset order is returned, unless this is the
+  /// first instruction of the function.
+  std::optional<MCInstReference> getSinglePredecessor();
+
   raw_ostream &print(raw_ostream &OS) const;
 };
 
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 87de6754017db..eb93d7de7fee9 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -14,6 +14,7 @@
 #ifndef BOLT_CORE_MCPLUSBUILDER_H
 #define BOLT_CORE_MCPLUSBUILDER_H
 
+#include "bolt/Core/MCInstUtils.h"
 #include "bolt/Core/MCPlus.h"
 #include "bolt/Core/Relocation.h"
 #include "llvm/ADT/ArrayRef.h"
@@ -699,6 +700,19 @@ class MCPlusBuilder {
 return std::nullopt;
   }
 
+  /// Tests if BranchInst corresponds to an instruction sequence which is known
+  /// to be a safe dispatch via jump table.
+  ///
+  /// The target can decide which instruction sequences to consider "safe" from
+  /// the Pointer Authentication point of view, such as any jump table dispatch
+  /// sequence without function calls inside, any sequence which is contiguous,
+  /// or only some specific well-known sequences.
+  virtual bool
+  isSafeJumpTableBranchForPtrAuth(MCInstReference BranchInst) const {
+llvm_unreachable("not implemented");
+return false;
+  }
+
   virtual bool isTerminator(const MCInst &Inst) const;
 
   virtual bool isNoop(const MCInst &Inst) const {
diff --git a/bolt/lib/Core/MCInstUtils.cpp b/bolt/lib/Core/MCInstUtils.cpp
index 40f6edd59135c..b7c6d898988af 100644
--- a/bolt/lib/Core/MCInstUtils.cpp
+++ b/bolt/lib/Core/MCInstUtils.cpp
@@ -55,3 +55,23 @@ raw_ostream &MCInstReference::print(raw_ostream &OS) const {
   OS << ">";
   return OS;
 }
+
+std::optional<MCInstReference> MCInstReference::getSinglePredecessor() {
+  if (const RefInBB *Ref = tryGetRefInBB()) {
+if (Ref->It != Ref->BB->begin())
+  return MCInstReference(Ref->BB, &*std::prev(Ref->It));
+
+if (Ref->BB->pred_size() != 1)
+  return std::nullopt;
+
+BinaryBasicBlock *PredBB = *Ref->BB->pred_begin();
+assert(!PredBB->empty() && "Empty basic blocks are not supported yet");
+return MCInstReference(PredBB, &*PredBB->rbegin());
+  }
+
+  const RefInBF &Ref = getRefInBF();
+  if (Ref.It == Ref.BF->instrs().begin())
+return std::nullopt;
+
+  return MCInstReference(Ref.BF, std::prev(Ref.It));
+}
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 5e08ae3fbf767..bda971bcd9343 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1328,6 +1328,11 @@ shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
 return std::nullopt;
   }
 
+  if (BC.MIB->isSafeJumpTableBranchForPtrAuth(Inst)) {
+LL

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: prevent false positives due to jump tables (PR #138884)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/138884

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: account for BRK when searching for auth oracles (PR #137975)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137975

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


atrosinenko wrote:

As this PR improved the handling of leaf functions without CFG information, a 
few more test cases can be cleaned up - updated in 
1377286872187ed191b02bd9632842f4db6dc367.

https://github.com/llvm/llvm-project/pull/137224

[llvm-branch-commits] [llvm] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) (PR #138883)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/138883

>From 4bd8dd9334c8ad810e7fc593331cd6d4e2fdbbad Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Wed, 7 May 2025 16:42:00 +0300
Subject: [PATCH] [BOLT] Introduce helpers to match `MCInst`s one at a time
 (NFC)

Introduce matchInst helper function to capture and/or match the operands
of MCInst. Unlike the existing `MCPlusBuilder::MCInstMatcher` machinery,
matchInst is intended for the use cases when precise control over the
instruction order is required. For example, when validating PtrAuth
hardening, all registers are usually considered unsafe after a function
call, even though callee-saved registers should preserve their old
values *under normal operation*.
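
The capture-then-match idea can be shown with a self-contained toy. This is a sketch, not the helper added by the patch: `ToyInst`, `Capture`, `matchToyInst` and the `ToyADD`/`ToyBR` opcodes are hypothetical, all operands are plain integers, and, unlike the patch, a capture is only updated when its own operand matches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical toy instruction: an opcode plus integer operands.
struct ToyInst {
  unsigned Opcode;
  std::vector<int64_t> Ops;
};

constexpr unsigned ToyADD = 1, ToyBR = 2;

// Captures the first operand value it sees, then only matches that value.
class Capture {
  mutable std::optional<int64_t> Value;

public:
  Capture(std::optional<int64_t> V = std::nullopt) : Value(V) {}
  bool matches(int64_t Op) const {
    bool Ok = !Value || *Value == Op;
    if (Ok)
      Value = Op;
    return Ok;
  }
  std::optional<int64_t> peek() const { return Value; }
  void reset(std::optional<int64_t> V) const { Value = V; }
  int64_t get() const {
    assert(Value.has_value());
    return *Value;
  }
};

// Matches a single instruction; on failure every capture is rolled back to
// the value it had before the call.
template <typename... Ms>
bool matchToyInst(const ToyInst &I, unsigned Opcode, const Ms &...Ops) {
  if (I.Opcode != Opcode || I.Ops.size() < sizeof...(Ops))
    return false;
  std::optional<int64_t> Saved[sizeof...(Ops) + 1] = {Ops.peek()...};
  std::size_t Idx = 0;
  bool Ok = (Ops.matches(I.Ops[Idx++]) && ...);
  if (!Ok) {
    Idx = 0;
    (Ops.reset(Saved[Idx++]), ...);
  }
  return Ok;
}
```

Because each call inspects exactly one instruction, the caller decides which instruction to look at next and can stop as soon as anything unexpected (such as a call) appears between two matched instructions.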
---
 bolt/include/bolt/Core/MCInstUtils.h  | 128 ++
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |  90 +---
 2 files changed, 162 insertions(+), 56 deletions(-)

diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h
index a3912a8fb265a..b495eb8ef5eec 100644
--- a/bolt/include/bolt/Core/MCInstUtils.h
+++ b/bolt/include/bolt/Core/MCInstUtils.h
@@ -166,6 +166,134 @@ static inline raw_ostream &operator<<(raw_ostream &OS,
   return Ref.print(OS);
 }
 
+/// Instruction-matching helpers operating on a single instruction at a time.
+///
+/// Unlike MCPlusBuilder::MCInstMatcher, this matchInst() function focuses on
+/// the cases where a precise control over the instruction order is important:
+///
+/// // Bring the short names into the local scope:
+/// using namespace MCInstMatcher;
+/// // Declare the registers to capture:
+/// Reg Xn, Xm;
+/// // Capture the 0th and 1st operands, match the 2nd operand against the
+/// // just captured Xm register, match the 3rd operand against literal 0:
+/// if (!matchInst(MaybeAdd, AArch64::ADDXrs, Xm, Xn, Xm, Imm(0)))
+///   return AArch64::NoRegister;
+/// // Match the 0th operand against Xm:
+/// if (!matchInst(MaybeBr, AArch64::BR, Xm))
+///   return AArch64::NoRegister;
+/// // Return the matched register:
+/// return Xm.get();
+namespace MCInstMatcher {
+
+// The base class to match an operand of type T.
+//
+// The subclasses of OpMatcher are intended to be allocated on the stack and
+// to only be used by passing them to matchInst() and by calling their get()
+// function, thus the peculiar `mutable` specifiers: to make the calling code
+// compact and readable, the templated matchInst() function has to accept both
+// long-lived Imm/Reg wrappers declared as local variables (intended to capture
+// the first operand's value and match the subsequent operands, whether inside
+// a single instruction or across multiple instructions), as well as temporary
+// wrappers around literal values to match, f.e. Imm(42) or Reg(AArch64::XZR).
+template <typename T> class OpMatcher {
+  mutable std::optional<T> Value;
+  mutable std::optional<T> SavedValue;
+
+  // Remember/restore the last Value - to be called by matchInst.
+  void remember() const { SavedValue = Value; }
+  void restore() const { Value = SavedValue; }
+
+  template <typename... OpMatchers>
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+protected:
+  OpMatcher(std::optional<T> ValueToMatch) : Value(ValueToMatch) {}
+
+  bool matchValue(T OpValue) const {
+// Check that OpValue does not contradict the existing Value.
+bool MatchResult = !Value || *Value == OpValue;
+// If MatchResult is false, all matchers will be reset before returning from
+// matchInst, including this one, thus no need to assign conditionally.
+Value = OpValue;
+
+return MatchResult;
+  }
+
+public:
+  /// Returns the captured value.
+  T get() const {
+assert(Value.has_value());
+return *Value;
+  }
+};
+
+class Reg : public OpMatcher<MCPhysReg> {
+  bool matches(const MCOperand &Op) const {
+if (!Op.isReg())
+  return false;
+
+return matchValue(Op.getReg());
+  }
+
+  template <typename... OpMatchers>
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+public:
+  Reg(std::optional<MCPhysReg> RegToMatch = std::nullopt)
+  : OpMatcher<MCPhysReg>(RegToMatch) {}
+};
+
+class Imm : public OpMatcher<int64_t> {
+  bool matches(const MCOperand &Op) const {
+if (!Op.isImm())
+  return false;
+
+return matchValue(Op.getImm());
+  }
+
+  template <typename... OpMatchers>
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+public:
+  Imm(std::optional<int64_t> ImmToMatch = std::nullopt)
+  : OpMatcher<int64_t>(ImmToMatch) {}
+};
+
+/// Tries to match Inst and updates Ops on success.
+///
+/// If Inst has the specified Opcode and its operand list prefix matches Ops,
+/// this function returns true and updates Ops, otherwise false is returned and
+/// values of Ops are kept as before matchInst was called.
+///
+/// Please note that while Ops are technically passed by a const reference to
+/// make invocations like `matchInst(MI, Opcode, Imm(42))` possible, all their
+/// fields are marked mut

[llvm-branch-commits] [llvm] [BOLT] Factor out MCInstReference from gadget scanner (NFC) (PR #138655)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/138655

>From a7a2eea52cb63453c261499288fec46f9e1d3613 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 28 Apr 2025 18:35:48 +0300
Subject: [PATCH] [BOLT] Factor out MCInstReference from gadget scanner (NFC)

Move MCInstReference representing a constant reference to an instruction
inside a parent entity - either inside a basic block (which has a
reference to its parent function) or directly to the function (when CFG
information is not available).
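
The "one reference type, two kinds of parent" design can be sketched with `std::variant`. The types below (`ToyBlock`, `ToyFunction`, `ToyInstRef`) are hypothetical stand-ins invented for this illustration; instructions are modelled as strings, and the offset-keyed map is an assumption about how a function without a CFG might store them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for an instruction, a basic block and a function.
using ToyInst = std::string;
struct ToyBlock {
  std::vector<ToyInst> Insts; // CFG available: contiguous instructions
};
struct ToyFunction {
  std::map<uint32_t, ToyInst> InstsByOffset; // no CFG: keyed by offset
};

class ToyInstRef {
  struct InBlock {
    const ToyBlock *BB;
    std::size_t Index;
  };
  struct InFunction {
    const ToyFunction *BF;
    std::map<uint32_t, ToyInst>::const_iterator It;
  };
  // Exactly one of the two representations is active at any time.
  std::variant<InBlock, InFunction> Ref;

public:
  ToyInstRef(const ToyBlock *BB, std::size_t Index) : Ref(InBlock{BB, Index}) {}
  ToyInstRef(const ToyFunction *BF, uint32_t Offset)
      : Ref(InFunction{BF, BF->InstsByOffset.find(Offset)}) {}

  bool hasCFG() const { return std::holds_alternative<InBlock>(Ref); }

  // Callers access the instruction without caring which case is active.
  const ToyInst &inst() const {
    if (const InBlock *B = std::get_if<InBlock>(&Ref))
      return B->BB->Insts[B->Index];
    return std::get<InFunction>(Ref).It->second;
  }
};
```

Keeping the variant private and exposing only uniform accessors means that analysis code can be written once and run on functions with and without CFG information.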
---
 bolt/include/bolt/Core/MCInstUtils.h  | 172 +
 bolt/include/bolt/Passes/PAuthGadgetScanner.h | 180 +-
 bolt/lib/Core/CMakeLists.txt  |   1 +
 bolt/lib/Core/MCInstUtils.cpp |  57 ++
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 102 --
 5 files changed, 273 insertions(+), 239 deletions(-)
 create mode 100644 bolt/include/bolt/Core/MCInstUtils.h
 create mode 100644 bolt/lib/Core/MCInstUtils.cpp

diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h
new file mode 100644
index 0..a3912a8fb265a
--- /dev/null
+++ b/bolt/include/bolt/Core/MCInstUtils.h
@@ -0,0 +1,172 @@
+//===- bolt/Core/MCInstUtils.h --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef BOLT_CORE_MCINSTUTILS_H
+#define BOLT_CORE_MCINSTUTILS_H
+
+#include "bolt/Core/BinaryBasicBlock.h"
+
+#include 
+#include 
+#include 
+
+namespace llvm {
+namespace bolt {
+
+class BinaryFunction;
+
+/// MCInstReference represents a reference to a constant MCInst as stored either
+/// in a BinaryFunction (i.e. before a CFG is created), or in a BinaryBasicBlock
+/// (after a CFG is created).
+class MCInstReference {
+  using nocfg_const_iterator = std::map::const_iterator;
+
+  // Two cases are possible:
+  // * functions with CFG reconstructed - a function stores a collection of
+  //   basic blocks, each basic block stores a contiguous vector of MCInst
+  // * functions without CFG - there are no basic blocks created,
+  //   the instructions are directly stored in std::map in BinaryFunction
+  //
+  // In both cases, the direct parent of MCInst is stored together with an
+  // iterator pointing to the instruction.
+
+  // Helper struct: CFG is available, the direct parent is a basic block,
+  // iterator's type is `MCInst *`.
+  struct RefInBB {
+RefInBB(const BinaryBasicBlock *BB, const MCInst *Inst)
+: BB(BB), It(Inst) {}
+RefInBB(const RefInBB &Other) = default;
+RefInBB &operator=(const RefInBB &Other) = default;
+
+const BinaryBasicBlock *BB;
+BinaryBasicBlock::const_iterator It;
+
+bool operator<(const RefInBB &Other) const {
+  if (BB != Other.BB)
+return std::less{}(BB, Other.BB);
+  return It < Other.It;
+}
+
+bool operator==(const RefInBB &Other) const {
+  return BB == Other.BB && It == Other.It;
+}
+  };
+
+  // Helper struct: CFG is *not* available, the direct parent is a function,
+  // iterator's type is std::map::iterator (the mapped value
+  // is an instruction's offset).
+  struct RefInBF {
+RefInBF(const BinaryFunction *BF, nocfg_const_iterator It)
+: BF(BF), It(It) {}
+RefInBF(const RefInBF &Other) = default;
+RefInBF &operator=(const RefInBF &Other) = default;
+
+const BinaryFunction *BF;
+nocfg_const_iterator It;
+
+bool operator<(const RefInBF &Other) const {
+  if (BF != Other.BF)
+return std::less{}(BF, Other.BF);
+  return It->first < Other.It->first;
+}
+
+bool operator==(const RefInBF &Other) const {
+  return BF == Other.BF && It->first == Other.It->first;
+}
+  };
+
+  std::variant<RefInBB, RefInBF> Reference;
+
+  // Utility methods to be used like this:
+  //
+  // if (auto *Ref = tryGetRefInBB())
+  //   return Ref->doSomething(...);
+  // return getRefInBF().doSomethingElse(...);
+  const RefInBB *tryGetRefInBB() const {
+assert(std::get_if<RefInBB>(&Reference) ||
+   std::get_if<RefInBF>(&Reference));
+return std::get_if<RefInBB>(&Reference);
+  }
+  const RefInBF &getRefInBF() const {
+assert(std::get_if<RefInBF>(&Reference));
+return *std::get_if<RefInBF>(&Reference);
+  }
+
+public:
+  /// Constructs an empty reference.
+  MCInstReference() : Reference(RefInBB(nullptr, nullptr)) {}
+  /// Constructs a reference to the instruction inside the basic block.
+  MCInstReference(const BinaryBasicBlock *BB, const MCInst *Inst)
+  : Reference(RefInBB(BB, Inst)) {
+assert(BB && Inst && "Neither BB nor Inst should be nullptr");
+  }
+  /// Constructs a reference to the instruction inside the basic block.
+  MCInstReference(const BinaryBasicBlock *BB, uns

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: account for BRK when searching for auth oracles (PR #137975)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137975

>From 5b6f6d0e0ab5e9353c7082f44d4107e21dc84cf5 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Wed, 30 Apr 2025 16:08:10 +0300
Subject: [PATCH] [BOLT] Gadget scanner: account for BRK when searching for
 auth oracles

An authenticated pointer can be explicitly checked by the compiler via a
sequence of instructions that executes BRK on failure. It is important
to recognize such BRK instruction as checking every register (as it is
expected to immediately trigger an abnormal program termination) to
prevent false positive reports about authentication oracles:

autia   x2, x3
autia   x0, x1
; neither x0 nor x2 are checked at this point
eor x16, x0, x0, lsl #1
tbz x16, #62, on_success ; marks x0 as checked
; end of BB: for x2 to be checked here, it must be checked in both
; successor basic blocks
  on_failure:
brk 0xc470
  on_success:
; x2 is checked
ldr x1, [x2] ; marks x2 as checked
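
The effect on the example above can be reproduced with a toy model of the backward merge. This is a sketch under assumptions: `CheckedSet`, `stateAtBlockEnd` and `stateBeforeTrap` are hypothetical names, and a single bit per register stands in for the analysis state.

```cpp
#include <bitset>
#include <cassert>
#include <vector>

// Hypothetical model: bit N is set if register xN is known to be checked
// on every path continuing from the current point.
constexpr unsigned NumRegs = 32;
using CheckedSet = std::bitset<NumRegs>;

// State at the end of a basic block: a register counts as checked only if
// it is checked in every successor. A block without successors is treated
// pessimistically: nothing is checked.
CheckedSet stateAtBlockEnd(const std::vector<CheckedSet> &SuccessorStates) {
  if (SuccessorStates.empty())
    return CheckedSet();
  CheckedSet Result;
  Result.set();
  for (const CheckedSet &S : SuccessorStates)
    Result &= S;
  return Result;
}

// State right before a trap instruction: execution stops here, so no
// register can be used unchecked afterwards.
CheckedSet stateBeforeTrap() {
  CheckedSet S;
  S.set();
  return S;
}
```

With the trap recognized, the `on_failure` block contributes an all-ones set and the merge at the end of the first block keeps x2 as checked; if the `brk` block were treated as an ordinary dead end, the intersection would be empty and x2 would be reported.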
---
 bolt/include/bolt/Core/MCPlusBuilder.h| 14 ++
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 13 +-
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   | 24 --
 .../AArch64/gs-pauth-address-checks.s | 44 +--
 .../AArch64/gs-pauth-authentication-oracles.s |  9 ++--
 .../AArch64/gs-pauth-signing-oracles.s|  6 +--
 6 files changed, 75 insertions(+), 35 deletions(-)

diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 6d3aa4f5f0feb..87de6754017db 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -706,6 +706,20 @@ class MCPlusBuilder {
 return false;
   }
 
+  /// Returns true if Inst is a trap instruction.
+  ///
+  /// Tests if Inst is an instruction that immediately causes an abnormal
+  /// program termination, for example when a security violation is detected
+  /// by a compiler-inserted check.
+  ///
+  /// @note An implementation of this method should likely return false for
+  /// calls to library functions like abort(), as it is possible that the
+  /// execution state is partially attacker-controlled at this point.
+  virtual bool isTrap(const MCInst &Inst) const {
+llvm_unreachable("not implemented");
+return false;
+  }
+
   virtual bool isBreakpoint(const MCInst &Inst) const {
 llvm_unreachable("not implemented");
 return false;
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index dfb71575b2b39..835ee26aaf08a 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1028,6 +1028,15 @@ class DstSafetyAnalysis {
   dbgs() << ")\n";
 });
 
+// If this instruction terminates the program immediately, no
+// authentication oracles are possible past this point.
+if (BC.MIB->isTrap(Point)) {
+  LLVM_DEBUG({ traceInst(BC, "Trap instruction found", Point); });
+  DstState Next(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
+  Next.CannotEscapeUnchecked.set();
+  return Next;
+}
+
 // If this instruction is reachable by the analysis, a non-empty state will
 // be propagated to it sooner or later. Until then, skip computeNext().
 if (Cur.empty()) {
@@ -1133,8 +1142,8 @@ class DataflowDstSafetyAnalysis
 //
 // A basic block without any successors, on the other hand, can be
 // pessimistically initialized to everything-is-unsafe: this will naturally
-// handle both return and tail call instructions and is harmless for
-// internal indirect branch instructions (such as computed gotos).
+// handle return, trap and tail call instructions. At the same time, it is
+// harmless for internal indirect branch instructions, like computed gotos.
 if (BB.succ_empty())
   return createUnsafeState();
 
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp 
b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index f3c29e6ee43b9..4d11c5b206eab 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -386,10 +386,9 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 // the list of successors of this basic block as appropriate.
 
 // Any of the above code sequences assume the fall-through basic block
-// is a dead-end BRK instruction (any immediate operand is accepted).
+// is a dead-end trap instruction.
 const BinaryBasicBlock *BreakBB = BB.getFallthrough();
-if (!BreakBB || BreakBB->empty() ||
-BreakBB->front().getOpcode() != AArch64::BRK)
+if (!BreakBB || BreakBB->empty() || !isTrap(BreakBB->front()))
   return std::nullopt;
 
 // Iterate over the instructions of BB in reverse order, matching opcodes
@@ -1745,6 +1744,25 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 Inst.addOperand(MCOperand::createImm(0));
   }

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From d20efbedd0be34942e8e28cc91eefeb28d1b8108 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 7a5d47a3ff812..dfb71575b2b39 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -701,8 +701,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -713,12 +714,6 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -729,6 +724,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -743,7 +739,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1288,6 +1284,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast<const MCInst &>(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  /

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: optionally assume auth traps on failure (PR #139778)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/139778

>From 8f997500a97b5ad7acc9dea416cca6b2f7bb615d Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 13 May 2025 19:50:41 +0300
Subject: [PATCH] [BOLT] Gadget scanner: optionally assume auth traps on
 failure

On AArch64 it is possible for an auth instruction to either return an
invalid address value on failure (without FEAT_FPAC) or generate an
error (with FEAT_FPAC). It thus may be possible to never emit explicit
pointer checks, if the target CPU is known to support FEAT_FPAC.

This commit implements an --auth-traps-on-failure command line option,
which essentially makes "safe-to-dereference" and "trusted" register
properties identical and disables scanning for authentication oracles
completely.
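
For illustration only, the effect of the option on the two register
properties can be modelled as below. This is a minimal sketch and not the
scanner's code: RegState, afterAuth and NumRegs are hypothetical names, and
the model assumes that an authentication instruction overwrites its
destination register.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical register count; the real analysis asks the target for it.
constexpr std::size_t NumRegs = 32;

// Toy model of the two per-register properties named in the commit message.
struct RegState {
  std::bitset<NumRegs> SafeToDeref;
  std::bitset<NumRegs> Trusted;
};

// Models an authentication instruction that writes its result to Reg.
RegState afterAuth(RegState S, unsigned Reg, bool AuthTrapsOnFailure) {
  assert(Reg < NumRegs && "register index out of range");
  // A failed authentication yields an invalid address (or traps), so the
  // result is safe to dereference in both modes.
  S.SafeToDeref.set(Reg);
  if (AuthTrapsOnFailure)
    S.Trusted.set(Reg);   // failure would have trapped: value is trusted
  else
    S.Trusted.reset(Reg); // an explicit check is still required
  return S;
}
```

With the flag set, the two bitsets stay equal after every authentication in
this model, which is why a separate search for authentication oracles has
nothing left to find.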
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++
 .../binary-analysis/AArch64/cmdline-args.test |   1 +
 .../AArch64/gs-pauth-authentication-oracles.s |   6 +-
 .../binary-analysis/AArch64/gs-pauth-calls.s  |   5 +-
 .../AArch64/gs-pauth-debug-output.s   | 177 ++---
 .../AArch64/gs-pauth-jump-table.s |   6 +-
 .../AArch64/gs-pauth-signing-oracles.s|  54 ++---
 .../AArch64/gs-pauth-tail-calls.s | 184 +-
 8 files changed, 318 insertions(+), 227 deletions(-)

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index bda971bcd9343..cfe86d32df798 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -14,6 +14,7 @@
 #include "bolt/Passes/PAuthGadgetScanner.h"
 #include "bolt/Core/ParallelUtilities.h"
 #include "bolt/Passes/DataflowAnalysis.h"
+#include "bolt/Utils/CommandLineOpts.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/MC/MCInst.h"
@@ -26,6 +27,11 @@ namespace llvm {
 namespace bolt {
 namespace PAuthGadgetScanner {
 
+static cl::opt<bool> AuthTrapsOnFailure(
+"auth-traps-on-failure",
+cl::desc("Assume authentication instructions always trap on failure"),
+cl::cat(opts::BinaryAnalysisCategory));
+
 [[maybe_unused]] static void traceInst(const BinaryContext &BC, StringRef 
Label,
const MCInst &MI) {
   dbgs() << "  " << Label << ": ";
@@ -365,6 +371,34 @@ class SrcSafetyAnalysis {
 return Clobbered;
   }
 
+  std::optional<MCPhysReg> getRegMadeTrustedByChecking(const MCInst &Inst,
+   SrcState Cur) const {
+// This function cannot return multiple registers. This is never the case
+// on AArch64.
+std::optional<MCPhysReg> RegCheckedByInst =
+BC.MIB->getAuthCheckedReg(Inst, /*MayOverwrite=*/false);
+if (RegCheckedByInst && Cur.SafeToDerefRegs[*RegCheckedByInst])
+  return *RegCheckedByInst;
+
+auto It = CheckerSequenceInfo.find(&Inst);
+if (It == CheckerSequenceInfo.end())
+  return std::nullopt;
+
+MCPhysReg RegCheckedBySequence = It->second.first;
+const MCInst *FirstCheckerInst = It->second.second;
+
+// FirstCheckerInst should belong to the same basic block (see the
+// assertion in DataflowSrcSafetyAnalysis::run()), meaning it was
+// deterministically processed a few steps before this instruction.
+const SrcState &StateBeforeChecker = getStateBefore(*FirstCheckerInst);
+
+// The sequence checks the register, but it should be authenticated before.
+if (!StateBeforeChecker.SafeToDerefRegs[RegCheckedBySequence])
+  return std::nullopt;
+
+return RegCheckedBySequence;
+  }
+
   // Returns all registers that can be treated as if they are written by an
   // authentication instruction.
   SmallVector<MCPhysReg> getRegsMadeSafeToDeref(const MCInst &Point,
@@ -387,18 +421,38 @@ class SrcSafetyAnalysis {
 Regs.push_back(DstAndSrc->first);
 }
 
+// Make sure explicit checker sequence keeps register safe-to-dereference
+// when the register would be clobbered according to the regular rules:
+//
+//; LR is safe to dereference here
+//mov   x16, x30  ; start of the sequence, LR is s-t-d right before
+//xpaclri ; clobbers LR, LR is not safe anymore
+//cmp   x30, x16
+//b.eq  1f; end of the sequence: LR is marked as trusted
+//brk   0x1234
+//  1:
+//; at this point LR would be marked as trusted,
+//; but not safe-to-dereference
+//
+// or even just
+//
+//; X1 is safe to dereference here
+//ldr x0, [x1, #8]!
+//; X1 is trusted here, but it was clobbered due to address write-back
+if (auto CheckedReg = getRegMadeTrustedByChecking(Point, Cur))
+  Regs.push_back(*CheckedReg);
+
 return Regs;
   }
 
   // Returns all registers made trusted by this instruction.
   SmallVector<MCPhysReg> getRegsMadeTrusted(const MCInst &Point,
 const SrcState &Cur) const {
+assert(!AuthTrapsOnFailure &&

[llvm-branch-commits] [llvm] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) (PR #138883)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/138883

>From 4bd8dd9334c8ad810e7fc593331cd6d4e2fdbbad Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Wed, 7 May 2025 16:42:00 +0300
Subject: [PATCH] [BOLT] Introduce helpers to match `MCInst`s one at a time
 (NFC)

Introduce matchInst helper function to capture and/or match the operands
of MCInst. Unlike the existing `MCPlusBuilder::MCInstMatcher` machinery,
matchInst is intended for the use cases when precise control over the
instruction order is required. For example, when validating PtrAuth
hardening, all registers are usually considered unsafe after a function
call, even though callee-saved registers should preserve their old
values *under normal operation*.
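
As a rough illustration of the capture-then-match idea, here is a
self-contained toy version. It is a sketch under simplifying assumptions, not
the code added by this patch: ToyInst, Capture and matchToyInst are
hypothetical names, and every operand is modelled as a plain integer instead
of an MCOperand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <optional>
#include <vector>

// Toy stand-in for MCInst: an opcode plus integer operands.
struct ToyInst {
  unsigned Opcode;
  std::vector<std::int64_t> Operands;
};

// Captures a value on first use, then requires later operands to equal it.
class Capture {
  std::optional<std::int64_t> Value;

public:
  Capture() = default;
  explicit Capture(std::int64_t Literal) : Value(Literal) {}

  bool match(std::int64_t Op) {
    if (Value && *Value != Op)
      return false;
    Value = Op;
    return true;
  }
  bool hasValue() const { return Value.has_value(); }
  std::int64_t get() const {
    assert(Value.has_value());
    return *Value;
  }
};

// Matches the opcode and a prefix of the operand list. On failure, every
// matcher is rolled back to the state it had before the call.
bool matchToyInst(const ToyInst &Inst, unsigned Opcode,
                  std::initializer_list<Capture *> Ops) {
  if (Inst.Opcode != Opcode || Inst.Operands.size() < Ops.size())
    return false;
  std::vector<Capture> Saved;
  for (Capture *C : Ops)
    Saved.push_back(*C);
  std::size_t I = 0;
  for (Capture *C : Ops) {
    if (!C->match(Inst.Operands[I++])) {
      std::size_t J = 0;
      for (Capture *R : Ops)
        *R = Saved[J++];
      return false;
    }
  }
  return true;
}
```

Because each instruction is matched by a separate call, the caller decides
which instruction is examined next and can stop or reset its assumptions
between calls, for example after a function call.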
---
 bolt/include/bolt/Core/MCInstUtils.h  | 128 ++
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |  90 +---
 2 files changed, 162 insertions(+), 56 deletions(-)

diff --git a/bolt/include/bolt/Core/MCInstUtils.h 
b/bolt/include/bolt/Core/MCInstUtils.h
index a3912a8fb265a..b495eb8ef5eec 100644
--- a/bolt/include/bolt/Core/MCInstUtils.h
+++ b/bolt/include/bolt/Core/MCInstUtils.h
@@ -166,6 +166,134 @@ static inline raw_ostream &operator<<(raw_ostream &OS,
   return Ref.print(OS);
 }
 
+/// Instruction-matching helpers operating on a single instruction at a time.
+///
+/// Unlike MCPlusBuilder::MCInstMatcher, this matchInst() function focuses on
+/// the cases where a precise control over the instruction order is important:
+///
+/// // Bring the short names into the local scope:
+/// using namespace MCInstMatcher;
+/// // Declare the registers to capture:
+/// Reg Xn, Xm;
+/// // Capture the 0th and 1st operands, match the 2nd operand against the
+/// // just captured Xm register, match the 3rd operand against literal 0:
+/// if (!matchInst(MaybeAdd, AArch64::ADDXrs, Xm, Xn, Xm, Imm(0)))
+///   return AArch64::NoRegister;
+/// // Match the 0th operand against Xm:
+/// if (!matchInst(MaybeBr, AArch64::BR, Xm))
+///   return AArch64::NoRegister;
+/// // Return the matched register:
+/// return Xm.get();
+namespace MCInstMatcher {
+
+// The base class to match an operand of type T.
+//
+// The subclasses of OpMatcher are intended to be allocated on the stack and
+// to only be used by passing them to matchInst() and by calling their get()
+// function, thus the peculiar `mutable` specifiers: to make the calling code
+// compact and readable, the templated matchInst() function has to accept both
+// long-lived Imm/Reg wrappers declared as local variables (intended to capture
+// the first operand's value and match the subsequent operands, whether inside
+// a single instruction or across multiple instructions), as well as temporary
+// wrappers around literal values to match, f.e. Imm(42) or Reg(AArch64::XZR).
+template <typename T> class OpMatcher {
+  mutable std::optional<T> Value;
+  mutable std::optional<T> SavedValue;
+
+  // Remember/restore the last Value - to be called by matchInst.
+  void remember() const { SavedValue = Value; }
+  void restore() const { Value = SavedValue; }
+
+  template <class... OpMatchers>
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+protected:
+  OpMatcher(std::optional<T> ValueToMatch) : Value(ValueToMatch) {}
+
+  bool matchValue(T OpValue) const {
+// Check that OpValue does not contradict the existing Value.
+bool MatchResult = !Value || *Value == OpValue;
+// If MatchResult is false, all matchers will be reset before returning from
+// matchInst, including this one, thus no need to assign conditionally.
+Value = OpValue;
+
+return MatchResult;
+  }
+
+public:
+  /// Returns the captured value.
+  T get() const {
+assert(Value.has_value());
+return *Value;
+  }
+};
+
+class Reg : public OpMatcher<MCPhysReg> {
+  bool matches(const MCOperand &Op) const {
+if (!Op.isReg())
+  return false;
+
+return matchValue(Op.getReg());
+  }
+
+  template <class... OpMatchers>
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+public:
+  Reg(std::optional<MCPhysReg> RegToMatch = std::nullopt)
+  : OpMatcher<MCPhysReg>(RegToMatch) {}
+};
+
+class Imm : public OpMatcher<int64_t> {
+  bool matches(const MCOperand &Op) const {
+if (!Op.isImm())
+  return false;
+
+return matchValue(Op.getImm());
+  }
+
+  template <class... OpMatchers>
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+public:
+  Imm(std::optional<int64_t> ImmToMatch = std::nullopt)
+  : OpMatcher<int64_t>(ImmToMatch) {}
+};
+
+/// Tries to match Inst and updates Ops on success.
+///
+/// If Inst has the specified Opcode and its operand list prefix matches Ops,
+/// this function returns true and updates Ops, otherwise false is returned and
+/// values of Ops are kept as before matchInst was called.
+///
+/// Please note that while Ops are technically passed by a const reference to
+/// make invocations like `matchInst(MI, Opcode, Imm(42))` possible, all their
+/// fields are marked mut

[llvm-branch-commits] [llvm] [llvm][OpenMP] Add "SourceLanguages" property to Directive (PR #139960)

2025-05-14 Thread Krzysztof Parzyszek via llvm-branch-commits


https://github.com/kparzysz created 
https://github.com/llvm/llvm-project/pull/139960

The official languages that OpenMP recognizes are C/C++ and Fortran. Some 
OpenMP directives are language-specific, some are C/C++-only, some are 
Fortran-only.

Add a property to the TableGen definition of Directive that will be the list of 
languages that allow the directive.

The TableGen backend will then generate a bitmask-like enumeration 
SourceLanguages, and a function
  SourceLanguages getDirectiveLanguages(Directive D);
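
To make the shape of the generated code concrete, the sketch below shows what
a bitmask-like enumeration and the query function could look like. It is an
assumption-laden illustration, not the TableGen output: the Directive
enumerators, the None value and the allows() helper are hypothetical, and only
the names SourceLanguages and getDirectiveLanguages come from the description
above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of directives; the real enum is generated by TableGen.
enum class Directive { Allocators, BeginAssumes, Parallel };

// Bitmask-like enumeration: each language occupies one bit.
enum class SourceLanguages : std::uint32_t {
  None = 0,
  C = 1u << 0, // C also stands for C++ here
  Fortran = 1u << 1,
};

constexpr SourceLanguages operator|(SourceLanguages A, SourceLanguages B) {
  return static_cast<SourceLanguages>(static_cast<std::uint32_t>(A) |
                                      static_cast<std::uint32_t>(B));
}

// Hypothetical helper: true if Set contains Lang.
constexpr bool allows(SourceLanguages Set, SourceLanguages Lang) {
  return (static_cast<std::uint32_t>(Set) &
          static_cast<std::uint32_t>(Lang)) != 0;
}

// Returns the set of languages in which the directive may appear.
constexpr SourceLanguages getDirectiveLanguages(Directive D) {
  switch (D) {
  case Directive::Allocators:
    return SourceLanguages::Fortran;
  case Directive::BeginAssumes:
    return SourceLanguages::C;
  case Directive::Parallel:
    return SourceLanguages::C | SourceLanguages::Fortran;
  }
  return SourceLanguages::None;
}
```

A directive that leaves the property at its default would map to the union of
both bits, as Parallel does in this sketch.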



  




___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [llvm][OpenMP] Add "SourceLanguages" property to Directive (PR #139960)

2025-05-14 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-flang-openmp

Author: Krzysztof Parzyszek (kparzysz)


Changes

The official languages that OpenMP recognizes are C/C++ and Fortran. Some 
OpenMP directives are language-specific, some are C/C++-only, some are 
Fortran-only.

Add a property to the TableGen definition of Directive that will be the list of 
languages that allow the directive.

The TableGen backend will then generate a bitmask-like enumeration 
SourceLanguages, and a function
  SourceLanguages getDirectiveLanguages(Directive D);

---

Patch is 24.68 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/139960.diff


6 Files Affected:

- (modified) llvm/include/llvm/Frontend/Directive/DirectiveBase.td (+12) 
- (modified) llvm/include/llvm/Frontend/OpenMP/OMP.td (+63-19) 
- (modified) llvm/include/llvm/TableGen/DirectiveEmitter.h (+12-2) 
- (modified) llvm/test/TableGen/directive1.td (+17) 
- (modified) llvm/test/TableGen/directive2.td (+17) 
- (modified) llvm/utils/TableGen/Basic/DirectiveEmitter.cpp (+77) 


``diff
diff --git a/llvm/include/llvm/Frontend/Directive/DirectiveBase.td 
b/llvm/include/llvm/Frontend/Directive/DirectiveBase.td
index 4faea18324cb7..3e2744dea8d14 100644
--- a/llvm/include/llvm/Frontend/Directive/DirectiveBase.td
+++ b/llvm/include/llvm/Frontend/Directive/DirectiveBase.td
@@ -172,6 +172,15 @@ def CA_Meta: Category<"Meta"> {}
 def CA_Subsidiary: Category<"Subsidiary"> {}
 def CA_Utility: Category<"Utility"> {}
 
+class SourceLanguage<string n> {
+  string name = n;  // Name of the enum value in enum class Association.
+}
+
+// The C language also implies C++ until there is a reason to add C++
+// separately.
+def L_C : SourceLanguage<"C"> {}
+def L_Fortran : SourceLanguage<"Fortran"> {}
+
 // Information about a specific directive.
 class Directive {
   // Name of the directive. Can be composite directive sepearted by whitespace.
@@ -205,4 +214,7 @@ class Directive {
 
   // The category of the directive.
   Category category = ?;
+
+  // The languages that allow this directive. Default: all languages.
+  list<SourceLanguage> languages = [L_C, L_Fortran];
 }
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td 
b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index 194b1e657c493..0af4b436649a3 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -573,6 +573,7 @@ def OMP_Allocators : Directive<"allocators"> {
   ];
   let association = AS_Block;
   let category = CA_Executable;
+  let languages = [L_Fortran];
 }
 def OMP_Assumes : Directive<"assumes"> {
   let association = AS_None;
@@ -586,10 +587,6 @@ def OMP_Assumes : Directive<"assumes"> {
 VersionedClause,
   ];
 }
-def OMP_EndAssumes : Directive<"end assumes"> {
-  let association = AS_Delimited;
-  let category = OMP_Assumes.category;
-}
 def OMP_Assume : Directive<"assume"> {
   let association = AS_Block;
   let category = CA_Informational;
@@ -637,6 +634,12 @@ def OMP_BeginAssumes : Directive<"begin assumes"> {
 VersionedClause,
 VersionedClause,
   ];
+  let languages = [L_C];
+}
+def OMP_EndAssumes : Directive<"end assumes"> {
+  let association = AS_Delimited;
+  let category = OMP_BeginAssumes.category;
+  let languages = OMP_BeginAssumes.languages;
 }
 def OMP_BeginDeclareTarget : Directive<"begin declare target"> {
   let allowedClauses = [
@@ -647,10 +650,22 @@ def OMP_BeginDeclareTarget : Directive<"begin declare 
target"> {
   ];
   let association = AS_Delimited;
   let category = CA_Declarative;
+  let languages = [L_C];
+}
+def OMP_EndDeclareTarget : Directive<"end declare target"> {
+  let association = AS_Delimited;
+  let category = OMP_BeginDeclareTarget.category;
+  let languages = OMP_BeginDeclareTarget.languages;
 }
 def OMP_BeginDeclareVariant : Directive<"begin declare variant"> {
   let association = AS_Delimited;
   let category = CA_Declarative;
+  let languages = [L_C];
+}
+def OMP_EndDeclareVariant : Directive<"end declare variant"> {
+  let association = AS_Delimited;
+  let category = OMP_BeginDeclareVariant.category;
+  let languages = OMP_BeginDeclareVariant.languages;
 }
 def OMP_Cancel : Directive<"cancel"> {
   let allowedOnceClauses = [
@@ -717,10 +732,6 @@ def OMP_DeclareTarget : Directive<"declare target"> {
   let association = AS_None;
   let category = CA_Declarative;
 }
-def OMP_EndDeclareTarget : Directive<"end declare target"> {
-  let association = AS_Delimited;
-  let category = OMP_DeclareTarget.category;
-}
 def OMP_DeclareVariant : Directive<"declare variant"> {
   let allowedClauses = [
 VersionedClause,
@@ -731,10 +742,7 @@ def OMP_DeclareVariant : Directive<"declare variant"> {
   ];
   let association = AS_Declaration;
   let category = CA_Declarative;
-}
-def OMP_EndDeclareVariant : Directive<"end declare variant"> {
-  let association = AS_Delimited;
-  let category = OMP_DeclareVariant.category;
+  let languages = [L_C];
 }
 def OMP_Depobj : Directive<"depobj"> {
   let allowedClauses =

[llvm-branch-commits] [llvm] [llvm][OpenMP] Add "SourceLanguages" property to Directive (PR #139960)

2025-05-14 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-tablegen

Author: Krzysztof Parzyszek (kparzysz)


Changes

The official languages that OpenMP recognizes are C/C++ and Fortran. Some 
OpenMP directives are language-specific, some are C/C++-only, some are 
Fortran-only.

Add a property to the TableGen definition of Directive that will be the list of 
languages that allow the directive.

The TableGen backend will then generate a bitmask-like enumeration 
SourceLanguages, and a function
  SourceLanguages getDirectiveLanguages(Directive D);

---

Patch is 24.68 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/139960.diff


6 Files Affected:

- (modified) llvm/include/llvm/Frontend/Directive/DirectiveBase.td (+12) 
- (modified) llvm/include/llvm/Frontend/OpenMP/OMP.td (+63-19) 
- (modified) llvm/include/llvm/TableGen/DirectiveEmitter.h (+12-2) 
- (modified) llvm/test/TableGen/directive1.td (+17) 
- (modified) llvm/test/TableGen/directive2.td (+17) 
- (modified) llvm/utils/TableGen/Basic/DirectiveEmitter.cpp (+77) 


``diff
diff --git a/llvm/include/llvm/Frontend/Directive/DirectiveBase.td 
b/llvm/include/llvm/Frontend/Directive/DirectiveBase.td
index 4faea18324cb7..3e2744dea8d14 100644
--- a/llvm/include/llvm/Frontend/Directive/DirectiveBase.td
+++ b/llvm/include/llvm/Frontend/Directive/DirectiveBase.td
@@ -172,6 +172,15 @@ def CA_Meta: Category<"Meta"> {}
 def CA_Subsidiary: Category<"Subsidiary"> {}
 def CA_Utility: Category<"Utility"> {}
 
+class SourceLanguage<string n> {
+  string name = n;  // Name of the enum value in enum class Association.
+}
+
+// The C language also implies C++ until there is a reason to add C++
+// separately.
+def L_C : SourceLanguage<"C"> {}
+def L_Fortran : SourceLanguage<"Fortran"> {}
+
 // Information about a specific directive.
 class Directive {
   // Name of the directive. Can be composite directive sepearted by whitespace.
@@ -205,4 +214,7 @@ class Directive {
 
   // The category of the directive.
   Category category = ?;
+
+  // The languages that allow this directive. Default: all languages.
+  list<SourceLanguage> languages = [L_C, L_Fortran];
 }
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td 
b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index 194b1e657c493..0af4b436649a3 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -573,6 +573,7 @@ def OMP_Allocators : Directive<"allocators"> {
   ];
   let association = AS_Block;
   let category = CA_Executable;
+  let languages = [L_Fortran];
 }
 def OMP_Assumes : Directive<"assumes"> {
   let association = AS_None;
@@ -586,10 +587,6 @@ def OMP_Assumes : Directive<"assumes"> {
 VersionedClause,
   ];
 }
-def OMP_EndAssumes : Directive<"end assumes"> {
-  let association = AS_Delimited;
-  let category = OMP_Assumes.category;
-}
 def OMP_Assume : Directive<"assume"> {
   let association = AS_Block;
   let category = CA_Informational;
@@ -637,6 +634,12 @@ def OMP_BeginAssumes : Directive<"begin assumes"> {
 VersionedClause,
 VersionedClause,
   ];
+  let languages = [L_C];
+}
+def OMP_EndAssumes : Directive<"end assumes"> {
+  let association = AS_Delimited;
+  let category = OMP_BeginAssumes.category;
+  let languages = OMP_BeginAssumes.languages;
 }
 def OMP_BeginDeclareTarget : Directive<"begin declare target"> {
   let allowedClauses = [
@@ -647,10 +650,22 @@ def OMP_BeginDeclareTarget : Directive<"begin declare 
target"> {
   ];
   let association = AS_Delimited;
   let category = CA_Declarative;
+  let languages = [L_C];
+}
+def OMP_EndDeclareTarget : Directive<"end declare target"> {
+  let association = AS_Delimited;
+  let category = OMP_BeginDeclareTarget.category;
+  let languages = OMP_BeginDeclareTarget.languages;
 }
 def OMP_BeginDeclareVariant : Directive<"begin declare variant"> {
   let association = AS_Delimited;
   let category = CA_Declarative;
+  let languages = [L_C];
+}
+def OMP_EndDeclareVariant : Directive<"end declare variant"> {
+  let association = AS_Delimited;
+  let category = OMP_BeginDeclareVariant.category;
+  let languages = OMP_BeginDeclareVariant.languages;
 }
 def OMP_Cancel : Directive<"cancel"> {
   let allowedOnceClauses = [
@@ -717,10 +732,6 @@ def OMP_DeclareTarget : Directive<"declare target"> {
   let association = AS_None;
   let category = CA_Declarative;
 }
-def OMP_EndDeclareTarget : Directive<"end declare target"> {
-  let association = AS_Delimited;
-  let category = OMP_DeclareTarget.category;
-}
 def OMP_DeclareVariant : Directive<"declare variant"> {
   let allowedClauses = [
 VersionedClause,
@@ -731,10 +742,7 @@ def OMP_DeclareVariant : Directive<"declare variant"> {
   ];
   let association = AS_Declaration;
   let category = CA_Declarative;
-}
-def OMP_EndDeclareVariant : Directive<"end declare variant"> {
-  let association = AS_Delimited;
-  let category = OMP_DeclareVariant.category;
+  let languages = [L_C];
 }
 def OMP_Depobj : Directive<"depobj"> {
   let allowedClauses = [
@

[llvm-branch-commits] [clang] [clang][OpenMP] Improve handling of non-C/C++ directives (PR #139961)

2025-05-14 Thread Krzysztof Parzyszek via llvm-branch-commits


https://github.com/kparzysz created 
https://github.com/llvm/llvm-project/pull/139961

PR #139793 added handling of the Fortran-only "workshare" directive; however, 
there are more such directives, e.g. "allocators". Use the 
getDirectiveLanguages function to detect non-C/C++ directives instead of 
enumerating them.
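
The filtering step described above can be sketched in isolation as follows.
This is a simplified model and not the parser's code: DirectiveKind,
LanguageBits, languagesOf, lookupDirective and classifyForC are hypothetical
names, and the directive table is reduced to three entries.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical directive kinds and language bits.
enum class DirectiveKind { Unknown, Parallel, Workshare, Allocators };
enum LanguageBits : unsigned { LangC = 1u << 0, LangFortran = 1u << 1 };

// Languages that allow each directive; Unknown is allowed nowhere.
unsigned languagesOf(DirectiveKind K) {
  switch (K) {
  case DirectiveKind::Parallel:
    return LangC | LangFortran;
  case DirectiveKind::Workshare:
  case DirectiveKind::Allocators:
    return LangFortran;
  case DirectiveKind::Unknown:
    break;
  }
  return 0;
}

// Maps a spelling to a directive kind, regardless of language.
DirectiveKind lookupDirective(std::string_view Name) {
  if (Name == "parallel")
    return DirectiveKind::Parallel;
  if (Name == "workshare")
    return DirectiveKind::Workshare;
  if (Name == "allocators")
    return DirectiveKind::Allocators;
  return DirectiveKind::Unknown;
}

// Treats directives that are not allowed in C/C++ as unknown, so no
// per-directive special case is needed.
DirectiveKind classifyForC(std::string_view Name) {
  DirectiveKind K = lookupDirective(Name);
  if (!(languagesOf(K) & LangC))
    K = DirectiveKind::Unknown;
  return K;
}
```

Adding another Fortran-only directive then only requires a table entry; the
check in classifyForC stays unchanged.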



  




___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [clang][OpenMP] Improve handling of non-C/C++ directives (PR #139961)

2025-05-14 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Krzysztof Parzyszek (kparzysz)


Changes

PR #139793 added handling of the Fortran-only "workshare" directive; however, 
there are more such directives, e.g. "allocators". Use the 
getDirectiveLanguages function to detect non-C/C++ directives instead of 
enumerating them.

---
Full diff: https://github.com/llvm/llvm-project/pull/139961.diff


3 Files Affected:

- (modified) clang/lib/Parse/ParseOpenMP.cpp (+2-3) 
- (added) clang/test/OpenMP/openmp_non_c_directives.c (+12) 
- (removed) clang/test/OpenMP/openmp_workshare.c (-8) 


```diff
diff --git a/clang/lib/Parse/ParseOpenMP.cpp b/clang/lib/Parse/ParseOpenMP.cpp
index c409409602e75..a53fd6b2964cb 100644
--- a/clang/lib/Parse/ParseOpenMP.cpp
+++ b/clang/lib/Parse/ParseOpenMP.cpp
@@ -2613,9 +2613,8 @@ StmtResult Parser::ParseOpenMPDeclarativeOrExecutableDirective(
     Diag(Tok, diag::err_omp_unknown_directive);
     return StmtError();
   }
-  if (DKind == OMPD_workshare) {
-    // "workshare" is an executable, Fortran-only directive. Treat it
-    // as unknown.
+  if (!(getDirectiveLanguages(DKind) & SourceLanguage::C)) {
+    // Treat directives that are not allowed in C/C++ as unknown.
     DKind = OMPD_unknown;
   }
 
diff --git a/clang/test/OpenMP/openmp_non_c_directives.c b/clang/test/OpenMP/openmp_non_c_directives.c
new file mode 100644
index 0..844d7dad551bc
--- /dev/null
+++ b/clang/test/OpenMP/openmp_non_c_directives.c
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 -verify -fopenmp -ferror-limit 100 -o - %s
+
+// Test the reaction to some Fortran-only directives.
+
+void foo() {
+#pragma omp allocators // expected-error {{expected an OpenMP directive}}
+#pragma omp do // expected-error {{expected an OpenMP directive}}
+#pragma omp end workshare // expected-error {{expected an OpenMP directive}}
+#pragma omp parallel workshare // expected-warning {{extra tokens at the end of '#pragma omp parallel' are ignored}}
+#pragma omp workshare // expected-error {{expected an OpenMP directive}}
+}
+
diff --git a/clang/test/OpenMP/openmp_workshare.c b/clang/test/OpenMP/openmp_workshare.c
deleted file mode 100644
index 0302eb19f9ef4..0
--- a/clang/test/OpenMP/openmp_workshare.c
+++ /dev/null
@@ -1,8 +0,0 @@
-// RUN: %clang_cc1 -verify -fopenmp -ferror-limit 100 -o - %s
-
-// Workshare is a Fortran-only directive.
-
-void foo() {
-#pragma omp workshare // expected-error {{expected an OpenMP directive}}
-}
-

```




https://github.com/llvm/llvm-project/pull/139961
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [HLSL] Implicit resource binding for cbuffers (PR #139022)

2025-05-14 Thread Helena Kotas via llvm-branch-commits


https://github.com/hekota updated 
https://github.com/llvm/llvm-project/pull/139022



  



  

  

  

  

  

  


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [clang][analyzer] Handle CXXParenInitListExpr alongside InitListExpr (PR #139909)

2025-05-14 Thread Gábor Horváth via llvm-branch-commits


https://github.com/Xazax-hun approved this pull request.


https://github.com/llvm/llvm-project/pull/139909
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [llvm][OpenMP] Add "SourceLanguages" property to Directive (PR #139960)

2025-05-14 Thread Krzysztof Parzyszek via llvm-branch-commits


kparzysz wrote:

Previous PR: https://github.com/llvm/llvm-project/pull/139958
Next PR: https://github.com/llvm/llvm-project/pull/139961

https://github.com/llvm/llvm-project/pull/139960
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [clang][OpenMP] Improve handling of non-C/C++ directives (PR #139961)

2025-05-14 Thread Krzysztof Parzyszek via llvm-branch-commits


kparzysz wrote:

Previous PR: https://github.com/llvm/llvm-project/pull/139960

https://github.com/llvm/llvm-project/pull/139961
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [clang][OpenMP] Improve handling of non-C/C++ directives (PR #139961)

2025-05-14 Thread Alexey Bataev via llvm-branch-commits


https://github.com/alexey-bataev approved this pull request.


https://github.com/llvm/llvm-project/pull/139961
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [DirectX] Adding support for static samples is yaml2obj/obj2yaml (PR #139963)

2025-05-14 Thread via llvm-branch-commits


llvmbot wrote:



@llvm/pr-subscribers-llvm-binary-utilities

@llvm/pr-subscribers-backend-directx

Author: None (joaosaffran)


Changes

- Adds support for static samplers in the DXContainer binary format.
- Adds writing logic to mcdxbc.
- Adds reading logic to Object.
- Adds tests.
Closes: [126636](https://github.com/llvm/llvm-project/issues/126636)
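
The diff below defines each new enum by setting a macro such as `STATIC_SAMPLER_FILTER(Val, Enum)` and then including `DXContainerConstants.def`. A minimal sketch of that X-macro technique follows, assuming an inline list in place of the `.def` file and reproducing only three of the filter values from the patch; `FILTER_LIST`, `isValidSamplerFilter`, and `samplerFilterName` are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical inline list standing in for DXContainerConstants.def.
#define FILTER_LIST(X)                                                         \
  X(0x0, MIN_MAG_MIP_POINT)                                                    \
  X(0x1, MIN_MAG_POINT_MIP_LINEAR)                                             \
  X(0x55, ANISOTROPIC)

// Expand the list once to produce the enumerators.
enum class StaticSamplerFilter : std::uint32_t {
#define AS_ENUMERATOR(Val, Enum) Enum = Val,
  FILTER_LIST(AS_ENUMERATOR)
#undef AS_ENUMERATOR
};

// Expand it again to validate a raw value read from a binary.
inline bool isValidSamplerFilter(std::uint32_t v) {
  switch (v) {
#define AS_CASE(Val, Enum) case Val:
    FILTER_LIST(AS_CASE)
#undef AS_CASE
    return true;
  default:
    return false;
  }
}

// And once more to map an enumerator to its spelling, e.g. for YAML output.
inline const char *samplerFilterName(StaticSamplerFilter f) {
  switch (f) {
#define AS_NAME(Val, Enum)                                                     \
  case StaticSamplerFilter::Enum:                                              \
    return #Enum;
    FILTER_LIST(AS_NAME)
#undef AS_NAME
  }
  return "unknown";
}
```

Keeping the value list in one place means the enum, the validity check, and the name mapping cannot drift apart when a value is added.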

---

Patch is 23.58 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/139963.diff


12 Files Affected:

- (modified) llvm/include/llvm/BinaryFormat/DXContainer.h (+51) 
- (modified) llvm/include/llvm/BinaryFormat/DXContainerConstants.def (+73) 
- (modified) llvm/include/llvm/MC/DXContainerRootSignature.h (+1) 
- (modified) llvm/include/llvm/Object/DXContainer.h (+5) 
- (modified) llvm/include/llvm/ObjectYAML/DXContainerYAML.h (+26) 
- (modified) llvm/lib/MC/DXContainerRootSignature.cpp (+19-1) 
- (modified) llvm/lib/Object/DXContainer.cpp (+5) 
- (modified) llvm/lib/ObjectYAML/DXContainerEmitter.cpp (+19) 
- (modified) llvm/lib/ObjectYAML/DXContainerYAML.cpp (+38) 
- (added) llvm/test/ObjectYAML/DXContainer/RootSignature-StaticSamplers.yaml (+63)
- (modified) llvm/unittests/Object/DXContainerTest.cpp (+46) 
- (modified) llvm/unittests/ObjectYAML/DXContainerYAMLTest.cpp (+59) 


``diff
diff --git a/llvm/include/llvm/BinaryFormat/DXContainer.h b/llvm/include/llvm/BinaryFormat/DXContainer.h
index c18d1e3d7f024..b9e9821a3f4a6 100644
--- a/llvm/include/llvm/BinaryFormat/DXContainer.h
+++ b/llvm/include/llvm/BinaryFormat/DXContainer.h
@@ -18,6 +18,7 @@
 #include "llvm/Support/SwapByteOrder.h"
 #include "llvm/TargetParser/Triple.h"
 
+#include 
 #include 
 
 namespace llvm {
@@ -600,6 +601,25 @@ inline bool isValidShaderVisibility(uint32_t V) {
   return false;
 }
 
+#define STATIC_SAMPLER_FILTER(Val, Enum) Enum = Val,
+enum class StaticSamplerFilter : uint32_t {
+#include "DXContainerConstants.def"
+};
+
+#define TEXTURE_ADDRESS_MODE(Val, Enum) Enum = Val,
+enum class TextureAddressMode : uint32_t {
+#include "DXContainerConstants.def"
+};
+
+#define COMPARISON_FUNCTION(Val, Enum) Enum = Val,
+enum class ComparisonFunction : uint32_t {
+#include "DXContainerConstants.def"
+};
+
+#define STATIC_BORDER_COLOR(Val, Enum) Enum = Val,
+enum class StaticBorderColor : uint32_t {
+#include "DXContainerConstants.def"
+};
 namespace v1 {
 
 struct RootSignatureHeader {
@@ -667,6 +687,37 @@ struct DescriptorRange {
 sys::swapByteOrder(OffsetInDescriptorsFromTableStart);
   }
 };
+
+struct StaticSampler {
+  uint32_t Filter;
+  uint32_t AddressU;
+  uint32_t AddressV;
+  uint32_t AddressW;
+  float MipLODBias;
+  uint32_t MaxAnisotropy;
+  uint32_t ComparisonFunc;
+  uint32_t BorderColor;
+  float MinLOD;
+  float MaxLOD;
+  uint32_t ShaderRegister;
+  uint32_t RegisterSpace;
+  uint32_t ShaderVisibility;
+  void swapBytes() {
+sys::swapByteOrder(Filter);
+sys::swapByteOrder(AddressU);
+sys::swapByteOrder(AddressV);
+sys::swapByteOrder(AddressW);
+sys::swapByteOrder(MipLODBias);
+sys::swapByteOrder(MaxAnisotropy);
+sys::swapByteOrder(ComparisonFunc);
+sys::swapByteOrder(BorderColor);
+sys::swapByteOrder(MinLOD);
+sys::swapByteOrder(MaxLOD);
+sys::swapByteOrder(ShaderRegister);
+sys::swapByteOrder(RegisterSpace);
+sys::swapByteOrder(ShaderVisibility);
+  };
+};
 } // namespace v1
 
 namespace v2 {
diff --git a/llvm/include/llvm/BinaryFormat/DXContainerConstants.def b/llvm/include/llvm/BinaryFormat/DXContainerConstants.def
index 5fe7e7c321a33..f25c464d5f9cc 100644
--- a/llvm/include/llvm/BinaryFormat/DXContainerConstants.def
+++ b/llvm/include/llvm/BinaryFormat/DXContainerConstants.def
@@ -129,6 +129,79 @@ SHADER_VISIBILITY(7, Mesh)
 #undef SHADER_VISIBILITY
 #endif // SHADER_VISIBILITY
 
+#ifdef STATIC_SAMPLER_FILTER
+
+STATIC_SAMPLER_FILTER(0, MIN_MAG_MIP_POINT)
+STATIC_SAMPLER_FILTER(0x1, MIN_MAG_POINT_MIP_LINEAR)
+STATIC_SAMPLER_FILTER(0x4, MIN_POINT_MAG_LINEAR_MIP_POINT)
+STATIC_SAMPLER_FILTER(0x5, MIN_POINT_MAG_MIP_LINEAR)
+STATIC_SAMPLER_FILTER(0x10, MIN_LINEAR_MAG_MIP_POINT)
+STATIC_SAMPLER_FILTER(0x11, MIN_LINEAR_MAG_POINT_MIP_LINEAR)
+STATIC_SAMPLER_FILTER(0x14, MIN_MAG_LINEAR_MIP_POINT)
+STATIC_SAMPLER_FILTER(0x15, MIN_MAG_MIP_LINEAR)
+STATIC_SAMPLER_FILTER(0x55, ANISOTROPIC)
+STATIC_SAMPLER_FILTER(0x80, COMPARISON_MIN_MAG_MIP_POINT)
+STATIC_SAMPLER_FILTER(0x81, COMPARISON_MIN_MAG_POINT_MIP_LINEAR)
+STATIC_SAMPLER_FILTER(0x84, COMPARISON_MIN_POINT_MAG_LINEAR_MIP_POINT)
+STATIC_SAMPLER_FILTER(0x85, COMPARISON_MIN_POINT_MAG_MIP_LINEAR)
+STATIC_SAMPLER_FILTER(0x90, COMPARISON_MIN_LINEAR_MAG_MIP_POINT)
+STATIC_SAMPLER_FILTER(0x91, COMPARISON_MIN_LINEAR_MAG_POINT_MIP_LINEAR)
+STATIC_SAMPLER_FILTER(0x94, COMPARISON_MIN_MAG_LINEAR_MIP_POINT)
+STATIC_SAMPLER_FILTER(0x95, COMPARISON_MIN_MAG_MIP_LINEAR)
+STATIC_SAMPLER_FILTER(0xd5, COMPARISON_ANISOTROPIC)
+STATIC_SAMPLER_FILTER(0x100, MINIMUM_MIN_MAG_MIP_POINT)
+STATIC_SAMPLER_FILTER(0x101, MINIMUM_MIN_MAG_POINT_MIP_LINEAR)

[llvm-branch-commits] [llvm] [DirectX] Adding support for static samples is yaml2obj/obj2yaml (PR #139963)

2025-05-14 Thread via llvm-branch-commits


https://github.com/joaosaffran created 
https://github.com/llvm/llvm-project/pull/139963

- Adds support for static samplers in the DXContainer binary format.
- Adds writing logic to mcdxbc.
- Adds reading logic to Object.
- Adds tests.
Closes: [126636](https://github.com/llvm/llvm-project/issues/126636)



  



  

  

  

  

  

  


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)

2025-05-14 Thread Momchil Velikov via llvm-branch-commits



@@ -0,0 +1,304 @@
+//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file implements lowering patterns from vector.contract to
+// SVE I8MM operations.
+//
+//===---
+
+#include "mlir/Dialect/Arith/IR/Arith.h"
+#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h"
+#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h"
+#include "mlir/Dialect/Func/IR/FuncOps.h"
+#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
+#include "mlir/Dialect/Utils/IndexingUtils.h"
+#include "mlir/Dialect/Vector/IR/VectorOps.h"
+#include "mlir/IR/AffineMap.h"
+#include "mlir/IR/PatternMatch.h"
+#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
+
+#include "mlir/Dialect/UB/IR/UBOps.h"
+
+#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm"
+
+using namespace mlir;
+using namespace mlir::arm_sve;
+
+namespace {
+// Check if the given value is a result of the operation `T` (which must be
+// sign- or zero- extend) from i8 to i32. Return the value before the extension.
+template <typename T>
+inline std::enable_if_t<(std::is_base_of_v<arith::ExtSIOp, T> ||
+                         std::is_base_of_v<arith::ExtUIOp, T>),
+                        std::optional<Value>>
+extractExtOperand(Value v, Type i8Ty, Type i32Ty) {
+  auto extOp = dyn_cast_or_null<T>(v.getDefiningOp());
+  if (!extOp)
+    return {};
+
+  auto inOp = extOp.getIn();
+  auto inTy = dyn_cast<VectorType>(inOp.getType());
+  if (!inTy || inTy.getElementType() != i8Ty)
+    return {};
+
+  auto outTy = dyn_cast<VectorType>(extOp.getType());
+  if (!outTy || outTy.getElementType() != i32Ty)
+    return {};
+
+  return inOp;
+}
+
+// Designate the operation (resp. instruction) used to do sub-tile matrix
+// multiplications.
+enum class MMLA {
+  Signed,  // smmla
+  Unsigned,// ummla
+  Mixed,   // usmmla
+  MixedSwapped // usmmla with LHS and RHS swapped
+};
+
+// Create the matrix multply and accumulate operation according to `op`.
+Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc,
+                 mlir::VectorType accType, Value acc, Value lhs, Value rhs) {
+  switch (op) {
+  case MMLA::Signed:
+    return rewriter.create<arm_sve::SmmlaOp>(loc, accType, acc, lhs, rhs);
+  case MMLA::Unsigned:
+    return rewriter.create<arm_sve::UmmlaOp>(loc, accType, acc, lhs, rhs);
+  case MMLA::Mixed:
+    return rewriter.create<arm_sve::UsmmlaOp>(loc, accType, acc, lhs, rhs);
+  case MMLA::MixedSwapped:
+    // The accumulator comes transposed and the result will be transposed
+    // later, so all we have to do here is swap the operands.
+    return rewriter.create<arm_sve::UsmmlaOp>(loc, accType, acc, rhs, lhs);
+  }
+}
+
+class LowerContractionToSVEI8MMPattern

momchil-velikov wrote:

Done.

https://github.com/llvm/llvm-project/pull/135636
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)

2025-05-14 Thread Momchil Velikov via llvm-branch-commits



@@ -0,0 +1,304 @@
+//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file implements lowering patterns from vector.contract to
+// SVE I8MM operations.
+//
+//===---
+
+#include "mlir/Dialect/Arith/IR/Arith.h"
+#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h"
+#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h"
+#include "mlir/Dialect/Func/IR/FuncOps.h"
+#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
+#include "mlir/Dialect/Utils/IndexingUtils.h"
+#include "mlir/Dialect/Vector/IR/VectorOps.h"
+#include "mlir/IR/AffineMap.h"
+#include "mlir/IR/PatternMatch.h"
+#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
+
+#include "mlir/Dialect/UB/IR/UBOps.h"
+
+#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm"
+
+using namespace mlir;
+using namespace mlir::arm_sve;
+
+namespace {
+// Check if the given value is a result of the operation `T` (which must be
+// sign- or zero- extend) from i8 to i32. Return the value before the extension.
+template <typename T>
+inline std::enable_if_t<(std::is_base_of_v<arith::ExtSIOp, T> ||
+                         std::is_base_of_v<arith::ExtUIOp, T>),
+                        std::optional<Value>>
+extractExtOperand(Value v, Type i8Ty, Type i32Ty) {
+  auto extOp = dyn_cast_or_null<T>(v.getDefiningOp());
+  if (!extOp)
+    return {};
+
+  auto inOp = extOp.getIn();
+  auto inTy = dyn_cast<VectorType>(inOp.getType());
+  if (!inTy || inTy.getElementType() != i8Ty)
+    return {};
+
+  auto outTy = dyn_cast<VectorType>(extOp.getType());
+  if (!outTy || outTy.getElementType() != i32Ty)
+    return {};
+
+  return inOp;
+}
+
+// Designate the operation (resp. instruction) used to do sub-tile matrix
+// multiplications.
+enum class MMLA {
+  Signed,  // smmla
+  Unsigned,// ummla
+  Mixed,   // usmmla
+  MixedSwapped // usmmla with LHS and RHS swapped
+};
+
+// Create the matrix multply and accumulate operation according to `op`.
+Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc,
+                 mlir::VectorType accType, Value acc, Value lhs, Value rhs) {
+  switch (op) {
+  case MMLA::Signed:
+    return rewriter.create<arm_sve::SmmlaOp>(loc, accType, acc, lhs, rhs);
+  case MMLA::Unsigned:
+    return rewriter.create<arm_sve::UmmlaOp>(loc, accType, acc, lhs, rhs);
+  case MMLA::Mixed:
+    return rewriter.create<arm_sve::UsmmlaOp>(loc, accType, acc, lhs, rhs);
+  case MMLA::MixedSwapped:
+    // The accumulator comes transposed and the result will be transposed
+    // later, so all we have to do here is swap the operands.
+    return rewriter.create<arm_sve::UsmmlaOp>(loc, accType, acc, rhs, lhs);
+  }
+}
+
+class LowerContractionToSVEI8MMPattern
+    : public OpRewritePattern<vector::ContractionOp> {
+public:
+  using OpRewritePattern::OpRewritePattern;
+  LogicalResult matchAndRewrite(vector::ContractionOp op,
+                                PatternRewriter &rewriter) const override {
+
+    Location loc = op.getLoc();
+    mlir::VectorType lhsType = op.getLhsType();
+    mlir::VectorType rhsType = op.getRhsType();
+
+    // For now handle LHS<Mx8> and RHS<8x[N]> - these are the types we
+    // eventually expect from MMT4D. M and N dimensions must be even and at

momchil-velikov wrote:

Done (in the top-level description).

https://github.com/llvm/llvm-project/pull/135636
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to svusmmla (PR #135634)

2025-05-14 Thread Momchil Velikov via llvm-branch-commits


https://github.com/momchil-velikov updated 
https://github.com/llvm/llvm-project/pull/135634



  



  

  

  

  

  

  


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)

2025-05-14 Thread Momchil Velikov via llvm-branch-commits


https://github.com/momchil-velikov updated 
https://github.com/llvm/llvm-project/pull/135636

>From f397467bc167d94a28a919a45c009a8f08b6351b Mon Sep 17 00:00:00 2001
From: Momchil Velikov 
Date: Tue, 8 Apr 2025 14:43:54 +
Subject: [PATCH 1/2] [MLIR][ArmSVE] Add initial lowering of `vector.contract`
 to SVE `*MMLA` instructions

---
 mlir/include/mlir/Conversion/Passes.td|   4 +
 .../Dialect/ArmSVE/Transforms/Transforms.h|   3 +
 .../Conversion/VectorToLLVM/CMakeLists.txt|   1 +
 .../VectorToLLVM/ConvertVectorToLLVMPass.cpp  |   7 +
 .../LowerContractionToSMMLAPattern.cpp|   5 +-
 .../Dialect/ArmSVE/Transforms/CMakeLists.txt  |   1 +
 .../LowerContractionToSVEI8MMPattern.cpp  | 304 ++
 .../Vector/CPU/ArmSVE/vector-smmla.mlir   |  94 ++
 .../Vector/CPU/ArmSVE/vector-summla.mlir  |  85 +
 .../Vector/CPU/ArmSVE/vector-ummla.mlir   |  94 ++
 .../Vector/CPU/ArmSVE/vector-usmmla.mlir  |  95 ++
 .../CPU/ArmSVE/contraction-smmla-4x8x4.mlir   | 117 +++
 .../ArmSVE/contraction-smmla-8x8x8-vs2.mlir   | 159 +
 .../CPU/ArmSVE/contraction-summla-4x8x4.mlir  | 118 +++
 .../CPU/ArmSVE/contraction-ummla-4x8x4.mlir   | 119 +++
 .../CPU/ArmSVE/contraction-usmmla-4x8x4.mlir  | 117 +++
 16 files changed, 1322 insertions(+), 1 deletion(-)
 create mode 100644 mlir/lib/Dialect/ArmSVE/Transforms/LowerContractionToSVEI8MMPattern.cpp
 create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-smmla.mlir
 create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-summla.mlir
 create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-ummla.mlir
 create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-usmmla.mlir
 create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-4x8x4.mlir
 create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-8x8x8-vs2.mlir
 create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-summla-4x8x4.mlir
 create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-ummla-4x8x4.mlir
 create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-usmmla-4x8x4.mlir

diff --git a/mlir/include/mlir/Conversion/Passes.td b/mlir/include/mlir/Conversion/Passes.td
index 10557658d5d7d..b496ee0114910 100644
--- a/mlir/include/mlir/Conversion/Passes.td
+++ b/mlir/include/mlir/Conversion/Passes.td
@@ -1431,6 +1431,10 @@ def ConvertVectorToLLVMPass : Pass<"convert-vector-to-llvm"> {
"bool", /*default=*/"false",
"Enables the use of ArmSVE dialect while lowering the vector "
"dialect.">,
+Option<"armI8MM", "enable-arm-i8mm",
+   "bool", /*default=*/"false",
+   "Enables the use of Arm FEAT_I8MM instructions while lowering "
+   "the vector dialect.">,
 Option<"x86Vector", "enable-x86vector",
"bool", /*default=*/"false",
"Enables the use of X86Vector dialect while lowering the vector "
diff --git a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h
index 8665c8224cc45..232e2be29e574 100644
--- a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h
+++ b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h
@@ -20,6 +20,9 @@ class RewritePatternSet;
 void populateArmSVELegalizeForLLVMExportPatterns(
 const LLVMTypeConverter &converter, RewritePatternSet &patterns);
 
+void populateLowerContractionToSVEI8MMPatternPatterns(
+RewritePatternSet &patterns);
+
 /// Configure the target to support lowering ArmSVE ops to ops that map to LLVM
 /// intrinsics.
 void configureArmSVELegalizeForExportTarget(LLVMConversionTarget &target);
diff --git a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt
index 330474a718e30..8e2620029c354 100644
--- a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt
+++ b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt
@@ -35,6 +35,7 @@ add_mlir_conversion_library(MLIRVectorToLLVMPass
   MLIRVectorToLLVM
 
   MLIRArmNeonDialect
+  MLIRArmNeonTransforms
   MLIRArmSVEDialect
   MLIRArmSVETransforms
   MLIRAMXDialect
diff --git a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp
index 0ee6dce9ee94b..293e01a5bf4d4 100644
--- a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp
+++ b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp
@@ -14,6 +14,7 @@
 #include "mlir/Dialect/AMX/Transforms.h"
 #include "mlir/Dialect/Arith/IR/Arith.h"
 #include "mlir/Dialect/ArmNeon/ArmNeonDialect.h"
+#include "mlir/Dialect/ArmNeon/Transforms.h"
 #include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h"
 #include "mlir/Dialect/ArmSVE/Transforms/Transforms.h"
 #include "mlir/Dialect/Func/IR/FuncOps.h"
@@ -82,6 +83,12 @@ void ConvertVectorToLLVMPass::runOnOperation() {
 populateVectorStepLoweringPattern

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to svusmmla (PR #135634)

2025-05-14 Thread Momchil Velikov via llvm-branch-commits


https://github.com/momchil-velikov updated 
https://github.com/llvm/llvm-project/pull/135634

>From 528237309c0bfd7bbb51a8fea37b54e07f21ad1d Mon Sep 17 00:00:00 2001
From: Momchil Velikov 
Date: Thu, 10 Apr 2025 14:38:27 +
Subject: [PATCH] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to
 `svusmmla`

---
 mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td | 95 +++
 .../Transforms/LegalizeForLLVMExport.cpp  |  4 +
 .../Dialect/ArmSVE/legalize-for-llvm.mlir | 12 +++
 mlir/test/Dialect/ArmSVE/roundtrip.mlir   | 11 +++
 mlir/test/Target/LLVMIR/arm-sve.mlir  | 12 +++
 5 files changed, 96 insertions(+), 38 deletions(-)

diff --git a/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td b/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td
index 3a990f8464ef8..7385bb73b449a 100644
--- a/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td
+++ b/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td
@@ -147,11 +147,9 @@ class ScalableMaskedIOp,
-   AllTypesMatch<["acc", "dst"]>,
- ]> {
+def SdotOp : ArmSVE_Op<"sdot", [Pure,
+AllTypesMatch<["src1", "src2"]>,
+AllTypesMatch<["acc", "dst"]>]> {
   let summary = "Vector-vector dot product and accumulate op";
   let description = [{
 SDOT: Signed integer addition of dot product.
@@ -178,11 +176,9 @@ def SdotOp : ArmSVE_Op<"sdot",
 "$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)";
 }
 
-def SmmlaOp : ArmSVE_Op<"smmla",
-[Pure,
-AllTypesMatch<["src1", "src2"]>,
-AllTypesMatch<["acc", "dst"]>,
-  ]> {
+def SmmlaOp : ArmSVE_Op<"smmla", [Pure,
+  AllTypesMatch<["src1", "src2"]>,
+  AllTypesMatch<["acc", "dst"]>]> {
   let summary = "Matrix-matrix multiply and accumulate op";
   let description = [{
 SMMLA: Signed integer matrix multiply-accumulate.
@@ -210,11 +206,9 @@ def SmmlaOp : ArmSVE_Op<"smmla",
 "$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)";
 }
 
-def UdotOp : ArmSVE_Op<"udot",
-   [Pure,
-   AllTypesMatch<["src1", "src2"]>,
-   AllTypesMatch<["acc", "dst"]>,
- ]> {
+def UdotOp : ArmSVE_Op<"udot", [Pure,
+AllTypesMatch<["src1", "src2"]>,
+AllTypesMatch<["acc", "dst"]>]> {
   let summary = "Vector-vector dot product and accumulate op";
   let description = [{
 UDOT: Unsigned integer addition of dot product.
@@ -241,11 +235,9 @@ def UdotOp : ArmSVE_Op<"udot",
 "$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)";
 }
 
-def UmmlaOp : ArmSVE_Op<"ummla",
-[Pure,
-AllTypesMatch<["src1", "src2"]>,
-AllTypesMatch<["acc", "dst"]>,
-  ]> {
+def UmmlaOp : ArmSVE_Op<"ummla", [Pure,
+  AllTypesMatch<["src1", "src2"]>,
+  AllTypesMatch<["acc", "dst"]>]> {
   let summary = "Matrix-matrix multiply and accumulate op";
   let description = [{
 UMMLA: Unsigned integer matrix multiply-accumulate.
@@ -273,14 +265,42 @@ def UmmlaOp : ArmSVE_Op<"ummla",
 "$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)";
 }
 
+def UsmmlaOp : ArmSVE_Op<"usmmla", [Pure,
+AllTypesMatch<["src1", "src2"]>,
+AllTypesMatch<["acc", "dst"]>]> {
+  let summary = "Matrix-matrix multiply and accumulate op";
+  let description = [{
+USMMLA: Unsigned by signed integer matrix multiply-accumulate.
+
+The unsigned by signed integer matrix multiply-accumulate operation
+multiplies the 2×8 matrix of unsigned 8-bit integer values held
+the first source vector by the 8×2 matrix of signed 8-bit integer
+values in the second source vector. The resulting 2×2 widened 32-bit
+integer matrix product is then added to the 32-bit integer matrix
+accumulator.
+
+Source:
+https://developer.arm.com/documentation/100987/
+  }];
+  // Supports (vector<16xi8>, vector<16xi8>) -> (vector<4xi32>)
+  let arguments = (ins
+  ScalableVectorOfLengthAndType<[4], [I32]>:$acc,
+  ScalableVectorOfLengthAndType<[16], [I8]>:$src1,
+  ScalableVectorOfLengthAndType<[16], [I8]>:$src2
+  );
+  let results = (outs ScalableVectorOfLengthAndType<[4], [I32]>:$dst);
+  let assemblyFormat =
+"$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)";
+}
+
 class SvboolTypeConstraint<string lhsArg, string rhsArg> : TypesMatchWith<
   "expected corresponding svbool type widened to [16]xi1",
   lhsArg, rhsArg,
   "VectorType(VectorType::Builder(::llvm::cast<VectorType>($_self)).setDim(::llvm::cast<VectorType>($_self).getRank() - 1, 16))">;
 
 def ConvertFromSvboolOp : ArmSVE_Op<"convert_from_svbool",
-[Pure, SvboolTypeConstraint<"result", "source">]>
-{
+
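
The USMMLA description in the patch above (a 2×8 unsigned matrix times an 8×2 signed matrix, accumulated into a 2×2 32-bit matrix) can be modelled with a scalar reference. The sketch below is only an illustration: it assumes each operand is stored as two rows of eight elements, so that the rows of the second operand act as the columns of the 8×2 matrix, and all names (`usmmlaRef`, `summlaDirect`, `transpose`) are hypothetical. It also shows why a signed-by-unsigned product can reuse the same operation with the operands swapped, provided the accumulator goes in transposed and the result is transposed back.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Mat2x2 = std::array<std::array<std::int32_t, 2>, 2>;
using URows = std::array<std::array<std::uint8_t, 8>, 2>; // two unsigned rows
using SRows = std::array<std::array<std::int8_t, 8>, 2>;  // two signed rows

// Scalar model of one unsigned-by-signed tile: acc + U * S^T.
inline Mat2x2 usmmlaRef(Mat2x2 acc, const URows &u, const SRows &s) {
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
      for (int k = 0; k < 8; ++k)
        acc[i][j] += std::int32_t(u[i][k]) * std::int32_t(s[j][k]);
  return acc;
}

// Signed-by-unsigned tile computed directly: acc + S * U^T.
inline Mat2x2 summlaDirect(Mat2x2 acc, const SRows &s, const URows &u) {
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
      for (int k = 0; k < 8; ++k)
        acc[i][j] += std::int32_t(s[i][k]) * std::int32_t(u[j][k]);
  return acc;
}

inline Mat2x2 transpose(const Mat2x2 &m) {
  Mat2x2 t{};
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
      t[i][j] = m[j][i];
  return t;
}
```

Since (S·Uᵀ)ᵀ = U·Sᵀ, `transpose(usmmlaRef(transpose(acc), u, s))` equals `summlaDirect(acc, s, u)` for any inputs.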

[llvm-branch-commits] [llvm] [ObjC] Support objc_claimAutoreleasedReturnValue (PR #138696)

2025-05-14 Thread via llvm-branch-commits


AZero13 wrote:

What did you mean by "// FIXME: do this on ARCRuntimeEntryPoints, and do the
todo above ARCInstKind"?

Do you want the same check done there? How can I help?

https://github.com/llvm/llvm-project/pull/138696
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)

2025-05-14 Thread Momchil Velikov via llvm-branch-commits



@@ -0,0 +1,304 @@
+//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file implements lowering patterns from vector.contract to
+// SVE I8MM operations.
+//
+//===---
+
+#include "mlir/Dialect/Arith/IR/Arith.h"
+#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h"
+#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h"
+#include "mlir/Dialect/Func/IR/FuncOps.h"
+#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
+#include "mlir/Dialect/Utils/IndexingUtils.h"
+#include "mlir/Dialect/Vector/IR/VectorOps.h"
+#include "mlir/IR/AffineMap.h"
+#include "mlir/IR/PatternMatch.h"
+#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
+
+#include "mlir/Dialect/UB/IR/UBOps.h"
+
+#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm"
+
+using namespace mlir;
+using namespace mlir::arm_sve;
+
+namespace {
+// Check if the given value is a result of the operation `T` (which must be
+// sign- or zero- extend) from i8 to i32. Return the value before the
+// extension.
+template <typename T>
+inline std::enable_if_t<(std::is_base_of_v<arith::ExtSIOp, T> ||
+                         std::is_base_of_v<arith::ExtUIOp, T>),
+                        std::optional<Value>>
+extractExtOperand(Value v, Type i8Ty, Type i32Ty) {
+  auto extOp = dyn_cast_or_null<T>(v.getDefiningOp());
+  if (!extOp)
+    return {};
+
+  auto inOp = extOp.getIn();
+  auto inTy = dyn_cast<VectorType>(inOp.getType());
+  if (!inTy || inTy.getElementType() != i8Ty)
+    return {};
+
+  auto outTy = dyn_cast<VectorType>(extOp.getType());
+  if (!outTy || outTy.getElementType() != i32Ty)
+    return {};
+
+  return inOp;
+}
+
+// Designate the operation (resp. instruction) used to do sub-tile matrix
+// multiplications.
+enum class MMLA {
+  Signed,  // smmla
+  Unsigned,// ummla
+  Mixed,   // usmmla
+  MixedSwapped // usmmla with LHS and RHS swapped
+};
+
+// Create the matrix multiply and accumulate operation according to `op`.
+Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc,
+                 mlir::VectorType accType, Value acc, Value lhs, Value rhs) {
+  switch (op) {
+  case MMLA::Signed:
+    return rewriter.create<arm_sve::SmmlaOp>(loc, accType, acc, lhs, rhs);
+  case MMLA::Unsigned:
+    return rewriter.create<arm_sve::UmmlaOp>(loc, accType, acc, lhs, rhs);
+  case MMLA::Mixed:
+    return rewriter.create<arm_sve::UsmmlaOp>(loc, accType, acc, lhs, rhs);
+  case MMLA::MixedSwapped:
+    // The accumulator comes transposed and the result will be transposed
+    // later, so all we have to do here is swap the operands.
+    return rewriter.create<arm_sve::UsmmlaOp>(loc, accType, acc, rhs, lhs);
+  }
+}
+
+class LowerContractionToSVEI8MMPattern
+    : public OpRewritePattern<vector::ContractionOp> {
+public:
+  using OpRewritePattern::OpRewritePattern;
+  LogicalResult matchAndRewrite(vector::ContractionOp op,
+                                PatternRewriter &rewriter) const override {
+
+    Location loc = op.getLoc();
+    mlir::VectorType lhsType = op.getLhsType();
+    mlir::VectorType rhsType = op.getRhsType();
+
+    // For now handle LHS<Mx8> and RHS<8x[N]> - these are the types we
+    // eventually expect from MMT4D. M and N dimensions must be even and at
+    // least 2.
+if (!lhsType.hasRank() || lhsType.getRank() != 2 || !rhsType.hasRank() ||
+rhsType.getRank() != 2)
+  return failure();
+
+if (lhsType.isScalable() || !rhsType.isScalable())
+  return failure();
+
+// M, N, and K are the conventional names for matrix dimensions in the
+// context of matrix multiplication.
+auto M = lhsType.getDimSize(0);
+auto N = rhsType.getDimSize(0);
+auto K = rhsType.getDimSize(1);
+
+if (lhsType.getDimSize(1) != K || K != 8 || M < 2 || M % 2 != 0 || N < 2 ||
+N % 2 != 0 || !rhsType.getScalableDims()[0])
+  return failure();
+
+// Check permutation maps. For now only accept
+//   lhs: (d0, d1, d2) -> (d0, d2)
+//   rhs: (d0, d1, d2) -> (d1, d2)
+//   acc: (d0, d1, d2) -> (d0, d1)
+// Note: RHS is transposed.
+if (op.getIndexingMapsArray()[0] !=
+AffineMap::getMultiDimMapWithTargets(3, ArrayRef{0u, 2u},
+ op.getContext()) ||
+op.getIndexingMapsArray()[1] !=
+AffineMap::getMultiDimMapWithTargets(3, ArrayRef{1u, 2u},
+ op.getContext()) ||
+op.getIndexingMapsArray()[2] !=
+AffineMap::getMultiDimMapWithTargets(3, ArrayRef{0u, 1u},
+ op.getContext()))
+  return failure();
+
+// Check iterator types for matrix multiplication.
+auto itTypes = op.getIteratorTypesArray();
+if (itTypes.size() != 3 || itTypes[0] != vector::IteratorType::parallel ||
+itTypes[1] != vector::IteratorType

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)

2025-05-14 Thread Momchil Velikov via llvm-branch-commits


https://github.com/momchil-velikov updated 
https://github.com/llvm/llvm-project/pull/135636

>From f397467bc167d94a28a919a45c009a8f08b6351b Mon Sep 17 00:00:00 2001
From: Momchil Velikov 
Date: Tue, 8 Apr 2025 14:43:54 +
Subject: [PATCH 1/2] [MLIR][ArmSVE] Add initial lowering of `vector.contract`
 to SVE `*MMLA` instructions

---
 mlir/include/mlir/Conversion/Passes.td|   4 +
 .../Dialect/ArmSVE/Transforms/Transforms.h|   3 +
 .../Conversion/VectorToLLVM/CMakeLists.txt|   1 +
 .../VectorToLLVM/ConvertVectorToLLVMPass.cpp  |   7 +
 .../LowerContractionToSMMLAPattern.cpp|   5 +-
 .../Dialect/ArmSVE/Transforms/CMakeLists.txt  |   1 +
 .../LowerContractionToSVEI8MMPattern.cpp  | 304 ++
 .../Vector/CPU/ArmSVE/vector-smmla.mlir   |  94 ++
 .../Vector/CPU/ArmSVE/vector-summla.mlir  |  85 +
 .../Vector/CPU/ArmSVE/vector-ummla.mlir   |  94 ++
 .../Vector/CPU/ArmSVE/vector-usmmla.mlir  |  95 ++
 .../CPU/ArmSVE/contraction-smmla-4x8x4.mlir   | 117 +++
 .../ArmSVE/contraction-smmla-8x8x8-vs2.mlir   | 159 +
 .../CPU/ArmSVE/contraction-summla-4x8x4.mlir  | 118 +++
 .../CPU/ArmSVE/contraction-ummla-4x8x4.mlir   | 119 +++
 .../CPU/ArmSVE/contraction-usmmla-4x8x4.mlir  | 117 +++
 16 files changed, 1322 insertions(+), 1 deletion(-)
 create mode 100644 
mlir/lib/Dialect/ArmSVE/Transforms/LowerContractionToSVEI8MMPattern.cpp
 create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-smmla.mlir
 create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-summla.mlir
 create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-ummla.mlir
 create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-usmmla.mlir
 create mode 100644 
mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-4x8x4.mlir
 create mode 100644 
mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-8x8x8-vs2.mlir
 create mode 100644 
mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-summla-4x8x4.mlir
 create mode 100644 
mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-ummla-4x8x4.mlir
 create mode 100644 
mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-usmmla-4x8x4.mlir

diff --git a/mlir/include/mlir/Conversion/Passes.td 
b/mlir/include/mlir/Conversion/Passes.td
index 10557658d5d7d..b496ee0114910 100644
--- a/mlir/include/mlir/Conversion/Passes.td
+++ b/mlir/include/mlir/Conversion/Passes.td
@@ -1431,6 +1431,10 @@ def ConvertVectorToLLVMPass : 
Pass<"convert-vector-to-llvm"> {
"bool", /*default=*/"false",
"Enables the use of ArmSVE dialect while lowering the vector "
"dialect.">,
+Option<"armI8MM", "enable-arm-i8mm",
+   "bool", /*default=*/"false",
+   "Enables the use of Arm FEAT_I8MM instructions while lowering "
+   "the vector dialect.">,
 Option<"x86Vector", "enable-x86vector",
"bool", /*default=*/"false",
"Enables the use of X86Vector dialect while lowering the vector "
diff --git a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h 
b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h
index 8665c8224cc45..232e2be29e574 100644
--- a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h
+++ b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h
@@ -20,6 +20,9 @@ class RewritePatternSet;
 void populateArmSVELegalizeForLLVMExportPatterns(
 const LLVMTypeConverter &converter, RewritePatternSet &patterns);
 
+void populateLowerContractionToSVEI8MMPatternPatterns(
+RewritePatternSet &patterns);
+
 /// Configure the target to support lowering ArmSVE ops to ops that map to LLVM
 /// intrinsics.
 void configureArmSVELegalizeForExportTarget(LLVMConversionTarget &target);
diff --git a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt 
b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt
index 330474a718e30..8e2620029c354 100644
--- a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt
+++ b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt
@@ -35,6 +35,7 @@ add_mlir_conversion_library(MLIRVectorToLLVMPass
   MLIRVectorToLLVM
 
   MLIRArmNeonDialect
+  MLIRArmNeonTransforms
   MLIRArmSVEDialect
   MLIRArmSVETransforms
   MLIRAMXDialect
diff --git a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp 
b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp
index 0ee6dce9ee94b..293e01a5bf4d4 100644
--- a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp
+++ b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp
@@ -14,6 +14,7 @@
 #include "mlir/Dialect/AMX/Transforms.h"
 #include "mlir/Dialect/Arith/IR/Arith.h"
 #include "mlir/Dialect/ArmNeon/ArmNeonDialect.h"
+#include "mlir/Dialect/ArmNeon/Transforms.h"
 #include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h"
 #include "mlir/Dialect/ArmSVE/Transforms/Transforms.h"
 #include "mlir/Dialect/Func/IR/FuncOps.h"
@@ -82,6 +83,12 @@ void ConvertVectorToLLVMPass::runOnOperation() {
 populateVectorStepLoweringPattern

[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)

2025-05-14 Thread Amir Ayupov via llvm-branch-commits


aaupov wrote:

This approach doesn't solve the problem when nm is symlinked to llvm-nm,
which doesn't have the flag. Abandoning in favor of #139953.

https://github.com/llvm/llvm-project/pull/135867

[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)

2025-05-14 Thread Amir Ayupov via llvm-branch-commits


https://github.com/aaupov closed 
https://github.com/llvm/llvm-project/pull/135867

[llvm-branch-commits] [llvm] IR: Remove redundant UseList check in addUse (PR #138676)

2025-05-14 Thread Nikita Popov via llvm-branch-commits


nikic wrote:

I think this may have been noise. I reran this and there are no differences 
over the significance threshold: 
https://llvm-compile-time-tracker.com/compare.php?from=6c1bb48cc45396894597c8cb897c31205d1bdeb6&to=1837fe71fcfb4363fd2b66cdb9ff6a82b3f380fb&stat=instructions:u

https://github.com/llvm/llvm-project/pull/138676

[llvm-branch-commits] [llvm] IR: Remove redundant UseList check in addUse (PR #138676)

2025-05-14 Thread Nikita Popov via llvm-branch-commits


https://github.com/nikic approved this pull request.


https://github.com/llvm/llvm-project/pull/138676

[llvm-branch-commits] [llvm] release/20.x: [LoongArch] Fix fp_to_uint/fp_to_sint conversion errors for lasx (#137129) (PR #139851)

2025-05-14 Thread Lu Weining via llvm-branch-commits


SixWeining wrote:

This PR fixes https://github.com/llvm/llvm-project/issues/136971.

https://github.com/llvm/llvm-project/pull/139851

[llvm-branch-commits] [clang] [Clang][Backport] Demote mixed enumeration arithmetic error to a warning (#131811) (PR #139396)

2025-05-14 Thread Aaron Ballman via llvm-branch-commits



@@ -7567,9 +7567,13 @@ def warn_arith_conv_mixed_enum_types_cxx20 : Warning<
   "%sub{select_arith_conv_kind}0 "
   "different enumeration types%diff{ ($ and $)|}1,2 is deprecated">,
   InGroup;
-def err_conv_mixed_enum_types_cxx26 : Error<
+
+def err_conv_mixed_enum_types: Error <
   "invalid %sub{select_arith_conv_kind}0 "
   "different enumeration types%diff{ ($ and $)|}1,2">;
+def warn_conv_mixed_enum_types_cxx26 : Warning <
+  err_conv_mixed_enum_types.Summary>,
+  InGroup, DefaultError;

AaronBallman wrote:

I don't disagree (it does add a new enum value), but I am starting to think 
this ABI requirement is onerous enough to be worth rethinking the scope of it. 
The point to the ABI requirement is that you should be able to use Clang as a 
library interchangeably between the dot releases. The platform calling 
convention ABI does not change when an enumeration gains a new enumerator, nor 
does name mangling behavior change. In fact, adding an enumerator can only 
increase the number of valid integer values that can be represented by the 
enumeration, which means the addition won't introduce new UB but could remove 
UB by widening the range of valid values. So I think this kind of change isn't 
introducing a kind of ABI break we need to avoid.

Or am I missing something?
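
To make the argument above concrete, here is a minimal sketch. The enumerations are hypothetical and not taken from Clang; it assumes the new enumerator is appended at the end, and the size and underlying-type checks hold on mainstream compilers rather than being guaranteed by the standard for unscoped enumerations.

```cpp
#include <type_traits>

// Hypothetical enumerations, not taken from Clang: `before` models an
// enumeration as shipped in one dot release, `after` models the same
// enumeration with one enumerator appended in the next release.
namespace before {
enum DiagKind { First, Second };
}
namespace after {
enum DiagKind { First, Second, Third };
}

// Existing enumerators keep their values when a new one is appended.
static_assert(static_cast<int>(before::First) ==
              static_cast<int>(after::First));
static_assert(static_cast<int>(before::Second) ==
              static_cast<int>(after::Second));

// The object representation does not change. For an unscoped enumeration
// without a fixed underlying type this is implementation-defined, so treat
// these two checks as an assumption about mainstream compilers.
static_assert(sizeof(before::DiagKind) == sizeof(after::DiagKind));
static_assert(std::is_same_v<std::underlying_type_t<before::DiagKind>,
                             std::underlying_type_t<after::DiagKind>>);
```

The range of valid values grows from 0..1 to 0..3, so a value that was out of range for `before::DiagKind` can become valid for `after::DiagKind`, but not the other way round.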

https://github.com/llvm/llvm-project/pull/139396

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)

2025-05-14 Thread Momchil Velikov via llvm-branch-commits



@@ -0,0 +1,304 @@
+// For now handle LHS<Mx8> and RHS<8x[N]> - these are the types we
+// eventually expect from MMT4D. M and N dimensions must be even and at

momchil-velikov wrote:

There's no dependency on MMT4D; this comment merely gives the rationale for
choosing these particular operand shapes.
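
For reference, the static shape constraints the pattern checks (K equal to 8 on both operands, M and N even and at least 2) can be sketched as a standalone predicate. `isSupportedI8MMShape` is a hypothetical helper, not part of the patch, and it deliberately leaves out the scalability checks (LHS fixed-size, leading RHS dimension scalable).

```cpp
#include <cstdint>

// Hypothetical helper, not part of the patch: mirrors only the static
// dimension checks from matchAndRewrite. `m` is the LHS row count, `n` the
// (unscaled) RHS row count, `lhsK` and `rhsK` the reduction dimensions.
constexpr bool isSupportedI8MMShape(std::int64_t m, std::int64_t n,
                                    std::int64_t lhsK, std::int64_t rhsK) {
  // Both operands must agree on K, and K must be 8 (eight i8 values per
  // row of a 2x8 sub-tile).
  if (lhsK != rhsK || rhsK != 8)
    return false;
  // M and N must be even and at least 2, because each MMLA operation
  // consumes two rows from each operand.
  if (m < 2 || m % 2 != 0)
    return false;
  if (n < 2 || n % 2 != 0)
    return false;
  return true;
}
```

Under these assumptions, `isSupportedI8MMShape(4, 4, 8, 8)` accepts a 4x8 LHS with an RHS of 4 (unscaled) rows by 8 columns, while an odd M or a K other than 8 is rejected.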

https://github.com/llvm/llvm-project/pull/135636

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)

2025-05-14 Thread Momchil Velikov via llvm-branch-commits



@@ -0,0 +1,304 @@
+// For now handle LHS<Mx8> and RHS<8x[N]> - these are the types we
+// eventually expect from MMT4D. M and N dimensions must be even and at
+// least 2.
+if (!lhsType.hasRank() || lhsType.getRank() != 2 || !rhsType.hasRank() ||

momchil-velikov wrote:

Done

https://github.com/llvm/llvm-project/pull/135636

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)

2025-05-14 Thread Momchil Velikov via llvm-branch-commits



@@ -0,0 +1,304 @@
+//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file implements lowering patterns from vector.contract to
+// SVE I8MM operations.

momchil-velikov wrote:

The `vector.contract` implicitly sign-extends its operands, so it does not need
to be accompanied by explicit extend operations. I'll add code to handle this
case too.

https://github.com/llvm/llvm-project/pull/135636

[llvm-branch-commits] [clang] [llvm] Enable fexec-charset option (PR #138895)

2025-05-14 Thread Abhina Sree via llvm-branch-commits



@@ -0,0 +1,36 @@
+//===--- clang/Lex/LiteralConverter.h - Translator for Literals -*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CLANG_LEX_LITERALCONVERTER_H
+#define LLVM_CLANG_LEX_LITERALCONVERTER_H
+
+#include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/LangOptions.h"
+#include "clang/Basic/TargetInfo.h"
+#include "llvm/ADT/StringMap.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/CharSet.h"
+
+enum ConversionAction { NoConversion, ToSystemCharset, ToExecCharset };
+
+class LiteralConverter {
+  llvm::StringRef InternalCharset;
+  llvm::StringRef SystemCharset;
+  llvm::StringRef ExecCharset;
+  llvm::StringMap CharsetConverters;

abhina-sree wrote:

That is true, but I guess the tradeoff is that you would need to recreate the
converters more often.
Pragma converts are stackable, so it's possible to do the following:

```
pragma convert ("IBM-1047")

pragma convert("UTF-8")

pragma convert(pop)

pragma convert(pop)
```
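
A rough sketch of the stack idea, where reading the active charset is a single indexed load. Everything here is hypothetical: `CharsetStack`, `activeAfter`, the fixed capacity of 16, and the "UTF-8" default are illustration only and do not come from the PR or from any pragma design.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical sketch, none of these names come from the PR. The stack
// holds charset names only; a real implementation would hold converters.
class CharsetStack {
  std::array<std::string_view, 16> Entries{};
  std::size_t Depth = 0;

public:
  // The bottom entry is the default charset and is never popped.
  constexpr explicit CharsetStack(std::string_view Default) {
    Entries[0] = Default;
    Depth = 1;
  }

  // Models `pragma convert("<charset>")`. Returns false if the stack is full.
  constexpr bool push(std::string_view Charset) {
    if (Depth == Entries.size())
      return false;
    Entries[Depth++] = Charset;
    return true;
  }

  // Models `pragma convert(pop)`. Returns false if only the default remains.
  constexpr bool pop() {
    if (Depth <= 1)
      return false;
    --Depth;
    return true;
  }

  // The active charset is always the top of the stack.
  constexpr std::string_view active() const { return Entries[Depth - 1]; }
};

// Replays the first `Steps` directives of the four-directive sequence
// above, starting from an assumed default of "UTF-8".
constexpr std::string_view activeAfter(int Steps) {
  CharsetStack S("UTF-8");
  if (Steps >= 1)
    S.push("IBM-1047");
  if (Steps >= 2)
    S.push("UTF-8");
  if (Steps >= 3)
    S.pop();
  if (Steps >= 4)
    S.pop();
  return S.active();
}
```

With this model, the active charset after each of the four directives is IBM-1047, UTF-8, IBM-1047, and finally the default again.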

https://github.com/llvm/llvm-project/pull/138895

[llvm-branch-commits] [libcxx] [libc++] Implement std::move_only_function (P0288R9) (PR #94670)

2025-05-14 Thread A. Jiang via llvm-branch-commits


frederick-vs-ja wrote:

FYI [P2548R2](https://wg21.link/p2548r2) added relaxing wording to 
[[func.wrap.general]](https://eel.is/c++draft/func.wrap.general) to allow 
unwrapping in construction. I think we should implement the allowance for 
`move_only_function` in C++23 as a DR.
We can't unwrap in `function` construction though, IIUC, because its target 
object is observable. But we can unwrap `function` when constructing a 
`move_only_function`.

https://github.com/llvm/llvm-project/pull/94670

[llvm-branch-commits] [llvm] release/20.x: [LoongArch] Fix fp_to_uint/fp_to_sint conversion errors for lasx (#137129) (PR #139851)

2025-05-14 Thread via llvm-branch-commits


https://github.com/heiher approved this pull request.


https://github.com/llvm/llvm-project/pull/139851

[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)

2025-05-14 Thread Andrzej Warzyński via llvm-branch-commits



@@ -0,0 +1,304 @@
+// For now handle LHS<Mx8> and RHS<8x[N]> - these are the types we
+// eventually expect from MMT4D. M and N dimensions must be even and at

banach-space wrote:

Perhaps just expand this comment a bit (e.g. by noting that MMT4D is the main 
use-case ATM)?

https://github.com/llvm/llvm-project/pull/135636

[llvm-branch-commits] [llvm] [X86] Remove extra MOV after widening atomic load (PR #138635)

2025-05-14 Thread via llvm-branch-commits



@@ -1200,6 +1200,13 @@ def : Pat<(i16 (atomic_load_nonext_16 addr:$src)), 
(MOV16rm addr:$src)>;
 def : Pat<(i32 (atomic_load_nonext_32 addr:$src)), (MOV32rm addr:$src)>;
 def : Pat<(i64 (atomic_load_nonext_64 addr:$src)), (MOV64rm addr:$src)>;
 
+def : Pat<(v4i32 (scalar_to_vector (i32 (anyext (i16 (atomic_load_16 
addr:$src)),
+   (MOVDI2PDIrm addr:$src)>;   // load atomic <2 x i8>

jofrn wrote:

Ok. Thanks!

Since you've already made the commits, I'll interleave the SSE/AVX updates
throughout the series rather than opening a new PR.

https://github.com/llvm/llvm-project/pull/138635

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: prevent false positives due to jump tables (PR #138884)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/138884

>From eae2c10ca4d3024862eba06acbb073244ac350e9 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 6 May 2025 11:31:03 +0300
Subject: [PATCH] [BOLT] Gadget scanner: prevent false positives due to jump
 tables

As part of PAuth hardening, the AArch64 LLVM backend can use a special
BR_JumpTable pseudo (enabled by the -faarch64-jump-table-hardening
Clang option), which is expanded in the AsmPrinter into a contiguous
sequence without unsafe instructions in the middle.

This commit adds another target-specific callback to MCPlusBuilder
to make it possible to inhibit false positives for known-safe jump
table dispatch sequences. Without special handling, the branch
instruction is likely to be reported as a non-protected call (as its
destination is not produced by an auth instruction, PC-relative address
materialization, etc.) and possibly as a tail call performed with an
unsafe link register (as detecting whether the branch instruction
is a tail call relies on a heuristic).

For now, only the specific instruction sequence used by the AArch64
LLVM backend is matched.
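
To illustrate the predecessor lookup that this patch adds for the CFG
case (getSinglePredecessor), here is a minimal standalone sketch. ToyBlock
and ToyRef are hypothetical stand-ins for BinaryBasicBlock and
MCInstReference and are not part of BOLT; only the control flow mirrors
the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a basic block: instruction IDs in program
// order plus the list of CFG predecessors.
struct ToyBlock {
  std::vector<int> Insts;
  std::vector<const ToyBlock *> Preds;
};

// Hypothetical stand-in for an instruction reference inside a block.
struct ToyRef {
  const ToyBlock *BB;
  std::size_t Index;
};

// Returns the only possible preceding instruction, or std::nullopt if
// there are zero or several candidates.
std::optional<ToyRef> getSinglePredecessor(const ToyRef &Ref) {
  // Not the first instruction: the predecessor is in the same block.
  if (Ref.Index != 0)
    return ToyRef{Ref.BB, Ref.Index - 1};

  // First instruction: cross the block boundary only if the predecessor
  // block is unique.
  if (Ref.BB->Preds.size() != 1)
    return std::nullopt;

  const ToyBlock *Pred = Ref.BB->Preds.front();
  assert(!Pred->Insts.empty() && "Empty blocks are not handled here");
  return ToyRef{Pred, Pred->Insts.size() - 1};
}
```

A matcher for a dispatch sequence can walk backwards from the branch with
repeated calls and give up as soon as std::nullopt is returned.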
---
 bolt/include/bolt/Core/MCInstUtils.h  |   9 +
 bolt/include/bolt/Core/MCPlusBuilder.h|  14 +
 bolt/lib/Core/MCInstUtils.cpp |  20 +
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  10 +
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |  73 ++
 .../AArch64/gs-pauth-jump-table.s | 703 ++
 6 files changed, 829 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-jump-table.s

diff --git a/bolt/include/bolt/Core/MCInstUtils.h 
b/bolt/include/bolt/Core/MCInstUtils.h
index b495eb8ef5eec..5dd0aaa48d6e7 100644
--- a/bolt/include/bolt/Core/MCInstUtils.h
+++ b/bolt/include/bolt/Core/MCInstUtils.h
@@ -158,6 +158,15 @@ class MCInstReference {
 return nullptr;
   }
 
+  /// Returns the only preceding instruction, or std::nullopt if multiple or no
+  /// predecessors are possible.
+  ///
+  /// If CFG information is available, basic block boundary can be crossed,
+  /// provided there is exactly one predecessor. If CFG is not available, the
+  /// preceding instruction in the offset order is returned, unless this is the
+  /// first instruction of the function.
+  std::optional<MCInstReference> getSinglePredecessor();
+
   raw_ostream &print(raw_ostream &OS) const;
 };
 
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h 
b/bolt/include/bolt/Core/MCPlusBuilder.h
index 87de6754017db..eb93d7de7fee9 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -14,6 +14,7 @@
 #ifndef BOLT_CORE_MCPLUSBUILDER_H
 #define BOLT_CORE_MCPLUSBUILDER_H
 
+#include "bolt/Core/MCInstUtils.h"
 #include "bolt/Core/MCPlus.h"
 #include "bolt/Core/Relocation.h"
 #include "llvm/ADT/ArrayRef.h"
@@ -699,6 +700,19 @@ class MCPlusBuilder {
 return std::nullopt;
   }
 
+  /// Tests if BranchInst corresponds to an instruction sequence which is known
+  /// to be a safe dispatch via jump table.
+  ///
+  /// The target can decide which instruction sequences to consider "safe" from
+  /// the Pointer Authentication point of view, such as any jump table dispatch
+  /// sequence without function calls inside, any sequence which is contiguous,
+  /// or only some specific well-known sequences.
+  virtual bool
+  isSafeJumpTableBranchForPtrAuth(MCInstReference BranchInst) const {
+llvm_unreachable("not implemented");
+return false;
+  }
+
   virtual bool isTerminator(const MCInst &Inst) const;
 
   virtual bool isNoop(const MCInst &Inst) const {
diff --git a/bolt/lib/Core/MCInstUtils.cpp b/bolt/lib/Core/MCInstUtils.cpp
index 40f6edd59135c..b7c6d898988af 100644
--- a/bolt/lib/Core/MCInstUtils.cpp
+++ b/bolt/lib/Core/MCInstUtils.cpp
@@ -55,3 +55,23 @@ raw_ostream &MCInstReference::print(raw_ostream &OS) const {
   OS << ">";
   return OS;
 }
+
+std::optional<MCInstReference> MCInstReference::getSinglePredecessor() {
+  if (const RefInBB *Ref = tryGetRefInBB()) {
+if (Ref->It != Ref->BB->begin())
+  return MCInstReference(Ref->BB, &*std::prev(Ref->It));
+
+if (Ref->BB->pred_size() != 1)
+  return std::nullopt;
+
+BinaryBasicBlock *PredBB = *Ref->BB->pred_begin();
+assert(!PredBB->empty() && "Empty basic blocks are not supported yet");
+return MCInstReference(PredBB, &*PredBB->rbegin());
+  }
+
+  const RefInBF &Ref = getRefInBF();
+  if (Ref.It == Ref.BF->instrs().begin())
+return std::nullopt;
+
+  return MCInstReference(Ref.BF, std::prev(Ref.It));
+}
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 10a3f9716c201..e96ec5996984d 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1329,6 +1329,11 @@ shouldReportUnsafeTailCall(const BinaryContext &BC, 
const BinaryFunction &BF,
 return std::nullopt;
   }
 
+  if (BC.MIB->isSafeJumpTableBranchForPtrAuth(Inst)) {
+LL

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: optionally assume auth traps on failure (PR #139778)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/139778

>From 3dd903c3143f03e0aefb26d0349cf746b8169357 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 13 May 2025 19:50:41 +0300
Subject: [PATCH] [BOLT] Gadget scanner: optionally assume auth traps on
 failure

On AArch64 it is possible for an auth instruction to either return an
invalid address value on failure (without FEAT_FPAC) or generate an
error (with FEAT_FPAC). It thus may be possible to never emit explicit
pointer checks, if the target CPU is known to support FEAT_FPAC.

This commit implements an --auth-traps-on-failure command line option,
which essentially makes "safe-to-dereference" and "trusted" register
properties identical and disables scanning for authentication oracles
completely.
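
As a rough illustration of what "making the two properties identical"
means, the sketch below models the per-register state with two bitsets.
ToyState and applyAuth are hypothetical names used only for this example;
the real analysis in PAuthGadgetScanner.cpp tracks considerably more state.

```cpp
#include <bitset>
#include <cstddef>

constexpr std::size_t NumRegs = 32;

// Hypothetical, simplified per-register state.
struct ToyState {
  std::bitset<NumRegs> SafeToDeref; // dereferencing faults if auth failed
  std::bitset<NumRegs> Trusted;     // known to hold a successfully
                                    // authenticated pointer
};

// Models the effect of an authentication instruction writing Reg.
// Without FEAT_FPAC a failed auth yields an invalid address, so the result
// is only safe to dereference until it is checked explicitly. If auth is
// assumed to trap on failure, the result can be treated as trusted at once.
void applyAuth(ToyState &S, unsigned Reg, bool AuthTrapsOnFailure) {
  S.SafeToDeref.set(Reg);
  if (AuthTrapsOnFailure)
    S.Trusted.set(Reg);
}
```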
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++
 .../binary-analysis/AArch64/cmdline-args.test |   1 +
 .../AArch64/gs-pauth-authentication-oracles.s |   6 +-
 .../binary-analysis/AArch64/gs-pauth-calls.s  |   5 +-
 .../AArch64/gs-pauth-debug-output.s   | 177 ++---
 .../AArch64/gs-pauth-jump-table.s |   6 +-
 .../AArch64/gs-pauth-signing-oracles.s|  55 +++---
 .../AArch64/gs-pauth-tail-calls.s | 184 +-
 8 files changed, 318 insertions(+), 228 deletions(-)

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index e96ec5996984d..d591bd7831fc6 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -14,6 +14,7 @@
 #include "bolt/Passes/PAuthGadgetScanner.h"
 #include "bolt/Core/ParallelUtilities.h"
 #include "bolt/Passes/DataflowAnalysis.h"
+#include "bolt/Utils/CommandLineOpts.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/MC/MCInst.h"
@@ -26,6 +27,11 @@ namespace llvm {
 namespace bolt {
 namespace PAuthGadgetScanner {
 
+static cl::opt<bool> AuthTrapsOnFailure(
+"auth-traps-on-failure",
+cl::desc("Assume authentication instructions always trap on failure"),
+cl::cat(opts::BinaryAnalysisCategory));
+
 [[maybe_unused]] static void traceInst(const BinaryContext &BC, StringRef 
Label,
const MCInst &MI) {
   dbgs() << "  " << Label << ": ";
@@ -365,6 +371,34 @@ class SrcSafetyAnalysis {
 return Clobbered;
   }
 
+  std::optional getRegMadeTrustedByChecking(const MCInst &Inst,
+   SrcState Cur) const {
+// This function cannot return multiple registers. This is never the case
+// on AArch64.
+std::optional RegCheckedByInst =
+BC.MIB->getAuthCheckedReg(Inst, /*MayOverwrite=*/false);
+if (RegCheckedByInst && Cur.SafeToDerefRegs[*RegCheckedByInst])
+  return *RegCheckedByInst;
+
+auto It = CheckerSequenceInfo.find(&Inst);
+if (It == CheckerSequenceInfo.end())
+  return std::nullopt;
+
+MCPhysReg RegCheckedBySequence = It->second.first;
+const MCInst *FirstCheckerInst = It->second.second;
+
+// FirstCheckerInst should belong to the same basic block (see the
+// assertion in DataflowSrcSafetyAnalysis::run()), meaning it was
+// deterministically processed a few steps before this instruction.
+const SrcState &StateBeforeChecker = getStateBefore(*FirstCheckerInst);
+
+// The sequence checks the register, but it should be authenticated before.
+if (!StateBeforeChecker.SafeToDerefRegs[RegCheckedBySequence])
+  return std::nullopt;
+
+return RegCheckedBySequence;
+  }
+
   // Returns all registers that can be treated as if they are written by an
   // authentication instruction.
   SmallVector getRegsMadeSafeToDeref(const MCInst &Point,
@@ -387,18 +421,38 @@ class SrcSafetyAnalysis {
 Regs.push_back(DstAndSrc->first);
 }
 
+// Make sure explicit checker sequence keeps register safe-to-dereference
+// when the register would be clobbered according to the regular rules:
+//
+//; LR is safe to dereference here
+//mov   x16, x30  ; start of the sequence, LR is s-t-d right before
+//xpaclri ; clobbers LR, LR is not safe anymore
+//cmp   x30, x16
+//b.eq  1f; end of the sequence: LR is marked as trusted
+//brk   0x1234
+//  1:
+//; at this point LR would be marked as trusted,
+//; but not safe-to-dereference
+//
+// or even just
+//
+//; X1 is safe to dereference here
+//ldr x0, [x1, #8]!
+//; X1 is trusted here, but it was clobbered due to address write-back
+if (auto CheckedReg = getRegMadeTrustedByChecking(Point, Cur))
+  Regs.push_back(*CheckedReg);
+
 return Regs;
   }
 
   // Returns all registers made trusted by this instruction.
   SmallVector getRegsMadeTrusted(const MCInst &Point,
 const SrcState &Cur) const {
+assert(!AuthTrapsOnFailure &

[llvm-branch-commits] [llvm] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) (PR #138883)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/138883

>From 1c135a144d7f21e05c3598a992baa170cdde7950 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Wed, 7 May 2025 16:42:00 +0300
Subject: [PATCH] [BOLT] Introduce helpers to match `MCInst`s one at a time
 (NFC)

Introduce matchInst helper function to capture and/or match the operands
of MCInst. Unlike the existing `MCPlusBuilder::MCInstMatcher` machinery,
matchInst is intended for use cases where precise control over the
instruction order is required. For example, when validating PtrAuth
hardening, all registers are usually considered unsafe after a function
call, even though callee-saved registers should preserve their old
values *under normal operation*.
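
The capture-then-match idea can be tried outside of BOLT with the
following standalone sketch. ToyInst and Capture are hypothetical
simplifications (operands are plain integers, and there is one matcher
class instead of Reg and Imm); only the remember/restore behaviour follows
the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical instruction: an opcode plus integer operands.
struct ToyInst {
  unsigned Opcode;
  std::vector<int> Operands;
};

// Captures an operand on first use and matches against it afterwards.
// Members are mutable so that temporaries such as Capture(0) can be passed
// by const reference.
class Capture {
  mutable std::optional<int> Value;
  mutable std::optional<int> Saved;

public:
  Capture(std::optional<int> ToMatch = std::nullopt) : Value(ToMatch) {}

  void remember() const { Saved = Value; }
  void restore() const { Value = Saved; }

  bool matches(int Op) const {
    bool Ok = !Value || *Value == Op;
    Value = Op; // undone by restore() if the whole match fails
    return Ok;
  }

  int get() const {
    assert(Value.has_value());
    return *Value;
  }
};

// Matches the opcode and a prefix of the operand list. On failure every
// matcher is rolled back to the state it had before the call.
template <typename... Matchers>
bool matchInst(const ToyInst &Inst, unsigned Opcode, const Matchers &...Ops) {
  if (Inst.Opcode != Opcode || Inst.Operands.size() < sizeof...(Ops))
    return false;

  (Ops.remember(), ...);
  std::size_t Index = 0;
  bool Ok = (Ops.matches(Inst.Operands[Index++]) && ...);
  if (!Ok)
    (Ops.restore(), ...);
  return Ok;
}
```

Passing the same Capture object twice in one call, as the patch's doc
comment does with Xm, requires both operands to be equal; a failed call
leaves previously captured values untouched.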
---
 bolt/include/bolt/Core/MCInstUtils.h  | 128 ++
 .../Target/AArch64/AArch64MCPlusBuilder.cpp   |  90 +---
 2 files changed, 162 insertions(+), 56 deletions(-)

diff --git a/bolt/include/bolt/Core/MCInstUtils.h 
b/bolt/include/bolt/Core/MCInstUtils.h
index a3912a8fb265a..b495eb8ef5eec 100644
--- a/bolt/include/bolt/Core/MCInstUtils.h
+++ b/bolt/include/bolt/Core/MCInstUtils.h
@@ -166,6 +166,134 @@ static inline raw_ostream &operator<<(raw_ostream &OS,
   return Ref.print(OS);
 }
 
+/// Instruction-matching helpers operating on a single instruction at a time.
+///
+/// Unlike MCPlusBuilder::MCInstMatcher, this matchInst() function focuses on
+/// the cases where precise control over the instruction order is important:
+///
+/// // Bring the short names into the local scope:
+/// using namespace MCInstMatcher;
+/// // Declare the registers to capture:
+/// Reg Xn, Xm;
+/// // Capture the 0th and 1st operands, match the 2nd operand against the
+/// // just captured Xm register, match the 3rd operand against literal 0:
+/// if (!matchInst(MaybeAdd, AArch64::ADDXrs, Xm, Xn, Xm, Imm(0)))
+///   return AArch64::NoRegister;
+/// // Match the 0th operand against Xm:
+/// if (!matchInst(MaybeBr, AArch64::BR, Xm))
+///   return AArch64::NoRegister;
+/// // Return the matched register:
+/// return Xm.get();
+namespace MCInstMatcher {
+
+// The base class to match an operand of type T.
+//
+// The subclasses of OpMatcher are intended to be allocated on the stack and
+// to only be used by passing them to matchInst() and by calling their get()
+// function, thus the peculiar `mutable` specifiers: to make the calling code
+// compact and readable, the templated matchInst() function has to accept both
+// long-lived Imm/Reg wrappers declared as local variables (intended to capture
+// the first operand's value and match the subsequent operands, whether inside
+// a single instruction or across multiple instructions), as well as temporary
+// wrappers around literal values to match, e.g. Imm(42) or Reg(AArch64::XZR).
+template <typename T> class OpMatcher {
+  mutable std::optional<T> Value;
+  mutable std::optional<T> SavedValue;
+
+  // Remember/restore the last Value - to be called by matchInst.
+  void remember() const { SavedValue = Value; }
+  void restore() const { Value = SavedValue; }
+
+  template <typename... OpMatchers>
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+protected:
+  OpMatcher(std::optional<T> ValueToMatch) : Value(ValueToMatch) {}
+
+  bool matchValue(T OpValue) const {
+// Check that OpValue does not contradict the existing Value.
+bool MatchResult = !Value || *Value == OpValue;
+// If MatchResult is false, all matchers will be reset before returning from
+// matchInst, including this one, thus no need to assign conditionally.
+Value = OpValue;
+
+return MatchResult;
+  }
+
+public:
+  /// Returns the captured value.
+  T get() const {
+assert(Value.has_value());
+return *Value;
+  }
+};
+
+class Reg : public OpMatcher {
+  bool matches(const MCOperand &Op) const {
+if (!Op.isReg())
+  return false;
+
+return matchValue(Op.getReg());
+  }
+
+  template <typename... OpMatchers>
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+public:
+  Reg(std::optional RegToMatch = std::nullopt)
+  : OpMatcher(RegToMatch) {}
+};
+
+class Imm : public OpMatcher {
+  bool matches(const MCOperand &Op) const {
+if (!Op.isImm())
+  return false;
+
+return matchValue(Op.getImm());
+  }
+
+  template <typename... OpMatchers>
+  friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...);
+
+public:
+  Imm(std::optional ImmToMatch = std::nullopt)
+  : OpMatcher(ImmToMatch) {}
+};
+
+/// Tries to match Inst and updates Ops on success.
+///
+/// If Inst has the specified Opcode and its operand list prefix matches Ops,
+/// this function returns true and updates Ops, otherwise false is returned and
+/// values of Ops are kept as before matchInst was called.
+///
+/// Please note that while Ops are technically passed by a const reference to
+/// make invocations like `matchInst(MI, Opcode, Imm(42))` possible, all their
+/// fields are marked mut

[llvm-branch-commits] [llvm] [BOLT] Factor out MCInstReference from gadget scanner (NFC) (PR #138655)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits


https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/138655

>From cbcac1bc4612b601a5ec963663ba69a5f212feb1 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 28 Apr 2025 18:35:48 +0300
Subject: [PATCH] [BOLT] Factor out MCInstReference from gadget scanner (NFC)

Move MCInstReference out of the gadget scanner into its own header.
MCInstReference represents a constant reference to an instruction
inside a parent entity: either a basic block (which has a reference to
its parent function) or the function itself (when CFG information is
not available).
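
The two-case layout can be modelled with std::variant as in the
standalone sketch below. ToyBlock, ToyFunction and ToyInstRef are
hypothetical stand-ins for BinaryBasicBlock, BinaryFunction and
MCInstReference; instructions are plain integers, and the map key is
assumed to be the instruction offset.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <variant>
#include <vector>

// Hypothetical containers: a block stores a contiguous vector, a function
// without CFG stores instructions in a map keyed by offset.
struct ToyBlock {
  std::vector<int> Insts;
};
struct ToyFunction {
  std::map<unsigned, int> InstsByOffset;
};

class ToyInstRef {
  using NoCFGIterator = std::map<unsigned, int>::const_iterator;

  struct InBlock {
    const ToyBlock *BB;
    std::size_t Index;
  };
  struct InFunction {
    const ToyFunction *BF;
    NoCFGIterator It;
  };

  // The direct parent is stored together with a position inside it.
  std::variant<InBlock, InFunction> Ref;

public:
  ToyInstRef(const ToyBlock *BB, std::size_t Index) : Ref(InBlock{BB, Index}) {
    assert(BB && Index < BB->Insts.size());
  }
  ToyInstRef(const ToyFunction *BF, NoCFGIterator It)
      : Ref(InFunction{BF, It}) {
    assert(BF && It != BF->InstsByOffset.end());
  }

  bool hasCFG() const { return std::holds_alternative<InBlock>(Ref); }

  // Dispatches on the active alternative, in the same style as the
  // tryGetRefInBB()/getRefInBF() helpers in the patch.
  int inst() const {
    if (const InBlock *R = std::get_if<InBlock>(&Ref))
      return R->BB->Insts[R->Index];
    return std::get<InFunction>(Ref).It->second;
  }
};
```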
---
 bolt/include/bolt/Core/MCInstUtils.h  | 172 +
 bolt/include/bolt/Passes/PAuthGadgetScanner.h | 180 +-
 bolt/lib/Core/CMakeLists.txt  |   1 +
 bolt/lib/Core/MCInstUtils.cpp |  57 ++
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  99 --
 5 files changed, 272 insertions(+), 237 deletions(-)
 create mode 100644 bolt/include/bolt/Core/MCInstUtils.h
 create mode 100644 bolt/lib/Core/MCInstUtils.cpp

diff --git a/bolt/include/bolt/Core/MCInstUtils.h 
b/bolt/include/bolt/Core/MCInstUtils.h
new file mode 100644
index 0..a3912a8fb265a
--- /dev/null
+++ b/bolt/include/bolt/Core/MCInstUtils.h
@@ -0,0 +1,172 @@
+//===- bolt/Core/MCInstUtils.h --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef BOLT_CORE_MCINSTUTILS_H
+#define BOLT_CORE_MCINSTUTILS_H
+
+#include "bolt/Core/BinaryBasicBlock.h"
+
+#include 
+#include 
+#include 
+
+namespace llvm {
+namespace bolt {
+
+class BinaryFunction;
+
+/// MCInstReference represents a reference to a constant MCInst as stored either
+/// in a BinaryFunction (i.e. before a CFG is created), or in a BinaryBasicBlock
+/// (after a CFG is created).
+class MCInstReference {
+  using nocfg_const_iterator = std::map::const_iterator;
+
+  // Two cases are possible:
+  // * functions with CFG reconstructed - a function stores a collection of
+  //   basic blocks, each basic block stores a contiguous vector of MCInst
+  // * functions without CFG - there are no basic blocks created,
+  //   the instructions are directly stored in std::map in BinaryFunction
+  //
+  // In both cases, the direct parent of MCInst is stored together with an
+  // iterator pointing to the instruction.
+
+  // Helper struct: CFG is available, the direct parent is a basic block,
+  // iterator's type is `MCInst *`.
+  struct RefInBB {
+RefInBB(const BinaryBasicBlock *BB, const MCInst *Inst)
+: BB(BB), It(Inst) {}
+RefInBB(const RefInBB &Other) = default;
+RefInBB &operator=(const RefInBB &Other) = default;
+
+const BinaryBasicBlock *BB;
+BinaryBasicBlock::const_iterator It;
+
+bool operator<(const RefInBB &Other) const {
+  if (BB != Other.BB)
+return std::less{}(BB, Other.BB);
+  return It < Other.It;
+}
+
+bool operator==(const RefInBB &Other) const {
+  return BB == Other.BB && It == Other.It;
+}
+  };
+
+  // Helper struct: CFG is *not* available, the direct parent is a function,
+  // iterator's type is std::map::iterator (the mapped value
+  // is an instruction's offset).
+  struct RefInBF {
+RefInBF(const BinaryFunction *BF, nocfg_const_iterator It)
+: BF(BF), It(It) {}
+RefInBF(const RefInBF &Other) = default;
+RefInBF &operator=(const RefInBF &Other) = default;
+
+const BinaryFunction *BF;
+nocfg_const_iterator It;
+
+bool operator<(const RefInBF &Other) const {
+  if (BF != Other.BF)
+return std::less{}(BF, Other.BF);
+  return It->first < Other.It->first;
+}
+
+bool operator==(const RefInBF &Other) const {
+  return BF == Other.BF && It->first == Other.It->first;
+}
+  };
+
+  std::variant<RefInBB, RefInBF> Reference;
+
+  // Utility methods to be used like this:
+  //
+  // if (auto *Ref = tryGetRefInBB())
+  //   return Ref->doSomething(...);
+  // return getRefInBF().doSomethingElse(...);
+  const RefInBB *tryGetRefInBB() const {
+assert(std::get_if<RefInBB>(&Reference) ||
+   std::get_if<RefInBF>(&Reference));
+return std::get_if<RefInBB>(&Reference);
+  }
+  const RefInBF &getRefInBF() const {
+assert(std::get_if<RefInBF>(&Reference));
+return *std::get_if<RefInBF>(&Reference);
+  }
+
+public:
+  /// Constructs an empty reference.
+  MCInstReference() : Reference(RefInBB(nullptr, nullptr)) {}
+  /// Constructs a reference to the instruction inside the basic block.
+  MCInstReference(const BinaryBasicBlock *BB, const MCInst *Inst)
+  : Reference(RefInBB(BB, Inst)) {
+assert(BB && Inst && "Neither BB nor Inst should be nullptr");
+  }
+  /// Constructs a reference to the instruction inside the basic block.
+  MCInstReference(const BinaryBasicBlock *BB, uns
