[llvm-branch-commits] [clang] [llvm] Enable fexec-charset option (PR #138895)
@@ -0,0 +1,36 @@ +//===--- clang/Lex/LiteralConverter.h - Translator for Literals -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CLANG_LEX_LITERALCONVERTER_H +#define LLVM_CLANG_LEX_LITERALCONVERTER_H + +#include "clang/Basic/Diagnostic.h" +#include "clang/Basic/LangOptions.h" +#include "clang/Basic/TargetInfo.h" +#include "llvm/ADT/StringMap.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/CharSet.h" + +enum ConversionAction { NoConversion, ToSystemCharset, ToExecCharset }; + +class LiteralConverter { + llvm::StringRef InternalCharset; + llvm::StringRef SystemCharset; + llvm::StringRef ExecCharset; + llvm::StringMap CharsetConverters; cor3ntin wrote: Let's have a single converter, and think about the pragma later. (A solution would be to have a stack of them, such that getting to the active one would be a single pointer read... either way, the map is insufficient to implement the pragma, and we don't want to design that pragma now, that PR has enough complexity) https://github.com/llvm/llvm-project/pull/138895 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [libc++] Implements the new FTM documentation generator. (PR #139774)
@@ -0,0 +1,187 @@ +# ===--===## +# +# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +# See https://llvm.org/LICENSE.txt for license information. +# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +# +# ===--===## + +# RUN: %{python} %s %{libcxx-dir}/utils %{libcxx-dir}/test/libcxx/feature_test_macro/test_data.json + +import sys +import unittest + +UTILS = sys.argv[1] +TEST_DATA = sys.argv[2] +del sys.argv[1:3] + +sys.path.append(UTILS) +from generate_feature_test_macro_components import FeatureTestMacros + + +class Test(unittest.TestCase): +def setUp(self): +self.ftm = FeatureTestMacros(TEST_DATA, ["charconv"]) +self.maxDiff = None # This causes the diff to be printed when the test fails + +def test_implementeation(self): ldionne wrote: ```suggestion def test_implementation(self): ``` https://github.com/llvm/llvm-project/pull/139774 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [libc++] Implements the new FTM documentation generator. (PR #139774)
@@ -2555,6 +2645,72 @@ def generate_header_test_directory(self, path: os.path) -> None: f.write(self.generate_header_test_file(header)) +@functools.cached_property +def status_list_table(self) -> str: +"""Creates the rst status table using a list-table.""" + +result = "" +for std in self.std_dialects: +result += create_table_row( +[[f'**{std.replace("c++", "C++")}**', "", "", "", ""]] +) + +for feature in self.__data: +if std not in feature["values"].keys(): +continue + +row = list() + +ftm = feature["name"] +libcxx_value = ( +f"{self.standard_ftms[ftm][std]}" +if self.is_implemented(ftm, std) +else "*unimplemented*" +) + +values = feature["values"][std] +assert len(values) > 0, f"{feature['name']}[{std}] has no entries" +for value in values: ldionne wrote: ```suggestion for value, papers in values.items(): ``` Would that work too? https://github.com/llvm/llvm-project/pull/139774 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [libc++] Implements the new FTM documentation generator. (PR #139774)
https://github.com/ldionne edited https://github.com/llvm/llvm-project/pull/139774 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [libc++] Implements the new FTM documentation generator. (PR #139774)
https://github.com/ldionne approved this pull request. I am really excited for this change! This looks really good, with a few comments. https://github.com/llvm/llvm-project/pull/139774 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang][analyzer] Handle CXXParenInitListExpr alongside InitListExpr (PR #139909)
fangyi-zhou wrote: I think you probably want to cherry-pick https://github.com/llvm/llvm-project/pull/136041/commits/5dc9d55eb04d94c01dba0364b51a509f975e542a which addresses reviewer comments and fixes the tests. https://github.com/llvm/llvm-project/pull/139909 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] Enable fexec-charset option (PR #138895)
@@ -0,0 +1,36 @@ +//===--- clang/Lex/LiteralConverter.h - Translator for Literals -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CLANG_LEX_LITERALCONVERTER_H +#define LLVM_CLANG_LEX_LITERALCONVERTER_H + +#include "clang/Basic/Diagnostic.h" +#include "clang/Basic/LangOptions.h" +#include "clang/Basic/TargetInfo.h" +#include "llvm/ADT/StringMap.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/CharSet.h" + +enum ConversionAction { NoConversion, ToSystemCharset, ToExecCharset }; + +class LiteralConverter { + llvm::StringRef InternalCharset; + llvm::StringRef SystemCharset; + llvm::StringRef ExecCharset; + llvm::StringMap CharsetConverters; cor3ntin wrote: Even if we were to do that, I don't think a map would be the best solution - there is only a given pair of converters active at any given time. https://github.com/llvm/llvm-project/pull/138895 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: improve handling of unreachable basic blocks (PR #136183)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/136183 >From c11b9017a4d3a4c07946081a355506e0f69d312b Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Thu, 17 Apr 2025 20:51:16 +0300 Subject: [PATCH 1/3] [BOLT] Gadget scanner: improve handling of unreachable basic blocks Instead of refusing to analyze an instruction completely, when it is unreachable according to the CFG reconstructed by BOLT, pessimistically assume all registers to be unsafe at the start of basic blocks without any predecessors. Nevertheless, unreachable basic blocks found in optimized code likely means imprecise CFG reconstruction, thus report a warning once per basic block without predecessors. --- bolt/lib/Passes/PAuthGadgetScanner.cpp| 46 ++- .../AArch64/gs-pacret-autiasp.s | 7 ++- .../binary-analysis/AArch64/gs-pauth-calls.s | 57 +++ 3 files changed, 95 insertions(+), 15 deletions(-) diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index cd7c077a6412e..3cee579ef2a15 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -343,6 +343,12 @@ class SrcSafetyAnalysis { return S; } + /// Creates a state with all registers marked unsafe (not to be confused + /// with empty state). + SrcState createUnsafeState() const { +return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters()); + } + BitVector getClobberedRegs(const MCInst &Point) const { BitVector Clobbered(NumRegs); // Assume a call can clobber all registers, including callee-saved @@ -585,6 +591,13 @@ class DataflowSrcSafetyAnalysis if (BB.isEntryPoint()) return createEntryState(); +// If a basic block without any predecessors is found in an optimized code, +// this likely means that some CFG edges were not detected. Pessimistically +// assume all registers to be unsafe before this basic block and warn about +// this fact in FunctionAnalysis::findUnsafeUses(). +if (BB.pred_empty()) + return createUnsafeState(); + return SrcState(); } @@ -658,12 +671,6 @@ class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis { BC.MIB->removeAnnotation(I.second, StateAnnotationIndex); } - /// Creates a state with all registers marked unsafe (not to be confused - /// with empty state). - SrcState createUnsafeState() const { -return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters()); - } - public: CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocId, @@ -1342,19 +1349,30 @@ void FunctionAnalysisContext::findUnsafeUses( BF.dump(); }); + if (BF.hasCFG()) { +// Warn on basic blocks being unreachable according to BOLT, as this +// likely means CFG is imprecise. +for (BinaryBasicBlock &BB : BF) { + if (!BB.pred_empty() || BB.isEntryPoint()) +continue; + // Arbitrarily attach the report to the first instruction of BB. + MCInst *InstToReport = BB.getFirstNonPseudoInstr(); + if (!InstToReport) +continue; // BB has no real instructions + + Reports.push_back( + make_generic_report(MCInstReference::get(InstToReport, BF), + "Warning: no predecessor basic blocks detected " + "(possibly incomplete CFG)")); +} + } + iterateOverInstrs(BF, [&](MCInstReference Inst) { if (BC.MIB->isCFI(Inst)) return; const SrcState &S = Analysis->getStateBefore(Inst); - -// If non-empty state was never propagated from the entry basic block -// to Inst, assume it to be unreachable and report a warning. -if (S.empty()) { - Reports.push_back( - make_generic_report(Inst, "Warning: unreachable instruction found")); - return; -} +assert(!S.empty() && "Instruction has no associated state"); if (auto Report = shouldReportReturnGadget(BC, Inst, S)) Reports.push_back(*Report); diff --git a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s index 284f0bea607a5..6559ba336e8de 100644 --- a/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s +++ b/bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s @@ -215,12 +215,17 @@ f_callclobbered_calleesaved: .globl f_unreachable_instruction .type f_unreachable_instruction,@function f_unreachable_instruction: -// CHECK-LABEL: GS-PAUTH: Warning: unreachable instruction found in function f_unreachable_instruction, basic block {{[0-9a-zA-Z.]+}}, at address +// CHECK-LABEL: GS-PAUTH: Warning: no predecessor basic blocks detected (possibly incomplete CFG) in function f_unreachable_instruction, basic block {{[0-9a-zA-Z.]+}}, at address // CHECK-NEXT:The instruction is {{[0-9a-f]+}}: add x0, x1, x2 // CHECK-NOT: instructi
[llvm-branch-commits] [llvm] [BOLT] Factor out MCInstReference from gadget scanner (NFC) (PR #138655)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138655 >From cbcac1bc4612b601a5ec963663ba69a5f212feb1 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Mon, 28 Apr 2025 18:35:48 +0300 Subject: [PATCH] [BOLT] Factor out MCInstReference from gadget scanner (NFC) Move MCInstReference representing a constant reference to an instruction inside a parent entity - either inside a basic block (which has a reference to its parent function) or directly to the function (when CFG information is not available). --- bolt/include/bolt/Core/MCInstUtils.h | 172 + bolt/include/bolt/Passes/PAuthGadgetScanner.h | 180 +- bolt/lib/Core/CMakeLists.txt | 1 + bolt/lib/Core/MCInstUtils.cpp | 57 ++ bolt/lib/Passes/PAuthGadgetScanner.cpp| 99 -- 5 files changed, 272 insertions(+), 237 deletions(-) create mode 100644 bolt/include/bolt/Core/MCInstUtils.h create mode 100644 bolt/lib/Core/MCInstUtils.cpp diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h new file mode 100644 index 0..a3912a8fb265a --- /dev/null +++ b/bolt/include/bolt/Core/MCInstUtils.h @@ -0,0 +1,172 @@ +//===- bolt/Core/MCInstUtils.h --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef BOLT_CORE_MCINSTUTILS_H +#define BOLT_CORE_MCINSTUTILS_H + +#include "bolt/Core/BinaryBasicBlock.h" + +#include +#include +#include + +namespace llvm { +namespace bolt { + +class BinaryFunction; + +/// MCInstReference represents a reference to a constant MCInst as stored either +/// in a BinaryFunction (i.e. before a CFG is created), or in a BinaryBasicBlock +/// (after a CFG is created). +class MCInstReference { + using nocfg_const_iterator = std::map::const_iterator; + + // Two cases are possible: + // * functions with CFG reconstructed - a function stores a collection of + // basic blocks, each basic block stores a contiguous vector of MCInst + // * functions without CFG - there are no basic blocks created, + // the instructions are directly stored in std::map in BinaryFunction + // + // In both cases, the direct parent of MCInst is stored together with an + // iterator pointing to the instruction. + + // Helper struct: CFG is available, the direct parent is a basic block, + // iterator's type is `MCInst *`. + struct RefInBB { +RefInBB(const BinaryBasicBlock *BB, const MCInst *Inst) +: BB(BB), It(Inst) {} +RefInBB(const RefInBB &Other) = default; +RefInBB &operator=(const RefInBB &Other) = default; + +const BinaryBasicBlock *BB; +BinaryBasicBlock::const_iterator It; + +bool operator<(const RefInBB &Other) const { + if (BB != Other.BB) +return std::less{}(BB, Other.BB); + return It < Other.It; +} + +bool operator==(const RefInBB &Other) const { + return BB == Other.BB && It == Other.It; +} + }; + + // Helper struct: CFG is *not* available, the direct parent is a function, + // iterator's type is std::map::iterator (the mapped value + // is an instruction's offset). + struct RefInBF { +RefInBF(const BinaryFunction *BF, nocfg_const_iterator It) +: BF(BF), It(It) {} +RefInBF(const RefInBF &Other) = default; +RefInBF &operator=(const RefInBF &Other) = default; + +const BinaryFunction *BF; +nocfg_const_iterator It; + +bool operator<(const RefInBF &Other) const { + if (BF != Other.BF) +return std::less{}(BF, Other.BF); + return It->first < Other.It->first; +} + +bool operator==(const RefInBF &Other) const { + return BF == Other.BF && It->first == Other.It->first; +} + }; + + std::variant Reference; + + // Utility methods to be used like this: + // + // if (auto *Ref = tryGetRefInBB()) + // return Ref->doSomething(...); + // return getRefInBF().doSomethingElse(...); + const RefInBB *tryGetRefInBB() const { +assert(std::get_if(&Reference) || + std::get_if(&Reference)); +return std::get_if(&Reference); + } + const RefInBF &getRefInBF() const { +assert(std::get_if(&Reference)); +return *std::get_if(&Reference); + } + +public: + /// Constructs an empty reference. + MCInstReference() : Reference(RefInBB(nullptr, nullptr)) {} + /// Constructs a reference to the instruction inside the basic block. + MCInstReference(const BinaryBasicBlock *BB, const MCInst *Inst) + : Reference(RefInBB(BB, Inst)) { +assert(BB && Inst && "Neither BB nor Inst should be nullptr"); + } + /// Constructs a reference to the instruction inside the basic block. + MCInstReference(const BinaryBasicBlock *BB, uns
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: account for BRK when searching for auth oracles (PR #137975)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/137975 >From 91b67b64a4e731cdfabc09bd224c32e1ab25e21d Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Wed, 30 Apr 2025 16:08:10 +0300 Subject: [PATCH] [BOLT] Gadget scanner: account for BRK when searching for auth oracles An authenticated pointer can be explicitly checked by the compiler via a sequence of instructions that executes BRK on failure. It is important to recognize such BRK instruction as checking every register (as it is expected to immediately trigger an abnormal program termination) to prevent false positive reports about authentication oracles: autia x2, x3 autia x0, x1 ; neither x0 nor x2 are checked at this point eor x16, x0, x0, lsl #1 tbz x16, #62, on_success ; marks x0 as checked ; end of BB: for x2 to be checked here, it must be checked in both ; successor basic blocks on_failure: brk 0xc470 on_success: ; x2 is checked ldr x1, [x2] ; marks x2 as checked --- bolt/include/bolt/Core/MCPlusBuilder.h| 14 ++ bolt/lib/Passes/PAuthGadgetScanner.cpp| 13 +- .../Target/AArch64/AArch64MCPlusBuilder.cpp | 24 -- .../AArch64/gs-pauth-address-checks.s | 44 +-- .../AArch64/gs-pauth-authentication-oracles.s | 9 ++-- .../AArch64/gs-pauth-signing-oracles.s| 6 +-- 6 files changed, 75 insertions(+), 35 deletions(-) diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 6d3aa4f5f0feb..87de6754017db 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -706,6 +706,20 @@ class MCPlusBuilder { return false; } + /// Returns true if Inst is a trap instruction. + /// + /// Tests if Inst is an instruction that immediately causes an abnormal + /// program termination, for example when a security violation is detected + /// by a compiler-inserted check. + /// + /// @note An implementation of this method should likely return false for + /// calls to library functions like abort(), as it is possible that the + /// execution state is partially attacker-controlled at this point. + virtual bool isTrap(const MCInst &Inst) const { +llvm_unreachable("not implemented"); +return false; + } + virtual bool isBreakpoint(const MCInst &Inst) const { llvm_unreachable("not implemented"); return false; diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index dfb71575b2b39..835ee26aaf08a 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -1028,6 +1028,15 @@ class DstSafetyAnalysis { dbgs() << ")\n"; }); +// If this instruction terminates the program immediately, no +// authentication oracles are possible past this point. +if (BC.MIB->isTrap(Point)) { + LLVM_DEBUG({ traceInst(BC, "Trap instruction found", Point); }); + DstState Next(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters()); + Next.CannotEscapeUnchecked.set(); + return Next; +} + // If this instruction is reachable by the analysis, a non-empty state will // be propagated to it sooner or later. Until then, skip computeNext(). if (Cur.empty()) { @@ -1133,8 +1142,8 @@ class DataflowDstSafetyAnalysis // // A basic block without any successors, on the other hand, can be // pessimistically initialized to everything-is-unsafe: this will naturally -// handle both return and tail call instructions and is harmless for -// internal indirect branch instructions (such as computed gotos). +// handle return, trap and tail call instructions. At the same time, it is +// harmless for internal indirect branch instructions, like computed gotos. if (BB.succ_empty()) return createUnsafeState(); diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp index f3c29e6ee43b9..4d11c5b206eab 100644 --- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp +++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp @@ -386,10 +386,9 @@ class AArch64MCPlusBuilder : public MCPlusBuilder { // the list of successors of this basic block as appropriate. // Any of the above code sequences assume the fall-through basic block -// is a dead-end BRK instruction (any immediate operand is accepted). +// is a dead-end trap instruction. const BinaryBasicBlock *BreakBB = BB.getFallthrough(); -if (!BreakBB || BreakBB->empty() || -BreakBB->front().getOpcode() != AArch64::BRK) +if (!BreakBB || BreakBB->empty() || !isTrap(BreakBB->front())) return std::nullopt; // Iterate over the instructions of BB in reverse order, matching opcodes @@ -1745,6 +1744,25 @@ class AArch64MCPlusBuilder : public MCPlusBuilder { Inst.addOperand(MCOperand::createImm(0)); }
[llvm-branch-commits] [llvm] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) (PR #138883)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138883 >From 1c135a144d7f21e05c3598a992baa170cdde7950 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Wed, 7 May 2025 16:42:00 +0300 Subject: [PATCH] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) Introduce matchInst helper function to capture and/or match the operands of MCInst. Unlike the existing `MCPlusBuilder::MCInstMatcher` machinery, matchInst is intended for the use cases when precise control over the instruction order is required. For example, when validating PtrAuth hardening, all registers are usually considered unsafe after a function call, even though callee-saved registers should preserve their old values *under normal operation*. --- bolt/include/bolt/Core/MCInstUtils.h | 128 ++ .../Target/AArch64/AArch64MCPlusBuilder.cpp | 90 +--- 2 files changed, 162 insertions(+), 56 deletions(-) diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h index a3912a8fb265a..b495eb8ef5eec 100644 --- a/bolt/include/bolt/Core/MCInstUtils.h +++ b/bolt/include/bolt/Core/MCInstUtils.h @@ -166,6 +166,134 @@ static inline raw_ostream &operator<<(raw_ostream &OS, return Ref.print(OS); } +/// Instruction-matching helpers operating on a single instruction at a time. +/// +/// Unlike MCPlusBuilder::MCInstMatcher, this matchInst() function focuses on +/// the cases where a precise control over the instruction order is important: +/// +/// // Bring the short names into the local scope: +/// using namespace MCInstMatcher; +/// // Declare the registers to capture: +/// Reg Xn, Xm; +/// // Capture the 0th and 1st operands, match the 2nd operand against the +/// // just captured Xm register, match the 3rd operand against literal 0: +/// if (!matchInst(MaybeAdd, AArch64::ADDXrs, Xm, Xn, Xm, Imm(0)) +/// return AArch64::NoRegister; +/// // Match the 0th operand against Xm: +/// if (!matchInst(MaybeBr, AArch64::BR, Xm)) +/// return AArch64::NoRegister; +/// // Return the matched register: +/// return Xm.get(); +namespace MCInstMatcher { + +// The base class to match an operand of type T. +// +// The subclasses of OpMatcher are intended to be allocated on the stack and +// to only be used by passing them to matchInst() and by calling their get() +// function, thus the peculiar `mutable` specifiers: to make the calling code +// compact and readable, the templated matchInst() function has to accept both +// long-lived Imm/Reg wrappers declared as local variables (intended to capture +// the first operand's value and match the subsequent operands, whether inside +// a single instruction or across multiple instructions), as well as temporary +// wrappers around literal values to match, f.e. Imm(42) or Reg(AArch64::XZR). +template class OpMatcher { + mutable std::optional Value; + mutable std::optional SavedValue; + + // Remember/restore the last Value - to be called by matchInst. + void remember() const { SavedValue = Value; } + void restore() const { Value = SavedValue; } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +protected: + OpMatcher(std::optional ValueToMatch) : Value(ValueToMatch) {} + + bool matchValue(T OpValue) const { +// Check that OpValue does not contradict the existing Value. +bool MatchResult = !Value || *Value == OpValue; +// If MatchResult is false, all matchers will be reset before returning from +// matchInst, including this one, thus no need to assign conditionally. +Value = OpValue; + +return MatchResult; + } + +public: + /// Returns the captured value. + T get() const { +assert(Value.has_value()); +return *Value; + } +}; + +class Reg : public OpMatcher { + bool matches(const MCOperand &Op) const { +if (!Op.isReg()) + return false; + +return matchValue(Op.getReg()); + } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +public: + Reg(std::optional RegToMatch = std::nullopt) + : OpMatcher(RegToMatch) {} +}; + +class Imm : public OpMatcher { + bool matches(const MCOperand &Op) const { +if (!Op.isImm()) + return false; + +return matchValue(Op.getImm()); + } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +public: + Imm(std::optional ImmToMatch = std::nullopt) + : OpMatcher(ImmToMatch) {} +}; + +/// Tries to match Inst and updates Ops on success. +/// +/// If Inst has the specified Opcode and its operand list prefix matches Ops, +/// this function returns true and updates Ops, otherwise false is returned and +/// values of Ops are kept as before matchInst was called. +/// +/// Please note that while Ops are technically passed by a const reference to +/// make invocations like `matchInst(MI, Opcode, Imm(42))` possible, all their +/// fields are marked mut
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: account for BRK when searching for auth oracles (PR #137975)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/137975 >From 91b67b64a4e731cdfabc09bd224c32e1ab25e21d Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Wed, 30 Apr 2025 16:08:10 +0300 Subject: [PATCH] [BOLT] Gadget scanner: account for BRK when searching for auth oracles An authenticated pointer can be explicitly checked by the compiler via a sequence of instructions that executes BRK on failure. It is important to recognize such BRK instruction as checking every register (as it is expected to immediately trigger an abnormal program termination) to prevent false positive reports about authentication oracles: autia x2, x3 autia x0, x1 ; neither x0 nor x2 are checked at this point eor x16, x0, x0, lsl #1 tbz x16, #62, on_success ; marks x0 as checked ; end of BB: for x2 to be checked here, it must be checked in both ; successor basic blocks on_failure: brk 0xc470 on_success: ; x2 is checked ldr x1, [x2] ; marks x2 as checked --- bolt/include/bolt/Core/MCPlusBuilder.h| 14 ++ bolt/lib/Passes/PAuthGadgetScanner.cpp| 13 +- .../Target/AArch64/AArch64MCPlusBuilder.cpp | 24 -- .../AArch64/gs-pauth-address-checks.s | 44 +-- .../AArch64/gs-pauth-authentication-oracles.s | 9 ++-- .../AArch64/gs-pauth-signing-oracles.s| 6 +-- 6 files changed, 75 insertions(+), 35 deletions(-) diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 6d3aa4f5f0feb..87de6754017db 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -706,6 +706,20 @@ class MCPlusBuilder { return false; } + /// Returns true if Inst is a trap instruction. + /// + /// Tests if Inst is an instruction that immediately causes an abnormal + /// program termination, for example when a security violation is detected + /// by a compiler-inserted check. + /// + /// @note An implementation of this method should likely return false for + /// calls to library functions like abort(), as it is possible that the + /// execution state is partially attacker-controlled at this point. + virtual bool isTrap(const MCInst &Inst) const { +llvm_unreachable("not implemented"); +return false; + } + virtual bool isBreakpoint(const MCInst &Inst) const { llvm_unreachable("not implemented"); return false; diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index dfb71575b2b39..835ee26aaf08a 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -1028,6 +1028,15 @@ class DstSafetyAnalysis { dbgs() << ")\n"; }); +// If this instruction terminates the program immediately, no +// authentication oracles are possible past this point. +if (BC.MIB->isTrap(Point)) { + LLVM_DEBUG({ traceInst(BC, "Trap instruction found", Point); }); + DstState Next(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters()); + Next.CannotEscapeUnchecked.set(); + return Next; +} + // If this instruction is reachable by the analysis, a non-empty state will // be propagated to it sooner or later. Until then, skip computeNext(). if (Cur.empty()) { @@ -1133,8 +1142,8 @@ class DataflowDstSafetyAnalysis // // A basic block without any successors, on the other hand, can be // pessimistically initialized to everything-is-unsafe: this will naturally -// handle both return and tail call instructions and is harmless for -// internal indirect branch instructions (such as computed gotos). +// handle return, trap and tail call instructions. At the same time, it is +// harmless for internal indirect branch instructions, like computed gotos. if (BB.succ_empty()) return createUnsafeState(); diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp index f3c29e6ee43b9..4d11c5b206eab 100644 --- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp +++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp @@ -386,10 +386,9 @@ class AArch64MCPlusBuilder : public MCPlusBuilder { // the list of successors of this basic block as appropriate. // Any of the above code sequences assume the fall-through basic block -// is a dead-end BRK instruction (any immediate operand is accepted). +// is a dead-end trap instruction. const BinaryBasicBlock *BreakBB = BB.getFallthrough(); -if (!BreakBB || BreakBB->empty() || -BreakBB->front().getOpcode() != AArch64::BRK) +if (!BreakBB || BreakBB->empty() || !isTrap(BreakBB->front())) return std::nullopt; // Iterate over the instructions of BB in reverse order, matching opcodes @@ -1745,6 +1744,25 @@ class AArch64MCPlusBuilder : public MCPlusBuilder { Inst.addOperand(MCOperand::createImm(0)); }
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/137224 >From d20efbedd0be34942e8e28cc91eefeb28d1b8108 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Tue, 22 Apr 2025 21:43:14 +0300 Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call Implement the detection of tail calls performed with untrusted link register, which violates the assumption made on entry to every function. Unlike other pauth gadgets, this one involves some amount of guessing which branch instructions should be checked as tail calls. --- bolt/lib/Passes/PAuthGadgetScanner.cpp| 94 ++- .../AArch64/gs-pacret-autiasp.s | 31 +- .../AArch64/gs-pauth-debug-output.s | 30 +- .../AArch64/gs-pauth-tail-calls.s | 597 ++ 4 files changed, 706 insertions(+), 46 deletions(-) create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index 7a5d47a3ff812..dfb71575b2b39 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -701,8 +701,9 @@ class DataflowSrcSafetyAnalysis // // Then, a function can be split into a number of disjoint contiguous sequences // of instructions without labels in between. These sequences can be processed -// the same way basic blocks are processed by data-flow analysis, assuming -// pessimistically that all registers are unsafe at the start of each sequence. +// the same way basic blocks are processed by data-flow analysis, with the same +// pessimistic estimation of the initial state at the start of each sequence +// (except the first instruction of the function). class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis { BinaryFunction &BF; MCPlusBuilder::AllocatorIdTy AllocId; @@ -713,12 +714,6 @@ class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis { BC.MIB->removeAnnotation(I.second, StateAnnotationIndex); } - /// Creates a state with all registers marked unsafe (not to be confused - /// with empty state). - SrcState createUnsafeState() const { -return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters()); - } - public: CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocId, @@ -729,6 +724,7 @@ class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis { } void run() override { +const SrcState DefaultState = computePessimisticState(BF); SrcState S = createEntryState(); for (auto &I : BF.instrs()) { MCInst &Inst = I.second; @@ -743,7 +739,7 @@ class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis { LLVM_DEBUG({ traceInst(BC, "Due to label, resetting the state before", Inst); }); -S = createUnsafeState(); +S = DefaultState; } // Check if we need to remove an old annotation (this is the case if @@ -1288,6 +1284,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const MCInstReference &Inst, return make_gadget_report(RetKind, Inst, *RetReg); } +/// While BOLT already marks some of the branch instructions as tail calls, +/// this function tries to improve the coverage by including less obvious cases +/// when it is possible to do without introducing too many false positives. +static bool shouldAnalyzeTailCallInst(const BinaryContext &BC, + const BinaryFunction &BF, + const MCInstReference &Inst) { + // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ() + // (such as isBranch at the time of writing this comment), some don't (such + // as isCall). For that reason, call MCInstrDesc's methods explicitly when + // it is important. + const MCInstrDesc &Desc = + BC.MII->get(static_cast(Inst).getOpcode()); + // Tail call should be a branch (but not necessarily an indirect one). + if (!Desc.isBranch()) +return false; + + // Always analyze the branches already marked as tail calls by BOLT. + if (BC.MIB->isTailCall(Inst)) +return true; + + // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the + // below is a simplified condition from BinaryContext::printInstruction. + bool IsUnknownControlFlow = + BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst); + + if (BF.hasCFG() && IsUnknownControlFlow) +return true; + + return false; +} + +static std::optional> +shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF, + const MCInstReference &Inst, const SrcState &S) { + static const GadgetKind UntrustedLRKind( + "untrusted link register found before tail call"); + + if (!shouldAnalyzeTailCallInst(BC, BF, Inst)) +return std::nullopt; + + // Not only the set of registers returned by getTrustedLiveInRegs() can be + // see
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
@@ -0,0 +1,304 @@ +//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file implements lowering patterns from vector.contract to +// SVE I8MM operations. +// +//===--- + +#include "mlir/Dialect/Arith/IR/Arith.h" +#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" +#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" +#include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/Dialect/Utils/IndexingUtils.h" +#include "mlir/Dialect/Vector/IR/VectorOps.h" +#include "mlir/IR/AffineMap.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include "mlir/Dialect/UB/IR/UBOps.h" + +#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm" + +using namespace mlir; +using namespace mlir::arm_sve; + +namespace { +// Check if the given value is a result of the operation `T` (which must be +// sign- or zero- extend) from i8 to i32. Return the value before the extension. +template +inline std::enable_if_t<(std::is_base_of_v || + std::is_base_of_v), +std::optional> +extractExtOperand(Value v, Type i8Ty, Type i32Ty) { + auto extOp = dyn_cast_or_null(v.getDefiningOp()); + if (!extOp) +return {}; + + auto inOp = extOp.getIn(); + auto inTy = dyn_cast(inOp.getType()); + if (!inTy || inTy.getElementType() != i8Ty) +return {}; + + auto outTy = dyn_cast(extOp.getType()); + if (!outTy || outTy.getElementType() != i32Ty) +return {}; + + return inOp; +} + +// Designate the operation (resp. instruction) used to do sub-tile matrix +// multiplications. +enum class MMLA { + Signed, // smmla + Unsigned,// ummla + Mixed, // usmmla + MixedSwapped // usmmla with LHS and RHS swapped +}; + +// Create the matrix multply and accumulate operation according to `op`. +Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc, + mlir::VectorType accType, Value acc, Value lhs, Value rhs) { + switch (op) { + case MMLA::Signed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Unsigned: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Mixed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::MixedSwapped: +// The accumulator comes transposed and the result will be transposed +// later, so all we have to do here is swap the operands. +return rewriter.create(loc, accType, acc, rhs, lhs); + } +} + +class LowerContractionToSVEI8MMPattern +: public OpRewritePattern { +public: + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(vector::ContractionOp op, +PatternRewriter &rewriter) const override { + +Location loc = op.getLoc(); +mlir::VectorType lhsType = op.getLhsType(); +mlir::VectorType rhsType = op.getRhsType(); + +// For now handle LHS and RHS<8x[N]> - these are the types we +// eventually expect from MMT4D. M and N dimensions must be even and at +// least 2. +if (!lhsType.hasRank() || lhsType.getRank() != 2 || !rhsType.hasRank() || +rhsType.getRank() != 2) + return failure(); + +if (lhsType.isScalable() || !rhsType.isScalable()) + return failure(); + +// M, N, and K are the conventional names for matrix dimensions in the +// context of matrix multiplication. +auto M = lhsType.getDimSize(0); +auto N = rhsType.getDimSize(0); +auto K = rhsType.getDimSize(1); + +if (lhsType.getDimSize(1) != K || K != 8 || M < 2 || M % 2 != 0 || N < 2 || +N % 2 != 0 || !rhsType.getScalableDims()[0]) + return failure(); + +// Check permutation maps. For now only accept +// lhs: (d0, d1, d2) -> (d0, d2) +// rhs: (d0, d1, d2) -> (d1, d2) +// acc: (d0, d1, d2) -> (d0, d1) +// Note: RHS is transposed. +if (op.getIndexingMapsArray()[0] != +AffineMap::getMultiDimMapWithTargets(3, ArrayRef{0u, 2u}, + op.getContext()) || +op.getIndexingMapsArray()[1] != +AffineMap::getMultiDimMapWithTargets(3, ArrayRef{1u, 2u}, + op.getContext()) || +op.getIndexingMapsArray()[2] != +AffineMap::getMultiDimMapWithTargets(3, ArrayRef{0u, 1u}, + op.getContext())) + return failure(); + +// Check iterator types for matrix multiplication. +auto itTypes = op.getIteratorTypesArray(); +if (itTypes.size() != 3 || itTypes[0] != vector::IteratorType::parallel || +itTypes[1] != vector::IteratorType
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: improve handling of unreachable basic blocks (PR #136183)
atrosinenko wrote: Moved the `computePessimisticState` function here from #137224, as accounting for the registers that were never clobbered in a function (basically, accounting for the leaf functions) turns out to decrease the number of false positive reports quite significantly. https://github.com/llvm/llvm-project/pull/136183 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -7538,6 +7538,14 @@ static bool IsEligibleForTrivialRelocation(Sema &SemaRef, if (!SemaRef.IsCXXTriviallyRelocatableType(Field->getType())) return false; } + + // FIXME: PFP should not affect trivial relocatability, instead it should + // affect the implementation of std::trivially_relocate. See: + // https://discourse.llvm.org/t/rfc-structure-protection-a-family-of-uaf-mitigation-techniques/8/16?u=pcc + if (!SemaRef.Context.arePFPFieldsTriviallyRelocatable(D) && ojhunt wrote: If PFP fields are present, and we haven't yet implemented support for them in `trivially_relocate`, `IsCXXTriviallyRelocatableType` should be returning false. This is something we have to address for address discriminated pointer auth as well. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -1319,14 +1319,66 @@ static llvm::Value *CoerceIntOrPtrToIntOrPtr(llvm::Value *Val, llvm::Type *Ty, /// This safely handles the case when the src type is smaller than the /// destination type; in this situation the values of bits which not /// present in the src are undefined. -static llvm::Value *CreateCoercedLoad(Address Src, llvm::Type *Ty, +static llvm::Value *CreateCoercedLoad(Address Src, QualType SrcFETy, + llvm::Type *Ty, CodeGenFunction &CGF) { llvm::Type *SrcTy = Src.getElementType(); // If SrcTy and Ty are the same, just do a load. if (SrcTy == Ty) return CGF.Builder.CreateLoad(Src); + // Coercion directly through memory does not work if the structure has pointer + // field protection because the struct in registers has a different bit + // pattern to the struct in memory, so we must read the elements one by one + // and use them to form the coerced structure. + std::vector PFPFields; + CGF.getContext().findPFPFields(SrcFETy, CharUnits::Zero(), PFPFields, true); ojhunt wrote: This should be done by generating a copy thunk and calling that, rather than performing this logic repeatedly, and more importantly regenerating it repeatedly. If the update function is small enough it will be inlined, and if it's not you probably do not want a large number of copies of the code. Your intent appears to be to maintain the register transfer abi for POD types, despite the objects logically no longer being POD types - which isn't unreasonable given that the current design means that all structs containing pointers would become non-POD, which is probably suboptimal - but if you are doing so, because a large enough struct can spill, I'd recommend you adopt a model where the PFP fields in a return or parameter copy of the object are tagged specifically as transfer fields, something akin to `hash(argument position || PFP offset || some_this_is_a_transfer_constant)`. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
https://github.com/ojhunt requested changes to this pull request. Thoughts: This should be opt-in on a field or struct granularity, not just a global behavior. In the RFC I think you mentioned not applying PFP to C types, but I'm unsure how you're deciding what is a C type? There are a lot of special cases being added in places that should not need to be aware of PFP - the core type queries should be returning correct values for types containing PFP fields. A lot of the "this read/write/copy/init special work" exactly aligns with the constraints of pointer auth, and I think the overall implementation could be made better by introducing the concept of a `HardenedPointerQualifier` or similar, that could be either PointerAuthQualifier or PFPQualifier. In principle this could even just be treated as a different PointerAuth key set. That approach would also help with some of the problems you have with offsetof and pointers to PFP fields - the thing that causes the problems with the pointer to a PFP field is that the PFP schema for a given field is in reality part of the type, so given ```cpp struct A { void *field1; void *field2; }; ``` It looks you are trying to maintain the idea that `A::field1` and `A::field2` have the same type, which makes this code valid according to the type system: ```cpp void f(void** obj); void g(A *a) { f(&a->field1); f(&a->field2); } ``` but the real types are not the same. The semantic equivalent under pointer auth is something akin to ```cpp struct A { void * __ptrauth(1, 1, hash("A::field1")) field1; void * __ptrauth(1, 1, hash("A::field2")) field2; }; ``` Which makes it very clear that the types are different, and I think trying to pretend they are not is part of what is causing problems. The more I think about it, the more I feel that this could be implemented almost entirely on top of the existing pointer auth work. Currently for pointer auth there's a set of ARM keys specified, and I think you just need to create a different set, PFP_Keys or similar, and set up the appropriate schemas (basically null schemas for all the existing cases), and add a new schema: DefaultFieldSchema or similar, and apply that schema to every pointer field in an object unless there's an explicit qualifier applied already. Then it would in principle just be a matter of making adding the appropriate logic to the ptrauth intrinsics in llvm. Now there are some moderately large caveats to that idea * the existing ptrauth semantics simply say that any struct containing address discriminated values is no-longer a pod type and so gets copied through the stack, which is something you're trying to avoid, so you'd still need to keep that logic (though as I said, that should be done by generating and then calling the functions to do rather than regenerating the entire copy repeatedly). * the way that pointer auth is currently implemented assumes that there is only a single pointer auth abi active so you wouldn't be able to trivially have both pointer auth and PFP live at the same time. That's what makes me think having a SecuredPointer abstraction might be a batter approach, but it might also not be too difficult to update the PointerAuthQualifier to include the ABI being used -- I'm really not sure. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2976,7 +3006,15 @@ void CodeGenFunction::EmitForwardingCallToLambda( QualType resultType = FPT->getReturnType(); ReturnValueSlot returnSlot; if (!resultType->isVoidType() && - calleeFnInfo->getReturnInfo().getKind() == ABIArgInfo::Indirect && + (calleeFnInfo->getReturnInfo().getKind() == ABIArgInfo::Indirect || + // With pointer field protection, we need to set up the return slot when ojhunt wrote: This should also not need to be modified. Something is up with how you're setting up the record decls if you need to have so many insertions of PFP specific behavior. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -362,6 +362,17 @@ class LangOptionsBase { BKey }; + enum class PointerFieldProtectionKind { ojhunt wrote: I'm not sure I like this being solely a global decision - it makes custom allocators much harder, and it makes it hard for allocators that have different policies, e.g ```cpp struct ImportantStruct { void *operator new(size_t) { /*tagged allocator*/ } }; struct BoringStruct { ... }; struct SomeOtherStruct { ImportantStruct *important; // should be tagged pointer BoringString *lessImportant; // should be a non tagged pointer }; ``` This would again require a struct attribute to indicate the pointer policy https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -928,6 +936,11 @@ namespace { if (PointerAuthQualifier Q = F->getType().getPointerAuth(); Q && Q.isAddressDiscriminated()) return false; + // Non-trivially-copyable fields with pointer field protection need to be ojhunt wrote: This is an example of the code I was thinking of when I talked about sharing code with PointerAuth, the PFP and pointer auth code is essentially identical here. Essentially any place that interacts with a pointer auth qualifier should also have PFP code, and the code is going to be (at the place of interaction, not the underlying implementation) going to be essentially the same. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -3011,6 +3011,12 @@ defm experimental_omit_vtable_rtti : BoolFOption<"experimental-omit-vtable-rtti" NegFlag, BothFlags<[], [CC1Option], " the RTTI component from virtual tables">>; +def experimental_pointer_field_protection_EQ : Joined<["-"], "fexperimental-pointer-field-protection=">, Group, ojhunt wrote: As above, not super sure that I like this being a global compile time setting, but maybe it makes sense to be. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2268,13 +2293,22 @@ CodeGenFunction::EmitNullInitialization(Address DestPtr, QualType Ty) { // Get and call the appropriate llvm.memcpy overload. Builder.CreateMemCpy(DestPtr, SrcPtr, SizeVal, false); -return; + } else { +// Otherwise, just memset the whole thing to zero. This is legal ojhunt wrote: This means types that require constructors will be doing zero initialization, then reinitializing fields. From a codegen PoV this can lead to codegen along the lines of ``` zero the memory; // compiler is set to initialize all memory zero the memory; // this branch means objects that are not zero initializable get initialized again initialize the memory; // the struct is not zero initializable the constructor/initializer will run ``` Ideally the compiler will optimized this down, but it's both extra codegen, and extra optimization work. Given that your model assumes that null is a safe value for PFP fields, the `isZeroInitializable()` call should return true. For non-zero initializable objects, the initialization code for the object is responsible for initializing the PFP fields. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2513,6 +2513,12 @@ def CountedByOrNull : DeclOrTypeAttr { let LangOpts = [COnly]; } +def NoPointerFieldProtection : DeclOrTypeAttr { ojhunt wrote: This an ABI break so I don't think it can reasonably an on by default for all structs - we can already see annotations in libc++, and they would be needed on every single struct field. We can imagine a new platform where this would be the platform ABI, but for every other case it is functionally unusable. I would recommend attributes to permit opt in and opt out on a per-struct basis, and CLI flags to select the default behavior. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
https://github.com/ojhunt edited https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
https://github.com/ojhunt edited https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
https://github.com/ojhunt commented: unfortunately I really don't know enough about actual backend codegen to comment on the actual codegen implementation here, but I think there's some design issues with having the backend determine the discriminators rather than having those selected in the front end https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -441,6 +445,254 @@ bool PreISelIntrinsicLowering::expandMemIntrinsicUses(Function &F) const { return Changed; } +namespace { + +enum class PointerEncoding { + Rotate, + PACCopyable, + PACNonCopyable, +}; + +bool expandProtectedFieldPtr(Function &Intr) { + Module &M = *Intr.getParent(); + bool IsAArch64 = Triple(M.getTargetTriple()).isAArch64(); + + std::set NonPFPFields; + std::set LoadsStores; + + Type *Int8Ty = Type::getInt8Ty(M.getContext()); + Type *Int64Ty = Type::getInt64Ty(M.getContext()); + PointerType *PtrTy = PointerType::get(M.getContext(), 0); + + Function *SignIntr = + Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_sign, {}); + Function *AuthIntr = + Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_auth, {}); + + auto *EmuFnTy = FunctionType::get(Int64Ty, {Int64Ty, Int64Ty}, false); + FunctionCallee EmuSignIntr = M.getOrInsertFunction("__emupac_pacda", EmuFnTy); + FunctionCallee EmuAuthIntr = M.getOrInsertFunction("__emupac_autda", EmuFnTy); + + auto CreateSign = [&](IRBuilder<> &B, Value *Val, Value *Disc, + OperandBundleDef DSBundle) { +Function *F = B.GetInsertBlock()->getParent(); +Attribute FSAttr = F->getFnAttribute("target-features"); +if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth")) + return B.CreateCall(SignIntr, {Val, B.getInt32(2), Disc}, DSBundle); +return B.CreateCall(EmuSignIntr, {Val, Disc}, DSBundle); + }; + + auto CreateAuth = [&](IRBuilder<> &B, Value *Val, Value *Disc, + OperandBundleDef DSBundle) { +Function *F = B.GetInsertBlock()->getParent(); +Attribute FSAttr = F->getFnAttribute("target-features"); +if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth")) + return B.CreateCall(AuthIntr, {Val, B.getInt32(2), Disc}, DSBundle); +return B.CreateCall(EmuAuthIntr, {Val, Disc}, DSBundle); + }; + + for (User *U : Intr.users()) { +auto *Call = cast(U); +auto *FieldName = cast( ojhunt wrote: I'm not really a backend person, but I feel that it's incorrect for the discriminator selection/computation to be occurring in llvm. I think clang should be passing the discriminator as a parameter to the intrinsics - this would also permit the PFP system to allow users to override the discriminator for fields, which is likely to end up being necessary, as time changes the exact types or naming of fields (side note: because I'm a muppet, I only just realized that the current model derives the discriminator from the name of the field which means the names of fields actually becomes ABI - it might be better to use something like "name of struct + offset in struct + type" though obviously that reduces the variation in discriminators - OTOH it's not uncommon for different projects to declare there own versions of POD structs with slightly different naming so maybe it helps there) https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -544,6 +544,7 @@ TYPE_TRAIT_2(__is_pointer_interconvertible_base_of, IsPointerInterconvertibleBas #include "clang/Basic/TransformTypeTraits.def" // Clang-only C++ Type Traits +TYPE_TRAIT_1(__has_non_relocatable_fields, HasNonRelocatableFields, KEYCXX) ojhunt wrote: I do not like this -- there are existing functions for handling whether an object is trivially copying, relocatable, etc that should just be updated to correctly report the traits of impacted types. See the various Sema functions querying `isAddressDiscriminated()`. That function currently only cares about pointer auth, but it would not be unreasonable to have it consider other properties, like PDP. In fact a lot of the behavior you're wanting in this feature is essentially covered by the PointerAuthQualfier. It would perhaps be reasonable to generalize that to something akin to SecurePointerQualifier that could be either a PAQ or PFP qualifier. Then most of the current semantic queries made to support pointer auth could just be a single shared implementation. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -1415,6 +1469,52 @@ void CodeGenFunction::CreateCoercedStore(llvm::Value *Src, Address Dst, } } + // Coercion directly through memory does not work if the structure has pointer + // field protection because the struct passed by value has a different bit + // pattern to the struct in memory, so we must read the elements one by one + // and use them to form the coerced structure. + std::vector PFPFields; + getContext().findPFPFields(SrcFETy, CharUnits::Zero(), PFPFields, true); ojhunt wrote: Same as for CoercedLoad :D https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [llvm][EmbedBitcodePass] Prevent modifying the module with ThinLTO (PR #139999)
llvmbot wrote: @llvm/pr-subscribers-llvm-transforms Author: Paul Kirth (ilovepi) Changes Since ThinLTOBitcodeWriterPass handles many things for CFI and WPD, like updating vtable linkage, we need to prevent those changes from persisting in the non-LTO object code we will compile under FatLTO. The only non-invasive way to do that is to clone the module when serializing the module in ThinLTOBitcodeWriterPass. We may be able to avoid cloning in the future with additional infrastructure to restore the IR to its original state. Fixes #139440 --- Full diff: https://github.com/llvm/llvm-project/pull/13.diff 2 Files Affected: - (modified) llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp (+5-1) - (modified) llvm/test/Transforms/EmbedBitcode/embed-wpd.ll (+4-3) ``diff diff --git a/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp b/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp index 73f567734a91b..7a378c61cb899 100644 --- a/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp +++ b/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp @@ -16,6 +16,7 @@ #include "llvm/Support/raw_ostream.h" #include "llvm/TargetParser/Triple.h" #include "llvm/Transforms/IPO/ThinLTOBitcodeWriter.h" +#include "llvm/Transforms/Utils/Cloning.h" #include "llvm/Transforms/Utils/ModuleUtils.h" #include @@ -33,8 +34,11 @@ PreservedAnalyses EmbedBitcodePass::run(Module &M, ModuleAnalysisManager &AM) { std::string Data; raw_string_ostream OS(Data); + // Clone the module with with Thin LTO, since ThinLTOBitcodeWriterPass changes + // vtable linkage that would break the non-lto object code for FatLTO. if (IsThinLTO) -ThinLTOBitcodeWriterPass(OS, /*ThinLinkOS=*/nullptr).run(M, AM); +ThinLTOBitcodeWriterPass(OS, /*ThinLinkOS=*/nullptr) +.run(*llvm::CloneModule(M), AM); else BitcodeWriterPass(OS, /*ShouldPreserveUseListOrder=*/false, EmitLTOSummary) .run(M, AM); diff --git a/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll b/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll index f1f7712f54039..54931be42b4eb 100644 --- a/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll +++ b/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll @@ -1,12 +1,13 @@ ; RUN: opt --mtriple x86_64-unknown-linux-gnu < %s -passes="embed-bitcode" -S | FileCheck %s -; CHECK-NOT: $_ZTV3Foo = comdat any +; CHECK: $_ZTV3Foo = comdat any $_ZTV3Foo = comdat any $_ZTI3Foo = comdat any -; CHECK: @_ZTV3Foo = external hidden unnamed_addr constant { [5 x ptr] }, align 8 -; CHECK: @_ZTI3Foo = linkonce_odr hidden constant { ptr, ptr, ptr } { ptr getelementptr inbounds (ptr, ptr @_ZTVN10__cxxabiv120__si_class_type_infoE, i64 2), ptr @_ZTS3Foo, ptr @_ZTISt13runtime_error }, comdat, align 8 +;; ThinLTOBitcodeWriter will remove the vtable for Foo, and make it an external symbol +; CHECK: @_ZTV3Foo = linkonce_odr hidden unnamed_addr constant { [5 x ptr] } { [5 x ptr] [ptr null, ptr @_ZTI3Foo, ptr @_ZN3FooD2Ev, ptr @_ZN3FooD0Ev, ptr @_ZNKSt13runtime_error4whatEv] }, comdat, align 8, !type !0, !type !1, !type !2, !type !3, !type !4, !type !5 +; CHECK-NOT: @foo = external unnamed_addr constant { [5 x ptr] }, align 8 ; CHECK: @llvm.embedded.object = private constant {{.*}}, section ".llvm.lto", align 1 ; CHECK: @llvm.compiler.used = appending global [1 x ptr] [ptr @llvm.embedded.object], section "llvm.metadata" @_ZTV3Foo = linkonce_odr hidden unnamed_addr constant { [5 x ptr] } { [5 x ptr] [ptr null, ptr @_ZTI3Foo, ptr @_ZN3FooD2Ev, ptr @_ZN3FooD0Ev, ptr @_ZNKSt13runtime_error4whatEv] }, comdat, align 8, !type !0, !type !1, !type !2, !type !3, !type !4, !type !5 `` https://github.com/llvm/llvm-project/pull/13 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV][Scheduler] Add scheduler definitions for the Q extension (PR #139495)
https://github.com/topperc approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/139495 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [StaticDataLayout][PGO]Implement reader and writer change for data access profiles (PR #139997)
https://github.com/mingmingl-llvm created https://github.com/llvm/llvm-project/pull/139997 None >From 75878647c2c36cca00e9d003dc84bf4597e19187 Mon Sep 17 00:00:00 2001 From: mingmingl Date: Tue, 13 May 2025 22:54:59 -0700 Subject: [PATCH] [StaticDataLayout][PGO]Implement reader and writer change for data access profiles --- .../include/llvm/ProfileData/DataAccessProf.h | 12 +++- .../llvm/ProfileData/IndexedMemProfData.h | 12 +++- .../llvm/ProfileData/InstrProfReader.h| 6 +- .../llvm/ProfileData/InstrProfWriter.h| 6 ++ llvm/include/llvm/ProfileData/MemProfReader.h | 12 llvm/include/llvm/ProfileData/MemProfYAML.h | 65 +++ llvm/lib/ProfileData/DataAccessProf.cpp | 6 +- llvm/lib/ProfileData/IndexedMemProfData.cpp | 61 + llvm/lib/ProfileData/InstrProfReader.cpp | 14 llvm/lib/ProfileData/InstrProfWriter.cpp | 20 -- llvm/lib/ProfileData/MemProfReader.cpp| 34 ++ .../tools/llvm-profdata/memprof-yaml.test | 11 llvm/tools/llvm-profdata/llvm-profdata.cpp| 5 ++ .../ProfileData/DataAccessProfTest.cpp| 11 ++-- 14 files changed, 244 insertions(+), 31 deletions(-) diff --git a/llvm/include/llvm/ProfileData/DataAccessProf.h b/llvm/include/llvm/ProfileData/DataAccessProf.h index e8504102238d1..f5f6abf0a2817 100644 --- a/llvm/include/llvm/ProfileData/DataAccessProf.h +++ b/llvm/include/llvm/ProfileData/DataAccessProf.h @@ -41,6 +41,8 @@ namespace data_access_prof { struct SourceLocation { SourceLocation(StringRef FileNameRef, uint32_t Line) : FileName(FileNameRef.str()), Line(Line) {} + + SourceLocation() {} /// The filename where the data is located. std::string FileName; /// The line number in the source code. @@ -53,6 +55,8 @@ namespace internal { // which strings are owned by `DataAccessProfData`. Used by `DataAccessProfData` // to represent data locations internally. struct SourceLocationRef { + SourceLocationRef(StringRef FileNameRef, uint32_t Line) + : FileName(FileNameRef), Line(Line) {} // The filename where the data is located. StringRef FileName; // The line number in the source code. @@ -100,8 +104,9 @@ using SymbolHandle = std::variant; /// The data access profiles for a symbol. struct DataAccessProfRecord { public: - DataAccessProfRecord(SymbolHandleRef SymHandleRef, - ArrayRef LocRefs) { + DataAccessProfRecord(SymbolHandleRef SymHandleRef, uint64_t AccessCount, + ArrayRef LocRefs) + : AccessCount(AccessCount) { if (std::holds_alternative(SymHandleRef)) { SymHandle = std::get(SymHandleRef).str(); } else @@ -110,8 +115,9 @@ struct DataAccessProfRecord { for (auto Loc : LocRefs) Locations.push_back(SourceLocation(Loc.FileName, Loc.Line)); } + DataAccessProfRecord() {} SymbolHandle SymHandle; - + uint64_t AccessCount; // The locations of data in the source code. Optional. SmallVector Locations; }; diff --git a/llvm/include/llvm/ProfileData/IndexedMemProfData.h b/llvm/include/llvm/ProfileData/IndexedMemProfData.h index 3c6c329d1c49d..66fa38472059b 100644 --- a/llvm/include/llvm/ProfileData/IndexedMemProfData.h +++ b/llvm/include/llvm/ProfileData/IndexedMemProfData.h @@ -10,14 +10,20 @@ // //===--===// +#include "llvm/ProfileData/DataAccessProf.h" #include "llvm/ProfileData/InstrProf.h" #include "llvm/ProfileData/MemProf.h" +#include +#include + namespace llvm { // Write the MemProf data to OS. -Error writeMemProf(ProfOStream &OS, memprof::IndexedMemProfData &MemProfData, - memprof::IndexedVersion MemProfVersionRequested, - bool MemProfFullSchema); +Error writeMemProf( +ProfOStream &OS, memprof::IndexedMemProfData &MemProfData, +memprof::IndexedVersion MemProfVersionRequested, bool MemProfFullSchema, +std::optional> +DataAccessProfileData); } // namespace llvm diff --git a/llvm/include/llvm/ProfileData/InstrProfReader.h b/llvm/include/llvm/ProfileData/InstrProfReader.h index c250a9ede39bc..210df6be46f04 100644 --- a/llvm/include/llvm/ProfileData/InstrProfReader.h +++ b/llvm/include/llvm/ProfileData/InstrProfReader.h @@ -18,6 +18,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/IR/ProfileSummary.h" #include "llvm/Object/BuildID.h" +#include "llvm/ProfileData/DataAccessProf.h" #include "llvm/ProfileData/InstrProf.h" #include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/ProfileData/MemProf.h" @@ -704,9 +705,12 @@ class IndexedMemProfReader { // The number of elements in the radix tree array. unsigned RadixTreeSize = 0; + std::unique_ptr DataAccessProfileData; + Error deserializeV2(const unsigned char *Start, const unsigned char *Ptr); Error deserializeRadixTreeBased(const unsigned char *Start, - const unsigned char *Ptr); +
[llvm-branch-commits] [llvm] [llvm][EmbedBitcodePass] Prevent modifying the module with ThinLTO (PR #139999)
ilovepi wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/13?utm_source=stack-comment-downstack-mergeability-warning"; > >on Graphite. > https://graphite.dev/docs/merge-pull-requests";>Learn more * **#13** https://app.graphite.dev/github/pr/llvm/llvm-project/13?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> đ https://app.graphite.dev/github/pr/llvm/llvm-project/13?utm_source=stack-comment-view-in-graphite"; target="_blank">(View in Graphite) * **#139998** https://app.graphite.dev/github/pr/llvm/llvm-project/139998?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * `main` This stack of pull requests is managed by https://graphite.dev?utm-source=stack-comment";>Graphite. Learn more about https://stacking.dev/?utm_source=stack-comment";>stacking. https://github.com/llvm/llvm-project/pull/13 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [llvm][EmbedBitcodePass] Prevent modifying the module with ThinLTO (PR #139999)
https://github.com/ilovepi created https://github.com/llvm/llvm-project/pull/13 Since ThinLTOBitcodeWriterPass handles many things for CFI and WPD, like updating vtable linkage, we need to prevent those changes from persisting in the non-LTO object code we will compile under FatLTO. The only non-invasive way to do that is to clone the module when serializing the module in ThinLTOBitcodeWriterPass. We may be able to avoid cloning in the future with additional infrastructure to restore the IR to its original state. Fixes #139440 >From 0bc0cc2c64a7c7a347cf71888e9a57339537ce25 Mon Sep 17 00:00:00 2001 From: Paul Kirth Date: Wed, 14 May 2025 20:47:26 -0700 Subject: [PATCH] [llvm][EmbedBitcodePass] Prevent modifying the module with ThinLTO Since ThinLTOBitcodeWriterPass handles many things for CFI and WPD, like updating vtable linkage, we need to prevent those changes from persisting in the non-LTO object code we will compile under FatLTO. The only non-invasive way to do that is to clone the module when serializing the module in ThinLTOBitcodeWriterPass. We may be able to avoid cloning in the future with additional infrastructure to restore the IR to its original state. Fixes #139440 --- llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp | 6 +- llvm/test/Transforms/EmbedBitcode/embed-wpd.ll | 7 --- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp b/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp index 73f567734a91b..7a378c61cb899 100644 --- a/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp +++ b/llvm/lib/Transforms/IPO/EmbedBitcodePass.cpp @@ -16,6 +16,7 @@ #include "llvm/Support/raw_ostream.h" #include "llvm/TargetParser/Triple.h" #include "llvm/Transforms/IPO/ThinLTOBitcodeWriter.h" +#include "llvm/Transforms/Utils/Cloning.h" #include "llvm/Transforms/Utils/ModuleUtils.h" #include @@ -33,8 +34,11 @@ PreservedAnalyses EmbedBitcodePass::run(Module &M, ModuleAnalysisManager &AM) { std::string Data; raw_string_ostream OS(Data); + // Clone the module with with Thin LTO, since ThinLTOBitcodeWriterPass changes + // vtable linkage that would break the non-lto object code for FatLTO. if (IsThinLTO) -ThinLTOBitcodeWriterPass(OS, /*ThinLinkOS=*/nullptr).run(M, AM); +ThinLTOBitcodeWriterPass(OS, /*ThinLinkOS=*/nullptr) +.run(*llvm::CloneModule(M), AM); else BitcodeWriterPass(OS, /*ShouldPreserveUseListOrder=*/false, EmitLTOSummary) .run(M, AM); diff --git a/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll b/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll index f1f7712f54039..54931be42b4eb 100644 --- a/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll +++ b/llvm/test/Transforms/EmbedBitcode/embed-wpd.ll @@ -1,12 +1,13 @@ ; RUN: opt --mtriple x86_64-unknown-linux-gnu < %s -passes="embed-bitcode" -S | FileCheck %s -; CHECK-NOT: $_ZTV3Foo = comdat any +; CHECK: $_ZTV3Foo = comdat any $_ZTV3Foo = comdat any $_ZTI3Foo = comdat any -; CHECK: @_ZTV3Foo = external hidden unnamed_addr constant { [5 x ptr] }, align 8 -; CHECK: @_ZTI3Foo = linkonce_odr hidden constant { ptr, ptr, ptr } { ptr getelementptr inbounds (ptr, ptr @_ZTVN10__cxxabiv120__si_class_type_infoE, i64 2), ptr @_ZTS3Foo, ptr @_ZTISt13runtime_error }, comdat, align 8 +;; ThinLTOBitcodeWriter will remove the vtable for Foo, and make it an external symbol +; CHECK: @_ZTV3Foo = linkonce_odr hidden unnamed_addr constant { [5 x ptr] } { [5 x ptr] [ptr null, ptr @_ZTI3Foo, ptr @_ZN3FooD2Ev, ptr @_ZN3FooD0Ev, ptr @_ZNKSt13runtime_error4whatEv] }, comdat, align 8, !type !0, !type !1, !type !2, !type !3, !type !4, !type !5 +; CHECK-NOT: @foo = external unnamed_addr constant { [5 x ptr] }, align 8 ; CHECK: @llvm.embedded.object = private constant {{.*}}, section ".llvm.lto", align 1 ; CHECK: @llvm.compiler.used = appending global [1 x ptr] [ptr @llvm.embedded.object], section "llvm.metadata" @_ZTV3Foo = linkonce_odr hidden unnamed_addr constant { [5 x ptr] } { [5 x ptr] [ptr null, ptr @_ZTI3Foo, ptr @_ZN3FooD2Ev, ptr @_ZN3FooD0Ev, ptr @_ZNKSt13runtime_error4whatEv] }, comdat, align 8, !type !0, !type !1, !type !2, !type !3, !type !4, !type !5 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -544,6 +544,7 @@ TYPE_TRAIT_2(__is_pointer_interconvertible_base_of, IsPointerInterconvertibleBas #include "clang/Basic/TransformTypeTraits.def" // Clang-only C++ Type Traits +TYPE_TRAIT_1(__has_non_relocatable_fields, HasNonRelocatableFields, KEYCXX) pcc wrote: I don't like it either. As mentioned in the main comment, I hope to get rid of this when the P2786 implementation is finalized. Sharing the qualifiers with PAuth ABI would be a good idea for the separate opt-in solution. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2976,7 +3006,15 @@ void CodeGenFunction::EmitForwardingCallToLambda( QualType resultType = FPT->getReturnType(); ReturnValueSlot returnSlot; if (!resultType->isVoidType() && - calleeFnInfo->getReturnInfo().getKind() == ABIArgInfo::Indirect && + (calleeFnInfo->getReturnInfo().getKind() == ABIArgInfo::Indirect || + // With pointer field protection, we need to set up the return slot when pcc wrote: The return value slot for forwarding lambdas is set up in an unusual way compared to other functions. I think I first tried simplifying this to work like normal functions but that led to other problems. I agree that ideally we wouldn't need to modify this. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -1415,6 +1469,52 @@ void CodeGenFunction::CreateCoercedStore(llvm::Value *Src, Address Dst, } } + // Coercion directly through memory does not work if the structure has pointer + // field protection because the struct passed by value has a different bit + // pattern to the struct in memory, so we must read the elements one by one + // and use them to form the coerced structure. + std::vector PFPFields; + getContext().findPFPFields(SrcFETy, CharUnits::Zero(), PFPFields, true); pcc wrote: See my other comment. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -7756,6 +7756,10 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, Args.addOptInFlag(CmdArgs, options::OPT_funique_source_file_names, options::OPT_fno_unique_source_file_names); + if (!IsCudaDevice) pcc wrote: With CUDA the driver creates two compilation jobs, one for the host and one for the device. Since it is an ABI requirement that all C++ translation units are built with the same `-fexperimental-pointer-field-protection` flags, the flag must be passed when compiling the CUDA translation units as well, so that the host side is built correctly. And since GPUs don't support PFP, we need to exclude the flag from the CUDA device job. It looks like the same ends up happening for PAuth ABI because the flags are copied in `Clang::AddAArch64TargetArgs`, which means that they are ignored for the CUDA device which will have a different architecture. If someone did end up implementing PFP for GPUs, I think we would create a separate flag to enable PFP for the device. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -7538,6 +7538,14 @@ static bool IsEligibleForTrivialRelocation(Sema &SemaRef, if (!SemaRef.IsCXXTriviallyRelocatableType(Field->getType())) return false; } + + // FIXME: PFP should not affect trivial relocatability, instead it should + // affect the implementation of std::trivially_relocate. See: + // https://discourse.llvm.org/t/rfc-structure-protection-a-family-of-uaf-mitigation-techniques/8/16?u=pcc + if (!SemaRef.Context.arePFPFieldsTriviallyRelocatable(D) && pcc wrote: Yes, that does look like the right place for now until we have `std::trivially_relocate` fully implemented. I see there is already a call to `BaseElementType.hasAddressDiscriminatedPointerAuth()` in that function. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2513,6 +2513,12 @@ def CountedByOrNull : DeclOrTypeAttr { let LangOpts = [COnly]; } +def NoPointerFieldProtection : DeclOrTypeAttr { pcc wrote: There are numerous circumstances where the C++ ABI is distinct from the platform ABI and is allowed to change. For example, most Chromium-based web browsers ship with a statically linked libc++, and the same is true for many Android apps and server-side software at Google. In the intended usage model, all of libc++ would be built with a `-fexperimental-pointer-field-protection=` flag consistent with the rest of the program. The libc++abi opt-outs that I added are only for opting out pointer fields of RTTI structs that are generated by the compiler. In principle, we could teach the compiler to sign those pointer fields which would let us remove the opt-outs, but there is a lower ROI for subjecting those fields to PFP because the RTTI structs are not typically dynamically allocated. We may consider adding an attribute to allow opting in in the case where the flag is not passed. But I think we should do that in a followup. We would need to carefully consider how that would interact with the other aspects of the opt-in solution, such as the pointer qualifiers. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -441,6 +445,254 @@ bool PreISelIntrinsicLowering::expandMemIntrinsicUses(Function &F) const { return Changed; } +namespace { + +enum class PointerEncoding { + Rotate, + PACCopyable, + PACNonCopyable, +}; + +bool expandProtectedFieldPtr(Function &Intr) { + Module &M = *Intr.getParent(); + bool IsAArch64 = Triple(M.getTargetTriple()).isAArch64(); + + std::set NonPFPFields; + std::set LoadsStores; + + Type *Int8Ty = Type::getInt8Ty(M.getContext()); + Type *Int64Ty = Type::getInt64Ty(M.getContext()); + PointerType *PtrTy = PointerType::get(M.getContext(), 0); + + Function *SignIntr = + Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_sign, {}); + Function *AuthIntr = + Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_auth, {}); + + auto *EmuFnTy = FunctionType::get(Int64Ty, {Int64Ty, Int64Ty}, false); + FunctionCallee EmuSignIntr = M.getOrInsertFunction("__emupac_pacda", EmuFnTy); + FunctionCallee EmuAuthIntr = M.getOrInsertFunction("__emupac_autda", EmuFnTy); + + auto CreateSign = [&](IRBuilder<> &B, Value *Val, Value *Disc, + OperandBundleDef DSBundle) { +Function *F = B.GetInsertBlock()->getParent(); +Attribute FSAttr = F->getFnAttribute("target-features"); +if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth")) + return B.CreateCall(SignIntr, {Val, B.getInt32(2), Disc}, DSBundle); +return B.CreateCall(EmuSignIntr, {Val, Disc}, DSBundle); + }; + + auto CreateAuth = [&](IRBuilder<> &B, Value *Val, Value *Disc, + OperandBundleDef DSBundle) { +Function *F = B.GetInsertBlock()->getParent(); +Attribute FSAttr = F->getFnAttribute("target-features"); +if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth")) + return B.CreateCall(AuthIntr, {Val, B.getInt32(2), Disc}, DSBundle); +return B.CreateCall(EmuAuthIntr, {Val, Disc}, DSBundle); + }; + + for (User *U : Intr.users()) { +auto *Call = cast(U); +auto *FieldName = cast( pcc wrote: Yeah, the exact semantics of the intrinsic are a bit inelegant at the moment. For example, Clang shares knowledge with this pass about how deactivation symbols are named, as well as how to compute hashes (it uses `std::hash` because I didn't get around to switching it to SipHash yet, my bad). It may be best to move all of that knowledge into Clang and then the intrinsic would take a deactivation symbol as well as a discriminator hash. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -3011,6 +3011,12 @@ defm experimental_omit_vtable_rtti : BoolFOption<"experimental-omit-vtable-rtti" NegFlag, BothFlags<[], [CC1Option], " the RTTI component from virtual tables">>; +def experimental_pointer_field_protection_EQ : Joined<["-"], "fexperimental-pointer-field-protection=">, Group, pcc wrote: (See other reply.) https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2268,13 +2293,22 @@ CodeGenFunction::EmitNullInitialization(Address DestPtr, QualType Ty) { // Get and call the appropriate llvm.memcpy overload. Builder.CreateMemCpy(DestPtr, SrcPtr, SizeVal, false); -return; + } else { +// Otherwise, just memset the whole thing to zero. This is legal pcc wrote: A PFP field containing a null pointer has a non-null bit pattern (it is a signed null). This is to avoid a conditional branch at every location when loading/storing a pointer and the nullness of the pointer cannot be statically determined. That's why we need to store nulls here. The EmitNullInitialization function is not used in case of a constructor call. It is used in cases like ``` struct S { void *p; }; void f() { S s{}; } ``` I am not adding a call to memset here. The change here is a bit confusing because I needed to change some of the indentation to get rid of an early return. The memset previously on line 2277, which was moved to line 2301, was previously, and is now, being executed if `isZeroInitializable`. The only code being added is the null stores. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -928,6 +936,11 @@ namespace { if (PointerAuthQualifier Q = F->getType().getPointerAuth(); Q && Q.isAddressDiscriminated()) return false; + // Non-trivially-copyable fields with pointer field protection need to be pcc wrote: We could move lines 936-943 into a separate function that takes a QualType and call it from other places. However, I checked the other callers of `PointerAuthQualifier::isAddressDiscriminated` and I think almost all of them are already correct. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
https://github.com/pcc edited https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2201,6 +2215,22 @@ void CodeGenFunction::EmitCXXConstructorCall( EmitTypeCheck(CodeGenFunction::TCK_ConstructorCall, Loc, This, getContext().getRecordType(ClassDecl), CharUnits::Zero()); + // When initializing an object that has pointer field protection and whose + // fields are not trivially relocatable we must initialize any pointer fields + // to a valid signed pointer (any pointer value will do, but we just use null + // pointers). This is because if the object is subsequently copied, its copy + // constructor will need to read and authenticate any pointer fields in order + // to copy the object to a new address, which will fail if the pointers are + // uninitialized. + if (!getContext().arePFPFieldsTriviallyRelocatable(D->getParent())) { pcc wrote: That's fair. This code was added while testing in our internal codebase at a time when we had a more expansive view of which fields would be subject to PFP. I will re-evaluate whether this is still needed. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
https://github.com/pcc commented: Hi Oliver, thanks for your comments! I'll address them below. > Thoughts: > > This should be opt-in on a field or struct granularity, not just a global > behavior. This would certainly be easier if it were an opt-in behavior, as it would allow avoiding a substantial amount of complexity that you point out below. The problem is that would not make the solution very effective as a UAF mitigation, as it would only help with code that was opted in, which is likely a tiny fraction of an entire codebase. That fraction is unlikely to be the fraction with UAF bugs, not only because of the size of the fraction but also because if developers aren't thinking about memory safety issues, they certainly won't be manually opting into this. With PFP, the problem that I set out to solve was to find a way to automatically protect as many pointers stored in memory as possible in standards-conforming code. > In the RFC I think you mentioned not applying PFP to C types, but I'm unsure > how you're deciding what is a C type? This is based on whether the type has standard layout. C does not permit declaring a type that is non-standard layout. > There are a lot of special cases being added in places that should not need > to be aware of PFP - the core type queries should be returning correct values > for types containing PFP fields. Agreed on type queries. The current adjustment to how trivially-relocatable is defined is a workaround that will be removed once P2786 is fully implemented in libc++, and at that point we shouldn't need `__has_non_relocatable_fields` either. The intent is that turning on PFP should be invisible to the developer (in other words, the modification to the storage format of pointer fields in certain structs is permitted by the as-if rule). As a result, most of the implementation is in CodeGen and below. > A lot of the "this read/write/copy/init special work" exactly aligns with the > constraints of pointer auth, and I think the overall implementation could be > made better by introducing the concept of a `HardenedPointerQualifier` or > similar, that could be either PointerAuthQualifier or PFPQualifier. In > principle this could even just be treated as a different PointerAuth key set. > > That approach would also help with some of the problems you have with > offsetof and pointers to PFP fields - the thing that causes the problems with > the pointer to a PFP field is that the PFP schema for a given field is in > reality part of the type, so given > > ```c++ > struct A { >void *field1; >void *field2; > }; > ``` > > It looks you are trying to maintain the idea that `A::field1` and `A::field2` > have the same type, which makes this code valid according to the type system: > > ```c++ > void f(void** obj); > void g(A *a) { > f(&a->field1); > f(&a->field2); > } > ``` > > but the real types are not the same. The semantic equivalent under pointer > auth is something akin to > > ```c++ > struct A { >void * __ptrauth(1, 1, hash("A::field1")) field1; >void * __ptrauth(1, 1, hash("A::field2")) field2; > }; > ``` > > Which makes it very clear that the types are different, and I think trying to > pretend they are not is part of what is causing problems. In the early stages of the project, I wanted to make it so that the PFP fields would implicitly receive qualifiers, similar to what you proposed, and I was trying to come up with an approach that would make it work, but I couldn't see a way because of the inter-convertibility of pointers from different structs (and pointers outside of structs entirely), and disallowing conversions between differently qualified pointer types would render the compiler unable to compile standard C++. The only way that I could see to make it work was to utilize the as-if rule and invisibly modify the in-memory representation. For pointers in standard-layout types (i.e. pointers that explicitly opt in) I agree with you that the right approach would be to use a qualifier, but I think that should be done as a separate change, possibly as a followup. In this change, I would like to implement the solution for implicitly protected fields. > The more I think about it, the more I feel that this could be implemented > almost entirely on top of the existing pointer auth work. > > Currently for pointer auth there's a set of ARM keys specified, and I think > you just need to create a different set, PFP_Keys or similar, and set up the > appropriate schemas (basically null schemas for all the existing cases), and > add a new schema: DefaultFieldSchema or similar, and apply that schema to > every pointer field in an object unless there's an explicit qualifier applied > already. > > Then it would in principle just be a matter of making adding the appropriate > logic to the ptrauth intrinsics in llvm. > > Now there are some moderately large caveats to that idea > > * the existing ptrauth
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: optionally assume auth traps on failure (PR #139778)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/139778 >From 8f997500a97b5ad7acc9dea416cca6b2f7bb615d Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Tue, 13 May 2025 19:50:41 +0300 Subject: [PATCH] [BOLT] Gadget scanner: optionally assume auth traps on failure On AArch64 it is possible for an auth instruction to either return an invalid address value on failure (without FEAT_FPAC) or generate an error (with FEAT_FPAC). It thus may be possible to never emit explicit pointer checks, if the target CPU is known to support FEAT_FPAC. This commit implements an --auth-traps-on-failure command line option, which essentially makes "safe-to-dereference" and "trusted" register properties identical and disables scanning for authentication oracles completely. --- bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++ .../binary-analysis/AArch64/cmdline-args.test | 1 + .../AArch64/gs-pauth-authentication-oracles.s | 6 +- .../binary-analysis/AArch64/gs-pauth-calls.s | 5 +- .../AArch64/gs-pauth-debug-output.s | 177 ++--- .../AArch64/gs-pauth-jump-table.s | 6 +- .../AArch64/gs-pauth-signing-oracles.s| 54 ++--- .../AArch64/gs-pauth-tail-calls.s | 184 +- 8 files changed, 318 insertions(+), 227 deletions(-) diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index bda971bcd9343..cfe86d32df798 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -14,6 +14,7 @@ #include "bolt/Passes/PAuthGadgetScanner.h" #include "bolt/Core/ParallelUtilities.h" #include "bolt/Passes/DataflowAnalysis.h" +#include "bolt/Utils/CommandLineOpts.h" #include "llvm/ADT/STLExtras.h" #include "llvm/ADT/SmallSet.h" #include "llvm/MC/MCInst.h" @@ -26,6 +27,11 @@ namespace llvm { namespace bolt { namespace PAuthGadgetScanner { +static cl::opt AuthTrapsOnFailure( +"auth-traps-on-failure", +cl::desc("Assume authentication instructions always trap on failure"), +cl::cat(opts::BinaryAnalysisCategory)); + [[maybe_unused]] static void traceInst(const BinaryContext &BC, StringRef Label, const MCInst &MI) { dbgs() << " " << Label << ": "; @@ -365,6 +371,34 @@ class SrcSafetyAnalysis { return Clobbered; } + std::optional getRegMadeTrustedByChecking(const MCInst &Inst, + SrcState Cur) const { +// This functions cannot return multiple registers. This is never the case +// on AArch64. +std::optional RegCheckedByInst = +BC.MIB->getAuthCheckedReg(Inst, /*MayOverwrite=*/false); +if (RegCheckedByInst && Cur.SafeToDerefRegs[*RegCheckedByInst]) + return *RegCheckedByInst; + +auto It = CheckerSequenceInfo.find(&Inst); +if (It == CheckerSequenceInfo.end()) + return std::nullopt; + +MCPhysReg RegCheckedBySequence = It->second.first; +const MCInst *FirstCheckerInst = It->second.second; + +// FirstCheckerInst should belong to the same basic block (see the +// assertion in DataflowSrcSafetyAnalysis::run()), meaning it was +// deterministically processed a few steps before this instruction. +const SrcState &StateBeforeChecker = getStateBefore(*FirstCheckerInst); + +// The sequence checks the register, but it should be authenticated before. +if (!StateBeforeChecker.SafeToDerefRegs[RegCheckedBySequence]) + return std::nullopt; + +return RegCheckedBySequence; + } + // Returns all registers that can be treated as if they are written by an // authentication instruction. SmallVector getRegsMadeSafeToDeref(const MCInst &Point, @@ -387,18 +421,38 @@ class SrcSafetyAnalysis { Regs.push_back(DstAndSrc->first); } +// Make sure explicit checker sequence keeps register safe-to-dereference +// when the register would be clobbered according to the regular rules: +// +//; LR is safe to dereference here +//mov x16, x30 ; start of the sequence, LR is s-t-d right before +//xpaclri ; clobbers LR, LR is not safe anymore +//cmp x30, x16 +//b.eq 1f; end of the sequence: LR is marked as trusted +//brk 0x1234 +// 1: +//; at this point LR would be marked as trusted, +//; but not safe-to-dereference +// +// or even just +// +//; X1 is safe to dereference here +//ldr x0, [x1, #8]! +//; X1 is trusted here, but it was clobbered due to address write-back +if (auto CheckedReg = getRegMadeTrustedByChecking(Point, Cur)) + Regs.push_back(*CheckedReg); + return Regs; } // Returns all registers made trusted by this instruction. SmallVector getRegsMadeTrusted(const MCInst &Point, const SrcState &Cur) const { +assert(!AuthTrapsOnFailure &&
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: prevent false positives due to jump tables (PR #138884)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138884 >From cdc385aac7abe960340e26b2427ac9215e0a54fd Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Tue, 6 May 2025 11:31:03 +0300 Subject: [PATCH] [BOLT] Gadget scanner: prevent false positives due to jump tables As part of PAuth hardening, AArch64 LLVM backend can use a special BR_JumpTable pseudo (enabled by -faarch64-jump-table-hardening Clang option) which is expanded in the AsmPrinter into a contiguous sequence without unsafe instructions in the middle. This commit adds another target-specific callback to MCPlusBuilder to make it possible to inhibit false positives for known-safe jump table dispatch sequences. Without special handling, the branch instruction is likely to be reported as a non-protected call (as its destination is not produced by an auth instruction, PC-relative address materialization, etc.) and possibly as a tail call being performed with unsafe link register (as the detection whether the branch instruction is a tail call is an heuristic). For now, only the specific instruction sequence used by the AArch64 LLVM backend is matched. --- bolt/include/bolt/Core/MCInstUtils.h | 9 + bolt/include/bolt/Core/MCPlusBuilder.h| 14 + bolt/lib/Core/MCInstUtils.cpp | 20 + bolt/lib/Passes/PAuthGadgetScanner.cpp| 10 + .../Target/AArch64/AArch64MCPlusBuilder.cpp | 73 ++ .../AArch64/gs-pauth-jump-table.s | 703 ++ 6 files changed, 829 insertions(+) create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-jump-table.s diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h index b495eb8ef5eec..5dd0aaa48d6e7 100644 --- a/bolt/include/bolt/Core/MCInstUtils.h +++ b/bolt/include/bolt/Core/MCInstUtils.h @@ -158,6 +158,15 @@ class MCInstReference { return nullptr; } + /// Returns the only preceding instruction, or std::nullopt if multiple or no + /// predecessors are possible. + /// + /// If CFG information is available, basic block boundary can be crossed, + /// provided there is exactly one predecessor. If CFG is not available, the + /// preceding instruction in the offset order is returned, unless this is the + /// first instruction of the function. + std::optional getSinglePredecessor(); + raw_ostream &print(raw_ostream &OS) const; }; diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 87de6754017db..eb93d7de7fee9 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -14,6 +14,7 @@ #ifndef BOLT_CORE_MCPLUSBUILDER_H #define BOLT_CORE_MCPLUSBUILDER_H +#include "bolt/Core/MCInstUtils.h" #include "bolt/Core/MCPlus.h" #include "bolt/Core/Relocation.h" #include "llvm/ADT/ArrayRef.h" @@ -699,6 +700,19 @@ class MCPlusBuilder { return std::nullopt; } + /// Tests if BranchInst corresponds to an instruction sequence which is known + /// to be a safe dispatch via jump table. + /// + /// The target can decide which instruction sequences to consider "safe" from + /// the Pointer Authentication point of view, such as any jump table dispatch + /// sequence without function calls inside, any sequence which is contiguous, + /// or only some specific well-known sequences. + virtual bool + isSafeJumpTableBranchForPtrAuth(MCInstReference BranchInst) const { +llvm_unreachable("not implemented"); +return false; + } + virtual bool isTerminator(const MCInst &Inst) const; virtual bool isNoop(const MCInst &Inst) const { diff --git a/bolt/lib/Core/MCInstUtils.cpp b/bolt/lib/Core/MCInstUtils.cpp index 40f6edd59135c..b7c6d898988af 100644 --- a/bolt/lib/Core/MCInstUtils.cpp +++ b/bolt/lib/Core/MCInstUtils.cpp @@ -55,3 +55,23 @@ raw_ostream &MCInstReference::print(raw_ostream &OS) const { OS << ">"; return OS; } + +std::optional MCInstReference::getSinglePredecessor() { + if (const RefInBB *Ref = tryGetRefInBB()) { +if (Ref->It != Ref->BB->begin()) + return MCInstReference(Ref->BB, &*std::prev(Ref->It)); + +if (Ref->BB->pred_size() != 1) + return std::nullopt; + +BinaryBasicBlock *PredBB = *Ref->BB->pred_begin(); +assert(!PredBB->empty() && "Empty basic blocks are not supported yet"); +return MCInstReference(PredBB, &*PredBB->rbegin()); + } + + const RefInBF &Ref = getRefInBF(); + if (Ref.It == Ref.BF->instrs().begin()) +return std::nullopt; + + return MCInstReference(Ref.BF, std::prev(Ref.It)); +} diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index 5e08ae3fbf767..bda971bcd9343 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -1328,6 +1328,11 @@ shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF, return std::nullopt; } + if (BC.MIB->isSafeJumpTableBranchForPtrAuth(Inst)) { +LL
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: prevent false positives due to jump tables (PR #138884)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138884 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support â https://githubstatus.com";>GitHub Status â https://twitter.com/githubstatus";>@githubstatus ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: account for BRK when searching for auth oracles (PR #137975)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/137975 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support â https://githubstatus.com";>GitHub Status â https://twitter.com/githubstatus";>@githubstatus ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)
atrosinenko wrote: As this PR improved the handling of leaf functions without CFG information, a few more test cases can be cleaned up - updated in 1377286872187ed191b02bd9632842f4db6dc367. https://github.com/llvm/llvm-project/pull/137224 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) (PR #138883)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138883 >From 4bd8dd9334c8ad810e7fc593331cd6d4e2fdbbad Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Wed, 7 May 2025 16:42:00 +0300 Subject: [PATCH] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) Introduce matchInst helper function to capture and/or match the operands of MCInst. Unlike the existing `MCPlusBuilder::MCInstMatcher` machinery, matchInst is intended for the use cases when precise control over the instruction order is required. For example, when validating PtrAuth hardening, all registers are usually considered unsafe after a function call, even though callee-saved registers should preserve their old values *under normal operation*. --- bolt/include/bolt/Core/MCInstUtils.h | 128 ++ .../Target/AArch64/AArch64MCPlusBuilder.cpp | 90 +--- 2 files changed, 162 insertions(+), 56 deletions(-) diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h index a3912a8fb265a..b495eb8ef5eec 100644 --- a/bolt/include/bolt/Core/MCInstUtils.h +++ b/bolt/include/bolt/Core/MCInstUtils.h @@ -166,6 +166,134 @@ static inline raw_ostream &operator<<(raw_ostream &OS, return Ref.print(OS); } +/// Instruction-matching helpers operating on a single instruction at a time. +/// +/// Unlike MCPlusBuilder::MCInstMatcher, this matchInst() function focuses on +/// the cases where a precise control over the instruction order is important: +/// +/// // Bring the short names into the local scope: +/// using namespace MCInstMatcher; +/// // Declare the registers to capture: +/// Reg Xn, Xm; +/// // Capture the 0th and 1st operands, match the 2nd operand against the +/// // just captured Xm register, match the 3rd operand against literal 0: +/// if (!matchInst(MaybeAdd, AArch64::ADDXrs, Xm, Xn, Xm, Imm(0)) +/// return AArch64::NoRegister; +/// // Match the 0th operand against Xm: +/// if (!matchInst(MaybeBr, AArch64::BR, Xm)) +/// return AArch64::NoRegister; +/// // Return the matched register: +/// return Xm.get(); +namespace MCInstMatcher { + +// The base class to match an operand of type T. +// +// The subclasses of OpMatcher are intended to be allocated on the stack and +// to only be used by passing them to matchInst() and by calling their get() +// function, thus the peculiar `mutable` specifiers: to make the calling code +// compact and readable, the templated matchInst() function has to accept both +// long-lived Imm/Reg wrappers declared as local variables (intended to capture +// the first operand's value and match the subsequent operands, whether inside +// a single instruction or across multiple instructions), as well as temporary +// wrappers around literal values to match, f.e. Imm(42) or Reg(AArch64::XZR). +template class OpMatcher { + mutable std::optional Value; + mutable std::optional SavedValue; + + // Remember/restore the last Value - to be called by matchInst. + void remember() const { SavedValue = Value; } + void restore() const { Value = SavedValue; } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +protected: + OpMatcher(std::optional ValueToMatch) : Value(ValueToMatch) {} + + bool matchValue(T OpValue) const { +// Check that OpValue does not contradict the existing Value. +bool MatchResult = !Value || *Value == OpValue; +// If MatchResult is false, all matchers will be reset before returning from +// matchInst, including this one, thus no need to assign conditionally. +Value = OpValue; + +return MatchResult; + } + +public: + /// Returns the captured value. + T get() const { +assert(Value.has_value()); +return *Value; + } +}; + +class Reg : public OpMatcher { + bool matches(const MCOperand &Op) const { +if (!Op.isReg()) + return false; + +return matchValue(Op.getReg()); + } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +public: + Reg(std::optional RegToMatch = std::nullopt) + : OpMatcher(RegToMatch) {} +}; + +class Imm : public OpMatcher { + bool matches(const MCOperand &Op) const { +if (!Op.isImm()) + return false; + +return matchValue(Op.getImm()); + } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +public: + Imm(std::optional ImmToMatch = std::nullopt) + : OpMatcher(ImmToMatch) {} +}; + +/// Tries to match Inst and updates Ops on success. +/// +/// If Inst has the specified Opcode and its operand list prefix matches Ops, +/// this function returns true and updates Ops, otherwise false is returned and +/// values of Ops are kept as before matchInst was called. +/// +/// Please note that while Ops are technically passed by a const reference to +/// make invocations like `matchInst(MI, Opcode, Imm(42))` possible, all their +/// fields are marked mut
[llvm-branch-commits] [llvm] [BOLT] Factor out MCInstReference from gadget scanner (NFC) (PR #138655)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138655 >From a7a2eea52cb63453c261499288fec46f9e1d3613 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Mon, 28 Apr 2025 18:35:48 +0300 Subject: [PATCH] [BOLT] Factor out MCInstReference from gadget scanner (NFC) Move MCInstReference representing a constant reference to an instruction inside a parent entity - either inside a basic block (which has a reference to its parent function) or directly to the function (when CFG information is not available). --- bolt/include/bolt/Core/MCInstUtils.h | 172 + bolt/include/bolt/Passes/PAuthGadgetScanner.h | 180 +- bolt/lib/Core/CMakeLists.txt | 1 + bolt/lib/Core/MCInstUtils.cpp | 57 ++ bolt/lib/Passes/PAuthGadgetScanner.cpp| 102 -- 5 files changed, 273 insertions(+), 239 deletions(-) create mode 100644 bolt/include/bolt/Core/MCInstUtils.h create mode 100644 bolt/lib/Core/MCInstUtils.cpp diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h new file mode 100644 index 0..a3912a8fb265a --- /dev/null +++ b/bolt/include/bolt/Core/MCInstUtils.h @@ -0,0 +1,172 @@ +//===- bolt/Core/MCInstUtils.h --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef BOLT_CORE_MCINSTUTILS_H +#define BOLT_CORE_MCINSTUTILS_H + +#include "bolt/Core/BinaryBasicBlock.h" + +#include +#include +#include + +namespace llvm { +namespace bolt { + +class BinaryFunction; + +/// MCInstReference represents a reference to a constant MCInst as stored either +/// in a BinaryFunction (i.e. before a CFG is created), or in a BinaryBasicBlock +/// (after a CFG is created). +class MCInstReference { + using nocfg_const_iterator = std::map::const_iterator; + + // Two cases are possible: + // * functions with CFG reconstructed - a function stores a collection of + // basic blocks, each basic block stores a contiguous vector of MCInst + // * functions without CFG - there are no basic blocks created, + // the instructions are directly stored in std::map in BinaryFunction + // + // In both cases, the direct parent of MCInst is stored together with an + // iterator pointing to the instruction. + + // Helper struct: CFG is available, the direct parent is a basic block, + // iterator's type is `MCInst *`. + struct RefInBB { +RefInBB(const BinaryBasicBlock *BB, const MCInst *Inst) +: BB(BB), It(Inst) {} +RefInBB(const RefInBB &Other) = default; +RefInBB &operator=(const RefInBB &Other) = default; + +const BinaryBasicBlock *BB; +BinaryBasicBlock::const_iterator It; + +bool operator<(const RefInBB &Other) const { + if (BB != Other.BB) +return std::less{}(BB, Other.BB); + return It < Other.It; +} + +bool operator==(const RefInBB &Other) const { + return BB == Other.BB && It == Other.It; +} + }; + + // Helper struct: CFG is *not* available, the direct parent is a function, + // iterator's type is std::map::iterator (the mapped value + // is an instruction's offset). + struct RefInBF { +RefInBF(const BinaryFunction *BF, nocfg_const_iterator It) +: BF(BF), It(It) {} +RefInBF(const RefInBF &Other) = default; +RefInBF &operator=(const RefInBF &Other) = default; + +const BinaryFunction *BF; +nocfg_const_iterator It; + +bool operator<(const RefInBF &Other) const { + if (BF != Other.BF) +return std::less{}(BF, Other.BF); + return It->first < Other.It->first; +} + +bool operator==(const RefInBF &Other) const { + return BF == Other.BF && It->first == Other.It->first; +} + }; + + std::variant Reference; + + // Utility methods to be used like this: + // + // if (auto *Ref = tryGetRefInBB()) + // return Ref->doSomething(...); + // return getRefInBF().doSomethingElse(...); + const RefInBB *tryGetRefInBB() const { +assert(std::get_if(&Reference) || + std::get_if(&Reference)); +return std::get_if(&Reference); + } + const RefInBF &getRefInBF() const { +assert(std::get_if(&Reference)); +return *std::get_if(&Reference); + } + +public: + /// Constructs an empty reference. + MCInstReference() : Reference(RefInBB(nullptr, nullptr)) {} + /// Constructs a reference to the instruction inside the basic block. + MCInstReference(const BinaryBasicBlock *BB, const MCInst *Inst) + : Reference(RefInBB(BB, Inst)) { +assert(BB && Inst && "Neither BB nor Inst should be nullptr"); + } + /// Constructs a reference to the instruction inside the basic block. + MCInstReference(const BinaryBasicBlock *BB, uns
[llvm-branch-commits] [llvm] [BOLT] Factor out MCInstReference from gadget scanner (NFC) (PR #138655)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138655 >From a7a2eea52cb63453c261499288fec46f9e1d3613 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Mon, 28 Apr 2025 18:35:48 +0300 Subject: [PATCH] [BOLT] Factor out MCInstReference from gadget scanner (NFC) Move MCInstReference representing a constant reference to an instruction inside a parent entity - either inside a basic block (which has a reference to its parent function) or directly to the function (when CFG information is not available). --- bolt/include/bolt/Core/MCInstUtils.h | 172 + bolt/include/bolt/Passes/PAuthGadgetScanner.h | 180 +- bolt/lib/Core/CMakeLists.txt | 1 + bolt/lib/Core/MCInstUtils.cpp | 57 ++ bolt/lib/Passes/PAuthGadgetScanner.cpp| 102 -- 5 files changed, 273 insertions(+), 239 deletions(-) create mode 100644 bolt/include/bolt/Core/MCInstUtils.h create mode 100644 bolt/lib/Core/MCInstUtils.cpp diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h new file mode 100644 index 0..a3912a8fb265a --- /dev/null +++ b/bolt/include/bolt/Core/MCInstUtils.h @@ -0,0 +1,172 @@ +//===- bolt/Core/MCInstUtils.h --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef BOLT_CORE_MCINSTUTILS_H +#define BOLT_CORE_MCINSTUTILS_H + +#include "bolt/Core/BinaryBasicBlock.h" + +#include +#include +#include + +namespace llvm { +namespace bolt { + +class BinaryFunction; + +/// MCInstReference represents a reference to a constant MCInst as stored either +/// in a BinaryFunction (i.e. before a CFG is created), or in a BinaryBasicBlock +/// (after a CFG is created). +class MCInstReference { + using nocfg_const_iterator = std::map::const_iterator; + + // Two cases are possible: + // * functions with CFG reconstructed - a function stores a collection of + // basic blocks, each basic block stores a contiguous vector of MCInst + // * functions without CFG - there are no basic blocks created, + // the instructions are directly stored in std::map in BinaryFunction + // + // In both cases, the direct parent of MCInst is stored together with an + // iterator pointing to the instruction. + + // Helper struct: CFG is available, the direct parent is a basic block, + // iterator's type is `MCInst *`. + struct RefInBB { +RefInBB(const BinaryBasicBlock *BB, const MCInst *Inst) +: BB(BB), It(Inst) {} +RefInBB(const RefInBB &Other) = default; +RefInBB &operator=(const RefInBB &Other) = default; + +const BinaryBasicBlock *BB; +BinaryBasicBlock::const_iterator It; + +bool operator<(const RefInBB &Other) const { + if (BB != Other.BB) +return std::less{}(BB, Other.BB); + return It < Other.It; +} + +bool operator==(const RefInBB &Other) const { + return BB == Other.BB && It == Other.It; +} + }; + + // Helper struct: CFG is *not* available, the direct parent is a function, + // iterator's type is std::map::iterator (the mapped value + // is an instruction's offset). + struct RefInBF { +RefInBF(const BinaryFunction *BF, nocfg_const_iterator It) +: BF(BF), It(It) {} +RefInBF(const RefInBF &Other) = default; +RefInBF &operator=(const RefInBF &Other) = default; + +const BinaryFunction *BF; +nocfg_const_iterator It; + +bool operator<(const RefInBF &Other) const { + if (BF != Other.BF) +return std::less{}(BF, Other.BF); + return It->first < Other.It->first; +} + +bool operator==(const RefInBF &Other) const { + return BF == Other.BF && It->first == Other.It->first; +} + }; + + std::variant Reference; + + // Utility methods to be used like this: + // + // if (auto *Ref = tryGetRefInBB()) + // return Ref->doSomething(...); + // return getRefInBF().doSomethingElse(...); + const RefInBB *tryGetRefInBB() const { +assert(std::get_if(&Reference) || + std::get_if(&Reference)); +return std::get_if(&Reference); + } + const RefInBF &getRefInBF() const { +assert(std::get_if(&Reference)); +return *std::get_if(&Reference); + } + +public: + /// Constructs an empty reference. + MCInstReference() : Reference(RefInBB(nullptr, nullptr)) {} + /// Constructs a reference to the instruction inside the basic block. + MCInstReference(const BinaryBasicBlock *BB, const MCInst *Inst) + : Reference(RefInBB(BB, Inst)) { +assert(BB && Inst && "Neither BB nor Inst should be nullptr"); + } + /// Constructs a reference to the instruction inside the basic block. + MCInstReference(const BinaryBasicBlock *BB, uns
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: account for BRK when searching for auth oracles (PR #137975)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/137975 >From 5b6f6d0e0ab5e9353c7082f44d4107e21dc84cf5 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Wed, 30 Apr 2025 16:08:10 +0300 Subject: [PATCH] [BOLT] Gadget scanner: account for BRK when searching for auth oracles An authenticated pointer can be explicitly checked by the compiler via a sequence of instructions that executes BRK on failure. It is important to recognize such BRK instruction as checking every register (as it is expected to immediately trigger an abnormal program termination) to prevent false positive reports about authentication oracles: autia x2, x3 autia x0, x1 ; neither x0 nor x2 are checked at this point eor x16, x0, x0, lsl #1 tbz x16, #62, on_success ; marks x0 as checked ; end of BB: for x2 to be checked here, it must be checked in both ; successor basic blocks on_failure: brk 0xc470 on_success: ; x2 is checked ldr x1, [x2] ; marks x2 as checked --- bolt/include/bolt/Core/MCPlusBuilder.h| 14 ++ bolt/lib/Passes/PAuthGadgetScanner.cpp| 13 +- .../Target/AArch64/AArch64MCPlusBuilder.cpp | 24 -- .../AArch64/gs-pauth-address-checks.s | 44 +-- .../AArch64/gs-pauth-authentication-oracles.s | 9 ++-- .../AArch64/gs-pauth-signing-oracles.s| 6 +-- 6 files changed, 75 insertions(+), 35 deletions(-) diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 6d3aa4f5f0feb..87de6754017db 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -706,6 +706,20 @@ class MCPlusBuilder { return false; } + /// Returns true if Inst is a trap instruction. + /// + /// Tests if Inst is an instruction that immediately causes an abnormal + /// program termination, for example when a security violation is detected + /// by a compiler-inserted check. + /// + /// @note An implementation of this method should likely return false for + /// calls to library functions like abort(), as it is possible that the + /// execution state is partially attacker-controlled at this point. + virtual bool isTrap(const MCInst &Inst) const { +llvm_unreachable("not implemented"); +return false; + } + virtual bool isBreakpoint(const MCInst &Inst) const { llvm_unreachable("not implemented"); return false; diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index dfb71575b2b39..835ee26aaf08a 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -1028,6 +1028,15 @@ class DstSafetyAnalysis { dbgs() << ")\n"; }); +// If this instruction terminates the program immediately, no +// authentication oracles are possible past this point. +if (BC.MIB->isTrap(Point)) { + LLVM_DEBUG({ traceInst(BC, "Trap instruction found", Point); }); + DstState Next(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters()); + Next.CannotEscapeUnchecked.set(); + return Next; +} + // If this instruction is reachable by the analysis, a non-empty state will // be propagated to it sooner or later. Until then, skip computeNext(). if (Cur.empty()) { @@ -1133,8 +1142,8 @@ class DataflowDstSafetyAnalysis // // A basic block without any successors, on the other hand, can be // pessimistically initialized to everything-is-unsafe: this will naturally -// handle both return and tail call instructions and is harmless for -// internal indirect branch instructions (such as computed gotos). +// handle return, trap and tail call instructions. At the same time, it is +// harmless for internal indirect branch instructions, like computed gotos. if (BB.succ_empty()) return createUnsafeState(); diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp index f3c29e6ee43b9..4d11c5b206eab 100644 --- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp +++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp @@ -386,10 +386,9 @@ class AArch64MCPlusBuilder : public MCPlusBuilder { // the list of successors of this basic block as appropriate. // Any of the above code sequences assume the fall-through basic block -// is a dead-end BRK instruction (any immediate operand is accepted). +// is a dead-end trap instruction. const BinaryBasicBlock *BreakBB = BB.getFallthrough(); -if (!BreakBB || BreakBB->empty() || -BreakBB->front().getOpcode() != AArch64::BRK) +if (!BreakBB || BreakBB->empty() || !isTrap(BreakBB->front())) return std::nullopt; // Iterate over the instructions of BB in reverse order, matching opcodes @@ -1745,6 +1744,25 @@ class AArch64MCPlusBuilder : public MCPlusBuilder { Inst.addOperand(MCOperand::createImm(0)); }
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/137224 >From d20efbedd0be34942e8e28cc91eefeb28d1b8108 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Tue, 22 Apr 2025 21:43:14 +0300 Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail call Implement the detection of tail calls performed with untrusted link register, which violates the assumption made on entry to every function. Unlike other pauth gadgets, this one involves some amount of guessing which branch instructions should be checked as tail calls. --- bolt/lib/Passes/PAuthGadgetScanner.cpp| 94 ++- .../AArch64/gs-pacret-autiasp.s | 31 +- .../AArch64/gs-pauth-debug-output.s | 30 +- .../AArch64/gs-pauth-tail-calls.s | 597 ++ 4 files changed, 706 insertions(+), 46 deletions(-) create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index 7a5d47a3ff812..dfb71575b2b39 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -701,8 +701,9 @@ class DataflowSrcSafetyAnalysis // // Then, a function can be split into a number of disjoint contiguous sequences // of instructions without labels in between. These sequences can be processed -// the same way basic blocks are processed by data-flow analysis, assuming -// pessimistically that all registers are unsafe at the start of each sequence. +// the same way basic blocks are processed by data-flow analysis, with the same +// pessimistic estimation of the initial state at the start of each sequence +// (except the first instruction of the function). class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis { BinaryFunction &BF; MCPlusBuilder::AllocatorIdTy AllocId; @@ -713,12 +714,6 @@ class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis { BC.MIB->removeAnnotation(I.second, StateAnnotationIndex); } - /// Creates a state with all registers marked unsafe (not to be confused - /// with empty state). - SrcState createUnsafeState() const { -return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters()); - } - public: CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocId, @@ -729,6 +724,7 @@ class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis { } void run() override { +const SrcState DefaultState = computePessimisticState(BF); SrcState S = createEntryState(); for (auto &I : BF.instrs()) { MCInst &Inst = I.second; @@ -743,7 +739,7 @@ class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis { LLVM_DEBUG({ traceInst(BC, "Due to label, resetting the state before", Inst); }); -S = createUnsafeState(); +S = DefaultState; } // Check if we need to remove an old annotation (this is the case if @@ -1288,6 +1284,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const MCInstReference &Inst, return make_gadget_report(RetKind, Inst, *RetReg); } +/// While BOLT already marks some of the branch instructions as tail calls, +/// this function tries to improve the coverage by including less obvious cases +/// when it is possible to do without introducing too many false positives. +static bool shouldAnalyzeTailCallInst(const BinaryContext &BC, + const BinaryFunction &BF, + const MCInstReference &Inst) { + // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ() + // (such as isBranch at the time of writing this comment), some don't (such + // as isCall). For that reason, call MCInstrDesc's methods explicitly when + // it is important. + const MCInstrDesc &Desc = + BC.MII->get(static_cast(Inst).getOpcode()); + // Tail call should be a branch (but not necessarily an indirect one). + if (!Desc.isBranch()) +return false; + + // Always analyze the branches already marked as tail calls by BOLT. + if (BC.MIB->isTailCall(Inst)) +return true; + + // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the + // below is a simplified condition from BinaryContext::printInstruction. + bool IsUnknownControlFlow = + BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst); + + if (BF.hasCFG() && IsUnknownControlFlow) +return true; + + return false; +} + +static std::optional> +shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF, + const MCInstReference &Inst, const SrcState &S) { + static const GadgetKind UntrustedLRKind( + "untrusted link register found before tail call"); + + if (!shouldAnalyzeTailCallInst(BC, BF, Inst)) +return std::nullopt; + + // Not only the set of registers returned by getTrustedLiveInRegs() can be + /
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: optionally assume auth traps on failure (PR #139778)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/139778 >From 8f997500a97b5ad7acc9dea416cca6b2f7bb615d Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Tue, 13 May 2025 19:50:41 +0300 Subject: [PATCH] [BOLT] Gadget scanner: optionally assume auth traps on failure On AArch64 it is possible for an auth instruction to either return an invalid address value on failure (without FEAT_FPAC) or generate an error (with FEAT_FPAC). It thus may be possible to never emit explicit pointer checks, if the target CPU is known to support FEAT_FPAC. This commit implements an --auth-traps-on-failure command line option, which essentially makes "safe-to-dereference" and "trusted" register properties identical and disables scanning for authentication oracles completely. --- bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++ .../binary-analysis/AArch64/cmdline-args.test | 1 + .../AArch64/gs-pauth-authentication-oracles.s | 6 +- .../binary-analysis/AArch64/gs-pauth-calls.s | 5 +- .../AArch64/gs-pauth-debug-output.s | 177 ++--- .../AArch64/gs-pauth-jump-table.s | 6 +- .../AArch64/gs-pauth-signing-oracles.s| 54 ++--- .../AArch64/gs-pauth-tail-calls.s | 184 +- 8 files changed, 318 insertions(+), 227 deletions(-) diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index bda971bcd9343..cfe86d32df798 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -14,6 +14,7 @@ #include "bolt/Passes/PAuthGadgetScanner.h" #include "bolt/Core/ParallelUtilities.h" #include "bolt/Passes/DataflowAnalysis.h" +#include "bolt/Utils/CommandLineOpts.h" #include "llvm/ADT/STLExtras.h" #include "llvm/ADT/SmallSet.h" #include "llvm/MC/MCInst.h" @@ -26,6 +27,11 @@ namespace llvm { namespace bolt { namespace PAuthGadgetScanner { +static cl::opt AuthTrapsOnFailure( +"auth-traps-on-failure", +cl::desc("Assume authentication instructions always trap on failure"), +cl::cat(opts::BinaryAnalysisCategory)); + [[maybe_unused]] static void traceInst(const BinaryContext &BC, StringRef Label, const MCInst &MI) { dbgs() << " " << Label << ": "; @@ -365,6 +371,34 @@ class SrcSafetyAnalysis { return Clobbered; } + std::optional getRegMadeTrustedByChecking(const MCInst &Inst, + SrcState Cur) const { +// This functions cannot return multiple registers. This is never the case +// on AArch64. +std::optional RegCheckedByInst = +BC.MIB->getAuthCheckedReg(Inst, /*MayOverwrite=*/false); +if (RegCheckedByInst && Cur.SafeToDerefRegs[*RegCheckedByInst]) + return *RegCheckedByInst; + +auto It = CheckerSequenceInfo.find(&Inst); +if (It == CheckerSequenceInfo.end()) + return std::nullopt; + +MCPhysReg RegCheckedBySequence = It->second.first; +const MCInst *FirstCheckerInst = It->second.second; + +// FirstCheckerInst should belong to the same basic block (see the +// assertion in DataflowSrcSafetyAnalysis::run()), meaning it was +// deterministically processed a few steps before this instruction. +const SrcState &StateBeforeChecker = getStateBefore(*FirstCheckerInst); + +// The sequence checks the register, but it should be authenticated before. +if (!StateBeforeChecker.SafeToDerefRegs[RegCheckedBySequence]) + return std::nullopt; + +return RegCheckedBySequence; + } + // Returns all registers that can be treated as if they are written by an // authentication instruction. SmallVector getRegsMadeSafeToDeref(const MCInst &Point, @@ -387,18 +421,38 @@ class SrcSafetyAnalysis { Regs.push_back(DstAndSrc->first); } +// Make sure explicit checker sequence keeps register safe-to-dereference +// when the register would be clobbered according to the regular rules: +// +//; LR is safe to dereference here +//mov x16, x30 ; start of the sequence, LR is s-t-d right before +//xpaclri ; clobbers LR, LR is not safe anymore +//cmp x30, x16 +//b.eq 1f; end of the sequence: LR is marked as trusted +//brk 0x1234 +// 1: +//; at this point LR would be marked as trusted, +//; but not safe-to-dereference +// +// or even just +// +//; X1 is safe to dereference here +//ldr x0, [x1, #8]! +//; X1 is trusted here, but it was clobbered due to address write-back +if (auto CheckedReg = getRegMadeTrustedByChecking(Point, Cur)) + Regs.push_back(*CheckedReg); + return Regs; } // Returns all registers made trusted by this instruction. SmallVector getRegsMadeTrusted(const MCInst &Point, const SrcState &Cur) const { +assert(!AuthTrapsOnFailure &&
[llvm-branch-commits] [llvm] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) (PR #138883)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138883 >From 4bd8dd9334c8ad810e7fc593331cd6d4e2fdbbad Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Wed, 7 May 2025 16:42:00 +0300 Subject: [PATCH] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) Introduce matchInst helper function to capture and/or match the operands of MCInst. Unlike the existing `MCPlusBuilder::MCInstMatcher` machinery, matchInst is intended for the use cases when precise control over the instruction order is required. For example, when validating PtrAuth hardening, all registers are usually considered unsafe after a function call, even though callee-saved registers should preserve their old values *under normal operation*. --- bolt/include/bolt/Core/MCInstUtils.h | 128 ++ .../Target/AArch64/AArch64MCPlusBuilder.cpp | 90 +--- 2 files changed, 162 insertions(+), 56 deletions(-) diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h index a3912a8fb265a..b495eb8ef5eec 100644 --- a/bolt/include/bolt/Core/MCInstUtils.h +++ b/bolt/include/bolt/Core/MCInstUtils.h @@ -166,6 +166,134 @@ static inline raw_ostream &operator<<(raw_ostream &OS, return Ref.print(OS); } +/// Instruction-matching helpers operating on a single instruction at a time. +/// +/// Unlike MCPlusBuilder::MCInstMatcher, this matchInst() function focuses on +/// the cases where a precise control over the instruction order is important: +/// +/// // Bring the short names into the local scope: +/// using namespace MCInstMatcher; +/// // Declare the registers to capture: +/// Reg Xn, Xm; +/// // Capture the 0th and 1st operands, match the 2nd operand against the +/// // just captured Xm register, match the 3rd operand against literal 0: +/// if (!matchInst(MaybeAdd, AArch64::ADDXrs, Xm, Xn, Xm, Imm(0)) +/// return AArch64::NoRegister; +/// // Match the 0th operand against Xm: +/// if (!matchInst(MaybeBr, AArch64::BR, Xm)) +/// return AArch64::NoRegister; +/// // Return the matched register: +/// return Xm.get(); +namespace MCInstMatcher { + +// The base class to match an operand of type T. +// +// The subclasses of OpMatcher are intended to be allocated on the stack and +// to only be used by passing them to matchInst() and by calling their get() +// function, thus the peculiar `mutable` specifiers: to make the calling code +// compact and readable, the templated matchInst() function has to accept both +// long-lived Imm/Reg wrappers declared as local variables (intended to capture +// the first operand's value and match the subsequent operands, whether inside +// a single instruction or across multiple instructions), as well as temporary +// wrappers around literal values to match, f.e. Imm(42) or Reg(AArch64::XZR). +template class OpMatcher { + mutable std::optional Value; + mutable std::optional SavedValue; + + // Remember/restore the last Value - to be called by matchInst. + void remember() const { SavedValue = Value; } + void restore() const { Value = SavedValue; } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +protected: + OpMatcher(std::optional ValueToMatch) : Value(ValueToMatch) {} + + bool matchValue(T OpValue) const { +// Check that OpValue does not contradict the existing Value. +bool MatchResult = !Value || *Value == OpValue; +// If MatchResult is false, all matchers will be reset before returning from +// matchInst, including this one, thus no need to assign conditionally. +Value = OpValue; + +return MatchResult; + } + +public: + /// Returns the captured value. + T get() const { +assert(Value.has_value()); +return *Value; + } +}; + +class Reg : public OpMatcher { + bool matches(const MCOperand &Op) const { +if (!Op.isReg()) + return false; + +return matchValue(Op.getReg()); + } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +public: + Reg(std::optional RegToMatch = std::nullopt) + : OpMatcher(RegToMatch) {} +}; + +class Imm : public OpMatcher { + bool matches(const MCOperand &Op) const { +if (!Op.isImm()) + return false; + +return matchValue(Op.getImm()); + } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +public: + Imm(std::optional ImmToMatch = std::nullopt) + : OpMatcher(ImmToMatch) {} +}; + +/// Tries to match Inst and updates Ops on success. +/// +/// If Inst has the specified Opcode and its operand list prefix matches Ops, +/// this function returns true and updates Ops, otherwise false is returned and +/// values of Ops are kept as before matchInst was called. +/// +/// Please note that while Ops are technically passed by a const reference to +/// make invocations like `matchInst(MI, Opcode, Imm(42))` possible, all their +/// fields are marked mut
[llvm-branch-commits] [llvm] [llvm][OpenMP] Add "SourceLanguages" property to Directive (PR #139960)
https://github.com/kparzysz created https://github.com/llvm/llvm-project/pull/139960 The official languages that OpenMP recognizes are C/C++ and Fortran. Some OpenMP directives are language-specific, some are C/C++-only, some are Fortran-only. Add a property to the TableGen definition of Directive that will be the list of languages that allow the directive. The TableGen backend will then generate a bitmask-like enumeration SourceLanguages, and a function SourceLanguages getDirectiveLanguages(Directive D); Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support â https://githubstatus.com";>GitHub Status â https://twitter.com/githubstatus";>@githubstatus ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [llvm][OpenMP] Add "SourceLanguages" property to Directive (PR #139960)
llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Krzysztof Parzyszek (kparzysz) Changes The official languages that OpenMP recognizes are C/C++ and Fortran. Some OpenMP directives are language-specific, some are C/C++-only, some are Fortran-only. Add a property to the TableGen definition of Directive that will be the list of languages that allow the directive. The TableGen backend will then generate a bitmask-like enumeration SourceLanguages, and a function SourceLanguages getDirectiveLanguages(Directive D); --- Patch is 24.68 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139960.diff 6 Files Affected: - (modified) llvm/include/llvm/Frontend/Directive/DirectiveBase.td (+12) - (modified) llvm/include/llvm/Frontend/OpenMP/OMP.td (+63-19) - (modified) llvm/include/llvm/TableGen/DirectiveEmitter.h (+12-2) - (modified) llvm/test/TableGen/directive1.td (+17) - (modified) llvm/test/TableGen/directive2.td (+17) - (modified) llvm/utils/TableGen/Basic/DirectiveEmitter.cpp (+77) ``diff diff --git a/llvm/include/llvm/Frontend/Directive/DirectiveBase.td b/llvm/include/llvm/Frontend/Directive/DirectiveBase.td index 4faea18324cb7..3e2744dea8d14 100644 --- a/llvm/include/llvm/Frontend/Directive/DirectiveBase.td +++ b/llvm/include/llvm/Frontend/Directive/DirectiveBase.td @@ -172,6 +172,15 @@ def CA_Meta: Category<"Meta"> {} def CA_Subsidiary: Category<"Subsidiary"> {} def CA_Utility: Category<"Utility"> {} +class SourceLanguage { + string name = n; // Name of the enum value in enum class Association. +} + +// The C languages also implies C++ until there is a reason to add C++ +// separately. +def L_C : SourceLanguage<"C"> {} +def L_Fortran : SourceLanguage<"Fortran"> {} + // Information about a specific directive. class Directive { // Name of the directive. Can be composite directive sepearted by whitespace. @@ -205,4 +214,7 @@ class Directive { // The category of the directive. Category category = ?; + + // The languages that allow this directive. Default: all languages. + list languages = [L_C, L_Fortran]; } diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 194b1e657c493..0af4b436649a3 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -573,6 +573,7 @@ def OMP_Allocators : Directive<"allocators"> { ]; let association = AS_Block; let category = CA_Executable; + let languages = [L_Fortran]; } def OMP_Assumes : Directive<"assumes"> { let association = AS_None; @@ -586,10 +587,6 @@ def OMP_Assumes : Directive<"assumes"> { VersionedClause, ]; } -def OMP_EndAssumes : Directive<"end assumes"> { - let association = AS_Delimited; - let category = OMP_Assumes.category; -} def OMP_Assume : Directive<"assume"> { let association = AS_Block; let category = CA_Informational; @@ -637,6 +634,12 @@ def OMP_BeginAssumes : Directive<"begin assumes"> { VersionedClause, VersionedClause, ]; + let languages = [L_C]; +} +def OMP_EndAssumes : Directive<"end assumes"> { + let association = AS_Delimited; + let category = OMP_BeginAssumes.category; + let languages = OMP_BeginAssumes.languages; } def OMP_BeginDeclareTarget : Directive<"begin declare target"> { let allowedClauses = [ @@ -647,10 +650,22 @@ def OMP_BeginDeclareTarget : Directive<"begin declare target"> { ]; let association = AS_Delimited; let category = CA_Declarative; + let languages = [L_C]; +} +def OMP_EndDeclareTarget : Directive<"end declare target"> { + let association = AS_Delimited; + let category = OMP_BeginDeclareTarget.category; + let languages = OMP_BeginDeclareTarget.languages; } def OMP_BeginDeclareVariant : Directive<"begin declare variant"> { let association = AS_Delimited; let category = CA_Declarative; + let languages = [L_C]; +} +def OMP_EndDeclareVariant : Directive<"end declare variant"> { + let association = AS_Delimited; + let category = OMP_BeginDeclareVariant.category; + let languages = OMP_BeginDeclareVariant.languages; } def OMP_Cancel : Directive<"cancel"> { let allowedOnceClauses = [ @@ -717,10 +732,6 @@ def OMP_DeclareTarget : Directive<"declare target"> { let association = AS_None; let category = CA_Declarative; } -def OMP_EndDeclareTarget : Directive<"end declare target"> { - let association = AS_Delimited; - let category = OMP_DeclareTarget.category; -} def OMP_DeclareVariant : Directive<"declare variant"> { let allowedClauses = [ VersionedClause, @@ -731,10 +742,7 @@ def OMP_DeclareVariant : Directive<"declare variant"> { ]; let association = AS_Declaration; let category = CA_Declarative; -} -def OMP_EndDeclareVariant : Directive<"end declare variant"> { - let association = AS_Delimited; - let category = OMP_DeclareVariant.category; + let languages = [L_C]; } def OMP_Depobj : Directive<"depobj"> { let allowedClauses =
[llvm-branch-commits] [llvm] [llvm][OpenMP] Add "SourceLanguages" property to Directive (PR #139960)
llvmbot wrote: @llvm/pr-subscribers-tablegen Author: Krzysztof Parzyszek (kparzysz) Changes The official languages that OpenMP recognizes are C/C++ and Fortran. Some OpenMP directives are language-specific, some are C/C++-only, some are Fortran-only. Add a property to the TableGen definition of Directive that will be the list of languages that allow the directive. The TableGen backend will then generate a bitmask-like enumeration SourceLanguages, and a function SourceLanguages getDirectiveLanguages(Directive D); --- Patch is 24.68 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139960.diff 6 Files Affected: - (modified) llvm/include/llvm/Frontend/Directive/DirectiveBase.td (+12) - (modified) llvm/include/llvm/Frontend/OpenMP/OMP.td (+63-19) - (modified) llvm/include/llvm/TableGen/DirectiveEmitter.h (+12-2) - (modified) llvm/test/TableGen/directive1.td (+17) - (modified) llvm/test/TableGen/directive2.td (+17) - (modified) llvm/utils/TableGen/Basic/DirectiveEmitter.cpp (+77) ``diff diff --git a/llvm/include/llvm/Frontend/Directive/DirectiveBase.td b/llvm/include/llvm/Frontend/Directive/DirectiveBase.td index 4faea18324cb7..3e2744dea8d14 100644 --- a/llvm/include/llvm/Frontend/Directive/DirectiveBase.td +++ b/llvm/include/llvm/Frontend/Directive/DirectiveBase.td @@ -172,6 +172,15 @@ def CA_Meta: Category<"Meta"> {} def CA_Subsidiary: Category<"Subsidiary"> {} def CA_Utility: Category<"Utility"> {} +class SourceLanguage { + string name = n; // Name of the enum value in enum class Association. +} + +// The C languages also implies C++ until there is a reason to add C++ +// separately. +def L_C : SourceLanguage<"C"> {} +def L_Fortran : SourceLanguage<"Fortran"> {} + // Information about a specific directive. class Directive { // Name of the directive. Can be composite directive sepearted by whitespace. @@ -205,4 +214,7 @@ class Directive { // The category of the directive. Category category = ?; + + // The languages that allow this directive. Default: all languages. + list languages = [L_C, L_Fortran]; } diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 194b1e657c493..0af4b436649a3 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -573,6 +573,7 @@ def OMP_Allocators : Directive<"allocators"> { ]; let association = AS_Block; let category = CA_Executable; + let languages = [L_Fortran]; } def OMP_Assumes : Directive<"assumes"> { let association = AS_None; @@ -586,10 +587,6 @@ def OMP_Assumes : Directive<"assumes"> { VersionedClause, ]; } -def OMP_EndAssumes : Directive<"end assumes"> { - let association = AS_Delimited; - let category = OMP_Assumes.category; -} def OMP_Assume : Directive<"assume"> { let association = AS_Block; let category = CA_Informational; @@ -637,6 +634,12 @@ def OMP_BeginAssumes : Directive<"begin assumes"> { VersionedClause, VersionedClause, ]; + let languages = [L_C]; +} +def OMP_EndAssumes : Directive<"end assumes"> { + let association = AS_Delimited; + let category = OMP_BeginAssumes.category; + let languages = OMP_BeginAssumes.languages; } def OMP_BeginDeclareTarget : Directive<"begin declare target"> { let allowedClauses = [ @@ -647,10 +650,22 @@ def OMP_BeginDeclareTarget : Directive<"begin declare target"> { ]; let association = AS_Delimited; let category = CA_Declarative; + let languages = [L_C]; +} +def OMP_EndDeclareTarget : Directive<"end declare target"> { + let association = AS_Delimited; + let category = OMP_BeginDeclareTarget.category; + let languages = OMP_BeginDeclareTarget.languages; } def OMP_BeginDeclareVariant : Directive<"begin declare variant"> { let association = AS_Delimited; let category = CA_Declarative; + let languages = [L_C]; +} +def OMP_EndDeclareVariant : Directive<"end declare variant"> { + let association = AS_Delimited; + let category = OMP_BeginDeclareVariant.category; + let languages = OMP_BeginDeclareVariant.languages; } def OMP_Cancel : Directive<"cancel"> { let allowedOnceClauses = [ @@ -717,10 +732,6 @@ def OMP_DeclareTarget : Directive<"declare target"> { let association = AS_None; let category = CA_Declarative; } -def OMP_EndDeclareTarget : Directive<"end declare target"> { - let association = AS_Delimited; - let category = OMP_DeclareTarget.category; -} def OMP_DeclareVariant : Directive<"declare variant"> { let allowedClauses = [ VersionedClause, @@ -731,10 +742,7 @@ def OMP_DeclareVariant : Directive<"declare variant"> { ]; let association = AS_Declaration; let category = CA_Declarative; -} -def OMP_EndDeclareVariant : Directive<"end declare variant"> { - let association = AS_Delimited; - let category = OMP_DeclareVariant.category; + let languages = [L_C]; } def OMP_Depobj : Directive<"depobj"> { let allowedClauses = [ @
[llvm-branch-commits] [clang] [clang][OpenMP] Improve handling of non-C/C++ directives (PR #139961)
https://github.com/kparzysz created https://github.com/llvm/llvm-project/pull/139961 The PR139793 added handling of the Fortran-only "workshare" directive, however there are more such directives, e.g. "allocators". Use the genDirectiveLanguages function to detect non-C/C++ directives instead of enumerating them. Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support â https://githubstatus.com";>GitHub Status â https://twitter.com/githubstatus";>@githubstatus ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang][OpenMP] Improve handling of non-C/C++ directives (PR #139961)
llvmbot wrote: @llvm/pr-subscribers-clang Author: Krzysztof Parzyszek (kparzysz) Changes The PR139793 added handling of the Fortran-only "workshare" directive, however there are more such directives, e.g. "allocators". Use the genDirectiveLanguages function to detect non-C/C++ directives instead of enumerating them. --- Full diff: https://github.com/llvm/llvm-project/pull/139961.diff 3 Files Affected: - (modified) clang/lib/Parse/ParseOpenMP.cpp (+2-3) - (added) clang/test/OpenMP/openmp_non_c_directives.c (+12) - (removed) clang/test/OpenMP/openmp_workshare.c (-8) ``diff diff --git a/clang/lib/Parse/ParseOpenMP.cpp b/clang/lib/Parse/ParseOpenMP.cpp index c409409602e75..a53fd6b2964cb 100644 --- a/clang/lib/Parse/ParseOpenMP.cpp +++ b/clang/lib/Parse/ParseOpenMP.cpp @@ -2613,9 +2613,8 @@ StmtResult Parser::ParseOpenMPDeclarativeOrExecutableDirective( Diag(Tok, diag::err_omp_unknown_directive); return StmtError(); } - if (DKind == OMPD_workshare) { -// "workshare" is an executable, Fortran-only directive. Treat it -// as unknown. + if (!(getDirectiveLanguages(DKind) & SourceLanguage::C)) { +// Treat directives that are not allowed in C/C++ as unknown. DKind = OMPD_unknown; } diff --git a/clang/test/OpenMP/openmp_non_c_directives.c b/clang/test/OpenMP/openmp_non_c_directives.c new file mode 100644 index 0..844d7dad551bc --- /dev/null +++ b/clang/test/OpenMP/openmp_non_c_directives.c @@ -0,0 +1,12 @@ +// RUN: %clang_cc1 -verify -fopenmp -ferror-limit 100 -o - %s + +// Test the reaction to some Fortran-only directives. + +void foo() { +#pragma omp allocators // expected-error {{expected an OpenMP directive}} +#pragma omp do // expected-error {{expected an OpenMP directive}} +#pragma omp end workshare // expected-error {{expected an OpenMP directive}} +#pragma omp parallel workshare // expected-warning {{extra tokens at the end of '#pragma omp parallel' are ignored}} +#pragma omp workshare // expected-error {{expected an OpenMP directive}} +} + diff --git a/clang/test/OpenMP/openmp_workshare.c b/clang/test/OpenMP/openmp_workshare.c deleted file mode 100644 index 0302eb19f9ef4..0 --- a/clang/test/OpenMP/openmp_workshare.c +++ /dev/null @@ -1,8 +0,0 @@ -// RUN: %clang_cc1 -verify -fopenmp -ferror-limit 100 -o - %s - -// Workshare is a Fortran-only directive. - -void foo() { -#pragma omp workshare // expected-error {{expected an OpenMP directive}} -} - `` https://github.com/llvm/llvm-project/pull/139961 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [HLSL] Implicit resource binding for cbuffers (PR #139022)
https://github.com/hekota updated https://github.com/llvm/llvm-project/pull/139022 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support â https://githubstatus.com";>GitHub Status â https://twitter.com/githubstatus";>@githubstatus ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang][analyzer] Handle CXXParenInitListExpr alongside InitListExpr (PR #139909)
https://github.com/Xazax-hun approved this pull request. https://github.com/llvm/llvm-project/pull/139909 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [llvm][OpenMP] Add "SourceLanguages" property to Directive (PR #139960)
kparzysz wrote: Previous PR: https://github.com/llvm/llvm-project/pull/139958 Next PR: https://github.com/llvm/llvm-project/pull/139961 https://github.com/llvm/llvm-project/pull/139960 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang][OpenMP] Improve handling of non-C/C++ directives (PR #139961)
kparzysz wrote: Previous PR: https://github.com/llvm/llvm-project/pull/139960 https://github.com/llvm/llvm-project/pull/139961 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang][OpenMP] Improve handling of non-C/C++ directives (PR #139961)
https://github.com/alexey-bataev approved this pull request. https://github.com/llvm/llvm-project/pull/139961 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [DirectX] Adding support for static samples is yaml2obj/obj2yaml (PR #139963)
llvmbot wrote: @llvm/pr-subscribers-llvm-binary-utilities @llvm/pr-subscribers-backend-directx Author: None (joaosaffran) Changes - Adds support for static samplers ins dxcontainer binary format. - Adds writing logic to mcdxbc - adds reading logic to Object - adds tests Closes: [126636](https://github.com/llvm/llvm-project/issues/126636) --- Patch is 23.58 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139963.diff 12 Files Affected: - (modified) llvm/include/llvm/BinaryFormat/DXContainer.h (+51) - (modified) llvm/include/llvm/BinaryFormat/DXContainerConstants.def (+73) - (modified) llvm/include/llvm/MC/DXContainerRootSignature.h (+1) - (modified) llvm/include/llvm/Object/DXContainer.h (+5) - (modified) llvm/include/llvm/ObjectYAML/DXContainerYAML.h (+26) - (modified) llvm/lib/MC/DXContainerRootSignature.cpp (+19-1) - (modified) llvm/lib/Object/DXContainer.cpp (+5) - (modified) llvm/lib/ObjectYAML/DXContainerEmitter.cpp (+19) - (modified) llvm/lib/ObjectYAML/DXContainerYAML.cpp (+38) - (added) llvm/test/ObjectYAML/DXContainer/RootSignature-StaticSamplers.yaml (+63) - (modified) llvm/unittests/Object/DXContainerTest.cpp (+46) - (modified) llvm/unittests/ObjectYAML/DXContainerYAMLTest.cpp (+59) ``diff diff --git a/llvm/include/llvm/BinaryFormat/DXContainer.h b/llvm/include/llvm/BinaryFormat/DXContainer.h index c18d1e3d7f024..b9e9821a3f4a6 100644 --- a/llvm/include/llvm/BinaryFormat/DXContainer.h +++ b/llvm/include/llvm/BinaryFormat/DXContainer.h @@ -18,6 +18,7 @@ #include "llvm/Support/SwapByteOrder.h" #include "llvm/TargetParser/Triple.h" +#include #include namespace llvm { @@ -600,6 +601,25 @@ inline bool isValidShaderVisibility(uint32_t V) { return false; } +#define STATIC_SAMPLER_FILTER(Val, Enum) Enum = Val, +enum class StaticSamplerFilter : uint32_t { +#include "DXContainerConstants.def" +}; + +#define TEXTURE_ADDRESS_MODE(Val, Enum) Enum = Val, +enum class TextureAddressMode : uint32_t { +#include "DXContainerConstants.def" +}; + +#define COMPARISON_FUNCTION(Val, Enum) Enum = Val, +enum class ComparisonFunction : uint32_t { +#include "DXContainerConstants.def" +}; + +#define STATIC_BORDER_COLOR(Val, Enum) Enum = Val, +enum class StaticBorderColor : uint32_t { +#include "DXContainerConstants.def" +}; namespace v1 { struct RootSignatureHeader { @@ -667,6 +687,37 @@ struct DescriptorRange { sys::swapByteOrder(OffsetInDescriptorsFromTableStart); } }; + +struct StaticSampler { + uint32_t Filter; + uint32_t AddressU; + uint32_t AddressV; + uint32_t AddressW; + float MipLODBias; + uint32_t MaxAnisotropy; + uint32_t ComparisonFunc; + uint32_t BorderColor; + float MinLOD; + float MaxLOD; + uint32_t ShaderRegister; + uint32_t RegisterSpace; + uint32_t ShaderVisibility; + void swapBytes() { +sys::swapByteOrder(Filter); +sys::swapByteOrder(AddressU); +sys::swapByteOrder(AddressV); +sys::swapByteOrder(AddressW); +sys::swapByteOrder(MipLODBias); +sys::swapByteOrder(MaxAnisotropy); +sys::swapByteOrder(ComparisonFunc); +sys::swapByteOrder(BorderColor); +sys::swapByteOrder(MinLOD); +sys::swapByteOrder(MaxLOD); +sys::swapByteOrder(ShaderRegister); +sys::swapByteOrder(RegisterSpace); +sys::swapByteOrder(ShaderVisibility); + }; +}; } // namespace v1 namespace v2 { diff --git a/llvm/include/llvm/BinaryFormat/DXContainerConstants.def b/llvm/include/llvm/BinaryFormat/DXContainerConstants.def index 5fe7e7c321a33..f25c464d5f9cc 100644 --- a/llvm/include/llvm/BinaryFormat/DXContainerConstants.def +++ b/llvm/include/llvm/BinaryFormat/DXContainerConstants.def @@ -129,6 +129,79 @@ SHADER_VISIBILITY(7, Mesh) #undef SHADER_VISIBILITY #endif // SHADER_VISIBILITY +#ifdef STATIC_SAMPLER_FILTER + +STATIC_SAMPLER_FILTER(0, MIN_MAG_MIP_POINT) +STATIC_SAMPLER_FILTER(0x1, MIN_MAG_POINT_MIP_LINEAR) +STATIC_SAMPLER_FILTER(0x4, MIN_POINT_MAG_LINEAR_MIP_POINT) +STATIC_SAMPLER_FILTER(0x5, MIN_POINT_MAG_MIP_LINEAR) +STATIC_SAMPLER_FILTER(0x10, MIN_LINEAR_MAG_MIP_POINT) +STATIC_SAMPLER_FILTER(0x11, MIN_LINEAR_MAG_POINT_MIP_LINEAR) +STATIC_SAMPLER_FILTER(0x14, MIN_MAG_LINEAR_MIP_POINT) +STATIC_SAMPLER_FILTER(0x15, MIN_MAG_MIP_LINEAR) +STATIC_SAMPLER_FILTER(0x55, ANISOTROPIC) +STATIC_SAMPLER_FILTER(0x80, COMPARISON_MIN_MAG_MIP_POINT) +STATIC_SAMPLER_FILTER(0x81, COMPARISON_MIN_MAG_POINT_MIP_LINEAR) +STATIC_SAMPLER_FILTER(0x84, COMPARISON_MIN_POINT_MAG_LINEAR_MIP_POINT) +STATIC_SAMPLER_FILTER(0x85, COMPARISON_MIN_POINT_MAG_MIP_LINEAR) +STATIC_SAMPLER_FILTER(0x90, COMPARISON_MIN_LINEAR_MAG_MIP_POINT) +STATIC_SAMPLER_FILTER(0x91, COMPARISON_MIN_LINEAR_MAG_POINT_MIP_LINEAR) +STATIC_SAMPLER_FILTER(0x94, COMPARISON_MIN_MAG_LINEAR_MIP_POINT) +STATIC_SAMPLER_FILTER(0x95, COMPARISON_MIN_MAG_MIP_LINEAR) +STATIC_SAMPLER_FILTER(0xd5, COMPARISON_ANISOTROPIC) +STATIC_SAMPLER_FILTER(0x100, MINIMUM_MIN_MAG_MIP_POINT) +STATIC_SAMPLER_FILTER(0x101, MINIMUM_MIN_MAG_POINT_MIP_LINEAR)
[llvm-branch-commits] [llvm] [DirectX] Adding support for static samples is yaml2obj/obj2yaml (PR #139963)
https://github.com/joaosaffran created https://github.com/llvm/llvm-project/pull/139963 - Adds support for static samplers ins dxcontainer binary format. - Adds writing logic to mcdxbc - adds reading logic to Object - adds tests Closes: [126636](https://github.com/llvm/llvm-project/issues/126636) Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support â https://githubstatus.com";>GitHub Status â https://twitter.com/githubstatus";>@githubstatus ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
@@ -0,0 +1,304 @@ +//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file implements lowering patterns from vector.contract to +// SVE I8MM operations. +// +//===--- + +#include "mlir/Dialect/Arith/IR/Arith.h" +#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" +#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" +#include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/Dialect/Utils/IndexingUtils.h" +#include "mlir/Dialect/Vector/IR/VectorOps.h" +#include "mlir/IR/AffineMap.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include "mlir/Dialect/UB/IR/UBOps.h" + +#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm" + +using namespace mlir; +using namespace mlir::arm_sve; + +namespace { +// Check if the given value is a result of the operation `T` (which must be +// sign- or zero- extend) from i8 to i32. Return the value before the extension. +template +inline std::enable_if_t<(std::is_base_of_v || + std::is_base_of_v), +std::optional> +extractExtOperand(Value v, Type i8Ty, Type i32Ty) { + auto extOp = dyn_cast_or_null(v.getDefiningOp()); + if (!extOp) +return {}; + + auto inOp = extOp.getIn(); + auto inTy = dyn_cast(inOp.getType()); + if (!inTy || inTy.getElementType() != i8Ty) +return {}; + + auto outTy = dyn_cast(extOp.getType()); + if (!outTy || outTy.getElementType() != i32Ty) +return {}; + + return inOp; +} + +// Designate the operation (resp. instruction) used to do sub-tile matrix +// multiplications. +enum class MMLA { + Signed, // smmla + Unsigned,// ummla + Mixed, // usmmla + MixedSwapped // usmmla with LHS and RHS swapped +}; + +// Create the matrix multply and accumulate operation according to `op`. +Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc, + mlir::VectorType accType, Value acc, Value lhs, Value rhs) { + switch (op) { + case MMLA::Signed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Unsigned: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Mixed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::MixedSwapped: +// The accumulator comes transposed and the result will be transposed +// later, so all we have to do here is swap the operands. +return rewriter.create(loc, accType, acc, rhs, lhs); + } +} + +class LowerContractionToSVEI8MMPattern momchil-velikov wrote: Done. https://github.com/llvm/llvm-project/pull/135636 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
@@ -0,0 +1,304 @@ +//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file implements lowering patterns from vector.contract to +// SVE I8MM operations. +// +//===--- + +#include "mlir/Dialect/Arith/IR/Arith.h" +#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" +#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" +#include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/Dialect/Utils/IndexingUtils.h" +#include "mlir/Dialect/Vector/IR/VectorOps.h" +#include "mlir/IR/AffineMap.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include "mlir/Dialect/UB/IR/UBOps.h" + +#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm" + +using namespace mlir; +using namespace mlir::arm_sve; + +namespace { +// Check if the given value is a result of the operation `T` (which must be +// sign- or zero- extend) from i8 to i32. Return the value before the extension. +template +inline std::enable_if_t<(std::is_base_of_v || + std::is_base_of_v), +std::optional> +extractExtOperand(Value v, Type i8Ty, Type i32Ty) { + auto extOp = dyn_cast_or_null(v.getDefiningOp()); + if (!extOp) +return {}; + + auto inOp = extOp.getIn(); + auto inTy = dyn_cast(inOp.getType()); + if (!inTy || inTy.getElementType() != i8Ty) +return {}; + + auto outTy = dyn_cast(extOp.getType()); + if (!outTy || outTy.getElementType() != i32Ty) +return {}; + + return inOp; +} + +// Designate the operation (resp. instruction) used to do sub-tile matrix +// multiplications. +enum class MMLA { + Signed, // smmla + Unsigned,// ummla + Mixed, // usmmla + MixedSwapped // usmmla with LHS and RHS swapped +}; + +// Create the matrix multply and accumulate operation according to `op`. +Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc, + mlir::VectorType accType, Value acc, Value lhs, Value rhs) { + switch (op) { + case MMLA::Signed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Unsigned: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Mixed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::MixedSwapped: +// The accumulator comes transposed and the result will be transposed +// later, so all we have to do here is swap the operands. +return rewriter.create(loc, accType, acc, rhs, lhs); + } +} + +class LowerContractionToSVEI8MMPattern +: public OpRewritePattern { +public: + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(vector::ContractionOp op, +PatternRewriter &rewriter) const override { + +Location loc = op.getLoc(); +mlir::VectorType lhsType = op.getLhsType(); +mlir::VectorType rhsType = op.getRhsType(); + +// For now handle LHS and RHS<8x[N]> - these are the types we +// eventually expect from MMT4D. M and N dimensions must be even and at momchil-velikov wrote: Done (in the top-level description). https://github.com/llvm/llvm-project/pull/135636 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to svusmmla (PR #135634)
https://github.com/momchil-velikov updated https://github.com/llvm/llvm-project/pull/135634 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support â https://githubstatus.com";>GitHub Status â https://twitter.com/githubstatus";>@githubstatus ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
https://github.com/momchil-velikov updated https://github.com/llvm/llvm-project/pull/135636 >From f397467bc167d94a28a919a45c009a8f08b6351b Mon Sep 17 00:00:00 2001 From: Momchil Velikov Date: Tue, 8 Apr 2025 14:43:54 + Subject: [PATCH 1/2] [MLIR][ArmSVE] Add initial lowering of `vector.contract` to SVE `*MMLA` instructions --- mlir/include/mlir/Conversion/Passes.td| 4 + .../Dialect/ArmSVE/Transforms/Transforms.h| 3 + .../Conversion/VectorToLLVM/CMakeLists.txt| 1 + .../VectorToLLVM/ConvertVectorToLLVMPass.cpp | 7 + .../LowerContractionToSMMLAPattern.cpp| 5 +- .../Dialect/ArmSVE/Transforms/CMakeLists.txt | 1 + .../LowerContractionToSVEI8MMPattern.cpp | 304 ++ .../Vector/CPU/ArmSVE/vector-smmla.mlir | 94 ++ .../Vector/CPU/ArmSVE/vector-summla.mlir | 85 + .../Vector/CPU/ArmSVE/vector-ummla.mlir | 94 ++ .../Vector/CPU/ArmSVE/vector-usmmla.mlir | 95 ++ .../CPU/ArmSVE/contraction-smmla-4x8x4.mlir | 117 +++ .../ArmSVE/contraction-smmla-8x8x8-vs2.mlir | 159 + .../CPU/ArmSVE/contraction-summla-4x8x4.mlir | 118 +++ .../CPU/ArmSVE/contraction-ummla-4x8x4.mlir | 119 +++ .../CPU/ArmSVE/contraction-usmmla-4x8x4.mlir | 117 +++ 16 files changed, 1322 insertions(+), 1 deletion(-) create mode 100644 mlir/lib/Dialect/ArmSVE/Transforms/LowerContractionToSVEI8MMPattern.cpp create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-smmla.mlir create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-summla.mlir create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-ummla.mlir create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-usmmla.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-4x8x4.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-8x8x8-vs2.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-summla-4x8x4.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-ummla-4x8x4.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-usmmla-4x8x4.mlir diff --git a/mlir/include/mlir/Conversion/Passes.td b/mlir/include/mlir/Conversion/Passes.td index 10557658d5d7d..b496ee0114910 100644 --- a/mlir/include/mlir/Conversion/Passes.td +++ b/mlir/include/mlir/Conversion/Passes.td @@ -1431,6 +1431,10 @@ def ConvertVectorToLLVMPass : Pass<"convert-vector-to-llvm"> { "bool", /*default=*/"false", "Enables the use of ArmSVE dialect while lowering the vector " "dialect.">, +Option<"armI8MM", "enable-arm-i8mm", + "bool", /*default=*/"false", + "Enables the use of Arm FEAT_I8MM instructions while lowering " + "the vector dialect.">, Option<"x86Vector", "enable-x86vector", "bool", /*default=*/"false", "Enables the use of X86Vector dialect while lowering the vector " diff --git a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h index 8665c8224cc45..232e2be29e574 100644 --- a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h +++ b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h @@ -20,6 +20,9 @@ class RewritePatternSet; void populateArmSVELegalizeForLLVMExportPatterns( const LLVMTypeConverter &converter, RewritePatternSet &patterns); +void populateLowerContractionToSVEI8MMPatternPatterns( +RewritePatternSet &patterns); + /// Configure the target to support lowering ArmSVE ops to ops that map to LLVM /// intrinsics. void configureArmSVELegalizeForExportTarget(LLVMConversionTarget &target); diff --git a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt index 330474a718e30..8e2620029c354 100644 --- a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt +++ b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt @@ -35,6 +35,7 @@ add_mlir_conversion_library(MLIRVectorToLLVMPass MLIRVectorToLLVM MLIRArmNeonDialect + MLIRArmNeonTransforms MLIRArmSVEDialect MLIRArmSVETransforms MLIRAMXDialect diff --git a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp index 0ee6dce9ee94b..293e01a5bf4d4 100644 --- a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp +++ b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp @@ -14,6 +14,7 @@ #include "mlir/Dialect/AMX/Transforms.h" #include "mlir/Dialect/Arith/IR/Arith.h" #include "mlir/Dialect/ArmNeon/ArmNeonDialect.h" +#include "mlir/Dialect/ArmNeon/Transforms.h" #include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" #include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" #include "mlir/Dialect/Func/IR/FuncOps.h" @@ -82,6 +83,12 @@ void ConvertVectorToLLVMPass::runOnOperation() { populateVectorStepLoweringPattern
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to svusmmla (PR #135634)
https://github.com/momchil-velikov updated https://github.com/llvm/llvm-project/pull/135634 >From 528237309c0bfd7bbb51a8fea37b54e07f21ad1d Mon Sep 17 00:00:00 2001 From: Momchil Velikov Date: Thu, 10 Apr 2025 14:38:27 + Subject: [PATCH] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to `svusmmla` --- mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td | 95 +++ .../Transforms/LegalizeForLLVMExport.cpp | 4 + .../Dialect/ArmSVE/legalize-for-llvm.mlir | 12 +++ mlir/test/Dialect/ArmSVE/roundtrip.mlir | 11 +++ mlir/test/Target/LLVMIR/arm-sve.mlir | 12 +++ 5 files changed, 96 insertions(+), 38 deletions(-) diff --git a/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td b/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td index 3a990f8464ef8..7385bb73b449a 100644 --- a/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td +++ b/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td @@ -147,11 +147,9 @@ class ScalableMaskedIOp, - AllTypesMatch<["acc", "dst"]>, - ]> { +def SdotOp : ArmSVE_Op<"sdot", [Pure, +AllTypesMatch<["src1", "src2"]>, +AllTypesMatch<["acc", "dst"]>]> { let summary = "Vector-vector dot product and accumulate op"; let description = [{ SDOT: Signed integer addition of dot product. @@ -178,11 +176,9 @@ def SdotOp : ArmSVE_Op<"sdot", "$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)"; } -def SmmlaOp : ArmSVE_Op<"smmla", -[Pure, -AllTypesMatch<["src1", "src2"]>, -AllTypesMatch<["acc", "dst"]>, - ]> { +def SmmlaOp : ArmSVE_Op<"smmla", [Pure, + AllTypesMatch<["src1", "src2"]>, + AllTypesMatch<["acc", "dst"]>]> { let summary = "Matrix-matrix multiply and accumulate op"; let description = [{ SMMLA: Signed integer matrix multiply-accumulate. @@ -210,11 +206,9 @@ def SmmlaOp : ArmSVE_Op<"smmla", "$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)"; } -def UdotOp : ArmSVE_Op<"udot", - [Pure, - AllTypesMatch<["src1", "src2"]>, - AllTypesMatch<["acc", "dst"]>, - ]> { +def UdotOp : ArmSVE_Op<"udot", [Pure, +AllTypesMatch<["src1", "src2"]>, +AllTypesMatch<["acc", "dst"]>]> { let summary = "Vector-vector dot product and accumulate op"; let description = [{ UDOT: Unsigned integer addition of dot product. @@ -241,11 +235,9 @@ def UdotOp : ArmSVE_Op<"udot", "$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)"; } -def UmmlaOp : ArmSVE_Op<"ummla", -[Pure, -AllTypesMatch<["src1", "src2"]>, -AllTypesMatch<["acc", "dst"]>, - ]> { +def UmmlaOp : ArmSVE_Op<"ummla", [Pure, + AllTypesMatch<["src1", "src2"]>, + AllTypesMatch<["acc", "dst"]>]> { let summary = "Matrix-matrix multiply and accumulate op"; let description = [{ UMMLA: Unsigned integer matrix multiply-accumulate. @@ -273,14 +265,42 @@ def UmmlaOp : ArmSVE_Op<"ummla", "$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)"; } +def UsmmlaOp : ArmSVE_Op<"usmmla", [Pure, +AllTypesMatch<["src1", "src2"]>, +AllTypesMatch<["acc", "dst"]>]> { + let summary = "Matrix-matrix multiply and accumulate op"; + let description = [{ +USMMLA: Unsigned by signed integer matrix multiply-accumulate. + +The unsigned by signed integer matrix multiply-accumulate operation +multiplies the 2Ă8 matrix of unsigned 8-bit integer values held +the first source vector by the 8Ă2 matrix of signed 8-bit integer +values in the second source vector. The resulting 2Ă2 widened 32-bit +integer matrix product is then added to the 32-bit integer matrix +accumulator. + +Source: +https://developer.arm.com/documentation/100987/ + }]; + // Supports (vector<16xi8>, vector<16xi8>) -> (vector<4xi32>) + let arguments = (ins + ScalableVectorOfLengthAndType<[4], [I32]>:$acc, + ScalableVectorOfLengthAndType<[16], [I8]>:$src1, + ScalableVectorOfLengthAndType<[16], [I8]>:$src2 + ); + let results = (outs ScalableVectorOfLengthAndType<[4], [I32]>:$dst); + let assemblyFormat = +"$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)"; +} + class SvboolTypeConstraint : TypesMatchWith< "expected corresponding svbool type widened to [16]xi1", lhsArg, rhsArg, "VectorType(VectorType::Builder(::llvm::cast($_self)).setDim(::llvm::cast($_self).getRank() - 1, 16))">; def ConvertFromSvboolOp : ArmSVE_Op<"convert_from_svbool", -[Pure, SvboolTypeConstraint<"result", "source">]> -{ +
[llvm-branch-commits] [llvm] [ObjC] Support objc_claimAutoreleasedReturnValue (PR #138696)
AZero13 wrote: What did you mean by // FIXME: do this on ARCRuntimeEntryPoints, and do the todo above ARCInstKind Like you want to do the same check there? How can I help? https://github.com/llvm/llvm-project/pull/138696 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
@@ -0,0 +1,304 @@ +//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file implements lowering patterns from vector.contract to +// SVE I8MM operations. +// +//===--- + +#include "mlir/Dialect/Arith/IR/Arith.h" +#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" +#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" +#include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/Dialect/Utils/IndexingUtils.h" +#include "mlir/Dialect/Vector/IR/VectorOps.h" +#include "mlir/IR/AffineMap.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include "mlir/Dialect/UB/IR/UBOps.h" + +#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm" + +using namespace mlir; +using namespace mlir::arm_sve; + +namespace { +// Check if the given value is a result of the operation `T` (which must be +// sign- or zero- extend) from i8 to i32. Return the value before the extension. +template +inline std::enable_if_t<(std::is_base_of_v || + std::is_base_of_v), +std::optional> +extractExtOperand(Value v, Type i8Ty, Type i32Ty) { + auto extOp = dyn_cast_or_null(v.getDefiningOp()); + if (!extOp) +return {}; + + auto inOp = extOp.getIn(); + auto inTy = dyn_cast(inOp.getType()); + if (!inTy || inTy.getElementType() != i8Ty) +return {}; + + auto outTy = dyn_cast(extOp.getType()); + if (!outTy || outTy.getElementType() != i32Ty) +return {}; + + return inOp; +} + +// Designate the operation (resp. instruction) used to do sub-tile matrix +// multiplications. +enum class MMLA { + Signed, // smmla + Unsigned,// ummla + Mixed, // usmmla + MixedSwapped // usmmla with LHS and RHS swapped +}; + +// Create the matrix multply and accumulate operation according to `op`. +Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc, + mlir::VectorType accType, Value acc, Value lhs, Value rhs) { + switch (op) { + case MMLA::Signed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Unsigned: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Mixed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::MixedSwapped: +// The accumulator comes transposed and the result will be transposed +// later, so all we have to do here is swap the operands. +return rewriter.create(loc, accType, acc, rhs, lhs); + } +} + +class LowerContractionToSVEI8MMPattern +: public OpRewritePattern { +public: + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(vector::ContractionOp op, +PatternRewriter &rewriter) const override { + +Location loc = op.getLoc(); +mlir::VectorType lhsType = op.getLhsType(); +mlir::VectorType rhsType = op.getRhsType(); + +// For now handle LHS and RHS<8x[N]> - these are the types we +// eventually expect from MMT4D. M and N dimensions must be even and at +// least 2. +if (!lhsType.hasRank() || lhsType.getRank() != 2 || !rhsType.hasRank() || +rhsType.getRank() != 2) + return failure(); + +if (lhsType.isScalable() || !rhsType.isScalable()) + return failure(); + +// M, N, and K are the conventional names for matrix dimensions in the +// context of matrix multiplication. +auto M = lhsType.getDimSize(0); +auto N = rhsType.getDimSize(0); +auto K = rhsType.getDimSize(1); + +if (lhsType.getDimSize(1) != K || K != 8 || M < 2 || M % 2 != 0 || N < 2 || +N % 2 != 0 || !rhsType.getScalableDims()[0]) + return failure(); + +// Check permutation maps. For now only accept +// lhs: (d0, d1, d2) -> (d0, d2) +// rhs: (d0, d1, d2) -> (d1, d2) +// acc: (d0, d1, d2) -> (d0, d1) +// Note: RHS is transposed. +if (op.getIndexingMapsArray()[0] != +AffineMap::getMultiDimMapWithTargets(3, ArrayRef{0u, 2u}, + op.getContext()) || +op.getIndexingMapsArray()[1] != +AffineMap::getMultiDimMapWithTargets(3, ArrayRef{1u, 2u}, + op.getContext()) || +op.getIndexingMapsArray()[2] != +AffineMap::getMultiDimMapWithTargets(3, ArrayRef{0u, 1u}, + op.getContext())) + return failure(); + +// Check iterator types for matrix multiplication. +auto itTypes = op.getIteratorTypesArray(); +if (itTypes.size() != 3 || itTypes[0] != vector::IteratorType::parallel || +itTypes[1] != vector::IteratorType
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
https://github.com/momchil-velikov updated https://github.com/llvm/llvm-project/pull/135636 >From f397467bc167d94a28a919a45c009a8f08b6351b Mon Sep 17 00:00:00 2001 From: Momchil Velikov Date: Tue, 8 Apr 2025 14:43:54 + Subject: [PATCH 1/2] [MLIR][ArmSVE] Add initial lowering of `vector.contract` to SVE `*MMLA` instructions --- mlir/include/mlir/Conversion/Passes.td| 4 + .../Dialect/ArmSVE/Transforms/Transforms.h| 3 + .../Conversion/VectorToLLVM/CMakeLists.txt| 1 + .../VectorToLLVM/ConvertVectorToLLVMPass.cpp | 7 + .../LowerContractionToSMMLAPattern.cpp| 5 +- .../Dialect/ArmSVE/Transforms/CMakeLists.txt | 1 + .../LowerContractionToSVEI8MMPattern.cpp | 304 ++ .../Vector/CPU/ArmSVE/vector-smmla.mlir | 94 ++ .../Vector/CPU/ArmSVE/vector-summla.mlir | 85 + .../Vector/CPU/ArmSVE/vector-ummla.mlir | 94 ++ .../Vector/CPU/ArmSVE/vector-usmmla.mlir | 95 ++ .../CPU/ArmSVE/contraction-smmla-4x8x4.mlir | 117 +++ .../ArmSVE/contraction-smmla-8x8x8-vs2.mlir | 159 + .../CPU/ArmSVE/contraction-summla-4x8x4.mlir | 118 +++ .../CPU/ArmSVE/contraction-ummla-4x8x4.mlir | 119 +++ .../CPU/ArmSVE/contraction-usmmla-4x8x4.mlir | 117 +++ 16 files changed, 1322 insertions(+), 1 deletion(-) create mode 100644 mlir/lib/Dialect/ArmSVE/Transforms/LowerContractionToSVEI8MMPattern.cpp create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-smmla.mlir create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-summla.mlir create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-ummla.mlir create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-usmmla.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-4x8x4.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-8x8x8-vs2.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-summla-4x8x4.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-ummla-4x8x4.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-usmmla-4x8x4.mlir diff --git a/mlir/include/mlir/Conversion/Passes.td b/mlir/include/mlir/Conversion/Passes.td index 10557658d5d7d..b496ee0114910 100644 --- a/mlir/include/mlir/Conversion/Passes.td +++ b/mlir/include/mlir/Conversion/Passes.td @@ -1431,6 +1431,10 @@ def ConvertVectorToLLVMPass : Pass<"convert-vector-to-llvm"> { "bool", /*default=*/"false", "Enables the use of ArmSVE dialect while lowering the vector " "dialect.">, +Option<"armI8MM", "enable-arm-i8mm", + "bool", /*default=*/"false", + "Enables the use of Arm FEAT_I8MM instructions while lowering " + "the vector dialect.">, Option<"x86Vector", "enable-x86vector", "bool", /*default=*/"false", "Enables the use of X86Vector dialect while lowering the vector " diff --git a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h index 8665c8224cc45..232e2be29e574 100644 --- a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h +++ b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h @@ -20,6 +20,9 @@ class RewritePatternSet; void populateArmSVELegalizeForLLVMExportPatterns( const LLVMTypeConverter &converter, RewritePatternSet &patterns); +void populateLowerContractionToSVEI8MMPatternPatterns( +RewritePatternSet &patterns); + /// Configure the target to support lowering ArmSVE ops to ops that map to LLVM /// intrinsics. void configureArmSVELegalizeForExportTarget(LLVMConversionTarget &target); diff --git a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt index 330474a718e30..8e2620029c354 100644 --- a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt +++ b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt @@ -35,6 +35,7 @@ add_mlir_conversion_library(MLIRVectorToLLVMPass MLIRVectorToLLVM MLIRArmNeonDialect + MLIRArmNeonTransforms MLIRArmSVEDialect MLIRArmSVETransforms MLIRAMXDialect diff --git a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp index 0ee6dce9ee94b..293e01a5bf4d4 100644 --- a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp +++ b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp @@ -14,6 +14,7 @@ #include "mlir/Dialect/AMX/Transforms.h" #include "mlir/Dialect/Arith/IR/Arith.h" #include "mlir/Dialect/ArmNeon/ArmNeonDialect.h" +#include "mlir/Dialect/ArmNeon/Transforms.h" #include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" #include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" #include "mlir/Dialect/Func/IR/FuncOps.h" @@ -82,6 +83,12 @@ void ConvertVectorToLLVMPass::runOnOperation() { populateVectorStepLoweringPattern
[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)
aaupov wrote: This approach doesn't solve the problem in case nm is symlinked to llvm-nm which doesn't have the flag. Abandon in favor of #139953. https://github.com/llvm/llvm-project/pull/135867 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)
https://github.com/aaupov closed https://github.com/llvm/llvm-project/pull/135867 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] IR: Remove redundant UseList check in addUse (PR #138676)
nikic wrote: I think this may have been noise. I reran this and there are no differences over the significance threshold: https://llvm-compile-time-tracker.com/compare.php?from=6c1bb48cc45396894597c8cb897c31205d1bdeb6&to=1837fe71fcfb4363fd2b66cdb9ff6a82b3f380fb&stat=instructions:u https://github.com/llvm/llvm-project/pull/138676 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] IR: Remove redundant UseList check in addUse (PR #138676)
https://github.com/nikic approved this pull request. https://github.com/llvm/llvm-project/pull/138676 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [LoongArch] Fix fp_to_uint/fp_to_sint conversion errors for lasx (#137129) (PR #139851)
SixWeining wrote: This PR fixes https://github.com/llvm/llvm-project/issues/136971. https://github.com/llvm/llvm-project/pull/139851 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [Clang][Backport] Demote mixed enumeration arithmetic error to a warning (#131811) (PR #139396)
@@ -7567,9 +7567,13 @@ def warn_arith_conv_mixed_enum_types_cxx20 : Warning< "%sub{select_arith_conv_kind}0 " "different enumeration types%diff{ ($ and $)|}1,2 is deprecated">, InGroup; -def err_conv_mixed_enum_types_cxx26 : Error< + +def err_conv_mixed_enum_types: Error < "invalid %sub{select_arith_conv_kind}0 " "different enumeration types%diff{ ($ and $)|}1,2">; +def warn_conv_mixed_enum_types_cxx26 : Warning < + err_conv_mixed_enum_types.Summary>, + InGroup, DefaultError; AaronBallman wrote: I don't disagree (it does add a new enum value), but I am starting to think this ABI requirement is onerous enough to be worth rethinking the scope of it. The point to the ABI requirement is that you should be able to use Clang as a library interchangeably between the dot releases. The platform calling convention ABI does not change when an enumeration gains a new enumerator, nor does name mangling behavior change. In fact, adding an enumerator can only increase the number of valid integer values that can be represented by the enumeration, which means the addition won't introduce new UB but could remove UB by widening the range of valid values. So I think this kind of change isn't introducing a kind of ABI break we need to avoid. Or am I missing something? https://github.com/llvm/llvm-project/pull/139396 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
@@ -0,0 +1,304 @@ +//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file implements lowering patterns from vector.contract to +// SVE I8MM operations. +// +//===--- + +#include "mlir/Dialect/Arith/IR/Arith.h" +#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" +#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" +#include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/Dialect/Utils/IndexingUtils.h" +#include "mlir/Dialect/Vector/IR/VectorOps.h" +#include "mlir/IR/AffineMap.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include "mlir/Dialect/UB/IR/UBOps.h" + +#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm" + +using namespace mlir; +using namespace mlir::arm_sve; + +namespace { +// Check if the given value is a result of the operation `T` (which must be +// sign- or zero- extend) from i8 to i32. Return the value before the extension. +template +inline std::enable_if_t<(std::is_base_of_v || + std::is_base_of_v), +std::optional> +extractExtOperand(Value v, Type i8Ty, Type i32Ty) { + auto extOp = dyn_cast_or_null(v.getDefiningOp()); + if (!extOp) +return {}; + + auto inOp = extOp.getIn(); + auto inTy = dyn_cast(inOp.getType()); + if (!inTy || inTy.getElementType() != i8Ty) +return {}; + + auto outTy = dyn_cast(extOp.getType()); + if (!outTy || outTy.getElementType() != i32Ty) +return {}; + + return inOp; +} + +// Designate the operation (resp. instruction) used to do sub-tile matrix +// multiplications. +enum class MMLA { + Signed, // smmla + Unsigned,// ummla + Mixed, // usmmla + MixedSwapped // usmmla with LHS and RHS swapped +}; + +// Create the matrix multply and accumulate operation according to `op`. +Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc, + mlir::VectorType accType, Value acc, Value lhs, Value rhs) { + switch (op) { + case MMLA::Signed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Unsigned: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Mixed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::MixedSwapped: +// The accumulator comes transposed and the result will be transposed +// later, so all we have to do here is swap the operands. +return rewriter.create(loc, accType, acc, rhs, lhs); + } +} + +class LowerContractionToSVEI8MMPattern +: public OpRewritePattern { +public: + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(vector::ContractionOp op, +PatternRewriter &rewriter) const override { + +Location loc = op.getLoc(); +mlir::VectorType lhsType = op.getLhsType(); +mlir::VectorType rhsType = op.getRhsType(); + +// For now handle LHS and RHS<8x[N]> - these are the types we +// eventually expect from MMT4D. M and N dimensions must be even and at momchil-velikov wrote: There's no dependency on MMT4D - this comment is merely a rationale why we have chosen these particular operand shapes. https://github.com/llvm/llvm-project/pull/135636 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
@@ -0,0 +1,304 @@ +//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file implements lowering patterns from vector.contract to +// SVE I8MM operations. +// +//===--- + +#include "mlir/Dialect/Arith/IR/Arith.h" +#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" +#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" +#include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/Dialect/Utils/IndexingUtils.h" +#include "mlir/Dialect/Vector/IR/VectorOps.h" +#include "mlir/IR/AffineMap.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include "mlir/Dialect/UB/IR/UBOps.h" + +#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm" + +using namespace mlir; +using namespace mlir::arm_sve; + +namespace { +// Check if the given value is a result of the operation `T` (which must be +// sign- or zero- extend) from i8 to i32. Return the value before the extension. +template +inline std::enable_if_t<(std::is_base_of_v || + std::is_base_of_v), +std::optional> +extractExtOperand(Value v, Type i8Ty, Type i32Ty) { + auto extOp = dyn_cast_or_null(v.getDefiningOp()); + if (!extOp) +return {}; + + auto inOp = extOp.getIn(); + auto inTy = dyn_cast(inOp.getType()); + if (!inTy || inTy.getElementType() != i8Ty) +return {}; + + auto outTy = dyn_cast(extOp.getType()); + if (!outTy || outTy.getElementType() != i32Ty) +return {}; + + return inOp; +} + +// Designate the operation (resp. instruction) used to do sub-tile matrix +// multiplications. +enum class MMLA { + Signed, // smmla + Unsigned,// ummla + Mixed, // usmmla + MixedSwapped // usmmla with LHS and RHS swapped +}; + +// Create the matrix multply and accumulate operation according to `op`. +Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc, + mlir::VectorType accType, Value acc, Value lhs, Value rhs) { + switch (op) { + case MMLA::Signed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Unsigned: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Mixed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::MixedSwapped: +// The accumulator comes transposed and the result will be transposed +// later, so all we have to do here is swap the operands. +return rewriter.create(loc, accType, acc, rhs, lhs); + } +} + +class LowerContractionToSVEI8MMPattern +: public OpRewritePattern { +public: + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(vector::ContractionOp op, +PatternRewriter &rewriter) const override { + +Location loc = op.getLoc(); +mlir::VectorType lhsType = op.getLhsType(); +mlir::VectorType rhsType = op.getRhsType(); + +// For now handle LHS and RHS<8x[N]> - these are the types we +// eventually expect from MMT4D. M and N dimensions must be even and at +// least 2. +if (!lhsType.hasRank() || lhsType.getRank() != 2 || !rhsType.hasRank() || momchil-velikov wrote: Done https://github.com/llvm/llvm-project/pull/135636 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
@@ -0,0 +1,304 @@ +//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file implements lowering patterns from vector.contract to +// SVE I8MM operations. momchil-velikov wrote: The `vector.contract` implicitly sign-extends its operands, so it does not need to by accompanied by explicit extend operations. I'll add code to handle this case too. https://github.com/llvm/llvm-project/pull/135636 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] Enable fexec-charset option (PR #138895)
@@ -0,0 +1,36 @@ +//===--- clang/Lex/LiteralConverter.h - Translator for Literals -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CLANG_LEX_LITERALCONVERTER_H +#define LLVM_CLANG_LEX_LITERALCONVERTER_H + +#include "clang/Basic/Diagnostic.h" +#include "clang/Basic/LangOptions.h" +#include "clang/Basic/TargetInfo.h" +#include "llvm/ADT/StringMap.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/CharSet.h" + +enum ConversionAction { NoConversion, ToSystemCharset, ToExecCharset }; + +class LiteralConverter { + llvm::StringRef InternalCharset; + llvm::StringRef SystemCharset; + llvm::StringRef ExecCharset; + llvm::StringMap CharsetConverters; abhina-sree wrote: That is true, but I guess the tradeoff is you would need to recreate the converters more times. pragma converts are stackable so it's possible to do the following ``` pragma convert ("IBM-1047") pragma convert("UTF-8") pragma convert(pop) pragma convert(pop) ``` https://github.com/llvm/llvm-project/pull/138895 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [libc++] Implement std::move_only_function (P0288R9) (PR #94670)
frederick-vs-ja wrote: FYI [P2548R2](https://wg21.link/p2548r2) added relaxing wording to [[func.wrap.general]](https://eel.is/c++draft/func.wrap.general) to allow unwrapping in construction. I think we should implement the allowance for `move_only_function` in C++23 as a DR. We can't unwrap in `function` construction though, IIUC, because its target object is observable. But we can unwrap `function` when constructing a `move_only_function`. https://github.com/llvm/llvm-project/pull/94670 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/20.x: [LoongArch] Fix fp_to_uint/fp_to_sint conversion errors for lasx (#137129) (PR #139851)
https://github.com/heiher approved this pull request. https://github.com/llvm/llvm-project/pull/139851 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
@@ -0,0 +1,304 @@ +//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file implements lowering patterns from vector.contract to +// SVE I8MM operations. +// +//===--- + +#include "mlir/Dialect/Arith/IR/Arith.h" +#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" +#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" +#include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/Dialect/Utils/IndexingUtils.h" +#include "mlir/Dialect/Vector/IR/VectorOps.h" +#include "mlir/IR/AffineMap.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include "mlir/Dialect/UB/IR/UBOps.h" + +#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm" + +using namespace mlir; +using namespace mlir::arm_sve; + +namespace { +// Check if the given value is a result of the operation `T` (which must be +// sign- or zero- extend) from i8 to i32. Return the value before the extension. +template +inline std::enable_if_t<(std::is_base_of_v || + std::is_base_of_v), +std::optional> +extractExtOperand(Value v, Type i8Ty, Type i32Ty) { + auto extOp = dyn_cast_or_null(v.getDefiningOp()); + if (!extOp) +return {}; + + auto inOp = extOp.getIn(); + auto inTy = dyn_cast(inOp.getType()); + if (!inTy || inTy.getElementType() != i8Ty) +return {}; + + auto outTy = dyn_cast(extOp.getType()); + if (!outTy || outTy.getElementType() != i32Ty) +return {}; + + return inOp; +} + +// Designate the operation (resp. instruction) used to do sub-tile matrix +// multiplications. +enum class MMLA { + Signed, // smmla + Unsigned,// ummla + Mixed, // usmmla + MixedSwapped // usmmla with LHS and RHS swapped +}; + +// Create the matrix multply and accumulate operation according to `op`. +Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc, + mlir::VectorType accType, Value acc, Value lhs, Value rhs) { + switch (op) { + case MMLA::Signed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Unsigned: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Mixed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::MixedSwapped: +// The accumulator comes transposed and the result will be transposed +// later, so all we have to do here is swap the operands. +return rewriter.create(loc, accType, acc, rhs, lhs); + } +} + +class LowerContractionToSVEI8MMPattern +: public OpRewritePattern { +public: + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(vector::ContractionOp op, +PatternRewriter &rewriter) const override { + +Location loc = op.getLoc(); +mlir::VectorType lhsType = op.getLhsType(); +mlir::VectorType rhsType = op.getRhsType(); + +// For now handle LHS and RHS<8x[N]> - these are the types we +// eventually expect from MMT4D. M and N dimensions must be even and at banach-space wrote: Perhaps just expand this comment a bit (e.g. by noting that MMT4D is the main use-case ATM)? https://github.com/llvm/llvm-project/pull/135636 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [X86] Remove extra MOV after widening atomic load (PR #138635)
@@ -1200,6 +1200,13 @@ def : Pat<(i16 (atomic_load_nonext_16 addr:$src)), (MOV16rm addr:$src)>; def : Pat<(i32 (atomic_load_nonext_32 addr:$src)), (MOV32rm addr:$src)>; def : Pat<(i64 (atomic_load_nonext_64 addr:$src)), (MOV64rm addr:$src)>; +def : Pat<(v4i32 (scalar_to_vector (i32 (anyext (i16 (atomic_load_16 addr:$src)), + (MOVDI2PDIrm addr:$src)>; // load atomic <2 x i8> jofrn wrote: Ok. Thanks! Since you've made the commits in already, I'll interleave the SSE/AVX updates throughout the series rather than making a new PR. https://github.com/llvm/llvm-project/pull/138635 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
@@ -0,0 +1,304 @@ +//===- LowerContractionToSMMLAPattern.cpp - Contract to SMMLA ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file implements lowering patterns from vector.contract to +// SVE I8MM operations. +// +//===--- + +#include "mlir/Dialect/Arith/IR/Arith.h" +#include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" +#include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" +#include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/Dialect/Utils/IndexingUtils.h" +#include "mlir/Dialect/Vector/IR/VectorOps.h" +#include "mlir/IR/AffineMap.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include "mlir/Dialect/UB/IR/UBOps.h" + +#define DEBUG_TYPE "lower-contract-to-arm-sve-i8mm" + +using namespace mlir; +using namespace mlir::arm_sve; + +namespace { +// Check if the given value is a result of the operation `T` (which must be +// sign- or zero- extend) from i8 to i32. Return the value before the extension. +template +inline std::enable_if_t<(std::is_base_of_v || + std::is_base_of_v), +std::optional> +extractExtOperand(Value v, Type i8Ty, Type i32Ty) { + auto extOp = dyn_cast_or_null(v.getDefiningOp()); + if (!extOp) +return {}; + + auto inOp = extOp.getIn(); + auto inTy = dyn_cast(inOp.getType()); + if (!inTy || inTy.getElementType() != i8Ty) +return {}; + + auto outTy = dyn_cast(extOp.getType()); + if (!outTy || outTy.getElementType() != i32Ty) +return {}; + + return inOp; +} + +// Designate the operation (resp. instruction) used to do sub-tile matrix +// multiplications. +enum class MMLA { + Signed, // smmla + Unsigned,// ummla + Mixed, // usmmla + MixedSwapped // usmmla with LHS and RHS swapped +}; + +// Create the matrix multply and accumulate operation according to `op`. +Value createMMLA(PatternRewriter &rewriter, MMLA op, Location loc, + mlir::VectorType accType, Value acc, Value lhs, Value rhs) { + switch (op) { + case MMLA::Signed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Unsigned: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::Mixed: +return rewriter.create(loc, accType, acc, lhs, rhs); + case MMLA::MixedSwapped: +// The accumulator comes transposed and the result will be transposed +// later, so all we have to do here is swap the operands. +return rewriter.create(loc, accType, acc, rhs, lhs); + } +} + +class LowerContractionToSVEI8MMPattern +: public OpRewritePattern { +public: + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(vector::ContractionOp op, +PatternRewriter &rewriter) const override { + +Location loc = op.getLoc(); +mlir::VectorType lhsType = op.getLhsType(); +mlir::VectorType rhsType = op.getRhsType(); + +// For now handle LHS and RHS<8x[N]> - these are the types we +// eventually expect from MMT4D. M and N dimensions must be even and at +// least 2. +if (!lhsType.hasRank() || lhsType.getRank() != 2 || !rhsType.hasRank() || +rhsType.getRank() != 2) + return failure(); + +if (lhsType.isScalable() || !rhsType.isScalable()) + return failure(); + +// M, N, and K are the conventional names for matrix dimensions in the +// context of matrix multiplication. +auto M = lhsType.getDimSize(0); +auto N = rhsType.getDimSize(0); +auto K = rhsType.getDimSize(1); + +if (lhsType.getDimSize(1) != K || K != 8 || M < 2 || M % 2 != 0 || N < 2 || +N % 2 != 0 || !rhsType.getScalableDims()[0]) + return failure(); + +// Check permutation maps. For now only accept +// lhs: (d0, d1, d2) -> (d0, d2) +// rhs: (d0, d1, d2) -> (d1, d2) +// acc: (d0, d1, d2) -> (d0, d1) +// Note: RHS is transposed. +if (op.getIndexingMapsArray()[0] != +AffineMap::getMultiDimMapWithTargets(3, ArrayRef{0u, 2u}, + op.getContext()) || +op.getIndexingMapsArray()[1] != +AffineMap::getMultiDimMapWithTargets(3, ArrayRef{1u, 2u}, + op.getContext()) || +op.getIndexingMapsArray()[2] != +AffineMap::getMultiDimMapWithTargets(3, ArrayRef{0u, 1u}, + op.getContext())) + return failure(); + +// Check iterator types for matrix multiplication. +auto itTypes = op.getIteratorTypesArray(); +if (itTypes.size() != 3 || itTypes[0] != vector::IteratorType::parallel || +itTypes[1] != vector::IteratorType
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: prevent false positives due to jump tables (PR #138884)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138884 >From eae2c10ca4d3024862eba06acbb073244ac350e9 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Tue, 6 May 2025 11:31:03 +0300 Subject: [PATCH] [BOLT] Gadget scanner: prevent false positives due to jump tables As part of PAuth hardening, AArch64 LLVM backend can use a special BR_JumpTable pseudo (enabled by -faarch64-jump-table-hardening Clang option) which is expanded in the AsmPrinter into a contiguous sequence without unsafe instructions in the middle. This commit adds another target-specific callback to MCPlusBuilder to make it possible to inhibit false positives for known-safe jump table dispatch sequences. Without special handling, the branch instruction is likely to be reported as a non-protected call (as its destination is not produced by an auth instruction, PC-relative address materialization, etc.) and possibly as a tail call being performed with unsafe link register (as the detection whether the branch instruction is a tail call is an heuristic). For now, only the specific instruction sequence used by the AArch64 LLVM backend is matched. --- bolt/include/bolt/Core/MCInstUtils.h | 9 + bolt/include/bolt/Core/MCPlusBuilder.h| 14 + bolt/lib/Core/MCInstUtils.cpp | 20 + bolt/lib/Passes/PAuthGadgetScanner.cpp| 10 + .../Target/AArch64/AArch64MCPlusBuilder.cpp | 73 ++ .../AArch64/gs-pauth-jump-table.s | 703 ++ 6 files changed, 829 insertions(+) create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-jump-table.s diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h index b495eb8ef5eec..5dd0aaa48d6e7 100644 --- a/bolt/include/bolt/Core/MCInstUtils.h +++ b/bolt/include/bolt/Core/MCInstUtils.h @@ -158,6 +158,15 @@ class MCInstReference { return nullptr; } + /// Returns the only preceding instruction, or std::nullopt if multiple or no + /// predecessors are possible. + /// + /// If CFG information is available, basic block boundary can be crossed, + /// provided there is exactly one predecessor. If CFG is not available, the + /// preceding instruction in the offset order is returned, unless this is the + /// first instruction of the function. + std::optional getSinglePredecessor(); + raw_ostream &print(raw_ostream &OS) const; }; diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 87de6754017db..eb93d7de7fee9 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -14,6 +14,7 @@ #ifndef BOLT_CORE_MCPLUSBUILDER_H #define BOLT_CORE_MCPLUSBUILDER_H +#include "bolt/Core/MCInstUtils.h" #include "bolt/Core/MCPlus.h" #include "bolt/Core/Relocation.h" #include "llvm/ADT/ArrayRef.h" @@ -699,6 +700,19 @@ class MCPlusBuilder { return std::nullopt; } + /// Tests if BranchInst corresponds to an instruction sequence which is known + /// to be a safe dispatch via jump table. + /// + /// The target can decide which instruction sequences to consider "safe" from + /// the Pointer Authentication point of view, such as any jump table dispatch + /// sequence without function calls inside, any sequence which is contiguous, + /// or only some specific well-known sequences. + virtual bool + isSafeJumpTableBranchForPtrAuth(MCInstReference BranchInst) const { +llvm_unreachable("not implemented"); +return false; + } + virtual bool isTerminator(const MCInst &Inst) const; virtual bool isNoop(const MCInst &Inst) const { diff --git a/bolt/lib/Core/MCInstUtils.cpp b/bolt/lib/Core/MCInstUtils.cpp index 40f6edd59135c..b7c6d898988af 100644 --- a/bolt/lib/Core/MCInstUtils.cpp +++ b/bolt/lib/Core/MCInstUtils.cpp @@ -55,3 +55,23 @@ raw_ostream &MCInstReference::print(raw_ostream &OS) const { OS << ">"; return OS; } + +std::optional MCInstReference::getSinglePredecessor() { + if (const RefInBB *Ref = tryGetRefInBB()) { +if (Ref->It != Ref->BB->begin()) + return MCInstReference(Ref->BB, &*std::prev(Ref->It)); + +if (Ref->BB->pred_size() != 1) + return std::nullopt; + +BinaryBasicBlock *PredBB = *Ref->BB->pred_begin(); +assert(!PredBB->empty() && "Empty basic blocks are not supported yet"); +return MCInstReference(PredBB, &*PredBB->rbegin()); + } + + const RefInBF &Ref = getRefInBF(); + if (Ref.It == Ref.BF->instrs().begin()) +return std::nullopt; + + return MCInstReference(Ref.BF, std::prev(Ref.It)); +} diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index 10a3f9716c201..e96ec5996984d 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -1329,6 +1329,11 @@ shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF, return std::nullopt; } + if (BC.MIB->isSafeJumpTableBranchForPtrAuth(Inst)) { +LL
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: optionally assume auth traps on failure (PR #139778)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/139778 >From 3dd903c3143f03e0aefb26d0349cf746b8169357 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Tue, 13 May 2025 19:50:41 +0300 Subject: [PATCH] [BOLT] Gadget scanner: optionally assume auth traps on failure On AArch64 it is possible for an auth instruction to either return an invalid address value on failure (without FEAT_FPAC) or generate an error (with FEAT_FPAC). It thus may be possible to never emit explicit pointer checks, if the target CPU is known to support FEAT_FPAC. This commit implements an --auth-traps-on-failure command line option, which essentially makes "safe-to-dereference" and "trusted" register properties identical and disables scanning for authentication oracles completely. --- bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++ .../binary-analysis/AArch64/cmdline-args.test | 1 + .../AArch64/gs-pauth-authentication-oracles.s | 6 +- .../binary-analysis/AArch64/gs-pauth-calls.s | 5 +- .../AArch64/gs-pauth-debug-output.s | 177 ++--- .../AArch64/gs-pauth-jump-table.s | 6 +- .../AArch64/gs-pauth-signing-oracles.s| 55 +++--- .../AArch64/gs-pauth-tail-calls.s | 184 +- 8 files changed, 318 insertions(+), 228 deletions(-) diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index e96ec5996984d..d591bd7831fc6 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -14,6 +14,7 @@ #include "bolt/Passes/PAuthGadgetScanner.h" #include "bolt/Core/ParallelUtilities.h" #include "bolt/Passes/DataflowAnalysis.h" +#include "bolt/Utils/CommandLineOpts.h" #include "llvm/ADT/STLExtras.h" #include "llvm/ADT/SmallSet.h" #include "llvm/MC/MCInst.h" @@ -26,6 +27,11 @@ namespace llvm { namespace bolt { namespace PAuthGadgetScanner { +static cl::opt AuthTrapsOnFailure( +"auth-traps-on-failure", +cl::desc("Assume authentication instructions always trap on failure"), +cl::cat(opts::BinaryAnalysisCategory)); + [[maybe_unused]] static void traceInst(const BinaryContext &BC, StringRef Label, const MCInst &MI) { dbgs() << " " << Label << ": "; @@ -365,6 +371,34 @@ class SrcSafetyAnalysis { return Clobbered; } + std::optional getRegMadeTrustedByChecking(const MCInst &Inst, + SrcState Cur) const { +// This functions cannot return multiple registers. This is never the case +// on AArch64. +std::optional RegCheckedByInst = +BC.MIB->getAuthCheckedReg(Inst, /*MayOverwrite=*/false); +if (RegCheckedByInst && Cur.SafeToDerefRegs[*RegCheckedByInst]) + return *RegCheckedByInst; + +auto It = CheckerSequenceInfo.find(&Inst); +if (It == CheckerSequenceInfo.end()) + return std::nullopt; + +MCPhysReg RegCheckedBySequence = It->second.first; +const MCInst *FirstCheckerInst = It->second.second; + +// FirstCheckerInst should belong to the same basic block (see the +// assertion in DataflowSrcSafetyAnalysis::run()), meaning it was +// deterministically processed a few steps before this instruction. +const SrcState &StateBeforeChecker = getStateBefore(*FirstCheckerInst); + +// The sequence checks the register, but it should be authenticated before. +if (!StateBeforeChecker.SafeToDerefRegs[RegCheckedBySequence]) + return std::nullopt; + +return RegCheckedBySequence; + } + // Returns all registers that can be treated as if they are written by an // authentication instruction. SmallVector getRegsMadeSafeToDeref(const MCInst &Point, @@ -387,18 +421,38 @@ class SrcSafetyAnalysis { Regs.push_back(DstAndSrc->first); } +// Make sure explicit checker sequence keeps register safe-to-dereference +// when the register would be clobbered according to the regular rules: +// +//; LR is safe to dereference here +//mov x16, x30 ; start of the sequence, LR is s-t-d right before +//xpaclri ; clobbers LR, LR is not safe anymore +//cmp x30, x16 +//b.eq 1f; end of the sequence: LR is marked as trusted +//brk 0x1234 +// 1: +//; at this point LR would be marked as trusted, +//; but not safe-to-dereference +// +// or even just +// +//; X1 is safe to dereference here +//ldr x0, [x1, #8]! +//; X1 is trusted here, but it was clobbered due to address write-back +if (auto CheckedReg = getRegMadeTrustedByChecking(Point, Cur)) + Regs.push_back(*CheckedReg); + return Regs; } // Returns all registers made trusted by this instruction. SmallVector getRegsMadeTrusted(const MCInst &Point, const SrcState &Cur) const { +assert(!AuthTrapsOnFailure &
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: prevent false positives due to jump tables (PR #138884)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138884 >From eae2c10ca4d3024862eba06acbb073244ac350e9 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Tue, 6 May 2025 11:31:03 +0300 Subject: [PATCH] [BOLT] Gadget scanner: prevent false positives due to jump tables As part of PAuth hardening, AArch64 LLVM backend can use a special BR_JumpTable pseudo (enabled by -faarch64-jump-table-hardening Clang option) which is expanded in the AsmPrinter into a contiguous sequence without unsafe instructions in the middle. This commit adds another target-specific callback to MCPlusBuilder to make it possible to inhibit false positives for known-safe jump table dispatch sequences. Without special handling, the branch instruction is likely to be reported as a non-protected call (as its destination is not produced by an auth instruction, PC-relative address materialization, etc.) and possibly as a tail call being performed with unsafe link register (as the detection whether the branch instruction is a tail call is an heuristic). For now, only the specific instruction sequence used by the AArch64 LLVM backend is matched. --- bolt/include/bolt/Core/MCInstUtils.h | 9 + bolt/include/bolt/Core/MCPlusBuilder.h| 14 + bolt/lib/Core/MCInstUtils.cpp | 20 + bolt/lib/Passes/PAuthGadgetScanner.cpp| 10 + .../Target/AArch64/AArch64MCPlusBuilder.cpp | 73 ++ .../AArch64/gs-pauth-jump-table.s | 703 ++ 6 files changed, 829 insertions(+) create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-jump-table.s diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h index b495eb8ef5eec..5dd0aaa48d6e7 100644 --- a/bolt/include/bolt/Core/MCInstUtils.h +++ b/bolt/include/bolt/Core/MCInstUtils.h @@ -158,6 +158,15 @@ class MCInstReference { return nullptr; } + /// Returns the only preceding instruction, or std::nullopt if multiple or no + /// predecessors are possible. + /// + /// If CFG information is available, basic block boundary can be crossed, + /// provided there is exactly one predecessor. If CFG is not available, the + /// preceding instruction in the offset order is returned, unless this is the + /// first instruction of the function. + std::optional getSinglePredecessor(); + raw_ostream &print(raw_ostream &OS) const; }; diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 87de6754017db..eb93d7de7fee9 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -14,6 +14,7 @@ #ifndef BOLT_CORE_MCPLUSBUILDER_H #define BOLT_CORE_MCPLUSBUILDER_H +#include "bolt/Core/MCInstUtils.h" #include "bolt/Core/MCPlus.h" #include "bolt/Core/Relocation.h" #include "llvm/ADT/ArrayRef.h" @@ -699,6 +700,19 @@ class MCPlusBuilder { return std::nullopt; } + /// Tests if BranchInst corresponds to an instruction sequence which is known + /// to be a safe dispatch via jump table. + /// + /// The target can decide which instruction sequences to consider "safe" from + /// the Pointer Authentication point of view, such as any jump table dispatch + /// sequence without function calls inside, any sequence which is contiguous, + /// or only some specific well-known sequences. + virtual bool + isSafeJumpTableBranchForPtrAuth(MCInstReference BranchInst) const { +llvm_unreachable("not implemented"); +return false; + } + virtual bool isTerminator(const MCInst &Inst) const; virtual bool isNoop(const MCInst &Inst) const { diff --git a/bolt/lib/Core/MCInstUtils.cpp b/bolt/lib/Core/MCInstUtils.cpp index 40f6edd59135c..b7c6d898988af 100644 --- a/bolt/lib/Core/MCInstUtils.cpp +++ b/bolt/lib/Core/MCInstUtils.cpp @@ -55,3 +55,23 @@ raw_ostream &MCInstReference::print(raw_ostream &OS) const { OS << ">"; return OS; } + +std::optional MCInstReference::getSinglePredecessor() { + if (const RefInBB *Ref = tryGetRefInBB()) { +if (Ref->It != Ref->BB->begin()) + return MCInstReference(Ref->BB, &*std::prev(Ref->It)); + +if (Ref->BB->pred_size() != 1) + return std::nullopt; + +BinaryBasicBlock *PredBB = *Ref->BB->pred_begin(); +assert(!PredBB->empty() && "Empty basic blocks are not supported yet"); +return MCInstReference(PredBB, &*PredBB->rbegin()); + } + + const RefInBF &Ref = getRefInBF(); + if (Ref.It == Ref.BF->instrs().begin()) +return std::nullopt; + + return MCInstReference(Ref.BF, std::prev(Ref.It)); +} diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index 10a3f9716c201..e96ec5996984d 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -1329,6 +1329,11 @@ shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF, return std::nullopt; } + if (BC.MIB->isSafeJumpTableBranchForPtrAuth(Inst)) { +LL
[llvm-branch-commits] [llvm] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) (PR #138883)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138883 >From 1c135a144d7f21e05c3598a992baa170cdde7950 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Wed, 7 May 2025 16:42:00 +0300 Subject: [PATCH] [BOLT] Introduce helpers to match `MCInst`s one at a time (NFC) Introduce matchInst helper function to capture and/or match the operands of MCInst. Unlike the existing `MCPlusBuilder::MCInstMatcher` machinery, matchInst is intended for the use cases when precise control over the instruction order is required. For example, when validating PtrAuth hardening, all registers are usually considered unsafe after a function call, even though callee-saved registers should preserve their old values *under normal operation*. --- bolt/include/bolt/Core/MCInstUtils.h | 128 ++ .../Target/AArch64/AArch64MCPlusBuilder.cpp | 90 +--- 2 files changed, 162 insertions(+), 56 deletions(-) diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h index a3912a8fb265a..b495eb8ef5eec 100644 --- a/bolt/include/bolt/Core/MCInstUtils.h +++ b/bolt/include/bolt/Core/MCInstUtils.h @@ -166,6 +166,134 @@ static inline raw_ostream &operator<<(raw_ostream &OS, return Ref.print(OS); } +/// Instruction-matching helpers operating on a single instruction at a time. +/// +/// Unlike MCPlusBuilder::MCInstMatcher, this matchInst() function focuses on +/// the cases where a precise control over the instruction order is important: +/// +/// // Bring the short names into the local scope: +/// using namespace MCInstMatcher; +/// // Declare the registers to capture: +/// Reg Xn, Xm; +/// // Capture the 0th and 1st operands, match the 2nd operand against the +/// // just captured Xm register, match the 3rd operand against literal 0: +/// if (!matchInst(MaybeAdd, AArch64::ADDXrs, Xm, Xn, Xm, Imm(0)) +/// return AArch64::NoRegister; +/// // Match the 0th operand against Xm: +/// if (!matchInst(MaybeBr, AArch64::BR, Xm)) +/// return AArch64::NoRegister; +/// // Return the matched register: +/// return Xm.get(); +namespace MCInstMatcher { + +// The base class to match an operand of type T. +// +// The subclasses of OpMatcher are intended to be allocated on the stack and +// to only be used by passing them to matchInst() and by calling their get() +// function, thus the peculiar `mutable` specifiers: to make the calling code +// compact and readable, the templated matchInst() function has to accept both +// long-lived Imm/Reg wrappers declared as local variables (intended to capture +// the first operand's value and match the subsequent operands, whether inside +// a single instruction or across multiple instructions), as well as temporary +// wrappers around literal values to match, f.e. Imm(42) or Reg(AArch64::XZR). +template class OpMatcher { + mutable std::optional Value; + mutable std::optional SavedValue; + + // Remember/restore the last Value - to be called by matchInst. + void remember() const { SavedValue = Value; } + void restore() const { Value = SavedValue; } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +protected: + OpMatcher(std::optional ValueToMatch) : Value(ValueToMatch) {} + + bool matchValue(T OpValue) const { +// Check that OpValue does not contradict the existing Value. +bool MatchResult = !Value || *Value == OpValue; +// If MatchResult is false, all matchers will be reset before returning from +// matchInst, including this one, thus no need to assign conditionally. +Value = OpValue; + +return MatchResult; + } + +public: + /// Returns the captured value. + T get() const { +assert(Value.has_value()); +return *Value; + } +}; + +class Reg : public OpMatcher { + bool matches(const MCOperand &Op) const { +if (!Op.isReg()) + return false; + +return matchValue(Op.getReg()); + } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +public: + Reg(std::optional RegToMatch = std::nullopt) + : OpMatcher(RegToMatch) {} +}; + +class Imm : public OpMatcher { + bool matches(const MCOperand &Op) const { +if (!Op.isImm()) + return false; + +return matchValue(Op.getImm()); + } + + template + friend bool matchInst(const MCInst &, unsigned, const OpMatchers &...); + +public: + Imm(std::optional ImmToMatch = std::nullopt) + : OpMatcher(ImmToMatch) {} +}; + +/// Tries to match Inst and updates Ops on success. +/// +/// If Inst has the specified Opcode and its operand list prefix matches Ops, +/// this function returns true and updates Ops, otherwise false is returned and +/// values of Ops are kept as before matchInst was called. +/// +/// Please note that while Ops are technically passed by a const reference to +/// make invocations like `matchInst(MI, Opcode, Imm(42))` possible, all their +/// fields are marked mut
[llvm-branch-commits] [llvm] [BOLT] Factor out MCInstReference from gadget scanner (NFC) (PR #138655)
https://github.com/atrosinenko updated https://github.com/llvm/llvm-project/pull/138655 >From cbcac1bc4612b601a5ec963663ba69a5f212feb1 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Mon, 28 Apr 2025 18:35:48 +0300 Subject: [PATCH] [BOLT] Factor out MCInstReference from gadget scanner (NFC) Move MCInstReference representing a constant reference to an instruction inside a parent entity - either inside a basic block (which has a reference to its parent function) or directly to the function (when CFG information is not available). --- bolt/include/bolt/Core/MCInstUtils.h | 172 + bolt/include/bolt/Passes/PAuthGadgetScanner.h | 180 +- bolt/lib/Core/CMakeLists.txt | 1 + bolt/lib/Core/MCInstUtils.cpp | 57 ++ bolt/lib/Passes/PAuthGadgetScanner.cpp| 99 -- 5 files changed, 272 insertions(+), 237 deletions(-) create mode 100644 bolt/include/bolt/Core/MCInstUtils.h create mode 100644 bolt/lib/Core/MCInstUtils.cpp diff --git a/bolt/include/bolt/Core/MCInstUtils.h b/bolt/include/bolt/Core/MCInstUtils.h new file mode 100644 index 0..a3912a8fb265a --- /dev/null +++ b/bolt/include/bolt/Core/MCInstUtils.h @@ -0,0 +1,172 @@ +//===- bolt/Core/MCInstUtils.h --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef BOLT_CORE_MCINSTUTILS_H +#define BOLT_CORE_MCINSTUTILS_H + +#include "bolt/Core/BinaryBasicBlock.h" + +#include +#include +#include + +namespace llvm { +namespace bolt { + +class BinaryFunction; + +/// MCInstReference represents a reference to a constant MCInst as stored either +/// in a BinaryFunction (i.e. before a CFG is created), or in a BinaryBasicBlock +/// (after a CFG is created). +class MCInstReference { + using nocfg_const_iterator = std::map::const_iterator; + + // Two cases are possible: + // * functions with CFG reconstructed - a function stores a collection of + // basic blocks, each basic block stores a contiguous vector of MCInst + // * functions without CFG - there are no basic blocks created, + // the instructions are directly stored in std::map in BinaryFunction + // + // In both cases, the direct parent of MCInst is stored together with an + // iterator pointing to the instruction. + + // Helper struct: CFG is available, the direct parent is a basic block, + // iterator's type is `MCInst *`. + struct RefInBB { +RefInBB(const BinaryBasicBlock *BB, const MCInst *Inst) +: BB(BB), It(Inst) {} +RefInBB(const RefInBB &Other) = default; +RefInBB &operator=(const RefInBB &Other) = default; + +const BinaryBasicBlock *BB; +BinaryBasicBlock::const_iterator It; + +bool operator<(const RefInBB &Other) const { + if (BB != Other.BB) +return std::less{}(BB, Other.BB); + return It < Other.It; +} + +bool operator==(const RefInBB &Other) const { + return BB == Other.BB && It == Other.It; +} + }; + + // Helper struct: CFG is *not* available, the direct parent is a function, + // iterator's type is std::map::iterator (the mapped value + // is an instruction's offset). + struct RefInBF { +RefInBF(const BinaryFunction *BF, nocfg_const_iterator It) +: BF(BF), It(It) {} +RefInBF(const RefInBF &Other) = default; +RefInBF &operator=(const RefInBF &Other) = default; + +const BinaryFunction *BF; +nocfg_const_iterator It; + +bool operator<(const RefInBF &Other) const { + if (BF != Other.BF) +return std::less{}(BF, Other.BF); + return It->first < Other.It->first; +} + +bool operator==(const RefInBF &Other) const { + return BF == Other.BF && It->first == Other.It->first; +} + }; + + std::variant Reference; + + // Utility methods to be used like this: + // + // if (auto *Ref = tryGetRefInBB()) + // return Ref->doSomething(...); + // return getRefInBF().doSomethingElse(...); + const RefInBB *tryGetRefInBB() const { +assert(std::get_if(&Reference) || + std::get_if(&Reference)); +return std::get_if(&Reference); + } + const RefInBF &getRefInBF() const { +assert(std::get_if(&Reference)); +return *std::get_if(&Reference); + } + +public: + /// Constructs an empty reference. + MCInstReference() : Reference(RefInBB(nullptr, nullptr)) {} + /// Constructs a reference to the instruction inside the basic block. + MCInstReference(const BinaryBasicBlock *BB, const MCInst *Inst) + : Reference(RefInBB(BB, Inst)) { +assert(BB && Inst && "Neither BB nor Inst should be nullptr"); + } + /// Constructs a reference to the instruction inside the basic block. + MCInstReference(const BinaryBasicBlock *BB, uns