[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to svusmmla (PR #135634)
https://github.com/banach-space edited https://github.com/llvm/llvm-project/pull/135634 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to svusmmla (PR #135634)
https://github.com/banach-space approved this pull request. LGTM % nit Thanks! https://github.com/llvm/llvm-project/pull/135634
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to svusmmla (PR #135634)
@@ -273,6 +273,34 @@ def UmmlaOp : ArmSVE_Op<"ummla",
 "$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)";
 }

+def UsmmlaOp : ArmSVE_Op<"usmmla", [Pure,
+AllTypesMatch<["src1", "src2"]>,
+AllTypesMatch<["acc", "dst"]>]> {

banach-space wrote:

This indentation is inconsistent with the other ops, but the existing indentation feels a bit ad-hoc. I like yours much more. Would you mind updating the other definitions so that we maintain consistency? That is probably best done as a separate PR to keep the history clean, but updating this PR instead would also be fine with me.

https://github.com/llvm/llvm-project/pull/135634
[llvm-branch-commits] [llvm] [LV] An attempt to cherry-pick the fix PR #132691 (cherry-pick from the main branch to the release/20.x branch) (PR #135231)
pawosm-arm wrote: Thanks @fhahn for your comment and your patch. That is a good reason to close this one. https://github.com/llvm/llvm-project/pull/135231
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -0,0 +1,113 @@
+//===- MCGOFFAttributes.h - Attributes of GOFF symbols ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Defines the various attribute collections defining GOFF symbols.
+//
+//===--===//
+
+#ifndef LLVM_MC_MCGOFFATTRIBUTES_H
+#define LLVM_MC_MCGOFFATTRIBUTES_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/BinaryFormat/GOFF.h"
+
+namespace llvm {
+namespace GOFF {
+// An "External Symbol Definition" in the GOFF file has a type, and depending on
+// the type a different subset of the fields is used.
+//
+// Unlike other formats, a 2 dimensional structure is used to define the
+// location of data. For example, the equivalent of the ELF .text section is
+// made up of a Section Definition (SD) and a class (Element Definition; ED).
+// The name of the SD symbol depends on the application, while the class has the
+// predefined name C_CODE/C_CODE64 in AMODE31 and AMODE64 respectively.
+//
+// Data can be placed into this structure in 2 ways. First, the data (in a text
+// record) can be associated with an ED symbol. To refer to data, a Label
+// Definition (LD) is used to give an offset into the data a name. When binding,
+// the whole data is pulled into the resulting executable, and the addresses
+// given by the LD symbols are resolved.
+//
+// The alternative is to use a Part Definition (PR). In this case, the data (in
+// a text record) is associated with the part. When binding, only the data of
+// referenced PRs is pulled into the resulting binary.
+//
+// Both approaches are used, which means that the equivalent of a section in ELF
+// results in 3 GOFF symbols, either SD/ED/LD or SD/ED/PR. Moreover, certain
+// sections are fine with just defining SD/ED symbols. The SymbolMapper takes
+// care of all those details.
+
+// Attributes for SD symbols.
+struct SDAttr {
+  GOFF::ESDTaskingBehavior TaskingBehavior = GOFF::ESD_TA_Unspecified;
+  GOFF::ESDBindingScope BindingScope = GOFF::ESD_BSC_Unspecified;
+};
+
+// Attributes for ED symbols.
+struct EDAttr {
+  bool IsReadOnly = false;
+  GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
+  GOFF::ESDAmode Amode;

redstar wrote:

OK, I removed this. However, it feels like there is an inconsistency in the documentation / implementations. The following HLASM code

```
stdin#C CSECT
C_WSA64 CATTR ALIGN(4),DEFLOAD,NOTEXECUTABLE,PART(a),RMODE(64)
 DC 0X
 END
```

results in `RMODE(64)` being set on the ED symbol and `AMODE(64)` being set on the PR symbol. Setting the Amode on a PR symbol makes sense to me because it is not possible to add an LD symbol to the part; doing so causes binder errors. To reference the part from a different compilation unit, I have to use `a`, so the PR also has some symbol semantics.

https://github.com/llvm/llvm-project/pull/133799
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -2759,6 +2762,29 @@ MCSection *TargetLoweringObjectFileXCOFF::getSectionForLSDA(
 //===--===//
 TargetLoweringObjectFileGOFF::TargetLoweringObjectFileGOFF() = default;

+void TargetLoweringObjectFileGOFF::getModuleMetadata(Module &M) {
+  // Construct the default names for the root SD and the ADA PR symbol.
+  StringRef FileName = sys::path::stem(M.getSourceFileName());
+  if (FileName.size() > 1 && FileName.starts_with('<') &&
+      FileName.ends_with('>'))
+    FileName = FileName.substr(1, FileName.size() - 2);
+  DefaultRootSDName = Twine(FileName).concat("#C").str();

redstar wrote:

Yes, my plan is to request an update to the documentation.

https://github.com/llvm/llvm-project/pull/133799
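The default-name rule in the hunk above can be sketched outside LLVM as follows. This is only an illustration under stated assumptions, not the patch's code: `defaultRootSDName` and `checkDefaultRootSDName` are hypothetical names, the argument is assumed to be a file stem that has already been computed (the patch obtains it with `sys::path::stem`), and `std::string` stands in for `StringRef` and `Twine`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the naming logic in getModuleMetadata():
// strip one pair of surrounding angle brackets (as in "<stdin>") and
// append "#C" to form the default root SD name.
std::string defaultRootSDName(std::string Stem) {
  if (Stem.size() > 1 && Stem.front() == '<' && Stem.back() == '>')
    Stem = Stem.substr(1, Stem.size() - 2);
  return Stem + "#C";
}

// Small usage check: a bracketed stem loses its brackets, a plain stem
// is used unchanged.
void checkDefaultRootSDName() {
  assert(defaultRootSDName("<stdin>") == "stdin#C");
  assert(defaultRootSDName("foo") == "foo#C");
}
```

Under this rule a module read from standard input gets the name `stdin#C`, which matches the `stdin#C CSECT` line in the HLASM example quoted earlier in this thread.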
[llvm-branch-commits] [lldb] release/20.x: [lldb] Respect LaunchInfo::SetExecutable in ProcessLauncherPosixFork (#133093) (PR #134079)
tstellar wrote: Ping @JDevlieghere https://github.com/llvm/llvm-project/pull/134079
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -0,0 +1,145 @@
+//===- MCSectionGOFF.cpp - GOFF Code Section Representation ---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "llvm/MC/MCSectionGOFF.h"
+#include "llvm/BinaryFormat/GOFF.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace llvm;
+
+namespace {
+void emitRMode(raw_ostream &OS, GOFF::ESDRmode Rmode, bool UseParenthesis) {
+  if (Rmode != GOFF::ESD_RMODE_None) {
+    OS << "RMODE" << (UseParenthesis ? '(' : ' ');
+    switch (Rmode) {
+    case GOFF::ESD_RMODE_24:
+      OS << "24";
+      break;
+    case GOFF::ESD_RMODE_31:
+      OS << "31";
+      break;
+    case GOFF::ESD_RMODE_64:
+      OS << "64";
+      break;
+    case GOFF::ESD_RMODE_None:
+      break;
+    }
+    if (UseParenthesis)
+      OS << ')';
+  }
+}
+
+void emitCATTR(raw_ostream &OS, StringRef Name, StringRef ParentName,
+               bool EmitAmodeAndRmode, GOFF::ESDAmode Amode,
+               GOFF::ESDRmode Rmode, GOFF::ESDAlignment Alignment,
+               GOFF::ESDLoadingBehavior LoadBehavior,
+               GOFF::ESDExecutable Executable, bool IsReadOnly,
+               StringRef PartName) {
+  if (EmitAmodeAndRmode && Amode != GOFF::ESD_AMODE_None) {
+    OS << ParentName << " AMODE ";
+    switch (Amode) {
+    case GOFF::ESD_AMODE_24:
+      OS << "24";
+      break;
+    case GOFF::ESD_AMODE_31:
+      OS << "31";
+      break;
+    case GOFF::ESD_AMODE_ANY:
+      OS << "ANY";
+      break;
+    case GOFF::ESD_AMODE_64:
+      OS << "64";
+      break;
+    case GOFF::ESD_AMODE_MIN:
+      OS << "ANY64";
+      break;
+    case GOFF::ESD_AMODE_None:
+      break;
+    }
+    OS << "\n";
+  }
+  if (EmitAmodeAndRmode && Rmode != GOFF::ESD_RMODE_None) {
+    OS << ParentName << ' ';
+    emitRMode(OS, Rmode, /*UseParenthesis=*/false);

redstar wrote:

I changed it; only the `CATTR RMODE` is now emitted.

https://github.com/llvm/llvm-project/pull/133799
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
https://github.com/banach-space commented: Thanks! This one is a bit longer, so I may need to wait till Thursday before I can review. One high-level question - would sharing some code between NEON and SVE be possible? https://github.com/llvm/llvm-project/pull/135636
[llvm-branch-commits] [llvm] [LV] An attempt to cherry-pick the fix PR #132691 (cherry-pick from the main branch to the release/20.x branch) (PR #135231)
https://github.com/pawosm-arm closed https://github.com/llvm/llvm-project/pull/135231
[llvm-branch-commits] [clang] 2e7710e - [clang] Introduce "binary" StringLiteral for #embed data (#127629)
Author: Mariya Podchishchaeva
Date: 2025-04-14T12:26:02-07:00
New Revision: 2e7710eaffddcbb6094e32826ec6e69bb4cb1799

URL: https://github.com/llvm/llvm-project/commit/2e7710eaffddcbb6094e32826ec6e69bb4cb1799
DIFF: https://github.com/llvm/llvm-project/commit/2e7710eaffddcbb6094e32826ec6e69bb4cb1799.diff

LOG: [clang] Introduce "binary" StringLiteral for #embed data (#127629)

StringLiteral is used as internal data of EmbedExpr and we directly use it as an initializer if a single EmbedExpr appears in the initializer list of a char array. It is fast and convenient, but it is causing problems when string literal character values are checked because #embed data values are within a range [0-2^(char width)] but ordinary StringLiteral is of maybe signed char type.

This PR introduces new kind of StringLiteral to hold binary data coming from an embedded resource to mitigate these problems. The new kind of StringLiteral is not assumed to have signed char type. The new kind of StringLiteral also helps to prevent crashes when trying to find StringLiteral token locations since these simply do not exist for binary data.

Fixes https://github.com/llvm/llvm-project/issues/119256

Added:
    clang/test/Preprocessor/embed_constexpr.c

Modified:
    clang/include/clang/AST/Expr.h
    clang/lib/AST/Expr.cpp
    clang/lib/Parse/ParseInit.cpp
    clang/lib/Sema/SemaInit.cpp

Removed:

diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index 7be4022649329..06ac0f1704aa9 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -1752,7 +1752,14 @@ enum class StringLiteralKind {
   UTF8,
   UTF16,
   UTF32,
-  Unevaluated
+  Unevaluated,
+  // Binary kind of string literal is used for the data coming via #embed
+  // directive. File's binary contents is transformed to a special kind of
+  // string literal that in some cases may be used directly as an initializer
+  // and some features of classic string literals are not applicable to this
+  // kind of a string literal, for example finding a particular byte's source
+  // location for better diagnosing.
+  Binary
 };

 /// StringLiteral - This represents a string literal expression, e.g. "foo"
@@ -1884,6 +1891,8 @@ class StringLiteral final
   int64_t getCodeUnitS(size_t I, uint64_t BitWidth) const {
     int64_t V = getCodeUnit(I);
     if (isOrdinary() || isWide()) {
+      // Ordinary and wide string literals have types that can be signed.
+      // It is important for checking C23 constexpr initializers.
       unsigned Width = getCharByteWidth() * BitWidth;
       llvm::APInt AInt(Width, (uint64_t)V);
       V = AInt.getSExtValue();
@@ -4965,9 +4974,9 @@ class EmbedExpr final : public Expr {
       assert(EExpr && CurOffset != ULLONG_MAX &&
              "trying to dereference an invalid iterator");
       IntegerLiteral *N = EExpr->FakeChildNode;
-      StringRef DataRef = EExpr->Data->BinaryData->getBytes();
       N->setValue(*EExpr->Ctx,
-                  llvm::APInt(N->getValue().getBitWidth(), DataRef[CurOffset],
+                  llvm::APInt(N->getValue().getBitWidth(),
+                              EExpr->Data->BinaryData->getCodeUnit(CurOffset),
                               N->getType()->isSignedIntegerType()));
       // We want to return a reference to the fake child node in the
       // EmbedExpr, not the local variable N.
diff --git a/clang/lib/AST/Expr.cpp b/clang/lib/AST/Expr.cpp
index aa7e14329a21b..8571b617c70eb 100644
--- a/clang/lib/AST/Expr.cpp
+++ b/clang/lib/AST/Expr.cpp
@@ -1104,6 +1104,7 @@ unsigned StringLiteral::mapCharByteWidth(TargetInfo const &Target,
   switch (SK) {
   case StringLiteralKind::Ordinary:
   case StringLiteralKind::UTF8:
+  case StringLiteralKind::Binary:
     CharByteWidth = Target.getCharWidth();
     break;
   case StringLiteralKind::Wide:
@@ -1216,6 +1217,7 @@ void StringLiteral::outputString(raw_ostream &OS) const {
   switch (getKind()) {
   case StringLiteralKind::Unevaluated:
   case StringLiteralKind::Ordinary:
+  case StringLiteralKind::Binary:
     break; // no prefix.
   case StringLiteralKind::Wide:
     OS << 'L';
@@ -1332,6 +1334,11 @@ StringLiteral::getLocationOfByte(unsigned ByteNo, const SourceManager &SM,
                                  const LangOptions &Features,
                                  const TargetInfo &Target, unsigned *StartToken,
                                  unsigned *StartTokenByteOffset) const {
+  // No source location of bytes for binary literals since they don't come from
+  // source.
+  if (getKind() == StringLiteralKind::Binary)
+    return getStrTokenLoc(0);
+
   assert((getKind() == StringLiteralKind::Ordinary ||
           getKind() == StringLiteralKind::UTF8 ||
           getKind() == StringLiteralKind::Unevaluated) &&
diff --git a/clang/lib/Parse/ParseInit.cpp b/clang/lib/Parse/ParseInit.cpp
index 63b1d7bd9db53..471b3eaf
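The signedness problem described in the log message can be illustrated with a small standalone sketch. It is not Clang code: `LiteralKind` and `codeUnitValue` are hypothetical names, and the sketch assumes an 8-bit `char` that is signed for ordinary literals. It only shows why the same byte yields different values depending on whether the literal's element type is treated as signed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the two literal kinds discussed above.
enum class LiteralKind { Ordinary, Binary };

// Returns the value of one 8-bit code unit as a 64-bit integer.
// Ordinary literals are modelled as having a signed char type here, so a
// byte with the top bit set is sign-extended. Binary (#embed) data is
// always treated as an unsigned value in [0, 255].
int64_t codeUnitValue(uint8_t Byte, LiteralKind Kind) {
  if (Kind == LiteralKind::Ordinary && Byte >= 0x80)
    return static_cast<int64_t>(Byte) - 256;
  return static_cast<int64_t>(Byte);
}

// Usage check: byte 0xFF reads as -1 when signed and as 255 when binary.
void checkCodeUnitValue() {
  assert(codeUnitValue(0xFF, LiteralKind::Ordinary) == -1);
  assert(codeUnitValue(0xFF, LiteralKind::Binary) == 255);
}
```

A check such as "does this initializer value fit the destination type" gives different answers for -1 and 255, which is the kind of mismatch the new literal kind is meant to avoid.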
[llvm-branch-commits] [clang] [X86] Backport saturate-convert intrinsics renaming & YMM rounding intrinsics removal in AVX10.2 (PR #135549)
github-actions[bot] wrote: @phoebewang (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/135549
[llvm-branch-commits] [llvm] release/20.x: [LLVM][MemCpyOpt] Unify alias tags if we optimize allocas (#129537) (PR #135615)
tstellar wrote: @nikic was an approver for the PR in main. https://github.com/llvm/llvm-project/pull/135615
[llvm-branch-commits] [clang] release/20.x: [Clang] Fix a lambda pattern comparison mismatch after ecc7e6ce4 (#133863) (PR #134194)
tstellar wrote: Has this fix had enough time in main yet? https://github.com/llvm/llvm-project/pull/134194
[llvm-branch-commits] [llvm] release/20.x: [llvm][CodeGen] avoid repeated interval calculation in window scheduler (#132352) (PR #134775)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/134775
[llvm-branch-commits] [llvm] release/20.x: [SCEV] Use ashr to adjust constant multipliers (#135534) (PR #135543)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/135543

>From 0dd4235473d4f5a99c46ea631351616d62e9b32e Mon Sep 17 00:00:00 2001
From: Yingwei Zheng
Date: Sun, 13 Apr 2025 20:22:48 +0800
Subject: [PATCH] [SCEV] Use ashr to adjust constant multipliers (#135534)

SCEV converts "-2 *nsw (i32 V)" into "2147483647 *nsw (i32 V)". But we cannot preserve the nsw flag when the constant multiplier is negative. This patch changes lshr to ashr so that we can preserve both nsw and nuw flags.

Alive2 proof: https://alive2.llvm.org/ce/z/LZVSEa
Closes https://github.com/llvm/llvm-project/issues/135531.

(cherry picked from commit bb9580a02b393683ff0b6c360df684f33c715a1f)
---
 llvm/lib/Analysis/ScalarEvolution.cpp | 2 +-
 .../test/Analysis/ScalarEvolution/pr135531.ll | 19 +++
 2 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Analysis/ScalarEvolution/pr135531.ll

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index b8069df4e6598..36fe036aa9e9f 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -7854,7 +7854,7 @@ const SCEV *ScalarEvolution::createSCEV(Value *V) {
 unsigned GCD = std::min(MulZeros, TZ);
 APInt DivAmt = APInt::getOneBitSet(BitWidth, TZ - GCD);
 SmallVector MulOps;
-MulOps.push_back(getConstant(OpC->getAPInt().lshr(GCD)));
+MulOps.push_back(getConstant(OpC->getAPInt().ashr(GCD)));
 append_range(MulOps, LHSMul->operands().drop_front());
 auto *NewMul = getMulExpr(MulOps, LHSMul->getNoWrapFlags());
 ShiftedLHS = getUDivExpr(NewMul, getConstant(DivAmt));
diff --git a/llvm/test/Analysis/ScalarEvolution/pr135531.ll b/llvm/test/Analysis/ScalarEvolution/pr135531.ll
new file mode 100644
index 0..e172d56d3a515
--- /dev/null
+++ b/llvm/test/Analysis/ScalarEvolution/pr135531.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -disable-output -passes='print<scalar-evolution>' < %s 2>&1 | FileCheck %s
+
+define i32 @pr135511(i32 %x) {
+; CHECK-LABEL: 'pr135511'
+; CHECK-NEXT: Classifying expressions for: @pr135511
+; CHECK-NEXT:    %and = and i32 %x, 16382
+; CHECK-NEXT:    --> (2 * (zext i13 (trunc i32 (%x /u 2) to i13) to i32)) U: [0,16383) S: [0,16383)
+; CHECK-NEXT:    %neg = sub nsw i32 0, %and
+; CHECK-NEXT:    --> (-2 * (zext i13 (trunc i32 (%x /u 2) to i13) to i32)) U: [0,-1) S: [-16382,1)
+; CHECK-NEXT:    %res = and i32 %neg, 268431360
+; CHECK-NEXT:    --> (4096 * (zext i16 (trunc i32 ((-1 * (zext i13 (trunc i32 (%x /u 2) to i13) to i32)) /u 2048) to i16) to i32)) U: [0,268431361) S: [0,268431361)
+; CHECK-NEXT: Determining loop execution counts for: @pr135511
+;
+  %and = and i32 %x, 16382
+  %neg = sub nsw i32 0, %and
+  %res = and i32 %neg, 268431360
+  ret i32 %res
+}
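The arithmetic behind the one-line change can be checked with a small standalone sketch. It does not use LLVM's `APInt`; `lshr32`, `ashr32`, `asSigned32` and `mulOverflowsSigned32` are hypothetical helpers that model 32-bit shifts and a no-signed-wrap check, and the second operand 2 in the usage check is an arbitrary example value.

```cpp
#include <cassert>
#include <cstdint>

// Logical shift right on a 32-bit pattern: vacated high bits become 0.
uint32_t lshr32(uint32_t V, unsigned Amt) { return V >> Amt; }

// Arithmetic shift right on a 32-bit pattern: vacated high bits copy the
// sign bit. Written with unsigned arithmetic only. Assumes Amt < 32.
uint32_t ashr32(uint32_t V, unsigned Amt) {
  uint32_t Shifted = V >> Amt;
  if ((V >> 31) != 0 && Amt != 0)
    Shifted |= static_cast<uint32_t>(~(UINT32_MAX >> Amt));
  return Shifted;
}

// Reinterprets a 32-bit pattern as a signed two's-complement value.
int64_t asSigned32(uint32_t V) {
  return V >= 0x80000000u ? static_cast<int64_t>(V) - 4294967296LL
                          : static_cast<int64_t>(V);
}

// True if A * B does not fit in a signed 32-bit integer, i.e. an `nsw`
// multiply of these operands would overflow.
bool mulOverflowsSigned32(int64_t A, int64_t B) {
  int64_t Product = A * B;
  return Product < INT32_MIN || Product > INT32_MAX;
}

// Usage check for the multiplier -2 (bit pattern 0xFFFFFFFE) shifted by 1.
void checkShiftedMultiplier() {
  // lshr turns -2 into the large positive constant 2147483647.
  assert(lshr32(0xFFFFFFFEu, 1) == 2147483647u);
  // ashr keeps the sign and yields -1.
  assert(asSigned32(ashr32(0xFFFFFFFEu, 1)) == -1);
  // Multiplying by 2147483647 can overflow where multiplying by -1 does not.
  assert(mulOverflowsSigned32(asSigned32(lshr32(0xFFFFFFFEu, 1)), 2));
  assert(!mulOverflowsSigned32(asSigned32(ashr32(0xFFFFFFFEu, 1)), 2));
}
```

This is why keeping the `nsw` flag on the rewritten multiply is only sound when the shifted constant keeps the sign of the original one.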
[llvm-branch-commits] [SPARC] Use lzcnt to implement CTLZ when we have VIS3 (PR #135715)
@@ -0,0 +1,171 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=sparcv9 | FileCheck %s -check-prefix=V9
+; RUN: llc < %s -mtriple=sparcv9 -mattr=popc | FileCheck %s -check-prefix=POPC
+; RUN: llc < %s -mtriple=sparcv9 -mattr=vis3 | FileCheck %s -check-prefix=VIS3
+
+define i32 @f(i32 %x) nounwind {
+; V9-LABEL: f:
+; V9: ! %bb.0: ! %entry
+; V9-NEXT:    srl %o0, 1, %o1
+; V9-NEXT:    or %o0, %o1, %o1
+; V9-NEXT:    srl %o1, 2, %o2
+; V9-NEXT:    or %o1, %o2, %o1
+; V9-NEXT:    srl %o1, 4, %o2
+; V9-NEXT:    or %o1, %o2, %o1
+; V9-NEXT:    srl %o1, 8, %o2
+; V9-NEXT:    or %o1, %o2, %o1
+; V9-NEXT:    srl %o1, 16, %o2
+; V9-NEXT:    or %o1, %o2, %o1
+; V9-NEXT:    xor %o1, -1, %o1
+; V9-NEXT:    srl %o1, 1, %o2
+; V9-NEXT:    sethi 1398101, %o3
+; V9-NEXT:    or %o3, 341, %o3
+; V9-NEXT:    and %o2, %o3, %o2
+; V9-NEXT:    sub %o1, %o2, %o1
+; V9-NEXT:    sethi 838860, %o2
+; V9-NEXT:    or %o2, 819, %o2
+; V9-NEXT:    and %o1, %o2, %o3
+; V9-NEXT:    srl %o1, 2, %o1
+; V9-NEXT:    and %o1, %o2, %o1
+; V9-NEXT:    add %o3, %o1, %o1
+; V9-NEXT:    srl %o1, 4, %o2
+; V9-NEXT:    add %o1, %o2, %o1
+; V9-NEXT:    sethi 246723, %o2
+; V9-NEXT:    or %o2, 783, %o2
+; V9-NEXT:    and %o1, %o2, %o1
+; V9-NEXT:    sll %o1, 8, %o2
+; V9-NEXT:    add %o1, %o2, %o1
+; V9-NEXT:    sll %o1, 16, %o2
+; V9-NEXT:    add %o1, %o2, %o1
+; V9-NEXT:    srl %o1, 24, %o1
+; V9-NEXT:    cmp %o0, 0
+; V9-NEXT:    move %icc, 0, %o1
+; V9-NEXT:    retl
+; V9-NEXT:    mov %o1, %o0
+;
+; POPC-LABEL: f:
+; POPC: ! %bb.0: ! %entry
+; POPC-NEXT:    srl %o0, 1, %o1
+; POPC-NEXT:    or %o0, %o1, %o1
+; POPC-NEXT:    srl %o1, 2, %o2
+; POPC-NEXT:    or %o1, %o2, %o1
+; POPC-NEXT:    srl %o1, 4, %o2
+; POPC-NEXT:    or %o1, %o2, %o1
+; POPC-NEXT:    srl %o1, 8, %o2
+; POPC-NEXT:    or %o1, %o2, %o1
+; POPC-NEXT:    srl %o1, 16, %o2
+; POPC-NEXT:    or %o1, %o2, %o1
+; POPC-NEXT:    xor %o1, -1, %o1
+; POPC-NEXT:    srl %o1, 0, %o1
+; POPC-NEXT:    popc %o1, %o1
+; POPC-NEXT:    cmp %o0, 0
+; POPC-NEXT:    move %icc, 0, %o1
+; POPC-NEXT:    retl
+; POPC-NEXT:    mov %o1, %o0
+;
+; VIS3-LABEL: f:
+; VIS3: ! %bb.0: ! %entry
+; VIS3-NEXT:    srl %o0, 0, %o1
+; VIS3-NEXT:    lzcnt %o1, %o1
+; VIS3-NEXT:    add %o1, -32, %o1
+; VIS3-NEXT:    cmp %o0, 0
+; VIS3-NEXT:    move %icc, 0, %o1
+; VIS3-NEXT:    retl
+; VIS3-NEXT:    mov %o1, %o0
+entry:
+  %0 = call i32 @llvm.ctlz.i32(i32 %x, i1 true)
+  %1 = icmp eq i32 %x, 0
+  %2 = select i1 %1, i32 0, i32 %0
+  %3 = trunc i32 %2 to i8
+  %conv = zext i8 %3 to i32
+  ret i32 %conv
+}
+
+define i64 @g(i64 %x) nounwind {
+; V9-LABEL: g:
+; V9: ! %bb.0: ! %entry
+; V9-NEXT:    srlx %o0, 1, %o1
+; V9-NEXT:    or %o0, %o1, %o1
+; V9-NEXT:    srlx %o1, 2, %o2
+; V9-NEXT:    or %o1, %o2, %o1
+; V9-NEXT:    srlx %o1, 4, %o2
+; V9-NEXT:    or %o1, %o2, %o1
+; V9-NEXT:    srlx %o1, 8, %o2
+; V9-NEXT:    or %o1, %o2, %o1
+; V9-NEXT:    srlx %o1, 16, %o2
+; V9-NEXT:    or %o1, %o2, %o1
+; V9-NEXT:    srlx %o1, 32, %o2
+; V9-NEXT:    or %o1, %o2, %o1
+; V9-NEXT:    xor %o1, -1, %o1
+; V9-NEXT:    srlx %o1, 1, %o2
+; V9-NEXT:    sethi 1398101, %o3
+; V9-NEXT:    or %o3, 341, %o3
+; V9-NEXT:    sllx %o3, 32, %o4
+; V9-NEXT:    or %o4, %o3, %o3
+; V9-NEXT:    and %o2, %o3, %o2
+; V9-NEXT:    sub %o1, %o2, %o1
+; V9-NEXT:    sethi 838860, %o2
+; V9-NEXT:    or %o2, 819, %o2
+; V9-NEXT:    sllx %o2, 32, %o3
+; V9-NEXT:    or %o3, %o2, %o2
+; V9-NEXT:    and %o1, %o2, %o3
+; V9-NEXT:    srlx %o1, 2, %o1
+; V9-NEXT:    and %o1, %o2, %o1
+; V9-NEXT:    add %o3, %o1, %o1
+; V9-NEXT:    srlx %o1, 4, %o2
+; V9-NEXT:    add %o1, %o2, %o1
+; V9-NEXT:    sethi 246723, %o2
+; V9-NEXT:    or %o2, 783, %o2
+; V9-NEXT:    sllx %o2, 32, %o3
+; V9-NEXT:    or %o3, %o2, %o2
+; V9-NEXT:    and %o1, %o2, %o1
+; V9-NEXT:    sethi 16448, %o2
+; V9-NEXT:    or %o2, 257, %o2
+; V9-NEXT:    sllx %o2, 32, %o3
+; V9-NEXT:    or %o3, %o2, %o2
+; V9-NEXT:    mulx %o1, %o2, %o1
+; V9-NEXT:    srlx %o1, 56, %o1
+; V9-NEXT:    movrz %o0, 0, %o1
+; V9-NEXT:    retl
+; V9-NEXT:    mov %o1, %o0
+;
+; POPC-LABEL: g:
+; POPC: ! %bb.0: ! %entry
+; POPC-NEXT:    srlx %o0, 1, %o1
+; POPC-NEXT:    or %o0, %o1, %o1
+; POPC-NEXT:    srlx %o1, 2, %o2
+; POPC-NEXT:    or %o1, %o2, %o1
+; POPC-NEXT:    srlx %o1, 4, %o2
+; POPC-NEXT:    or %o1, %o2, %o1
+; POPC-NEXT:    srlx %o1, 8, %o2
+; POPC-NEXT:    or %o1, %o2, %o1
+; POPC-NEXT:    srlx %o1, 16, %o2
+; POPC-NEXT:    or %o1, %o2, %o1
+; POPC-NEXT:    srlx %o1, 32, %o2
+; POPC-NEXT:    or %o1, %o2, %o1
+; POPC-NEXT:    xor %o1, -1, %o1
+; POPC-NEXT:    popc %o1, %o1
+; POPC-NEXT:    movrz %o0, 0, %o1
+; POPC-NEXT:    retl
+; POPC-NEXT:    mov %o1, %o0
+;
+; VIS3-LABEL: g:
+; VIS3: ! %bb.0: ! %entry
+; VIS3-NEXT:    lzcnt %o0, %o1
+; VIS3-NEXT:    movrz %o0, 0, %o1
+; VIS3-NEXT:    retl
+; VIS3-NEXT:    mov %o1, %o0
+entry:
+  %0 = call i64 @llvm.ctlz.i64(i64 %x, i1 true)
+  %1 = icmp eq i64 %x, 0
+  %
[llvm-branch-commits] [SPARC] Use lzcnt to implement CTLZ when we have VIS3 (PR #135715)
@@ -303,4 +303,10 @@ def : Pat<(i64 (mulhs i64:$lhs, i64:$rhs)),
           (SUBrr (UMULXHI $lhs, $rhs),
                  (ADDrr (ANDrr (SRAXri $lhs, 63), $rhs),
                         (ANDrr (SRAXri $rhs, 63), $lhs)))>;
+
+def : Pat<(i64 (ctlz i64:$src)), (LZCNT $src)>;
+// 32-bit LZCNT.
+// The zero extension will leave us with 32 extra leading zeros,
+// so we need to compensate for it.
+def : Pat<(i32 (ctlz i32:$src)), (ADDri (LZCNT (SRLri $src, 0)), (i32 -32))>;

s-barannikov wrote:

It may make sense to `Promote` 32-bit ctlz. IIUC, the DAG type legalizer does the same expansion, but has the benefit of potentially optimizing the `shr` and `add` with outer instructions during the DAG combining phase.

https://github.com/llvm/llvm-project/pull/135715
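The compensation described in the pattern's comment can be modelled in a few lines. This is a sketch under stated assumptions rather than the backend's implementation: `lzcnt64` is a portable stand-in for the 64-bit `lzcnt` instruction (assumed here to return 64 for a zero input), and `ctlz32ViaLzcnt64` is a hypothetical name for the zero-extend, count, subtract-32 sequence.

```cpp
#include <cassert>
#include <cstdint>

// Portable model of a 64-bit leading-zero count; returns 64 for an input of 0.
unsigned lzcnt64(uint64_t V) {
  unsigned Count = 0;
  for (uint64_t Bit = 1ULL << 63; Bit != 0 && (V & Bit) == 0; Bit >>= 1)
    ++Count;
  return Count;
}

// 32-bit count built the way the pattern above does it: zero-extend the
// operand to 64 bits (the role of `SRLri $src, 0`), count leading zeros,
// then subtract the 32 zeros introduced by the extension.
unsigned ctlz32ViaLzcnt64(uint32_t V) {
  return lzcnt64(static_cast<uint64_t>(V)) - 32;
}

// Usage check on a few representative inputs.
void checkCtlz32() {
  assert(ctlz32ViaLzcnt64(1u) == 31);
  assert(ctlz32ViaLzcnt64(0x80000000u) == 0);
}
```

Because the zero-extended value always has at least 32 leading zeros, the subtraction never underflows in this model.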
[llvm-branch-commits] [SPARC] Use umulxhi to do extending 64x64->128 multiply when we have VIS3 (PR #135714)
@@ -294,4 +294,13 @@ def : Pat<(f32 fpnegimm0), (FNEGS (FZEROS))>;
 // VIS3 instruction patterns.
 let Predicates = [HasVIS3] in {
 def : Pat<(i64 (adde i64:$lhs, i64:$rhs)), (ADDXCCC $lhs, $rhs)>;
+
+def : Pat<(i64 (mulhu i64:$lhs, i64:$rhs)), (UMULXHI $lhs, $rhs)>;
+// Signed "MULXHI".
+// Based on the formula presented in OSA2011 §7.140, but with bitops to select
+// the values to be added.
+def : Pat<(i64 (mulhs i64:$lhs, i64:$rhs)),
+          (SUBrr (UMULXHI $lhs, $rhs),
+                 (ADDrr (ANDrr (SRAXri $lhs, 63), $rhs),
+                        (ANDrr (SRAXri $rhs, 63), $lhs)))>;

s-barannikov wrote:

Does it produce better code than setting MULHS to Expand?

https://github.com/llvm/llvm-project/pull/135714
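For readers following the signed pattern above, here is a small standalone sketch of the identity it appears to rely on: the signed high half equals the unsigned high half minus `rhs` when `lhs` is negative and minus `lhs` when `rhs` is negative, all modulo 2^64. This is not code from the patch. The helper names (`umulhi`, `mulhs_ref`, `mulhs_via_umulhi`) are made up for illustration, and the sketch assumes the GCC/Clang `__int128` extension and an arithmetic right shift for negative values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: high 64 bits of the unsigned 64x64->128 product,
// i.e. the value the mulhu pattern selects UMULXHI for.
static uint64_t umulhi(uint64_t a, uint64_t b) {
  return static_cast<uint64_t>((static_cast<unsigned __int128>(a) * b) >> 64);
}

// Reference result: signed high half computed directly in 128 bits.
static int64_t mulhs_ref(int64_t a, int64_t b) {
  return static_cast<int64_t>((static_cast<__int128>(a) * b) >> 64);
}

// Same shape as the mulhs pattern:
//   SUB(UMULXHI(a, b), ADD(AND(a >>s 63, b), AND(b >>s 63, a)))
static int64_t mulhs_via_umulhi(int64_t a, int64_t b) {
  uint64_t ua = static_cast<uint64_t>(a);
  uint64_t ub = static_cast<uint64_t>(b);
  // Arithmetic shift by 63 yields all ones for a negative input, else zero.
  uint64_t maskA = static_cast<uint64_t>(a >> 63);
  uint64_t maskB = static_cast<uint64_t>(b >> 63);
  // The masks select which operand has to be subtracted back out.
  uint64_t correction = (maskA & ub) + (maskB & ua);
  return static_cast<int64_t>(umulhi(ua, ub) - correction);
}
```

Comparing `mulhs_via_umulhi` against `mulhs_ref` on a few corner cases (for example both operands equal to `INT64_MIN`) is a quick way to sanity-check the formula. Whether this sequence beats the generic Expand lowering is exactly the reviewer's question and is not answered by the sketch.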
[llvm-branch-commits] [llvm] [llvm][IR] Treat memcmp and bcmp as libcalls (PR #135706)
https://github.com/ilovepi created https://github.com/llvm/llvm-project/pull/135706

Since the backend may emit calls to these functions, they should be treated like other libcalls. If we don't, then it is possible to have their definitions removed during LTO because they are dead, only to have a later transform introduce calls to them.

See https://discourse.llvm.org/t/rfc-addressing-deficiencies-in-llvm-s-lto-implementation/84999 for more information.

>From af02216b9358166b635335491934ff44cdcc89a5 Mon Sep 17 00:00:00 2001
From: Paul Kirth
Date: Mon, 14 Apr 2025 08:25:15 -0700
Subject: [PATCH] [llvm][IR] Treat memcmp and bcmp as libcalls

Since the backend may emit calls to these functions, they should be treated like other libcalls. If we don't, then it is possible to have their definitions removed during LTO because they are dead, only to have a later transform introduce calls to them.

See https://discourse.llvm.org/t/rfc-addressing-deficiencies-in-llvm-s-lto-implementation/84999 for more information.
---
 llvm/include/llvm/IR/RuntimeLibcalls.def       | 2 ++
 llvm/test/LTO/Resolution/RISCV/bcmp-libcall.ll | 3 +--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/IR/RuntimeLibcalls.def b/llvm/include/llvm/IR/RuntimeLibcalls.def
index 2545aebc73391..2c72bc8c012cc 100644
--- a/llvm/include/llvm/IR/RuntimeLibcalls.def
+++ b/llvm/include/llvm/IR/RuntimeLibcalls.def
@@ -513,6 +513,8 @@ HANDLE_LIBCALL(UO_PPCF128, "__gcc_qunord")
 HANDLE_LIBCALL(MEMCPY, "memcpy")
 HANDLE_LIBCALL(MEMMOVE, "memmove")
 HANDLE_LIBCALL(MEMSET, "memset")
+HANDLE_LIBCALL(MEMCMP, "memcmp")
+HANDLE_LIBCALL(BCMP, "bcmp")
 // DSEPass can emit calloc if it finds a pair of malloc/memset
 HANDLE_LIBCALL(CALLOC, "calloc")
 HANDLE_LIBCALL(BZERO, nullptr)
diff --git a/llvm/test/LTO/Resolution/RISCV/bcmp-libcall.ll b/llvm/test/LTO/Resolution/RISCV/bcmp-libcall.ll
index 4c6bebf69a074..80421cd9350c8 100644
--- a/llvm/test/LTO/Resolution/RISCV/bcmp-libcall.ll
+++ b/llvm/test/LTO/Resolution/RISCV/bcmp-libcall.ll
@@ -29,8 +29,7 @@ define i1 @foo(ptr %0, [2 x i32] %1) {
 declare i32 @memcmp(ptr, ptr, i32)
 
 ;; Ensure bcmp is removed from module. Follow up patches can address this.
-; INTERNALIZE-NOT: declare{{.*}}i32 @bcmp
-; INTERNALIZE-NOT: define{{.*}}i32 @bcmp
+; INTERNALIZE: define{{.*}}i32 @bcmp
 define i32 @bcmp(ptr %0, ptr %1, i32 %2) {
   ret i32 0
 }
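The `HANDLE_LIBCALL` entries touched by this patch follow the X-macro idiom: one list of (enumerator, symbol name) pairs that consumers expand in different ways. The sketch below illustrates the idiom only. It is not LLVM's implementation, and every name in it (`MY_LIBCALLS`, `MyLibcall`, `MyLibcallNames`, `isPreservedLibcallName`) is hypothetical. It assumes a consumer that wants to know whether a symbol name is in the list, for example to avoid dropping its definition.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a .def file. Each entry pairs an enumerator with
// the symbol name a backend might call; nullptr means "no default name".
#define MY_LIBCALLS(X)                                                         \
  X(MEMCPY, "memcpy")                                                          \
  X(MEMMOVE, "memmove")                                                        \
  X(MEMSET, "memset")                                                          \
  X(MEMCMP, "memcmp")                                                          \
  X(BCMP, "bcmp")                                                              \
  X(BZERO, nullptr)

// First expansion: one enumerator per entry, plus a sentinel.
enum MyLibcall {
#define X(code, name) code,
  MY_LIBCALLS(X)
#undef X
  UNKNOWN_LIBCALL
};

// Second expansion: the matching table of symbol names.
static const char *const MyLibcallNames[] = {
#define X(code, name) name,
    MY_LIBCALLS(X)
#undef X
};

// Returns true if `sym` names one of the listed runtime functions.
// Entries with a null name are skipped.
static bool isPreservedLibcallName(const char *sym) {
  for (const char *name : MyLibcallNames)
    if (name && std::strcmp(name, sym) == 0)
      return true;
  return false;
}
```

With this shape, adding an entry to the list is enough for every expansion to pick it up, which is presumably why the patch only needs two new lines in `RuntimeLibcalls.def`.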
[llvm-branch-commits] [llvm] [llvm][IR] Treat memcmp and bcmp as libcalls (PR #135706)
llvmbot wrote:

@llvm/pr-subscribers-lto

Author: Paul Kirth (ilovepi)

Changes

Since the backend may emit calls to these functions, they should be treated like other libcalls. If we don't, then it is possible to have their definitions removed during LTO because they are dead, only to have a later transform introduce calls to them.

See https://discourse.llvm.org/t/rfc-addressing-deficiencies-in-llvm-s-lto-implementation/84999 for more information.

---
Full diff: https://github.com/llvm/llvm-project/pull/135706.diff

2 Files Affected:

- (modified) llvm/include/llvm/IR/RuntimeLibcalls.def (+2)
- (modified) llvm/test/LTO/Resolution/RISCV/bcmp-libcall.ll (+1-2)

https://github.com/llvm/llvm-project/pull/135706
[llvm-branch-commits] [llvm] [llvm][IR] Treat memcmp and bcmp as libcalls (PR #135706)
https://github.com/ilovepi marked this pull request as ready for review.

https://github.com/llvm/llvm-project/pull/135706
[llvm-branch-commits] [compiler-rt] [llvm] Reentry (PR #135656)
github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning: You can test this locally with the following command: ``bash git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp compiler-rt/lib/ctx_profile/tests/CtxInstrProfilingTest.cpp llvm/include/llvm/ProfileData/CtxInstrContextNode.h llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp `` View the diff from clang-format here. ``diff diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp index 2e26541c1..0261bab53 100644 --- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp +++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp @@ -55,7 +55,7 @@ void onFunctionExited(void *Address) { } // Returns true if it was entered the first time -bool rootEnterIsFirst(void* Address) { +bool rootEnterIsFirst(void *Address) { bool Ret = true; if (!EnteredContextAddress) { EnteredContextAddress = Address; @@ -67,14 +67,13 @@ bool rootEnterIsFirst(void* Address) { } // Return true if this also exits the root. -bool exitsRoot(void* Address) { +bool exitsRoot(void *Address) { onFunctionExited(Address); if (UnderContextRefCount == 0) { EnteredContextAddress = nullptr; return true; } return false; - } bool hasEnteredARoot() { return UnderContextRefCount > 0; } @@ -367,8 +366,7 @@ ContextNode *getOrStartContextOutsideCollection(FunctionData &Data, // If we didn't start profiling, or if we are under a context, just not // collecting, return the scratch buffer. 
- if (hasEnteredARoot() || - !__sanitizer::atomic_load_relaxed(&ProfilingStarted)) + if (hasEnteredARoot() || !__sanitizer::atomic_load_relaxed(&ProfilingStarted)) return TheScratchContext; return markAsScratch( onContextEnter(*getFlatProfile(Data, Callee, Guid, NumCounters))); diff --git a/compiler-rt/lib/ctx_profile/tests/CtxInstrProfilingTest.cpp b/compiler-rt/lib/ctx_profile/tests/CtxInstrProfilingTest.cpp index 39a225ac1..064608d28 100644 --- a/compiler-rt/lib/ctx_profile/tests/CtxInstrProfilingTest.cpp +++ b/compiler-rt/lib/ctx_profile/tests/CtxInstrProfilingTest.cpp @@ -276,7 +276,7 @@ TEST_F(ContextTest, RootEntersOtherRoot) { __llvm_ctx_profile_release_context(&Root); EXPECT_EQ(__llvm_ctx_profile_current_context_root, Root.CtxRoot); __llvm_ctx_profile_release_context(&Root); - EXPECT_EQ(__llvm_ctx_profile_current_context_root, nullptr); + EXPECT_EQ(__llvm_ctx_profile_current_context_root, nullptr); } TEST_F(ContextTest, NeedMoreMemory) { `` https://github.com/llvm/llvm-project/pull/135656 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [HLSL] Adding support for Root Constants in LLVM Metadata (PR #135085)
@@ -0,0 +1,34 @@
+; RUN: opt %s -dxil-embed -dxil-globals -S -o - | FileCheck %s
+; RUN: llc %s --filetype=obj -o - | obj2yaml | FileCheck %s --check-prefix=DXC
+
+target triple = "dxil-unknown-shadermodel6.0-compute"
+
+; CHECK: @dx.rts0 = private constant [48 x i8] c"{{.*}}", section "RTS0", align 4
+
+define void @main() #0 {
+entry:
+  ret void
+}
+attributes #0 = { "hlsl.numthreads"="1,1,1" "hlsl.shader"="compute" }
+
+
+!dx.rootsignatures = !{!2} ; list of function/root signature pairs
+!2 = !{ ptr @main, !3 } ; function, root signature
+!3 = !{ !4, !5 } ; list of root signature elements
+!4 = !{ !"RootFlags", i32 1 } ; 1 = allow_input_assembler_input_layout
+!5 = !{ !"RootConstants", i32 0, i32 1, i32 2, i32 3 }
+
+; DXC: - Name: RTS0
+; DXC-NEXT: Size: 48
+; DXC-NEXT: RootSignature:
+; DXC-NEXT: Version: 2
+; DXC-NEXT: NumStaticSamplers: 0
+; DXC-NEXT: StaticSamplersOffset: 0

joaosaffran wrote:

The difference comes from the fact that `obj2yaml` has no support for static samplers yet, so the two tools do not handle this value consistently. This will be fixed when that support is added to `obj2yaml`.

https://github.com/llvm/llvm-project/pull/135085
[llvm-branch-commits] [clang] [llvm] release/20.x: [X86][AVX10] Remove VAES and VPCLMULQDQ feature from AVX10.1 (#135489) (PR #135577)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/135577 >From 0c30835a63db5bb309abe8533a9c57b3b00a15ed Mon Sep 17 00:00:00 2001 From: Phoebe Wang Date: Mon, 14 Apr 2025 08:54:10 +0800 Subject: [PATCH] [X86][AVX10] Remove VAES and VPCLMULQDQ feature from AVX10.1 (#135489) According to SDM, they require both VAES/VPCLMULQDQ and AVX10.1 CPUID bits. Fixes: #135394 (cherry picked from commit ebba554a3211b0b98d3ae33ba70f9d6ceaab6ad4) --- clang/test/CodeGen/attr-target-x86.c | 8 llvm/lib/Target/X86/X86.td| 2 +- llvm/lib/TargetParser/X86TargetParser.cpp | 3 +-- 3 files changed, 6 insertions(+), 7 deletions(-) diff --git a/clang/test/CodeGen/attr-target-x86.c b/clang/test/CodeGen/attr-target-x86.c index c92aad633082f..e5067c1c3b075 100644 --- a/clang/test/CodeGen/attr-target-x86.c +++ b/clang/test/CodeGen/attr-target-x86.c @@ -56,7 +56,7 @@ void f_default2(void) { __attribute__((target("avx, sse4.2, arch= ivybridge"))) void f_avx_sse4_2_ivybridge_2(void) {} -// CHECK: [[f_no_aes_ivybridge]] = {{.*}}"target-cpu"="ivybridge" "target-features"="+avx,+cmov,+crc32,+cx16,+cx8,+f16c,+fsgsbase,+fxsr,+mmx,+pclmul,+popcnt,+rdrnd,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt,-aes,-amx-avx512,-avx10.1-256,-avx10.1-512,-avx10.2-256,-avx10.2-512,-vaes" +// CHECK: [[f_no_aes_ivybridge]] = {{.*}}"target-cpu"="ivybridge" "target-features"="+avx,+cmov,+crc32,+cx16,+cx8,+f16c,+fsgsbase,+fxsr,+mmx,+pclmul,+popcnt,+rdrnd,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt,-aes,-vaes" __attribute__((target("no-aes, arch=ivybridge"))) void f_no_aes_ivybridge(void) {} @@ -98,11 +98,11 @@ void f_x86_64_v3(void) {} __attribute__((target("arch=x86-64-v4"))) void f_x86_64_v4(void) {} -// CHECK: [[f_avx10_1_256]] = {{.*}}"target-cpu"="i686" 
"target-features"="+aes,+avx,+avx10.1-256,+avx2,+avx512bf16,+avx512bitalg,+avx512bw,+avx512cd,+avx512dq,+avx512f,+avx512fp16,+avx512ifma,+avx512vbmi,+avx512vbmi2,+avx512vl,+avx512vnni,+avx512vpopcntdq,+cmov,+crc32,+cx8,+f16c,+fma,+mmx,+pclmul,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+vaes,+vpclmulqdq,+x87,+xsave,-amx-avx512,-avx10.1-512,-avx10.2-512,-evex512" +// CHECK: [[f_avx10_1_256]] = {{.*}}"target-cpu"="i686" "target-features"="+avx,+avx10.1-256,+avx2,+avx512bf16,+avx512bitalg,+avx512bw,+avx512cd,+avx512dq,+avx512f,+avx512fp16,+avx512ifma,+avx512vbmi,+avx512vbmi2,+avx512vl,+avx512vnni,+avx512vpopcntdq,+cmov,+crc32,+cx8,+f16c,+fma,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,-amx-avx512,-avx10.1-512,-avx10.2-512,-evex512" __attribute__((target("avx10.1-256"))) void f_avx10_1_256(void) {} -// CHECK: [[f_avx10_1_512]] = {{.*}}"target-cpu"="i686" "target-features"="+aes,+avx,+avx10.1-256,+avx10.1-512,+avx2,+avx512bf16,+avx512bitalg,+avx512bw,+avx512cd,+avx512dq,+avx512f,+avx512fp16,+avx512ifma,+avx512vbmi,+avx512vbmi2,+avx512vl,+avx512vnni,+avx512vpopcntdq,+cmov,+crc32,+cx8,+evex512,+f16c,+fma,+mmx,+pclmul,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+vaes,+vpclmulqdq,+x87,+xsave" +// CHECK: [[f_avx10_1_512]] = {{.*}}"target-cpu"="i686" "target-features"="+avx,+avx10.1-256,+avx10.1-512,+avx2,+avx512bf16,+avx512bitalg,+avx512bw,+avx512cd,+avx512dq,+avx512f,+avx512fp16,+avx512ifma,+avx512vbmi,+avx512vbmi2,+avx512vl,+avx512vnni,+avx512vpopcntdq,+cmov,+crc32,+cx8,+evex512,+f16c,+fma,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave" __attribute__((target("avx10.1-512"))) void f_avx10_1_512(void) {} @@ -112,4 +112,4 @@ void f_prefer_256_bit(void) {} // CHECK: [[f_no_prefer_256_bit]] = {{.*}}"target-features"="{{.*}}-prefer-256-bit __attribute__((target("no-prefer-256-bit"))) -void f_no_prefer_256_bit(void) {} \ No newline at end of file +void f_no_prefer_256_bit(void) {} diff --git a/llvm/lib/Target/X86/X86.td 
b/llvm/lib/Target/X86/X86.td index 38761e1fd7eec..577428cad6d61 100644 --- a/llvm/lib/Target/X86/X86.td +++ b/llvm/lib/Target/X86/X86.td @@ -338,7 +338,7 @@ def FeatureAVX10_1 : SubtargetFeature<"avx10.1-256", "HasAVX10_1", "true", "Support AVX10.1 up to 256-bit instruction", [FeatureCDI, FeatureVBMI, FeatureIFMA, FeatureVNNI, FeatureBF16, FeatureVPOPCNTDQ, FeatureVBMI2, FeatureBITALG, - FeatureVAES, FeatureVPCLMULQDQ, FeatureFP16]>; + FeatureFP16]>; def FeatureAVX10_1_512 : SubtargetFeature<"avx10.1-512", "HasAVX10_1_512", "true", "Support AVX10.1 up to 512-bit instruction", [FeatureAVX10_1, FeatureEVEX512]>; diff --git a/llvm/lib/TargetParser/X86TargetParser.cpp b/llvm/lib/TargetParser/X86TargetParser.cpp index e4b7ed7cf9b61..2ae6dd6b3d1ef 100644 --- a/llvm/lib/TargetParser/X86TargetParser.cpp +++ b/llvm/lib/TargetParser/X86T
[llvm-branch-commits] [clang] 0c30835 - [X86][AVX10] Remove VAES and VPCLMULQDQ feature from AVX10.1 (#135489)
Author: Phoebe Wang
Date: 2025-04-14T12:45:29-07:00
New Revision: 0c30835a63db5bb309abe8533a9c57b3b00a15ed

URL: https://github.com/llvm/llvm-project/commit/0c30835a63db5bb309abe8533a9c57b3b00a15ed
DIFF: https://github.com/llvm/llvm-project/commit/0c30835a63db5bb309abe8533a9c57b3b00a15ed.diff

LOG: [X86][AVX10] Remove VAES and VPCLMULQDQ feature from AVX10.1 (#135489)

According to SDM, they require both VAES/VPCLMULQDQ and AVX10.1 CPUID bits.

Fixes: #135394
(cherry picked from commit ebba554a3211b0b98d3ae33ba70f9d6ceaab6ad4)

Added:

Modified:
    clang/test/CodeGen/attr-target-x86.c
    llvm/lib/Target/X86/X86.td
    llvm/lib/TargetParser/X86TargetParser.cpp

Removed:
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -0,0 +1,113 @@
+//===- MCGOFFAttributes.h - Attributes of GOFF symbols ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Defines the various attribute collections defining GOFF symbols.
+//
+//===--===//
+
+#ifndef LLVM_MC_MCGOFFATTRIBUTES_H
+#define LLVM_MC_MCGOFFATTRIBUTES_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/BinaryFormat/GOFF.h"
+
+namespace llvm {
+namespace GOFF {
+// An "External Symbol Definition" in the GOFF file has a type, and depending on
+// the type a different subset of the fields is used.
+//
+// Unlike other formats, a 2 dimensional structure is used to define the
+// location of data. For example, the equivalent of the ELF .text section is
+// made up of a Section Definition (SD) and a class (Element Definition; ED).
+// The name of the SD symbol depends on the application, while the class has the
+// predefined name C_CODE/C_CODE64 in AMODE31 and AMODE64 respectively.
+//
+// Data can be placed into this structure in 2 ways. First, the data (in a text
+// record) can be associated with an ED symbol. To refer to data, a Label
+// Definition (LD) is used to give an offset into the data a name. When binding,
+// the whole data is pulled into the resulting executable, and the addresses
+// given by the LD symbols are resolved.
+//
+// The alternative is to use a Part Definition (PR). In this case, the data (in
+// a text record) is associated with the part. When binding, only the data of
+// referenced PRs is pulled into the resulting binary.
+//
+// Both approaches are used, which means that the equivalent of a section in ELF
+// results in 3 GOFF symbols, either SD/ED/LD or SD/ED/PR. Moreover, certain
+// sections are fine with just defining SD/ED symbols. The SymbolMapper takes
+// care of all those details.
+
+// Attributes for SD symbols.
+struct SDAttr {
+  GOFF::ESDTaskingBehavior TaskingBehavior = GOFF::ESD_TA_Unspecified;
+  GOFF::ESDBindingScope BindingScope = GOFF::ESD_BSC_Unspecified;
+};
+
+// Attributes for ED symbols.
+struct EDAttr {
+  bool IsReadOnly = false;
+  GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
+  GOFF::ESDAmode Amode;
+  GOFF::ESDRmode Rmode;
+  GOFF::ESDNameSpaceId NameSpace = GOFF::ESD_NS_NormalName;
+  GOFF::ESDTextStyle TextStyle = GOFF::ESD_TS_ByteOriented;
+  GOFF::ESDBindingAlgorithm BindAlgorithm = GOFF::ESD_BA_Concatenate;
+  GOFF::ESDLoadingBehavior LoadBehavior = GOFF::ESD_LB_Initial;
+  GOFF::ESDReserveQwords ReservedQwords = GOFF::ESD_RQ_0;
+  GOFF::ESDAlignment Alignment = GOFF::ESD_ALIGN_Doubleword;
+};
+
+// Attributes for LD symbols.
+struct LDAttr {
+  bool IsRenamable = false;
+  GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
+  GOFF::ESDNameSpaceId NameSpace = GOFF::ESD_NS_NormalName;
+  GOFF::ESDBindingStrength BindingStrength = GOFF::ESD_BST_Strong;
+  GOFF::ESDLinkageType Linkage = GOFF::ESD_LT_XPLink;
+  GOFF::ESDAmode Amode;
+  GOFF::ESDBindingScope BindingScope = GOFF::ESD_BSC_Unspecified;
+};
+
+// Attributes for PR symbols.
+struct PRAttr {
+  bool IsRenamable = false;
+  bool IsReadOnly = false; // Not documented.
+  GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
+  GOFF::ESDNameSpaceId NameSpace = GOFF::ESD_NS_NormalName;
+  GOFF::ESDLinkageType Linkage = GOFF::ESD_LT_XPLink;
+  GOFF::ESDAmode Amode;

redstar wrote:

Removed.

https://github.com/llvm/llvm-project/pull/133799
[llvm-branch-commits] [clang] [llvm] [HLSL] Allow resource annotations to specify only register space (PR #135287)
@@ -163,11 +163,16 @@ void Parser::ParseHLSLAnnotations(ParsedAttributes &Attrs,
       SourceLocation SlotLoc = Tok.getLocation();
       ArgExprs.push_back(ParseIdentifierLoc());
 
-      // Add numeric_constant for fix-it.
-      if (SlotStr.size() == 1 && Tok.is(tok::numeric_constant))
+      if (SlotStr.size() == 1) {
+        if (!Tok.is(tok::numeric_constant)) {
+          Diag(Tok.getLocation(), diag::err_expected) << tok::numeric_constant;
+          SkipUntil(tok::r_paren, StopAtSemi); // skip through )

spall wrote:

I'm unfamiliar with the parsing code so this might be a dumb question, but why do you SkipUntil here? What happens after the code returns?

https://github.com/llvm/llvm-project/pull/135287
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
llvmbot wrote: @llvm/pr-subscribers-mlir-sve Author: Momchil Velikov (momchil-velikov) Changes Supersedes https://github.com/llvm/llvm-project/pull/135359 --- Patch is 77.36 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/135636.diff 16 Files Affected: - (modified) mlir/include/mlir/Conversion/Passes.td (+4) - (modified) mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h (+3) - (modified) mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt (+1) - (modified) mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp (+7) - (modified) mlir/lib/Dialect/ArmNeon/Transforms/LowerContractionToSMMLAPattern.cpp (+4-1) - (modified) mlir/lib/Dialect/ArmSVE/Transforms/CMakeLists.txt (+1) - (added) mlir/lib/Dialect/ArmSVE/Transforms/LowerContractionToSVEI8MMPattern.cpp (+304) - (added) mlir/test/Dialect/Vector/CPU/ArmSVE/vector-smmla.mlir (+94) - (added) mlir/test/Dialect/Vector/CPU/ArmSVE/vector-summla.mlir (+85) - (added) mlir/test/Dialect/Vector/CPU/ArmSVE/vector-ummla.mlir (+94) - (added) mlir/test/Dialect/Vector/CPU/ArmSVE/vector-usmmla.mlir (+95) - (added) mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-4x8x4.mlir (+117) - (added) mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-8x8x8-vs2.mlir (+159) - (added) mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-summla-4x8x4.mlir (+118) - (added) mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-ummla-4x8x4.mlir (+119) - (added) mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-usmmla-4x8x4.mlir (+117) ``diff diff --git a/mlir/include/mlir/Conversion/Passes.td b/mlir/include/mlir/Conversion/Passes.td index bbba495e613b2..930d8b44abca0 100644 --- a/mlir/include/mlir/Conversion/Passes.td +++ b/mlir/include/mlir/Conversion/Passes.td @@ -1406,6 +1406,10 @@ def ConvertVectorToLLVMPass : Pass<"convert-vector-to-llvm"> { "bool", /*default=*/"false", "Enables the use of ArmSVE dialect while lowering the vector " 
"dialect.">, +Option<"armI8MM", "enable-arm-i8mm", + "bool", /*default=*/"false", + "Enables the use of Arm FEAT_I8MM instructions while lowering " + "the vector dialect.">, Option<"x86Vector", "enable-x86vector", "bool", /*default=*/"false", "Enables the use of X86Vector dialect while lowering the vector " diff --git a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h index 8665c8224cc45..232e2be29e574 100644 --- a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h +++ b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h @@ -20,6 +20,9 @@ class RewritePatternSet; void populateArmSVELegalizeForLLVMExportPatterns( const LLVMTypeConverter &converter, RewritePatternSet &patterns); +void populateLowerContractionToSVEI8MMPatternPatterns( +RewritePatternSet &patterns); + /// Configure the target to support lowering ArmSVE ops to ops that map to LLVM /// intrinsics. void configureArmSVELegalizeForExportTarget(LLVMConversionTarget &target); diff --git a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt index 330474a718e30..8e2620029c354 100644 --- a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt +++ b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt @@ -35,6 +35,7 @@ add_mlir_conversion_library(MLIRVectorToLLVMPass MLIRVectorToLLVM MLIRArmNeonDialect + MLIRArmNeonTransforms MLIRArmSVEDialect MLIRArmSVETransforms MLIRAMXDialect diff --git a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp index 7082b92c95d1d..1e6c8122b1d0e 100644 --- a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp +++ b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp @@ -14,6 +14,7 @@ #include "mlir/Dialect/AMX/Transforms.h" #include "mlir/Dialect/Arith/IR/Arith.h" #include "mlir/Dialect/ArmNeon/ArmNeonDialect.h" +#include "mlir/Dialect/ArmNeon/Transforms.h" #include 
"mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" #include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" #include "mlir/Dialect/Func/IR/FuncOps.h" @@ -82,6 +83,12 @@ void ConvertVectorToLLVMPass::runOnOperation() { populateVectorStepLoweringPatterns(patterns); populateVectorRankReducingFMAPattern(patterns); populateVectorGatherLoweringPatterns(patterns); +if (armI8MM) { + if (armNeon) +arm_neon::populateLowerContractionToSMMLAPatternPatterns(patterns); + if (armSVE) +populateLowerContractionToSVEI8MMPatternPatterns(patterns); +} (void)applyPatternsGreedily(getOperation(), std::move(patterns)); } diff --git a/mlir/lib/Dialect/ArmNeon/Transforms/LowerContractionToSMMLAPattern.cpp b/mlir/lib/Dialect/ArmNeon/Transforms/LowerContractionToSMMLAPattern.cpp index 2a1271dfd6bdf..e
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
llvmbot wrote:

@llvm/pr-subscribers-mlir-neon

Author: Momchil Velikov (momchil-velikov)

Changes

Supersedes https://github.com/llvm/llvm-project/pull/135359

---
Patch is 77.36 KiB; full version: https://github.com/llvm/llvm-project/pull/135636.diff

https://github.com/llvm/llvm-project/pull/135636
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to svusmmla (PR #135634)
llvmbot wrote: @llvm/pr-subscribers-mlir-sve Author: Momchil Velikov (momchil-velikov) Changes Supersedes https://github.com/llvm/llvm-project/pull/135358 --- Full diff: https://github.com/llvm/llvm-project/pull/135634.diff 5 Files Affected: - (modified) mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td (+32) - (modified) mlir/lib/Dialect/ArmSVE/Transforms/LegalizeForLLVMExport.cpp (+4) - (modified) mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir (+12) - (modified) mlir/test/Dialect/ArmSVE/roundtrip.mlir (+11) - (modified) mlir/test/Target/LLVMIR/arm-sve.mlir (+12) ``diff diff --git a/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td b/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td index 1a59062ccc93d..da2a8f89b4cfd 100644 --- a/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td +++ b/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td @@ -273,6 +273,34 @@ def UmmlaOp : ArmSVE_Op<"ummla", "$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)"; } +def UsmmlaOp : ArmSVE_Op<"usmmla", [Pure, +AllTypesMatch<["src1", "src2"]>, +AllTypesMatch<["acc", "dst"]>]> { + let summary = "Matrix-matrix multiply and accumulate op"; + let description = [{ +USMMLA: Unsigned by signed integer matrix multiply-accumulate. + +The unsigned by signed integer matrix multiply-accumulate operation +multiplies the 2×8 matrix of unsigned 8-bit integer values held +the first source vector by the 8×2 matrix of signed 8-bit integer +values in the second source vector. The resulting 2×2 widened 32-bit +integer matrix product is then added to the 32-bit integer matrix +accumulator. 
+ +Source: +https://developer.arm.com/documentation/100987/ + }]; + // Supports (vector<16xi8>, vector<16xi8>) -> (vector<4xi32>) + let arguments = (ins + ScalableVectorOfLengthAndType<[4], [I32]>:$acc, + ScalableVectorOfLengthAndType<[16], [I8]>:$src1, + ScalableVectorOfLengthAndType<[16], [I8]>:$src2 + ); + let results = (outs ScalableVectorOfLengthAndType<[4], [I32]>:$dst); + let assemblyFormat = +"$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)"; +} + class SvboolTypeConstraint : TypesMatchWith< "expected corresponding svbool type widened to [16]xi1", lhsArg, rhsArg, @@ -568,6 +596,10 @@ def SmmlaIntrOp : ArmSVE_IntrBinaryOverloadedOp<"smmla">, Arguments<(ins AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank)>; +def UsmmlaIntrOp : + ArmSVE_IntrBinaryOverloadedOp<"usmmla">, + Arguments<(ins AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank)>; + def SdotIntrOp : ArmSVE_IntrBinaryOverloadedOp<"sdot">, Arguments<(ins AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank)>; diff --git a/mlir/lib/Dialect/ArmSVE/Transforms/LegalizeForLLVMExport.cpp b/mlir/lib/Dialect/ArmSVE/Transforms/LegalizeForLLVMExport.cpp index fe13ed03356b2..b1846e15196fc 100644 --- a/mlir/lib/Dialect/ArmSVE/Transforms/LegalizeForLLVMExport.cpp +++ b/mlir/lib/Dialect/ArmSVE/Transforms/LegalizeForLLVMExport.cpp @@ -24,6 +24,7 @@ using SdotOpLowering = OneToOneConvertToLLVMPattern; using SmmlaOpLowering = OneToOneConvertToLLVMPattern; using UdotOpLowering = OneToOneConvertToLLVMPattern; using UmmlaOpLowering = OneToOneConvertToLLVMPattern; +using UsmmlaOpLowering = OneToOneConvertToLLVMPattern; using DupQLaneLowering = OneToOneConvertToLLVMPattern; using ScalableMaskedAddIOpLowering = @@ -194,6 +195,7 @@ void mlir::populateArmSVELegalizeForLLVMExportPatterns( SmmlaOpLowering, UdotOpLowering, UmmlaOpLowering, + UsmmlaOpLowering, DupQLaneLowering, ScalableMaskedAddIOpLowering, 
ScalableMaskedAddFOpLowering, @@ -222,6 +224,7 @@ void mlir::configureArmSVELegalizeForExportTarget( SmmlaIntrOp, UdotIntrOp, UmmlaIntrOp, +UsmmlaIntrOp, DupQLaneIntrOp, ScalableMaskedAddIIntrOp, ScalableMaskedAddFIntrOp, @@ -242,6 +245,7 @@ void mlir::configureArmSVELegalizeForExportTarget( SmmlaOp, UdotOp, UmmlaOp, + UsmmlaOp, DupQLaneOp, ScalableMaskedAddIOp, ScalableMaskedAddFOp, diff --git a/mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir b/mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir index 5d044517e0ea8..47587aa26506c 100644 --- a/mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir +++ b/mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir @@ -48,6 +48,18 @@ func.func @arm_sve_ummla(%a: vector<[16]xi8>, // - +func.func @arm_sve_usmmla(%a: vector<[16]xi8>, +
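The op description in the patch above (a 2×8 unsigned matrix times an 8×2 signed matrix, accumulated into a 2×2 32-bit matrix) can be illustrated with a scalar reference model. This is a minimal sketch for one 128-bit segment only, not code from the PR: `usmmlaSegment` is a hypothetical name, and the storage layout (first operand row by row, second operand column by column, accumulator row by row) is an assumption.

```cpp
#include <array>
#include <cstdint>

// Scalar reference model of USMMLA for a single 128-bit segment.
// Assumed layout (not taken from the patch):
//   a   : 2x8 matrix of unsigned 8-bit values, stored row by row
//   b   : 8x2 matrix of signed 8-bit values, stored column by column
//   acc : 2x2 matrix of 32-bit values, stored row by row
std::array<int32_t, 4> usmmlaSegment(std::array<int32_t, 4> acc,
                                     const std::array<uint8_t, 16> &a,
                                     const std::array<int8_t, 16> &b) {
  for (int i = 0; i < 2; ++i) {
    for (int j = 0; j < 2; ++j) {
      int32_t sum = 0;
      for (int k = 0; k < 8; ++k) {
        // Widen both operands to 32 bits before multiplying.
        sum += static_cast<int32_t>(a[8 * i + k]) *
               static_cast<int32_t>(b[8 * j + k]);
      }
      acc[2 * i + j] += sum;
    }
  }
  return acc;
}
```

For a scalable vector the same computation would be repeated independently for each 128-bit segment, which is why the op takes `[16]xi8` sources and a `[4]xi32` accumulator.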
[llvm-branch-commits] [llvm] [ctxprof] Extend the notion of "cannot return" (PR #135651)
https://github.com/mtrofin updated https://github.com/llvm/llvm-project/pull/135651 >From e88504d840c675fbd196dc75c981098abc2c970d Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Mon, 14 Apr 2025 10:03:55 -0700 Subject: [PATCH] [ctxprof] Extend the notion of "cannot return" --- .../Instrumentation/PGOCtxProfLowering.cpp| 19 -- .../ctx-instrumentation-invalid-roots.ll | 25 +++ .../PGOProfile/ctx-instrumentation.ll | 13 ++ 3 files changed, 40 insertions(+), 17 deletions(-) diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp index f99d7b9d03e02..136225ab27cdc 100644 --- a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp +++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp @@ -9,6 +9,7 @@ #include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/Analysis/CFG.h" #include "llvm/Analysis/CtxProfAnalysis.h" #include "llvm/Analysis/OptimizationRemarkEmitter.h" #include "llvm/IR/Analysis.h" @@ -105,6 +106,12 @@ std::pair getNumCountersAndCallsites(const Function &F) { } return {NumCounters, NumCallsites}; } + +void emitUnsupportedRoot(const Function &F, StringRef Reason) { + F.getContext().emitError("[ctxprof] The function " + F.getName() + + " was indicated as context root but " + Reason + + ", which is not supported."); +} } // namespace // set up tie-in with compiler-rt. 
@@ -164,12 +171,8 @@ CtxInstrumentationLowerer::CtxInstrumentationLowerer(Module &M, for (const auto &BB : *F) for (const auto &I : BB) if (const auto *CB = dyn_cast(&I)) -if (CB->isMustTailCall()) { - M.getContext().emitError("The function " + Fname + - " was indicated as a context root, " - "but it features musttail " - "calls, which is not supported."); -} +if (CB->isMustTailCall()) + emitUnsupportedRoot(*F, "it features musttail calls"); } } @@ -230,11 +233,13 @@ bool CtxInstrumentationLowerer::lowerFunction(Function &F) { // Probably pointless to try to do anything here, unlikely to be // performance-affecting. - if (F.doesNotReturn()) { + if (!llvm::canReturn(F)) { for (auto &BB : F) for (auto &I : make_early_inc_range(BB)) if (isa(&I)) I.eraseFromParent(); +if (ContextRootSet.contains(&F)) + emitUnsupportedRoot(F, "it does not return"); return true; } diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll index 454780153b823..b5ceb4602c60b 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll @@ -1,17 +1,22 @@ -; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=good \ -; RUN: -profile-context-root=bad \ -; RUN: -S < %s 2>&1 | FileCheck %s +; RUN: split-file %s %t +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/musttail.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/unreachable.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/noreturn.ll -o - 2>&1 | FileCheck %s +;--- musttail.ll declare void @foo() -define void @good() { - call void @foo() - ret void -} - -define void @bad() { +define void @the_func() { musttail call void @foo() ret void } +;--- 
unreachable.ll +define void @the_func() { + unreachable +} +;--- noreturn.ll +define void @the_func() noreturn { + unreachable +} -; CHECK: error: The function bad was indicated as a context root, but it features musttail calls, which is not supported. +; CHECK: error: [ctxprof] The function the_func was indicated as context root diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll index 6b2f25a585ec3..71d54f98d26e1 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll @@ -323,6 +323,18 @@ define void @does_not_return() noreturn { ; unreachable } + +define void @unreachable() { +; INSTRUMENT-LABEL: define void @unreachable() { +; INSTRUMENT-NEXT:call void @llvm.instrprof.increment(ptr @unreachable, i64 742261418966908927, i32 1, i32 0) +; INSTRUMENT-NEXT:unreachable +; +; LOWERING-LABEL: define void @unreachable( +; LOWERING-SAME: ) !guid [[META9:![0-9]+]] { +; LOWERING-NEXT:unreachable +; + unreachable +} ;. ; LOWERING: attributes #[[ATTR0]
[llvm-branch-commits] [llvm] [ctxprof] Extend the notion of "cannot return" (PR #135651)
llvmbot wrote: @llvm/pr-subscribers-llvm-transforms Author: Mircea Trofin (mtrofin) Changes --- Full diff: https://github.com/llvm/llvm-project/pull/135651.diff 3 Files Affected: - (modified) llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp (+12-7) - (modified) llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll (+15-10) - (modified) llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll (+13) ``diff diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp index f99d7b9d03e02..136225ab27cdc 100644 --- a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp +++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp @@ -9,6 +9,7 @@ #include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/Analysis/CFG.h" #include "llvm/Analysis/CtxProfAnalysis.h" #include "llvm/Analysis/OptimizationRemarkEmitter.h" #include "llvm/IR/Analysis.h" @@ -105,6 +106,12 @@ std::pair getNumCountersAndCallsites(const Function &F) { } return {NumCounters, NumCallsites}; } + +void emitUnsupportedRoot(const Function &F, StringRef Reason) { + F.getContext().emitError("[ctxprof] The function " + F.getName() + + " was indicated as context root but " + Reason + + ", which is not supported."); +} } // namespace // set up tie-in with compiler-rt. 
@@ -164,12 +171,8 @@ CtxInstrumentationLowerer::CtxInstrumentationLowerer(Module &M, for (const auto &BB : *F) for (const auto &I : BB) if (const auto *CB = dyn_cast(&I)) -if (CB->isMustTailCall()) { - M.getContext().emitError("The function " + Fname + - " was indicated as a context root, " - "but it features musttail " - "calls, which is not supported."); -} +if (CB->isMustTailCall()) + emitUnsupportedRoot(*F, "it features musttail calls"); } } @@ -230,11 +233,13 @@ bool CtxInstrumentationLowerer::lowerFunction(Function &F) { // Probably pointless to try to do anything here, unlikely to be // performance-affecting. - if (F.doesNotReturn()) { + if (!llvm::canReturn(F)) { for (auto &BB : F) for (auto &I : make_early_inc_range(BB)) if (isa(&I)) I.eraseFromParent(); +if (ContextRootSet.contains(&F)) + emitUnsupportedRoot(F, "it does not return"); return true; } diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll index 454780153b823..b5ceb4602c60b 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll @@ -1,17 +1,22 @@ -; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=good \ -; RUN: -profile-context-root=bad \ -; RUN: -S < %s 2>&1 | FileCheck %s +; RUN: split-file %s %t +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/musttail.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/unreachable.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/noreturn.ll -o - 2>&1 | FileCheck %s +;--- musttail.ll declare void @foo() -define void @good() { - call void @foo() - ret void -} - -define void @bad() { +define void @the_func() { musttail call void @foo() ret void } +;--- 
unreachable.ll +define void @the_func() { + unreachable +} +;--- noreturn.ll +define void @the_func() noreturn { + unreachable +} -; CHECK: error: The function bad was indicated as a context root, but it features musttail calls, which is not supported. +; CHECK: error: [ctxprof] The function the_func was indicated as context root diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll index 6b2f25a585ec3..71d54f98d26e1 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll @@ -323,6 +323,18 @@ define void @does_not_return() noreturn { ; unreachable } + +define void @unreachable() { +; INSTRUMENT-LABEL: define void @unreachable() { +; INSTRUMENT-NEXT:call void @llvm.instrprof.increment(ptr @unreachable, i64 742261418966908927, i32 1, i32 0) +; INSTRUMENT-NEXT:unreachable +; +; LOWERING-LABEL: define void @unreachable( +; LOWERING-SAME: ) !guid [[META9:![0-9]+]] { +; LOWERING-NEXT:unreachable +; + unreachable +} ;. ; LOWERING: attributes #[[ATTR0]] = { noreturn } ; LOWERING: attributes #[[ATTR1:[0-9]+]] = { nounwind } @@ -340,4
[llvm-branch-commits] [llvm] [ctxprof] Extend the notion of "cannot return" (PR #135651)
https://github.com/mtrofin ready_for_review https://github.com/llvm/llvm-project/pull/135651
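The patch in this thread replaces `F.doesNotReturn()` with `!llvm::canReturn(F)`, so that a function whose body is just `unreachable` is treated like one carrying the `noreturn` attribute. The idea can be sketched on a toy control-flow graph as follows. This is an illustration only, assuming "can return" means that some block reachable from the entry ends in a return; `ToyBlock` and `toyCanReturn` are hypothetical names and this is not LLVM's implementation.

```cpp
#include <cstddef>
#include <vector>

// Toy basic block: indices of successor blocks, plus whether the block's
// terminator is a return.
struct ToyBlock {
  std::vector<std::size_t> successors;
  bool endsInReturn = false;
};

// Returns true if a block ending in a return is reachable from the entry
// block (index 0). Uses a simple worklist traversal.
bool toyCanReturn(const std::vector<ToyBlock> &blocks) {
  if (blocks.empty())
    return false;
  std::vector<bool> visited(blocks.size(), false);
  std::vector<std::size_t> worklist;
  worklist.push_back(0);
  visited[0] = true;
  while (!worklist.empty()) {
    std::size_t index = worklist.back();
    worklist.pop_back();
    if (blocks[index].endsInReturn)
      return true;
    for (std::size_t succ : blocks[index].successors) {
      if (!visited[succ]) {
        visited[succ] = true;
        worklist.push_back(succ);
      }
    }
  }
  return false;
}
```

Under this model both `unreachable.ll` and `noreturn.ll` from the new test correspond to a single block that does not end in a return, so both are rejected as context roots.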
[llvm-branch-commits] [llvm] [HLSL] Adding support for Root Constants in LLVM Metadata (PR #135085)
@@ -94,10 +144,56 @@ static bool parse(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD,

 static bool verifyRootFlag(uint32_t Flags) { return (Flags & ~0xfff) == 0; }

+static bool verifyShaderVisibility(dxbc::ShaderVisibility Flags) {
+  switch (Flags) {
+
+  case dxbc::ShaderVisibility::All:
+  case dxbc::ShaderVisibility::Vertex:
+  case dxbc::ShaderVisibility::Hull:
+  case dxbc::ShaderVisibility::Domain:
+  case dxbc::ShaderVisibility::Geometry:
+  case dxbc::ShaderVisibility::Pixel:
+  case dxbc::ShaderVisibility::Amplification:
+  case dxbc::ShaderVisibility::Mesh:
+    return true;
+  }
+
+  return false;
+}
+
+static bool verifyParameterType(dxbc::RootParameterType Flags) {
+  switch (Flags) {
+  case dxbc::RootParameterType::Constants32Bit:
+    return true;
+  }
+
+  return false;
+}
+
+static bool verifyVersion(uint32_t Version) {
+  return (Version == 1 || Version == 2);
+}
+
 static bool validate(LLVMContext *Ctx, const mcdxbc::RootSignatureDesc &RSD) {
+
+  if (!verifyVersion(RSD.Header.Version)) {
+    return reportValueError(Ctx, "Version", RSD.Header.Version);
+  }
+
   if (!verifyRootFlag(RSD.Header.Flags)) {
-    return reportError(Ctx, "Invalid Root Signature flag value");
+    return reportValueError(Ctx, "RootFlags", RSD.Header.Flags);
+  }
+
+  for (const auto &P : RSD.Parameters) {
+    if (!verifyShaderVisibility(P.Header.ShaderVisibility))
+      return reportValueError(Ctx, "ShaderVisibility",
+                              (uint32_t)P.Header.ShaderVisibility);
+
+    if (!verifyParameterType(P.Header.ParameterType))
+      return reportValueError(Ctx, "ParameterType",

inbelic wrote:

There isn't a test case demonstrating this

https://github.com/llvm/llvm-project/pull/135085
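To make the validation rules in the quoted hunk easier to exercise outside the compiler, here is a standalone sketch of the three checks on plain integers. It is not the PR's code: the `dxbc` enum is replaced by a raw `uint32_t`, and the assumption that the eight listed visibility enumerators occupy the values 0 through 7 is mine, not something the hunk states.

```cpp
#include <cstdint>

// Only the low 12 bits may be set in the root signature flags.
bool verifyRootFlag(uint32_t flags) { return (flags & ~0xfffu) == 0; }

// The hunk accepts exactly versions 1 and 2.
bool verifyVersion(uint32_t version) { return version == 1 || version == 2; }

// Assumption: the eight visibilities named in the hunk (All ... Mesh) are
// numbered 0..7; the real code switches over the dxbc enum instead.
bool verifyShaderVisibilityValue(uint32_t visibility) {
  return visibility <= 7;
}
```

A regression test for an invalid parameter type, as requested in the review comment, would need a value outside the accepted set to reach the validator.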
[llvm-branch-commits] [llvm] [HLSL] Adding support for Root Constants in LLVM Metadata (PR #135085)
inbelic wrote:

I think we should change the name of the file to be more descriptive.

https://github.com/llvm/llvm-project/pull/135085
[llvm-branch-commits] [llvm] [ctxprof] Extend the notion of "cannot return" (PR #135651)
https://github.com/mtrofin created https://github.com/llvm/llvm-project/pull/135651 None >From 41540073aaa8adc96e6f7889df66e1791cb4dbc9 Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Mon, 14 Apr 2025 10:03:55 -0700 Subject: [PATCH] [ctxprof] Extend the notion of "cannot return" --- .../Instrumentation/PGOCtxProfLowering.cpp| 19 -- .../ctx-instrumentation-invalid-roots.ll | 25 +++ .../PGOProfile/ctx-instrumentation.ll | 13 ++ 3 files changed, 40 insertions(+), 17 deletions(-) diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp index a314457423819..603022d94838c 100644 --- a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp +++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp @@ -9,6 +9,7 @@ #include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/Analysis/CFG.h" #include "llvm/Analysis/CtxProfAnalysis.h" #include "llvm/Analysis/OptimizationRemarkEmitter.h" #include "llvm/IR/Analysis.h" @@ -102,6 +103,12 @@ std::pair getNumCountersAndCallsites(const Function &F) { } return {NumCounters, NumCallsites}; } + +void emitUnsupportedRoot(const Function &F, StringRef Reason) { + F.getContext().emitError("[ctxprof] The function " + F.getName() + + " was indicated as context root but " + Reason + + ", which is not supported."); +} } // namespace // set up tie-in with compiler-rt. 
@@ -144,12 +151,8 @@ CtxInstrumentationLowerer::CtxInstrumentationLowerer(Module &M, for (const auto &BB : *F) for (const auto &I : BB) if (const auto *CB = dyn_cast(&I)) -if (CB->isMustTailCall()) { - M.getContext().emitError( - "The function " + Fname + - " was indicated as a context root, but it features musttail " - "calls, which is not supported."); -} +if (CB->isMustTailCall()) + emitUnsupportedRoot(*F, "it features musttail calls"); } } @@ -210,11 +213,13 @@ bool CtxInstrumentationLowerer::lowerFunction(Function &F) { // Probably pointless to try to do anything here, unlikely to be // performance-affecting. - if (F.doesNotReturn()) { + if (!llvm::canReturn(F)) { for (auto &BB : F) for (auto &I : make_early_inc_range(BB)) if (isa(&I)) I.eraseFromParent(); +if (ContextRootSet.contains(&F)) + emitUnsupportedRoot(F, "it does not return"); return true; } diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll index 454780153b823..b5ceb4602c60b 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll @@ -1,17 +1,22 @@ -; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=good \ -; RUN: -profile-context-root=bad \ -; RUN: -S < %s 2>&1 | FileCheck %s +; RUN: split-file %s %t +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/musttail.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/unreachable.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/noreturn.ll -o - 2>&1 | FileCheck %s +;--- musttail.ll declare void @foo() -define void @good() { - call void @foo() - ret void -} - -define void @bad() { +define void @the_func() { musttail call void @foo() ret void } +;--- 
unreachable.ll +define void @the_func() { + unreachable +} +;--- noreturn.ll +define void @the_func() noreturn { + unreachable +} -; CHECK: error: The function bad was indicated as a context root, but it features musttail calls, which is not supported. +; CHECK: error: [ctxprof] The function the_func was indicated as context root diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll index 8f72711a9c8b1..6afa37ef286f5 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll @@ -323,6 +323,18 @@ define void @does_not_return() noreturn { ; unreachable } + +define void @unreachable() { +; INSTRUMENT-LABEL: define void @unreachable() { +; INSTRUMENT-NEXT:call void @llvm.instrprof.increment(ptr @unreachable, i64 742261418966908927, i32 1, i32 0) +; INSTRUMENT-NEXT:unreachable +; +; LOWERING-LABEL: define void @unreachable( +; LOWERING-SAME: ) !guid [[META9:![0-9]+]] { +; LOWERING-NEXT:unreachable +; + unreachable +} ;. ; LOWERING: attributes #[[ATTR0]] = { noreturn } ; LOWERING: attributes #[[ATTR1:[0-9]+]]
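The new `emitUnsupportedRoot` helper builds one diagnostic format for every unsupported-root reason, which is why the updated test can use a single `CHECK` prefix for all three inputs. The string assembly can be reproduced with `std::string` as below. This is a sketch only: `unsupportedRootMessage` is a hypothetical name, and the real helper emits the text through `LLVMContext::emitError` rather than returning it.

```cpp
#include <string>

// Builds the same text that the patch passes to emitError.
std::string unsupportedRootMessage(const std::string &functionName,
                                   const std::string &reason) {
  return "[ctxprof] The function " + functionName +
         " was indicated as context root but " + reason +
         ", which is not supported.";
}
```

With the reasons used in the patch ("it features musttail calls" and "it does not return"), every message starts with the prefix the test checks for.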
[llvm-branch-commits] [llvm] [ctxprof] Extend the notion of "cannot return" (PR #135651)
mtrofin wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

* **#135651** 👈 (View in Graphite)
* **#135650**
* `main`

This stack of pull requests is managed by Graphite.

https://github.com/llvm/llvm-project/pull/135651
[llvm-branch-commits] [llvm] [HLSL] Adding support for Root Constants in LLVM Metadata (PR #135085)
https://github.com/joaosaffran updated https://github.com/llvm/llvm-project/pull/135085 >From 9b59d0108f6b23c039e2c417247216862073cd4b Mon Sep 17 00:00:00 2001 From: joaosaffran Date: Wed, 9 Apr 2025 21:05:58 + Subject: [PATCH 1/9] adding support for root constants in metadata generation --- llvm/lib/Target/DirectX/DXILRootSignature.cpp | 120 +- llvm/lib/Target/DirectX/DXILRootSignature.h | 6 +- .../RootSignature-Flags-Validation-Error.ll | 7 +- .../RootSignature-RootConstants.ll| 34 + ...ature-ShaderVisibility-Validation-Error.ll | 20 +++ 5 files changed, 182 insertions(+), 5 deletions(-) create mode 100644 llvm/test/CodeGen/DirectX/ContainerData/RootSignature-RootConstants.ll create mode 100644 llvm/test/CodeGen/DirectX/ContainerData/RootSignature-ShaderVisibility-Validation-Error.ll diff --git a/llvm/lib/Target/DirectX/DXILRootSignature.cpp b/llvm/lib/Target/DirectX/DXILRootSignature.cpp index 412ab7765a7ae..7686918b0fc75 100644 --- a/llvm/lib/Target/DirectX/DXILRootSignature.cpp +++ b/llvm/lib/Target/DirectX/DXILRootSignature.cpp @@ -40,6 +40,13 @@ static bool reportError(LLVMContext *Ctx, Twine Message, return true; } +static bool reportValueError(LLVMContext *Ctx, Twine ParamName, uint32_t Value, + DiagnosticSeverity Severity = DS_Error) { + Ctx->diagnose(DiagnosticInfoGeneric( + "Invalid value for " + ParamName + ": " + Twine(Value), Severity)); + return true; +} + static bool parseRootFlags(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD, MDNode *RootFlagNode) { @@ -52,6 +59,45 @@ static bool parseRootFlags(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD, return false; } +static bool extractMdValue(uint32_t &Value, MDNode *Node, unsigned int OpId) { + + auto *CI = mdconst::extract(Node->getOperand(OpId)); + if (CI == nullptr) +return true; + + Value = CI->getZExtValue(); + return false; +} + +static bool parseRootConstants(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD, + MDNode *RootFlagNode) { + + if (RootFlagNode->getNumOperands() != 5) +return 
reportError(Ctx, "Invalid format for RootConstants Element"); + + mcdxbc::RootParameter NewParameter; + NewParameter.Header.ParameterType = dxbc::RootParameterType::Constants32Bit; + + uint32_t SV; + if (extractMdValue(SV, RootFlagNode, 1)) +return reportError(Ctx, "Invalid value for ShaderVisibility"); + + NewParameter.Header.ShaderVisibility = (dxbc::ShaderVisibility)SV; + + if (extractMdValue(NewParameter.Constants.ShaderRegister, RootFlagNode, 2)) +return reportError(Ctx, "Invalid value for ShaderRegister"); + + if (extractMdValue(NewParameter.Constants.RegisterSpace, RootFlagNode, 3)) +return reportError(Ctx, "Invalid value for RegisterSpace"); + + if (extractMdValue(NewParameter.Constants.Num32BitValues, RootFlagNode, 4)) +return reportError(Ctx, "Invalid value for Num32BitValues"); + + RSD.Parameters.push_back(NewParameter); + + return false; +} + static bool parseRootSignatureElement(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD, MDNode *Element) { @@ -62,12 +108,16 @@ static bool parseRootSignatureElement(LLVMContext *Ctx, RootSignatureElementKind ElementKind = StringSwitch(ElementText->getString()) .Case("RootFlags", RootSignatureElementKind::RootFlags) + .Case("RootConstants", RootSignatureElementKind::RootConstants) .Default(RootSignatureElementKind::Error); switch (ElementKind) { case RootSignatureElementKind::RootFlags: return parseRootFlags(Ctx, RSD, Element); + case RootSignatureElementKind::RootConstants: +return parseRootConstants(Ctx, RSD, Element); +break; case RootSignatureElementKind::Error: return reportError(Ctx, "Invalid Root Signature Element: " + ElementText->getString()); @@ -94,10 +144,56 @@ static bool parse(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD, static bool verifyRootFlag(uint32_t Flags) { return (Flags & ~0xfff) == 0; } +static bool verifyShaderVisibility(dxbc::ShaderVisibility Flags) { + switch (Flags) { + + case dxbc::ShaderVisibility::All: + case dxbc::ShaderVisibility::Vertex: + case dxbc::ShaderVisibility::Hull: + 
case dxbc::ShaderVisibility::Domain: + case dxbc::ShaderVisibility::Geometry: + case dxbc::ShaderVisibility::Pixel: + case dxbc::ShaderVisibility::Amplification: + case dxbc::ShaderVisibility::Mesh: +return true; + } + + return false; +} + +static bool verifyParameterType(dxbc::RootParameterType Flags) { + switch (Flags) { + case dxbc::RootParameterType::Constants32Bit: +return true; + } + + return false; +} + +static bool verifyVersion(uint32_t Version) { + return (Version == 1 || Version == 2); +} + static bool validate(LLVMContext *Ctx, const mcdxbc::RootSignatureDesc
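The parsing half of the patch expects a `RootConstants` metadata element with exactly five operands: the element name followed by shader visibility, shader register, register space and the number of 32-bit values. A simplified model of that shape check is sketched below. The types here are hypothetical stand-ins (a vector of optional integers instead of an `MDNode`), and the sketch returns an empty optional on failure, whereas the functions in the patch return `true` on error.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical plain-data counterpart of the parsed root constants.
struct RootConstants {
  uint32_t shaderVisibility;
  uint32_t shaderRegister;
  uint32_t registerSpace;
  uint32_t num32BitValues;
};

// Operand 0 stands for the element name and is ignored here. An empty
// optional at index 1..4 models an operand that is not a constant integer.
std::optional<RootConstants>
parseRootConstants(const std::vector<std::optional<uint32_t>> &operands) {
  if (operands.size() != 5)
    return std::nullopt; // wrong element format
  for (std::size_t i = 1; i < operands.size(); ++i)
    if (!operands[i])
      return std::nullopt; // operand i could not be extracted
  return RootConstants{*operands[1], *operands[2], *operands[3],
                       *operands[4]};
}
```

Range validation of the extracted values (for example the shader visibility) is a separate step in the patch and is not part of this sketch.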
[llvm-branch-commits] [llvm] release/20.x: [HEXAGON] Fix corner cases for hwloops pass (#135439) (PR #135657)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/135657 Backport da8ce56c53fe6e34809ba0b310fa90257e230a89 Requested by: @androm3da >From e2d8942a7829df234bdd0f505008d1db215e4bf8 Mon Sep 17 00:00:00 2001 From: aankit-ca Date: Mon, 14 Apr 2025 11:03:10 -0700 Subject: [PATCH] [HEXAGON] Fix corner cases for hwloops pass (#135439) Add check to make sure Dist > 0 or Dist < 0 for appropriate cmp cases to hexagon hardware loops pass. The change modifies the HexagonHardwareLoops pass to add runtime checks to make sure that end_value > initial_value for less than comparisons and end_value < initial_value for greater than comparisons. Fix for https://github.com/llvm/llvm-project/issues/133241 @androm3da @iajbar PTAL - Co-authored-by: aankit-quic (cherry picked from commit da8ce56c53fe6e34809ba0b310fa90257e230a89) --- .../Target/Hexagon/HexagonHardwareLoops.cpp | 46 ++- .../CodeGen/Hexagon/hwloop-dist-check.mir | 277 ++ llvm/test/CodeGen/Hexagon/swp-phi-start.ll| 5 +- 3 files changed, 325 insertions(+), 3 deletions(-) create mode 100644 llvm/test/CodeGen/Hexagon/hwloop-dist-check.mir diff --git a/llvm/lib/Target/Hexagon/HexagonHardwareLoops.cpp b/llvm/lib/Target/Hexagon/HexagonHardwareLoops.cpp index 9334746349240..dd4b240455126 100644 --- a/llvm/lib/Target/Hexagon/HexagonHardwareLoops.cpp +++ b/llvm/lib/Target/Hexagon/HexagonHardwareLoops.cpp @@ -731,6 +731,11 @@ CountValue *HexagonHardwareLoops::computeCount(MachineLoop *Loop, Register IVReg, int64_t IVBump, Comparison::Kind Cmp) const { + LLVM_DEBUG(llvm::dbgs() << "Loop: " << *Loop << "\n"); + LLVM_DEBUG(llvm::dbgs() << "Initial Value: " << *Start << "\n"); + LLVM_DEBUG(llvm::dbgs() << "End Value: " << *End << "\n"); + LLVM_DEBUG(llvm::dbgs() << "Inc/Dec Value: " << IVBump << "\n"); + LLVM_DEBUG(llvm::dbgs() << "Comparison: " << Cmp << "\n"); // Cannot handle comparison EQ, i.e. while (A == B). 
if (Cmp == Comparison::EQ) return nullptr; @@ -846,6 +851,7 @@ CountValue *HexagonHardwareLoops::computeCount(MachineLoop *Loop, if (IVBump < 0) { std::swap(Start, End); IVBump = -IVBump; +std::swap(CmpLess, CmpGreater); } // Cmp may now have a wrong direction, e.g. LEs may now be GEs. // Signedness, and "including equality" are preserved. @@ -989,7 +995,45 @@ CountValue *HexagonHardwareLoops::computeCount(MachineLoop *Loop, CountSR = 0; } - return new CountValue(CountValue::CV_Register, CountR, CountSR); + const TargetRegisterClass *PredRC = &Hexagon::PredRegsRegClass; + Register MuxR = CountR; + unsigned MuxSR = CountSR; + // For the loop count to be valid unsigned number, CmpLess should imply + // Dist >= 0. Similarly, CmpGreater should imply Dist < 0. We can skip the + // check if the initial distance is zero and the comparison is LTu || LTEu. + if (!(Start->isImm() && StartV == 0 && Comparison::isUnsigned(Cmp) && +CmpLess) && + (CmpLess || CmpGreater)) { +// Generate: +// DistCheck = CMP_GT DistR, 0 --> CmpLess +// DistCheck = CMP_GT DistR, -1 --> CmpGreater +Register DistCheckR = MRI->createVirtualRegister(PredRC); +const MCInstrDesc &DistCheckD = TII->get(Hexagon::C2_cmpgti); +BuildMI(*PH, InsertPos, DL, DistCheckD, DistCheckR) +.addReg(DistR, 0, DistSR) +.addImm((CmpLess) ? 0 : -1); + +// Generate: +// MUXR = MUX DistCheck, CountR, 1 --> CmpLess +// MUXR = MUX DistCheck, 1, CountR --> CmpGreater +MuxR = MRI->createVirtualRegister(IntRC); +if (CmpLess) { + const MCInstrDesc &MuxD = TII->get(Hexagon::C2_muxir); + BuildMI(*PH, InsertPos, DL, MuxD, MuxR) + .addReg(DistCheckR) + .addReg(CountR, 0, CountSR) + .addImm(1); +} else { + const MCInstrDesc &MuxD = TII->get(Hexagon::C2_muxri); + BuildMI(*PH, InsertPos, DL, MuxD, MuxR) + .addReg(DistCheckR) + .addImm(1) + .addReg(CountR, 0, CountSR); +} +MuxSR = 0; + } + + return new CountValue(CountValue::CV_Register, MuxR, MuxSR); } /// Return true if the operation is invalid within hardware loop. 
diff --git a/llvm/test/CodeGen/Hexagon/hwloop-dist-check.mir b/llvm/test/CodeGen/Hexagon/hwloop-dist-check.mir new file mode 100644 index 0..9f8c14a314309 --- /dev/null +++ b/llvm/test/CodeGen/Hexagon/hwloop-dist-check.mir @@ -0,0 +1,277 @@ +# RUN: llc --mtriple=hexagon -run-pass=hwloops %s -o - | FileCheck %s + +# CHECK-LABEL: name: f +# CHECK: [[R1:%[0-9]+]]:predregs = C2_cmpgti [[R0:%[0-9]+]], 0 +# CHECK: [[R3:%[0-9]+]]:intregs = C2_muxir [[R1:%[0-9]+]], [[R2:%[0-9]+]], 1 +# CHECK-LABEL: name: g +# CHECK: [[R1:%[0-9]+]]:predregs = C2_cmpgti [[R0:%[0-9]+]], 0 +# CHECK: [[R3:%[0-9]+]]:intregs = C2_muxir [[R1:%[0-9]+]], [[R2:%[0-9]+]], 1 +--- | + @a = dso_
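The predicate-plus-mux sequence described in the commit message can be modelled in scalar code. This is a rough sketch under stated assumptions: `guardedTripCount` is a hypothetical name, `Dist` is taken to be the end value minus the initial value, and `Count` is the trip count already derived from it. The instruction names in the comments are the ones the patch emits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the runtime check added to computeCount.
// If the distance has the wrong sign for the comparison direction, the
// loop count falls back to 1 instead of using Count.
uint32_t guardedTripCount(int32_t Dist, uint32_t Count, bool CmpLess) {
  if (CmpLess) {
    // DistCheck = C2_cmpgti DistR, 0
    // MuxR      = C2_muxir DistCheck, CountR, 1
    bool DistCheck = Dist > 0;
    return DistCheck ? Count : 1;
  }
  // DistCheck = C2_cmpgti DistR, -1
  // MuxR      = C2_muxri DistCheck, 1, CountR
  bool DistCheck = Dist > -1;
  return DistCheck ? 1 : Count;
}
```

For a less-than loop, a positive distance keeps the computed count; for a greater-than loop, only a negative distance does. The patch skips the check entirely when the start is the immediate 0 and the comparison is an unsigned less-than, which this sketch does not model.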
[llvm-branch-commits] [flang] [llvm] [Github][CI] Upload .ninja_log as an artifact (PR #135539)
https://github.com/boomanaiden154 updated https://github.com/llvm/llvm-project/pull/135539

>From 109923e35d854d63faa5b9599f5fd128bcfe5c79 Mon Sep 17 00:00:00 2001
From: Aiden Grossman
Date: Sun, 13 Apr 2025 11:26:06 +
Subject: [PATCH 1/3] testing

Created using spr 1.3.4
---
 .ci/monolithic-linux.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.ci/monolithic-linux.sh b/.ci/monolithic-linux.sh
index 6461c9d40ad59..8c1ad5d80da51 100755
--- a/.ci/monolithic-linux.sh
+++ b/.ci/monolithic-linux.sh
@@ -34,6 +34,7 @@ function at-exit {
   mkdir -p artifacts
   ccache --print-stats > artifacts/ccache_stats.txt
   cp "${BUILD_DIR}"/.ninja_log artifacts/.ninja_log
+  ls artifacts/
   # If building fails there will be no results files.
   shopt -s nullglob

>From 78c42b3aed24e533d53b2f701f5a0abd5f611e2a Mon Sep 17 00:00:00 2001
From: Aiden Grossman
Date: Sun, 13 Apr 2025 11:43:16 +
Subject: [PATCH 2/3] cleanup

Created using spr 1.3.4
---
 flang/CMakeLists.txt | 1 -
 1 file changed, 1 deletion(-)

diff --git a/flang/CMakeLists.txt b/flang/CMakeLists.txt
index 236b4644404ec..76eb13295eb07 100644
--- a/flang/CMakeLists.txt
+++ b/flang/CMakeLists.txt
@@ -1,4 +1,3 @@
-# testing
 cmake_minimum_required(VERSION 3.20.0)
 set(LLVM_SUBPROJECT_TITLE "Flang")

>From cb1924f997f4e35014df8bc072f038992b7a6bca Mon Sep 17 00:00:00 2001
From: Aiden Grossman
Date: Mon, 14 Apr 2025 07:36:51 +
Subject: [PATCH 3/3] fix

Created using spr 1.3.4
---
 .ci/monolithic-linux.sh | 1 -
 1 file changed, 1 deletion(-)

diff --git a/.ci/monolithic-linux.sh b/.ci/monolithic-linux.sh
index 8c1ad5d80da51..6461c9d40ad59 100755
--- a/.ci/monolithic-linux.sh
+++ b/.ci/monolithic-linux.sh
@@ -34,7 +34,6 @@ function at-exit {
   mkdir -p artifacts
   ccache --print-stats > artifacts/ccache_stats.txt
   cp "${BUILD_DIR}"/.ninja_log artifacts/.ninja_log
-  ls artifacts/
   # If building fails there will be no results files.
   shopt -s nullglob
[llvm-branch-commits] [clang] [llvm] release/20.x: [X86][AVX10] Remove VAES and VPCLMULQDQ feature from AVX10.1 (#135489) (PR #135577)
https://github.com/RKSimon approved this pull request. LGTM - cheers https://github.com/llvm/llvm-project/pull/135577
[llvm-branch-commits] [RISCV][NFC] Use bitmasks generated by TableGen (PR #135600)
https://github.com/BeMg approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/135600
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
https://github.com/AidoP edited https://github.com/llvm/llvm-project/pull/133799
[llvm-branch-commits] [compiler-rt] [llvm] Reentry (PR #135656)
https://github.com/mtrofin created https://github.com/llvm/llvm-project/pull/135656 None >From e7b5f81cf9bd3237b0b1ee2f51f105b99f1061b7 Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Mon, 14 Apr 2025 07:19:58 -0700 Subject: [PATCH] Reentry --- .../lib/ctx_profile/CtxInstrProfiling.cpp | 151 -- .../tests/CtxInstrProfilingTest.cpp | 115 - .../llvm/ProfileData/CtxInstrContextNode.h| 6 +- .../Instrumentation/PGOCtxProfLowering.cpp| 82 ++ .../PGOProfile/ctx-instrumentation.ll | 4 +- 5 files changed, 269 insertions(+), 89 deletions(-) diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp index 2d173f0fcb19a..2e26541c1acea 100644 --- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp +++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp @@ -41,7 +41,44 @@ Arena *FlatCtxArena = nullptr; // Set to true when we enter a root, and false when we exit - regardless if this // thread collects a contextual profile for that root. -__thread bool IsUnderContext = false; +__thread int UnderContextRefCount = 0; +__thread void *volatile EnteredContextAddress = 0; + +void onFunctionEntered(void *Address) { + UnderContextRefCount += (Address == EnteredContextAddress); + assert(UnderContextRefCount > 0); +} + +void onFunctionExited(void *Address) { + UnderContextRefCount -= (Address == EnteredContextAddress); + assert(UnderContextRefCount >= 0); +} + +// Returns true if it was entered the first time +bool rootEnterIsFirst(void* Address) { + bool Ret = true; + if (!EnteredContextAddress) { +EnteredContextAddress = Address; +assert(UnderContextRefCount == 0); +Ret = true; + } + onFunctionEntered(Address); + return Ret; +} + +// Return true if this also exits the root. 
+bool exitsRoot(void* Address) { + onFunctionExited(Address); + if (UnderContextRefCount == 0) { +EnteredContextAddress = nullptr; +return true; + } + return false; + +} + +bool hasEnteredARoot() { return UnderContextRefCount > 0; } + __sanitizer::atomic_uint8_t ProfilingStarted = {}; __sanitizer::atomic_uintptr_t RootDetector = {}; @@ -287,62 +324,65 @@ ContextRoot *FunctionData::getOrAllocateContextRoot() { return Root; } -ContextNode *tryStartContextGivenRoot(ContextRoot *Root, GUID Guid, - uint32_t Counters, uint32_t Callsites) -SANITIZER_NO_THREAD_SAFETY_ANALYSIS { - IsUnderContext = true; - __sanitizer::atomic_fetch_add(&Root->TotalEntries, 1, -__sanitizer::memory_order_relaxed); +ContextNode *tryStartContextGivenRoot( +ContextRoot *Root, void *EntryAddress, GUID Guid, uint32_t Counters, +uint32_t Callsites) SANITIZER_NO_THREAD_SAFETY_ANALYSIS { + + if (rootEnterIsFirst(EntryAddress)) +__sanitizer::atomic_fetch_add(&Root->TotalEntries, 1, + __sanitizer::memory_order_relaxed); if (!Root->FirstMemBlock) { setupContext(Root, Guid, Counters, Callsites); } if (Root->Taken.TryLock()) { +assert(__llvm_ctx_profile_current_context_root == nullptr); __llvm_ctx_profile_current_context_root = Root; onContextEnter(*Root->FirstNode); return Root->FirstNode; } // If this thread couldn't take the lock, return scratch context. - __llvm_ctx_profile_current_context_root = nullptr; return TheScratchContext; } +ContextNode *getOrStartContextOutsideCollection(FunctionData &Data, +ContextRoot *OwnCtxRoot, +void *Callee, GUID Guid, +uint32_t NumCounters, +uint32_t NumCallsites) { + // This must only be called when __llvm_ctx_profile_current_context_root is + // null. + assert(__llvm_ctx_profile_current_context_root == nullptr); + // OwnCtxRoot is Data.CtxRoot. Since it's volatile, and is used by the caller, + // pre-load it. + assert(Data.CtxRoot == OwnCtxRoot); + // If we have a root detector, try sampling. 
+ // Otherwise - regardless if we started profiling or not, if Data.CtxRoot is + // allocated, try starting a context tree - basically, as-if + // __llvm_ctx_profile_start_context were called. + if (auto *RAD = getRootDetector()) +RAD->sample(); + else if (reinterpret_cast(OwnCtxRoot) > 1) +return tryStartContextGivenRoot(OwnCtxRoot, Data.EntryAddress, Guid, +NumCounters, NumCallsites); + + // If we didn't start profiling, or if we are under a context, just not + // collecting, return the scratch buffer. + if (hasEnteredARoot() || + !__sanitizer::atomic_load_relaxed(&ProfilingStarted)) +return TheScratchContext; + return markAsScratch( + onContextEnter(*getFlatProfile(Data, Callee, Guid, NumCounters))); +} + ContextNode *getUnhandledContext(FunctionData &Data,
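The reference-counting idea behind this change can be illustrated with a small standalone model. It is a sketch, not the runtime's code: it reuses the names from the hunk for readability and drops the `volatile` qualifier and the sanitizer atomics. Unlike the quoted hunk, the flag in `rootEnterIsFirst` starts out false so that nested entries report false; that is this sketch's reading of the comment "Returns true if it was entered the first time", not something the patch confirms.

```cpp
#include <cassert>

// Simplified per-thread state, mirroring the two __thread variables.
thread_local int UnderContextRefCount = 0;
thread_local void *EnteredContextAddress = nullptr;

// Returns true only for the outermost entry into a root on this thread.
// Entries into other functions while a root is active do not change the
// count, because their address differs from the recorded one.
bool rootEnterIsFirst(void *Address) {
  bool First = false; // assumption: the hunk initializes this to true
  if (!EnteredContextAddress) {
    assert(UnderContextRefCount == 0);
    EnteredContextAddress = Address;
    First = true;
  }
  UnderContextRefCount += (Address == EnteredContextAddress);
  return First;
}

// Returns true when this exit brings the count back to zero, that is, when
// the thread leaves the root entirely.
bool exitsRoot(void *Address) {
  UnderContextRefCount -= (Address == EnteredContextAddress);
  assert(UnderContextRefCount >= 0);
  if (UnderContextRefCount == 0) {
    EnteredContextAddress = nullptr;
    return true;
  }
  return false;
}

bool hasEnteredARoot() { return UnderContextRefCount > 0; }
```

Replacing the boolean `IsUnderContext` with a count keyed on the entry address lets a recursive root re-enter itself without the inner exit clearing the "under context" state too early.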
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
https://github.com/momchil-velikov created https://github.com/llvm/llvm-project/pull/135636 Supersedes https://github.com/llvm/llvm-project/pull/135359 >From 2e61d3ee7b9ac88ae1be8ca248dad1a0880ccff4 Mon Sep 17 00:00:00 2001 From: Momchil Velikov Date: Tue, 8 Apr 2025 14:43:54 + Subject: [PATCH] [MLIR][ArmSVE] Add initial lowering of `vector.contract` to SVE `*MMLA` instructions --- mlir/include/mlir/Conversion/Passes.td| 4 + .../Dialect/ArmSVE/Transforms/Transforms.h| 3 + .../Conversion/VectorToLLVM/CMakeLists.txt| 1 + .../VectorToLLVM/ConvertVectorToLLVMPass.cpp | 7 + .../LowerContractionToSMMLAPattern.cpp| 5 +- .../Dialect/ArmSVE/Transforms/CMakeLists.txt | 1 + .../LowerContractionToSVEI8MMPattern.cpp | 304 ++ .../Vector/CPU/ArmSVE/vector-smmla.mlir | 94 ++ .../Vector/CPU/ArmSVE/vector-summla.mlir | 85 + .../Vector/CPU/ArmSVE/vector-ummla.mlir | 94 ++ .../Vector/CPU/ArmSVE/vector-usmmla.mlir | 95 ++ .../CPU/ArmSVE/contraction-smmla-4x8x4.mlir | 117 +++ .../ArmSVE/contraction-smmla-8x8x8-vs2.mlir | 159 + .../CPU/ArmSVE/contraction-summla-4x8x4.mlir | 118 +++ .../CPU/ArmSVE/contraction-ummla-4x8x4.mlir | 119 +++ .../CPU/ArmSVE/contraction-usmmla-4x8x4.mlir | 117 +++ 16 files changed, 1322 insertions(+), 1 deletion(-) create mode 100644 mlir/lib/Dialect/ArmSVE/Transforms/LowerContractionToSVEI8MMPattern.cpp create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-smmla.mlir create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-summla.mlir create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-ummla.mlir create mode 100644 mlir/test/Dialect/Vector/CPU/ArmSVE/vector-usmmla.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-4x8x4.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-8x8x8-vs2.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-summla-4x8x4.mlir create mode 100644 
mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-ummla-4x8x4.mlir create mode 100644 mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-usmmla-4x8x4.mlir diff --git a/mlir/include/mlir/Conversion/Passes.td b/mlir/include/mlir/Conversion/Passes.td index bbba495e613b2..930d8b44abca0 100644 --- a/mlir/include/mlir/Conversion/Passes.td +++ b/mlir/include/mlir/Conversion/Passes.td @@ -1406,6 +1406,10 @@ def ConvertVectorToLLVMPass : Pass<"convert-vector-to-llvm"> { "bool", /*default=*/"false", "Enables the use of ArmSVE dialect while lowering the vector " "dialect.">, +Option<"armI8MM", "enable-arm-i8mm", + "bool", /*default=*/"false", + "Enables the use of Arm FEAT_I8MM instructions while lowering " + "the vector dialect.">, Option<"x86Vector", "enable-x86vector", "bool", /*default=*/"false", "Enables the use of X86Vector dialect while lowering the vector " diff --git a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h index 8665c8224cc45..232e2be29e574 100644 --- a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h +++ b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h @@ -20,6 +20,9 @@ class RewritePatternSet; void populateArmSVELegalizeForLLVMExportPatterns( const LLVMTypeConverter &converter, RewritePatternSet &patterns); +void populateLowerContractionToSVEI8MMPatternPatterns( +RewritePatternSet &patterns); + /// Configure the target to support lowering ArmSVE ops to ops that map to LLVM /// intrinsics. 
void configureArmSVELegalizeForExportTarget(LLVMConversionTarget &target); diff --git a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt index 330474a718e30..8e2620029c354 100644 --- a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt +++ b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt @@ -35,6 +35,7 @@ add_mlir_conversion_library(MLIRVectorToLLVMPass MLIRVectorToLLVM MLIRArmNeonDialect + MLIRArmNeonTransforms MLIRArmSVEDialect MLIRArmSVETransforms MLIRAMXDialect diff --git a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp index 7082b92c95d1d..1e6c8122b1d0e 100644 --- a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp +++ b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp @@ -14,6 +14,7 @@ #include "mlir/Dialect/AMX/Transforms.h" #include "mlir/Dialect/Arith/IR/Arith.h" #include "mlir/Dialect/ArmNeon/ArmNeonDialect.h" +#include "mlir/Dialect/ArmNeon/Transforms.h" #include "mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" #include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" #include "mlir/Dialect/Func/IR/FuncOps.h" @@ -82,6 +83,12 @@ void ConvertVectorToLLVMPass::
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)
https://github.com/atrosinenko created https://github.com/llvm/llvm-project/pull/135662 Remove `getAffectedRegisters` and `setOverwritingInstrs` methods from the base `Report` class. Instead, make `Report` always represent the brief version of the report. When an issue is detected on the first run of the analysis, return an optional request for extra details to attach to the report on the second run. >From c82cab53c33623fa9d6384b58946eaaab7807270 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Mon, 14 Apr 2025 15:08:54 +0300 Subject: [PATCH] [BOLT] Gadget scanner: refactor issue reporting Remove `getAffectedRegisters` and `setOverwritingInstrs` methods from the base `Report` class. Instead, make `Report` always represent the brief version of the report. When an issue is detected on the first run of the analysis, return an optional request for extra details to attach to the report on the second run. --- bolt/include/bolt/Passes/PAuthGadgetScanner.h | 102 ++--- bolt/lib/Passes/PAuthGadgetScanner.cpp| 200 ++ .../AArch64/gs-pauth-debug-output.s | 8 +- 3 files changed, 187 insertions(+), 123 deletions(-) diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h index 3e39b64e59e0f..3b6c1f6af94a0 100644 --- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h +++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h @@ -219,11 +219,6 @@ struct Report { virtual void generateReport(raw_ostream &OS, const BinaryContext &BC) const = 0; - // The two methods below are called by Analysis::computeDetailedInfo when - // iterating over the reports. - virtual const ArrayRef getAffectedRegisters() const { return {}; } - virtual void setOverwritingInstrs(const ArrayRef Instrs) {} - void printBasicInfo(raw_ostream &OS, const BinaryContext &BC, StringRef IssueKind) const; }; @@ -231,27 +226,11 @@ struct Report { struct GadgetReport : public Report { // The particular kind of gadget that is detected. 
const GadgetKind &Kind; - // The set of registers related to this gadget report (possibly empty). - SmallVector AffectedRegisters; - // The instructions that clobber the affected registers. - // There is no one-to-one correspondence with AffectedRegisters: for example, - // the same register can be overwritten by different instructions in different - // preceding basic blocks. - SmallVector OverwritingInstrs; - - GadgetReport(const GadgetKind &Kind, MCInstReference Location, - MCPhysReg AffectedRegister) - : Report(Location), Kind(Kind), AffectedRegisters({AffectedRegister}) {} - - void generateReport(raw_ostream &OS, const BinaryContext &BC) const override; - const ArrayRef getAffectedRegisters() const override { -return AffectedRegisters; - } + GadgetReport(const GadgetKind &Kind, MCInstReference Location) + : Report(Location), Kind(Kind) {} - void setOverwritingInstrs(const ArrayRef Instrs) override { -OverwritingInstrs.assign(Instrs.begin(), Instrs.end()); - } + void generateReport(raw_ostream &OS, const BinaryContext &BC) const override; }; /// Report with a free-form message attached. @@ -263,8 +242,75 @@ struct GenericReport : public Report { const BinaryContext &BC) const override; }; +/// An information about an issue collected on the slower, detailed, +/// run of an analysis. +class ExtraInfo { +public: + virtual void print(raw_ostream &OS, const MCInstReference Location) const = 0; + + virtual ~ExtraInfo() {} +}; + +class ClobberingInfo : public ExtraInfo { + SmallVector ClobberingInstrs; + +public: + ClobberingInfo(const ArrayRef Instrs) + : ClobberingInstrs(Instrs) {} + + void print(raw_ostream &OS, const MCInstReference Location) const override; +}; + +/// A brief version of a report that can be further augmented with the details. +/// +/// It is common for a particular type of gadget detector to be tied to some +/// specific kind of analysis. 
If an issue is returned by that detector, it may +/// be further augmented with the detailed info in an analysis-specific way, +/// or just be left as-is (f.e. if a free-form warning was reported). +template struct BriefReport { + BriefReport(std::shared_ptr Issue, + const std::optional RequestedDetails) + : Issue(Issue), RequestedDetails(RequestedDetails) {} + + std::shared_ptr Issue; + std::optional RequestedDetails; +}; + +/// A detailed version of a report. +struct DetailedReport { + DetailedReport(std::shared_ptr Issue, + std::shared_ptr Details) + : Issue(Issue), Details(Details) {} + + std::shared_ptr Issue; + std::shared_ptr Details; +}; + struct FunctionAnalysisResult { - std::vector> Diagnostics; + std::vector Diagnostics; +}; + +/// A helper class storing per-function context to be instantiated by Analysis. +class FunctionAnalysis { + BinaryContext &B
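The two-pass reporting scheme described in the PR summary can be sketched with standard-library types only. Everything below is a simplified illustration: `Report` and `ExtraInfo` are reduced to plain structs, the request type is an `unsigned` register number chosen for the example, and `augment` is a hypothetical helper standing in for the second, detailed run. It needs C++17 for `std::optional`.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <vector>

struct Report { std::string Message; };  // simplified stand-in
struct ExtraInfo { std::string Text; };  // simplified stand-in

// First pass: the issue plus an optional request for more details.
template <typename T> struct BriefReport {
  std::shared_ptr<Report> Issue;
  std::optional<T> RequestedDetails;
};

// Second pass: the same issue with the details attached, if any.
struct DetailedReport {
  std::shared_ptr<Report> Issue;
  std::shared_ptr<ExtraInfo> Details; // null when nothing was requested
};

// Hypothetical second run: resolve each request into extra information.
std::vector<DetailedReport>
augment(const std::vector<BriefReport<unsigned>> &Brief) {
  std::vector<DetailedReport> Out;
  for (const auto &B : Brief) {
    std::shared_ptr<ExtraInfo> Details;
    if (B.RequestedDetails)
      Details = std::make_shared<ExtraInfo>(ExtraInfo{
          "clobbers of reg " + std::to_string(*B.RequestedDetails)});
    Out.push_back({B.Issue, Details});
  }
  return Out;
}
```

Keeping the request separate from the report means a free-form warning can pass through the second run untouched, while register-related issues pick up their details only when they were asked for.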
[llvm-branch-commits] [clang] release/20.x: [clang] Introduce "binary" StringLiteral for #embed data (#127629) (PR #133460)
github-actions[bot] wrote: @Fznamznon (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/133460
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect authentication oracles (PR #135663)
https://github.com/atrosinenko created https://github.com/llvm/llvm-project/pull/135663 Implement the detection of authentication instructions whose results can be inspected by an attacker to know whether authentication succeeded. As the properties of output registers of authentication instructions are inspected, add a second set of analysis-related classes to iterate over the instructions in reverse order. >From 3e21f98012fa793d62d8ed588b2551e4ef757498 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Sat, 5 Apr 2025 14:54:01 +0300 Subject: [PATCH] [BOLT] Gadget scanner: detect authentication oracles Implement the detection of authentication instructions whose results can be inspected by an attacker to know whether authentication succeeded. As the properties of output registers of authentication instructions are inspected, add a second set of analysis-related classes to iterate over the instructions in reverse order. --- bolt/include/bolt/Core/MCPlusBuilder.h| 3 + bolt/include/bolt/Passes/PAuthGadgetScanner.h | 12 + bolt/lib/Passes/PAuthGadgetScanner.cpp| 537 ++ .../AArch64/gs-pauth-authentication-oracles.s | 675 ++ .../AArch64/gs-pauth-debug-output.s | 78 ++ 5 files changed, 1305 insertions(+) create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-authentication-oracles.s diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 9d50036b8083b..4f895fb5f9cc5 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -622,6 +622,9 @@ class MCPlusBuilder { /// controlled, provided InReg and executable code are not. Please note that /// registers other than InReg as well as the contents of memory which is /// writable by the process should be considered attacker-controlled. + /// + /// The instruction should not write any values derived from InReg anywhere, + /// except for OutReg. 
virtual std::optional> analyzeAddressArithmeticsForPtrAuth(const MCInst &Inst) const { llvm_unreachable("not implemented"); diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h index 3b6c1f6af94a0..2b923e362941f 100644 --- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h +++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h @@ -261,6 +261,15 @@ class ClobberingInfo : public ExtraInfo { void print(raw_ostream &OS, const MCInstReference Location) const override; }; +class LeakageInfo : public ExtraInfo { + SmallVector LeakingInstrs; + +public: + LeakageInfo(const ArrayRef Instrs) : LeakingInstrs(Instrs) {} + + void print(raw_ostream &OS, const MCInstReference Location) const override; +}; + /// A brief version of a report that can be further augmented with the details. /// /// It is common for a particular type of gadget detector to be tied to some @@ -302,6 +311,9 @@ class FunctionAnalysis { void findUnsafeUses(SmallVector> &Reports); void augmentUnsafeUseReports(const ArrayRef> Reports); + void findUnsafeDefs(SmallVector> &Reports); + void augmentUnsafeDefReports(const ArrayRef> Reports); + public: FunctionAnalysis(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId, bool PacRetGadgetsOnly) diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index b3081f034e8ee..f403caddf3fd8 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -712,6 +712,460 @@ SrcSafetyAnalysis::create(BinaryFunction &BF, RegsToTrackInstsFor); } +/// A state representing which registers are safe to be used as the destination +/// operand of an authentication instruction. +/// +/// Similar to SrcState, it is the analysis that should take register aliasing +/// into account. 
+/// +/// Depending on the implementation, it may be possible that an authentication +/// instruction returns an invalid pointer on failure instead of terminating +/// the program immediately (assuming the program will crash as soon as that +/// pointer is dereferenced). To prevent brute-forcing the correct signature, +/// it should be impossible for an attacker to test if a pointer is correctly +/// signed - either the program should be terminated on authentication failure +/// or it should be impossible to tell whether authentication succeeded or not. +/// +/// For that reason, a restricted set of operations is allowed on any register +/// containing a value derived from the result of an authentication instruction +/// until that register is either wiped or checked not to contain a result of a +/// failed authentication. +/// +/// Specifically, the safety property for a register is computed by iterating +/// the instructions in backward order: the source register Xn of an instruction +/// Inst is safe if at least one of the following is true: +///
[llvm-branch-commits] [clang] release/20.x: [modules] Handle friend function that was a definition but became only a declaration during AST deserialization (#132214) (PR #134232)
https://github.com/dmpolukhin closed https://github.com/llvm/llvm-project/pull/134232
[llvm-branch-commits] [clang] release/20.x: [clang] Introduce "binary" StringLiteral for #embed data (#127629) (PR #133460)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/133460 >From 2e7710eaffddcbb6094e32826ec6e69bb4cb1799 Mon Sep 17 00:00:00 2001 From: Mariya Podchishchaeva Date: Thu, 20 Mar 2025 13:02:29 +0100 Subject: [PATCH 1/2] [clang] Introduce "binary" StringLiteral for #embed data (#127629) StringLiteral is used as internal data of EmbedExpr and we directly use it as an initializer if a single EmbedExpr appears in the initializer list of a char array. It is fast and convenient, but it is causing problems when string literal character values are checked because #embed data values are within a range [0-2^(char width)] but ordinary StringLiteral is of maybe signed char type. This PR introduces new kind of StringLiteral to hold binary data coming from an embedded resource to mitigate these problems. The new kind of StringLiteral is not assumed to have signed char type. The new kind of StringLiteral also helps to prevent crashes when trying to find StringLiteral token locations since these simply do not exist for binary data. Fixes https://github.com/llvm/llvm-project/issues/119256 --- clang/include/clang/AST/Expr.h| 15 --- clang/lib/AST/Expr.cpp| 7 +++ clang/lib/Parse/ParseInit.cpp | 2 +- clang/lib/Sema/SemaInit.cpp | 1 + clang/test/Preprocessor/embed_constexpr.c | 21 + 5 files changed, 42 insertions(+), 4 deletions(-) create mode 100644 clang/test/Preprocessor/embed_constexpr.c diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h index 7be4022649329..06ac0f1704aa9 100644 --- a/clang/include/clang/AST/Expr.h +++ b/clang/include/clang/AST/Expr.h @@ -1752,7 +1752,14 @@ enum class StringLiteralKind { UTF8, UTF16, UTF32, - Unevaluated + Unevaluated, + // Binary kind of string literal is used for the data coming via #embed + // directive. 
File's binary contents is transformed to a special kind of + // string literal that in some cases may be used directly as an initializer + // and some features of classic string literals are not applicable to this + // kind of a string literal, for example finding a particular byte's source + // location for better diagnosing. + Binary }; /// StringLiteral - This represents a string literal expression, e.g. "foo" @@ -1884,6 +1891,8 @@ class StringLiteral final int64_t getCodeUnitS(size_t I, uint64_t BitWidth) const { int64_t V = getCodeUnit(I); if (isOrdinary() || isWide()) { + // Ordinary and wide string literals have types that can be signed. + // It is important for checking C23 constexpr initializers. unsigned Width = getCharByteWidth() * BitWidth; llvm::APInt AInt(Width, (uint64_t)V); V = AInt.getSExtValue(); @@ -4965,9 +4974,9 @@ class EmbedExpr final : public Expr { assert(EExpr && CurOffset != ULLONG_MAX && "trying to dereference an invalid iterator"); IntegerLiteral *N = EExpr->FakeChildNode; - StringRef DataRef = EExpr->Data->BinaryData->getBytes(); N->setValue(*EExpr->Ctx, - llvm::APInt(N->getValue().getBitWidth(), DataRef[CurOffset], + llvm::APInt(N->getValue().getBitWidth(), + EExpr->Data->BinaryData->getCodeUnit(CurOffset), N->getType()->isSignedIntegerType())); // We want to return a reference to the fake child node in the // EmbedExpr, not the local variable N. 
diff --git a/clang/lib/AST/Expr.cpp b/clang/lib/AST/Expr.cpp index aa7e14329a21b..8571b617c70eb 100644 --- a/clang/lib/AST/Expr.cpp +++ b/clang/lib/AST/Expr.cpp @@ -1104,6 +1104,7 @@ unsigned StringLiteral::mapCharByteWidth(TargetInfo const &Target, switch (SK) { case StringLiteralKind::Ordinary: case StringLiteralKind::UTF8: + case StringLiteralKind::Binary: CharByteWidth = Target.getCharWidth(); break; case StringLiteralKind::Wide: @@ -1216,6 +1217,7 @@ void StringLiteral::outputString(raw_ostream &OS) const { switch (getKind()) { case StringLiteralKind::Unevaluated: case StringLiteralKind::Ordinary: + case StringLiteralKind::Binary: break; // no prefix. case StringLiteralKind::Wide: OS << 'L'; @@ -1332,6 +1334,11 @@ StringLiteral::getLocationOfByte(unsigned ByteNo, const SourceManager &SM, const LangOptions &Features, const TargetInfo &Target, unsigned *StartToken, unsigned *StartTokenByteOffset) const { + // No source location of bytes for binary literals since they don't come from + // source. + if (getKind() == StringLiteralKind::Binary) +return getStrTokenLoc(0); + assert((getKind() == StringLiteralKind::Ordinary || getKind() == StringLiteralKind::UTF8 || getKind() == StringLiteralKind::Unevaluated) && diff --git a/clang/lib/Parse/ParseInit.cpp b/clang/lib/Par
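The signedness problem the commit message describes comes down to how one byte is widened. The following sketch assumes an 8-bit `char` and treats ordinary string literals as having a signed element type, which is target-dependent; `Kind` and `codeUnitValue` are hypothetical names, loosely modelled on the `getCodeUnitS` hunk above rather than copied from Clang.

```cpp
#include <cassert>
#include <cstdint>

enum class Kind { Ordinary, Binary }; // hypothetical two-kind model

// Widens one 8-bit code unit. Ordinary literals are sign-extended as if the
// element type were signed char; binary (#embed) data keeps its 0..255
// value.
int64_t codeUnitValue(unsigned char Byte, Kind K) {
  if (K == Kind::Ordinary)
    return static_cast<int8_t>(Byte);
  return Byte;
}
```

With this model the byte 0xFF reads as -1 from an ordinary literal and as 255 from binary data, which is the distinction a range check on initializer values depends on.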
[llvm-branch-commits] [clang] release/20.x: [clang] Introduce "binary" StringLiteral for #embed data (#127629) (PR #133460)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/133460
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect authentication oracles (PR #135663)
atrosinenko wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/135663?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#135663** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/135663?utm_source=stack-comment-view-in-graphite)
* **#135662**
* **#135661**
* **#134146**
* **#133461**
* **#135073**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/135663
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)
atrosinenko wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/135662?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#135663**
* **#135662** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/135662?utm_source=stack-comment-view-in-graphite)
* **#135661**
* **#134146**
* **#133461**
* **#135073**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/135662
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: use more appropriate types (NFC) (PR #135661)
https://github.com/atrosinenko created https://github.com/llvm/llvm-project/pull/135661 * use more flexible `const ArrayRef` and `StringRef` types instead of `const std::vector &` and `const std::string &`, correspondingly, for function arguments * return plain `const SrcState &` instead of `ErrorOr` from `SrcSafetyAnalysis::getStateBefore`, as absent state is not handled gracefully by any caller >From 51373db0c000ad32a91eb4097ccc4404a6e54d25 Mon Sep 17 00:00:00 2001 From: Anatoly Trosinenko Date: Mon, 14 Apr 2025 14:35:56 +0300 Subject: [PATCH] [BOLT] Gadget scanner: use more appropriate types (NFC) * use more flexible `const ArrayRef` and `StringRef` types instead of `const std::vector &` and `const std::string &`, correspondingly, for function arguments * return plain `const SrcState &` instead of `ErrorOr` from `SrcSafetyAnalysis::getStateBefore`, as absent state is not handled gracefully by any caller --- bolt/include/bolt/Passes/PAuthGadgetScanner.h | 8 +--- bolt/lib/Passes/PAuthGadgetScanner.cpp| 39 --- 2 files changed, 19 insertions(+), 28 deletions(-) diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h index 6765e2aff414f..3e39b64e59e0f 100644 --- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h +++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h @@ -12,7 +12,6 @@ #include "bolt/Core/BinaryContext.h" #include "bolt/Core/BinaryFunction.h" #include "bolt/Passes/BinaryPasses.h" -#include "llvm/ADT/SmallSet.h" #include "llvm/Support/raw_ostream.h" #include @@ -199,9 +198,6 @@ raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &); namespace PAuthGadgetScanner { -class SrcSafetyAnalysis; -struct SrcState; - /// Description of a gadget kind that can be detected. Intended to be /// statically allocated to be attached to reports by reference. 
class GadgetKind { @@ -210,7 +206,7 @@ class GadgetKind { public: GadgetKind(const char *Description) : Description(Description) {} - const StringRef getDescription() const { return Description; } + StringRef getDescription() const { return Description; } }; /// Base report located at some instruction, without any additional information. @@ -261,7 +257,7 @@ struct GadgetReport : public Report { /// Report with a free-form message attached. struct GenericReport : public Report { std::string Text; - GenericReport(MCInstReference Location, const std::string &Text) + GenericReport(MCInstReference Location, StringRef Text) : Report(Location), Text(Text) {} virtual void generateReport(raw_ostream &OS, const BinaryContext &BC) const override; diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index ad47bdff753c8..ed89471cbb8d3 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -91,14 +91,14 @@ class TrackedRegisters { const std::vector Registers; std::vector RegToIndexMapping; - static size_t getMappingSize(const std::vector &RegsToTrack) { + static size_t getMappingSize(const ArrayRef RegsToTrack) { if (RegsToTrack.empty()) return 0; return 1 + *llvm::max_element(RegsToTrack); } public: - TrackedRegisters(const std::vector &RegsToTrack) + TrackedRegisters(const ArrayRef RegsToTrack) : Registers(RegsToTrack), RegToIndexMapping(getMappingSize(RegsToTrack), NoIndex) { for (unsigned I = 0; I < RegsToTrack.size(); ++I) @@ -234,7 +234,7 @@ struct SrcState { static void printLastInsts( raw_ostream &OS, -const std::vector> &LastInstWritingReg) { +const ArrayRef> LastInstWritingReg) { OS << "Insts: "; for (unsigned I = 0; I < LastInstWritingReg.size(); ++I) { auto &Set = LastInstWritingReg[I]; @@ -295,7 +295,7 @@ void SrcStatePrinter::print(raw_ostream &OS, const SrcState &S) const { class SrcSafetyAnalysis { public: SrcSafetyAnalysis(BinaryFunction &BF, -const std::vector 
&RegsToTrackInstsFor) +const ArrayRef RegsToTrackInstsFor) : BC(BF.getBinaryContext()), NumRegs(BC.MRI->getNumRegs()), RegsToTrackInstsFor(RegsToTrackInstsFor) {} @@ -303,11 +303,10 @@ class SrcSafetyAnalysis { static std::shared_ptr create(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocId, - const std::vector &RegsToTrackInstsFor); + const ArrayRef RegsToTrackInstsFor); virtual void run() = 0; - virtual ErrorOr - getStateBefore(const MCInst &Inst) const = 0; + virtual const SrcState &getStateBefore(const MCInst &Inst) const = 0; protected: BinaryContext &BC; @@ -348,7 +347,7 @@ class SrcSafetyAnalysis { } BitVector getClobberedRegs(const MCInst &Point) const { -BitVector Clobbered(NumRegs, false); +BitVector Clobbered(NumRegs); // Assume a call can clobber all registers, including callee-saved // registers. There's a
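The motivation for switching parameters from `const std::vector &` and `const std::string &` to `ArrayRef` and `StringRef` can be illustrated without LLVM headers. The sketch below is an analogue only: `RegView` is a hypothetical hand-rolled stand-in for a non-owning array view, `std::string_view` stands in for `StringRef`, and `GenericReportSketch` is an illustrative name rather than the class from the patch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a non-owning array view: pointer plus length.
// It is cheap to copy, so it is passed by value.
struct RegView {
  const unsigned *Data = nullptr;
  std::size_t Size = 0;

  RegView() = default;
  RegView(const std::vector<unsigned> &V) : Data(V.data()), Size(V.size()) {}
  template <std::size_t N>
  RegView(const std::array<unsigned, N> &A) : Data(A.data()), Size(N) {}

  bool empty() const { return Size == 0; }
  const unsigned *begin() const { return Data; }
  const unsigned *end() const { return Data + Size; }
};

// Same shape as getMappingSize in the diff: one past the largest register
// number, or zero when there is nothing to track.
inline std::size_t getMappingSize(RegView RegsToTrack) {
  if (RegsToTrack.empty())
    return 0;
  return 1 + *std::max_element(RegsToTrack.begin(), RegsToTrack.end());
}

// A string view parameter accepts literals and std::string alike; the
// member still owns its own copy of the text.
struct GenericReportSketch {
  std::string Text;
  explicit GenericReportSketch(std::string_view Text) : Text(Text) {}
};
```

With a view parameter the caller is free to pass a `std::vector`, a `std::array`, or an empty view without first materializing a vector, which is presumably the flexibility the PR description refers to.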
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: use more appropriate types (NFC) (PR #135661)
atrosinenko wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/135661?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#135663**
* **#135662**
* **#135661** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/135661?utm_source=stack-comment-view-in-graphite)
* **#134146**
* **#133461**
* **#135073**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/135661
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)
llvmbot wrote: @llvm/pr-subscribers-bolt Author: Anatoly Trosinenko (atrosinenko) Changes Remove `getAffectedRegisters` and `setOverwritingInstrs` methods from the base `Report` class. Instead, make `Report` always represent the brief version of the report. When an issue is detected on the first run of the analysis, return an optional request for extra details to attach to the report on the second run. --- Patch is 21.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/135662.diff 3 Files Affected: - (modified) bolt/include/bolt/Passes/PAuthGadgetScanner.h (+71-31) - (modified) bolt/lib/Passes/PAuthGadgetScanner.cpp (+112-88) - (modified) bolt/test/binary-analysis/AArch64/gs-pauth-debug-output.s (+4-4) ``diff diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h index 3e39b64e59e0f..3b6c1f6af94a0 100644 --- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h +++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h @@ -219,11 +219,6 @@ struct Report { virtual void generateReport(raw_ostream &OS, const BinaryContext &BC) const = 0; - // The two methods below are called by Analysis::computeDetailedInfo when - // iterating over the reports. - virtual const ArrayRef getAffectedRegisters() const { return {}; } - virtual void setOverwritingInstrs(const ArrayRef Instrs) {} - void printBasicInfo(raw_ostream &OS, const BinaryContext &BC, StringRef IssueKind) const; }; @@ -231,27 +226,11 @@ struct Report { struct GadgetReport : public Report { // The particular kind of gadget that is detected. const GadgetKind &Kind; - // The set of registers related to this gadget report (possibly empty). - SmallVector AffectedRegisters; - // The instructions that clobber the affected registers. - // There is no one-to-one correspondence with AffectedRegisters: for example, - // the same register can be overwritten by different instructions in different - // preceding basic blocks. 
- SmallVector OverwritingInstrs; - - GadgetReport(const GadgetKind &Kind, MCInstReference Location, - MCPhysReg AffectedRegister) - : Report(Location), Kind(Kind), AffectedRegisters({AffectedRegister}) {} - - void generateReport(raw_ostream &OS, const BinaryContext &BC) const override; - const ArrayRef getAffectedRegisters() const override { -return AffectedRegisters; - } + GadgetReport(const GadgetKind &Kind, MCInstReference Location) + : Report(Location), Kind(Kind) {} - void setOverwritingInstrs(const ArrayRef Instrs) override { -OverwritingInstrs.assign(Instrs.begin(), Instrs.end()); - } + void generateReport(raw_ostream &OS, const BinaryContext &BC) const override; }; /// Report with a free-form message attached. @@ -263,8 +242,75 @@ struct GenericReport : public Report { const BinaryContext &BC) const override; }; +/// An information about an issue collected on the slower, detailed, +/// run of an analysis. +class ExtraInfo { +public: + virtual void print(raw_ostream &OS, const MCInstReference Location) const = 0; + + virtual ~ExtraInfo() {} +}; + +class ClobberingInfo : public ExtraInfo { + SmallVector ClobberingInstrs; + +public: + ClobberingInfo(const ArrayRef Instrs) + : ClobberingInstrs(Instrs) {} + + void print(raw_ostream &OS, const MCInstReference Location) const override; +}; + +/// A brief version of a report that can be further augmented with the details. +/// +/// It is common for a particular type of gadget detector to be tied to some +/// specific kind of analysis. If an issue is returned by that detector, it may +/// be further augmented with the detailed info in an analysis-specific way, +/// or just be left as-is (f.e. if a free-form warning was reported). +template struct BriefReport { + BriefReport(std::shared_ptr Issue, + const std::optional RequestedDetails) + : Issue(Issue), RequestedDetails(RequestedDetails) {} + + std::shared_ptr Issue; + std::optional RequestedDetails; +}; + +/// A detailed version of a report. 
+struct DetailedReport { + DetailedReport(std::shared_ptr Issue, + std::shared_ptr Details) + : Issue(Issue), Details(Details) {} + + std::shared_ptr Issue; + std::shared_ptr Details; +}; + struct FunctionAnalysisResult { - std::vector> Diagnostics; + std::vector Diagnostics; +}; + +/// A helper class storing per-function context to be instantiated by Analysis. +class FunctionAnalysis { + BinaryContext &BC; + BinaryFunction &BF; + MCPlusBuilder::AllocatorIdTy AllocatorId; + FunctionAnalysisResult Result; + + bool PacRetGadgetsOnly; + + void findUnsafeUses(SmallVector> &Reports); + void augmentUnsafeUseReports(const ArrayRef> Reports); + +public: + FunctionAnalysis(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocatorId, + bool PacRetGadgetsOnly) + :
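The two-pass scheme described in this PR (a brief report on the first run, optionally carrying a request for details that a second run attaches) can be sketched in isolation. Everything below is a simplified assumption: the `*Sketch` types and `augmentReports` are hypothetical names, and the real classes carry instruction references and analysis state that are omitted here.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for the report and extra-info classes.
struct ReportSketch { std::string Message; };
struct ExtraInfoSketch { std::string Details; };

// First-pass result: the issue plus an optional request for more details.
template <typename ReqTy> struct BriefReportSketch {
  std::shared_ptr<ReportSketch> Issue;
  std::optional<ReqTy> RequestedDetails;
};

// Final result: the issue plus details, null when nothing was requested.
struct DetailedReportSketch {
  std::shared_ptr<ReportSketch> Issue;
  std::shared_ptr<ExtraInfoSketch> Details;
};

// Second pass: only reports that asked for details pay for the lookup.
// `Lookup` is any callable taking a ReqTy and returning an ExtraInfoSketch.
template <typename ReqTy, typename LookupFn>
std::vector<DetailedReportSketch>
augmentReports(const std::vector<BriefReportSketch<ReqTy>> &Reports,
               LookupFn Lookup) {
  std::vector<DetailedReportSketch> Result;
  for (const auto &R : Reports) {
    std::shared_ptr<ExtraInfoSketch> Details;
    if (R.RequestedDetails)
      Details = std::make_shared<ExtraInfoSketch>(Lookup(*R.RequestedDetails));
    Result.push_back({R.Issue, Details});
  }
  return Result;
}
```

The design point this tries to capture is that the base report no longer needs virtual hooks such as `getAffectedRegisters`; the request travels next to the report instead of inside it.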
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: use more appropriate types (NFC) (PR #135661)
llvmbot wrote: @llvm/pr-subscribers-bolt Author: Anatoly Trosinenko (atrosinenko) Changes * use more flexible `const ArrayRef` and `StringRef` types instead of `const std::vector &` and `const std::string &`, correspondingly, for function arguments * return plain `const SrcState &` instead of `ErrorOr ` from `SrcSafetyAnalysis::getStateBefore`, as absent state is not handled gracefully by any caller --- Full diff: https://github.com/llvm/llvm-project/pull/135661.diff 2 Files Affected: - (modified) bolt/include/bolt/Passes/PAuthGadgetScanner.h (+2-6) - (modified) bolt/lib/Passes/PAuthGadgetScanner.cpp (+17-22) ``diff diff --git a/bolt/include/bolt/Passes/PAuthGadgetScanner.h b/bolt/include/bolt/Passes/PAuthGadgetScanner.h index 6765e2aff414f..3e39b64e59e0f 100644 --- a/bolt/include/bolt/Passes/PAuthGadgetScanner.h +++ b/bolt/include/bolt/Passes/PAuthGadgetScanner.h @@ -12,7 +12,6 @@ #include "bolt/Core/BinaryContext.h" #include "bolt/Core/BinaryFunction.h" #include "bolt/Passes/BinaryPasses.h" -#include "llvm/ADT/SmallSet.h" #include "llvm/Support/raw_ostream.h" #include @@ -199,9 +198,6 @@ raw_ostream &operator<<(raw_ostream &OS, const MCInstReference &); namespace PAuthGadgetScanner { -class SrcSafetyAnalysis; -struct SrcState; - /// Description of a gadget kind that can be detected. Intended to be /// statically allocated to be attached to reports by reference. class GadgetKind { @@ -210,7 +206,7 @@ class GadgetKind { public: GadgetKind(const char *Description) : Description(Description) {} - const StringRef getDescription() const { return Description; } + StringRef getDescription() const { return Description; } }; /// Base report located at some instruction, without any additional information. @@ -261,7 +257,7 @@ struct GadgetReport : public Report { /// Report with a free-form message attached. 
struct GenericReport : public Report { std::string Text; - GenericReport(MCInstReference Location, const std::string &Text) + GenericReport(MCInstReference Location, StringRef Text) : Report(Location), Text(Text) {} virtual void generateReport(raw_ostream &OS, const BinaryContext &BC) const override; diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp index ad47bdff753c8..ed89471cbb8d3 100644 --- a/bolt/lib/Passes/PAuthGadgetScanner.cpp +++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp @@ -91,14 +91,14 @@ class TrackedRegisters { const std::vector Registers; std::vector RegToIndexMapping; - static size_t getMappingSize(const std::vector &RegsToTrack) { + static size_t getMappingSize(const ArrayRef RegsToTrack) { if (RegsToTrack.empty()) return 0; return 1 + *llvm::max_element(RegsToTrack); } public: - TrackedRegisters(const std::vector &RegsToTrack) + TrackedRegisters(const ArrayRef RegsToTrack) : Registers(RegsToTrack), RegToIndexMapping(getMappingSize(RegsToTrack), NoIndex) { for (unsigned I = 0; I < RegsToTrack.size(); ++I) @@ -234,7 +234,7 @@ struct SrcState { static void printLastInsts( raw_ostream &OS, -const std::vector> &LastInstWritingReg) { +const ArrayRef> LastInstWritingReg) { OS << "Insts: "; for (unsigned I = 0; I < LastInstWritingReg.size(); ++I) { auto &Set = LastInstWritingReg[I]; @@ -295,7 +295,7 @@ void SrcStatePrinter::print(raw_ostream &OS, const SrcState &S) const { class SrcSafetyAnalysis { public: SrcSafetyAnalysis(BinaryFunction &BF, -const std::vector &RegsToTrackInstsFor) +const ArrayRef RegsToTrackInstsFor) : BC(BF.getBinaryContext()), NumRegs(BC.MRI->getNumRegs()), RegsToTrackInstsFor(RegsToTrackInstsFor) {} @@ -303,11 +303,10 @@ class SrcSafetyAnalysis { static std::shared_ptr create(BinaryFunction &BF, MCPlusBuilder::AllocatorIdTy AllocId, - const std::vector &RegsToTrackInstsFor); + const ArrayRef RegsToTrackInstsFor); virtual void run() = 0; - virtual ErrorOr - getStateBefore(const MCInst &Inst) 
const = 0; + virtual const SrcState &getStateBefore(const MCInst &Inst) const = 0; protected: BinaryContext &BC; @@ -348,7 +347,7 @@ class SrcSafetyAnalysis { } BitVector getClobberedRegs(const MCInst &Point) const { -BitVector Clobbered(NumRegs, false); +BitVector Clobbered(NumRegs); // Assume a call can clobber all registers, including callee-saved // registers. There's a good chance that callee-saved registers will be // saved on the stack at some point during execution of the callee. @@ -409,8 +408,7 @@ class SrcSafetyAnalysis { // FirstCheckerInst should belong to the same basic block, meaning // it was deterministically processed a few steps before this instruction. - const SrcState &StateBeforeChecker = - getStateBefore(*FirstCheckerInst).get(); + const SrcState &
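The second bullet of the description, returning a plain reference from `getStateBefore` instead of an `ErrorOr`, amounts to treating a missing state as a programming error rather than a recoverable condition. A minimal sketch of the two styles follows; `AnalysisSketch`, `StateSketch` and `tryGetStateBefore` are hypothetical, and how the real implementation reacts to a missing state is an assumption here (an assertion is used for illustration).

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

struct StateSketch { int Value = 0; };

class AnalysisSketch {
  std::map<std::string, StateSketch> States;

public:
  void set(const std::string &Inst, StateSketch S) { States[Inst] = S; }

  // "Before" style: every caller has to handle the absent case.
  std::optional<StateSketch> tryGetStateBefore(const std::string &Inst) const {
    auto It = States.find(Inst);
    if (It == States.end())
      return std::nullopt;
    return It->second;
  }

  // "After" style: absence is treated as a bug, so callers get a reference
  // and no unwrapping call is needed at the use site.
  const StateSketch &getStateBefore(const std::string &Inst) const {
    auto It = States.find(Inst);
    assert(It != States.end() && "state must have been computed");
    return It->second;
  }
};
```

This mirrors the call-site change visible in the diff, where the trailing `.get()` on the result disappears.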
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: use more appropriate types (NFC) (PR #135661)
https://github.com/atrosinenko ready_for_review https://github.com/llvm/llvm-project/pull/135661
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -0,0 +1,113 @@
+//===- MCGOFFAttributes.h - Attributes of GOFF symbols ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Defines the various attribute collections defining GOFF symbols.
+//
+//===--===//
+
+#ifndef LLVM_MC_MCGOFFATTRIBUTES_H
+#define LLVM_MC_MCGOFFATTRIBUTES_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/BinaryFormat/GOFF.h"
+
+namespace llvm {
+namespace GOFF {
+// An "External Symbol Definition" in the GOFF file has a type, and depending on
+// the type a different subset of the fields is used.
+//
+// Unlike other formats, a 2 dimensional structure is used to define the
+// location of data. For example, the equivalent of the ELF .text section is
+// made up of a Section Definition (SD) and a class (Element Definition; ED).
+// The name of the SD symbol depends on the application, while the class has the
+// predefined name C_CODE/C_CODE64 in AMODE31 and AMODE64 respectively.
+//
+// Data can be placed into this structure in 2 ways. First, the data (in a text
+// record) can be associated with an ED symbol. To refer to data, a Label
+// Definition (LD) is used to give an offset into the data a name. When binding,
+// the whole data is pulled into the resulting executable, and the addresses
+// given by the LD symbols are resolved.
+//
+// The alternative is to use a Part Definition (PR). In this case, the data (in
+// a text record) is associated with the part. When binding, only the data of
+// referenced PRs is pulled into the resulting binary.
+//
+// Both approaches are used, which means that the equivalent of a section in ELF
+// results in 3 GOFF symbols, either SD/ED/LD or SD/ED/PR. Moreover, certain
+// sections are fine with just defining SD/ED symbols. The SymbolMapper takes
+// care of all those details.
+
+// Attributes for SD symbols.
+struct SDAttr {
+  GOFF::ESDTaskingBehavior TaskingBehavior = GOFF::ESD_TA_Unspecified;
+  GOFF::ESDBindingScope BindingScope = GOFF::ESD_BSC_Unspecified;
+};
+
+// Attributes for ED symbols.
+struct EDAttr {
+  bool IsReadOnly = false;
+  GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
+  GOFF::ESDAmode Amode;

redstar wrote:

I had to do a bit of research here. The Amode at the ED symbol acts as a default when no Amode is present at the LD/ER. The Amode at the PR does not seem to be necessary. However, I need to check whether removing it results in binder errors.

https://github.com/llvm/llvm-project/pull/133799
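The fallback rule described in the comment above (the ED-level Amode acts as a default when the LD/ER symbol carries none) can be written down as a one-line resolution step. This is only a model of that sentence, not of binder behavior: `AmodeSketch` and `effectiveAmode` are hypothetical names and do not correspond to the `GOFF::ESDAmode` enumerators.

```cpp
#include <cassert>
#include <optional>

// Hypothetical addressing-mode enum used for illustration only.
enum class AmodeSketch { A24, A31, A64 };

// Resolve the addressing mode for a label/reference symbol: use its own
// Amode when one is present, otherwise fall back to the enclosing ED's.
inline AmodeSketch effectiveAmode(std::optional<AmodeSketch> SymbolAmode,
                                  AmodeSketch EDAmode) {
  return SymbolAmode.value_or(EDAmode);
}
```

If the rule holds as described, keeping an Amode on the ED attributes remains useful even when individual symbols omit it.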
[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for extends and trunc (PR #132383)
@@ -179,8 +174,7 @@ body: |
 ; CHECK: liveins: $sgpr0

petar-avramovic wrote:

[Re: line +159]

> This change is a code quality regression: the input has G_ANYEXT, so the high half can be undefined.

fixed

See this comment inline on Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/132383?utm_source=unchanged-line-comment

https://github.com/llvm/llvm-project/pull/132383
[llvm-branch-commits] [lldb] Draft: test (PR #135630)
https://github.com/mizvekov updated https://github.com/llvm/llvm-project/pull/135630

>From 99af05b3d8227ffe9a43e67e700d3850e6bec665 Mon Sep 17 00:00:00 2001
From: Matheus Izvekov
Date: Mon, 14 Apr 2025 11:56:01 -0300
Subject: [PATCH] Draft: test

With change:
1) 4m46s - https://buildkite.com/llvm-project/github-pull-requests/builds/168411#_

---
 lldb/DELETE.ME | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 lldb/DELETE.ME

diff --git a/lldb/DELETE.ME b/lldb/DELETE.ME
new file mode 100644
index 0..e69de29bb2d1d
[llvm-branch-commits] [lldb] Draft: test (PR #135630)
https://github.com/mizvekov created https://github.com/llvm/llvm-project/pull/135630

None

>From e0418788c2384e1d7bd190baa52b9bfc0035ec2f Mon Sep 17 00:00:00 2001
From: Matheus Izvekov
Date: Mon, 14 Apr 2025 11:56:01 -0300
Subject: [PATCH] Draft: test

---
 lldb/DELETE.ME | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 lldb/DELETE.ME

diff --git a/lldb/DELETE.ME b/lldb/DELETE.ME
new file mode 100644
index 0..e69de29bb2d1d
[llvm-branch-commits] [llvm] [AArch64AsmPrinter]Place jump tables into hot/unlikely-prefixed data sections for aarch64 (PR #126018)
https://github.com/mingmingl-llvm edited https://github.com/llvm/llvm-project/pull/126018
[llvm-branch-commits] [compiler-rt] [llvm] Reentry (PR #135656)
mtrofin wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/135656?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#135656** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/135656?utm_source=stack-comment-view-in-graphite)
* **#135651**
* **#135650**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/135656
[llvm-branch-commits] [llvm] release/20.x: [HEXAGON] Fix corner cases for hwloops pass (#135439) (PR #135657)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/135657
[llvm-branch-commits] [llvm] release/20.x: [HEXAGON] Fix corner cases for hwloops pass (#135439) (PR #135657)
llvmbot wrote: @llvm/pr-subscribers-backend-hexagon Author: None (llvmbot) Changes Backport da8ce56c53fe6e34809ba0b310fa90257e230a89 Requested by: @androm3da --- Full diff: https://github.com/llvm/llvm-project/pull/135657.diff 3 Files Affected: - (modified) llvm/lib/Target/Hexagon/HexagonHardwareLoops.cpp (+45-1) - (added) llvm/test/CodeGen/Hexagon/hwloop-dist-check.mir (+277) - (modified) llvm/test/CodeGen/Hexagon/swp-phi-start.ll (+3-2) ``diff diff --git a/llvm/lib/Target/Hexagon/HexagonHardwareLoops.cpp b/llvm/lib/Target/Hexagon/HexagonHardwareLoops.cpp index 9334746349240..dd4b240455126 100644 --- a/llvm/lib/Target/Hexagon/HexagonHardwareLoops.cpp +++ b/llvm/lib/Target/Hexagon/HexagonHardwareLoops.cpp @@ -731,6 +731,11 @@ CountValue *HexagonHardwareLoops::computeCount(MachineLoop *Loop, Register IVReg, int64_t IVBump, Comparison::Kind Cmp) const { + LLVM_DEBUG(llvm::dbgs() << "Loop: " << *Loop << "\n"); + LLVM_DEBUG(llvm::dbgs() << "Initial Value: " << *Start << "\n"); + LLVM_DEBUG(llvm::dbgs() << "End Value: " << *End << "\n"); + LLVM_DEBUG(llvm::dbgs() << "Inc/Dec Value: " << IVBump << "\n"); + LLVM_DEBUG(llvm::dbgs() << "Comparison: " << Cmp << "\n"); // Cannot handle comparison EQ, i.e. while (A == B). if (Cmp == Comparison::EQ) return nullptr; @@ -846,6 +851,7 @@ CountValue *HexagonHardwareLoops::computeCount(MachineLoop *Loop, if (IVBump < 0) { std::swap(Start, End); IVBump = -IVBump; +std::swap(CmpLess, CmpGreater); } // Cmp may now have a wrong direction, e.g. LEs may now be GEs. // Signedness, and "including equality" are preserved. @@ -989,7 +995,45 @@ CountValue *HexagonHardwareLoops::computeCount(MachineLoop *Loop, CountSR = 0; } - return new CountValue(CountValue::CV_Register, CountR, CountSR); + const TargetRegisterClass *PredRC = &Hexagon::PredRegsRegClass; + Register MuxR = CountR; + unsigned MuxSR = CountSR; + // For the loop count to be valid unsigned number, CmpLess should imply + // Dist >= 0. Similarly, CmpGreater should imply Dist < 0. 
We can skip the + // check if the initial distance is zero and the comparison is LTu || LTEu. + if (!(Start->isImm() && StartV == 0 && Comparison::isUnsigned(Cmp) && +CmpLess) && + (CmpLess || CmpGreater)) { +// Generate: +// DistCheck = CMP_GT DistR, 0 --> CmpLess +// DistCheck = CMP_GT DistR, -1 --> CmpGreater +Register DistCheckR = MRI->createVirtualRegister(PredRC); +const MCInstrDesc &DistCheckD = TII->get(Hexagon::C2_cmpgti); +BuildMI(*PH, InsertPos, DL, DistCheckD, DistCheckR) +.addReg(DistR, 0, DistSR) +.addImm((CmpLess) ? 0 : -1); + +// Generate: +// MUXR = MUX DistCheck, CountR, 1 --> CmpLess +// MUXR = MUX DistCheck, 1, CountR --> CmpGreater +MuxR = MRI->createVirtualRegister(IntRC); +if (CmpLess) { + const MCInstrDesc &MuxD = TII->get(Hexagon::C2_muxir); + BuildMI(*PH, InsertPos, DL, MuxD, MuxR) + .addReg(DistCheckR) + .addReg(CountR, 0, CountSR) + .addImm(1); +} else { + const MCInstrDesc &MuxD = TII->get(Hexagon::C2_muxri); + BuildMI(*PH, InsertPos, DL, MuxD, MuxR) + .addReg(DistCheckR) + .addImm(1) + .addReg(CountR, 0, CountSR); +} +MuxSR = 0; + } + + return new CountValue(CountValue::CV_Register, MuxR, MuxSR); } /// Return true if the operation is invalid within hardware loop. 
diff --git a/llvm/test/CodeGen/Hexagon/hwloop-dist-check.mir b/llvm/test/CodeGen/Hexagon/hwloop-dist-check.mir new file mode 100644 index 0..9f8c14a314309 --- /dev/null +++ b/llvm/test/CodeGen/Hexagon/hwloop-dist-check.mir @@ -0,0 +1,277 @@ +# RUN: llc --mtriple=hexagon -run-pass=hwloops %s -o - | FileCheck %s + +# CHECK-LABEL: name: f +# CHECK: [[R1:%[0-9]+]]:predregs = C2_cmpgti [[R0:%[0-9]+]], 0 +# CHECK: [[R3:%[0-9]+]]:intregs = C2_muxir [[R1:%[0-9]+]], [[R2:%[0-9]+]], 1 +# CHECK-LABEL: name: g +# CHECK: [[R1:%[0-9]+]]:predregs = C2_cmpgti [[R0:%[0-9]+]], 0 +# CHECK: [[R3:%[0-9]+]]:intregs = C2_muxir [[R1:%[0-9]+]], [[R2:%[0-9]+]], 1 +--- | + @a = dso_local global [255 x ptr] zeroinitializer, align 8 + + ; Function Attrs: minsize nofree norecurse nosync nounwind optsize memory(write, argmem: none, inaccessiblemem: none) + define dso_local void @f(i32 noundef %m) local_unnamed_addr #0 { + entry: +%cond = tail call i32 @llvm.smax.i32(i32 %m, i32 2) +%0 = add nsw i32 %cond, -4 +%1 = shl i32 %cond, 3 +%cgep = getelementptr i8, ptr @a, i32 %1 +%cgep36 = bitcast ptr @a to ptr +br label %do.body + + do.body: ; preds = %do.body, %entry +%lsr.iv1 = phi ptr [ %cgep4, %do.body ], [ %cgep, %entry ] +%lsr.iv = phi i32 [ %lsr.iv.next, %do.body ], [ %0, %entry ] +%sh.
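The distance check added to `computeCount` can be read as a scalar select. The following is a minimal sketch of that reading, not LLVM code: `guardedTripCount` and `Direction` are hypothetical names, and the fallback of a single iteration is taken from the `MUX` operands described in the hunk's comments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the compare + mux pair the patch emits.
enum class Direction { Less, Greater };

// CmpLess:    DistCheck = CMP_GT Dist, 0   ; MUX DistCheck, Count, 1
// CmpGreater: DistCheck = CMP_GT Dist, -1  ; MUX DistCheck, 1, Count
uint32_t guardedTripCount(int32_t Dist, uint32_t Count, Direction Dir) {
  if (Dir == Direction::Less) {
    // A "less than" loop needs a positive distance for Count to be valid.
    bool DistCheck = Dist > 0;
    return DistCheck ? Count : 1u;
  }
  // A "greater than" loop needs a negative distance for Count to be valid.
  bool DistCheck = Dist > -1;
  return DistCheck ? 1u : Count;
}
```

Under this model, a distance with the wrong sign yields a trip count of 1 rather than a wrapped unsigned value.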
[llvm-branch-commits] [llvm] release/20.x: [HEXAGON] Fix corner cases for hwloops pass (#135439) (PR #135657)
llvmbot wrote: @iajbar What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/135657
[llvm-branch-commits] [lldb] Draft: test (PR #135630)
https://github.com/mizvekov updated https://github.com/llvm/llvm-project/pull/135630

>From fbe0a6651e8e506a25f0f16d34f3995f138d15ba Mon Sep 17 00:00:00 2001
From: Matheus Izvekov
Date: Mon, 14 Apr 2025 11:56:01 -0300
Subject: [PATCH] Draft: test

With change:
1) 4m46s - https://buildkite.com/llvm-project/github-pull-requests/builds/168411#_
2) 4m36s - https://buildkite.com/llvm-project/github-pull-requests/builds/168431#01963503-57a4-4934-9de8-f298abe3c432
---
 lldb/DELETE.ME | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 lldb/DELETE.ME

diff --git a/lldb/DELETE.ME b/lldb/DELETE.ME
new file mode 100644
index 0..e69de29bb2d1d
[llvm-branch-commits] [llvm] [HLSL] Adding support for Root Constants in LLVM Metadata (PR #135085)
inbelic wrote: The test name implies that we are only testing root constants. Can we either change the name or remove the root flags? https://github.com/llvm/llvm-project/pull/135085
[llvm-branch-commits] [llvm] [HLSL] Adding support for Root Constants in LLVM Metadata (PR #135085)
@@ -52,6 +59,45 @@ static bool parseRootFlags(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD,
   return false;
 }
 
+static bool extractMdValue(uint32_t &Value, MDNode *Node, unsigned int OpId) {

inbelic wrote: We could use this for `parseRootFlags` as well to make it consistent.

https://github.com/llvm/llvm-project/pull/135085
[llvm-branch-commits] [llvm] [HLSL] Adding support for Root Constants in LLVM Metadata (PR #135085)
@@ -0,0 +1,34 @@
+; RUN: opt %s -dxil-embed -dxil-globals -S -o - | FileCheck %s
+; RUN: llc %s --filetype=obj -o - | obj2yaml | FileCheck %s --check-prefix=DXC
+
+target triple = "dxil-unknown-shadermodel6.0-compute"
+
+; CHECK: @dx.rts0 = private constant [48 x i8] c"{{.*}}", section "RTS0", align 4
+
+define void @main() #0 {
+entry:
+  ret void
+}
+attributes #0 = { "hlsl.numthreads"="1,1,1" "hlsl.shader"="compute" }
+
+
+!dx.rootsignatures = !{!2} ; list of function/root signature pairs
+!2 = !{ ptr @main, !3 } ; function, root signature
+!3 = !{ !4, !5 } ; list of root signature elements
+!4 = !{ !"RootFlags", i32 1 } ; 1 = allow_input_assembler_input_layout
+!5 = !{ !"RootConstants", i32 0, i32 1, i32 2, i32 3 }
+
+; DXC: - Name:RTS0
+; DXC-NEXT:Size:48
+; DXC-NEXT:RootSignature:
+; DXC-NEXT: Version: 2
+; DXC-NEXT: NumStaticSamplers: 0
+; DXC-NEXT: StaticSamplersOffset: 0

inbelic wrote: This has a different value from the test above (0 vs 48). Presumably they should be the same?

https://github.com/llvm/llvm-project/pull/135085
[llvm-branch-commits] [llvm] [HLSL] Adding support for Root Constants in LLVM Metadata (PR #135085)
@@ -94,10 +144,56 @@ static bool parse(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD,
 static bool verifyRootFlag(uint32_t Flags) { return (Flags & ~0xfff) == 0; }
 
+static bool verifyShaderVisibility(dxbc::ShaderVisibility Flags) {
+  switch (Flags) {
+
+  case dxbc::ShaderVisibility::All:
+  case dxbc::ShaderVisibility::Vertex:
+  case dxbc::ShaderVisibility::Hull:
+  case dxbc::ShaderVisibility::Domain:
+  case dxbc::ShaderVisibility::Geometry:
+  case dxbc::ShaderVisibility::Pixel:
+  case dxbc::ShaderVisibility::Amplification:
+  case dxbc::ShaderVisibility::Mesh:
+    return true;
+  }
+
+  return false;
+}
+
+static bool verifyParameterType(dxbc::RootParameterType Flags) {
+  switch (Flags) {
+  case dxbc::RootParameterType::Constants32Bit:

inbelic wrote: Shouldn't root flags also be here?

https://github.com/llvm/llvm-project/pull/135085
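The two validation styles in this hunk (a bit-mask test for flags and an exhaustive `switch` for enumerators) can be tried in isolation. This is a minimal sketch under stated assumptions: `Visibility` is a hypothetical stand-in for `dxbc::ShaderVisibility`, with a reduced set of enumerators whose numeric values are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Same test as verifyRootFlag in the hunk: only the low 12 bits may be set.
bool verifyRootFlag(uint32_t Flags) { return (Flags & ~0xfffu) == 0; }

// Hypothetical enum; the real dxbc::ShaderVisibility has more members and
// its numeric values are not shown in the quoted diff.
enum class Visibility : uint32_t { All = 0, Vertex = 1, Pixel = 5 };

// Accept a raw value only if it names a known enumerator.
bool verifyVisibility(uint32_t Raw) {
  switch (static_cast<Visibility>(Raw)) {
  case Visibility::All:
  case Visibility::Vertex:
  case Visibility::Pixel:
    return true;
  }
  // Anything that falls through the switch is not a known enumerator.
  return false;
}
```

Listing every enumerator without a `default:` keeps compiler warnings useful: adding a new enumerator without updating the `switch` can then be flagged at build time.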
[llvm-branch-commits] [llvm] [HLSL] Adding support for Root Constants in LLVM Metadata (PR #135085)
inbelic wrote: I think you meant this to be a new test file and not to edit the root flags validation. Can we also remove the "RootFlags" member to reduce noise? https://github.com/llvm/llvm-project/pull/135085
[llvm-branch-commits] [llvm] llvm-reduce: Preserve uselistorder when writing thinlto bitcode (PR #133369)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/133369 >From f056830c8eda3fab39cf73dad981b7b7091bdeb6 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 28 Mar 2025 10:46:08 +0700 Subject: [PATCH 1/2] llvm-reduce: Preserve uselistorder when writing thinlto bitcode Fixes #63621 --- .../thinlto-preserve-uselistorder.ll | 19 +++ llvm/tools/llvm-reduce/ReducerWorkItem.cpp| 11 --- 2 files changed, 27 insertions(+), 3 deletions(-) create mode 100644 llvm/test/tools/llvm-reduce/thinlto-preserve-uselistorder.ll diff --git a/llvm/test/tools/llvm-reduce/thinlto-preserve-uselistorder.ll b/llvm/test/tools/llvm-reduce/thinlto-preserve-uselistorder.ll new file mode 100644 index 0..2332f2d632911 --- /dev/null +++ b/llvm/test/tools/llvm-reduce/thinlto-preserve-uselistorder.ll @@ -0,0 +1,19 @@ +; RUN: opt --thinlto-bc --thinlto-split-lto-unit %s -o %t.0 +; RUN: llvm-reduce -write-tmp-files-as-bitcode --delta-passes=instructions %t.0 -o %t.1 \ +; RUN: --test %python --test-arg %p/Inputs/llvm-dis-and-filecheck.py --test-arg llvm-dis --test-arg FileCheck --test-arg --check-prefix=INTERESTING --test-arg %s +; RUN: llvm-dis --preserve-ll-uselistorder %t.1 -o %t.2 +; RUN: FileCheck --check-prefix=RESULT %s < %t.2 + +define i32 @func(i32 %arg0, i32 %arg1) { +entry: + %add0 = add i32 %arg0, 0 + %add1 = add i32 %add0, 0 + %add2 = add i32 %add1, 0 + %add3 = add i32 %arg1, 0 + %add4 = add i32 %add2, %add3 + ret i32 %add4 +} + +; INTERESTING: uselistorder i32 0 +; RESULT: uselistorder i32 0, { 0, 2, 1 } +uselistorder i32 0, { 3, 2, 1, 0 } diff --git a/llvm/tools/llvm-reduce/ReducerWorkItem.cpp b/llvm/tools/llvm-reduce/ReducerWorkItem.cpp index 8d2675c685038..9af2e5f5fdd23 100644 --- a/llvm/tools/llvm-reduce/ReducerWorkItem.cpp +++ b/llvm/tools/llvm-reduce/ReducerWorkItem.cpp @@ -776,7 +776,11 @@ void ReducerWorkItem::readBitcode(MemoryBufferRef Data, LLVMContext &Ctx, } void ReducerWorkItem::writeBitcode(raw_ostream &OutStream) const { + const bool 
ShouldPreserveUseListOrder = true; + if (LTOInfo && LTOInfo->IsThinLTO && LTOInfo->EnableSplitLTOUnit) { +// FIXME: This should not depend on the pass manager. There are hidden +// transforms that may happen inside ThinLTOBitcodeWriterPass PassBuilder PB; LoopAnalysisManager LAM; FunctionAnalysisManager FAM; @@ -788,7 +792,8 @@ void ReducerWorkItem::writeBitcode(raw_ostream &OutStream) const { PB.registerLoopAnalyses(LAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM); ModulePassManager MPM; -MPM.addPass(ThinLTOBitcodeWriterPass(OutStream, nullptr)); +MPM.addPass(ThinLTOBitcodeWriterPass(OutStream, nullptr, + ShouldPreserveUseListOrder)); MPM.run(*M, MAM); } else { std::unique_ptr Index; @@ -797,8 +802,8 @@ void ReducerWorkItem::writeBitcode(raw_ostream &OutStream) const { Index = std::make_unique( buildModuleSummaryIndex(*M, nullptr, &PSI)); } -WriteBitcodeToFile(getModule(), OutStream, - /*ShouldPreserveUseListOrder=*/true, Index.get()); +WriteBitcodeToFile(getModule(), OutStream, ShouldPreserveUseListOrder, + Index.get()); } } >From f324dce27a48741d8381843ba7478cf8f78b6c06 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Mon, 14 Apr 2025 16:19:11 +0200 Subject: [PATCH 2/2] Remove fixme --- llvm/tools/llvm-reduce/ReducerWorkItem.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/llvm/tools/llvm-reduce/ReducerWorkItem.cpp b/llvm/tools/llvm-reduce/ReducerWorkItem.cpp index 9af2e5f5fdd23..67da8bf1fd2bf 100644 --- a/llvm/tools/llvm-reduce/ReducerWorkItem.cpp +++ b/llvm/tools/llvm-reduce/ReducerWorkItem.cpp @@ -779,8 +779,6 @@ void ReducerWorkItem::writeBitcode(raw_ostream &OutStream) const { const bool ShouldPreserveUseListOrder = true; if (LTOInfo && LTOInfo->IsThinLTO && LTOInfo->EnableSplitLTOUnit) { -// FIXME: This should not depend on the pass manager. 
There are hidden -// transforms that may happen inside ThinLTOBitcodeWriterPass PassBuilder PB; LoopAnalysisManager LAM; FunctionAnalysisManager FAM; ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [ctxprof] Extend the notion of "cannot return" (PR #135651)
https://github.com/mtrofin updated https://github.com/llvm/llvm-project/pull/135651 >From d02d00bef28047f16f37454576050fcf15a87814 Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Mon, 14 Apr 2025 10:03:55 -0700 Subject: [PATCH] [ctxprof] Extend the notion of "cannot return" --- .../Instrumentation/PGOCtxProfLowering.cpp| 19 -- .../ctx-instrumentation-invalid-roots.ll | 25 +++ .../PGOProfile/ctx-instrumentation.ll | 15 ++- 3 files changed, 41 insertions(+), 18 deletions(-) diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp index f99d7b9d03e02..136225ab27cdc 100644 --- a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp +++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp @@ -9,6 +9,7 @@ #include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/Analysis/CFG.h" #include "llvm/Analysis/CtxProfAnalysis.h" #include "llvm/Analysis/OptimizationRemarkEmitter.h" #include "llvm/IR/Analysis.h" @@ -105,6 +106,12 @@ std::pair getNumCountersAndCallsites(const Function &F) { } return {NumCounters, NumCallsites}; } + +void emitUnsupportedRoot(const Function &F, StringRef Reason) { + F.getContext().emitError("[ctxprof] The function " + F.getName() + + " was indicated as context root but " + Reason + + ", which is not supported."); +} } // namespace // set up tie-in with compiler-rt. 
@@ -164,12 +171,8 @@ CtxInstrumentationLowerer::CtxInstrumentationLowerer(Module &M, for (const auto &BB : *F) for (const auto &I : BB) if (const auto *CB = dyn_cast(&I)) -if (CB->isMustTailCall()) { - M.getContext().emitError("The function " + Fname + - " was indicated as a context root, " - "but it features musttail " - "calls, which is not supported."); -} +if (CB->isMustTailCall()) + emitUnsupportedRoot(*F, "it features musttail calls"); } } @@ -230,11 +233,13 @@ bool CtxInstrumentationLowerer::lowerFunction(Function &F) { // Probably pointless to try to do anything here, unlikely to be // performance-affecting. - if (F.doesNotReturn()) { + if (!llvm::canReturn(F)) { for (auto &BB : F) for (auto &I : make_early_inc_range(BB)) if (isa(&I)) I.eraseFromParent(); +if (ContextRootSet.contains(&F)) + emitUnsupportedRoot(F, "it does not return"); return true; } diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll index 454780153b823..b5ceb4602c60b 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll @@ -1,17 +1,22 @@ -; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=good \ -; RUN: -profile-context-root=bad \ -; RUN: -S < %s 2>&1 | FileCheck %s +; RUN: split-file %s %t +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/musttail.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/unreachable.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/noreturn.ll -o - 2>&1 | FileCheck %s +;--- musttail.ll declare void @foo() -define void @good() { - call void @foo() - ret void -} - -define void @bad() { +define void @the_func() { musttail call void @foo() ret void } +;--- 
unreachable.ll +define void @the_func() { + unreachable +} +;--- noreturn.ll +define void @the_func() noreturn { + unreachable +} -; CHECK: error: The function bad was indicated as a context root, but it features musttail calls, which is not supported. +; CHECK: error: [ctxprof] The function the_func was indicated as context root diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll index 6b2f25a585ec3..6afa37ef286f5 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll @@ -18,7 +18,7 @@ declare void @bar() ; LOWERING: @[[GLOB4:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } zeroinitializer ; LOWERING: @[[GLOB5:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } zeroinitializer ; LOWERING: @[[GLOB6:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } zeroinitializer -; LOWERING: @[[GLOB7:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } { ptr null, ptr null, ptr inttoptr (i64 1 to ptr), ptr null, i8 0 } +; LOWERING: @[[GLOB7:[0-9]+]] = intern
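The new `emitUnsupportedRoot` helper builds one diagnostic shape for every rejected root, which is why a single `CHECK` line can cover all three split test inputs. A minimal sketch of the message construction, assuming plain `std::string` in place of LLVM's string types (`unsupportedRootMessage` is a hypothetical name):

```cpp
#include <cassert>
#include <string>

// Hypothetical std::string model of the message emitUnsupportedRoot passes to
// emitError; the literal pieces are copied from the quoted hunk.
std::string unsupportedRootMessage(const std::string &FuncName,
                                   const std::string &Reason) {
  return "[ctxprof] The function " + FuncName +
         " was indicated as context root but " + Reason +
         ", which is not supported.";
}
```

With `the_func` as the root name, both reasons used in the patch ("it features musttail calls" and "it does not return") produce text that starts with the prefix the updated test checks for.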
[llvm-branch-commits] [llvm] [ctxprof] Extend the notion of "cannot return" (PR #135651)
https://github.com/mtrofin updated https://github.com/llvm/llvm-project/pull/135651 >From 9642055c61eb43c7d821924fbbe180bc52a3d0d6 Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Mon, 14 Apr 2025 10:03:55 -0700 Subject: [PATCH] [ctxprof] Extend the notion of "cannot return" --- .../llvm/Transforms/IPO/FunctionAttrs.h | 2 -- .../Instrumentation/PGOCtxProfLowering.cpp| 19 -- .../ctx-instrumentation-invalid-roots.ll | 25 +++ .../PGOProfile/ctx-instrumentation.ll | 15 ++- 4 files changed, 41 insertions(+), 20 deletions(-) diff --git a/llvm/include/llvm/Transforms/IPO/FunctionAttrs.h b/llvm/include/llvm/Transforms/IPO/FunctionAttrs.h index 3a2c09afbebd3..6a21ff616d506 100644 --- a/llvm/include/llvm/Transforms/IPO/FunctionAttrs.h +++ b/llvm/include/llvm/Transforms/IPO/FunctionAttrs.h @@ -30,8 +30,6 @@ class Module; /// Returns the memory access properties of this copy of the function. MemoryEffects computeFunctionBodyMemoryAccess(Function &F, AAResults &AAR); -bool canReturn(const Function &F); - /// Propagate function attributes for function summaries along the index's /// callgraph during thinlink bool thinLTOPropagateFunctionAttrs( diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp index f99d7b9d03e02..136225ab27cdc 100644 --- a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp +++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp @@ -9,6 +9,7 @@ #include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/Analysis/CFG.h" #include "llvm/Analysis/CtxProfAnalysis.h" #include "llvm/Analysis/OptimizationRemarkEmitter.h" #include "llvm/IR/Analysis.h" @@ -105,6 +106,12 @@ std::pair getNumCountersAndCallsites(const Function &F) { } return {NumCounters, NumCallsites}; } + +void emitUnsupportedRoot(const Function &F, StringRef Reason) { + F.getContext().emitError("[ctxprof] The function " + F.getName() + + " was indicated 
as context root but " + Reason + + ", which is not supported."); +} } // namespace // set up tie-in with compiler-rt. @@ -164,12 +171,8 @@ CtxInstrumentationLowerer::CtxInstrumentationLowerer(Module &M, for (const auto &BB : *F) for (const auto &I : BB) if (const auto *CB = dyn_cast(&I)) -if (CB->isMustTailCall()) { - M.getContext().emitError("The function " + Fname + - " was indicated as a context root, " - "but it features musttail " - "calls, which is not supported."); -} +if (CB->isMustTailCall()) + emitUnsupportedRoot(*F, "it features musttail calls"); } } @@ -230,11 +233,13 @@ bool CtxInstrumentationLowerer::lowerFunction(Function &F) { // Probably pointless to try to do anything here, unlikely to be // performance-affecting. - if (F.doesNotReturn()) { + if (!llvm::canReturn(F)) { for (auto &BB : F) for (auto &I : make_early_inc_range(BB)) if (isa(&I)) I.eraseFromParent(); +if (ContextRootSet.contains(&F)) + emitUnsupportedRoot(F, "it does not return"); return true; } diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll index 454780153b823..b5ceb4602c60b 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll @@ -1,17 +1,22 @@ -; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=good \ -; RUN: -profile-context-root=bad \ -; RUN: -S < %s 2>&1 | FileCheck %s +; RUN: split-file %s %t +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/musttail.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/unreachable.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/noreturn.ll -o - 2>&1 | FileCheck %s +;--- musttail.ll declare void @foo() -define void @good() { - call void @foo() 
- ret void -} - -define void @bad() { +define void @the_func() { musttail call void @foo() ret void } +;--- unreachable.ll +define void @the_func() { + unreachable +} +;--- noreturn.ll +define void @the_func() noreturn { + unreachable +} -; CHECK: error: The function bad was indicated as a context root, but it features musttail calls, which is not supported. +; CHECK: error: [ctxprof] The function the_func was indicated as context root diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll b/llvm/test/Transforms/PGOProfile/ct
[llvm-branch-commits] [llvm] [ctxprof] Extend the notion of "cannot return" (PR #135651)
https://github.com/mtrofin updated https://github.com/llvm/llvm-project/pull/135651 >From ea629230ae6202ed34122cecb7ebce20ccffad19 Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Mon, 14 Apr 2025 10:03:55 -0700 Subject: [PATCH] [ctxprof] Extend the notion of "cannot return" --- .../Instrumentation/PGOCtxProfLowering.cpp| 19 -- .../ctx-instrumentation-invalid-roots.ll | 25 +++ .../PGOProfile/ctx-instrumentation.ll | 15 ++- 3 files changed, 41 insertions(+), 18 deletions(-) diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp index f99d7b9d03e02..136225ab27cdc 100644 --- a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp +++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp @@ -9,6 +9,7 @@ #include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/Analysis/CFG.h" #include "llvm/Analysis/CtxProfAnalysis.h" #include "llvm/Analysis/OptimizationRemarkEmitter.h" #include "llvm/IR/Analysis.h" @@ -105,6 +106,12 @@ std::pair getNumCountersAndCallsites(const Function &F) { } return {NumCounters, NumCallsites}; } + +void emitUnsupportedRoot(const Function &F, StringRef Reason) { + F.getContext().emitError("[ctxprof] The function " + F.getName() + + " was indicated as context root but " + Reason + + ", which is not supported."); +} } // namespace // set up tie-in with compiler-rt. 
@@ -164,12 +171,8 @@ CtxInstrumentationLowerer::CtxInstrumentationLowerer(Module &M, for (const auto &BB : *F) for (const auto &I : BB) if (const auto *CB = dyn_cast(&I)) -if (CB->isMustTailCall()) { - M.getContext().emitError("The function " + Fname + - " was indicated as a context root, " - "but it features musttail " - "calls, which is not supported."); -} +if (CB->isMustTailCall()) + emitUnsupportedRoot(*F, "it features musttail calls"); } } @@ -230,11 +233,13 @@ bool CtxInstrumentationLowerer::lowerFunction(Function &F) { // Probably pointless to try to do anything here, unlikely to be // performance-affecting. - if (F.doesNotReturn()) { + if (!llvm::canReturn(F)) { for (auto &BB : F) for (auto &I : make_early_inc_range(BB)) if (isa(&I)) I.eraseFromParent(); +if (ContextRootSet.contains(&F)) + emitUnsupportedRoot(F, "it does not return"); return true; } diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll index 454780153b823..b5ceb4602c60b 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll @@ -1,17 +1,22 @@ -; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=good \ -; RUN: -profile-context-root=bad \ -; RUN: -S < %s 2>&1 | FileCheck %s +; RUN: split-file %s %t +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/musttail.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/unreachable.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/noreturn.ll -o - 2>&1 | FileCheck %s +;--- musttail.ll declare void @foo() -define void @good() { - call void @foo() - ret void -} - -define void @bad() { +define void @the_func() { musttail call void @foo() ret void } +;--- 
unreachable.ll +define void @the_func() { + unreachable +} +;--- noreturn.ll +define void @the_func() noreturn { + unreachable +} -; CHECK: error: The function bad was indicated as a context root, but it features musttail calls, which is not supported. +; CHECK: error: [ctxprof] The function the_func was indicated as context root diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll index 6b2f25a585ec3..6afa37ef286f5 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll @@ -18,7 +18,7 @@ declare void @bar() ; LOWERING: @[[GLOB4:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } zeroinitializer ; LOWERING: @[[GLOB5:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } zeroinitializer ; LOWERING: @[[GLOB6:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } zeroinitializer -; LOWERING: @[[GLOB7:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } { ptr null, ptr null, ptr inttoptr (i64 1 to ptr), ptr null, i8 0 } +; LOWERING: @[[GLOB7:[0-9]+]] = intern
[llvm-branch-commits] [compiler-rt] [llvm] Reentry (PR #135656)
https://github.com/mtrofin updated https://github.com/llvm/llvm-project/pull/135656 >From 978d61c0a92cd2a66c64c8f5daa1a3f30c18df77 Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Mon, 14 Apr 2025 07:19:58 -0700 Subject: [PATCH] Reentry --- .../lib/ctx_profile/CtxInstrProfiling.cpp | 151 -- .../tests/CtxInstrProfilingTest.cpp | 115 - .../llvm/ProfileData/CtxInstrContextNode.h| 6 +- .../Instrumentation/PGOCtxProfLowering.cpp| 82 ++ .../PGOProfile/ctx-instrumentation.ll | 4 +- 5 files changed, 269 insertions(+), 89 deletions(-) diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp index 2d173f0fcb19a..2e26541c1acea 100644 --- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp +++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp @@ -41,7 +41,44 @@ Arena *FlatCtxArena = nullptr; // Set to true when we enter a root, and false when we exit - regardless if this // thread collects a contextual profile for that root. -__thread bool IsUnderContext = false; +__thread int UnderContextRefCount = 0; +__thread void *volatile EnteredContextAddress = 0; + +void onFunctionEntered(void *Address) { + UnderContextRefCount += (Address == EnteredContextAddress); + assert(UnderContextRefCount > 0); +} + +void onFunctionExited(void *Address) { + UnderContextRefCount -= (Address == EnteredContextAddress); + assert(UnderContextRefCount >= 0); +} + +// Returns true if it was entered the first time +bool rootEnterIsFirst(void* Address) { + bool Ret = true; + if (!EnteredContextAddress) { +EnteredContextAddress = Address; +assert(UnderContextRefCount == 0); +Ret = true; + } + onFunctionEntered(Address); + return Ret; +} + +// Return true if this also exits the root. 
+bool exitsRoot(void* Address) { + onFunctionExited(Address); + if (UnderContextRefCount == 0) { +EnteredContextAddress = nullptr; +return true; + } + return false; + +} + +bool hasEnteredARoot() { return UnderContextRefCount > 0; } + __sanitizer::atomic_uint8_t ProfilingStarted = {}; __sanitizer::atomic_uintptr_t RootDetector = {}; @@ -287,62 +324,65 @@ ContextRoot *FunctionData::getOrAllocateContextRoot() { return Root; } -ContextNode *tryStartContextGivenRoot(ContextRoot *Root, GUID Guid, - uint32_t Counters, uint32_t Callsites) -SANITIZER_NO_THREAD_SAFETY_ANALYSIS { - IsUnderContext = true; - __sanitizer::atomic_fetch_add(&Root->TotalEntries, 1, -__sanitizer::memory_order_relaxed); +ContextNode *tryStartContextGivenRoot( +ContextRoot *Root, void *EntryAddress, GUID Guid, uint32_t Counters, +uint32_t Callsites) SANITIZER_NO_THREAD_SAFETY_ANALYSIS { + + if (rootEnterIsFirst(EntryAddress)) +__sanitizer::atomic_fetch_add(&Root->TotalEntries, 1, + __sanitizer::memory_order_relaxed); if (!Root->FirstMemBlock) { setupContext(Root, Guid, Counters, Callsites); } if (Root->Taken.TryLock()) { +assert(__llvm_ctx_profile_current_context_root == nullptr); __llvm_ctx_profile_current_context_root = Root; onContextEnter(*Root->FirstNode); return Root->FirstNode; } // If this thread couldn't take the lock, return scratch context. - __llvm_ctx_profile_current_context_root = nullptr; return TheScratchContext; } +ContextNode *getOrStartContextOutsideCollection(FunctionData &Data, +ContextRoot *OwnCtxRoot, +void *Callee, GUID Guid, +uint32_t NumCounters, +uint32_t NumCallsites) { + // This must only be called when __llvm_ctx_profile_current_context_root is + // null. + assert(__llvm_ctx_profile_current_context_root == nullptr); + // OwnCtxRoot is Data.CtxRoot. Since it's volatile, and is used by the caller, + // pre-load it. + assert(Data.CtxRoot == OwnCtxRoot); + // If we have a root detector, try sampling. 
+ // Otherwise - regardless if we started profiling or not, if Data.CtxRoot is + // allocated, try starting a context tree - basically, as-if + // __llvm_ctx_profile_start_context were called. + if (auto *RAD = getRootDetector()) +RAD->sample(); + else if (reinterpret_cast(OwnCtxRoot) > 1) +return tryStartContextGivenRoot(OwnCtxRoot, Data.EntryAddress, Guid, +NumCounters, NumCallsites); + + // If we didn't start profiling, or if we are under a context, just not + // collecting, return the scratch buffer. + if (hasEnteredARoot() || + !__sanitizer::atomic_load_relaxed(&ProfilingStarted)) +return TheScratchContext; + return markAsScratch( + onContextEnter(*getFlatProfile(Data, Callee, Guid, NumCounters))); +} + ContextNode *getUnhandledContext(FunctionData &Data, void
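The patch replaces the `IsUnderContext` flag with a per-thread reference count keyed on the address of the first root entered. Below is a simplified sketch of that bookkeeping, not the runtime code: it reuses the names from the hunk, drops `volatile` and the asserts, and starts the first-entry flag at `false`. That last point is an assumption about the intent, since the quoted hunk initializes `Ret` to `true`.

```cpp
#include <cassert>

// Per-thread state: how many times the current root has been (re)entered,
// and which entry address identifies that root.
thread_local int UnderContextRefCount = 0;
thread_local const void *EnteredContextAddress = nullptr;

// Returns true only for the outermost entry into a root on this thread.
bool rootEnterIsFirst(const void *Address) {
  bool IsFirst = false; // assumption: the hunk starts this at true
  if (!EnteredContextAddress) {
    EnteredContextAddress = Address;
    IsFirst = true;
  }
  // Only entries into the tracked root bump the count.
  UnderContextRefCount += (Address == EnteredContextAddress);
  return IsFirst;
}

// Returns true when this exit unwinds the outermost entry of the root.
bool exitsRoot(const void *Address) {
  UnderContextRefCount -= (Address == EnteredContextAddress);
  if (UnderContextRefCount == 0) {
    EnteredContextAddress = nullptr;
    return true;
  }
  return false;
}

bool hasEnteredARoot() { return UnderContextRefCount > 0; }
```

In this model a recursive call into the same root increments the count without counting as a first entry, and entering a different root while one is active leaves the count untouched.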
[llvm-branch-commits] [compiler-rt] [llvm] Reentry (PR #135656)
https://github.com/mtrofin updated https://github.com/llvm/llvm-project/pull/135656 >From 978d61c0a92cd2a66c64c8f5daa1a3f30c18df77 Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Mon, 14 Apr 2025 07:19:58 -0700 Subject: [PATCH] Reentry --- .../lib/ctx_profile/CtxInstrProfiling.cpp | 151 -- .../tests/CtxInstrProfilingTest.cpp | 115 - .../llvm/ProfileData/CtxInstrContextNode.h| 6 +- .../Instrumentation/PGOCtxProfLowering.cpp| 82 ++ .../PGOProfile/ctx-instrumentation.ll | 4 +- 5 files changed, 269 insertions(+), 89 deletions(-) diff --git a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp index 2d173f0fcb19a..2e26541c1acea 100644 --- a/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp +++ b/compiler-rt/lib/ctx_profile/CtxInstrProfiling.cpp @@ -41,7 +41,44 @@ Arena *FlatCtxArena = nullptr; // Set to true when we enter a root, and false when we exit - regardless if this // thread collects a contextual profile for that root. -__thread bool IsUnderContext = false; +__thread int UnderContextRefCount = 0; +__thread void *volatile EnteredContextAddress = 0; + +void onFunctionEntered(void *Address) { + UnderContextRefCount += (Address == EnteredContextAddress); + assert(UnderContextRefCount > 0); +} + +void onFunctionExited(void *Address) { + UnderContextRefCount -= (Address == EnteredContextAddress); + assert(UnderContextRefCount >= 0); +} + +// Returns true if it was entered the first time +bool rootEnterIsFirst(void* Address) { + bool Ret = true; + if (!EnteredContextAddress) { +EnteredContextAddress = Address; +assert(UnderContextRefCount == 0); +Ret = true; + } + onFunctionEntered(Address); + return Ret; +} + +// Return true if this also exits the root. 
+bool exitsRoot(void* Address) { + onFunctionExited(Address); + if (UnderContextRefCount == 0) { +EnteredContextAddress = nullptr; +return true; + } + return false; + +} + +bool hasEnteredARoot() { return UnderContextRefCount > 0; } + __sanitizer::atomic_uint8_t ProfilingStarted = {}; __sanitizer::atomic_uintptr_t RootDetector = {}; @@ -287,62 +324,65 @@ ContextRoot *FunctionData::getOrAllocateContextRoot() { return Root; } -ContextNode *tryStartContextGivenRoot(ContextRoot *Root, GUID Guid, - uint32_t Counters, uint32_t Callsites) -SANITIZER_NO_THREAD_SAFETY_ANALYSIS { - IsUnderContext = true; - __sanitizer::atomic_fetch_add(&Root->TotalEntries, 1, -__sanitizer::memory_order_relaxed); +ContextNode *tryStartContextGivenRoot( +ContextRoot *Root, void *EntryAddress, GUID Guid, uint32_t Counters, +uint32_t Callsites) SANITIZER_NO_THREAD_SAFETY_ANALYSIS { + + if (rootEnterIsFirst(EntryAddress)) +__sanitizer::atomic_fetch_add(&Root->TotalEntries, 1, + __sanitizer::memory_order_relaxed); if (!Root->FirstMemBlock) { setupContext(Root, Guid, Counters, Callsites); } if (Root->Taken.TryLock()) { +assert(__llvm_ctx_profile_current_context_root == nullptr); __llvm_ctx_profile_current_context_root = Root; onContextEnter(*Root->FirstNode); return Root->FirstNode; } // If this thread couldn't take the lock, return scratch context. - __llvm_ctx_profile_current_context_root = nullptr; return TheScratchContext; } +ContextNode *getOrStartContextOutsideCollection(FunctionData &Data, +ContextRoot *OwnCtxRoot, +void *Callee, GUID Guid, +uint32_t NumCounters, +uint32_t NumCallsites) { + // This must only be called when __llvm_ctx_profile_current_context_root is + // null. + assert(__llvm_ctx_profile_current_context_root == nullptr); + // OwnCtxRoot is Data.CtxRoot. Since it's volatile, and is used by the caller, + // pre-load it. + assert(Data.CtxRoot == OwnCtxRoot); + // If we have a root detector, try sampling. 
+ // Otherwise - regardless if we started profiling or not, if Data.CtxRoot is + // allocated, try starting a context tree - basically, as-if + // __llvm_ctx_profile_start_context were called. + if (auto *RAD = getRootDetector()) +RAD->sample(); + else if (reinterpret_cast(OwnCtxRoot) > 1) +return tryStartContextGivenRoot(OwnCtxRoot, Data.EntryAddress, Guid, +NumCounters, NumCallsites); + + // If we didn't start profiling, or if we are under a context, just not + // collecting, return the scratch buffer. + if (hasEnteredARoot() || + !__sanitizer::atomic_load_relaxed(&ProfilingStarted)) +return TheScratchContext; + return markAsScratch( + onContextEnter(*getFlatProfile(Data, Callee, Guid, NumCounters))); +} + ContextNode *getUnhandledContext(FunctionData &Data, void
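A minimal stand-alone sketch of the re-entry tracking that the CtxInstrProfiling.cpp hunk above introduces, assuming the intent stated in its comments: the first entry into a root records the function address, nested entries of that same function bump a counter, and the root is left once the counter drops back to zero. `RootTracker`, `enter`, `exit` and `underRoot` are hypothetical names used only for illustration; they are not the compiler-rt API, and the sketch omits the `__thread` storage and the asserts. Unlike the posted `rootEnterIsFirst`, whose `Ret` starts out `true`, the sketch reports true only for the first entry, which is what the comment describes; that reading is an assumption.

```cpp
#include <cassert>

// Hypothetical model of the per-thread state (UnderContextRefCount and
// EnteredContextAddress in the patch).
struct RootTracker {
  int RefCount = 0;
  const void *Entered = nullptr;

  // Returns true only when this call is the one that enters the root.
  bool enter(const void *Address) {
    bool First = false;
    if (!Entered) {
      Entered = Address;
      First = true;
    }
    // Only re-entries of the recorded root function are counted.
    if (Address == Entered)
      ++RefCount;
    return First;
  }

  // Returns true when this call leaves the root entirely.
  bool exit(const void *Address) {
    if (Address == Entered)
      --RefCount;
    if (RefCount == 0) {
      Entered = nullptr;
      return true;
    }
    return false;
  }

  bool underRoot() const { return RefCount > 0; }
};
```

With a counter instead of a boolean, a recursive call into the root function no longer clears the "under context" state when the inner invocation returns.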
[llvm-branch-commits] [llvm] [ctxprof] Extend the notion of "cannot return" (PR #135651)
https://github.com/mtrofin updated https://github.com/llvm/llvm-project/pull/135651 >From ea629230ae6202ed34122cecb7ebce20ccffad19 Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Mon, 14 Apr 2025 10:03:55 -0700 Subject: [PATCH] [ctxprof] Extend the notion of "cannot return" --- .../Instrumentation/PGOCtxProfLowering.cpp| 19 -- .../ctx-instrumentation-invalid-roots.ll | 25 +++ .../PGOProfile/ctx-instrumentation.ll | 15 ++- 3 files changed, 41 insertions(+), 18 deletions(-) diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp index f99d7b9d03e02..136225ab27cdc 100644 --- a/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp +++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp @@ -9,6 +9,7 @@ #include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/Analysis/CFG.h" #include "llvm/Analysis/CtxProfAnalysis.h" #include "llvm/Analysis/OptimizationRemarkEmitter.h" #include "llvm/IR/Analysis.h" @@ -105,6 +106,12 @@ std::pair getNumCountersAndCallsites(const Function &F) { } return {NumCounters, NumCallsites}; } + +void emitUnsupportedRoot(const Function &F, StringRef Reason) { + F.getContext().emitError("[ctxprof] The function " + F.getName() + + " was indicated as context root but " + Reason + + ", which is not supported."); +} } // namespace // set up tie-in with compiler-rt. 
@@ -164,12 +171,8 @@ CtxInstrumentationLowerer::CtxInstrumentationLowerer(Module &M, for (const auto &BB : *F) for (const auto &I : BB) if (const auto *CB = dyn_cast(&I)) -if (CB->isMustTailCall()) { - M.getContext().emitError("The function " + Fname + - " was indicated as a context root, " - "but it features musttail " - "calls, which is not supported."); -} +if (CB->isMustTailCall()) + emitUnsupportedRoot(*F, "it features musttail calls"); } } @@ -230,11 +233,13 @@ bool CtxInstrumentationLowerer::lowerFunction(Function &F) { // Probably pointless to try to do anything here, unlikely to be // performance-affecting. - if (F.doesNotReturn()) { + if (!llvm::canReturn(F)) { for (auto &BB : F) for (auto &I : make_early_inc_range(BB)) if (isa(&I)) I.eraseFromParent(); +if (ContextRootSet.contains(&F)) + emitUnsupportedRoot(F, "it does not return"); return true; } diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll index 454780153b823..b5ceb4602c60b 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll @@ -1,17 +1,22 @@ -; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=good \ -; RUN: -profile-context-root=bad \ -; RUN: -S < %s 2>&1 | FileCheck %s +; RUN: split-file %s %t +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/musttail.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/unreachable.ll -o - 2>&1 | FileCheck %s +; RUN: not opt -passes=ctx-instr-gen,ctx-instr-lower -profile-context-root=the_func -S %t/noreturn.ll -o - 2>&1 | FileCheck %s +;--- musttail.ll declare void @foo() -define void @good() { - call void @foo() - ret void -} - -define void @bad() { +define void @the_func() { musttail call void @foo() ret void } +;--- 
unreachable.ll +define void @the_func() { + unreachable +} +;--- noreturn.ll +define void @the_func() noreturn { + unreachable +} -; CHECK: error: The function bad was indicated as a context root, but it features musttail calls, which is not supported. +; CHECK: error: [ctxprof] The function the_func was indicated as context root diff --git a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll index 6b2f25a585ec3..6afa37ef286f5 100644 --- a/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll +++ b/llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll @@ -18,7 +18,7 @@ declare void @bar() ; LOWERING: @[[GLOB4:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } zeroinitializer ; LOWERING: @[[GLOB5:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } zeroinitializer ; LOWERING: @[[GLOB6:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } zeroinitializer -; LOWERING: @[[GLOB7:[0-9]+]] = internal global { ptr, ptr, ptr, ptr, i8 } { ptr null, ptr null, ptr inttoptr (i64 1 to ptr), ptr null, i8 0 } +; LOWERING: @[[GLOB7:[0-9]+]] = intern
[llvm-branch-commits] [llvm] llvm-reduce: Preserve uselistorder when writing thinlto bitcode (PR #133369)
arsenm wrote: ### Merge activity * **Apr 14, 2:29 PM EDT**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/133369). https://github.com/llvm/llvm-project/pull/133369 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to svusmmla (PR #135634)
https://github.com/momchil-velikov created https://github.com/llvm/llvm-project/pull/135634 Supersedes https://github.com/llvm/llvm-project/pull/135358 >From 71e2f13ad5922bf93961c5d81fd9d1f5899c80b0 Mon Sep 17 00:00:00 2001 From: Momchil Velikov Date: Thu, 10 Apr 2025 14:38:27 + Subject: [PATCH] [MLIR][ArmSVE] Add an ArmSVE dialect operation which maps to `svusmmla` --- mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td | 32 +++ .../Transforms/LegalizeForLLVMExport.cpp | 4 +++ .../Dialect/ArmSVE/legalize-for-llvm.mlir | 12 +++ mlir/test/Dialect/ArmSVE/roundtrip.mlir | 11 +++ mlir/test/Target/LLVMIR/arm-sve.mlir | 12 +++ 5 files changed, 71 insertions(+) diff --git a/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td b/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td index 1a59062ccc93d..da2a8f89b4cfd 100644 --- a/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td +++ b/mlir/include/mlir/Dialect/ArmSVE/IR/ArmSVE.td @@ -273,6 +273,34 @@ def UmmlaOp : ArmSVE_Op<"ummla", "$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)"; } +def UsmmlaOp : ArmSVE_Op<"usmmla", [Pure, +AllTypesMatch<["src1", "src2"]>, +AllTypesMatch<["acc", "dst"]>]> { + let summary = "Matrix-matrix multiply and accumulate op"; + let description = [{ +USMMLA: Unsigned by signed integer matrix multiply-accumulate. + +The unsigned by signed integer matrix multiply-accumulate operation +multiplies the 2×8 matrix of unsigned 8-bit integer values held +the first source vector by the 8×2 matrix of signed 8-bit integer +values in the second source vector. The resulting 2×2 widened 32-bit +integer matrix product is then added to the 32-bit integer matrix +accumulator. 
+ +Source: +https://developer.arm.com/documentation/100987/ + }]; + // Supports (vector<16xi8>, vector<16xi8>) -> (vector<4xi32>) + let arguments = (ins + ScalableVectorOfLengthAndType<[4], [I32]>:$acc, + ScalableVectorOfLengthAndType<[16], [I8]>:$src1, + ScalableVectorOfLengthAndType<[16], [I8]>:$src2 + ); + let results = (outs ScalableVectorOfLengthAndType<[4], [I32]>:$dst); + let assemblyFormat = +"$acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($dst)"; +} + class SvboolTypeConstraint : TypesMatchWith< "expected corresponding svbool type widened to [16]xi1", lhsArg, rhsArg, @@ -568,6 +596,10 @@ def SmmlaIntrOp : ArmSVE_IntrBinaryOverloadedOp<"smmla">, Arguments<(ins AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank)>; +def UsmmlaIntrOp : + ArmSVE_IntrBinaryOverloadedOp<"usmmla">, + Arguments<(ins AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank)>; + def SdotIntrOp : ArmSVE_IntrBinaryOverloadedOp<"sdot">, Arguments<(ins AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank, AnyScalableVectorOfAnyRank)>; diff --git a/mlir/lib/Dialect/ArmSVE/Transforms/LegalizeForLLVMExport.cpp b/mlir/lib/Dialect/ArmSVE/Transforms/LegalizeForLLVMExport.cpp index fe13ed03356b2..b1846e15196fc 100644 --- a/mlir/lib/Dialect/ArmSVE/Transforms/LegalizeForLLVMExport.cpp +++ b/mlir/lib/Dialect/ArmSVE/Transforms/LegalizeForLLVMExport.cpp @@ -24,6 +24,7 @@ using SdotOpLowering = OneToOneConvertToLLVMPattern; using SmmlaOpLowering = OneToOneConvertToLLVMPattern; using UdotOpLowering = OneToOneConvertToLLVMPattern; using UmmlaOpLowering = OneToOneConvertToLLVMPattern; +using UsmmlaOpLowering = OneToOneConvertToLLVMPattern; using DupQLaneLowering = OneToOneConvertToLLVMPattern; using ScalableMaskedAddIOpLowering = @@ -194,6 +195,7 @@ void mlir::populateArmSVELegalizeForLLVMExportPatterns( SmmlaOpLowering, UdotOpLowering, UmmlaOpLowering, + UsmmlaOpLowering, DupQLaneLowering, ScalableMaskedAddIOpLowering, 
ScalableMaskedAddFOpLowering, @@ -222,6 +224,7 @@ void mlir::configureArmSVELegalizeForExportTarget( SmmlaIntrOp, UdotIntrOp, UmmlaIntrOp, +UsmmlaIntrOp, DupQLaneIntrOp, ScalableMaskedAddIIntrOp, ScalableMaskedAddFIntrOp, @@ -242,6 +245,7 @@ void mlir::configureArmSVELegalizeForExportTarget( SmmlaOp, UdotOp, UmmlaOp, + UsmmlaOp, DupQLaneOp, ScalableMaskedAddIOp, ScalableMaskedAddFOp, diff --git a/mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir b/mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir index 5d044517e0ea8..47587aa26506c 100644 --- a/mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir +++ b/mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir @@ -48,6 +48,18 @@ f
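The arithmetic in the `UsmmlaOp` description above can be sketched as a scalar reference for a single 128-bit segment. This is an illustration only, under two assumptions that are not spelled out in the patch: the first source holds its 2×8 matrix row by row, and the second source holds its 8×2 matrix as two runs of eight elements, one per column. `usmmlaSegment` and `Acc` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Acc = std::array<int32_t, 4>; // 2x2 accumulator, row-major

// Scalar model of one segment: acc += lhs (2x8, unsigned) * rhs (8x2, signed).
// Assumed layout: lhs[8 * i + k] is row i, rhs[8 * j + k] is column j.
Acc usmmlaSegment(Acc acc, const std::array<uint8_t, 16> &lhs,
                  const std::array<int8_t, 16> &rhs) {
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j) {
      int32_t sum = acc[2 * i + j];
      for (int k = 0; k < 8; ++k)
        sum += static_cast<int32_t>(lhs[8 * i + k]) *
               static_cast<int32_t>(rhs[8 * j + k]);
      acc[2 * i + j] = sum;
    }
  return acc;
}
```

The only difference from a purely signed or purely unsigned variant in this model is the element type of each operand before widening to 32 bits.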
[llvm-branch-commits] [mlir] [MLIR][ArmSVE] Add initial lowering of vector.contract to SVE `*MMLA` instructions (PR #135636)
llvmbot wrote: @llvm/pr-subscribers-mlir Author: Momchil Velikov (momchil-velikov) Changes Supersedes https://github.com/llvm/llvm-project/pull/135359 --- Patch is 77.36 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/135636.diff 16 Files Affected: - (modified) mlir/include/mlir/Conversion/Passes.td (+4) - (modified) mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h (+3) - (modified) mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt (+1) - (modified) mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp (+7) - (modified) mlir/lib/Dialect/ArmNeon/Transforms/LowerContractionToSMMLAPattern.cpp (+4-1) - (modified) mlir/lib/Dialect/ArmSVE/Transforms/CMakeLists.txt (+1) - (added) mlir/lib/Dialect/ArmSVE/Transforms/LowerContractionToSVEI8MMPattern.cpp (+304) - (added) mlir/test/Dialect/Vector/CPU/ArmSVE/vector-smmla.mlir (+94) - (added) mlir/test/Dialect/Vector/CPU/ArmSVE/vector-summla.mlir (+85) - (added) mlir/test/Dialect/Vector/CPU/ArmSVE/vector-ummla.mlir (+94) - (added) mlir/test/Dialect/Vector/CPU/ArmSVE/vector-usmmla.mlir (+95) - (added) mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-4x8x4.mlir (+117) - (added) mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-smmla-8x8x8-vs2.mlir (+159) - (added) mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-summla-4x8x4.mlir (+118) - (added) mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-ummla-4x8x4.mlir (+119) - (added) mlir/test/Integration/Dialect/Vector/CPU/ArmSVE/contraction-usmmla-4x8x4.mlir (+117) ``diff diff --git a/mlir/include/mlir/Conversion/Passes.td b/mlir/include/mlir/Conversion/Passes.td index bbba495e613b2..930d8b44abca0 100644 --- a/mlir/include/mlir/Conversion/Passes.td +++ b/mlir/include/mlir/Conversion/Passes.td @@ -1406,6 +1406,10 @@ def ConvertVectorToLLVMPass : Pass<"convert-vector-to-llvm"> { "bool", /*default=*/"false", "Enables the use of ArmSVE dialect while lowering the vector " "dialect.">, 
+Option<"armI8MM", "enable-arm-i8mm", + "bool", /*default=*/"false", + "Enables the use of Arm FEAT_I8MM instructions while lowering " + "the vector dialect.">, Option<"x86Vector", "enable-x86vector", "bool", /*default=*/"false", "Enables the use of X86Vector dialect while lowering the vector " diff --git a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h index 8665c8224cc45..232e2be29e574 100644 --- a/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h +++ b/mlir/include/mlir/Dialect/ArmSVE/Transforms/Transforms.h @@ -20,6 +20,9 @@ class RewritePatternSet; void populateArmSVELegalizeForLLVMExportPatterns( const LLVMTypeConverter &converter, RewritePatternSet &patterns); +void populateLowerContractionToSVEI8MMPatternPatterns( +RewritePatternSet &patterns); + /// Configure the target to support lowering ArmSVE ops to ops that map to LLVM /// intrinsics. void configureArmSVELegalizeForExportTarget(LLVMConversionTarget &target); diff --git a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt index 330474a718e30..8e2620029c354 100644 --- a/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt +++ b/mlir/lib/Conversion/VectorToLLVM/CMakeLists.txt @@ -35,6 +35,7 @@ add_mlir_conversion_library(MLIRVectorToLLVMPass MLIRVectorToLLVM MLIRArmNeonDialect + MLIRArmNeonTransforms MLIRArmSVEDialect MLIRArmSVETransforms MLIRAMXDialect diff --git a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp index 7082b92c95d1d..1e6c8122b1d0e 100644 --- a/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp +++ b/mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp @@ -14,6 +14,7 @@ #include "mlir/Dialect/AMX/Transforms.h" #include "mlir/Dialect/Arith/IR/Arith.h" #include "mlir/Dialect/ArmNeon/ArmNeonDialect.h" +#include "mlir/Dialect/ArmNeon/Transforms.h" #include 
"mlir/Dialect/ArmSVE/IR/ArmSVEDialect.h" #include "mlir/Dialect/ArmSVE/Transforms/Transforms.h" #include "mlir/Dialect/Func/IR/FuncOps.h" @@ -82,6 +83,12 @@ void ConvertVectorToLLVMPass::runOnOperation() { populateVectorStepLoweringPatterns(patterns); populateVectorRankReducingFMAPattern(patterns); populateVectorGatherLoweringPatterns(patterns); +if (armI8MM) { + if (armNeon) +arm_neon::populateLowerContractionToSMMLAPatternPatterns(patterns); + if (armSVE) +populateLowerContractionToSVEI8MMPatternPatterns(patterns); +} (void)applyPatternsGreedily(getOperation(), std::move(patterns)); } diff --git a/mlir/lib/Dialect/ArmNeon/Transforms/LowerContractionToSMMLAPattern.cpp b/mlir/lib/Dialect/ArmNeon/Transforms/LowerContractionToSMMLAPattern.cpp index 2a1271dfd6bdf..e807b
[llvm-branch-commits] [llvm] [HLSL] Adding support for Root Constants in LLVM Metadata (PR #135085)
@@ -52,6 +59,45 @@ static bool parseRootFlags(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD,
   return false;
 }
+static bool extractMdValue(uint32_t &Value, MDNode *Node, unsigned int OpId) {

inbelic wrote:

Maybe we could rename this to `extractMdIntValue` or the like? We will eventually have `float` args

https://github.com/llvm/llvm-project/pull/135085
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -2759,6 +2762,29 @@ MCSection *TargetLoweringObjectFileXCOFF::getSectionForLSDA(
 //===--===//
 TargetLoweringObjectFileGOFF::TargetLoweringObjectFileGOFF() = default;
+void TargetLoweringObjectFileGOFF::getModuleMetadata(Module &M) {
+  // Construct the default names for the root SD and the ADA PR symbol.
+  StringRef FileName = sys::path::stem(M.getSourceFileName());
+  if (FileName.size() > 1 && FileName.starts_with('<') &&
+      FileName.ends_with('>'))
+    FileName = FileName.substr(1, FileName.size() - 2);
+  DefaultRootSDName = Twine(FileName).concat("#C").str();

AidoP wrote:

Thank you, that's very interesting. The documentation seems to suggest that the binding scope attribute only applies to LDs. Interestingly AMBLIST doesn't seem to display it (N/A is shown). It seems like there are a few undocumented fields and behaviours being relied upon now. Is there anything being done or are there any plans for IBM to update the doc?

https://github.com/llvm/llvm-project/pull/133799
[llvm-branch-commits] [flang] [llvm] [Github][CI] Upload .ninja_log as an artifact (PR #135539)
@@ -33,6 +33,8 @@ function at-exit {
   mkdir -p artifacts
   ccache --print-stats > artifacts/ccache_stats.txt
+  cp "${BUILD_DIR}"/.ninja_log artifacts/.ninja_log
+  ls artifacts/

boomanaiden154 wrote:

Leftover from testing. I've removed it. The GitHub Actions workflow should print out the file list.

https://github.com/llvm/llvm-project/pull/135539
[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for extends and trunc (PR #132383)
@@ -215,8 +207,8 @@ body: |
 ; CHECK: liveins: $sgpr0
 ; CHECK-NEXT: {{ $}}
 ; CHECK-NEXT: [[COPY:%[0-9]+]]:sgpr(s32) = COPY $sgpr0
-; CHECK-NEXT: [[TRUNC:%[0-9]+]]:sgpr(s1) = G_TRUNC [[COPY]](s32)
-; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:sgpr(s64) = G_ANYEXT [[TRUNC]](s1)
+; CHECK-NEXT: [[DEF:%[0-9]+]]:sgpr(s32) = G_IMPLICIT_DEF
+; CHECK-NEXT: [[MV:%[0-9]+]]:sgpr(s64) = G_MERGE_VALUES [[COPY]](s32), [[DEF]](s32)

petar-avramovic wrote:

Not sure if this is the correct spot from the original comment, but

> Isn't this a correctness regression? I'm not entirely certain because I
> remember there was some weirdness around what G_TRUNC means semantically. Can
> you explain why there is no need for a trunc or bitwise and or something like
> that in the output?

G_TRUNC and G_ANYEXT are no-ops, except when one operand is vcc. Here we have a uniform S1, so trunc + anyext is a no-op.
Trunc to vcc is: clear the high bits, then compare.
Anyext from vcc is a select.

> Note that anyext_s1_to_s32_vgpr does leave a G_AND, so either that test shows
> a code quality issue or this test is incorrect.

In anyext_s1_to_s32_vgpr we need to lower the vgpr trunc to vcc. The G_AND comes from clearing the high bits for the icmp.

https://github.com/llvm/llvm-project/pull/132383
[llvm-branch-commits] [clang] release/20.x: [modules] Handle friend function that was a definition but became only a declaration during AST deserialization (#132214) (PR #134232)
dmpolukhin wrote:

> > @nikic what do you mean by ABI change in this case? It doesn't change ABI
> > of generated code, moreover it doesn't even change PCM serialized format
> > because it is an in-memory-only field and attribute.
>
> It changes the ABI of libclang-cpp, by changing the layout of an exported
> type.

That looks like a somewhat strange requirement to me, and it significantly reduces the ability to fix regressions in release compilers. But if it is a release requirement, this fix cannot be cherry-picked to clang-20, so I am abandoning this PR.

https://github.com/llvm/llvm-project/pull/134232
[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for bit shifts and sext-inreg (PR #132385)
https://github.com/petar-avramovic updated https://github.com/llvm/llvm-project/pull/132385 >From 2fdf172213d449b78bc6de1ac20d493adda29dbc Mon Sep 17 00:00:00 2001 From: Petar Avramovic Date: Mon, 14 Apr 2025 16:35:19 +0200 Subject: [PATCH] AMDGPU/GlobalISel: add RegBankLegalize rules for bit shifts and sext-inreg Uniform S16 shifts have to be extended to S32 using appropriate Extend before lowering to S32 instruction. Uniform packed V2S16 are lowered to SGPR S32 instructions, other option is to use VALU packed V2S16 and ReadAnyLane. For uniform S32 and S64 and divergent S16, S32, S64 and V2S16 there are instructions available. --- .../Target/AMDGPU/AMDGPURegBankLegalize.cpp | 2 +- .../AMDGPU/AMDGPURegBankLegalizeHelper.cpp| 107 ++ .../AMDGPU/AMDGPURegBankLegalizeHelper.h | 5 + .../AMDGPU/AMDGPURegBankLegalizeRules.cpp | 43 +++- .../AMDGPU/AMDGPURegBankLegalizeRules.h | 11 ++ llvm/test/CodeGen/AMDGPU/GlobalISel/ashr.ll | 10 +- llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll | 187 +- .../AMDGPU/GlobalISel/regbankselect-ashr.mir | 6 +- .../AMDGPU/GlobalISel/regbankselect-lshr.mir | 17 +- .../GlobalISel/regbankselect-sext-inreg.mir | 24 +-- .../AMDGPU/GlobalISel/regbankselect-shl.mir | 6 +- .../CodeGen/AMDGPU/GlobalISel/sext_inreg.ll | 34 ++-- llvm/test/CodeGen/AMDGPU/GlobalISel/shl.ll| 10 +- 13 files changed, 311 insertions(+), 151 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp index 9544c9f43eeaf..15584f16a0638 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp @@ -310,7 +310,7 @@ bool AMDGPURegBankLegalize::runOnMachineFunction(MachineFunction &MF) { // Opcodes that support pretty much all combinations of reg banks and LLTs // (except S1). There is no point in writing rules for them. 
if (Opc == AMDGPU::G_BUILD_VECTOR || Opc == AMDGPU::G_UNMERGE_VALUES || -Opc == AMDGPU::G_MERGE_VALUES) { +Opc == AMDGPU::G_MERGE_VALUES || Opc == G_BITCAST) { RBLHelper.applyMappingTrivial(*MI); continue; } diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp index 59cd23847311c..9f240c8e6a7a7 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp @@ -14,11 +14,13 @@ #include "AMDGPURegBankLegalizeHelper.h" #include "AMDGPUGlobalISelUtils.h" #include "AMDGPUInstrInfo.h" +#include "AMDGPURegBankLegalizeRules.h" #include "AMDGPURegisterBankInfo.h" #include "GCNSubtarget.h" #include "MCTargetDesc/AMDGPUMCTargetDesc.h" #include "llvm/CodeGen/GlobalISel/GenericMachineInstrs.h" #include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h" +#include "llvm/CodeGen/MachineInstr.h" #include "llvm/CodeGen/MachineUniformityAnalysis.h" #include "llvm/IR/IntrinsicsAMDGPU.h" #include "llvm/Support/AMDGPUAddrSpace.h" @@ -166,6 +168,59 @@ void RegBankLegalizeHelper::lowerVccExtToSel(MachineInstr &MI) { MI.eraseFromParent(); } +std::pair RegBankLegalizeHelper::unpackZExt(Register Reg) { + auto PackedS32 = B.buildBitcast(SgprRB_S32, Reg); + auto Mask = B.buildConstant(SgprRB_S32, 0x); + auto Lo = B.buildAnd(SgprRB_S32, PackedS32, Mask); + auto Hi = B.buildLShr(SgprRB_S32, PackedS32, B.buildConstant(SgprRB_S32, 16)); + return {Lo.getReg(0), Hi.getReg(0)}; +} + +std::pair RegBankLegalizeHelper::unpackSExt(Register Reg) { + auto PackedS32 = B.buildBitcast(SgprRB_S32, Reg); + auto Lo = B.buildSExtInReg(SgprRB_S32, PackedS32, 16); + auto Hi = B.buildAShr(SgprRB_S32, PackedS32, B.buildConstant(SgprRB_S32, 16)); + return {Lo.getReg(0), Hi.getReg(0)}; +} + +std::pair RegBankLegalizeHelper::unpackAExt(Register Reg) { + auto PackedS32 = B.buildBitcast(SgprRB_S32, Reg); + auto Lo = PackedS32; + auto Hi = B.buildLShr(SgprRB_S32, PackedS32, 
B.buildConstant(SgprRB_S32, 16)); + return {Lo.getReg(0), Hi.getReg(0)}; +} + +void RegBankLegalizeHelper::lowerUnpack(MachineInstr &MI) { + Register Lo, Hi; + switch (MI.getOpcode()) { + case AMDGPU::G_SHL: { +auto [Val0, Val1] = unpackAExt(MI.getOperand(1).getReg()); +auto [Amt0, Amt1] = unpackAExt(MI.getOperand(2).getReg()); +Lo = B.buildInstr(MI.getOpcode(), {SgprRB_S32}, {Val0, Amt0}).getReg(0); +Hi = B.buildInstr(MI.getOpcode(), {SgprRB_S32}, {Val1, Amt1}).getReg(0); +break; + } + case AMDGPU::G_LSHR: { +auto [Val0, Val1] = unpackZExt(MI.getOperand(1).getReg()); +auto [Amt0, Amt1] = unpackZExt(MI.getOperand(2).getReg()); +Lo = B.buildInstr(MI.getOpcode(), {SgprRB_S32}, {Val0, Amt0}).getReg(0); +Hi = B.buildInstr(MI.getOpcode(), {SgprRB_S32}, {Val1, Amt1}).getReg(0); +break; + } + case AMDGPU::G_ASHR: { +auto [Val0, Val1] = unpackSExt(MI.getOperand(1).getReg()); +auto [Amt0, Amt1] = unpackSExt(MI.get
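The three unpack helpers in the hunk above split a packed V2S16 value, viewed as one 32-bit word, into two 32-bit halves that differ only in how the upper bits are filled. A plain-integer sketch of the same idea follows, assuming the mask constant in `unpackZExt` selects the low 16 bits. The function names mirror the patch, but these are ordinary functions rather than MachineIRBuilder calls, and `packedLShr` with its repacking step is a hypothetical illustration of the G_LSHR case, since the rest of `lowerUnpack` is cut off in this message.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Zero-extend both 16-bit halves to 32 bits.
std::pair<uint32_t, uint32_t> unpackZExt(uint32_t packed) {
  return {packed & 0xFFFFu, packed >> 16};
}

// Sign-extend both 16-bit halves to 32 bits.
std::pair<int32_t, int32_t> unpackSExt(uint32_t packed) {
  auto lo = static_cast<int16_t>(static_cast<uint16_t>(packed & 0xFFFFu));
  auto hi = static_cast<int16_t>(static_cast<uint16_t>(packed >> 16));
  return {lo, hi};
}

// Any-extend: the upper 16 bits of the low half are left unspecified
// (here they simply keep whatever the packed word contained).
std::pair<uint32_t, uint32_t> unpackAExt(uint32_t packed) {
  return {packed, packed >> 16};
}

// Hypothetical per-lane logical shift right on a packed pair.
// Both shift amounts are assumed to be smaller than 32.
uint32_t packedLShr(uint32_t value, uint32_t amount) {
  std::pair<uint32_t, uint32_t> val = unpackZExt(value);
  std::pair<uint32_t, uint32_t> amt = unpackZExt(amount);
  uint32_t lo = val.first >> amt.first;
  uint32_t hi = val.second >> amt.second;
  return (lo & 0xFFFFu) | (hi << 16);
}
```

The choice of extension follows the shift: a logical right shift needs zeros above bit 15, an arithmetic right shift needs copies of the sign bit, and a left shift does not care what sits above bit 15 because those bits are discarded when the result is narrowed again.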
[llvm-branch-commits] [clang] 7034995 - [clang] Handle Binary StingLiteral kind in one more place (#132201)
Author: Mariya Podchishchaeva
Date: 2025-04-14T12:26:02-07:00
New Revision: 7034995f102967c6f28c2d7d04913608853050ac

URL: https://github.com/llvm/llvm-project/commit/7034995f102967c6f28c2d7d04913608853050ac
DIFF: https://github.com/llvm/llvm-project/commit/7034995f102967c6f28c2d7d04913608853050ac.diff

LOG: [clang] Handle Binary StingLiteral kind in one more place (#132201)

The bots are upset by https://github.com/llvm/llvm-project/pull/127629 . Fix that.

Added:

Modified:
    clang/lib/Sema/SemaExprCXX.cpp

Removed:

diff --git a/clang/lib/Sema/SemaExprCXX.cpp b/clang/lib/Sema/SemaExprCXX.cpp
index 1e39d69e8b230..c6621402adfc9 100644
--- a/clang/lib/Sema/SemaExprCXX.cpp
+++ b/clang/lib/Sema/SemaExprCXX.cpp
@@ -4143,6 +4143,7 @@ Sema::IsStringLiteralToNonConstPointerConversion(Expr *From, QualType ToType) {
           // We don't allow UTF literals to be implicitly converted
           break;
         case StringLiteralKind::Ordinary:
+        case StringLiteralKind::Binary:
           return (ToPointeeType->getKind() == BuiltinType::Char_U ||
                   ToPointeeType->getKind() == BuiltinType::Char_S);
         case StringLiteralKind::Wide:
[llvm-branch-commits] [llvm] [HLSL] Adding support for Root Constants in LLVM Metadata (PR #135085)
https://github.com/joaosaffran updated https://github.com/llvm/llvm-project/pull/135085 >From 9b59d0108f6b23c039e2c417247216862073cd4b Mon Sep 17 00:00:00 2001 From: joaosaffran Date: Wed, 9 Apr 2025 21:05:58 + Subject: [PATCH 1/3] adding support for root constants in metadata generation --- llvm/lib/Target/DirectX/DXILRootSignature.cpp | 120 +- llvm/lib/Target/DirectX/DXILRootSignature.h | 6 +- .../RootSignature-Flags-Validation-Error.ll | 7 +- .../RootSignature-RootConstants.ll| 34 + ...ature-ShaderVisibility-Validation-Error.ll | 20 +++ 5 files changed, 182 insertions(+), 5 deletions(-) create mode 100644 llvm/test/CodeGen/DirectX/ContainerData/RootSignature-RootConstants.ll create mode 100644 llvm/test/CodeGen/DirectX/ContainerData/RootSignature-ShaderVisibility-Validation-Error.ll diff --git a/llvm/lib/Target/DirectX/DXILRootSignature.cpp b/llvm/lib/Target/DirectX/DXILRootSignature.cpp index 412ab7765a7ae..7686918b0fc75 100644 --- a/llvm/lib/Target/DirectX/DXILRootSignature.cpp +++ b/llvm/lib/Target/DirectX/DXILRootSignature.cpp @@ -40,6 +40,13 @@ static bool reportError(LLVMContext *Ctx, Twine Message, return true; } +static bool reportValueError(LLVMContext *Ctx, Twine ParamName, uint32_t Value, + DiagnosticSeverity Severity = DS_Error) { + Ctx->diagnose(DiagnosticInfoGeneric( + "Invalid value for " + ParamName + ": " + Twine(Value), Severity)); + return true; +} + static bool parseRootFlags(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD, MDNode *RootFlagNode) { @@ -52,6 +59,45 @@ static bool parseRootFlags(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD, return false; } +static bool extractMdValue(uint32_t &Value, MDNode *Node, unsigned int OpId) { + + auto *CI = mdconst::extract(Node->getOperand(OpId)); + if (CI == nullptr) +return true; + + Value = CI->getZExtValue(); + return false; +} + +static bool parseRootConstants(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD, + MDNode *RootFlagNode) { + + if (RootFlagNode->getNumOperands() != 5) +return 
reportError(Ctx, "Invalid format for RootConstants Element"); + + mcdxbc::RootParameter NewParameter; + NewParameter.Header.ParameterType = dxbc::RootParameterType::Constants32Bit; + + uint32_t SV; + if (extractMdValue(SV, RootFlagNode, 1)) +return reportError(Ctx, "Invalid value for ShaderVisibility"); + + NewParameter.Header.ShaderVisibility = (dxbc::ShaderVisibility)SV; + + if (extractMdValue(NewParameter.Constants.ShaderRegister, RootFlagNode, 2)) +return reportError(Ctx, "Invalid value for ShaderRegister"); + + if (extractMdValue(NewParameter.Constants.RegisterSpace, RootFlagNode, 3)) +return reportError(Ctx, "Invalid value for RegisterSpace"); + + if (extractMdValue(NewParameter.Constants.Num32BitValues, RootFlagNode, 4)) +return reportError(Ctx, "Invalid value for Num32BitValues"); + + RSD.Parameters.push_back(NewParameter); + + return false; +} + static bool parseRootSignatureElement(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD, MDNode *Element) { @@ -62,12 +108,16 @@ static bool parseRootSignatureElement(LLVMContext *Ctx, RootSignatureElementKind ElementKind = StringSwitch(ElementText->getString()) .Case("RootFlags", RootSignatureElementKind::RootFlags) + .Case("RootConstants", RootSignatureElementKind::RootConstants) .Default(RootSignatureElementKind::Error); switch (ElementKind) { case RootSignatureElementKind::RootFlags: return parseRootFlags(Ctx, RSD, Element); + case RootSignatureElementKind::RootConstants: +return parseRootConstants(Ctx, RSD, Element); +break; case RootSignatureElementKind::Error: return reportError(Ctx, "Invalid Root Signature Element: " + ElementText->getString()); @@ -94,10 +144,56 @@ static bool parse(LLVMContext *Ctx, mcdxbc::RootSignatureDesc &RSD, static bool verifyRootFlag(uint32_t Flags) { return (Flags & ~0xfff) == 0; } +static bool verifyShaderVisibility(dxbc::ShaderVisibility Flags) { + switch (Flags) { + + case dxbc::ShaderVisibility::All: + case dxbc::ShaderVisibility::Vertex: + case dxbc::ShaderVisibility::Hull: + 
case dxbc::ShaderVisibility::Domain: + case dxbc::ShaderVisibility::Geometry: + case dxbc::ShaderVisibility::Pixel: + case dxbc::ShaderVisibility::Amplification: + case dxbc::ShaderVisibility::Mesh: +return true; + } + + return false; +} + +static bool verifyParameterType(dxbc::RootParameterType Flags) { + switch (Flags) { + case dxbc::RootParameterType::Constants32Bit: +return true; + } + + return false; +} + +static bool verifyVersion(uint32_t Version) { + return (Version == 1 || Version == 2); +} + static bool validate(LLVMContext *Ctx, const mcdxbc::RootSignatureDesc
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)
https://github.com/atrosinenko ready_for_review https://github.com/llvm/llvm-project/pull/135662
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -0,0 +1,113 @@
+//===- MCGOFFAttributes.h - Attributes of GOFF symbols ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// Defines the various attribute collections defining GOFF symbols.
+//
+//===--===//
+
+#ifndef LLVM_MC_MCGOFFATTRIBUTES_H
+#define LLVM_MC_MCGOFFATTRIBUTES_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/BinaryFormat/GOFF.h"
+
+namespace llvm {
+namespace GOFF {
+// An "External Symbol Definition" in the GOFF file has a type, and depending on
+// the type a different subset of the fields is used.
+//
+// Unlike other formats, a 2 dimensional structure is used to define the
+// location of data. For example, the equivalent of the ELF .text section is
+// made up of a Section Definition (SD) and a class (Element Definition; ED).
+// The name of the SD symbol depends on the application, while the class has the
+// predefined name C_CODE/C_CODE64 in AMODE31 and AMODE64 respectively.
+//
+// Data can be placed into this structure in 2 ways. First, the data (in a text
+// record) can be associated with an ED symbol. To refer to data, a Label
+// Definition (LD) is used to give an offset into the data a name. When binding,
+// the whole data is pulled into the resulting executable, and the addresses
+// given by the LD symbols are resolved.
+//
+// The alternative is to use a Part Definition (PR). In this case, the data (in
+// a text record) is associated with the part. When binding, only the data of
+// referenced PRs is pulled into the resulting binary.
+//
+// Both approaches are used, which means that the equivalent of a section in ELF
+// results in 3 GOFF symbols, either SD/ED/LD or SD/ED/PR. Moreover, certain
+// sections are fine with just defining SD/ED symbols. The SymbolMapper takes
+// care of all those details.
+
+// Attributes for SD symbols.
+struct SDAttr {
+  GOFF::ESDTaskingBehavior TaskingBehavior = GOFF::ESD_TA_Unspecified;
+  GOFF::ESDBindingScope BindingScope = GOFF::ESD_BSC_Unspecified;
+};
+
+// Attributes for ED symbols.
+struct EDAttr {
+  bool IsReadOnly = false;
+  GOFF::ESDExecutable Executable = GOFF::ESD_EXE_Unspecified;
+  GOFF::ESDAmode Amode;

redstar wrote:

No binding problems, so it seems safe to make this change. That said, amblist shows the Amode on PR and ED symbols.

https://github.com/llvm/llvm-project/pull/133799
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect authentication oracles (PR #135663)
https://github.com/atrosinenko ready_for_review https://github.com/llvm/llvm-project/pull/135663
[llvm-branch-commits] [llvm] a141e58 - [llvm][CodeGen] avoid repeated interval calculation in window scheduler (#132352)
Author: Hua Tian
Date: 2025-04-14T12:34:31-07:00
New Revision: a141e58685fd8d07b13751dcce0d2fca27b93848

URL: https://github.com/llvm/llvm-project/commit/a141e58685fd8d07b13751dcce0d2fca27b93848
DIFF: https://github.com/llvm/llvm-project/commit/a141e58685fd8d07b13751dcce0d2fca27b93848.diff

LOG: [llvm][CodeGen] avoid repeated interval calculation in window scheduler (#132352)

Some new registers are reused when replacing some old ones in certain use case of ModuloScheduleExpander. It is necessary to avoid repeated interval calculations for these registers.

(cherry picked from commit 7e65944292278cc245e36cc6ca971654d584012d)

Added:
    llvm/test/CodeGen/AArch64/aarch64-swp-ws-live-intervals.mir

Modified:
    llvm/include/llvm/CodeGen/ModuloSchedule.h
    llvm/lib/CodeGen/ModuloSchedule.cpp

Removed:


diff --git a/llvm/include/llvm/CodeGen/ModuloSchedule.h b/llvm/include/llvm/CodeGen/ModuloSchedule.h
index e2fb1daf9fb4e..64598ce449a44 100644
--- a/llvm/include/llvm/CodeGen/ModuloSchedule.h
+++ b/llvm/include/llvm/CodeGen/ModuloSchedule.h
@@ -188,9 +188,6 @@ class ModuloScheduleExpander {
   /// Instructions to change when emitting the final schedule.
   InstrChangesTy InstrChanges;
 
-  /// Record the registers that need to compute live intervals.
-  SmallVector NoIntervalRegs;
-
   void generatePipelinedLoop();
   void generateProlog(unsigned LastStage, MachineBasicBlock *KernelBB,
                       ValueMapTy *VRMap, MBBVectorTy &PrologBBs);
@@ -214,7 +211,6 @@ class ModuloScheduleExpander {
   void addBranches(MachineBasicBlock &PreheaderBB, MBBVectorTy &PrologBBs,
                    MachineBasicBlock *KernelBB, MBBVectorTy &EpilogBBs,
                    ValueMapTy *VRMap);
-  void calculateIntervals();
   bool computeDelta(MachineInstr &MI, unsigned &Delta);
   void updateMemOperands(MachineInstr &NewMI, MachineInstr &OldMI,
                          unsigned Num);
diff --git a/llvm/lib/CodeGen/ModuloSchedule.cpp b/llvm/lib/CodeGen/ModuloSchedule.cpp
index fcfb3c1f7073d..7792a0eaa285b 100644
--- a/llvm/lib/CodeGen/ModuloSchedule.cpp
+++ b/llvm/lib/CodeGen/ModuloSchedule.cpp
@@ -181,10 +181,6 @@ void ModuloScheduleExpander::generatePipelinedLoop() {
   // Add branches between prolog and epilog blocks.
   addBranches(*Preheader, PrologBBs, KernelBB, EpilogBBs, VRMap);
 
-  // The intervals of newly created virtual registers are calculated after the
-  // kernel expansion.
-  calculateIntervals();
-
   delete[] VRMap;
   delete[] VRMapPhi;
 }
@@ -546,10 +542,8 @@ void ModuloScheduleExpander::generateExistingPhis(
           if (VRMap[LastStageNum - np - 1].count(LoopVal))
             PhiOp2 = VRMap[LastStageNum - np - 1][LoopVal];
 
-          if (IsLast && np == NumPhis - 1) {
+          if (IsLast && np == NumPhis - 1)
             replaceRegUsesAfterLoop(Def, NewReg, BB, MRI);
-            NoIntervalRegs.push_back(NewReg);
-          }
           continue;
         }
       }
@@ -589,10 +583,8 @@ void ModuloScheduleExpander::generateExistingPhis(
       // Check if we need to rename any uses that occurs after the loop. The
       // register to replace depends on whether the Phi is scheduled in the
       // epilog.
-      if (IsLast && np == NumPhis - 1) {
+      if (IsLast && np == NumPhis - 1)
         replaceRegUsesAfterLoop(Def, NewReg, BB, MRI);
-        NoIntervalRegs.push_back(NewReg);
-      }
 
       // In the kernel, a dependent Phi uses the value from this Phi.
       if (InKernel)
@@ -612,10 +604,8 @@ void ModuloScheduleExpander::generateExistingPhis(
     if (NumStages == 0 && IsLast) {
       auto &CurStageMap = VRMap[CurStageNum];
       auto It = CurStageMap.find(LoopVal);
-      if (It != CurStageMap.end()) {
+      if (It != CurStageMap.end())
         replaceRegUsesAfterLoop(Def, It->second, BB, MRI);
-        NoIntervalRegs.push_back(It->second);
-      }
     }
   }
 }
@@ -735,10 +725,8 @@ void ModuloScheduleExpander::generatePhis(
           rewriteScheduledInstr(NewBB, InstrMap, CurStageNum, np, &*BBI, Def,
                                 NewReg);
         }
-        if (IsLast && np == NumPhis - 1) {
+        if (IsLast && np == NumPhis - 1)
           replaceRegUsesAfterLoop(Def, NewReg, BB, MRI);
-          NoIntervalRegs.push_back(NewReg);
-        }
       }
     }
   }
@@ -950,14 +938,6 @@ void ModuloScheduleExpander::addBranches(MachineBasicBlock &PreheaderBB,
   }
 }
 
-/// Some registers are generated during the kernel expansion. We calculate the
-/// live intervals of these registers after the expansion.
-void ModuloScheduleExpander::calculateIntervals() {
-  for (Register Reg : NoIntervalRegs)
-    LIS.createAndComputeVirtRegInterval(Reg);
-  NoIntervalRegs.clear();
-}
-
 /// Return true if we can compute the amount the instruction changes
 /// during each iteration. Set Delta to the amount of the change.
 bool Modu