[llvm-branch-commits] [clang] [flang] [llvm] [Clang][OpenMP] Add permutation clause (PR #92030)
https://github.com/Meinersbur closed https://github.com/llvm/llvm-project/pull/92030 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld][ELF][LoongArch] Add support for R_LARCH_LE_{HI20, ADD, LO12}_R relocations (PR #99486)
https://github.com/SixWeining approved this pull request. https://github.com/llvm/llvm-project/pull/99486
[llvm-branch-commits] [clang] [flang] [llvm] [Clang][OpenMP] Add permutation clause (PR #92030)
https://github.com/Meinersbur reopened https://github.com/llvm/llvm-project/pull/92030
[llvm-branch-commits] [lld][ELF][LoongArch] Add support for R_LARCH_LE_{HI20, ADD, LO12}_R relocations (PR #99486)
https://github.com/wangleiat updated https://github.com/llvm/llvm-project/pull/99486
[llvm-branch-commits] [clang] clang/AMDGPU: Emit atomicrmw for __builtin_amdgcn_global_atomic_fadd_{f32|f64} (PR #96872)
arsenm wrote: ping https://github.com/llvm/llvm-project/pull/96872
[llvm-branch-commits] [llvm] AMDGPU: Remove flat/global atomic fadd v2bf16 intrinsics (PR #97050)
arsenm wrote: ping https://github.com/llvm/llvm-project/pull/97050
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
https://github.com/Meinersbur edited https://github.com/llvm/llvm-project/pull/99508
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
https://github.com/Meinersbur commented: 👍 for automatically generating this. https://github.com/llvm/llvm-project/pull/99508
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
@@ -148,6 +169,110 @@ static void verifyClause(Record *op, Record *clause) {
 "or explicitly skipping this field.");
 }

+/// Translate the type of an OpenMP clause's argument to its corresponding
+/// representation for clause operand structures.
+///
+/// All kinds of values are represented as `mlir::Value` fields, whereas
+/// attributes are represented based on their `storageType`.
+///
+/// \param[in] init The `DefInit` object representing the argument.
+/// \param[out] rank Number of levels of array nesting associated with the
+/// type.
+///
+/// \return the name of the base type to represent elements of the argument
+/// type.

Meinersbur wrote:

[nit] indentation
```suggestion
/// \return the name of the base type to represent elements of the argument
/// type.
```

https://github.com/llvm/llvm-project/pull/99508
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
@@ -148,6 +169,110 @@ static void verifyClause(Record *op, Record *clause) {
 "or explicitly skipping this field.");
 }

+/// Translate the type of an OpenMP clause's argument to its corresponding
+/// representation for clause operand structures.
+///
+/// All kinds of values are represented as `mlir::Value` fields, whereas
+/// attributes are represented based on their `storageType`.
+///
+/// \param[in] init The `DefInit` object representing the argument.
+/// \param[out] rank Number of levels of array nesting associated with the
+/// type.

Meinersbur wrote:

[nit] indentation
```suggestion
/// \param[out] rank Number of levels of array nesting associated with the
/// type.
```

https://github.com/llvm/llvm-project/pull/99508
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
@@ -148,6 +169,110 @@ static void verifyClause(Record *op, Record *clause) {
 "or explicitly skipping this field.");
 }

+/// Translate the type of an OpenMP clause's argument to its corresponding
+/// representation for clause operand structures.
+///
+/// All kinds of values are represented as `mlir::Value` fields, whereas
+/// attributes are represented based on their `storageType`.
+///
+/// \param[in] init The `DefInit` object representing the argument.
+/// \param[out] rank Number of levels of array nesting associated with the
+/// type.
+///
+/// \return the name of the base type to represent elements of the argument
+/// type.
+static StringRef translateArgumentType(Init *init, int &rank) {
+  Record *def = cast<DefInit>(init)->getDef();
+  bool isAttr = false, isValue = false;
+
+  for (auto [sc, _] : def->getSuperClasses()) {
+    std::string scName = sc->getNameInitAsString();
+    if (scName == "OptionalAttr")
+      return translateArgumentType(def->getValue("baseAttr")->getValue(), rank);
+
+    if (scName == "TypedArrayAttrBase") {
+      ++rank;
+      return translateArgumentType(def->getValue("elementAttr")->getValue(),
+                                   rank);
+    }
+
+    if (scName == "ElementsAttrBase") {
+      rank += def->getValueAsInt("rank");
+      return def->getValueAsString("elementReturnType").trim();
+    }
+
+    if (scName == "Attr")
+      isAttr = true;
+    else if (scName == "TypeConstraint")
+      isValue = true;
+    else if (scName == "Variadic")
+      ++rank;
+  }
+
+  if (isValue) {
+    assert(!isAttr);
+    return "::mlir::Value";
+  }
+
+  assert(isAttr);

Meinersbur wrote:

[nit] Add a description of why `isAttr` is required here. It is not used on the next line.

https://github.com/llvm/llvm-project/pull/99508
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
@@ -23,303 +23,31 @@
 #define GET_ATTRDEF_CLASSES
 #include "mlir/Dialect/OpenMP/OpenMPOpsAttributes.h.inc"

+#include "mlir/Dialect/OpenMP/OpenMPClauseOps.h.inc"
+
 namespace mlir {
 namespace omp {

 //===--===//
-// Mixin structures defining MLIR operands associated with each OpenMP clause.
+// Extra clause operand structures.
 //===--===//

-struct AlignedClauseOps {
-  llvm::SmallVector alignedVars;
-  llvm::SmallVector alignments;
-};
-
-struct AllocateClauseOps {
-  llvm::SmallVector allocateVars, allocatorVars;
-};
-
-struct CancelDirectiveNameClauseOps {
-  ClauseCancellationConstructTypeAttr cancelDirective;
-};
-
-struct CollapseClauseOps {
-  llvm::SmallVector collapseLowerBound, collapseUpperBound, collapseStep;
-};
-
-struct CopyprivateClauseOps {
-  llvm::SmallVector copyprivateVars;
-  llvm::SmallVector copyprivateSyms;
-};
-
-struct CriticalNameClauseOps {
-  StringAttr symName;
-};
-
-struct DependClauseOps {
-  llvm::SmallVector dependKinds;
-  llvm::SmallVector dependVars;
-};
-
-struct DeviceClauseOps {
-  Value device;
-};
-
 struct DeviceTypeClauseOps {

Meinersbur wrote:

Why is the `device_type` clause not generated?

https://github.com/llvm/llvm-project/pull/99508
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
@@ -12,11 +12,43 @@

 #include "mlir/TableGen/GenInfo.h"

+#include "mlir/TableGen/CodeGenHelpers.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/TypeSwitch.h"
 #include "llvm/TableGen/Error.h"
 #include "llvm/TableGen/Record.h"

 using namespace llvm;

+/// The code block defining the base mixin class for combining clause operand
+/// structures.
+static const char *const baseMixinClass = R"(
+namespace detail {
+template <typename... Mixins>
+struct Clauses : public Mixins... {};
+} // namespace detail
+)";
+
+/// The code block defining operation argument structures.
+static const char *const operationArgStruct = R"(
+using {0}Operands = detail::Clauses<{1}>;
+)";
+
+/// Remove multiple optional prefixes and suffixes from \c str.

Meinersbur wrote:

Is the ordering intended? I.e. `CollapseClauseSkip` is normalized to `Collapse`, but `CollapseSkipClause` is not? Whatever the intention, it should be documented.

https://github.com/llvm/llvm-project/pull/99508
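Editorial note: the `baseMixinClass` and `operationArgStruct` templates quoted in this hunk can be read as the following minimal sketch. `Value` here is a stand-in struct rather than `mlir::Value`, and `ExampleTargetOperands` is a hypothetical operation name chosen for illustration; only `IfClauseOps`, `DeviceClauseOps`, and `detail::Clauses` follow names that appear in the PR.

```cpp
#include <cassert>

// Stand-in for mlir::Value (assumption: the real type is not needed to show
// how the mixin composition works).
struct Value {
  int id = 0;
};

// Clause operand structures shaped like the ones quoted in the PR.
struct IfClauseOps {
  Value ifVar;
};
struct DeviceClauseOps {
  Value device;
};

namespace detail {
// Inherits publicly from every clause structure in the pack, so the fields
// of all clauses become members of the combined type.
template <typename... Mixins>
struct Clauses : public Mixins... {};
} // namespace detail

// Hypothetical operation: what `using {0}Operands = detail::Clauses<{1}>;`
// could expand to for an operation taking `if` and `device` clauses.
using ExampleTargetOperands = detail::Clauses<IfClauseOps, DeviceClauseOps>;

// Fills in both inherited fields and returns the sum of their ids.
int sumOperandIds() {
  ExampleTargetOperands ops;
  ops.ifVar.id = 1;
  ops.device.id = 2;
  return ops.ifVar.id + ops.device.id;
}
```

Because the combined type only inherits, adding a clause to an operation amounts to adding one more type to the template argument list.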
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
https://github.com/skatrak updated https://github.com/llvm/llvm-project/pull/99508 >From 1d99939c020aab8650cd20df24e0b1e71726ae90 Mon Sep 17 00:00:00 2001 From: Sergio Afonso Date: Wed, 17 Jul 2024 13:26:09 +0100 Subject: [PATCH 1/2] [MLIR][OpenMP] Automate operand structure definition This patch adds the "gen-openmp-clause-ops" `mlir-tblgen` generator to produce the structure definitions previously in OpenMPClauseOperands.h automatically from the information contained in OpenMPOps.td and OpenMPClauses.td. Changes introduced to the `ElementsAttrBase` common tablegen class, as well as some of its subclasses, add more fine-grained information on their shape and type of their elements. This information is needed in order to properly generate the corresponding types to represent these attributes within the produced operand structures. The original header is maintained to enable the definition of similar structures that are not directly related to any single `OpenMP_Clause` or `OpenMP_Op` tablegen definition. 
--- .../mlir/Dialect/OpenMP/CMakeLists.txt| 1 + .../Dialect/OpenMP/OpenMPClauseOperands.h | 290 +- mlir/include/mlir/IR/CommonAttrConstraints.td | 18 +- mlir/test/mlir-tblgen/openmp-clause-ops.td| 78 + mlir/tools/mlir-tblgen/OmpOpGen.cpp | 174 ++- 5 files changed, 266 insertions(+), 295 deletions(-) create mode 100644 mlir/test/mlir-tblgen/openmp-clause-ops.td diff --git a/mlir/include/mlir/Dialect/OpenMP/CMakeLists.txt b/mlir/include/mlir/Dialect/OpenMP/CMakeLists.txt index d3422f6e48b06..23ccba3067bcb 100644 --- a/mlir/include/mlir/Dialect/OpenMP/CMakeLists.txt +++ b/mlir/include/mlir/Dialect/OpenMP/CMakeLists.txt @@ -17,6 +17,7 @@ mlir_tablegen(OpenMPOpsDialect.h.inc -gen-dialect-decls -dialect=omp) mlir_tablegen(OpenMPOpsDialect.cpp.inc -gen-dialect-defs -dialect=omp) mlir_tablegen(OpenMPOps.h.inc -gen-op-decls) mlir_tablegen(OpenMPOps.cpp.inc -gen-op-defs) +mlir_tablegen(OpenMPClauseOps.h.inc -gen-openmp-clause-ops) mlir_tablegen(OpenMPOpsTypes.h.inc -gen-typedef-decls -typedefs-dialect=omp) mlir_tablegen(OpenMPOpsTypes.cpp.inc -gen-typedef-defs -typedefs-dialect=omp) mlir_tablegen(OpenMPOpsEnums.h.inc -gen-enum-decls) diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h index f4a87d52a172e..e5b4de4908966 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h @@ -23,303 +23,31 @@ #define GET_ATTRDEF_CLASSES #include "mlir/Dialect/OpenMP/OpenMPOpsAttributes.h.inc" +#include "mlir/Dialect/OpenMP/OpenMPClauseOps.h.inc" + namespace mlir { namespace omp { //===--===// -// Mixin structures defining MLIR operands associated with each OpenMP clause. +// Extra clause operand structures. 
//===--===// -struct AlignedClauseOps { - llvm::SmallVector alignedVars; - llvm::SmallVector alignments; -}; - -struct AllocateClauseOps { - llvm::SmallVector allocateVars, allocatorVars; -}; - -struct CancelDirectiveNameClauseOps { - ClauseCancellationConstructTypeAttr cancelDirective; -}; - -struct CollapseClauseOps { - llvm::SmallVector collapseLowerBound, collapseUpperBound, collapseStep; -}; - -struct CopyprivateClauseOps { - llvm::SmallVector copyprivateVars; - llvm::SmallVector copyprivateSyms; -}; - -struct CriticalNameClauseOps { - StringAttr symName; -}; - -struct DependClauseOps { - llvm::SmallVector dependKinds; - llvm::SmallVector dependVars; -}; - -struct DeviceClauseOps { - Value device; -}; - struct DeviceTypeClauseOps { // The default capture type. DeclareTargetDeviceType deviceType = DeclareTargetDeviceType::any; }; -struct DistScheduleClauseOps { - UnitAttr distScheduleStatic; - Value distScheduleChunkSize; -}; - -struct DoacrossClauseOps { - ClauseDependAttr doacrossDependType; - IntegerAttr doacrossNumLoops; - llvm::SmallVector doacrossDependVars; -}; - -struct FilterClauseOps { - Value filteredThreadId; -}; - -struct FinalClauseOps { - Value final; -}; - -struct GrainsizeClauseOps { - Value grainsize; -}; - -struct HasDeviceAddrClauseOps { - llvm::SmallVector hasDeviceAddrVars; -}; - -struct HintClauseOps { - IntegerAttr hint; -}; - -struct IfClauseOps { - Value ifVar; -}; - -struct InReductionClauseOps { - llvm::SmallVector inReductionVars; - llvm::SmallVector inReductionByref; - llvm::SmallVector inReductionSyms; -}; - -struct IsDevicePtrClauseOps { - llvm::SmallVector isDevicePtrVars; -}; - -struct LinearClauseOps { - llvm::SmallVector linearVars, linearStepVars; -}; - -struct LoopRelatedOps { - UnitAttr loopInclusive; -}; - -struct MapClauseOps { - llvm::SmallVector mapVars; -}; - -struct MergeableClauseOps { - UnitAttr mergeable; -}; - -struct NogroupClauseOps { - UnitAttr nogroup; -};
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
@@ -12,11 +12,43 @@

 #include "mlir/TableGen/GenInfo.h"

+#include "mlir/TableGen/CodeGenHelpers.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/TypeSwitch.h"
 #include "llvm/TableGen/Error.h"
 #include "llvm/TableGen/Record.h"

 using namespace llvm;

+/// The code block defining the base mixin class for combining clause operand
+/// structures.
+static const char *const baseMixinClass = R"(
+namespace detail {
+template <typename... Mixins>
+struct Clauses : public Mixins... {};
+} // namespace detail
+)";
+
+/// The code block defining operation argument structures.
+static const char *const operationArgStruct = R"(
+using {0}Operands = detail::Clauses<{1}>;
+)";
+
+/// Remove multiple optional prefixes and suffixes from \c str.

skatrak wrote:

I agree, this should be made clearer in the comments.

https://github.com/llvm/llvm-project/pull/99508
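Editorial note: the ordering behavior discussed in this thread can be illustrated with a small standalone sketch. This is not the helper from the PR; `stripSuffixesInOrder` is a hypothetical name, and the assumption is that each optional suffix is tried exactly once, in the order given.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: tries each suffix once, in list order, and removes it
// if the current string ends with it. A suffix that only becomes the tail
// after a later removal is therefore left in place.
std::string stripSuffixesInOrder(std::string_view str,
                                 const std::vector<std::string_view> &suffixes) {
  for (std::string_view suffix : suffixes) {
    if (str.size() >= suffix.size() &&
        str.substr(str.size() - suffix.size()) == suffix)
      str.remove_suffix(suffix.size());
  }
  return std::string(str);
}
```

With the suffix list `{"Skip", "Clause"}`, `CollapseClauseSkip` loses `Skip` and then `Clause`, while `CollapseSkipClause` only loses `Clause`, since `Skip` was already tested before it reached the end of the string. That asymmetry is the kind of detail the review asks to have documented.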
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
@@ -148,6 +169,110 @@ static void verifyClause(Record *op, Record *clause) {
 "or explicitly skipping this field.");
 }

+/// Translate the type of an OpenMP clause's argument to its corresponding
+/// representation for clause operand structures.
+///
+/// All kinds of values are represented as `mlir::Value` fields, whereas
+/// attributes are represented based on their `storageType`.
+///
+/// \param[in] init The `DefInit` object representing the argument.
+/// \param[out] rank Number of levels of array nesting associated with the
+/// type.
+///
+/// \return the name of the base type to represent elements of the argument
+/// type.
+static StringRef translateArgumentType(Init *init, int &rank) {
+  Record *def = cast<DefInit>(init)->getDef();
+  bool isAttr = false, isValue = false;
+
+  for (auto [sc, _] : def->getSuperClasses()) {
+    std::string scName = sc->getNameInitAsString();
+    if (scName == "OptionalAttr")
+      return translateArgumentType(def->getValue("baseAttr")->getValue(), rank);
+
+    if (scName == "TypedArrayAttrBase") {
+      ++rank;
+      return translateArgumentType(def->getValue("elementAttr")->getValue(),
+                                   rank);
+    }
+
+    if (scName == "ElementsAttrBase") {
+      rank += def->getValueAsInt("rank");
+      return def->getValueAsString("elementReturnType").trim();
+    }
+
+    if (scName == "Attr")
+      isAttr = true;
+    else if (scName == "TypeConstraint")
+      isValue = true;
+    else if (scName == "Variadic")
+      ++rank;
+  }
+
+  if (isValue) {
+    assert(!isAttr);
+    return "::mlir::Value";
+  }
+
+  assert(isAttr);

skatrak wrote:

It's because it had to be either a value or an attribute. `isValue` is checked before, returning early, so here we know for sure it must be an attribute (otherwise we somehow stumbled into an argument that isn't either of those things). I'll add a message to the assert.

https://github.com/llvm/llvm-project/pull/99508
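Editorial note: the "either a value or an attribute" invariant described in this reply can be sketched outside of TableGen as follows. `classifyArgument` is a hypothetical function operating on plain superclass names, the `"attribute"` return string is a placeholder for the attribute's storage type, and the assert messages are one possible wording, not the text that landed in the PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the tail of translateArgumentType: decides
// whether an argument is a value or an attribute from its superclass names.
std::string classifyArgument(const std::vector<std::string> &superClasses) {
  bool isAttr = false, isValue = false;
  for (const std::string &scName : superClasses) {
    if (scName == "Attr")
      isAttr = true;
    else if (scName == "TypeConstraint")
      isValue = true;
  }

  if (isValue) {
    // Values return early, so they must not also be marked as attributes.
    assert(!isAttr && "argument cannot be both a value and an attribute");
    return "::mlir::Value";
  }

  // Reaching this point means the argument was not a value, so the only
  // supported alternative is an attribute.
  assert(isAttr && "argument must be either a value or an attribute");
  return "attribute";
}
```

Attaching a message with `&&` keeps the check a single expression while making a failure self-explanatory in the assertion output.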
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
@@ -23,303 +23,31 @@
 #define GET_ATTRDEF_CLASSES
 #include "mlir/Dialect/OpenMP/OpenMPOpsAttributes.h.inc"

+#include "mlir/Dialect/OpenMP/OpenMPClauseOps.h.inc"
+
 namespace mlir {
 namespace omp {

 //===--===//
-// Mixin structures defining MLIR operands associated with each OpenMP clause.
+// Extra clause operand structures.
 //===--===//

-struct AlignedClauseOps {
-  llvm::SmallVector alignedVars;
-  llvm::SmallVector alignments;
-};
-
-struct AllocateClauseOps {
-  llvm::SmallVector allocateVars, allocatorVars;
-};
-
-struct CancelDirectiveNameClauseOps {
-  ClauseCancellationConstructTypeAttr cancelDirective;
-};
-
-struct CollapseClauseOps {
-  llvm::SmallVector collapseLowerBound, collapseUpperBound, collapseStep;
-};
-
-struct CopyprivateClauseOps {
-  llvm::SmallVector copyprivateVars;
-  llvm::SmallVector copyprivateSyms;
-};
-
-struct CriticalNameClauseOps {
-  StringAttr symName;
-};
-
-struct DependClauseOps {
-  llvm::SmallVector dependKinds;
-  llvm::SmallVector dependVars;
-};
-
-struct DeviceClauseOps {
-  Value device;
-};
-
 struct DeviceTypeClauseOps {

skatrak wrote:

This is because it's not defined as an `OpenMP_Clause` in OpenMPClauses.td, so it can't be generated by this pass. Same case below with `DeclareTargetOperands`, which is not based on an `OpenMP_Op` definition.

https://github.com/llvm/llvm-project/pull/99508
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1967,22 +2047,13 @@ splitCoroutine(Function &F, SmallVectorImpl<Function *> &Clones,
   for (DbgVariableRecord *DVR : DbgVariableRecords)
     coro::salvageDebugInfo(ArgToAllocaMap, *DVR, Shape.OptimizeFrame,
                            false /*UseEntryValue*/);
-  return Shape;
-}
-/// Remove calls to llvm.coro.end in the original function.
-static void removeCoroEndsFromRampFunction(const coro::Shape &Shape) {
-  if (Shape.ABI != coro::ABI::Switch) {
-    for (auto *End : Shape.CoroEnds) {
-      replaceCoroEnd(End, Shape, Shape.FramePtr, /*in resume*/ false, nullptr);
-    }
-  } else {
-    for (llvm::AnyCoroEndInst *End : Shape.CoroEnds) {
-      auto &Context = End->getContext();
-      End->replaceAllUsesWith(ConstantInt::getFalse(Context));
-      End->eraseFromParent();
-    }
+  removeCoroEndsFromRampFunction(Shape);
+
+  if (!isNoSuspendCoroutine && Shape.ABI == coro::ABI::Switch) {

vogelsgesang wrote:

Additionally, we should only create this `noalloc` variant if the coroutine's promise type itself is marked as "elidable".

**Example**

```
struct [[coro_must_elide]] ElidedTask { ... }
struct NonElidedTask { ... }

ElidedTask foo();
NonElidedTask bar();
NonElidedTask foobar() {
  co_await foo();
  co_await bar();
}
```

**Current behavior**: We create `foo.noalloc`, `bar.noalloc` and `foobar.noalloc`. `foo.noalloc` actually gets called. However, `bar.noalloc` and `foobar.noalloc` are dead code. These `.noalloc` variants will never be used, as `NonElidedTask` is not marked with `[[coro_must_elide]]`.

I think we should avoid generating this dead code.

https://github.com/llvm/llvm-project/pull/99283
[llvm-branch-commits] [llvm] AMDGPU: Remove flat/global atomic fadd v2bf16 intrinsics (PR #97050)
@@ -106,106 +100,6 @@ define <2 x half> @flat_atomic_fadd_v2f16_rtn(ptr %ptr, <2 x half> %data) {
   ret <2 x half> %ret
 }

-define amdgpu_kernel void @flat_atomic_fadd_v2bf16_noret(ptr %ptr, <2 x i16> %data) {
-; GFX940-LABEL: flat_atomic_fadd_v2bf16_noret:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_load_dwordx2 s[0:1], s[2:3], 0x24
-; GFX940-NEXT:    s_load_dword s4, s[2:3], 0x2c
-; GFX940-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX940-NEXT:    v_mov_b64_e32 v[0:1], s[0:1]
-; GFX940-NEXT:    v_mov_b32_e32 v2, s4
-; GFX940-NEXT:    flat_atomic_pk_add_bf16 v[0:1], v2
-; GFX940-NEXT:    s_endpgm
-  %ret = call <2 x i16> @llvm.amdgcn.flat.atomic.fadd.v2bf16.p0(ptr %ptr, <2 x i16> %data)

yxsamliu wrote:

Do we have equivalent codegen tests for the counterpart `atomicrmw` instructions to cover the removed tests? The same question applies to the tests removed below.

https://github.com/llvm/llvm-project/pull/97050
[llvm-branch-commits] [llvm] AMDGPU: Remove flat/global atomic fadd v2bf16 intrinsics (PR #97050)
@@ -106,106 +100,6 @@ define <2 x half> @flat_atomic_fadd_v2f16_rtn(ptr %ptr, <2 x half> %data) {
   ret <2 x half> %ret
 }

-define amdgpu_kernel void @flat_atomic_fadd_v2bf16_noret(ptr %ptr, <2 x i16> %data) {
-; GFX940-LABEL: flat_atomic_fadd_v2bf16_noret:
-; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_load_dwordx2 s[0:1], s[2:3], 0x24
-; GFX940-NEXT:    s_load_dword s4, s[2:3], 0x2c
-; GFX940-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX940-NEXT:    v_mov_b64_e32 v[0:1], s[0:1]
-; GFX940-NEXT:    v_mov_b32_e32 v2, s4
-; GFX940-NEXT:    flat_atomic_pk_add_bf16 v[0:1], v2
-; GFX940-NEXT:    s_endpgm
-  %ret = call <2 x i16> @llvm.amdgcn.flat.atomic.fadd.v2bf16.p0(ptr %ptr, <2 x i16> %data)

arsenm wrote:

Yes

https://github.com/llvm/llvm-project/pull/97050
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/vogelsgesang edited https://github.com/llvm/llvm-project/pull/99283
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/vogelsgesang approved this pull request. LGTM at a high level, but I am not experienced in this area of the code, so you might want to wait for another review. https://github.com/llvm/llvm-project/pull/99283
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter {
     setCoroInfo(F, Shape, Clones);
   }

+  static Function *createNoAllocVariant(Function &F, coro::Shape &Shape,
+                                        SmallVectorImpl<Function *> &Clones) {
+    auto *OrigFnTy = F.getFunctionType();
+    auto OldParams = OrigFnTy->params();
+
+    SmallVector<Type *> NewParams;
+    NewParams.reserve(OldParams.size() + 1);
+    for (Type *T : OldParams) {
+      NewParams.push_back(T);
+    }
+    NewParams.push_back(PointerType::getUnqual(Shape.FrameTy));
+
+    auto *NewFnTy = FunctionType::get(OrigFnTy->getReturnType(), NewParams,
+                                      OrigFnTy->isVarArg());
+    Function *NoAllocF =
+        Function::Create(NewFnTy, F.getLinkage(), F.getName() + ".noalloc");

vogelsgesang wrote:

I wonder if the function should be marked as `linkage = internal`. To my understanding, `[[coro_must_elide]]` does not work across translation units yet. As such, there won't be any cross-TU calls to the `.noalloc` function.

https://github.com/llvm/llvm-project/pull/99283
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/vogelsgesang edited https://github.com/llvm/llvm-project/pull/99283
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter {
     setCoroInfo(F, Shape, Clones);
   }

+  static Function *createNoAllocVariant(Function &F, coro::Shape &Shape,
+                                        SmallVectorImpl<Function *> &Clones) {
+    auto *OrigFnTy = F.getFunctionType();
+    auto OldParams = OrigFnTy->params();
+
+    SmallVector<Type *> NewParams;
+    NewParams.reserve(OldParams.size() + 1);
+    for (Type *T : OldParams) {
+      NewParams.push_back(T);
+    }

vogelsgesang wrote:

Can this be simplified as
```suggestion
NewParams.append(OldParams.begin(), OldParams.end());
```
?

https://github.com/llvm/llvm-project/pull/99283
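Editorial note: the simplification suggested here replaces an element-by-element copy loop with a single range append. The sketch below shows the equivalence using `std::vector` as a stand-in, where `insert` at `end()` plays the role of `SmallVector::append` with an iterator pair. The `int` element type and both function names are illustrative assumptions, not code from the PR.

```cpp
#include <cassert>
#include <vector>

// Loop form, mirroring the code under review: copy each old parameter, then
// add one extra trailing parameter.
std::vector<int> withLoop(const std::vector<int> &oldParams, int extra) {
  std::vector<int> newParams;
  newParams.reserve(oldParams.size() + 1);
  for (int param : oldParams)
    newParams.push_back(param);
  newParams.push_back(extra);
  return newParams;
}

// Range form, mirroring the suggestion: append the whole old range at once.
std::vector<int> withRangeInsert(const std::vector<int> &oldParams, int extra) {
  std::vector<int> newParams;
  newParams.reserve(oldParams.size() + 1);
  newParams.insert(newParams.end(), oldParams.begin(), oldParams.end());
  newParams.push_back(extra);
  return newParams;
}
```

Both forms produce the same sequence; the range form states the intent in one call and leaves the copy strategy to the container.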
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -0,0 +1,135 @@
+//===- CoroSplit.cpp - Converts a coroutine into a state machine --===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+//===--===//
+
+#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h"
+
+#include "llvm/Analysis/LazyCallGraph.h"
+#include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/Instruction.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/Transforms/Utils/CallGraphUpdater.h"
+
+#include
+
+using namespace llvm;
+
+#define DEBUG_TYPE "coro-annotation-elide"
+
+#define CORO_MUST_ELIDE_ANNOTATION "coro_must_elide"
+
+static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) {
+  for (Instruction &I : F->getEntryBlock())
+    if (!isa<AllocaInst>(&I))
+      return &I;
+  llvm_unreachable("no terminator in the entry block");
+}
+
+static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize,
+                                    Align FrameAlign) {
+  LLVMContext &C = Caller->getContext();
+  BasicBlock::iterator InsertPt =
+      getFirstNonAllocaInTheEntryBlock(Caller)->getIterator();
+  const DataLayout &DL = Caller->getDataLayout();
+  auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize);
+  auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt);
+  Frame->setAlignment(FrameAlign);
+  return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt);

vogelsgesang wrote:

Do we also need to emit any specific debug info, so that the elided frame can be inspected from the debugger?

https://github.com/llvm/llvm-project/pull/99285
[llvm-branch-commits] [clang-tidy] Add FixIts for libc namespace macros (PR #99681)
https://github.com/ilovepi created https://github.com/llvm/llvm-project/pull/99681 This adds a few FixIts that update the usage of namespaces other than LIBC_NAMESPACE_DECL, and add the required header when it's missing.
[llvm-branch-commits] [clang-tidy] Add FixIts for libc namespace macros (PR #99681)
llvmbot wrote: @llvm/pr-subscribers-clang-tools-extra @llvm/pr-subscribers-clang-tidy Author: Paul Kirth (ilovepi) Changes This adds a few FixIts that update the usage of namespaces other than LIBC_NAMESPACE_DECL, and add the required header when its missing. --- Full diff: https://github.com/llvm/llvm-project/pull/99681.diff 4 Files Affected: - (modified) clang-tools-extra/clang-tidy/llvmlibc/ImplementationInNamespaceCheck.cpp (+33-4) - (modified) clang-tools-extra/clang-tidy/llvmlibc/ImplementationInNamespaceCheck.h (+11-1) - (modified) clang-tools-extra/clang-tidy/llvmlibc/NamespaceConstants.h (+2) - (modified) clang-tools-extra/test/clang-tidy/checkers/llvmlibc/implementation-in-namespace.cpp (+5-1) ``diff diff --git a/clang-tools-extra/clang-tidy/llvmlibc/ImplementationInNamespaceCheck.cpp b/clang-tools-extra/clang-tidy/llvmlibc/ImplementationInNamespaceCheck.cpp index bb436e4d12a30..e6e4900283c79 100644 --- a/clang-tools-extra/clang-tidy/llvmlibc/ImplementationInNamespaceCheck.cpp +++ b/clang-tools-extra/clang-tidy/llvmlibc/ImplementationInNamespaceCheck.cpp @@ -8,7 +8,6 @@ #include "ImplementationInNamespaceCheck.h" #include "NamespaceConstants.h" -#include "clang/AST/ASTContext.h" #include "clang/ASTMatchers/ASTMatchFinder.h" using namespace clang::ast_matchers; @@ -25,6 +24,16 @@ void ImplementationInNamespaceCheck::registerMatchers(MatchFinder *Finder) { this); } +void ImplementationInNamespaceCheck::storeOptions( +ClangTidyOptions::OptionMap &Opts) { + Options.store(Opts, "IncludeStyle", IncludeInserter.getStyle()); +} + +void ImplementationInNamespaceCheck::registerPPCallbacks( +const SourceManager &SM, Preprocessor *PP, Preprocessor *ModuleExpanderPP) { + IncludeInserter.registerPreprocessor(PP); +} + void ImplementationInNamespaceCheck::check( const MatchFinder::MatchResult &Result) { const auto *MatchedDecl = @@ -41,8 +50,26 @@ void ImplementationInNamespaceCheck::check( // Enforce that the namespace is the result of macro expansion if 
(Result.SourceManager->isMacroBodyExpansion(NS->getLocation()) == false) { -diag(NS->getLocation(), "the outermost namespace should be the '%0' macro") -<< RequiredNamespaceDeclMacroName; +auto DB = diag(NS->getLocation(), + "the outermost namespace should be the '%0' macro") + << RequiredNamespaceDeclMacroName; + +// TODO: Determine how to split inline namespaces correctly in the FixItHint +// +// We can't easily replace LIBC_NAMEPACE::inner::namespace { with +// +// namespace LIBC_NAMEPACE_DECL { +// namespace inner::namespace { +// +// For now, just update the simple case w/ LIBC_NAMEPACE_DECL +if (!NS->isInlineNamespace()) + DB << FixItHint::CreateReplacement(NS->getLocation(), + RequiredNamespaceDeclMacroName); + +DB << IncludeInserter.createIncludeInsertion( +Result.SourceManager->getFileID(NS->getBeginLoc()), +NamespaceMacroHeader); + return; } @@ -51,7 +78,9 @@ void ImplementationInNamespaceCheck::check( // instead. if (NS->getVisibility() != Visibility::HiddenVisibility) { diag(NS->getLocation(), "the '%0' macro should start with '%1'") -<< RequiredNamespaceDeclMacroName << RequiredNamespaceDeclStart; +<< RequiredNamespaceDeclMacroName << RequiredNamespaceDeclStart +<< FixItHint::CreateReplacement(NS->getLocation(), +RequiredNamespaceDeclMacroName); return; } diff --git a/clang-tools-extra/clang-tidy/llvmlibc/ImplementationInNamespaceCheck.h b/clang-tools-extra/clang-tidy/llvmlibc/ImplementationInNamespaceCheck.h index 42da38f728bb8..f7fc19fda180a 100644 --- a/clang-tools-extra/clang-tidy/llvmlibc/ImplementationInNamespaceCheck.h +++ b/clang-tools-extra/clang-tidy/llvmlibc/ImplementationInNamespaceCheck.h @@ -10,6 +10,7 @@ #define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_LLVMLIBC_IMPLEMENTATIONINNAMESPACECHECK_H #include "../ClangTidyCheck.h" +#include "../utils/IncludeInserter.h" namespace clang::tidy::llvm_libc { @@ -20,13 +21,22 @@ namespace clang::tidy::llvm_libc { class ImplementationInNamespaceCheck : public ClangTidyCheck { public: 
ImplementationInNamespaceCheck(StringRef Name, ClangTidyContext *Context) - : ClangTidyCheck(Name, Context) {} + : ClangTidyCheck(Name, Context), +IncludeInserter(Options.getLocalOrGlobal("IncludeStyle", + utils::IncludeSorter::IS_LLVM), +areDiagsSelfContained()) {} bool isLanguageVersionSupported(const LangOptions &LangOpts) const override { return LangOpts.CPlusPlus; } void registerMatchers(ast_matchers::MatchFinder *Finder) override; + void registerPPCallbacks(const SourceManager &SM, Preprocessor *PP, + Preprocessor *ModuleExpanderPP) override; vo
[llvm-branch-commits] [clang-tidy] Add FixIts for libc namespace macros (PR #99681)
https://github.com/ilovepi updated https://github.com/llvm/llvm-project/pull/99681 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] MachineOutliner: Use PM to query MachineModuleInfo (PR #99688)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/99688 Avoid getting this from the MachineFunction >From 5cb470e1eba85641bba9bc4c97aebc1e38f3e167 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 19 Jul 2024 22:43:39 +0400 Subject: [PATCH] MachineOutliner: Use PM to query MachineModuleInfo Avoid getting this from the MachineFunction --- llvm/include/llvm/CodeGen/TargetInstrInfo.h | 9 ++-- llvm/lib/CodeGen/MachineOutliner.cpp | 43 ++-- llvm/lib/CodeGen/TargetInstrInfo.cpp | 10 +++-- llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | 8 ++-- llvm/lib/Target/AArch64/AArch64InstrInfo.h | 6 ++- llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp | 6 ++- llvm/lib/Target/ARM/ARMBaseInstrInfo.h | 6 ++- llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 6 ++- llvm/lib/Target/RISCV/RISCVInstrInfo.h | 4 +- llvm/lib/Target/X86/X86InstrInfo.cpp | 4 +- llvm/lib/Target/X86/X86InstrInfo.h | 6 ++- 11 files changed, 64 insertions(+), 44 deletions(-) diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h index 5c7f6ddc94840..4ae9e470616bb 100644 --- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h +++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h @@ -2089,6 +2089,7 @@ class TargetInstrInfo : public MCInstrInfo { /// information for a set of outlining candidates. Returns std::nullopt if the /// candidates are not suitable for outlining. virtual std::optional getOutliningCandidateInfo( + const MachineModuleInfo &MMI, std::vector &RepeatedSequenceLocs) const { llvm_unreachable( "Target didn't implement TargetInstrInfo::getOutliningCandidateInfo!"); @@ -2103,7 +2104,8 @@ class TargetInstrInfo : public MCInstrInfo { protected: /// Target-dependent implementation for getOutliningTypeImpl. 
virtual outliner::InstrType - getOutliningTypeImpl(MachineBasicBlock::iterator &MIT, unsigned Flags) const { + getOutliningTypeImpl(const MachineModuleInfo &MMI, + MachineBasicBlock::iterator &MIT, unsigned Flags) const { llvm_unreachable( "Target didn't implement TargetInstrInfo::getOutliningTypeImpl!"); } @@ -2111,8 +2113,9 @@ class TargetInstrInfo : public MCInstrInfo { public: /// Returns how or if \p MIT should be outlined. \p Flags is the /// target-specific information returned by isMBBSafeToOutlineFrom. - outliner::InstrType - getOutliningType(MachineBasicBlock::iterator &MIT, unsigned Flags) const; + outliner::InstrType getOutliningType(const MachineModuleInfo &MMI, + MachineBasicBlock::iterator &MIT, + unsigned Flags) const; /// Optional target hook that returns true if \p MBB is safe to outline from, /// and returns any target-specific information in \p Flags. diff --git a/llvm/lib/CodeGen/MachineOutliner.cpp b/llvm/lib/CodeGen/MachineOutliner.cpp index c7ccf10e12b12..4b56a467b8d07 100644 --- a/llvm/lib/CodeGen/MachineOutliner.cpp +++ b/llvm/lib/CodeGen/MachineOutliner.cpp @@ -132,6 +132,7 @@ namespace { /// Maps \p MachineInstrs to unsigned integers and stores the mappings. struct InstructionMapper { + const MachineModuleInfo &MMI; /// The next available integer to assign to a \p MachineInstr that /// cannot be outlined. @@ -333,7 +334,7 @@ struct InstructionMapper { // which may be outlinable. Check if each instruction is known to be safe. for (; It != OutlinableRangeEnd; ++It) { // Keep track of where this instruction is in the module. -switch (TII.getOutliningType(It, Flags)) { +switch (TII.getOutliningType(MMI, It, Flags)) { case InstrType::Illegal: mapToIllegalUnsigned(It, CanOutlineWithPrevInstr, UnsignedVecForMBB, InstrListForMBB); @@ -382,7 +383,7 @@ struct InstructionMapper { } } - InstructionMapper() { + InstructionMapper(const MachineModuleInfo &MMI_) : MMI(MMI_) { // Make sure that the implementation of DenseMapInfo hasn't // changed. 
assert(DenseMapInfo::getEmptyKey() == (unsigned)-1 && @@ -405,6 +406,8 @@ struct MachineOutliner : public ModulePass { static char ID; + MachineModuleInfo *MMI = nullptr; + /// Set to true if the outliner should consider functions with /// linkonceodr linkage. bool OutlineFromLinkOnceODRs = false; @@ -489,20 +492,19 @@ struct MachineOutliner : public ModulePass { /// Populate and \p InstructionMapper with instruction-to-integer mappings. /// These are used to construct a suffix tree. - void populateMapper(InstructionMapper &Mapper, Module &M, - MachineModuleInfo &MMI); + void populateMapper(InstructionMapper &Mapper, Module &M); /// Initialize information necessary to output a size remark. /// FIXME: This should be handled by the pass manager, not the outliner. /// FIXME: T
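The patch quoted above passes a `MachineModuleInfo` reference into the mapper's constructor and into the target hooks, rather than having each callee reach it through the function it is looking at. A minimal, self-contained sketch of that pattern follows. Every type in it (`ModuleInfo`, `Instr`, `TargetHooks`, `Mapper`) is a hypothetical stand-in, not an LLVM class, and the debug-instruction rule is invented purely for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; none of these are the real LLVM classes.
struct ModuleInfo {
  bool HasDebugInfo = false;
};

struct Instr {
  bool IsDebug = false;
};

enum class InstrType { Legal, Illegal, Invisible };

struct TargetHooks {
  // Module-level state arrives as an explicit parameter instead of being
  // looked up through the instruction's parent function.
  InstrType getOutliningType(const ModuleInfo &MMI, const Instr &I) const {
    if (I.IsDebug)
      return MMI.HasDebugInfo ? InstrType::Invisible : InstrType::Illegal;
    return InstrType::Legal;
  }
};

struct Mapper {
  const ModuleInfo &MMI; // Injected once by whoever owns the mapper.

  explicit Mapper(const ModuleInfo &MMI_) : MMI(MMI_) {}

  // Counts the instructions that the hooks classify as legal.
  unsigned countLegal(const TargetHooks &TII,
                      const std::vector<Instr> &Block) const {
    unsigned N = 0;
    for (const Instr &I : Block)
      if (TII.getOutliningType(MMI, I) == InstrType::Legal)
        ++N;
    return N;
  }
};
```

With this shape the owner of the module-level state decides where it comes from (in the patch, the pass manager), and the hooks stay free of any back-pointer from function to module.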
[llvm-branch-commits] [llvm] MachineOutliner: Use PM to query MachineModuleInfo (PR #99688)
arsenm wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/99688?utm_source=stack-comment-downstack-mergeability-warning). > Learn more: https://graphite.dev/docs/merge-pull-requests * **#99688** (https://app.graphite.dev/github/pr/llvm/llvm-project/99688?utm_source=stack-comment-icon) 👈 * **#99679** (https://app.graphite.dev/github/pr/llvm/llvm-project/99679?utm_source=stack-comment-icon) * `main` This stack of pull requests is managed by Graphite. Learn more about stacking: https://stacking.dev/?utm_source=stack-comment Join @arsenm and the rest of your teammates on Graphite: https://graphite.dev?utm-source=stack-comment https://github.com/llvm/llvm-project/pull/99688 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] MachineOutliner: Use PM to query MachineModuleInfo (PR #99688)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/99688 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] MachineOutliner: Use PM to query MachineModuleInfo (PR #99688)
llvmbot wrote: @llvm/pr-subscribers-backend-arm Author: Matt Arsenault (arsenm) Changes Avoid getting this from the MachineFunction --- Patch is 20.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/99688.diff 11 Files Affected: - (modified) llvm/include/llvm/CodeGen/TargetInstrInfo.h (+6-3) - (modified) llvm/lib/CodeGen/MachineOutliner.cpp (+21-22) - (modified) llvm/lib/CodeGen/TargetInstrInfo.cpp (+6-4) - (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+5-3) - (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.h (+4-2) - (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp (+4-2) - (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.h (+4-2) - (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+4-2) - (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.h (+3-1) - (modified) llvm/lib/Target/X86/X86InstrInfo.cpp (+3-1) - (modified) llvm/lib/Target/X86/X86InstrInfo.h (+4-2) ``diff diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h index 5c7f6ddc94840..4ae9e470616bb 100644 --- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h +++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h @@ -2089,6 +2089,7 @@ class TargetInstrInfo : public MCInstrInfo { /// information for a set of outlining candidates. Returns std::nullopt if the /// candidates are not suitable for outlining. virtual std::optional getOutliningCandidateInfo( + const MachineModuleInfo &MMI, std::vector &RepeatedSequenceLocs) const { llvm_unreachable( "Target didn't implement TargetInstrInfo::getOutliningCandidateInfo!"); @@ -2103,7 +2104,8 @@ class TargetInstrInfo : public MCInstrInfo { protected: /// Target-dependent implementation for getOutliningTypeImpl. 
virtual outliner::InstrType - getOutliningTypeImpl(MachineBasicBlock::iterator &MIT, unsigned Flags) const { + getOutliningTypeImpl(const MachineModuleInfo &MMI, + MachineBasicBlock::iterator &MIT, unsigned Flags) const { llvm_unreachable( "Target didn't implement TargetInstrInfo::getOutliningTypeImpl!"); } @@ -2111,8 +2113,9 @@ class TargetInstrInfo : public MCInstrInfo { public: /// Returns how or if \p MIT should be outlined. \p Flags is the /// target-specific information returned by isMBBSafeToOutlineFrom. - outliner::InstrType - getOutliningType(MachineBasicBlock::iterator &MIT, unsigned Flags) const; + outliner::InstrType getOutliningType(const MachineModuleInfo &MMI, + MachineBasicBlock::iterator &MIT, + unsigned Flags) const; /// Optional target hook that returns true if \p MBB is safe to outline from, /// and returns any target-specific information in \p Flags. diff --git a/llvm/lib/CodeGen/MachineOutliner.cpp b/llvm/lib/CodeGen/MachineOutliner.cpp index c7ccf10e12b12..4b56a467b8d07 100644 --- a/llvm/lib/CodeGen/MachineOutliner.cpp +++ b/llvm/lib/CodeGen/MachineOutliner.cpp @@ -132,6 +132,7 @@ namespace { /// Maps \p MachineInstrs to unsigned integers and stores the mappings. struct InstructionMapper { + const MachineModuleInfo &MMI; /// The next available integer to assign to a \p MachineInstr that /// cannot be outlined. @@ -333,7 +334,7 @@ struct InstructionMapper { // which may be outlinable. Check if each instruction is known to be safe. for (; It != OutlinableRangeEnd; ++It) { // Keep track of where this instruction is in the module. -switch (TII.getOutliningType(It, Flags)) { +switch (TII.getOutliningType(MMI, It, Flags)) { case InstrType::Illegal: mapToIllegalUnsigned(It, CanOutlineWithPrevInstr, UnsignedVecForMBB, InstrListForMBB); @@ -382,7 +383,7 @@ struct InstructionMapper { } } - InstructionMapper() { + InstructionMapper(const MachineModuleInfo &MMI_) : MMI(MMI_) { // Make sure that the implementation of DenseMapInfo hasn't // changed. 
assert(DenseMapInfo::getEmptyKey() == (unsigned)-1 && @@ -405,6 +406,8 @@ struct MachineOutliner : public ModulePass { static char ID; + MachineModuleInfo *MMI = nullptr; + /// Set to true if the outliner should consider functions with /// linkonceodr linkage. bool OutlineFromLinkOnceODRs = false; @@ -489,20 +492,19 @@ struct MachineOutliner : public ModulePass { /// Populate and \p InstructionMapper with instruction-to-integer mappings. /// These are used to construct a suffix tree. - void populateMapper(InstructionMapper &Mapper, Module &M, - MachineModuleInfo &MMI); + void populateMapper(InstructionMapper &Mapper, Module &M); /// Initialize information necessary to output a size remark. /// FIXME: This should be handled by the pass manager, not the outliner. /// FIXME: This is nearly identical to the initSizeRemarkInfo in the legacy /// pass manage
[llvm-branch-commits] [llvm] MachineOutliner: Use PM to query MachineModuleInfo (PR #99688)
llvmbot wrote: @llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-backend-aarch64 Author: Matt Arsenault (arsenm) Changes Avoid getting this from the MachineFunction --- Patch is 20.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/99688.diff 11 Files Affected: - (modified) llvm/include/llvm/CodeGen/TargetInstrInfo.h (+6-3) - (modified) llvm/lib/CodeGen/MachineOutliner.cpp (+21-22) - (modified) llvm/lib/CodeGen/TargetInstrInfo.cpp (+6-4) - (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+5-3) - (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.h (+4-2) - (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp (+4-2) - (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.h (+4-2) - (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+4-2) - (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.h (+3-1) - (modified) llvm/lib/Target/X86/X86InstrInfo.cpp (+3-1) - (modified) llvm/lib/Target/X86/X86InstrInfo.h (+4-2) ``diff diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h index 5c7f6ddc94840..4ae9e470616bb 100644 --- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h +++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h @@ -2089,6 +2089,7 @@ class TargetInstrInfo : public MCInstrInfo { /// information for a set of outlining candidates. Returns std::nullopt if the /// candidates are not suitable for outlining. virtual std::optional getOutliningCandidateInfo( + const MachineModuleInfo &MMI, std::vector &RepeatedSequenceLocs) const { llvm_unreachable( "Target didn't implement TargetInstrInfo::getOutliningCandidateInfo!"); @@ -2103,7 +2104,8 @@ class TargetInstrInfo : public MCInstrInfo { protected: /// Target-dependent implementation for getOutliningTypeImpl. 
virtual outliner::InstrType - getOutliningTypeImpl(MachineBasicBlock::iterator &MIT, unsigned Flags) const { + getOutliningTypeImpl(const MachineModuleInfo &MMI, + MachineBasicBlock::iterator &MIT, unsigned Flags) const { llvm_unreachable( "Target didn't implement TargetInstrInfo::getOutliningTypeImpl!"); } @@ -2111,8 +2113,9 @@ class TargetInstrInfo : public MCInstrInfo { public: /// Returns how or if \p MIT should be outlined. \p Flags is the /// target-specific information returned by isMBBSafeToOutlineFrom. - outliner::InstrType - getOutliningType(MachineBasicBlock::iterator &MIT, unsigned Flags) const; + outliner::InstrType getOutliningType(const MachineModuleInfo &MMI, + MachineBasicBlock::iterator &MIT, + unsigned Flags) const; /// Optional target hook that returns true if \p MBB is safe to outline from, /// and returns any target-specific information in \p Flags. diff --git a/llvm/lib/CodeGen/MachineOutliner.cpp b/llvm/lib/CodeGen/MachineOutliner.cpp index c7ccf10e12b12..4b56a467b8d07 100644 --- a/llvm/lib/CodeGen/MachineOutliner.cpp +++ b/llvm/lib/CodeGen/MachineOutliner.cpp @@ -132,6 +132,7 @@ namespace { /// Maps \p MachineInstrs to unsigned integers and stores the mappings. struct InstructionMapper { + const MachineModuleInfo &MMI; /// The next available integer to assign to a \p MachineInstr that /// cannot be outlined. @@ -333,7 +334,7 @@ struct InstructionMapper { // which may be outlinable. Check if each instruction is known to be safe. for (; It != OutlinableRangeEnd; ++It) { // Keep track of where this instruction is in the module. -switch (TII.getOutliningType(It, Flags)) { +switch (TII.getOutliningType(MMI, It, Flags)) { case InstrType::Illegal: mapToIllegalUnsigned(It, CanOutlineWithPrevInstr, UnsignedVecForMBB, InstrListForMBB); @@ -382,7 +383,7 @@ struct InstructionMapper { } } - InstructionMapper() { + InstructionMapper(const MachineModuleInfo &MMI_) : MMI(MMI_) { // Make sure that the implementation of DenseMapInfo hasn't // changed. 
assert(DenseMapInfo::getEmptyKey() == (unsigned)-1 && @@ -405,6 +406,8 @@ struct MachineOutliner : public ModulePass { static char ID; + MachineModuleInfo *MMI = nullptr; + /// Set to true if the outliner should consider functions with /// linkonceodr linkage. bool OutlineFromLinkOnceODRs = false; @@ -489,20 +492,19 @@ struct MachineOutliner : public ModulePass { /// Populate and \p InstructionMapper with instruction-to-integer mappings. /// These are used to construct a suffix tree. - void populateMapper(InstructionMapper &Mapper, Module &M, - MachineModuleInfo &MMI); + void populateMapper(InstructionMapper &Mapper, Module &M); /// Initialize information necessary to output a size remark. /// FIXME: This should be handled by the pass manager, not the outliner. /// FIXME: This is nearly identical to the initSizeRem
[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)
https://github.com/shawbyoung updated https://github.com/llvm/llvm-project/pull/98125 >From cf32a43e7c2b04079c6123fe13df4fb7226d771f Mon Sep 17 00:00:00 2001 From: shawbyoung Date: Tue, 9 Jul 2024 10:04:25 -0700 Subject: [PATCH 01/16] Comments Created using spr 1.3.4 --- bolt/lib/Profile/YAMLProfileReader.cpp | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp b/bolt/lib/Profile/YAMLProfileReader.cpp index 69ea0899c5f2c..6753337c24ea7 100644 --- a/bolt/lib/Profile/YAMLProfileReader.cpp +++ b/bolt/lib/Profile/YAMLProfileReader.cpp @@ -501,7 +501,6 @@ size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) { // Maps binary functions to adjacent functions in the FCG. for (const BinaryFunction *CallerBF : BFs) { -// Add all call targets to the hash map. for (const BinaryBasicBlock &BB : CallerBF->blocks()) { for (const MCInst &Inst : BB) { if (!BC.MIB->isCall(Instr)) @@ -533,7 +532,8 @@ size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) { } } - // Create mapping from neighbor hash to BFs. + // Using the constructed adjacent function mapping, creates mapping from + // neighbor hash to BFs. std::unordered_map> NeighborHashToBFs; for (const BinaryFunction *BF : BFs) { @@ -552,12 +552,12 @@ size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) { .push_back(BF); } - // TODO: change call anchor PR to have this representation - we need it here + // TODO: note, this will be introduced in the matching functions with calls + // as anchors pr DenseMap IdToYAMLBF; - // TODO: change call anchor PR to have this representation - we need it here - // Maps hashes to profiled functions. + // Maps YAML functions to adjacent functions in the profile FCG. std::unordered_map YamlBFToHashes(BFs.size()); @@ -590,7 +590,7 @@ size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) { } } - // Matching YAMLBF with neighbor hashes. + // Matches YAMLBF to BFs with neighbor hashes. 
for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) { if (YamlBF.Used) continue; >From ee9049fc4bd3d4203c19c9c0982a78ab3b47666f Mon Sep 17 00:00:00 2001 From: shawbyoung Date: Tue, 9 Jul 2024 13:52:05 -0700 Subject: [PATCH 02/16] Moved blended hash definition Created using spr 1.3.4 --- bolt/include/bolt/Profile/YAMLProfileReader.h | 69 ++- bolt/lib/Profile/StaleProfileMatching.cpp | 65 --- bolt/lib/Profile/YAMLProfileReader.cpp| 110 -- 3 files changed, 119 insertions(+), 125 deletions(-) diff --git a/bolt/include/bolt/Profile/YAMLProfileReader.h b/bolt/include/bolt/Profile/YAMLProfileReader.h index 36e8f8739eee1..e8a34ecad9a08 100644 --- a/bolt/include/bolt/Profile/YAMLProfileReader.h +++ b/bolt/include/bolt/Profile/YAMLProfileReader.h @@ -16,6 +16,73 @@ namespace llvm { namespace bolt { +/// An object wrapping several components of a basic block hash. The combined +/// (blended) hash is represented and stored as one uint64_t, while individual +/// components are of smaller size (e.g., uint16_t or uint8_t). +struct BlendedBlockHash { +private: + using ValueOffset = Bitfield::Element; + using ValueOpcode = Bitfield::Element; + using ValueInstr = Bitfield::Element; + using ValuePred = Bitfield::Element; + using ValueSucc = Bitfield::Element; + +public: + explicit BlendedBlockHash() {} + + explicit BlendedBlockHash(uint64_t Hash) { +Offset = Bitfield::get(Hash); +OpcodeHash = Bitfield::get(Hash); +InstrHash = Bitfield::get(Hash); +PredHash = Bitfield::get(Hash); +SuccHash = Bitfield::get(Hash); + } + + /// Combine the blended hash into uint64_t. + uint64_t combine() const { +uint64_t Hash = 0; +Bitfield::set(Hash, Offset); +Bitfield::set(Hash, OpcodeHash); +Bitfield::set(Hash, InstrHash); +Bitfield::set(Hash, PredHash); +Bitfield::set(Hash, SuccHash); +return Hash; + } + + /// Compute a distance between two given blended hashes. The smaller the + /// distance, the more similar two blocks are. For identical basic blocks, + /// the distance is zero. 
+ uint64_t distance(const BlendedBlockHash &BBH) const { +assert(OpcodeHash == BBH.OpcodeHash && + "incorrect blended hash distance computation"); +uint64_t Dist = 0; +// Account for NeighborHash +Dist += SuccHash == BBH.SuccHash ? 0 : 1; +Dist += PredHash == BBH.PredHash ? 0 : 1; +Dist <<= 16; +// Account for InstrHash +Dist += InstrHash == BBH.InstrHash ? 0 : 1; +Dist <<= 16; +// Account for Offset +Dist += (Offset >= BBH.Offset ? Offset - BBH.Offset : BBH.Offset - Offset); +return Dist; + } + + /// The offset of the basic block from the function start. + uint16_t Offset{0}; + /// (Loose) Hash of the basic block instructions, excluding operands. + uint16_t OpcodeHash{0}; + /// (Str
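The `distance` function in the quoted patch ranks mismatches by shifting: a neighbor (predecessor or successor) mismatch lands above bit 32, an instruction-hash mismatch at bit 16, and the offset difference fills the low 16 bits. A standalone sketch of the same arithmetic is given below. `BlockHashSketch` is a hypothetical simplification without the `Bitfield` packing, and the widths chosen for `InstrHash`, `PredHash`, and `SuccHash` are assumptions, since the quoted text is truncated before their declarations.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the struct in the quoted patch. Field widths other
// than Offset and OpcodeHash are assumptions.
struct BlockHashSketch {
  uint16_t Offset = 0;
  uint16_t OpcodeHash = 0;
  uint16_t InstrHash = 0;
  uint8_t PredHash = 0;
  uint8_t SuccHash = 0;

  // Smaller is more similar; identical hashes give zero.
  uint64_t distance(const BlockHashSketch &Other) const {
    assert(OpcodeHash == Other.OpcodeHash &&
           "distance is only meaningful for equal opcode hashes");
    uint64_t Dist = 0;
    // Neighbor mismatches carry the most weight.
    Dist += SuccHash == Other.SuccHash ? 0 : 1;
    Dist += PredHash == Other.PredHash ? 0 : 1;
    Dist <<= 16;
    // Then the strict instruction hash.
    Dist += InstrHash == Other.InstrHash ? 0 : 1;
    Dist <<= 16;
    // The absolute offset difference only breaks ties.
    Dist += Offset >= Other.Offset ? Offset - Other.Offset
                                   : Other.Offset - Offset;
    return Dist;
  }
};
```

Because each component occupies its own bit range, comparing two distances as plain integers is equivalent to comparing the components lexicographically: neighbors first, then instructions, then offset.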
[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)
https://github.com/shawbyoung edited https://github.com/llvm/llvm-project/pull/98125 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] SelectionDAG: Avoid using MachineFunction::getMMI (PR #99696)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/99696 None >From ba52c00103e695781a4d86450da3b94c238b89a6 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 19 Jul 2024 23:10:25 +0400 Subject: [PATCH] SelectionDAG: Avoid using MachineFunction::getMMI --- llvm/include/llvm/CodeGen/SelectionDAG.h | 9 ++--- llvm/include/llvm/CodeGen/SelectionDAGISel.h | 1 + llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp| 3 ++- .../CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | 6 +++--- .../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp | 15 +-- .../unittests/CodeGen/AArch64SelectionDAGTest.cpp | 3 ++- .../CodeGen/SelectionDAGAddressAnalysisTest.cpp | 3 ++- .../CodeGen/SelectionDAGPatternMatchTest.cpp | 3 ++- 8 files changed, 27 insertions(+), 16 deletions(-) diff --git a/llvm/include/llvm/CodeGen/SelectionDAG.h b/llvm/include/llvm/CodeGen/SelectionDAG.h index 16ec65f2e7daa..24eab7b408675 100644 --- a/llvm/include/llvm/CodeGen/SelectionDAG.h +++ b/llvm/include/llvm/CodeGen/SelectionDAG.h @@ -245,6 +245,7 @@ class SelectionDAG { ProfileSummaryInfo *PSI = nullptr; BlockFrequencyInfo *BFI = nullptr; + MachineModuleInfo *MMI = nullptr; /// List of non-single value types. 
FoldingSet VTListMap; @@ -459,14 +460,15 @@ class SelectionDAG { void init(MachineFunction &NewMF, OptimizationRemarkEmitter &NewORE, Pass *PassPtr, const TargetLibraryInfo *LibraryInfo, UniformityInfo *UA, ProfileSummaryInfo *PSIin, -BlockFrequencyInfo *BFIin, FunctionVarLocs const *FnVarLocs); +BlockFrequencyInfo *BFIin, MachineModuleInfo &MMI, +FunctionVarLocs const *FnVarLocs); void init(MachineFunction &NewMF, OptimizationRemarkEmitter &NewORE, MachineFunctionAnalysisManager &AM, const TargetLibraryInfo *LibraryInfo, UniformityInfo *UA, ProfileSummaryInfo *PSIin, BlockFrequencyInfo *BFIin, -FunctionVarLocs const *FnVarLocs) { -init(NewMF, NewORE, nullptr, LibraryInfo, UA, PSIin, BFIin, FnVarLocs); +MachineModuleInfo &MMI, FunctionVarLocs const *FnVarLocs) { +init(NewMF, NewORE, nullptr, LibraryInfo, UA, PSIin, BFIin, MMI, FnVarLocs); MFAM = &AM; } @@ -500,6 +502,7 @@ class SelectionDAG { OptimizationRemarkEmitter &getORE() const { return *ORE; } ProfileSummaryInfo *getPSI() const { return PSI; } BlockFrequencyInfo *getBFI() const { return BFI; } + MachineModuleInfo *getMMI() const { return MMI; } FlagInserter *getFlagInserter() { return Inserter; } void setFlagInserter(FlagInserter *FI) { Inserter = FI; } diff --git a/llvm/include/llvm/CodeGen/SelectionDAGISel.h b/llvm/include/llvm/CodeGen/SelectionDAGISel.h index aa0efa5d9bf5d..fc0590b1a1b69 100644 --- a/llvm/include/llvm/CodeGen/SelectionDAGISel.h +++ b/llvm/include/llvm/CodeGen/SelectionDAGISel.h @@ -48,6 +48,7 @@ class SelectionDAGISel { std::unique_ptr FuncInfo; SwiftErrorValueTracking *SwiftError; MachineFunction *MF; + MachineModuleInfo *MMI; MachineRegisterInfo *RegInfo; SelectionDAG *CurDAG; std::unique_ptr SDB; diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp index 02d44cd36ae53..a1376aaaf12a9 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp @@ -1333,7 +1333,7 @@ void 
SelectionDAG::init(MachineFunction &NewMF, OptimizationRemarkEmitter &NewORE, Pass *PassPtr, const TargetLibraryInfo *LibraryInfo, UniformityInfo *NewUA, ProfileSummaryInfo *PSIin, -BlockFrequencyInfo *BFIin, +BlockFrequencyInfo *BFIin, MachineModuleInfo &MMIin, FunctionVarLocs const *VarLocs) { MF = &NewMF; SDAGISelPass = PassPtr; @@ -1345,6 +1345,7 @@ void SelectionDAG::init(MachineFunction &NewMF, UA = NewUA; PSI = PSIin; BFI = BFIin; + MMI = &MMIin; FnVarLocs = VarLocs; } diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 98a795edb7a03..02eb1305b24ae 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -6707,11 +6707,11 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I, getValue(I.getArgOperand(0; return; case Intrinsic::eh_sjlj_callsite: { -MachineModuleInfo &MMI = DAG.getMachineFunction().getMMI(); ConstantInt *CI = cast(I.getArgOperand(0)); -assert(MMI.getCurrentCallSite() == 0 && "Overlapping call sites!"); +assert(DAG.getMMI()->getCurrentCallSite() == 0 && + "Overlapping call sites!"); -MMI.setCurrentCallSite(CI->getZExtValue()); +DAG.getMMI()->setCurrentCallSite(CI->getZExtValue()); r
[llvm-branch-commits] [llvm] SelectionDAG: Avoid using MachineFunction::getMMI (PR #99696)
arsenm wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/99696?utm_source=stack-comment-downstack-mergeability-warning). > Learn more: https://graphite.dev/docs/merge-pull-requests * **#99696** (https://app.graphite.dev/github/pr/llvm/llvm-project/99696?utm_source=stack-comment-icon) 👈 * **#99688** (https://app.graphite.dev/github/pr/llvm/llvm-project/99688?utm_source=stack-comment-icon) * **#99679** (https://app.graphite.dev/github/pr/llvm/llvm-project/99679?utm_source=stack-comment-icon) * `main` This stack of pull requests is managed by Graphite. Learn more about stacking: https://stacking.dev/?utm_source=stack-comment Join @arsenm and the rest of your teammates on Graphite: https://graphite.dev?utm-source=stack-comment https://github.com/llvm/llvm-project/pull/99696 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] SelectionDAG: Avoid using MachineFunction::getMMI (PR #99696)
llvmbot wrote: @llvm/pr-subscribers-llvm-selectiondag Author: Matt Arsenault (arsenm) Changes --- Full diff: https://github.com/llvm/llvm-project/pull/99696.diff 8 Files Affected: - (modified) llvm/include/llvm/CodeGen/SelectionDAG.h (+6-3) - (modified) llvm/include/llvm/CodeGen/SelectionDAGISel.h (+1) - (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+2-1) - (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+3-3) - (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+9-6) - (modified) llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp (+2-1) - (modified) llvm/unittests/CodeGen/SelectionDAGAddressAnalysisTest.cpp (+2-1) - (modified) llvm/unittests/CodeGen/SelectionDAGPatternMatchTest.cpp (+2-1) ``diff diff --git a/llvm/include/llvm/CodeGen/SelectionDAG.h b/llvm/include/llvm/CodeGen/SelectionDAG.h index 16ec65f2e7daa..24eab7b408675 100644 --- a/llvm/include/llvm/CodeGen/SelectionDAG.h +++ b/llvm/include/llvm/CodeGen/SelectionDAG.h @@ -245,6 +245,7 @@ class SelectionDAG { ProfileSummaryInfo *PSI = nullptr; BlockFrequencyInfo *BFI = nullptr; + MachineModuleInfo *MMI = nullptr; /// List of non-single value types. 
FoldingSet VTListMap; @@ -459,14 +460,15 @@ class SelectionDAG { void init(MachineFunction &NewMF, OptimizationRemarkEmitter &NewORE, Pass *PassPtr, const TargetLibraryInfo *LibraryInfo, UniformityInfo *UA, ProfileSummaryInfo *PSIin, -BlockFrequencyInfo *BFIin, FunctionVarLocs const *FnVarLocs); +BlockFrequencyInfo *BFIin, MachineModuleInfo &MMI, +FunctionVarLocs const *FnVarLocs); void init(MachineFunction &NewMF, OptimizationRemarkEmitter &NewORE, MachineFunctionAnalysisManager &AM, const TargetLibraryInfo *LibraryInfo, UniformityInfo *UA, ProfileSummaryInfo *PSIin, BlockFrequencyInfo *BFIin, -FunctionVarLocs const *FnVarLocs) { -init(NewMF, NewORE, nullptr, LibraryInfo, UA, PSIin, BFIin, FnVarLocs); +MachineModuleInfo &MMI, FunctionVarLocs const *FnVarLocs) { +init(NewMF, NewORE, nullptr, LibraryInfo, UA, PSIin, BFIin, MMI, FnVarLocs); MFAM = &AM; } @@ -500,6 +502,7 @@ class SelectionDAG { OptimizationRemarkEmitter &getORE() const { return *ORE; } ProfileSummaryInfo *getPSI() const { return PSI; } BlockFrequencyInfo *getBFI() const { return BFI; } + MachineModuleInfo *getMMI() const { return MMI; } FlagInserter *getFlagInserter() { return Inserter; } void setFlagInserter(FlagInserter *FI) { Inserter = FI; } diff --git a/llvm/include/llvm/CodeGen/SelectionDAGISel.h b/llvm/include/llvm/CodeGen/SelectionDAGISel.h index aa0efa5d9bf5d..fc0590b1a1b69 100644 --- a/llvm/include/llvm/CodeGen/SelectionDAGISel.h +++ b/llvm/include/llvm/CodeGen/SelectionDAGISel.h @@ -48,6 +48,7 @@ class SelectionDAGISel { std::unique_ptr FuncInfo; SwiftErrorValueTracking *SwiftError; MachineFunction *MF; + MachineModuleInfo *MMI; MachineRegisterInfo *RegInfo; SelectionDAG *CurDAG; std::unique_ptr SDB; diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp index 02d44cd36ae53..a1376aaaf12a9 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp @@ -1333,7 +1333,7 @@ void 
SelectionDAG::init(MachineFunction &NewMF, OptimizationRemarkEmitter &NewORE, Pass *PassPtr, const TargetLibraryInfo *LibraryInfo, UniformityInfo *NewUA, ProfileSummaryInfo *PSIin, -BlockFrequencyInfo *BFIin, +BlockFrequencyInfo *BFIin, MachineModuleInfo &MMIin, FunctionVarLocs const *VarLocs) { MF = &NewMF; SDAGISelPass = PassPtr; @@ -1345,6 +1345,7 @@ void SelectionDAG::init(MachineFunction &NewMF, UA = NewUA; PSI = PSIin; BFI = BFIin; + MMI = &MMIin; FnVarLocs = VarLocs; } diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 98a795edb7a03..02eb1305b24ae 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -6707,11 +6707,11 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I, getValue(I.getArgOperand(0; return; case Intrinsic::eh_sjlj_callsite: { -MachineModuleInfo &MMI = DAG.getMachineFunction().getMMI(); ConstantInt *CI = cast(I.getArgOperand(0)); -assert(MMI.getCurrentCallSite() == 0 && "Overlapping call sites!"); +assert(DAG.getMMI()->getCurrentCallSite() == 0 && + "Overlapping call sites!"); -MMI.setCurrentCallSite(CI->getZExtValue()); +DAG.getMMI()->setCurrentCallSite(CI->getZExtValue()); return; } case Intrinsic::eh_sjlj_functioncontext: { diff --g
[llvm-branch-commits] [llvm] SelectionDAG: Avoid using MachineFunction::getMMI (PR #99696)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/99696 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] SelectionDAG: Avoid using MachineFunction::getMMI (PR #99696)
https://github.com/rnk approved this pull request. https://github.com/llvm/llvm-project/pull/99696 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang-tidy] Add FixIts for libc namespace macros (PR #99681)
https://github.com/michaelrj-google commented: LGTM from the libc side https://github.com/llvm/llvm-project/pull/99681 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
@@ -12,11 +12,52 @@ #include "mlir/TableGen/GenInfo.h" +#include "mlir/TableGen/CodeGenHelpers.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/ADT/TypeSwitch.h" #include "llvm/TableGen/Error.h" #include "llvm/TableGen/Record.h" using namespace llvm; +/// The code block defining the base mixin class for combining clause operand +/// structures. +static const char *const baseMixinClass = R"( +namespace detail { +template +struct Clauses : public Mixins... {}; +} // namespace detail +)"; + +/// The code block defining operation argument structures. +static const char *const operationArgStruct = R"( +using {0}Operands = detail::Clauses<{1}>; +)"; + +/// Remove multiple optional prefixes and suffixes from \c str. +/// +/// Prefixes and suffixes are attempted to be removed once in the order they +/// appear in the \c prefixes and \c suffixes arguments. All prefixes are +/// processed before suffixes are. This means it will behave as shown in the +/// following example: +/// - str: "PrePreNameSuf1Suf2" +/// - prefixes: ["Pre"] +/// - suffixes: ["Suf1", "Suf2"] +/// - return: "PreNameSuf1" +static StringRef stripPrefixAndSuffix(StringRef str, + llvm::ArrayRef prefixes, + llvm::ArrayRef suffixes) { + for (StringRef prefix : prefixes) +if (str.starts_with(prefix)) + str = str.substr(prefix.size()); + + for (StringRef suffix : suffixes) +if (str.ends_with(suffix)) + str = str.substr(0, str.size() - suffix.size()); kparzysz wrote: `str = str.drop_back(suffix.size())` https://github.com/llvm/llvm-project/pull/99508 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
@@ -12,11 +12,52 @@ #include "mlir/TableGen/GenInfo.h" +#include "mlir/TableGen/CodeGenHelpers.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/ADT/TypeSwitch.h" #include "llvm/TableGen/Error.h" #include "llvm/TableGen/Record.h" using namespace llvm; +/// The code block defining the base mixin class for combining clause operand +/// structures. +static const char *const baseMixinClass = R"( +namespace detail { +template +struct Clauses : public Mixins... {}; +} // namespace detail +)"; + +/// The code block defining operation argument structures. +static const char *const operationArgStruct = R"( +using {0}Operands = detail::Clauses<{1}>; +)"; + +/// Remove multiple optional prefixes and suffixes from \c str. +/// +/// Prefixes and suffixes are attempted to be removed once in the order they +/// appear in the \c prefixes and \c suffixes arguments. All prefixes are +/// processed before suffixes are. This means it will behave as shown in the +/// following example: +/// - str: "PrePreNameSuf1Suf2" +/// - prefixes: ["Pre"] +/// - suffixes: ["Suf1", "Suf2"] +/// - return: "PreNameSuf1" +static StringRef stripPrefixAndSuffix(StringRef str, + llvm::ArrayRef prefixes, + llvm::ArrayRef suffixes) { + for (StringRef prefix : prefixes) +if (str.starts_with(prefix)) + str = str.substr(prefix.size()); kparzysz wrote: `str = str.drop_front(prefix.size())` https://github.com/llvm/llvm-project/pull/99508 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
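The review suggestions on this helper replace `substr` arithmetic with `drop_front` and `drop_back`. The following is a minimal sketch of the same strip-once behavior, assuming `std::string_view` as a stand-in for `llvm::StringRef`; `remove_prefix` and `remove_suffix` play the role of `drop_front` and `drop_back` here, and the function is a hypothetical re-implementation for illustration, not the code from the PR.

```cpp
#include <cassert>
#include <initializer_list>
#include <string_view>

// Hypothetical stand-in for the reviewed helper, written against
// std::string_view instead of llvm::StringRef (an assumption of this sketch).
// Each prefix and each suffix is tried exactly once, in the order given, and
// all prefixes are processed before any suffix.
inline std::string_view
stripPrefixAndSuffix(std::string_view str,
                     std::initializer_list<std::string_view> prefixes,
                     std::initializer_list<std::string_view> suffixes) {
  for (std::string_view prefix : prefixes)
    if (str.size() >= prefix.size() &&
        str.compare(0, prefix.size(), prefix) == 0)
      str.remove_prefix(prefix.size()); // analogous to StringRef::drop_front

  for (std::string_view suffix : suffixes)
    if (str.size() >= suffix.size() &&
        str.compare(str.size() - suffix.size(), suffix.size(), suffix) == 0)
      str.remove_suffix(suffix.size()); // analogous to StringRef::drop_back

  return str;
}
```

With the inputs from the doc comment (`"PrePreNameSuf1Suf2"`, prefixes `["Pre"]`, suffixes `["Suf1", "Suf2"]`), only one `Pre` is removed, `Suf1` does not match at the time it is tried, and `Suf2` is dropped, which leaves `"PreNameSuf1"`.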
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Automate operand structure definition (PR #99508)
@@ -408,17 +408,26 @@ class ElementsAttrBase : let storageType = [{ ::mlir::ElementsAttr }]; let returnType = [{ ::mlir::ElementsAttr }]; let convertFromStorage = "$_self"; + + // The underlying C++ value type of each element. + string elementReturnType = ?; kparzysz wrote: This may need wider support, specifically we may need to generate an accessor function in .h.inc/.cpp.inc. Something like https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp#L1204, for example. I'm wary about making this kind of change in a widely shared file. Maybe we could just handle this in OmpOpGen.cpp? Specifically, infer this information in there based on the type of the attribute? https://github.com/llvm/llvm-project/pull/99508 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)
https://github.com/shawbyoung updated https://github.com/llvm/llvm-project/pull/98125 >From cf32a43e7c2b04079c6123fe13df4fb7226d771f Mon Sep 17 00:00:00 2001 From: shawbyoung Date: Tue, 9 Jul 2024 10:04:25 -0700 Subject: [PATCH 01/16] Comments Created using spr 1.3.4 --- bolt/lib/Profile/YAMLProfileReader.cpp | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp b/bolt/lib/Profile/YAMLProfileReader.cpp index 69ea0899c5f2c..6753337c24ea7 100644 --- a/bolt/lib/Profile/YAMLProfileReader.cpp +++ b/bolt/lib/Profile/YAMLProfileReader.cpp @@ -501,7 +501,6 @@ size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) { // Maps binary functions to adjacent functions in the FCG. for (const BinaryFunction *CallerBF : BFs) { -// Add all call targets to the hash map. for (const BinaryBasicBlock &BB : CallerBF->blocks()) { for (const MCInst &Inst : BB) { if (!BC.MIB->isCall(Instr)) @@ -533,7 +532,8 @@ size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) { } } - // Create mapping from neighbor hash to BFs. + // Using the constructed adjacent function mapping, creates mapping from + // neighbor hash to BFs. std::unordered_map> NeighborHashToBFs; for (const BinaryFunction *BF : BFs) { @@ -552,12 +552,12 @@ size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) { .push_back(BF); } - // TODO: change call anchor PR to have this representation - we need it here + // TODO: note, this will be introduced in the matching functions with calls + // as anchors pr DenseMap IdToYAMLBF; - // TODO: change call anchor PR to have this representation - we need it here - // Maps hashes to profiled functions. + // Maps YAML functions to adjacent functions in the profile FCG. std::unordered_map YamlBFToHashes(BFs.size()); @@ -590,7 +590,7 @@ size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) { } } - // Matching YAMLBF with neighbor hashes. + // Matches YAMLBF to BFs with neighbor hashes. 
for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) { if (YamlBF.Used) continue; >From ee9049fc4bd3d4203c19c9c0982a78ab3b47666f Mon Sep 17 00:00:00 2001 From: shawbyoung Date: Tue, 9 Jul 2024 13:52:05 -0700 Subject: [PATCH 02/16] Moved blended hash definition Created using spr 1.3.4 --- bolt/include/bolt/Profile/YAMLProfileReader.h | 69 ++- bolt/lib/Profile/StaleProfileMatching.cpp | 65 --- bolt/lib/Profile/YAMLProfileReader.cpp| 110 -- 3 files changed, 119 insertions(+), 125 deletions(-) diff --git a/bolt/include/bolt/Profile/YAMLProfileReader.h b/bolt/include/bolt/Profile/YAMLProfileReader.h index 36e8f8739eee1..e8a34ecad9a08 100644 --- a/bolt/include/bolt/Profile/YAMLProfileReader.h +++ b/bolt/include/bolt/Profile/YAMLProfileReader.h @@ -16,6 +16,73 @@ namespace llvm { namespace bolt { +/// An object wrapping several components of a basic block hash. The combined +/// (blended) hash is represented and stored as one uint64_t, while individual +/// components are of smaller size (e.g., uint16_t or uint8_t). +struct BlendedBlockHash { +private: + using ValueOffset = Bitfield::Element; + using ValueOpcode = Bitfield::Element; + using ValueInstr = Bitfield::Element; + using ValuePred = Bitfield::Element; + using ValueSucc = Bitfield::Element; + +public: + explicit BlendedBlockHash() {} + + explicit BlendedBlockHash(uint64_t Hash) { +Offset = Bitfield::get(Hash); +OpcodeHash = Bitfield::get(Hash); +InstrHash = Bitfield::get(Hash); +PredHash = Bitfield::get(Hash); +SuccHash = Bitfield::get(Hash); + } + + /// Combine the blended hash into uint64_t. + uint64_t combine() const { +uint64_t Hash = 0; +Bitfield::set(Hash, Offset); +Bitfield::set(Hash, OpcodeHash); +Bitfield::set(Hash, InstrHash); +Bitfield::set(Hash, PredHash); +Bitfield::set(Hash, SuccHash); +return Hash; + } + + /// Compute a distance between two given blended hashes. The smaller the + /// distance, the more similar two blocks are. For identical basic blocks, + /// the distance is zero. 
+ uint64_t distance(const BlendedBlockHash &BBH) const { +assert(OpcodeHash == BBH.OpcodeHash && + "incorrect blended hash distance computation"); +uint64_t Dist = 0; +// Account for NeighborHash +Dist += SuccHash == BBH.SuccHash ? 0 : 1; +Dist += PredHash == BBH.PredHash ? 0 : 1; +Dist <<= 16; +// Account for InstrHash +Dist += InstrHash == BBH.InstrHash ? 0 : 1; +Dist <<= 16; +// Account for Offset +Dist += (Offset >= BBH.Offset ? Offset - BBH.Offset : BBH.Offset - Offset); +return Dist; + } + + /// The offset of the basic block from the function start. + uint16_t Offset{0}; + /// (Loose) Hash of the basic block instructions, excluding operands. + uint16_t OpcodeHash{0}; + /// (Str
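To make the ordering built into `distance` easier to see, here is a minimal sketch that copies the arithmetic from the patch into a simplified struct. `BlockHashSketch` is a hypothetical name, and the widths chosen for `InstrHash`, `PredHash`, and `SuccHash` are assumptions of this sketch, since the bitfield layout is not visible in the excerpt above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for BlendedBlockHash. It keeps only the
// fields that distance() reads. The widths of InstrHash, PredHash and
// SuccHash are assumed for illustration.
struct BlockHashSketch {
  uint16_t Offset{0};
  uint16_t OpcodeHash{0};
  uint16_t InstrHash{0};
  uint8_t PredHash{0};
  uint8_t SuccHash{0};

  // Same arithmetic as in the patch: neighbor mismatches land in the highest
  // bits, an instruction-hash mismatch in the middle, and the absolute offset
  // difference in the low 16 bits.
  uint64_t distance(const BlockHashSketch &Other) const {
    assert(OpcodeHash == Other.OpcodeHash &&
           "incorrect blended hash distance computation");
    uint64_t Dist = 0;
    Dist += SuccHash == Other.SuccHash ? 0 : 1;
    Dist += PredHash == Other.PredHash ? 0 : 1;
    Dist <<= 16;
    Dist += InstrHash == Other.InstrHash ? 0 : 1;
    Dist <<= 16;
    Dist += (Offset >= Other.Offset ? Offset - Other.Offset
                                    : Other.Offset - Offset);
    return Dist;
  }
};
```

Because the offset difference fits in 16 bits, any instruction-hash mismatch outweighs every possible offset difference, and any neighbor mismatch outweighs both.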
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter { setCoroInfo(F, Shape, Clones); } + static Function *createNoAllocVariant(Function &F, coro::Shape &Shape, +SmallVectorImpl &Clones) { +auto *OrigFnTy = F.getFunctionType(); +auto OldParams = OrigFnTy->params(); + +SmallVector NewParams; +NewParams.reserve(OldParams.size() + 1); +for (Type *T : OldParams) { + NewParams.push_back(T); +} +NewParams.push_back(PointerType::getUnqual(Shape.FrameTy)); + +auto *NewFnTy = FunctionType::get(OrigFnTy->getReturnType(), NewParams, + OrigFnTy->isVarArg()); +Function *NoAllocF = +Function::Create(NewFnTy, F.getLinkage(), F.getName() + ".noalloc"); yuxuanchen1997 wrote: Does ThinLTO count as Cross-TU? https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter { setCoroInfo(F, Shape, Clones); } + static Function *createNoAllocVariant(Function &F, coro::Shape &Shape, +SmallVectorImpl &Clones) { +auto *OrigFnTy = F.getFunctionType(); +auto OldParams = OrigFnTy->params(); + +SmallVector NewParams; +NewParams.reserve(OldParams.size() + 1); +for (Type *T : OldParams) { + NewParams.push_back(T); +} +NewParams.push_back(PointerType::getUnqual(Shape.FrameTy)); + +auto *NewFnTy = FunctionType::get(OrigFnTy->getReturnType(), NewParams, + OrigFnTy->isVarArg()); +Function *NoAllocF = +Function::Create(NewFnTy, F.getLinkage(), F.getName() + ".noalloc"); vogelsgesang wrote: Intuitively, I would think so, yes. But I never looked into ThinLTO before, so I don't know for sure. Does #99285 enable `CoroAnnotationElide` already for ThinLTO? https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/vogelsgesang edited https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -968,8 +969,8 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level, // it's been modified since. MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); + MainCGPipeline.addPass(CoroAnnotationElidePass()); vogelsgesang wrote: should this also be added to `buildModuleInlinerPipeline`? https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -968,8 +969,8 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level, // it's been modified since. MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); + MainCGPipeline.addPass(CoroAnnotationElidePass()); vogelsgesang wrote: Should this optimization also work for ThinLTO? (also see discussion in https://github.com/llvm/llvm-project/pull/99283#discussion_r1684973257) I read this pass setup a bit more carefully now, and I don't think it will work for ThinLTO. Why? * Both `CoroAnnotationElide` and `CoroSplit` are added to `buildInlinerPipeline`. * `buildInlinerPipeline` is used by `buildModuleSimplificationPipeline`. * `buildModuleSimplificationPipeline` is part of both `buildThinLTOPreLinkDefaultPipeline` and `buildThinLTODefaultPipeline` * -> `CoroSplit` already runs as part of `buildThinLTOPreLinkDefaultPipeline` (i.e. on a per-translation-unit-level) * -> by the time we reach `buildThinLTODefaultPipeline` (i.e. the cross-TU part of ThinLTO), the coroutines are already split * -> although, `CoroAnnotationElide` is run a 2nd time as part of cross-TU optimization, it will be a no-op due to the `Caller->isPresplitCoroutine()` check in `CoroAnnotationElide.cpp:120` https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
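The ordering argument in the comment above can be sketched as a toy model. Everything in it is hypothetical: none of the names are LLVM APIs, the order of the two passes within a phase is assumed, and the `CalleeVisible` flag is an assumption used here to stand for "the callee's definition lives in another translation unit before linking".

```cpp
#include <cassert>

// Toy model only; these types and functions are not part of LLVM.
struct ToyCoroutine {
  bool IsPresplit = true;     // mirrors the isPresplitCoroutine() condition
  bool CallRewritten = false; // call redirected to a `.noalloc` variant?
};

// Stand-in for CoroSplit: once split, the function is no longer presplit.
inline void runToySplit(ToyCoroutine &Caller) { Caller.IsPresplit = false; }

// Stand-in for CoroAnnotationElide: it acts only on presplit callers and,
// as an assumption of this sketch, only when the callee is visible.
inline void runToyAnnotationElide(ToyCoroutine &Caller, bool CalleeVisible) {
  if (!Caller.IsPresplit || !CalleeVisible)
    return;
  Caller.CallRewritten = true;
}

// Models a cross-TU call under ThinLTO with both passes in the pre-link and
// post-link pipelines. Returns whether the call was ever rewritten.
inline bool toyThinLTOCrossTUElide() {
  ToyCoroutine Caller;
  // Pre-link (per translation unit): the callee is not visible yet.
  runToyAnnotationElide(Caller, /*CalleeVisible=*/false);
  runToySplit(Caller);
  // Post-link (cross-TU): the callee is visible, but the caller is split.
  runToyAnnotationElide(Caller, /*CalleeVisible=*/true);
  return Caller.CallRewritten;
}
```

In this model the second elide run never fires for the cross-TU call, which is the no-op described in the comment; a call whose callee is visible before splitting is still rewritten.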
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/vogelsgesang edited https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -968,8 +969,8 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level, // it's been modified since. MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); + MainCGPipeline.addPass(CoroAnnotationElidePass()); apolloww wrote: There is another PR, #90310, that tries to move the coro passes into the post-link pipeline if ThinLTO (`-flto=thin`) is enabled for the compilation. It ran into some issues with ASan and is going through some refactoring. https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -968,8 +969,8 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level, // it's been modified since. MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); + MainCGPipeline.addPass(CoroAnnotationElidePass()); vogelsgesang wrote: Thanks for that context! I assume that, as part of relanding #90310, you would then also adjust the `CoroAnnotationElide` registration? Do you happen to know the answer to https://github.com/llvm/llvm-project/pull/99283#discussion_r1684973257 or can you provide some guidance there? https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits