[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)
https://github.com/ChuanqiXu9 updated https://github.com/llvm/llvm-project/pull/83237

>From f2e53e44eebab4720a1dbade24fcb14d698fb03f Mon Sep 17 00:00:00 2001
From: Chuanqi Xu
Date: Wed, 28 Feb 2024 11:41:53 +0800
Subject: [PATCH 1/6] [Serialization] Code cleanups and polish 83233

---
 clang/include/clang/AST/DeclTemplate.h | 39 +-
 clang/include/clang/AST/ExternalASTSource.h | 8 +-
 .../clang/Sema/MultiplexExternalSemaSource.h | 4 +-
 .../include/clang/Serialization/ASTBitCodes.h | 2 +-
 clang/include/clang/Serialization/ASTReader.h | 4 +-
 clang/lib/AST/DeclTemplate.cpp | 85 ++--
 clang/lib/AST/ExternalASTSource.cpp | 10 +-
 clang/lib/AST/ODRHash.cpp | 10 -
 .../lib/Sema/MultiplexExternalSemaSource.cpp | 13 +-
 clang/lib/Serialization/ASTCommon.h | 1 -
 clang/lib/Serialization/ASTReader.cpp | 42 +-
 clang/lib/Serialization/ASTReaderDecl.cpp | 76 +---
 clang/lib/Serialization/ASTReaderInternals.h | 1 -
 clang/lib/Serialization/ASTWriter.cpp | 27 +-
 clang/lib/Serialization/ASTWriterDecl.cpp | 52 +--
 clang/lib/Serialization/CMakeLists.txt | 1 +
 .../Serialization/TemplateArgumentHasher.cpp | 423 ++
 .../Serialization/TemplateArgumentHasher.h | 34 ++
 clang/test/Modules/cxx-templates.cpp | 8 +-
 .../Modules/recursive-instantiations.cppm | 40 ++
 .../test/OpenMP/target_parallel_ast_print.cpp | 4 -
 clang/test/OpenMP/target_teams_ast_print.cpp | 4 -
 clang/test/OpenMP/task_ast_print.cpp | 4 -
 clang/test/OpenMP/teams_ast_print.cpp | 4 -
 24 files changed, 610 insertions(+), 286 deletions(-)
 create mode 100644 clang/lib/Serialization/TemplateArgumentHasher.cpp
 create mode 100644 clang/lib/Serialization/TemplateArgumentHasher.h
 create mode 100644 clang/test/Modules/recursive-instantiations.cppm

diff --git a/clang/include/clang/AST/DeclTemplate.h b/clang/include/clang/AST/DeclTemplate.h
index 44f840d297465d..7406252363d223 100644
--- a/clang/include/clang/AST/DeclTemplate.h
+++ b/clang/include/clang/AST/DeclTemplate.h
@@ -256,9 +256,6 @@ class TemplateArgumentList final
   TemplateArgumentList(const TemplateArgumentList &) = delete;
   TemplateArgumentList &operator=(const TemplateArgumentList &) = delete;

-  /// Create hash for the given arguments.
-  static unsigned ComputeODRHash(ArrayRef Args);
-
   /// Create a new template argument list that copies the given set of
   /// template arguments.
   static TemplateArgumentList *CreateCopy(ASTContext &Context,
@@ -732,25 +729,6 @@ class RedeclarableTemplateDecl : public TemplateDecl,
   }

   void anchor() override;
-  struct LazySpecializationInfo {
-    GlobalDeclID DeclID = GlobalDeclID();
-    unsigned ODRHash = ~0U;
-    bool IsPartial = false;
-    LazySpecializationInfo(GlobalDeclID ID, unsigned Hash = ~0U,
-                           bool Partial = false)
-        : DeclID(ID), ODRHash(Hash), IsPartial(Partial) {}
-    LazySpecializationInfo() {}
-    bool operator<(const LazySpecializationInfo &Other) const {
-      return DeclID < Other.DeclID;
-    }
-    bool operator==(const LazySpecializationInfo &Other) const {
-      assert((DeclID != Other.DeclID || ODRHash == Other.ODRHash) &&
-             "Hashes differ!");
-      assert((DeclID != Other.DeclID || IsPartial == Other.IsPartial) &&
-             "Both must be the same kinds!");
-      return DeclID == Other.DeclID;
-    }
-  };

 protected:
   template struct SpecEntryTraits {
@@ -794,16 +772,20 @@ class RedeclarableTemplateDecl : public TemplateDecl,

   void loadLazySpecializationsImpl(bool OnlyPartial = false) const;

-  void loadLazySpecializationsImpl(llvm::ArrayRef Args,
+  bool loadLazySpecializationsImpl(llvm::ArrayRef Args,
                                    TemplateParameterList *TPL = nullptr) const;

-  Decl *loadLazySpecializationImpl(LazySpecializationInfo &LazySpecInfo) const;
-
   template
   typename SpecEntryTraits::DeclType*
   findSpecializationImpl(llvm::FoldingSetVector &Specs,
                          void *&InsertPos, ProfileArguments &&...ProfileArgs);

+  template
+  typename SpecEntryTraits::DeclType *
+  findSpecializationLocally(llvm::FoldingSetVector &Specs,
+                            void *&InsertPos,
+                            ProfileArguments &&...ProfileArgs);
+
   template
   void addSpecializationImpl(llvm::FoldingSetVector &Specs,
                              EntryType *Entry, void *InsertPos);
@@ -819,13 +801,6 @@ class RedeclarableTemplateDecl : public TemplateDecl,
     llvm::PointerIntPair InstantiatedFromMember;

-    /// If non-null, points to an array of specializations (including
-    /// partial specializations) known only by their external declaration IDs.
-    ///
-    /// The first value in the array is the number of specializations/partial
-    /// specializations that follow.
-    LazySpecializationInfo *LazySpecializations = n
[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)
ChuanqiXu9 wrote:

I think I now understand the problem. The root cause is in https://github.com/llvm/llvm-project/blob/175aa864f33786f3a6a4ee7381cbcafd0758501a/clang/lib/Serialization/MultiOnDiskHashTable.h#L329

The notes in parentheses below are optional. You can skip them if you're not interested or on a first read.

What the code does: when we write an on-disk hash table, we also try to write the imported merged hash table in the same pass, so that we don't need to read these tables again. However, at line 329 the function tries to omit data from an imported table whenever the same key was already emitted by the current module file. This is the root cause of the problem.

(The merged hash tables that have been written are called overridden files, and they are removed in https://github.com/llvm/llvm-project/blob/175aa864f33786f3a6a4ee7381cbcafd0758501a/clang/lib/Serialization/MultiOnDiskHashTable.h#L133-L137)

(When are the tables merged? When the number of on-disk hash tables for the same item is larger than some threshold (4 by default), we merge them into an in-memory table to speed up queries. So this is mainly an optimization.)

It is wrong to skip data with the same key, because it violates the assumptions we discussed for a long time:
- It is bad to have different key values for logically identical specializations.
- But it is acceptable to have the same key value for different specializations. The code should still work if we computed the hash value of every template argument list as 0x12345678.

The implicit optimization of skipping data with the same key violates the second assumption above.

(Why did my previous attempt work? It removes the imported table once all items have been loaded from it, so it happened to avoid the "optimization".)

The fix then looks pretty simple: skip the optimization, as I did in the latest commit.
@ilya-biryukov @alexfh I think we can start another round of testing. Thanks in advance.

https://github.com/llvm/llvm-project/pull/83237
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
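To make the argument about equal keys concrete, here is a minimal sketch of the two merge behaviours. It does not use the real `MultiOnDiskHashTable` interface: `Table`, `hashTemplateArgs`, `tableFor`, `mergeSkippingSeenKeys` and `mergeKeepingAll` are hypothetical names invented for illustration, and the constant hash simply mirrors the 0x12345678 thought experiment from the comment.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one hash table: key -> names of specializations.
using Table = std::map<std::uint32_t, std::vector<std::string>>;

// Worst case from the thought experiment: every argument list collides.
inline std::uint32_t hashTemplateArgs(const std::string &) {
  return 0x12345678;
}

// Builds a table that holds a single specialization.
inline Table tableFor(const std::string &Spec) {
  Table T;
  T[hashTemplateArgs(Spec)].push_back(Spec);
  return T;
}

// Models the problematic behaviour: imported data is dropped whenever the
// current table already emitted the same key.
inline Table mergeSkippingSeenKeys(const Table &Current, const Table &Imported) {
  Table Merged = Current;
  for (const auto &Entry : Imported)
    if (Merged.find(Entry.first) == Merged.end())
      Merged.insert(Entry);
  return Merged;
}

// Models the fixed behaviour: data with an equal key is appended, not dropped.
inline Table mergeKeepingAll(const Table &Current, const Table &Imported) {
  Table Merged = Current;
  for (const auto &Entry : Imported) {
    std::vector<std::string> &Slot = Merged[Entry.first];
    Slot.insert(Slot.end(), Entry.second.begin(), Entry.second.end());
  }
  return Merged;
}
```

With `tableFor("A<int>")` as the current table and `tableFor("A<char>")` as the imported one, the skipping merge keeps only one specialization under the shared key, while the keeping merge retains both. That lost entry is the kind of loss the comment describes.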
[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)
@@ -1827,6 +1833,12 @@ void ASTDeclWriter::VisitVarTemplateDecl(VarTemplateDecl *D) {
 void ASTDeclWriter::VisitVarTemplateSpecializationDecl(
     VarTemplateSpecializationDecl *D) {
+  // FIXME: We need to load the "logical" first declaration before writing
+  // the Redeclarable part. But it may be too expensive to load all the
+  // specializations. Maybe we can find a way to load the "logical" first
+  // declaration only. Or we should try to solve this on the reader side.

ChuanqiXu9 wrote:

Yeah, but I tried to find the root cause : )

https://github.com/llvm/llvm-project/pull/83237
[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt] Fix definition of `usize` on 32-bit Windows (PR #106303)
mstorsjo wrote:

> @mstorsjo What do you think about merging this PR to the release branch?

LGTM!

https://github.com/llvm/llvm-project/pull/106303
[llvm-branch-commits] [clang-tools-extra] [clangd] Add clangd 19 release notes (PR #105975)
https://github.com/kadircet approved this pull request.

Thanks a lot for doing this, @HighCommander4!

https://github.com/llvm/llvm-project/pull/105975
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/ChuanqiXu9 commented:

The patch looks good to me except for the issue I mentioned in https://github.com/llvm/llvm-project/pull/99282#pullrequestreview-2265588601

https://github.com/llvm/llvm-project/pull/99283
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/ChuanqiXu9 edited https://github.com/llvm/llvm-project/pull/99283
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,74 @@ struct SwitchCoroutineSplitter {
     setCoroInfo(F, Shape, Clones);
   }

+  // Create a variant of ramp function that does not perform heap allocation
+  // for a switch ABI coroutine.
+  //
+  // The newly split `.noalloc` ramp function has the following differences:
+  // - Has one additional frame pointer parameter in lieu of dynamic
+  //   allocation.
+  // - Suppressed allocations by replacing coro.alloc and coro.free.
+  static Function *createNoAllocVariant(Function &F, coro::Shape &Shape,
+                                        SmallVectorImpl &Clones) {
+    auto *OrigFnTy = F.getFunctionType();

ChuanqiXu9 wrote:

nit: I would feel better with an assertion here that the ABI is the switch ABI.

https://github.com/llvm/llvm-project/pull/99283
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,74 @@ struct SwitchCoroutineSplitter {
     setCoroInfo(F, Shape, Clones);
   }

+  // Create a variant of ramp function that does not perform heap allocation
+  // for a switch ABI coroutine.
+  //
+  // The newly split `.noalloc` ramp function has the following differences:
+  // - Has one additional frame pointer parameter in lieu of dynamic
+  //   allocation.
+  // - Suppressed allocations by replacing coro.alloc and coro.free.
+  static Function *createNoAllocVariant(Function &F, coro::Shape &Shape,
+                                        SmallVectorImpl &Clones) {
+    auto *OrigFnTy = F.getFunctionType();
+    auto OldParams = OrigFnTy->params();
+
+    SmallVector NewParams;
+    NewParams.reserve(OldParams.size() + 1);
+    NewParams.append(OldParams.begin(), OldParams.end());
+    NewParams.push_back(PointerType::getUnqual(Shape.FrameTy));
+
+    auto *NewFnTy = FunctionType::get(OrigFnTy->getReturnType(), NewParams,
+                                      OrigFnTy->isVarArg());
+    Function *NoAllocF =
+        Function::Create(NewFnTy, F.getLinkage(), F.getName() + ".noalloc");
+
+    ValueToValueMapTy VMap;
+    unsigned int Idx = 0;
+    for (const auto &I : F.args()) {
+      VMap[&I] = NoAllocF->getArg(Idx++);
+    }
+    SmallVector Returns;
+    CloneFunctionInto(NoAllocF, &F, VMap,
+                      CloneFunctionChangeType::LocalChangesOnly, Returns);
+
+    if (Shape.CoroBegin) {
+      auto *NewCoroBegin =
+          cast_if_present(VMap[Shape.CoroBegin]);
+      auto *NewCoroId = cast(NewCoroBegin->getId());
+      coro::replaceCoroFree(NewCoroId, /*Elide=*/true);
+      coro::suppressCoroAllocs(NewCoroId);
+      NewCoroBegin->replaceAllUsesWith(NoAllocF->getArg(Idx));

ChuanqiXu9 wrote:

nit: it would look better to use a dedicated `FrameIdx` below instead of reusing the induction variable across code sections.

https://github.com/llvm/llvm-project/pull/99283
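The suggestion can be illustrated without any LLVM types. The sketch below is an assumption-laden mock, not the patch's code: `ParamList`, `NoAllocSignature` and `makeNoAllocSignature` are hypothetical names, and parameter types are modelled as plain strings. It only shows the idea of recording the frame parameter's position once, under its own name, instead of relying on where a loop counter happened to stop.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a function signature: a list of type names.
using ParamList = std::vector<std::string>;

struct NoAllocSignature {
  ParamList Params;         // original parameters plus the frame pointer
  std::size_t FrameIdx = 0; // position of the appended frame pointer
};

// Appends one trailing pointer parameter and records its index explicitly.
inline NoAllocSignature makeNoAllocSignature(const ParamList &OldParams) {
  NoAllocSignature Sig;
  Sig.Params.reserve(OldParams.size() + 1);
  Sig.Params.assign(OldParams.begin(), OldParams.end());
  Sig.FrameIdx = Sig.Params.size(); // the frame pointer comes last
  Sig.Params.push_back("ptr");
  return Sig;
}
```

A later use such as `Sig.Params[Sig.FrameIdx]` then documents its own intent, which is presumably what the nit is after.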
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -26,6 +26,10 @@ bool declaresIntrinsics(const Module &M,
                         const std::initializer_list);
 void replaceCoroFree(CoroIdInst *CoroId, bool Elide);

+void suppressCoroAllocs(CoroIdInst *CoroId);

ChuanqiXu9 wrote:

Let's add a comment for this, since I can't guess what it does from its name.

https://github.com/llvm/llvm-project/pull/99283
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -2049,6 +2055,21 @@ the coroutine must reach the final suspend point when it get destroyed.

 This attribute only works for switched-resume coroutines now.

+coro_elide_safe
+---
+
+When a Call or Invoke instruction is marked with `coro_elide_safe`,
+CoroAnnotationElidePass performs heap elision when possible. Note that for
+recursive or mutually recursive functions this elision is usually not possible.
+
+coro_gen_noalloc_ramp
+-
+
+This attribute hints CoroSplitPass to generate a `f.noalloc` ramp function for

ChuanqiXu9 wrote:

It would be better to explain and describe the `f.noalloc` ramp function in this document. It would also help to have some example code for it, compared with the normal ramp function.

https://github.com/llvm/llvm-project/pull/99283
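As a rough idea of what such a comparison could convey, here is a plain C++ analogy. It is not LLVM IR and not what CoroSplit emits: `Frame`, `ramp`, `rampNoAlloc` and `callerWithElision` are hypothetical names, and the frame layout is invented. The only point it carries over from the patch description is that the `.noalloc` variant takes one extra trailing pointer to caller-provided frame storage instead of allocating the frame itself.

```cpp
#include <new>

// Hypothetical coroutine frame; a real frame is laid out by the compiler.
struct Frame {
  int Arg;
  bool HeapAllocated;
};

// Analogy for the ordinary ramp function `f`: it obtains frame storage itself.
inline Frame *ramp(int Arg) { return new Frame{Arg, true}; }

// Analogy for `f.noalloc`: same parameters plus a trailing pointer to storage
// owned by the caller, so nothing is heap-allocated here.
inline Frame *rampNoAlloc(int Arg, void *FrameStorage) {
  return ::new (FrameStorage) Frame{Arg, false};
}

// Analogy for a call site after elision: the frame lives in the caller's
// own stack frame and is handed to the `.noalloc` variant.
inline int callerWithElision() {
  alignas(Frame) unsigned char Storage[sizeof(Frame)];
  Frame *F = rampNoAlloc(42, Storage);
  int Result = F->HeapAllocated ? -1 : F->Arg;
  F->~Frame(); // trivial here, shown for symmetry with `delete`
  return Result;
}
```

In this analogy the caller of `ramp` must eventually `delete` the frame, whereas `callerWithElision` needs no deallocation because the storage disappears with its own stack frame.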
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -0,0 +1,147 @@
+//===- CoroAnnotationElide.cpp - Elide attributed safe coroutine calls ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// \file
+// This pass transforms all Call or Invoke instructions that are annotated
+// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead.
+// The frame of the callee coroutine is allocated inside the caller. A pointer
+// to the allocated frame will be passed into the `.noalloc` ramp function.
+//
+//===--===//
+
+#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h"
+
+#include "llvm/Analysis/LazyCallGraph.h"
+#include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/Instruction.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/Transforms/Utils/CallGraphUpdater.h"
+
+#include
+
+using namespace llvm;
+
+#define DEBUG_TYPE "coro-annotation-elide"
+
+static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) {
+  for (Instruction &I : F->getEntryBlock())
+    if (!isa(&I))
+      return &I;
+  llvm_unreachable("no terminator in the entry block");
+}
+
+// Create an alloca in the caller, using FrameSize and FrameAlign as the callee
+// coroutine's activation frame.
+static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize,
+                                    Align FrameAlign) {
+  LLVMContext &C = Caller->getContext();
+  BasicBlock::iterator InsertPt =
+      getFirstNonAllocaInTheEntryBlock(Caller)->getIterator();
+  const DataLayout &DL = Caller->getDataLayout();
+  auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize);
+  auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt);
+  Frame->setAlignment(FrameAlign);
+  return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt);

ChuanqiXu9 wrote:

Why do we need a bitcast here? As far as I remember, we're in the era of opaque pointers. Am I misunderstanding something?

https://github.com/llvm/llvm-project/pull/99285
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -0,0 +1,147 @@
+//===- CoroAnnotationElide.cpp - Elide attributed safe coroutine calls ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// \file
+// This pass transforms all Call or Invoke instructions that are annotated
+// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead.
+// The frame of the callee coroutine is allocated inside the caller. A pointer
+// to the allocated frame will be passed into the `.noalloc` ramp function.
+//
+//===--===//
+
+#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h"
+
+#include "llvm/Analysis/LazyCallGraph.h"
+#include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/Instruction.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/Transforms/Utils/CallGraphUpdater.h"
+
+#include
+
+using namespace llvm;
+
+#define DEBUG_TYPE "coro-annotation-elide"
+
+static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) {
+  for (Instruction &I : F->getEntryBlock())
+    if (!isa(&I))
+      return &I;
+  llvm_unreachable("no terminator in the entry block");
+}
+
+// Create an alloca in the caller, using FrameSize and FrameAlign as the callee
+// coroutine's activation frame.
+static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize,
+                                    Align FrameAlign) {
+  LLVMContext &C = Caller->getContext();
+  BasicBlock::iterator InsertPt =
+      getFirstNonAllocaInTheEntryBlock(Caller)->getIterator();
+  const DataLayout &DL = Caller->getDataLayout();
+  auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize);
+  auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt);
+  Frame->setAlignment(FrameAlign);
+  return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt);
+}
+
+// Given a call or invoke instruction to the elide safe coroutine, this function
+// does the following:
+// - Allocate a frame for the callee coroutine in the caller using alloca.
+// - Replace the old CB with a new Call or Invoke to `NewCallee`, with the
+//    pointer to the frame as an additional argument to NewCallee.
+static void processCall(CallBase *CB, Function *Caller, Function *NewCallee,
+                        uint64_t FrameSize, Align FrameAlign) {
+  auto *FramePtr = allocateFrameInCaller(Caller, FrameSize, FrameAlign);
+  auto NewCBInsertPt = CB->getIterator();
+  llvm::CallBase *NewCB = nullptr;
+  SmallVector NewArgs;
+  NewArgs.append(CB->arg_begin(), CB->arg_end());
+  NewArgs.push_back(FramePtr);
+
+  if (auto *CI = dyn_cast(CB)) {
+    auto *NewCI = CallInst::Create(NewCallee->getFunctionType(), NewCallee,
+                                   NewArgs, "", NewCBInsertPt);
+    NewCI->setTailCallKind(CI->getTailCallKind());
+    NewCB = NewCI;

ChuanqiXu9 wrote:

Out of curiosity, why do we use a new variable here but not in the following branch?

https://github.com/llvm/llvm-project/pull/99285
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -0,0 +1,147 @@
+//===- CoroAnnotationElide.cpp - Elide attributed safe coroutine calls ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// \file
+// This pass transforms all Call or Invoke instructions that are annotated
+// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead.
+// The frame of the callee coroutine is allocated inside the caller. A pointer
+// to the allocated frame will be passed into the `.noalloc` ramp function.
+//
+//===--===//
+
+#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h"
+
+#include "llvm/Analysis/LazyCallGraph.h"
+#include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/Instruction.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/Transforms/Utils/CallGraphUpdater.h"
+
+#include
+
+using namespace llvm;
+
+#define DEBUG_TYPE "coro-annotation-elide"
+
+static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) {
+  for (Instruction &I : F->getEntryBlock())
+    if (!isa(&I))
+      return &I;
+  llvm_unreachable("no terminator in the entry block");
+}
+
+// Create an alloca in the caller, using FrameSize and FrameAlign as the callee
+// coroutine's activation frame.
+static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize,
+                                    Align FrameAlign) {
+  LLVMContext &C = Caller->getContext();
+  BasicBlock::iterator InsertPt =
+      getFirstNonAllocaInTheEntryBlock(Caller)->getIterator();
+  const DataLayout &DL = Caller->getDataLayout();
+  auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize);
+  auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt);
+  Frame->setAlignment(FrameAlign);
+  return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt);
+}
+
+// Given a call or invoke instruction to the elide safe coroutine, this function
+// does the following:
+// - Allocate a frame for the callee coroutine in the caller using alloca.
+// - Replace the old CB with a new Call or Invoke to `NewCallee`, with the
+//    pointer to the frame as an additional argument to NewCallee.
+static void processCall(CallBase *CB, Function *Caller, Function *NewCallee,
+                        uint64_t FrameSize, Align FrameAlign) {
+  auto *FramePtr = allocateFrameInCaller(Caller, FrameSize, FrameAlign);

ChuanqiXu9 wrote:

It would be better for performance to generate lifetime intrinsics for the new frame.

https://github.com/llvm/llvm-project/pull/99285
[llvm-branch-commits] [llvm] [LoongArch] Optimize for immediate value materialization using BSTRINS_D instruction (PR #106332)
@@ -41,11 +43,82 @@ LoongArchMatInt::InstSeq LoongArchMatInt::generateInstSeq(int64_t Val) {
     Insts.push_back(Inst(LoongArch::ORI, Lo12));
   }

+  // hi32
+  // Higher20
   if (SignExtend32<1>(Hi20 >> 19) != SignExtend32<20>(Higher20))
     Insts.push_back(Inst(LoongArch::LU32I_D, SignExtend64<20>(Higher20)));

+  // Highest12
   if (SignExtend32<1>(Higher20 >> 19) != SignExtend32<12>(Highest12))
     Insts.push_back(Inst(LoongArch::LU52I_D, SignExtend64<12>(Highest12)));

+  size_t N = Insts.size();
+  if (N < 3)
+    return Insts;
+
+  // When the number of instruction sequences is greater than 2, we have the
+  // opportunity to optimize using the BSTRINS_D instruction. The scenario is as
+  // follows:
+  //
+  // N of Insts = 3
+  // 1. ORI + LU32I_D + LU52I_D => ORI + BSTRINS_D, TmpVal = ORI
+  // 2. ADDI_W + LU32I_D + LU32I_D => ADDI_W + BSTRINS_D, TmpVal = ADDI_W
+  // 3. LU12I_W + ORI + LU32I_D => ORI + BSTRINS_D, TmpVal = ORI
+  // 4. LU12I_W + LU32I_D + LU52I_D => LU12I_W + BSTRINS_D, TmpVal = LU12I_W
+  //
+  // N of Insts = 4
+  // 5. LU12I_W + ORI + LU32I_D + LU52I_D => LU12I_W + ORI + BSTRINS_D
+  //                                      => ORI + LU52I_D + BSTRINS_D
+  //    TmpVal = (LU12I_W | ORI) or (ORI | LU52I_D)
+  // The BSTRINS_D instruction will use the `TmpVal` to construct the `Val`.
+  uint64_t TmpVal1 = 0;
+  uint64_t TmpVal2 = 0;
+  switch (Insts[0].Opc) {
+  default:
+    llvm_unreachable("unexpected opcode");
+    break;
+  case LoongArch::LU12I_W:
+    if (Insts[1].Opc == LoongArch::ORI) {
+      TmpVal1 = Insts[1].Imm;
+      if (N == 3)
+        break;
+      TmpVal2 = Insts[3].Imm << 52 | TmpVal1;
+    }
+    TmpVal1 |= Insts[0].Imm << 12;
+    break;
+  case LoongArch::ORI:
+  case LoongArch::ADDI_W:
+    TmpVal1 = Insts[0].Imm;
+    break;
+  }
+
+  for (uint64_t Msb = 32; Msb < 64; ++Msb) {
+    uint64_t HighMask = ~((1ULL << (Msb + 1)) - 1);
+    for (uint64_t Lsb = Msb; Lsb > 0; --Lsb) {

heiher wrote:

It appears the maximum number of iterations may be up to `∑_{i=32}^{63} i`. Could we reduce the complexity?

https://github.com/llvm/llvm-project/pull/106332
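For a sense of scale, the worst case can be counted directly from the two loop headers quoted in the diff. The sketch below copies only those headers and assumes the loop body never exits early, which the quoted excerpt does not show either way; `countWorstCaseIterations` is a hypothetical helper name.

```cpp
#include <cstdint>

// Counts how often the inner loop body would run for the quoted loop bounds,
// assuming no early exit from either loop.
inline std::uint64_t countWorstCaseIterations() {
  std::uint64_t Count = 0;
  for (std::uint64_t Msb = 32; Msb < 64; ++Msb)
    for (std::uint64_t Lsb = Msb; Lsb > 0; --Lsb)
      ++Count;
  return Count;
}
```

The inner loop runs `Msb` times for each of the 32 values of `Msb`, so the total is 32 + 33 + ... + 63 = (32 + 63) * 32 / 2 = 1520 iterations per materialized constant.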
[llvm-branch-commits] [llvm] [LoongArch] Optimize for immediate value materialization using BSTRINS_D instruction (PR #106332)
@@ -41,11 +43,82 @@ LoongArchMatInt::InstSeq LoongArchMatInt::generateInstSeq(int64_t Val) {
     Insts.push_back(Inst(LoongArch::ORI, Lo12));
   }

+  // hi32
+  // Higher20
   if (SignExtend32<1>(Hi20 >> 19) != SignExtend32<20>(Higher20))
     Insts.push_back(Inst(LoongArch::LU32I_D, SignExtend64<20>(Higher20)));

+  // Highest12
   if (SignExtend32<1>(Higher20 >> 19) != SignExtend32<12>(Highest12))
     Insts.push_back(Inst(LoongArch::LU52I_D, SignExtend64<12>(Highest12)));

+  size_t N = Insts.size();
+  if (N < 3)
+    return Insts;
+
+  // When the number of instruction sequences is greater than 2, we have the
+  // opportunity to optimize using the BSTRINS_D instruction. The scenario is as
+  // follows:
+  //
+  // N of Insts = 3
+  // 1. ORI + LU32I_D + LU52I_D => ORI + BSTRINS_D, TmpVal = ORI
+  // 2. ADDI_W + LU32I_D + LU32I_D => ADDI_W + BSTRINS_D, TmpVal = ADDI_W

heiher wrote:

ADDI_W + LU32I_D + LU{52}I_D

https://github.com/llvm/llvm-project/pull/106332
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -0,0 +1,147 @@
+//===- CoroAnnotationElide.cpp - Elide attributed safe coroutine calls ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// \file
+// This pass transforms all Call or Invoke instructions that are annotated
+// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead.
+// The frame of the callee coroutine is allocated inside the caller. A pointer
+// to the allocated frame will be passed into the `.noalloc` ramp function.
+//
+//===--===//
+
+#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h"
+
+#include "llvm/Analysis/LazyCallGraph.h"
+#include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/Instruction.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/Transforms/Utils/CallGraphUpdater.h"
+
+#include
+
+using namespace llvm;
+
+#define DEBUG_TYPE "coro-annotation-elide"
+
+static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) {
+  for (Instruction &I : F->getEntryBlock())
+    if (!isa(&I))
+      return &I;
+  llvm_unreachable("no terminator in the entry block");
+}
+
+// Create an alloca in the caller, using FrameSize and FrameAlign as the callee
+// coroutine's activation frame.
+static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize,
+                                    Align FrameAlign) {
+  LLVMContext &C = Caller->getContext();
+  BasicBlock::iterator InsertPt =
+      getFirstNonAllocaInTheEntryBlock(Caller)->getIterator();
+  const DataLayout &DL = Caller->getDataLayout();
+  auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize);
+  auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt);
+  Frame->setAlignment(FrameAlign);
+  return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt);
+}
+
+// Given a call or invoke instruction to the elide safe coroutine, this function
+// does the following:
+// - Allocate a frame for the callee coroutine in the caller using alloca.
+// - Replace the old CB with a new Call or Invoke to `NewCallee`, with the
+//    pointer to the frame as an additional argument to NewCallee.
+static void processCall(CallBase *CB, Function *Caller, Function *NewCallee,
+                        uint64_t FrameSize, Align FrameAlign) {
+  auto *FramePtr = allocateFrameInCaller(Caller, FrameSize, FrameAlign);

ChuanqiXu9 wrote:

This can be done as a new optimization in a separate patch, but let's leave a TODO here. I think we can do this by introducing two pseudo lifetime intrinsics in the frontend around the `co_await` expression and converting them to real lifetime intrinsics here.

https://github.com/llvm/llvm-project/pull/99285
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/ChuanqiXu9 edited https://github.com/llvm/llvm-project/pull/99285
[llvm-branch-commits] [llvm] DAG: Check if is_fpclass is custom, instead of isLegalOrCustom (PR #105577)
https://github.com/spavloff approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/105577
[llvm-branch-commits] [llvm] [LoongArch] Optimize for immediate value materialization using BSTRINS_D instruction (PR #106332)
https://github.com/wangleiat updated https://github.com/llvm/llvm-project/pull/106332

>From b2e3659d23ff3a576e2967576d501b24d6466e87 Mon Sep 17 00:00:00 2001
From: wanglei
Date: Wed, 28 Aug 2024 12:16:47 +0800
Subject: [PATCH] update test sextw-removal.ll

Created using spr 1.3.5-bogner
---
 llvm/test/CodeGen/LoongArch/sextw-removal.ll | 40
 1 file changed, 16 insertions(+), 24 deletions(-)

diff --git a/llvm/test/CodeGen/LoongArch/sextw-removal.ll b/llvm/test/CodeGen/LoongArch/sextw-removal.ll
index 2bb39395c1d1b6..7500b5ae09359a 100644
--- a/llvm/test/CodeGen/LoongArch/sextw-removal.ll
+++ b/llvm/test/CodeGen/LoongArch/sextw-removal.ll
@@ -323,21 +323,17 @@ define void @test7(i32 signext %arg, i32 signext %arg1) nounwind {
 ; CHECK-NEXT:    st.d $s2, $sp, 8 # 8-byte Folded Spill
 ; CHECK-NEXT:    sra.w $a0, $a0, $a1
 ; CHECK-NEXT:    lu12i.w $a1, 349525
-; CHECK-NEXT:    ori $a1, $a1, 1365
-; CHECK-NEXT:    lu32i.d $a1, 349525
-; CHECK-NEXT:    lu52i.d $fp, $a1, 1365
+; CHECK-NEXT:    ori $fp, $a1, 1365
+; CHECK-NEXT:    bstrins.d $fp, $fp, 62, 32
 ; CHECK-NEXT:    lu12i.w $a1, 209715
-; CHECK-NEXT:    ori $a1, $a1, 819
-; CHECK-NEXT:    lu32i.d $a1, 209715
-; CHECK-NEXT:    lu52i.d $s0, $a1, 819
+; CHECK-NEXT:    ori $s0, $a1, 819
+; CHECK-NEXT:    bstrins.d $s0, $s0, 61, 32
 ; CHECK-NEXT:    lu12i.w $a1, 61680
-; CHECK-NEXT:    ori $a1, $a1, 3855
-; CHECK-NEXT:    lu32i.d $a1, -61681
-; CHECK-NEXT:    lu52i.d $s1, $a1, 240
+; CHECK-NEXT:    ori $s1, $a1, 3855
+; CHECK-NEXT:    bstrins.d $s1, $s1, 59, 32
 ; CHECK-NEXT:    lu12i.w $a1, 4112
-; CHECK-NEXT:    ori $a1, $a1, 257
-; CHECK-NEXT:    lu32i.d $a1, 65793
-; CHECK-NEXT:    lu52i.d $s2, $a1, 16
+; CHECK-NEXT:    ori $s2, $a1, 257
+; CHECK-NEXT:    bstrins.d $s2, $s2, 56, 32
 ; CHECK-NEXT:    .p2align 4, , 16
 ; CHECK-NEXT:  .LBB6_1: # %bb2
 ; CHECK-NEXT:    # =>This Inner Loop Header: Depth=1
@@ -374,21 +370,17 @@ define void @test7(i32 signext %arg, i32 signext %arg1) nounwind {
 ; NORMV-NEXT:    st.d $s2, $sp, 8 # 8-byte Folded Spill
 ; NORMV-NEXT:    sra.w $a0, $a0, $a1
 ; NORMV-NEXT:    lu12i.w $a1, 349525
-; NORMV-NEXT:    ori $a1, $a1, 1365
-; NORMV-NEXT:    lu32i.d $a1, 349525
-; NORMV-NEXT:    lu52i.d $fp, $a1, 1365
+; NORMV-NEXT:    ori $fp, $a1, 1365
+; NORMV-NEXT:    bstrins.d $fp, $fp, 62, 32
 ; NORMV-NEXT:    lu12i.w $a1, 209715
-; NORMV-NEXT:    ori $a1, $a1, 819
-; NORMV-NEXT:    lu32i.d $a1, 209715
-; NORMV-NEXT:    lu52i.d $s0, $a1, 819
+; NORMV-NEXT:    ori $s0, $a1, 819
+; NORMV-NEXT:    bstrins.d $s0, $s0, 61, 32
 ; NORMV-NEXT:    lu12i.w $a1, 61680
-; NORMV-NEXT:    ori $a1, $a1, 3855
-; NORMV-NEXT:    lu32i.d $a1, -61681
-; NORMV-NEXT:    lu52i.d $s1, $a1, 240
+; NORMV-NEXT:    ori $s1, $a1, 3855
+; NORMV-NEXT:    bstrins.d $s1, $s1, 59, 32
 ; NORMV-NEXT:    lu12i.w $a1, 4112
-; NORMV-NEXT:    ori $a1, $a1, 257
-; NORMV-NEXT:    lu32i.d $a1, 65793
-; NORMV-NEXT:    lu52i.d $s2, $a1, 16
+; NORMV-NEXT:    ori $s2, $a1, 257
+; NORMV-NEXT:    bstrins.d $s2, $s2, 56, 32
 ; NORMV-NEXT:    .p2align 4, , 16
 ; NORMV-NEXT:  .LBB6_1: # %bb2
 ; NORMV-NEXT:    # =>This Inner Loop Header: Depth=1
[llvm-branch-commits] [llvm] [LoongArch] Optimize for immediate value materialization using BSTRINS_D instruction (PR #106332)
@@ -41,11 +43,82 @@ LoongArchMatInt::InstSeq LoongArchMatInt::generateInstSeq(int64_t Val) {
     Insts.push_back(Inst(LoongArch::ORI, Lo12));
   }
+  // hi32
+  // Higher20
   if (SignExtend32<1>(Hi20 >> 19) != SignExtend32<20>(Higher20))
     Insts.push_back(Inst(LoongArch::LU32I_D, SignExtend64<20>(Higher20)));
+  // Highest12
   if (SignExtend32<1>(Higher20 >> 19) != SignExtend32<12>(Highest12))
     Insts.push_back(Inst(LoongArch::LU52I_D, SignExtend64<12>(Highest12)));
+  size_t N = Insts.size();
+  if (N < 3)
+    return Insts;
+
+  // When the number of instruction sequences is greater than 2, we have the
+  // opportunity to optimize using the BSTRINS_D instruction. The scenario is as
+  // follows:
+  //
+  // N of Insts = 3
+  // 1. ORI + LU32I_D + LU52I_D => ORI + BSTRINS_D, TmpVal = ORI
+  // 2. ADDI_W + LU32I_D + LU32I_D => ADDI_W + BSTRINS_D, TmpVal = ADDI_W
+  // 3. LU12I_W + ORI + LU32I_D => ORI + BSTRINS_D, TmpVal = ORI
+  // 4. LU12I_W + LU32I_D + LU52I_D => LU12I_W + BSTRINS_D, TmpVal = LU12I_W
+  //
+  // N of Insts = 4
+  // 5. LU12I_W + ORI + LU32I_D + LU52I_D => LU12I_W + ORI + BSTRINS_D
+  //                                      => ORI + LU52I_D + BSTRINS_D
+  //    TmpVal = (LU12I_W | ORI) or (ORI | LU52I_D)
+  // The BSTRINS_D instruction will use the `TmpVal` to construct the `Val`.
+  uint64_t TmpVal1 = 0;
+  uint64_t TmpVal2 = 0;
+  switch (Insts[0].Opc) {
+  default:
+    llvm_unreachable("unexpected opcode");
+    break;
+  case LoongArch::LU12I_W:
+    if (Insts[1].Opc == LoongArch::ORI) {
+      TmpVal1 = Insts[1].Imm;
+      if (N == 3)
+        break;
+      TmpVal2 = Insts[3].Imm << 52 | TmpVal1;
+    }
+    TmpVal1 |= Insts[0].Imm << 12;
+    break;
+  case LoongArch::ORI:
+  case LoongArch::ADDI_W:
+    TmpVal1 = Insts[0].Imm;
+    break;
+  }
+
+  for (uint64_t Msb = 32; Msb < 64; ++Msb) {
+    uint64_t HighMask = ~((1ULL << (Msb + 1)) - 1);
+    for (uint64_t Lsb = Msb; Lsb > 0; --Lsb) {

wangleiat wrote:

I currently don't have a good way to reduce the number of loops, except for some obvious cases such as `hi32 = lo32`.

https://github.com/llvm/llvm-project/pull/106332
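For readers unfamiliar with `bstrins.d`, here is a minimal C++ model of how the shorter sequences in the updated `sextw-removal.ll` checks rebuild the same 64-bit constants. It is a sketch that assumes the usual LoongArch semantics (`bstrins.d rd, rj, msb, lsb` copies bits `[msb-lsb:0]` of `rj` into bits `[msb:lsb]` of `rd`). The helper names `lu12iW`, `ori`, `bstrinsD`, `materialize` and `checkTest7Constants` are invented for this example and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative models of three LoongArch instructions. These helpers are
// assumptions made for this sketch; they are not part of LLVM.

// lu12i.w rd, si20: rd = sign-extend-to-64(si20 << 12).
inline uint64_t lu12iW(int32_t Si20) {
  const uint32_t Low32 = static_cast<uint32_t>(Si20) << 12;
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(Low32)));
}

// ori rd, rj, ui12: rd = rj | zero-extended 12-bit immediate.
inline uint64_t ori(uint64_t Rj, uint32_t Ui12) { return Rj | (Ui12 & 0xFFFu); }

// bstrins.d rd, rj, msb, lsb: replace bits [msb:lsb] of rd with bits
// [msb-lsb:0] of rj and keep every other bit of rd.
inline uint64_t bstrinsD(uint64_t Rd, uint64_t Rj, unsigned Msb, unsigned Lsb) {
  const unsigned Width = Msb - Lsb + 1;
  const uint64_t FieldMask = (Width == 64) ? ~0ULL : ((1ULL << Width) - 1);
  return (Rd & ~(FieldMask << Lsb)) | ((Rj & FieldMask) << Lsb);
}

// lu12i.w + ori + "bstrins.d r, r, Msb, 32", as in the new CHECK lines.
inline uint64_t materialize(int32_t Hi20, uint32_t Lo12, unsigned Msb) {
  const uint64_t Tmp = ori(lu12iW(Hi20), Lo12);
  return bstrinsD(Tmp, Tmp, Msb, 32);
}

// The four constants built in test7, using the immediates from the diff.
inline void checkTest7Constants() {
  assert(materialize(349525, 1365, 62) == 0x5555555555555555ULL);
  assert(materialize(209715, 819, 61) == 0x3333333333333333ULL);
  assert(materialize(61680, 3855, 59) == 0x0F0F0F0F0F0F0F0FULL);
  assert(materialize(4112, 257, 56) == 0x0101010101010101ULL);
}
```

In each case the low 32 bits already contain the repeating pattern, so inserting them into the high half replaces the `lu32i.d` + `lu52i.d` pair with a single instruction.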
[llvm-branch-commits] [BOLT][NFC] Rename profile-use-pseudo-probes (PR #106364)
https://github.com/aaupov created https://github.com/llvm/llvm-project/pull/106364 None
[llvm-branch-commits] [BOLT][NFC] Rename profile-use-pseudo-probes (PR #106364)
llvmbot wrote: @llvm/pr-subscribers-bolt Author: Amir Ayupov (aaupov) Changes --- Full diff: https://github.com/llvm/llvm-project/pull/106364.diff 5 Files Affected: - (modified) bolt/lib/Profile/DataAggregator.cpp (+2-2) - (modified) bolt/lib/Profile/YAMLProfileReader.cpp (-5) - (modified) bolt/lib/Profile/YAMLProfileWriter.cpp (+8-3) - (modified) bolt/lib/Rewrite/PseudoProbeRewriter.cpp (+3-3) - (modified) bolt/test/X86/pseudoprobe-decoding-inline.test (+3-3) ``diff diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp index 813d825f8b570c..10d745cc69824b 100644 --- a/bolt/lib/Profile/DataAggregator.cpp +++ b/bolt/lib/Profile/DataAggregator.cpp @@ -88,7 +88,7 @@ MaxSamples("max-samples", cl::cat(AggregatorCategory)); extern cl::opt ProfileFormat; -extern cl::opt ProfileUsePseudoProbes; +extern cl::opt ProfileWritePseudoProbes; extern cl::opt SaveProfile; cl::opt ReadPreAggregated( @@ -2300,7 +2300,7 @@ std::error_code DataAggregator::writeBATYAML(BinaryContext &BC, yaml::bolt::BinaryProfile BP; const MCPseudoProbeDecoder *PseudoProbeDecoder = - opts::ProfileUsePseudoProbes ? BC.getPseudoProbeDecoder() : nullptr; + opts::ProfileWritePseudoProbes ? BC.getPseudoProbeDecoder() : nullptr; // Fill out the header info. 
BP.Header.Version = 1; diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp b/bolt/lib/Profile/YAMLProfileReader.cpp index 3eca5e972fa5ba..604a9fb4813be4 100644 --- a/bolt/lib/Profile/YAMLProfileReader.cpp +++ b/bolt/lib/Profile/YAMLProfileReader.cpp @@ -49,11 +49,6 @@ llvm::cl::opt llvm::cl::opt ProfileUseDFS("profile-use-dfs", cl::desc("use DFS order for YAML profile"), cl::Hidden, cl::cat(BoltOptCategory)); - -llvm::cl::opt ProfileUsePseudoProbes( -"profile-use-pseudo-probes", -cl::desc("Use pseudo probes for profile generation and matching"), -cl::Hidden, cl::cat(BoltOptCategory)); } // namespace opts namespace llvm { diff --git a/bolt/lib/Profile/YAMLProfileWriter.cpp b/bolt/lib/Profile/YAMLProfileWriter.cpp index f74cf60e076d0a..ffbf2388e912fb 100644 --- a/bolt/lib/Profile/YAMLProfileWriter.cpp +++ b/bolt/lib/Profile/YAMLProfileWriter.cpp @@ -13,6 +13,7 @@ #include "bolt/Profile/DataAggregator.h" #include "bolt/Profile/ProfileReaderBase.h" #include "bolt/Rewrite/RewriteInstance.h" +#include "bolt/Utils/CommandLineOpts.h" #include "llvm/Support/CommandLine.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/raw_ostream.h" @@ -21,8 +22,12 @@ #define DEBUG_TYPE "bolt-prof" namespace opts { -extern llvm::cl::opt ProfileUseDFS; -extern llvm::cl::opt ProfileUsePseudoProbes; +using namespace llvm; +extern cl::opt ProfileUseDFS; +cl::opt ProfileWritePseudoProbes( +"profile-write-pseudo-probes", +cl::desc("Use pseudo probes in profile generation"), cl::Hidden, +cl::cat(BoltOptCategory)); } // namespace opts namespace llvm { @@ -59,7 +64,7 @@ YAMLProfileWriter::convert(const BinaryFunction &BF, bool UseDFS, yaml::bolt::BinaryFunctionProfile YamlBF; const BinaryContext &BC = BF.getBinaryContext(); const MCPseudoProbeDecoder *PseudoProbeDecoder = - opts::ProfileUsePseudoProbes ? BC.getPseudoProbeDecoder() : nullptr; + opts::ProfileWritePseudoProbes ? 
BC.getPseudoProbeDecoder() : nullptr; const uint16_t LBRProfile = BF.getProfileFlags() & BinaryFunction::PF_LBR; diff --git a/bolt/lib/Rewrite/PseudoProbeRewriter.cpp b/bolt/lib/Rewrite/PseudoProbeRewriter.cpp index 6e80d9b0014b7b..228913e6ea1f39 100644 --- a/bolt/lib/Rewrite/PseudoProbeRewriter.cpp +++ b/bolt/lib/Rewrite/PseudoProbeRewriter.cpp @@ -50,7 +50,7 @@ static cl::opt PrintPseudoProbes( clEnumValN(PPP_All, "all", "enable all debugging printout")), cl::Hidden, cl::cat(BoltCategory)); -extern cl::opt ProfileUsePseudoProbes; +extern cl::opt ProfileWritePseudoProbes; } // namespace opts namespace { @@ -91,14 +91,14 @@ class PseudoProbeRewriter final : public MetadataRewriter { }; Error PseudoProbeRewriter::preCFGInitializer() { - if (opts::ProfileUsePseudoProbes) + if (opts::ProfileWritePseudoProbes) parsePseudoProbe(); return Error::success(); } Error PseudoProbeRewriter::postEmitFinalizer() { - if (!opts::ProfileUsePseudoProbes) + if (!opts::ProfileWritePseudoProbes) parsePseudoProbe(); updatePseudoProbes(); diff --git a/bolt/test/X86/pseudoprobe-decoding-inline.test b/bolt/test/X86/pseudoprobe-decoding-inline.test index b361551e5711ea..1fdd00c7ef6c4b 100644 --- a/bolt/test/X86/pseudoprobe-decoding-inline.test +++ b/bolt/test/X86/pseudoprobe-decoding-inline.test @@ -6,11 +6,11 @@ # PREAGG: B X:0 #main# 1 0 ## Check pseudo-probes in regular YAML profile (non-BOLTed binary) # RUN: link_fdata %s %S/../../../llvm/test/tools/llvm-profgen/Inputs/inline-cs-pseudoprobe.perfbin %t.preagg PREAGG -# RUN: perf2bolt %S/../../../llvm/test/tools/llvm-
[llvm-branch-commits] [BOLT][NFC] Rename profile-use-pseudo-probes (PR #106364)
https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/106364
[llvm-branch-commits] [BOLT] Only parse probes for profiled functions in profile-write-pseudo-probes mode (PR #106365)
https://github.com/aaupov created https://github.com/llvm/llvm-project/pull/106365 None
[llvm-branch-commits] [BOLT] Only parse probes for profiled functions in profile-write-pseudo-probes mode (PR #106365)
llvmbot wrote:

@llvm/pr-subscribers-bolt

Author: Amir Ayupov (aaupov)

Changes

---
Full diff: https://github.com/llvm/llvm-project/pull/106365.diff

1 Files Affected:

- (modified) bolt/lib/Rewrite/PseudoProbeRewriter.cpp (+7-3)

```diff
diff --git a/bolt/lib/Rewrite/PseudoProbeRewriter.cpp b/bolt/lib/Rewrite/PseudoProbeRewriter.cpp
index 228913e6ea1f39..89a7fddbb5d2af 100644
--- a/bolt/lib/Rewrite/PseudoProbeRewriter.cpp
+++ b/bolt/lib/Rewrite/PseudoProbeRewriter.cpp
@@ -72,7 +72,8 @@ class PseudoProbeRewriter final : public MetadataRewriter {
   /// Parse .pseudo_probe_desc section and .pseudo_probe section
   /// Setup Pseudo probe decoder
-  void parsePseudoProbe();
+  /// If \p ProfiledOnly is set, only parse records for functions with profile.
+  void parsePseudoProbe(bool ProfiledOnly = false);
   /// PseudoProbe decoder
   std::shared_ptr ProbeDecoderPtr;
@@ -92,7 +93,7 @@ class PseudoProbeRewriter final : public MetadataRewriter {
 Error PseudoProbeRewriter::preCFGInitializer() {
   if (opts::ProfileWritePseudoProbes)
-    parsePseudoProbe();
+    parsePseudoProbe(true);
   return Error::success();
 }
@@ -105,7 +106,7 @@ Error PseudoProbeRewriter::postEmitFinalizer() {
   return Error::success();
 }
-void PseudoProbeRewriter::parsePseudoProbe() {
+void PseudoProbeRewriter::parsePseudoProbe(bool ProfiledOnly) {
   MCPseudoProbeDecoder &ProbeDecoder(*ProbeDecoderPtr);
   PseudoProbeDescSection = BC.getUniqueSectionByName(".pseudo_probe_desc");
   PseudoProbeSection = BC.getUniqueSectionByName(".pseudo_probe");
@@ -136,6 +137,7 @@ void PseudoProbeRewriter::parsePseudoProbe() {
   MCPseudoProbeDecoder::Uint64Map FuncStartAddrs;
   SmallVector Suffixes({".llvm.", ".destroy", ".resume"});
   for (const BinaryFunction *F : BC.getAllBinaryFunctions()) {
+    bool HasProfile = F->hasProfileAvailable();
     for (const MCSymbol *Sym : F->getSymbols()) {
       StringRef SymName = NameResolver::restore(Sym->getName());
       if (std::optional CommonName =
@@ -144,6 +146,8 @@ void PseudoProbeRewriter::parsePseudoProbe() {
       }
       uint64_t GUID = Function::getGUID(SymName);
       FuncStartAddrs[GUID] = F->getAddress();
+      if (ProfiledOnly && HasProfile)
+        GuidFilter.insert(GUID);
     }
   }
   Contents = PseudoProbeSection->getContents();
```

https://github.com/llvm/llvm-project/pull/106365
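As a rough, self-contained illustration of the filtering idea in this patch (collect the GUIDs of functions that have a profile, then skip records for everything else), here is a sketch built on standard containers. `FunctionInfo`, `buildGuidFilter` and `shouldParse` are hypothetical names used only in this example. How BOLT's decoder actually consumes `GuidFilter` is not shown in the diff, so the "empty filter means parse everything" rule below is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the per-function data the patch inspects.
struct FunctionInfo {
  uint64_t Guid;
  bool HasProfile;
};

// Mirrors the patched loop: a GUID is recorded only when ProfiledOnly is
// requested and the function has a profile.
inline std::unordered_set<uint64_t>
buildGuidFilter(const std::vector<FunctionInfo> &Functions, bool ProfiledOnly) {
  std::unordered_set<uint64_t> GuidFilter;
  for (const FunctionInfo &F : Functions)
    if (ProfiledOnly && F.HasProfile)
      GuidFilter.insert(F.Guid);
  return GuidFilter;
}

// ASSUMPTION for this sketch: an empty filter means "no filtering".
inline bool shouldParse(const std::unordered_set<uint64_t> &GuidFilter,
                        uint64_t Guid) {
  return GuidFilter.empty() || GuidFilter.count(Guid) != 0;
}

// Small usage example: only profiled functions pass the filter.
inline void runGuidFilterExample() {
  const std::vector<FunctionInfo> Functions = {{1, true}, {2, false}};
  const auto Filter = buildGuidFilter(Functions, /*ProfiledOnly=*/true);
  assert(shouldParse(Filter, 1));
  assert(!shouldParse(Filter, 2));
}
```

With `ProfiledOnly` left at its default of `false`, the filter stays empty and, under the assumption above, every record is still parsed, which matches the unchanged call in `postEmitFinalizer`.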
[llvm-branch-commits] [BOLT] Only parse probes for profiled functions in profile-write-pseudo-probes mode (PR #106365)
https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/106365
[llvm-branch-commits] [llvm] DAG: Handle lowering unordered compare with inf (PR #100378)
@@ -219,9 +219,13 @@ findSplitPointForStackProtector(MachineBasicBlock *BB,
 /// (i.e. fewer instructions should be required to lower it). An example is the
 /// test "inf|normal|subnormal|zero", which is an inversion of "nan".
 /// \param Test The test as specified in 'is_fpclass' intrinsic invocation.
+///
+/// \param UseFP The intention is to perform the comparison using floating-point
+/// compare instructions which check for nan.
+///

spavloff wrote:

In the example in https://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments parameter lines are not separated by blank lines. It is not a big deal, but the params separated from each other and NOT separated from the description didn't look good.

https://github.com/llvm/llvm-project/pull/100378
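To make the layout being requested concrete, here is a small sketch of a Doxygen comment where the description is separated from the parameter list by one blank `///` line and the `\param` lines are kept together. The function `isInfOrNan` is a made-up helper for this illustration and has nothing to do with the code in the PR.

```cpp
#include <cassert>
#include <cmath>

/// Returns true if \p X is NaN or an infinity.
///
/// Illustrative helper only; it is not part of the patch under review.
///
/// \param X The value to classify.
/// \param AllowNegative Whether negative infinity also counts as a match.
/// \returns true if \p X is NaN or a matching infinity.
inline bool isInfOrNan(double X, bool AllowNegative) {
  if (std::isnan(X))
    return true;
  if (!std::isinf(X))
    return false;
  return X > 0 || AllowNegative;
}
```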
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -0,0 +1,147 @@ +//===- CoroAnnotationElide.cpp - Elide attributed safe coroutine calls ===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// \file +// This pass transforms all Call or Invoke instructions that are annotated +// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead. +// The frame of the callee coroutine is allocated inside the caller. A pointer +// to the allocated frame will be passed into the `.noalloc` ramp function. +// +//===--===// + +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" + +#include "llvm/Analysis/LazyCallGraph.h" +#include "llvm/Analysis/OptimizationRemarkEmitter.h" +#include "llvm/IR/Analysis.h" +#include "llvm/IR/IRBuilder.h" +#include "llvm/IR/InstIterator.h" +#include "llvm/IR/Instruction.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/PassManager.h" +#include "llvm/Transforms/Utils/CallGraphUpdater.h" + +#include + +using namespace llvm; + +#define DEBUG_TYPE "coro-annotation-elide" + +static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) { + for (Instruction &I : F->getEntryBlock()) +if (!isa(&I)) + return &I; + llvm_unreachable("no terminator in the entry block"); +} + +// Create an alloca in the caller, using FrameSize and FrameAlign as the callee +// coroutine's activation frame. 
+static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize, +Align FrameAlign) { + LLVMContext &C = Caller->getContext(); + BasicBlock::iterator InsertPt = + getFirstNonAllocaInTheEntryBlock(Caller)->getIterator(); + const DataLayout &DL = Caller->getDataLayout(); + auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize); + auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt); + Frame->setAlignment(FrameAlign); + return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt); +} + +// Given a call or invoke instruction to the elide safe coroutine, this function +// does the following: +// - Allocate a frame for the callee coroutine in the caller using alloca. +// - Replace the old CB with a new Call or Invoke to `NewCallee`, with the +//pointer to the frame as an additional argument to NewCallee. +static void processCall(CallBase *CB, Function *Caller, Function *NewCallee, +uint64_t FrameSize, Align FrameAlign) { + auto *FramePtr = allocateFrameInCaller(Caller, FrameSize, FrameAlign); yuxuanchen1997 wrote: The old CoroElide didn't have it and just right out of my mind I don't see a clear path for allowing this in the LLVM Coroutine semantics. In C++ semantics this is doable (lifetime of the coroutine ended at the full expression after `co_await`.) Maybe introduce this from FE? But sure leave a todo here for another day. https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -0,0 +1,147 @@ +//===- CoroAnnotationElide.cpp - Elide attributed safe coroutine calls ===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// \file +// This pass transforms all Call or Invoke instructions that are annotated +// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead. +// The frame of the callee coroutine is allocated inside the caller. A pointer +// to the allocated frame will be passed into the `.noalloc` ramp function. +// +//===--===// + +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" + +#include "llvm/Analysis/LazyCallGraph.h" +#include "llvm/Analysis/OptimizationRemarkEmitter.h" +#include "llvm/IR/Analysis.h" +#include "llvm/IR/IRBuilder.h" +#include "llvm/IR/InstIterator.h" +#include "llvm/IR/Instruction.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/PassManager.h" +#include "llvm/Transforms/Utils/CallGraphUpdater.h" + +#include + +using namespace llvm; + +#define DEBUG_TYPE "coro-annotation-elide" + +static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) { + for (Instruction &I : F->getEntryBlock()) +if (!isa(&I)) + return &I; + llvm_unreachable("no terminator in the entry block"); +} + +// Create an alloca in the caller, using FrameSize and FrameAlign as the callee +// coroutine's activation frame. 
+static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize, +Align FrameAlign) { + LLVMContext &C = Caller->getContext(); + BasicBlock::iterator InsertPt = + getFirstNonAllocaInTheEntryBlock(Caller)->getIterator(); + const DataLayout &DL = Caller->getDataLayout(); + auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize); + auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt); + Frame->setAlignment(FrameAlign); + return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt); yuxuanchen1997 wrote: This is the same procedure as in `CoroElide`. Let's remove the bitcast I guess? https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -0,0 +1,147 @@ +//===- CoroAnnotationElide.cpp - Elide attributed safe coroutine calls ===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// \file +// This pass transforms all Call or Invoke instructions that are annotated +// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead. +// The frame of the callee coroutine is allocated inside the caller. A pointer +// to the allocated frame will be passed into the `.noalloc` ramp function. +// +//===--===// + +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" + +#include "llvm/Analysis/LazyCallGraph.h" +#include "llvm/Analysis/OptimizationRemarkEmitter.h" +#include "llvm/IR/Analysis.h" +#include "llvm/IR/IRBuilder.h" +#include "llvm/IR/InstIterator.h" +#include "llvm/IR/Instruction.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/PassManager.h" +#include "llvm/Transforms/Utils/CallGraphUpdater.h" + +#include + +using namespace llvm; + +#define DEBUG_TYPE "coro-annotation-elide" + +static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) { + for (Instruction &I : F->getEntryBlock()) +if (!isa(&I)) + return &I; + llvm_unreachable("no terminator in the entry block"); +} + +// Create an alloca in the caller, using FrameSize and FrameAlign as the callee +// coroutine's activation frame. 
+static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize, +Align FrameAlign) { + LLVMContext &C = Caller->getContext(); + BasicBlock::iterator InsertPt = + getFirstNonAllocaInTheEntryBlock(Caller)->getIterator(); + const DataLayout &DL = Caller->getDataLayout(); + auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize); + auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt); + Frame->setAlignment(FrameAlign); + return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt); +} + +// Given a call or invoke instruction to the elide safe coroutine, this function +// does the following: +// - Allocate a frame for the callee coroutine in the caller using alloca. +// - Replace the old CB with a new Call or Invoke to `NewCallee`, with the +//pointer to the frame as an additional argument to NewCallee. +static void processCall(CallBase *CB, Function *Caller, Function *NewCallee, +uint64_t FrameSize, Align FrameAlign) { + auto *FramePtr = allocateFrameInCaller(Caller, FrameSize, FrameAlign); + auto NewCBInsertPt = CB->getIterator(); + llvm::CallBase *NewCB = nullptr; + SmallVector NewArgs; + NewArgs.append(CB->arg_begin(), CB->arg_end()); + NewArgs.push_back(FramePtr); + + if (auto *CI = dyn_cast(CB)) { +auto *NewCI = CallInst::Create(NewCallee->getFunctionType(), NewCallee, + NewArgs, "", NewCBInsertPt); +NewCI->setTailCallKind(CI->getTailCallKind()); +NewCB = NewCI; yuxuanchen1997 wrote: `setTailCallKind` is on `CallInst` not `CallBase`. This `NewCB = NewCI` is upcasting the pointer. https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
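The point about `NewCB = NewCI` can be shown with a toy hierarchy: a setter that exists only on the derived class has to be called through the derived pointer before that pointer is stored in a base-class variable. `CallBaseLike`, `CallInstLike` and `cloneWithTailKind` are simplified stand-ins invented for this sketch; they are not the LLVM classes.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy base class, standing in for a generic "call-like" instruction.
struct CallBaseLike {
  virtual ~CallBaseLike() = default;
};

// Toy derived class: the tail-call kind lives only here, not on the base.
struct CallInstLike : CallBaseLike {
  enum TailCallKind { TCK_None, TCK_Tail, TCK_MustTail };
  void setTailCallKind(TailCallKind K) { Kind = K; }
  TailCallKind getTailCallKind() const { return Kind; }

private:
  TailCallKind Kind = TCK_None;
};

// Build the derived object, configure it through the derived pointer, and
// only then upcast to the base pointer type.
inline std::unique_ptr<CallBaseLike> cloneWithTailKind(const CallInstLike &Old) {
  auto NewCI = std::make_unique<CallInstLike>();
  NewCI->setTailCallKind(Old.getTailCallKind()); // available on derived only
  std::unique_ptr<CallBaseLike> NewCB = std::move(NewCI); // implicit upcast
  assert(NewCB && "upcast keeps ownership of the same object");
  return NewCB;
}
```

Calling `setTailCallKind` on the base pointer would not compile in this model, which is why the configuration step comes first.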
[llvm-branch-commits] [llvm] AArch64: Use consistent atomicrmw expansion for FP operations (PR #103702)
@@ -27056,21 +27056,35 @@ AArch64TargetLowering::shouldExpandAtomicLoadInIR(LoadInst *LI) const {
              : AtomicExpansionKind::LLSC;
 }
+// Return true if the atomic operation expansion will lower to use a library
+// call, and is thus ineligible to use an LLSC expansion.
+static bool rmwOpMayLowerToLibcall(const AtomicRMWInst *RMW) {
+  if (!RMW->isFloatingPointOperation())
+    return false;
+  switch (RMW->getType()->getScalarType()->getTypeID()) {
+  case Type::FloatTyID:
+  case Type::DoubleTyID:
+  case Type::HalfTyID:
+  case Type::BFloatTyID:
+    return false;

arsenm wrote:

That is in the test (in the parent #103701)

https://github.com/llvm/llvm-project/pull/103702
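The shape of this predicate can be sketched outside LLVM with a toy type enum. The quoted hunk stops after the first `return false;`, so the behaviour for any other floating-point type is an assumption here: the sketch treats a 128-bit float as the case that may need a library call. `FPType`, `RMWOp` and `mayLowerToLibcall` are hypothetical names, not the AArch64 backend's API.

```cpp
#include <cassert>

// Hypothetical scalar type tags for this sketch.
enum class FPType { Half, BFloat, Float, Double, FP128 };

// Hypothetical, simplified description of an atomicrmw operation.
struct RMWOp {
  bool IsFloatingPoint;
  FPType Type; // ignored when IsFloatingPoint is false
};

// Returns true if the operation may end up as a library call and should
// therefore not be given an LL/SC expansion.
inline bool mayLowerToLibcall(const RMWOp &Op) {
  if (!Op.IsFloatingPoint)
    return false;
  switch (Op.Type) {
  case FPType::Float:
  case FPType::Double:
  case FPType::Half:
  case FPType::BFloat:
    return false;
  case FPType::FP128:
    return true; // ASSUMPTION: not visible in the quoted hunk
  }
  return true; // conservative fallback for unknown tags
}

// Small usage example covering both outcomes.
inline void runLibcallExample() {
  assert(!mayLowerToLibcall({true, FPType::Double}));
  assert(mayLowerToLibcall({true, FPType::FP128}));
}
```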
[llvm-branch-commits] [llvm] AArch64: Use consistent atomicrmw expansion for FP operations (PR #103702)
https://github.com/efriedma-quic approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/103702
[llvm-branch-commits] [clang-tools-extra] [clangd] Add clangd 19 release notes (PR #105975)
https://github.com/HighCommander4 milestoned https://github.com/llvm/llvm-project/pull/105975
[llvm-branch-commits] [clang-tools-extra] [clangd] Add clangd 19 release notes (PR #105975)
HighCommander4 wrote: Thanks for the review. @tstellar could you merge these release notes for us please? https://github.com/llvm/llvm-project/pull/105975
[llvm-branch-commits] [llvm] 5a00383 - Revert "Revert "[MemProf] Reduce cloning overhead by sharing nodes when possi…"
Author: Teresa Johnson Date: 2024-08-28T11:44:54-07:00 New Revision: 5a00383d7f192a2951e3add4d8ab1f918e7d58f8 URL: https://github.com/llvm/llvm-project/commit/5a00383d7f192a2951e3add4d8ab1f918e7d58f8 DIFF: https://github.com/llvm/llvm-project/commit/5a00383d7f192a2951e3add4d8ab1f918e7d58f8.diff LOG: Revert "Revert "[MemProf] Reduce cloning overhead by sharing nodes when possi…" This reverts commit 11aa31f595325d6b2dede3364e4b86d78fffe635. Added: Modified: llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp Removed: diff --git a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp index 66b68d5cd457fb..c9de9c964bba0a 100644 --- a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp +++ b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp @@ -242,9 +242,16 @@ class CallsiteContextGraph { // recursion. bool Recursive = false; -// The corresponding allocation or interior call. +// The corresponding allocation or interior call. This is the primary call +// for which we have created this node. CallInfo Call; +// List of other calls that can be treated the same as the primary call +// through cloning. I.e. located in the same function and have the same +// (possibly pruned) stack ids. They will be updated the same way as the +// primary call when assigning to function clones. +std::vector MatchingCalls; + // For alloc nodes this is a unique id assigned when constructed, and for // callsite stack nodes it is the original stack id when the node is // constructed from the memprof MIB metadata on the alloc nodes. Note that @@ -457,6 +464,9 @@ class CallsiteContextGraph { /// iteration. MapVector> FuncToCallsWithMetadata; + /// Records the function each call is located in. + DenseMap CallToFunc; + /// Map from callsite node to the enclosing caller function. std::map NodeToCallingFunc; @@ -474,7 +484,8 @@ class CallsiteContextGraph { /// StackIdToMatchingCalls map. 
void assignStackNodesPostOrder( ContextNode *Node, DenseSet &Visited, - DenseMap> &StackIdToMatchingCalls); + DenseMap> &StackIdToMatchingCalls, + DenseMap &CallToMatchingCall); /// Duplicates the given set of context ids, updating the provided /// map from each original id with the newly generated context ids, @@ -1230,10 +1241,11 @@ static void checkNode(const ContextNode *Node, template void CallsiteContextGraph:: -assignStackNodesPostOrder(ContextNode *Node, - DenseSet &Visited, - DenseMap> - &StackIdToMatchingCalls) { +assignStackNodesPostOrder( +ContextNode *Node, DenseSet &Visited, +DenseMap> +&StackIdToMatchingCalls, +DenseMap &CallToMatchingCall) { auto Inserted = Visited.insert(Node); if (!Inserted.second) return; @@ -1246,7 +1258,8 @@ void CallsiteContextGraph:: // Skip any that have been removed during the recursion. if (!Edge) continue; -assignStackNodesPostOrder(Edge->Caller, Visited, StackIdToMatchingCalls); +assignStackNodesPostOrder(Edge->Caller, Visited, StackIdToMatchingCalls, + CallToMatchingCall); } // If this node's stack id is in the map, update the graph to contain new @@ -1289,8 +1302,19 @@ void CallsiteContextGraph:: auto &[Call, Ids, Func, SavedContextIds] = Calls[I]; // Skip any for which we didn't assign any ids, these don't get a node in // the graph. -if (SavedContextIds.empty()) +if (SavedContextIds.empty()) { + // If this call has a matching call (located in the same function and + // having the same stack ids), simply add it to the context node created + // for its matching call earlier. These can be treated the same through + // cloning and get updated at the same time. 
+ if (!CallToMatchingCall.contains(Call)) +continue; + auto MatchingCall = CallToMatchingCall[Call]; + assert(NonAllocationCallToContextNodeMap.contains(MatchingCall)); + NonAllocationCallToContextNodeMap[MatchingCall]->MatchingCalls.push_back( + Call); continue; +} assert(LastId == Ids.back()); @@ -1422,6 +1446,10 @@ void CallsiteContextGraph::updateStackNodes() { // there is more than one call with the same stack ids. Their (possibly newly // duplicated) context ids are saved in the StackIdToMatchingCalls map. DenseMap> OldToNewContextIds; + // Save a map from each call to any that are found to match it. I.e. located + // in the same function and have the same (possibly pruned) stack ids. We use + // this to avoid creating extra graph nodes as they can be treated the same. + DenseMap CallToMatchingCall; for (auto &It : StackIdTo
[llvm-branch-commits] [clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (#102226) (PR #106439)
https://github.com/haoNoQ milestoned https://github.com/llvm/llvm-project/pull/106439 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (#102226) (PR #106439)
https://github.com/haoNoQ created https://github.com/llvm/llvm-project/pull/106439 This reverts commit 90ccf2187332ff900d46a58a27cb0353577d37cb. Cherry picked from commit 030ee841a9c9fbbd6e7c001e751737381da01f7b. Conflicts: clang/test/Driver/linker-wrapper-passes.c >From 5e343fa7c1bef713f367afafbfe25e114c8f86d5 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Tue, 6 Aug 2024 21:33:25 -0500 Subject: [PATCH] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (#102226) This reverts commit 90ccf2187332ff900d46a58a27cb0353577d37cb. Fixes: https://github.com/llvm/llvm-project/issues/100212 (cherry picked from commit 030ee841a9c9fbbd6e7c001e751737381da01f7b) Conflicts: clang/test/Driver/linker-wrapper-passes.c --- clang/test/Driver/linker-wrapper-passes.c | 71 --- clang/test/lit.cfg.py | 12 clang/test/lit.site.cfg.py.in | 4 -- 3 files changed, 87 deletions(-) delete mode 100644 clang/test/Driver/linker-wrapper-passes.c diff --git a/clang/test/Driver/linker-wrapper-passes.c b/clang/test/Driver/linker-wrapper-passes.c deleted file mode 100644 index b257c942afa075..00 --- a/clang/test/Driver/linker-wrapper-passes.c +++ /dev/null @@ -1,71 +0,0 @@ -// Check various clang-linker-wrapper pass options after -offload-opt. - -// REQUIRES: llvm-plugins, llvm-examples -// REQUIRES: x86-registered-target -// REQUIRES: amdgpu-registered-target -// Setup. 
-// RUN: mkdir -p %t -// RUN: %clang -cc1 -emit-llvm-bc -o %t/host-x86_64-unknown-linux-gnu.bc \ -// RUN: -disable-O0-optnone -triple=x86_64-unknown-linux-gnu %s -// RUN: %clang -cc1 -emit-llvm-bc -o %t/openmp-amdgcn-amd-amdhsa.bc \ -// RUN: -disable-O0-optnone -triple=amdgcn-amd-amdhsa %s -// RUN: opt %t/openmp-amdgcn-amd-amdhsa.bc -o %t/openmp-amdgcn-amd-amdhsa.bc \ -// RUN: -passes=forceattrs -force-remove-attribute=f:noinline -// RUN: clang-offload-packager -o %t/openmp-x86_64-unknown-linux-gnu.out \ -// RUN: --image=file=%t/openmp-amdgcn-amd-amdhsa.bc,triple=amdgcn-amd-amdhsa -// RUN: %clang -cc1 -S -o %t/host-x86_64-unknown-linux-gnu.s \ -// RUN: -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \ -// RUN: -fembed-offload-object=%t/openmp-x86_64-unknown-linux-gnu.out \ -// RUN: %t/host-x86_64-unknown-linux-gnu.bc -// RUN: %clang -cc1as -o %t/host-x86_64-unknown-linux-gnu.o \ -// RUN: -triple x86_64-unknown-linux-gnu -filetype obj -target-cpu x86-64 \ -// RUN: %t/host-x86_64-unknown-linux-gnu.s - -// Check plugin, -passes, and no remarks. -// RUN: clang-linker-wrapper -o a.out --embed-bitcode \ -// RUN: --linker-path=/usr/bin/true %t/host-x86_64-unknown-linux-gnu.o \ -// RUN: %offload-opt-loadbye --offload-opt=-wave-goodbye \ -// RUN: --offload-opt=-passes="function(goodbye),module(inline)" 2>&1 | \ -// RUN: FileCheck -match-full-lines -check-prefixes=OUT %s - -// Check plugin, -p, and remarks. 
-// RUN: clang-linker-wrapper -o a.out --embed-bitcode \ -// RUN: --linker-path=/usr/bin/true %t/host-x86_64-unknown-linux-gnu.o \ -// RUN: %offload-opt-loadbye --offload-opt=-wave-goodbye \ -// RUN: --offload-opt=-p="function(goodbye),module(inline)" \ -// RUN: --offload-opt=-pass-remarks=inline \ -// RUN: --offload-opt=-pass-remarks-output=%t/remarks.yml \ -// RUN: --offload-opt=-pass-remarks-filter=inline \ -// RUN: --offload-opt=-pass-remarks-format=yaml 2>&1 | \ -// RUN: FileCheck -match-full-lines -check-prefixes=OUT,REM %s -// RUN: FileCheck -input-file=%t/remarks.yml -match-full-lines \ -// RUN: -check-prefixes=YML %s - -// Check handling of bad plugin. -// RUN: not clang-linker-wrapper \ -// RUN: --offload-opt=-load-pass-plugin=%t/nonexistent.so 2>&1 | \ -// RUN: FileCheck -match-full-lines -check-prefixes=BAD-PLUGIN %s - -// OUT-NOT: {{.}} -// OUT: Bye: f -// OUT-NEXT: Bye: test -// REM-NEXT: remark: {{.*}} 'f' inlined into 'test' {{.*}} -// OUT-NOT: {{.}} - -// YML-NOT: {{.}} -// YML: --- !Passed -// YML-NEXT: Pass: inline -// YML-NEXT: Name: Inlined -// YML-NEXT: Function: test -// YML-NEXT: Args: -// YML: - Callee: f -// YML: - Caller: test -// YML: ... -// YML-NOT: {{.}} - -// BAD-PLUGIN-NOT: {{.}} -// BAD-PLUGIN: {{.*}}Could not load library {{.*}}nonexistent.so{{.*}} -// BAD-PLUGIN-NOT: {{.}} - -void f() {} -void test() { f(); } diff --git a/clang/test/lit.cfg.py b/clang/test/lit.cfg.py index 2bd7501136a10e..92a3361ce672e2 100644 --- a/clang/test/lit.cfg.py +++ b/clang/test/lit.cfg.py @@ -110,15 +110,6 @@ if config.clang_examples: config.available_features.add("examples") -if config.llvm_examples: -config.available_features.add("llvm-examples") - -if config.llvm_linked_bye_extension: -config.substitutions.append(("%offload-opt-loadbye", "")) -else: -loadbye = f"-load-pass-plugin={config.llvm_shlib_dir}/Bye{config.llvm_shlib_ext}" -config.substitutions.append(("%offload-opt-loadbye", f"--offload-opt={loadby
[llvm-branch-commits] [clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (#102226) (PR #106439)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/106439
[llvm-branch-commits] [clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (#102226) (PR #106439)
llvmbot wrote: @llvm/pr-subscribers-clang-driver Author: Artem Dergachev (haoNoQ) Changes This reverts commit 90ccf2187332ff900d46a58a27cb0353577d37cb. Cherry picked from commit 030ee841a9c9fbbd6e7c001e751737381da01f7b. Conflicts: clang/test/Driver/linker-wrapper-passes.c --- Full diff: https://github.com/llvm/llvm-project/pull/106439.diff 3 Files Affected: - (removed) clang/test/Driver/linker-wrapper-passes.c (-71) - (modified) clang/test/lit.cfg.py (-12) - (modified) clang/test/lit.site.cfg.py.in (-4) ``diff diff --git a/clang/test/Driver/linker-wrapper-passes.c b/clang/test/Driver/linker-wrapper-passes.c deleted file mode 100644 index b257c942afa075..00 --- a/clang/test/Driver/linker-wrapper-passes.c +++ /dev/null @@ -1,71 +0,0 @@ -// Check various clang-linker-wrapper pass options after -offload-opt. - -// REQUIRES: llvm-plugins, llvm-examples -// REQUIRES: x86-registered-target -// REQUIRES: amdgpu-registered-target -// Setup. -// RUN: mkdir -p %t -// RUN: %clang -cc1 -emit-llvm-bc -o %t/host-x86_64-unknown-linux-gnu.bc \ -// RUN: -disable-O0-optnone -triple=x86_64-unknown-linux-gnu %s -// RUN: %clang -cc1 -emit-llvm-bc -o %t/openmp-amdgcn-amd-amdhsa.bc \ -// RUN: -disable-O0-optnone -triple=amdgcn-amd-amdhsa %s -// RUN: opt %t/openmp-amdgcn-amd-amdhsa.bc -o %t/openmp-amdgcn-amd-amdhsa.bc \ -// RUN: -passes=forceattrs -force-remove-attribute=f:noinline -// RUN: clang-offload-packager -o %t/openmp-x86_64-unknown-linux-gnu.out \ -// RUN: --image=file=%t/openmp-amdgcn-amd-amdhsa.bc,triple=amdgcn-amd-amdhsa -// RUN: %clang -cc1 -S -o %t/host-x86_64-unknown-linux-gnu.s \ -// RUN: -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \ -// RUN: -fembed-offload-object=%t/openmp-x86_64-unknown-linux-gnu.out \ -// RUN: %t/host-x86_64-unknown-linux-gnu.bc -// RUN: %clang -cc1as -o %t/host-x86_64-unknown-linux-gnu.o \ -// RUN: -triple x86_64-unknown-linux-gnu -filetype obj -target-cpu x86-64 \ -// RUN: %t/host-x86_64-unknown-linux-gnu.s - -// Check plugin, -passes, and no remarks. 
-// RUN: clang-linker-wrapper -o a.out --embed-bitcode \ -// RUN: --linker-path=/usr/bin/true %t/host-x86_64-unknown-linux-gnu.o \ -// RUN: %offload-opt-loadbye --offload-opt=-wave-goodbye \ -// RUN: --offload-opt=-passes="function(goodbye),module(inline)" 2>&1 | \ -// RUN: FileCheck -match-full-lines -check-prefixes=OUT %s - -// Check plugin, -p, and remarks. -// RUN: clang-linker-wrapper -o a.out --embed-bitcode \ -// RUN: --linker-path=/usr/bin/true %t/host-x86_64-unknown-linux-gnu.o \ -// RUN: %offload-opt-loadbye --offload-opt=-wave-goodbye \ -// RUN: --offload-opt=-p="function(goodbye),module(inline)" \ -// RUN: --offload-opt=-pass-remarks=inline \ -// RUN: --offload-opt=-pass-remarks-output=%t/remarks.yml \ -// RUN: --offload-opt=-pass-remarks-filter=inline \ -// RUN: --offload-opt=-pass-remarks-format=yaml 2>&1 | \ -// RUN: FileCheck -match-full-lines -check-prefixes=OUT,REM %s -// RUN: FileCheck -input-file=%t/remarks.yml -match-full-lines \ -// RUN: -check-prefixes=YML %s - -// Check handling of bad plugin. -// RUN: not clang-linker-wrapper \ -// RUN: --offload-opt=-load-pass-plugin=%t/nonexistent.so 2>&1 | \ -// RUN: FileCheck -match-full-lines -check-prefixes=BAD-PLUGIN %s - -// OUT-NOT: {{.}} -// OUT: Bye: f -// OUT-NEXT: Bye: test -// REM-NEXT: remark: {{.*}} 'f' inlined into 'test' {{.*}} -// OUT-NOT: {{.}} - -// YML-NOT: {{.}} -// YML: --- !Passed -// YML-NEXT: Pass: inline -// YML-NEXT: Name: Inlined -// YML-NEXT: Function: test -// YML-NEXT: Args: -// YML: - Callee: f -// YML: - Caller: test -// YML: ... 
-// YML-NOT: {{.}} - -// BAD-PLUGIN-NOT: {{.}} -// BAD-PLUGIN: {{.*}}Could not load library {{.*}}nonexistent.so{{.*}} -// BAD-PLUGIN-NOT: {{.}} - -void f() {} -void test() { f(); } diff --git a/clang/test/lit.cfg.py b/clang/test/lit.cfg.py index 2bd7501136a10e..92a3361ce672e2 100644 --- a/clang/test/lit.cfg.py +++ b/clang/test/lit.cfg.py @@ -110,15 +110,6 @@ if config.clang_examples: config.available_features.add("examples") -if config.llvm_examples: -config.available_features.add("llvm-examples") - -if config.llvm_linked_bye_extension: -config.substitutions.append(("%offload-opt-loadbye", "")) -else: -loadbye = f"-load-pass-plugin={config.llvm_shlib_dir}/Bye{config.llvm_shlib_ext}" -config.substitutions.append(("%offload-opt-loadbye", f"--offload-opt={loadbye}")) - def have_host_jit_feature_support(feature_name): clang_repl_exe = lit.util.which("clang-repl", config.clang_tools_dir) @@ -223,9 +214,6 @@ def have_host_clang_repl_cuda(): if config.has_plugins and config.llvm_plugin_ext: config.available_features.add("plugins") -if config.llvm_has_plugins and config.llvm_plugin_ext: -config.available_features.add("llvm-plugins") - if config.clang_default_pie_on_linux: config.available_features.add("default-p
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -2049,6 +2055,21 @@ the coroutine must reach the final suspend point when it get destroyed.
 This attribute only works for switched-resume coroutines now.
+coro_elide_safe
+---
+
+When a Call or Invoke instruction is marked with `coro_elide_safe`,
+CoroAnnotationElidePass performs heap elision when possible. Note that for
+recursive or mutually recursive functions this elision is usually not possible.
+
+coro_gen_noalloc_ramp
+-
+
+This attribute hints CoroSplitPass to generate a `f.noalloc` ramp function for

yuxuanchen1997 wrote:

This attribute was deleted while addressing your feedback in https://github.com/llvm/llvm-project/pull/99282#pullrequestreview-2265588601

I can add a clarification in the documentation for `coro_elide_safe`.

https://github.com/llvm/llvm-project/pull/99283
[llvm-branch-commits] [llvm] 1740035 - Revert "[CodeGen] Use MachineInstr::{all_uses, all_defs} (NFC) (#106404)"
Author: Vitaly Buka Date: 2024-08-28T13:35:28-07:00 New Revision: 1740035264c3326d7dabee0682dd3802bc4384d7 URL: https://github.com/llvm/llvm-project/commit/1740035264c3326d7dabee0682dd3802bc4384d7 DIFF: https://github.com/llvm/llvm-project/commit/1740035264c3326d7dabee0682dd3802bc4384d7.diff LOG: Revert "[CodeGen] Use MachineInstr::{all_uses,all_defs} (NFC) (#106404)" This reverts commit a4989cd603b8e8185e35e3c2b7b48b422d4898be. Added: Modified: llvm/lib/CodeGen/MachineConvergenceVerifier.cpp llvm/lib/CodeGen/MachineInstr.cpp llvm/lib/CodeGen/RegAllocFast.cpp Removed: diff --git a/llvm/lib/CodeGen/MachineConvergenceVerifier.cpp b/llvm/lib/CodeGen/MachineConvergenceVerifier.cpp index ac6b04a202c533..3d3c55faa82465 100644 --- a/llvm/lib/CodeGen/MachineConvergenceVerifier.cpp +++ b/llvm/lib/CodeGen/MachineConvergenceVerifier.cpp @@ -51,7 +51,9 @@ GenericConvergenceVerifier::findAndCheckConvergenceTokenUsed( const MachineRegisterInfo &MRI = Context.getFunction()->getRegInfo(); const MachineInstr *TokenDef = nullptr; - for (const MachineOperand &MO : MI.all_uses()) { + for (const MachineOperand &MO : MI.operands()) { +if (!MO.isReg() || !MO.isUse()) + continue; Register OpReg = MO.getReg(); if (!OpReg.isVirtual()) continue; diff --git a/llvm/lib/CodeGen/MachineInstr.cpp b/llvm/lib/CodeGen/MachineInstr.cpp index 7f81aeb545d328..f21910ee3a444a 100644 --- a/llvm/lib/CodeGen/MachineInstr.cpp +++ b/llvm/lib/CodeGen/MachineInstr.cpp @@ -1041,9 +1041,10 @@ unsigned MachineInstr::getBundleSize() const { /// Returns true if the MachineInstr has an implicit-use operand of exactly /// the given register (not considering sub/super-registers). 
bool MachineInstr::hasRegisterImplicitUseOperand(Register Reg) const { - for (const MachineOperand &MO : all_uses()) -if (MO.isImplicit() && MO.getReg() == Reg) + for (const MachineOperand &MO : operands()) { +if (MO.isReg() && MO.isUse() && MO.isImplicit() && MO.getReg() == Reg) return true; + } return false; } @@ -1263,8 +1264,10 @@ unsigned MachineInstr::findTiedOperandIdx(unsigned OpIdx) const { /// clearKillInfo - Clears kill flags on all operands. /// void MachineInstr::clearKillInfo() { - for (MachineOperand &MO : all_uses()) -MO.setIsKill(false); + for (MachineOperand &MO : operands()) { +if (MO.isReg() && MO.isUse()) + MO.setIsKill(false); + } } void MachineInstr::substituteRegister(Register FromReg, Register ToReg, @@ -1546,9 +1549,12 @@ bool MachineInstr::isLoadFoldBarrier() const { /// allDefsAreDead - Return true if all the defs of this instruction are dead. /// bool MachineInstr::allDefsAreDead() const { - for (const MachineOperand &MO : all_defs()) + for (const MachineOperand &MO : operands()) { +if (!MO.isReg() || MO.isUse()) + continue; if (!MO.isDead()) return false; + } return true; } @@ -2057,8 +2063,8 @@ void MachineInstr::clearRegisterKills(Register Reg, const TargetRegisterInfo *RegInfo) { if (!Reg.isPhysical()) RegInfo = nullptr; - for (MachineOperand &MO : all_uses()) { -if (!MO.isKill()) + for (MachineOperand &MO : operands()) { +if (!MO.isReg() || !MO.isUse() || !MO.isKill()) continue; Register OpReg = MO.getReg(); if ((RegInfo && RegInfo->regsOverlap(Reg, OpReg)) || Reg == OpReg) diff --git a/llvm/lib/CodeGen/RegAllocFast.cpp b/llvm/lib/CodeGen/RegAllocFast.cpp index a0a8a8897af7f2..6babd5a3f1f96f 100644 --- a/llvm/lib/CodeGen/RegAllocFast.cpp +++ b/llvm/lib/CodeGen/RegAllocFast.cpp @@ -1563,7 +1563,9 @@ void RegAllocFastImpl::allocateInstruction(MachineInstr &MI) { bool ReArrangedImplicitMOs = true; while (ReArrangedImplicitMOs) { ReArrangedImplicitMOs = false; -for (MachineOperand &MO : MI.all_uses()) { +for (MachineOperand &MO : 
MI.operands()) { + if (!MO.isReg() || !MO.isUse()) +continue; Register Reg = MO.getReg(); if (!Reg.isVirtual() || !shouldAllocateRegister(Reg)) continue;
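The revert restores loops that walk every operand and filter by hand instead of using the `all_uses()`/`all_defs()` ranges. The following self-contained sketch shows the shape of that explicit filter for the `allDefsAreDead` case. The `Operand` struct is a hypothetical, heavily simplified stand-in for `MachineOperand`, not the real class.

```cpp
#include <vector>

// Hypothetical, minimal stand-in for MachineOperand.
struct Operand {
  bool IsReg;  // register operand (as opposed to an immediate, etc.)
  bool IsDef;  // meaningful only when IsReg is true
  bool IsDead; // meaningful only for defs
  bool isReg() const { return IsReg; }
  bool isUse() const { return IsReg && !IsDef; }
  bool isDead() const { return IsDead; }
};

// Explicit-filter form, shaped like the restored allDefsAreDead():
// skip non-register operands and uses, then require every def to be dead.
bool allDefsAreDead(const std::vector<Operand> &Operands) {
  for (const Operand &MO : Operands) {
    if (!MO.isReg() || MO.isUse())
      continue;
    if (!MO.isDead())
      return false;
  }
  return true;
}
```

The `isReg()` check comes first so that the use/def query is only evaluated for register operands, which is the same ordering the restored code uses.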
[llvm-branch-commits] [mlir] 77e8b2f - Revert "[mlir][spirv] Add an argmax integration test with `mlir-vulkan-runner…"
Author: Jakub Kuderski Date: 2024-08-28T17:25:55-04:00 New Revision: 77e8b2fe44d540e23f395789644ccc2d597a956a URL: https://github.com/llvm/llvm-project/commit/77e8b2fe44d540e23f395789644ccc2d597a956a DIFF: https://github.com/llvm/llvm-project/commit/77e8b2fe44d540e23f395789644ccc2d597a956a.diff LOG: Revert "[mlir][spirv] Add an argmax integration test with `mlir-vulkan-runner…" This reverts commit 17b7a9da46cef85b1a00b574c18c5f8cd5a761e1. Added: Modified: mlir/tools/mlir-vulkan-runner/CMakeLists.txt mlir/tools/mlir-vulkan-runner/mlir-vulkan-runner.cpp utils/bazel/llvm-project-overlay/mlir/BUILD.bazel Removed: mlir/test/mlir-vulkan-runner/argmax.mlir diff --git a/mlir/test/mlir-vulkan-runner/argmax.mlir b/mlir/test/mlir-vulkan-runner/argmax.mlir deleted file mode 100644 index d30c1cb5b58bdc..00 --- a/mlir/test/mlir-vulkan-runner/argmax.mlir +++ /dev/null @@ -1,109 +0,0 @@ -// RUN: mlir-vulkan-runner %s \ -// RUN: --shared-libs=%vulkan-runtime-wrappers,%mlir_runner_utils \ -// RUN: --entry-point-result=void | FileCheck %s - -// This kernel computes the argmax (index of the maximum element) from an array -// of integers. Each thread computes a lane maximum using a single `scf.for`. -// Then `gpu.subgroup_reduce` is used to find the maximum across the entire -// subgroup, which is then used by SPIR-V subgroup ops to compute the argmax -// of the entire input array. Note that this kernel only works if we have a -// single workgroup. 
- -// CHECK: [15] -module attributes { - gpu.container_module, - spirv.target_env = #spirv.target_env< -#spirv.vce, #spirv.resource_limits<>> -} { - gpu.module @kernels { -gpu.func @kernel_argmax(%input : memref<128xi32>, %output : memref<1xi32>, %total_count_buf : memref<1xi32>) kernel - attributes {spirv.entry_point_abi = #spirv.entry_point_abi} { - %idx0 = arith.constant 0 : index - %idx1 = arith.constant 1 : index - - %total_count = memref.load %total_count_buf[%idx0] : memref<1xi32> - %lane_count_idx = gpu.subgroup_size : index - %lane_count_i32 = index.castu %lane_count_idx : index to i32 - %lane_id_idx = gpu.thread_id x - %lane_id_i32 = index.castu %lane_id_idx : index to i32 - %lane_res_init = arith.constant 0 : i32 - %lane_max_init = memref.load %input[%lane_id_idx] : memref<128xi32> - %num_batches_i32 = arith.divui %total_count, %lane_count_i32 : i32 - %num_batches_idx = index.castu %num_batches_i32 : i32 to index - - %lane_res, %lane_max = scf.for %iter = %idx1 to %num_batches_idx step %idx1 - iter_args(%lane_res_iter = %lane_res_init, %lane_max_iter = %lane_max_init) -> (i32, i32) { -%iter_i32 = index.castu %iter : index to i32 -%mul = arith.muli %lane_count_i32, %iter_i32 : i32 -%idx_i32 = arith.addi %mul, %lane_id_i32 : i32 -%idx = index.castu %idx_i32 : i32 to index -%elem = memref.load %input[%idx] : memref<128xi32> -%gt = arith.cmpi sgt, %elem, %lane_max_iter : i32 -%lane_res_next = arith.select %gt, %idx_i32, %lane_res_iter : i32 -%lane_max_next = arith.select %gt, %elem, %lane_max_iter : i32 -scf.yield %lane_res_next, %lane_max_next : i32, i32 - } - - %subgroup_max = gpu.subgroup_reduce maxsi %lane_max : (i32) -> (i32) - %eq = arith.cmpi eq, %lane_max, %subgroup_max : i32 - %ballot = spirv.GroupNonUniformBallot %eq : vector<4xi32> - %lsb = spirv.GroupNonUniformBallotFindLSB %ballot : vector<4xi32>, i32 - %cond = arith.cmpi eq, %lsb, %lane_id_i32 : i32 - - scf.if %cond { -memref.store %lane_res, %output[%idx0] : memref<1xi32> - } - - gpu.return -} 
- } - - func.func @main() { -// Allocate 3 buffers. -%in_buf = memref.alloc() : memref<128xi32> -%out_buf = memref.alloc() : memref<1xi32> -%total_count_buf = memref.alloc() : memref<1xi32> - -// Constants. -%cst0 = arith.constant 0 : i32 -%idx0 = arith.constant 0 : index -%idx1 = arith.constant 1 : index -%idx16 = arith.constant 16 : index -%idx32 = arith.constant 32 : index -%idx48 = arith.constant 48 : index -%idx64 = arith.constant 64 : index -%idx80 = arith.constant 80 : index -%idx96 = arith.constant 96 : index -%idx112 = arith.constant 112 : index - -// Initialize input buffer. -%in_vec = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]> : vector<16xi32> -vector.store %in_vec, %in_buf[%idx0] : memref<128xi32>, vector<16xi32> -vector.store %in_vec, %in_buf[%idx16] : memref<128xi32>, vector<16xi32> -vector.store %in_vec, %in_buf[%idx32] : memref<128xi32>, vector<16xi32> -vector.store %in_vec, %in_buf[%idx48] : memref<128xi32>, vector<16xi32> -vector.store %in_vec, %in_buf[%idx64] : memref<128xi32>, vector<16xi32> -vector.store %in_vec, %in_buf[%idx80] :
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/yuxuanchen1997 updated https://github.com/llvm/llvm-project/pull/99285 >From d6f2e78230c0907db95568e5b920d574ce6b4758 Mon Sep 17 00:00:00 2001 From: Yuxuan Chen Date: Mon, 15 Jul 2024 15:01:39 -0700 Subject: [PATCH] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant --- .../Coroutines/CoroAnnotationElide.h | 36 + llvm/lib/Passes/PassBuilder.cpp | 1 + llvm/lib/Passes/PassBuilderPipelines.cpp | 10 +- llvm/lib/Passes/PassRegistry.def | 1 + llvm/lib/Transforms/Coroutines/CMakeLists.txt | 1 + .../Coroutines/CoroAnnotationElide.cpp| 152 ++ llvm/test/Other/new-pm-defaults.ll| 1 + .../Other/new-pm-thinlto-postlink-defaults.ll | 1 + .../new-pm-thinlto-postlink-pgo-defaults.ll | 1 + ...-pm-thinlto-postlink-samplepgo-defaults.ll | 1 + .../Coroutines/coro-transform-must-elide.ll | 76 + 11 files changed, 279 insertions(+), 2 deletions(-) create mode 100644 llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h create mode 100644 llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp create mode 100644 llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll diff --git a/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h new file mode 100644 index 00..352c9e14526697 --- /dev/null +++ b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h @@ -0,0 +1,36 @@ +//===- CoroAnnotationElide.h - Elide attributed safe coroutine calls --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// \file +// This pass transforms all Call or Invoke instructions that are annotated +// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead. +// The frame of the callee coroutine is allocated inside the caller. 
A pointer +// to the allocated frame will be passed into the `.noalloc` ramp function. +// +//===--===// + +#ifndef LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H +#define LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H + +#include "llvm/Analysis/CGSCCPassManager.h" +#include "llvm/Analysis/LazyCallGraph.h" +#include "llvm/IR/PassManager.h" + +namespace llvm { + +struct CoroAnnotationElidePass : PassInfoMixin { + CoroAnnotationElidePass() {} + + PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, +LazyCallGraph &CG, CGSCCUpdateResult &UR); + + static bool isRequired() { return false; } +}; +} // end namespace llvm + +#endif // LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index 17eed97fd950c9..c2b99a0d1f8cea 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -138,6 +138,7 @@ #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" #include "llvm/Transforms/CFGuard.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 1184123c7710f0..992b4fca8a6919 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -33,6 +33,7 @@ #include "llvm/Support/VirtualFileSystem.h" #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" @@ -984,8 +985,10 @@ 
PassBuilder::buildInlinerPipeline(OptimizationLevel Level, MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) + if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) { MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); +MainCGPipeline.addPass(CoroAnnotationElidePass()); + } // Make sure we don't affect potential future NoRerun CGSCC adaptors. MIWP.addLateModulePass(createModuleToFunctionPassAdaptor( @@ -1027,9 +1030,12 @@ PassBuilder::buildModuleInlinerPipeline(OptimizationLevel Level, buildFunctionSimplificationPipeline(Level, Phase), PTO.EagerlyInvalidateAnalyses)); - if (Phase != ThinOrFullLT
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/yuxuanchen1997 updated https://github.com/llvm/llvm-project/pull/99283 >From e2a6027dd2af62f4fbfa92795873f0489fd35cfd Mon Sep 17 00:00:00 2001 From: Yuxuan Chen Date: Tue, 4 Jun 2024 23:22:00 -0700 Subject: [PATCH] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit --- llvm/docs/Coroutines.rst | 18 +++ llvm/lib/Transforms/Coroutines/CoroInternal.h | 7 + llvm/lib/Transforms/Coroutines/CoroSplit.cpp | 150 +++--- llvm/lib/Transforms/Coroutines/Coroutines.cpp | 27 .../Transforms/Coroutines/coro-split-00.ll| 15 ++ 5 files changed, 191 insertions(+), 26 deletions(-) diff --git a/llvm/docs/Coroutines.rst b/llvm/docs/Coroutines.rst index 36092325e536fb..5679aefcb421d8 100644 --- a/llvm/docs/Coroutines.rst +++ b/llvm/docs/Coroutines.rst @@ -2022,6 +2022,12 @@ The pass CoroSplit builds coroutine frame and outlines resume and destroy parts into separate functions. This pass also lowers `coro.await.suspend.void`_, `coro.await.suspend.bool`_ and `coro.await.suspend.handle`_ intrinsics. +CoroAnnotationElide +--- +This pass finds all usages of coroutines that are "must elide" and replaces +`coro.begin` intrinsic with an address of a coroutine frame placed on its caller +and replaces `coro.alloc` and `coro.free` intrinsics with `false` and `null` +respectively to remove the deallocation code. CoroElide - @@ -2049,6 +2055,18 @@ the coroutine must reach the final suspend point when it get destroyed. This attribute only works for switched-resume coroutines now. +coro_elide_safe +--- + +When a Call or Invoke instruction to switch ABI coroutine `f` is marked with +`coro_elide_safe`, CoroSplitPass generates a `f.noalloc` ramp function. +`f.noalloc` has one more argument than its original ramp function `f`, which is +the pointer to the allocated frame. `f.noalloc` also suppressed any allocations +or deallocations that may be guarded by `@llvm.coro.alloc` and `@llvm.coro.free`. 
+ +CoroAnnotationElidePass performs the heap elision when possible. Note that for +recursive or mutually recursive functions this elision is usually not possible. + Metadata diff --git a/llvm/lib/Transforms/Coroutines/CoroInternal.h b/llvm/lib/Transforms/Coroutines/CoroInternal.h index d535ad7f85d74a..be86f96525b677 100644 --- a/llvm/lib/Transforms/Coroutines/CoroInternal.h +++ b/llvm/lib/Transforms/Coroutines/CoroInternal.h @@ -26,6 +26,13 @@ bool declaresIntrinsics(const Module &M, const std::initializer_list); void replaceCoroFree(CoroIdInst *CoroId, bool Elide); +/// Replaces all @llvm.coro.alloc intrinsics calls associated with a given +/// call @llvm.coro.id instruction with boolean value false. +void suppressCoroAllocs(CoroIdInst *CoroId); +/// Replaces CoroAllocs with boolean value false. +void suppressCoroAllocs(LLVMContext &Context, +ArrayRef CoroAllocs); + /// Attempts to rewrite the location operand of debug intrinsics in terms of /// the coroutine frame pointer, folding pointer offsets into the DIExpression /// of the intrinsic. diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp index 6bf3c75b95113e..494c4d632de95f 100644 --- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp +++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp @@ -25,6 +25,7 @@ #include "llvm/ADT/PriorityWorklist.h" #include "llvm/ADT/SmallPtrSet.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringExtras.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/Twine.h" #include "llvm/Analysis/CFG.h" @@ -1177,6 +1178,14 @@ static void updateAsyncFuncPointerContextSize(coro::Shape &Shape) { Shape.AsyncLowering.AsyncFuncPointer->setInitializer(NewFuncPtrStruct); } +static TypeSize getFrameSizeForShape(coro::Shape &Shape) { + // In the same function all coro.sizes should have the same result type. 
+ auto *SizeIntrin = Shape.CoroSizes.back(); + Module *M = SizeIntrin->getModule(); + const DataLayout &DL = M->getDataLayout(); + return DL.getTypeAllocSize(Shape.FrameTy); +} + static void replaceFrameSizeAndAlignment(coro::Shape &Shape) { if (Shape.ABI == coro::ABI::Async) updateAsyncFuncPointerContextSize(Shape); @@ -1192,10 +1201,8 @@ static void replaceFrameSizeAndAlignment(coro::Shape &Shape) { // In the same function all coro.sizes should have the same result type. auto *SizeIntrin = Shape.CoroSizes.back(); - Module *M = SizeIntrin->getModule(); - const DataLayout &DL = M->getDataLayout(); - auto Size = DL.getTypeAllocSize(Shape.FrameTy); - auto *SizeConstant = ConstantInt::get(SizeIntrin->getType(), Size); + auto *SizeConstant = + ConstantInt::get(SizeIntrin->getType(), getFrameSizeForShape(Shape)); for (CoroSizeInst *CS : Shape.CoroSizes) { CS->replaceAllUsesWith(SizeConstant); @@ -1452,6 +1459,75 @@ struct SwitchCorou
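The documentation text added by this patch says that `f.noalloc` takes one extra argument, a pointer to an already allocated frame, and suppresses the allocation and deallocation paths. As a rough analogy only, the contract can be sketched in plain C++. The `Frame`, `ramp`, `rampNoAlloc`, and `destroy` names are hypothetical, and placing the frame pointer last is an assumption made for this sketch rather than something the patch text states.

```cpp
#include <new>

// Hypothetical coroutine frame; the real layout is built by CoroSplit.
struct Frame {
  int Arg;
  bool HeapAllocated;
};

// Analogue of the ordinary ramp `f`: allocates its own frame on the heap.
Frame *ramp(int Arg) {
  return new Frame{Arg, /*HeapAllocated=*/true};
}

// Analogue of `f.noalloc`: same arguments plus a pointer to storage the
// caller already reserved, so no allocation happens here.
Frame *rampNoAlloc(int Arg, void *FrameStorage) {
  return ::new (FrameStorage) Frame{Arg, /*HeapAllocated=*/false};
}

// Counterpart of the guarded deallocation: only heap frames are freed.
void destroy(Frame *F) {
  if (F->HeapAllocated)
    delete F;
  else
    F->~Frame();
}
```

A caller that elides the allocation would reserve suitably aligned storage of at least `sizeof(Frame)` bytes in its own frame and pass it to `rampNoAlloc`, which loosely corresponds to why the patch factors the frame-size computation into `getFrameSizeForShape`.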
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/yuxuanchen1997 updated https://github.com/llvm/llvm-project/pull/99283 >From e2a6027dd2af62f4fbfa92795873f0489fd35cfd Mon Sep 17 00:00:00 2001 From: Yuxuan Chen Date: Tue, 4 Jun 2024 23:22:00 -0700 Subject: [PATCH] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit --- llvm/docs/Coroutines.rst | 18 +++ llvm/lib/Transforms/Coroutines/CoroInternal.h | 7 + llvm/lib/Transforms/Coroutines/CoroSplit.cpp | 150 +++--- llvm/lib/Transforms/Coroutines/Coroutines.cpp | 27 .../Transforms/Coroutines/coro-split-00.ll| 15 ++ 5 files changed, 191 insertions(+), 26 deletions(-) diff --git a/llvm/docs/Coroutines.rst b/llvm/docs/Coroutines.rst index 36092325e536fb..5679aefcb421d8 100644 --- a/llvm/docs/Coroutines.rst +++ b/llvm/docs/Coroutines.rst @@ -2022,6 +2022,12 @@ The pass CoroSplit builds coroutine frame and outlines resume and destroy parts into separate functions. This pass also lowers `coro.await.suspend.void`_, `coro.await.suspend.bool`_ and `coro.await.suspend.handle`_ intrinsics. +CoroAnnotationElide +--- +This pass finds all usages of coroutines that are "must elide" and replaces +`coro.begin` intrinsic with an address of a coroutine frame placed on its caller +and replaces `coro.alloc` and `coro.free` intrinsics with `false` and `null` +respectively to remove the deallocation code. CoroElide - @@ -2049,6 +2055,18 @@ the coroutine must reach the final suspend point when it get destroyed. This attribute only works for switched-resume coroutines now. +coro_elide_safe +--- + +When a Call or Invoke instruction to switch ABI coroutine `f` is marked with +`coro_elide_safe`, CoroSplitPass generates a `f.noalloc` ramp function. +`f.noalloc` has one more argument than its original ramp function `f`, which is +the pointer to the allocated frame. `f.noalloc` also suppressed any allocations +or deallocations that may be guarded by `@llvm.coro.alloc` and `@llvm.coro.free`. 
+ +CoroAnnotationElidePass performs the heap elision when possible. Note that for +recursive or mutually recursive functions this elision is usually not possible. + Metadata diff --git a/llvm/lib/Transforms/Coroutines/CoroInternal.h b/llvm/lib/Transforms/Coroutines/CoroInternal.h index d535ad7f85d74a..be86f96525b677 100644 --- a/llvm/lib/Transforms/Coroutines/CoroInternal.h +++ b/llvm/lib/Transforms/Coroutines/CoroInternal.h @@ -26,6 +26,13 @@ bool declaresIntrinsics(const Module &M, const std::initializer_list); void replaceCoroFree(CoroIdInst *CoroId, bool Elide); +/// Replaces all @llvm.coro.alloc intrinsics calls associated with a given +/// call @llvm.coro.id instruction with boolean value false. +void suppressCoroAllocs(CoroIdInst *CoroId); +/// Replaces CoroAllocs with boolean value false. +void suppressCoroAllocs(LLVMContext &Context, +ArrayRef CoroAllocs); + /// Attempts to rewrite the location operand of debug intrinsics in terms of /// the coroutine frame pointer, folding pointer offsets into the DIExpression /// of the intrinsic. diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp index 6bf3c75b95113e..494c4d632de95f 100644 --- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp +++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp @@ -25,6 +25,7 @@ #include "llvm/ADT/PriorityWorklist.h" #include "llvm/ADT/SmallPtrSet.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringExtras.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/Twine.h" #include "llvm/Analysis/CFG.h" @@ -1177,6 +1178,14 @@ static void updateAsyncFuncPointerContextSize(coro::Shape &Shape) { Shape.AsyncLowering.AsyncFuncPointer->setInitializer(NewFuncPtrStruct); } +static TypeSize getFrameSizeForShape(coro::Shape &Shape) { + // In the same function all coro.sizes should have the same result type. 
+ auto *SizeIntrin = Shape.CoroSizes.back(); + Module *M = SizeIntrin->getModule(); + const DataLayout &DL = M->getDataLayout(); + return DL.getTypeAllocSize(Shape.FrameTy); +} + static void replaceFrameSizeAndAlignment(coro::Shape &Shape) { if (Shape.ABI == coro::ABI::Async) updateAsyncFuncPointerContextSize(Shape); @@ -1192,10 +1201,8 @@ static void replaceFrameSizeAndAlignment(coro::Shape &Shape) { // In the same function all coro.sizes should have the same result type. auto *SizeIntrin = Shape.CoroSizes.back(); - Module *M = SizeIntrin->getModule(); - const DataLayout &DL = M->getDataLayout(); - auto Size = DL.getTypeAllocSize(Shape.FrameTy); - auto *SizeConstant = ConstantInt::get(SizeIntrin->getType(), Size); + auto *SizeConstant = + ConstantInt::get(SizeIntrin->getType(), getFrameSizeForShape(Shape)); for (CoroSizeInst *CS : Shape.CoroSizes) { CS->replaceAllUsesWith(SizeConstant); @@ -1452,6 +1459,75 @@ struct SwitchCorou
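The `coro_elide_safe` documentation in the patch above describes `f.noalloc` as a ramp function that takes one extra argument, a pointer to an already allocated frame, and that skips the allocation and deallocation paths. As a rough analogy only, that shape can be sketched in plain C++. Everything below (`ToyFrame`, `toyRamp`, `toyRampNoAlloc`, `toyDestroy`, `callerWithElision`) is a hypothetical stand-in and not LLVM API; real frames are laid out by CoroSplit.

```cpp
#include <cassert>

// Toy stand-in for a coroutine frame (hypothetical, not the CoroSplit layout).
struct ToyFrame {
  int Arg = 0;
  bool HeapAllocated = false;
};

// Analogue of the ordinary ramp function `f`: it allocates its own frame.
ToyFrame *toyRamp(int Arg) {
  ToyFrame *Frame = new ToyFrame();
  Frame->Arg = Arg;
  Frame->HeapAllocated = true;
  return Frame;
}

// Analogue of `f.noalloc`: same work, plus one extra parameter that points at
// storage provided by the caller. No allocation happens here.
ToyFrame *toyRampNoAlloc(int Arg, ToyFrame *Frame) {
  assert(Frame && "caller must provide frame storage");
  Frame->Arg = Arg;
  Frame->HeapAllocated = false;
  return Frame;
}

// Analogue of the destroy path: the free is skipped for caller-owned frames.
void toyDestroy(ToyFrame *Frame) {
  if (Frame->HeapAllocated)
    delete Frame;
}

// A caller whose call was elided: the callee frame lives in the caller.
int callerWithElision(int Arg) {
  ToyFrame Storage;
  ToyFrame *Frame = toyRampNoAlloc(Arg, &Storage);
  int Result = Frame->Arg * 2;
  toyDestroy(Frame); // no heap traffic on this path
  return Result;
}
```

In this sketch the recursion caveat from the documentation is visible too: a caller that is itself `toyRampNoAlloc`'s callee cannot embed its own frame inside itself, so a heap-allocating path is still needed.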
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/yuxuanchen1997 updated https://github.com/llvm/llvm-project/pull/99285 >From d6f2e78230c0907db95568e5b920d574ce6b4758 Mon Sep 17 00:00:00 2001 From: Yuxuan Chen Date: Mon, 15 Jul 2024 15:01:39 -0700 Subject: [PATCH] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant --- .../Coroutines/CoroAnnotationElide.h | 36 + llvm/lib/Passes/PassBuilder.cpp | 1 + llvm/lib/Passes/PassBuilderPipelines.cpp | 10 +- llvm/lib/Passes/PassRegistry.def | 1 + llvm/lib/Transforms/Coroutines/CMakeLists.txt | 1 + .../Coroutines/CoroAnnotationElide.cpp| 152 ++ llvm/test/Other/new-pm-defaults.ll| 1 + .../Other/new-pm-thinlto-postlink-defaults.ll | 1 + .../new-pm-thinlto-postlink-pgo-defaults.ll | 1 + ...-pm-thinlto-postlink-samplepgo-defaults.ll | 1 + .../Coroutines/coro-transform-must-elide.ll | 76 + 11 files changed, 279 insertions(+), 2 deletions(-) create mode 100644 llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h create mode 100644 llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp create mode 100644 llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll diff --git a/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h new file mode 100644 index 00..352c9e14526697 --- /dev/null +++ b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h @@ -0,0 +1,36 @@ +//===- CoroAnnotationElide.h - Elide attributed safe coroutine calls --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// \file +// This pass transforms all Call or Invoke instructions that are annotated +// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead. +// The frame of the callee coroutine is allocated inside the caller. 
A pointer +// to the allocated frame will be passed into the `.noalloc` ramp function. +// +//===--===// + +#ifndef LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H +#define LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H + +#include "llvm/Analysis/CGSCCPassManager.h" +#include "llvm/Analysis/LazyCallGraph.h" +#include "llvm/IR/PassManager.h" + +namespace llvm { + +struct CoroAnnotationElidePass : PassInfoMixin { + CoroAnnotationElidePass() {} + + PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, +LazyCallGraph &CG, CGSCCUpdateResult &UR); + + static bool isRequired() { return false; } +}; +} // end namespace llvm + +#endif // LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index 17eed97fd950c9..c2b99a0d1f8cea 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -138,6 +138,7 @@ #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" #include "llvm/Transforms/CFGuard.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 1184123c7710f0..992b4fca8a6919 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -33,6 +33,7 @@ #include "llvm/Support/VirtualFileSystem.h" #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" @@ -984,8 +985,10 @@ 
PassBuilder::buildInlinerPipeline(OptimizationLevel Level, MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) + if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) { MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); +MainCGPipeline.addPass(CoroAnnotationElidePass()); + } // Make sure we don't affect potential future NoRerun CGSCC adaptors. MIWP.addLateModulePass(createModuleToFunctionPassAdaptor( @@ -1027,9 +1030,12 @@ PassBuilder::buildModuleInlinerPipeline(OptimizationLevel Level, buildFunctionSimplificationPipeline(Level, Phase), PTO.EagerlyInvalidateAnalyses)); - if (Phase != ThinOrFullLT
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

```bash
git-clang-format --diff e2a6027dd2af62f4fbfa92795873f0489fd35cfd d6f2e78230c0907db95568e5b920d574ce6b4758 --extensions cpp,h -- llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp llvm/lib/Passes/PassBuilder.cpp llvm/lib/Passes/PassBuilderPipelines.cpp
```

View the diff from clang-format here.

```diff
diff --git a/llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp b/llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp
index e7c7e01f9c..28953f2137 100644
--- a/llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp
@@ -143,9 +143,9 @@ PreservedAnalyses CoroAnnotationElidePass::run(LazyCallGraph::SCC &C,
 << "' elided in '" << ore::NV("caller", Caller->getName());
 });
 Changed = true;
-updateCGAndAnalysisManagerForCGSCCPass(CG, *CallerC, *CallerN, AM, UR, FAM);
+updateCGAndAnalysisManagerForCGSCCPass(CG, *CallerC, *CallerN, AM, UR,
+ FAM);
 }
-
 }
 }
 return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
```

https://github.com/llvm/llvm-project/pull/99285

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/yuxuanchen1997 updated https://github.com/llvm/llvm-project/pull/99285 >From 68a410d159fdb96e7580a7f3fe035df00b893f3c Mon Sep 17 00:00:00 2001 From: Yuxuan Chen Date: Mon, 15 Jul 2024 15:01:39 -0700 Subject: [PATCH] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant --- .../Coroutines/CoroAnnotationElide.h | 36 + llvm/lib/Passes/PassBuilder.cpp | 1 + llvm/lib/Passes/PassBuilderPipelines.cpp | 10 +- llvm/lib/Passes/PassRegistry.def | 1 + llvm/lib/Transforms/Coroutines/CMakeLists.txt | 1 + .../Coroutines/CoroAnnotationElide.cpp| 152 ++ llvm/test/Other/new-pm-defaults.ll| 1 + .../Other/new-pm-thinlto-postlink-defaults.ll | 1 + .../new-pm-thinlto-postlink-pgo-defaults.ll | 1 + ...-pm-thinlto-postlink-samplepgo-defaults.ll | 1 + .../Coroutines/coro-transform-must-elide.ll | 76 + 11 files changed, 279 insertions(+), 2 deletions(-) create mode 100644 llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h create mode 100644 llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp create mode 100644 llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll diff --git a/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h new file mode 100644 index 00..352c9e14526697 --- /dev/null +++ b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h @@ -0,0 +1,36 @@ +//===- CoroAnnotationElide.h - Elide attributed safe coroutine calls --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// \file +// This pass transforms all Call or Invoke instructions that are annotated +// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead. +// The frame of the callee coroutine is allocated inside the caller. 
A pointer +// to the allocated frame will be passed into the `.noalloc` ramp function. +// +//===--===// + +#ifndef LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H +#define LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H + +#include "llvm/Analysis/CGSCCPassManager.h" +#include "llvm/Analysis/LazyCallGraph.h" +#include "llvm/IR/PassManager.h" + +namespace llvm { + +struct CoroAnnotationElidePass : PassInfoMixin { + CoroAnnotationElidePass() {} + + PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, +LazyCallGraph &CG, CGSCCUpdateResult &UR); + + static bool isRequired() { return false; } +}; +} // end namespace llvm + +#endif // LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index 17eed97fd950c9..c2b99a0d1f8cea 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -138,6 +138,7 @@ #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" #include "llvm/Transforms/CFGuard.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 1184123c7710f0..992b4fca8a6919 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -33,6 +33,7 @@ #include "llvm/Support/VirtualFileSystem.h" #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" @@ -984,8 +985,10 @@ 
PassBuilder::buildInlinerPipeline(OptimizationLevel Level, MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) + if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) { MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); +MainCGPipeline.addPass(CoroAnnotationElidePass()); + } // Make sure we don't affect potential future NoRerun CGSCC adaptors. MIWP.addLateModulePass(createModuleToFunctionPassAdaptor( @@ -1027,9 +1030,12 @@ PassBuilder::buildModuleInlinerPipeline(OptimizationLevel Level, buildFunctionSimplificationPipeline(Level, Phase), PTO.EagerlyInvalidateAnalyses)); - if (Phase != ThinOrFullLT
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/yuxuanchen1997 updated https://github.com/llvm/llvm-project/pull/99285 >From 5b18641d2b59adf11810f71fe5ab3204a94a7a56 Mon Sep 17 00:00:00 2001 From: Yuxuan Chen Date: Mon, 15 Jul 2024 15:01:39 -0700 Subject: [PATCH] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant --- .../Coroutines/CoroAnnotationElide.h | 36 llvm/lib/Passes/PassBuilder.cpp | 1 + llvm/lib/Passes/PassBuilderPipelines.cpp | 10 +- llvm/lib/Passes/PassRegistry.def | 1 + llvm/lib/Transforms/Coroutines/CMakeLists.txt | 1 + .../Coroutines/CoroAnnotationElide.cpp| 155 ++ llvm/test/Other/new-pm-defaults.ll| 1 + .../Other/new-pm-thinlto-postlink-defaults.ll | 1 + .../new-pm-thinlto-postlink-pgo-defaults.ll | 1 + ...-pm-thinlto-postlink-samplepgo-defaults.ll | 1 + .../Coroutines/coro-transform-must-elide.ll | 75 + 11 files changed, 281 insertions(+), 2 deletions(-) create mode 100644 llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h create mode 100644 llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp create mode 100644 llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll diff --git a/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h new file mode 100644 index 00..352c9e14526697 --- /dev/null +++ b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h @@ -0,0 +1,36 @@ +//===- CoroAnnotationElide.h - Elide attributed safe coroutine calls --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// \file +// This pass transforms all Call or Invoke instructions that are annotated +// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead. +// The frame of the callee coroutine is allocated inside the caller. 
A pointer +// to the allocated frame will be passed into the `.noalloc` ramp function. +// +//===--===// + +#ifndef LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H +#define LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H + +#include "llvm/Analysis/CGSCCPassManager.h" +#include "llvm/Analysis/LazyCallGraph.h" +#include "llvm/IR/PassManager.h" + +namespace llvm { + +struct CoroAnnotationElidePass : PassInfoMixin { + CoroAnnotationElidePass() {} + + PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, +LazyCallGraph &CG, CGSCCUpdateResult &UR); + + static bool isRequired() { return false; } +}; +} // end namespace llvm + +#endif // LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index 17eed97fd950c9..c2b99a0d1f8cea 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -138,6 +138,7 @@ #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" #include "llvm/Transforms/CFGuard.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 1184123c7710f0..992b4fca8a6919 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -33,6 +33,7 @@ #include "llvm/Support/VirtualFileSystem.h" #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" @@ -984,8 +985,10 @@ 
PassBuilder::buildInlinerPipeline(OptimizationLevel Level, MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) + if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) { MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); +MainCGPipeline.addPass(CoroAnnotationElidePass()); + } // Make sure we don't affect potential future NoRerun CGSCC adaptors. MIWP.addLateModulePass(createModuleToFunctionPassAdaptor( @@ -1027,9 +1030,12 @@ PassBuilder::buildModuleInlinerPipeline(OptimizationLevel Level, buildFunctionSimplificationPipeline(Level, Phase), PTO.EagerlyInvalidateAnalyses)); - if (Phase != ThinOrFullLTO
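The header comment in the patch above says the pass redirects call sites annotated "coro_elide_safe" to the `.noalloc` variant of the callee, and the clang-format excerpt shows it reporting `PreservedAnalyses::none()` only when something changed. A toy model of that control flow, written without any LLVM headers, might look like the following. `ToyCallSite`, `redirectElideSafeCalls` and the lookup of the variant by the name `<callee>.noalloc` are assumptions made for illustration; the real pass works on `CallBase` instructions and also has to allocate the frame in the caller and update the call graph.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy call site (hypothetical, not an llvm::CallBase).
struct ToyCallSite {
  std::string Callee;
  bool ElideSafe; // stands in for the "coro_elide_safe" annotation
};

// Redirects annotated calls to "<callee>.noalloc" when such a function is
// known. Returns true if any call site was rewritten, mirroring the
// `Changed` flag that decides which analyses are preserved.
bool redirectElideSafeCalls(std::vector<ToyCallSite> &Calls,
                            const std::set<std::string> &KnownFunctions) {
  bool Changed = false;
  for (ToyCallSite &CS : Calls) {
    if (!CS.ElideSafe)
      continue; // unannotated calls are left alone
    const std::string NoAlloc = CS.Callee + ".noalloc";
    if (KnownFunctions.count(NoAlloc) == 0)
      continue; // no variant was generated for this callee
    CS.Callee = NoAlloc;
    Changed = true;
  }
  return Changed;
}
```

Returning a plain `bool` keeps the sketch self-contained; in the pass itself that flag selects between `PreservedAnalyses::none()` and `PreservedAnalyses::all()`.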
[llvm-branch-commits] [clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (#102226) (PR #106439)
haoNoQ wrote: (According to the discussion in 102226, this patch was never supposed to be in the release branch.) https://github.com/llvm/llvm-project/pull/106439
[llvm-branch-commits] [clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (#102226) (PR #106439)
https://github.com/jhuber6 approved this pull request. Thanks https://github.com/llvm/llvm-project/pull/106439
[llvm-branch-commits] [clang] [llvm] release/19.x: workflows/release-binaries: Enable flang builds on Windows (#101344) (PR #106480)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/106480 Backport 8927576b8f6442bb6129bda597efee46176f8aec Requested by: @tstellar >From b3eb0c3dfe85b18ed4ef8e3f804970680c0e94ca Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Wed, 28 Aug 2024 18:22:57 -0700 Subject: [PATCH] workflows/release-binaries: Enable flang builds on Windows (#101344) Flang for Windows depends on compiler-rt, so we need to enable it for the stage1 builds. This also fixes failures building the flang tests on macOS. Fixes #100202. (cherry picked from commit 8927576b8f6442bb6129bda597efee46176f8aec) --- .github/workflows/release-binaries.yml | 8 clang/cmake/caches/Release.cmake | 7 +-- 2 files changed, 5 insertions(+), 10 deletions(-) diff --git a/.github/workflows/release-binaries.yml b/.github/workflows/release-binaries.yml index 509016e5b89c45..672dd7517d23ce 100644 --- a/.github/workflows/release-binaries.yml +++ b/.github/workflows/release-binaries.yml @@ -135,16 +135,8 @@ jobs: target_cmake_flags="$target_cmake_flags -DBOOTSTRAP_DARWIN_osx_ARCHS=$arches -DBOOTSTRAP_DARWIN_osx_BUILTIN_ARCHS=$arches" fi -# x86 macOS and x86 Windows have trouble building flang, so disable it. -# Windows: https://github.com/llvm/llvm-project/issues/100202 -# macOS: 'rebase opcodes terminated early at offset 1 of 80016' when building __fortran_builtins.mod build_flang="true" -if [ "$target" = "Windows-X64" ]; then - target_cmake_flags="$target_cmake_flags -DLLVM_RELEASE_ENABLE_PROJECTS=\"clang;lld;lldb;clang-tools-extra;bolt;polly;mlir\"" - build_flang="false" -fi - if [ "${{ runner.os }}" = "Windows" ]; then # The build times out on Windows, so we need to disable LTO. 
target_cmake_flags="$target_cmake_flags -DLLVM_RELEASE_ENABLE_LTO=OFF" diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake index e5161dd9a27b96..6d5f75ca0074ee 100644 --- a/clang/cmake/caches/Release.cmake +++ b/clang/cmake/caches/Release.cmake @@ -47,11 +47,14 @@ set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "") set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "") set(STAGE1_PROJECTS "clang") -set(STAGE1_RUNTIMES "") + +# Building Flang on Windows requires compiler-rt, so we need to build it in +# stage1. compiler-rt is also required for building the Flang tests on +# macOS. +set(STAGE1_RUNTIMES "compiler-rt") if (LLVM_RELEASE_ENABLE_PGO) list(APPEND STAGE1_PROJECTS "lld") - list(APPEND STAGE1_RUNTIMES "compiler-rt") set(CLANG_BOOTSTRAP_TARGETS generate-profdata stage2-package
[llvm-branch-commits] [clang] [llvm] release/19.x: workflows/release-binaries: Enable flang builds on Windows (#101344) (PR #106480)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/106480
[llvm-branch-commits] [clang] [llvm] release/19.x: workflows/release-binaries: Enable flang builds on Windows (#101344) (PR #106480)
llvmbot wrote: @tstellar What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/106480
[llvm-branch-commits] [clang] [llvm] release/19.x: workflows/release-binaries: Enable flang builds on Windows (#101344) (PR #106480)
llvmbot wrote: @llvm/pr-subscribers-clang Author: None (llvmbot) Changes Backport 8927576b8f6442bb6129bda597efee46176f8aec Requested by: @tstellar --- Full diff: https://github.com/llvm/llvm-project/pull/106480.diff 2 Files Affected: - (modified) .github/workflows/release-binaries.yml (-8) - (modified) clang/cmake/caches/Release.cmake (+5-2) ``diff diff --git a/.github/workflows/release-binaries.yml b/.github/workflows/release-binaries.yml index 509016e5b89c45..672dd7517d23ce 100644 --- a/.github/workflows/release-binaries.yml +++ b/.github/workflows/release-binaries.yml @@ -135,16 +135,8 @@ jobs: target_cmake_flags="$target_cmake_flags -DBOOTSTRAP_DARWIN_osx_ARCHS=$arches -DBOOTSTRAP_DARWIN_osx_BUILTIN_ARCHS=$arches" fi -# x86 macOS and x86 Windows have trouble building flang, so disable it. -# Windows: https://github.com/llvm/llvm-project/issues/100202 -# macOS: 'rebase opcodes terminated early at offset 1 of 80016' when building __fortran_builtins.mod build_flang="true" -if [ "$target" = "Windows-X64" ]; then - target_cmake_flags="$target_cmake_flags -DLLVM_RELEASE_ENABLE_PROJECTS=\"clang;lld;lldb;clang-tools-extra;bolt;polly;mlir\"" - build_flang="false" -fi - if [ "${{ runner.os }}" = "Windows" ]; then # The build times out on Windows, so we need to disable LTO. target_cmake_flags="$target_cmake_flags -DLLVM_RELEASE_ENABLE_LTO=OFF" diff --git a/clang/cmake/caches/Release.cmake b/clang/cmake/caches/Release.cmake index e5161dd9a27b96..6d5f75ca0074ee 100644 --- a/clang/cmake/caches/Release.cmake +++ b/clang/cmake/caches/Release.cmake @@ -47,11 +47,14 @@ set(LLVM_TARGETS_TO_BUILD Native CACHE STRING "") set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "") set(STAGE1_PROJECTS "clang") -set(STAGE1_RUNTIMES "") + +# Building Flang on Windows requires compiler-rt, so we need to build it in +# stage1. compiler-rt is also required for building the Flang tests on +# macOS. 
+set(STAGE1_RUNTIMES "compiler-rt") if (LLVM_RELEASE_ENABLE_PGO) list(APPEND STAGE1_PROJECTS "lld") - list(APPEND STAGE1_RUNTIMES "compiler-rt") set(CLANG_BOOTSTRAP_TARGETS generate-profdata stage2-package `` https://github.com/llvm/llvm-project/pull/106480
[llvm-branch-commits] [llvm] 92885bb - Revert "[llvm-profdata] Enabled functionality to write split-layout profile (…"
Author: William Junda Huang Date: 2024-08-28T21:33:24-04:00 New Revision: 92885bbeab632875929827a09841237cd59405fb URL: https://github.com/llvm/llvm-project/commit/92885bbeab632875929827a09841237cd59405fb DIFF: https://github.com/llvm/llvm-project/commit/92885bbeab632875929827a09841237cd59405fb.diff LOG: Revert "[llvm-profdata] Enabled functionality to write split-layout profile (…" This reverts commit 75e9d191f52b047ea839f75ab2a7a7d9f8c6becd. Added: Modified: llvm/docs/CommandGuide/llvm-profdata.rst llvm/include/llvm/ProfileData/SampleProfReader.h llvm/include/llvm/ProfileData/SampleProfWriter.h llvm/lib/ProfileData/SampleProfReader.cpp llvm/lib/ProfileData/SampleProfWriter.cpp llvm/tools/llvm-profdata/llvm-profdata.cpp Removed: llvm/test/tools/llvm-profdata/Inputs/split-layout.profdata llvm/test/tools/llvm-profdata/sample-split-layout.test diff --git a/llvm/docs/CommandGuide/llvm-profdata.rst b/llvm/docs/CommandGuide/llvm-profdata.rst index af840f3994b3d6..acf016a6dbcd70 100644 --- a/llvm/docs/CommandGuide/llvm-profdata.rst +++ b/llvm/docs/CommandGuide/llvm-profdata.rst @@ -162,12 +162,6 @@ OPTIONS coverage for the optimized target. This option can only be used with sample-based profile in extbinary format. -.. option:: --split-layout=[true|false] - - Split the profile data section to two with one containing sample profiles with - inlined functions and the other not. This option can only be used with - sample-based profile in extbinary format. - .. option:: --convert-sample-profile-layout=[nest|flat] Convert the merged profile into a profile with a new layout. Supported diff --git a/llvm/include/llvm/ProfileData/SampleProfReader.h b/llvm/include/llvm/ProfileData/SampleProfReader.h index 0fd86600de21f0..f053946a5db0a9 100644 --- a/llvm/include/llvm/ProfileData/SampleProfReader.h +++ b/llvm/include/llvm/ProfileData/SampleProfReader.h @@ -495,9 +495,9 @@ class SampleProfileReader { /// are present. 
virtual void setProfileUseMD5() { ProfileIsMD5 = true; } - /// Don't read profile without context if the flag is set. - void setSkipFlatProf(bool Skip) { SkipFlatProf = Skip; } - + /// Don't read profile without context if the flag is set. This is only meaningful + /// for ExtBinary format. + virtual void setSkipFlatProf(bool Skip) {} /// Return whether any name in the profile contains ".__uniq." suffix. virtual bool hasUniqSuffix() { return false; } @@ -581,10 +581,6 @@ class SampleProfileReader { /// Whether the profile uses MD5 for Sample Contexts and function names. This /// can be one-way overriden by the user to force use MD5. bool ProfileIsMD5 = false; - - /// If SkipFlatProf is true, skip functions marked with !Flat in text mode or - /// sections with SecFlagFlat flag in ExtBinary mode. - bool SkipFlatProf = false; }; class SampleProfileReaderText : public SampleProfileReader { @@ -793,6 +789,10 @@ class SampleProfileReaderExtBinaryBase : public SampleProfileReaderBinary { /// The set containing the functions to use when compiling a module. DenseSet FuncsToUse; + /// If SkipFlatProf is true, skip the sections with + /// SecFlagFlat flag. + bool SkipFlatProf = false; + public: SampleProfileReaderExtBinaryBase(std::unique_ptr B, LLVMContext &C, SampleProfileFormat Format) @@ -815,6 +815,8 @@ class SampleProfileReaderExtBinaryBase : public SampleProfileReaderBinary { return std::move(ProfSymList); }; + void setSkipFlatProf(bool Skip) override { SkipFlatProf = Skip; } + private: /// Read the profiles on-demand for the given functions. 
This is used after /// stale call graph matching finds new functions whose profiles aren't loaded diff --git a/llvm/include/llvm/ProfileData/SampleProfWriter.h b/llvm/include/llvm/ProfileData/SampleProfWriter.h index 4b659eaf950b3e..5398a44f13ba36 100644 --- a/llvm/include/llvm/ProfileData/SampleProfWriter.h +++ b/llvm/include/llvm/ProfileData/SampleProfWriter.h @@ -28,9 +28,9 @@ namespace sampleprof { enum SectionLayout { DefaultLayout, - // The layout splits profile with inlined functions from profile without - // inlined functions. When Thinlto is enabled, ThinLTO postlink phase only - // has to load profile with inlined functions and can skip the other part. + // The layout splits profile with context information from profile without + // context information. When Thinlto is enabled, ThinLTO postlink phase only + // has to load profile with context information and can skip the other part. CtxSplitLayout, NumOfLayout, }; @@ -128,7 +128,7 @@ class SampleProfileWriter { virtual void setToCompressAllSections() {} virtual void setUseMD5() {} virtual void setPartialProfile() {} - virtual void setUseCtxSplitLayout() {} + virtual void resetSecLayout(SectionLayout SL) {} protected: SamplePro
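One visible effect of the revert above is on `setSkipFlatProf`: the base reader goes back to a virtual no-op, and only the ExtBinary reader stores the flag through an override. A minimal sketch of that pattern, using made-up class names rather than the `llvm::sampleprof` readers, could look like this. The `skipsFlatProf` accessor is a hypothetical addition so the behavior can be observed.

```cpp
// Toy reader hierarchy (hypothetical names, not the llvm::sampleprof classes).
class ToyReader {
public:
  virtual ~ToyReader() = default;
  // Base no-op: formats with no flat/context section split ignore the request.
  virtual void setSkipFlatProf(bool) {}
  // Hypothetical accessor, added only so the effect is observable.
  virtual bool skipsFlatProf() const { return false; }
};

// Only the format that can actually skip flat sections keeps the flag.
class ToyExtBinaryReader : public ToyReader {
  bool SkipFlatProf = false;

public:
  void setSkipFlatProf(bool Skip) override { SkipFlatProf = Skip; }
  bool skipsFlatProf() const override { return SkipFlatProf; }
};

// A format for which the flag is meaningless inherits the no-op.
class ToyTextReader : public ToyReader {};
```

With this layout a caller can set the flag on any reader through the base interface, and readers that cannot honor it simply ignore the call instead of carrying unused state.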
[llvm-branch-commits] [clang] release/19.x: [clang-format] Fix misalignments of pointers in angle brackets (#106013) (PR #106326)
https://github.com/owenca approved this pull request. https://github.com/llvm/llvm-project/pull/106326
[llvm-branch-commits] [clang] release/19.x: [clang-format] js handle anonymous classes (#106242) (PR #106390)
https://github.com/owenca approved this pull request. https://github.com/llvm/llvm-project/pull/106390
[llvm-branch-commits] [clang] [libcxx] [clang] Finish implementation of P0522 (PR #96023)
https://github.com/mizvekov updated https://github.com/llvm/llvm-project/pull/96023

>From 84f988ee7c2d8fc5f777bc98850f6ab126fb3b71 Mon Sep 17 00:00:00 2001
From: Matheus Izvekov
Date: Mon, 17 Jun 2024 21:39:08 -0300
Subject: [PATCH] [clang] Finish implementation of P0522

This finishes the clang implementation of P0522, getting rid of the fallback to the old, pre-P0522 rules.

Before this patch, when partial ordering template template parameters, we would perform, in order:
* If the old rules would match, we would accept it. Otherwise, don't generate diagnostics yet.
* If the new rules would match, just accept it. Otherwise, don't generate any diagnostics yet again.
* Apply the old rules again, this time with diagnostics.

This situation was far from ideal, as we would sometimes:
* Accept some things we shouldn't.
* Reject some things we shouldn't.
* Only diagnose rejection in terms of the old rules.

With this patch, we apply the P0522 rules throughout.

This needed to extend template argument deduction in order to accept the historical rule for TTP matching pack parameter to non-pack arguments. This change also makes us accept some combinations of historical and P0522 allowances we wouldn't before.

It also fixes a bunch of bugs that were documented in the test suite, for which I am not sure issues have already been created.

This causes a lot of changes to the way these failures are diagnosed, with related test suite churn.

The problem here is that the old rules were very simple and non-recursive, making it easy to provide customized diagnostics, and to keep them consistent with each other. The new rules are a lot more complex and rely on template argument deduction, substitutions, and they are recursive.

The approach taken here is to mostly rely on existing diagnostics, and create a new instantiation context that keeps track of this context.
So for example when a substitution failure occurs, we use the error produced there unmodified, and just attach notes to it explaining that it occurred in the context of partial ordering this template argument against that template parameter. This diverges from the old diagnostics, which would lead with an error pointing to the template argument, explain the problem in subsequent notes, and produce a final note pointing to the parameter. --- clang/docs/ReleaseNotes.rst | 10 + .../clang/Basic/DiagnosticSemaKinds.td| 7 + clang/include/clang/Sema/Sema.h | 14 +- clang/lib/Frontend/FrontendActions.cpp| 2 + clang/lib/Sema/SemaTemplate.cpp | 94 ++--- clang/lib/Sema/SemaTemplateDeduction.cpp | 353 +- clang/lib/Sema/SemaTemplateInstantiate.cpp| 15 + .../temp/temp.arg/temp.arg.template/p3-0x.cpp | 31 +- clang/test/CXX/temp/temp.param/p12.cpp| 21 +- clang/test/Modules/cxx-templates.cpp | 15 +- clang/test/SemaCXX/make_integer_seq.cpp | 5 +- clang/test/SemaTemplate/cwg2398.cpp | 138 ++- clang/test/SemaTemplate/temp_arg_nontype.cpp | 46 ++- clang/test/SemaTemplate/temp_arg_template.cpp | 38 +- .../SemaTemplate/temp_arg_template_p0522.cpp | 82 ++-- .../Templight/templight-empty-entries-fix.cpp | 12 + .../templight-prior-template-arg.cpp | 33 +- .../type_traits/is_specialization.verify.cpp | 2 +- 18 files changed, 641 insertions(+), 277 deletions(-) diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 2639fe3270200d..3826a19e28a666 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -129,6 +129,10 @@ C++23 Feature Support C++20 Feature Support ^ +C++17 Feature Support +^ +- The implementation of the relaxed template template argument matching rules is + more complete and reliable, and should provide more accurate diagnostics. Resolutions to C++ Defect Reports ^ @@ -255,6 +259,10 @@ Improvements to Clang's diagnostics - Clang now diagnoses when the result of a [[nodiscard]] function is discarded after being cast in C. 
Fixes #GH104391. +- Clang now properly explains the reason a template template argument failed to + match a template template parameter, in terms of the C++17 relaxed matching rules + instead of the old ones. + - Don't emit duplicated dangling diagnostics. (#GH93386). - Improved diagnostic when trying to befriend a concept. (#GH45182). @@ -322,6 +330,8 @@ Bug Fixes to C++ Support - Correctly check constraints of explicit instantiations of member functions. (#GH46029) - When performing partial ordering of function templates, clang now checks that the deduction was consistent. Fixes (#GH18291). +- Fixes to several issues in partial ordering of template template parameters, which + were documented in the test suite. - Fixed an assertion failure about a constraint of a friend function template references to a value with greater
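To make the two allowances named in the commit message concrete, here is a minimal sketch. The template names (`OneParam`, `WithDefault`, `TakesPackTTP`, `TakesUnaryTTP`) are made up for illustration and do not come from the patch or its tests. The first alias relies on the historical rule (a pack template template parameter matched against a non-pack argument); the second relies on the P0522 relaxed matching and is guarded by the standard feature-test macro, on the assumption that the compiler defines it when the relaxed rules are active.

```cpp
#include <type_traits>

// Hypothetical class templates used only for this illustration.
template <class T> struct OneParam {};
template <class T, class U = int> struct WithDefault {};

// A template template parameter (TTP) declared with a parameter pack.
template <template <class...> class TT> struct TakesPackTTP {
  using type = TT<int>;
};

// A TTP declared with exactly one type parameter.
template <template <class> class TT> struct TakesUnaryTTP {
  using type = TT<char>;
};

// Historical allowance: the pack TTP accepts a non-pack template argument.
using PackMatch = TakesPackTTP<OneParam>::type;
static_assert(std::is_same<PackMatch, OneParam<int>>::value,
              "pack TTP should accept a one-parameter template");

#if __cpp_template_template_args >= 201611L
// P0522 allowance: WithDefault has two parameters, but the second one is
// defaulted, so it is at least as specialized as the one-parameter TTP.
using DefaultMatch = TakesUnaryTTP<WithDefault>::type;
static_assert(std::is_same<DefaultMatch, WithDefault<char, int>>::value,
              "relaxed matching should fill in the default argument");
#endif
```

Both assertions are checked at compile time, so a compiler that rejects either match reports it as an ordinary error at the alias declaration.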
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -0,0 +1,147 @@
+//===- CoroAnnotationElide.cpp - Elide attributed safe coroutine calls ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// \file
+// This pass transforms all Call or Invoke instructions that are annotated
+// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead.
+// The frame of the callee coroutine is allocated inside the caller. A pointer
+// to the allocated frame will be passed into the `.noalloc` ramp function.
+//
+//===--===//
+
+#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h"
+
+#include "llvm/Analysis/LazyCallGraph.h"
+#include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/Instruction.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/Transforms/Utils/CallGraphUpdater.h"
+
+#include
+
+using namespace llvm;
+
+#define DEBUG_TYPE "coro-annotation-elide"
+
+static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) {
+  for (Instruction &I : F->getEntryBlock())
+    if (!isa<AllocaInst>(&I))
+      return &I;
+  llvm_unreachable("no terminator in the entry block");
+}
+
+// Create an alloca in the caller, using FrameSize and FrameAlign as the callee
+// coroutine's activation frame.
+static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize,
+                                    Align FrameAlign) {
+  LLVMContext &C = Caller->getContext();
+  BasicBlock::iterator InsertPt =
+      getFirstNonAllocaInTheEntryBlock(Caller)->getIterator();
+  const DataLayout &DL = Caller->getDataLayout();
+  auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize);
+  auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt);
+  Frame->setAlignment(FrameAlign);
+  return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt);
+}
+
+// Given a call or invoke instruction to the elide safe coroutine, this function
+// does the following:
+// - Allocate a frame for the callee coroutine in the caller using alloca.
+// - Replace the old CB with a new Call or Invoke to `NewCallee`, with the
+//   pointer to the frame as an additional argument to NewCallee.
+static void processCall(CallBase *CB, Function *Caller, Function *NewCallee,
+                        uint64_t FrameSize, Align FrameAlign) {
+  auto *FramePtr = allocateFrameInCaller(Caller, FrameSize, FrameAlign);

ChuanqiXu9 wrote:

Yeah, we need to do this in the frontend.

https://github.com/llvm/llvm-project/pull/99285
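As a rough source-level analogy for what the quoted file comment describes (the callee's frame lives in the caller and a pointer to it is handed to a `.noalloc` ramp), the sketch below uses plain C++ rather than LLVM IR. Every name in it (`ToyFrame`, `toyRamp`, `toyRampNoAlloc`, `HeapAllocations`) is hypothetical, and real coroutine frames are laid out by CoroSplit, not written by hand like this.

```cpp
#include <new>

// Hypothetical stand-in for a coroutine frame; real frames are compiler-built.
struct ToyFrame {
  int ResumeIndex;
  int Payload;
};

// Counts heap allocations so the two variants can be compared.
static int HeapAllocations = 0;

// Analogue of the ordinary ramp function: it allocates its own frame.
ToyFrame *toyRamp(int Payload) {
  ++HeapAllocations;
  return new ToyFrame{0, Payload};
}

// Analogue of the `.noalloc` ramp: the caller passes in storage for the frame.
ToyFrame *toyRampNoAlloc(void *FrameStorage, int Payload) {
  return new (FrameStorage) ToyFrame{0, Payload};
}

// Caller after the transformation: the frame is a local buffer (the analogue
// of the alloca in the entry block), sized and aligned for the callee's frame.
int callerWithElision(int Payload) {
  alignas(ToyFrame) unsigned char Storage[sizeof(ToyFrame)];
  ToyFrame *Frame = toyRampNoAlloc(Storage, Payload);
  int Result = Frame->Payload;
  Frame->~ToyFrame();
  return Result;
}

// Caller before the transformation: the callee heap-allocates the frame.
int callerWithoutElision(int Payload) {
  ToyFrame *Frame = toyRamp(Payload);
  int Result = Frame->Payload;
  delete Frame;
  return Result;
}
```

The analogy only holds when the frame cannot outlive the caller's stack frame, which is presumably what the "coro_elide_safe" annotation is asserting.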
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/ChuanqiXu9 approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/99283
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/ChuanqiXu9 approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/99285
[llvm-branch-commits] [clang] [clang-format] Revert "[clang-format][NFC] Delete TT_LambdaArrow (#70… (PR #106482)
https://github.com/owenca created https://github.com/llvm/llvm-project/pull/106482 …… (#105923) …519)" This reverts commit e00d32afb9d33a1eca48e2b041c9688436706c5b and adds a test for lambda arrow SplitPenalty. Fixes #105480. >From 386f54403a6b38fd14d8e3126fcc46b7e579f575 Mon Sep 17 00:00:00 2001 From: Owen Pan Date: Wed, 28 Aug 2024 18:23:54 -0700 Subject: [PATCH] =?UTF-8?q?[clang-format]=20Revert=20"[clang-format][NFC]?= =?UTF-8?q?=20Delete=20TT=5FLambdaArrow=20(#70=E2=80=A6=20(#105923)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit …519)" This reverts commit e00d32afb9d33a1eca48e2b041c9688436706c5b and adds a test for lambda arrow SplitPenalty. Fixes #105480. --- clang/lib/Format/ContinuationIndenter.cpp | 10 +++--- clang/lib/Format/FormatToken.h| 3 +- clang/lib/Format/TokenAnnotator.cpp | 33 ++ clang/lib/Format/UnwrappedLineParser.cpp | 2 +- clang/unittests/Format/TokenAnnotatorTest.cpp | 34 ++- 5 files changed, 50 insertions(+), 32 deletions(-) diff --git a/clang/lib/Format/ContinuationIndenter.cpp b/clang/lib/Format/ContinuationIndenter.cpp index b07360425ca6e1..7d89f0e63dd225 100644 --- a/clang/lib/Format/ContinuationIndenter.cpp +++ b/clang/lib/Format/ContinuationIndenter.cpp @@ -842,10 +842,8 @@ void ContinuationIndenter::addTokenOnCurrentLine(LineState &State, bool DryRun, CurrentState.ContainsUnwrappedBuilder = true; } - if (Current.is(TT_TrailingReturnArrow) && - Style.Language == FormatStyle::LK_Java) { + if (Current.is(TT_LambdaArrow) && Style.Language == FormatStyle::LK_Java) CurrentState.NoLineBreak = true; - } if (Current.isMemberAccess() && Previous.is(tok::r_paren) && (Previous.MatchingParen && (Previous.TotalLength - Previous.MatchingParen->TotalLength > 10))) { @@ -1000,7 +998,7 @@ unsigned ContinuationIndenter::addTokenOnNewLine(LineState &State, // // is common and should be formatted like a free-standing function. The same // goes for wrapping before the lambda return type arrow. 
- if (Current.isNot(TT_TrailingReturnArrow) && + if (Current.isNot(TT_LambdaArrow) && (!Style.isJavaScript() || Current.NestingLevel != 0 || !PreviousNonComment || PreviousNonComment->isNot(tok::equal) || !Current.isOneOf(Keywords.kw_async, Keywords.kw_function))) { @@ -1257,7 +1255,7 @@ unsigned ContinuationIndenter::getNewLineColumn(const LineState &State) { } return CurrentState.Indent; } - if (Current.is(TT_TrailingReturnArrow) && + if (Current.is(TT_LambdaArrow) && Previous.isOneOf(tok::kw_noexcept, tok::kw_mutable, tok::kw_constexpr, tok::kw_consteval, tok::kw_static, TT_AttributeSquare)) { return ContinuationIndent; @@ -1590,7 +1588,7 @@ unsigned ContinuationIndenter::moveStateToNextToken(LineState &State, } if (Current.isOneOf(TT_BinaryOperator, TT_ConditionalExpr) && Newline) CurrentState.NestedBlockIndent = State.Column + Current.ColumnWidth + 1; - if (Current.isOneOf(TT_LambdaLSquare, TT_TrailingReturnArrow)) + if (Current.isOneOf(TT_LambdaLSquare, TT_LambdaArrow)) CurrentState.LastSpace = State.Column; if (Current.is(TT_RequiresExpression) && Style.RequiresExpressionIndentation == FormatStyle::REI_Keyword) { diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h index cc45d5a8c5c1ec..9bfeb2052164ee 100644 --- a/clang/lib/Format/FormatToken.h +++ b/clang/lib/Format/FormatToken.h @@ -102,6 +102,7 @@ namespace format { TYPE(JsTypeColon) \ TYPE(JsTypeOperator) \ TYPE(JsTypeOptionalQuestion) \ + TYPE(LambdaArrow) \ TYPE(LambdaLBrace) \ TYPE(LambdaLSquare) \ TYPE(LeadingJavaAnnotation) \ @@ -725,7 +726,7 @@ struct FormatToken { bool isMemberAccess() const { return isOneOf(tok::arrow, tok::period, tok::arrowstar) && !isOneOf(TT_DesignatedInitializerPeriod, TT_TrailingReturnArrow, -TT_LeadingJavaAnnotation); +TT_LambdaArrow, TT_LeadingJavaAnnotation); } bool isPointerOrReference() const { diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp index 851f79895ac5ac..07b42e79ba9a61 100644 --- 
a/clang/lib/Format/TokenAnnotator.cpp +++ b/clang/lib/Format/TokenAnnotator.cpp @@ -831,7 +831,7 @@ class AnnotatingParser { } // An arrow after an ObjC method expression is not a lambda arrow. if (CurrentToken->is(TT_ObjCMethodExpr) && CurrentToken->Next && -Current
[llvm-branch-commits] [clang] [clang-format] Revert "[clang-format][NFC] Delete TT_LambdaArrow (#70… (PR #106482)
llvmbot wrote: @llvm/pr-subscribers-clang-format Author: Owen Pan (owenca) Changes …… (#105923) …519)" This reverts commit e00d32afb9d33a1eca48e2b041c9688436706c5b and adds a test for lambda arrow SplitPenalty. Fixes #105480. --- Full diff: https://github.com/llvm/llvm-project/pull/106482.diff 5 Files Affected: - (modified) clang/lib/Format/ContinuationIndenter.cpp (+4-6) - (modified) clang/lib/Format/FormatToken.h (+2-1) - (modified) clang/lib/Format/TokenAnnotator.cpp (+18-15) - (modified) clang/lib/Format/UnwrappedLineParser.cpp (+1-1) - (modified) clang/unittests/Format/TokenAnnotatorTest.cpp (+25-9) ``diff diff --git a/clang/lib/Format/ContinuationIndenter.cpp b/clang/lib/Format/ContinuationIndenter.cpp index b07360425ca6e1..7d89f0e63dd225 100644 --- a/clang/lib/Format/ContinuationIndenter.cpp +++ b/clang/lib/Format/ContinuationIndenter.cpp @@ -842,10 +842,8 @@ void ContinuationIndenter::addTokenOnCurrentLine(LineState &State, bool DryRun, CurrentState.ContainsUnwrappedBuilder = true; } - if (Current.is(TT_TrailingReturnArrow) && - Style.Language == FormatStyle::LK_Java) { + if (Current.is(TT_LambdaArrow) && Style.Language == FormatStyle::LK_Java) CurrentState.NoLineBreak = true; - } if (Current.isMemberAccess() && Previous.is(tok::r_paren) && (Previous.MatchingParen && (Previous.TotalLength - Previous.MatchingParen->TotalLength > 10))) { @@ -1000,7 +998,7 @@ unsigned ContinuationIndenter::addTokenOnNewLine(LineState &State, // // is common and should be formatted like a free-standing function. The same // goes for wrapping before the lambda return type arrow. 
- if (Current.isNot(TT_TrailingReturnArrow) && + if (Current.isNot(TT_LambdaArrow) && (!Style.isJavaScript() || Current.NestingLevel != 0 || !PreviousNonComment || PreviousNonComment->isNot(tok::equal) || !Current.isOneOf(Keywords.kw_async, Keywords.kw_function))) { @@ -1257,7 +1255,7 @@ unsigned ContinuationIndenter::getNewLineColumn(const LineState &State) { } return CurrentState.Indent; } - if (Current.is(TT_TrailingReturnArrow) && + if (Current.is(TT_LambdaArrow) && Previous.isOneOf(tok::kw_noexcept, tok::kw_mutable, tok::kw_constexpr, tok::kw_consteval, tok::kw_static, TT_AttributeSquare)) { return ContinuationIndent; @@ -1590,7 +1588,7 @@ unsigned ContinuationIndenter::moveStateToNextToken(LineState &State, } if (Current.isOneOf(TT_BinaryOperator, TT_ConditionalExpr) && Newline) CurrentState.NestedBlockIndent = State.Column + Current.ColumnWidth + 1; - if (Current.isOneOf(TT_LambdaLSquare, TT_TrailingReturnArrow)) + if (Current.isOneOf(TT_LambdaLSquare, TT_LambdaArrow)) CurrentState.LastSpace = State.Column; if (Current.is(TT_RequiresExpression) && Style.RequiresExpressionIndentation == FormatStyle::REI_Keyword) { diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h index cc45d5a8c5c1ec..9bfeb2052164ee 100644 --- a/clang/lib/Format/FormatToken.h +++ b/clang/lib/Format/FormatToken.h @@ -102,6 +102,7 @@ namespace format { TYPE(JsTypeColon) \ TYPE(JsTypeOperator) \ TYPE(JsTypeOptionalQuestion) \ + TYPE(LambdaArrow) \ TYPE(LambdaLBrace) \ TYPE(LambdaLSquare) \ TYPE(LeadingJavaAnnotation) \ @@ -725,7 +726,7 @@ struct FormatToken { bool isMemberAccess() const { return isOneOf(tok::arrow, tok::period, tok::arrowstar) && !isOneOf(TT_DesignatedInitializerPeriod, TT_TrailingReturnArrow, -TT_LeadingJavaAnnotation); +TT_LambdaArrow, TT_LeadingJavaAnnotation); } bool isPointerOrReference() const { diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp index 851f79895ac5ac..07b42e79ba9a61 100644 --- 
a/clang/lib/Format/TokenAnnotator.cpp +++ b/clang/lib/Format/TokenAnnotator.cpp @@ -831,7 +831,7 @@ class AnnotatingParser { } // An arrow after an ObjC method expression is not a lambda arrow. if (CurrentToken->is(TT_ObjCMethodExpr) && CurrentToken->Next && -CurrentToken->Next->is(TT_TrailingReturnArrow)) { +CurrentToken->Next->is(TT_LambdaArrow)) { CurrentToken->Next->overwriteFixedType(TT_Unknown); } Left->MatchingParen = CurrentToken; @@ -1769,8 +1769,10 @@ class AnnotatingParser { } break; case tok::arrow: - if (Tok->Previous && Tok->Previous->is(tok::kw_noexcept)) + if (Tok->isNot(TT_LambdaArrow) && Tok->Previous && + Tok
[llvm-branch-commits] [clang] [clang-format] Revert "[clang-format][NFC] Delete TT_LambdaArrow (#70… (PR #106482)
https://github.com/owenca milestoned https://github.com/llvm/llvm-project/pull/106482
[llvm-branch-commits] [llvm] [LoongArch] Optimize for immediate value materialization using BSTRINS_D instruction (PR #106332)
https://github.com/wangleiat updated https://github.com/llvm/llvm-project/pull/106332 >From b2e3659d23ff3a576e2967576d501b24d6466e87 Mon Sep 17 00:00:00 2001 From: wanglei Date: Wed, 28 Aug 2024 12:16:47 +0800 Subject: [PATCH] update test sextw-removal.ll Created using spr 1.3.5-bogner --- llvm/test/CodeGen/LoongArch/sextw-removal.ll | 40 1 file changed, 16 insertions(+), 24 deletions(-) diff --git a/llvm/test/CodeGen/LoongArch/sextw-removal.ll b/llvm/test/CodeGen/LoongArch/sextw-removal.ll index 2bb39395c1d1b6..7500b5ae09359a 100644 --- a/llvm/test/CodeGen/LoongArch/sextw-removal.ll +++ b/llvm/test/CodeGen/LoongArch/sextw-removal.ll @@ -323,21 +323,17 @@ define void @test7(i32 signext %arg, i32 signext %arg1) nounwind { ; CHECK-NEXT:st.d $s2, $sp, 8 # 8-byte Folded Spill ; CHECK-NEXT:sra.w $a0, $a0, $a1 ; CHECK-NEXT:lu12i.w $a1, 349525 -; CHECK-NEXT:ori $a1, $a1, 1365 -; CHECK-NEXT:lu32i.d $a1, 349525 -; CHECK-NEXT:lu52i.d $fp, $a1, 1365 +; CHECK-NEXT:ori $fp, $a1, 1365 +; CHECK-NEXT:bstrins.d $fp, $fp, 62, 32 ; CHECK-NEXT:lu12i.w $a1, 209715 -; CHECK-NEXT:ori $a1, $a1, 819 -; CHECK-NEXT:lu32i.d $a1, 209715 -; CHECK-NEXT:lu52i.d $s0, $a1, 819 +; CHECK-NEXT:ori $s0, $a1, 819 +; CHECK-NEXT:bstrins.d $s0, $s0, 61, 32 ; CHECK-NEXT:lu12i.w $a1, 61680 -; CHECK-NEXT:ori $a1, $a1, 3855 -; CHECK-NEXT:lu32i.d $a1, -61681 -; CHECK-NEXT:lu52i.d $s1, $a1, 240 +; CHECK-NEXT:ori $s1, $a1, 3855 +; CHECK-NEXT:bstrins.d $s1, $s1, 59, 32 ; CHECK-NEXT:lu12i.w $a1, 4112 -; CHECK-NEXT:ori $a1, $a1, 257 -; CHECK-NEXT:lu32i.d $a1, 65793 -; CHECK-NEXT:lu52i.d $s2, $a1, 16 +; CHECK-NEXT:ori $s2, $a1, 257 +; CHECK-NEXT:bstrins.d $s2, $s2, 56, 32 ; CHECK-NEXT:.p2align 4, , 16 ; CHECK-NEXT: .LBB6_1: # %bb2 ; CHECK-NEXT:# =>This Inner Loop Header: Depth=1 @@ -374,21 +370,17 @@ define void @test7(i32 signext %arg, i32 signext %arg1) nounwind { ; NORMV-NEXT:st.d $s2, $sp, 8 # 8-byte Folded Spill ; NORMV-NEXT:sra.w $a0, $a0, $a1 ; NORMV-NEXT:lu12i.w $a1, 349525 -; NORMV-NEXT:ori $a1, $a1, 1365 -; 
NORMV-NEXT:lu32i.d $a1, 349525 -; NORMV-NEXT:lu52i.d $fp, $a1, 1365 +; NORMV-NEXT:ori $fp, $a1, 1365 +; NORMV-NEXT:bstrins.d $fp, $fp, 62, 32 ; NORMV-NEXT:lu12i.w $a1, 209715 -; NORMV-NEXT:ori $a1, $a1, 819 -; NORMV-NEXT:lu32i.d $a1, 209715 -; NORMV-NEXT:lu52i.d $s0, $a1, 819 +; NORMV-NEXT:ori $s0, $a1, 819 +; NORMV-NEXT:bstrins.d $s0, $s0, 61, 32 ; NORMV-NEXT:lu12i.w $a1, 61680 -; NORMV-NEXT:ori $a1, $a1, 3855 -; NORMV-NEXT:lu32i.d $a1, -61681 -; NORMV-NEXT:lu52i.d $s1, $a1, 240 +; NORMV-NEXT:ori $s1, $a1, 3855 +; NORMV-NEXT:bstrins.d $s1, $s1, 59, 32 ; NORMV-NEXT:lu12i.w $a1, 4112 -; NORMV-NEXT:ori $a1, $a1, 257 -; NORMV-NEXT:lu32i.d $a1, 65793 -; NORMV-NEXT:lu52i.d $s2, $a1, 16 +; NORMV-NEXT:ori $s2, $a1, 257 +; NORMV-NEXT:bstrins.d $s2, $s2, 56, 32 ; NORMV-NEXT:.p2align 4, , 16 ; NORMV-NEXT: .LBB6_1: # %bb2 ; NORMV-NEXT:# =>This Inner Loop Header: Depth=1 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
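The new three-instruction sequences in the CHECK lines can be traced by hand. The sketch below models `lu12i.w`, `ori` and `bstrins.d` as plain functions on 64-bit integers, following my reading of the LoongArch reference manual; the helper names (`simLu12iW`, `simOri`, `simBstrinsD`, `materialize`) are invented for this illustration and are not part of LLVM.

```cpp
#include <cstdint>

// lu12i.w rd, si20: place the 20-bit immediate at bits [31:12] and
// sign-extend the 32-bit result to 64 bits.
uint64_t simLu12iW(int32_t Si20) {
  uint32_t Low = (static_cast<uint32_t>(Si20) & 0xFFFFFu) << 12;
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(Low)));
}

// ori rd, rj, ui12: bitwise OR with a zero-extended 12-bit immediate.
uint64_t simOri(uint64_t Rj, uint32_t Ui12) { return Rj | (Ui12 & 0xFFFu); }

// bstrins.d rd, rj, msb, lsb: replace bits [msb:lsb] of rd with
// bits [msb-lsb:0] of rj.
uint64_t simBstrinsD(uint64_t Rd, uint64_t Rj, unsigned Msb, unsigned Lsb) {
  unsigned Width = Msb - Lsb + 1;
  uint64_t FieldMask = (Width == 64) ? ~0ULL : ((1ULL << Width) - 1);
  return (Rd & ~(FieldMask << Lsb)) | ((Rj & FieldMask) << Lsb);
}

// The pattern in the updated test: build the low 32 bits with lu12i.w + ori,
// then copy the low bits of the register into its own upper half.
uint64_t materialize(int32_t Hi20, uint32_t Lo12, unsigned Msb) {
  uint64_t Reg = simOri(simLu12iW(Hi20), Lo12);
  return simBstrinsD(Reg, Reg, Msb, 32);
}
```

With the operands from the test, `materialize(349525, 1365, 62)` gives 0x5555555555555555, `materialize(209715, 819, 61)` gives 0x3333333333333333, `materialize(61680, 3855, 59)` gives 0x0F0F0F0F0F0F0F0F, and `materialize(4112, 257, 56)` gives 0x0101010101010101. The `msb` operand stops at the highest set bit of each constant, since the bits above it are already zero after `lu12i.w`.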
[llvm-branch-commits] [flang] [flang] Introduce custom loop nest generation for loops in workshare construct (PR #101445)
ivanradanov wrote:

> ... However, they would work if they ran after the pass lowering
> `omp.workshare` to a set of `omp.single` for the code in between
> `omp.wsloop`s. That way we would not have to introduce a new loop wrapper and
> also we could create passes assuming the parent of region of an `omp.wsloop`
> is executed by all threads in the team. I don't think that should be an
> issue, since in principle it makes sense to me that the `omp.workshare`
> transformation would run immediately after PFT to MLIR lowering. What do you
> think about that alternative?

Ideally, the `omp.workshare` lowering will run after the HLFIR to FIR lowering, because missing the high-level optimizations that HLFIR provides can result in very bad performance (unneeded temporary arrays, unnecessary copies, non-fused array computation, etc.). The workshare lowering transforms the `omp.workshare.loop_wrapper`s into `omp.wsloop`s, so they are gone after that.

Another factor is that there may not be PFT->loop lowerings for many constructs that need to be divided into units of work, so we may need to first generate HLFIR and alter the lowerings from HLFIR to FIR to get the `omp.wsloop` (or `omp.workshare.loop_wrapper`). This means that there will be portions of the pipeline (from PFT->HLFIR until HLFIR->FIR) where an `omp.wsloop` nested in an `omp.workshare` will be the wrong representation.

Are there any concerns with adding `omp.workshare.loop_wrapper`? I do not see that big of an overhead (maintenance or compile time) resulting from its addition, while it makes things clearer and more robust in my opinion.

https://github.com/llvm/llvm-project/pull/101445
[llvm-branch-commits] [llvm] DAG: Check if is_fpclass is custom, instead of isLegalOrCustom (PR #105577)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/105577

>From 9e23baea4d3444e7e0bccdf39b738f404abfe265 Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Wed, 21 Aug 2024 20:15:55 +0400
Subject: [PATCH] DAG: Check if is_fpclass is custom, instead of isLegalOrCustom

For some reason, isOperationLegalOrCustom is not the same as isOperationLegal || isOperationCustom. Unfortunately, it checks if the type is legal, which makes it useless for custom lowering on non-legal types (which is always ppcf128).

Really the DAG builder shouldn't be going to expand this in the builder, it makes it difficult to work with. It's only here to work around the DAG requiring legal integer types the same size as the FP type after type legalization.

--- .../SelectionDAG/SelectionDAGBuilder.cpp | 3 +- llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | 17 +- llvm/test/CodeGen/AMDGPU/fract-match.ll | 10 +- .../CodeGen/AMDGPU/llvm.is.fpclass.f16.ll | 205 +++--- llvm/test/CodeGen/PowerPC/is_fpclass.ll | 37 ++-- 5 files changed, 160 insertions(+), 112 deletions(-) diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index d103308cce566a..ad24704d940a36 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -7031,7 +7031,8 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I, // If ISD::IS_FPCLASS should be expanded, do it right now, because the // expansion can use illegal types. Making expansion early allows // legalizing these types prior to selection. 
-if (!TLI.isOperationLegalOrCustom(ISD::IS_FPCLASS, ArgVT)) { +if (!TLI.isOperationLegal(ISD::IS_FPCLASS, ArgVT) && +!TLI.isOperationCustom(ISD::IS_FPCLASS, ArgVT)) { SDValue Result = TLI.expandIS_FPCLASS(DestVT, Op, Test, Flags, sdl, DAG); setValue(&I, Result); return; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp index 96143d688801aa..d24836b7eeb095 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp @@ -426,12 +426,17 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(const TargetMachine &TM, // FIXME: These IS_FPCLASS vector fp types are marked custom so it reaches // scalarization code. Can be removed when IS_FPCLASS expand isn't called by // default unless marked custom/legal. - setOperationAction( - ISD::IS_FPCLASS, - {MVT::v2f16, MVT::v3f16, MVT::v4f16, MVT::v16f16, MVT::v2f32, MVT::v3f32, - MVT::v4f32, MVT::v5f32, MVT::v6f32, MVT::v7f32, MVT::v8f32, MVT::v16f32, - MVT::v2f64, MVT::v3f64, MVT::v4f64, MVT::v8f64, MVT::v16f64}, - Custom); + setOperationAction(ISD::IS_FPCLASS, + {MVT::v2f32, MVT::v3f32, MVT::v4f32, MVT::v5f32, + MVT::v6f32, MVT::v7f32, MVT::v8f32, MVT::v16f32, + MVT::v2f64, MVT::v3f64, MVT::v4f64, MVT::v8f64, + MVT::v16f64}, + Custom); + + if (isTypeLegal(MVT::f16)) +setOperationAction(ISD::IS_FPCLASS, + {MVT::v2f16, MVT::v3f16, MVT::v4f16, MVT::v16f16}, + Custom); // Expand to fneg + fadd. 
setOperationAction(ISD::FSUB, MVT::f64, Expand); diff --git a/llvm/test/CodeGen/AMDGPU/fract-match.ll b/llvm/test/CodeGen/AMDGPU/fract-match.ll index 1b28ddb2c58620..b212b9caf8400e 100644 --- a/llvm/test/CodeGen/AMDGPU/fract-match.ll +++ b/llvm/test/CodeGen/AMDGPU/fract-match.ll @@ -2135,16 +2135,16 @@ define <2 x half> @safe_math_fract_v2f16(<2 x half> %x, ptr addrspace(1) nocaptu ; GFX8-LABEL: safe_math_fract_v2f16: ; GFX8: ; %bb.0: ; %entry ; GFX8-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -; GFX8-NEXT:v_mov_b32_e32 v6, 0x204 +; GFX8-NEXT:s_movk_i32 s6, 0x204 ; GFX8-NEXT:v_floor_f16_sdwa v3, v0 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_1 ; GFX8-NEXT:v_floor_f16_e32 v4, v0 -; GFX8-NEXT:v_cmp_class_f16_sdwa s[4:5], v0, v6 src0_sel:WORD_1 src1_sel:DWORD +; GFX8-NEXT:v_fract_f16_sdwa v5, v0 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_1 +; GFX8-NEXT:v_cmp_class_f16_sdwa s[4:5], v0, s6 src0_sel:WORD_1 src1_sel:DWORD ; GFX8-NEXT:v_pack_b32_f16 v3, v4, v3 ; GFX8-NEXT:v_fract_f16_e32 v4, v0 -; GFX8-NEXT:v_fract_f16_sdwa v5, v0 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_1 -; GFX8-NEXT:v_cmp_class_f16_e32 vcc, v0, v6 ; GFX8-NEXT:v_cndmask_b32_e64 v5, v5, 0, s[4:5] -; GFX8-NEXT:v_cndmask_b32_e64 v0, v4, 0, vcc +; GFX8-NEXT:v_cmp_class_f16_e64 s[4:5], v0, s6 +; GFX8-NEXT:v_cndmask_b32_e64 v0, v4, 0, s[4:5] ; GFX8-NEXT:v_pack_b32_f16 v0, v0, v5 ; GFX8-NEXT:global_store_dword v[1:2], v3, off ; GFX8-NEXT:s_waitcnt vmcnt(0) diff --git a/llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.f16.l
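The distinction the commit message draws between `isOperationLegalOrCustom` and `isOperationLegal || isOperationCustom` can be shown with a toy model. This is not the real `TargetLowering` interface: `ToyLowering`, `ToyAction` and the two `shouldExpandEarly*` helpers are invented here, and the model only encodes the behaviour the message states, namely that the combined query also requires a legal type while the custom-only query does not.

```cpp
// Toy model of a single (operation, type) entry in a lowering table.
enum class ToyAction { Legal, Custom, Expand };

struct ToyLowering {
  bool TypeIsLegal;  // e.g. false for ppcf128 in the commit message
  ToyAction Action;  // what the target asked for on this operation and type

  bool isOperationLegal() const {
    return TypeIsLegal && Action == ToyAction::Legal;
  }
  // Assumed, per the commit message: no type-legality check here.
  bool isOperationCustom() const { return Action == ToyAction::Custom; }
  // Assumed, per the commit message: this one does check the type.
  bool isOperationLegalOrCustom() const {
    return TypeIsLegal &&
           (Action == ToyAction::Legal || Action == ToyAction::Custom);
  }
};

// Condition before the patch: expand early unless "legal or custom".
bool shouldExpandEarlyOld(const ToyLowering &TLI) {
  return !TLI.isOperationLegalOrCustom();
}

// Condition after the patch: expand early only if neither query succeeds.
bool shouldExpandEarlyNew(const ToyLowering &TLI) {
  return !TLI.isOperationLegal() && !TLI.isOperationCustom();
}
```

For an entry such as `ToyLowering{false, ToyAction::Custom}` the old condition expands early and the custom lowering never runs, while the new condition leaves the node for the target's custom hook. For legal types the two conditions agree.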
[llvm-branch-commits] [llvm] DAG: Handle lowering unordered compare with inf (PR #100378)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/100378 >From 4edffb2750e8320c39109cd7c9c086c2ee86e9d4 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Tue, 7 Feb 2023 12:22:05 -0400 Subject: [PATCH 1/3] DAG: Handle lowering unordered compare with inf Try to take advantage of the nan check behavior of fcmp. x86_64 looks better, x86_32 looks worse. --- llvm/include/llvm/CodeGen/CodeGenCommonISel.h | 7 +- llvm/lib/CodeGen/CodeGenCommonISel.cpp| 8 +- .../CodeGen/SelectionDAG/TargetLowering.cpp | 53 +++-- llvm/test/CodeGen/X86/is_fpclass.ll | 78 +-- 4 files changed, 83 insertions(+), 63 deletions(-) diff --git a/llvm/include/llvm/CodeGen/CodeGenCommonISel.h b/llvm/include/llvm/CodeGen/CodeGenCommonISel.h index 90ef890f22d1b1..e4b2e20babc07a 100644 --- a/llvm/include/llvm/CodeGen/CodeGenCommonISel.h +++ b/llvm/include/llvm/CodeGen/CodeGenCommonISel.h @@ -218,10 +218,15 @@ findSplitPointForStackProtector(MachineBasicBlock *BB, /// Evaluates if the specified FP class test is better performed as the inverse /// (i.e. fewer instructions should be required to lower it). An example is the /// test "inf|normal|subnormal|zero", which is an inversion of "nan". +/// /// \param Test The test as specified in 'is_fpclass' intrinsic invocation. +/// +/// \param UseFCmp The intention is to perform the comparison using +/// floating-point compare instructions which check for nan. +/// /// \returns The inverted test, or fcNone, if inversion does not produce a /// simpler test. -FPClassTest invertFPClassTestIfSimpler(FPClassTest Test); +FPClassTest invertFPClassTestIfSimpler(FPClassTest Test, bool UseFCmp); /// Assuming the instruction \p MI is going to be deleted, attempt to salvage /// debug users of \p MI by writing the effect of \p MI in a DIExpression. 
diff --git a/llvm/lib/CodeGen/CodeGenCommonISel.cpp b/llvm/lib/CodeGen/CodeGenCommonISel.cpp index fe144d3c182039..d985751e2be0be 100644 --- a/llvm/lib/CodeGen/CodeGenCommonISel.cpp +++ b/llvm/lib/CodeGen/CodeGenCommonISel.cpp @@ -173,8 +173,9 @@ llvm::findSplitPointForStackProtector(MachineBasicBlock *BB, return SplitPoint; } -FPClassTest llvm::invertFPClassTestIfSimpler(FPClassTest Test) { +FPClassTest llvm::invertFPClassTestIfSimpler(FPClassTest Test, bool UseFCmp) { FPClassTest InvertedTest = ~Test; + // Pick the direction with fewer tests // TODO: Handle more combinations of cases that can be handled together switch (static_cast(InvertedTest)) { @@ -200,6 +201,11 @@ FPClassTest llvm::invertFPClassTestIfSimpler(FPClassTest Test) { case fcSubnormal | fcZero: case fcSubnormal | fcZero | fcNan: return InvertedTest; + case fcInf | fcNan: +// If we're trying to use fcmp, we can take advantage of the nan check +// behavior of the compare (but this is more instructions in the integer +// expansion). +return UseFCmp ? InvertedTest : fcNone; default: return fcNone; } diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp index 4e796289cff0a1..1e3a0da0f3be5b 100644 --- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp @@ -8672,7 +8672,7 @@ SDValue TargetLowering::expandIS_FPCLASS(EVT ResultVT, SDValue Op, // Degenerated cases. if (Test == fcNone) return DAG.getBoolConstant(false, DL, ResultVT, OperandVT); - if ((Test & fcAllFlags) == fcAllFlags) + if (Test == fcAllFlags) return DAG.getBoolConstant(true, DL, ResultVT, OperandVT); // PPC double double is a pair of doubles, of which the higher part determines @@ -8683,14 +8683,6 @@ SDValue TargetLowering::expandIS_FPCLASS(EVT ResultVT, SDValue Op, OperandVT = MVT::f64; } - // Some checks may be represented as inversion of simpler check, for example - // "inf|normal|subnormal|zero" => !"nan". 
- bool IsInverted = false; - if (FPClassTest InvertedCheck = invertFPClassTestIfSimpler(Test)) { -IsInverted = true; -Test = InvertedCheck; - } - // Floating-point type properties. EVT ScalarFloatVT = OperandVT.getScalarType(); const Type *FloatTy = ScalarFloatVT.getTypeForEVT(*DAG.getContext()); @@ -8702,9 +8694,16 @@ SDValue TargetLowering::expandIS_FPCLASS(EVT ResultVT, SDValue Op, if (Flags.hasNoFPExcept() && isOperationLegalOrCustom(ISD::SETCC, OperandVT.getScalarType())) { FPClassTest FPTestMask = Test; +bool IsInvertedFP = false; + +if (FPClassTest InvertedFPCheck = +invertFPClassTestIfSimpler(FPTestMask, true)) { + FPTestMask = InvertedFPCheck; + IsInvertedFP = true; +} -ISD::CondCode OrderedCmpOpcode = IsInverted ? ISD::SETUNE : ISD::SETOEQ; -ISD::CondCode UnorderedCmpOpcode = IsInverted ? ISD::SETONE : ISD::SETUEQ; +ISD::CondCode OrderedCmpOpcode = IsInvertedFP ? ISD::SETUNE : ISD::SETOEQ; +ISD::CondCode UnorderedCmpOpcode = IsInvertedFP ? ISD::
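The inversion rule in the patch above can be illustrated outside of LLVM. The following is a minimal sketch in plain C++, not the LLVM implementation: the `ToyClass` mask, `classify`, `toyInvertIfSimpler` and `toyIsFPClass` are hypothetical names invented here, and the mask has only five bits where the real `FPClassTest` distinguishes more classes. It only shows the idea that a test such as "normal|subnormal|zero" is answered as the negation of "inf|nan" when the caller intends to use a floating-point compare, and is left alone otherwise.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical, simplified class mask (assumption: five coarse classes only).
enum ToyClass : unsigned {
  kNone = 0,
  kNan = 1u << 0,
  kInf = 1u << 1,
  kNormal = 1u << 2,
  kSubnormal = 1u << 3,
  kZero = 1u << 4,
  kAll = kNan | kInf | kNormal | kSubnormal | kZero,
};

// Map a value to exactly one bit of the toy mask.
unsigned classify(double x) {
  switch (std::fpclassify(x)) {
  case FP_NAN:       return kNan;
  case FP_INFINITE:  return kInf;
  case FP_NORMAL:    return kNormal;
  case FP_SUBNORMAL: return kSubnormal;
  default:           return kZero;
  }
}

// Return the inverted mask when testing the inverse is simpler, else kNone.
// "inf|nan" is only treated as simpler when a floating-point compare is
// intended, mirroring the UseFCmp parameter in the patch.
unsigned toyInvertIfSimpler(unsigned test, bool useFCmp) {
  unsigned inverted = ~test & kAll;
  switch (inverted) {
  case kNan:
  case kInf:
    return inverted;
  case kInf | kNan:
    return useFCmp ? inverted : kNone;
  default:
    return kNone;
  }
}

// Evaluate a class test, going through the inverted form when it is chosen.
bool toyIsFPClass(double x, unsigned test, bool useFCmp) {
  if (unsigned inverted = toyInvertIfSimpler(test, useFCmp))
    return (classify(x) & inverted) == 0;
  return (classify(x) & test) != 0;
}
```

Whichever form is chosen, the answer for a given input is the same; only the amount of work needed to compute it differs, which is why the choice depends on the lowering strategy.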
[llvm-branch-commits] [llvm] DAG: Lower single infinity is.fpclass tests to fcmp (PR #100380)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/100380 >From 7d48a3885d59edef708def4fada703032318a63e Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 1 Feb 2023 09:52:34 -0400 Subject: [PATCH] DAG: Lower single infinity is.fpclass tests to fcmp InstCombine also should have taken care of this, but this should be helpful when the fcmp based lowering strategy tries to combine multiple tests. --- llvm/lib/CodeGen/CodeGenCommonISel.cpp| 2 + .../CodeGen/SelectionDAG/TargetLowering.cpp | 16 llvm/test/CodeGen/X86/is_fpclass.ll | 92 --- 3 files changed, 54 insertions(+), 56 deletions(-) diff --git a/llvm/lib/CodeGen/CodeGenCommonISel.cpp b/llvm/lib/CodeGen/CodeGenCommonISel.cpp index d985751e2be0be..4cd2f6ae2fdb11 100644 --- a/llvm/lib/CodeGen/CodeGenCommonISel.cpp +++ b/llvm/lib/CodeGen/CodeGenCommonISel.cpp @@ -202,6 +202,8 @@ FPClassTest llvm::invertFPClassTestIfSimpler(FPClassTest Test, bool UseFCmp) { case fcSubnormal | fcZero | fcNan: return InvertedTest; case fcInf | fcNan: + case fcPosInf | fcNan: + case fcNegInf | fcNan: // If we're trying to use fcmp, we can take advantage of the nan check // behavior of the compare (but this is more instructions in the integer // expansion). diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp index aa022480947a7d..e3fdea34f895ba 100644 --- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp @@ -8751,6 +8751,22 @@ SDValue TargetLowering::expandIS_FPCLASS(EVT ResultVT, SDValue Op, IsOrderedInf ? OrderedCmpOpcode : UnorderedCmpOpcode); } +if ((OrderedFPTestMask == fcPosInf || OrderedFPTestMask == fcNegInf) && +isCondCodeLegalOrCustom(IsOrdered ? 
OrderedCmpOpcode + : UnorderedCmpOpcode, +OperandVT.getSimpleVT())) { + // isposinf(x) --> x == inf + // isneginf(x) --> x == -inf + // isposinf(x) || nan --> x u== inf + // isneginf(x) || nan --> x u== -inf + + SDValue Inf = DAG.getConstantFP( + APFloat::getInf(Semantics, OrderedFPTestMask == fcNegInf), DL, + OperandVT); + return DAG.getSetCC(DL, ResultVT, Op, Inf, + IsOrdered ? OrderedCmpOpcode : UnorderedCmpOpcode); +} + if (OrderedFPTestMask == (fcSubnormal | fcZero) && !IsOrdered) { // TODO: Could handle ordered case, but it produces worse code for // x86. Maybe handle ordered if fabs is free? diff --git a/llvm/test/CodeGen/X86/is_fpclass.ll b/llvm/test/CodeGen/X86/is_fpclass.ll index cc4d4c4543a515..97136dafa6c2c0 100644 --- a/llvm/test/CodeGen/X86/is_fpclass.ll +++ b/llvm/test/CodeGen/X86/is_fpclass.ll @@ -2116,24 +2116,19 @@ entry: define i1 @is_plus_inf_or_nan_f(float %x) { ; X86-LABEL: is_plus_inf_or_nan_f: ; X86: # %bb.0: -; X86-NEXT:movl {{[0-9]+}}(%esp), %eax -; X86-NEXT:cmpl $2139095040, %eax # imm = 0x7F800000 -; X86-NEXT:sete %cl -; X86-NEXT:andl $2147483647, %eax # imm = 0x7FFFFFFF -; X86-NEXT:cmpl $2139095041, %eax # imm = 0x7F800001 -; X86-NEXT:setge %al -; X86-NEXT:orb %cl, %al +; X86-NEXT:flds {{[0-9]+}}(%esp) +; X86-NEXT:flds {{\.?LCPI[0-9]+_[0-9]+}} +; X86-NEXT:fucompp +; X86-NEXT:fnstsw %ax +; X86-NEXT:# kill: def $ah killed $ah killed $ax +; X86-NEXT:sahf +; X86-NEXT:sete %al ; X86-NEXT:retl ; ; X64-LABEL: is_plus_inf_or_nan_f: ; X64: # %bb.0: -; X64-NEXT:movd %xmm0, %eax -; X64-NEXT:cmpl $2139095040, %eax # imm = 0x7F800000 -; X64-NEXT:sete %cl -; X64-NEXT:andl $2147483647, %eax # imm = 0x7FFFFFFF -; X64-NEXT:cmpl $2139095041, %eax # imm = 0x7F800001 -; X64-NEXT:setge %al -; X64-NEXT:orb %cl, %al +; X64-NEXT:ucomiss {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0 +; X64-NEXT:sete %al ; X64-NEXT:retq %class = tail call i1 @llvm.is.fpclass.f32(float %x, i32 515) ; 0x200|0x3 = "+inf|nan" ret i1 %class @@ -2142,24 +2137,19 @@ define i1 @is_plus_inf_or_nan_f(float %x) { define 
i1 @is_minus_inf_or_nan_f(float %x) { ; X86-LABEL: is_minus_inf_or_nan_f: ; X86: # %bb.0: -; X86-NEXT:movl {{[0-9]+}}(%esp), %eax -; X86-NEXT:cmpl $-8388608, %eax # imm = 0xFF800000 -; X86-NEXT:sete %cl -; X86-NEXT:andl $2147483647, %eax # imm = 0x7FFFFFFF -; X86-NEXT:cmpl $2139095041, %eax # imm = 0x7F800001 -; X86-NEXT:setge %al -; X86-NEXT:orb %cl, %al +; X86-NEXT:flds {{[0-9]+}}(%esp) +; X86-NEXT:flds {{\.?LCPI[0-9]+_[0-9]+}} +; X86-NEXT:fucompp +; X86-NEXT:fnstsw %ax +; X86-NEXT:# kill: def $ah killed $ah killed $ax +; X86-NEXT:sahf +; X86-NEXT:sete %al ; X86-NEXT:retl ; ; X64-LABEL: is_minus_inf_or_nan_f: ; X64: # %bb.0: -; X64-NEXT:movd %xmm0, %eax -; X64-NEXT:cmpl $-8388608, %eax # imm
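The four rewrites listed in the patch comment (`isposinf(x) --> x == inf` and its variants) can be checked against the standard library in plain C++. This is a sketch for illustration, not code from the patch: all function names below are made up here, and the unordered-or-equal comparison (`u==`) is modelled with `std::isunordered` since C++ has no direct operator for it.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

const float kInfF = std::numeric_limits<float>::infinity();

// Compare-based forms, following the comment in the patch.
bool isPosInfCmp(float x) { return x == kInfF; }   // x == inf
bool isNegInfCmp(float x) { return x == -kInfF; }  // x == -inf
// "u==" is true when the operands are unordered (a NaN is involved) or equal.
bool isPosInfOrNanCmp(float x) { return std::isunordered(x, kInfF) || x == kInfF; }
bool isNegInfOrNanCmp(float x) { return std::isunordered(x, -kInfF) || x == -kInfF; }

// Reference forms built from classification queries.
bool isPosInfRef(float x) { return std::isinf(x) && !std::signbit(x); }
bool isNegInfRef(float x) { return std::isinf(x) && std::signbit(x); }

// Check that both formulations agree on a handful of representative inputs.
bool singleInfComparesAgree() {
  const float samples[] = {0.0f, -0.0f, 1.0f, -1.0f,
                           std::numeric_limits<float>::denorm_min(),
                           std::numeric_limits<float>::max(),
                           kInfF, -kInfF,
                           std::numeric_limits<float>::quiet_NaN()};
  for (float x : samples) {
    if (isPosInfCmp(x) != isPosInfRef(x)) return false;
    if (isNegInfCmp(x) != isNegInfRef(x)) return false;
    if (isPosInfOrNanCmp(x) != (isPosInfRef(x) || std::isnan(x))) return false;
    if (isNegInfOrNanCmp(x) != (isNegInfRef(x) || std::isnan(x))) return false;
  }
  return true;
}
```

The point of the unordered variants is that the NaN check comes for free with the compare, so "+inf|nan" needs one comparison rather than two tests joined by an OR.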
[llvm-branch-commits] [llvm] DAG: Lower fcNormal is.fpclass to compare with inf (PR #100389)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/100389 >From f5da09293f633b8c4eb23de1a5c912a2546d1b9a Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 1 Feb 2023 09:06:59 -0400 Subject: [PATCH] DAG: Lower fcNormal is.fpclass to compare with inf Looks worse for x86 without the fabs check. Not sure if this is useful for any targets. --- .../CodeGen/SelectionDAG/TargetLowering.cpp | 25 +++ 1 file changed, 25 insertions(+) diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp index e3fdea34f895ba..ff3aab645f24b4 100644 --- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp @@ -8787,6 +8787,31 @@ SDValue TargetLowering::expandIS_FPCLASS(EVT ResultVT, SDValue Op, IsOrdered ? OrderedOp : UnorderedOp); } } + +if (FPTestMask == fcNormal) { + // TODO: Handle unordered + ISD::CondCode IsFiniteOp = IsInvertedFP ? ISD::SETUGE : ISD::SETOLT; + ISD::CondCode IsNormalOp = IsInvertedFP ? ISD::SETOLT : ISD::SETUGE; + + if (isCondCodeLegalOrCustom(IsFiniteOp, + OperandVT.getScalarType().getSimpleVT()) && + isCondCodeLegalOrCustom(IsNormalOp, + OperandVT.getScalarType().getSimpleVT()) && + isFAbsFree(OperandVT)) { +// isnormal(x) --> fabs(x) < infinity && !(fabs(x) < smallest_normal) +SDValue Inf = +DAG.getConstantFP(APFloat::getInf(Semantics), DL, OperandVT); +SDValue SmallestNormal = DAG.getConstantFP( +APFloat::getSmallestNormalized(Semantics), DL, OperandVT); + +SDValue Abs = DAG.getNode(ISD::FABS, DL, OperandVT, Op); +SDValue IsFinite = DAG.getSetCC(DL, ResultVT, Abs, Inf, IsFiniteOp); +SDValue IsNormal = +DAG.getSetCC(DL, ResultVT, Abs, SmallestNormal, IsNormalOp); +unsigned LogicOp = IsInvertedFP ? 
ISD::OR : ISD::AND; +return DAG.getNode(LogicOp, DL, ResultVT, IsFinite, IsNormal); + } +} } // Some checks may be represented as inversion of simpler check, for example ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
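The non-inverted branch of the `fcNormal` lowering above, `fabs(x) < infinity && !(fabs(x) < smallest_normal)`, can be tried in plain C++. The sketch below is an illustration under assumptions, not the DAG code: `isNormalViaCompares` and `normalLoweringAgrees` are names chosen here, `std::numeric_limits<float>::min()` stands in for `APFloat::getSmallestNormalized`, and the negated `<` plays the role of the unordered `SETUGE` condition (it is true for NaN, which the ordered `<` against infinity then rejects).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// isnormal(x) expressed as two compares on fabs(x), as in the patch comment.
bool isNormalViaCompares(float x) {
  const float inf = std::numeric_limits<float>::infinity();
  const float smallestNormal = std::numeric_limits<float>::min();
  float a = std::fabs(x);
  bool isFinite = a < inf;                   // ordered less-than: false for NaN and inf
  bool notBelowNormal = !(a < smallestNormal); // unordered >=: true for NaN too
  return isFinite && notBelowNormal;
}

// Compare against std::isnormal on representative inputs.
bool normalLoweringAgrees() {
  const float samples[] = {0.0f, -0.0f, 1.0f, -1.0f,
                           std::numeric_limits<float>::denorm_min(),
                           std::numeric_limits<float>::min(),
                           -std::numeric_limits<float>::min(),
                           std::numeric_limits<float>::max(),
                           std::numeric_limits<float>::infinity(),
                           -std::numeric_limits<float>::infinity(),
                           std::numeric_limits<float>::quiet_NaN()};
  for (float x : samples)
    if (isNormalViaCompares(x) != std::isnormal(x))
      return false;
  return true;
}
```

This assumes default floating-point semantics; with flush-to-zero or fast-math style options the subnormal samples may behave differently.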
[llvm-branch-commits] [llvm] DAG: Lower single infinity is.fpclass tests to fcmp (PR #100380)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/100380
[llvm-branch-commits] [llvm] release/19.x: workflows/release-tasks: Pass required secrets to all called workflows (#106286) (PR #106491)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/106491 Backport 9d81e7e36e33aecdee05fef551c0652abafaa052 Requested by: @tstellar >From c3beefa91b9e50c97a4ab7c32b40771d9fd0f97e Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Wed, 28 Aug 2024 22:18:08 -0700 Subject: [PATCH] workflows/release-tasks: Pass required secrets to all called workflows (#106286) Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use. (cherry picked from commit 9d81e7e36e33aecdee05fef551c0652abafaa052) --- .github/workflows/release-doxygen.yml | 7 ++- .github/workflows/release-lit.yml | 7 ++- .github/workflows/release-sources.yml | 4 .github/workflows/release-tasks.yml | 12 4 files changed, 28 insertions(+), 2 deletions(-) diff --git a/.github/workflows/release-doxygen.yml b/.github/workflows/release-doxygen.yml index ef00a438ce7ac4..ea95e5bb12b2b8 100644 --- a/.github/workflows/release-doxygen.yml +++ b/.github/workflows/release-doxygen.yml @@ -25,6 +25,10 @@ on: description: 'Upload documentation' required: false type: boolean +secrets: + RELEASE_TASKS_USER_TOKEN: +description: "Secret used to check user permissions." 
+required: false jobs: release-doxygen: @@ -63,5 +67,6 @@ jobs: if: env.upload env: GITHUB_TOKEN: ${{ github.token }} + USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }} run: | - ./llvm/utils/release/github-upload-release.py --token "$GITHUB_TOKEN" --release "${{ inputs.release-version }}" --user "${{ github.actor }}" upload --files ./*doxygen*.tar.xz + ./llvm/utils/release/github-upload-release.py --token "$GITHUB_TOKEN" --release "${{ inputs.release-version }}" --user "${{ github.actor }}" --user-token "$USER_TOKEN" upload --files ./*doxygen*.tar.xz diff --git a/.github/workflows/release-lit.yml b/.github/workflows/release-lit.yml index 0316ba406041d6..9d6f3140e68830 100644 --- a/.github/workflows/release-lit.yml +++ b/.github/workflows/release-lit.yml @@ -17,6 +17,10 @@ on: description: 'Release Version' required: true type: string +secrets: + RELEASE_TASKS_USER_TOKEN: +description: "Secret used to check user permissions." +required: false jobs: release-lit: @@ -36,8 +40,9 @@ jobs: - name: Check Permissions env: GITHUB_TOKEN: ${{ github.token }} + USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }} run: | - ./llvm/utils/release/./github-upload-release.py --token "$GITHUB_TOKEN" --user ${{ github.actor }} check-permissions + ./llvm/utils/release/./github-upload-release.py --token "$GITHUB_TOKEN" --user ${{ github.actor }} --user-token "$USER_TOKEN" check-permissions - name: Setup Cpp uses: aminya/setup-cpp@v1 diff --git a/.github/workflows/release-sources.yml b/.github/workflows/release-sources.yml index 9c5b1a9f017092..edb0449ef7e2c2 100644 --- a/.github/workflows/release-sources.yml +++ b/.github/workflows/release-sources.yml @@ -16,6 +16,10 @@ on: description: Release Version required: true type: string +secrets: + RELEASE_TASKS_USER_TOKEN: +description: "Secret used to check user permissions." +required: false # Run on pull_requests for testing purposes. 
pull_request: paths: diff --git a/.github/workflows/release-tasks.yml b/.github/workflows/release-tasks.yml index cf42730aaf8170..780dd0ff6325c9 100644 --- a/.github/workflows/release-tasks.yml +++ b/.github/workflows/release-tasks.yml @@ -66,6 +66,9 @@ jobs: with: release-version: ${{ needs.validate-tag.outputs.release-version }} upload: true +# Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use. +secrets: + RELEASE_TASKS_USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }} release-lit: name: Release Lit @@ -73,6 +76,9 @@ jobs: uses: ./.github/workflows/release-lit.yml with: release-version: ${{ needs.validate-tag.outputs.release-version }} +# Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use. +secrets: + RELEASE_TASKS_USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }} release-binaries: name: Build Release Binaries @@ -97,6 +103,9 @@ jobs: release-version: ${{ needs.validate-tag.outputs.release-version }} upload: true runs-on: ${{ matrix.runs-on }} +# Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use. +secrets: + RELEASE_TASKS_USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }} release-sources: name: Package Release Sources @@ -109,3 +118,6 @@ jobs: uses: ./.github/workflows/release-sources.yml
[llvm-branch-commits] [llvm] release/19.x: workflows/release-tasks: Pass required secrets to all called workflows (#106286) (PR #106491)
llvmbot wrote: @tru What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/106491
[llvm-branch-commits] [llvm] release/19.x: workflows/release-tasks: Pass required secrets to all called workflows (#106286) (PR #106491)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/106491
[llvm-branch-commits] [llvm] release/19.x: workflows/release-tasks: Pass required secrets to all called workflows (#106286) (PR #106491)
llvmbot wrote: @llvm/pr-subscribers-github-workflow Author: None (llvmbot) Changes Backport 9d81e7e36e33aecdee05fef551c0652abafaa052 Requested by: @tstellar --- Full diff: https://github.com/llvm/llvm-project/pull/106491.diff 4 Files Affected: - (modified) .github/workflows/release-doxygen.yml (+6-1) - (modified) .github/workflows/release-lit.yml (+6-1) - (modified) .github/workflows/release-sources.yml (+4) - (modified) .github/workflows/release-tasks.yml (+12) ``diff diff --git a/.github/workflows/release-doxygen.yml b/.github/workflows/release-doxygen.yml index ef00a438ce7ac4..ea95e5bb12b2b8 100644 --- a/.github/workflows/release-doxygen.yml +++ b/.github/workflows/release-doxygen.yml @@ -25,6 +25,10 @@ on: description: 'Upload documentation' required: false type: boolean +secrets: + RELEASE_TASKS_USER_TOKEN: +description: "Secret used to check user permissions." +required: false jobs: release-doxygen: @@ -63,5 +67,6 @@ jobs: if: env.upload env: GITHUB_TOKEN: ${{ github.token }} + USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }} run: | - ./llvm/utils/release/github-upload-release.py --token "$GITHUB_TOKEN" --release "${{ inputs.release-version }}" --user "${{ github.actor }}" upload --files ./*doxygen*.tar.xz + ./llvm/utils/release/github-upload-release.py --token "$GITHUB_TOKEN" --release "${{ inputs.release-version }}" --user "${{ github.actor }}" --user-token "$USER_TOKEN" upload --files ./*doxygen*.tar.xz diff --git a/.github/workflows/release-lit.yml b/.github/workflows/release-lit.yml index 0316ba406041d6..9d6f3140e68830 100644 --- a/.github/workflows/release-lit.yml +++ b/.github/workflows/release-lit.yml @@ -17,6 +17,10 @@ on: description: 'Release Version' required: true type: string +secrets: + RELEASE_TASKS_USER_TOKEN: +description: "Secret used to check user permissions." 
+required: false jobs: release-lit: @@ -36,8 +40,9 @@ jobs: - name: Check Permissions env: GITHUB_TOKEN: ${{ github.token }} + USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }} run: | - ./llvm/utils/release/./github-upload-release.py --token "$GITHUB_TOKEN" --user ${{ github.actor }} check-permissions + ./llvm/utils/release/./github-upload-release.py --token "$GITHUB_TOKEN" --user ${{ github.actor }} --user-token "$USER_TOKEN" check-permissions - name: Setup Cpp uses: aminya/setup-cpp@v1 diff --git a/.github/workflows/release-sources.yml b/.github/workflows/release-sources.yml index 9c5b1a9f017092..edb0449ef7e2c2 100644 --- a/.github/workflows/release-sources.yml +++ b/.github/workflows/release-sources.yml @@ -16,6 +16,10 @@ on: description: Release Version required: true type: string +secrets: + RELEASE_TASKS_USER_TOKEN: +description: "Secret used to check user permissions." +required: false # Run on pull_requests for testing purposes. pull_request: paths: diff --git a/.github/workflows/release-tasks.yml b/.github/workflows/release-tasks.yml index cf42730aaf8170..780dd0ff6325c9 100644 --- a/.github/workflows/release-tasks.yml +++ b/.github/workflows/release-tasks.yml @@ -66,6 +66,9 @@ jobs: with: release-version: ${{ needs.validate-tag.outputs.release-version }} upload: true +# Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use. +secrets: + RELEASE_TASKS_USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }} release-lit: name: Release Lit @@ -73,6 +76,9 @@ jobs: uses: ./.github/workflows/release-lit.yml with: release-version: ${{ needs.validate-tag.outputs.release-version }} +# Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use. 
+secrets: + RELEASE_TASKS_USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }} release-binaries: name: Build Release Binaries @@ -97,6 +103,9 @@ jobs: release-version: ${{ needs.validate-tag.outputs.release-version }} upload: true runs-on: ${{ matrix.runs-on }} +# Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use. +secrets: + RELEASE_TASKS_USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }} release-sources: name: Package Release Sources @@ -109,3 +118,6 @@ jobs: uses: ./.github/workflows/release-sources.yml with: release-version: ${{ needs.validate-tag.outputs.release-version }} +# Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use. +secrets: + RELEASE_TASKS_USER_TOKEN: ${{ secrets.RELEASE_TASKS_USER_TOKEN }} `` https://github.com/llvm/llvm-proje
[llvm-branch-commits] [llvm] release/19.x: workflows/release-tasks: Pass required secrets to all called workflows (#106286) (PR #106491)
https://github.com/tru approved this pull request. https://github.com/llvm/llvm-project/pull/106491
[llvm-branch-commits] [BOLT] Only parse probes for profiled functions in profile-write-pseudo-probes mode (PR #106365)
https://github.com/wlei-llvm approved this pull request. LGTM, thanks. https://github.com/llvm/llvm-project/pull/106365