[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)
nikic wrote: I don't think this should be backported in the current form, because it breaks ABI. This is not strictly impossible at this stage, but also very undesirable. https://github.com/llvm/llvm-project/pull/106008 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)
https://github.com/grypp edited https://github.com/llvm/llvm-project/pull/107659 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)
@@ -209,7 +209,12 @@ struct GPULaneIdOpToNVVM : ConvertOpToLLVMPattern { ConversionPatternRewriter &rewriter) const override { auto loc = op->getLoc(); MLIRContext *context = rewriter.getContext(); -Value newOp = rewriter.create(loc, rewriter.getI32Type()); +LLVM::ConstantRangeAttr bounds = nullptr; +if (std::optional upperBound = op.getUpperBound()) grypp wrote: who is setting the upperbound? I might be missing something https://github.com/llvm/llvm-project/pull/107659 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)
@@ -209,7 +209,12 @@ struct GPULaneIdOpToNVVM : ConvertOpToLLVMPattern { ConversionPatternRewriter &rewriter) const override { auto loc = op->getLoc(); MLIRContext *context = rewriter.getContext(); -Value newOp = rewriter.create(loc, rewriter.getI32Type()); +LLVM::ConstantRangeAttr bounds = nullptr; +if (std::optional upperBound = op.getUpperBound()) + bounds = rewriter.getAttr( + 32, 0, upperBound->getZExtValue()); grypp wrote: So 32 is the bitwidth, and 0 is the lower limit right? maybe we can create symbols to name them. https://github.com/llvm/llvm-project/pull/107659 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)
https://github.com/grypp commented: Thanks for doing that. I think this is going be very useful. I left some comments https://github.com/llvm/llvm-project/pull/107659 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)
@@ -1784,53 +1799,53 @@ def NVVM_CpAsyncBulkWaitGroupOp : NVVM_Op<"cp.async.bulk.wait_group">, }]; } -def NVVM_CpAsyncBulkTensorGlobalToSharedClusterOp : - NVVM_Op<"cp.async.bulk.tensor.shared.cluster.global", - [DeclareOpInterfaceMethods, +def NVVM_CpAsyncBulkTensorGlobalToSharedClusterOp : + NVVM_Op<"cp.async.bulk.tensor.shared.cluster.global", + [DeclareOpInterfaceMethods, AttrSizedOperandSegments]>, Arguments<(ins LLVM_PointerShared:$dstMem, LLVM_AnyPointer:$tmaDescriptor, Variadic:$coordinates, - LLVM_PointerShared:$mbar, + LLVM_PointerShared:$mbar, Variadic:$im2colOffsets, Optional:$multicastMask, Optional:$l2CacheHint, PtxPredicate:$predicate)> { let description = [{ -Initiates an asynchronous copy operation on the tensor data from global -memory to shared memory. +Initiates an asynchronous copy operation on the tensor data from global grypp wrote: I see that you are removing the trailing spaces. It's unrelated. Can we remove it from this PR? https://github.com/llvm/llvm-project/pull/107659 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)
@@ -699,9 +699,21 @@ gpu.module @test_module_32 { } gpu.module @test_module_33 { -// CHECK-LABEL: func @kernel_with_block_size() -// CHECK: attributes {gpu.kernel, gpu.known_block_size = array, nvvm.kernel, nvvm.maxntid = array} - gpu.func @kernel_with_block_size() kernel attributes {known_block_size = array} { +// CHECK-LABEL: func @kernel_with_block_size( +// CHECK: attributes {gpu.kernel, gpu.known_block_size = array, nvvm.kernel, nvvm.maxntid = array} + gpu.func @kernel_with_block_size(%arg0: !llvm.ptr) kernel attributes {known_block_size = array} { grypp wrote: If I remember correctly, you added known_block_size to func.func. So I am wondering is this PR going to work for func.func? https://github.com/llvm/llvm-project/pull/107659 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)
@@ -84,7 +87,7 @@ llvm.func @llvm_nvvm_barrier0() { // CHECK-SAME: i32 %[[barId:.*]], i32 %[[numThreads:.*]]) llvm.func @llvm_nvvm_barrier(%barID : i32, %numberOfThreads : i32) { // CHECK: call void @llvm.nvvm.barrier0() - nvvm.barrier + nvvm.barrier grypp wrote: let's remove unrelated space removing from here as well https://github.com/llvm/llvm-project/pull/107659 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)
grypp wrote: nit: `lowterings` typo https://github.com/llvm/llvm-project/pull/107659 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] 2b33fbe - Revert "[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolu…"
Author: Mingming Liu Date: 2024-09-08T16:45:07-07:00 New Revision: 2b33fbee3f36344786fa63b189387c3bd90c4c3f URL: https://github.com/llvm/llvm-project/commit/2b33fbee3f36344786fa63b189387c3bd90c4c3f DIFF: https://github.com/llvm/llvm-project/commit/2b33fbee3f36344786fa63b189387c3bd90c4c3f.diff LOG: Revert "[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolu…" This reverts commit 9ade4e2646bd52b49e50c1648301da65de90ffa9. Added: Modified: lld/ELF/InputFiles.cpp lld/ELF/LTO.cpp llvm/include/llvm/LTO/Config.h llvm/include/llvm/LTO/LTO.h llvm/lib/LTO/LTO.cpp Removed: diff --git a/lld/ELF/InputFiles.cpp b/lld/ELF/InputFiles.cpp index da69c4882ead21..1570adf1370930 100644 --- a/lld/ELF/InputFiles.cpp +++ b/lld/ELF/InputFiles.cpp @@ -1804,10 +1804,6 @@ void BitcodeFile::parseLazy() { auto *sym = symtab.insert(unique_saver().save(irSym.getName())); sym->resolve(LazySymbol{*this}); symbols[i] = sym; -} else { - // Keep copies of per-module undefined symbols for LTO::GlobalResolutions - // usage. - unique_saver().save(irSym.getName()); } } diff --git a/lld/ELF/LTO.cpp b/lld/ELF/LTO.cpp index f339f1c2c0ec21..935d0a9eab9ee0 100644 --- a/lld/ELF/LTO.cpp +++ b/lld/ELF/LTO.cpp @@ -135,7 +135,6 @@ static lto::Config createConfig() { config->ltoValidateAllVtablesHaveTypeInfos; c.AllVtablesHaveTypeInfos = ctx.ltoAllVtablesHaveTypeInfos; c.AlwaysEmitRegularLTOObj = !config->ltoObjPath.empty(); - c.KeepSymbolNameCopies = false; for (const llvm::StringRef &name : config->thinLTOModulesToCompile) c.ThinLTOModulesToCompile.emplace_back(name); diff --git a/llvm/include/llvm/LTO/Config.h b/llvm/include/llvm/LTO/Config.h index a49cce9f30e20c..482b6e55a19d35 100644 --- a/llvm/include/llvm/LTO/Config.h +++ b/llvm/include/llvm/LTO/Config.h @@ -88,11 +88,6 @@ struct Config { /// want to know a priori all possible output files. bool AlwaysEmitRegularLTOObj = false; - /// If true, the LTO instance creates copies of the symbol names for LTO::run. - /// The lld linker uses string saver to keep symbol names alive and doesn't - /// need to create copies, so it can set this field to false. - bool KeepSymbolNameCopies = true; - /// Allows non-imported definitions to get the potentially more constraining /// visibility from the prevailing definition. FromPrevailing is the default /// because it works for many binary formats. ELF can use the more optimized diff --git a/llvm/include/llvm/LTO/LTO.h b/llvm/include/llvm/LTO/LTO.h index 782f37dc8d4404..949e80a43f0e88 100644 --- a/llvm/include/llvm/LTO/LTO.h +++ b/llvm/include/llvm/LTO/LTO.h @@ -15,9 +15,6 @@ #ifndef LLVM_LTO_LTO_H #define LLVM_LTO_LTO_H -#include - -#include "llvm/ADT/DenseMap.h" #include "llvm/ADT/MapVector.h" #include "llvm/ADT/StringMap.h" #include "llvm/Bitcode/BitcodeReader.h" @@ -26,7 +23,6 @@ #include "llvm/Object/IRSymtab.h" #include "llvm/Support/Caching.h" #include "llvm/Support/Error.h" -#include "llvm/Support/StringSaver.h" #include "llvm/Support/thread.h" #include "llvm/Transforms/IPO/FunctionAttrs.h" #include "llvm/Transforms/IPO/FunctionImport.h" @@ -407,19 +403,10 @@ class LTO { }; }; - // GlobalResolutionSymbolSaver allocator. - std::unique_ptr Alloc; - - // Symbol saver for global resolution map. - std::unique_ptr GlobalResolutionSymbolSaver; - // Global mapping from mangled symbol names to resolutions. - // Make this an unique_ptr to guard against accessing after it has been reset + // Make this an optional to guard against accessing after it has been reset // (to reduce memory after we're done with it). - std::unique_ptr> - GlobalResolutions; - - void releaseGlobalResolutionsMemory(); + std::optional> GlobalResolutions; void addModuleToGlobalRes(ArrayRef Syms, ArrayRef Res, unsigned Partition, diff --git a/llvm/lib/LTO/LTO.cpp b/llvm/lib/LTO/LTO.cpp index 5d9a5cbd18f156..68072563cb33d6 100644 --- a/llvm/lib/LTO/LTO.cpp +++ b/llvm/lib/LTO/LTO.cpp @@ -77,10 +77,6 @@ cl::opt EnableLTOInternalization( "enable-lto-internalization", cl::init(true), cl::Hidden, cl::desc("Enable global value internalization in LTO")); -static cl::opt -LTOKeepSymbolCopies("lto-keep-symbol-copies", cl::init(false), cl::Hidden, -cl::desc("Keep copies of symbols in LTO indexing")); - /// Indicate we are linking with an allocator that supports hot/cold operator /// new interfaces. extern cl::opt SupportsHotColdNew; @@ -591,14 +587,8 @@ LTO::LTO(Config Conf, ThinBackend Backend, : Conf(std::move(Conf)), RegularLTO(ParallelCodeGenParallelismLevel, this->Conf), ThinLTO(std::move(Backend)), - GlobalResolutions( - std::make_unique>()), - LTOMode(LTOMode) { - if (Conf.KeepSymbolNameCopies || LTOKeepSym
[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)
ChuanqiXu9 wrote: > > what the code does is: when we write a on-disk hash table, try to write the > > imported merged hash table in the same process so that we don't need to > > read these tables again. However, in line 329 the function will try to omit > > the data from imported table with the same key which already emitted by the > > current module file. This is the root cause of the problem. > > It's been a while since I looked at this, but as I recall, a fundamental > assumption of MultiOnDiskHashTable is that if we have a lookup result for a > key K in the current file, that result supersedes any results from dependency > files. So lookup won't look in those files if we have a local result (they > are overridden) and merging doesn't take results from those files either. > > So I think the problem probably is that when we form a local result, we need > to (but presumably don't) add all the imported results with the same key to > the local result. Thanks for the insight! I didn't known this. I'll try to make it. https://github.com/llvm/llvm-project/pull/83237 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/yuxuanchen1997 updated https://github.com/llvm/llvm-project/pull/99283 >From 0c712a2fbc5b44e892b37085dbace8ba974c1238 Mon Sep 17 00:00:00 2001 From: Yuxuan Chen Date: Tue, 4 Jun 2024 23:22:00 -0700 Subject: [PATCH] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit --- llvm/docs/Coroutines.rst | 18 +++ llvm/lib/Transforms/Coroutines/CoroInternal.h | 7 + llvm/lib/Transforms/Coroutines/CoroSplit.cpp | 150 +++--- llvm/lib/Transforms/Coroutines/Coroutines.cpp | 27 .../Transforms/Coroutines/coro-split-00.ll| 15 ++ 5 files changed, 191 insertions(+), 26 deletions(-) diff --git a/llvm/docs/Coroutines.rst b/llvm/docs/Coroutines.rst index 36092325e536fb..5679aefcb421d8 100644 --- a/llvm/docs/Coroutines.rst +++ b/llvm/docs/Coroutines.rst @@ -2022,6 +2022,12 @@ The pass CoroSplit builds coroutine frame and outlines resume and destroy parts into separate functions. This pass also lowers `coro.await.suspend.void`_, `coro.await.suspend.bool`_ and `coro.await.suspend.handle`_ intrinsics. +CoroAnnotationElide +--- +This pass finds all usages of coroutines that are "must elide" and replaces +`coro.begin` intrinsic with an address of a coroutine frame placed on its caller +and replaces `coro.alloc` and `coro.free` intrinsics with `false` and `null` +respectively to remove the deallocation code. CoroElide - @@ -2049,6 +2055,18 @@ the coroutine must reach the final suspend point when it get destroyed. This attribute only works for switched-resume coroutines now. +coro_elide_safe +--- + +When a Call or Invoke instruction to switch ABI coroutine `f` is marked with +`coro_elide_safe`, CoroSplitPass generates a `f.noalloc` ramp function. +`f.noalloc` has one more argument than its original ramp function `f`, which is +the pointer to the allocated frame. `f.noalloc` also suppressed any allocations +or deallocations that may be guarded by `@llvm.coro.alloc` and `@llvm.coro.free`. + +CoroAnnotationElidePass performs the heap elision when possible. Note that for +recursive or mutually recursive functions this elision is usually not possible. + Metadata diff --git a/llvm/lib/Transforms/Coroutines/CoroInternal.h b/llvm/lib/Transforms/Coroutines/CoroInternal.h index d535ad7f85d74a..be86f96525b677 100644 --- a/llvm/lib/Transforms/Coroutines/CoroInternal.h +++ b/llvm/lib/Transforms/Coroutines/CoroInternal.h @@ -26,6 +26,13 @@ bool declaresIntrinsics(const Module &M, const std::initializer_list); void replaceCoroFree(CoroIdInst *CoroId, bool Elide); +/// Replaces all @llvm.coro.alloc intrinsics calls associated with a given +/// call @llvm.coro.id instruction with boolean value false. +void suppressCoroAllocs(CoroIdInst *CoroId); +/// Replaces CoroAllocs with boolean value false. +void suppressCoroAllocs(LLVMContext &Context, +ArrayRef CoroAllocs); + /// Attempts to rewrite the location operand of debug intrinsics in terms of /// the coroutine frame pointer, folding pointer offsets into the DIExpression /// of the intrinsic. diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp index 6bf3c75b95113e..494c4d632de95f 100644 --- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp +++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp @@ -25,6 +25,7 @@ #include "llvm/ADT/PriorityWorklist.h" #include "llvm/ADT/SmallPtrSet.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringExtras.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/Twine.h" #include "llvm/Analysis/CFG.h" @@ -1177,6 +1178,14 @@ static void updateAsyncFuncPointerContextSize(coro::Shape &Shape) { Shape.AsyncLowering.AsyncFuncPointer->setInitializer(NewFuncPtrStruct); } +static TypeSize getFrameSizeForShape(coro::Shape &Shape) { + // In the same function all coro.sizes should have the same result type. + auto *SizeIntrin = Shape.CoroSizes.back(); + Module *M = SizeIntrin->getModule(); + const DataLayout &DL = M->getDataLayout(); + return DL.getTypeAllocSize(Shape.FrameTy); +} + static void replaceFrameSizeAndAlignment(coro::Shape &Shape) { if (Shape.ABI == coro::ABI::Async) updateAsyncFuncPointerContextSize(Shape); @@ -1192,10 +1201,8 @@ static void replaceFrameSizeAndAlignment(coro::Shape &Shape) { // In the same function all coro.sizes should have the same result type. auto *SizeIntrin = Shape.CoroSizes.back(); - Module *M = SizeIntrin->getModule(); - const DataLayout &DL = M->getDataLayout(); - auto Size = DL.getTypeAllocSize(Shape.FrameTy); - auto *SizeConstant = ConstantInt::get(SizeIntrin->getType(), Size); + auto *SizeConstant = + ConstantInt::get(SizeIntrin->getType(), getFrameSizeForShape(Shape)); for (CoroSizeInst *CS : Shape.CoroSizes) { CS->replaceAllUsesWith(SizeConstant); @@ -1452,6 +1459,75 @@ struct SwitchCorou
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/yuxuanchen1997 updated https://github.com/llvm/llvm-project/pull/99285 >From 201bca06d4e75bc4fa24ac269ad7b9750f24616f Mon Sep 17 00:00:00 2001 From: Yuxuan Chen Date: Mon, 15 Jul 2024 15:01:39 -0700 Subject: [PATCH] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant --- .../Coroutines/CoroAnnotationElide.h | 36 llvm/lib/Passes/PassBuilder.cpp | 1 + llvm/lib/Passes/PassBuilderPipelines.cpp | 10 +- llvm/lib/Passes/PassRegistry.def | 1 + llvm/lib/Transforms/Coroutines/CMakeLists.txt | 1 + .../Coroutines/CoroAnnotationElide.cpp| 155 ++ llvm/test/Other/new-pm-defaults.ll| 1 + .../Other/new-pm-thinlto-postlink-defaults.ll | 1 + .../new-pm-thinlto-postlink-pgo-defaults.ll | 1 + ...-pm-thinlto-postlink-samplepgo-defaults.ll | 1 + .../Coroutines/coro-transform-must-elide.ll | 75 + 11 files changed, 281 insertions(+), 2 deletions(-) create mode 100644 llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h create mode 100644 llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp create mode 100644 llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll diff --git a/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h new file mode 100644 index 00..352c9e14526697 --- /dev/null +++ b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h @@ -0,0 +1,36 @@ +//===- CoroAnnotationElide.h - Elide attributed safe coroutine calls --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// \file +// This pass transforms all Call or Invoke instructions that are annotated +// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead. +// The frame of the callee coroutine is allocated inside the caller. A pointer +// to the allocated frame will be passed into the `.noalloc` ramp function. +// +//===--===// + +#ifndef LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H +#define LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H + +#include "llvm/Analysis/CGSCCPassManager.h" +#include "llvm/Analysis/LazyCallGraph.h" +#include "llvm/IR/PassManager.h" + +namespace llvm { + +struct CoroAnnotationElidePass : PassInfoMixin { + CoroAnnotationElidePass() {} + + PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, +LazyCallGraph &CG, CGSCCUpdateResult &UR); + + static bool isRequired() { return false; } +}; +} // end namespace llvm + +#endif // LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index 83c1a6712bf4d9..c34f9148cce58b 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -139,6 +139,7 @@ #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" #include "llvm/Transforms/CFGuard.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 7f9e1362e7ef23..4e8e3dcdff4428 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -33,6 +33,7 @@ #include "llvm/Support/VirtualFileSystem.h" #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" @@ -973,8 +974,10 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level, MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) + if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) { MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); +MainCGPipeline.addPass(CoroAnnotationElidePass()); + } // Make sure we don't affect potential future NoRerun CGSCC adaptors. MIWP.addLateModulePass(createModuleToFunctionPassAdaptor( @@ -1016,9 +1019,12 @@ PassBuilder::buildModuleInlinerPipeline(OptimizationLevel Level, buildFunctionSimplificationPipeline(Level, Phase), PTO.EagerlyInvalidateAnalyses)); - if (Phase != ThinOrFullLTO
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/yuxuanchen1997 updated https://github.com/llvm/llvm-project/pull/99283 >From 0c712a2fbc5b44e892b37085dbace8ba974c1238 Mon Sep 17 00:00:00 2001 From: Yuxuan Chen Date: Tue, 4 Jun 2024 23:22:00 -0700 Subject: [PATCH] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit --- llvm/docs/Coroutines.rst | 18 +++ llvm/lib/Transforms/Coroutines/CoroInternal.h | 7 + llvm/lib/Transforms/Coroutines/CoroSplit.cpp | 150 +++--- llvm/lib/Transforms/Coroutines/Coroutines.cpp | 27 .../Transforms/Coroutines/coro-split-00.ll| 15 ++ 5 files changed, 191 insertions(+), 26 deletions(-) diff --git a/llvm/docs/Coroutines.rst b/llvm/docs/Coroutines.rst index 36092325e536fb..5679aefcb421d8 100644 --- a/llvm/docs/Coroutines.rst +++ b/llvm/docs/Coroutines.rst @@ -2022,6 +2022,12 @@ The pass CoroSplit builds coroutine frame and outlines resume and destroy parts into separate functions. This pass also lowers `coro.await.suspend.void`_, `coro.await.suspend.bool`_ and `coro.await.suspend.handle`_ intrinsics. +CoroAnnotationElide +--- +This pass finds all usages of coroutines that are "must elide" and replaces +`coro.begin` intrinsic with an address of a coroutine frame placed on its caller +and replaces `coro.alloc` and `coro.free` intrinsics with `false` and `null` +respectively to remove the deallocation code. CoroElide - @@ -2049,6 +2055,18 @@ the coroutine must reach the final suspend point when it get destroyed. This attribute only works for switched-resume coroutines now. +coro_elide_safe +--- + +When a Call or Invoke instruction to switch ABI coroutine `f` is marked with +`coro_elide_safe`, CoroSplitPass generates a `f.noalloc` ramp function. +`f.noalloc` has one more argument than its original ramp function `f`, which is +the pointer to the allocated frame. `f.noalloc` also suppressed any allocations +or deallocations that may be guarded by `@llvm.coro.alloc` and `@llvm.coro.free`. + +CoroAnnotationElidePass performs the heap elision when possible. Note that for +recursive or mutually recursive functions this elision is usually not possible. + Metadata diff --git a/llvm/lib/Transforms/Coroutines/CoroInternal.h b/llvm/lib/Transforms/Coroutines/CoroInternal.h index d535ad7f85d74a..be86f96525b677 100644 --- a/llvm/lib/Transforms/Coroutines/CoroInternal.h +++ b/llvm/lib/Transforms/Coroutines/CoroInternal.h @@ -26,6 +26,13 @@ bool declaresIntrinsics(const Module &M, const std::initializer_list); void replaceCoroFree(CoroIdInst *CoroId, bool Elide); +/// Replaces all @llvm.coro.alloc intrinsics calls associated with a given +/// call @llvm.coro.id instruction with boolean value false. +void suppressCoroAllocs(CoroIdInst *CoroId); +/// Replaces CoroAllocs with boolean value false. +void suppressCoroAllocs(LLVMContext &Context, +ArrayRef CoroAllocs); + /// Attempts to rewrite the location operand of debug intrinsics in terms of /// the coroutine frame pointer, folding pointer offsets into the DIExpression /// of the intrinsic. diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp index 6bf3c75b95113e..494c4d632de95f 100644 --- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp +++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp @@ -25,6 +25,7 @@ #include "llvm/ADT/PriorityWorklist.h" #include "llvm/ADT/SmallPtrSet.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringExtras.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/Twine.h" #include "llvm/Analysis/CFG.h" @@ -1177,6 +1178,14 @@ static void updateAsyncFuncPointerContextSize(coro::Shape &Shape) { Shape.AsyncLowering.AsyncFuncPointer->setInitializer(NewFuncPtrStruct); } +static TypeSize getFrameSizeForShape(coro::Shape &Shape) { + // In the same function all coro.sizes should have the same result type. + auto *SizeIntrin = Shape.CoroSizes.back(); + Module *M = SizeIntrin->getModule(); + const DataLayout &DL = M->getDataLayout(); + return DL.getTypeAllocSize(Shape.FrameTy); +} + static void replaceFrameSizeAndAlignment(coro::Shape &Shape) { if (Shape.ABI == coro::ABI::Async) updateAsyncFuncPointerContextSize(Shape); @@ -1192,10 +1201,8 @@ static void replaceFrameSizeAndAlignment(coro::Shape &Shape) { // In the same function all coro.sizes should have the same result type. auto *SizeIntrin = Shape.CoroSizes.back(); - Module *M = SizeIntrin->getModule(); - const DataLayout &DL = M->getDataLayout(); - auto Size = DL.getTypeAllocSize(Shape.FrameTy); - auto *SizeConstant = ConstantInt::get(SizeIntrin->getType(), Size); + auto *SizeConstant = + ConstantInt::get(SizeIntrin->getType(), getFrameSizeForShape(Shape)); for (CoroSizeInst *CS : Shape.CoroSizes) { CS->replaceAllUsesWith(SizeConstant); @@ -1452,6 +1459,75 @@ struct SwitchCorou
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/yuxuanchen1997 updated https://github.com/llvm/llvm-project/pull/99285 >From 201bca06d4e75bc4fa24ac269ad7b9750f24616f Mon Sep 17 00:00:00 2001 From: Yuxuan Chen Date: Mon, 15 Jul 2024 15:01:39 -0700 Subject: [PATCH] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant --- .../Coroutines/CoroAnnotationElide.h | 36 llvm/lib/Passes/PassBuilder.cpp | 1 + llvm/lib/Passes/PassBuilderPipelines.cpp | 10 +- llvm/lib/Passes/PassRegistry.def | 1 + llvm/lib/Transforms/Coroutines/CMakeLists.txt | 1 + .../Coroutines/CoroAnnotationElide.cpp| 155 ++ llvm/test/Other/new-pm-defaults.ll| 1 + .../Other/new-pm-thinlto-postlink-defaults.ll | 1 + .../new-pm-thinlto-postlink-pgo-defaults.ll | 1 + ...-pm-thinlto-postlink-samplepgo-defaults.ll | 1 + .../Coroutines/coro-transform-must-elide.ll | 75 + 11 files changed, 281 insertions(+), 2 deletions(-) create mode 100644 llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h create mode 100644 llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp create mode 100644 llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll diff --git a/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h new file mode 100644 index 00..352c9e14526697 --- /dev/null +++ b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h @@ -0,0 +1,36 @@ +//===- CoroAnnotationElide.h - Elide attributed safe coroutine calls --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// \file +// This pass transforms all Call or Invoke instructions that are annotated +// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead. +// The frame of the callee coroutine is allocated inside the caller. A pointer +// to the allocated frame will be passed into the `.noalloc` ramp function. +// +//===--===// + +#ifndef LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H +#define LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H + +#include "llvm/Analysis/CGSCCPassManager.h" +#include "llvm/Analysis/LazyCallGraph.h" +#include "llvm/IR/PassManager.h" + +namespace llvm { + +struct CoroAnnotationElidePass : PassInfoMixin { + CoroAnnotationElidePass() {} + + PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, +LazyCallGraph &CG, CGSCCUpdateResult &UR); + + static bool isRequired() { return false; } +}; +} // end namespace llvm + +#endif // LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index 83c1a6712bf4d9..c34f9148cce58b 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -139,6 +139,7 @@ #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" #include "llvm/Transforms/CFGuard.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 7f9e1362e7ef23..4e8e3dcdff4428 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -33,6 +33,7 @@ #include "llvm/Support/VirtualFileSystem.h" #include "llvm/Target/TargetMachine.h" #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h" +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" #include "llvm/Transforms/Coroutines/CoroCleanup.h" #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h" #include "llvm/Transforms/Coroutines/CoroEarly.h" @@ -973,8 +974,10 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level, MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) + if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) { MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); +MainCGPipeline.addPass(CoroAnnotationElidePass()); + } // Make sure we don't affect potential future NoRerun CGSCC adaptors. MIWP.addLateModulePass(createModuleToFunctionPassAdaptor( @@ -1016,9 +1019,12 @@ PassBuilder::buildModuleInlinerPipeline(OptimizationLevel Level, buildFunctionSimplificationPipeline(Level, Phase), PTO.EagerlyInvalidateAnalyses)); - if (Phase != ThinOrFullLTO
[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)
ChuanqiXu9 wrote: > > what the code does is: when we write a on-disk hash table, try to write the > > imported merged hash table in the same process so that we don't need to > > read these tables again. However, in line 329 the function will try to omit > > the data from imported table with the same key which already emitted by the > > current module file. This is the root cause of the problem. > > It's been a while since I looked at this, but as I recall, a fundamental > assumption of MultiOnDiskHashTable is that if we have a lookup result for a > key K in the current file, that result supersedes any results from dependency > files. So lookup won't look in those files if we have a local result (they > are overridden) and merging doesn't take results from those files either. > > So I think the problem probably is that when we form a local result, we need > to (but presumably don't) add all the imported results with the same key to > the local result. I took a second look at MultiOnDiskHashTable.h and it looks like it is what I did. There is not any logic to supersedes results from dependency files in MultiOnDiskHashTable itself (or it doesn't have the concept of dependency file in MultiOnDiskHashTable.) The only similar thing is `MultiOnDiskHashTableGenerator::emit`, where the contents from the merged tables of imported files will be come the contents of the current file. And the corresponding imported files will be marked as overriden and removed in `find`. And in this patch, I've already tried to write all the contents into the current one. So I feel the concern is already addressed. https://github.com/llvm/llvm-project/pull/83237 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits