[llvm-branch-commits] [libcxx] [libc++] Implement std::move_only_function (P0288R9) (PR #94670)
@@ -0,0 +1,93 @@ +//===--===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef _LIBCPP___FUNCTIONAL_MOVE_ONLY_FUNCTION_H +#define _LIBCPP___FUNCTIONAL_MOVE_ONLY_FUNCTION_H + +#include <__config> + +#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER) +# pragma GCC system_header +#endif + +#if _LIBCPP_STD_VER >= 23 + +// move_only_function design: +// +// move_only_function has a small buffer with a size of `3 * sizeof(void*)` bytes. This buffer can only be used when the +// object that should be stored is trivially relocatable (currently only when it is trivially move constructible and +// trivially destructible). There is also a bool in the lower bits of the vptr stored which is set when the contained +// object is not trivially destructible. +// +// trivially relocatable: It would also be possible to store nothrow_move_constructible types, but that would mean +// that move_only_function itself would not be trivially relocatable anymore. The decision to keep move_only_function +// trivially relocatable was made because we expect move_only_function to be mostly used to store a functor. To only +// forward functors there is std::function_ref (not voted in yet, expected in C++26). +// +// buffer size: We did a survey of six implementations from various vendors. Three of them had a buffer size of 24 bytes +// on 64 bit systems. This also allows storing a std::string or std::vector inside the small buffer (once the compiler +// has full support of trivially_relocatable annotations). +// +// trivially-destructible bit: This allows us to keep the overall binary size smaller because we don't have to store +// a pointer to a noop function inside the vtable. It also avoids loading the vtable during destruction, potentially +// resulting in fewer cache misses. The downside is that calling the function now also requires setting the lower bits +// of the pointer to zero, but this is a very fast operation on modern CPUs. vogelsgesang wrote: Does the same also apply for `function_ref`? Should it be possible to turn a `move_only_function` / `copyable_function` into a `function_ref` without double-wrapping? https://github.com/llvm/llvm-project/pull/94670 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1967,22 +2047,13 @@ splitCoroutine(Function &F, SmallVectorImpl &Clones, for (DbgVariableRecord *DVR : DbgVariableRecords) coro::salvageDebugInfo(ArgToAllocaMap, *DVR, Shape.OptimizeFrame, false /*UseEntryValue*/); - return Shape; -} -/// Remove calls to llvm.coro.end in the original function. -static void removeCoroEndsFromRampFunction(const coro::Shape &Shape) { - if (Shape.ABI != coro::ABI::Switch) { -for (auto *End : Shape.CoroEnds) { - replaceCoroEnd(End, Shape, Shape.FramePtr, /*in resume*/ false, nullptr); -} - } else { -for (llvm::AnyCoroEndInst *End : Shape.CoroEnds) { - auto &Context = End->getContext(); - End->replaceAllUsesWith(ConstantInt::getFalse(Context)); - End->eraseFromParent(); -} + removeCoroEndsFromRampFunction(Shape); + + if (!isNoSuspendCoroutine && Shape.ABI == coro::ABI::Switch) { vogelsgesang wrote: Additionally, we should only create this `noalloc` variant, if the coroutines promise type itself is marked as "elidable". **Example** ``` struct [[coro_must_elide]] ElidedTask { ... } struct NonElidedTask { ... } ElidedTask foo(); NonElidedTask bar(); NonElidedTask foobar() { co_await foo(); co_await bar(); } ``` **Current behavior**: We create `foo.noalloc`, `bar.noalloc` and `foobar.alloc`. `foo.noalloc` gets actually called. However, `bar.noalloc` and `foobar.alloc` are dead code. The `.noalloc` variants will never be used, as the `NonElidedTask` is not marked with `[[coro_must_elide]]`. I think we should avoid generating this dead code https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/vogelsgesang edited https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/vogelsgesang approved this pull request. LGTM on a high level. But I am not experienced in this code area, so you might want to wait for another review https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter { setCoroInfo(F, Shape, Clones); } + static Function *createNoAllocVariant(Function &F, coro::Shape &Shape, +SmallVectorImpl &Clones) { +auto *OrigFnTy = F.getFunctionType(); +auto OldParams = OrigFnTy->params(); + +SmallVector NewParams; +NewParams.reserve(OldParams.size() + 1); +for (Type *T : OldParams) { + NewParams.push_back(T); +} +NewParams.push_back(PointerType::getUnqual(Shape.FrameTy)); + +auto *NewFnTy = FunctionType::get(OrigFnTy->getReturnType(), NewParams, + OrigFnTy->isVarArg()); +Function *NoAllocF = +Function::Create(NewFnTy, F.getLinkage(), F.getName() + ".noalloc"); vogelsgesang wrote: I wonder if the function should be marked as `linkage = internal`. To my understanding, the `[[coro_must_elide]]` does not work across translation units, yet. As such, there won't be any cross-TU calls to the `.noalloc` function https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/vogelsgesang edited https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter { setCoroInfo(F, Shape, Clones); } + static Function *createNoAllocVariant(Function &F, coro::Shape &Shape, +SmallVectorImpl &Clones) { +auto *OrigFnTy = F.getFunctionType(); +auto OldParams = OrigFnTy->params(); + +SmallVector NewParams; +NewParams.reserve(OldParams.size() + 1); +for (Type *T : OldParams) { + NewParams.push_back(T); +} vogelsgesang wrote: Can this be simplified as ```suggestion NewParams.append(OldParams->begin(), OldParams->end()); ``` ? https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -0,0 +1,135 @@ +//===- CoroSplit.cpp - Converts a coroutine into a state machine --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +//===--===// + +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" + +#include "llvm/Analysis/LazyCallGraph.h" +#include "llvm/Analysis/OptimizationRemarkEmitter.h" +#include "llvm/IR/Analysis.h" +#include "llvm/IR/IRBuilder.h" +#include "llvm/IR/InstIterator.h" +#include "llvm/IR/Instruction.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/PassManager.h" +#include "llvm/Transforms/Utils/CallGraphUpdater.h" + +#include + +using namespace llvm; + +#define DEBUG_TYPE "coro-annotation-elide" + +#define CORO_MUST_ELIDE_ANNOTATION "coro_must_elide" + +static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) { + for (Instruction &I : F->getEntryBlock()) +if (!isa(&I)) + return &I; + llvm_unreachable("no terminator in the entry block"); +} + +static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize, +Align FrameAlign) { + LLVMContext &C = Caller->getContext(); + BasicBlock::iterator InsertPt = + getFirstNonAllocaInTheEntryBlock(Caller)->getIterator(); + const DataLayout &DL = Caller->getDataLayout(); + auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize); + auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt); + Frame->setAlignment(FrameAlign); + return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt); vogelsgesang wrote: do we also need to emit any specific debug info, such that the elided frame can be inspected from the debugger? https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter { setCoroInfo(F, Shape, Clones); } + static Function *createNoAllocVariant(Function &F, coro::Shape &Shape, +SmallVectorImpl &Clones) { +auto *OrigFnTy = F.getFunctionType(); +auto OldParams = OrigFnTy->params(); + +SmallVector NewParams; +NewParams.reserve(OldParams.size() + 1); +for (Type *T : OldParams) { + NewParams.push_back(T); +} +NewParams.push_back(PointerType::getUnqual(Shape.FrameTy)); + +auto *NewFnTy = FunctionType::get(OrigFnTy->getReturnType(), NewParams, + OrigFnTy->isVarArg()); +Function *NoAllocF = +Function::Create(NewFnTy, F.getLinkage(), F.getName() + ".noalloc"); vogelsgesang wrote: Intuitively, I would think so, yes. But I never looked into ThinLTO before, so I don't know for sure. Does #99285 enable `CoroAnnotationElide` already for ThinLTO? https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/vogelsgesang edited https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -968,8 +969,8 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level, // it's been modified since. MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); + MainCGPipeline.addPass(CoroAnnotationElidePass()); vogelsgesang wrote: should this also be added to `buildModuleInlinerPipeline`? https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -968,8 +969,8 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level, // it's been modified since. MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); + MainCGPipeline.addPass(CoroAnnotationElidePass()); vogelsgesang wrote: Should this optimization also work for ThinLTO? (also see discussion in https://github.com/llvm/llvm-project/pull/99283#discussion_r1684973257) I read this pass setup a bit more carefully now, and I don't think it will work for ThinLTO. Why? * Both `CoroAnnotationElide` and `CoroSplit` are added to `buildInlinerPipeline`. * `buildInlinerPipeline` is used by `buildModuleSimplificationPipeline`. * `buildModuleSimplificationPipeline` is part of both `buildThinLTOPreLinkDefaultPipeline` and `buildThinLTODefaultPipeline` * -> `CoroSplit` already runs as part of `buildThinLTOPreLinkDefaultPipeline` (i.e. on a per-translation-unit-level) * -> by the time we reach `buildThinLTODefaultPipeline` (i.e. the cross-TU part of ThinLTO), the coroutines are already split * -> although, `CoroAnnotationElide` is run a 2nd time as part of cross-TU optimization, it will be a no-op due to the `Caller->isPresplitCoroutine()` check in `CoroAnnotationElide.cpp:120` https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/vogelsgesang edited https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/vogelsgesang edited https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -968,8 +969,8 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level, // it's been modified since. MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor( RequireAnalysisPass())); - MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0)); + MainCGPipeline.addPass(CoroAnnotationElidePass()); vogelsgesang wrote: Thanks for that context! I assume, as part of relanding #90310 you would then also adjust the way the `CoroAnnotationElide` registration? Do you happen to know the answer to https://github.com/llvm/llvm-project/pull/99283#discussion_r1684973257 or can provide some guidance there? https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1967,22 +2047,13 @@ splitCoroutine(Function &F, SmallVectorImpl &Clones, for (DbgVariableRecord *DVR : DbgVariableRecords) coro::salvageDebugInfo(ArgToAllocaMap, *DVR, Shape.OptimizeFrame, false /*UseEntryValue*/); - return Shape; -} -/// Remove calls to llvm.coro.end in the original function. -static void removeCoroEndsFromRampFunction(const coro::Shape &Shape) { - if (Shape.ABI != coro::ABI::Switch) { -for (auto *End : Shape.CoroEnds) { - replaceCoroEnd(End, Shape, Shape.FramePtr, /*in resume*/ false, nullptr); -} - } else { -for (llvm::AnyCoroEndInst *End : Shape.CoroEnds) { - auto &Context = End->getContext(); - End->replaceAllUsesWith(ConstantInt::getFalse(Context)); - End->eraseFromParent(); -} + removeCoroEndsFromRampFunction(Shape); + + if (!isNoSuspendCoroutine && Shape.ABI == coro::ABI::Switch) { vogelsgesang wrote: Yes, I would be in favor of adding a second attribute. What do you & others think? CC @ChuanqiXu9 https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1967,22 +2047,13 @@ splitCoroutine(Function &F, SmallVectorImpl &Clones, for (DbgVariableRecord *DVR : DbgVariableRecords) coro::salvageDebugInfo(ArgToAllocaMap, *DVR, Shape.OptimizeFrame, false /*UseEntryValue*/); - return Shape; -} -/// Remove calls to llvm.coro.end in the original function. -static void removeCoroEndsFromRampFunction(const coro::Shape &Shape) { - if (Shape.ABI != coro::ABI::Switch) { -for (auto *End : Shape.CoroEnds) { - replaceCoroEnd(End, Shape, Shape.FramePtr, /*in resume*/ false, nullptr); -} - } else { -for (llvm::AnyCoroEndInst *End : Shape.CoroEnds) { - auto &Context = End->getContext(); - End->replaceAllUsesWith(ConstantInt::getFalse(Context)); - End->eraseFromParent(); -} + removeCoroEndsFromRampFunction(Shape); + + if (!isNoSuspendCoroutine && Shape.ABI == coro::ABI::Switch) { vogelsgesang wrote: fine by me. Doesn't need to be addressed as part of this commit, can also be done in a follow-up commit. Maybe add a `FIXME` to the source code, such that other people reading the source code in the meantime are aware of this future work? https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -2049,6 +2055,22 @@ the coroutine must reach the final suspend point when it get destroyed. This attribute only works for switched-resume coroutines now. +coro_must_elide +--- + +When a Call or Invoke instruction is marked with `coro_must_elide`, +CoroAnnotationElidePass performs heap elision when possible. Note that for vogelsgesang wrote: I think the name `coro_must_elide` is a misnormer. "must elide" sounds as if it would be a compilation error if elision fails. However, this is not the case https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -2049,6 +2055,22 @@ the coroutine must reach the final suspend point when it get destroyed. This attribute only works for switched-resume coroutines now. +coro_must_elide +--- + +When a Call or Invoke instruction is marked with `coro_must_elide`, +CoroAnnotationElidePass performs heap elision when possible. Note that for vogelsgesang wrote: love it! https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits