[llvm-branch-commits] [libcxx] [libc++] Implement std::move_only_function (P0288R9) (PR #94670)

2024-06-22 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -0,0 +1,93 @@
+//===--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef _LIBCPP___FUNCTIONAL_MOVE_ONLY_FUNCTION_H
+#define _LIBCPP___FUNCTIONAL_MOVE_ONLY_FUNCTION_H
+
+#include <__config>
+
+#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+#  pragma GCC system_header
+#endif
+
+#if _LIBCPP_STD_VER >= 23
+
+// move_only_function design:
+//
+// move_only_function has a small buffer with a size of `3 * sizeof(void*)` 
bytes. This buffer can only be used when the
+// object that should be stored is trivially relocatable (currently only when 
it is trivially move constructible and
+// trivially destructible). There is also a bool in the lower bits of the vptr 
stored which is set when the contained
+// object is not trivially destructible.
+//
+// trivially relocatable: It would also be possible to store 
nothrow_move_constructible types, but that would mean
+// that move_only_function itself would not be trivially relocatable anymore. 
The decision to keep move_only_function
+// trivially relocatable was made because we expect move_only_function to be 
mostly used to store a functor. To only
+// forward functors there is std::function_ref (not voted in yet, expected in 
C++26).
+//
+// buffer size: We did a survey of six implementations from various vendors. 
Three of them had a buffer size of 24 bytes
+// on 64 bit systems. This also allows storing a std::string or std::vector 
inside the small buffer (once the compiler
+// has full support of trivially_relocatable annotations).
+//
+// trivially-destructible bit: This allows us to keep the overall binary size 
smaller because we don't have to store
+// a pointer to a noop function inside the vtable. It also avoids loading the 
vtable during destruction, potentially
+// resulting in fewer cache misses. The downside is that calling the function 
now also requires setting the lower bits
+// of the pointer to zero, but this is a very fast operation on modern CPUs.

vogelsgesang wrote:

Does the same also apply for `function_ref`? Should it be possible to turn a 
`move_only_function` / `copyable_function` into a `function_ref` without 
double-wrapping?

https://github.com/llvm/llvm-project/pull/94670
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -1967,22 +2047,13 @@ splitCoroutine(Function &F, SmallVectorImpl 
&Clones,
   for (DbgVariableRecord *DVR : DbgVariableRecords)
 coro::salvageDebugInfo(ArgToAllocaMap, *DVR, Shape.OptimizeFrame,
false /*UseEntryValue*/);
-  return Shape;
-}
 
-/// Remove calls to llvm.coro.end in the original function.
-static void removeCoroEndsFromRampFunction(const coro::Shape &Shape) {
-  if (Shape.ABI != coro::ABI::Switch) {
-for (auto *End : Shape.CoroEnds) {
-  replaceCoroEnd(End, Shape, Shape.FramePtr, /*in resume*/ false, nullptr);
-}
-  } else {
-for (llvm::AnyCoroEndInst *End : Shape.CoroEnds) {
-  auto &Context = End->getContext();
-  End->replaceAllUsesWith(ConstantInt::getFalse(Context));
-  End->eraseFromParent();
-}
+  removeCoroEndsFromRampFunction(Shape);
+
+  if (!isNoSuspendCoroutine && Shape.ABI == coro::ABI::Switch) {

vogelsgesang wrote:

Additionally, we should only create this `noalloc` variant, if the coroutines 
promise type itself is marked as "elidable".

**Example**

```
struct [[coro_must_elide]] ElidedTask { ... }
struct NonElidedTask { ... }

ElidedTask foo();
NonElidedTask bar();

NonElidedTask foobar() {
co_await foo();
co_await bar();
}
```

**Current behavior**: We create `foo.noalloc`, `bar.noalloc` and `foobar.alloc`.

`foo.noalloc` gets actually called. However, `bar.noalloc` and `foobar.alloc` 
are dead code. The `.noalloc` variants will never be used, as the 
`NonElidedTask` is not marked with `[[coro_must_elide]]`.

I think we should avoid generating this dead code

https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits

https://github.com/vogelsgesang edited 
https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits

https://github.com/vogelsgesang approved this pull request.

LGTM on a high level.

But I am not experienced in this code area, so you might want to wait for 
another review

https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter {
 setCoroInfo(F, Shape, Clones);
   }
 
+  static Function *createNoAllocVariant(Function &F, coro::Shape &Shape,
+SmallVectorImpl &Clones) {
+auto *OrigFnTy = F.getFunctionType();
+auto OldParams = OrigFnTy->params();
+
+SmallVector NewParams;
+NewParams.reserve(OldParams.size() + 1);
+for (Type *T : OldParams) {
+  NewParams.push_back(T);
+}
+NewParams.push_back(PointerType::getUnqual(Shape.FrameTy));
+
+auto *NewFnTy = FunctionType::get(OrigFnTy->getReturnType(), NewParams,
+  OrigFnTy->isVarArg());
+Function *NoAllocF =
+Function::Create(NewFnTy, F.getLinkage(), F.getName() + ".noalloc");

vogelsgesang wrote:

I wonder if the function should be marked as `linkage = internal`. To my 
understanding, the `[[coro_must_elide]]` does not work across translation 
units, yet. As such, there won't be any cross-TU calls to the `.noalloc` 
function

https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits

https://github.com/vogelsgesang edited 
https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter {
 setCoroInfo(F, Shape, Clones);
   }
 
+  static Function *createNoAllocVariant(Function &F, coro::Shape &Shape,
+SmallVectorImpl &Clones) {
+auto *OrigFnTy = F.getFunctionType();
+auto OldParams = OrigFnTy->params();
+
+SmallVector NewParams;
+NewParams.reserve(OldParams.size() + 1);
+for (Type *T : OldParams) {
+  NewParams.push_back(T);
+}

vogelsgesang wrote:

Can this be simplified as
```suggestion
NewParams.append(OldParams->begin(), OldParams->end());
```

?

https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -0,0 +1,135 @@
+//===- CoroSplit.cpp - Converts a coroutine into a state machine 
--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+//===--===//
+
+#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h"
+
+#include "llvm/Analysis/LazyCallGraph.h"
+#include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/IR/Analysis.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/Instruction.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/Transforms/Utils/CallGraphUpdater.h"
+
+#include 
+
+using namespace llvm;
+
+#define DEBUG_TYPE "coro-annotation-elide"
+
+#define CORO_MUST_ELIDE_ANNOTATION "coro_must_elide"
+
+static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) {
+  for (Instruction &I : F->getEntryBlock())
+if (!isa(&I))
+  return &I;
+  llvm_unreachable("no terminator in the entry block");
+}
+
+static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize,
+Align FrameAlign) {
+  LLVMContext &C = Caller->getContext();
+  BasicBlock::iterator InsertPt =
+  getFirstNonAllocaInTheEntryBlock(Caller)->getIterator();
+  const DataLayout &DL = Caller->getDataLayout();
+  auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize);
+  auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt);
+  Frame->setAlignment(FrameAlign);
+  return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt);

vogelsgesang wrote:

do we also need to emit any specific debug info, such that the elided frame can 
be inspected from the debugger?

https://github.com/llvm/llvm-project/pull/99285
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter {
 setCoroInfo(F, Shape, Clones);
   }
 
+  static Function *createNoAllocVariant(Function &F, coro::Shape &Shape,
+SmallVectorImpl &Clones) {
+auto *OrigFnTy = F.getFunctionType();
+auto OldParams = OrigFnTy->params();
+
+SmallVector NewParams;
+NewParams.reserve(OldParams.size() + 1);
+for (Type *T : OldParams) {
+  NewParams.push_back(T);
+}
+NewParams.push_back(PointerType::getUnqual(Shape.FrameTy));
+
+auto *NewFnTy = FunctionType::get(OrigFnTy->getReturnType(), NewParams,
+  OrigFnTy->isVarArg());
+Function *NoAllocF =
+Function::Create(NewFnTy, F.getLinkage(), F.getName() + ".noalloc");

vogelsgesang wrote:

Intuitively, I would think so, yes. But I never looked into ThinLTO before, so 
I don't know for sure.

Does #99285 enable `CoroAnnotationElide` already for ThinLTO?


https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits

https://github.com/vogelsgesang edited 
https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -968,8 +969,8 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level,
   // it's been modified since.
   MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor(
   RequireAnalysisPass()));
-
   MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0));
+  MainCGPipeline.addPass(CoroAnnotationElidePass());

vogelsgesang wrote:

should this also be added to `buildModuleInlinerPipeline`?

https://github.com/llvm/llvm-project/pull/99285
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -968,8 +969,8 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level,
   // it's been modified since.
   MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor(
   RequireAnalysisPass()));
-
   MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0));
+  MainCGPipeline.addPass(CoroAnnotationElidePass());

vogelsgesang wrote:

Should this optimization also work for ThinLTO? (also see discussion in 
https://github.com/llvm/llvm-project/pull/99283#discussion_r1684973257)

I read this pass setup a bit more carefully now, and I don't think it will work 
for ThinLTO.

Why?

* Both `CoroAnnotationElide` and `CoroSplit` are added to 
`buildInlinerPipeline`.
* `buildInlinerPipeline` is used by `buildModuleSimplificationPipeline`.
* `buildModuleSimplificationPipeline` is part of both 
`buildThinLTOPreLinkDefaultPipeline` and `buildThinLTODefaultPipeline`
* -> `CoroSplit` already runs as part of `buildThinLTOPreLinkDefaultPipeline` 
(i.e. on a per-translation-unit-level)
* -> by the time we reach `buildThinLTODefaultPipeline` (i.e. the cross-TU part 
of ThinLTO), the coroutines are already split
* -> although, `CoroAnnotationElide` is run a 2nd time as part of cross-TU 
optimization, it will be a no-op due to the `Caller->isPresplitCoroutine()` 
check in `CoroAnnotationElide.cpp:120`

https://github.com/llvm/llvm-project/pull/99285
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits

https://github.com/vogelsgesang edited 
https://github.com/llvm/llvm-project/pull/99285
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits

https://github.com/vogelsgesang edited 
https://github.com/llvm/llvm-project/pull/99285
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)

2024-07-19 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -968,8 +969,8 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level,
   // it's been modified since.
   MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor(
   RequireAnalysisPass()));
-
   MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0));
+  MainCGPipeline.addPass(CoroAnnotationElidePass());

vogelsgesang wrote:

Thanks for that context! I assume, as part of relanding #90310 you would then 
also adjust the way the `CoroAnnotationElide` registration?

Do you happen to know the answer to 
https://github.com/llvm/llvm-project/pull/99283#discussion_r1684973257 or can 
provide some guidance there?

https://github.com/llvm/llvm-project/pull/99285
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-07-25 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -1967,22 +2047,13 @@ splitCoroutine(Function &F, SmallVectorImpl 
&Clones,
   for (DbgVariableRecord *DVR : DbgVariableRecords)
 coro::salvageDebugInfo(ArgToAllocaMap, *DVR, Shape.OptimizeFrame,
false /*UseEntryValue*/);
-  return Shape;
-}
 
-/// Remove calls to llvm.coro.end in the original function.
-static void removeCoroEndsFromRampFunction(const coro::Shape &Shape) {
-  if (Shape.ABI != coro::ABI::Switch) {
-for (auto *End : Shape.CoroEnds) {
-  replaceCoroEnd(End, Shape, Shape.FramePtr, /*in resume*/ false, nullptr);
-}
-  } else {
-for (llvm::AnyCoroEndInst *End : Shape.CoroEnds) {
-  auto &Context = End->getContext();
-  End->replaceAllUsesWith(ConstantInt::getFalse(Context));
-  End->eraseFromParent();
-}
+  removeCoroEndsFromRampFunction(Shape);
+
+  if (!isNoSuspendCoroutine && Shape.ABI == coro::ABI::Switch) {

vogelsgesang wrote:

Yes, I would be in favor of adding a second attribute. What do you & others 
think?
CC @ChuanqiXu9 

https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-07-30 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -1967,22 +2047,13 @@ splitCoroutine(Function &F, SmallVectorImpl 
&Clones,
   for (DbgVariableRecord *DVR : DbgVariableRecords)
 coro::salvageDebugInfo(ArgToAllocaMap, *DVR, Shape.OptimizeFrame,
false /*UseEntryValue*/);
-  return Shape;
-}
 
-/// Remove calls to llvm.coro.end in the original function.
-static void removeCoroEndsFromRampFunction(const coro::Shape &Shape) {
-  if (Shape.ABI != coro::ABI::Switch) {
-for (auto *End : Shape.CoroEnds) {
-  replaceCoroEnd(End, Shape, Shape.FramePtr, /*in resume*/ false, nullptr);
-}
-  } else {
-for (llvm::AnyCoroEndInst *End : Shape.CoroEnds) {
-  auto &Context = End->getContext();
-  End->replaceAllUsesWith(ConstantInt::getFalse(Context));
-  End->eraseFromParent();
-}
+  removeCoroEndsFromRampFunction(Shape);
+
+  if (!isNoSuspendCoroutine && Shape.ABI == coro::ABI::Switch) {

vogelsgesang wrote:

fine by me. Doesn't need to be addressed as part of this commit, can also be 
done in a follow-up commit.
Maybe add a `FIXME` to the source code, such that other people reading the 
source code in the meantime are aware of this future work?

https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-08-21 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -2049,6 +2055,22 @@ the coroutine must reach the final suspend point when it 
get destroyed.
 
 This attribute only works for switched-resume coroutines now.
 
+coro_must_elide
+---
+
+When a Call or Invoke instruction is marked with `coro_must_elide`,
+CoroAnnotationElidePass performs heap elision when possible. Note that for

vogelsgesang wrote:

I think the name `coro_must_elide` is a misnormer. "must elide" sounds as if it 
would be a compilation error if elision fails. However, this is not the case

https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-08-21 Thread Adrian Vogelsgesang via llvm-branch-commits


@@ -2049,6 +2055,22 @@ the coroutine must reach the final suspend point when it 
get destroyed.
 
 This attribute only works for switched-resume coroutines now.
 
+coro_must_elide
+---
+
+When a Call or Invoke instruction is marked with `coro_must_elide`,
+CoroAnnotationElidePass performs heap elision when possible. Note that for

vogelsgesang wrote:

love it!

https://github.com/llvm/llvm-project/pull/99283
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits