lxfind created this revision.
lxfind added reviewers: lewissbaker, modocache, junparser.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

If we ever assign co_await to a temporary variable, such as foo(co_await expr),
we generate AST that looks like this: 
MaterializedTemporaryExpr(CoawaitExpr(...)).
MaterializedTemporaryExpr would emit an intrinsics that marks the lifetime 
start of the
temporary storage. However such temporary storage will not be used until 
co_await is ready
to write the result. Marking the lifetime start way too early causes extra 
storage to be
put in the coroutine frame instead of the stack.
As you can see from https://godbolt.org/z/XsLUiY, the frame generated for 
get_big_object2 is 12K, which contains a big_object object unnecessarily.
After this patch, the frame size for get_big_object2 is now only 8K. There are 
still room for improvements, in particular, GCC has a 4K frame for this 
function. But that's a separate problem and not addressed in this patch.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82314

Files:
  clang/lib/CodeGen/CGCoroutine.cpp


Index: clang/lib/CodeGen/CGCoroutine.cpp
===================================================================
--- clang/lib/CodeGen/CGCoroutine.cpp
+++ clang/lib/CodeGen/CGCoroutine.cpp
@@ -245,10 +245,39 @@
   }
 
   LValueOrRValue Res;
-  if (forLValue)
+  if (forLValue) {
     Res.LV = CGF.EmitLValue(S.getResumeExpr());
-  else
+  } else {
+    // If the result of co_await is stored into a memory address, we want to
+    // make sure the lifetime of that memory address does not start too early,
+    // causing more space to be allocated in the frame rather than on stack.
+    // The lifetime of the return value of co_await should start here right
+    // before we attempt to assign it.
+    auto adjustLifetimeStart = [&]() {
+      // aggSlot points to the instruction that allocated the object. Latter
+      // instructions use it in this pattern:
+      //   %tmpX = alloca %.., align 4
+      //   %0 = bitcast %...* %tmpX to i8*
+      //   call void @llvm.lifetime.start.p0i8(i64 ..., i8* nonnull %0) #2
+      // Hence we trace back through uses to eventually locate the lifetime
+      // start intrinsics marker, and move it down to the current insertion
+      // point.
+      auto *AllocaInst =
+          dyn_cast_or_null<llvm::AllocaInst>(aggSlot.getPointer());
+      if (!AllocaInst || AllocaInst->getNumUses() != 1)
+        return;
+      auto *CastInst = dyn_cast<llvm::CastInst>(*AllocaInst->users().begin());
+      if (!CastInst || CastInst->getNumUses() != 1)
+        return;
+      if (auto *LifetimeInst =
+              dyn_cast<llvm::IntrinsicInst>(*CastInst->users().begin())) {
+        LifetimeInst->removeFromParent();
+        CGF.Builder.Insert(LifetimeInst);
+      }
+    };
+    adjustLifetimeStart();
     Res.RV = CGF.EmitAnyExpr(S.getResumeExpr(), aggSlot, ignoreResult);
+  }
 
   if (TryStmt) {
     Builder.CreateFlagStore(false, Coro.ResumeEHVar);


Index: clang/lib/CodeGen/CGCoroutine.cpp
===================================================================
--- clang/lib/CodeGen/CGCoroutine.cpp
+++ clang/lib/CodeGen/CGCoroutine.cpp
@@ -245,10 +245,39 @@
   }
 
   LValueOrRValue Res;
-  if (forLValue)
+  if (forLValue) {
     Res.LV = CGF.EmitLValue(S.getResumeExpr());
-  else
+  } else {
+    // If the result of co_await is stored into a memory address, we want to
+    // make sure the lifetime of that memory address does not start too early,
+    // causing more space to be allocated in the frame rather than on stack.
+    // The lifetime of the return value of co_await should start here right
+    // before we attempt to assign it.
+    auto adjustLifetimeStart = [&]() {
+      // aggSlot points to the instruction that allocated the object. Latter
+      // instructions use it in this pattern:
+      //   %tmpX = alloca %.., align 4
+      //   %0 = bitcast %...* %tmpX to i8*
+      //   call void @llvm.lifetime.start.p0i8(i64 ..., i8* nonnull %0) #2
+      // Hence we trace back through uses to eventually locate the lifetime
+      // start intrinsics marker, and move it down to the current insertion
+      // point.
+      auto *AllocaInst =
+          dyn_cast_or_null<llvm::AllocaInst>(aggSlot.getPointer());
+      if (!AllocaInst || AllocaInst->getNumUses() != 1)
+        return;
+      auto *CastInst = dyn_cast<llvm::CastInst>(*AllocaInst->users().begin());
+      if (!CastInst || CastInst->getNumUses() != 1)
+        return;
+      if (auto *LifetimeInst =
+              dyn_cast<llvm::IntrinsicInst>(*CastInst->users().begin())) {
+        LifetimeInst->removeFromParent();
+        CGF.Builder.Insert(LifetimeInst);
+      }
+    };
+    adjustLifetimeStart();
     Res.RV = CGF.EmitAnyExpr(S.getResumeExpr(), aggSlot, ignoreResult);
+  }
 
   if (TryStmt) {
     Builder.CreateFlagStore(false, Coro.ResumeEHVar);
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to