[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)

2024-09-08 Thread Nikita Popov via llvm-branch-commits

nikic wrote:

I don't think this should be backported in the current form, because it breaks 
ABI. This is not strictly impossible at this stage, but also very undesirable.

https://github.com/llvm/llvm-project/pull/106008
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)

2024-09-08 Thread Guray Ozen via llvm-branch-commits

https://github.com/grypp edited https://github.com/llvm/llvm-project/pull/107659
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)

2024-09-08 Thread Guray Ozen via llvm-branch-commits


@@ -209,7 +209,12 @@ struct GPULaneIdOpToNVVM : 
ConvertOpToLLVMPattern {
   ConversionPatternRewriter &rewriter) const override {
 auto loc = op->getLoc();
 MLIRContext *context = rewriter.getContext();
-Value newOp = rewriter.create(loc, rewriter.getI32Type());
+LLVM::ConstantRangeAttr bounds = nullptr;
+if (std::optional upperBound = op.getUpperBound())

grypp wrote:

who is setting the upperbound? I might be missing something

https://github.com/llvm/llvm-project/pull/107659
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)

2024-09-08 Thread Guray Ozen via llvm-branch-commits


@@ -209,7 +209,12 @@ struct GPULaneIdOpToNVVM : 
ConvertOpToLLVMPattern {
   ConversionPatternRewriter &rewriter) const override {
 auto loc = op->getLoc();
 MLIRContext *context = rewriter.getContext();
-Value newOp = rewriter.create(loc, rewriter.getI32Type());
+LLVM::ConstantRangeAttr bounds = nullptr;
+if (std::optional upperBound = op.getUpperBound())
+  bounds = rewriter.getAttr(
+  32, 0, upperBound->getZExtValue());

grypp wrote:

So 32 is the bitwidth, and 0 is the lower limit right? maybe we can create 
symbols to name them.  

https://github.com/llvm/llvm-project/pull/107659
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)

2024-09-08 Thread Guray Ozen via llvm-branch-commits

https://github.com/grypp commented:

Thanks for doing that. I think this is going be very useful. I left some 
comments

https://github.com/llvm/llvm-project/pull/107659
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)

2024-09-08 Thread Guray Ozen via llvm-branch-commits


@@ -1784,53 +1799,53 @@ def NVVM_CpAsyncBulkWaitGroupOp : 
NVVM_Op<"cp.async.bulk.wait_group">,
   }];
 }
 
-def NVVM_CpAsyncBulkTensorGlobalToSharedClusterOp : 
-  NVVM_Op<"cp.async.bulk.tensor.shared.cluster.global", 
-  [DeclareOpInterfaceMethods, 
+def NVVM_CpAsyncBulkTensorGlobalToSharedClusterOp :
+  NVVM_Op<"cp.async.bulk.tensor.shared.cluster.global",
+  [DeclareOpInterfaceMethods,
   AttrSizedOperandSegments]>,
   Arguments<(ins  LLVM_PointerShared:$dstMem,
   LLVM_AnyPointer:$tmaDescriptor,
   Variadic:$coordinates,
-  LLVM_PointerShared:$mbar,  
+  LLVM_PointerShared:$mbar,
   Variadic:$im2colOffsets,
   Optional:$multicastMask,
   Optional:$l2CacheHint,
   PtxPredicate:$predicate)> {
   let description = [{
-Initiates an asynchronous copy operation on the tensor data from global 
-memory to shared memory. 
+Initiates an asynchronous copy operation on the tensor data from global

grypp wrote:

I see that you are removing the trailing spaces. It's unrelated. Can we remove 
it from this PR? 

https://github.com/llvm/llvm-project/pull/107659
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)

2024-09-08 Thread Guray Ozen via llvm-branch-commits


@@ -699,9 +699,21 @@ gpu.module @test_module_32 {
 }
 
 gpu.module @test_module_33 {
-// CHECK-LABEL: func @kernel_with_block_size()
-// CHECK: attributes {gpu.kernel, gpu.known_block_size = array, nvvm.kernel, nvvm.maxntid = array}
-  gpu.func @kernel_with_block_size() kernel attributes {known_block_size = 
array} {
+// CHECK-LABEL: func @kernel_with_block_size(
+// CHECK: attributes {gpu.kernel, gpu.known_block_size = array, 
nvvm.kernel, nvvm.maxntid = array}
+  gpu.func @kernel_with_block_size(%arg0: !llvm.ptr) kernel attributes 
{known_block_size = array} {

grypp wrote:

If I remember correctly, you added known_block_size to func.func. So I am 
wondering is this PR going to work for func.func? 

https://github.com/llvm/llvm-project/pull/107659
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)

2024-09-08 Thread Guray Ozen via llvm-branch-commits


@@ -84,7 +87,7 @@ llvm.func @llvm_nvvm_barrier0() {
 // CHECK-SAME: i32 %[[barId:.*]], i32 %[[numThreads:.*]])
 llvm.func @llvm_nvvm_barrier(%barID : i32, %numberOfThreads : i32) {
   // CHECK: call void @llvm.nvvm.barrier0()
-  nvvm.barrier 
+  nvvm.barrier

grypp wrote:

let's remove unrelated space removing from here as well

https://github.com/llvm/llvm-project/pull/107659
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowterings (PR #107659)

2024-09-08 Thread Guray Ozen via llvm-branch-commits

grypp wrote:

nit: `lowterings` typo 

https://github.com/llvm/llvm-project/pull/107659
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 2b33fbe - Revert "[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolu…"

2024-09-08 Thread via llvm-branch-commits

Author: Mingming Liu
Date: 2024-09-08T16:45:07-07:00
New Revision: 2b33fbee3f36344786fa63b189387c3bd90c4c3f

URL: 
https://github.com/llvm/llvm-project/commit/2b33fbee3f36344786fa63b189387c3bd90c4c3f
DIFF: 
https://github.com/llvm/llvm-project/commit/2b33fbee3f36344786fa63b189387c3bd90c4c3f.diff

LOG: Revert "[NFCI][LTO][lld] Optimize away symbol copies within LTO global 
resolu…"

This reverts commit 9ade4e2646bd52b49e50c1648301da65de90ffa9.

Added: 


Modified: 
lld/ELF/InputFiles.cpp
lld/ELF/LTO.cpp
llvm/include/llvm/LTO/Config.h
llvm/include/llvm/LTO/LTO.h
llvm/lib/LTO/LTO.cpp

Removed: 




diff  --git a/lld/ELF/InputFiles.cpp b/lld/ELF/InputFiles.cpp
index da69c4882ead21..1570adf1370930 100644
--- a/lld/ELF/InputFiles.cpp
+++ b/lld/ELF/InputFiles.cpp
@@ -1804,10 +1804,6 @@ void BitcodeFile::parseLazy() {
   auto *sym = symtab.insert(unique_saver().save(irSym.getName()));
   sym->resolve(LazySymbol{*this});
   symbols[i] = sym;
-} else {
-  // Keep copies of per-module undefined symbols for LTO::GlobalResolutions
-  // usage.
-  unique_saver().save(irSym.getName());
 }
 }
 

diff  --git a/lld/ELF/LTO.cpp b/lld/ELF/LTO.cpp
index f339f1c2c0ec21..935d0a9eab9ee0 100644
--- a/lld/ELF/LTO.cpp
+++ b/lld/ELF/LTO.cpp
@@ -135,7 +135,6 @@ static lto::Config createConfig() {
   config->ltoValidateAllVtablesHaveTypeInfos;
   c.AllVtablesHaveTypeInfos = ctx.ltoAllVtablesHaveTypeInfos;
   c.AlwaysEmitRegularLTOObj = !config->ltoObjPath.empty();
-  c.KeepSymbolNameCopies = false;
 
   for (const llvm::StringRef &name : config->thinLTOModulesToCompile)
 c.ThinLTOModulesToCompile.emplace_back(name);

diff  --git a/llvm/include/llvm/LTO/Config.h b/llvm/include/llvm/LTO/Config.h
index a49cce9f30e20c..482b6e55a19d35 100644
--- a/llvm/include/llvm/LTO/Config.h
+++ b/llvm/include/llvm/LTO/Config.h
@@ -88,11 +88,6 @@ struct Config {
   /// want to know a priori all possible output files.
   bool AlwaysEmitRegularLTOObj = false;
 
-  /// If true, the LTO instance creates copies of the symbol names for 
LTO::run.
-  /// The lld linker uses string saver to keep symbol names alive and doesn't
-  /// need to create copies, so it can set this field to false.
-  bool KeepSymbolNameCopies = true;
-
   /// Allows non-imported definitions to get the potentially more constraining
   /// visibility from the prevailing definition. FromPrevailing is the default
   /// because it works for many binary formats. ELF can use the more optimized

diff  --git a/llvm/include/llvm/LTO/LTO.h b/llvm/include/llvm/LTO/LTO.h
index 782f37dc8d4404..949e80a43f0e88 100644
--- a/llvm/include/llvm/LTO/LTO.h
+++ b/llvm/include/llvm/LTO/LTO.h
@@ -15,9 +15,6 @@
 #ifndef LLVM_LTO_LTO_H
 #define LLVM_LTO_LTO_H
 
-#include 
-
-#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/MapVector.h"
 #include "llvm/ADT/StringMap.h"
 #include "llvm/Bitcode/BitcodeReader.h"
@@ -26,7 +23,6 @@
 #include "llvm/Object/IRSymtab.h"
 #include "llvm/Support/Caching.h"
 #include "llvm/Support/Error.h"
-#include "llvm/Support/StringSaver.h"
 #include "llvm/Support/thread.h"
 #include "llvm/Transforms/IPO/FunctionAttrs.h"
 #include "llvm/Transforms/IPO/FunctionImport.h"
@@ -407,19 +403,10 @@ class LTO {
 };
   };
 
-  // GlobalResolutionSymbolSaver allocator.
-  std::unique_ptr Alloc;
-
-  // Symbol saver for global resolution map.
-  std::unique_ptr GlobalResolutionSymbolSaver;
-
   // Global mapping from mangled symbol names to resolutions.
-  // Make this an unique_ptr to guard against accessing after it has been reset
+  // Make this an optional to guard against accessing after it has been reset
   // (to reduce memory after we're done with it).
-  std::unique_ptr>
-  GlobalResolutions;
-
-  void releaseGlobalResolutionsMemory();
+  std::optional> GlobalResolutions;
 
   void addModuleToGlobalRes(ArrayRef Syms,
 ArrayRef Res, unsigned Partition,

diff  --git a/llvm/lib/LTO/LTO.cpp b/llvm/lib/LTO/LTO.cpp
index 5d9a5cbd18f156..68072563cb33d6 100644
--- a/llvm/lib/LTO/LTO.cpp
+++ b/llvm/lib/LTO/LTO.cpp
@@ -77,10 +77,6 @@ cl::opt EnableLTOInternalization(
 "enable-lto-internalization", cl::init(true), cl::Hidden,
 cl::desc("Enable global value internalization in LTO"));
 
-static cl::opt
-LTOKeepSymbolCopies("lto-keep-symbol-copies", cl::init(false), cl::Hidden,
-cl::desc("Keep copies of symbols in LTO indexing"));
-
 /// Indicate we are linking with an allocator that supports hot/cold operator
 /// new interfaces.
 extern cl::opt SupportsHotColdNew;
@@ -591,14 +587,8 @@ LTO::LTO(Config Conf, ThinBackend Backend,
 : Conf(std::move(Conf)),
   RegularLTO(ParallelCodeGenParallelismLevel, this->Conf),
   ThinLTO(std::move(Backend)),
-  GlobalResolutions(
-  std::make_unique>()),
-  LTOMode(LTOMode) {
-  if (Conf.KeepSymbolNameCopies || LTOKeepSym

[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)

2024-09-08 Thread Chuanqi Xu via llvm-branch-commits

ChuanqiXu9 wrote:

> > what the code does is: when we write a on-disk hash table, try to write the 
> > imported merged hash table in the same process so that we don't need to 
> > read these tables again. However, in line 329 the function will try to omit 
> > the data from imported table with the same key which already emitted by the 
> > current module file. This is the root cause of the problem.
> 
> It's been a while since I looked at this, but as I recall, a fundamental 
> assumption of MultiOnDiskHashTable is that if we have a lookup result for a 
> key K in the current file, that result supersedes any results from dependency 
> files. So lookup won't look in those files if we have a local result (they 
> are overridden) and merging doesn't take results from those files either.
> 
> So I think the problem probably is that when we form a local result, we need 
> to (but presumably don't) add all the imported results with the same key to 
> the local result.

Thanks for the insight! I didn't known this. I'll try to make it.

https://github.com/llvm/llvm-project/pull/83237
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-09-08 Thread Yuxuan Chen via llvm-branch-commits

https://github.com/yuxuanchen1997 updated 
https://github.com/llvm/llvm-project/pull/99283

>From 0c712a2fbc5b44e892b37085dbace8ba974c1238 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 4 Jun 2024 23:22:00 -0700
Subject: [PATCH] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI
 coroutine ramp functions during CoroSplit

---
 llvm/docs/Coroutines.rst  |  18 +++
 llvm/lib/Transforms/Coroutines/CoroInternal.h |   7 +
 llvm/lib/Transforms/Coroutines/CoroSplit.cpp  | 150 +++---
 llvm/lib/Transforms/Coroutines/Coroutines.cpp |  27 
 .../Transforms/Coroutines/coro-split-00.ll|  15 ++
 5 files changed, 191 insertions(+), 26 deletions(-)

diff --git a/llvm/docs/Coroutines.rst b/llvm/docs/Coroutines.rst
index 36092325e536fb..5679aefcb421d8 100644
--- a/llvm/docs/Coroutines.rst
+++ b/llvm/docs/Coroutines.rst
@@ -2022,6 +2022,12 @@ The pass CoroSplit builds coroutine frame and outlines 
resume and destroy parts
 into separate functions. This pass also lowers `coro.await.suspend.void`_,
 `coro.await.suspend.bool`_ and `coro.await.suspend.handle`_ intrinsics.
 
+CoroAnnotationElide
+---
+This pass finds all usages of coroutines that are "must elide" and replaces
+`coro.begin` intrinsic with an address of a coroutine frame placed on its 
caller
+and replaces `coro.alloc` and `coro.free` intrinsics with `false` and `null`
+respectively to remove the deallocation code.
 
 CoroElide
 -
@@ -2049,6 +2055,18 @@ the coroutine must reach the final suspend point when it 
get destroyed.
 
 This attribute only works for switched-resume coroutines now.
 
+coro_elide_safe
+---
+
+When a Call or Invoke instruction to switch ABI coroutine `f` is marked with
+`coro_elide_safe`, CoroSplitPass generates a `f.noalloc` ramp function.
+`f.noalloc` has one more argument than its original ramp function `f`, which is
+the pointer to the allocated frame. `f.noalloc` also suppressed any allocations
+or deallocations that may be guarded by `@llvm.coro.alloc` and 
`@llvm.coro.free`.
+
+CoroAnnotationElidePass performs the heap elision when possible. Note that for
+recursive or mutually recursive functions this elision is usually not possible.
+
 Metadata
 
 
diff --git a/llvm/lib/Transforms/Coroutines/CoroInternal.h 
b/llvm/lib/Transforms/Coroutines/CoroInternal.h
index d535ad7f85d74a..be86f96525b677 100644
--- a/llvm/lib/Transforms/Coroutines/CoroInternal.h
+++ b/llvm/lib/Transforms/Coroutines/CoroInternal.h
@@ -26,6 +26,13 @@ bool declaresIntrinsics(const Module &M,
 const std::initializer_list);
 void replaceCoroFree(CoroIdInst *CoroId, bool Elide);
 
+/// Replaces all @llvm.coro.alloc intrinsics calls associated with a given
+/// call @llvm.coro.id instruction with boolean value false.
+void suppressCoroAllocs(CoroIdInst *CoroId);
+/// Replaces CoroAllocs with boolean value false.
+void suppressCoroAllocs(LLVMContext &Context,
+ArrayRef CoroAllocs);
+
 /// Attempts to rewrite the location operand of debug intrinsics in terms of
 /// the coroutine frame pointer, folding pointer offsets into the DIExpression
 /// of the intrinsic.
diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp 
b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
index 6bf3c75b95113e..494c4d632de95f 100644
--- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -25,6 +25,7 @@
 #include "llvm/ADT/PriorityWorklist.h"
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/Twine.h"
 #include "llvm/Analysis/CFG.h"
@@ -1177,6 +1178,14 @@ static void 
updateAsyncFuncPointerContextSize(coro::Shape &Shape) {
   Shape.AsyncLowering.AsyncFuncPointer->setInitializer(NewFuncPtrStruct);
 }
 
+static TypeSize getFrameSizeForShape(coro::Shape &Shape) {
+  // In the same function all coro.sizes should have the same result type.
+  auto *SizeIntrin = Shape.CoroSizes.back();
+  Module *M = SizeIntrin->getModule();
+  const DataLayout &DL = M->getDataLayout();
+  return DL.getTypeAllocSize(Shape.FrameTy);
+}
+
 static void replaceFrameSizeAndAlignment(coro::Shape &Shape) {
   if (Shape.ABI == coro::ABI::Async)
 updateAsyncFuncPointerContextSize(Shape);
@@ -1192,10 +1201,8 @@ static void replaceFrameSizeAndAlignment(coro::Shape 
&Shape) {
 
   // In the same function all coro.sizes should have the same result type.
   auto *SizeIntrin = Shape.CoroSizes.back();
-  Module *M = SizeIntrin->getModule();
-  const DataLayout &DL = M->getDataLayout();
-  auto Size = DL.getTypeAllocSize(Shape.FrameTy);
-  auto *SizeConstant = ConstantInt::get(SizeIntrin->getType(), Size);
+  auto *SizeConstant =
+  ConstantInt::get(SizeIntrin->getType(), getFrameSizeForShape(Shape));
 
   for (CoroSizeInst *CS : Shape.CoroSizes) {
 CS->replaceAllUsesWith(SizeConstant);
@@ -1452,6 +1459,75 @@ struct SwitchCorou

[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)

2024-09-08 Thread Yuxuan Chen via llvm-branch-commits

https://github.com/yuxuanchen1997 updated 
https://github.com/llvm/llvm-project/pull/99285

>From 201bca06d4e75bc4fa24ac269ad7b9750f24616f Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Mon, 15 Jul 2024 15:01:39 -0700
Subject: [PATCH] [LLVM][Coroutines] Transform "coro_elide_safe" calls to
 switch ABI coroutines to the `noalloc` variant

---
 .../Coroutines/CoroAnnotationElide.h  |  36 
 llvm/lib/Passes/PassBuilder.cpp   |   1 +
 llvm/lib/Passes/PassBuilderPipelines.cpp  |  10 +-
 llvm/lib/Passes/PassRegistry.def  |   1 +
 llvm/lib/Transforms/Coroutines/CMakeLists.txt |   1 +
 .../Coroutines/CoroAnnotationElide.cpp| 155 ++
 llvm/test/Other/new-pm-defaults.ll|   1 +
 .../Other/new-pm-thinlto-postlink-defaults.ll |   1 +
 .../new-pm-thinlto-postlink-pgo-defaults.ll   |   1 +
 ...-pm-thinlto-postlink-samplepgo-defaults.ll |   1 +
 .../Coroutines/coro-transform-must-elide.ll   |  75 +
 11 files changed, 281 insertions(+), 2 deletions(-)
 create mode 100644 
llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h
 create mode 100644 llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp
 create mode 100644 llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll

diff --git a/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h 
b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h
new file mode 100644
index 00..352c9e14526697
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h
@@ -0,0 +1,36 @@
+//===- CoroAnnotationElide.h - Elide attributed safe coroutine calls 
--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// \file
+// This pass transforms all Call or Invoke instructions that are annotated
+// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead.
+// The frame of the callee coroutine is allocated inside the caller. A pointer
+// to the allocated frame will be passed into the `.noalloc` ramp function.
+//
+//===--===//
+
+#ifndef LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H
+#define LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H
+
+#include "llvm/Analysis/CGSCCPassManager.h"
+#include "llvm/Analysis/LazyCallGraph.h"
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+struct CoroAnnotationElidePass : PassInfoMixin {
+  CoroAnnotationElidePass() {}
+
+  PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
+LazyCallGraph &CG, CGSCCUpdateResult &UR);
+
+  static bool isRequired() { return false; }
+};
+} // end namespace llvm
+
+#endif // LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 83c1a6712bf4d9..c34f9148cce58b 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -139,6 +139,7 @@
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h"
 #include "llvm/Transforms/CFGuard.h"
+#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h"
 #include "llvm/Transforms/Coroutines/CoroCleanup.h"
 #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h"
 #include "llvm/Transforms/Coroutines/CoroEarly.h"
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 7f9e1362e7ef23..4e8e3dcdff4428 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -33,6 +33,7 @@
 #include "llvm/Support/VirtualFileSystem.h"
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h"
+#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h"
 #include "llvm/Transforms/Coroutines/CoroCleanup.h"
 #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h"
 #include "llvm/Transforms/Coroutines/CoroEarly.h"
@@ -973,8 +974,10 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level,
   MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor(
   RequireAnalysisPass()));
 
-  if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink)
+  if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) {
 MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0));
+MainCGPipeline.addPass(CoroAnnotationElidePass());
+  }
 
   // Make sure we don't affect potential future NoRerun CGSCC adaptors.
   MIWP.addLateModulePass(createModuleToFunctionPassAdaptor(
@@ -1016,9 +1019,12 @@ 
PassBuilder::buildModuleInlinerPipeline(OptimizationLevel Level,
   buildFunctionSimplificationPipeline(Level, Phase),
   PTO.EagerlyInvalidateAnalyses));
 
-  if (Phase != ThinOrFullLTO

[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)

2024-09-08 Thread Yuxuan Chen via llvm-branch-commits

https://github.com/yuxuanchen1997 updated 
https://github.com/llvm/llvm-project/pull/99283

>From 0c712a2fbc5b44e892b37085dbace8ba974c1238 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 4 Jun 2024 23:22:00 -0700
Subject: [PATCH] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI
 coroutine ramp functions during CoroSplit

---
 llvm/docs/Coroutines.rst  |  18 +++
 llvm/lib/Transforms/Coroutines/CoroInternal.h |   7 +
 llvm/lib/Transforms/Coroutines/CoroSplit.cpp  | 150 +++---
 llvm/lib/Transforms/Coroutines/Coroutines.cpp |  27 
 .../Transforms/Coroutines/coro-split-00.ll|  15 ++
 5 files changed, 191 insertions(+), 26 deletions(-)

diff --git a/llvm/docs/Coroutines.rst b/llvm/docs/Coroutines.rst
index 36092325e536fb..5679aefcb421d8 100644
--- a/llvm/docs/Coroutines.rst
+++ b/llvm/docs/Coroutines.rst
@@ -2022,6 +2022,12 @@ The pass CoroSplit builds coroutine frame and outlines 
resume and destroy parts
 into separate functions. This pass also lowers `coro.await.suspend.void`_,
 `coro.await.suspend.bool`_ and `coro.await.suspend.handle`_ intrinsics.
 
+CoroAnnotationElide
+---
+This pass finds all usages of coroutines that are "must elide" and replaces
+`coro.begin` intrinsic with an address of a coroutine frame placed on its 
caller
+and replaces `coro.alloc` and `coro.free` intrinsics with `false` and `null`
+respectively to remove the deallocation code.
 
 CoroElide
 -
@@ -2049,6 +2055,18 @@ the coroutine must reach the final suspend point when it 
get destroyed.
 
 This attribute only works for switched-resume coroutines now.
 
+coro_elide_safe
+---
+
+When a Call or Invoke instruction to switch ABI coroutine `f` is marked with
+`coro_elide_safe`, CoroSplitPass generates a `f.noalloc` ramp function.
+`f.noalloc` has one more argument than its original ramp function `f`, which is
+the pointer to the allocated frame. `f.noalloc` also suppressed any allocations
+or deallocations that may be guarded by `@llvm.coro.alloc` and 
`@llvm.coro.free`.
+
+CoroAnnotationElidePass performs the heap elision when possible. Note that for
+recursive or mutually recursive functions this elision is usually not possible.
+
 Metadata
 
 
diff --git a/llvm/lib/Transforms/Coroutines/CoroInternal.h 
b/llvm/lib/Transforms/Coroutines/CoroInternal.h
index d535ad7f85d74a..be86f96525b677 100644
--- a/llvm/lib/Transforms/Coroutines/CoroInternal.h
+++ b/llvm/lib/Transforms/Coroutines/CoroInternal.h
@@ -26,6 +26,13 @@ bool declaresIntrinsics(const Module &M,
 const std::initializer_list);
 void replaceCoroFree(CoroIdInst *CoroId, bool Elide);
 
+/// Replaces all @llvm.coro.alloc intrinsics calls associated with a given
+/// call @llvm.coro.id instruction with boolean value false.
+void suppressCoroAllocs(CoroIdInst *CoroId);
+/// Replaces CoroAllocs with boolean value false.
+void suppressCoroAllocs(LLVMContext &Context,
+ArrayRef CoroAllocs);
+
 /// Attempts to rewrite the location operand of debug intrinsics in terms of
 /// the coroutine frame pointer, folding pointer offsets into the DIExpression
 /// of the intrinsic.
diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp 
b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
index 6bf3c75b95113e..494c4d632de95f 100644
--- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -25,6 +25,7 @@
 #include "llvm/ADT/PriorityWorklist.h"
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/Twine.h"
 #include "llvm/Analysis/CFG.h"
@@ -1177,6 +1178,14 @@ static void 
updateAsyncFuncPointerContextSize(coro::Shape &Shape) {
   Shape.AsyncLowering.AsyncFuncPointer->setInitializer(NewFuncPtrStruct);
 }
 
+static TypeSize getFrameSizeForShape(coro::Shape &Shape) {
+  // In the same function all coro.sizes should have the same result type.
+  auto *SizeIntrin = Shape.CoroSizes.back();
+  Module *M = SizeIntrin->getModule();
+  const DataLayout &DL = M->getDataLayout();
+  return DL.getTypeAllocSize(Shape.FrameTy);
+}
+
 static void replaceFrameSizeAndAlignment(coro::Shape &Shape) {
   if (Shape.ABI == coro::ABI::Async)
 updateAsyncFuncPointerContextSize(Shape);
@@ -1192,10 +1201,8 @@ static void replaceFrameSizeAndAlignment(coro::Shape 
&Shape) {
 
   // In the same function all coro.sizes should have the same result type.
   auto *SizeIntrin = Shape.CoroSizes.back();
-  Module *M = SizeIntrin->getModule();
-  const DataLayout &DL = M->getDataLayout();
-  auto Size = DL.getTypeAllocSize(Shape.FrameTy);
-  auto *SizeConstant = ConstantInt::get(SizeIntrin->getType(), Size);
+  auto *SizeConstant =
+  ConstantInt::get(SizeIntrin->getType(), getFrameSizeForShape(Shape));
 
   for (CoroSizeInst *CS : Shape.CoroSizes) {
 CS->replaceAllUsesWith(SizeConstant);
@@ -1452,6 +1459,75 @@ struct SwitchCorou

[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)

2024-09-08 Thread Yuxuan Chen via llvm-branch-commits

https://github.com/yuxuanchen1997 updated 
https://github.com/llvm/llvm-project/pull/99285

>From 201bca06d4e75bc4fa24ac269ad7b9750f24616f Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Mon, 15 Jul 2024 15:01:39 -0700
Subject: [PATCH] [LLVM][Coroutines] Transform "coro_elide_safe" calls to
 switch ABI coroutines to the `noalloc` variant

---
 .../Coroutines/CoroAnnotationElide.h  |  36 
 llvm/lib/Passes/PassBuilder.cpp   |   1 +
 llvm/lib/Passes/PassBuilderPipelines.cpp  |  10 +-
 llvm/lib/Passes/PassRegistry.def  |   1 +
 llvm/lib/Transforms/Coroutines/CMakeLists.txt |   1 +
 .../Coroutines/CoroAnnotationElide.cpp| 155 ++
 llvm/test/Other/new-pm-defaults.ll|   1 +
 .../Other/new-pm-thinlto-postlink-defaults.ll |   1 +
 .../new-pm-thinlto-postlink-pgo-defaults.ll   |   1 +
 ...-pm-thinlto-postlink-samplepgo-defaults.ll |   1 +
 .../Coroutines/coro-transform-must-elide.ll   |  75 +
 11 files changed, 281 insertions(+), 2 deletions(-)
 create mode 100644 
llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h
 create mode 100644 llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp
 create mode 100644 llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll

diff --git a/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h 
b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h
new file mode 100644
index 00..352c9e14526697
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Coroutines/CoroAnnotationElide.h
@@ -0,0 +1,36 @@
+//===- CoroAnnotationElide.h - Elide attributed safe coroutine calls 
--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// \file
+// This pass transforms all Call or Invoke instructions that are annotated
+// "coro_elide_safe" to call the `.noalloc` variant of coroutine instead.
+// The frame of the callee coroutine is allocated inside the caller. A pointer
+// to the allocated frame will be passed into the `.noalloc` ramp function.
+//
+//===--===//
+
+#ifndef LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H
+#define LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H
+
+#include "llvm/Analysis/CGSCCPassManager.h"
+#include "llvm/Analysis/LazyCallGraph.h"
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+struct CoroAnnotationElidePass : PassInfoMixin {
+  CoroAnnotationElidePass() {}
+
+  PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
+LazyCallGraph &CG, CGSCCUpdateResult &UR);
+
+  static bool isRequired() { return false; }
+};
+} // end namespace llvm
+
+#endif // LLVM_TRANSFORMS_COROUTINES_COROANNOTATIONELIDE_H
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 83c1a6712bf4d9..c34f9148cce58b 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -139,6 +139,7 @@
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h"
 #include "llvm/Transforms/CFGuard.h"
+#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h"
 #include "llvm/Transforms/Coroutines/CoroCleanup.h"
 #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h"
 #include "llvm/Transforms/Coroutines/CoroEarly.h"
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 7f9e1362e7ef23..4e8e3dcdff4428 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -33,6 +33,7 @@
 #include "llvm/Support/VirtualFileSystem.h"
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h"
+#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h"
 #include "llvm/Transforms/Coroutines/CoroCleanup.h"
 #include "llvm/Transforms/Coroutines/CoroConditionalWrapper.h"
 #include "llvm/Transforms/Coroutines/CoroEarly.h"
@@ -973,8 +974,10 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level,
   MainCGPipeline.addPass(createCGSCCToFunctionPassAdaptor(
   RequireAnalysisPass()));
 
-  if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink)
+  if (Phase != ThinOrFullLTOPhase::ThinLTOPreLink) {
 MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0));
+MainCGPipeline.addPass(CoroAnnotationElidePass());
+  }
 
   // Make sure we don't affect potential future NoRerun CGSCC adaptors.
   MIWP.addLateModulePass(createModuleToFunctionPassAdaptor(
@@ -1016,9 +1019,12 @@ 
PassBuilder::buildModuleInlinerPipeline(OptimizationLevel Level,
   buildFunctionSimplificationPipeline(Level, Phase),
   PTO.EagerlyInvalidateAnalyses));
 
-  if (Phase != ThinOrFullLTO

[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)

2024-09-08 Thread Chuanqi Xu via llvm-branch-commits

ChuanqiXu9 wrote:

> > what the code does is: when we write a on-disk hash table, try to write the 
> > imported merged hash table in the same process so that we don't need to 
> > read these tables again. However, in line 329 the function will try to omit 
> > the data from imported table with the same key which already emitted by the 
> > current module file. This is the root cause of the problem.
> 
> It's been a while since I looked at this, but as I recall, a fundamental 
> assumption of MultiOnDiskHashTable is that if we have a lookup result for a 
> key K in the current file, that result supersedes any results from dependency 
> files. So lookup won't look in those files if we have a local result (they 
> are overridden) and merging doesn't take results from those files either.
> 
> So I think the problem probably is that when we form a local result, we need 
> to (but presumably don't) add all the imported results with the same key to 
> the local result.

I took a second look at MultiOnDiskHashTable.h and it looks like it is what I 
did. There is not any logic to supersedes results from dependency files in 
MultiOnDiskHashTable itself (or it doesn't have the concept of dependency file 
in MultiOnDiskHashTable.) The only similar thing is 
`MultiOnDiskHashTableGenerator::emit`, where the contents from the merged 
tables of imported files will be come the contents of the current file. And the 
corresponding imported files will be marked as overriden and removed in `find`.

And in this patch, I've already tried to write all the contents into the 
current one. So I feel the concern is already addressed.

https://github.com/llvm/llvm-project/pull/83237
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits