date:20241003

[llvm-branch-commits] [flang] [WIP][flang] Introduce HLFIR lowerings to omp.workshare_loop_nest (PR #104748)

2024-10-03 Thread Ivan R. Ivanov via llvm-branch-commits


ivanradanov wrote:

@Thirumalai-Shaktivel Fixed, it was a very stupid mistake with the argument 
order of the copyprivate copy function. Thank you.
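
For readers following along, the argument order in question can be sketched in plain C++. This is only an illustration, not the generated FIR: `copyPrivateCopy` is a hypothetical name, and the one thing taken from the patch is that argument 0 is the destination and argument 1 is the source.

```cpp
#include <cassert>

// Hypothetical stand-in for the generated copy function.
// Parameter 0 is the destination and parameter 1 is the source,
// which is the order the fixed patch uses (load from arg 1, store to arg 0).
template <typename T>
void copyPrivateCopy(T *dst, const T *src) {
  assert(dst != nullptr && src != nullptr);
  *dst = *src; // load from the source, store into the destination
}
```

With the arguments swapped, each thread would overwrite the broadcast value with its own uninitialized copy, which is presumably the failure that was observed.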

https://github.com/llvm/llvm-project/pull/104748
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [flang] [WIP][flang] Introduce HLFIR lowerings to omp.workshare_loop_nest (PR #104748)

2024-10-03 Thread Ivan R. Ivanov via llvm-branch-commits


https://github.com/ivanradanov updated 
https://github.com/llvm/llvm-project/pull/104748

>From 07a9eb3581f480c47ce4de3de00c7cef15df3cdc Mon Sep 17 00:00:00 2001
From: Ivan Radanov Ivanov 
Date: Fri, 4 Oct 2024 14:21:14 +0900
Subject: [PATCH 1/7] Fix dst src in copy function

---
 flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp 
b/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp
index cf1867311cc236..baf8346e7608a9 100644
--- a/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp
+++ b/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp
@@ -162,8 +162,8 @@ static mlir::func::FuncOp createCopyFunc(mlir::Location 
loc, mlir::Type varType,
   {loc, loc});
   builder.setInsertionPointToStart(&funcOp.getRegion().back());
 
-  Value loaded = builder.create(loc, funcOp.getArgument(0));
-  builder.create(loc, loaded, funcOp.getArgument(1));
+  Value loaded = builder.create(loc, funcOp.getArgument(1));
+  builder.create(loc, loaded, funcOp.getArgument(0));
 
   builder.create(loc);
   return funcOp;

>From c3ff901b31806c73228e4f47a47f420c2d2465ed Mon Sep 17 00:00:00 2001
From: Ivan Radanov Ivanov 
Date: Fri, 4 Oct 2024 14:38:48 +0900
Subject: [PATCH 2/7] Use omp.single to handle CFG cases

---
 flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp | 77 +--
 1 file changed, 53 insertions(+), 24 deletions(-)

diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp 
b/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp
index baf8346e7608a9..34399abbcd20ea 100644
--- a/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp
+++ b/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp
@@ -16,7 +16,6 @@
 //
 
//===--===//
 
-#include "flang/Optimizer/Builder/Todo.h"
 #include 
 #include 
 #include 
@@ -39,7 +38,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 
@@ -96,6 +94,12 @@ bool shouldUseWorkshareLowering(Operation *op) {
   if (isNestedIn(parentWorkshare, op))
 return false;
 
+  if (parentWorkshare.getRegion().getBlocks().size() != 1) {
+parentWorkshare->emitWarning(
+"omp workshare with unstructured control flow currently unsupported.");
+return false;
+  }
+
   return true;
 }
 
@@ -408,15 +412,6 @@ LogicalResult lowerWorkshare(mlir::omp::WorkshareOp wsOp, 
DominanceInfo &di) {
 
   OpBuilder rootBuilder(wsOp);
 
-  // This operation is just a placeholder which will be erased later. We need 
it
-  // because our `parallelizeRegion` function works on regions and not blocks.
-  omp::WorkshareOp newOp =
-  rootBuilder.create(loc, omp::WorkshareOperands());
-  if (!wsOp.getNowait())
-rootBuilder.create(loc);
-
-  parallelizeRegion(wsOp.getRegion(), newOp.getRegion(), rootMapping, loc, di);
-
   // FIXME Currently, we only support workshare constructs with structured
   // control flow. The transformation itself supports CFG, however, once we
   // transform the MLIR region in the omp.workshare, we need to inline that
@@ -427,19 +422,53 @@ LogicalResult lowerWorkshare(mlir::omp::WorkshareOp wsOp, 
DominanceInfo &di) {
   // time when fir ops get lowered to CFG. However, SCF is not registered in
   // flang so we cannot use it. Remove this requirement once we have
   // scf.execute_region or an alternative operation available.
-  if (wsOp.getRegion().getBlocks().size() != 1)
-TODO(wsOp->getLoc(), "omp workshare with unstructured control flow");
-
-  // Inline the contents of the placeholder workshare op into its parent block.
-  Block *theBlock = &newOp.getRegion().front();
-  Operation *term = theBlock->getTerminator();
-  Block *parentBlock = wsOp->getBlock();
-  parentBlock->getOperations().splice(newOp->getIterator(),
-  theBlock->getOperations());
-  assert(term->getNumOperands() == 0);
-  term->erase();
-  newOp->erase();
-  wsOp->erase();
+  if (wsOp.getRegion().getBlocks().size() == 1) {
+// This operation is just a placeholder which will be erased later. We need
+// it because our `parallelizeRegion` function works on regions and not
+// blocks.
+omp::WorkshareOp newOp =
+rootBuilder.create(loc, omp::WorkshareOperands());
+if (!wsOp.getNowait())
+  rootBuilder.create(loc);
+
+parallelizeRegion(wsOp.getRegion(), newOp.getRegion(), rootMapping, loc,
+  di);
+
+// Inline the contents of the placeholder workshare op into its parent
+// block.
+Block *theBlock = &newOp.getRegion().front();
+Operation *term = theBlock->getTerminator();
+Block *parentBlock = wsOp->getBlock();
+parentBlock->getOperations().splice(newOp->getIterator(),
+theBlock->getOperations());
+assert(term->getNumOperands() == 0);
+term->erase();
+newOp->erase();
+wsOp->erase();
+  } else {
+// Otherwise just change the operation to an omp.single.
+
+//

[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)

2024-10-03 Thread Ivan R. Ivanov via llvm-branch-commits


https://github.com/ivanradanov updated 
https://github.com/llvm/llvm-project/pull/101446

>From e56dbd6a0625890fd9a3d6a62675e864ca94a8f5 Mon Sep 17 00:00:00 2001
From: Ivan Radanov Ivanov 
Date: Sun, 4 Aug 2024 22:06:55 +0900
Subject: [PATCH 1/8] [flang] Lower omp.workshare to other omp constructs

Change to workshare loop wrapper op

Move single op declaration

Schedule pass properly

Correctly handle nested nested loop nests to be parallelized by workshare

Leave comments for shouldUseWorkshareLowering

Use copyprivate to scatter val from omp.single

TODO still need to implement copy function
TODO transitive check for usage outside of omp.single not implemented yet

Transitively check for users outside of single op

TODO need to implement copy func
TODO need to hoist allocas outside of single regions

Add tests

Hoist allocas

More tests

Emit body for copy func

Test the tmp storing logic

Clean up trivially dead ops

Only handle single-block regions for now

Fix tests for custom assembly for loop wrapper

Only run the lower workshare pass if openmp is enabled

Implement some missing functionality

Fix tests

Fix test

Iterate backwards to find all trivially dead ops

Add explanation comment for createCopyFunc

Update test
---
 flang/include/flang/Optimizer/OpenMP/Passes.h |   5 +
 .../include/flang/Optimizer/OpenMP/Passes.td  |   5 +
 flang/include/flang/Tools/CLOptions.inc   |   6 +-
 flang/include/flang/Tools/CrossToolHelpers.h  |   1 +
 flang/lib/Frontend/FrontendActions.cpp|  10 +-
 flang/lib/Optimizer/OpenMP/CMakeLists.txt |   1 +
 flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp | 446 ++
 flang/test/Fir/basic-program.fir  |   1 +
 .../Transforms/OpenMP/lower-workshare.mlir| 189 
 .../Transforms/OpenMP/lower-workshare2.mlir   |  23 +
 .../Transforms/OpenMP/lower-workshare3.mlir   |  74 +++
 .../Transforms/OpenMP/lower-workshare4.mlir   |  59 +++
 .../Transforms/OpenMP/lower-workshare5.mlir   |  42 ++
 .../Transforms/OpenMP/lower-workshare6.mlir   |  51 ++
 flang/tools/bbc/bbc.cpp   |   5 +-
 flang/tools/tco/tco.cpp   |   1 +
 16 files changed, 915 insertions(+), 4 deletions(-)
 create mode 100644 flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp
 create mode 100644 flang/test/Transforms/OpenMP/lower-workshare.mlir
 create mode 100644 flang/test/Transforms/OpenMP/lower-workshare2.mlir
 create mode 100644 flang/test/Transforms/OpenMP/lower-workshare3.mlir
 create mode 100644 flang/test/Transforms/OpenMP/lower-workshare4.mlir
 create mode 100644 flang/test/Transforms/OpenMP/lower-workshare5.mlir
 create mode 100644 flang/test/Transforms/OpenMP/lower-workshare6.mlir

diff --git a/flang/include/flang/Optimizer/OpenMP/Passes.h 
b/flang/include/flang/Optimizer/OpenMP/Passes.h
index 403d79667bf448..feb395f1a12dbd 100644
--- a/flang/include/flang/Optimizer/OpenMP/Passes.h
+++ b/flang/include/flang/Optimizer/OpenMP/Passes.h
@@ -25,6 +25,11 @@ namespace flangomp {
 #define GEN_PASS_REGISTRATION
 #include "flang/Optimizer/OpenMP/Passes.h.inc"
 
+/// Impelements the logic specified in the 2.8.3  workshare Construct section 
of
+/// the OpenMP standard which specifies what statements or constructs shall be
+/// divided into units of work.
+bool shouldUseWorkshareLowering(mlir::Operation *op);
+
 } // namespace flangomp
 
 #endif // FORTRAN_OPTIMIZER_OPENMP_PASSES_H
diff --git a/flang/include/flang/Optimizer/OpenMP/Passes.td 
b/flang/include/flang/Optimizer/OpenMP/Passes.td
index 395178e26a5762..041240cad12eb3 100644
--- a/flang/include/flang/Optimizer/OpenMP/Passes.td
+++ b/flang/include/flang/Optimizer/OpenMP/Passes.td
@@ -37,4 +37,9 @@ def FunctionFiltering : Pass<"omp-function-filtering"> {
   ];
 }
 
+// Needs to be scheduled on Module as we create functions in it
+def LowerWorkshare : Pass<"lower-workshare", "::mlir::ModuleOp"> {
+  let summary = "Lower workshare construct";
+}
+
 #endif //FORTRAN_OPTIMIZER_OPENMP_PASSES
diff --git a/flang/include/flang/Tools/CLOptions.inc 
b/flang/include/flang/Tools/CLOptions.inc
index 1881e23b00045a..bb00e079008a0b 100644
--- a/flang/include/flang/Tools/CLOptions.inc
+++ b/flang/include/flang/Tools/CLOptions.inc
@@ -337,7 +337,7 @@ inline void createDefaultFIROptimizerPassPipeline(
 /// \param optLevel - optimization level used for creating FIR optimization
 ///   passes pipeline
 inline void createHLFIRToFIRPassPipeline(
-mlir::PassManager &pm, llvm::OptimizationLevel optLevel = defaultOptLevel) 
{
+mlir::PassManager &pm, bool enableOpenMP, llvm::OptimizationLevel optLevel 
= defaultOptLevel) {
   if (optLevel.isOptimizingForSpeed()) {
 addCanonicalizerPassWithoutRegionSimplification(pm);
 addNestedPassToAllTopLevelOperations(
@@ -354,6 +354,8 @@ inline void createHLFIRToFIRPassPipeline(
   pm.addPass(hlfir::createLowerHLFIRIntrinsics());
   pm.addPass(hlfir::createBufferizeHLFIR());
   pm.addPass(hlfir::createConvertHLFIRtoFIR());
+  if (enableOpenMP)
+pm.add


[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)

2024-10-03 Thread Ivan R. Ivanov via llvm-branch-commits


ivanradanov wrote:

> My concern with the TODO message is that some code that previously compiled 
> using the lowering of WORKSHARE as SINGLE will now hit this TODO. This is 
> okay with me so long as it is fixed soon (before LLVM 20). Otherwise, could 
> these cases continue to be lowered as SINGLE for now?

I have updated it to lower to omp.single and emit a warning in CFG cases.
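
Roughly, the decision now looks like the sketch below. It is a simplified model rather than the pass itself: `Lowering`, `LoweringDecision` and `chooseLowering` are hypothetical names, the block count stands in for `wsOp.getRegion().getBlocks().size()`, and the warning text is the one used in the related patch on PR #104748.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of the two outcomes for an omp.workshare region.
enum class Lowering { Workshare, Single };

struct LoweringDecision {
  Lowering kind;
  std::string warning; // empty when no diagnostic is emitted
};

// numBlocks models the number of blocks in the workshare region.
inline LoweringDecision chooseLowering(std::size_t numBlocks) {
  // Assumption: a region always has at least one block.
  assert(numBlocks > 0);
  if (numBlocks == 1)
    return {Lowering::Workshare, ""};
  // Unstructured control flow (CFG): fall back to omp.single and warn.
  return {Lowering::Single,
          "omp workshare with unstructured control flow currently "
          "unsupported and will be serialized."};
}
```

The fallback keeps previously compiling code working; it just runs serialized until CFG regions are supported.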

https://github.com/llvm/llvm-project/pull/101446

[llvm-branch-commits] [flang] [WIP][flang] Introduce HLFIR lowerings to omp.workshare_loop_nest (PR #104748)

2024-10-03 Thread Ivan R. Ivanov via llvm-branch-commits


https://github.com/ivanradanov updated 
https://github.com/llvm/llvm-project/pull/104748

>From 4c207b5c8e44d83eea08d283b8e3811585137744 Mon Sep 17 00:00:00 2001
From: Ivan Radanov Ivanov 
Date: Fri, 4 Oct 2024 15:28:07 +0900
Subject: [PATCH 1/6] Different warning

---
 flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp| 9 +
 .../Transforms/OpenMP/lower-workshare-todo-cfg-dom.mlir  | 2 ++
 .../test/Transforms/OpenMP/lower-workshare-todo-cfg.mlir | 2 ++
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp 
b/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp
index 34399abbcd20ea..4d8e2a9a067141 100644
--- a/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp
+++ b/flang/lib/Optimizer/OpenMP/LowerWorkshare.cpp
@@ -94,11 +94,9 @@ bool shouldUseWorkshareLowering(Operation *op) {
   if (isNestedIn(parentWorkshare, op))
 return false;
 
-  if (parentWorkshare.getRegion().getBlocks().size() != 1) {
-parentWorkshare->emitWarning(
-"omp workshare with unstructured control flow currently unsupported.");
+  // Do not use workshare lowering until we support CFG in omp.workshare
+  if (parentWorkshare.getRegion().getBlocks().size() != 1)
 return false;
-  }
 
   return true;
 }
@@ -448,6 +446,9 @@ LogicalResult lowerWorkshare(mlir::omp::WorkshareOp wsOp, 
DominanceInfo &di) {
   } else {
 // Otherwise just change the operation to an omp.single.
 
+wsOp->emitWarning("omp workshare with unstructured control flow currently "
+  "unsupported and will be serialized.");
+
 // `shouldUseWorkshareLowering` should have guaranteed that there are no
 // omp.workshare_loop_wrapper's that bind to this omp.workshare.
 assert(!wsOp->walk([&](Operation *op) {
diff --git a/flang/test/Transforms/OpenMP/lower-workshare-todo-cfg-dom.mlir 
b/flang/test/Transforms/OpenMP/lower-workshare-todo-cfg-dom.mlir
index 62d9da6c520f85..96dc878bed0c99 100644
--- a/flang/test/Transforms/OpenMP/lower-workshare-todo-cfg-dom.mlir
+++ b/flang/test/Transforms/OpenMP/lower-workshare-todo-cfg-dom.mlir
@@ -1,5 +1,7 @@
 // RUN: fir-opt --lower-workshare --allow-unregistered-dialect %s 2>&1 | 
FileCheck %s
 
+// CHECK: warning: omp workshare with unstructured control flow currently 
unsupported and will be serialized.
+
 // CHECK: omp.parallel
 // CHECK-NEXT: omp.single
 
diff --git a/flang/test/Transforms/OpenMP/lower-workshare-todo-cfg.mlir 
b/flang/test/Transforms/OpenMP/lower-workshare-todo-cfg.mlir
index d9551eb99f0762..ce8a4eb96982be 100644
--- a/flang/test/Transforms/OpenMP/lower-workshare-todo-cfg.mlir
+++ b/flang/test/Transforms/OpenMP/lower-workshare-todo-cfg.mlir
@@ -1,5 +1,7 @@
 // RUN: fir-opt --lower-workshare --allow-unregistered-dialect %s 2>&1 | 
FileCheck %s
 
+// CHECK: warning: omp workshare with unstructured control flow currently 
unsupported and will be serialized.
+
 // CHECK: omp.parallel
 // CHECK-NEXT: omp.single
 

>From 40cc5dd26a1e3b8a85a134d6ee249374f70f2bb8 Mon Sep 17 00:00:00 2001
From: Ivan Radanov Ivanov 
Date: Sun, 4 Aug 2024 17:33:52 +0900
Subject: [PATCH 2/6] Add workshare loop wrapper lowerings

Bufferize test

Bufferize test

Bufferize test

Add test for should use workshare lowering
---
 .../HLFIR/Transforms/BufferizeHLFIR.cpp   |   4 +-
 .../Transforms/OptimizedBufferization.cpp |  10 +-
 flang/test/HLFIR/bufferize-workshare.fir  |  58 
 .../OpenMP/should-use-workshare-lowering.mlir | 140 ++
 4 files changed, 208 insertions(+), 4 deletions(-)
 create mode 100644 flang/test/HLFIR/bufferize-workshare.fir
 create mode 100644 
flang/test/Transforms/OpenMP/should-use-workshare-lowering.mlir

diff --git a/flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp 
b/flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp
index 07794828fce267..1848dbe2c7a2c2 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp
+++ b/flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp
@@ -26,6 +26,7 @@
 #include "flang/Optimizer/HLFIR/HLFIRDialect.h"
 #include "flang/Optimizer/HLFIR/HLFIROps.h"
 #include "flang/Optimizer/HLFIR/Passes.h"
+#include "flang/Optimizer/OpenMP/Passes.h"
 #include "mlir/Dialect/OpenMP/OpenMPDialect.h"
 #include "mlir/IR/Dominance.h"
 #include "mlir/IR/PatternMatch.h"
@@ -792,7 +793,8 @@ struct ElementalOpConversion
 // Generate a loop nest looping around the fir.elemental shape and clone
 // fir.elemental region inside the inner loop.
 hlfir::LoopNest loopNest =
-hlfir::genLoopNest(loc, builder, extents, !elemental.isOrdered());
+hlfir::genLoopNest(loc, builder, extents, !elemental.isOrdered(),
+   flangomp::shouldUseWorkshareLowering(elemental));
 auto insPt = builder.saveInsertionPoint();
 builder.setInsertionPointToStart(loopNest.body);
 auto yield = hlfir::inlineElementalOp(loc, builder, elemental,
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp 
b/flan

[llvm-branch-commits] [llvm] [AMDGPU] Serialize WWM_REG vreg flag (PR #110229)

2024-10-03 Thread Akshat Oke via llvm-branch-commits



@@ -0,0 +1,16 @@
+# RUN: llc -mtriple=amdgcn -run-pass=none -o - %s | FileCheck %s

Akshat-Oke wrote:

Negative test is now in MIR/Generic.

https://github.com/llvm/llvm-project/pull/110229

[llvm-branch-commits] [llvm] [AMDGPU] Serialize WWM_REG vreg flag (PR #110229)

2024-10-03 Thread Akshat Oke via llvm-branch-commits


https://github.com/Akshat-Oke updated 
https://github.com/llvm/llvm-project/pull/110229

>From 80207b7bd00d4b0889918d9a7df627f7c304bd7d Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Fri, 27 Sep 2024 08:58:39 +
Subject: [PATCH 1/3] [AMDGPU] Serialize WWM_REG vreg flag

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 15 +++
 llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h |  4 ++--
 llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp  | 11 +++
 llvm/lib/Target/AMDGPU/SIRegisterInfo.h| 10 ++
 llvm/test/CodeGen/AMDGPU/virtual-registers.mir | 16 
 5 files changed, 54 insertions(+), 2 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/virtual-registers.mir

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 1f2148c2922de9..28578a875c164c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1712,6 +1712,21 @@ bool GCNTargetMachine::parseMachineFunctionInfo(
 MFI->reserveWWMRegister(ParsedReg);
   }
 
+  auto setRegisterFlags = [&](const VRegInfo &Info) {
+for (const auto &Flag : Info.Flags) {
+  MFI->setFlag(Info.VReg, Flag);
+}
+  };
+
+  for (const auto &P : PFS.VRegInfosNamed) {
+const VRegInfo &Info = *P.second;
+setRegisterFlags(Info);
+  }
+  for (const auto &P : PFS.VRegInfos) {
+const VRegInfo &Info = *P.second;
+setRegisterFlags(Info);
+  }
+
   auto parseAndCheckArgument = [&](const std::optional &A,
const TargetRegisterClass &RC,
ArgDescriptor &Arg, unsigned UserSGPRs,
diff --git a/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h 
b/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
index 669f98dd865d61..e28c24bf8f8500 100644
--- a/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
@@ -693,8 +693,8 @@ class SIMachineFunctionInfo final : public 
AMDGPUMachineFunction,
 
   void setFlag(Register Reg, uint8_t Flag) {
 assert(Reg.isVirtual());
-if (VRegFlags.inBounds(Reg))
-  VRegFlags[Reg] |= Flag;
+VRegFlags.grow(Reg);
+VRegFlags[Reg] |= Flag;
   }
 
   bool checkFlag(Register Reg, uint8_t Flag) const {
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp 
b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index 9e1c4941dba283..84569b3f11df67 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -3839,3 +3839,14 @@ SIRegisterInfo::getSubRegAlignmentNumBits(const 
TargetRegisterClass *RC,
   }
   return 0;
 }
+
+SmallVector
+SIRegisterInfo::getVRegFlagsOfReg(Register Reg,
+  const MachineFunction &MF) const {
+  SmallVector RegFlags;
+  const SIMachineFunctionInfo *FuncInfo = MF.getInfo();
+  if (FuncInfo->checkFlag(Reg, AMDGPU::VirtRegFlag::WWM_REG)) {
+RegFlags.push_back("WWM_REG");
+  }
+  return RegFlags;
+}
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h 
b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
index 409e5418abc8ec..2c3707e119178a 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
@@ -454,6 +454,16 @@ class SIRegisterInfo final : public AMDGPUGenRegisterInfo {
   // No check if the subreg is supported by the current RC is made.
   unsigned getSubRegAlignmentNumBits(const TargetRegisterClass *RC,
  unsigned SubReg) const;
+
+  std::pair getVRegFlagValue(StringRef Name) const override {
+if (Name == "WWM_REG") {
+  return {true, AMDGPU::VirtRegFlag::WWM_REG};
+}
+return {false, 0};
+  }
+
+  SmallVector
+  getVRegFlagsOfReg(Register Reg, const MachineFunction &MF) const override;
 };
 
 namespace AMDGPU {
diff --git a/llvm/test/CodeGen/AMDGPU/virtual-registers.mir 
b/llvm/test/CodeGen/AMDGPU/virtual-registers.mir
new file mode 100644
index 00..3ea8f6eafcf10c
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/virtual-registers.mir
@@ -0,0 +1,16 @@
+# RUN: llc -mtriple=amdgcn -run-pass=none -o - %s | FileCheck %s
+# This test ensures that the MIR parser parses virtual register flags correctly
+
+---
+name: vregs
+# CHECK: registers:
+# CHECK-NEXT:   - { id: 0, class: vgpr_32, preferred-register: '$vgpr1', 
flags: [ WWM_REG ] }
+# CHECK-NEXT:   - { id: 1, class: sgpr_64, preferred-register: '$sgpr0_sgpr1', 
flags: [  ] }
+# CHECK-NEXT:   - { id: 2, class: sgpr_64, preferred-register: '', flags: [  ] 
}
+registers:
+  - { id: 0, class: vgpr_32, preferred-register: $vgpr1, flags: [ WWM_REG ]}
+  - { id: 1, class: sgpr_64, preferred-register: $sgpr0_sgpr1 }
+body: |
+  bb.0:
+%2:sgpr_64 = COPY %1
+%1:sgpr_64 = COPY %0

>From bc0ab7806225d8acac2d47a8d9a914698cbd1e05 Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Fri, 4 Oct 2024 06:31:06 +
Subject: [PATCH 2/3] Correct TRI methods to optional<> and SmallString

---
 llvm/lib/Target/AMDGPU/SIRegisterI

[llvm-branch-commits] [clang] 3217472 - Revert "[RISCV][FMV] Support target_version (#99040)"

2024-10-03 Thread via llvm-branch-commits


Author: Piyou Chen
Date: 2024-10-04T11:55:45+08:00
New Revision: 32174720649068de7c4ef97a484d777dba72e65c

URL: 
https://github.com/llvm/llvm-project/commit/32174720649068de7c4ef97a484d777dba72e65c
DIFF: 
https://github.com/llvm/llvm-project/commit/32174720649068de7c4ef97a484d777dba72e65c.diff

LOG: Revert "[RISCV][FMV] Support target_version (#99040)"

This reverts commit 7ab488e92c39c813a50cb4fd6587e7afc161c7d5.

Added: 


Modified: 
clang/lib/AST/ASTContext.cpp
clang/lib/CodeGen/CodeGenModule.cpp
clang/lib/Sema/SemaDecl.cpp
clang/lib/Sema/SemaDeclAttr.cpp

Removed: 
clang/test/CodeGen/attr-target-version-riscv-invalid.c
clang/test/CodeGen/attr-target-version-riscv.c
clang/test/CodeGenCXX/attr-target-version-riscv.cpp
clang/test/SemaCXX/attr-target-version-riscv.cpp



diff  --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 034fbbe0bc7829..a81429ad6a2380 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -14325,17 +14325,9 @@ void 
ASTContext::getFunctionFeatureMap(llvm::StringMap &FeatureMap,
   Target->initFeatureMap(FeatureMap, getDiagnostics(), TargetCPU, 
Features);
 }
   } else if (const auto *TV = FD->getAttr()) {
-std::vector Features;
-if (Target->getTriple().isRISCV()) {
-  ParsedTargetAttr ParsedAttr = Target->parseTargetAttr(TV->getName());
-  Features.insert(Features.begin(), ParsedAttr.Features.begin(),
-  ParsedAttr.Features.end());
-} else {
-  assert(Target->getTriple().isAArch64());
-  llvm::SmallVector Feats;
-  TV->getFeatures(Feats);
-  Features = getFMVBackendFeaturesFor(Feats);
-}
+llvm::SmallVector Feats;
+TV->getFeatures(Feats);
+std::vector Features = getFMVBackendFeaturesFor(Feats);
 Features.insert(Features.begin(),
 Target->getTargetOpts().FeaturesAsWritten.begin(),
 Target->getTargetOpts().FeaturesAsWritten.end());

diff  --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index 5ba098144a74e7..25c1c496a4f27f 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -4287,13 +4287,8 @@ void CodeGenModule::emitMultiVersionFunctions() {
   } else if (const auto *TVA = CurFD->getAttr()) {
 if (TVA->isDefaultVersion() && IsDefined)
   ShouldEmitResolver = true;
+TVA->getFeatures(Feats);
 llvm::Function *Func = createFunction(CurFD);
-if (getTarget().getTriple().isRISCV()) {
-  Feats.push_back(TVA->getName());
-} else {
-  assert(getTarget().getTriple().isAArch64());
-  TVA->getFeatures(Feats);
-}
 Options.emplace_back(Func, /*Architecture*/ "", Feats);
   } else if (const auto *TC = CurFD->getAttr()) {
 if (IsDefined)

diff  --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 21f25a2ea09eb0..2bf610746bc317 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -10329,8 +10329,7 @@ Sema::ActOnFunctionDeclarator(Scope *S, Declarator &D, 
DeclContext *DC,
   // Handle attributes.
   ProcessDeclAttributes(S, NewFD, D);
   const auto *NewTVA = NewFD->getAttr();
-  if (Context.getTargetInfo().getTriple().isAArch64() && NewTVA &&
-  !NewTVA->isDefaultVersion() &&
+  if (NewTVA && !NewTVA->isDefaultVersion() &&
   !Context.getTargetInfo().hasFeature("fmv")) {
 // Don't add to scope fmv functions declarations if fmv disabled
 AddToScope = false;
@@ -11039,15 +11038,7 @@ static bool CheckMultiVersionValue(Sema &S, const 
FunctionDecl *FD) {
 
   if (TVA) {
 llvm::SmallVector Feats;
-if (S.getASTContext().getTargetInfo().getTriple().isRISCV()) {
-  ParsedTargetAttr ParseInfo =
-  S.getASTContext().getTargetInfo().parseTargetAttr(TVA->getName());
-  for (auto &Feat : ParseInfo.Features)
-Feats.push_back(StringRef{Feat}.substr(1));
-} else {
-  assert(S.getASTContext().getTargetInfo().getTriple().isAArch64());
-  TVA->getFeatures(Feats);
-}
+TVA->getFeatures(Feats);
 for (const auto &Feat : Feats) {
   if (!TargetInfo.validateCpuSupports(Feat)) {
 S.Diag(FD->getLocation(), diag::err_bad_multiversion_option)
@@ -11333,8 +11324,7 @@ static bool 
PreviousDeclsHaveMultiVersionAttribute(const FunctionDecl *FD) {
 }
 
 static void patchDefaultTargetVersion(FunctionDecl *From, FunctionDecl *To) {
-  if (!From->getASTContext().getTargetInfo().getTriple().isAArch64() &&
-  !From->getASTContext().getTargetInfo().getTriple().isRISCV())
+  if (!From->getASTContext().getTargetInfo().getTriple().isAArch64())
 return;
 
   MultiVersionKind MVKindFrom = From->getMultiVersionKind();
@@ -15521,8 +15511,7 @@ Decl *Sema::ActOnStartOfFunctionDef(Scope *FnBodyScope, 
De

[llvm-branch-commits] [compiler-rt] [llvm] [Coverage] Make SingleByteCoverage work consistent to merging (PR #110972)

2024-10-03 Thread Ellis Hoag via llvm-branch-commits



@@ -952,7 +952,7 @@ void InstrProfRecord::merge(InstrProfRecord &Other, 
uint64_t Weight,
   Value = getInstrMaxCountValue();
   Overflowed = true;
 }
-Counts[I] = Value;

ellishg wrote:

This is deliberate. Even though we only record boolean coverage in the raw 
profiles, when we aggregate many raw profiles together we can still get some 
sense of relative hotness by looking at the counter value. Otherwise we lose 
information if we treat the counter value in the indexed profile as a boolean.
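
A small sketch of the difference, assuming nothing about the real `InstrProfRecord::merge` beyond what is said above. `mergeAccumulate`, `mergeBoolean` and `saturatingAdd` are hypothetical helpers, and weights are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

using Counters = std::vector<std::uint64_t>;

// Adds b to a, clamping at the maximum value instead of wrapping around.
inline std::uint64_t saturatingAdd(std::uint64_t a, std::uint64_t b) {
  const std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
  return (a > max - b) ? max : a + b;
}

// Accumulating merge: each counter ends up as the number of raw profiles
// in which the region was covered, so relative hotness survives.
inline void mergeAccumulate(Counters &into, const Counters &from) {
  assert(into.size() == from.size());
  for (std::size_t i = 0; i < into.size(); ++i)
    into[i] = saturatingAdd(into[i], from[i]);
}

// Boolean merge: each counter only records "covered at least once".
inline void mergeBoolean(Counters &into, const Counters &from) {
  assert(into.size() == from.size());
  for (std::size_t i = 0; i < into.size(); ++i)
    into[i] = (into[i] != 0 || from[i] != 0) ? 1 : 0;
}
```

Merging the raw profiles {1,0,1}, {1,0,0} and {1,1,0} gives {3,1,1} with the accumulating merge but {1,1,1} with the boolean one, so only the former shows that the first region was hit in every run.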

https://github.com/llvm/llvm-project/pull/110972

[llvm-branch-commits] [BOLT] Support perf2bolt-N in the driver (PR #111072)

2024-10-03 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-bolt

Author: Amir Ayupov (aaupov)


Changes

Check invoked tool with `starts_with`


---
Full diff: https://github.com/llvm/llvm-project/pull/111072.diff


1 Files Affected:

- (modified) bolt/tools/driver/llvm-bolt.cpp (+2-2) 


```diff
diff --git a/bolt/tools/driver/llvm-bolt.cpp b/bolt/tools/driver/llvm-bolt.cpp
index 9b03524e9f18e8..a8d1ac64808930 100644
--- a/bolt/tools/driver/llvm-bolt.cpp
+++ b/bolt/tools/driver/llvm-bolt.cpp
@@ -202,9 +202,9 @@ int main(int argc, char **argv) {
 
   ToolName = argv[0];
 
-  if (llvm::sys::path::filename(ToolName) == "perf2bolt")
+  if (llvm::sys::path::filename(ToolName).starts_with("perf2bolt"))
     perf2boltMode(argc, argv);
-  else if (llvm::sys::path::filename(ToolName) == "llvm-boltdiff")
+  else if (llvm::sys::path::filename(ToolName).starts_with("llvm-boltdiff"))
     boltDiffMode(argc, argv);
   else
     boltMode(argc, argv);

```




https://github.com/llvm/llvm-project/pull/111072

[llvm-branch-commits] [BOLT] Support perf2bolt-N in the driver (PR #111072)

2024-10-03 Thread Amir Ayupov via llvm-branch-commits


https://github.com/aaupov created 
https://github.com/llvm/llvm-project/pull/111072

Check invoked tool with `starts_with`
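
As an illustration of what prefix matching on the invoked name buys, here is a minimal standalone sketch. It uses C++17 `std::string_view` rather than LLVM's `StringRef` and `llvm::sys::path::filename`; `baseName`, `hasPrefix`, `ToolMode`, and `selectMode` are hypothetical names, and only `/` is treated as a path separator.

```cpp
#include <cassert>
#include <string_view>

// Return the last path component (handles '/' separators only).
inline std::string_view baseName(std::string_view Path) {
  const auto Slash = Path.find_last_of('/');
  return Slash == std::string_view::npos ? Path : Path.substr(Slash + 1);
}

// True if Name begins with Prefix. substr clamps the length, so a Name that
// is shorter than Prefix simply compares unequal.
inline bool hasPrefix(std::string_view Name, std::string_view Prefix) {
  return Name.substr(0, Prefix.size()) == Prefix;
}

enum class ToolMode { Bolt, Perf2Bolt, BoltDiff };

// Pick the driver mode from argv[0], accepting versioned names such as
// "perf2bolt-19" in addition to the exact tool name.
inline ToolMode selectMode(std::string_view Argv0) {
  const std::string_view Name = baseName(Argv0);
  if (hasPrefix(Name, "perf2bolt"))
    return ToolMode::Perf2Bolt;
  if (hasPrefix(Name, "llvm-boltdiff"))
    return ToolMode::BoltDiff;
  return ToolMode::Bolt;
}
```

Note that a plain prefix test also accepts names such as `perf2boltfoo`; that is a property of this sketch and of any `starts_with`-style check.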




[llvm-branch-commits] [BOLT] Support --show-density for fdata and YAML profiles (PR #110567)

2024-10-03 Thread Amir Ayupov via llvm-branch-commits


https://github.com/aaupov closed 
https://github.com/llvm/llvm-project/pull/110567

[llvm-branch-commits] [compiler-rt] [llvm] [Coverage] Make SingleByteCoverage work consistent to merging (PR #110972)

2024-10-03 Thread NAKAMURA Takumi via llvm-branch-commits



@@ -952,7 +952,7 @@ void InstrProfRecord::merge(InstrProfRecord &Other, 
uint64_t Weight,
   Value = getInstrMaxCountValue();
   Overflowed = true;
 }
-Counts[I] = Value;

chapuni wrote:

I hadn't considered the PGO use cases. I'll leave it unchanged.

Alternatively, do you think we could round counters to booleans only in llvm-cov?

https://github.com/llvm/llvm-project/pull/110972

[llvm-branch-commits] [compiler-rt] [llvm] [Coverage] Make SingleByteCoverage work consistent to merging (PR #110972)

2024-10-03 Thread Ellis Hoag via llvm-branch-commits



@@ -952,7 +952,7 @@ void InstrProfRecord::merge(InstrProfRecord &Other, 
uint64_t Weight,
   Value = getInstrMaxCountValue();
   Overflowed = true;
 }
-Counts[I] = Value;

ellishg wrote:

I think that makes sense for frontend coverage since we aren't using those 
values for optimization.

https://github.com/llvm/llvm-project/pull/110972

[llvm-branch-commits] [flang] [WIP][flang] Introduce HLFIR lowerings to omp.workshare_loop_nest (PR #104748)

2024-10-03 Thread Thirumalai Shaktivel via llvm-branch-commits


Thirumalai-Shaktivel wrote:

Hi @ivanradanov, thanks for the PR!

I tried building and testing this PR and came across a case where it
segfaults. Can you please check it?
```fortran
program test
real :: arr_01(10)
!$omp parallel workshare
arr_01 = arr_01*2
!$omp end parallel workshare
end program
```

https://github.com/llvm/llvm-project/pull/104748

[llvm-branch-commits] [llvm] [RISCV][CFI] add function epilogue cfi information (PR #110810)

2024-10-03 Thread via llvm-branch-commits


https://github.com/dlav-sc edited 
https://github.com/llvm/llvm-project/pull/110810

[llvm-branch-commits] [libcxx] [release/19.x][libc++] Follow-up to "Poison Pills are Too Toxic" (PR #109291)

2024-10-03 Thread Louis Dionne via llvm-branch-commits


https://github.com/ldionne updated 
https://github.com/llvm/llvm-project/pull/109291

>From 094cc808ded9b00a50e26b91898323e17cc4840f Mon Sep 17 00:00:00 2001
From: Jakub Mazurkiewicz 
Date: Wed, 10 Apr 2024 23:12:22 +0200
Subject: [PATCH 1/2] [libc++] Follow-up to "Poison Pills are Too Toxic"

* Update release notes and `Cxx23.html`
* Update `__cpp_lib_ranges` feature test macro
---
 libcxx/docs/FeatureTestMacroTable.rst| 2 ++
 libcxx/docs/ReleaseNotes/19.rst  | 1 +
 libcxx/docs/Status/Cxx23.rst | 1 +
 libcxx/docs/Status/Cxx23Papers.csv   | 2 +-
 libcxx/include/version   | 5 -
 .../algorithm.version.compile.pass.cpp   | 9 +
 .../functional.version.compile.pass.cpp  | 9 +
 .../iterator.version.compile.pass.cpp| 9 +
 .../memory.version.compile.pass.cpp  | 9 +
 .../ranges.version.compile.pass.cpp  | 9 +
 .../version.version.compile.pass.cpp | 9 +
 libcxx/utils/generate_feature_test_macro_components.py   | 1 +
 12 files changed, 40 insertions(+), 26 deletions(-)

diff --git a/libcxx/docs/FeatureTestMacroTable.rst 
b/libcxx/docs/FeatureTestMacroTable.rst
index a1506e115fe70f..7f95f0f4e1c17c 100644
--- a/libcxx/docs/FeatureTestMacroTable.rst
+++ b/libcxx/docs/FeatureTestMacroTable.rst
@@ -350,6 +350,8 @@ Status
 -- 
-
 ``__cpp_lib_print````202207L``
 -- 
-
+``__cpp_lib_ranges``   ``202211L``
+-- 
-
 ``__cpp_lib_ranges_as_const``  *unimplemented*
 -- 
-
 ``__cpp_lib_ranges_as_rvalue`` ``202207L``
diff --git a/libcxx/docs/ReleaseNotes/19.rst b/libcxx/docs/ReleaseNotes/19.rst
index 92896f6b0d11e7..26210ddb274e5f 100644
--- a/libcxx/docs/ReleaseNotes/19.rst
+++ b/libcxx/docs/ReleaseNotes/19.rst
@@ -77,6 +77,7 @@ Implemented Papers
 - P2602R2 - Poison Pills are Too Toxic
 - P1981R0 - Rename ``leap`` to ``leap_second``
 - P1982R0 - Rename ``link`` to ``time_zone_link``
+- P2602R2 - Poison Pills are Too Toxic (as DR against C++20)
 
 
 Improvements and New Features
diff --git a/libcxx/docs/Status/Cxx23.rst b/libcxx/docs/Status/Cxx23.rst
index 23d30c8128d71e..8c1cae8b3e3b2f 100644
--- a/libcxx/docs/Status/Cxx23.rst
+++ b/libcxx/docs/Status/Cxx23.rst
@@ -44,6 +44,7 @@ Paper Status
.. [#note-P1413R3] P1413R3: ``std::aligned_storage_t`` and 
``std::aligned_union_t`` are marked deprecated, but
   clang doesn't issue a diagnostic for deprecated using template 
declarations.
.. [#note-P2520R0] P2520R0: Libc++ implemented this paper as a DR in C++20 
as well.
+   .. [#note-P2602R2] P2602R2: Libc++ implemented this paper as a DR in C++20 
as well.
.. [#note-P2711R1] P2711R1: ``join_with_view`` hasn't been done yet since 
this type isn't implemented yet.
.. [#note-P2770R0] P2770R0: ``join_with_view`` hasn't been done yet since 
this type isn't implemented yet.
.. [#note-P2693R1] P2693R1: The formatter for ``std::thread::id`` is 
implemented.
diff --git a/libcxx/docs/Status/Cxx23Papers.csv 
b/libcxx/docs/Status/Cxx23Papers.csv
index 92f4908487ae72..f46bb844532029 100644
--- a/libcxx/docs/Status/Cxx23Papers.csv
+++ b/libcxx/docs/Status/Cxx23Papers.csv
@@ -100,7 +100,7 @@
 "`P2396R1 `__","LWG", "Concurrency TS 2 fixes ", 
"November 2022","","","|concurrency TS|"
 "`P2505R5 `__","LWG", "Monadic Functions for 
``std::expected``", "November 2022","|Complete|","17.0",""
 "`P2539R4 `__","LWG", "Should the output of 
``std::print`` to a terminal be synchronized with the underlying stream?", 
"November 2022","|Complete|","18.0","|format|"
-"`P2602R2 `__","LWG", "Poison Pills are Too Toxic", 
"November 2022","|Complete|","19.0","|ranges|"
+"`P2602R2 `__","LWG", "Poison Pills are Too Toxic", 
"November 2022","|Complete| [#note-P2602R2]_","19.0","|ranges| |DR|"
 "`P2708R1 `__","LWG", "No Further Fundamentals 
TSes", "November 2022","|Nothing to do|","",""
 "","","","","","",""
 "`P0290R4 `__","LWG", "``apply()`` for 
``synchronized_value``","February 2023","","","|concurrency TS|"
diff --git a/libcxx/include/version b/libcxx/include/version
index fe64343eafbc9c..c8a31f77a915e1 100644
--- a/libcxx/include/version
+++ b/libcxx/include/version
@@ -182,8 +182,9 @@ __cpp_lib_philox_engine

[llvm-branch-commits] [llvm] [MergeFunctions] Add support to run the pass over a set of function pointers (PR #110996)

2024-10-03 Thread Rafael Eckstein via llvm-branch-commits


https://github.com/Casperento created 
https://github.com/llvm/llvm-project/pull/110996

This modification will enable the usage of `MergeFunctions` as a standalone 
library. Currently, `MergeFunctions` can only be applied to an entire module. 
By adopting this change, developers will gain the flexibility to reuse the 
`MergeFunctions` code within their own projects, choosing which functions to 
merge; hence, promoting code reusability. Notice that this modification will 
not break backward compatibility, because `MergeFunctions` will still work as a 
pass after the modification.

### Summary of Changes:
- Modified the `MergeFunctionsPass` to allow running the pass over a set of 
function pointers.
- This behavior is optional and doesn't interfere with the existing 
functionality of running the pass on the entire `Module`.
- Added unit tests to assert the correctness of the updated implementation, 
ensuring that function merging works as expected when run on both sets of 
pointers and full modules.
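
The candidate pre-filter used by `runOnFunctions` in the patch (sort by hash, then keep only entries whose hash matches a neighbour) can be sketched independently of LLVM. This is a simplified model, not the pass itself: functions are represented by `(hash, name)` pairs, and `HashedItem` and `mergeCandidates` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a (function hash, function) pair.
using HashedItem = std::pair<uint64_t, std::string>;

// Return the names of all items that share a hash with at least one other
// item. Items with a unique hash cannot be merged with anything, so they are
// dropped before any expensive comparison takes place.
inline std::vector<std::string> mergeCandidates(std::vector<HashedItem> Items) {
  // A stable sort keeps the input order among items with equal hashes.
  std::stable_sort(Items.begin(), Items.end(),
                   [](const HashedItem &A, const HashedItem &B) {
                     return A.first < B.first;
                   });
  std::vector<std::string> Out;
  for (std::size_t I = 0; I < Items.size(); ++I) {
    const bool SameAsPrev = I > 0 && Items[I - 1].first == Items[I].first;
    const bool SameAsNext =
        I + 1 < Items.size() && Items[I + 1].first == Items[I].first;
    if (SameAsPrev || SameAsNext)
      Out.push_back(Items[I].second);
  }
  return Out;
}
```

Equal hashes only make two functions candidates; in the real pass a full structural comparison still decides whether they are actually merged.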

>From 9b0073551ece0d22bf3378af2b03e456a26031b6 Mon Sep 17 00:00:00 2001
From: Casperento <44746868+caspere...@users.noreply.github.com>
Date: Tue, 24 Sep 2024 16:45:59 -0300
Subject: [PATCH] new runOn method

remove templates

unit tests added

format
---
 .../llvm/Transforms/IPO/MergeFunctions.h  |   7 +
 llvm/lib/Transforms/IPO/MergeFunctions.cpp|  63 +++-
 .../unittests/Transforms/Utils/CMakeLists.txt |   1 +
 .../Transforms/Utils/MergeFunctionsTest.cpp   | 270 ++
 .../llvm/unittests/Transforms/Utils/BUILD.gn  |   1 +
 5 files changed, 340 insertions(+), 2 deletions(-)
 create mode 100644 llvm/unittests/Transforms/Utils/MergeFunctionsTest.cpp

diff --git a/llvm/include/llvm/Transforms/IPO/MergeFunctions.h 
b/llvm/include/llvm/Transforms/IPO/MergeFunctions.h
index 822f0fd99188d0..1b3b1d22f11e28 100644
--- a/llvm/include/llvm/Transforms/IPO/MergeFunctions.h
+++ b/llvm/include/llvm/Transforms/IPO/MergeFunctions.h
@@ -15,7 +15,10 @@
 #ifndef LLVM_TRANSFORMS_IPO_MERGEFUNCTIONS_H
 #define LLVM_TRANSFORMS_IPO_MERGEFUNCTIONS_H
 
+#include "llvm/IR/Function.h"
 #include "llvm/IR/PassManager.h"
+#include 
+#include 
 
 namespace llvm {
 
@@ -25,6 +28,10 @@ class Module;
 class MergeFunctionsPass : public PassInfoMixin {
 public:
   PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
+
+  static bool runOnModule(Module &M);
+  static std::pair>
+  runOnFunctions(std::set &F);
 };
 
 } // end namespace llvm
diff --git a/llvm/lib/Transforms/IPO/MergeFunctions.cpp 
b/llvm/lib/Transforms/IPO/MergeFunctions.cpp
index feda5d6459cb47..2e775be4cab7c8 100644
--- a/llvm/lib/Transforms/IPO/MergeFunctions.cpp
+++ b/llvm/lib/Transforms/IPO/MergeFunctions.cpp
@@ -122,6 +122,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -198,6 +199,8 @@ class MergeFunctions {
   }
 
   bool runOnModule(Module &M);
+  bool runOnFunctions(std::set &F);
+  std::map &getDelToNewMap();
 
 private:
   // The function comparison operator is provided here so that FunctionNodes do
@@ -291,17 +294,31 @@ class MergeFunctions {
   // dangling iterators into FnTree. The invariant that preserves this is that
   // there is exactly one mapping F -> FN for each FunctionNode FN in FnTree.
   DenseMap, FnTreeType::iterator> FNodesInTree;
+
+  /// Deleted-New functions mapping
+  std::map DelToNewMap;
 };
 } // end anonymous namespace
 
 PreservedAnalyses MergeFunctionsPass::run(Module &M,
   ModuleAnalysisManager &AM) {
-  MergeFunctions MF;
-  if (!MF.runOnModule(M))
+  if (!MergeFunctionsPass::runOnModule(M))
 return PreservedAnalyses::all();
   return PreservedAnalyses::none();
 }
 
+bool MergeFunctionsPass::runOnModule(Module &M) {
+  MergeFunctions MF;
+  return MF.runOnModule(M);
+}
+
+std::pair>
+MergeFunctionsPass::runOnFunctions(std::set &F) {
+  MergeFunctions MF;
+  bool MergeResult = MF.runOnFunctions(F);
+  return {MergeResult, MF.getDelToNewMap()};
+}
+
 #ifndef NDEBUG
 bool MergeFunctions::doFunctionalCheck(std::vector &Worklist) {
   if (const unsigned Max = NumFunctionsForVerificationCheck) {
@@ -439,6 +456,47 @@ bool MergeFunctions::runOnModule(Module &M) {
   return Changed;
 }
 
+bool MergeFunctions::runOnFunctions(std::set &F) {
+  bool Changed = false;
+  std::vector> 
HashedFuncs;
+  for (Function *Func : F) {
+if (isEligibleForMerging(*Func)) {
+  HashedFuncs.push_back({FunctionComparator::functionHash(*Func), Func});
+}
+  }
+  llvm::stable_sort(HashedFuncs, less_first());
+  auto S = HashedFuncs.begin();
+  for (auto I = HashedFuncs.begin(), IE = HashedFuncs.end(); I != IE; ++I) {
+if ((I != S && std::prev(I)->first == I->first) ||
+(std::next(I) != IE && std::next(I)->first == I->first)) {
+  Deferred.push_back(WeakTrackingVH(I->second));
+}
+  }
+  do {
+std::vector Worklist;
+Deferred.swap(Worklist);
+LLVM_DEBUG(dbgs() << "size of function: " << F.size() << '\n');
+LLVM_DEBUG(d

[llvm-branch-commits] [llvm] [MergeFunctions] Add support to run the pass over a set of function pointers (PR #110996)

2024-10-03 Thread via llvm-branch-commits


github-actions[bot] wrote:

This repository does not accept pull requests. Please follow 
http://llvm.org/docs/Contributing.html#how-to-submit-a-patch for contribution 
to LLVM.

https://github.com/llvm/llvm-project/pull/110996

[llvm-branch-commits] [llvm] [MergeFunctions] Add support to run the pass over a set of function pointers (PR #110996)

2024-10-03 Thread via llvm-branch-commits


https://github.com/github-actions[bot] closed 
https://github.com/llvm/llvm-project/pull/110996

[llvm-branch-commits] [llvm] [MergeFunctions] Add support to run the pass over a set of function pointers (PR #110996)

2024-10-03 Thread via llvm-branch-commits


https://github.com/github-actions[bot] locked 
https://github.com/llvm/llvm-project/pull/110996

[llvm-branch-commits] [lldb] 3983d73 - Revert "[lldb][test] TestDataFormatterLibcxxStringSimulator.py: add new paddi…"

2024-10-03 Thread via llvm-branch-commits


Author: Michael Buch
Date: 2024-10-03T14:57:40+01:00
New Revision: 3983d73e32a793b42a3955a34a0662daafa1355f

URL: 
https://github.com/llvm/llvm-project/commit/3983d73e32a793b42a3955a34a0662daafa1355f
DIFF: 
https://github.com/llvm/llvm-project/commit/3983d73e32a793b42a3955a34a0662daafa1355f.diff

LOG: Revert "[lldb][test] TestDataFormatterLibcxxStringSimulator.py: add new 
paddi…"

This reverts commit d5f6e886ff0df8265d44ab0646afcb4a06e6475a.

Added: 


Modified: 

lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/string/TestDataFormatterLibcxxStringSimulator.py

lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/string/main.cpp

Removed: 




diff  --git 
a/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/string/TestDataFormatterLibcxxStringSimulator.py
 
b/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/string/TestDataFormatterLibcxxStringSimulator.py
index fff181440b6d7c..afe6374e55a355 100644
--- 
a/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/string/TestDataFormatterLibcxxStringSimulator.py
+++ 
b/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/string/TestDataFormatterLibcxxStringSimulator.py
@@ -27,7 +27,7 @@ def _run_test(self, defines):
 
 
 for v in [None, "ALTERNATE_LAYOUT"]:
-for r in range(6):
+for r in range(5):
 for c in range(3):
 name = "test_r%d_c%d" % (r, c)
 defines = ["REVISION=%d" % r, "COMPRESSED_PAIR_REV=%d" % c]

diff  --git 
a/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/string/main.cpp
 
b/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/string/main.cpp
index 628d32c8d7a55e..f8fc13c10c4372 100644
--- 
a/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/string/main.cpp
+++ 
b/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/string/main.cpp
@@ -20,11 +20,7 @@
 // Pre-D128285 layout.
 #define PACKED_ANON_STRUCT
 #endif
-#if REVISION <= 4
-// Pre-2a1ef74 layout.
-#define NON_STANDARD_PADDING
-#endif
-// REVISION == 5: current layout
+// REVISION == 4: current layout
 
 #ifdef PACKED_ANON_STRUCT
 #define BEGIN_PACKED_ANON_STRUCT struct __attribute__((packed)) {
@@ -38,7 +34,6 @@
 namespace std {
 namespace __lldb {
 
-#ifdef NON_STANDARD_PADDING
 #if defined(ALTERNATE_LAYOUT) && defined(SUBCLASS_PADDING)
 template  struct __padding {
   unsigned char __xx[sizeof(_CharT) - 1];
@@ -46,13 +41,6 @@ template  struct 
__padding {
 
 template  struct __padding<_CharT, 1> {};
 #endif
-#else // !NON_STANDARD_PADDING
-template  struct __padding {
-  char __padding_[_PaddingSize];
-};
-
-template <> struct __padding<0> {};
-#endif
 
 template  class basic_string {
 public:
@@ -89,12 +77,7 @@ template  
class basic_string {
 };
 #else // !SUBCLASS_PADDING
 
-#ifdef NON_STANDARD_PADDING
 unsigned char __padding[sizeof(value_type) - 1];
-#else
-[[no_unique_address]] __padding __padding_;
-#endif
-
 #ifdef BITMASKS
 unsigned char __size_;
 #else // !BITMASKS
@@ -146,26 +129,21 @@ template  
class basic_string {
 union {
 #ifdef BITMASKS
   unsigned char __size_;
-#else  // !BITMASKS
+#else
   struct {
 unsigned char __is_long_ : 1;
 unsigned char __size_ : 7;
   };
-#endif // BITMASKS
+#endif
   value_type __lx;
 };
-#else  // !SHORT_UNION
+#else
 BEGIN_PACKED_ANON_STRUCT
 unsigned char __is_long_ : 1;
 unsigned char __size_ : 7;
 END_PACKED_ANON_STRUCT
-#ifdef NON_STANDARD_PADDING
-unsigned char __padding[sizeof(value_type) - 1];
-#else  // !NON_STANDARD_PADDING
-[[no_unique_address]] __padding __padding_;
-#endif // NON_STANDARD_PADDING
-
-#endif // SHORT_UNION
+char __padding_[sizeof(value_type) - 1];
+#endif
 value_type __data_[__min_cap];
   };
 




[llvm-branch-commits] [flang] [WIP][flang] Introduce HLFIR lowerings to omp.workshare_loop_nest (PR #104748)

2024-10-03 Thread Ivan R. Ivanov via llvm-branch-commits


ivanradanov wrote:

Thank you very much. It seems to happen only with `-O0`; I am trying to find
the root cause now.

https://github.com/llvm/llvm-project/pull/104748

[llvm-branch-commits] [llvm] AMDGPU: Do not tail call if an inreg argument requires waterfalling (PR #111002)

2024-10-03 Thread Matt Arsenault via llvm-branch-commits


https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/111002

If we have a divergent value passed to an outgoing inreg argument,
the call needs to be executed in a waterfall loop and thus cannot
be tail called.

The waterfall handling of arbitrary calls is broken on the selectiondag
path, so some of these cases still hit an error later.

I also noticed the argument evaluation code in isEligibleForTailCallOptimization
is not correctly accounting for implicit argument assignments. It also seems
inreg codegen is generally broken; we are assigning arguments to the reserved
private resource descriptor.
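
The eligibility rule described above can be modelled in a few lines. This is a simplified sketch, not the code in `SIISelLowering.cpp`: `OutgoingArg` and `mayTailCall` are hypothetical names, and register assignment and divergence are reduced to plain booleans.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one outgoing call argument.
struct OutgoingArg {
  bool InRegister; // false means the argument is passed in memory
  bool InSGPR;     // true if the assigned register is a scalar register
  bool Divergent;  // true if the value may differ between lanes
};

// A divergent value assigned to a scalar register has to be processed in a
// waterfall loop (one iteration per distinct value), which requires control
// to return to the caller, so such a call cannot be a tail call.
inline bool mayTailCall(const std::vector<OutgoingArg> &Args) {
  for (const OutgoingArg &A : Args) {
    if (!A.InRegister)
      continue; // memory arguments are not covered by this check
    if (A.InSGPR && A.Divergent)
      return false;
  }
  return true;
}
```

Uniform values in scalar registers and divergent values in vector registers both pass this check; only the combination of a scalar register with a divergent value blocks the tail call.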

>From 7e7685b87e0fc00f9d329f3402885e5e01c03672 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Thu, 3 Oct 2024 16:06:49 +0400
Subject: [PATCH] AMDGPU: Do not tail call if an inreg argument requires
 waterfalling

If we have a divergent value passed to an outgoing inreg argument,
the call needs to be executed in a waterfall loop and thus cannot
be tail called.

The waterfall handling of arbitrary calls is broken on the selectiondag
path, so some of these cases still hit an error later.

I also noticed the argument evaluation code in isEligibleForTailCallOptimization
is not correctly accounting for implicit argument assignments. It also seems
inreg codegen is generally broken; we are assigning arguments to the reserved
private resource descriptor.
---
 llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp |   3 +
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  59 +-
 llvm/lib/Target/AMDGPU/SIRegisterInfo.h   |   3 +
 .../isel-amdgcn-cs-chain-intrinsic-w32.ll | 196 +-
 .../isel-amdgcn-cs-chain-intrinsic-w64.ll | 196 +-
 .../AMDGPU/tail-call-inreg-arguments.error.ll |  78 +++
 .../AMDGPU/tail-call-inreg-arguments.ll   |  97 +
 7 files changed, 510 insertions(+), 122 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/tail-call-inreg-arguments.error.ll
 create mode 100644 llvm/test/CodeGen/AMDGPU/tail-call-inreg-arguments.ll

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
index 25e36dc4b3691f..2cde47c743f9e8 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
@@ -1142,6 +1142,9 @@ bool 
AMDGPUCallLowering::isEligibleForTailCallOptimization(
 return false;
   }
 
+  // FIXME: We need to check if any arguments passed in SGPR are uniform. If
+  // they are not, this cannot be a tail call. If they are uniform, but may be
+  // VGPR, we need to insert readfirstlanes.
   if (!areCalleeOutgoingArgsTailCallable(Info, MF, OutArgs))
 return false;
 
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 67e5b3de741412..53cb0800f7fd27 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -3593,6 +3593,8 @@ bool SITargetLowering::isEligibleForTailCallOptimization(
   SmallVector ArgLocs;
   CCState CCInfo(CalleeCC, IsVarArg, MF, ArgLocs, Ctx);
 
+  // FIXME: We are not allocating special input registers, so we will be
+  // deciding based on incorrect register assignments.
   CCInfo.AnalyzeCallOperands(Outs, CCAssignFnForCall(CalleeCC, IsVarArg));
 
   const SIMachineFunctionInfo *FuncInfo = MF.getInfo();
@@ -3602,6 +3604,21 @@ bool SITargetLowering::isEligibleForTailCallOptimization(
   if (CCInfo.getStackSize() > FuncInfo->getBytesInStackArgArea())
 return false;
 
+  for (const auto &[CCVA, ArgVal] : zip_equal(ArgLocs, OutVals)) {
+// FIXME: What about inreg arguments that end up passed in memory?
+if (!CCVA.isRegLoc())
+  continue;
+
+// If we are passing an argument in an SGPR, and the value is divergent,
+// this call requires a waterfall loop.
+if (ArgVal->isDivergent() && TRI->isSGPRPhysReg(CCVA.getLocReg())) {
+  LLVM_DEBUG(
+  dbgs() << "Cannot tail call due to divergent outgoing argument in "
+ << printReg(CCVA.getLocReg(), TRI) << '\n');
+  return false;
+}
+  }
+
   const MachineRegisterInfo &MRI = MF.getRegInfo();
   return parametersInCSRMatch(MRI, CallerPreserved, ArgLocs, OutVals);
 }
@@ -3734,6 +3751,7 @@ SDValue SITargetLowering::LowerCall(CallLoweringInfo &CLI,
   // arguments to begin at SP+0. Completely unused for non-tail calls.
   int32_t FPDiff = 0;
   MachineFrameInfo &MFI = MF.getFrameInfo();
+  auto *TRI = static_cast(Subtarget->getRegisterInfo());
 
   // Adjust the stack pointer for the new arguments...
   // These operations are automatically eliminated by the prolog/epilog pass
@@ -3756,6 +3774,8 @@ SDValue SITargetLowering::LowerCall(CallLoweringInfo &CLI,
 }
   }
 
+  const unsigned NumSpecialInputs = RegsToPass.size();
+
   MVT PtrVT = MVT::i32;
 
   // Walk the register/memloc assignments, inserting copies/loads.
@@ -3857,16 +3877,40 @@ SDValue SITargetLowering::LowerCall(CallLoweringInfo 
&CLI,
   if (!MemOp

[llvm-branch-commits] [llvm] AMDGPU: Do not tail call if an inreg argument requires waterfalling (PR #111002)

2024-10-03 Thread Matt Arsenault via llvm-branch-commits


https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/111002

[llvm-branch-commits] [llvm] AMDGPU: Do not tail call if an inreg argument requires waterfalling (PR #111002)

2024-10-03 Thread Matt Arsenault via llvm-branch-commits


arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite.

* **#111002** 👈
* **#110984**
* `main`

This stack of pull requests is managed by Graphite.


https://github.com/llvm/llvm-project/pull/111002

[llvm-branch-commits] [llvm] AMDGPU: Do not tail call if an inreg argument requires waterfalling (PR #111002)

2024-10-03 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)


Changes

If we have a divergent value passed to an outgoing inreg argument,
the call needs to be executed in a waterfall loop and thus cannot
be tail called.

The waterfall handling of arbitrary calls is broken on the selectiondag
path, so some of these cases still hit an error later.

I also noticed the argument evaluation code in isEligibleForTailCallOptimization
is not correctly accounting for implicit argument assignments. It also seems
inreg codegen is generally broken; we are assigning arguments to the reserved
private resource descriptor.

---

Patch is 63.45 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/111002.diff


7 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp (+3) 
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+49-10) 
- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.h (+3) 
- (modified) llvm/test/CodeGen/AMDGPU/isel-amdgcn-cs-chain-intrinsic-w32.ll 
(+140-56) 
- (modified) llvm/test/CodeGen/AMDGPU/isel-amdgcn-cs-chain-intrinsic-w64.ll 
(+140-56) 
- (added) llvm/test/CodeGen/AMDGPU/tail-call-inreg-arguments.error.ll (+78) 
- (added) llvm/test/CodeGen/AMDGPU/tail-call-inreg-arguments.ll (+97) 


``diff
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
index 25e36dc4b3691f..2cde47c743f9e8 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
@@ -1142,6 +1142,9 @@ bool 
AMDGPUCallLowering::isEligibleForTailCallOptimization(
 return false;
   }
 
+  // FIXME: We need to check if any arguments passed in SGPR are uniform. If
+  // they are not, this cannot be a tail call. If they are uniform, but may be
+  // VGPR, we need to insert readfirstlanes.
   if (!areCalleeOutgoingArgsTailCallable(Info, MF, OutArgs))
 return false;
 
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 67e5b3de741412..53cb0800f7fd27 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -3593,6 +3593,8 @@ bool SITargetLowering::isEligibleForTailCallOptimization(
   SmallVector ArgLocs;
   CCState CCInfo(CalleeCC, IsVarArg, MF, ArgLocs, Ctx);
 
+  // FIXME: We are not allocating special input registers, so we will be
+  // deciding based on incorrect register assignments.
   CCInfo.AnalyzeCallOperands(Outs, CCAssignFnForCall(CalleeCC, IsVarArg));
 
   const SIMachineFunctionInfo *FuncInfo = MF.getInfo();
@@ -3602,6 +3604,21 @@ bool SITargetLowering::isEligibleForTailCallOptimization(
   if (CCInfo.getStackSize() > FuncInfo->getBytesInStackArgArea())
 return false;
 
+  for (const auto &[CCVA, ArgVal] : zip_equal(ArgLocs, OutVals)) {
+// FIXME: What about inreg arguments that end up passed in memory?
+if (!CCVA.isRegLoc())
+  continue;
+
+// If we are passing an argument in an SGPR, and the value is divergent,
+// this call requires a waterfall loop.
+if (ArgVal->isDivergent() && TRI->isSGPRPhysReg(CCVA.getLocReg())) {
+  LLVM_DEBUG(
+  dbgs() << "Cannot tail call due to divergent outgoing argument in "
+ << printReg(CCVA.getLocReg(), TRI) << '\n');
+  return false;
+}
+  }
+
   const MachineRegisterInfo &MRI = MF.getRegInfo();
   return parametersInCSRMatch(MRI, CallerPreserved, ArgLocs, OutVals);
 }
@@ -3734,6 +3751,7 @@ SDValue SITargetLowering::LowerCall(CallLoweringInfo &CLI,
   // arguments to begin at SP+0. Completely unused for non-tail calls.
   int32_t FPDiff = 0;
   MachineFrameInfo &MFI = MF.getFrameInfo();
+  auto *TRI = static_cast(Subtarget->getRegisterInfo());
 
   // Adjust the stack pointer for the new arguments...
   // These operations are automatically eliminated by the prolog/epilog pass
@@ -3756,6 +3774,8 @@ SDValue SITargetLowering::LowerCall(CallLoweringInfo &CLI,
 }
   }
 
+  const unsigned NumSpecialInputs = RegsToPass.size();
+
   MVT PtrVT = MVT::i32;
 
   // Walk the register/memloc assignments, inserting copies/loads.
@@ -3857,16 +3877,40 @@ SDValue SITargetLowering::LowerCall(CallLoweringInfo 
&CLI,
   if (!MemOpChains.empty())
 Chain = DAG.getNode(ISD::TokenFactor, DL, MVT::Other, MemOpChains);
 
+  SDValue ReadFirstLaneID =
+  DAG.getTargetConstant(Intrinsic::amdgcn_readfirstlane, DL, MVT::i32);
+
+  SDValue TokenGlue;
+  if (CLI.ConvergenceControlToken) {
+TokenGlue = DAG.getNode(ISD::CONVERGENCECTRL_GLUE, DL, MVT::Glue,
+CLI.ConvergenceControlToken);
+  }
+
   // Build a sequence of copy-to-reg nodes chained together with token chain
   // and flag operands which copy the outgoing args into the appropriate regs.
   SDValue InGlue;
-  for (auto &RegToPass : RegsToPass) {
-Chain = DAG.getCopyToReg(Chain, DL, RegToPass.first,
- RegToPass.second, InGl

[llvm-branch-commits] [llvm] AMDGPU: Do not tail call if an inreg argument requires waterfalling (PR #111002)

2024-10-03 Thread Matt Arsenault via llvm-branch-commits


https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/111002

>From ac0b62834e39264a02656301515c8023b350b33d Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Thu, 3 Oct 2024 16:06:49 +0400
Subject: [PATCH] AMDGPU: Do not tail call if an inreg argument requires
 waterfalling

If we have a divergent value passed to an outgoing inreg argument,
the call needs to be executed in a waterfall loop and thus cannot
be tail called.

The waterfall handling of arbitrary calls is broken on the selectiondag
path, so some of these cases still hit an error later.

I also noticed the argument evaluation code in isEligibleForTailCallOptimization
is not correctly accounting for implicit argument assignments. It also seems
inreg codegen is generally broken; we are assigning arguments to the reserved
private resource descriptor.
---
 llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp |   3 +
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  60 +-
 llvm/lib/Target/AMDGPU/SIRegisterInfo.h   |   3 +
 .../isel-amdgcn-cs-chain-intrinsic-w32.ll | 196 +-
 .../isel-amdgcn-cs-chain-intrinsic-w64.ll | 196 +-
 .../AMDGPU/tail-call-inreg-arguments.error.ll |  78 +++
 .../AMDGPU/tail-call-inreg-arguments.ll   |  97 +
 7 files changed, 510 insertions(+), 123 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/tail-call-inreg-arguments.error.ll
 create mode 100644 llvm/test/CodeGen/AMDGPU/tail-call-inreg-arguments.ll
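
The rule described in the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `RegClass`, `OutgoingArg`, and `mayTailCall` are hypothetical names invented for illustration and are not LLVM types or APIs. It shows the decision only (a divergent value assigned to an SGPR rules out the tail call), not the waterfall loop itself.

```cpp
#include <vector>

// Hypothetical, simplified model of an outgoing call argument.
// None of these names exist in LLVM; they only mirror the idea in the patch.
enum class RegClass { SGPR, VGPR };

struct OutgoingArg {
  bool InRegister; // false: the argument is passed in memory
  RegClass Class;  // meaningful only when InRegister is true
  bool Divergent;  // true: the value may differ between lanes
};

// Returns false if any register argument is divergent but assigned to an
// SGPR; such a call would need a waterfall loop and so cannot be a tail call.
inline bool mayTailCall(const std::vector<OutgoingArg> &Args) {
  for (const OutgoingArg &A : Args) {
    if (!A.InRegister)
      continue; // memory arguments are skipped, as in the patch's FIXME
    if (A.Divergent && A.Class == RegClass::SGPR)
      return false;
  }
  return true;
}
```

A divergent value in a VGPR is fine in this model, since each lane keeps its own copy; only the SGPR case forces the non-tail-call path.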

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
index 25e36dc4b3691f..2cde47c743f9e8 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
@@ -1142,6 +1142,9 @@ bool AMDGPUCallLowering::isEligibleForTailCallOptimization(
 return false;
   }
 
+  // FIXME: We need to check if any arguments passed in SGPR are uniform. If
+  // they are not, this cannot be a tail call. If they are uniform, but may be
+  // VGPR, we need to insert readfirstlanes.
   if (!areCalleeOutgoingArgsTailCallable(Info, MF, OutArgs))
 return false;
 
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 67e5b3de741412..334a28021f9b65 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -3565,7 +3565,6 @@ bool SITargetLowering::isEligibleForTailCallOptimization(
   if (IsVarArg)
 return false;
 
-  // FIXME: We need to know all arguments passed in SGPR are uniform.
   for (const Argument &Arg : CallerF.args()) {
 if (Arg.hasByValAttr())
   return false;
@@ -3593,6 +3592,8 @@ bool SITargetLowering::isEligibleForTailCallOptimization(
   SmallVector<CCValAssign, 16> ArgLocs;
   CCState CCInfo(CalleeCC, IsVarArg, MF, ArgLocs, Ctx);
 
+  // FIXME: We are not allocating special input registers, so we will be
+  // deciding based on incorrect register assignments.
   CCInfo.AnalyzeCallOperands(Outs, CCAssignFnForCall(CalleeCC, IsVarArg));
 
   const SIMachineFunctionInfo *FuncInfo = MF.getInfo<SIMachineFunctionInfo>();
@@ -3602,6 +3603,21 @@ bool SITargetLowering::isEligibleForTailCallOptimization(
   if (CCInfo.getStackSize() > FuncInfo->getBytesInStackArgArea())
 return false;
 
+  for (const auto &[CCVA, ArgVal] : zip_equal(ArgLocs, OutVals)) {
+// FIXME: What about inreg arguments that end up passed in memory?
+if (!CCVA.isRegLoc())
+  continue;
+
+// If we are passing an argument in an SGPR, and the value is divergent,
+// this call requires a waterfall loop.
+if (ArgVal->isDivergent() && TRI->isSGPRPhysReg(CCVA.getLocReg())) {
+  LLVM_DEBUG(
+  dbgs() << "Cannot tail call due to divergent outgoing argument in "
+ << printReg(CCVA.getLocReg(), TRI) << '\n');
+  return false;
+}
+  }
+
   const MachineRegisterInfo &MRI = MF.getRegInfo();
   return parametersInCSRMatch(MRI, CallerPreserved, ArgLocs, OutVals);
 }
@@ -3734,6 +3750,7 @@ SDValue SITargetLowering::LowerCall(CallLoweringInfo &CLI,
   // arguments to begin at SP+0. Completely unused for non-tail calls.
   int32_t FPDiff = 0;
   MachineFrameInfo &MFI = MF.getFrameInfo();
+  auto *TRI = static_cast<const SIRegisterInfo *>(Subtarget->getRegisterInfo());
 
   // Adjust the stack pointer for the new arguments...
   // These operations are automatically eliminated by the prolog/epilog pass
@@ -3756,6 +3773,8 @@ SDValue SITargetLowering::LowerCall(CallLoweringInfo &CLI,
 }
   }
 
+  const unsigned NumSpecialInputs = RegsToPass.size();
+
   MVT PtrVT = MVT::i32;
 
   // Walk the register/memloc assignments, inserting copies/loads.
@@ -3857,16 +3876,40 @@ SDValue SITargetLowering::LowerCall(CallLoweringInfo &CLI,
   if (!MemOpChains.empty())
 Chain = DAG.getNode(ISD::TokenFactor, DL, MVT::Other, MemOpChains);
 
+  SDValue ReadFirstLaneID =
+  DAG.getTargetConstant(Intrinsic::amdgcn_readfirstlane, DL, MVT::i32);
+
+  SDValue TokenGlue;
+  if (CLI.ConvergenceControlToken)

[llvm-branch-commits] [compiler-rt] [llvm] [Coverage] Make SingleByteCoverage work consistent to merging (PR #110972)

2024-10-03 Thread NAKAMURA Takumi via llvm-branch-commits


https://github.com/chapuni created 
https://github.com/llvm/llvm-project/pull/110972

- Round `Counts` as 1/0
- Confirm both `ExecutionCount` and `AltExecutionCount` are in range.
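
To make the first bullet concrete, here is a minimal standalone sketch of a merge that rounds counters to 1/0 in single-byte mode. It is not the LLVM implementation: `ToyRecord` and `mergeCounts` are hypothetical names, and the saturating arithmetic is an assumption made so the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for a profile record; only the fields the sketch needs.
struct ToyRecord {
  std::vector<uint64_t> Counts;
  bool SingleByteCoverage = false;
};

// Adds Other's counters (scaled by Weight) into Dst with saturation, then
// rounds every nonzero result to 1 when Dst is in single-byte coverage mode.
inline void mergeCounts(ToyRecord &Dst, const ToyRecord &Other,
                        uint64_t Weight) {
  assert(Dst.Counts.size() == Other.Counts.size() && "counter count mismatch");
  const uint64_t Max = std::numeric_limits<uint64_t>::max();
  for (std::size_t I = 0; I < Dst.Counts.size(); ++I) {
    uint64_t Scaled = Other.Counts[I];
    // Saturating multiply by the weight.
    if (Weight != 0 && Scaled > Max / Weight)
      Scaled = Max;
    else
      Scaled *= Weight;
    // Saturating add into the destination counter.
    uint64_t Value = Dst.Counts[I];
    Value = (Scaled > Max - Value) ? Max : Value + Scaled;
    // In single-byte mode a counter only records "executed or not".
    Dst.Counts[I] = (Dst.SingleByteCoverage && Value != 0) ? 1 : Value;
  }
}
```

With rounding in place, merging two single-byte profiles keeps every counter in the range 0..1, which is the property the second bullet then checks for `ExecutionCount` and `AltExecutionCount`.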

>From aacb50ddf87d96b4a0644c7ef5d0a86dc94f069b Mon Sep 17 00:00:00 2001
From: NAKAMURA Takumi 
Date: Wed, 2 Oct 2024 23:25:52 +0900
Subject: [PATCH] [Coverage] Make SingleByteCoverage work consistent to merging

- Round `Counts` as 1/0
- Confirm both `ExecutionCount` and `AltExecutionCount` are in range.
---
 compiler-rt/test/profile/instrprof-block-coverage.c | 2 +-
 compiler-rt/test/profile/instrprof-entry-coverage.c | 2 +-
 llvm/include/llvm/ProfileData/InstrProf.h   | 5 -
 llvm/lib/ProfileData/Coverage/CoverageMapping.cpp   | 3 +++
 llvm/lib/ProfileData/InstrProf.cpp  | 2 +-
 llvm/lib/ProfileData/InstrProfReader.cpp| 1 +
 6 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/compiler-rt/test/profile/instrprof-block-coverage.c b/compiler-rt/test/profile/instrprof-block-coverage.c
index 829d5af8dc3f9e..8d924e1cac64d8 100644
--- a/compiler-rt/test/profile/instrprof-block-coverage.c
+++ b/compiler-rt/test/profile/instrprof-block-coverage.c
@@ -49,4 +49,4 @@ int main(int argc, char *argv[]) {
 
 // CHECK-ERROR-NOT: warning: {{.*}}: Found inconsistent block coverage
 
-// COUNTS: Maximum function count: 4
+// COUNTS: Maximum function count: 1
diff --git a/compiler-rt/test/profile/instrprof-entry-coverage.c b/compiler-rt/test/profile/instrprof-entry-coverage.c
index 1c6816ba01964b..b93a4e0c43ccd6 100644
--- a/compiler-rt/test/profile/instrprof-entry-coverage.c
+++ b/compiler-rt/test/profile/instrprof-entry-coverage.c
@@ -36,4 +36,4 @@ int main(int argc, char *argv[]) {
 // CHECK-DAG: foo
 // CHECK-DAG: bar
 
-// COUNTS: Maximum function count: 2
+// COUNTS: Maximum function count: 1
diff --git a/llvm/include/llvm/ProfileData/InstrProf.h b/llvm/include/llvm/ProfileData/InstrProf.h
index b0b2258735e2ae..df9e76966bf42b 100644
--- a/llvm/include/llvm/ProfileData/InstrProf.h
+++ b/llvm/include/llvm/ProfileData/InstrProf.h
@@ -830,6 +830,7 @@ struct InstrProfValueSiteRecord {
 /// Profiling information for a single function.
 struct InstrProfRecord {
   std::vector<uint64_t> Counts;
+  bool SingleByteCoverage = false;
   std::vector<uint8_t> BitmapBytes;
 
   InstrProfRecord() = default;
@@ -839,13 +840,15 @@ struct InstrProfRecord {
   : Counts(std::move(Counts)), BitmapBytes(std::move(BitmapBytes)) {}
   InstrProfRecord(InstrProfRecord &&) = default;
   InstrProfRecord(const InstrProfRecord &RHS)
-  : Counts(RHS.Counts), BitmapBytes(RHS.BitmapBytes),
+  : Counts(RHS.Counts), SingleByteCoverage(RHS.SingleByteCoverage),
+BitmapBytes(RHS.BitmapBytes),
 ValueData(RHS.ValueData
   ? std::make_unique<ValueProfData>(*RHS.ValueData)
   : nullptr) {}
   InstrProfRecord &operator=(InstrProfRecord &&) = default;
   InstrProfRecord &operator=(const InstrProfRecord &RHS) {
 Counts = RHS.Counts;
+SingleByteCoverage = RHS.SingleByteCoverage;
 BitmapBytes = RHS.BitmapBytes;
 if (!RHS.ValueData) {
   ValueData = nullptr;
diff --git a/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp b/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
index a02136d5b0386d..bc765c59381718 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
@@ -874,6 +874,9 @@ Error CoverageMapping::loadFunctionRecord(
   consumeError(std::move(E));
   return Error::success();
 }
+assert(!SingleByteCoverage ||
+   (0 <= *ExecutionCount && *ExecutionCount <= 1 &&
+0 <= *AltExecutionCount && *AltExecutionCount <= 1));
 Function.pushRegion(Region, *ExecutionCount, *AltExecutionCount);
 
 // Record ExpansionRegion.
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index b9937c9429b77d..0f6677b4d35718 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -952,7 +952,7 @@ void InstrProfRecord::merge(InstrProfRecord &Other, uint64_t Weight,
   Value = getInstrMaxCountValue();
   Overflowed = true;
 }
-Counts[I] = Value;
+Counts[I] = (SingleByteCoverage && Value != 0 ? 1 : Value);
 if (Overflowed)
   Warn(instrprof_error::counter_overflow);
   }
diff --git a/llvm/lib/ProfileData/InstrProfReader.cpp b/llvm/lib/ProfileData/InstrProfReader.cpp
index b90617c74f6d13..a07d7f573275ba 100644
--- a/llvm/lib/ProfileData/InstrProfReader.cpp
+++ b/llvm/lib/ProfileData/InstrProfReader.cpp
@@ -743,6 +743,7 @@ Error RawInstrProfReader::readRawCounts(
 
   Record.Counts.clear();
   Record.Counts.reserve(NumCounters);
+  Record.SingleByteCoverage = hasSingleByteCoverage();
   for (uint32_t I = 0; I < NumCounters; I++) {
 const char *Ptr =
 CountersStart + CounterBaseOffset + I * getCounterTypeSize();


[llvm-branch-commits] [compiler-rt] [llvm] [Coverage] Make SingleByteCoverage work consistent to merging (PR #110972)

2024-10-03 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-pgo

Author: NAKAMURA Takumi (chapuni)


Changes

- Round `Counts` as 1/0
- Confirm both `ExecutionCount` and `AltExecutionCount` are in range.

---
Full diff: https://github.com/llvm/llvm-project/pull/110972.diff


6 Files Affected:

- (modified) compiler-rt/test/profile/instrprof-block-coverage.c (+1-1) 
- (modified) compiler-rt/test/profile/instrprof-entry-coverage.c (+1-1) 
- (modified) llvm/include/llvm/ProfileData/InstrProf.h (+4-1) 
- (modified) llvm/lib/ProfileData/Coverage/CoverageMapping.cpp (+3) 
- (modified) llvm/lib/ProfileData/InstrProf.cpp (+1-1) 
- (modified) llvm/lib/ProfileData/InstrProfReader.cpp (+1) 


```diff
diff --git a/compiler-rt/test/profile/instrprof-block-coverage.c b/compiler-rt/test/profile/instrprof-block-coverage.c
index 829d5af8dc3f9e..8d924e1cac64d8 100644
--- a/compiler-rt/test/profile/instrprof-block-coverage.c
+++ b/compiler-rt/test/profile/instrprof-block-coverage.c
@@ -49,4 +49,4 @@ int main(int argc, char *argv[]) {
 
 // CHECK-ERROR-NOT: warning: {{.*}}: Found inconsistent block coverage
 
-// COUNTS: Maximum function count: 4
+// COUNTS: Maximum function count: 1
diff --git a/compiler-rt/test/profile/instrprof-entry-coverage.c b/compiler-rt/test/profile/instrprof-entry-coverage.c
index 1c6816ba01964b..b93a4e0c43ccd6 100644
--- a/compiler-rt/test/profile/instrprof-entry-coverage.c
+++ b/compiler-rt/test/profile/instrprof-entry-coverage.c
@@ -36,4 +36,4 @@ int main(int argc, char *argv[]) {
 // CHECK-DAG: foo
 // CHECK-DAG: bar
 
-// COUNTS: Maximum function count: 2
+// COUNTS: Maximum function count: 1
diff --git a/llvm/include/llvm/ProfileData/InstrProf.h b/llvm/include/llvm/ProfileData/InstrProf.h
index b0b2258735e2ae..df9e76966bf42b 100644
--- a/llvm/include/llvm/ProfileData/InstrProf.h
+++ b/llvm/include/llvm/ProfileData/InstrProf.h
@@ -830,6 +830,7 @@ struct InstrProfValueSiteRecord {
 /// Profiling information for a single function.
 struct InstrProfRecord {
   std::vector<uint64_t> Counts;
+  bool SingleByteCoverage = false;
   std::vector<uint8_t> BitmapBytes;
 
   InstrProfRecord() = default;
@@ -839,13 +840,15 @@ struct InstrProfRecord {
   : Counts(std::move(Counts)), BitmapBytes(std::move(BitmapBytes)) {}
   InstrProfRecord(InstrProfRecord &&) = default;
   InstrProfRecord(const InstrProfRecord &RHS)
-  : Counts(RHS.Counts), BitmapBytes(RHS.BitmapBytes),
+  : Counts(RHS.Counts), SingleByteCoverage(RHS.SingleByteCoverage),
+BitmapBytes(RHS.BitmapBytes),
 ValueData(RHS.ValueData
   ? std::make_unique<ValueProfData>(*RHS.ValueData)
   : nullptr) {}
   InstrProfRecord &operator=(InstrProfRecord &&) = default;
   InstrProfRecord &operator=(const InstrProfRecord &RHS) {
 Counts = RHS.Counts;
+SingleByteCoverage = RHS.SingleByteCoverage;
 BitmapBytes = RHS.BitmapBytes;
 if (!RHS.ValueData) {
   ValueData = nullptr;
diff --git a/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp b/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
index a02136d5b0386d..bc765c59381718 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
@@ -874,6 +874,9 @@ Error CoverageMapping::loadFunctionRecord(
   consumeError(std::move(E));
   return Error::success();
 }
+assert(!SingleByteCoverage ||
+   (0 <= *ExecutionCount && *ExecutionCount <= 1 &&
+0 <= *AltExecutionCount && *AltExecutionCount <= 1));
 Function.pushRegion(Region, *ExecutionCount, *AltExecutionCount);
 
 // Record ExpansionRegion.
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index b9937c9429b77d..0f6677b4d35718 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -952,7 +952,7 @@ void InstrProfRecord::merge(InstrProfRecord &Other, uint64_t Weight,
   Value = getInstrMaxCountValue();
   Overflowed = true;
 }
-Counts[I] = Value;
+Counts[I] = (SingleByteCoverage && Value != 0 ? 1 : Value);
 if (Overflowed)
   Warn(instrprof_error::counter_overflow);
   }
diff --git a/llvm/lib/ProfileData/InstrProfReader.cpp b/llvm/lib/ProfileData/InstrProfReader.cpp
index b90617c74f6d13..a07d7f573275ba 100644
--- a/llvm/lib/ProfileData/InstrProfReader.cpp
+++ b/llvm/lib/ProfileData/InstrProfReader.cpp
@@ -743,6 +743,7 @@ Error RawInstrProfReader::readRawCounts(
 
   Record.Counts.clear();
   Record.Counts.reserve(NumCounters);
+  Record.SingleByteCoverage = hasSingleByteCoverage();
   for (uint32_t I = 0; I < NumCounters; I++) {
 const char *Ptr =
 CountersStart + CounterBaseOffset + I * getCounterTypeSize();

```




https://github.com/llvm/llvm-project/pull/110972
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

