[llvm-branch-commits] [mlir] [mlir][bufferization] Remove remaining dialect conversion-based infra parts (PR #114155)

2024-11-13 Thread Javed Absar via llvm-branch-commits

https://github.com/javedabsar1 edited 
https://github.com/llvm/llvm-project/pull/114155
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][bufferization] Remove remaining dialect conversion-based infra parts (PR #114155)

2024-11-13 Thread Javed Absar via llvm-branch-commits


@@ -86,18 +86,13 @@ getOrCreateFuncAnalysisState(OneShotAnalysisState &state) {
   return state.addExtension<FuncAnalysisState>();
 }
 
-/// Return the unique ReturnOp that terminates `funcOp`.
-/// Return nullptr if there is no such unique ReturnOp.
-static func::ReturnOp getAssumedUniqueReturnOp(func::FuncOp funcOp) {
-  func::ReturnOp returnOp;
-  for (Block &b : funcOp.getBody()) {
-    if (auto candidateOp = dyn_cast<func::ReturnOp>(b.getTerminator())) {
-      if (returnOp)
-        return nullptr;
-      returnOp = candidateOp;
-    }
-  }
-  return returnOp;
+/// Return all top-level func.return ops in the given function.

javedabsar1 wrote:

Wasn't this `getReturnOps` part of another diff? Just confused and asking.

https://github.com/llvm/llvm-project/pull/114155


[llvm-branch-commits] [openmp] release/19.x: [OpenMP] Create versioned libgomp softlinks (#112973) (PR #115944)

2024-11-13 Thread Shilei Tian via llvm-branch-commits

https://github.com/shiltian approved this pull request.


https://github.com/llvm/llvm-project/pull/115944


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] LLVM IR translation of host_eval (PR #116052)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-fir-hlfir

Author: Sergio Afonso (skatrak)


Changes

This patch adds support for processing the `host_eval` clause of `omp.target`
to populate default and runtime kernel launch attributes. Specifically, these
relate to the `num_teams`, `thread_limit` and `num_threads` clauses attached
to operations nested inside of `omp.target`. As a result, the `thread_limit`
clause of `omp.target` is also supported.

The implementation of `initTargetDefaultAttrs()` is intended to reflect clang's
own processing of multiple constructs and clauses in order to define a default
number of teams and threads to be used as kernel attributes and to populate
global variables in the target device module.

One side effect of this change is that it is no longer possible to translate
target device MLIR modules to LLVM IR unless they have a supported target
triple. This is because the local `getGridValue()` function in the
`OpenMPIRBuilder` only works for certain architectures, and it is called
whenever the maximum number of threads has not been explicitly defined. This
limitation also matches clang.

Evaluating the collapsed loop trip count of target SPMD kernels remains
unsupported.

---

Patch is 37.90 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/116052.diff


18 Files Affected:

- (modified) flang/test/Integration/OpenMP/target-filtering.f90 (+1-1)
- (modified) flang/test/Lower/OpenMP/function-filtering-2.f90 (+3-3)
- (modified) flang/test/Lower/OpenMP/function-filtering-3.f90 (+3-3)
- (modified) flang/test/Lower/OpenMP/function-filtering.f90 (+3-3)
- (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+229-16)
- (modified) mlir/test/Target/LLVMIR/omptarget-byref-bycopy-generation-device.mlir (+2-2)
- (modified) mlir/test/Target/LLVMIR/omptarget-constant-alloca-raise.mlir (+2-2)
- (modified) mlir/test/Target/LLVMIR/omptarget-constant-indexing-device-region.mlir (+2-2)
- (modified) mlir/test/Target/LLVMIR/omptarget-debug.mlir (+1-1)
- (modified) mlir/test/Target/LLVMIR/omptarget-declare-target-llvm-device.mlir (+1-1)
- (modified) mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir (+2-2)
- (modified) mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir (+3-3)
- (modified) mlir/test/Target/LLVMIR/omptarget-target-inside-task.mlir (+2-2)
- (added) mlir/test/Target/LLVMIR/openmp-target-launch-device.mlir (+43)
- (added) mlir/test/Target/LLVMIR/openmp-target-launch-host.mlir (+31)
- (modified) mlir/test/Target/LLVMIR/openmp-target-use-device-nested.mlir (+2-2)
- (modified) mlir/test/Target/LLVMIR/openmp-task-target-device.mlir (+1-1)
- (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (+13-14)


``diff
diff --git a/flang/test/Integration/OpenMP/target-filtering.f90 
b/flang/test/Integration/OpenMP/target-filtering.f90
index d1ab1b47e580d4..699c1040d91f9c 100644
--- a/flang/test/Integration/OpenMP/target-filtering.f90
+++ b/flang/test/Integration/OpenMP/target-filtering.f90
@@ -7,7 +7,7 @@
 !===--===!
 
 !RUN: %flang_fc1 -emit-llvm -fopenmp %s -o - | FileCheck %s --check-prefixes 
HOST,ALL
-!RUN: %flang_fc1 -emit-llvm -fopenmp -fopenmp-is-target-device %s -o - | 
FileCheck %s --check-prefixes DEVICE,ALL
+!RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -emit-llvm -fopenmp 
-fopenmp-is-target-device %s -o - | FileCheck %s --check-prefixes DEVICE,ALL
 
 !HOST: define {{.*}}@{{.*}}before{{.*}}(
 !DEVICE-NOT: define {{.*}}@before{{.*}}(
diff --git a/flang/test/Lower/OpenMP/function-filtering-2.f90 
b/flang/test/Lower/OpenMP/function-filtering-2.f90
index 0c02aa223820e7..a2c5e29cfdcbf6 100644
--- a/flang/test/Lower/OpenMP/function-filtering-2.f90
+++ b/flang/test/Lower/OpenMP/function-filtering-2.f90
@@ -1,9 +1,9 @@
 ! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -flang-experimental-hlfir 
-emit-llvm %s -o - | FileCheck --check-prefixes=LLVM,LLVM-HOST %s
 ! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -emit-hlfir %s -o - | FileCheck 
--check-prefix=MLIR %s
-! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -fopenmp-is-target-device 
-flang-experimental-hlfir -emit-llvm %s -o - | FileCheck 
--check-prefixes=LLVM,LLVM-DEVICE %s
-! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -fopenmp-is-target-device 
-emit-hlfir %s -o - | FileCheck --check-prefix=MLIR %s
+! RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -fopenmp -fopenmp-version=52 
-fopenmp-is-target-device -flang-experimental-hlfir -emit-llvm %s -o - | 
FileCheck --check-prefixes=LLVM,LLVM-DEVICE %s
+! RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -fopenmp -fopenmp-version=52 
-fopenmp-is-target-device -emit-hlfir %s -o - | FileCheck --check-prefix=MLIR %s
 ! RUN: bbc -fopenmp -fopenmp-version=52 -emit-hlfir %s -o - | FileCheck 
--check-prefixes=MLIR-HOST,MLIR-ALL %s
-! RUN: bbc -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -emit-hlfir 
%s -o - | Fi

[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)

2024-11-13 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak created 
https://github.com/llvm/llvm-project/pull/116050

This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure 
used to simplify passing default and constant values for number of teams and 
threads, and possibly other target kernel-related information in the future.

This is used to forward values passed to `createTarget` to `createTargetInit`, 
which previously used a default unrelated set of values.
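
For readers without the full patch at hand, the idea can be sketched in standalone C++. This is only an illustration under stated assumptions: the field names come from the hunks in this patch, the default values (minimums of 1, maximums of -1) are assumed to match the local variables the patch removes from `emitKernelInit`, `std::vector` stands in for LLVM's `SmallVector`, and `applyLaunchBounds` is a hypothetical helper that mirrors the clamping visible in `computeMinAndMaxThreadsAndTeams` rather than the actual clang code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for OpenMPIRBuilder::TargetKernelDefaultAttrs.
// Defaults are an assumption based on the values removed from emitKernelInit.
struct TargetKernelDefaultAttrs {
  std::vector<int32_t> MaxTeams = {-1};   // -1: no upper bound known
  int32_t MinTeams = 1;
  std::vector<int32_t> MaxThreads = {-1}; // -1: no upper bound known
  int32_t MinThreads = 1;
};

// Hypothetical helper (not part of the patch): fold one set of launch-bound
// attribute values into Attrs, using the same min/max logic as the diff.
void applyLaunchBounds(TargetKernelDefaultAttrs &Attrs, int32_t AttrMinThreads,
                       int32_t AttrMaxThreads, int32_t AttrMinBlocks,
                       int32_t AttrMaxBlocks) {
  assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 &&
         "invalid default attrs structure");
  int32_t &MaxTeamsVal = Attrs.MaxTeams.front();
  int32_t &MaxThreadsVal = Attrs.MaxThreads.front();

  // Lower bounds only ever grow.
  Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreads);
  Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocks);

  // Upper bounds shrink once set; a non-positive attribute means "unset".
  if (AttrMaxThreads > 0)
    MaxThreadsVal = MaxThreadsVal > 0 ? std::min(MaxThreadsVal, AttrMaxThreads)
                                      : AttrMaxThreads;
  if (AttrMaxBlocks > 0)
    MaxTeamsVal = MaxTeamsVal > 0 ? std::min(MaxTeamsVal, AttrMaxBlocks)
                                  : AttrMaxBlocks;
}
```

Passing one structure by reference replaces the four separate `int32_t &` parameters, so a caller can default-construct `TargetKernelDefaultAttrs` and hand the same object on to later stages.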

From 1fcfe48114bdda7be545b6bfaa710b6e639670d3 Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Fri, 8 Nov 2024 15:46:48 +
Subject: [PATCH] [OMPIRBuilder] Introduce struct to hold default kernel
 teams/threads

This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure
used to simplify passing default and constant values for number of teams and
threads, and possibly other target kernel-related information in the future.

This is used to forward values passed to `createTarget` to `createTargetInit`,
which previously used a default unrelated set of values.
---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 13 ++--
 clang/lib/CodeGen/CGOpenMPRuntime.h   |  9 +--
 clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp  |  9 +--
 .../llvm/Frontend/OpenMP/OMPIRBuilder.h   | 39 ++
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 71 +++
 .../Frontend/OpenMPIRBuilderTest.cpp  | 29 
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 11 +--
 .../LLVMIR/omptarget-region-device-llvm.mlir  |  2 +-
 8 files changed, 102 insertions(+), 81 deletions(-)

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d714af035d21a2..0f7a1166227476 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -5880,10 +5880,13 @@ void 
CGOpenMPRuntime::emitUsesAllocatorsFini(CodeGenFunction &CGF,
 
 void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
 const OMPExecutableDirective &D, CodeGenFunction &CGF,
-int32_t &MinThreadsVal, int32_t &MaxThreadsVal, int32_t &MinTeamsVal,
-int32_t &MaxTeamsVal) {
+llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs) {
+  assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 &&
+ "invalid default attrs structure");
+  int32_t &MaxTeamsVal = Attrs.MaxTeams.front();
+  int32_t &MaxThreadsVal = Attrs.MaxThreads.front();
 
-  getNumTeamsExprForTargetDirective(CGF, D, MinTeamsVal, MaxTeamsVal);
+  getNumTeamsExprForTargetDirective(CGF, D, Attrs.MinTeams, MaxTeamsVal);
   getNumThreadsExprForTargetDirective(CGF, D, MaxThreadsVal,
   /*UpperBoundOnly=*/true);
 
@@ -5901,12 +5904,12 @@ void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
   else
 continue;
 
-  MinThreadsVal = std::max(MinThreadsVal, AttrMinThreadsVal);
+  Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreadsVal);
   if (AttrMaxThreadsVal > 0)
 MaxThreadsVal = MaxThreadsVal > 0
 ? std::min(MaxThreadsVal, AttrMaxThreadsVal)
 : AttrMaxThreadsVal;
-  MinTeamsVal = std::max(MinTeamsVal, AttrMinBlocksVal);
+  Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocksVal);
   if (AttrMaxBlocksVal > 0)
 MaxTeamsVal = MaxTeamsVal > 0 ? std::min(MaxTeamsVal, AttrMaxBlocksVal)
   : AttrMaxBlocksVal;
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h 
b/clang/lib/CodeGen/CGOpenMPRuntime.h
index 5e7715743afb58..003395e7f17ded 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -312,12 +312,9 @@ class CGOpenMPRuntime {
   llvm::OpenMPIRBuilder OMPBuilder;
 
   /// Helper to determine the min/max number of threads/teams for \p D.
-  void computeMinAndMaxThreadsAndTeams(const OMPExecutableDirective &D,
-   CodeGenFunction &CGF,
-   int32_t &MinThreadsVal,
-   int32_t &MaxThreadsVal,
-   int32_t &MinTeamsVal,
-   int32_t &MaxTeamsVal);
+  void computeMinAndMaxThreadsAndTeams(
+  const OMPExecutableDirective &D, CodeGenFunction &CGF,
+  llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs);
 
   /// Helper to emit outlined function for 'target' directive.
   /// \param D Directive to emit.
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index 43dc0e62284602..96f8d6c5c08e56 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -745,14 +745,11 @@ void CGOpenMPRuntimeGPU::emitNonSPMDKernel(const 
OMPExecutableDirective &D,
 void CGOpenMPRuntimeGPU::emitKernelInit(const OMPExecutableDirective &D,
 CodeGenFunction &CGF,
 EntryFunctionSt

[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Sergio Afonso (skatrak)


Changes

This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure 
used to simplify passing default and constant values for number of teams and 
threads, and possibly other target kernel-related information in the future.

This is used to forward values passed to `createTarget` to `createTargetInit`, 
which previously used a default unrelated set of values.

---

Patch is 21.80 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/116050.diff


8 Files Affected:

- (modified) clang/lib/CodeGen/CGOpenMPRuntime.cpp (+8-5)
- (modified) clang/lib/CodeGen/CGOpenMPRuntime.h (+3-6)
- (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (+3-6)
- (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+25-14)
- (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+40-31)
- (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+16-13)
- (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+6-5)
- (modified) mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir (+1-1)


``diff
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d714af035d21a2..0f7a1166227476 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -5880,10 +5880,13 @@ void 
CGOpenMPRuntime::emitUsesAllocatorsFini(CodeGenFunction &CGF,
 
 void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
 const OMPExecutableDirective &D, CodeGenFunction &CGF,
-int32_t &MinThreadsVal, int32_t &MaxThreadsVal, int32_t &MinTeamsVal,
-int32_t &MaxTeamsVal) {
+llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs) {
+  assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 &&
+ "invalid default attrs structure");
+  int32_t &MaxTeamsVal = Attrs.MaxTeams.front();
+  int32_t &MaxThreadsVal = Attrs.MaxThreads.front();
 
-  getNumTeamsExprForTargetDirective(CGF, D, MinTeamsVal, MaxTeamsVal);
+  getNumTeamsExprForTargetDirective(CGF, D, Attrs.MinTeams, MaxTeamsVal);
   getNumThreadsExprForTargetDirective(CGF, D, MaxThreadsVal,
   /*UpperBoundOnly=*/true);
 
@@ -5901,12 +5904,12 @@ void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
   else
 continue;
 
-  MinThreadsVal = std::max(MinThreadsVal, AttrMinThreadsVal);
+  Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreadsVal);
   if (AttrMaxThreadsVal > 0)
 MaxThreadsVal = MaxThreadsVal > 0
 ? std::min(MaxThreadsVal, AttrMaxThreadsVal)
 : AttrMaxThreadsVal;
-  MinTeamsVal = std::max(MinTeamsVal, AttrMinBlocksVal);
+  Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocksVal);
   if (AttrMaxBlocksVal > 0)
 MaxTeamsVal = MaxTeamsVal > 0 ? std::min(MaxTeamsVal, AttrMaxBlocksVal)
   : AttrMaxBlocksVal;
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h 
b/clang/lib/CodeGen/CGOpenMPRuntime.h
index 5e7715743afb58..003395e7f17ded 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -312,12 +312,9 @@ class CGOpenMPRuntime {
   llvm::OpenMPIRBuilder OMPBuilder;
 
   /// Helper to determine the min/max number of threads/teams for \p D.
-  void computeMinAndMaxThreadsAndTeams(const OMPExecutableDirective &D,
-   CodeGenFunction &CGF,
-   int32_t &MinThreadsVal,
-   int32_t &MaxThreadsVal,
-   int32_t &MinTeamsVal,
-   int32_t &MaxTeamsVal);
+  void computeMinAndMaxThreadsAndTeams(
+  const OMPExecutableDirective &D, CodeGenFunction &CGF,
+  llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs);
 
   /// Helper to emit outlined function for 'target' directive.
   /// \param D Directive to emit.
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index 43dc0e62284602..96f8d6c5c08e56 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -745,14 +745,11 @@ void CGOpenMPRuntimeGPU::emitNonSPMDKernel(const 
OMPExecutableDirective &D,
 void CGOpenMPRuntimeGPU::emitKernelInit(const OMPExecutableDirective &D,
 CodeGenFunction &CGF,
 EntryFunctionState &EST, bool IsSPMD) {
-  int32_t MinThreadsVal = 1, MaxThreadsVal = -1, MinTeamsVal = 1,
-  MaxTeamsVal = -1;
-  computeMinAndMaxThreadsAndTeams(D, CGF, MinThreadsVal, MaxThreadsVal,
-  MinTeamsVal, MaxTeamsVal);
+  llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs Attrs;
+  computeMinAndMaxThreadsAndTeams(D, CGF, Attrs);
 
   CGBuilderTy &Bld = CGF.Builder;

[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add host_eval clause to omp.target (PR #116049)

2024-11-13 Thread Sergio Afonso via llvm-branch-commits

skatrak wrote:

PR stack:
- #116048
- #116049
- #116050
- #116051
- #116052

https://github.com/llvm/llvm-project/pull/116049


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] LLVM IR translation of host_eval (PR #116052)

2024-11-13 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak created 
https://github.com/llvm/llvm-project/pull/116052

This patch adds support for processing the `host_eval` clause of `omp.target`
to populate default and runtime kernel launch attributes. Specifically, these
relate to the `num_teams`, `thread_limit` and `num_threads` clauses attached
to operations nested inside of `omp.target`. As a result, the `thread_limit`
clause of `omp.target` is also supported.

The implementation of `initTargetDefaultAttrs()` is intended to reflect clang's
own processing of multiple constructs and clauses in order to define a default
number of teams and threads to be used as kernel attributes and to populate
global variables in the target device module.

One side effect of this change is that it is no longer possible to translate
target device MLIR modules to LLVM IR unless they have a supported target
triple. This is because the local `getGridValue()` function in the
`OpenMPIRBuilder` only works for certain architectures, and it is called
whenever the maximum number of threads has not been explicitly defined. This
limitation also matches clang.

Evaluating the collapsed loop trip count of target SPMD kernels remains
unsupported.

From 8ff0d3bfc3c4b91987146276c059c9e0affaa788 Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Tue, 12 Nov 2024 10:49:28 +
Subject: [PATCH] [MLIR][OpenMP] LLVM IR translation of host_eval

This patch adds support for processing the `host_eval` clause of `omp.target`
to populate default and runtime kernel launch attributes. Specifically, these
relate to the `num_teams`, `thread_limit` and `num_threads` clauses attached
to operations nested inside of `omp.target`. As a result, the `thread_limit`
clause of `omp.target` is also supported.

The implementation of `initTargetDefaultAttrs()` is intended to reflect clang's
own processing of multiple constructs and clauses in order to define a default
number of teams and threads to be used as kernel attributes and to populate
global variables in the target device module.

One side effect of this change is that it is no longer possible to translate
target device MLIR modules to LLVM IR unless they have a supported target
triple. This is because the local `getGridValue()` function in the
`OpenMPIRBuilder` only works for certain architectures, and it is called
whenever the maximum number of threads has not been explicitly defined. This
limitation also matches clang.

Evaluating the collapsed loop trip count of target SPMD kernels remains
unsupported.
---
 .../Integration/OpenMP/target-filtering.f90   |   2 +-
 .../Lower/OpenMP/function-filtering-2.f90 |   6 +-
 .../Lower/OpenMP/function-filtering-3.f90 |   6 +-
 .../test/Lower/OpenMP/function-filtering.f90  |   6 +-
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 245 --
 ...target-byref-bycopy-generation-device.mlir |   4 +-
 .../omptarget-constant-alloca-raise.mlir  |   4 +-
 ...arget-constant-indexing-device-region.mlir |   4 +-
 mlir/test/Target/LLVMIR/omptarget-debug.mlir  |   2 +-
 .../omptarget-declare-target-llvm-device.mlir |   2 +-
 .../LLVMIR/omptarget-parallel-llvm.mlir   |   4 +-
 .../LLVMIR/omptarget-region-device-llvm.mlir  |   6 +-
 .../LLVMIR/omptarget-target-inside-task.mlir  |   4 +-
 .../LLVMIR/openmp-target-launch-device.mlir   |  43 +++
 .../LLVMIR/openmp-target-launch-host.mlir |  31 +++
 .../openmp-target-use-device-nested.mlir  |   4 +-
 .../LLVMIR/openmp-task-target-device.mlir |   2 +-
 mlir/test/Target/LLVMIR/openmp-todo.mlir  |  27 +-
 18 files changed, 344 insertions(+), 58 deletions(-)
 create mode 100644 mlir/test/Target/LLVMIR/openmp-target-launch-device.mlir
 create mode 100644 mlir/test/Target/LLVMIR/openmp-target-launch-host.mlir

diff --git a/flang/test/Integration/OpenMP/target-filtering.f90 
b/flang/test/Integration/OpenMP/target-filtering.f90
index d1ab1b47e580d4..699c1040d91f9c 100644
--- a/flang/test/Integration/OpenMP/target-filtering.f90
+++ b/flang/test/Integration/OpenMP/target-filtering.f90
@@ -7,7 +7,7 @@
 !===--===!
 
 !RUN: %flang_fc1 -emit-llvm -fopenmp %s -o - | FileCheck %s --check-prefixes 
HOST,ALL
-!RUN: %flang_fc1 -emit-llvm -fopenmp -fopenmp-is-target-device %s -o - | 
FileCheck %s --check-prefixes DEVICE,ALL
+!RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -emit-llvm -fopenmp 
-fopenmp-is-target-device %s -o - | FileCheck %s --check-prefixes DEVICE,ALL
 
 !HOST: define {{.*}}@{{.*}}before{{.*}}(
 !DEVICE-NOT: define {{.*}}@before{{.*}}(
diff --git a/flang/test/Lower/OpenMP/function-filtering-2.f90 
b/flang/test/Lower/OpenMP/function-filtering-2.f90
index 0c02aa223820e7..a2c5e29cfdcbf6 100644
--- a/flang/test/Lower/OpenMP/function-filtering-2.f90
+++ b/flang/test/Lower/OpenMP/function-filtering-2.f90
@@ -1,9 +1,9 @@
 ! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -flang-experimental-hlfir 
-emit-llvm %s -o - | FileCheck --

[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-13 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak created 
https://github.com/llvm/llvm-project/pull/116051

This patch introduces a `TargetKernelRuntimeAttrs` structure to hold
host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count values
passed to the runtime kernel offloading call.

Additionally, `createTarget` is extended to take an `IsSPMD` flag, used to 
influence target device code generation.
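
As a rough standalone sketch of the shape of this structure: the field names below are copied from the `OMPIRBuilder.h` hunk in this patch, but the empty `Value` type is only a placeholder for `llvm::Value`, the `Value *` element type of the vectors is an assumption (the element type is not legible in the quoted hunk), `std::vector` stands in for `SmallVector`, and `hasHostTripCount` is a hypothetical helper that is not part of the patch.

```cpp
#include <cassert>
#include <vector>

// Placeholder for llvm::Value; only its address matters in this sketch.
struct Value {};

// Simplified stand-in for OpenMPIRBuilder::TargetKernelRuntimeAttrs.
// A null pointer means "no host-evaluated value was provided".
struct TargetKernelRuntimeAttrs {
  std::vector<Value *> MaxTeams = {nullptr};
  Value *MinTeams = nullptr;
  std::vector<Value *> TargetThreadLimit = {nullptr};
  std::vector<Value *> TeamsThreadLimit = {nullptr};

  // 'parallel' construct 'num_threads' value, for target SPMD kernels.
  Value *MaxThreads = nullptr;

  // Total iteration count of an SPMD kernel; null for a generic kernel.
  Value *LoopTripCount = nullptr;
};

// Hypothetical helper (an assumption, not in the patch): a trip count is
// only meaningful when the kernel is SPMD and the host computed a value.
bool hasHostTripCount(const TargetKernelRuntimeAttrs &Attrs, bool IsSPMD) {
  return IsSPMD && Attrs.LoopTripCount != nullptr;
}
```

Unlike the default attributes, which are compile-time integers, these fields hold IR values that must already exist on the host before the kernel launch call is emitted.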

From cc5c5cc8b1c8b718ae3d0aece3784416460114bc Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Fri, 8 Nov 2024 17:24:47 +
Subject: [PATCH] [OMPIRBuilder] Support runtime number of teams and threads,
 and SPMD mode

This patch introduces a `TargetKernelRuntimeAttrs` structure to hold
host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count values
passed to the runtime kernel offloading call.

Additionally, `createTarget` is extended to take an `IsSPMD` flag, used to
influence target device code generation.
---
 .../llvm/Frontend/OpenMP/OMPIRBuilder.h   |  26 +-
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 137 +++--
 .../Frontend/OpenMPIRBuilderTest.cpp  | 281 +-
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  |  10 +-
 4 files changed, 420 insertions(+), 34 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h 
b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
index da450ef5adbc14..a85f41e586c514 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -2237,6 +2237,26 @@ class OpenMPIRBuilder {
 int32_t MinThreads = 1;
   };
 
+  /// Container to pass LLVM IR runtime values or constants related to the
+  /// number of teams and threads with which the kernel must be launched, as
+  /// well as the trip count of the SPMD loop, if it is an SPMD kernel. These
+  /// must be defined in the host prior to the call to the kernel launch OpenMP
+  /// RTL function.
+  struct TargetKernelRuntimeAttrs {
+SmallVector MaxTeams = {nullptr};
+Value *MinTeams = nullptr;
+SmallVector TargetThreadLimit = {nullptr};
+SmallVector TeamsThreadLimit = {nullptr};
+
+/// 'parallel' construct 'num_threads' clause value, if present and it is a
+/// target SPMD kernel.
+Value *MaxThreads = nullptr;
+
+/// Total number of iterations of the target SPMD kernel or null if it is a
+/// generic kernel.
+Value *LoopTripCount = nullptr;
+  };
+
   /// Data structure that contains the needed information to construct the
   /// kernel args vector.
   struct TargetKernelArgs {
@@ -2905,11 +2925,14 @@ class OpenMPIRBuilder {
   ///
   /// \param Loc where the target data construct was encountered.
   /// \param IsOffloadEntry whether it is an offload entry.
+  /// \param IsSPMD whether it is a target SPMD kernel.
   /// \param CodeGenIP The insertion point where the call to the outlined
   /// function should be emitted.
   /// \param EntryInfo The entry information about the function.
   /// \param DefaultAttrs Structure containing the default numbers of threads
   ///and teams to launch the kernel with.
+  /// \param RuntimeAttrs Structure containing the runtime numbers of threads
+  ///and teams to launch the kernel with.
   /// \param Inputs The input values to the region that will be passed.
   /// as arguments to the outlined function.
   /// \param BodyGenCB Callback that will generate the region code.
@@ -2919,11 +2942,12 @@ class OpenMPIRBuilder {
   // dependency information as passed in the depend clause
   // \param HasNowait Whether the target construct has a `nowait` clause or 
not.
   InsertPointOrErrorTy createTarget(
-  const LocationDescription &Loc, bool IsOffloadEntry,
+  const LocationDescription &Loc, bool IsOffloadEntry, bool IsSPMD,
   OpenMPIRBuilder::InsertPointTy AllocaIP,
   OpenMPIRBuilder::InsertPointTy CodeGenIP,
   TargetRegionEntryInfo &EntryInfo,
   const TargetKernelDefaultAttrs &DefaultAttrs,
+  const TargetKernelRuntimeAttrs &RuntimeAttrs,
   SmallVectorImpl &Inputs, GenMapInfoCallbackTy GenMapInfoCB,
   TargetBodyGenCallbackTy BodyGenCB,
   TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB,
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 302d363965c940..f847f60386df85 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -6727,8 +6727,43 @@ FunctionCallee 
OpenMPIRBuilder::createDispatchDeinitFunction() {
   return getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_dispatch_deinit);
 }
 
+static void emitUsed(StringRef Name, std::vector &List,
+ Module &M) {
+  if (List.empty())
+return;
+
+  Type *PtrTy = PointerType::get(M.getContext(), /*AddressSpace=*/0);
+
+  // Convert List to what ConstantArray needs.
+  SmallVector UsedArray;
+  UsedArray.reserve(List.size());
+  for (auto Item : List)
+UsedArray.push_back(ConstantExpr::getPointer

[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add host_eval clause to omp.target (PR #116049)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir-llvm

Author: Sergio Afonso (skatrak)


Changes

This patch adds the `host_eval` clause to the `omp.target` operation. 
Additionally, it updates its op verifier to make sure all uses of block 
arguments defined by this clause fall within one of the few cases where they 
are allowed.

MLIR to LLVM IR translation of this clause currently fails with a
not-yet-implemented error.
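
The kind of check the verifier performs can be sketched in standalone C++ without any MLIR types. Everything here is hypothetical: the `HostEvalUse` enum and both functions are invented for illustration, and the list of permitted uses is an assumption drawn from the clauses named in this PR stack (`num_teams`, `thread_limit`, `num_threads`, and loop bounds), not a copy of the real verifier.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical classification of where a host_eval block argument is used.
enum class HostEvalUse {
  TeamsNumTeams,      // num_teams of a nested teams op
  TeamsThreadLimit,   // thread_limit of a nested teams op
  ParallelNumThreads, // num_threads of a nested parallel op
  LoopNestBounds,     // lower bound, upper bound or step of a loop nest
  Other               // any other use
};

// Assumed rule: every use except `Other` is acceptable.
bool isAllowedHostEvalUse(HostEvalUse Use) { return Use != HostEvalUse::Other; }

// Return the index of the first disallowed use, or -1 if all are allowed.
// A real verifier would emit a diagnostic at that use instead.
int firstInvalidUse(const std::vector<HostEvalUse> &Uses) {
  for (std::size_t I = 0; I < Uses.size(); ++I)
    if (!isAllowedHostEvalUse(Uses[I]))
      return static_cast<int>(I);
  return -1;
}
```

The point of the sketch is only that verification walks every use of each `host_eval` argument and rejects the operation as soon as one use falls outside the permitted set.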

---

Patch is 20.92 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/116049.diff


7 Files Affected:

- (modified) mlir/docs/Dialects/OpenMPDialect/_index.md (+55)
- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+26-7)
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+163-4)
- (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+5)
- (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+69-1)
- (modified) mlir/test/Dialect/OpenMP/ops.mlir (+37-1)
- (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (+14)


``diff
diff --git a/mlir/docs/Dialects/OpenMPDialect/_index.md 
b/mlir/docs/Dialects/OpenMPDialect/_index.md
index 4e5d777d6c4f7f..e0dd3f598e84b6 100644
--- a/mlir/docs/Dialects/OpenMPDialect/_index.md
+++ b/mlir/docs/Dialects/OpenMPDialect/_index.md
@@ -523,3 +523,58 @@ omp.parallel ... {
   omp.terminator
 } {omp.composite}
 ```
+
+## Host-Evaluated Clauses in Target Regions
+
+The `omp.target` operation, which represents the OpenMP `target` construct, is
+marked with the `IsolatedFromAbove` trait. This means that, inside of its
+region, no MLIR values defined outside of the op itself can be used. This is
+consistent with the OpenMP specification of the `target` construct, which
+mandates that all host device values used inside of the `target` region must
+either be privatized (data-sharing) or mapped (data-mapping).
+
+Normally, clauses applied to a construct are evaluated before entering that
+construct. Further, in some cases, the OpenMP specification stipulates that
+clauses be evaluated _on the host device_ on entry to a parent `target`
+construct. In particular, the `num_teams` and `thread_limit` clauses of the
+`teams` construct must be evaluated on the host device if it's nested inside or
+combined with a `target` construct.
+
+Additionally, the runtime library targeted by the MLIR to LLVM IR translation 
of
+the OpenMP dialect supports the optimized launch of SPMD kernels (i.e.
+`target teams distribute parallel {do,for}` in OpenMP), which requires
+specifying in advance what the total trip count of the loop is. Consequently, 
it
+is also beneficial to evaluate the trip count on the host device prior to the
+kernel launch.
+
+These host-evaluated values in MLIR would need to be placed outside of the
+`omp.target` region and also attached to the corresponding nested operations,
+which is not possible because of the `IsolatedFromAbove` trait. The solution
+implemented to address this problem has been to introduce the `host_eval`
+argument to the `omp.target` operation. It works similarly to a `map` clause,
+but its only intended use is to forward host-evaluated values to their
+corresponding operation inside of the region. Any uses outside of the 
previously
+described result in a verifier error.
+
+```mlir
+// Initialize %0, %1, %2, %3...
+omp.target host_eval(%0 -> %nt, %1 -> %lb, %2 -> %ub, %3 -> %step : i32, i32, 
i32, i32) {
+  omp.teams num_teams(to %nt : i32) {
+omp.parallel {
+  omp.distribute {
+omp.wsloop {
+  omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
+// ...
+omp.yield
+  }
+  omp.terminator
+} {omp.composite}
+omp.terminator
+  } {omp.composite}
+  omp.terminator
+} {omp.composite}
+omp.terminator
+  }
+  omp.terminator
+}
+```
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index a0da3db124d1f4..a99da1f0294d08 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -1166,9 +1166,10 @@ def TargetOp : OpenMP_Op<"target", traits = [
   ], clauses = [
 // TODO: Complete clause list (defaultmap, uses_allocators).
 OpenMP_AllocateClause, OpenMP_DependClause, OpenMP_DeviceClause,
-OpenMP_HasDeviceAddrClause, OpenMP_IfClause, OpenMP_InReductionClause,
-OpenMP_IsDevicePtrClause, OpenMP_MapClauseSkip,
-OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause
+OpenMP_HasDeviceAddrClause, OpenMP_HostEvalClause, OpenMP_IfClause,
+OpenMP_InReductionClause, OpenMP_IsDevicePtrClause,
+OpenMP_MapClauseSkip, OpenMP_NowaitClause,
+OpenMP_PrivateClause, OpenMP_ThreadLimitClause
   ], singleRegion = true> {
   let summary = "target construct";
   let description = [{
@@ -1186,16 +1187,34 @@ def TargetOp : OpenMP_Op<"target", traits = [
 
   let extraClassDeclaration = [{
 unsigned numMapBlockArgs() {

[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add host_eval clause to omp.target (PR #116049)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir-openmp

Author: Sergio Afonso (skatrak)


Changes

This patch adds the `host_eval` clause to the `omp.target` operation. 
Additionally, it updates its op verifier to make sure all uses of block 
arguments defined by this clause fall within one of the few cases where they 
are allowed.

MLIR to LLVM IR translation fails on translation of this clause with a 
not-yet-implemented error.

---

Patch is 20.92 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/116049.diff


7 Files Affected:

- (modified) mlir/docs/Dialects/OpenMPDialect/_index.md (+55) 
- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+26-7) 
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+163-4) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+5) 
- (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+69-1) 
- (modified) mlir/test/Dialect/OpenMP/ops.mlir (+37-1) 
- (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (+14) 


``diff
diff --git a/mlir/docs/Dialects/OpenMPDialect/_index.md 
b/mlir/docs/Dialects/OpenMPDialect/_index.md
index 4e5d777d6c4f7f..e0dd3f598e84b6 100644
--- a/mlir/docs/Dialects/OpenMPDialect/_index.md
+++ b/mlir/docs/Dialects/OpenMPDialect/_index.md
@@ -523,3 +523,58 @@ omp.parallel ... {
   omp.terminator
 } {omp.composite}
 ```
+
+## Host-Evaluated Clauses in Target Regions
+
+The `omp.target` operation, which represents the OpenMP `target` construct, is
+marked with the `IsolatedFromAbove` trait. This means that, inside of its
+region, no MLIR values defined outside of the op itself can be used. This is
+consistent with the OpenMP specification of the `target` construct, which
+mandates that all host device values used inside of the `target` region must
+either be privatized (data-sharing) or mapped (data-mapping).
+
+Normally, clauses applied to a construct are evaluated before entering that
+construct. Further, in some cases, the OpenMP specification stipulates that
+clauses be evaluated _on the host device_ on entry to a parent `target`
+construct. In particular, the `num_teams` and `thread_limit` clauses of the
+`teams` construct must be evaluated on the host device if it's nested inside or
+combined with a `target` construct.
+
+Additionally, the runtime library targeted by the MLIR to LLVM IR translation of
+the OpenMP dialect supports the optimized launch of SPMD kernels (i.e.
+`target teams distribute parallel {do,for}` in OpenMP), which requires
+specifying in advance what the total trip count of the loop is. Consequently, it
+is also beneficial to evaluate the trip count on the host device prior to the
+kernel launch.
+
+These host-evaluated values in MLIR would need to be placed outside of the
+`omp.target` region and also attached to the corresponding nested operations,
+which is not possible because of the `IsolatedFromAbove` trait. The solution
+implemented to address this problem has been to introduce the `host_eval`
+argument to the `omp.target` operation. It works similarly to a `map` clause,
+but its only intended use is to forward host-evaluated values to their
+corresponding operation inside of the region. Any uses outside of the previously
+described result in a verifier error.
+
+```mlir
+// Initialize %0, %1, %2, %3...
+omp.target host_eval(%0 -> %nt, %1 -> %lb, %2 -> %ub, %3 -> %step : i32, i32, i32, i32) {
+  omp.teams num_teams(to %nt : i32) {
+    omp.parallel {
+      omp.distribute {
+        omp.wsloop {
+          omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
+            // ...
+            omp.yield
+          }
+          omp.terminator
+        } {omp.composite}
+        omp.terminator
+      } {omp.composite}
+      omp.terminator
+    } {omp.composite}
+    omp.terminator
+  }
+  omp.terminator
+}
+```
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index a0da3db124d1f4..a99da1f0294d08 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -1166,9 +1166,10 @@ def TargetOp : OpenMP_Op<"target", traits = [
   ], clauses = [
 // TODO: Complete clause list (defaultmap, uses_allocators).
 OpenMP_AllocateClause, OpenMP_DependClause, OpenMP_DeviceClause,
-OpenMP_HasDeviceAddrClause, OpenMP_IfClause, OpenMP_InReductionClause,
-OpenMP_IsDevicePtrClause, OpenMP_MapClauseSkip,
-OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause
+OpenMP_HasDeviceAddrClause, OpenMP_HostEvalClause, OpenMP_IfClause,
+OpenMP_InReductionClause, OpenMP_IsDevicePtrClause,
+OpenMP_MapClauseSkip, OpenMP_NowaitClause,
+OpenMP_PrivateClause, OpenMP_ThreadLimitClause
   ], singleRegion = true> {
   let summary = "target construct";
   let description = [{
@@ -1186,16 +1187,34 @@ def TargetOp : OpenMP_Op<"target", traits = [
 
   let extraClassDeclaration = [{
 unsigned numMapBlockArgs()

[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-openmp

Author: Sergio Afonso (skatrak)


Changes

This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure 
used to simplify passing default and constant values for number of teams and 
threads, and possibly other target kernel-related information in the future.

This is used to forward values passed to `createTarget` to `createTargetInit`, 
which previously used a default unrelated set of values.
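
As a rough illustration of the refactoring described above, the sketch below bundles the four min/max values into one structure and clamps them the way the `computeMinAndMaxThreadsAndTeams` hunk does. `KernelDefaultAttrs` and `applyLaunchBounds` are hypothetical stand-ins, not the real `OpenMPIRBuilder` API; the field names and default values are taken from the diff, while the `std::vector` container is an assumption made to keep the example free of LLVM headers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for OpenMPIRBuilder::TargetKernelDefaultAttrs.
// Field names and defaults follow the diff; std::vector is an assumption.
struct KernelDefaultAttrs {
  std::vector<int32_t> MaxTeams = {-1};
  int32_t MinTeams = 1;
  std::vector<int32_t> MaxThreads = {-1};
  int32_t MinThreads = 1;
};

// Hypothetical helper: merge launch-bound style attribute values into the
// defaults, mirroring the min/max logic in the patched function.
void applyLaunchBounds(KernelDefaultAttrs &Attrs, int32_t AttrMinThreads,
                       int32_t AttrMaxThreads) {
  assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 &&
         "invalid default attrs structure");
  int32_t &MaxThreadsVal = Attrs.MaxThreads.front();

  // The minimum can only grow.
  Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreads);

  // A non-positive maximum means "unset"; otherwise keep the tighter bound.
  if (AttrMaxThreads > 0)
    MaxThreadsVal = MaxThreadsVal > 0 ? std::min(MaxThreadsVal, AttrMaxThreads)
                                      : AttrMaxThreads;
}
```

With a default-constructed `KernelDefaultAttrs`, calling `applyLaunchBounds(Attrs, 32, 256)` leaves `MinThreads` at 32 and `MaxThreads.front()` at 256. Passing one structure by reference replaces the four separate `int32_t &` parameters of the old signature.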

---

Patch is 21.80 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/116050.diff


8 Files Affected:

- (modified) clang/lib/CodeGen/CGOpenMPRuntime.cpp (+8-5) 
- (modified) clang/lib/CodeGen/CGOpenMPRuntime.h (+3-6) 
- (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (+3-6) 
- (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+25-14) 
- (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+40-31) 
- (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+16-13) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+6-5) 
- (modified) mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir (+1-1) 


``diff
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d714af035d21a2..0f7a1166227476 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -5880,10 +5880,13 @@ void 
CGOpenMPRuntime::emitUsesAllocatorsFini(CodeGenFunction &CGF,
 
 void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
 const OMPExecutableDirective &D, CodeGenFunction &CGF,
-int32_t &MinThreadsVal, int32_t &MaxThreadsVal, int32_t &MinTeamsVal,
-int32_t &MaxTeamsVal) {
+llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs) {
+  assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 &&
+ "invalid default attrs structure");
+  int32_t &MaxTeamsVal = Attrs.MaxTeams.front();
+  int32_t &MaxThreadsVal = Attrs.MaxThreads.front();
 
-  getNumTeamsExprForTargetDirective(CGF, D, MinTeamsVal, MaxTeamsVal);
+  getNumTeamsExprForTargetDirective(CGF, D, Attrs.MinTeams, MaxTeamsVal);
   getNumThreadsExprForTargetDirective(CGF, D, MaxThreadsVal,
   /*UpperBoundOnly=*/true);
 
@@ -5901,12 +5904,12 @@ void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
   else
 continue;
 
-  MinThreadsVal = std::max(MinThreadsVal, AttrMinThreadsVal);
+  Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreadsVal);
   if (AttrMaxThreadsVal > 0)
 MaxThreadsVal = MaxThreadsVal > 0
 ? std::min(MaxThreadsVal, AttrMaxThreadsVal)
 : AttrMaxThreadsVal;
-  MinTeamsVal = std::max(MinTeamsVal, AttrMinBlocksVal);
+  Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocksVal);
   if (AttrMaxBlocksVal > 0)
 MaxTeamsVal = MaxTeamsVal > 0 ? std::min(MaxTeamsVal, AttrMaxBlocksVal)
   : AttrMaxBlocksVal;
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h 
b/clang/lib/CodeGen/CGOpenMPRuntime.h
index 5e7715743afb58..003395e7f17ded 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -312,12 +312,9 @@ class CGOpenMPRuntime {
   llvm::OpenMPIRBuilder OMPBuilder;
 
   /// Helper to determine the min/max number of threads/teams for \p D.
-  void computeMinAndMaxThreadsAndTeams(const OMPExecutableDirective &D,
-   CodeGenFunction &CGF,
-   int32_t &MinThreadsVal,
-   int32_t &MaxThreadsVal,
-   int32_t &MinTeamsVal,
-   int32_t &MaxTeamsVal);
+  void computeMinAndMaxThreadsAndTeams(
+  const OMPExecutableDirective &D, CodeGenFunction &CGF,
+  llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs);
 
   /// Helper to emit outlined function for 'target' directive.
   /// \param D Directive to emit.
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index 43dc0e62284602..96f8d6c5c08e56 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -745,14 +745,11 @@ void CGOpenMPRuntimeGPU::emitNonSPMDKernel(const 
OMPExecutableDirective &D,
 void CGOpenMPRuntimeGPU::emitKernelInit(const OMPExecutableDirective &D,
 CodeGenFunction &CGF,
 EntryFunctionState &EST, bool IsSPMD) {
-  int32_t MinThreadsVal = 1, MaxThreadsVal = -1, MinTeamsVal = 1,
-  MaxTeamsVal = -1;
-  computeMinAndMaxThreadsAndTeams(D, CGF, MinThreadsVal, MaxThreadsVal,
-  MinTeamsVal, MaxTeamsVal);
+  llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs Attrs;
+  computeMinAndMaxThreadsAndTeams(D, CGF, Attrs);
 
   CGBuilderTy &Bld = CGF.Bu

[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-flang-openmp

@llvm/pr-subscribers-mlir-llvm

Author: Sergio Afonso (skatrak)


Changes

This patch introduces a `TargetKernelRuntimeAttrs` structure to hold
host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count values
passed to the runtime kernel offloading call.

Additionally, `createTarget` is extended to take an `IsSPMD` flag, used to 
influence target device code generation.

---

Patch is 31.58 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/116051.diff


4 Files Affected:

- (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+25-1) 
- (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+118-19) 
- (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+271-10) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+6-4) 


``diff
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h 
b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
index da450ef5adbc14..a85f41e586c514 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -2237,6 +2237,26 @@ class OpenMPIRBuilder {
 int32_t MinThreads = 1;
   };
 
+  /// Container to pass LLVM IR runtime values or constants related to the
+  /// number of teams and threads with which the kernel must be launched, as
+  /// well as the trip count of the SPMD loop, if it is an SPMD kernel. These
+  /// must be defined in the host prior to the call to the kernel launch OpenMP
+  /// RTL function.
+  struct TargetKernelRuntimeAttrs {
+    SmallVector<Value *, 3> MaxTeams = {nullptr};
+    Value *MinTeams = nullptr;
+    SmallVector<Value *, 3> TargetThreadLimit = {nullptr};
+    SmallVector<Value *, 3> TeamsThreadLimit = {nullptr};
+
+    /// 'parallel' construct 'num_threads' clause value, if present and it is a
+    /// target SPMD kernel.
+    Value *MaxThreads = nullptr;
+
+    /// Total number of iterations of the target SPMD kernel or null if it is a
+    /// generic kernel.
+    Value *LoopTripCount = nullptr;
+  };
+
   /// Data structure that contains the needed information to construct the
   /// kernel args vector.
   struct TargetKernelArgs {
@@ -2905,11 +2925,14 @@ class OpenMPIRBuilder {
   ///
   /// \param Loc where the target data construct was encountered.
   /// \param IsOffloadEntry whether it is an offload entry.
+  /// \param IsSPMD whether it is a target SPMD kernel.
   /// \param CodeGenIP The insertion point where the call to the outlined
   /// function should be emitted.
   /// \param EntryInfo The entry information about the function.
   /// \param DefaultAttrs Structure containing the default numbers of threads
   ///and teams to launch the kernel with.
+  /// \param RuntimeAttrs Structure containing the runtime numbers of threads
+  ///and teams to launch the kernel with.
   /// \param Inputs The input values to the region that will be passed.
   /// as arguments to the outlined function.
   /// \param BodyGenCB Callback that will generate the region code.
@@ -2919,11 +2942,12 @@ class OpenMPIRBuilder {
   // dependency information as passed in the depend clause
   // \param HasNowait Whether the target construct has a `nowait` clause or not.
   InsertPointOrErrorTy createTarget(
-  const LocationDescription &Loc, bool IsOffloadEntry,
+  const LocationDescription &Loc, bool IsOffloadEntry, bool IsSPMD,
   OpenMPIRBuilder::InsertPointTy AllocaIP,
   OpenMPIRBuilder::InsertPointTy CodeGenIP,
   TargetRegionEntryInfo &EntryInfo,
   const TargetKernelDefaultAttrs &DefaultAttrs,
+  const TargetKernelRuntimeAttrs &RuntimeAttrs,
   SmallVectorImpl<Value *> &Inputs, GenMapInfoCallbackTy GenMapInfoCB,
   TargetBodyGenCallbackTy BodyGenCB,
   TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB,
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 302d363965c940..f847f60386df85 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -6727,8 +6727,43 @@ FunctionCallee 
OpenMPIRBuilder::createDispatchDeinitFunction() {
   return getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_dispatch_deinit);
 }
 
+static void emitUsed(StringRef Name, std::vector &List,
+ Module &M) {
+  if (List.empty())
+return;
+
+  Type *PtrTy = PointerType::get(M.getContext(), /*AddressSpace=*/0);
+
+  // Convert List to what ConstantArray needs.
+  SmallVector UsedArray;
+  UsedArray.reserve(List.size());
+  for (auto Item : List)
+UsedArray.push_back(ConstantExpr::getPointerBitCastOrAddrSpaceCast(
+cast(&*Item), PtrTy));
+
+  ArrayType *ArrTy = ArrayType::get(PtrTy, UsedArray.size());
+  auto *GV =
+  new GlobalVariable(M, ArrTy, false, llvm::GlobalValue::AppendingLinkage,
+ llvm::ConstantArray::get(ArrTy, UsedArray), Name);
+
+  GV->setSection("llvm.metadata");
+}
+
+st

[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-13 Thread Sergio Afonso via llvm-branch-commits

skatrak wrote:

PR stack:
- #116048
- #116049
- #116050
- #116051
- #116052

https://github.com/llvm/llvm-project/pull/116051


[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2024-11-13 Thread via llvm-branch-commits


@@ -50,43 +42,28 @@ macro(enable_cuda_compilation name files)
   "${CUDA_COMPILE_OPTIONS}"
   )
 
-if (EXISTS "${FLANG_LIBCUDACXX_PATH}/include")
+if (EXISTS "${FLANG_RT_LIBCUDACXX_PATH}/include")
   # When using libcudacxx headers files, we have to use them
   # for all files of F18 runtime.
-  include_directories(AFTER ${FLANG_LIBCUDACXX_PATH}/include)
+  include_directories(AFTER ${FLANG_RT_LIBCUDACXX_PATH}/include)
   add_compile_definitions(RT_USE_LIBCUDACXX=1)
 endif()
 
 # Add an OBJECT library consisting of CUDA PTX.
-llvm_add_library(${name}PTX OBJECT PARTIAL_SOURCES_INTENDED ${files})
-set_property(TARGET obj.${name}PTX PROPERTY CUDA_PTX_COMPILATION ON)
-if (FLANG_CUDA_RUNTIME_PTX_WITHOUT_GLOBAL_VARS)
-  target_compile_definitions(obj.${name}PTX
-PRIVATE FLANG_RUNTIME_NO_GLOBAL_VAR_DEFS
+add_flangrt_library(${name}PTX OBJECT ${files})

jeanPerier wrote:

I think `INSTALL_WITH_TOOLCHAIN` may be needed here. I am not an expert in
`llvm_add_library`, so I cannot be certain, but it seemed to build and install
the PTX library in the `lib` directory of the LLVM build. I am not sure where
the library ends up inside the build directory with this patch.

https://github.com/llvm/llvm-project/pull/110217


[llvm-branch-commits] [llvm] [CGData] Refactor Global Merge Functions (PR #115750)

2024-11-13 Thread Kyungwoo Lee via llvm-branch-commits

kyulee-com wrote:

@nocchijiang  The new approach seems to be functioning well and is similar in 
size to the previous method.
I suspect that the no-LTO case might still encounter some slowdown, as each CU 
needs to read the entire CGData regardless. Currently, the CGData used for this 
merging process does not utilize names, which means we could potentially 
eliminate strings or make them optional. Alternatively, we could restructure 
the indexed CGData to allow for reading only the relevant hash entries on 
demand. I'd like to leave these options open for now, and if you can continue 
to improve it, that would be excellent.
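
To make the on-demand idea concrete, here is a minimal sketch of a hash-sorted index that is searched without touching any other record. It is purely illustrative: `IndexEntry` and `findRecordOffset` are hypothetical names, and nothing here reflects the actual CGData file layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical index entry: a stable function hash and the byte offset of the
// corresponding record in the data file.
struct IndexEntry {
  uint64_t Hash;
  uint64_t Offset;
};

// Binary-search an index sorted by Hash. Only the matching record would then
// need to be read, instead of deserializing the whole file.
std::optional<uint64_t> findRecordOffset(const std::vector<IndexEntry> &Index,
                                         uint64_t Hash) {
  assert(std::is_sorted(Index.begin(), Index.end(),
                        [](const IndexEntry &A, const IndexEntry &B) {
                          return A.Hash < B.Hash;
                        }) &&
         "index must be sorted by hash");
  auto It = std::lower_bound(
      Index.begin(), Index.end(), Hash,
      [](const IndexEntry &E, uint64_t H) { return E.Hash < H; });
  if (It == Index.end() || It->Hash != Hash)
    return std::nullopt; // No record for this hash.
  return It->Offset;
}
```

Under this assumed layout, each CU would pay a logarithmic lookup per hash it cares about, rather than reading every entry up front.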

https://github.com/llvm/llvm-project/pull/115750


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #113557)

2024-11-13 Thread via llvm-branch-commits

agozillon wrote:

Thank you very much @skatrak and @ergawy. I'll land this PR stack on either
Friday or the coming Monday, to give a few days' leeway in case anyone else
wishes to comment!

https://github.com/llvm/llvm-project/pull/113557


[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-13 Thread Sergio Afonso via llvm-branch-commits

skatrak wrote:

Buildbot failure seems to be some temporary issue unrelated to the PR.

https://github.com/llvm/llvm-project/pull/116051


[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2024-11-13 Thread via llvm-branch-commits


@@ -50,43 +42,28 @@ macro(enable_cuda_compilation name files)
   "${CUDA_COMPILE_OPTIONS}"
   )
 
-if (EXISTS "${FLANG_LIBCUDACXX_PATH}/include")
+if (EXISTS "${FLANG_RT_LIBCUDACXX_PATH}/include")
   # When using libcudacxx headers files, we have to use them
   # for all files of F18 runtime.
-  include_directories(AFTER ${FLANG_LIBCUDACXX_PATH}/include)
+  include_directories(AFTER ${FLANG_RT_LIBCUDACXX_PATH}/include)
   add_compile_definitions(RT_USE_LIBCUDACXX=1)
 endif()
 
 # Add an OBJECT library consisting of CUDA PTX.
-llvm_add_library(${name}PTX OBJECT PARTIAL_SOURCES_INTENDED ${files})
-set_property(TARGET obj.${name}PTX PROPERTY CUDA_PTX_COMPILATION ON)
-if (FLANG_CUDA_RUNTIME_PTX_WITHOUT_GLOBAL_VARS)
-  target_compile_definitions(obj.${name}PTX
-PRIVATE FLANG_RUNTIME_NO_GLOBAL_VAR_DEFS
+add_flangrt_library(${name}PTX OBJECT ${files})

jeanPerier wrote:

Also, I think the `OBJECT` processing [in `llvm_add_library`](https://github.com/llvm/llvm-project/blob/0baa6a7272970257fd6f527e95eb7cb18ba3361c/llvm/cmake/modules/AddLLVM.cmake#L565)
is more complex and also implies `STATIC`: it triggers both [an object build for
`obj.${name}PTX`](https://github.com/llvm/llvm-project/blob/0baa6a7272970257fd6f527e95eb7cb18ba3361c/llvm/cmake/modules/AddLLVM.cmake#L568)
and [a STATIC build for
`${name}PTX`](https://github.com/llvm/llvm-project/blob/0baa6a7272970257fd6f527e95eb7cb18ba3361c/llvm/cmake/modules/AddLLVM.cmake#L644).

I do not think this happens with `add_flangrt_library`, which only makes an
object build.

https://github.com/llvm/llvm-project/pull/110217


[llvm-branch-commits] [clang] clang/HIP: Remove requires system-linux from some driver tests (PR #112842)

2024-11-13 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/112842

>From 87f64e8bf51d43c34c5cb4de12661a44674d92b7 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 18 Oct 2024 09:40:34 +0400
Subject: [PATCH] clang/HIP: Remove requires system-linux from some driver
 tests

---
 clang/test/Driver/hip-partial-link.hip |  2 +-
 clang/test/Driver/linker-wrapper.c | 10 --
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/clang/test/Driver/hip-partial-link.hip 
b/clang/test/Driver/hip-partial-link.hip
index 8b27f78f3bdd12..5580e569780194 100644
--- a/clang/test/Driver/hip-partial-link.hip
+++ b/clang/test/Driver/hip-partial-link.hip
@@ -1,4 +1,4 @@
-// REQUIRES: x86-registered-target, amdgpu-registered-target, lld, system-linux
+// REQUIRES: x86-registered-target, amdgpu-registered-target, lld
 
 // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu 
--no-offload-new-driver \
 // RUN:   --offload-arch=gfx906 -c -nostdinc -nogpuinc -nohipwrapperinc \
diff --git a/clang/test/Driver/linker-wrapper.c 
b/clang/test/Driver/linker-wrapper.c
index 470af4d5d70cac..fac4331e51f694 100644
--- a/clang/test/Driver/linker-wrapper.c
+++ b/clang/test/Driver/linker-wrapper.c
@@ -2,8 +2,6 @@
 // REQUIRES: nvptx-registered-target
 // REQUIRES: amdgpu-registered-target
 
-// REQUIRES: system-linux
-
 // An externally visible variable so static libraries extract.
 __attribute__((visibility("protected"), used)) int x;
 
@@ -30,7 +28,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run 
--device-debug -O0 \
 // RUN:   --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s 
--check-prefix=NVPTX-LINK-DEBUG
 
-// NVPTX-LINK-DEBUG: clang{{.*}} -o {{.*}}.img --target=nvptx64-nvidia-cuda 
-march=sm_70 -O2 -flto {{.*}}.o {{.*}}.o -g 
+// NVPTX-LINK-DEBUG: clang{{.*}} -o {{.*}}.img --target=nvptx64-nvidia-cuda 
-march=sm_70 -O2 -flto {{.*}}.o {{.*}}.o -g
 
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   
--image=file=%t.elf.o,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx908 \
@@ -93,7 +91,7 @@ __attribute__((visibility("protected"), used)) int x;
 
 // CUDA: clang{{.*}} -o [[IMG_SM70:.+]] --target=nvptx64-nvidia-cuda 
-march=sm_70
 // CUDA: clang{{.*}} -o [[IMG_SM52:.+]] --target=nvptx64-nvidia-cuda 
-march=sm_52
-// CUDA: fatbinary{{.*}}-64 --create {{.*}}.fatbin 
--image=profile=sm_70,file=[[IMG_SM70]] --image=profile=sm_52,file=[[IMG_SM52]] 
+// CUDA: fatbinary{{.*}}-64 --create {{.*}}.fatbin 
--image=profile=sm_70,file=[[IMG_SM70]] --image=profile=sm_52,file=[[IMG_SM52]]
 // CUDA: usr/bin/ld{{.*}} {{.*}}.openmp.image.{{.*}}.o 
{{.*}}.cuda.image.{{.*}}.o
 
 // RUN: clang-offload-packager -o %t.out \
@@ -120,7 +118,7 @@ __attribute__((visibility("protected"), used)) int x;
 
 // HIP: clang{{.*}} -o [[IMG_GFX90A:.+]] --target=amdgcn-amd-amdhsa 
-mcpu=gfx90a
 // HIP: clang{{.*}} -o [[IMG_GFX908:.+]] --target=amdgcn-amd-amdhsa 
-mcpu=gfx908
-// HIP: clang-offload-bundler{{.*}}-type=o -bundle-align=4096 -compress 
-compression-level=6 
-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx90a,hip-amdgcn-amd-amdhsa--gfx908
 -input=/dev/null -input=[[IMG_GFX90A]] -input=[[IMG_GFX908]] 
-output={{.*}}.hipfb
+// HIP: clang-offload-bundler{{.*}}-type=o -bundle-align=4096 -compress 
-compression-level=6 
-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx90a,hip-amdgcn-amd-amdhsa--gfx908
 -input={{/dev/null|NUL}} -input=[[IMG_GFX90A]] -input=[[IMG_GFX908]] 
-output={{.*}}.hipfb
 
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   
--image=file=%t.elf.o,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx908 \
@@ -211,7 +209,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN:   %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=RELOCATABLE-LINK-HIP
 
 // RELOCATABLE-LINK-HIP: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa
-// RELOCATABLE-LINK-HIP: clang-offload-bundler{{.*}} -type=o 
-bundle-align=4096 
-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx90a 
-input=/dev/null -input={{.*}} -output={{.*}}
+// RELOCATABLE-LINK-HIP: clang-offload-bundler{{.*}} -type=o 
-bundle-align=4096 
-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx90a 
-input={{/dev/null|NUL}} -input={{.*}} -output={{.*}}
 // RELOCATABLE-LINK-HIP: /usr/bin/ld.lld{{.*}}-r
 // RELOCATABLE-LINK-HIP: llvm-objcopy{{.*}}a.out --remove-section 
.llvm.offloading
 



[llvm-branch-commits] [llvm] release/19.x [InstCombine] Drop nsw in negation of select (PR #116097)

2024-11-13 Thread via llvm-branch-commits

https://github.com/AreaZR created 
https://github.com/llvm/llvm-project/pull/116097

Closes https://github.com/llvm/llvm-project/issues/112666 and 
https://github.com/llvm/llvm-project/issues/114181.

(cherry-picked from 8d86a537ad756e31832eab67371179e881452fb5)
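
A small model of why the flag has to go, assuming `i8` arithmetic and using `std::nullopt` as a stand-in for poison. The helper names (`neg`, `original`, `transformed`) are invented for this sketch and do not correspond to InstCombine code; the two functions mirror the input IR and the CHECK lines of the `negate_select_of_op_vs_negated_op_nsw` test in the patch below.

```cpp
#include <cassert> // only needed for the checks in the usage note
#include <cstdint>
#include <optional>

// std::nullopt models an LLVM poison value.
using I8 = std::optional<int8_t>;

// Two's-complement wrap to 8 bits (modular on all mainstream compilers).
int8_t wrap(int V) { return static_cast<int8_t>(static_cast<uint8_t>(V)); }

// Models `sub [nsw] i8 0, %x`.
I8 neg(int8_t X, bool Nsw) {
  if (Nsw && X == INT8_MIN)
    return std::nullopt; // 0 - (-128) overflows i8, so nsw yields poison.
  return wrap(0 - X);
}

// Input IR: %t1 = select %c, (sub nsw 0, %x), %x ; %t2 = sub %y, %t1
I8 original(int8_t X, int8_t Y, bool C) {
  I8 T1 = C ? neg(X, /*Nsw=*/true) : I8(X);
  if (!T1)
    return std::nullopt;
  return wrap(Y - *T1);
}

// After negation: %tmp = select %c, %x, (sub [nsw] 0, %x) ; %t2 = add %tmp, %y
I8 transformed(int8_t X, int8_t Y, bool C, bool KeepNsw) {
  I8 Tmp = C ? I8(X) : neg(X, KeepNsw);
  if (!Tmp)
    return std::nullopt;
  return wrap(*Tmp + Y);
}
```

For `x = -128`, `y = 1`, `c = false`, `original` yields a defined `-127`. `transformed` with `KeepNsw = true` yields poison, so the rewrite would be a miscompile. With `KeepNsw = false` it yields `-127` again, which is the behavior the patch restores by dropping the flag.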

>From d2d67d1eaeb5a9d44b23b9175c1aab9928247239 Mon Sep 17 00:00:00 2001
From: Rose 
Date: Wed, 13 Nov 2024 14:40:33 -0500
Subject: [PATCH 1/2] Pre-commit tests (NFC)

---
 .../InstCombine/sub-of-negatible.ll   | 42 +++
 1 file changed, 42 insertions(+)

diff --git a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll 
b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
index b2e14ceaca1b08..dfc461b48800f7 100644
--- a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
+++ b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
@@ -1374,6 +1374,48 @@ define i8 @negate_select_of_op_vs_negated_op(i8 %x, i8 
%y, i1 %c) {
   %t2 = sub i8 %y, %t1
   ret i8 %t2
 }
+
+define i8 @negate_select_of_op_vs_negated_op_nsw(i8 %x, i8 %y, i1 %c) {
+; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw(
+; CHECK-NEXT:    [[T0:%.*]] = sub nsw i8 0, [[X:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[X]], i8 [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = add i8 [[TMP1]], [[Y:%.*]]
+; CHECK-NEXT:    ret i8 [[T2]]
+;
+  %t0 = sub nsw i8 0, %x
+  %t1 = select i1 %c, i8 %t0, i8 %x
+  %t2 = sub i8 %y, %t1
+  ret i8 %t2
+}
+
+define i8 @negate_select_of_op_vs_negated_op_nsw_commuted(i8 %x, i8 %y, i1 %c) {
+; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw_commuted(
+; CHECK-NEXT:    [[T0:%.*]] = sub nsw i8 0, [[X:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[T0]], i8 [[X]]
+; CHECK-NEXT:    [[T2:%.*]] = add i8 [[TMP1]], [[Y:%.*]]
+; CHECK-NEXT:    ret i8 [[T2]]
+;
+  %t0 = sub nsw i8 0, %x
+  %t1 = select i1 %c, i8 %x, i8 %t0
+  %t2 = sub i8 %y, %t1
+  ret i8 %t2
+}
+
+define i8 @negate_select_of_op_vs_negated_op_nsw_xyyx(i8 %x, i8 %y, i8 %z, i1 %c) {
+; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw_xyyx(
+; CHECK-NEXT:    [[SUB1:%.*]] = sub nsw i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[SUB2:%.*]] = sub nsw i8 [[Y]], [[X]]
+; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[SUB2]], i8 [[SUB1]]
+; CHECK-NEXT:    [[T2:%.*]] = add i8 [[TMP1]], [[Z:%.*]]
+; CHECK-NEXT:    ret i8 [[T2]]
+;
+  %sub1 = sub nsw i8 %x, %y
+  %sub2 = sub nsw i8 %y, %x
+  %t1 = select i1 %c, i8 %sub1, i8 %sub2
+  %t2 = sub i8 %z, %t1
+  ret i8 %t2
+}
+
 define i8 @dont_negate_ordinary_select(i8 %x, i8 %y, i8 %z, i1 %c) {
 ; CHECK-LABEL: @dont_negate_ordinary_select(
 ; CHECK-NEXT:[[T0:%.*]] = select i1 [[C:%.*]], i8 [[X:%.*]], i8 [[Y:%.*]]

>From 9aa29134a11fdafb30c944fd4b3c95bc07db4d49 Mon Sep 17 00:00:00 2001
From: Rose 
Date: Wed, 13 Nov 2024 14:44:01 -0500
Subject: [PATCH 2/2] [InstCombine] Drop nsw in negation of select

(cherry-picked from 8d86a537ad756e31832eab67371179e881452fb5)
---
 .../lib/Transforms/InstCombine/InstCombineNegator.cpp | 11 +++
 llvm/test/Transforms/InstCombine/sub-of-negatible.ll  |  8 
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp 
b/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
index e4895b59f4b4a9..cb052da79bb3c6 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
@@ -334,6 +334,17 @@ std::array<Value *, 2> Negator::getSortedOperandsOfBinOp(Instruction *I) {
       NewSelect->swapValues();
       // Don't swap prof metadata, we didn't change the branch behavior.
       NewSelect->setName(I->getName() + ".neg");
+      // Poison-generating flags should be dropped
+      Value *TV = NewSelect->getTrueValue();
+      Value *FV = NewSelect->getFalseValue();
+      if (match(TV, m_Neg(m_Specific(FV))))
+        cast<Instruction>(TV)->dropPoisonGeneratingFlags();
+      else if (match(FV, m_Neg(m_Specific(TV))))
+        cast<Instruction>(FV)->dropPoisonGeneratingFlags();
+      else {
+        cast<Instruction>(TV)->dropPoisonGeneratingFlags();
+        cast<Instruction>(FV)->dropPoisonGeneratingFlags();
+      }
       Builder.Insert(NewSelect);
       return NewSelect;
     }
diff --git a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll 
b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
index dfc461b48800f7..f9549881aa3131 100644
--- a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
+++ b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
@@ -1377,7 +1377,7 @@ define i8 @negate_select_of_op_vs_negated_op(i8 %x, i8 
%y, i1 %c) {
 
 define i8 @negate_select_of_op_vs_negated_op_nsw(i8 %x, i8 %y, i1 %c) {
 ; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw(
-; CHECK-NEXT:    [[T0:%.*]] = sub nsw i8 0, [[X:%.*]]
+; CHECK-NEXT:    [[T0:%.*]] = sub i8 0, [[X:%.*]]
 ; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[X]], i8 [[T0]]
 ; CHECK-NEXT:    [[T2:%.*]] = add i8 [[TMP1]], [[Y:%.*]]
 ; CHECK-NEXT:    ret i8 [[T2]]
@@ -1390,7 +1390,7 @@ define i8 @negate_select

[llvm-branch-commits] [llvm] release/19.x: [InstCombine] Drop nsw in negation of select (PR #116097)

2024-11-13 Thread via llvm-branch-commits

https://github.com/AreaZR edited 
https://github.com/llvm/llvm-project/pull/116097


[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir

Author: Sergio Afonso (skatrak)


Changes

This patch introduces a `TargetKernelRuntimeAttrs` structure to hold
host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count values
passed to the runtime kernel offloading call.

Additionally, `createTarget` is extended to take an `IsSPMD` flag, used to 
influence target device code generation.

---

Patch is 31.58 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/116051.diff


4 Files Affected:

- (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+25-1) 
- (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+118-19) 
- (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+271-10) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+6-4) 


``diff
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h 
b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
index da450ef5adbc14..a85f41e586c514 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -2237,6 +2237,26 @@ class OpenMPIRBuilder {
 int32_t MinThreads = 1;
   };
 
+  /// Container to pass LLVM IR runtime values or constants related to the
+  /// number of teams and threads with which the kernel must be launched, as
+  /// well as the trip count of the SPMD loop, if it is an SPMD kernel. These
+  /// must be defined in the host prior to the call to the kernel launch OpenMP
+  /// RTL function.
+  struct TargetKernelRuntimeAttrs {
+SmallVector MaxTeams = {nullptr};
+Value *MinTeams = nullptr;
+SmallVector TargetThreadLimit = {nullptr};
+SmallVector TeamsThreadLimit = {nullptr};
+
+/// 'parallel' construct 'num_threads' clause value, if present and it is a
+/// target SPMD kernel.
+Value *MaxThreads = nullptr;
+
+/// Total number of iterations of the target SPMD kernel or null if it is a
+/// generic kernel.
+Value *LoopTripCount = nullptr;
+  };
+
   /// Data structure that contains the needed information to construct the
   /// kernel args vector.
   struct TargetKernelArgs {
@@ -2905,11 +2925,14 @@ class OpenMPIRBuilder {
   ///
   /// \param Loc where the target data construct was encountered.
   /// \param IsOffloadEntry whether it is an offload entry.
+  /// \param IsSPMD whether it is a target SPMD kernel.
   /// \param CodeGenIP The insertion point where the call to the outlined
   /// function should be emitted.
   /// \param EntryInfo The entry information about the function.
   /// \param DefaultAttrs Structure containing the default numbers of threads
   ///and teams to launch the kernel with.
+  /// \param RuntimeAttrs Structure containing the runtime numbers of threads
+  ///and teams to launch the kernel with.
   /// \param Inputs The input values to the region that will be passed.
   /// as arguments to the outlined function.
   /// \param BodyGenCB Callback that will generate the region code.
@@ -2919,11 +2942,12 @@ class OpenMPIRBuilder {
   // dependency information as passed in the depend clause
   // \param HasNowait Whether the target construct has a `nowait` clause or 
not.
   InsertPointOrErrorTy createTarget(
-  const LocationDescription &Loc, bool IsOffloadEntry,
+  const LocationDescription &Loc, bool IsOffloadEntry, bool IsSPMD,
   OpenMPIRBuilder::InsertPointTy AllocaIP,
   OpenMPIRBuilder::InsertPointTy CodeGenIP,
   TargetRegionEntryInfo &EntryInfo,
   const TargetKernelDefaultAttrs &DefaultAttrs,
+  const TargetKernelRuntimeAttrs &RuntimeAttrs,
   SmallVectorImpl &Inputs, GenMapInfoCallbackTy GenMapInfoCB,
   TargetBodyGenCallbackTy BodyGenCB,
   TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB,
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 302d363965c940..f847f60386df85 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -6727,8 +6727,43 @@ FunctionCallee 
OpenMPIRBuilder::createDispatchDeinitFunction() {
   return getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_dispatch_deinit);
 }
 
+static void emitUsed(StringRef Name, std::vector &List,
+ Module &M) {
+  if (List.empty())
+return;
+
+  Type *PtrTy = PointerType::get(M.getContext(), /*AddressSpace=*/0);
+
+  // Convert List to what ConstantArray needs.
+  SmallVector UsedArray;
+  UsedArray.reserve(List.size());
+  for (auto Item : List)
+UsedArray.push_back(ConstantExpr::getPointerBitCastOrAddrSpaceCast(
+cast(&*Item), PtrTy));
+
+  ArrayType *ArrTy = ArrayType::get(PtrTy, UsedArray.size());
+  auto *GV =
+  new GlobalVariable(M, ArrTy, false, llvm::GlobalValue::AppendingLinkage,
+ llvm::ConstantArray::get(ArrTy, UsedArray), Name);
+
+  GV->setSection("llvm.metadata");
+}
+
+static void
+emitExecutionMode(OpenMPIRBu

[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #113557)

2024-11-13 Thread Sergio Afonso via llvm-branch-commits


@@ -145,11 +145,294 @@ createMapInfoOp(fir::FirOpBuilder &builder, 
mlir::Location loc,
   builder.getIntegerAttr(builder.getIntegerType(64, false), mapType),
   builder.getAttr(mapCaptureType),
   builder.getStringAttr(name), builder.getBoolAttr(partialMap));
-
   return op;
 }
 
-static int
+// This function gathers the individual omp::Objects that make up a
+// larger omp::Object symbol.
+//
+// For example, given the larger symbol "parent%child%member", this
+// function breaks it up into its constituent components ("parent",
+// "child", "member") so that we can access each individual component and
+// introspect its details. Note that this function breaks the symbol up from
+// RHS to LHS ("member" to "parent") and then reverses the result so that the
+// returned omp::ObjectList is ordered LHS to RHS, with "parent" at the
+// beginning.
+omp::ObjectList gatherObjectsOf(omp::Object derivedTypeMember,
+semantics::SemanticsContext &semaCtx) {
+  omp::ObjectList objList;
+  std::optional baseObj = derivedTypeMember;
+  while (baseObj.has_value()) {
+objList.push_back(baseObj.value());
+baseObj = getBaseObject(baseObj.value(), semaCtx);
+  }
+  return omp::ObjectList{llvm::reverse(objList)};
+}
+
+// This function generates a series of indices from a provided omp::Object
+// that devolves to an ArrayRef symbol. For example, for "array(2,3,4)" this
+// function would generate the indices "[1][2][3]", offsetting each subscript
+// by -1 to account for Fortran's one-based indexing.
+//
+// These indices can then be provided to a coordinate operation or other
+// GEP-like operation to access the relevant positional member of the
+// array.
+//
+// Note that the function currently supports only integer subscripts
+// and not triplets such as array(1:2:3).
+static void generateArrayIndices(lower::AbstractConverter &converter,
+ fir::FirOpBuilder &firOpBuilder,
+ lower::StatementContext &stmtCtx,
+ mlir::Location clauseLocation,
+ llvm::SmallVectorImpl &indices,
+ omp::Object object) {
+  auto maybeRef = evaluate::ExtractDataRef(*object.ref());
+  if (!maybeRef)
+return;
+
+  auto *arr = std::get_if(&maybeRef->u);
+  if (!arr)
+return;
+
+  for (auto v : arr->subscript()) {
+if (std::holds_alternative(v.u)) {
+  llvm_unreachable("Triplet indexing in map clause is unsupported");
+} else {
+  auto expr =
+  std::get(v.u);
+  mlir::Value subscript =
+  fir::getBase(converter.genExprValue(toEvExpr(expr.value()), 
stmtCtx));
+  mlir::Value one = firOpBuilder.createIntegerConstant(
+  clauseLocation, firOpBuilder.getIndexType(), 1);
+  subscript = firOpBuilder.createConvert(
+  clauseLocation, firOpBuilder.getIndexType(), subscript);
+  indices.push_back(firOpBuilder.create(
+  clauseLocation, subscript, one));
+}
+  }
+}
+
+/// When mapping members of derived types, there is a chance that one of the
+/// members along the way to a mapped member is a descriptor, in which case
+/// we have to make sure we generate a map for each such member; otherwise
+/// we will be missing a chunk of data required to actually map the member
+/// type to the device. This function generates these maps and the
+/// data accesses they require. It avoids
+/// creating duplicate maps, as duplicates are just as bad as unmapped
+/// descriptor data in a lot of cases for the runtime (and unnecessary
+/// data movement should be avoided where possible).
+///
+/// As an example for the following mapping:
+///
+/// type :: vertexes
+/// integer(4), allocatable :: vertexx(:)
+/// integer(4), allocatable :: vertexy(:)
+/// end type vertexes
+///
+/// type :: dtype
+/// real(4) :: i
+/// type(vertexes), allocatable :: vertexes(:)
+/// end type dtype
+///
+/// type(dtype), allocatable :: alloca_dtype
+///
+/// !$omp target map(tofrom: alloca_dtype%vertexes(N1)%vertexx)
+///
+/// The below HLFIR/FIR is generated (trimmed for conciseness):
+///
+/// On the first iteration we index into the record type alloca_dtype
+/// to access "vertexes", we then generate a map for this descriptor
+/// alongside bounds to indicate we only need the 1 member, rather than
+/// the whole array block in this case (In theory we could map its
+/// entirety at the cost of data transfer bandwidth).
+///
+/// %13:2 = hlfir.declare ... "alloca_dtype" ...
+/// %39 = fir.load %13#0 : ...
+/// %40 = fir.coordinate_of %39, %c1 : ...
+/// %51 = omp.map.info var_ptr(%40 : ...) map_clauses(to) capture(ByRef) ...
+/// %52 = fir.load %40 : ...
+///
+/// Second iteration, generating access to "vertexes(N1)" utilising the N1 index:
+/// %53 = load N1 ...
+/// %54 = fir.convert %53 : (i32) -> i64
+/// %55 = fir.convert

[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #113557)

2024-11-13 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak approved this pull request.

Thank you Andrew for all your work on this, LGTM!

https://github.com/llvm/llvm-project/pull/113557


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] LLVM IR translation of host_eval (PR #116052)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-openmp

Author: Sergio Afonso (skatrak)


Changes

This patch adds support for processing the `host_eval` clause of `omp.target` 
to populate default and runtime kernel launch attributes. Specifically, these 
relate to the `num_teams`, `thread_limit` and `num_threads` clauses attached 
to operations nested inside of `omp.target`. As a result, the `thread_limit` 
clause of `omp.target` is also supported.

The implementation of `initTargetDefaultAttrs()` is intended to reflect clang's 
own processing of multiple constructs and clauses in order to define a default 
number of teams and threads to be used as kernel attributes and to populate 
global variables in the target device module.

One side effect of this change is that it is no longer possible to translate to 
LLVM IR target device MLIR modules unless they have a supported target triple. 
This is because the local `getGridValue()` function in the `OpenMPIRBuilder` 
only works for certain architectures, and it is called whenever the maximum 
number of threads has not been explicitly defined. This limitation also matches 
clang.
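
The triple requirement described above can be pictured with a small standalone sketch. It is not the `OpenMPIRBuilder` code: `defaultMaxThreads` and `resolveMaxThreads` are hypothetical names, the `nvptx` entry is an assumption (the description only says "certain architectures"), and the numeric defaults are placeholders rather than real grid values.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a per-architecture grid-value lookup.
// Returns a default maximum number of threads for known device
// architectures and nothing for every other triple.
std::optional<int> defaultMaxThreads(const std::string &Triple) {
  if (Triple.rfind("amdgcn", 0) == 0) // triple starts with "amdgcn"
    return 256;                       // placeholder value, not the real default
  if (Triple.rfind("nvptx", 0) == 0)  // assumed second supported architecture
    return 128;                       // placeholder value, not the real default
  return std::nullopt;                // unsupported triple: no default exists
}

// An explicitly defined maximum wins; the lookup is consulted only when no
// explicit value was given, which is when the triple starts to matter.
std::optional<int> resolveMaxThreads(std::optional<int> Explicit,
                                     const std::string &Triple) {
  assert((!Explicit || *Explicit > 0) && "explicit limit must be positive");
  if (Explicit)
    return Explicit;
  return defaultMaxThreads(Triple);
}
```

In this model a device module with an unsupported triple can only be handled when the thread limit is spelled out, which is consistent with the updated tests passing `-triple amdgcn-amd-amdhsa`.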

Evaluating the collapsed loop trip count of target SPMD kernels is not yet 
supported.

---

Patch is 37.90 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/116052.diff


18 Files Affected:

- (modified) flang/test/Integration/OpenMP/target-filtering.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/function-filtering-2.f90 (+3-3) 
- (modified) flang/test/Lower/OpenMP/function-filtering-3.f90 (+3-3) 
- (modified) flang/test/Lower/OpenMP/function-filtering.f90 (+3-3) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+229-16) 
- (modified) 
mlir/test/Target/LLVMIR/omptarget-byref-bycopy-generation-device.mlir (+2-2) 
- (modified) mlir/test/Target/LLVMIR/omptarget-constant-alloca-raise.mlir 
(+2-2) 
- (modified) 
mlir/test/Target/LLVMIR/omptarget-constant-indexing-device-region.mlir (+2-2) 
- (modified) mlir/test/Target/LLVMIR/omptarget-debug.mlir (+1-1) 
- (modified) mlir/test/Target/LLVMIR/omptarget-declare-target-llvm-device.mlir 
(+1-1) 
- (modified) mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir (+2-2) 
- (modified) mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir (+3-3) 
- (modified) mlir/test/Target/LLVMIR/omptarget-target-inside-task.mlir (+2-2) 
- (added) mlir/test/Target/LLVMIR/openmp-target-launch-device.mlir (+43) 
- (added) mlir/test/Target/LLVMIR/openmp-target-launch-host.mlir (+31) 
- (modified) mlir/test/Target/LLVMIR/openmp-target-use-device-nested.mlir 
(+2-2) 
- (modified) mlir/test/Target/LLVMIR/openmp-task-target-device.mlir (+1-1) 
- (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (+13-14) 


``diff
diff --git a/flang/test/Integration/OpenMP/target-filtering.f90 
b/flang/test/Integration/OpenMP/target-filtering.f90
index d1ab1b47e580d4..699c1040d91f9c 100644
--- a/flang/test/Integration/OpenMP/target-filtering.f90
+++ b/flang/test/Integration/OpenMP/target-filtering.f90
@@ -7,7 +7,7 @@
 !===--===!
 
 !RUN: %flang_fc1 -emit-llvm -fopenmp %s -o - | FileCheck %s --check-prefixes 
HOST,ALL
-!RUN: %flang_fc1 -emit-llvm -fopenmp -fopenmp-is-target-device %s -o - | 
FileCheck %s --check-prefixes DEVICE,ALL
+!RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -emit-llvm -fopenmp 
-fopenmp-is-target-device %s -o - | FileCheck %s --check-prefixes DEVICE,ALL
 
 !HOST: define {{.*}}@{{.*}}before{{.*}}(
 !DEVICE-NOT: define {{.*}}@before{{.*}}(
diff --git a/flang/test/Lower/OpenMP/function-filtering-2.f90 
b/flang/test/Lower/OpenMP/function-filtering-2.f90
index 0c02aa223820e7..a2c5e29cfdcbf6 100644
--- a/flang/test/Lower/OpenMP/function-filtering-2.f90
+++ b/flang/test/Lower/OpenMP/function-filtering-2.f90
@@ -1,9 +1,9 @@
 ! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -flang-experimental-hlfir 
-emit-llvm %s -o - | FileCheck --check-prefixes=LLVM,LLVM-HOST %s
 ! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -emit-hlfir %s -o - | FileCheck 
--check-prefix=MLIR %s
-! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -fopenmp-is-target-device 
-flang-experimental-hlfir -emit-llvm %s -o - | FileCheck 
--check-prefixes=LLVM,LLVM-DEVICE %s
-! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -fopenmp-is-target-device 
-emit-hlfir %s -o - | FileCheck --check-prefix=MLIR %s
+! RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -fopenmp -fopenmp-version=52 
-fopenmp-is-target-device -flang-experimental-hlfir -emit-llvm %s -o - | 
FileCheck --check-prefixes=LLVM,LLVM-DEVICE %s
+! RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -fopenmp -fopenmp-version=52 
-fopenmp-is-target-device -emit-hlfir %s -o - | FileCheck --check-prefix=MLIR %s
 ! RUN: bbc -fopenmp -fopenmp-version=52 -emit-hlfir %s -o - | FileCheck 
--check-prefixes=MLIR-HOST,MLIR-ALL %s
-! RUN: bbc -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -emit-hlfir 
%s -o - | FileC

[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)

2024-11-13 Thread Sergio Afonso via llvm-branch-commits

skatrak wrote:

PR stack:
- #116048
- #116049
- #116050
- #116051
- #116052

https://github.com/llvm/llvm-project/pull/116050


[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add host_eval clause to omp.target (PR #116049)

2024-11-13 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak created 
https://github.com/llvm/llvm-project/pull/116049

This patch adds the `host_eval` clause to the `omp.target` operation. 
Additionally, it updates its op verifier to make sure all uses of block 
arguments defined by this clause fall within one of the few cases where they 
are allowed.

MLIR to LLVM IR translation fails on translation of this clause with a 
not-yet-implemented error.

>From 26fbb25720cf472c66eef259845e1fa73668f77c Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Fri, 8 Nov 2024 12:00:45 +
Subject: [PATCH] [MLIR][OpenMP] Add host_eval clause to omp.target

This patch adds the `host_eval` clause to the `omp.target` operation.
Additionally, it updates its op verifier to make sure all uses of block
arguments defined by this clause fall within one of the few cases where they
are allowed.

MLIR to LLVM IR translation fails on translation of this clause with a
not-yet-implemented error.
---
 mlir/docs/Dialects/OpenMPDialect/_index.md|  55 ++
 mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td |  33 +++-
 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp  | 167 +-
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  |   5 +
 mlir/test/Dialect/OpenMP/invalid.mlir |  70 +++-
 mlir/test/Dialect/OpenMP/ops.mlir |  38 +++-
 mlir/test/Target/LLVMIR/openmp-todo.mlir  |  14 ++
 7 files changed, 369 insertions(+), 13 deletions(-)

diff --git a/mlir/docs/Dialects/OpenMPDialect/_index.md 
b/mlir/docs/Dialects/OpenMPDialect/_index.md
index 4e5d777d6c4f7f..e0dd3f598e84b6 100644
--- a/mlir/docs/Dialects/OpenMPDialect/_index.md
+++ b/mlir/docs/Dialects/OpenMPDialect/_index.md
@@ -523,3 +523,58 @@ omp.parallel ... {
   omp.terminator
 } {omp.composite}
 ```
+
+## Host-Evaluated Clauses in Target Regions
+
+The `omp.target` operation, which represents the OpenMP `target` construct, is
+marked with the `IsolatedFromAbove` trait. This means that, inside of its
+region, no MLIR values defined outside of the op itself can be used. This is
+consistent with the OpenMP specification of the `target` construct, which
+mandates that all host device values used inside of the `target` region must
+either be privatized (data-sharing) or mapped (data-mapping).
+
+Normally, clauses applied to a construct are evaluated before entering that
+construct. Further, in some cases, the OpenMP specification stipulates that
+clauses be evaluated _on the host device_ on entry to a parent `target`
+construct. In particular, the `num_teams` and `thread_limit` clauses of the
+`teams` construct must be evaluated on the host device if it's nested inside or
+combined with a `target` construct.
+
+Additionally, the runtime library targeted by the MLIR to LLVM IR translation of
+the OpenMP dialect supports the optimized launch of SPMD kernels (i.e.
+`target teams distribute parallel {do,for}` in OpenMP), which requires
+specifying in advance what the total trip count of the loop is. Consequently, it
+is also beneficial to evaluate the trip count on the host device prior to the
+kernel launch.
+
+These host-evaluated values in MLIR would need to be placed outside of the
+`omp.target` region and also attached to the corresponding nested operations,
+which is not possible because of the `IsolatedFromAbove` trait. The solution
+implemented to address this problem has been to introduce the `host_eval`
+argument to the `omp.target` operation. It works similarly to a `map` clause,
+but its only intended use is to forward host-evaluated values to their
+corresponding operation inside of the region. Any other use results in a
+verifier error.
+
+```mlir
+// Initialize %0, %1, %2, %3...
+omp.target host_eval(%0 -> %nt, %1 -> %lb, %2 -> %ub, %3 -> %step : i32, i32, i32, i32) {
+  omp.teams num_teams(to %nt : i32) {
+    omp.parallel {
+      omp.distribute {
+        omp.wsloop {
+          omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
+            // ...
+            omp.yield
+          }
+          omp.terminator
+        } {omp.composite}
+        omp.terminator
+      } {omp.composite}
+      omp.terminator
+    } {omp.composite}
+    omp.terminator
+  }
+  omp.terminator
+}
+```
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index a0da3db124d1f4..a99da1f0294d08 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -1166,9 +1166,10 @@ def TargetOp : OpenMP_Op<"target", traits = [
   ], clauses = [
 // TODO: Complete clause list (defaultmap, uses_allocators).
 OpenMP_AllocateClause, OpenMP_DependClause, OpenMP_DeviceClause,
-OpenMP_HasDeviceAddrClause, OpenMP_IfClause, OpenMP_InReductionClause,
-OpenMP_IsDevicePtrClause, OpenMP_MapClauseSkip,
-OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause
+OpenMP_HasDeviceAddrClause, OpenMP_HostEvalClause, OpenMP_IfClause,
+OpenMP_

[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] LLVM IR translation of host_eval (PR #116052)

2024-11-13 Thread Sergio Afonso via llvm-branch-commits

skatrak wrote:

PR stack:
- #116048
- #116049
- #116050
- #116051
- #116052

https://github.com/llvm/llvm-project/pull/116052


[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir

Author: Sergio Afonso (skatrak)


Changes

This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure 
used to simplify passing default and constant values for number of teams and 
threads, and possibly other target kernel-related information in the future.

This is used to forward values passed to `createTarget` to `createTargetInit`, 
which previously used a default unrelated set of values.
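
As a rough illustration of what grouping these values buys, the sketch below models the structure with standard containers and replays the min/max merging visible in the `computeMinAndMaxThreadsAndTeams` hunk. `KernelDefaultAttrs` and `mergeAttributeBounds` are hypothetical names for a simplified stand-in, not the actual `OpenMPIRBuilder` types. The defaults (`1` for minimums, `-1` for unknown maximums) are taken from the initial values clang used before this patch; whether the real structure uses the same defaults is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for the structure described above.
struct KernelDefaultAttrs {
  std::vector<std::int32_t> MaxTeams = {-1};   // -1: no upper bound known
  std::int32_t MinTeams = 1;
  std::vector<std::int32_t> MaxThreads = {-1}; // -1: no upper bound known
  std::int32_t MinThreads = 1;
};

// Hypothetical helper: fold bounds coming from a kernel attribute into the
// defaults. Minimums only ever grow; maximums shrink, or are set for the
// first time when no bound was known yet.
void mergeAttributeBounds(KernelDefaultAttrs &Attrs,
                          std::int32_t AttrMinThreads,
                          std::int32_t AttrMaxThreads,
                          std::int32_t AttrMinTeams,
                          std::int32_t AttrMaxTeams) {
  assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 &&
         "invalid default attrs structure");
  std::int32_t &MaxTeams = Attrs.MaxTeams.front();
  std::int32_t &MaxThreads = Attrs.MaxThreads.front();

  Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreads);
  if (AttrMaxThreads > 0)
    MaxThreads = MaxThreads > 0 ? std::min(MaxThreads, AttrMaxThreads)
                                : AttrMaxThreads;
  Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinTeams);
  if (AttrMaxTeams > 0)
    MaxTeams = MaxTeams > 0 ? std::min(MaxTeams, AttrMaxTeams) : AttrMaxTeams;
}
```

A caller passes one object instead of four `int32_t` references, so adding another kernel-related field later does not change every signature along the way.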

---

Patch is 21.80 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/116050.diff


8 Files Affected:

- (modified) clang/lib/CodeGen/CGOpenMPRuntime.cpp (+8-5) 
- (modified) clang/lib/CodeGen/CGOpenMPRuntime.h (+3-6) 
- (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (+3-6) 
- (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+25-14) 
- (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+40-31) 
- (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+16-13) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+6-5) 
- (modified) mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir (+1-1) 


``diff
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d714af035d21a2..0f7a1166227476 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -5880,10 +5880,13 @@ void 
CGOpenMPRuntime::emitUsesAllocatorsFini(CodeGenFunction &CGF,
 
 void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
 const OMPExecutableDirective &D, CodeGenFunction &CGF,
-int32_t &MinThreadsVal, int32_t &MaxThreadsVal, int32_t &MinTeamsVal,
-int32_t &MaxTeamsVal) {
+llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs) {
+  assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 &&
+ "invalid default attrs structure");
+  int32_t &MaxTeamsVal = Attrs.MaxTeams.front();
+  int32_t &MaxThreadsVal = Attrs.MaxThreads.front();
 
-  getNumTeamsExprForTargetDirective(CGF, D, MinTeamsVal, MaxTeamsVal);
+  getNumTeamsExprForTargetDirective(CGF, D, Attrs.MinTeams, MaxTeamsVal);
   getNumThreadsExprForTargetDirective(CGF, D, MaxThreadsVal,
   /*UpperBoundOnly=*/true);
 
@@ -5901,12 +5904,12 @@ void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
   else
 continue;
 
-  MinThreadsVal = std::max(MinThreadsVal, AttrMinThreadsVal);
+  Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreadsVal);
   if (AttrMaxThreadsVal > 0)
 MaxThreadsVal = MaxThreadsVal > 0
 ? std::min(MaxThreadsVal, AttrMaxThreadsVal)
 : AttrMaxThreadsVal;
-  MinTeamsVal = std::max(MinTeamsVal, AttrMinBlocksVal);
+  Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocksVal);
   if (AttrMaxBlocksVal > 0)
 MaxTeamsVal = MaxTeamsVal > 0 ? std::min(MaxTeamsVal, AttrMaxBlocksVal)
   : AttrMaxBlocksVal;
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h 
b/clang/lib/CodeGen/CGOpenMPRuntime.h
index 5e7715743afb58..003395e7f17ded 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -312,12 +312,9 @@ class CGOpenMPRuntime {
   llvm::OpenMPIRBuilder OMPBuilder;
 
   /// Helper to determine the min/max number of threads/teams for \p D.
-  void computeMinAndMaxThreadsAndTeams(const OMPExecutableDirective &D,
-   CodeGenFunction &CGF,
-   int32_t &MinThreadsVal,
-   int32_t &MaxThreadsVal,
-   int32_t &MinTeamsVal,
-   int32_t &MaxTeamsVal);
+  void computeMinAndMaxThreadsAndTeams(
+  const OMPExecutableDirective &D, CodeGenFunction &CGF,
+  llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs);
 
   /// Helper to emit outlined function for 'target' directive.
   /// \param D Directive to emit.
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index 43dc0e62284602..96f8d6c5c08e56 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -745,14 +745,11 @@ void CGOpenMPRuntimeGPU::emitNonSPMDKernel(const 
OMPExecutableDirective &D,
 void CGOpenMPRuntimeGPU::emitKernelInit(const OMPExecutableDirective &D,
 CodeGenFunction &CGF,
 EntryFunctionState &EST, bool IsSPMD) {
-  int32_t MinThreadsVal = 1, MaxThreadsVal = -1, MinTeamsVal = 1,
-  MaxTeamsVal = -1;
-  computeMinAndMaxThreadsAndTeams(D, CGF, MinThreadsVal, MaxThreadsVal,
-  MinTeamsVal, MaxTeamsVal);
+  llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs Attrs;
+  computeMinAndMaxThreadsAndTeams(D, CGF, Attrs);
 
   CGBuilderTy &Bld = CGF.Builder;
-

[llvm-branch-commits] [llvm] release/19.x: backport PR115901 (PR #116104)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Antonio Frighetto (antoniofrighetto)


Changes

Backport: 929cbe7f596733f85cd274485acc19442dd34a80.

Requested-by: @AreaZR.

---
Full diff: https://github.com/llvm/llvm-project/pull/116104.diff


3 Files Affected:

- (modified) llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp (+2-1) 
- (modified) llvm/test/Transforms/InstCombine/opaque-ptr.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/phi.ll (+28) 


```diff
diff --git a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
index 86411320ab2487..b05a33c688890d 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
@@ -513,7 +513,8 @@ Instruction *InstCombinerImpl::foldPHIArgGEPIntoPHI(PHINode &PN) {
   // especially bad when the PHIs are in the header of a loop.
   bool NeededPhi = false;
 
-  GEPNoWrapFlags NW = GEPNoWrapFlags::all();
+  // Remember flags of the first phi-operand getelementptr.
+  GEPNoWrapFlags NW = FirstInst->getNoWrapFlags();
 
   // Scan to see if all operands are the same opcode, and all have one user.
   for (Value *V : drop_begin(PN.incoming_values())) {
diff --git a/llvm/test/Transforms/InstCombine/opaque-ptr.ll b/llvm/test/Transforms/InstCombine/opaque-ptr.ll
index df85547f56d74f..1fd8281b53816f 100644
--- a/llvm/test/Transforms/InstCombine/opaque-ptr.ll
+++ b/llvm/test/Transforms/InstCombine/opaque-ptr.ll
@@ -549,7 +549,7 @@ define ptr @phi_of_gep_flags_1(i1 %c, ptr %p) {
 ; CHECK:   else:
 ; CHECK-NEXT:br label [[JOIN]]
 ; CHECK:   join:
-; CHECK-NEXT:[[PHI:%.*]] = getelementptr nusw nuw i8, ptr [[P:%.*]], i64 4
+; CHECK-NEXT:[[PHI:%.*]] = getelementptr nusw i8, ptr [[P:%.*]], i64 4
 ; CHECK-NEXT:ret ptr [[PHI]]
 ;
   br i1 %c, label %if, label %else
diff --git a/llvm/test/Transforms/InstCombine/phi.ll b/llvm/test/Transforms/InstCombine/phi.ll
index b12982dd27e404..82ea9bb439b0bb 100644
--- a/llvm/test/Transforms/InstCombine/phi.ll
+++ b/llvm/test/Transforms/InstCombine/phi.ll
@@ -2714,3 +2714,31 @@ join:
   %cmp = icmp slt i32 %13, 0
   ret i1 %cmp
 }
+
+define i64 @wrong_gep_arg_into_phi(ptr noundef %ptr) {
+; CHECK-LABEL: @wrong_gep_arg_into_phi(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:br label [[FOR_COND:%.*]]
+; CHECK:   for.cond:
+; CHECK-NEXT:[[PTR_PN:%.*]] = phi ptr [ [[PTR:%.*]], [[ENTRY:%.*]] ], [ [[DOTPN:%.*]], [[FOR_COND]] ]
+; CHECK-NEXT:[[DOTPN]] = getelementptr i8, ptr [[PTR_PN]], i64 1
+; CHECK-NEXT:[[VAL:%.*]] = load i8, ptr [[DOTPN]], align 1
+; CHECK-NEXT:[[COND_NOT:%.*]] = icmp eq i8 [[VAL]], 0
+; CHECK-NEXT:br i1 [[COND_NOT]], label [[EXIT:%.*]], label [[FOR_COND]]
+; CHECK:   exit:
+; CHECK-NEXT:ret i64 0
+;
+entry:
+  %add.ptr = getelementptr i8, ptr %ptr, i64 1
+  br label %for.cond
+
+for.cond: ; preds = %for.cond, %entry
+  %.pn = phi ptr [ %add.ptr, %entry ], [ %incdec.ptr, %for.cond ]
+  %val = load i8, ptr %.pn, align 1
+  %cond = icmp ne i8 %val, 0
+  %incdec.ptr = getelementptr inbounds nuw i8, ptr %.pn, i64 1
+  br i1 %cond, label %for.cond, label %exit
+
+exit: ; preds = %for.cond
+  ret i64 0
+}

```




https://github.com/llvm/llvm-project/pull/116104
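
The one-line change to `foldPHIArgGEPIntoPHI` in this backport can be illustrated outside LLVM. What follows is a minimal sketch, not the InstCombine implementation: `NoWrap`, `mergeSkippingFirst`, and `mergeSeededWithFirst` are hypothetical names, and the flags are modeled as independent bits (in LLVM, `inbounds` also implies `nusw`, which this model ignores). It shows why a loop that skips the first incoming value, as `drop_begin` does, has to seed the accumulator with that first value's flags rather than with `all()`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for llvm::GEPNoWrapFlags: a plain bitmask.
enum NoWrap : std::uint8_t {
  None = 0,
  InBounds = 1 << 0,
  NUSW = 1 << 1,
  NUW = 1 << 2,
  All = InBounds | NUSW | NUW,
};

// Pre-fix shape: start from "all flags" and intersect operands 1..N-1 only,
// so the flags of operand 0 never take part in the intersection.
inline std::uint8_t mergeSkippingFirst(const std::vector<std::uint8_t> &ops) {
  std::uint8_t nw = All;
  for (std::size_t i = 1; i < ops.size(); ++i)
    nw &= ops[i];
  return nw;
}

// Post-fix shape: seed with operand 0, then intersect the remaining operands.
inline std::uint8_t mergeSeededWithFirst(const std::vector<std::uint8_t> &ops) {
  std::uint8_t nw = ops.empty() ? std::uint8_t{None} : ops.front();
  for (std::size_t i = 1; i < ops.size(); ++i)
    nw &= ops[i];
  return nw;
}
```

With the operands of the new `wrong_gep_arg_into_phi` test (a flagless entry GEP followed by an `inbounds nuw` loop GEP), the first function keeps `InBounds | NUW` while the second yields `None`, which corresponds to the flagless `getelementptr i8` in the updated CHECK lines.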


[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2024-11-13 Thread via llvm-branch-commits

h-vetinari wrote:

> I thought that's what happens by default when you use 
> https://patch-diff.githubusercontent.com/raw/llvm/llvm-project/pull/110217.diff

You only get the diff of the respective PR w.r.t. its base. Since this PR is 
not targeting `main`, the diff does not apply (unless you pick up all the 
intermediate pieces in the chain of pull requests).

https://github.com/llvm/llvm-project/pull/110217


[llvm-branch-commits] [llvm] release/19.x: backport PR115901 (PR #116104)

2024-11-13 Thread Antonio Frighetto via llvm-branch-commits

https://github.com/antoniofrighetto edited 
https://github.com/llvm/llvm-project/pull/116104


[llvm-branch-commits] [llvm] release/19.x: backport PR115901 (PR #116104)

2024-11-13 Thread Antonio Frighetto via llvm-branch-commits

https://github.com/antoniofrighetto created 
https://github.com/llvm/llvm-project/pull/116104

Backport: 929cbe7f596733f85cd274485acc19442dd34a80.

Requested-by: @AreaZR.

From 134b1917d27e268d8771f76f22d2ee32fbc2a2b3 Mon Sep 17 00:00:00 2001
From: Antonio Frighetto 
Date: Tue, 12 Nov 2024 10:45:46 +0100
Subject: [PATCH] [InstCombine] Intersect nowrap flags between geps while
 folding into phi

A miscompilation issue has been addressed with refined checking.
---
 .../Transforms/InstCombine/InstCombinePHI.cpp |  3 +-
 .../test/Transforms/InstCombine/opaque-ptr.ll |  2 +-
 llvm/test/Transforms/InstCombine/phi.ll   | 28 +++
 3 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
index 86411320ab2487..b05a33c688890d 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
@@ -513,7 +513,8 @@ Instruction *InstCombinerImpl::foldPHIArgGEPIntoPHI(PHINode &PN) {
   // especially bad when the PHIs are in the header of a loop.
   bool NeededPhi = false;
 
-  GEPNoWrapFlags NW = GEPNoWrapFlags::all();
+  // Remember flags of the first phi-operand getelementptr.
+  GEPNoWrapFlags NW = FirstInst->getNoWrapFlags();
 
   // Scan to see if all operands are the same opcode, and all have one user.
   for (Value *V : drop_begin(PN.incoming_values())) {
diff --git a/llvm/test/Transforms/InstCombine/opaque-ptr.ll b/llvm/test/Transforms/InstCombine/opaque-ptr.ll
index df85547f56d74f..1fd8281b53816f 100644
--- a/llvm/test/Transforms/InstCombine/opaque-ptr.ll
+++ b/llvm/test/Transforms/InstCombine/opaque-ptr.ll
@@ -549,7 +549,7 @@ define ptr @phi_of_gep_flags_1(i1 %c, ptr %p) {
 ; CHECK:   else:
 ; CHECK-NEXT:br label [[JOIN]]
 ; CHECK:   join:
-; CHECK-NEXT:[[PHI:%.*]] = getelementptr nusw nuw i8, ptr [[P:%.*]], i64 4
+; CHECK-NEXT:[[PHI:%.*]] = getelementptr nusw i8, ptr [[P:%.*]], i64 4
 ; CHECK-NEXT:ret ptr [[PHI]]
 ;
   br i1 %c, label %if, label %else
diff --git a/llvm/test/Transforms/InstCombine/phi.ll b/llvm/test/Transforms/InstCombine/phi.ll
index b12982dd27e404..82ea9bb439b0bb 100644
--- a/llvm/test/Transforms/InstCombine/phi.ll
+++ b/llvm/test/Transforms/InstCombine/phi.ll
@@ -2714,3 +2714,31 @@ join:
   %cmp = icmp slt i32 %13, 0
   ret i1 %cmp
 }
+
+define i64 @wrong_gep_arg_into_phi(ptr noundef %ptr) {
+; CHECK-LABEL: @wrong_gep_arg_into_phi(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:br label [[FOR_COND:%.*]]
+; CHECK:   for.cond:
+; CHECK-NEXT:[[PTR_PN:%.*]] = phi ptr [ [[PTR:%.*]], [[ENTRY:%.*]] ], [ [[DOTPN:%.*]], [[FOR_COND]] ]
+; CHECK-NEXT:[[DOTPN]] = getelementptr i8, ptr [[PTR_PN]], i64 1
+; CHECK-NEXT:[[VAL:%.*]] = load i8, ptr [[DOTPN]], align 1
+; CHECK-NEXT:[[COND_NOT:%.*]] = icmp eq i8 [[VAL]], 0
+; CHECK-NEXT:br i1 [[COND_NOT]], label [[EXIT:%.*]], label [[FOR_COND]]
+; CHECK:   exit:
+; CHECK-NEXT:ret i64 0
+;
+entry:
+  %add.ptr = getelementptr i8, ptr %ptr, i64 1
+  br label %for.cond
+
+for.cond: ; preds = %for.cond, %entry
+  %.pn = phi ptr [ %add.ptr, %entry ], [ %incdec.ptr, %for.cond ]
+  %val = load i8, ptr %.pn, align 1
+  %cond = icmp ne i8 %val, 0
+  %incdec.ptr = getelementptr inbounds nuw i8, ptr %.pn, i64 1
+  br i1 %cond, label %for.cond, label %exit
+
+exit: ; preds = %for.cond
+  ret i64 0
+}



[llvm-branch-commits] [llvm] release/19.x: backport PR115901 (PR #116104)

2024-11-13 Thread Antonio Frighetto via llvm-branch-commits

https://github.com/antoniofrighetto milestoned 
https://github.com/llvm/llvm-project/pull/116104


[llvm-branch-commits] [llvm] release/19.x [InstCombine] Drop nsw in negation of select (PR #116097)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Rose (AreaZR)


Changes

Closes https://github.com/llvm/llvm-project/issues/112666 and 
https://github.com/llvm/llvm-project/issues/114181.

(cherry-picked from 8d86a537ad756e31832eab67371179e881452fb5)

---
Full diff: https://github.com/llvm/llvm-project/pull/116097.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp (+11) 
- (modified) llvm/test/Transforms/InstCombine/sub-of-negatible.ll (+42) 


```diff
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp b/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
index e4895b59f4b4a9..cb052da79bb3c6 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
@@ -334,6 +334,17 @@ std::array<Value *, 2> Negator::getSortedOperandsOfBinOp(Instruction *I) {
       NewSelect->swapValues();
       // Don't swap prof metadata, we didn't change the branch behavior.
       NewSelect->setName(I->getName() + ".neg");
+      // Poison-generating flags should be dropped
+      Value *TV = NewSelect->getTrueValue();
+      Value *FV = NewSelect->getFalseValue();
+      if (match(TV, m_Neg(m_Specific(FV))))
+        cast<Instruction>(TV)->dropPoisonGeneratingFlags();
+      else if (match(FV, m_Neg(m_Specific(TV))))
+        cast<Instruction>(FV)->dropPoisonGeneratingFlags();
+      else {
+        cast<Instruction>(TV)->dropPoisonGeneratingFlags();
+        cast<Instruction>(FV)->dropPoisonGeneratingFlags();
+      }
       Builder.Insert(NewSelect);
       return NewSelect;
     }
diff --git a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
index b2e14ceaca1b08..f9549881aa3131 100644
--- a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
+++ b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
@@ -1374,6 +1374,48 @@ define i8 @negate_select_of_op_vs_negated_op(i8 %x, i8 %y, i1 %c) {
   %t2 = sub i8 %y, %t1
   ret i8 %t2
 }
+
+define i8 @negate_select_of_op_vs_negated_op_nsw(i8 %x, i8 %y, i1 %c) {
+; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw(
+; CHECK-NEXT:[[T0:%.*]] = sub i8 0, [[X:%.*]]
+; CHECK-NEXT:[[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[X]], i8 [[T0]]
+; CHECK-NEXT:[[T2:%.*]] = add i8 [[TMP1]], [[Y:%.*]]
+; CHECK-NEXT:ret i8 [[T2]]
+;
+  %t0 = sub nsw i8 0, %x
+  %t1 = select i1 %c, i8 %t0, i8 %x
+  %t2 = sub i8 %y, %t1
+  ret i8 %t2
+}
+
+define i8 @negate_select_of_op_vs_negated_op_nsw_commuted(i8 %x, i8 %y, i1 %c) {
+; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw_commuted(
+; CHECK-NEXT:[[T0:%.*]] = sub i8 0, [[X:%.*]]
+; CHECK-NEXT:[[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[T0]], i8 [[X]]
+; CHECK-NEXT:[[T2:%.*]] = add i8 [[TMP1]], [[Y:%.*]]
+; CHECK-NEXT:ret i8 [[T2]]
+;
+  %t0 = sub nsw i8 0, %x
+  %t1 = select i1 %c, i8 %x, i8 %t0
+  %t2 = sub i8 %y, %t1
+  ret i8 %t2
+}
+
+define i8 @negate_select_of_op_vs_negated_op_nsw_xyyx(i8 %x, i8 %y, i8 %z, i1 %c) {
+; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw_xyyx(
+; CHECK-NEXT:[[SUB1:%.*]] = sub i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:[[SUB2:%.*]] = sub i8 [[Y]], [[X]]
+; CHECK-NEXT:[[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[SUB2]], i8 [[SUB1]]
+; CHECK-NEXT:[[T2:%.*]] = add i8 [[TMP1]], [[Z:%.*]]
+; CHECK-NEXT:ret i8 [[T2]]
+;
+  %sub1 = sub nsw i8 %x, %y
+  %sub2 = sub nsw i8 %y, %x
+  %t1 = select i1 %c, i8 %sub1, i8 %sub2
+  %t2 = sub i8 %z, %t1
+  ret i8 %t2
+}
+
 define i8 @dont_negate_ordinary_select(i8 %x, i8 %y, i8 %z, i1 %c) {
 ; CHECK-LABEL: @dont_negate_ordinary_select(
 ; CHECK-NEXT:[[T0:%.*]] = select i1 [[C:%.*]], i8 [[X:%.*]], i8 [[Y:%.*]]

```




https://github.com/llvm/llvm-project/pull/116097
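
Why the `nsw` flag has to go when the select arms are swapped can be sketched with a toy model of the first new test, `negate_select_of_op_vs_negated_op_nsw`. This is an illustration under stated assumptions, not LLVM code: `I8`, `wrap`, `subNsw`, `selectArm`, `original`, and `negated` are hypothetical names, and `std::nullopt` stands in for a poison value.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model: std::nullopt stands in for an LLVM poison value.
using I8 = std::optional<std::int8_t>;

// Plain `sub`/`add` on i8: two's-complement wraparound, never poison.
// (The narrowing to int8_t is modular; guaranteed by the standard since C++20.)
inline std::int8_t wrap(int wide) {
  return static_cast<std::int8_t>(static_cast<std::uint8_t>(wide));
}

// `sub nsw`: poison when the signed result does not fit in i8.
inline I8 subNsw(std::int8_t a, std::int8_t b) {
  int wide = int{a} - int{b};
  if (wide < INT8_MIN || wide > INT8_MAX)
    return std::nullopt;
  return static_cast<std::int8_t>(wide);
}

// `select` yields only the chosen arm; poison in the other arm is ignored.
inline I8 selectArm(bool c, I8 t, I8 f) { return c ? t : f; }

// Input IR: %t0 = sub nsw 0, %x; %t1 = select %c, %t0, %x; %t2 = sub %y, %t1
inline I8 original(std::int8_t x, std::int8_t y, bool c) {
  I8 t1 = selectArm(c, subNsw(0, x), x);
  if (!t1)
    return std::nullopt;
  return wrap(int{y} - int{*t1});
}

// Negated select with swapped arms; `keepNsw` models whether the reused
// negation still carries its nsw flag after the transform.
inline I8 negated(std::int8_t x, std::int8_t y, bool c, bool keepNsw) {
  I8 t0 = keepNsw ? subNsw(0, x) : I8(wrap(0 - int{x}));
  I8 t1neg = selectArm(c, x, t0);
  if (!t1neg)
    return std::nullopt;
  return wrap(int{*t1neg} + int{y});
}
```

For `x = -128` and `c = false`, the original program never selects the overflowing negation, so its result is well defined. The transformed program now selects exactly that arm, so keeping `nsw` would turn a defined result into poison; dropping the flag makes both versions agree.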


[llvm-branch-commits] [CallGraphSection] Add call graph section options and documentation (PR #87572)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87572




[llvm-branch-commits] [mlir] [mlir][bufferization] Remove `finalizing-bufferize` pass (PR #114154)

2024-11-13 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer updated 
https://github.com/llvm/llvm-project/pull/114154

From 1e194b399b21ed1ef577803cadc199827e4d7431 Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Wed, 30 Oct 2024 00:46:05 +0100
Subject: [PATCH] [mlir][bufferization] Remove `finalizing-bufferize` pass

The dialect conversion-based bufferization passes were migrated to One-Shot 
Bufferize about two years ago. To clean up the code base, this commit removes 
the `finalizing-bufferize` pass, one of the few remaining parts of the old 
infrastructure. Most bufferization passes have already been removed.

Note for LLVM integration: If you depend on this pass, migrate to One-Shot 
Bufferize or copy the pass to your codebase.

Depends on #114152.
---
 .../Bufferization/Transforms/Bufferize.h  |  6 --
 .../Dialect/Bufferization/Transforms/Passes.h |  4 -
 .../Bufferization/Transforms/Passes.td| 16 
 .../Bufferization/Transforms/Bufferize.cpp| 75 ---
 .../Pipelines/SparseTensorPipelines.cpp   |  2 -
 .../Transforms/finalizing-bufferize.mlir  | 95 ---
 6 files changed, 198 deletions(-)
 delete mode 100644 mlir/test/Dialect/Bufferization/Transforms/finalizing-bufferize.mlir

diff --git a/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h b/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h
index 1603dfcbae5589..ebed2c354bfca5 100644
--- a/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h
+++ b/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h
@@ -56,12 +56,6 @@ class BufferizeTypeConverter : public TypeConverter {
 /// populateEliminateBufferizeMaterializationsPatterns.
 void populateBufferizeMaterializationLegality(ConversionTarget &target);
 
-/// Populate patterns to eliminate bufferize materializations.
-///
-/// In particular, these are the tensor_load/buffer_cast ops.
-void populateEliminateBufferizeMaterializationsPatterns(
-const BufferizeTypeConverter &typeConverter, RewritePatternSet &patterns);
-
 /// Bufferize `op` and its nested ops that implement `BufferizableOpInterface`.
 ///
 /// Note: This function does not resolve read-after-write conflicts. Use this
diff --git a/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.h b/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.h
index ab9a48f3473c27..fe43a05c81fdc3 100644
--- a/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.h
+++ b/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.h
@@ -200,10 +200,6 @@ std::unique_ptr<Pass> createEmptyTensorToAllocTensorPass();
 /// Drop all memref function results that are equivalent to a function argument.
 LogicalResult dropEquivalentBufferResults(ModuleOp module);
 
-/// Creates a pass that finalizes a partial bufferization by removing remaining
-/// bufferization.to_tensor and bufferization.to_memref operations.
-std::unique_ptr<OperationPass<func::FuncOp>> createFinalizingBufferizePass();
-
 /// Create a pass that bufferizes all ops that implement BufferizableOpInterface
 /// with One-Shot Bufferize.
 std::unique_ptr<Pass> createOneShotBufferizePass();
diff --git a/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td b/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td
index 2743de43fb9cfa..3e93f33ffe0fb4 100644
--- a/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td
+++ b/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td
@@ -343,22 +343,6 @@ def BufferResultsToOutParams : Pass<"buffer-results-to-out-params", "ModuleOp">
   let dependentDialects = ["memref::MemRefDialect"];
 }
 
-def FinalizingBufferize : Pass<"finalizing-bufferize", "func::FuncOp"> {
-  let summary = "Finalize a partial bufferization";
-  let description = [{
-    A bufferize pass that finalizes a partial bufferization by removing
-    remaining `bufferization.to_tensor` and `bufferization.to_memref` operations.
-
-    The removal of those operations is only possible if the operations only
-    exist in pairs, i.e., all uses of `bufferization.to_tensor` operations are
-    `bufferization.to_memref` operations.
-
-    This pass will fail if not all operations can be removed or if any operation
-    with tensor typed operands remains.
-  }];
-  let constructor = "mlir::bufferization::createFinalizingBufferizePass()";
-}
-
 def DropEquivalentBufferResults : Pass<"drop-equivalent-buffer-results", "ModuleOp">  {
   let summary = "Remove MemRef return values that are equivalent to a bbArg";
   let description = [{
diff --git a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
index 1d009b03754c52..62ce2583f4fa1d 100644
--- a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
+++ b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
@@ -26,7 +26,6 @@
 
 namespace mlir {
 namespace bufferization {
-#define GEN_PASS_DEF_FINALIZINGBUFFERIZE
 #define GEN_PASS_DEF_BUFFERIZATIONBUFFERIZE
 #define GEN_PASS_DEF_ONESHOTBUFFERIZE
 #includ

[llvm-branch-commits] [mlir] [mlir][bufferization] Remove remaining dialect conversion-based infra parts (PR #114155)

2024-11-13 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer updated 
https://github.com/llvm/llvm-project/pull/114155

From 5c02edc9f35d4c35b2c25bc3dba4d10531e2a4ab Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Wed, 30 Oct 2024 00:58:32 +0100
Subject: [PATCH] [mlir][bufferization] Remove remaining dialect
 conversion-based infra parts

This commit removes the last remaining components of the dialect 
conversion-based bufferization passes.

Note for LLVM integration: If you depend on these components, migrate to 
One-Shot Bufferize or copy them to your codebase.

Depends on #114154.
---
 .../Bufferization/Transforms/Bufferize.h  | 23 --
 .../mlir/Dialect/Func/Transforms/Passes.h |  4 -
 .../Bufferization/Transforms/BufferUtils.cpp  |  6 +-
 .../Bufferization/Transforms/Bufferize.cpp| 73 ---
 4 files changed, 4 insertions(+), 102 deletions(-)

diff --git a/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h b/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h
index ebed2c354bfca5..2f495d304b4a56 100644
--- a/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h
+++ b/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h
@@ -38,24 +38,6 @@ struct BufferizationStatistics {
   int64_t numTensorOutOfPlace = 0;
 };
 
-/// A helper type converter class that automatically populates the relevant
-/// materializations and type conversions for bufferization.
-class BufferizeTypeConverter : public TypeConverter {
-public:
-  BufferizeTypeConverter();
-};
-
-/// Marks ops used by bufferization for type conversion materializations as
-/// "legal" in the given ConversionTarget.
-///
-/// This function should be called by all bufferization passes using
-/// BufferizeTypeConverter so that materializations work properly. One exception
-/// is bufferization passes doing "full" conversions, where it can be desirable
-/// for even the materializations to remain illegal so that they are eliminated,
-/// such as via the patterns in
-/// populateEliminateBufferizeMaterializationsPatterns.
-void populateBufferizeMaterializationLegality(ConversionTarget &target);
-
 /// Bufferize `op` and its nested ops that implement `BufferizableOpInterface`.
 ///
 /// Note: This function does not resolve read-after-write conflicts. Use this
@@ -81,11 +63,6 @@ LogicalResult bufferizeOp(Operation *op, const BufferizationOptions &options,
 LogicalResult bufferizeBlockSignature(Block *block, RewriterBase &rewriter,
   const BufferizationOptions &options);
 
-/// Return `BufferizationOptions` such that the `bufferizeOp` behaves like the
-/// old (deprecated) partial, dialect conversion-based bufferization passes. A
-/// copy will be inserted before every buffer write.
-BufferizationOptions getPartialBufferizationOptions();
-
 } // namespace bufferization
 } // namespace mlir
 
diff --git a/mlir/include/mlir/Dialect/Func/Transforms/Passes.h b/mlir/include/mlir/Dialect/Func/Transforms/Passes.h
index 02fc9e1d934390..0248f068320c54 100644
--- a/mlir/include/mlir/Dialect/Func/Transforms/Passes.h
+++ b/mlir/include/mlir/Dialect/Func/Transforms/Passes.h
@@ -18,10 +18,6 @@
 #include "mlir/Pass/Pass.h"
 
 namespace mlir {
-namespace bufferization {
-class BufferizeTypeConverter;
-} // namespace bufferization
-
 class RewritePatternSet;
 
 namespace func {
diff --git a/mlir/lib/Dialect/Bufferization/Transforms/BufferUtils.cpp b/mlir/lib/Dialect/Bufferization/Transforms/BufferUtils.cpp
index 8fffdbf664c3f4..b11803da19ef98 100644
--- a/mlir/lib/Dialect/Bufferization/Transforms/BufferUtils.cpp
+++ b/mlir/lib/Dialect/Bufferization/Transforms/BufferUtils.cpp
@@ -11,6 +11,8 @@
 //===----------------------------------------------------------------------===//
 
 #include "mlir/Dialect/Bufferization/Transforms/BufferUtils.h"
+
+#include "mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h"
 #include "mlir/Dialect/Bufferization/Transforms/Bufferize.h"
 #include "mlir/Dialect/MemRef/IR/MemRef.h"
 #include "mlir/Dialect/MemRef/Utils/MemRefUtils.h"
@@ -138,8 +140,8 @@ bufferization::getGlobalFor(arith::ConstantOp constantOp, uint64_t alignment,
       alignment > 0 ? IntegerAttr::get(globalBuilder.getI64Type(), alignment)
                     : IntegerAttr();
 
-  BufferizeTypeConverter typeConverter;
-  auto memrefType = cast<MemRefType>(typeConverter.convertType(type));
+  auto memrefType =
+      cast<MemRefType>(getMemRefTypeWithStaticIdentityLayout(type));
   if (memorySpace)
     memrefType = MemRefType::Builder(memrefType).setMemorySpace(memorySpace);
   auto global = globalBuilder.create<memref::GlobalOp>(
diff --git a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
index 62ce2583f4fa1d..6f0cdfa20f7be5 100644
--- a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
+++ b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
@@ -37,65 +37,6 @@ namespace bufferization {
 using namespace mlir;
 using namespace mlir::bufferization

[llvm-branch-commits] [mlir] [mlir][bufferization] Remove remaining dialect conversion-based infra parts (PR #114155)

2024-11-13 Thread Matthias Springer via llvm-branch-commits


@@ -86,18 +86,13 @@ getOrCreateFuncAnalysisState(OneShotAnalysisState &state) {
   return state.addExtension<FuncAnalysisState>();
 }
 
-/// Return the unique ReturnOp that terminates `funcOp`.
-/// Return nullptr if there is no such unique ReturnOp.
-static func::ReturnOp getAssumedUniqueReturnOp(func::FuncOp funcOp) {
-  func::ReturnOp returnOp;
-  for (Block &b : funcOp.getBody()) {
-    if (auto candidateOp = dyn_cast<func::ReturnOp>(b.getTerminator())) {
-      if (returnOp)
-        return nullptr;
-      returnOp = candidateOp;
-    }
-  }
-  return returnOp;
+/// Return all top-level func.return ops in the given function.

matthias-springer wrote:

The diff was broken because a dependent commit got merged. I rebased on the 
latest state.

https://github.com/llvm/llvm-project/pull/114155


[llvm-branch-commits] [llvm] [CGData] Refactor Global Merge Functions (PR #115750)

2024-11-13 Thread Kyungwoo Lee via llvm-branch-commits

kyulee-com wrote:

> I can confirm from my testing on no-LTO projects that the performance has 
> improved significantly and the slowdown is now acceptable. Before applying 
> the PR it was about a 50% slowdown; now it is ~5%.

That's great to hear!
Since these PRs appear to be functioning, is it okay to merge them for now 
while we continue to discuss further improvements? Or do you have more comments 
to be addressed?

https://github.com/llvm/llvm-project/pull/115750


[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extend CallSiteInfo with TypeId (PR #87574)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87574




[llvm-branch-commits] [flang] [flang] Introduce HLFIR lowerings to omp.workshare_loop_nest (PR #104748)

2024-11-13 Thread Ivan R. Ivanov via llvm-branch-commits

https://github.com/ivanradanov edited 
https://github.com/llvm/llvm-project/pull/104748


[llvm-branch-commits] [flang] [flang] Introduce HLFIR lowerings to omp.workshare_loop_nest (PR #104748)

2024-11-13 Thread Ivan R. Ivanov via llvm-branch-commits

ivanradanov wrote:

@tblah It is ready for review; I had just forgotten to remove the [WIP] from 
the title. Sorry about that.

https://github.com/llvm/llvm-project/pull/104748


[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)

2024-11-13 Thread Ivan R. Ivanov via llvm-branch-commits

ivanradanov wrote:

@tblah I think they are in a good state. I just need a review on this one; 
the others are approved.

https://github.com/llvm/llvm-project/pull/101446


[llvm-branch-commits] [AsmPrinter][CallGraphSection] Emit call graph section (PR #87576)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87576




[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extract and propagate indirect call type ids (PR #87575)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87575




[llvm-branch-commits] [clang] [clang][CallGraphSection] Add type id metadata to indirect call and targets (PR #87573)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87573

From a8a5848885e12c771f12cfa33b4dbc6a0272e925 Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 22 Apr 2024 11:34:04 -0700
Subject: [PATCH 1/3] Update clang/lib/CodeGen/CodeGenModule.cpp

Cleaner if checks.

Co-authored-by: Matt Arsenault 
---
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index e19bbee996f582..ff1586d2fa8abe 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2711,7 +2711,7 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 void CodeGenModule::CreateFunctionTypeMetadataForIcall(const QualType &QT,
llvm::CallBase *CB) {
   // Only if needed for call graph section and only for indirect calls.
-  if (!(CodeGenOpts.CallGraphSection && CB && CB->isIndirectCall()))
+  if (!CodeGenOpts.CallGraphSection || !CB || !CB->isIndirectCall())
 return;
 
   auto *MD = CreateMetadataIdentifierGeneralized(QT);

From 019b2ca5e1c263183ed114e0b967b4e77b4a17a8 Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 22 Apr 2024 11:34:31 -0700
Subject: [PATCH 2/3] Update clang/lib/CodeGen/CodeGenModule.cpp

Update the comments as suggested.

Co-authored-by: Matt Arsenault 
---
 clang/lib/CodeGen/CodeGenModule.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index ff1586d2fa8abe..5635a87d2358a7 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2680,9 +2680,9 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
   bool EmittedMDIdGeneralized = false;
   if (CodeGenOpts.CallGraphSection &&
   (!F->hasLocalLinkage() ||
-   F->getFunction().hasAddressTaken(nullptr, /* IgnoreCallbackUses */ true,
-/* IgnoreAssumeLikeCalls */ true,
-/* IgnoreLLVMUsed */ false))) {
+   F->getFunction().hasAddressTaken(nullptr, /*IgnoreCallbackUses=*/ true,
+/*IgnoreAssumeLikeCalls=*/ true,
+/*IgnoreLLVMUsed=*/ false))) {
 F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
 EmittedMDIdGeneralized = true;
   }

From 99242900c51778abd4b7e7f4361b09202b7abcda Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 29 Apr 2024 11:53:40 -0700
Subject: [PATCH 3/3] dyn_cast to isa

Created using spr 1.3.6-beta.1
---
 clang/lib/CodeGen/CGCall.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 526a63b24ff834..45033ced1d8344 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5713,8 +5713,8 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
 if (callOrInvoke && *callOrInvoke && (*callOrInvoke)->isIndirectCall()) {
   if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
 // Type id metadata is set only for C/C++ contexts.
-if (dyn_cast(FD) || dyn_cast(FD) ||
-dyn_cast(FD)) {
+if (isa(FD) || isa(FD) ||
+isa(FD)) {
   CGM.CreateFunctionTypeMetadataForIcall(FD->getType(), *callOrInvoke);
 }
   }
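
The "Cleaner if checks" rewrite in PATCH 1/3 is an application of De Morgan's law. The sketch below is a stand-alone illustration, assuming a hypothetical `Call` struct in place of `llvm::CallBase` and a plain `bool` in place of `CodeGenOpts.CallGraphSection`. It shows that the two conditions are equivalent and that the rewritten form remains safe for a null pointer.

```cpp
// Hypothetical stand-in for llvm::CallBase; only the queried predicate is modeled.
struct Call {
  bool indirect;
  bool isIndirectCall() const { return indirect; }
};

// Form before PATCH 1/3: negate the whole conjunction.
inline bool shouldSkipOld(bool callGraphSection, const Call *cb) {
  return !(callGraphSection && cb && cb->isIndirectCall());
}

// Form after PATCH 1/3: De Morgan's law applied term by term. `||` still
// short-circuits, so cb is dereferenced only once it is known to be non-null.
inline bool shouldSkipNew(bool callGraphSection, const Call *cb) {
  return !callGraphSection || !cb || !cb->isIndirectCall();
}
```

Both functions return `false` only when the option is enabled and the pointer refers to an indirect call, so the early `return` in the patched code fires in the same cases as before.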



[llvm-branch-commits] [clang] [clang][CallGraphSection] Add type id metadata to indirect call and targets (PR #87573)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87573

>From a8a5848885e12c771f12cfa33b4dbc6a0272e925 Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 22 Apr 2024 11:34:04 -0700
Subject: [PATCH 1/3] Update clang/lib/CodeGen/CodeGenModule.cpp

Cleaner if checks.

Co-authored-by: Matt Arsenault 
---
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index e19bbee996f582..ff1586d2fa8abe 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2711,7 +2711,7 @@ void 
CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 void CodeGenModule::CreateFunctionTypeMetadataForIcall(const QualType &QT,
llvm::CallBase *CB) {
   // Only if needed for call graph section and only for indirect calls.
-  if (!(CodeGenOpts.CallGraphSection && CB && CB->isIndirectCall()))
+  if (!CodeGenOpts.CallGraphSection || !CB || !CB->isIndirectCall())
 return;
 
   auto *MD = CreateMetadataIdentifierGeneralized(QT);

>From 019b2ca5e1c263183ed114e0b967b4e77b4a17a8 Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 22 Apr 2024 11:34:31 -0700
Subject: [PATCH 2/3] Update clang/lib/CodeGen/CodeGenModule.cpp

Update the comments as suggested.

Co-authored-by: Matt Arsenault 
---
 clang/lib/CodeGen/CodeGenModule.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index ff1586d2fa8abe..5635a87d2358a7 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2680,9 +2680,9 @@ void 
CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
   bool EmittedMDIdGeneralized = false;
   if (CodeGenOpts.CallGraphSection &&
   (!F->hasLocalLinkage() ||
-   F->getFunction().hasAddressTaken(nullptr, /* IgnoreCallbackUses */ true,
-/* IgnoreAssumeLikeCalls */ true,
-/* IgnoreLLVMUsed */ false))) {
+   F->getFunction().hasAddressTaken(nullptr, /*IgnoreCallbackUses=*/ true,
+/*IgnoreAssumeLikeCalls=*/ true,
+/*IgnoreLLVMUsed=*/ false))) {
 F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
 EmittedMDIdGeneralized = true;
   }

>From 99242900c51778abd4b7e7f4361b09202b7abcda Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 29 Apr 2024 11:53:40 -0700
Subject: [PATCH 3/3] dyn_cast to isa

Created using spr 1.3.6-beta.1
---
 clang/lib/CodeGen/CGCall.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 526a63b24ff834..45033ced1d8344 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5713,8 +5713,8 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 if (callOrInvoke && *callOrInvoke && (*callOrInvoke)->isIndirectCall()) {
   if (const FunctionDecl *FD = dyn_cast_or_null(TargetDecl)) 
{
 // Type id metadata is set only for C/C++ contexts.
-if (dyn_cast(FD) || dyn_cast(FD) ||
-dyn_cast(FD)) {
+if (isa(FD) || isa(FD) ||
+isa(FD)) {
   CGM.CreateFunctionTypeMetadataForIcall(FD->getType(), *callOrInvoke);
 }
   }
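
A note on the `dyn_cast` to `isa` change above: when the result is only used as a boolean, `isa` states the intent (a type test) without producing a pointer that is immediately discarded. The template arguments were lost in this archived copy of the diff, so the sketch below uses hypothetical toy types and hand-written `isa`/`dyn_cast` helpers in a `toy` namespace. These are not the real LLVM casting utilities, only an approximation of their behavior for illustration.

```cpp
namespace toy {

// Minimal kind-tagged hierarchy, loosely in the style of LLVM RTTI.
struct Decl {
  enum Kind { K_Function, K_Method, K_Constructor };
  explicit Decl(Kind K) : TheKind(K) {}
  Kind getKind() const { return TheKind; }

private:
  Kind TheKind;
};

struct FunctionDecl : Decl {
  FunctionDecl() : Decl(K_Function) {}
  static bool classof(const Decl *D) { return D->getKind() == K_Function; }
};

struct MethodDecl : Decl {
  MethodDecl() : Decl(K_Method) {}
  static bool classof(const Decl *D) { return D->getKind() == K_Method; }
};

struct ConstructorDecl : Decl {
  ConstructorDecl() : Decl(K_Constructor) {}
  static bool classof(const Decl *D) { return D->getKind() == K_Constructor; }
};

// Pure type test: answers yes/no and nothing else.
template <typename To> bool isa(const Decl *D) { return To::classof(D); }

// Checked downcast: returns a typed pointer, or nullptr on mismatch.
template <typename To> const To *dyn_cast(const Decl *D) {
  return isa<To>(D) ? static_cast<const To *>(D) : nullptr;
}

// Preferred when only the yes/no answer is needed.
bool isMemberLike(const Decl *D) {
  return isa<MethodDecl>(D) || isa<ConstructorDecl>(D);
}

// Same result, but each cast builds a pointer that is thrown away.
bool isMemberLikeViaCast(const Decl *D) {
  return dyn_cast<MethodDecl>(D) || dyn_cast<ConstructorDecl>(D);
}

} // namespace toy
```

Both helpers return the same answer for every input; the difference is readability, which appears to be the motivation for the review suggestion.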

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extend CallSiteInfo with TypeId (PR #87574)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87574


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AsmPrinter][CallGraphSection] Emit call graph section (PR #87576)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87576


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [CallGraphSection] Add call graph section options and documentation (PR #87572)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87572


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [CallGraphSection] Add call graph section options and documentation (PR #87572)

2024-11-13 Thread via llvm-branch-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. :warning:



You can test this locally with the following command:


```bash
git-clang-format --diff c960e4d69a31fa560f45d5b1eb4ba069f47467fb \
  8d7d2f2b8e335cbeef6fb698ab67c6c0bba4a14b --extensions h,c,cpp -- \
  clang/test/Driver/call-graph-section.c clang/lib/CodeGen/BackendUtil.cpp \
  clang/lib/Driver/ToolChains/Clang.cpp llvm/include/llvm/CodeGen/CommandFlags.h \
  llvm/include/llvm/Target/TargetOptions.h llvm/lib/CodeGen/CommandFlags.cpp
```





View the diff from clang-format here.


```diff
diff --git a/llvm/include/llvm/Target/TargetOptions.h b/llvm/include/llvm/Target/TargetOptions.h
index 91ce6a911c..02fcf9cbe6 100644
--- a/llvm/include/llvm/Target/TargetOptions.h
+++ b/llvm/include/llvm/Target/TargetOptions.h
@@ -149,10 +149,11 @@ namespace llvm {
   EmulatedTLS(false), EnableTLSDESC(false), EnableIPRA(false),
   EmitStackSizeSection(false), EnableMachineOutliner(false),
   EnableMachineFunctionSplitter(false), SupportsDefaultOutlining(false),
-  EmitAddrsig(false), BBAddrMap(false), EmitCallGraphSection(false), EmitCallSiteInfo(false),
-  SupportsDebugEntryValues(false), EnableDebugEntryValues(false),
-  ValueTrackingVariableLocations(false), ForceDwarfFrameSection(false),
-  XRayFunctionIndex(true), DebugStrictDwarf(false), Hotpatch(false),
+  EmitAddrsig(false), BBAddrMap(false), EmitCallGraphSection(false),
+  EmitCallSiteInfo(false), SupportsDebugEntryValues(false),
+  EnableDebugEntryValues(false), ValueTrackingVariableLocations(false),
+  ForceDwarfFrameSection(false), XRayFunctionIndex(true),
+  DebugStrictDwarf(false), Hotpatch(false),
   PPCGenScalarMASSEntries(false), JMCInstrument(false),
   EnableCFIFixup(false), MisExpect(false), XCOFFReadOnlyPointers(false),
   FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}

```




https://github.com/llvm/llvm-project/pull/87572
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extract and propagate indirect call type ids (PR #87575)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87575


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CGData] Refactor Global Merge Functions (PR #115750)

2024-11-13 Thread Zhaoxuan Jiang via llvm-branch-commits

nocchijiang wrote:

> I suspect that the no-LTO case might still encounter some slowdown, as each 
> CU needs to read the entire CGData regardless.

I can confirm that performance has improved significantly in my testing on 
no-LTO projects, to the point where the slowdown is now acceptable. Before 
applying the PR the slowdown was about 50%; now it is ~5%.

> Alternatively, we could restructure the indexed CGData to allow for reading 
> only the relevant hash entries on demand.

Besides consuming only the matched stable entries, as this PR does, this 
is exactly what I planned to do to reduce the memory footprint of the 
deserialized CGData. I would like to discuss the details with you in the RFC 
thread to make sure that we are on the same page before coding it.
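
To make the "read only the relevant hash entries on demand" idea concrete, here is a rough sketch under stated assumptions. It is not the CGData format or API: `LazyEntryIndex`, its `Slot` layout, and the string payloads are all hypothetical. It only shows the general shape of keeping a small sorted index resident and decoding a payload when its hash is actually requested.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical on-demand index: the blob stays serialized, and a lookup
// decodes only the single entry whose hash matches.
class LazyEntryIndex {
  struct Slot {
    std::uint64_t Hash;
    std::size_t Offset;
    std::size_t Size;
  };
  std::vector<Slot> Slots; // small, sorted by hash after finalize()
  std::string Blob;        // stands in for the serialized payload area
  std::size_t Decoded = 0; // how many entries were actually materialized

public:
  void add(std::uint64_t Hash, const std::string &Payload) {
    Slots.push_back({Hash, Blob.size(), Payload.size()});
    Blob += Payload;
  }

  // Sort once so that lookups can binary-search the index.
  void finalize() {
    std::sort(Slots.begin(), Slots.end(),
              [](const Slot &A, const Slot &B) { return A.Hash < B.Hash; });
  }

  // Decode one entry if present; untouched entries are never copied out.
  std::optional<std::string> lookup(std::uint64_t Hash) {
    auto It = std::lower_bound(
        Slots.begin(), Slots.end(), Hash,
        [](const Slot &S, std::uint64_t H) { return S.Hash < H; });
    if (It == Slots.end() || It->Hash != Hash)
      return std::nullopt;
    ++Decoded;
    return Blob.substr(It->Offset, It->Size);
  }

  std::size_t decodedCount() const { return Decoded; }
};
```

With a layout along these lines, a compilation unit that matches only a handful of hashes would pay for those entries plus the index, rather than for the whole deserialized structure.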


https://github.com/llvm/llvm-project/pull/115750
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [clang][CallGraphSection] Add type id metadata to indirect call and targets (PR #87573)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87573

>From a8a5848885e12c771f12cfa33b4dbc6a0272e925 Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 22 Apr 2024 11:34:04 -0700
Subject: [PATCH 1/3] Update clang/lib/CodeGen/CodeGenModule.cpp

Cleaner if checks.

Co-authored-by: Matt Arsenault 
---
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index e19bbee996f582..ff1586d2fa8abe 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2711,7 +2711,7 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 void CodeGenModule::CreateFunctionTypeMetadataForIcall(const QualType &QT,
llvm::CallBase *CB) {
   // Only if needed for call graph section and only for indirect calls.
-  if (!(CodeGenOpts.CallGraphSection && CB && CB->isIndirectCall()))
+  if (!CodeGenOpts.CallGraphSection || !CB || !CB->isIndirectCall())
 return;
 
   auto *MD = CreateMetadataIdentifierGeneralized(QT);
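
For readers skimming the hunk above: the rewritten condition is the De Morgan expansion of the old one, and both forms short-circuit before `CB` is dereferenced when it is null. A minimal sketch of that equivalence follows. `Call` is a hypothetical stand-in for `llvm::CallBase`, and the function names are made up for illustration; none of them come from the patch.

```cpp
#include <initializer_list>

// Hypothetical stand-in for llvm::CallBase; only the one query is modeled.
struct Call {
  bool Indirect;
  bool isIndirectCall() const { return Indirect; }
};

// Old form: negate the conjunction as a whole.
bool shouldSkipOld(bool CallGraphSection, const Call *CB) {
  return !(CallGraphSection && CB && CB->isIndirectCall());
}

// New form: De Morgan expansion. `!CB` is tested before `CB->...`, so a null
// pointer is never dereferenced, same as in the old form.
bool shouldSkipNew(bool CallGraphSection, const Call *CB) {
  return !CallGraphSection || !CB || !CB->isIndirectCall();
}

// Compares both forms over every combination of flag and call pointer.
bool formsAgree() {
  const Call Direct{false}, Indirect{true};
  const Call *Calls[] = {nullptr, &Direct, &Indirect};
  for (bool Flag : {false, true})
    for (const Call *CB : Calls)
      if (shouldSkipOld(Flag, CB) != shouldSkipNew(Flag, CB))
        return false;
  return true;
}
```

The only case in which neither form returns early is the one where the option is enabled, the call pointer is non-null, and the call is indirect.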

>From 019b2ca5e1c263183ed114e0b967b4e77b4a17a8 Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 22 Apr 2024 11:34:31 -0700
Subject: [PATCH 2/3] Update clang/lib/CodeGen/CodeGenModule.cpp

Update the comments as suggested.

Co-authored-by: Matt Arsenault 
---
 clang/lib/CodeGen/CodeGenModule.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index ff1586d2fa8abe..5635a87d2358a7 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2680,9 +2680,9 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
   bool EmittedMDIdGeneralized = false;
   if (CodeGenOpts.CallGraphSection &&
   (!F->hasLocalLinkage() ||
-   F->getFunction().hasAddressTaken(nullptr, /* IgnoreCallbackUses */ true,
-/* IgnoreAssumeLikeCalls */ true,
-/* IgnoreLLVMUsed */ false))) {
+   F->getFunction().hasAddressTaken(nullptr, /*IgnoreCallbackUses=*/ true,
+/*IgnoreAssumeLikeCalls=*/ true,
+/*IgnoreLLVMUsed=*/ false))) {
 F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
 EmittedMDIdGeneralized = true;
   }

>From 99242900c51778abd4b7e7f4361b09202b7abcda Mon Sep 17 00:00:00 2001
From: Prabhuk 
Date: Mon, 29 Apr 2024 11:53:40 -0700
Subject: [PATCH 3/3] dyn_cast to isa

Created using spr 1.3.6-beta.1
---
 clang/lib/CodeGen/CGCall.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 526a63b24ff834..45033ced1d8344 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5713,8 +5713,8 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
 if (callOrInvoke && *callOrInvoke && (*callOrInvoke)->isIndirectCall()) {
   if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
 // Type id metadata is set only for C/C++ contexts.
-if (dyn_cast(FD) || dyn_cast(FD) ||
-dyn_cast(FD)) {
+if (isa(FD) || isa(FD) ||
+isa(FD)) {
   CGM.CreateFunctionTypeMetadataForIcall(FD->getType(), *callOrInvoke);
 }
   }

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CGData] Refactor Global Merge Functions (PR #115750)

2024-11-13 Thread Zhaoxuan Jiang via llvm-branch-commits

https://github.com/nocchijiang approved this pull request.


https://github.com/llvm/llvm-project/pull/115750
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extract and propagate indirect call type ids (PR #87575)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87575


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [CallGraphSection] Add call graph section options and documentation (PR #87572)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87572


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extend CallSiteInfo with TypeId (PR #87574)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87574


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AsmPrinter][CallGraphSection] Emit call graph section (PR #87576)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87576


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AsmPrinter][CallGraphSection] Emit call graph section (PR #87576)

2024-11-13 Thread via llvm-branch-commits

https://github.com/Prabhuk updated 
https://github.com/llvm/llvm-project/pull/87576


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] be995b8 - Revert "[Fuchsia][CMake] Enable new libc header gen (#102371)"

2024-11-13 Thread via llvm-branch-commits

Author: Petr Hosek
Date: 2024-11-13T23:32:07-08:00
New Revision: be995b825da9c12c8fead48d2e5ba575f154bddf

URL: 
https://github.com/llvm/llvm-project/commit/be995b825da9c12c8fead48d2e5ba575f154bddf
DIFF: 
https://github.com/llvm/llvm-project/commit/be995b825da9c12c8fead48d2e5ba575f154bddf.diff

LOG: Revert "[Fuchsia][CMake] Enable new libc header gen (#102371)"

This reverts commit d492001bdcd7bfcd19ada7459a6b0eaf81ba3ba2.

Added: 


Modified: 
clang/cmake/caches/Fuchsia-stage2.cmake

Removed: 




diff  --git a/clang/cmake/caches/Fuchsia-stage2.cmake b/clang/cmake/caches/Fuchsia-stage2.cmake
index d5859c8806b561..5af98c7b3b3fba 100644
--- a/clang/cmake/caches/Fuchsia-stage2.cmake
+++ b/clang/cmake/caches/Fuchsia-stage2.cmake
@@ -329,7 +329,7 @@ foreach(target armv6m-none-eabi;armv7m-none-eabi;armv8m.main-none-eabi)
   foreach(lang C;CXX;ASM)
 # TODO: The preprocessor defines workaround various issues in libc and libc++ integration.
 # These should be addressed and removed over time.
-set(RUNTIMES_${target}_CMAKE_${lang}_local_flags "--target=${target} -mthumb -Wno-atomic-alignment \"-Dvfprintf(stream, format, vlist)=vprintf(format, vlist)\" \"-Dfprintf(stream, format, ...)=printf(format)\" \"-Dgettimeofday(tv, tz)\" -D_LIBCPP_PRINT=1")
+set(RUNTIMES_${target}_CMAKE_${lang}_local_flags "--target=${target} -mthumb -Wno-atomic-alignment \"-Dvfprintf(stream, format, vlist)=vprintf(format, vlist)\" \"-Dfprintf(stream, format, ...)=printf(format)\" \"-Dtimeval=struct timeval{int tv_sec; int tv_usec;}\" \"-Dgettimeofday(tv, tz)\" -D_LIBCPP_PRINT=1")
 if(${target} STREQUAL "armv8m.main-none-eabi")
   set(RUNTIMES_${target}_CMAKE_${lang}_local_flags "${RUNTIMES_${target}_CMAKE_${lang}_local_flags} -mfloat-abi=softfp -march=armv8m.main+fp+dsp -mcpu=cortex-m33" CACHE STRING "")
 endif()
@@ -340,6 +340,7 @@ foreach(target 
armv6m-none-eabi;armv7m-none-eabi;armv8m.main-none-eabi)
   endforeach()
   set(RUNTIMES_${target}_LLVM_LIBC_FULL_BUILD ON CACHE BOOL "")
   set(RUNTIMES_${target}_LIBC_ENABLE_USE_BY_CLANG ON CACHE BOOL "")
+  set(RUNTIMES_${target}_LIBC_USE_NEW_HEADER_GEN OFF CACHE BOOL "")
   set(RUNTIMES_${target}_LIBCXX_ABI_VERSION 2 CACHE STRING "")
   set(RUNTIMES_${target}_LIBCXX_CXX_ABI none CACHE STRING "")
   set(RUNTIMES_${target}_LIBCXX_ENABLE_SHARED OFF CACHE BOOL "")
@@ -384,13 +385,14 @@ foreach(target riscv32-unknown-elf)
   foreach(lang C;CXX;ASM)
 # TODO: The preprocessor defines workaround various issues in libc and libc++ integration.
 # These should be addressed and removed over time.
-set(RUNTIMES_${target}_CMAKE_${lang}_FLAGS "--target=${target} -march=rv32imafc -mabi=ilp32f -Wno-atomic-alignment \"-Dvfprintf(stream, format, vlist)=vprintf(format, vlist)\" \"-Dfprintf(stream, format, ...)=printf(format)\" \"-Dgettimeofday(tv, tz)\" -D_LIBCPP_PRINT=1" CACHE STRING "")
+set(RUNTIMES_${target}_CMAKE_${lang}_FLAGS "--target=${target} -march=rv32imafc -mabi=ilp32f -Wno-atomic-alignment \"-Dvfprintf(stream, format, vlist)=vprintf(format, vlist)\" \"-Dfprintf(stream, format, ...)=printf(format)\" \"-Dtimeval=struct timeval{int tv_sec; int tv_usec;}\" \"-Dgettimeofday(tv, tz)\" -D_LIBCPP_PRINT=1" CACHE STRING "")
   endforeach()
   foreach(type SHARED;MODULE;EXE)
 set(RUNTIMES_${target}_CMAKE_${type}_LINKER_FLAGS "-fuse-ld=lld" CACHE STRING "")
   endforeach()
   set(RUNTIMES_${target}_LLVM_LIBC_FULL_BUILD ON CACHE BOOL "")
   set(RUNTIMES_${target}_LIBC_ENABLE_USE_BY_CLANG ON CACHE BOOL "")
+  set(RUNTIMES_${target}_LIBC_USE_NEW_HEADER_GEN OFF CACHE BOOL "")
   set(RUNTIMES_${target}_LIBCXX_ABI_VERSION 2 CACHE STRING "")
   set(RUNTIMES_${target}_LIBCXX_CXX_ABI none CACHE STRING "")
   set(RUNTIMES_${target}_LIBCXX_ENABLE_SHARED OFF CACHE BOOL "")



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCycleInfo to NPM (PR #114745)

2024-11-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/114745

>From abfe18a7fec0ed6970a75697898e681ff115d9c1 Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Wed, 30 Oct 2024 04:59:30 +
Subject: [PATCH 1/2] [CodeGen][NewPM] Port MachineCycleInfo to NPM

---
 .../llvm/CodeGen/MachineCycleAnalysis.h   | 21 ++
 llvm/include/llvm/InitializePasses.h  |  2 +-
 .../llvm/Passes/MachinePassRegistry.def   |  3 +-
 llvm/lib/CodeGen/CodeGen.cpp  |  2 +-
 llvm/lib/CodeGen/MachineCycleAnalysis.cpp | 38 ++-
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 llvm/test/CodeGen/X86/cycle-info.mir  |  2 +
 7 files changed, 57 insertions(+), 12 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachineCycleAnalysis.h 
b/llvm/include/llvm/CodeGen/MachineCycleAnalysis.h
index 1888dd053ce65ee..64cf30e6ddf3b8d 100644
--- a/llvm/include/llvm/CodeGen/MachineCycleAnalysis.h
+++ b/llvm/include/llvm/CodeGen/MachineCycleAnalysis.h
@@ -16,6 +16,7 @@
 
 #include "llvm/ADT/GenericCycleInfo.h"
 #include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachinePassManager.h"
 #include "llvm/CodeGen/MachineSSAContext.h"
 
 namespace llvm {
@@ -46,6 +47,26 @@ class MachineCycleInfoWrapperPass : public 
MachineFunctionPass {
 //   version.
 bool isCycleInvariant(const MachineCycle *Cycle, MachineInstr &I);
 
+class MachineCycleAnalysis : public AnalysisInfoMixin<MachineCycleAnalysis> {
+  friend AnalysisInfoMixin<MachineCycleAnalysis>;
+  static AnalysisKey Key;
+
+public:
+  using Result = MachineCycleInfo;
+
+  Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM);
+};
+
+class MachineCycleInfoPrinterPass
+    : public PassInfoMixin<MachineCycleInfoPrinterPass> {
+  raw_ostream &OS;
+
+public:
+  explicit MachineCycleInfoPrinterPass(raw_ostream &OS) : OS(OS) {}
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINECYCLEANALYSIS_H
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index bf934de9261cec0..598498f8597b6aa 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -191,7 +191,7 @@ void initializeMachineCFGPrinterPass(PassRegistry &);
 void initializeMachineCSELegacyPass(PassRegistry &);
 void initializeMachineCombinerPass(PassRegistry &);
 void initializeMachineCopyPropagationPass(PassRegistry &);
-void initializeMachineCycleInfoPrinterPassPass(PassRegistry &);
+void initializeMachineCycleInfoPrinterLegacyPass(PassRegistry &);
 void initializeMachineCycleInfoWrapperPassPass(PassRegistry &);
 void initializeMachineDominanceFrontierPass(PassRegistry &);
 void initializeMachineDominatorTreeWrapperPassPass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index 9d12a120ff7ac6d..bfe8caba0ce0b36 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -101,6 +101,7 @@ MACHINE_FUNCTION_ANALYSIS("live-vars", 
LiveVariablesAnalysis())
 MACHINE_FUNCTION_ANALYSIS("machine-block-freq", 
MachineBlockFrequencyAnalysis())
 MACHINE_FUNCTION_ANALYSIS("machine-branch-prob",
   MachineBranchProbabilityAnalysis())
+MACHINE_FUNCTION_ANALYSIS("machine-cycles", MachineCycleAnalysis())
 MACHINE_FUNCTION_ANALYSIS("machine-dom-tree", MachineDominatorTreeAnalysis())
 MACHINE_FUNCTION_ANALYSIS("machine-loops", MachineLoopAnalysis())
 MACHINE_FUNCTION_ANALYSIS("machine-opt-remark-emitter",
@@ -151,6 +152,7 @@ MACHINE_FUNCTION_PASS("print",
 MACHINE_FUNCTION_PASS("print",
   MachineDominatorTreePrinterPass(dbgs()))
 MACHINE_FUNCTION_PASS("print", MachineLoopPrinterPass(dbgs()))
+MACHINE_FUNCTION_PASS("print", 
MachineCycleInfoPrinterPass(dbgs()))
 MACHINE_FUNCTION_PASS("print",
   MachinePostDominatorTreePrinterPass(dbgs()))
 MACHINE_FUNCTION_PASS("print", SlotIndexesPrinterPass(dbgs()))
@@ -241,7 +243,6 @@ DUMMY_MACHINE_FUNCTION_PASS("post-RA-sched", 
PostRASchedulerPass)
 DUMMY_MACHINE_FUNCTION_PASS("postmisched", PostMachineSchedulerPass)
 DUMMY_MACHINE_FUNCTION_PASS("postra-machine-sink", PostRAMachineSinkingPass)
 DUMMY_MACHINE_FUNCTION_PASS("postrapseudos", ExpandPostRAPseudosPass)
-DUMMY_MACHINE_FUNCTION_PASS("print-machine-cycles", 
MachineCycleInfoPrinterPass)
 DUMMY_MACHINE_FUNCTION_PASS("print-machine-uniformity", 
MachineUniformityInfoPrinterPass)
 DUMMY_MACHINE_FUNCTION_PASS("processimpdefs", ProcessImplicitDefsPass)
 DUMMY_MACHINE_FUNCTION_PASS("prologepilog", PrologEpilogInserterPass)
diff --git a/llvm/lib/CodeGen/CodeGen.cpp b/llvm/lib/CodeGen/CodeGen.cpp
index 39fba1d0b527ef6..adddb8daaa0e914 100644
--- a/llvm/lib/CodeGen/CodeGen.cpp
+++ b/llvm/lib/CodeGen/CodeGen.cpp
@@ -78,7 +78,7 @@ void llvm::initializeCodeGen(PassRegistry &Registry) {
   initializeMachineCSELegacyPass(Registry);

[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)

2024-11-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan created 
https://github.com/llvm/llvm-project/pull/116166

None

>From 197b28c684fcf3ba751a1283fd124bd2d090dfc7 Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Thu, 14 Nov 2024 05:57:01 +
Subject: [PATCH] [NewPM] Introduce MFAnalysisGetter for a common analysis
 getter

---
 .../include/llvm/CodeGen/MachinePassManager.h | 80 +++
 1 file changed, 80 insertions(+)

diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h 
b/llvm/include/llvm/CodeGen/MachinePassManager.h
index 69b5f6e92940c4..37669faff96c33 100644
--- a/llvm/include/llvm/CodeGen/MachinePassManager.h
+++ b/llvm/include/llvm/CodeGen/MachinePassManager.h
@@ -21,12 +21,19 @@
 #ifndef LLVM_CODEGEN_MACHINEPASSMANAGER_H
 #define LLVM_CODEGEN_MACHINEPASSMANAGER_H
 
+#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/FunctionExtras.h"
+#include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/IR/Function.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/PassManagerInternal.h"
+#include "llvm/Pass.h"
 #include "llvm/Support/Error.h"
+#include 
+#include 
+#include 
 
 namespace llvm {
 class Module;
@@ -236,6 +243,79 @@ using MachineFunctionPassManager = 
PassManager;
 /// preserve.
 PreservedAnalyses getMachineFunctionPassPreservedAnalyses();
 
+/// For migrating to new pass manager
+/// Provides a common interface to fetch analyses instead of doing it twice in
+/// the *LegacyPass::runOnMachineFunction and NPM Pass::run NPM analyses must
+/// have the LegacyWrapper type to indicate which legacy analysis to run.
+///
+/// Outer analyses passes(Module or Function) can also be requested through
+/// `getAnalysis` or `getCachedAnalysis`.
+class MFAnalysisGetter {
+private:
+  Pass *LegacyPass;
+  MachineFunctionAnalysisManager *MFAM;
+
+  template 
+  using type_of_run =
+  typename function_traits::template arg_t<0>;
+
+  template 
+  static constexpr bool IsFunctionAnalysis =
+  std::is_same_v>;
+
+  template 
+  static constexpr bool IsModuleAnalysis =
+  std::is_same_v>;
+
+public:
+  MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {}
+  MFAnalysisGetter(MachineFunctionAnalysisManager *MFAM) : MFAM(MFAM) {}
+
+  /// Outer analyses requested from NPM will be cached results and can be null
+  template 
+  typename AnalysisT::Result *getAnalysis(MachineFunction &MF) {
+if (MFAM) {
+  // need a proxy to get the result for outer analyses
+  // this can return null
+  if constexpr (IsModuleAnalysis)
+return MFAM->getResult(MF)
+.getCachedResult(*MF.getFunction().getParent());
+  else if constexpr (IsFunctionAnalysis) {
+return 
&MFAM->getResult(MF)
+.getManager()
+.getResult(MF.getFunction());
+  }
+  return &MFAM->getResult(MF);
+}
+return &LegacyPass->getAnalysis()
+.getResult();
+  }
+
+  template 
+  typename AnalysisT::Result *getCachedAnalysis(MachineFunction &MF) {
+if (MFAM) {
+  if constexpr (IsFunctionAnalysis) {
+return MFAM->getResult(MF)
+.getManager()
+.getCachedResult(MF.getFunction());
+  } else if constexpr (IsModuleAnalysis)
+return MFAM->getResult(MF)
+.getCachedResult(*MF.getFunction().getParent());
+
+  return &MFAM->getCachedResult(MF);
+}
+
+if (auto *P =
+LegacyPass->getAnalysisIfAvailable())
+  return &P->getResult();
+return nullptr;
+  }
+
+  /// This is not intended to be used to invoke getAnalysis()
+  Pass *getLegacyPass() const { return LegacyPass; }
+  MachineFunctionAnalysisManager *getMFAM() const { return MFAM; }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINEPASSMANAGER_H
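
The doc comment in the patch above describes one getter that serves both the legacy pass and the new pass manager, so the shared pass logic is written once. A stripped-down sketch of that pattern follows. Every type here (`CycleInfo`, `LegacyWrapper`, `NewManager`, `AnalysisGetter`) is a hypothetical toy, not an LLVM class, and the template machinery of the real patch is omitted.

```cpp
// Toy analysis result.
struct CycleInfo {
  int NumCycles;
};

// Toy stand-in for a legacy wrapper pass exposing getResult().
struct LegacyWrapper {
  CycleInfo Info;
  CycleInfo &getResult() { return Info; }
};

// Toy stand-in for a new-PM analysis manager.
struct NewManager {
  CycleInfo Info;
  CycleInfo &getResult() { return Info; }
};

// One getter, two backends. Both pointers get default member initializers so
// that the `if (NewPM)` test is well defined whichever constructor ran.
class AnalysisGetter {
  LegacyWrapper *Legacy = nullptr;
  NewManager *NewPM = nullptr;

public:
  explicit AnalysisGetter(LegacyWrapper *L) : Legacy(L) {}
  explicit AnalysisGetter(NewManager *M) : NewPM(M) {}

  CycleInfo *get() {
    if (NewPM)
      return &NewPM->getResult();
    return &Legacy->getResult();
  }
};

// Shared implementation that both pass flavors can call.
int countCycles(AnalysisGetter &G) { return G.get()->NumCycles; }
```

One detail worth a look during review: as quoted in this message, `MFAnalysisGetter` declares `Pass *LegacyPass;` and `MachineFunctionAnalysisManager *MFAM;` without initializers, and each constructor sets only one of them, so `if (MFAM)` would read an indeterminate pointer when the object is built from a legacy pass. The sketch sidesteps this with `= nullptr` defaults; whether later revisions of the PR address it is not visible here.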

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)

2024-11-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan ready_for_review 
https://github.com/llvm/llvm-project/pull/116166
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)

2024-11-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan edited 
https://github.com/llvm/llvm-project/pull/116166
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)

2024-11-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/116166

>From 7e31c7d84a9aa87a226bb9f1341fa4a6bae9e7bb Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Thu, 14 Nov 2024 05:57:01 +
Subject: [PATCH 1/2] [NewPM] Introduce MFAnalysisGetter for a common analysis
 getter

---
 .../include/llvm/CodeGen/MachinePassManager.h | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h 
b/llvm/include/llvm/CodeGen/MachinePassManager.h
index 69b5f6e92940c4..e8b7b96240d148 100644
--- a/llvm/include/llvm/CodeGen/MachinePassManager.h
+++ b/llvm/include/llvm/CodeGen/MachinePassManager.h
@@ -24,8 +24,10 @@
 #include "llvm/ADT/FunctionExtras.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/IR/Function.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/PassManagerInternal.h"
+#include "llvm/Pass.h"
 #include "llvm/Support/Error.h"
 
 namespace llvm {
@@ -236,6 +238,82 @@ using MachineFunctionPassManager = 
PassManager;
 /// preserve.
 PreservedAnalyses getMachineFunctionPassPreservedAnalyses();
 
+/// For migrating to new pass manager
+/// Provides a common interface to fetch analyses instead of doing it twice in
+/// the *LegacyPass::runOnMachineFunction and NPM Pass::run.
+///
+/// NPM analyses must have the LegacyWrapper type to indicate which legacy
+/// analysis to run. Legacy wrapper analyses must have `getResult()` method.
+/// This can be added on a needs-to basis.
+///
+/// Outer analyses passes(Module or Function) can also be requested through
+/// `getAnalysis` or `getCachedAnalysis`.
+class MFAnalysisGetter {
+private:
+  Pass *LegacyPass;
+  MachineFunctionAnalysisManager *MFAM;
+
+  template 
+  using type_of_run =
+  typename function_traits::template arg_t<0>;
+
+  template 
+  static constexpr bool IsFunctionAnalysis =
+  std::is_same_v>;
+
+  template 
+  static constexpr bool IsModuleAnalysis =
+  std::is_same_v>;
+
+public:
+  MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {}
+  MFAnalysisGetter(MachineFunctionAnalysisManager *MFAM) : MFAM(MFAM) {}
+
+  /// Outer analyses requested from NPM will be cached results and can be null
+  template 
+  typename AnalysisT::Result *getAnalysis(MachineFunction &MF) {
+if (MFAM) {
+  // need a proxy to get the result for outer analyses
+  // this can return null
+  if constexpr (IsModuleAnalysis)
+return MFAM->getResult(MF)
+.getCachedResult(*MF.getFunction().getParent());
+  else if constexpr (IsFunctionAnalysis) {
+return 
&MFAM->getResult(MF)
+.getManager()
+.getResult(MF.getFunction());
+  }
+  return &MFAM->getResult(MF);
+}
+return &LegacyPass->getAnalysis()
+.getResult();
+  }
+
+  template 
+  typename AnalysisT::Result *getCachedAnalysis(MachineFunction &MF) {
+if (MFAM) {
+  if constexpr (IsFunctionAnalysis) {
+return MFAM->getResult(MF)
+.getManager()
+.getCachedResult(MF.getFunction());
+  } else if constexpr (IsModuleAnalysis)
+return MFAM->getResult(MF)
+.getCachedResult(*MF.getFunction().getParent());
+
+  return &MFAM->getCachedResult(MF);
+}
+
+if (auto *P =
+LegacyPass->getAnalysisIfAvailable())
+  return &P->getResult();
+return nullptr;
+  }
+
+  /// This is not intended to be used to invoke getAnalysis()
+  Pass *getLegacyPass() const { return LegacyPass; }
+  MachineFunctionAnalysisManager *getMFAM() const { return MFAM; }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINEPASSMANAGER_H

>From 125b82d45358690f146b259b779616d79eccd267 Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Thu, 14 Nov 2024 06:49:23 +
Subject: [PATCH 2/2] Initialize with null

---
 llvm/include/llvm/CodeGen/MachinePassManager.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h b/llvm/include/llvm/CodeGen/MachinePassManager.h
index e8b7b96240d148..bba41ed343f10d 100644
--- a/llvm/include/llvm/CodeGen/MachinePassManager.h
+++ b/llvm/include/llvm/CodeGen/MachinePassManager.h
@@ -250,8 +250,8 @@ PreservedAnalyses getMachineFunctionPassPreservedAnalyses();
 /// `getAnalysis` or `getCachedAnalysis`.
 class MFAnalysisGetter {
 private:
-  Pass *LegacyPass;
-  MachineFunctionAnalysisManager *MFAM;
+  Pass *LegacyPass = nullptr;
+  MachineFunctionAnalysisManager *MFAM = nullptr;
 
   template 
   using type_of_run =
@@ -259,11 +259,11 @@ class MFAnalysisGetter {
 
   template 
   static constexpr bool IsFunctionAnalysis =
-  std::is_same_v>;
+  std::is_same_v>;
 
   template 
   static constexpr bool IsModuleAnalysis =
-  std::is_same_v>;
+  std::is_same_v>;
 
 public:
   MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {}


[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineSink to NPM (PR #115434)

2024-11-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan edited 
https://github.com/llvm/llvm-project/pull/115434


[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineSink to NPM (PR #115434)

2024-11-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/115434

>From 7e31c7d84a9aa87a226bb9f1341fa4a6bae9e7bb Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Thu, 14 Nov 2024 05:57:01 +
Subject: [PATCH 1/5] [NewPM] Introduce MFAnalysisGetter for a common analysis
 getter

---
 .../include/llvm/CodeGen/MachinePassManager.h | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h b/llvm/include/llvm/CodeGen/MachinePassManager.h
index 69b5f6e92940c4..e8b7b96240d148 100644
--- a/llvm/include/llvm/CodeGen/MachinePassManager.h
+++ b/llvm/include/llvm/CodeGen/MachinePassManager.h
@@ -24,8 +24,10 @@
 #include "llvm/ADT/FunctionExtras.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/IR/Function.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/PassManagerInternal.h"
+#include "llvm/Pass.h"
 #include "llvm/Support/Error.h"
 
 namespace llvm {
@@ -236,6 +238,82 @@ using MachineFunctionPassManager = 
PassManager;
 /// preserve.
 PreservedAnalyses getMachineFunctionPassPreservedAnalyses();
 
+/// For migrating to new pass manager
+/// Provides a common interface to fetch analyses instead of doing it twice in
+/// the *LegacyPass::runOnMachineFunction and NPM Pass::run.
+///
+/// NPM analyses must have the LegacyWrapper type to indicate which legacy
+/// analysis to run. Legacy wrapper analyses must have `getResult()` method.
+/// This can be added on a needs-to basis.
+///
+/// Outer analyses passes(Module or Function) can also be requested through
+/// `getAnalysis` or `getCachedAnalysis`.
+class MFAnalysisGetter {
+private:
+  Pass *LegacyPass;
+  MachineFunctionAnalysisManager *MFAM;
+
+  template 
+  using type_of_run =
+  typename function_traits::template arg_t<0>;
+
+  template 
+  static constexpr bool IsFunctionAnalysis =
+  std::is_same_v>;
+
+  template 
+  static constexpr bool IsModuleAnalysis =
+  std::is_same_v>;
+
+public:
+  MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {}
+  MFAnalysisGetter(MachineFunctionAnalysisManager *MFAM) : MFAM(MFAM) {}
+
+  /// Outer analyses requested from NPM will be cached results and can be null
+  template 
+  typename AnalysisT::Result *getAnalysis(MachineFunction &MF) {
+if (MFAM) {
+  // need a proxy to get the result for outer analyses
+  // this can return null
+  if constexpr (IsModuleAnalysis)
+return MFAM->getResult(MF)
+.getCachedResult(*MF.getFunction().getParent());
+  else if constexpr (IsFunctionAnalysis) {
+return 
&MFAM->getResult(MF)
+.getManager()
+.getResult(MF.getFunction());
+  }
+  return &MFAM->getResult(MF);
+}
+return &LegacyPass->getAnalysis()
+.getResult();
+  }
+
+  template 
+  typename AnalysisT::Result *getCachedAnalysis(MachineFunction &MF) {
+if (MFAM) {
+  if constexpr (IsFunctionAnalysis) {
+return MFAM->getResult(MF)
+.getManager()
+.getCachedResult(MF.getFunction());
+  } else if constexpr (IsModuleAnalysis)
+return MFAM->getResult(MF)
+.getCachedResult(*MF.getFunction().getParent());
+
+  return &MFAM->getCachedResult(MF);
+}
+
+if (auto *P =
+LegacyPass->getAnalysisIfAvailable())
+  return &P->getResult();
+return nullptr;
+  }
+
+  /// This is not intended to be used to invoke getAnalysis()
+  Pass *getLegacyPass() const { return LegacyPass; }
+  MachineFunctionAnalysisManager *getMFAM() const { return MFAM; }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINEPASSMANAGER_H

>From 125b82d45358690f146b259b779616d79eccd267 Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Thu, 14 Nov 2024 06:49:23 +
Subject: [PATCH 2/5] Initialize with null

---
 llvm/include/llvm/CodeGen/MachinePassManager.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h b/llvm/include/llvm/CodeGen/MachinePassManager.h
index e8b7b96240d148..bba41ed343f10d 100644
--- a/llvm/include/llvm/CodeGen/MachinePassManager.h
+++ b/llvm/include/llvm/CodeGen/MachinePassManager.h
@@ -250,8 +250,8 @@ PreservedAnalyses getMachineFunctionPassPreservedAnalyses();
 /// `getAnalysis` or `getCachedAnalysis`.
 class MFAnalysisGetter {
 private:
-  Pass *LegacyPass;
-  MachineFunctionAnalysisManager *MFAM;
+  Pass *LegacyPass = nullptr;
+  MachineFunctionAnalysisManager *MFAM = nullptr;
 
   template 
   using type_of_run =
@@ -259,11 +259,11 @@ class MFAnalysisGetter {
 
   template 
   static constexpr bool IsFunctionAnalysis =
-  std::is_same_v>;
+  std::is_same_v>;
 
   template 
   static constexpr bool IsModuleAnalysis =
-  std::is_same_v>;
+  std::is_same_v>;
 
 public:
   MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {}

>From 52b7ce7a4c500ab9e0f975337f51b1e37543ce0a Mon Sep 17

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineSink to NPM (PR #115434)

2024-11-13 Thread via llvm-branch-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. :warning:



You can test this locally with the following command:


```bash
git-clang-format --diff 68bcb36981bce9b99ee70c3bd41443c6d44cd7ae \
  4a8feec5b0ac4967e28a871211d660468f3a0a93 --extensions cpp,h -- \
  llvm/include/llvm/CodeGen/MachineSink.h \
  llvm/include/llvm/Analysis/AliasAnalysis.h \
  llvm/include/llvm/CodeGen/MachineBlockFrequencyInfo.h \
  llvm/include/llvm/CodeGen/MachineBranchProbabilityInfo.h \
  llvm/include/llvm/CodeGen/MachineCycleAnalysis.h \
  llvm/include/llvm/CodeGen/MachineDominators.h \
  llvm/include/llvm/CodeGen/MachinePassManager.h \
  llvm/include/llvm/CodeGen/MachinePostDominators.h \
  llvm/include/llvm/CodeGen/Passes.h llvm/include/llvm/IR/Dominators.h \
  llvm/include/llvm/InitializePasses.h \
  llvm/include/llvm/Passes/CodeGenPassBuilder.h \
  llvm/include/llvm/Target/CGPassBuilderOption.h llvm/lib/CodeGen/CodeGen.cpp \
  llvm/lib/CodeGen/MachineSink.cpp llvm/lib/CodeGen/TargetPassConfig.cpp \
  llvm/lib/Passes/PassBuilder.cpp llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
```





View the diff from clang-format here.


```diff
diff --git a/llvm/include/llvm/CodeGen/MachineSink.h b/llvm/include/llvm/CodeGen/MachineSink.h
index 1eee9d7f7e..71bd7229b7 100644
--- a/llvm/include/llvm/CodeGen/MachineSink.h
+++ b/llvm/include/llvm/CodeGen/MachineSink.h
@@ -22,7 +22,8 @@ public:
 
   PreservedAnalyses run(MachineFunction &MF, MachineFunctionAnalysisManager &);
 
-  void printPipeline(raw_ostream &OS, function_ref<StringRef(StringRef)> MapClassName2PassName);
+  void printPipeline(raw_ostream &OS,
+                     function_ref<StringRef(StringRef)> MapClassName2PassName);
 };
 
 } // namespace llvm

```




https://github.com/llvm/llvm-project/pull/115434


[llvm-branch-commits] [mlir] [mlir][Parser] Deduplicate floating-point parsing functionality (PR #116172)

2024-11-13 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer edited 
https://github.com/llvm/llvm-project/pull/116172


[llvm-branch-commits] [mlir] [mlir][Parser] Deduplicate fp parsing functionality (PR #116172)

2024-11-13 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer created 
https://github.com/llvm/llvm-project/pull/116172

The following functionality is duplicated in multiple places: trying to parse 
an APFloat from a floating point literal or an integer in hexadecimal 
representation (bit pattern). Move it to a common helper function.

NFC apart from the slightly changed error messages.

Depends on #116171.
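
As a rough illustration of the kind of helper described above, the following standalone sketch accepts either a decimal floating-point literal or a hexadecimal integer that spells out a bit pattern. It is not the MLIR implementation: the name `parseFloatFromLiteralSketch`, the string-based interface, and the restriction to 32-bit `float` (instead of arbitrary `APFloat` semantics) are all assumptions made for the example.

```cpp
#include <cctype>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <optional>
#include <string>

// Hypothetical stand-in for a shared parsing helper. Accepts either a decimal
// floating-point literal ("1.5") or a hexadecimal integer that spells out an
// IEEE-754 binary32 bit pattern ("0x3FC00000"). Returns nullopt on failure.
std::optional<float> parseFloatFromLiteralSketch(const std::string &text,
                                                 bool isNegative) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  if (text.empty())
    return std::nullopt;

  if (text.rfind("0x", 0) == 0) {
    // Hexadecimal form: the digits are raw bits, so a separate sign is
    // rejected in this sketch (the sign is already part of the pattern).
    if (isNegative)
      return std::nullopt;
    // Require 1 to 8 hex digits so the value fits in 32 bits.
    if (text.size() <= 2 || text.size() > 10)
      return std::nullopt;
    for (std::size_t i = 2; i < text.size(); ++i)
      if (!std::isxdigit(static_cast<unsigned char>(text[i])))
        return std::nullopt;
    const auto raw = static_cast<std::uint32_t>(
        std::strtoul(text.c_str() + 2, nullptr, 16));
    float value;
    std::memcpy(&value, &raw, sizeof(value));
    return value;
  }

  // Decimal form: delegate to strtof and require that the whole string was
  // consumed. Note that strtof is more permissive than a real tokenizer.
  char *end = nullptr;
  const float value = std::strtof(text.c_str(), &end);
  if (end == text.c_str() || *end != '\0')
    return std::nullopt;
  return isNegative ? -value : value;
}
```

Callers that previously handled the two literal forms separately would then share one code path and one set of error conditions, which is the point of the deduplication.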

>From 51530aeea8c18804034881c87236d1ab5ceb274f Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Thu, 14 Nov 2024 07:43:08 +0100
Subject: [PATCH] [mlir][Parser] Deduplicate fp parsing functionality

---
 mlir/lib/AsmParser/AsmParserImpl.h   | 33 ++---
 mlir/lib/AsmParser/AttributeParser.cpp   | 71 
 mlir/lib/AsmParser/Parser.cpp| 23 +++
 mlir/lib/AsmParser/Parser.h  |  6 ++
 mlir/test/IR/invalid-builtin-attributes.mlir | 10 +--
 5 files changed, 56 insertions(+), 87 deletions(-)

diff --git a/mlir/lib/AsmParser/AsmParserImpl.h b/mlir/lib/AsmParser/AsmParserImpl.h
index 1e6cbc0ec51beb..bbd70d5980f8fe 100644
--- a/mlir/lib/AsmParser/AsmParserImpl.h
+++ b/mlir/lib/AsmParser/AsmParserImpl.h
@@ -288,32 +288,13 @@ class AsmParserImpl : public BaseT {
 bool isNegative = parser.consumeIf(Token::minus);
 Token curTok = parser.getToken();
 auto emitErrorAtTok = [&]() { return emitError(curTok.getLoc(), ""); };
-
-// Check for a floating point value.
-if (curTok.is(Token::floatliteral)) {
-  auto val = curTok.getFloatingPointValue();
-  if (!val)
-return emitErrorAtTok() << "floating point value too large";
-  parser.consumeToken(Token::floatliteral);
-  result = APFloat(isNegative ? -*val : *val);
-  bool losesInfo;
-  result.convert(semantics, APFloat::rmNearestTiesToEven, &losesInfo);
-  return success();
-}
-
-// Check for a hexadecimal float value.
-if (curTok.is(Token::integer)) {
-  FailureOr apResult = parseFloatFromIntegerLiteral(
-  emitErrorAtTok, curTok, isNegative, semantics);
-  if (failed(apResult))
-return failure();
-
-  result = *apResult;
-  parser.consumeToken(Token::integer);
-  return success();
-}
-
-return emitErrorAtTok() << "expected floating point literal";
+FailureOr apResult =
+parseFloatFromLiteral(emitErrorAtTok, curTok, isNegative, semantics);
+if (failed(apResult))
+  return failure();
+parser.consumeToken();
+result = *apResult;
+return success();
   }
 
   /// Parse a floating point value from the stream.
diff --git a/mlir/lib/AsmParser/AttributeParser.cpp b/mlir/lib/AsmParser/AttributeParser.cpp
index ba9be3b030453a..9ebada076cd042 100644
--- a/mlir/lib/AsmParser/AttributeParser.cpp
+++ b/mlir/lib/AsmParser/AttributeParser.cpp
@@ -658,36 +658,12 @@ TensorLiteralParser::getFloatAttrElements(SMLoc loc, 
FloatType eltTy,
   for (const auto &signAndToken : storage) {
 bool isNegative = signAndToken.first;
 const Token &token = signAndToken.second;
-
-// Handle hexadecimal float literals.
-if (token.is(Token::integer) && token.getSpelling().starts_with("0x")) {
-  auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
-  FailureOr result = parseFloatFromIntegerLiteral(
-  emitErrorAtTok, token, isNegative, eltTy.getFloatSemantics());
-  if (failed(result))
-return failure();
-
-  floatValues.push_back(*result);
-  continue;
-}
-
-// Check to see if any decimal integers or booleans were parsed.
-if (!token.is(Token::floatliteral))
-  return p.emitError()
- << "expected floating-point elements, but parsed integer";
-
-// Build the float values from tokens.
-auto val = token.getFloatingPointValue();
-if (!val)
-  return p.emitError("floating point value too large for attribute");
-
-APFloat apVal(isNegative ? -*val : *val);
-if (!eltTy.isF64()) {
-  bool unused;
-  apVal.convert(eltTy.getFloatSemantics(), APFloat::rmNearestTiesToEven,
-&unused);
-}
-floatValues.push_back(apVal);
+auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
+FailureOr result = parseFloatFromLiteral(
+emitErrorAtTok, token, isNegative, eltTy.getFloatSemantics());
+if (failed(result))
+  return failure();
+floatValues.push_back(*result);
   }
   return success();
 }
@@ -905,34 +881,15 @@ ParseResult 
DenseArrayElementParser::parseIntegerElement(Parser &p) {
 
 ParseResult DenseArrayElementParser::parseFloatElement(Parser &p) {
   bool isNegative = p.consumeIf(Token::minus);
-
   Token token = p.getToken();
-  std::optional result;
-  auto floatType = cast(type);
-  if (p.consumeIf(Token::integer)) {
-// Parse an integer literal as a float.
-auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
-FailureOr fromIntLit = parseFloatFromIntegerLiteral(
-emitErrorAtTok, token, isNegative, float

[llvm-branch-commits] [mlir] [mlir][Parser] Deduplicate floating-point parsing functionality (PR #116172)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir

Author: Matthias Springer (matthias-springer)


Changes

The following functionality is duplicated in multiple places: trying to parse 
an APFloat from a floating point literal or an integer in hexadecimal 
representation (bit pattern). Move it to a common helper function.

NFC apart from the slightly changed error messages.

Depends on #116171.

---
Full diff: https://github.com/llvm/llvm-project/pull/116172.diff


5 Files Affected:

- (modified) mlir/lib/AsmParser/AsmParserImpl.h (+7-26) 
- (modified) mlir/lib/AsmParser/AttributeParser.cpp (+14-57) 
- (modified) mlir/lib/AsmParser/Parser.cpp (+23) 
- (modified) mlir/lib/AsmParser/Parser.h (+6) 
- (modified) mlir/test/IR/invalid-builtin-attributes.mlir (+6-4) 


``diff
diff --git a/mlir/lib/AsmParser/AsmParserImpl.h b/mlir/lib/AsmParser/AsmParserImpl.h
index 1e6cbc0ec51beb..bbd70d5980f8fe 100644
--- a/mlir/lib/AsmParser/AsmParserImpl.h
+++ b/mlir/lib/AsmParser/AsmParserImpl.h
@@ -288,32 +288,13 @@ class AsmParserImpl : public BaseT {
 bool isNegative = parser.consumeIf(Token::minus);
 Token curTok = parser.getToken();
 auto emitErrorAtTok = [&]() { return emitError(curTok.getLoc(), ""); };
-
-// Check for a floating point value.
-if (curTok.is(Token::floatliteral)) {
-  auto val = curTok.getFloatingPointValue();
-  if (!val)
-return emitErrorAtTok() << "floating point value too large";
-  parser.consumeToken(Token::floatliteral);
-  result = APFloat(isNegative ? -*val : *val);
-  bool losesInfo;
-  result.convert(semantics, APFloat::rmNearestTiesToEven, &losesInfo);
-  return success();
-}
-
-// Check for a hexadecimal float value.
-if (curTok.is(Token::integer)) {
-  FailureOr apResult = parseFloatFromIntegerLiteral(
-  emitErrorAtTok, curTok, isNegative, semantics);
-  if (failed(apResult))
-return failure();
-
-  result = *apResult;
-  parser.consumeToken(Token::integer);
-  return success();
-}
-
-return emitErrorAtTok() << "expected floating point literal";
+FailureOr apResult =
+parseFloatFromLiteral(emitErrorAtTok, curTok, isNegative, semantics);
+if (failed(apResult))
+  return failure();
+parser.consumeToken();
+result = *apResult;
+return success();
   }
 
   /// Parse a floating point value from the stream.
diff --git a/mlir/lib/AsmParser/AttributeParser.cpp b/mlir/lib/AsmParser/AttributeParser.cpp
index ba9be3b030453a..9ebada076cd042 100644
--- a/mlir/lib/AsmParser/AttributeParser.cpp
+++ b/mlir/lib/AsmParser/AttributeParser.cpp
@@ -658,36 +658,12 @@ TensorLiteralParser::getFloatAttrElements(SMLoc loc, 
FloatType eltTy,
   for (const auto &signAndToken : storage) {
 bool isNegative = signAndToken.first;
 const Token &token = signAndToken.second;
-
-// Handle hexadecimal float literals.
-if (token.is(Token::integer) && token.getSpelling().starts_with("0x")) {
-  auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
-  FailureOr result = parseFloatFromIntegerLiteral(
-  emitErrorAtTok, token, isNegative, eltTy.getFloatSemantics());
-  if (failed(result))
-return failure();
-
-  floatValues.push_back(*result);
-  continue;
-}
-
-// Check to see if any decimal integers or booleans were parsed.
-if (!token.is(Token::floatliteral))
-  return p.emitError()
- << "expected floating-point elements, but parsed integer";
-
-// Build the float values from tokens.
-auto val = token.getFloatingPointValue();
-if (!val)
-  return p.emitError("floating point value too large for attribute");
-
-APFloat apVal(isNegative ? -*val : *val);
-if (!eltTy.isF64()) {
-  bool unused;
-  apVal.convert(eltTy.getFloatSemantics(), APFloat::rmNearestTiesToEven,
-&unused);
-}
-floatValues.push_back(apVal);
+auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
+FailureOr result = parseFloatFromLiteral(
+emitErrorAtTok, token, isNegative, eltTy.getFloatSemantics());
+if (failed(result))
+  return failure();
+floatValues.push_back(*result);
   }
   return success();
 }
@@ -905,34 +881,15 @@ ParseResult 
DenseArrayElementParser::parseIntegerElement(Parser &p) {
 
 ParseResult DenseArrayElementParser::parseFloatElement(Parser &p) {
   bool isNegative = p.consumeIf(Token::minus);
-
   Token token = p.getToken();
-  std::optional result;
-  auto floatType = cast(type);
-  if (p.consumeIf(Token::integer)) {
-// Parse an integer literal as a float.
-auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
-FailureOr fromIntLit = parseFloatFromIntegerLiteral(
-emitErrorAtTok, token, isNegative, floatType.getFloatSemantics());
-if (failed(fromIntLit))
-  return failure();
-result = *fromIntLit;
-  } else if (p.consumeIf(Token::floatliteral)) {
-// P

[llvm-branch-commits] [mlir] [mlir][Parser] Deduplicate floating-point parsing functionality (PR #116172)

2024-11-13 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer edited 
https://github.com/llvm/llvm-project/pull/116172


[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)

2024-11-13 Thread Akshat Oke via llvm-branch-commits

optimisan wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

* **#116166** 👈
* **#114745**: 2 other dependent PRs 
([#114746](https://github.com/llvm/llvm-project/pull/114746), 
[#115434](https://github.com/llvm/llvm-project/pull/115434))
* **#114027**
* `main`

This stack of pull requests is managed by Graphite.

https://github.com/llvm/llvm-project/pull/116166


[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)

2024-11-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/116166

>From 7e31c7d84a9aa87a226bb9f1341fa4a6bae9e7bb Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Thu, 14 Nov 2024 05:57:01 +
Subject: [PATCH] [NewPM] Introduce MFAnalysisGetter for a common analysis
 getter

---
 .../include/llvm/CodeGen/MachinePassManager.h | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h b/llvm/include/llvm/CodeGen/MachinePassManager.h
index 69b5f6e92940c4..e8b7b96240d148 100644
--- a/llvm/include/llvm/CodeGen/MachinePassManager.h
+++ b/llvm/include/llvm/CodeGen/MachinePassManager.h
@@ -24,8 +24,10 @@
 #include "llvm/ADT/FunctionExtras.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/IR/Function.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/PassManagerInternal.h"
+#include "llvm/Pass.h"
 #include "llvm/Support/Error.h"
 
 namespace llvm {
@@ -236,6 +238,82 @@ using MachineFunctionPassManager = 
PassManager;
 /// preserve.
 PreservedAnalyses getMachineFunctionPassPreservedAnalyses();
 
+/// For migrating to new pass manager
+/// Provides a common interface to fetch analyses instead of doing it twice in
+/// the *LegacyPass::runOnMachineFunction and NPM Pass::run.
+///
+/// NPM analyses must have the LegacyWrapper type to indicate which legacy
+/// analysis to run. Legacy wrapper analyses must have `getResult()` method.
+/// This can be added on a needs-to basis.
+///
+/// Outer analyses passes(Module or Function) can also be requested through
+/// `getAnalysis` or `getCachedAnalysis`.
+class MFAnalysisGetter {
+private:
+  Pass *LegacyPass;
+  MachineFunctionAnalysisManager *MFAM;
+
+  template 
+  using type_of_run =
+  typename function_traits::template arg_t<0>;
+
+  template 
+  static constexpr bool IsFunctionAnalysis =
+  std::is_same_v>;
+
+  template 
+  static constexpr bool IsModuleAnalysis =
+  std::is_same_v>;
+
+public:
+  MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {}
+  MFAnalysisGetter(MachineFunctionAnalysisManager *MFAM) : MFAM(MFAM) {}
+
+  /// Outer analyses requested from NPM will be cached results and can be null
+  template 
+  typename AnalysisT::Result *getAnalysis(MachineFunction &MF) {
+if (MFAM) {
+  // need a proxy to get the result for outer analyses
+  // this can return null
+  if constexpr (IsModuleAnalysis)
+return MFAM->getResult(MF)
+.getCachedResult(*MF.getFunction().getParent());
+  else if constexpr (IsFunctionAnalysis) {
+return 
&MFAM->getResult(MF)
+.getManager()
+.getResult(MF.getFunction());
+  }
+  return &MFAM->getResult(MF);
+}
+return &LegacyPass->getAnalysis()
+.getResult();
+  }
+
+  template 
+  typename AnalysisT::Result *getCachedAnalysis(MachineFunction &MF) {
+if (MFAM) {
+  if constexpr (IsFunctionAnalysis) {
+return MFAM->getResult(MF)
+.getManager()
+.getCachedResult(MF.getFunction());
+  } else if constexpr (IsModuleAnalysis)
+return MFAM->getResult(MF)
+.getCachedResult(*MF.getFunction().getParent());
+
+  return &MFAM->getCachedResult(MF);
+}
+
+if (auto *P =
+LegacyPass->getAnalysisIfAvailable())
+  return &P->getResult();
+return nullptr;
+  }
+
+  /// This is not intended to be used to invoke getAnalysis()
+  Pass *getLegacyPass() const { return LegacyPass; }
+  MachineFunctionAnalysisManager *getMFAM() const { return MFAM; }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINEPASSMANAGER_H
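
The idea behind the getter in this patch can be sketched in isolation: one small object wraps either a legacy pass or a new-pass-manager analysis manager, so shared pass logic is written once against a single interface. The sketch below is a simplified assumption, not the LLVM code: `LegacyBackend`, `NewBackend`, `AnalysisGetterSketch`, and `runSharedLogic` are hypothetical names, and strings stand in for analysis results. Both pointers default to null, since each constructor sets only one of them and `getAnalysis` tests the other.

```cpp
#include <string>

// Hypothetical stand-ins for the two pass-manager worlds. In the patch these
// roles are played by `Pass *` and `MachineFunctionAnalysisManager *`.
struct LegacyBackend {
  std::string lookup() const { return "legacy"; }
};
struct NewBackend {
  std::string lookup() const { return "npm"; }
};

class AnalysisGetterSketch {
  // Default member initializers matter: each constructor sets only one
  // pointer, and getAnalysis() tests the other one.
  const LegacyBackend *Legacy = nullptr;
  const NewBackend *New = nullptr;

public:
  explicit AnalysisGetterSketch(const LegacyBackend *L) : Legacy(L) {}
  explicit AnalysisGetterSketch(const NewBackend *N) : New(N) {}

  // Single entry point: dispatch to whichever backend was supplied.
  std::string getAnalysis() const {
    if (New)
      return New->lookup();
    return Legacy->lookup();
  }
};

// Shared pass logic, written once against the getter instead of twice (once
// per pass manager).
std::string runSharedLogic(const AnalysisGetterSketch &Getter) {
  return "result from " + Getter.getAnalysis();
}
```

A legacy wrapper pass and a new-pass-manager pass would each construct the getter from what they have at hand and then call the same shared routine.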



[llvm-branch-commits] [llvm] [mlir] [mlir][Parser] Add `nan` and `inf` keywords (PR #116176)

2024-11-13 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer created 
https://github.com/llvm/llvm-project/pull/116176

Add two new keywords for parsing `nan` / `inf` floating-point literals. This is 
more convenient than writing the integer hexadecimal bit patterns by hand 
(which differ depending on the floating-point type).

Note: The printer still prints `nan` / `inf` literals as integer hexadecimals. 
That's because there can be multiple `nan` / `inf` bit patterns. When parsing a 
`nan` / `inf` keyword, the exact bit pattern is unspecified: we use whatever 
`APFloat::getInf`/`APFloat::getNaN` returns.

Depends on #116172.
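
To make the bit-pattern point concrete, the standalone sketch below shows that more than one 32-bit pattern decodes to NaN, and that the pattern for infinity differs between `float` and `double`. It assumes IEEE-754 `float` and `double` and uses only the C++ standard library; the helper names `floatFromBits` and `bitsOf` are hypothetical and unrelated to `APFloat`.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 &&
                  std::numeric_limits<double>::is_iec559,
              "sketch assumes IEEE-754 float and double");
static_assert(sizeof(float) == 4 && sizeof(double) == 8,
              "sketch assumes 32-bit float and 64-bit double");

// Reinterpret a 32-bit pattern as a binary32 value.
float floatFromBits(std::uint32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// Return the raw bit pattern of a binary32 value.
std::uint32_t bitsOf(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Return the raw bit pattern of a binary64 value.
std::uint64_t bitsOf(double value) {
  std::uint64_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Two distinct patterns that both decode to NaN: all exponent bits set and a
// non-zero mantissa. Only the mantissa payload differs.
constexpr std::uint32_t kQuietNaN = 0x7FC00000u;
constexpr std::uint32_t kNaNWithPayload = 0x7FC00001u;
```

Because a keyword such as `nan` does not say which payload is meant, a printer that needs exact round-tripping has a reason to keep emitting the hexadecimal form.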


>From 045afe88e53f873cf027ab92af32c120a1d47d63 Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Thu, 14 Nov 2024 08:47:06 +0100
Subject: [PATCH] [mlir][Parser] Add `nan` and `inf` keywords

---
 llvm/include/llvm/ADT/APFloat.h   |  2 +
 llvm/lib/Support/APFloat.cpp  |  9 
 mlir/lib/AsmParser/AttributeParser.cpp| 30 +++
 mlir/lib/AsmParser/Parser.cpp | 29 +-
 mlir/lib/AsmParser/TokenKinds.def |  2 +
 mlir/test/Dialect/Arith/canonicalize.mlir | 10 ++--
 mlir/test/IR/attribute.mlir   | 54 +++
 .../math-polynomial-approx.mlir   | 36 ++---
 8 files changed, 138 insertions(+), 34 deletions(-)

diff --git a/llvm/include/llvm/ADT/APFloat.h b/llvm/include/llvm/ADT/APFloat.h
index 97547fb577e0ec..40ad7ba92552ed 100644
--- a/llvm/include/llvm/ADT/APFloat.h
+++ b/llvm/include/llvm/ADT/APFloat.h
@@ -311,6 +311,8 @@ struct APFloatBase {
   static unsigned int semanticsIntSizeInBits(const fltSemantics&, bool);
   static bool semanticsHasZero(const fltSemantics &);
   static bool semanticsHasSignedRepr(const fltSemantics &);
+  static bool semanticsHasInf(const fltSemantics &);
+  static bool semanticsHasNan(const fltSemantics &);
 
   // Returns true if any number described by \p Src can be precisely 
represented
   // by a normal (not subnormal) value in \p Dst.
diff --git a/llvm/lib/Support/APFloat.cpp b/llvm/lib/Support/APFloat.cpp
index c566d489d11b03..8b9d9af2ca65b3 100644
--- a/llvm/lib/Support/APFloat.cpp
+++ b/llvm/lib/Support/APFloat.cpp
@@ -375,6 +375,15 @@ bool APFloatBase::semanticsHasSignedRepr(const 
fltSemantics &semantics) {
   return semantics.hasSignedRepr;
 }
 
+bool APFloatBase::semanticsHasInf(const fltSemantics &semantics) {
+  return semantics.nonFiniteBehavior != fltNonfiniteBehavior::NanOnly
+  && semantics.nonFiniteBehavior != fltNonfiniteBehavior::FiniteOnly;
+}
+
+bool APFloatBase::semanticsHasNan(const fltSemantics &semantics) {
+  return semantics.nonFiniteBehavior != fltNonfiniteBehavior::FiniteOnly;
+}
+
 bool APFloatBase::isRepresentableAsNormalIn(const fltSemantics &Src,
 const fltSemantics &Dst) {
   // Exponent range must be larger.
diff --git a/mlir/lib/AsmParser/AttributeParser.cpp b/mlir/lib/AsmParser/AttributeParser.cpp
index 9ebada076cd042..68da950f09e568 100644
--- a/mlir/lib/AsmParser/AttributeParser.cpp
+++ b/mlir/lib/AsmParser/AttributeParser.cpp
@@ -21,8 +21,10 @@
 #include "mlir/IR/DialectImplementation.h"
 #include "mlir/IR/DialectResourceBlobManager.h"
 #include "mlir/IR/IntegerSet.h"
+#include "llvm/ADT/APFloat.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/Endian.h"
+#include 
 #include 
 
 using namespace mlir;
@@ -121,6 +123,8 @@ Attribute Parser::parseAttribute(Type type) {
 
   // Parse floating point and integer attributes.
   case Token::floatliteral:
+  case Token::kw_inf:
+  case Token::kw_nan:
 return parseFloatAttr(type, /*isNegative=*/false);
   case Token::integer:
 return parseDecOrHexAttr(type, /*isNegative=*/false);
@@ -128,7 +132,8 @@ Attribute Parser::parseAttribute(Type type) {
 consumeToken(Token::minus);
 if (getToken().is(Token::integer))
   return parseDecOrHexAttr(type, /*isNegative=*/true);
-if (getToken().is(Token::floatliteral))
+if (getToken().is(Token::floatliteral) || getToken().is(Token::kw_inf) ||
+getToken().is(Token::kw_nan))
   return parseFloatAttr(type, /*isNegative=*/true);
 
 return (emitWrongTokenError(
@@ -342,10 +347,8 @@ ParseResult Parser::parseAttributeDict(NamedAttrList &attributes) {
 
 /// Parse a float attribute.
 Attribute Parser::parseFloatAttr(Type type, bool isNegative) {
-  auto val = getToken().getFloatingPointValue();
-  if (!val)
-return (emitError("floating point value too large for attribute"), nullptr);
-  consumeToken(Token::floatliteral);
+  const Token tok = getToken();
+  consumeToken();
   if (!type) {
 // Default to F64 when no type is specified.
 if (!consumeIf(Token::colon))
@@ -353,10 +356,16 @@ Attribute Parser::parseFloatAttr(Type type, bool isNegative) {
 else if (!(type = parseType()))
   return nullptr;
   }
-  if (!isa<FloatType>(type))
+  auto floatType = dyn_cast<FloatType>(type);
+  if (!floatType)
 return (emitError("floating point valu

[llvm-branch-commits] [llvm] [mlir] [mlir][Parser] Add `nan` and `inf` keywords (PR #116176)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-adt

Author: Matthias Springer (matthias-springer)


Changes

Add two new keywords for parsing `nan` / `inf` floating-point literals. This is 
more convenient than writing the integer hexadecimal bit patterns by hand 
(which differ depending on the floating-point type).
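
To illustrate why hand-written bit patterns are inconvenient, here is a minimal standalone sketch (not part of the patch) that prints nothing and only computes the `+inf` pattern for two common types. It assumes the host `float` and `double` are IEEE-754 binary32 and binary64; the helper names `bitsOf`, `f32PositiveInfBits`, `f64PositiveInfBits`, and `checkInfPatterns` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: the host uses IEEE-754 binary32/binary64 for float/double.
static_assert(std::numeric_limits<float>::is_iec559, "IEEE-754 float expected");
static_assert(std::numeric_limits<double>::is_iec559, "IEEE-754 double expected");

// Copy the object representation of a value into an unsigned integer.
template <typename UInt, typename Float>
UInt bitsOf(Float value) {
  static_assert(sizeof(UInt) == sizeof(Float), "size mismatch");
  UInt bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Hex pattern one would otherwise write by hand for +inf of a 32-bit float.
std::uint32_t f32PositiveInfBits() {
  return bitsOf<std::uint32_t>(std::numeric_limits<float>::infinity());
}

// Hex pattern for +inf of a 64-bit float; it differs from the 32-bit one.
std::uint64_t f64PositiveInfBits() {
  return bitsOf<std::uint64_t>(std::numeric_limits<double>::infinity());
}

// Small self-check; call it from main() or a unit test.
void checkInfPatterns() {
  assert(f32PositiveInfBits() == 0x7F800000u);
  assert(f64PositiveInfBits() == 0x7FF0000000000000ull);
}
```

The same value needs a different hexadecimal literal for each width, which is the bookkeeping a type-directed `inf` keyword avoids.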

Note: The printer still prints `nan` / `inf` literals as integer hexadecimals. 
That's because there can be multiple `nan` / `inf` bit patterns. When parsing a 
`nan` / `inf` keyword, the exact bit pattern is unspecified: we use whatever 
`APFloat::getInf`/`APFloat::getNaN` returns.
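
As a rough illustration of the "multiple bit patterns" point, the following standalone sketch (not taken from the patch) classifies 32-bit patterns by their IEEE-754 fields. It assumes the binary32 layout of 1 sign bit, 8 exponent bits, and 23 mantissa bits; `floatFromBits`, `isNanByFields`, and `isNanByLibrary` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float (assumes IEEE-754 binary32).
float floatFromBits(std::uint32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// A binary32 value is NaN when the exponent field is all ones and the
// mantissa field is non-zero; the sign bit and mantissa payload are free.
bool isNanByFields(std::uint32_t bits) {
  const std::uint32_t exponentMask = 0x7F800000u;
  const std::uint32_t mantissaMask = 0x007FFFFFu;
  return (bits & exponentMask) == exponentMask && (bits & mantissaMask) != 0;
}

// Cross-check against the standard library's classification.
bool isNanByLibrary(std::uint32_t bits) {
  return std::isnan(floatFromBits(bits));
}
```

For example, `0x7FC00000`, `0x7FC00001`, and `0xFFC00000` are all NaN under this rule, while `0x7F800000` is `+inf`. A printer that emits the hexadecimal form round-trips the exact payload, whereas a bare `nan` keyword has to pick one pattern.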

Depends on #116172.


---
Full diff: https://github.com/llvm/llvm-project/pull/116176.diff


8 Files Affected:

- (modified) llvm/include/llvm/ADT/APFloat.h (+2) 
- (modified) llvm/lib/Support/APFloat.cpp (+9) 
- (modified) mlir/lib/AsmParser/AttributeParser.cpp (+21-9) 
- (modified) mlir/lib/AsmParser/Parser.cpp (+27-2) 
- (modified) mlir/lib/AsmParser/TokenKinds.def (+2) 
- (modified) mlir/test/Dialect/Arith/canonicalize.mlir (+5-5) 
- (modified) mlir/test/IR/attribute.mlir (+54) 
- (modified) mlir/test/mlir-cpu-runner/math-polynomial-approx.mlir (+18-18) 
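
The two predicates added to `APFloat.cpp` in the diff can be modeled in isolation as follows. This is only a sketch: `NonFiniteBehavior` and `Semantics` are local stand-ins for LLVM's `fltNonfiniteBehavior` and `fltSemantics`, and `hasInf` / `hasNan` are hypothetical names mirroring `semanticsHasInf` / `semanticsHasNan`.

```cpp
#include <cassert>

// Local stand-in for llvm::fltNonfiniteBehavior (assumption: three modes).
enum class NonFiniteBehavior { IEEE754, NanOnly, FiniteOnly };

// Local stand-in for the one fltSemantics field the predicates read.
struct Semantics {
  NonFiniteBehavior nonFiniteBehavior;
};

// Infinity exists unless the format has only NaN or only finite values.
bool hasInf(const Semantics &semantics) {
  return semantics.nonFiniteBehavior != NonFiniteBehavior::NanOnly &&
         semantics.nonFiniteBehavior != NonFiniteBehavior::FiniteOnly;
}

// NaN exists unless the format has only finite values.
bool hasNan(const Semantics &semantics) {
  return semantics.nonFiniteBehavior != NonFiniteBehavior::FiniteOnly;
}
```

A parser could consult such predicates to reject an `inf` or `nan` keyword for a float type that cannot represent it, rather than silently producing a different value.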


``diff
diff --git a/llvm/include/llvm/ADT/APFloat.h b/llvm/include/llvm/ADT/APFloat.h
index 97547fb577e0ec..40ad7ba92552ed 100644
--- a/llvm/include/llvm/ADT/APFloat.h
+++ b/llvm/include/llvm/ADT/APFloat.h
@@ -311,6 +311,8 @@ struct APFloatBase {
   static unsigned int semanticsIntSizeInBits(const fltSemantics&, bool);
   static bool semanticsHasZero(const fltSemantics &);
   static bool semanticsHasSignedRepr(const fltSemantics &);
+  static bool semanticsHasInf(const fltSemantics &);
+  static bool semanticsHasNan(const fltSemantics &);
 
   // Returns true if any number described by \p Src can be precisely represented
   // by a normal (not subnormal) value in \p Dst.
diff --git a/llvm/lib/Support/APFloat.cpp b/llvm/lib/Support/APFloat.cpp
index c566d489d11b03..8b9d9af2ca65b3 100644
--- a/llvm/lib/Support/APFloat.cpp
+++ b/llvm/lib/Support/APFloat.cpp
@@ -375,6 +375,15 @@ bool APFloatBase::semanticsHasSignedRepr(const fltSemantics &semantics) {
   return semantics.hasSignedRepr;
 }
 
+bool APFloatBase::semanticsHasInf(const fltSemantics &semantics) {
+  return semantics.nonFiniteBehavior != fltNonfiniteBehavior::NanOnly
+  && semantics.nonFiniteBehavior != fltNonfiniteBehavior::FiniteOnly;
+}
+
+bool APFloatBase::semanticsHasNan(const fltSemantics &semantics) {
+  return semantics.nonFiniteBehavior != fltNonfiniteBehavior::FiniteOnly;
+}
+
 bool APFloatBase::isRepresentableAsNormalIn(const fltSemantics &Src,
 const fltSemantics &Dst) {
   // Exponent range must be larger.
diff --git a/mlir/lib/AsmParser/AttributeParser.cpp b/mlir/lib/AsmParser/AttributeParser.cpp
index 9ebada076cd042..68da950f09e568 100644
--- a/mlir/lib/AsmParser/AttributeParser.cpp
+++ b/mlir/lib/AsmParser/AttributeParser.cpp
@@ -21,8 +21,10 @@
 #include "mlir/IR/DialectImplementation.h"
 #include "mlir/IR/DialectResourceBlobManager.h"
 #include "mlir/IR/IntegerSet.h"
+#include "llvm/ADT/APFloat.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/Endian.h"
+#include 
 #include 
 
 using namespace mlir;
@@ -121,6 +123,8 @@ Attribute Parser::parseAttribute(Type type) {
 
   // Parse floating point and integer attributes.
   case Token::floatliteral:
+  case Token::kw_inf:
+  case Token::kw_nan:
 return parseFloatAttr(type, /*isNegative=*/false);
   case Token::integer:
 return parseDecOrHexAttr(type, /*isNegative=*/false);
@@ -128,7 +132,8 @@ Attribute Parser::parseAttribute(Type type) {
 consumeToken(Token::minus);
 if (getToken().is(Token::integer))
   return parseDecOrHexAttr(type, /*isNegative=*/true);
-if (getToken().is(Token::floatliteral))
+if (getToken().is(Token::floatliteral) || getToken().is(Token::kw_inf) ||
+getToken().is(Token::kw_nan))
   return parseFloatAttr(type, /*isNegative=*/true);
 
 return (emitWrongTokenError(
@@ -342,10 +347,8 @@ ParseResult Parser::parseAttributeDict(NamedAttrList &attributes) {
 
 /// Parse a float attribute.
 Attribute Parser::parseFloatAttr(Type type, bool isNegative) {
-  auto val = getToken().getFloatingPointValue();
-  if (!val)
-return (emitError("floating point value too large for attribute"), nullptr);
-  consumeToken(Token::floatliteral);
+  const Token tok = getToken();
+  consumeToken();
   if (!type) {
 // Default to F64 when no type is specified.
 if (!consumeIf(Token::colon))
@@ -353,10 +356,16 @@ Attribute Parser::parseFloatAttr(Type type, bool isNegative) {
 else if (!(type = parseType()))
   return nullptr;
   }
-  if (!isa<FloatType>(type))
+  auto floatType = dyn_cast<FloatType>(type);
+  if (!floatType)
 return (emitError("floating point value not valid for specified type"),
 nullptr);
-  return FloatAttr::get(type, isNegative ? -*val : *val);
+  auto emitErrorAtTok = [&]() { return emitError(tok.g

[llvm-branch-commits] [llvm] [mlir] [mlir][Parser] Add `nan` and `inf` keywords (PR #116176)

2024-11-13 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-llvm-support

@llvm/pr-subscribers-mlir

Author: Matthias Springer (matthias-springer)


Changes

Add two new keywords for parsing `nan` / `inf` floating-point literals. This is 
more convenient than writing the integer hexadecimal bit patterns by hand 
(which differ depending on the floating-point type).

Note: The printer still prints `nan` / `inf` literals as integer hexadecimals. 
That's because there can be multiple `nan` / `inf` bit patterns. When parsing a 
`nan` / `inf` keyword, the exact bit pattern is unspecified: we use whatever 
`APFloat::getInf`/`APFloat::getNaN` returns.

Depends on #116172.


---
Full diff: https://github.com/llvm/llvm-project/pull/116176.diff


8 Files Affected:

- (modified) llvm/include/llvm/ADT/APFloat.h (+2) 
- (modified) llvm/lib/Support/APFloat.cpp (+9) 
- (modified) mlir/lib/AsmParser/AttributeParser.cpp (+21-9) 
- (modified) mlir/lib/AsmParser/Parser.cpp (+27-2) 
- (modified) mlir/lib/AsmParser/TokenKinds.def (+2) 
- (modified) mlir/test/Dialect/Arith/canonicalize.mlir (+5-5) 
- (modified) mlir/test/IR/attribute.mlir (+54) 
- (modified) mlir/test/mlir-cpu-runner/math-polynomial-approx.mlir (+18-18) 


``diff
diff --git a/llvm/include/llvm/ADT/APFloat.h b/llvm/include/llvm/ADT/APFloat.h
index 97547fb577e0ec..40ad7ba92552ed 100644
--- a/llvm/include/llvm/ADT/APFloat.h
+++ b/llvm/include/llvm/ADT/APFloat.h
@@ -311,6 +311,8 @@ struct APFloatBase {
   static unsigned int semanticsIntSizeInBits(const fltSemantics&, bool);
   static bool semanticsHasZero(const fltSemantics &);
   static bool semanticsHasSignedRepr(const fltSemantics &);
+  static bool semanticsHasInf(const fltSemantics &);
+  static bool semanticsHasNan(const fltSemantics &);
 
   // Returns true if any number described by \p Src can be precisely represented
   // by a normal (not subnormal) value in \p Dst.
diff --git a/llvm/lib/Support/APFloat.cpp b/llvm/lib/Support/APFloat.cpp
index c566d489d11b03..8b9d9af2ca65b3 100644
--- a/llvm/lib/Support/APFloat.cpp
+++ b/llvm/lib/Support/APFloat.cpp
@@ -375,6 +375,15 @@ bool APFloatBase::semanticsHasSignedRepr(const fltSemantics &semantics) {
   return semantics.hasSignedRepr;
 }
 
+bool APFloatBase::semanticsHasInf(const fltSemantics &semantics) {
+  return semantics.nonFiniteBehavior != fltNonfiniteBehavior::NanOnly
+  && semantics.nonFiniteBehavior != fltNonfiniteBehavior::FiniteOnly;
+}
+
+bool APFloatBase::semanticsHasNan(const fltSemantics &semantics) {
+  return semantics.nonFiniteBehavior != fltNonfiniteBehavior::FiniteOnly;
+}
+
 bool APFloatBase::isRepresentableAsNormalIn(const fltSemantics &Src,
 const fltSemantics &Dst) {
   // Exponent range must be larger.
diff --git a/mlir/lib/AsmParser/AttributeParser.cpp b/mlir/lib/AsmParser/AttributeParser.cpp
index 9ebada076cd042..68da950f09e568 100644
--- a/mlir/lib/AsmParser/AttributeParser.cpp
+++ b/mlir/lib/AsmParser/AttributeParser.cpp
@@ -21,8 +21,10 @@
 #include "mlir/IR/DialectImplementation.h"
 #include "mlir/IR/DialectResourceBlobManager.h"
 #include "mlir/IR/IntegerSet.h"
+#include "llvm/ADT/APFloat.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/Endian.h"
+#include 
 #include 
 
 using namespace mlir;
@@ -121,6 +123,8 @@ Attribute Parser::parseAttribute(Type type) {
 
   // Parse floating point and integer attributes.
   case Token::floatliteral:
+  case Token::kw_inf:
+  case Token::kw_nan:
 return parseFloatAttr(type, /*isNegative=*/false);
   case Token::integer:
 return parseDecOrHexAttr(type, /*isNegative=*/false);
@@ -128,7 +132,8 @@ Attribute Parser::parseAttribute(Type type) {
 consumeToken(Token::minus);
 if (getToken().is(Token::integer))
   return parseDecOrHexAttr(type, /*isNegative=*/true);
-if (getToken().is(Token::floatliteral))
+if (getToken().is(Token::floatliteral) || getToken().is(Token::kw_inf) ||
+getToken().is(Token::kw_nan))
   return parseFloatAttr(type, /*isNegative=*/true);
 
 return (emitWrongTokenError(
@@ -342,10 +347,8 @@ ParseResult Parser::parseAttributeDict(NamedAttrList &attributes) {
 
 /// Parse a float attribute.
 Attribute Parser::parseFloatAttr(Type type, bool isNegative) {
-  auto val = getToken().getFloatingPointValue();
-  if (!val)
-return (emitError("floating point value too large for attribute"), nullptr);
-  consumeToken(Token::floatliteral);
+  const Token tok = getToken();
+  consumeToken();
   if (!type) {
 // Default to F64 when no type is specified.
 if (!consumeIf(Token::colon))
@@ -353,10 +356,16 @@ Attribute Parser::parseFloatAttr(Type type, bool isNegative) {
 else if (!(type = parseType()))
   return nullptr;
   }
-  if (!isa<FloatType>(type))
+  auto floatType = dyn_cast<FloatType>(type);
+  if (!floatType)
 return (emitError("floating point value not valid for specified type"),
 nullptr);
-  return FloatAttr::get(type, isNegative ? -*val : *val);
+  auto emitErrorAtTok =