[llvm-branch-commits] [mlir] [mlir][bufferization] Remove remaining dialect conversion-based infra parts (PR #114155)
https://github.com/javedabsar1 edited https://github.com/llvm/llvm-project/pull/114155 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][bufferization] Remove remaining dialect conversion-based infra parts (PR #114155)
@@ -86,18 +86,13 @@ getOrCreateFuncAnalysisState(OneShotAnalysisState &state) {
   return state.addExtension();
 }
 
-/// Return the unique ReturnOp that terminates `funcOp`.
-/// Return nullptr if there is no such unique ReturnOp.
-static func::ReturnOp getAssumedUniqueReturnOp(func::FuncOp funcOp) {
-  func::ReturnOp returnOp;
-  for (Block &b : funcOp.getBody()) {
-    if (auto candidateOp = dyn_cast<func::ReturnOp>(b.getTerminator())) {
-      if (returnOp)
-        return nullptr;
-      returnOp = candidateOp;
-    }
-  }
-  return returnOp;
+/// Return all top-level func.return ops in the given function.

javedabsar1 wrote:

Wasn't this `getReturnOps` part of another diff? Just confused and asking.

https://github.com/llvm/llvm-project/pull/114155
[llvm-branch-commits] [openmp] release/19.x: [OpenMP] Create versioned libgomp softlinks (#112973) (PR #115944)
https://github.com/shiltian approved this pull request. https://github.com/llvm/llvm-project/pull/115944
[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] LLVM IR translation of host_eval (PR #116052)
llvmbot wrote:

@llvm/pr-subscribers-flang-fir-hlfir

Author: Sergio Afonso (skatrak)

Changes

This patch adds support for processing the `host_eval` clause of `omp.target` to populate default and runtime kernel launch attributes. Specifically, these relate to the `num_teams`, `thread_limit` and `num_threads` clauses attached to operations nested inside of `omp.target`. As a result, the `thread_limit` clause of `omp.target` is also supported.

The implementation of `initTargetDefaultAttrs()` is intended to reflect clang's own processing of multiple constructs and clauses in order to define a default number of teams and threads to be used as kernel attributes and to populate global variables in the target device module.

One side effect of this change is that it is no longer possible to translate target device MLIR modules to LLVM IR unless they have a supported target triple. This is because the local `getGridValue()` function in the `OpenMPIRBuilder` only works for certain architectures, and it is called whenever the maximum number of threads has not been explicitly defined. This limitation also matches clang.

Evaluating the collapsed loop trip count of target SPMD kernels remains unsupported.
--- Patch is 37.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116052.diff 18 Files Affected: - (modified) flang/test/Integration/OpenMP/target-filtering.f90 (+1-1) - (modified) flang/test/Lower/OpenMP/function-filtering-2.f90 (+3-3) - (modified) flang/test/Lower/OpenMP/function-filtering-3.f90 (+3-3) - (modified) flang/test/Lower/OpenMP/function-filtering.f90 (+3-3) - (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+229-16) - (modified) mlir/test/Target/LLVMIR/omptarget-byref-bycopy-generation-device.mlir (+2-2) - (modified) mlir/test/Target/LLVMIR/omptarget-constant-alloca-raise.mlir (+2-2) - (modified) mlir/test/Target/LLVMIR/omptarget-constant-indexing-device-region.mlir (+2-2) - (modified) mlir/test/Target/LLVMIR/omptarget-debug.mlir (+1-1) - (modified) mlir/test/Target/LLVMIR/omptarget-declare-target-llvm-device.mlir (+1-1) - (modified) mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir (+2-2) - (modified) mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir (+3-3) - (modified) mlir/test/Target/LLVMIR/omptarget-target-inside-task.mlir (+2-2) - (added) mlir/test/Target/LLVMIR/openmp-target-launch-device.mlir (+43) - (added) mlir/test/Target/LLVMIR/openmp-target-launch-host.mlir (+31) - (modified) mlir/test/Target/LLVMIR/openmp-target-use-device-nested.mlir (+2-2) - (modified) mlir/test/Target/LLVMIR/openmp-task-target-device.mlir (+1-1) - (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (+13-14) ``diff diff --git a/flang/test/Integration/OpenMP/target-filtering.f90 b/flang/test/Integration/OpenMP/target-filtering.f90 index d1ab1b47e580d4..699c1040d91f9c 100644 --- a/flang/test/Integration/OpenMP/target-filtering.f90 +++ b/flang/test/Integration/OpenMP/target-filtering.f90 @@ -7,7 +7,7 @@ !===--===! 
!RUN: %flang_fc1 -emit-llvm -fopenmp %s -o - | FileCheck %s --check-prefixes HOST,ALL -!RUN: %flang_fc1 -emit-llvm -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s --check-prefixes DEVICE,ALL +!RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -emit-llvm -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s --check-prefixes DEVICE,ALL !HOST: define {{.*}}@{{.*}}before{{.*}}( !DEVICE-NOT: define {{.*}}@before{{.*}}( diff --git a/flang/test/Lower/OpenMP/function-filtering-2.f90 b/flang/test/Lower/OpenMP/function-filtering-2.f90 index 0c02aa223820e7..a2c5e29cfdcbf6 100644 --- a/flang/test/Lower/OpenMP/function-filtering-2.f90 +++ b/flang/test/Lower/OpenMP/function-filtering-2.f90 @@ -1,9 +1,9 @@ ! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -flang-experimental-hlfir -emit-llvm %s -o - | FileCheck --check-prefixes=LLVM,LLVM-HOST %s ! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -emit-hlfir %s -o - | FileCheck --check-prefix=MLIR %s -! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -flang-experimental-hlfir -emit-llvm %s -o - | FileCheck --check-prefixes=LLVM,LLVM-DEVICE %s -! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -emit-hlfir %s -o - | FileCheck --check-prefix=MLIR %s +! RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -flang-experimental-hlfir -emit-llvm %s -o - | FileCheck --check-prefixes=LLVM,LLVM-DEVICE %s +! RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -emit-hlfir %s -o - | FileCheck --check-prefix=MLIR %s ! RUN: bbc -fopenmp -fopenmp-version=52 -emit-hlfir %s -o - | FileCheck --check-prefixes=MLIR-HOST,MLIR-ALL %s -! RUN: bbc -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -emit-hlfir %s -o - | Fi
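The triple requirement described in this PR, and the reason the device RUN lines now pass `-triple amdgcn-amd-amdhsa`, can be pictured with a small standalone sketch. This is not the `OpenMPIRBuilder` code: `resolveMaxThreads`, the `defaultsByArch` table and whatever values a caller stores in it are hypothetical names introduced only for illustration. It assumes that an explicit maximum always wins and that a per-architecture default is consulted only when no maximum was given.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical helper, not part of LLVM. Returns the maximum number of
// threads to use, or std::nullopt when no explicit value was given and the
// architecture named by the triple has no entry in the caller's table.
std::optional<int32_t>
resolveMaxThreads(int32_t explicitMax, const std::string &triple,
                  const std::map<std::string, int32_t> &defaultsByArch) {
  // An explicitly defined maximum never needs the per-architecture lookup.
  if (explicitMax > 0)
    return explicitMax;

  // The architecture is the first dash-separated component of the triple.
  // For a triple without dashes (including an empty one) this is the
  // whole string.
  const std::string arch = triple.substr(0, triple.find('-'));

  const auto it = defaultsByArch.find(arch);
  if (it == defaultsByArch.end())
    return std::nullopt; // unsupported or missing triple: nothing to fall back on
  return it->second;
}
```

Under these assumptions a module without a triple only translates when every kernel specifies its maximum thread count, which is the situation the updated tests avoid by naming a triple.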
[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)
https://github.com/skatrak created https://github.com/llvm/llvm-project/pull/116050

This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure, used to simplify passing default and constant values for the number of teams and threads, and possibly other target kernel-related information in the future. It is used to forward the values passed to `createTarget` on to `createTargetInit`, which previously used an unrelated set of default values.

From 1fcfe48114bdda7be545b6bfaa710b6e639670d3 Mon Sep 17 00:00:00 2001
From: Sergio Afonso
Date: Fri, 8 Nov 2024 15:46:48 +
Subject: [PATCH] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads

This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure, used to simplify passing default and constant values for the number of teams and threads, and possibly other target kernel-related information in the future. It is used to forward the values passed to `createTarget` on to `createTargetInit`, which previously used an unrelated set of default values.
--- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 13 ++-- clang/lib/CodeGen/CGOpenMPRuntime.h | 9 +-- clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp | 9 +-- .../llvm/Frontend/OpenMP/OMPIRBuilder.h | 39 ++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 71 +++ .../Frontend/OpenMPIRBuilderTest.cpp | 29 .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 11 +-- .../LLVMIR/omptarget-region-device-llvm.mlir | 2 +- 8 files changed, 102 insertions(+), 81 deletions(-) diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index d714af035d21a2..0f7a1166227476 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -5880,10 +5880,13 @@ void CGOpenMPRuntime::emitUsesAllocatorsFini(CodeGenFunction &CGF, void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams( const OMPExecutableDirective &D, CodeGenFunction &CGF, -int32_t &MinThreadsVal, int32_t &MaxThreadsVal, int32_t &MinTeamsVal, -int32_t &MaxTeamsVal) { +llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs) { + assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 && + "invalid default attrs structure"); + int32_t &MaxTeamsVal = Attrs.MaxTeams.front(); + int32_t &MaxThreadsVal = Attrs.MaxThreads.front(); - getNumTeamsExprForTargetDirective(CGF, D, MinTeamsVal, MaxTeamsVal); + getNumTeamsExprForTargetDirective(CGF, D, Attrs.MinTeams, MaxTeamsVal); getNumThreadsExprForTargetDirective(CGF, D, MaxThreadsVal, /*UpperBoundOnly=*/true); @@ -5901,12 +5904,12 @@ void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams( else continue; - MinThreadsVal = std::max(MinThreadsVal, AttrMinThreadsVal); + Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreadsVal); if (AttrMaxThreadsVal > 0) MaxThreadsVal = MaxThreadsVal > 0 ? std::min(MaxThreadsVal, AttrMaxThreadsVal) : AttrMaxThreadsVal; - MinTeamsVal = std::max(MinTeamsVal, AttrMinBlocksVal); + Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocksVal); if (AttrMaxBlocksVal > 0) MaxTeamsVal = MaxTeamsVal > 0 ? 
std::min(MaxTeamsVal, AttrMaxBlocksVal) : AttrMaxBlocksVal; diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h b/clang/lib/CodeGen/CGOpenMPRuntime.h index 5e7715743afb58..003395e7f17ded 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.h +++ b/clang/lib/CodeGen/CGOpenMPRuntime.h @@ -312,12 +312,9 @@ class CGOpenMPRuntime { llvm::OpenMPIRBuilder OMPBuilder; /// Helper to determine the min/max number of threads/teams for \p D. - void computeMinAndMaxThreadsAndTeams(const OMPExecutableDirective &D, - CodeGenFunction &CGF, - int32_t &MinThreadsVal, - int32_t &MaxThreadsVal, - int32_t &MinTeamsVal, - int32_t &MaxTeamsVal); + void computeMinAndMaxThreadsAndTeams( + const OMPExecutableDirective &D, CodeGenFunction &CGF, + llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs); /// Helper to emit outlined function for 'target' directive. /// \param D Directive to emit. diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp index 43dc0e62284602..96f8d6c5c08e56 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp @@ -745,14 +745,11 @@ void CGOpenMPRuntimeGPU::emitNonSPMDKernel(const OMPExecutableDirective &D, void CGOpenMPRuntimeGPU::emitKernelInit(const OMPExecutableDirective &D, CodeGenFunction &CGF, EntryFunctionSt
[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)
llvmbot wrote: @llvm/pr-subscribers-clang Author: Sergio Afonso (skatrak) Changes This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure used to simplify passing default and constant values for number of teams and threads, and possibly other target kernel-related information in the future. This is used to forward values passed to `createTarget` to `createTargetInit`, which previously used a default unrelated set of values. --- Patch is 21.80 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116050.diff 8 Files Affected: - (modified) clang/lib/CodeGen/CGOpenMPRuntime.cpp (+8-5) - (modified) clang/lib/CodeGen/CGOpenMPRuntime.h (+3-6) - (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (+3-6) - (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+25-14) - (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+40-31) - (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+16-13) - (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+6-5) - (modified) mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir (+1-1) ``diff diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index d714af035d21a2..0f7a1166227476 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -5880,10 +5880,13 @@ void CGOpenMPRuntime::emitUsesAllocatorsFini(CodeGenFunction &CGF, void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams( const OMPExecutableDirective &D, CodeGenFunction &CGF, -int32_t &MinThreadsVal, int32_t &MaxThreadsVal, int32_t &MinTeamsVal, -int32_t &MaxTeamsVal) { +llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs) { + assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 && + "invalid default attrs structure"); + int32_t &MaxTeamsVal = Attrs.MaxTeams.front(); + int32_t &MaxThreadsVal = Attrs.MaxThreads.front(); - getNumTeamsExprForTargetDirective(CGF, D, MinTeamsVal, 
MaxTeamsVal); + getNumTeamsExprForTargetDirective(CGF, D, Attrs.MinTeams, MaxTeamsVal); getNumThreadsExprForTargetDirective(CGF, D, MaxThreadsVal, /*UpperBoundOnly=*/true); @@ -5901,12 +5904,12 @@ void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams( else continue; - MinThreadsVal = std::max(MinThreadsVal, AttrMinThreadsVal); + Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreadsVal); if (AttrMaxThreadsVal > 0) MaxThreadsVal = MaxThreadsVal > 0 ? std::min(MaxThreadsVal, AttrMaxThreadsVal) : AttrMaxThreadsVal; - MinTeamsVal = std::max(MinTeamsVal, AttrMinBlocksVal); + Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocksVal); if (AttrMaxBlocksVal > 0) MaxTeamsVal = MaxTeamsVal > 0 ? std::min(MaxTeamsVal, AttrMaxBlocksVal) : AttrMaxBlocksVal; diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h b/clang/lib/CodeGen/CGOpenMPRuntime.h index 5e7715743afb58..003395e7f17ded 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.h +++ b/clang/lib/CodeGen/CGOpenMPRuntime.h @@ -312,12 +312,9 @@ class CGOpenMPRuntime { llvm::OpenMPIRBuilder OMPBuilder; /// Helper to determine the min/max number of threads/teams for \p D. - void computeMinAndMaxThreadsAndTeams(const OMPExecutableDirective &D, - CodeGenFunction &CGF, - int32_t &MinThreadsVal, - int32_t &MaxThreadsVal, - int32_t &MinTeamsVal, - int32_t &MaxTeamsVal); + void computeMinAndMaxThreadsAndTeams( + const OMPExecutableDirective &D, CodeGenFunction &CGF, + llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs); /// Helper to emit outlined function for 'target' directive. /// \param D Directive to emit. 
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp index 43dc0e62284602..96f8d6c5c08e56 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp @@ -745,14 +745,11 @@ void CGOpenMPRuntimeGPU::emitNonSPMDKernel(const OMPExecutableDirective &D, void CGOpenMPRuntimeGPU::emitKernelInit(const OMPExecutableDirective &D, CodeGenFunction &CGF, EntryFunctionState &EST, bool IsSPMD) { - int32_t MinThreadsVal = 1, MaxThreadsVal = -1, MinTeamsVal = 1, - MaxTeamsVal = -1; - computeMinAndMaxThreadsAndTeams(D, CGF, MinThreadsVal, MaxThreadsVal, - MinTeamsVal, MaxTeamsVal); + llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs Attrs; + computeMinAndMaxThreadsAndTeams(D, CGF, Attrs); CGBuilderTy &Bld = CGF.Builder;
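The way the patch folds launch-bound attributes into a single attributes object can be sketched outside of clang. The struct below is only a stand-in for `OpenMPIRBuilder::TargetKernelDefaultAttrs`: its name, `KernelDefaultAttrs`, and the helper `mergeLaunchBounds` are hypothetical, and the initial values simply mirror the locals removed from `emitKernelInit` (minimums of 1, maximums of -1 meaning "unset"). The merge itself follows the pattern visible in the `computeMinAndMaxThreadsAndTeams` hunk: minimums only grow, and a maximum is tightened only when the incoming bound is positive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the real structure; field names follow the diff, the type
// name and defaults are assumptions made for this sketch.
struct KernelDefaultAttrs {
  std::vector<int32_t> MaxTeams = {-1};   // -1: no upper bound known yet
  int32_t MinTeams = 1;
  std::vector<int32_t> MaxThreads = {-1}; // -1: no upper bound known yet
  int32_t MinThreads = 1;
};

// Hypothetical helper: merge one set of launch-bound attribute values into
// Attrs, the same way the loop body in the diff does.
void mergeLaunchBounds(KernelDefaultAttrs &Attrs, int32_t AttrMinThreads,
                       int32_t AttrMaxThreads, int32_t AttrMinBlocks,
                       int32_t AttrMaxBlocks) {
  assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 &&
         "invalid default attrs structure");
  int32_t &MaxTeamsVal = Attrs.MaxTeams.front();
  int32_t &MaxThreadsVal = Attrs.MaxThreads.front();

  // Lower bounds: keep the largest minimum seen so far.
  Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreads);
  Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocks);

  // Upper bounds: keep the smallest positive maximum seen so far.
  if (AttrMaxThreads > 0)
    MaxThreadsVal = MaxThreadsVal > 0 ? std::min(MaxThreadsVal, AttrMaxThreads)
                                      : AttrMaxThreads;
  if (AttrMaxBlocks > 0)
    MaxTeamsVal = MaxTeamsVal > 0 ? std::min(MaxTeamsVal, AttrMaxBlocks)
                                  : AttrMaxBlocks;
}
```

Passing one object instead of four `int32_t &` parameters keeps call sites short, which is the simplification the PR description refers to.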
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add host_eval clause to omp.target (PR #116049)
skatrak wrote: PR stack: - #116048 - #116049 - #116050 - #116051 - #116052 https://github.com/llvm/llvm-project/pull/116049
[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] LLVM IR translation of host_eval (PR #116052)
https://github.com/skatrak created https://github.com/llvm/llvm-project/pull/116052

This patch adds support for processing the `host_eval` clause of `omp.target` to populate default and runtime kernel launch attributes. Specifically, these relate to the `num_teams`, `thread_limit` and `num_threads` clauses attached to operations nested inside of `omp.target`. As a result, the `thread_limit` clause of `omp.target` is also supported.

The implementation of `initTargetDefaultAttrs()` is intended to reflect clang's own processing of multiple constructs and clauses in order to define a default number of teams and threads to be used as kernel attributes and to populate global variables in the target device module.

One side effect of this change is that it is no longer possible to translate target device MLIR modules to LLVM IR unless they have a supported target triple. This is because the local `getGridValue()` function in the `OpenMPIRBuilder` only works for certain architectures, and it is called whenever the maximum number of threads has not been explicitly defined. This limitation also matches clang.

Evaluating the collapsed loop trip count of target SPMD kernels remains unsupported.

From 8ff0d3bfc3c4b91987146276c059c9e0affaa788 Mon Sep 17 00:00:00 2001
From: Sergio Afonso
Date: Tue, 12 Nov 2024 10:49:28 +
Subject: [PATCH] [MLIR][OpenMP] LLVM IR translation of host_eval

This patch adds support for processing the `host_eval` clause of `omp.target` to populate default and runtime kernel launch attributes. Specifically, these relate to the `num_teams`, `thread_limit` and `num_threads` clauses attached to operations nested inside of `omp.target`. As a result, the `thread_limit` clause of `omp.target` is also supported.
The implementation of `initTargetDefaultAttrs()` is intended to reflect clang's own processing of multiple constructs and clauses in order to define a default number of teams and threads to be used as kernel attributes and to populate global variables in the target device module.

One side effect of this change is that it is no longer possible to translate target device MLIR modules to LLVM IR unless they have a supported target triple. This is because the local `getGridValue()` function in the `OpenMPIRBuilder` only works for certain architectures, and it is called whenever the maximum number of threads has not been explicitly defined. This limitation also matches clang.

Evaluating the collapsed loop trip count of target SPMD kernels remains unsupported.
---
 .../Integration/OpenMP/target-filtering.f90 | 2 +-
 .../Lower/OpenMP/function-filtering-2.f90 | 6 +-
 .../Lower/OpenMP/function-filtering-3.f90 | 6 +-
 .../test/Lower/OpenMP/function-filtering.f90 | 6 +-
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 245 --
 ...target-byref-bycopy-generation-device.mlir | 4 +-
 .../omptarget-constant-alloca-raise.mlir | 4 +-
 ...arget-constant-indexing-device-region.mlir | 4 +-
 mlir/test/Target/LLVMIR/omptarget-debug.mlir | 2 +-
 .../omptarget-declare-target-llvm-device.mlir | 2 +-
 .../LLVMIR/omptarget-parallel-llvm.mlir | 4 +-
 .../LLVMIR/omptarget-region-device-llvm.mlir | 6 +-
 .../LLVMIR/omptarget-target-inside-task.mlir | 4 +-
 .../LLVMIR/openmp-target-launch-device.mlir | 43 +++
 .../LLVMIR/openmp-target-launch-host.mlir | 31 +++
 .../openmp-target-use-device-nested.mlir | 4 +-
 .../LLVMIR/openmp-task-target-device.mlir | 2 +-
 mlir/test/Target/LLVMIR/openmp-todo.mlir | 27 +-
 18 files changed, 344 insertions(+), 58 deletions(-)
 create mode 100644 mlir/test/Target/LLVMIR/openmp-target-launch-device.mlir
 create mode 100644 mlir/test/Target/LLVMIR/openmp-target-launch-host.mlir

diff --git a/flang/test/Integration/OpenMP/target-filtering.f90
b/flang/test/Integration/OpenMP/target-filtering.f90 index d1ab1b47e580d4..699c1040d91f9c 100644 --- a/flang/test/Integration/OpenMP/target-filtering.f90 +++ b/flang/test/Integration/OpenMP/target-filtering.f90 @@ -7,7 +7,7 @@ !===--===! !RUN: %flang_fc1 -emit-llvm -fopenmp %s -o - | FileCheck %s --check-prefixes HOST,ALL -!RUN: %flang_fc1 -emit-llvm -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s --check-prefixes DEVICE,ALL +!RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -emit-llvm -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s --check-prefixes DEVICE,ALL !HOST: define {{.*}}@{{.*}}before{{.*}}( !DEVICE-NOT: define {{.*}}@before{{.*}}( diff --git a/flang/test/Lower/OpenMP/function-filtering-2.f90 b/flang/test/Lower/OpenMP/function-filtering-2.f90 index 0c02aa223820e7..a2c5e29cfdcbf6 100644 --- a/flang/test/Lower/OpenMP/function-filtering-2.f90 +++ b/flang/test/Lower/OpenMP/function-filtering-2.f90 @@ -1,9 +1,9 @@ ! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -flang-experimental-hlfir -emit-llvm %s -o - | FileCheck --
[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)
https://github.com/skatrak created https://github.com/llvm/llvm-project/pull/116051

This patch introduces a `TargetKernelRuntimeAttrs` structure to hold host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count values passed to the runtime kernel offloading call. Additionally, `createTarget` is extended to take an `IsSPMD` flag, used to influence target device code generation.

From cc5c5cc8b1c8b718ae3d0aece3784416460114bc Mon Sep 17 00:00:00 2001
From: Sergio Afonso
Date: Fri, 8 Nov 2024 17:24:47 +
Subject: [PATCH] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode

This patch introduces a `TargetKernelRuntimeAttrs` structure to hold host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count values passed to the runtime kernel offloading call. Additionally, `createTarget` is extended to take an `IsSPMD` flag, used to influence target device code generation.
---
 .../llvm/Frontend/OpenMP/OMPIRBuilder.h | 26 +-
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 137 +++--
 .../Frontend/OpenMPIRBuilderTest.cpp | 281 +-
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 10 +-
 4 files changed, 420 insertions(+), 34 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
index da450ef5adbc14..a85f41e586c514 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -2237,6 +2237,26 @@ class OpenMPIRBuilder {
     int32_t MinThreads = 1;
   };
 
+  /// Container to pass LLVM IR runtime values or constants related to the
+  /// number of teams and threads with which the kernel must be launched, as
+  /// well as the trip count of the SPMD loop, if it is an SPMD kernel. These
+  /// must be defined in the host prior to the call to the kernel launch OpenMP
+  /// RTL function.
+ struct TargetKernelRuntimeAttrs { +SmallVector MaxTeams = {nullptr}; +Value *MinTeams = nullptr; +SmallVector TargetThreadLimit = {nullptr}; +SmallVector TeamsThreadLimit = {nullptr}; + +/// 'parallel' construct 'num_threads' clause value, if present and it is a +/// target SPMD kernel. +Value *MaxThreads = nullptr; + +/// Total number of iterations of the target SPMD kernel or null if it is a +/// generic kernel. +Value *LoopTripCount = nullptr; + }; + /// Data structure that contains the needed information to construct the /// kernel args vector. struct TargetKernelArgs { @@ -2905,11 +2925,14 @@ class OpenMPIRBuilder { /// /// \param Loc where the target data construct was encountered. /// \param IsOffloadEntry whether it is an offload entry. + /// \param IsSPMD whether it is a target SPMD kernel. /// \param CodeGenIP The insertion point where the call to the outlined /// function should be emitted. /// \param EntryInfo The entry information about the function. /// \param DefaultAttrs Structure containing the default numbers of threads ///and teams to launch the kernel with. + /// \param RuntimeAttrs Structure containing the runtime numbers of threads + ///and teams to launch the kernel with. /// \param Inputs The input values to the region that will be passed. /// as arguments to the outlined function. /// \param BodyGenCB Callback that will generate the region code. @@ -2919,11 +2942,12 @@ class OpenMPIRBuilder { // dependency information as passed in the depend clause // \param HasNowait Whether the target construct has a `nowait` clause or not. 
InsertPointOrErrorTy createTarget( - const LocationDescription &Loc, bool IsOffloadEntry, + const LocationDescription &Loc, bool IsOffloadEntry, bool IsSPMD, OpenMPIRBuilder::InsertPointTy AllocaIP, OpenMPIRBuilder::InsertPointTy CodeGenIP, TargetRegionEntryInfo &EntryInfo, const TargetKernelDefaultAttrs &DefaultAttrs, + const TargetKernelRuntimeAttrs &RuntimeAttrs, SmallVectorImpl &Inputs, GenMapInfoCallbackTy GenMapInfoCB, TargetBodyGenCallbackTy BodyGenCB, TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB, diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index 302d363965c940..f847f60386df85 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -6727,8 +6727,43 @@ FunctionCallee OpenMPIRBuilder::createDispatchDeinitFunction() { return getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_dispatch_deinit); } +static void emitUsed(StringRef Name, std::vector &List, + Module &M) { + if (List.empty()) +return; + + Type *PtrTy = PointerType::get(M.getContext(), /*AddressSpace=*/0); + + // Convert List to what ConstantArray needs. + SmallVector UsedArray; + UsedArray.reserve(List.size()); + for (auto Item : List) +UsedArray.push_back(ConstantExpr::getPointer
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add host_eval clause to omp.target (PR #116049)
llvmbot wrote: @llvm/pr-subscribers-mlir-llvm Author: Sergio Afonso (skatrak) Changes This patch adds the `host_eval` clause to the `omp.target` operation. Additionally, it updates its op verifier to make sure all uses of block arguments defined by this clause fall within one of the few cases where they are allowed. MLIR to LLVM IR translation fails on translation of this clause with a not-yet-implemented error. --- Patch is 20.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116049.diff 7 Files Affected: - (modified) mlir/docs/Dialects/OpenMPDialect/_index.md (+55) - (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+26-7) - (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+163-4) - (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+5) - (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+69-1) - (modified) mlir/test/Dialect/OpenMP/ops.mlir (+37-1) - (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (+14) ``diff diff --git a/mlir/docs/Dialects/OpenMPDialect/_index.md b/mlir/docs/Dialects/OpenMPDialect/_index.md index 4e5d777d6c4f7f..e0dd3f598e84b6 100644 --- a/mlir/docs/Dialects/OpenMPDialect/_index.md +++ b/mlir/docs/Dialects/OpenMPDialect/_index.md @@ -523,3 +523,58 @@ omp.parallel ... { omp.terminator } {omp.composite} ``` + +## Host-Evaluated Clauses in Target Regions + +The `omp.target` operation, which represents the OpenMP `target` construct, is +marked with the `IsolatedFromAbove` trait. This means that, inside of its +region, no MLIR values defined outside of the op itself can be used. This is +consistent with the OpenMP specification of the `target` construct, which +mandates that all host device values used inside of the `target` region must +either be privatized (data-sharing) or mapped (data-mapping). + +Normally, clauses applied to a construct are evaluated before entering that +construct. 
Further, in some cases, the OpenMP specification stipulates that +clauses be evaluated _on the host device_ on entry to a parent `target` +construct. In particular, the `num_teams` and `thread_limit` clauses of the +`teams` construct must be evaluated on the host device if it's nested inside or +combined with a `target` construct. + +Additionally, the runtime library targeted by the MLIR to LLVM IR translation of +the OpenMP dialect supports the optimized launch of SPMD kernels (i.e. +`target teams distribute parallel {do,for}` in OpenMP), which requires +specifying in advance what the total trip count of the loop is. Consequently, it +is also beneficial to evaluate the trip count on the host device prior to the +kernel launch. + +These host-evaluated values in MLIR would need to be placed outside of the +`omp.target` region and also attached to the corresponding nested operations, +which is not possible because of the `IsolatedFromAbove` trait. The solution +implemented to address this problem has been to introduce the `host_eval` +argument to the `omp.target` operation. It works similarly to a `map` clause, +but its only intended use is to forward host-evaluated values to their +corresponding operation inside of the region. Any uses outside of the previously +described result in a verifier error. + +```mlir +// Initialize %0, %1, %2, %3... +omp.target host_eval(%0 -> %nt, %1 -> %lb, %2 -> %ub, %3 -> %step : i32, i32, i32, i32) { + omp.teams num_teams(to %nt : i32) { +omp.parallel { + omp.distribute { +omp.wsloop { + omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) { +// ... 
+omp.yield + } + omp.terminator +} {omp.composite} +omp.terminator + } {omp.composite} + omp.terminator +} {omp.composite} +omp.terminator + } + omp.terminator +} +``` diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index a0da3db124d1f4..a99da1f0294d08 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -1166,9 +1166,10 @@ def TargetOp : OpenMP_Op<"target", traits = [ ], clauses = [ // TODO: Complete clause list (defaultmap, uses_allocators). OpenMP_AllocateClause, OpenMP_DependClause, OpenMP_DeviceClause, -OpenMP_HasDeviceAddrClause, OpenMP_IfClause, OpenMP_InReductionClause, -OpenMP_IsDevicePtrClause, OpenMP_MapClauseSkip, -OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause +OpenMP_HasDeviceAddrClause, OpenMP_HostEvalClause, OpenMP_IfClause, +OpenMP_InReductionClause, OpenMP_IsDevicePtrClause, +OpenMP_MapClauseSkip, OpenMP_NowaitClause, +OpenMP_PrivateClause, OpenMP_ThreadLimitClause ], singleRegion = true> { let summary = "target construct"; let description = [{ @@ -1186,16 +1187,34 @@ def TargetOp : OpenMP_Op<"target", traits = [ let extraClassDeclaration = [{ unsigned numMapBlockArgs() {
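The documentation added by this patch motivates evaluating the loop trip count on the host before an SPMD kernel is launched. As a rough illustration of the quantity involved, the sketch below computes the trip count of one loop and of a collapsed nest in plain C++. It is not code from the patch series: `LoopBounds`, `tripCount` and `collapsedTripCount` are hypothetical names, and the sketch assumes an exclusive upper bound, a positive step, and no overflow in the products.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one loop: iterate from lb up to, but not
// including, ub in increments of step.
struct LoopBounds {
  int64_t lb;
  int64_t ub;
  int64_t step;
};

// Number of iterations of a single loop. Loops with a non-positive step or
// an empty range run zero times in this simplified model.
uint64_t tripCount(const LoopBounds &l) {
  if (l.step <= 0 || l.lb >= l.ub)
    return 0;
  // Ceiling division of the range length by the step.
  return static_cast<uint64_t>((l.ub - l.lb + l.step - 1) / l.step);
}

// Total iterations of a collapsed loop nest: the product of the per-loop
// trip counts. An empty nest yields 1 (the empty product).
uint64_t collapsedTripCount(const std::vector<LoopBounds> &nest) {
  uint64_t total = 1;
  for (const LoopBounds &l : nest)
    total *= tripCount(l);
  return total;
}
```

In terms of the MLIR example above, the values bound to `%lb`, `%ub` and `%step` through `host_eval` are exactly the inputs such a computation would need on the host side.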
[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)
llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Sergio Afonso (skatrak) Changes This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure used to simplify passing default and constant values for number of teams and threads, and possibly other target kernel-related information in the future. This is used to forward values passed to `createTarget` to `createTargetInit`, which previously used a default unrelated set of values. --- Patch is 21.80 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116050.diff 8 Files Affected: - (modified) clang/lib/CodeGen/CGOpenMPRuntime.cpp (+8-5) - (modified) clang/lib/CodeGen/CGOpenMPRuntime.h (+3-6) - (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (+3-6) - (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+25-14) - (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+40-31) - (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+16-13) - (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+6-5) - (modified) mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir (+1-1) ``diff diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index d714af035d21a2..0f7a1166227476 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -5880,10 +5880,13 @@ void CGOpenMPRuntime::emitUsesAllocatorsFini(CodeGenFunction &CGF, void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams( const OMPExecutableDirective &D, CodeGenFunction &CGF, -int32_t &MinThreadsVal, int32_t &MaxThreadsVal, int32_t &MinTeamsVal, -int32_t &MaxTeamsVal) { +llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs) { + assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 && + "invalid default attrs structure"); + int32_t &MaxTeamsVal = Attrs.MaxTeams.front(); + int32_t &MaxThreadsVal = Attrs.MaxThreads.front(); - getNumTeamsExprForTargetDirective(CGF, D, MinTeamsVal, 
MaxTeamsVal); + getNumTeamsExprForTargetDirective(CGF, D, Attrs.MinTeams, MaxTeamsVal); getNumThreadsExprForTargetDirective(CGF, D, MaxThreadsVal, /*UpperBoundOnly=*/true); @@ -5901,12 +5904,12 @@ void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams( else continue; - MinThreadsVal = std::max(MinThreadsVal, AttrMinThreadsVal); + Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreadsVal); if (AttrMaxThreadsVal > 0) MaxThreadsVal = MaxThreadsVal > 0 ? std::min(MaxThreadsVal, AttrMaxThreadsVal) : AttrMaxThreadsVal; - MinTeamsVal = std::max(MinTeamsVal, AttrMinBlocksVal); + Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocksVal); if (AttrMaxBlocksVal > 0) MaxTeamsVal = MaxTeamsVal > 0 ? std::min(MaxTeamsVal, AttrMaxBlocksVal) : AttrMaxBlocksVal; diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h b/clang/lib/CodeGen/CGOpenMPRuntime.h index 5e7715743afb58..003395e7f17ded 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.h +++ b/clang/lib/CodeGen/CGOpenMPRuntime.h @@ -312,12 +312,9 @@ class CGOpenMPRuntime { llvm::OpenMPIRBuilder OMPBuilder; /// Helper to determine the min/max number of threads/teams for \p D. - void computeMinAndMaxThreadsAndTeams(const OMPExecutableDirective &D, - CodeGenFunction &CGF, - int32_t &MinThreadsVal, - int32_t &MaxThreadsVal, - int32_t &MinTeamsVal, - int32_t &MaxTeamsVal); + void computeMinAndMaxThreadsAndTeams( + const OMPExecutableDirective &D, CodeGenFunction &CGF, + llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs); /// Helper to emit outlined function for 'target' directive. /// \param D Directive to emit. 
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp index 43dc0e62284602..96f8d6c5c08e56 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp @@ -745,14 +745,11 @@ void CGOpenMPRuntimeGPU::emitNonSPMDKernel(const OMPExecutableDirective &D, void CGOpenMPRuntimeGPU::emitKernelInit(const OMPExecutableDirective &D, CodeGenFunction &CGF, EntryFunctionState &EST, bool IsSPMD) { - int32_t MinThreadsVal = 1, MaxThreadsVal = -1, MinTeamsVal = 1, - MaxTeamsVal = -1; - computeMinAndMaxThreadsAndTeams(D, CGF, MinThreadsVal, MaxThreadsVal, - MinTeamsVal, MaxTeamsVal); + llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs Attrs; + computeMinAndMaxThreadsAndTeams(D, CGF, Attrs); CGBuilderTy &Bld = CGF.Bu
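As a rough illustration of the refactoring quoted above, the following self-contained sketch replaces four `int32_t &` out-parameters with a single structure and applies the same min/max tightening pattern seen in `computeMinAndMaxThreadsAndTeams`. It is not the LLVM code: `KernelDefaultAttrs`, `AttrBounds` and `applyAttrBounds` are hypothetical names, `std::vector` stands in for `SmallVector`, and the `-1`/`1` defaults are assumed from the local variables that the patch removes from `emitKernelInit`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for OpenMPIRBuilder::TargetKernelDefaultAttrs.
// Field names follow the quoted diff; the defaults are an assumption based
// on the removed locals (MinThreadsVal = 1, MaxThreadsVal = -1, ...).
struct KernelDefaultAttrs {
  std::vector<int32_t> MaxTeams = {-1};
  int32_t MinTeams = 1;
  std::vector<int32_t> MaxThreads = {-1};
  int32_t MinThreads = 1;
};

// Hypothetical bounds requested by a function attribute.
// A value <= 0 means "not specified".
struct AttrBounds {
  int32_t MinThreads = 0;
  int32_t MaxThreads = 0;
  int32_t MinBlocks = 0;
  int32_t MaxBlocks = 0;
};

// Tightens the defaults with the attribute bounds: minimums can only grow,
// maximums can only shrink once they have been set to a positive value.
void applyAttrBounds(KernelDefaultAttrs &Attrs, const AttrBounds &B) {
  assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 &&
         "invalid default attrs structure");
  int32_t &MaxTeams = Attrs.MaxTeams.front();
  int32_t &MaxThreads = Attrs.MaxThreads.front();

  Attrs.MinThreads = std::max(Attrs.MinThreads, B.MinThreads);
  if (B.MaxThreads > 0)
    MaxThreads =
        MaxThreads > 0 ? std::min(MaxThreads, B.MaxThreads) : B.MaxThreads;

  Attrs.MinTeams = std::max(Attrs.MinTeams, B.MinBlocks);
  if (B.MaxBlocks > 0)
    MaxTeams = MaxTeams > 0 ? std::min(MaxTeams, B.MaxBlocks) : B.MaxBlocks;
}
```

Passing one structure keeps the call sites short and lets further kernel-related fields be added later without touching every signature, which appears to be the motivation stated in the PR description.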
[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)
llvmbot wrote: @llvm/pr-subscribers-flang-openmp @llvm/pr-subscribers-mlir-llvm Author: Sergio Afonso (skatrak) Changes This patch introduces a `TargetKernelRuntimeAttrs` structure to hold host- evaluated `num_teams`, `thread_limit`, `num_threads` and trip count values passed to the runtime kernel offloading call. Additionally, `createTarget` is extended to take an `IsSPMD` flag, used to influence target device code generation. --- Patch is 31.58 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116051.diff 4 Files Affected: - (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+25-1) - (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+118-19) - (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+271-10) - (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+6-4) ``diff diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h index da450ef5adbc14..a85f41e586c514 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h +++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h @@ -2237,6 +2237,26 @@ class OpenMPIRBuilder { int32_t MinThreads = 1; }; + /// Container to pass LLVM IR runtime values or constants related to the + /// number of teams and threads with which the kernel must be launched, as + /// well as the trip count of the SPMD loop, if it is an SPMD kernel. These + /// must be defined in the host prior to the call to the kernel launch OpenMP + /// RTL function. + struct TargetKernelRuntimeAttrs { +SmallVector MaxTeams = {nullptr}; +Value *MinTeams = nullptr; +SmallVector TargetThreadLimit = {nullptr}; +SmallVector TeamsThreadLimit = {nullptr}; + +/// 'parallel' construct 'num_threads' clause value, if present and it is a +/// target SPMD kernel. +Value *MaxThreads = nullptr; + +/// Total number of iterations of the target SPMD kernel or null if it is a +/// generic kernel. 
+Value *LoopTripCount = nullptr; + }; + /// Data structure that contains the needed information to construct the /// kernel args vector. struct TargetKernelArgs { @@ -2905,11 +2925,14 @@ class OpenMPIRBuilder { /// /// \param Loc where the target data construct was encountered. /// \param IsOffloadEntry whether it is an offload entry. + /// \param IsSPMD whether it is a target SPMD kernel. /// \param CodeGenIP The insertion point where the call to the outlined /// function should be emitted. /// \param EntryInfo The entry information about the function. /// \param DefaultAttrs Structure containing the default numbers of threads ///and teams to launch the kernel with. + /// \param RuntimeAttrs Structure containing the runtime numbers of threads + ///and teams to launch the kernel with. /// \param Inputs The input values to the region that will be passed. /// as arguments to the outlined function. /// \param BodyGenCB Callback that will generate the region code. @@ -2919,11 +2942,12 @@ class OpenMPIRBuilder { // dependency information as passed in the depend clause // \param HasNowait Whether the target construct has a `nowait` clause or not. 
InsertPointOrErrorTy createTarget( - const LocationDescription &Loc, bool IsOffloadEntry, + const LocationDescription &Loc, bool IsOffloadEntry, bool IsSPMD, OpenMPIRBuilder::InsertPointTy AllocaIP, OpenMPIRBuilder::InsertPointTy CodeGenIP, TargetRegionEntryInfo &EntryInfo, const TargetKernelDefaultAttrs &DefaultAttrs, + const TargetKernelRuntimeAttrs &RuntimeAttrs, SmallVectorImpl &Inputs, GenMapInfoCallbackTy GenMapInfoCB, TargetBodyGenCallbackTy BodyGenCB, TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB, diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index 302d363965c940..f847f60386df85 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -6727,8 +6727,43 @@ FunctionCallee OpenMPIRBuilder::createDispatchDeinitFunction() { return getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_dispatch_deinit); } +static void emitUsed(StringRef Name, std::vector &List, + Module &M) { + if (List.empty()) +return; + + Type *PtrTy = PointerType::get(M.getContext(), /*AddressSpace=*/0); + + // Convert List to what ConstantArray needs. + SmallVector UsedArray; + UsedArray.reserve(List.size()); + for (auto Item : List) +UsedArray.push_back(ConstantExpr::getPointerBitCastOrAddrSpaceCast( +cast(&*Item), PtrTy)); + + ArrayType *ArrTy = ArrayType::get(PtrTy, UsedArray.size()); + auto *GV = + new GlobalVariable(M, ArrTy, false, llvm::GlobalValue::AppendingLinkage, + llvm::ConstantArray::get(ArrTy, UsedArray), Name); + + GV->setSection("llvm.metadata"); +} + +st
[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)
skatrak wrote: PR stack: - #116048 - #116049 - #116050 - #116051 - #116052 https://github.com/llvm/llvm-project/pull/116051
[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)
@@ -50,43 +42,28 @@ macro(enable_cuda_compilation name files) "${CUDA_COMPILE_OPTIONS}" ) -if (EXISTS "${FLANG_LIBCUDACXX_PATH}/include") +if (EXISTS "${FLANG_RT_LIBCUDACXX_PATH}/include") # When using libcudacxx headers files, we have to use them # for all files of F18 runtime. - include_directories(AFTER ${FLANG_LIBCUDACXX_PATH}/include) + include_directories(AFTER ${FLANG_RT_LIBCUDACXX_PATH}/include) add_compile_definitions(RT_USE_LIBCUDACXX=1) endif() # Add an OBJECT library consisting of CUDA PTX. -llvm_add_library(${name}PTX OBJECT PARTIAL_SOURCES_INTENDED ${files}) -set_property(TARGET obj.${name}PTX PROPERTY CUDA_PTX_COMPILATION ON) -if (FLANG_CUDA_RUNTIME_PTX_WITHOUT_GLOBAL_VARS) - target_compile_definitions(obj.${name}PTX -PRIVATE FLANG_RUNTIME_NO_GLOBAL_VAR_DEFS +add_flangrt_library(${name}PTX OBJECT ${files}) jeanPerier wrote: I think `INSTALL_WITH_TOOLCHAIN` may be needed here. I am not an expert with `llvm_add_library`, so I cannot be assertive, but `llvm_add_library` seemed to build/install the PTX library in the lib directory of llvm build. I am not sure where it is located inside the build directory with the patch. https://github.com/llvm/llvm-project/pull/110217
[llvm-branch-commits] [llvm] [CGData] Refactor Global Merge Functions (PR #115750)
kyulee-com wrote: @nocchijiang The new approach seems to be functioning well and is similar in size to the previous method. I suspect that the no-LTO case might still encounter some slowdown, as each CU needs to read the entire CGData regardless. Currently, the CGData used for this merging process does not utilize names, which means we could potentially eliminate strings or make them optional. Alternatively, we could restructure the indexed CGData to allow for reading only the relevant hash entries on demand. I'd like to leave these options open for now, and if you can continue to improve it, that would be excellent. https://github.com/llvm/llvm-project/pull/115750
[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #113557)
agozillon wrote: Thank you very much @skatrak and @ergawy. I'll land this PR stack on either Friday or the coming Monday, to give a few days' leeway in case anyone else wishes to make any comments! https://github.com/llvm/llvm-project/pull/113557
[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)
skatrak wrote: Buildbot failure seems to be some temporary issue unrelated to the PR. https://github.com/llvm/llvm-project/pull/116051
[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)
@@ -50,43 +42,28 @@ macro(enable_cuda_compilation name files) "${CUDA_COMPILE_OPTIONS}" ) -if (EXISTS "${FLANG_LIBCUDACXX_PATH}/include") +if (EXISTS "${FLANG_RT_LIBCUDACXX_PATH}/include") # When using libcudacxx headers files, we have to use them # for all files of F18 runtime. - include_directories(AFTER ${FLANG_LIBCUDACXX_PATH}/include) + include_directories(AFTER ${FLANG_RT_LIBCUDACXX_PATH}/include) add_compile_definitions(RT_USE_LIBCUDACXX=1) endif() # Add an OBJECT library consisting of CUDA PTX. -llvm_add_library(${name}PTX OBJECT PARTIAL_SOURCES_INTENDED ${files}) -set_property(TARGET obj.${name}PTX PROPERTY CUDA_PTX_COMPILATION ON) -if (FLANG_CUDA_RUNTIME_PTX_WITHOUT_GLOBAL_VARS) - target_compile_definitions(obj.${name}PTX -PRIVATE FLANG_RUNTIME_NO_GLOBAL_VAR_DEFS +add_flangrt_library(${name}PTX OBJECT ${files}) jeanPerier wrote: Also, I think the `OBJECT` processing [in llvm_add_library](https://github.com/llvm/llvm-project/blob/0baa6a7272970257fd6f527e95eb7cb18ba3361c/llvm/cmake/modules/AddLLVM.cmake#L565) is more complex and also implies `STATIC` (that is, it both triggers [an object build for the `obj.${name}PTX`](https://github.com/llvm/llvm-project/blob/0baa6a7272970257fd6f527e95eb7cb18ba3361c/llvm/cmake/modules/AddLLVM.cmake#L568), and a [STATIC build for the `${name}PTX`](https://github.com/llvm/llvm-project/blob/0baa6a7272970257fd6f527e95eb7cb18ba3361c/llvm/cmake/modules/AddLLVM.cmake#L644)). I do not think this is happening with `add_flangrt_library` that only makes an object build. https://github.com/llvm/llvm-project/pull/110217
[llvm-branch-commits] [clang] clang/HIP: Remove requires system-linux from some driver tests (PR #112842)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/112842 >From 87f64e8bf51d43c34c5cb4de12661a44674d92b7 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 18 Oct 2024 09:40:34 +0400 Subject: [PATCH] clang/HIP: Remove requires system-linux from some driver tests --- clang/test/Driver/hip-partial-link.hip | 2 +- clang/test/Driver/linker-wrapper.c | 10 -- 2 files changed, 5 insertions(+), 7 deletions(-) diff --git a/clang/test/Driver/hip-partial-link.hip b/clang/test/Driver/hip-partial-link.hip index 8b27f78f3bdd12..5580e569780194 100644 --- a/clang/test/Driver/hip-partial-link.hip +++ b/clang/test/Driver/hip-partial-link.hip @@ -1,4 +1,4 @@ -// REQUIRES: x86-registered-target, amdgpu-registered-target, lld, system-linux +// REQUIRES: x86-registered-target, amdgpu-registered-target, lld // RUN: %clang -x hip --target=x86_64-unknown-linux-gnu --no-offload-new-driver \ // RUN: --offload-arch=gfx906 -c -nostdinc -nogpuinc -nohipwrapperinc \ diff --git a/clang/test/Driver/linker-wrapper.c b/clang/test/Driver/linker-wrapper.c index 470af4d5d70cac..fac4331e51f694 100644 --- a/clang/test/Driver/linker-wrapper.c +++ b/clang/test/Driver/linker-wrapper.c @@ -2,8 +2,6 @@ // REQUIRES: nvptx-registered-target // REQUIRES: amdgpu-registered-target -// REQUIRES: system-linux - // An externally visible variable so static libraries extract. 
__attribute__((visibility("protected"), used)) int x; @@ -30,7 +28,7 @@ __attribute__((visibility("protected"), used)) int x; // RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run --device-debug -O0 \ // RUN: --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=NVPTX-LINK-DEBUG -// NVPTX-LINK-DEBUG: clang{{.*}} -o {{.*}}.img --target=nvptx64-nvidia-cuda -march=sm_70 -O2 -flto {{.*}}.o {{.*}}.o -g +// NVPTX-LINK-DEBUG: clang{{.*}} -o {{.*}}.img --target=nvptx64-nvidia-cuda -march=sm_70 -O2 -flto {{.*}}.o {{.*}}.o -g // RUN: clang-offload-packager -o %t.out \ // RUN: --image=file=%t.elf.o,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx908 \ @@ -93,7 +91,7 @@ __attribute__((visibility("protected"), used)) int x; // CUDA: clang{{.*}} -o [[IMG_SM70:.+]] --target=nvptx64-nvidia-cuda -march=sm_70 // CUDA: clang{{.*}} -o [[IMG_SM52:.+]] --target=nvptx64-nvidia-cuda -march=sm_52 -// CUDA: fatbinary{{.*}}-64 --create {{.*}}.fatbin --image=profile=sm_70,file=[[IMG_SM70]] --image=profile=sm_52,file=[[IMG_SM52]] +// CUDA: fatbinary{{.*}}-64 --create {{.*}}.fatbin --image=profile=sm_70,file=[[IMG_SM70]] --image=profile=sm_52,file=[[IMG_SM52]] // CUDA: usr/bin/ld{{.*}} {{.*}}.openmp.image.{{.*}}.o {{.*}}.cuda.image.{{.*}}.o // RUN: clang-offload-packager -o %t.out \ @@ -120,7 +118,7 @@ __attribute__((visibility("protected"), used)) int x; // HIP: clang{{.*}} -o [[IMG_GFX90A:.+]] --target=amdgcn-amd-amdhsa -mcpu=gfx90a // HIP: clang{{.*}} -o [[IMG_GFX908:.+]] --target=amdgcn-amd-amdhsa -mcpu=gfx908 -// HIP: clang-offload-bundler{{.*}}-type=o -bundle-align=4096 -compress -compression-level=6 -targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx90a,hip-amdgcn-amd-amdhsa--gfx908 -input=/dev/null -input=[[IMG_GFX90A]] -input=[[IMG_GFX908]] -output={{.*}}.hipfb +// HIP: clang-offload-bundler{{.*}}-type=o -bundle-align=4096 -compress -compression-level=6 
-targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx90a,hip-amdgcn-amd-amdhsa--gfx908 -input={{/dev/null|NUL}} -input=[[IMG_GFX90A]] -input=[[IMG_GFX908]] -output={{.*}}.hipfb // RUN: clang-offload-packager -o %t.out \ // RUN: --image=file=%t.elf.o,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx908 \ @@ -211,7 +209,7 @@ __attribute__((visibility("protected"), used)) int x; // RUN: %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=RELOCATABLE-LINK-HIP // RELOCATABLE-LINK-HIP: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa -// RELOCATABLE-LINK-HIP: clang-offload-bundler{{.*}} -type=o -bundle-align=4096 -targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx90a -input=/dev/null -input={{.*}} -output={{.*}} +// RELOCATABLE-LINK-HIP: clang-offload-bundler{{.*}} -type=o -bundle-align=4096 -targets=host-x86_64-unknown-linux,hip-amdgcn-amd-amdhsa--gfx90a -input={{/dev/null|NUL}} -input={{.*}} -output={{.*}} // RELOCATABLE-LINK-HIP: /usr/bin/ld.lld{{.*}}-r // RELOCATABLE-LINK-HIP: llvm-objcopy{{.*}}a.out --remove-section .llvm.offloading
[llvm-branch-commits] [llvm] release/19.x [InstCombine] Drop nsw in negation of select (PR #116097)
https://github.com/AreaZR created https://github.com/llvm/llvm-project/pull/116097 Closes https://github.com/llvm/llvm-project/issues/112666 and https://github.com/llvm/llvm-project/issues/114181. (cherry-picked from 8d86a537ad756e31832eab67371179e881452fb5) >From d2d67d1eaeb5a9d44b23b9175c1aab9928247239 Mon Sep 17 00:00:00 2001 From: Rose Date: Wed, 13 Nov 2024 14:40:33 -0500 Subject: [PATCH 1/2] Pre-commit tests (NFC) --- .../InstCombine/sub-of-negatible.ll | 42 +++ 1 file changed, 42 insertions(+) diff --git a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll index b2e14ceaca1b08..dfc461b48800f7 100644 --- a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll +++ b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll @@ -1374,6 +1374,48 @@ define i8 @negate_select_of_op_vs_negated_op(i8 %x, i8 %y, i1 %c) { %t2 = sub i8 %y, %t1 ret i8 %t2 } + +define i8 @negate_select_of_op_vs_negated_op_nsw(i8 %x, i8 %y, i1 %c) { +; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw( +; CHECK-NEXT:[[T0:%.*]] = sub nsw i8 0, [[X:%.*]] +; CHECK-NEXT:[[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[X]], i8 [[T0]] +; CHECK-NEXT:[[T2:%.*]] = add i8 [[TMP1]], [[Y:%.*]] +; CHECK-NEXT:ret i8 [[T2]] +; + %t0 = sub nsw i8 0, %x + %t1 = select i1 %c, i8 %t0, i8 %x + %t2 = sub i8 %y, %t1 + ret i8 %t2 +} + +define i8 @negate_select_of_op_vs_negated_op_nsw_commuted(i8 %x, i8 %y, i1 %c) { +; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw_commuted( +; CHECK-NEXT:[[T0:%.*]] = sub nsw i8 0, [[X:%.*]] +; CHECK-NEXT:[[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[T0]], i8 [[X]] +; CHECK-NEXT:[[T2:%.*]] = add i8 [[TMP1]], [[Y:%.*]] +; CHECK-NEXT:ret i8 [[T2]] +; + %t0 = sub nsw i8 0, %x + %t1 = select i1 %c, i8 %x, i8 %t0 + %t2 = sub i8 %y, %t1 + ret i8 %t2 +} + +define i8 @negate_select_of_op_vs_negated_op_nsw_xyyx(i8 %x, i8 %y, i8 %z, i1 %c) { +; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw_xyyx( +; CHECK-NEXT:[[SUB1:%.*]] = sub nsw i8 
[[X:%.*]], [[Y:%.*]] +; CHECK-NEXT:[[SUB2:%.*]] = sub nsw i8 [[Y]], [[X]] +; CHECK-NEXT:[[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[SUB2]], i8 [[SUB1]] +; CHECK-NEXT:[[T2:%.*]] = add i8 [[TMP1]], [[Z:%.*]] +; CHECK-NEXT:ret i8 [[T2]] +; + %sub1 = sub nsw i8 %x, %y + %sub2 = sub nsw i8 %y, %x + %t1 = select i1 %c, i8 %sub1, i8 %sub2 + %t2 = sub i8 %z, %t1 + ret i8 %t2 +} + define i8 @dont_negate_ordinary_select(i8 %x, i8 %y, i8 %z, i1 %c) { ; CHECK-LABEL: @dont_negate_ordinary_select( ; CHECK-NEXT:[[T0:%.*]] = select i1 [[C:%.*]], i8 [[X:%.*]], i8 [[Y:%.*]] >From 9aa29134a11fdafb30c944fd4b3c95bc07db4d49 Mon Sep 17 00:00:00 2001 From: Rose Date: Wed, 13 Nov 2024 14:44:01 -0500 Subject: [PATCH 2/2] [InstCombine] Drop nsw in negation of select (cherry-picked from 8d86a537ad756e31832eab67371179e881452fb5) --- .../lib/Transforms/InstCombine/InstCombineNegator.cpp | 11 +++ llvm/test/Transforms/InstCombine/sub-of-negatible.ll | 8 2 files changed, 15 insertions(+), 4 deletions(-) diff --git a/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp b/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp index e4895b59f4b4a9..cb052da79bb3c6 100644 --- a/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp +++ b/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp @@ -334,6 +334,17 @@ std::array Negator::getSortedOperandsOfBinOp(Instruction *I) { NewSelect->swapValues(); // Don't swap prof metadata, we didn't change the branch behavior. 
NewSelect->setName(I->getName() + ".neg"); + // Poison-generating flags should be dropped + Value *TV = NewSelect->getTrueValue(); + Value *FV = NewSelect->getFalseValue(); + if (match(TV, m_Neg(m_Specific(FV +cast(TV)->dropPoisonGeneratingFlags(); + else if (match(FV, m_Neg(m_Specific(TV +cast(FV)->dropPoisonGeneratingFlags(); + else { +cast(TV)->dropPoisonGeneratingFlags(); +cast(FV)->dropPoisonGeneratingFlags(); + } Builder.Insert(NewSelect); return NewSelect; } diff --git a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll index dfc461b48800f7..f9549881aa3131 100644 --- a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll +++ b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll @@ -1377,7 +1377,7 @@ define i8 @negate_select_of_op_vs_negated_op(i8 %x, i8 %y, i1 %c) { define i8 @negate_select_of_op_vs_negated_op_nsw(i8 %x, i8 %y, i1 %c) { ; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw( -; CHECK-NEXT:[[T0:%.*]] = sub nsw i8 0, [[X:%.*]] +; CHECK-NEXT:[[T0:%.*]] = sub i8 0, [[X:%.*]] ; CHECK-NEXT:[[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[X]], i8 [[T0]] ; CHECK-NEXT:[[T2:%.*]] = add i8 [[TMP1]], [[Y:%.*]] ; CHECK-NEXT:ret i8 [[T2]] @@ -1390,7 +1390,7 @@ define i8 @negate_select
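To see why the `nsw` flag has to go, it can help to model the first pre-commit test by hand. The sketch below is only a toy model, not LLVM code: the `I8` struct, the helper functions and their names are all hypothetical, and poison is represented by a boolean flag. `original` mirrors the input IR of `negate_select_of_op_vs_negated_op_nsw`, while the two `negated*` functions mirror the output with the flag kept and dropped.

```cpp
#include <cstdint>

// Toy model of an i8 value that may be poison (hypothetical type).
struct I8 {
  bool Poison;
  int8_t Val;
};

// Wraps an int into the signed 8-bit range using two's complement.
int8_t wrapI8(int V) {
  return static_cast<int8_t>(((V + 128) % 256 + 256) % 256 - 128);
}

// Plain `sub`: wraps on overflow and only propagates existing poison.
I8 sub(I8 A, I8 B) {
  if (A.Poison || B.Poison)
    return I8{true, 0};
  return I8{false, wrapI8(A.Val - B.Val)};
}

// `sub nsw`: yields poison on signed overflow.
I8 subNSW(I8 A, I8 B) {
  if (A.Poison || B.Poison)
    return I8{true, 0};
  int R = A.Val - B.Val;
  if (R < -128 || R > 127)
    return I8{true, 0};
  return I8{false, static_cast<int8_t>(R)};
}

// Plain `add`: wraps on overflow and only propagates existing poison.
I8 add(I8 A, I8 B) {
  if (A.Poison || B.Poison)
    return I8{true, 0};
  return I8{false, wrapI8(A.Val + B.Val)};
}

// `select`: only the chosen arm is observed.
I8 select(bool C, I8 T, I8 F) { return C ? T : F; }

// Input IR: %t0 = sub nsw 0, %x; %t1 = select %c, %t0, %x; sub %y, %t1
I8 original(int8_t X, int8_t Y, bool C) {
  I8 T0 = subNSW(I8{false, 0}, I8{false, X});
  I8 T1 = select(C, T0, I8{false, X});
  return sub(I8{false, Y}, T1);
}

// Negated select with swapped arms, reusing %t0 with nsw kept.
I8 negatedKeepNSW(int8_t X, int8_t Y, bool C) {
  I8 T0 = subNSW(I8{false, 0}, I8{false, X});
  I8 T1 = select(C, I8{false, X}, T0);
  return add(T1, I8{false, Y});
}

// Same as above, but with the nsw flag dropped from %t0.
I8 negatedDropNSW(int8_t X, int8_t Y, bool C) {
  I8 T0 = sub(I8{false, 0}, I8{false, X});
  I8 T1 = select(C, I8{false, X}, T0);
  return add(T1, I8{false, Y});
}
```

With `X = -128` and `C = false`, `original` never observes the overflowing negation and returns a defined value, whereas `negatedKeepNSW` selects the poison arm. `negatedDropNSW` agrees with `original` on every input where `original` is defined, which is the behaviour the updated `CHECK` lines expect.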
[llvm-branch-commits] [llvm] release/19.x: [InstCombine] Drop nsw in negation of select (PR #116097)
https://github.com/AreaZR edited https://github.com/llvm/llvm-project/pull/116097
[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)
llvmbot wrote: @llvm/pr-subscribers-mlir Author: Sergio Afonso (skatrak) Changes This patch introduces a `TargetKernelRuntimeAttrs` structure to hold host- evaluated `num_teams`, `thread_limit`, `num_threads` and trip count values passed to the runtime kernel offloading call. Additionally, `createTarget` is extended to take an `IsSPMD` flag, used to influence target device code generation. --- Patch is 31.58 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116051.diff 4 Files Affected: - (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+25-1) - (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+118-19) - (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+271-10) - (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+6-4) ``diff diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h index da450ef5adbc14..a85f41e586c514 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h +++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h @@ -2237,6 +2237,26 @@ class OpenMPIRBuilder { int32_t MinThreads = 1; }; + /// Container to pass LLVM IR runtime values or constants related to the + /// number of teams and threads with which the kernel must be launched, as + /// well as the trip count of the SPMD loop, if it is an SPMD kernel. These + /// must be defined in the host prior to the call to the kernel launch OpenMP + /// RTL function. + struct TargetKernelRuntimeAttrs { +SmallVector MaxTeams = {nullptr}; +Value *MinTeams = nullptr; +SmallVector TargetThreadLimit = {nullptr}; +SmallVector TeamsThreadLimit = {nullptr}; + +/// 'parallel' construct 'num_threads' clause value, if present and it is a +/// target SPMD kernel. +Value *MaxThreads = nullptr; + +/// Total number of iterations of the target SPMD kernel or null if it is a +/// generic kernel. 
+Value *LoopTripCount = nullptr; + }; + /// Data structure that contains the needed information to construct the /// kernel args vector. struct TargetKernelArgs { @@ -2905,11 +2925,14 @@ class OpenMPIRBuilder { /// /// \param Loc where the target data construct was encountered. /// \param IsOffloadEntry whether it is an offload entry. + /// \param IsSPMD whether it is a target SPMD kernel. /// \param CodeGenIP The insertion point where the call to the outlined /// function should be emitted. /// \param EntryInfo The entry information about the function. /// \param DefaultAttrs Structure containing the default numbers of threads ///and teams to launch the kernel with. + /// \param RuntimeAttrs Structure containing the runtime numbers of threads + ///and teams to launch the kernel with. /// \param Inputs The input values to the region that will be passed. /// as arguments to the outlined function. /// \param BodyGenCB Callback that will generate the region code. @@ -2919,11 +2942,12 @@ class OpenMPIRBuilder { // dependency information as passed in the depend clause // \param HasNowait Whether the target construct has a `nowait` clause or not. 
InsertPointOrErrorTy createTarget( - const LocationDescription &Loc, bool IsOffloadEntry, + const LocationDescription &Loc, bool IsOffloadEntry, bool IsSPMD, OpenMPIRBuilder::InsertPointTy AllocaIP, OpenMPIRBuilder::InsertPointTy CodeGenIP, TargetRegionEntryInfo &EntryInfo, const TargetKernelDefaultAttrs &DefaultAttrs, + const TargetKernelRuntimeAttrs &RuntimeAttrs, SmallVectorImpl &Inputs, GenMapInfoCallbackTy GenMapInfoCB, TargetBodyGenCallbackTy BodyGenCB, TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB, diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index 302d363965c940..f847f60386df85 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -6727,8 +6727,43 @@ FunctionCallee OpenMPIRBuilder::createDispatchDeinitFunction() { return getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_dispatch_deinit); } +static void emitUsed(StringRef Name, std::vector &List, + Module &M) { + if (List.empty()) +return; + + Type *PtrTy = PointerType::get(M.getContext(), /*AddressSpace=*/0); + + // Convert List to what ConstantArray needs. + SmallVector UsedArray; + UsedArray.reserve(List.size()); + for (auto Item : List) +UsedArray.push_back(ConstantExpr::getPointerBitCastOrAddrSpaceCast( +cast(&*Item), PtrTy)); + + ArrayType *ArrTy = ArrayType::get(PtrTy, UsedArray.size()); + auto *GV = + new GlobalVariable(M, ArrTy, false, llvm::GlobalValue::AppendingLinkage, + llvm::ConstantArray::get(ArrTy, UsedArray), Name); + + GV->setSection("llvm.metadata"); +} + +static void +emitExecutionMode(OpenMPIRBu
[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #113557)
@@ -145,11 +145,294 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc, builder.getIntegerAttr(builder.getIntegerType(64, false), mapType), builder.getAttr(mapCaptureType), builder.getStringAttr(name), builder.getBoolAttr(partialMap)); - return op; } -static int +// This function gathers the individual omp::Object's that make up an +// larger omp::Object symbol. +// +// For example, provided the larger symbol: "parent%child%member", this +// function breaks it up into it's constituent components ("parent", +// "child", "member"), so we can access each individual component and +// introspect details, important to note this function breaks it up from +// RHS to LHS ("member" to "parent") and then we reverse it so that the +// returned omp::ObjectList is LHS to RHS, with the "parent" at the +// beginning. +omp::ObjectList gatherObjectsOf(omp::Object derivedTypeMember, +semantics::SemanticsContext &semaCtx) { + omp::ObjectList objList; + std::optional baseObj = derivedTypeMember; + while (baseObj.has_value()) { +objList.push_back(baseObj.value()); +baseObj = getBaseObject(baseObj.value(), semaCtx); + } + return omp::ObjectList{llvm::reverse(objList)}; +} + +// This function generates a series of indices from a provided omp::Object, +// that devolves to an ArrayRef symbol, e.g. "array(2,3,4)", this function +// would generate a series of indices of "[1][2][3]" for the above example, +// offsetting by -1 to account for the non-zero fortran indexes. +// +// These indices can then be provided to a coordinate operation or other +// GEP-like operation to access the relevant positional member of the +// array. +// +// It is of note that the function only supports subscript integers currently +// and not Triplets i.e. Array(1:2:3). 
+static void generateArrayIndices(lower::AbstractConverter &converter, + fir::FirOpBuilder &firOpBuilder, + lower::StatementContext &stmtCtx, + mlir::Location clauseLocation, + llvm::SmallVectorImpl &indices, + omp::Object object) { + auto maybeRef = evaluate::ExtractDataRef(*object.ref()); + if (!maybeRef) +return; + + auto *arr = std::get_if(&maybeRef->u); + if (!arr) +return; + + for (auto v : arr->subscript()) { +if (std::holds_alternative(v.u)) { + llvm_unreachable("Triplet indexing in map clause is unsupported"); +} else { + auto expr = + std::get(v.u); + mlir::Value subscript = + fir::getBase(converter.genExprValue(toEvExpr(expr.value()), stmtCtx)); + mlir::Value one = firOpBuilder.createIntegerConstant( + clauseLocation, firOpBuilder.getIndexType(), 1); + subscript = firOpBuilder.createConvert( + clauseLocation, firOpBuilder.getIndexType(), subscript); + indices.push_back(firOpBuilder.create( + clauseLocation, subscript, one)); +} + } +} + +/// When mapping members of derived types, there is a chance that one of the +/// members along the way to a mapped member is an descriptor. In which case +/// we have to make sure we generate a map for those along the way otherwise +/// we will be missing a chunk of data required to actually map the member +/// type to device. This function effectively generates these maps and the +/// appropriate data accesses required to generate these maps. It will avoid +/// creating duplicate maps, as duplicates are just as bad as unmapped +/// descriptor data in a lot of cases for the runtime (and unnecessary +/// data movement should be avoided where possible). 
+/// +/// As an example for the following mapping: +/// +/// type :: vertexes +/// integer(4), allocatable :: vertexx(:) +/// integer(4), allocatable :: vertexy(:) +/// end type vertexes +/// +/// type :: dtype +/// real(4) :: i +/// type(vertexes), allocatable :: vertexes(:) +/// end type dtype +/// +/// type(dtype), allocatable :: alloca_dtype +/// +/// !$omp target map(tofrom: alloca_dtype%vertexes(N1)%vertexx) +/// +/// The below HLFIR/FIR is generated (trimmed for conciseness): +/// +/// On the first iteration we index into the record type alloca_dtype +/// to access "vertexes", we then generate a map for this descriptor +/// alongside bounds to indicate we only need the 1 member, rather than +/// the whole array block in this case (In theory we could map its +/// entirety at the cost of data transfer bandwidth). +/// +/// %13:2 = hlfir.declare ... "alloca_dtype" ... +/// %39 = fir.load %13#0 : ... +/// %40 = fir.coordinate_of %39, %c1 : ... +/// %51 = omp.map.info var_ptr(%40 : ...) map_clauses(to) capture(ByRef) ... +/// %52 = fir.load %40 : ... +/// +/// Second iteration generating access to "vertexes(N1) utilising the N1 index +/// %53 = load N1 ... +/// %54 = fir.convert %53 : (i32) -> i64 +/// %55 = fir.convert
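The helper comments quoted in this review describe two small ideas: walking from the innermost member out to its parent and then reversing the result, and shifting 1-based Fortran subscripts to 0-based indices. A minimal sketch of both in plain C++ follows. It assumes an object can be modelled as a name plus an optional link to its base; `Node`, `gatherNames` and `toZeroBased` are hypothetical names used only for illustration and are not part of flang.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for omp::Object: a name plus the table index of its
// base (parent) object, if any. This is not the flang data structure.
struct Node {
  std::string name;
  std::optional<std::size_t> base;
};

// Collect names from the given member up to the outermost parent, then
// reverse so the result reads left to right, with the parent first.
std::vector<std::string> gatherNames(const std::vector<Node> &table,
                                     std::size_t member) {
  std::vector<std::string> names;
  std::optional<std::size_t> cur = member;
  while (cur.has_value()) {
    names.push_back(table[*cur].name);
    cur = table[*cur].base;
  }
  std::reverse(names.begin(), names.end());
  return names;
}

// Convert 1-based Fortran subscripts to 0-based indices by subtracting one.
std::vector<long> toZeroBased(const std::vector<long> &subscripts) {
  std::vector<long> out;
  out.reserve(subscripts.size());
  for (long s : subscripts)
    out.push_back(s - 1);
  return out;
}
```

With a table describing `parent%child%member`, `gatherNames` started at the `member` entry yields the three names in parent-first order, and `toZeroBased({2, 3, 4})` mirrors the `array(2,3,4)` to `[1][2][3]` example from the comment.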
[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #113557)
https://github.com/skatrak approved this pull request. Thank you Andrew for all your work on this, LGTM! https://github.com/llvm/llvm-project/pull/113557
[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] LLVM IR translation of host_eval (PR #116052)
llvmbot wrote:

@llvm/pr-subscribers-flang-openmp

Author: Sergio Afonso (skatrak)

Changes

This patch adds support for processing the `host_eval` clause of `omp.target` to populate default and runtime kernel launch attributes. Specifically, these are related to the `num_teams`, `thread_limit` and `num_threads` clauses attached to operations nested inside of `omp.target`. As a result, the `thread_limit` clause of `omp.target` is also supported.

The implementation of `initTargetDefaultAttrs()` is intended to reflect clang's own processing of multiple constructs and clauses in order to define a default number of teams and threads to be used as kernel attributes and to populate global variables in the target device module.

One side effect of this change is that it is no longer possible to translate target device MLIR modules to LLVM IR unless they have a supported target triple. This is because the local `getGridValue()` function in the `OpenMPIRBuilder` only works for certain architectures, and it is called whenever the maximum number of threads has not been explicitly defined. This limitation also matches clang.

Evaluating the collapsed loop trip count of target SPMD kernels remains unsupported.
---
Patch is 37.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116052.diff

18 Files Affected:

- (modified) flang/test/Integration/OpenMP/target-filtering.f90 (+1-1)
- (modified) flang/test/Lower/OpenMP/function-filtering-2.f90 (+3-3)
- (modified) flang/test/Lower/OpenMP/function-filtering-3.f90 (+3-3)
- (modified) flang/test/Lower/OpenMP/function-filtering.f90 (+3-3)
- (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+229-16)
- (modified) mlir/test/Target/LLVMIR/omptarget-byref-bycopy-generation-device.mlir (+2-2)
- (modified) mlir/test/Target/LLVMIR/omptarget-constant-alloca-raise.mlir (+2-2)
- (modified) mlir/test/Target/LLVMIR/omptarget-constant-indexing-device-region.mlir (+2-2)
- (modified) mlir/test/Target/LLVMIR/omptarget-debug.mlir (+1-1)
- (modified) mlir/test/Target/LLVMIR/omptarget-declare-target-llvm-device.mlir (+1-1)
- (modified) mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir (+2-2)
- (modified) mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir (+3-3)
- (modified) mlir/test/Target/LLVMIR/omptarget-target-inside-task.mlir (+2-2)
- (added) mlir/test/Target/LLVMIR/openmp-target-launch-device.mlir (+43)
- (added) mlir/test/Target/LLVMIR/openmp-target-launch-host.mlir (+31)
- (modified) mlir/test/Target/LLVMIR/openmp-target-use-device-nested.mlir (+2-2)
- (modified) mlir/test/Target/LLVMIR/openmp-task-target-device.mlir (+1-1)
- (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (+13-14)

```diff
diff --git a/flang/test/Integration/OpenMP/target-filtering.f90 b/flang/test/Integration/OpenMP/target-filtering.f90
index d1ab1b47e580d4..699c1040d91f9c 100644
--- a/flang/test/Integration/OpenMP/target-filtering.f90
+++ b/flang/test/Integration/OpenMP/target-filtering.f90
@@ -7,7 +7,7 @@
 !===--===!
 !RUN: %flang_fc1 -emit-llvm -fopenmp %s -o - | FileCheck %s --check-prefixes HOST,ALL
-!RUN: %flang_fc1 -emit-llvm -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s --check-prefixes DEVICE,ALL
+!RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -emit-llvm -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s --check-prefixes DEVICE,ALL
 !HOST: define {{.*}}@{{.*}}before{{.*}}(
 !DEVICE-NOT: define {{.*}}@before{{.*}}(
diff --git a/flang/test/Lower/OpenMP/function-filtering-2.f90 b/flang/test/Lower/OpenMP/function-filtering-2.f90
index 0c02aa223820e7..a2c5e29cfdcbf6 100644
--- a/flang/test/Lower/OpenMP/function-filtering-2.f90
+++ b/flang/test/Lower/OpenMP/function-filtering-2.f90
@@ -1,9 +1,9 @@
 ! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -flang-experimental-hlfir -emit-llvm %s -o - | FileCheck --check-prefixes=LLVM,LLVM-HOST %s
 ! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -emit-hlfir %s -o - | FileCheck --check-prefix=MLIR %s
-! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -flang-experimental-hlfir -emit-llvm %s -o - | FileCheck --check-prefixes=LLVM,LLVM-DEVICE %s
-! RUN: %flang_fc1 -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -emit-hlfir %s -o - | FileCheck --check-prefix=MLIR %s
+! RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -flang-experimental-hlfir -emit-llvm %s -o - | FileCheck --check-prefixes=LLVM,LLVM-DEVICE %s
+! RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -emit-hlfir %s -o - | FileCheck --check-prefix=MLIR %s
 ! RUN: bbc -fopenmp -fopenmp-version=52 -emit-hlfir %s -o - | FileCheck --check-prefixes=MLIR-HOST,MLIR-ALL %s
-! RUN: bbc -fopenmp -fopenmp-version=52 -fopenmp-is-target-device -emit-hlfir %s -o - | FileC
```
[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)
skatrak wrote:

PR stack:
- #116048
- #116049
- #116050
- #116051
- #116052

https://github.com/llvm/llvm-project/pull/116050
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add host_eval clause to omp.target (PR #116049)
https://github.com/skatrak created https://github.com/llvm/llvm-project/pull/116049

This patch adds the `host_eval` clause to the `omp.target` operation. Additionally, it updates its op verifier to make sure all uses of block arguments defined by this clause fall within one of the few cases where they are allowed.

MLIR to LLVM IR translation fails on translation of this clause with a not-yet-implemented error.

From 26fbb25720cf472c66eef259845e1fa73668f77c Mon Sep 17 00:00:00 2001
From: Sergio Afonso
Date: Fri, 8 Nov 2024 12:00:45 +
Subject: [PATCH] [MLIR][OpenMP] Add host_eval clause to omp.target

This patch adds the `host_eval` clause to the `omp.target` operation. Additionally, it updates its op verifier to make sure all uses of block arguments defined by this clause fall within one of the few cases where they are allowed.

MLIR to LLVM IR translation fails on translation of this clause with a not-yet-implemented error.
---
 mlir/docs/Dialects/OpenMPDialect/_index.md | 55 ++
 mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 33 +++-
 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | 167 +-
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 5 +
 mlir/test/Dialect/OpenMP/invalid.mlir | 70 +++-
 mlir/test/Dialect/OpenMP/ops.mlir | 38 +++-
 mlir/test/Target/LLVMIR/openmp-todo.mlir | 14 ++
 7 files changed, 369 insertions(+), 13 deletions(-)

diff --git a/mlir/docs/Dialects/OpenMPDialect/_index.md b/mlir/docs/Dialects/OpenMPDialect/_index.md
index 4e5d777d6c4f7f..e0dd3f598e84b6 100644
--- a/mlir/docs/Dialects/OpenMPDialect/_index.md
+++ b/mlir/docs/Dialects/OpenMPDialect/_index.md
@@ -523,3 +523,58 @@ omp.parallel ... {
   omp.terminator
 } {omp.composite}
 ```
+
+## Host-Evaluated Clauses in Target Regions
+
+The `omp.target` operation, which represents the OpenMP `target` construct, is
+marked with the `IsolatedFromAbove` trait. This means that, inside of its
+region, no MLIR values defined outside of the op itself can be used. This is
+consistent with the OpenMP specification of the `target` construct, which
+mandates that all host device values used inside of the `target` region must
+either be privatized (data-sharing) or mapped (data-mapping).
+
+Normally, clauses applied to a construct are evaluated before entering that
+construct. Further, in some cases, the OpenMP specification stipulates that
+clauses be evaluated _on the host device_ on entry to a parent `target`
+construct. In particular, the `num_teams` and `thread_limit` clauses of the
+`teams` construct must be evaluated on the host device if it's nested inside or
+combined with a `target` construct.
+
+Additionally, the runtime library targeted by the MLIR to LLVM IR translation of
+the OpenMP dialect supports the optimized launch of SPMD kernels (i.e.
+`target teams distribute parallel {do,for}` in OpenMP), which requires
+specifying in advance what the total trip count of the loop is. Consequently, it
+is also beneficial to evaluate the trip count on the host device prior to the
+kernel launch.
+
+These host-evaluated values in MLIR would need to be placed outside of the
+`omp.target` region and also attached to the corresponding nested operations,
+which is not possible because of the `IsolatedFromAbove` trait. The solution
+implemented to address this problem has been to introduce the `host_eval`
+argument to the `omp.target` operation. It works similarly to a `map` clause,
+but its only intended use is to forward host-evaluated values to their
+corresponding operation inside of the region. Any uses outside of the previously
+described result in a verifier error.
+
+```mlir
+// Initialize %0, %1, %2, %3...
+omp.target host_eval(%0 -> %nt, %1 -> %lb, %2 -> %ub, %3 -> %step : i32, i32, i32, i32) {
+  omp.teams num_teams(to %nt : i32) {
+    omp.parallel {
+      omp.distribute {
+        omp.wsloop {
+          omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
+            // ...
+            omp.yield
+          }
+          omp.terminator
+        } {omp.composite}
+        omp.terminator
+      } {omp.composite}
+      omp.terminator
+    } {omp.composite}
+    omp.terminator
+  }
+  omp.terminator
+}
+```
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index a0da3db124d1f4..a99da1f0294d08 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -1166,9 +1166,10 @@ def TargetOp : OpenMP_Op<"target", traits = [
   ], clauses = [
     // TODO: Complete clause list (defaultmap, uses_allocators).
     OpenMP_AllocateClause, OpenMP_DependClause, OpenMP_DeviceClause,
-    OpenMP_HasDeviceAddrClause, OpenMP_IfClause, OpenMP_InReductionClause,
-    OpenMP_IsDevicePtrClause, OpenMP_MapClauseSkip,
-    OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause
+    OpenMP_HasDeviceAddrClause, OpenMP_HostEvalClause, OpenMP_IfClause,
+    OpenMP_
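The documentation text in this patch notes that the optimized SPMD launch needs the loop's total trip count before the kernel starts. As a rough illustration of the kind of value that would be computed on the host, here is a sketch in C++. It assumes an exclusive upper bound and a strictly positive step, and it returns 0 for every other case as a placeholder; `hostTripCount` is a hypothetical helper, not something provided by MLIR or the OpenMP runtime.

```cpp
#include <cassert>
#include <cstdint>

// Number of iterations of `for (i = lb; i < ub; i += step)`.
// Assumptions of this sketch: exclusive upper bound, step > 0. Negative or
// zero steps are not modelled and simply yield 0.
std::uint64_t hostTripCount(std::int64_t lb, std::int64_t ub,
                            std::int64_t step) {
  if (step <= 0 || ub <= lb)
    return 0;
  // Subtract in unsigned arithmetic so a wide range cannot overflow.
  std::uint64_t span =
      static_cast<std::uint64_t>(ub) - static_cast<std::uint64_t>(lb);
  std::uint64_t s = static_cast<std::uint64_t>(step);
  // Round up: a partial final stride still counts as one iteration.
  return span / s + (span % s != 0 ? 1 : 0);
}
```

In the `host_eval` example from the patch, `%lb`, `%ub` and `%step` are exactly the values a computation like this would need, which is why they are forwarded from the host into the region.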
[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] LLVM IR translation of host_eval (PR #116052)
skatrak wrote:

PR stack:
- #116048
- #116049
- #116050
- #116051
- #116052

https://github.com/llvm/llvm-project/pull/116052
[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)
llvmbot wrote:

@llvm/pr-subscribers-mlir

Author: Sergio Afonso (skatrak)

Changes

This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure used to simplify passing default and constant values for number of teams and threads, and possibly other target kernel-related information in the future.

This is used to forward values passed to `createTarget` to `createTargetInit`, which previously used a default unrelated set of values.

---
Patch is 21.80 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116050.diff

8 Files Affected:

- (modified) clang/lib/CodeGen/CGOpenMPRuntime.cpp (+8-5)
- (modified) clang/lib/CodeGen/CGOpenMPRuntime.h (+3-6)
- (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (+3-6)
- (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+25-14)
- (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+40-31)
- (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+16-13)
- (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+6-5)
- (modified) mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir (+1-1)

```diff
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d714af035d21a2..0f7a1166227476 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -5880,10 +5880,13 @@ void CGOpenMPRuntime::emitUsesAllocatorsFini(CodeGenFunction &CGF,
 void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
     const OMPExecutableDirective &D, CodeGenFunction &CGF,
-    int32_t &MinThreadsVal, int32_t &MaxThreadsVal, int32_t &MinTeamsVal,
-    int32_t &MaxTeamsVal) {
+    llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs) {
+  assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 &&
+         "invalid default attrs structure");
+  int32_t &MaxTeamsVal = Attrs.MaxTeams.front();
+  int32_t &MaxThreadsVal = Attrs.MaxThreads.front();
-  getNumTeamsExprForTargetDirective(CGF, D, MinTeamsVal, MaxTeamsVal);
+  getNumTeamsExprForTargetDirective(CGF, D, Attrs.MinTeams, MaxTeamsVal);
   getNumThreadsExprForTargetDirective(CGF, D, MaxThreadsVal,
                                       /*UpperBoundOnly=*/true);
@@ -5901,12 +5904,12 @@ void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
       else
         continue;
-      MinThreadsVal = std::max(MinThreadsVal, AttrMinThreadsVal);
+      Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreadsVal);
       if (AttrMaxThreadsVal > 0)
         MaxThreadsVal = MaxThreadsVal > 0
                             ? std::min(MaxThreadsVal, AttrMaxThreadsVal)
                             : AttrMaxThreadsVal;
-      MinTeamsVal = std::max(MinTeamsVal, AttrMinBlocksVal);
+      Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocksVal);
       if (AttrMaxBlocksVal > 0)
         MaxTeamsVal = MaxTeamsVal > 0 ? std::min(MaxTeamsVal, AttrMaxBlocksVal)
                                       : AttrMaxBlocksVal;
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h b/clang/lib/CodeGen/CGOpenMPRuntime.h
index 5e7715743afb58..003395e7f17ded 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -312,12 +312,9 @@ class CGOpenMPRuntime {
   llvm::OpenMPIRBuilder OMPBuilder;
   /// Helper to determine the min/max number of threads/teams for \p D.
-  void computeMinAndMaxThreadsAndTeams(const OMPExecutableDirective &D,
-                                       CodeGenFunction &CGF,
-                                       int32_t &MinThreadsVal,
-                                       int32_t &MaxThreadsVal,
-                                       int32_t &MinTeamsVal,
-                                       int32_t &MaxTeamsVal);
+  void computeMinAndMaxThreadsAndTeams(
+      const OMPExecutableDirective &D, CodeGenFunction &CGF,
+      llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs);
   /// Helper to emit outlined function for 'target' directive.
   /// \param D Directive to emit.
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index 43dc0e62284602..96f8d6c5c08e56 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -745,14 +745,11 @@ void CGOpenMPRuntimeGPU::emitNonSPMDKernel(const OMPExecutableDirective &D,
 void CGOpenMPRuntimeGPU::emitKernelInit(const OMPExecutableDirective &D,
                                         CodeGenFunction &CGF,
                                         EntryFunctionState &EST, bool IsSPMD) {
-  int32_t MinThreadsVal = 1, MaxThreadsVal = -1, MinTeamsVal = 1,
-          MaxTeamsVal = -1;
-  computeMinAndMaxThreadsAndTeams(D, CGF, MinThreadsVal, MaxThreadsVal,
-                                  MinTeamsVal, MaxTeamsVal);
+  llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs Attrs;
+  computeMinAndMaxThreadsAndTeams(D, CGF, Attrs);
   CGBuilderTy &Bld = CGF.Builder;
-
```
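The `computeMinAndMaxThreadsAndTeams` hunk above raises the lower bounds with `std::max` and tightens the upper bounds with `std::min`, treating a non-positive maximum as "not set". The following standalone C++ sketch restates that merging rule. `KernelDefaultAttrs` is a simplified stand-in for the structure the patch introduces, and its default values (1 for minimums, -1 for maximums) are an assumption taken from the local variables the old code initialized; `tightenMax` and `applyBounds` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for OpenMPIRBuilder::TargetKernelDefaultAttrs.
// The defaults are assumed from the old code's initial values.
struct KernelDefaultAttrs {
  std::vector<std::int32_t> MaxTeams = {-1};
  std::int32_t MinTeams = 1;
  std::vector<std::int32_t> MaxThreads = {-1};
  std::int32_t MinThreads = 1;
};

// Tighten an upper bound. A non-positive value means "no bound given".
std::int32_t tightenMax(std::int32_t current, std::int32_t attr) {
  if (attr <= 0)
    return current;
  return current > 0 ? std::min(current, attr) : attr;
}

// Merge one set of attribute-provided bounds into the running defaults.
void applyBounds(KernelDefaultAttrs &attrs, std::int32_t attrMinThreads,
                 std::int32_t attrMaxThreads, std::int32_t attrMinTeams,
                 std::int32_t attrMaxTeams) {
  attrs.MinThreads = std::max(attrs.MinThreads, attrMinThreads);
  attrs.MaxThreads.front() =
      tightenMax(attrs.MaxThreads.front(), attrMaxThreads);
  attrs.MinTeams = std::max(attrs.MinTeams, attrMinTeams);
  attrs.MaxTeams.front() = tightenMax(attrs.MaxTeams.front(), attrMaxTeams);
}
```

Bundling the four values in one structure, as the patch does, means a caller passes a single reference instead of four `int32_t &` parameters, and new kernel-related fields can be added without touching every signature.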
[llvm-branch-commits] [llvm] release/19.x: backport PR115901 (PR #116104)
llvmbot wrote:

@llvm/pr-subscribers-llvm-transforms

Author: Antonio Frighetto (antoniofrighetto)

Changes

Backport: 929cbe7f596733f85cd274485acc19442dd34a80.

Requested-by: @AreaZR.

---
Full diff: https://github.com/llvm/llvm-project/pull/116104.diff

3 Files Affected:

- (modified) llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp (+2-1)
- (modified) llvm/test/Transforms/InstCombine/opaque-ptr.ll (+1-1)
- (modified) llvm/test/Transforms/InstCombine/phi.ll (+28)

```diff
diff --git a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
index 86411320ab2487..b05a33c688890d 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
@@ -513,7 +513,8 @@ Instruction *InstCombinerImpl::foldPHIArgGEPIntoPHI(PHINode &PN) {
   // especially bad when the PHIs are in the header of a loop.
   bool NeededPhi = false;
-  GEPNoWrapFlags NW = GEPNoWrapFlags::all();
+  // Remember flags of the first phi-operand getelementptr.
+  GEPNoWrapFlags NW = FirstInst->getNoWrapFlags();
   // Scan to see if all operands are the same opcode, and all have one user.
   for (Value *V : drop_begin(PN.incoming_values())) {
diff --git a/llvm/test/Transforms/InstCombine/opaque-ptr.ll b/llvm/test/Transforms/InstCombine/opaque-ptr.ll
index df85547f56d74f..1fd8281b53816f 100644
--- a/llvm/test/Transforms/InstCombine/opaque-ptr.ll
+++ b/llvm/test/Transforms/InstCombine/opaque-ptr.ll
@@ -549,7 +549,7 @@ define ptr @phi_of_gep_flags_1(i1 %c, ptr %p) {
 ; CHECK:       else:
 ; CHECK-NEXT:    br label [[JOIN]]
 ; CHECK:       join:
-; CHECK-NEXT:    [[PHI:%.*]] = getelementptr nusw nuw i8, ptr [[P:%.*]], i64 4
+; CHECK-NEXT:    [[PHI:%.*]] = getelementptr nusw i8, ptr [[P:%.*]], i64 4
 ; CHECK-NEXT:    ret ptr [[PHI]]
 ;
   br i1 %c, label %if, label %else
diff --git a/llvm/test/Transforms/InstCombine/phi.ll b/llvm/test/Transforms/InstCombine/phi.ll
index b12982dd27e404..82ea9bb439b0bb 100644
--- a/llvm/test/Transforms/InstCombine/phi.ll
+++ b/llvm/test/Transforms/InstCombine/phi.ll
@@ -2714,3 +2714,31 @@ join:
   %cmp = icmp slt i32 %13, 0
   ret i1 %cmp
 }
+
+define i64 @wrong_gep_arg_into_phi(ptr noundef %ptr) {
+; CHECK-LABEL: @wrong_gep_arg_into_phi(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[FOR_COND:%.*]]
+; CHECK:       for.cond:
+; CHECK-NEXT:    [[PTR_PN:%.*]] = phi ptr [ [[PTR:%.*]], [[ENTRY:%.*]] ], [ [[DOTPN:%.*]], [[FOR_COND]] ]
+; CHECK-NEXT:    [[DOTPN]] = getelementptr i8, ptr [[PTR_PN]], i64 1
+; CHECK-NEXT:    [[VAL:%.*]] = load i8, ptr [[DOTPN]], align 1
+; CHECK-NEXT:    [[COND_NOT:%.*]] = icmp eq i8 [[VAL]], 0
+; CHECK-NEXT:    br i1 [[COND_NOT]], label [[EXIT:%.*]], label [[FOR_COND]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 0
+;
+entry:
+  %add.ptr = getelementptr i8, ptr %ptr, i64 1
+  br label %for.cond
+
+for.cond: ; preds = %for.cond, %entry
+  %.pn = phi ptr [ %add.ptr, %entry ], [ %incdec.ptr, %for.cond ]
+  %val = load i8, ptr %.pn, align 1
+  %cond = icmp ne i8 %val, 0
+  %incdec.ptr = getelementptr inbounds nuw i8, ptr %.pn, i64 1
+  br i1 %cond, label %for.cond, label %exit
+
+exit: ; preds = %for.cond
+  ret i64 0
+}
```

https://github.com/llvm/llvm-project/pull/116104
[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)
h-vetinari wrote:

> I thought that's what happens by default when you use
> https://patch-diff.githubusercontent.com/raw/llvm/llvm-project/pull/110217.diff

You only get the diff of the respective PR w.r.t. its base. Since this PR is not targeting `main`, the diff does not apply (unless you pick up all the intermediate pieces in the chain of pull requests).

https://github.com/llvm/llvm-project/pull/110217
[llvm-branch-commits] [llvm] release/19.x: backport PR115901 (PR #116104)
https://github.com/antoniofrighetto edited https://github.com/llvm/llvm-project/pull/116104
[llvm-branch-commits] [llvm] release/19.x: backport PR115901 (PR #116104)
https://github.com/antoniofrighetto created https://github.com/llvm/llvm-project/pull/116104

Backport: 929cbe7f596733f85cd274485acc19442dd34a80.

Requested-by: @AreaZR.

>From 134b1917d27e268d8771f76f22d2ee32fbc2a2b3 Mon Sep 17 00:00:00 2001
From: Antonio Frighetto
Date: Tue, 12 Nov 2024 10:45:46 +0100
Subject: [PATCH] [InstCombine] Intersect nowrap flags between geps while folding into phi

A miscompilation issue has been addressed with refined checking.
---
 .../Transforms/InstCombine/InstCombinePHI.cpp |  3 +-
 .../test/Transforms/InstCombine/opaque-ptr.ll |  2 +-
 llvm/test/Transforms/InstCombine/phi.ll       | 28 +++
 3 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
index 86411320ab2487..b05a33c688890d 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
@@ -513,7 +513,8 @@ Instruction *InstCombinerImpl::foldPHIArgGEPIntoPHI(PHINode &PN) {
   // especially bad when the PHIs are in the header of a loop.
   bool NeededPhi = false;
 
-  GEPNoWrapFlags NW = GEPNoWrapFlags::all();
+  // Remember flags of the first phi-operand getelementptr.
+  GEPNoWrapFlags NW = FirstInst->getNoWrapFlags();
 
   // Scan to see if all operands are the same opcode, and all have one user.
   for (Value *V : drop_begin(PN.incoming_values())) {
diff --git a/llvm/test/Transforms/InstCombine/opaque-ptr.ll b/llvm/test/Transforms/InstCombine/opaque-ptr.ll
index df85547f56d74f..1fd8281b53816f 100644
--- a/llvm/test/Transforms/InstCombine/opaque-ptr.ll
+++ b/llvm/test/Transforms/InstCombine/opaque-ptr.ll
@@ -549,7 +549,7 @@ define ptr @phi_of_gep_flags_1(i1 %c, ptr %p) {
 ; CHECK:       else:
 ; CHECK-NEXT:    br label [[JOIN]]
 ; CHECK:       join:
-; CHECK-NEXT:    [[PHI:%.*]] = getelementptr nusw nuw i8, ptr [[P:%.*]], i64 4
+; CHECK-NEXT:    [[PHI:%.*]] = getelementptr nusw i8, ptr [[P:%.*]], i64 4
 ; CHECK-NEXT:    ret ptr [[PHI]]
 ;
   br i1 %c, label %if, label %else
diff --git a/llvm/test/Transforms/InstCombine/phi.ll b/llvm/test/Transforms/InstCombine/phi.ll
index b12982dd27e404..82ea9bb439b0bb 100644
--- a/llvm/test/Transforms/InstCombine/phi.ll
+++ b/llvm/test/Transforms/InstCombine/phi.ll
@@ -2714,3 +2714,31 @@ join:
   %cmp = icmp slt i32 %13, 0
   ret i1 %cmp
 }
+
+define i64 @wrong_gep_arg_into_phi(ptr noundef %ptr) {
+; CHECK-LABEL: @wrong_gep_arg_into_phi(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[FOR_COND:%.*]]
+; CHECK:       for.cond:
+; CHECK-NEXT:    [[PTR_PN:%.*]] = phi ptr [ [[PTR:%.*]], [[ENTRY:%.*]] ], [ [[DOTPN:%.*]], [[FOR_COND]] ]
+; CHECK-NEXT:    [[DOTPN]] = getelementptr i8, ptr [[PTR_PN]], i64 1
+; CHECK-NEXT:    [[VAL:%.*]] = load i8, ptr [[DOTPN]], align 1
+; CHECK-NEXT:    [[COND_NOT:%.*]] = icmp eq i8 [[VAL]], 0
+; CHECK-NEXT:    br i1 [[COND_NOT]], label [[EXIT:%.*]], label [[FOR_COND]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 0
+;
+entry:
+  %add.ptr = getelementptr i8, ptr %ptr, i64 1
+  br label %for.cond
+
+for.cond:                                         ; preds = %for.cond, %entry
+  %.pn = phi ptr [ %add.ptr, %entry ], [ %incdec.ptr, %for.cond ]
+  %val = load i8, ptr %.pn, align 1
+  %cond = icmp ne i8 %val, 0
+  %incdec.ptr = getelementptr inbounds nuw i8, ptr %.pn, i64 1
+  br i1 %cond, label %for.cond, label %exit
+
+exit:                                             ; preds = %for.cond
+  ret i64 0
+}
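To make the flag handling in the InstCombinePHI.cpp hunk concrete, here is a minimal standalone sketch, not LLVM code. It assumes no-wrap flags can be modelled as a plain bitmask; `Flags`, `kInBounds`, `kNUSW`, `kNUW`, `intersectAll` and `intersectSkippingFirst` are hypothetical names introduced only for illustration. It contrasts seeding the running intersection with every flag while scanning only the remaining operands (the removed line combined with the `drop_begin` loop) against seeding it from the first operand's own flags (the added line).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical bitmask model of GEP no-wrap flags (not the LLVM type).
using Flags = unsigned;
constexpr Flags kNone = 0;
constexpr Flags kInBounds = 1u << 0;
constexpr Flags kNUSW = 1u << 1;
constexpr Flags kNUW = 1u << 2;
constexpr Flags kAll = kInBounds | kNUSW | kNUW;

// Post-patch model: start from the first operand's flags, then intersect
// with every remaining operand. Precondition: `ops` is non-empty.
inline Flags intersectAll(const std::vector<Flags> &ops) {
  Flags merged = ops.front();
  for (std::size_t i = 1; i < ops.size(); ++i)
    merged &= ops[i];
  return merged;
}

// Pre-patch model: start from "all flags" and skip the first operand, so
// flags that the first operand lacks are never removed from the result.
inline Flags intersectSkippingFirst(const std::vector<Flags> &ops) {
  Flags merged = kAll;
  for (std::size_t i = 1; i < ops.size(); ++i)
    merged &= ops[i];
  return merged;
}
```

With operands modelled on the new phi.ll test (a flagless GEP first, then an `inbounds nuw` one), `intersectAll({kNone, kInBounds | kNUW})` yields `kNone`, while `intersectSkippingFirst` on the same input keeps `kInBounds | kNUW`, which is the kind of over-claimed flag the test guards against.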
[llvm-branch-commits] [llvm] release/19.x: backport PR115901 (PR #116104)
https://github.com/antoniofrighetto milestoned https://github.com/llvm/llvm-project/pull/116104
[llvm-branch-commits] [llvm] release/19.x [InstCombine] Drop nsw in negation of select (PR #116097)
llvmbot wrote:

@llvm/pr-subscribers-llvm-transforms

Author: Rose (AreaZR)

Changes

Closes https://github.com/llvm/llvm-project/issues/112666 and https://github.com/llvm/llvm-project/issues/114181.

(cherry-picked from 8d86a537ad756e31832eab67371179e881452fb5)

---
Full diff: https://github.com/llvm/llvm-project/pull/116097.diff

2 Files Affected:

- (modified) llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp (+11)
- (modified) llvm/test/Transforms/InstCombine/sub-of-negatible.ll (+42)

```diff
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp b/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
index e4895b59f4b4a9..cb052da79bb3c6 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
@@ -334,6 +334,17 @@ std::array Negator::getSortedOperandsOfBinOp(Instruction *I) {
       NewSelect->swapValues();
       // Don't swap prof metadata, we didn't change the branch behavior.
       NewSelect->setName(I->getName() + ".neg");
+      // Poison-generating flags should be dropped
+      Value *TV = NewSelect->getTrueValue();
+      Value *FV = NewSelect->getFalseValue();
+      if (match(TV, m_Neg(m_Specific(FV))))
+        cast<Instruction>(TV)->dropPoisonGeneratingFlags();
+      else if (match(FV, m_Neg(m_Specific(TV))))
+        cast<Instruction>(FV)->dropPoisonGeneratingFlags();
+      else {
+        cast<Instruction>(TV)->dropPoisonGeneratingFlags();
+        cast<Instruction>(FV)->dropPoisonGeneratingFlags();
+      }
       Builder.Insert(NewSelect);
       return NewSelect;
     }
diff --git a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
index b2e14ceaca1b08..f9549881aa3131 100644
--- a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
+++ b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
@@ -1374,6 +1374,48 @@ define i8 @negate_select_of_op_vs_negated_op(i8 %x, i8 %y, i1 %c) {
   %t2 = sub i8 %y, %t1
   ret i8 %t2
 }
+
+define i8 @negate_select_of_op_vs_negated_op_nsw(i8 %x, i8 %y, i1 %c) {
+; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw(
+; CHECK-NEXT:    [[T0:%.*]] = sub i8 0, [[X:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[X]], i8 [[T0]]
+; CHECK-NEXT:    [[T2:%.*]] = add i8 [[TMP1]], [[Y:%.*]]
+; CHECK-NEXT:    ret i8 [[T2]]
+;
+  %t0 = sub nsw i8 0, %x
+  %t1 = select i1 %c, i8 %t0, i8 %x
+  %t2 = sub i8 %y, %t1
+  ret i8 %t2
+}
+
+define i8 @negate_select_of_op_vs_negated_op_nsw_commuted(i8 %x, i8 %y, i1 %c) {
+; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw_commuted(
+; CHECK-NEXT:    [[T0:%.*]] = sub i8 0, [[X:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[T0]], i8 [[X]]
+; CHECK-NEXT:    [[T2:%.*]] = add i8 [[TMP1]], [[Y:%.*]]
+; CHECK-NEXT:    ret i8 [[T2]]
+;
+  %t0 = sub nsw i8 0, %x
+  %t1 = select i1 %c, i8 %x, i8 %t0
+  %t2 = sub i8 %y, %t1
+  ret i8 %t2
+}
+
+define i8 @negate_select_of_op_vs_negated_op_nsw_xyyx(i8 %x, i8 %y, i8 %z, i1 %c) {
+; CHECK-LABEL: @negate_select_of_op_vs_negated_op_nsw_xyyx(
+; CHECK-NEXT:    [[SUB1:%.*]] = sub i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[SUB2:%.*]] = sub i8 [[Y]], [[X]]
+; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C:%.*]], i8 [[SUB2]], i8 [[SUB1]]
+; CHECK-NEXT:    [[T2:%.*]] = add i8 [[TMP1]], [[Z:%.*]]
+; CHECK-NEXT:    ret i8 [[T2]]
+;
+  %sub1 = sub nsw i8 %x, %y
+  %sub2 = sub nsw i8 %y, %x
+  %t1 = select i1 %c, i8 %sub1, i8 %sub2
+  %t2 = sub i8 %z, %t1
+  ret i8 %t2
+}
+
 define i8 @dont_negate_ordinary_select(i8 %x, i8 %y, i8 %z, i1 %c) {
 ; CHECK-LABEL: @dont_negate_ordinary_select(
 ; CHECK-NEXT:    [[T0:%.*]] = select i1 [[C:%.*]], i8 [[X:%.*]], i8 [[Y:%.*]]
```

https://github.com/llvm/llvm-project/pull/116097
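Why the `nsw` flag has to go can be seen with a small standalone model of the first new test, `negate_select_of_op_vs_negated_op_nsw`. This is a sketch under stated assumptions, not LLVM code: poison is modelled as an empty `std::optional`, and `wrap`, `negNsw`, `original`, `negatedKeepingNsw` and `negatedDroppingNsw` are hypothetical helpers. The point it illustrates is that swapping the select arms moves the negation onto the path where it was previously unselected, so an overflow that used to be harmless for `x = -128` becomes the selected value.

```cpp
#include <cstdint>
#include <optional>

using I8 = std::int8_t;
using MaybePoison = std::optional<I8>; // std::nullopt models a poison value

// Two's-complement wrap of an int into 8 bits, i.e. i8 arithmetic without
// nsw. The int -> uint8_t step is modular by definition.
inline I8 wrap(int v) { return static_cast<I8>(static_cast<std::uint8_t>(v)); }

// Model of `sub nsw i8 0, x`: poison when the negation overflows.
inline MaybePoison negNsw(I8 x) {
  if (x == INT8_MIN)
    return std::nullopt;
  return static_cast<I8>(-x);
}

// Input IR: t1 = select c, (sub nsw 0, x), x ; t2 = sub y, t1
inline MaybePoison original(I8 x, I8 y, bool c) {
  MaybePoison t1 = c ? negNsw(x) : MaybePoison(x);
  if (!t1)
    return std::nullopt;
  return wrap(y - *t1);
}

// Negated select (arms swapped, sub turned into add) with nsw kept.
inline MaybePoison negatedKeepingNsw(I8 x, I8 y, bool c) {
  MaybePoison t1 = c ? MaybePoison(x) : negNsw(x);
  if (!t1)
    return std::nullopt;
  return wrap(*t1 + y);
}

// Same transform with the flag dropped: the negation simply wraps.
inline MaybePoison negatedDroppingNsw(I8 x, I8 y, bool c) {
  I8 t1 = c ? x : wrap(-x);
  return wrap(t1 + y);
}
```

For `x = -128`, `y = 0`, `c = false`, `original` is well defined, `negatedKeepingNsw` is poison in this model, and `negatedDroppingNsw` agrees with `original`.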
[llvm-branch-commits] [CallGraphSection] Add call graph section options and documentation (PR #87572)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87572
[llvm-branch-commits] [mlir] [mlir][bufferization] Remove `finalizing-bufferize` pass (PR #114154)
https://github.com/matthias-springer updated https://github.com/llvm/llvm-project/pull/114154 >From 1e194b399b21ed1ef577803cadc199827e4d7431 Mon Sep 17 00:00:00 2001 From: Matthias Springer Date: Wed, 30 Oct 2024 00:46:05 +0100 Subject: [PATCH] [mlir][bufferization] Remove `finalizing-bufferize` pass The dialect conversion-based bufferization passes have been migrated to One-Shot Bufferize about two years ago. To clean up the code base, this commit removes the `finalizing-bufferize` pass, one of the few remaining parts of the old infrastructure. Most bufferization passes have already been removed. Note for LLVM integration: If you depend on this pass, migrate to One-Shot Bufferize or copy the pass to your codebase. Depends on #114152. --- .../Bufferization/Transforms/Bufferize.h | 6 -- .../Dialect/Bufferization/Transforms/Passes.h | 4 - .../Bufferization/Transforms/Passes.td| 16 .../Bufferization/Transforms/Bufferize.cpp| 75 --- .../Pipelines/SparseTensorPipelines.cpp | 2 - .../Transforms/finalizing-bufferize.mlir | 95 --- 6 files changed, 198 deletions(-) delete mode 100644 mlir/test/Dialect/Bufferization/Transforms/finalizing-bufferize.mlir diff --git a/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h b/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h index 1603dfcbae5589..ebed2c354bfca5 100644 --- a/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h +++ b/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h @@ -56,12 +56,6 @@ class BufferizeTypeConverter : public TypeConverter { /// populateEliminateBufferizeMaterializationsPatterns. void populateBufferizeMaterializationLegality(ConversionTarget &target); -/// Populate patterns to eliminate bufferize materializations. -/// -/// In particular, these are the tensor_load/buffer_cast ops. 
-void populateEliminateBufferizeMaterializationsPatterns( -const BufferizeTypeConverter &typeConverter, RewritePatternSet &patterns); - /// Bufferize `op` and its nested ops that implement `BufferizableOpInterface`. /// /// Note: This function does not resolve read-after-write conflicts. Use this diff --git a/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.h b/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.h index ab9a48f3473c27..fe43a05c81fdc3 100644 --- a/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.h +++ b/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.h @@ -200,10 +200,6 @@ std::unique_ptr createEmptyTensorToAllocTensorPass(); /// Drop all memref function results that are equivalent to a function argument. LogicalResult dropEquivalentBufferResults(ModuleOp module); -/// Creates a pass that finalizes a partial bufferization by removing remaining -/// bufferization.to_tensor and bufferization.to_memref operations. -std::unique_ptr> createFinalizingBufferizePass(); - /// Create a pass that bufferizes all ops that implement BufferizableOpInterface /// with One-Shot Bufferize. std::unique_ptr createOneShotBufferizePass(); diff --git a/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td b/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td index 2743de43fb9cfa..3e93f33ffe0fb4 100644 --- a/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td +++ b/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td @@ -343,22 +343,6 @@ def BufferResultsToOutParams : Pass<"buffer-results-to-out-params", "ModuleOp"> let dependentDialects = ["memref::MemRefDialect"]; } -def FinalizingBufferize : Pass<"finalizing-bufferize", "func::FuncOp"> { - let summary = "Finalize a partial bufferization"; - let description = [{ -A bufferize pass that finalizes a partial bufferization by removing -remaining `bufferization.to_tensor` and `bufferization.to_buffer` operations. 
- -The removal of those operations is only possible if the operations only -exist in pairs, i.e., all uses of `bufferization.to_tensor` operations are -`bufferization.to_buffer` operations. - -This pass will fail if not all operations can be removed or if any operation -with tensor typed operands remains. - }]; - let constructor = "mlir::bufferization::createFinalizingBufferizePass()"; -} - def DropEquivalentBufferResults : Pass<"drop-equivalent-buffer-results", "ModuleOp"> { let summary = "Remove MemRef return values that are equivalent to a bbArg"; let description = [{ diff --git a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp index 1d009b03754c52..62ce2583f4fa1d 100644 --- a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp +++ b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp @@ -26,7 +26,6 @@ namespace mlir { namespace bufferization { -#define GEN_PASS_DEF_FINALIZINGBUFFERIZE #define GEN_PASS_DEF_BUFFERIZATIONBUFFERIZE #define GEN_PASS_DEF_ONESHOTBUFFERIZE #includ
[llvm-branch-commits] [mlir] [mlir][bufferization] Remove remaining dialect conversion-based infra parts (PR #114155)
https://github.com/matthias-springer updated https://github.com/llvm/llvm-project/pull/114155 >From 5c02edc9f35d4c35b2c25bc3dba4d10531e2a4ab Mon Sep 17 00:00:00 2001 From: Matthias Springer Date: Wed, 30 Oct 2024 00:58:32 +0100 Subject: [PATCH] [mlir][bufferization] Remove remaining dialect conversion-based infra parts This commit removes the last remaining components of the dialect conversion-based bufferization passes. Note for LLVM integration: If you depend on these components, migrate to One-Shot Bufferize or copy them to your codebase. Depends on #114154. --- .../Bufferization/Transforms/Bufferize.h | 23 -- .../mlir/Dialect/Func/Transforms/Passes.h | 4 - .../Bufferization/Transforms/BufferUtils.cpp | 6 +- .../Bufferization/Transforms/Bufferize.cpp| 73 --- 4 files changed, 4 insertions(+), 102 deletions(-) diff --git a/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h b/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h index ebed2c354bfca5..2f495d304b4a56 100644 --- a/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h +++ b/mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h @@ -38,24 +38,6 @@ struct BufferizationStatistics { int64_t numTensorOutOfPlace = 0; }; -/// A helper type converter class that automatically populates the relevant -/// materializations and type conversions for bufferization. -class BufferizeTypeConverter : public TypeConverter { -public: - BufferizeTypeConverter(); -}; - -/// Marks ops used by bufferization for type conversion materializations as -/// "legal" in the given ConversionTarget. -/// -/// This function should be called by all bufferization passes using -/// BufferizeTypeConverter so that materializations work properly. 
One exception -/// is bufferization passes doing "full" conversions, where it can be desirable -/// for even the materializations to remain illegal so that they are eliminated, -/// such as via the patterns in -/// populateEliminateBufferizeMaterializationsPatterns. -void populateBufferizeMaterializationLegality(ConversionTarget &target); - /// Bufferize `op` and its nested ops that implement `BufferizableOpInterface`. /// /// Note: This function does not resolve read-after-write conflicts. Use this @@ -81,11 +63,6 @@ LogicalResult bufferizeOp(Operation *op, const BufferizationOptions &options, LogicalResult bufferizeBlockSignature(Block *block, RewriterBase &rewriter, const BufferizationOptions &options); -/// Return `BufferizationOptions` such that the `bufferizeOp` behaves like the -/// old (deprecated) partial, dialect conversion-based bufferization passes. A -/// copy will be inserted before every buffer write. -BufferizationOptions getPartialBufferizationOptions(); - } // namespace bufferization } // namespace mlir diff --git a/mlir/include/mlir/Dialect/Func/Transforms/Passes.h b/mlir/include/mlir/Dialect/Func/Transforms/Passes.h index 02fc9e1d934390..0248f068320c54 100644 --- a/mlir/include/mlir/Dialect/Func/Transforms/Passes.h +++ b/mlir/include/mlir/Dialect/Func/Transforms/Passes.h @@ -18,10 +18,6 @@ #include "mlir/Pass/Pass.h" namespace mlir { -namespace bufferization { -class BufferizeTypeConverter; -} // namespace bufferization - class RewritePatternSet; namespace func { diff --git a/mlir/lib/Dialect/Bufferization/Transforms/BufferUtils.cpp b/mlir/lib/Dialect/Bufferization/Transforms/BufferUtils.cpp index 8fffdbf664c3f4..b11803da19ef98 100644 --- a/mlir/lib/Dialect/Bufferization/Transforms/BufferUtils.cpp +++ b/mlir/lib/Dialect/Bufferization/Transforms/BufferUtils.cpp @@ -11,6 +11,8 @@ //===--===// #include "mlir/Dialect/Bufferization/Transforms/BufferUtils.h" + +#include "mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h" #include 
"mlir/Dialect/Bufferization/Transforms/Bufferize.h" #include "mlir/Dialect/MemRef/IR/MemRef.h" #include "mlir/Dialect/MemRef/Utils/MemRefUtils.h" @@ -138,8 +140,8 @@ bufferization::getGlobalFor(arith::ConstantOp constantOp, uint64_t alignment, alignment > 0 ? IntegerAttr::get(globalBuilder.getI64Type(), alignment) : IntegerAttr(); - BufferizeTypeConverter typeConverter; - auto memrefType = cast(typeConverter.convertType(type)); + auto memrefType = + cast(getMemRefTypeWithStaticIdentityLayout(type)); if (memorySpace) memrefType = MemRefType::Builder(memrefType).setMemorySpace(memorySpace); auto global = globalBuilder.create( diff --git a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp index 62ce2583f4fa1d..6f0cdfa20f7be5 100644 --- a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp +++ b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp @@ -37,65 +37,6 @@ namespace bufferization { using namespace mlir; using namespace mlir::bufferization
[llvm-branch-commits] [mlir] [mlir][bufferization] Remove remaining dialect conversion-based infra parts (PR #114155)
@@ -86,18 +86,13 @@ getOrCreateFuncAnalysisState(OneShotAnalysisState &state) {
   return state.addExtension();
 }
 
-/// Return the unique ReturnOp that terminates `funcOp`.
-/// Return nullptr if there is no such unique ReturnOp.
-static func::ReturnOp getAssumedUniqueReturnOp(func::FuncOp funcOp) {
-  func::ReturnOp returnOp;
-  for (Block &b : funcOp.getBody()) {
-    if (auto candidateOp = dyn_cast(b.getTerminator())) {
-      if (returnOp)
-        return nullptr;
-      returnOp = candidateOp;
-    }
-  }
-  return returnOp;
+/// Return all top-level func.return ops in the given function.

matthias-springer wrote:

The diff was broken because a dependent commit got merged. I rebased on the latest state.

https://github.com/llvm/llvm-project/pull/114155
[llvm-branch-commits] [llvm] [CGData] Refactor Global Merge Functions (PR #115750)
kyulee-com wrote:

> I can confirm that performance has improved significantly in my testing on no-LTO projects, to the point that the slowdown is now acceptable. Before applying the PR it was about a 50% slowdown; now it is ~5%.

That's great to hear! Since these PRs appear to be functioning, is it okay to merge them for now while we continue to discuss further improvements? Or do you have more comments to be addressed?

https://github.com/llvm/llvm-project/pull/115750
[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extend CallSiteInfo with TypeId (PR #87574)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87574
[llvm-branch-commits] [flang] [flang] Introduce HLFIR lowerings to omp.workshare_loop_nest (PR #104748)
https://github.com/ivanradanov edited https://github.com/llvm/llvm-project/pull/104748
[llvm-branch-commits] [flang] [flang] Introduce HLFIR lowerings to omp.workshare_loop_nest (PR #104748)
ivanradanov wrote:

@tblah It is ready for review. I had just forgotten to remove the [WIP] from the title, sorry about that.

https://github.com/llvm/llvm-project/pull/104748
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
ivanradanov wrote:

@tblah I think they are in a good state - I just need a review on this one - the other ones are approved.

https://github.com/llvm/llvm-project/pull/101446
[llvm-branch-commits] [AsmPrinter][CallGraphSection] Emit call graph section (PR #87576)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87576
[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extract and propagate indirect call type ids (PR #87575)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87575
[llvm-branch-commits] [clang] [clang][CallGraphSection] Add type id metadata to indirect call and targets (PR #87573)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87573

>From a8a5848885e12c771f12cfa33b4dbc6a0272e925 Mon Sep 17 00:00:00 2001
From: Prabhuk
Date: Mon, 22 Apr 2024 11:34:04 -0700
Subject: [PATCH 1/3] Update clang/lib/CodeGen/CodeGenModule.cpp

Cleaner if checks.

Co-authored-by: Matt Arsenault
---
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index e19bbee996f582..ff1586d2fa8abe 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2711,7 +2711,7 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 void CodeGenModule::CreateFunctionTypeMetadataForIcall(const QualType &QT,
                                                        llvm::CallBase *CB) {
   // Only if needed for call graph section and only for indirect calls.
-  if (!(CodeGenOpts.CallGraphSection && CB && CB->isIndirectCall()))
+  if (!CodeGenOpts.CallGraphSection || !CB || !CB->isIndirectCall())
     return;
 
   auto *MD = CreateMetadataIdentifierGeneralized(QT);

>From 019b2ca5e1c263183ed114e0b967b4e77b4a17a8 Mon Sep 17 00:00:00 2001
From: Prabhuk
Date: Mon, 22 Apr 2024 11:34:31 -0700
Subject: [PATCH 2/3] Update clang/lib/CodeGen/CodeGenModule.cpp

Update the comments as suggested.

Co-authored-by: Matt Arsenault
---
 clang/lib/CodeGen/CodeGenModule.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index ff1586d2fa8abe..5635a87d2358a7 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2680,9 +2680,9 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
   bool EmittedMDIdGeneralized = false;
   if (CodeGenOpts.CallGraphSection &&
       (!F->hasLocalLinkage() ||
-       F->getFunction().hasAddressTaken(nullptr, /* IgnoreCallbackUses */ true,
-                                        /* IgnoreAssumeLikeCalls */ true,
-                                        /* IgnoreLLVMUsed */ false))) {
+       F->getFunction().hasAddressTaken(nullptr, /*IgnoreCallbackUses=*/ true,
+                                        /*IgnoreAssumeLikeCalls=*/ true,
+                                        /*IgnoreLLVMUsed=*/ false))) {
     F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
     EmittedMDIdGeneralized = true;
   }

>From 99242900c51778abd4b7e7f4361b09202b7abcda Mon Sep 17 00:00:00 2001
From: Prabhuk
Date: Mon, 29 Apr 2024 11:53:40 -0700
Subject: [PATCH 3/3] dyn_cast to isa

Created using spr 1.3.6-beta.1
---
 clang/lib/CodeGen/CGCall.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 526a63b24ff834..45033ced1d8344 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5713,8 +5713,8 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
   if (callOrInvoke && *callOrInvoke && (*callOrInvoke)->isIndirectCall()) {
     if (const FunctionDecl *FD = dyn_cast_or_null(TargetDecl)) {
       // Type id metadata is set only for C/C++ contexts.
-      if (dyn_cast(FD) || dyn_cast(FD) ||
-          dyn_cast(FD)) {
+      if (isa(FD) || isa(FD) ||
+          isa(FD)) {
         CGM.CreateFunctionTypeMetadataForIcall(FD->getType(), *callOrInvoke);
       }
     }
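The "cleaner if checks" change in patch 1/3 is an application of De Morgan's law that keeps the null guard in front of the member call. The following standalone sketch checks that equivalence exhaustively. It is not clang code: `Call`, `skipOld`, `skipNew` and `formsAgree` are hypothetical stand-ins for `llvm::CallBase`, the two spellings of the early-return condition, and a small checker.

```cpp
#include <initializer_list>

// Hypothetical stand-in for a call instruction; only the one property the
// condition inspects is modelled.
struct Call {
  bool indirect;
};

// Old spelling: negate the whole conjunction.
inline bool skipOld(bool enabled, const Call *cb) {
  return !(enabled && cb && cb->indirect);
}

// New spelling: a disjunction of negations. Short-circuit evaluation still
// tests `!cb` before `cb->indirect`, so a null pointer is never dereferenced.
inline bool skipNew(bool enabled, const Call *cb) {
  return !enabled || !cb || !cb->indirect;
}

// Compare both spellings on every combination of option value and call kind.
inline bool formsAgree() {
  const Call direct{false};
  const Call indirect{true};
  const Call *calls[] = {nullptr, &direct, &indirect};
  for (bool enabled : {false, true})
    for (const Call *cb : calls)
      if (skipOld(enabled, cb) != skipNew(enabled, cb))
        return false;
  return true;
}
```

Only the case where the option is enabled and the call is a non-null indirect call falls through; every other input returns early under both spellings.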
[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extend CallSiteInfo with TypeId (PR #87574)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87574
[llvm-branch-commits] [AsmPrinter][CallGraphSection] Emit call graph section (PR #87576)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87576
[llvm-branch-commits] [CallGraphSection] Add call graph section options and documentation (PR #87572)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87572
[llvm-branch-commits] [CallGraphSection] Add call graph section options and documentation (PR #87572)
github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

```bash
git-clang-format --diff c960e4d69a31fa560f45d5b1eb4ba069f47467fb 8d7d2f2b8e335cbeef6fb698ab67c6c0bba4a14b --extensions h,c,cpp -- clang/test/Driver/call-graph-section.c clang/lib/CodeGen/BackendUtil.cpp clang/lib/Driver/ToolChains/Clang.cpp llvm/include/llvm/CodeGen/CommandFlags.h llvm/include/llvm/Target/TargetOptions.h llvm/lib/CodeGen/CommandFlags.cpp
```

View the diff from clang-format here.

```diff
diff --git a/llvm/include/llvm/Target/TargetOptions.h b/llvm/include/llvm/Target/TargetOptions.h
index 91ce6a911c..02fcf9cbe6 100644
--- a/llvm/include/llvm/Target/TargetOptions.h
+++ b/llvm/include/llvm/Target/TargetOptions.h
@@ -149,10 +149,11 @@ namespace llvm {
           EmulatedTLS(false), EnableTLSDESC(false), EnableIPRA(false),
           EmitStackSizeSection(false), EnableMachineOutliner(false),
           EnableMachineFunctionSplitter(false), SupportsDefaultOutlining(false),
-          EmitAddrsig(false), BBAddrMap(false), EmitCallGraphSection(false), EmitCallSiteInfo(false),
-          SupportsDebugEntryValues(false), EnableDebugEntryValues(false),
-          ValueTrackingVariableLocations(false), ForceDwarfFrameSection(false),
-          XRayFunctionIndex(true), DebugStrictDwarf(false), Hotpatch(false),
+          EmitAddrsig(false), BBAddrMap(false), EmitCallGraphSection(false),
+          EmitCallSiteInfo(false), SupportsDebugEntryValues(false),
+          EnableDebugEntryValues(false), ValueTrackingVariableLocations(false),
+          ForceDwarfFrameSection(false), XRayFunctionIndex(true),
+          DebugStrictDwarf(false), Hotpatch(false),
           PPCGenScalarMASSEntries(false), JMCInstrument(false),
           EnableCFIFixup(false), MisExpect(false), XCOFFReadOnlyPointers(false),
           FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}
```

https://github.com/llvm/llvm-project/pull/87572
[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extract and propagate indirect call type ids (PR #87575)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87575
[llvm-branch-commits] [llvm] [CGData] Refactor Global Merge Functions (PR #115750)
nocchijiang wrote:

> I suspect that the no-LTO case might still encounter some slowdown, as each CU needs to read the entire CGData regardless.

I can confirm that performance has improved significantly in my testing on no-LTO projects, to the point that the slowdown is now acceptable. Before applying the PR it was about a 50% slowdown; now it is ~5%.

> Alternatively, we could restructure the indexed CGData to allow for reading only the relevant hash entries on demand.

Besides consuming only the matched stable entries, as this PR does, this is exactly what I planned to do to reduce the memory footprint of the deserialized CGData. I would like to discuss the details with you in the RFC thread to make sure that we are on the same page before coding it.

https://github.com/llvm/llvm-project/pull/115750
[llvm-branch-commits] [clang] [clang][CallGraphSection] Add type id metadata to indirect call and targets (PR #87573)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87573

>From a8a5848885e12c771f12cfa33b4dbc6a0272e925 Mon Sep 17 00:00:00 2001
From: Prabhuk
Date: Mon, 22 Apr 2024 11:34:04 -0700
Subject: [PATCH 1/3] Update clang/lib/CodeGen/CodeGenModule.cpp

Cleaner if checks.

Co-authored-by: Matt Arsenault
---
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index e19bbee996f582..ff1586d2fa8abe 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2711,7 +2711,7 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 void CodeGenModule::CreateFunctionTypeMetadataForIcall(const QualType &QT,
                                                        llvm::CallBase *CB) {
   // Only if needed for call graph section and only for indirect calls.
-  if (!(CodeGenOpts.CallGraphSection && CB && CB->isIndirectCall()))
+  if (!CodeGenOpts.CallGraphSection || !CB || !CB->isIndirectCall())
     return;
 
   auto *MD = CreateMetadataIdentifierGeneralized(QT);

>From 019b2ca5e1c263183ed114e0b967b4e77b4a17a8 Mon Sep 17 00:00:00 2001
From: Prabhuk
Date: Mon, 22 Apr 2024 11:34:31 -0700
Subject: [PATCH 2/3] Update clang/lib/CodeGen/CodeGenModule.cpp

Update the comments as suggested.

Co-authored-by: Matt Arsenault
---
 clang/lib/CodeGen/CodeGenModule.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index ff1586d2fa8abe..5635a87d2358a7 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2680,9 +2680,9 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
   bool EmittedMDIdGeneralized = false;
   if (CodeGenOpts.CallGraphSection &&
       (!F->hasLocalLinkage() ||
-       F->getFunction().hasAddressTaken(nullptr, /* IgnoreCallbackUses */ true,
-                                        /* IgnoreAssumeLikeCalls */ true,
-                                        /* IgnoreLLVMUsed */ false))) {
+       F->getFunction().hasAddressTaken(nullptr, /*IgnoreCallbackUses=*/ true,
+                                        /*IgnoreAssumeLikeCalls=*/ true,
+                                        /*IgnoreLLVMUsed=*/ false))) {
     F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
     EmittedMDIdGeneralized = true;
   }

>From 99242900c51778abd4b7e7f4361b09202b7abcda Mon Sep 17 00:00:00 2001
From: Prabhuk
Date: Mon, 29 Apr 2024 11:53:40 -0700
Subject: [PATCH 3/3] dyn_cast to isa

Created using spr 1.3.6-beta.1
---
 clang/lib/CodeGen/CGCall.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 526a63b24ff834..45033ced1d8344 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5713,8 +5713,8 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
   if (callOrInvoke && *callOrInvoke && (*callOrInvoke)->isIndirectCall()) {
     if (const FunctionDecl *FD = dyn_cast_or_null(TargetDecl)) {
       // Type id metadata is set only for C/C++ contexts.
-      if (dyn_cast(FD) || dyn_cast(FD) ||
-          dyn_cast(FD)) {
+      if (isa(FD) || isa(FD) ||
+          isa(FD)) {
         CGM.CreateFunctionTypeMetadataForIcall(FD->getType(), *callOrInvoke);
       }
     }
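The guard rewrite in PATCH 1/3 above is an application of De Morgan's law. As a minimal sketch, the two forms can be compared exhaustively. The struct and function names below are hypothetical stand-ins for the three conditions, not clang code.

```cpp
// Hypothetical stand-in for the three conditions tested by the guard.
struct GuardInputs {
  bool SectionEnabled; // plays the role of CodeGenOpts.CallGraphSection
  bool HasCall;        // plays the role of CB being non-null
  bool IsIndirect;     // plays the role of CB->isIndirectCall()
};

// Form before the patch: negate the whole conjunction.
inline bool skipOriginal(const GuardInputs &G) {
  return !(G.SectionEnabled && G.HasCall && G.IsIndirect);
}

// Form after the patch: disjunction of the negated conditions.
inline bool skipRewritten(const GuardInputs &G) {
  return !G.SectionEnabled || !G.HasCall || !G.IsIndirect;
}

// Compares both forms over all eight input combinations.
inline bool formsAgree() {
  for (int Bits = 0; Bits < 8; ++Bits) {
    GuardInputs G{(Bits & 1) != 0, (Bits & 2) != 0, (Bits & 4) != 0};
    if (skipOriginal(G) != skipRewritten(G))
      return false;
  }
  return true;
}
```

Both forms evaluate left to right with short-circuiting, so in the real code `CB` is still checked for null before `CB->isIndirectCall()` can be reached.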
[llvm-branch-commits] [llvm] [CGData] Refactor Global Merge Functions (PR #115750)
https://github.com/nocchijiang approved this pull request. https://github.com/llvm/llvm-project/pull/115750
[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extract and propagate indirect call type ids (PR #87575)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87575
[llvm-branch-commits] [CallGraphSection] Add call graph section options and documentation (PR #87572)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87572
[llvm-branch-commits] [CallSiteInfo][CallGraphSection] Extend CallSiteInfo with TypeId (PR #87574)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87574
[llvm-branch-commits] [AsmPrinter][CallGraphSection] Emit call graph section (PR #87576)
https://github.com/Prabhuk updated https://github.com/llvm/llvm-project/pull/87576
[llvm-branch-commits] [clang] be995b8 - Revert "[Fuchsia][CMake] Enable new libc header gen (#102371)"
Author: Petr Hosek Date: 2024-11-13T23:32:07-08:00 New Revision: be995b825da9c12c8fead48d2e5ba575f154bddf URL: https://github.com/llvm/llvm-project/commit/be995b825da9c12c8fead48d2e5ba575f154bddf DIFF: https://github.com/llvm/llvm-project/commit/be995b825da9c12c8fead48d2e5ba575f154bddf.diff LOG: Revert "[Fuchsia][CMake] Enable new libc header gen (#102371)" This reverts commit d492001bdcd7bfcd19ada7459a6b0eaf81ba3ba2. Added: Modified: clang/cmake/caches/Fuchsia-stage2.cmake Removed: diff --git a/clang/cmake/caches/Fuchsia-stage2.cmake b/clang/cmake/caches/Fuchsia-stage2.cmake index d5859c8806b561..5af98c7b3b3fba 100644 --- a/clang/cmake/caches/Fuchsia-stage2.cmake +++ b/clang/cmake/caches/Fuchsia-stage2.cmake @@ -329,7 +329,7 @@ foreach(target armv6m-none-eabi;armv7m-none-eabi;armv8m.main-none-eabi) foreach(lang C;CXX;ASM) # TODO: The preprocessor defines workaround various issues in libc and libc++ integration. # These should be addressed and removed over time. -set(RUNTIMES_${target}_CMAKE_${lang}_local_flags "--target=${target} -mthumb -Wno-atomic-alignment \"-Dvfprintf(stream, format, vlist)=vprintf(format, vlist)\" \"-Dfprintf(stream, format, ...)=printf(format)\" \"-Dgettimeofday(tv, tz)\" -D_LIBCPP_PRINT=1") +set(RUNTIMES_${target}_CMAKE_${lang}_local_flags "--target=${target} -mthumb -Wno-atomic-alignment \"-Dvfprintf(stream, format, vlist)=vprintf(format, vlist)\" \"-Dfprintf(stream, format, ...)=printf(format)\" \"-Dtimeval=struct timeval{int tv_sec; int tv_usec;}\" \"-Dgettimeofday(tv, tz)\" -D_LIBCPP_PRINT=1") if(${target} STREQUAL "armv8m.main-none-eabi") set(RUNTIMES_${target}_CMAKE_${lang}_local_flags "${RUNTIMES_${target}_CMAKE_${lang}_local_flags} -mfloat-abi=softfp -march=armv8m.main+fp+dsp -mcpu=cortex-m33" CACHE STRING "") endif() @@ -340,6 +340,7 @@ foreach(target armv6m-none-eabi;armv7m-none-eabi;armv8m.main-none-eabi) endforeach() set(RUNTIMES_${target}_LLVM_LIBC_FULL_BUILD ON CACHE BOOL "") set(RUNTIMES_${target}_LIBC_ENABLE_USE_BY_CLANG ON 
CACHE BOOL "") + set(RUNTIMES_${target}_LIBC_USE_NEW_HEADER_GEN OFF CACHE BOOL "") set(RUNTIMES_${target}_LIBCXX_ABI_VERSION 2 CACHE STRING "") set(RUNTIMES_${target}_LIBCXX_CXX_ABI none CACHE STRING "") set(RUNTIMES_${target}_LIBCXX_ENABLE_SHARED OFF CACHE BOOL "") @@ -384,13 +385,14 @@ foreach(target riscv32-unknown-elf) foreach(lang C;CXX;ASM) # TODO: The preprocessor defines workaround various issues in libc and libc++ integration. # These should be addressed and removed over time. -set(RUNTIMES_${target}_CMAKE_${lang}_FLAGS "--target=${target} -march=rv32imafc -mabi=ilp32f -Wno-atomic-alignment \"-Dvfprintf(stream, format, vlist)=vprintf(format, vlist)\" \"-Dfprintf(stream, format, ...)=printf(format)\" \"-Dgettimeofday(tv, tz)\" -D_LIBCPP_PRINT=1" CACHE STRING "") +set(RUNTIMES_${target}_CMAKE_${lang}_FLAGS "--target=${target} -march=rv32imafc -mabi=ilp32f -Wno-atomic-alignment \"-Dvfprintf(stream, format, vlist)=vprintf(format, vlist)\" \"-Dfprintf(stream, format, ...)=printf(format)\" \"-Dtimeval=struct timeval{int tv_sec; int tv_usec;}\" \"-Dgettimeofday(tv, tz)\" -D_LIBCPP_PRINT=1" CACHE STRING "") endforeach() foreach(type SHARED;MODULE;EXE) set(RUNTIMES_${target}_CMAKE_${type}_LINKER_FLAGS "-fuse-ld=lld" CACHE STRING "") endforeach() set(RUNTIMES_${target}_LLVM_LIBC_FULL_BUILD ON CACHE BOOL "") set(RUNTIMES_${target}_LIBC_ENABLE_USE_BY_CLANG ON CACHE BOOL "") + set(RUNTIMES_${target}_LIBC_USE_NEW_HEADER_GEN OFF CACHE BOOL "") set(RUNTIMES_${target}_LIBCXX_ABI_VERSION 2 CACHE STRING "") set(RUNTIMES_${target}_LIBCXX_CXX_ABI none CACHE STRING "") set(RUNTIMES_${target}_LIBCXX_ENABLE_SHARED OFF CACHE BOOL "") ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCycleInfo to NPM (PR #114745)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/114745 >From abfe18a7fec0ed6970a75697898e681ff115d9c1 Mon Sep 17 00:00:00 2001 From: Akshat Oke Date: Wed, 30 Oct 2024 04:59:30 + Subject: [PATCH 1/2] [CodeGen][NewPM] Port MachineCycleInfo to NPM --- .../llvm/CodeGen/MachineCycleAnalysis.h | 21 ++ llvm/include/llvm/InitializePasses.h | 2 +- .../llvm/Passes/MachinePassRegistry.def | 3 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/MachineCycleAnalysis.cpp | 38 ++- llvm/lib/Passes/PassBuilder.cpp | 1 + llvm/test/CodeGen/X86/cycle-info.mir | 2 + 7 files changed, 57 insertions(+), 12 deletions(-) diff --git a/llvm/include/llvm/CodeGen/MachineCycleAnalysis.h b/llvm/include/llvm/CodeGen/MachineCycleAnalysis.h index 1888dd053ce65ee..64cf30e6ddf3b8d 100644 --- a/llvm/include/llvm/CodeGen/MachineCycleAnalysis.h +++ b/llvm/include/llvm/CodeGen/MachineCycleAnalysis.h @@ -16,6 +16,7 @@ #include "llvm/ADT/GenericCycleInfo.h" #include "llvm/CodeGen/MachineFunctionPass.h" +#include "llvm/CodeGen/MachinePassManager.h" #include "llvm/CodeGen/MachineSSAContext.h" namespace llvm { @@ -46,6 +47,26 @@ class MachineCycleInfoWrapperPass : public MachineFunctionPass { // version. 
bool isCycleInvariant(const MachineCycle *Cycle, MachineInstr &I); +class MachineCycleAnalysis : public AnalysisInfoMixin { + friend AnalysisInfoMixin; + static AnalysisKey Key; + +public: + using Result = MachineCycleInfo; + + Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM); +}; + +class MachineCycleInfoPrinterPass +: public PassInfoMixin { + raw_ostream &OS; + +public: + explicit MachineCycleInfoPrinterPass(raw_ostream &OS) : OS(OS) {} + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); +}; + } // end namespace llvm #endif // LLVM_CODEGEN_MACHINECYCLEANALYSIS_H diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index bf934de9261cec0..598498f8597b6aa 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -191,7 +191,7 @@ void initializeMachineCFGPrinterPass(PassRegistry &); void initializeMachineCSELegacyPass(PassRegistry &); void initializeMachineCombinerPass(PassRegistry &); void initializeMachineCopyPropagationPass(PassRegistry &); -void initializeMachineCycleInfoPrinterPassPass(PassRegistry &); +void initializeMachineCycleInfoPrinterLegacyPass(PassRegistry &); void initializeMachineCycleInfoWrapperPassPass(PassRegistry &); void initializeMachineDominanceFrontierPass(PassRegistry &); void initializeMachineDominatorTreeWrapperPassPass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index 9d12a120ff7ac6d..bfe8caba0ce0b36 100644 --- a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -101,6 +101,7 @@ MACHINE_FUNCTION_ANALYSIS("live-vars", LiveVariablesAnalysis()) MACHINE_FUNCTION_ANALYSIS("machine-block-freq", MachineBlockFrequencyAnalysis()) MACHINE_FUNCTION_ANALYSIS("machine-branch-prob", MachineBranchProbabilityAnalysis()) +MACHINE_FUNCTION_ANALYSIS("machine-cycles", MachineCycleAnalysis()) 
MACHINE_FUNCTION_ANALYSIS("machine-dom-tree", MachineDominatorTreeAnalysis()) MACHINE_FUNCTION_ANALYSIS("machine-loops", MachineLoopAnalysis()) MACHINE_FUNCTION_ANALYSIS("machine-opt-remark-emitter", @@ -151,6 +152,7 @@ MACHINE_FUNCTION_PASS("print", MACHINE_FUNCTION_PASS("print", MachineDominatorTreePrinterPass(dbgs())) MACHINE_FUNCTION_PASS("print", MachineLoopPrinterPass(dbgs())) +MACHINE_FUNCTION_PASS("print", MachineCycleInfoPrinterPass(dbgs())) MACHINE_FUNCTION_PASS("print", MachinePostDominatorTreePrinterPass(dbgs())) MACHINE_FUNCTION_PASS("print", SlotIndexesPrinterPass(dbgs())) @@ -241,7 +243,6 @@ DUMMY_MACHINE_FUNCTION_PASS("post-RA-sched", PostRASchedulerPass) DUMMY_MACHINE_FUNCTION_PASS("postmisched", PostMachineSchedulerPass) DUMMY_MACHINE_FUNCTION_PASS("postra-machine-sink", PostRAMachineSinkingPass) DUMMY_MACHINE_FUNCTION_PASS("postrapseudos", ExpandPostRAPseudosPass) -DUMMY_MACHINE_FUNCTION_PASS("print-machine-cycles", MachineCycleInfoPrinterPass) DUMMY_MACHINE_FUNCTION_PASS("print-machine-uniformity", MachineUniformityInfoPrinterPass) DUMMY_MACHINE_FUNCTION_PASS("processimpdefs", ProcessImplicitDefsPass) DUMMY_MACHINE_FUNCTION_PASS("prologepilog", PrologEpilogInserterPass) diff --git a/llvm/lib/CodeGen/CodeGen.cpp b/llvm/lib/CodeGen/CodeGen.cpp index 39fba1d0b527ef6..adddb8daaa0e914 100644 --- a/llvm/lib/CodeGen/CodeGen.cpp +++ b/llvm/lib/CodeGen/CodeGen.cpp @@ -78,7 +78,7 @@ void llvm::initializeCodeGen(PassRegistry &Registry) { initializeMachineCSELegacyPass(Registry);
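The `MachineCycleAnalysis` class added in the patch above derives from `AnalysisInfoMixin` and declares a private static `AnalysisKey Key`. A minimal sketch of that CRTP pattern follows. It uses hypothetical `*Sketch` stand-ins rather than the real LLVM headers, and it assumes the mixin exposes the address of the derived class's key as the analysis identity.

```cpp
// Hypothetical stand-in for llvm::AnalysisKey: only its address matters.
struct AnalysisKeySketch {};

// CRTP mixin: gives every derived analysis an ID() built from its own Key.
template <typename DerivedT> struct AnalysisInfoMixinSketch {
  static AnalysisKeySketch *ID() { return &DerivedT::Key; }
};

// Shaped like MachineCycleAnalysis in the patch; Result is a placeholder.
class CycleAnalysisSketch
    : public AnalysisInfoMixinSketch<CycleAnalysisSketch> {
  // The mixin needs access to the private Key.
  friend AnalysisInfoMixinSketch<CycleAnalysisSketch>;
  static AnalysisKeySketch Key;

public:
  using Result = int;
  Result run(int NumBlocks) { return NumBlocks; }
};

// A second analysis, to show that each one gets a distinct identity.
class OtherAnalysisSketch
    : public AnalysisInfoMixinSketch<OtherAnalysisSketch> {
  friend AnalysisInfoMixinSketch<OtherAnalysisSketch>;
  static AnalysisKeySketch Key;

public:
  using Result = int;
  Result run(int) { return 0; }
};

// Out-of-line definitions: one unique address per analysis type.
AnalysisKeySketch CycleAnalysisSketch::Key;
AnalysisKeySketch OtherAnalysisSketch::Key;
```

The `friend` declaration mirrors the `friend AnalysisInfoMixin;` line in the diff: without it the mixin could not take the address of the private `Key`.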
[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)
https://github.com/optimisan created https://github.com/llvm/llvm-project/pull/116166

None

>From 197b28c684fcf3ba751a1283fd124bd2d090dfc7 Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Thu, 14 Nov 2024 05:57:01 +
Subject: [PATCH] [NewPM] Introduce MFAnalysisGetter for a common analysis getter

---
 .../include/llvm/CodeGen/MachinePassManager.h | 80 +++
 1 file changed, 80 insertions(+)

diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h b/llvm/include/llvm/CodeGen/MachinePassManager.h
index 69b5f6e92940c4..37669faff96c33 100644
--- a/llvm/include/llvm/CodeGen/MachinePassManager.h
+++ b/llvm/include/llvm/CodeGen/MachinePassManager.h
@@ -21,12 +21,19 @@
 #ifndef LLVM_CODEGEN_MACHINEPASSMANAGER_H
 #define LLVM_CODEGEN_MACHINEPASSMANAGER_H
 
+#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/FunctionExtras.h"
+#include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/IR/Function.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/PassManagerInternal.h"
+#include "llvm/Pass.h"
 #include "llvm/Support/Error.h"
+#include
+#include
+#include
 
 namespace llvm {
 class Module;
@@ -236,6 +243,79 @@ using MachineFunctionPassManager = PassManager;
 /// preserve.
 PreservedAnalyses getMachineFunctionPassPreservedAnalyses();
 
+/// For migrating to new pass manager
+/// Provides a common interface to fetch analyses instead of doing it twice in
+/// the *LegacyPass::runOnMachineFunction and NPM Pass::run NPM analyses must
+/// have the LegacyWrapper type to indicate which legacy analysis to run.
+///
+/// Outer analyses passes(Module or Function) can also be requested through
+/// `getAnalysis` or `getCachedAnalysis`.
+class MFAnalysisGetter {
+private:
+  Pass *LegacyPass;
+  MachineFunctionAnalysisManager *MFAM;
+
+  template
+  using type_of_run =
+      typename function_traits::template arg_t<0>;
+
+  template
+  static constexpr bool IsFunctionAnalysis =
+      std::is_same_v>;
+
+  template
+  static constexpr bool IsModuleAnalysis =
+      std::is_same_v>;
+
+public:
+  MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {}
+  MFAnalysisGetter(MachineFunctionAnalysisManager *MFAM) : MFAM(MFAM) {}
+
+  /// Outer analyses requested from NPM will be cached results and can be null
+  template
+  typename AnalysisT::Result *getAnalysis(MachineFunction &MF) {
+    if (MFAM) {
+      // need a proxy to get the result for outer analyses
+      // this can return null
+      if constexpr (IsModuleAnalysis)
+        return MFAM->getResult(MF)
+            .getCachedResult(*MF.getFunction().getParent());
+      else if constexpr (IsFunctionAnalysis) {
+        return &MFAM->getResult(MF)
+                    .getManager()
+                    .getResult(MF.getFunction());
+      }
+      return &MFAM->getResult(MF);
+    }
+    return &LegacyPass->getAnalysis()
+                .getResult();
+  }
+
+  template
+  typename AnalysisT::Result *getCachedAnalysis(MachineFunction &MF) {
+    if (MFAM) {
+      if constexpr (IsFunctionAnalysis) {
+        return MFAM->getResult(MF)
+            .getManager()
+            .getCachedResult(MF.getFunction());
+      } else if constexpr (IsModuleAnalysis)
+        return MFAM->getResult(MF)
+            .getCachedResult(*MF.getFunction().getParent());
+
+      return &MFAM->getCachedResult(MF);
+    }
+
+    if (auto *P =
+            LegacyPass->getAnalysisIfAvailable())
+      return &P->getResult();
+    return nullptr;
+  }
+
+  /// This is not intended to be used to invoke getAnalysis()
+  Pass *getLegacyPass() const { return LegacyPass; }
+  MachineFunctionAnalysisManager *getMFAM() const { return MFAM; }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINEPASSMANAGER_H
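The `if constexpr` branches in `getAnalysis` choose a lookup path from the IR unit that the analysis's `run` method takes. The template arguments in the diff above were lost in the archive, so the following is only a sketch of the idea under assumptions: `FirstArg`, `RunUnit`, the stub unit types, the three example analyses and `lookupPath` are all hypothetical names, not LLVM API.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-ins for the IR units; these are not the LLVM classes.
struct Module {};
struct Function {};
struct MachineFunction {};

// Extracts the first parameter type from a member function pointer type.
template <typename T> struct FirstArg;
template <typename C, typename R, typename A0, typename... Rest>
struct FirstArg<R (C::*)(A0, Rest...)> {
  using type = A0;
};

// The IR unit an analysis runs on, with the reference stripped.
template <typename AnalysisT>
using RunUnit = std::remove_reference_t<
    typename FirstArg<decltype(&AnalysisT::run)>::type>;

template <typename AnalysisT>
constexpr bool IsModuleAnalysis = std::is_same_v<RunUnit<AnalysisT>, Module>;

template <typename AnalysisT>
constexpr bool IsFunctionAnalysis =
    std::is_same_v<RunUnit<AnalysisT>, Function>;

// Hypothetical analyses, one per IR unit.
struct ModAnalysis { int run(Module &) { return 0; } };
struct FuncAnalysis { int run(Function &) { return 0; } };
struct MFuncAnalysis { int run(MachineFunction &) { return 0; } };

// Selects a lookup path at compile time; only one branch is instantiated.
template <typename AnalysisT> std::string lookupPath() {
  if constexpr (IsModuleAnalysis<AnalysisT>)
    return "module proxy, cached result only";
  else if constexpr (IsFunctionAnalysis<AnalysisT>)
    return "function proxy";
  else
    return "machine function manager";
}
```

The module case is labeled "cached result only" because the comment in the patch states that outer analyses requested from the new pass manager are cached results and can be null.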
[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)
https://github.com/optimisan ready_for_review https://github.com/llvm/llvm-project/pull/116166
[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)
https://github.com/optimisan edited https://github.com/llvm/llvm-project/pull/116166
[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/116166 >From 7e31c7d84a9aa87a226bb9f1341fa4a6bae9e7bb Mon Sep 17 00:00:00 2001 From: Akshat Oke Date: Thu, 14 Nov 2024 05:57:01 + Subject: [PATCH 1/2] [NewPM] Introduce MFAnalysisGetter for a common analysis getter --- .../include/llvm/CodeGen/MachinePassManager.h | 78 +++ 1 file changed, 78 insertions(+) diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h b/llvm/include/llvm/CodeGen/MachinePassManager.h index 69b5f6e92940c4..e8b7b96240d148 100644 --- a/llvm/include/llvm/CodeGen/MachinePassManager.h +++ b/llvm/include/llvm/CodeGen/MachinePassManager.h @@ -24,8 +24,10 @@ #include "llvm/ADT/FunctionExtras.h" #include "llvm/ADT/SmallVector.h" #include "llvm/CodeGen/MachineFunction.h" +#include "llvm/IR/Function.h" #include "llvm/IR/PassManager.h" #include "llvm/IR/PassManagerInternal.h" +#include "llvm/Pass.h" #include "llvm/Support/Error.h" namespace llvm { @@ -236,6 +238,82 @@ using MachineFunctionPassManager = PassManager; /// preserve. PreservedAnalyses getMachineFunctionPassPreservedAnalyses(); +/// For migrating to new pass manager +/// Provides a common interface to fetch analyses instead of doing it twice in +/// the *LegacyPass::runOnMachineFunction and NPM Pass::run. +/// +/// NPM analyses must have the LegacyWrapper type to indicate which legacy +/// analysis to run. Legacy wrapper analyses must have `getResult()` method. +/// This can be added on a needs-to basis. +/// +/// Outer analyses passes(Module or Function) can also be requested through +/// `getAnalysis` or `getCachedAnalysis`. 
+class MFAnalysisGetter { +private: + Pass *LegacyPass; + MachineFunctionAnalysisManager *MFAM; + + template + using type_of_run = + typename function_traits::template arg_t<0>; + + template + static constexpr bool IsFunctionAnalysis = + std::is_same_v>; + + template + static constexpr bool IsModuleAnalysis = + std::is_same_v>; + +public: + MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {} + MFAnalysisGetter(MachineFunctionAnalysisManager *MFAM) : MFAM(MFAM) {} + + /// Outer analyses requested from NPM will be cached results and can be null + template + typename AnalysisT::Result *getAnalysis(MachineFunction &MF) { +if (MFAM) { + // need a proxy to get the result for outer analyses + // this can return null + if constexpr (IsModuleAnalysis) +return MFAM->getResult(MF) +.getCachedResult(*MF.getFunction().getParent()); + else if constexpr (IsFunctionAnalysis) { +return &MFAM->getResult(MF) +.getManager() +.getResult(MF.getFunction()); + } + return &MFAM->getResult(MF); +} +return &LegacyPass->getAnalysis() +.getResult(); + } + + template + typename AnalysisT::Result *getCachedAnalysis(MachineFunction &MF) { +if (MFAM) { + if constexpr (IsFunctionAnalysis) { +return MFAM->getResult(MF) +.getManager() +.getCachedResult(MF.getFunction()); + } else if constexpr (IsModuleAnalysis) +return MFAM->getResult(MF) +.getCachedResult(*MF.getFunction().getParent()); + + return &MFAM->getCachedResult(MF); +} + +if (auto *P = +LegacyPass->getAnalysisIfAvailable()) + return &P->getResult(); +return nullptr; + } + + /// This is not intended to be used to invoke getAnalysis() + Pass *getLegacyPass() const { return LegacyPass; } + MachineFunctionAnalysisManager *getMFAM() const { return MFAM; } +}; + } // end namespace llvm #endif // LLVM_CODEGEN_MACHINEPASSMANAGER_H >From 125b82d45358690f146b259b779616d79eccd267 Mon Sep 17 00:00:00 2001 From: Akshat Oke Date: Thu, 14 Nov 2024 06:49:23 + Subject: [PATCH 2/2] Initialize with null --- 
llvm/include/llvm/CodeGen/MachinePassManager.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h b/llvm/include/llvm/CodeGen/MachinePassManager.h index e8b7b96240d148..bba41ed343f10d 100644 --- a/llvm/include/llvm/CodeGen/MachinePassManager.h +++ b/llvm/include/llvm/CodeGen/MachinePassManager.h @@ -250,8 +250,8 @@ PreservedAnalyses getMachineFunctionPassPreservedAnalyses(); /// `getAnalysis` or `getCachedAnalysis`. class MFAnalysisGetter { private: - Pass *LegacyPass; - MachineFunctionAnalysisManager *MFAM; + Pass *LegacyPass = nullptr; + MachineFunctionAnalysisManager *MFAM = nullptr; template using type_of_run = @@ -259,11 +259,11 @@ class MFAnalysisGetter { template static constexpr bool IsFunctionAnalysis = - std::is_same_v>; + std::is_same_v>; template static constexpr bool IsModuleAnalysis = - std::is_same_v>; + std::is_same_v>; public: MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {} ___ llvm-bran
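PATCH 2/2 ("Initialize with null") matters because each `MFAnalysisGetter` constructor sets only one of the two pointers, while `getAnalysis` branches on `if (MFAM)`. Without the default member initializers, the pointer that the chosen constructor does not set would hold an indeterminate value. A minimal sketch of the corrected shape, with hypothetical stub types (none of these names come from LLVM):

```cpp
// Hypothetical stand-ins for the legacy pass and the new analysis manager.
struct LegacyPassStub {
  int LegacyResult = 1;
};
struct NewManagerStub {
  int NewResult = 2;
};

class AnalysisGetterSketch {
  // Default member initializers: whichever constructor runs, the pointer it
  // does not set is guaranteed to be null rather than indeterminate.
  LegacyPassStub *Legacy = nullptr;
  NewManagerStub *Manager = nullptr;

public:
  explicit AnalysisGetterSketch(LegacyPassStub *P) : Legacy(P) {}
  explicit AnalysisGetterSketch(NewManagerStub *M) : Manager(M) {}

  bool usesNewManager() const { return Manager != nullptr; }

  // Prefers the new manager when present, otherwise falls back to legacy.
  int *getResult() {
    if (Manager)
      return &Manager->NewResult;
    return Legacy ? &Legacy->LegacyResult : nullptr;
  }
};
```

With the initializers in place, the `if (Manager)` test is well defined for an object built from a legacy pass, which is the case the first version of the patch left open.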
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineSink to NPM (PR #115434)
https://github.com/optimisan edited https://github.com/llvm/llvm-project/pull/115434
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineSink to NPM (PR #115434)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/115434 >From 7e31c7d84a9aa87a226bb9f1341fa4a6bae9e7bb Mon Sep 17 00:00:00 2001 From: Akshat Oke Date: Thu, 14 Nov 2024 05:57:01 + Subject: [PATCH 1/5] [NewPM] Introduce MFAnalysisGetter for a common analysis getter --- .../include/llvm/CodeGen/MachinePassManager.h | 78 +++ 1 file changed, 78 insertions(+) diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h b/llvm/include/llvm/CodeGen/MachinePassManager.h index 69b5f6e92940c4..e8b7b96240d148 100644 --- a/llvm/include/llvm/CodeGen/MachinePassManager.h +++ b/llvm/include/llvm/CodeGen/MachinePassManager.h @@ -24,8 +24,10 @@ #include "llvm/ADT/FunctionExtras.h" #include "llvm/ADT/SmallVector.h" #include "llvm/CodeGen/MachineFunction.h" +#include "llvm/IR/Function.h" #include "llvm/IR/PassManager.h" #include "llvm/IR/PassManagerInternal.h" +#include "llvm/Pass.h" #include "llvm/Support/Error.h" namespace llvm { @@ -236,6 +238,82 @@ using MachineFunctionPassManager = PassManager; /// preserve. PreservedAnalyses getMachineFunctionPassPreservedAnalyses(); +/// For migrating to new pass manager +/// Provides a common interface to fetch analyses instead of doing it twice in +/// the *LegacyPass::runOnMachineFunction and NPM Pass::run. +/// +/// NPM analyses must have the LegacyWrapper type to indicate which legacy +/// analysis to run. Legacy wrapper analyses must have `getResult()` method. +/// This can be added on a needs-to basis. +/// +/// Outer analyses passes(Module or Function) can also be requested through +/// `getAnalysis` or `getCachedAnalysis`. 
+class MFAnalysisGetter { +private: + Pass *LegacyPass; + MachineFunctionAnalysisManager *MFAM; + + template + using type_of_run = + typename function_traits::template arg_t<0>; + + template + static constexpr bool IsFunctionAnalysis = + std::is_same_v>; + + template + static constexpr bool IsModuleAnalysis = + std::is_same_v>; + +public: + MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {} + MFAnalysisGetter(MachineFunctionAnalysisManager *MFAM) : MFAM(MFAM) {} + + /// Outer analyses requested from NPM will be cached results and can be null + template + typename AnalysisT::Result *getAnalysis(MachineFunction &MF) { +if (MFAM) { + // need a proxy to get the result for outer analyses + // this can return null + if constexpr (IsModuleAnalysis) +return MFAM->getResult(MF) +.getCachedResult(*MF.getFunction().getParent()); + else if constexpr (IsFunctionAnalysis) { +return &MFAM->getResult(MF) +.getManager() +.getResult(MF.getFunction()); + } + return &MFAM->getResult(MF); +} +return &LegacyPass->getAnalysis() +.getResult(); + } + + template + typename AnalysisT::Result *getCachedAnalysis(MachineFunction &MF) { +if (MFAM) { + if constexpr (IsFunctionAnalysis) { +return MFAM->getResult(MF) +.getManager() +.getCachedResult(MF.getFunction()); + } else if constexpr (IsModuleAnalysis) +return MFAM->getResult(MF) +.getCachedResult(*MF.getFunction().getParent()); + + return &MFAM->getCachedResult(MF); +} + +if (auto *P = +LegacyPass->getAnalysisIfAvailable()) + return &P->getResult(); +return nullptr; + } + + /// This is not intended to be used to invoke getAnalysis() + Pass *getLegacyPass() const { return LegacyPass; } + MachineFunctionAnalysisManager *getMFAM() const { return MFAM; } +}; + } // end namespace llvm #endif // LLVM_CODEGEN_MACHINEPASSMANAGER_H >From 125b82d45358690f146b259b779616d79eccd267 Mon Sep 17 00:00:00 2001 From: Akshat Oke Date: Thu, 14 Nov 2024 06:49:23 + Subject: [PATCH 2/5] Initialize with null --- 
llvm/include/llvm/CodeGen/MachinePassManager.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h b/llvm/include/llvm/CodeGen/MachinePassManager.h index e8b7b96240d148..bba41ed343f10d 100644 --- a/llvm/include/llvm/CodeGen/MachinePassManager.h +++ b/llvm/include/llvm/CodeGen/MachinePassManager.h @@ -250,8 +250,8 @@ PreservedAnalyses getMachineFunctionPassPreservedAnalyses(); /// `getAnalysis` or `getCachedAnalysis`. class MFAnalysisGetter { private: - Pass *LegacyPass; - MachineFunctionAnalysisManager *MFAM; + Pass *LegacyPass = nullptr; + MachineFunctionAnalysisManager *MFAM = nullptr; template using type_of_run = @@ -259,11 +259,11 @@ class MFAnalysisGetter { template static constexpr bool IsFunctionAnalysis = - std::is_same_v>; + std::is_same_v>; template static constexpr bool IsModuleAnalysis = - std::is_same_v>; + std::is_same_v>; public: MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {} >From 52b7ce7a4c500ab9e0f975337f51b1e37543ce0a Mon Sep 17
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineSink to NPM (PR #115434)
github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

```bash
git-clang-format --diff 68bcb36981bce9b99ee70c3bd41443c6d44cd7ae 4a8feec5b0ac4967e28a871211d660468f3a0a93 --extensions cpp,h -- llvm/include/llvm/CodeGen/MachineSink.h llvm/include/llvm/Analysis/AliasAnalysis.h llvm/include/llvm/CodeGen/MachineBlockFrequencyInfo.h llvm/include/llvm/CodeGen/MachineBranchProbabilityInfo.h llvm/include/llvm/CodeGen/MachineCycleAnalysis.h llvm/include/llvm/CodeGen/MachineDominators.h llvm/include/llvm/CodeGen/MachinePassManager.h llvm/include/llvm/CodeGen/MachinePostDominators.h llvm/include/llvm/CodeGen/Passes.h llvm/include/llvm/IR/Dominators.h llvm/include/llvm/InitializePasses.h llvm/include/llvm/Passes/CodeGenPassBuilder.h llvm/include/llvm/Target/CGPassBuilderOption.h llvm/lib/CodeGen/CodeGen.cpp llvm/lib/CodeGen/MachineSink.cpp llvm/lib/CodeGen/TargetPassConfig.cpp llvm/lib/Passes/PassBuilder.cpp llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
```

View the diff from clang-format here.

```diff
diff --git a/llvm/include/llvm/CodeGen/MachineSink.h b/llvm/include/llvm/CodeGen/MachineSink.h
index 1eee9d7f7e..71bd7229b7 100644
--- a/llvm/include/llvm/CodeGen/MachineSink.h
+++ b/llvm/include/llvm/CodeGen/MachineSink.h
@@ -22,7 +22,8 @@ public:
   PreservedAnalyses run(MachineFunction &MF, MachineFunctionAnalysisManager &);
 
-  void printPipeline(raw_ostream &OS, function_ref<StringRef(StringRef)> MapClassName2PassName);
+  void printPipeline(raw_ostream &OS,
+                     function_ref<StringRef(StringRef)> MapClassName2PassName);
 };
 
 } // namespace llvm
```

https://github.com/llvm/llvm-project/pull/115434
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][Parser] Deduplicate floating-point parsing functionality (PR #116172)
https://github.com/matthias-springer edited https://github.com/llvm/llvm-project/pull/116172 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][Parser] Deduplicate fp parsing functionality (PR #116172)
https://github.com/matthias-springer created https://github.com/llvm/llvm-project/pull/116172

The following functionality is duplicated in multiple places: trying to parse an APFloat from a floating point literal or an integer in hexadecimal representation (bit pattern). Move it to a common helper function.

NFC apart from the slightly changed error messages.

Depends on #116171.

>From 51530aeea8c18804034881c87236d1ab5ceb274f Mon Sep 17 00:00:00 2001
From: Matthias Springer
Date: Thu, 14 Nov 2024 07:43:08 +0100
Subject: [PATCH] [mlir][Parser] Deduplicate fp parsing functionality

---
 mlir/lib/AsmParser/AsmParserImpl.h | 33 ++---
 mlir/lib/AsmParser/AttributeParser.cpp | 71
 mlir/lib/AsmParser/Parser.cpp | 23 +++
 mlir/lib/AsmParser/Parser.h | 6 ++
 mlir/test/IR/invalid-builtin-attributes.mlir | 10 +--
 5 files changed, 56 insertions(+), 87 deletions(-)

diff --git a/mlir/lib/AsmParser/AsmParserImpl.h b/mlir/lib/AsmParser/AsmParserImpl.h
index 1e6cbc0ec51beb..bbd70d5980f8fe 100644
--- a/mlir/lib/AsmParser/AsmParserImpl.h
+++ b/mlir/lib/AsmParser/AsmParserImpl.h
@@ -288,32 +288,13 @@ class AsmParserImpl : public BaseT {
     bool isNegative = parser.consumeIf(Token::minus);
     Token curTok = parser.getToken();
     auto emitErrorAtTok = [&]() { return emitError(curTok.getLoc(), ""); };
-
-    // Check for a floating point value.
-    if (curTok.is(Token::floatliteral)) {
-      auto val = curTok.getFloatingPointValue();
-      if (!val)
-        return emitErrorAtTok() << "floating point value too large";
-      parser.consumeToken(Token::floatliteral);
-      result = APFloat(isNegative ? -*val : *val);
-      bool losesInfo;
-      result.convert(semantics, APFloat::rmNearestTiesToEven, &losesInfo);
-      return success();
-    }
-
-    // Check for a hexadecimal float value.
-    if (curTok.is(Token::integer)) {
-      FailureOr<APFloat> apResult = parseFloatFromIntegerLiteral(
-          emitErrorAtTok, curTok, isNegative, semantics);
-      if (failed(apResult))
-        return failure();
-
-      result = *apResult;
-      parser.consumeToken(Token::integer);
-      return success();
-    }
-
-    return emitErrorAtTok() << "expected floating point literal";
+    FailureOr<APFloat> apResult =
+        parseFloatFromLiteral(emitErrorAtTok, curTok, isNegative, semantics);
+    if (failed(apResult))
+      return failure();
+    parser.consumeToken();
+    result = *apResult;
+    return success();
   }
 
   /// Parse a floating point value from the stream.
diff --git a/mlir/lib/AsmParser/AttributeParser.cpp b/mlir/lib/AsmParser/AttributeParser.cpp
index ba9be3b030453a..9ebada076cd042 100644
--- a/mlir/lib/AsmParser/AttributeParser.cpp
+++ b/mlir/lib/AsmParser/AttributeParser.cpp
@@ -658,36 +658,12 @@ TensorLiteralParser::getFloatAttrElements(SMLoc loc, FloatType eltTy,
   for (const auto &signAndToken : storage) {
     bool isNegative = signAndToken.first;
     const Token &token = signAndToken.second;
-
-    // Handle hexadecimal float literals.
-    if (token.is(Token::integer) && token.getSpelling().starts_with("0x")) {
-      auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
-      FailureOr<APFloat> result = parseFloatFromIntegerLiteral(
-          emitErrorAtTok, token, isNegative, eltTy.getFloatSemantics());
-      if (failed(result))
-        return failure();
-
-      floatValues.push_back(*result);
-      continue;
-    }
-
-    // Check to see if any decimal integers or booleans were parsed.
-    if (!token.is(Token::floatliteral))
-      return p.emitError()
-             << "expected floating-point elements, but parsed integer";
-
-    // Build the float values from tokens.
-    auto val = token.getFloatingPointValue();
-    if (!val)
-      return p.emitError("floating point value too large for attribute");
-
-    APFloat apVal(isNegative ? -*val : *val);
-    if (!eltTy.isF64()) {
-      bool unused;
-      apVal.convert(eltTy.getFloatSemantics(), APFloat::rmNearestTiesToEven,
-                    &unused);
-    }
-    floatValues.push_back(apVal);
+    auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
+    FailureOr<APFloat> result = parseFloatFromLiteral(
+        emitErrorAtTok, token, isNegative, eltTy.getFloatSemantics());
+    if (failed(result))
+      return failure();
+    floatValues.push_back(*result);
   }
   return success();
 }
@@ -905,34 +881,15 @@ ParseResult DenseArrayElementParser::parseIntegerElement(Parser &p) {
 
 ParseResult DenseArrayElementParser::parseFloatElement(Parser &p) {
   bool isNegative = p.consumeIf(Token::minus);
-  Token token = p.getToken();
-  std::optional<APFloat> result;
-  auto floatType = cast<FloatType>(type);
-  if (p.consumeIf(Token::integer)) {
-    // Parse an integer literal as a float.
-    auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
-    FailureOr<APFloat> fromIntLit = parseFloatFromIntegerLiteral(
-        emitErrorAtTok, token, isNegative, float
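The idea behind this patch, a single helper that accepts either a decimal float literal or a hexadecimal integer holding the raw bit pattern, can be illustrated outside of MLIR. The following is only a rough sketch under stated assumptions: `parseF64FromLiteral` is a hypothetical name, it works on plain `double` rather than `APFloat`, it assumes `double` is IEEE-754 binary64, and the choice to reject a leading minus on a bit pattern is made for this sketch rather than taken from the diff.

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <optional>
#include <string>

// Hypothetical helper for illustration only; this is not MLIR's
// parseFloatFromLiteral. Assumes `double` is IEEE-754 binary64.
std::optional<double> parseF64FromLiteral(const std::string &spelling,
                                          bool isNegative) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes a 64-bit double");
  if (spelling.empty())
    return std::nullopt;

  // Case 1: hexadecimal integer, interpreted as the raw bit pattern.
  if (spelling.compare(0, 2, "0x") == 0) {
    // Assumption of this sketch: the sign is part of the bit pattern,
    // so a separate leading minus is rejected.
    if (isNegative)
      return std::nullopt;
    errno = 0;
    char *end = nullptr;
    unsigned long long parsed = std::strtoull(spelling.c_str(), &end, 16);
    if (errno == ERANGE || *end != '\0' || parsed > UINT64_MAX)
      return std::nullopt;
    std::uint64_t bits = parsed;
    double value;
    std::memcpy(&value, &bits, sizeof(value)); // reinterpret the bits
    return value;
  }

  // Case 2: decimal float literal. Require a leading digit and a '.'
  // so that plain integers such as "42" are rejected.
  if (!std::isdigit(static_cast<unsigned char>(spelling[0])) ||
      spelling.find('.') == std::string::npos)
    return std::nullopt;
  errno = 0;
  char *end = nullptr;
  double value = std::strtod(spelling.c_str(), &end);
  if (errno == ERANGE || *end != '\0') // any range error counts as failure
    return std::nullopt;
  return isNegative ? -value : value;
}
```

Callers then need only one code path: obtain the token spelling and the sign, call the helper, and bail out on `std::nullopt`, which is the shape the three rewritten call sites in the diff share.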
[llvm-branch-commits] [mlir] [mlir][Parser] Deduplicate floating-point parsing functionality (PR #116172)
llvmbot wrote:

@llvm/pr-subscribers-mlir

Author: Matthias Springer (matthias-springer)

Changes

The following functionality is duplicated in multiple places: trying to parse an APFloat from a floating point literal or an integer in hexadecimal representation (bit pattern). Move it to a common helper function.

NFC apart from the slightly changed error messages.

Depends on #116171.

---
Full diff: https://github.com/llvm/llvm-project/pull/116172.diff

5 Files Affected:

- (modified) mlir/lib/AsmParser/AsmParserImpl.h (+7-26)
- (modified) mlir/lib/AsmParser/AttributeParser.cpp (+14-57)
- (modified) mlir/lib/AsmParser/Parser.cpp (+23)
- (modified) mlir/lib/AsmParser/Parser.h (+6)
- (modified) mlir/test/IR/invalid-builtin-attributes.mlir (+6-4)

```diff
diff --git a/mlir/lib/AsmParser/AsmParserImpl.h b/mlir/lib/AsmParser/AsmParserImpl.h
index 1e6cbc0ec51beb..bbd70d5980f8fe 100644
--- a/mlir/lib/AsmParser/AsmParserImpl.h
+++ b/mlir/lib/AsmParser/AsmParserImpl.h
@@ -288,32 +288,13 @@ class AsmParserImpl : public BaseT {
     bool isNegative = parser.consumeIf(Token::minus);
     Token curTok = parser.getToken();
     auto emitErrorAtTok = [&]() { return emitError(curTok.getLoc(), ""); };
-
-    // Check for a floating point value.
-    if (curTok.is(Token::floatliteral)) {
-      auto val = curTok.getFloatingPointValue();
-      if (!val)
-        return emitErrorAtTok() << "floating point value too large";
-      parser.consumeToken(Token::floatliteral);
-      result = APFloat(isNegative ? -*val : *val);
-      bool losesInfo;
-      result.convert(semantics, APFloat::rmNearestTiesToEven, &losesInfo);
-      return success();
-    }
-
-    // Check for a hexadecimal float value.
-    if (curTok.is(Token::integer)) {
-      FailureOr<APFloat> apResult = parseFloatFromIntegerLiteral(
-          emitErrorAtTok, curTok, isNegative, semantics);
-      if (failed(apResult))
-        return failure();
-
-      result = *apResult;
-      parser.consumeToken(Token::integer);
-      return success();
-    }
-
-    return emitErrorAtTok() << "expected floating point literal";
+    FailureOr<APFloat> apResult =
+        parseFloatFromLiteral(emitErrorAtTok, curTok, isNegative, semantics);
+    if (failed(apResult))
+      return failure();
+    parser.consumeToken();
+    result = *apResult;
+    return success();
   }
 
   /// Parse a floating point value from the stream.
diff --git a/mlir/lib/AsmParser/AttributeParser.cpp b/mlir/lib/AsmParser/AttributeParser.cpp
index ba9be3b030453a..9ebada076cd042 100644
--- a/mlir/lib/AsmParser/AttributeParser.cpp
+++ b/mlir/lib/AsmParser/AttributeParser.cpp
@@ -658,36 +658,12 @@ TensorLiteralParser::getFloatAttrElements(SMLoc loc, FloatType eltTy,
   for (const auto &signAndToken : storage) {
     bool isNegative = signAndToken.first;
     const Token &token = signAndToken.second;
-
-    // Handle hexadecimal float literals.
-    if (token.is(Token::integer) && token.getSpelling().starts_with("0x")) {
-      auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
-      FailureOr<APFloat> result = parseFloatFromIntegerLiteral(
-          emitErrorAtTok, token, isNegative, eltTy.getFloatSemantics());
-      if (failed(result))
-        return failure();
-
-      floatValues.push_back(*result);
-      continue;
-    }
-
-    // Check to see if any decimal integers or booleans were parsed.
-    if (!token.is(Token::floatliteral))
-      return p.emitError()
-             << "expected floating-point elements, but parsed integer";
-
-    // Build the float values from tokens.
-    auto val = token.getFloatingPointValue();
-    if (!val)
-      return p.emitError("floating point value too large for attribute");
-
-    APFloat apVal(isNegative ? -*val : *val);
-    if (!eltTy.isF64()) {
-      bool unused;
-      apVal.convert(eltTy.getFloatSemantics(), APFloat::rmNearestTiesToEven,
-                    &unused);
-    }
-    floatValues.push_back(apVal);
+    auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
+    FailureOr<APFloat> result = parseFloatFromLiteral(
+        emitErrorAtTok, token, isNegative, eltTy.getFloatSemantics());
+    if (failed(result))
+      return failure();
+    floatValues.push_back(*result);
   }
   return success();
 }
@@ -905,34 +881,15 @@ ParseResult DenseArrayElementParser::parseIntegerElement(Parser &p) {
 
 ParseResult DenseArrayElementParser::parseFloatElement(Parser &p) {
   bool isNegative = p.consumeIf(Token::minus);
-  Token token = p.getToken();
-  std::optional<APFloat> result;
-  auto floatType = cast<FloatType>(type);
-  if (p.consumeIf(Token::integer)) {
-    // Parse an integer literal as a float.
-    auto emitErrorAtTok = [&]() { return p.emitError(token.getLoc()); };
-    FailureOr<APFloat> fromIntLit = parseFloatFromIntegerLiteral(
-        emitErrorAtTok, token, isNegative, floatType.getFloatSemantics());
-    if (failed(fromIntLit))
-      return failure();
-    result = *fromIntLit;
-  } else if (p.consumeIf(Token::floatliteral)) {
-    // P
```
[llvm-branch-commits] [mlir] [mlir][Parser] Deduplicate floating-point parsing functionality (PR #116172)
https://github.com/matthias-springer edited https://github.com/llvm/llvm-project/pull/116172 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)
optimisan wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack [on Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/116166?utm_source=stack-comment-downstack-mergeability-warning).
> [Learn more](https://graphite.dev/docs/merge-pull-requests)

* **#116166** 👈
* **#114745**: 2 other dependent PRs ([#114746](https://github.com/llvm/llvm-project/pull/114746), [#115434](https://github.com/llvm/llvm-project/pull/115434))
* **#114027**
* `main`

This stack of pull requests is managed by Graphite. [Learn more about stacking](https://stacking.dev/?utm_source=stack-comment).

https://github.com/llvm/llvm-project/pull/116166
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/116166

>From 7e31c7d84a9aa87a226bb9f1341fa4a6bae9e7bb Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Thu, 14 Nov 2024 05:57:01 +
Subject: [PATCH] [NewPM] Introduce MFAnalysisGetter for a common analysis getter

---
 .../include/llvm/CodeGen/MachinePassManager.h | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/llvm/include/llvm/CodeGen/MachinePassManager.h b/llvm/include/llvm/CodeGen/MachinePassManager.h
index 69b5f6e92940c4..e8b7b96240d148 100644
--- a/llvm/include/llvm/CodeGen/MachinePassManager.h
+++ b/llvm/include/llvm/CodeGen/MachinePassManager.h
@@ -24,8 +24,10 @@
 #include "llvm/ADT/FunctionExtras.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/IR/Function.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/PassManagerInternal.h"
+#include "llvm/Pass.h"
 #include "llvm/Support/Error.h"
 
 namespace llvm {
@@ -236,6 +238,82 @@ using MachineFunctionPassManager = PassManager;
 /// preserve.
 PreservedAnalyses getMachineFunctionPassPreservedAnalyses();
 
+/// For migrating to new pass manager
+/// Provides a common interface to fetch analyses instead of doing it twice in
+/// the *LegacyPass::runOnMachineFunction and NPM Pass::run.
+///
+/// NPM analyses must have the LegacyWrapper type to indicate which legacy
+/// analysis to run. Legacy wrapper analyses must have `getResult()` method.
+/// This can be added on a needs-to basis.
+///
+/// Outer analyses passes(Module or Function) can also be requested through
+/// `getAnalysis` or `getCachedAnalysis`.
+class MFAnalysisGetter {
+private:
+  Pass *LegacyPass;
+  MachineFunctionAnalysisManager *MFAM;
+
+  template
+  using type_of_run =
+      typename function_traits::template arg_t<0>;
+
+  template
+  static constexpr bool IsFunctionAnalysis =
+      std::is_same_v>;
+
+  template
+  static constexpr bool IsModuleAnalysis =
+      std::is_same_v>;
+
+public:
+  MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {}
+  MFAnalysisGetter(MachineFunctionAnalysisManager *MFAM) : MFAM(MFAM) {}
+
+  /// Outer analyses requested from NPM will be cached results and can be null
+  template
+  typename AnalysisT::Result *getAnalysis(MachineFunction &MF) {
+    if (MFAM) {
+      // need a proxy to get the result for outer analyses
+      // this can return null
+      if constexpr (IsModuleAnalysis)
+        return MFAM->getResult(MF)
+            .getCachedResult(*MF.getFunction().getParent());
+      else if constexpr (IsFunctionAnalysis) {
+        return &MFAM->getResult(MF)
+                    .getManager()
+                    .getResult(MF.getFunction());
+      }
+      return &MFAM->getResult(MF);
+    }
+    return &LegacyPass->getAnalysis()
+                .getResult();
+  }
+
+  template
+  typename AnalysisT::Result *getCachedAnalysis(MachineFunction &MF) {
+    if (MFAM) {
+      if constexpr (IsFunctionAnalysis) {
+        return MFAM->getResult(MF)
+            .getManager()
+            .getCachedResult(MF.getFunction());
+      } else if constexpr (IsModuleAnalysis)
+        return MFAM->getResult(MF)
+            .getCachedResult(*MF.getFunction().getParent());
+
+      return &MFAM->getCachedResult(MF);
+    }
+
+    if (auto *P =
+            LegacyPass->getAnalysisIfAvailable())
+      return &P->getResult();
+    return nullptr;
+  }
+
+  /// This is not intended to be used to invoke getAnalysis()
+  Pass *getLegacyPass() const { return LegacyPass; }
+  MachineFunctionAnalysisManager *getMFAM() const { return MFAM; }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINEPASSMANAGER_H

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
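The pattern described in the doc comment above, one getter object that hides whether results come from a legacy pass or from a new-pass-manager analysis manager, might be sketched in miniature as follows. Everything here is a stand-in invented for illustration (`Result`, `LegacyProvider`, `NewManager`, `AnalysisGetter`); none of it is an LLVM class, and the real patch is templated on the analysis type and routes outer analyses through proxies, which this sketch leaves out.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins; these are not LLVM types.
struct Result {
  int value;
};

// Plays the role of a legacy pass that may or may not have a result ready.
struct LegacyProvider {
  Result stored{1};
  bool available = true;
  Result &get() { return stored; }
  Result *getIfAvailable() { return available ? &stored : nullptr; }
};

// Plays the role of a new-pass-manager analysis manager with a cache.
struct NewManager {
  std::map<std::string, Result> cache;
  // Computes the result on first request, then serves it from the cache.
  Result &getResult(const std::string &fn) {
    return cache.try_emplace(fn, Result{2}).first->second;
  }
  // Never computes; returns null when nothing is cached.
  Result *getCachedResult(const std::string &fn) {
    auto it = cache.find(fn);
    return it == cache.end() ? nullptr : &it->second;
  }
};

// One interface over both backends. Both pointers default to null so the
// unused one is well defined regardless of which constructor ran.
class AnalysisGetter {
  LegacyProvider *Legacy = nullptr;
  NewManager *Manager = nullptr;

public:
  explicit AnalysisGetter(LegacyProvider *L) : Legacy(L) {}
  explicit AnalysisGetter(NewManager *M) : Manager(M) {}

  Result *getAnalysis(const std::string &fn) {
    if (Manager)
      return &Manager->getResult(fn);
    return &Legacy->get();
  }

  Result *getCachedAnalysis(const std::string &fn) {
    if (Manager)
      return Manager->getCachedResult(fn);
    return Legacy->getIfAvailable();
  }
};
```

The default member initializers matter here: each constructor sets only one pointer, and the `if (Manager)` test reads the other, so leaving it uninitialized would be undefined behaviour.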
[llvm-branch-commits] [llvm] [mlir] [mlir][Parser] Add `nan` and `inf` keywords (PR #116176)
https://github.com/matthias-springer created https://github.com/llvm/llvm-project/pull/116176

Add two new keywords for parsing `nan` / `inf` floating-point literals. This is more convenient than writing the integer hexadecimal bit patterns by hand (which differ depending on the floating-point type).

Note: The printer still prints `nan` / `inf` literals as integer hexadecimals. That's because there can be multiple `nan` / `inf` bit patterns. When parsing a `nan` / `inf` keyword, the exact bit pattern is unspecified: we use whatever `APFloat::getInf`/`APFloat::getNaN` returns.

Depends on #116172.

>From 045afe88e53f873cf027ab92af32c120a1d47d63 Mon Sep 17 00:00:00 2001
From: Matthias Springer
Date: Thu, 14 Nov 2024 08:47:06 +0100
Subject: [PATCH] [mlir][Parser] Add `nan` and `inf` keywords

---
 llvm/include/llvm/ADT/APFloat.h | 2 +
 llvm/lib/Support/APFloat.cpp | 9
 mlir/lib/AsmParser/AttributeParser.cpp | 30 +++
 mlir/lib/AsmParser/Parser.cpp | 29 +-
 mlir/lib/AsmParser/TokenKinds.def | 2 +
 mlir/test/Dialect/Arith/canonicalize.mlir | 10 ++--
 mlir/test/IR/attribute.mlir | 54 +++
 .../math-polynomial-approx.mlir | 36 ++---
 8 files changed, 138 insertions(+), 34 deletions(-)

diff --git a/llvm/include/llvm/ADT/APFloat.h b/llvm/include/llvm/ADT/APFloat.h
index 97547fb577e0ec..40ad7ba92552ed 100644
--- a/llvm/include/llvm/ADT/APFloat.h
+++ b/llvm/include/llvm/ADT/APFloat.h
@@ -311,6 +311,8 @@ struct APFloatBase {
   static unsigned int semanticsIntSizeInBits(const fltSemantics&, bool);
   static bool semanticsHasZero(const fltSemantics &);
   static bool semanticsHasSignedRepr(const fltSemantics &);
+  static bool semanticsHasInf(const fltSemantics &);
+  static bool semanticsHasNan(const fltSemantics &);
 
   // Returns true if any number described by \p Src can be precisely represented
   // by a normal (not subnormal) value in \p Dst.
diff --git a/llvm/lib/Support/APFloat.cpp b/llvm/lib/Support/APFloat.cpp
index c566d489d11b03..8b9d9af2ca65b3 100644
--- a/llvm/lib/Support/APFloat.cpp
+++ b/llvm/lib/Support/APFloat.cpp
@@ -375,6 +375,15 @@ bool APFloatBase::semanticsHasSignedRepr(const fltSemantics &semantics) {
   return semantics.hasSignedRepr;
 }
 
+bool APFloatBase::semanticsHasInf(const fltSemantics &semantics) {
+  return semantics.nonFiniteBehavior != fltNonfiniteBehavior::NanOnly
+      && semantics.nonFiniteBehavior != fltNonfiniteBehavior::FiniteOnly;
+}
+
+bool APFloatBase::semanticsHasNan(const fltSemantics &semantics) {
+  return semantics.nonFiniteBehavior != fltNonfiniteBehavior::FiniteOnly;
+}
+
 bool APFloatBase::isRepresentableAsNormalIn(const fltSemantics &Src,
                                             const fltSemantics &Dst) {
   // Exponent range must be larger.
diff --git a/mlir/lib/AsmParser/AttributeParser.cpp b/mlir/lib/AsmParser/AttributeParser.cpp
index 9ebada076cd042..68da950f09e568 100644
--- a/mlir/lib/AsmParser/AttributeParser.cpp
+++ b/mlir/lib/AsmParser/AttributeParser.cpp
@@ -21,8 +21,10 @@
 #include "mlir/IR/DialectImplementation.h"
 #include "mlir/IR/DialectResourceBlobManager.h"
 #include "mlir/IR/IntegerSet.h"
+#include "llvm/ADT/APFloat.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/Endian.h"
+#include
 #include
 
 using namespace mlir;
@@ -121,6 +123,8 @@ Attribute Parser::parseAttribute(Type type) {
 
   // Parse floating point and integer attributes.
   case Token::floatliteral:
+  case Token::kw_inf:
+  case Token::kw_nan:
     return parseFloatAttr(type, /*isNegative=*/false);
   case Token::integer:
     return parseDecOrHexAttr(type, /*isNegative=*/false);
@@ -128,7 +132,8 @@ Attribute Parser::parseAttribute(Type type) {
     consumeToken(Token::minus);
     if (getToken().is(Token::integer))
       return parseDecOrHexAttr(type, /*isNegative=*/true);
-    if (getToken().is(Token::floatliteral))
+    if (getToken().is(Token::floatliteral) || getToken().is(Token::kw_inf) ||
+        getToken().is(Token::kw_nan))
       return parseFloatAttr(type, /*isNegative=*/true);
 
     return (emitWrongTokenError(
@@ -342,10 +347,8 @@ ParseResult Parser::parseAttributeDict(NamedAttrList &attributes) {
 
 /// Parse a float attribute.
 Attribute Parser::parseFloatAttr(Type type, bool isNegative) {
-  auto val = getToken().getFloatingPointValue();
-  if (!val)
-    return (emitError("floating point value too large for attribute"), nullptr);
-  consumeToken(Token::floatliteral);
+  const Token tok = getToken();
+  consumeToken();
   if (!type) {
     // Default to F64 when no type is specified.
     if (!consumeIf(Token::colon))
@@ -353,10 +356,16 @@ Attribute Parser::parseFloatAttr(Type type, bool isNegative) {
     else if (!(type = parseType()))
       return nullptr;
   }
-  if (!isa<FloatType>(type))
+  auto floatType = dyn_cast<FloatType>(type);
+  if (!floatType)
     return (emitError("floating point valu
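The two predicates added to `APFloatBase` in this patch encode which non-finite values a floating-point format can represent. A self-contained sketch of the same logic, and of how a keyword parser could presumably use it to reject `inf` / `nan` for formats that lack them, is shown below. The enum `NonFiniteBehavior` and the function `parseNonFiniteKeyword` are hypothetical names for illustration, and `double` merely stands in for the parsed value; how the real parser reports the error is not visible in the truncated diff.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <optional>
#include <string_view>

// Hypothetical mirror of the three behaviours the patch distinguishes.
enum class NonFiniteBehavior { IEEE754, NanOnly, FiniteOnly };

// Same conditions as semanticsHasInf / semanticsHasNan in the diff.
constexpr bool hasInf(NonFiniteBehavior b) {
  return b != NonFiniteBehavior::NanOnly &&
         b != NonFiniteBehavior::FiniteOnly;
}
constexpr bool hasNan(NonFiniteBehavior b) {
  return b != NonFiniteBehavior::FiniteOnly;
}

// Returns the value for an `inf` / `nan` keyword, or nullopt when the
// keyword is unknown or the format cannot represent that value.
std::optional<double> parseNonFiniteKeyword(std::string_view keyword,
                                            bool isNegative,
                                            NonFiniteBehavior behavior) {
  if (keyword == "inf" && hasInf(behavior)) {
    const double inf = std::numeric_limits<double>::infinity();
    return isNegative ? -inf : inf;
  }
  if (keyword == "nan" && hasNan(behavior)) {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    return isNegative ? -nan : nan;
  }
  return std::nullopt;
}
```

Note the asymmetry: a `NanOnly` format accepts `nan` but not `inf`, while a `FiniteOnly` format accepts neither.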
[llvm-branch-commits] [llvm] [mlir] [mlir][Parser] Add `nan` and `inf` keywords (PR #116176)
llvmbot wrote:

@llvm/pr-subscribers-llvm-adt

Author: Matthias Springer (matthias-springer)

Changes

Add two new keywords for parsing `nan` / `inf` floating-point literals. This is more convenient than writing the integer hexadecimal bit patterns by hand (which differ depending on the floating-point type).

Note: The printer still prints `nan` / `inf` literals as integer hexadecimals. That's because there can be multiple `nan` / `inf` bit patterns. When parsing a `nan` / `inf` keyword, the exact bit pattern is unspecified: we use whatever `APFloat::getInf`/`APFloat::getNaN` returns.

Depends on #116172.

---
Full diff: https://github.com/llvm/llvm-project/pull/116176.diff

8 Files Affected:

- (modified) llvm/include/llvm/ADT/APFloat.h (+2)
- (modified) llvm/lib/Support/APFloat.cpp (+9)
- (modified) mlir/lib/AsmParser/AttributeParser.cpp (+21-9)
- (modified) mlir/lib/AsmParser/Parser.cpp (+27-2)
- (modified) mlir/lib/AsmParser/TokenKinds.def (+2)
- (modified) mlir/test/Dialect/Arith/canonicalize.mlir (+5-5)
- (modified) mlir/test/IR/attribute.mlir (+54)
- (modified) mlir/test/mlir-cpu-runner/math-polynomial-approx.mlir (+18-18)

```diff
diff --git a/llvm/include/llvm/ADT/APFloat.h b/llvm/include/llvm/ADT/APFloat.h
index 97547fb577e0ec..40ad7ba92552ed 100644
--- a/llvm/include/llvm/ADT/APFloat.h
+++ b/llvm/include/llvm/ADT/APFloat.h
@@ -311,6 +311,8 @@ struct APFloatBase {
   static unsigned int semanticsIntSizeInBits(const fltSemantics&, bool);
   static bool semanticsHasZero(const fltSemantics &);
   static bool semanticsHasSignedRepr(const fltSemantics &);
+  static bool semanticsHasInf(const fltSemantics &);
+  static bool semanticsHasNan(const fltSemantics &);
 
   // Returns true if any number described by \p Src can be precisely represented
   // by a normal (not subnormal) value in \p Dst.
diff --git a/llvm/lib/Support/APFloat.cpp b/llvm/lib/Support/APFloat.cpp
index c566d489d11b03..8b9d9af2ca65b3 100644
--- a/llvm/lib/Support/APFloat.cpp
+++ b/llvm/lib/Support/APFloat.cpp
@@ -375,6 +375,15 @@ bool APFloatBase::semanticsHasSignedRepr(const fltSemantics &semantics) {
   return semantics.hasSignedRepr;
 }
 
+bool APFloatBase::semanticsHasInf(const fltSemantics &semantics) {
+  return semantics.nonFiniteBehavior != fltNonfiniteBehavior::NanOnly
+      && semantics.nonFiniteBehavior != fltNonfiniteBehavior::FiniteOnly;
+}
+
+bool APFloatBase::semanticsHasNan(const fltSemantics &semantics) {
+  return semantics.nonFiniteBehavior != fltNonfiniteBehavior::FiniteOnly;
+}
+
 bool APFloatBase::isRepresentableAsNormalIn(const fltSemantics &Src,
                                             const fltSemantics &Dst) {
   // Exponent range must be larger.
diff --git a/mlir/lib/AsmParser/AttributeParser.cpp b/mlir/lib/AsmParser/AttributeParser.cpp
index 9ebada076cd042..68da950f09e568 100644
--- a/mlir/lib/AsmParser/AttributeParser.cpp
+++ b/mlir/lib/AsmParser/AttributeParser.cpp
@@ -21,8 +21,10 @@
 #include "mlir/IR/DialectImplementation.h"
 #include "mlir/IR/DialectResourceBlobManager.h"
 #include "mlir/IR/IntegerSet.h"
+#include "llvm/ADT/APFloat.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/Endian.h"
+#include
 #include
 
 using namespace mlir;
@@ -121,6 +123,8 @@ Attribute Parser::parseAttribute(Type type) {
 
   // Parse floating point and integer attributes.
   case Token::floatliteral:
+  case Token::kw_inf:
+  case Token::kw_nan:
     return parseFloatAttr(type, /*isNegative=*/false);
   case Token::integer:
     return parseDecOrHexAttr(type, /*isNegative=*/false);
@@ -128,7 +132,8 @@ Attribute Parser::parseAttribute(Type type) {
     consumeToken(Token::minus);
     if (getToken().is(Token::integer))
       return parseDecOrHexAttr(type, /*isNegative=*/true);
-    if (getToken().is(Token::floatliteral))
+    if (getToken().is(Token::floatliteral) || getToken().is(Token::kw_inf) ||
+        getToken().is(Token::kw_nan))
       return parseFloatAttr(type, /*isNegative=*/true);
 
     return (emitWrongTokenError(
@@ -342,10 +347,8 @@ ParseResult Parser::parseAttributeDict(NamedAttrList &attributes) {
 
 /// Parse a float attribute.
 Attribute Parser::parseFloatAttr(Type type, bool isNegative) {
-  auto val = getToken().getFloatingPointValue();
-  if (!val)
-    return (emitError("floating point value too large for attribute"), nullptr);
-  consumeToken(Token::floatliteral);
+  const Token tok = getToken();
+  consumeToken();
   if (!type) {
     // Default to F64 when no type is specified.
     if (!consumeIf(Token::colon))
@@ -353,10 +356,16 @@ Attribute Parser::parseFloatAttr(Type type, bool isNegative) {
     else if (!(type = parseType()))
       return nullptr;
   }
-  if (!isa<FloatType>(type))
+  auto floatType = dyn_cast<FloatType>(type);
+  if (!floatType)
     return (emitError("floating point value not valid for specified type"),
             nullptr);
-  return FloatAttr::get(type, isNegative ? -*val : *val);
+  auto emitErrorAtTok = [&]() { return emitError(tok.g
```