date:20241127

[llvm-branch-commits] [llvm] AMDGPU: Simplify demanded bits on readlane/writeline index arguments (PR #117963)

2024-11-27 Thread Matt Arsenault via llvm-branch-commits


https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/117963

The main goal is to fold away wave64 code when compiled for wave32.
If we have out of bounds indexing, these will now clamp down to
a low bit which may CSE with the operations on the low half of the
wave.
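
As a rough illustration of the clamping described above, here is a minimal standalone sketch. It does not use the LLVM APIs; `laneIndexBits`, `clampLaneIndex`, and `checkReadlane32Example` are hypothetical names introduced only for this example. It assumes, as the patch comment states, that the hardware reads the low 5 bits of the lane index for wave32 and the low 6 bits for wave64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the patch): number of lane-index bits that
// are read, 5 for wave32 and 6 for wave64.
constexpr unsigned laneIndexBits(bool IsWave32) { return IsWave32 ? 5 : 6; }

// Clamp a constant lane index to the bits that are actually demanded.
constexpr std::uint32_t clampLaneIndex(std::uint32_t Index, bool IsWave32) {
  const std::uint32_t Mask = (1u << laneIndexBits(IsWave32)) - 1u;
  return Index & Mask;
}

// Mirrors the readlane_32 test: index 32 is out of bounds for wave32 and
// clamps to lane 0, but stays in bounds for wave64.
inline void checkReadlane32Example() {
  assert(clampLaneIndex(32, /*IsWave32=*/true) == 0);
  assert(clampLaneIndex(32, /*IsWave32=*/false) == 32);
  assert(clampLaneIndex(31, /*IsWave32=*/true) == 31);
}
```

Once a wave32 `readlane` with index 32 has been rewritten to index 0, it is textually identical to a `readlane` of the same source with index 0, which is what gives CSE a chance to merge the two.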

From 4c11c64bac4ba4816e070fef3af8a4e59cce2318 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Wed, 27 Nov 2024 22:24:15 -0500
Subject: [PATCH] AMDGPU: Simplify demanded bits on readlane/writeline index
 arguments

The main goal is to fold away wave64 code when compiled for wave32.
If we have out of bounds indexing, these will now clamp down to
a low bit which may CSE with the operations on the low half of the
wave.
---
 .../AMDGPU/AMDGPUInstCombineIntrinsic.cpp |  43 -
 .../Target/AMDGPU/AMDGPUTargetTransformInfo.h |   4 +
 .../lane-index-simplify-demanded-bits.ll  | 147 --
 3 files changed, 142 insertions(+), 52 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
index 18a09c39a06387..a0bb3e181ac526 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
@@ -450,6 +450,37 @@ static bool isTriviallyUniform(const Use &U) {
   return false;
 }
 
+/// Simplify a lane index operand (e.g. llvm.amdgcn.readlane src1).
+///
+/// The instruction only reads the low 5 bits for wave32, and 6 bits for wave64.
+bool GCNTTIImpl::simplifyDemandedLaneMaskArg(InstCombiner &IC,
+                                             IntrinsicInst &II,
+                                             unsigned LaneArgIdx) const {
+  unsigned MaskBits = ST->isWaveSizeKnown() && ST->isWave32() ? 5 : 6;
+  APInt DemandedMask(32, maskTrailingOnes<unsigned>(MaskBits));
+
+  KnownBits Known(32);
+  if (IC.SimplifyDemandedBits(&II, LaneArgIdx, DemandedMask, Known))
+    return true;
+
+  if (!Known.isConstant())
+    return false;
+
+  // Unlike the DAG version, SimplifyDemandedBits does not change
+  // constants. Make sure we clamp these down. Out of bounds indexes may appear
+  // in wave64 code compiled for wave32.
+
+  Value *LaneArg = II.getArgOperand(LaneArgIdx);
+  Constant *MaskedConst =
+      ConstantInt::get(LaneArg->getType(), Known.getConstant() & DemandedMask);
+  if (MaskedConst != LaneArg) {
+    II.getOperandUse(LaneArgIdx).set(MaskedConst);
+    return true;
+  }
+
+  return false;
+}
+
 std::optional<Instruction *>
 GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
   Intrinsic::ID IID = II.getIntrinsicID();
@@ -1092,7 +1123,17 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
     const Use &Src = II.getArgOperandUse(0);
     if (isTriviallyUniform(Src))
       return IC.replaceInstUsesWith(II, Src.get());
-    break;
+
+    if (IID == Intrinsic::amdgcn_readlane &&
+        simplifyDemandedLaneMaskArg(IC, II, 1))
+      return &II;
+
+    return std::nullopt;
+  }
+  case Intrinsic::amdgcn_writelane: {
+    if (simplifyDemandedLaneMaskArg(IC, II, 1))
+      return &II;
+    return std::nullopt;
   }
   case Intrinsic::amdgcn_trig_preop: {
     // The intrinsic is declared with name mangling, but currently the
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h b/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
index 10956861650ab3..585f38fc02c29c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
@@ -220,6 +220,10 @@ class GCNTTIImpl final : public BasicTTIImplBase<GCNTTIImpl> {
 
   bool canSimplifyLegacyMulToMul(const Instruction &I, const Value *Op0,
                                  const Value *Op1, InstCombiner &IC) const;
+
+  bool simplifyDemandedLaneMaskArg(InstCombiner &IC, IntrinsicInst &II,
+                                   unsigned LaneAgIdx) const;
+
   std::optional<Instruction *> instCombineIntrinsic(InstCombiner &IC,
                                                     IntrinsicInst &II) const;
   std::optional<Value *> simplifyDemandedVectorEltsIntrinsic(
diff --git a/llvm/test/Transforms/InstCombine/AMDGPU/lane-index-simplify-demanded-bits.ll b/llvm/test/Transforms/InstCombine/AMDGPU/lane-index-simplify-demanded-bits.ll
index b686f447b8d3c9..327d68bdf550e4 100644
--- a/llvm/test/Transforms/InstCombine/AMDGPU/lane-index-simplify-demanded-bits.ll
+++ b/llvm/test/Transforms/InstCombine/AMDGPU/lane-index-simplify-demanded-bits.ll
@@ -18,30 +18,45 @@ define i32 @readlane_31(i32 %arg) #0 {
 }
 
 define i32 @readlane_32(i32 %arg) #0 {
-; CHECK-LABEL: define i32 @readlane_32(
-; CHECK-SAME: i32 [[ARG:%.*]]) #[[ATTR0]] {
-; CHECK-NEXT:    [[RES:%.*]] = call i32 @llvm.amdgcn.readlane.i32(i32 [[ARG]], i32 32)
-; CHECK-NEXT:    ret i32 [[RES]]
+; WAVE64-LABEL: define i32 @readlane_32(
+; WAVE64-SAME: i32 [[ARG:%.*]]) #[[ATTR0]] {
+; WAVE64-NEXT:    [[RES:%.*]] = call i32 @llvm.amdgcn.readlane.i32(i32 [[ARG]], i32 32)
+; WAVE64-NEXT:    ret i32 [[RES]]
+

[llvm-branch-commits] [llvm] AMDGPU: Simplify demanded bits on readlane/writeline index arguments (PR #117963)

2024-11-27 Thread Matt Arsenault via llvm-branch-commits


arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite.

* **#117963** 👈 (View in Graphite)
* **#117962**
* `main`

This stack of pull requests is managed by Graphite.


https://github.com/llvm/llvm-project/pull/117963
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] AMDGPU: Simplify demanded bits on readlane/writeline index arguments (PR #117963)

2024-11-27 Thread Matt Arsenault via llvm-branch-commits


https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/117963

[llvm-branch-commits] [llvm] AMDGPU: Simplify demanded bits on readlane/writeline index arguments (PR #117963)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)


Changes

The main goal is to fold away wave64 code when compiled for wave32.
If we have out of bounds indexing, these will now clamp down to
a low bit which may CSE with the operations on the low half of the
wave.

---
Full diff: https://github.com/llvm/llvm-project/pull/117963.diff


3 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp (+42-1) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h (+4) 
- (modified) llvm/test/Transforms/InstCombine/AMDGPU/lane-index-simplify-demanded-bits.ll (+96-51)


```diff
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
index 18a09c39a06387..a0bb3e181ac526 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
@@ -450,6 +450,37 @@ static bool isTriviallyUniform(const Use &U) {
   return false;
 }
 
+/// Simplify a lane index operand (e.g. llvm.amdgcn.readlane src1).
+///
+/// The instruction only reads the low 5 bits for wave32, and 6 bits for wave64.
+bool GCNTTIImpl::simplifyDemandedLaneMaskArg(InstCombiner &IC,
+                                             IntrinsicInst &II,
+                                             unsigned LaneArgIdx) const {
+  unsigned MaskBits = ST->isWaveSizeKnown() && ST->isWave32() ? 5 : 6;
+  APInt DemandedMask(32, maskTrailingOnes<unsigned>(MaskBits));
+
+  KnownBits Known(32);
+  if (IC.SimplifyDemandedBits(&II, LaneArgIdx, DemandedMask, Known))
+    return true;
+
+  if (!Known.isConstant())
+    return false;
+
+  // Unlike the DAG version, SimplifyDemandedBits does not change
+  // constants. Make sure we clamp these down. Out of bounds indexes may appear
+  // in wave64 code compiled for wave32.
+
+  Value *LaneArg = II.getArgOperand(LaneArgIdx);
+  Constant *MaskedConst =
+      ConstantInt::get(LaneArg->getType(), Known.getConstant() & DemandedMask);
+  if (MaskedConst != LaneArg) {
+    II.getOperandUse(LaneArgIdx).set(MaskedConst);
+    return true;
+  }
+
+  return false;
+}
+
 std::optional<Instruction *>
 GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
   Intrinsic::ID IID = II.getIntrinsicID();
@@ -1092,7 +1123,17 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
     const Use &Src = II.getArgOperandUse(0);
     if (isTriviallyUniform(Src))
       return IC.replaceInstUsesWith(II, Src.get());
-    break;
+
+    if (IID == Intrinsic::amdgcn_readlane &&
+        simplifyDemandedLaneMaskArg(IC, II, 1))
+      return &II;
+
+    return std::nullopt;
+  }
+  case Intrinsic::amdgcn_writelane: {
+    if (simplifyDemandedLaneMaskArg(IC, II, 1))
+      return &II;
+    return std::nullopt;
   }
   case Intrinsic::amdgcn_trig_preop: {
     // The intrinsic is declared with name mangling, but currently the
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h b/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
index 10956861650ab3..585f38fc02c29c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
@@ -220,6 +220,10 @@ class GCNTTIImpl final : public BasicTTIImplBase<GCNTTIImpl> {
 
   bool canSimplifyLegacyMulToMul(const Instruction &I, const Value *Op0,
                                  const Value *Op1, InstCombiner &IC) const;
+
+  bool simplifyDemandedLaneMaskArg(InstCombiner &IC, IntrinsicInst &II,
+                                   unsigned LaneAgIdx) const;
+
   std::optional<Instruction *> instCombineIntrinsic(InstCombiner &IC,
                                                     IntrinsicInst &II) const;
   std::optional<Value *> simplifyDemandedVectorEltsIntrinsic(
diff --git a/llvm/test/Transforms/InstCombine/AMDGPU/lane-index-simplify-demanded-bits.ll b/llvm/test/Transforms/InstCombine/AMDGPU/lane-index-simplify-demanded-bits.ll
index b686f447b8d3c9..327d68bdf550e4 100644
--- a/llvm/test/Transforms/InstCombine/AMDGPU/lane-index-simplify-demanded-bits.ll
+++ b/llvm/test/Transforms/InstCombine/AMDGPU/lane-index-simplify-demanded-bits.ll
@@ -18,30 +18,45 @@ define i32 @readlane_31(i32 %arg) #0 {
 }
 
 define i32 @readlane_32(i32 %arg) #0 {
-; CHECK-LABEL: define i32 @readlane_32(
-; CHECK-SAME: i32 [[ARG:%.*]]) #[[ATTR0]] {
-; CHECK-NEXT:    [[RES:%.*]] = call i32 @llvm.amdgcn.readlane.i32(i32 [[ARG]], i32 32)
-; CHECK-NEXT:    ret i32 [[RES]]
+; WAVE64-LABEL: define i32 @readlane_32(
+; WAVE64-SAME: i32 [[ARG:%.*]]) #[[ATTR0]] {
+; WAVE64-NEXT:    [[RES:%.*]] = call i32 @llvm.amdgcn.readlane.i32(i32 [[ARG]], i32 32)
+; WAVE64-NEXT:    ret i32 [[RES]]
+;
+; WAVE32-LABEL: define i32 @readlane_32(
+; WAVE32-SAME: i32 [[ARG:%.*]]) #[[ATTR0]] {
+; WAVE32-NEXT:    [[RES:%.*]] = call i32 @llvm.amdgcn.readlane.i32(i32 [[ARG]], i32 0)
+; WAVE32-NEXT:    ret i32 [[RES]]
 ;
   %res = call i32 @llvm.amdgcn.readlane.i32(i32 %arg, i32 32)
   ret i32 %res
 }
```

[llvm-branch-commits] [mlir] [mlir][Transforms] Add 1:N `matchAndRewrite` overload (PR #116470)

2024-11-27 Thread Matthias Springer via llvm-branch-commits


https://github.com/matthias-springer edited 
https://github.com/llvm/llvm-project/pull/116470

[llvm-branch-commits] [flang] [Flang][NFC] Split runtime headers in preparation for cross-compilation. (PR #112188)

2024-11-27 Thread via llvm-branch-commits


https://github.com/jeanPerier approved this pull request.

LGTM, I am ok with merging this patch now to make it easier to iterate on the 
last two core patches. Thanks for all the work.

https://github.com/llvm/llvm-project/pull/112188

[llvm-branch-commits] [clang] [compiler-rt] [llvm] [TySan] Fixed false positive when accessing offset member variables (PR #95387)

2024-11-27 Thread via llvm-branch-commits


gbMattN wrote:

@fhahn it looks like the latest TySan pull updates have fixed the bug this pull 
request set out to fix. I'd be happy to close this pull request out; does TySan 
have a test to check for this bug currently?

https://github.com/llvm/llvm-project/pull/95387

[llvm-branch-commits] [llvm] [RISCV] Add FeatureDisableLatencySchedHeuristic (PR #115858)

2024-11-27 Thread Craig Topper via llvm-branch-commits



@@ -1390,6 +1390,10 @@ def FeaturePredictableSelectIsExpensive
     : SubtargetFeature<"predictable-select-expensive", "PredictableSelectIsExpensive", "true",
                        "Prefer likely predicted branches over selects">;
 
+def FeatureDisableLatencySchedHeuristic

topperc wrote:

Should this be `TuneDisableLatencySchedHeuristic`? Likewise, `FeaturePostRAScheduler`
should have been `TunePostRAScheduler`.

https://github.com/llvm/llvm-project/pull/115858

[llvm-branch-commits] [llvm] [BOLT][WIP] Support ret-converted call-cont fallthru in BAT mode (PR #115334)

2024-11-27 Thread Amir Ayupov via llvm-branch-commits


https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/115334

From 9476fad1aa50282a38614a63a6a5a41f0ac42532 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Wed, 27 Nov 2024 15:59:02 +0100
Subject: [PATCH] Preserve CFG until BAT, use it to check call cont landing
 pads, encode them in secondary entry points table

---
 bolt/docs/BAT.md  |  20 +---
 .../bolt/Profile/BoltAddressTranslation.h |  26 +
 bolt/lib/Core/BinaryEmitter.cpp   |   3 +-
 bolt/lib/Profile/BoltAddressTranslation.cpp   | 104 +-
 bolt/lib/Profile/DataAggregator.cpp   |  17 +--
 5 files changed, 44 insertions(+), 126 deletions(-)

diff --git a/bolt/docs/BAT.md b/bolt/docs/BAT.md
index 20340884621b9a..828114310e195f 100644
--- a/bolt/docs/BAT.md
+++ b/bolt/docs/BAT.md
@@ -115,21 +115,13 @@ Deleted basic blocks are emitted as having `OutputOffset` equal to the size of
 the function. They don't affect address translation and only participate in
 input basic block mapping.
 
-### Secondary Entry Points table
+### Secondary Entry Points and Call Continuation Landing Pads table
 The table is emitted for hot fragments only. It contains `NumSecEntryPoints`
-offsets denoting secondary entry points, delta encoded, implicitly starting at zero.
+offsets, delta encoded, implicitly starting at zero.
 | Entry | Encoding | Description |
 | ----- | -------- | ----------- |
-| `SecEntryPoint` | Delta, ULEB128 | Secondary entry point offset |
+| `OutputOffset` | Delta, ULEB128 | An offset of secondary entry point or a call continuation landing pad\*|
 
-### Call continuation landing pads table
-This table contains the addresses of call continuation blocks that are also
-landing pads, to aid pre-aggregated profile conversion. The table is optional
-for backwards compatibility, but new versions of BOLT will always emit it.
-
-| Entry | Encoding | Description |
-| ----- | -------- | ----------- |
-| `NumEntries` | ULEB128 | Number of addresses |
-| `InputAddress` | Delta, ULEB128 | `NumEntries` input addresses of call continuation landing pad blocks |
-
-Addresses are delta encoded, implicitly starting at zero.
+Call continuation landing pads offsets are shifted by the size of the function
+for backwards compatibility (treated as entry points past the end of the
+function).
diff --git a/bolt/include/bolt/Profile/BoltAddressTranslation.h b/bolt/include/bolt/Profile/BoltAddressTranslation.h
index b04ed7a82eeefb..f956f48b8356b2 100644
--- a/bolt/include/bolt/Profile/BoltAddressTranslation.h
+++ b/bolt/include/bolt/Profile/BoltAddressTranslation.h
@@ -143,20 +143,12 @@ class BoltAddressTranslation {
   /// Write the serialized address translation table for a function.
   template  void writeMaps(uint64_t &PrevAddress, raw_ostream &OS);
 
-  /// Write call continuation landing pad addresses.
-  void writeCallContLandingPads(raw_ostream &OS);
-
   /// Read the serialized address translation table for a function.
   /// Return a parse error if failed.
   template 
   void parseMaps(uint64_t &PrevAddress, DataExtractor &DE, uint64_t &Offset,
  Error &Err);
 
-
-  /// Read the table with call continuation landing pad offsets.
-  void parseCallContLandingPads(DataExtractor &DE, uint64_t &Offset,
-Error &Err);
-
   /// Returns the bitmask with set bits corresponding to indices of BRANCHENTRY
   /// entries in function address translation map.
   APInt calculateBranchEntriesBitMask(MapTy &Map, size_t EqualElems) const;
@@ -176,14 +168,6 @@ class BoltAddressTranslation {
   /// Map a function to its secondary entry points vector
   std::unordered_map> SecondaryEntryPointsMap;
 
-  /// Vector with call continuation landing pads input addresses (pre-BOLT
-  /// binary).
-  std::vector CallContLandingPadAddrs;
-
-  /// Return a secondary entry point ID for a function located at \p Address and
-  /// \p Offset within that function.
-  unsigned getSecondaryEntryPointId(uint64_t Address, uint32_t Offset) const;
-
   /// Links outlined cold bocks to their original function
   std::map ColdPartSource;
 
@@ -195,13 +179,9 @@ class BoltAddressTranslation {
   const static uint32_t BRANCHENTRY = 0x1;
 
 public:
-  /// Returns whether a given \p Offset is a secondary entry point in function
-  /// with address \p Address.
-  bool isSecondaryEntry(uint64_t Address, uint32_t Offset) const;
-
-  /// Returns whether a given \p Offset is a call continuation landing pad in
-  /// function with address \p Address.
-  bool isCallContinuationLandingPad(uint64_t Address, uint32_t Offset) const;
+  /// Return a secondary entry point ID for a function located at \p Address and
+  /// \p Offset within that function.
+  unsigned getSecondaryEntryPointId(uint64_t Address, uint32_t Offset) const;
 
   /// Map basic block input offset to a basic block index and hash pair.
   class BBHashMapTy {
diff --git a/bolt/lib/Core/BinaryEmitter.cpp b/bolt/lib/Cor
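
The combined table described in the BAT.md hunk above (delta-encoded ULEB128 offsets, with call continuation landing pads shifted by the function size) can be sketched roughly as follows. This is a standalone illustration, not BOLT's implementation: `appendULEB128` and `encodeEntryTable` are hypothetical names, and the sketch assumes the offsets are sorted before delta encoding and that the entry count is emitted elsewhere.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Append Value in ULEB128 form: 7 payload bits per byte, with the high bit
// set on every byte except the last.
inline void appendULEB128(std::uint64_t Value, std::vector<std::uint8_t> &Out) {
  do {
    std::uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (Value != 0);
}

// Hypothetical encoder: landing pad offsets are shifted by FunctionSize so an
// older reader would see them as entry points past the end of the function.
inline std::vector<std::uint8_t>
encodeEntryTable(std::vector<std::uint64_t> EntryPoints,
                 const std::vector<std::uint64_t> &LandingPads,
                 std::uint64_t FunctionSize) {
  for (std::uint64_t LP : LandingPads) {
    assert(LP < FunctionSize && "assumed: offset lies inside the function");
    EntryPoints.push_back(LP + FunctionSize);
  }
  std::sort(EntryPoints.begin(), EntryPoints.end());

  std::vector<std::uint8_t> Out;
  std::uint64_t Prev = 0; // Deltas implicitly start at zero.
  for (std::uint64_t Offset : EntryPoints) {
    appendULEB128(Offset - Prev, Out);
    Prev = Offset;
  }
  return Out;
}
```

For example, entry points at offsets 4 and 16 plus a landing pad at offset 8 in a 32-byte function become the values 4, 16, 40, which are stored as the deltas 4, 12, 24.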

[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-11-27 Thread Florian Hahn via llvm-branch-commits


https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/76260

From ab8d005600b99fb62d991bc63c58136576429385 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Thu, 18 Apr 2024 23:01:03 +0100
Subject: [PATCH 1/4] [TySan] A Type Sanitizer (Clang)

---
 clang/include/clang/Basic/Features.def |  1 +
 clang/include/clang/Basic/Sanitizers.def   |  3 ++
 clang/include/clang/Driver/SanitizerArgs.h |  1 +
 clang/lib/CodeGen/BackendUtil.cpp  |  6 
 clang/lib/CodeGen/CGDecl.cpp   |  3 +-
 clang/lib/CodeGen/CGDeclCXX.cpp|  4 +++
 clang/lib/CodeGen/CodeGenFunction.cpp  |  2 ++
 clang/lib/CodeGen/CodeGenModule.cpp| 12 ---
 clang/lib/CodeGen/CodeGenTBAA.cpp  |  6 ++--
 clang/lib/CodeGen/SanitizerMetadata.cpp| 40 +-
 clang/lib/CodeGen/SanitizerMetadata.h  | 13 +++
 clang/lib/Driver/SanitizerArgs.cpp | 13 ---
 clang/lib/Driver/ToolChains/CommonArgs.cpp |  6 +++-
 clang/lib/Driver/ToolChains/Darwin.cpp |  6 
 clang/lib/Driver/ToolChains/Linux.cpp  |  2 ++
 clang/test/Driver/sanitizer-ld.c   | 23 +
 16 files changed, 114 insertions(+), 27 deletions(-)

diff --git a/clang/include/clang/Basic/Features.def b/clang/include/clang/Basic/Features.def
index 9088c867d53ce4..1d5459fc74d449 100644
--- a/clang/include/clang/Basic/Features.def
+++ b/clang/include/clang/Basic/Features.def
@@ -102,6 +102,7 @@ FEATURE(numerical_stability_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Nume
 FEATURE(memory_sanitizer,
 LangOpts.Sanitize.hasOneOf(SanitizerKind::Memory |
SanitizerKind::KernelMemory))
+FEATURE(type_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Type))
 FEATURE(thread_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Thread))
 FEATURE(dataflow_sanitizer, LangOpts.Sanitize.has(SanitizerKind::DataFlow))
 FEATURE(scudo, LangOpts.Sanitize.hasOneOf(SanitizerKind::Scudo))
diff --git a/clang/include/clang/Basic/Sanitizers.def b/clang/include/clang/Basic/Sanitizers.def
index 9223f62b3639a7..f234488eaa80cf 100644
--- a/clang/include/clang/Basic/Sanitizers.def
+++ b/clang/include/clang/Basic/Sanitizers.def
@@ -73,6 +73,9 @@ SANITIZER("fuzzer", Fuzzer)
 // libFuzzer-required instrumentation, no linking.
 SANITIZER("fuzzer-no-link", FuzzerNoLink)
 
+// TypeSanitizer
+SANITIZER("type", Type)
+
 // ThreadSanitizer
 SANITIZER("thread", Thread)
 
diff --git a/clang/include/clang/Driver/SanitizerArgs.h b/clang/include/clang/Driver/SanitizerArgs.h
index 0c6f3869549ef7..4f08ea2b260179 100644
--- a/clang/include/clang/Driver/SanitizerArgs.h
+++ b/clang/include/clang/Driver/SanitizerArgs.h
@@ -87,6 +87,7 @@ class SanitizerArgs {
   bool needsHwasanAliasesRt() const {
 return needsHwasanRt() && HwasanUseAliases;
   }
+  bool needsTysanRt() const { return Sanitizers.has(SanitizerKind::Type); }
   bool needsTsanRt() const { return Sanitizers.has(SanitizerKind::Thread); }
   bool needsMsanRt() const { return Sanitizers.has(SanitizerKind::Memory); }
   bool needsFuzzer() const { return Sanitizers.has(SanitizerKind::Fuzzer); }
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index bf9b04f02e9f44..014dc5cdeb616e 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -77,6 +77,7 @@
 #include "llvm/Transforms/Instrumentation/SanitizerBinaryMetadata.h"
 #include "llvm/Transforms/Instrumentation/SanitizerCoverage.h"
 #include "llvm/Transforms/Instrumentation/ThreadSanitizer.h"
+#include "llvm/Transforms/Instrumentation/TypeSanitizer.h"
 #include "llvm/Transforms/ObjCARC.h"
 #include "llvm/Transforms/Scalar/EarlyCSE.h"
 #include "llvm/Transforms/Scalar/GVN.h"
@@ -735,6 +736,11 @@ static void addSanitizers(const Triple &TargetTriple,
   MPM.addPass(createModuleToFunctionPassAdaptor(ThreadSanitizerPass()));
 }
 
+if (LangOpts.Sanitize.has(SanitizerKind::Type)) {
+  MPM.addPass(ModuleTypeSanitizerPass());
+  MPM.addPass(createModuleToFunctionPassAdaptor(TypeSanitizerPass()));
+}
+
 if (LangOpts.Sanitize.has(SanitizerKind::NumericalStability))
   MPM.addPass(NumericalStabilitySanitizerPass());
 
diff --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
index 47b21bc9f63f04..bb9d120c37ca86 100644
--- a/clang/lib/CodeGen/CGDecl.cpp
+++ b/clang/lib/CodeGen/CGDecl.cpp
@@ -458,7 +458,8 @@ void CodeGenFunction::EmitStaticVarDecl(const VarDecl &D,
   LocalDeclMap.find(&D)->second = Address(castedAddr, elemTy, alignment);
   CGM.setStaticLocalDeclAddress(&D, castedAddr);
 
-  CGM.getSanitizerMetadata()->reportGlobal(var, D);
+  CGM.getSanitizerMetadata()->reportGlobalToASan(var, D);
+  CGM.getSanitizerMetadata()->reportGlobalToTySan(var, D);
 
   // Emit global variable debug descriptor for static vars.
   CGDebugInfo *DI = getDebugInfo();
diff --git a/clang/lib/CodeGen/CGDeclCXX.cpp b/clang/lib/CodeGen/CGDeclCXX.cpp
index 2c3054605ee754..96517511b21

[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-11-27 Thread Florian Hahn via llvm-branch-commits



@@ -1027,6 +1027,10 @@ Sanitizers
   `_. See that link
   for examples.
 
+- Introduced an experimental Type Sanitizer, activated by using the
+  -fsanitize=type flag. This sanitizer detects violations of C/C++ type-based

fhahn wrote:

Updated, thanks!

https://github.com/llvm/llvm-project/pull/76260

[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-11-27 Thread Florian Hahn via llvm-branch-commits


https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/76260

[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-11-27 Thread Florian Hahn via llvm-branch-commits



@@ -102,6 +102,7 @@ FEATURE(numerical_stability_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Nume
 FEATURE(memory_sanitizer,
 LangOpts.Sanitize.hasOneOf(SanitizerKind::Memory |
SanitizerKind::KernelMemory))
+FEATURE(type_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Type))

fhahn wrote:

Where would be the right place to add those? Might be good to at least file an 
issue to clean this up?

https://github.com/llvm/llvm-project/pull/76260

[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-11-27 Thread Florian Hahn via llvm-branch-commits



@@ -6630,7 +6631,8 @@ CodeGenModule::GetAddrOfConstantStringFromLiteral(const StringLiteral *S,
   if (Entry)
 *Entry = GV;
 
-  SanitizerMD->reportGlobal(GV, S->getStrTokenLoc(0), "");
+  SanitizerMD->reportGlobalToASan(GV, S->getStrTokenLoc(0), "");
+  // FIXME: Should we also report to the TySan?

fhahn wrote:

Yep, merged the code to report globals again, thanks!

https://github.com/llvm/llvm-project/pull/76260

[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-11-27 Thread Florian Hahn via llvm-branch-commits


https://github.com/fhahn commented:

Thanks. As this depends on LLVM and compiler-rt patches, it should only be
merged once those other PRs are also approved.

The motivation for the sanitizer is enabling `-fpointer-tbaa` by default
(https://github.com/llvm/llvm-project/pull/117244)
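
For readers unfamiliar with the class of bug involved, here is a small sketch of a type-based aliasing violation next to a well-defined alternative. The function names are hypothetical and the example is not taken from the PR. It assumes that building with the `-fsanitize=type` flag mentioned in this thread would be expected to report the punned access at run time; the exact diagnostic is not shown here. The expected bit pattern in the usage note additionally assumes IEEE 754 single-precision `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a float object through an lvalue of type uint32_t. This violates the
// type-based aliasing rules, so the behaviour is undefined. It is shown for
// contrast only and is deliberately never called in this sketch.
inline std::uint32_t floatBitsPunned(float *F) {
  return *reinterpret_cast<std::uint32_t *>(F);
}

// Well-defined alternative: copy the object representation with memcpy.
inline std::uint32_t floatBitsSafe(float F) {
  static_assert(sizeof(std::uint32_t) == sizeof(float),
                "sketch assumes a 32-bit float");
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Positive zero has an all-zero object representation on IEEE 754 targets.
inline void checkSafeVersion() { assert(floatBitsSafe(0.0f) == 0u); }
```

With IEEE 754 floats, `floatBitsSafe(1.0f)` yields `0x3f800000`. An optimizer that relies on type-based alias analysis is free to assume the `uint32_t` load in `floatBitsPunned` does not alias any `float` store, which is why such code becomes riskier as more TBAA information is emitted.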

https://github.com/llvm/llvm-project/pull/76260

[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-11-27 Thread Florian Hahn via llvm-branch-commits


https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/76260

From ab8d005600b99fb62d991bc63c58136576429385 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Thu, 18 Apr 2024 23:01:03 +0100
Subject: [PATCH 1/5] [TySan] A Type Sanitizer (Clang)

---
 clang/include/clang/Basic/Features.def |  1 +
 clang/include/clang/Basic/Sanitizers.def   |  3 ++
 clang/include/clang/Driver/SanitizerArgs.h |  1 +
 clang/lib/CodeGen/BackendUtil.cpp  |  6 
 clang/lib/CodeGen/CGDecl.cpp   |  3 +-
 clang/lib/CodeGen/CGDeclCXX.cpp|  4 +++
 clang/lib/CodeGen/CodeGenFunction.cpp  |  2 ++
 clang/lib/CodeGen/CodeGenModule.cpp| 12 ---
 clang/lib/CodeGen/CodeGenTBAA.cpp  |  6 ++--
 clang/lib/CodeGen/SanitizerMetadata.cpp| 40 +-
 clang/lib/CodeGen/SanitizerMetadata.h  | 13 +++
 clang/lib/Driver/SanitizerArgs.cpp | 13 ---
 clang/lib/Driver/ToolChains/CommonArgs.cpp |  6 +++-
 clang/lib/Driver/ToolChains/Darwin.cpp |  6 
 clang/lib/Driver/ToolChains/Linux.cpp  |  2 ++
 clang/test/Driver/sanitizer-ld.c   | 23 +
 16 files changed, 114 insertions(+), 27 deletions(-)

diff --git a/clang/include/clang/Basic/Features.def b/clang/include/clang/Basic/Features.def
index 9088c867d53ce4..1d5459fc74d449 100644
--- a/clang/include/clang/Basic/Features.def
+++ b/clang/include/clang/Basic/Features.def
@@ -102,6 +102,7 @@ FEATURE(numerical_stability_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Nume
 FEATURE(memory_sanitizer,
 LangOpts.Sanitize.hasOneOf(SanitizerKind::Memory |
SanitizerKind::KernelMemory))
+FEATURE(type_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Type))
 FEATURE(thread_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Thread))
 FEATURE(dataflow_sanitizer, LangOpts.Sanitize.has(SanitizerKind::DataFlow))
 FEATURE(scudo, LangOpts.Sanitize.hasOneOf(SanitizerKind::Scudo))
diff --git a/clang/include/clang/Basic/Sanitizers.def b/clang/include/clang/Basic/Sanitizers.def
index 9223f62b3639a7..f234488eaa80cf 100644
--- a/clang/include/clang/Basic/Sanitizers.def
+++ b/clang/include/clang/Basic/Sanitizers.def
@@ -73,6 +73,9 @@ SANITIZER("fuzzer", Fuzzer)
 // libFuzzer-required instrumentation, no linking.
 SANITIZER("fuzzer-no-link", FuzzerNoLink)
 
+// TypeSanitizer
+SANITIZER("type", Type)
+
 // ThreadSanitizer
 SANITIZER("thread", Thread)
 
diff --git a/clang/include/clang/Driver/SanitizerArgs.h b/clang/include/clang/Driver/SanitizerArgs.h
index 0c6f3869549ef7..4f08ea2b260179 100644
--- a/clang/include/clang/Driver/SanitizerArgs.h
+++ b/clang/include/clang/Driver/SanitizerArgs.h
@@ -87,6 +87,7 @@ class SanitizerArgs {
   bool needsHwasanAliasesRt() const {
 return needsHwasanRt() && HwasanUseAliases;
   }
+  bool needsTysanRt() const { return Sanitizers.has(SanitizerKind::Type); }
   bool needsTsanRt() const { return Sanitizers.has(SanitizerKind::Thread); }
   bool needsMsanRt() const { return Sanitizers.has(SanitizerKind::Memory); }
   bool needsFuzzer() const { return Sanitizers.has(SanitizerKind::Fuzzer); }
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index bf9b04f02e9f44..014dc5cdeb616e 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -77,6 +77,7 @@
 #include "llvm/Transforms/Instrumentation/SanitizerBinaryMetadata.h"
 #include "llvm/Transforms/Instrumentation/SanitizerCoverage.h"
 #include "llvm/Transforms/Instrumentation/ThreadSanitizer.h"
+#include "llvm/Transforms/Instrumentation/TypeSanitizer.h"
 #include "llvm/Transforms/ObjCARC.h"
 #include "llvm/Transforms/Scalar/EarlyCSE.h"
 #include "llvm/Transforms/Scalar/GVN.h"
@@ -735,6 +736,11 @@ static void addSanitizers(const Triple &TargetTriple,
   MPM.addPass(createModuleToFunctionPassAdaptor(ThreadSanitizerPass()));
 }
 
+if (LangOpts.Sanitize.has(SanitizerKind::Type)) {
+  MPM.addPass(ModuleTypeSanitizerPass());
+  MPM.addPass(createModuleToFunctionPassAdaptor(TypeSanitizerPass()));
+}
+
 if (LangOpts.Sanitize.has(SanitizerKind::NumericalStability))
   MPM.addPass(NumericalStabilitySanitizerPass());
 
diff --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
index 47b21bc9f63f04..bb9d120c37ca86 100644
--- a/clang/lib/CodeGen/CGDecl.cpp
+++ b/clang/lib/CodeGen/CGDecl.cpp
@@ -458,7 +458,8 @@ void CodeGenFunction::EmitStaticVarDecl(const VarDecl &D,
   LocalDeclMap.find(&D)->second = Address(castedAddr, elemTy, alignment);
   CGM.setStaticLocalDeclAddress(&D, castedAddr);
 
-  CGM.getSanitizerMetadata()->reportGlobal(var, D);
+  CGM.getSanitizerMetadata()->reportGlobalToASan(var, D);
+  CGM.getSanitizerMetadata()->reportGlobalToTySan(var, D);
 
   // Emit global variable debug descriptor for static vars.
   CGDebugInfo *DI = getDebugInfo();
diff --git a/clang/lib/CodeGen/CGDeclCXX.cpp b/clang/lib/CodeGen/CGDeclCXX.cpp
index 2c3054605ee754..96517511b21

[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-11-27 Thread Florian Hahn via llvm-branch-commits



@@ -5740,7 +5740,8 @@ void CodeGenModule::EmitGlobalVarDefinition(const VarDecl *D,
   if (NeedsGlobalCtor || NeedsGlobalDtor)
     EmitCXXGlobalVarDeclInitFunc(D, GV, NeedsGlobalCtor);
 
 
-  SanitizerMD->reportGlobal(GV, *D, NeedsGlobalCtor);
+  SanitizerMD->reportGlobalToASan(GV, *D, NeedsGlobalCtor);

fhahn wrote:

Actually, on current `main`, `reportGlobal` reports not only to ASan, but also to
memsan and hwsan. Updated to include reporting globals to TySan, thanks!
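
To illustrate the point in the comment above, here is a minimal sketch of a single `reportGlobal` entry point that fans out to every enabled sanitizer, so that adding TySan means adding one more enabled entry rather than a second call at each call site. Everything here (`Sanitizer`, `GlobalReporter`, the string labels) is a hypothetical stand-in and not Clang's actual `SanitizerMetadata` interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins named after the sanitizers mentioned in the comment;
// these are not Clang types.
enum class Sanitizer { ASan, MemSan, HWSan, TySan };

struct GlobalReporter {
  std::vector<Sanitizer> enabled; // sanitizers active for this compilation
  std::vector<std::string> log;   // records "<sanitizer>:<global>" entries

  static const char *name(Sanitizer s) {
    switch (s) {
    case Sanitizer::ASan:   return "asan";
    case Sanitizer::MemSan: return "memsan";
    case Sanitizer::HWSan:  return "hwsan";
    case Sanitizer::TySan:  return "tysan";
    }
    return "unknown";
  }

  // Single entry point: every enabled sanitizer is told about the global,
  // so call sites do not need one call per sanitizer.
  void reportGlobal(const std::string &global) {
    assert(!global.empty() && "a reported global must have a name");
    for (Sanitizer s : enabled)
      log.push_back(std::string(name(s)) + ":" + global);
  }
};
```

With this shape, a call site keeps one `reportGlobal(...)` call regardless of how many sanitizers consume the information.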

https://github.com/llvm/llvm-project/pull/76260
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2024-11-27 Thread via llvm-branch-commits



@@ -50,43 +42,28 @@ macro(enable_cuda_compilation name files)
   "${CUDA_COMPILE_OPTIONS}"
   )
 
-if (EXISTS "${FLANG_LIBCUDACXX_PATH}/include")
+if (EXISTS "${FLANG_RT_LIBCUDACXX_PATH}/include")
   # When using libcudacxx headers files, we have to use them
   # for all files of F18 runtime.
-  include_directories(AFTER ${FLANG_LIBCUDACXX_PATH}/include)
+  include_directories(AFTER ${FLANG_RT_LIBCUDACXX_PATH}/include)
   add_compile_definitions(RT_USE_LIBCUDACXX=1)
 endif()
 
 # Add an OBJECT library consisting of CUDA PTX.
-llvm_add_library(${name}PTX OBJECT PARTIAL_SOURCES_INTENDED ${files})
-set_property(TARGET obj.${name}PTX PROPERTY CUDA_PTX_COMPILATION ON)
-if (FLANG_CUDA_RUNTIME_PTX_WITHOUT_GLOBAL_VARS)
-  target_compile_definitions(obj.${name}PTX
-PRIVATE FLANG_RUNTIME_NO_GLOBAL_VAR_DEFS
+add_flangrt_library(${name}PTX OBJECT ${files})

jeanPerier wrote:

Thanks a lot, that solved my build problems! You are correct that this cannot 
yet be used as is by flang-new; it is being used to experiment with the best 
ways to plug in and use the runtime on the device, and it does not need to be 
installed.

https://github.com/llvm/llvm-project/pull/110217

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Use new modifiers in DEPEND/GRAINSIZE/NUM_TASKS (PR #117916)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-flang-fir-hlfir

Author: Krzysztof Parzyszek (kparzysz)


Changes

The usual changes, added more references to OpenMP specs.
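
As a rough illustration of the grammar this patch documents for the clause, `GRAINSIZE([prescriptiveness:] grain-size)` with `STRICT` as the only prescriptiveness value, the sketch below parses the text between the parentheses. It is not flang's parser: `GrainsizeArg`, `parseGrainsizeArg`, and the helpers are hypothetical, and the grain size is restricted to a decimal literal, whereas the real clause takes a scalar integer expression.

```cpp
#include <cassert>
#include <cctype>
#include <optional>
#include <string>

enum class Prescriptiveness { Strict };

// Hypothetical result type: an optional modifier plus a literal grain size.
struct GrainsizeArg {
  std::optional<Prescriptiveness> modifier;
  int grainSize;
};

static std::string trim(const std::string &s) {
  const char *ws = " \t";
  auto b = s.find_first_not_of(ws);
  if (b == std::string::npos)
    return "";
  auto e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

static std::string lower(std::string s) {
  for (char &c : s)
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return s;
}

// Parses "[strict:] <digits>"; returns nullopt on anything else.
std::optional<GrainsizeArg> parseGrainsizeArg(const std::string &text) {
  GrainsizeArg result{std::nullopt, 0};
  std::string rest = text;
  auto colon = rest.find(':');
  if (colon != std::string::npos) {
    // Fortran keywords are case-insensitive, so compare in lower case.
    if (lower(trim(rest.substr(0, colon))) != "strict")
      return std::nullopt;
    result.modifier = Prescriptiveness::Strict;
    rest = rest.substr(colon + 1);
  }
  rest = trim(rest);
  if (rest.empty() || rest.size() > 9)
    return std::nullopt;
  for (char c : rest)
    if (!std::isdigit(static_cast<unsigned char>(c)))
      return std::nullopt;
  assert(rest.size() <= 9); // guarded above, so std::stoi cannot overflow int
  result.grainSize = std::stoi(rest);
  return result;
}
```

Under these assumptions, `parseGrainsizeArg("strict: 8")` yields the modifier plus 8, while `parseGrainsizeArg("16")` yields no modifier, matching the two alternatives in the grammar comment.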

---

Patch is 24.00 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/117916.diff


15 Files Affected:

- (modified) flang/examples/FeatureList/FeatureList.cpp (+4-2) 
- (modified) flang/include/flang/Parser/dump-parse-tree.h (+5-2) 
- (modified) flang/include/flang/Parser/parse-tree.h (+30-12) 
- (modified) flang/include/flang/Semantics/openmp-modifiers.h (+1) 
- (modified) flang/lib/Lower/OpenMP/Clauses.cpp (+26-29) 
- (modified) flang/lib/Lower/OpenMP/Clauses.h (+1) 
- (modified) flang/lib/Parser/openmp-parsers.cpp (+26-5) 
- (modified) flang/lib/Parser/parse-tree.cpp (+12-1) 
- (modified) flang/lib/Parser/unparse.cpp (+7-9) 
- (modified) flang/lib/Semantics/check-omp-structure.cpp (+5-10) 
- (modified) flang/lib/Semantics/openmp-modifiers.cpp (+19-2) 
- (modified) flang/test/Parser/OpenMP/depobj-construct.f90 (+2-2) 
- (modified) flang/test/Parser/OpenMP/taskloop.f90 (+4-4) 
- (modified) flang/test/Semantics/OpenMP/depend05.f90 (+1-1) 
- (modified) llvm/include/llvm/Frontend/OpenMP/ClauseT.h (+3-2) 


``diff
diff --git a/flang/examples/FeatureList/FeatureList.cpp 
b/flang/examples/FeatureList/FeatureList.cpp
index 2e90f19dc2e62c..c5cb8c8fdf40bb 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -488,7 +488,9 @@ struct NodeVisitor {
   READ_FEATURE(OmpEndLoopDirective)
   READ_FEATURE(OmpEndSectionsDirective)
   READ_FEATURE(OmpGrainsizeClause)
-  READ_FEATURE(OmpGrainsizeClause::Prescriptiveness)
+  READ_FEATURE(OmpGrainsizeClause::Modifier)
+  READ_FEATURE(OmpPrescriptiveness)
+  READ_FEATURE(OmpPrescriptiveness::Value)
   READ_FEATURE(OmpIfClause)
   READ_FEATURE(OmpIfClause::DirectiveNameModifier)
   READ_FEATURE(OmpLinearClause)
@@ -500,7 +502,7 @@ struct NodeVisitor {
   READ_FEATURE(OmpMapClause)
   READ_FEATURE(OmpMapClause::Modifier)
   READ_FEATURE(OmpNumTasksClause)
-  READ_FEATURE(OmpNumTasksClause::Prescriptiveness)
+  READ_FEATURE(OmpNumTasksClause::Modifier)
   READ_FEATURE(OmpObject)
   READ_FEATURE(OmpObjectList)
   READ_FEATURE(OmpOrderClause)
diff --git a/flang/include/flang/Parser/dump-parse-tree.h 
b/flang/include/flang/Parser/dump-parse-tree.h
index 3699aa34f4f8ad..1ec38de29b85d6 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -534,6 +534,7 @@ class ParseTreeDumper {
   NODE(OmpDoacross, Source)
   NODE(parser, OmpDependClause)
   NODE(OmpDependClause, TaskDep)
+  NODE(OmpDependClause::TaskDep, Modifier)
   NODE(parser, OmpDetachClause)
   NODE(parser, OmpDoacrossClause)
   NODE(parser, OmpDestroyClause)
@@ -572,9 +573,11 @@ class ParseTreeDumper {
   NODE(parser, OmpOrderModifier)
   NODE_ENUM(OmpOrderModifier, Value)
   NODE(parser, OmpGrainsizeClause)
-  NODE_ENUM(OmpGrainsizeClause, Prescriptiveness)
+  NODE(OmpGrainsizeClause, Modifier)
+  NODE(parser, OmpPrescriptiveness)
+  NODE_ENUM(OmpPrescriptiveness, Value)
   NODE(parser, OmpNumTasksClause)
-  NODE_ENUM(OmpNumTasksClause, Prescriptiveness)
+  NODE(OmpNumTasksClause, Modifier)
   NODE(parser, OmpBindClause)
   NODE_ENUM(OmpBindClause, Binding)
   NODE(parser, OmpProcBindClause)
diff --git a/flang/include/flang/Parser/parse-tree.h 
b/flang/include/flang/Parser/parse-tree.h
index 2143e280457535..c00560b1f1726a 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3627,6 +3627,15 @@ struct OmpOrderModifier {
   WRAPPER_CLASS_BOILERPLATE(OmpOrderModifier, Value);
 };
 
+// Ref: [5.1:166-171], [5.2:269-270]
+//
+// prescriptiveness ->
+//STRICT// since 5.1
+struct OmpPrescriptiveness {
+  ENUM_CLASS(Value, Strict)
+  WRAPPER_CLASS_BOILERPLATE(OmpPrescriptiveness, Value);
+};
+
 // Ref: [4.5:201-207], [5.0:293-299], [5.1:325-331], [5.2:124]
 //
 // reduction-identifier ->
@@ -3816,8 +3825,8 @@ struct OmpDependClause {
   struct TaskDep {
 OmpTaskDependenceType::Value GetTaskDepType() const;
 TUPLE_CLASS_BOILERPLATE(TaskDep);
-std::tuple, OmpTaskDependenceType, 
OmpObjectList>
-t;
+MODIFIER_BOILERPLATE(OmpIterator, OmpTaskDependenceType);
+std::tuple t;
   };
   std::variant u;
 };
@@ -3878,11 +3887,15 @@ struct OmpFromClause {
   std::tuple t;
 };
 
-// OMP 5.2 12.6.1 grainsize-clause -> grainsize ([prescriptiveness :] value)
+// Ref: [4.5:87-91], [5.0:140-146], [5.1:166-171], [5.2:269]
+//
+// grainsize-clause ->
+//GRAINSIZE(grain-size) |   // since 4.5
+//GRAINSIZE([prescriptiveness:] grain-size) // since 5.1
 struct OmpGrainsizeClause {
   TUPLE_CLASS_BOILERPLATE(OmpGrainsizeClause);
-  ENUM_CLASS(Prescriptiveness, Strict);
-  std::tuple<std::optional<Prescriptiveness>, ScalarIntExpr> t;
+  MODIFIER_BOILERPLATE(OmpPrescriptiveness);
+  std::tuple<MODIFIERS(), ScalarIntExpr> t;
 };
 
 // 2.12 if-claus

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Use new modifiers in DEPEND/GRAINSIZE/NUM_TASKS (PR #117916)

2024-11-27 Thread Krzysztof Parzyszek via llvm-branch-commits


https://github.com/kparzysz created 
https://github.com/llvm/llvm-project/pull/117916

The usual changes, added more references to OpenMP specs.
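
One way to picture the refactoring in the patch below: the per-clause nested `Prescriptiveness` enums are replaced by a single shared `OmpPrescriptiveness` type that both GRAINSIZE and NUM_TASKS refer to. The sketch is a simplified model under stated assumptions, not flang's parse tree. All type and function names are hypothetical, the clause argument is a plain `int` instead of an expression, and the rendered text format is chosen only for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

// One shared modifier type instead of a nested enum inside each clause.
struct Prescriptiveness {
  enum class Value { Strict };
  Value v;
};

// Both clauses carry the same optional modifier plus an integer argument.
struct GrainsizeClause {
  std::optional<Prescriptiveness> modifier;
  int grainSize;
};

struct NumTasksClause {
  std::optional<Prescriptiveness> modifier;
  int numTasks;
};

// Because the modifier type is shared, one helper serves either clause.
std::string renderModifier(const std::optional<Prescriptiveness> &m) {
  if (!m)
    return "";
  assert(m->v == Prescriptiveness::Value::Strict && "only STRICT is defined");
  return "STRICT: ";
}

std::string render(const GrainsizeClause &c) {
  return "GRAINSIZE(" + renderModifier(c.modifier) +
         std::to_string(c.grainSize) + ")";
}

std::string render(const NumTasksClause &c) {
  return "NUM_TASKS(" + renderModifier(c.modifier) +
         std::to_string(c.numTasks) + ")";
}
```

The benefit the sketch tries to show is that code handling the modifier is written once and reused, instead of being duplicated for each clause's own enum.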

>From 43f008a7f8b7a6377f6cb7f3ea4cc20394c2d79d Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek 
Date: Wed, 27 Nov 2024 08:34:33 -0600
Subject: [PATCH] [flang][OpenMP] Use new modifiers in
 DEPEND/GRAINSIZE/NUM_TASKS

The usual changes, added more references to OpenMP specs.
---
 flang/examples/FeatureList/FeatureList.cpp|  6 +-
 flang/include/flang/Parser/dump-parse-tree.h  |  7 ++-
 flang/include/flang/Parser/parse-tree.h   | 42 ++
 .../flang/Semantics/openmp-modifiers.h|  1 +
 flang/lib/Lower/OpenMP/Clauses.cpp| 55 +--
 flang/lib/Lower/OpenMP/Clauses.h  |  1 +
 flang/lib/Parser/openmp-parsers.cpp   | 31 +--
 flang/lib/Parser/parse-tree.cpp   | 13 -
 flang/lib/Parser/unparse.cpp  | 16 +++---
 flang/lib/Semantics/check-omp-structure.cpp   | 15 ++---
 flang/lib/Semantics/openmp-modifiers.cpp  | 21 ++-
 flang/test/Parser/OpenMP/depobj-construct.f90 |  4 +-
 flang/test/Parser/OpenMP/taskloop.f90 |  8 +--
 flang/test/Semantics/OpenMP/depend05.f90  |  2 +-
 llvm/include/llvm/Frontend/OpenMP/ClauseT.h   |  5 +-
 15 files changed, 146 insertions(+), 81 deletions(-)

diff --git a/flang/examples/FeatureList/FeatureList.cpp 
b/flang/examples/FeatureList/FeatureList.cpp
index 2e90f19dc2e62c..c5cb8c8fdf40bb 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -488,7 +488,9 @@ struct NodeVisitor {
   READ_FEATURE(OmpEndLoopDirective)
   READ_FEATURE(OmpEndSectionsDirective)
   READ_FEATURE(OmpGrainsizeClause)
-  READ_FEATURE(OmpGrainsizeClause::Prescriptiveness)
+  READ_FEATURE(OmpGrainsizeClause::Modifier)
+  READ_FEATURE(OmpPrescriptiveness)
+  READ_FEATURE(OmpPrescriptiveness::Value)
   READ_FEATURE(OmpIfClause)
   READ_FEATURE(OmpIfClause::DirectiveNameModifier)
   READ_FEATURE(OmpLinearClause)
@@ -500,7 +502,7 @@ struct NodeVisitor {
   READ_FEATURE(OmpMapClause)
   READ_FEATURE(OmpMapClause::Modifier)
   READ_FEATURE(OmpNumTasksClause)
-  READ_FEATURE(OmpNumTasksClause::Prescriptiveness)
+  READ_FEATURE(OmpNumTasksClause::Modifier)
   READ_FEATURE(OmpObject)
   READ_FEATURE(OmpObjectList)
   READ_FEATURE(OmpOrderClause)
diff --git a/flang/include/flang/Parser/dump-parse-tree.h 
b/flang/include/flang/Parser/dump-parse-tree.h
index 3699aa34f4f8ad..1ec38de29b85d6 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -534,6 +534,7 @@ class ParseTreeDumper {
   NODE(OmpDoacross, Source)
   NODE(parser, OmpDependClause)
   NODE(OmpDependClause, TaskDep)
+  NODE(OmpDependClause::TaskDep, Modifier)
   NODE(parser, OmpDetachClause)
   NODE(parser, OmpDoacrossClause)
   NODE(parser, OmpDestroyClause)
@@ -572,9 +573,11 @@ class ParseTreeDumper {
   NODE(parser, OmpOrderModifier)
   NODE_ENUM(OmpOrderModifier, Value)
   NODE(parser, OmpGrainsizeClause)
-  NODE_ENUM(OmpGrainsizeClause, Prescriptiveness)
+  NODE(OmpGrainsizeClause, Modifier)
+  NODE(parser, OmpPrescriptiveness)
+  NODE_ENUM(OmpPrescriptiveness, Value)
   NODE(parser, OmpNumTasksClause)
-  NODE_ENUM(OmpNumTasksClause, Prescriptiveness)
+  NODE(OmpNumTasksClause, Modifier)
   NODE(parser, OmpBindClause)
   NODE_ENUM(OmpBindClause, Binding)
   NODE(parser, OmpProcBindClause)
diff --git a/flang/include/flang/Parser/parse-tree.h 
b/flang/include/flang/Parser/parse-tree.h
index 2143e280457535..c00560b1f1726a 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3627,6 +3627,15 @@ struct OmpOrderModifier {
   WRAPPER_CLASS_BOILERPLATE(OmpOrderModifier, Value);
 };
 
+// Ref: [5.1:166-171], [5.2:269-270]
+//
+// prescriptiveness ->
+//STRICT// since 5.1
+struct OmpPrescriptiveness {
+  ENUM_CLASS(Value, Strict)
+  WRAPPER_CLASS_BOILERPLATE(OmpPrescriptiveness, Value);
+};
+
 // Ref: [4.5:201-207], [5.0:293-299], [5.1:325-331], [5.2:124]
 //
 // reduction-identifier ->
@@ -3816,8 +3825,8 @@ struct OmpDependClause {
   struct TaskDep {
 OmpTaskDependenceType::Value GetTaskDepType() const;
 TUPLE_CLASS_BOILERPLATE(TaskDep);
-std::tuple, OmpTaskDependenceType, 
OmpObjectList>
-t;
+MODIFIER_BOILERPLATE(OmpIterator, OmpTaskDependenceType);
+std::tuple t;
   };
   std::variant u;
 };
@@ -3878,11 +3887,15 @@ struct OmpFromClause {
   std::tuple t;
 };
 
-// OMP 5.2 12.6.1 grainsize-clause -> grainsize ([prescriptiveness :] value)
+// Ref: [4.5:87-91], [5.0:140-146], [5.1:166-171], [5.2:269]
+//
+// grainsize-clause ->
+//GRAINSIZE(grain-size) |   // since 4.5
+//GRAINSIZE([prescriptiveness:] grain-size) // since 5.1
 struct OmpGrainsizeClause {
   TUPLE_CLASS_BOILERPLATE(OmpGrainsizeClause);
-  ENUM_CLASS(Prescriptiveness, Strict);
-

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Use new modifiers in DEPEND/GRAINSIZE/NUM_TASKS (PR #117916)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-flang-parser

Author: Krzysztof Parzyszek (kparzysz)


Changes

The usual changes, added more references to OpenMP specs.

---

Patch is 24.00 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/117916.diff


15 Files Affected:

- (modified) flang/examples/FeatureList/FeatureList.cpp (+4-2) 
- (modified) flang/include/flang/Parser/dump-parse-tree.h (+5-2) 
- (modified) flang/include/flang/Parser/parse-tree.h (+30-12) 
- (modified) flang/include/flang/Semantics/openmp-modifiers.h (+1) 
- (modified) flang/lib/Lower/OpenMP/Clauses.cpp (+26-29) 
- (modified) flang/lib/Lower/OpenMP/Clauses.h (+1) 
- (modified) flang/lib/Parser/openmp-parsers.cpp (+26-5) 
- (modified) flang/lib/Parser/parse-tree.cpp (+12-1) 
- (modified) flang/lib/Parser/unparse.cpp (+7-9) 
- (modified) flang/lib/Semantics/check-omp-structure.cpp (+5-10) 
- (modified) flang/lib/Semantics/openmp-modifiers.cpp (+19-2) 
- (modified) flang/test/Parser/OpenMP/depobj-construct.f90 (+2-2) 
- (modified) flang/test/Parser/OpenMP/taskloop.f90 (+4-4) 
- (modified) flang/test/Semantics/OpenMP/depend05.f90 (+1-1) 
- (modified) llvm/include/llvm/Frontend/OpenMP/ClauseT.h (+3-2) 


``diff
diff --git a/flang/examples/FeatureList/FeatureList.cpp 
b/flang/examples/FeatureList/FeatureList.cpp
index 2e90f19dc2e62c..c5cb8c8fdf40bb 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -488,7 +488,9 @@ struct NodeVisitor {
   READ_FEATURE(OmpEndLoopDirective)
   READ_FEATURE(OmpEndSectionsDirective)
   READ_FEATURE(OmpGrainsizeClause)
-  READ_FEATURE(OmpGrainsizeClause::Prescriptiveness)
+  READ_FEATURE(OmpGrainsizeClause::Modifier)
+  READ_FEATURE(OmpPrescriptiveness)
+  READ_FEATURE(OmpPrescriptiveness::Value)
   READ_FEATURE(OmpIfClause)
   READ_FEATURE(OmpIfClause::DirectiveNameModifier)
   READ_FEATURE(OmpLinearClause)
@@ -500,7 +502,7 @@ struct NodeVisitor {
   READ_FEATURE(OmpMapClause)
   READ_FEATURE(OmpMapClause::Modifier)
   READ_FEATURE(OmpNumTasksClause)
-  READ_FEATURE(OmpNumTasksClause::Prescriptiveness)
+  READ_FEATURE(OmpNumTasksClause::Modifier)
   READ_FEATURE(OmpObject)
   READ_FEATURE(OmpObjectList)
   READ_FEATURE(OmpOrderClause)
diff --git a/flang/include/flang/Parser/dump-parse-tree.h 
b/flang/include/flang/Parser/dump-parse-tree.h
index 3699aa34f4f8ad..1ec38de29b85d6 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -534,6 +534,7 @@ class ParseTreeDumper {
   NODE(OmpDoacross, Source)
   NODE(parser, OmpDependClause)
   NODE(OmpDependClause, TaskDep)
+  NODE(OmpDependClause::TaskDep, Modifier)
   NODE(parser, OmpDetachClause)
   NODE(parser, OmpDoacrossClause)
   NODE(parser, OmpDestroyClause)
@@ -572,9 +573,11 @@ class ParseTreeDumper {
   NODE(parser, OmpOrderModifier)
   NODE_ENUM(OmpOrderModifier, Value)
   NODE(parser, OmpGrainsizeClause)
-  NODE_ENUM(OmpGrainsizeClause, Prescriptiveness)
+  NODE(OmpGrainsizeClause, Modifier)
+  NODE(parser, OmpPrescriptiveness)
+  NODE_ENUM(OmpPrescriptiveness, Value)
   NODE(parser, OmpNumTasksClause)
-  NODE_ENUM(OmpNumTasksClause, Prescriptiveness)
+  NODE(OmpNumTasksClause, Modifier)
   NODE(parser, OmpBindClause)
   NODE_ENUM(OmpBindClause, Binding)
   NODE(parser, OmpProcBindClause)
diff --git a/flang/include/flang/Parser/parse-tree.h 
b/flang/include/flang/Parser/parse-tree.h
index 2143e280457535..c00560b1f1726a 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3627,6 +3627,15 @@ struct OmpOrderModifier {
   WRAPPER_CLASS_BOILERPLATE(OmpOrderModifier, Value);
 };
 
+// Ref: [5.1:166-171], [5.2:269-270]
+//
+// prescriptiveness ->
+//STRICT// since 5.1
+struct OmpPrescriptiveness {
+  ENUM_CLASS(Value, Strict)
+  WRAPPER_CLASS_BOILERPLATE(OmpPrescriptiveness, Value);
+};
+
 // Ref: [4.5:201-207], [5.0:293-299], [5.1:325-331], [5.2:124]
 //
 // reduction-identifier ->
@@ -3816,8 +3825,8 @@ struct OmpDependClause {
   struct TaskDep {
 OmpTaskDependenceType::Value GetTaskDepType() const;
 TUPLE_CLASS_BOILERPLATE(TaskDep);
-std::tuple, OmpTaskDependenceType, 
OmpObjectList>
-t;
+MODIFIER_BOILERPLATE(OmpIterator, OmpTaskDependenceType);
+std::tuple t;
   };
   std::variant u;
 };
@@ -3878,11 +3887,15 @@ struct OmpFromClause {
   std::tuple t;
 };
 
-// OMP 5.2 12.6.1 grainsize-clause -> grainsize ([prescriptiveness :] value)
+// Ref: [4.5:87-91], [5.0:140-146], [5.1:166-171], [5.2:269]
+//
+// grainsize-clause ->
+//GRAINSIZE(grain-size) |   // since 4.5
+//GRAINSIZE([prescriptiveness:] grain-size) // since 5.1
 struct OmpGrainsizeClause {
   TUPLE_CLASS_BOILERPLATE(OmpGrainsizeClause);
-  ENUM_CLASS(Prescriptiveness, Strict);
-  std::tuple<std::optional<Prescriptiveness>, ScalarIntExpr> t;
+  MODIFIER_BOILERPLATE(OmpPrescriptiveness);
+  std::tuple<MODIFIERS(), ScalarIntExpr> t;
 };
 
 // 2.12 if-clause -

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Use new modifiers in DEPEND/GRAINSIZE/NUM_TASKS (PR #117917)

2024-11-27 Thread Krzysztof Parzyszek via llvm-branch-commits


https://github.com/kparzysz created 
https://github.com/llvm/llvm-project/pull/117917

The usual changes, added more references to OpenMP specs.

>From 43f008a7f8b7a6377f6cb7f3ea4cc20394c2d79d Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek 
Date: Wed, 27 Nov 2024 08:34:33 -0600
Subject: [PATCH] [flang][OpenMP] Use new modifiers in
 DEPEND/GRAINSIZE/NUM_TASKS

The usual changes, added more references to OpenMP specs.
---
 flang/examples/FeatureList/FeatureList.cpp|  6 +-
 flang/include/flang/Parser/dump-parse-tree.h  |  7 ++-
 flang/include/flang/Parser/parse-tree.h   | 42 ++
 .../flang/Semantics/openmp-modifiers.h|  1 +
 flang/lib/Lower/OpenMP/Clauses.cpp| 55 +--
 flang/lib/Lower/OpenMP/Clauses.h  |  1 +
 flang/lib/Parser/openmp-parsers.cpp   | 31 +--
 flang/lib/Parser/parse-tree.cpp   | 13 -
 flang/lib/Parser/unparse.cpp  | 16 +++---
 flang/lib/Semantics/check-omp-structure.cpp   | 15 ++---
 flang/lib/Semantics/openmp-modifiers.cpp  | 21 ++-
 flang/test/Parser/OpenMP/depobj-construct.f90 |  4 +-
 flang/test/Parser/OpenMP/taskloop.f90 |  8 +--
 flang/test/Semantics/OpenMP/depend05.f90  |  2 +-
 llvm/include/llvm/Frontend/OpenMP/ClauseT.h   |  5 +-
 15 files changed, 146 insertions(+), 81 deletions(-)

diff --git a/flang/examples/FeatureList/FeatureList.cpp 
b/flang/examples/FeatureList/FeatureList.cpp
index 2e90f19dc2e62c..c5cb8c8fdf40bb 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -488,7 +488,9 @@ struct NodeVisitor {
   READ_FEATURE(OmpEndLoopDirective)
   READ_FEATURE(OmpEndSectionsDirective)
   READ_FEATURE(OmpGrainsizeClause)
-  READ_FEATURE(OmpGrainsizeClause::Prescriptiveness)
+  READ_FEATURE(OmpGrainsizeClause::Modifier)
+  READ_FEATURE(OmpPrescriptiveness)
+  READ_FEATURE(OmpPrescriptiveness::Value)
   READ_FEATURE(OmpIfClause)
   READ_FEATURE(OmpIfClause::DirectiveNameModifier)
   READ_FEATURE(OmpLinearClause)
@@ -500,7 +502,7 @@ struct NodeVisitor {
   READ_FEATURE(OmpMapClause)
   READ_FEATURE(OmpMapClause::Modifier)
   READ_FEATURE(OmpNumTasksClause)
-  READ_FEATURE(OmpNumTasksClause::Prescriptiveness)
+  READ_FEATURE(OmpNumTasksClause::Modifier)
   READ_FEATURE(OmpObject)
   READ_FEATURE(OmpObjectList)
   READ_FEATURE(OmpOrderClause)
diff --git a/flang/include/flang/Parser/dump-parse-tree.h 
b/flang/include/flang/Parser/dump-parse-tree.h
index 3699aa34f4f8ad..1ec38de29b85d6 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -534,6 +534,7 @@ class ParseTreeDumper {
   NODE(OmpDoacross, Source)
   NODE(parser, OmpDependClause)
   NODE(OmpDependClause, TaskDep)
+  NODE(OmpDependClause::TaskDep, Modifier)
   NODE(parser, OmpDetachClause)
   NODE(parser, OmpDoacrossClause)
   NODE(parser, OmpDestroyClause)
@@ -572,9 +573,11 @@ class ParseTreeDumper {
   NODE(parser, OmpOrderModifier)
   NODE_ENUM(OmpOrderModifier, Value)
   NODE(parser, OmpGrainsizeClause)
-  NODE_ENUM(OmpGrainsizeClause, Prescriptiveness)
+  NODE(OmpGrainsizeClause, Modifier)
+  NODE(parser, OmpPrescriptiveness)
+  NODE_ENUM(OmpPrescriptiveness, Value)
   NODE(parser, OmpNumTasksClause)
-  NODE_ENUM(OmpNumTasksClause, Prescriptiveness)
+  NODE(OmpNumTasksClause, Modifier)
   NODE(parser, OmpBindClause)
   NODE_ENUM(OmpBindClause, Binding)
   NODE(parser, OmpProcBindClause)
diff --git a/flang/include/flang/Parser/parse-tree.h 
b/flang/include/flang/Parser/parse-tree.h
index 2143e280457535..c00560b1f1726a 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3627,6 +3627,15 @@ struct OmpOrderModifier {
   WRAPPER_CLASS_BOILERPLATE(OmpOrderModifier, Value);
 };
 
+// Ref: [5.1:166-171], [5.2:269-270]
+//
+// prescriptiveness ->
+//STRICT// since 5.1
+struct OmpPrescriptiveness {
+  ENUM_CLASS(Value, Strict)
+  WRAPPER_CLASS_BOILERPLATE(OmpPrescriptiveness, Value);
+};
+
 // Ref: [4.5:201-207], [5.0:293-299], [5.1:325-331], [5.2:124]
 //
 // reduction-identifier ->
@@ -3816,8 +3825,8 @@ struct OmpDependClause {
   struct TaskDep {
 OmpTaskDependenceType::Value GetTaskDepType() const;
 TUPLE_CLASS_BOILERPLATE(TaskDep);
-std::tuple, OmpTaskDependenceType, 
OmpObjectList>
-t;
+MODIFIER_BOILERPLATE(OmpIterator, OmpTaskDependenceType);
+std::tuple t;
   };
   std::variant u;
 };
@@ -3878,11 +3887,15 @@ struct OmpFromClause {
   std::tuple t;
 };
 
-// OMP 5.2 12.6.1 grainsize-clause -> grainsize ([prescriptiveness :] value)
+// Ref: [4.5:87-91], [5.0:140-146], [5.1:166-171], [5.2:269]
+//
+// grainsize-clause ->
+//GRAINSIZE(grain-size) |   // since 4.5
+//GRAINSIZE([prescriptiveness:] grain-size) // since 5.1
 struct OmpGrainsizeClause {
   TUPLE_CLASS_BOILERPLATE(OmpGrainsizeClause);
-  ENUM_CLASS(Prescriptiveness, Strict);
-

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Use new modifiers in DEPEND/GRAINSIZE/NUM_TASKS (PR #117917)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-flang-semantics

Author: Krzysztof Parzyszek (kparzysz)


Changes

The usual changes, added more references to OpenMP specs.

---

Patch is 24.00 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/117917.diff


15 Files Affected:

- (modified) flang/examples/FeatureList/FeatureList.cpp (+4-2) 
- (modified) flang/include/flang/Parser/dump-parse-tree.h (+5-2) 
- (modified) flang/include/flang/Parser/parse-tree.h (+30-12) 
- (modified) flang/include/flang/Semantics/openmp-modifiers.h (+1) 
- (modified) flang/lib/Lower/OpenMP/Clauses.cpp (+26-29) 
- (modified) flang/lib/Lower/OpenMP/Clauses.h (+1) 
- (modified) flang/lib/Parser/openmp-parsers.cpp (+26-5) 
- (modified) flang/lib/Parser/parse-tree.cpp (+12-1) 
- (modified) flang/lib/Parser/unparse.cpp (+7-9) 
- (modified) flang/lib/Semantics/check-omp-structure.cpp (+5-10) 
- (modified) flang/lib/Semantics/openmp-modifiers.cpp (+19-2) 
- (modified) flang/test/Parser/OpenMP/depobj-construct.f90 (+2-2) 
- (modified) flang/test/Parser/OpenMP/taskloop.f90 (+4-4) 
- (modified) flang/test/Semantics/OpenMP/depend05.f90 (+1-1) 
- (modified) llvm/include/llvm/Frontend/OpenMP/ClauseT.h (+3-2) 


``diff
diff --git a/flang/examples/FeatureList/FeatureList.cpp 
b/flang/examples/FeatureList/FeatureList.cpp
index 2e90f19dc2e62c..c5cb8c8fdf40bb 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -488,7 +488,9 @@ struct NodeVisitor {
   READ_FEATURE(OmpEndLoopDirective)
   READ_FEATURE(OmpEndSectionsDirective)
   READ_FEATURE(OmpGrainsizeClause)
-  READ_FEATURE(OmpGrainsizeClause::Prescriptiveness)
+  READ_FEATURE(OmpGrainsizeClause::Modifier)
+  READ_FEATURE(OmpPrescriptiveness)
+  READ_FEATURE(OmpPrescriptiveness::Value)
   READ_FEATURE(OmpIfClause)
   READ_FEATURE(OmpIfClause::DirectiveNameModifier)
   READ_FEATURE(OmpLinearClause)
@@ -500,7 +502,7 @@ struct NodeVisitor {
   READ_FEATURE(OmpMapClause)
   READ_FEATURE(OmpMapClause::Modifier)
   READ_FEATURE(OmpNumTasksClause)
-  READ_FEATURE(OmpNumTasksClause::Prescriptiveness)
+  READ_FEATURE(OmpNumTasksClause::Modifier)
   READ_FEATURE(OmpObject)
   READ_FEATURE(OmpObjectList)
   READ_FEATURE(OmpOrderClause)
diff --git a/flang/include/flang/Parser/dump-parse-tree.h 
b/flang/include/flang/Parser/dump-parse-tree.h
index 3699aa34f4f8ad..1ec38de29b85d6 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -534,6 +534,7 @@ class ParseTreeDumper {
   NODE(OmpDoacross, Source)
   NODE(parser, OmpDependClause)
   NODE(OmpDependClause, TaskDep)
+  NODE(OmpDependClause::TaskDep, Modifier)
   NODE(parser, OmpDetachClause)
   NODE(parser, OmpDoacrossClause)
   NODE(parser, OmpDestroyClause)
@@ -572,9 +573,11 @@ class ParseTreeDumper {
   NODE(parser, OmpOrderModifier)
   NODE_ENUM(OmpOrderModifier, Value)
   NODE(parser, OmpGrainsizeClause)
-  NODE_ENUM(OmpGrainsizeClause, Prescriptiveness)
+  NODE(OmpGrainsizeClause, Modifier)
+  NODE(parser, OmpPrescriptiveness)
+  NODE_ENUM(OmpPrescriptiveness, Value)
   NODE(parser, OmpNumTasksClause)
-  NODE_ENUM(OmpNumTasksClause, Prescriptiveness)
+  NODE(OmpNumTasksClause, Modifier)
   NODE(parser, OmpBindClause)
   NODE_ENUM(OmpBindClause, Binding)
   NODE(parser, OmpProcBindClause)
diff --git a/flang/include/flang/Parser/parse-tree.h 
b/flang/include/flang/Parser/parse-tree.h
index 2143e280457535..c00560b1f1726a 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3627,6 +3627,15 @@ struct OmpOrderModifier {
   WRAPPER_CLASS_BOILERPLATE(OmpOrderModifier, Value);
 };
 
+// Ref: [5.1:166-171], [5.2:269-270]
+//
+// prescriptiveness ->
+//STRICT// since 5.1
+struct OmpPrescriptiveness {
+  ENUM_CLASS(Value, Strict)
+  WRAPPER_CLASS_BOILERPLATE(OmpPrescriptiveness, Value);
+};
+
 // Ref: [4.5:201-207], [5.0:293-299], [5.1:325-331], [5.2:124]
 //
 // reduction-identifier ->
@@ -3816,8 +3825,8 @@ struct OmpDependClause {
   struct TaskDep {
 OmpTaskDependenceType::Value GetTaskDepType() const;
 TUPLE_CLASS_BOILERPLATE(TaskDep);
-std::tuple, OmpTaskDependenceType, 
OmpObjectList>
-t;
+MODIFIER_BOILERPLATE(OmpIterator, OmpTaskDependenceType);
+std::tuple t;
   };
   std::variant u;
 };
@@ -3878,11 +3887,15 @@ struct OmpFromClause {
   std::tuple t;
 };
 
-// OMP 5.2 12.6.1 grainsize-clause -> grainsize ([prescriptiveness :] value)
+// Ref: [4.5:87-91], [5.0:140-146], [5.1:166-171], [5.2:269]
+//
+// grainsize-clause ->
+//GRAINSIZE(grain-size) |   // since 4.5
+//GRAINSIZE([prescriptiveness:] grain-size) // since 5.1
 struct OmpGrainsizeClause {
   TUPLE_CLASS_BOILERPLATE(OmpGrainsizeClause);
-  ENUM_CLASS(Prescriptiveness, Strict);
-  std::tuple<std::optional<Prescriptiveness>, ScalarIntExpr> t;
+  MODIFIER_BOILERPLATE(OmpPrescriptiveness);
+  std::tuple<MODIFIERS(), ScalarIntExpr> t;
 };
 
 // 2.12 if-claus

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Use new modifiers in DEPEND/GRAINSIZE/NUM_TASKS (PR #117917)

2024-11-27 Thread via llvm-branch-commits


github-actions[bot] wrote:




Warning: the C/C++ code formatter, clang-format, found issues in your code.



You can test this locally with the following command:


``bash
git-clang-format --diff 923638f04160c6c4a5fdf7afa309dcea3ec9fb7e 
801c4bd8eea9ba8f7490b164aad0fc6b67cbb9b1 --extensions ,c,cpp,h -- 
clang/test/AST/ast-dump-cxx2b-deducing-this.cpp 
clang/test/Analysis/string_notnullterm.c 
clang/test/Analysis/void-call-exit-modelling.c clang/test/C/C23/n2412.c 
clang/test/CodeCompletion/keywords-cxx20.cpp 
clang/test/CodeGen/AArch64/sme-inline-callees-streaming-attrs.c 
clang/test/CodeGen/ubsan-trap-merge.c clang/test/Driver/loongarch-mdiv32.c 
clang/test/Driver/loongarch-mlamcas.c 
clang/test/SemaOpenACC/combined-construct-attach-ast.cpp 
clang/test/SemaOpenACC/combined-construct-attach-clause.c 
clang/test/SemaOpenACC/combined-construct-attach-clause.cpp 
clang/test/SemaOpenACC/combined-construct-deviceptr-ast.cpp 
clang/test/SemaOpenACC/combined-construct-deviceptr-clause.c 
clang/test/SemaOpenACC/combined-construct-deviceptr-clause.cpp 
clang/test/SemaOpenACC/combined-construct-present-ast.cpp 
clang/test/SemaOpenACC/combined-construct-present-clause.c 
clang/test/SemaOpenACC/combined-construct-present-clause.cpp 
clang/test/SemaOpenACC/combined-construct-wait-ast.cpp 
clang/test/SemaOpenACC/combined-construct-wait-clause.c 
clang/test/SemaOpenACC/combined-construct-wait-clause.cpp 
compiler-rt/lib/builtins/extendhfxf2.c 
compiler-rt/test/builtins/Unit/extendhfxf2_test.c 
libc/test/integration/startup/gpu/rpc_lane_test.cpp 
libcxx/include/__type_traits/detected_or.h 
libcxx/test/std/containers/sequences/vector/addressof.compile.pass.cpp 
llvm/include/llvm/DebugInfo/GSYM/CallSiteInfo.h 
llvm/include/llvm/IR/NVVMIntrinsicFlags.h 
llvm/lib/DebugInfo/GSYM/CallSiteInfo.cpp 
llvm/unittests/ADT/SmallVectorExtrasTest.cpp 
llvm/unittests/Target/AArch64/AArch64RegisterInfoTest.cpp 
bolt/include/bolt/Core/BinaryFunction.h 
bolt/include/bolt/Profile/DataAggregator.h bolt/lib/Core/BinaryContext.cpp 
bolt/lib/Core/BinaryFunction.cpp bolt/lib/Passes/ReorderFunctions.cpp 
bolt/lib/Profile/DataAggregator.cpp bolt/test/AArch64/data-at-0-offset.c 
bolt/test/AArch64/double_jump.cpp bolt/test/R_ABS.pic.lld.cpp 
bolt/test/runtime/X86/instrumentation-indirect.c 
bolt/test/runtime/bolt-reserved.cpp bolt/unittests/Core/MCPlusBuilder.cpp 
clang-tools-extra/clang-tidy/modernize/UseStartsEndsWithCheck.cpp 
clang-tools-extra/clangd/refactor/tweaks/DefineOutline.cpp 
clang-tools-extra/clangd/unittests/tweaks/DefineOutlineTests.cpp 
clang-tools-extra/test/clang-tidy/checkers/modernize/use-starts-ends-with.cpp 
clang/include/clang/Analysis/FlowSensitive/ASTOps.h 
clang/include/clang/Sema/Sema.h clang/include/clang/Sema/SemaHLSL.h 
clang/lib/AST/ByteCode/Compiler.cpp clang/lib/AST/ExprConstant.cpp 
clang/lib/Analysis/Consumed.cpp clang/lib/Analysis/FlowSensitive/ASTOps.cpp 
clang/lib/Analysis/FlowSensitive/Arena.cpp 
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp 
clang/lib/Analysis/FlowSensitive/Models/ChromiumCheckModel.cpp 
clang/lib/Analysis/IntervalPartition.cpp 
clang/lib/Analysis/UnsafeBufferUsage.cpp clang/lib/Basic/Targets/LoongArch.cpp 
clang/lib/Basic/Targets/LoongArch.h clang/lib/Basic/Targets/RISCV.cpp 
clang/lib/CodeGen/CGBuiltin.cpp clang/lib/CodeGen/CGCall.cpp 
clang/lib/CodeGen/CGExpr.cpp clang/lib/CodeGen/CGHLSLRuntime.cpp 
clang/lib/CodeGen/CGHLSLRuntime.h clang/lib/CodeGen/CodeGenModule.cpp 
clang/lib/CodeGen/TargetInfo.h clang/lib/CodeGen/Targets/AArch64.cpp 
clang/lib/Driver/ToolChains/Arch/LoongArch.cpp 
clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Driver/ToolChains/Hexagon.cpp 
clang/lib/Format/Format.cpp clang/lib/Frontend/InitPreprocessor.cpp 
clang/lib/Headers/intrin.h clang/lib/Parse/ParseDeclCXX.cpp 
clang/lib/Parse/ParseHLSL.cpp clang/lib/Sema/HLSLExternalSemaSource.cpp 
clang/lib/Sema/SemaAMDGPU.cpp clang/lib/Sema/SemaAPINotes.cpp 
clang/lib/Sema/SemaChecking.cpp clang/lib/Sema/SemaCodeComplete.cpp 
clang/lib/Sema/SemaConcept.cpp clang/lib/Sema/SemaDecl.cpp 
clang/lib/Sema/SemaDeclAttr.cpp clang/lib/Sema/SemaDeclCXX.cpp 
clang/lib/Sema/SemaDeclObjC.cpp clang/lib/Sema/SemaExpr.cpp 
clang/lib/Sema/SemaExprMember.cpp clang/lib/Sema/SemaFunctionEffects.cpp 
clang/lib/Sema/SemaHLSL.cpp clang/lib/Sema/SemaOpenACC.cpp 
clang/lib/Sema/SemaOpenMP.cpp clang/lib/Sema/SemaOverload.cpp 
clang/lib/Sema/SemaStmt.cpp clang/lib/Sema/SemaTemplateInstantiate.cpp 
clang/lib/Sema/SemaTemplateInstantiateDecl.cpp 
clang/lib/Sema/SemaTemplateVariadic.cpp clang/lib/Sema/TreeTransform.h 
clang/lib/StaticAnalyzer/Core/BugReporter.cpp 
clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp 
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp 
clang/test/AST/ByteCode/c23.c 
clang/test/AST/ast-print-openacc-combined-construct.cpp 
clang/test/Analysis/analyzer-enabled-checkers.c clang/test/Analysis/bstring.cpp 
clang/test/Analysis/copy-elision.cpp 
clang/test/Analysis/cxx-uninitialized-object-unguarded-

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Use new modifiers in DEPEND/GRAINSIZE/NUM_TASKS (PR #117917)

2024-11-27 Thread Krzysztof Parzyszek via llvm-branch-commits


https://github.com/kparzysz updated 
https://github.com/llvm/llvm-project/pull/117917

>From 43f008a7f8b7a6377f6cb7f3ea4cc20394c2d79d Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek 
Date: Wed, 27 Nov 2024 08:34:33 -0600
Subject: [PATCH 1/2] [flang][OpenMP] Use new modifiers in
 DEPEND/GRAINSIZE/NUM_TASKS

The usual changes, added more references to OpenMP specs.
---
 flang/examples/FeatureList/FeatureList.cpp|  6 +-
 flang/include/flang/Parser/dump-parse-tree.h  |  7 ++-
 flang/include/flang/Parser/parse-tree.h   | 42 ++
 .../flang/Semantics/openmp-modifiers.h|  1 +
 flang/lib/Lower/OpenMP/Clauses.cpp| 55 +--
 flang/lib/Lower/OpenMP/Clauses.h  |  1 +
 flang/lib/Parser/openmp-parsers.cpp   | 31 +--
 flang/lib/Parser/parse-tree.cpp   | 13 -
 flang/lib/Parser/unparse.cpp  | 16 +++---
 flang/lib/Semantics/check-omp-structure.cpp   | 15 ++---
 flang/lib/Semantics/openmp-modifiers.cpp  | 21 ++-
 flang/test/Parser/OpenMP/depobj-construct.f90 |  4 +-
 flang/test/Parser/OpenMP/taskloop.f90 |  8 +--
 flang/test/Semantics/OpenMP/depend05.f90  |  2 +-
 llvm/include/llvm/Frontend/OpenMP/ClauseT.h   |  5 +-
 15 files changed, 146 insertions(+), 81 deletions(-)

diff --git a/flang/examples/FeatureList/FeatureList.cpp 
b/flang/examples/FeatureList/FeatureList.cpp
index 2e90f19dc2e62c..c5cb8c8fdf40bb 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -488,7 +488,9 @@ struct NodeVisitor {
   READ_FEATURE(OmpEndLoopDirective)
   READ_FEATURE(OmpEndSectionsDirective)
   READ_FEATURE(OmpGrainsizeClause)
-  READ_FEATURE(OmpGrainsizeClause::Prescriptiveness)
+  READ_FEATURE(OmpGrainsizeClause::Modifier)
+  READ_FEATURE(OmpPrescriptiveness)
+  READ_FEATURE(OmpPrescriptiveness::Value)
   READ_FEATURE(OmpIfClause)
   READ_FEATURE(OmpIfClause::DirectiveNameModifier)
   READ_FEATURE(OmpLinearClause)
@@ -500,7 +502,7 @@ struct NodeVisitor {
   READ_FEATURE(OmpMapClause)
   READ_FEATURE(OmpMapClause::Modifier)
   READ_FEATURE(OmpNumTasksClause)
-  READ_FEATURE(OmpNumTasksClause::Prescriptiveness)
+  READ_FEATURE(OmpNumTasksClause::Modifier)
   READ_FEATURE(OmpObject)
   READ_FEATURE(OmpObjectList)
   READ_FEATURE(OmpOrderClause)
diff --git a/flang/include/flang/Parser/dump-parse-tree.h 
b/flang/include/flang/Parser/dump-parse-tree.h
index 3699aa34f4f8ad..1ec38de29b85d6 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -534,6 +534,7 @@ class ParseTreeDumper {
   NODE(OmpDoacross, Source)
   NODE(parser, OmpDependClause)
   NODE(OmpDependClause, TaskDep)
+  NODE(OmpDependClause::TaskDep, Modifier)
   NODE(parser, OmpDetachClause)
   NODE(parser, OmpDoacrossClause)
   NODE(parser, OmpDestroyClause)
@@ -572,9 +573,11 @@ class ParseTreeDumper {
   NODE(parser, OmpOrderModifier)
   NODE_ENUM(OmpOrderModifier, Value)
   NODE(parser, OmpGrainsizeClause)
-  NODE_ENUM(OmpGrainsizeClause, Prescriptiveness)
+  NODE(OmpGrainsizeClause, Modifier)
+  NODE(parser, OmpPrescriptiveness)
+  NODE_ENUM(OmpPrescriptiveness, Value)
   NODE(parser, OmpNumTasksClause)
-  NODE_ENUM(OmpNumTasksClause, Prescriptiveness)
+  NODE(OmpNumTasksClause, Modifier)
   NODE(parser, OmpBindClause)
   NODE_ENUM(OmpBindClause, Binding)
   NODE(parser, OmpProcBindClause)
diff --git a/flang/include/flang/Parser/parse-tree.h 
b/flang/include/flang/Parser/parse-tree.h
index 2143e280457535..c00560b1f1726a 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3627,6 +3627,15 @@ struct OmpOrderModifier {
   WRAPPER_CLASS_BOILERPLATE(OmpOrderModifier, Value);
 };
 
+// Ref: [5.1:166-171], [5.2:269-270]
+//
+// prescriptiveness ->
+//STRICT// since 5.1
+struct OmpPrescriptiveness {
+  ENUM_CLASS(Value, Strict)
+  WRAPPER_CLASS_BOILERPLATE(OmpPrescriptiveness, Value);
+};
+
 // Ref: [4.5:201-207], [5.0:293-299], [5.1:325-331], [5.2:124]
 //
 // reduction-identifier ->
@@ -3816,8 +3825,8 @@ struct OmpDependClause {
   struct TaskDep {
 OmpTaskDependenceType::Value GetTaskDepType() const;
 TUPLE_CLASS_BOILERPLATE(TaskDep);
-std::tuple, OmpTaskDependenceType, 
OmpObjectList>
-t;
+MODIFIER_BOILERPLATE(OmpIterator, OmpTaskDependenceType);
+std::tuple t;
   };
   std::variant u;
 };
@@ -3878,11 +3887,15 @@ struct OmpFromClause {
   std::tuple t;
 };
 
-// OMP 5.2 12.6.1 grainsize-clause -> grainsize ([prescriptiveness :] value)
+// Ref: [4.5:87-91], [5.0:140-146], [5.1:166-171], [5.2:269]
+//
+// grainsize-clause ->
+//GRAINSIZE(grain-size) |   // since 4.5
+//GRAINSIZE([prescriptiveness:] grain-size) // since 5.1
 struct OmpGrainsizeClause {
   TUPLE_CLASS_BOILERPLATE(OmpGrainsizeClause);
-  ENUM_CLASS(Prescriptiveness, Strict);
-  std::tuple, ScalarIntExpr> t;
+  MODIFIER_BOILERPLATE(

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Use new modifiers in DEPEND/GRAINSIZE/NUM_TASKS (PR #117917)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-flang-parser

Author: Krzysztof Parzyszek (kparzysz)


Changes

The usual changes, added more references to OpenMP specs.

---

Patch is 24.00 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/117917.diff


15 Files Affected:

- (modified) flang/examples/FeatureList/FeatureList.cpp (+4-2) 
- (modified) flang/include/flang/Parser/dump-parse-tree.h (+5-2) 
- (modified) flang/include/flang/Parser/parse-tree.h (+30-12) 
- (modified) flang/include/flang/Semantics/openmp-modifiers.h (+1) 
- (modified) flang/lib/Lower/OpenMP/Clauses.cpp (+26-29) 
- (modified) flang/lib/Lower/OpenMP/Clauses.h (+1) 
- (modified) flang/lib/Parser/openmp-parsers.cpp (+26-5) 
- (modified) flang/lib/Parser/parse-tree.cpp (+12-1) 
- (modified) flang/lib/Parser/unparse.cpp (+7-9) 
- (modified) flang/lib/Semantics/check-omp-structure.cpp (+5-10) 
- (modified) flang/lib/Semantics/openmp-modifiers.cpp (+19-2) 
- (modified) flang/test/Parser/OpenMP/depobj-construct.f90 (+2-2) 
- (modified) flang/test/Parser/OpenMP/taskloop.f90 (+4-4) 
- (modified) flang/test/Semantics/OpenMP/depend05.f90 (+1-1) 
- (modified) llvm/include/llvm/Frontend/OpenMP/ClauseT.h (+3-2) 


``diff
diff --git a/flang/examples/FeatureList/FeatureList.cpp 
b/flang/examples/FeatureList/FeatureList.cpp
index 2e90f19dc2e62c..c5cb8c8fdf40bb 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -488,7 +488,9 @@ struct NodeVisitor {
   READ_FEATURE(OmpEndLoopDirective)
   READ_FEATURE(OmpEndSectionsDirective)
   READ_FEATURE(OmpGrainsizeClause)
-  READ_FEATURE(OmpGrainsizeClause::Prescriptiveness)
+  READ_FEATURE(OmpGrainsizeClause::Modifier)
+  READ_FEATURE(OmpPrescriptiveness)
+  READ_FEATURE(OmpPrescriptiveness::Value)
   READ_FEATURE(OmpIfClause)
   READ_FEATURE(OmpIfClause::DirectiveNameModifier)
   READ_FEATURE(OmpLinearClause)
@@ -500,7 +502,7 @@ struct NodeVisitor {
   READ_FEATURE(OmpMapClause)
   READ_FEATURE(OmpMapClause::Modifier)
   READ_FEATURE(OmpNumTasksClause)
-  READ_FEATURE(OmpNumTasksClause::Prescriptiveness)
+  READ_FEATURE(OmpNumTasksClause::Modifier)
   READ_FEATURE(OmpObject)
   READ_FEATURE(OmpObjectList)
   READ_FEATURE(OmpOrderClause)
diff --git a/flang/include/flang/Parser/dump-parse-tree.h 
b/flang/include/flang/Parser/dump-parse-tree.h
index 3699aa34f4f8ad..1ec38de29b85d6 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -534,6 +534,7 @@ class ParseTreeDumper {
   NODE(OmpDoacross, Source)
   NODE(parser, OmpDependClause)
   NODE(OmpDependClause, TaskDep)
+  NODE(OmpDependClause::TaskDep, Modifier)
   NODE(parser, OmpDetachClause)
   NODE(parser, OmpDoacrossClause)
   NODE(parser, OmpDestroyClause)
@@ -572,9 +573,11 @@ class ParseTreeDumper {
   NODE(parser, OmpOrderModifier)
   NODE_ENUM(OmpOrderModifier, Value)
   NODE(parser, OmpGrainsizeClause)
-  NODE_ENUM(OmpGrainsizeClause, Prescriptiveness)
+  NODE(OmpGrainsizeClause, Modifier)
+  NODE(parser, OmpPrescriptiveness)
+  NODE_ENUM(OmpPrescriptiveness, Value)
   NODE(parser, OmpNumTasksClause)
-  NODE_ENUM(OmpNumTasksClause, Prescriptiveness)
+  NODE(OmpNumTasksClause, Modifier)
   NODE(parser, OmpBindClause)
   NODE_ENUM(OmpBindClause, Binding)
   NODE(parser, OmpProcBindClause)
diff --git a/flang/include/flang/Parser/parse-tree.h 
b/flang/include/flang/Parser/parse-tree.h
index 2143e280457535..c00560b1f1726a 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3627,6 +3627,15 @@ struct OmpOrderModifier {
   WRAPPER_CLASS_BOILERPLATE(OmpOrderModifier, Value);
 };
 
+// Ref: [5.1:166-171], [5.2:269-270]
+//
+// prescriptiveness ->
+//STRICT// since 5.1
+struct OmpPrescriptiveness {
+  ENUM_CLASS(Value, Strict)
+  WRAPPER_CLASS_BOILERPLATE(OmpPrescriptiveness, Value);
+};
+
 // Ref: [4.5:201-207], [5.0:293-299], [5.1:325-331], [5.2:124]
 //
 // reduction-identifier ->
@@ -3816,8 +3825,8 @@ struct OmpDependClause {
   struct TaskDep {
 OmpTaskDependenceType::Value GetTaskDepType() const;
 TUPLE_CLASS_BOILERPLATE(TaskDep);
-std::tuple, OmpTaskDependenceType, 
OmpObjectList>
-t;
+MODIFIER_BOILERPLATE(OmpIterator, OmpTaskDependenceType);
+std::tuple t;
   };
   std::variant u;
 };
@@ -3878,11 +3887,15 @@ struct OmpFromClause {
   std::tuple t;
 };
 
-// OMP 5.2 12.6.1 grainsize-clause -> grainsize ([prescriptiveness :] value)
+// Ref: [4.5:87-91], [5.0:140-146], [5.1:166-171], [5.2:269]
+//
+// grainsize-clause ->
+//GRAINSIZE(grain-size) |   // since 4.5
+//GRAINSIZE([prescriptiveness:] grain-size) // since 5.1
 struct OmpGrainsizeClause {
   TUPLE_CLASS_BOILERPLATE(OmpGrainsizeClause);
-  ENUM_CLASS(Prescriptiveness, Strict);
-  std::tuple, ScalarIntExpr> t;
+  MODIFIER_BOILERPLATE(OmpPrescriptiveness);
+  std::tuple t;
 };
 
 // 2.12 if-clause -

[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)

2024-11-27 Thread via llvm-branch-commits


https://github.com/agozillon edited 
https://github.com/llvm/llvm-project/pull/117046
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)

2024-11-27 Thread via llvm-branch-commits

@@ -2701,7 +2701,42 @@ static void
 genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval,
const parser::OpenMPDeclareMapperConstruct &declareMapperConstruct) {
-  TODO(converter.getCurrentLocation(), "OpenMPDeclareMapperConstruct");
+  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+  lower::StatementContext stmtCtx;
+  const auto &spec =
+  std::get(declareMapperConstruct.t);
+  const auto &mapperName{std::get>(spec.t)};
+  const auto &varType{std::get(spec.t)};
+  const auto &varName{std::get(spec.t)};
+  assert(varType.declTypeSpec->category() ==
+ semantics::DeclTypeSpec::Category::TypeDerived &&
+ "Expected derived type");
+
+  std::string mapperNameStr;
+  if (mapperName.has_value())
+mapperNameStr = mapperName->ToString();
+  else
+mapperNameStr =
+"default_" + varType.declTypeSpec->derivedTypeSpec().name().ToString();
+
+  mlir::OpBuilder::InsertPoint insPt = firOpBuilder.saveInsertionPoint();
+  firOpBuilder.setInsertionPointToStart(converter.getModuleOp().getBody());
+  auto mlirType = converter.genType(varType.declTypeSpec->derivedTypeSpec());
+  auto varVal = firOpBuilder.createTemporaryAlloc(
+  converter.getCurrentLocation(), mlirType, varName.ToString());

agozillon wrote:

> ```
> 1. If a DECLARE MAPPER directive is not specified for a type DT, a predefined 
> mapper exists for type DT as if the type DT had appeared in the directive as 
> follows:
> 
> !$OMP DECLARE MAPPER (DT :: var) MAP (TOFROM: var)
> 2. If a variable is not a scalar then it is treated as if it had appeared in 
> a map clause with a map-type of tofrom. Which is effectively equivalent to 
> the following and extending declare mapper for non-derived types:
> !$OMP DECLARE MAPPER (T :: var) MAP (TOFROM: var)
> ```

I think the key phrase here is likely "as if": as long as the effects are as 
described, the approach is reasonable, at least by my reading. If we really 
wanted to be exact about the wording, we'd generate/embed our own equivalent 
pragmas to the above for all default mappings and then lower them, so not just 
at the MLIR level. That said, I am not against defining a default declare 
mapper for all cases once it's in place. It might tidy things up a bit, but it 
may also be more complicated/trouble than it's worth. In either case, I am fine 
with the approach of defining default declare mappers if we'd like to go down 
that route :-) 

I'd also love it if whatever implementation we landed on was compatible with 
the OpenACC implementation's documentation/approach to mapping descriptors via 
runtime calls, as I'd like to move towards that eventually when I have some 
time to dig into it and see if it's viable for us. I imagine it will be; I just 
don't know a ton about the region'd approach, so I hope it wouldn't be 
prohibitive of this.

https://github.com/llvm/llvm-project/pull/117046

[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-11-27 Thread Aaron Ballman via llvm-branch-commits



@@ -1027,6 +1027,10 @@ Sanitizers
   `_. See that link
   for examples.
 
+- Introduced an experimental Type Sanitizer, activated by using the
+  -fsanitize=type flag. This sanitizer detects violations of C/C++ type-based

AaronBallman wrote:

```suggestion
  ``-fsanitize=type`` flag. This sanitizer detects violations of C/C++ 
type-based
```

https://github.com/llvm/llvm-project/pull/76260

[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-11-27 Thread Aaron Ballman via llvm-branch-commits



@@ -102,6 +102,7 @@ FEATURE(numerical_stability_sanitizer, 
LangOpts.Sanitize.has(SanitizerKind::Nume
 FEATURE(memory_sanitizer,
 LangOpts.Sanitize.hasOneOf(SanitizerKind::Memory |
SanitizerKind::KernelMemory))
+FEATURE(type_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Type))

AaronBallman wrote:

None of these sanitizers are features. :-( I think this is the correct thing to 
do for consistency, but at the same time, we keep adding more sanitizers and we 
keep making this problem worse.

Nothing to change here, just me grumbling. :-D

https://github.com/llvm/llvm-project/pull/76260

[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2024-11-27 Thread via llvm-branch-commits


jeanPerier wrote:

Question about multi-versioning. Say I want to 1. build flang, 2. build 
flang_rt with -O0 -g out of tree, 3. build flang_rt with -O3 out of tree.

When I try that with the patch, the libflang_rt.a built and installed in step 3 
overrides the one from step 2. Is there a way/option to use the out-of-tree 
build directory as the output directory for the built libflang_rt.a, so that 
different versions can be built, instead of the llvm/clang build directory?

Note: I know it is currently not possible to do that, and I am asking because 
your patch is a step towards that, which is great!

https://github.com/llvm/llvm-project/pull/110217

[llvm-branch-commits] [compiler-rt] [TySan] Improved compatability for tests (PR #96507)

2024-11-27 Thread via llvm-branch-commits


github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


```bash
git-clang-format --diff 8e6e62d0dee48a696afd0c7d53d74eaccef97b5e \
  11c62d601626d9da1fb3ed0c9cadab2d106681ab --extensions cpp,c -- \
  compiler-rt/test/tysan/violation-pr45282.c \
  compiler-rt/test/tysan/violation-pr47137.c \
  compiler-rt/test/tysan/violation-pr62544.c \
  compiler-rt/test/tysan/violation-pr62828.cpp \
  compiler-rt/test/tysan/violation-pr68655.cpp \
  compiler-rt/test/tysan/violation-pr86685.c
```





View the diff from clang-format here.


```diff
diff --git a/compiler-rt/test/tysan/violation-pr47137.c 
b/compiler-rt/test/tysan/violation-pr47137.c
index 72693afe66..e488c23fe1 100644
--- a/compiler-rt/test/tysan/violation-pr47137.c
+++ b/compiler-rt/test/tysan/violation-pr47137.c
@@ -2,9 +2,9 @@
 // RUN: FileCheck %s < %t.out
 
 // https://github.com/llvm/llvm-project/issues/47137
+#include 
 #include 
 #include 
-#include 
 
 void f(int m) {
   int n = (4 * m + 2) / 3;

```




https://github.com/llvm/llvm-project/pull/96507

[llvm-branch-commits] [compiler-rt] [TySan] Improved compatability for tests (PR #96507)

2024-11-27 Thread via llvm-branch-commits


https://github.com/gbMattN updated 
https://github.com/llvm/llvm-project/pull/96507

>From 11c62d601626d9da1fb3ed0c9cadab2d106681ab Mon Sep 17 00:00:00 2001
From: Matthew Nagy 
Date: Fri, 28 Jun 2024 16:48:53 +
Subject: [PATCH] [TySan] Improves compatability for tests

---
 compiler-rt/test/tysan/violation-pr45282.c   | 2 +-
 compiler-rt/test/tysan/violation-pr47137.c   | 5 +++--
 compiler-rt/test/tysan/violation-pr62544.c   | 2 +-
 compiler-rt/test/tysan/violation-pr62828.cpp | 2 +-
 compiler-rt/test/tysan/violation-pr68655.cpp | 2 +-
 compiler-rt/test/tysan/violation-pr86685.c   | 2 +-
 6 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/compiler-rt/test/tysan/violation-pr45282.c 
b/compiler-rt/test/tysan/violation-pr45282.c
index f3583d6be6f6a3..ebec01e921da8b 100644
--- a/compiler-rt/test/tysan/violation-pr45282.c
+++ b/compiler-rt/test/tysan/violation-pr45282.c
@@ -18,7 +18,7 @@ int main(void) {
 
   // CHECK:  TypeSanitizer: type-aliasing-violation on address
   // CHECK-NEXT: WRITE of size 8 at {{.+}} with type double accesses an 
existing object of type float
-  // CHECK-NEXT:   in main violation-pr45282.c:25
+  // CHECK-NEXT:   in main {{.*}}violation-pr45282.c:25
 
   // loop of problems
   for (j = 2; j <= 4; ++j) {
diff --git a/compiler-rt/test/tysan/violation-pr47137.c 
b/compiler-rt/test/tysan/violation-pr47137.c
index 3987128ff6fc67..72693afe66e2c4 100644
--- a/compiler-rt/test/tysan/violation-pr47137.c
+++ b/compiler-rt/test/tysan/violation-pr47137.c
@@ -4,6 +4,7 @@
 // https://github.com/llvm/llvm-project/issues/47137
 #include 
 #include 
+#include 
 
 void f(int m) {
   int n = (4 * m + 2) / 3;
@@ -23,8 +24,8 @@ void f(int m) {
   }
 
   // CHECK:  TypeSanitizer: type-aliasing-violation on address
-  // CHECK-NEXT: READ of size 2 at {{.+}} with type short accesses an existing 
object of type long long
-  // CHECK-NEXT:in f violation-pr47137.c:30
+  // CHECK-NEXT: READ of size 2 at {{.+}} with type short accesses an existing 
object of type long{{( long)?}}
+  // CHECK-NEXT:in f {{.*}}violation-pr47137.c:31
   for (int i = 0, j = 0; j < 4 * m; i += 4, j += 3) {
 for (int k = 0; k < 3; k++) {
   ((uint16_t *)a)[j + k] = ((uint16_t *)a)[i + k];
diff --git a/compiler-rt/test/tysan/violation-pr62544.c 
b/compiler-rt/test/tysan/violation-pr62544.c
index 30610925ba385f..5ab1b706d35805 100644
--- a/compiler-rt/test/tysan/violation-pr62544.c
+++ b/compiler-rt/test/tysan/violation-pr62544.c
@@ -18,7 +18,7 @@ int main() {
 
   // CHECK:  TypeSanitizer: type-aliasing-violation on address
   // CHECK-NEXT: WRITE of size 2 at {{.+}} with type short accesses an 
existing object of type int
-  // CHECK-NEXT:   in main violation-pr62544.c:22
+  // CHECK-NEXT:   in main {{.*}}violation-pr62544.c:22
   *e = 3;
   printf("%d\n", a);
 }
diff --git a/compiler-rt/test/tysan/violation-pr62828.cpp 
b/compiler-rt/test/tysan/violation-pr62828.cpp
index 33003df9761f52..d620f8a98f54c6 100644
--- a/compiler-rt/test/tysan/violation-pr62828.cpp
+++ b/compiler-rt/test/tysan/violation-pr62828.cpp
@@ -24,7 +24,7 @@ short *test1(int_v8 *cast_c_array, short_v8 *shuf_c_array1, 
int *ptr) {
 
   // CHECK:  ERROR: TypeSanitizer: type-aliasing-violation on address
   // CHECK-NEXT: READ of size 2 at {{.+}} with type short accesses an existing 
object of type int
-  // CHECK-NEXT:in test1(int (*) [8], short (*) [8], int*) 
violation-pr62828.cpp:29
+  // CHECK-NEXT:in test1(int (*) [8], short (*) [8], int*) 
{{.*}}violation-pr62828.cpp:29
   for (int i3 = 0; i3 < 4; ++i3) {
 output2[i3] = input2[(i3 * 2)];
   }
diff --git a/compiler-rt/test/tysan/violation-pr68655.cpp 
b/compiler-rt/test/tysan/violation-pr68655.cpp
index ac20f8c94e1ffd..a81910640c3e8a 100644
--- a/compiler-rt/test/tysan/violation-pr68655.cpp
+++ b/compiler-rt/test/tysan/violation-pr68655.cpp
@@ -9,7 +9,7 @@ struct S1 {
 
 // CHECK: TypeSanitizer: type-aliasing-violation on address
 // CHECK-NEXT:  READ of size 4 at {{.+}} with type int accesses an existing 
object of type long long (in S1 at offset 0)
-// CHECK-NEXT: in copyMem(S1*, S1*) violation-pr68655.cpp:19
+// CHECK-NEXT: in copyMem(S1*, S1*) {{.*}}violation-pr68655.cpp:19
 
 void inline copyMem(S1 *dst, S1 *src) {
   unsigned *d = reinterpret_cast(dst);
diff --git a/compiler-rt/test/tysan/violation-pr86685.c 
b/compiler-rt/test/tysan/violation-pr86685.c
index fe4fd82af5fdd2..0ec72b3f85e8c3 100644
--- a/compiler-rt/test/tysan/violation-pr86685.c
+++ b/compiler-rt/test/tysan/violation-pr86685.c
@@ -13,7 +13,7 @@ void foo(int *s, float *f, long n) {
 
 // CHECK:  TypeSanitizer: type-aliasing-violation on address
 // CHECK-NEXT: WRITE of size 4 at {{.+}} with type int accesses an 
existing object of type float
-// CHECK-NEXT:   #0 {{.+}} in foo violation-pr86685.c:17
+// CHECK-NEXT:   #0 {{.+}} in foo {{.*}}violation-pr86685.c:17
 *s = 4;
   }
 }


[llvm-branch-commits] [llvm] release/19.x: [RISCV] Add hasPostISelHook to sf.vfnrclip pseudo instructions. (#114274) (PR #117948)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-backend-risc-v

Author: None (llvmbot)


Changes

Backport 71b6f6b8a1cd9a63b9d382fe15f40bbb427939b9 
408c84f35b8b0338b630a6ee313c14238e62b5e6

Requested by: @topperc

---
Full diff: https://github.com/llvm/llvm-project/pull/117948.diff


5 Files Affected:

- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+7) 
- (modified) llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td (+8-9) 
- (modified) llvm/lib/Target/RISCV/RISCVInstrInfoXSf.td (+3-2) 
- (modified) llvm/test/CodeGen/RISCV/rvv/sf_vfnrclip_x_f_qf.ll (+1-3) 
- (modified) llvm/test/CodeGen/RISCV/rvv/sf_vfnrclip_xu_f_qf.ll (+1-3) 


``diff
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 6c0cbeadebf431..7f4bbe7861087e 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -2536,6 +2536,13 @@ bool RISCVInstrInfo::verifyInstruction(const 
MachineInstr &MI,
 }
   }
 
+  if (int Idx = RISCVII::getFRMOpNum(Desc);
+  Idx >= 0 && MI.getOperand(Idx).getImm() == RISCVFPRndMode::DYN &&
+  !MI.readsRegister(RISCV::FRM, /*TRI=*/nullptr)) {
+ErrInfo = "dynamic rounding mode should read FRM";
+return false;
+  }
+
   return true;
 }
 
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td 
b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
index b860273d639ee5..93fd0b2aada35a 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
@@ -6471,7 +6471,7 @@ defm PseudoVFRDIV : VPseudoVFRDIV_VF_RM;
 
//===--===//
 // 13.5. Vector Widening Floating-Point Multiply
 
//===--===//
-let mayRaiseFPException = true, hasSideEffects = 0 in {
+let mayRaiseFPException = true, hasSideEffects = 0, hasPostISelHook = 1 in {
 defm PseudoVFWMUL : VPseudoVWMUL_VV_VF_RM;
 }
 
@@ -6504,7 +6504,7 @@ defm PseudoVFWMACCBF16  : VPseudoVWMAC_VV_VF_BF_RM;
 
//===--===//
 // 13.8. Vector Floating-Point Square-Root Instruction
 
//===--===//
-let mayRaiseFPException = true, hasSideEffects = 0 in
+let mayRaiseFPException = true, hasSideEffects = 0, hasPostISelHook = 1 in
 defm PseudoVFSQRT : VPseudoVSQR_V_RM;
 
 
//===--===//
@@ -6516,7 +6516,7 @@ defm PseudoVFRSQRT7 : VPseudoVRCP_V;
 
//===--===//
 // 13.10. Vector Floating-Point Reciprocal Estimate Instruction
 
//===--===//
-let mayRaiseFPException = true, hasSideEffects = 0 in
+let mayRaiseFPException = true, hasSideEffects = 0, hasPostISelHook = 1 in
 defm PseudoVFREC7 : VPseudoVRCP_V_RM;
 
 
//===--===//
@@ -6627,9 +6627,10 @@ defm PseudoVFNCVT_F_X  : VPseudoVNCVTF_W_RM;
 defm PseudoVFNCVT_RM_F_XU  : VPseudoVNCVTF_RM_W;
 defm PseudoVFNCVT_RM_F_X   : VPseudoVNCVTF_RM_W;
 
-let hasSideEffects = 0, hasPostISelHook = 1 in
+let hasSideEffects = 0, hasPostISelHook = 1 in {
 defm PseudoVFNCVT_F_F  : VPseudoVNCVTD_W_RM;
 defm PseudoVFNCVTBF16_F_F :  VPseudoVNCVTD_W_RM;
+}
 
 defm PseudoVFNCVT_ROD_F_F  : VPseudoVNCVTD_W;
 } // mayRaiseFPException = true
@@ -6665,8 +,7 @@ let Predicates = [HasVInstructionsAnyF] in {
 
//===--===//
 // 14.3. Vector Single-Width Floating-Point Reduction Instructions
 
//===--===//
-let mayRaiseFPException = true,
-hasSideEffects = 0 in {
+let mayRaiseFPException = true, hasSideEffects = 0, hasPostISelHook = 1 in {
 defm PseudoVFREDOSUM : VPseudoVFREDO_VS_RM;
 defm PseudoVFREDUSUM : VPseudoVFRED_VS_RM;
 }
@@ -6678,9 +6678,8 @@ defm PseudoVFREDMAX  : VPseudoVFREDMINMAX_VS;
 
//===--===//
 // 14.4. Vector Widening Floating-Point Reduction Instructions
 
//===--===//
-let IsRVVWideningReduction = 1,
-hasSideEffects = 0,
-mayRaiseFPException = true in {
+let IsRVVWideningReduction = 1, hasSideEffects = 0, mayRaiseFPException = true,
+hasPostISelHook = 1 in {
 defm PseudoVFWREDUSUM  : VPseudoVFWRED_VS_RM;
 defm PseudoVFWREDOSUM  : VPseudoVFWREDO_VS_RM;
 }
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoXSf.td 
b/llvm/lib/Target/RISCV/RISCVInstrInfoXSf.td
index 71aa1f19e089a9..eacc75b9a6c445 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoXSf.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoXSf.td
@@ -217,7 +217,8 @@ let Predicates = [HasVendorXSfvfwmaccqqq], DecoderName

[llvm-branch-commits] [llvm] release/19.x: [RISCV] Add hasPostISelHook to sf.vfnrclip pseudo instructions. (#114274) (PR #117948)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:

@topperc @preames What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/117948

[llvm-branch-commits] [llvm] release/19.x: [RISCV] Add hasPostISelHook to sf.vfnrclip pseudo instructions. (#114274) (PR #117948)

2024-11-27 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/117948
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add TuneDisableLatencySchedHeuristic (PR #115858)

2024-11-27 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/115858

[llvm-branch-commits] [llvm] AMDGPU: Simplify demanded bits on readlane/writeline index arguments (PR #117963)

2024-11-27 Thread Shilei Tian via llvm-branch-commits



@@ -450,6 +450,37 @@ static bool isTriviallyUniform(const Use &U) {
   return false;
 }
 
+/// Simplify a lane index operand (e.g. llvm.amdgcn.readlane src1).
+///
+/// The instruction only reads the low 5 bits for wave32, and 6 bits for 
wave64.
+bool GCNTTIImpl::simplifyDemandedLaneMaskArg(InstCombiner &IC,
+ IntrinsicInst &II,
+ unsigned LaneArgIdx) const {
+  unsigned MaskBits = ST->isWaveSizeKnown() && ST->isWave32() ? 5 : 6;

shiltian wrote:

If we default to 64, can't we just use `getWavefrontSize`?
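
For readers following the question, here is a minimal standalone sketch of how a lane-index bit count could be derived from a wavefront size rather than hard-coded as 5 or 6. It is not LLVM code: `laneIndexBits`, `demandedLaneMask`, and `effectiveLane` are hypothetical names, and the sketch assumes the wave size is a power of two (32 or 64), as in the quoted comment about the low 5 or 6 bits being read.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): number of low bits of a lane index
// that matter for a given wavefront size. Assumes WaveSize is a power of two.
constexpr unsigned laneIndexBits(unsigned WaveSize) {
  unsigned Bits = 0;
  while ((1u << Bits) < WaveSize)
    ++Bits;
  return Bits;
}

// Mask with the low laneIndexBits(WaveSize) bits set; this plays the role of
// the demanded-bits mask described in the patch comment.
constexpr std::uint32_t demandedLaneMask(unsigned WaveSize) {
  return (std::uint32_t{1} << laneIndexBits(WaveSize)) - 1;
}

// Lane actually selected once the ignored high bits are dropped.
constexpr std::uint32_t effectiveLane(std::uint32_t Index, unsigned WaveSize) {
  return Index & demandedLaneMask(WaveSize);
}

static_assert(laneIndexBits(32) == 5, "wave32 reads 5 index bits");
static_assert(laneIndexBits(64) == 6, "wave64 reads 6 index bits");
// 40 is 0b101000: in a 32-wide wave bit 5 is ignored, leaving lane 8.
static_assert(effectiveLane(40, 32) == 8, "out-of-range index wraps to lane 8");
static_assert(effectiveLane(40, 64) == 40, "index 40 is in range for wave64");
```

Under these assumptions, an index that is out of bounds for wave32 masks down to a lane in the low half, which is the situation the PR description says may then CSE with operations on the low half of the wave.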

https://github.com/llvm/llvm-project/pull/117963

[llvm-branch-commits] [clang] release/19.x: [clang] hexagon: fix link order for libc/builtins (#117057) (PR #117968)

2024-11-27 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/117968

[llvm-branch-commits] [clang] release/19.x: [clang] hexagon: fix link order for libc/builtins (#117057) (PR #117968)

2024-11-27 Thread via llvm-branch-commits


https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/117968

Backport 9cc2502c048b1403ba8ba5cc5a655d867c329d12

Requested by: @androm3da

>From f09d5dd3a16f66d183d59f9698250a897f4247ab Mon Sep 17 00:00:00 2001
From: Brian Cain 
Date: Mon, 25 Nov 2024 11:35:45 -0600
Subject: [PATCH] [clang] hexagon: fix link order for libc/builtins (#117057)

When linking programs with `eld`, we get a link error like below:

Error:
/inst/clang+llvm-19.1.0-cross-hexagon-unknown-linux-musl/x86_64-linux-gnu/bin/../target/hexagon-unknown-linux-musl//usr/lib/libc.a(scalbn.lo)(.text.scalbn+0x3c):
undefined reference to `__hexagon_muldf3'

libc has references to the clang_rt builtins library, so the order of
the libraries should be reversed.

(cherry picked from commit 9cc2502c048b1403ba8ba5cc5a655d867c329d12)
---
 clang/lib/Driver/ToolChains/Hexagon.cpp |  2 +-
 clang/test/Driver/hexagon-toolchain-linux.c | 10 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Hexagon.cpp 
b/clang/lib/Driver/ToolChains/Hexagon.cpp
index 29781399cbab44..383dc8387e75e7 100644
--- a/clang/lib/Driver/ToolChains/Hexagon.cpp
+++ b/clang/lib/Driver/ToolChains/Hexagon.cpp
@@ -378,9 +378,9 @@ constructHexagonLinkArgs(Compilation &C, const JobAction 
&JA,
   if (NeedsXRayDeps)
 linkXRayRuntimeDeps(HTC, Args, CmdArgs);
 
-  CmdArgs.push_back("-lclang_rt.builtins-hexagon");
   if (!Args.hasArg(options::OPT_nolibc))
 CmdArgs.push_back("-lc");
+  CmdArgs.push_back("-lclang_rt.builtins-hexagon");
 }
 if (D.CCCIsCXX()) {
   if (HTC.ShouldLinkCXXStdlib(Args))
diff --git a/clang/test/Driver/hexagon-toolchain-linux.c 
b/clang/test/Driver/hexagon-toolchain-linux.c
index 86cc9a30e932c6..6f7f3b20f9141f 100644
--- a/clang/test/Driver/hexagon-toolchain-linux.c
+++ b/clang/test/Driver/hexagon-toolchain-linux.c
@@ -11,7 +11,7 @@
 // CHECK000-NOT:  
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crti.o
 // CHECK000:  "-dynamic-linker={{/|}}lib{{/|}}ld-musl-hexagon.so.1"
 // CHECK000:  
"{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crt1.o"
-// CHECK000:  "-lclang_rt.builtins-hexagon" "-lc"
+// CHECK000:  "-lc" "-lclang_rt.builtins-hexagon"
 // 
-
 // Passing --musl --shared
 // 
-
@@ -21,7 +21,7 @@
 // RUN:   --sysroot=%S/Inputs/basic_linux_libcxx_tree -shared %s 2>&1 | 
FileCheck -check-prefix=CHECK001 %s
 // CHECK001-NOT:-dynamic-linker={{/|}}lib{{/|}}ld-musl-hexagon.so.1
 // CHECK001:
"{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crti.o"
-// CHECK001:"-lclang_rt.builtins-hexagon" "-lc"
+// CHECK001:"-lc" "-lclang_rt.builtins-hexagon"
 // CHECK001-NOT:
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crt1.o
 // 
-
 // Passing --musl -nostdlib
@@ -33,8 +33,8 @@
 // CHECK002:   
"-dynamic-linker={{/|}}lib{{/|}}ld-musl-hexagon.so.1"
 // CHECK002-NOT:   
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crti.o
 // CHECK002-NOT:   
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crt1.o
-// CHECK002-NOT:   "-lclang_rt.builtins-hexagon"
 // CHECK002-NOT:   "-lc"
+// CHECK002-NOT:   "-lclang_rt.builtins-hexagon"
 // 
-
 // Passing --musl -nostartfiles
 // 
-
@@ -45,7 +45,7 @@
 // CHECK003:   
"-dynamic-linker={{/|}}lib{{/|}}ld-musl-hexagon.so.1"
 // CHECK003-NOT:   
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}Scrt1.o
 // CHECK003-NOT:   
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crt1.o
-// CHECK003:   "-lclang_rt.builtins-hexagon" "-lc"
+// CHECK003:   "-lc" "-lclang_rt.builtins-hexagon"
 // 
-
 // Passing --musl -nodefaultlibs
 // 
-
@@ -55,8 +55,8 @@
 // RUN:   --sysroot=%S/Inputs/basic_linux_libcxx_tree -nodefaultlibs %s 2>&1 | 
FileCheck -check-prefix=CHECK004 %s
 // CHECK004:   
"-dynamic-linker={{/|}}lib{{/|}}ld-musl-hexagon.so.1"
 // CHECK004:   
"{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crt1.o"
-// CHECK004-NOT:   "-lclang_rt.builtins-hexagon"
 // CHECK004-NOT:   "-lc"
+// CHECK004-NOT:   "-lclang_rt.builtins-hexagon"
 // 
-
 // Passing --musl -nolibc
 // 
-

__

[llvm-branch-commits] [clang] release/19.x: [clang] hexagon: fix link order for libc/builtins (#117057) (PR #117968)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:



@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-backend-hexagon

Author: None (llvmbot)


Changes

Backport 9cc2502c048b1403ba8ba5cc5a655d867c329d12

Requested by: @androm3da

---
Full diff: https://github.com/llvm/llvm-project/pull/117968.diff


2 Files Affected:

- (modified) clang/lib/Driver/ToolChains/Hexagon.cpp (+1-1) 
- (modified) clang/test/Driver/hexagon-toolchain-linux.c (+5-5) 


``diff
diff --git a/clang/lib/Driver/ToolChains/Hexagon.cpp 
b/clang/lib/Driver/ToolChains/Hexagon.cpp
index 29781399cbab44..383dc8387e75e7 100644
--- a/clang/lib/Driver/ToolChains/Hexagon.cpp
+++ b/clang/lib/Driver/ToolChains/Hexagon.cpp
@@ -378,9 +378,9 @@ constructHexagonLinkArgs(Compilation &C, const JobAction 
&JA,
   if (NeedsXRayDeps)
 linkXRayRuntimeDeps(HTC, Args, CmdArgs);
 
-  CmdArgs.push_back("-lclang_rt.builtins-hexagon");
   if (!Args.hasArg(options::OPT_nolibc))
 CmdArgs.push_back("-lc");
+  CmdArgs.push_back("-lclang_rt.builtins-hexagon");
 }
 if (D.CCCIsCXX()) {
   if (HTC.ShouldLinkCXXStdlib(Args))
diff --git a/clang/test/Driver/hexagon-toolchain-linux.c 
b/clang/test/Driver/hexagon-toolchain-linux.c
index 86cc9a30e932c6..6f7f3b20f9141f 100644
--- a/clang/test/Driver/hexagon-toolchain-linux.c
+++ b/clang/test/Driver/hexagon-toolchain-linux.c
@@ -11,7 +11,7 @@
 // CHECK000-NOT:  
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crti.o
 // CHECK000:  "-dynamic-linker={{/|}}lib{{/|}}ld-musl-hexagon.so.1"
 // CHECK000:  
"{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crt1.o"
-// CHECK000:  "-lclang_rt.builtins-hexagon" "-lc"
+// CHECK000:  "-lc" "-lclang_rt.builtins-hexagon"
 // 
-
 // Passing --musl --shared
 // 
-
@@ -21,7 +21,7 @@
 // RUN:   --sysroot=%S/Inputs/basic_linux_libcxx_tree -shared %s 2>&1 | 
FileCheck -check-prefix=CHECK001 %s
 // CHECK001-NOT:-dynamic-linker={{/|}}lib{{/|}}ld-musl-hexagon.so.1
 // CHECK001:
"{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crti.o"
-// CHECK001:"-lclang_rt.builtins-hexagon" "-lc"
+// CHECK001:"-lc" "-lclang_rt.builtins-hexagon"
 // CHECK001-NOT:
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crt1.o
 // 
-
 // Passing --musl -nostdlib
@@ -33,8 +33,8 @@
 // CHECK002:   
"-dynamic-linker={{/|}}lib{{/|}}ld-musl-hexagon.so.1"
 // CHECK002-NOT:   
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crti.o
 // CHECK002-NOT:   
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crt1.o
-// CHECK002-NOT:   "-lclang_rt.builtins-hexagon"
 // CHECK002-NOT:   "-lc"
+// CHECK002-NOT:   "-lclang_rt.builtins-hexagon"
 // 
-
 // Passing --musl -nostartfiles
 // 
-
@@ -45,7 +45,7 @@
 // CHECK003:   
"-dynamic-linker={{/|}}lib{{/|}}ld-musl-hexagon.so.1"
 // CHECK003-NOT:   
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}Scrt1.o
 // CHECK003-NOT:   
{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crt1.o
-// CHECK003:   "-lclang_rt.builtins-hexagon" "-lc"
+// CHECK003:   "-lc" "-lclang_rt.builtins-hexagon"
 // 
-
 // Passing --musl -nodefaultlibs
 // 
-
@@ -55,8 +55,8 @@
 // RUN:   --sysroot=%S/Inputs/basic_linux_libcxx_tree -nodefaultlibs %s 2>&1 | 
FileCheck -check-prefix=CHECK004 %s
 // CHECK004:   
"-dynamic-linker={{/|}}lib{{/|}}ld-musl-hexagon.so.1"
 // CHECK004:   
"{{.*}}basic_linux_libcxx_tree{{/|}}usr{{/|}}lib{{/|}}crt1.o"
-// CHECK004-NOT:   "-lclang_rt.builtins-hexagon"
 // CHECK004-NOT:   "-lc"
+// CHECK004-NOT:   "-lclang_rt.builtins-hexagon"
 // 
-
 // Passing --musl -nolibc
 // 
-

``




https://github.com/llvm/llvm-project/pull/117968
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2024-11-27 Thread Michael Kruse via llvm-branch-commits

Meinersbur wrote:

> I think it is because COMPILE_LANGUAGE:CXX does not work for sources where 
> the language has been set to CUDA. Since CUDA may be C or C++ sources, it 
> would also be wrong to add them to COMPILE_LANGUAGE:CUDA.
> 
> I do not think cmake has a way to specify flags to be passed to host 
> compilers ([from this cmake 
> issue](https://gitlab.kitware.com/cmake/cmake/-/issues/25911)).
> 
> Maybe you could define those flags in some functions that could be used both 
> here to set COMPILE_LANGUAGE:CXX flags and in `AddFlangRTOffload` to add 
> these flags to the `CUDA_COMPILE_OPTIONS`.

I think you are right: the generator expression only applies to C++. I added it 
to avoid applying those switches to C files. I didn't have CUDA in mind, and I 
think neither did `llvm_add_library`[^1]. I would not recommend overriding the 
`LANGUAGE` source file property.

For `nvcc`, the documented switch to disable exceptions seems to be 
`--no-exceptions`. Options can be forwarded to the host compiler using 
`-Xcompiler`. I updated this PR to use these options. Please check those 
options.

Windows targets might not be supported for now, but I think they still need to 
be considered. Compiling with Clang's experimental CUDA support seems to have 
been intended at some point, but it currently fails (on main) with 
```
/home/meinersbur/src/llvm-project-flangrt/flang-rt/../flang/include/flang/Runtime/freestanding-tools.h:91:5:
 error: non-void function 'memmove' should return a value [-Wreturn-type]
   91 | return;
  | ^
```
and other errors.

If I read the CMake issue correctly, Kitware considers `-Xcompiler` to be 
nvcc-specific and didn't want to generalize this for all compilers[^2]. 

[^1]: The code that applies these compile switches has the comment `# Update 
target props, since all sources are C++.` Using the file extension might have 
been a workaround before generator expressions were available in CMake. It was 
added in 2014.

[^2]: If it were up to me, I would rather add `target_device_compile_options` 
than `target_host_compile`, because the host (or rather the target that we 
are cross-compiling to; "host" is ambiguous here) is the level that all the 
compilers and linkers work on. Only the device/auxiliary code is different from 
other CMake-supported languages.

https://github.com/llvm/llvm-project/pull/110217

[llvm-branch-commits] [llvm] release/19.x: [RISCV] Add hasPostISelHook to sf.vfnrclip pseudo instructions. (#114274) (PR #117948)

2024-11-27 Thread via llvm-branch-commits


https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/117948

Backport 71b6f6b8a1cd9a63b9d382fe15f40bbb427939b9 
408c84f35b8b0338b630a6ee313c14238e62b5e6

Requested by: @topperc

>From bd583dd074029a7113fcb9d41a64da51399c84fa Mon Sep 17 00:00:00 2001
From: Craig Topper 
Date: Wed, 30 Oct 2024 11:47:40 -0700
Subject: [PATCH 1/2] [RISCV] Add missing hasPostISelHook = 1 to vector pseudos
 that might read FRM. (#114186)

We need an implicit FRM read operand anytime the rounding mode is
dynamic. The post isel hook is responsible for this when isel creates an
instruction with dynamic rounding mode.

Add a MachineVerifier check to verify the operand is present.
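
The rule being verified can be sketched in isolation. This is a toy model, not
the LLVM code: `ToyInstr`, `verifyFRM` and `addImplicitFRMUse` are hypothetical
names. Only the error string comes from the patch, and the DYN value (0b111)
follows the RISC-V `frm` encoding.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Rounding-mode encodings of the RISC-V frm field.
enum class RoundingMode { RNE = 0, RTZ = 1, RDN = 2, RUP = 3, RMM = 4, DYN = 7 };

// Hypothetical stand-in for a machine instruction.
struct ToyInstr {
  std::optional<RoundingMode> FRMOperand; // empty if there is no FRM operand
  bool ReadsFRM = false;                  // has an implicit use of FRM
};

// Mirrors the shape of the new check: a dynamic rounding mode without an
// FRM read is rejected.
bool verifyFRM(const ToyInstr &MI, std::string &ErrInfo) {
  if (MI.FRMOperand && *MI.FRMOperand == RoundingMode::DYN && !MI.ReadsFRM) {
    ErrInfo = "dynamic rounding mode should read FRM";
    return false;
  }
  return true;
}

// Stand-in for what the post-isel hook is responsible for.
void addImplicitFRMUse(ToyInstr &MI) {
  if (MI.FRMOperand && *MI.FRMOperand == RoundingMode::DYN)
    MI.ReadsFRM = true;
}
```

In this model, an instruction created with `RoundingMode::DYN` fails
`verifyFRM` until `addImplicitFRMUse` has run on it. That is the situation the
missing `hasPostISelHook = 1` would leave behind.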

(cherry picked from commit 71b6f6b8a1cd9a63b9d382fe15f40bbb427939b9)
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp|  7 +++
 llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td | 17 -
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 6c0cbeadebf431..7f4bbe7861087e 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -2536,6 +2536,13 @@ bool RISCVInstrInfo::verifyInstruction(const 
MachineInstr &MI,
 }
   }
 
+  if (int Idx = RISCVII::getFRMOpNum(Desc);
+  Idx >= 0 && MI.getOperand(Idx).getImm() == RISCVFPRndMode::DYN &&
+  !MI.readsRegister(RISCV::FRM, /*TRI=*/nullptr)) {
+ErrInfo = "dynamic rounding mode should read FRM";
+return false;
+  }
+
   return true;
 }
 
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td 
b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
index b860273d639ee5..93fd0b2aada35a 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
@@ -6471,7 +6471,7 @@ defm PseudoVFRDIV : VPseudoVFRDIV_VF_RM;
 
//===--===//
 // 13.5. Vector Widening Floating-Point Multiply
 
//===--===//
-let mayRaiseFPException = true, hasSideEffects = 0 in {
+let mayRaiseFPException = true, hasSideEffects = 0, hasPostISelHook = 1 in {
 defm PseudoVFWMUL : VPseudoVWMUL_VV_VF_RM;
 }
 
@@ -6504,7 +6504,7 @@ defm PseudoVFWMACCBF16  : VPseudoVWMAC_VV_VF_BF_RM;
 
//===--===//
 // 13.8. Vector Floating-Point Square-Root Instruction
 
//===--===//
-let mayRaiseFPException = true, hasSideEffects = 0 in
+let mayRaiseFPException = true, hasSideEffects = 0, hasPostISelHook = 1 in
 defm PseudoVFSQRT : VPseudoVSQR_V_RM;
 
 
//===--===//
@@ -6516,7 +6516,7 @@ defm PseudoVFRSQRT7 : VPseudoVRCP_V;
 
//===--===//
 // 13.10. Vector Floating-Point Reciprocal Estimate Instruction
 
//===--===//
-let mayRaiseFPException = true, hasSideEffects = 0 in
+let mayRaiseFPException = true, hasSideEffects = 0, hasPostISelHook = 1 in
 defm PseudoVFREC7 : VPseudoVRCP_V_RM;
 
 
//===--===//
@@ -6627,9 +6627,10 @@ defm PseudoVFNCVT_F_X  : VPseudoVNCVTF_W_RM;
 defm PseudoVFNCVT_RM_F_XU  : VPseudoVNCVTF_RM_W;
 defm PseudoVFNCVT_RM_F_X   : VPseudoVNCVTF_RM_W;
 
-let hasSideEffects = 0, hasPostISelHook = 1 in
+let hasSideEffects = 0, hasPostISelHook = 1 in {
 defm PseudoVFNCVT_F_F  : VPseudoVNCVTD_W_RM;
 defm PseudoVFNCVTBF16_F_F :  VPseudoVNCVTD_W_RM;
+}
 
 defm PseudoVFNCVT_ROD_F_F  : VPseudoVNCVTD_W;
 } // mayRaiseFPException = true
@@ -6665,8 +6666,7 @@ let Predicates = [HasVInstructionsAnyF] in {
 
//===--===//
 // 14.3. Vector Single-Width Floating-Point Reduction Instructions
 
//===--===//
-let mayRaiseFPException = true,
-hasSideEffects = 0 in {
+let mayRaiseFPException = true, hasSideEffects = 0, hasPostISelHook = 1 in {
 defm PseudoVFREDOSUM : VPseudoVFREDO_VS_RM;
 defm PseudoVFREDUSUM : VPseudoVFRED_VS_RM;
 }
@@ -6678,9 +6678,8 @@ defm PseudoVFREDMAX  : VPseudoVFREDMINMAX_VS;
 
//===--===//
 // 14.4. Vector Widening Floating-Point Reduction Instructions
 
//===--===//
-let IsRVVWideningReduction = 1,
-hasSideEffects = 0,
-mayRaiseFPException = true in {
+let IsRVVWideningReduction = 1, hasSideEffects = 0, mayRaiseFPException = true,
+hasPostISelHook = 1 in {
 defm PseudoVFWREDUSUM  : VPseudoVFWRED_VS_RM;
 defm PseudoVFWREDOSUM  : VPseudoVFWREDO_VS_RM;
 }

>From 6140b33fd309d6

[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2024-11-27 Thread Michael Kruse via llvm-branch-commits


Meinersbur wrote:

> @Meinersbur, could you let me know if/once you think this should be fixed, 
> then I'll give it another shot in building.

I have been on vacation since last week. I will address the remaining comments, 
merge main again, then ping you.

https://github.com/llvm/llvm-project/pull/110217

[llvm-branch-commits] [llvm] [RISCV] Add TuneDisableLatencySchedHeuristic (PR #115858)

2024-11-27 Thread Craig Topper via llvm-branch-commits


https://github.com/topperc approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/115858

[llvm-branch-commits] [clang] release/19.x: [clang] hexagon: fix link order for libc/builtins (#117057) (PR #117968)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:

@SidManning What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/117968

[llvm-branch-commits] [RISCV] Enable ShouldTrackLaneMasks when having vector instructions (PR #115843)

2024-11-27 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/115843

[llvm-branch-commits] [RISCV] Enable ShouldTrackLaneMasks when having vector instructions (PR #115843)

2024-11-27 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/115843



[llvm-branch-commits] [flang] [MLIR][OpenMP] Add Lowering support for OpenMP Declare Mapper directive (PR #117046)

2024-11-27 Thread Kiran Chandramohan via llvm-branch-commits



@@ -2701,7 +2701,42 @@ static void
 genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval,
const parser::OpenMPDeclareMapperConstruct &declareMapperConstruct) {
-  TODO(converter.getCurrentLocation(), "OpenMPDeclareMapperConstruct");
+  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+  lower::StatementContext stmtCtx;
+  const auto &spec =
+  std::get(declareMapperConstruct.t);
+  const auto &mapperName{std::get>(spec.t)};
+  const auto &varType{std::get(spec.t)};
+  const auto &varName{std::get(spec.t)};
+  assert(varType.declTypeSpec->category() ==
+ semantics::DeclTypeSpec::Category::TypeDerived &&
+ "Expected derived type");
+
+  std::string mapperNameStr;
+  if (mapperName.has_value())
+mapperNameStr = mapperName->ToString();
+  else
+mapperNameStr =
+"default_" + varType.declTypeSpec->derivedTypeSpec().name().ToString();
+
+  mlir::OpBuilder::InsertPoint insPt = firOpBuilder.saveInsertionPoint();
+  firOpBuilder.setInsertionPointToStart(converter.getModuleOp().getBody());
+  auto mlirType = converter.genType(varType.declTypeSpec->derivedTypeSpec());
+  auto varVal = firOpBuilder.createTemporaryAlloc(
+  converter.getCurrentLocation(), mlirType, varName.ToString());

kiranchandramohan wrote:

Besides the representation, we should also clarify 
-> where we create the map_entries for the relevant operations (like target) 
for which the declare mapper implicitly applies
-> where the composition of map-types (map-type decay) from the map clause of 
declare mapper and the map clause of the relevant operation (like target) 
happens

Note: All these are for discussion and should be treated as suggestions.

https://github.com/llvm/llvm-project/pull/117046

[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)

2024-11-27 Thread Chuanqi Xu via llvm-branch-commits


ChuanqiXu9 wrote:

@alexfh gentle ping

https://github.com/llvm/llvm-project/pull/83237

[llvm-branch-commits] [llvm] release/19.x: Bail out jump threading on indirect branches only (#117778) (PR #117869)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: None (llvmbot)


Changes

Backport 3c9022c965b85951f30af140da591f819acef8a0 
39601a6e5484de183bf525b7d0624e7890ccd8ab

Requested by: @nikic

---
Full diff: https://github.com/llvm/llvm-project/pull/117869.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/Utils/Local.cpp (+8-2) 
- (added) 
llvm/test/Transforms/SimplifyCFG/switch-branch-fold-indirectbr-102351.ll (+81) 


``diff
diff --git a/llvm/lib/Transforms/Utils/Local.cpp 
b/llvm/lib/Transforms/Utils/Local.cpp
index 7192efe3f16b9d..f68cbf62b9825f 100644
--- a/llvm/lib/Transforms/Utils/Local.cpp
+++ b/llvm/lib/Transforms/Utils/Local.cpp
@@ -1028,7 +1028,13 @@ CanRedirectPredsOfEmptyBBToSucc(BasicBlock *BB, 
BasicBlock *Succ,
   if (!BB->hasNPredecessorsOrMore(2))
 return false;
 
-  // Get single common predecessors of both BB and Succ
+  if (any_of(BBPreds, [](const BasicBlock *Pred) {
+return isa<IndirectBrInst>(Pred->getTerminator());
+  }))
+return false;
+
+  // Get the single common predecessor of both BB and Succ. Return false
+  // when there are more than one common predecessors.
   for (BasicBlock *SuccPred : SuccPreds) {
 if (BBPreds.count(SuccPred)) {
   if (CommonPred)
@@ -1133,7 +1139,7 @@ bool 
llvm::TryToSimplifyUncondBranchFromEmptyBlock(BasicBlock *BB,
 
   bool BBKillable = CanPropagatePredecessorsForPHIs(BB, Succ, BBPreds);
 
-  // Even if we can not fold bB into Succ, we may be able to redirect the
+  // Even if we can not fold BB into Succ, we may be able to redirect the
   // predecessors of BB to Succ.
   bool BBPhisMergeable =
   BBKillable ||
diff --git 
a/llvm/test/Transforms/SimplifyCFG/switch-branch-fold-indirectbr-102351.ll 
b/llvm/test/Transforms/SimplifyCFG/switch-branch-fold-indirectbr-102351.ll
new file mode 100644
index 00000000000000..d3713be8358db4
--- /dev/null
+++ b/llvm/test/Transforms/SimplifyCFG/switch-branch-fold-indirectbr-102351.ll
@@ -0,0 +1,81 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --prefix-filecheck-ir-name pref --version 5
+; RUN: opt < %s -passes=simplifycfg -S | FileCheck %s
+
+define i32 @foo.1(i32 %arg, ptr %arg1) {
+; CHECK-LABEL: define i32 @foo.1(
+; CHECK-SAME: i32 [[ARG:%.*]], ptr [[ARG1:%.*]]) {
+; CHECK-NEXT:  [[BB:.*]]:
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [2 x ptr], align 16
+; CHECK-NEXT:store ptr blockaddress(@foo.1, %[[BB8:.*]]), ptr [[ALLOCA]], 
align 16
+; CHECK-NEXT:[[GETELEMENTPTR:%.*]] = getelementptr inbounds [2 x ptr], ptr 
[[ALLOCA]], i64 0, i64 1
+; CHECK-NEXT:store ptr blockaddress(@foo.1, %[[BB16:.*]]), ptr 
[[GETELEMENTPTR]], align 8
+; CHECK-NEXT:br label %[[PREFBB2:.*]]
+; CHECK:   [[PREFBB2]]:
+; CHECK-NEXT:[[PHI:%.*]] = phi i32 [ 0, %[[BB]] ], [ [[PHI14:%.*]], 
%[[BB13:.*]] ]
+; CHECK-NEXT:[[PHI3:%.*]] = phi i32 [ 0, %[[BB]] ], [ [[PHI15:%.*]], 
%[[BB13]] ]
+; CHECK-NEXT:switch i32 [[PHI]], label %[[BB13]] [
+; CHECK-NEXT:  i32 0, label %[[PREFBB18:.*]]
+; CHECK-NEXT:  i32 1, label %[[BB8]]
+; CHECK-NEXT:  i32 2, label %[[PREFBB11:.*]]
+; CHECK-NEXT:]
+; CHECK:   [[BB8]]:
+; CHECK-NEXT:[[PHI10:%.*]] = phi i32 [ [[ARG]], %[[PREFBB18]] ], [ 
[[PHI3]], %[[PREFBB2]] ]
+; CHECK-NEXT:br label %[[BB13]]
+; CHECK:   [[PREFBB11]]:
+; CHECK-NEXT:[[CALL:%.*]] = call i32 @wombat(i32 noundef [[PHI3]])
+; CHECK-NEXT:[[ADD:%.*]] = add nsw i32 [[PHI3]], 1
+; CHECK-NEXT:br label %[[PREFBB18]]
+; CHECK:   [[BB13]]:
+; CHECK-NEXT:[[PHI14]] = phi i32 [ [[PHI]], %[[PREFBB2]] ], [ 2, %[[BB8]] ]
+; CHECK-NEXT:[[PHI15]] = phi i32 [ [[PHI3]], %[[PREFBB2]] ], [ [[PHI10]], 
%[[BB8]] ]
+; CHECK-NEXT:br label %[[PREFBB2]]
+; CHECK:   [[BB16]]:
+; CHECK-NEXT:[[CALL17:%.*]] = call i32 @wombat(i32 noundef [[ARG]])
+; CHECK-NEXT:ret i32 0
+; CHECK:   [[PREFBB18]]:
+; CHECK-NEXT:[[LOAD:%.*]] = load ptr, ptr [[ARG1]], align 8
+; CHECK-NEXT:indirectbr ptr [[LOAD]], [label %[[BB8]], label %bb16]
+;
+bb:
+  %alloca = alloca [2 x ptr], align 16
+  store ptr blockaddress(@foo.1, %bb8), ptr %alloca, align 16
+  %getelementptr = getelementptr inbounds [2 x ptr], ptr %alloca, i64 0, i64 1
+  store ptr blockaddress(@foo.1, %bb16), ptr %getelementptr, align 8
+  br label %bb2
+
+bb2:  ; preds = %bb13, %bb
+  %phi = phi i32 [ 0, %bb ], [ %phi14, %bb13 ]
+  %phi3 = phi i32 [ 0, %bb ], [ %phi15, %bb13 ]
+  switch i32 %phi, label %bb13 [
+  i32 0, label %bb5
+  i32 1, label %bb8
+  i32 2, label %bb11
+  ]
+
+bb5:  ; preds = %bb2
+  br label %bb18
+
+bb8:  ; preds = %bb18, %bb2
+  %phi10 = phi i32 [ %arg, %bb18 ], [ %phi3, %bb2 ]
+  br label %bb13
+
+bb11: ; preds = %bb2
+  %call = call i32 @wombat(i32 noundef %phi3)
+  %add = add nsw i32 %phi3, 1
+  br label %bb18
+
+bb13:

[llvm-branch-commits] [llvm] release/19.x: Bail out jump threading on indirect branches only (#117778) (PR #117869)

2024-11-27 Thread via llvm-branch-commits


https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/117869

Backport 3c9022c965b85951f30af140da591f819acef8a0 
39601a6e5484de183bf525b7d0624e7890ccd8ab

Requested by: @nikic

>From 30e75e7bdac13b4139a687adbf42c289c31f3305 Mon Sep 17 00:00:00 2001
From: AdityaK 
Date: Tue, 10 Sep 2024 22:39:02 -0700
Subject: [PATCH 1/2] Bail out jump threading on indirect branches (#103688)

The bug was introduced by
https://github.com/llvm/llvm-project/pull/68473

Fixes: #102351
(cherry picked from commit 3c9022c965b85951f30af140da591f819acef8a0)
---
 llvm/lib/Transforms/Utils/Local.cpp   |  11 +-
 .../switch-branch-fold-indirectbr-102351.ll   | 104 ++
 2 files changed, 113 insertions(+), 2 deletions(-)
 create mode 100644 
llvm/test/Transforms/SimplifyCFG/switch-branch-fold-indirectbr-102351.ll

diff --git a/llvm/lib/Transforms/Utils/Local.cpp 
b/llvm/lib/Transforms/Utils/Local.cpp
index 7192efe3f16b9d..4eb8dc1d2d6158 100644
--- a/llvm/lib/Transforms/Utils/Local.cpp
+++ b/llvm/lib/Transforms/Utils/Local.cpp
@@ -1028,7 +1028,14 @@ CanRedirectPredsOfEmptyBBToSucc(BasicBlock *BB, 
BasicBlock *Succ,
   if (!BB->hasNPredecessorsOrMore(2))
 return false;
 
-  // Get single common predecessors of both BB and Succ
+  if (any_of(BBPreds, [](const BasicBlock *Pred) {
+return isa<PHINode>(Pred->begin()) &&
+   isa<IndirectBrInst>(Pred->getTerminator());
+  }))
+return false;
+
+  // Get the single common predecessor of both BB and Succ. Return false
+  // when there are more than one common predecessors.
   for (BasicBlock *SuccPred : SuccPreds) {
 if (BBPreds.count(SuccPred)) {
   if (CommonPred)
@@ -1133,7 +1140,7 @@ bool 
llvm::TryToSimplifyUncondBranchFromEmptyBlock(BasicBlock *BB,
 
   bool BBKillable = CanPropagatePredecessorsForPHIs(BB, Succ, BBPreds);
 
-  // Even if we can not fold bB into Succ, we may be able to redirect the
+  // Even if we can not fold BB into Succ, we may be able to redirect the
   // predecessors of BB to Succ.
   bool BBPhisMergeable =
   BBKillable ||
diff --git 
a/llvm/test/Transforms/SimplifyCFG/switch-branch-fold-indirectbr-102351.ll 
b/llvm/test/Transforms/SimplifyCFG/switch-branch-fold-indirectbr-102351.ll
new file mode 100644
index 00000000000000..03aee68fa4248c
--- /dev/null
+++ b/llvm/test/Transforms/SimplifyCFG/switch-branch-fold-indirectbr-102351.ll
@@ -0,0 +1,104 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 5
+; RUN: opt < %s -passes=simplifycfg -S | FileCheck %s
+
+target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local noundef i32 @main() {
+; CHECK-LABEL: define dso_local noundef i32 @main() {
+; CHECK-NEXT:  [[BB:.*]]:
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [2 x ptr], align 16
+; CHECK-NEXT:store ptr blockaddress(@main, %[[BB4:.*]]), ptr [[ALLOCA]], 
align 16, !tbaa [[TBAA0:![0-9]+]]
+; CHECK-NEXT:[[GETELEMENTPTR:%.*]] = getelementptr inbounds [2 x ptr], ptr 
[[ALLOCA]], i64 0, i64 1
+; CHECK-NEXT:store ptr blockaddress(@main, %[[BB10:.*]]), ptr 
[[GETELEMENTPTR]], align 8, !tbaa [[TBAA0]]
+; CHECK-NEXT:br label %[[BB1:.*]]
+; CHECK:   [[BB1]]:
+; CHECK-NEXT:[[PHI:%.*]] = phi i32 [ 0, %[[BB]] ], [ [[PHI8:%.*]], 
%[[BB7:.*]] ]
+; CHECK-NEXT:[[PHI2:%.*]] = phi i32 [ 0, %[[BB]] ], [ [[PHI9:%.*]], 
%[[BB7]] ]
+; CHECK-NEXT:switch i32 [[PHI]], label %[[BB7]] [
+; CHECK-NEXT:  i32 0, label %[[BB12:.*]]
+; CHECK-NEXT:  i32 1, label %[[BB4]]
+; CHECK-NEXT:  i32 2, label %[[BB6:.*]]
+; CHECK-NEXT:]
+; CHECK:   [[BB4]]:
+; CHECK-NEXT:[[PHI5:%.*]] = phi i32 [ [[PHI13:%.*]], %[[BB12]] ], [ 
[[PHI2]], %[[BB1]] ]
+; CHECK-NEXT:br label %[[BB7]]
+; CHECK:   [[BB6]]:
+; CHECK-NEXT:[[CALL:%.*]] = call i32 @foo(i32 noundef [[PHI2]])
+; CHECK-NEXT:[[ADD:%.*]] = add nsw i32 [[PHI2]], 1
+; CHECK-NEXT:br label %[[BB12]]
+; CHECK:   [[BB7]]:
+; CHECK-NEXT:[[PHI8]] = phi i32 [ [[PHI]], %[[BB1]] ], [ 2, %[[BB4]] ]
+; CHECK-NEXT:[[PHI9]] = phi i32 [ [[PHI2]], %[[BB1]] ], [ [[PHI5]], 
%[[BB4]] ]
+; CHECK-NEXT:br label %[[BB1]], !llvm.loop [[LOOP4:![0-9]+]]
+; CHECK:   [[BB10]]:
+; CHECK-NEXT:[[CALL11:%.*]] = call i32 @foo(i32 noundef [[PHI13]])
+; CHECK-NEXT:ret i32 0
+; CHECK:   [[BB12]]:
+; CHECK-NEXT:[[PHI13]] = phi i32 [ [[ADD]], %[[BB6]] ], [ [[PHI2]], 
%[[BB1]] ]
+; CHECK-NEXT:[[SEXT:%.*]] = sext i32 [[PHI13]] to i64
+; CHECK-NEXT:[[GETELEMENTPTR14:%.*]] = getelementptr inbounds [2 x ptr], 
ptr [[ALLOCA]], i64 0, i64 [[SEXT]]
+; CHECK-NEXT:[[LOAD:%.*]] = load ptr, ptr [[GETELEMENTPTR14]], align 8, 
!tbaa [[TBAA0]]
+; CHECK-NEXT:indirectbr ptr [[LOAD]], [label %[[BB4]], label %bb10]
+;
+bb:
+  %alloca = alloca [2 x ptr], align 16
+  store ptr blockaddress(@main, %bb4), ptr %alloca, align 16, !tbaa !0
+  %getelementptr = getelementptr inbounds [2 x

[llvm-branch-commits] [llvm] release/19.x: Bail out jump threading on indirect branches only (#117778) (PR #117869)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:

@hiraditya @nikic What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/117869

[llvm-branch-commits] [llvm] release/19.x: Bail out jump threading on indirect branches only (#117778) (PR #117869)

2024-11-27 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/117869

[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: AMDGPURegBankSelect (PR #112863)

2024-11-27 Thread Petar Avramovic via llvm-branch-commits


petar-avramovic wrote:

ping

https://github.com/llvm/llvm-project/pull/112863

[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: RegBankLegalize rules for load (PR #112882)

2024-11-27 Thread Petar Avramovic via llvm-branch-commits


petar-avramovic wrote:

ping

https://github.com/llvm/llvm-project/pull/112882

[llvm-branch-commits] [llvm] MachineUniformityAnalysis: Improve isConstantOrUndefValuePhi (PR #112866)

2024-11-27 Thread Petar Avramovic via llvm-branch-commits


https://github.com/petar-avramovic updated 
https://github.com/llvm/llvm-project/pull/112866

>From 97ce5f3295ed0f795656aed9180901c2299159f8 Mon Sep 17 00:00:00 2001
From: Petar Avramovic 
Date: Thu, 31 Oct 2024 14:10:57 +0100
Subject: [PATCH] MachineUniformityAnalysis: Improve isConstantOrUndefValuePhi

Change the existing code for G_PHI to match what the LLVM IR version does
via PHINode::hasConstantOrUndefValue. This is not safe for a regular PHI,
since it may appear with an undef operand and getVRegDef can fail.
Most notably, this improves the number of values that can be allocated
to the sgpr register bank in AMDGPURegBankSelect.
A common case is phis that appear in structurize-cfg lowering
for cycles with multiple exits:
the undef incoming value comes from a block that reached the cycle exit
condition. If the other incoming value is uniform, keep the phi uniform
even though it joins values from a pair of blocks that are entered
via a divergent conditional branch.
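
The rule described above can be sketched on a toy representation. This is a
simplified model under stated assumptions, not the MachineSSAContext code:
registers are plain integers, `NoReg` plays the role of an invalid Register,
and the set `UndefRegs` replaces the lookup of the defining instruction (the
G_IMPLICIT_DEF / IMPLICIT_DEF test). All names are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <vector>

using Reg = unsigned;
constexpr Reg NoReg = 0; // stands in for an invalid register

// Returns true if every incoming value is either the phi's own result,
// an undef value, or one single common register.
bool isConstantOrUndefValuePhi(Reg Result, const std::vector<Reg> &Incoming,
                               const std::set<Reg> &UndefRegs) {
  Reg ConstantValue = NoReg;
  for (Reg In : Incoming) {
    assert(In != NoReg && "incoming values must be valid registers");
    if (In == Result || UndefRegs.count(In))
      continue; // self-references and undef inputs do not constrain the phi
    if (ConstantValue != NoReg && ConstantValue != In)
      return false; // two different real values are joined
    ConstantValue = In;
  }
  return true;
}
```

For example, a phi `%5` joining `%1` with an undef `%9` is accepted, as is one
joining `%1` with itself (`%5`), while a phi joining two distinct defined
values `%1` and `%2` is rejected.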
---
 llvm/lib/CodeGen/MachineSSAContext.cpp| 27 +-
 .../AMDGPU/MIR/hidden-diverge-gmir.mir| 28 +++
 .../AMDGPU/MIR/hidden-loop-diverge.mir|  4 +-
 .../AMDGPU/MIR/uses-value-from-cycle.mir  |  8 +-
 .../GlobalISel/divergence-structurizer.mir| 80 --
 .../regbankselect-mui-regbanklegalize.mir | 69 ---
 .../regbankselect-mui-regbankselect.mir   | 18 ++--
 .../AMDGPU/GlobalISel/regbankselect-mui.ll| 84 ++-
 .../AMDGPU/GlobalISel/regbankselect-mui.mir   | 51 ++-
 9 files changed, 191 insertions(+), 178 deletions(-)

diff --git a/llvm/lib/CodeGen/MachineSSAContext.cpp 
b/llvm/lib/CodeGen/MachineSSAContext.cpp
index e384187b6e8593..8e13c0916dd9e1 100644
--- a/llvm/lib/CodeGen/MachineSSAContext.cpp
+++ b/llvm/lib/CodeGen/MachineSSAContext.cpp
@@ -54,9 +54,34 @@ const MachineBasicBlock 
*MachineSSAContext::getDefBlock(Register value) const {
   return F->getRegInfo().getVRegDef(value)->getParent();
 }
 
+static bool isUndef(const MachineInstr &MI) {
+  return MI.getOpcode() == TargetOpcode::G_IMPLICIT_DEF ||
+ MI.getOpcode() == TargetOpcode::IMPLICIT_DEF;
+}
+
+/// MachineInstr equivalent of PHINode::hasConstantOrUndefValue() for G_PHI.
 template <>
 bool MachineSSAContext::isConstantOrUndefValuePhi(const MachineInstr &Phi) {
-  return Phi.isConstantValuePHI();
+  if (!Phi.isPHI())
+return false;
+
+  // In later passes PHI may appear with an undef operand, getVRegDef can fail.
+  if (Phi.getOpcode() == TargetOpcode::PHI)
+return Phi.isConstantValuePHI();
+
+  // For G_PHI we do equivalent of PHINode::hasConstantOrUndefValue().
+  const MachineRegisterInfo &MRI = Phi.getMF()->getRegInfo();
+  Register This = Phi.getOperand(0).getReg();
+  Register ConstantValue;
+  for (unsigned i = 1, e = Phi.getNumOperands(); i < e; i += 2) {
+Register Incoming = Phi.getOperand(i).getReg();
+if (Incoming != This && !isUndef(*MRI.getVRegDef(Incoming))) {
+  if (ConstantValue && ConstantValue != Incoming)
+return false;
+  ConstantValue = Incoming;
+}
+  }
+  return true;
 }
 
 template <>
diff --git 
a/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir 
b/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir
index ce00edf3363f77..9694a340b5e906 100644
--- a/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir
+++ b/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir
@@ -1,24 +1,24 @@
 # RUN: llc -mtriple=amdgcn-- -run-pass=print-machine-uniformity -o - %s 2>&1 | 
FileCheck %s
 # CHECK-LABEL: MachineUniformityInfo for function: hidden_diverge
 # CHECK-LABEL: BLOCK bb.0
-# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s32) = G_INTRINSIC 
intrinsic(@llvm.amdgcn.workitem.id.x)
-# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_ICMP intpred(slt)
-# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_XOR %{{[0-9]*}}:_, 
%{{[0-9]*}}:_
-# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = 
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if)
-# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = 
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if)
-# CHECK: DIVERGENT: G_BRCOND %{{[0-9]*}}:_(s1), %bb.1
-# CHECK: DIVERGENT: G_BR %bb.2
+# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s32) = G_INTRINSIC 
intrinsic(@llvm.amdgcn.workitem.id.x)
+# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_ICMP intpred(slt)
+# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_XOR %{{[0-9]*}}:_, 
%{{[0-9]*}}:_
+# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = 
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if)
+# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = 
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if)
+# CHECK: DIVERGENT: G_BRCOND %{{[0-9]*}}:_(s1), %bb.1
+# CHECK: DIVERGENT: G_BR %bb.2
 # CHECK-LABEL: BLOCK bb.1
 # CHECK-LABEL: BLOCK bb.2
-# CHECK: D

[llvm-branch-commits] [llvm] release/19.x: Bail out jump threading on indirect branches only (#117778) (PR #117869)

2024-11-27 Thread Nikita Popov via llvm-branch-commits


https://github.com/nikic approved this pull request.


https://github.com/llvm/llvm-project/pull/117869
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits



@@ -6758,6 +6793,27 @@ static Expected createOutlinedFunction(
   auto Func =
   Function::Create(FuncType, GlobalValue::InternalLinkage, FuncName, M);
 
+  // Forward target-cpu and target-features function attributes from the
+  // original function to the new outlined function.
+  Function *ParentFn = Builder.GetInsertBlock()->getParent();
+
+  auto TargetCpuAttr = ParentFn->getFnAttribute("target-cpu");
+  if (TargetCpuAttr.isStringAttribute())
+Func->addFnAttr(TargetCpuAttr);
+
+  auto TargetFeaturesAttr = ParentFn->getFnAttribute("target-features");
+  if (TargetFeaturesAttr.isStringAttribute())
+Func->addFnAttr(TargetFeaturesAttr);

skatrak wrote:

Good catch, I moved that to another PR right after this one in the stack.

https://github.com/llvm/llvm-project/pull/116051

[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] LLVM IR translation of host_eval (PR #116052)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits


https://github.com/skatrak edited 
https://github.com/llvm/llvm-project/pull/116052

[llvm-branch-commits] [llvm] [RISCV] Add FeatureDisableLatencySchedHeuristic (PR #115858)

2024-11-27 Thread Michael Maitland via llvm-branch-commits


https://github.com/michaelmaitland approved this pull request.

LGTM


https://github.com/llvm/llvm-project/pull/115858

[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: RegBankLegalize rules for load (PR #112882)

2024-11-27 Thread Petar Avramovic via llvm-branch-commits


https://github.com/petar-avramovic updated 
https://github.com/llvm/llvm-project/pull/112882

>From 59e70ef3cb6b1e9183691782b5675a376add3fbd Mon Sep 17 00:00:00 2001
From: Petar Avramovic 
Date: Wed, 30 Oct 2024 15:37:59 +0100
Subject: [PATCH] AMDGPU/GlobalISel: RegBankLegalize rules for load

Add IDs for bit widths that cover multiple LLTs: B32, B64, etc.
Add a "Predicate" wrapper class for bool predicate functions, used to
write readable rules. Predicates can be combined using &&, || and !.
Add lowering for splitting and widening loads.
Write rules for loads so that existing MIR tests from the old
regbankselect do not change.
---
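A minimal standalone sketch of a predicate wrapper that can be combined with
&&, || and !, as described in the commit message. It assumes a made-up
`FakeInst` type in place of MachineInstr. The class layout here is a guess
for illustration and may differ from the one in the patch.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for the instruction a rule is matched against.
struct FakeInst {
  unsigned SizeInBits;
  bool IsUniform;
};

// Wraps a bool-returning function so that rules read like boolean formulas.
class Predicate {
  std::function<bool(const FakeInst &)> Fn;

public:
  explicit Predicate(std::function<bool(const FakeInst &)> F)
      : Fn(std::move(F)) {}

  bool operator()(const FakeInst &I) const { return Fn(I); }

  // Each combinator captures copies of its operands and evaluates lazily.
  Predicate operator&&(const Predicate &RHS) const {
    return Predicate(
        [L = *this, RHS](const FakeInst &I) { return L(I) && RHS(I); });
  }
  Predicate operator||(const Predicate &RHS) const {
    return Predicate(
        [L = *this, RHS](const FakeInst &I) { return L(I) || RHS(I); });
  }
  Predicate operator!() const {
    return Predicate([L = *this](const FakeInst &I) { return !L(I); });
  }
};
```

A rule such as "uniform and not 32 bits wide" can then be written as
`IsUniform && !IsB32`, where both operands are `Predicate` objects.
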
 .../AMDGPU/AMDGPURegBankLegalizeHelper.cpp| 284 +++-
 .../AMDGPU/AMDGPURegBankLegalizeHelper.h  |   5 +
 .../AMDGPU/AMDGPURegBankLegalizeRules.cpp | 309 -
 .../AMDGPU/AMDGPURegBankLegalizeRules.h   |  65 +++-
 .../AMDGPU/GlobalISel/regbankselect-load.mir  | 320 +++---
 .../GlobalISel/regbankselect-zextload.mir |   9 +-
 6 files changed, 927 insertions(+), 65 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
index 916140e2bbcd68..5c4195cb15fb2c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
@@ -38,6 +38,83 @@ void 
RegBankLegalizeHelper::findRuleAndApplyMapping(MachineInstr &MI) {
   lower(MI, Mapping, WaterfallSgprs);
 }
 
+void RegBankLegalizeHelper::splitLoad(MachineInstr &MI,
+  ArrayRef LLTBreakdown, LLT MergeTy) 
{
+  MachineFunction &MF = B.getMF();
+  assert(MI.getNumMemOperands() == 1);
+  MachineMemOperand &BaseMMO = **MI.memoperands_begin();
+  Register Dst = MI.getOperand(0).getReg();
+  const RegisterBank *DstRB = MRI.getRegBankOrNull(Dst);
+  Register Base = MI.getOperand(1).getReg();
+  LLT PtrTy = MRI.getType(Base);
+  const RegisterBank *PtrRB = MRI.getRegBankOrNull(Base);
+  LLT OffsetTy = LLT::scalar(PtrTy.getSizeInBits());
+  SmallVector LoadPartRegs;
+
+  unsigned ByteOffset = 0;
+  for (LLT PartTy : LLTBreakdown) {
+Register BasePlusOffset;
+if (ByteOffset == 0) {
+  BasePlusOffset = Base;
+} else {
+  auto Offset = B.buildConstant({PtrRB, OffsetTy}, ByteOffset);
+  BasePlusOffset = B.buildPtrAdd({PtrRB, PtrTy}, Base, Offset).getReg(0);
+}
+auto *OffsetMMO = MF.getMachineMemOperand(&BaseMMO, ByteOffset, PartTy);
+auto LoadPart = B.buildLoad({DstRB, PartTy}, BasePlusOffset, *OffsetMMO);
+LoadPartRegs.push_back(LoadPart.getReg(0));
+ByteOffset += PartTy.getSizeInBytes();
+  }
+
+  if (!MergeTy.isValid()) {
+// Loads are of same size, concat or merge them together.
+B.buildMergeLikeInstr(Dst, LoadPartRegs);
+  } else {
+// Loads are not all of same size, need to unmerge them to smaller pieces
+// of MergeTy type, then merge pieces to Dst.
+SmallVector MergeTyParts;
+for (Register Reg : LoadPartRegs) {
+  if (MRI.getType(Reg) == MergeTy) {
+MergeTyParts.push_back(Reg);
+  } else {
+auto Unmerge = B.buildUnmerge({DstRB, MergeTy}, Reg);
+for (unsigned i = 0; i < Unmerge->getNumOperands() - 1; ++i)
+  MergeTyParts.push_back(Unmerge.getReg(i));
+  }
+}
+B.buildMergeLikeInstr(Dst, MergeTyParts);
+  }
+  MI.eraseFromParent();
+}
+
+void RegBankLegalizeHelper::widenLoad(MachineInstr &MI, LLT WideTy,
+  LLT MergeTy) {
+  MachineFunction &MF = B.getMF();
+  assert(MI.getNumMemOperands() == 1);
+  MachineMemOperand &BaseMMO = **MI.memoperands_begin();
+  Register Dst = MI.getOperand(0).getReg();
+  const RegisterBank *DstRB = MRI.getRegBankOrNull(Dst);
+  Register Base = MI.getOperand(1).getReg();
+
+  MachineMemOperand *WideMMO = MF.getMachineMemOperand(&BaseMMO, 0, WideTy);
+  auto WideLoad = B.buildLoad({DstRB, WideTy}, Base, *WideMMO);
+
+  if (WideTy.isScalar()) {
+B.buildTrunc(Dst, WideLoad);
+  } else {
+SmallVector MergeTyParts;
+auto Unmerge = B.buildUnmerge({DstRB, MergeTy}, WideLoad);
+
+LLT DstTy = MRI.getType(Dst);
+unsigned NumElts = DstTy.getSizeInBits() / MergeTy.getSizeInBits();
+for (unsigned i = 0; i < NumElts; ++i) {
+  MergeTyParts.push_back(Unmerge.getReg(i));
+}
+B.buildMergeLikeInstr(Dst, MergeTyParts);
+  }
+  MI.eraseFromParent();
+}
+
 void RegBankLegalizeHelper::lower(MachineInstr &MI,
   const RegBankLLTMapping &Mapping,
   SmallSet &WaterfallSgprs) {
@@ -116,6 +193,50 @@ void RegBankLegalizeHelper::lower(MachineInstr &MI,
 MI.eraseFromParent();
 break;
   }
+  case SplitLoad: {
+LLT DstTy = MRI.getType(MI.getOperand(0).getReg());
+unsigned Size = DstTy.getSizeInBits();
+// Even split to 128-bit loads
+if (Size > 128) {
+  LLT B128;
+  if (DstTy.isVector()) {
+LLT EltTy = DstTy.getElementType();
+B128 = LLT:

[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: AMDGPURegBankLegalize (PR #112864)

2024-11-27 Thread Petar Avramovic via llvm-branch-commits


petar-avramovic wrote:

ping

https://github.com/llvm/llvm-project/pull/112864

[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits


https://github.com/skatrak updated 
https://github.com/llvm/llvm-project/pull/116051

>From f120456cd3200ff82cca63570272d57ba909fe87 Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Wed, 27 Nov 2024 11:33:01 +
Subject: [PATCH 1/2] [OMPIRBuilder] Support runtime number of teams and
 threads, and SPMD mode

This patch introduces a `TargetKernelRuntimeAttrs` structure to hold
host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count values
passed to the runtime kernel offloading call.

Additionally, `createTarget` is extended to take an `IsSPMD` flag, used to
influence target device code generation.
---
 .../llvm/Frontend/OpenMP/OMPIRBuilder.h   |  26 +-
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 125 +++--
 .../Frontend/OpenMPIRBuilderTest.cpp  | 256 +-
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  |  10 +-
 4 files changed, 383 insertions(+), 34 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h 
b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
index da450ef5adbc14..a85f41e586c514 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -2237,6 +2237,26 @@ class OpenMPIRBuilder {
 int32_t MinThreads = 1;
   };
 
+  /// Container to pass LLVM IR runtime values or constants related to the
+  /// number of teams and threads with which the kernel must be launched, as
+  /// well as the trip count of the SPMD loop, if it is an SPMD kernel. These
+  /// must be defined in the host prior to the call to the kernel launch OpenMP
+  /// RTL function.
+  struct TargetKernelRuntimeAttrs {
+SmallVector MaxTeams = {nullptr};
+Value *MinTeams = nullptr;
+SmallVector TargetThreadLimit = {nullptr};
+SmallVector TeamsThreadLimit = {nullptr};
+
+/// 'parallel' construct 'num_threads' clause value, if present and it is a
+/// target SPMD kernel.
+Value *MaxThreads = nullptr;
+
+/// Total number of iterations of the target SPMD kernel or null if it is a
+/// generic kernel.
+Value *LoopTripCount = nullptr;
+  };
+
   /// Data structure that contains the needed information to construct the
   /// kernel args vector.
   struct TargetKernelArgs {
@@ -2905,11 +2925,14 @@ class OpenMPIRBuilder {
   ///
   /// \param Loc where the target data construct was encountered.
   /// \param IsOffloadEntry whether it is an offload entry.
+  /// \param IsSPMD whether it is a target SPMD kernel.
   /// \param CodeGenIP The insertion point where the call to the outlined
   /// function should be emitted.
   /// \param EntryInfo The entry information about the function.
   /// \param DefaultAttrs Structure containing the default numbers of threads
   ///and teams to launch the kernel with.
+  /// \param RuntimeAttrs Structure containing the runtime numbers of threads
+  ///and teams to launch the kernel with.
   /// \param Inputs The input values to the region that will be passed.
   /// as arguments to the outlined function.
   /// \param BodyGenCB Callback that will generate the region code.
@@ -2919,11 +2942,12 @@ class OpenMPIRBuilder {
   // dependency information as passed in the depend clause
   // \param HasNowait Whether the target construct has a `nowait` clause or 
not.
   InsertPointOrErrorTy createTarget(
-  const LocationDescription &Loc, bool IsOffloadEntry,
+  const LocationDescription &Loc, bool IsOffloadEntry, bool IsSPMD,
   OpenMPIRBuilder::InsertPointTy AllocaIP,
   OpenMPIRBuilder::InsertPointTy CodeGenIP,
   TargetRegionEntryInfo &EntryInfo,
   const TargetKernelDefaultAttrs &DefaultAttrs,
+  const TargetKernelRuntimeAttrs &RuntimeAttrs,
   SmallVectorImpl &Inputs, GenMapInfoCallbackTy GenMapInfoCB,
   TargetBodyGenCallbackTy BodyGenCB,
   TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB,
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 302d363965c940..09f794ccf734b3 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -6727,8 +6727,43 @@ FunctionCallee 
OpenMPIRBuilder::createDispatchDeinitFunction() {
   return getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_dispatch_deinit);
 }
 
+static void emitUsed(StringRef Name, std::vector &List,
+ Module &M) {
+  if (List.empty())
+return;
+
+  Type *PtrTy = PointerType::get(M.getContext(), /*AddressSpace=*/0);
+
+  // Convert List to what ConstantArray needs.
+  SmallVector UsedArray;
+  UsedArray.reserve(List.size());
+  for (auto Item : List)
+UsedArray.push_back(ConstantExpr::getPointerBitCastOrAddrSpaceCast(
+cast(&*Item), PtrTy));
+
+  ArrayType *ArrTy = ArrayType::get(PtrTy, UsedArray.size());
+  auto *GV =
+  new GlobalVariable(M, ArrTy, false, llvm::GlobalValue::AppendingLinkage,
+ llvm::ConstantArray::get(ArrTy, UsedArray), Name);
+
+  GV->setSection("

[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits



@@ -6727,8 +6727,43 @@ FunctionCallee 
OpenMPIRBuilder::createDispatchDeinitFunction() {
   return getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_dispatch_deinit);
 }
 
+static void emitUsed(StringRef Name, std::vector &List,
+ Module &M) {
+  if (List.empty())
+return;
+
+  Type *PtrTy = PointerType::get(M.getContext(), /*AddressSpace=*/0);
+
+  // Convert List to what ConstantArray needs.
+  SmallVector UsedArray;
+  UsedArray.reserve(List.size());
+  for (auto Item : List)
+UsedArray.push_back(ConstantExpr::getPointerBitCastOrAddrSpaceCast(
+cast(&*Item), PtrTy));
+
+  ArrayType *ArrTy = ArrayType::get(PtrTy, UsedArray.size());
+  auto *GV =
+  new GlobalVariable(M, ArrTy, false, llvm::GlobalValue::AppendingLinkage,
+ llvm::ConstantArray::get(ArrTy, UsedArray), Name);
+
+  GV->setSection("llvm.metadata");
+}
+
+static void
+emitExecutionMode(OpenMPIRBuilder &OMPBuilder, IRBuilderBase &Builder,
+  StringRef FunctionName, OMPTgtExecModeFlags Mode,
+  std::vector &LLVMCompilerUsed) {
+  auto *Int8Ty = Type::getInt8Ty(Builder.getContext());
+  auto *GVMode = new llvm::GlobalVariable(
+  OMPBuilder.M, Int8Ty, /*isConstant=*/true,
+  llvm::GlobalValue::WeakAnyLinkage, llvm::ConstantInt::get(Int8Ty, Mode),
+  Twine(FunctionName, "_exec_mode"));
+  GVMode->setVisibility(llvm::GlobalVariable::ProtectedVisibility);
+  LLVMCompilerUsed.emplace_back(GVMode);
+}

skatrak wrote:

Moved to OMPIRBuilder and simplified implementation.

https://github.com/llvm/llvm-project/pull/116051

[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits



@@ -7427,13 +7521,17 @@ emitTargetCall(OpenMPIRBuilder &OMPBuilder, 
IRBuilderBase &Builder,
 }
 
 OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTarget(
-const LocationDescription &Loc, bool IsOffloadEntry, InsertPointTy 
AllocaIP,
-InsertPointTy CodeGenIP, TargetRegionEntryInfo &EntryInfo,
+const LocationDescription &Loc, bool IsOffloadEntry, bool IsSPMD,
+InsertPointTy AllocaIP, InsertPointTy CodeGenIP,
+TargetRegionEntryInfo &EntryInfo,
 const TargetKernelDefaultAttrs &DefaultAttrs,
+const TargetKernelRuntimeAttrs &RuntimeAttrs,
 SmallVectorImpl &Args, GenMapInfoCallbackTy GenMapInfoCB,
 OpenMPIRBuilder::TargetBodyGenCallbackTy CBFunc,
 OpenMPIRBuilder::TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB,
 SmallVector Dependencies, bool HasNowait) {
+  assert((!RuntimeAttrs.LoopTripCount || IsSPMD) &&
+ "trip count not expected if IsSPMD=false");

skatrak wrote:

Removed.

https://github.com/llvm/llvm-project/pull/116051

[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits



@@ -6758,6 +6793,27 @@ static Expected createOutlinedFunction(
   auto Func =
   Function::Create(FuncType, GlobalValue::InternalLinkage, FuncName, M);
 
+  // Forward target-cpu and target-features function attributes from the
+  // original function to the new outlined function.
+  Function *ParentFn = Builder.GetInsertBlock()->getParent();
+
+  auto TargetCpuAttr = ParentFn->getFnAttribute("target-cpu");
+  if (TargetCpuAttr.isStringAttribute())
+Func->addFnAttr(TargetCpuAttr);
+
+  auto TargetFeaturesAttr = ParentFn->getFnAttribute("target-features");
+  if (TargetFeaturesAttr.isStringAttribute())
+Func->addFnAttr(TargetFeaturesAttr);
+
+  if (OMPBuilder.Config.isTargetDevice()) {
+std::vector LLVMCompilerUsed;

skatrak wrote:

Done.

https://github.com/llvm/llvm-project/pull/116051

[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits


https://github.com/skatrak edited 
https://github.com/llvm/llvm-project/pull/116051

[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits


https://github.com/skatrak commented:

Thank you @jdoerfert for the review. Your comments should be addressed now.

https://github.com/llvm/llvm-project/pull/116051

[llvm-branch-commits] [llvm] [OMPIRBuilder] Propagate attributes to outlined target regions (PR #117875)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits


https://github.com/skatrak created 
https://github.com/llvm/llvm-project/pull/117875

This patch copies the target-cpu and target-features attributes of functions 
containing target regions into the corresponding outlined function holding the 
target region.

This mirrors what is currently being done for all other outlined functions 
through the `CodeExtractor` in `OpenMPIRBuilder::finalize()`.
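
A minimal standalone sketch of the forwarding rule described above, assuming
function attributes are modeled as a plain string map. `AttrMap` and
`forwardTargetAttrs` are illustrative names only and are not part of the
LLVM API.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <string>

// Hypothetical model of a function's string attributes.
using AttrMap = std::map<std::string, std::string>;

// Copies target-cpu and target-features from the parent function to the
// outlined function, but only when the parent actually carries them.
void forwardTargetAttrs(const AttrMap &Parent, AttrMap &Outlined) {
  for (const char *Key : {"target-cpu", "target-features"}) {
    auto It = Parent.find(Key);
    if (It != Parent.end())
      Outlined[It->first] = It->second;
  }
}
```

Other attributes on the parent are left alone, and a parent without these
two attributes leaves the outlined function unchanged.
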

>From c7ca41f39546949a5c6ae28782f9e2f6585c240b Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Wed, 27 Nov 2024 11:35:28 +
Subject: [PATCH] [OMPIRBuilder] Propagate attributes to outlined target
 regions

This patch copies the target-cpu and target-features attributes of functions
containing target regions into the corresponding outlined function holding the
target region.

This mirrors what is currently being done for all other outlined functions
through the `CodeExtractor` in `OpenMPIRBuilder::finalize()`.
---
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 12 +
 .../Frontend/OpenMPIRBuilderTest.cpp  | 25 +++
 2 files changed, 37 insertions(+)

diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 73f221c07af746..8da3fa52b14af4 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -6768,6 +6768,18 @@ static Expected createOutlinedFunction(
   auto Func =
   Function::Create(FuncType, GlobalValue::InternalLinkage, FuncName, M);
 
+  // Forward target-cpu and target-features function attributes from the
+  // original function to the new outlined function.
+  Function *ParentFn = Builder.GetInsertBlock()->getParent();
+
+  auto TargetCpuAttr = ParentFn->getFnAttribute("target-cpu");
+  if (TargetCpuAttr.isStringAttribute())
+Func->addFnAttr(TargetCpuAttr);
+
+  auto TargetFeaturesAttr = ParentFn->getFnAttribute("target-features");
+  if (TargetFeaturesAttr.isStringAttribute())
+Func->addFnAttr(TargetFeaturesAttr);
+
   if (OMPBuilder.Config.isTargetDevice()) {
 Value *ExecMode = OMPBuilder.emitKernelExecutionMode(
 FuncName, IsSPMD ? OMP_TGT_EXEC_MODE_SPMD : OMP_TGT_EXEC_MODE_GENERIC);
diff --git a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp 
b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
index e4845256633b9c..d114b5372156af 100644
--- a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
+++ b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
@@ -6122,6 +6122,8 @@ TEST_F(OpenMPIRBuilderTest, TargetRegion) {
   OpenMPIRBuilderConfig Config(false, false, false, false, false, false, 
false);
   OMPBuilder.setConfig(Config);
   F->setName("func");
+  F->addFnAttr("target-cpu", "x86-64");
+  F->addFnAttr("target-features", "+mmx,+sse");
   IRBuilder<> Builder(BB);
   auto *Int32Ty = Builder.getInt32Ty();
 
@@ -6269,6 +6271,13 @@ TEST_F(OpenMPIRBuilderTest, TargetRegion) {
   StringRef FunctionName2 = OutlinedFunc->getName();
   EXPECT_TRUE(FunctionName2.starts_with("__omp_offloading"));
 
+  // Check that target-cpu and target-features were propagated to the outlined
+  // function
+  EXPECT_EQ(OutlinedFunc->getFnAttribute("target-cpu"),
+F->getFnAttribute("target-cpu"));
+  EXPECT_EQ(OutlinedFunc->getFnAttribute("target-features"),
+F->getFnAttribute("target-features"));
+
   EXPECT_FALSE(verifyModule(*M, &errs()));
 }
 
@@ -6279,6 +6288,8 @@ TEST_F(OpenMPIRBuilderTest, TargetRegionDevice) {
   OMPBuilder.initialize();
 
   F->setName("func");
+  F->addFnAttr("target-cpu", "gfx90a");
+  F->addFnAttr("target-features", "+gfx9-insts,+wavefrontsize64");
   IRBuilder<> Builder(BB);
   OpenMPIRBuilder::LocationDescription Loc({Builder.saveIP(), DL});
 
@@ -6355,6 +6366,13 @@ TEST_F(OpenMPIRBuilderTest, TargetRegionDevice) {
   Function *OutlinedFn = TargetStore->getFunction();
   EXPECT_NE(F, OutlinedFn);
 
+  // Check that target-cpu and target-features were propagated to the outlined
+  // function
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-cpu"),
+F->getFnAttribute("target-cpu"));
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-features"),
+F->getFnAttribute("target-features"));
+
   EXPECT_TRUE(OutlinedFn->hasWeakODRLinkage());
   // Account for the "implicit" first argument.
   EXPECT_EQ(OutlinedFn->getName(), "__omp_offloading_1_2_parent_l3");
@@ -6594,6 +6612,13 @@ TEST_F(OpenMPIRBuilderTest, TargetRegionDeviceSPMD) {
   EXPECT_NE(OutlinedFn, nullptr);
   EXPECT_NE(F, OutlinedFn);
 
+  // Check that target-cpu and target-features were propagated to the outlined
+  // function
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-cpu"),
+F->getFnAttribute("target-cpu"));
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-features"),
+F->getFnAttribute("target-features"));
+
   EXPECT_TRUE(OutlinedFn->hasWeakODRLinkage());
   // Account for the "implicit" first argument.
   EXPECT_EQ(OutlinedFn->getName(), "__omp_offloading_1_2_parent_l3");


[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (PR #116051)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits



@@ -6378,6 +6439,204 @@ TEST_F(OpenMPIRBuilderTest, TargetRegionDevice) {
   auto *ExitBlock = EntryBlockBranch->getSuccessor(1);
   EXPECT_EQ(ExitBlock->getName(), "worker.exit");
   EXPECT_TRUE(isa(ExitBlock->getFirstNonPHI()));
+
+  // Check global exec_mode.
+  GlobalVariable *Used = M->getGlobalVariable("llvm.compiler.used");
+  EXPECT_NE(Used, nullptr);
+  Constant *UsedInit = Used->getInitializer();
+  EXPECT_NE(UsedInit, nullptr);
+  EXPECT_TRUE(isa(UsedInit));
+  auto *UsedInitData = cast(UsedInit);
+  EXPECT_EQ(1U, UsedInitData->getNumOperands());
+  Constant *ExecMode = UsedInitData->getOperand(0);
+  EXPECT_TRUE(isa(ExecMode));
+  Constant *ExecModeValue = cast(ExecMode)->getInitializer();
+  EXPECT_NE(ExecModeValue, nullptr);
+  EXPECT_TRUE(isa(ExecModeValue));
+  EXPECT_EQ(OMP_TGT_EXEC_MODE_GENERIC,
+cast(ExecModeValue)->getZExtValue());
+}
+
+TEST_F(OpenMPIRBuilderTest, TargetRegionSPMD) {
+  using InsertPointTy = OpenMPIRBuilder::InsertPointTy;
+  OpenMPIRBuilder OMPBuilder(*M);
+  OMPBuilder.initialize();
+  OpenMPIRBuilderConfig Config(/*IsTargetDevice=*/false, /*IsGPU=*/false,
+   /*OpenMPOffloadMandatory=*/false,
+   /*HasRequiresReverseOffload=*/false,
+   /*HasRequiresUnifiedAddress=*/false,
+   /*HasRequiresUnifiedSharedMemory=*/false,
+   /*HasRequiresDynamicAllocators=*/false);
+  OMPBuilder.setConfig(Config);
+  F->setName("func");
+  IRBuilder<> Builder(BB);
+
+  auto BodyGenCB = [&](InsertPointTy,
+   InsertPointTy CodeGenIP) -> InsertPointTy {
+Builder.restoreIP(CodeGenIP);
+return Builder.saveIP();
+  };
+
+  auto SimpleArgAccessorCB =
+  [&](llvm::Argument &, llvm::Value *, llvm::Value *&,
+  llvm::OpenMPIRBuilder::InsertPointTy,
+  llvm::OpenMPIRBuilder::InsertPointTy CodeGenIP) {

skatrak wrote:

Done.

https://github.com/llvm/llvm-project/pull/116051

[llvm-branch-commits] [flang] [llvm] [mlir] [MLIR][OpenMP] LLVM IR translation of host_eval (PR #116052)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits


https://github.com/skatrak updated 
https://github.com/llvm/llvm-project/pull/116052

>From c7ca41f39546949a5c6ae28782f9e2f6585c240b Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Wed, 27 Nov 2024 11:35:28 +
Subject: [PATCH 1/2] [OMPIRBuilder] Propagate attributes to outlined target
 regions

This patch copies the target-cpu and target-features attributes of functions
containing target regions into the corresponding outlined function holding the
target region.

This mirrors what is currently being done for all other outlined functions
through the `CodeExtractor` in `OpenMPIRBuilder::finalize()`.
---
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 12 +
 .../Frontend/OpenMPIRBuilderTest.cpp  | 25 +++
 2 files changed, 37 insertions(+)

diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 73f221c07af746..8da3fa52b14af4 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -6768,6 +6768,18 @@ static Expected createOutlinedFunction(
   auto Func =
   Function::Create(FuncType, GlobalValue::InternalLinkage, FuncName, M);
 
+  // Forward target-cpu and target-features function attributes from the
+  // original function to the new outlined function.
+  Function *ParentFn = Builder.GetInsertBlock()->getParent();
+
+  auto TargetCpuAttr = ParentFn->getFnAttribute("target-cpu");
+  if (TargetCpuAttr.isStringAttribute())
+Func->addFnAttr(TargetCpuAttr);
+
+  auto TargetFeaturesAttr = ParentFn->getFnAttribute("target-features");
+  if (TargetFeaturesAttr.isStringAttribute())
+Func->addFnAttr(TargetFeaturesAttr);
+
   if (OMPBuilder.Config.isTargetDevice()) {
 Value *ExecMode = OMPBuilder.emitKernelExecutionMode(
 FuncName, IsSPMD ? OMP_TGT_EXEC_MODE_SPMD : OMP_TGT_EXEC_MODE_GENERIC);
diff --git a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp 
b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
index e4845256633b9c..d114b5372156af 100644
--- a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
+++ b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
@@ -6122,6 +6122,8 @@ TEST_F(OpenMPIRBuilderTest, TargetRegion) {
   OpenMPIRBuilderConfig Config(false, false, false, false, false, false, 
false);
   OMPBuilder.setConfig(Config);
   F->setName("func");
+  F->addFnAttr("target-cpu", "x86-64");
+  F->addFnAttr("target-features", "+mmx,+sse");
   IRBuilder<> Builder(BB);
   auto *Int32Ty = Builder.getInt32Ty();
 
@@ -6269,6 +6271,13 @@ TEST_F(OpenMPIRBuilderTest, TargetRegion) {
   StringRef FunctionName2 = OutlinedFunc->getName();
   EXPECT_TRUE(FunctionName2.starts_with("__omp_offloading"));
 
+  // Check that target-cpu and target-features were propagated to the outlined
+  // function
+  EXPECT_EQ(OutlinedFunc->getFnAttribute("target-cpu"),
+F->getFnAttribute("target-cpu"));
+  EXPECT_EQ(OutlinedFunc->getFnAttribute("target-features"),
+F->getFnAttribute("target-features"));
+
   EXPECT_FALSE(verifyModule(*M, &errs()));
 }
 
@@ -6279,6 +6288,8 @@ TEST_F(OpenMPIRBuilderTest, TargetRegionDevice) {
   OMPBuilder.initialize();
 
   F->setName("func");
+  F->addFnAttr("target-cpu", "gfx90a");
+  F->addFnAttr("target-features", "+gfx9-insts,+wavefrontsize64");
   IRBuilder<> Builder(BB);
   OpenMPIRBuilder::LocationDescription Loc({Builder.saveIP(), DL});
 
@@ -6355,6 +6366,13 @@ TEST_F(OpenMPIRBuilderTest, TargetRegionDevice) {
   Function *OutlinedFn = TargetStore->getFunction();
   EXPECT_NE(F, OutlinedFn);
 
+  // Check that target-cpu and target-features were propagated to the outlined
+  // function
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-cpu"),
+F->getFnAttribute("target-cpu"));
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-features"),
+F->getFnAttribute("target-features"));
+
   EXPECT_TRUE(OutlinedFn->hasWeakODRLinkage());
   // Account for the "implicit" first argument.
   EXPECT_EQ(OutlinedFn->getName(), "__omp_offloading_1_2_parent_l3");
@@ -6594,6 +6612,13 @@ TEST_F(OpenMPIRBuilderTest, TargetRegionDeviceSPMD) {
   EXPECT_NE(OutlinedFn, nullptr);
   EXPECT_NE(F, OutlinedFn);
 
+  // Check that target-cpu and target-features were propagated to the outlined
+  // function
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-cpu"),
+F->getFnAttribute("target-cpu"));
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-features"),
+F->getFnAttribute("target-features"));
+
   EXPECT_TRUE(OutlinedFn->hasWeakODRLinkage());
   // Account for the "implicit" first argument.
   EXPECT_EQ(OutlinedFn->getName(), "__omp_offloading_1_2_parent_l3");

>From 58bd5ffe4d7c640866aca0af3faef0dcefa48cbb Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Tue, 12 Nov 2024 10:49:28 +
Subject: [PATCH 2/2] [MLIR][OpenMP] LLVM IR translation of host_eval

This patch adds support for processing the `host_eval` clause of `omp.target`
to populate default and runtime kernel launch attributes. Specifically,

[llvm-branch-commits] [llvm] [OMPIRBuilder] Propagate attributes to outlined target regions (PR #117875)

2024-11-27 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-flang-openmp

Author: Sergio Afonso (skatrak)


Changes

This patch copies the target-cpu and target-features attributes of functions 
containing target regions into the corresponding outlined function holding the 
target region.

This mirrors what is currently being done for all other outlined functions 
through the `CodeExtractor` in `OpenMPIRBuilder::finalize()`.
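
The forwarding rule described above can be sketched without LLVM. This is a minimal model, assuming a hypothetical `ToyFunction` type with a plain string-to-string attribute map; it is not `llvm::Function`, and the helper names `forwardAttr` and `propagateTargetAttrs` are made up for illustration. It only shows the guard the patch relies on: an attribute is copied to the outlined function only if the parent actually carries it.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a function with string attributes.
// This is NOT llvm::Function; it only models the idea.
struct ToyFunction {
  std::string name;
  std::map<std::string, std::string> attrs;
};

// Copy attribute `key` from the parent to the outlined function, but only
// when the parent has it (analogous to the isStringAttribute() check).
inline void forwardAttr(const ToyFunction &parent, ToyFunction &outlined,
                        const std::string &key) {
  auto it = parent.attrs.find(key);
  if (it != parent.attrs.end())
    outlined.attrs[key] = it->second;
}

// Forward exactly the two attributes named in the patch description.
inline void propagateTargetAttrs(const ToyFunction &parent,
                                 ToyFunction &outlined) {
  assert(&parent != &outlined && "outlined function must be a distinct object");
  forwardAttr(parent, outlined, "target-cpu");
  forwardAttr(parent, outlined, "target-features");
}
```

In this model, a parent without either attribute leaves the outlined function untouched, and unrelated parent attributes are never copied.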

---
Full diff: https://github.com/llvm/llvm-project/pull/117875.diff


2 Files Affected:

- (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+12) 
- (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+25) 


``diff
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 73f221c07af746..8da3fa52b14af4 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -6768,6 +6768,18 @@ static Expected<Function *> createOutlinedFunction(
   auto Func =
       Function::Create(FuncType, GlobalValue::InternalLinkage, FuncName, M);
 
+  // Forward target-cpu and target-features function attributes from the
+  // original function to the new outlined function.
+  Function *ParentFn = Builder.GetInsertBlock()->getParent();
+
+  auto TargetCpuAttr = ParentFn->getFnAttribute("target-cpu");
+  if (TargetCpuAttr.isStringAttribute())
+    Func->addFnAttr(TargetCpuAttr);
+
+  auto TargetFeaturesAttr = ParentFn->getFnAttribute("target-features");
+  if (TargetFeaturesAttr.isStringAttribute())
+    Func->addFnAttr(TargetFeaturesAttr);
+
   if (OMPBuilder.Config.isTargetDevice()) {
     Value *ExecMode = OMPBuilder.emitKernelExecutionMode(
         FuncName, IsSPMD ? OMP_TGT_EXEC_MODE_SPMD : OMP_TGT_EXEC_MODE_GENERIC);
diff --git a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
index e4845256633b9c..d114b5372156af 100644
--- a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
+++ b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
@@ -6122,6 +6122,8 @@ TEST_F(OpenMPIRBuilderTest, TargetRegion) {
   OpenMPIRBuilderConfig Config(false, false, false, false, false, false, false);
   OMPBuilder.setConfig(Config);
   F->setName("func");
+  F->addFnAttr("target-cpu", "x86-64");
+  F->addFnAttr("target-features", "+mmx,+sse");
   IRBuilder<> Builder(BB);
   auto *Int32Ty = Builder.getInt32Ty();
 
@@ -6269,6 +6271,13 @@ TEST_F(OpenMPIRBuilderTest, TargetRegion) {
   StringRef FunctionName2 = OutlinedFunc->getName();
   EXPECT_TRUE(FunctionName2.starts_with("__omp_offloading"));
 
+  // Check that target-cpu and target-features were propagated to the outlined
+  // function
+  EXPECT_EQ(OutlinedFunc->getFnAttribute("target-cpu"),
+            F->getFnAttribute("target-cpu"));
+  EXPECT_EQ(OutlinedFunc->getFnAttribute("target-features"),
+            F->getFnAttribute("target-features"));
+
   EXPECT_FALSE(verifyModule(*M, &errs()));
 }
 
@@ -6279,6 +6288,8 @@ TEST_F(OpenMPIRBuilderTest, TargetRegionDevice) {
   OMPBuilder.initialize();
 
   F->setName("func");
+  F->addFnAttr("target-cpu", "gfx90a");
+  F->addFnAttr("target-features", "+gfx9-insts,+wavefrontsize64");
   IRBuilder<> Builder(BB);
   OpenMPIRBuilder::LocationDescription Loc({Builder.saveIP(), DL});
 
@@ -6355,6 +6366,13 @@ TEST_F(OpenMPIRBuilderTest, TargetRegionDevice) {
   Function *OutlinedFn = TargetStore->getFunction();
   EXPECT_NE(F, OutlinedFn);
 
+  // Check that target-cpu and target-features were propagated to the outlined
+  // function
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-cpu"),
+            F->getFnAttribute("target-cpu"));
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-features"),
+            F->getFnAttribute("target-features"));
+
   EXPECT_TRUE(OutlinedFn->hasWeakODRLinkage());
   // Account for the "implicit" first argument.
   EXPECT_EQ(OutlinedFn->getName(), "__omp_offloading_1_2_parent_l3");
@@ -6594,6 +6612,13 @@ TEST_F(OpenMPIRBuilderTest, TargetRegionDeviceSPMD) {
   EXPECT_NE(OutlinedFn, nullptr);
   EXPECT_NE(F, OutlinedFn);
 
+  // Check that target-cpu and target-features were propagated to the outlined
+  // function
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-cpu"),
+            F->getFnAttribute("target-cpu"));
+  EXPECT_EQ(OutlinedFn->getFnAttribute("target-features"),
+            F->getFnAttribute("target-features"));
+
   EXPECT_TRUE(OutlinedFn->hasWeakODRLinkage());
   // Account for the "implicit" first argument.
   EXPECT_EQ(OutlinedFn->getName(), "__omp_offloading_1_2_parent_l3");

``




https://github.com/llvm/llvm-project/pull/117875
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [OMPIRBuilder] Propagate attributes to outlined target regions (PR #117875)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits


skatrak wrote:

PR stack:
- #116048
- #116049
- #116050
- #116051
- #117875
- #116052
- #116219

https://github.com/llvm/llvm-project/pull/117875

[llvm-branch-commits] [flang] [mlir] [Flang][OpenMP] Lowering of host-evaluated clauses (PR #116219)

2024-11-27 Thread Sergio Afonso via llvm-branch-commits


https://github.com/skatrak updated 
https://github.com/llvm/llvm-project/pull/116219

>From 581746291e48096b40b4b059cd52d44c9629f2dd Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Thu, 14 Nov 2024 12:24:15 +
Subject: [PATCH] [Flang][OpenMP] Lowering of host-evaluated clauses

This patch adds support for lowering OpenMP clauses and expressions that are
attached to constructs nested inside a target region but must be evaluated on
the host. This is done through the set of `omp.target` operands and entry block
arguments defined by the `OpenMP_HostEvalClause`.

When lowering clauses for a target construct, a more involved
`processHostEvalClauses()` function is called, which looks at the current and
potentially other nested constructs in order to find and lower clauses that
need to be processed outside of the `omp.target` operation under construction.
This populates an instance of a global structure with the resulting MLIR
values.

The resulting list of host-evaluated values is used to initialize the
`host_eval` operands when constructing the `omp.target` operation, and then
replaced with the corresponding block arguments after creating that operation's
region.

Afterwards, while lowering nested operations, those that might be evaluated on
the host (e.g. `num_teams`, `thread_limit`, `num_threads` and `collapse`) first
check whether there is an active global host-evaluated information structure
and whether it holds values for these clauses. If so, the stored values (which
at that stage refer to `omp.target` entry block arguments) are used instead of
lowering the clauses again.
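
The collect-then-rebind flow described above can be sketched in plain C++. This is only an illustration under stated assumptions: `Value` is a hypothetical integer stand-in for `mlir::Value` (0 meaning "unset"), the `Toy*` types are invented here, and the body of `bindOperands()` is a guess at the behaviour implied by the patch's comment, not the actual implementation. Only the fields visible in the diff below (loop bounds, steps and the two `num_teams` values) are modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for mlir::Value; 0 means "no value".
using Value = int;

struct ToyHostEvalOps {
  std::vector<Value> loopLowerBounds, loopUpperBounds, loopSteps;
  Value numTeamsLower = 0;
  Value numTeamsUpper = 0;
};

struct ToyHostEvalInfo {
  ToyHostEvalOps ops;

  // Append the host-evaluated values in a fixed order; these would become
  // the host_eval operands of the target operation.
  void collectValues(std::vector<Value> &vars) const {
    vars.insert(vars.end(), ops.loopLowerBounds.begin(),
                ops.loopLowerBounds.end());
    vars.insert(vars.end(), ops.loopUpperBounds.begin(),
                ops.loopUpperBounds.end());
    vars.insert(vars.end(), ops.loopSteps.begin(), ops.loopSteps.end());
    if (ops.numTeamsLower)
      vars.push_back(ops.numTeamsLower);
    if (ops.numTeamsUpper)
      vars.push_back(ops.numTeamsUpper);
  }

  // Replace each stored value with the matching entry block argument,
  // walking the fields in the same order as collectValues().
  void bindOperands(const std::vector<Value> &args) {
    std::size_t i = 0;
    auto rebind = [&](Value &v) {
      assert(i < args.size() && "too few block arguments");
      v = args[i++];
    };
    for (Value &v : ops.loopLowerBounds) rebind(v);
    for (Value &v : ops.loopUpperBounds) rebind(v);
    for (Value &v : ops.loopSteps) rebind(v);
    if (ops.numTeamsLower) rebind(ops.numTeamsLower);
    if (ops.numTeamsUpper) rebind(ops.numTeamsUpper);
    assert(i == args.size() && "every block argument must be consumed");
  }
};
```

The point of the design is that both functions traverse the fields in the same order, so the i-th collected operand always corresponds to the i-th block argument, and unset optional values are skipped consistently on both sides.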
---
 flang/lib/Lower/OpenMP/OpenMP.cpp | 463 --
 flang/test/Lower/OpenMP/host-eval.f90 | 157 ++
 flang/test/Lower/OpenMP/target-spmd.f90   | 191 
 .../Dialect/OpenMP/OpenMPClauseOperands.h |   6 +
 4 files changed, 788 insertions(+), 29 deletions(-)
 create mode 100644 flang/test/Lower/OpenMP/host-eval.f90
 create mode 100644 flang/test/Lower/OpenMP/target-spmd.f90

diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 91f99ba4b0ca55..a8a203e531b1e9 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -45,6 +45,19 @@ using namespace Fortran::lower::omp;
 // Code generation helper functions
 //===----------------------------------------------------------------------===//
 
+static void genOMPDispatch(lower::AbstractConverter &converter,
+                           lower::SymMap &symTable,
+                           semantics::SemanticsContext &semaCtx,
+                           lower::pft::Evaluation &eval, mlir::Location loc,
+                           const ConstructQueue &queue,
+                           ConstructQueue::const_iterator item);
+
+static void processHostEvalClauses(lower::AbstractConverter &converter,
+                                   semantics::SemanticsContext &semaCtx,
+                                   lower::StatementContext &stmtCtx,
+                                   lower::pft::Evaluation &eval,
+                                   mlir::Location loc);
+
 namespace {
 /// Structure holding the information needed to create and bind entry block
 /// arguments associated to a single clause.
@@ -63,6 +76,7 @@ struct EntryBlockArgsEntry {
 /// Structure holding the information needed to create and bind entry block
 /// arguments associated to all clauses that can define them.
 struct EntryBlockArgs {
+  llvm::ArrayRef<mlir::Value> hostEvalVars;
   EntryBlockArgsEntry inReduction;
   EntryBlockArgsEntry map;
   EntryBlockArgsEntry priv;
@@ -85,18 +99,146 @@ struct EntryBlockArgs {
 
   auto getVars() const {
     return llvm::concat<const mlir::Value>(
-        inReduction.vars, map.vars, priv.vars, reduction.vars,
+        hostEvalVars, inReduction.vars, map.vars, priv.vars, reduction.vars,
         taskReduction.vars, useDeviceAddr.vars, useDevicePtr.vars);
   }
 };
+
+/// Structure holding information that is needed to pass host-evaluated
+/// information to later lowering stages.
+class HostEvalInfo {
+public:
+  // Allow this function access to private members in order to initialize them.
+  friend void ::processHostEvalClauses(lower::AbstractConverter &,
+                                       semantics::SemanticsContext &,
+                                       lower::StatementContext &,
+                                       lower::pft::Evaluation &,
+                                       mlir::Location);
+
+  /// Fill \c vars with values stored in \c ops.
+  ///
+  /// The order in which values are stored matches the one expected by \see
+  /// bindOperands().
+  void collectValues(llvm::SmallVectorImpl<mlir::Value> &vars) const {
+    vars.append(ops.loopLowerBounds);
+    vars.append(ops.loopUpperBounds);
+    vars.append(ops.loopSteps);
+
+    if (ops.numTeamsLower)
+      vars.push_back(ops.numTeamsLower);
+
+    if (ops.numTeamsUpper)
+      vars.push_back(ops.numTeamsUpper);

[llvm-branch-commits] [BOLT][WIP] Support ret-converted call-cont fallthru in BAT mode (PR #115334)

2024-11-27 Thread Amir Ayupov via llvm-branch-commits


https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/115334



[llvm-branch-commits] [BOLT] Encode landing pads in BAT (PR #114602)

2024-11-27 Thread Amir Ayupov via llvm-branch-commits


https://github.com/aaupov closed 
https://github.com/llvm/llvm-project/pull/114602
