[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)

2024-01-29 Thread Andrei Golubev via llvm-branch-commits

andrey-golubev wrote:

> Here we have to wait for the build bots, as they are mandatory.

The build bots have finished, so could someone please press the merge button? Thanks in advance!

https://github.com/llvm/llvm-project/pull/79603
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325) (PR #79689)

2024-01-29 Thread Jay Foad via llvm-branch-commits

jayfoad wrote:

@tstellar does this backport PR look OK? I created it with
`gh pr create -f -B release/18.x` and I wasn't sure if I had to edit
anything, apart from adding the release milestone.

https://github.com/llvm/llvm-project/pull/79689


[llvm-branch-commits] [llvm] [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325) (PR #79689)

2024-01-29 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad closed 
https://github.com/llvm/llvm-project/pull/79689


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)

2024-01-29 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/79813

resolves llvm/llvm-project#79800

From 0f9256b72bcda1b0444bd302aa22ede428c73a54 Mon Sep 17 00:00:00 2001
From: David Sherwood <57997763+david-...@users.noreply.github.com>
Date: Fri, 26 Jan 2024 14:43:48 +
Subject: [PATCH] [LoopVectorize] Refine runtime memory check costs when there
 is an outer loop (#76034)

When we generate runtime memory checks for an inner loop it's
possible that these checks are invariant in the outer loop and
so will get hoisted out. In such cases, the effective cost of
the checks should reduce to reflect the outer loop trip count.

This fixes a 25% performance regression introduced by commit

49b0e6dcc296792b577ae8f0f674e61a0929b99d

when building the SPEC2017 x264 benchmark with PGO, where we
decided the inner loop trip count wasn't high enough to warrant
the (incorrect) high cost of the runtime checks. Also, when
runtime memory checks consist entirely of diff checks these are
likely to be outer loop invariant.

(cherry picked from commit 962fbafecf4730ba84a3b9fd7a662a5c30bb2c7c)
---
 .../Transforms/Vectorize/LoopVectorize.cpp|  62 -
 .../AArch64/low_trip_memcheck_cost.ll | 217 ++
 2 files changed, 273 insertions(+), 6 deletions(-)
 create mode 100644 llvm/test/Transforms/LoopVectorize/AArch64/low_trip_memcheck_cost.ll

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 6ca93e15719fb27..dd596c567cd4824 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1957,6 +1957,8 @@ class GeneratedRTChecks {
   bool CostTooHigh = false;
   const bool AddBranchWeights;
 
+  Loop *OuterLoop = nullptr;
+
 public:
   GeneratedRTChecks(ScalarEvolution &SE, DominatorTree *DT, LoopInfo *LI,
 TargetTransformInfo *TTI, const DataLayout &DL,
@@ -2053,6 +2055,9 @@ class GeneratedRTChecks {
   DT->eraseNode(SCEVCheckBlock);
   LI->removeBlock(SCEVCheckBlock);
 }
+
+// Outer loop is used as part of the later cost calculations.
+OuterLoop = L->getParentLoop();
   }
 
   InstructionCost getCost() {
@@ -2076,16 +2081,61 @@ class GeneratedRTChecks {
 LLVM_DEBUG(dbgs() << "  " << C << "  for " << I << "\n");
 RTCheckCost += C;
   }
-if (MemCheckBlock)
+if (MemCheckBlock) {
+  InstructionCost MemCheckCost = 0;
   for (Instruction &I : *MemCheckBlock) {
 if (MemCheckBlock->getTerminator() == &I)
   continue;
 InstructionCost C =
 TTI->getInstructionCost(&I, TTI::TCK_RecipThroughput);
 LLVM_DEBUG(dbgs() << "  " << C << "  for " << I << "\n");
-RTCheckCost += C;
+MemCheckCost += C;
   }
 
+  // If the runtime memory checks are being created inside an outer loop
+  // we should find out if these checks are outer loop invariant. If so,
+  // the checks will likely be hoisted out and so the effective cost will
+  // reduce according to the outer loop trip count.
+  if (OuterLoop) {
+ScalarEvolution *SE = MemCheckExp.getSE();
+// TODO: If profitable, we could refine this further by analysing every
+// individual memory check, since there could be a mixture of loop
+// variant and invariant checks that mean the final condition is
+// variant.
+const SCEV *Cond = SE->getSCEV(MemRuntimeCheckCond);
+if (SE->isLoopInvariant(Cond, OuterLoop)) {
+  // It seems reasonable to assume that we can reduce the effective
+  // cost of the checks even when we know nothing about the trip
+  // count. Assume that the outer loop executes at least twice.
+  unsigned BestTripCount = 2;
+
+  // If exact trip count is known use that.
+  if (unsigned SmallTC = SE->getSmallConstantTripCount(OuterLoop))
+BestTripCount = SmallTC;
+  else if (LoopVectorizeWithBlockFrequency) {
+// Else use profile data if available.
+if (auto EstimatedTC = getLoopEstimatedTripCount(OuterLoop))
+  BestTripCount = *EstimatedTC;
+  }
+
+  InstructionCost NewMemCheckCost = MemCheckCost / BestTripCount;
+
+  // Let's ensure the cost is always at least 1.
+  NewMemCheckCost = std::max(*NewMemCheckCost.getValue(),
+ (InstructionCost::CostType)1);
+
+  LLVM_DEBUG(dbgs()
+ << "We expect runtime memory checks to be hoisted "
+ << "out of the outer loop. Cost reduced from "
+ << MemCheckCost << " to " << NewMemCheckCost << '\n');
+
+  MemCheckCost = NewMemCheckCost;
+}
+  }
+
+  RTCheckCost += MemCheckCost;
+}
+
 if (SCEVCheckBlock || MemCheckBlock)
   LLVM_DEBUG(dbgs() << "Total cost of runtime checks: " << RTCheckCost
 

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)

2024-01-29 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/79813


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:

@david-arm What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/79813


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#79800

---
Full diff: https://github.com/llvm/llvm-project/pull/79813.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+56-6) 
- (added) llvm/test/Transforms/LoopVectorize/AArch64/low_trip_memcheck_cost.ll 
(+217) 
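
The cost adjustment made by the LoopVectorize change in this PR can be sketched in isolation. The following is a minimal, self-contained sketch and not the LLVM implementation: `OuterLoopInfo` and `effectiveMemCheckCost` are hypothetical names standing in for the `ScalarEvolution` and profile queries used in the patch, and plain integers replace `InstructionCost`. It assumes the memory checks have already been found to be invariant in the outer loop.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for what the patch queries from ScalarEvolution and
// profile data; these names are illustrative and not part of LLVM.
struct OuterLoopInfo {
  unsigned ConstantTripCount = 0;            // 0 means "not known exactly"
  std::optional<unsigned> ProfiledTripCount; // empty means "no profile data"
};

// Effective cost of memory checks that are invariant in the outer loop.
std::int64_t effectiveMemCheckCost(std::int64_t MemCheckCost,
                                   const OuterLoopInfo &Outer,
                                   bool UseProfile) {
  // With no trip count information, assume the outer loop runs at least twice.
  unsigned BestTripCount = 2;
  if (Outer.ConstantTripCount != 0)
    BestTripCount = Outer.ConstantTripCount;
  // The "> 0" guard is an addition of this sketch to avoid dividing by zero.
  else if (UseProfile && Outer.ProfiledTripCount &&
           *Outer.ProfiledTripCount > 0)
    BestTripCount = *Outer.ProfiledTripCount;

  // Spread the cost over the outer iterations, but never report less than 1.
  return std::max<std::int64_t>(
      MemCheckCost / static_cast<std::int64_t>(BestTripCount), 1);
}
```

For example, a check cost of 12 with an exact outer trip count of 4 yields an effective cost of 3, while the same cost with no trip count information falls back to the assumed two iterations and yields 6.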


``diff
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 6ca93e15719fb27..dd596c567cd4824 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1957,6 +1957,8 @@ class GeneratedRTChecks {
   bool CostTooHigh = false;
   const bool AddBranchWeights;
 
+  Loop *OuterLoop = nullptr;
+
 public:
   GeneratedRTChecks(ScalarEvolution &SE, DominatorTree *DT, LoopInfo *LI,
 TargetTransformInfo *TTI, const DataLayout &DL,
@@ -2053,6 +2055,9 @@ class GeneratedRTChecks {
   DT->eraseNode(SCEVCheckBlock);
   LI->removeBlock(SCEVCheckBlock);
 }
+
+// Outer loop is used as part of the later cost calculations.
+OuterLoop = L->getParentLoop();
   }
 
   InstructionCost getCost() {
@@ -2076,16 +2081,61 @@ class GeneratedRTChecks {
 LLVM_DEBUG(dbgs() << "  " << C << "  for " << I << "\n");
 RTCheckCost += C;
   }
-if (MemCheckBlock)
+if (MemCheckBlock) {
+  InstructionCost MemCheckCost = 0;
   for (Instruction &I : *MemCheckBlock) {
 if (MemCheckBlock->getTerminator() == &I)
   continue;
 InstructionCost C =
 TTI->getInstructionCost(&I, TTI::TCK_RecipThroughput);
 LLVM_DEBUG(dbgs() << "  " << C << "  for " << I << "\n");
-RTCheckCost += C;
+MemCheckCost += C;
   }
 
+  // If the runtime memory checks are being created inside an outer loop
+  // we should find out if these checks are outer loop invariant. If so,
+  // the checks will likely be hoisted out and so the effective cost will
+  // reduce according to the outer loop trip count.
+  if (OuterLoop) {
+ScalarEvolution *SE = MemCheckExp.getSE();
+// TODO: If profitable, we could refine this further by analysing every
+// individual memory check, since there could be a mixture of loop
+// variant and invariant checks that mean the final condition is
+// variant.
+const SCEV *Cond = SE->getSCEV(MemRuntimeCheckCond);
+if (SE->isLoopInvariant(Cond, OuterLoop)) {
+  // It seems reasonable to assume that we can reduce the effective
+  // cost of the checks even when we know nothing about the trip
+  // count. Assume that the outer loop executes at least twice.
+  unsigned BestTripCount = 2;
+
+  // If exact trip count is known use that.
+  if (unsigned SmallTC = SE->getSmallConstantTripCount(OuterLoop))
+BestTripCount = SmallTC;
+  else if (LoopVectorizeWithBlockFrequency) {
+// Else use profile data if available.
+if (auto EstimatedTC = getLoopEstimatedTripCount(OuterLoop))
+  BestTripCount = *EstimatedTC;
+  }
+
+  InstructionCost NewMemCheckCost = MemCheckCost / BestTripCount;
+
+  // Let's ensure the cost is always at least 1.
+  NewMemCheckCost = std::max(*NewMemCheckCost.getValue(),
+ (InstructionCost::CostType)1);
+
+  LLVM_DEBUG(dbgs()
+ << "We expect runtime memory checks to be hoisted "
+ << "out of the outer loop. Cost reduced from "
+ << MemCheckCost << " to " << NewMemCheckCost << '\n');
+
+  MemCheckCost = NewMemCheckCost;
+}
+  }
+
+  RTCheckCost += MemCheckCost;
+}
+
 if (SCEVCheckBlock || MemCheckBlock)
   LLVM_DEBUG(dbgs() << "Total cost of runtime checks: " << RTCheckCost
 << "\n");
@@ -2144,8 +2194,8 @@ class GeneratedRTChecks {
 
 BranchInst::Create(LoopVectorPreHeader, SCEVCheckBlock);
 // Create new preheader for vector loop.
-if (auto *PL = LI->getLoopFor(LoopVectorPreHeader))
-  PL->addBasicBlockToLoop(SCEVCheckBlock, *LI);
+if (OuterLoop)
+  OuterLoop->addBasicBlockToLoop(SCEVCheckBlock, *LI);
 
 SCEVCheckBlock->getTerminator()->eraseFromParent();
 SCEVCheckBlock->moveBefore(LoopVectorPreHeader);
@@ -2179,8 +2229,8 @@ class GeneratedRTChecks {
 DT->changeImmediateDominator(LoopVectorPreHeader, MemCheckBlock);
 MemCheckBlock->moveBefore(LoopVectorPreHeader);
 
-if (auto *PL = LI->getLoopFor(LoopVectorPreHeader))
-  PL->addBasicBlockToLoop(MemCheckBlock, *LI);
+if (OuterLoop)
+  OuterLoop->addBasicBlockToLoop(MemCheckBlock, *LI);
 
 BranchInst &BI =
 *BranchInst::Create(Bypass, LoopVectorPreHeader, MemRuntimeCheckCond);
diff --git 
a/llvm/test/Transforms/LoopVectorize/AArch6

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79756 (PR #79814)

2024-01-29 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/79814

resolves llvm/llvm-project#79756

From 2ca45150c7984eea123409e6a7d25b2c7606ef5c Mon Sep 17 00:00:00 2001
From: David Green 
Date: Sun, 28 Jan 2024 17:01:21 +
Subject: [PATCH] Revert "[AArch64] merge index address with large offset into
 base address"

This reverts commit 32878c2065c8005b3ea30c79e16dfd7eed55d645 due to #79756 and 
#76202.

(cherry picked from commit 915c3d9e5a2d1314afe64cd6116a3b6c9809ec90)
---
 llvm/lib/Target/AArch64/AArch64InstrInfo.cpp  |  10 -
 llvm/lib/Target/AArch64/AArch64InstrInfo.h|   3 -
 .../AArch64/AArch64LoadStoreOptimizer.cpp | 229 --
 llvm/test/CodeGen/AArch64/arm64-addrmode.ll   |  15 +-
 .../AArch64/large-offset-ldr-merge.mir|   5 +-
 5 files changed, 12 insertions(+), 250 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 2e8d8c63d6bec2..13e9d9725cc2ed 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -4098,16 +4098,6 @@ AArch64InstrInfo::getLdStOffsetOp(const MachineInstr &MI) {
   return MI.getOperand(Idx);
 }
 
-const MachineOperand &
-AArch64InstrInfo::getLdStAmountOp(const MachineInstr &MI) {
-  switch (MI.getOpcode()) {
-  default:
-llvm_unreachable("Unexpected opcode");
-  case AArch64::LDRBBroX:
-return MI.getOperand(4);
-  }
-}
-
 static const TargetRegisterClass *getRegClass(const MachineInstr &MI,
   Register Reg) {
   if (MI.getParent() == nullptr)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
index db24a19fe5f8e3..6526f6740747ab 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
@@ -111,9 +111,6 @@ class AArch64InstrInfo final : public AArch64GenInstrInfo {
   /// Returns the immediate offset operator of a load/store.
   static const MachineOperand &getLdStOffsetOp(const MachineInstr &MI);
 
-  /// Returns the shift amount operator of a load/store.
-  static const MachineOperand &getLdStAmountOp(const MachineInstr &MI);
-
   /// Returns whether the instruction is FP or NEON.
   static bool isFpOrNEON(const MachineInstr &MI);
 
diff --git a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
index e90b8a8ca7acee..926a89466255ca 100644
--- a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
+++ b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
@@ -62,8 +62,6 @@ STATISTIC(NumUnscaledPairCreated,
   "Number of load/store from unscaled generated");
 STATISTIC(NumZeroStoresPromoted, "Number of narrow zero stores promoted");
 STATISTIC(NumLoadsFromStoresPromoted, "Number of loads from stores promoted");
-STATISTIC(NumConstOffsetFolded,
-  "Number of const offset of index address folded");
 
 DEBUG_COUNTER(RegRenamingCounter, DEBUG_TYPE "-reg-renaming",
   "Controls which pairs are considered for renaming");
@@ -77,11 +75,6 @@ static cl::opt<unsigned> LdStLimit("aarch64-load-store-scan-limit",
 static cl::opt<unsigned> UpdateLimit("aarch64-update-scan-limit", cl::init(100),
  cl::Hidden);
 
-// The LdStConstLimit limits how far we search for const offset instructions
-// when we form index address load/store instructions.
-static cl::opt<unsigned> LdStConstLimit("aarch64-load-store-const-scan-limit",
-cl::init(10), cl::Hidden);
-
 // Enable register renaming to find additional store pairing opportunities.
 static cl::opt<bool> EnableRenaming("aarch64-load-store-renaming",
 cl::init(true), cl::Hidden);
@@ -178,13 +171,6 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass {
   findMatchingUpdateInsnForward(MachineBasicBlock::iterator I,
 int UnscaledOffset, unsigned Limit);
 
-  // Scan the instruction list to find a register assigned with a const
-  // value that can be combined with the current instruction (a load or store)
-  // using base addressing with writeback. Scan forwards.
-  MachineBasicBlock::iterator
-  findMatchingConstOffsetBackward(MachineBasicBlock::iterator I, unsigned Limit,
-  unsigned &Offset);
-
   // Scan the instruction list to find a base register update that can
   // be combined with the current instruction (a load or store) using
   // pre or post indexed addressing with writeback. Scan backwards.
@@ -196,19 +182,11 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass {
   bool isMatchingUpdateInsn(MachineInstr &MemMI, MachineInstr &MI,
 unsigned BaseReg, int Offset);
 
-  bool isMatchingMovConstInsn(MachineInstr &MemMI, MachineInstr &MI,
-  unsigned IndexReg, unsigned &Offset);
-
   // Merge a pre- or post-index base regi

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79756 (PR #79814)

2024-01-29 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/79814


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79756 (PR #79814)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#79756

---
Full diff: https://github.com/llvm/llvm-project/pull/79814.diff


5 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (-10) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.h (-3) 
- (modified) llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp (-229) 
- (modified) llvm/test/CodeGen/AArch64/arm64-addrmode.ll (+9-6) 
- (modified) llvm/test/CodeGen/AArch64/large-offset-ldr-merge.mir (+3-2) 


``diff
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 2e8d8c63d6bec24..13e9d9725cc2ed1 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -4098,16 +4098,6 @@ AArch64InstrInfo::getLdStOffsetOp(const MachineInstr &MI) {
   return MI.getOperand(Idx);
 }
 
-const MachineOperand &
-AArch64InstrInfo::getLdStAmountOp(const MachineInstr &MI) {
-  switch (MI.getOpcode()) {
-  default:
-llvm_unreachable("Unexpected opcode");
-  case AArch64::LDRBBroX:
-return MI.getOperand(4);
-  }
-}
-
 static const TargetRegisterClass *getRegClass(const MachineInstr &MI,
   Register Reg) {
   if (MI.getParent() == nullptr)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
index db24a19fe5f8e3c..6526f6740747abb 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
@@ -111,9 +111,6 @@ class AArch64InstrInfo final : public AArch64GenInstrInfo {
   /// Returns the immediate offset operator of a load/store.
   static const MachineOperand &getLdStOffsetOp(const MachineInstr &MI);
 
-  /// Returns the shift amount operator of a load/store.
-  static const MachineOperand &getLdStAmountOp(const MachineInstr &MI);
-
   /// Returns whether the instruction is FP or NEON.
   static bool isFpOrNEON(const MachineInstr &MI);
 
diff --git a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
index e90b8a8ca7aceee..926a89466255cab 100644
--- a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
+++ b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
@@ -62,8 +62,6 @@ STATISTIC(NumUnscaledPairCreated,
   "Number of load/store from unscaled generated");
 STATISTIC(NumZeroStoresPromoted, "Number of narrow zero stores promoted");
 STATISTIC(NumLoadsFromStoresPromoted, "Number of loads from stores promoted");
-STATISTIC(NumConstOffsetFolded,
-  "Number of const offset of index address folded");
 
 DEBUG_COUNTER(RegRenamingCounter, DEBUG_TYPE "-reg-renaming",
   "Controls which pairs are considered for renaming");
@@ -77,11 +75,6 @@ static cl::opt<unsigned> LdStLimit("aarch64-load-store-scan-limit",
 static cl::opt<unsigned> UpdateLimit("aarch64-update-scan-limit", cl::init(100),
  cl::Hidden);
 
-// The LdStConstLimit limits how far we search for const offset instructions
-// when we form index address load/store instructions.
-static cl::opt<unsigned> LdStConstLimit("aarch64-load-store-const-scan-limit",
-cl::init(10), cl::Hidden);
-
 // Enable register renaming to find additional store pairing opportunities.
 static cl::opt<bool> EnableRenaming("aarch64-load-store-renaming",
 cl::init(true), cl::Hidden);
@@ -178,13 +171,6 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass {
   findMatchingUpdateInsnForward(MachineBasicBlock::iterator I,
 int UnscaledOffset, unsigned Limit);
 
-  // Scan the instruction list to find a register assigned with a const
-  // value that can be combined with the current instruction (a load or store)
-  // using base addressing with writeback. Scan forwards.
-  MachineBasicBlock::iterator
-  findMatchingConstOffsetBackward(MachineBasicBlock::iterator I, unsigned Limit,
-  unsigned &Offset);
-
   // Scan the instruction list to find a base register update that can
   // be combined with the current instruction (a load or store) using
   // pre or post indexed addressing with writeback. Scan backwards.
@@ -196,19 +182,11 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass {
   bool isMatchingUpdateInsn(MachineInstr &MemMI, MachineInstr &MI,
 unsigned BaseReg, int Offset);
 
-  bool isMatchingMovConstInsn(MachineInstr &MemMI, MachineInstr &MI,
-  unsigned IndexReg, unsigned &Offset);
-
   // Merge a pre- or post-index base register update into a ld/st instruction.
   MachineBasicBlock::iterator
   mergeUpdateInsn(MachineBasicBlock::iterator I,
   MachineBasicBlock::iterator Update, bool IsPreIdx);
 
-  MachineBasicBlock::iterator
-  mergeConstOffsetInsn(MachineBasicBlock::iterator

[llvm-branch-commits] [llvm] [workflows] Fix argument passing in abi-dump jobs (#79658) (PR #79836)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar created 
https://github.com/llvm/llvm-project/pull/79836

This was broken by 859e6aa1008b80d9b10657bac37822a32ee14a23, which added quotes 
around the EXTRA_ARGS variable.

From 4649293daee971fe03dba59ee54e2c2a0b86 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Mon, 29 Jan 2024 06:30:22 -0800
Subject: [PATCH] [workflows] Fix argument passing in abi-dump jobs (#79658)

This was broken by 859e6aa1008b80d9b10657bac37822a32ee14a23, which added
quotes around the EXTRA_ARGS variable.
---
 .github/workflows/llvm-tests.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/llvm-tests.yml b/.github/workflows/llvm-tests.yml
index cc9855ce182b2b8..8a53abee376716f 100644
--- a/.github/workflows/llvm-tests.yml
+++ b/.github/workflows/llvm-tests.yml
@@ -143,7 +143,7 @@ jobs:
   else
 touch llvm.symbols
   fi
-  abi-dumper "$EXTRA_ARGS" -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
+  abi-dumper $EXTRA_ARGS -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
   # Remove symbol versioning from dumps, so we can compare across major versions.
   sed -i 's/LLVM_${{ matrix.llvm_version_major }}/LLVM_NOVERSION/' ${{ matrix.ref }}.abi
   - name: Upload ABI file
@@ -193,7 +193,7 @@ jobs:
   # FIXME: Reading of gzip'd abi files on the GitHub runners stop
   # working some time in March of 2021, likely due to a change in the
   # runner's environment.
-  abi-compliance-checker "$EXTRA_ARGS" -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
+  abi-compliance-checker $EXTRA_ARGS -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
   - name: Upload ABI Comparison
 if: always()
 uses: actions/upload-artifact@v3



[llvm-branch-commits] [llvm] [workflows] Fix argument passing in abi-dump jobs (#79658) (PR #79836)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-github-workflow

Author: Tom Stellard (tstellar)


Changes

This was broken by 859e6aa1008b80d9b10657bac37822a32ee14a23, which added quotes 
around the EXTRA_ARGS variable.
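
Why the added quotes matter can be modelled in a few lines. This is a minimal sketch under simplifying assumptions, not a description of what any shell actually executes: it treats an unquoted expansion as plain whitespace splitting (ignoring globbing and a non-default `IFS`) and a quoted expansion as a single argument. The function names and the flag strings in the usage note are hypothetical; the thread does not show the real contents of `EXTRA_ARGS`.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Models `tool "$EXTRA_ARGS"`: the whole value becomes exactly one argument,
// even when it is empty or contains several space-separated flags.
std::vector<std::string> quotedExpansion(const std::string &Value) {
  return {Value};
}

// Models `tool $EXTRA_ARGS`: the value is split on whitespace, so an empty
// value contributes no arguments and each flag becomes its own argument.
std::vector<std::string> unquotedExpansion(const std::string &Value) {
  std::istringstream In(Value);
  std::vector<std::string> Args;
  for (std::string Word; In >> Word;)
    Args.push_back(Word);
  return Args;
}
```

With a value such as `-flag-a -flag-b`, the quoted form hands the tool one argument containing a space, whereas the unquoted form hands it two separate flags. With an empty value, the quoted form still passes one empty argument and the unquoted form passes none.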

---
Full diff: https://github.com/llvm/llvm-project/pull/79836.diff


1 Files Affected:

- (modified) .github/workflows/llvm-tests.yml (+2-2) 


```diff
diff --git a/.github/workflows/llvm-tests.yml b/.github/workflows/llvm-tests.yml
index cc9855ce182b2b8..8a53abee376716f 100644
--- a/.github/workflows/llvm-tests.yml
+++ b/.github/workflows/llvm-tests.yml
@@ -143,7 +143,7 @@ jobs:
   else
 touch llvm.symbols
   fi
-  abi-dumper "$EXTRA_ARGS" -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
+  abi-dumper $EXTRA_ARGS -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
   # Remove symbol versioning from dumps, so we can compare across major versions.
   sed -i 's/LLVM_${{ matrix.llvm_version_major }}/LLVM_NOVERSION/' ${{ matrix.ref }}.abi
   - name: Upload ABI file
@@ -193,7 +193,7 @@ jobs:
   # FIXME: Reading of gzip'd abi files on the GitHub runners stop
   # working some time in March of 2021, likely due to a change in the
   # runner's environment.
-  abi-compliance-checker "$EXTRA_ARGS" -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
+  abi-compliance-checker $EXTRA_ARGS -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
   - name: Upload ABI Comparison
 if: always()
 uses: actions/upload-artifact@v3

```




https://github.com/llvm/llvm-project/pull/79836


[llvm-branch-commits] [llvm] [workflows] Fix argument passing in abi-dump jobs (#79658) (PR #79836)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/79836

From 01d7ece20a46cec1bc1ef512d9961ee134ca73bd Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Mon, 29 Jan 2024 06:30:22 -0800
Subject: [PATCH] [workflows] Fix argument passing in abi-dump jobs (#79658)

This was broken by 859e6aa1008b80d9b10657bac37822a32ee14a23, which added
quotes around the EXTRA_ARGS variable.
---
 .github/workflows/llvm-tests.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/llvm-tests.yml b/.github/workflows/llvm-tests.yml
index 63f0f3abfd70a5b..127628d76f1913f 100644
--- a/.github/workflows/llvm-tests.yml
+++ b/.github/workflows/llvm-tests.yml
@@ -125,7 +125,7 @@ jobs:
   else
 touch llvm.symbols
   fi
-  abi-dumper "$EXTRA_ARGS" -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
+  abi-dumper $EXTRA_ARGS -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
   # Remove symbol versioning from dumps, so we can compare across major versions.
   sed -i 's/LLVM_${{ matrix.llvm_version_major }}/LLVM_NOVERSION/' ${{ matrix.ref }}.abi
   - name: Upload ABI file
@@ -175,7 +175,7 @@ jobs:
   # FIXME: Reading of gzip'd abi files on the GitHub runners stop
   # working some time in March of 2021, likely due to a change in the
   # runner's environment.
-  abi-compliance-checker "$EXTRA_ARGS" -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
+  abi-compliance-checker $EXTRA_ARGS -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
   - name: Upload ABI Comparison
 if: always()
 uses: actions/upload-artifact@v3



[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)

2024-01-29 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad created 
https://github.com/llvm/llvm-project/pull/79839

This just missed the branch creation and is the last piece of functionality 
required to get AMDGPU GFX12 support working in the 18.x release.



From c265c8527285075a58b2425198dbd4cca8b69477 Mon Sep 17 00:00:00 2001
From: Jay Foad 
Date: Thu, 25 Jan 2024 07:48:06 +
Subject: [PATCH] [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325)

This is only valid on targets with architected SGPRs.
---
 llvm/include/llvm/IR/IntrinsicsAMDGPU.td  |  4 ++
 .../lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | 19 ++
 llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h  |  1 +
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 14 +
 llvm/lib/Target/AMDGPU/SIISelLowering.h   |  1 +
 .../CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll | 61 +++
 6 files changed, 100 insertions(+)
 create mode 100644 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll

diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
index 9eb1ac8e27befb1..c5f43d17d1c1481 100644
--- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
@@ -2777,6 +2777,10 @@ class AMDGPULoadTr:
 
 def int_amdgcn_global_load_tr : AMDGPULoadTr;
 
+// i32 @llvm.amdgcn.wave.id()
+def int_amdgcn_wave_id :
+  DefaultAttrsIntrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrSpeculatable]>;
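+
 //===----------------------------------------------------------------------===//
 // Deep learning intrinsics.
 //===----------------------------------------------------------------------===//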
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 615685822f91eeb..e98ede88a7e2db9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr &MI,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI,
+ MachineIRBuilder &B) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!ST.hasArchitectedSGPRs())
+return false;
+  LLT S32 = LLT::scalar(32);
+  Register DstReg = MI.getOperand(0).getReg();
+  Register TTMP8 =
+  getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8,
+   AMDGPU::SReg_32RegClass, B.getDebugLoc(), S32);
+  auto LSB = B.buildConstant(S32, 25);
+  auto Width = B.buildConstant(S32, 5);
+  B.buildUbfx(DstReg, TTMP8, LSB, Width);
+  MI.eraseFromParent();
+  return true;
+}
+
 bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
 MachineInstr &MI) const {
   MachineIRBuilder &B = Helper.MIRBuilder;
@@ -7005,6 +7022,8 @@ bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
   case Intrinsic::amdgcn_workgroup_id_z:
 return legalizePreloadedArgIntrin(MI, MRI, B,
   AMDGPUFunctionArgInfo::WORKGROUP_ID_Z);
+  case Intrinsic::amdgcn_wave_id:
+return legalizeWaveID(MI, B);
   case Intrinsic::amdgcn_lds_kernel_id:
 return legalizePreloadedArgIntrin(MI, MRI, B,
   AMDGPUFunctionArgInfo::LDS_KERNEL_ID);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
index 56aabd4f6ab71b6..ecbe42681c6690c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
@@ -212,6 +212,7 @@ class AMDGPULegalizerInfo final : public LegalizerInfo {
 
   bool legalizeFPTruncRound(MachineInstr &MI, MachineIRBuilder &B) const;
   bool legalizeStackSave(MachineInstr &MI, MachineIRBuilder &B) const;
+  bool legalizeWaveID(MachineInstr &MI, MachineIRBuilder &B) const;
 
   bool legalizeImageIntrinsic(
   MachineInstr &MI, MachineIRBuilder &B,
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index d60f511302613e1..c5ad9da88ec2b31 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -7920,6 +7920,18 @@ SDValue SITargetLowering::lowerSBuffer(EVT VT, SDLoc DL, SDValue Rsrc,
   return Loads[0];
 }
 
+SDValue SITargetLowering::lowerWaveID(SelectionDAG &DAG, SDValue Op) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!Subtarget->hasArchitectedSGPRs())
+return {};
+  SDLoc SL(Op);
+  MVT VT = MVT::i32;
+  SDValue TTMP8 = CreateLiveInRegister(DAG, &AMDGPU::SReg_32RegClass,
+   AMDGPU::TTMP8, VT, SL);
+  return DAG.getNode(AMDGPUISD::BFE_U32, SL, VT, TTMP8,
+ DAG.getConstant(25, SL, VT), DAG.getConstant(5, SL, VT));
+}
+
 SDValue SITargetLowering::lowerWorkitemID(SelectionDAG &DAG, SDValue Op,
   unsigned Dim,
   

[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)

2024-01-29 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad milestoned 
https://github.com/llvm/llvm-project/pull/79839


[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Jay Foad (jayfoad)


Changes

This just missed the branch creation and is the last piece of functionality 
required to get AMDGPU GFX12 support working in the 18.x release.
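
As a reading aid for the diff below: both lowerings read TTMP8 and emit an unsigned bitfield extract with offset 25 and width 5, matching the in-code comment that waveIDinGroup lives in TTMP8[29:25]. A minimal host-side sketch of the same arithmetic follows; `extractBits` and `waveIdInGroup` are hypothetical names chosen for illustration and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned bitfield extract: returns `width` bits of `value`, starting at
// bit `lsb`. This mirrors what a UBFX / BFE_U32 node computes.
inline std::uint32_t extractBits(std::uint32_t value, unsigned lsb,
                                 unsigned width) {
  // Sketch-only precondition: the field is non-empty and fits in 32 bits.
  assert(width > 0 && width < 32 && lsb + width <= 32);
  return (value >> lsb) & ((1u << width) - 1u);
}

// Hypothetical helper: the wave ID within the workgroup, assuming the
// TTMP8[29:25] layout stated in the patch comment.
inline std::uint32_t waveIdInGroup(std::uint32_t ttmp8) {
  return extractBits(ttmp8, /*lsb=*/25, /*width=*/5);
}
```

For example, `waveIdInGroup(5u << 25)` yields 5, and bits outside [29:25] do not affect the result.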



---
Full diff: https://github.com/llvm/llvm-project/pull/79839.diff


6 Files Affected:

- (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+4) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+19) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h (+1) 
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+14) 
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.h (+1) 
- (added) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll (+61) 


``diff
diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
index 9eb1ac8e27befb1..c5f43d17d1c1481 100644
--- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
@@ -2777,6 +2777,10 @@ class AMDGPULoadTr:
 
 def int_amdgcn_global_load_tr : AMDGPULoadTr;
 
+// i32 @llvm.amdgcn.wave.id()
+def int_amdgcn_wave_id :
+  DefaultAttrsIntrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrSpeculatable]>;
+
 //===----------------------------------------------------------------------===//
 // Deep learning intrinsics.
 //===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 615685822f91eeb..e98ede88a7e2db9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr &MI,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI,
+                                         MachineIRBuilder &B) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!ST.hasArchitectedSGPRs())
+    return false;
+  LLT S32 = LLT::scalar(32);
+  Register DstReg = MI.getOperand(0).getReg();
+  Register TTMP8 =
+      getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8,
+                               AMDGPU::SReg_32RegClass, B.getDebugLoc(), S32);
+  auto LSB = B.buildConstant(S32, 25);
+  auto Width = B.buildConstant(S32, 5);
+  B.buildUbfx(DstReg, TTMP8, LSB, Width);
+  MI.eraseFromParent();
+  return true;
+}
+
 bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
                                             MachineInstr &MI) const {
   MachineIRBuilder &B = Helper.MIRBuilder;
@@ -7005,6 +7022,8 @@ bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
   case Intrinsic::amdgcn_workgroup_id_z:
     return legalizePreloadedArgIntrin(MI, MRI, B,
                                       AMDGPUFunctionArgInfo::WORKGROUP_ID_Z);
+  case Intrinsic::amdgcn_wave_id:
+    return legalizeWaveID(MI, B);
   case Intrinsic::amdgcn_lds_kernel_id:
     return legalizePreloadedArgIntrin(MI, MRI, B,
                                       AMDGPUFunctionArgInfo::LDS_KERNEL_ID);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
index 56aabd4f6ab71b6..ecbe42681c6690c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
@@ -212,6 +212,7 @@ class AMDGPULegalizerInfo final : public LegalizerInfo {
 
   bool legalizeFPTruncRound(MachineInstr &MI, MachineIRBuilder &B) const;
   bool legalizeStackSave(MachineInstr &MI, MachineIRBuilder &B) const;
+  bool legalizeWaveID(MachineInstr &MI, MachineIRBuilder &B) const;
 
   bool legalizeImageIntrinsic(
       MachineInstr &MI, MachineIRBuilder &B,
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index d60f511302613e1..c5ad9da88ec2b31 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -7920,6 +7920,18 @@ SDValue SITargetLowering::lowerSBuffer(EVT VT, SDLoc DL, SDValue Rsrc,
   return Loads[0];
 }
 
+SDValue SITargetLowering::lowerWaveID(SelectionDAG &DAG, SDValue Op) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!Subtarget->hasArchitectedSGPRs())
+    return {};
+  SDLoc SL(Op);
+  MVT VT = MVT::i32;
+  SDValue TTMP8 = CreateLiveInRegister(DAG, &AMDGPU::SReg_32RegClass,
+                                       AMDGPU::TTMP8, VT, SL);
+  return DAG.getNode(AMDGPUISD::BFE_U32, SL, VT, TTMP8,
+                     DAG.getConstant(25, SL, VT), DAG.getConstant(5, SL, VT));
+}
+
 SDValue SITargetLowering::lowerWorkitemID(SelectionDAG &DAG, SDValue Op,
                                           unsigned Dim,
                                           const ArgDescriptor &Arg) const {
@@ -8090,6 +8102,8 @@ SDValue SITargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op,
   case Intrinsic::amdgcn_workgroup_id_z:
     return getPreloadedValue(DAG, *MFI, VT,
 

[llvm-branch-commits] [llvm] [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325) (PR #79689)

2024-01-29 Thread Jay Foad via llvm-branch-commits

jayfoad wrote:

> jayfoad closed this by deleting the head repository 3 hours ago

Sorry. Recreated as #79839

https://github.com/llvm/llvm-project/pull/79689


[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#79838 (PR #79841)

2024-01-29 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/79841

resolves llvm/llvm-project#79838

>From 672a561cf963fbcee33de65efe25f220c2c21173 Mon Sep 17 00:00:00 2001
From: Sam James 
Date: Wed, 24 Jan 2024 08:23:03 +
Subject: [PATCH] [sanitizer] Handle Gentoo's libstdc++ path

On Gentoo, libc++ is indeed in /usr/include/c++/*, but libstdc++ is at
e.g. /usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14.

Use '/include/g++' as it should be unique enough. Note that the omission of
a trailing slash is intentional to match g++-*.

See https://github.com/llvm/llvm-project/pull/78534#issuecomment-1904145839.

Reviewed by: mgorny
Closes: https://github.com/llvm/llvm-project/pull/79264

Signed-off-by: Sam James 
(cherry picked from commit e8f882f83acf30d9b4da8846bd26314139660430)
---
 .../lib/sanitizer_common/sanitizer_symbolizer_report.cpp  | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
index 8438e019591b58a..f6b157c07c6557c 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
@@ -34,8 +34,10 @@ static bool FrameIsInternal(const SymbolizedStack *frame) {
     return true;
   const char *file = frame->info.file;
   const char *module = frame->info.module;
+  // On Gentoo, the path is g++-*, so there's *not* a missing /.
   if (file && (internal_strstr(file, "/compiler-rt/lib/") ||
-               internal_strstr(file, "/include/c++/")))
+               internal_strstr(file, "/include/c++/") ||
+               internal_strstr(file, "/include/g++")))
     return true;
   if (module && (internal_strstr(module, "libclang_rt.")))
     return true;



[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#79838 (PR #79841)

2024-01-29 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/79841


[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#79838 (PR #79841)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-compiler-rt-sanitizer

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#79838

---
Full diff: https://github.com/llvm/llvm-project/pull/79841.diff


1 Files Affected:

- (modified) compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp (+3-1) 


```diff
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
index 8438e019591b58a..f6b157c07c6557c 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
@@ -34,8 +34,10 @@ static bool FrameIsInternal(const SymbolizedStack *frame) {
     return true;
   const char *file = frame->info.file;
   const char *module = frame->info.module;
+  // On Gentoo, the path is g++-*, so there's *not* a missing /.
   if (file && (internal_strstr(file, "/compiler-rt/lib/") ||
-               internal_strstr(file, "/include/c++/")))
+               internal_strstr(file, "/include/c++/") ||
+               internal_strstr(file, "/include/g++")))
     return true;
   if (module && (internal_strstr(module, "libclang_rt.")))
     return true;

```
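
The trailing-slash point made in the patch comment is easy to check with plain substring search. A minimal sketch, using `std::strstr` as a stand-in for the sanitizer runtime's `internal_strstr`; `pathLooksInternal` is a hypothetical name and mirrors only the file-path half of `FrameIsInternal`:

```cpp
#include <cstring>

// Hypothetical stand-in for the file-path test in FrameIsInternal.
// Returns true when `file` looks like it belongs to compiler-rt or to a
// C++ standard library header directory.
inline bool pathLooksInternal(const char *file) {
  if (!file)
    return false;
  return std::strstr(file, "/compiler-rt/lib/") != nullptr ||
         std::strstr(file, "/include/c++/") != nullptr ||
         // No trailing '/': Gentoo uses directories such as g++-v14, which
         // "/include/g++/" would fail to match.
         std::strstr(file, "/include/g++") != nullptr;
}
```

With the Gentoo example path from the commit message, `/usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14`, the pattern `"/include/g++"` matches, whereas `"/include/g++/"` would not.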




https://github.com/llvm/llvm-project/pull/79841


[llvm-branch-commits] [libcxx] [clang] PR for llvm/llvm-project#79762 (PR #79763)

2024-01-29 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne approved this pull request.


https://github.com/llvm/llvm-project/pull/79763


[llvm-branch-commits] [clang] [llvm] [clang-tools-extra] [DirectX] Move DXIL ResourceKind and ElementType to DXILABI.h. NFC (PR #78225)

2024-01-29 Thread Justin Bogner via llvm-branch-commits

https://github.com/bogner updated 
https://github.com/llvm/llvm-project/pull/78225




[llvm-branch-commits] [clang-tools-extra] [clang] [llvm] [DirectX] Move DXIL ResourceKind and ElementType to DXILABI.h. NFC (PR #78225)

2024-01-29 Thread Justin Bogner via llvm-branch-commits

https://github.com/bogner updated 
https://github.com/llvm/llvm-project/pull/78225




[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)

2024-01-29 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/79863

resolves llvm/llvm-project#79797

>From 7eea3aec978bdfb154868f846d52e4cba4cf246c Mon Sep 17 00:00:00 2001
From: Andrei Golubev 
Date: Mon, 29 Jan 2024 10:37:11 +0200
Subject: [PATCH] [mlir] Revert to old fold logic in IR::Dialect::add{Types,
 Attributes}() (#79582)

Fold expressions on Clang are limited to 256 elements. This causes
compilation errors in cases when the amount of elements added exceeds
this limit. Side-step the issue by restoring the original trick that
would use the std::initializer_list. For the record, in our downstream
Clang 16 gives:

mlir/include/mlir/IR/Dialect.h:269:23: fatal error: instantiating fold
expression with 688 arguments exceeded expression nesting limit of 256
(addType<Args>(), ...);

Partially reverts 26d811b3ecd2fa1ca3d9b41e17fb42b8c7ad03d6.

Co-authored-by: Nikita Kudriavtsev 
(cherry picked from commit e3a38a75ddc6ff00301ec19a0e2488d00f2cc297)
---
 mlir/include/mlir/IR/Dialect.h | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mlir/include/mlir/IR/Dialect.h b/mlir/include/mlir/IR/Dialect.h
index 45f29f37dd3b97c..50f6f6de5c2897a 100644
--- a/mlir/include/mlir/IR/Dialect.h
+++ b/mlir/include/mlir/IR/Dialect.h
@@ -281,7 +281,11 @@ class Dialect {
   /// Register a set of type classes with this dialect.
   template <typename... Args>
   void addTypes() {
-    (addType<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addType<Args>(), 0)...};
   }
 
   /// Register a type instance with this dialect.
@@ -292,7 +296,11 @@ class Dialect {
   /// Register a set of attribute classes with this dialect.
   template <typename... Args>
   void addAttributes() {
-    (addAttribute<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addAttribute<Args>(), 0)...};
   }
 
   /// Register an attribute instance with this dialect.



[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:

@zero9178 What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/79863


[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)

2024-01-29 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/79863


[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir-core

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#79797

---
Full diff: https://github.com/llvm/llvm-project/pull/79863.diff


1 Files Affected:

- (modified) mlir/include/mlir/IR/Dialect.h (+10-2) 


```diff
diff --git a/mlir/include/mlir/IR/Dialect.h b/mlir/include/mlir/IR/Dialect.h
index 45f29f37dd3b97c..50f6f6de5c2897a 100644
--- a/mlir/include/mlir/IR/Dialect.h
+++ b/mlir/include/mlir/IR/Dialect.h
@@ -281,7 +281,11 @@ class Dialect {
   /// Register a set of type classes with this dialect.
   template <typename... Args>
   void addTypes() {
-    (addType<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addType<Args>(), 0)...};
   }
 
   /// Register a type instance with this dialect.
@@ -292,7 +296,11 @@ class Dialect {
   /// Register a set of attribute classes with this dialect.
   template <typename... Args>
   void addAttributes() {
-    (addAttribute<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addAttribute<Args>(), 0)...};
   }
 
   /// Register an attribute instance with this dialect.

```
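
To illustrate the workaround described in the diff comment, here is a small sketch contrasting a comma fold expression with the `std::initializer_list` pack expansion. `Registry`, `addOne`, and the tag types `A`, `B`, `C` are hypothetical stand-ins, not MLIR classes. Both forms visit the pack left to right, so they register the same things in the same order; the fold form needs C++17.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical registry standing in for a dialect; not MLIR's API.
struct Registry {
  std::vector<std::string> names;

  // Registers a single type by recording its name.
  template <typename T>
  void addOne() {
    names.push_back(T::name());
  }

  // C++17 fold over the comma operator. This is the form that can hit
  // Clang's expression nesting limit for very large packs.
  template <typename... Ts>
  void addAllFold() {
    (addOne<Ts>(), ...);
  }

  // Pack expansion inside a braced initializer list. Each element is the
  // expression (addOne<T>(), 0), and elements are evaluated left to right.
  // The leading 0 keeps the list non-empty for an empty pack.
  template <typename... Ts>
  void addAllInitList() {
    (void)std::initializer_list<int>{0, (addOne<Ts>(), 0)...};
  }
};

// Hypothetical tag types to register.
struct A { static const char *name() { return "A"; } };
struct B { static const char *name() { return "B"; } };
struct C { static const char *name() { return "C"; } };
```

Calling `addAllFold<A, B, C>()` on one `Registry` and `addAllInitList<A, B, C>()` on another leaves both with the names A, B, C in that order.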




https://github.com/llvm/llvm-project/pull/79863


[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)

2024-01-29 Thread Markus Böck via llvm-branch-commits

https://github.com/zero9178 approved this pull request.


https://github.com/llvm/llvm-project/pull/79863


[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)

2024-01-29 Thread Markus Böck via llvm-branch-commits

https://github.com/zero9178 approved this pull request.

(I think only release managers can merge to the release branch, not sure 
however)

https://github.com/llvm/llvm-project/pull/79603


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)

2024-01-29 Thread David Green via llvm-branch-commits

https://github.com/davemgreen approved this pull request.

The perf regression was fairly significant, so it would be good to get this 
into the branch. Thanks.

https://github.com/llvm/llvm-project/pull/79813


[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)

2024-01-29 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/79839


[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)

2024-01-29 Thread Matt Arsenault via llvm-branch-commits


@@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr &MI,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI,
+                                         MachineIRBuilder &B) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!ST.hasArchitectedSGPRs())
+    return false;
+  LLT S32 = LLT::scalar(32);
+  Register DstReg = MI.getOperand(0).getReg();
+  Register TTMP8 =
+      getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8,

arsenm wrote:

This avoids the live-in in the later patch?

https://github.com/llvm/llvm-project/pull/79839


[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)

2024-01-29 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/79839


[llvm-branch-commits] [clang] [llvm] [clang-tools-extra] [flang] [openmp] [DirectX] Move DXIL ResourceKind and ElementType to DXILABI.h. NFC (PR #78225)

2024-01-29 Thread Justin Bogner via llvm-branch-commits

https://github.com/bogner updated 
https://github.com/llvm/llvm-project/pull/78225




[llvm-branch-commits] [clang] [clang-tools-extra] [llvm] [openmp] [flang] [DirectX] Move DXIL ResourceKind and ElementType to DXILABI.h. NFC (PR #78225)

2024-01-29 Thread Justin Bogner via llvm-branch-commits

https://github.com/bogner updated 
https://github.com/llvm/llvm-project/pull/78225




[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)

2024-01-29 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad updated 
https://github.com/llvm/llvm-project/pull/79839

>From c265c8527285075a58b2425198dbd4cca8b69477 Mon Sep 17 00:00:00 2001
From: Jay Foad 
Date: Thu, 25 Jan 2024 07:48:06 +
Subject: [PATCH 1/2] [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325)

This is only valid on targets with architected SGPRs.
---
 llvm/include/llvm/IR/IntrinsicsAMDGPU.td  |  4 ++
 .../lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | 19 ++
 llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h  |  1 +
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 14 +
 llvm/lib/Target/AMDGPU/SIISelLowering.h   |  1 +
 .../CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll | 61 +++
 6 files changed, 100 insertions(+)
 create mode 100644 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll

diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
index 9eb1ac8e27befb..c5f43d17d1c148 100644
--- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
@@ -2777,6 +2777,10 @@ class AMDGPULoadTr:
 
 def int_amdgcn_global_load_tr : AMDGPULoadTr;
 
+// i32 @llvm.amdgcn.wave.id()
+def int_amdgcn_wave_id :
+  DefaultAttrsIntrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrSpeculatable]>;
+
 //===----------------------------------------------------------------------===//
 // Deep learning intrinsics.
 //===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 615685822f91ee..e98ede88a7e2db 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr &MI,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI,
+                                         MachineIRBuilder &B) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!ST.hasArchitectedSGPRs())
+    return false;
+  LLT S32 = LLT::scalar(32);
+  Register DstReg = MI.getOperand(0).getReg();
+  Register TTMP8 =
+      getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8,
+                               AMDGPU::SReg_32RegClass, B.getDebugLoc(), S32);
+  auto LSB = B.buildConstant(S32, 25);
+  auto Width = B.buildConstant(S32, 5);
+  B.buildUbfx(DstReg, TTMP8, LSB, Width);
+  MI.eraseFromParent();
+  return true;
+}
+
 bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
                                             MachineInstr &MI) const {
   MachineIRBuilder &B = Helper.MIRBuilder;
@@ -7005,6 +7022,8 @@ bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
   case Intrinsic::amdgcn_workgroup_id_z:
     return legalizePreloadedArgIntrin(MI, MRI, B,
                                       AMDGPUFunctionArgInfo::WORKGROUP_ID_Z);
+  case Intrinsic::amdgcn_wave_id:
+    return legalizeWaveID(MI, B);
   case Intrinsic::amdgcn_lds_kernel_id:
     return legalizePreloadedArgIntrin(MI, MRI, B,
                                       AMDGPUFunctionArgInfo::LDS_KERNEL_ID);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
index 56aabd4f6ab71b..ecbe42681c6690 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
@@ -212,6 +212,7 @@ class AMDGPULegalizerInfo final : public LegalizerInfo {
 
   bool legalizeFPTruncRound(MachineInstr &MI, MachineIRBuilder &B) const;
   bool legalizeStackSave(MachineInstr &MI, MachineIRBuilder &B) const;
+  bool legalizeWaveID(MachineInstr &MI, MachineIRBuilder &B) const;
 
   bool legalizeImageIntrinsic(
       MachineInstr &MI, MachineIRBuilder &B,
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index d60f511302613e..c5ad9da88ec2b3 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -7920,6 +7920,18 @@ SDValue SITargetLowering::lowerSBuffer(EVT VT, SDLoc DL, SDValue Rsrc,
   return Loads[0];
 }
 
+SDValue SITargetLowering::lowerWaveID(SelectionDAG &DAG, SDValue Op) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!Subtarget->hasArchitectedSGPRs())
+    return {};
+  SDLoc SL(Op);
+  MVT VT = MVT::i32;
+  SDValue TTMP8 = CreateLiveInRegister(DAG, &AMDGPU::SReg_32RegClass,
+                                       AMDGPU::TTMP8, VT, SL);
+  return DAG.getNode(AMDGPUISD::BFE_U32, SL, VT, TTMP8,
+                     DAG.getConstant(25, SL, VT), DAG.getConstant(5, SL, VT));
+}
+
 SDValue SITargetLowering::lowerWorkitemID(SelectionDAG &DAG, SDValue Op,
                                           unsigned Dim,
                                           const ArgDescriptor &Arg) const {
@@ -8090,6 +8102,8 @@ SDValue SITargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op,
   case Intrinsic::

[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)

2024-01-29 Thread Jay Foad via llvm-branch-commits


@@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr &MI,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI,
+                                         MachineIRBuilder &B) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!ST.hasArchitectedSGPRs())
+    return false;
+  LLT S32 = LLT::scalar(32);
+  Register DstReg = MI.getOperand(0).getReg();
+  Register TTMP8 =
+      getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8,

jayfoad wrote:

True, 66c710ec9dcdbdec6cadd89b972d8945983dc92f improved this to avoid adding 
liveins. I wasn't going to bother backporting that since I didn't think it was 
required for correctness. But I have cherry-picked it into this PR now.

https://github.com/llvm/llvm-project/pull/79839


[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79614 (PR #79870)

2024-01-29 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/79870

resolves llvm/llvm-project#79614

>From 4b861c2643e93050618488e58141b5802c1f4e35 Mon Sep 17 00:00:00 2001
From: Alexandros Lamprineas 
Date: Mon, 29 Jan 2024 16:37:09 +
Subject: [PATCH] [AArch64][TargetParser] Add mcpu alias for Microsoft Azure
 Cobalt 100. (#79614)

With a690e86 we added -mcpu/mtune=native support to handle the Microsoft
Azure Cobalt 100 CPU as a Neoverse N2. This patch adds a CPU alias in
TargetParser to maintain compatibility with GCC.

(cherry picked from commit ae8005ffb6cd18900de8ed5a86f60a4a16975471)
---
 clang/test/Driver/aarch64-mcpu.c | 3 +++
 clang/test/Misc/target-invalid-cpu-note.c| 4 ++--
 llvm/include/llvm/TargetParser/AArch64TargetParser.h | 3 ++-
 llvm/unittests/TargetParser/TargetParserTest.cpp | 2 +-
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/clang/test/Driver/aarch64-mcpu.c b/clang/test/Driver/aarch64-mcpu.c
index 511482a420da268..3e07f3597f34081 100644
--- a/clang/test/Driver/aarch64-mcpu.c
+++ b/clang/test/Driver/aarch64-mcpu.c
@@ -72,6 +72,9 @@
 // RUN: %clang --target=aarch64 -mcpu=cortex-r82  -### -c %s 2>&1 | FileCheck -check-prefix=CORTEXR82 %s
 // CORTEXR82: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-r82"
 
+// RUN: %clang --target=aarch64 -mcpu=cobalt-100 -### -c %s 2>&1 | FileCheck -check-prefix=COBALT-100 %s
+// COBALT-100: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-n2"
+
 // RUN: %clang --target=aarch64 -mcpu=grace -### -c %s 2>&1 | FileCheck -check-prefix=GRACE %s
 // GRACE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-v2"
 
diff --git a/clang/test/Misc/target-invalid-cpu-note.c b/clang/test/Misc/target-invalid-cpu-note.c
index 84aed5c9c36fe47..2f10bfb1fd82fe3 100644
--- a/clang/test/Misc/target-invalid-cpu-note.c
+++ b/clang/test/Misc/target-invalid-cpu-note.c
@@ -5,11 +5,11 @@
 
 // RUN: not %clang_cc1 -triple arm64--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix AARCH64
 // AARCH64: error: unknown target CPU 'not-a-cpu'
-// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}}
+// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, cobalt-100, grace{{$}}
 
 // RUN: not %clang_cc1 -triple arm64--- -tune-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix TUNE_AARCH64
 // TUNE_AARCH64: error: unknown target CPU 'not-a-cpu'
-// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}}
+// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, 

[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79614 (PR #79870)

2024-01-29 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/79870


[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79614 (PR #79870)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:

@davemgreen What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/79870


[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79614 (PR #79870)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#79614

---
Full diff: https://github.com/llvm/llvm-project/pull/79870.diff


4 Files Affected:

- (modified) clang/test/Driver/aarch64-mcpu.c (+3) 
- (modified) clang/test/Misc/target-invalid-cpu-note.c (+2-2) 
- (modified) llvm/include/llvm/TargetParser/AArch64TargetParser.h (+2-1) 
- (modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+1-1) 


``diff
diff --git a/clang/test/Driver/aarch64-mcpu.c b/clang/test/Driver/aarch64-mcpu.c
index 511482a420da268..3e07f3597f34081 100644
--- a/clang/test/Driver/aarch64-mcpu.c
+++ b/clang/test/Driver/aarch64-mcpu.c
@@ -72,6 +72,9 @@
 // RUN: %clang --target=aarch64 -mcpu=cortex-r82  -### -c %s 2>&1 | FileCheck 
-check-prefix=CORTEXR82 %s
 // CORTEXR82: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-r82"
 
+// RUN: %clang --target=aarch64 -mcpu=cobalt-100 -### -c %s 2>&1 | FileCheck 
-check-prefix=COBALT-100 %s
+// COBALT-100: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" 
"neoverse-n2"
+
 // RUN: %clang --target=aarch64 -mcpu=grace -### -c %s 2>&1 | FileCheck 
-check-prefix=GRACE %s
 // GRACE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-v2"
 
diff --git a/clang/test/Misc/target-invalid-cpu-note.c 
b/clang/test/Misc/target-invalid-cpu-note.c
index 84aed5c9c36fe47..2f10bfb1fd82fe3 100644
--- a/clang/test/Misc/target-invalid-cpu-note.c
+++ b/clang/test/Misc/target-invalid-cpu-note.c
@@ -5,11 +5,11 @@
 
 // RUN: not %clang_cc1 -triple arm64--- -target-cpu not-a-cpu -fsyntax-only %s 
2>&1 | FileCheck %s --check-prefix AARCH64
 // AARCH64: error: unknown target CPU 'not-a-cpu'
-// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, 
cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, 
cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, 
cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, 
cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, 
neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, 
neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, 
apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, 
apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, 
falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, 
thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}}
+// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, 
cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, 
cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, 
cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, 
cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, 
neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, 
neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, 
apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, 
apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, 
falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, 
thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, cobalt-100, 
grace{{$}}
 
 // RUN: not %clang_cc1 -triple arm64--- -tune-cpu not-a-cpu -fsyntax-only %s 
2>&1 | FileCheck %s --check-prefix TUNE_AARCH64
 // TUNE_AARCH64: error: unknown target CPU 'not-a-cpu'
-// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, 
cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, 
cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, 
cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, 
cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, 
cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, 
neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, 
apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, 
apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, 
falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, 
thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}}
+// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, 
cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, 
cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, 
cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, 
cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, 
cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, 
neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, 
apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, 
apple-m2, a

[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79614 (PR #79870)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-driver

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#79614

---
Full diff: https://github.com/llvm/llvm-project/pull/79870.diff


4 Files Affected:

- (modified) clang/test/Driver/aarch64-mcpu.c (+3) 
- (modified) clang/test/Misc/target-invalid-cpu-note.c (+2-2) 
- (modified) llvm/include/llvm/TargetParser/AArch64TargetParser.h (+2-1) 
- (modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+1-1) 



[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79614 (PR #79870)

2024-01-29 Thread David Green via llvm-branch-commits

https://github.com/davemgreen approved this pull request.

Sounds simple enough to me. LGTM

https://github.com/llvm/llvm-project/pull/79870


[llvm-branch-commits] [llvm] [workflows] Fix argument passing in abi-dump jobs (#79658) (PR #79836)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79836


[llvm-branch-commits] [llvm] 2c32141 - [llvm] [cmake] Include httplib in LLVMConfig.cmake (#79305)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

Author: Michał Górny
Date: 2024-01-29T10:25:40-08:00
New Revision: 2c3214135ffa8e3f9ab61d12521637532126f368

URL: 
https://github.com/llvm/llvm-project/commit/2c3214135ffa8e3f9ab61d12521637532126f368
DIFF: 
https://github.com/llvm/llvm-project/commit/2c3214135ffa8e3f9ab61d12521637532126f368.diff

LOG: [llvm] [cmake] Include httplib in LLVMConfig.cmake (#79305)

Include LLVM_ENABLE_HTTPLIB along with httplib package finding in
LLVMConfig.cmake, as this dependency is needed by LLVMDebuginfod that is
now used by LLDB. Without it, building LLDB standalone fails with:

```
CMake Error at /usr/lib/llvm/19/lib64/cmake/llvm/LLVMExports.cmake:90 
(set_target_properties):
  The link interface of target "LLVMDebuginfod" contains:

httplib::httplib

  but the target was not found.  Possible reasons include:

* There is a typo in the target name.
* A find_package call is missing for an IMPORTED target.
* An ALIAS target is missing.

Call Stack (most recent call first):
  /usr/lib/llvm/19/lib64/cmake/llvm/LLVMConfig.cmake:357 (include)
  cmake/modules/LLDBStandalone.cmake:9 (find_package)
  CMakeLists.txt:34 (include)
```

(cherry picked from commit 3c9f34c12450345c6eb524e47cf79664271e4260)

Added: 


Modified: 
llvm/cmake/modules/LLVMConfig.cmake.in

Removed: 




diff  --git a/llvm/cmake/modules/LLVMConfig.cmake.in 
b/llvm/cmake/modules/LLVMConfig.cmake.in
index 74e1c6bf52e2305..770a9caea322e6a 100644
--- a/llvm/cmake/modules/LLVMConfig.cmake.in
+++ b/llvm/cmake/modules/LLVMConfig.cmake.in
@@ -90,6 +90,11 @@ if(LLVM_ENABLE_CURL)
   find_package(CURL)
 endif()
 
+set(LLVM_ENABLE_HTTPLIB @LLVM_ENABLE_HTTPLIB@)
+if(LLVM_ENABLE_HTTPLIB)
+  find_package(httplib)
+endif()
+
 set(LLVM_WITH_Z3 @LLVM_WITH_Z3@)
 
 set(LLVM_ENABLE_DIA_SDK @LLVM_ENABLE_DIA_SDK@)





[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79547 (PR #79548)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79547 (PR #79548)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: 2c3214135ffa8e3f9ab61d12521637532126f368

https://github.com/llvm/llvm-project/pull/79548


[llvm-branch-commits] [mlir] 3df71e5 - [mlir][LLVM] Use int32_t to indirectly construct GEPArg (#79562)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

Author: Andrei Golubev
Date: 2024-01-29T10:29:59-08:00
New Revision: 3df71e5a3f5d5fb9436c53c298e5426f729288e2

URL: 
https://github.com/llvm/llvm-project/commit/3df71e5a3f5d5fb9436c53c298e5426f729288e2
DIFF: 
https://github.com/llvm/llvm-project/commit/3df71e5a3f5d5fb9436c53c298e5426f729288e2.diff

LOG: [mlir][LLVM] Use int32_t to indirectly construct GEPArg (#79562)

GEPArg can only be constructed from int32_t and mlir::Value. Explicitly
cast other types (e.g. unsigned, size_t) to int32_t to avoid narrowing
conversion warnings on MSVC. Some recent examples of such are:

```
mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398:
Element '1': conversion from 'size_t' to 'T' requires a narrowing
conversion
with
[
T=mlir::LLVM::GEPArg
]

mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398:
Element '1': conversion from 'unsigned int' to 'T' requires a narrowing
conversion
with
[
T=mlir::LLVM::GEPArg
]
```

Co-authored-by: Nikita Kudriavtsev 
(cherry picked from commit 89cd345667a5f8f4c37c621fd8abe8d84e85c050)
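
The cast described in the log message can be sketched in isolation. This is a minimal illustration, not the MLIR code: `IndexArg` is a hypothetical stand-in for `mlir::LLVM::GEPArg` that, like it, accepts only `int32_t`, and `makeIndices` is an invented helper that mirrors the `{0, index}` lists built in the patched loops.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for mlir::LLVM::GEPArg: constructible from int32_t only.
struct IndexArg {
  int32_t value;
  IndexArg(int32_t v) : value(v) {}
};

// Builds the list {0, index}. Passing `index` directly would rely on an
// implicit size_t -> int32_t conversion inside a braced list, which is the
// pattern MSVC reports in the log above; the explicit cast states the intent.
std::vector<IndexArg> makeIndices(std::size_t index) {
  // Assumption for this sketch: callers never exceed the int32_t range.
  assert(index <= static_cast<std::size_t>(INT32_MAX) && "index must fit in int32_t");
  return {IndexArg(0), IndexArg(static_cast<int32_t>(index))};
}
```

Calling `makeIndices(3)` yields two elements whose values are 0 and 3. The range assertion is an addition of this sketch and is not part of the patch.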

Added: 


Modified: 
mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp
mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp

Removed: 




diff  --git a/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp 
b/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp
index ae2bd8e5b5405d..73d418cb841327 100644
--- a/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp
+++ b/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp
@@ -529,7 +529,8 @@ LogicalResult GPUPrintfOpToVPrintfLowering::matchAndRewrite(
   /*alignment=*/0);
   for (auto [index, arg] : llvm::enumerate(args)) {
 Value ptr = rewriter.create(
-loc, ptrType, structType, tempAlloc, ArrayRef{0, index});
+loc, ptrType, structType, tempAlloc,
+ArrayRef{0, static_cast<int32_t>(index)});
 rewriter.create(loc, arg, ptr);
   }
   std::array printfArgs = {stringStart, tempAlloc};

diff  --git a/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp 
b/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
index f853d5c47b623c..78d4e806246872 100644
--- a/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
+++ b/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
@@ -1041,13 +1041,14 @@ Value 
ConvertLaunchFuncOpToGpuRuntimeCallPattern::generateParamsArray(
   auto arrayPtr = builder.create(
   loc, llvmPointerType, llvmPointerType, arraySize, /*alignment=*/0);
   for (const auto &en : llvm::enumerate(arguments)) {
+const auto index = static_cast<int32_t>(en.index());
 Value fieldPtr =
 builder.create(loc, llvmPointerType, structType, 
structPtr,
-ArrayRef{0, en.index()});
+ArrayRef{0, index});
 builder.create(loc, en.value(), fieldPtr);
-auto elementPtr = builder.create(
-loc, llvmPointerType, llvmPointerType, arrayPtr,
-ArrayRef{en.index()});
+auto elementPtr =
+builder.create(loc, llvmPointerType, llvmPointerType,
+arrayPtr, ArrayRef{index});
 builder.create(loc, fieldPtr, elementPtr);
   }
   return arrayPtr;

diff  --git a/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp 
b/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp
index 72f9295749a66b..b25c831bc7172a 100644
--- a/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp
+++ b/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp
@@ -488,7 +488,8 @@ static void splitVectorStore(const DataLayout &dataLayout, 
Location loc,
 // Other patterns will turn this into a type-consistent GEP.
 auto gepOp = rewriter.create(
 loc, address.getType(), rewriter.getI8Type(), address,
-ArrayRef{storeOffset + index * elementSize});
+ArrayRef{
+static_cast<int32_t>(storeOffset + index * elementSize)});
 
 rewriter.create(loc, extractOp, gepOp);
   }
@@ -524,9 +525,9 @@ static void splitIntegerStore(const DataLayout &dataLayout, 
Location loc,
 
 // We create an `i8` indexed GEP here as that is the easiest (offset is
 // already known). Other patterns turn this into a type-consistent GEP.
-auto gepOp =
-rewriter.create(loc, address.getType(), rewriter.getI8Type(),
-   address, ArrayRef{currentOffset});
+auto gepOp = rewriter.create(
+loc, address.getType(), rewriter.getI8Type(), address,
+ArrayRef{static_cast<int32_t>(currentOffset)});
 rewriter.create(loc, valueToStore, gepOp);
 
 // No need to care about padding here since we already checked previously





[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79603
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: 3df71e5a3f5d5fb9436c53c298e5426f729288e2

https://github.com/llvm/llvm-project/pull/79603


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79907)

2024-01-29 Thread Craig Topper via llvm-branch-commits

https://github.com/topperc created 
https://github.com/llvm/llvm-project/pull/79907

Resolves https://github.com/llvm/llvm-project/issues/79479.

>From 8fb154776db1627da75e6d67cf468d5b55868e93 Mon Sep 17 00:00:00 2001
From: Craig Topper 
Date: Thu, 25 Jan 2024 09:14:52 -0800
Subject: [PATCH 1/2] [RISCV] Support __riscv_v_fixed_vlen for vbool types.
 (#76551)

This adopts a similar behavior to AArch64 SVE, where bool vectors are
represented as a vector of chars with 1/8 the number of elements. This
ensures the vector always occupies a power of 2 number of bytes.

A consequence of this is that vbool64_t, vbool32_t, and vbool16_t can
only be used with a vector length that guarantees at least 8 bits.
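
The size rule above can be worked through with a small sketch. It is an illustration of the arithmetic only, not Clang's implementation: `fixedMaskBytes` is a hypothetical helper, `vlen` stands for `__riscv_v_fixed_vlen`, and `n` is the number in the `vboolN_t` type name. Both are assumed to be powers of two, so the division is exact.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: returns the byte size of a fixed-length vboolN_t for a
// given VLEN, or std::nullopt when the mask would not fill a whole byte.
std::optional<uint32_t> fixedMaskBytes(uint32_t vlen, uint32_t n) {
  assert(n != 0 && "n comes from the vboolN_t type name");
  uint32_t bits = vlen / n;  // number of mask bits: VLEN divided by N
  if (bits == 0 || bits % 8 != 0)
    return std::nullopt;     // fewer than 8 bits, or not a whole number of bytes
  return bits / 8;           // stored as chars, 1/8 the number of elements
}
```

Under these assumptions, `fixedMaskBytes(128, 8)` gives 2 bytes, while `fixedMaskBytes(64, 16)` gives no value because only 4 mask bits are available, which matches the restriction on vbool64_t, vbool32_t, and vbool16_t stated above.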
---
 clang/docs/ReleaseNotes.rst   |   2 +
 clang/include/clang/AST/Type.h|   3 +
 clang/include/clang/Basic/AttrDocs.td |   5 +-
 clang/lib/AST/ASTContext.cpp  |  20 +-
 clang/lib/AST/ItaniumMangle.cpp   |  25 +-
 clang/lib/AST/JSONNodeDumper.cpp  |   3 +
 clang/lib/AST/TextNodeDumper.cpp  |   3 +
 clang/lib/AST/Type.cpp|  15 +-
 clang/lib/AST/TypePrinter.cpp |   2 +
 clang/lib/CodeGen/Targets/RISCV.cpp   |  21 +-
 clang/lib/Sema/SemaExpr.cpp   |   6 +-
 clang/lib/Sema/SemaType.cpp   |  21 +-
 .../attr-riscv-rvv-vector-bits-bitcast.c  | 100 ++
 .../CodeGen/attr-riscv-rvv-vector-bits-call.c |  74 +
 .../CodeGen/attr-riscv-rvv-vector-bits-cast.c |  76 -
 .../attr-riscv-rvv-vector-bits-codegen.c  | 172 +++
 .../attr-riscv-rvv-vector-bits-globals.c  | 107 +++
 .../attr-riscv-rvv-vector-bits-types.c| 284 ++
 .../riscv-mangle-rvv-fixed-vectors.cpp|  72 +
 clang/test/Sema/attr-riscv-rvv-vector-bits.c  |  88 +-
 20 files changed, 1065 insertions(+), 34 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a5..45d1ab34d0f931 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1227,6 +1227,8 @@ RISC-V Support
 - Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f
   for RV64.
 
+- ``__attribute__((rvv_vector_bits(N))) is now supported for RVV vbool*_t 
types.
+
 CUDA/HIP Language Changes
 ^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index ea425791fc97f0..6384cf9420b82e 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -3495,6 +3495,9 @@ enum class VectorKind {
 
   /// is RISC-V RVV fixed-length data vector
   RVVFixedLengthData,
+
+  /// is RISC-V RVV fixed-length mask vector
+  RVVFixedLengthMask,
 };
 
 /// Represents a GCC generic vector type. This type is created using
diff --git a/clang/include/clang/Basic/AttrDocs.td 
b/clang/include/clang/Basic/AttrDocs.td
index 7e633f8e2635a9..e02a1201e2ad79 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -2424,7 +2424,10 @@ only be a power of 2 between 64 and 65536.
 For types where LMUL!=1, ``__riscv_v_fixed_vlen`` needs to be scaled by the 
LMUL
 of the type before passing to the attribute.
 
-``vbool*_t`` types are not supported at this time.
+For ``vbool*_t`` types, ``__riscv_v_fixed_vlen`` needs to be divided by the
+number from the type name. For example, ``vbool8_t`` needs to use
+``__riscv_v_fixed_vlen`` / 8. If the resulting value is not a multiple of 8,
+the type is not supported for that value of ``__riscv_v_fixed_vlen``.
 }];
 }
 
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 5eb7aa3664569d..ab16ca10395fa8 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -1945,7 +1945,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const 
{
 else if (VT->getVectorKind() == VectorKind::SveFixedLengthPredicate)
   // Adjust the alignment for fixed-length SVE predicates.
   Align = 16;
-else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData)
+else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData ||
+ VT->getVectorKind() == VectorKind::RVVFixedLengthMask)
   // Adjust the alignment for fixed-length RVV vectors.
   Align = std::min(64, Width);
 break;
@@ -9416,7 +9417,9 @@ bool ASTContext::areCompatibleVectorTypes(QualType 
FirstVec,
   Second->getVectorKind() != VectorKind::SveFixedLengthData &&
   Second->getVectorKind() != VectorKind::SveFixedLengthPredicate &&
   First->getVectorKind() != VectorKind::RVVFixedLengthData &&
-  Second->getVectorKind() != VectorKind::RVVFixedLengthData)
+  Second->getVectorKind() != VectorKind::RVVFixedLengthData &&
+  First->getVectorKind() != VectorKind::RVVFixedLengthMask &&
+  Second->getVectorKind() != VectorKind::RVVFixedLengthMask)
 return true;
 
   return false;
@@ -9522,8 +9525,11 @@ static uint64_t 

[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79907)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Craig Topper (topperc)


Changes

Resolves https://github.com/llvm/llvm-project/issues/79479.

---

Patch is 92.08 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/79907.diff


20 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+2) 
- (modified) clang/include/clang/AST/Type.h (+3) 
- (modified) clang/include/clang/Basic/AttrDocs.td (+4-1) 
- (modified) clang/lib/AST/ASTContext.cpp (+16-4) 
- (modified) clang/lib/AST/ItaniumMangle.cpp (+17-8) 
- (modified) clang/lib/AST/JSONNodeDumper.cpp (+3) 
- (modified) clang/lib/AST/TextNodeDumper.cpp (+3) 
- (modified) clang/lib/AST/Type.cpp (+14-1) 
- (modified) clang/lib/AST/TypePrinter.cpp (+2) 
- (modified) clang/lib/CodeGen/Targets/RISCV.cpp (+15-6) 
- (modified) clang/lib/Sema/SemaExpr.cpp (+4-2) 
- (modified) clang/lib/Sema/SemaType.cpp (+15-6) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-bitcast.c (+100) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-call.c (+74) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-cast.c (+72-4) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-codegen.c (+172) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-globals.c (+107) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-types.c (+284) 
- (modified) clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp (+72) 
- (modified) clang/test/Sema/attr-riscv-rvv-vector-bits.c (+86-2) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a5..2f4fe8bf7556e7 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1227,6 +1227,8 @@ RISC-V Support
 - Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f
   for RV64.
 
+- ``__attribute__((rvv_vector_bits(N)))`` is now supported for RVV vbool*_t 
types.
+
 CUDA/HIP Language Changes
 ^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index ea425791fc97f0..6384cf9420b82e 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -3495,6 +3495,9 @@ enum class VectorKind {
 
   /// is RISC-V RVV fixed-length data vector
   RVVFixedLengthData,
+
+  /// is RISC-V RVV fixed-length mask vector
+  RVVFixedLengthMask,
 };
 
 /// Represents a GCC generic vector type. This type is created using
diff --git a/clang/include/clang/Basic/AttrDocs.td 
b/clang/include/clang/Basic/AttrDocs.td
index 7e633f8e2635a9..e02a1201e2ad79 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -2424,7 +2424,10 @@ only be a power of 2 between 64 and 65536.
 For types where LMUL!=1, ``__riscv_v_fixed_vlen`` needs to be scaled by the 
LMUL
 of the type before passing to the attribute.
 
-``vbool*_t`` types are not supported at this time.
+For ``vbool*_t`` types, ``__riscv_v_fixed_vlen`` needs to be divided by the
+number from the type name. For example, ``vbool8_t`` needs to use
+``__riscv_v_fixed_vlen`` / 8. If the resulting value is not a multiple of 8,
+the type is not supported for that value of ``__riscv_v_fixed_vlen``.
 }];
 }
 
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 5eb7aa3664569d..ab16ca10395fa8 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -1945,7 +1945,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const 
{
 else if (VT->getVectorKind() == VectorKind::SveFixedLengthPredicate)
   // Adjust the alignment for fixed-length SVE predicates.
   Align = 16;
-else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData)
+else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData ||
+ VT->getVectorKind() == VectorKind::RVVFixedLengthMask)
   // Adjust the alignment for fixed-length RVV vectors.
   Align = std::min(64, Width);
 break;
@@ -9416,7 +9417,9 @@ bool ASTContext::areCompatibleVectorTypes(QualType 
FirstVec,
   Second->getVectorKind() != VectorKind::SveFixedLengthData &&
   Second->getVectorKind() != VectorKind::SveFixedLengthPredicate &&
   First->getVectorKind() != VectorKind::RVVFixedLengthData &&
-  Second->getVectorKind() != VectorKind::RVVFixedLengthData)
+  Second->getVectorKind() != VectorKind::RVVFixedLengthData &&
+  First->getVectorKind() != VectorKind::RVVFixedLengthMask &&
+  Second->getVectorKind() != VectorKind::RVVFixedLengthMask)
 return true;
 
   return false;
@@ -9522,8 +9525,11 @@ static uint64_t getRVVTypeSize(ASTContext &Context, 
const BuiltinType *Ty) {
 
   ASTContext::BuiltinVectorTypeInfo Info = 
Context.getBuiltinVectorTypeInfo(Ty);
 
-  uint64_t EltSize = Context.getTypeSize(Info.ElementType);
-  uint64_t MinElts = Info.EC.getKnownMinValue();
+  unsigned EltSize = Context.getTypeSize(Info.ElementType);
+  if (Info.ElementType == Context.BoolTy)
+EltSize = 1;
+
+  unsigned Min

[llvm-branch-commits] [clang] b73cd5e - Revert "[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768)"

2024-01-29 Thread Tom Stellard via llvm-branch-commits

Author: Alexander Kornienko
Date: 2024-01-29T14:57:23-08:00
New Revision: b73cd5ec714740283841e0fc1f3ebebe65dd329a

URL: 
https://github.com/llvm/llvm-project/commit/b73cd5ec714740283841e0fc1f3ebebe65dd329a
DIFF: 
https://github.com/llvm/llvm-project/commit/b73cd5ec714740283841e0fc1f3ebebe65dd329a.diff

LOG: Revert "[SemaCXX] Implement CWG2137 (list-initialization from objects of 
the same type) (#77768)"

This reverts commit 924701311aa79180e86ad8ce43d253f27d25ec7d. Causes compilation
errors on valid code, see
https://github.com/llvm/llvm-project/pull/77768#issuecomment-1908062472.

(cherry picked from commit 6e4930c67508a90bdfd756f6e45417b5253cd741)

Added: 


Modified: 
clang/lib/Sema/SemaInit.cpp
clang/lib/Sema/SemaOverload.cpp
clang/test/CXX/drs/dr14xx.cpp
clang/test/CXX/drs/dr21xx.cpp
clang/test/CXX/drs/dr23xx.cpp
clang/www/cxx_dr_status.html

libcxx/test/std/utilities/utility/pairs/pairs.pair/ctor.pair_U_V_move.pass.cpp

Removed: 




diff  --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 91e4cb7b68a24a..457fa377355a97 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -4200,7 +4200,7 @@ static OverloadingResult ResolveConstructorOverload(
 /// \param IsListInit Is this list-initialization?
 /// \param IsInitListCopy Is this non-list-initialization resulting from a
 ///   list-initialization from {x} where x is the same
-///   aggregate type as the entity?
+///   type as the entity?
 static void TryConstructorInitialization(Sema &S,
  const InitializedEntity &Entity,
  const InitializationKind &Kind,
@@ -4230,14 +4230,6 @@ static void TryConstructorInitialization(Sema &S,
 Entity.getKind() !=
 InitializedEntity::EK_LambdaToBlockConversionBlockElement);
 
-  bool CopyElisionPossible = false;
-  auto ElideConstructor = [&] {
-// Convert qualifications if necessary.
-Sequence.AddQualificationConversionStep(DestType, VK_PRValue);
-if (ILE)
-  Sequence.RewrapReferenceInitList(DestType, ILE);
-  };
-
   // C++17 [dcl.init]p17:
   // - If the initializer expression is a prvalue and the cv-unqualified
   //   version of the source type is the same class as the class of the
@@ -4250,17 +4242,11 @@ static void TryConstructorInitialization(Sema &S,
   if (S.getLangOpts().CPlusPlus17 && !RequireActualConstructor &&
   UnwrappedArgs.size() == 1 && UnwrappedArgs[0]->isPRValue() &&
   S.Context.hasSameUnqualifiedType(UnwrappedArgs[0]->getType(), DestType)) 
{
-if (ILE && !DestType->isAggregateType()) {
-  // CWG2311: T{ prvalue_of_type_T } is not eligible for copy elision
-  // Make this an elision if this won't call an initializer-list
-  // constructor. (Always on an aggregate type or check constructors 
first.)
-  assert(!IsInitListCopy &&
- "IsInitListCopy only possible with aggregate types");
-  CopyElisionPossible = true;
-} else {
-  ElideConstructor();
-  return;
-}
+// Convert qualifications if necessary.
+Sequence.AddQualificationConversionStep(DestType, VK_PRValue);
+if (ILE)
+  Sequence.RewrapReferenceInitList(DestType, ILE);
+return;
   }
 
   const RecordType *DestRecordType = DestType->getAs<RecordType>();
@@ -4305,12 +4291,6 @@ static void TryConstructorInitialization(Sema &S,
   S, Kind.getLocation(), Args, CandidateSet, DestType, Ctors, Best,
   CopyInitialization, AllowExplicit,
   /*OnlyListConstructors=*/true, IsListInit, RequireActualConstructor);
-
-if (CopyElisionPossible && Result == OR_No_Viable_Function) {
-  // No initializer list candidate
-  ElideConstructor();
-  return;
-}
   }
 
   // C++11 [over.match.list]p1:
@@ -4592,9 +4572,9 @@ static void TryListInitialization(Sema &S,
 return;
   }
 
-  // C++11 [dcl.init.list]p3, per DR1467 and DR2137:
-  // - If T is an aggregate class and the initializer list has a single element
-  //   of type cv U, where U is T or a class derived from T, the object is
+  // C++11 [dcl.init.list]p3, per DR1467:
+  // - If T is a class type and the initializer list has a single element of
+  //   type cv U, where U is T or a class derived from T, the object is
   //   initialized from that element (by copy-initialization for
   //   copy-list-initialization, or by direct-initialization for
   //   direct-list-initialization).
@@ -4605,7 +4585,7 @@ static void TryListInitialization(Sema &S,
   // - Otherwise, if T is an aggregate, [...] (continue below).
   if (S.getLangOpts().CPlusPlus11 && InitList->getNumInits() == 1 &&
   !IsDesignatedInit) {
-if (DestType->isRecordType() && DestType->isAggregateType()) {
+if (DestType->isRecordType()) {
   QualType InitType = InitList->get

[llvm-branch-commits] [libcxx] [clang] PR for llvm/llvm-project#79762 (PR #79763)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: b73cd5ec714740283841e0fc1f3ebebe65dd329a

https://github.com/llvm/llvm-project/pull/79763


[llvm-branch-commits] [clang] [libcxx] PR for llvm/llvm-project#79762 (PR #79763)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79763


[llvm-branch-commits] [llvm] 4c8cf4a - [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

Author: Jay Foad
Date: 2024-01-29T15:00:47-08:00
New Revision: 4c8cf4a1c29da834f1999a1c56c7e637c6886825

URL: 
https://github.com/llvm/llvm-project/commit/4c8cf4a1c29da834f1999a1c56c7e637c6886825
DIFF: 
https://github.com/llvm/llvm-project/commit/4c8cf4a1c29da834f1999a1c56c7e637c6886825.diff

LOG: [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325)

This is only valid on targets with architected SGPRs.
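
For illustration only, the value this intrinsic returns can be modelled in
plain C++ as an unsigned bitfield extract. This is a rough sketch that assumes,
as the comments in the patch state, that the wave ID within the workgroup lives
in bits [29:25] of TTMP8. The names `extractBits` and `extractWaveID` and the
raw `ttmp8` argument are hypothetical and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an unsigned bitfield extract: return `width` bits of
// `value`, starting at bit position `lsb`.
inline uint32_t extractBits(uint32_t value, unsigned lsb, unsigned width) {
  assert(lsb < 32 && width >= 1 && width <= 32 && "field must fit in 32 bits");
  // A 32-bit wide mask cannot be built with a shift, so special-case it.
  const uint32_t mask = width == 32 ? ~0u : (1u << width) - 1u;
  return (value >> lsb) & mask;
}

// Assumption taken from the patch comments: waveIDinGroup is TTMP8[29:25],
// i.e. a 5-bit field whose least significant bit is bit 25.
inline uint32_t extractWaveID(uint32_t ttmp8) {
  return extractBits(ttmp8, /*lsb=*/25, /*width=*/5);
}
```

With this model, a register value of `7u << 25` would yield a wave ID of 7, and
any bits below bit 25 or above bit 29 are ignored.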

Added: 
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll

Modified: 
llvm/include/llvm/IR/IntrinsicsAMDGPU.td
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.h

Removed: 




diff  --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td 
b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
index 9eb1ac8e27bef..c5f43d17d1c14 100644
--- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
@@ -2777,6 +2777,10 @@ class AMDGPULoadTr:
 
 def int_amdgcn_global_load_tr : AMDGPULoadTr;
 
+// i32 @llvm.amdgcn.wave.id()
+def int_amdgcn_wave_id :
+  DefaultAttrsIntrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrSpeculatable]>;
+
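 //===----------------------------------------------------------------------===//
 // Deep learning intrinsics.
 //===----------------------------------------------------------------------===//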

diff  --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 615685822f91e..e98ede88a7e2d 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr 
&MI,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI,
+ MachineIRBuilder &B) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!ST.hasArchitectedSGPRs())
+return false;
+  LLT S32 = LLT::scalar(32);
+  Register DstReg = MI.getOperand(0).getReg();
+  Register TTMP8 =
+  getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8,
+   AMDGPU::SReg_32RegClass, B.getDebugLoc(), S32);
+  auto LSB = B.buildConstant(S32, 25);
+  auto Width = B.buildConstant(S32, 5);
+  B.buildUbfx(DstReg, TTMP8, LSB, Width);
+  MI.eraseFromParent();
+  return true;
+}
+
 bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
 MachineInstr &MI) const {
   MachineIRBuilder &B = Helper.MIRBuilder;
@@ -7005,6 +7022,8 @@ bool 
AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
   case Intrinsic::amdgcn_workgroup_id_z:
 return legalizePreloadedArgIntrin(MI, MRI, B,
   AMDGPUFunctionArgInfo::WORKGROUP_ID_Z);
+  case Intrinsic::amdgcn_wave_id:
+return legalizeWaveID(MI, B);
   case Intrinsic::amdgcn_lds_kernel_id:
 return legalizePreloadedArgIntrin(MI, MRI, B,
   AMDGPUFunctionArgInfo::LDS_KERNEL_ID);

diff  --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h 
b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
index 56aabd4f6ab71..ecbe42681c669 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
@@ -212,6 +212,7 @@ class AMDGPULegalizerInfo final : public LegalizerInfo {
 
   bool legalizeFPTruncRound(MachineInstr &MI, MachineIRBuilder &B) const;
   bool legalizeStackSave(MachineInstr &MI, MachineIRBuilder &B) const;
+  bool legalizeWaveID(MachineInstr &MI, MachineIRBuilder &B) const;
 
   bool legalizeImageIntrinsic(
   MachineInstr &MI, MachineIRBuilder &B,

diff  --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index d60f511302613..c5ad9da88ec2b 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -7920,6 +7920,18 @@ SDValue SITargetLowering::lowerSBuffer(EVT VT, SDLoc DL, 
SDValue Rsrc,
   return Loads[0];
 }
 
+SDValue SITargetLowering::lowerWaveID(SelectionDAG &DAG, SDValue Op) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!Subtarget->hasArchitectedSGPRs())
+return {};
+  SDLoc SL(Op);
+  MVT VT = MVT::i32;
+  SDValue TTMP8 = CreateLiveInRegister(DAG, &AMDGPU::SReg_32RegClass,
+   AMDGPU::TTMP8, VT, SL);
+  return DAG.getNode(AMDGPUISD::BFE_U32, SL, VT, TTMP8,
+ DAG.getConstant(25, SL, VT), DAG.getConstant(5, SL, VT));
+}
+
 SDValue SITargetLowering::lowerWorkitemID(SelectionDAG &DAG, SDValue Op,
   unsigned Dim,
   const ArgDescriptor &Arg) const {
@@ -8090,6 +8102,8 @@ SDValue SITargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue 
Op,
   case

[llvm-branch-commits] [llvm] 824a3e5 - [AMDGPU] Do not bother adding reserved registers to liveins (#79436)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

Author: Jay Foad
Date: 2024-01-29T15:00:47-08:00
New Revision: 824a3e5dec3aabc91428f009c1f439a75f577469

URL: 
https://github.com/llvm/llvm-project/commit/824a3e5dec3aabc91428f009c1f439a75f577469
DIFF: 
https://github.com/llvm/llvm-project/commit/824a3e5dec3aabc91428f009c1f439a75f577469.diff

LOG: [AMDGPU] Do not bother adding reserved registers to liveins (#79436)

Tweak the implementation of llvm.amdgcn.wave.id to not add TTMP8 to the
function liveins.

Added: 


Modified: 
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.cpp

Removed: 




diff  --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index e98ede88a7e2d..17ffb7ec988f0 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -6890,9 +6890,7 @@ bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI,
 return false;
   LLT S32 = LLT::scalar(32);
   Register DstReg = MI.getOperand(0).getReg();
-  Register TTMP8 =
-  getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8,
-   AMDGPU::SReg_32RegClass, B.getDebugLoc(), S32);
+  auto TTMP8 = B.buildCopy(S32, Register(AMDGPU::TTMP8));
   auto LSB = B.buildConstant(S32, 25);
   auto Width = B.buildConstant(S32, 5);
   B.buildUbfx(DstReg, TTMP8, LSB, Width);

diff  --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index c5ad9da88ec2b..d6bf0d8cb2efa 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -7926,8 +7926,7 @@ SDValue SITargetLowering::lowerWaveID(SelectionDAG &DAG, 
SDValue Op) const {
 return {};
   SDLoc SL(Op);
   MVT VT = MVT::i32;
-  SDValue TTMP8 = CreateLiveInRegister(DAG, &AMDGPU::SReg_32RegClass,
-   AMDGPU::TTMP8, VT, SL);
+  SDValue TTMP8 = DAG.getCopyFromReg(DAG.getEntryNode(), SL, AMDGPU::TTMP8, VT);
   return DAG.getNode(AMDGPUISD::BFE_U32, SL, VT, TTMP8,
  DAG.getConstant(25, SL, VT), DAG.getConstant(5, SL, VT));
 }





[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79839


[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: 824a3e5dec3aabc91428f009c1f439a75f577469 
4c8cf4a1c29da834f1999a1c56c7e637c6886825

https://github.com/llvm/llvm-project/pull/79839


[llvm-branch-commits] [llvm] [BPF] expand cttz, ctlz for i32, i64 (PR #73668)

2024-01-29 Thread via llvm-branch-commits

eddyz87 wrote:

> @eddyz87 Could you please take a look? This has been stalled for a while :)

Hello, I tried this with a simple C test:

```c
unsigned int test(unsigned int v) {
  return __builtin_ctz(v);
  //return __builtin_clz(v);
}
```



The clz part compiles fine, but when ctz is used I still get an assertion,
although a different one:


```
$ clang  --target=bpf -S -O2 test-clz.c -o -
...
LLVM ERROR: Cannot select: t15: i64 = ConstantPool<[32 x i8] 
c"\00\01\1C\02\1D\0E\18\03\1E\16\14\0F\19\11\04\08\1F\1B\0D\17\15\13\10\07\1A\0C\12\06\0B\05\0A\09">
 0
In function: test
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and 
include the crash backtrace.
Stack dump:
0.  Program arguments: llc -debug-only=isel --asm-show-inst -mtriple=bpf 
-mcpu=v3 -filetype=obj -o - test-clz.ll
1.  Running pass 'Function Pass Manager' on module 'test-clz.ll'.
2.  Running pass 'BPF DAG->DAG Pattern Instruction Selection' on function 
'@test'
 #0 0x55d4977b1db8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) 
/home/eddy/work/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x55d4977afeb0 llvm::sys::RunSignalHandlers() 
/home/eddy/work/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x55d4977b2588 SignalHandler(int) 
/home/eddy/work/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x7fdbaa03f190 __restore_rt (/lib64/libc.so.6+0x3f190)
 #4 0x7fdbaa091dec __pthread_kill_implementation (/lib64/libc.so.6+0x91dec)
 #5 0x7fdbaa03f0c6 gsignal (/lib64/libc.so.6+0x3f0c6)
 #6 0x7fdbaa0268d7 abort (/lib64/libc.so.6+0x268d7)
 #7 0x55d497735145 llvm::report_fatal_error(llvm::Twine const&, bool) 
/home/eddy/work/llvm-project/llvm/lib/Support/ErrorHandling.cpp:125:5
 #8 0x55d4975f4b6d llvm::SDNode::getValueType(unsigned int) const 
/home/eddy/work/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:1007:5
 #9 0x55d4975f4b6d llvm::SDValue::getValueType() const 
/home/eddy/work/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:1162:16
#10 0x55d4975f4b6d llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) 
/home/eddy/work/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:4232:43
```




Looking at the debug output from llc, it looks like `cttz_zero_undef` is
expanded using some kind of lookup table:


```
$ llc -debug-only=isel --asm-show-inst -mtriple=bpf -mcpu=v3 -filetype=asm -o - 
test-clz.ll
...
Type-legalized selection DAG: %bb.0 'test:entry'
SelectionDAG has 7 nodes:
  t0: ch,glue = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
t3: i32 = cttz_zero_undef t2
  t5: ch,glue = CopyToReg t0, Register:i32 $w0, t3
  t6: ch = BPFISD::RET_GLUE t5, Register:i32 $w0, t5:1



Legalized selection DAG: %bb.0 'test:entry'
SelectionDAG has 18 nodes:
  t0: ch,glue = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
t8: i32 = sub Constant:i32<0>, t2
  t9: i32 = and t2, t8
t11: i32 = mul t9, Constant:i32<125613361>
  t13: i32 = srl t11, Constant:i32<27>
t14: i64 = sign_extend t13
  t16: i64 = add ConstantPool:i64<[32 x i8] 
c"\00\01\1C\02\1D\0E\18\03\1E\16\14\0F\19\11\04\08\1F\1B\0D\17\15\13\10\07\1A\0C\12\06\0B\05\0A\09">
 0, t14
t18: i32,ch = load<(load (s8) from constant-pool), zext from i8> t0, t16, 
undef:i64
  t5: ch,glue = CopyToReg t0, Register:i32 $w0, t18
  t6: ch = BPFISD::RET_GLUE t5, Register:i32 $w0, t5:1
```
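
For reference, the legalized DAG above appears to correspond to the well-known
de Bruijn multiplication trick for counting trailing zeros. The following is
only a sketch of what that sequence seems to compute, written in C++; the table
bytes and the multiplier 125613361 are copied from the dump, while the names
`kCttzTable` and `cttzViaTable` are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// The 32-byte lookup table, copied from the ConstantPool node in the dump.
static const uint8_t kCttzTable[32] = {
    0,  1,  28, 2,  29, 14, 24, 3,  30, 22, 20, 15, 25, 17, 4,  8,
    31, 27, 13, 23, 21, 19, 16, 7,  26, 12, 18, 6,  11, 5,  10, 9};

// Count trailing zeros of a non-zero 32-bit value without a ctz instruction.
inline unsigned cttzViaTable(uint32_t v) {
  assert(v != 0 && "undefined for zero, as with cttz_zero_undef");
  // Isolate the lowest set bit: and t2, (sub 0, t2).
  const uint32_t lowestBit = v & (0u - v);
  // Multiply by the constant and keep the top 5 bits: mul, then srl by 27.
  const uint32_t index = (lowestBit * 125613361u) >> 27;
  // Byte load from the table; this is the ConstantPool access in the DAG.
  return kCttzTable[index];
}
```

The table load in the last step is the part that corresponds to the
`ConstantPool` node the BPF backend cannot select.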


If there is no way to convince lowering to use some other strategy, and you 
don't want to spend time on implementing translation for `ConstantPool`, I 
think it would be fine to leave `ctz` as-is, or just adjust error reporting so 
that it clearly says that `ctz` is not supported, without showing a stack trace. 
What do you think?

https://github.com/llvm/llvm-project/pull/73668


[llvm-branch-commits] [llvm] bdaf16d - [LoopVectorize] Refine runtime memory check costs when there is an outer loop (#76034)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

Author: David Sherwood
Date: 2024-01-29T15:05:18-08:00
New Revision: bdaf16d59f4a64529371cbe056245f6cc035d7cf

URL: 
https://github.com/llvm/llvm-project/commit/bdaf16d59f4a64529371cbe056245f6cc035d7cf
DIFF: 
https://github.com/llvm/llvm-project/commit/bdaf16d59f4a64529371cbe056245f6cc035d7cf.diff

LOG: [LoopVectorize] Refine runtime memory check costs when there is an outer 
loop (#76034)

When we generate runtime memory checks for an inner loop it's
possible that these checks are invariant in the outer loop and
so will get hoisted out. In such cases, the effective cost of
the checks should reduce to reflect the outer loop trip count.

This fixes a 25% performance regression introduced by commit

49b0e6dcc296792b577ae8f0f674e61a0929b99d

when building the SPEC2017 x264 benchmark with PGO, where we
decided the inner loop trip count wasn't high enough to warrant
the (incorrect) high cost of the runtime checks. Also, when
runtime memory checks consist entirely of diff checks these are
likely to be outer loop invariant.

(cherry picked from commit 962fbafecf4730ba84a3b9fd7a662a5c30bb2c7c)
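
As a rough illustration of the cost adjustment described above, the following
stand-alone C++ sketch divides the memory check cost by the best available
outer loop trip count and clamps the result to at least 1. It is a simplified
model, not the LoopVectorize code: `effectiveMemCheckCost` and its parameters
are hypothetical, and a trip count of 0 is assumed to mean "unknown".

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of the refined cost: if the checks are invariant in the
// outer loop they are expected to be hoisted, so their cost is amortised over
// the outer loop's iterations.
inline int64_t effectiveMemCheckCost(int64_t memCheckCost,
                                     bool invariantInOuterLoop,
                                     unsigned outerTripCount) {
  assert(memCheckCost >= 0 && "costs are expected to be non-negative");
  if (!invariantInOuterLoop)
    return memCheckCost;
  // With no trip count information, assume the outer loop runs at least twice.
  const unsigned bestTripCount = outerTripCount != 0 ? outerTripCount : 2;
  // Never let the amortised cost drop below 1.
  return std::max<int64_t>(memCheckCost / static_cast<int64_t>(bestTripCount),
                           1);
}
```

Under these assumptions a check cost of 100 becomes 50 when nothing is known
about the outer loop, and 2 when the outer loop is known to run 50 times.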

Added: 
llvm/test/Transforms/LoopVectorize/AArch64/low_trip_memcheck_cost.ll

Modified: 
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp 
b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 6ca93e15719fb..dd596c567cd48 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1957,6 +1957,8 @@ class GeneratedRTChecks {
   bool CostTooHigh = false;
   const bool AddBranchWeights;
 
+  Loop *OuterLoop = nullptr;
+
 public:
   GeneratedRTChecks(ScalarEvolution &SE, DominatorTree *DT, LoopInfo *LI,
 TargetTransformInfo *TTI, const DataLayout &DL,
@@ -2053,6 +2055,9 @@ class GeneratedRTChecks {
   DT->eraseNode(SCEVCheckBlock);
   LI->removeBlock(SCEVCheckBlock);
 }
+
+// Outer loop is used as part of the later cost calculations.
+OuterLoop = L->getParentLoop();
   }
 
   InstructionCost getCost() {
@@ -2076,16 +2081,61 @@ class GeneratedRTChecks {
 LLVM_DEBUG(dbgs() << "  " << C << "  for " << I << "\n");
 RTCheckCost += C;
   }
-if (MemCheckBlock)
+if (MemCheckBlock) {
+  InstructionCost MemCheckCost = 0;
   for (Instruction &I : *MemCheckBlock) {
 if (MemCheckBlock->getTerminator() == &I)
   continue;
 InstructionCost C =
 TTI->getInstructionCost(&I, TTI::TCK_RecipThroughput);
 LLVM_DEBUG(dbgs() << "  " << C << "  for " << I << "\n");
-RTCheckCost += C;
+MemCheckCost += C;
   }
 
+  // If the runtime memory checks are being created inside an outer loop
+  // we should find out if these checks are outer loop invariant. If so,
+  // the checks will likely be hoisted out and so the effective cost will
+  // reduce according to the outer loop trip count.
+  if (OuterLoop) {
+ScalarEvolution *SE = MemCheckExp.getSE();
+// TODO: If profitable, we could refine this further by analysing every
+// individual memory check, since there could be a mixture of loop
+// variant and invariant checks that mean the final condition is
+// variant.
+const SCEV *Cond = SE->getSCEV(MemRuntimeCheckCond);
+if (SE->isLoopInvariant(Cond, OuterLoop)) {
+  // It seems reasonable to assume that we can reduce the effective
+  // cost of the checks even when we know nothing about the trip
+  // count. Assume that the outer loop executes at least twice.
+  unsigned BestTripCount = 2;
+
+  // If exact trip count is known use that.
+  if (unsigned SmallTC = SE->getSmallConstantTripCount(OuterLoop))
+BestTripCount = SmallTC;
+  else if (LoopVectorizeWithBlockFrequency) {
+// Else use profile data if available.
+if (auto EstimatedTC = getLoopEstimatedTripCount(OuterLoop))
+  BestTripCount = *EstimatedTC;
+  }
+
+  InstructionCost NewMemCheckCost = MemCheckCost / BestTripCount;
+
+  // Let's ensure the cost is always at least 1.
+  NewMemCheckCost = std::max(*NewMemCheckCost.getValue(),
+ (InstructionCost::CostType)1);
+
+  LLVM_DEBUG(dbgs()
+ << "We expect runtime memory checks to be hoisted "
+ << "out of the outer loop. Cost reduced from "
+ << MemCheckCost << " to " << NewMemCheckCost << '\n');
+
+  MemCheckCost = NewMemCheckCost;
+}
+  }
+
+  RTCheckCost += MemCheckCost;
+}
+
 if (SCEVCheckBlock || MemCheckBlock)
   LLVM_DEBUG(dbgs() << "Total cost of runtime checks: " << RTCheckCost
 << "\n");
@@ -2144,8 +2194,8 @@ clas

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: bdaf16d59f4a64529371cbe056245f6cc035d7cf

https://github.com/llvm/llvm-project/pull/79813


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79813


[llvm-branch-commits] [mlir] 0680e84 - [mlir] Revert to old fold logic in IR::Dialect::add{Types, Attributes}() (#79582)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

Author: Andrei Golubev
Date: 2024-01-29T15:10:47-08:00
New Revision: 0680e84a3f2a366a860bd0491f490a2fba800313

URL: 
https://github.com/llvm/llvm-project/commit/0680e84a3f2a366a860bd0491f490a2fba800313
DIFF: 
https://github.com/llvm/llvm-project/commit/0680e84a3f2a366a860bd0491f490a2fba800313.diff

LOG: [mlir] Revert to old fold logic in IR::Dialect::add{Types, Attributes}() 
(#79582)

Fold expressions in Clang are limited to 256 elements by default. This causes
compilation errors when the number of elements added exceeds this limit.
Side-step the issue by restoring the original trick that uses
std::initializer_list. For the record, in our downstream, Clang 16 gives:
mlir/include/mlir/IR/Dialect.h:269:23: fatal error: instantiating fold
expression with 688 arguments exceeded expression nesting limit of 256
(addType<Args>(), ...);

Partially reverts 26d811b3ecd2fa1ca3d9b41e17fb42b8c7ad03d6.

Co-authored-by: Nikita Kudriavtsev 
(cherry picked from commit e3a38a75ddc6ff00301ec19a0e2488d00f2cc297)
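
To make the difference between the two spellings concrete, here is a small
self-contained C++17 sketch. `Registry`, `addOne`, `addAllWithFold` and
`addAllWithInitList` are hypothetical stand-ins for the dialect registration
helpers; only the two pack expansion forms mirror the patch.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical stand-in for a dialect: it only counts registrations.
struct Registry {
  int registered = 0;

  template <typename T>
  void addOne() { ++registered; }

  // C++17 fold over the comma operator. The expression nests once per pack
  // element, which is what runs into Clang's nesting limit for large packs.
  template <typename... Args>
  void addAllWithFold() { (addOne<Args>(), ...); }

  // Older trick: expand the pack into the elements of an initializer_list.
  // Each element evaluates addOne<Args>() and then yields 0; the leading 0
  // keeps the list non-empty when the pack is empty.
  template <typename... Args>
  void addAllWithInitList() {
    (void)std::initializer_list<int>{0, (addOne<Args>(), 0)...};
  }
};

// Both spellings register every type in the pack exactly once.
inline void checkBothSpellings() {
  Registry viaFold, viaList;
  viaFold.addAllWithFold<int, char, double>();
  viaList.addAllWithInitList<int, char, double>();
  assert(viaFold.registered == 3 && viaList.registered == 3);
}
```

Elements of a braced initializer list are evaluated in order, so the
registration order is the same in both variants.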

Added: 


Modified: 
mlir/include/mlir/IR/Dialect.h

Removed: 




diff  --git a/mlir/include/mlir/IR/Dialect.h b/mlir/include/mlir/IR/Dialect.h
index 45f29f37dd3b97..50f6f6de5c2897 100644
--- a/mlir/include/mlir/IR/Dialect.h
+++ b/mlir/include/mlir/IR/Dialect.h
@@ -281,7 +281,11 @@ class Dialect {
   /// Register a set of type classes with this dialect.
   template <typename... Args>
   void addTypes() {
-    (addType<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addType<Args>(), 0)...};
   }
 
   /// Register a type instance with this dialect.
@@ -292,7 +296,11 @@ class Dialect {
   /// Register a set of attribute classes with this dialect.
   template <typename... Args>
   void addAttributes() {
-    (addAttribute<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addAttribute<Args>(), 0)...};
   }
 
   /// Register an attribute instance with this dialect.





[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79863


[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: 0680e84a3f2a366a860bd0491f490a2fba800313

https://github.com/llvm/llvm-project/pull/79863


[llvm-branch-commits] [llvm] bab01ae - Revert "[AArch64] merge index address with large offset into base address"

2024-01-29 Thread Tom Stellard via llvm-branch-commits

Author: David Green
Date: 2024-01-29T15:17:53-08:00
New Revision: bab01aead7d7a34436bc8e1639b90227374f079e

URL: 
https://github.com/llvm/llvm-project/commit/bab01aead7d7a34436bc8e1639b90227374f079e
DIFF: 
https://github.com/llvm/llvm-project/commit/bab01aead7d7a34436bc8e1639b90227374f079e.diff

LOG: Revert "[AArch64] merge index address with large offset into base address"

This reverts commit 32878c2065c8005b3ea30c79e16dfd7eed55d645 due to #79756 and 
#76202.

(cherry picked from commit 915c3d9e5a2d1314afe64cd6116a3b6c9809ec90)

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
llvm/lib/Target/AArch64/AArch64InstrInfo.h
llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
llvm/test/CodeGen/AArch64/arm64-addrmode.ll
llvm/test/CodeGen/AArch64/large-offset-ldr-merge.mir

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp 
b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 2e8d8c63d6bec..13e9d9725cc2e 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -4098,16 +4098,6 @@ AArch64InstrInfo::getLdStOffsetOp(const MachineInstr 
&MI) {
   return MI.getOperand(Idx);
 }
 
-const MachineOperand &
-AArch64InstrInfo::getLdStAmountOp(const MachineInstr &MI) {
-  switch (MI.getOpcode()) {
-  default:
-llvm_unreachable("Unexpected opcode");
-  case AArch64::LDRBBroX:
-return MI.getOperand(4);
-  }
-}
-
 static const TargetRegisterClass *getRegClass(const MachineInstr &MI,
   Register Reg) {
   if (MI.getParent() == nullptr)

diff  --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h 
b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
index db24a19fe5f8e..6526f6740747a 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
@@ -111,9 +111,6 @@ class AArch64InstrInfo final : public AArch64GenInstrInfo {
   /// Returns the immediate offset operator of a load/store.
   static const MachineOperand &getLdStOffsetOp(const MachineInstr &MI);
 
-  /// Returns the shift amount operator of a load/store.
-  static const MachineOperand &getLdStAmountOp(const MachineInstr &MI);
-
   /// Returns whether the instruction is FP or NEON.
   static bool isFpOrNEON(const MachineInstr &MI);
 

diff  --git a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp 
b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
index e90b8a8ca7ace..926a89466255c 100644
--- a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
+++ b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
@@ -62,8 +62,6 @@ STATISTIC(NumUnscaledPairCreated,
   "Number of load/store from unscaled generated");
 STATISTIC(NumZeroStoresPromoted, "Number of narrow zero stores promoted");
 STATISTIC(NumLoadsFromStoresPromoted, "Number of loads from stores promoted");
-STATISTIC(NumConstOffsetFolded,
-  "Number of const offset of index address folded");
 
 DEBUG_COUNTER(RegRenamingCounter, DEBUG_TYPE "-reg-renaming",
   "Controls which pairs are considered for renaming");
@@ -77,11 +75,6 @@ static cl::opt<unsigned> LdStLimit("aarch64-load-store-scan-limit",
 static cl::opt<unsigned> UpdateLimit("aarch64-update-scan-limit", cl::init(100),
                                      cl::Hidden);
 
-// The LdStConstLimit limits how far we search for const offset instructions
-// when we form index address load/store instructions.
-static cl::opt<unsigned> LdStConstLimit("aarch64-load-store-const-scan-limit",
-                                        cl::init(10), cl::Hidden);
-
 // Enable register renaming to find additional store pairing opportunities.
 static cl::opt<bool> EnableRenaming("aarch64-load-store-renaming",
                                     cl::init(true), cl::Hidden);
@@ -178,13 +171,6 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass {
   findMatchingUpdateInsnForward(MachineBasicBlock::iterator I,
 int UnscaledOffset, unsigned Limit);
 
-  // Scan the instruction list to find a register assigned with a const
-  // value that can be combined with the current instruction (a load or store)
-  // using base addressing with writeback. Scan forwards.
-  MachineBasicBlock::iterator
-  findMatchingConstOffsetBackward(MachineBasicBlock::iterator I, unsigned Limit,
-  unsigned &Offset);
-
   // Scan the instruction list to find a base register update that can
   // be combined with the current instruction (a load or store) using
   // pre or post indexed addressing with writeback. Scan backwards.
@@ -196,19 +182,11 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass {
   bool isMatchingUpdateInsn(MachineInstr &MemMI, MachineInstr &MI,
 unsigned BaseReg, int Offset);
 
-  bool isMatchingMovConstInsn(MachineInstr &MemMI, MachineInstr &MI,
-  unsigned IndexR

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79756 (PR #79814)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: bab01aead7d7a34436bc8e1639b90227374f079e

https://github.com/llvm/llvm-project/pull/79814


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79756 (PR #79814)

2024-01-29 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79814


[llvm-branch-commits] [llvm] [RISCV] Backport 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 to release/18.x (PR #79931)

2024-01-29 Thread Luke Lau via llvm-branch-commits

https://github.com/lukel97 milestoned 
https://github.com/llvm/llvm-project/pull/79931


[llvm-branch-commits] [llvm] [RISCV] Backport 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 to release/18.x (PR #79931)

2024-01-29 Thread Luke Lau via llvm-branch-commits

https://github.com/lukel97 created 
https://github.com/llvm/llvm-project/pull/79931

This cherry-picks the fix 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 for a 
miscompile (only with the -mrvv-vector-bits=zvl configuration or similar) 
introduced in bb8a8770e203ba027d141cd1200e93809ea66c8f, which is present in the 
18.x release branch. It also includes d407e6ca61a422f25841674d8f0b5ea0dbec85f8, 
a commit that adds a test.
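
The fix itself comes down to a boundary comparison when mapping a vector
register index onto one of the two shuffle sources. The sketch below models
that mapping in isolation. It assumes register indices number the registers of
V1 first and those of V2 second; `Source`, `SubRegRef` and `locate` are
hypothetical names, not LLVM APIs.

```cpp
#include <cassert>

enum class Source { V1, V2 };

// Which source a register comes from, and its position within that source.
struct SubRegRef {
  Source src;
  unsigned regWithinSrc;
};

// Map a combined register index onto (source, register within source).
inline SubRegRef locate(unsigned srcVecIdx, unsigned vregsPerSrc) {
  assert(vregsPerSrc > 0 && srcVecIdx < 2 * vregsPerSrc && "index out of range");
  // `>=` rather than `>`: an index equal to vregsPerSrc is already the first
  // register of V2. Using `>` would wrongly attribute it to V1.
  const Source src = srcVecIdx >= vregsPerSrc ? Source::V2 : Source::V1;
  return {src, srcVecIdx % vregsPerSrc};
}
```

With two registers per source, index 2 maps to the first register of V2, which
is the case the original `>` comparison got wrong.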

>From 5b3331f29489446d7d723a33310b7fec37153976 Mon Sep 17 00:00:00 2001
From: Luke Lau 
Date: Fri, 26 Jan 2024 20:16:21 +0700
Subject: [PATCH 1/2] [RISCV] Add test to showcase miscompile from #79072

---
 .../rvv/fixed-vectors-shuffle-exact-vlen.ll| 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll 
b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
index f53b51e05c572..c0b02f62444ef 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
@@ -138,8 +138,8 @@ define <4 x i64> @m2_splat_two_source(<4 x i64> %v1, <4 x 
i64> %v2) vscale_range
   ret <4 x i64> %res
 }
 
-define <4 x i64> @m2_splat_into_identity_two_source(<4 x i64> %v1, <4 x i64> 
%v2) vscale_range(2,2) {
-; CHECK-LABEL: m2_splat_into_identity_two_source:
+define <4 x i64> @m2_splat_into_identity_two_source_v2_hi(<4 x i64> %v1, <4 x 
i64> %v2) vscale_range(2,2) {
+; CHECK-LABEL: m2_splat_into_identity_two_source_v2_hi:
 ; CHECK:   # %bb.0:
 ; CHECK-NEXT:vsetivli zero, 2, e64, m1, ta, ma
 ; CHECK-NEXT:vrgather.vi v10, v8, 0
@@ -149,6 +149,20 @@ define <4 x i64> @m2_splat_into_identity_two_source(<4 x 
i64> %v1, <4 x i64> %v2
   ret <4 x i64> %res
 }
 
+; FIXME: This is a miscompile, we're clobbering the lower reg group of %v2
+; (v10), and the vmv1r.v is moving from the wrong reg group (should be v10)
+define <4 x i64> @m2_splat_into_slide_two_source_v2_lo(<4 x i64> %v1, <4 x 
i64> %v2) vscale_range(2,2) {
+; CHECK-LABEL: m2_splat_into_slide_two_source_v2_lo:
+; CHECK:   # %bb.0:
+; CHECK-NEXT:vsetivli zero, 2, e64, m1, ta, ma
+; CHECK-NEXT:vrgather.vi v10, v8, 0
+; CHECK-NEXT:vmv1r.v v11, v8
+; CHECK-NEXT:vmv2r.v v8, v10
+; CHECK-NEXT:ret
+  %res = shufflevector <4 x i64> %v1, <4 x i64> %v2, <4 x i32> 
+  ret <4 x i64> %res
+}
+
 define <4 x i64> @m2_splat_into_slide_two_source(<4 x i64> %v1, <4 x i64> %v2) 
vscale_range(2,2) {
 ; CHECK-LABEL: m2_splat_into_slide_two_source:
 ; CHECK:   # %bb.0:

>From 60341586c8bd46b1094663749ac6467058b7efe8 Mon Sep 17 00:00:00 2001
From: Luke Lau 
Date: Fri, 26 Jan 2024 20:18:08 +0700
Subject: [PATCH 2/2] [RISCV] Fix M1 shuffle on wrong SrcVec in
 lowerShuffleViaVRegSplitting

This fixes a miscompile from #79072 where we were taking the wrong SrcVec to do
the M1 shuffle. E.g. if the SrcVecIdx was 2 and we had 2 VRegsPerSrc, we ended
up taking it from V1 instead of V2.
---
 llvm/lib/Target/RISCV/RISCVISelLowering.cpp   | 2 +-
 .../CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll | 8 +++-
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 47c6cd6e5487b..7895d74f06d12 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -4718,7 +4718,7 @@ static SDValue 
lowerShuffleViaVRegSplitting(ShuffleVectorSDNode *SVN,
 if (SrcVecIdx == -1)
   continue;
 unsigned ExtractIdx = (SrcVecIdx % VRegsPerSrc) * NumOpElts;
-SDValue SrcVec = (unsigned)SrcVecIdx > VRegsPerSrc ? V2 : V1;
+SDValue SrcVec = (unsigned)SrcVecIdx >= VRegsPerSrc ? V2 : V1;
 SDValue SubVec = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, M1VT, SrcVec,
  DAG.getVectorIdxConstant(ExtractIdx, DL));
 SubVec = convertFromScalableVector(OneRegVT, SubVec, DAG, Subtarget);
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll 
b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
index c0b02f62444ef..3f0bdb9d5e316 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
@@ -149,15 +149,13 @@ define <4 x i64> 
@m2_splat_into_identity_two_source_v2_hi(<4 x i64> %v1, <4 x i6
   ret <4 x i64> %res
 }
 
-; FIXME: This is a miscompile, we're clobbering the lower reg group of %v2
-; (v10), and the vmv1r.v is moving from the wrong reg group (should be v10)
 define <4 x i64> @m2_splat_into_slide_two_source_v2_lo(<4 x i64> %v1, <4 x 
i64> %v2) vscale_range(2,2) {
 ; CHECK-LABEL: m2_splat_into_slide_two_source_v2_lo:
 ; CHECK:   # %bb.0:
 ; CHECK-NEXT:vsetivli zero, 2, e64, m1, ta, ma
-; CHECK-NEXT:vrgather.vi v10, v8, 0
-; CHECK-NEXT:vmv1r.v v11, v8
-; CHECK-NEXT:vmv2r.v v8, v10
+; CHECK-NEXT:vrgather.vi v12, v8, 0
+; CHECK-NEXT:vm

[llvm-branch-commits] [llvm] [RISCV] Backport 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 to release/18.x (PR #79931)

2024-01-29 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)


Changes

This cherry-picks the fix 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 for a 
miscompile (only with the -mrvv-vector-bits=zvl configuration or similar) 
introduced in bb8a8770e203ba027d141cd1200e93809ea66c8f, which is present in the 
18.x release branch. It also includes the commit that adds a test, 
d407e6ca61a422f25841674d8f0b5ea0dbec85f8.
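
As a rough illustration of the off-by-one behind the miscompile, the sketch below mirrors only the source-selection comparison and the `ExtractIdx` arithmetic from the diff. It is not the code in `lowerShuffleViaVRegSplitting`: the `Src` enum, the helper names `pickSrcBuggy`, `pickSrcFixed`, `extractIdx` and `runExample`, and the value `NumOpElts = 2` (two i64 elements per vector register, assumed from the `vscale_range(2,2)` tests) are all made up for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for the two shuffle operands (V1 and V2 in the patch).
enum class Src { V1, V2 };

// Comparison as it was before the fix: an index equal to VRegsPerSrc,
// i.e. the first register of V2, is wrongly attributed to V1.
constexpr Src pickSrcBuggy(int SrcVecIdx, unsigned VRegsPerSrc) {
  return (unsigned)SrcVecIdx > VRegsPerSrc ? Src::V2 : Src::V1;
}

// Comparison after the fix: indices [0, VRegsPerSrc) select V1,
// indices [VRegsPerSrc, 2 * VRegsPerSrc) select V2.
constexpr Src pickSrcFixed(int SrcVecIdx, unsigned VRegsPerSrc) {
  return (unsigned)SrcVecIdx >= VRegsPerSrc ? Src::V2 : Src::V1;
}

// Element offset of the chosen register inside its source vector.
// Only meaningful for non-negative SrcVecIdx (the patch skips -1 earlier).
constexpr unsigned extractIdx(int SrcVecIdx, unsigned VRegsPerSrc,
                              unsigned NumOpElts) {
  return ((unsigned)SrcVecIdx % VRegsPerSrc) * NumOpElts;
}

// Walks the case described in the commit message: SrcVecIdx == 2 with
// 2 VRegsPerSrc. NumOpElts = 2 is an assumption for this example.
inline void runExample() {
  const unsigned VRegsPerSrc = 2;
  const unsigned NumOpElts = 2;

  // Indices 0 and 1 belong to V1 under both comparisons.
  assert(pickSrcBuggy(1, VRegsPerSrc) == Src::V1);
  assert(pickSrcFixed(1, VRegsPerSrc) == Src::V1);

  // Index 2 is the low register of V2: the old comparison picks V1.
  assert(pickSrcBuggy(2, VRegsPerSrc) == Src::V1);
  assert(pickSrcFixed(2, VRegsPerSrc) == Src::V2);

  // Index 3 is the high register of V2 and was already handled correctly.
  assert(pickSrcBuggy(3, VRegsPerSrc) == Src::V2);
  assert(pickSrcFixed(3, VRegsPerSrc) == Src::V2);

  // The extraction offset was never the problem: index 2 maps to element 0.
  assert(extractIdx(2, VRegsPerSrc, NumOpElts) == 0);
  assert(extractIdx(3, VRegsPerSrc, NumOpElts) == 2);
}
```

Under these assumptions only the low register of the second source is affected, which is consistent with the `_v2_lo` test changing its expected output while the `_v2_hi` test does not.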

---
Full diff: https://github.com/llvm/llvm-project/pull/79931.diff


2 Files Affected:

- (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+1-1) 
- (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll 
(+14-2) 


``diff
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 47c6cd6e5487b..7895d74f06d12 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -4718,7 +4718,7 @@ static SDValue 
lowerShuffleViaVRegSplitting(ShuffleVectorSDNode *SVN,
 if (SrcVecIdx == -1)
   continue;
 unsigned ExtractIdx = (SrcVecIdx % VRegsPerSrc) * NumOpElts;
-SDValue SrcVec = (unsigned)SrcVecIdx > VRegsPerSrc ? V2 : V1;
+SDValue SrcVec = (unsigned)SrcVecIdx >= VRegsPerSrc ? V2 : V1;
 SDValue SubVec = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, M1VT, SrcVec,
  DAG.getVectorIdxConstant(ExtractIdx, DL));
 SubVec = convertFromScalableVector(OneRegVT, SubVec, DAG, Subtarget);
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll 
b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
index f53b51e05c572..3f0bdb9d5e316 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
@@ -138,8 +138,8 @@ define <4 x i64> @m2_splat_two_source(<4 x i64> %v1, <4 x 
i64> %v2) vscale_range
   ret <4 x i64> %res
 }
 
-define <4 x i64> @m2_splat_into_identity_two_source(<4 x i64> %v1, <4 x i64> 
%v2) vscale_range(2,2) {
-; CHECK-LABEL: m2_splat_into_identity_two_source:
+define <4 x i64> @m2_splat_into_identity_two_source_v2_hi(<4 x i64> %v1, <4 x 
i64> %v2) vscale_range(2,2) {
+; CHECK-LABEL: m2_splat_into_identity_two_source_v2_hi:
 ; CHECK:   # %bb.0:
 ; CHECK-NEXT:vsetivli zero, 2, e64, m1, ta, ma
 ; CHECK-NEXT:vrgather.vi v10, v8, 0
@@ -149,6 +149,18 @@ define <4 x i64> @m2_splat_into_identity_two_source(<4 x 
i64> %v1, <4 x i64> %v2
   ret <4 x i64> %res
 }
 
+define <4 x i64> @m2_splat_into_slide_two_source_v2_lo(<4 x i64> %v1, <4 x 
i64> %v2) vscale_range(2,2) {
+; CHECK-LABEL: m2_splat_into_slide_two_source_v2_lo:
+; CHECK:   # %bb.0:
+; CHECK-NEXT:vsetivli zero, 2, e64, m1, ta, ma
+; CHECK-NEXT:vrgather.vi v12, v8, 0
+; CHECK-NEXT:vmv1r.v v13, v10
+; CHECK-NEXT:vmv2r.v v8, v12
+; CHECK-NEXT:ret
+  %res = shufflevector <4 x i64> %v1, <4 x i64> %v2, <4 x i32> 
+  ret <4 x i64> %res
+}
+
 define <4 x i64> @m2_splat_into_slide_two_source(<4 x i64> %v1, <4 x i64> %v2) 
vscale_range(2,2) {
 ; CHECK-LABEL: m2_splat_into_slide_two_source:
 ; CHECK:   # %bb.0:

``




https://github.com/llvm/llvm-project/pull/79931


[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#79838 (PR #79841)

2024-01-29 Thread Fangrui Song via llvm-branch-commits

MaskRay wrote:

LGTM

https://github.com/llvm/llvm-project/pull/79841


[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#79838 (PR #79841)

2024-01-29 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay approved this pull request.


https://github.com/llvm/llvm-project/pull/79841